python - XML Cross Reference


I have an XML file that contains ids, and a second XML file that contains the same ids. I want to cross-reference these files and extract information from the second file. The first file contains only the ids I need. For example, the first file contains ids 345, 350, 353, 356, and the second file contains ids 345, 346, 347, 348, 349, 350, ... I want to extract the data node, and all of its children, from the second file.

The first file structure:

<data>
    <node>
        <info>info</info>
        <id>345</id>
    </node>
    <node2>
        <node3>
            <info2>info</info2>
            <id>2</id>
        </node3>
        <otherinfo>1</otherinfo>
        <text type = "02">
            <role>info</role>
            <st>1</st>
        </text>
    </node2>
</data>

The second file structure:

<data>
    <node>
        <info>info</info>
        <id>345</id>
    </node>
    <node2>and bunch of other nodes</node2>
    <node2>and bunch of other nodes</node2>
    <node2>and bunch of other nodes</node2>
</data>

I have tried a Ruby/Nokogiri solution but can't seem to get very far. I'm open to solutions in any scripting language.

To extract the id values from the first XML string:

from lxml import etree

e1 = etree.fromstring(xml1)
ids = e1.xpath('//id/text()')

To extract the <node> elements from the second XML string that are parents of id elements whose values are among the ids from the first one:

import re

e2 = etree.fromstring(xml2)
ns_re = dict(re="http://exslt.org/regular-expressions")
re_id = "|".join(map(re.escape, ids))
nodes = e2.xpath("//id[re:test(.,'^(?:%s)$')]/parent::node" % re_id,
                 namespaces=ns_re)
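As a possible alternative, the same cross-reference can be sketched with a plain set lookup instead of a regular expression. The version below uses the standard library's xml.etree.ElementTree rather than lxml, so it runs without extra packages. The sample strings XML1 and XML2 and the helper names wanted_ids and matching_nodes are made up for illustration. It also assumes that only ids sitting directly under a <node> are wanted, whereas '//id' in the snippet above would also pick up nested ids such as the <id>2</id> inside <node3>.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample inputs modelled on the structures in the question.
XML1 = """<data>
    <node><info>info</info><id>345</id></node>
    <node><info>info</info><id>350</id></node>
</data>"""

XML2 = """<data>
    <node><info>a</info><id>345</id></node>
    <node><info>b</info><id>346</id></node>
    <node><info>c</info><id>350</id></node>
</data>"""


def wanted_ids(xml_text):
    """Collect the text of every <id> that is a direct child of a <node>."""
    root = ET.fromstring(xml_text)
    return {el.text.strip() for el in root.findall("./node/id") if el.text}


def matching_nodes(xml_text, ids):
    """Return the <node> elements whose direct <id> child is in ids."""
    root = ET.fromstring(xml_text)
    return [node for node in root.findall("./node")
            if (node.findtext("id") or "").strip() in ids]


ids = wanted_ids(XML1)
nodes = matching_nodes(XML2, ids)
for node in nodes:
    # Serialising an element includes all of its children.
    print(ET.tostring(node, encoding="unicode").strip())
```

Keeping the ids in a set makes each membership check constant time and avoids building one large alternation pattern when the first file lists many ids.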
