I have an XML file that contains IDs, and another XML file that contains the same IDs. I want to cross-reference these files and extract information from the second file. The first file contains only the IDs I need. For example, the first file contains IDs 345, 350, 353, 356, and the second file contains IDs 345, 346, 347, 348, 349, 350, ... I want to extract the data node, and all of its children, from the second file.
The first file structure:

    <data>
      <node>
        <info>info</info>
        <id>345</id>
      </node>
      <node2>
        <node3>
          <info2>info</info2>
          <id>2</id>
        </node3>
        <otherinfo>1</otherinfo>
        <text type="02">
          <role>info</role>
          <st>1</st>
        </text>
      </node2>
    </data>
The second file structure:

    <data>
      <node>
        <info>info</info>
        <id>345</id>
      </node>
      <node2>and bunch of other nodes</node2>
      <node2>and bunch of other nodes</node2>
      <node2>and bunch of other nodes</node2>
    </data>
I have tried a Ruby/Nokogiri solution but can't seem to get far. I'm open to solutions in any scripting language.
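To extract the id values from the first XML string:

    from lxml import etree

    # xml1 holds the contents of the first file as a string
    e1 = etree.fromstring(xml1)
    # text of every <id> element, at any depth
    ids = e1.xpath('//id/text()')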
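For reference, here is roughly what that query collects on the sample first file. This is a sketch using the standard library's xml.etree.ElementTree rather than lxml, a swap made only to keep the snippet dependency-free; the shortened sample string and the variable names are illustrative. Matching every `<id>` also picks up the nested one under `<node3>`, so restricting the path to `./node/id` may be preferable if only the top-level IDs are wanted.

```python
import xml.etree.ElementTree as ET

# Sample first file, shortened from the structure shown in the question.
xml1 = """
<data>
  <node><info>info</info><id>345</id></node>
  <node2>
    <node3><info2>info</info2><id>2</id></node3>
    <otherinfo>1</otherinfo>
  </node2>
</data>
"""

root = ET.fromstring(xml1)

# Every <id> anywhere in the document, in document order.
all_ids = [el.text for el in root.iter("id")]

# Only <id> elements that are direct children of a top-level <node>.
node_ids = [el.text for el in root.findall("./node/id")]

print(all_ids)   # ['345', '2']
print(node_ids)  # ['345']
```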
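To extract the <node> elements from the second XML string that are parents of id elements whose values are known from the first one:

    import re

    # xml2 holds the contents of the second file as a string
    e2 = etree.fromstring(xml2)
    # EXSLT regular-expressions namespace, which lxml supports in XPath
    ns_re = dict(re="http://exslt.org/regular-expressions")
    # alternation of the escaped id values, e.g. "345|350"
    re_id = "|".join(map(re.escape, ids))
    # <node> parents of every <id> whose whole text matches one of the ids
    nodes = e2.xpath("//id[re:test(.,'^(?:%s)$')]/parent::node" % re_id, namespaces=ns_re)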
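If lxml is not available, the same cross-reference can be sketched with the standard library alone, replacing the EXSLT regex test with a plain set lookup. This is an alternative technique, not the approach shown above; the function name `select_nodes` and the sample strings are hypothetical, and the sketch assumes each `<node>` in the second file has its `<id>` as a direct child.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the first file lists the wanted ids,
# the second file holds the full records.
xml1 = "<data><node><id>345</id></node><node><id>350</id></node></data>"
xml2 = """
<data>
  <node><info>a</info><id>345</id></node>
  <node><info>b</info><id>346</id></node>
  <node><info>c</info><id>350</id></node>
</data>
"""

def select_nodes(first_xml, second_xml):
    """Return <node> elements of second_xml whose <id> appears in first_xml."""
    # Collect the wanted ids into a set for constant-time membership tests.
    wanted = {el.text.strip()
              for el in ET.fromstring(first_xml).iter("id") if el.text}
    matches = []
    for node in ET.fromstring(second_xml).iter("node"):
        id_el = node.find("id")  # direct child <id> only
        if id_el is not None and id_el.text and id_el.text.strip() in wanted:
            matches.append(node)
    return matches

# Each match is a full element, so serializing it includes all its children.
for node in select_nodes(xml1, xml2):
    print(ET.tostring(node, encoding="unicode").strip())
```

For real files, `ET.parse(path).getroot()` can stand in for `ET.fromstring(...)`.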