I have an XML file with a DTD defined. My understanding is that a SAX parser processes events instead of storing the entire XML document in memory (as DOM does). Say I have an XML file with the declaration <name> ... // 2 million lines here </name> ... So, what does the SAX parser store in memory in this case? How does it know that the end tag of name will occur? And the real question: how does a SAX parser validate against a DTD? I'm not looking for an in-depth explanation, just a general idea of how validation occurs.
Typically the DTD is converted to a set of finite state automata; there's a standard algorithm for converting a BNF grammar to a deterministic FSA, which can be found in compiler textbooks such as Aho and Ullman. You produce one FSA for the content model of each element. The current state of parsing/validation is represented by a stack holding one FSA (with its current state) for each open element. When the parser encounters a start tag, it checks whether the start tag represents a valid transition in the topmost FSA, and changes the current state in that FSA by making the transition; it then adds a new FSA to the stack, corresponding to the content model of the new element. When it sees an end tag, it checks whether the current state of the topmost FSA is a final state, and pops that FSA off the stack. So what the parser keeps in memory grows with the nesting depth of open elements, not with the size of the document.
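To make the stack-of-automata idea concrete, here is a minimal sketch in Python. It is not how any particular parser is implemented. The DTD, the hand-written automata in CONTENT_MODELS, and the names DtdValidator, ValidatingHandler and validate_xml are all made up for illustration; a real validator would build the automata from the DTD itself. Only xml.sax.ContentHandler and xml.sax.parseString come from the standard library.

```python
import xml.sax

# Hypothetical DTD assumed for this sketch:
#   <!ELEMENT book (title, author+)>
#   <!ELEMENT title (#PCDATA)>
#   <!ELEMENT author (#PCDATA)>
# Each content model is written by hand as a DFA:
#   (start state, set of final states, {(state, child name): next state})
CONTENT_MODELS = {
    "book": (0, {2}, {(0, "title"): 1, (1, "author"): 2, (2, "author"): 2}),
    "title": (0, {0}, {}),   # no child elements allowed
    "author": (0, {0}, {}),  # no child elements allowed
}


class ValidationError(Exception):
    pass


class DtdValidator:
    def __init__(self, root):
        self.root = root
        # One [element name, current DFA state] entry per open element.
        self.stack = []

    def start_element(self, name):
        if name not in CONTENT_MODELS:
            raise ValidationError(f"undeclared element <{name}>")
        if self.stack:
            # Is this child a valid transition in the topmost automaton?
            parent, state = self.stack[-1]
            transitions = CONTENT_MODELS[parent][2]
            if (state, name) not in transitions:
                raise ValidationError(f"<{name}> not allowed here in <{parent}>")
            self.stack[-1][1] = transitions[(state, name)]
        elif name != self.root:
            raise ValidationError(f"root must be <{self.root}>, got <{name}>")
        # Push a fresh automaton for the newly opened element.
        self.stack.append([name, CONTENT_MODELS[name][0]])

    def end_element(self, name):
        # Tag matching is a well-formedness check done by the parser itself;
        # here we only check that the content model was completed.
        open_name, state = self.stack.pop()
        if state not in CONTENT_MODELS[open_name][1]:
            raise ValidationError(f"content of <{open_name}> is incomplete")


class ValidatingHandler(xml.sax.ContentHandler):
    """Forwards SAX start/end events to the validator; text is ignored."""

    def __init__(self, validator):
        super().__init__()
        self.validator = validator

    def startElement(self, name, attrs):
        self.validator.start_element(name)

    def endElement(self, name):
        self.validator.end_element(name)


def validate_xml(data, root="book"):
    """Parse bytes with SAX; raises ValidationError on a content-model violation."""
    xml.sax.parseString(data, ValidatingHandler(DtdValidator(root)))
    return True
```

For example, validate_xml(b"<book><title>t</title><author>a</author></book>") is accepted, while a book with no author raises ValidationError when the book end tag arrives. Note that the stack never holds more than one entry per open element, however long the text inside an element is.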