Curse of the MiniDOM
I have spent a depressing evening trying to find out whether minidom or PyXML offers a mechanism to load the BBC traffic TPEG data into a DOM and resolve the entity references in its nodes. No luck. I can’t even use minidom to parse the entity file or DTD directly, as it rejects it. The only way I think I can get this working is to:
- read the entity file line by line and build a dict of name/value pairs from the definitions, using a regex to match valid entity definitions.
- (somehow) load the XML file into memory, perform a text replace on all entity references using the name/value dict and _then_ pass the result to minidom and my parser code, which I have built line by excruciating line.
I don’t like this approach, but I have set myself the goal of doing this, so I will do it.
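For what it is worth, the two steps above could be sketched roughly as follows. This is only a sketch under assumptions: it supposes the entity file holds simple declarations of the form <!ENTITY name "value">, and the function names (load_entities, expand_entities, parse_with_entities) are made up for illustration. It also assumes the entity values contain no raw & or < characters, since those would break the XML once substituted.

```python
import re
from xml.dom import minidom

# Matches simple internal declarations such as <!ENTITY name "value">.
# Parameter entities (<!ENTITY % ...>) and SYSTEM/PUBLIC entities do not match.
ENTITY_DECL = re.compile(
    r'<!ENTITY\s+([\w.:-]+)\s+(?:"([^"]*)"|\'([^\']*)\')\s*>'
)
# Matches named references such as &name; but not character references (&#38;).
ENTITY_REF = re.compile(r'&([\w.:-]+);')
# The five entities every XML parser already understands; leave these alone.
PREDEFINED = {"lt", "gt", "amp", "apos", "quot"}


def load_entities(lines):
    """Build a name -> value dict from an iterable of entity-file lines."""
    entities = {}
    for line in lines:
        for match in ENTITY_DECL.finditer(line):
            name, double_quoted, single_quoted = match.groups()
            if double_quoted is not None:
                entities[name] = double_quoted
            else:
                entities[name] = single_quoted
    return entities


def expand_entities(xml_text, entities):
    """Replace every known &name; reference in xml_text with its value."""
    def replace(match):
        name = match.group(1)
        if name in PREDEFINED or name not in entities:
            return match.group(0)  # unknown or built-in: keep as written
        return entities[name]
    return ENTITY_REF.sub(replace, xml_text)


def parse_with_entities(xml_text, entity_lines):
    """Expand the entities as plain text, then hand the result to minidom."""
    entities = load_entities(entity_lines)
    return minidom.parseString(expand_entities(xml_text, entities))
```

Since load_entities only needs an iterable of lines, an open file object can be passed in directly, so the entity file really is read line by line. Any reference that is not in the dict is left untouched, which means minidom will still complain about it rather than silently losing it.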
Comments
2 Responses to “Curse of the MiniDOM”
So, we’re both in the same plight. I am trying to resolve entity references and still no luck :D Oh well… Hope you had better luck than me, even if it was 2.25 years ago :D
Orbimus, alas, I ended up doing it the hard way. I had to read in the entity file values and then process the file using text replace before passing it to the parser. What a pain in the backside. I have not had to revisit this code since 2006 and am alarmed that 2 years have passed and this still isn’t easy to do in Python.
Al