Vanity Foul
Dedicated to the wanderings of an egotistical mind.


Monday August 25, 2003

Atom and XOM

Digester wasn't cutting it (haven't received that custom Rule yet), so I thought I'd try XOM. Well, XOM thinks the example Entry isn't WellFormed:
<content type="application/xhtml+xml" xml:lang="en-us">
<p xmlns="...">Hello, <em>weblog</em> world! 2 < 4!</p>
</content>

But if I put just "Hello World" in the <content> tag it's happy. Update: Tassos points out "2 < 4" is the culprit. Somehow I overlooked that. Thanks Tassos!
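
For anyone who wants to reproduce this without XOM on the classpath, here is a minimal sketch that uses the JDK's built-in JAXP parser instead. It is a swap-in, not what I actually ran. The class name WellFormedCheck and the method isWellFormed are made up for illustration, and the xmlns value stays elided exactly as in the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: JAXP stands in for XOM here.
class WellFormedCheck {

    // Returns true if the string parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps the parser
            // from printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input (not well-formed).
            return false;
        } catch (Exception e) {
            // Configuration or I/O trouble, unrelated to the input itself.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // The example Entry, with the raw less-than sign left in.
        String raw =
            "<content type=\"application/xhtml+xml\" xml:lang=\"en-us\">"
            + "<p xmlns=\"...\">Hello, <em>weblog</em> world! 2 < 4!</p>"
            + "</content>";
        // Same content with the less-than sign written as an entity.
        String escaped = raw.replace("2 < 4", "2 &lt; 4");

        System.out.println("raw:     " + isWellFormed(raw));
        System.out.println("escaped: " + isWellFormed(escaped));
    }
}
```

The only difference between the two inputs is the less-than sign between 2 and 4, which is enough to flip the result.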

I don't know what isn't well-formed about that, but since <content> could contain "nasty" non-X HTML, well-formedness shouldn't be a requirement (damn users, can never count on them to do the right thing!). My search for another solution will continue tonight (Jason, wanna send me that package you mentioned in my comments?).

Good night, and thanks for your support.
( Aug 25 2003, 02:25:28 PM ) Atom Permalink Comments [4]


Trackback URL: http://www.brainopolis.com/roller/trackback/lance/Weblog/atom_and_xom
Comments:

Well, the document most certainly isn't well-formed. See that little < between 2 and 4? Yep, that's the culprit - the less-than sign should have been entity-ized by the feed, but it wasn't.

Posted by Tassos Bassoukos on August 27, 2003 at 05:45 AM CDT #

You are right, somehow my eyes glossed over that one. Thanks.

Posted by Lance on August 27, 2003 at 08:44 AM CDT #

XOM is telling you something here that you really need to listen to. It is simply not legal to put a less-than sign in raw, unescaped XML content, ever. If the user types in a less-than sign, then you need to escape it. I'm not sure how your application works, but if you use XOM to create the content in the first place, i.e. something like:

Element element = new Element("content");
element.appendChild(userInputString);

then XOM will automatically escape the string for you. If you receive an "XML" document in which the < isn't escaped or hidden inside a CDATA section, then the document is malformed and no conformant XML parser will accept it. This is a deliberate design decision in XML which is inherited by all vocabularies that use XML including RSS.

Posted by Elliotte Rusty Harold on November 30, 2003 at 06:30 AM CST
Website: http://www.cafeconleche.org/XOM/ #
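
The escape-on-output behaviour described in the comment above is not unique to XOM. As a rough sketch, assuming only the JDK (its DOM and Transformer APIs are swapped in for XOM so nothing extra is needed on the classpath), building the element from a string and then serializing it should turn the raw less-than sign into an entity. The class name EscapeOnSerialize and the method contentElement are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: JDK DOM stands in for XOM here.
class EscapeOnSerialize {

    // Wraps arbitrary user text in a <content> element and serializes it.
    static String contentElement(String userInput) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element content = doc.createElement("content");
            // The text node holds the raw string; no manual escaping here.
            content.appendChild(doc.createTextNode(userInput));
            doc.appendChild(content);

            Transformer serializer =
                TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            // Escaping of markup characters happens during serialization.
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(contentElement("Hello, weblog world! 2 < 4!"));
    }
}
```

The point of the design is that the application never concatenates markup by hand: the user's string goes in as text, and the serializer decides what needs an entity.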

Yes, of course you are right. However, I'm working from an example of "Atom" from DiveIntoMark. The example *isn't* valid XML - which is a fault of Atom or the example.

Posted by Lance on November 30, 2003 at 08:52 AM CST #

Comments have been disabled.
