A Conversation for The H2G2 Programmers' Corner

XML

Post 1

Alex 195614 As everyone else seems to like incredibly long names I keep mine ironically short.

I am looking into using XML. Can someone give me some advice? The content out there baffles me.


XML

Post 2

MaW

Okay...

a) what do you already know?
b) what do you want to use it for?
c) what system will you be developing on and what system are you targeting?


XML

Post 3

Pastey

If you're looking into books, the best I've come across are by Wrox. We've got literally (no pun intended) thousands of pounds' worth of books at work, but the Wrox ones are the only ones we really use.

smiley - rose


XML

Post 4

Dancer (put your advert here)

If you need a parser, there's Xerces. It's a great parser, it's free, and it has both Java and C++ versions, so if you write in both, you use the same API with the parser.

smiley - hsif
Dancer


XML

Post 5

Ion the Naysayer

XML isn't really a big stretch from HTML. If you use GuideML, you're already using XML and you probably didn't even know it smiley - winkeye.

I found the W3C's website ( http://www.w3.org/ ) both incredibly helpful and incredibly confusing. You may (or may not) find the actual XML 1.0 spec helpful. Watch out, it's big. I would highly recommend a visit to http://www.xml.com which is maintained by O'Reilly and Associates (the people behind those lovely computer books with the animals on the front - Perl has a Camel, for example). That's where I got started. Look up "Taming the XML Beast", which is the first understandable XML article I came across. You don't actually have to write DTDs but if you do you should keep in mind that DTDs are on their way out and XML Schemas are on their way in.

If you had a more specific question, I've been doing nothing but XML for the past three weeks at work. At first I thought smiley - cool. Then when I did a little more reading, it was more like smiley - headhurts. After I found out that XLink barely has support and XInclude doesn't have support at all, I was smiley - steam. But now XML is smiley - cool again because I got my Perl parser to work.

If you want to write basic, well-formed (actually XML), non-validated (you don't compare it to a list of allowable tags) XML, IE 5 and Mozilla 1.0 both have parsing support.
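
To make the well-formed versus validated distinction concrete, here is a minimal sketch in Python, assuming only the standard library's xml.etree.ElementTree. The helper name is_well_formed and the sample documents are made up for illustration. It only checks that the text parses as XML; it does not compare the tags against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as XML. No DTD or schema validation is done."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, not taken from any real application.
good = "<note><to>Alex</to><body>Hello</body></note>"
bad = "<note><to>Alex</body></note>"  # closing tag does not match the open tag

print(is_well_formed(good))
print(is_well_formed(bad))
```

A document can pass this check and still use tags that a given DTD or schema would reject; checking that is what validation adds.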

I'd also recommend a look into XSLT. There are XSLT articles by the writer behind "Taming the XML Beast" on xml.com as well. You need an external parser such as Dancer mentioned if you want to use XSLT, though. If you look on the W3C's website there should be a list somewhere (smiley - huh) of software that implements XSLT.

Good luck! But don't worry, you probably won't need it.


XML

Post 6

xyroth

hi.

does anyone here have a clue as to how you are supposed to separate parameters in xml?

the docs are not very clear, and it has come up in another thread.

according to the xml spec, if you have to pass an "&" as a parameter separator to a binary, you should replace "&" with "&amp;", but it claims to be a recoding of html4 into xml, and that says something different.

The html4 spec says that you should stop using "&" as a separator, and use ";" instead, which isn't mentioned at all in the xml spec except as a special character.

anyone know what the truth is?


XML

Post 7

Ion the Naysayer

The following are excerpted from the HTML 4.01 specification at http://www.w3.org/TR/1999/REC-html401-19991224/html40.txt

Forms submitted with this content type (form submission by use of the GET method, e.g. AddThread?inreplyto=2203831) must be encoded as follows:

1. Control names and values are escaped. Space characters are replaced by '+', and then reserved characters are escaped as described in [RFC1738], section 2.2: Non-alphanumeric characters are replaced by '%HH', a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks are represented as "CR LF" pairs (i.e., '%0D%0A').

2. The control names/values are listed in the order they appear in the document. The name is separated from the value by '=' and name/value pairs are separated from each other by '&'.
---
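
As a rough illustration of the two encoding steps quoted above, here is a sketch using urlencode from Python's standard urllib.parse module. The "subject" field and its value are invented for the example; only "inreplyto" comes from the URL mentioned earlier.

```python
from urllib.parse import urlencode

# Control names and values in document order. "subject" is a hypothetical field.
fields = [("subject", "XML & HTML"), ("inreplyto", "2203831")]

# Spaces become '+', the reserved '&' inside a value becomes '%26',
# names are joined to values with '=' and pairs are joined with '&'.
query = urlencode(fields)
print(query)  # subject=XML+%26+HTML&inreplyto=2203831
```

Note that the '&' inside the value is percent-escaped, while the '&' between pairs is the literal separator the spec describes.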

The other piece I found was right near the bottom of the specification and says:

We recommend that HTTP server implementors, and in particular, CGI implementors support the use of ";" in place of "&" to save authors the trouble of escaping "&" characters in this manner.
---

The latter is a recommendation. The former is a requirement. I'll test out the use of ; on my copy of Apache when I get the chance.
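
For what the recommendation might look like on the server side, here is a small sketch in Python. It is not how Apache or any particular CGI library does it; parse_query is a made-up helper that simply treats ";" and "&" as equivalent separators.

```python
import re
from urllib.parse import unquote_plus


def parse_query(query: str) -> dict:
    """Split a query string on either '&' or ';' and decode each name and value."""
    pairs = {}
    for part in re.split(r"[&;]", query):
        if not part:
            continue  # skip empty pieces, e.g. from a trailing separator
        name, _, value = part.partition("=")
        pairs.setdefault(unquote_plus(name), []).append(unquote_plus(value))
    return pairs


# Both spellings of the same (hypothetical) query give the same result.
print(parse_query("inreplyto=2203831;subject=XML"))
print(parse_query("inreplyto=2203831&subject=XML"))
```

A server that parses this way lets authors write ";" in links and avoid escaping "&" in their markup at all.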


XML

Post 8

xyroth

interestingly, mark told me that h2g2 already supports the use of ; instead of &, so the question then is why don't they make ; the default?

by the way, I tested, and it does work.


XML

Post 9

Dancer (put your advert here)

Like many things in computing, the answer "Historical reasons" applies here too.

Actually, the "correct" way to do things is with &amp; but people are too lazy to do things correctly.
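
Assuming the links in question sit inside XML markup, the escaping being discussed means writing each literal & in an attribute value as &amp;. Here is a minimal Python sketch using the standard library; the "subject" parameter and the link text are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# A URL with '&' as the parameter separator. "subject" is a made-up parameter.
url = "AddThread?inreplyto=2203831&subject=XML"

# quoteattr escapes '&' as '&amp;' and wraps the value in quotes.
attr = quoteattr(url)
markup = "<a href=%s>reply</a>" % attr
print(markup)

# An XML parser turns '&amp;' back into '&', so the program receiving
# the URL still sees the original separator.
parsed = ET.fromstring(markup)
print(parsed.get("href") == url)

# Without the escaping, the same markup is not well-formed XML.
try:
    ET.fromstring('<a href="%s">reply</a>' % url)
    unescaped_ok = True
except ET.ParseError:
    unescaped_ok = False
print(unescaped_ok)
```

The escaping only exists in the markup; the URL that actually gets requested still contains a plain "&".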

smiley - hsif
Dancer

