This is the Message Centre for Jim Lynn

@ entity characters

Post 1

Joe aka Arnia, Muse, Keeper, MathEd, Guru and Zen Cook (business is booming)

Hi Jim,

I just got a question about URL parsing in GuideML. The problem is that a certain link breaks the parse routine. I suspect that it's because the link contains a couple of @ signs (that is the point where the parse seems to break) and XML doesn't like unquoted @s. Is that actually the case and, if so, what is the entity code for @ signs?

Thanks,
Joe


Post 2

Jim Lynn

Is this in postings?


Post 3

Joe aka Arnia, Muse, Keeper, MathEd, Guru and Zen Cook (business is booming)

*checks with enquirer*

Yes apparently. Is there a way to make @ signs in URLs work in postings?


Post 4

Jim Lynn

Not currently, and I'd be reluctant to do that, because the @ sign has a very particular meaning in URLs - usually a malicious one.

Basically, the @ sign separates the username/password from the actual domain name and URL. For example:

fred:secret@somewhere.com/page.htm (preceded by the http gubbins) will go to the server somewhere.com and fetch the page page.htm, having identified the user as fred with the password secret.
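To make the split concrete, here is a minimal sketch using Python's standard urllib.parse. The full URL (with the http:// prefix) and the helper name split_userinfo are assumptions for illustration only; this is not how the GuideML parser works.

```python
from urllib.parse import urlsplit


def split_userinfo(url):
    """Return (username, password, hostname, path) for a URL.

    Hypothetical helper: everything before the '@' in the authority
    part is treated as login details, everything after as the server.
    """
    parts = urlsplit(url)
    return parts.username, parts.password, parts.hostname, parts.path


if __name__ == "__main__":
    # Assumed example URL, reconstructed from the description above.
    print(split_userinfo("http://fred:secret@somewhere.com/page.htm"))
```

The server the browser actually contacts is the hostname element; the username and password are just credentials sent along with the request.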

Unfortunately, this obscure trick of browsers isn't widely known, which allows nefarious people to make up URLs which *look* like they've come from one site when actually they've come from another. You simply do:

www.innocentsite.com@www.nastysite.com/somethingbad.htm

If innocentsite is somewhere you trust (BBC News, Google, etc.) and the page on nastysite has the same HTML layout, you might be fooled into thinking that this fake page actually came from the real site. (There was a URL a while back which claimed to be from BBC news but was actually a spoof. Lots of people claimed that 'The BBC was hacked' when all someone had done was copy the HTML layout of a news page and put it on their own server.)
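As a rough sketch of how a posting filter might tell the two cases apart, the check below flags an @ only when it appears in the authority part of the URL (before the first slash after the scheme), not when it appears later in the path. The function name has_userinfo and the example URLs are hypothetical; this is not the actual postings code.

```python
from urllib.parse import urlsplit


def has_userinfo(url):
    """True if the URL carries a user@host style authority.

    Hypothetical check: an '@' in the network location means the text
    before it is login details, so the real host is whatever follows.
    An '@' further along, in the path or query, is left alone.
    """
    return "@" in urlsplit(url).netloc


if __name__ == "__main__":
    spoof = "http://www.innocentsite.com@www.nastysite.com/somethingbad.htm"
    print(has_userinfo(spoof))            # '@' sits in the authority part
    print(urlsplit(spoof).hostname)       # the server actually contacted
    print(has_userinfo("http://example.com/page@12345/index.htm"))
```

Whether a given link falls into the harmless case depends on where its @ signs sit, which is why seeing the actual URL matters.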

In fact, because of the way IP addresses work, it's possible that the above malicious link could be even more sneaky:

www.innocentsite.com@1122334455/somethingbad.htm

Your browser takes the 1122334455, assumes that's a 4-byte integer and uses those four bytes as an IP address. Since the URL now only has one valid-looking domain name, it's even less likely that you will realise that it's a spoof address.
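The integer-to-address step can be sketched with Python's standard ipaddress module. The helper name int_to_dotted is made up for illustration, and the number is just the placeholder from the example above, not a real malicious host.

```python
import ipaddress


def int_to_dotted(n):
    """Convert a 32-bit integer into dotted-quad IPv4 notation.

    Hypothetical helper: the integer is read as four big-endian bytes,
    one byte per part of the address.
    """
    return str(ipaddress.IPv4Address(n))


if __name__ == "__main__":
    n = 1122334455
    # The same four bytes, extracted by hand for comparison.
    print(list(n.to_bytes(4, "big")))
    print(int_to_dotted(n))
```

For 1122334455 the four bytes are 66, 229, 118 and 247, so the address is 66.229.118.247.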

So, having explained all of that, do you want to tell me the URL your friend wanted to/wants to/has post(ed)? Just in case there's a perfectly valid reason for it containing an @.


Post 5

Joe aka Arnia, Muse, Keeper, MathEd, Guru and Zen Cook (business is booming)

There is. Unlike the FTP password reason or the malicious 'hack' reason, it's from a site that uses @ signs as part of session IDs.

Knowing about that trick, though, I'm with you and would be tempted to say that they should eliminate the session ID from all URLs that contain it (something they should do anyway, really).

Not the least techie answer, but it seems the best one for now.

BTW, if you have the time (which I doubt, but it's worth asking), would you be interested in making some minor contribution to a new series I want to work on, 'Doing the Web Well' (A820162)?

It's going to be about proper, standards-compliant web design, targeted at all levels from beginners (so they don't learn bad habits such as table tags for layout) to more advanced users, and will cover all kinds of web standards and markup.

