This is a Journal entry by Jim Lynn

Amazon Webservices

Post 1

Jim Lynn

I took a look at Amazon's webservices the other day, and really liked it.

http://www.amazon.com/webservices

All it is is Amazon exposing their facilities as XML, so you can write code to put the results of amazon searches and other features on your own website.

The really nice thing about it is that it works in *exactly* the same way as DNA. They fetch data from their database, turn it into XML, then pass that through an XSLT transformer to present to the user. With the webservices, though, you can get the raw XML directly or tell it to use your own stylesheet. All things which the DNA engine can do. I was almost going to try using a slightly modified h2g2 stylesheet with Amazon's service, but then I noticed they put a 20K limit on the size of the stylesheet they allow - sensible because they are caching these on their own servers, and they don't want to hold huge files (the base DNA stylesheet is almost 1M in size).
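To make that pipeline concrete, here is a minimal sketch in Python. It is only an illustration: the element names (`results`, `book`, `title`, `author`) are invented and are not Amazon's or DNA's actual schema. The `render_html` function is a plain Python stand-in for the job the XSLT stylesheet does, since Python's standard library has no XSLT engine.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical raw XML, standing in for what the data layer might produce.
RAW_XML = """<results>
  <book><title>Example Title One</title><author>A. Writer</author></book>
  <book><title>Tom &amp; Jerry</title><author>B. Author</author></book>
</results>"""


def render_html(raw_xml: str) -> str:
    """Turn the raw XML into an HTML list (the role a stylesheet would play)."""
    root = ET.fromstring(raw_xml)
    items = []
    for book in root.findall("book"):
        # Escape the text again because the parser has already decoded entities.
        title = escape(book.findtext("title", default=""))
        author = escape(book.findtext("author", default=""))
        items.append(f"<li>{title} by {author}</li>")
    return "<ul>" + "".join(items) + "</ul>"


def respond(raw_xml: str, want_raw: bool) -> str:
    """Serve the XML untouched, or the presentation built from it."""
    return raw_xml if want_raw else render_html(raw_xml)


html_page = respond(RAW_XML, want_raw=False)
raw_page = respond(RAW_XML, want_raw=True)
```

The point is that the data and the presentation are separate steps, so handing out the raw XML is simply a matter of skipping the second one.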

Now, I realise that this is really the only way to do this kind of stuff, but I do feel a certain amount of pride that a large (and smart) company like Amazon has arrived at the same point (albeit two years later).

If I thought we had a snowball's chance in hell of getting it past editorial policy, it would be great to have a tag which could be expanded into a listing of the book's details. That would be great for 'What book are you reading', wouldn't it?


Amazon Webservices

Post 2

SEF

I found the book listings were different on Amazon's American site than their English one. Have they had the sense to make the XML code the same on both?


Amazon Webservices

Post 3

Frankie Roberto

Looks interesting.

The XML/XSLT route does seem to be getting a lot more popular - but not many webproviders seem to offer it as part of a package (server-side xslt that is).


Amazon Webservices

Post 4

Jim Lynn

Probably because XSLT solutions for Unix aren't particularly high-performance. For a very long time (and possibly still - I haven't looked recently) Microsoft's MSXML component was the best performing XSLT engine by a very long way, and obviously it only works on Windows. Which is another reason why we're still running on Windows servers when the rest of the BBC runs on Solaris.


Amazon Webservices

Post 5

Tango

Does this mean there's precedent for releasing this kind of raw XML? Might this help persuade TPTB to do the same with h2g2? smiley - grovel

Tango smiley - winkeye


Amazon Webservices

Post 6

Jim Lynn

It's only a matter of time. We're busy with other things at the moment, but once we get some time to check that we're not putting anything into the XML that needs to be private (which doesn't matter at the moment because that kind of thing is hidden by the XSLT), we should be able to just open up the XML directly.
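As a rough sketch of what that kind of check could lead to, the Python below strips a list of private elements from a page's XML before it is served raw. The tag names (`PAGE`, `USER`, `NAME`, `EMAIL`, `PASSWORD`) are hypothetical; the real DNA XML is not shown anywhere in this thread, and this is not a description of how DNA does or would do it.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; which elements really need hiding is an assumption.
PRIVATE_TAGS = {"EMAIL", "PASSWORD"}


def strip_private(raw_xml: str) -> str:
    """Return the XML with every element on the private list removed."""
    root = ET.fromstring(raw_xml)
    # Take a snapshot of all elements first, so removals don't disturb the walk.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in PRIVATE_TAGS:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


page = (
    "<PAGE><USER><NAME>Example User</NAME>"
    "<EMAIL>user@example.com</EMAIL></USER></PAGE>"
)
public_xml = strip_private(page)
```

A deny-list like this is the simplest option; an allow-list of elements known to be safe would fail more gracefully if a new private field were added later.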


Amazon Webservices

Post 7

Tango

You know, if it weren't for "time", h2g2 would be absolutely perfect by now... smiley - winkeye

Good luck! smiley - smiley

Tango


Amazon Webservices

Post 8

DoctorMO (Keeper of the Computer, Guru, Community Artist)

Yes, the time stream really gets on my nerves too.

Tis the only thing about H2G2 I dislike, Windows Servers, and the proof is in the pudding.

So er, they still don't have xslt for *nix, I find that hard to believe.

-- DoctorMO --


Amazon Webservices

Post 9

Spelugx the Beige, Wizard, Perl, Thaumatologically Challenged

I've just been looking at the xslt packages for perl available on debian (apt-cache search libxslt | grep perl), and I can see two: XML::XSLT, which appears to be a pure-perl solution, with some unfinished features; and XML::LibXSLT, which describes itself as:

'This module is an interface to the gnome project's
libxslt. This is an extremely good XSLT engine, highly
compliant and also very fast. I have tests showing this to
be more than twice as fast as Sablotron.'

I'll (hopefully) be playing with it this afternoon, so I'll tell you all how fast it is then.


Amazon Webservices

Post 10

Jim Lynn

I didn't say there weren't any XSLT engines for unix - there are, plenty of them. But so far, their performance doesn't match up to what we've already got. That's assuming they can even cope with our stylesheets - last time I tried Xalan, the Apache XSLT engine, it core-dumped every time I tried to get it to use our stylesheet. But then, we've had that effect on virtually every XSLT product we've tried it on. The ActiveState Visual-XSLT tool would simply vanish if I tried to get it to load ours. I reported the bug to them, and after a long period of silence, I heard that they fixed it - but they had to re-architect large parts of their code to make it work with such a large stylesheet.

Pah. Amateurs.

And Sablotron got much of its speed from not being completely 1.0 compliant, as far as I can remember.

I'm sure things have improved, but after I joined the BBC, we had a meeting with some of the people from Internet Services, the division who support the live servers. They are almost entirely Unix (Solaris) based, and they were keen to get us to port h2g2 to Solaris. They had done some preparatory work, like getting Xalan (I think it was) to parse our stylesheet. They said that the initial parse took ten seconds. This was a little scary to me because, before we came to the BBC, h2g2 used to parse the entire stylesheet *on each request* and we still had acceptable performance. So at that stage, performance was definitely an issue.
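To show why a slow initial parse matters so much less once the result is cached, here is a small Python sketch. It is not DNA's code: `STYLESHEET` is a tiny made-up document, `ET.fromstring` stands in for the expensive stylesheet parse, and `handle_request` is a hypothetical request handler.

```python
import xml.etree.ElementTree as ET
from functools import lru_cache

parse_count = 0  # how many times the expensive parse actually ran

# Tiny hypothetical stand-in for a very large stylesheet.
STYLESHEET = "<stylesheet><template match='/'/></stylesheet>"


@lru_cache(maxsize=None)
def parsed_stylesheet(source: str) -> ET.Element:
    """Parse the stylesheet once; later calls with the same text hit the cache."""
    global parse_count
    parse_count += 1
    return ET.fromstring(source)


def handle_request(stylesheet_source: str) -> str:
    """Hypothetical request handler that needs the parsed stylesheet."""
    sheet = parsed_stylesheet(stylesheet_source)
    return sheet.tag


for _ in range(3):
    handle_request(STYLESHEET)
```

With the cache in place, three requests cost one parse. Without it, a ten-second parse would be paid on every single request.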

My position on porting h2g2/DNA is that, when there's a definite business need, then we'll do it. What I don't want to do is spend months porting it, when the end result will be an identical system with similar performance and no benefit for the end user. I'd much rather spend that time adding new features and fixing bugs.

Oh well. Maybe if we open-source it, you can do all that work for me, and I won't have to bother.


Amazon Webservices

Post 11

IMSoP - Safely transferred to the 5th (or 6th?) h2g2 login system

"Maybe if we open-source it..."

smiley - yikes Now there's a thing to let slip casually in passing: you say it almost like it's a real possibility! [Would *so* love it if it did come true, but can't imagine the powers that Beeb being convinced smiley - winkeye]


Amazon Webservices

Post 12

Jim Lynn

There are some 'powers that be' at the BBC who are very keen indeed on open-sourcing everything we do. But naturally, there are others who are more sceptical.

I'm somewhere in the middle. On the one hand, it would be cool to get all my bugs fixed for free, but on the other hand managing an open-source project of even a handful of developers is a very difficult thing. And we'd also have to make sure our code was up to scratch before we opened it - tidy up all those out-of-date comments, take out any bound-in security items, and document the *correct* way to do things (which isn't always the actual way we've done things because we've learned as we've gone along).

So even if the will is there, it's not just a matter of dumping the codebase onto sourceforge and waiting.


Amazon Webservices

Post 13

Frankie Roberto

Interesting. I'd heard rumours of a porting...

I'm trying to find a host for an XSLT site I'm working on. I'll look for Windows servers then...


Amazon Webservices

Post 14

Tango

He's admitted open-source DNA is a possibility plenty of times before, but that's as far as it's gone. smiley - sadface

Tango


Amazon Webservices

Post 15

IMSoP - Safely transferred to the 5th (or 6th?) h2g2 login system

I suppose it would need some kind of specific trigger to make opening either the XML or the source enough of a priority - both 'politically' and technically. I think both could help make this even more of a good thing than it already is, but I've no idea what that trigger would be - *puts thinking cap on...*


Amazon Webservices

Post 16

Tango

How about a well thought out letter writing campaign? Jim, what email address shall we spam... i mean send demands to opensource DNA? smiley - winkeye

Tango


Amazon Webservices

Post 17

SEF

Demands? That sounds strangely familiar... smiley - winkeye


Amazon Webservices

Post 18

Tango

did i say demands? i meant "requests", of course. smiley - winkeye

Tango


Amazon Webservices

Post 19

DoctorMO (Keeper of the Computer, Guru, Community Artist)

*looks back*

er, I think I better read up on what xlts thingy is, before I can make any comments, and because I've only ever worked on small perl systems. hmm..

-- DoctorMO --


Amazon Webservices

Post 20

DoctorMO (Keeper of the Computer, Guru, Community Artist)

hmm, I see, so is DNA based on XML at the moment? or is it a future thing?

Can't see much wrong with CSS pointers, but I must be missing a problem as it's 1am.

-- DoctorMO --

