This is the Message Centre for Gnomon - time to move on

Scanning the Edited Guide

Post 1

Gnomon - time to move on

The Search box in h2g2 can be used to look for whole words in entries. If you want to find any entry which includes the word "Ireland", for example, you can just use the Search box. But some things cannot be easily searched for.

If you want, for example, to search for entries which have the initial GUIDE tag in lowercase, which will cause the entry to be invisible in Pliny, you can't use the search box.

I have Visual Basic on my PC, a left-over from the last century. I've written a little program which gets a random Edited Entry from h2g2 and then scans it for a string. Looking for guide with the angle brackets around it will spot illegal uses of this particular tag. The scanner is slow - it has to physically read each entry, so it only gets through about 2 a second. It's found three entries so far with this particular problem - I've fixed them. But it's been running now for an hour, and hasn't come up with any more. I don't know how long to leave it before I decide I've found them all.
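A minimal sketch of that string check, in Python rather than the Visual Basic the program is actually written in. It works on entry text that has already been downloaded; how the entries are fetched is left out, and the function names and sample A-numbers are made up for illustration.

```python
from typing import Dict, List


def has_lowercase_guide_tag(guideml: str) -> bool:
    """Return True if the entry source contains a lowercase <guide> tag."""
    # Case-sensitive on purpose: the valid uppercase <GUIDE> must not match.
    return "<guide>" in guideml


def scan_entries(entries: Dict[str, str]) -> List[str]:
    """Return the IDs of entries whose source contains a lowercase <guide> tag.

    `entries` maps a (hypothetical) A-number to the entry's GuideML source.
    """
    return [entry_id for entry_id, source in entries.items()
            if has_lowercase_guide_tag(source)]


if __name__ == "__main__":
    sample = {
        "A1": "<GUIDE><BODY>Fine</BODY></GUIDE>",
        "A2": "<guide><BODY>Hidden in Pliny</BODY></guide>",
    }
    print(scan_entries(sample))
```

Keeping the check separate from the fetching means the same loop can be pointed at any other string worth hunting for.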

There's another problem with the Guide which requires a little more ingenuity - the spotting of entries where the opening tags don't match the closing tags. It's not possible for you or me or even the Guide Editors to create such an entry. But it was possible for the BBC Editors to do it, and they occasionally did it when they were in a hurry and couldn't wait to find the error in the GuideML. So a few entries have actually been stored with invalid GuideML. They display OK in Ripley, but cause an error in Pliny - they give the "Cannot find the page" error.

One way to find them would be to edit the entry and save without making any changes. The GuideML validator in h2g2 will object to the illegal GuideML. But this updates the "last updated" date on the entry.

Another way is to use a variant of my scanner program. It will read the entry and scan through it for every tag. It will distinguish between opening tags and closing tags, and make sure they all match each other.

It's nearly finished - I forgot about "self-closing" tags, which don't have a corresponding closing tag. I'll have to write a bit of extra code to ignore these. Then it will be ready to be let loose on the Guide.
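The matching idea can be sketched with a stack: push each opening tag, pop on each closing tag, and skip self-closing ones. This is an illustration in Python, not the actual scanner. It assumes GuideML tags look like ordinary XML tags and that tag names are case-sensitive, and it does not handle comments or CDATA sections. The function name is invented.

```python
import re
from typing import List, Optional

# Matches <NAME ...>, </NAME> and <NAME ... />.
# Quoted attribute values may contain ">" without ending the tag.
TAG_RE = re.compile(
    r"<(/?)([A-Za-z][A-Za-z0-9_-]*)((?:\"[^\"]*\"|'[^']*'|[^'\">])*)>"
)


def find_tag_mismatch(guideml: str) -> Optional[str]:
    """Return a description of the first mismatch, or None if tags balance."""
    stack: List[str] = []
    for match in TAG_RE.finditer(guideml):
        closing, name, rest = match.groups()
        if rest.rstrip().endswith("/"):
            continue  # self-closing tag such as <BR/>: nothing to match
        if closing:
            if not stack:
                return f"closing </{name}> has no opening tag"
            if stack[-1] != name:
                return f"closing </{name}> found while <{stack[-1]}> is open"
            stack.pop()
        else:
            stack.append(name)
    if stack:
        return f"<{stack[-1]}> is never closed"
    return None


if __name__ == "__main__":
    print(find_tag_mismatch("<GUIDE><BODY><P>Hi<BR/>there</P></BODY></GUIDE>"))
    print(find_tag_mismatch("<GUIDE><BODY><P>Hi</BODY></GUIDE>"))
```

Because it only reads the entry, a check like this leaves the "last updated" date alone.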


Post 2

Dmitri Gheorgheni, Post Editor

That's brilliant. smiley - applause

And the explanation made my head hurt. smiley - winkeye


Post 3

2legs - Hey, babe, take a walk on the wild side...

Or... an idea... which I've probably not thunk through sufficiently... Could you create something to go through the edited entries, opening them, one by one, on* the site, in Pliny, and scan the page itself for when the error message comes up on opening the entry, and then 'log' the A number of the entries that, when opened in the new skin, produce the error message, due to the non-matching code?

I'm actually using Pliny at the moment (OK, I'm not actually, still in Ripley, for conversations as I am right* now), but using Ripley, I'm going through the guide, alphabetically... I'm not sure how long my continued concentration on this will last smiley - blush I'm not quite* looking at every entry: some I recognize as being done quite recently, which I think I can ignore, and some on subjects I know absolutely nothing about, or have no interest in, I'm having to skip. I'm looking for old entries in need of an update, or ones with any general errors I spot on skim reading through them (missing links in particular). Whilst doing it, I'm also noticing 'words' in the entries that I would imagine* oughta* link to an entry on that word; where they don't, and where searching for that word reveals no edited entry, I'm compiling a list of entries that need writing... smiley - puff
Not sure how far I'll get in this endeavour... On page 11 at the moment, still on the letter 'A'.... but it's maybe something I can plough through when I'm bored, and the internet land is quiet, and I've nothing else to do with my time... smiley - zen smiley - biro smiley - weird


Post 4

Icy North



By the law of exponentially diminishing returns, you could be here a very long time.
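A rough way to put numbers on that, sketched in Python. It assumes each read picks an entry uniformly at random and independently, so repeats are possible. It also assumes roughly 10,000 Edited Entries, which is a guess. The scanning rate of 2 entries a second comes from the first post; the function names are invented.

```python
import math


def chance_entry_missed(total_entries: int, samples: int) -> float:
    """Probability that one particular entry has never been picked."""
    return (1 - 1 / total_entries) ** samples


def samples_for_miss_chance(total_entries: int, miss_chance: float) -> int:
    """Random reads needed before a given entry's miss chance drops to miss_chance."""
    return math.ceil(math.log(miss_chance) / math.log(1 - 1 / total_entries))


if __name__ == "__main__":
    total = 10_000            # assumed number of Edited Entries
    one_hour = 2 * 60 * 60    # reads in an hour at about 2 a second
    print(chance_entry_missed(total, one_hour))
    print(samples_for_miss_chance(total, 0.01) / 2 / 3600, "hours")
```

Under those assumptions, after an hour any given entry still has a little under an even chance of never having been looked at. Getting that down to 1% takes something over six hours of random reads, which is why working through a list of entries in order beats random sampling.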


Post 5

Superfrenchie


So that is what happens when Gnomon "takes a break" smiley - biggrin

smiley - applause Well done, though. smiley - hug


Post 6

Gnomon - time to move on

That's a good idea, 2legs.


Post 7

2legs - Hey, babe, take a walk on the wild side...

smiley - wow Just sprung out as a sort of obvious thing to me... as each entry with the error/mistake/wrong GML will provide that error message on the A page in that skin (I guess always in the same place on the page), it oughta be possible to find a way to automatically 'look' for that error appearing... Mind, of course, I've no clue whatsoever how one would even contemplate starting to think about how to design a bit of automatic software to do that, and be able to write you a nice list of those places where it finds it smiley - doh
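One way such a logger might look, as a hedged Python sketch. The error wording is the "Cannot find the page" text quoted in the first post. `fetch_page` stands in for whatever actually downloads an entry's page in Pliny and has to be supplied by the caller; it and the other names here are hypothetical.

```python
from typing import Callable, Iterable, List

# Error wording as quoted in the first post of this thread.
ERROR_TEXT = "Cannot find the page"


def log_broken_entries(
    a_numbers: Iterable[str],
    fetch_page: Callable[[str], str],
    error_text: str = ERROR_TEXT,
) -> List[str]:
    """Return the A-numbers whose rendered page contains the error text.

    fetch_page is a caller-supplied (hypothetical) function that takes an
    A-number and returns the page text as displayed in the Pliny skin.
    """
    broken = []
    for a_number in a_numbers:
        if error_text in fetch_page(a_number):
            broken.append(a_number)
    return broken


if __name__ == "__main__":
    # Stand-in pages instead of real downloads, purely for demonstration.
    fake_pages = {
        "A100": "An entry about tea.",
        "A200": "Cannot find the page",
    }
    print(log_broken_entries(fake_pages, fake_pages.__getitem__))
```

The error text only needs to appear somewhere on the page, so it does not matter whether it always turns up in the same place.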


Post 8

Recumbentman

This is the kind of thing computers are good at. You just have to learn to address them correctly smiley - geek


Post 9

2legs - Hey, babe, take a walk on the wild side...

Exactly... I'm always talking the wrong language to my computer smiley - wah smiley - doh I've been racking my brain (and my little toe too, just in case), trying to figure out a way to get a chronological list of edited* guide entries... I'm not sure it can be done, just on the site (I don't think I can do a playing about with the address type thing, to get a list of them...) smiley - doh smiley - erm It ought* be possible... one would imagine smiley - erm


Post 10

Icy North

Drop me an e-mail, 2legs, and I'll send you a list of EGEs later today.

icy underscore north at hotmail dot com


Post 11

2legs - Hey, babe, take a walk on the wild side...

Seriously? smiley - wow smiley - grovel I must owe you a pint, at least for that... Hmm.... or perhaps not, depending how insane an adventure this one turns out to be... approx 10000 entries, edited, I think... say 100 a day... that won't take... oh... smiley - silly smiley - biggrin about to go email, smiley - cheers smiley - run


Post 12

Icy North

Sent smiley - ok


Post 13

2legs - Hey, babe, take a walk on the wild side...

Oh dear... smiley - blush that* is scary! smiley - laugh and thanks ever so much... your little explanation of how you came by the data made no sense to me... I'm just nowhere near as up to date or as computer literate as once maybe I was... but it all seems to be there... and I do mean all* smiley - bigeyes
Also got me to update my office stuff, with the conversion update thing, which shows how long since I got sent a modern office doc, as mine is 2003 and I've not had a need to do a conversion for a more recent version smiley - snork

I just wish I'd not done the go look at the last cell bit.... and noticed its number, and hence how many guide entries that is smiley - laugh smiley - bigeyes Mind, I'm guessing the more recent ones are fine.

It may take me some while. Three entries read, all links in which clicked, so far. As a bonus, noticed two or three non-existent entries that I'd imagine should* be written, as there weren't links to them from the entries smiley - magic smiley - biro

I'm going to read each and every entry, chronologically now. Three down, 10 thousand err some hundred and something, left to go smiley - laugh

(as regards turning the A numbers into links, it's just as easy for me to copy and paste each A number into the address bar really, and that isn't going to be the kind of thing to slow down the process; the reading each entry bit is where the time is, I guess....) smiley - zen smiley - biro


Post 14

Gnomon - time to move on

There's a feature in h2g2 which will list all the entries of a particular type (such as Edited, Pending etc), but we don't generally make it known, because there's a danger someone would use it to list all 250,000 entries in the Guide, with unknown consequences for the servers.

I use it occasionally to update my list of all Edited Entries.


Post 15

2legs - Hey, babe, take a walk on the wild side...

smiley - laugh I could see that might upset the servers somewhat smiley - snork and probably be even more terrifying than looking at how long the list is, in a simple little Excel spreadsheet.. smiley - puff I'm guessing once I get so* far, in terms of the dates, the entries will all* more or less be OK, and not require checking, but I'm not sure at which point that will be.... However I do my calculations (within the bounds of reality), at the moment, even at a fairly hefty reading rate (and allowing for things like I won't get round to it every day), it's somewhere between 100 and 400 weeks for me to read them all smiley - bigeyes smiley - snork I'm a little worried on some, though, that I've just not got enough 'general knowledge' on certain subject areas to spot what might otherwise be, to better informed people, obvious errors or omissions smiley - doh smiley - headhurts I'm not great on history, cosmology, or geography for a start smiley - blush smiley - weird Mind, any I'm just not sure about, I'll mark on the spreadsheet that maybe someone else oughta have a look smiley - zen smiley - erm Right. I've a bit* of reading to go do smiley - run


Post 16

Gnomon - time to move on

Join the gang, 2Legs. I read Edited Entries regularly, looking for mistakes. Lanza reads them looking for entries that would be good on the Front Page along with the new ones. And I'm sure there are others regularly trawling through the Edited Guide.


Post 17

2legs - Hey, babe, take a walk on the wild side...

Seven down.... 9 thousand and... whatever left to read smiley - laugh smiley - biggrin well, a few more than seven... it's easy to get distracted, on clicking (to check) a link, and bringing up another entry, and then accidentally reading that one too... Odd really, the number of entries I'm coming across that I've a vague memory of already having read once or more before smiley - zen May try reading a few more later on, depending how late I stay up... So difficult to decide on some though, if they're 'good enough', or 'complete enough' or not... I mean, there is only really* so much I guess that can be said about some topics, but for others, it's clear they're not being addressed anywhere near enough in the entry, or the entry's really not about, say for example, a 'country', and is, really, probably better off being renamed, along the lines of, say, for example, 'a guide to some major cities within ' ... smiley - erm


Post 18

Gnomon - time to move on

We'd be delighted to hear any suggestions you have at Feedback / Editorial.


Post 19

2legs - Hey, babe, take a walk on the wild side...

That's kind of the idea... Still seems to be quite a lot of the older entries (I'm in 1999 at the moment) that need something somewhere between a complete rewrite and updating, some more towards one end of the two extremes than others. I'm trying to spot any that have missing links which would be good to put in too, but I'm mainly trying to do that from memory of what I know is already in the guide that could link from them. Though, already on searching for a couple of things that I were surprised weren't in the entries as links, I've found a few guide entries that haven't been written, and which would be good to have in the guide, probably via something like Challenge h2g2 if that still exists, or some such smiley - zen


Post 20

Gnomon - time to move on

Challenge h2g2 is still there, although it's not promoted at the moment. We could get it put on the Front Page occasionally, although there won't be any prizes of t-shirts.

