A Conversation for The GuideDog project

BBCi Testers

Post 1

Peta

Would you like the BBCi Testers to have a look too? Either ask on their main page or drop a note on my page and I'll ask the group; they might well be prepared to give it a go too!

:-)


Post 2

Felonious Monk - h2g2s very own Bogeyman

Peta,
Thanks for the kind offer: it's very rough and ready; I'm a bit unsure of letting this little doggie outside the garden gate yet! However, if it's understood that it is nowhere near finished then I'd be happy for others to look at it. If you are going to invite comments, please spell out that:
* it's very buggy
* it doesn't do everything it's supposed to yet
However, these problems will be fixed eventually. Have you looked at it yet btw?

Regards
FM


Post 3

Baryonic Being - save GuideML out of a word-processor: A7720562

Hello.

I can't test the prototype myself, so may I ask a little about how it works, please? Could you explain in general? (Or point me to somewhere you already have.)

For example, how is the 'Download A-number' function implemented?

Also, is Approved GuideML / Writing Guidelines validation planned/available?

Since I have experience in XML, C++ and (some) VB, can you suggest any ways that I could - conceivably - attempt to create a port for other OSs (since I don't meet the minimum requirements as they stand)? What happened to the Linux version anyway (the Sourceforge page is more or less empty)?


"Any help - apart from comments telling us to write it in another language - is welcome."

Obviously I wouldn't dream of telling you to write in another language [assuming you mean programming language], but can you just run past me the technical reasons why you chose a non-OS-independent language? I mean that genuinely, because I would like to know. :-)

Thanks.


Post 4

Felonious Monk - h2g2s very own Bogeyman

OK, these are all very reasonable questions, and demand considered answers.

GD is written in a combination of C# and VB.NET. It exploits several publicly available libraries: Tim Anderson's HTMLEditor and Chris Lovett's SGMLReader are two of these. The former is the critical component, and this in turn uses the MSHTML rendering engine built into IE. This is the reason why it is *not* cross-platform. MSHTML was the most mature and powerful HTML editing platform at the time GD was conceived. Hence the whole concept was predicated on MSHTML being the core component, as no other browser had an HTML editor that could do what it did. If I couldn't do it in MSHTML I wasn't going to bother, as I would have had to jump through too many other hoops.

Now MSHTML edits HTML, not XML. So GD uses XSL stylesheets to convert the GuideML to HTML. This is then edited on screen and then, when the file is saved, converted first into XHTML using SGMLReader and then back into XML using another stylesheet. Footnotes are implemented as element behaviours (HTC components). Download of A-Numbers is accomplished using standard .NET HTTP functions. Cut-and-paste is handled by hooking callback functions into the editor to intercept those actions. GuideDog only supports Approved GuideML, mainly because this is a subset of the non-approved dialect and therefore is less work. It already validates the entries against an XML Schema.
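For anyone trying to picture the stylesheet step, here is a minimal sketch of an XSL conversion from a GuideML-like fragment to HTML-style markup. It is not GuideDog's code: the toy stylesheet, the module and function names, and the sample entry are all made up for illustration, and it uses XslCompiledTransform, which belongs to later versions of .NET than the project may have targeted.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module TransformSketch
    ' Toy stylesheet (an assumption, not GuideDog's): BODY -> div, HEADER -> h2, P -> p.
    Private Const ToHtml As String =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" &
        "<xsl:output method='xml' omit-xml-declaration='yes'/>" &
        "<xsl:template match='BODY'><div><xsl:apply-templates/></div></xsl:template>" &
        "<xsl:template match='HEADER'><h2><xsl:apply-templates/></h2></xsl:template>" &
        "<xsl:template match='P'><p><xsl:apply-templates/></p></xsl:template>" &
        "</xsl:stylesheet>"

    ' Applies a stylesheet (given as a string) to an XML document (also a string).
    Function Transform(stylesheet As String, input As String) As String
        Dim xslt As New XslCompiledTransform()
        xslt.Load(XmlReader.Create(New StringReader(stylesheet)))

        Dim output As New StringWriter()
        ' OutputSettings carries over what xsl:output asked for in the stylesheet.
        Using writer As XmlWriter = XmlWriter.Create(output, xslt.OutputSettings)
            xslt.Transform(XmlReader.Create(New StringReader(input)), writer)
        End Using
        Return output.ToString()
    End Function

    Sub Main()
        ' Made-up entry fragment, only loosely shaped like GuideML.
        Dim entry As String =
            "<GUIDE><BODY><HEADER>Towels</HEADER><P>Always know where yours is.</P></BODY></GUIDE>"
        Console.WriteLine(Transform(ToHtml, entry))
    End Sub
End Module
```

The reverse trip on saving would presumably use a second stylesheet with the mappings turned around, applied to the XHTML that SGMLReader produces.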

I chose IE because it was then and still is the dominant desktop browser, and because I knew the programming environments inside-out. Had Mozilla supported this kind of functionality, and had I the skills to exploit it, I would have chosen that instead. No ideology came into it, just pure and simple pragmatism. I didn't really see that the extra effort involved in writing my own editing engine would have made a cross-platform solution viable, and I expected that the bulk of the demand would come from people who ran IE on Windows. I asked Jim Lynn whether I was doing the right thing before I got started, and he agreed.

The Linux version never came to fruition, mainly because I have not been able to finish the Windows version. The reasons are that (a) my father died earlier this year after a very long illness, (b) when all that had finished I started another, very stressful job, and (c) my house was being torn apart and remodelled, which I am still working on. All my spare time has been taken up with these issues.

To put it bluntly, all this wouldn't have been so obstructive had not virtually all the early offers of help dried up when most people realised that they would have to get their hands dirty using Microsoft technologies. I really could have done with help from people who liked producing something of value more than they hated Microsoft. Sorry if this sounds bitter and cynical, but there you are. My preoccupations are not theirs, thank God, and I'm more interested in ends than means.

If you want to port to other environments then please go ahead. The main obstacle you will come up against is the availability of a sufficiently powerful HTML editing component. Mono doesn't provide this as far as I am aware. If you can find a cross-platform component that is better than MSHTML I would be very interested to see it. Being a pragmatist through and through, I'd like to finish the Windows version first as it is really almost there bar a few glitches, so if you are prepared to help me with that then I would be prepared to hand over the source code to you to do with as you see fit. I would also be prepared to help, as I actually would like to broaden my technical knowledge, not narrow it.


Post 5

Felonious Monk - h2g2s very own Bogeyman

This epitomises what you will be up against: http://www.mozilla.org/editor/editor-embedding.html . The Mozilla editor simply isn't finished yet.


Post 6

Baryonic Being - save GuideML out of a word-processor: A7720562

Thank you! That was a very helpful explanation, and so I can now comment further on how I could - conceivably, and over a long period of time - assist - in some way or other, possibly. :-D

I can sympathise with your reasoning. I agree that many of the solutions for different OSs are vastly more complicated or unfinished (that's all over computing, not just web development), and I obviously can't dispute the fact that most people still use IE, whatever their reasons.

I am fully appreciative and respectful of the 'real life' issues that do, as an unavoidable consequence, hinder development in these sorts of things.


From your description, I draw one simple conclusion that I can relate to: porting GuideDog would mean a complete re-write. Perhaps that was obvious.

However, there are consolations when considering a solution for other OSs. That is to say, a solution for Linux would necessarily encompass any X11-capable architecture, including Macs. And a solution in an OS-independent language is thus even better.

1) I could modify Mozilla Composer directly, or provide a plugin. I don't think this is a good idea, though.

2) I could develop a GuideML-specific HTML editing engine from scratch in C++ (obviously a very very long long long term goal).

3) I could modify Quanta Plus - KDE's HTML development environment. The trouble with this is that Quanta requires that you work with the actual HTML (it's not strictly WYSIWYG, although that is one of their next goals). However, I don't think that would be a big problem after all. For example:

Inserting a horizontal line in a WYSIWYG editor is pressing a button and seeing a horizontal line appear.

Inserting a horizontal line in Quanta is pressing a button and seeing <hr> appear.

So with Quanta, you get the added advantage of learning HTML as you go, without actually needing to know it beforehand. A GuideML version might be simple to implement.

Another problem there though is that Quanta requires the KDE libraries.


With any of these possibilities, though, it would still probably be helpful to have your GuideDog source code as a reference. What do you think?


"To put it bluntly, all this wouldn't have been so obstructive had not virtually all the early offers of help dried up when most people realised that they would have to get their hands dirty using Microsoft technologies."

I'm actually quite surprised that there were so many people like that. Unfortunately, I cannot help with the Windows version myself because (a) I don't know much about MSHTML or .NET, (b) I'm slowly phasing out my use of Windows, (c) I personally have no use for something designed for IE and (d) my Windows Internet security license is expiring soon and I don't want to risk a virus by using IE. [Just like all the rest, aren't I?]

(a) of course is the most significant factor.

I like creating things of value, but I also hate Microsoft, and a Microsoft-technology-based project is of no value to me, as well as being beyond my capabilities. I also think that IE usage is dropping as more people realise there are alternatives, and I am personally of the opinion that IE will not be in the majority in a few years, and that Windows too will be laughed at in museums by about 2050 at the latest. :-D :-) [Just a joke!]


Post 7

Felonious Monk - h2g2s very own Bogeyman

'I'm actually quite surprised that there were so many people like that'

I'm not: I've seen it so many times before: the 'you've invented the game, provided the ball and the pitch, but unless you play by MY rules I'm not going to join in' mentality. Pathetic, really.

'I like creating things of value, but I also hate Microsoft...'
I'm sticking with Windows because I expect to get the project finished by 2050, even if I have to do it all by myself.
You, on the other hand, had better decide what you value more: creating something useful or nurturing your hatred of a software company. But I meant what I said: help finish what's already there, and you can do what you like with the source code. No help = no source code. It's your choice.


Post 8

Baryonic Being - save GuideML out of a word-processor: A7720562

Well, I don't know much about MSHTML, as I say, but, like you, I'd like to broaden my horizons rather than narrow them. So to that end it would probably be quite good to see how things are done with Microsoft technologies.

What's that saying - something about knowing thine enemy.

I'll download it all when I have time and take a look.

However, how will I know if I have the ability to help or not if I can't see the source code already there? Is there any way around that?


Another question: when GD downloads articles, how does it extract the GuideML from the article?

(BTW, how do I know if I have SP1 of IE 6?)


Post 9

Felonious Monk - h2g2s very own Bogeyman

OK, you're on. There is a GotDotNet project workspace. They're upgrading the site right now: when I can get back onto it I'll post you a link and then you can apply to join. After baring your left knee and learning the funny handshake you should have full access ;-) I started off hosting on SourceForge but I found it such a pain in the ass to use that I gave up after a while.

The next time you browse to an h2g2 entry, replace the A123456 part of the address with 'test123456'. Then see what happens....


Post 10

Felonious Monk - h2g2s very own Bogeyman

Regarding the SP1, just click on Help|About in IE.


Post 11

Baryonic Being - save GuideML out of a word-processor: A7720562

I know about the test123456 thing, but that puts the GuideML into a text box. How do you get a computer program to extract that GuideML? Or am I missing something simple here?


Post 12

Baryonic Being - save GuideML out of a word-processor: A7720562

Bear in mind, though, that I really may not be able to help out at all, either for lack of knowledge or lack of time or both; and if that's the case I obviously won't be able to do a port either. But it's likely that in the longer-term future I'd be able to do a port.


Post 13

Felonious Monk - h2g2s very own Bogeyman

Try this link, and click on Apply to Join. You will need a Passport.


Post 14

Felonious Monk - h2g2s very own Bogeyman

The link, sorry:

http://www.gotdotnet.com/workspaces/directory.aspx?&Column=ActivityPercentile&Direction=DESC&Page=&ST=guidedog


Post 15

Felonious Monk - h2g2s very own Bogeyman

I didn't answer the question about the download, did I? Well, it works like this. The URL is fed into an HttpWebRequest object and a response is obtained. The contents of the response - the web page - are then fed into an SgmlReader, which writes out an XHTML translation of the web page. This is then loaded into an XmlDocument object and XPath is used to query the document for the various bits of the entry that are needed.

The code is:

' Needs these imports at the top of the file:
Imports System.IO
Imports System.Net
Imports System.Text
Imports System.Xml

Sub DownloadEntry()
    Dim sURL As String, sArticleID As String
    sArticleID = InputBox("Type the article number")

    ' An entry number is a letter followed by digits, e.g. A123456.
    Dim rxNo As New System.Text.RegularExpressions.Regex("[a-zA-Z](\d+)")
    sURL = gobjSettings.H2G2Root
    If sArticleID <> "" Then
        sArticleID = sArticleID.ToUpper
        If rxNo.IsMatch(sArticleID) Then
            ' The "test" address shows the entry's GuideML source in a text box.
            sURL += "test" & rxNo.Match(sArticleID).Groups(1).ToString
        End If
    End If

    If sArticleID <> "" Then
        Dim MyWebClient As System.Net.HttpWebRequest = WebRequest.Create(sURL)
        setupProxy(MyWebClient)
        Try
            Dim cc As New CookieContainer
            Dim uri As New System.Uri(sURL)

            ' The return value is discarded, so this call has no effect as written.
            cc.GetCookies(uri)
            Dim wresp As HttpWebResponse

            With MyWebClient
                .CookieContainer = cc
                wresp = .GetResponse()
            End With
            'see for following code

            ' Read the whole page and let SgmlReader turn the HTML into well-formed XML.
            Dim sr As StreamReader
            sr = New StreamReader(wresp.GetResponseStream())
            Dim sg As New Sgml.SgmlReader
            sg.DocType = "HTML"
            sg.InputStream = New StringReader(sr.ReadToEnd())

            Dim sw As New StringWriter
            Dim writer As New Xml.XmlTextWriter(sw)
            writer.Formatting = Formatting.Indented
            writer.Indentation = 4
            Do While sg.Read()
                If sg.NodeType <> Xml.XmlNodeType.Whitespace Then
                    writer.WriteNode(sg, True)
                End If
            Loop

            Dim xDoc As New Xml.XmlDocument
            xDoc.LoadXml(sw.ToString)

            ' The GuideML sits, escaped, inside the page's textarea.
            Dim sb As New System.Text.StringBuilder(xDoc.SelectSingleNode("//textarea").OuterXml)

            ' Unescape one level. &amp; goes last so that a doubly escaped
            ' entity such as &amp;lt; ends up as &lt; rather than a bare <.
            sb.Replace("&lt;", "<")
            sb.Replace("&gt;", ">")
            sb.Replace("&amp;", "&")
            xDoc.LoadXml(sb.ToString)

            ' Stamp the entry number onto the ARTICLE element.
            Dim objAttr As Xml.XmlAttribute = xDoc.CreateAttribute("ID")

            objAttr.Value = sArticleID
            Dim nodArticle As Xml.XmlNode

            nodArticle = xDoc.SelectSingleNode("//ARTICLE")

            nodArticle.Attributes.Append(objAttr)

            ' Keep only the ARTICLE element as the document to save.
            xDoc = New Xml.XmlDocument
            xDoc.LoadXml(nodArticle.OuterXml)

            ' mbFileOK is set elsewhere when the user confirms the save dialog.
            mbFileOK = False
            With sfdDownload
                .FileName = sArticleID & " - " & xDoc.SelectSingleNode("//SUBJECT").InnerText
                .InitialDirectory = Environment.GetFolderPath(Environment.SpecialFolder.Personal)
                .ShowDialog()
                If mbFileOK Then
                    Dim objWriter As New System.Xml.XmlTextWriter(.FileName, Encoding.ASCII)
                    ' Format the file writer itself (the original set these on the earlier writer).
                    objWriter.Formatting = Formatting.Indented
                    objWriter.Indentation = 4
                    xDoc.Save(objWriter)
                    objWriter.Close()
                    OpenGuideMLFile(.FileName)
                End If
            End With
        Catch ex As System.Net.WebException
            MsgBox(ex.Message & vbCrLf & _
                "Check your firewall or proxy settings: these might be preventing you from connecting.", MsgBoxStyle.Exclamation)

        Catch ex As System.NullReferenceException
            ' A missing textarea, ARTICLE or SUBJECT node ends up here.
            MsgBox("Unable to download the entry '" & sArticleID & "'." & vbCrLf & _
                "The entry may not exist: check the entry number.", MsgBoxStyle.Exclamation)
        Catch ex As Exception
            MsgBox(ex.Message, MsgBoxStyle.Exclamation)
        Finally

        End Try
    End If
End Sub
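
For anyone who wants to try the extraction step in isolation, without the proxy setup, SgmlReader or the save dialog, here is a minimal self-contained sketch. It is an illustration rather than GuideDog's code: the module and function names and the sample page are made up, and it assumes the page has already been converted to well-formed XHTML. It also swaps the manual entity replacement for XmlNode.InnerText, which hands back the textarea's text with one level of escaping already decoded.

```vbnet
Imports System
Imports System.Xml

Module ExtractSketch
    ' Pulls the GuideML out of an XHTML page that holds it, escaped, in a textarea,
    ' and stamps the entry number onto the ARTICLE element.
    Function ExtractArticle(xhtml As String, articleId As String) As XmlDocument
        Dim page As New XmlDocument()
        page.LoadXml(xhtml)

        Dim box As XmlNode = page.SelectSingleNode("//textarea")
        If box Is Nothing Then Throw New InvalidOperationException("No textarea found")

        ' InnerText decodes the entities once, leaving the GuideML source as text.
        Dim guide As New XmlDocument()
        guide.LoadXml(box.InnerText)

        Dim article As XmlElement = TryCast(guide.SelectSingleNode("//ARTICLE"), XmlElement)
        If article Is Nothing Then Throw New InvalidOperationException("No ARTICLE element found")
        article.SetAttribute("ID", articleId)
        Return guide
    End Function

    Sub Main()
        ' Made-up stand-in for the converted test page.
        Dim xhtml As String =
            "<html><body><textarea>" &
            "&lt;ARTICLE&gt;&lt;SUBJECT&gt;Fish &amp;amp; Chips&lt;/SUBJECT&gt;&lt;/ARTICLE&gt;" &
            "</textarea></body></html>"

        Dim doc As XmlDocument = ExtractArticle(xhtml, "A123456")
        Console.WriteLine(doc.OuterXml)
    End Sub
End Module
```

Because the decoding happens exactly once, an ampersand that was escaped in the entry's own source stays escaped in the extracted GuideML.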


Post 16

Baryonic Being - save GuideML out of a word-processor: A7720562

OK, thanks; I've applied.

