A Conversation for Speech recognition
Peer Review: A781463 - Speech recognition
Bluttsuuft Started conversation Jul 7, 2002
Entry: Speech recognition - A781463
Author: Bluttsuuft - U174943
I have tried to write an accessible piece on speech recognition from the point of view of the technical support professional who gets all kinds of questions on how to work with the software.
This is not supposed to be a piece about the science of speech recognition. I do not understand this science myself, I could not write about it in a knowledgeable fashion. I DO get to talk to a lot of people who do not fully grasp the realities of how this kind of software works from a user perspective. Hence the piece.
A781463 - Speech recognition
xyroth Posted Jul 8, 2002
I find two problems with this article.
First (and relatively trivial) is the complete lack of headings.
The second problem comes from your above-mentioned lack of understanding of the technology.
Speech software (contrary to the PR of the companies) is still basically at the travel dictionary level of competence. It also has the problem that speech interfaces are basically primitive.
Contrary to your statement, unless you already happen to talk fairly like a Dalek (i.e. no character in the voice), you have to learn how to talk pretty much like a Dalek.
This involves clearly enunciating, and trying to speak exactly the same sentence exactly the same way.
This is because speech and writing have almost nothing in common. This is most clearly demonstrated with dialects, where two people can speak intelligently in chatrooms or forums for years, but cannot understand each other on the phone.
It is still the case that speech software (or language understanding software generally) works best when doing what is known as technical translation. This is when you are using polysyllabic words that specify an idea exactly, and thus leave little room for the software to screw up.
Anyone who tells you that you can read literature or poetry to a computer and it will get it right most of the time is either a liar or a fool. The technology just isn't up to it yet.
A781463 - Speech recognition
Spiff Posted Jul 8, 2002
Hi Bluut,
So, you're back and you've got Schreiblust, eh?
Here's my feedback, along with a for yourself,
I wasn't initially sure from the title what this piece would be about. Sure, you clear up any confusion straight away in the piece, but it is not clear until you decide to have a look.
Headers are a good suggested improvement, I agree, but do you know about Guide-ML? Do you know about mark-up languages generally? You know, tags and stuff?
This needs tags around your paragraphs, and some other minor formatting that would make it considerably more readable. It may seem like fancy extras, but it makes a difference.
As to the content, I admit that I too am dubious about the quality of transcription with these products. I have worked with translators who used them and most had reservations.
This is an interesting group to poll on the value of the things, cos freelance translators would benefit enormously (in terms of productivity) from a good quality speech-recognition programme! They are not flocking to pick up the existing ones, though.
This does *not* make it an unsuitable subject for an entry. Just expressing my personal reservations.
Best of luck with this and future efforts,
spiff
A781463 - Speech recognition
Bluttsuuft Posted Jul 8, 2002
Thank you for your feedback. It is much appreciated.
I see I got another comment on headers and I'll answer that in the reply to that posting.
As for my lack of scientific understanding of the subject: writing speech recognition software is quite complex. I have a basic grasp of how it works on the development level but I have no coding ability and I will do myself the huge favour of not pretending that I do.
I wanted to offer users of this kind of software some suggestions on how to improve their chances of being correctly understood by it. I know that a lot of people do not (want to) take the trouble of using it correctly. This results in a lot of frustration and agony because getting the wrong result when you're dictating very quickly leads to anger and resentment. Believe me when I say that I know what I'm talking about.
As for talking like a Dalek: the software does work better when you use inflection. Although there are people who always use the same tone of voice, most of us do not. When you use the same tone of voice all day while dictating, you will find that dictating is tedious and after a short while also painful.
I'm not going into translation software, that's something totally different.
Dictating poetry or literature in general is possible too. What you would want to do here is to build your own vocabulary. There is software that allows you to do that. Once trained you will notice that your recognition improves dramatically. Having said that, I have tried dictating Shakespeare (and parts of The Guide!) without creating a specific vocabulary or training. I've printed out the result and stuck it to the wall to look at every time I want a good laugh.
I also get a lot of positive feedback from people who use this kind of software in their everyday life. Although there is obviously room aplenty for improvement, we've entered a stage IMHO where the software allows a level of productivity that's practical in an office environment.
I wouldn't want to land a plane using only my voice, though.
A781463 - Speech recognition
Bluttsuuft Posted Jul 8, 2002
Spiff,
Thank you very much for your comments as well. Don't be modest with the salt. Rub it in thick. I'm in tech support, pain is my life. I gather that you would have to go to extreme lengths to even register on the scale (I'm not doubting your ability, mind).
Although I haven't written much here yet, I want to add more to my curriculum because I absolutely believe in what the late great DNA wanted to do here. Rather, it is because of my genuine respect for this great man that I haven't added a lot more. The reason for that is: I know nothing. At least that's what it feels like. I am confronted with the limitations of my knowledge on a daily basis and it makes me very wary of adding anything here. I want to expose as little of my ignorance as possible. I'm vain that way.
But I'm working on it. I'm teaching myself HTML to create my first webpage and I would have liked to add some headers to make my entry easier to read, but I don't know how to go about doing that yet. It will be a pleasure to find out how to create a decently formatted text. Actually, I just got an idea that should yield good results fairly quickly...
As for the title of my piece, I thought that "speech recognition" would be clear enough but you rightly assume there are more aspects to that than what I covered. Honestly, the title fit in the "Subject" box and that was plenty good enough for me. However, if you want a more elaborate title, I am only too happy to accommodate. I've somewhat of a reputation for creating titles for papers. How about "Some Comments On The Proper Training And Use Of Contemporary Speech Recognition Software - Practical Pointers From A TechSupport Point Of View"?
Also, when I want to find my piece in the "Search" box using the title as criterion, why do I only get a less than 40% probability that I'm getting the right text? Does that make sense?
Greetings.
A781463 - Speech recognition
Ausnahmsweise, wie üblich (Consistently inconsistent) Posted Jul 8, 2002
Hi,
Maybe you still need a bit more of a short introductory paragraph? Only after reading quite a way into the entry did I know what kind of speech recognition software it was about. I assume it's solely for dictating a text and having it converted to a computer document. But there's also IVR. And in factory settings, machines that require the use of both hands and allow the operator to give additional spoken commands. (A digitizer for semiconductor layouts comes to mind.)
I don't know who Moira Stewart is (but I do know Onslow). But then I've been gone a long time.
There's a story about Dr. Harry Tennant (A.I. / Natural Language Processing) testing some software. It was used to produce custom reports from a financial data base. He got a few standard reports OK. Then he asked it "And how about recently?" - It crashed!
Awu.
P.S. I think the writing guidelines say to avoid trying to write in DNA's style. So the reference to dingos' kidneys may have to go.
A781463 - Speech recognition
Spiff Posted Jul 8, 2002
Don't worry about the search tool. It's not the best thing on h2g2!
How about 'Voice Recognition Software'...
But then, as your super-title rightly points out, this piece is more like 'In defence of VRS'. It starts out from the assumption that people have certain negative prejudices about VR.
Either you could do a more introductory piece on what VRS is all about. Or do an out and out 'Apology' style piece, which is more where this one tends, I think.
Or you could do both with links.
All kinds of possibilities.
seeya
spiff
A781463 - Speech recognition
Bluttsuuft Posted Jul 8, 2002
Awu,
Two things on this:
1) Since DNA's work is very much in the realm of literature I feel quite free to quote him or refer to his work. I do that all the time with other writers, he is no different. I use this particular one because it will be readily recognised in this environment (Granted, I did not actually use quotes, I'm sorry. I'll make sure to do that next time). You do not need to worry that I would try to emulate him or use his words as my own. I have not the slightest desire to do so, there's no such thing.
2) The big DNA bear owes me. Big time.
I'm from Belgium, you see. I live in Ghent.
I will freely quote DNA anytime it pleases me to do so.
But I will write you a nice introduction. I do mean to offer quality. It might take a while to get there. Bear with me.
Moira Stewart is a BBC newsreader. Beautiful woman. Beautiful voice.
Be well.
A781463 - Speech recognition
xyroth Posted Jul 9, 2002
On the question of a title, "how to use speech recognition software" or "how to get better results from speech recognition software" might do the job.
As to inflection, the current generation of systems does work better if you don't speak in a monotone, but they seriously screw up if you have a good speaking or radio voice, to the extent that you do have to tone it down almost to the level of speaking like a Dalek.
The reason I mention translation is that speech to text is a particularly hard problem in language translation.
In particular, it requires the use of information at every level of understanding just so that it can do a fairly incompetent job.
To get it to the point where you can use speech recognition like it is used in Star Trek (i.e. as standard) is a research-level problem which will involve not just revolutions in speed of processing, but also revolutions in the methods of processing.
This is why, despite doublings in processor speed every 18 months or so, speech recognition systems are still little better than those holiday translation dictionaries you can buy.
They do something useful, but not to the level advertised, and won't in the foreseeable future.
As to how the software works, I do understand it (and why it fails), which is why I have such a low opinion of the current software.
A781463 - Speech recognition
Spelugx the Beige, Wizard, Perl, Thaumatologically Challenged Posted Jul 9, 2002
I haven't been in PR for a while, so this may seem like a stupid thing to say/ask, but I'll ask it anyway. Is it really a *problem* for an article submitted to PR to have been written using plain text? Is not having proper headings really a reason not to pick an entry? My view has always been that PR was more about checking whether the article made sense, checking factual accuracy, and rewriting unclear 'technical' (i.e. language specific to the subject being discussed) paragraphs, not turning the article into something that could be copied straight into the edited guide. What are your views?
spelugx -- former scout
A781463 - Speech recognition
Spiff Posted Jul 9, 2002
hi Spelugx,
As you ask for responses: I don't think any formatting is 'necessary for PR' but I *do* think *some* formatting makes a piece more readable, and is therefore helpful to reviewers.
The suggestions to add headers and para marks were hardly overbearing and came along with other comments of the type you mention. Seems fair enough to me.
seeya
spiff
A781463 - Speech recognition
Spelugx the Beige, Wizard, Perl, Thaumatologically Challenged Posted Jul 9, 2002
I would agree that formatting does make an article more readable, and therefore likely to be favoured more by reviewers. What particularly irritated me were the remarks made by xyroth at the top of the thread, but I've now calmed down and re-read them, and they seem in line with what you say, but later you yourself say:
> This needs tags around your paragraphs, and some other minor
> formatting that would make it considerably more readable. It may seem
> like fancy extras, but it makes a difference.
It is totally unnecessary for GuideML to be used; this post is not in GuideML yet I've used paragraphs. *This* is what particularly annoys me, the way that authors are encouraged to do more than just write. Surely writing the article is the most important thing. Polishing the formatting into a publishing format is the responsibility of those who will edit it. The other objection I have about people pushing GuideML is that not everybody wants to learn how to use it, and the continual encouragement to learn it could discourage people from writing. For 'Bluttsuuft' this is his first article. The system should be easily accessible to people using it for the first time; in particular we don't want to set the standard too high.
spelugx -- lost this argument before.
A781463 - Speech recognition
Smij - Formerly Jimster Posted Jul 9, 2002
Hi Bluttsuuft,
Just a note on the DNA quotes; you are, of course, free to quote whoever you like in your entries. However, as the Edited Guide is intended to be accessible to all (whether they're Hitchhiker's fans or not - this is, after all, not a Douglas Adams fan-site), references to 'dingo's kidneys' will only confuse non-Hitchhiker's fans and might spoil an otherwise splendid entry.
As a rule, therefore, the editors remove/reword all off-topic Hitchhiker's quotes from entries before they hit the front page.
Hope that helps to clarify the issue,
Jimster
A781463 - Speech recognition
Spiff Posted Jul 9, 2002
It's a fair point and I will bear it in mind. Far be it from me to discourage those who want to write but don't want to learn about gml.
Even so, it might be said that by not encouraging authors to use GuideML, we end up making more work for volunteer sub-eds, whose time is valuable too.
A similar point might be raised about fixing typos in PR, I guess.
Anyway, glad to hear you've calmed down!
seeya
spiff
*stifling in Strasbourg*
A781463 - Speech recognition
Smij - Formerly Jimster Posted Jul 9, 2002
Hi spelugx,
GuideML is just a recommendation, as you point out. If the entry's good, it'd never be rejected just because it was in plain text (hence why that option exists). A good sub-editor can always do the work there. But it should be noted that GuideML is all part of the fun... well, maybe fun's too strong a word.
I should note that before I started at the Beeb, I'd hardly done any HTML, just very, very basic stuff. But I picked up GuideML very quickly.
Jimster
A781463 - Speech recognition
Spiff Posted Jul 9, 2002
Yeah, I guess that's why I might tend to mention it quite often to new authors in PR: I'd never done any HTML at all and learnt it here.
spiff
*wondering if Bluttsuuft minds us having a chat about the merits of guide-ml in his thread*
*figuring that if nothing else, at least it is keeping it at the top of the list!*
A781463 - Speech recognition
Smij - Formerly Jimster Posted Jul 9, 2002
Good point, Spiff - chatting in the thread's a great way to make sure more people see your entry.
Jimster
(who got told off for playing Calvinball in his Robert De Niro thread)
A781463 - Speech recognition
Wayfarer-- I only wish I were crackly Posted Jul 9, 2002
Hi Bluttsuuft.
Quotes are really only appropriate when they 1) are relevant to the topic and 2) help to make a point, help explain something, or tell the reader something they may not have known already (e.g. a quotation from a song that is being discussed). DNA semi-quotes are superfluous to an entry about using speech recognition software, and to someone who hasn't read the books or just doesn't remember that part, fetid dingo's kidneys will just seem bizarre and confusing.
And, as Jim pointed out, they get edited out anyway.
A781463 - Speech recognition
Bluttsuuft Posted Jul 9, 2002
I am very happy with the attention my post is receiving. It is in itself an interesting interaction, worthy of a post. By observing your reactions one learns what is perceived to be important and what is deemed inappropriate; how these messages are communicated is a lesson all by itself.
Very interesting. Quite entertaining.
Guide-style HTML seems to be both desirable and non-essential. Should we take the trouble to learn it or should we just apply a finger to the key when the text wants to come up for air?
References to The Creator should be shunned it seems, to protect the innocent. Selective amnesia on the other hand is not a problem.
Titles should state the business but not extensively so.
Editors seem to care for the plate but not for the dish.
Things to consider while rewriting the piece.
Stay in touch.
A781463 - Speech recognition
Spiff Posted Jul 9, 2002
Nice response!
"Editors seem to care for the plate but not for the dish."
Please don't take this on my authority! I think the sub-eds, as volunteers, enjoy the chance to read other researchers' contributions, and work hard getting them into shape for the EG.
glad to see you enjoying peer review,
seeya
spiff