Windows Speech Recognition: Editing Text with Voice

BreakingModern – I previously wrote about how Windows Speech Recognition (WSR), though treated by Microsoft as a state secret, is an integral part of Microsoft Windows. Its powerful features, which let you input text and control your computer using only your voice, are available as soon as you add a suitable microphone to your PC.

Now you know how to invoke WSR, set up the microphone and begin dictating text. Using Windows Notepad, we ended up with this evocative text window on our screen.

Windows Speech Recognition 2-1

But of course, simply inputting text is not enough — you have to edit and correct it, and to do that you have to navigate within the text. So let’s look at some text-editing and navigation functions.

Navigation and Editing

Let’s say you have completed all the steps in the previous story and have the Windows Notepad window on your screen. WSR is running, the microphone is on and you have dictated the greeting shown in the screenshot.

Say this aloud: “SELECT GREETINGS.”

WSR will react like this:

Windows Speech Recognition 2-2

The word “Greetings” is now selected (or painted) pretty much as if you had double-clicked the word with a mouse.

Now say this aloud: “DELETE, FEAR ME.”

Your Notepad screen should then look like this:

Windows Speech Recognition 2-3

The selected word “Greetings” disappeared when you said DELETE and was then replaced by the other two words, “Fear me.”

But, you think, perhaps “the Singularity” should be wordier.

Say this aloud: “GO AFTER AM.”

Then say aloud: “THE ONE THEY CALL.”

The input cursor will move to the point after the word “am” and then the rest of your dictated text will be inserted there, with this result:

Windows Speech Recognition 2-4

So WSR gives you two basic functions for text navigation and editing: SELECT and GO. With SELECT you can change individual words, and with GO you can place the input cursor wherever you want within the text.
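If it helps to see the logic spelled out, here is a minimal Python sketch of what SELECT, DELETE and GO AFTER do to a piece of text. It is only a toy model, not how WSR works internally. The `Buffer` class, its method names and the starting sentence are all assumptions made up for illustration, and unlike WSR the sketch does not add spaces for you.

```python
import re


class Buffer:
    """Toy text buffer with a cursor and an optional selection (hypothetical)."""

    def __init__(self, text):
        self.text = text
        self.cursor = len(text)   # cursor starts at the end of the text
        self.selection = None     # (start, end) offsets, or None

    def _find(self, phrase):
        # Match whole words only, ignoring case, the way a spoken word would.
        match = re.search(r"\b" + re.escape(phrase) + r"\b", self.text, re.IGNORECASE)
        if match is None:
            raise ValueError(f"{phrase!r} is not in the text")
        return match.start(), match.end()

    def select(self, phrase):
        """SELECT <phrase>: highlight the phrase."""
        self.selection = self._find(phrase)
        self.cursor = self.selection[1]

    def delete(self):
        """DELETE: remove the selection and leave the cursor in its place."""
        if self.selection is None:
            return
        start, end = self.selection
        self.text = self.text[:start] + self.text[end:]
        self.cursor = start
        self.selection = None

    def go_after(self, phrase):
        """GO AFTER <phrase>: put the cursor just past the phrase."""
        _, end = self._find(phrase)
        self.cursor = end
        self.selection = None

    def dictate(self, words):
        """Insert dictated words at the cursor."""
        self.text = self.text[:self.cursor] + words + self.text[self.cursor:]
        self.cursor += len(words)


# Assumed sample sentence; the real one is only visible in the screenshots.
buf = Buffer("Greetings, I am the Singularity.")
buf.select("greetings")
buf.delete()
buf.dictate("Fear me")
buf.go_after("am")
buf.dictate(" the one they call")
print(buf.text)  # Fear me, I am the one they call the Singularity.
```

The point of the sketch is the order of operations: SELECT only marks the word, DELETE removes it and parks the cursor there, and GO AFTER moves the cursor without touching the text.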

Some Pointers

When using the SELECT command, always DELETE the word you selected before inputting the replacement. Otherwise WSR launches into a clumsy text-replacement procedure that’s typically overkill. This trivial point is the chief advantage of the heavily promoted Dragon NaturallySpeaking over WSR. Dragon lets you directly overwrite text, whereas WSR complicates the overwrite process. (Also, Dragon’s recognition accuracy is a little better than that of WSR but, as noted previously, you’ll probably never notice the difference.)

These SELECT commands will also work (where X is a numeral):

  • SELECT NEXT WORD
  • SELECT LAST WORD
  • SELECT NEXT X WORDS
  • SELECT LAST X WORDS
  • SELECT SENTENCE
  • SELECT PARAGRAPH
  • SELECT TO END OF SENTENCE
  • SELECT TO START OF SENTENCE
  • SELECT TO END OF PARAGRAPH
  • SELECT TO START OF PARAGRAPH
  • SELECT NEXT X SENTENCES
  • SELECT LAST X SENTENCES
  • SELECT NEXT X PARAGRAPHS
  • SELECT LAST X PARAGRAPHS

(You can also say DELETE instead of SELECT in these commands, but if WSR misunderstands you, the results can be like setting off a bomb in your text.)
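The SELECT and DELETE phrases listed above follow a small, regular pattern, which a few regular expressions can capture. The Python sketch below is purely illustrative: the `parse` function and its field names are my own invention, it assumes the numeral X arrives as digits, and it covers only the forms in the list (not SELECT followed by a word from your text).

```python
import re

VERB = r"(?P<verb>SELECT|DELETE)"

# One pattern per family of phrases in the list above.
RULES = [
    re.compile(VERB + r" (?P<direction>NEXT|LAST) WORD"),
    re.compile(VERB + r" (?P<direction>NEXT|LAST) (?P<count>\d+) "
                      r"(?P<unit>WORD|SENTENCE|PARAGRAPH)S"),
    re.compile(VERB + r" (?P<unit>SENTENCE|PARAGRAPH)"),
    re.compile(VERB + r" TO (?P<edge>START|END) OF (?P<unit>SENTENCE|PARAGRAPH)"),
]


def parse(spoken):
    """Return the parts of a listed SELECT/DELETE phrase, or None if unrecognized."""
    phrase = " ".join(spoken.upper().split())  # normalize case and spacing
    for rule in RULES:
        match = rule.fullmatch(phrase)
        if match is None:
            continue
        fields = {k: v for k, v in match.groupdict().items() if v is not None}
        if "direction" in fields:
            # "NEXT WORD" means one word; "NEXT 3 WORDS" means three.
            fields.setdefault("unit", "WORD")
            fields["count"] = int(fields.get("count", 1))
        return fields
    return None


print(parse("select next 3 words"))
print(parse("SELECT TO END OF PARAGRAPH"))
```

Anything the rules do not match comes back as `None`, which is roughly where a real recognizer would fall back to treating your words as dictation.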

When using GO you can say (X is a word or phrase in your text):

  • GO BEFORE X
  • GO AFTER X

These GO commands are also handy:

  • GO TO END OF PARAGRAPH
  • GO TO END OF DOCUMENT
  • GO TO END OF SENTENCE
  • GO TO START OF DOCUMENT
  • GO TO START OF SENTENCE
  • GO TO START OF PARAGRAPH

Sometimes the presence of punctuation means that the GO command does not get you exactly where you want to go. In such cases you can use the PRESS command to move the input cursor one character or one line at a time:

  • PRESS LEFT ARROW
  • PRESS RIGHT ARROW
  • PRESS DOWN ARROW
  • PRESS UP ARROW

You can actually use the PRESS command to push any key on the keyboard, and you could (eventually) spell out any word that way, no matter how non-standard. For instance, for Cape d’Or (in Nova Scotia) you would say: “CAPS CAPE SPACE PRESS D PRESS APOSTROPHE PRESS CAPITAL O PRESS R.”
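To make that spoken sequence easier to follow, here is a rough Python sketch that turns it into the characters it produces. This is just a model of the example above, not WSR’s actual behavior. The `expand` function and the `KEY_NAMES` table are hypothetical, and the sketch knows only the handful of words used in the Cape d’Or example.

```python
# Spoken key names that stand for a single character (assumed, minimal table).
KEY_NAMES = {"APOSTROPHE": "'", "SPACE": " "}


def expand(spoken):
    """Convert a spoken CAPS/PRESS sequence into the text it would type."""
    tokens = spoken.upper().split()
    out = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token == "CAPS":
            # CAPS <word>: dictate the word with an initial capital.
            out.append(tokens[i + 1].capitalize())
            i += 2
        elif token == "PRESS" and tokens[i + 1] == "CAPITAL":
            # PRESS CAPITAL <letter>: an uppercase keystroke.
            out.append(tokens[i + 2].upper())
            i += 3
        elif token == "PRESS":
            # PRESS <key>: a named key or a lowercase letter.
            key = tokens[i + 1]
            out.append(KEY_NAMES.get(key, key.lower()))
            i += 2
        elif token in KEY_NAMES:
            # SPACE works without PRESS in front of it.
            out.append(KEY_NAMES[token])
            i += 1
        else:
            out.append(token.lower())
            i += 1
    return "".join(out)


print(expand("CAPS CAPE SPACE PRESS D PRESS APOSTROPHE PRESS CAPITAL O PRESS R"))
# Cape d'Or
```

Each PRESS consumes exactly one key name, which is why spelling a word this way takes so many syllables.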

DELETE, BACKSPACE, ENTER, SCROLL UP, SCROLL DOWN, PAGE UP, PAGE DOWN and (as used above) SPACE can also be used without preceding them with PRESS.

If the text you are composing includes a word that is also used as a command, and this causes problems, use the word LITERAL in front of it, as in “LITERAL DELETE.” Or if you have a problem getting it to input the word LITERAL, then say, “LITERAL LITERAL.” Yes.
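The LITERAL rule is easy to picture as a tiny escape mechanism. The Python sketch below is only an illustration under my own assumptions: the `interpret` function and its short `COMMAND_WORDS` set are hypothetical, and real WSR decides between commands and dictation in far more sophisticated ways.

```python
# A few words that act as commands on their own (assumed, incomplete set).
COMMAND_WORDS = {"DELETE", "BACKSPACE", "ENTER", "SPACE"}


def interpret(spoken):
    """Label each spoken word as a command or as text, honoring LITERAL."""
    tokens = spoken.upper().split()
    result = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "LITERAL" and i + 1 < len(tokens):
            # LITERAL forces the following word to be typed, even if it
            # would normally be a command (or is the word LITERAL itself).
            result.append(("text", tokens[i + 1].lower()))
            i += 2
        elif tokens[i] in COMMAND_WORDS:
            result.append(("command", tokens[i]))
            i += 1
        else:
            result.append(("text", tokens[i].lower()))
            i += 1
    return result


print(interpret("DELETE"))           # [('command', 'DELETE')]
print(interpret("LITERAL DELETE"))   # [('text', 'delete')]
print(interpret("LITERAL LITERAL"))  # [('text', 'literal')]
```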

Meanwhile, you can still use your keyboard at any point.

We’ve just scratched the surface of the voice commands that WSR responds to. If you master the commands I listed above you can use WSR for ordinary typing, such as composing email, using only your voice.

Correction

But, if we are talking about editing, we also need to talk about correction, which is subtly different. Correction is where WSR has responded incorrectly to a word that you pronounced, and you want it to respond correctly in the future.

Let’s say that, when we were dictating the Nova Scotia example, WSR mistook “cape” for “tape,” like this:

Windows Speech Recognition 2-5

Say this: “CORRECT TAPE.”

Notice that you say the word that WSR rendered (tape), not the word that you originally wanted to say (cape).

WSR will respond with a screen like this:

Windows Speech Recognition 2-6

Basically, WSR has highlighted the word in question and then generated a list of alternatives. If the word you seek is on the list, you say its corresponding number and then say, “OKAY.” If the desired alternative is not there, you can say the word again and a different list will appear.

After you use CORRECT, the software should have a better chance of recognizing the word the next time you say it.

Once the correction process is done the cursor will return to the place where it was before you began. This lets you finish a paragraph and then go back and correct problems within it. CORRECT LAST WORD and CORRECT NEXT WORD also work.

Of course, inputting raw text is not enough these days — you’ll also want to format it. We’ll cover that in a future installment of our WSR How To, plus Program Control, Graphics Control and other tips and traps — including the elusive topic of dictation techniques.

For BMod, I’m Lamont Wood.

Feature image credit: © Sergey Nivens / Dollar Photo Club

All screenshots: Lamont Wood

Author: Lamont Wood

Based in San Antonio, Texas, Lamont Wood is a senior editor at aNewDomain.net. He’s been covering tech trade and mainstream publications for almost three decades now, and he’s a household name in Hong Kong and China. His tech reporting has appeared in innumerable tech journals, including the original BYTE (est. 1975). Follow @LAMONTwood on Twitter.
