Speech recognition: Tips for Translators

Alison Tunley
April 22, 2022
4:42 pm

This blog is brought to you by speech recognition, which I was finally forced to embrace having fractured my shoulder cycling at our local velodrome. Many years ago I studied acoustic variation in speech for my PhD and did a placement with a speech recognition company. So I knew enough about the complexity of the speech signal to appreciate the scale of the challenge involved in deciphering it. Early encounters with speech recognition technology had done little to persuade me that it was reliable enough to be useful. But suddenly being down to only one functioning arm forced me to revisit my prejudice and I have had to eat humble pie and admit these tools have improved beyond my wildest imaginings.

I am aware that most people were way ahead of me on this thanks to Siri, Alexa and their ilk, but here’s what I’ve learnt during the last few weeks of being reliant on speech recognition for my translation work. The first hurdle to overcome was my feeling of stupidity talking at my computer. Thank goodness I work from home and could close the door firmly to keep out my mocking teenagers. The second lesson was that the computer matters, as does the microphone. I first tried using a speech recognition tool on a low-powered laptop and it got only about one word in three right, which seemed to confirm my worst suspicions. But then I found a suitable microphone for my much more powerful desktop computer (I nicked the PlayStation headset my 13-year-old uses for gaming while chatting to his mates). Immediately the accuracy shot up to 90 per cent or higher.

Making full use of the commands in the dictation dictionary will make your life much easier. For example, you will need to memorise the commands for inserting punctuation (open quote, close quote, forward slash etc.) and get into the habit of dictating every punctuation mark explicitly, including hyphens between words. It’s not foolproof: I never figured out why the tool would sometimes insist on writing ‘colon’ rather than ‘:’ (in theory it should only write the full word when you say ‘literal colon’, but in practice this seems somewhat haphazard).

Another important thing to learn is how to use the correction tools to fix any errors. You can instruct the tool to select a particular word or phrase, at which point it will offer you several alternatives in a numbered drop-down list. You can then say the number you want or repeat the phrase to get a new set of suggestions. If all else fails, you can ask the tool to let you spell the relevant word and, if necessary, it will then offer you the option of recording the item for entry in its dictionary.

Learning the commands in the dictation dictionary is important for another reason. When I first started using speech recognition, I would regularly issue a Windows command by accident and find myself taken out of dictation mode. Often this would result in something entirely random happening, such as unexpectedly launching my banking application in the middle of attempting to dictate a sentence! I soon learned that speaking in full sentences, or at least in phrases of several words, makes this kind of misinterpretation much less likely. However, this hasn’t stopped me accidentally issuing a command to abruptly close the entire web browser every so often, so I recommend regularly saving your work!

Next, we will look at using speech recognition tools specifically for translation and proofreading tasks and offer some tips for working productively using dictation.

The previous blog described getting to grips with speech recognition tools for successful dictation. When it comes to translation work, certain types of project lend themselves more readily to being handled by a dictation application. The very first project I took on was relatively straightforward from a linguistic perspective in the sense that it consisted of a set of product descriptions for an online retailer. However, it was riddled with proper nouns, brands and catchy product names that were not in the speech recognition dictionary. You have the option of spelling out individual words, but when these are frequent or contain foreign diacritics, the whole thing slows you down so much it becomes exasperatingly tedious.

I developed a technique of inserting place markers for proper nouns or other text the speech recogniser couldn’t handle, so I could cut and paste from the original later. In fact, another key lesson was just how helpful it is to still have one hand able to use the mouse to cut and paste, and for switching quickly between applications, hitting return, and making minor edits. The challenge would be far greater if you were forced to use speech recognition for everything. I’m sure it’s not impossible, but the learning curve would be much steeper.
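For anyone who would rather automate the final cut-and-paste step, the place-marker idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow described above: the marker convention (`TERM1`, `TERM2` and so on), the function name `restore_terms` and the sample names are all hypothetical, and the list of original terms is assumed to have been collected from the source text by hand.

```python
import re


def restore_terms(dictated: str, terms: list[str]) -> str:
    """Replace TERM1, TERM2, ... (a hypothetical marker convention)
    with the matching entry from `terms`."""

    def swap(match: re.Match) -> str:
        index = int(match.group(1)) - 1  # markers are numbered from 1
        # Leave the marker in place if no term was supplied for it,
        # so that it is still easy to spot during proofreading.
        return terms[index] if 0 <= index < len(terms) else match.group(0)

    return re.sub(r"\bTERM(\d+)\b", swap, dictated)


# Invented example data: names copied from the source text, in marker order.
source_terms = ["Müller & Söhne", "CosyGlow"]
draft = "The TERM2 lamp by TERM1 is dimmable."
print(restore_terms(draft, source_terms))
# prints: The CosyGlow lamp by Müller & Söhne is dimmable.
```

Numbered markers are used here so that a name occurring several times only has to be copied from the source once. Whether a dictation tool will reliably transcribe such a marker is something you would need to check with your own setup.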

But the biggest takeaway from this was to pick and choose projects carefully and avoid anything with lots of text that would not be in a standard dictionary. I also found that post-edited machine translation (PEMT) and proofreading projects were fiddly because these involve moving around in existing text, which is harder than creating text from scratch.

One annoying quirk of using the speech recognition tool in combination with CAT applications was the way it automatically inserts a space at the end of sentences. CAT tools usually have individual sentences in separate segments and then insert the necessary spaces between sentences automatically. There may be a way to stop the speech recognition tool doing this, but I never figured it out and resorted to manually deleting the rogue additional space at the end of every sentence.

Aside from the technical challenges, working in this way had an impact on the translation process itself. In particular, it required thinking ahead in whole sentences, rather than typing and retyping as part of the translation process. This made me realise that my standard practice is to read the German then start typing in English, immediately going back and rephrasing repeatedly until I reach the end of the sentence, at which point I read the whole sentence and rework it again. This approach simply isn’t efficient using speech recognition due to the added difficulty of editing existing text. It makes more sense to try and produce a more polished initial version by thinking before you speak! I wonder if this increased forethought will be a habit that I will carry over to my translation practice even when working with two arms.

I successfully used speech recognition with an online CAT tool, within Microsoft Word and Excel, to compose messages in a browser-based email application and to work with SDL Trados. Working with the Termbase in Trados proved slightly tricky as once you are in dictation mode you cannot pick up terms from the dictionary quite as easily as when you are typing. In the online translation tool that I use, if you switch to a different tab in the middle of dictation to look something up (e.g. using an online dictionary, thesaurus or a generic web search tool), you lose everything you’ve entered so far, so you need to remember to save your partial translation before switching away.

One final important difference to note when working with speech recognition tools is that a different type of proofreading is required for the completed translation. Errors are unlikely to be spelling mistakes because the speech recognition tool is limited to inserting words from its dictionary. What you will find are real words that have been misheard (e.g. our -> are, this -> thus etc.). These are often harder to spot, so the text will need very careful checking.
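As a rough aid to that checking, a short script can highlight every occurrence of a word you know the tool tends to mishear. The sketch below is an assumption-laden illustration rather than a recommended tool: the `CONFUSABLE` table contains only the two pairs mentioned above, the function name `flag_confusables` is made up, and it flags every occurrence of a listed word, correct or not, so it can only point you at places to look.

```python
import re

# Hypothetical confusion table, seeded with the pairs mentioned in the text.
# Extend it with whatever mishearings you meet in your own work.
CONFUSABLE = {"our": "are", "are": "our", "this": "thus", "thus": "this"}


def flag_confusables(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, word found, possible intended word) triples."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in re.finditer(r"[A-Za-z']+", line):
            word = match.group(0).lower()
            if word in CONFUSABLE:
                hits.append((line_no, word, CONFUSABLE[word]))
    return hits


# Invented sample text containing two misheard words.
sample = "Please check are website.\nThus product is new."
for line_no, word, alternative in flag_confusables(sample):
    print(f"line {line_no}: '{word}' (did you mean '{alternative}'?)")
```

A list like this will produce plenty of false alarms for common words, so it supplements a careful read-through and does not replace it.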

About the Author

Alison Tunley

Alison is a seasoned freelance translator with over 15 years of experience, specialising in translating from German to English. Originally from Wales, she has been a Londoner for some time, and she holds a PhD in Phonetics and an MPhil in Linguistics from the University of Cambridge, where she also completed her First Class BA degree in German and Spanish…
