Tony Robinson (speech recognition)

Tony Robinson is a researcher in the application of recurrent neural networks to speech recognition,[1][2][3] and was one of the first to discover the practical capabilities of deep neural networks and their application to speech recognition.[4]

Education and Early Career

Robinson studied natural sciences at Cambridge University between 1981 and 1984, where he specialized in physics. He went on to complete an MPhil in computer speech and language processing in 1985 and a PhD in the same area in 1989, both at Cambridge. He first published on speech recognition during his PhD[5] and has since published over a hundred widely cited research papers on automatic speech recognition (ASR).[6]

Entrepreneurial Career

In 1995, Robinson formed SoftSound Ltd, a speech technology company that was acquired by Autonomy with a view to using the technology to make unstructured video and voice data easily searchable. Robinson helped build a large-vocabulary speech recognition system based on recurrent neural networks that was the fastest available at the time and operated in more languages than any other model.[7]

From 2008 to 2010, Robinson was the Director of the Advanced Speech Group at SpinVox, a provider of speech-to-text conversion services for carrier markets, including wireless, VoIP and cable. Its automatic speech recognition (ASR) system was, for a time, used more than one million times per day. SpinVox was subsequently acquired by the global speech technology company Nuance.[8]

Robinson also founded Speechmatics, which launched its cloud-based speech recognition services in 2012. In late 2017, Speechmatics announced a new technology for accelerated modelling of new languages.[9] Robinson continues to publish papers on speech recognition technology, especially in the area of statistical language modelling.[10]

References

  1. Robinson, Tony; Fallside, Frank (July 1991). "A recurrent error propagation network speech recognition system". Computer Speech and Language. 5 (3): 259–274. doi:10.1016/0885-2308(91)90010-N.
  2. Robinson, Tony (1996). "The Use of Recurrent Neural Networks in Continuous Speech Recognition". Automatic Speech and Speaker Recognition. The Kluwer International Series in Engineering and Computer Science. Vol. 355. pp. 233–258. CiteSeerX 10.1.1.364.7237. doi:10.1007/978-1-4613-1367-0_10. ISBN 978-1-4612-8590-8.
  3. Wakefield, Jane (2008-03-14). "Speech recognition moves to text". BBC News. Retrieved 2020-08-24.
  4. Robinson, Tony (September 1993). "A neural network based, speaker independent, large vocabulary, continuous speech recognition system: the WERNICKE project". Third European Conference on Speech Communication and Technology. 1: 1941–1944. Retrieved 17 May 2018.
  5. Robinson, Anthony John (June 1989). "Dynamic Error Propagation Networks". PhD Thesis. Retrieved 17 May 2018.
  6. Robinson, Tony. "Tony Robinson - Profile". ResearchGate. Retrieved 17 May 2018.
  7. Robinson, Tony; Hochberg, Mike; Renals, Steve (1996). "The Use of Recurrent Neural Networks in Continuous Speech Recognition". Automatic Speech and Speaker Recognition. The Kluwer International Series in Engineering and Computer Science. Vol. 355. pp. 233–258. CiteSeerX 10.1.1.364.7237. doi:10.1007/978-1-4613-1367-0_10. ISBN 978-1-4612-8590-8.
  8. "Nuance Acquires SpinVox". Healthcare Innovation. 2011-06-24. Retrieved 2023-09-09.
  9. Orlowski, Andrew. "Brit neural net pioneer just revolutionised speech recognition all over again". The Register. Situation Publishing. Retrieved 17 May 2018.
  10. Chelba, Ciprian; Mikolov, Tomas; Schuster, Mike (2013). One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling (Report). Cornell University Library. arXiv:1312.3005.