People

Investigators

Steve Renals, University of Edinburgh.
Steve Renals is the director of NST, and professor of speech technology at the University of Edinburgh. He is the director of the Institute of Language, Cognition, and Communication (ILCC) in the School of Informatics at Edinburgh, where he is a member of CSTR. He received a BSc from University of Sheffield and an MSc and PhD from University of Edinburgh. He has held teaching and research positions at the Universities of Cambridge and Sheffield, and at the International Computer Science Institute. He has over 200 publications in speech technology and spoken language processing. He has coordinated a number of large collaborative projects, including the interdisciplinary EU Integrated Projects AMI and AMIDA, which combined research in speech technology with multimodal interaction, HCI, and computational linguistics to recognise and interpret meetings. He is co-editor-in-chief of the ACM Transactions on Speech and Language Processing and is a senior area editor of the IEEE Transactions on Audio, Speech, and Language Processing. He is a fellow of the IEEE, a member of the ISCA Advisory Council, and an advisory board member for Scottish Enterprise.
Phil Woodland, University of Cambridge.
Phil Woodland is a Professor of Information Engineering in the Machine Intelligence Laboratory (formerly the Speech Vision and Robotics (SVR) group), of which he is currently the head, and a Professorial Fellow of Peterhouse. He is also the head of the Speech Research Group. He is a fellow of the IEEE and a fellow of ISCA.
Thomas Hain, University of Sheffield.
Thomas Hain is Professor of Speech and Audio Technology and a member of the Speech and Hearing Group at the Department of Computer Science, University of Sheffield. He works mainly on large-scale systems for speech and language processing using machine learning.
Simon King, University of Edinburgh.
Simon King is the director of the Centre for Speech Technology Research and Professor of Speech Processing at the University of Edinburgh. His research interests include new acoustic models for speech recognition; speech synthesis; the use of articulatory information for speech synthesis and recognition.
Mark Gales, University of Cambridge.
Mark Gales is Professor of Information Engineering in the Machine Intelligence Laboratory, and a Fellow of Emmanuel College, Cambridge. He studied for the B.A. in Electrical and Information Sciences at the University of Cambridge from 1985 to 1988. Following graduation he worked as a consultant at Roke Manor Research Ltd. In 1991 he took up a position as a Research Associate in the Speech Vision and Robotics group in the Engineering Department at Cambridge University. In 1995 he completed his doctoral thesis, "Model-Based Techniques for Robust Speech Recognition", supervised by Professor Steve Young. From 1995 to 1997 he was a Research Fellow at Emmanuel College, Cambridge. He was then a Research Staff Member in the Speech group at the IBM T.J. Watson Research Center until 1999, when he returned to Cambridge University Engineering Department as a University Lecturer. Mark Gales is a Fellow of the IEEE and was a member of the Speech Technical Committee from 2001 to 2004. He is currently an associate editor for IEEE Signal Processing Letters and IEEE Transactions on Audio, Speech, and Language Processing. He is also on the Editorial Board of Computer Speech and Language. Mark Gales was awarded a 1997 IEEE Young Author Paper Award for his paper on Parallel Model Combination and a 2002 IEEE Paper Award for his paper on Semi-Tied Covariance Matrices.
Bill Byrne, University of Cambridge.
Bill Byrne is Professor of Information Engineering, a member of the Speech Research Group, and a Fellow of Clare College, Cambridge. His research is in statistical modeling for speech and language processing.
Phil Green, University of Sheffield.
Phil Green is a Professor in the Speech and Hearing group (SPandH) in the Department of Computer Science, University of Sheffield, which he founded in 1985. In recent years Professor Green has concentrated on research in missing data methods for robust automatic speech recognition and on clinical applications of speech technology. In NST he leads the work on Clinical Applications of Speech technology in the homeService project.
Junichi Yamagishi, University of Edinburgh.
Junichi Yamagishi is an EPSRC Career Acceleration fellow at CSTR, mainly working on speech synthesis. He holds a joint appointment with the National Institute of Informatics (NII), Tokyo. He received a PhD in 2006 from Tokyo Institute of Technology. His PhD thesis, ‘Average-voice-based speech synthesis’, won the Tejima Doctoral Dissertation Award 2007.
Stuart Cunningham, University of Sheffield.
Stuart Cunningham is a lecturer in the Department of Human Communication Sciences at the University of Sheffield. He has a PhD in Computer Science from the University of Sheffield. His main research interests include robust automatic speech recognition; the use of speech technology in assistive technology and speech and language therapy; and the recognition and perception of speech in adverse acoustic conditions. In NST he will be contributing to the work on clinical applications of speech technology.

Researchers

Peter Bell, University of Edinburgh.
Peter received an MPhil degree in speech and language processing from Cambridge University and a PhD from Edinburgh University, studying full covariance acoustic modelling for speech recognition. Since 2010, he has worked as a Research Associate at CSTR at the University of Edinburgh, where projects have included speech recognition of Scottish Gaelic, and spoken language processing for tutorial dialogue systems.
Bhusan Chettri, University of Sheffield.
Bhusan is a Research Assistant in the Department of Computer Science, University of Sheffield, working with Prof. Thomas Hain. He received his B.Tech and M.Tech degrees in Computer Science from Sikkim Manipal University, India in 2005 and 2009 respectively. He came to Sheffield University in September 2013 to study for an MSc in Computer Science with specialization in Speech and Language Processing and completed the course with a distinction. After the MSc course, he joined the SPandH group for 4 months (October 2014 - January 2015) and worked on an external project called ITSLanguage along with Mauro Nicolao. In June 2015, he joined the group again and has been working on various projects. He is currently also working on the NST project HomeService with Mauro Nicolao and Heidi Christensen. Before coming to Sheffield University, Bhusan lectured in the Department of Computer Science at Sikkim Manipal University, India for 7 years.

Salil Deena, University of Sheffield.
Salil has been a Research Associate in the SPandH group at the University of Sheffield since September 2014, working with Prof. Thomas Hain. He received a PhD in Computer Science from the University of Manchester with a thesis on visual speech synthesis, awarded in 2012. From 2012 to 2014, he worked as a Research Engineer at Image Metrics, researching computer vision and machine learning techniques for facial analysis and animation. His research interests are in probabilistic models and their applications to speech recognition and synthesis.

Mortaza Doulaty, University of Sheffield.
Mortaza Doulaty is a Ph.D. student at the Department of Computer Science, University of Sheffield. He received his B.Sc. (Hons) and M.Sc. (Hons) in Computer Science, Intelligent Systems from the University of Tabriz, Iran in 2009 and 2011, respectively. He is working on canonical modelling for automatic speech recognition under the supervision of Thomas Hain. Mortaza joined the NST project in Feb 2013.
Gustav Henter, University of Edinburgh.
Gustav is a research fellow at the Centre for Speech Technology Research (CSTR) at the University of Edinburgh, United Kingdom. He received the PhD degree in electrical engineering (telecommunications) in 2013 and the MSc degree (Civilingenjör) in engineering physics in 2007, both from KTH Royal Institute of Technology in Stockholm, Sweden. Before joining NST in 2014 he was a research fellow on the INSPIRE Marie Curie ITN, also at CSTR. Gustav's research interests include statistical modelling and speech evaluation methods, especially the interplay between mathematical design choices and perceived quality in speech synthesis.
Qiang Huang, University of Edinburgh.
Qiang Huang is a Research Fellow at CSTR. He received a PhD from the University of East Anglia. Before joining CSTR in 2013, he worked on several EPSRC projects for speech recognition, natural language understanding, information retrieval, and multimodal information processing. He is now working on language modelling for speech recognition and a dialogue system using multimodal information.
Penny Karanasou, University of Cambridge.
Penny is a Research Associate in the Machine Intelligence Laboratory of the Cambridge University Engineering Department. She received her Ph.D. from LIMSI-CNRS and University Paris-Sud, France in June 2013, under the supervision of L. Lamel and F. Yvon. During her Ph.D. she worked on pronunciation modeling for automatic speech recognition, with a focus on automatic generation of pronunciations for new words and of pronunciation variants for existing ones, as well as on discriminative methods for adaptation of the lexicon to speech data in an FST-based decoder. In summer 2011, she was a research intern at SRI International's STAR laboratory, where she developed a discriminative framework for improving a keyword-spotting system. Prior to that, in 2008, she received her Diploma in Electrical and Computer Engineering from the National Technical University of Athens, Greece. Her research interests lie in the areas of machine learning, speech recognition and natural language processing.
Jonathan Kilgour, University of Edinburgh.
Jonathan is a programmer and research associate at the University of Edinburgh. He was involved in corpus building and storage on the AMI and AMIDA projects, and prior to that on the NITE project where he was a major developer of the NXT toolkit for storing complex (linguistic) corpora. On the NST project, he will be developing infrastructure for LifeLog and looking at how we can use automatically derived information from multi-party speech to provide useful tools.
Pierre Lanchantin, University of Cambridge.
Pierre is a Research Associate in the Speech Research Group at the Cambridge University Engineering Department (CUED). He received an MSc degree in Acoustics, Signal Processing and Computer Science applied to Music from Paris VI University and a PhD in Statistical Signal Processing from Telecom SudParis, France. His research interests include statistical modeling of signals, speech processing and their applications to music. During his PhD, he studied generalizations of Hidden Markov Models (HMM) called Pairwise and Triplet Markov chains with applications to image segmentation. He then directed his research toward speech processing and joined the analysis/synthesis team at IRCAM, working on HMM-based speech segmentation, HMM-based speech synthesis and voice conversion with applications to music.
Andrew Liu, University of Cambridge.
Andrew (Xunying) Liu is a Senior Research Associate in the Machine Intelligence Laboratory. He graduated from Shanghai Jiao Tong University in 2000 before studying the MPhil in Computer Speech and Language Processing at CUED, and progressing to a PhD on discriminative complexity control and linear projections completed in 2005 with Dr. Mark Gales.
Yulan Liu, University of Sheffield.
Yulan Liu received a Bachelor's degree in Acoustics from Nanjing University, China. Her studies covered all branches of acoustics, physics, mathematics, electronic engineering and even biology, and provided her with a wide foundation. Her research interests are in audio acoustics, machine translation, and speech recognition and processing. In 2012, Yulan joined the Speech and Hearing group at the University of Sheffield as a research student to conduct research on automatic speech recognition under the supervision of Thomas Hain.
Liang Lu, University of Edinburgh.
Liang is now a Research Associate in the Centre for Speech Technology Research, University of Edinburgh. He received his BSc and MSc degrees from Beijing University of Posts and Telecoms, Beijing, China in 2007 and 2009, respectively, and expects to receive his Ph.D. degree from the University of Edinburgh in 2012. His Ph.D. research topic is speech recognition based on subspace Gaussian mixture models, with a focus on multilingual and cross-lingual acoustic modelling, noise compensation and adaptive training.
Tom Merritt, University of Edinburgh.
Tom is a PhD student in the School of Informatics and a member of CSTR. He is working in the field of speech synthesis, investigating methods of producing more natural-sounding synthesized speech. Tom received a BSc degree in Computer Science With A Year In Industry from University of East Anglia in 2012 and completed a Summer Internship during this time investigating effective methods of reducing the footprint of speech recognition systems.
Mauro Nicolao, University of Sheffield.
Mauro is a Research Associate and part-time Ph.D. student at the Department of Computer Science, University of Sheffield. He received an MSc degree in Telecommunication Engineering from the University of Padova in Italy. From 2005 to 2009, he was a Research Engineer at the Italian National Council of Research collaborating on several projects: the development of an expressive speech synthesiser for the Italian language, a speech recogniser for Italian children's speech, and a study on a multilingual (Arabic, Italian, English) prototype ASR. Since 2010, he has been part of SPandH researching speech synthesis, speech adaptation, Computer-Assisted Language Learning (CALL), and speaker recognition. From 2010 to 2012, he was part of the Marie Curie European network "SCALE", developing a speech synthesiser that reacts to environmental disturbance and mimics human speech modifications.
Oscar Saz, University of Sheffield.
Oscar is a Research Associate at the University of Sheffield, working with Thomas Hain in the NST project since 2012. He received a B.Sc. and Ph.D. from the University of Zaragoza, Spain in 2004 and 2009, respectively. From 2010 to 2012, he was a Fulbright Fellow at Carnegie Mellon University in Pittsburgh, USA. He has worked in several areas of speech and language technologies like acoustic modeling, robust ASR in noisy conditions, ASR for disordered speech, speaker adaptation, pronunciation verification and computer-assisted language learning tools.
Yanmin Qian, University of Cambridge.
Yanmin is a Research Associate in the Speech Research Group of the Machine Intelligence Lab, at the Cambridge University Engineering Department (CUED). He received his Bachelor’s degree in telecommunication engineering from Huazhong University of Science and Technology, P.R. China in 2007. He then entered the Department of Electronic Engineering at Tsinghua University in China, and obtained his PhD in 2012. From 2013 he was an Assistant Professor at the Department of Computer Science and Engineering, Shanghai Jiao Tong University in China. He is a member of IEEE and ISCA, and a member of the Kaldi Develop Group. He has published more than 30 papers on speech recognition and speech signal processing. His current research interests include large vocabulary continuous speech recognition, discriminative training of acoustic models, multilingual and low-resource speech recognition, robust speech recognition and speaker recognition.
Pawel Swietojanski, University of Edinburgh.
Pawel is a PhD student in the School of Informatics (Institute of Language, Cognition, and Communication (ILCC)) and a member of the Centre for Speech Technology Research (CSTR). Within the NST project Pawel works on deep learning in the presence of multiple sound sources. Pawel received an MSc degree in Computer Science from AGH University of Science and Technology in Cracow and a BSc from HSVS in Tarnow, Poland. His MSc dissertation gained 2nd prize and distinction in the XII edition of the AGH Diamond MSc Contest. In the past Pawel actively contributed to research projects in the field of automatic speech recognition.
Marcus Tomalin, University of Cambridge.
Marcus is a Research Associate at the Speech Research Group at Cambridge and also a Fellow in English at Downing College Cambridge.
Cassia Valentini-Botinhao, University of Edinburgh.
Cassia is a Research Associate in the Centre for Speech Technology Research at the University of Edinburgh. She graduated from the Federal University of Rio de Janeiro, Brazil, receiving the title of Electronic Engineer, and received an MSc from the University of Erlangen-Nuremberg in Germany, on the program Systems of Information and Multimedia Technology. As a Marie Curie Fellow of the SCALE project Cassia obtained her PhD from the University of Edinburgh, United Kingdom, with the thesis "Intelligibility enhancement of synthetic speech in noise". Her research interests are speech intelligibility models and signal processing for speech synthesis.
Christophe Veaux, University of Edinburgh.
Christophe is a Research Associate in the Centre for Speech Technology Research (CSTR), University of Edinburgh. He is working on voice banking, voice reconstruction, and personalised speech synthesis.
Linlin Wang, University of Cambridge.
Linlin is a Research Associate in the Speech Research Group of the Machine Intelligence Lab, at the Cambridge University Engineering Department (CUED). She received her Ph.D. degree in Computer Science from Tsinghua University in July 2013, mainly focusing on speaker recognition. She previously worked on speech recognition and sentence boundary detection in the BOLT program at CUED. Her current research interest lies in automatic speech recognition.
Oliver Watts, University of Edinburgh.
Oliver is a research fellow at CSTR, University of Edinburgh, where he received his PhD degree in 2012 with the doctoral thesis "Unsupervised Learning for Text-to-Speech Synthesis". Since then he has worked on the European funded speech synthesis project Simple4All. His main research interests are in exploiting unsupervised and lightly supervised learning to build text-to-speech systems without the conventional reliance on annotated data, rapid TTS system development in scarcely-resourced languages, and the use of novel textual, linguistic and pragmatic contexts for TTS.
Mirjam Wester, University of Edinburgh.
Mirjam is the scientific manager of NST, and a research fellow at CSTR, University of Edinburgh. She received her PhD from the Radboud University Nijmegen, the Netherlands in 2002, and has been with CSTR since 2003. Her research focuses on using knowledge from human speech production and perception in the fields of automatic speech recognition and synthesis.
Chunyang Wu, University of Cambridge.
Chunyang is a Ph.D student in the Machine Intelligence Laboratory, Cambridge University Engineering Department (CUED). He received his B.E. in computer science from Shanghai Jiao Tong University in 2013. He is currently working on neural-network based canonical models for speech recognition.
Zhizheng Wu, University of Edinburgh.
Zhizheng is a research associate in the Centre for Speech Technology Research (CSTR), University of Edinburgh. He works on speech synthesis and voice conversion for natural synthesized speech. He is also interested in spoofing and countermeasures for speaker verification.
Chao Zhang, University of Cambridge.
Chao is a Ph.D student in the Machine Intelligence Laboratory, Cambridge University Engineering Department (CUED). He received the B.S. and M.S. degrees in computer science from Tsinghua University, in 2009 and 2012 respectively. He is currently working on neural network based methods for large vocabulary continuous speech recognition.

Scientific Advisory Board

Alex Acero, Apple, USA
Herve Bourlard, Idiap Research Institute and EPFL, Switzerland
Stephen Cox, University of East Anglia, UK
Michael Picheny, IBM, USA
Keiichi Tokuda, Nagoya Institute of Technology, Japan

Alumni

Yanhua Long, University of Cambridge.
Yanhua was a Research Associate in the Speech Research Group of the Machine Intelligence Lab, at the Cambridge University Engineering Department (CUED). She finished her five years' Master-PhD studies in the iFly Speech Lab at the University of Science and Technology of China, and received the Ph.D. degree in July 2011. Her PhD research mainly focused on speaker and language recognition, especially on channel compensation techniques. Her research work in NST focused on natural transcription techniques under diverse data conditions.
Arnab Ghoshal, University of Edinburgh.
Arnab Ghoshal was a Research Associate at The University of Edinburgh and a member of the Centre for Speech Technology Research until November 2013. He received the M.S.E. and Ph.D. degrees from Johns Hopkins University, Baltimore, USA in 2005 and 2009, respectively. Before joining CSTR, he was a Marie Curie Fellow at Saarland University in Saarbrücken, Germany. His current research interests include acoustic modeling for large-vocabulary automatic speech recognition, multilingual speech recognition, pronunciation modeling and adaptation.
Heng Lu, University of Edinburgh.
Heng Lu was a postdoctoral research associate in the Centre for Speech Technology Research (CSTR), University of Edinburgh. He received his MSc and Ph.D. degrees, both from the iFly Speech Lab, University of Science and Technology of China (USTC). During his Ph.D., he visited and worked at Nagoya Institute of Technology and IBM Watson Research Center. His research interests mainly concern speech synthesis (both HMM and unit-selection), Computer Assisted Language Learning (CALL), and error detection and naturalness evaluation for synthetic speech.
Shi-Xiong (Austin) Zhang, University of Cambridge.
Shi-Xiong (Austin) completed his Ph.D. at Cambridge University, studying "Structured Discriminative Models for Speech Recognition", supervised by Prof. Mark Gales. Before going to Cambridge, Austin completed his M.Phil. degree in the Electronic and Information Engineering department at Hong Kong Polytechnic University in 2008. His research interests include machine learning, speech recognition and speaker verification.
Adrià de Gispert, University of Cambridge.
Adrià is a Senior Research Associate at the Machine Intelligence Laboratory of the Cambridge University Engineering Department. He is also a Fellow of Clare College, where he teaches undergraduate Engineering supervisions, and a Research Scientist at SDL plc. He received his PhD, on incorporating linguistic knowledge into ngram-based statistical machine translation, in January 2007 from the Universitat Politècnica de Catalunya. He then joined the University of Cambridge, where he has worked as Research Associate, Lecturer in Speech and Language Technologies and Senior Research Associate. His work is on statistical modelling of speech and text, natural language processing and computational linguistics, with a strong focus on machine translation and its connections with spoken and natural language processing, machine learning and parallel computing problems. His current interest lies in developing natural language generation models that can produce fluent English text with applications to translation and synthesis.
Matt Shannon, University of Cambridge.
Matt successfully completed his PhD at the Department of Engineering, University of Cambridge. He received B.A. (Hons) and M.Math degrees in mathematics and an M.Phil degree in computer, speech, text and internet technology from the University of Cambridge. His research interests include statistical speech synthesis and probabilistic modelling of speech. His PhD research focused on using the autoregressive HMM for statistical speech synthesis and investigating global variance generation.
Heidi Christensen, University of Sheffield.
Heidi received the M.Sc. and Ph.D. degrees from Aalborg University, Denmark in 1996 and 2002 respectively. She has been a Research Associate at the Department of Computer Science, University of Sheffield for 11 years, working on numerous European and UK funded projects. Her research interests mainly concern clinical applications of speech technology, spoken language processing and binaural machine listening.
Charles Fox, University of Sheffield.
Charles is a member of the Sheffield Center for Speech and Hearing and Sheffield Center for Robotics.
