Chinese artificial intelligence (AI) start-up CloudWalk Technology Co said on Monday that its latest speech-recognition model had set an accuracy record, a fresh sign of the country's growing strength in AI.
The model, known as Pyramidal-FSMN, achieved a word error rate of 2.97 percent on LibriSpeech, the world's largest open-source speech corpus, setting a new world record in speech recognition, according to a press release sent to the Global Times on Monday.
The milestone signals a leap forward in speech recognition. An error rate of 5.9 percent is generally considered to represent human parity, while professional transcribers who have received strict training post an error rate of about 3 percent. The figure also marks a clear advance over earlier results from global industry giants including Microsoft and IBM.
Microsoft and IBM competed for the accuracy crown last year, with their speech recognition software posting error rates of roughly 5 percent on the Switchboard corpus of telephone conversations.
The Chinese start-up, which grew out of the Chongqing Institute of the Chinese Academy of Sciences, was founded only in 2015 and has in recent years established itself as a major supplier of face-recognition technology.