In recent years, a Chinese company named iFLYTEK has been gaining momentum in the areas of speech recognition and computer vision. Hundreds of millions of people have benefitted from its Chinese translation and recognition applications. Cong Liu, the vice president of the iFLYTEK AI Research Institute, plays a crucial role in the development of related technologies.
Liu has been working on speech recognition and related fields at iFLYTEK since he was a junior at the University of Science and Technology of China. Early in his career, he focused primarily on Chinese recognition and translation and on how to improve their performance. One of his ingenious creations was the world's first Chinese dialect recognition tool, supporting up to 22 different dialects.
Another of Liu’s breakthrough innovations is the DFCNN (Deep Fully Convolutional Neural Network) model. Compared to a traditional CNN, it better expresses long-term information by stacking many convolution layers and directly modeling every sentence instead of every word. The development of this model helped Liu and his team win three competitions at the 4th CHiME challenge in 2016.
After applying DFCNN to real-world applications, they discovered that it was capable of boosting the performance of the iFLYTEK speech recognition engine by approximately 30% per year. The engine's speech recognition accuracy rate is now up to 98%.
Liu has served as the vice president of iFLYTEK's AI Research Institute since 2014. Thereafter, computer vision became a new research area for him. Under his leadership, the company successfully migrated its sophisticated deep learning models from speech recognition to computer vision.
“I see a bridge connecting the two areas. That bridge is deep learning,” explained Liu.
Guided by this vision, he has been leading the team into medical imaging, video monitoring, and image-text recognition. iFLYTEK Computer-Aided Diagnosis in Medical Imaging, which has already been deployed in more than 50 hospitals, is one of their recent achievements. This AI-powered system increases doctors’ operational efficiency by performing certain imaging diagnoses within one second.
Liu’s future research topics include cross-modal information integration and customized speech recognition. “I am so fortunate to be able to live in this era where core technological breakthroughs prosper. And I want to use them to build more real-world applications,” he said.