Baidu Unveils DuerOS Prometheus Project to Advance Conversational AI

Project consists of one of the largest open datasets for conversational AI and a $1 million fund to invest in conversational AI projects


REDWOOD CITY, Calif., Nov. 09, 2017 (GLOBE NEWSWIRE) -- Baidu, Inc. (NASDAQ:BIDU), the leading Chinese language Internet search provider, launched the DuerOS Prometheus Project today to rapidly advance the state of conversational AI capabilities. The project consists of one of the largest open datasets for conversational AI, interdisciplinary collaborations, and a $1 million fund to invest in conversational AI projects and foster talent in this space.

A photo accompanying this announcement is available at http://www.globenewswire.com/NewsRoom/AttachmentNg/d1c8c939-8fd4-43b3-92c6-1cd772b19916

“Voice is increasingly becoming how we interact with our devices today,” Kaihua Zhu, CTO of Baidu’s DuerOS, said at the announcement event for the project today in Redwood City, California. “Open datasets, interdisciplinary collaborations, and financial incentives will create the conditions necessary for rapid advancement of conversational AI.”

DuerOS is Baidu’s conversation-based AI platform with conversational skills in 10 major domains and over 100 sub-domains. Since its launch at the beginning of 2017, it has quickly become the top choice for third-party hardware manufacturers in China, who have integrated DuerOS into over 100 branded devices ranging from refrigerators and air conditioners to TV set-top boxes, storytelling machines and smart speakers.

Dr. Björn Hoffmeister, Sr. Manager of Amazon Machine Learning, Dr. Sanjeev Khudanpur, Director of Human Language Technology Center of Excellence at Johns Hopkins, and Dr. Antoine Raux, CTO and Co-founder of a Stealth Conversational AI Startup, also attended the DuerOS Prometheus Project launch event and gave presentations in their respective fields.

Following the event, Baidu will gradually open three large-scale datasets in far-field wake word detection, far-field speech recognition, and multi-turn conversations to enable developers and technologists to develop and train their algorithms for conversational AI systems.

The wake word detection dataset will consist of around 500,000 voice clips of five to ten popular Chinese wake words, including “xiaodu xiaodu”, which is the wake word to activate DuerOS-enabled devices.

The speech recognition datasets will include thousands of hours of Mandarin speech recognition data to enable people to train systems that can accurately “hear” human speech under complex circumstances such as noisy environments. The project will also release thousands of dialogue samples covering 10 different domains to promote the development of multi-turn conversation technology.

“In the age of AI, data is the new oil,” said Guoguo Chen, Baidu’s Principal Architect for DuerOS. “It is also the barrier that prevents many smaller organizations and individuals from developing leading edge conversational AI systems. By opening our dataset and offering interdisciplinary collaborations and financial incentives, we hope to accelerate the pace of innovation in this space and advance the future of conversational computing.” 

In addition to the open dataset, Project Prometheus will also feature programs with universities and research organizations to conduct joint training, course design, and workshops. In combination, these programs will promote constructive exchanges around conversational AI and attract the best talent in this space.

The DuerOS Prometheus Project is sponsored by the Baidu Duer Business Unit, together with Baidu Speech Technology Group, Baidu Campus Branding and Baidu Cloud.

For more information on the datasets and the $1 million conversational AI fund, please visit http://prometheus.baidu.com

About Baidu

Baidu, Inc. is the leading Chinese language Internet search provider. Baidu aims to make a complicated world simpler for users and enterprises through technology. Baidu’s ADSs trade on the NASDAQ Global Select Market under the symbol “BIDU.” Currently, ten ADSs represent one Class A ordinary share.

Media Contact

Intlcomm@baidu.com

 
