1111 Hours Hindi ASR Challenge 2022 by MyGov: Submit by Mar 7

Applications are invited for the 1111 Hours Hindi ASR Challenge 2022 by MyGov. The last date to apply is 7 March.

About

A challenge on Automatic Speech Recognition (ASR) for Hindi is being organized as part of INTERSPEECH 2022, built around spontaneous telephone speech recordings collected by Gram Vaani, a social technology enterprise. The regional variations of Hindi, together with the spontaneity of the speech, the natural background, and transcriptions of varying accuracy due to crowdsourcing, make it a unique corpus for automatic recognition of spontaneous telephone speech.

Recent advances in speech technology have shown that ASR systems can perform on par with humans. Building a good ASR system requires large amounts of training data and high-end computational resources.

However, when it comes to Indian languages, not everyone has access to these resources, especially academic institutions and startups. As part of this challenge, we will be releasing telephone-quality speech data in Hindi. Everyone who participates in this challenge will then be free to use this data for research purposes.

What makes this proposed session special?

Since this will be a focused challenge in which all participants build systems using the data released as part of the challenge, a special session is more appropriate than including it in the main conference.

Given the nature of the challenge data, this special session will encourage collaboration between speech researchers and experts in languages and linguistics.

We will be releasing a baseline recipe to keep the barrier to entry low and to encourage submissions from research groups all over the world.

Rules

Participants are expected to sign an agreement when they download the data and are expected to use the data only for research purposes.
If registered participants feel that they cannot submit a system, they will have to submit a withdrawal clause that states that they will use the data for research purposes only.
Submitted systems are expected to beat the baseline system in terms of WER/CER (word/character error rate); however, innovative systems that come close to the baseline may be considered.
Only the audio for the blind test set (5 hours) will be released. Participants are expected to run their systems on the blind test set and submit the ASR hypotheses for evaluation.
Participants will need to share their final ASR model, or an API to it, along with the paper, so that the hypotheses for the blind test set can be reproduced.
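
For readers unfamiliar with the metrics named in the rules, the following is a minimal Python sketch of how WER and CER are commonly computed: the edit distance between reference and hypothesis, divided by the reference length. It is not the challenge's scoring script. The function names are illustrative, and the handling of text (whitespace tokenization, dropping spaces for CER, counting Unicode code points as characters, no text normalization) is an assumption; the official evaluation may differ on any of these.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance: minimum number of substitutions,
    deletions and insertions needed to turn ref into hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate over Unicode code points.
    Assumption: spaces are removed before comparison."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    reference = "मेरा नाम राम है"
    hypothesis = "मेरा नाम श्याम है"
    # One substituted word out of four reference words gives a WER of 0.25.
    print("WER:", wer(reference, hypothesis))
    print("CER:", cer(reference, hypothesis))
```

Because both rates are normalized by the reference length, a hypothesis with many insertions can score above 1.0. Note also that in Devanagari a single visible syllable may span several code points, so a CER computed this way can differ from one computed over grapheme clusters.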

Important dates

Release of training data: 1 February
Evaluation data release: 7 March
Closing of submission site: To be decided
Announcement of results: To be decided

Types of Challenge

Closed Challenge – Participants can use only the Gram Vaani 100 hours Train dataset and the Gram Vaani 5 hours Development dataset for training models (both acoustic and language models).
Self Supervised Closed Challenge – Participants can use only the Gram Vaani 1000 hours dataset, the Gram Vaani 100 hours Train dataset and the Gram Vaani 5 hours Development dataset for training models (both acoustic and language models).
Open Challenge – Participants can use any external/additional dataset for training models (both acoustic and language models).