Whisper is an Automatic Speech Recognition (ASR) model that has been trained using 680,000 hours of supervised data from the web, encompassing a range of languages and tasks. One of its limitations is the low-performance on low-resource languages such as Marathi language and Dravidian languages, which can be remediated with fine-tuning. However, fine-tuning a Whisper […]
Source