My project addresses accented speech recognition and racial bias in Automatic Speech Recognition (ASR) systems, focusing on African American Vernacular English (AAVE). It explores fast adaptation through transfer learning as a way to improve how transformer models recognize accents and dialects that are underrepresented in their training data. Using the Corpus of Regional African American Language (CORAAL), I clean the dataset and extract features from it, then use it to train and test Mozilla's DeepSpeech, an ASR model pre-trained on a large-scale dataset. Performance is evaluated with the Word Error Rate (WER) metric, comparing the model's accuracy on CORAAL, overall and by region, against its accuracy on Standard American English. With pre-trained DeepSpeech, the average WER on CORAAL is 25.89%, more than double the 11.82% baseline for Standard American English, and the Lower East Side, New York (LES) region of CORAAL is an outlier at 65.94%. These results could contribute to more accurate and less biased ASR systems and offer guidance for mitigating racial bias in Natural Language Processing (NLP), supporting fair and equitable use of ASR technology.
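For readers unfamiliar with the metric, WER is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not the evaluation code used in this project. The function name `word_error_rate`, the lowercase-and-split normalization, and the example sentences are assumptions made for illustration and are not taken from CORAAL.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: text is normalized by lowercasing and splitting on
    whitespace; a real evaluation may normalize differently.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substituted word out of four reference words.
print(word_error_rate("he be working late", "he is working late"))
```

Because the denominator is the reference length, a hypothesis with many inserted words can produce a WER above 100%.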