A Comparative Study on Enhancing the Accuracy of Chinese Speech-to-Text in Instructional Videos Using Large Language Models (88466)

Session Information:

Monday, 25 November 2024 15:50
Session: Poster Session 1
Room: Orion Hall (5F)
Presentation Type: Poster Presentation

All presentation times are UTC + 9 (Asia/Tokyo)

With the rapid development of speech recognition technology, Chinese speech-to-text (STT) systems play an important role in subtitle production and are often used for instructional videos. However, because of the complexity of the Chinese language and its large number of homophones, the accuracy of existing STT systems still leaves significant room for improvement. In this study, we propose two optimization methods based on large language models (LLMs) to improve the accuracy of Chinese STT: language model-assisted text editing and fine-tuned language model-assisted text editing. We verified both methods by producing subtitles for instructional videos in various domains and calculating the Levenshtein distance between two strings with dynamic programming. The results indicate that the fine-tuned language model-assisted text editing approach is significantly better than the language model-assisted text editing approach in terms of text accuracy. The fine-tuned approach also allows fine-tuning strategies to be developed for specific language characteristics, so that language nuances are recognized more efficiently, which significantly improves the accuracy of Chinese speech-to-text systems.
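The abstract names the Levenshtein distance, computed with dynamic programming, as the accuracy measure. A minimal sketch of that computation follows. It is not the authors' code: the function names, the pairing of a reference transcript with an edited hypothesis, and the normalized `character_error_rate` helper are all assumptions made for illustration. Comparison is per character, which suits Chinese text because it avoids word segmentation.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Return the edit distance between two strings using dynamic programming."""
    # previous[j] is the distance between the reference prefix processed so far
    # and hypothesis[:j]; only two rows of the full table are kept in memory.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]  # turning reference[:i] into "" takes i deletions
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Hypothetical helper: edit distance normalized by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Illustrative homophone error: "再" and "在" are both pronounced "zai".
    reference = "我們再見"
    hypothesis = "我們在見"
    print(levenshtein(reference, hypothesis))
    print(character_error_rate(reference, hypothesis))
```

Under these assumptions, a lower distance means the edited subtitle is closer to the reference, so the two LLM-based methods could be compared by running the same function on each method's output.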

Authors:
Chih Chang Yang, National Taiwan Normal University, Taiwan
Tzren-Ru Chou, National Taiwan Normal University, Taiwan
Shu Wei Liu, National Taiwan University of Science and Technology, Taiwan


About the Presenter(s)
My name is Chih-Chang Yang, and I am currently serving as a Senior Technician at Taiwan National Open University, where I have worked for seventeen years. My current responsibilities include planning and recording digital learning course videos using




Posted by Clive Staples Lewis

Last updated: 2023-02-23 23:45:00