# ASR_CTC_LMRS

**Repository Path**: weimingtom2000/ASR_CTC_LMRS

## Basic Information

- **Project Name**: ASR_CTC_LMRS
- **Description**: Imported from https://github.com/muncok/ASR_CTC_LMRS
- **License**: Not specified
- **Default Branch**: master
- **Created**: 2023-11-23
- **Last Updated**: 2023-11-23

## README

End-to-end ASR using LSTM + CTC + LM rescoring

Dataset
* wsj0

RNN Model
* 3-layer bidirectional LSTM (hidden dimension: 400)
* Layer normalization
* CTC loss
* Gradient clipping (max norm 5)
* Adam optimizer
* Decreasing learning rate (initial value 0.01)

Decoding ([Stanford-ctc](https://github.com/amaas/stanford-ctc))
* Greedy decoding
* Decoding with a character-level language model (Maas, Andrew L., et al. "Lexicon-Free Conversational Speech Recognition with Neural Networks." HLT-NAACL. 2015)
  * Beam width: 10
  * Language model: [KenLM](https://github.com/kpu/kenlm) (5-gram CLM)
  * Alpha: 1.25, Beta: 1.5

Files
* main.py
* model/model.py - CTC-RNN model
* data_processing.py - manipulates the dataset and computes CER and WER
* decoding.py, decoder.c, wsj_5gram.binary, char_set_reverse.txt - for LM rescoring
* train.py
* evaluate.py

Pre-trained Model

[Google drive](https://drive.google.com/drive/folders/1TkEZtcFRocW3cHILxtouhZNl6iLja6dB?usp=sharing)

|Error metric|Value|
|------|---|
|CER w/o LM|12.45%|
|WER w/o LM|44.43%|
|CER w/ LM|6.28%|
|WER w/ LM|18.16%|

Usage
* Dataset (wsj0)
  * Put wsj0 and wsj1 in the root directory
* Stanford-ctc
  > python setup.py install
* KenLM
  > pip install https://github.com/kpu/kenlm/archive/master.zip
* Run main.py
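
Greedy decoding sketch

The greedy decoding step listed under Decoding can be illustrated with a minimal pure-Python sketch. It is not the code from `decoding.py` or Stanford-ctc. The function name `ctc_greedy_decode`, the blank index 0, and the toy alphabet are assumptions made for illustration. The idea is to take the most likely label per frame, merge consecutive repeats, and drop blanks.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Collapse the per-frame best path into a label string.

    frame_scores: one list of per-label scores for each time step.
    alphabet: label for each index (the entry at `blank` is never emitted).
    blank: index of the CTC blank symbol (assumed to be 0 here).
    """
    # Best label index for every time step.
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]

    decoded = []
    previous = None
    for index in best_path:
        # Emit a label only if it differs from the previous frame's label
        # and is not the blank symbol.
        if index != previous and index != blank:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)


# Toy example: index 0 is the blank, 1 is "a", 2 is "b".
toy_alphabet = ["_", "a", "b"]
toy_frames = [
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.8, 0.1],    # a (repeat, merged)
    [0.9, 0.05, 0.05],  # blank (separates the two a's)
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.1, 0.8],    # b
]
print(ctc_greedy_decode(toy_frames, toy_alphabet))
```

The blank between the second and fourth frame is what allows the repeated character to survive the merge step.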
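
LM rescoring sketch

The Alpha and Beta values listed under Decoding weight the language model and the hypothesis length. The sketch below shows one way such a combined score could be computed over a list of candidate transcripts. It is a simplification: Stanford-ctc applies the language model inside the beam search, whereas this example only re-ranks finished hypotheses. The function `rescore`, the callable `lm_log_prob` (a stand-in for the KenLM model), and the exact form of the length term are assumptions.

```python
import math


def rescore(hypotheses, lm_log_prob, alpha=1.25, beta=1.5):
    """Re-rank (text, ctc_log_prob) pairs, best hypothesis first.

    Combined score (assumed form):
        ctc_log_prob + alpha * lm_log_prob(text) + beta * log(len(text))
    """
    scored = []
    for text, ctc_log_prob in hypotheses:
        # Length bonus counteracts the LM's preference for short strings.
        length_bonus = beta * math.log(max(len(text), 1))
        total = ctc_log_prob + alpha * lm_log_prob(text) + length_bonus
        scored.append((total, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored]


def toy_lm(text):
    """Hypothetical language model: prefers one known phrase."""
    return 0.0 if text == "the cat" else -5.0


candidates = [("the cab", -1.0), ("the cat", -1.2)]
print(rescore(candidates, toy_lm))
```

In this toy case the acoustic score slightly favours "the cab", but the language model term moves "the cat" to the top.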
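
CER and WER sketch

The CER and WER figures in the results table are edit-distance based error rates. The following sketch shows the standard definition using Levenshtein distance. It is not the code from `data_processing.py`. The function names are hypothetical, and whether the repository counts spaces as characters for CER is not stated, so counting them here is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    previous = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        current = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def cer(reference, hypothesis):
    """Character error rate (spaces are counted as characters)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


def wer(reference, hypothesis):
    """Word error rate over whitespace-separated tokens."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


print(cer("the cat", "the bat"), wer("the cat", "the bat"))
```

A single wrong character costs one whole word in WER, which is why WER is typically much higher than CER for the same output.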