You need to complete the following tasks:
- Isolated digit recognition using discrete HMMs (code). Use the given features, also extract your own features, and compare the results.
- Continuous digit recognition: concatenate the HMMs trained in task 1 to recognize continuous digits. Use only the given features.
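For task 1, an isolated utterance is typically assigned to the digit whose HMM gives it the highest likelihood. The following is a minimal sketch of that scoring step using the scaled forward algorithm. It assumes the MFCC frames have already been quantized to codebook indices (for example with k-means), and that each model is stored as a (start, trans, emit) tuple; the function names and the model layout are illustrative, not part of the assignment.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    obs   : non-empty list of codebook indices, one per frame (assumed input)
    start : start[i]    = P(first state is i)
    trans : trans[i][j] = P(next state is j | current state is i)
    emit  : emit[i][k]  = P(symbol k | state i)
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then emit obs[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        # Rescale to avoid underflow; the log of the scales sums to log P.
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob

def classify(obs, models):
    """Return the label whose model scores obs highest.

    models : dict mapping label -> (start, trans, emit)  (hypothetical layout)
    """
    return max(models, key=lambda label: forward_log_likelihood(obs, *models[label]))
```

With one trained model per digit, classify(obs, models) returns the recognized digit label for a quantized utterance.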
Datasets:
Digit dataset: This dataset consists of spoken utterances. Both the MFCC feature files and the original .wav files are given.
Data: download here, Group Mapping: Download here
Continuous digits dataset:
- Download development data from here and test data from here.
- The data contains directories with the group numbers.
- Each directory contains MFCC features from utterances of multiple digits (corresponding to the isolated digits assigned to your batch).
- The set of digits uttered is given below (symbol: uttered word):
  - 1: one
  - 2: two
  - 3: three
  - 4: four
  - 5: five
  - 6: six
  - 7: seven
  - 8: eight
  - 9: nine
  - z: zero
  - o: o
- In the development data, the file name gives the spoken digits. E.g., in file 534.mfcc, the digits spoken are five, three, four.
- The test data consists of 5 unlabeled sequences (blind data). Provide the digit sequence recognized for each of them in the report.
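The file-naming convention for the development data can be turned into reference labels with a small helper. This is a sketch that assumes the file stem contains only the symbols listed above, with no extra prefix or suffix; the function name and the example path are hypothetical.

```python
import os

# Symbol-to-word mapping taken from the list of uttered digits above.
SYMBOL_TO_WORD = {
    "1": "one", "2": "two", "3": "three", "4": "four", "5": "five",
    "6": "six", "7": "seven", "8": "eight", "9": "nine",
    "z": "zero", "o": "o",
}

def labels_from_filename(path):
    """Return the spoken words encoded in a development file name.

    Raises KeyError if the stem contains a character that is not a
    known digit symbol (assumption: stems hold symbols only).
    """
    stem = os.path.splitext(os.path.basename(path))[0]
    return [SYMBOL_TO_WORD[ch] for ch in stem]
```

For example, labels_from_filename("dev/534.mfcc") gives ["five", "three", "four"], which can be compared against the recognizer output on the development set.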
Feature File Format:
- The data given are the MFCC features of speech audio.
- Structure of an MFCC file: The first line contains two space-separated integers. The first integer, NC, is the dimension of the feature vector (the number of MFC coefficients). The second integer, NF, is the number of frames into which the .wav file is divided.
- The next NF rows contain the MFCC features of dimension NC. Each row corresponds to a feature vector in the sequence. Please note that NF varies with the example.
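The format described above can be read with a few lines of Python. This is a minimal sketch assuming the coefficients in each row are whitespace-separated decimal numbers; the names parse_mfcc and read_mfcc are illustrative.

```python
def parse_mfcc(lines):
    """Parse MFCC text lines into a list of NF frames, each a list of NC floats.

    lines : iterable of strings; the first holds "NC NF", the rest one frame each.
    """
    lines = iter(lines)
    header = next(lines).split()
    nc, nf = int(header[0]), int(header[1])

    frames = []
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines, e.g. at the end of the file
        frames.append([float(x) for x in line.split()])

    # Check the data against the header so truncated files are caught early.
    if len(frames) != nf:
        raise ValueError(f"expected {nf} frames, found {len(frames)}")
    if any(len(frame) != nc for frame in frames):
        raise ValueError(f"every frame must have {nc} coefficients")
    return frames

def read_mfcc(path):
    """Read one .mfcc file from disk and return its frames."""
    with open(path) as f:
        return parse_mfcc(f)
```

Because NF varies between examples, the result is kept as a list of frames rather than a fixed-size array.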
Guidelines:
- You need to plot ROC curves, DET curves and confusion matrices for task 1.
- You can include graphs and tables for your results.
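The quantities behind these plots can be computed as in the sketch below. It assumes each (utterance, digit model) pair is treated as one trial with a score, such as a log-likelihood, and a flag saying whether the model matches the true digit; that trial setup and the function names are assumptions, not something the assignment prescribes.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Return matrix[i][j] = count of class i utterances recognized as class j."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def roc_points(scores, is_target):
    """Sweep a threshold over the scores and return (FPR, TPR) pairs.

    scores    : one score per trial (higher means more likely a match)
    is_target : parallel list of booleans, True when the trial is a true match
    Requires at least one target and one non-target trial.
    """
    n_pos = sum(1 for y in is_target if y)
    n_neg = len(is_target) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        # A trial is accepted when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, is_target) if s >= thr and y)
        fp = sum(1 for s, y in zip(scores, is_target) if s >= thr and not y)
        points.append((fp / n_neg, tp / n_pos))
    return points
```

The DET curve uses the same sweep: for each ROC point (FPR, TPR), plot the miss rate 1 - TPR against FPR, conventionally on normal-deviate axes.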