This program transcribes MP3 audio files into text using the Whisper model. By default, it provides segmented transcriptions without punctuation. The program can process individual MP3 files or all MP3 files in a specified directory.
- Segmented Transcriptions: By default, the transcription is split into segments without punctuation.
- Retain Punctuation: Use the `--with-punct` flag to retain punctuation in the transcription.
- Continuous Text: Use the `--continued-text` flag to combine the transcription into a continuous block of text.
- Directory Processing: With the `-dir` flag, the program processes all MP3 files in a directory.
- Save or Print Output: If the `-save` parameter is given, the output is saved as a `.txt` file with the same name as the audio file; otherwise, the transcription is printed to the console.
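The output options above can be sketched in a few lines of Python. This is a minimal illustration, not the script's actual implementation: the helper names `format_transcription` and `output_path` are hypothetical, punctuation removal is assumed to mean deleting ASCII punctuation characters, and the input is assumed to be a list of segment dictionaries with a `"text"` key, which is the shape of the `segments` entry returned by Whisper's `transcribe()`.

```python
import string
from pathlib import Path

# Translation table that deletes ASCII punctuation (an assumption about
# what "without punctuation" means; apostrophes are removed as well).
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def format_transcription(segments, with_punct=False, continued_text=False):
    """Hypothetical helper: turn Whisper-style segments into output text."""
    texts = [seg["text"].strip() for seg in segments]
    if not with_punct:
        texts = [t.translate(_PUNCT_TABLE) for t in texts]
    # One segment per line by default; a single block with --continued-text.
    separator = " " if continued_text else "\n"
    return separator.join(texts)


def output_path(audio_path):
    """Hypothetical helper: same name as the audio file, with a .txt suffix."""
    return Path(audio_path).with_suffix(".txt")


if __name__ == "__main__":
    demo = [{"text": " Hello, world."}, {"text": " How are you?"}]
    print(format_transcription(demo))
    print(format_transcription(demo, with_punct=True, continued_text=True))
    print(output_path("audio/talk.mp3"))
```

With the default settings, each segment lands on its own line with punctuation removed; passing both options yields a single punctuated line.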
- Python 3.x
- `openai-whisper` library
To install the required dependencies, you can use the following command:
    pip install -r requirements.txt

To run the program:

    python generate_transcr.py [filename.mp3] -save -dir input_directory --with-punct --continued-text
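For readers curious how such a command line might be parsed, here is a rough sketch using `argparse`. It only mirrors the flags shown above and is an assumption about how `generate_transcr.py` could handle them, not its real code; `build_parser` and `collect_mp3_files` are hypothetical names, and the assumption that a file name and `-dir` can be combined is taken from the usage line.

```python
import argparse
from pathlib import Path


def build_parser():
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(prog="generate_transcr.py")
    parser.add_argument("filename", nargs="?", help="a single MP3 file")
    parser.add_argument("-dir", dest="directory", help="directory of MP3 files")
    parser.add_argument("-save", action="store_true", help="write .txt files")
    parser.add_argument("--with-punct", action="store_true")
    parser.add_argument("--continued-text", action="store_true")
    return parser


def collect_mp3_files(args):
    """Gather the single file (if given) plus every *.mp3 in the directory."""
    files = []
    if args.filename:
        files.append(Path(args.filename))
    if args.directory:
        files.extend(sorted(Path(args.directory).glob("*.mp3")))
    return files


if __name__ == "__main__":
    args = build_parser().parse_args()
    for path in collect_mp3_files(args):
        print(path)
```

The sketch stops at listing the files to process; transcription and output handling would follow for each path.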