A dataset of informal Persian audio and text chunks, along with a fully open processing pipeline, suitable for ASR and TTS tasks. Created from crawled content on virgool.io.
MahtaFetrat 5d07f4ec0f add third party licenses 8 months ago


Third-Party Licenses

This directory contains the licenses for the third-party tools and libraries used in this project. Below is a list of the tools along with their licenses.

Tools and Licenses

| Tool Name | Usage | Repository Page | License |
| --- | --- | --- | --- |
| Parsi.io | Number extraction & number to text conversion | GitHub | Apache-2.0 |
| Hazm | Text normalization | GitHub | MIT |
| Pydub | Silence detection/removal | GitHub | MIT |
| Perpos | Part of speech tagging for sentence tokenization | GitHub | MIT |
| Vosk | Forced alignment | GitHub | Apache-2.0 |
| Whisper-fa | Forced alignment | HuggingFace | Apache-2.0 |
| Wav2vec2-v3 | Forced alignment | HuggingFace | - |
| Wav2vec2-fa | Forced alignment | GitHub | Apache-3.0 |
| Hezar | Forced alignment | GitHub | Apache-2.0 |
| JiWER | CER calculation | GitHub | Apache-2.0 |

License Files

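This directory also contains the actual license files for each tool:

- hazm.txt
- hezar.txt
- jiwer.txt
- parsi_io.txt
- pydub.txt
- spleeter.txt
- vosk.txt
- wav2vec_fa.txt
- whisper_fa.txt
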
Please refer to these files for the full text of each license.