A dataset of informal Persian audio and text chunks, along with a fully open processing pipeline, suitable for ASR and TTS tasks. Created from crawled content on virgool.io.

Third-Party Licenses

This directory contains the licenses for the third-party tools and libraries used in this project. Below is a list of the tools along with their licenses.

Tools and Licenses

| Tool Name | Usage | Repository Page | License |
| --- | --- | --- | --- |
| Parsi.io | Number extraction & number to text conversion | GitHub | Apache-2.0 |
| Hazm | Text normalization | GitHub | MIT |
| Pydub | Silence detection/removal | GitHub | MIT |
| Perpos | Part of speech tagging for sentence tokenization | GitHub | MIT |
| Vosk | Forced alignment | GitHub | Apache-2.0 |
| Whisper-fa | Forced alignment | HuggingFace | Apache-2.0 |
| Wav2vec2-v3 | Forced alignment | HuggingFace | - |
| Wav2vec2-fa | Forced alignment | GitHub | Apache-3.0 |
| Hezar | Forced alignment | GitHub | Apache-2.0 |
| JiWER | CER calculation | GitHub | Apache-2.0 |

License Files

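This directory also contains the actual license files for each tool:

- hazm.txt
- hezar.txt
- jiwer.txt
- parsi_io.txt
- pydub.txt
- spleeter.txt
- vosk.txt
- wav2vec_fa.txt
- whisper_fa.txt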

Please refer to these files for the full text of each license.
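
For a quick look at which license files are present, the following is a minimal sketch, assuming the license texts are stored as plain `.txt` files in one directory. The helper names `load_license_texts` and `summarize` are hypothetical and are not part of this project's pipeline.

```python
from pathlib import Path


def load_license_texts(directory):
    """Return {file stem: license text} for every .txt file in `directory`."""
    texts = {}
    for path in sorted(Path(directory).glob("*.txt")):
        texts[path.stem] = path.read_text(encoding="utf-8")
    return texts


def summarize(texts):
    """Return one line per license: the file stem and the first non-empty line."""
    lines = []
    for stem, text in texts.items():
        # Skip leading blank lines; fall back to an empty string for empty files.
        first = next((ln.strip() for ln in text.splitlines() if ln.strip()), "")
        lines.append(f"{stem}: {first}")
    return lines


if __name__ == "__main__":
    # Assumption: the script is run from inside the licenses directory.
    for line in summarize(load_license_texts(".")):
        print(line)
```

The first non-empty line of a license file is usually its title, so the output gives a rough overview only. The full text in each file remains the authoritative reference.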