ManaTTS is the largest open Persian speech dataset, with 86+ hours of transcribed audio. It includes the data collection pipeline and tools, and is suitable for training Persian text-to-speech models.
Third-Party Licenses

This directory contains the licenses for the third-party tools and libraries used in this project. Below is a list of the tools along with their licenses.

Tools and Licenses

| Tool Name | Usage | Repository Page | License |
|---|---|---|---|
| Spleeter | Source separation (remove background music) | GitHub | MIT |
| Parsi.io | Number extraction & number to text conversion | GitHub | Apache-2.0 |
| Hazm | Text normalization | GitHub | MIT |
| Pydub | Silence detection/removal | GitHub | MIT |
| Perpos | Part of speech tagging for sentence tokenization | GitHub | MIT |
| Vosk | Forced alignment | GitHub | Apache-2.0 |
| Whisper-fa | Forced alignment | HuggingFace | Apache-2.0 |
| Wav2vec2-v3 | Forced alignment | HuggingFace | - |
| Wav2vec2-fa | Forced alignment | GitHub | Apache-3.0 |
| Hezar | Forced alignment | GitHub | Apache-2.0 |
| JiWER | CER calculation | GitHub | Apache-2.0 |

License Files

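This directory also contains the actual license files for the tools:

- hazm.txt
- hezar.txt
- jiwer.txt
- parsi_io.txt
- pydub.txt
- spleeter.txt
- vosk.txt
- wav2vec_fa.txt
- whisper_fa.txt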

Please refer to these files for the full text of each license.