ManaTTS is the largest open Persian speech dataset with 86+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.

Third-Party Licenses

This directory contains the licenses for the third-party tools and libraries used in this project. Below is a list of the tools along with their licenses.

Tools and Licenses

Tool Name	Usage	Repository Page	License
Spleeter	Source separation (remove background music)	GitHub	MIT
Parsi.io	Number extraction & number to text conversion	GitHub	Apache-2.0
Hazm	Text normalization	GitHub	MIT
Pydub	Silence detection/removal	GitHub	MIT
Perpos	Part of speech tagging for sentence tokenization	GitHub	MIT
Vosk	Forced alignment	GitHub	Apache-2.0
Whisper-fa	Forced alignment	HuggingFace	Apache-2.0
Wav2vec2-v3	Forced alignment	HuggingFace	-
Wav2vec2-fa	Forced alignment	GitHub	Apache-2.0
Hezar	Forced alignment	GitHub	Apache-2.0
JiWER	CER calculation	GitHub	Apache-2.0

License Files

This directory also contains the full license file for each tool.

Please refer to these files for the full text of each license.
