2 months ago · 3356ba5e77
--- a/README.md
+++ b/README.md
@@ -9,6 +9,8 @@ The text for this dataset was generated using GPT4o, with prompts covering a wid
 These generated texts were then recorded in a quiet environment. The audio and text files underwent forced alignment using [aeneas](https://github.com/readbeyond/aeneas), resulting in smaller chunks of audio-text pairs as presented in this dataset.

 ## Download
 [![Hugging Face](https://img.shields.io/badge/Hugging%20Face-dataset-orange)](https://huggingface.co/datasets/MahtaFetrat/GPTInformal-Persian)

 You can download the dataset from [this repository](https://huggingface.co/datasets/MahtaFetrat/GPTInformal-Persian).

 ### Data Columns
@@ -28,11 +30,16 @@ Each Parquet file contains the following columns:
 If you use GPTInformal-Persian in your research or projects, please cite the following paper:

 ```bash
-@article{fetrat2024manatts,
-      title={ManaTTS Persian: a recipe for creating TTS datasets for lower resource languages}, 
-      author={Mahta Fetrat Qharabagh and Zahra Dehghanian and Hamid R. Rabiee},
-      journal={arXiv preprint arXiv:2409.07259},
-      year={2024},
+@inproceedings{qharabagh-etal-2025-manatts,
+    title = "{M}ana{TTS} {P}ersian: a recipe for creating {TTS} datasets for lower resource languages",
+    author = "Qharabagh, Mahta Fetrat  and Dehghanian, Zahra  and Rabiee, Hamid R.",
+    booktitle = "Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
+    month = apr,
+    year = "2025",
+    address = "Albuquerque, New Mexico",
+    publisher = "Association for Computational Linguistics",
+    url = "https://aclanthology.org/2025.naacl-long.464/",
+    pages = "9177--9206",
 }
 ```