HomoRich: The first large-scale Persian homograph dataset for G2P conversion, featuring 528K annotated sentences with balanced pronunciation variants and dual phoneme representations.
Mahta Fetrat d1180e6c49 Add 'data/README.md' 1 month ago
README.md Add 'data/README.md' 1 month ago
part_01.parquet Add files via upload 5 months ago
part_02.parquet Add files via upload 5 months ago
part_03.parquet Add files via upload 5 months ago

README.md

Please access the dataset files via this link.
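
Once the files are downloaded, the parts listed above (`part_01.parquet` through `part_03.parquet`) could be combined along these lines. This is a minimal sketch, not an official loader: the helper names `list_parts` and `load_parts` and the local `data/` directory are assumptions made for illustration, and `load_parts` assumes pandas is installed with a Parquet engine such as pyarrow. No column names are assumed, since the schema is not described here.

```python
from pathlib import Path


def list_parts(data_dir):
    """Return the part_*.parquet files in data_dir, sorted by file name.

    Hypothetical helper: the naming pattern is taken from the file listing.
    """
    return sorted(Path(data_dir).glob("part_*.parquet"))


def load_parts(data_dir):
    """Concatenate every part file into a single DataFrame.

    Assumes pandas and a Parquet engine (e.g. pyarrow) are installed.
    """
    import pandas as pd  # imported lazily so list_parts works without pandas

    parts = list_parts(data_dir)
    if not parts:
        raise FileNotFoundError(f"no part_*.parquet files found in {data_dir}")
    # Read each part and stack the rows, renumbering the index from zero.
    return pd.concat((pd.read_parquet(p) for p in parts), ignore_index=True)
```

For example, `df = load_parts("data")` followed by `print(df.columns)` would show which fields the dataset actually provides before any further processing.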