Benchmarking notebooks for various Persian G2P models, comparing their performance on the SentenceBench dataset, including Homo-GE2PE and Homo-T5.

Mahta Fetrat 95507ffc2a Update README.md		1 week ago

Persian G2P Tools Benchmark

This repository contains benchmarking notebooks for various Persian grapheme-to-phoneme (G2P) models. It covers both the baseline models and the Homo-GE2PE and Homo-T5 models proposed in the study "Fast, Not Fancy: Rethinking G2P with Rich Data and Rule-Based Models". All benchmarks are run on the SentenceBench Persian G2P Benchmark.

Repository Structure

benchmarking-scripts/
├── Benchmark_AzamRabiee_Persian_G2P.ipynb
├── Benchmark_GE2PE.ipynb
├── Benchmark_HomoFast_eSpeak.ipynb
├── Benchmark_Homo_GE2PE.ipynb
├── Benchmark_Homo_T5.ipynb
├── Benchmark_PasaOpasen_PersianG2P.ipynb
├── Benchmark_de_mh_persian_phonemizer.ipynb
├── Benchmark_dmort27_epitran.ipynb
├── Benchmark_eSpeak_NG.ipynb
├── Benchmark_mohamad_hasan_sohan_ajini_G2P.ipynb
└── Benchmark_sajadalipour7_Persian_Grapheme_To_Phoneme_With_Transformer.ipynb

Each notebook benchmarks a specific model using the SentenceBench dataset. The results of each run (5 independent runs per model) are documented in the last markdown cell of each notebook.
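As a rough illustration of how the five per-run results could be combined into a single figure per metric, here is a minimal sketch. The function name `average_runs`, the metric keys, and the numbers are hypothetical placeholders, not values or code taken from the notebooks.

```python
from statistics import mean

def average_runs(runs):
    """Average each metric over a list of per-run result dicts.

    Assumes every run reports the same set of metric keys.
    """
    keys = runs[0].keys()
    return {key: mean(run[key] for run in runs) for key in keys}

# Placeholder numbers for illustration only (not real benchmark results).
example_runs = [
    {"per": 10.0, "homograph_acc": 50.0, "inference_time": 0.5},
    {"per": 12.0, "homograph_acc": 52.0, "inference_time": 0.7},
]
averaged = average_runs(example_runs)
print(averaged)
```

Keeping each run as a dict makes it easy to paste the numbers from a notebook's final markdown cell and recompute the averages.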

Benchmarking Results

The table below reports each model's phoneme error rate (PER, lower is better), homograph accuracy (higher is better), and average inference time, each averaged across 5 runs:

Model	PER (%)	Homograph Acc. (%)	Avg. Inf. Time (s)
PersianG2P (AzamRabiee)	35.23	21.23	11.1374
PasaOpasen_PersianG2P	15.04	37.74	2.1686
persian_phonemizer (de_mh)	25.27	29.25	0.1803
Epitran (dmort27)	45.12	0.00	0.0003
G2P (mohamad_hasan_sohan_ajini)	19.63	29.91	28.0039
Persian_Grapheme_To_Phoneme (sajadalipour7)	12.85	40.00	0.9685
eSpeak NG	6.92	43.87	0.0169
GE2PE	4.81	47.17	0.4464
HomoFast eSpeak	6.33	74.53	0.0084
Homo-T5	4.12	76.32	0.4141
Homo-GE2PE	3.98	76.89	0.4473
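For readers unfamiliar with the PER column, phoneme error rate is commonly computed as the edit distance between the predicted and reference phoneme sequences, divided by the total reference length. The sketch below follows that common definition and treats each character of a phoneme string as one symbol. The function names are illustrative, and the notebooks may tokenize or normalize differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def phoneme_error_rate(references, hypotheses):
    """PER in percent: total edits divided by total reference symbols."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_symbols = sum(len(r) for r in references)
    return 100.0 * total_edits / total_symbols

# Toy strings for illustration only (not SentenceBench data).
print(phoneme_error_rate(["salAm"], ["salam"]))  # one substitution in five symbols
```

Summing edits over the whole corpus before dividing weights long sentences more heavily than averaging per-sentence rates would. Which of the two the notebooks use is not stated here.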

Contributions

Contributions and pull requests are welcome. Please open an issue to discuss the changes you intend to make.
