add translation literals for various Indic languages (Bengali, Gujarati, Punjabi, Tamil) #1015

Merged
NathanHB merged 2 commits into huggingface:main from rpm000:rpm/add-translation-literals
Oct 21, 2025

@rpm000
Contributor

rpm000 commented Oct 10, 2025

We introduce translation literals for four Indic languages (Bengali, Gujarati, Punjabi, Tamil). This allows multilingual evaluations to be run over these languages.
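As a rough illustration of what a translation literal is and how a multilingual prompt template might consume one, here is a minimal, self-contained sketch. The `Literals` class, the `LITERALS` table, the language codes, and `build_prompt` are hypothetical names chosen for this example, not lighteval's actual API. The Bengali and Tamil words are illustrative and are not copied from this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literals:
    """Hypothetical, simplified per-language record of prompt words."""

    language: str
    question_word: str
    answer: str
    colon: str = ":"


# Illustrative entries only (assumption); the wording merged in the PR may differ.
LITERALS = {
    "ben": Literals(language="ben", question_word="প্রশ্ন", answer="উত্তর"),
    "tam": Literals(language="tam", question_word="கேள்வி", answer="பதில்"),
}


def build_prompt(lang: str, question: str) -> str:
    """Format a question/answer prompt using the literals registered for `lang`.

    Raises KeyError if no literals exist for the language, which is the gap
    that adding new literals is meant to close.
    """
    lit = LITERALS[lang]
    return f"{lit.question_word}{lit.colon} {question}\n{lit.answer}{lit.colon}"


if __name__ == "__main__":
    print(build_prompt("ben", "2 + 2 = ?"))
```

Without an entry for a language, the lookup fails and no prompt can be built, which is why a language needs literals before templated evaluations can run over it.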

@rpm000
Contributor Author

rpm000 commented Oct 10, 2025

@NathanHB, @hynky1999: adding some translation literals for Indic languages. This is my first PR in lighteval, so any feedback or comments would be very much appreciated :)

@clefourrier
Member

Hi! It's looking good, so let's see if tests pass. Just to be sure, did you make sure the words you provided make sense in the examples provided in the doc?

@HuggingFaceDocBuilderDev
Collaborator

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@rpm000
Contributor Author

rpm000 commented Oct 14, 2025

Hi @clefourrier , thanks for fixing the failing tests 🙏

> Just to be sure, did you make sure the words you provided make sense in the examples provided in the doc?

Yep, we did check that. Just noting that these words were machine translated rather than professionally translated, with some sanity checking by speakers of each language. Would you like me to add a comment highlighting this?

@clefourrier
Member

Hi! We usually avoid machine translations in lighteval in order to get expressions as fluent as possible - I'll let @NathanHB decide whether we want to merge.

@NathanHB
Member

having the words sanity checked by native speakers is good enough imo !

@NathanHB NathanHB merged commit bf8b547 into huggingface:main Oct 21, 2025
4 checks passed
@rpm000
Contributor Author

rpm000 commented Oct 21, 2025

> having the words sanity checked by native speakers is good enough imo !

Excited to have my first contribution to lighteval :) Hoping to add more in the future. If there is a list of tasks or things that need fixing somewhere, I would be happy to start working on those too :)

4 participants