Hindi is not very similar to English, so I feel I should train from scratch, but I can't find a proper dataset for it. Can someone help me with how to do that, for example by finding a suitable dataset or creating my own? And how should I train or fine-tune it?
Any help would be appreciated.