Description
Requirement
The sentences.csv file contains only a small amount of data, enough for initial training at best. The aim is to gather more data from publicly available datasets and sources so that the ML models can produce better bot responses.
Pre-requisite
- Elementary knowledge of Python
- Elementary understanding of the available data
Dependencies
None
Description
This is an open-ended issue: participants can explore various sources to gather the data needed to improve the bot's NLP capabilities. Depending on the source, the data may need some elementary pre-processing before it is added to the existing data. A separate issue may be created later for the pre-processing if needed.
A good starting point is to look for common conversation examples such as 'Hello', 'How're you', and 'That's good to hear', which are labelled 'C' in the sentences.csv file. Searching for data one label at a time might be easier.
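As a rough illustration of the elementary pre-processing mentioned above, here is a minimal sketch for merging newly gathered sentences into the existing data. It assumes sentences.csv has two columns, sentence then label, with no header row; that layout is a guess and should be checked against the actual file. The helper names (`normalize`, `merge_labelled`, `read_rows`) and the sample rows are hypothetical.

```python
import csv
import io


def normalize(sentence):
    """Collapse runs of whitespace and trim both ends."""
    return " ".join(sentence.split())


def merge_labelled(existing_rows, new_rows):
    """Combine (sentence, label) rows, skipping blanks and
    case-insensitive duplicate sentences. The first occurrence wins."""
    merged = []
    seen = set()
    for sentence, label in list(existing_rows) + list(new_rows):
        cleaned = normalize(sentence)
        key = cleaned.lower()
        if not cleaned or key in seen:
            continue
        seen.add(key)
        merged.append((cleaned, label.strip()))
    return merged


def read_rows(handle):
    """Read (sentence, label) pairs from an open CSV handle.
    Assumed layout: column 0 is the sentence, column 1 is the label."""
    return [(row[0], row[1]) for row in csv.reader(handle) if len(row) >= 2]


# Hypothetical sample data standing in for the real sentences.csv.
existing = read_rows(io.StringIO("Hello,C\nHow're you,C\n"))
gathered = [("  hello ", "C"), ("That's good to hear", "C"), ("", "C")]
merged = merge_labelled(existing, gathered)
```

With a real file, `read_rows(open("sentences.csv", newline=""))` would replace the `io.StringIO` stand-in. Note that duplicates are detected on the sentence text alone, so a sentence that already exists under one label will not be added again under another.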