LLMs for Educational Assessment

Scope: Evaluate the effectiveness and reliability of LLMs in automating assessment of anonymized summer school applications through direct comparison with human evaluations.

Provide actionable recommendations on implementation feasibility and contexts favoring human versus LLM-based evaluation.

Importance: Identify biases and inconsistencies in human assessment while revealing fundamental differences between human and LLM evaluation approaches, enabling systemic improvements in educational evaluation practices.

Data overview

The raw data is stored in the /data/raw folder. It contains anonymized entries (CVs, Motivational letters, and Presentation texts) from applicants, along with their scores. The text entries are spread across several subfolders as .txt files. The applicants' scores and final marks are saved in several .csv files created at various stages of the selection process and during the school.
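As a rough illustration of how the raw text entries could be collected, the sketch below walks a folder recursively and reads every .txt file it finds. The function name `read_raw_texts`, the relative path `data/raw`, and the UTF-8 encoding are assumptions made for this example; the actual subfolder names and file naming scheme are not described here, and the notebook in this repository may do this differently.

```python
from pathlib import Path


def read_raw_texts(raw_dir="data/raw"):
    """Return {relative path: file contents} for every .txt file under raw_dir.

    Hypothetical helper: the folder layout and encoding are assumptions.
    """
    root = Path(raw_dir)
    texts = {}
    # rglob descends into all subfolders; sorting keeps the order reproducible.
    for path in sorted(root.rglob("*.txt")):
        key = path.relative_to(root).as_posix()
        texts[key] = path.read_text(encoding="utf-8")
    return texts
```

Keying the result by relative path keeps the subfolder name available, which is useful if the subfolder encodes the document type.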

The preprocessed and compressed dataset is saved as /data/dataset.parquet and contains all necessary data for each student who submitted at least one of a CV, Motivational letter, or Presentation. The dataset contains 577 rows and 25 columns in total. The table below describes each column.

#	Column	Non-Null Count	Dtype	Range	Description
0	cv	567 non-null	object	N/A	The anonymized CV in the form of string.
1	letter	568 non-null	object	N/A	The anonymized Motivational letter in the form of string.
2	presentation	564 non-null	object	N/A	The anonymized text from Presentation slides in the form of string.
3	cv_phd_1	567 non-null	float64	0 - 5	The CV score from first PhD student assessor.
4	cv_phd_2	567 non-null	float64	0 - 5	The CV score from second PhD student assessor.
5	letter_phd_1	567 non-null	float64	0 - 5	The Motivational letter score from first PhD student assessor.
6	letter_phd_2	567 non-null	float64	0 - 5	The Motivational letter score from second PhD student assessor.
7	pres_phd_1	567 non-null	float64	0 - 20	The Presentation score from first PhD student assessor.
8	pres_phd_2	567 non-null	float64	0 - 20	The Presentation score from second PhD student assessor.
9	pres_class	567 non-null	float64	0 - 3	The type of Presentation (review, article replication, or personal project).
10	video_phd_1	383 non-null	float64	0 - 20	The Presentation video score from first PhD student assessor.
11	video_phd_2	409 non-null	float64	0 - 20	The Presentation video score from second PhD student assessor.
12	all_phd_1	567 non-null	float64	0 - 100	The total score from first PhD student assessor.
13	all_phd_2	567 non-null	float64	0 - 100	The total score from second PhD student assessor.
14	final_human_score	98 non-null	float64	0 - 100	The final score (weighted mean of PhD students and Professor scores).
15	prof_score	98 non-null	float64	0 - 10	The Professor's overall score.
16	offline_test_1	48 non-null	float64	0 - 1	The results of the first offline test.
17	offline_test_2	48 non-null	float64	0 - 1	The results of the second offline test.
18	offline_test_3	48 non-null	float64	0 - 1	The results of the third offline test.
19	offline_test_4	48 non-null	float64	0 - 1	The results of the fourth offline test.
20	offline_test_total	48 non-null	float64	0 - 1	The sum of all offline tests.
21	online_test_score	68 non-null	float64	0 - 67	The results of the online test.
22	project_participation_flag	577 non-null	bool	T/F	The flag of student's participation in the project activity.
23	project_ta_score	118 non-null	float64	0 - 5	The project score from the teacher assistants.
24	project_peer_score	118 non-null	float64	0 - 10	The project score from the peer-review stage.

The detailed data description is provided in /preprocessing.ipynb.
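For a quick look at the dataset outside the notebook, a minimal sketch along the following lines may help. It loads the parquet file with pandas, counts values that fall outside the ranges documented in the table, and computes the Pearson correlation between the two PhD student assessors as one simple measure of agreement. The helper names, the relative path `data/dataset.parquet`, and the choice of Pearson correlation are assumptions for illustration, not the method used in the repository's notebooks.

```python
import pandas as pd

# Documented (min, max) ranges for some score columns, copied from the table.
DOCUMENTED_RANGES = {
    "cv_phd_1": (0, 5),
    "cv_phd_2": (0, 5),
    "letter_phd_1": (0, 5),
    "letter_phd_2": (0, 5),
    "pres_phd_1": (0, 20),
    "pres_phd_2": (0, 20),
    "all_phd_1": (0, 100),
    "all_phd_2": (0, 100),
}


def load_dataset(path="data/dataset.parquet"):
    """Load the preprocessed dataset (requires a parquet engine such as pyarrow)."""
    return pd.read_parquet(path)


def out_of_range_counts(df, ranges=DOCUMENTED_RANGES):
    """Count non-null values outside the documented range, per column."""
    counts = {}
    for column, (low, high) in ranges.items():
        if column not in df.columns:
            continue  # skip columns that are absent from this frame
        values = df[column].dropna()
        counts[column] = int(((values < low) | (values > high)).sum())
    return counts


def rater_agreement(df, col_1, col_2):
    """Pearson correlation over rows where both assessors gave a score."""
    paired = df[[col_1, col_2]].dropna()
    return float(paired[col_1].corr(paired[col_2]))
```

A possible usage, assuming the file is present: `df = load_dataset()`, then `out_of_range_counts(df)` and `rater_agreement(df, "cv_phd_1", "cv_phd_2")`. According to the description above, `df.shape` should be `(577, 25)`.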

On-Point-RND/Students-Assesment
