Although large language models have made significant progress in natural language processing, they still struggle with complex educational documents that contain multimodal information, especially in intelligent understanding and reasoning. For example, middle school test questions contain large amounts of multimodal content such as text, tables, and illustrations. Answering them requires not only cross-modal understanding and logical reasoning, but also effective alignment of the model's output with the thinking patterns of middle school students, so that the answers match their cognitive level. Current mainstream multimodal large language models (MLLMs) still fall short in information integration and cross-modal reasoning, which limits their application in educational settings. To promote the development of MLLMs for Chinese education, this work constructs a large-scale chain-of-thought dataset containing 223,811 junior high school test questions. The dataset includes 84,607 images and 54,998 multimodal test questions, each annotated with a detailed chain-of-thought reasoning process. The chain of thought reveals the intrinsic connections between knowledge points, enabling multimodal large language models to derive answers step by step and improve their performance on complex reasoning tasks.
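As a rough illustration of what a chain-of-thought-annotated test question might look like, the sketch below shows one possible record layout. All field names here are assumptions for illustration, not the dataset's actual schema.

```python
# Hypothetical sketch of a single MaxCoT-style record.
# Field names ("question_id", "images", "chain_of_thought", ...) are
# illustrative assumptions, not the published schema.

record = {
    "question_id": 1,
    "subject": "math",
    "question": "...",           # question text, possibly referencing images
    "images": ["fig_001.png"],   # empty list for text-only questions
    "chain_of_thought": [        # step-by-step reasoning toward the answer
        "Identify the known quantities in the problem.",
        "Apply the relevant formula to the knowns.",
        "Compute and state the final value.",
    ],
    "answer": "...",
}

def is_multimodal(rec):
    """A question counts as multimodal if it has at least one image."""
    return len(rec.get("images", [])) > 0

print(is_multimodal(record))            # -> True (one attached image)
print(len(record["chain_of_thought"]))  # -> 3 reasoning steps
```

Separating the reasoning steps from the final answer, as above, is what lets a model be trained to derive answers gradually rather than in a single leap.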
MaxTEX310/MaxCoT