
handle: 2117/430221
This paper examines the application of Generative AI (henceforth GenAI), specifically OpenAI's advanced Large Language Models (LLMs), to evaluating student reports in challenge-based courses with a focus on Sustainability and Ethics (S&E). Traditional grading methods rely heavily on manual effort and are subject to human biases; they impose a significant workload on educators and often lack consistency. By integrating LLMs into the grading process, this study explores the feasibility of automating assessment tasks, aiming for more objective and efficient evaluation while reducing evaluators' workload. The research method entails preprocessing and anonymizing student reports, converting existing rubrics into prompts suitable for an LLM, and comparing the AI-generated assessments against human-graded benchmarks. Preliminary findings suggest that LLMs can complement human grading by providing consistent evaluations under certain conditions, such as when reports are text-based and rubrics are clearly defined. However, limitations in the model's reasoning capabilities and in its handling of non-textual information indicate areas for future research. This research contributes to ongoing discussions on the potential of GenAI in education, underscoring the need for further exploration of AI-assisted assessment tools that could enhance the transparency and efficacy of grading practices in engineering education.
Sustainable Development Goals::4 - Quality Education
Peer Reviewed
Generative Artificial Intelligence (GenAI), Large Language Models (LLMs), Educational Assessment, Engineering Education, UPC subject areas::Telecommunications engineering::Social aspects
