Comparative Evaluation of Gemini and Copilot Performance in University Entrance Exams: A Systematic Analysis Based on Multiple-Choice Questions and Images
DOI:
https://doi.org/10.18687/LEIRD2025.1.1.431
Keywords:
artificial intelligence, chatbot, performance, university, exams
Abstract
This study compared the performance of Gemini and Copilot in solving multiple-choice questions that required interpreting texts and images, drawn from the entrance exams given by the various faculties of a prestigious Peruvian university over the past three years. A total of 838 questions were analyzed, 83 of which were analyzed as images. Overall, Copilot achieved a higher proportion of correct answers, at 75% (627/838) versus 67% (561/838) for Gemini. Both AIs performed significantly worse in image analysis, with 36.1% (30/83) correct answers for Gemini and 39.8% (33/83) for Copilot. In conclusion, these findings highlight the need to improve accuracy in image processing, as well as the importance of understanding the current limitations of these tools in order to optimize their performance and integration into the academic field.
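The percentages reported in the abstract can be recomputed from the stated counts. The following is a minimal sketch, not the authors' analysis code: the helper `pct` and the variable names are hypothetical, and the text-only figures rest on the assumption that the 83 image questions are a subset of the 838 and that the overall counts of correct answers include them.

```python
def pct(correct, total):
    """Share of correct answers as a percentage, rounded to one decimal."""
    return round(100 * correct / total, 1)

# Counts taken from the abstract.
TOTAL_QUESTIONS = 838
IMAGE_QUESTIONS = 83
overall_correct = {"Gemini": 561, "Copilot": 627}
image_correct = {"Gemini": 30, "Copilot": 33}

# Assumption (not stated in the abstract): image questions are part of
# the 838, so the remaining questions are text-only.
text_questions = TOTAL_QUESTIONS - IMAGE_QUESTIONS

results = {}
for model in overall_correct:
    results[model] = {
        "overall": pct(overall_correct[model], TOTAL_QUESTIONS),
        "image": pct(image_correct[model], IMAGE_QUESTIONS),
        # Derived figure, valid only under the subset assumption above.
        "text_only": pct(overall_correct[model] - image_correct[model],
                         text_questions),
    }

for model, scores in results.items():
    print(model, scores)
```

Under that assumption, the gap between the two models on the text-only questions would be wider than on the image questions, where both models answer fewer than half correctly.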
Published
2025-12-09
Section
Articles
License
Copyright (c) 2025 LEIRD

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
How to Cite
Medina Llerena, D. A., & Velarde Lam, D. M. (2025). Comparative Evaluation of Gemini and Copilot Performance in University Entrance Exams: A Systematic Analysis Based on Multiple-Choice Questions and Images. LACCEI, 2(13). https://doi.org/10.18687/LEIRD2025.1.1.431