Comparative Evaluation of Gemini and Copilot Performance in University Entrance Exams: A Systematic Analysis Based on Multiple-Choice Questions and Images
DOI:
https://doi.org/10.18687/LEIRD2025.1.1.431
Keywords:
artificial intelligence, chatbot, performance, university, exams
Abstract
The objective of this research was to compare the performance of Gemini and Copilot in solving multiple-choice questions that require interpreting texts and images, drawn from the entrance exams of a prestigious Peruvian university across its various faculties over the past three years. The study analyzed 838 questions, of which 83 were image-based. The overall results indicate a higher proportion of correct answers for Copilot, at 75% (627/838), versus 67% (561/838) for Gemini. The performance of both AIs was substantially lower in image analysis, with 36.1% (30/83) correct answers for Gemini and 39.8% (33/83) for Copilot. In conclusion, these findings highlight the need to improve accuracy in image processing, as well as the importance of understanding the current limitations of these tools in order to optimize their performance and integration into the academic field.
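The proportions quoted in the abstract can be rechecked from the raw counts. The following is a minimal sketch, not part of the original study: the names `CORRECT`, `pct`, and `summarize` are hypothetical, and the "non_image" figure is derived here on the assumption that the 83 image-based questions are a subset of the 838 (as the phrase "of which 83" suggests). The abstract itself does not report that figure.

```python
# Counts taken from the abstract; everything else is illustrative.
TOTAL_QUESTIONS = 838
IMAGE_QUESTIONS = 83

CORRECT = {
    "Gemini": {"overall": 561, "image": 30},
    "Copilot": {"overall": 627, "image": 33},
}


def pct(correct: int, total: int) -> float:
    """Return the share of correct answers as a percentage, one decimal."""
    return round(100 * correct / total, 1)


def summarize(counts: dict, total: int = TOTAL_QUESTIONS,
              image: int = IMAGE_QUESTIONS) -> dict:
    """Compute overall, image-only, and derived non-image accuracy."""
    summary = {}
    for model, c in counts.items():
        summary[model] = {
            "overall": pct(c["overall"], total),
            "image": pct(c["image"], image),
            # Assumption: image questions are a subset of the total,
            # so the remainder are text-only questions.
            "non_image": pct(c["overall"] - c["image"], total - image),
        }
    return summary


SUMMARY = summarize(CORRECT)

if __name__ == "__main__":
    for model, row in SUMMARY.items():
        print(model, row)
```

Under that subset assumption, the derived text-only accuracy is higher than the overall figure for both tools, which is consistent with the abstract's point that image questions pull the overall results down.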
Published
2025-12-09
Issue
Section
Articles
License
Copyright 2025 LEIRD

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
How to Cite
Medina Llerena, D. A., & Velarde Lam, D. M. (2025). Comparative Evaluation of Gemini and Copilot Performance in University Entrance Exams: A Systematic Analysis Based on Multiple-Choice Questions and Images. LACCEI, 2(13). https://doi.org/10.18687/LEIRD2025.1.1.431