1. Auffarth, B. (2023). Generative AI with LangChain. Packt Publishing, Birmingham, England.

2. Bencke, L., Paula, F., dos Santos, B., and Moreira, V. P. (2024). Can we trust LLMs as relevance judges? In Anais do XXXIX Simpósio Brasileiro de Bancos de Dados, pages 600–612, Porto Alegre, RS, Brasil. SBC. DOI: http://dx.doi.org/10.5753/sbbd.2024.243130.

3. Collins, K. M., Jiang, A. Q., Frieder, S., Wong, L., Zilka, M., Bhatt, U., Lukasiewicz, T., Wu, Y., Tenenbaum, J. B., Hart, W., Gowers, T., Li, W., Weller, A., and Jamnik, M. (2024). Evaluating language models for mathematics through interactions. Proceedings of the National Academy of Sciences, 121(24):e2318124121. DOI: http://dx.doi.org/10.1073/pnas.2318124121.

4. Gandolfi, A. (2025). GPT-4 in Education: Evaluating Aptness, Reliability, and Loss of Coherence in Solving Calculus Problems and Grading Submissions. International Journal of Artificial Intelligence in Education, 35:367–397. DOI: http://dx.doi.org/10.1007/s40593-024-00403-3.

5. Harvey, E., Koenecke, A., and Kizilcec, R. F. (2025). “Don’t Forget the Teachers”: Towards an Educator-Centered Understanding of Harms from Large Language Models in Education. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, CHI ’25, New York, NY, USA. Association for Computing Machinery. DOI: http://dx.doi.org/10.1145/3706598.3713210.

6. Lehmann, M., Cornelius, P. B., and Sting, F. J. (2025). AI Meets the Classroom: When Do Large Language Models Harm Learning? SSRN. DOI: http://dx.doi.org/10.2139/ssrn.4941259.

7. Liu, J., Sun, D., Sun, J., Wang, J., and Yu, P. L. H. (2025). Designing a generative AI enabled learning environment for mathematics word problem solving in primary schools: Learning performance, attitudes and interaction. Computers and Education: Artificial Intelligence, 9:100438. DOI: http://dx.doi.org/10.1016/j.caeai.2025.100438.

8. Makridakis, S., Petropoulos, F., and Kang, Y. (2023). Large language models: Their success and impact. Forecasting, 5(3):536–549. DOI: http://dx.doi.org/10.3390/forecast5030030.

9. Marques, D. and Morandini, M. (2024). Uso do ChatGPT no Contexto Educacional: Uma Revisão Sistemática da Literatura. In Anais do XXXV Simpósio Brasileiro de Informática na Educação (SBIE 2024), pages 1784–1795. Sociedade Brasileira de Computação - SBC. DOI: http://dx.doi.org/10.5753/sbie.2024.242535.

10. Miranda, B. and Campelo, C. E. C. (2024). How effective is an LLM-based Data Analysis Automation Tool? A Case Study with ChatGPT’s Data Analyst. In Anais do XXXIX Simpósio Brasileiro de Bancos de Dados, pages 287–299, Porto Alegre, RS, Brasil. SBC. DOI: http://dx.doi.org/10.5753/sbbd.2024.240841.

11. Pardos, Z. A. and Bhandari, S. (2024). ChatGPT-generated help produces learning gains equivalent to human tutor-authored help on mathematics skills. PLOS ONE, 19(5):1–18. DOI: http://dx.doi.org/10.1371/journal.pone.0304013.

12. Rodrigues, L., Xavier, C., Costa, N., Batista, H., Silva, L. F. B., Chaleghi de Melo, W., Gasevic, D., and Ferreira Mello, R. (2025). LLMs Performance in Answering Educational Questions in Brazilian Portuguese: A Preliminary Analysis on LLMs Potential to Support Diverse Educational Needs. In Proceedings of the 15th International Learning Analytics and Knowledge Conference, LAK ’25, pages 865–871, New York, NY, USA. Association for Computing Machinery. DOI: http://dx.doi.org/10.1145/3706468.3706515.

13. Santos, V. S. and Dorneles, C. F. (2024). Unveiling the Segmentation Power of LLMs: Zero-Shot Invoice Item Description Analysis. In Anais do XXXIX Simpósio Brasileiro de Banco de Dados (SBBD 2024), pages 549–561. Sociedade Brasileira de Computação - SBC. DOI: http://dx.doi.org/10.5753/sbbd.2024.240820.

14. Satpute, A., Gießing, N., Greiner-Petter, A., Schubotz, M., Teschke, O., Aizawa, A., and Gipp, B. (2024). Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’24, pages 2316–2320, New York, NY, USA. Association for Computing Machinery. DOI: http://dx.doi.org/10.1145/3626772.3657945.

15. Wang, S., Xu, T., Li, H., Zhang, C., Liang, J., Tang, J., Yu, P. S., and Wen, Q. (2024). Large Language Models for Education: A Survey and Outlook. URL: https://arxiv.org/abs/2403.18105.

16. Weidinger, L., Uesato, J., Rauh, M., Griffin, C., Huang, P.-S., Mellor, J., Glaese, A., Cheng, M., Balle, B., Kasirzadeh, A., Biles, C., Brown, S., Kenton, Z., Hawkins, W., Stepleton, T., Birhane, A., Hendricks, L. A., Rimell, L., Isaac, W., Haas, J., Legassick, S., Irving, G., and Gabriel, I. (2022). Taxonomy of risks posed by language models. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’22, pages 214–229, New York, NY, USA. Association for Computing Machinery. DOI: http://dx.doi.org/10.1145/3531146.3533088.