Aquino, I., dos Santos, M. M., Dorneles, C., and Carvalho, J. T. (2024). Extracting information from Brazilian legal documents with retrieval augmented generation. In Anais Estendidos do XXXIX Simpósio Brasileiro de Bancos de Dados, pages 280–287, Porto Alegre, RS, Brasil. SBC.

Edge, D., Trinh, H., Cheng, N., Bradley, J., Chao, A., Mody, A., Truitt, S., Metropolitansky, D., Ness, R. O., and Larson, J. (2025). From local to global: A graph RAG approach to query-focused summarization.

Jiang, C., Gao, L., Zarch, H. E., and Annavaram, M. (2024). Efficient LLM inference with I/O-aware partial KV cache recomputation.

Jimenez Gutierrez, B., Shu, Y., Gu, Y., Yasunaga, M., and Su, Y. (2024). HippoRAG: Neurobiologically inspired long-term memory for large language models. Advances in Neural Information Processing Systems, 37:59532–59569.

Kociský, T., Schwarz, J., Blunsom, P., Dyer, C., Hermann, K. M., Melis, G., and Grefenstette, E. (2018). The NarrativeQA reading comprehension challenge. Transactions of the Association for Computational Linguistics, 6:317–328.

Lavie, A. and Agarwal, A. (2007). METEOR: An automatic metric for MT evaluation with high levels of correlation with human judgments. In Proceedings of the Second Workshop on Statistical Machine Translation, StatMT '07, pages 228–231, USA. Association for Computational Linguistics.

Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., Yih, W.-t., Rocktäschel, T., Riedel, S., and Kiela, D. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. In Proceedings of the 34th International Conference on Neural Information Processing Systems, NIPS '20, Red Hook, NY, USA. Curran Associates Inc.

Li, B., Jiang, Y., Gadepally, V., and Tiwari, D. (2024). LLM inference serving: Survey of recent advances and opportunities. In 2024 IEEE High Performance Extreme Computing Conference (HPEC), pages 1–8.

Li, H., Li, Y., Tian, A., Tang, T., Xu, Z., Chen, X., Hu, N., Dong, W., Li, Q., and Chen, L. (2025). A survey on large language model acceleration based on KV cache management.

Lin, C.-Y. (2004). ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74–81, Barcelona, Spain. Association for Computational Linguistics.

Liu, A., Liu, J., Pan, Z., He, Y., Haffari, G., and Zhuang, B. (2024a). MiniCache: KV cache compression in depth dimension for large language models. In Globerson, A., Mackey, L., Belgrave, D., Fan, A., Paquet, U., Tomczak, J., and Zhang, C., editors, Advances in Neural Information Processing Systems, volume 37, pages 139997–140031. Curran Associates, Inc.

Liu, Y., Li, H., Cheng, Y., Ray, S., Huang, Y., Zhang, Q., Du, K., Yao, J., Lu, S., Ananthanarayanan, G., Maire, M., Hoffmann, H., Holtzman, A., and Jiang, J. (2024b). CacheGen: KV cache compression and streaming for fast large language model serving. In Proceedings of the ACM SIGCOMM 2024 Conference, ACM SIGCOMM '24, pages 38–56, New York, NY, USA. Association for Computing Machinery.

Nolet, C. J., Lafargue, V., Raff, E., Nanditale, T., Oates, T., Zedlewski, J., and Patterson, J. (2021). Bringing UMAP closer to the speed of light with GPU acceleration.

NVIDIA Corporation (2024). Benchmarking metrics for large language models. https://docs.nvidia.com/nim/benchmarking/llm/latest/metrics.html. Accessed: 2025-04-17.

Oliveira, V. P. L. (2024). MemoryGraph: uma proposta de memória para agentes conversacionais utilizando grafo de conhecimento. PhD thesis (Computer Science), Universidade Federal de Goiás, Goiânia.

Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. (2002). BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL '02, pages 311–318, USA. Association for Computational Linguistics.

Paschoal, A. F. A., Pirozelli, P., Freire, V., Delgado, K. V., Peres, S. M., José, M. M., Nakasato, F., Oliveira, A. S., Brandão, A. A. F., Costa, A. H. R., and Cozman, F. G. (2021). Pirá: A bilingual Portuguese-English dataset for question-answering about the ocean. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, CIKM '21, pages 4544–4553, New York, NY, USA. Association for Computing Machinery.

Reimers, N. and Gurevych, I. (2019). Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In Inui, K., Jiang, J., Ng, V., and Wan, X., editors, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3982–3992, Hong Kong, China. Association for Computational Linguistics.

RunPod (2025). RunPod – cloud compute for AI, ML, and more. Accessed: April 28, 2025.

Sarthi, P., Abdullah, S., Tuli, A., Khanna, S., Goldie, A., and Manning, C. D. (2024). RAPTOR: Recursive abstractive processing for tree-organized retrieval. In The Twelfth International Conference on Learning Representations.

Souza, F., Nogueira, R., and Lotufo, R. (2020). BERTimbau: pretrained BERT models for Brazilian Portuguese. In 9th Brazilian Conference on Intelligent Systems, BRACIS, Rio Grande do Sul, Brazil, October 20-23.

Taschetto, L. and Fileto, R. (2024). Using retrieval-augmented generation to improve performance of large language models on the Brazilian university admission exam. In Anais do XXXIX Simpósio Brasileiro de Bancos de Dados, pages 799–805, Porto Alegre, RS, Brasil. SBC.

Yao, J., Li, H., Liu, Y., Ray, S., Cheng, Y., Zhang, Q., Du, K., Lu, S., and Jiang, J. (2025). CacheBlend: Fast large language model serving for RAG with cached knowledge fusion. In Proceedings of the Twentieth European Conference on Computer Systems, EuroSys '25, pages 94–109, New York, NY, USA. Association for Computing Machinery.

Yu, H., Gan, A., Zhang, K., Tong, S., Liu, Q., and Liu, Z. (2024). Evaluation of retrieval-augmented generation: A survey.