@inproceedings{NIPS2000_728f206c,
  author    = {Bengio, Yoshua and Ducharme, R{\'e}jean and Vincent, Pascal},
  title     = {A Neural Probabilistic Language Model},
  booktitle = {Advances in Neural Information Processing Systems},
  editor    = {Leen, T. and Dietterich, T. and Tresp, V.},
  volume    = {13},
  publisher = {MIT Press},
  year      = {2000},
  url       = {https://proceedings.neurips.cc/paper_files/paper/2000/file/728f206c2a01bf572b5940d7d9a8fa4c-Paper.pdf}
}
@article{Dalvi2019,
  author  = {Dalvi, Fahim and Durrani, Nadir and Sajjad, Hassan and Belinkov, Yonatan and Bau, Anthony and Glass, James},
  title   = {What Is One Grain of Sand in the Desert? {Analyzing} Individual Neurons in Deep {NLP} Models},
  journal = {Proceedings of the AAAI Conference on Artificial Intelligence},
  volume  = {33},
  number  = {01},
  pages   = {6309--6317},
  year    = {2019},
  month   = jul,
  doi     = {10.1609/aaai.v33i01.33016309},
  url     = {https://ojs.aaai.org/index.php/AAAI/article/view/4592}
}
@misc{derose2020,
  author        = {DeRose, Joseph F. and Wang, Jiayao and Berger, Matthew},
  title         = {Attention Flows: Analyzing and Comparing Attention Mechanisms in Language Models},
  year          = {2020},
  eprint        = {2009.07053},
  archivePrefix = {arXiv},
  primaryClass  = {cs.HC}
}
@misc{geva2022,
  author        = {Geva, Mor and Caciularu, Avi and Wang, Kevin Ro and Goldberg, Yoav},
  title         = {Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space},
  year          = {2022},
  eprint        = {2203.14680},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL}
}
@misc{shelby2024,
  author       = {Hiter, Shelby},
  title        = {Top 20 Generative {AI} Tools and Applications in 2024},
  howpublished = {eWeek},
  year         = {2024},
  url          = {https://www.eweek.com/artificial-intelligence/generative-ai-apps-tools/},
  urldate      = {2024-05-01}
}
@misc{hoover2019,
  author        = {Hoover, Benjamin and Strobelt, Hendrik and Gehrmann, Sebastian},
  title         = {{exBERT}: A Visual Analysis Tool to Explore Learned Representations in Transformers Models},
  year          = {2019},
  eprint        = {1910.05276},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL}
}
@misc{naveed2024comprehensive,
  author        = {Naveed, Humza and Khan, Asad Ullah and Qiu, Shi and Saqib, Muhammad and Anwar, Saeed and Usman, Muhammad and Akhtar, Naveed and Barnes, Nick and Mian, Ajmal},
  title         = {A Comprehensive Overview of Large Language Models},
  year          = {2024},
  eprint        = {2307.06435},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL}
}
@misc{samek2017,
  author        = {Samek, Wojciech and Wiegand, Thomas and M{\"u}ller, Klaus-Robert},
  title         = {Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models},
  year          = {2017},
  eprint        = {1708.08296},
  archivePrefix = {arXiv},
  primaryClass  = {cs.AI}
}
@book{tunstall2022natural,
  author    = {Tunstall, Lewis and von Werra, Leandro and Wolf, Thomas},
  title     = {Natural Language Processing with Transformers},
  publisher = {O'Reilly Media, Inc.},
  year      = {2022}
}
@misc{vaswani2017,
  author        = {Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia},
  title         = {Attention Is All You Need},
  year          = {2017},
  eprint        = {1706.03762},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL}
}