SBBD

Paper Registration


Fill in your paper information

English Information


Authors
# Name
1 Ana Machado (anaclaudiamachado211@aluno.ufsj.edu.br)
2 Celso França (celsofranca@dcc.ufmg.br)
3 Ian Nunes (iannunes@aluno.ufsj.edu.br)
4 Marcos Gonçalves (mgoncalv@dcc.ufmg.br)
5 Leonardo Rocha (lcrocha@ufsj.edu.br)


Reference
# Reference
1 Abdelrazek, A., Eid, Y., Gawish, E., Medhat, W., and Hassan, A. (2023). Topic modeling algorithms and applications: A survey. Information Systems, 112:102131.
2 Arora, S., May, A., Zhang, J., and Ré, C. (2020). Contextual embeddings: When are they worth it? arXiv preprint arXiv:2005.09117.
3 Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan):993–1022.
4 Bouma, G. (2009). Normalized (pointwise) mutual information in collocation extraction. In Proceedings of GSCL.
5 Boutsidis, C. and Gallopoulos, E. (2008). SVD-based initialization: A head start for nonnegative matrix factorization. Pattern Recognition, 41(4):1350–1362.
6 Churchill, R. and Singh, L. (2022). The evolution of topic modeling. ACM Computing Surveys, 54(10s).
7 Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186.
8 Doogan, C. and Buntine, W. (2021). Topic model or topic twaddle? Re-evaluating semantic interpretability measures. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics, pages 3824–3848. ACL.
9 Formal, T., Lassance, C., Piwowarski, B., and Clinchant, S. (2022). From distillation to hard negative sampling: Making sparse neural IR models more effective. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’22, pages 2353–2359, New York, NY, USA. Association for Computing Machinery.
10 Gao, X., Lin, Y., Li, R., Wang, Y., Chu, X., Ma, X., and Yu, H. (2024). Enhancing topic interpretability for neural topic modeling through topic-wise contrastive learning. In 2024 IEEE 40th International Conference on Data Engineering (ICDE).
11 Ghahramani, Z. and Attias, H. (2000). Online variational Bayesian learning. In Slides from talk presented at NIPS workshop on Online Learning.
12 Grootendorst, M. (2022). BERTopic: Neural topic modeling with a class-based TF-IDF procedure. arXiv preprint arXiv:2203.05794.
13 Júnior, A. P. D. S., Cecilio, P., Viegas, F., Cunha, W., Albergaria, E. T. D., and Rocha, L. C. D. D. (2022). Evaluating topic modeling pre-processing pipelines for Portuguese texts. WebMedia ’22, page 191.
14 Kuang, D., Choo, J., and Park, H. (2015). Nonnegative Matrix Factorization for Interactive Topic Modeling and Document Clustering, pages 215–243. Springer International Publishing, Cham.
15 Lee, D. D. and Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755):788–791.
16 Liang, P., Bommasani, R., Lee, T., Tsipras, D., Soylu, D., Yasunaga, M., Zhang, Y., Narayanan, D., Wu, Y., Kumar, A., et al. (2022). Holistic evaluation of language models. arXiv preprint arXiv:2211.09110.
17 Viegas, F., Canuto, S., Gomes, C., Luiz, W., Rosa, T., Ribas, S., Rocha, L., and Gonçalves, M. A. (2019). CluWords: Exploiting semantic word clustering representation for enhanced topic modeling. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pages 753–761.
18 Viegas, F., Pereira, A., Cunha, W., França, C., Andrade, C., Tuler, E., Rocha, L., and Gonçalves, M. A. (2025). Exploiting contextual embeddings in hierarchical topic modeling and investigating the limits of the current evaluation metrics. Computational Linguistics, pages 1–41.