Batista, G. E. A. P. A., Prati, R. C., & Monard, M. C. (2004). A study of the behavior of several methods for balancing machine learning training data. SIGKDD Explor. Newsl., 6(1):20–29.

Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. J. Artif. Int. Res., 16(1):321–357.

Cunha, W., França, C., Fonseca, G., Rocha, L., & Gonçalves, M. A. (2023). An effective, efficient, and scalable confidence-based instance selection framework for transformer-based text classification. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’23, page 665–674, New York, NY, USA. Association for Computing Machinery.

Cunha, W., Moreo Fernández, A., Esuli, A., Sebastiani, F., Rocha, L., & Gonçalves, M. A. (2025). A noise-oriented and redundancy-aware instance selection framework. ACM Trans. Inf. Syst., 43(2).

Han, H., Wang, W.-Y., & Mao, B.-H. (2005). Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning. In Huang, D.-S., Zhang, X.-P., & Huang, G.-B., editors, Advances in Intelligent Computing, pages 878–887, Berlin, Heidelberg. Springer Berlin Heidelberg.

He, H. & Garcia, E. A. (2009). Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9):1263–1284.

Hu, J., Ruder, S., Siddhant, A., Neubig, G., Firat, O., & Johnson, M. (2020). XTREME: A massively multilingual multi-task benchmark for evaluating cross-lingual generalisation. In Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proc. of Machine Learning Research, pages 4411–4421. PMLR.

Last, F., Douzas, G., & Bação, F. (2017). Oversampling for imbalanced learning based on k-means and SMOTE. CoRR, abs/1711.00837.

McClure, J., Shimmei, M., Matsuda, N., & Jiang, S. (2024). Leveraging prompts in LLMs to overcome imbalances in complex educational text data.

Nguyen, H. M., Cooper, E. W., & Kamei, K. (2011). Borderline over-sampling for imbalanced data classification. Int. J. Knowl. Eng. Soft Data Paradigm., 3(1):4–21.

Souza, F., Nogueira, R., & Lotufo, R. (2020). BERTimbau: pretrained BERT models for Brazilian Portuguese. In 9th Brazilian Conference on Intelligent Systems, BRACIS, Rio Grande do Sul, Brazil, October 20-23 (to appear).

Tabar, V. R., Eskandari, F., Salimi, S., & Zareifard, H. (2018). Finding a set of candidate parents using dependency criterion for the K2 algorithm. Pattern Recognition Letters, 111:23–29.

Taskiran, S. F., Turkoglu, B., Kaya, E., & Asuroglu, T. (2025). A comprehensive evaluation of oversampling techniques for enhancing text classification performance. Scientific Reports, 15:21631.

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems, volume 30.

Yadav, V., Tang, Z., & Srinivasan, V. (2024). PAG-LLM: Paraphrase and aggregate with large language models for minimizing intent classification errors. In Proc. of the International ACM SIGIR Conference, SIGIR ’24, page 2569–2573.