Altman, N. S. (1992). An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician, 46(3):175–185.
Bergstra, J., Komer, B., Eliasmith, C., Yamins, D., and Cox, D. D. (2015). Hyperopt: A Python library for model selection and hyperparameter optimization. Computational Science & Discovery, 8(1):014008.
Bojanowski, P., Grave, E., Joulin, A., and Mikolov, T. (2017). Enriching word vectors with subword information. Transactions of the ACL, 5:135–146.
Campos, R., Canuto, S., Salles, T., de Sá, C. C., and Gonçalves, M. A. (2017). Stacking bagged and boosted forests for effective automated classification. In SIGIR, pages 105–114.
Canuto, S., Gonçalves, M. A., and Benevenuto, F. (2016). Exploiting new sentiment-based meta-level features for effective sentiment analysis. In WSDM, pages 53–62.
Canuto, S., Salles, T., Gonçalves, M. A., Rocha, L., Ramos, G., Gonçalves, L., Rosa, T., and Martins, W. (2014). On efficient meta-level features for effective text classification. In CIKM, pages 1709–1718.
Canuto, S., Salles, T., Rosa, T. C., and Gonçalves, M. A. (2019). Similarity-based synthetic document representations for meta-feature generation in text classification. In SIGIR, pages 355–364.
Canuto, S., Sousa, D. X., Gonçalves, M. A., and Rosa, T. C. (2018). A thorough evaluation of distance-based meta-features for automated text classification. IEEE TKDE, 30(12):2242–2256.
Chen, T. and Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In SIGKDD, pages 785–794.
Cunha, W., Canuto, S., Viegas, F., Salles, T., Gomes, C., Mangaravite, V., Gonçalves, M. A., and Rocha, L. (2020). Extended pre-processing pipeline for text classification: On the role of meta-feature representations, sparsification and selective sampling. Inf. Processing & Management, 57(4):102263.
Cunha, W., Mangaravite, V., Gomes, C., Canuto, S., Resende, E., Nascimento, C., Viegas, F., França, C., Martins, W. S., Almeida, J. M., et al. (2021). On the cost-effectiveness of neural and non-neural approaches and representations for text classification: A comprehensive comparative study. Inf. Processing & Management, 58(3):102481.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Diao, Q., Qiu, M., Wu, C.-Y., Smola, A. J., Jiang, J., and Wang, C. (2014). Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS). In SIGKDD, pages 193–202.
Ding, W. and Wu, S. (2020). A cross-entropy based stacking method in ensemble learning. Journal of Intelligent & Fuzzy Systems, pages 1–12.
Džeroski, S. and Ženko, B. (2004). Is combining classifiers with stacking better than selecting the best one? Machine Learning, 54(3):255–273.
Fan, R.-E., Chang, K.-W., Hsieh, C.-J., Wang, X.-R., and Lin, C.-J. (2008). LIBLINEAR: A library for large linear classification. JMLR, 9:1871–1874.
Gomes, C., Gonçalves, M. A., Rocha, L., and Canuto, S. D. (2021). On the cost-effectiveness of stacking of neural and non-neural methods for text classification: Scenarios and performance prediction. In Findings of the Association for Computational Linguistics: ACL-IJCNLP, pages 4003–4014.
Hull, D. (1993). Using statistical testing in the evaluation of retrieval experiments. In SIGIR, pages 329–338.
Joulin, A., Grave, E., Bojanowski, P., and Mikolov, T. (2016). Bag of tricks for efficient text classification. arXiv preprint arXiv:1607.01759.
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V. (2019). RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.
Lundberg, S. M. and Lee, S.-I. (2017). A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 30, pages 4765–4774. Curran Associates, Inc.
Pedregosa, F. et al. (2011). Scikit-learn: Machine learning in Python. JMLR, 12:2825–2830.
Silva, R. M., Gomes, G. C., Alvim, M. S., and Gonçalves, M. A. (2016). Compression-based selective sampling for learning to rank. In CIKM, pages 247–256.
Sokolova, M. and Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Inf. Processing & Management, 45(4):427–437.
Sun, C., Qiu, X., Xu, Y., and Huang, X. (2019). How to fine-tune BERT for text classification? In Conference on Chinese Computational Linguistics, pages 194–206.
Tang, J., Alelyani, S., and Liu, H. (2014). Data classification: algorithms and applications. Data Mining and Knowledge Discovery Series, pages 37–64.
Tang, J., Qu, M., and Mei, Q. (2015). PTE: Predictive text embedding through large-scale heterogeneous text networks. In SIGKDD, pages 1165–1174.
Urbano, J., Lima, H., and Hanjalic, A. (2019). Statistical significance testing in information retrieval: An empirical analysis of Type I, Type II and Type III errors. In SIGIR, pages 505–514.
Viegas, F., Rocha, L., Gonçalves, M., Mourão, F., Sá, G., Salles, T., Andrade, G., and Sandin, I. (2018). A genetic programming approach for feature selection in highly dimensional skewed data. Neurocomputing, 273:554–569.
Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2):241–259.
Dong, Y.-S. and Han, K.-S. (2004). A comparison of several ensemble methods for text categorization. In IEEE SCC, pages 419–422.
Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R. R., and Le, Q. V. (2019). XLNet: Generalized autoregressive pretraining for language understanding. In NeurIPS, pages 5753–5763.
Zhang, X., Zhao, J. J., and LeCun, Y. (2015). Character-level convolutional networks for text classification. CoRR, abs/1509.01626.