Chawla, N. V., Bowyer, K. W., Hall, L. O., and Kegelmeyer, W. P. (2002). SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16:321–357.
Chung, Y., Kraska, T., Polyzotis, N., Tae, K. H., and Whang, S. E. (2020). Automated data slicing for model validation: A big data - AI integration approach. IEEE Transactions on Knowledge and Data Engineering, 32(12):2284–2296.
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates, Hillsdale, NJ, 2nd edition.
El Gebaly, K., Agrawal, P., Golab, L., Korn, F., and Srivastava, D. (2014). Interpretable and informative explanations of outcomes. Proceedings of the VLDB Endowment, 8(1):61–72.
Foster, D. P. and Stine, R. A. (2008). α-investing: a procedure for sequential control of expected false discoveries. Journal of the Royal Statistical Society Series B: Statistical Methodology, 70(2):429–444.
Kamiran, F. and Calders, T. (2012). Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems, 33(1):1–33.
Kerrigan, D. and Bertini, E. (2023). SliceLens: Guided exploration of machine learning datasets. In Proceedings of the Workshop on Human-In-the-Loop Data Analytics, pages 1–7.
Kohavi, R. (1996). Census Income. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5GP7S.
Lin, Y., Gupta, S., and Jagadish, H. (2024). Mitigating subgroup unfairness in machine learning classifiers: A data-driven approach. In 2024 IEEE 40th International Conference on Data Engineering (ICDE), pages 2151–2163. IEEE.
Liu, H. and Motoda, H. (1998). Feature Selection for Knowledge Discovery and Data Mining. Kluwer Academic Publishers, USA.
Moro, S., Rita, P., and Cortez, P. (2012). Bank Marketing. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5K306.
Pastor, E., De Alfaro, L., and Baralis, E. (2021). Looking for trouble: Analyzing classifier behavior via pattern divergence. In Proceedings of the 2021 International Conference on Management of Data, pages 1400–1412.
Polyzotis, N., Roy, S., Whang, S. E., and Zinkevich, M. (2017). Data management challenges in production machine learning. In Proceedings of the 2017 ACM International Conference on Management of Data, pages 1723–1726.
Ribeiro, V., Pena, E. H. M., Saldanha, R., Akbarinia, R., Valduriez, P., Khan, F., Stoyanovich, J., and Porto, F. (2023). Subset modelling: A domain partitioning strategy for data-efficient machine-learning. In Anais do XXXVIII Simpósio Brasileiro de Bancos de Dados, pages 318–323, Porto Alegre, RS, Brasil. SBC.
Sagadeeva, S. and Boehm, M. (2021). SliceLine: Fast, linear-algebra-based slice finding for ML model debugging. In Proceedings of the 2021 International Conference on Management of Data, pages 2290–2299.
Sakar, C. and Kastro, Y. (2018). Online Shoppers Purchasing Intention Dataset. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5F88Q.
Sharma, A., Jain, A., Gupta, P., and Chowdary, V. (2021). Machine learning applications for precision agriculture: A comprehensive review. IEEE Access, 9:4843–4873.
Shehab, M., Abualigah, L., Shambour, Q., Abu-Hashem, M. A., Shambour, M. K. Y., Alsalibi, A. I., and Gandomi, A. H. (2022). Machine learning in medical applications: A review of state-of-the-art methods. Computers in Biology and Medicine, 145:105458.
Yang, L. and Shami, A. (2020). On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing, 415:295–316.
Zhang, X., Ono, J. P., Song, H., Gou, L., Ma, K.-L., and Ren, L. (2023). SliceTeller: A data slice-driven approach for machine learning model validation. IEEE Transactions on Visualization and Computer Graphics, 29(1):842–852.