1 |
Chaudhuri, S., Ganti, V., and Kaushik, R. (2006). A Primitive Operator for Similarity Joins in Data Cleaning. In Proceedings of the ICDE Conference, page 5.
|
|
2 |
Doan, A., Halevy, A. Y., and Ives, Z. G. (2012). Principles of Data Integration. Morgan Kaufmann.
|
|
3 |
Fier, F., Augsten, N., Bouros, P., Leser, U., and Freytag, J. (2018). Set Similarity Joins on MapReduce: An Experimental Survey. Proceedings of the VLDB Endowment, 11(10):1110–1122.
|
|
4 |
Oliveira, D., Borges, F. F., and Ribeiro, L. A. (2017). Uma Abordagem para Processamento Distribuído de Junção por Similaridade sobre Múltiplos Atributos. In Proceedings of the Brazilian Symposium on Databases, pages 300–305.
|
|
5 |
Ribeiro, L. A. and Härder, T. (2011). Generalizing Prefix Filtering to Improve Set Similarity Joins. Information Systems, 36(1):62–78.
|
|
6 |
Ribeiro-Júnior, S., Quirino, R. D., Ribeiro, L. A., and Martins, W. S. (2017). Fast Parallel Set Similarity Joins on Many-core Architectures. Journal of Information and Data Management, 8(3):255–270.
|
|
7 |
Shanbhag, A., Madden, S., and Yu, X. (2020). A Study of the Fundamental Performance Characteristics of GPUs and CPUs for Database Analytics. In Proceedings of the SIGMOD Conference, pages 1617–1632.
|
|
8 |
Xu, L., Butt, A. R., Lim, S., and Kannan, R. (2018). A Heterogeneity-Aware Task Scheduler for Spark. In Proceedings of the IEEE International Conference on Cluster Computing, pages 245–256.
|
|
9 |
Zaharia, M., Xin, R. S., Wendell, P., Das, T., Armbrust, M., Dave, A., Meng, X., Rosen, J., Venkataraman, S., Franklin, M. J., Ghodsi, A., Gonzalez, J., Shenker, S., and Stoica, I. (2016). Apache Spark: a Unified Engine for Big Data Processing. Communications of the ACM, 59(11):56–65.
|
|