1 |
Chaudhuri, S., Ganti, V., and Kaushik, R. (2006). A Primitive Operator for Similarity Joins in Data Cleaning. In ICDE, page 5.
|
|
2 |
Deng, D., Li, G., Hao, S., Wang, J., and Feng, J. (2014). MassJoin: A Mapreduce-based Method for Scalable String Similarity Joins. In ICDE, pages 340–351.
|
|
3 |
Li, G., He, J., Deng, D., and Li, J. (2015). Efficient Similarity Join and Search on Multi-Attribute Data. In SIGMOD, pages 1137–1151.
|
|
4 |
Ribeiro, L. A. and Härder, T. (2011). Generalizing Prefix Filtering to Improve Set Similarity Joins. Information Systems, 36(1):62–78.
|
|
5 |
Sidney, C. F., Mendes, D. S., Ribeiro, L. A., and H ̈arder, T. (2015). Performance Prediction for Set Similarity Joins. In SAC, pages 967–972.
|
|
6 |
Vernica, R., Carey, M. J., and Li, C. (2010). Efficient Parallel Set-similarity Joins using MapReduce. In SIGMOD, pages 495–506.
|
|
7 |
Zaharia, M., Xin, R. S., Wendell, P., Das, T., Armbrust, M., Dave, A., Meng, X., Rosen, J., Venkataraman, S., Franklin, M. J., Ghodsi, A., Gonzalez, J., Shenker, S., and Stoica, I. (2016). Apache Spark: a Unified Engine for Big Data Processing. Communications of the ACM, 59(11):56–65.
|
|