1 |
Baeza-Yates, R. A.; Ribeiro-Neto, B. A. (1999). Modern Information Retrieval. ACM Press/Addison-Wesley, New York, NY, USA.
|
|
2 |
Bianco, G. D. et al. (2013). Tuning large scale deduplication with reduced effort. In: Proceedings of the International Conference on Scientific and Statistical Database Management, ACM, New York, p. 18:1-18:12. (SSDBM).
|
|
3 |
Bianco, G. D. et al. (2016). A practical and effective sampling selection strategy for large scale deduplication. In: IEEE International Conference on Data Engineering, p. 1518-1519.
|
|
4 |
Bilenko, M.; Mooney, R. J. (2003). Adaptive duplicate detection using learnable string similarity measures. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York, NY, USA, p. 39-48. (KDD 03).
|
|
5 |
Carvalho, M. G. et al. (2008a). The impact of parameter setup on a genetic programming approach to record deduplication. In: Brazilian Symposium on Databases, SBC, p. 91-105.
|
|
6 |
Carvalho, M. G. et al. (2008b). Replica identification using genetic programming. In: ACM Symposium on Applied Computing, p. 1801-1806.
Carvalho, M. G. et al. (2009). Evolutionary approaches to data integration related problems. Thesis. Universidade Federal de Minas Gerais, p. 66-81.
|
|
7 |
Carvalho, M. G. et al. (2012). A genetic programming approach to record deduplication. IEEE Transactions on Knowledge and Data Engineering, NJ, USA, v. 24, p. 399-412.
|
|
8 |
Fellegi, I. P.; Sunter, A. B. (1969). A theory for record linkage. Journal of the American Statistical Association, [S.l.], v. 64, n. 328, p. 1183-1210.
|
|
9 |
Jain, A. K. et al. (1999). Data clustering: A review. ACM Computing Surveys, v. 31, n. 3.
|
|
10 |
Koudas, N. et al. (2006). Record linkage: Similarity measures and algorithms. In: ACM International Conference on Management of Data, Chicago, USA, p. 802-803.
|
|
11 |
Ma, K. et al. (2015). Large-scale schema-free data deduplication approach with adaptive sliding window using MapReduce. The Computer Journal, v. 58, n. 11, p. 3187-3201.
|
|
12 |
Ziv, J.; Lempel, A. (1977). A universal algorithm for sequential data compression. IEEE Transactions on Information Theory, v. 23, n. 3, p. 337-343.
|
|