1. Beskales, G., Soliman, M. A., Ilyas, I. F., and Ben-David, S. (2009). Modeling and querying possible repairs in duplicate detection. Proceedings of the VLDB Endowment, 2(1):598–609.
2. Christen, P. (2008). Automatic record linkage using seeded nearest neighbour and support vector machine classification. In SIGKDD, pages 151–159, Las Vegas, Nevada, USA.
3. Christen, P. (2012). Data Matching: Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection. Springer, Berlin.
4. de Carvalho, A. P., Ferreira, A. A., Laender, A. H., and Gonçalves, M. A. (2011). Incremental unsupervised name disambiguation in cleaned digital libraries. JIDM, 2(3):289–304.
5. de Carvalho, M. G., Laender, A. H., Gonçalves, M. A., and da Silva, A. S. (2012). A genetic programming approach to record deduplication. TKDE, 24(3):399–412.
6. Draisbach, U., Naumann, F., Szott, S., and Wonneberg, O. (2012). Adaptive windows for duplicate detection. In ICDE, pages 1073–1083, Arlington, Virginia, USA.
7. Hajishirzi, H., Yih, W.-t., and Kolcz, A. (2010). Adaptive near-duplicate detection via similarity learning. In SIGIR, pages 419–426, Geneva, Switzerland.
8. Hernández, M. A. and Stolfo, S. J. (1995). The merge/purge problem for large databases. In SIGMOD, pages 127–138, San Jose, CA, USA.
9. Ioannou, E., Rassadko, N., and Velegrakis, Y. (2013). On generating benchmark data for entity matching. Journal on Data Semantics, 2(1):37–56.
10. Jain, R. (1992). The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling. Wiley.
11. Sadinle, M. (2017). Bayesian estimation of bipartite matchings for record linkage. Journal of the American Statistical Association, 112(518):600–612.
12. Steorts, R. C., Ventura, S. L., Sadinle, M., and Fienberg, S. E. (2014). A comparison of blocking methods for record linkage. In PSD, pages 253–268, Ibiza, Spain.