1. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). Basic local alignment search tool. Journal of Molecular Biology, 215(3):403–410.
2. Araujo, J. D., Santos-e-Silva, J. C., Costa-Martins, A. G., Sampaio, V., de Castro, D. B., de Souza, R. F., Giddaluru, J., Ramos, P. I. P., Pita, R., Barreto, M. L., et al. (2022). Tucuxi-BLAST: Enabling fast and accurate record linkage of large-scale health-related administrative databases through a DNA-encoded approach. PeerJ, 10:e13507.
3. Barbosa, G. C. G., Ali, M., Barreto, M., Araujo, B., Reis, S., Sena, S., Ichihara, Y., Pescarini, J., Fiaccone, R., Amorim, L., Pita, R., Smeeth, L., and Barreto, M. (2020). CIDACS-RL: A novel indexing search and scoring-based record linkage system for huge datasets with high accuracy and scalability. BMC Medical Informatics and Decision Making, 20.
4. Barreto, M. L., Ichihara, M., Almeida, B. d. A., Barreto, M., Cabral, L., Fiaccone, R., Carreiro, R., Teles, C., Pita, R., Penna, G., et al. (2019). The Centre for Data and Knowledge Integration for Health (CIDACS): Linking health and social data in Brazil. International Journal of Population Data Science, 4(2):1140.
5. Barreto, M. L., Ichihara, M. Y., Pescarini, J. M., Ali, M. S., Borges, G. L., Fiaccone, R. L., Ribeiro-Silva, R. d. C., Teles, C. A., Almeida, D., Sena, S., et al. (2022). Cohort profile: The 100 Million Brazilian Cohort. International Journal of Epidemiology, 51(2):e27–e38.
6. Christen, P. (2008). Febrl - an open source data cleaning, deduplication and record linkage system with a graphical user interface. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1065–1068.
7. Christen, P. (2011). A survey of indexing techniques for scalable record linkage and deduplication. IEEE Transactions on Knowledge and Data Engineering, 24(9):1537–1555.
8. Christen, P. (2019). Data linkage: The big picture. Harvard Data Science Review, 1.
9. Christen, P., Ranbaduge, T., and Schnell, R. (2020). Linking Sensitive Data: Methods and Techniques for Practical Privacy-Preserving Information Sharing. Springer, Cham.
10. Christen, P., Ranbaduge, T., and Schnell, R. (2021). Linking sensitive data: Methods and techniques for practical privacy-preserving information sharing: Synopsis by Kerina Jones. International Journal of Population Data Science, 6(2).
11. Dillinger, P. C. and Manolios, P. (2004). Fast and accurate bitstate verification for SPIN. In Graf, S. and Mounier, L., editors, Model Checking Software, pages 57–75, Berlin, Heidelberg. Springer Berlin Heidelberg.
12. Dong, X. L. and Srivastava, D. (2013). Big data integration. In Proceedings - International Conference on Data Engineering, pages 1245–1248.
13. Durham, E. A., Kantarcioglu, M., Xue, Y., Toth, C., Kuzu, M., and Malin, B. (2013). Composite Bloom filters for secure record linkage. IEEE Transactions on Knowledge and Data Engineering, 26(12):2956–2968.
14. Jurczyk, P., Lu, J. J., Xiong, L., Cragan, J. D., and Correa, A. (2008). FRIL: A tool for comparative record linkage. In AMIA Annual Symposium Proceedings, volume 2008, page 440.
15. Kirsch, A. and Mitzenmacher, M. (2006). Less hashing, same performance: Building a better Bloom filter. In Algorithms - ESA 2006, Lecture Notes in Computer Science, volume 4168, pages 456–467.
16. Kristensen, T. G., Nielsen, J., and Pedersen, C. N. (2010). A tree-based method for the rapid screening of chemical fingerprints. Algorithms for Molecular Biology, 5:1–10.
17. Nóbrega, T., Pires, C. E. S., and Nascimento, D. C. (2021). Blockchain-based privacy-preserving record linkage: Enhancing data privacy in an untrusted environment. Information Systems, 102:101826.
18. Nóbrega, T., Pires, C. E. S., and Nascimento, D. C. (2022). Explanation and answers to critiques on: Blockchain-based privacy-preserving record linkage. Information Systems, 108:101935.
19. Patki, N. (2016). The synthetic data vault: Generative modeling for relational databases. PhD thesis, Massachusetts Institute of Technology.
20. Pinto, C., Pita, R., Melo, P., Sena, S., and Barreto, M. (2015). Correlação probabilística de bancos de dados governamentais. Simpósio Brasileiro de Bancos de Dados (SBBD 2015), pages 77–85.
21. Pita, R., Carreiro, R. P., Santos, C. J. C., Protasio, L. d. S., Barreto, M. E., Orrico, V. B., Gomes, J. A. D., Eustáquio, F. S., Sena, S., Barreto, M. L., Ramos, P. I. P., Rangel, D., and Almeida, B. d. A. (2025). Big data linkage no Brasil: Aspectos metodológicos e práticos. In Minicursos do XXV Simpósio Brasileiro de Computação Aplicada à Saúde, pages 306–345. SBC.
22. Pita, R., Menezes, L., and Barreto, M. E. (2018a). Applying term frequency-based indexing to improve scalability and accuracy of probabilistic data linkage. In LADaS@VLDB, pages 65–72.
23. Pita, R., Pinto, C., Sena, S., Fiaccone, R., Amorim, L., Reis, S., Barreto, M. L., Denaxas, S., and Barreto, M. E. (2018b). On the accuracy and scalability of probabilistic data linkage over the Brazilian 114 million cohort. IEEE Journal of Biomedical and Health Informatics, 22(2):346–353.
24. Ranbaduge, T., Vatsalan, D., and Ding, M. (2023). Privacy-preserving deep learning based record linkage. IEEE Transactions on Knowledge and Data Engineering, 36(11):6839–6850.
25. Schnell, R. (2014). An efficient privacy-preserving record linkage technique for administrative data and censuses. Statistical Journal of the IAOS, 30:263–270.
26. Schnell, R. (2015). Privacy-preserving record linkage. Methodological Developments in Data Linkage, pages 201–225.
27. Schnell, R., Bachteler, T., and Reiher, J. (2009). Development of a new method for privacy-preserving record linkage allowing for errors in identifiers. methods, data, analyses, 3(2):15.
28. Schnell, R., Bachteler, T., and Reiher, J. (2011). A novel error-tolerant anonymous linking code. Available at SSRN 3549247.
29. Vatsalan, D., Sehili, Z., Christen, P., and Rahm, E. (2017). Privacy-preserving record linkage for big data: Current approaches and research challenges. Handbook of Big Data Technologies, pages 851–895.