SBBD

Paper Registration

Fill in your paper information

English Information

Authors
# Name
1 Lucas Lima de Oliveira (lloliveira@inf.ufrgs.br)
2 Viviane Pereira Moreira (viviane@inf.ufrgs.br)

Reference
# Reference
1 Bazzo, G. T., Lorentz, G. A., Vargas, D. S., and Moreira, V. P. (2020). Assessing the impact of OCR errors in information retrieval. In European Conference on Information Retrieval, pages 102–109.
2 Croft, W. B., Harding, S., Taghva, K., and Borsack, J. (1994). An evaluation of information retrieval accuracy with simulated OCR output. In Symposium on Document Analysis and Information Retrieval, pages 115–126.
3 Ghosh, K., Chakraborty, A., Parui, S. K., and Majumder, P. (2016). Improving information retrieval performance on OCRed text in the absence of clean text ground truth. Information Processing & Management, 52(5):873–884.
4 Hegghammer, T. (2021). OCR with Tesseract, Amazon Textract, and Google Document AI: a benchmarking experiment. Journal of Computational Social Science, pages 1–22.
5 Kantor, P. B. and Voorhees, E. M. (2000). The TREC-5 confusion track: Comparing retrieval methods for scanned text. Information Retrieval, 2(2):165–176.
6 Mittendorf, E. and Schäuble, P. (2000). Information retrieval can cope with many errors. Information Retrieval, 3(3):189–216.
7 Oliveira, L. L. d., Romeu, R. K., and Moreira, V. P. (2021). REGIS: A test collection for geoscientific documents in Portuguese. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2363–2368.
8 Oliveira, L. L. d., Vargas, D. S., Alexandre, A. M. A., Cordeiro, F. C., Gomes, D. d. S. M., Rodrigues, M. d. C., Romeu, R. K., and Moreira, V. P. (2023). Evaluating and mitigating the impact of OCR errors on information retrieval. International Journal on Digital Libraries, 24(1):45–62.
9 Sanderson, M. (2010). Test collection based evaluation of information retrieval systems. Foundations and Trends® in Information Retrieval, 4(4):247–375.
10 Santos, D. and Rocha, P. (2004). The key to the first CLEF with Portuguese: Topics, questions and answers in CHAVE. In Workshop of the Cross-Language Evaluation Forum for European Languages, pages 821–832.
11 Taghva, K., Borsack, J., and Condit, A. (1996a). Effects of OCR errors on ranking and feedback using the vector space model. Information Processing & Management, 32(3):317–327.
12 Taghva, K., Borsack, J., and Condit, A. (1996b). Evaluation of model-based retrieval effectiveness with OCR text. ACM Transactions on Information Systems (TOIS), 14(1):64–93.
13 Vargas, D. S., de Oliveira, L. L., Moreira, V. P., Bazzo, G. T., and Lorentz, G. A. (2021). sOCRates - a post-OCR text correction method. In Anais do XXXVI Simpósio Brasileiro de Bancos de Dados, pages 61–72.