Paper Registration

Fill in your paper information

Language (*)

Title (*)

Keywords (*)

Abstract (*)

Pages (*)

File Link

English Information

Title

Keywords

Abstract

Authors

#	Name
1	Danny Suarez Vargas (dannysvof@gmail.com)
2	Lucas Lima de Oliveira (lloliveira@inf.ufrgs.br)
3	Viviane Pereira Moreira (viviane@inf.ufrgs.br)
4	Guilherme Torresan Bazzo (gtbazzo@inf.ufrgs.br)
5	Gustavo Acauan Lorentz (galorentz@inf.ufrgs.br)

Reference

#	Reference
1	Guilherme Torresan Bazzo, Gustavo Acauan Lorentz, Danny Suarez Vargas, and Viviane P. Moreira. Assessing the impact of OCR errors in information retrieval. In Advances in Information Retrieval, pages 102–109, 2020.
2	Steven M. Beitzel, Eric C. Jensen, and David A. Grossman. A survey of retrieval strategies for OCR text collections. In Symposium on Document Image Understanding Technologies, 2003.
3	G. Chiron, A. Doucet, M. Coustaty, and J. Moreux. ICDAR 2017 Competition on Post-OCR Text Correction. In Intl. Conf. on Document Analysis and Recognition, volume 01, pages 1423–1428, 2017.
4	W. Bruce Croft, Stephen Harding, Kazem Taghva, and Julie Borsack. An evaluation of information retrieval accuracy with simulated OCR output. In Symposium of Document Analysis and Information Retrieval, 1994.
5	Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
6	M. Droettboom. Correcting broken characters in the recognition of historical printed documents. In Joint Conference on Digital Libraries, pages 364–366, May 2003.
7	John Evershed and Kent Fitch. Correcting noisy OCR: Context beats confusion. In Intl. Conference on Digital Access to Textual Cultural Heritage, DATeCH ’14, pages 45–51, 2014.
8	Paul B. Kantor and Ellen M. Voorhees. The TREC-5 confusion track: Comparing retrieval methods for scanned text. Information Retrieval, 2(2):165–176, May 2000.
9	Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems 26, pages 3111–3119, 2013.
10	T. Nguyen, A. Jatowt, M. Coustaty, N. Nguyen, and A. Doucet. Deep statistical analysis of OCR errors for effective post-OCR processing. In Joint Conference on Digital Libraries (JCDL), pages 29–38, June 2019.
11	Thi Tuyet Hai Nguyen, Adam Jatowt, Mickael Coustaty, and Antoine Doucet. Survey of post-OCR processing approaches. ACM Computing Surveys (CSUR), 54(6):1–37, 2021.
12	Javier Parapar, Ana Freire, and Alvaro Barreiro. Revisiting n-gram based models for retrieval in degraded large collections. In Advances in Information Retrieval, pages 680–684, 2009.
13	C. Rigaud, A. Doucet, M. Coustaty, and J. Moreux. ICDAR 2019 competition on post-OCR text correction. In Intl. Conf. on Document Analysis and Recognition, pages 1588–1593, 2019.
14	Diana Santos and Paulo Rocha. The key to the first CLEF with Portuguese: Topics, questions and answers in Chave. In Workshop of the Cross-Language Evaluation Forum for European Languages, pages 821–832, 2004.
15	Kazem Taghva, Julie Borsack, and Allen Condit. Evaluation of model-based retrieval effectiveness with OCR text. ACM Trans. Inf. Syst., 14(1):64–93, January 1996.