SBBD Paper Registration


Authors
1. Aline Gassenn (aline.gassenn@usp.br)
2. Luís de Andrade (gustavo.modelli@unesp.br)
3. Douglas Teodoro (douglas.teodoro@unige.ch)
4. José F. Rodrigues-Jr (junio@icmc.usp.br)


References
1 Ali, S. N. and Shuvo, S. B. (2021). Hospital ambient noise dataset.
2 Ali, S. N., Shuvo, S. B., Al-Manzo, M. I. S., Hasan, A., and Hasan, T. (2023). An end-to-end deep learning framework for real-time denoising of heart sounds for cardiac disease detection in unseen noise. IEEE Access, 11:87887–87901.
3 Arora, R. K., Wei, J., Hicks, R. S., Bowman, P., Quiñonero-Candela, J., Tsimpourlas, F., Sharman, M., Shah, M., Vallone, A., Beutel, A., Heidecke, J., and Singhal, K. (2025). HealthBench: Evaluating large language models towards improved human health.
4 Baevski, A., Zhou, H., Mohamed, A., and Auli, M. (2020). wav2vec 2.0: A framework for self-supervised learning of speech representations. In Proceedings of the 34th International Conference on Neural Information Processing Systems (NeurIPS), pages 12449–12460. Curran Associates, Inc.
5 Banerjee, S., Agarwal, A., and Ghosh, P. (2024). High-precision medical speech recognition through synthetic data and semantic correction: United-MedASR. arXiv preprint arXiv:2412.00055.
6 Canopy AI (2025). canopyai/Orpheus-TTS: Towards human-sounding speech. GitHub repository.
7 Devatine, N. and Abraham, L. (2024). Assessing human editing effort on llm-generated texts via compression-based edit distance. arXiv preprint arXiv:2412.17321.
8 Gonçalves, Y. T., Alves, J. V. B., Sá, B. A. D., da Silva, L. N., de Macedo, J. A. F., and da Silva, T. L. C. (2024). Speech recognition models in assisting medical history. In Proceedings of the 39th Brazilian Symposium on Databases (SBBD), pages 485–497, Florianópolis, SC, Brazil.
9 Hsu, W.-N., Bolte, B., Tsai, Y.-H. H., Lakhotia, K., Salakhutdinov, R., and Mohamed, A. (2021). HuBERT: Self-supervised speech representation learning by masked prediction of hidden units. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29:3451–3460.
10 Le-Duc, K. (2024). VietMed: A dataset and benchmark for automatic speech recognition of Vietnamese in the medical domain. In Calzolari, N., Kan, M.-Y., Hoste, V., Lenci, A., Sakti, S., and Xue, N., editors, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 17365–17370, Torino, Italia. ELRA and ICCL.
11 Le-Duc, K., Phan, P., Pham, T.-H., Tat, B. P., Ngo, M.-H., Nguyen-Tang, T., and Hy, T.-S. (2025). MultiMed: Multilingual medical speech recognition via attention encoder decoder. In Rehm, G. and Li, Y., editors, Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), pages 1113–1150, Vienna, Austria. Association for Computational Linguistics.
12 Lee, S.-H., Park, J., Yang, K., Min, J., and Choi, J. (2022). Accuracy of cloud-based speech recognition open application programming interface for medical terms of Korean. Journal of Korean Medical Science, 37(18).
13 Norvig, P. (2025). pyspellchecker: Pure Python spell checking library.
14 Nurfadhilah, E., Jarin, A., Ruslana Aini, L., Pebiana, S., Santosa, A., Teduh Uliniansyah, M., Butarbutar, E., Desiani, and Gunarso (2021). Evaluating the BPPT medical speech corpus for an ASR medical record transcription system. In 2021 9th International Conference on Information and Communication Technology (ICoICT), pages 657–661.
15 Radford, A., Kim, J. W., Xu, T., Brockman, G., McLeavey, C., and Sutskever, I. (2023). Robust speech recognition via large-scale weak supervision. In Proceedings of the 40th International Conference on Machine Learning (ICML), pages 28492–28518. PMLR.
16 Tang, C., Zhang, H., Loakman, T., Lin, C., and Guerin, F. (2023). Terminology-aware medical dialogue generation. In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5. IEEE.
17 Zeng, G., Yang, W., Ju, Z., Yang, Y., Wang, S., Zhang, R., Zhou, M., Zeng, J., Dong, X., Zhang, R., Fang, H., Zhu, P., Chen, S., and Xie, P. (2020). MedDialog: Large-scale medical dialogue datasets. In Webber, B., Cohn, T., He, Y., and Liu, Y., editors, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 9241–9250, Online. Association for Computational Linguistics.