1 |
Cai, D., Yu, S., Wen, J.-R., and Ma, W.-Y. (2003). Extracting content structure for web pages based on visual representation. In Asia-Pacific Web Conference, pages 406–417. Springer.
|
|
2 |
Chu, X., He, Y., Chakrabarti, K., and Ganjam, K. (2015). Tegra: Table extraction by global record alignment. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pages 1713–1728. ACM.
|
|
3 |
Crescenzi, V., Mecca, G., Merialdo, P., et al. (2001). Roadrunner: Towards automatic data extraction from large web sites. In VLDB, volume 1, pages 109–118.
|
|
4 |
Elmeleegy, H., Madhavan, J., and Halevy, A. (2011). Harvesting relational tables from lists on the web. The VLDB Journal - The International Journal on Very Large Data Bases, 20(2):209–226.
|
|
5 |
Fang, Y., Xie, X., Zhang, X., Cheng, R., and Zhang, Z. (2018). Stem: a suffix tree-based method for web data records extraction. Knowledge and Information Systems, 55(2):305–331.
|
|
6 |
Ferrara, E., De Meo, P., Fiumara, G., and Baumgartner, R. (2014). Web data extraction, applications and techniques: A survey. Knowledge-based systems, 70:301–323.
|
|
7 |
Goertzel, G. (1958). An algorithm for the evaluation of finite trigonometric series. The American Mathematical Monthly, 65(1):34–35.
|
|
8 |
Grigalis, T. (2013). Towards web-scale structured web data extraction. In Proceedings of the sixth ACM international conference on Web search and data mining, pages 753–758. ACM.
|
|
9 |
Guo, J., Crescenzi, V., Furche, T., Grasso, G., and Gottlob, G. (2019). Red: Redundancy-driven data extraction from result pages? In The World Wide Web Conference, pages 605–615. ACM.
|
|
10 |
Jindal, N. and Liu, B. (2010). A generalized tree matching algorithm considering nested lists for web data extraction. In Proceedings of the 2010 SIAM International Conference on Data Mining, pages 930–941. SIAM.
|
|
11 |
Kayed, M. and Chang, C.-H. (2009). Fivatech: Page-level web data extraction from template pages. IEEE transactions on knowledge and data engineering, 22(2):249–263.
|
|
12 |
Liu, B., Grossman, R., and Zhai, Y. (2003). Mining data records in web pages. In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 601–606. ACM.
|
|
13 |
Liu, B. and Zhai, Y. (2005). Net–a system for extracting web data from flat and nested data records. In International Conference on Web Information Systems Engineering, pages 487–495. Springer.
|
|
14 |
Liu, W., Meng, X., and Meng, W. (2009). Vide: A vision-based approach for deep web data extraction. IEEE Transactions on Knowledge and Data Engineering, 22(3):447–460.
|
|
15 |
Miao, G., Tatemura, J., Hsiung, W.-P., Sawires, A., and Moser, L. E. (2009). Extracting data records from the web using tag path clustering. In Proceedings of the 18th international conference on World wide web, pages 981–990. ACM.
|
|
16 |
Qiu, D., Barbosa, L., Dong, X. L., Shen, Y., and Srivastava, D. (2015). Dexter: large-scale discovery and extraction of product specifications on the web. Proceedings of the VLDB Endowment, 8(13):2194–2205.
|
|
17 |
Roldán, J. C., Jiménez, P., and Corchuelo, R. (2019). On extracting data from tables that are encoded using html. Knowledge-Based Systems.
|
|
18 |
Schulz, A., Lässig, J., and Gaedke, M. (2016). Practical web data extraction: are we there yet?-a short survey. In 2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI), pages 562–567. IEEE.
|
|
19 |
Shi, S., Liu, C., Shen, Y., Yuan, C., and Huang, Y. (2015). Autorm: An effective approach for automatic web data record mining. Knowledge-Based Systems, 89:314–331.
|
|
20 |
Simon, K. and Lausen, G. (2005). Viper: augmenting automatic information extraction with visual perceptions. In Proceedings of the 14th ACM international conference on Information and knowledge management, pages 381–388. ACM.
|
|
21 |
Sleiman, H. A. and Corchuelo, R. (2012). A survey on region extractors from web documents. IEEE Transactions on Knowledge and Data Engineering, 25(9):1960–1981.
|
|
22 |
Varlamov, M. and Turdakov, D. Y. (2016). A survey of methods for the extraction of information from web resources. Programming and Computer Software, 42(5):279–291.
|
|
23 |
Velloso, R. P. and Dorneles, C. F. (2013). Automatic web page segmentation and noise removal for structured extraction using tag path sequences. JIDM, 4(3):173.
|
|
24 |
Velloso, R. P. and Dorneles, C. F. (2017). Extracting records from the web using a signal processing approach. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pages 197–206. ACM.
|
|
25 |
Velloso, R. P. and Dorneles, C. F. (2019). Web page structured content detection using supervised machine learning. In International Conference on Web Engineering, pages 3–18. Springer.
|
|
26 |
Wai, F. K., Yong, L. W., Thing, V. L., and Pomponiu, V. (2017). Cmdr: Classifying nodes for mining data records with different html structures. In TENCON 2017-2017 IEEE Region 10 Conference, pages 1862–1862. IEEE.
|
|
27 |
Wang, J., Wang, H., Wang, Z., and Zhu, K. Q. (2012). Understanding tables on the web. In International Conference on Conceptual Modeling, pages 141–155. Springer.
|
|
28 |
Xie, X., Fang, Y., Zhang, Z., and Li, L. (2012). Extracting data records from web using suffix tree. In Proceedings of the ACM SIGKDD Workshop on Mining Data Semantics, page 12. ACM.
|
|
29 |
Zhai, Y. and Liu, B. (2005). Web data extraction based on partial tree alignment. In Proceedings of the 14th international conference on World Wide Web, pages 76–85. ACM.
|
|
30 |
Zhang, Z., Zhu, K. Q., Wang, H., and Li, H. (2013). Automatic extraction of top-k lists from the web. In 2013 IEEE 29th International Conference on Data Engineering (ICDE), pages 1057–1068. IEEE.
|
|
31 |
Arasu, A. and Garcia-Molina, H. (2003). Extracting structured data from web pages. In Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pages 337–348. ACM.
|
|
32 |
Cafarella, M. J., Halevy, A., Wang, D. Z., Wu, E., and Zhang, Y. (2008). Webtables: exploring the power of tables on the web. Proceedings of the VLDB Endowment, 1(1):538–549.
|
|