news-please
doi: 10.18452/1447
Dewey Decimal Classification: ddc:020
ACM Computing Classification System: InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL InformationSystems_MISCELLANEOUS
news crawler, news extractor, scraper, information extraction, 020 Library and information science