publication . Article . 2016

Text mining resources for the life sciences.

Piotr Przybyła; Matthew Shardlow; Sophie Aubin; Robert Bossy; Richard Eckart de Castilho; Stelios Piperidis; John McNaught; Sophia Ananiadou
Open Access
  • Published: 21 Nov 2016
Abstract
Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorise the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Kno...
Subjects
free text keywords: linguistic annotation, web services, ontology, system, integration, growth, tool, implementation, researchers, workflow, [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM], General Biochemistry, Genetics and Molecular Biology, General Agricultural and Biological Sciences, Information Systems, text mining, biocuration, interoperability, text mining resources, annotation formats, content discovery, knowledge encoding, repositories, aggregators, Review, Bioinformatics, Biomedical text mining, Information extraction, computer.software_genre, computer, Web service, Greek language, Workflow management system, Data science, Co-occurrence networks, Computer science
Funded by
UKRI| Enriching Metabolic PATHwaY models with evidence from the literature (EMPATHY)
Project
  • Funder: UK Research and Innovation (UKRI)
  • Project Code: BB/M006891/1
  • Funding stream: BBSRC
EC| OpenMinTeD
Project
OpenMinTeD
Open Mining INfrastructure for TExt and Data
  • Funder: European Commission (EC)
  • Project Code: 654021
  • Funding stream: H2020 | RIA
Communities
Digital Humanities and Cultural Heritage (DH-CH) communities: CLARIN