
The document presents an overview of the EFRA Data Platform's data assets and sources. It details the sources, challenges, and recent integrations of data relevant to global food safety. The EFRA Platform already contains millions of data records from highly heterogeneous sources, including:

1. Public Food Safety Authorities: Data from over 50 international food safety authority websites, scraped daily and customized for efficient information extraction. The challenge is the data's heterogeneity in language and format, and the lack of a global schema.
2. EFSA Lab Tests: Annual lab test results from the European Food Safety Authority, aggregated from national authorities across Europe. The challenge is transforming non-machine-readable formats.
3. Food Safety News Sites: Up-to-date information from authoritative food safety websites, requiring sophisticated processing to structure the natural language content.
4. Weather Data: Weather information from the Visual Crossing API, with challenges in data consistency and quality due to variations in measurement units and discrepancies in timestamps.
5. Regulatory Bodies: Regulations and news from public authorities worldwide. Language diversity, varied document formats, and complex tables complicate data parsing.
6. Pests Data: Scientific data on pests, manually extracted and paired with pesticide information from governmental sources, presented in machine-readable formats.
7. Food-safety Videos: Information extracted via speech transcription from food safety videos published by CDC, EFSA, and USDA, with recent additions to the data sources.
8. Private Sources: Data from the Agrivi Platform on farm management, pest alarms, and weather parameters; the Moy Park MTech Platform on poultry production and Salmonella testing; and the Food Fortress Platform on mycotoxin analysis in animal feed.

The executive summary highlights the integration of various data sources into EFRA, which enhances the understanding of food safety trends. It also elaborates on the data's formats and languages, the inherent challenges in handling and standardizing the diverse information, and the solutions implemented to create a uniform dataset for stakehold
