Powered by OpenAIRE graph
ZENODO
Dataset . 2025
License: CC BY
Data sources: ZENODO, Datacite

Variations of Food Prices in Italian Supermarkets

Authors: Sasso, Daniele; Bacco, Luca; Palumbo, Luigi; Marcucci, Juri; Salvini, Niccolo; Laureti, Tiziana; Vollero, Luca;

Abstract

The dataset contains retail prices for meat, fruit, and vegetable products collected from December 2020 to March 2023. The data is in tabular format: each row records the price of one product at a specific store on a specific date.

The columns in the dataset are:
- date: Date of price collection, format DD/MM/YYYY (e.g., 03/12/2020).
- price: Retail price in euros (EUR), using a decimal point (.).
- product_id: A unique identifier assigned to each product.
- store_id: Anonymized unique identifier of the store where the price was recorded.
- region: Italian region where the store is located (e.g., Calabria, Lazio).
- product: Full commercial name of the product, including quantity or weight (e.g., "arance navelina italia calibro 1.5 kg").
- COICOP5: Product classification at the 5-digit level based on the COICOP nomenclature (e.g., "Oranges").
- COICOP4: Higher-level COICOP category (e.g., "Fruit", "Meat", "Vegetable").

Units and Notes:
- Currency: All prices are in euros (EUR).
- Quantities: The quantity or weight is included in the product field (e.g., "1.5 kg", "500 g").
- Date Format: Dates are in DD/MM/YYYY format.
- COICOP classification: Assigned via manual annotation and rule-based categorization using domain-specific keywords.

File Information:
- Format: CSV (.csv), UTF-8 encoded, comma-separated.
- Each row corresponds to one product observation at a specific store on a specific date.
- No missing values are present in the cleaned version.

This structure supports analyses of regional price variations, comparisons across product categories, and time-series investigations of price dynamics in the Italian retail food market.
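The schema described above can be checked at load time. The following is a minimal sketch, not part of the dataset's own tooling: the helper name `load_prices` and the two-row in-memory sample are hypothetical, and the sample values are made up for illustration rather than taken from the dataset. It assumes the column names match the abstract exactly.

```python
import io

import pandas as pd

# Column names as documented in the abstract.
EXPECTED_COLUMNS = [
    "date", "price", "product_id", "store_id",
    "region", "product", "COICOP5", "COICOP4",
]


def load_prices(source):
    """Load the price table and check it against the documented schema.

    `source` may be a file path or a file-like object.
    """
    df = pd.read_csv(source, encoding="utf-8")

    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")

    # Dates are documented as DD/MM/YYYY, so the format is given explicitly
    # instead of letting pandas guess (which could read 03/12 as March 12).
    df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")

    # The cleaned version is documented as having no missing values.
    if df[EXPECTED_COLUMNS].isna().any().any():
        raise ValueError("unexpected missing values")
    return df


# Synthetic sample (made-up values, NOT real observations from the dataset).
SAMPLE_CSV = """date,price,product_id,store_id,region,product,COICOP5,COICOP4
03/12/2020,2.49,p1,s1,Lazio,arance navelina italia calibro 1.5 kg,Oranges,Fruit
04/12/2020,2.59,p1,s1,Lazio,arance navelina italia calibro 1.5 kg,Oranges,Fruit
"""

sample = load_prices(io.StringIO(SAMPLE_CSV))
```

With the real file, the same helper would be called with the CSV path in place of the in-memory sample.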
# ------------------------------------------------------------
# Sample Code for Dataset Analysis
# ------------------------------------------------------------

# Required libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load dataset
df = pd.read_csv("Variations_Food_Prices_Italian_Supermarkets.csv")

# Convert date column (dates are documented as DD/MM/YYYY)
df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")

# Define category colors
category_colors = {"Fruit": "blue", "Vegetable": "green", "Meat": "red"}

# ------------------------------------------------------------
# Geographic distribution of unique products by region
# ------------------------------------------------------------
# Count distinct products per region and category
geo = df.groupby(["region", "COICOP4"])["product_id"].nunique().reset_index()
pivot_geo = geo.pivot(index="region", columns="COICOP4", values="product_id").fillna(0)

# Sort regions by their total number of unique products
pivot_geo["Total"] = pivot_geo.sum(axis=1)
pivot_geo = pivot_geo.sort_values("Total", ascending=False).drop(columns="Total")
pivot_geo = pivot_geo[["Fruit", "Meat", "Vegetable"]]

# Stacked bar chart, colored consistently with category_colors
pivot_geo.plot(kind="bar", stacked=True, figsize=(10, 6),
               color=[category_colors[c] for c in pivot_geo.columns])
plt.ylabel("Number of Unique Products")
plt.title("Geographic Distribution by Region and Category (Sorted)")
plt.xticks(rotation=45, ha="right")
plt.legend(title="Category")
plt.tight_layout()
plt.show()

# ------------------------------------------------------------
# Basic analysis: average price trend over time (by COICOP4)
# ------------------------------------------------------------
price_trend = df.groupby(["date", "COICOP4"])["price"].mean().reset_index()

plt.figure(figsize=(10, 5))
sns.lineplot(data=price_trend, x="date", y="price", hue="COICOP4", palette=category_colors)
plt.title("Average Price Over Time by COICOP4 Category")
plt.xlabel("Date")
plt.ylabel("Average Price (€)")
plt.legend(title="Category")
plt.tight_layout()
plt.show()
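Because the quantity or weight is embedded in the product field, raw prices of differently sized packs are not directly comparable. One possible approach, sketched here under assumptions, is to extract the weight with a regular expression and derive a price per kilogram. The helper names `weight_in_kg` and `add_price_per_kg` are hypothetical, the pattern only covers weights written as a number followed by "g" or "kg", and the demo rows are made-up values, not dataset observations. Product names in the real file may use other conventions that this pattern misses.

```python
import math
import re

import pandas as pd

# Matches a number followed by a weight unit, e.g. "1.5 kg" or "500 g".
# Assumption: decimal point or comma, and only the units g and kg.
_WEIGHT = re.compile(r"(\d+(?:[.,]\d+)?)\s*(kg|g)\b", re.IGNORECASE)


def weight_in_kg(product_name):
    """Return the first weight found in a product name, in kg, or NaN."""
    match = _WEIGHT.search(product_name)
    if match is None:
        return float("nan")
    value = float(match.group(1).replace(",", "."))
    unit = match.group(2).lower()
    return value if unit == "kg" else value / 1000.0


def add_price_per_kg(df):
    """Return a copy of df with weight_kg and price_per_kg columns added."""
    out = df.copy()
    out["weight_kg"] = out["product"].map(weight_in_kg).astype(float)
    out["price_per_kg"] = out["price"] / out["weight_kg"]
    return out


# Synthetic rows (made-up prices, NOT real observations from the dataset).
demo = pd.DataFrame({
    "product": [
        "arance navelina italia calibro 1.5 kg",
        "example product 500 g",
        "example product without weight",
    ],
    "price": [3.00, 1.00, 2.00],
})
demo = add_price_per_kg(demo)
```

Rows whose name carries no recognizable weight get NaN in both new columns, so they can be inspected or dropped before any comparison.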

Keywords

Web Scraping, Economics, Price Analysis, Food Prices, Retail Prices, Supermarkets
