ZENODO
Conference object . 2023
License: CC BY
Data sources: Datacite


Enhancing Industrial Data Analysis through Machine Learning-based Classification of Petrochemical Datasets

Authors: Fáber, Rastislav; Ľubušký, Karol; Mojto, Martin; Paulen, Radoslav

Abstract

Incorporating data analytics and machine learning (ML) algorithms into industrial decision making has proven to be a promising way to boost production efficiency. By using ML algorithms to classify historical measurements from online sensors and laboratory analyses, it is possible to provide an operation guideline that was previously unavailable. We apply rigorous data treatment to prepare the raw data for ML-based classifier design. To ensure the quality and reliability of the available data, this treatment includes data cleaning, data standardization, data averaging, variable removal (based on linear dependency analysis), and distant outlier detection. The selection of a suitable classifier model depends on the complexity of the industrial process, the level of its automation (implementation effort), and the ability of the model to handle data outliers. We employ Density-Based Spatial Clustering of Applications with Noise (DBSCAN) for initial ground-truth labeling, after which we use well-understood ML algorithms, namely k-Means, k-Nearest Neighbors (k-NN), Support Vector Machine (SVM), and SVM with time differences, to engineer a framework for real-time classification. Accurate categorization of measurements is crucial for identifying slight deviations from real values that could impact the quality of the final product. Moreover, the complexity of the data plays a significant role in the performance of ML algorithms. With precise categorization of real-time data, the need for human intervention in process control can be minimized. To evaluate the performance of the designed classifiers, we compare their classification accuracy against the synthetic ground-truth labels generated by DBSCAN. This comparison is carried out on a testing dataset that was not used during the framework design. Overall, our results demonstrate that the ML-based classifiers achieve mutually comparable results in real-time classification. The most accurate classifier was the SVM model that uses not only the absolute data but also their time differences; it achieved the highest anomaly detection performance, 82 %.
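
The workflow described in the abstract (standardize the data, label it with DBSCAN, train a classifier on absolute values plus time differences, and score it against the DBSCAN labels on held-out data) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the toy two-variable series, the function names, and the `eps`/`min_pts` values are hypothetical, and a 1-nearest-neighbor classifier stands in for the SVM so that the example needs only the Python standard library. In practice, a library such as scikit-learn (`DBSCAN`, `KNeighborsClassifier`, `SVC`) would be the usual choice.

```python
import math
from collections import Counter

def standardize(rows):
    """Z-score each column (zero mean, unit variance)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def add_time_differences(rows):
    """Append x[t] - x[t-1] to each sample; the first sample gets zeros."""
    out = [list(rows[0]) + [0.0] * len(rows[0])]
    for prev, cur in zip(rows, rows[1:]):
        out.append(list(cur) + [c - p for c, p in zip(cur, prev)])
    return out

def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, -1 marks noise."""
    def neighbors(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels

def knn_predict(train_x, train_y, x, k=1):
    """Majority vote among the k nearest training samples."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

# Hypothetical toy series: steady operation with three short excursions.
raw = [[10.0 + 0.1 * (t % 3), 5.0 + 0.1 * (t % 2)] for t in range(30)]
raw[5], raw[12], raw[25] = [14.0, 8.0], [13.8, 8.2], [14.1, 7.9]

scaled = standardize(raw)
labels = dbscan(scaled, eps=0.5, min_pts=5)   # synthetic ground truth
features = add_time_differences(scaled)       # absolute values + differences

# Chronological split: the last 10 samples are never seen during training.
train_x, test_x = features[:20], features[20:]
train_y, test_y = labels[:20], labels[20:]

predictions = [knn_predict(train_x, train_y, x, k=1) for x in test_x]
accuracy = sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)
print(f"noise samples: {[i for i, l in enumerate(labels) if l == -1]}")
print(f"held-out accuracy: {accuracy:.2f}")
```

The split is chronological rather than random because the time-difference features tie each sample to its predecessor; shuffling before splitting would leak information from the test period into training.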

Keywords

Machine Learning, Analytics, Alkylation Process, Data Classification, Industry 4.0
