
doi: 10.69997/sct.185205
This research addresses the challenges associated with data-driven soft sensors in industrial applications, where successful implementations remain limited. The scarcity of practical applications can be attributed to variable operating conditions and frequent disturbances in real-time processes. Industrial data are often nonlinear, dynamic, and highly unbalanced, which makes it difficult to capture the essential characteristics of the underlying processes. To tackle these issues, we propose a comprehensive solution for industrial applications that encompasses feature selection, feature extraction, and model updating. Feature selection aims to pinpoint the independent variables that have a substantial impact on key performance indicators, including quality, safety, efficiency, reliability, and sustainability. Doing so simplifies the model and improves its predictive accuracy. The process begins with screening variables based on process knowledge, followed by an analysis of correlation and redundancy to eliminate duplicated information, which can burden computational resources and degrade prediction accuracy. We propose a mutual information-based algorithm for feature selection that assesses the relevance and redundancy among process variables through a comprehensive correlation function. This algorithm ranks variables by their importance and uses a greedy search to identify the optimal set of variables. After the optimal variables are selected, feature extraction is carried out to derive internal features from this set and establish a relationship between these latent features and the output variables. Given the intricate nature of industrial processes, we employ deep learning techniques, specifically Long Short-Term Memory (LSTM) networks, which are a type of Recurrent Neural Network (RNN) well-suited for capturing long-term dependencies in sequential data.
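The feature selection step described above can be illustrated with a minimal sketch. The abstract does not define its comprehensive correlation function, so the code below substitutes a common mRMR-style criterion (relevance to the target minus mean redundancy with already selected variables) and a simple histogram estimate of mutual information. The names `discretize`, `mutual_information`, and `greedy_select`, the equal-width binning, and the default of 8 bins are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def discretize(values, bins=8):
    """Equal-width binning of a list of floats into integer bin labels."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant column
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def greedy_select(features, target, k, bins=8):
    """Greedy ranking: pick the variable with the best relevance-minus-redundancy.

    `features` maps a variable name to its list of samples (hypothetical layout).
    The scoring rule is an mRMR-style stand-in, not the paper's criterion.
    """
    xs = {name: discretize(col, bins) for name, col in features.items()}
    y = discretize(target, bins)
    relevance = {name: mutual_information(col, y) for name, col in xs.items()}
    selected = []

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(xs[name], xs[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while len(selected) < min(k, len(xs)):
        remaining = [name for name in xs if name not in selected]
        selected.append(max(remaining, key=score))
    return selected
```

Calling `greedy_select({"flow": [...], "pressure": [...]}, target, k=1)` with equal-length sample lists returns the chosen variable names in selection order. Under this criterion, a variable that nearly duplicates an already selected one is penalized by its high mutual information with that variable, which is the redundancy effect the abstract describes.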
LSTMs excel at modeling temporal correlations because they maintain memory states that allow learning from sequential data over extended periods. To address short-term nonstationary features resulting from process disturbances, we incorporate a differential unit into the latent layer of the LSTM network. Once trained, the model is updated during online application to track gradual changes in equipment and reaction agents. Quality-related data, although typically available only after measurement, can be leveraged to fine-tune model parameters, ensuring sustained predictive accuracy over time. To validate our approach, we present a case study on a delayed coker unit, which yields promising long-term predictions of tube metal temperature and showcases the potential of our methodology for industrial applications.
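The online updating idea, in which predictions are made immediately and parameters are fine-tuned once the delayed quality measurement arrives, can be sketched as follows. This is a deliberately simplified stand-in: it updates a linear output layer by stochastic gradient descent rather than the parameters of an LSTM, and it does not model the differential unit. The class name `OnlineLinearHead`, the `sample_id` bookkeeping, and the learning rate are hypothetical choices for illustration.

```python
class OnlineLinearHead:
    """Linear output layer fine-tuned by SGD as delayed measurements arrive.

    A simplified stand-in for fine-tuning a trained network online; it is not
    the LSTM-based model described in the abstract.
    """

    def __init__(self, n_features, lr=0.05):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr
        self.pending = {}  # sample id -> features still awaiting a measurement

    def _forward(self, features):
        return sum(wi * xi for wi, xi in zip(self.w, features)) + self.b

    def predict(self, sample_id, features):
        """Predict now and remember the features until the label arrives."""
        self.pending[sample_id] = list(features)
        return self._forward(features)

    def update(self, sample_id, measured):
        """Take one squared-error gradient step with a late measurement.

        Returns the prediction error before the step, or None if the sample
        id is unknown or was already consumed.
        """
        features = self.pending.pop(sample_id, None)
        if features is None:
            return None
        error = self._forward(features) - measured
        self.w = [wi - self.lr * error * xi for wi, xi in zip(self.w, features)]
        self.b -= self.lr * error
        return error
```

In use, `predict(sample_id, features)` is called at every time step, and `update(sample_id, measured)` is called whenever a laboratory or delayed sensor value for an earlier sample becomes available. Keeping the features keyed by sample id is one simple way to pair a late measurement with the inputs that produced the original prediction.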
