Data preprocessing for integrated asset modeling
https://doi.org/10.51890/2587-7399-2024-9-4-152-158
Abstract
Introduction. Machine learning is increasingly being applied in various industries, including the oil and gas sector. However, the quality of data collected from oilfields does not always allow for its correct use in the digitalization of production processes. Unfortunately, it is not possible to quickly re-equip all oilfields for more accurate and frequent data collection. As a result, we must continue working with the data that has already been collected.
Objective. The objective of this work is to treat the set of transformations applied to data on its way into machine learning models as a single unified process: the ETL (extract, transform, load) process.
Materials and methods. As examples to demonstrate the discussed challenges and approaches, we used data on oil well operations. Python scripts were developed for data analysis and visualization.
Results. The study found that the quality of data collected from oilfields is not always sufficient for use in machine learning. To improve data quality during collection and preparation stages, it is proposed to implement ETL processes.
Conclusion. The application of ETL processes will significantly increase the quantity and quality of data available for creating digital twins of oilfields. Therefore, the impact of introducing this technology is difficult to overestimate.
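As an illustration of what an ETL process over well operation data might look like, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the sample data, the column names (`well`, `date`, `oil_rate`), the treatment of negative values such as `-999` as missing readings, and the forward-fill strategy are all assumptions chosen for the example.

```python
import csv
import io
import sqlite3

# Hypothetical raw export: daily oil rate for one well, with a gap and a bad reading.
RAW_CSV = """well,date,oil_rate
W-1,2024-01-01,120.5
W-1,2024-01-02,
W-1,2024-01-03,118.0
W-1,2024-01-04,-999
W-1,2024-01-05,117.2
"""


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: treat empty or negative rates as missing, then forward-fill per well."""
    cleaned = []
    last_valid = {}  # most recent valid rate seen for each well
    for row in rows:
        well = row["well"]
        raw = row["oil_rate"].strip()
        rate = float(raw) if raw else None
        if rate is not None and rate < 0:  # assumed sentinel values such as -999
            rate = None
        if rate is None:
            rate = last_valid.get(well)    # forward-fill from the previous valid reading
        else:
            last_valid[well] = rate
        if rate is not None:               # skip leading rows with nothing to fill from
            cleaned.append((well, row["date"], rate))
    return cleaned


def load(records, conn):
    """Load: write cleaned records into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS well_rates (well TEXT, date TEXT, oil_rate REAL)"
    )
    conn.executemany("INSERT INTO well_rates VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
rows = conn.execute("SELECT date, oil_rate FROM well_rates ORDER BY date").fetchall()
```

Keeping the three stages as separate functions means each can be tested and replaced independently, for example by swapping forward-fill for another imputation method without touching extraction or loading.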
About the Authors
K. A. Pechko
Russian Federation
Konstantin A. Pechko, Chief data analytics officer
190000; 16, Gorohovaya str.; Saint Petersburg
Scopus: 57331243400
D. I. Konstantinov
Russian Federation
Dmitry I. Konstantinov, Specialist
Saint Petersburg
M. V. Simonov
Russian Federation
Maksim V. Simonov, Head of the Center
Saint Petersburg
A. A. Afanasev
Russian Federation
Aleksandr A. Afanasev, Chief specialist
Saint Petersburg
Review
For citations:
Pechko K.A., Konstantinov D.I., Simonov M.V., Afanasev A.A. Data preprocessing for integrated asset modeling. PROneft. Professionally about Oil. 2024;9(4):152-158. (In Russ.) https://doi.org/10.51890/2587-7399-2024-9-4-152-158