Machine learning algorithms have become powerful modeling tools in engineering. By using complex structures or multiple nonlinear transformations, these methods fit nonlinear relationships among many variables in a high-dimensional space. They are well suited to problems that traditional physical or empirical models cannot solve effectively because the relationships among the variables are too complex. Traditional approaches to logging data interpretation are based on petrophysical mechanisms and models, so they require many assumptions and may deviate from reality in practical applications. Machine learning is therefore attractive for logging data processing and interpretation, where reservoir fluid identification is a task of great significance. Existing reservoir fluid identification methods, however, have not thoroughly mined the multi-dimensional correlations in logging data. Moreover, the distribution of reservoir types is severely imbalanced, and reservoirs with similar physical properties are easily confused. We present an efficient machine learning method for identifying reservoir fluids from logs. A long short-term memory (LSTM) network is used to characterize the sequential behavior of logs as they vary with depth. The convolution kernels of a convolutional neural network (CNN) span multiple logging curves to characterize the correlations among them. To account for the imbalanced category distribution and the differing development value of reservoir types, this paper uses a weighted cross-entropy loss function to increase the weight of small-sample categories during model training, which further improves the identification accuracy for oil-bearing reservoirs. Based on the differences and similarities in reservoir physical properties, a multi-layer reservoir fluid identification method is designed. 
The LSTM + CNN model structure is applied to prediction at layer level II (oil-bearing reservoirs, water-bearing reservoirs, and dry layers) and layer level III (oil layer, oil-water layer, poor oil layer, water layer, and oily water layer). The method is verified on logging data from real oil fields, in which the category distribution is highly imbalanced: oil-bearing reservoirs account for only 9%, which is consistent with actual industrial conditions. A series of comparative experiments shows that the parallel LSTM and CNN network structure can fully capture the correlation characteristics of the multi-dimensional space of logging data. The weighted cross-entropy loss function significantly improves the identification accuracy for oil-bearing reservoirs, which have high development value. Moreover, the multi-layer reservoir fluid identification method is more accurate because it avoids confusing reservoirs with similar physical properties, such as the oil-water layer and the oily water layer. The experimental results demonstrate that this method can effectively overcome many of the problems in reservoir fluid identification. It has practical value in helping geological experts and engineers find underground reservoirs and complete reservoir evaluation.
Key words:
machine learning; loss function; logging data; oil and gas reservoir; fluid identification
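
As an illustration of the weighted cross-entropy idea summarized in the abstract, the following is a minimal sketch in plain Python. It is not the paper's implementation: the inverse-frequency weighting rule, the normalization by the sum of weights, the function names, and the 9/50/41 label split are all assumptions chosen for illustration (only the 9% share of oil-bearing reservoirs comes from the text).

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting rule: w_c = N / (K * n_c), so rarer classes get larger weights."""
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {c: total / (num_classes * n) for c, n in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -w[y] * log(p[y]), normalized by the sum of the sample weights."""
    numerator = 0.0
    denominator = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        # Clamp the probability to avoid log(0).
        numerator += -w * math.log(max(p[y], 1e-12))
        denominator += w
    return numerator / denominator


# Hypothetical label set: 0 = oil-bearing (9%), 1 = water-bearing, 2 = dry layer.
# Only the 9% figure is taken from the text; the 50/41 split is made up.
labels = [0] * 9 + [1] * 50 + [2] * 41
weights = inverse_frequency_weights(labels)  # weights[0] is about 3.7

# A model that predicts the same uniform distribution for every sample.
uniform = [[1 / 3, 1 / 3, 1 / 3] for _ in labels]
loss = weighted_cross_entropy(uniform, labels, weights)
```

With these weights, an error on an oil-bearing sample contributes several times more to the loss than an error on a water-bearing sample, which is the mechanism by which a weighted loss pushes training toward the minority class.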