CN116011351A

CN116011351A - Oil well reasonable sinking degree determining method based on clustering algorithm and Widedeep network

Info

Publication number: CN116011351A
Application number: CN202310308352.2A
Authority: CN
Inventors: 王鑫炎; 张黎明; 李敏; 程丞; 张凯
Original assignee: China University of Petroleum East China
Current assignee: China University of Petroleum East China
Priority date: 2023-03-28
Filing date: 2023-03-28
Publication date: 2023-04-25
Anticipated expiration: 2043-03-28
Also published as: CN116011351B

Abstract

A method for determining a reasonable oil well submergence (sinking degree) based on a clustering algorithm and a Widedeep network, belonging to the technical field of intelligent oil and gas field development, comprises the following steps. First, a clustering algorithm mines and analyzes the multi-source data of the oil wells in a block, and the high-efficiency oil wells with similar feature distributions are taken as an excellent sample library. Second, a nearest neighbor algorithm extracts knowledge from the excellent sample library as posterior features that guide the submergence design of the target oil well, while a Widedeep learning network is built to fit the mapping between the multi-source features of the oil wells in the block and the submergence. Finally, a reasonable submergence is generated accurately and efficiently from the combination of the target well's design parameters and posterior features, providing scientific and effective guidance for on-site production optimization.

Description

Oil well reasonable sinking degree determining method based on clustering algorithm and Widedeep network

Technical Field

The invention discloses a method for determining reasonable oil well sinking degree based on a clustering algorithm and a Widedeep network, and belongs to the technical field of intelligent development of oil and gas fields.

Background

Rod pumping is the main oil production method in oil fields at home and abroad. The submergence (sinking degree) of a pumping well is the height of the liquid column in the tubing-casing annulus above the pump intake and below the working fluid level during production. When the submergence is too small, gas enters the pump during pumping and the pump fills incompletely; in severe cases gas lock occurs, and the supply and discharge of the well become unbalanced. When the submergence is too large, the excessive submergence pressure suppresses production from thin oil layers and reduces well output; at the same time, the greater pump setting depth causes unnecessary consumption of equipment power and operating cost, which ultimately lowers the system efficiency and economic benefit of the pumping well. Therefore, in oilfield production and optimization management, a reasonable submergence is an important guarantee of coordinated supply and discharge and of efficient production.

At present, methods for determining a reasonable submergence at home and abroad fall into three main types:

The first determines the submergence from the gas-oil ratio and water cut of the well, relying on the experience of on-site personnel or on the production status of other wells in the same block. The second determines it by theoretical derivation and trial calculation, with the goal of maximizing system efficiency or output. The third performs regression analysis on indexes such as submergence, pump efficiency, system efficiency and power consumption per hundred meters per ton of liquid according to different criteria, and searches for underlying relationships through fitted equations to guide the determination of a reasonable submergence. These methods consider the diversity of block and well characteristics only to a limited extent, depend heavily on empirical formulas and simple mathematical models, and cannot make full use of the existing high-efficiency wells in a block to guide production. They therefore cannot meet the requirements of intelligent oilfield production, namely fully mining the patterns in massive data and ensuring efficient well production.

Disclosure of Invention

In order to solve the technical problems, the invention discloses a method for determining reasonable oil well sinking degree based on a clustering algorithm and a Widedeep network.

The invention mines and analyzes the multi-source data of the oil wells in a block with a clustering algorithm and takes the high-efficiency oil wells with similar feature distributions as an excellent sample library. A nearest neighbor algorithm extracts knowledge from the excellent sample library as posterior features that guide the submergence design of the target oil well. At the same time, a Widedeep learning network fits the mapping between the multi-source features of the oil wells in the block and the submergence, so that a reasonable submergence is generated accurately and efficiently from the combination of the target well's design parameters and posterior features, providing scientific and effective guidance for on-site production optimization.

The detailed technical scheme of the invention is as follows:

The method for determining a reasonable oil well submergence based on a clustering algorithm and a Widedeep network is characterized by comprising the following steps:

first, mining and analyzing the multi-source data of the oil wells in a block with a clustering algorithm, and taking the high-efficiency oil wells with similar feature distributions as an excellent sample library;

second, extracting knowledge from the excellent sample library with a nearest neighbor algorithm as posterior features that guide the submergence design of the target oil well, and at the same time using a Widedeep learning network to fit the mapping between the multi-source features of the oil wells in the block and the submergence;

finally, generating a reasonable submergence accurately and efficiently from the combination of the target well's design parameters and posterior features, thereby providing scientific and effective guidance for on-site production optimization.

According to a preferred embodiment of the present invention, the sources of the multi-source data of the oil well in the block include:

Step 1, according to oil production engineering theory and on-site production experience, acquiring multi-source data that reflect the production efficiency of the oil well and the reasonableness of its submergence, then preprocessing the data and constructing the multi-source feature set to obtain the multi-source data set of the oil wells in the block.

Preferably, according to the present invention, the clustering algorithm is:

Step 2, performing unsupervised cluster analysis on the multi-source data set of the oil wells in the block obtained in step 1 with the K-Means clustering algorithm, and screening out the high-efficiency oil well cluster, with system efficiency as the evaluation index, as an excellent sample library that represents the feature distribution of samples with reasonable submergence.

Preferably, according to the present invention, the nearest neighbor algorithm is:

Step 3, constructing a knowledge extractor for the excellent sample library of step 2 with the KNN regression algorithm, to generate the posterior feature distribution of the target well's specific production parameter combination in the feature space of the high-efficiency oil well cluster, which guides the subsequent design of a reasonable submergence.

According to the invention, the method for using the Widedeep learning network to fit the mapping between the multi-source features of the oil wells in the block and the submergence comprises:

Step 4, based on the multi-source data set of the oil wells in the block obtained in step 1, taking the oil well production feature items as the Wide layer input and the posterior feature items as the Deep layer input, and constructing a Widedeep deep neural network to fit the mapping between the multi-source features and the submergence of the oil wells in the target block. The oil well production feature items comprise: daily liquid production, pump efficiency, working fluid level, oil pressure, casing pressure, water cut, pump diameter, stroke frequency and tubing diameter. The posterior feature items comprise: downhole efficiency, system efficiency, ground efficiency and pump indicator diagram features.

According to a preferred embodiment of the invention, the method for generating a reasonable submergence from the combination of the target well's design parameters and posterior features comprises:

Step 5, inputting the specific production parameter items of the target well and the posterior feature items extracted from the excellent sample library into the trained Widedeep model to generate a reasonable submergence for the target well, which is finally used for on-site production optimization. The specific production parameter items of the target well comprise: design daily liquid production, design pump efficiency, calculated working fluid level, design oil pressure, design casing pressure, water cut, pump diameter, stroke and tubing diameter. The posterior feature items are the data items generated by the knowledge extractor of step 3: downhole efficiency, system efficiency, ground efficiency and pump indicator diagram features.

According to the invention, the specific process of the step 1 is as follows:

According to oil production engineering theory and on-site production experience, steps 1.1, 1.2 and 1.3 respectively extract the acquired oil well parameters, calculate the evaluation indexes and extract the pump indicator diagram features, and step 1.4 preprocesses the data and constructs the multi-source data set of the oil wells in the block. The specific process is as follows:

Step 1.1, collecting the parameters that influence the production efficiency and energy consumption of the oil well as acquisition feature items. The direct influence parameters are the motor input power, the daily liquid production and the working fluid level. The indirect influence parameters are the oil pressure, casing pressure, water cut, pump diameter, tubing diameter, pump depth, submergence, stroke frequency, maximum upstroke current and maximum downstroke current;

Step 1.2, calculating the evaluation indexes that characterize the production efficiency and energy consumption of the oil well as energy consumption evaluation feature items, comprising the pumping well system efficiency, ground efficiency, downhole efficiency, pump efficiency and power consumption per hundred meters per ton of liquid, according to the following formulas:

[Formulas (1) to (7): the equation images are not preserved in this text.]

In formulas (1) to (7), the quantities defined are, in order (their symbols and units were inline images and are likewise not preserved): the efficiency of the pumping well system; the ground efficiency; the downhole efficiency; the motor input power; the hydraulic power; the polished rod power; the pump efficiency; the daily liquid production of the oil well; the density of the pumped liquid; the effective lift, i.e. the working fluid level depth; the area enclosed by the load curve of the indicator diagram; the polished rod stroke; the stroke frequency; the force ratio of the dynamometer (indicator diagram); the length ratio; the cross-sectional area of the pump plunger; the power consumption per hundred meters per ton of liquid; and the daily power consumption of a single well;
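As a rough illustration of the evaluation indexes of step 1.2, the sketch below computes them from common textbook definitions. The function names, the unit choices (kW, m³/d, kg/m³, m, kWh) and the conversion constants are assumptions made for this example and may differ from the exact forms of formulas (1) to (7).

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed constant)


def hydraulic_power_kw(q_m3_per_day, rho_kg_m3, lift_m):
    """Useful (hydraulic) power in kW for lifting q m^3/day of liquid by lift_m."""
    watts = q_m3_per_day * rho_kg_m3 * G * lift_m / 86400.0  # J/day -> W
    return watts / 1000.0


def efficiencies(p_in_kw, p_rod_kw, p_hyd_kw):
    """Return (system, ground, downhole) efficiency as fractions."""
    system = p_hyd_kw / p_in_kw      # hydraulic power over motor input power
    ground = p_rod_kw / p_in_kw      # polished rod power over motor input power
    downhole = p_hyd_kw / p_rod_kw   # hydraulic power over polished rod power
    return system, ground, downhole


def pump_efficiency(q_m3_per_day, plunger_area_m2, stroke_m, strokes_per_min):
    """Actual daily liquid production over theoretical pump displacement."""
    theoretical = 1440.0 * plunger_area_m2 * stroke_m * strokes_per_min  # m^3/day
    return q_m3_per_day / theoretical


def power_per_100m_ton(daily_kwh, q_m3_per_day, rho_kg_m3, lift_m):
    """kWh consumed to lift one ton of liquid by 100 m."""
    tons_per_day = q_m3_per_day * rho_kg_m3 / 1000.0
    return daily_kwh * 100.0 / (tons_per_day * lift_m)
```

With these definitions the system efficiency is the product of the ground and downhole efficiencies, which is a convenient consistency check on measured data.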

Step 1.3, according to the measured polished rod load and displacement data of the oil well, solving the one-dimensional damped wave equation by the Fourier series method, with the suspension point displacement and load as boundary conditions, to obtain the pump indicator diagram that characterizes the operating condition of the downhole pump. The polished rod load and displacement are the data items of the indicator diagram of the oil well, and the suspension point displacement and load are the values at the wellhead; both are well known to those skilled in the petroleum field. The one-dimensional damped wave equation is:

$$\frac{\partial^2 u(x,t)}{\partial t^2} = a^2\,\frac{\partial^2 u(x,t)}{\partial x^2} - c\,\frac{\partial u(x,t)}{\partial t} \qquad (8)$$

In formula (8), $u(x,t)$ represents the displacement of the sucker rod string at section $x$ at time $t$; $a$ represents the propagation speed of sound waves in the sucker rod string; $c$ represents the resistance (damping) coefficient;

The pump indicator diagram is then normalized by the min-max method, i.e. the pump indicator diagram data are linearly transformed to remove the dimensions of the displacement and load items, so that after normalization both displacement and load lie in the interval [0,1]. The conversion function is:

$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (9)$$

In formula (9), $x^{*}$ represents the normalized data; $x$ represents the original sample data matrix; $x_{\max}$ represents the maximum value of the sample data; $x_{\min}$ represents the minimum value of the sample data;
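The min-max normalization of formula (9) can be sketched in a few lines. The function name and the handling of a constant series (returning zeros to avoid division by zero) are assumptions of this example, not part of the source.

```python
def min_max_normalize(values):
    """Linearly rescale a sequence of numbers to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed convention: a constant series maps to all zeros.
        return [0.0 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]


# Example: normalizing pump displacement samples (illustrative numbers only).
displacement = [2.0, 4.0, 6.0]
normalized = min_max_normalize(displacement)
```

The same function applies unchanged to the load series of the pump indicator diagram and to each column of the multi-source data set in step 1.4.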

A displacement-acquisition point curve is drawn with the acquisition point serial number as the horizontal axis and the dimensionless displacement of the normalized pump indicator diagram as the vertical axis. A load slope-acquisition point curve is drawn with the acquisition point serial number as the horizontal axis and the dimensionless load change slope of the normalized pump indicator diagram as the vertical axis, where the load slope is calculated by formula (10):

[Formula (10): the equation image is not preserved in this text.]

In formula (10), the quantities defined are, in order: the load change slope; the load value of each acquisition point in the normalized pump indicator diagram, indexed by acquisition point; and the number of acquisition points;

According to the working principle of the downhole pump, the pump valve opening and closing points are extracted from the displacement-acquisition point curve and the load slope-acquisition point curve, and on this basis the physical and geometric features of the pump indicator diagram are extracted to characterize the pump indicator diagram at the well's submergence. The physical features comprise the positions of the pump valve opening and closing points, and the effective pump stroke and average load in the upstroke and downstroke; the geometric features are the slopes of the curve during loading and unloading.

The specific process of extracting the pump valve opening and closing points and the geometric features of the pump indicator diagram is as follows:

Step 1-1: in the displacement-acquisition point curve, searching from the first acquisition point onward for the first point whose displacement equals 0, i.e. the bottom dead center D, and searching for the point of maximum displacement, i.e. the top dead center U;

Step 1-2: in the load slope-acquisition point curve, first determining the points of maximum and minimum load slope as point K1 and point K2, where K1 lies in the upstroke loading process and K2 lies in the downstroke unloading process;

Step 1-3: in the load slope-acquisition point curve, starting from point K1 and searching onward in acquisition point order for the first point whose load slope is approximately 0, i.e. the first maximum point between the upstroke loading process and the top dead center U in the pump indicator diagram, and taking it as the fixed valve opening point S1;

Step 1-4: in the load slope-acquisition point curve, starting from point K1 and searching onward in acquisition point order, up to the top dead center U, for the last point whose load slope is approximately 0, i.e. the last maximum point between the upstroke loading process and the top dead center U in the pump indicator diagram, and taking it as the fixed valve closing point S2;

Step 1-5: in the load slope-acquisition point curve, starting from point K2 and searching onward in acquisition point order for the first point whose load slope is approximately 0, i.e. the first minimum point between the downstroke unloading process and the bottom dead center D in the pump indicator diagram, and taking it as the traveling valve opening point T1;

Step 1-6: in the load slope-acquisition point curve, starting from point K2 and searching onward in acquisition point order for the last point whose load slope is approximately 0, i.e. the last minimum point between the downstroke unloading process and the bottom dead center D in the pump indicator diagram, and taking it as the traveling valve closing point T2;

Step 1-7: in the pump indicator diagram, recording the load of each acquisition point between the fixed valve opening point S1 and the fixed valve closing point S2 and calculating the average upstroke load; recording the load of each acquisition point between the traveling valve opening point T1 and the traveling valve closing point T2 and calculating the average downstroke load;

Step 1-8: in the pump indicator diagram, calculating the difference between the displacements of the fixed valve opening point S1 and the fixed valve closing point S2, i.e. the upstroke effective stroke; calculating the difference between the displacements of the traveling valve opening point T1 and the traveling valve closing point T2, i.e. the downstroke effective stroke;

Step 1-9: in the pump indicator diagram, calculating the slope of the line connecting the traveling valve closing point T2 and the fixed valve opening point S1 as the slope of the loading process; calculating the slope of the line connecting the fixed valve closing point S2 and the traveling valve opening point T1 as the slope of the unloading process;

Step 1-10: collecting all the feature values extracted in the above steps into the pump indicator diagram feature item, which serves as the feature set characterizing the working condition of the pump at the well's submergence;
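The search for K1, K2 and the first near-zero slope points in steps 1-2, 1-3 and 1-5 can be sketched as follows. This is only a minimal illustration: the forward-difference slope, the tolerance `tol` and all function names are assumptions of this example, and real pump indicator diagrams are noisy enough that smoothing would likely be needed first.

```python
import numpy as np


def load_slope(load):
    """Forward-difference slope of the normalized load per acquisition point.

    Assumption: a plain difference between consecutive points stands in for
    the load slope formula; the last point is given a slope of 0.
    """
    load = np.asarray(load, dtype=float)
    return np.diff(load, append=load[-1])


def first_flat_after(slope, start, tol=1e-6):
    """Index of the first point after `start` whose slope is within tol of 0."""
    for i in range(start + 1, len(slope)):
        if abs(slope[i]) <= tol:
            return i
    return None  # no flat point found


def valve_opening_points(load, tol=1e-6):
    """Return K1, K2 and the opening points S1 and T1 as acquisition indices."""
    slope = load_slope(load)
    k1 = int(np.argmax(slope))  # steepest loading, upstroke
    k2 = int(np.argmin(slope))  # steepest unloading, downstroke
    s1 = first_flat_after(slope, k1, tol)  # fixed valve opening point
    t1 = first_flat_after(slope, k2, tol)  # traveling valve opening point
    return {"K1": k1, "K2": k2, "S1": s1, "T1": t1}
```

The closing points S2 and T2 would follow the same pattern, searching for the last near-zero slope point before the top and bottom dead centers respectively.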

Step 1.4, cleaning outliers and duplicate values from the acquisition feature items extracted in step 1.1, the energy consumption evaluation feature items calculated in step 1.2 and the pump indicator diagram feature items extracted in step 1.3, normalizing them with the min-max method of formula (9), and finally obtaining the multi-source data set A of the oil wells in the block.

According to the invention, the specific process of the step 2 is as follows:

Step 2.1, performing unsupervised cluster analysis on the multi-source data set A of the oil wells in the block obtained in step 1 with the K-Means algorithm. The algorithm minimizes the sum of squared errors between the samples and their cluster centers: it iteratively calculates the distance between each sample and each cluster center and reassigns the samples, finally obtaining the preset number $K$ of clusters;

according to the classification strategy that divides the oil wells of the block into low-efficiency wells, normal wells and high-efficiency wells, the multi-source data set A is divided into 3 clusters so that the within-cluster sum of squared errors is minimized, i.e. the following condition is satisfied:

$$\underset{S}{\arg\min} \sum_{i=1}^{K} \sum_{x \in S_i} \left\| x - \mu_i \right\|^{2} \qquad (11)$$

In formula (11), $\arg\min$ denotes the value of the variable at which the expression reaches its minimum; $K$ is the number of clusters; $S_i$ is the $i$-th cluster; $\mu_i$ is the mean (center) of $S_i$; $x$ is the feature attribute vector of a sample. The smaller the within-cluster sum of squared errors, the smaller the distance between each sample and its cluster center, the smaller the differences within each cluster, and the better the clustering effect;

Step 2.2, calculating the within-cluster average system efficiency of each of the 3 oil well cluster observation sets and, in descending order of average system efficiency, designating them the high-efficiency well observation set A1, the normal well observation set A2 and the low-efficiency well observation set A3. The high-efficiency well observation set A1 represents the distribution of samples in the block that have high system efficiency and similar features (production features, efficiency features and pump indicator diagram features), and serves as the excellent sample library guiding the optimization and adjustment of oil well production.
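The ranking of step 2.2 can be sketched as below, assuming the cluster labels from step 2.1 and the per-well system efficiency are available as plain sequences. The function name and the string names of the three sets are hypothetical labels chosen for this example.

```python
from collections import defaultdict


def rank_clusters_by_efficiency(labels, system_efficiency):
    """Map each cluster label to A1/A2/A3 by descending mean system efficiency."""
    sums, counts = defaultdict(float), defaultdict(int)
    for label, eff in zip(labels, system_efficiency):
        sums[label] += eff
        counts[label] += 1
    means = {label: sums[label] / counts[label] for label in sums}
    ordered = sorted(means, key=means.get, reverse=True)
    names = ["A1_high_efficiency", "A2_normal", "A3_low_efficiency"]
    return dict(zip(ordered, names)), means


# Illustrative data only: six wells in three clusters.
mapping, means = rank_clusters_by_efficiency(
    [0, 0, 1, 1, 2, 2], [0.1, 0.2, 0.5, 0.6, 0.3, 0.3]
)
```

The wells whose cluster maps to `A1_high_efficiency` would then form the excellent sample library.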

According to the present invention, the specific process of the step 3 is:

On the basis of the high-efficiency well observation set A1 obtained in step 2.2, and because the efficiency features and pump indicator diagram features of an oil well cannot be determined before its submergence is optimized, a nearest neighbor regression strategy is used to construct an excellent sample knowledge extractor, which generates the posterior feature distribution, in the feature space of the high-efficiency well cluster, of a target well whose production parameters are known:

extracting from the high-efficiency well observation set A1 the feature set X1 required for constructing the knowledge extractor, comprising the oil well optimization design parameters and the currently known production parameters: daily liquid production, working fluid level, pump efficiency, oil pressure, casing pressure, water cut, pump diameter, stroke frequency and tubing diameter;

extracting from the high-efficiency well observation set A1 the target set Y1 required for constructing the knowledge extractor, comprising the posterior features: downhole efficiency, system efficiency, ground efficiency and pump indicator diagram features;

The knowledge extractor is constructed with the KNN algorithm as follows: a target well sample $x$ is input and, using the Euclidean distance as the metric, all samples in the feature set X1 are traversed and the Euclidean distance between the target well sample and each of them is calculated:

$$d(x, x_i) = \sqrt{\sum_{j} \left( x^{(j)} - x_i^{(j)} \right)^{2}} \qquad (12)$$

where $x_i$ represents the $i$-th sample in the feature set X1; $x^{(j)}$ and $x_i^{(j)}$ represent the $j$-th feature of the respective sample; $d(x, x_i)$ represents the Euclidean distance between the target well sample $x$ and the sample $x_i$.

The $k$ samples with the smallest distance to the target well are selected as the nearest neighbor sample set, and the target set Y1 values of the nearest neighbor samples are then averaged with weights determined by the Euclidean distance, according to formula (13):

[Formula (13): the equation image is not preserved in this text.]

In formula (13), the quantities defined are, in order: the index of a sample in the nearest neighbor sample set; the Euclidean distance between that sample and the target oil well; and the feature vector generated by the weighted calculation over the nearest neighbor sample set of the target oil well. This feature vector is the posterior feature distribution of the target oil well in Y1, i.e. the feature knowledge learned from the excellent sample library that characterizes a reasonable submergence, and it comprises the system efficiency, downhole efficiency, ground efficiency and pump indicator diagram features.
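A minimal sketch of such a knowledge extractor follows. The source states only that the neighbors are weighted according to Euclidean distance; the inverse-distance weights used here, the default `k=5`, the small `eps` guard and the function name are assumptions of this example.

```python
import numpy as np


def knn_posterior(x_target, X1, Y1, k=5, eps=1e-12):
    """Distance-weighted KNN regression of posterior features.

    X1: array of shape (n_samples, n_production_features)
    Y1: array of shape (n_samples, n_posterior_features)
    """
    X1 = np.asarray(X1, dtype=float)
    Y1 = np.asarray(Y1, dtype=float)
    x_target = np.asarray(x_target, dtype=float)
    # Euclidean distance from the target well to every library sample.
    d = np.sqrt(((X1 - x_target) ** 2).sum(axis=1))
    nearest = np.argsort(d)[:k]
    # Assumed weighting: closer samples count more (inverse distance).
    w = 1.0 / (d[nearest] + eps)
    return (w[:, None] * Y1[nearest]).sum(axis=0) / w.sum()
```

Because the features were min-max normalized in step 1.4, no single production parameter dominates the distance; the target well's inputs would need the same scaling before the lookup.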

According to the invention, the specific process of clustering the oil well multisource data set A in the block by adopting the K-means is as follows:

Step 2-1: randomly selecting 3 vectors as the initial cluster center points, each point representing one cluster;

Step 2-2: traversing all samples, calculating the Euclidean distance between each sample and each cluster center point, and assigning each sample to the cluster represented by its nearest cluster center;

Step 2-3: for the 3 clusters, calculating the mean of the samples in each cluster as the new cluster center point;

Step 2-4: repeating Step 2-2 and Step 2-3 until the center points of all clusters no longer change appreciably, then ending the loop to obtain the final 3 clusters.
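Steps 2-1 to 2-4 can be sketched with NumPy as below. The convergence tolerance, the iteration cap, the fixed random seed and the handling of an empty cluster (keeping its previous center) are assumptions of this example; a library implementation could be used instead.

```python
import numpy as np


def kmeans(X, k=3, max_iter=100, tol=1e-6, seed=0):
    """Plain K-means following steps 2-1 to 2-4; returns (labels, centers)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 2-1: pick k distinct samples as the initial cluster centers.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # Step 2-2: assign each sample to its nearest center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Step 2-3: recompute each center as the mean of its samples.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        # Step 2-4: stop once the centers no longer move appreciably.
        if shift < tol:
            break
    return labels, centers
```

Since the result depends on the random initial centers, running the sketch with several seeds and keeping the run with the lowest within-cluster sum of squared errors is a common safeguard.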

According to the present invention, the specific process of the step 4 is:

Step 4.1, constructing the Wide neural network part of the Widedeep learning network as the input module for the oil well production features, which receives the same feature types as the feature set X1 of step 3. A generalized logistic regression model is used as the main structure of this module's neural network, comprising an input layer, a hidden layer and an output layer, and is calculated as follows:

$$y = f(Wx + b) \qquad (14)$$

In formula (14), $W$ represents the weight matrix of the hidden layer; $x$ represents the feature matrix input to the neural network; $b$ represents the bias of the hidden layer; $f$ represents the activation function of the hidden layer; $y$ represents the output matrix of the neural network;

Step 4.2, constructing the Deep neural network part of the Widedeep deep learning network as the input module for the posterior features, which receives the same feature types as the target set Y1 of step 3. A multilayer feedforward neural network is used as the main structure of this module's neural network, comprising an input layer, several hidden layers and an output layer. During the forward pass of the input vector, each hidden layer applies the following transformation to the output vector of the previous layer:

$$a^{(l+1)} = f^{(l)}\!\left( W^{(l)} a^{(l)} + b^{(l)} \right) \qquad (15)$$

In formula (15), $W^{(l)}$ represents the weight matrix of layer $l$; $a^{(l)}$ represents the input matrix of layer $l$; $b^{(l)}$ represents the bias of layer $l$; $f^{(l)}$ represents the activation function of layer $l$; $a^{(l+1)}$ represents the input matrix of layer $l+1$;

AdaGrad is selected as the optimizer; it adapts the learning rate of each parameter during the iterative updates, so the learning rate does not need to be tuned manually. In addition, because the DNN used in the Deep neural network part is a fully connected network with a strong risk of overfitting, a Dropout strategy is applied to each hidden layer of this module, randomly deactivating a certain proportion of neurons to reduce overfitting;

Step 4.3, constructing the output fusion module of the Wide neural network part and the Deep neural network part: the output layers of the two modules are fused by concatenating their output dimensions to obtain a composite hidden layer, whose number of neurons is the sum of the numbers of neurons in the output layers of the two modules and whose input matrix is obtained by vertically concatenating the output matrices of the two modules;

the composite hidden layer is a logistic regression network that applies a nonlinear fusion transformation to the outputs of the Wide neural network part and the Deep neural network part, and its result is the predicted submergence output by the Widedeep learning network, according to formula (16):

[Formula (16): the equation image is not preserved in this text.]

In formula (16), the quantities defined are, in order: the final output of the model; the output of the Wide neural network part; the weight vector in the composite hidden layer corresponding to the output of the Wide neural network part; the output of the Deep neural network part; and the weight vector in the composite hidden layer corresponding to the output of the Deep neural network part;
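The forward computation of steps 4.1 to 4.3 can be illustrated with a small NumPy sketch. It is not the TensorFlow model of the source: the layer sizes, the ReLU activation, the linear fusion output, the random initial weights and the split of the 23 features into 9 production and 14 posterior inputs are all assumptions of this example, and training (AdaGrad, Dropout) is omitted.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def dense(x, W, b, activation=None):
    """One fully connected layer on row-vector samples: activation(x W + b)."""
    z = x @ W + b
    return activation(z) if activation else z


def init_params(n_wide, n_deep, wide_units=8, deep_units=(32, 16), seed=0):
    """Randomly initialized weights; sizes are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    params = {
        "W_wide": rng.normal(0.0, 0.1, (n_wide, wide_units)),
        "b_wide": np.zeros(wide_units),
        "deep_layers": [],
    }
    prev = n_deep
    for units in deep_units:
        params["deep_layers"].append(
            (rng.normal(0.0, 0.1, (prev, units)), np.zeros(units))
        )
        prev = units
    # Fusion layer: one weight per concatenated Wide and Deep output neuron.
    params["W_out"] = rng.normal(0.0, 0.1, (wide_units + prev, 1))
    params["b_out"] = np.zeros(1)
    return params


def wide_deep_forward(x_wide, x_deep, params):
    """Predicted submergence for a batch, shape (n_samples, 1)."""
    y_wide = dense(x_wide, params["W_wide"], params["b_wide"], relu)  # Wide part
    a = x_deep
    for W, b in params["deep_layers"]:                                # Deep part
        a = dense(a, W, b, relu)
    fused = np.concatenate([y_wide, a], axis=1)  # concatenate the two outputs
    return dense(fused, params["W_out"], params["b_out"])


params = init_params(n_wide=9, n_deep=14)
prediction = wide_deep_forward(np.zeros((4, 9)), np.zeros((4, 14)), params)
```

Concatenating the two outputs and applying a single weight matrix is equivalent to summing a Wide-weighted term and a Deep-weighted term, which matches the two weight vectors named for the composite hidden layer.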

Step 4.4, extracting from the multi-source data set A of the oil wells in the block obtained in step 1 a total of 23 kinds of feature data, comprising daily liquid production, working fluid level, pump efficiency, oil pressure, casing pressure, water cut, pump diameter, stroke frequency, tubing diameter, downhole efficiency, system efficiency, ground efficiency and the pump indicator diagram features, and dividing them into a training set and a test set at a ratio of 4:1;

a joint training strategy is adopted to optimize the Wide neural network part and the Deep neural network part simultaneously: the prediction error of the composite model is fed back to both parts at the same time to update their weight matrices and bias matrices;

The coefficient of determination $R^2$, a commonly used model performance metric, is used to evaluate the prediction on the test set, calculated as:

$$R^{2} = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^{2}} \qquad (17)$$

In formula (17), $i$ represents the sample index; $n$ is the total number of samples in the test set; $y_i$ and $\hat{y}_i$ respectively represent the true submergence in the test set and the submergence predicted by the composite model; $\bar{y}$ represents the mean of the true submergence in the test set. The closer $R^2$ is to 1, the higher the prediction accuracy of the model.
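Formula (17) translates directly into a few lines of Python. The function name is an assumption of this example, and the sketch assumes the true values are not all identical (otherwise the denominator is zero).

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination between true and predicted submergence."""
    n = len(y_true)
    mean = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum
    return 1.0 - ss_res / ss_tot
```

A model that always predicts the mean of the test set scores 0, and a perfect model scores 1, so values between the two indicate how much of the variation in submergence the model explains.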

The Widedeep model is built with the open-source deep learning library TensorFlow. The model parameters are adjusted according to the calculated $R^2$ and an overall consideration of model performance, until a model whose prediction accuracy meets the requirement is obtained.

According to the present invention, the specific process of the step 5 is:

Step 5.1, acquiring the known production parameters and optimization design parameters of the oil well to be optimized and inputting them into the excellent sample knowledge extractor (KNN model) constructed in step 3 to obtain the posterior feature distribution of the target well in the high-efficiency well sample space;

Step 5.2, inputting the production features of the target oil well (same types as the feature set X1 of step 3) and its posterior features (same types as the target set Y1 of step 3) into the Wide neural network part and the Deep neural network part, respectively, of the multi-source feature-submergence fitting model (the Widedeep hybrid neural network), and obtaining by model prediction a reasonable submergence under the guidance of high-efficiency well knowledge, for use in actual production optimization and adjustment.

The invention has the beneficial technical effects that:

Existing methods for designing a reasonable submergence, based on conventional statistical analysis and theoretical derivation, are ill-suited to actual production because the oil wells of a block are affected by numerous factors and are highly specific. To address this problem, the invention provides a method for determining a reasonable oil well submergence based on a clustering algorithm and a Widedeep network. The invention screens, through unsupervised cluster learning, the high-efficiency oil well samples in the block that have high system efficiency and similar knowledge distributions; constructs a knowledge extractor with a nearest neighbor algorithm to generate the posterior features of the target well, which subsequently guide the design of a reasonable submergence; and fits the nonlinear mapping between the multi-source data of the oil wells in the block and the submergence by constructing a Widedeep deep composite neural network framework. By generating the posterior features of the target well at a reasonable submergence with the excellent sample knowledge extractor and combining them with the strongly generalizing multi-source feature-submergence fitting model, the invention determines a reasonable submergence suited to the target well under the guidance of high-efficiency oil well knowledge, and has good value for popularization and application.

Drawings

FIG. 1 is a schematic flow chart of a history fitting method based on a deep autoregressive network and a continuous learning strategy;

FIG. 2 is a graph of the set of changes from the surface indicator to the downhole pump indicator after normalization;

FIG. 3-1 is a displacement-acquisition point plot;

FIG. 3-2 is a load slope-acquisition point plot;

FIG. 3-3 is a pump indicator diagram;

FIG. 4 is a flow chart of clustering multi-source data set A of oil wells in a block by using K-means according to the present invention;

FIG. 5 is a schematic diagram of a neural network main structure using a generalized logistic regression model as the module;

FIG. 6 is a schematic diagram of the main structure of a neural network using a multilayer feedforward neural network as the module;

FIG. 7 is a composite hidden layer as a logistic regression network;

FIG. 8 is a schematic diagram of clustering results after dimension reduction;

FIG. 9 is a KNN model result for each parameter;

FIG. 10 is a schematic diagram of the multi-model training results.

Detailed Description

The invention is described in further detail below with reference to the attached drawings and detailed description:

Example 1

A method for determining reasonable oil well sinking degree based on a clustering algorithm and a Widedeep network comprises the following steps:

Sources of well multi-source data within the block include:

step 1, acquiring multi-source data reflecting the production efficiency and the submergence rationality of an oil well according to an oil extraction process theory and on-site production experience, and carrying out data preprocessing and multi-source feature set construction to obtain an oil well multi-source data set in a block;

further, the specific process of the step 1 is as follows:

(1) to (7)

In formulas (1) to (7), the following quantities are defined: the pumping well system efficiency; the ground efficiency; the downhole efficiency; the motor input power; the hydraulic power; the polished rod power; the pump efficiency; the daily production of the oil well; the density of the pumped liquid; the effective lift, i.e. the working fluid level depth; the area enclosed by the load curve of the indicator diagram; the polished rod stroke; the stroke frequency; the force ratio of the dynamometer; the length ratio; the cross-sectional area of the oil pump plunger; the power consumption per hundred meters per ton of liquid; and the daily power consumption of a single well;

$$\frac{\partial^2 u(x,t)}{\partial t^2} = a^2 \frac{\partial^2 u(x,t)}{\partial x^2} - c \frac{\partial u(x,t)}{\partial t} \qquad (8)$$

In formula (8), $u(x,t)$ represents the displacement of the sucker rod string at section $x$ at time $t$; $a$ represents the propagation velocity of sound waves in the sucker rod string, m/s; $c$ represents the resistance coefficient;

$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (9)$$

In formula (9), $x^{*}$ represents the normalized data; $x$ represents the original sample data matrix; $x_{\max}$ represents the maximum value of the sample data; $x_{\min}$ represents the minimum value of the sample data;
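As an illustration of the min-max normalization described for formula (9), the following is a minimal sketch rather than the patent's implementation. The function name `min_max_normalize`, the sample load values and the handling of a constant series are assumptions made for the example.

```python
def min_max_normalize(values):
    """Scale a sequence of numbers into [0, 1] by min-max normalization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series is mapped to all zeros
        # to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical pump indicator diagram loads (kN), for illustration only.
loads = [12.0, 30.0, 48.0, 21.0]
normalized = min_max_normalize(loads)
print(normalized)
```

Applying the same function to the displacement series places both axes of the pump indicator diagram in the interval [0, 1].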

As shown in FIG. 2, which plots the conversion from the surface indicator diagram to the downhole pump indicator diagram after normalization, the downhole pump indicator diagram removes the distortion caused by deformation and vibration of the sucker rod string. Compared with the surface indicator diagram it is smoother and simpler, and it reflects the pump working condition at the current sinking degree more faithfully;

(10)

In formula (10), the following quantities are defined: the load change slope; the load value of the i-th acquisition point in the normalized pump indicator diagram; and the number of acquisition points;

According to the working principle of the downhole pump, the pump valve opening and closing points are extracted from the displacement-acquisition point curve and the load slope-acquisition point curve. On this basis, the physical and geometric features of the pump indicator diagram are extracted to represent the pump indicator diagram information at the current sinking degree of the oil well. The physical features comprise the pump valve opening and closing point positions, the effective pump strokes in the upstroke and downstroke, and the average loads; the geometric features are the slopes of the curve during loading and unloading. As shown in FIG. 3-1, FIG. 3-2 and FIG. 3-3, the extraction of the pump valve points and of the pump indicator diagram features proceeds as follows:

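Step 1-7: in the pump indicator diagram, record the load of each acquisition point between the fixed valve opening point S1 and the fixed valve closing point S2 and calculate the upstroke average load; record the load of each acquisition point between the traveling valve opening point T1 and the traveling valve closing point T2 and calculate the downstroke average load;

calculate the upstroke and downstroke effective strokes from the displacement differences between the corresponding valve opening and closing points (for the downstroke, between T1 and T2);

calculate the slope of the line between points S1 and T2 as the loading process slope of the pump indicator diagram, and the slope of the line between the fixed valve closing point S2 and the traveling valve opening point T1 as the unloading process slope of the pump indicator diagram;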

According to oil extraction process theory, when the sinking degree is too small, too much gas enters the pump or supply and discharge are uncoordinated, so the filling degree of the pump drops sharply. In the pump indicator diagram this appears mainly as follows:

During the downstroke, the gas in the pump chamber is compressed, resulting in early unloading: the slope of the line between S2 and T1 decreases and the traveling valve opens late, that is, the displacement of point T1 decreases. During the upstroke, the gas in the pump chamber expands and slows the loading: the slope of the line between S1 and T2 decreases and the fixed valve opens late, that is, the displacement of point S1 increases. At the same time, the upstroke and downstroke effective strokes of the pump indicator diagram both decrease, with the downstroke effective stroke decreasing relatively more.

If the sinking degree is then increased, the pump filling degree rises and the pump indicator diagram features change accordingly. The pump working condition represented by these features therefore carries information about how reasonable the sinking degree is, which is why the pump indicator diagram feature system is extracted in this step and later used to build the multi-source feature data set of the oil well.

The clustering algorithm is as follows:

The specific process of the step 2 is as follows:

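$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2 \qquad (11)$$

In formula (11), $\arg\min$ denotes the value of the variable at which the objective function reaches its minimum; $k$ is the number of clusters; $S_i$ is the $i$-th cluster; $\mu_i$ is the mean (center) of $S_i$; $x$ is the feature attribute vector of a sample;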

as shown in fig. 4, the specific process of clustering the multi-source data set a of the oil well in the block by using K-means is as follows:

the nearest neighbor algorithm is:

The method for establishing the mapping relation between the multi-source characteristics and the sinking degree of the Widedeep learning network fitting block oil well comprises the following steps:

According to the present invention, the specific process of the step 3 is:

Note that pump efficiency is not taken as a posterior feature term here, because it can be calculated from the design displacement of the target well and the rated displacement of the oil pump. The other efficiency terms involve power consumed for purposes other than lifting liquid and cannot be calculated from the known optimization parameters, so the system efficiency, the surface efficiency and the downhole efficiency are taken as posterior feature terms;

$$d(x, x_i) = \sqrt{\sum_{j=1}^{m} \left(x_j - x_{ij}\right)^2} \qquad (12)$$

In formula (12), $x_i$ represents the $i$-th sample in feature set X1; $x_{ij}$ represents the $j$-th feature of sample $i$; $m$ is the number of features; $d(x, x_i)$ represents the Euclidean distance between the target well sample $x$ and sample $x_i$;

The $k$ samples with the smallest distance to the target well are selected as the nearest neighbor sample set.

(13)

In formula (13), the following quantities are defined: the $i$-th sample in the nearest neighbor sample set, and the Euclidean distance between the $i$-th sample in the nearest neighbor sample set and the target oil well;

The construction of the KNN model is realized through an open source machine learning library Sklearn, and the excellent sample library and the knowledge extractor are simultaneously stored in the KNN model obtained by training, so that the data storage efficiency and the posterior feature generation efficiency are greatly improved;
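The knowledge extractor described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the patent's Sklearn model: the function name `knn_posterior`, the toy data and the use of inverse-distance weights (the convention of Sklearn's `weights="distance"` option) are assumptions.

```python
import math


def knn_posterior(target, samples, targets, k=5):
    """Distance-weighted KNN regression over a high-efficiency well library.

    target  : production features of the target well
    samples : feature rows (X1) of the sample library
    targets : posterior feature rows (Y1), aligned with `samples`
    """
    dists = [math.dist(target, s) for s in samples]
    nearest = sorted(range(len(samples)), key=lambda i: dists[i])[:k]
    # An exact match would get infinite weight, so return its row directly.
    for i in nearest:
        if dists[i] == 0.0:
            return list(targets[i])
    weights = [1.0 / dists[i] for i in nearest]  # assumed inverse-distance weights
    total = sum(weights)
    n_out = len(targets[0])
    return [
        sum(w * targets[i][j] for w, i in zip(weights, nearest)) / total
        for j in range(n_out)
    ]


# Hypothetical two-feature library with one posterior feature per sample.
library_x = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]]
library_y = [[0.30], [0.40], [0.10]]
posterior = knn_posterior([1.0, 0.0], library_x, library_y, k=2)
print(posterior)
```

In the toy call the two nearest samples are equally distant from the target, so the generated posterior feature is their plain average.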

Step 4: based on the multi-source data set of the oil wells in the block obtained in step 1, the oil well production feature items are used as the Wide layer input and the posterior feature items as the Deep layer input, and a Widedeep deep neural network is constructed to fit the mapping between the multi-source features of the oil wells in the target block and the sinking degree. The oil well production feature items comprise: daily liquid yield, pump efficiency, working fluid level, oil pressure, casing pressure, water content, pump diameter, stroke frequency and oil pipe diameter. The posterior feature items comprise: downhole efficiency, system efficiency, surface efficiency and the pump indicator diagram features;

the specific process of the step 4 is as follows:

step 4.1, constructing a Wide neural network part of a Wide deep learning network as an input module of oil well production characteristics, wherein the type of the received characteristics is the same as the characteristic set X1 in the step 3; the generalized logistic regression model is adopted as the main structure of the module neural network, as shown in fig. 5, and comprises an input layer, a hidden layer and an output layer, wherein the calculation process is as follows:

$$y = f(Wx + b) \qquad (14)$$

In formula (14), $W$ represents the weight matrix of the hidden layer; $x$ represents the feature matrix input to the neural network; $b$ represents the hidden layer bias; $f$ represents the activation function of the hidden layer; $y$ represents the output matrix of the neural network;

the Wide neural network part has a simpler structure and stronger 'reproduction capability', can capture the distribution characteristics of oil well production characteristics, and can rapidly reproduce and output corresponding target values aiming at the 'strong similar characteristics' when the input sample characteristics are similar to the training sample characteristics. For example, an oil well with large pump diameter, long stroke and high pump efficiency is usually large in sinking degree, and the Wide model strengthens the corresponding weight of the mapping relation in the training process, so that the learned output information representing the large sinking degree is reproduced when the approximate oil well parameters are input.

Step 4.2, constructing a Deep neural network part of the Widedeep Deep learning network, and taking the Deep neural network part as an input module of posterior characteristics, wherein the type of the receiving characteristics is the same as that of the target set Y1 in the step 3; the multi-layer feedforward neural network is adopted as the main structure of the module neural network, as shown in fig. 6, and comprises an input layer, a plurality of hidden layers and an output layer, wherein each hidden layer performs the following transformation on the output vector of the previous layer in the forward calculation feeding process of the input vector:

$$a^{(l+1)} = f^{(l)}\left(W^{(l)} a^{(l)} + b^{(l)}\right) \qquad (15)$$

In formula (15), $W^{(l)}$ represents the weight matrix of layer $l$; $a^{(l)}$ represents the input matrix of layer $l$; $b^{(l)}$ represents the bias of layer $l$; $f^{(l)}$ represents the activation function of layer $l$; $a^{(l+1)}$ represents the input matrix of layer $l+1$;

AdaGrad continuously adjusts the learning rate of each parameter of the objective function during the iterations. For any parameter in the neural network, AdaGrad uses the gradients accumulated for that parameter in earlier rounds to modify the global learning rate in the current round:

(18)

where the following quantities are defined: the learning rate; the gradient of the i-th parameter in round t; and the accumulated gradient of the i-th model parameter over the previous t rounds.
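The per-parameter learning rate adjustment can be illustrated with the textbook form of the AdaGrad update. The sketch below is not taken from the patent's formula (18), which is not legible in this text; the function name `adagrad_step`, the small stabilizing constant `eps` and the toy objective are assumptions.

```python
import math


def adagrad_step(params, grads, accum, lr=0.0025, eps=1e-8):
    """One AdaGrad update; `accum` is the running sum of squared gradients."""
    new_params, new_accum = [], []
    for p, g, a in zip(params, grads, accum):
        a = a + g * g  # accumulate the squared gradient of this parameter
        new_accum.append(a)
        # The global rate lr is divided by the root of the accumulated value.
        new_params.append(p - lr / (math.sqrt(a) + eps) * g)
    return new_params, new_accum


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3), starting at w = 0.
params, accum = adagrad_step([0.0], [-6.0], [0.0], lr=0.5)
print(params, accum)
```

Because the accumulated value only grows, the effective step size of a frequently updated parameter shrinks over the iterations.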

The Deep neural network part has a relatively complex structure and relatively strong generalization capability, can learn an implicit mapping relation between a feature space and a target space from the existing feature combination of a training sample, and discovers the distribution of the feature combination which never occurs in the target space. For example, when the pump indicator diagram feature combination which does not appear in the training data is faced, the Deep neural network part can conduct generalization prediction according to the learned mapping relation, and deduce the sinking degree information corresponding to the posterior feature combination.

(16)

In formula (16), the following quantities are defined: the final output of the model; the output of the Wide neural network part; and the output of the Deep neural network part;

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2} \qquad (17)$$

In formula (17), $i$ denotes the $i$-th sample; $n$ is the total number of samples in the test set; $y_i$ and $\hat{y}_i$ represent the true value and the predicted value of the sinking degree in the test set, respectively; $\bar{y}$ represents the average value of the true sinking degree of the test set. The closer the value of $R^2$ is to 1, the higher the model prediction accuracy;

The $R^2$ value is considered comprehensively, and the model parameters are adjusted according to the model performance until a model whose prediction accuracy meets the requirements is obtained;
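The evaluation index used here, whose value approaches 1 as prediction accuracy improves, is read in this sketch as the coefficient of determination $R^2$. The function name `r2_score` and the sinking degree values are hypothetical and serve only to show the calculation.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination between true and predicted values."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical true and predicted sinking degrees (m) for four test wells.
true_depths = [100.0, 200.0, 300.0, 400.0]
pred_depths = [110.0, 190.0, 310.0, 390.0]
score = r2_score(true_depths, pred_depths)
print(score)
```

Here the residual sum of squares is 400 and the total sum of squares is 50000, so the score is 1 - 0.008 = 0.992.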

finally, generating reasonable sinking degree under the combination of the design parameters and posterior characteristics of the target oil well with high precision and high efficiency, so as to scientifically and effectively guide on-site production optimization;

the method for generating reasonable sinking degree under the combination of the design parameters and the posterior characteristics of the target oil well comprises the following steps:

The specific process of the step 5 is as follows:

In order to verify the feasibility of the method, the proposed technical scheme is tested by taking the actual data of a certain oilfield block as an example. The method for determining reasonable sinking degree comprises the following specific steps:

Step 1: production data of 213 pumping wells in the block over 3 months were collected as the original data set. According to oil extraction process theory and field production experience, 14 data items were selected: the direct influence parameters affecting the production efficiency and energy consumption of the oil well, namely motor input power, daily liquid production and working fluid level; and the indirect influence parameters, namely oil pressure, casing pressure, water content, pump diameter, oil pipe diameter, pump depth, sinking degree, stroke, stroke frequency, maximum upstroke current and maximum downstroke current. Five evaluation indexes representing the production efficiency and energy consumption of the oil well were then calculated: pumping well system efficiency, downhole efficiency, ground efficiency, pump efficiency, and power consumption per hundred meters per ton of liquid;

A one-dimensional damped wave equation was solved by the Fourier series method from the measured polished rod load and displacement data of each oil well, giving a pump indicator diagram that represents the running condition of the downhole oil well pump. Feature extraction was then performed on the normalized pump indicator diagram, the displacement-acquisition point curve and the load slope-acquisition point curve to obtain the geometric and physical features representing the running condition of the pump: traveling valve opening displacement, traveling valve closing displacement, fixed valve opening displacement, fixed valve closing displacement, upstroke average load, downstroke average load, upstroke effective stroke, downstroke effective stroke, loading slope and unloading slope, 10 data items in total.

Data association and cleaning of abnormal and duplicate values were carried out on the 29 kinds of feature data in total. To keep a reasonable data volume while preserving data diversity, only the record with the highest system efficiency was kept for each well on each day, which finally gave the oil well multi-source feature set A containing 5785 samples.

Step 2: based on the multi-source feature set A obtained in step 1, unsupervised cluster analysis was performed on set A with the K-means algorithm following the procedure shown in FIG. 4, with the aim of dividing the oil well samples into high-efficiency, common and low-efficiency well samples. The number of cluster centers for K-means was set to 3 in this example, and the three well groups finally contained 1884, 2213 and 1688 samples, respectively.

The multi-feature oil well samples were projected into a two-dimensional space by the dimension reduction algorithm t-SNE for visualization. The clustering result after dimension reduction is shown in FIG. 8, where neither dimension 1 nor dimension 2 has physical meaning; they are only the coordinates of the samples after dimension reduction. FIG. 8 shows that the samples of well group 1 lie mainly in the upper part of the reduced space, those of well group 2 in the middle and those of well group 3 in the lower part, with clear gaps between the groups. This indicates that the three well groups are distributed distinctly in the 29-dimensional feature space.

The average system efficiency within each of the 3 oil well cluster observation sets was calculated. In this example it was 33.63% for well group 1, 24.89% for well group 2 and 16.61% for well group 3. In descending order of average system efficiency, the groups were designated the high-efficiency well observation set A1, the common well observation set A2 and the low-efficiency well observation set A3.
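The ranking of clusters by average system efficiency can be sketched as follows. The cluster labels are assumed to come from a K-means run that is not shown; the function name `rank_clusters_by_efficiency` and the efficiency values are hypothetical.

```python
from collections import defaultdict


def rank_clusters_by_efficiency(labels, system_efficiency):
    """Return cluster labels ordered by mean system efficiency, highest first."""
    sums, counts = defaultdict(float), defaultdict(int)
    for lab, eff in zip(labels, system_efficiency):
        sums[lab] += eff
        counts[lab] += 1
    means = {lab: sums[lab] / counts[lab] for lab in sums}
    order = sorted(means, key=means.get, reverse=True)
    return order, means


# Hypothetical cluster labels and system efficiencies (%) for six samples.
labels = [2, 0, 1, 2, 0, 1]
efficiency = [34.0, 25.0, 16.0, 33.0, 24.0, 17.0]
order, means = rank_clusters_by_efficiency(labels, efficiency)
print(order, means)
```

The first label in `order` identifies the high-efficiency well observation set A1, the second the common set A2 and the third the low-efficiency set A3.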

Step 3: the feature set X1 and the target set Y1 were extracted from the high-efficiency well observation set A1 to build the high-efficiency well sample knowledge extractor. X1 comprises 10 data items: daily liquid production, pump efficiency, working fluid level, oil pressure, casing pressure, water content, pump diameter, stroke, stroke frequency and oil pipe diameter. Y1 comprises 13 data items: downhole efficiency, system efficiency, ground efficiency and the pump indicator diagram features.

Following the nearest-neighbor strategy for generating posterior feature terms, a KNN knowledge extractor fitting the mapping between the feature set X1 and the target set Y1 was built with the KNeighborsRegressor tool of the open source machine learning library Sklearn. The hyperparameters in this example were set as follows: the Euclidean distance was used as the distance metric between samples, and the target value of an input sample was generated with a distance weighting strategy, that is, as a weighted average with the Euclidean distance as the weight term. The number of nearest neighbor samples was set to 5, 10, 25 and 50 in turn, and the performance evaluation index $R^2$ of each model was calculated:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2} \qquad (19)$$

In formula (19), $i$ denotes the $i$-th sample; $n$ is the total number of test set samples; $y_i$ and $\hat{y}_i$ represent the true value and the predicted value of the test set, respectively; $\bar{y}$ represents the mean of the test set.

The KNN model results for each parameter are shown in FIG. 9. $R^2$ is highest when the number of nearest neighbors equals 5, so this model is taken as the knowledge extractor finally used.

Step 4: to fit the mapping between the multi-source parameter combination of the block's oil wells and the sinking degree, two feature sets and a target set were first extracted from the oil well multi-source feature data set A. The first feature set comprises 10 data items: daily liquid yield, pump efficiency, working fluid level, oil pressure, casing pressure, water content, pump diameter, stroke, stroke frequency and oil pipe diameter. The second feature set comprises 13 data items: downhole efficiency, system efficiency, surface efficiency and the pump indicator diagram features. The target set is the sinking degree. The samples were randomly divided into a training set and a test set at a ratio of 4:1, giving 4628 training samples and 1157 test samples.
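The 4:1 random division can be checked with a short sketch. The function name `split_indices` and the fixed random seed are assumptions; only the sample count 5785 and the 4:1 ratio come from the text.

```python
import random


def split_indices(n_samples, train_parts=4, test_parts=1, seed=0):
    """Shuffle sample indices and split them by a train:test ratio."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # seeded for a reproducible split
    n_test = n_samples * test_parts // (train_parts + test_parts)
    return idx[n_test:], idx[:n_test]


train_idx, test_idx = split_indices(5785)
print(len(train_idx), len(test_idx))
```

With 5785 samples the split yields 4628 training and 1157 test indices, matching the counts stated in the example.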

A Widedeep model with the structure shown in FIG. 7 was constructed with the open source deep learning library TensorFlow. The Wide neural network part contains one hidden layer with 32 neural nodes and an activation function, and its output layer has 16 neural nodes. The Deep neural network part contains 4 hidden layers with 128, 64, 32 and 16 neural nodes respectively, each with an activation function and a random drop proportion (Dropout) of 0.1, and its output layer has 16 neural nodes. The composite hidden layer has 32 neural nodes and an activation function.

To obtain the best-performing Widedeep model, the model was trained under the groups of learning rates and iteration rounds shown in Table 1, and the sinking degree was predicted on the test sample set. Model performance was evaluated by $R^2$: the closer $R^2$ is to 1, the better the model performance and the higher the accuracy of predicting the sinking degree from the multi-source features. The multi-model training results and their comparison are shown in Table 1 and FIG. 10:

Table 1. Comparison of model $R^2$ under different combinations of learning rate and iteration rounds

According to this comparison, the parameter combination with the largest $R^2$ was selected, namely a learning rate of 0.0025 and 150 iteration rounds. The Widedeep model trained under this combination was used as the final multi-source feature-sinking degree prediction model for the subsequent reasonable sinking degree prediction of the target well.

Step 5: taking a well in the block as an example, its optimal design target parameters are shown in Table 2, where the daily liquid production and the corresponding working fluid level were calculated from the inflow performance curve of the oil well and multiphase pipe flow.

Table 2. Optimal design target parameters of the test well

These parameters were input into the knowledge extractor trained in step 3 to obtain the posterior feature distribution of the well in the high-efficiency oil well sample space, as shown in Table 3:

Table 3. Posterior feature terms generated for the test well

The pump indicator diagram data items in Table 3 are normalized dimensionless values. Judging from the effective strokes and the efficiencies, the posterior feature combination shows the production performance of a high-efficiency well, which indicates that information useful for determining a reasonable sinking degree was successfully extracted from the high-efficiency well sample library.

Finally, the oil well production features and the generated posterior features were input into the Wide neural network part and the Deep neural network part, respectively, of the multi-source feature-sinking degree fitting model (the Widedeep hybrid neural network) to obtain the reasonable sinking degree of the target oil well under the guidance of high-efficiency oil well sample knowledge. The pump efficiency composition at this sinking degree was then analyzed according to oil extraction process theory, giving a theoretical pump efficiency of 81.56%. This differs by only 1.56% from the design pump efficiency of the target well, which meets the yield error requirement of the oil well optimization design process, so the method can be used for actual oil well production optimization.

In daily production optimization of oil wells, conventional approaches such as setting the sinking degree by manual experience, reading it from a simple fitted curve of system efficiency against sinking degree, or deriving it from empirical formulas consider few factors and give low accuracy and poor results. To address these problems, the invention establishes a reasonable sinking degree prediction framework based on a clustering algorithm and a Widedeep deep neural network, with an embedded knowledge extractor based on a nearest neighbor regression prediction algorithm.

Claims

1. The method for determining the reasonable oil well sinking degree based on the clustering algorithm and the Widedeep network is characterized by comprising the following steps of:

and finally, generating reasonable sinking degree under the combination of the design parameters of the target oil well and the posterior characteristics.

2. The method for determining reasonable oil well sinking degree based on a clustering algorithm and a WideDeep network according to claim 1, wherein the sources of the oil well multi-source data in the block comprise:

step 1, acquiring multi-source data reflecting the production efficiency and the submergence rationality of an oil well, and carrying out data preprocessing and multi-source feature set construction to obtain an oil well multi-source data set in a block;

the clustering algorithm is as follows:

3. The method for determining reasonable oil well sinking degree based on a clustering algorithm and a WideDeep network according to claim 2, wherein the nearest neighbor algorithm is as follows:

step 3, constructing a knowledge extractor for the excellent sample library in the step 2 by adopting a KNN regression algorithm, and generating the posterior feature distribution of the specific production parameter combination of the target well in the feature space of the high-efficiency oil well cluster.

4. The method for determining reasonable oil well sinking degree based on a clustering algorithm and a Widedeep network according to claim 2, wherein the method for establishing the mapping relation between the multi-source characteristics of the Widedeep learning network fitting block oil well and the sinking degree comprises the following steps:

step 4, based on the multi-source data set of the oil wells in the block obtained in the step 1, using the oil well production feature items as the Wide layer input and the posterior feature items as the Deep layer input, and constructing a Widedeep deep neural network to fit the mapping relation between the multi-source features of the oil wells in the target block and the sinking degree.

5. The method for determining reasonable oil well sinking based on the clustering algorithm and the WideDeep network according to claim 2, wherein the method for generating the reasonable sinking under the combination of the target oil well design parameters and the posterior features comprises the following steps:

step 5, inputting the specific production parameter items of the target well and the posterior feature items extracted from the excellent sample library into the trained Widedeep model, and generating a reasonable sinking degree for the target well, which is finally used for on-site production optimization.

6. The method for determining reasonable oil well sinking degree based on the clustering algorithm and the WideDeep network according to claim 2, wherein the specific process of the step 1 is as follows:

the method comprises the steps of extracting acquisition parameters of an oil well, calculating evaluation indexes and extracting characteristics of a pump indicator diagram through the steps 1.1, 1.2 and 1.3 respectively, preprocessing data and constructing a multi-source data set of the oil well in a block through the step 1.4, wherein the specific process is as follows:

(1) to (7)

In formulas (1) to (7), the following quantities are defined: the pumping well system efficiency; the ground efficiency; the downhole efficiency; the motor input power; the hydraulic power; the polished rod power; the pump efficiency; the daily production of the oil well; the density of the pumped liquid; the effective lift, i.e. the working fluid level depth; the area enclosed by the load curve of the indicator diagram; the polished rod stroke; the stroke frequency; the force ratio of the dynamometer; the length ratio; the cross-sectional area of the oil pump plunger; the power consumption per hundred meters per ton of liquid; and the daily power consumption of a single well;

Step 1.3, according to actual measured polished rod load and displacement data of an oil well, solving a one-dimensional damping wave equation by taking a suspension point position and load as boundary conditions and adopting a Fourier series method to obtain a pump indicator diagram for representing the running condition of an underground oil well pump, wherein the one-dimensional damping wave equation is as follows:

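$$\frac{\partial^2 u(x,t)}{\partial t^2} = a^2 \frac{\partial^2 u(x,t)}{\partial x^2} - c \frac{\partial u(x,t)}{\partial t} \qquad (8)$$

In formula (8), $u(x,t)$ represents the displacement of the sucker rod string at section $x$ at time $t$; $a$ represents the propagation velocity of sound waves in the sucker rod string; $c$ represents the resistance coefficient;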

the method comprises the steps of carrying out standardization treatment on a pump indicator diagram by adopting a min-max standardization method, wherein the standardized displacement and the standardized load are both in a section [0,1], and the conversion function is as follows:

$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (9)$$

In formula (9), $x^{*}$ represents the normalized data; $x$ represents the original sample data matrix; $x_{\max}$ represents the maximum value of the sample data; $x_{\min}$ represents the minimum value of the sample data;

(10)

In formula (10), the following quantities are defined: the load change slope; the load value of the i-th acquisition point in the normalized pump indicator diagram; and the number of acquisition points;

extracting the pump valve switching points from the displacement versus acquisition point curve and the load slope versus acquisition point curve, and extracting physical features and geometric features of the pump indicator diagram to represent the pump indicator diagram information at the sinking degree of the oil well, wherein the physical features comprise the pump valve switching point positions, the effective pump stroke in the upstroke and the downstroke, and the average load; the geometric features are the slopes of the curve during loading and unloading:

calculating the average upstroke load;

recording the load of each acquisition point between the traveling valve opening point T1 and the traveling valve closing point T2, and calculating the average downstroke load;

calculating the difference between the displacements of the traveling valve opening point T1 and the traveling valve closing point T2, i.e. the downstroke effective stroke;

taking the slope of the loading segment as the pump indicator diagram loading process slope; calculating the slope of the line through the fixed valve closing point S2 and the traveling valve opening point T1 as the pump indicator diagram unloading process slope.
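The feature extraction above can be sketched as follows, assuming the valve switching points are already known as sample indices. The text names T1, T2 and S2; the fixed valve opening point S1, the upstroke averaging window S1 to S2, and the loading line through T2 and S1 are assumptions made for this illustration, as are the function names and the sample card.

```python
def mean(values):
    return sum(values) / len(values)


def line_slope(p, q):
    """Slope of the straight line through points p = (x, y) and q = (x, y)."""
    return (q[1] - p[1]) / (q[0] - p[0])


def card_features(disp, load, s1, s2, t1, t2):
    """Physical and geometric features of a normalized pump indicator diagram.

    s1/s2: fixed valve opening/closing indices (s1 is an assumption),
    t1/t2: traveling valve opening/closing indices.
    """
    point = lambda i: (disp[i], load[i])
    return {
        "avg_up_load": mean(load[s1:s2 + 1]),        # assumed window S1..S2
        "avg_down_load": mean(load[t1:t2 + 1]),      # window T1..T2 (from the text)
        "up_effective_stroke": abs(disp[s2] - disp[s1]),    # assumed
        "down_effective_stroke": abs(disp[t1] - disp[t2]),  # from the text
        "loading_slope": line_slope(point(t2), point(s1)),    # assumed end points
        "unloading_slope": line_slope(point(s2), point(t1)),  # from the text
    }


# Made-up normalized card, one cycle starting at the bottom of the stroke.
disp = [0.0, 0.1, 0.5, 1.0, 0.9, 0.5, 0.0]
load = [0.1, 0.9, 1.0, 0.9, 0.1, 0.0, 0.1]
features = card_features(disp, load, s1=1, s2=3, t1=4, t2=6)
```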

7. The method for determining reasonable oil well sinking degree based on the clustering algorithm and the WideDeep network according to claim 2, wherein the specific process of the step 2 is as follows:

step 2.1, performing unsupervised cluster analysis on the multi-source data set A of the oil wells in the block obtained in step 1 by adopting the K-Means algorithm:

$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^{2} \tag{11}$$

in formula (11), $\arg\min$ denotes the value of the variable at which $\sum_{i=1}^{k}\sum_{x \in S_i}\lVert x - \mu_i \rVert^{2}$ reaches its minimum; $k$ is the number of clusters; $S_i$ is the $i$-th cluster; $\mu_i$ is the mean (center) of $S_i$; $x$ is the feature attribute vector of a sample;

step 2.2, calculating the average system efficiency of each of the 3 oil well cluster observation sets, and, in descending order of average system efficiency, designating them as the high-efficiency well observation set A1, the normal well observation set A2 and the low-efficiency well observation set A3, wherein the high-efficiency well observation set A1 is used as the excellent sample library.
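Steps 2.1 and 2.2 can be illustrated with a small pure-Python K-Means followed by ranking the clusters by mean system efficiency. This is a sketch under assumptions: the farthest-point initialization is chosen only to keep the example deterministic, and the two-feature samples and efficiency values are invented.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(samples, k):
    """Deterministic initialization (an assumption, not from the source)."""
    centers = [list(samples[0])]
    while len(centers) < k:
        nxt = max(samples, key=lambda s: min(sq_dist(s, c) for c in centers))
        centers.append(list(nxt))
    return centers


def k_means(samples, k, max_iter=100):
    centers = farthest_point_init(samples, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(s, centers[c]))
                      for s in samples]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [s for s, l in zip(samples, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def rank_clusters(labels, efficiency):
    """Return lists of sample indices, ordered from highest to lowest
    mean system efficiency (A1, A2, A3 when there are three clusters)."""
    avg = {}
    for c in set(labels):
        vals = [e for e, l in zip(efficiency, labels) if l == c]
        avg[c] = sum(vals) / len(vals)
    order = sorted(avg, key=avg.get, reverse=True)
    return [[i for i, l in enumerate(labels) if l == c] for c in order]


# Invented, already-normalized well features and system efficiencies (%).
samples = [[0.9, 0.8], [1.0, 0.9], [0.95, 0.85],
           [0.5, 0.5], [0.55, 0.45], [0.45, 0.5],
           [0.1, 0.1], [0.05, 0.2], [0.15, 0.15]]
efficiency = [32, 35, 33, 22, 24, 21, 10, 12, 11]

labels, centers = k_means(samples, k=3)
a1, a2, a3 = rank_clusters(labels, efficiency)
```

Here `a1` holds the indices of the wells that would form the excellent sample library.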

8. The method for determining reasonable oil well sinking degree based on the clustering algorithm and the WideDeep network according to claim 7, wherein the specific process of the step 3 is as follows:

On the basis of the high-efficiency well observation set A1 obtained in step 2.2, a nearest neighbor regression strategy is adopted to construct an excellent sample knowledge extractor, which generates the posterior feature distribution of a target oil well with known production parameters in the high-efficiency well cluster feature space:

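traversing all samples in the feature set X1, and calculating the Euclidean distance $d(x_t, x_i)$ between the target oil well sample and each sample:

$$d(x_t, x_i) = \sqrt{\sum_{j=1}^{m} \left(x_{t,j} - x_{i,j}\right)^{2}} \tag{12}$$

wherein $x_i$ represents the $i$-th sample in the feature set X1; $x_{i,j}$ represents the $j$-th feature of sample $x_i$; $m$ is the number of features; $d(x_t, x_i)$ represents the Euclidean distance between the target oil well sample $x_t$ and the $i$-th sample;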

determining the $K$ samples with the smallest distance to the target oil well as the nearest neighbor sample set, and then performing a weighted average calculation on the target set Y1 of the nearest neighbor samples according to the Euclidean distance:

(13)

in formula (13), the symbols represent, in order: the $i$-th sample in the nearest neighbor sample set; the Euclidean distance between the $i$-th sample in the nearest neighbor sample set and the target oil well; and the feature vector generated by the weighted calculation over the nearest neighbor sample set of the target oil well, which is the posterior feature distribution of the target oil well in Y1, namely the feature knowledge learned from the excellent sample library for representing a reasonable sinking degree, comprising system efficiency, underground efficiency, ground efficiency and pump indicator diagram features;
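The nearest neighbor extraction can be sketched as below. The text states only that the average is weighted according to the Euclidean distance; inverse-distance weights are an assumption made for this illustration, and the function names, `eps`, and the sample data are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors (formula (12))."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def posterior_features(target_x, X1, Y1, k=3, eps=1e-12):
    """Distance-weighted average of the targets of the k nearest samples.

    ASSUMPTION: weights are 1 / (distance + eps); the source does not
    show the exact weighting in this text.
    """
    dists = [euclidean(target_x, x) for x in X1]
    nearest = sorted(range(len(X1)), key=dists.__getitem__)[:k]
    weights = [1.0 / (dists[i] + eps) for i in nearest]
    total = sum(weights)
    n_out = len(Y1[0])
    return [sum(w * Y1[i][j] for w, i in zip(weights, nearest)) / total
            for j in range(n_out)]


# Invented high-efficiency well samples: X1 = known production parameters,
# Y1 = features to be inferred (e.g. system efficiency, a card feature).
X1 = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [10.0, 10.0]]
Y1 = [[30.0, 0.5], [40.0, 0.6], [50.0, 0.7], [90.0, 0.9]]

posterior = posterior_features([1.0, 1.0], X1, Y1, k=3)
```

With inverse-distance weights, a target that coincides with a library sample essentially inherits that sample's features, while distant neighbors contribute little.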

the specific process of clustering the oil well multisource data set A in the block by adopting the K-means is as follows:

9. The method for determining reasonable oil well sinking degree based on the clustering algorithm and the WideDeep network according to claim 4, wherein the specific process of the step 4 is as follows:

$$Y = f\left(WX + b\right) \tag{14}$$

in formula (14), $W$ represents the weight matrix of the hidden layer; $X$ represents the feature matrix of the neural network input; $b$ represents the hidden layer bias; $f$ represents the activation function of the hidden layer; $Y$ represents the output matrix of the neural network;

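$$a^{(l+1)} = f^{(l)}\left(W^{(l)} a^{(l)} + b^{(l)}\right) \tag{15}$$

in formula (15), $W^{(l)}$ represents the weight matrix of the $l$-th layer; $a^{(l)}$ represents the input matrix of the $l$-th layer; $b^{(l)}$ represents the bias of the $l$-th layer; $f^{(l)}$ represents the activation function of the $l$-th layer; $a^{(l+1)}$ represents the input matrix of the $(l+1)$-th layer;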

（16）

in formula (16), the symbols represent, in order: the final output of the model; the output of the Wide neural network part; and the output of the Deep neural network part;
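A forward pass through a Wide and Deep model of the kind described by formulas (14) to (16) can be sketched in plain Python. Several choices here are assumptions for illustration: the Wide part is taken as a single linear layer, the Deep part uses ReLU activations, and the two outputs are combined by simple addition because the body of formula (16) is not reproduced in this text. All parameter names and values are hypothetical.

```python
def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def relu(z):
    return [max(0.0, v) for v in z]


def dense(W, b, x, activation=None):
    """One layer: activation(W x + b), as in formulas (14) and (15)."""
    z = [s + bi for s, bi in zip(matvec(W, x), b)]
    return activation(z) if activation else z


def wide_deep_forward(x_wide, x_deep, params):
    # Wide part: assumed to be a single linear layer on the production features.
    y_wide = dense(params["W_wide"], params["b_wide"], x_wide)[0]
    # Deep part: stacked hidden layers on the posterior features.
    a = x_deep
    for W, b in params["hidden"]:
        a = dense(W, b, a, relu)
    y_deep = dense(params["W_out"], params["b_out"], a)[0]
    # ASSUMED combination: plain sum of the two partial outputs.
    return y_wide + y_deep


# Hand-picked toy parameters so the arithmetic is easy to follow.
params = {
    "W_wide": [[1.0, 2.0]], "b_wide": [0.5],
    "hidden": [([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0])],
    "W_out": [[2.0, 1.0]], "b_out": [0.0],
}
prediction = wide_deep_forward([1.0, 1.0], [2.0, 3.0], params)
```

In this toy case the Wide part contributes 3.5 and the Deep part 2.5, so `prediction` is 6.0.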

step 4.4, extracting a total of 23 kinds of feature data from the multi-source data set A of the oil wells in the block obtained in step 1, namely daily liquid production, working fluid level, pump efficiency, oil pressure, casing pressure, water content, pump diameter, stroke frequency, oil pipe diameter, underground efficiency, system efficiency, ground efficiency and pump indicator diagram features, and dividing the feature data into a training set and a test set in proportion;

（17）

in formula (17), the symbols represent, in order: the index of the $i$-th sample; the total number of samples in the test set; and the true and predicted sinking degree values of the test set.
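The data split of step 4.4 and an evaluation on the test set can be sketched as follows. The split ratio and the metrics are assumptions: the text says only "in proportion", and the body of formula (17) is not reproduced here, so RMSE and MAE are shown as two common candidates rather than as the metric actually used.

```python
import math
import random


def split(samples, train_ratio=0.8, seed=0):
    """Shuffle and split samples; the 0.8 ratio is an assumed default."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_ratio)
    return [samples[i] for i in idx[:cut]], [samples[i] for i in idx[cut:]]


def rmse(y_true, y_pred):
    """Root mean squared error over the test set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error over the test set."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


train, test = split(list(range(10)))
err_rmse = rmse([0.0, 0.0], [3.0, 4.0])
err_mae = mae([0.0, 0.0], [3.0, 4.0])
```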

10. The method for determining reasonable oil well sinking degree based on the clustering algorithm and the WideDeep network according to claim 5, wherein the specific process of step 5 is as follows:

step 5.1, acquiring the known production parameters and optimized design parameters of the oil well to be optimized, and inputting them into the excellent sample knowledge extractor trained in step 3 to obtain the posterior feature distribution of the oil well in the high-efficiency well sample space;

step 5.2, inputting the production features of the target oil well and the posterior features of the oil well into the Wide neural network part and the Deep neural network part respectively, and obtaining the reasonable sinking degree under the guidance of high-efficiency well knowledge through model prediction.