CN109508738A - Information processing method and related device - Google Patents


Publication number
CN109508738A
Authority
CN
China
Prior art keywords
indicator card
sample
mark
target
sample set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811293443.9A
Other languages
Chinese (zh)
Inventor
夏楠
夏一楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd
Priority to CN201811293443.9A
Publication of CN109508738A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

Embodiments of the invention provide an information processing method and related device that can improve the efficiency of fault labeling for indicator cards of rod-pumped wells and reduce labor costs. The method comprises: obtaining a target indicator card, the target indicator card being an indicator card of a target rod-pumped well that is to be labeled; determining a target feature sequence of the target indicator card, the target feature sequence including at least attribute features of the target indicator card; and inputting the target feature sequence into a preset identification model to perform fault labeling on the target indicator card. The preset identification model is obtained by training a supervised fault identification model on a training sample set. The training sample set includes feature sequences of sample indicator cards; the sample indicator cards are obtained by processing, with a query function, the indicator cards of each rod-pumped well in a database, and they have been labeled by a target object. The feature sequence of each sample indicator card includes at least the attribute features of that sample indicator card.

Description

Information processing method and related device
Technical field
The present invention relates to the field of information processing, and in particular to an information processing method and related device.
Background
The pumping unit system is one of the most common mechanical systems in an oilfield. It is an important part of oil and gas field development and production, and one of the main objects of operation and maintenance in oilfield management. A pumping unit system consists mainly of two parts, the sucker rod and the oil well pump. The pump in turn has four main components: the pump barrel, the plunger, the standing valve and the traveling valve. In operation, a surface motor drives the sucker rod, which makes the pump reciprocate up and down, so that the pumping unit system continuously lifts crude oil from the formation to the surface through the tubing.
Under normal production conditions, a pumping unit generally runs around the clock without interruption to maximize economic return. As it runs, gradual or sudden events accumulate and may lead to a fault, which in serious cases stops production. Common fault modes include valve leakage (of the standing valve or the traveling valve), a parted sucker rod, wax deposition and insufficient liquid supply. Because the rod, tubing and pump are deep underground, it is hard to determine by direct manual observation whether a fault has occurred, what caused it and how severe it is. The mainstream diagnostic methods for rod-pumped wells are therefore all based on the indicator card.
Over decades of oilfield production, operation and maintenance, field operators have accumulated a great deal of experience in identifying and repairing pumping unit faults, and have summarized the approximate patterns of change in the indicator card when common faults occur. Manual fault analysis of indicator cards is therefore in widespread use at present.
However, the number of pumping units running in an oilfield usually reaches thousands or even tens of thousands of wells, and the volume of data generated every second is huge, so relying entirely on manual judgment is costly. Automating fault detection with a supervised machine learning model likewise requires the full data set to be labeled manually in advance, so that approach also does little to reduce labor costs.
Summary of the invention
Embodiments of the invention provide an information processing method and related device that can improve the efficiency of fault labeling for indicator cards of rod-pumped wells and reduce labor costs.
A first aspect of the embodiments of the present invention provides an information processing method, which specifically includes:
Obtaining a target indicator card, the target indicator card being an indicator card of a target rod-pumped well that is to be labeled;
Determining a target feature sequence of the target indicator card, the target feature sequence including at least attribute features of the target indicator card;
Inputting the target feature sequence into a preset identification model to perform fault labeling on the target indicator card, wherein the preset identification model is obtained by training a supervised fault identification model on a training sample set, the training sample set includes feature sequences of sample indicator cards, the sample indicator cards are obtained by processing, with a query function, the indicator cards of each rod-pumped well in a database, the sample indicator cards have been labeled by a target object, and the feature sequence of each sample indicator card includes at least the attribute features of that sample indicator card.
Optionally, before the target feature sequence is input into the preset identification model to label the target indicator card, the method further includes:
Obtaining the indicator cards of each rod-pumped well in the database;
Determining the feature sequences of the indicator cards of each rod-pumped well to obtain an unlabeled sample set, the unlabeled sample set being the set of attribute feature sequences of the indicator cards of each rod-pumped well;
Step A: determining a training sample subset from the unlabeled sample set according to the query function, the training sample subset being the set of samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function;
Step B: performing fault labeling on the training sample subset to obtain a labeled sample set;
Step C: adjusting the distance threshold of the query function according to the labeled sample set;
Repeating step A, step B and step C until a preset iteration termination condition is met, and determining the labeled sample set at the time the iteration terminates as the training sample set;
Inputting the training sample set into a supervised fault identification model for training to obtain the preset identification model.
Optionally, determining the training sample subset from the unlabeled sample set according to the query function includes:
Generating, by means of the query function, a decision tree corresponding to the unlabeled sample set, the decision tree including a root node and leaf nodes, wherein the root node is associated with the leaf nodes and the leaf nodes are associated with the unlabeled samples in the unlabeled sample set;
Obtaining the target distance from each leaf node in the decision tree to the root node;
Determining the samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function as the training sample subset.
Optionally, performing fault labeling on the training sample subset to obtain the labeled sample set includes:
Determining the generation time corresponding to each sample in the training sample subset;
Obtaining the indicator card corresponding to each sample based on the generation time of that sample;
Sending the indicator card corresponding to each sample to the target object, so that the target object performs fault labeling on the indicator card corresponding to each sample, to obtain the labeled sample set.
Optionally, the method further includes:
Judging whether the number of iterations has reached a preset value and, if so, determining that the preset iteration termination condition is met;
Or,
Judging whether the distance threshold of the query function has converged and, if so, determining that the preset iteration termination condition is met.
A second aspect of the embodiments of the present invention provides an information processing device, comprising:
An acquiring unit, configured to obtain a target indicator card, the target indicator card being an indicator card of a target rod-pumped well that is to be labeled;
A determination unit, configured to determine a target feature sequence of the target indicator card, the target feature sequence including at least attribute features of the target indicator card;
A processing unit, configured to input the target feature sequence into a preset identification model to perform fault labeling on the target indicator card, wherein the preset identification model is obtained by training a supervised fault identification model on a training sample set, the training sample set includes feature sequences of sample indicator cards, the sample indicator cards are obtained by processing, with a query function, the indicator cards of each rod-pumped well in a database, the sample indicator cards have been labeled by a target object, and the feature sequence of each sample indicator card includes at least the attribute features of that sample indicator card.
Optionally, the device further includes a training unit, the training unit being configured to:
Obtain the indicator cards of each rod-pumped well in the database;
Determine the feature sequences of the indicator cards of each rod-pumped well to obtain an unlabeled sample set, the unlabeled sample set being the set of attribute feature sequences of the indicator cards of each rod-pumped well;
Step A: determine a training sample subset from the unlabeled sample set according to the query function, the training sample subset being the set of samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function;
Step B: perform fault labeling on the training sample subset to obtain a labeled sample set;
Step C: adjust the distance threshold of the query function according to the labeled sample set;
Repeat step A, step B and step C until a preset iteration termination condition is met, and determine the labeled sample set at the time the iteration terminates as the training sample set;
Input the training sample set into a supervised fault identification model for training to obtain the preset identification model.
Optionally, the training unit determining the training sample subset from the unlabeled sample set according to the query function includes:
Generating, by means of the query function, a decision tree corresponding to the unlabeled sample set, the decision tree including a root node and leaf nodes, wherein the root node is associated with the leaf nodes and the leaf nodes are associated with the unlabeled samples in the unlabeled sample set;
Obtaining the target distance from each leaf node in the decision tree to the root node;
Determining the samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function as the training sample subset.
Optionally, the training unit performing fault labeling on the training sample subset to obtain the labeled sample set includes:
Determining the generation time corresponding to each sample in the training sample subset;
Obtaining the indicator card corresponding to each sample based on the generation time of that sample;
Sending the indicator card corresponding to each sample to the target object, so that the target object performs fault labeling on the indicator card corresponding to each sample, to obtain the labeled sample set.
Optionally, the training unit is further configured to:
Judge whether the number of iterations has reached a preset value and, if so, determine that the preset iteration termination condition is met;
Or,
Judge whether the distance threshold of the query function has converged and, if so, determine that the preset iteration termination condition is met.
A third aspect of the embodiments of the present invention provides a processor configured to run a computer program, wherein the steps of the information processing method described in the above aspects are executed when the computer program runs.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, performs the steps of the information processing method described in the above aspects.
As can be seen from the above, in the embodiments provided by the invention the faults of the target indicator card are labeled by a preset identification model. The preset identification model is obtained by training on a training sample set, and the training sample set is the set of feature sequences of sample indicator cards that were obtained by processing the indicator cards of each rod-pumped well in the database with a query function and were then labeled by the target object. The target indicator card can therefore be fault-labeled quickly and accurately. Compared with manual labeling in the prior art, because the preset identification model has been trained in advance, it is only necessary to input the indicator card of the rod-pumped well to be fault-labeled into the model in order to label its faults, which improves the labeling efficiency for rod-pumped wells and reduces labor costs.
Brief description of the drawings
Fig. 1 is a schematic diagram of an embodiment of the information processing method provided by an embodiment of the present invention;
Fig. 2 is a schematic flow diagram of the training of the preset identification model provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of an embodiment of the information processing device provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of the hardware structure of a server provided by an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the invention provide an information processing method and related device that can improve the efficiency of fault labeling for rod-pumped well indicator cards and reduce labor costs.
The terms "first", "second", "third", "fourth" and the like (if present) in the description, the claims and the above drawings are used to distinguish similar objects and do not describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. Furthermore, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to such a process, method, product or device.
The information processing method of the invention is described below from the perspective of the information processing device. The information processing device may be a server or a service unit within a server, which is not specifically limited.
Referring to Fig. 1, which is a schematic diagram of an embodiment of the information processing method provided by an embodiment of the present invention, the method includes:
101. Obtain a target indicator card.
In this embodiment, the information processing device can obtain a target indicator card, the target indicator card being the indicator card of the target rod-pumped well that is to be fault-labeled. The way in which the target indicator card is obtained is not specifically limited here. For example, the device may directly receive a target indicator card input by a user, or it may receive the identification number of the target rod-pumped well and the time corresponding to the target indicator card as input by the user, and then retrieve the corresponding indicator card of the target rod-pumped well from the database itself.
102. Determine the target feature sequence of the target indicator card.
In this embodiment, the information processing device can determine the target feature sequence corresponding to the target indicator card. The target feature sequence includes at least the attribute features of the target indicator card, such as its maximum displacement, maximum-displacement load, minimum displacement, minimum-displacement load, maximum load, minimum load, upper effective stroke, lower effective stroke, average load of the upper effective stroke, average load of the lower effective stroke, oscillation coefficient of the upper effective stroke, oscillation coefficient of the lower effective stroke, and the area of the target indicator card.
It should be noted that how the target feature sequence of the target indicator card is determined is not specifically limited here. For example, the feature sequence can be calculated from the displacement sequence and the load sequence of the target indicator card, and it may include attribute features and/or temporal features. The attribute features are the features calculated from the displacement sequence and load sequence of the target indicator card: maximum displacement, maximum-displacement load, minimum displacement, minimum-displacement load, maximum load, minimum load, upper effective stroke, lower effective stroke, average load of the upper effective stroke, average load of the lower effective stroke, oscillation coefficient of the upper effective stroke, oscillation coefficient of the lower effective stroke, and the area of the target indicator card. The temporal features describe how the attribute features of the target indicator card change over time, for example the ratio of the maximum displacement. Afterwards, the indicator cards carrying a special-state label, and the feature sequences corresponding to them, are deleted.
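As an illustration of computing attribute features from a displacement sequence and a load sequence, the following is a minimal sketch and not the patented procedure. It assumes the card is sampled as a closed loop over one pump cycle, reads "maximum-displacement load" as the load at the point of maximum displacement, and computes the card area with the shoelace formula. The function name and dictionary keys are hypothetical; stroke-based features and oscillation coefficients are left out because the text does not define them.

```python
def extract_attribute_features(displacement, load):
    """Compute a few attribute features from one indicator card.

    displacement, load: equal-length sequences sampled around one pump
    cycle, forming a closed loop in the displacement-load plane.
    """
    if len(displacement) != len(load) or len(displacement) < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")

    n = len(displacement)
    i_max = max(range(n), key=displacement.__getitem__)  # first index of max displacement
    i_min = min(range(n), key=displacement.__getitem__)  # first index of min displacement

    # Shoelace formula: area enclosed by the closed displacement-load loop.
    twice_area = sum(
        displacement[i] * load[(i + 1) % n] - displacement[(i + 1) % n] * load[i]
        for i in range(n)
    )

    return {
        "max_displacement": displacement[i_max],
        "load_at_max_displacement": load[i_max],
        "min_displacement": displacement[i_min],
        "load_at_min_displacement": load[i_min],
        "max_load": max(load),
        "min_load": min(load),
        "area": abs(twice_area) / 2.0,
    }


# Idealized rectangular card: displacement 0..3, load 10..50.
card = extract_attribute_features([0.0, 3.0, 3.0, 0.0], [50.0, 50.0, 10.0, 10.0])
```

For the idealized rectangle the enclosed area is the stroke length times the load range, which gives a quick sanity check on the shoelace computation.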
103. Input the target feature sequence into the preset identification model to perform fault labeling on the target indicator card.
In this embodiment, the information processing device can train a preset identification model in advance. The preset identification model is obtained by training a supervised fault identification model on a training sample set. The training sample set includes the feature sequences of sample indicator cards; the sample indicator cards are obtained by processing, with a query function, the indicator cards of each rod-pumped well in the database, and they have been labeled by a target object. The feature sequence of each sample indicator card includes at least the attribute features of that sample indicator card.
It should be noted that, to obtain the preset identification model, the indicator cards of each rod-pumped well in the database are first processed with the query function (see Fig. 2 for the specific process). The target object then performs fault labeling on the indicator cards obtained from that processing, which yields the sample indicator cards. The attribute feature sequences of the sample indicator cards are extracted to obtain the training sample set, and the supervised fault identification model is then trained on the training sample set.
As can be seen from the above, in the embodiments provided by the invention the faults of the target indicator card are labeled by a preset identification model. The preset identification model is obtained by training on a training sample set, and the training sample set is the set of feature sequences of sample indicator cards that were obtained by processing the indicator cards of each rod-pumped well in the database with a query function and were then labeled by the target object. The target indicator card can therefore be fault-labeled quickly and accurately. Compared with manual labeling in the prior art, because the preset identification model has been trained in advance, it is only necessary to input the indicator card of the rod-pumped well to be fault-labeled into the model in order to label its faults, which improves the labeling efficiency for rod-pumped wells and reduces labor costs.
The training of the preset identification model in the embodiments of the present invention is described below with reference to Fig. 2.
Referring to Fig. 2, which is a schematic flow diagram of the training of the preset identification model provided by an embodiment of the present invention, the training includes:
201. Obtain the indicator cards of each rod-pumped well in the database.
In this embodiment, a rod-pumped well generates an indicator card at regular intervals while it runs, and each indicator card is labeled with the time at which it was generated. The pumping unit system saves each indicator card to the database in association with its rod-pumped well. The information processing device can extract the indicator cards of each rod-pumped well from the database. It can be understood that each rod-pumped well has at least one corresponding indicator card.
202. Determine the feature sequences of the indicator cards of each rod-pumped well to obtain an unlabeled sample set.
In this embodiment, the information processing device can determine the feature sequences of the indicator cards of each rod-pumped well to obtain an unlabeled sample set. The unlabeled sample set is the set of feature sequences of the indicator cards of each rod-pumped well; that is, each sample in the unlabeled sample set corresponds to the attribute feature sequence of one indicator card. The attribute feature sequence of an indicator card includes at least its maximum displacement, maximum-displacement load, minimum displacement, minimum-displacement load, maximum load, minimum load, upper effective stroke, lower effective stroke, average load of the upper effective stroke, average load of the lower effective stroke, oscillation coefficient of the upper effective stroke, oscillation coefficient of the lower effective stroke, and the indicator card area.
It should be noted that the indicator cards of the rod-pumped wells may include some that carry a special-state label. During operation of a rod-pumped well, indicator cards generated during a shutdown or other operations are given a special-state label. For example, within the special-state label categories, if the constant-displacement field is marked 1, the indicator card is invalid: because it was measured while the well was shut in, it may have no training value. Such an indicator card is deleted directly. Then, using the extracted feature values, each feature sequence in the unlabeled sample set is traversed and all feature sequences carrying a special-state label are deleted from the unlabeled sample set.
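The filtering described above can be sketched as follows. This is only an illustration under assumptions: each sample is a dictionary, and the special-state label is stored under a hypothetical key `constant_displacement_flag` whose value 1 marks an invalid card. Neither the key name nor the record layout comes from the source.

```python
def drop_special_state_samples(samples, flag_key="constant_displacement_flag"):
    """Return only the samples whose special-state flag is not set to 1.

    A missing flag is treated as "no special state" (an assumption).
    """
    return [s for s in samples if s.get(flag_key, 0) != 1]


unlabeled = [
    {"well_id": "W1", "constant_displacement_flag": 0, "features": [3.0, 50.0]},
    {"well_id": "W2", "constant_displacement_flag": 1, "features": [0.0, 0.0]},
    {"well_id": "W3", "features": [2.8, 47.5]},
]
kept = drop_special_state_samples(unlabeled)
```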
It should be noted that how the feature sequences of the indicator cards of each rod-pumped well are determined to obtain the unlabeled sample set is not specifically limited here. For example, they can be calculated from the displacement sequence and load sequence of the indicator cards of each rod-pumped well.
It should be noted that, because the pumping unit system generates an indicator card for the rod-pumped well at regular intervals during operation and each generated indicator card carries a label for its generation time, the feature sequence of an indicator card may also include the generation time of that indicator card.
203. Determine a training sample subset from the unlabeled sample set according to the query function.
In this embodiment, the information processing device can first use the query function to generate a decision tree corresponding to the unlabeled sample set. The decision tree includes a root node and leaf nodes; the root node is associated with the leaf nodes, and the leaf nodes are associated with the unlabeled samples in the unlabeled sample set. The device then obtains the target distance from each leaf node in the decision tree to the root node. Finally, the samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function are determined as the training sample subset. In other words, the training sample subset is the set of samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function.
It can be understood that the query function is the stage that ranks the samples in the unlabeled sample set by importance. In the present problem of fault identification from pumping-unit indicator cards, importance is defined as the likelihood that an indicator card is a fault indicator card. The query function is described here taking a query model as an example. For instance, an IsolationForest query model can be used to rank the samples in the unlabeled sample set by importance. The IsolationForest query model represents all samples with a decision tree. At each node of the tree, a feature is selected and a threshold is chosen at random to divide the samples into two classes. When one of the resulting classes contains very few samples, it is not divided further (it becomes a leaf node), while the other class continues to be divided until all samples have been assigned to leaf nodes. Because abnormal samples have certain abnormal features, they tend to be assigned to a leaf node quickly, so their distance to the root node where the division began is short, whereas normal samples tend to have a longer distance. This distance can serve as the evaluation index of sample importance: a distance threshold is set, and all samples in the unlabeled sample set whose distance is less than the threshold are determined as the training sample subset.
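The leaf-to-root distance idea can be illustrated with a small, self-contained isolation-tree sketch. It is written from scratch for illustration and is not the IsolationForest implementation of any library nor the applicant's code. Averaging the distance over several random trees, the depth cap, and all function names are assumptions; the text itself speaks of a single decision tree.

```python
import random


def build_isolation_tree(samples, rng, depth=0, max_depth=12):
    """Recursively split samples on a random feature at a random threshold."""
    if len(samples) <= 1 or depth >= max_depth:
        return {"leaf": True, "depth": depth}
    feature = rng.randrange(len(samples[0]))
    lo = min(s[feature] for s in samples)
    hi = max(s[feature] for s in samples)
    if lo == hi:  # all values equal on this feature: stop splitting here
        return {"leaf": True, "depth": depth}
    split = rng.uniform(lo, hi)
    left = [s for s in samples if s[feature] < split]
    right = [s for s in samples if s[feature] >= split]
    return {
        "leaf": False,
        "feature": feature,
        "split": split,
        "left": build_isolation_tree(left, rng, depth + 1, max_depth),
        "right": build_isolation_tree(right, rng, depth + 1, max_depth),
    }


def path_length(tree, sample):
    """Distance (number of splits) from the root to the sample's leaf."""
    node = tree
    while not node["leaf"]:
        go_left = sample[node["feature"]] < node["split"]
        node = node["left"] if go_left else node["right"]
    return node["depth"]


def isolation_distances(samples, n_trees=100, seed=0):
    """Average root-to-leaf distance of every sample over several random trees."""
    rng = random.Random(seed)
    forest = [build_isolation_tree(samples, rng) for _ in range(n_trees)]
    return [sum(path_length(t, s) for t in forest) / n_trees for s in samples]


def select_training_subset(samples, distance_threshold, **kwargs):
    """Indices of samples whose distance is below the distance threshold."""
    distances = isolation_distances(samples, **kwargs)
    return [i for i, d in enumerate(distances) if d < distance_threshold]


# Twenty tightly clustered "normal" feature vectors plus one obvious outlier.
samples = [(1.0 + 0.01 * i, 2.0 + 0.01 * ((i * 7) % 20)) for i in range(20)]
samples.append((10.0, 20.0))
distances = isolation_distances(samples)
subset = select_training_subset(samples, distance_threshold=2.0)
```

In this toy data the outlier is almost always separated by the first split, so its average distance stays close to 1, while the clustered points need several more splits. A threshold between the two groups therefore picks out the likely fault candidate for labeling.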
204. Determine the labeled sample set according to the training sample subset.
In this embodiment, after obtaining the training sample subset, the information processing device can determine a labeled indicator card set from it. The labeled indicator card set consists of the indicator cards that correspond to the feature sequences in the training sample subset, after fault labeling. Specifically, the information processing device first determines the indicator cards corresponding to the feature sequences in the training sample subset, displays those indicator cards to the target object, and receives the target object's fault labels for them, which yields the fault-labeled indicator cards. The device then determines the attribute feature sequence of each fault-labeled indicator card to obtain the labeled sample set. The target object may be an expert in rod-pumped well systems, or anyone else who can label the faults of indicator cards correctly; this is not specifically limited.
205. Adjust the distance threshold of the query function according to the labeled sample set.
In this embodiment, after the information processing device obtains the labeled sample set, it can use the labeled sample set to adjust the distance threshold of the query function. Specifically, the information processing device can dynamically adjust the distance threshold parameter of the query function according to the result of the target object's labeling (that is, the labeled sample set), so that the numbers of fault samples and normal samples in the labeled sample set are of approximately the same order of magnitude, which facilitates the subsequent training of the supervised model.
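The text does not give a concrete adjustment rule, so the following is only one possible sketch. It assumes that lowering the threshold selects more anomalous samples and thus raises the share of faults, and it moves the threshold by a fixed step until the fault-to-normal ratio falls inside a tolerance band. The function name, the label strings `"fault"` and `"normal"`, and the parameters `step`, `target_ratio` and `tolerance` are all hypothetical.

```python
def adjust_distance_threshold(threshold, labels, step=0.5,
                              target_ratio=1.0, tolerance=0.25):
    """Nudge the distance threshold toward a balanced labeled sample set.

    labels: the annotator's labels for the currently selected samples,
    each either "fault" or "normal" (hypothetical label values).
    """
    labels = list(labels)
    n_fault = labels.count("fault")
    n_normal = labels.count("normal")
    if n_fault == 0 and n_normal == 0:
        return threshold                      # nothing labeled yet
    if n_normal == 0:
        return threshold + step               # only faults: widen the selection
    ratio = n_fault / n_normal
    if ratio < target_ratio - tolerance:
        return max(threshold - step, 0.0)     # too many normals: tighten
    if ratio > target_ratio + tolerance:
        return threshold + step               # too many faults: widen
    return threshold                          # balanced enough: keep


tightened = adjust_distance_threshold(5.0, ["fault"] * 2 + ["normal"] * 8)
unchanged = adjust_distance_threshold(5.0, ["fault"] * 5 + ["normal"] * 5)
widened = adjust_distance_threshold(5.0, ["fault"] * 8 + ["normal"] * 2)
```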
206. Repeat step 203, step 204 and step 205 until the preset iteration termination condition is met, and determine the labeled sample set at the time the iteration terminates as the training sample set.
In this embodiment, the information processing device can repeat step 203, step 204 and step 205 until the preset iteration termination condition is met, and then determine the labeled sample set at the time the iteration terminates as the training sample set.
It should be noted that, while repeating step 203, step 204 and step 205, the information processing device judges after each completed iteration whether the number of iterations has reached a preset value, for example 1000; if so, it determines that the preset iteration termination condition is met. Alternatively, it judges whether the distance threshold of the query function has converged; if so, it determines that the preset iteration termination condition is met.
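The loop over steps 203 to 205 with its two termination conditions can be sketched as below. This is a simplified illustration under assumptions: the distances are precomputed, the target object is simulated by a hypothetical `label_fn` callback, the threshold update is a toy rule passed in as `adjust_fn`, and convergence is taken to mean that the threshold changes by less than `tol` between iterations.

```python
def build_training_set(samples, distances, label_fn, adjust_fn,
                       threshold, max_iterations=1000, tol=1e-6):
    """Repeat select / label / adjust until an iteration limit or convergence."""
    cache = {}        # labels already obtained, so no sample is labeled twice
    selected = []
    iterations = 0
    for iterations in range(1, max_iterations + 1):
        # Step 203: select samples whose distance is below the threshold.
        selected = [i for i, d in enumerate(distances) if d < threshold]
        # Step 204: ask the annotator for labels of newly selected samples.
        for i in selected:
            if i not in cache:
                cache[i] = label_fn(samples[i])
        # Step 205: adjust the threshold from the current labeled sample set.
        new_threshold = adjust_fn(threshold, [cache[i] for i in selected])
        converged = abs(new_threshold - threshold) < tol
        threshold = new_threshold
        if converged:
            break
    return {i: cache[i] for i in selected}, threshold, iterations


def tighten_if_too_many_normals(threshold, labels, step=0.5):
    """Toy adjustment rule (an assumption): shrink while normals outnumber faults."""
    return threshold - step if labels.count("fault") < labels.count("normal") else threshold


# Simulated data: precomputed distances and an oracle standing in for the expert.
distances = [1.0, 1.5, 4.0, 4.2, 4.5, 5.0]
samples = [{"true_label": "fault" if d < 2.0 else "normal"} for d in distances]
training, final_threshold, n_iter = build_training_set(
    samples, distances,
    label_fn=lambda s: s["true_label"],
    adjust_fn=tighten_if_too_many_normals,
    threshold=5.0,
)
```

With these toy inputs the first pass selects two faults and three normals, the threshold drops from 5.0 to 4.5, and the second pass is balanced, so the loop stops on convergence rather than on the iteration limit.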
207. Input the training sample set into a supervised fault labeling model for training to obtain the preset identification model.
In this embodiment, after the training sample set is obtained, a supervised fault labeling model can be trained on the training sample set to obtain the preset identification model. The specific kind of supervised fault labeling model is not limited here, as long as it can be trained on the training sample set to yield the preset identification model.
It should be noted that, after the training sample set is obtained, its samples can be divided into two parts: training samples and test samples. The division can follow a ratio (for example 9:1 or 8:2) or the actual situation. The model is then trained on the training samples and, after training, the preset identification model is tested on the test samples in order to optimize its output.
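A ratio-based division such as 9:1 or 8:2 can be sketched in a few lines of Python. The function name `split_samples`, the random shuffle and the fixed `seed` are assumptions made for illustration; the embodiment only requires that the samples be divided into training samples and test samples.

```python
import random


def split_samples(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples and split them into (training, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]
```

Calling `split_samples(samples, 0.9)` corresponds to the 9:1 division and the default to 8:2.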
As can be seen from the above, when the preset identification model is trained, the query function narrows the range of the set in which fault samples are located and recommends samples for the target object to label, which reduces labeling cost; the narrowed training set also contains a higher proportion of fault samples, which mitigates the effect of data imbalance on fault identification.
The information processing method provided by the embodiments of the present invention has been described above. The information processing apparatus provided by the embodiments of the present invention is described below with reference to Fig. 3.
Referring to Fig. 3, Fig. 3 is a schematic diagram of an embodiment of the information processing apparatus provided by the embodiments of the present invention. The information processing apparatus includes:
An acquiring unit 301, configured to obtain a target indicator card, the target indicator card being the indicator card to be labeled of a target rod-pumped well;
A determination unit 302, configured to determine a target feature sequence of the target indicator card, the target feature sequence including at least the attribute features of the target indicator card;
A processing unit 303, configured to input the target feature sequence into a preset identification model to perform fault labeling on the target indicator card, wherein the preset identification model is obtained by training a supervised fault identification model on a training sample set, the training sample set includes the feature sequences of sample indicator cards, each sample indicator card is an indicator card obtained by processing, through a query function, the indicator cards of the rod-pumped wells in a database and has been labeled by a target object, and the feature sequence of each sample indicator card includes at least the attribute features of that sample indicator card.
Optionally, the apparatus further includes a training unit 304, the training unit 304 being configured to:
Obtain the indicator card of each rod-pumped well in the database;
Determine the feature sequence of the indicator card of each rod-pumped well to obtain an unlabeled sample set, the unlabeled sample set being the set of attribute feature sequences of the indicator cards of the rod-pumped wells;
Step A: determine a training sample subset from the unlabeled sample set according to the query function, the training sample subset being the set of samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function;
Step B: perform fault labeling on the training sample subset to obtain a labeled sample set;
Step C: adjust the distance threshold of the query function by means of the labeled sample set;
Repeat step A, step B and step C until a preset iteration termination condition is met, and determine the labeled sample set at the time the iteration terminates as the training sample set;
Input the training sample set into the supervised fault identification model for training to obtain the preset identification model.
Optionally, the training unit 304 determining the training sample subset from the unlabeled sample set according to the query function includes:
Generating, through the query function, a decision tree corresponding to the unlabeled sample set, the decision tree including a root node and leaf nodes, the root node being associated with the leaf nodes, and the leaf nodes being associated with the respective unlabeled samples in the unlabeled sample set;
Obtaining the target distance from each leaf node in the decision tree to the root node;
Determining the samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function as the training sample subset.
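The text does not say how the query function builds the decision tree. The sketch below assumes a random partitioning tree in which each split picks a random feature and a random cut value, so that samples far from the rest tend to reach a leaf after fewer splits; this construction, and the names `leaf_depths` and `select_by_depth`, are assumptions for illustration only. The "target distance" is taken here to be the number of edges from a sample's leaf to the root.

```python
import random


def leaf_depths(samples, rng, indices=None, depth=0, out=None):
    """Return {sample index: depth of its leaf} for one random tree."""
    if out is None:
        indices, out = list(range(len(samples))), {}
    # Features on which the current samples still differ.
    splittable = [d for d in range(len(samples[0]))
                  if len({samples[i][d] for i in indices}) > 1]
    if len(indices) <= 1 or not splittable:
        for i in indices:
            out[i] = depth  # leaf node: record distance to the root
        return out
    d = rng.choice(splittable)
    values = [samples[i][d] for i in indices]
    lo, hi = min(values), max(values)
    cut = rng.uniform(lo, hi)
    # The "< hi" guard keeps both children non-empty even if cut == hi.
    left = [i for i in indices if samples[i][d] < hi and samples[i][d] <= cut]
    right = [i for i in indices if i not in left]
    leaf_depths(samples, rng, left, depth + 1, out)
    leaf_depths(samples, rng, right, depth + 1, out)
    return out


def select_by_depth(depths, distance_threshold):
    """Indices of samples whose leaf-to-root distance is below the threshold."""
    return sorted(i for i, d in depths.items() if d < distance_threshold)
```

With feature sequences given as equal-length tuples, `select_by_depth(leaf_depths(samples, random.Random(0)), threshold)` would yield the indices forming the training sample subset. A single random tree is noisy, so a practical variant might average the depth over several trees; that extension is likewise an assumption.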
Optionally, the training unit 304 performing fault labeling on the training sample subset to obtain the labeled sample set includes:
Determining the generation time corresponding to each sample in the training sample subset;
Obtaining the indicator card corresponding to each sample based on the generation time of that sample;
Sending the indicator card corresponding to each sample to the target object, so that the target object performs fault labeling on the indicator card corresponding to each sample to obtain the labeled sample set.
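The lookup by generation time and the hand-off to the target object can be sketched as follows. The data layout is hypothetical: each sample is assumed to be a dictionary with a `generated_at` key, `cards_by_time` is assumed to map generation times to indicator cards, and `annotate` stands in for the target object (for example, an expert reviewing the card).

```python
def label_subset(subset, cards_by_time, annotate):
    """Attach a fault label to every sample in the training sample subset."""
    labeled = []
    for sample in subset:
        # Retrieve the indicator card recorded at the sample's generation time.
        card = cards_by_time[sample["generated_at"]]
        # The target object inspects the card and returns a fault label.
        labeled.append({**sample, "label": annotate(card)})
    return labeled
```

The returned list corresponds to the labeled sample set for the current iteration.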
Optionally, the training unit 304 is further configured to:
Judge whether the number of iterations has reached a preset value, and if so, determine that the preset iteration termination condition is met;
Or,
Judge whether the distance threshold of the query function has converged, and if so, determine that the preset iteration termination condition is met.
For the manner of interaction between the units of the information processing apparatus in this embodiment, refer to the description of the embodiments shown in Fig. 1 and Fig. 2 above; details are not repeated here.
As can be seen from the above, in the embodiments provided by the present invention, the faults of the target indicator card are labeled by the preset identification model. The preset identification model is obtained by training on a training sample set, and the training sample set is the set of feature sequences of sample indicator cards that were obtained by processing the indicator cards of the rod-pumped wells in the database through the query function and were labeled by the target object. Fault labeling of the target indicator card can therefore be performed quickly and accurately. Compared with manual labeling in the prior art, because the preset identification model has been trained in advance, it is only necessary to input the indicator card of the rod-pumped well whose faults are to be labeled into the model in order to label the faults of that indicator card, which improves the labeling efficiency for rod-pumped wells and reduces labor cost.
Referring to Fig. 4, Fig. 4 is a schematic structural diagram of a server provided by an embodiment of the present invention. The server 400 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 422 (for example, one or more processors), a memory 432, and one or more storage media 430 (for example, one or more mass storage devices) storing application programs 442 or data 444. The memory 432 and the storage medium 430 may provide transient or persistent storage. The program stored in the storage medium 430 may include one or more modules (not marked in the figure), and each module may include a series of instruction operations on the server. Further, the central processing unit 422 may be configured to communicate with the storage medium 430 and to execute, on the server 400, the series of instruction operations in the storage medium 430.
The server 400 may also include one or more power supplies 426, one or more wired or wireless network interfaces 450, one or more input/output interfaces 458, and/or one or more operating systems 441, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
The steps performed by the information processing apparatus in the above embodiments may be based on the server structure shown in Fig. 4.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, apparatus and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
An embodiment of the present invention also provides a storage medium on which a program is stored; when the program is executed by a processor, the information processing method is implemented.
An embodiment of the present invention also provides a processor configured to run a program, wherein the information processing method is executed when the program runs.
An embodiment of the present invention also provides a device. The device includes a processor, a memory, and a program stored in the memory and runnable on the processor; when executing the program, the processor performs the following steps:
Obtaining a target indicator card, the target indicator card being the indicator card to be labeled of a target rod-pumped well;
Determining a target feature sequence of the target indicator card, the target feature sequence including at least the attribute features of the target indicator card;
Inputting the target feature sequence into a preset identification model to perform fault labeling on the target indicator card, wherein the preset identification model is obtained by training a supervised fault identification model on a training sample set, the training sample set includes the feature sequences of sample indicator cards, each sample indicator card is an indicator card obtained by processing, through a query function, the indicator cards of the rod-pumped wells in a database and has been labeled by a target object, and the feature sequence of each sample indicator card includes at least the attribute features of that sample indicator card.
Optionally, before the target feature sequence is input into the preset identification model to label the target indicator card, the method further includes:
Obtaining the indicator card of each rod-pumped well in the database;
Determining the feature sequence of the indicator card of each rod-pumped well to obtain an unlabeled sample set, the unlabeled sample set being the set of attribute feature sequences of the indicator cards of the rod-pumped wells;
Step A: determining a training sample subset from the unlabeled sample set according to the query function, the training sample subset being the set of samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function;
Step B: performing fault labeling on the training sample subset to obtain a labeled sample set;
Step C: adjusting the distance threshold of the query function by means of the labeled sample set;
Repeating step A, step B and step C until a preset iteration termination condition is met, and determining the labeled sample set at the time the iteration terminates as the training sample set;
Inputting the training sample set into the supervised fault identification model for training to obtain the preset identification model.
Optionally, determining the training sample subset from the unlabeled sample set according to the query function includes:
Generating, through the query function, a decision tree corresponding to the unlabeled sample set, the decision tree including a root node and leaf nodes, the root node being associated with the leaf nodes, and the leaf nodes being associated with the respective unlabeled samples in the unlabeled sample set;
Obtaining the target distance from each leaf node in the decision tree to the root node;
Determining the samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function as the training sample subset.
Optionally, performing fault labeling on the training sample subset to obtain the labeled sample set includes:
Determining the generation time corresponding to each sample in the training sample subset;
Obtaining the indicator card corresponding to each sample based on the generation time of that sample;
Sending the indicator card corresponding to each sample to the target object, so that the target object performs fault labeling on the indicator card corresponding to each sample to obtain the labeled sample set.
Optionally, the method further includes:
Judging whether the number of iterations has reached a preset value, and if so, determining that the preset iteration termination condition is met;
Or,
Judging whether the distance threshold of the query function has converged, and if so, determining that the preset iteration termination condition is met.
The device here may be a server, a PC, a tablet (PAD), a mobile phone, or the like.
The present invention also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps:
Obtaining a target indicator card, the target indicator card being the indicator card to be labeled of a target rod-pumped well;
Determining a target feature sequence of the target indicator card, the target feature sequence including at least the attribute features of the target indicator card;
Inputting the target feature sequence into a preset identification model to perform fault labeling on the target indicator card, wherein the preset identification model is obtained by training a supervised fault identification model on a training sample set, the training sample set includes the feature sequences of sample indicator cards, each sample indicator card is an indicator card obtained by processing, through a query function, the indicator cards of the rod-pumped wells in a database and has been labeled by a target object, and the feature sequence of each sample indicator card includes at least the attribute features of that sample indicator card.
Optionally, before the target feature sequence is input into the preset identification model to label the target indicator card, the method further includes:
Obtaining the indicator card of each rod-pumped well in the database;
Determining the feature sequence of the indicator card of each rod-pumped well to obtain an unlabeled sample set, the unlabeled sample set being the set of attribute feature sequences of the indicator cards of the rod-pumped wells;
Step A: determining a training sample subset from the unlabeled sample set according to the query function, the training sample subset being the set of samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function;
Step B: performing fault labeling on the training sample subset to obtain a labeled sample set;
Step C: adjusting the distance threshold of the query function by means of the labeled sample set;
Repeating step A, step B and step C until a preset iteration termination condition is met, and determining the labeled sample set at the time the iteration terminates as the training sample set;
Inputting the training sample set into the supervised fault identification model for training to obtain the preset identification model.
Optionally, determining the training sample subset from the unlabeled sample set according to the query function includes:
Generating, through the query function, a decision tree corresponding to the unlabeled sample set, the decision tree including a root node and leaf nodes, the root node being associated with the leaf nodes, and the leaf nodes being associated with the respective unlabeled samples in the unlabeled sample set;
Obtaining the target distance from each leaf node in the decision tree to the root node;
Determining the samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function as the training sample subset.
Optionally, performing fault labeling on the training sample subset to obtain the labeled sample set includes:
Determining the generation time corresponding to each sample in the training sample subset;
Obtaining the indicator card corresponding to each sample based on the generation time of that sample;
Sending the indicator card corresponding to each sample to the target object, so that the target object performs fault labeling on the indicator card corresponding to each sample to obtain the labeled sample set.
Optionally, the method further includes:
Judging whether the number of iterations has reached a preset value, and if so, determining that the preset iteration termination condition is met;
Or,
Judging whether the distance threshold of the query function has converged, and if so, determining that the preset iteration termination condition is met.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage and the like) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface and a memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, commodity or device that includes the element.
The above are merely embodiments of the present invention and are not intended to limit the present invention. Those skilled in the art may make various modifications and changes to the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the scope of the claims of the present invention.

Claims (10)

1. An information processing method, characterized by comprising:
Obtaining a target indicator card, the target indicator card being the indicator card to be labeled of a target rod-pumped well;
Determining a target feature sequence of the target indicator card, the target feature sequence including at least the attribute features of the target indicator card;
Inputting the target feature sequence into a preset identification model to perform fault labeling on the target indicator card, wherein the preset identification model is obtained by training a supervised fault identification model on a training sample set, the training sample set includes the feature sequences of sample indicator cards, each sample indicator card is an indicator card obtained by processing, through a query function, the indicator cards of the rod-pumped wells in a database and has been labeled by a target object, and the feature sequence of each sample indicator card includes at least the attribute features of that sample indicator card.
2. The method according to claim 1, characterized in that, before the target feature sequence is input into the preset identification model to label the target indicator card, the method further comprises:
Obtaining the indicator card of each rod-pumped well in the database;
Determining the feature sequence of the indicator card of each rod-pumped well to obtain an unlabeled sample set, the unlabeled sample set being the set of attribute feature sequences of the indicator cards of the rod-pumped wells;
Step A: determining a training sample subset from the unlabeled sample set according to the query function, the training sample subset being the set of samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function;
Step B: performing fault labeling on the training sample subset to obtain a labeled sample set;
Step C: adjusting the distance threshold of the query function by means of the labeled sample set;
Repeating step A, step B and step C until a preset iteration termination condition is met, and determining the labeled sample set at the time the iteration terminates as the training sample set;
Inputting the training sample set into the supervised fault identification model for training to obtain the preset identification model.
3. The method according to claim 2, characterized in that determining the training sample subset from the unlabeled sample set according to the query function comprises:
Generating, through the query function, a decision tree corresponding to the unlabeled sample set, the decision tree including a root node and leaf nodes, the root node being associated with the leaf nodes, and the leaf nodes being associated with the respective unlabeled samples in the unlabeled sample set;
Obtaining the target distance from each leaf node in the decision tree to the root node;
Determining the samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function as the training sample subset.
4. The method according to claim 2, characterized in that performing fault labeling on the training sample subset to obtain the labeled sample set comprises:
Determining the generation time corresponding to each sample in the training sample subset;
Obtaining the indicator card corresponding to each sample based on the generation time of that sample;
Sending the indicator card corresponding to each sample to the target object, so that the target object performs fault labeling on the indicator card corresponding to each sample to obtain the labeled sample set.
5. The method according to claim 2, characterized in that the method further comprises:
Judging whether the number of iterations has reached a preset value, and if so, determining that the preset iteration termination condition is met;
Or,
Judging whether the distance threshold of the query function has converged, and if so, determining that the preset iteration termination condition is met.
6. An information processing apparatus, characterized by comprising:
An acquiring unit, configured to obtain a target indicator card, the target indicator card being the indicator card to be labeled of a target rod-pumped well;
A determination unit, configured to determine a target feature sequence of the target indicator card, the target feature sequence including at least the attribute features of the target indicator card;
A processing unit, configured to input the target feature sequence into a preset identification model to perform fault labeling on the target indicator card, wherein the preset identification model is obtained by training a supervised fault identification model on a training sample set, the training sample set includes the feature sequences of sample indicator cards, each sample indicator card is an indicator card obtained by processing, through a query function, the indicator cards of the rod-pumped wells in a database and has been labeled by a target object, and the feature sequence of each sample indicator card includes at least the attribute features of that sample indicator card.
7. The apparatus according to claim 6, characterized in that the apparatus further comprises a training unit, the training unit being configured to:
Obtain the indicator card of each rod-pumped well in the database;
Determine the feature sequence of the indicator card of each rod-pumped well to obtain an unlabeled sample set, the unlabeled sample set being the set of attribute feature sequences of the indicator cards of the rod-pumped wells;
Step A: determine a training sample subset from the unlabeled sample set according to the query function, the training sample subset being the set of samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function;
Step B: perform fault labeling on the training sample subset to obtain a labeled sample set;
Step C: adjust the distance threshold of the query function by means of the labeled sample set;
Repeat step A, step B and step C until a preset iteration termination condition is met, and determine the labeled sample set at the time the iteration terminates as the training sample set;
Input the training sample set into the supervised fault identification model for training to obtain the preset identification model.
8. The apparatus according to claim 7, characterized in that the training unit determining the training sample subset from the unlabeled sample set according to the query function comprises:
Generating, through the query function, a decision tree corresponding to the unlabeled sample set, the decision tree including a root node and leaf nodes, the root node being associated with the leaf nodes, and the leaf nodes being associated with the respective unlabeled samples in the unlabeled sample set;
Obtaining the target distance from each leaf node in the decision tree to the root node;
Determining the samples in the unlabeled sample set whose target distance is less than the distance threshold of the query function as the training sample subset.
9. A processor, characterized in that the processor is configured to run a computer program, and the computer program, when run, performs the steps of the method according to any one of claims 1 to 5.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 5.
Application number: CN201811293443.9A
Priority date: 2018-10-31
Filing date: 2018-10-31
Title: A kind of information processing method and relevant device
Legal status: Pending
Publication number: CN109508738A (en)
Publication date: 2019-03-22
Family ID: 65747341
Country: CN


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
FEI TONY LIU ET AL.: "Isolation-Based Anomaly Detection", 《ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA》 *
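YEZHU: "iForest (Isolation Forest): an introduction to anomaly detection", 《简书》 (Jianshu) *
REN Yifei (任毅飞): "Indicator diagram recognition based on a PSO-RBF neural network", 《微型机与应用》 (Microcomputer & Its Applications) *
ZHANG Ning (张宁): "Research on indicator diagram fault diagnosis based on an intelligent BP neural network", 《中国优秀硕士学位论文全文数据库 工程科技Ⅰ辑》 (China Master's Theses Full-text Database, Engineering Science and Technology I) *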

Similar Documents

Publication Publication Date Title
CN104317681B (en) For the behavioral abnormal automatic detection method and detecting system of computer system
US8756175B1 (en) Robust and fast model fitting by adaptive sampling
CN110148285B (en) Intelligent oil well parameter early warning system based on big data technology and early warning method thereof
CN111798312B (en) Financial transaction system anomaly identification method based on isolated forest algorithm
CN109508738A (en) A kind of information processing method and relevant device
CN112393931B (en) Detection method, detection device, electronic equipment and computer readable medium
CN106651416A (en) Analyzing method and analyzing device of application popularization information
CN108463973A (en) Fingerprint recognition basic reason is analyzed in cellular system
CN105824748B (en) For determining the method and system of test case efficiency
US11200377B2 (en) Cluster model to predict build failure
CN108255857A (en) A kind of sentence detection method and device
CN110798467B (en) Target object identification method and device, computer equipment and storage medium
CN105468677A (en) Log clustering method based on graph structure
CN104298679A (en) Application service recommendation method and device
CN111506637B (en) Multi-dimensional anomaly detection method and device based on KPI (Key Performance indicator) and storage medium
US20190246299A1 (en) Method and test system for mobile network testing as well as a network testing system
CN111506504B (en) Software development process measurement-based software security defect prediction method and device
CN104407688A (en) Virtualized cloud platform energy consumption measurement method and system based on tree regression
CN112232833A (en) Lost member customer group data prediction method, model training method and model training device
CN105868956A (en) Data processing method and device
CN108230003A (en) The dispensing effect analysis method and device of keyword
CN110008473A (en) A kind of medical text name Entity recognition mask method based on alternative manner
CN111159167B (en) Labeling quality detection device and method
CN106909454A (en) A kind of rules process method and equipment
CN114625406A (en) Application development control method, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190322