Summary of the invention
Embodiments of the present invention provide an abnormal data prediction method, an abnormal data prediction apparatus, an electronic device, and a computer storage medium.
In a first aspect, an embodiment of the present invention provides an abnormal data prediction method.
Specifically, the abnormal data prediction method comprises:
obtaining training abnormal data, and extracting features of the training abnormal data;
training with the features of the training abnormal data as input to obtain an abnormal data prediction model; and
obtaining historical abnormal data and extracting its features, and inputting the features of the historical abnormal data into the abnormal data prediction model to obtain an abnormal data prediction result.
With reference to the first aspect, in a first implementation of the first aspect, obtaining the training abnormal data and extracting the features of the training abnormal data comprises:
obtaining first historical abnormal data within a first historical preset time period as the training abnormal data; and
extracting features of the first historical abnormal data.
With reference to the first aspect and the first implementation of the first aspect, in a second implementation of the first aspect, training with the features of the training abnormal data as input to obtain the abnormal data prediction model comprises:
determining an abnormal data prediction model to be trained; and
inputting the features of the training abnormal data into the abnormal data prediction model to be trained, and training it to obtain the abnormal data prediction model.
With reference to the first aspect and the first and second implementations of the first aspect, in a third implementation of the first aspect, after training with the features of the training abnormal data as input to obtain the abnormal data prediction model, the method further comprises:
performing validity verification on the abnormal data prediction model.
With reference to the first aspect and the first to third implementations of the first aspect, in a fourth implementation of the first aspect, performing validity verification on the abnormal data prediction model comprises:
determining a verification time, and obtaining second historical abnormal data within a second historical preset time period before the verification time;
extracting features of the second historical abnormal data;
inputting the features of the second historical abnormal data into the abnormal data prediction model to obtain verification predicted abnormal data; and
performing validity verification on the abnormal data prediction model according to the difference between the verification predicted abnormal data and the real abnormal data at the verification time.
With reference to the first aspect and the first to fourth implementations of the first aspect, in a fifth implementation of the first aspect, obtaining the historical abnormal data and extracting its features, and inputting the features of the historical abnormal data into the abnormal data prediction model to obtain the abnormal data prediction result, comprises:
determining a time to be predicted, and obtaining third historical abnormal data within a third historical preset time period before the time to be predicted;
extracting features of the third historical abnormal data; and
inputting the features of the third historical abnormal data into the abnormal data prediction model to obtain the abnormal data prediction result for the time to be predicted.
With reference to the first aspect and the first to fifth implementations of the first aspect, in a sixth implementation of the first aspect, the features of the abnormal data include one or more of the following: an attribute feature, a time identifier feature, a validity evaluation feature, a statistical feature, and an operation feature.
In a second aspect, an embodiment of the present invention provides an abnormal data prediction apparatus.
Specifically, the abnormal data prediction apparatus comprises:
an obtaining module, configured to obtain training abnormal data and extract features of the training abnormal data;
a training module, configured to train with the features of the training abnormal data as input to obtain an abnormal data prediction model; and
a prediction module, configured to obtain historical abnormal data and extract its features, and to input the features of the historical abnormal data into the abnormal data prediction model to obtain an abnormal data prediction result.
With reference to the second aspect, in a first implementation of the second aspect, the obtaining module comprises:
an obtaining submodule, configured to obtain first historical abnormal data within a first historical preset time period as the training abnormal data; and
a first extraction submodule, configured to extract features of the first historical abnormal data.
With reference to the second aspect and the first implementation of the second aspect, in a second implementation of the second aspect, the training module comprises:
a first determining submodule, configured to determine an abnormal data prediction model to be trained; and
a training submodule, configured to input the features of the training abnormal data into the abnormal data prediction model to be trained, and to train it to obtain the abnormal data prediction model.
With reference to the second aspect and the first and second implementations of the second aspect, in a third implementation of the second aspect, the training module further comprises:
a first verification submodule, configured to perform validity verification on the abnormal data prediction model.
With reference to the second aspect and the first to third implementations of the second aspect, in a fourth implementation of the second aspect, the first verification submodule comprises:
a second determining submodule, configured to determine a verification time and obtain second historical abnormal data within a second historical preset time period before the verification time;
a second extraction submodule, configured to extract features of the second historical abnormal data;
an input submodule, configured to input the features of the second historical abnormal data into the abnormal data prediction model to obtain verification predicted abnormal data; and
a second verification submodule, configured to perform validity verification on the abnormal data prediction model according to the difference between the verification predicted abnormal data and the real abnormal data at the verification time.
With reference to the second aspect and the first to fourth implementations of the second aspect, in a fifth implementation of the second aspect, the prediction module comprises:
a third determining submodule, configured to determine a time to be predicted and obtain third historical abnormal data within a third historical preset time period before the time to be predicted;
a third extraction submodule, configured to extract features of the third historical abnormal data; and
a prediction submodule, configured to input the features of the third historical abnormal data into the abnormal data prediction model to obtain the abnormal data prediction result for the time to be predicted.
With reference to the second aspect and the first to fifth implementations of the second aspect, in a sixth implementation of the second aspect, the features of the abnormal data include one or more of the following: an attribute feature, a time identifier feature, a validity evaluation feature, a statistical feature, and an operation feature.
In a third aspect, an embodiment of the present invention provides an electronic device comprising a memory and a processor. The memory is configured to store one or more computer instructions that support the abnormal data prediction apparatus in executing the abnormal data prediction method of the first aspect, and the processor is configured to execute the computer instructions stored in the memory. The abnormal data prediction apparatus may further comprise a communication interface for communication between the abnormal data prediction apparatus and other devices or a communication network.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium for storing the computer instructions used by the abnormal data prediction apparatus, including the computer instructions involved when the abnormal data prediction apparatus executes the abnormal data prediction method of the first aspect.
The technical solutions provided by the embodiments of the present invention may have the following beneficial effects:
The above technical solution predicts abnormal data by means of abnormal data, such as historical asset losses, and a prediction model. The resulting prediction can serve as a reference for technical staff and reduces the randomness and difficulty of event handling. This not only helps technical staff formulate the most suitable countermeasures, but also effectively improves their working efficiency.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the embodiments of the present invention.
Detailed description of the embodiments
Hereinafter, exemplary embodiments of the present invention are described in detail with reference to the accompanying drawings, so that those skilled in the art can readily implement them. For clarity, parts unrelated to the description of the exemplary embodiments are omitted from the drawings.
In the embodiments of the present invention, it should be understood that terms such as "comprising" or "having" are intended to indicate the presence of the features, numbers, steps, acts, components, parts, or combinations thereof disclosed in this specification, and are not intended to exclude the possibility that one or more other features, numbers, steps, acts, components, parts, or combinations thereof exist or are added.
It should also be noted that, where no conflict arises, the embodiments of the present invention and the features in the embodiments may be combined with each other. The embodiments of the present invention are described in detail below with reference to the accompanying drawings.
The technical solution provided by the embodiments of the present invention predicts abnormal data by means of abnormal data, such as historical asset losses, and a prediction model. The resulting prediction can serve as a reference for technical staff and reduces the randomness and difficulty of event handling. This not only helps technical staff formulate the most suitable countermeasures, but also effectively improves their working efficiency.
Fig. 1 shows a flowchart of an abnormal data prediction method according to an embodiment of the present invention. As shown in Fig. 1, the abnormal data prediction method includes the following steps S101-S103:
In step S101, training abnormal data is obtained, and features of the training abnormal data are extracted;
In step S102, training is performed with the features of the training abnormal data as input to obtain an abnormal data prediction model;
In step S103, historical abnormal data is obtained and its features are extracted, and the features of the historical abnormal data are input into the abnormal data prediction model to obtain an abnormal data prediction result.
As mentioned above, with the development of data and information technology, many criminals commit fraud using networks or communication tools, causing many users to suffer asset losses of varying size. In the prior art, verification, root-cause analysis, and the formulation of countermeasures usually begin only after a user has suffered an asset loss. This approach lacks foresight about the data, increases the randomness and difficulty of event handling, and hinders improvement of the working efficiency of technical staff.
In view of the above problem, this embodiment proposes an abnormal data prediction method. The method predicts abnormal data by means of abnormal data, such as historical asset losses, and a prediction model. The resulting prediction can serve as a reference for technical staff and reduces the randomness and difficulty of event handling. This not only helps technical staff formulate the most suitable countermeasures, but also effectively improves their working efficiency.
In an optional implementation of this embodiment, the abnormal data may be any abnormal data that can be recorded, counted, and monitored, such as asset loss data, goods loss data, system failure data, or machine alarm data.
In an optional implementation of this embodiment, as shown in Fig. 2, step S101, that is, the step of obtaining training abnormal data and extracting the features of the training abnormal data, includes the following steps S201-S202:
In step S201, first historical abnormal data within a first historical preset time period is obtained as the training abnormal data;
In step S202, features of the first historical abnormal data are extracted.
In this implementation, the training abnormal data may be historical data that has already been generated. For example, the first historical abnormal data within the first historical preset time period is taken as the training abnormal data, and the features of the training abnormal data are then extracted as the input data of the subsequent prediction model.
The first historical preset time period may be determined according to the needs of the practical application and the characteristics of the abnormal data. For example, it may be set to 6 months or 1 year. The present invention does not specifically limit the value of the first historical preset time period.
In an optional implementation of this embodiment, the features of the abnormal data may include one or more of the following: an attribute feature, a time identifier feature, a validity evaluation feature, a statistical feature, an operation feature, and so on.
The attribute feature characterizes attributes of the abnormal data, such as the category of the abnormal data and the magnitude of the abnormal data.
In the prior art, in order to control to some extent the proportion of asset loss events that are processed and the processing time, and also to filter out asset loss events caused by mis-operations, an asset loss amount threshold is usually set. Processing of an asset loss event is triggered only when the asset loss amount of that event exceeds the threshold.
Because this prior-art processing does not predict asset loss events and lacks forward-looking analysis of the data, it often leads technical staff to unnecessarily tense and overly aggressive responses. In fact, even if prediction of asset loss events is added, overly aggressive responses still cannot be effectively avoided. This is because, after careful examination and analysis of abnormal data such as asset loss data, the inventor found that the occurrence of some asset loss data follows a certain periodicity. For example, during the large-scale promotional campaigns that e-commerce platforms run on Double 11 each year, the number of user orders surges, criminals take advantage of the opportunity, and the number of online fraud cases rises sharply. Therefore, if the periodicity of asset loss data can be analyzed and combined with the prediction of asset loss data, technical staff can be given a more reasonable processing threshold, which effectively avoids overly aggressive responses and improves their working efficiency. At the same time, technical staff can gain a preliminary understanding of the asset loss data that may occur in the future, or draw up possible countermeasure plans in advance.
Therefore, in an optional implementation of this embodiment, the time identifier feature of the abnormal data is also extracted when the features of the abnormal data are extracted. The time identifier feature characterizes the temporal characteristics of when the abnormal data occurred: for example, whether the time of occurrence falls in a major holiday such as the Spring Festival or National Day, whether it falls in a specific promotion period such as Double 11 or 618, and whether it coincides with a particular event (for example, batch crimes committed by organized groups may cause a substantial increase in the asset loss amount, and the bankruptcy of a service enterprise such as a bike-sharing company may cause an increase in asset loss events). Clearly, extracting the time identifier feature allows regularities in the abnormal data to be recognized effectively. In practical applications, one-hot encoding may be applied to the time of occurrence of the abnormal data to obtain its time identifier feature.
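The one-hot encoding of the time of occurrence can be sketched as follows. This is a minimal illustration, not the encoding used by the invention: the function name `time_identifier_features`, the `FLAGGED_PERIODS` table, and the example dates in it are all assumptions made for the example.

```python
from datetime import date

# Hypothetical calendar of flagged periods (illustrative dates only); a real
# system would load these from a maintained holiday/promotion calendar.
FLAGGED_PERIODS = {
    "national_day": (date(2017, 10, 1), date(2017, 10, 7)),
    "double_11": (date(2017, 11, 11), date(2017, 11, 11)),
    "618": (date(2017, 6, 18), date(2017, 6, 18)),
}


def time_identifier_features(day: date) -> dict[str, int]:
    """Return 0/1 indicators for flagged periods plus a one-hot weekday."""
    features = {
        f"is_{name}": int(start <= day <= end)
        for name, (start, end) in FLAGGED_PERIODS.items()
    }
    # One-hot encode the day of the week (Monday = 0 ... Sunday = 6).
    for weekday in range(7):
        features[f"weekday_{weekday}"] = int(day.weekday() == weekday)
    return features
```

Each abnormal-data record would then carry a fixed-length vector of indicators, so that a model can learn that losses on flagged days behave differently from losses on ordinary days.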
The validity evaluation feature characterizes the validity of the training abnormal data, given that the earlier the training abnormal data occurred, the lower its reference value. To take the validity of the training abnormal data into account, in an optional implementation of this embodiment, validity evaluation is also performed on the training abnormal data, and the resulting validity evaluation is used as one of its features in the training of the subsequent prediction model.
In an optional implementation of this embodiment, an evaluation model is used to evaluate the validity of the training abnormal data. The evaluation model may be obtained by training on abnormal data that occurred before the training abnormal data. The evaluation model may be, for example, LightGBM, XGBoost, a logistic regression model (LR), a random forest model (RF), or a gradient boosting decision tree model (GBDT). Other suitable models may of course be selected, and the present invention does not specifically limit this.
The statistical feature characterizes the statistical properties of the training abnormal data. The statistical feature may be, for example, the minimum value, maximum value, average value, standard deviation, first value, or last value of the training abnormal data.
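The statistical quantities listed above can be computed as in the following sketch. The function name `statistical_features` and the choice of the population standard deviation are assumptions for illustration; the source does not say which variant of the standard deviation is used.

```python
import statistics


def statistical_features(values: list[float]) -> dict[str, float]:
    """Summarize one series of abnormal-data values (e.g. daily loss amounts)."""
    if not values:
        raise ValueError("values must not be empty")
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        # Population standard deviation; the sample variant would be
        # statistics.stdev (an assumption either way).
        "std": statistics.pstdev(values),
        "first": values[0],
        "last": values[-1],
    }
```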
The operation feature characterizes the computed properties of the training abnormal data and may be obtained by applying preset operation rules to the training abnormal data. The preset operation rules may include one or more of the following: summation, averaging, addition, subtraction, multiplication, division, distance calculation, sliding-window processing, and so on.
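A few of the operation rules above (summation, averaging, distance calculation, and sliding-window processing) can be combined as in the sketch below. The function names and the use of the absolute difference between consecutive values as the "distance" are assumptions for illustration only.

```python
def sliding_windows(values: list[float], window: int) -> list[list[float]]:
    """Return every contiguous sub-list of length `window`."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [values[i:i + window] for i in range(len(values) - window + 1)]


def operation_features(values: list[float], window: int) -> dict[str, list[float]]:
    """Apply summation and averaging per window, and a step-to-step distance."""
    windows = sliding_windows(values, window)
    return {
        "window_sums": [sum(w) for w in windows],
        "window_means": [sum(w) / window for w in windows],
        # Distance between consecutive observations (absolute difference).
        "step_distances": [abs(b - a) for a, b in zip(values, values[1:])],
    }
```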
In an optional implementation of this embodiment, to facilitate the training and computation of the prediction model, the training abnormal data is discretized before its features are extracted, yielding discrete training data. Alternatively, after features are extracted from the training abnormal data, the continuous features are discretized to obtain discrete features.
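One common way to discretize a continuous value is to map it to the index of the bin it falls into. The sketch below assumes fixed, ascending bin edges; the function name `discretize` and the example edges are hypothetical, and the source does not specify a particular discretization scheme.

```python
from bisect import bisect_right


def discretize(value: float, bin_edges: list[float]) -> int:
    """Map a continuous value to a bin index.

    `bin_edges` must be sorted in ascending order. A value below the first
    edge maps to bin 0, and a value at or above the last edge maps to
    bin len(bin_edges).
    """
    return bisect_right(bin_edges, value)


# Hypothetical edges for an asset loss amount.
LOSS_AMOUNT_EDGES = [100.0, 1000.0, 10000.0]
```

With these edges, a loss amount of 50 falls into bin 0, 999 into bin 1, and 50000 into bin 3.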
In an optional implementation of this embodiment, the features of the training abnormal data may also be extracted with feature extraction tools such as Featuretools and alpha trion, which are not described further here.
In an optional implementation of this embodiment, as shown in Fig. 3, step S102, that is, the step of training with the features of the training abnormal data as input to obtain the abnormal data prediction model, includes the following steps S301-S302:
In step S301, an abnormal data prediction model to be trained is determined;
In step S302, the features of the training abnormal data are input into the abnormal data prediction model to be trained, and training yields the abnormal data prediction model.
To improve the effectiveness of training, in this implementation the abnormal data prediction model to be trained is first determined according to the data characteristics of the training abnormal data. The features of the training abnormal data are then input into the abnormal data prediction model to be trained, and training yields the abnormal data prediction model.
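The two steps (choose a model, then fit it on the extracted features) can be illustrated with a deliberately small stand-in. The sketch below assumes a one-feature linear model fitted by ordinary least squares, where the feature is the mean of the preceding `window` values and the target is the value that follows. The source does not prescribe this model; the names `fit_linear` and `train_prediction_model` are hypothetical.

```python
from typing import Callable


def fit_linear(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def train_prediction_model(
    series: list[float], window: int
) -> Callable[[list[float]], float]:
    """Fit the stand-in model on a historical series and return a predictor."""
    if window <= 0:
        raise ValueError("window must be positive")
    # Feature: mean of each window; target: the value right after the window.
    xs = [sum(series[i:i + window]) / window for i in range(len(series) - window)]
    ys = [series[i + window] for i in range(len(series) - window)]
    slope, intercept = fit_linear(xs, ys)

    def predict(history: list[float]) -> float:
        recent = history[-window:]
        return slope * (sum(recent) / len(recent)) + intercept

    return predict
```

In practice the fitted object would more likely come from a library model chosen to match the data, with the full feature set as input rather than a single window mean.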
In an optional implementation of this embodiment, the features of the training abnormal data may be combined according to the needs of the practical application, and the combined features are used as the input of the abnormal data prediction model to be trained. For example, because daytime and evening abnormal data are strongly correlated within specific promotion periods such as Double 11 and 618, the features of the daytime and evening abnormal data within the specific promotion period may be combined and used as the input of the abnormal data prediction model to be trained.
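A simple form of feature combination is to merge the daytime and evening feature sets into one prefixed feature vector, as sketched below. The function name `combine_day_night` and the prefixing scheme are assumptions for illustration; the source does not specify how the combination is carried out.

```python
def combine_day_night(
    day: dict[str, float], night: dict[str, float]
) -> dict[str, float]:
    """Merge daytime and evening features into one prefixed feature vector."""
    combined = {f"day_{name}": value for name, value in day.items()}
    combined.update({f"night_{name}": value for name, value in night.items()})
    return combined
```

Prefixing keeps the two sources distinguishable, so a model can learn interactions between the daytime and evening values of the same feature.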
In an optional implementation of this embodiment, after step S302, that is, after the step of inputting the features of the training abnormal data into the abnormal data prediction model to be trained and training to obtain the abnormal data prediction model, the method further includes a step of performing validity verification on the abnormal data prediction model. That is, as shown in Fig. 4, step S102 includes the following steps S401-S403:
In step S401, an abnormal data prediction model to be trained is determined;
In step S402, the features of the training abnormal data are input into the abnormal data prediction model to be trained, and training yields the abnormal data prediction model;
In step S403, validity verification is performed on the abnormal data prediction model.
To ensure the validity of the abnormal data prediction model, in this implementation validity verification is also performed on the abnormal data prediction model obtained by training.
In an optional implementation of this embodiment, as shown in Fig. 5, step S403, that is, the step of performing validity verification on the abnormal data prediction model, includes the following steps S501-S504:
In step S501, a verification time is determined, and second historical abnormal data within a second historical preset time period before the verification time is obtained;
In step S502, features of the second historical abnormal data are extracted;
In step S503, the features of the second historical abnormal data are input into the abnormal data prediction model to obtain verification predicted abnormal data;
In step S504, validity verification is performed on the abnormal data prediction model according to the difference between the verification predicted abnormal data and the real abnormal data at the verification time.
In this implementation, a verification time is determined first. The abnormal data that occurred at the verification time is known and is later used to verify the validity of the abnormal data prediction model. Second historical abnormal data within a second historical preset time period before the verification time is then obtained, where the second historical preset time period differs from the first historical preset time period but may intersect or partially overlap with it. Next, the features of the second historical abnormal data are extracted. They may be extracted with the same method described above for the training abnormal data, so that the extracted feature types are the same. The features of the second historical abnormal data are then input into the abnormal data prediction model to obtain the predicted abnormal data used for verification. Finally, the verification predicted abnormal data is compared with the real abnormal data at the verification time, and the validity verification result of the abnormal data prediction model is obtained from the difference between the two. If the difference between the verification predicted abnormal data and the real abnormal data at the verification time is smaller than a preset difference criterion, the abnormal data prediction model is considered valid. Otherwise, the abnormal data prediction model is considered invalid and needs to be retrained. Of course, evaluation algorithms such as an object intrusion algorithm, mean squared error (MSE), root mean squared error (RMSE), or mean absolute error (MAE) may also be used to evaluate the validity of the abnormal data prediction model. Those skilled in the art may select a suitable model validity evaluation method according to the needs of the practical application, the abnormal data prediction model, and the characteristics of the input and output data. The present invention does not specifically limit this.
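The comparison between verification predictions and real values can be sketched with the standard error metrics named above. The metric formulas are the usual definitions; the function `is_model_valid` and its `max_rmse` threshold are hypothetical stand-ins for the preset difference criterion, which the source does not quantify.

```python
import math


def mse(predicted: list[float], actual: list[float]) -> float:
    """Mean squared error."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


def rmse(predicted: list[float], actual: list[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(mse(predicted, actual))


def mae(predicted: list[float], actual: list[float]) -> float:
    """Mean absolute error."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def is_model_valid(
    predicted: list[float], actual: list[float], max_rmse: float
) -> bool:
    """Treat the model as valid when RMSE is below the preset criterion."""
    return rmse(predicted, actual) < max_rmse
```

A model that fails this check would be retrained, for example on a different historical window or with a different model type.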
The verification time may be one time unit, such as 1 day, or multiple time units, such as 2 days. The verification time may consist of discrete times or of a continuous time period, such as from November 5, 2017 to November 15, 2017. Those skilled in the art may set the verification time according to the needs of the practical application, and the present invention does not specifically limit this.
In an optional implementation of this embodiment, as shown in Fig. 6, step S103, that is, the step of obtaining historical abnormal data and extracting its features, and inputting the features of the historical abnormal data into the abnormal data prediction model to obtain the abnormal data prediction result, includes the following steps S601-S603:
In step S601, a time to be predicted is determined, and third historical abnormal data within a third historical preset time period before the time to be predicted is obtained;
In step S602, features of the third historical abnormal data are extracted;
In step S603, the features of the third historical abnormal data are input into the abnormal data prediction model to obtain the abnormal data prediction result for the time to be predicted.
In this implementation, during actual prediction, the time to be predicted is determined first, and third historical abnormal data within a third historical preset time period before the time to be predicted is obtained. The third historical preset time period may be a historical period immediately preceding the time to be predicted, or a historical period separated from the time to be predicted by a certain interval. The features of the third historical abnormal data are then extracted. They may be extracted with the same method described above for the training abnormal data, so that the extracted feature types are the same. Finally, the features of the third historical abnormal data are input into the abnormal data prediction model to obtain the abnormal data prediction result for the time to be predicted. This prediction result can subsequently be provided to technical staff to support them in taking targeted countermeasures.
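Selecting the historical window for a given time to be predicted, either immediately before it or separated from it by an interval, can be sketched as follows. The function name `history_window` and the `gap_days` parameter are assumptions for illustration.

```python
from datetime import date, timedelta


def history_window(
    target_day: date, window_days: int, gap_days: int = 0
) -> tuple[date, date]:
    """Return the inclusive (start, end) dates of the historical window.

    gap_days = 0 selects the window ending the day before `target_day`;
    a positive gap shifts the whole window further into the past.
    """
    if window_days <= 0 or gap_days < 0:
        raise ValueError("window_days must be positive and gap_days non-negative")
    end = target_day - timedelta(days=1 + gap_days)
    start = end - timedelta(days=window_days - 1)
    return start, end
```

With a 10-day window and a target of November 21, 2017, a gap of 0 selects November 11 to November 20, 2017, and a gap of 10 selects November 1 to November 10, 2017.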
The time to be predicted may be a time point to be predicted or a time interval to be predicted. If it is a time interval, the third historical abnormal data must be chosen accordingly. For example, suppose the time interval to be predicted is the 5 days from November 21, 2017 to November 25, 2017, and the third historical preset time period is set to 10 days. The third historical abnormal data used to predict November 21, 2017 may then be the abnormal data from November 11, 2017 to November 20, 2017, or the abnormal data from November 1, 2017 to November 10, 2017. The third historical abnormal data used to predict November 22, 2017 may be the prediction data for November 21, 2017 together with the abnormal data of the preceding days up to November 20, 2017, or the abnormal data from November 2, 2017 to November 11, 2017, and so on.
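The first option in the example, where each day's prediction is appended to the history used for the next day, is a recursive multi-step forecast. The sketch below assumes a `predict` callable that maps the most recent `window` values to the next value; the function name `rolling_forecast` is hypothetical.

```python
from typing import Callable


def rolling_forecast(
    history: list[float],
    horizon: int,
    window: int,
    predict: Callable[[list[float]], float],
) -> list[float]:
    """Predict `horizon` steps ahead, feeding each prediction back as history."""
    if horizon < 0 or window <= 0:
        raise ValueError("horizon must be non-negative and window positive")
    values = list(history)  # copy so the caller's data is not modified
    forecasts = []
    for _ in range(horizon):
        next_value = predict(values[-window:])
        forecasts.append(next_value)
        values.append(next_value)  # the prediction becomes part of the history
    return forecasts
```

Feeding predictions back lets the window stay adjacent to each predicted day, at the cost of compounding prediction errors. The second option in the example avoids this by using only real data from an earlier window.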
The following are apparatus embodiments of the present invention, which may be used to carry out the method embodiments of the present invention.
Fig. 7 shows a structural block diagram of an abnormal data prediction apparatus according to an embodiment of the present invention. The apparatus may be implemented as part or all of an electronic device through software, hardware, or a combination of both. As shown in Fig. 7, the abnormal data prediction apparatus includes:
an obtaining module 701, configured to obtain training abnormal data and extract features of the training abnormal data;
a training module 702, configured to train with the features of the training abnormal data as input to obtain an abnormal data prediction model;
a prediction module 703, configured to obtain historical abnormal data and extract its features, and to input the features of the historical abnormal data into the abnormal data prediction model to obtain an abnormal data prediction result.
As mentioned above, with the development of data and information technology, many criminals commit fraud using networks or communication tools, causing many users to suffer asset losses of varying size. In the prior art, verification, root-cause analysis, and the formulation of countermeasures usually begin only after a user has suffered an asset loss. This approach lacks foresight about the data, increases the randomness and difficulty of event handling, and hinders improvement of the working efficiency of technical staff.
In view of the above problem, this embodiment proposes an abnormal data prediction apparatus. The apparatus predicts abnormal data by means of abnormal data, such as historical asset losses, and a prediction model. The resulting prediction can serve as a reference for technical staff and reduces the randomness and difficulty of event handling. This not only helps technical staff formulate the most suitable countermeasures, but also effectively improves their working efficiency.
In an optional implementation of this embodiment, the abnormal data may be any abnormal data that can be recorded, counted, and monitored, such as asset loss data, goods loss data, system failure data, machine alarm data, and the like.
In an optional implementation of this embodiment, as shown in Fig. 8, the acquisition module 701 includes:
An acquisition submodule 801, configured to obtain first historical abnormal data within a first historical preset time period as the training abnormal data;
A first extraction submodule 802, configured to extract features of the first historical abnormal data.
In this implementation, the training abnormal data may be historical data that has already been generated. For example, the first historical abnormal data within the first historical preset time period is taken as the training abnormal data, and features of the training abnormal data are then extracted as input data for the subsequent prediction model.
The first historical preset time period may be determined according to the needs of the practical application and the characteristics of the abnormal data; for example, it may be set to 6 months or 1 year. The present invention does not specifically limit the value of the first historical preset time period.
In an optional implementation of this embodiment, the features of the abnormal data may include one or more of the following: attribute features, time identifier features, validity evaluation features, statistical features, operation features, and the like.
The attribute features characterize attributes of the abnormal data, such as the category of the abnormal data, the size of the abnormal data, and the like.
In the prior art, in order to control to some extent the proportion of asset loss events that are handled and the time spent handling them, and also to filter out asset loss events caused by operator error, an asset loss amount threshold is usually set; handling of an asset loss event is triggered only when its loss amount exceeds the threshold. Because this prior-art approach does not predict asset loss events, it lacks forward-looking analysis of the data and therefore often leads technicians into unnecessary alarm and overreaction. In fact, even adding prediction of asset loss events does not by itself effectively prevent overreaction. After carefully examining and analyzing abnormal data such as asset loss data, the inventors found that some asset loss data recurs with a certain periodicity. For example, during the large-scale "Double 11" promotions that e-commerce platforms run every year, the number of user orders surges, criminals take advantage of the opportunity, and cases of online fraud rise sharply. Therefore, if the periodicity of asset loss data is analyzed and combined with the prediction of asset loss data, technicians can be given a more reasonable handling threshold, which effectively prevents overreaction and improves their working efficiency. It also allows technicians to gain a preliminary understanding of asset loss data that may occur in the future, or to draw up possible countermeasure plans in advance.
Therefore, in an optional implementation of this embodiment, a time identifier feature of the abnormal data is also extracted when the features of the abnormal data are extracted. The time identifier feature characterizes the time at which the abnormal data occurred, for example: whether the time of occurrence falls within a major holiday such as the Spring Festival or National Day; whether it falls within a specific promotion period such as Double 11 or 618; whether it coincides with a particular event (for example, organized fraud rings operating in batches may cause a large increase in asset loss amounts, and the bankruptcy of a service enterprise, such as a bike-sharing company, may cause an increase in asset loss events); and so on. Clearly, extracting the time identifier feature allows regularities in the abnormal data to be recognized effectively. In practical applications, the time of occurrence of the abnormal data can be one-hot encoded to obtain the time identifier feature of the abnormal data.
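One-hot encoding of the time of occurrence might look like the following minimal sketch. The calendar of special periods, the category names, and the helper functions are assumptions chosen for illustration; the embodiment does not fix which periods are encoded or how.

```python
from datetime import date
from typing import List

# Hypothetical calendar of special periods. The date ranges below are
# illustrative assumptions, not values taken from the embodiment.
SPECIAL_PERIODS = {
    "double_11": (date(2017, 11, 11), date(2017, 11, 11)),
    "618": (date(2017, 6, 18), date(2017, 6, 18)),
    "national_day": (date(2017, 10, 1), date(2017, 10, 7)),
}
# Every day that matches no special period falls into "ordinary".
CATEGORIES = list(SPECIAL_PERIODS) + ["ordinary"]


def time_identifier(day: date) -> str:
    """Return the name of the special period containing `day`."""
    for name, (start, end) in SPECIAL_PERIODS.items():
        if start <= day <= end:
            return name
    return "ordinary"


def one_hot(day: date) -> List[int]:
    """One-hot encode the time identifier of `day` over CATEGORIES."""
    label = time_identifier(day)
    return [1 if label == name else 0 for name in CATEGORIES]
```

With this calendar, a record dated November 11, 2017 is encoded as [1, 0, 0, 0], and a record on an ordinary day as [0, 0, 0, 1].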
The validity evaluation feature characterizes the validity of the training abnormal data. The earlier training abnormal data occurred, the lower its reference value. To take the validity of the training abnormal data into account, in an optional implementation of this embodiment a validity evaluation is also performed on the training abnormal data, and the resulting evaluation is used as one of its features in training the subsequent prediction model.
In an optional implementation of this embodiment, an evaluation model is used to evaluate the validity of the training abnormal data. The evaluation model may be trained on abnormal data that occurred before the training abnormal data. The evaluation model may be, for example, LightGBM, XGBoost, a logistic regression (LR) model, a random forest (RF) model, or a gradient boosting decision tree (GBDT) model; other suitable models may of course also be selected, and the present invention does not specifically limit this.
The statistical features characterize statistical properties of the training abnormal data. The statistical features may be, for example, the minimum, maximum, mean, standard deviation, first value, and last value of the training abnormal data.
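The statistical features listed above can be computed as in the following minimal sketch. The function name and the dictionary keys are illustrative, and the use of the population standard deviation is an assumption, since the embodiment does not say which variant is meant.

```python
import statistics
from typing import Dict, Sequence


def statistical_features(values: Sequence[float]) -> Dict[str, float]:
    """Compute the statistical features of one series of abnormal data."""
    if not values:
        raise ValueError("at least one value is required")
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        # Population standard deviation (an assumption of this sketch).
        "std": statistics.pstdev(values),
        "first": values[0],
        "last": values[-1],
    }
```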
The operation features characterize operational properties of the training abnormal data and can be obtained by applying preset operation rules to the training abnormal data. The preset operation rules may include one or more of the following: summation, averaging, addition, subtraction, multiplication and division, distance calculation, sliding window processing, and the like.
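As an illustration of such operation rules, the sketch below applies summation, averaging, and subtraction inside a sliding window. The function name, the window size, and the choice of which rules to combine are assumptions made for the example.

```python
from typing import Dict, List, Sequence


def sliding_window_features(
    values: Sequence[float], window: int
) -> List[Dict[str, float]]:
    """Apply sum, mean, and last-minus-first to each window of `values`."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    features = []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        total = sum(chunk)
        features.append({
            "sum": total,                  # summation rule
            "mean": total / window,        # averaging rule
            "diff": chunk[-1] - chunk[0],  # subtraction rule
        })
    return features
```

For a daily loss series, a window of 7 would produce weekly totals, weekly averages, and the change across each week.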
In an optional implementation of this embodiment, to facilitate training and computation of the prediction model, the training abnormal data is also discretized before its features are extracted, yielding discrete training data; alternatively, after features are extracted from the training abnormal data, the continuous features are discretized to obtain discrete features.
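Discretization of a continuous feature can be sketched as simple binning. The bin edges in the usage note are hypothetical; the embodiment does not prescribe a particular discretization scheme.

```python
from bisect import bisect_right
from typing import List, Sequence


def discretize(value: float, edges: Sequence[float]) -> int:
    """Map a continuous value to the index of the bin it falls into.

    `edges` must be sorted in ascending order. A value below the first
    edge maps to bin 0, and a value equal to an edge goes to the bin
    above that edge.
    """
    return bisect_right(edges, value)


def discretize_all(values: Sequence[float], edges: Sequence[float]) -> List[int]:
    """Discretize every value in a series with the same bin edges."""
    return [discretize(v, edges) for v in values]
```

With the hypothetical edges [100, 1000, 10000], loss amounts of 50, 100, 999, and 10000 fall into bins 0, 1, 1, and 3 respectively.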
In an optional implementation of this embodiment, the features of the training abnormal data may also be extracted with the help of feature extraction tools such as Featuretools and alpha trion, which the present invention does not describe further.
In an optional implementation of this embodiment, as shown in Fig. 9, the training module 702 includes:
A first determination submodule 901, configured to determine an abnormal data prediction model to be trained;
A training submodule 902, configured to input the features of the training abnormal data into the abnormal data prediction model to be trained, and to obtain the abnormal data prediction model through training.
To improve the effectiveness of training, in this implementation the first determination submodule 901 determines the abnormal data prediction model to be trained according to the data characteristics of the training abnormal data, and the training submodule 902 then inputs the features of the training abnormal data into the abnormal data prediction model to be trained and trains it to obtain the abnormal data prediction model.
In an optional implementation of this embodiment, the features of the training abnormal data may be combined according to the needs of the practical application, and the combined features used as input to the abnormal data prediction model to be trained. For example, within specific promotion periods such as Double 11 and 618, daytime and nighttime abnormal data are strongly correlated, so the features of the daytime and nighttime abnormal data within the promotion period can be combined and used together as input to the abnormal data prediction model to be trained.
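One simple way to combine daytime and nighttime features is to merge them into a single input record, as in the sketch below. The key prefixes and the extra difference feature are assumptions for illustration; the embodiment does not specify how the combination is performed.

```python
from typing import Dict


def combine_day_night(
    day: Dict[str, float], night: Dict[str, float]
) -> Dict[str, float]:
    """Merge daytime and nighttime feature dicts into one input record."""
    combined = {f"day_{name}": value for name, value in day.items()}
    combined.update({f"night_{name}": value for name, value in night.items()})
    # Hypothetical cross feature relating the two halves of the day.
    if "mean" in day and "mean" in night:
        combined["day_night_mean_diff"] = day["mean"] - night["mean"]
    return combined
```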
In an optional implementation of this embodiment, the training module 702 further includes a part that verifies the validity of the abnormal data prediction model; that is, as shown in Fig. 10, the training module 702 includes:
A first determination submodule 1001, configured to determine an abnormal data prediction model to be trained;
A training submodule 1002, configured to input the features of the training abnormal data into the abnormal data prediction model to be trained, and to obtain the abnormal data prediction model through training;
A first verification submodule 1003, configured to verify the validity of the abnormal data prediction model.
To ensure the validity of the abnormal data prediction model, in this implementation the validity of the trained abnormal data prediction model is also verified.
In an optional implementation of this embodiment, as shown in Fig. 11, the first verification submodule 1003 includes:
A second determination submodule 1101, configured to determine a verification time and obtain second historical abnormal data within a second historical preset time period before the verification time;
A second extraction submodule 1102, configured to extract features of the second historical abnormal data;
An input submodule 1103, configured to input the features of the second historical abnormal data into the abnormal data prediction model to obtain verification predicted abnormal data;
A second verification submodule 1104, configured to verify the validity of the abnormal data prediction model according to the difference between the verification predicted abnormal data and the true abnormal data at the verification time.
In this implementation, the second determination submodule 1101 determines a verification time for which the abnormal data that occurred is already known, so that it can later be used to verify the validity of the abnormal data prediction model, and obtains the second historical abnormal data within the second historical preset time period before the verification time. The second historical preset time period differs from the first historical preset time period, although the two may intersect or partially overlap. The second extraction submodule 1102 extracts the features of the second historical abnormal data; this can follow the method described above for extracting features of the training abnormal data, and the same types of features are extracted. The input submodule 1103 inputs the features of the second historical abnormal data into the abnormal data prediction model to obtain the predicted abnormal data used for verification. The second verification submodule 1104 compares the verification predicted abnormal data with the true abnormal data at the verification time and derives the validity verification result from the difference between the two: if the difference is less than a preset difference criterion, the abnormal data prediction model is considered valid; otherwise it is considered invalid and must be retrained. Of course, evaluation algorithms such as the object intrusion algorithm, mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE) may also be used to evaluate the validity of the abnormal data prediction model. Those skilled in the art can select a suitable validity evaluation method according to the needs of the practical application and the characteristics of the abnormal data prediction model and of its input and output data; the present invention does not specifically limit this.
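The comparison performed by the second verification submodule can be sketched with the error measures named above. The function names, the default choice of MAE, and the threshold values in the usage note are assumptions; the embodiment only requires that the difference be less than a preset criterion.

```python
import math
from typing import Callable, Sequence

Metric = Callable[[Sequence[float], Sequence[float]], float]


def _check(predicted: Sequence[float], actual: Sequence[float]) -> None:
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")


def mse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean squared error between predictions and true values."""
    _check(predicted, actual)
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(mse(predicted, actual))


def mae(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute error."""
    _check(predicted, actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def is_model_valid(
    predicted: Sequence[float],
    actual: Sequence[float],
    threshold: float,
    metric: Metric = mae,
) -> bool:
    """The model counts as valid only if the error is below the threshold."""
    return metric(predicted, actual) < threshold
```

For predictions [10, 12, 14] against true values [11, 12, 16], the MAE is 1.0, so the model passes a threshold of 1.5 but fails a threshold of 1.0 and would be retrained.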
The verification time may be a single time unit, such as 1 day, or multiple time units, such as 2 days. The verification time may be a discrete time or a continuous period, such as from November 5, 2017 to November 15, 2017. Those skilled in the art can set the verification time according to the needs of the practical application; the present invention does not specifically limit this.
In an optional implementation of this embodiment, as shown in Fig. 12, the prediction module 703 includes:
A third determination submodule 1201, configured to determine a time to be predicted and obtain third historical abnormal data within a third historical preset time period before the time to be predicted;
A third extraction submodule 1202, configured to extract features of the third historical abnormal data;
A prediction submodule 1203, configured to input the features of the third historical abnormal data into the abnormal data prediction model to obtain the abnormal data prediction result for the time to be predicted.
In this implementation, during actual prediction, the third determination submodule 1201 determines the time to be predicted and obtains the third historical abnormal data within the third historical preset time period before the time to be predicted. The third historical preset time period may be a historical period immediately preceding the time to be predicted, or a historical period separated from the time to be predicted by a certain interval. The third extraction submodule 1202 extracts the features of the third historical abnormal data; this can follow the method described above for extracting features of the training abnormal data, and the same types of features are extracted. The prediction submodule 1203 inputs the features of the third historical abnormal data into the abnormal data prediction model to obtain the abnormal data prediction result for the time to be predicted. The prediction result can subsequently be provided to technicians to help them formulate targeted countermeasures.
The time to be predicted may be a time point or a time interval. If it is a time interval, the third historical abnormal data must be chosen accordingly. For example, if the time interval to be predicted is the 5 days from November 21, 2017 to November 25, 2017 and the third historical preset time period is set to 10 days, then the third historical abnormal data used to predict November 21, 2017 may be the abnormal data from November 11, 2017 to November 20, 2017, or the abnormal data from November 1, 2017 to November 10, 2017. The third historical abnormal data used to predict November 22, 2017 may be the prediction for November 21, 2017 together with the abnormal data from November 10, 2017 to November 20, 2017, or the abnormal data from November 2, 2017 to November 11, 2017, and so on.
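The choice of the history window for each day to be predicted can be sketched as follows. The function name and the `gap_days` parameter are assumptions introduced for the example. The sketch covers only windows of real historical data; the variant in which earlier predictions are fed back as part of the history is not shown.

```python
from datetime import date, timedelta
from typing import Tuple


def history_window(
    target_day: date, window_days: int = 10, gap_days: int = 0
) -> Tuple[date, date]:
    """Return the first and last day of the history used for `target_day`.

    With gap_days=0 the window ends the day before the target day; a
    positive gap shifts the whole window further into the past.
    """
    end = target_day - timedelta(days=1 + gap_days)
    start = end - timedelta(days=window_days - 1)
    return start, end
```

For November 21, 2017 with a 10-day window, a gap of 0 gives November 11 to November 20, and a gap of 10 gives November 1 to November 10. For November 22, 2017 with a gap of 10, the window slides to November 2 to November 11.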
An embodiment of the present invention also discloses an electronic device. Fig. 13 shows a structural block diagram of the electronic device according to an embodiment of the present invention. As shown in Fig. 13, the electronic device 1300 includes a memory 1301 and a processor 1302, wherein
the memory 1301 is used to store one or more computer instructions, and the one or more computer instructions are executed by the processor 1302 to implement any of the method steps described above.
Fig. 14 is a schematic structural diagram of a computer system suitable for implementing the abnormal data prediction method according to an embodiment of the present invention.
As shown in Fig. 14, the computer system 1400 includes a central processing unit (CPU) 1401, which can perform the various processes of the above embodiments according to a program stored in a read-only memory (ROM) 1402 or a program loaded from a storage section 1408 into a random access memory (RAM) 1403. The RAM 1403 also stores the various programs and data required for the operation of the system 1400. The CPU 1401, the ROM 1402, and the RAM 1403 are connected to one another by a bus 1404. An input/output (I/O) interface 1405 is also connected to the bus 1404.
The following components are connected to the I/O interface 1405: an input section 1406 including a keyboard, a mouse, and the like; an output section 1407 including a cathode ray tube (CRT) or liquid crystal display (LCD), a loudspeaker, and the like; a storage section 1408 including a hard disk and the like; and a communication section 1409 including a network interface card such as a LAN card or a modem. The communication section 1409 performs communication processing via a network such as the Internet. A drive 1410 is also connected to the I/O interface 1405 as needed. A removable medium 1411, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1410 as needed, so that a computer program read from it can be installed into the storage section 1408 as needed.
In particular, according to embodiments of the present invention, the method described above may be implemented as a computer software program. For example, embodiments of the present invention include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program containing program code for executing the abnormal data prediction method. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1409, and/or installed from the removable medium 1411.
The flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that shown in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units or modules described in the embodiments of the present invention may be implemented in software or in hardware. The described units or modules may also be provided in a processor, and in some cases the names of these units or modules do not limit the units or modules themselves.
In another aspect, an embodiment of the present invention also provides a computer-readable storage medium. The computer-readable storage medium may be the one included in the apparatus described in the above embodiments, or it may exist separately without being assembled into a device. The computer-readable storage medium stores one or more programs, which are used by one or more processors to execute the methods described in the embodiments of the present invention.
The above description covers only preferred embodiments of the present invention and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the embodiments of the present invention is not limited to technical solutions formed by the specific combination of the above technical features; it also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the embodiments of the present invention.