CN110322357A

CN110322357A - Data anomaly assessment method, apparatus, computer device, and medium

Info

Publication number: CN110322357A
Application number: CN201910463901.7A
Authority: CN
Inventors: 李金乐
Original assignee: OneConnect Smart Technology Co Ltd
Current assignee: OneConnect Smart Technology Co Ltd
Priority date: 2019-05-30
Filing date: 2019-05-30
Publication date: 2019-10-11

Abstract

The present application provides a data anomaly assessment method, apparatus, computer device, and medium. The data anomaly assessment method obtains the test data to be evaluated and imports the test data into a detection system; extracts the feature data of the claim data, processes it with the scoring model of the detection system, and calculates a test data score; and compares the test data score with the model data score to obtain a risk value and a risk result. The application uses PCA, an unsupervised learning algorithm, to learn the overall distribution profile of normal data. Because it is based on anomaly detection, it does not need to consider the distribution or variation of historical abnormal data, and its accuracy is high.

Description

Data anomaly assessment method, apparatus, computer device, and medium

Technical field

The present application relates to the field of insurance anti-fraud, and in particular to a data anomaly assessment method, apparatus, computer device, and medium.

Background technique

Current anti-fraud risk-control scoring models in insurance have the following pain point: in the historical claim data of most insurance companies, records of fraudulent data are rare, and the ratio of abundant normal data to scarce abnormal data is extremely unbalanced. As a result, many supervised machine-learning risk-control models either cannot be used or are limited to a single learning mode and perform poorly. A method is therefore needed that can identify fraudulent data by referring to a large amount of normal data.

Summary of the invention

The main purpose of the present application is to provide a data anomaly assessment method, apparatus, computer device, and medium, intended to solve the above problem.

To achieve the above purpose, the present application provides a data anomaly assessment method, comprising the steps of:

Obtaining the normal data in a preset quantity of historical test data;

Performing feature screening on the normal data to obtain all essential features of the normal data and a plurality of first feature data corresponding to each essential feature;

Performing feature restoration on the plurality of first feature data to obtain a plurality of historical restored data;

Calculating first difference values between the plurality of first feature data and the plurality of historical restored data;

Substituting the plurality of first difference values into a sigmoid function to map them to (0, 1), then amplifying the results by a preset multiple to obtain a plurality of risk scores for the normal data, and taking their maximum as the model data score S_model;

Obtaining the test data to be evaluated;

Performing feature screening on the test data according to the essential features of the normal data to obtain all essential features of the test data and the second feature data corresponding to each essential feature;

Calculating a second difference value between the second feature data and the historical restored data;

Substituting the second difference value into the sigmoid function to map it to (0, 1), then amplifying the result by the preset multiple to obtain a plurality of risk scores for the test data, and taking their maximum as the test data score S_test;

Comparing the model data score S_model and the test data score S_test according to preset rules to obtain a risk result.

Further, the step of performing feature restoration on the first feature data to obtain the historical restored data comprises:

Normalizing the first feature data to obtain a normalized model of the historical data;

Converting the normalized model of the historical data into a first feature matrix;

Performing feature restoration on the first feature matrix by the PCA inverse transform to obtain the historical restored data.

Further, the step of performing feature screening on the normal data to obtain all essential features of the normal data and the first feature data corresponding to each essential feature comprises:

Identifying all features of the normal data;

If the number of distinct feature values of a feature is less than or equal to 3, determining it to be a non-essential feature;

If the number of distinct feature values of a feature is greater than 3, determining it to be an essential feature;

Removing the non-essential features to obtain all essential features of the normal data and a plurality of first feature data corresponding to each essential feature.

Further, the step of comparing the model data score S_model and the test data score S_test according to preset rules to obtain a risk result comprises:

If S_test > S_model, determining that a risk exists;

If S_model * 90% < S_test < S_model, determining that a risk may exist;

If S_test < S_model * 90%, determining that no risk exists.

Further, the step of normalizing the first feature data to obtain the normalized model of the historical data comprises:

Obtaining the maximum and minimum values of the same feature, and calculating the difference between the maximum and the minimum;

Subtracting the minimum from each data item of the feature in turn and dividing the result by the difference to obtain the normalized feature value;

Computing the normalized feature values for all data of all features to obtain the normalized model.

Further, the step of converting the normalized model of the historical data into the first feature matrix comprises:

Extracting the principal components whose contribution rate in the normalized model of the historical data exceeds 95%, to obtain the first feature matrix composed of feature vectors.

Further, the contribution rate is: $\mathrm{contrib} = \sum_{i=1}^{d} s_i \,/\, \sum_{i=1}^{m} s_i$, where contrib is the contribution rate and s_i is an eigenvalue.

The present application also proposes a data anomaly assessment apparatus, comprising:

A first acquisition unit, for obtaining the normal data in a preset quantity of historical test data;

A first screening unit, for performing feature screening on the normal data to obtain all essential features of the normal data and a plurality of first feature data corresponding to each essential feature;

A restoration unit, for performing feature restoration on the plurality of first feature data to obtain a plurality of historical restored data;

A first computing unit, for calculating first difference values between the plurality of first feature data and the plurality of historical restored data; and for substituting the plurality of first difference values into a sigmoid function to map them to (0, 1), then amplifying the results by a preset multiple to obtain a plurality of risk scores for the normal data, and taking their maximum as the model data score S_model;

A second acquisition unit, for obtaining the test data to be evaluated;

A second screening unit, for performing feature screening on the test data according to the essential features of the normal data to obtain all essential features of the test data and the second feature data corresponding to each essential feature;

A second computing unit, for calculating a second difference value between the second feature data and the historical restored data; and for substituting the second difference value into the sigmoid function to map it to (0, 1), then amplifying the result by the preset multiple to obtain a plurality of risk scores for the test data, and taking their maximum as the test data score S_test;

A judging unit, for comparing the model data score S_model and the test data score S_test according to preset rules to obtain a risk result.

The present application also proposes a computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of any of the above methods when executing the computer program.

The present application also proposes a computer-readable storage medium storing a computer program, wherein the computer program implements the steps of any of the above methods when executed by a processor.

The present application provides a data anomaly assessment method, apparatus, computer device, and medium. The data anomaly assessment method obtains the claim data to be evaluated and imports the claim data into a detection system; extracts the feature data of the claim data, processes it with the scoring model of the detection system, and calculates a claim data score; and compares the claim data score with the model data score to obtain a risk value and a risk result. The application uses PCA, an unsupervised learning algorithm, to learn the overall distribution profile of normal data. Because it is based on anomaly detection, it does not need to consider the distribution or variation of historical fraud data, and its accuracy is high.

Brief description of the drawings

Fig. 1 is a schematic diagram of the steps of the data anomaly assessment method in one embodiment of the application;

Fig. 2 is a schematic diagram of the data anomaly assessment apparatus in one embodiment of the application;

Fig. 3 is a schematic structural block diagram of the computer device of one embodiment of the application.

The realization of the purpose, the functional characteristics, and the advantages of the application are further described below with reference to the accompanying drawings and in conjunction with the embodiments.

Specific embodiment

To make the purposes, technical solutions, and advantages of the application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the application, not to limit it.

Referring to Fig. 1, the application proposes a data anomaly assessment method, comprising the steps of:

S1, obtaining the normal data in a preset quantity of historical test data;

S2, performing feature screening on the normal data to obtain all essential features of the normal data and a plurality of first feature data corresponding to each essential feature;

S3, performing feature restoration on the plurality of first feature data to obtain a plurality of historical restored data;

S4, calculating first difference values between the plurality of first feature data and the plurality of historical restored data;

S5, substituting the plurality of first difference values into a sigmoid function to map them to (0, 1), then amplifying the results by a preset multiple to obtain a plurality of risk scores for the normal data, and taking their maximum as the model data score S_model;

S6, obtaining the test data to be evaluated;

S7, performing feature screening on the test data according to the essential features of the normal data to obtain all essential features of the test data and the second feature data corresponding to each essential feature;

S8, calculating a second difference value between the second feature data and the historical restored data;

S9, substituting the second difference value into the sigmoid function to map it to (0, 1), then amplifying the result by the preset multiple to obtain a plurality of risk scores for the test data, and taking their maximum as the test data score S_test;

S10, comparing the model data score S_model and the test data score S_test according to preset rules to obtain a risk result.

As described in step S1, the normal data in the historical test data refers to the normal data of historical settled insurance claim cases, that is, the data that remains after the abnormal fraud cases are removed from those cases. Once modeled, the normal data of the historical settled cases reflects the behavior profile of normal data.

As described in step S2, feature screening means selecting a feature subset according to business needs. The features include metric-valued features and non-metric-valued features; screening identifies which features are metric-valued and which are not and retains only the metric-valued features, which yields the required feature data. Feature data refers to the specific values of the attribute features in the claim data. For example, when the claim data is a personal insurance claim, its attribute features and specific data are [length of this hospital stay (10), this length of stay as a percentage of the maximum length of stay for similar diseases (67%), amount of this claim (50000), number of hospitals in this claim (1), patient age (45), gender, ...]. The attribute features are then [length of this hospital stay, this length of stay as a percentage of the maximum length of stay for similar diseases, amount of this claim, number of hospitals in this claim, patient age, ...], and the feature data are [(10), (67%), (50000), (1), (45), ...].

As described in step S3, a self-built algorithm (for the detailed process, see the next embodiment) performs feature restoration on the plurality of first feature data to obtain a plurality of historical restored data.

As described in step S4, the first difference value diff represents the difference between the historical restored data and the first feature data after the PCA transformation. The formula used is:

diff = sum(diff1, diff2, ..., diffm), where

diff1 = (X1 - X1') / mean(X1)

diff2 = (X2 - X2') / mean(X2)

......

diffm = (Xm - Xm') / mean(Xm)

mean() denotes taking the average.
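The difference formula can be sketched in a few lines of Python. This is a minimal illustration, not the applicant's implementation: it assumes that X_k is the k-th feature value of one record, X_k' is its restored value, and mean(X_k) is the average of feature k over the normal data. The function names and the sample numbers are hypothetical.

```python
def feature_means(records):
    """Average of each feature column over all records (assumed meaning of mean(Xk))."""
    n = len(records)
    return [sum(r[k] for r in records) / n for k in range(len(records[0]))]


def difference_value(x, x_restored, means):
    """diff = sum over features k of (Xk - Xk') / mean(Xk), signed as in the text."""
    return sum((xk - xr) / mk for xk, xr, mk in zip(x, x_restored, means))


# Hypothetical normal records: [length of stay, claim amount]
records = [[10.0, 50000.0], [12.0, 30000.0]]
means = feature_means(records)

# Hypothetical restored values for the first record
diff = difference_value(records[0], [9.45, 48000.0], means)
```

Because the terms are signed, positive and negative deviations can cancel; a feature whose mean is zero would also need separate handling, which the text does not address.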

As described in step S5, the scoring formula is: y = n / (1 + e^(-a*diff + b)). In the formula, n is the preset multiple, and a and b are two adjustment factors. S_model is the maximum value obtained by applying the scoring formula to all normal training data.
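As a hedged sketch of this scoring step, the formula y = n / (1 + e^(-a*diff + b)) can be written directly in Python. The default values n = 100, a = 1 and b = 0 are assumptions chosen for illustration; the text does not state the values used.

```python
import math


def risk_score(diff, n=100.0, a=1.0, b=0.0):
    """y = n / (1 + e^(-a*diff + b)); n, a and b are illustrative defaults."""
    return n / (1.0 + math.exp(-a * diff + b))


def max_risk_score(diffs, n=100.0, a=1.0, b=0.0):
    """Largest risk score over a collection of difference values."""
    return max(risk_score(d, n, a, b) for d in diffs)


# Hypothetical difference values for three normal records
s_model = max_risk_score([0.0, 0.1, 0.4])
```

With a > 0 the score increases with diff, so the maximum score belongs to the record whose restored values deviate most from its original values.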

As described in step S6, the test data to be evaluated refers to the claim data that is to undergo insurance anti-fraud detection. It is obtained through the data anomaly assessment system or model, which provides a data import interface: the claim data (the test data to be evaluated) can be obtained by dragging a data file into a window, by entering the data directly, or by similar means.

As described in step S7, performing feature screening on the test data according to the essential features of the normal data means taking all the metric-valued features obtained when the normal data was screened, finding the corresponding metric-valued features in the claim data, and removing the remaining features.

As described in step S8, the second difference value Diff represents the difference between the historical restored data and the second feature data after the PCA transformation. The formula used is the same as the one used to calculate the first difference value diff and is not repeated here.

As described in step S9, the scoring formula is: y = n / (1 + e^(-a*Diff + b)). In the formula, n is the preset multiple, and a and b are two adjustment factors. S_test is the maximum value obtained by applying the scoring formula to the test data.

As described in step S10, comparing the model data score S_model with the test data score S_test yields their relative magnitude and the range of the gap between them; the preset rules map that magnitude and gap range to the corresponding risk result.

In one embodiment, the step of performing feature restoration on the first feature data to obtain the historical restored data comprises:

S10, normalizing the first feature data to obtain a normalized model of the historical data;

S20, converting the normalized model of the historical data into a first feature matrix;

S30, performing feature restoration on the first feature matrix by the PCA inverse transform to obtain the historical restored data.

In the present embodiment, PCA (Principal Components Analysis) refers to the PCA algorithm, an unsupervised learning algorithm that is used in the application mainly for feature extraction and dimensionality reduction.

As described in step S10, one piece of claim data contains multiple features and feature data. A single feature datum is also called an element, and all the feature data of one feature form the feature subset of that feature. Normalizing the first feature data to obtain the normalized model of the claim data means that every element must be normalized. Once all the feature data have been normalized, the normalized model is obtained.

As described in step S20, the normalized model obtained in step S10 is converted into a feature matrix based on the PCA algorithm. Specifically, assume X is an m*n matrix representing the m-feature data of n objects, that is, each column represents one object and each row represents one feature. The aim is to reduce the features to d dimensions, where d is much smaller than m. The output is Y, a d*n matrix. The algorithm is as follows:

(1) Let X = [x₁, x₂, ..., x_n] and compute the mean of the object points: x₀ = (1/n) * sum(x₁, x₂, ..., x_n).

(2) Denote the centered result X - x₀ = [x₁ - x₀, x₂ - x₀, ..., x_n - x₀] and apply singular value decomposition (Singular Value Decomposition, SVD for short) to it: X - x₀ = U Λ V^T.

(3) Then x₀ is the origin of the new coordinate system, and the first d columns of the matrix U form the new coordinate system after centering, denoted W. The representation of all points in the new coordinate system is Y = W^T * (X - x₀). Likewise, a projected point y is restored to the original coordinate system (that is, the PCA inverse transform) as x₀ + W*y.

As described in step S30, for the training data x*, PCA training that retains 95% of the information yields W and Y; Y is then passed through the PCA inverse transform x₀ + W*y to obtain the historical restored data.
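The projection and inverse transform can be illustrated with a small, dependency-free Python sketch. It makes several simplifying assumptions that are not from the text: it keeps only d = 1 principal component rather than choosing d by the 95% rule, it finds that component by power iteration on the scatter matrix (X - x₀)(X - x₀)^T instead of a full SVD, and it stores one object per row rather than per column. The function name `pca_restore` is hypothetical.

```python
import math


def pca_restore(points, iterations=200):
    """Project each point onto the leading principal axis and map it back.

    points: list of n objects, each a list of m feature values.
    Returns the restored points x0 + W*y with W holding a single column.
    """
    n, m = len(points), len(points[0])
    # Step (1): mean point x0
    x0 = [sum(p[j] for p in points) / n for j in range(m)]
    # Step (2): centered data X - x0
    centered = [[p[j] - x0[j] for j in range(m)] for p in points]
    # Scatter matrix (X - x0)(X - x0)^T, size m x m
    scatter = [[sum(c[i] * c[j] for c in centered) for j in range(m)]
               for i in range(m)]
    # Power iteration for the leading eigenvector w (the first column of U).
    # The all-ones start can fail if it is orthogonal to that eigenvector.
    w = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        v = [sum(scatter[i][j] * w[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            break
        w = [x / norm for x in v]
    # Step (3): y = W^T (x - x0), then inverse transform x0 + W*y
    restored = []
    for c in centered:
        y = sum(w[j] * c[j] for j in range(m))
        restored.append([x0[j] + w[j] * y for j in range(m)])
    return restored


# Points on a straight line are restored almost exactly;
# a point off the line is pulled back toward it.
line = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
restored_line = pca_restore(line)
restored_noisy = pca_restore(line + [[1.5, 3.0]])
```

The gap between a point and its restored version is what the difference value measures: data that follows the learned profile is restored closely, while data that departs from it is not.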

In one embodiment, the step of performing feature screening on the normal data to obtain all essential features of the normal data and the first feature data corresponding to each essential feature comprises:

Identifying all features of the normal data;

In the present embodiment, the non-essential features are the non-metric-valued features, which have no practical effect on the behavior profile of the normal data. The essential features are the metric-valued features, which do affect the behavior profile of the normal data. Removing the non-essential features and retaining the essential features therefore makes the data anomaly assessment method more accurate, while reducing the amount of computation, lowering the error rate, and improving assessment efficiency.
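A minimal sketch of the screening rule follows, assuming that "number of feature values" means the number of distinct values a feature takes across the normal records: a feature with more than 3 distinct values is kept as essential, and the others are dropped. The record layout (a list of dictionaries) and the field names are hypothetical.

```python
def essential_features(records, max_distinct=3):
    """Names of features that take more than max_distinct distinct values."""
    return [name for name in records[0]
            if len({r[name] for r in records}) > max_distinct]


def screen(records, features):
    """Keep only the given features, in a fixed order, for every record."""
    return [[r[name] for name in features] for r in records]


# Hypothetical normal claim records
normal = [
    {"stay_days": 10, "amount": 50000, "gender": "M"},
    {"stay_days": 3, "amount": 8000, "gender": "F"},
    {"stay_days": 7, "amount": 21000, "gender": "M"},
    {"stay_days": 21, "amount": 96000, "gender": "F"},
]
kept = essential_features(normal)
first_feature_data = screen(normal, kept)

# The same feature list is reused for the data under evaluation
test_record = {"stay_days": 40, "amount": 300000, "gender": "M"}
second_feature_data = screen([test_record], kept)
```

Reusing the feature list from the normal data keeps the test data in the same feature space as the model built from the normal data.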

In one embodiment, the step of comparing the model data score S_model and the test data score S_test according to preset rules to obtain a risk result comprises:

If S_test > S_model, determining that a risk exists;

If S_test < S_model * 90%, determining that no risk exists.

In the present embodiment, the model data score reflects the behavior profile of the normal data of historical settled insurance claim cases. If the claim data score is greater than the model data score, the behavior profile of the claim data differs considerably from that of the normal data, which indicates that a risk exists. If the claim data score is less than the model data score but greater than 90% of it, the behavior profile of the claim data tends to deviate from that of the normal data, which indicates that a risk may exist. If the claim data score is less than 90% of the model data score, the behavior profile of the claim data is consistent with that of the normal data, which indicates that no risk exists.
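The three-way comparison can be sketched as a small Python function. The rules in the text use strict inequalities and do not say what happens when the test score equals the model score or exactly 90% of it; the sketch below assigns those boundary cases to "possible risk" and "no risk" respectively, which is an assumption. The function name and the result labels are hypothetical.

```python
def risk_result(s_test, s_model, ratio=0.9):
    """Map the test score and the model score to a risk result."""
    if s_test > s_model:
        return "risk"
    if s_test > s_model * ratio:
        # Boundary s_test == s_model lands here (assumption)
        return "possible risk"
    # Boundary s_test == s_model * ratio lands here (assumption)
    return "no risk"
```

For example, with a model score of 90 the "possible risk" band runs from just above 81 up to 90.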

In one embodiment, the step of normalizing the first feature data to obtain the normalized model of the historical data comprises:

In the present embodiment, normalization confines the data to be processed, after processing by some algorithm, to a certain range. Normalization first makes subsequent data processing convenient, and second ensures faster convergence when the program runs. Its specific role is to unify the statistical distribution of the samples: normalizing to the interval 0-1 gives a statistical probability distribution, while normalizing to another interval gives a statistical coordinate distribution. Normalization thus carries the sense of making data identical, unified, and consistent. In the present embodiment, the data are normalized by min-max linear normalization, using the formula: x* = (X - Xmin) / (Xmax - Xmin).
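A minimal sketch of min-max normalization for one feature column is shown below. The handling of a constant feature, where Xmax equals Xmin and the formula would divide by zero, is an assumption; the text does not cover that case. The function name is hypothetical.

```python
def min_max_normalize(values):
    """x* = (X - Xmin) / (Xmax - Xmin) for every value of one feature."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Constant feature: map everything to 0.0 (assumption, not from the text)
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


# Hypothetical lengths of stay, in days
normalized = min_max_normalize([10, 20, 30])
```

Applying the function to every essential feature column in turn gives the normalized model described above.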

In one embodiment, the step of converting the normalized model of the historical data into the first feature matrix comprises:

In the present embodiment, the eigenvalues [s1, s2, ..., sm] of the matrix (X - x0)(X - x0)^T, which satisfy (X - x0)(X - x0)^T u = s u, are sorted in descending order. The larger an eigenvalue si, the more data information its corresponding eigenvector ui contains. The principal components whose contribution rate in the normalized model exceeds 95% are extracted, giving the first feature matrix composed of the feature vectors of the normalized model of the claim data.

In one embodiment, the contribution rate is: $\mathrm{contrib} = \sum_{i=1}^{d} s_i \,/\, \sum_{i=1}^{m} s_i$, where contrib is the contribution rate and si is an eigenvalue.
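One way to pick the number of retained components is sketched below. It assumes that the contribution rate is read as the cumulative share of the d largest eigenvalues in the sum of all m eigenvalues, which is a common definition, and it returns the smallest d whose share exceeds the threshold. The function name and the sample eigenvalues are hypothetical.

```python
def components_needed(eigenvalues, threshold=0.95):
    """Smallest d such that the d largest eigenvalues exceed the threshold share."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for d, s in enumerate(ordered, start=1):
        running += s
        if running / total > threshold:
            return d
    return len(ordered)


# Hypothetical eigenvalues: the cumulative shares are 0.70, 0.90, 0.97, 1.00
d = components_needed([0.7, 7.0, 0.3, 2.0])
```

The first d columns of U, taken in the same descending order, would then form the matrix W used for projection and restoration.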

The data anomaly assessment method provided by the application obtains the claim data to be evaluated and imports the claim data into a detection system; extracts the feature data of the claim data, processes it with the scoring model of the detection system, and calculates a claim data score; and compares the claim data score with the model data score to obtain a risk value and a risk result. The application uses PCA, an unsupervised learning algorithm, to learn the overall distribution profile of normal data. Because it is based on anomaly detection, it does not need to consider the distribution or variation of fraud data, and its accuracy is high.

Referring to Fig. 2, an embodiment of the application also proposes a data anomaly assessment apparatus, comprising:

A first acquisition unit 10, for obtaining the normal data in a preset quantity of historical test data;

A first screening unit 20, for performing feature screening on the normal data to obtain all essential features of the normal data and a plurality of first feature data corresponding to each essential feature;

A restoration unit 30, for performing feature restoration on the plurality of first feature data to obtain a plurality of historical restored data;

A first computing unit 40, for calculating first difference values between the plurality of first feature data and the plurality of historical restored data; and for substituting the plurality of first difference values into a sigmoid function to map them to (0, 1), then amplifying the results by a preset multiple to obtain a plurality of risk scores for the normal data, and taking their maximum as the model data score S_model;

A second acquisition unit 50, for obtaining the test data to be evaluated;

A second screening unit 60, for performing feature screening on the test data according to the essential features of the normal data to obtain all essential features of the test data and the second feature data corresponding to each essential feature;

A second computing unit 70, for calculating a second difference value between the second feature data and the historical restored data; and for substituting the second difference value into the sigmoid function to map it to (0, 1), then amplifying the result by the preset multiple to obtain a plurality of risk scores for the test data, and taking their maximum as the test data score S_test;

A judging unit 80, for comparing the model data score S_model and the test data score S_test according to preset rules to obtain a risk result.

In the present embodiment, PCA (Principal Components Analysis) refers to the PCA algorithm, an unsupervised learning algorithm that is used in the application mainly for feature extraction and dimensionality reduction.

In the first acquisition unit 10, the normal data of historical settled insurance claim cases is the data that remains after the abnormal fraud cases are removed from those cases. Once modeled, the normal data of the historical settled cases reflects the behavior profile of normal data.

In the first screening unit 20, feature screening means selecting a feature subset according to business needs. The features include metric-valued features and non-metric-valued features; screening identifies which features are metric-valued and which are not and retains only the metric-valued features, which yields the required feature data. Feature data refers to the specific values of the attribute features in the claim data. For example, when the claim data is a personal insurance claim, its attribute features and specific data are [length of this hospital stay (10), this length of stay as a percentage of the maximum length of stay for similar diseases (67%), amount of this claim (50000), number of hospitals in this claim (1), patient age (45), gender, ...]. The attribute features are then [length of this hospital stay, this length of stay as a percentage of the maximum length of stay for similar diseases, amount of this claim, number of hospitals in this claim, patient age, ...], and the feature data are [(10), (67%), (50000), (1), (45), ...].

In the restoration unit 30, a self-built algorithm (for the detailed process, see the next embodiment) performs feature restoration on the plurality of first feature data to obtain a plurality of historical restored data.

In the first computing unit 40, the first difference value diff represents the difference between the historical restored data and the first feature data after the PCA transformation. The formula used is:

diff = sum(diff1, diff2, ..., diffm), where

diff1 = (X1 - X1') / mean(X1)

diff2 = (X2 - X2') / mean(X2)

......

diffm = (Xm - Xm') / mean(Xm)

mean() denotes taking the average.

The scoring formula is: y = n / (1 + e^(-a*diff + b)). In the formula, n is the preset multiple, and a and b are two adjustment factors. S_model is the maximum value obtained by applying the scoring formula to all normal training data.

In the second acquisition unit 50, the claim data to be evaluated refers to the claim data that is to undergo insurance anti-fraud detection. It is obtained through the data anomaly assessment system or model, which provides a data import interface: the claim data can be obtained by dragging a data file into a window, by entering the data directly, or by similar means.

In the second screening unit 60, performing feature screening on the claim data according to the essential features of the normal data means taking all the metric-valued features obtained when the normal data was screened, finding the corresponding metric-valued features in the claim data, and removing the remaining features.

In the second computing unit 70, the second difference value Diff represents the difference between the historical restored data and the second feature data after the PCA transformation. The formula used is the same as the one used to calculate the first difference value diff and is not repeated here.

The scoring formula is: y = n / (1 + e^(-a*Diff + b)). In the formula, n is the preset multiple, and a and b are two adjustment factors. S_test is the maximum value obtained by applying the scoring formula to the test data.

In the judging unit 80, comparing the model data score S_model with the test data score S_test yields their relative magnitude and the range of the gap between them; the preset rules map that magnitude and gap range to the corresponding risk result.

Referring to Fig. 3, an embodiment of the application also proposes a computer device. The computer device may be a server, and its internal structure may be as shown in Fig. 3. The computer device comprises a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores data such as historical insurance claim case data. The network interface of the computer device communicates with external terminals over a network connection. When executed by the processor, the computer program implements a data anomaly assessment method.

The processor executes the following steps of the above method:

Obtaining the normal data in a preset quantity of historical test data;

Obtaining the test data to be evaluated;

Identifying all features of the normal data;

If S_test > S_model, determining that a risk exists;

If S_test < S_model * 90%, determining that no risk exists.

An embodiment of the application also proposes a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements a data anomaly assessment method comprising the steps of:

Obtaining the normal data in a preset quantity of historical test data;

Obtaining the test data to be evaluated;

Identifying all features of the normal data;

If S_test > S_model, determining that a risk exists;

If S_test < S_model * 90%, determining that no risk exists.

Those of ordinary skill in the art will appreciate that realizing all or part of the process in above-described embodiment method, being can be with Relevant hardware is instructed to complete by computer program, the computer program can store and a non-volatile computer In read/write memory medium, the computer program is when being executed, it may include such as the process of the embodiment of above-mentioned each method.Wherein, Any reference used in provided herein and embodiment to memory, storage, database or other media, Including non-volatile and/or volatile memory.Nonvolatile memory may include read-only memory (ROM), programming ROM (PROM), electrically programmable ROM (EPROM), electrically erasable ROM (EEPROM) or flash memory.Volatile memory may include Random access memory (RAM) or external cache.By way of illustration and not limitation, RAM can by diversified forms , such as static state RAM (SRAM), dynamic ram (DRAM), synchronous dram (SDRAM), double speed are according to rate SDRAM (SSRSDRAM), increasing Strong type SDRAM (ESDRAM), synchronization link (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic ram (DRDRAM) and memory bus dynamic ram (RDRAM) etc..

It should be noted that, in this document, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, device, article, or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, device, article, or method. In the absence of further restrictions, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, device, article, or method that includes that element.

The foregoing are merely preferred embodiments of the present application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and accompanying drawings of the present application, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present application.

Claims

1. A method for anomaly assessment of data, characterized by comprising the steps of:

Obtain the normal data from a preset quantity of historical test data;

Perform feature screening on the normal data to obtain all essential features of the normal data and the multiple first feature data corresponding to each essential feature;

Substitute the multiple first difference values into a sigmoid function to map them to (0, 1), then amplify the results by a preset multiple to obtain multiple risk score values of the normal data, and take the maximum of these to obtain the model data score value S_model;

Obtain the test data to be evaluated;

Perform feature screening on the test data according to the essential features of the normal data to obtain all essential features of the test data and the second feature data corresponding to each essential feature;

Substitute the second difference value into the sigmoid function to map it to (0, 1), then amplify the result by the preset multiple to obtain multiple risk score values of the test data, and take the maximum of these to obtain the test data score value S_test;
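By way of illustration and not limitation, the scoring steps recited above (sigmoid mapping to (0, 1), amplification by a preset multiple, and taking the maximum) may be sketched as follows. The names `sigmoid` and `data_score` and the default multiple of 100 are assumptions made for this sketch; the claim does not fix the value of the preset multiple.

```python
import math
from typing import Iterable


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def data_score(difference_values: Iterable[float], multiple: float = 100.0) -> float:
    """Map each difference value through sigmoid, amplify it, keep the maximum.

    difference_values -- first (or second) difference values, one per record
    multiple          -- the preset multiple (100 is an assumed value)
    Raises ValueError if difference_values is empty.
    """
    risk_scores = [multiple * sigmoid(d) for d in difference_values]
    return max(risk_scores)


if __name__ == "__main__":
    s_model = data_score([0.10, 0.35, 0.20])  # differences for normal data
    s_test = data_score([0.15, 0.80])         # differences for test data
    print(s_model, s_test)
```

The same function serves for both S_model and S_test, since the two steps differ only in which difference values are supplied.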

2. The method for anomaly assessment of data according to claim 1, characterized in that the step of performing feature restoration on the first feature data to obtain historical restored data comprises:

3. The method for anomaly assessment of data according to claim 1, characterized in that the step of performing feature screening on the normal data to obtain all essential features of the normal data and the first feature data corresponding to each essential feature comprises:

Identify all features of the normal data;

Remove the non-essential features among them to obtain all essential features of the normal data and the multiple first feature data corresponding to each essential feature.

4. The method for anomaly assessment of data according to claim 1, characterized in that the step of comparing S_model and S_test according to a preset rule to obtain a risk result comprises:

If S_test > S_model, determine that a risk exists;

If S_test < S_model × 90%, determine that no risk exists.

5. The method for anomaly assessment of data according to claim 2, characterized in that the step of normalizing the first feature data to obtain a normalization model of the historical data comprises:

For each data item of the feature in turn, subtract the minimum value and divide the result by the difference to obtain the normalized feature value;
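By way of illustration and not limitation, the normalization recited above may be sketched as follows. The sketch assumes that "the difference" is the maximum value of the feature minus its minimum value (min-max normalization); the function name `min_max_normalize` and the handling of a constant feature are likewise assumptions.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale the data of one feature into [0, 1].

    Each value has the feature minimum subtracted and is then divided by
    the difference, assumed here to be (maximum - minimum).
    """
    lowest, highest = min(values), max(values)
    difference = highest - lowest
    if difference == 0:
        # Assumption: a constant feature carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lowest) / difference for v in values]


if __name__ == "__main__":
    print(min_max_normalize([2.0, 4.0, 6.0]))
```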

6. The method for anomaly assessment of data according to claim 2, characterized in that the step of converting the normalization model of the historical data into a first feature matrix comprises:

Extract the principal components whose contribution rate in the normalization model of the historical data exceeds 95% to obtain the first feature matrix composed of eigenvectors.

7. The method for anomaly assessment of data according to claim 6, characterized in that the contribution rate is: [formula not reproduced in the source text], wherein contrib is the contribution rate and s_i is an eigenvalue.
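By way of illustration and not limitation, the selection of principal components by contribution rate may be sketched as follows. Because the formula of this claim is not reproduced in the text, the sketch assumes the common definition in which the contribution rate of the leading k components is the sum of their eigenvalues divided by the sum of all eigenvalues; the function name `select_components` is hypothetical.

```python
from typing import List, Sequence


def select_components(eigenvalues: Sequence[float], threshold: float = 0.95) -> List[int]:
    """Return indices of the leading eigenvalues whose cumulative share exceeds threshold.

    Assumed contribution rate: (sum of the kept eigenvalues) / (sum of all eigenvalues).
    The returned indices identify which eigenvectors would form the feature matrix.
    """
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    # Visit eigenvalues from largest to smallest.
    order = sorted(range(len(eigenvalues)), key=lambda i: eigenvalues[i], reverse=True)
    kept: List[int] = []
    running = 0.0
    for i in order:
        kept.append(i)
        running += eigenvalues[i]
        if running / total > threshold:
            break
    return kept


if __name__ == "__main__":
    print(select_components([8.0, 1.6, 0.3, 0.1]))
```

In this example the first two eigenvalues account for about 96% of the total, so only their eigenvectors would be retained.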

8. A device for anomaly assessment of data, characterized by comprising:

A first screening unit, configured to perform feature screening on the normal data to obtain all essential features of the normal data and the multiple first feature data corresponding to each essential feature;

A first computing unit, configured to calculate the first difference values between the multiple first feature data and the multiple historical restored data, substitute the multiple first difference values into a sigmoid function to map them to (0, 1), then amplify the results by a preset multiple to obtain multiple risk score values of the normal data, and take the maximum of these to obtain the model data score value S_model;

A second screening unit, configured to perform feature screening on the test data according to the essential features of the normal data to obtain all essential features of the test data and the second feature data corresponding to each essential feature;

A second computing unit, configured to calculate the second difference value between the second feature data and the historical restored data, substitute the second difference value into the sigmoid function to map it to (0, 1), then amplify the result by the preset multiple to obtain multiple risk score values of the test data, and take the maximum of these to obtain the test data score value S_test;

9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method according to any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.