CN117911040A - Method and device for identifying abnormal target - Google Patents

Publication number
CN117911040A
Authority
CN
China
Prior art keywords
eigenvalue
value
variance
principal
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211260265.6A
Other languages
Chinese (zh)
Inventor
刘芳
赵旭玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method and a device for identifying an abnormal target, and relates to the technical field of big data analysis and information security. One embodiment of the method comprises the following steps: constructing an index matrix according to the index values corresponding to the business indexes of each target; performing principal component analysis on the index matrix by adopting a principal component analysis algorithm to obtain at least one principal eigenvalue and its corresponding eigenvector; calculating the comprehensive index value of each target according to the at least one principal eigenvalue and its corresponding eigenvector; and identifying the abnormal target according to the comprehensive index value of each target. The embodiment can solve the technical problems that risk targets are discovered with serious lag and that all targets with similar risk behaviors cannot be systematically identified.

Description

Method and device for identifying abnormal target
Technical Field
The invention relates to the technical field of big data analysis and information security, in particular to a method and a device for identifying an abnormal target.
Background
At present, targets with risks (such as suppliers or users) are mined mainly by sampling and verifying business indexes such as the targets' transaction data sets and personnel association relations. This relatively scattered information greatly slows project progress, so risk targets are discovered with serious lag. Meanwhile, because the risky abnormal behaviors of targets are examined by sampling, the coverage is not wide enough: all targets with similar risk behaviors cannot be systematically identified, the scale of the risk cannot be predicted, and the risks cannot be effectively prevented or avoided.
Disclosure of Invention
In view of the above, the embodiments of the present invention provide a method and an apparatus for identifying abnormal targets, so as to solve the technical problems that risk targets are discovered with serious lag and that all targets with similar risk behaviors cannot be systematically identified.
To achieve the above object, according to one aspect of the embodiments of the present invention, there is provided a method of identifying an abnormal target, including:
constructing an index matrix according to the index values corresponding to the business indexes of each target;
performing principal component analysis on the index matrix by adopting a principal component analysis algorithm to obtain at least one principal eigenvalue and its corresponding eigenvector;
calculating the comprehensive index value of each target according to the at least one principal eigenvalue and its corresponding eigenvector;
and identifying the abnormal target according to the comprehensive index value of each target.
Optionally, performing principal component analysis on the index matrix by using a principal component analysis algorithm to obtain at least one principal eigenvalue and its corresponding eigenvector, including:
Calculating a covariance matrix of the index matrix;
Calculating each eigenvalue of the covariance matrix and corresponding eigenvectors thereof;
And calculating each accumulated variance value of the covariance matrix according to each eigenvalue and the corresponding eigenvector of the covariance matrix, thereby screening out at least one main eigenvalue and the corresponding eigenvector of the main eigenvalue.
Optionally, calculating each accumulated variance value of the covariance matrix according to each eigenvalue and corresponding eigenvector of the covariance matrix, so as to screen out at least one main eigenvalue and corresponding eigenvector, including:
For each eigenvalue, dividing the eigenvector corresponding to the eigenvalue by the arithmetic square root of the eigenvalue to obtain a variance vector of the eigenvalue, thereby calculating the variance value of the eigenvalue;
And calculating each accumulated variance value of the covariance matrix according to the variance value of each eigenvalue, thereby screening out at least one main eigenvalue and the eigenvector corresponding to the main eigenvalue.
Optionally, calculating the variance value of the feature value includes:
and calculating the variance of the variance vector, thereby obtaining the variance value of the characteristic value.
Optionally, calculating each accumulated variance value of the covariance matrix according to the variance value of each eigenvalue, so as to screen out at least one principal eigenvalue and its corresponding eigenvector, including:
sequencing the characteristic values according to the order of the variance values from large to small;
For each characteristic value, adding the variance value of the characteristic value and the variance value of the characteristic value positioned before the characteristic value in the sequence to obtain an accumulated variance value of the characteristic value;
And screening out the characteristic value with the accumulated variance value larger than or equal to the variance threshold value as a main characteristic value.
Optionally, before calculating the covariance matrix of the index matrix, the method further comprises:
And carrying out standardization processing on the index matrix.
Optionally, calculating the comprehensive index value of each target according to the at least one main feature value and the corresponding feature vector thereof includes:
Respectively calculating the weight and variance vector of the at least one main characteristic value according to the at least one main characteristic value and the corresponding characteristic vector thereof;
and carrying out weighted summation on the weight and the variance vector of the at least one main characteristic value, thereby obtaining the comprehensive index value of each target.
Optionally, calculating the weight and variance vector of the at least one principal eigenvalue according to the at least one principal eigenvalue and the corresponding eigenvector, respectively, including:
for each principal eigenvalue, dividing the eigenvector corresponding to the principal eigenvalue by the arithmetic square root of the principal eigenvalue to obtain a variance vector of the principal eigenvalue, thereby calculating the variance value of the principal eigenvalue;
For each main characteristic value, dividing the variance value of the main characteristic value by the sum of the variance values of the main characteristic values to obtain the weight of the main characteristic value.
In addition, according to another aspect of an embodiment of the present invention, there is provided an apparatus for identifying an abnormal target, including:
the construction module is used for constructing an index matrix according to index values corresponding to the business indexes of each target;
The analysis module is used for carrying out principal component analysis on the index matrix by adopting a principal component analysis algorithm to obtain at least one principal eigenvalue and an eigenvector corresponding to the principal eigenvalue;
The calculation module is used for calculating the comprehensive index value of each target according to the at least one main characteristic value and the corresponding characteristic vector thereof;
and the identification module is used for identifying the abnormal target according to the comprehensive index value of each target.
Optionally, the analysis module is further configured to:
Calculating a covariance matrix of the index matrix;
Calculating each eigenvalue of the covariance matrix and corresponding eigenvectors thereof;
And calculating each accumulated variance value of the covariance matrix according to each eigenvalue and the corresponding eigenvector of the covariance matrix, thereby screening out at least one main eigenvalue and the corresponding eigenvector of the main eigenvalue.
Optionally, the analysis module is further configured to:
For each eigenvalue, dividing the eigenvector corresponding to the eigenvalue by the arithmetic square root of the eigenvalue to obtain a variance vector of the eigenvalue, thereby calculating the variance value of the eigenvalue;
And calculating each accumulated variance value of the covariance matrix according to the variance value of each eigenvalue, thereby screening out at least one main eigenvalue and the eigenvector corresponding to the main eigenvalue.
Optionally, the analysis module is further configured to:
and calculating the variance of the variance vector, thereby obtaining the variance value of the characteristic value.
Optionally, the analysis module is further configured to:
sequencing the characteristic values according to the order of the variance values from large to small;
For each characteristic value, adding the variance value of the characteristic value and the variance value of the characteristic value positioned before the characteristic value in the sequence to obtain an accumulated variance value of the characteristic value;
And screening out the characteristic value with the accumulated variance value larger than or equal to the variance threshold value as a main characteristic value.
Optionally, the analysis module is further configured to:
and before calculating the covariance matrix of the index matrix, carrying out standardization processing on the index matrix.
Optionally, the computing module is further configured to:
Respectively calculating the weight and variance vector of the at least one main characteristic value according to the at least one main characteristic value and the corresponding characteristic vector thereof;
and carrying out weighted summation on the weight and the variance vector of the at least one main characteristic value, thereby obtaining the comprehensive index value of each target.
Optionally, the computing module is further configured to:
for each principal eigenvalue, dividing the eigenvector corresponding to the principal eigenvalue by the arithmetic square root of the principal eigenvalue to obtain a variance vector of the principal eigenvalue, thereby calculating the variance value of the principal eigenvalue;
For each main characteristic value, dividing the variance value of the main characteristic value by the sum of the variance values of the main characteristic values to obtain the weight of the main characteristic value.
According to another aspect of an embodiment of the present invention, there is also provided an electronic device including:
one or more processors;
storage means for storing one or more programs,
The one or more processors implement the method of any of the embodiments described above when the one or more programs are executed by the one or more processors.
According to another aspect of an embodiment of the present invention, there is also provided a computer readable medium having stored thereon a computer program which, when executed by a processor, implements the method according to any of the embodiments described above.
According to another aspect of embodiments of the present invention, there is also provided a computer program product comprising a computer program which, when executed by a processor, implements the method according to any of the embodiments described above.
One embodiment of the above invention has the following advantages or benefits: principal component analysis is performed on the index matrix by adopting a principal component analysis algorithm to obtain at least one principal eigenvalue and its corresponding eigenvector, and the comprehensive index value of each target is then calculated according to the at least one principal eigenvalue and its corresponding eigenvector, so that abnormal targets are identified. This solves the technical problems in the prior art that risk targets are discovered with serious lag and that all targets with similar risk behaviors cannot be systematically identified. The embodiment of the invention adopts the principal component analysis algorithm to reduce the dimension of the index matrix, converting many business indexes into a few comprehensive business indexes that retain most of the information in the original business indexes. This simplifies the business indexes, enhances data reuse, greatly reduces sampling work, helps to identify risk targets quickly, and allows abnormal targets to be identified systematically.
Further effects of the above-described non-conventional alternatives are described below in connection with the embodiments.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art. Wherein:
FIG. 1 is a flow chart of a method of identifying an anomaly target in accordance with an embodiment of the present invention;
FIG. 2 is a flow chart of a method of identifying an anomaly target in accordance with one referenceable embodiment of the present invention;
FIG. 3 is a flow chart of a method of identifying an anomaly target in accordance with another referenceable embodiment of the invention;
FIG. 4 is a flow chart of a method of identifying an anomaly target in accordance with yet another referenceable embodiment of the invention;
FIG. 5 is a schematic diagram of an apparatus for identifying an anomaly target in accordance with an embodiment of the present invention;
FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;
FIG. 7 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the technical scheme of the application, the acquisition, storage, use and processing of data all comply with the relevant provisions of national laws and regulations.
Fig. 1 is a flowchart of a method of identifying an anomaly target according to an embodiment of the present invention. As an embodiment of the present invention, as shown in fig. 1, the method for identifying an abnormal target may include:
Step 101, constructing an index matrix according to index values corresponding to the business indexes of each target.
First, the business indexes of the targets are determined, such as contract risk, personnel utilization risk, business risk, operation capability, profitability, blacklist risk, commodity flow risk and order performance risk. In an e-commerce platform scenario, a target may be a supplier that supplies commodities to the platform or a user who purchases commodities from the platform; the embodiment of the invention does not limit this. Then the index values corresponding to the business indexes of each target are acquired, and an index matrix is constructed. For example, the columns of the index matrix may correspond to the target identifiers and the rows to the business indexes, with each cell holding the corresponding index value. Alternatively, the rows may correspond to the target identifiers and the columns to the business indexes; the embodiment of the invention does not limit this.
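As an illustration only, step 101 might be sketched in Python as follows. The sketch assumes NumPy and the layout in which rows are targets and columns are business indexes. The names `build_index_matrix`, `index_values` and `indicator_names`, and the sample supplier data, are hypothetical and do not come from the embodiment.

```python
import numpy as np

def build_index_matrix(index_values, indicator_names):
    """Arrange per-target index values into a matrix.

    Rows are targets, columns are business indexes (an assumed layout;
    the embodiment allows either orientation).
    """
    target_ids = sorted(index_values)  # fixed row order so rows map back to targets
    matrix = np.array(
        [[index_values[t][name] for name in indicator_names] for t in target_ids],
        dtype=float,
    )
    return target_ids, matrix

# Hypothetical sample data: two suppliers, three business indexes.
sample = {
    "supplier_b": {"contract_risk": 0.7, "business_risk": 0.1, "profitability": 0.4},
    "supplier_a": {"contract_risk": 0.2, "business_risk": 0.5, "profitability": 0.9},
}
ids, A = build_index_matrix(sample, ["contract_risk", "business_risk", "profitability"])
```

Keeping the list of target identifiers alongside the matrix makes it possible to map a comprehensive index value back to its target in later steps.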
Step 102, performing principal component analysis on the index matrix by adopting a principal component analysis algorithm to obtain at least one principal eigenvalue and its corresponding eigenvector.
Principal component analysis (PCA) is performed on the index matrix constructed in step 101, and at least one principal eigenvalue γ and its corresponding eigenvector f are obtained from the analysis result.
The principle of PCA is as follows. Given a matrix A, if there is a number m and a nonzero column vector x such that Ax = mx, then m is an eigenvalue of A and x is an eigenvector of A corresponding to m. Because PCA is a dimension-reduction technique, it involves a linear transformation of the vector x; the matrix A in Ax = mx applies that linear transformation to x. Through eigenvalue decomposition, PCA converts many business indexes into a few comprehensive business indexes: the number of indexes is reduced, but most of the information in the original business indexes is retained in the comprehensive business indexes. The more business indexes are used in target risk mining, the more complex the analysis becomes; the embodiment of the invention adapts to this requirement, simplifies the business indexes and helps to identify risk targets quickly.
Optionally, step 102 may include: calculating a covariance matrix of the index matrix; calculating each eigenvalue of the covariance matrix and its corresponding eigenvector; and calculating each accumulated variance value of the covariance matrix according to each eigenvalue and its corresponding eigenvector, thereby screening out at least one principal eigenvalue and its corresponding eigenvector. First, a covariance matrix of the index matrix constructed in step 101, namely $A_{m,n}$, is calculated. The covariance matrix is a symmetric matrix: its diagonal entries are the variances of the business indexes, and its off-diagonal entries are the covariances between pairs of business indexes. Then each eigenvalue $(\gamma_1, \gamma_2, \ldots, \gamma_n)$ and its corresponding eigenvector $(f_1, f_2, \ldots, f_n)$ of the covariance matrix are calculated according to the eigenvalue decomposition theorem. Next, each accumulated variance value of the covariance matrix is calculated according to each eigenvalue and its corresponding eigenvector. Finally, at least one principal eigenvalue and its corresponding eigenvector are screened out according to the accumulated variance values.
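The covariance and eigendecomposition steps described above might be sketched as follows. This is a minimal illustration assuming NumPy and a matrix whose rows are targets and whose columns are business indexes; the function name `eigen_decompose` is hypothetical. `np.linalg.eigh` is used because the covariance matrix is symmetric.

```python
import numpy as np

def eigen_decompose(index_matrix):
    """Return the covariance matrix, its eigenvalues (largest first) and eigenvectors.

    Assumes rows are targets and columns are business indexes.
    """
    cov = np.cov(index_matrix, rowvar=False)         # n x n, symmetric
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigenvalues)[::-1]            # reorder to descending
    return cov, eigenvalues[order], eigenvectors[:, order]

# Hypothetical data: 20 targets, 4 business indexes.
rng = np.random.default_rng(0)
M = rng.normal(size=(20, 4))
cov, gammas, fs = eigen_decompose(M)
```

Column `fs[:, i]` is the eigenvector paired with `gammas[i]`, so `cov @ fs[:, i]` equals `gammas[i] * fs[:, i]` up to floating-point error.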
Optionally, calculating each accumulated variance value of the covariance matrix according to each eigenvalue and its corresponding eigenvector, so as to screen out at least one principal eigenvalue and its corresponding eigenvector, includes: for each eigenvalue, dividing the eigenvector corresponding to the eigenvalue by the arithmetic square root of the eigenvalue to obtain a variance vector of the eigenvalue, thereby calculating the variance value of the eigenvalue; and calculating each accumulated variance value of the covariance matrix according to the variance value of each eigenvalue, thereby screening out at least one principal eigenvalue and its corresponding eigenvector. In an embodiment of the present invention, the variance vector $F_i$ of the eigenvalue $\gamma_i$ may be calculated using the following formula:
$$F_i = \frac{f_i}{\sqrt{\gamma_i}}$$
For example, $\gamma_i$ is an eigenvalue of the index matrix formed by contract risk, personnel utilization risk, business risk, operation capability, profitability, blacklist risk, article flow risk, order performance risk and so on; $\gamma_1$ may be 0.3, $\gamma_2$ may be 0.7 and $\gamma_3$ may be 1.2; $f_1$ is the eigenvector corresponding to $\gamma_1$, $f_2$ is the eigenvector corresponding to $\gamma_2$, $f_3$ is the eigenvector corresponding to $\gamma_3$, and so on.
Then the variance value $a_i$ of the eigenvalue $\gamma_i$ is calculated from the variance vector $F_i$. The larger $a_i$ is, the more useful information is contained in the variance vector $F_i$ synthesized from the business indexes. Each accumulated variance value can therefore be calculated from the variance values $a_i$, and at least one principal eigenvalue and its corresponding eigenvector can finally be screened out according to the accumulated variance values.
Optionally, calculating the variance value of the eigenvalue includes: calculating the variance of the variance vector, thereby obtaining the variance value of the eigenvalue. In the embodiment of the present invention, the variance of the variance vector $F_i$ is calculated as the variance value $a_i$ of the eigenvalue $\gamma_i$, that is, $a_i = D(F_i)$. The larger $a_i$ is, the more useful information is contained in the variance vector $F_i$ synthesized from the business indexes.
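A literal reading of the two operations just described (dividing each eigenvector by the square root of its eigenvalue, then taking the variance of the result) could look like the sketch below. It assumes NumPy, strictly positive eigenvalues, eigenvectors stored as columns, and that D(·) denotes the population variance; the function name `variance_values` is hypothetical.

```python
import numpy as np

def variance_values(eigenvalues, eigenvectors):
    """Compute the variance vectors F_i = f_i / sqrt(gamma_i) and a_i = D(F_i).

    eigenvectors[:, i] is assumed to be the eigenvector of eigenvalues[i].
    D(.) is taken to be the population variance (an assumption).
    """
    gammas = np.asarray(eigenvalues, dtype=float)
    F = np.asarray(eigenvectors, dtype=float) / np.sqrt(gammas)  # scales column i by 1/sqrt(gamma_i)
    a = F.var(axis=0)                                            # one variance value per eigenvalue
    return F, a

# Small hand-checkable example with two eigenpairs.
F, a = variance_values([4.0, 1.0], np.eye(2))
```

In this example the first variance vector is (0.5, 0) and the second is (0, 1), giving variance values 0.0625 and 0.25.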
Optionally, calculating each accumulated variance value of the covariance matrix according to the variance value of each eigenvalue, so as to screen out at least one principal eigenvalue and its corresponding eigenvector, includes: sorting the eigenvalues in descending order of variance value; for each eigenvalue, adding its variance value to the variance values of the eigenvalues that precede it in the sorted order to obtain the accumulated variance value of that eigenvalue; and screening out the eigenvalues whose accumulated variance value is greater than or equal to the variance threshold as principal eigenvalues. First, the variance values $a_i$ are sorted in descending order, such as $a_1, a_2, \ldots, a_n$, and the corresponding eigenvalues are ordered as $\gamma_1, \gamma_2, \ldots, \gamma_n$. Then the accumulated variance value of each eigenvalue is calculated: the accumulated variance value of $\gamma_1$ is $a_1$, that of $\gamma_2$ is $a_1+a_2$, that of $\gamma_3$ is $a_1+a_2+a_3$, that of $\gamma_4$ is $a_1+a_2+a_3+a_4$, that of $\gamma_5$ is $a_1+a_2+a_3+a_4+a_5$, and so on. Finally, the eigenvalues are screened against a variance threshold (such as 80%, 85% or 90%). For example, if the variance threshold is 85% and the accumulated variance value $a_1+a_2+a_3$ of the eigenvalue $\gamma_3$ is greater than or equal to 85%, then $\gamma_1$, $\gamma_2$ and $\gamma_3$ are taken as the principal eigenvalues.
It can be understood that the principal component analysis algorithm reduces the original business indexes to 3 comprehensive business indexes, and the information contained in these three comprehensive business indexes explains 85% of the information contained in the original business indexes, which achieves the purpose of dimension reduction.
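The sorting, accumulation and threshold screening described above might be sketched as follows. The sketch assumes NumPy and assumes that the accumulated values are expressed as a share of the total of all variance values, so that a threshold such as 85% can be written as 0.85; the text does not state this normalization explicitly. The name `select_principal` is hypothetical.

```python
import numpy as np

def select_principal(variance_values, threshold=0.85):
    """Pick the leading eigenvalues whose accumulated variance share first reaches the threshold.

    Returns the indices of the selected eigenvalues (largest variance value first)
    and the accumulated shares in sorted order.
    """
    a = np.asarray(variance_values, dtype=float)
    order = np.argsort(a)[::-1]                  # indices by variance value, largest first
    cumulative = np.cumsum(a[order]) / a.sum()   # accumulated share of the total (assumption)
    k = int(np.searchsorted(cumulative, threshold)) + 1  # first position reaching the threshold
    k = min(k, len(a))                           # guard against rounding just below 1.0
    return order[:k], cumulative

# Hypothetical variance values for four eigenvalues.
selected, cumulative = select_principal([4.0, 1.0, 3.0, 2.0], threshold=0.85)
```

With these values the accumulated shares are 0.4, 0.7, 0.9 and 1.0, so the first three eigenvalues in sorted order (original indices 0, 2 and 3) are kept.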
Step 103, calculating the comprehensive index value of each target according to the at least one principal eigenvalue and its corresponding eigenvector.
Because the principal component analysis is performed on the index matrix in step 102, the dimension of the business index is reduced, and the comprehensive index value of each target can be calculated according to at least one principal eigenvalue and the eigenvector corresponding to the principal eigenvalue calculated in step 102, so that the calculation process can be simplified, and the recognition accuracy can be ensured.
Optionally, step 103 may include: respectively calculating the weight and variance vector of the at least one main characteristic value according to the at least one main characteristic value and the corresponding characteristic vector thereof; and carrying out weighted summation on the weight and the variance vector of the at least one main characteristic value, thereby obtaining the comprehensive index value of each target. In order to accurately calculate the comprehensive index value of each target, the weight and variance vector of at least one main characteristic value are calculated according to the at least one main characteristic value and the corresponding characteristic vector of the main characteristic value, and then the weight and variance vector of the at least one main characteristic value are weighted and summed, so that the comprehensive index value of each target is obtained.
Optionally, calculating the weight and variance vector of the at least one principal eigenvalue according to the at least one principal eigenvalue and its corresponding eigenvector includes: for each principal eigenvalue, dividing the eigenvector corresponding to the principal eigenvalue by the arithmetic square root of the principal eigenvalue to obtain a variance vector of the principal eigenvalue, thereby calculating the variance value of the principal eigenvalue; and for each principal eigenvalue, dividing the variance value of the principal eigenvalue by the sum of the variance values of all principal eigenvalues to obtain the weight of the principal eigenvalue. Similar to step 102, taking the principal eigenvalues $\gamma_1$, $\gamma_2$ and $\gamma_3$ as an example, the variance vector $F_i$ of the principal eigenvalue $\gamma_i$ can be calculated using the following formula:
$$F_i = \frac{f_i}{\sqrt{\gamma_i}}$$
Then the variance of the variance vector $F_i$ is calculated as the variance value $a_i$ of the principal eigenvalue $\gamma_i$, that is, $a_i = D(F_i)$.
Then the weight of the principal eigenvalue $\gamma_1$ is $\frac{a_1}{a_1+a_2+a_3}$, the weight of the principal eigenvalue $\gamma_2$ is $\frac{a_2}{a_1+a_2+a_3}$, and the weight of the principal eigenvalue $\gamma_3$ is $\frac{a_3}{a_1+a_2+a_3}$.
Since the larger $a_i$ is, the more useful information is contained in the variance vector $F_i$ synthesized from the business indexes, the embodiment of the invention uses $\frac{a_i}{a_1+a_2+a_3}$ to represent the weight of each principal eigenvalue. The comprehensive index value $F$ of each target may be calculated using the following formula:
$$F = \frac{a_1}{a_1+a_2+a_3}F_1 + \frac{a_2}{a_1+a_2+a_3}F_2 + \frac{a_3}{a_1+a_2+a_3}F_3$$
The larger $F$ is, the higher the risk of the target.
Therefore, the embodiment of the invention can accurately calculate the comprehensive index value of each target.
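The weighting and summation in step 103 might be sketched as below. The sketch assumes NumPy and eigenvectors stored as columns. It also makes one assumption that the text does not spell out: the per-target value is obtained by projecting each target's row of the (standardized) index matrix onto the weighted sum of the variance vectors. The name `composite_index` is hypothetical.

```python
import numpy as np

def composite_index(X, principal_values, principal_vectors):
    """Weight the variance vectors of the principal eigenvalues and score each target.

    X: m x n matrix (rows: targets, columns: business indexes).
    principal_vectors[:, i] is the eigenvector of principal_values[i].
    """
    gammas = np.asarray(principal_values, dtype=float)
    F = np.asarray(principal_vectors, dtype=float) / np.sqrt(gammas)  # variance vectors F_i as columns
    a = F.var(axis=0)              # variance values a_i = D(F_i), population variance assumed
    weights = a / a.sum()          # weight of each principal eigenvalue
    combined = F @ weights         # weighted sum of the variance vectors, length n
    scores = np.asarray(X, dtype=float) @ combined  # assumption: project each target's row
    return scores, weights

# Hand-checkable example: three targets, two business indexes, two principal eigenpairs.
X = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
scores, weights = composite_index(X, [4.0, 1.0], np.eye(2))
```

Here the variance values are 0.0625 and 0.25, so the weights are 0.2 and 0.8 and the three targets score 0.1, 0.8 and 1.8.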
Step 104, identifying an abnormal target according to the comprehensive index value of each target.
After the comprehensive index value of each target is calculated, the values are imported into a database to build a complete target risk model table. A risk strategy is set on the basis of this table in combination with business requirements, abnormal targets are dynamically monitored, and the abnormal targets are pushed to the business side. After receiving a hit abnormal target, the business side traces back which risk index is abnormal and then pulls the details of the upstream table corresponding to that risk index. The embodiment of the invention can therefore monitor dynamically and give early warning, which greatly reduces the difficulty and complexity of the business work.
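The text leaves the risk strategy to business requirements. Purely as an illustration, one possible strategy is to flag targets whose comprehensive index value exceeds a chosen quantile of all values. The sketch below assumes NumPy; the quantile rule, the function name `flag_abnormal` and the sample identifiers are hypothetical and are not taken from the embodiment.

```python
import numpy as np

def flag_abnormal(target_ids, scores, quantile=0.95):
    """Return the targets whose comprehensive index value exceeds the given quantile.

    The quantile cutoff is an illustrative risk strategy, not the one in the embodiment.
    """
    cutoff = np.quantile(scores, quantile)
    return [t for t, s in zip(target_ids, scores) if s > cutoff]

# Hypothetical comprehensive index values for five targets.
flagged = flag_abnormal(["a", "b", "c", "d", "e"], [0.1, 0.2, 0.3, 0.4, 5.0], quantile=0.8)
```

A fixed threshold agreed with the business side would work equally well; a quantile simply adapts the cutoff to the current distribution of values.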
According to the various embodiments described above, principal component analysis is performed on the index matrix by adopting a principal component analysis algorithm to obtain at least one principal eigenvalue and its corresponding eigenvector, and the comprehensive index value of each target is then calculated according to the at least one principal eigenvalue and its corresponding eigenvector, so that abnormal targets are identified. This solves the technical problems in the prior art that risk targets are discovered with serious lag and that all targets with similar risk behaviors cannot be systematically identified. The embodiment of the invention adopts the principal component analysis algorithm to reduce the dimension of the index matrix, converting many business indexes into a few comprehensive business indexes that retain most of the information in the original business indexes. This simplifies the business indexes, enhances data reuse, greatly reduces sampling work, helps to identify risk targets quickly, and allows abnormal targets to be identified systematically.
FIG. 2 is a flow chart of a method of identifying an anomaly target in accordance with one referenceable embodiment of the present invention. As yet another embodiment of the present invention, as shown in fig. 2, the method for identifying an abnormal target may include:
Step 201, constructing an index matrix according to index values corresponding to the business indexes of each target.
Firstly, determining each service index of each target, then obtaining index values corresponding to each service index of each target, and integrating the index values corresponding to each service index of each target into an index matrix.
Step 202, filling data into the index matrix.
Statistical analysis is performed on each business index to confirm whether abnormal values or null values exist. For abnormal data, it is evaluated whether the abnormality is caused by a data storage problem, and the abnormal value can be replaced by the mean value of the business index. A null value can be filled with the mean value of the business index, or filled by using a random forest algorithm.
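The mean-filling option for null values might be sketched as follows, assuming NumPy, that null values are represented as NaN, and that columns are business indexes. The function name `fill_missing_with_mean` is hypothetical, and the random forest option is not shown.

```python
import numpy as np

def fill_missing_with_mean(index_matrix):
    """Replace NaN entries with the mean of their column (business index)."""
    filled = np.array(index_matrix, dtype=float)   # copy, so the input is left unchanged
    column_means = np.nanmean(filled, axis=0)      # column means ignoring NaN
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = column_means[cols]
    return filled

# Hypothetical matrix with two missing index values.
raw = [[1.0, np.nan], [3.0, 4.0], [np.nan, 8.0]]
filled = fill_missing_with_mean(raw)
```

The same replacement could be applied to entries judged abnormal by first setting them to NaN; how an entry is judged abnormal is left open here, as in the text.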
And 203, performing principal component analysis on the index matrix by adopting a principal component analysis algorithm to obtain at least one principal eigenvalue and an eigenvector corresponding to the principal eigenvalue.
Step 204, calculating the comprehensive index value of each target according to the at least one main characteristic value and the corresponding characteristic vector thereof.
Step 205, identifying an abnormal target according to the comprehensive index value of each target.
In addition, the specific implementation of the method for identifying an abnormal target in this referable embodiment has been described in detail in the method for identifying an abnormal target above, and is therefore not repeated here.
FIG. 3 is a flowchart of a method for identifying an abnormal target according to another referable embodiment of the present invention. As another embodiment of the present invention, as shown in FIG. 3, the method for identifying an abnormal target may include:
Step 301, constructing an index matrix according to index values corresponding to the service indexes of the targets.
Step 302, performing standardization processing on the index matrix.
The constructed index matrix is standardized to obtain a matrix X, denoted $X_{m,n}$.
Optionally, the standardization processing method comprises the following step: for each element in the matrix, subtracting the mean value of the current column from the element, and dividing the difference by the standard deviation of the current column to obtain the standardized result of the element.
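Assuming that the index matrix is $(y_{ij})_{m \times n}$, the normalized index matrix is $X_{m,n} = (x_{ij})$ with $x_{ij} = (y_{ij} - \bar{y}_j) / s_j$, where $\bar{y}_j$ and $s_j$ are the mean and the standard deviation of column $j$.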
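The column-wise standardization of step 302 can be sketched as below. This is a minimal example under stated assumptions: the function name `standardize_columns` is hypothetical, and the sample standard deviation (`ddof=1`) is an assumed choice, because the text does not say whether the sample or the population form is intended.

```python
import numpy as np

def standardize_columns(index_matrix):
    """Subtract each column's mean and divide the difference by the column's standard deviation."""
    values = np.asarray(index_matrix, dtype=float)
    column_mean = values.mean(axis=0)
    column_std = values.std(axis=0, ddof=1)  # assumption: sample standard deviation
    return (values - column_mean) / column_std

index_matrix = np.array([[1.0, 200.0],
                         [2.0, 400.0],
                         [3.0, 600.0]])
X = standardize_columns(index_matrix)
```

After this step every service index has zero mean and unit spread, so indexes measured on very different scales contribute comparably. A column with a constant value has zero standard deviation and would need separate handling.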
Step 303, performing principal component analysis on the normalized index matrix by adopting a principal component analysis algorithm to obtain at least one principal eigenvalue and an eigenvector corresponding to the principal eigenvalue.
Step 304, for each principal eigenvalue, dividing the eigenvector corresponding to the principal eigenvalue by the arithmetic square root of the principal eigenvalue to obtain a variance vector of the principal eigenvalue, thereby calculating a variance value of the principal eigenvalue, and then dividing the variance value of the principal eigenvalue by the sum of the variance values of each principal eigenvalue to obtain the weight of the principal eigenvalue.
Step 305, carrying out weighted summation on the weight and the variance vector of the at least one principal eigenvalue, thereby obtaining the comprehensive index value of each target.
Step 306, identifying an abnormal target according to the comprehensive index value of each target.
In addition, the specific implementation of the method for identifying an abnormal target in this embodiment has been described in detail in the method for identifying an abnormal target above, and is therefore not repeated here.
FIG. 4 is a flowchart of a method for identifying an abnormal target according to yet another referable embodiment of the present invention. As still another embodiment of the present invention, as shown in FIG. 4, the method for identifying an abnormal target may include:
Step 401, constructing an index matrix according to index values corresponding to the service indexes of the targets.
Step 402, calculating a covariance matrix of the index matrix.
The covariance matrix of the index matrix constructed in step 401 is calculated and denoted $A_{m,n}$. It is a symmetric matrix: the entries on the diagonal are the variances of the individual service indexes, and the off-diagonal entries are the covariances between every two service indexes.
Step 403, calculating each eigenvalue of the covariance matrix and its corresponding eigenvector.
Each eigenvalue of the covariance matrix is calculated according to the eigenvalue decomposition theorem and denoted $\gamma_i$, and the eigenvector corresponding to each eigenvalue is denoted $f_i$.
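Steps 402 and 403 can be illustrated with NumPy as below. The sketch assumes that rows are targets and columns are service indexes, and it uses `np.linalg.eigh`, which is suited to symmetric matrices; the function name `covariance_eigen` is hypothetical.

```python
import numpy as np

def covariance_eigen(index_matrix):
    """Return the covariance matrix of the columns with its eigenvalues and eigenvectors."""
    values = np.asarray(index_matrix, dtype=float)
    covariance = np.cov(values, rowvar=False)  # columns are the variables (service indexes)
    # eigh returns eigenvalues in ascending order; eigenvectors[:, i] pairs with eigenvalues[i].
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    return covariance, eigenvalues, eigenvectors

values = np.array([[1.0, 2.0],
                   [2.0, 1.0],
                   [3.0, 4.0],
                   [4.0, 3.0]])
covariance, eigenvalues, eigenvectors = covariance_eigen(values)
```

Each column of `eigenvectors` plays the role of an eigenvector $f_i$, and the matching entry of `eigenvalues` plays the role of $\gamma_i$.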
Step 404, for each eigenvalue, dividing the eigenvector corresponding to the eigenvalue by the arithmetic square root of the eigenvalue to obtain a variance vector of the eigenvalue, and then calculating the variance of the variance vector to obtain the variance value of the eigenvalue.
For $f_i$, the variance vector $F_i$ is calculated using the following formula:
$$F_i = \frac{f_i}{\sqrt{\gamma_i}}$$
The variance of the variance vector $F_i$ is then calculated as the variance value $a_i$ of the eigenvalue $\gamma_i$, i.e., $a_i = D(F_i)$.
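Read literally, step 404 divides each eigenvector by the arithmetic square root of its eigenvalue and then takes the variance of the resulting vector. The sketch below follows that literal reading; the function name is hypothetical, the population variance (NumPy's default) is an assumed choice for the operator $D(\cdot)$, and non-positive eigenvalues are rejected because the division would not be defined for them.

```python
import numpy as np

def variance_vectors_and_values(eigenvalues, eigenvectors):
    """Compute F_i = f_i / sqrt(gamma_i) for every column f_i, and a_i = D(F_i)."""
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    eigenvectors = np.asarray(eigenvectors, dtype=float)
    if np.any(eigenvalues <= 0):
        raise ValueError("eigenvalues must be positive to divide by their square root")
    # Broadcasting divides column i of eigenvectors by sqrt(eigenvalues[i]).
    variance_vectors = eigenvectors / np.sqrt(eigenvalues)
    # Assumption: D(.) is the population variance of the vector's entries.
    variance_values = variance_vectors.var(axis=0)
    return variance_vectors, variance_values

gammas = np.array([4.0, 1.0])
fs = np.array([[1.0, 0.0],
               [0.0, 1.0]])  # column i is the eigenvector paired with gammas[i]
F, a = variance_vectors_and_values(gammas, fs)
```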
Step 405, sorting the eigenvalues in descending order of the variance values.
The eigenvalues are sorted in descending order of the variance value $a_i$; for example, if $a_1 \ge a_2 \ge \dots \ge a_n$, the eigenvalues are ordered as $\gamma_1, \gamma_2, \dots, \gamma_n$.
Step 406, for each feature value, adding the variance value of the feature value and the variance value of the feature value located before the feature value in the sorting to obtain the accumulated variance value of the feature value.
Step 407, screening out the characteristic value with the accumulated variance value being greater than or equal to the variance threshold as the main characteristic value.
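Steps 405 to 407 can be sketched as follows. The wording of step 407 leaves the selection rule open to interpretation, so this example assumes one common reading: keep the leading eigenvalues, in sorted order, up to and including the first one whose accumulated variance value reaches the variance threshold. The function name `select_principal` and the treatment of the threshold as an absolute value in the same units as $a_i$ are assumptions.

```python
import numpy as np

def select_principal(variance_values, variance_threshold):
    """Sort by variance value (descending), accumulate, and pick the principal indices."""
    variance_values = np.asarray(variance_values, dtype=float)
    order = np.argsort(variance_values)[::-1]        # indices from largest to smallest a_i
    accumulated = np.cumsum(variance_values[order])  # accumulated variance value per position
    reached = np.nonzero(accumulated >= variance_threshold)[0]
    # Assumption: keep everything if the threshold is never reached.
    keep = reached[0] + 1 if reached.size else order.size
    return order[:keep], accumulated

a = np.array([0.1, 0.6, 0.3])
selected, accumulated = select_principal(a, variance_threshold=0.85)
```

In this example `selected` holds the original positions of the two eigenvalues with the largest variance values, since their accumulated value is the first to reach 0.85.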
Step 408, respectively calculating the weight and variance vector of the at least one main eigenvalue according to the at least one main eigenvalue and the eigenvector corresponding to the main eigenvalue.
Similar to step 404, the variance vector $F_i$ of the principal eigenvalue $\gamma_i$ can be calculated using the following formula:
$$F_i = \frac{f_i}{\sqrt{\gamma_i}}$$
Then, the variance of the variance vector $F_i$ is calculated as the variance value $a_i$ of the principal eigenvalue $\gamma_i$, that is, $a_i = D(F_i)$.
Step 409, carrying out weighted summation on the weight and the variance vector of the at least one principal eigenvalue, thereby obtaining the comprehensive index value of each target.
Since a larger $a_i$ means that the variance vector $F_i$, which is synthesized from the various service indexes, contains more useful information, the embodiment of the invention uses $a_i / \sum_j a_j$ to represent the weight of each principal eigenvalue, where the sum runs over the principal eigenvalues. The comprehensive index value $F$ of each target may be calculated using the following formula:
$$F = \sum_i \frac{a_i}{\sum_j a_j} F_i$$
The larger $F$ is, the higher the risk of the target.
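The weighting and summation of steps 408 and 409 can be sketched as below. The example assumes that each variance vector is available with one entry per target, stacked as the rows of a `(k, m)` array for `k` principal eigenvalues and `m` targets; how such per-target vectors are obtained is not fixed by this sketch. The function name `composite_index` and the final ranking step are illustrative assumptions.

```python
import numpy as np

def composite_index(variance_vectors, variance_values):
    """Weight each variance vector by a_i / sum(a) and sum them into one value per target."""
    variance_vectors = np.asarray(variance_vectors, dtype=float)  # shape (k, m): row i is F_i
    variance_values = np.asarray(variance_values, dtype=float)    # shape (k,): entry i is a_i
    weights = variance_values / variance_values.sum()
    return weights @ variance_vectors, weights

F_rows = np.array([[1.0, 2.0, 3.0],
                   [3.0, 2.0, 1.0]])
a = np.array([3.0, 1.0])
F, weights = composite_index(F_rows, a)

# A larger F indicates higher risk, so targets can be reviewed from the highest value down.
ranking = np.argsort(F)[::-1]
```

The weights always sum to one, so the comprehensive index value stays on the same scale as the individual variance vectors.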
Step 410, identifying an abnormal target according to the comprehensive index value of each target.
In addition, the specific implementation of the method for identifying an abnormal target in this embodiment has been described in detail in the method for identifying an abnormal target above, and is therefore not repeated here.
Fig. 5 is a schematic diagram of an apparatus for identifying an abnormal target according to an embodiment of the present invention. As shown in fig. 5, the apparatus 500 for identifying an abnormal target includes a construction module 501, an analysis module 502, a calculation module 503, and an identification module 504; the construction module 501 is configured to construct an index matrix according to index values corresponding to each service index of each target; the analysis module 502 is configured to perform principal component analysis on the index matrix by using a principal component analysis algorithm, to obtain at least one principal eigenvalue and its corresponding eigenvector; the calculating module 503 is configured to calculate a comprehensive index value of each target according to the at least one main feature value and the feature vector corresponding to the main feature value; the identifying module 504 is configured to identify an abnormal target according to the comprehensive index value of each target.
Optionally, the analysis module 502 is further configured to:
Calculating a covariance matrix of the index matrix;
Calculating each eigenvalue of the covariance matrix and corresponding eigenvectors thereof;
And calculating each accumulated variance value of the covariance matrix according to each eigenvalue and the corresponding eigenvector of the covariance matrix, thereby screening out at least one main eigenvalue and the corresponding eigenvector of the main eigenvalue.
Optionally, the analysis module 502 is further configured to:
For each eigenvalue, dividing the eigenvector corresponding to the eigenvalue by the arithmetic square root of the eigenvalue to obtain a variance vector of the eigenvalue, thereby calculating the variance value of the eigenvalue;
And calculating each accumulated variance value of the covariance matrix according to the variance value of each eigenvalue, thereby screening out at least one main eigenvalue and the eigenvector corresponding to the main eigenvalue.
Optionally, the analysis module 502 is further configured to:
and calculating the variance of the variance vector, thereby obtaining the variance value of the characteristic value.
Optionally, the analysis module 502 is further configured to:
sequencing the characteristic values according to the order of the variance values from large to small;
For each characteristic value, adding the variance value of the characteristic value and the variance value of the characteristic value positioned before the characteristic value in the sequence to obtain an accumulated variance value of the characteristic value;
And screening out the characteristic value with the accumulated variance value larger than or equal to the variance threshold value as a main characteristic value.
Optionally, the analysis module 502 is further configured to:
and before calculating the covariance matrix of the index matrix, carrying out standardization processing on the index matrix.
Optionally, the computing module 503 is further configured to:
Respectively calculating the weight and variance vector of the at least one main characteristic value according to the at least one main characteristic value and the corresponding characteristic vector thereof;
and carrying out weighted summation on the weight and the variance vector of the at least one main characteristic value, thereby obtaining the comprehensive index value of each target.
Optionally, the computing module 503 is further configured to:
for each principal eigenvalue, dividing the eigenvector corresponding to the principal eigenvalue by the arithmetic square root of the principal eigenvalue to obtain a variance vector of the principal eigenvalue, thereby calculating the variance value of the principal eigenvalue;
For each main characteristic value, dividing the variance value of the main characteristic value by the sum of the variance values of the main characteristic values to obtain the weight of the main characteristic value.
In addition, the specific implementation of the device for identifying an abnormal object according to the present invention is described in detail in the method for identifying an abnormal object described above, and thus the description thereof will not be repeated here.
Fig. 6 illustrates an exemplary system architecture 600 to which a method of identifying an anomalous target or an apparatus of identifying an anomalous target in accordance with an embodiment of the invention may be applied.
As shown in fig. 6, the system architecture 600 may include terminal devices 601, 602, 603, a network 604, and a server 605. The network 604 is used as a medium to provide communication links between the terminal devices 601, 602, 603 and the server 605. The network 604 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The target may interact with the server 605 via the network 604 using the terminal devices 601, 602, 603 to receive or send messages etc. Various communication client applications such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only) may be installed on the terminal devices 601, 602, 603.
The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 605 may be a server providing various services, such as a background management server (by way of example only) that supports shopping websites browsed by targets using the terminal devices 601, 602, 603. The background management server may analyze and otherwise process data such as a received article information query request, and feed the processing result back to the terminal device.
It should be noted that, the method for identifying an abnormal object provided by the embodiment of the present invention is generally executed by the server 605, and accordingly, the device for identifying an abnormal object is generally disposed in the server 605.
It should be understood that the number of terminal devices, networks and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 7, there is illustrated a schematic diagram of a computer system 700 suitable for use in implementing an embodiment of the present invention. The terminal device shown in fig. 7 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU) 701, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM703, various programs and data required for the operation of the system 700 are also stored. The CPU 701, ROM 702, and RAM703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD), a speaker, and the like; a storage section 708 including a hard disk or the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the Internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 710 as needed, so that a computer program read from it can be installed into the storage section 708 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 709, and/or installed from the removable medium 711. The above-described functions defined in the system of the present invention are performed when the computer program is executed by the central processing unit (CPU) 701.
The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer programs according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules involved in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor, for example, as: a processor comprises a building module, an analysis module, a calculation module and an identification module, wherein the names of these modules do not constitute a limitation of the module itself in some cases.
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments; or may be present alone without being fitted into the device. The computer readable medium carries one or more programs which, when executed by a device, implement the method of: constructing an index matrix according to index values corresponding to the business indexes of each target; performing principal component analysis on the index matrix by adopting a principal component analysis algorithm to obtain at least one principal eigenvalue and an eigenvector corresponding to the principal eigenvalue; calculating the comprehensive index value of each target according to the at least one main characteristic value and the corresponding characteristic vector thereof; and identifying the abnormal target according to the comprehensive index value of each target.
As a further aspect, embodiments of the present invention also provide a computer program product comprising a computer program which, when executed by a processor, implements the method according to any of the above embodiments.
According to the technical scheme of the embodiment of the invention, principal component analysis is performed on the index matrix using a principal component analysis algorithm to obtain at least one principal eigenvalue and its corresponding eigenvector, the comprehensive index value of each target is calculated from the at least one principal eigenvalue and its corresponding eigenvector, and abnormal targets are identified accordingly. This solves the technical problems in the prior art that risk targets are discovered with serious lag and that all targets with similar risk behaviors cannot be identified systematically. The embodiment of the invention uses the principal component analysis algorithm to reduce the dimension of the index matrix, converting many service indexes into a few comprehensive service indexes that retain most of the information in the original service indexes. This simplifies the service indexes, improves data reuse, substantially reduces sampling work, helps identify risk targets quickly, and allows abnormal targets to be identified systematically.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (12)

1. A method of identifying an anomalous target, comprising:
constructing an index matrix according to index values corresponding to the business indexes of each target;
Performing principal component analysis on the index matrix by adopting a principal component analysis algorithm to obtain at least one principal eigenvalue and an eigenvector corresponding to the principal eigenvalue;
calculating the comprehensive index value of each target according to the at least one main characteristic value and the corresponding characteristic vector thereof;
and identifying the abnormal target according to the comprehensive index value of each target.
2. The method of claim 1, wherein performing principal component analysis on the index matrix using a principal component analysis algorithm to obtain at least one principal eigenvalue and its corresponding eigenvector, comprises:
Calculating a covariance matrix of the index matrix;
Calculating each eigenvalue of the covariance matrix and corresponding eigenvectors thereof;
And calculating each accumulated variance value of the covariance matrix according to each eigenvalue and the corresponding eigenvector of the covariance matrix, thereby screening out at least one main eigenvalue and the corresponding eigenvector of the main eigenvalue.
3. The method of claim 2, wherein calculating each accumulated variance value of the covariance matrix based on each eigenvalue of the covariance matrix and its corresponding eigenvector, thereby screening out at least one principal eigenvalue and its corresponding eigenvector, comprises:
For each eigenvalue, dividing the eigenvector corresponding to the eigenvalue by the arithmetic square root of the eigenvalue to obtain a variance vector of the eigenvalue, thereby calculating the variance value of the eigenvalue;
And calculating each accumulated variance value of the covariance matrix according to the variance value of each eigenvalue, thereby screening out at least one main eigenvalue and the eigenvector corresponding to the main eigenvalue.
4. A method according to claim 3, wherein calculating a variance value of the feature value comprises:
and calculating the variance of the variance vector, thereby obtaining the variance value of the characteristic value.
5. A method according to claim 3, wherein calculating each accumulated variance value of the covariance matrix according to the variance value of each eigenvalue, thereby screening out at least one principal eigenvalue and its corresponding eigenvector, comprises:
sequencing the characteristic values according to the order of the variance values from large to small;
For each characteristic value, adding the variance value of the characteristic value and the variance value of the characteristic value positioned before the characteristic value in the sequence to obtain an accumulated variance value of the characteristic value;
And screening out the characteristic value with the accumulated variance value larger than or equal to the variance threshold value as a main characteristic value.
6. The method of claim 2, further comprising, prior to calculating the covariance matrix of the index matrix:
And carrying out standardization processing on the index matrix.
7. The method of claim 1, wherein calculating the composite index value for each of the targets based on the at least one primary eigenvalue and its corresponding eigenvector comprises:
Respectively calculating the weight and variance vector of the at least one main characteristic value according to the at least one main characteristic value and the corresponding characteristic vector thereof;
and carrying out weighted summation on the weight and the variance vector of the at least one main characteristic value, thereby obtaining the comprehensive index value of each target.
8. The method of claim 7, wherein calculating the weight and variance vectors of the at least one principal eigenvalue from the at least one principal eigenvalue and its corresponding eigenvector, respectively, comprises:
for each principal eigenvalue, dividing the eigenvector corresponding to the principal eigenvalue by the arithmetic square root of the principal eigenvalue to obtain a variance vector of the principal eigenvalue, thereby calculating the variance value of the principal eigenvalue;
For each main characteristic value, dividing the variance value of the main characteristic value by the sum of the variance values of the main characteristic values to obtain the weight of the main characteristic value.
9. An apparatus for identifying an anomalous target, comprising:
the construction module is used for constructing an index matrix according to index values corresponding to the business indexes of each target;
The analysis module is used for carrying out principal component analysis on the index matrix by adopting a principal component analysis algorithm to obtain at least one principal eigenvalue and an eigenvector corresponding to the principal eigenvalue;
The calculation module is used for calculating the comprehensive index value of each target according to the at least one main characteristic value and the corresponding characteristic vector thereof;
and the identification module is used for identifying the abnormal target according to the comprehensive index value of each target.
10. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
The one or more processors implement the method of any of claims 1-8 when the one or more programs are executed by the one or more processors.
11. A computer readable medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-8.
12. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-8.
CN202211260265.6A 2022-10-14 2022-10-14 Method and device for identifying abnormal target Pending CN117911040A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211260265.6A CN117911040A (en) 2022-10-14 2022-10-14 Method and device for identifying abnormal target

Publications (1)

Publication Number Publication Date
CN117911040A true CN117911040A (en) 2024-04-19

Family

ID=90682586

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211260265.6A Pending CN117911040A (en) 2022-10-14 2022-10-14 Method and device for identifying abnormal target

Country Status (1)

Country Link
CN (1) CN117911040A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination