CN112632469A - Method and device for detecting abnormity of business transaction data and computer equipment - Google Patents

Method and device for detecting abnormity of business transaction data and computer equipment

Info

Publication number
CN112632469A
Authority
CN
China
Prior art keywords
transaction data
business transaction
data
preset
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011529615.5A
Other languages
Chinese (zh)
Inventor
王德勋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
OneConnect Smart Technology Co Ltd
OneConnect Financial Technology Co Ltd Shanghai
Original Assignee
OneConnect Financial Technology Co Ltd Shanghai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by OneConnect Financial Technology Co Ltd Shanghai
Priority to CN202011529615.5A
Publication of CN112632469A
Priority to PCT/CN2021/109385 (WO2022134579A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Abstract

The invention discloses a method and a device for detecting abnormality of business transaction data and computer equipment, and mainly aims to solve the problem that in the prior art, an abnormality detection result is inaccurate due to unbalance of positive and negative samples. The method comprises the following steps: acquiring business transaction data to be detected; mapping the business transaction data to a data space subject to preset distribution to obtain a hidden variable of the business transaction data in the data space; determining confidence intervals of the preset distribution under the corresponding confidence levels, and judging whether the hidden variables are in the confidence intervals; and if the data is not in the confidence interval, determining that the business transaction data is abnormal. The invention is mainly suitable for the anomaly detection of business transaction data.

Description

Method and device for detecting abnormity of business transaction data and computer equipment
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a device for detecting abnormity of business transaction data and computer equipment.
Background
With the rapid development of the market economy, a business transaction center generates a large number of business transactions of different types every day. Most of them are normal, but some are abnormal; abnormal transactions have an adverse effect on the market transaction process and damage the interests of others, so anomaly detection needs to be performed on business transaction data at regular intervals.
At present, anomaly detection on business transaction data generally involves collecting and labeling positive and negative sample data, constructing an anomaly detection model from the labeled sample data, and then using that model to detect anomalies in the business transaction data. However, business transaction sample data is usually imbalanced between positive and negative samples, that is, there is far more normal transaction sample data than abnormal transaction sample data. This imbalance can degrade the trained anomaly detection model and lower the accuracy of anomaly detection. In addition, labeling positive and negative sample data increases the workload of detection personnel and the cost of data anomaly detection.
Disclosure of Invention
The invention provides a method and a device for detecting the abnormity of business transaction data and computer equipment, which mainly solve the problem of inaccurate abnormity detection result caused by unbalance of positive and negative samples in the prior art, and simultaneously avoid marking the sample transaction data, thereby reducing the workload of detection personnel and reducing the cost of data abnormity detection.
According to a first aspect of the present invention, there is provided a method for detecting an anomaly of business transaction data, comprising:
acquiring business transaction data to be detected;
mapping the business transaction data to a data space subject to preset distribution to obtain a hidden variable of the business transaction data in the data space;
determining confidence intervals of the preset distribution under the corresponding confidence levels, and judging whether the hidden variables are in the confidence intervals;
and if the data is not in the confidence interval, determining that the business transaction data is abnormal.
According to a second aspect of the present invention, there is provided an anomaly detection apparatus for business transaction data, comprising:
the acquisition unit is used for acquiring the business transaction data to be detected;
the mapping unit is used for mapping the business transaction data to a data space subject to preset distribution to obtain a hidden variable of the business transaction data in the data space;
the judging unit is used for determining confidence intervals of the preset distribution under the corresponding confidence levels and judging whether the hidden variables are in the confidence intervals or not;
and the determining unit is used for determining that the business transaction data is abnormal if the business transaction data is not in the confidence interval.
According to a third aspect of the present invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
acquiring business transaction data to be detected;
mapping the business transaction data to a data space subject to preset distribution to obtain a hidden variable of the business transaction data in the data space;
determining confidence intervals of the preset distribution under the corresponding confidence levels, and judging whether the hidden variables are in the confidence intervals;
and if the data is not in the confidence interval, determining that the business transaction data is abnormal.
According to a fourth aspect of the present invention, there is provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the program:
acquiring business transaction data to be detected;
mapping the business transaction data to a data space subject to preset distribution to obtain a hidden variable of the business transaction data in the data space;
determining confidence intervals of the preset distribution under the corresponding confidence levels, and judging whether the hidden variables are in the confidence intervals;
and if the data is not in the confidence interval, determining that the business transaction data is abnormal.
Compared with the current approach of performing anomaly detection on business transaction data with a trained anomaly detection model, the anomaly detection method, apparatus, and computer device provided by the invention acquire the business transaction data to be detected; map the business transaction data to a data space subject to a preset distribution to obtain a hidden variable of the business transaction data in the data space; determine confidence intervals of the preset distribution under the corresponding confidence levels and judge whether the hidden variable is within the confidence interval; and, if it is not within the confidence interval, determine that the business transaction data is abnormal. Because the business transaction data to be detected is mapped to a data space with a known distribution, and the resulting hidden variables are statistically analyzed to decide whether the data is abnormal, the inaccuracy caused by the imbalance of positive and negative samples in the prior art can be overcome and the accuracy of the anomaly detection result improved. Labeling of sample transaction data is also avoided, which reduces the workload of detection personnel and the cost of data anomaly detection.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart of an anomaly detection method for business transaction data according to an embodiment of the present invention;
FIG. 2 is a flow chart of another anomaly detection method for business transaction data according to an embodiment of the present invention;
FIG. 3 is a plan view for visual analysis according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an anomaly detection apparatus for business transaction data according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of another anomaly detection apparatus for business transaction data according to an embodiment of the present invention;
FIG. 6 is a physical structure diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The invention will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
At present, when anomaly detection is performed on business transaction data with an anomaly detection model, the business transaction sample data is usually imbalanced between positive and negative samples, that is, there is far more normal transaction sample data than abnormal transaction sample data, so the trained anomaly detection model performs poorly and the accuracy of anomaly detection is low.
In order to solve the above problem, an embodiment of the present invention provides an anomaly detection method for business transaction data, as shown in fig. 1, where the method includes:
101. Acquiring the business transaction data to be detected.
The embodiment of the invention is mainly suitable for anomaly detection of business transaction data, and the execution subject of the embodiment of the invention is a device or equipment capable of carrying out anomaly detection on the business transaction data.
For the embodiment of the present invention, because the business transaction center generates a large amount of business transaction data at every moment, the business transaction data within a period of time, for example one month or one year, can be collected and subjected to anomaly detection. Specifically, because business transaction data such as stock trading data is related to time, a preset duration can be set as the sliding window size and a time series constructed according to that size. For example, the following are counted every 5 minutes: the mean, variance, and range (extreme difference) of the trading volume; the mean, variance, and range of the response time; the mean, variance, and standard deviation of the transaction success rate; the relative date of the time period (such as the day in a month); and the time of the time period (such as the hour in a day). In this way, multiple groups of business transaction data to be detected within a period of time can be obtained.
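The windowed statistics described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the record layout `(timestamp_s, amount, response_time, success)`, the function name `window_features`, and the use of fixed, non-overlapping 5-minute windows are all assumptions made for illustration, and only a subset of the listed statistics is computed.

```python
from statistics import mean, pvariance

def window_features(records, window_seconds=300):
    """Group raw transactions into fixed windows and compute per-window statistics.

    records: iterable of (timestamp_s, amount, response_time, success) tuples.
    This record layout is a hypothetical one chosen for the sketch.
    """
    buckets = {}
    for ts, amount, resp, ok in records:
        # Integer division assigns each record to its window index.
        buckets.setdefault(int(ts // window_seconds), []).append((amount, resp, ok))

    features = []
    for idx in sorted(buckets):
        amounts = [r[0] for r in buckets[idx]]
        resps = [r[1] for r in buckets[idx]]
        oks = [1.0 if r[2] else 0.0 for r in buckets[idx]]
        features.append({
            "window_start": idx * window_seconds,
            "amount_mean": mean(amounts),
            "amount_var": pvariance(amounts),
            "amount_range": max(amounts) - min(amounts),  # "extreme difference"
            "resp_mean": mean(resps),
            "resp_var": pvariance(resps),
            "resp_range": max(resps) - min(resps),
            "success_rate": mean(oks),
        })
    return features

# Three transactions: two fall in the first 5-minute window, one in the second.
sample = [(0, 10.0, 0.2, True), (100, 20.0, 0.4, False), (400, 5.0, 0.1, True)]
feats = window_features(sample)
```

Each dictionary in `feats` corresponds to one group of business transaction data to be detected; the day-of-month and hour-of-day features mentioned in the text could be derived from `window_start` in the same loop.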
102. And mapping the business transaction data to a data space subject to preset distribution to obtain a hidden variable of the business transaction data in the data space.
The preset distribution can be the standard normal distribution or another distribution. Because many things in daily life follow a normal distribution, the embodiment of the invention takes the standard normal distribution as an example to detail the data mapping and statistical analysis processes. To facilitate analysis and statistics, the business transaction data to be detected needs to be mapped to a data space that follows a known distribution. Specifically, a preset encoder can be used to map the business transaction data to a data space that follows the standard normal distribution: the business transaction data to be detected is input into the preset encoder and passes through a series of nonlinear transformations to obtain the hidden variable corresponding to the business transaction data. Because the hidden variable follows the standard normal distribution, it can be statistically analyzed using the statistical properties of the standard normal distribution, and whether the business transaction data is abnormal can be judged from the result of that analysis.
103. And determining confidence intervals of the preset distribution under the corresponding confidence levels, and judging whether the hidden variables are in the confidence intervals.
For the embodiment of the present invention, after the hidden variable corresponding to the business transaction data is obtained, because the hidden variable follows a known distribution such as the standard normal distribution, it can be statistically analyzed using the statistical properties of that distribution, and whether the business transaction data is abnormal can then be determined from the result. Specifically, because the hidden variable follows the standard normal distribution, the confidence intervals of the standard normal distribution under the corresponding confidence levels can first be determined from the hidden variable, for example at the confidence levels of 68%, 95%, and 99.7%. These confidence intervals can be calculated by a mathematical formula, where the specific formula is as follows:
$$\bar{x} \pm z\,\frac{s}{\sqrt{n}}$$
where $\bar{x}$ represents the mean value of the hidden variables corresponding to the business transaction data; $z$ represents the number of standard deviations, which depends on the confidence level and can be determined by table lookup; $s$ represents the standard deviation of the hidden variables (the square root of their variance); and $n$ is the number of hidden variables. The confidence interval of the standard normal distribution under the corresponding confidence level can therefore be calculated according to the formula. For example, the interval covering 95% of the standard normal distribution is approximately [-1.96, 1.96], and the interval covering 68% is approximately [-1, 1]. Furthermore, the confidence level may be set according to actual business requirements: for example, it may be set to 99.7% if the final anomaly detection result should be highly accurate, or to 68% if as much abnormal business transaction data as possible should be excluded.
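As a hedged illustration of the confidence interval calculation, the sketch below evaluates the mean plus or minus z times s over the square root of n, using Python's standard `statistics.NormalDist` in place of a table lookup for z. The function names are hypothetical, and s is taken to be the sample standard deviation, which is an assumption about the formula's intent.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def z_for_confidence(level):
    """Two-sided critical value z of the standard normal for a level such as 0.95."""
    return NormalDist().inv_cdf(0.5 + level / 2.0)

def confidence_interval(values, level=0.95):
    """Return (low, high) = mean -/+ z * s / sqrt(n) for the given hidden variables."""
    n = len(values)
    centre = mean(values)
    half_width = z_for_confidence(level) * stdev(values) / sqrt(n)
    return centre - half_width, centre + half_width

z95 = z_for_confidence(0.95)      # roughly 1.96
z68 = z_for_confidence(0.6827)    # roughly 1.0
low, high = confidence_interval([-1.0, 0.0, 1.0], level=0.95)
```

Raising the confidence level widens the interval, so fewer points are flagged; lowering it narrows the interval and flags more.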
104. And if the data is not in the confidence interval, determining that the business transaction data is abnormal.
For the embodiment of the invention, after the confidence interval of the standard normal distribution under the corresponding confidence level is determined, it is judged whether the hidden variable corresponding to the business transaction data to be detected is within that confidence interval. For example, the confidence interval of the standard normal distribution at the 95% confidence level is determined to be approximately [-1.96, 1.96], and it is judged whether the hidden variable x corresponding to the business transaction data lies within it. If it does, the business transaction data corresponding to the hidden variable has a 95% possibility of being normal business transaction data; if it does not, the business transaction data corresponding to the hidden variable has a 95% possibility of being abnormal data. The business transactions in the corresponding time period can then be located from the abnormal business transaction data. For example, if the business transaction data is the statistical data of all business transactions between 8:30 a.m. and 8:35 a.m., and that data is abnormal, it can be determined that the business transactions between 8:30 a.m. and 8:35 a.m. are likely to be abnormal.
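The interval check and the step of locating the affected time window can be sketched as follows. This is only an illustration under stated assumptions: each window is represented by a single scalar hidden variable, the interval is the symmetric range [-z, z] of the standard normal distribution, and the window labels and the function name `flag_anomalies` are invented for the example.

```python
from statistics import NormalDist

def flag_anomalies(hidden_by_window, level=0.95):
    """Return the labels of windows whose hidden variable lies outside [-z, z].

    hidden_by_window: dict mapping a window label to its scalar hidden variable.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return [label for label, x in hidden_by_window.items() if abs(x) > z]

# Hypothetical hidden variables for two consecutive 5-minute windows.
hidden = {"08:30-08:35": 2.7, "08:35-08:40": 0.3}
flagged_95 = flag_anomalies(hidden, level=0.95)
flagged_997 = flag_anomalies(hidden, level=0.997)
```

At the 95% level the first window is flagged, while at the 99.7% level neither is, which shows how the chosen confidence level trades missed anomalies against false alarms.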
The anomaly detection method for business transaction data provided by the embodiment of the invention acquires the business transaction data to be detected; maps the business transaction data to a data space subject to a preset distribution to obtain a hidden variable of the business transaction data in the data space; determines confidence intervals of the preset distribution under the corresponding confidence levels and judges whether the hidden variable is within the confidence interval; and, if it is not within the confidence interval, determines that the business transaction data is abnormal. Because the business transaction data to be detected is mapped to a data space with a known distribution, and the resulting hidden variables are statistically analyzed to decide whether the data is abnormal, the inaccuracy caused by the imbalance of positive and negative samples in the prior art can be overcome and the accuracy of the anomaly detection result improved. Labeling of sample transaction data is also avoided, which reduces the workload of detection personnel and the cost of data anomaly detection.
Further, in order to better describe the anomaly detection process of the business transaction data, as a refinement and an extension of the foregoing embodiment, an embodiment of the present invention provides another anomaly detection method for business transaction data, as shown in fig. 2, where the method includes:
201. Acquiring sample business transaction data, and inputting the sample business transaction data into an initial encoder for encoding to obtain a hidden variable corresponding to the sample business transaction data.
For the embodiment of the present invention, in order to construct the preset encoder, sample business transaction data within a period of time may be collected from a business transaction center, a preset duration set as the sliding window size, and a time series constructed according to that size. For example, the transaction amount mean, variance, and range (extreme difference), the response time mean, variance, and range, the transaction success rate mean, variance, and standard deviation, the relative date of the time period (for example, the day in a month), and the time of the time period (for example, the hour in a day) are counted every 5 minutes, so that multiple groups of sample business transaction data are obtained. The sample business transaction data is then input into an initial encoder for encoding, and the hidden variable corresponding to the sample business transaction data is obtained through a series of nonlinear transformations.
202. And inputting the hidden variable corresponding to the sample service transaction data into an initial decoder for decoding to obtain restored sample service transaction data, training the initial encoder according to the restored sample service transaction data and the sample service transaction data, and constructing the preset encoder.
Further, the hidden variable corresponding to the sample business transaction data is input into an initial decoder for decoding, that is, the hidden variable is restored to obtain restored sample business transaction data. During training of the initial encoder, the restored sample business transaction data needs to be made as close as possible to the original sample business transaction data. On this basis, training the initial encoder according to the restored sample business transaction data and the sample business transaction data to construct the preset encoder includes: constructing a reconstruction loss function and a relative entropy loss function according to the restored sample business transaction data and the sample business transaction data; and updating the parameters of the initial encoder under the condition that the sum of the reconstruction loss function and the relative entropy loss function is minimal, to obtain the preset encoder.
Specifically, a reconstruction loss function and a relative entropy loss function are constructed according to the restored sample business transaction data and the originally input sample business transaction data. The reconstruction loss function expresses the sample restoration that the encoder and decoder are required to accomplish: it measures the difference between the restored sample business transaction data and the originally input sample business transaction data, and the smaller the difference, the closer the two are. The relative entropy loss function measures the distance, or similarity, between two probability distributions; minimizing it brings the hidden distribution learned by the encoder closer to the prior distribution of the data, which makes the encoder (model) more robust. The specific calculation formula is as follows:
$$D_{KL}(P \,\|\, Q) = \sum_{x} P(x)\,\log\frac{P(x)}{Q(x)}$$
where $P$ denotes the hidden distribution learned by the encoder and $Q$ the prior distribution. In the matrix operations of the neural network, the relative entropy formula is transformed as follows:
$$\mathcal{L}_{KL} = -\frac{1}{2}\sum_{i}\left(1 + \log\sigma_i^{2} - \mu_i^{2} - \sigma_i^{2}\right)$$
where $\sigma_i$ and $\mu_i$ are respectively the standard deviation and the mean corresponding to each group of sample business transaction data. The reconstruction loss function and the relative entropy loss function are then added together using the weights set for each of them, the encoder parameters that minimize the combined loss value are solved for, and the parameters of the initial encoder are updated accordingly, thereby obtaining the preset encoder used in the embodiments of the present invention. The preset encoder is then used to map the business transaction data to the data space that follows the standard normal distribution.
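The weighted sum of the two losses can be sketched in plain Python as below. This is an illustrative assumption rather than the patent's implementation: the reconstruction loss is taken to be a mean squared error, the relative entropy term is the usual closed form for a diagonal Gaussian measured against the standard normal prior, and the weights `w_rec` and `w_kl` are hypothetical names.

```python
from math import log

def reconstruction_loss(original, restored):
    """Mean squared difference between the input sample and its restored version."""
    return sum((o - r) ** 2 for o, r in zip(original, restored)) / len(original)

def relative_entropy_loss(mu, sigma):
    """-1/2 * sum(1 + log(sigma_i^2) - mu_i^2 - sigma_i^2) against N(0, 1)."""
    return -0.5 * sum(1.0 + log(s * s) - m * m - s * s for m, s in zip(mu, sigma))

def total_loss(original, restored, mu, sigma, w_rec=1.0, w_kl=1.0):
    """Weighted sum of both terms; training would minimize this value."""
    return (w_rec * reconstruction_loss(original, restored)
            + w_kl * relative_entropy_loss(mu, sigma))

# The relative entropy term vanishes when the encoder outputs mean 0 and
# standard deviation 1, i.e. when it already matches the standard normal prior.
kl_at_prior = relative_entropy_loss([0.0, 0.0], [1.0, 1.0])
kl_shifted = relative_entropy_loss([1.0], [1.0])
loss = total_loss([1.0, 2.0], [1.0, 4.0], [1.0], [1.0], w_rec=1.0, w_kl=2.0)
```

In practice the gradients of this loss with respect to the encoder and decoder parameters would be computed by a deep learning framework; the sketch only shows how the loss value itself is formed.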
203. And acquiring the business transaction data to be detected.
For the embodiment of the present invention, in order to obtain the service transaction data to be detected, step 203 specifically includes: acquiring service transaction information in a preset time period; counting a transaction amount mean value, a transaction amount variance, a transaction amount extreme difference, a response time mean value, a response time variance, a response time extreme difference, a transaction success rate mean value, a transaction success rate variance and a transaction success rate standard deviation in a preset duration sliding window according to the service transaction information; and determining the mean value of the transaction amount, the variance of the transaction amount, the extreme difference of the transaction amount, the mean value of the response time, the variance of the response time, the extreme difference of the response time, the mean value of the transaction success rate, the variance of the transaction success rate and the standard deviation of the transaction success rate in the preset time length sliding window as the service transaction data to be detected. 
The business transaction information is the business transactions collected within a period of time. In order to detect anomalies in these business transactions, the size of a preset duration sliding window can be set. For example, every 10 minutes the transaction amount mean, variance, and range (extreme difference), the response time mean, variance, and range, the transaction success rate mean, variance, and standard deviation, the relative date of the time period (for example, the day in a month), and the time of the time period (for example, the hour in a day) are computed for the group of business transactions in that time period, so that multiple groups of business transaction data to be detected within the period are obtained.
204. And inputting the business transaction data into a preset encoder for encoding to obtain the hidden variable of the business transaction data in the data space.
For the embodiment of the present invention, since the low-dimensional hidden variables are more convenient for statistics and analysis, when the encoder is used to obtain the hidden variables corresponding to the business transaction data, the hidden variables also need to be subjected to dimension reduction processing, based on this, the preset encoder includes an encoding module and a dimension reduction module, the business transaction data is input to the preset encoder to be encoded, and the hidden variables of the business transaction data in the data space are obtained, which includes: inputting the business transaction data into a coding module in a preset coder for coding to obtain a hidden variable of the business transaction data in the data space; and inputting the hidden variable into a dimensionality reduction module in the preset encoder for dimensionality reduction processing to obtain the hidden variable subjected to dimensionality reduction processing.
Specifically, the business transaction data to be detected is input into the encoding module of the preset encoder for encoding. The encoding module may be an attention mechanism layer of the preset encoder; passing through the attention mechanism layer, the business transaction data undergoes a series of nonlinear transformations that yield the corresponding hidden variable. The hidden variable is then input into the dimensionality reduction module of the preset encoder, which may be a fully connected layer; the linear transformation of the fully connected layer yields the hidden variable after dimensionality reduction. For example, if the hidden variable output by the encoding module is a 6-dimensional variable, the dimensionality reduction module reduces it to a 3-dimensional or 4-dimensional variable.
Furthermore, in order to perform visual analysis on the hidden variable corresponding to the business transaction data, the hidden variable output by the encoder may be reduced to a 2-dimensional or 3-dimensional variable. For example, if the hidden variable after dimensionality reduction is a 2-dimensional variable, a plan view can be drawn from its horizontal and vertical coordinates: each hidden variable after dimensionality reduction is treated as a point to be detected and plotted on a plane at its coordinates, so that the hidden variables corresponding to the business transaction data can be analyzed visually.
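The dimensionality reduction performed by the fully connected layer amounts to a linear map y = Wx + b. The sketch below shows only that step, reducing a 6-dimensional hidden variable to 3 dimensions; the attention layer is omitted, and the weight matrix, bias, and input values are made-up illustrative numbers rather than trained parameters.

```python
def fully_connected(vector, weights, bias):
    """Apply y = W x + b, where weights has one row per output dimension."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

# Hypothetical 3x6 weight matrix that averages neighbouring pairs of inputs.
W = [
    [0.5, 0.5, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
]
b = [0.0, 0.0, 0.0]

hidden_6d = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
hidden_3d = fully_connected(hidden_6d, W, b)
```

A trained layer would learn `W` and `b` jointly with the rest of the encoder; reducing to 2 dimensions instead of 3 only requires a weight matrix with two rows.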
205. And determining confidence intervals of the preset distribution under the corresponding confidence levels, and judging whether the hidden variables are in the confidence intervals.
In order to visualize statistical analysis of hidden variables corresponding to business transaction data, the determining confidence intervals of the preset distribution under corresponding confidence levels and determining whether the hidden variables are within the confidence intervals includes: if the two-dimensional hidden variables corresponding to the business transaction data obey standard normal distribution, determining a confidence interval of the standard normal distribution under a corresponding confidence level according to the two-dimensional hidden variables; drawing a plane graph based on the determined confidence interval and a two-dimensional hidden variable corresponding to the business transaction data, and determining a target range covered by the confidence interval on the plane graph; and judging whether the point to be detected corresponding to the two-dimensional hidden variable is in the target range.
Specifically, because the two-dimensional hidden variables after dimensionality reduction follow the standard normal distribution, they can be analyzed visually using the statistical properties of the standard normal distribution. First, from the two-dimensional hidden variables corresponding to the business transaction data, the confidence interval of the standard normal distribution under the corresponding confidence level is calculated with the confidence interval formula. Next, a plan view is drawn from the determined confidence interval and the two-dimensional hidden variables: each two-dimensional hidden variable is treated as a point to be detected and plotted on the plan view at its horizontal and vertical coordinates, and a circle is then drawn with the coordinate origin as its center and the length of the confidence interval under the corresponding confidence level as its diameter. As shown in FIG. 3, the points in the figure are the two-dimensional hidden variables corresponding to each group of business transaction data, and the outer and inner circles respectively represent the confidence intervals of the standard normal distribution at the 99.7% and 95% confidence levels. The range within a circle is the target range: points inside the outer circle have a 99.7% probability of being normal points and points outside it have a 99.7% probability of being abnormal points, while points inside the inner circle have a 95% probability of being normal points and points outside it have a 95% probability of being abnormal points. Abnormal business transaction data can therefore be determined at the confidence level that matches the user's requirements.
206. If the hidden variable is not within the confidence interval, determining that the business transaction data is abnormal.
For the embodiment of the present invention, when the two-dimensional hidden variable is analyzed visually, step 206 specifically includes: if the point to be detected is not within the target range, determining that the business transaction data is abnormal data at the corresponding confidence level; and if the point to be detected is within the target range, determining that the business transaction data is not abnormal data at the corresponding confidence level. As shown in fig. 3, the outer circle is the confidence interval of the standard normal distribution at the 99.7% confidence level, and the range within the circle is the target range. If the point to be detected corresponding to the two-dimensional hidden variable is not within the target range, that is, not inside the circle, the business transaction data to be detected is abnormal data; if the point to be detected is within the target range, that is, inside the circle, the business transaction data to be detected is not abnormal data.
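The in-circle test of step 206 can be sketched as a distance check against the origin. The sample points and the radius value 3.0 (an approximation of the 99.7% circle) are illustrative assumptions, as is the choice to treat a point lying exactly on the circle as within the target range.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def is_abnormal(point: Point, radius: float) -> bool:
    """Return True when the point to be detected lies outside the circle
    centered at the origin, i.e. outside the target range."""
    x, y = point
    return math.hypot(x, y) > radius


# Hypothetical two-dimensional hidden variables for three groups of data.
points: List[Point] = [(0.2, -0.5), (2.5, 2.5), (-3.1, 0.0)]
OUTER_RADIUS = 3.0  # assumed approximate radius of the 99.7% circle

flags = [is_abnormal(p, OUTER_RADIUS) for p in points]
```

Here the first point falls inside the circle and the other two fall outside it, so only the last two would be reported as abnormal at this level.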
With this further method for detecting anomalies in business transaction data, the embodiment of the present invention can acquire the business transaction data to be detected; map the business transaction data to a data space obeying a preset distribution to obtain a hidden variable of the business transaction data in the data space; determine the confidence interval of the preset distribution at the corresponding confidence level and judge whether the hidden variable is within the confidence interval; and, if the hidden variable is not within the confidence interval, determine that the business transaction data is abnormal. Because the business transaction data to be detected is mapped to a data space whose distribution is known, the resulting hidden variable can be analyzed statistically, and whether the business transaction data is abnormal can be judged from the result of that analysis. This avoids the inaccurate detection results caused in the prior art by an imbalance between positive and negative samples and so improves the accuracy of the anomaly detection result. It also avoids the need to label sample transaction data, which reduces the workload of detection personnel and the cost of data anomaly detection.
Further, as a specific implementation of fig. 1, an embodiment of the present invention provides an apparatus for detecting an anomaly of business transaction data, as shown in fig. 4, the apparatus includes: an acquisition unit 31, a mapping unit 32, a judgment unit 33, and a determination unit 34.
The acquiring unit 31 may be configured to acquire the business transaction data to be detected. It is the main functional module of the apparatus for acquiring the business transaction data to be detected.
The mapping unit 32 may be configured to map the business transaction data to a data space obeying a preset distribution, so as to obtain a hidden variable of the business transaction data in the data space. It is the main functional module, and also a core module, of the apparatus for performing this mapping.
The judging unit 33 may be configured to determine the confidence interval of the preset distribution at the corresponding confidence level and to judge whether the hidden variable is within the confidence interval. It is the main functional module, and also a core module, of the apparatus for performing this determination and judgment.
The determining unit 34 may be configured to determine that the business transaction data is abnormal if the hidden variable is not within the confidence interval. It is the main functional module of the apparatus for making this determination.
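The division of labor among the four units can be sketched as a small pipeline. All names here (`AnomalyDetector`, its fields, and the toy stand-in functions) are hypothetical, and the circular target range of radius 3.0 is an assumption borrowed from the two-dimensional case described for the method.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class AnomalyDetector:
    # Stand-in for acquiring unit 31: returns the data to be detected.
    acquire: Callable[[], Sequence[float]]
    # Stand-in for mapping unit 32: maps the data to a 2D hidden variable.
    map_to_hidden: Callable[[Sequence[float]], Tuple[float, float]]
    # Radius of the circular target range used by the judging step.
    radius: float

    def within_interval(self, hidden: Tuple[float, float]) -> bool:
        """Role of judging unit 33: is the hidden variable in the target range?"""
        return math.hypot(hidden[0], hidden[1]) <= self.radius

    def run(self) -> bool:
        """Role of determining unit 34: True means the data is abnormal."""
        data = self.acquire()
        hidden = self.map_to_hidden(data)
        return not self.within_interval(hidden)


# Toy stand-ins: the "encoder" simply takes the first two features.
detector = AnomalyDetector(
    acquire=lambda: [4.0, 0.5, 1.0],
    map_to_hidden=lambda d: (d[0], d[1]),
    radius=3.0,
)
result = detector.run()
```

In a real apparatus the mapping step would be the trained preset encoder rather than a feature selector; the sketch only shows how the units hand data to one another.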
In a specific application scenario, in order to determine a hidden variable of the service transaction data in the data space, the mapping unit 32 may be specifically configured to input the service transaction data to a preset encoder for encoding, so as to obtain the hidden variable of the service transaction data in the data space.
In a specific application scenario, the preset encoder includes an encoding module and a dimension reduction module, and in order to perform dimension reduction processing on the hidden variable, as shown in fig. 5, the mapping unit 32 includes: an encoding module 321 and a dimension reduction module 322.
The encoding module 321 may be configured to input the business transaction data to the encoding module in the preset encoder for encoding, so as to obtain a hidden variable of the business transaction data in the data space.
The dimension reduction module 322 may be configured to input the hidden variable to a dimension reduction module in the preset encoder to perform dimension reduction processing, so as to obtain a hidden variable after the dimension reduction processing.
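The two-stage structure of the preset encoder (encoding, then dimension reduction) can be sketched as below. The layer types, the sizes (3 features to 4 hidden values to 2 dimensions), the tanh activation, and the fixed weights are all assumptions made for illustration; a trained preset encoder would supply learned parameters instead.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def linear(x: Sequence[float], weights: Matrix, bias: Sequence[float]) -> List[float]:
    """Affine map: one output value per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def encode(x: Sequence[float], weights: Matrix, bias: Sequence[float]) -> List[float]:
    """Assumed encoding module: affine map followed by tanh."""
    return [math.tanh(v) for v in linear(x, weights, bias)]


def reduce_dim(h: Sequence[float], weights: Matrix, bias: Sequence[float]) -> List[float]:
    """Assumed dimension reduction module: affine projection to two dimensions."""
    return linear(h, weights, bias)


# Hypothetical fixed parameters, not learned values.
W_ENC = [[0.1, 0.0, -0.2], [0.0, 0.3, 0.1], [0.2, -0.1, 0.0], [0.05, 0.05, 0.05]]
B_ENC = [0.0, 0.0, 0.0, 0.0]
W_RED = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
B_RED = [0.0, 0.0]

features = [1.0, 2.0, 3.0]                 # business transaction data (toy values)
hidden = encode(features, W_ENC, B_ENC)    # hidden variable in the data space
z = reduce_dim(hidden, W_RED, B_RED)       # two-dimensional hidden variable
```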
In a specific application scenario, the hidden variable after the dimension reduction processing is a two-dimensional hidden variable, and the judging unit 33 includes: a determining module 331, a drawing module 332, and a judging module 333.
The determining module 331 may be configured to determine the confidence interval of the standard normal distribution at the corresponding confidence level according to the two-dimensional hidden variable if the two-dimensional hidden variable corresponding to the business transaction data obeys the standard normal distribution.
The drawing module 332 may be configured to draw a plane graph based on the determined confidence interval and the two-dimensional hidden variable corresponding to the business transaction data, and determine the target range covered by the confidence interval on the plane graph.
The judging module 333 may be configured to judge whether the point to be detected corresponding to the two-dimensional hidden variable is within the target range.
The determining unit 34 is specifically configured to determine that the business transaction data is abnormal data at a corresponding confidence level if the point to be detected is not within the target range; and if the point to be detected is within the target range, determining that the business transaction data is not abnormal data under the corresponding confidence level.
In a specific application scenario, in order to construct the preset encoder, the apparatus further includes: an encoding unit 35, a decoding unit 36 and a construction unit 37.
The acquiring unit 31 may be further configured to acquire sample business transaction data.
The encoding unit 35 may be configured to input the sample service transaction data into an initial encoder to encode, so as to obtain a hidden variable corresponding to the sample service transaction data.
The decoding unit 36 may be configured to input the hidden variable corresponding to the sample service transaction data into an initial decoder for decoding, so as to obtain restored sample service transaction data.
The constructing unit 37 may be configured to train the initial encoder according to the restored sample service transaction data and the sample service transaction data, and construct the preset encoder.
Further, in order to construct the preset encoder, the constructing unit 37 includes: a constructing module 371 and an updating module 372.
The constructing module 371 may be configured to construct a reconstruction loss function and a relative entropy loss function, respectively, according to the restored sample business transaction data and the sample business transaction data.
The updating module 372 may be configured to update the parameters in the initial encoder so that the sum of the reconstruction loss function and the relative entropy loss function is minimized, thereby obtaining the preset encoder.
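The combined training objective can be sketched as follows. The text names a reconstruction loss and a relative entropy (KL divergence) loss but gives no formulas here, so this sketch assumes the form commonly used in variational autoencoders: mean squared error for reconstruction, and the KL divergence between a diagonal Gaussian (parameterized by a hypothetical `mean` and `log_var` output of the encoder) and the standard normal distribution.

```python
import math
from typing import Sequence


def reconstruction_loss(original: Sequence[float], restored: Sequence[float]) -> float:
    """Mean squared error between the sample data and the restored sample data."""
    return sum((o - r) ** 2 for o, r in zip(original, restored)) / len(original)


def relative_entropy_loss(mean: Sequence[float], log_var: Sequence[float]) -> float:
    """KL( N(mean, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mean, log_var))


def total_loss(
    original: Sequence[float],
    restored: Sequence[float],
    mean: Sequence[float],
    log_var: Sequence[float],
) -> float:
    """Sum of the two losses; training would update the encoder to minimize it."""
    return reconstruction_loss(original, restored) + relative_entropy_loss(mean, log_var)


# Toy values: a reconstruction error on one feature and a shifted latent mean.
loss = total_loss([1.0, 2.0], [1.0, 4.0], mean=[1.0, 0.0], log_var=[0.0, 0.0])
```

The relative entropy term is what pulls the hidden variables toward the standard normal distribution, which is the property the confidence-interval test relies on.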
Further, in order to acquire the business transaction data, the acquiring unit 31 includes: an acquiring module 311, a statistics module 312, and a determining module 313.
The acquiring module 311 may be configured to acquire business transaction information within a preset time period.
The statistics module 312 may be configured to compute, according to the business transaction information, the transaction amount mean, transaction amount variance, transaction amount extreme difference (range), response time mean, response time variance, response time extreme difference (range), transaction success rate mean, transaction success rate variance, and transaction success rate standard deviation within a sliding window of a preset duration.
The determining module 313 may be configured to determine the transaction amount mean, transaction amount variance, transaction amount extreme difference, response time mean, response time variance, response time extreme difference, transaction success rate mean, transaction success rate variance, and transaction success rate standard deviation within the sliding window of the preset duration as the business transaction data to be detected.
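The nine window statistics can be sketched with the Python standard library. The record layout (amount, response time, success rate), the window expressed as a number of records rather than a duration, and the use of population variance and standard deviation are assumptions; the function names are hypothetical.

```python
from statistics import fmean, pstdev, pvariance
from typing import List, Sequence, Tuple

Record = Tuple[float, float, float]  # (amount, response time, success rate)


def window_features(chunk: Sequence[Record]) -> List[float]:
    """Nine statistics for one window, in the order listed in the text."""
    amounts = [r[0] for r in chunk]
    times = [r[1] for r in chunk]
    rates = [r[2] for r in chunk]
    return [
        fmean(amounts), pvariance(amounts), max(amounts) - min(amounts),
        fmean(times), pvariance(times), max(times) - min(times),
        fmean(rates), pvariance(rates), pstdev(rates),
    ]


def sliding_features(records: Sequence[Record], window: int, step: int = 1) -> List[List[float]]:
    """Slide a fixed-size window over the records and collect its features."""
    return [
        window_features(records[start:start + window])
        for start in range(0, len(records) - window + 1, step)
    ]


# Toy business transaction information for four consecutive intervals.
rows: List[Record] = [
    (100.0, 0.2, 1.0),
    (200.0, 0.4, 0.9),
    (300.0, 0.6, 0.8),
    (400.0, 0.8, 0.7),
]
features = sliding_features(rows, window=2)
```

Each inner list is one nine-dimensional vector of business transaction data to be detected, ready to be passed to the encoder.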
It should be noted that other corresponding descriptions of the functional modules involved in the anomaly detection apparatus for business transaction data provided in the embodiment of the present invention may refer to the corresponding description of the method shown in fig. 1, and are not described herein again.
Based on the method shown in fig. 1, correspondingly, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the following steps: acquiring business transaction data to be detected; mapping the business transaction data to a data space subject to preset distribution to obtain a hidden variable of the business transaction data in the data space; determining confidence intervals of the preset distribution under the corresponding confidence levels, and judging whether the hidden variables are in the confidence intervals; and if the data is not in the confidence interval, determining that the business transaction data is abnormal.
Based on the embodiments of the method shown in fig. 1 and the apparatus shown in fig. 4, an embodiment of the present invention further provides a computer device, whose physical structure is shown in fig. 6. The computer device includes: a processor 41, a memory 42, and a computer program stored on the memory 42 and executable on the processor 41, wherein the memory 42 and the processor 41 are both arranged on a bus 43, and when the processor 41 executes the program, the following steps are performed: acquiring business transaction data to be detected; mapping the business transaction data to a data space obeying a preset distribution to obtain a hidden variable of the business transaction data in the data space; determining the confidence interval of the preset distribution at the corresponding confidence level, and judging whether the hidden variable is within the confidence interval; and if the hidden variable is not within the confidence interval, determining that the business transaction data is abnormal.
With this technical solution, the present invention can acquire the business transaction data to be detected; map the business transaction data to a data space obeying a preset distribution to obtain a hidden variable of the business transaction data in the data space; determine the confidence interval of the preset distribution at the corresponding confidence level and judge whether the hidden variable is within the confidence interval; and, if the hidden variable is not within the confidence interval, determine that the business transaction data is abnormal. Because the business transaction data to be detected is mapped to a data space whose distribution is known, the resulting hidden variable can be analyzed statistically, and whether the business transaction data is abnormal can be judged from the result of that analysis. This avoids the inaccurate detection results caused in the prior art by an imbalance between positive and negative samples and so improves the accuracy of the anomaly detection result. It also avoids the need to label sample transaction data, which reduces the workload of detection personnel and the cost of data anomaly detection.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general-purpose computing device. They may be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases the steps shown or described may be performed in an order different from that described herein, or the modules or steps may be fabricated as separate integrated circuit modules, or several of them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for detecting abnormality of business transaction data is characterized by comprising the following steps:
acquiring business transaction data to be detected;
mapping the business transaction data to a data space subject to preset distribution to obtain a hidden variable of the business transaction data in the data space;
determining confidence intervals of the preset distribution under the corresponding confidence levels, and judging whether the hidden variables are in the confidence intervals;
and if the hidden variable is not within the confidence interval, determining that the business transaction data is abnormal.
2. The method of claim 1, wherein the mapping the business transaction data to a data space subject to a preset distribution to obtain a hidden variable of the business transaction data in the data space comprises:
and inputting the business transaction data into a preset encoder for encoding to obtain the hidden variable of the business transaction data in the data space.
3. The method according to claim 2, wherein the preset encoder includes an encoding module and a dimension reduction module, and the inputting the service transaction data into the preset encoder for encoding to obtain the hidden variable of the service transaction data in the data space includes:
inputting the business transaction data into a coding module in a preset coder for coding to obtain a hidden variable of the business transaction data in the data space;
and inputting the hidden variable into a dimensionality reduction module in the preset encoder for dimensionality reduction processing to obtain the hidden variable subjected to dimensionality reduction processing.
4. The method according to claim 3, wherein the hidden variables after the dimension reduction processing are two-dimensional hidden variables, and the determining confidence intervals of the preset distributions at the corresponding confidence levels and determining whether the hidden variables are within the confidence intervals comprises:
if the two-dimensional hidden variables corresponding to the business transaction data obey standard normal distribution, determining a confidence interval of the standard normal distribution under a corresponding confidence level according to the two-dimensional hidden variables;
drawing a plane graph based on the determined confidence interval and a two-dimensional hidden variable corresponding to the business transaction data, and determining a target range covered by the confidence interval on the plane graph;
judging whether the point to be detected corresponding to the two-dimensional hidden variable is in the target range or not;
and the determining that the business transaction data is abnormal if the hidden variable is not within the confidence interval comprises:
if the point to be detected is not in the target range, determining that the business transaction data is abnormal data under the corresponding confidence level;
and if the point to be detected is within the target range, determining that the business transaction data is not abnormal data under the corresponding confidence level.
5. The method according to claim 1, wherein prior to the acquiring the business transaction data to be detected, the method further comprises:
acquiring sample business transaction data;
inputting the sample business transaction data into an initial encoder for encoding to obtain a hidden variable corresponding to the sample business transaction data;
inputting the hidden variable corresponding to the sample service transaction data into an initial decoder for decoding to obtain restored sample service transaction data;
and training the initial encoder according to the restored sample business transaction data and the sample business transaction data, and constructing the preset encoder.
6. The method according to claim 5, wherein the training the initial encoder according to the restored sample business transaction data and the sample business transaction data to construct the preset encoder comprises:
constructing a reconstruction loss function and a relative entropy loss function, respectively, according to the restored sample business transaction data and the sample business transaction data;
and updating parameters in the initial encoder so that the sum of the reconstruction loss function and the relative entropy loss function is minimized, to obtain the preset encoder.
7. The method according to any one of claims 1 to 6, wherein the acquiring the business transaction data to be detected comprises:
acquiring service transaction information in a preset time period;
counting a transaction amount mean value, a transaction amount variance, a transaction amount extreme difference, a response time mean value, a response time variance, a response time extreme difference, a transaction success rate mean value, a transaction success rate variance and a transaction success rate standard deviation in a preset duration sliding window according to the service transaction information;
and determining the mean value of the transaction amount, the variance of the transaction amount, the extreme difference of the transaction amount, the mean value of the response time, the variance of the response time, the extreme difference of the response time, the mean value of the transaction success rate, the variance of the transaction success rate and the standard deviation of the transaction success rate in the preset time length sliding window as the service transaction data to be detected.
8. An anomaly detection apparatus for business transaction data, comprising:
the acquisition unit is used for acquiring the business transaction data to be detected;
the mapping unit is used for mapping the business transaction data to a data space subject to preset distribution to obtain a hidden variable of the business transaction data in the data space;
the judging unit is used for determining confidence intervals of the preset distribution under the corresponding confidence levels and judging whether the hidden variables are in the confidence intervals or not;
and the determining unit is used for determining that the business transaction data is abnormal if the business transaction data is not in the confidence interval.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
10. A computer arrangement comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 7 when executed by the processor.
CN202011529615.5A 2020-12-22 2020-12-22 Method and device for detecting abnormity of business transaction data and computer equipment Pending CN112632469A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011529615.5A CN112632469A (en) 2020-12-22 2020-12-22 Method and device for detecting abnormity of business transaction data and computer equipment
PCT/CN2021/109385 WO2022134579A1 (en) 2020-12-22 2021-07-29 Method and apparatus for detecting abnormalities of service transaction data, and computer device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011529615.5A CN112632469A (en) 2020-12-22 2020-12-22 Method and device for detecting abnormity of business transaction data and computer equipment

Publications (1)

Publication Number Publication Date
CN112632469A true CN112632469A (en) 2021-04-09

Family

ID=75321126

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011529615.5A Pending CN112632469A (en) 2020-12-22 2020-12-22 Method and device for detecting abnormity of business transaction data and computer equipment

Country Status (2)

Country Link
CN (1) CN112632469A (en)
WO (1) WO2022134579A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022134579A1 (en) * 2020-12-22 2022-06-30 深圳壹账通智能科技有限公司 Method and apparatus for detecting abnormalities of service transaction data, and computer device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106529752B (en) * 2015-09-11 2020-04-14 阿里巴巴集团控股有限公司 Method and device for detecting risks of business operation
JP7331369B2 (en) * 2019-01-30 2023-08-23 日本電信電話株式会社 Abnormal Sound Additional Learning Method, Data Additional Learning Method, Abnormality Degree Calculating Device, Index Value Calculating Device, and Program
CN110263827B (en) * 2019-05-31 2021-08-20 中国工商银行股份有限公司 Abnormal transaction detection method and device based on transaction rule identification
CN112101554B (en) * 2020-11-10 2024-01-23 北京瑞莱智慧科技有限公司 Abnormality detection method and apparatus, device, and computer-readable storage medium
CN112632469A (en) * 2020-12-22 2021-04-09 深圳壹账通智能科技有限公司 Method and device for detecting abnormity of business transaction data and computer equipment

Also Published As

Publication number Publication date
WO2022134579A1 (en) 2022-06-30

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40045450

Country of ref document: HK

SE01 Entry into force of request for substantive examination