CN116739605A - Transaction data detection method, device, equipment and storage medium - Google Patents

Info

Publication number
CN116739605A
Authority
CN
China
Prior art keywords
transaction
target
nodes
node
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310794184.2A
Other languages
Chinese (zh)
Inventor
王中晴
易厚梅
蒋李灵
吴心坪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC filed Critical Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202310794184.2A priority Critical patent/CN116739605A/en
Publication of CN116739605A publication Critical patent/CN116739605A/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00Payment architectures, schemes or protocols
    • G06Q20/38Payment protocols; Details thereof
    • G06Q20/40Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401Transaction verification
    • G06Q20/4016Transaction verification involving fraud or risk level assessment in transaction processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02Banking, e.g. interest calculation or account maintenance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Theoretical Computer Science (AREA)
  • Finance (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Databases & Information Systems (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Technology Law (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a transaction data detection method, apparatus, device, and storage medium, which can be applied to the field of artificial intelligence and the field of financial science and technology. The method comprises the following steps: responding to a transaction data detection request, and obtaining abnormal values of a plurality of transaction nodes based on a plurality of pieces of value attribute information corresponding to the transaction nodes included in a transaction network diagram, wherein the transaction network diagram comprises a plurality of edges connected with the transaction nodes; determining weight values of the edges based on the abnormal values of the transaction nodes; determining a target dense subgraph from the transaction network graph based on the weight values of the edges; and obtaining a detection result of the transaction network graph based on the target dense subgraph.

Description

Transaction data detection method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence and financial technology, and more particularly, to a transaction data detection method, apparatus, device, medium, and program product.
Background
With the continuous development of information technology, financial fraud techniques keep emerging. Identifying financial fraud often requires examining multiple transactions made through a bank card account to determine abnormal or suspicious transaction behavior, so that suspected fraudulent transactions can be flagged or intercepted in time.
In the process of implementing the disclosed concept, the inventors found at least the following problems in the related art: identification of transaction data is limited. It can generally detect transaction anomalies only between the bank card accounts of the two parties to a transaction and cannot identify anomalies that may exist among a plurality of bank card accounts. As a result, transaction anomalies are identified incompletely and with low efficiency.
Disclosure of Invention
In view of the foregoing, the present disclosure provides transaction data detection methods, apparatus, devices, media, and program products.
According to a first aspect of the present disclosure, there is provided a transaction data detection method, comprising: in response to a transaction data detection request, obtaining abnormal values of a plurality of transaction nodes based on a plurality of pieces of value attribute information corresponding to each of the transaction nodes included in a transaction network graph, wherein the transaction network graph comprises a plurality of edges connecting the transaction nodes; determining weight values of the edges based on the abnormal values of the transaction nodes; determining a target dense subgraph from the transaction network graph based on the weight values of the edges; and obtaining a detection result of the transaction network graph based on the target dense subgraph.
According to an embodiment of the present disclosure, the obtaining, based on a plurality of pieces of value attribute information corresponding to each of a plurality of transaction nodes included in a transaction network graph, an abnormal value of each of the plurality of transaction nodes includes: for each of the transaction nodes, determining a plurality of pieces of value attribute information corresponding to the transaction node based on a plurality of pieces of transaction data associated with the transaction node; determining a first frequency of occurrence of each of a plurality of pieces of characteristic attribute information included in the plurality of pieces of value attribute information corresponding to the transaction node; and obtaining the abnormal value of the transaction node based on the first frequency of occurrence of each of the plurality of pieces of characteristic attribute information and a predetermined probability of occurrence of each of the plurality of pieces of characteristic attribute information.
According to an embodiment of the present disclosure, the obtaining the abnormal value of the transaction node based on the first frequency of occurrence of each of the plurality of pieces of characteristic attribute information and the predetermined probability of occurrence of each of the plurality of pieces of characteristic attribute information includes: obtaining a second frequency of occurrence of each of the plurality of pieces of characteristic attribute information based on the first frequency and the predetermined probability of occurrence of each of the plurality of pieces of characteristic attribute information; and computing a chi-square statistic based on the first frequency and the second frequency of occurrence of each of the plurality of pieces of characteristic attribute information to obtain the abnormal value of the transaction node.
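The chi-square step described above can be sketched as follows. This is a minimal illustration, not the disclosure's implementation: it assumes the "first frequency" is an observed count per characteristic attribute, the "second frequency" is the expected count (total count multiplied by the predetermined probability), and the abnormal value is the usual Pearson chi-square sum. The function name, the amount-bucket features, and the probabilities are all hypothetical.

```python
from collections import Counter


def chi_square_outlier(observed_counts, expected_probs):
    """Pearson chi-square statistic of observed counts against expected probabilities.

    observed_counts: mapping feature -> observed count ("first frequency").
    expected_probs:  mapping feature -> predetermined probability (each > 0).
    """
    total = sum(observed_counts.get(f, 0) for f in expected_probs)
    if total == 0:
        return 0.0  # no transactions, nothing to score
    score = 0.0
    for feature, prob in expected_probs.items():
        expected = total * prob                      # "second frequency"
        observed = observed_counts.get(feature, 0)   # "first frequency"
        score += (observed - expected) ** 2 / expected
    return score


# Hypothetical characteristic attributes: coarse buckets of the transaction amount.
expected_probs = {"small": 0.5, "medium": 0.3, "large": 0.2}
observed = Counter(["small", "medium"] + ["large"] * 8)
outlier = chi_square_outlier(observed, expected_probs)
typical = chi_square_outlier({"small": 5, "medium": 3, "large": 2}, expected_probs)
```

A node whose observed distribution matches the predetermined probabilities scores close to zero, while the node dominated by large transactions scores much higher.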
According to an embodiment of the present disclosure, the determining the weight value of each of the plurality of edges based on the outlier of each of the plurality of transaction nodes includes: for each of the edges, determining a first transaction node and a second transaction node connected to the edge based on the transaction network graph; determining a first outlier corresponding to the first transaction node and a second outlier corresponding to the second transaction node from among the outliers of each of the plurality of transaction nodes; and obtaining the weight value of the edge based on the first abnormal value and the second abnormal value.
According to an embodiment of the disclosure, the determining, based on the weight values of the edges, a target dense subgraph from the transaction network graph includes: determining a weighted degree value of each of the plurality of transaction nodes based on the weight value of each of the plurality of edges; determining a traversal order of the plurality of transaction nodes based on the weighted degree values; and determining a target dense subgraph from the transaction network graph based on the traversal order.
According to an embodiment of the present disclosure, the determining, based on the traversal order, the target dense subgraph from the transaction network graph includes: selecting a first target node from the plurality of transaction nodes based on the traversal order, wherein the first target node is the transaction node with the smallest weighted degree value; generating a target sub-graph based on the first target node and a second target node included in the dense sub-graph, wherein the dense sub-graph is a blank graph when the first target node is the first transaction node determined based on the traversal order; obtaining a first density value of the target sub-graph based on a plurality of weight values corresponding to a plurality of edges included in the target sub-graph and the number of target nodes included in the target sub-graph; and determining the target sub-graph as a new dense sub-graph when the first density value is greater than the density value of the dense sub-graph, wherein the target sub-graph is determined to be the target dense sub-graph when the first target node is the last transaction node determined based on the traversal order.
According to an embodiment of the present disclosure, in response to the number of the dense subgraphs being equal to or greater than a first threshold, it is determined to stop generating the target dense subgraphs; and determining to stop generating the target dense subgraph in response to the number of nodes included in the dense subgraph being less than or equal to a second threshold.
According to an embodiment of the present disclosure, the obtaining, based on the target dense subgraph, a detection result of the transaction network graph includes: determining an abnormal parameter value of the target dense subgraph based on the abnormal values corresponding to each of a plurality of third target nodes included in the target dense subgraph; and determining a detection result of the transaction network diagram based on the abnormal parameter value.
According to an embodiment of the present disclosure, the determining a detection result of the transaction network graph based on the abnormal parameter value includes: in a case where the abnormal parameter value is greater than or equal to a parameter threshold value, determining the target dense subgraph as an abnormal subgraph; and determining the plurality of third target nodes included in the abnormal subgraph as abnormal nodes, wherein the abnormal nodes represent nodes with abnormal transactions.
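The threshold check on the abnormal parameter value might look like the sketch below. The text does not say how the per-node abnormal values are aggregated, so the arithmetic mean used here is an assumption, as are the function name, the threshold, and the sample values.

```python
from statistics import mean


def detect_abnormal_subgraph(subgraph_nodes, outliers, threshold):
    """Return (abnormal parameter value, set of nodes flagged as abnormal).

    Assumption: the abnormal parameter value is the mean of the node
    abnormal values; the disclosure does not fix the aggregation.
    subgraph_nodes must be non-empty.
    """
    param = mean(outliers[n] for n in subgraph_nodes)
    if param >= threshold:
        # The whole dense subgraph is treated as abnormal.
        return param, set(subgraph_nodes)
    return param, set()


# Hypothetical abnormal values for three nodes of a target dense subgraph.
outliers = {"card_A": 20.0, "card_B": 30.0, "card_C": 10.0}
param, flagged = detect_abnormal_subgraph(["card_A", "card_B", "card_C"], outliers, threshold=15.0)
_, not_flagged = detect_abnormal_subgraph(["card_C"], outliers, threshold=15.0)
```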
According to an embodiment of the present disclosure, the obtaining, based on the target dense subgraph, a detection result of the transaction network graph includes: determining the number of blacklist nodes in a plurality of third target nodes included in the target dense subgraph based on a preset blacklist information table, wherein the blacklist information table includes a plurality of blacklist nodes with abnormal transactions; and obtaining a detection result of the transaction network graph based on the number of the blacklist nodes.
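Counting blacklist nodes in a target dense subgraph can be sketched as a set intersection. How the count is turned into a detection result is not specified in the text; the `min_hits` rule below is a hypothetical choice for illustration only, as are the card identifiers.

```python
def blacklist_detection(subgraph_nodes, blacklist, min_hits=1):
    """Count blacklist nodes in the subgraph and apply a simple decision rule.

    Assumption: the subgraph is reported as abnormal when it contains at
    least min_hits blacklist nodes; the disclosure leaves this rule open.
    """
    hits = set(subgraph_nodes) & set(blacklist)
    return len(hits), len(hits) >= min_hits


# Hypothetical blacklist information table and target dense subgraph.
blacklist = {"card_B", "card_X"}
count, is_abnormal = blacklist_detection({"card_A", "card_B", "card_C"}, blacklist)
clean_count, clean_flag = blacklist_detection({"card_A", "card_C"}, blacklist)
```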
A second aspect of the present disclosure provides a transaction data detection device, comprising: the transaction node abnormal value determining module is used for responding to a transaction data detection request, and obtaining the abnormal value of each transaction node based on a plurality of pieces of value attribute information corresponding to each transaction node included in a transaction network diagram, wherein the transaction network diagram comprises a plurality of edges connected with the transaction nodes; the edge weight value determining module is used for determining the weight value of each edge based on the abnormal value of each transaction node; the target dense subgraph determining module is used for determining a target dense subgraph from the transaction network graph based on the weight values of the edges; and the transaction network diagram detection module is used for obtaining the detection result of the transaction network diagram based on the target dense subgraph.
A third aspect of the present disclosure provides an electronic device, comprising: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method described above.
A fourth aspect of the present disclosure also provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above-described method.
A fifth aspect of the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the above method.
According to the transaction data detection method provided by the disclosure, the abnormal value of each transaction node can be obtained from the plurality of pieces of value attribute information corresponding to each node of the transaction network constructed from the transaction data. The weight value of each edge in the transaction network graph can be determined from the abnormal values of the transaction nodes, the target dense subgraph can be determined from the transaction network graph based on the weight values of the edges, and the detection result of the transaction network graph can finally be obtained based on the target dense subgraph. With these operations, edges between transaction nodes with higher abnormal values receive larger weight values, so a dense subgraph that is more likely to contain abnormal transactions can be determined based on the weight values of the edges, and possible transaction anomalies among a plurality of transaction nodes can be determined by identifying the dense subgraph.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be more apparent from the following description of embodiments of the disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of transaction data detection methods, apparatus, devices, media and program products according to embodiments of the present disclosure;
FIG. 2 schematically illustrates a flow chart of a transaction data detection method according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart of calculating transaction node outliers according to an embodiment of the disclosure;
FIG. 4 schematically illustrates a flow diagram of determining a target dense subgraph in accordance with an embodiment of the present disclosure;
FIG. 5 schematically illustrates a schematic diagram of a transaction network diagram according to an embodiment of the present disclosure;
FIG. 6 schematically illustrates a schematic diagram of a target dense subgraph, in accordance with an embodiment of the present disclosure;
FIG. 7 schematically illustrates a flow chart of a transaction data detection method according to another embodiment of the present disclosure;
FIG. 8 schematically illustrates a block diagram of a transaction data detection device according to an embodiment of the present disclosure; and
FIG. 9 schematically illustrates a block diagram of an electronic device adapted to implement a transaction data detection method according to an embodiment of the disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where expressions like "at least one of A, B and C, etc." are used, the expressions should generally be interpreted in accordance with the meaning commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" shall include, but not be limited to, a system having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.).
In the technical scheme of the disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and application of the related data (including but not limited to personal information of a user) all conform to the relevant laws and regulations, necessary security measures are adopted, and public order and good customs are not violated.
In recent years, with the rapid development of information technology and the mobile internet, telecommunication fraud techniques have become increasingly diverse and sophisticated, causing serious harm to society and the public. Bank card transaction detection is an important technical means of preventing and combating telecommunication fraud. It identifies abnormal or suspicious transaction behavior by analyzing the transaction data of bank card users, so that suspected fraudulent transactions can be flagged or intercepted in time. As the contest between detection and illicit operators escalates, detection techniques based on expert rules and traditional machine learning are no longer sufficient to identify all fraudulent cards. Fraudulent bank card users usually have rich transaction relationships, such as large transfers in and out, many incoming transfers, few outgoing accounts, and operation in groups that exchange transaction data across multiple bank cards. In the related art, however, the identification of transaction data is limited: it can detect transaction anomalies between the bank card accounts of the two parties to a transaction, but it cannot detect anomalies among a plurality of bank card accounts that may be associated with one another, and its efficiency on large volumes of transaction data is difficult to improve.
Therefore, it is necessary to perform detection using a transaction network, which exploits the network topology of bank card transactions. During research it was found that the related art generally mines this network topology with the following methods.
Graph-based anomaly detection (Graph-Based Anomaly Detection) is a method for finding anomalous patterns in data by using graph structures and attributes. However, anomalies are difficult to define and measure in this approach, it is inefficient on large-scale and dynamic graph data, and it has difficulty making use of multi-source and multi-modal information.
Goodness-of-fit test (Tests of): a statistical method for checking whether data conforms to some preset distribution or assumption. With goodness-of-fit tests it is difficult to select appropriate test statistics and significance levels, and they are not suitable for processing complex data.
In view of this, an embodiment of the present disclosure provides a transaction data detection method, in response to a transaction data detection request, obtaining an abnormal value of each of a plurality of transaction nodes based on a plurality of pieces of value attribute information corresponding to each of the plurality of transaction nodes included in a transaction network graph, where the transaction network graph includes a plurality of edges connecting the plurality of transaction nodes; determining weight values of the edges based on the abnormal values of the transaction nodes; determining a target dense subgraph from the transaction network graph based on the weight values of the edges; and obtaining a detection result of the transaction network graph based on the target dense subgraph.
Fig. 1 schematically illustrates an application scenario diagram of a transaction data detection method, apparatus, device, medium and program product according to an embodiment of the present disclosure.
As shown in fig. 1, an application scenario 100 according to this embodiment may include a first terminal device 101, a second terminal device 102, a third terminal device 103, a network 104, and a server 105. The network 104 is a medium used to provide a communication link between the first terminal device 101, the second terminal device 102, the third terminal device 103, and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 through the network 104 using at least one of the first terminal device 101, the second terminal device 102, the third terminal device 103, to receive or send messages, etc. Various communication client applications, such as a shopping class application, a web browser application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only) may be installed on the first terminal device 101, the second terminal device 102, and the third terminal device 103.
The first terminal device 101, the second terminal device 102, the third terminal device 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for websites browsed by the user using the first terminal device 101, the second terminal device 102, and the third terminal device 103. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the transaction data detection method provided in the embodiments of the present disclosure may be generally executed by the server 105. Accordingly, the transaction data detection device provided by the embodiments of the present disclosure may be generally disposed in the server 105. The transaction data detection method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105. Accordingly, the transaction data detection apparatus provided by the embodiments of the present disclosure may also be provided in a server or a server cluster that is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The transaction data detection method of the disclosed embodiment will be described in detail below with reference to fig. 2 to 5 based on the scenario described in fig. 1.
Fig. 2 schematically illustrates a flow chart of a transaction data detection method according to an embodiment of the present disclosure.
As shown in fig. 2, the transaction data detection method of this embodiment includes operations S210 to S240.
In operation S210, in response to the transaction data detection request, an outlier of each of the plurality of transaction nodes is obtained based on a plurality of pieces of value attribute information corresponding to each of the plurality of transaction nodes included in the transaction network graph, wherein the transaction network graph includes a plurality of edges connecting the plurality of transaction nodes.
According to the embodiments of the present disclosure, in the case of receiving a transaction data detection request, it is possible to obtain abnormal values of a plurality of transaction nodes by calculating the degree of deviation of value attribute information from an expected value or a normal value based on a plurality of pieces of value attribute information corresponding to each of a plurality of transaction nodes included in a transaction network diagram constructed from transaction data.
According to embodiments of the present disclosure, the value attribute information may characterize the transaction amount of each transaction.
According to an embodiment of the disclosure, the transaction network diagram may be constructed by a plurality of transaction data, the transaction data may be collected at regular time, and the result of each transaction of the bank card account may be uniformly written into a storage space such as a data lake or a data warehouse, where the transaction data may include: customer identification, customer account, bank card number, transaction time, transaction amount, etc. Multiple customer accounts may be included for a customer and multiple bank card numbers may be stored under a customer account.
According to the embodiment of the disclosure, the collection mode of the transaction data is not limited, and the collection can be performed in a real-time or quasi-real-time mode, or can be performed in an off-line mode.
According to the embodiment of the present disclosure, the tool for collecting the transaction data is not limited, and any tool that collects the transaction data into the storage space may be used. For example, real-time data may be collected using Flink and Spark Streaming, and offline data may be collected using Hive or Spark. Flink is a framework and distributed processing engine. Spark is a fast, general-purpose computing engine designed for large-scale data processing. Hive is a data warehouse tool used for data extraction, transformation, and loading. Spark Streaming is a real-time stream processing framework that provides scalable, high-throughput, fault-tolerant processing of real-time data.
According to an embodiment of the disclosure, the transaction network graph may include a plurality of nodes and a plurality of edges, and each edge may have a respective weight. In the transaction network, a bank card number may be used as a node, a transaction between two bank card numbers forms an edge, and the initial weight of each edge may be represented by the transaction amount. For example, the transaction network is G(V, E, W), where V is the set of all nodes, E is the set of edges, and W is the weight function, which can initially be represented by the transaction amount.
According to the embodiment of the disclosure, the construction mode of the transaction network graph is not limited, and the transaction network graph can be constructed in any mode, for example, by using the HugeGraph graph database, an easy-to-use, efficient, and general-purpose open-source graph database.
According to the embodiment of the disclosure, an abnormal value of each transaction node can be obtained by calculating how far the plurality of pieces of value attribute information corresponding to each of the plurality of transaction nodes included in the transaction network graph deviate from the expected value attribute information. The abnormal value indicates whether the transaction node is abnormal: the larger the abnormal value, the more likely the transaction node is abnormal. In this way, possibly abnormal nodes among the plurality of transaction nodes included in the transaction network graph are preliminarily identified.
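As a minimal sketch of the G(V, E, W) structure described above, the following builds the node set and an edge-weight mapping from transaction records. It assumes an undirected graph in which repeated transactions between the same pair of cards are summed into one initial weight; the text only says the initial weight can be represented by the transaction amount, so the aggregation, the record layout, and the card identifiers are illustrative assumptions.

```python
from collections import defaultdict


def build_transaction_graph(records):
    """Build (V, W) from (payer_card, payee_card, amount) records.

    V is the set of bank card numbers; W maps an undirected edge, stored as a
    sorted pair of card numbers, to the summed transaction amount. The keys of
    W form the edge set E.
    """
    nodes = set()
    weights = defaultdict(float)
    for payer, payee, amount in records:
        if payer == payee:
            continue  # ignore self-transfers in this sketch
        nodes.update((payer, payee))
        edge = tuple(sorted((payer, payee)))
        weights[edge] += amount
    return nodes, dict(weights)


# Hypothetical transaction records.
records = [
    ("card_A", "card_B", 100.0),
    ("card_B", "card_A", 50.0),
    ("card_B", "card_C", 20.0),
]
V, W = build_transaction_graph(records)
```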
In operation S220, weight values of respective edges are determined based on the outliers of the respective transaction nodes.
According to the embodiment of the disclosure, based on the respective abnormal value of each transaction node, re-weighting of each edge can be realized, so that the weight of edges between transaction nodes with high abnormal values is larger, the weight of edges between transaction nodes with low abnormal values is lower, nodes with abnormal transactions can be determined more quickly in subsequent processing, and the probability that nodes with no abnormal transaction possibility are misjudged to be abnormal transaction nodes is reduced.
In operation S230, a target dense subgraph is determined from the transaction network graph based on the weight values of the respective edges.
According to embodiments of the present disclosure, a highest-density target dense subgraph may be found by traversing each node and edge in a transaction network graph based on the respective weight values of the edges, which may include a plurality of nodes where abnormal transactions are most likely to exist.
According to an embodiment of the present disclosure, the target dense subgraph includes a plurality of nodes and a plurality of edges. The node set of the target dense subgraph is a subset of the node set V of the transaction network graph, and the edge set of the target dense subgraph is a subset of the edge set E of the transaction network graph. The weight value of each edge in the target dense subgraph is the same as that of the corresponding edge in the transaction network graph. For example, if edge E_1 in the transaction network graph corresponds to edge e_1 in the target dense subgraph and the weight value of E_1 is 5, the weight value of e_1 is also 5.
According to the embodiment of the disclosure, the target dense subgraph is determined based on the weight values of the edges and may be a transaction sub-network in which transactions among the nodes are likely to be problematic. This enables detection of gang-type financial fraud: possible association relations among bank card accounts and possible transaction problems among them are identified, so that transaction anomalies are identified more comprehensively. Because every node included in the transaction network graph is traversed, the transaction data is identified comprehensively. In addition, because the target dense subgraph is determined using weight values derived from the abnormal value of each transaction node, abnormal transactions are more likely to exist in the target dense subgraph, and the target dense subgraph identifies abnormal transactions more accurately. This keeps the target dense subgraph meaningful and informative, allows it to be found in near-linear time, and improves the efficiency of identifying transaction data.
In operation S240, a detection result of the transaction network graph is obtained based on the target dense subgraph.
According to the embodiment of the disclosure, the target dense subgraph is identified through the preset rule, so that the identification result of the transaction network graph, namely the identification result of the transaction data, can be obtained, and the accuracy of the identified abnormal nodes with abnormal transactions can be further ensured.
According to the transaction data detection method provided by the disclosure, through the corresponding multiple value attribute information of each node of the transaction network constructed by the transaction data, the corresponding abnormal value of each transaction node can be obtained, the weight value of each side in the transaction network graph can be determined through the corresponding abnormal value of each transaction node, the target dense subgraph can be determined from the transaction network graph based on the weight value of each side in the transaction network graph, and finally the detection result of the transaction network graph can be obtained based on the target dense subgraph. Based on the operation, the weight value of the edge between the transaction nodes with higher abnormal value is larger, so that the dense subgraph with higher possibility of abnormal transaction can be determined based on the weight value of each edge, and the possible abnormality of the transaction among a plurality of transaction nodes can be determined by identifying the dense subgraph.
Fig. 3 schematically illustrates a flow chart of calculating transaction node outliers according to an embodiment of the disclosure.
As shown in fig. 3, calculating the transaction node outlier includes operations S211 to S213.
In operation S211, for each transaction node, a plurality of pieces of value attribute information corresponding to the transaction node are determined based on a plurality of pieces of transaction data related to the transaction node.
In operation S212, a first frequency number at which a plurality of pieces of characteristic attribute information included in a plurality of pieces of value attribute information respectively appear is determined based on the plurality of pieces of value attribute information corresponding to the transaction node.
In operation S213, an outlier of the transaction node is obtained based on the first frequency at which the plurality of characteristic attribute information are each present and the predetermined probability at which the plurality of characteristic attribute information are each present.
According to an embodiment of the present disclosure, each transaction node may correspond to a plurality of transaction records, each of which is a different transaction, so each transaction node may have a plurality of pieces of value attribute information, and the characteristic attribute information corresponding to different pieces of value attribute information may differ. The value attribute information is not limited and may be, for example, a transaction amount. The characteristic attribute information is likewise not limited and may be information contained in the transaction amount, for example the first digit of the transaction amount.
According to embodiments of the present disclosure, there are many possibilities for the leading digit of the transaction amount, such as: 1 to 9.
According to the embodiments of the present disclosure, the manner of determining the outlier of each node from the value attribute information is not limited; for example, Benford's law, an isolation forest algorithm, an autoencoder, or the like may be used.
According to embodiments of the present disclosure, for example, when Benford's law is adopted, it may be assumed that the first digit of the transaction amount follows Benford's law. Using Benford's law and the chi-square statistic, the degree to which the distribution of the first digits of the transaction amounts of each node deviates from Benford's law is calculated to obtain an abnormal value of the transaction node, where the higher the abnormal value, the more likely the node is abnormal.
According to the embodiment of the disclosure, the actual frequency of occurrence of each feature attribute information in all the value attribute information corresponding to each transaction node, namely the first frequency, can be determined, and the first frequency of occurrence of each feature attribute information of each node can be counted to obtain the total frequency.
According to the embodiment of the present disclosure, the formula for determining the first frequency and the total frequency is not limited. For example, a first frequency of occurrence is determined for each item of characteristic attribute information i of transaction node u, and the total frequency over all characteristic attribute information of each node may be calculated as shown in the following formula (1).
According to an embodiment of the present disclosure, when Benford's law is used to calculate the outlier of each node, the predetermined probability may be obtained by the following equation (2). The predetermined probability may characterize the distribution rule of the characteristic attribute information i in a real dataset, where the characteristic attribute information i may be the leading digit of a transaction amount.
According to an embodiment of the present disclosure, obtaining an outlier of a transaction node based on a first frequency at which a plurality of feature attribute information each appears and a predetermined probability at which a plurality of feature attribute information each appears may include the following operations.
Obtaining a second frequency of each occurrence of the plurality of feature attribute information based on a first frequency of each occurrence of the plurality of feature attribute information and a predetermined probability of each occurrence of the plurality of feature attribute information; and carrying out chi-square statistics based on the first frequency number appearing in each of the plurality of characteristic attribute information and the second frequency number appearing in each of the plurality of characteristic attribute information to obtain an abnormal value of the transaction node.
According to an embodiment of the present disclosure, a second frequency of occurrence of each of the plurality of characteristic attribute information may be obtained based on the first frequency of occurrence of each of the plurality of characteristic attribute information and the predetermined probability of occurrence of each of the plurality of characteristic attribute information. The second frequency may represent the expected frequency of occurrence of the characteristic attribute information i in a real dataset, and the expected frequency may be a frequency that satisfies a certain rule, for example a frequency that satisfies Benford's law. The method of calculating the second frequency is not limited; it may be calculated as shown in the following equation (3) or by another method.
According to embodiments of the present disclosure, chi-square statistics may be carried out on the first frequency of occurrence of each of the plurality of characteristic attribute information and the second frequency of occurrence of each of the plurality of characteristic attribute information, thereby obtaining the degree to which the distribution of the characteristic attribute information of each node deviates from the normal rule, and further obtaining the abnormal value of each node. The abnormal value Score_u of each node may be calculated as shown in formula (4).
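As a sketch consistent with the description above, and assuming the standard forms of Benford's law and the Pearson chi-square statistic, the quantities referred to in formulas (1) to (4) may be written as follows. The symbols $n_{u,i}$ (first frequency of leading digit $i$ at node $u$), $N_u$ (total frequency), $P(i)$ (predetermined probability), and $\hat{n}_{u,i}$ (second, expected frequency) are assumed notation, and the exact formulas of the disclosure may differ.

$$N_u = \sum_{i=1}^{9} n_{u,i}, \qquad P(i) = \log_{10}\!\left(1 + \frac{1}{i}\right), \qquad \hat{n}_{u,i} = N_u \, P(i)$$

$$\mathrm{Score}_u = \sum_{i=1}^{9} \frac{\left(n_{u,i} - \hat{n}_{u,i}\right)^2}{\hat{n}_{u,i}}$$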
According to the embodiment of the disclosure, for each transaction node, the actual frequency of occurrence of each of the plurality of characteristic attribute information in the plurality of value attribute information is determined, and the expected frequency of occurrence is determined from the predetermined probability and the actual frequency. Based on the actual frequency and the expected frequency, the degree to which the distribution of the characteristic attribute information of each node deviates from the normal rule is obtained, and the abnormal value of each node is then determined. This provides a preliminary determination of the transaction nodes in the transaction network that may be involved in abnormal transactions and is applicable to large-scale or dynamically changing transaction networks. Because such transaction nodes are preliminarily determined, the subsequent determination of abnormal transactions in the transaction network can be more accurate.
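Operations S211 to S213 can be sketched as follows for the Benford's law case. This is an illustrative sketch, not the disclosure's implementation: the function names are hypothetical, and the score is assumed to be the Pearson chi-square statistic of observed leading-digit counts against Benford-expected counts.

```python
import math
from collections import Counter

# Benford's law: probability that the leading digit equals d (d = 1..9)
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(amount):
    """Return the first non-zero digit of a non-zero amount."""
    # Scientific notation puts the leading digit first, e.g. '4.2...e-02'.
    return int(f"{abs(amount):.15e}"[0])

def benford_outlier_score(amounts):
    """Chi-square deviation of leading-digit counts from Benford's law."""
    # First frequency: observed count of each leading digit
    observed = Counter(leading_digit(a) for a in amounts if a != 0)
    total = sum(observed.values())  # total frequency for the node
    if total == 0:
        return 0.0
    score = 0.0
    for digit, prob in BENFORD.items():
        expected = total * prob  # second (expected) frequency
        score += (observed.get(digit, 0) - expected) ** 2 / expected
    return score

# Hypothetical nodes: one roughly Benford-like, one with every amount starting with 9
benford_like = [1] * 30 + [2] * 18 + [3] * 12 + [4] * 10 + [5] * 8 + [6] * 7 + [7] * 6 + [8] * 5 + [9] * 4
suspicious = [900.0] * 100
```

A node whose amounts all share one leading digit receives a much higher score than a node whose leading digits roughly follow Benford's law.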
According to an embodiment of the present disclosure, determining the weight value of each of the plurality of edges based on the outliers of each of the plurality of transaction nodes may include the following operations.
For each edge, determining a first transaction node and a second transaction node connected with the edge based on the transaction network graph; determining a first abnormal value corresponding to the first transaction node and a second abnormal value corresponding to the second transaction node from the abnormal values of the transaction nodes respectively; the weight value of the edge is obtained based on the first outlier and the second outlier.
According to the embodiment of the present disclosure, for each edge included in the transaction network, the weight value of the edge may be determined based on the outlier of the node connected to the edge, and the calculation method for obtaining the weight value of the edge based on the first outlier and the second outlier is not limited, and may be any formula capable of obtaining the weight value of the edge, as shown in the following formula (5).
Wherein ω(u, v) characterizes the weight value of the edge between transaction node u and transaction node v, Score_u characterizes the outlier of transaction node u, and Score_v characterizes the outlier of transaction node v.
According to the embodiment of the disclosure, determining the weight value of each edge based on the abnormal value of each transaction node may be a process of reassigning the weight value of each edge. Reassigning each edge according to the abnormal scores of the transaction nodes makes the edge weights between transaction nodes with high abnormal scores larger, so that in the subsequent process of determining the target dense subgraph, a dense subgraph that does not conform to the expected rule, for example a dense subgraph that does not conform to Benford's law, can be found more accurately, thereby finding the abnormal subgraph most likely to contain abnormal transactions.
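The reweighting step can be sketched as below. The rule that combines Score_u and Score_v is an assumption: the sketch defaults to their sum purely for illustration, and the `combine` parameter (a hypothetical name) can be replaced by whatever rule formula (5) prescribes.

```python
def reweight_edges(edges, scores, combine=lambda su, sv: su + sv):
    """Assign each edge a weight derived from its endpoints' outlier scores.

    edges: iterable of (u, v) pairs; scores: dict mapping node -> outlier score.
    combine: ASSUMED combination rule (sum by default); the actual
    formula (5) of the disclosure may differ.
    """
    return {(u, v): combine(scores[u], scores[v]) for u, v in edges}

# Hypothetical outlier scores and edges
scores = {"V1": 0.5, "V2": 12.0, "V3": 9.0}
edges = [("V1", "V2"), ("V2", "V3")]
weights = reweight_edges(edges, scores)
```

With any rule that increases with both scores, the edge between the two high-scoring nodes V2 and V3 ends up heavier than the edge touching the low-scoring node V1.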
According to an embodiment of the present disclosure, determining a target dense subgraph from a transaction network graph based on weight values of respective edges may include the following operations.
Determining the weight value of each of the transaction nodes based on the weight value of each of the edges; determining a traversal order of the plurality of transaction nodes based on the weighting values; a target dense subgraph is determined from the transaction network graph based on the traversal order.
According to the embodiment of the disclosure, the weighting value of each transaction node can be determined based on the respective weighting value of each side through a plurality of sides connected with each transaction node, wherein the manner of determining the weighting value of each transaction node is not limited, and the weighting value can be obtained by adding the weighting values corresponding to the plurality of sides connected with the transaction node, or can be obtained according to other manners.
According to embodiments of the present disclosure, the traversal order of the transaction nodes may be determined based on the respective weighting value of each transaction node. The transaction nodes may be sorted by the magnitude of their weighting values to obtain the traversal order.
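The weighting values and traversal order can be sketched as follows, assuming the additive option mentioned above (the weighting value of a node is the sum of the weights of its incident edges) and ascending order. The function names and the tie-breaking by node name are illustrative assumptions.

```python
from collections import defaultdict

def weighted_degrees(weights):
    """Sum the weights of the edges incident to each node."""
    degree = defaultdict(float)
    for (u, v), w in weights.items():
        degree[u] += w
        degree[v] += w
    return dict(degree)

def traversal_order(weights):
    """Sort nodes by ascending weighting value (ties broken by node name)."""
    degree = weighted_degrees(weights)
    return sorted(degree, key=lambda node: (degree[node], node))

# Hypothetical edge weights
weights = {
    ("V1", "V2"): 2.0,
    ("V1", "V3"): 1.0,
    ("V3", "V4"): 4.0,
    ("V2", "V4"): 0.5,
}
order = traversal_order(weights)
```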
According to embodiments of the present disclosure, a greedy peeling algorithm may be employed to determine the target dense subgraph from the transaction network graph based on the traversal order. A transaction node set may be determined from the plurality of transaction nodes included in the transaction network graph, and the transaction nodes may be iterated over in the traversal order. In each iteration, the node with the lowest weighting value is selected from the transaction node set, a new candidate subgraph is generated from that node and the current dense subgraph, and the density value of the candidate subgraph is compared with that of the current dense subgraph. After all transaction nodes have been traversed, the target dense subgraph is obtained. The target dense subgraph may be the dense subgraph with the highest density value, that is, the transaction sub-network in which abnormal transactions are most likely to exist.
Fig. 4 schematically illustrates a flow chart of determining a target dense subgraph in accordance with an embodiment of the present disclosure.
As shown in fig. 4, determining the target dense subgraph includes operations S401 to S404.
In operation S401, a first target node is selected from a plurality of transaction nodes based on a traversal order, wherein the first target node is a transaction node having a smallest weighted value.
In operation S402, a target sub-graph is generated based on a first target node and a second target node included in the dense sub-graph, wherein the dense sub-graph is a blank graph in case the first target node is a first transaction node determined based on a traversal order.
In operation S403, a first density value of the target sub-graph is obtained based on a plurality of weight values corresponding to a plurality of edges included in the target sub-graph and the number of target nodes included in the target sub-graph.
In operation S404, the target sub-graph is determined to be a new dense sub-graph in the case where the first density value is greater than the density value of the dense sub-graph, wherein the target sub-graph is determined to be a target dense sub-graph in the case where the first target node is the last transaction node determined based on the traversal order.
According to an embodiment of the present disclosure, a first target node may be selected from a plurality of transaction nodes based on a traversal order, the first target node may be a node that has not been selected from a previous traversal, and a weighting value of the first target node is the smallest among the unselected nodes.
According to the embodiment of the disclosure, the target subgraph is obtained based on the first target node and one or more second target nodes included in the dense subgraph, wherein in the case that the first target node is the first transaction node determined based on the traversal order, any transaction node is not included in the dense subgraph, the dense subgraph may be blank graph without any transaction node, or there is no dense subgraph, and the dense subgraph is blank, etc.
According to an embodiment of the present disclosure, a first density value of the target sub-graph may be obtained based on a plurality of weight values corresponding to a plurality of edges included in the target sub-graph and the number of target nodes included in the target sub-graph. The method of calculating the density value is not limited and may be any method capable of obtaining the density value of the target sub-graph or of the dense sub-graph; for example, the density value may be calculated as shown in formula (6).
Wherein ρ_g(S) is the density value of the target sub-graph or the dense sub-graph, S represents the transaction node set in the sub-graph, and ω(S) is the sum of all edge weights in the sub-graph; ω(S) can be calculated as in the following formula (7).
$$\omega(S) = \sum_{(u,v) \in E(S)} \omega(u,v) \tag{7}$$
Wherein ω(u, v) represents the weight of the edge between transaction node u and transaction node v, and (u, v) ∈ E(S) indicates that the edge between transaction node u and transaction node v belongs to the edge set E(S) of the sub-graph formed by the transaction node set S.
According to the embodiment of the disclosure, each transaction node is traversed through a traversing sequence, the target subgraph is generated iteratively, the density value of the target subgraph is continuously compared with the density value of the dense subgraph, and the finally obtained target dense subgraph can be determined to be the subgraph with the highest density value, so that the obtained target dense subgraph can be more accurate in recognition of abnormal transactions, abnormal transaction subgraphs can be found in near-linear time, and the complexity of finding the abnormality in a large-scale transaction network is reduced.
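For comparison, the sketch below implements the classic greedy peeling formulation of densest-subgraph search rather than the incremental procedure of operations S401 to S404: it repeatedly removes the node with the smallest remaining weighted degree and keeps the densest intermediate subgraph. The density is assumed to be ω(S)/|S|, with ω(S) as in formula (7); the function name and the heap-based bookkeeping are illustrative choices.

```python
import heapq
from collections import defaultdict

def greedy_peeling(weights):
    """Return (best_nodes, best_density) found by classic greedy peeling.

    weights: dict mapping undirected edges (u, v), u != v, to weights >= 0.
    Density is ASSUMED to be (sum of edge weights inside S) / |S|.
    """
    adj = defaultdict(dict)
    for (u, v), w in weights.items():
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w
    if not adj:
        return set(), 0.0

    degree = {n: sum(nbrs.values()) for n, nbrs in adj.items()}
    total = float(sum(weights.values()))  # omega(S) for the current S
    remaining = set(adj)
    heap = [(d, n) for n, d in degree.items()]
    heapq.heapify(heap)

    best_nodes, best_density = set(remaining), total / len(remaining)
    while len(remaining) > 1:
        d, n = heapq.heappop(heap)
        if n not in remaining or d != degree[n]:
            continue  # stale heap entry, skip it
        remaining.remove(n)
        total -= degree[n]  # drop the edges from n to the remaining nodes
        for m, w in adj[n].items():
            if m in remaining:
                degree[m] -= w
                heapq.heappush(heap, (degree[m], m))
        density = total / len(remaining)
        if density > best_density:
            best_nodes, best_density = set(remaining), density
    return best_nodes, best_density

# Hypothetical graph: a heavy triangle A-B-C plus a lightly attached node D
weights = {("A", "B"): 10.0, ("B", "C"): 10.0, ("A", "C"): 10.0, ("A", "D"): 1.0}
best_nodes, best_density = greedy_peeling(weights)
```

With a heap, each edge is processed a constant number of times and each heap operation is logarithmic, which is in line with the near-linear running time mentioned above.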
According to an embodiment of the present disclosure, in response to the number of dense subgraphs being greater than or equal to a first threshold, determining to stop generating the target dense subgraph; and determining to stop generating the target dense subgraph in response to the number of nodes included in the dense subgraph being less than or equal to a second threshold.
According to the embodiment of the disclosure, each dense subgraph that is output corresponds to one transaction sub-network with transaction anomalies, whereas a single transaction network may contain a plurality of problematic transaction sub-networks. Obtaining a target number of target dense subgraphs therefore covers the case in which a plurality of abnormal transaction sub-networks exist.
According to embodiments of the present disclosure, the output of a plurality of different target dense subgraphs may be achieved by updating the transaction node set, which initially comprises the plurality of transaction nodes of the transaction network. For example, suppose the transaction node set contains 1000 transaction nodes and the target dense subgraph obtained in the first run includes 50 transaction nodes. Before the second run, those 50 transaction nodes may be deleted from, or marked in, the initial transaction node set to obtain an updated transaction node set, and in the second run the first target node is selected from the remaining 950 transaction nodes.
According to an embodiment of the present disclosure, the conditions for stopping the generation of target dense subgraphs may include: stopping when the number of dense subgraphs is greater than or equal to the first threshold, or stopping when the number of transaction nodes included in a target dense subgraph is less than or equal to the second threshold. A dense subgraph with too few transaction nodes cannot be used to determine transaction problems among a plurality of bank card accounts, so it need not be generated.
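The repeated extraction with the two stopping conditions can be sketched as follows. The function `extract_dense_subgraphs`, its parameters, and the stand-in detector `heaviest_edge` are hypothetical; any dense-subgraph routine that returns a set of nodes could be passed as `find_dense`.

```python
def extract_dense_subgraphs(weights, find_dense, max_subgraphs=3, min_nodes=2):
    """Repeatedly find a dense subgraph and remove its nodes from the graph.

    find_dense: callable taking an edge-weight dict and returning a set of nodes.
    Stops once max_subgraphs results are collected (first threshold) or when
    a result has at most min_nodes nodes (second threshold).
    """
    results = []
    current = dict(weights)
    while current and len(results) < max_subgraphs:
        dense = find_dense(current)
        if len(dense) <= min_nodes:
            break  # too few nodes to be informative
        results.append(dense)
        # Update the node set: drop every edge touching an extracted node
        current = {
            (u, v): w
            for (u, v), w in current.items()
            if u not in dense and v not in dense
        }
    return results

def heaviest_edge(weights):
    """Stand-in detector for the demo: the endpoints of the heaviest edge."""
    return set(max(weights, key=weights.get))

# Hypothetical edge weights
weights = {("A", "B"): 5.0, ("C", "D"): 3.0, ("B", "C"): 1.0, ("E", "F"): 2.0}
found = extract_dense_subgraphs(weights, heaviest_edge, max_subgraphs=2, min_nodes=1)
```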
According to an embodiment of the present disclosure, obtaining a detection result of a transaction network graph based on a target dense subgraph may include the following operations.
Determining an abnormal parameter value of the target dense subgraph based on abnormal values respectively corresponding to a plurality of third target nodes included in the target dense subgraph; and determining the detection result of the transaction network graph based on the abnormal parameter value.
According to the embodiment of the present disclosure, the number of target dense subgraphs is not limited and may be one or more. For each target dense subgraph, whether the subgraph is abnormal may be determined by calculating an abnormal parameter value. The abnormal parameter value may be determined from the abnormal values of the transaction nodes and the number of nodes included in the target dense subgraph; the calculation is not limited and may be as shown in the following formula (8).
Wherein Avg characterizes the abnormal parameter value of the target dense subgraph, and |S'| characterizes the number of nodes included in the target dense subgraph.
According to the embodiment of the disclosure, since the third target nodes included in the target dense subgraph belong to the transaction node set of the transaction network graph, for the same transaction node, the outlier corresponding to the third target node included in the target dense subgraph is the same as the outlier corresponding to that transaction node in the transaction node set.
According to an embodiment of the present disclosure, determining a detection result of a transaction network graph based on an anomaly parameter may include the following operations.
Under the condition that the abnormal parameter value is greater than or equal to the parameter threshold value, determining the target dense subgraph as an abnormal subgraph; and determining that the abnormal subgraph comprises a plurality of third target nodes as abnormal nodes, wherein the abnormal nodes represent nodes with abnormal transactions.
According to an embodiment of the present disclosure, in a case where the abnormal parameter value is greater than or equal to the parameter threshold value, the target dense subgraph may be determined to be an abnormal subgraph. The transaction data related to the transaction nodes included in the abnormal subgraph may then be considered abnormal data, transactions between those transaction nodes may be abnormal, and those transaction nodes are regarded as abnormal nodes.
According to the embodiment of the disclosure, whether the target dense subgraph is the abnormal subgraph is determined through the abnormal parameter value of the target dense subgraph, so that the abnormal subgraph can be determined more accurately, a transaction sub-network with abnormal transaction can be found more accurately, and marking, early warning and interception of an abnormal bank card account or abnormal transaction can be better realized.
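The threshold check can be sketched as below, assuming that the abnormal parameter value Avg is the arithmetic mean of the outliers of the nodes in the target dense subgraph, which is one reading of the description of formula (8). The function names and sample values are hypothetical.

```python
def anomaly_parameter(subgraph_nodes, scores):
    """Mean outlier score of the nodes in a target dense subgraph (ASSUMED form of Avg)."""
    nodes = list(subgraph_nodes)
    if not nodes:
        return 0.0
    return sum(scores[n] for n in nodes) / len(nodes)

def is_abnormal_subgraph(subgraph_nodes, scores, threshold):
    """Flag the subgraph when Avg is greater than or equal to the threshold."""
    return anomaly_parameter(subgraph_nodes, scores) >= threshold

# Hypothetical outlier scores and target dense subgraph
scores = {"V1": 0.5, "V2": 12.0, "V3": 9.0, "V4": 3.0}
dense_nodes = {"V2", "V3", "V4"}
avg = anomaly_parameter(dense_nodes, scores)
```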
According to an embodiment of the present disclosure, obtaining a detection result of a transaction network graph based on a target dense subgraph may include the following operations.
Determining the number of blacklist nodes in a plurality of third target nodes included in the target dense subgraph based on a preset blacklist information table, wherein the blacklist information table includes a plurality of blacklist nodes with abnormal transactions; and obtaining a detection result of the transaction network graph based on the number of the blacklist nodes.
According to an embodiment of the present disclosure, blacklist data may be preset, and the blacklist data may be represented by a data table or in other forms. If a transaction node in a dense subgraph coincides with a blacklist node included in the blacklist information table, the node may be considered a blacklist node. If the number of blacklist nodes included in a target dense subgraph is greater than or equal to a number threshold, the target dense subgraph is considered an abnormal subgraph. Likewise, if the proportion of blacklist nodes included in a target dense subgraph is greater than or equal to a proportion threshold, the target dense subgraph is considered an abnormal subgraph.
According to the embodiment of the disclosure, by comparing the transaction nodes included in the target dense subgraph with the blacklist nodes included in the blacklist information table, nodes already marked as blacklist nodes need not be evaluated again, which saves computation, and other possible blacklist nodes can be identified from the transaction data between blacklist nodes and other nodes.
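The blacklist-based check can be sketched as follows. The function name, the optional threshold parameters, and the sample card identifiers are hypothetical; the sketch flags a subgraph when either the count rule or the proportion rule is satisfied.

```python
def blacklist_check(subgraph_nodes, blacklist, count_threshold=None, ratio_threshold=None):
    """Return (count, ratio, abnormal) for a target dense subgraph.

    count: number of subgraph nodes found in the blacklist.
    ratio: proportion of subgraph nodes found in the blacklist.
    abnormal: True if either configured threshold is met or exceeded.
    """
    nodes = set(subgraph_nodes)
    count = len(nodes & set(blacklist))
    ratio = count / len(nodes) if nodes else 0.0
    abnormal = (
        (count_threshold is not None and count >= count_threshold)
        or (ratio_threshold is not None and ratio >= ratio_threshold)
    )
    return count, ratio, abnormal

# Hypothetical subgraph and blacklist
dense_nodes = {"c1", "c2", "c3", "c4"}
blacklist = {"c2", "c4", "c9"}
count, ratio, abnormal = blacklist_check(dense_nodes, blacklist, count_threshold=2)
```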
According to the embodiment of the disclosure, a detection framework for the transaction network graph is proposed. The abnormal value of each transaction node is determined from the plurality of pieces of value attribute information corresponding to that node, and the weight value of each edge in the transaction network graph is determined based on the abnormal values of the transaction nodes, so that edges between nodes with high abnormal values also have larger weights. A target dense subgraph in which abnormal transactions are more likely can then be determined based on the weight values of the edges, and the transaction network graph can be detected based on the target dense subgraph. This enables detection of large-scale and dynamically changing transaction networks: a transaction subgraph that may contain abnormal transactions can be found in near-linear time, which reduces the complexity of finding such subgraphs in a large-scale transaction network. In addition, other methods can be used within this framework to detect the transaction network graph, so the detection has good extensibility and robustness.
Fig. 5 schematically illustrates a schematic diagram of a transaction network diagram according to an embodiment of the present disclosure.
The schematic diagram of the transaction network diagram shown in fig. 5 is a transaction network diagram after determining the weight value of the edge based on the respective outlier of each transaction node. The transaction network diagram comprises transaction nodes V1-V4, wherein the edges between the two transaction nodes represent the transaction processes of the two transaction nodes, and each edge has a respective weight value W, for example: the weight value W2 of the edge E2 is 2.
According to the embodiment of the disclosure, the number of nodes, the number of edges, the expression form of the weight values and the shapes of the nodes and the edges in the transaction network diagram are all schematic, and different transaction network diagrams can be determined according to actual situations.
Fig. 6 schematically illustrates a schematic diagram of a target dense subgraph, according to an embodiment of the present disclosure.
As shown in fig. 6, the process of determining a target dense subgraph based on the transaction network graph shown in fig. 5 may include: determining the respective weighting value of each transaction node from the respective weight value of each edge and the connection relation between the edges and the transaction nodes. For example, the transaction node V1 is connected to the edges E1 and E2, where the weight of E1 is W1 = a and the weight of E2 is W2 = b, so the weighting value of V1 is a+b. Similarly, the weighting value of V2 is b+e, the weighting value of V3 is a+c+d, and the weighting value of V4 is c+d+e, where a, b, c, d, and e are constants.
For example, V2< V1< V3< V4 can be obtained by sorting V1 to V4 based on the weighted value, so that the traversal order of V1 to V4 is V2, V1, V3, V4 can be obtained.
Initializing the dense subgraph and its density value, wherein in the first iteration the dense subgraph is a blank graph and its density value may be taken as 0.
Selecting V2 to obtain a first target sub-graph, calculating the density value of the first target sub-graph to be (b+e)/1, obtaining the density value of the first target sub-graph to be (b+e), and taking the first target sub-graph as a dense sub-graph if the density value of the first target sub-graph is larger than the density value of the dense sub-graph.
Selecting V1, obtaining a second target subgraph based on the V1 and a node V2 included in the dense subgraph, calculating the density value of the second target subgraph as (a+b+b+e)/2, and if the density value is smaller than the density value of the dense subgraph, not changing the dense subgraph.
And selecting V3, obtaining a third target subgraph based on the V3 and a node V2 included in the dense subgraph, calculating the density value of the third target subgraph to be (a+c+d+b+e)/2, and taking the third target subgraph as the dense subgraph if the density value is larger than the density value of the dense subgraph.
And selecting V4, obtaining a fourth target subgraph based on the V4 and nodes V2 and V3 included in the dense subgraph, calculating the density value of the fourth target subgraph as (c+d+e+a+c+d+b+e)/3, and taking the fourth target subgraph as the target dense subgraph under the condition that all nodes included in the transaction network graph are traversed once if the density value is larger than the density value of the dense subgraph.
Fig. 7 schematically illustrates a flow chart of a transaction data detection method according to another embodiment of the present disclosure.
As shown in fig. 7, the transaction data detection method of another embodiment of the present disclosure may include operations S710 to S750.
In operation S710, a plurality of transaction data of a target period are written into a database in real time according to a predetermined rule.
In operation S720, a transaction network graph is obtained based on the plurality of transaction data included in the database, wherein the plurality of edges included in the transaction network graph are provided with respective initial weights.
In operation S730, an outlier of each of the plurality of transaction nodes is obtained based on the plurality of pieces of value attribute information corresponding to each of the plurality of transaction nodes included in the transaction network graph.
In operation S740, the weight values of the edges are updated based on the abnormal values of the transaction nodes, so as to obtain the target weight values of the edges.
In operation S750, a target dense subgraph is determined from the transaction network graph based on the target weight values of the respective edges, so as to implement detection of transaction data through the target dense subgraph.
Based on the transaction data detection method, the disclosure also provides a transaction data detection device. The device will be described in detail below in connection with fig. 8.
Fig. 8 schematically shows a block diagram of a transaction data detection device according to an embodiment of the present disclosure.
As shown in fig. 8, the transaction data detection device 800 of this embodiment includes a transaction node outlier determination module 810, an edge weight value determination module 820, a target dense subgraph determination module 830, and a transaction network graph detection module 840.
The transaction node outlier determination module 810 is configured to obtain, in response to a transaction data detection request, the outlier of each of a plurality of transaction nodes included in a transaction network graph based on a plurality of pieces of value attribute information corresponding to each transaction node, where the transaction network graph includes a plurality of edges connecting the transaction nodes. In an embodiment, the transaction node outlier determination module 810 may be configured to perform the operation S210 described above, which is not repeated here.
The edge weight value determining module 820 is configured to determine the weight value of each edge based on the outliers of the transaction nodes. In an embodiment, the edge weight value determining module 820 may be configured to perform the operation S220 described above, which is not repeated here.
The target dense subgraph determining module 830 is configured to determine a target dense subgraph from the transaction network graph based on the weight values of the edges. In an embodiment, the target dense subgraph determining module 830 may be configured to perform the operation S230 described above, which is not repeated here.
The transaction network diagram detection module 840 is configured to obtain a detection result of the transaction network graph based on the target dense subgraph. In an embodiment, the transaction network diagram detection module 840 may be configured to perform the operation S240 described above, which is not repeated here.
According to an embodiment of the present disclosure, the transaction node outlier determination module 810 may include: a value attribute information determining sub-module, a first frequency determining sub-module, and a transaction node outlier calculation sub-module.
The value attribute information determining sub-module is configured to determine, for each transaction node, a plurality of pieces of value attribute information corresponding to the transaction node based on a plurality of pieces of transaction data related to the transaction node.
The first frequency determining sub-module is configured to determine, based on the plurality of pieces of value attribute information corresponding to the transaction node, a first frequency of occurrence of each of a plurality of pieces of characteristic attribute information included in the value attribute information.
The transaction node outlier calculation sub-module is configured to obtain the outlier of the transaction node based on the first frequency of occurrence of each piece of characteristic attribute information and a predetermined probability of occurrence of each piece of characteristic attribute information.
According to an embodiment of the present disclosure, the transaction node outlier calculation sub-module may include: a second frequency calculation unit and an outlier calculation unit.
The second frequency calculation unit is configured to obtain a second frequency of occurrence of each piece of characteristic attribute information based on the first frequency of occurrence of each piece of characteristic attribute information and the predetermined probability of occurrence of each piece of characteristic attribute information.
The outlier calculation unit is configured to compute a chi-square statistic from the first frequency and the second frequency of occurrence of each piece of characteristic attribute information, to obtain the outlier of the transaction node.
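The calculation performed by these two units can be sketched as follows. This is a minimal illustration, assuming that the second frequency is the expected count (total number of observations multiplied by the predetermined probability) and that the outlier is the Pearson chi-square statistic; the function name, the feature labels, and the sample data are hypothetical.

```python
from collections import Counter


def chi_square_outlier(feature_values, predetermined_prob):
    """Chi-square statistic of observed feature counts against expected counts.

    feature_values: characteristic attribute values observed for one node.
    predetermined_prob: assumed mapping from feature to expected probability.
    Features that are observed but absent from the mapping are ignored here.
    """
    first_freq = Counter(feature_values)          # observed counts
    total = len(feature_values)
    score = 0.0
    for feature, prob in predetermined_prob.items():
        second_freq = total * prob                # expected count
        if second_freq > 0:
            diff = first_freq.get(feature, 0) - second_freq
            score += diff * diff / second_freq
    return score


# Hypothetical node: 8 small and 2 large transactions, expected 50/50.
values = ["small"] * 8 + ["large"] * 2
outlier = chi_square_outlier(values, {"small": 0.5, "large": 0.5})
print(outlier)
```

A node whose observed feature distribution matches the predetermined probabilities scores 0, and the score grows as the observed counts move away from the expected ones.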
According to an embodiment of the present disclosure, the edge weight value determination module 820 may include: a node determining sub-module, an outlier determination sub-module, and a weight value calculation sub-module.
The node determining sub-module is configured to determine, for each edge, a first transaction node and a second transaction node connected by the edge based on the transaction network graph.
The outlier determination sub-module is configured to determine, from among the outliers of the plurality of transaction nodes, a first outlier corresponding to the first transaction node and a second outlier corresponding to the second transaction node.
The weight value calculation sub-module is configured to obtain the weight value of the edge based on the first outlier and the second outlier.
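The disclosure does not fix, in this passage, how the two outliers are combined into an edge weight. The following sketch assumes a simple sum purely for illustration; the function name and the sample values are hypothetical, and other combinations (for example a product or a maximum) would fit the description equally well.

```python
def edge_weight_values(edges, outliers):
    """Return (u, v, weight) for each edge (u, v).

    Assumption: the weight is the sum of the outliers of the two endpoints.
    A list is returned so that parallel edges between the same pair of
    nodes are kept as separate entries.
    """
    return [(u, v, outliers[u] + outliers[v]) for u, v in edges]


# Hypothetical outliers for three transaction nodes.
outliers = {"V1": 0.5, "V2": 3.5, "V3": 1.0}
weighted = edge_weight_values([("V1", "V2"), ("V2", "V3")], outliers)
print(weighted)
```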
According to an embodiment of the present disclosure, the target dense subgraph determination module 830 may include: a transaction node weighting value determination sub-module, a traversal order determining sub-module, and a target dense subgraph determining sub-module.
The transaction node weighting value determination sub-module is configured to determine the weighting value of each of the plurality of transaction nodes based on the weight value of each of the plurality of edges.
The traversal order determining sub-module is configured to determine the traversal order of the plurality of transaction nodes based on the weighting values.
The target dense subgraph determining sub-module is configured to determine the target dense subgraph from the transaction network graph based on the traversal order.
According to an embodiment of the present disclosure, the target dense subgraph determining sub-module may include: a first target node determining sub-module, a target sub-graph generation sub-module, a first density value calculation sub-module, and a new dense subgraph generation sub-module.
The first target node determining sub-module is configured to select a first target node from the plurality of transaction nodes based on the traversal order, wherein the first target node is the transaction node with the smallest weighting value.
The target sub-graph generation sub-module is configured to generate a target sub-graph based on the first target node and a second target node included in the dense subgraph, wherein the dense subgraph is a blank graph in the case that the first target node is the first transaction node determined based on the traversal order.
The first density value calculation sub-module is configured to obtain a first density value of the target sub-graph based on a plurality of weight values corresponding to a plurality of edges included in the target sub-graph and the number of target nodes included in the first sub-graph.
The new dense subgraph generation sub-module is configured to determine the target sub-graph as a new dense subgraph in the case that the first density value is larger than the density value of the dense subgraph, wherein the target sub-graph is determined to be the target dense subgraph in the case that the first target node is the last transaction node determined based on the traversal order.
According to an embodiment of the present disclosure, the transaction data detection device 800 may further include: a first stop condition determination module and a second stop condition determination module.
The first stop condition determination module is configured to determine to stop generating the target dense subgraph in response to the number of dense subgraphs being greater than or equal to a first threshold.
The second stop condition determination module is configured to determine to stop generating the target dense subgraph in response to the number of nodes included in the dense subgraph being less than or equal to a second threshold.
According to an embodiment of the present disclosure, the transaction network diagram detection module 840 may include: an abnormal parameter value determining sub-module and a first detection result determining sub-module.
The abnormal parameter value determining sub-module is configured to determine the abnormal parameter value of the target dense subgraph based on the outliers respectively corresponding to a plurality of third target nodes included in the target dense subgraph.
The first detection result determining sub-module is configured to determine the detection result of the transaction network graph based on the abnormal parameter value.
According to an embodiment of the present disclosure, the first detection result determining sub-module may further include: an abnormal subgraph determining unit and an abnormal node determining unit.
The abnormal subgraph determining unit is configured to determine the target dense subgraph as an abnormal subgraph in the case that the abnormal parameter value is greater than or equal to a parameter threshold.
The abnormal node determining unit is configured to determine the plurality of third target nodes included in the abnormal subgraph as abnormal nodes, wherein an abnormal node represents a node with abnormal transactions.
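The threshold check performed by these units can be sketched as follows. The passage does not state how the outliers of the third target nodes are aggregated into the abnormal parameter value, so the arithmetic mean is used here as an assumption; the function name, the threshold, and the sample outliers are likewise hypothetical.

```python
from statistics import mean


def detect_abnormal_nodes(dense_nodes, outliers, parameter_threshold):
    """Return (abnormal_parameter_value, abnormal_nodes).

    Assumption: the abnormal parameter value is the mean outlier of the
    nodes in the target dense subgraph. dense_nodes must be non-empty.
    """
    parameter = mean(outliers[n] for n in dense_nodes)
    if parameter >= parameter_threshold:
        # The whole dense subgraph is treated as an abnormal subgraph.
        return parameter, list(dense_nodes)
    return parameter, []


# Hypothetical outliers for the nodes of a target dense subgraph.
outliers = {"V2": 4.0, "V3": 6.0, "V4": 8.0}
parameter, abnormal = detect_abnormal_nodes(["V2", "V3", "V4"], outliers, 5.0)
print(parameter, abnormal)
```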
According to an embodiment of the present disclosure, the transaction network diagram detection module 840 may further include: a blacklist node number determining sub-module and a second detection result determining sub-module.
The blacklist node number determining sub-module is configured to determine, based on a preset blacklist information table, the number of blacklist nodes among the plurality of third target nodes included in the target dense subgraph, wherein the blacklist information table includes a plurality of blacklist nodes with abnormal transactions.
The second detection result determining sub-module is configured to obtain the detection result of the transaction network graph based on the number of blacklist nodes.
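A minimal sketch of the blacklist check follows. The rule that maps the blacklist count to a detection result is not given in this passage, so the `min_hits` cutoff, the function name, and the result fields are hypothetical choices made only for illustration.

```python
def blacklist_detection(dense_nodes, blacklist, min_hits=1):
    """Count dense-subgraph nodes found in the blacklist table.

    Assumption: the subgraph is reported as abnormal when the count
    reaches min_hits (a hypothetical cutoff).
    """
    blacklisted = set(blacklist)
    hits = sum(1 for node in dense_nodes if node in blacklisted)
    return {"blacklist_hits": hits, "abnormal": hits >= min_hits}


# Hypothetical blacklist information table and dense subgraph.
result = blacklist_detection(["V2", "V3", "V4"], ["V3", "V9"])
print(result)
```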
According to embodiments of the present disclosure, any of the transaction node outlier determination module 810, the edge weight value determination module 820, the target dense subgraph determination module 830, and the transaction network graph detection module 840 may be combined and implemented in one module, or any one of them may be split into multiple modules. Alternatively, at least some of the functionality of one or more of these modules may be combined with at least some of the functionality of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the transaction node outlier determination module 810, the edge weight value determination module 820, the target dense subgraph determination module 830, and the transaction network graph detection module 840 may be implemented at least in part as hardware circuitry, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system-on-chip, a system-on-substrate, a system-on-package, or an Application Specific Integrated Circuit (ASIC), or as hardware or firmware in any other reasonable manner of integrating or packaging circuitry, or in any one of, or a suitable combination of, software, hardware, and firmware. Alternatively, at least one of the transaction node outlier determination module 810, the edge weight value determination module 820, the target dense subgraph determination module 830, and the transaction network graph detection module 840 may be at least partially implemented as a computer program module that, when executed, may perform the corresponding functions.
Fig. 9 schematically illustrates a block diagram of an electronic device adapted to implement a transaction data detection method according to an embodiment of the disclosure.
As shown in fig. 9, an electronic device 900 according to an embodiment of the present disclosure includes a processor 901 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 902 or a program loaded from a storage portion 908 into a Random Access Memory (RAM) 903. The processor 901 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. Processor 901 may also include on-board memory for caching purposes. Processor 901 may include a single processing unit or multiple processing units for performing the different actions of the method flows according to embodiments of the present disclosure.
In the RAM 903, various programs and data necessary for the operation of the electronic device 900 are stored. The processor 901, the ROM 902, and the RAM 903 are connected to each other by a bus 904. The processor 901 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 902 and/or the RAM 903. Note that the program may be stored in one or more memories other than the ROM 902 and the RAM 903. The processor 901 may also perform various operations of the method flow according to embodiments of the present disclosure by executing programs stored in one or more memories.
According to an embodiment of the disclosure, the electronic device 900 may also include an input/output (I/O) interface 905, the input/output (I/O) interface 905 also being connected to the bus 904. The electronic device 900 may also include one or more of the following components connected to an input/output (I/O) interface 905: an input section 906 including a keyboard, a mouse, and the like; an output portion 907 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage portion 908 including a hard disk or the like; and a communication section 909 including a network interface card such as a LAN card, a modem, or the like. The communication section 909 performs communication processing via a network such as the internet. The drive 910 is also connected to an input/output (I/O) interface 905 as needed. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on the drive 910 so that a computer program read out therefrom is installed into the storage section 908 as needed.
The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include ROM 902 and/or RAM 903 and/or one or more memories other than ROM 902 and RAM 903 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowcharts. The program code, when executed in a computer system, causes the computer system to implement the transaction data detection methods provided by embodiments of the present disclosure.
The above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed when the computer program is executed by the processor 901. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
In one embodiment, the computer program may be carried on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may also be transmitted and distributed in the form of a signal over a network medium, downloaded and installed via the communication portion 909, and/or installed from the removable medium 911. The program code included in the computer program may be transmitted using any appropriate network medium, including but not limited to wireless, wired, or any suitable combination of the foregoing.
In such an embodiment, the computer program may be downloaded and installed from the network via the communication portion 909 and/or installed from the removable medium 911. The above-described functions defined in the system of the embodiments of the present disclosure are performed when the computer program is executed by the processor 901. The systems, devices, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
According to embodiments of the present disclosure, program code for executing the computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages. In particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or in assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, C, or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be combined and/or incorporated in a variety of ways, even if such combinations or incorporations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be variously combined and/or incorporated without departing from the spirit and teachings of the present disclosure. All such combinations and/or incorporations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.

Claims (14)

1. A transaction data detection method, comprising:
responding to a transaction data detection request, and obtaining abnormal values of a plurality of transaction nodes based on a plurality of pieces of value attribute information corresponding to the transaction nodes respectively, wherein the transaction network diagram comprises a plurality of edges connecting the transaction nodes;
determining weight values of the edges based on the abnormal values of the transaction nodes;
determining a target dense subgraph from the transaction network graph based on the weight values of the edges;
and obtaining a detection result of the transaction network graph based on the target dense subgraph.
2. The method of claim 1, wherein the deriving the outliers of each of the plurality of transaction nodes based on the pieces of value attribute information corresponding to each of the plurality of transaction nodes included in the transaction network graph comprises:
for each of the transaction nodes, determining a plurality of pieces of value attribute information corresponding to the transaction node based on a plurality of pieces of transaction data associated with the transaction node;
determining a first frequency of occurrence of each of a plurality of feature attribute information included in a plurality of pieces of value attribute information based on the plurality of pieces of value attribute information corresponding to the transaction node;
and obtaining the abnormal value of the transaction node based on the first frequency of each occurrence of the plurality of characteristic attribute information and the preset probability of each occurrence of the plurality of characteristic attribute information.
3. The method of claim 2, wherein the deriving the outlier of the transaction node based on the first frequency of occurrence of each of the plurality of characteristic attribute information and the predetermined probability of occurrence of each of the plurality of characteristic attribute information comprises:
obtaining a second frequency of each occurrence of the plurality of feature attribute information based on a first frequency of each occurrence of the plurality of feature attribute information and a predetermined probability of each occurrence of the plurality of feature attribute information;
and carrying out chi-square statistics based on the first frequency number of each occurrence of the plurality of characteristic attribute information and the second frequency number of each occurrence of the plurality of characteristic attribute information to obtain the abnormal value of the transaction node.
4. The method of claim 1, wherein the determining the weight value for each of the plurality of edges based on the outliers for each of the plurality of transaction nodes comprises:
for each of the edges, determining a first transaction node and a second transaction node connected to the edge based on the transaction network graph;
determining a first outlier corresponding to the first transaction node and a second outlier corresponding to the second transaction node from the outliers of each of the plurality of transaction nodes;
weight values for the edges are derived based on the first outliers and the second outliers.
5. The method of claim 1, wherein the determining a target dense subgraph from the transaction network graph based on the weight values of the respective edges comprises:
determining a weighting value of each of the plurality of transaction nodes based on the weight value of each of the plurality of edges;
determining a traversal order of the plurality of transaction nodes based on the weighting values;
a target dense subgraph is determined from the transaction network graph based on the traversal order.
6. The method of claim 5, wherein the determining a target dense subgraph from the transaction network graph based on the traversal order comprises:
selecting a first target node from the plurality of transaction nodes based on the traversal order, wherein the first target node is the transaction node with the smallest weighting value;
generating a target sub-graph based on the first target node and a second target node included in the dense sub-graph, wherein the dense sub-graph is a blank graph under the condition that the first target node is a first transaction node determined based on the traversal order;
obtaining a first density value of the target sub-graph based on a plurality of weight values corresponding to a plurality of edges included in the target sub-graph and the number of target nodes included in the first sub-graph;
and determining the target subgraph as a new dense subgraph under the condition that the first density value is larger than the density value of the dense subgraph, wherein the target subgraph is determined to be the target dense subgraph under the condition that the first target node is the last transaction node determined based on the traversal order.
7. The method of claim 5, further comprising:
determining to stop generating the target dense subgraph in response to the number of dense subgraphs being greater than or equal to a first threshold;
and determining to stop generating the target dense subgraph in response to the number of nodes included in the dense subgraph being less than or equal to a second threshold.
8. The method of claim 1, wherein the obtaining, based on the target dense subgraph, a detection result of the transaction network graph includes:
determining an abnormal parameter value of the target dense subgraph based on the abnormal values respectively corresponding to a plurality of third target nodes included in the target dense subgraph;
and determining a detection result of the transaction network graph based on the abnormal parameter value.
9. The method of claim 8, wherein the determining the detection result of the transaction network graph based on the anomaly parameter comprises:
under the condition that the abnormal parameter value is greater than or equal to a parameter threshold value, determining the target dense subgraph as an abnormal subgraph;
and determining that the abnormal subgraph comprises a plurality of third target nodes as abnormal nodes, wherein the abnormal nodes represent nodes with abnormal transactions.
10. The method of claim 1, wherein the obtaining, based on the target dense subgraph, a detection result of the transaction network graph includes:
determining the number of blacklist nodes in a plurality of third target nodes included in the target dense subgraph based on a preset blacklist information table, wherein the blacklist information table includes a plurality of blacklist nodes with abnormal transactions;
and obtaining a detection result of the transaction network graph based on the number of the blacklist nodes.
11. A transaction data detection device, comprising:
the transaction node abnormal value determining module is used for responding to a transaction data detection request, and obtaining abnormal values of a plurality of transaction nodes based on a plurality of pieces of value attribute information corresponding to the transaction nodes respectively, wherein the transaction network diagram comprises a plurality of edges connecting the transaction nodes;
the edge weight value determining module is used for determining the weight value of each edge based on the abnormal value of each transaction node;
the target dense subgraph determining module is used for determining a target dense subgraph from the transaction network graph based on the weight values of the edges;
and the transaction network diagram detection module is used for obtaining a detection result of the transaction network diagram based on the target dense subgraph.
12. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-10.
13. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1 to 10.
14. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 10.
CN202310794184.2A 2023-06-30 2023-06-30 Transaction data detection method, device, equipment and storage medium Pending CN116739605A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310794184.2A CN116739605A (en) 2023-06-30 2023-06-30 Transaction data detection method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310794184.2A CN116739605A (en) 2023-06-30 2023-06-30 Transaction data detection method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116739605A true CN116739605A (en) 2023-09-12

Family

ID=87911430

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310794184.2A Pending CN116739605A (en) 2023-06-30 2023-06-30 Transaction data detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116739605A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117851959A (en) * 2024-03-07 2024-04-09 中国人民解放军国防科技大学 FHGS-based dynamic network subgraph anomaly detection method, device and equipment
CN117851959B (en) * 2024-03-07 2024-05-28 中国人民解放军国防科技大学 FHGS-based dynamic network subgraph anomaly detection method, device and equipment

Similar Documents

Publication Publication Date Title
US10200393B2 (en) Selecting representative metrics datasets for efficient detection of anomalous data
US20200389495A1 (en) Secure policy-controlled processing and auditing on regulated data sets
CN110992169B (en) Risk assessment method, risk assessment device, server and storage medium
CN105590055B (en) Method and device for identifying user credible behaviors in network interaction system
CN110414987B (en) Account set identification method and device and computer system
US11099842B2 (en) Source code similarity detection using digital fingerprints
CN111612041B (en) Abnormal user identification method and device, storage medium and electronic equipment
CN111612038B (en) Abnormal user detection method and device, storage medium and electronic equipment
US20210136120A1 (en) Universal computing asset registry
CN110135978B (en) User financial risk assessment method and device, electronic equipment and readable medium
US11741379B2 (en) Automated resolution of over and under-specification in a knowledge graph
US8650180B2 (en) Efficient optimization over uncertain data
CN111586695A (en) Short message identification method and related equipment
CN109684198B (en) Method, device, medium and electronic equipment for acquiring data to be tested
WO2019095569A1 (en) Financial analysis method based on financial and economic event on microblog, application server, and computer readable storage medium
CN115115369A (en) Data processing method, device, equipment and storage medium
US20220374524A1 (en) Method and system for anamoly detection in the banking system with graph neural networks (gnns)
CN115168848A (en) Interception feedback processing method based on big data analysis interception
CN116739605A (en) Transaction data detection method, device, equipment and storage medium
CN113869904A (en) Suspicious data identification method, device, electronic equipment, medium and computer program
CN114723548A (en) Data processing method, apparatus, device, medium, and program product
CN112712270A (en) Information processing method, device, equipment and storage medium
CN112750047A (en) Behavior relation information extraction method and device, storage medium and electronic equipment
CN110895564A (en) Potential customer data processing method and device
CN115809466B (en) Security requirement generation method and device based on STRIDE model, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination