WO2020228530A1 - Repeated transaction risk monitoring method, device, and computer-readable storage medium - Google Patents

Repeated transaction risk monitoring method, device, and computer-readable storage medium

Info

Publication number
WO2020228530A1
Authority
WO
WIPO (PCT)
Prior art keywords
similarity
batch
transaction
tested
preset
Prior art date
Application number
PCT/CN2020/087550
Other languages
English (en)
French (fr)
Inventor
李晓刚
郑建宾
赵金涛
刘红宝
汤韬
Original Assignee
中国银联股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国银联股份有限公司
Publication of WO2020228530A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401 Transaction verification
    • G06Q20/4016 Transaction verification involving fraud or risk level assessment in transaction processing

Definitions

  • the invention belongs to the technical field of transaction processing, and specifically relates to a method, a device and a computer-readable storage medium for monitoring repeated transaction risks.
  • batch transfer is a common way to process business in batches; it usually refers to a technology that converts pending batch transactions into real-time transactions for processing.
  • the accepting institution and the UnionPay system use batch files to transmit transaction messages
  • the UnionPay system and the card issuer use online messaging to transmit transaction messages.
  • when server resources are insufficient, the network is congested, or the server system jitters, repeated transactions may occur, which may lead to economic losses.
  • the method usually adopted in the prior art is to compare the batch number of the currently received batch transaction message with the batch numbers of previously received batch transaction messages; the batch number received first prevails, and batch transaction messages with a repeated batch number are discarded.
  • however, the above scheme uses only the batch number as the identification criterion and does not consider the specific information of the transactions; if a repeated transaction appears in a different batch file, the existing scheme cannot recognize it.
  • the present invention provides the following solutions.
  • a method for monitoring repeated transaction risks, including: obtaining batch transaction messages to be tested sent by the same monitoring object at a specified time, and historical transaction messages sent before the specified time; determining, according to the specified message content, the similarity index between the batch transaction messages to be tested and the historical transaction messages, where the specified message content includes at least two of the following: batch number, transaction account number, and transaction amount; and comparing the similarity index with a preset similarity threshold to determine whether there is a risk of repeated transactions in the batch transaction messages to be tested.
  • obtaining the batch transaction messages to be tested sent by the same monitoring object at a specified time, and the historical transaction messages sent before the specified time, includes: receiving the batch transaction messages to be tested sent by the same monitoring object at the specified time; determining a first time period from a preset duration and the specified time, and extracting the historical transaction messages sent by the same monitoring object in the first time period.
  • determining the similarity index between the batch transaction messages to be tested and the historical transaction messages includes: using a preset similarity algorithm to determine the similarity vector between the batch transaction messages to be tested and the historical transaction messages; and using preset scoring rules to convert the similarity vector into a similarity index.
  • using a preset similarity algorithm to determine the similarity vector between the batch transaction messages to be tested and the historical transaction messages includes: constructing a sparse matrix based on the batch transaction messages to be tested and the historical transaction messages, where the value of each non-zero element is determined by the transaction amount, and the row label and column label of each element are determined by the batch number and transaction account number respectively; determining m similarity parameters between the first sparse vector and m second sparse vectors in the sparse matrix, and determining the similarity vector from the m similarity parameters; where the batch transaction messages to be tested include multiple transaction messages corresponding to the first batch number, and the row vector or column vector corresponding to the first batch number in the sparse matrix is used as the first sparse vector; the historical transaction messages include multiple transaction messages corresponding to m second batch numbers, the row vectors or column vectors corresponding to the m second batch numbers in the sparse matrix are used as the m second sparse vectors, and m is a positive integer.
  • the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix are determined by the following formula:
  • the preset scoring rule includes: determining the maximum similarity parameter among the m similarity parameters as the similarity index.
  • the preset scoring rule further includes: determining whether the maximum similarity parameter among the m similarity parameters reaches a preset critical value; if the maximum similarity parameter reaches the preset critical value, determining the preset critical value as the similarity index; if the maximum similarity parameter does not reach the preset critical value, weighting the m similarity parameters respectively with m preset weight parameters to obtain m weighted similarity parameters, and using the maximum of the m weighted similarity parameters as the similarity index.
  • the specified message content also includes the batch upload time, and the method further includes: for each of the m similarity parameters, determining the corresponding preset weight parameter from the difference between the two corresponding batch upload times.
  • it further includes: determining m preset weight parameters by the following formula, and respectively weighting the m similarity parameters to obtain m weighted similarity parameters:
  • t_a is the batch upload time of the batch transaction messages to be tested;
  • S_i is the i-th similarity parameter among the m similarity parameters;
  • t_i is the batch upload time of the i-th batch of historical data, corresponding to the i-th similarity parameter;
  • ⁇_i is the i-th preset weight parameter among the m preset weight parameters, corresponding to the i-th similarity parameter;
  • X_i is the i-th weighted similarity parameter among the m weighted similarity parameters, corresponding to the i-th similarity parameter;
  • T is the duration of the first time period, which contains t_a and each t_i.
  • it further includes: determining m preset weight parameters from preset credit information and/or preset attribute information of the same monitored object.
  • the method further includes: extracting historical transaction data sent by the same monitoring object before a specified time, and determining a similarity threshold according to the historical transaction data, where the historical transaction data is sent before the historical transaction message.
  • the historical transaction data includes multiple transaction data corresponding to n third batch numbers, and each of the n third batch numbers is assigned a corresponding repeated transaction risk label, where n is a positive integer greater than 1; and the method further includes: taking, in turn, the transaction data corresponding to each of the n third batch numbers as the batch data to be tested, and treating the rest of the historical transaction data as the remaining batch data; determining, according to the specified message content, the reference similarity index between the batch data to be tested and the remaining batch data, thereby obtaining the reference similarity index corresponding to each third batch number; establishing the ROC curve from the reference similarity index and the repeated transaction risk label corresponding to each third batch number, and determining the similarity threshold from the ROC curve.
  • before establishing the ROC curve, the method further includes: removing each reference similarity index with a value of 0 or 1, together with its corresponding repeated transaction risk label.
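The threshold selection described above can be sketched in a few lines. This is a minimal illustration, not the applicant's procedure: the text only says the threshold is determined "according to the ROC curve", so picking the point that maximizes TPR minus FPR (Youden's J) is an assumption, and the function names `roc_points` and `pick_similarity_threshold` are hypothetical.

```python
def roc_points(scores, labels):
    """Return (threshold, fpr, tpr) per distinct score; a batch is flagged when score >= threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one risky and one non-risky batch")
    points = []
    for threshold in sorted(set(scores), reverse=True):
        flagged = [y for s, y in zip(scores, labels) if s >= threshold]
        tpr = sum(flagged) / positives
        fpr = (len(flagged) - sum(flagged)) / negatives
        points.append((threshold, fpr, tpr))
    return points


def pick_similarity_threshold(scores, labels):
    """Drop reference indices equal to 0 or 1 (as the text requires), then choose a threshold."""
    kept = [(s, y) for s, y in zip(scores, labels) if s not in (0, 1)]
    kept_scores = [s for s, _ in kept]
    kept_labels = [y for _, y in kept]
    points = roc_points(kept_scores, kept_labels)
    # ASSUMPTION: choose the ROC point maximizing TPR - FPR (Youden's J).
    best = max(points, key=lambda p: p[2] - p[1])
    return best[0]


# Illustrative reference similarity indices and risk labels (1 = repeated transaction risk).
reference_scores = [0.0, 0.15, 0.2, 0.7, 0.85, 1.0, 0.4]
risk_labels = [0, 0, 0, 1, 1, 1, 0]
threshold = pick_similarity_threshold(reference_scores, risk_labels)
```

With this made-up data the selected threshold is 0.7, the lowest score that still separates the labelled risky batches from the others.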
  • the historical transaction data has a periodic correspondence with the upload time of the historical transaction message.
  • it further includes: before determining the similarity index between the batch transaction messages to be tested and the historical transaction messages, comparing the batch number of the batch transaction messages to be tested with the batch numbers of the historical transaction messages; if one or more historical transaction messages have the same batch number as the batch transaction messages to be tested, directly determining that the batch transaction messages to be tested carry a risk of repeated transactions; if no historical transaction message has the same batch number as the batch transaction messages to be tested, proceeding to determine the similarity index between the batch transaction messages to be tested and the historical transaction messages.
  • it also includes: if the batch transaction messages to be tested are judged to carry a risk of repeated transactions, sending early warning information to the same monitoring object; receiving confirmation information from the same monitoring object, and re-judging, based on the confirmation information, whether there is a risk of repeated transactions in the batch transaction messages to be tested.
  • a repeated transaction risk monitoring device, including: an acquisition module, used to acquire batch transaction messages to be tested sent by the same monitoring object at a specified time, and historical transaction messages sent before the specified time; a similarity module, used to determine the similarity index between the batch transaction messages to be tested and the historical transaction messages according to the specified message content, where the specified message content includes at least two of the following: batch number, transaction account number, and transaction amount; and a judgment module, used to compare the similarity index with a preset similarity threshold to determine whether there is a risk of repeated transactions in the batch transaction messages to be tested.
  • the acquisition module includes: a receiving module, used to receive the batch transaction messages to be tested sent by the same monitoring object at a specified time; and an extraction module, used to determine the first time period from the preset duration and the specified time, and to extract the historical transaction messages sent by the same monitoring object in the first time period.
  • the similarity module includes: a similarity measurement module, used to determine the similarity vector between the batch transaction messages to be tested and the historical transaction messages using a preset similarity algorithm; and a similarity scoring module, used to convert the similarity vector into a similarity index using preset scoring rules.
  • the similarity measurement module is used to: construct a sparse matrix based on the batch transaction messages to be tested and the historical transaction messages, where the value of each non-zero element is determined by the transaction amount, and the row label and column label of each element are determined by the batch number and transaction account number respectively; determine the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix, and determine the similarity vector from the m similarity parameters; where the batch transaction messages to be tested include multiple transaction messages corresponding to the first batch number, and the row vector or column vector corresponding to the first batch number in the sparse matrix is used as the first sparse vector; the historical transaction messages include multiple transaction messages corresponding to m second batch numbers, the row vectors or column vectors corresponding to the m second batch numbers in the sparse matrix are used as the m second sparse vectors, and m is a positive integer.
  • b_i represents the i-th second sparse vector, and a represents the first sparse vector;
  • #{(b_i - a) ≠ 0} represents the number of non-zero elements in the difference vector of the first sparse vector and the i-th second sparse vector;
  • #{(b_i + a) ≠ 0} represents the number of non-zero elements in the sum vector of the first sparse vector and the i-th second sparse vector.
  • the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix are determined by the following formula:
  • the similarity scoring module is used to determine the maximum similarity parameter among the m similarity parameters as the similarity index.
  • the similarity scoring module is used to determine whether the maximum similarity parameter among the m similarity parameters reaches a preset critical value; if the maximum similarity parameter reaches the preset critical value, determine the preset critical value as the similarity index; if the maximum similarity parameter does not reach the preset critical value, weight the m similarity parameters respectively with m preset weight parameters to obtain m weighted similarity parameters, and use the maximum of the m weighted similarity parameters as the similarity index.
  • the specified message content also includes the batch upload time, and the similarity scoring module is further used to: for each of the m similarity parameters, determine the corresponding preset weight parameter from the difference between the two corresponding batch upload times.
  • the similarity scoring module is further used to determine m preset weight parameters by the following formula, and respectively weight the m similarity parameters to obtain m weighted similarity parameters:
  • t_a is the batch upload time of the batch transaction messages to be tested;
  • S_i is the i-th similarity parameter among the m similarity parameters;
  • t_i is the batch upload time of the i-th batch of historical data, corresponding to the i-th similarity parameter;
  • ⁇_i is the i-th preset weight parameter among the m preset weight parameters, corresponding to the i-th similarity parameter;
  • X_i is the i-th weighted similarity parameter among the m weighted similarity parameters, corresponding to the i-th similarity parameter;
  • T is the duration of the first time period, which contains t_a and each t_i.
  • the similarity scoring module is further used to determine m preset weight parameters from preset credit information and/or preset attribute information of the same monitored object.
  • a similarity threshold module is further included, specifically used to extract historical transaction data sent by the same monitoring object before the specified time, and to determine the similarity threshold from the historical transaction data, where the historical transaction data is sent before the historical transaction messages.
  • the historical transaction data includes multiple transaction data corresponding to n third batch numbers, and each of the n third batch numbers is assigned a corresponding repeated transaction risk label, where n is a positive integer greater than 1; the similarity threshold module is further used to: take, in turn, the transaction data corresponding to each of the n third batch numbers as the batch data to be tested, and treat the rest of the historical transaction data as the remaining batch data; determine, according to the specified message content, the reference similarity index between the batch data to be tested and the remaining batch data, thereby obtaining the reference similarity index corresponding to each third batch number; establish the ROC curve from the reference similarity index and the repeated transaction risk label corresponding to each third batch number, and determine the similarity threshold from the ROC curve.
  • the similarity threshold module is further used to remove the reference similarity index with a value of 0 or 1 and the corresponding repeated transaction risk label.
  • the historical transaction data has a periodic correspondence with the upload time of the historical transaction message.
  • a filter module is further included, which is used to compare, before the similarity index between the batch transaction messages to be tested and the historical transaction messages is determined, the batch number of the batch transaction messages to be tested with the batch numbers of the historical transaction messages; if one or more historical transaction messages have the same batch number as the batch transaction messages to be tested, it is directly determined that the batch transaction messages to be tested carry a risk of repeated transactions; if no historical transaction message has the same batch number as the batch transaction messages to be tested, the similarity index between the batch transaction messages to be tested and the historical transaction messages is then determined.
  • it also includes an early warning module, which is used to send early warning information to the same monitoring object if the batch transaction messages to be tested are judged to carry a risk of repeated transactions, to receive confirmation information from the same monitoring object, and to re-judge, based on the confirmation information, whether there is a risk of repeated transactions in the batch transaction messages to be tested.
  • a repeated transaction risk monitoring system includes the above-mentioned monitoring device and at least one monitoring object.
  • a repeated transaction risk monitoring device includes: one or more multi-core processors; and a memory for storing one or more programs; when the one or more programs are executed by the one or more multi-core processors, the one or more multi-core processors are caused to: obtain the batch transaction messages to be tested sent by the same monitoring object at a specified time, and historical transaction messages sent before the specified time; determine, according to the specified message content, the similarity index between the batch transaction messages to be tested and the historical transaction messages, where the specified message content includes at least two of the following: batch number, transaction account number, and transaction amount; and compare the similarity index with a preset similarity threshold to determine whether there is a risk of repeated transactions in the batch transaction messages to be tested.
  • a computer-readable storage medium stores a program, and when the program is executed by a multi-core processor, the multi-core processor executes the above-mentioned method.
  • in this application, the similarity index is calculated between the batch transaction messages to be tested, sent by the same monitoring object at a specified time, and the historical transaction messages sent within a period of time before the specified time; comparing the similarity index with the preset similarity threshold then makes it possible to monitor possible repeated transactions in the batch transaction messages, prompt the repeated transaction risk more sensitively, and avoid economic losses.
  • further, in calculating the similarity index, this application makes full use of the information of the transaction itself to improve the credibility of the similarity index, and simplifies the calculation by using the sparse matrix and the differences between sparse vectors.
  • further, this application uses the transaction delivery time to formulate a reasonable weighting scheme, which improves the calculation accuracy of the similarity index.
  • further, this application formulates a reasonable threshold acquisition scheme and uses the ROC curve to obtain a more credible similarity threshold, which further ensures the accuracy of repeated transaction risk monitoring.
  • FIG. 1 is a schematic flowchart of a method for monitoring repeated transaction risks according to an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of a method for monitoring repeated transaction risks according to another embodiment of the present invention.
  • Figure 3 is a schematic diagram of an ROC curve according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a repeated transaction risk monitoring device according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a repeated transaction risk monitoring device according to another embodiment of the present invention.
  • Fig. 6 is a schematic diagram of a computer-readable storage medium according to an embodiment of the present invention.
  • Fig. 1 schematically shows a flow chart of a method 100 for monitoring repeated transaction risks according to an embodiment of the present invention.
  • the method shown in Fig. 1 can be executed on a cloud server, a server cluster, or a background transaction processing system; more specifically, it can be executed by a specific module set in the UnionPay system.
  • the cloud server is used as the execution subject for specific explanation, but it should be understood that the application does not specifically limit the execution subject.
  • the method 100 includes:
  • Step S101 Obtain a batch of transaction messages to be tested sent by the same monitoring object at a designated time, and historical transaction messages sent before the designated time;
  • the same monitoring object refers to the merchant or terminal that actually makes transactions with the cardholder.
  • the batch transaction messages to be tested and the historical transaction messages can be transaction messages generated by multiple types of transactions, including credit transactions.
  • the transaction messages are not sent to the cloud server in real time; instead, multiple transaction messages generated within a period of time are packaged and sent to the cloud server in batches.
  • the same monitoring object sends batch transaction messages at the designated time and at multiple time points before the designated time; the batch transaction messages sent by the monitoring object at the designated time are designated as the "batch transaction messages to be tested", and the designated time is usually the most recent time or the current time.
  • the transaction messages sent by the same monitoring object within a period of time before the designated time are designated as "historical transaction messages", which serve as background data in the repeated transaction risk analysis.
  • step S101 may further include: receiving a batch of transaction messages to be tested sent by the same monitoring object at a specified time; determining the first time period from the preset duration and the specified time, and extracting the same monitoring object Historical transaction messages sent in the first time period.
  • after the cloud server receives the batch transaction messages to be tested, in order to determine whether they carry a risk of repeated transactions, it extracts from its database the transaction messages of other batches sent by the same monitoring object within the previous day, hour, or ten minutes, and uses them as background data for the repeated transaction risk analysis.
  • the transaction messages stored in the database may be in batch format or non-batch format, and this application does not specifically limit this.
  • using the same monitoring object's transaction messages from the preceding period of time to analyze the current batch transaction messages makes it possible to determine in real time, and more accurately, whether the current batch transaction messages carry a risk of repeated transactions.
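The extraction in step S101 can be sketched as a filter over stored batches. This is a minimal sketch under stated assumptions: the `BatchMessage` record, its field names, the in-memory `store` list, and the half-open window that ends just before the designated time are all hypothetical choices made for illustration, not details given in the application.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BatchMessage:
    sender_id: str         # hypothetical identifier of the monitoring object
    batch_no: str
    upload_time: datetime
    transactions: tuple    # ((transaction account number, amount), ...)


def extract_history(store, sender_id, designated_time, preset_duration):
    """Return the sender's batches uploaded in [designated_time - preset_duration, designated_time)."""
    window_start = designated_time - preset_duration
    return [
        msg for msg in store
        if msg.sender_id == sender_id
        and window_start <= msg.upload_time < designated_time
    ]


now = datetime(2020, 4, 28, 12, 0)
store = [
    BatchMessage("merchant-1", "B001", now - timedelta(minutes=30), (("6222-01", 100),)),
    BatchMessage("merchant-1", "B000", now - timedelta(days=2), (("6222-02", 50),)),
    BatchMessage("merchant-2", "B900", now - timedelta(minutes=5), (("6222-03", 75),)),
]
# Only the same sender's batch inside the one-hour window is kept.
history = extract_history(store, "merchant-1", now, timedelta(hours=1))
```

In a real deployment the filter would more likely be a database query keyed on sender and upload time; the list comprehension only shows which rows qualify.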
  • the method 100 further includes:
  • Step S102 Determine the similarity index between the batch transaction message to be tested and the historical transaction message according to the specified message content
  • the specified message content includes at least two of the following: batch number, transaction account number, and transaction amount.
  • each transaction message in the batch transaction messages to be tested and the historical transaction messages can be converted into a multi-dimensional feature vector based on the specified message content.
  • a deep learning model can be trained on the historical transaction messages, and the batch transaction messages to be tested are input into the deep learning model to output the similarity index.
  • the above similarity index can also be obtained by calculating the cosine distance or the Euclidean distance; this application does not specifically limit this.
  • the above-mentioned batch number, transaction account number, and transaction amount are all message content of the transaction itself.
  • the specified message content may also include: batch upload time, transaction type, transaction currency, transaction commodity type and other information, which is not specifically limited in this application.
  • before step S102, the method 100 may further include: comparing the batch number of the batch transaction messages to be tested with the batch numbers of the historical transaction messages; if one or more historical transaction messages have the same batch number as the batch transaction messages to be tested, directly determining that the batch transaction messages to be tested carry a risk of repeated transactions; if no historical transaction message has the same batch number as the batch transaction messages to be tested, further executing step S102.
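The batch number pre-check can be sketched as a short-circuit in front of the similarity computation. The function name `prefilter` and the callable `similarity_check` are hypothetical; the latter stands in for step S102 plus the threshold comparison.

```python
def prefilter(tested_batch_no, historical_batch_nos, similarity_check):
    """Flag a repeated batch number immediately; otherwise defer to the similarity check."""
    if tested_batch_no in set(historical_batch_nos):
        return True               # same batch number seen before: risk is determined directly
    return similarity_check()     # no match: fall through to the similarity index comparison


calls = []


def fake_similarity_check():
    # Stand-in for step S102; records that it ran and reports "no risk".
    calls.append("called")
    return False


flagged_same_number = prefilter("B001", ["B001", "B002"], fake_similarity_check)
flagged_new_number = prefilter("B003", ["B001", "B002"], fake_similarity_check)
```

The design point is that the more expensive similarity computation runs only when the cheap batch number comparison finds nothing.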
  • step S102 may further include:
  • Step S201 using a preset similarity algorithm to determine the similarity vector between the batch transaction messages to be tested and the historical transaction messages;
  • step S201 may further include: constructing a sparse matrix based on the batch transaction messages to be tested and the historical transaction messages, where the value of each non-zero element is determined by the transaction amount, and the row label and column label of each element are determined by the batch number and the transaction account number respectively; determining the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix; and determining the similarity vector from the m similarity parameters.
  • the batch transaction messages to be tested include multiple transaction messages corresponding to the first batch number, and the row vector or column vector corresponding to the first batch number in the sparse matrix is used as the first sparse vector; the historical transaction messages include multiple transaction messages corresponding to m second batch numbers, and the row vectors or column vectors corresponding to the m second batch numbers in the sparse matrix are used as the m second sparse vectors, where m is a positive integer.
  • that the row label and column label of each element are determined by the batch number and transaction account number respectively can mean either that each row of the sparse matrix corresponds to one batch number and each column to one transaction account number, or that each row corresponds to one transaction account number and each column to one batch number.
  • the batch transaction message to be tested may be a transaction package generated by the same monitoring object according to a preset rule, and the cloud server parses and obtains the multiple transaction messages after receiving the batch transaction message to be tested.
  • the batch number and batch upload time are information common to the multiple transaction messages, while the transaction account number and transaction amount are specific to each transaction message.
  • each transaction message contained in the batch transaction message to be tested and the historical transaction message is arranged with the batch number as the row label and the transaction account number as the column label to form a sparse matrix as shown below.
  • in this sparse matrix, each row corresponds to one batch number and each column corresponds to one transaction account number. If a transaction account number has a transaction record in a given batch, the element at the corresponding position takes the transaction amount of that transaction; if it has no transaction record in that batch, the element at the corresponding position is set to zero. Actual transaction experience suggests that each row and column of this matrix contains a certain number of non-zero elements (the actual data) and a large number of zero elements (which carry no data and are not stored).
  • V_mn is the transaction amount in the historical transaction messages corresponding to the m-th second batch number and transaction account number C_n; V_an is the transaction amount in the batch transaction messages to be tested corresponding to transaction account number C_n; and so on.
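The matrix layout described above (batch numbers as row labels, transaction account numbers as column labels, amounts as values, zeros not stored) can be sketched with nested dictionaries. This is an illustrative data structure, not one named in the application; `build_sparse_matrix` is a hypothetical helper, amounts are given in integer minor units, and summing several transactions of one account within one batch is an assumption.

```python
from collections import defaultdict


def build_sparse_matrix(messages):
    """messages: iterable of (batch_no, account_no, amount). Returns {batch_no: {account_no: amount}}."""
    matrix = defaultdict(dict)
    for batch_no, account_no, amount in messages:
        if amount == 0:
            continue  # zero elements carry no data and are not stored
        # ASSUMPTION: several transactions of one account in one batch are summed.
        matrix[batch_no][account_no] = matrix[batch_no].get(account_no, 0) + amount
    return dict(matrix)


messages = [
    ("a", "C1", 1000),    # batch to be tested
    ("a", "C3", 2500),
    ("b1", "C1", 1000),   # historical batch
    ("b1", "C2", 500),
]
matrix = build_sparse_matrix(messages)
first_sparse_vector = matrix["a"]   # row for the first batch number
```

Each inner dictionary plays the role of one sparse row vector; a missing key is read as a zero element.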
  • the first sparse vector a, that is, the row vector corresponding to the first batch number a, is: a = (V_a1, V_a2, ..., V_an);
  • the m second sparse vectors b_i, i = 1, 2, ..., m, that is, the m row vectors respectively corresponding to the m second batch numbers, are: b_i = (V_i1, V_i2, ..., V_in).
  • the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix can be determined by the following formula (1):
  • b_i represents the i-th second sparse vector among the m second sparse vectors
  • a represents the first sparse vector
  • #{(b_i − a) ≠ 0} represents the number of non-zero elements in the difference vector of the first sparse vector and the i-th second sparse vector
  • #{(b_i + a) ≠ 0} represents the number of non-zero elements in the sum vector of the first sparse vector and the i-th second sparse vector
  • S_i represents the i-th similarity parameter, between the i-th second sparse vector and the first sparse vector
  • m is a positive integer denoting the number of second sparse vectors, and hence the number of similarity parameters.
  • formula (1) above is highly sensitive to the risk of repeated transactions and identifies repeated transactions well using only simple counting statistics.
  • the similarity parameter S_i takes values in [0, 1]: when the two batches of transactions are completely the same, the similarity parameter is 1, and when the two batches of transactions are completely different, the similarity parameter is 0.
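The formula image for formula (1) is not reproduced in this text, so the following is only a sketch under a stated assumption: that S_i = 1 − #{(b_i − a) ≠ 0} / #{(b_i + a) ≠ 0}. That form is consistent with the properties stated above (a ratio of the two counts, value 1 for identical batches, value 0 for completely different batches), but it is not confirmed by the source. The function names and the dict-based vector representation are hypothetical.

```python
def count_nonzero_difference(a, b):
    """Number of positions where b - a is non-zero (vectors as {account: amount})."""
    return sum(1 for k in set(a) | set(b) if b.get(k, 0) - a.get(k, 0) != 0)


def count_nonzero_sum(a, b):
    """Number of positions where b + a is non-zero."""
    return sum(1 for k in set(a) | set(b) if b.get(k, 0) + a.get(k, 0) != 0)


def similarity(a, b):
    """Assumed form of formula (1): 1 - #{(b - a) != 0} / #{(b + a) != 0}."""
    total = count_nonzero_sum(a, b)
    if total == 0:
        # Assumption: two empty batches are treated as dissimilar.
        return 0.0
    return 1.0 - count_nonzero_difference(a, b) / total


a = {"C1": 100, "C2": 40}        # batch to be tested
b_same = {"C1": 100, "C2": 40}   # identical historical batch
b_part = {"C1": 100, "C3": 250}  # one shared transaction
b_diff = {"C5": 10}              # nothing in common
scores = [similarity(a, b) for b in (b_same, b_part, b_diff)]
```

With positive amounts the sum vector is non-zero wherever either batch has a record, so the ratio stays within [0, 1]; negative amounts (for example refunds) would need separate handling.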
  • the present invention can also determine the m similarity parameters between the first sparse vector a and the m second sparse vectors b_i in other ways, for example by calculating Euclidean distance or cosine distance; this application does not specifically limit this.
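As one of the alternatives mentioned above, a cosine-based similarity between two sparse vectors might look as follows. This is an illustrative sketch only; the source names cosine distance as an option without specifying how it would be applied, and `cosine_similarity` and the dict representation are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as {account: amount}."""
    dot = sum(value * b.get(key, 0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an empty batch is similar to nothing
    return dot / (norm_a * norm_b)


same = cosine_similarity({"C1": 100, "C2": 40}, {"C1": 100, "C2": 40})
disjoint = cosine_similarity({"C1": 100}, {"C2": 40})
```

Unlike the counting approach, this measure depends on the magnitudes of the amounts, so a single large transaction can dominate the score.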
  • step S102 may further include:
  • Step S202 Convert the similarity vector into a similarity index by using a preset scoring rule.
  • the preset scoring rule in step S202 may include: determining the maximum similarity parameter among the m similarity parameters as the similarity index.
  • the preset scoring rule in step S202 may further include: judging whether the maximum similarity parameter among the m similarity parameters reaches the preset critical value; if the maximum similarity parameter reaches the preset critical value, the preset critical value is taken as the similarity index; if the maximum similarity parameter does not reach the preset critical value, the m similarity parameters are respectively weighted by the m preset weight parameters to obtain m weighted similarity parameters, and the maximum of the m weighted similarity parameters is taken as the similarity index.
  • the similarity parameter S_i obtained from formula (1) lies in [0, 1], so 1 can be used as the preset critical value. If the maximum similarity parameter reaches 1, two batches of transactions are exactly the same, and the two batches corresponding to the maximum similarity parameter can generally be considered repeated. If the maximum similarity parameter is less than 1, the judgment must additionally take the preset weight parameters into account, which can be determined by factors such as the batch upload time.
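The two-branch scoring rule above can be sketched as a small function. It assumes that "weighting" means multiplying each similarity parameter by its weight, which the source does not state explicitly; the name `similarity_index` and the default critical value of 1.0 are illustrative.

```python
def similarity_index(similarities, weights, critical_value=1.0):
    """Collapse m similarity parameters into one similarity index."""
    if not similarities:
        return 0.0  # assumption: no history means no evidence of repetition
    if max(similarities) >= critical_value:
        # An exact duplicate exists: report the critical value itself.
        return critical_value
    # Assumption: weighted similarity X_i = w_i * S_i.
    return max(w * s for w, s in zip(weights, similarities))


exact = similarity_index([0.2, 1.0], [0.5, 0.5])
weighted = similarity_index([0.8, 0.6], [0.5, 1.0])
```

In the second call the less similar but more heavily weighted batch determines the index, which is the intended effect of weighting.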
  • the specified message content also includes the batch upload time, and step S202 further includes: for each of the m similarity parameters, determining the corresponding preset weight parameter from the difference between the two corresponding batch upload times. Since two batches of transactions separated by a shorter time interval are more likely to be repeated, this embodiment determines the preset weight parameters from the difference between the upload times of the two batches, which yields a more accurate similarity index.
  • the m preset weight parameters (ω_1, ω_2, ..., ω_m) can be determined by formula (2), and the m similarity parameters (S_1, S_2, ..., S_m) are respectively weighted by the m preset weight parameters (ω_1, ω_2, ..., ω_m) to obtain m weighted similarity parameters (X_1, X_2, ..., X_m).
  • t_a is the batch upload time of the batch transaction message to be tested;
  • S_i is the i-th similarity parameter among the m similarity parameters;
  • t_i is the batch upload time of the i-th batch of historical data corresponding to the i-th similarity parameter;
  • ω_i is the i-th preset weight parameter, among the m preset weight parameters, corresponding to the i-th similarity parameter;
  • X_i is the i-th weighted similarity parameter, among the m weighted similarity parameters, corresponding to the i-th similarity parameter;
  • T is the duration of the first period, which contains t_a and every t_i;
  • m is a positive integer, which represents the number of similarity parameters.
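The formula image for formula (2) is not reproduced in this text. The sketch below therefore assumes a simple linear decay, ω_i = 1 − (t_a − t_i) / T and X_i = ω_i · S_i, chosen only because it uses exactly the symbols defined above and gives closer batches larger weights; the actual formula may differ. Function names are hypothetical.

```python
from datetime import datetime, timedelta


def time_weights(t_a, batch_times, period):
    """Assumed weights: 1 - (t_a - t_i) / T, clamped to [0, 1]."""
    weights = []
    for t_i in batch_times:
        elapsed_fraction = (t_a - t_i) / period  # timedelta / timedelta -> float
        weights.append(min(1.0, max(0.0, 1.0 - elapsed_fraction)))
    return weights


def weighted_similarities(similarities, weights):
    """Assumed weighting: X_i = w_i * S_i."""
    return [w * s for w, s in zip(weights, similarities)]


t_a = datetime(2020, 5, 1, 12, 0)
batch_times = [datetime(2020, 5, 1, 11, 0), datetime(2020, 5, 1, 0, 0)]
weights = time_weights(t_a, batch_times, timedelta(hours=24))
x = weighted_similarities([0.5, 0.5], weights)
```

With equal similarity parameters, the batch uploaded one hour earlier ends up with a larger weighted similarity than the batch uploaded twelve hours earlier.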
  • it may further include determining the foregoing m preset weight parameters from preset credit information and/or preset attribute information of the same monitored object.
  • the preset credit information of the same monitored object is, for example, the bank credit score of the same monitored object.
  • the method 100 further includes:
  • Step S103 Compare the similarity index with a preset similarity threshold to determine whether there is a risk of repeated transactions in the batch of transaction messages to be tested.
  • the repeated transaction risk indicates that the batch transaction message to be tested contains one or more transaction messages that duplicate historical transactions.
  • once the corresponding similarity index has been obtained, it can be compared with the preset similarity threshold. If the similarity index exceeds the preset similarity threshold, the batch transaction message to be tested is judged to carry a risk of repeated transactions, and relevant early warning measures can then be taken. If the similarity index does not exceed the preset similarity threshold, the batch transaction message to be tested is judged to be a normal transaction.
  • the method 100 further includes: extracting historical transaction data uploaded by the same monitoring object before the specified time, and determining the similarity threshold according to the historical transaction data, where the historical transaction data was uploaded before the historical transaction messages.
  • the similarity threshold obtained based on the historical transaction data of the same monitoring object has higher adaptability and reliability.
  • the similarity threshold may also be obtained through empirical values and experimental values.
  • the historical transaction data has a periodic correspondence with the upload time of the historical transaction message.
  • historical transaction data and historical transaction messages can be sent by the same monitoring object in the same period of two adjacent weeks or two adjacent days.
  • the historical transaction data includes multiple pieces of transaction data respectively corresponding to n third batch numbers, each of the n third batch numbers has a corresponding repeated transaction risk label, and n is a positive integer greater than 1.
  • determining the similarity threshold according to historical transaction data may specifically include:
  • the historical transaction data can be divided into packets R_1 to R_5 corresponding to five third batch numbers; R_1 is first selected as the batch data to be tested and R_2 to R_5 serve as the remaining batch data. The similarity index between the batch data to be tested and the remaining batch data is then calculated and taken as the reference similarity index corresponding to R_1.
  • the specific calculation process is the same as or similar to the steps described above for calculating the similarity index between the batch transaction message to be tested and the historical transaction messages, and is not repeated in this application.
  • in the same way, the reference similarity index corresponding to each of the five batches R_1 to R_5 can be calculated.
  • R_1 to R_5 respectively represent each third batch number in the multi-batch transaction data.
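The leave-one-out procedure for the reference similarity index can be sketched as below. The `similarity` helper uses the same assumed form of formula (1) as elsewhere (1 minus the ratio of non-zero counts), and the maximum rule is used as the scoring rule; both are assumptions for illustration, as are the batch contents.

```python
def similarity(a, b):
    """Assumed formula (1): 1 - #{(b - a) != 0} / #{(b + a) != 0}."""
    keys = set(a) | set(b)
    total = sum(1 for k in keys if a.get(k, 0) + b.get(k, 0) != 0)
    if total == 0:
        return 0.0
    changed = sum(1 for k in keys if b.get(k, 0) - a.get(k, 0) != 0)
    return 1.0 - changed / total


def reference_indices(batches):
    """For each batch, score it against all remaining batches (leave-one-out)."""
    refs = {}
    for name, vector in batches.items():
        remaining = [v for other, v in batches.items() if other != name]
        # Scoring rule assumed here: the maximum similarity parameter.
        refs[name] = max((similarity(vector, r) for r in remaining), default=0.0)
    return refs


# Hypothetical batches: R1 and R2 are duplicates, R3 is unrelated.
batches = {
    "R1": {"C1": 10, "C2": 20},
    "R2": {"C1": 10, "C2": 20},
    "R3": {"C9": 5},
}
refs = reference_indices(batches)
```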
  • for example, the repeated transaction risk label corresponding to R_1 is 0 (that is, non-repeated transaction) and its reference similarity index is 0.3; the repeated transaction risk label corresponding to R_3 is 1 (that is, repeated transaction) and its reference similarity index is 0.9; and so on. The reference similarity indices corresponding to R_1 to R_5 are then each used in turn as the preset threshold to evaluate precision and recall, determining the four cases TP, FP, TN, and FN.
  • before establishing the ROC curve, the method may further include: removing each reference similarity index whose value is 0 or 1, together with its corresponding repeated transaction risk label. This avoids bias in threshold selection.
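A threshold can be read off the ROC points in several ways; the source does not say which criterion is used. The sketch below assumes Youden's J statistic (TPR − FPR) as the selection rule, treats a batch as flagged when its index strictly exceeds the candidate threshold, and drops indices equal to 0 or 1 beforehand as described above. The scores and labels are hypothetical.

```python
def roc_points(scores, labels):
    """Return (threshold, fpr, tpr) for each distinct score used as a threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 0)
        tpr = tp / positives if positives else 0.0
        fpr = fp / negatives if negatives else 0.0
        points.append((threshold, fpr, tpr))
    return points


def pick_threshold(scores, labels):
    """Drop indices equal to 0 or 1, then maximise TPR - FPR (an assumed criterion)."""
    kept = [(s, y) for s, y in zip(scores, labels) if 0 < s < 1]
    if not kept:
        return None
    kept_scores = [s for s, _ in kept]
    kept_labels = [y for _, y in kept]
    best = max(roc_points(kept_scores, kept_labels), key=lambda p: p[2] - p[1])
    return best[0]


scores = [0.3, 0.4, 0.9, 0.8, 0.2, 1.0]  # reference similarity indices
labels = [0, 0, 1, 1, 0, 1]              # repeated transaction risk labels
threshold = pick_threshold(scores, labels)
```

For this data every labelled repeat scores above 0.4 and every non-repeat scores at or below it, so 0.4 separates the two groups perfectly.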
  • the method 100 may further include: if the batch transaction message to be tested is judged to carry a risk of repeated transactions, sending early warning information to the same monitoring object; receiving confirmation information sent by the same monitoring object, and re-judging, according to the confirmation information, whether the batch transaction message to be tested carries a risk of repeated transactions. For example, when the similarity index is greater than the similarity threshold, early warning information is fed back to the same monitoring object; if the similarity index reaches the preset critical value, a stronger warning is fed back to remind the same monitoring object that a risk of repeated transactions may exist, thereby avoiding economic losses.
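The two warning strengths described above can be sketched as a simple mapping from the similarity index to a warning level. The level names `"strong"`, `"warning"`, and `"none"` are hypothetical labels, and the sketch assumes the similarity threshold is below the critical value.

```python
def warning_level(index, threshold, critical_value=1.0):
    """Map a similarity index to a (hypothetical) warning level."""
    if index >= critical_value:
        return "strong"   # an exact duplicate batch was found
    if index > threshold:
        return "warning"  # likely partial repetition
    return "none"         # treated as a normal transaction batch


levels = [warning_level(v, threshold=0.7) for v in (1.0, 0.8, 0.5)]
```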
  • an embodiment of the present invention also provides a repeated transaction risk monitoring device for implementing the repeated transaction risk monitoring method provided in any of the foregoing embodiments.
  • FIG. 4 is a schematic structural diagram of a repeated transaction risk monitoring device provided by an embodiment of the present invention.
  • the repeated transaction risk monitoring device 40 includes:
  • the obtaining module 401 is used to obtain the batch transaction messages to be tested sent by the same monitoring object at a specified time and historical transaction messages sent before the specified time;
  • the similarity module 402 is used to determine the similarity index between the batch transaction messages to be tested and the historical transaction messages according to the specified message content, wherein the specified message content includes at least two of the following: batch number, transaction account number, and transaction amount;
  • the judging module 403 is configured to compare the similarity index with a preset similarity threshold to determine whether there is a risk of repeated transactions in the batch of transaction messages to be tested.
  • the acquisition module 401 includes: a receiving module, used to receive the batch transaction message to be tested sent by the same monitoring object at the specified time; and an extraction module, used to determine the first period from the preset duration and the specified time, and to extract the historical transaction messages sent by the same monitoring object within the first period.
  • the similarity module 402 includes: a similarity measurement module, used to determine the similarity vector between the batch transaction message to be tested and the historical transaction messages using a preset similarity algorithm; and a similarity scoring module, used to convert the similarity vector into a similarity index using the preset scoring rule.
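The extraction step can be sketched as a time-window filter: the first period ends at the specified time and starts one preset duration earlier. The record layout (a dict with `batch` and `uploaded_at` keys) and the function name are assumptions for illustration.

```python
from datetime import datetime, timedelta


def extract_history(records, specified_time, preset_duration):
    """Keep records uploaded in [specified_time - preset_duration, specified_time)."""
    start = specified_time - preset_duration
    return [r for r in records if start <= r["uploaded_at"] < specified_time]


records = [
    {"batch": "B0", "uploaded_at": datetime(2020, 5, 1, 9, 0)},
    {"batch": "B1", "uploaded_at": datetime(2020, 5, 1, 11, 30)},
]
history = extract_history(records, datetime(2020, 5, 1, 12, 0), timedelta(hours=1))
```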
  • the similarity measurement module is used to: construct a sparse matrix based on the batch transaction message to be tested and the historical transaction messages, where the value of each non-zero element is determined by the transaction amount and the row label and column label of each element are respectively determined by the batch number and the transaction account number; and determine the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix, the similarity vector being determined by the m similarity parameters. The batch transaction message to be tested includes multiple transaction messages corresponding to the first batch number, and the row vector or column vector corresponding to the first batch number in the sparse matrix serves as the first sparse vector;
  • the historical transaction messages include multiple transaction messages respectively corresponding to m second batch numbers, and the row vectors or column vectors respectively corresponding to the m second batch numbers in the sparse matrix serve as the m second sparse vectors; m is a positive integer.
  • b_i represents the i-th second sparse vector among the m second sparse vectors, and a represents the first sparse vector
  • #{(b_i − a) ≠ 0} represents the number of non-zero elements in the difference vector between the first sparse vector and the i-th second sparse vector
  • #{(b_i + a) ≠ 0} represents the number of non-zero elements in the sum vector of the first sparse vector and the i-th second sparse vector.
  • the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix are determined by the following formula:
  • the similarity scoring module is used to determine the maximum similarity parameter among the m similarity parameters as the similarity index.
  • the similarity scoring module is used to determine whether the maximum similarity parameter among the m similarity parameters reaches the preset critical value; if the maximum similarity parameter reaches the preset critical value, the preset critical value is taken as the similarity index; if the maximum similarity parameter does not reach the preset critical value, the m similarity parameters are respectively weighted by the m preset weight parameters to obtain m weighted similarity parameters, and the maximum of the m weighted similarity parameters is taken as the similarity index.
  • the specified message content also includes the batch upload time, and the similarity scoring module is further used to: for each of the m similarity parameters, determine the corresponding preset weight parameter from the difference between the two corresponding batch upload times.
  • the similarity scoring module is further used to determine m preset weight parameters by the following formula, and respectively weight the m similarity parameters to obtain m weighted similarity parameters:
  • t_a is the batch upload time of the batch transaction message to be tested;
  • S_i is the i-th similarity parameter among the m similarity parameters;
  • t_i is the batch upload time of the i-th batch of historical data corresponding to the i-th similarity parameter;
  • ω_i is the i-th preset weight parameter, among the m preset weight parameters, corresponding to the i-th similarity parameter;
  • X_i is the i-th weighted similarity parameter, among the m weighted similarity parameters, corresponding to the i-th similarity parameter;
  • T is the duration of the first period, which contains t_a and every t_i.
  • the similarity scoring module is further used to determine m preset weight parameters from preset credit information and/or preset attribute information of the same monitored object.
  • the device 40 further includes a similarity threshold module, which is specifically used to extract historical transaction data sent by the same monitoring object before the specified time and to determine the similarity threshold according to the historical transaction data, where the historical transaction data was sent before the historical transaction messages.
  • a similarity threshold module which is specifically used to extract historical transaction data sent by the same monitoring object before a specified time, and determine the similarity threshold according to the historical transaction data. The data is sent before the historical transaction message.
  • the historical transaction data includes multiple pieces of transaction data respectively corresponding to n third batch numbers, each of the n third batch numbers has a corresponding repeated transaction risk label, and n is a positive integer greater than 1. The similarity threshold module is further used to: take, in turn, the transaction data corresponding to each of the n third batch numbers as the batch data to be tested, and take the transaction data in the historical transaction data other than the batch data to be tested as the remaining batch data; determine, according to the specified message content, the reference similarity index between the batch data to be tested and the remaining batch data, so as to obtain the reference similarity index corresponding to each third batch number; and establish the ROC curve from the reference similarity index and the repeated transaction risk label corresponding to each third batch number, so as to determine the similarity threshold from the ROC curve.
  • the similarity threshold module is further used to remove the reference similarity index with a value of 0 or 1 and the corresponding repeated transaction risk label.
  • the historical transaction data has a periodic correspondence with the upload time of the historical transaction message.
  • the device 40 further includes a filtering module, used to compare the batch numbers of the batch transaction message to be tested and the historical transaction messages before the similarity index between them is determined. If one or more historical transaction messages have the same batch number as the batch transaction message to be tested, the batch transaction message to be tested is directly judged to carry a risk of repeated transactions; if no historical transaction message has the same batch number as the batch transaction message to be tested, the similarity module proceeds to determine the similarity index between the batch transaction message to be tested and the historical transaction messages.
  • the device 40 further includes an early warning module, used to: send early warning information to the same monitoring object if the batch transaction message to be tested is judged to carry a risk of repeated transactions; and receive confirmation information from the same monitoring object and re-judge, according to the confirmation information, whether the batch transaction message to be tested carries a risk of repeated transactions.
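The filtering step can be sketched as a cheap batch-number check that short-circuits the more expensive similarity computation. The function name `monitor`, the callable `compute_index`, and the sample values are hypothetical.

```python
def monitor(test_batch_no, historical_batch_nos, compute_index, threshold):
    """Return True when the batch to be tested carries a repeated transaction risk."""
    if test_batch_no in set(historical_batch_nos):
        # Same batch number seen before: flagged directly, no similarity needed.
        return True
    # Otherwise fall back to the similarity index (computed lazily).
    return compute_index() > threshold


same_number = monitor("B1", ["B1", "B2"], lambda: 0.0, 0.5)
similar = monitor("B3", ["B1", "B2"], lambda: 0.9, 0.5)
normal = monitor("B3", ["B1", "B2"], lambda: 0.1, 0.5)
```

Passing the similarity computation as a callable means it only runs when the batch number is new.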
  • an early warning module which is used to send early warning information to the same monitoring object if the batch of transaction messages to be tested is judged to have a risk of repeated transactions; to receive confirmation information from the same monitoring object , And repeatedly judge whether there is a repeated transaction risk in the batch transaction message to be tested according to the confirmation information.
  • an embodiment of the present invention also provides a repeated transaction risk monitoring system, which includes the above-mentioned monitoring device and at least one monitoring object.
  • a repeated transaction risk monitoring device of the present invention may at least include one or more processors and at least one memory.
  • the memory stores a program, and when the program is executed by the processor, the processor is caused to perform the steps shown in FIG. 1:
  • Step S101 Obtain batch transaction messages to be tested sent by the same monitoring object at a specified time, and historical transaction messages sent before the specified time;
  • Step S102 Determine the similarity index between the batch transaction messages to be tested and the historical transaction messages according to the specified message content, where the specified message content includes at least two of the following: batch number, transaction account number, and transaction amount;
  • Step S103 The similarity index is compared with a preset similarity threshold to determine whether there is a risk of repeated transactions in the batch of transaction messages to be tested.
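Steps S101 to S103 can be tied together in a short end-to-end sketch. It assumes the inputs of S101 are already fetched, uses the assumed form of formula (1) (1 minus the ratio of non-zero counts) with the maximum rule for S102, and a strict "exceeds" comparison for S103. All names and data are hypothetical.

```python
def similarity(a, b):
    """Assumed formula (1): 1 - #{(b - a) != 0} / #{(b + a) != 0}."""
    keys = set(a) | set(b)
    total = sum(1 for k in keys if a.get(k, 0) + b.get(k, 0) != 0)
    if total == 0:
        return 0.0
    changed = sum(1 for k in keys if b.get(k, 0) - a.get(k, 0) != 0)
    return 1.0 - changed / total


def monitor_batch(test_batch, history, threshold):
    """S102: similarity index (max rule); S103: compare with the threshold."""
    index = max((similarity(test_batch, h) for h in history.values()), default=0.0)
    return index, index > threshold


# S101 (assumed already done): batch to be tested plus historical batches.
history = {"B1": {"C1": 100, "C2": 40}, "B2": {"C3": 5}}
test_batch = {"C1": 100, "C2": 40, "C4": 7}
index, risky = monitor_batch(test_batch, history, threshold=0.5)
```

Here two of the three transactions in the batch to be tested already appear in B1, which gives an index of 2/3 and flags the batch.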
  • the repeated transaction risk monitoring device 5 according to this embodiment of the present invention will be described with reference to FIG. 5.
  • the device 5 shown in FIG. 5 is only an example, and should not bring any limitation to the function and application scope of the embodiment of the present invention.
  • the apparatus 5 may be in the form of a general computing device, including but not limited to: at least one processor 10, at least one memory 20, and a bus 60 connecting different device components.
  • the bus 60 includes a data bus, an address bus, and a control bus.
  • the memory 20 may include a volatile memory, such as a random access memory (RAM) 21 and/or a cache memory 22, and may further include a read-only memory (ROM) 23.
  • the memory 20 may also include a program module 24.
  • the program module 24 includes, but is not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
  • the apparatus 5 may also communicate with one or more external devices 2 (for example, a keyboard, a pointing device, a Bluetooth device, etc.), and may also communicate with one or more other devices. This communication can be performed through an input/output (I/O) interface 40 and displayed on the display unit 30.
  • the device 5 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through the network adapter 50. As shown in the figure, the network adapter 50 communicates with other modules in the device 5 through the bus 60.
  • Fig. 6 shows a computer-readable storage medium for executing the method described above.
  • various aspects of the present invention can also be implemented in the form of a computer-readable storage medium that includes program code; when the program code is executed by a processor, it causes the processor to execute the method described above.
  • the computer-readable storage medium may adopt any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above.
  • as shown in FIG. 6, a computer-readable storage medium 6 according to an embodiment of the present invention is described. It can take the form of a portable compact disk read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer.
  • the computer-readable storage medium of the present invention is not limited to this.
  • the readable storage medium can be any tangible medium that contains or stores a program, and the program can be used by or in combination with an instruction execution system, apparatus, or device.
  • the program code used to perform the operations of the present invention can be written in any combination of one or more programming languages.
  • the programming languages include object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computing device, partly executed on the user's device and partly executed on the remote computing device, or entirely executed on the remote computing device or server.
  • the remote computing device can be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (for example, via the Internet through an Internet service provider).

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Accounting & Taxation (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Finance (AREA)
  • Computer Security & Cryptography (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Debugging And Monitoring (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

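A repeated transaction risk monitoring method, device, and computer-readable storage medium. The method includes: obtaining a batch transaction message to be tested uploaded by the same monitoring object at a specified time, and historical transaction messages uploaded before the specified time (S101); determining, according to specified message content, a similarity index between the batch transaction message to be tested and the historical transaction messages (S102); and comparing the similarity index with a preset similarity threshold to judge whether the batch transaction message to be tested carries a risk of repeated transactions (S103). With this method, partially repeated transactions that may exist in batch transaction messages uploaded in different batches can be monitored, so that the risk of repeated transactions is flagged more sensitively and economic losses are avoided.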

Description

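Repeated transaction risk monitoring method, device, and computer-readable storage medium
Technical Field
The present invention belongs to the technical field of transaction processing, and specifically relates to a repeated transaction risk monitoring method, device, and computer-readable storage medium.
Background
This section is intended to provide background or context for the embodiments of the invention recited in the claims. The description here is not admitted to be prior art merely by its inclusion in this section.
In the financial field, "batch-to-real-time" conversion (批转实) is a common way of handling batch business; it generally refers to a technique that converts batch transactions awaiting processing into real-time transactions. For example, transaction messages are passed between the acquiring institution and the UnionPay system as batch files, while transaction messages are passed between the UnionPay system and the card-issuing institution as online messages. However, under abnormal conditions such as insufficient server resources, a stalled network environment, or server-side system jitter, transactions may be sent repeatedly, causing economic losses.
To address repeated transactions in such batch processing, the prior art typically compares the batch number of the currently received batch transaction message with those of previously received batch transaction messages, treats the first received batch number as authoritative, and discards batch transaction messages whose batch number repeats. However, this scheme uses only the batch number as the criterion and does not consider the specific transaction information; if repeated transactions appear in files of different batches, the existing scheme cannot identify them.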
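Summary of the Invention
In view of the difficulty in the prior art of monitoring partially repeated transactions present in batch transaction messages of different batches, a repeated transaction risk monitoring method, device, system, and computer-readable storage medium are proposed, with which the above problem can be solved.
The present invention provides the following solutions.
A repeated transaction risk monitoring method, including: obtaining a batch transaction message to be tested uploaded by the same monitoring object at a specified time, and historical transaction messages uploaded before the specified time; determining, according to specified message content, a similarity index between the batch transaction message to be tested and the historical transaction messages, where the specified message content includes at least two of the following: batch number, transaction account number, and transaction amount; and comparing the similarity index with a preset similarity threshold to judge whether the batch transaction message to be tested carries a risk of repeated transactions.
In some possible implementations, obtaining the batch transaction message to be tested uploaded by the same monitoring object at the specified time, and the historical transaction messages uploaded before the specified time, includes: receiving, at the specified time, the batch transaction message to be tested uploaded by the same monitoring object; and determining a first period from a preset duration and the specified time, and extracting the historical transaction messages uploaded by the same monitoring object within the first period.
In some possible implementations, determining the similarity index between the batch transaction message to be tested and the historical transaction messages includes: determining a similarity vector between the batch transaction message to be tested and the historical transaction messages using a preset similarity algorithm; and converting the similarity vector into the similarity index using a preset scoring rule.
In some possible implementations, determining the similarity vector between the batch transaction message to be tested and the historical transaction messages using the preset similarity algorithm includes: constructing a sparse matrix based on the batch transaction message to be tested and the historical transaction messages, where in the sparse matrix the value of each non-zero element is determined by the transaction amount, and the row label and column label of each element are respectively determined by the batch number and the transaction account number; and determining m similarity parameters between a first sparse vector and m second sparse vectors in the sparse matrix, and determining the similarity vector from the m similarity parameters; where the batch transaction message to be tested includes multiple transaction messages corresponding to a first batch number, and the row vector or column vector corresponding to the first batch number in the sparse matrix serves as the first sparse vector; the historical transaction messages include multiple transaction messages respectively corresponding to m second batch numbers, and the row vectors or column vectors respectively corresponding to the m second batch numbers in the sparse matrix serve as the m second sparse vectors; m is a positive integer.
In some possible implementations, the method further includes: determining the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix from the ratio and/or difference of #{(b_i − a) ≠ 0} and #{(b_i + a) ≠ 0}, where i = 1, 2, ..., m; b_i denotes the i-th second sparse vector among the m second sparse vectors, a denotes the first sparse vector, #{(b_i − a) ≠ 0} denotes the number of non-zero elements in the difference vector of the first sparse vector and the i-th second sparse vector, and #{(b_i + a) ≠ 0} denotes the number of non-zero elements in the sum vector of the first sparse vector and the i-th second sparse vector.
Specifically, the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix are determined by the following formula: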
Figure PCTCN2020087550-appb-000001
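In some possible implementations, the preset scoring rule includes: taking the maximum similarity parameter among the m similarity parameters as the similarity index.
In some possible implementations, the preset scoring rule further includes: judging whether the maximum similarity parameter among the m similarity parameters reaches a preset critical value; if the maximum similarity parameter reaches the preset critical value, taking the preset critical value as the similarity index; if the maximum similarity parameter does not reach the preset critical value, weighting the m similarity parameters respectively by m preset weight parameters to obtain m weighted similarity parameters, and taking the maximum of the m weighted similarity parameters as the similarity index.
In some possible implementations, the specified message content further includes the batch upload time, and the method further includes: for each of the m similarity parameters, determining the corresponding preset weight parameter from the difference between the two corresponding batch upload times.
In some possible implementations, the method further includes: determining the m preset weight parameters by the following formula, and weighting the m similarity parameters respectively to obtain the m weighted similarity parameters: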
Figure PCTCN2020087550-appb-000002
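Here, t_a is the batch upload time of the batch transaction message to be tested; S_i is the i-th similarity parameter among the m similarity parameters; t_i is the batch upload time of the i-th batch of historical data corresponding to the i-th similarity parameter; ω_i is the i-th preset weight parameter, among the m preset weight parameters, corresponding to the i-th similarity parameter; X_i is the i-th weighted similarity parameter, among the m weighted similarity parameters, corresponding to the i-th similarity parameter; T is the duration of the first period, which contains t_a and every t_i.
In some possible implementations, the method further includes: determining the m preset weight parameters from preset credit information and/or preset attribute information of the same monitoring object.
In some possible implementations, the method further includes: extracting historical transaction data uploaded by the same monitoring object before the specified time, and determining the similarity threshold according to the historical transaction data, where the historical transaction data was uploaded before the historical transaction messages.
In some possible implementations, the historical transaction data includes multiple pieces of transaction data respectively corresponding to n third batch numbers, each of the n third batch numbers has a corresponding repeated transaction risk label, and n is a positive integer greater than 1; and the method further includes: taking, in turn, the transaction data corresponding to each of the n third batch numbers as the batch data to be tested, and taking the transaction data in the historical transaction data other than the batch data to be tested as the remaining batch data; determining, according to the specified message content, a reference similarity index between the batch data to be tested and the remaining batch data, so as to obtain the reference similarity index corresponding to each third batch number; and establishing an ROC curve from the reference similarity index and the repeated transaction risk label corresponding to each third batch number, so as to determine the similarity threshold from the ROC curve.
In some possible implementations, before the ROC curve is established, the method further includes: removing each reference similarity index whose value is 0 or 1, together with its corresponding repeated transaction risk label.
In some possible implementations, the historical transaction data has a periodic correspondence with the upload time of the historical transaction messages.
In some possible implementations, the method further includes: before determining the similarity index between the batch transaction message to be tested and the historical transaction messages, comparing the batch numbers of the batch transaction message to be tested and the historical transaction messages; if one or more historical transaction messages have the same batch number as the batch transaction message to be tested, directly judging that the batch transaction message to be tested carries a risk of repeated transactions; if no historical transaction message has the same batch number as the batch transaction message to be tested, proceeding to determine the similarity index between the batch transaction message to be tested and the historical transaction messages.
In some possible implementations, the method further includes: if the batch transaction message to be tested is judged to carry a risk of repeated transactions, sending early warning information to the same monitoring object; and receiving confirmation information sent by the same monitoring object, and re-judging, according to the confirmation information, whether the batch transaction message to be tested carries a risk of repeated transactions.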
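A repeated transaction risk monitoring device, including: an obtaining module, used to obtain a batch transaction message to be tested uploaded by the same monitoring object at a specified time, and historical transaction messages uploaded before the specified time; a similarity module, used to determine, according to specified message content, a similarity index between the batch transaction message to be tested and the historical transaction messages, where the specified message content includes at least two of the following: batch number, transaction account number, and transaction amount; and a judging module, used to compare the similarity index with a preset similarity threshold to judge whether the batch transaction message to be tested carries a risk of repeated transactions.
In some possible implementations, the obtaining module includes: a receiving module, used to receive, at the specified time, the batch transaction message to be tested uploaded by the same monitoring object; and an extraction module, used to determine a first period from a preset duration and the specified time, and to extract the historical transaction messages uploaded by the same monitoring object within the first period.
In some possible implementations, the similarity module includes: a similarity measurement module, used to determine a similarity vector between the batch transaction message to be tested and the historical transaction messages using a preset similarity algorithm; and a similarity scoring module, used to convert the similarity vector into the similarity index using a preset scoring rule.
In some possible implementations, the similarity measurement module is used to: construct a sparse matrix based on the batch transaction message to be tested and the historical transaction messages, where in the sparse matrix the value of each non-zero element is determined by the transaction amount, and the row label and column label of each element are respectively determined by the batch number and the transaction account number; and determine m similarity parameters between a first sparse vector and m second sparse vectors in the sparse matrix, and determine the similarity vector from the m similarity parameters; where the batch transaction message to be tested includes multiple transaction messages corresponding to a first batch number, and the row vector or column vector corresponding to the first batch number in the sparse matrix serves as the first sparse vector; the historical transaction messages include multiple transaction messages respectively corresponding to m second batch numbers, and the row vectors or column vectors respectively corresponding to the m second batch numbers in the sparse matrix serve as the m second sparse vectors; m is a positive integer.
In some possible implementations, the similarity measurement module is further used to: determine the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix from the ratio and/or difference of #{(b_i − a) ≠ 0} and #{(b_i + a) ≠ 0}, where i = 1, 2, ..., m; b_i denotes the i-th second sparse vector among the m second sparse vectors, a denotes the first sparse vector, #{(b_i − a) ≠ 0} denotes the number of non-zero elements in the difference vector of the first sparse vector and the i-th second sparse vector, and #{(b_i + a) ≠ 0} denotes the number of non-zero elements in the sum vector of the first sparse vector and the i-th second sparse vector.
Specifically, the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix are determined by the following formula: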
Figure PCTCN2020087550-appb-000003
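In some possible implementations, the similarity scoring module is used to: take the maximum similarity parameter among the m similarity parameters as the similarity index.
In some possible implementations, the similarity scoring module is used to: judge whether the maximum similarity parameter among the m similarity parameters reaches a preset critical value; if the maximum similarity parameter reaches the preset critical value, take the preset critical value as the similarity index; if the maximum similarity parameter does not reach the preset critical value, weight the m similarity parameters respectively by m preset weight parameters to obtain m weighted similarity parameters, and take the maximum of the m weighted similarity parameters as the similarity index.
In some possible implementations, the specified message content further includes the batch upload time, and the similarity scoring module is further used to: for each of the m similarity parameters, determine the corresponding preset weight parameter from the difference between the two corresponding batch upload times.
In some possible implementations, the similarity scoring module is further used to: determine the m preset weight parameters by the following formula, and weight the m similarity parameters respectively to obtain the m weighted similarity parameters: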
Figure PCTCN2020087550-appb-000004
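Here, t_a is the batch upload time of the batch transaction message to be tested; S_i is the i-th similarity parameter among the m similarity parameters; t_i is the batch upload time of the i-th batch of historical data corresponding to the i-th similarity parameter; ω_i is the i-th preset weight parameter, among the m preset weight parameters, corresponding to the i-th similarity parameter; X_i is the i-th weighted similarity parameter, among the m weighted similarity parameters, corresponding to the i-th similarity parameter; T is the duration of the first period, which contains t_a and every t_i.
In some possible implementations, the similarity scoring module is further used to: determine the m preset weight parameters from preset credit information and/or preset attribute information of the same monitoring object.
In some possible implementations, the device further includes a similarity threshold module, specifically used to: extract historical transaction data uploaded by the same monitoring object before the specified time, and determine the similarity threshold according to the historical transaction data, where the historical transaction data was uploaded before the historical transaction messages.
In some possible implementations, the historical transaction data includes multiple pieces of transaction data respectively corresponding to n third batch numbers, each of the n third batch numbers has a corresponding repeated transaction risk label, and n is a positive integer greater than 1; and the similarity threshold module is further used to: take, in turn, the transaction data corresponding to each of the n third batch numbers as the batch data to be tested, and take the transaction data in the historical transaction data other than the batch data to be tested as the remaining batch data; determine, according to the specified message content, a reference similarity index between the batch data to be tested and the remaining batch data, so as to obtain the reference similarity index corresponding to each third batch number; and establish an ROC curve from the reference similarity index and the repeated transaction risk label corresponding to each third batch number, so as to determine the similarity threshold from the ROC curve.
In some possible implementations, before the ROC curve is established, the similarity threshold module is further used to: remove each reference similarity index whose value is 0 or 1, together with its corresponding repeated transaction risk label.
In some possible implementations, the historical transaction data has a periodic correspondence with the upload time of the historical transaction messages.
In some possible implementations, the device further includes a filtering module, used to: before the similarity index between the batch transaction message to be tested and the historical transaction messages is determined, compare the batch numbers of the batch transaction message to be tested and the historical transaction messages; if one or more historical transaction messages have the same batch number as the batch transaction message to be tested, directly judge that the batch transaction message to be tested carries a risk of repeated transactions; if no historical transaction message has the same batch number as the batch transaction message to be tested, proceed to determine the similarity index between the batch transaction message to be tested and the historical transaction messages.
In some possible implementations, the device further includes an early warning module, used to: send early warning information to the same monitoring object if the batch transaction message to be tested is judged to carry a risk of repeated transactions; and receive confirmation information sent by the same monitoring object, and re-judge, according to the confirmation information, whether the batch transaction message to be tested carries a risk of repeated transactions.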
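A repeated transaction risk monitoring system, including the monitoring device described above and at least one monitoring object.
A repeated transaction risk monitoring device, including: one or more multi-core processors; and a memory for storing one or more programs; when the one or more programs are executed by the one or more multi-core processors, the one or more multi-core processors are caused to implement: obtaining a batch transaction message to be tested uploaded by the same monitoring object at a specified time, and historical transaction messages uploaded before the specified time; determining, according to specified message content, a similarity index between the batch transaction message to be tested and the historical transaction messages, where the specified message content includes at least two of the following: batch number, transaction account number, and transaction amount; and comparing the similarity index with a preset similarity threshold to judge whether the batch transaction message to be tested carries a risk of repeated transactions.
A computer-readable storage medium storing a program which, when executed by a multi-core processor, causes the multi-core processor to execute the method described above.
At least one of the above technical solutions adopted in the embodiments of this application can achieve the following beneficial effects. In this embodiment, the similarity index between the batch transaction message to be tested, uploaded by the same monitoring object at the specified time, and the historical transaction messages uploaded within a period before the specified time is calculated and compared with the preset similarity threshold; partially repeated transactions that may exist in the batch transaction messages can thereby be monitored, the risk of repeated transactions is flagged more sensitively, and economic losses are avoided. Further, in calculating the similarity index, this application makes full use of the information in the transactions themselves to improve the credibility of the similarity index, simplifies the similarity calculation by using a sparse matrix and the difference and sum of sparse vectors, and improves the accuracy of the similarity index with a reasonable weighting scheme based on the transaction upload time. In setting the similarity threshold, this application uses a reasonable threshold derivation scheme based on the ROC curve to obtain a highly credible similarity threshold, which further ensures the accuracy of repeated transaction risk monitoring.
It should be understood that the above description is only an overview of the technical solutions of the present invention, provided so that the technical means of the present invention can be understood more clearly and implemented in accordance with the content of the specification. To make the above and other objectives, features, and advantages of the present invention more apparent and easier to understand, specific embodiments of the present invention are described below by way of example.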
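Brief Description of the Drawings
By reading the following detailed description of the exemplary embodiments, those of ordinary skill in the art will understand the advantages and benefits described herein as well as other advantages and benefits. The drawings serve only to illustrate exemplary embodiments and are not to be regarded as limiting the present invention. The same reference numerals denote the same parts throughout the drawings. In the drawings:
FIG. 1 is a flowchart of a duplicate-transaction risk monitoring method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a duplicate-transaction risk monitoring method according to another embodiment of the present invention;
FIG. 3 is a schematic diagram of an ROC curve according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a duplicate-transaction risk monitoring apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a duplicate-transaction risk monitoring apparatus according to yet another embodiment of the present invention;
FIG. 6 is a schematic diagram of a computer-readable storage medium according to an embodiment of the present invention.
In the drawings, the same or corresponding reference numerals denote the same or corresponding parts.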
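Detailed Description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments, it should be understood that the disclosure can be implemented in various forms and is not limited to the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.
In the present invention, terms such as "include" or "have" are intended to indicate the presence of the features, numbers, steps, acts, components, parts, or combinations thereof disclosed in this specification, and are not intended to exclude the possibility that one or more other features, numbers, steps, acts, components, parts, or combinations thereof exist.
It should also be noted that, where no conflict arises, the embodiments of the present invention and the features in them may be combined with one another. The invention is described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 schematically shows a flowchart of a duplicate-transaction risk monitoring method 100 according to an embodiment of the present invention. Preferably, but not necessarily, the method shown in FIG. 1 may be executed on a cloud server, a server cluster, or a back-end transaction processing system; more specifically, it may be executed by a dedicated module in the UnionPay system. In this embodiment the cloud server is taken as the executing entity for the purposes of description; it should be understood, however, that this application places no specific limitation on the executing entity.
As shown in FIG. 1, the method 100 includes:
Step S101: obtain a batch transaction message under test uploaded by a monitored object at a specified time, and historical transaction messages uploaded by the same monitored object before the specified time;
Here, the monitored object is the merchant or terminal that actually transacts with the cardholder, and the batch transaction message under test and the historical transaction messages may be transaction messages produced by various types of transactions, including credit transactions. After a transaction takes place at the monitored object, its message is not uploaded to the cloud server in real time; instead, the messages generated over a period of time are packaged and uploaded to the cloud server in batches on a schedule. In this embodiment, the same monitored object uploads batch transaction messages at the specified time and at several time points before it. The batch uploaded at the specified time, which is usually the most recent or current time, is designated the "batch transaction message under test"; the transaction messages uploaded by the same monitored object during a period before the specified time are designated the "historical transaction messages" and serve as background data for the duplicate-transaction risk analysis.
In some possible implementations, step S101 may further include: receiving, at the specified time, the batch transaction message under test uploaded by the monitored object; and determining a first time period from a preset duration and the specified time, and extracting the historical transaction messages uploaded by the same monitored object within the first time period.
For example, after receiving the batch transaction message under test, and in order to determine whether it carries a duplicate-transaction risk, the cloud server extracts from its database the other batches of transaction messages that the same monitored object uploaded during the preceding day, hour, or ten minutes, as background data for the analysis. It should be understood that the transaction messages stored in the database may be in batch or non-batch format; this application places no specific limitation on this. By analyzing the currently uploaded batch against the same monitored object's own transaction messages from a preceding period, this application can determine in real time, and fairly accurately, whether the current batch carries a risk of duplicate transactions.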
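The window selection in step S101 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Batch` record, the `select_history` helper, and the half-open window [specified time - preset duration, specified time) are all assumptions introduced here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Batch:
    """Hypothetical record for one uploaded batch of a monitored object."""
    batch_no: str
    uploaded_at: datetime
    transactions: tuple  # (account_number, amount) pairs


def select_history(stored, specified_time, preset_duration):
    """Return the batches uploaded inside the first time period.

    The period is assumed to be [specified_time - preset_duration, specified_time).
    """
    start = specified_time - preset_duration
    return [b for b in stored if start <= b.uploaded_at < specified_time]


now = datetime(2020, 4, 28, 12, 0)
stored = [
    Batch("N1", now - timedelta(minutes=50), (("C1", 100.0),)),
    Batch("N2", now - timedelta(minutes=10), (("C2", 35.5),)),
    Batch("N0", now - timedelta(hours=3), (("C1", 100.0),)),  # outside the window
]
history = select_history(stored, now, timedelta(hours=1))
print([b.batch_no for b in history])  # ['N1', 'N2']
```

Choosing the preset duration (a day, an hour, ten minutes) trades recall of older duplicates against the amount of background data that has to be compared.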
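As shown in FIG. 1, the method 100 further includes:
Step S102: determine, according to specified message content, a similarity index between the batch transaction message under test and the historical transaction messages;
Specifically, the specified message content includes at least two of: batch number, transaction account number, and transaction amount. The similarity index can be computed in several ways. For example, each transaction message in the batch under test and in the historical transaction messages may be converted into a multi-dimensional feature vector based on the specified message content, a deep learning model may be trained on the historical transaction messages, and the batch under test may be fed to that model to output the similarity index. Alternatively, the similarity index may be obtained by computing a cosine distance, a Euclidean distance, or the like. This application places no specific limitation on this.
In this embodiment, the duplicate-transaction risk analysis requires no additional transaction data: the batch number, transaction account number, transaction amount, and similar information mentioned above are all part of the transaction messages themselves. Optionally, the specified message content may also include the batch upload time, transaction category, transaction currency, type of goods traded, and so on; this application places no specific limitation on this.
In some possible implementations, before step S102 the method 100 may further include: comparing the batch number of the batch transaction message under test with those of the historical transaction messages. If one or more historical transaction messages have the same batch number as the batch under test, the batch under test is directly determined to carry a duplicate-transaction risk; if none does, step S102 is then executed.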
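The batch-number pre-check can be sketched as a short-circuit in front of the similarity computation. The function name `screen_batch` and the `compute_index` callback are hypothetical; the callback stands in for step S102.

```python
def screen_batch(test_batch_no, history_batch_nos, compute_index, threshold):
    """Return True if the batch under test is judged to carry duplicate risk.

    A repeated batch number is treated as a duplicate immediately; only
    otherwise is the (more expensive) similarity index computed and compared.
    """
    if test_batch_no in set(history_batch_nos):
        return True
    return compute_index() > threshold


calls = []


def fake_index():
    # Stand-in for step S102; records that it was invoked.
    calls.append(1)
    return 0.42


print(screen_batch("N7", ["N5", "N6", "N7"], fake_index, 0.7))  # True (S102 skipped)
print(screen_batch("N8", ["N5", "N6", "N7"], fake_index, 0.7))  # False (0.42 <= 0.7)
```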
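In some possible implementations, as shown in FIG. 2, step S102 may further include:
Step S201: determine, using a preset similarity algorithm, a similarity vector between the batch transaction message under test and the historical transaction messages;
In some possible implementations, step S201 may further include: building a sparse matrix from the batch transaction message under test and the historical transaction messages, in which the value of each non-zero element is determined by the transaction amount, and the row label and column label of each element are determined by the batch number and the transaction account number respectively; and determining m similarity parameters between a first sparse vector and m second sparse vectors in the sparse matrix, and determining the similarity vector from the m similarity parameters.
Specifically, the batch transaction message under test includes multiple transaction messages corresponding to a first batch number, and the row vector or column vector of the sparse matrix corresponding to the first batch number serves as the first sparse vector; the historical transaction messages include multiple transaction messages corresponding respectively to m second batch numbers, and the row vectors or column vectors of the sparse matrix corresponding to the m second batch numbers serve as the m second sparse vectors, where m is a positive integer.
That the row label and column label of each element are determined by the batch number and the transaction account number respectively may mean that each row of the sparse matrix corresponds to one batch number and each column to one transaction account number, or that each row corresponds to one transaction account number and each column to one batch number.
The batch transaction message under test may be a transaction package generated by the monitored object according to preset rules, which the cloud server parses on receipt to obtain the individual transaction messages. For any batch, the batch number and the batch upload time are shared by all transaction messages in the batch, whereas the transaction account number and the transaction amount are specific to each transaction message.
The following description takes the batch number as the row label and the transaction account number as the column label:
For example, every transaction message contained in the batch under test and in the historical transaction messages is arranged with the batch number as row label and the transaction account number as column label, forming the sparse matrix shown below. Each row corresponds to one batch number and each column to one transaction account number. If an account has a transaction record in a batch, the element at that position takes the amount of that transaction; if it has none, the element is zero. Practical experience suggests that each row and each column of this matrix will contain a certain number of non-zero elements (actual data) and a large number of zero elements (which carry no data and are not stored).
Specifically, in the sparse matrix below, row label N_a corresponds to the first batch number, and row labels N_1 to N_m correspond to the m second batch numbers; column labels C_1 to C_n correspond to every transaction account number involved in the transaction messages of the batch under test and of the historical transaction messages; V_mn is the transaction amount in the historical transaction messages for the m-th second batch number and account C_n; V_an is the transaction amount in the batch under test for account C_n; and so on.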
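$$
\begin{array}{c|cccc}
 & C_1 & C_2 & \cdots & C_n \\ \hline
N_a & V_{a1} & V_{a2} & \cdots & V_{an} \\
N_1 & V_{11} & V_{12} & \cdots & V_{1n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
N_m & V_{m1} & V_{m2} & \cdots & V_{mn}
\end{array}
$$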
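In this sparse matrix, the first sparse vector a, i.e. the row vector corresponding to the first batch number, is:
a = (V_a1, V_a2, …, V_an)
The m second sparse vectors b_i, i = 1, 2, …, m, i.e. the m row vectors corresponding to the m second batch numbers, are:
b_i = (V_i1, V_i2, …, V_in), where i = 1, 2, …, m
Further, the m similarity parameters S_i (i = 1, 2, …, m) between the first sparse vector a and the m second sparse vectors b_i are computed, giving the similarity vector (S_1, S_2, …, S_m). Building the sparse matrix reduces the similarity computation between the batch transaction message under test and the historical transaction messages to a simpler computation of similarity between vectors.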
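A sparse matrix of this kind can be sketched with nested dictionaries, storing only the non-zero elements. The record layout `(batch_no, account, amount)` and the helper name are assumptions for illustration; summing several amounts for the same account within one batch is also an assumption, since the text describes one amount per position.

```python
from collections import defaultdict


def build_sparse_matrix(records):
    """Map batch number -> {account number: amount}, storing non-zero entries only.

    records: iterable of (batch_no, account, amount) tuples.
    """
    matrix = defaultdict(dict)
    for batch_no, account, amount in records:
        row = matrix[batch_no]
        # Assumption: repeated accounts within a batch are accumulated.
        row[account] = row.get(account, 0.0) + amount
    return dict(matrix)


records = [
    ("Na", "C1", 100.0), ("Na", "C3", 20.0),   # batch under test
    ("N1", "C1", 100.0), ("N1", "C2", 50.0),   # historical batch 1
    ("N2", "C4", 75.0),                        # historical batch 2
]
matrix = build_sparse_matrix(records)
a = matrix["Na"]                                # first sparse vector
b = [matrix["N1"], matrix["N2"]]                # m = 2 second sparse vectors
print(a)  # {'C1': 100.0, 'C3': 20.0}
```

Zero elements never appear as keys, which matches the remark that zero elements are not stored.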
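In some possible implementations, the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix may further be determined from the ratio and/or difference of #{(b_i - a) ≠ 0} and #{(b_i + a) ≠ 0}, where i = 1, 2, …, m.
For example, the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix may be determined by formula (1) below: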
$$
S_i = 1 - \frac{\#\{(b_i - a) \neq 0\}}{\#\{(b_i + a) \neq 0\}}, \qquad i = 1, 2, \ldots, m \tag{1}
$$
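In formula (1), b_i denotes the i-th of the m second sparse vectors; a denotes the first sparse vector; #{(b_i - a) ≠ 0} is the number of non-zero elements in the difference of the first sparse vector and the i-th second sparse vector; #{(b_i + a) ≠ 0} is the number of non-zero elements in the sum of the first sparse vector and the i-th second sparse vector; S_i is the i-th similarity parameter, between the i-th second sparse vector and the first sparse vector; and m, a positive integer, is the number of second sparse vectors and of similarity parameters.
Take a = (V_a1, V_a2, …, V_an) and b_1 = (V_11, V_12, …, V_1n) as an example.
(b_1 - a) = (V_11 - V_a1, V_12 - V_a2, …, V_1n - V_an);
(b_1 + a) = (V_11 + V_a1, V_12 + V_a2, …, V_1n + V_an);
Taking transaction account C_n as an example, if that account has a duplicated transaction in the two batches N_1 and N_a, then V_1n - V_an = 0 and V_1n + V_an ≠ 0. In other words, such a duplicated transaction is counted in #{(b_1 + a) ≠ 0} but not in #{(b_1 - a) ≠ 0}. It follows that the larger S_i is, the higher the risk that the two corresponding batches are duplicated.
Formula (1) is therefore highly sensitive to duplicate-transaction risk and identifies duplicated transactions well using nothing more than simple counting. The similarity parameter S_i lies in [0, 1]: it is 1 when the two batches are identical and 0 when they are completely different.
Optionally, the m similarity parameters between the first sparse vector a and the m second sparse vectors b_i may also be determined in other ways, for example by computing a Euclidean distance or a cosine distance; this application places no specific limitation on this.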
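The counting-based similarity parameter can be sketched as below. The sketch assumes formula (1) has the form S_i = 1 - #{(b_i - a) ≠ 0} / #{(b_i + a) ≠ 0}, which is consistent with the stated range [0, 1] and with the two end cases (identical batches give 1, completely different batches give 0). Sparse vectors are represented as `{account: amount}` dictionaries, and returning 0.0 for two empty vectors is a convention chosen here.

```python
def similarity(a, b):
    """Similarity parameter between two sparse vectors given as {account: amount}.

    Assumed form: 1 - (#non-zero entries of b - a) / (#non-zero entries of b + a).
    """
    accounts = set(a) | set(b)
    diff_nonzero = sum(1 for c in accounts if b.get(c, 0.0) - a.get(c, 0.0) != 0)
    sum_nonzero = sum(1 for c in accounts if b.get(c, 0.0) + a.get(c, 0.0) != 0)
    if sum_nonzero == 0:
        return 0.0  # convention for two empty vectors (not specified in the text)
    return 1.0 - diff_nonzero / sum_nonzero


a = {"C1": 100.0, "C2": 50.0, "C3": 20.0}
b1 = {"C1": 100.0, "C2": 50.0, "C3": 20.0}   # identical batch
b2 = {"C1": 100.0, "C4": 30.0}               # one repeated transaction
b3 = {"C5": 10.0}                            # nothing in common

similarity_vector = [similarity(a, b) for b in (b1, b2, b3)]
print(similarity_vector)  # [1.0, 0.25, 0.0]
```

For b2, the union of accounts has four entries; three of them differ (C2, C3, C4) and all four have a non-zero sum, giving 1 - 3/4 = 0.25. Amounts are compared exactly, so in practice they would be better held as integers (for example in cents) than as floats.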
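As shown in FIG. 2, after step S201, step S102 may further include:
Step S202: convert the similarity vector into the similarity index using a preset scoring rule.
In some possible implementations, the preset scoring rule in step S202 may include: taking the largest of the m similarity parameters as the similarity index.
In some possible implementations, the preset scoring rule in step S202 may further include: determining whether the largest of the m similarity parameters reaches a preset critical value; if it does, taking the preset critical value as the similarity index; if it does not, weighting the m similarity parameters by m preset weight parameters respectively to obtain m weighted similarity parameters, and taking the largest weighted similarity parameter as the similarity index.
For example, the similarity parameter S_i obtained from formula (1) lies in [0, 1], so 1 can be used as the preset critical value. If the largest similarity parameter reaches 1, two batches are identical, and the two batches corresponding to that parameter can generally be regarded as duplicates. If the largest similarity parameter is less than 1, a further judgment is made using the preset weight parameters, which may be determined by factors such as the batch upload time.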
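The scoring rule with a critical value can be sketched as follows. The function name is hypothetical, and the weighting is assumed to be a plain product X_i = ω_i · S_i; the text only says the parameters are "weighted".

```python
def similarity_index(similarities, weights, critical_value=1.0):
    """Collapse a similarity vector into a single similarity index.

    If the largest parameter reaches the critical value, the index is the
    critical value; otherwise it is the largest weighted parameter.
    """
    if not similarities:
        return 0.0  # convention for an empty history (not specified in the text)
    if max(similarities) >= critical_value:
        return critical_value
    # Assumption: weighting means multiplying each S_i by its weight.
    return max(w * s for w, s in zip(weights, similarities))


print(similarity_index([0.3, 1.0], [0.2, 0.1]))               # 1.0 (identical batch found)
print(similarity_index([0.25, 0.8, 0.5], [1.0, 0.5, 0.9]))    # 0.45
```

In the second call the raw maximum is 0.8, but its weight of 0.5 reduces it to 0.4, so the third batch (0.5 × 0.9 = 0.45) determines the index.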
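In some possible implementations, the specified message content further includes the batch upload time, and step S202 further includes: for each of the m similarity parameters, determining the corresponding preset weight parameter from the difference between the two corresponding batch upload times. Because two batches uploaded closer together in time are more likely to be duplicates, determining the preset weight parameters from the difference in batch upload times yields a more accurate similarity index.
For example, formula (2) may be used to determine the m preset weight parameters (ω_1, ω_2, …, ω_m), which are then used to weight the m similarity parameters (S_1, S_2, …, S_m) respectively, giving the m weighted similarity parameters (X_1, X_2, …, X_m).
Formula (2) is: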
Figure PCTCN2020087550-appb-000007
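In formula (2), t_a is the batch upload time of the batch transaction message under test; S_i is the i-th of the m similarity parameters; t_i is the batch upload time of the i-th batch of historical data, corresponding to the i-th similarity parameter; ω_i is the i-th of the m preset weight parameters, corresponding to the i-th similarity parameter; X_i is the i-th of the m weighted similarity parameters, corresponding to the i-th similarity parameter; T is the duration of the first time period; and m, a positive integer, is the number of similarity parameters.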
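Formula (2) is available here only as an image, so its exact form cannot be reproduced. The sketch below is therefore purely illustrative: it assumes a hypothetical linear decay ω_i = 1 - |t_a - t_i| / T and a product X_i = ω_i · S_i, chosen only because they use the same symbols and give closer batches a larger weight. It is not the patent's formula.

```python
def time_weight(t_a, t_i, T):
    """Hypothetical weight: 1 for simultaneous uploads, falling linearly to 0 at a gap of T."""
    gap = abs(t_a - t_i)
    return max(0.0, 1.0 - gap / T)


def weighted_similarities(t_a, batch_times, similarities, T):
    """Return X_i = w_i * S_i for each historical batch (assumed weighting)."""
    return [time_weight(t_a, t_i, T) * s
            for t_i, s in zip(batch_times, similarities)]


# Times in seconds; T is the length of the first time period (one hour here).
t_a, T = 3600.0, 3600.0
batch_times = [2700.0, 900.0]      # uploaded 15 and 45 minutes before t_a
similarities = [0.5, 0.8]

X = weighted_similarities(t_a, batch_times, similarities, T)
print(X)  # [0.375, 0.2]
```

With these assumed weights (0.75 and 0.25), the more recent batch dominates even though its raw similarity is lower.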
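In some possible implementations, the m preset weight parameters may also be determined from preset credit information and/or preset attribute information of the same monitored object. Optionally, the preset credit information is, for example, the monitored object's bank credit score.
As shown in FIG. 1, the method 100 further includes:
Step S103: compare the similarity index with a preset similarity threshold to determine whether the batch transaction message under test carries a duplicate-transaction risk.
Specifically, the duplicate-transaction risk indicates that the batch transaction message under test contains one or more transaction messages that duplicate historical transactions.
For example, a similarity index can be obtained for every batch transaction message under test uploaded by the monitored object and compared with the preset similarity threshold. If the similarity index exceeds the threshold, the batch is determined to carry a duplicate-transaction risk and appropriate warning measures may be taken; if it does not, the batch is determined to be normal.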
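In some possible implementations, the method 100 further includes: extracting historical transaction data uploaded by the same monitored object before the specified time, and determining the similarity threshold from the historical transaction data, where the historical transaction data was uploaded before the historical transaction messages. A threshold derived from the monitored object's own historical transaction data is more adaptive and more reliable. Optionally, the similarity threshold may instead be obtained from empirical or experimental values.
In some possible implementations, the upload times of the historical transaction data and of the historical transaction messages have a periodic correspondence. For example, the two may have been uploaded by the same monitored object during the same time slot in two adjacent weeks or on two adjacent days.
In some possible implementations, the historical transaction data includes multiple pieces of transaction data corresponding respectively to n third batch numbers, each of the n third batch numbers carries a duplicate-transaction risk label, and n is a positive integer greater than 1.
Further, determining the similarity threshold from the historical transaction data may specifically include:
(1) taking, in turn, the transaction data corresponding to each of the n third batch numbers as batch data under test, and taking the rest of the historical transaction data as remaining batch data;
(2) determining, according to the specified message content, a reference similarity index between the batch data under test and the remaining batch data, thereby obtaining a reference similarity index for each third batch number;
(3) building an ROC curve from the reference similarity index and duplicate-transaction risk label of each third batch number, and determining the similarity threshold from the ROC curve.
For example, the historical transaction data may be divided into R_1 to R_5, corresponding to five third batch numbers. R_1 is selected as the batch data under test and the remaining R_2 to R_5 as the remaining batch data, and the similarity index between them is computed as the reference similarity index for R_1. The computation is the same as, or similar to, the steps described above for the similarity index between the batch transaction message under test and the historical transaction messages, and is not repeated here. Proceeding in the same way gives the reference similarity indices for all five batches R_1 to R_5.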
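Steps (1) and (2) amount to a leave-one-out loop over the labelled batches, sketched below. The sketch assumes the counting-based similarity parameter discussed earlier (in the form 1 - #{diff ≠ 0} / #{sum ≠ 0}) and the simplest scoring rule (the maximum); the sample batches are hypothetical.

```python
def similarity(a, b):
    """Assumed counting-based similarity between two {account: amount} vectors."""
    accounts = set(a) | set(b)
    diff_nonzero = sum(1 for c in accounts if b.get(c, 0.0) - a.get(c, 0.0) != 0)
    sum_nonzero = sum(1 for c in accounts if b.get(c, 0.0) + a.get(c, 0.0) != 0)
    return 0.0 if sum_nonzero == 0 else 1.0 - diff_nonzero / sum_nonzero


def reference_indices(batches):
    """For each batch, score it against all other batches (leave-one-out).

    batches: {third batch number: {account: amount}}.
    Scoring rule assumed here: the largest similarity parameter.
    """
    result = {}
    for held_out, a in batches.items():
        rest = [b for no, b in batches.items() if no != held_out]
        result[held_out] = max((similarity(a, b) for b in rest), default=0.0)
    return result


batches = {  # hypothetical historical transaction data
    "R1": {"C1": 10.0, "C2": 20.0},
    "R2": {"C1": 10.0, "C3": 7.0},
    "R3": {"C4": 5.0},
}
refs = reference_indices(batches)
print({k: round(v, 3) for k, v in refs.items()})  # {'R1': 0.333, 'R2': 0.333, 'R3': 0.0}
```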
The construction of the ROC curve in step (3) is described in detail below with reference to Table 1.
Table 1:
Figure PCTCN2020087550-appb-000008
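In the table above, R_1 to R_5 denote the third batch numbers of the batches of transaction data. The duplicate-transaction risk label of R_1 is 0 (not a duplicate) and its reference similarity index is 0.3; the label of R_3 is 1 (duplicate) and its reference similarity index is 0.9; and so on. Each of the reference similarity indices of R_1 to R_5 is used in turn as a candidate threshold to assess precision and recall, and each batch is classified as TP, FP, TN, or FN: if the reference similarity index ≥ threshold and the label = 1, the batch is TP; if the index ≥ threshold and the label = 0, FP; if the index < threshold and the label = 1, FN; if the index < threshold and the label = 0, TN. The true positive rate TPR and false positive rate FPR are then computed for each threshold, where TPR = TP/(TP + FN) and FPR = FP/(FP + TN). Referring to FIG. 3, with FPR on the horizontal axis and TPR on the vertical axis, the (FPR, TPR) pairs of the thresholds give the ROC curve, and the threshold 0.7, which corresponds to the point on the curve closest to the top-left corner (0, 1), is selected as the similarity threshold.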
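The threshold selection can be sketched as follows. Table 1 is available here only as an image, so the data below are hypothetical: only the pairs (0.3, label 0) for R_1 and (0.9, label 1) for R_3 come from the text, and the remaining values were chosen for illustration so that the selected threshold comes out at 0.7 as in the description.

```python
import math


def roc_points(scores, labels):
    """Return (threshold, FPR, TPR) for every distinct score used as a threshold."""
    points = []
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 0)
        tpr = tp / (tp + fn) if tp + fn else 0.0
        fpr = fp / (fp + tn) if fp + tn else 0.0
        points.append((thr, fpr, tpr))
    return points


def pick_threshold(scores, labels):
    """Choose the threshold whose ROC point is closest to the corner (FPR=0, TPR=1)."""
    best = min(roc_points(scores, labels),
               key=lambda p: math.hypot(p[1], 1.0 - p[2]))
    return best[0]


# R1..R5: reference similarity indices and duplicate-risk labels (mostly hypothetical).
scores = [0.3, 0.7, 0.9, 0.5, 0.8]
labels = [0, 1, 1, 0, 1]
print(pick_threshold(scores, labels))  # 0.7
```

At threshold 0.7 every labelled duplicate scores at or above the threshold and no normal batch does, so the ROC point is exactly (0, 1).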
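In some possible implementations, before the ROC curve is built, the method may further include: removing reference similarity indices equal to 0 or 1, together with their duplicate-transaction risk labels. This avoids bias in the choice of threshold.
In some possible implementations, the method 100 may further include: if the batch transaction message under test is determined to carry a duplicate-transaction risk, sending warning information to the same monitored object; and receiving confirmation information from the same monitored object and re-determining, according to the confirmation information, whether the batch transaction message under test carries a duplicate-transaction risk. For example, when the similarity index exceeds the similarity threshold, warning information is fed back to the monitored object, and if the similarity index reaches the preset critical value, a stronger warning is fed back, alerting the monitored object to the possible risk of duplicate transactions and avoiding financial loss.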
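The two warning levels described in the example can be sketched as a small mapping. The level names and the function are hypothetical; the text only distinguishes an ordinary warning from a stronger one.

```python
def alert_level(index, threshold, critical_value=1.0):
    """Map a similarity index to a warning level for the monitored object."""
    if index >= critical_value:
        return "strong"    # an identical batch was found
    if index > threshold:
        return "warning"   # partial duplication suspected
    return "none"


for idx in (0.45, 0.82, 1.0):
    print(idx, alert_level(idx, threshold=0.7))
# 0.45 none
# 0.82 warning
# 1.0 strong
```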
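In this embodiment, a similarity index is computed between the batch transaction message under test, uploaded by a monitored object at a specified time, and the historical transaction messages uploaded by the same object during a period before that time, and the index is compared with a preset similarity threshold. This makes it possible to detect partially duplicated transactions within a batch, to flag duplicate-transaction risk more sensitively, and to avoid financial loss. Further, in computing the similarity index, this application makes full use of the information carried by the transactions themselves to improve the credibility of the index, simplifies the computation by using a sparse matrix and the differences and sums of sparse vectors, and improves the accuracy of the index by weighting according to upload time. In setting the similarity threshold, this application uses an ROC curve to obtain a more credible threshold, which further improves the accuracy of duplicate-transaction risk monitoring.
Based on the same technical concept, an embodiment of the present invention further provides a duplicate-transaction risk monitoring apparatus for executing the duplicate-transaction risk monitoring method provided by any of the above embodiments. FIG. 4 is a schematic structural diagram of a duplicate-transaction risk monitoring apparatus provided by an embodiment of the present invention.
As shown in FIG. 4, the duplicate-transaction risk monitoring apparatus 40 includes:
an obtaining module 401, configured to obtain a batch transaction message under test uploaded by a monitored object at a specified time, and historical transaction messages uploaded by the same monitored object before the specified time;
a similarity module 402, configured to determine, according to specified message content, a similarity index between the batch transaction message under test and the historical transaction messages, where the specified message content includes at least two of: batch number, transaction account number, and transaction amount;
a determining module 403, configured to compare the similarity index with a preset similarity threshold to determine whether the batch transaction message under test carries a duplicate-transaction risk.
In some possible implementations, the obtaining module 401 includes: a receiving module, configured to receive, at the specified time, the batch transaction message under test uploaded by the same monitored object; and an extraction module, configured to determine a first time period from a preset duration and the specified time, and to extract the historical transaction messages uploaded by the same monitored object within the first time period.
In some possible implementations, the similarity module 402 includes: a similarity calculation module, configured to determine, using a preset similarity algorithm, a similarity vector between the batch transaction message under test and the historical transaction messages; and a similarity scoring module, configured to convert the similarity vector into the similarity index using a preset scoring rule.
In some possible implementations, the similarity calculation module is configured to: build a sparse matrix from the batch transaction message under test and the historical transaction messages, in which the value of each non-zero element is determined by the transaction amount, and the row label and column label of each element are determined by the batch number and the transaction account number respectively; and determine m similarity parameters between a first sparse vector and m second sparse vectors in the sparse matrix, and determine the similarity vector from the m similarity parameters; where the batch transaction message under test includes multiple transaction messages corresponding to a first batch number, and the row vector or column vector of the sparse matrix corresponding to the first batch number serves as the first sparse vector; and the historical transaction messages include multiple transaction messages corresponding respectively to m second batch numbers, and the row vectors or column vectors of the sparse matrix corresponding to the m second batch numbers serve as the m second sparse vectors, m being a positive integer.
In some possible implementations, the similarity calculation module is further configured to: determine the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix from the ratio and/or difference of #{(b_i - a) ≠ 0} and #{(b_i + a) ≠ 0}, where i = 1, 2, …, m; b_i denotes the i-th of the m second sparse vectors; a denotes the first sparse vector; #{(b_i - a) ≠ 0} is the number of non-zero elements in the difference of the first sparse vector and the i-th second sparse vector; and #{(b_i + a) ≠ 0} is the number of non-zero elements in the sum of the first sparse vector and the i-th second sparse vector.
Specifically, the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix are determined by the following formula: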
Figure PCTCN2020087550-appb-000009
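In some possible implementations, the similarity scoring module is configured to: take the largest of the m similarity parameters as the similarity index.
In some possible implementations, the similarity scoring module is configured to: determine whether the largest of the m similarity parameters reaches a preset critical value; if it does, take the preset critical value as the similarity index; if it does not, weight the m similarity parameters by m preset weight parameters respectively to obtain m weighted similarity parameters, and take the largest weighted similarity parameter as the similarity index.
In some possible implementations, the specified message content further includes the batch upload time, and the similarity scoring module is further configured to: for each of the m similarity parameters, determine the corresponding preset weight parameter from the difference between the two corresponding batch upload times.
In some possible implementations, the similarity scoring module is further configured to: determine the m preset weight parameters by the following formula, and weight the m similarity parameters respectively to obtain the m weighted similarity parameters: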
Figure PCTCN2020087550-appb-000010
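where t_a is the batch upload time of the batch transaction message under test; S_i is the i-th of the m similarity parameters; t_i is the batch upload time of the i-th batch of historical data, corresponding to the i-th similarity parameter; ω_i is the i-th of the m preset weight parameters, corresponding to the i-th similarity parameter; X_i is the i-th of the m weighted similarity parameters, corresponding to the i-th similarity parameter; and T is the duration of the first time period, which contains t_a and every t_i.
In some possible implementations, the similarity scoring module is further configured to: determine the m preset weight parameters from preset credit information and/or preset attribute information of the same monitored object.
In some possible implementations, the apparatus 40 further includes a similarity threshold module, configured to: extract historical transaction data uploaded by the same monitored object before the specified time, and determine the similarity threshold from the historical transaction data, where the historical transaction data was uploaded before the historical transaction messages.
In some possible implementations, the historical transaction data includes multiple pieces of transaction data corresponding respectively to n third batch numbers, each of the n third batch numbers carries a duplicate-transaction risk label, and n is a positive integer greater than 1; and the similarity threshold module is further configured to: take, in turn, the transaction data corresponding to each of the n third batch numbers as batch data under test, and take the rest of the historical transaction data as remaining batch data; determine, according to the specified message content, a reference similarity index between the batch data under test and the remaining batch data, thereby obtaining a reference similarity index for each third batch number; and build an ROC curve from the reference similarity index and duplicate-transaction risk label of each third batch number, and determine the similarity threshold from the ROC curve.
In some possible implementations, before the ROC curve is built, the similarity threshold module is further configured to: remove reference similarity indices equal to 0 or 1, together with their duplicate-transaction risk labels.
In some possible implementations, the upload times of the historical transaction data and of the historical transaction messages have a periodic correspondence.
In some possible implementations, the apparatus 40 further includes a filtering module, configured to: before the similarity index between the batch transaction message under test and the historical transaction messages is determined, compare the batch number of the batch under test with those of the historical transaction messages; if one or more historical transaction messages have the same batch number as the batch under test, directly determine that the batch under test carries a duplicate-transaction risk; and if none does, have the similarity module determine the similarity index between the batch transaction message under test and the historical transaction messages.
In some possible implementations, the apparatus 40 further includes a warning module, configured to: if the batch transaction message under test is determined to carry a duplicate-transaction risk, send warning information to the same monitored object; and receive confirmation information from the same monitored object and re-determine, according to the confirmation information, whether the batch transaction message under test carries a duplicate-transaction risk.
In this embodiment, a similarity index is computed between the batch transaction message under test, uploaded by a monitored object at a specified time, and the historical transaction messages uploaded by the same object during a period before that time, and the index is compared with a preset similarity threshold. This makes it possible to detect partially duplicated transactions within a batch, to flag duplicate-transaction risk more sensitively, and to avoid financial loss. Further, in computing the similarity index, this application makes full use of the information carried by the transactions themselves to improve the credibility of the index, simplifies the computation by using a sparse matrix and the differences and sums of sparse vectors, and improves the accuracy of the index by weighting according to upload time. In setting the similarity threshold, this application uses an ROC curve to obtain a more credible threshold, which further improves the accuracy of duplicate-transaction risk monitoring.
Based on the same technical concept, an embodiment of the present invention further provides a duplicate-transaction risk monitoring system, including the monitoring apparatus described above and at least one monitored object.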
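Those skilled in the art will understand that aspects of the present invention may be implemented as a device, a method, or a computer-readable storage medium. Accordingly, aspects of the present invention may take the form of an entirely hardware implementation, an entirely software implementation (including firmware, microcode, and the like), or an implementation combining hardware and software, which may be referred to collectively here as a "circuit", "module", or "device".
In some possible implementations, a duplicate-transaction risk monitoring apparatus of the present invention may include at least one or more processors and at least one memory. The memory stores a program that, when executed by the processor, causes the processor to perform the steps shown in FIG. 1:
Step S101: obtain a batch transaction message under test uploaded by a monitored object at a specified time, and historical transaction messages uploaded by the same monitored object before the specified time;
Step S102: determine, according to specified message content, a similarity index between the batch transaction message under test and the historical transaction messages, where the specified message content includes at least two of: batch number, transaction account number, and transaction amount;
Step S103: compare the similarity index with a preset similarity threshold to determine whether the batch transaction message under test carries a duplicate-transaction risk.
A duplicate-transaction risk monitoring apparatus 5 according to this implementation of the present invention is described below with reference to FIG. 5. The apparatus 5 shown in FIG. 5 is only an example and should not limit the functions or scope of use of the embodiments of the present invention in any way.
As shown in FIG. 5, the apparatus 5 may take the form of a general-purpose computing device, including but not limited to: at least one processor 10, at least one memory 20, and a bus 60 connecting the different device components.
The bus 60 includes a data bus, an address bus, and a control bus.
The memory 20 may include volatile memory, such as random access memory (RAM) 21 and/or cache memory 22, and may further include read-only memory (ROM) 23.
The memory 20 may also include program modules 24. Such program modules 24 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
The apparatus 5 may also communicate with one or more external devices 2 (such as a keyboard, a pointing device, or a Bluetooth device), and with one or more other devices. Such communication may take place through an input/output (I/O) interface 40, with output shown on a display unit 30. The apparatus 5 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 50. As shown in the figure, the network adapter 50 communicates with the other modules of the apparatus 5 through the bus 60. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the apparatus 5, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID devices, tape drives, and data backup storage devices.
FIG. 6 shows a computer-readable storage medium for performing the method described above.
In some possible implementations, aspects of the present invention may also be implemented as a computer-readable storage medium that includes program code which, when executed by a processor, causes the processor to perform the method described above.
The method described above includes a number of operations and steps, both shown and not shown in the drawings above, which are not repeated here.
The computer-readable storage medium may be any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but without limitation, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of these. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of these.
FIG. 6 depicts a computer-readable storage medium 6 according to an implementation of the present invention, which may take the form of a portable compact disc read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer. The computer-readable storage medium of the present invention is not limited to this, however. In this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in connection with, an instruction execution system, apparatus, or device.
Program code for carrying out the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computing device (for example, through the Internet using an Internet service provider).
In addition, although the operations of the method of the present invention are described in the drawings in a particular order, this does not require or imply that the operations must be performed in that order, or that all of the operations shown must be performed to achieve the desired result. Additionally or alternatively, some steps may be omitted, several steps may be merged into one, and/or one step may be split into several.
Although the spirit and principles of the present invention have been described with reference to several specific implementations, it should be understood that the invention is not limited to the specific implementations disclosed, and that the division into aspects does not mean that features of those aspects cannot be combined to advantage; the division is made only for convenience of presentation. The present invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.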

Claims (35)

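  1. A duplicate-transaction risk monitoring method, characterized by comprising:
    obtaining a batch transaction message under test uploaded by a monitored object at a specified time, and historical transaction messages uploaded by the same monitored object before the specified time;
    determining, according to specified message content, a similarity index between the batch transaction message under test and the historical transaction messages, wherein the specified message content includes at least two of: batch number, transaction account number, and transaction amount;
    comparing the similarity index with a preset similarity threshold to determine whether the batch transaction message under test carries a duplicate-transaction risk.
  2. The method of claim 1, characterized in that obtaining the batch transaction message under test uploaded by the same monitored object at the specified time, and the historical transaction messages uploaded before the specified time, comprises:
    receiving, at the specified time, the batch transaction message under test uploaded by the same monitored object;
    determining a first time period from a preset duration and the specified time, and extracting the historical transaction messages uploaded by the same monitored object within the first time period.
  3. The method of claim 1, characterized in that determining the similarity index between the batch transaction message under test and the historical transaction messages comprises:
    determining, using a preset similarity algorithm, a similarity vector between the batch transaction message under test and the historical transaction messages;
    converting the similarity vector into the similarity index using a preset scoring rule.
  4. The method of claim 3, characterized in that determining, using the preset similarity algorithm, the similarity vector between the batch transaction message under test and the historical transaction messages comprises:
    building a sparse matrix from the batch transaction message under test and the historical transaction messages, wherein, in the sparse matrix, the value of each non-zero element is determined by the transaction amount, and the row label and column label of each element are determined by the batch number and the transaction account number respectively;
    determining m similarity parameters between a first sparse vector and m second sparse vectors in the sparse matrix, and determining the similarity vector from the m similarity parameters;
    wherein the batch transaction message under test includes multiple transaction messages corresponding to a first batch number, and the row vector or column vector of the sparse matrix corresponding to the first batch number serves as the first sparse vector; the historical transaction messages include multiple transaction messages corresponding respectively to m second batch numbers, and the row vectors or column vectors of the sparse matrix corresponding to the m second batch numbers serve as the m second sparse vectors, m being a positive integer.
  5. The method of claim 4, characterized by further comprising:
    determining the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix from the ratio and/or difference of #{(b_i - a) ≠ 0} and #{(b_i + a) ≠ 0}, where i = 1, 2, …, m;
    wherein b_i denotes the i-th of the m second sparse vectors, a denotes the first sparse vector, #{(b_i - a) ≠ 0} is the number of non-zero elements in the difference of the first sparse vector and the i-th second sparse vector, and #{(b_i + a) ≠ 0} is the number of non-zero elements in the sum of the first sparse vector and the i-th second sparse vector.
  6. The method of claim 4, characterized in that the preset scoring rule comprises:
    taking the largest of the m similarity parameters as the similarity index.
  7. The method of claim 4, characterized in that the preset scoring rule further comprises:
    determining whether the largest of the m similarity parameters reaches a preset critical value;
    if the largest similarity parameter reaches the preset critical value, taking the preset critical value as the similarity index;
    if the largest similarity parameter does not reach the preset critical value, weighting the m similarity parameters by m preset weight parameters respectively to obtain m weighted similarity parameters, and taking the largest of the m weighted similarity parameters as the similarity index.
  8. The method of claim 7, characterized in that the specified message content further includes the batch upload time, and the method further comprises:
    for each of the m similarity parameters, determining the corresponding preset weight parameter from the difference between the two corresponding batch upload times.
  9. The method of claim 8, characterized by further comprising:
    determining the m preset weight parameters by the following formula, and weighting the m similarity parameters respectively to obtain the m weighted similarity parameters: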
    Figure PCTCN2020087550-appb-100001
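    where t_a is the batch upload time of the batch transaction message under test; S_i is the i-th of the m similarity parameters; t_i is the batch upload time of the i-th batch of historical data, corresponding to the i-th similarity parameter; ω_i is the i-th of the m preset weight parameters, corresponding to the i-th similarity parameter; X_i is the i-th of the m weighted similarity parameters, corresponding to the i-th similarity parameter; and T is the duration of the first time period, which contains t_a and every t_i.
  10. The method of claim 7, characterized by further comprising: determining the m preset weight parameters from preset credit information and/or preset attribute information of the same monitored object.
  11. The method of any one of claims 1-10, characterized by further comprising:
    extracting historical transaction data uploaded by the same monitored object before the specified time, and determining the similarity threshold from the historical transaction data, wherein the historical transaction data was uploaded before the historical transaction messages.
  12. The method of claim 11, characterized in that the historical transaction data includes multiple pieces of transaction data corresponding respectively to n third batch numbers, each of the n third batch numbers carries a duplicate-transaction risk label, and n is a positive integer greater than 1; and
    the method further comprises:
    taking, in turn, the transaction data corresponding to each of the n third batch numbers as batch data under test, and taking the rest of the historical transaction data as remaining batch data;
    determining, according to the specified message content, a reference similarity index between the batch data under test and the remaining batch data, thereby obtaining a reference similarity index for each third batch number;
    building an ROC curve from the reference similarity index and the duplicate-transaction risk label of each third batch number, and determining the similarity threshold from the ROC curve.
  13. The method of claim 12, characterized in that, before the ROC curve is built, the method further comprises:
    removing reference similarity indices equal to 0 or 1, together with their duplicate-transaction risk labels.
  14. The method of claim 11, characterized in that the upload times of the historical transaction data and of the historical transaction messages have a periodic correspondence.
  15. The method of claim 1, characterized by further comprising:
    before determining the similarity index between the batch transaction message under test and the historical transaction messages, comparing the batch number of the batch transaction message under test with those of the historical transaction messages;
    if one or more historical transaction messages have the same batch number as the batch transaction message under test, directly determining that the batch transaction message under test carries a duplicate-transaction risk;
    if no historical transaction message has the same batch number as the batch transaction message under test, proceeding to determine the similarity index between the batch transaction message under test and the historical transaction messages.
  16. The method of claim 1 or 15, characterized by further comprising:
    if the batch transaction message under test is determined to carry a duplicate-transaction risk, sending warning information to the same monitored object;
    receiving confirmation information from the same monitored object, and re-determining, according to the confirmation information, whether the batch transaction message under test carries a duplicate-transaction risk.
  17. A duplicate-transaction risk monitoring apparatus, characterized by comprising:
    an obtaining module, configured to obtain a batch transaction message under test uploaded by a monitored object at a specified time, and historical transaction messages uploaded by the same monitored object before the specified time;
    a similarity module, configured to determine, according to specified message content, a similarity index between the batch transaction message under test and the historical transaction messages, wherein the specified message content includes at least two of: batch number, transaction account number, and transaction amount;
    a determining module, configured to compare the similarity index with a preset similarity threshold to determine whether the batch transaction message under test carries a duplicate-transaction risk.
  18. The apparatus of claim 17, characterized in that the obtaining module comprises:
    a receiving module, configured to receive, at the specified time, the batch transaction message under test uploaded by the same monitored object;
    an extraction module, configured to determine a first time period from a preset duration and the specified time, and to extract the historical transaction messages uploaded by the same monitored object within the first time period.
  19. The apparatus of claim 17, characterized in that the similarity module comprises:
    a similarity calculation module, configured to determine, using a preset similarity algorithm, a similarity vector between the batch transaction message under test and the historical transaction messages;
    a similarity scoring module, configured to convert the similarity vector into the similarity index using a preset scoring rule.
  20. The apparatus of claim 19, characterized in that the similarity calculation module is configured to:
    build a sparse matrix from the batch transaction message under test and the historical transaction messages, wherein, in the sparse matrix, the value of each non-zero element is determined by the transaction amount, and the row label and column label of each element are determined by the batch number and the transaction account number respectively;
    determine m similarity parameters between a first sparse vector and m second sparse vectors in the sparse matrix, and determine the similarity vector from the m similarity parameters;
    wherein the batch transaction message under test includes multiple transaction messages corresponding to a first batch number, and the row vector or column vector of the sparse matrix corresponding to the first batch number serves as the first sparse vector; the historical transaction messages include multiple transaction messages corresponding respectively to m second batch numbers, and the row vectors or column vectors of the sparse matrix corresponding to the m second batch numbers serve as the m second sparse vectors, m being a positive integer.
  21. The apparatus of claim 20, characterized in that the similarity calculation module is further configured to:
    determine the m similarity parameters between the first sparse vector and the m second sparse vectors in the sparse matrix from the ratio and/or difference of #{(b_i - a) ≠ 0} and #{(b_i + a) ≠ 0}, where i = 1, 2, …, m;
    wherein b_i denotes the i-th of the m second sparse vectors, a denotes the first sparse vector, #{(b_i - a) ≠ 0} is the number of non-zero elements in the difference of the first sparse vector and the i-th second sparse vector, and #{(b_i + a) ≠ 0} is the number of non-zero elements in the sum of the first sparse vector and the i-th second sparse vector.
  22. The apparatus of claim 20, characterized in that the similarity scoring module is configured to:
    take the largest of the m similarity parameters as the similarity index.
  23. The apparatus of claim 20, characterized in that the similarity scoring module is configured to:
    determine whether the largest of the m similarity parameters reaches a preset critical value;
    if the largest similarity parameter reaches the preset critical value, take the preset critical value as the similarity index;
    if the largest similarity parameter does not reach the preset critical value, weight the m similarity parameters by m preset weight parameters respectively to obtain m weighted similarity parameters, and take the largest of the m weighted similarity parameters as the similarity index.
  24. The apparatus of claim 23, characterized in that the specified message content further includes the batch upload time, and the similarity scoring module is further configured to:
    for each of the m similarity parameters, determine the corresponding preset weight parameter from the difference between the two corresponding batch upload times.
  25. The apparatus of claim 24, characterized in that the similarity scoring module is further configured to:
    determine the m preset weight parameters by the following formula, and weight the m similarity parameters respectively to obtain the m weighted similarity parameters: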
    Figure PCTCN2020087550-appb-100002
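    where t_a is the batch upload time of the batch transaction message under test; S_i is the i-th of the m similarity parameters; t_i is the batch upload time of the i-th batch of historical data, corresponding to the i-th similarity parameter; ω_i is the i-th of the m preset weight parameters, corresponding to the i-th similarity parameter; X_i is the i-th of the m weighted similarity parameters, corresponding to the i-th similarity parameter; and T is the duration of the first time period, which contains t_a and every t_i.
  26. The apparatus of claim 23, characterized in that the similarity scoring module is further configured to: determine the m preset weight parameters from preset credit information and/or preset attribute information of the same monitored object.
  27. The apparatus of any one of claims 17-26, characterized by further comprising a similarity threshold module, configured to:
    extract historical transaction data uploaded by the same monitored object before the specified time, and determine the similarity threshold from the historical transaction data, wherein the historical transaction data was uploaded before the historical transaction messages.
  28. The apparatus of claim 27, characterized in that the historical transaction data includes multiple pieces of transaction data corresponding respectively to n third batch numbers, each of the n third batch numbers carries a duplicate-transaction risk label, and n is a positive integer greater than 1; and
    the similarity threshold module is further configured to:
    take, in turn, the transaction data corresponding to each of the n third batch numbers as batch data under test, and take the rest of the historical transaction data as remaining batch data;
    determine, according to the specified message content, a reference similarity index between the batch data under test and the remaining batch data, thereby obtaining a reference similarity index for each third batch number;
    build an ROC curve from the reference similarity index and the duplicate-transaction risk label of each third batch number, and determine the similarity threshold from the ROC curve.
  29. The apparatus of claim 28, characterized in that, before the ROC curve is built, the similarity threshold module is further configured to:
    remove reference similarity indices equal to 0 or 1, together with their duplicate-transaction risk labels.
  30. The apparatus of claim 27, characterized in that the upload times of the historical transaction data and of the historical transaction messages have a periodic correspondence.
  31. The apparatus of claim 17, characterized by further comprising a filtering module, configured to:
    before the similarity index between the batch transaction message under test and the historical transaction messages is determined, compare the batch number of the batch transaction message under test with those of the historical transaction messages;
    if one or more historical transaction messages have the same batch number as the batch transaction message under test, directly determine that the batch transaction message under test carries a duplicate-transaction risk;
    if no historical transaction message has the same batch number as the batch transaction message under test, proceed to determine the similarity index between the batch transaction message under test and the historical transaction messages.
  32. The apparatus of claim 17 or 31, characterized by further comprising a warning module, configured to:
    if the batch transaction message under test is determined to carry a duplicate-transaction risk, send warning information to the same monitored object;
    receive confirmation information from the same monitored object, and re-determine, according to the confirmation information, whether the batch transaction message under test carries a duplicate-transaction risk.
  33. A duplicate-transaction risk monitoring system, characterized by comprising the monitoring apparatus of any one of claims 17-32 and at least one monitored object.
  34. A duplicate-transaction risk monitoring apparatus, characterized by comprising:
    one or more multi-core processors;
    a memory for storing one or more programs;
    wherein the one or more programs, when executed by the one or more multi-core processors, cause the one or more multi-core processors to:
    obtain a batch transaction message under test uploaded by a monitored object at a specified time, and historical transaction messages uploaded by the same monitored object before the specified time;
    determine, according to specified message content, a similarity index between the batch transaction message under test and the historical transaction messages, wherein the specified message content includes at least two of: batch number, transaction account number, and transaction amount;
    compare the similarity index with a preset similarity threshold to determine whether the batch transaction message under test carries a duplicate-transaction risk.
  35. A computer-readable storage medium storing a program that, when executed by a multi-core processor, causes the multi-core processor to perform the method of any one of claims 1-16.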
PCT/CN2020/087550 2019-05-16 2020-04-28 一种重复交易风险监测方法、装置及计算机可读存储介质 WO2020228530A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910405528.XA CN110135856B (zh) 2019-05-16 2019-05-16 一种重复交易风险监测方法、装置及计算机可读存储介质
CN201910405528.X 2019-05-16

Publications (1)

Publication Number Publication Date
WO2020228530A1 true WO2020228530A1 (zh) 2020-11-19

Family

ID=67574321

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/087550 WO2020228530A1 (zh) 2019-05-16 2020-04-28 一种重复交易风险监测方法、装置及计算机可读存储介质

Country Status (2)

Country Link
CN (1) CN110135856B (zh)
WO (1) WO2020228530A1 (zh)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110135856B (zh) * 2019-05-16 2023-10-24 中国银联股份有限公司 一种重复交易风险监测方法、装置及计算机可读存储介质
CN110457159A (zh) * 2019-08-21 2019-11-15 深圳前海微众银行股份有限公司 一种处理批量任务的方法、装置、计算设备及存储介质
CN110705992A (zh) * 2019-09-27 2020-01-17 支付宝(杭州)信息技术有限公司 风险防控策略的相似度评估方法及装置
CN111144975B (zh) * 2019-12-06 2023-09-12 港融科技有限公司 一种订单匹配方法、服务器及计算机可读存储介质
CN111429277B (zh) * 2020-03-18 2023-11-24 中国工商银行股份有限公司 重复交易预测方法及系统
CN112465638A (zh) * 2020-11-16 2021-03-09 中科金审(北京)科技有限公司 一种资金交易拆分组合的链路追踪的方法
CN112381163B (zh) * 2020-11-20 2023-07-25 平安科技(深圳)有限公司 一种用户聚类方法、装置及设备
CN113516555A (zh) * 2021-04-26 2021-10-19 中国工商银行股份有限公司 重复业务交易检测方法及装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070094151A1 (en) * 2002-07-29 2007-04-26 Peter Moenickheim Systems and Methods of Rules-Based Database Access For Account Authentication
CN103049851A (zh) * 2012-12-27 2013-04-17 中国建设银行股份有限公司 一种基于交易数据的反欺诈监控方法和装置
CN106157140A (zh) * 2014-09-25 2016-11-23 天逸财金科技服务股份有限公司 交易单据比对方法及系统
CN108133373A (zh) * 2018-01-04 2018-06-08 交通银行股份有限公司 探寻涉机器行为的风险账户的方法及装置
CN109191136A (zh) * 2018-09-05 2019-01-11 北京芯盾时代科技有限公司 一种电子银行反欺诈方法及装置
CN110135856A (zh) * 2019-05-16 2019-08-16 中国银联股份有限公司 一种重复交易风险监测方法、装置及计算机可读存储介质

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105809502A (zh) * 2014-12-30 2016-07-27 阿里巴巴集团控股有限公司 交易风险检测方法和装置
CN108122114A (zh) * 2017-12-25 2018-06-05 同济大学 针对异常重复交易欺诈检测方法、系统、介质及设备

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113362157A (zh) * 2021-05-27 2021-09-07 中国银联股份有限公司 异常节点识别方法、模型的训练方法、装置及存储介质
CN113362157B (zh) * 2021-05-27 2024-02-09 中国银联股份有限公司 异常节点识别方法、模型的训练方法、装置及存储介质
CN116295762A (zh) * 2023-02-09 2023-06-23 山东黄河河务局山东黄河信息中心 一种基于无线传感技术的根石位置监测系统及方法
CN116295762B (zh) * 2023-02-09 2024-05-03 山东黄河河务局山东黄河信息中心 一种基于无线传感技术的根石位置监测系统及方法
CN116308215A (zh) * 2023-05-17 2023-06-23 云账户技术(天津)有限公司 一种组批出款信息的生成方法、装置及相关设备
CN116308215B (zh) * 2023-05-17 2023-07-21 云账户技术(天津)有限公司 一种组批出款信息的生成方法、装置及相关设备

Also Published As

Publication number Publication date
CN110135856B (zh) 2023-10-24
CN110135856A (zh) 2019-08-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20806141

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20806141

Country of ref document: EP

Kind code of ref document: A1