CN109767322B

CN109767322B - Suspicious transaction analysis method and device based on big data and computer equipment

Info

Publication number: CN109767322B
Application number: CN201811567201.4A
Authority: CN
Inventors: 王晓艳; 谢翠萍
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2018-12-20
Filing date: 2018-12-20
Publication date: 2024-02-27
Anticipated expiration: 2038-12-20
Also published as: CN109767322A

Abstract

The application relates to a suspicious transaction analysis method and device based on big data and computer equipment. The method comprises the following steps: acquiring transaction data, wherein the transaction data comprises a client identifier and a plurality of transaction fields; acquiring a plurality of monitoring features, and searching a plurality of feature fields corresponding to the monitoring features through a big data platform; the plurality of feature fields being associated with the customer identification; inputting the transaction fields and a plurality of characteristic fields into a monitoring model, and identifying suspicious transactions through the monitoring model; generating an early warning event when the transaction data is identified as suspicious transaction; and sending the early warning event to a corresponding terminal, wherein the terminal is used for rechecking the suspicious transaction according to the early warning event. The method can effectively improve the accuracy and the recognition efficiency of suspicious transaction recognition.

Description

Suspicious transaction analysis method and device based on big data and computer equipment

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a suspicious transaction analysis method and apparatus based on big data, a computer device, and a storage medium.

Background

Money laundering facilitates serious crimes such as smuggling, fraud, and drug offenses, disrupts social and economic order, and causes serious social harm. Money laundering channels span banking, insurance, securities, and other fields. Anti-money laundering is therefore of great significance for maintaining economic security and preventing crime. Experience in anti-money laundering shows that an individual transaction usually appears legitimate on its own and is found to be suspicious only when combined with historical data. In the traditional way of identifying transactions suspected of money laundering, the server analyzes multiple transactions of the same customer together. However, this approach is limited to combining transaction data: the analysis covers little content, omissions occur easily, and various kinds of information must be collected manually for confirmation. As a result, suspicious transactions are identified with low accuracy and low efficiency.

Disclosure of Invention

Based on the above, it is necessary to provide a method, an apparatus, a computer device and a storage medium for suspicious transaction analysis based on big data, which effectively improve the recognition accuracy of suspicious transactions and the recognition efficiency.

A suspicious transaction analysis method based on big data, the method comprising:

acquiring transaction data, wherein the transaction data comprises a client identifier and a plurality of transaction fields;

acquiring a plurality of monitoring features, and searching a plurality of feature fields corresponding to the monitoring features through a big data platform; the plurality of feature fields being associated with the customer identification;

inputting the transaction fields and a plurality of characteristic fields into a monitoring model, and identifying suspicious transactions through the monitoring model;

generating an early warning event when the transaction data is identified as suspicious transaction;

and sending the early warning event to a corresponding terminal, wherein the terminal is used for rechecking the suspicious transaction according to the early warning event.

In one embodiment, the method further comprises:

collecting customer information of multiple dimensions and information on associated persons through multiple channels;

acquiring historical transaction data and a blacklist from a plurality of source databases, and synchronizing the historical transaction data and the blacklist to a target database;

and importing the customer information of multiple dimensions, the information on the associated persons, the historical transaction data, and the blacklist into a big data platform.

In one embodiment, the suspicious transaction identification by the monitoring model includes:

invoking corresponding anti-money laundering rules according to the transaction fields and the plurality of feature fields searched through the big data platform;

matching the transaction fields and the feature fields against the anti-money laundering rules, and recording the rule score of each anti-money laundering rule that is matched successfully;

accumulating the rule scores of the plurality of anti-money laundering rules to obtain a monitoring score corresponding to the transaction data;

and marking the transaction data as a suspicious transaction when the monitoring score exceeds a threshold.

In one embodiment, the method further comprises:

acquiring a risk type corresponding to the suspicious transaction; the risk type corresponds to a plurality of risk features;

calculating vector matrixes corresponding to the risk features, inputting a plurality of vector matrixes into a neural network model, and outputting a corresponding suspicious transaction report through the neural network model;

and sending the suspicious transaction report to a corresponding terminal.

In one embodiment, after the generating the early warning event, the method further comprises:

acquiring service fields, professional grades and task amounts corresponding to a plurality of investigator identifiers;

selecting an investigator identifier suited to the suspicious transaction according to the service field, the professional level, and the task amount;

and sending the suspicious transaction to a corresponding terminal according to the selected investigator identifier.

A suspicious transaction analysis device based on big data, the device comprising:

the data acquisition module is used for acquiring transaction data, wherein the transaction data comprises a client identifier and a plurality of transaction fields;

the big data searching module is used for acquiring a plurality of monitoring features and searching a plurality of feature fields corresponding to the monitoring features through the big data platform; the plurality of feature fields being associated with the customer identification;

the identification module is used for inputting the transaction fields and the plurality of characteristic fields into a monitoring model, and carrying out suspicious transaction identification through the monitoring model;

the early warning module is used for generating an early warning event when the transaction data are identified as suspicious transactions;

and the communication module is used for sending the early warning event to a corresponding terminal, and the terminal is used for rechecking the suspicious transaction according to the early warning event.

In one embodiment, the apparatus further comprises:

the data acquisition module is used for collecting customer information of multiple dimensions and information on associated persons through multiple channels; acquiring historical transaction data and a blacklist from a plurality of source databases, and synchronizing the historical transaction data and the blacklist to a target database; and importing the customer information of multiple dimensions, the information on the associated persons, the historical transaction data, and the blacklist into a big data platform.

In one embodiment, the identification module is further configured to invoke corresponding anti-money laundering rules according to the transaction fields and the plurality of feature fields searched through the big data platform; match the transaction fields and the feature fields against the anti-money laundering rules, and record the rule score of each anti-money laundering rule that is matched successfully; accumulate the rule scores of the plurality of anti-money laundering rules to obtain a monitoring score corresponding to the transaction data; and mark the transaction data as a suspicious transaction when the monitoring score exceeds a threshold.

A computer device comprising a memory storing a computer program and a processor implementing the steps of the method embodiments described above when the processor executes the computer program.

A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the various method embodiments described above.

According to the suspicious transaction analysis method, apparatus, computer device, and storage medium based on big data, the feature fields corresponding to the plurality of monitoring features associated with the customer identification are searched through the big data platform, and the transaction fields in the transaction data and the feature fields are input into the monitoring model. The monitoring model evaluates the transaction fields and the multidimensional feature fields, thereby identifying suspicious transactions and generating an early warning event. The data analyzed is not limited to the single transaction itself but also includes the feature fields of multiple dimensions searched through the big data platform. This effectively expands the data range of suspicious transaction analysis, allows a more comprehensive analysis that combines information of multiple dimensions, and effectively improves the accuracy of suspicious transaction identification. Throughout the process, no information needs to be collected manually to identify a suspicious transaction, and the investigator only needs to recheck it, which effectively reduces the workload and improves the efficiency of suspicious transaction identification.

Drawings

FIG. 1 is an application scenario diagram of a suspicious transaction analysis method based on big data in one embodiment;

FIG. 2 is a flow diagram of a suspicious transaction analysis method based on big data in one embodiment;

FIG. 3 is a flow diagram of suspicious transaction identification steps performed by the monitoring model in one embodiment;

FIG. 4 is a flow chart of a risk type identification step in one embodiment;

FIG. 5 is a block diagram of a suspicious transaction analysis device based on big data in one embodiment;

FIG. 6 is an internal structural diagram of a computer device in one embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

The suspicious transaction analysis method based on big data can be applied in the application environment shown in FIG. 1, in which the server 102 communicates with the big data platform 104 via a network and with the terminal 106 via a network. The server 102 may be implemented as a stand-alone server or as a server cluster including a plurality of servers. The big data platform 104 may be deployed on another independent server or on a server cluster. The server 102 may obtain transaction data from a database at a preset frequency to identify whether any transaction is suspected of money laundering. The transaction data includes a customer identification and transaction fields. Based on the customer identification, the server 102 searches the big data platform 104 for the various feature fields associated with that customer identification, thereby obtaining various data associated with the transaction data. The server 102 inputs the transaction fields and the plurality of feature fields into a monitoring model, which performs suspicious transaction identification. When the transaction data is identified as a suspicious transaction, the server 102 generates an early warning event and sends it to the corresponding terminal 106. An anti-money laundering investigator may then review the suspicious transaction via the terminal 106 based on the early warning event.

In one embodiment, as shown in FIG. 2, a suspicious transaction analysis method based on big data is provided. Taking its application to the server in FIG. 1 as an example, the method includes the following steps:

Step 202, obtaining transaction data, the transaction data including a customer identification and a plurality of transaction fields.

Step 204, obtaining a plurality of monitoring features, and searching a plurality of feature fields corresponding to the monitoring features through a big data platform; a plurality of feature fields are associated with the customer identification.

A database is deployed on the server. After each transaction is carried out, the corresponding transaction data is stored in the database. The server may acquire transaction data from the database at a preset frequency and identify whether any transaction is suspected of money laundering.

The big data platform stores various data related to customers, including customer basic information, historical transaction data, associated transactions, credit information, risk management information, and the like. The customer basic information covers not only the customer's own basic information but also information on other persons associated with the customer, that is, persons who have a social relationship with the customer, such as family members, relatives, and persons with an investment relationship. To keep the information in the big data platform valid, the data in the platform may also be updated at specific intervals.

After the server obtains the transaction data from the database, it may search the big data platform for related data. Specifically, the transaction data includes a customer identification and a plurality of transaction fields. The transaction fields include the transaction type, transaction amount, transaction time, transaction counterparty, and the like. For identifying transactions suspected of money laundering, a plurality of monitoring features are preset on the server. These include customer basic information, frequent transactions, large transactions on dormant accounts, transaction amounts exceeding a preset value (e.g., RMB 500,000), false transaction addresses, expired transaction accounts, associated transactions, fund flows to and from blacklisted customers, and the like. Each monitoring feature includes a plurality of feature fields. A monitoring feature can also be regarded as a monitoring dimension, so different monitoring features are different dimensions. Based on the customer identification, the server searches the big data platform for the plurality of feature fields associated with that customer identification. The server thereby obtains various data related to the transaction data. The analysis is not limited to the historical transaction data of the same customer but combines data of more dimensions, which enlarges the data range used to identify suspicious transactions and effectively improves the accuracy of suspicious transaction identification.
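The lookup described above can be illustrated with a minimal sketch. It assumes an in-memory dictionary standing in for the big data platform; the feature names, field names, and the function `search_feature_fields` are hypothetical and are not taken from the application.

```python
from typing import Any

# Hypothetical mapping: monitoring feature -> names of the feature fields it uses.
MONITORING_FEATURES: dict[str, list[str]] = {
    "dormant_account_large_transaction": ["account_dormant_years", "recent_large_receipts"],
    "blacklist_funds_flow": ["counterparty_on_blacklist"],
}

# Toy stand-in for the big data platform: customer identification -> stored fields.
PLATFORM: dict[str, dict[str, Any]] = {
    "C001": {
        "account_dormant_years": 4,
        "recent_large_receipts": 5,
        "counterparty_on_blacklist": False,
        "occupation": "unemployed",
    },
}


def search_feature_fields(customer_id: str, features: list[str]) -> dict[str, dict[str, Any]]:
    """Return, per monitoring feature, the feature fields stored for the customer."""
    record = PLATFORM.get(customer_id, {})
    result: dict[str, dict[str, Any]] = {}
    for feature in features:
        wanted = MONITORING_FEATURES.get(feature, [])
        # Keep only the fields that exist for this customer; missing ones are skipped.
        result[feature] = {name: record[name] for name in wanted if name in record}
    return result


fields = search_feature_fields("C001", list(MONITORING_FEATURES))
```

Each monitoring feature acts as one dimension, and only the fields tied to the customer identification are returned, which mirrors the association described in step 204.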

Step 206, inputting the transaction fields and the plurality of feature fields into a monitoring model, and identifying suspicious transactions through the monitoring model.

Step 208, generating an early warning event when the transaction data is identified as a suspicious transaction.

Step 210, the early warning event is sent to a corresponding terminal, and the terminal is used for rechecking the suspicious transaction according to the early warning event.

A monitoring model is established on the server in advance. The server may input the transaction fields and the feature fields found for the customer identification into the monitoring model. The monitoring model invokes the corresponding anti-money laundering rules according to the transaction fields and the feature fields, and evaluates the transaction fields and the feature fields against those rules to obtain the corresponding rule scores. The monitoring model accumulates the rule scores to obtain the monitoring score corresponding to the transaction data. The monitoring model compares the monitoring score with a threshold and marks the transaction data as a suspicious transaction when the monitoring score exceeds the threshold. When the transaction data is identified as a suspicious transaction, the monitoring model raises a warning for the transaction data and outputs the corresponding early warning event.
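The flow of steps 202 to 210 can be sketched as follows. This is an illustration under stated assumptions: the feature search and the monitoring model are passed in as plain callables, the terminal is represented by a list named `outbox`, and the names `EarlyWarningEvent` and `process_transaction` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class EarlyWarningEvent:
    customer_id: str
    monitoring_score: float
    transaction_fields: dict


def process_transaction(
    transaction: dict,
    search_features: Callable[[str], dict],
    monitoring_model: Callable[[dict, dict], float],
    threshold: float,
    outbox: list,
) -> Optional[EarlyWarningEvent]:
    """Score one transaction and queue an early warning event if it is suspicious."""
    # Step 204: look up the feature fields associated with the customer identification.
    feature_fields = search_features(transaction["customer_id"])
    # Step 206: the monitoring model returns the accumulated monitoring score.
    score = monitoring_model(transaction["fields"], feature_fields)
    if score <= threshold:
        return None
    # Steps 208 and 210: generate the event and hand it to the terminal queue.
    event = EarlyWarningEvent(transaction["customer_id"], score, transaction["fields"])
    outbox.append(event)
    return event


outbox: list = []
txn = {"customer_id": "C001", "fields": {"type": "transfer", "amount": 250_000}}
# Stub search and stub model, used only to exercise the flow.
event = process_transaction(txn, lambda cid: {"account_dormant_years": 4}, lambda t, f: 90.0, 60.0, outbox)
```

Keeping the model behind a callable means the scoring logic can change without touching the warning and dispatch steps.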

The server sends the early warning event to the terminal of an anti-money laundering investigator. The server may select the investigator according to service field, professional level, task amount, and the like, so that the suspicious transaction is reviewed by a suitable investigator.

In this embodiment, the feature fields corresponding to the plurality of monitoring features associated with the customer identification are searched through the big data platform, and the transaction fields in the transaction data and the feature fields are input into the monitoring model. The monitoring model evaluates the transaction fields and the multidimensional feature fields, thereby identifying suspicious transactions and generating an early warning event. The data analyzed is not limited to the single transaction itself but also includes the feature fields of multiple dimensions searched through the big data platform. This effectively expands the data range of suspicious transaction analysis, allows a more comprehensive analysis that combines information of multiple dimensions, and effectively improves the accuracy of suspicious transaction identification. Throughout the process, no information needs to be collected manually to identify a suspicious transaction, and the investigator only needs to recheck it, which effectively reduces the workload and improves the efficiency of suspicious transaction identification.

In one embodiment, after generating the early warning event, the method further comprises: acquiring the service fields, professional levels, and task amounts corresponding to a plurality of investigator identifiers; selecting an investigator identifier suited to the suspicious transaction according to the service field, the professional level, and the task amount; and sending the suspicious transaction to the corresponding terminal according to the selected investigator identifier, the terminal being used by the investigator to recheck the suspicious transaction.

To further improve the accuracy of suspicious transaction identification, suspicious transactions also need to be assigned to appropriate investigators for review. The investigator can receive the task (i.e., the suspicious transaction) through the terminal and review it using the terminal.

When the server distributes suspicious transactions to the investigators, the server acquires a plurality of investigator identifiers, and acquires corresponding service fields, professional levels, task amounts and the like according to the investigator identifiers. The business field includes insurance, banking, financial management, stock, etc. The server can inquire the identification of the investigator in the corresponding service field according to the service field corresponding to the suspicious transaction. The server then queries the investigator identification for the corresponding level of expertise based on the risk type of the suspicious transaction. Then, the server acquires the task amount corresponding to the identification of the investigator, and when the task amount does not reach the upper limit, the suspicious transaction can be distributed to the investigator. If the task amount of the investigator reaches the upper limit, the suspicious transaction is distributed to other investigators with the same service field and professional level. Therefore, suspicious transactions can be ensured to be distributed to investigators with equivalent business capability for rechecking, and the rechecking accuracy is ensured.
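The assignment logic above can be sketched as a simple filter over investigator records. The `Investigator` fields, the integer scale for professional level, and the function `assign_investigator` are assumptions made for illustration; the mapping from risk type to required level is taken as an input rather than modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Investigator:
    investigator_id: str
    service_field: str        # e.g. "insurance", "banking"
    professional_level: int   # assumed integer scale
    task_amount: int          # tasks currently assigned
    task_limit: int           # upper limit on assigned tasks


def assign_investigator(
    investigators: list[Investigator], service_field: str, required_level: int
) -> Optional[Investigator]:
    """Pick the first investigator with matching field and level who has capacity."""
    for inv in investigators:
        if inv.service_field != service_field:
            continue
        if inv.professional_level != required_level:
            continue
        if inv.task_amount >= inv.task_limit:
            continue  # task amount reached the upper limit; try the next investigator
        inv.task_amount += 1
        return inv
    return None


team = [
    Investigator("I-01", "banking", 2, task_amount=10, task_limit=10),
    Investigator("I-02", "banking", 2, task_amount=3, task_limit=10),
    Investigator("I-03", "insurance", 2, task_amount=0, task_limit=10),
]
chosen = assign_investigator(team, "banking", 2)
```

Here the first banking investigator is already at the limit, so the transaction falls through to the next investigator with the same service field and professional level.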

In one embodiment, the method further comprises: collecting customer information of multiple dimensions and information on associated persons through multiple channels; acquiring historical transaction data and a blacklist from a plurality of source databases, and synchronizing the historical transaction data and the blacklist to a target database; and importing the customer information of multiple dimensions, the information on the associated persons, the historical transaction data, and the blacklist into a big data platform.

In anti-money laundering work, effectively improving the accuracy of suspicious transaction identification requires customer information of multiple dimensions together with historical transaction data. Customer information of multiple dimensions includes customer basic information, publicly available online information, credit information, business registration information, entry and exit information, and the like.

A database, which may also be referred to as the target database, is deployed on the server. The server may synchronize customer basic information, historical transaction data, blacklists, and the like from a plurality of source databases into the target database. The customer basic information comprises the basic information submitted when the customer transacts business and the biometric information collected during that process. The basic information includes the customer's name, age, sex, address, contact information, and the like. The biometric information includes face, fingerprint, voiceprint, and the like. The historical transaction data may be generated when customers transact business over the counter or online. Transaction data of different business types may be stored in different source databases. The business types include banking, insurance, securities, funds, loans, and the like. The server may also synchronize the blacklists of multiple source databases into the target database.
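A minimal sketch of synchronizing several source databases into one target database is shown below. It uses in-memory SQLite databases purely for illustration; the `transactions` table, its columns, and the helper names are hypothetical, and the application does not specify a database engine or schema.

```python
import sqlite3


def make_db(rows: list[tuple]) -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical `transactions` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions ("
        "txn_id TEXT PRIMARY KEY, customer_id TEXT, business_type TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def count_rows(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM transactions").fetchone()[0]


def synchronize(sources: list[sqlite3.Connection], target: sqlite3.Connection) -> int:
    """Copy source rows into the target, skipping rows already present."""
    before = count_rows(target)
    for source in sources:
        rows = source.execute(
            "SELECT txn_id, customer_id, business_type, amount FROM transactions"
        ).fetchall()
        # The primary key makes repeated synchronization idempotent.
        target.executemany("INSERT OR IGNORE INTO transactions VALUES (?, ?, ?, ?)", rows)
    target.commit()
    return count_rows(target) - before


bank_db = make_db([("T1", "C001", "bank", 1000.0)])
insurance_db = make_db([("T2", "C001", "insurance", 250.0)])
target_db = make_db([])
first_run = synchronize([bank_db, insurance_db], target_db)
second_run = synchronize([bank_db, insurance_db], target_db)
```

Because already-synchronized rows are ignored, the job can be rerun periodically without duplicating historical transaction data.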

The server may also crawl articles related to the customer from a plurality of third-party websites and extract from them the content corresponding to the anti-money laundering monitoring features. The server extracts both positive and negative information about the customer from the articles. Specifically, the server may perform semantic analysis and word segmentation on the crawled articles to obtain a plurality of words. The server filters the words and extracts from the filtered words those corresponding to the monitoring features, where a monitoring feature may be a feature used to identify suspicious transactions. In addition, the server may crawl the customer's credit information, business registration information, entry and exit information, and the like from third-party websites.
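The segmentation, filtering, and extraction steps can be sketched as follows. This is a simplified illustration that assumes English text and a regular-expression tokenizer in place of real semantic analysis and word segmentation; the stop-word list, the keyword table, and the function `extract_feature_words` are hypothetical.

```python
import re

# Hypothetical stop words removed during filtering.
STOP_WORDS = {"the", "a", "an", "of", "and", "was", "in", "for", "to", "by"}

# Hypothetical keyword -> monitoring feature table.
FEATURE_KEYWORDS = {
    "fraud": "negative_news",
    "smuggling": "negative_news",
    "award": "positive_news",
}


def extract_feature_words(article: str) -> dict[str, list[str]]:
    """Tokenize an article, drop stop words, and group keywords by monitoring feature."""
    words = re.findall(r"[a-z]+", article.lower())
    filtered = [w for w in words if w not in STOP_WORDS]
    hits: dict[str, list[str]] = {}
    for word in filtered:
        feature = FEATURE_KEYWORDS.get(word)
        if feature is None:
            continue
        bucket = hits.setdefault(feature, [])
        if word not in bucket:  # record each keyword once per feature
            bucket.append(word)
    return hits


hits = extract_feature_words("The customer was investigated for fraud and smuggling in 2017.")
```

Grouping the extracted words by monitoring feature keeps positive and negative information separate for later analysis.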

The server may also gather information on associated persons who have social relationships with the customer. Social relationships may include family relationships, kinship, competitive relationships, investment relationships, supplier relationships, and the like. Specifically, the server may obtain the associated person identifiers from the customer's basic information. The server queries the target database for the information corresponding to each associated person identifier, including basic information, historical transaction data, and the like. The server may also crawl the publicly available online information, credit information, business registration information, entry and exit information, and the like corresponding to the associated person identifiers from third-party platforms.

The server imports the collected client information, historical transaction data and information of associated personnel in various dimensions into a big data platform. The Hadoop database can be adopted in the big data platform to store the imported information. The Hadoop database can be partitioned according to the service field, and information in the corresponding service field can be stored in each partition. Each partition can establish a corresponding partition index by which corresponding information can be quickly queried within the partition.
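The idea of partitioning by service field and keeping one index per partition can be illustrated with a small in-memory sketch. It is not Hadoop code; the class `PartitionedStore` and the choice of the customer identification as the index key are assumptions made for illustration.

```python
from collections import defaultdict


class PartitionedStore:
    """Toy store: one partition per service field, each with its own index."""

    def __init__(self) -> None:
        # service field -> list of records in that partition
        self._partitions: dict[str, list[dict]] = defaultdict(list)
        # service field -> customer identification -> record positions in the partition
        self._indexes: dict[str, dict[str, list[int]]] = defaultdict(lambda: defaultdict(list))

    def insert(self, service_field: str, record: dict) -> None:
        partition = self._partitions[service_field]
        self._indexes[service_field][record["customer_id"]].append(len(partition))
        partition.append(record)

    def query(self, service_field: str, customer_id: str) -> list[dict]:
        """Use the partition index instead of scanning the whole partition."""
        if service_field not in self._partitions:
            return []
        positions = self._indexes[service_field].get(customer_id, [])
        partition = self._partitions[service_field]
        return [partition[i] for i in positions]


store = PartitionedStore()
store.insert("banking", {"customer_id": "C001", "amount": 1000})
store.insert("banking", {"customer_id": "C002", "amount": 50})
store.insert("insurance", {"customer_id": "C001", "premium": 300})
banking_records = store.query("banking", "C001")
```

A query touches only the partition for the requested service field, and the index narrows it further to the positions holding that customer's records.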

In one embodiment, as shown in FIG. 3, the step of identifying suspicious transactions through the monitoring model specifically includes:

Step 302, invoking the corresponding anti-money laundering rules according to the transaction fields and the plurality of feature fields searched through the big data platform.

Step 304, matching the transaction fields and the feature fields against the anti-money laundering rules, and recording the rule score of each anti-money laundering rule that is matched successfully.

Step 306, accumulating the rule scores of the plurality of anti-money laundering rules to obtain a monitoring score corresponding to the transaction data.

Step 308, marking the transaction data as a suspicious transaction when the monitoring score exceeds the threshold.

After the server inputs the transaction fields of the current transaction data and the feature fields searched through the big data platform into the monitoring model, the monitoring model invokes the corresponding anti-money laundering rules according to the transaction fields and the feature fields. One transaction field or feature field may invoke a single anti-money laundering rule or several. Each anti-money laundering rule has a preset score. The monitoring model obtains the rule score of each invoked anti-money laundering rule and may accumulate the rule scores to obtain the monitoring score corresponding to the transaction data.

Specifically, the server acquires the parameters, parameter values, or parameter descriptions corresponding to the transaction fields or the feature fields, matches them against one or more corresponding anti-money laundering rules, and records the rule score of each rule that is matched successfully. An anti-money laundering rule may set a corresponding parameter range or parameter description. For example, the parameter range may be that transactions occur more than 3 times per day within 10 working days. The parameter description may be that a third party pays the premium for a natural person and the relationship between the third party and the applicant, the insured, and the beneficiary cannot be reasonably explained. Each transaction field and feature field contains a parameter together with the parameter value or parameter description for that parameter. The server invokes one or more corresponding anti-money laundering rules according to the parameters in the transaction fields and the feature fields. The server then matches the parameter value or parameter description in each transaction field and feature field against the corresponding anti-money laundering rules one by one. If a parameter value falls within the parameter range of the anti-money laundering rule, the match succeeds. If a parameter description is consistent with the parameter description of the anti-money laundering rule, the match succeeds. When a match succeeds, the server records the rule score of the corresponding anti-money laundering rule.
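The two kinds of matching described above, by parameter range and by parameter description, can be sketched as follows. The rule names, parameter names, and the scores 30 and 40 are hypothetical values chosen for illustration, and the class `AmlRule` is not part of the application.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class AmlRule:
    name: str
    parameter: str                      # parameter the rule applies to
    score: float                        # preset rule score (hypothetical values below)
    min_value: Optional[float] = None   # exclusive lower bound of the parameter range
    description: Optional[str] = None   # expected parameter description

    def matches(self, value: Any) -> bool:
        if self.min_value is not None:
            # Range rule: the parameter value must fall inside the range.
            return isinstance(value, (int, float)) and value > self.min_value
        # Description rule: the parameter description must be consistent.
        return value == self.description


def score_fields(fields: dict, rules: list[AmlRule]) -> tuple[float, list[str]]:
    """Match every rule against its parameter and accumulate the matched rule scores."""
    total, matched = 0.0, []
    for rule in rules:
        if rule.parameter in fields and rule.matches(fields[rule.parameter]):
            total += rule.score
            matched.append(rule.name)
    return total, matched


rules = [
    AmlRule("frequent_transactions", "daily_transactions_within_10_working_days", 30.0, min_value=3),
    AmlRule("unexplained_third_party_premium", "premium_payer_relationship", 40.0,
            description="unexplained third party"),
]
fields = {
    "daily_transactions_within_10_working_days": 4,
    "premium_payer_relationship": "unexplained third party",
}
monitoring_score, matched_rules = score_fields(fields, rules)
```

With both rules matched, the accumulated monitoring score is 70.0, which would then be compared with the threshold as in step 308.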

The specific transaction fields may differ between transaction data, and so may the multidimensional feature fields associated with the customer identification that are found in the big data platform. For example, for transaction A, the multidimensional feature fields found in the big data platform include: account dormancy duration, account activation time, number of receipts after activation, receipt amounts, receipt times, customer occupation, customer income, occupation and income of the customer's family members, occupation and income of the customer's relatives, and the like. These feature fields show that the account was dormant for 4 years, became active half a year ago, and has since received 5 large sums (more than 200,000). The customer is currently unemployed and has no income, the customer's wife is an ordinary worker with a monthly income of 5,000, and one of the customer's relatives has a trafficking record. The monitoring model invokes the corresponding anti-money laundering rules to evaluate the transaction fields and the multidimensional feature fields, and accumulates the rule score of each matched rule to obtain the monitoring score corresponding to the transaction data. Assuming the monitoring score is 90 points, which exceeds the threshold of 60 points, the transaction is identified as suspicious.

In one embodiment, accumulating the plurality of rule scores to obtain the monitoring score corresponding to the transaction data includes: acquiring the weights corresponding to the monitoring features, the weights being obtained by processing the plurality of monitoring features with a logistic regression model; correcting the anti-money laundering rule scores using the weights; and accumulating the corrected rule scores to obtain the monitoring score corresponding to the transaction data.

Each monitoring feature can be regarded as a dimension. The server performs logistic regression on historical data to obtain a score interval for each dimension, where the score interval may follow a discrete, linear, or normal distribution. From the score interval, the server selects a score for each dimension and uses it as the weight of that dimension, that is, the weight of the corresponding monitoring feature. The weights of different monitoring features may be the same or different. Different feature fields of the same monitoring feature may be given the same weight.
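One way to picture how logistic regression on historical data could yield a per-dimension weight is the sketch below. It is only an illustration under strong assumptions: it fits a plain logistic regression by gradient descent on a tiny made-up data set and treats the fitted coefficients as candidate weights, whereas the text above selects weights from score intervals and does not spell out that procedure.

```python
import math


def fit_logistic_regression(samples, labels, learning_rate=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the logistic loss."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of "suspicious"
            error = p - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Made-up history: column 0 = "dormant account reactivated" (informative),
# column 1 = "transaction on a weekend" (uninformative). Label 1 = confirmed suspicious.
history = [(1, 0), (1, 1), (0, 0), (0, 1), (1, 0), (0, 1)]
confirmed = [1, 1, 0, 0, 1, 0]
coefficients, intercept = fit_logistic_regression(history, confirmed)
```

In this toy data the first dimension fully determines the label, so its coefficient ends up larger in magnitude than that of the second dimension, which is the kind of difference in importance that the weights are meant to capture.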

The server corrects the score of each anti-money laundering rule using the weights and accumulates the corrected rule scores to obtain the monitoring score corresponding to the transaction data. Because monitoring features of different dimensions differ in importance for identifying money laundering, correcting the rule scores adjusts the contribution of each dimension. This yields a more reasonable monitoring score and further improves the accuracy of suspicious transaction identification.
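The correction step can be illustrated with a short sketch that assumes the correction is a multiplication of each rule score by the weight of its monitoring feature. The text does not fix the form of the correction, so the multiplication, the function name, and all numbers below are assumptions.

```python
def weighted_monitoring_score(matched_rules, feature_weights):
    """Accumulate rule scores after scaling each by its monitoring feature weight."""
    total = 0.0
    for feature, rule_score in matched_rules:
        weight = feature_weights.get(feature, 1.0)  # leave the score unchanged if no weight exists
        total += rule_score * weight
    return total


# (monitoring feature, rule score) pairs for the rules that matched; values are hypothetical.
matched = [("dormant_account", 40.0), ("frequent_transactions", 30.0)]
weights = {"dormant_account": 1.5, "frequent_transactions": 0.5}
corrected_score = weighted_monitoring_score(matched, weights)
```

The uncorrected sum would be 70.0, while the corrected score is 75.0 because the dormant-account dimension is weighted more heavily than the frequent-transaction dimension.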

In one embodiment, after searching the plurality of feature fields corresponding to the monitoring features through the big data platform, the method further comprises a risk type identification step. As shown in FIG. 4, this step includes:

Step 402, invoking the code corresponding to each monitoring feature according to the search results of the big data platform.

Step 404, performing risk identification on the corresponding feature fields by using the code of the monitoring features, and outputting corresponding risk tags.

Step 406, identifying the risk type corresponding to the suspicious transaction according to the plurality of risk tags.

In the conventional manner, to ensure that every risk type of a suspicious transaction is identified, the server needs to input the suspicious transaction into the risk model corresponding to each risk type, and each risk model analyzes the feature fields of its monitoring features to identify whether the suspicious transaction carries the corresponding risk. A risk model in the conventional manner bundles together the code that analyzes its various monitoring features. Different risk models usually involve the same or similar monitoring features. For example, the illegal fundraising risk model and the reimbursement risk model are similar in funding, payee, amount, and the like. If the code for each risk model is written separately, a large amount of repetitive work results.

In this embodiment, for each monitoring feature of each risk model, corresponding code is written according to a risk rule, and each monitoring feature has a corresponding risk tag. When the server identifies a suspicious transaction, it invokes the code corresponding to the monitoring features according to the plurality of feature fields in the big data search results to perform risk analysis. When the feature fields corresponding to a monitoring feature satisfy the risk rule, the server adds the corresponding risk tag to the big data search results. The risk tag may be similar to the monitoring feature or the like. Because different risk models correspond to different monitoring features, and therefore to different risk tags, the server can accurately identify the risk type of the suspicious transaction according to the plurality of risk tags corresponding to the suspicious transaction.

In this embodiment, code is written once for each monitoring feature, and risk identification is performed on the feature fields by using the code corresponding to that monitoring feature, which avoids the repetitive work of writing the same code for several risk models. The code of the monitoring features covers all of the original risk models, and once the risk tags are output through this code, every risk type related to the suspicious transaction can be identified, which ensures that the risk types of the suspicious transaction are identified comprehensively and accurately.
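Steps 402 to 406 can be sketched as a table of per-feature check functions that emit risk tags, followed by a lookup from tag sets to risk types. The feature names, thresholds, tag names and risk type names below are all hypothetical, and the rule that a risk type applies only when all of its tags are present is an assumption; the source does not specify how tags are combined.

```python
from typing import Callable, Dict, List, Optional, Set

Fields = Dict[str, int]


# Step 404: one piece of code per monitoring feature, written once and
# shared by every risk type that uses the feature.
def check_receipts(fields: Fields) -> Optional[str]:
    return "large_receipts" if fields.get("large_receipt_count", 0) >= 3 else None


def check_payers(fields: Fields) -> Optional[str]:
    return "many_payers" if fields.get("distinct_payer_count", 0) >= 10 else None


def check_dormancy(fields: Fields) -> Optional[str]:
    return "dormant_account_reactivated" if fields.get("dormant_years", 0) >= 3 else None


# Step 402: the code to invoke, keyed by monitoring feature.
FEATURE_CODE: Dict[str, Callable[[Fields], Optional[str]]] = {
    "receipts": check_receipts,
    "payers": check_payers,
    "account_activity": check_dormancy,
}

# Tags that together indicate each (hypothetical) risk type.
RISK_TYPE_TAGS: Dict[str, Set[str]] = {
    "illegal_fundraising": {"large_receipts", "many_payers"},
    "dormant_account_abuse": {"dormant_account_reactivated", "large_receipts"},
}


def risk_tags(search_result: Fields) -> Set[str]:
    """Run every feature check and collect the tags that fire."""
    tags: Set[str] = set()
    for check in FEATURE_CODE.values():
        tag = check(search_result)
        if tag is not None:
            tags.add(tag)
    return tags


def risk_types(tags: Set[str]) -> List[str]:
    """Step 406: a risk type applies when all of its tags are present."""
    return sorted(name for name, needed in RISK_TYPE_TAGS.items() if needed <= tags)


result = {"large_receipt_count": 5, "distinct_payer_count": 2, "dormant_years": 4}
print(risk_types(risk_tags(result)))
```

Because `check_receipts` is referenced by both risk types but written only once, the sketch mirrors the saving in repetitive code described above.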

In one embodiment, the method further comprises: acquiring a risk type corresponding to a suspicious transaction; the risk type corresponds to a plurality of risk features; calculating vector matrixes corresponding to the risk features, inputting the vector matrixes into a neural network model, and outputting a corresponding suspicious transaction report through the neural network model; and sending the suspicious transaction report to the corresponding terminal.

After suspicious transactions are found, timely reporting is required. In the conventional manner, suspicious transaction reports are manually completed, and the completion efficiency of the suspicious transaction reports is low. In order to effectively improve the working efficiency, the server can automatically output a suspicious transaction report after the suspicious transaction is identified.

A neural network model is pre-established on the server. The neural network model may be a recurrent neural network. The neural network model may be trained beforehand through a variety of sample files. The sample file may be a historical suspicious transaction report of multiple risk types in a database. Since suspicious transactions include a plurality of different risk types, the report templates corresponding to the different risk types are different. Thus, the server pre-trains the neural network with sample files of different risk types.

In particular, the sample file may contain multiple pieces of content. Each piece of content includes a plurality of sentences, and each sentence includes a plurality of words. The server performs sentence segmentation on the sample file to obtain a plurality of sample sentences, and performs word segmentation on the sample sentences to obtain a plurality of sample words. The server generates a training vector matrix corresponding to each sample sentence by using the sample words, and trains the neural network by using the training vector matrices and a mapping relation. During training, the server establishes the mapping relation between the plurality of sentences in the report template and the plurality of risk features according to the risk type of the suspicious transaction. Different report templates have different mapping relations with the risk features.
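The sentence and word segmentation of a sample file can be sketched as follows. This is a simplified stand-in using regular expressions: real Chinese-language reports would need a dedicated word segmenter, and the vocabulary index shown here is only one hypothetical way to prepare words before they are turned into a training vector matrix.

```python
import re
from typing import Dict, List


def split_sentences(text: str) -> List[str]:
    """Split text into sentences at Western or Chinese end punctuation."""
    pieces = re.findall(r"[^.!?。！？]+[.!?。！？]?", text)
    return [p.strip() for p in pieces if p.strip()]


def split_words(sentence: str) -> List[str]:
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def build_vocabulary(sentences: List[str]) -> Dict[str, int]:
    """Assign each distinct word an index in order of first appearance."""
    vocab: Dict[str, int] = {}
    for sentence in sentences:
        for word in split_words(sentence):
            vocab.setdefault(word, len(vocab))
    return vocab


sample = "The account was dormant for 4 years. It then received 5 large payments."
sentences = split_sentences(sample)
vocabulary = build_vocabulary(sentences)
print(sentences)
print(split_words(sentences[0]))
```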

After identifying the suspicious transaction, the server further identifies the corresponding risk type. The server acquires the corresponding transaction fields according to the risk features related to the suspicious transaction. One risk feature may correspond to one transaction field or to multiple transaction fields. The server calculates a vector matrix by using the transaction fields corresponding to the risk feature and the feature fields found through the big data platform. Each transaction field includes a field name and a field value, and each feature field found through the big data platform also includes a field name and a field value.

The server looks up, in the training lexicon, the word vectors of the plurality of words corresponding to the risk features. For numbers such as times and amounts, the server can calculate the corresponding word vector by using the word vector model. The server generates a corresponding vector matrix from the plurality of word vectors corresponding to each risk feature. The server inputs the vector matrices corresponding to the risk features into the neural network model, and the neural network model performs operation processing according to the mapping relation between the vector matrices and the sentences in the report template, and outputs the corresponding suspicious transaction report in the format of the report template.
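Building the vector matrix for one risk feature can be sketched as below. The three-dimensional toy lexicon, the zero vector for unknown words and the `number_vector` function are all hypothetical stand-ins: the source does not describe the word vector model or how numbers are embedded, and the neural network that consumes the matrix is not shown.

```python
import math
from typing import Dict, List, Union

# Toy training lexicon; real word vectors would come from a trained model.
WORD_VECTORS: Dict[str, List[float]] = {
    "receipt": [0.2, 0.7, 0.1],
    "amount": [0.6, 0.1, 0.3],
    "time": [0.1, 0.2, 0.9],
}
UNKNOWN_VECTOR: List[float] = [0.0, 0.0, 0.0]


def number_vector(value: float) -> List[float]:
    """Hypothetical stand-in for embedding a number such as an amount."""
    return [math.log10(abs(value) + 1.0), 0.0, 1.0]


def token_vector(token: str) -> List[float]:
    """Look a word up in the lexicon; numbers use number_vector instead."""
    try:
        return number_vector(float(token))
    except ValueError:
        return WORD_VECTORS.get(token.lower(), UNKNOWN_VECTOR)


def feature_matrix(field_name: str, field_value: Union[str, int, float]) -> List[List[float]]:
    """Stack one vector per token of the field name and the field value."""
    tokens = field_name.split() + str(field_value).split()
    return [token_vector(token) for token in tokens]


matrix = feature_matrix("receipt amount", 200000)
for row in matrix:
    print(row)
```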

To ensure the accuracy of the suspicious transaction reports, the server may send the suspicious transaction reports to the corresponding terminals for review by the appropriate investigators. The investigator does not need to manually finish suspicious transaction reports, only needs to recheck, and the working efficiency is effectively improved.

It should be understood that, although the steps in the flowcharts of FIGS. 2-4 are shown in sequence as indicated by the arrows, these steps are not necessarily performed in that sequence. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in FIGS. 2-4 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.

In one embodiment, as shown in FIG. 5, there is provided a big data based suspicious transaction analysis apparatus comprising: a data acquisition module 502, a big data search module 504, an identification module 506, an early warning module 508, and a communication module 510, wherein:

The data acquisition module 502 is configured to acquire transaction data, where the transaction data includes a customer identifier and a plurality of transaction fields.

The big data searching module 504 is configured to obtain a plurality of monitoring features, and search a plurality of feature fields corresponding to the monitoring features through the big data platform; a plurality of feature fields are associated with the customer identification.

The identification module 506 is configured to input the transaction field and the plurality of feature fields into a monitoring model, and identify suspicious transactions through the monitoring model.

The early warning module 508 is configured to generate an early warning event when the transaction data is identified as a suspicious transaction.

The communication module 510 is configured to send the early warning event to a corresponding terminal, where the terminal is configured to recheck the suspicious transaction according to the early warning event.

In one embodiment, the apparatus further comprises: a data acquisition module configured to acquire customer information in multiple dimensions and information on associated persons through multiple channels; acquire historical transaction data and a blacklist from a plurality of source databases, and synchronize the historical transaction data and the blacklist to a target database; and import the multi-dimensional customer information, the information on associated persons, the historical transaction data and the blacklist into the big data platform.

In one embodiment, the identification module is further configured to invoke the corresponding anti-money laundering rules according to the transaction fields and the plurality of feature fields found through the big data platform; match the transaction fields and the feature fields against the anti-money laundering rules, and record the rule score corresponding to an anti-money laundering rule when the matching succeeds; accumulate the rule scores corresponding to the anti-money laundering rules to obtain the monitoring score corresponding to the transaction data; and mark the transaction data as a suspicious transaction when the monitoring score exceeds a threshold.

In one embodiment, the identification module is further configured to acquire the weight corresponding to each monitoring feature, the weights being obtained by operating on the plurality of monitoring features through a logistic regression model; correct the anti-money laundering rule scores by using the weights; and accumulate the corrected rule scores to obtain the monitoring score corresponding to the transaction data.

In one embodiment, the identification module is further configured to invoke a code corresponding to the monitoring feature according to a search result of the big data platform; performing risk identification on the corresponding feature fields by using codes of the monitoring features, and outputting corresponding risk labels; and identifying the risk type corresponding to the suspicious transaction according to the plurality of risk tags.

In one embodiment, the apparatus further comprises: a distribution module configured to acquire the service fields, professional levels and task amounts corresponding to a plurality of investigator identifiers; select an investigator identifier suited to the suspicious transaction according to the service field, the professional level and the task amount; and send the suspicious transaction to the corresponding terminal according to the selected investigator identifier, the terminal being used by the investigator to recheck the suspicious transaction.
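The selection performed by the distribution module can be sketched as a three-stage filter: service field, then professional level, then remaining task capacity. The `Investigator` record, the mapping from risk type to required professional level, and the numeric task limits are hypothetical; the source names the three criteria but not their data representation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Investigator:
    investigator_id: str
    service_field: str
    level: int
    open_tasks: int
    task_limit: int


# Hypothetical mapping from risk type to the professional level required.
LEVEL_FOR_RISK: Dict[str, int] = {"illegal_fundraising": 2, "fraud": 1}


def assign(investigators: List[Investigator], service_field: str,
           risk_type: str) -> Optional[str]:
    """Return the identifier of the investigator who receives the case."""
    level = LEVEL_FOR_RISK[risk_type]
    # Filter by service field first, then by professional level.
    candidates = [inv for inv in investigators
                  if inv.service_field == service_field and inv.level == level]
    # Take the first candidate whose task amount is below the upper limit;
    # later candidates share the same field and level, so they act as the
    # fallback when an earlier one is fully loaded.
    for inv in candidates:
        if inv.open_tasks < inv.task_limit:
            inv.open_tasks += 1
            return inv.investigator_id
    return None  # nobody with matching field and level has capacity


pool = [
    Investigator("A", "banking", 2, open_tasks=5, task_limit=5),
    Investigator("B", "banking", 2, open_tasks=1, task_limit=5),
    Investigator("C", "insurance", 2, open_tasks=0, task_limit=5),
]
print(assign(pool, "banking", "illegal_fundraising"))
```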

For specific limitations on the big data based suspicious transaction analysis apparatus, reference may be made to the above limitations on the big data based suspicious transaction analysis method, which are not repeated here. The modules in the big data based suspicious transaction analysis apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in or independent of a processor in the computer device in the form of hardware, or may be stored in a memory in the computer device in the form of software, so that the processor can invoke and execute the operations corresponding to the modules.

In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used to store transaction data, monitoring features, and the like. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements a suspicious transaction analysis method based on big data.

It will be appreciated by those skilled in the art that the structure shown in fig. 6 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the respective method embodiments described above.

Those skilled in the art will appreciate that all or part of the above described methods may be implemented by a computer program instructing the relevant hardware, the computer program being stored on a non-volatile computer readable storage medium and, when executed, performing the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.

The above examples represent only a few embodiments of the present application. They are described specifically and in detail, but are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, all of which fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims

1. A suspicious transaction analysis method based on big data, the method comprising:

Matching each transaction field and the parameter or the parameter expression in the characteristic field with the corresponding anti-money laundering rule one by one, if the parameter values or the parameter descriptions in the transaction field and the characteristic field fall into the parameter range of the anti-money laundering rule, the matching is successful, if the parameter descriptions in the transaction field and the characteristic field are consistent with the parameter descriptions of the anti-money laundering rule, the matching is successful, and when the matching is successful, the rule score corresponding to the corresponding anti-money laundering rule is recorded;

performing logistic regression processing by using the historical data to obtain score intervals corresponding to each monitoring feature respectively; the score intervals follow a discrete, linear or normal distribution;

determining weights corresponding to the monitoring features respectively according to the score intervals corresponding to the monitoring features; the weight is obtained by calculating a plurality of monitoring features through a logistic regression model;

correcting the rule scores corresponding to the anti-money laundering rules by using the weights;

accumulating the corrected rule scores to obtain monitoring scores corresponding to the transaction data;

when the monitoring score exceeds a threshold value, marking the transaction data as suspicious transactions;

invoking codes corresponding to the monitoring features according to the search results of the big data platform, performing risk identification on corresponding feature fields by utilizing the codes corresponding to the monitoring features, outputting corresponding risk tags, and identifying risk types corresponding to the suspicious transactions according to a plurality of risk tags;

inquiring a first investigator identifier in the corresponding service field according to the service field corresponding to the suspicious transaction;

in the queried first investigator identifier, querying a second investigator identifier corresponding to a professional level according to the risk type of the suspicious transaction;

acquiring task quantity corresponding to the second investigator identifier, when the task quantity does not reach the upper limit, distributing the suspicious transaction to the investigator corresponding to the investigator identifier, and when the task quantity reaches the upper limit, distributing the suspicious transaction to other investigators with the same service field and professional level;

sending the suspicious transaction to a corresponding terminal according to the selected investigator identifier;

2. The method according to claim 1, wherein the method further comprises:

3. The method according to claim 1, wherein the method further comprises:

crawling a plurality of articles related to the customer through a plurality of third party websites;

carrying out semantic analysis and word segmentation processing on the crawled articles to obtain a plurality of words;

and filtering the plurality of words, and extracting words corresponding to the monitoring features from the filtered words.

4. The method according to claim 1, wherein the method further comprises:

and sending the suspicious transaction report to a corresponding terminal.

5. The method of claim 4, wherein outputting, by the neural network model, the corresponding suspicious transaction report comprises:

and the neural network model carries out operation processing according to the mapping relation between the vector matrix and sentences in the report template, and outputs a corresponding suspicious transaction report according to the format of the report template.

6. The method according to claim 4, wherein the method further comprises:

and training the neural network model by utilizing sample files of different risk types in advance.

7. A suspicious transaction analysis device based on big data, the device comprising:

The identification module is used for invoking the corresponding anti-money laundering rules according to the transaction fields and the feature fields searched by the big data platform; matching each transaction field and the parameter or the parameter expression in the feature field with the corresponding anti-money laundering rule one by one, if the parameter values or the parameter descriptions in the transaction field and the feature field fall into the parameter range of the anti-money laundering rule, the matching is successful, if the parameter descriptions in the transaction field and the feature field are consistent with the parameter descriptions of the anti-money laundering rule, the matching is successful, and when the matching is successful, the rule score corresponding to the corresponding anti-money laundering rule is recorded; performing logistic regression processing by using the historical data to obtain score intervals corresponding to each monitoring feature respectively; the score intervals follow a discrete, linear or normal distribution; determining weights corresponding to the monitoring features respectively according to the score intervals corresponding to the monitoring features; the weight is obtained by calculating a plurality of monitoring features through a logistic regression model; correcting the rule scores corresponding to the anti-money laundering rules by using the weights; accumulating the corrected rule scores to obtain monitoring scores corresponding to the transaction data; when the monitoring score exceeds a threshold value, marking the transaction data as suspicious transactions;

the identification module is also used for calling codes corresponding to the monitoring features according to the search results of the big data platform, carrying out risk identification on corresponding feature fields by utilizing the codes corresponding to the monitoring features, outputting corresponding risk tags, and identifying risk types corresponding to the suspicious transactions according to a plurality of risk tags;

the distribution module is used for acquiring the service fields, the professional grades and the task amounts corresponding to the plurality of investigator identifiers; inquiring a first investigator identifier in the corresponding service field according to the service field corresponding to the suspicious transaction; in the queried first investigator identifier, querying a second investigator identifier corresponding to a professional level according to the risk type of the suspicious transaction; acquiring task quantity corresponding to the second investigator identifier, when the task quantity does not reach the upper limit, distributing the suspicious transaction to the investigator corresponding to the investigator identifier, and when the task quantity reaches the upper limit, distributing the suspicious transaction to other investigators with the same service field and professional level; sending the suspicious transaction to a corresponding terminal according to the selected investigator identifier;

8. The apparatus of claim 7, wherein the apparatus further comprises:

9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 6 when the computer program is executed.

10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.