WO2023123929A1

WO2023123929A1 - Abnormal application recognition method and device

Info

Publication number: WO2023123929A1
Application number: PCT/CN2022/100697
Authority: WO
Inventors: 蔡远航; 郑少杰; 范增虎
Original assignee: 深圳前海微众银行股份有限公司
Priority date: 2021-12-27
Filing date: 2022-06-23
Publication date: 2023-07-06
Also published as: CN114282988A

Abstract

Provided in the present application are an abnormal application recognition method and device. The method comprises: acquiring application address information of at least one borrower, wherein the application address information comprises a lender address and a borrower address, which correspond to different administrative regions; clustering the at least one borrower according to the lender address and the borrower address, so as to obtain at least one borrower group; according to a lender address and a borrower address that correspond to the borrower group, acquiring, from a preset transfer matrix, a remote lending probability corresponding to the borrower group, wherein the preset transfer matrix is used for indicating remote lending probabilities between different administrative regions; and determining, from among the at least one borrower group and according to the remote lending probability of the borrower group, a target borrower group having an abnormal application. By means of the present application, a target borrower group having abnormal lending can be recognized by combining address clustering with abnormal lending probabilities between different administrative regions. There is no need to manually process data from different channels, thereby reducing the recognition cost, and improving the recognition accuracy.

Description

Abnormal application identification method and equipment

This application claims the priority of the Chinese patent application with the application number 202111609530.2 and the title of "abnormal application identification method and equipment" filed with the China Patent Office on December 27, 2021, the entire contents of which are incorporated by reference in this application.

Technical field

The embodiments of the present application relate to the technical field of financial technology, and in particular to a method and device for identifying abnormal applications.

Background art

With the development of computer technology, more and more technologies are being applied in the financial field, and the traditional financial industry is gradually transforming into financial technology (Fintech). Abnormal application identification for loans is no exception; however, the security and real-time requirements of the financial industry place higher demands on the technology. In the Fintech field, applying for a loan is a very common scenario. After a loan is granted, the borrower needs to repay it on time; failure to do so causes economic losses to the lender. To avoid such losses, it is necessary to identify abnormal applications among loan applications.

In the prior art, there are two ways to identify abnormal applications. Fig. 1 is a schematic diagram of an abnormal application identification process provided by the prior art. Referring to Fig. 1, in the first way, multiple pieces of loan application information are first input into a clustering algorithm to obtain one or more borrower groups; analysts then use empirical knowledge to analyze the borrower groups and determine which borrower groups contain abnormal applications. The loan application information may include basic information, login information, associated information and loan information. The basic information includes the age and income of the borrower. The login information may include the model of the device used for login and the Internet Protocol (IP) address of that device. The associated information may include contact information and family member information. The loan information may include the loan amount and the consumption address.

The second way to identify abnormal applications in the prior art is through association graphs. Specifically, basic information and loan transaction information of a large number of borrowers are first obtained from various channels; the borrowers' identity, income, preferences and other information are then analyzed and the borrowers are labeled; next, correspondences between the large number of labels and the loan transaction information are established to form an association graph; finally, abnormal applications are identified according to the association graph.

However, the first way requires analysts to conduct the analysis, and the weight of each dimension in the clustering process also needs to be determined manually, resulting in a high identification cost for abnormal applications. The second way requires a large amount of data from different channels as support. In practical applications it is difficult to obtain multi-channel data, so the association graph is incomplete, which reduces the accuracy of identifying abnormal applications.

Technical solution

The present application provides a method and equipment for identifying abnormal applications, so as to reduce the identification cost of abnormal loans and improve the identification accuracy.

In the first aspect, the present application provides a method for identifying abnormal applications, the method comprising:

Obtaining application address information of at least one borrower through a deep learning model, where the application address information includes a lender address and a borrower address, and the administrative region corresponding to the lender address is different from the administrative region corresponding to the borrower address;

clustering the at least one borrower according to the address of the lender and the address of the borrower to obtain at least one group of borrowers;

For each of the borrower groups, obtaining, from a preset transfer matrix and according to the lender address and the borrower address corresponding to the borrower group, the off-site loan probability corresponding to the borrower group, where the preset transfer matrix is used to indicate off-site loan probabilities between different administrative regions;

A target borrower group with abnormal applications is determined from at least one borrower group according to the off-site loan probability of the borrower group.

Optionally, the determining the target borrower group with abnormal application from at least one borrower group according to the probability of the borrower group's off-site loan includes:

Determining, from the at least one borrower group, the target borrower group with abnormal applications according to the off-site loan probability of the borrower group and the abnormal loan data that has appeared in the borrower group, where the abnormal loan data includes at least one of the following: the total overdue loan amount of the borrowers in the borrower group, and the total number of overdue days of the borrowers in the borrower group.

Optionally, the determining the target borrower group with abnormal application from at least one borrower group according to the probability of off-site loans of the borrower group and the abnormal loan data that have appeared in the borrower group includes:

Determining the ratio of the abnormal loan data that has appeared in the borrower group to the probability of off-site loans of the borrower group;

A target borrower group with abnormal applications is determined from at least one borrower group according to the ratio.
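As a rough illustration of the ratio-based determination above, the following sketch divides each borrower group's abnormal loan data (here, the total overdue amount) by its off-site loan probability and flags the groups whose ratio exceeds a threshold. The record layout, the function names and the threshold rule are assumptions made for illustration only; the application does not state how the ratio is compared.

```python
from dataclasses import dataclass


@dataclass
class BorrowerGroup:
    name: str
    overdue_amount: float       # total overdue loan amount of the group's borrowers
    offsite_probability: float  # value looked up from the preset transfer matrix


def abnormal_ratio(group: BorrowerGroup, eps: float = 1e-9) -> float:
    # Ratio of abnormal loan data to off-site loan probability.
    # eps is an assumed guard against division by zero.
    return group.overdue_amount / max(group.offsite_probability, eps)


def flag_abnormal_groups(groups, threshold):
    # Assumed rule: a group is a target group when its ratio exceeds the threshold.
    return [g.name for g in groups if abnormal_ratio(g) > threshold]


# Hypothetical data: same overdue amount, very different off-site loan probabilities.
groups = [
    BorrowerGroup("A", 50_000.0, 0.50),
    BorrowerGroup("B", 50_000.0, 0.01),
]
flagged = flag_abnormal_groups(groups, threshold=1_000_000.0)
```

With equal overdue amounts, the group whose region pair has the lower off-site loan probability gets the larger ratio, which matches the intuition that unlikely region pairs with overdue loans deserve more attention.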

Optionally, the inter-regional loan probability between different administrative regions is generated through the following steps:

Performing a weighted operation on at least one associated attribute between the first administrative region corresponding to the borrower address and the second administrative region corresponding to the lender address to obtain the off-site loan probability, where the at least one associated attribute includes at least one of the following: the distance between the first administrative region and the second administrative region, the ratio between the gross production value of the second administrative region and the gross production value of the first administrative region, and the proportion of non-overdue loans among the off-site loans in which the borrower address belongs to the first administrative region and the lender address belongs to the second administrative region.

Optionally, the address of the borrower includes at least one level of administrative region, and the acquisition of the application address information of at least one borrower through a deep learning model includes:

Inputting the borrower address text of the borrower into the deep learning model to obtain the at least one level of administrative region, where the deep learning model is obtained through training with preset training samples, and the training samples include at least one of the following: a sample address text, and a sample type corresponding to each character in the sample address text, the sample type being one of the following: the start character of a level of administrative region, the end character of a level of administrative region, and other characters.

Optionally, the deep learning model includes: an input layer, a bidirectional LSTM layer, and a CRF layer. During training, the input layer is used to receive the sample address text, the bidirectional LSTM layer is used to process the sample address text to obtain a vector, and the CRF layer is used to predict the prediction type of each character in the sample address text according to the vector. The sample type and the prediction type are used to determine a loss value. When the loss value satisfies the convergence condition, the training ends; when the loss value does not satisfy the convergence condition, the parameters of the deep learning model are adjusted according to the loss value to perform the next round of training.

Optionally, after inputting the borrower's address text of the borrower into the deep learning model to obtain the at least one level of administrative region, it also includes:

If some levels of administrative areas are missing in the at least one level of administrative areas, the missing administrative areas are determined according to a preset administrative area tree, and the preset administrative area tree is used to represent the hierarchical relationship between administrative areas; and/or,

If some levels of administrative areas are missing in the at least one level of administrative areas, the text of the address of the borrower is input into the third-party interface to obtain the missing administrative areas.

In the second aspect, the present application provides an abnormal application identification device, including:

The application address information acquisition module is used to obtain the application address information of at least one borrower through a deep learning model, where the application address information includes a lender address and a borrower address, and the administrative region corresponding to the lender address is different from the administrative region corresponding to the borrower address;

An address clustering module, configured to cluster the at least one borrower according to the address of the lender and the address of the borrower to obtain at least one group of borrowers;

The off-site loan probability acquisition module is used to obtain, for each of the borrower groups, the off-site loan probability corresponding to the borrower group from the preset transfer matrix according to the lender address and the borrower address corresponding to the borrower group, where the preset transfer matrix is used to indicate off-site loan probabilities between different administrative regions;

The abnormality identification module is used to determine the target borrower group with abnormal applications from the at least one borrower group according to the off-site loan probability of the borrower group.

Optionally, the abnormality identification module is also used for:

When determining a target borrower group with abnormal applications from the at least one borrower group according to the off-site loan probability of the borrower group and the abnormal loan data that has appeared in the borrower group, determining the ratio of the abnormal loan data that has appeared in the borrower group to the off-site loan probability of the borrower group;

Optionally, the off-site loan probability between different administrative regions is generated through the following module:

The first loan probability generation module is used to perform a weighted operation on at least one associated attribute between the first administrative region corresponding to the borrower address and the second administrative region corresponding to the lender address to obtain the off-site loan probability, where the at least one associated attribute includes at least one of the following: the distance between the first administrative region and the second administrative region, the ratio between the gross production value of the second administrative region and the gross production value of the first administrative region, and the proportion of non-overdue loans among the off-site loans in which the borrower address belongs to the first administrative region and the lender address belongs to the second administrative region.

Optionally, the borrower's address includes at least one level of administrative region, and the application address information acquisition module is also used to:

Optionally, the device also includes:

The administrative area completion module is used to, after the borrower address text of the borrower is input into the deep learning model to obtain the at least one level of administrative area, and if some levels of administrative areas are missing in the at least one level of administrative areas, determine the missing administrative areas according to the preset administrative area tree, which is used to represent the hierarchical relationship between administrative areas; and/or,

In a third aspect, the present application provides an electronic device, including: at least one processor and a memory;

the memory stores computer-executable instructions;

The at least one processor executes the computer-executed instructions stored in the memory, so that the electronic device implements the method in the aforementioned first aspect.

In a fourth aspect, the present application provides a computer-readable storage medium in which computer-executable instructions are stored; when a processor executes the computer-executable instructions, the computing device implements the method in the aforementioned first aspect.

In a fifth aspect, the present application provides a computer program, the computer program is used to implement the method in the aforementioned first aspect.

In the abnormal application identification method and equipment provided in this application, the method includes: obtaining the application address information of at least one borrower through a deep learning model, where the application address information includes a lender address and a borrower address, and the administrative region corresponding to the lender address is different from the administrative region corresponding to the borrower address; clustering the at least one borrower according to the lender address and the borrower address to obtain at least one borrower group; for each borrower group, obtaining the off-site loan probability corresponding to the borrower group from the preset transfer matrix according to the lender address and the borrower address corresponding to the borrower group, where the preset transfer matrix is used to indicate off-site loan probabilities between different administrative regions; and determining a target borrower group with abnormal applications from the at least one borrower group according to the off-site loan probability of the borrower group. In this embodiment of the application, the borrowers can be clustered according to the lender address and the borrower address to obtain at least one borrower group, and the target borrower group with abnormal loans can then be identified in combination with the off-site loan probabilities between different administrative regions in the transfer matrix. The whole process does not require manual processing, thereby reducing the identification cost. In addition, this embodiment of the application needs only the lender address and the borrower address and does not require data from other channels, which avoids the low identification accuracy caused by being unable to obtain data from more channels.

Description of drawings

Fig. 1 is a schematic diagram of an abnormal application identification process provided by the prior art;

Fig. 2 is a flow chart of specific steps of the abnormal application identification method provided by the embodiment of the present application;

Fig. 3 is a schematic structural diagram of a deep learning model provided by an embodiment of the present application;

Fig. 4 is a structural block diagram of an abnormal application identification device provided by an embodiment of the present application;

Fig. 5 is a structural block diagram of an electronic device provided by an embodiment of the present application.

Embodiments of the present invention

The following will clearly and completely describe the technical solutions in the embodiments of the application with reference to the drawings in the embodiments of the application. Apparently, the described embodiments are only some of the embodiments of the application, not all of them. Based on the embodiments in this application, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the scope of protection of this application.

The terms "first", "second" and the like in the specification and claims of the present application and the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances such that the embodiments of the application described herein, for example, can be practiced in sequences other than those illustrated or described herein.

Furthermore, the terms "comprising" and "having", as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, product or device comprising a sequence of steps or elements is not necessarily limited to the expressly listed steps or elements, but may include other steps or elements not explicitly listed or inherent to the process, method, product or device.

This embodiment of the application can be applied to loan scenarios. The borrower can provide a loan application to the lender, and the information required for review can be specified in the loan application. The lender reviews the loan application, and the loan is successful after the review is passed.

In order to avoid the economic losses caused by loans, it is necessary to accurately identify abnormal loans. This identification process can be carried out during the review process, and when the loan is identified as an abnormal loan, the review result can be determined as failed review. This identification process can also be done post-loan to notify the borrower to repay as soon as possible.

Fig. 2 is a flow chart of specific steps of the abnormal application identification method provided by the embodiment of the present application. Referring to Figure 2, the method may include:

S101: Obtain the application address information of at least one borrower through the deep learning model. The application address information includes: the address of the lender and the address of the borrower, and the administrative area corresponding to the address of the lender is different from the administrative area corresponding to the address of the borrower.

The borrower address may include at least one of the following: the household address of the borrower, the residence address of the borrower, and the work address of the borrower.

The lender address may include the administrative region where the lender is located. This administrative region may be determined according to the Internet Protocol (IP) address of the lender's electronic device used when the borrower submits the loan application. The correspondence between administrative regions and IP addresses is preset, and one administrative region may correspond to one or more IP addresses. For example, the borrower goes to the lender's offline service point and uses an electronic device provided by the lender to submit a loan application. The loan application may be entered by the borrower, or by a staff member of the lender. In this case, the IP address of the lender's electronic device can be obtained and then converted into the corresponding administrative region. Existing tools for converting an IP address into an administrative region can be called for this purpose.
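The preset correspondence between administrative regions and IP addresses described above can be sketched as a simple lookup. The table contents, the region names and the `region_for_ip` helper are hypothetical, and the address ranges are documentation-only ranges; a real deployment would more likely call an existing IP geolocation tool, as the text notes.

```python
import ipaddress

# Hypothetical preset correspondence. One administrative region may map
# to several IP ranges, as the description allows.
REGION_BY_NETWORK = {
    ipaddress.ip_network("203.0.113.0/25"): "Region A",
    ipaddress.ip_network("203.0.113.128/25"): "Region B",
    ipaddress.ip_network("198.51.100.0/24"): "Region A",
}


def region_for_ip(ip_text):
    """Return the administrative region for an IP address, or None if unknown."""
    ip = ipaddress.ip_address(ip_text)
    for network, region in REGION_BY_NETWORK.items():
        if ip in network:
            return region
    return None
```

For example, `region_for_ip("203.0.113.200")` falls in the second range of this illustrative table and yields `"Region B"`, while an address outside all listed ranges yields `None`.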

The borrower address is usually entered by the borrower when applying for a loan, and usually includes an administrative region and a detailed address below the administrative region. The administrative region corresponding to the borrower address is the administrative region contained in the borrower address, and can be obtained through a deep learning model. The borrower enters an address text that includes the province, city, county (or district), street and residential community. For example, the borrower address text can be "XXXX Community, XXX Street, XXX County, XXXX City, XXXX Province". The administrative region can then be identified from the address text.

It can be seen that the above-mentioned administrative regions are hierarchical. For example, a province is the first level, a city is the second level, and a county (or district) is the third level. A province can include one or more cities, and a city can include one or more counties.

In this embodiment of the present application, at least one level of administrative regions can be identified from the borrower's address text through a deep learning model.

Specifically, the borrower address text of the borrower can be input into the deep learning model to obtain at least one level of administrative region. The deep learning model is obtained by training with a large number of preset training samples. The training samples include at least one of the following: a sample address text, and a sample type corresponding to each character in the sample address text, the sample type being one of the following: the start character of a level of administrative region, the end character of a level of administrative region, and the remaining characters.

The content of the sample address text is similar to that of the borrower address text; the difference is that each character in the sample address text corresponds to a sample type. For example, when the multiple levels of administrative regions include province, city and county, the sample types of the start and end characters of the province are "B-PROV" and "E-PROV" respectively, those of the city are "B-CITY" and "E-CITY", those of the county are "B-COUNTY" and "E-COUNTY", and the sample type of the remaining characters is "O". Thus, a sample address text could be "X\B-PROV\X\O\province\E-PROV\X\B-CITY\X\O\city\E-CITY\X\B-COUNTY\X\O\county\E-COUNTY\X\O\X\O\Street\O\Dao\O\X\O\X\O\X\O\X\O\Small\O\District\O".
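To make the labeling scheme concrete, the following sketch shows how per-character types of this kind could be turned back into administrative regions: each region runs from its "B-" character to the matching "E-" character, with "O" characters in between. The `extract_regions` helper and the example address are illustrative assumptions, not part of the application.

```python
def extract_regions(chars, tags):
    """Collect the text between each B-<LEVEL> tag and its matching E-<LEVEL> tag."""
    regions = {}
    start, level = None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            # Start character of an administrative region of this level.
            start, level = i, tag[2:]
        elif tag.startswith("E-") and start is not None and tag[2:] == level:
            # End character: keep everything from the start character to here.
            regions[level] = "".join(chars[start:i + 1])
            start, level = None, None
    return regions


# Made-up example address, one tag per character.
chars = list("广东省深圳市南山区科技园")
tags = [
    "B-PROV", "O", "E-PROV",
    "B-CITY", "O", "E-CITY",
    "B-COUNTY", "O", "E-COUNTY",
    "O", "O", "O",
]
regions = extract_regions(chars, tags)
```

The same decoding step applies to the types predicted by the model at inference time, which is how the administrative regions of each level can be read off the borrower address text.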

The foregoing deep learning model may be any existing deep learning model, which is not limited in this embodiment of the present application. Through multiple experiments, the embodiments of the present application arrive at a deep learning model with high recognition accuracy. Fig. 3 is a schematic structural diagram of a deep learning model provided by an embodiment of the present application. Referring to Fig. 3, the deep learning model may include: an input layer, a bidirectional long short-term memory (LSTM) layer, and a conditional random field (CRF) layer.

Among them, the input layer is used to receive the sample address text, the bidirectional LSTM layer is used to process the sample address text to obtain a vector, and the CRF layer is used to predict the prediction type of each character in the sample address text according to the vector.

Of course, in application, the input layer is used to receive the borrower address text, the bidirectional LSTM layer is used to process the borrower address text to obtain a vector, and the CRF (conditional random field) layer is used to predict the type of each character in the borrower address text according to the vector, so that the administrative regions of each level can be extracted from the borrower address text according to the types.

The predicted type of each character above corresponds to one of the sample types marked in the sample address text, and similarly, the type of each character corresponds to one of the sample types marked in the sample address text. But the prediction type and sample type corresponding to the same character may be the same or different.

The process of training the deep learning model with the above training samples may include multiple rounds of iterations. In each round, a set of training samples is input into the deep learning model to obtain the predicted type of each character in each training sample; the predicted type and the sample type of each character in this set of training samples are then input into the loss function to obtain the loss value; finally, it is determined whether the loss value meets the convergence condition. When the loss value meets the convergence condition, the training ends. When the loss value does not meet the convergence condition, the parameters of the deep learning model are adjusted for the next round of training.

Wherein, the loss function mentioned above may adopt a loss function commonly used in the prior art, for example, a cross-entropy loss function, an absolute value loss function, and a square sum loss function.

Satisfying the convergence condition of the above loss function may include but is not limited to: the loss value is less than or equal to a preset loss value threshold, or the loss value does not decrease after multiple rounds of iterations.
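The two example convergence conditions can be sketched as a check applied to the loss values recorded after each round. The `has_converged` name and the `patience` parameter (how many rounds count as "multiple rounds") are assumptions; the application does not fix these details.

```python
def has_converged(loss_history, loss_threshold, patience):
    """Return True when either example convergence condition holds."""
    if not loss_history:
        return False
    # Condition 1: the latest loss is at or below the preset threshold.
    if loss_history[-1] <= loss_threshold:
        return True
    # Condition 2: the loss has not decreased over the last `patience` rounds,
    # i.e. no recent value beats the best value seen before them.
    if len(loss_history) > patience:
        best_before = min(loss_history[:-patience])
        recent_best = min(loss_history[-patience:])
        return recent_best >= best_before
    return False
```

In a training loop, this check would run after each round: training ends when it returns `True`, and otherwise the model parameters are adjusted and the next round begins.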

It can be seen that, ideally, if the borrower address text entered by the borrower includes the administrative area of each level, then the administrative area identified from the borrower address text includes all levels.

However, in practical applications, the borrower's address text entered by the borrower may lack information on some administrative regions, so that the administrative region identified from the borrower's address text lacks some levels. In this way, subsequent processing based on the address of the borrower will be inaccurate. In order to improve the accuracy of subsequent processing, it is necessary to complete the administrative area of the borrower's address.

Specifically, if some levels of administrative regions are missing in the at least one level of administrative regions, the missing administrative regions are determined according to the preset administrative region tree; and/or, the borrower address text is input into a third-party interface to obtain the missing administrative regions. The preset administrative region tree is used to represent the hierarchical relationship between administrative regions.

The preset administrative region tree is a tree structure that includes the hierarchical relationships among all administrative regions. The nodes in the administrative region tree form parent-child relationships, and the administrative region of a child node belongs to exactly one administrative region, that of its parent node. Therefore, if a lower-level administrative region is present in the at least one level of administrative regions but the higher-level administrative region is not, the parent node can be determined from the node corresponding to the lower-level administrative region, and the administrative region corresponding to the parent node is determined as the higher-level administrative region.

However, if the preset administrative region tree cannot complete the administrative regions, a third-party interface can also be called to determine the missing administrative regions based on the detailed address in the borrower address text. For example, the county to which the address belongs can be determined based on the residential community or street.
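The tree-based completion can be sketched with a child-to-parent table standing in for the preset administrative region tree. The region names, the `PARENT` table and the `complete_regions` helper are hypothetical; the third-party interface fallback is not modeled here.

```python
# Hypothetical child -> parent table standing in for the preset tree.
PARENT = {
    "County C1": "City B1",
    "County C2": "City B1",
    "City B1": "Province A",
    "City B2": "Province A",
}
LEVELS = ["province", "city", "county"]  # highest level first


def complete_regions(known):
    """Fill in missing higher levels from the lowest known level upward."""
    completed = dict(known)
    for idx in range(len(LEVELS) - 1, 0, -1):
        child_level, parent_level = LEVELS[idx], LEVELS[idx - 1]
        child = completed.get(child_level)
        if child is not None and parent_level not in completed:
            parent = PARENT.get(child)
            if parent is not None:
                completed[parent_level] = parent
    return completed
```

Given only `{"county": "County C1"}`, the sketch recovers the city and province by walking up the tree. A missing lower level cannot be inferred this way, because a parent may have several children, which is the case the third-party interface is meant to cover.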

S102: Clustering at least one borrower according to the above lender address and borrower address to obtain at least one borrower group.

Specifically, firstly, the address of the lender is converted into latitude and longitude coordinates, and the address of the borrower is converted into coordinates of latitude and longitude, and then, the borrowers are clustered according to the latitude and longitude coordinates of the lender and the latitude and longitude coordinates of the borrower.

It can be understood that the above clustering may use an existing clustering algorithm. When the existing clustering algorithm can only cluster along one dimension, it is called multiple times. For example, the clustering algorithm is first called to cluster the at least one borrower according to the latitude and longitude coordinates of the lender address, obtaining at least one first borrower group; the clustering algorithm is then called to cluster the borrowers in each first borrower group according to the latitude and longitude coordinates of the borrower address, obtaining at least one second borrower group. The second borrower groups of all the first borrower groups are the at least one borrower group obtained in S102.

When the borrower address includes the residence address, household address and work address of the borrower, the process of calling the clustering algorithm to cluster the borrowers in a first borrower group according to the latitude and longitude coordinates of the borrower address to obtain at least one second borrower group may include: first, calling the clustering algorithm to cluster the borrowers in the first borrower group according to the latitude and longitude coordinates of the residence address to obtain at least one first subgroup; then, calling the clustering algorithm to cluster the borrowers in each first subgroup according to the latitude and longitude coordinates of the household address to obtain at least one second subgroup; finally, calling the clustering algorithm to cluster the borrowers in each second subgroup according to the latitude and longitude coordinates of the work address to obtain at least one third subgroup. Each third subgroup is thus a second borrower group.

Of course, the above-mentioned clustering order among the lender's address, the borrower's residential address, household address, and work address can be flexibly adjusted, and this embodiment of the application does not limit it.

It can be understood that, after the clustering in S102, the borrowers in each resulting borrower group have the same borrower address (called the borrower address of the borrower group) and the same lender address (called the lender address of the borrower group).

S103: For each borrower group, obtain the off-site loan probability corresponding to the borrower group from a preset transfer matrix according to the lender address and the borrower address corresponding to the borrower group, where the preset transfer matrix is used to indicate the off-site loan probabilities between different administrative regions.

Wherein, the value of the mth row and the nth column of the preset transfer matrix may be the probability of off-site loans from the mth administrative region to the nth administrative region.

It can be understood that the greater the off-site loan probability, the more likely loans from the m-th administrative region to the n-th administrative region are. The off-site loan probability is usually determined by attributes between the two administrative regions, including but not limited to: distance, GDP (gross domestic product) gap, and the proportion of loans that are not overdue. The smaller the distance, the larger the GDP gap, and the larger the proportion of non-overdue loans, the greater the off-site loan probability. Therefore, a borrower group with a lower off-site loan probability is more likely to have abnormal loans.

Wherein, the distance may be the length of a drivable route between two administrative regions, not the straight-line distance between the two administrative regions.

Specifically, the inter-regional loan probability between the above-mentioned different administrative regions can be generated through the following steps:

A weighted operation is performed on at least one associated attribute between the first administrative region, corresponding to the borrower's address, and the second administrative region, corresponding to the lender's address, to obtain the off-site loan probability from the first administrative region to the second administrative region. The at least one associated attribute includes at least one of the following: the distance between the first administrative region and the second administrative region; the ratio of the gross domestic product of the second administrative region to that of the first administrative region; and the proportion of non-overdue loans among the off-site loans whose borrower address belongs to the first administrative region and whose lender address belongs to the second administrative region.

Here, an off-site (non-local) loan refers to a loan application in which the borrower's address and the lender's address belong to different administrative regions.

The above proportion may be computed either by number of loans or by total loan amount.
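As a non-limiting illustration, one possible form of the weighted operation is sketched below. The weights, the `distance_scale_km` constant, and the two squashing functions that map distance and GDP ratio into the range 0 to 1 are assumptions introduced only for this sketch; the embodiment does not prescribe them. They are chosen so that a smaller distance, a larger GDP ratio, and a larger non-overdue proportion each increase the result.

```python
def off_site_loan_probability(distance_km, gdp_lender, gdp_borrower,
                              non_overdue_share,
                              weights=(0.4, 0.3, 0.3),
                              distance_scale_km=500.0):
    """Weighted score in [0, 1] for loans from the borrower's region (first region)
    to the lender's region (second region). All constants are assumed values."""
    w_dist, w_gdp, w_share = weights
    # Assumed mapping: a shorter drivable distance gives a closeness nearer to 1.
    closeness = 1.0 / (1.0 + distance_km / distance_scale_km)
    # Assumed mapping: GDP ratio (second region / first region) squashed into (0, 1).
    ratio = gdp_lender / gdp_borrower
    gdp_term = ratio / (1.0 + ratio)
    return w_dist * closeness + w_gdp * gdp_term + w_share * non_overdue_share

# Hypothetical inputs: a nearby, richer lender region versus a distant, similar one.
near = off_site_loan_probability(100.0, 3.0e12, 1.0e12, 0.95)
far = off_site_loan_probability(2000.0, 1.0e12, 1.0e12, 0.60)
```

Because the assumed weights sum to 1 and every term lies between 0 and 1, the result can be read as a probability-like score.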

It can be understood that the above steps can be performed periodically, for example, once a year or once a month, and the loan information of all administrative regions in the current cycle is used each time.

The administrative regions in the above transfer matrix can be the administrative regions of any one level. The lower the level, the more administrative regions there are, the larger the transfer matrix, and the higher the accuracy. The transfer matrix can therefore be generated at the county or district level.

After the above transfer matrix is obtained, the off-site loan probability of any borrower group can be read from it. Specifically, first, the administrative region corresponding to the borrower address of the borrower group is determined as the borrower's administrative region, and the administrative region corresponding to the lender address of the borrower group is determined as the lender's administrative region; then, the entry of the transfer matrix whose row is the borrower's administrative region and whose column is the lender's administrative region is taken as the off-site loan probability of the borrower group.
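As a non-limiting illustration, the row and column lookup can be sketched as follows. The region names, the matrix values, and the helper name `lookup_probability` are made up for this sketch.

```python
# Hypothetical administrative regions and their row/column indices.
regions = ["Region A", "Region B", "Region C"]
index = {name: i for i, name in enumerate(regions)}

# transfer[m][n]: off-site loan probability from region m (borrower side)
# to region n (lender side). The diagonal is unused because the borrower's
# and the lender's regions always differ. Values are illustrative only.
transfer = [
    [None, 0.60, 0.05],
    [0.55, None, 0.30],
    [0.02, 0.25, None],
]

def lookup_probability(borrower_region, lender_region):
    """Row = borrower's administrative region, column = lender's administrative region."""
    return transfer[index[borrower_region]][index[lender_region]]

p = lookup_probability("Region A", "Region C")
```

Note that the matrix need not be symmetric: the probability from Region A to Region C may differ from that from Region C to Region A.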

S104: Determine, from the at least one borrower group, a target borrower group that has an abnormal application according to the off-site loan probability of each borrower group.

Wherein, the determination strategy of the target borrower group may include multiple strategies.

In the first strategy, the borrower group with the smallest off-site loan probability, or one or more borrower groups with relatively small off-site loan probabilities, can be determined as the target borrower group.
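As a non-limiting illustration, the first strategy can be sketched as selecting either the `k` smallest probabilities or all probabilities below a threshold. The parameters `k` and `threshold` and the sample values are assumptions of this sketch.

```python
def lowest_probability_groups(group_probs, k=1, threshold=None):
    """Return group names ordered from smallest to largest off-site loan probability.
    If threshold is given, keep every group strictly below it; otherwise keep k groups."""
    ranked = sorted(group_probs.items(), key=lambda item: item[1])
    if threshold is not None:
        return [name for name, prob in ranked if prob < threshold]
    return [name for name, _ in ranked[:k]]

# Hypothetical off-site loan probabilities per borrower group.
probs = {"g1": 0.60, "g2": 0.05, "g3": 0.30}
smallest = lowest_probability_groups(probs)
below = lowest_probability_groups(probs, threshold=0.4)
```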

In the second strategy, the target borrower group is determined by combining the off-site loan probability with other information. Specifically, the target borrower group that has an abnormal application is determined from the at least one borrower group according to the off-site loan probability of each borrower group and the abnormal loan data that has appeared in the borrower group, where the abnormal loan data includes at least one of the following: the total overdue loan amount of the borrowers in the borrower group, and the total number of overdue days of the loans of the borrowers in the borrower group.

Here, the target borrower group is a borrower group with a smaller off-site loan probability, a larger total overdue loan amount, and a larger total number of overdue days.

In one example, the borrower groups can be sorted in descending order by jointly considering the reciprocal of the off-site loan probability, the total overdue loan amount, and the total number of overdue days. The highest-ranked borrower group can then be determined as the target borrower group.
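As a non-limiting illustration, one way to sort on the three quantities jointly is to rank the groups on each quantity separately and then order them by the sum of their ranks. This rank-sum rule is an assumption of the sketch; the embodiment does not specify how the three quantities are combined.

```python
def rank_groups(groups):
    """groups: name -> (off_site_probability, total_overdue_amount, total_overdue_days).
    Returns names ordered from most to least suspicious by summed per-metric rank."""
    metrics = {
        name: (1.0 / prob, amount, days)  # reciprocal: low probability ranks high
        for name, (prob, amount, days) in groups.items()
    }
    total_rank = {name: 0 for name in groups}
    for i in range(3):
        # Descending order on metric i; position 0 is the most suspicious.
        ordered = sorted(metrics, key=lambda n: metrics[n][i], reverse=True)
        for position, name in enumerate(ordered):
            total_rank[name] += position
    return sorted(groups, key=lambda n: total_rank[n])

# Hypothetical borrower groups.
sample_groups = {
    "g1": (0.60, 1000.0, 5),
    "g2": (0.05, 8000.0, 40),
    "g3": (0.30, 9000.0, 12),
}
ranking = rank_groups(sample_groups)
```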

In another example, for each borrower group, the ratio of the abnormal loan data that has appeared in the borrower group to the off-site loan probability of the borrower group is determined; the target borrower group that has an abnormal application is then determined from the at least one borrower group according to the ratio.

Specifically, when the abnormal loan data that has appeared is the total overdue loan amount of the borrowers in the borrower group and the total number of overdue days of the loans of the borrowers in the borrower group, determining the ratio of the abnormal loan data that has appeared in the borrower group to the off-site loan probability of the borrower group may include: first, determining the product of the total overdue loan amount and the total number of overdue days; then, determining the ratio of the product to the off-site loan probability of the borrower group.

After the above ratio is obtained, one or more borrower groups with a larger ratio can be determined as the target borrower group.
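As a non-limiting illustration, the ratio described above (product of total overdue amount and total overdue days, divided by the off-site loan probability) and the selection of the groups with the largest ratios can be sketched as follows. The function names, the parameter `k`, and the sample values are assumptions of this sketch.

```python
def abnormality_ratio(total_overdue_amount, total_overdue_days, probability):
    """(total overdue amount x total overdue days) / off-site loan probability."""
    return total_overdue_amount * total_overdue_days / probability

def top_ratio_groups(groups, k=1):
    """groups: name -> (off_site_probability, total_overdue_amount, total_overdue_days).
    Returns the k group names with the largest ratio, largest first."""
    scores = {
        name: abnormality_ratio(amount, days, prob)
        for name, (prob, amount, days) in groups.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical borrower groups.
ratio_groups = {
    "g1": (0.60, 1000.0, 5),
    "g2": (0.05, 8000.0, 40),
    "g3": (0.30, 9000.0, 12),
}
targets = top_ratio_groups(ratio_groups, k=2)
```

A small probability in the denominator and large overdue figures in the numerator both push the ratio up, which matches the description of the target borrower group.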

After the target borrower groups are obtained, approval of their loan applications can be blocked, and reminders can be sent to target borrower groups whose loan applications have already been approved, including but not limited to telephone or SMS reminders.

Corresponding to the method for identifying abnormal applications in the above embodiments, FIG. 4 is a structural block diagram of an apparatus for identifying abnormal applications provided in the embodiments of the present application. For ease of description, only the parts related to the embodiment of the present application are shown. Referring to FIG. 4, the abnormal application identification device 200 includes: an application address information acquisition module 201, an address clustering module 202, an off-site loan probability acquisition module 203, and an abnormal identification module 204.

The application address information acquisition module 201 is used to obtain the application address information of at least one borrower through a deep learning model, where the application address information includes the address of the lender and the address of the borrower, and the administrative region corresponding to the address of the lender is different from the administrative region corresponding to the address of the borrower.

The address clustering module 202 is configured to cluster the at least one borrower according to the address of the lender and the address of the borrower to obtain at least one group of borrowers.

The off-site loan probability acquisition module 203 is configured to, for each borrower group, obtain the off-site loan probability corresponding to the borrower group from the preset transfer matrix according to the lender address and the borrower address corresponding to the borrower group, where the preset transfer matrix is used to indicate the off-site loan probabilities between different administrative regions.

The abnormal identification module 204 is configured to determine a target borrower group with an abnormal application from the at least one borrower group according to the off-site loan probability of the borrower group.

Optionally, the abnormality identification module 204 is also used for:

When determining a target borrower group with an abnormal application from the at least one borrower group according to the off-site loan probability of the borrower group and the abnormal loan data that has appeared in the borrower group, determine the ratio of the abnormal loan data that has appeared in the borrower group to the off-site loan probability of the borrower group; and, according to the ratio, determine the target borrower group that has an abnormal application from the at least one borrower group.

Optionally, the borrower's address includes at least one level of administrative region, and the application address information acquisition module 201 is also used to input the borrower address text of the borrower into the deep learning model to obtain the at least one level of administrative region.

Optionally, the device also includes:

An administrative region completion module, used to: if administrative regions at some levels are missing from the at least one level of administrative region, determine the missing administrative regions according to a preset administrative region tree, which is used to represent the hierarchical relationship between administrative regions; and/or, if administrative regions at some levels are missing from the at least one level of administrative region, input the borrower address text into a third-party interface to obtain the missing administrative regions.
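As a non-limiting illustration, completing missing higher-level administrative regions from a preset administrative region tree can be sketched as a walk from child to parent. The child-to-parent dictionary, the three level names, and the function name `complete_regions` are assumptions of this sketch, and the tree fragment is only a small example.

```python
# Assumed representation of the administrative region tree: child -> parent.
PARENT = {
    "Nanshan District": "Shenzhen",
    "Futian District": "Shenzhen",
    "Shenzhen": "Guangdong",
    "Guangdong": None,
}
# Assumed level names, ordered from highest to lowest.
LEVELS = ("province", "city", "district")

def complete_regions(parsed):
    """Fill missing higher levels by walking up from the lowest known region.
    Levels that cannot be resolved from the tree are left as None."""
    result = dict(parsed)
    for i in range(len(LEVELS) - 1, 0, -1):
        child = result.get(LEVELS[i])
        if child is not None and result.get(LEVELS[i - 1]) is None:
            result[LEVELS[i - 1]] = PARENT.get(child)
    return result

# Hypothetical model output in which only the district was recognized.
completed = complete_regions({"district": "Nanshan District"})
```

This sketch keys the tree on region names alone, so it assumes names are unique; in practice, same-named districts in different cities would need a qualified key or the third-party interface mentioned above.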

FIG. 5 is a structural block diagram of an electronic device provided by an embodiment of the present application. The electronic device 600 includes a memory 602 and at least one processor 601.

The memory 602 stores computer-executable instructions. The at least one processor 601 executes the computer-executable instructions stored in the memory 602, so that the electronic device 600 implements the method in FIG. 2.

In addition, the electronic device may also include a receiver 603 and a transmitter 604. The receiver 603 is used to receive information from other devices and forward it to the processor 601, and the transmitter 604 is used to send information to other devices.

The embodiment of the present application also provides a computer-readable storage medium, the computer-readable storage medium stores computer-executable instructions, and when the processor executes the computer-executable instructions, the computing device implements the method described in FIG. 2 .

An embodiment of the present application further provides a computer program, the computer program is used to implement the method described in FIG. 2 above.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, rather than to limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that it is still possible to modify the technical solutions described in the foregoing embodiments, or to make equivalent replacements for some or all of the technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

For convenience of explanation, the above description has been made in conjunction with specific implementation manners. However, the above exemplary discussion is not intended to be exhaustive or to limit the implementations to the precise forms disclosed above. Many modifications and variations are possible in light of the above teachings. The selection and description of the above embodiments are to better explain the principles and practical applications, so that those skilled in the art can better use the embodiments and various modified embodiments suitable for specific use considerations.

Claims

A method for identifying abnormal applications, characterized in that the method includes:

acquiring application address information of at least one borrower through a deep learning model, where the application address information includes the address of the lender and the address of the borrower, and the administrative region corresponding to the address of the lender is different from the administrative region corresponding to the address of the borrower;

clustering the at least one borrower according to the address of the lender and the address of the borrower to obtain at least one group of borrowers;

for each of the borrower groups, obtaining, according to the lender address and the borrower address corresponding to the borrower group, the off-site loan probability corresponding to the borrower group from a preset transfer matrix, where the preset transfer matrix is used to indicate the off-site loan probabilities between different administrative regions;

A target borrower group with abnormal applications is determined from at least one borrower group according to the off-site loan probability of the borrower group.
The method according to claim 1, characterized in that, according to the off-site loan probability of the borrower group, determining the target borrower group with abnormal application from at least one borrower group includes:

According to the off-site loan probability of the borrower group and the abnormal loan data that has appeared in the borrower group, determine the target borrower group that has an abnormal application from at least one borrower group, where the abnormal loan data includes at least one of the following items: the total overdue loan amount of the borrowers in the borrower group, and the total number of overdue days of the loans of the borrowers in the borrower group.
The method according to claim 2, characterized in that determining the target borrower group that has an abnormal application from at least one borrower group according to the off-site loan probability of the borrower group and the abnormal loan data that has appeared in the borrower group includes:

Determining the ratio of the abnormal loan data that has appeared in the borrower group to the probability of off-site loans of the borrower group;

A target borrower group with abnormal applications is determined from at least one borrower group according to the ratio.
The method according to any one of claims 1 to 3, wherein the inter-regional loan probability between different administrative regions is generated through the following steps:

Perform a weighted operation on at least one associated attribute between the first administrative region corresponding to the borrower's address and the second administrative region corresponding to the lender's address to obtain the off-site loan probability, where the at least one associated attribute includes at least one of the following: the distance between the first administrative region and the second administrative region; the ratio of the gross domestic product of the second administrative region to that of the first administrative region; and the proportion of non-overdue loans among the off-site loans whose borrower address belongs to the first administrative region and whose lender address belongs to the second administrative region.
The method according to any one of claims 1 to 3, wherein the address of the borrower includes at least one level of administrative region, and the acquisition of the application address information of at least one borrower through a deep learning model includes:

Inputting the borrower address text of the borrower into the deep learning model to obtain the at least one level of administrative region, where the deep learning model is obtained through training with preset training samples, and the training samples include at least one of the following items: a sample address text, and a sample type corresponding to each character in the sample address text, the sample type being one of the following: the start character of an administrative region of one level, the end character of an administrative region of one level, or another character.
The method according to claim 5, wherein the deep learning model comprises an input layer, a bidirectional LSTM layer, and a CRF layer; during training, the input layer is used to receive the sample address text, the bidirectional LSTM layer is used to process the sample address text to obtain a vector, and the CRF layer is used to predict the prediction type of each character in the sample address text according to the vector; the sample type and the prediction type are used to determine a loss value; when the loss value satisfies a convergence condition, the training ends, and when the loss value does not satisfy the convergence condition, the parameters of the deep learning model are adjusted according to the loss value for the next round of training.
The method according to claim 5, characterized in that, after inputting the borrower address text of the borrower into the deep learning model and obtaining the at least one level of administrative region, further comprising:

If some levels of administrative areas are missing in the at least one level of administrative areas, the missing administrative areas are determined according to a preset administrative area tree, and the preset administrative area tree is used to represent the hierarchical relationship between administrative areas; and/or,

If some levels of administrative areas are missing in the at least one level of administrative areas, the text of the address of the borrower is input into the third-party interface to obtain the missing administrative areas.
An abnormal application identification device, characterized in that it includes:

An application address information acquisition module, used to obtain the application address information of at least one borrower through a deep learning model, where the application address information includes the address of the lender and the address of the borrower, and the administrative region corresponding to the address of the lender is different from the administrative region corresponding to the address of the borrower;

An address clustering module, configured to cluster the at least one borrower according to the address of the lender and the address of the borrower to obtain at least one group of borrowers;

An off-site loan probability acquisition module, used to, for each of the borrower groups, obtain the off-site loan probability corresponding to the borrower group from a preset transfer matrix according to the lender address and the borrower address corresponding to the borrower group, where the preset transfer matrix is used to indicate the off-site loan probabilities between different administrative regions;

An abnormal identification module, used to determine the target borrower group with an abnormal application from at least one borrower group according to the off-site loan probability of the borrower group.
An electronic device, characterized in that the electronic device includes: at least one processor and a memory;

the memory stores computer-executable instructions;

The at least one processor executes the computer-executed instructions stored in the memory, so that the electronic device implements the method according to any one of claims 1-7.
A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions, and when a processor executes the computer-executable instructions, a computing device implements the method according to any one of claims 1 to 7.