CN114638723A - Method and system for risk analysis based on business handling data - Google Patents

Method and system for risk analysis based on business handling data

Info

Publication number
CN114638723A
Authority
CN
China
Prior art keywords
service
business
risk
category
bank
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210318187.4A
Other languages
Chinese (zh)
Inventor
朱江波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd
Priority to CN202210318187.4A
Publication of CN114638723A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention provides a method and a system for risk analysis based on business handling data, relating to the technical field of financial data analysis and processing. The method comprises the following steps: performing cluster analysis on the service categories according to multidimensional data, and dividing the service category set into a plurality of service category subsets; acquiring business handling data of each bank outlet in a predetermined area, and determining a service category vector, a risk category vector and a main risk category corresponding to each bank outlet; performing cluster analysis on the bank outlets in the predetermined area to obtain a plurality of bank outlet subsets; determining, from the business handling data recorded when tellers handle business in each bank outlet subset, the handling duration of each service category, and obtaining a probability density function of the handling duration of each service category; and acquiring, at the bank outlet and in real time, the business handling data generated when a teller handles business, and performing risk analysis on that data according to the probability density function of the handling duration of each service category.

Description

Method and system for risk analysis based on business handling data
Technical Field
The invention relates to the technical field of financial data analysis and processing, in particular to a method and a system for risk analysis based on business handling data.
Background
This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
With the continuous development of the financial industry, market competition among banks intensifies day by day. Banks station tellers at the counters of their outlets, and the tellers operate the bank's front-office business system to handle various kinds of business for customers. However, because tellers can access and manipulate customers' important information and data, their handling of customer business carries a certain risk.
Therefore, a technical solution for analyzing the risk of tellers' business handling is needed.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a method and a system for risk analysis based on business handling data. According to the invention, abnormal handling behaviors of the teller can be found through data analysis, so that risks are reduced to a certain extent.
In a first aspect of the embodiments of the present invention, a method for performing risk analysis based on business handling data is provided, including:
obtaining multidimensional data of each service category in a service category set of a bank outlet, carrying out cluster analysis on the service categories according to the multidimensional data, and dividing the service category set into a plurality of service category subsets;
acquiring service handling data of each bank branch in a predetermined area, and determining a service category vector, a risk category vector and a main risk category corresponding to each bank branch; wherein, the components of the service category vector correspond to the service category subsets one by one;
performing cluster analysis on the bank outlets in the preset area according to the business category vector, the risk category vector and the main risk category corresponding to each bank outlet to obtain a plurality of bank outlet subsets;
determining the service handling duration of each service type according to the service handling data when the teller handles the service in each bank branch subset to obtain the probability density function of the service handling duration of each service type;
acquiring, by the bank outlet and in real time, the business handling data generated when a teller handles business; performing risk analysis on the business handling data according to the probability density function of the handling duration of each service category; storing the business handling data that presents a risk in the blockchain node of the bank outlet and uploading it to the bank server; sending risk prompt information to the teller; receiving feedback information bearing the teller's digital signature; and storing the feedback information in the blockchain.
In a second aspect of the embodiments of the present invention, a system for risk analysis based on business handling data is provided, including:
the service class clustering analysis module is used for acquiring multidimensional data of each service class in a service class set of a bank outlet, carrying out clustering analysis on the service classes according to the multidimensional data, and dividing the service class set into a plurality of service class subsets;
the business handling data analysis module is used for acquiring business handling data of each bank branch in a predetermined area and determining a business category vector, a risk category vector and a main risk category corresponding to each bank branch; wherein, the components of the service category vector correspond to the service category subsets one by one;
the bank branch clustering analysis module is used for clustering analysis on the bank branches in the preset area according to the business category vector, the risk category vector and the main risk category corresponding to each bank branch to obtain a plurality of bank branch subsets;
the probability density function determining module is used for determining the service handling time of each service type according to the service handling data when the teller handles the service in each bank branch subset to obtain the probability density function of the service handling time of each service type;
the risk analysis module is used for acquiring, through the bank outlet and in real time, the business handling data generated when a teller handles business; performing risk analysis on the business handling data according to the probability density function of the handling duration of each service category; storing the business handling data that presents a risk in the blockchain node of the bank outlet and uploading it to the bank server; sending risk prompt information to the teller; receiving feedback information bearing the teller's digital signature; and storing the feedback information in the blockchain.
In a third aspect of the embodiments of the present invention, a computer device is provided, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the method for performing risk analysis based on business process data is implemented.
In a fourth aspect of the embodiments of the present invention, a computer-readable storage medium is provided, in which a computer program is stored, and the computer program, when executed by a processor, implements a method for risk analysis based on business transaction data.
In a fifth aspect of embodiments of the present invention, a computer program product is presented, the computer program product comprising a computer program that, when executed by a processor, implements a method for risk analysis based on business transaction data.
The method and the system for risk analysis based on business handling data proposed by the invention work as follows. The service categories are clustered, dividing the service category set into a plurality of service category subsets. The business handling data of each bank outlet in a predetermined area is analyzed to determine the service category vector, the risk category vector and the main risk category corresponding to each bank outlet, and the bank outlets in the predetermined area are clustered according to these three items to obtain a plurality of bank outlet subsets. The handling duration of each service category is then determined from the business handling data recorded when tellers handle business in each bank outlet subset, giving a probability density function of the handling duration of each service category. The bank outlet acquires in real time the business handling data generated when a teller handles business, performs risk analysis on it according to the probability density function of the handling duration of each service category, stores the business handling data that presents a risk in the blockchain node of the bank outlet and uploads it to the bank server, sends risk prompt information to the teller, receives feedback information bearing the teller's digital signature, and stores the feedback information in the blockchain. In this way abnormal handling behavior of tellers can be discovered, risks are reduced to a certain extent, and the security of bank business handling is improved.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating a method for risk analysis based on business transaction data according to an embodiment of the present invention.
FIG. 2 is a flow chart illustrating a method for risk analysis based on business transaction data according to another embodiment of the present invention.
FIG. 3 is a flow chart illustrating analyzing business transaction data according to an embodiment of the invention.
FIG. 4 is a schematic flow chart of cluster analysis of bank outlets according to an embodiment of the present invention.
FIG. 5 is a flow chart illustrating the determination of a probability density function according to an embodiment of the present invention.
FIG. 6 is a flow diagram illustrating a process for real-time risk analysis of business process data, in accordance with an embodiment of the present invention.
FIG. 7 is a schematic flow chart of risk analysis of business handling data after the daily close of business according to an embodiment of the present invention.
FIG. 8 is a schematic diagram of a system architecture for risk analysis based on business transaction data according to an embodiment of the present invention.
Fig. 9 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The principles and spirit of the present invention will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the invention, and are not intended to limit the scope of the invention in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present invention may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
According to the embodiment of the invention, a method and a system for risk analysis based on business handling data are provided, and the method and the system relate to the technical field of financial data analysis and processing.
Taking a business handling scenario as an example, under normal conditions the business handling data generated when bank tellers handle various kinds of business follows a certain probability distribution. A handling time that is too long or too short often indicates a risk. The invention finds abnormal handling behavior of tellers through data analysis, thereby reducing risk to a certain extent.
The handling duration of a piece of business is influenced by many factors, such as the workflow of the business, the processing and response time of the bank system, the teller's proficiency, and whether a risk review is performed and how long it takes. Under normal conditions the handling duration of each service category follows a certain distribution, and each handling duration observed for a teller can be regarded as a sample from that distribution. When the number of samples is large enough, the distribution can be estimated accurately, and the risk of business handling can then be analyzed.
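As a rough illustration of this idea, the sketch below fits a distribution to historical handling durations for one service category and flags a new duration that falls in either tail. The normal model, the 1% tail threshold, the sample values and the function names are all assumptions made for the example; the text above does not fix the form of the distribution or the decision rule.

```python
from statistics import NormalDist

def fit_duration_model(durations_s):
    """Fit a normal distribution to observed handling durations (seconds).

    Assumption: a normal model is used purely for illustration.
    """
    return NormalDist.from_samples(durations_s)

def is_suspicious(model, duration_s, tail=0.01):
    """Flag a duration that is too short or too long under the fitted model.

    `tail` is a hypothetical two-sided threshold on the cumulative probability.
    """
    p = model.cdf(duration_s)
    return p < tail or p > 1.0 - tail

# Hypothetical historical durations for one service category, in seconds.
history = [118, 125, 131, 122, 140, 127, 135, 129, 121, 133]
model = fit_duration_model(history)

for observed in (128, 20, 300):
    print(observed, is_suspicious(model, observed))
```

A duration near the centre of the history is accepted, while very short or very long durations are flagged for review.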
The principles and spirit of the present invention are explained in detail below with reference to several exemplary embodiments of the present invention.
FIG. 1 is a flowchart illustrating a method for risk analysis based on business transaction data according to an embodiment of the present invention. As shown in fig. 1, the method includes:
S1, obtaining multidimensional data of each service category in the service category set of a bank outlet, performing cluster analysis on the service categories according to the multidimensional data, and dividing the service category set into a plurality of service category subsets;
S2, acquiring the business handling data of each bank outlet in a predetermined area, and determining the service category vector, the risk category vector and the main risk category corresponding to each bank outlet; wherein the components of the service category vector correspond to the service category subsets one to one;
S3, performing cluster analysis on the bank outlets in the predetermined area according to the service category vector, the risk category vector and the main risk category corresponding to each bank outlet to obtain a plurality of bank outlet subsets;
S4, determining the handling duration of each service category from the business handling data recorded when tellers handle business in each bank outlet subset, and obtaining the probability density function of the handling duration of each service category;
S5, the bank outlet acquires in real time the business handling data generated when a teller handles business, performs risk analysis on the business handling data according to the probability density function of the handling duration of each service category, stores the business handling data that presents a risk in the blockchain node of the bank outlet, uploads it to the bank server, sends risk prompt information to the teller, receives feedback information bearing the teller's digital signature, and stores the feedback information in the blockchain.
Further, referring to fig. 2, a schematic flow chart of a method for risk analysis based on business transaction data according to another embodiment of the present invention is shown. As shown in fig. 2, the method further comprises:
S6, after the bank outlets close for business each day, the bank server performs risk analysis on the business handling data of all tellers to determine the risk associated with each teller and each service category, stores the business handling data in the blockchain node of the server, sends risk prompt information to the teller, receives feedback information bearing the teller's digital signature, and stores the feedback information in the blockchain.
In order to clearly explain the method for risk analysis based on business transaction data, the following detailed description is made with reference to each step.
In S1, the specific process of obtaining multidimensional data of each service category in the service category set of the bank outlet, performing cluster analysis on the service categories according to the multidimensional data, and dividing the service category set into a plurality of service category subsets is as follows:
S101, for each service category (such as withdrawal, deposit, transfer and the like), obtaining the values of the service category in multiple dimensions;
S102, determining a distance function corresponding to each dimension, and then determining the distance function corresponding to the service categories based on the distance functions corresponding to the individual dimensions.
In one embodiment, bank customers are classified to obtain a plurality of customer categories, each customer category being one dimension. For each dimension, the arguments of the corresponding distance function are any two service categories, and the function value is the absolute value of the difference between the numbers of customers of that customer category in the two customer sets corresponding to the two service categories. After the distance function corresponding to each dimension is obtained, the distance function corresponding to the service categories may be set to the square root of the sum of the squares of the distance functions of all the dimensions. Determining the distance function for the service categories in effect determines the distance between any two service categories.
Further, a maximum risk category corresponding to each service category may also be obtained, where the maximum risk category is the risk category with the highest risk probability among all risk categories of that service category.
S103, selecting a plurality of service categories from the service category set as cluster centers, wherein each cluster center corresponds to one service category subset, and the initial element of each service category subset is only the service category serving as its cluster center.
S104, for each service category, executing the following steps:
selecting, from all the cluster centers, those whose maximum risk category is consistent with the maximum risk category of the service category; calculating the distance between each selected cluster center and the service category based on the distance function corresponding to the service categories; selecting the minimum of these distances as the first center distance of the service category, and taking the service category subset of the cluster center that attains this minimum as the first service category subset of the service category; calculating, based on the same distance function, the boundary distance between the service category and the service category subset of each cluster center whose maximum risk category is consistent with that of the service category; selecting the minimum of these boundary distances as the first boundary distance of the service category, and taking the service category subset that attains this minimum as the second service category subset of the service category;
selecting, from all the cluster centers, those whose maximum risk category is inconsistent with the maximum risk category of the service category; calculating the distance between each selected cluster center and the service category based on the distance function corresponding to the service categories, and selecting the minimum of these distances as the second center distance of the service category; calculating, based on the same distance function, the boundary distance between the service category and the service category subset of each cluster center whose maximum risk category is inconsistent with that of the service category, and selecting the minimum of these boundary distances as the second boundary distance of the service category;
if the absolute value of the difference between the first center distance and the second center distance is greater than or equal to the absolute value of the difference between the first boundary distance and the second boundary distance, and the first boundary distance is smaller than the second boundary distance, assigning the service category to its second service category subset;
if the absolute value of the difference between the first center distance and the second center distance is smaller than the absolute value of the difference between the first boundary distance and the second boundary distance, and the first center distance is smaller than the second center distance, assigning the service category to its first service category subset;
otherwise, creating a new cluster center based on the service category, wherein the new cluster center corresponds to a new service category subset whose initial element is only this service category;
S105, after step S104 has been executed for all service categories, determining, for each service category subset, the multidimensional data and the maximum risk category of the mean center of the subset according to the multidimensional data and the maximum risk categories of all service categories in the subset, and determining the change value of the subset; the change value of a service category subset is determined from the cluster center and the mean center of that subset;
Specifically: for each of the multiple dimensions, the mean of the values of all service categories of the subset in that dimension may be used as the value of the subset's mean center in that dimension; for the maximum risk category, the maximum risk category that occurs most often among the service categories of the subset may be used as the maximum risk category of the subset's mean center; and the square root of the weighted sum of the squared differences, over the dimensions, between the value of the subset's cluster center and the value of its mean center is taken as the change value of the subset.
S106, if the change value of any service category subset is greater than a preset threshold, setting new cluster centers based on the mean centers obtained in S105, wherein each new cluster center corresponds to a new service category subset whose initial element is only that new cluster center; then, based on the new cluster centers and the new service category subsets, executing step S104 again for each service category and step S105 again for each service category subset, until the change values of all service category subsets are less than or equal to the preset threshold;
S107, if the change values of all service category subsets are less than or equal to the preset threshold, stopping the cluster analysis of the service categories, thereby obtaining a plurality of service category subsets.
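The branch rule of S104 and the mean center and change value of S105 can be sketched as small functions. This is only an illustrative reading of the steps above: the function names, the string labels returned, and the default of equal weights are assumptions, and the caller is assumed to have already computed the four minimum distances.

```python
import math
from collections import Counter

def assign(first_center, second_center, first_boundary, second_boundary):
    """Decide which S104 branch applies for one service category.

    first_*  : minimum distances to clusters whose maximum risk category matches.
    second_* : minimum distances to clusters whose maximum risk category differs.
    The returned labels are hypothetical names for the three outcomes.
    """
    center_gap = abs(first_center - second_center)
    boundary_gap = abs(first_boundary - second_boundary)
    if center_gap >= boundary_gap and first_boundary < second_boundary:
        return "second_subset"  # subset attaining the first boundary distance
    if center_gap < boundary_gap and first_center < second_center:
        return "first_subset"   # subset attaining the first center distance
    return "new_center"         # start a new subset from this category

def mean_center(members, risks):
    """Per-dimension mean of the members plus their most frequent maximum risk category."""
    center = [sum(col) / len(members) for col in zip(*members)]
    majority_risk = Counter(risks).most_common(1)[0][0]
    return center, majority_risk

def change_value(cluster_center, mean_ctr, weights=None):
    """Square root of the weighted sum of squared per-dimension differences.

    Assumption: equal weights of 1.0 when none are supplied.
    """
    weights = weights or [1.0] * len(cluster_center)
    return math.sqrt(sum(w * (c - m) ** 2
                         for w, c, m in zip(weights, cluster_center, mean_ctr)))

print(assign(1.0, 5.0, 2.0, 3.0))
print(mean_center([[2, 4], [4, 8]], ["fraud", "fraud"]))
print(change_value([0, 0], [3, 4]))
```

Clustering would stop, as in S107, once `change_value` for every subset is at or below the chosen threshold.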
In one embodiment, calculating the boundary distance between the subset of service classes and the service class based on the distance function corresponding to the service class may be determining the distance between the service class and each service class in the subset of service classes based on the distance function corresponding to the service class, selecting a minimum value from the determined distances, and determining the minimum value as the boundary distance between the subset of service classes and the service class.
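A minimal sketch of this boundary distance, combined with the service category distance described for S102 (per-dimension absolute difference of customer counts, combined as the square root of the sum of squares), might look as follows. The vector representation of a service category as a list of customer counts per customer category and the function names are assumptions for illustration.

```python
import math

def class_distance(counts_a, counts_b):
    """Distance between two service categories.

    counts_x[i] is the number of customers of customer category i who use
    service category x (a hypothetical encoding of the multidimensional data).
    """
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(counts_a, counts_b)))

def boundary_distance(category, subset):
    """Minimum distance from a service category to any member of a subset."""
    return min(class_distance(category, member) for member in subset)

print(class_distance([3, 0], [0, 4]))
print(boundary_distance([0, 0], [[3, 4], [6, 8]]))
```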
In one embodiment, calculating the boundary distance between the service category subset and the service category based on the distance function corresponding to the service category may be to first construct a multidimensional space based on multiple dimensions, then determine multiple points of the service category subset corresponding to the multidimensional space based on the multidimensional data of each service category of the service category subset, and determine one point of the service category corresponding to the multidimensional space based on the multidimensional data of the service category; constructing a geometrical body of the business class subset corresponding to the multi-dimensional space based on the plurality of points of the business class subset corresponding to the multi-dimensional space; and then taking the distance between the point of the service class corresponding to the multidimensional space and the geometry as the boundary distance between the service class subset and the service class.
In one embodiment, the maximum risk category is the risk category with the greatest risk among all risk categories of a service category. Banking business and risk are directly related: service categories with different risks differ in their customer groups and main workflows. Requiring the maximum risk categories of the service categories in a subset to be the same therefore improves the accuracy of the clustering, i.e. the service categories assigned to the same service category subset are genuinely similar.
In an embodiment, after the distance function corresponding to the service class is obtained in S102, an existing clustering algorithm (for example, K-means) may be directly selected to perform clustering analysis on the service class, and the service class set is divided into a plurality of service class subsets.
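For this simpler alternative, a plain K-means (Lloyd's algorithm) over the customer-count vectors could be sketched as below. It assumes that each service category is encoded as a vector of customer counts per customer category, so that ordinary Euclidean distance coincides with the distance function described for S102; the initial centers, the sample data and the function name are hypothetical.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centers).

    `centers` are the caller-supplied initial centers (an assumption here;
    the text does not say how they are chosen).
    """
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster becomes empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical customer counts (two customer categories) for four service categories.
points = [[120, 10], [110, 15], [8, 90], [12, 95]]
labels, centers = kmeans(points, [points[0], points[2]])
print(labels, centers)
```

Unlike the procedure of S103 to S107, this variant keeps the number of clusters fixed and ignores the maximum risk category.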
In S2, referring to fig. 3, the specific process of acquiring the business transaction data of each banking outlet in the predetermined area and determining the business category vector, the risk category vector, and the main risk category corresponding to each banking outlet includes:
S201, for each banking outlet in the predetermined area, determining, according to the corresponding historical service data, the service handling data of the services belonging to each service category subset among all the service handling data, and setting the business category vector of the banking outlet; the length of the business category vector is equal to the number of business category subsets, the components of the business category vector correspond to the business category subsets one by one, and the value of each component is equal to the business volume, in the business handling data of the banking outlet, of the business category subset corresponding to that component;
S202, determining the risk categories of each banking outlet and the risk probability of the banking outlet for each risk category according to the risk data in the historical service data of the banking outlet, and setting the risk category vector of the banking outlet; the length of the risk category vector is equal to the number of risk categories of the banking outlet, the components of the risk category vector correspond to the risk categories one by one, and the value of each component is set to the risk probability of the banking outlet for the risk category corresponding to that component;
Specifically, to determine the risk categories of a banking outlet and its risk probability for each risk category from the risk data in its historical business data, the risk categories contained in the risk data of all transaction data are taken as the risk categories of the banking outlet. For each risk category, the ratio of the quantity of transaction data related to the risk category in the transaction data of each customer of the banking outlet to the quantity of that customer's transaction data is determined and taken as the risk probability of the customer for the risk category, and the mean of the risk probabilities of all customers of the banking outlet for the risk category is taken as the risk probability of the banking outlet for the risk category.
According to the law of large numbers, the more data there is, the closer the ratio is to the actual risk probability; that is, when there is enough data, a more accurate risk probability can be obtained. Therefore, when the number of customers of the banking outlet is not smaller than a preset value, the risk probability of the banking outlet for each risk category is computed as described above. If the number of customers is smaller than the preset value, a business category distance function corresponding to the banking outlets is determined, whose two arguments are two banking outlets and whose value (namely the distance between the two banking outlets) is the distance between the two business category vectors corresponding to the two banking outlets. The business category distances between the other banking outlets and this banking outlet are then determined based on this function, and a suitable distance threshold is selected such that the total number of customers of this banking outlet and of the banking outlets whose distance to it is smaller than the distance threshold is larger than the preset value. In this case, for each risk category, the risk probability of the banking outlet for the risk category is set to the mean of the risk probabilities of all these customers for the risk category, where the risk probability of each customer for the risk category is equal to the ratio of the quantity of transaction data related to the risk category in the customer's transaction data to the quantity of the customer's transaction data.
The preset values may be:
Figure BDA0003570415060000091
wherein, σ is the maximum value of the variance of the probability distribution satisfied by each risk category in the banking outlet, ε is the set risk probability error threshold, and P is the probability that the error value of the risk probability is greater than ε. σ can be obtained as follows: for each risk category, determining the variance corresponding to the risk category in the banking outlet based on the risk probability of all customers of the banking outlet about the risk category; and setting the sigma as the maximum value of the variance corresponding to each risk category in the banking outlet. For each risk category, the risk probability of a respective customer of the banking site with respect to the risk category may be considered as one sample of a probability distribution that is satisfied by the risk category in the banking site, and based on a plurality of samples of the probability distribution, the variance of the probability distribution may be approximately calculated.
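The formula itself is only available as an image and is not reproduced in this text. As an illustration only, the sketch below assumes a Chebyshev-type bound, which is one standard bound consistent with the definitions of σ, ε and P given above: for n samples with variance at most σ, the probability that the sample mean deviates from the true mean by more than ε is at most σ/(n·ε²), so n ≥ σ/(P·ε²) suffices. The function name `chebyshev_min_samples` is hypothetical and the actual formula in the figure may differ.

```python
import math

def chebyshev_min_samples(max_variance, error_threshold, failure_probability):
    """Smallest n with max_variance / (n * error_threshold**2) <= failure_probability.

    Assumption: a Chebyshev bound on the error of the sample mean; this is
    not confirmed to be the formula shown in the original figure.
    """
    if error_threshold <= 0 or not 0 < failure_probability <= 1:
        raise ValueError("error_threshold must be > 0 and failure_probability in (0, 1]")
    return math.ceil(max_variance / (failure_probability * error_threshold ** 2))
```

For example, with a maximum variance of 1.0, an error threshold of 0.5 and a failure probability of 0.25, the bound asks for at least 16 customers.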
S203, for each risk category, calculating the difference between the risk probability of the banking outlet for that risk category and the corresponding preset threshold, and taking the risk category with the largest difference as the main risk category of the banking outlet.
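Steps S201 to S203 can be sketched as follows. This is a minimal illustration, not the patented implementation: the record layout (a `category` field and a `risks` set per transaction), the `category_to_subset` mapping and all function names are assumptions introduced here.

```python
from collections import Counter

def business_category_vector(transactions, category_to_subset, num_subsets):
    """S201: count an outlet's transactions per business category subset."""
    counts = Counter(category_to_subset[t["category"]] for t in transactions)
    return [counts.get(i, 0) for i in range(num_subsets)]

def risk_category_vector(customer_transactions, risk_categories):
    """S202: per risk category, the mean over customers of each customer's
    share of transactions flagged with that risk category."""
    vector = []
    for risk in risk_categories:
        ratios = [
            sum(1 for t in txs if risk in t["risks"]) / len(txs)
            for txs in customer_transactions.values() if txs
        ]
        vector.append(sum(ratios) / len(ratios) if ratios else 0.0)
    return vector

def main_risk_category(risk_vector, risk_categories, thresholds):
    """S203: the risk category whose probability exceeds its threshold most."""
    diffs = [p - thresholds[r] for p, r in zip(risk_vector, risk_categories)]
    return risk_categories[diffs.index(max(diffs))]
```

Note that the main risk category is chosen by the largest margin over its own threshold, so a category with a lower raw probability can still be selected when its threshold is stricter.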
In S3, referring to fig. 4, the specific process of performing cluster analysis on the banking outlets in the predetermined area according to the service category vector, the risk category vector, and the main risk category corresponding to each banking outlet to obtain a plurality of banking outlet subsets includes:
S301, determining a distance function corresponding to the business category vector according to the Euclidean distance of the business category vector; determining a distance function corresponding to the risk category vector according to the Euclidean distance of the risk category vector;
S302, determining a distance function corresponding to the banking outlets according to the distance function corresponding to the business category vector and the distance function corresponding to the risk category vector;
Specifically, a distance function calculates the distance between any two elements; for example, the distance function corresponding to the banking outlets calculates the distance between any two banking outlets. For example, for any two banking outlets, the distance between them may be set to the Euclidean distance between their two business category vectors divided by a set value, plus the Euclidean distance between their two risk category vectors, where the set value is equal to the maximum Euclidean distance between the business category vectors over all pairs of banking outlets. This yields the distance function corresponding to the banking outlets.
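The example distance above can be sketched as follows. The dictionary layout with `business` and `risk` keys and the names `euclidean` and `make_outlet_distance` are assumptions made for illustration.

```python
import math
from itertools import combinations

def euclidean(u, v):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def make_outlet_distance(outlets):
    """Build the outlet distance function from a list of outlet records.

    Each record is assumed to hold a 'business' vector and a 'risk' vector.
    """
    # Set value: the largest business-vector distance over all outlet pairs.
    scale = max(
        (euclidean(a["business"], b["business"])
         for a, b in combinations(outlets, 2)),
        default=0.0,
    )

    def distance(x, y):
        # Normalised business distance plus raw risk distance.
        business_part = euclidean(x["business"], y["business"]) / scale if scale else 0.0
        return business_part + euclidean(x["risk"], y["risk"])

    return distance
```

Dividing by the maximum pairwise distance keeps the business term in [0, 1], so that business volumes, which can be large counts, do not swamp the risk probabilities.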
After the distance function corresponding to the banking outlets is obtained, cluster analysis is performed on the banking outlets in the predetermined area to obtain a plurality of banking outlet subsets.
In one embodiment, based on the determined distance function corresponding to the banking outlets, the banking outlets in the predetermined area are clustered by using the K-means, and a plurality of banking outlet subsets are obtained.
In one embodiment, according to the determined distance function corresponding to the banking outlets, the following method is adopted to cluster the banking outlets in the predetermined area to obtain a plurality of banking outlet subsets, which is specifically as follows:
selecting a plurality of bank outlets from the bank outlets as clustering centers, wherein each clustering center corresponds to a bank outlet subset, and the initial elements of the bank outlet subsets only comprise the bank outlets corresponding to the corresponding clustering centers;
for each banking outlet, the following steps are carried out:
1) selecting a plurality of clustering centers of which the corresponding main risk categories are consistent with the main risk categories of the bank outlets from all the existing clustering centers; calculating the distance between each selected clustering center and the bank outlet based on a distance function corresponding to the bank outlet, then selecting a minimum value from the plurality of distances as a first minimum distance corresponding to the bank outlet, and taking the clustering center corresponding to the minimum value as the clustering center corresponding to the bank outlet; calculating the distance between each unselected clustering center and the bank outlet based on a distance function corresponding to the bank outlet, and then selecting a minimum value from the plurality of distances as a second minimum distance corresponding to the bank outlet;
2) if the difference between the corresponding first minimum distance and the corresponding second minimum distance is larger than or equal to a set distance threshold, a new clustering center is established based on the banking outlet, the newly established clustering center corresponds to a new banking outlet subset, and the initial elements of the new banking outlet subset only comprise this banking outlet; otherwise, the banking outlet is assigned to the banking outlet subset of the clustering center corresponding to the banking outlet;
3) after the above steps are executed for all the banking outlets, for each banking outlet subset, the business category vector of the clustering center corresponding to the subset is updated to the mean of the business category vectors of all the banking outlets of the subset, the risk category vector of the clustering center is updated to the mean of the risk category vectors of all the banking outlets of the subset, and the main risk category of the clustering center is updated to the most frequent main risk category among the main risk categories of all the banking outlets of the subset;
4) repeating steps 1) to 2) for each banking outlet and step 3) for each banking outlet subset until the business category vectors and the risk category vectors of the clustering centers no longer change or change only slightly, thereby obtaining a plurality of banking outlet subsets.
Wherein, the main risk category is the risk category with the highest risk among all risk categories of a banking outlet. Banking business and risk are directly related: different risks imply different business processes, risk audits and corresponding audit times, and therefore different probability distributions of the tellers' service handling duration. Requiring that banking outlets placed in the same subset share the same main risk category improves the clustering accuracy, i.e. the probability distributions of the service handling duration of tellers at banking outlets in the same banking outlet subset are approximately the same.
S303, according to the determined distance function corresponding to the bank outlets, clustering the bank outlets in the preset area to obtain a plurality of bank outlet subsets.
In S4, referring to fig. 5, the specific process of determining the service transaction duration of each service category according to the service transaction data of the teller in each subset of banking outlets when processing the service to obtain the probability density function of the service transaction duration of each service category includes:
S401, when the data volume of the service handling data is larger than a preset threshold value, determining a probability density function of service handling duration of a service category corresponding to the service handling data.
In an actual application scenario, according to the law of large numbers, an accurate statistical estimate can be obtained only when there is enough data; therefore, a preset threshold is adopted in this step: the data volume of the service handling data is compared with the preset threshold, and if the data volume is larger than the preset threshold, the estimate of the probability distribution is considered accurate. The preset threshold value is required to be greater than or equal to
Figure BDA0003570415060000121
Wherein, θ is the maximum value of the variance of the probability distribution corresponding to the service transaction duration of each service category of the banking outlet, τ is the set error threshold of the mean value of the service transaction durations, and Q is the probability that the error of the mean value of the service transaction durations is greater than τ.
S402, for each service type, determining a confidence interval corresponding to the service type according to the probability density function of the service handling duration of the service type.
S403, when the service volume processed by the bank branch is greater than the threshold, the bank server encrypts the confidence interval corresponding to each service type and sends the confidence interval to the bank branch;
S404, the banking outlet determines the service category and the corresponding service handling duration according to the service handling data acquired in real time when the teller handles the service;
S405, if the banking outlet stores the confidence interval of the service category, determining whether the teller's service handling is risky according to the confidence interval: if the service handling duration is not within the confidence interval, it is judged to be risky; otherwise, it is judged to be risk-free;
S406, if the banking outlet does not store the confidence interval of the service category, the banking outlet sends the service category and the service handling duration to the bank server, and the bank server determines the probability corresponding to the service handling data according to the probability density function of the service handling duration of the service category and the service handling duration; if the probability is smaller than a set threshold value, it is judged to be risky; otherwise, it is judged to be risk-free.
Specifically, issuing the encrypted confidence intervals to the banking outlets speeds up risk judgment. Because the probability density function of the service handling duration is very important information, only the confidence interval is issued, so that the probability density function itself cannot be illegally obtained.
The specific process for determining the confidence interval comprises the following steps:
calculating the mean value of the service handling duration; setting a probability value A according to the probability density function of the service handling duration of the service category;
obtaining the minimum value (denoted n) that satisfies the following condition: the value is greater than the mean, and the integral of the probability density function over the interval from the mean to the value is greater than 1/2 of the probability value A;
obtaining the maximum value (denoted m) that satisfies the following condition: the value is less than the mean, and the integral of the probability density function over the interval from the value to the mean is greater than 1/2 of the probability value A;
the confidence interval may be set to [m, n].
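The search for m and n can be sketched numerically. The sketch below is an illustration only: it assumes the probability density function is available as a Python callable and integrates it with the trapezoidal rule in small fixed steps away from the mean; the name `confidence_interval` and the `step` and `max_steps` parameters are hypothetical.

```python
def confidence_interval(pdf, mean, prob_a, step=1e-3, max_steps=1_000_000):
    """Return (m, n): walk outward from the mean on each side until the
    integral of the pdf over the covered interval exceeds prob_a / 2."""

    def walk(direction):
        area, x = 0.0, mean
        for _ in range(max_steps):
            nxt = x + direction * step
            # Trapezoid slice between x and nxt.
            area += 0.5 * (pdf(x) + pdf(nxt)) * step
            x = nxt
            if area > prob_a / 2:
                return x
        raise ValueError("probability mass prob_a / 2 not reached on this side")

    return walk(-1), walk(+1)
```

For a strongly skewed distribution, one side of the mean may hold less than A/2 of the probability mass; in that case the search on that side cannot succeed and the sketch raises an error rather than returning a bound.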
The confidence interval corresponding to each service category is issued to the corresponding banking outlets. Whether to issue may be determined according to the business volume that the bank server needs to process: when the processed business volume is greater than the threshold value, the confidence interval corresponding to each service category is issued to the corresponding banking outlet; otherwise, it is not issued.
According to the confidence interval, a function can be established whose input comprises the service category and the corresponding service handling duration. The function judges whether the service handling duration is within the confidence interval corresponding to the service category; if not, the function outputs "risky", otherwise it outputs "risk-free".
In an actual application scene, the function can be compiled into an executable program and sent to a banking outlet, and the banking outlet determines whether business handling is risky or not based on the function, so that the probability density function can be prevented from being acquired by other people.
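The decision function described above can be sketched as follows. The closure-based layout and the names `make_risk_check` and `is_risky` are assumptions; returning `None` for a category with no stored interval is also an assumption, chosen so that the caller can fall back to asking the bank server.

```python
def make_risk_check(confidence_intervals):
    """Build a checker from a mapping: service category -> (low, high)."""

    def is_risky(category, duration):
        interval = confidence_intervals.get(category)
        if interval is None:
            return None  # no stored interval: defer to the bank server
        low, high = interval
        # Risky exactly when the duration falls outside the interval.
        return not (low <= duration <= high)

    return is_risky
```

Only the interval bounds are embedded in the checker, so the outlet never needs the probability density function itself.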
Furthermore, the probability density function corresponding to the subset of banking outlets may be directly used as the probability density function corresponding to each banking outlet of the subset of banking outlets.
Alternatively, the probability density function corresponding to the banking outlet subset to which a banking outlet belongs can be corrected based on the service handling data of that banking outlet. For example, the probability density function corresponding to the banking outlet subset (with mean c and variance d) is corrected according to the mean a and the variance b determined from the sample data of the banking outlet: the probability density function is translated by a-c, the abscissa is scaled to b/d of the original, and the ordinate is scaled to d/b of the original.
The above processes from S1 to S4 are performed by the bank server.
In S5, referring to fig. 6, the banking outlet obtains service transaction data in real time when the teller handles the service, performs risk analysis on the service transaction data according to the probability density function of the service transaction duration of each service category, stores the risky service transaction data into a block chain node of the banking outlet, uploads it to the bank server, sends risk prompt information to the teller, receives feedback information carrying the teller's digital signature, and stores the feedback information into the block chain. The specific process includes:
S501, determining the service category and the corresponding service handling duration of a teller according to service handling data obtained in real time when the teller handles the service;
S502, the banking outlet sends the service category and the service handling duration to the bank server, and the bank server determines the probability corresponding to the service handling data according to the probability density function of the service handling duration of the service category and the service handling duration; if the probability is smaller than a set threshold value, it is judged to be risky; otherwise, it is judged to be risk-free.
Specifically, the above process finds out whether the teller's business processing has a risk according to the probability density function of the business processing duration.
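The server-side check can be sketched as follows. The text does not specify how a probability is derived from the probability density function for a single duration, so this sketch assumes, for illustration only, a normal distribution and uses the two-sided tail probability of the observed duration; the names `tail_probability` and `duration_is_risky` and the default threshold are hypothetical.

```python
import math

def tail_probability(duration, mean, std):
    """Two-sided tail probability of a duration at least this far from the
    mean, assuming a normal distribution (an illustrative assumption)."""
    z = abs(duration - mean) / std
    return math.erfc(z / math.sqrt(2))

def duration_is_risky(duration, mean, std, threshold=0.05):
    """Judge 'risky' when the duration is improbable under the distribution."""
    return tail_probability(duration, mean, std) < threshold
```

With a mean of 10 and a standard deviation of 2, a duration of 16 lies three standard deviations out and is flagged, while a duration of 11 is not.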
Further, the risk category of the teller can be determined according to historical business data of the teller processing business; determining the risk category of the real-time handled service according to historical service data of the service category to which the real-time handled service belongs; judging whether the risk category of the teller and the risk category of the real-time transacted business have intersection, if so, determining whether the business of the teller has risk or not according to the business transaction data and a risk control model corresponding to the risk category contained in the intersection, storing the risky business transaction data into a block chain node of a bank outlet, and uploading the risky business transaction data to a bank server.
In S6, referring to fig. 7, after the banking outlets close each day, the bank server performs risk analysis on the service transaction data of all tellers of all banking outlets on that day, determines the risks of each teller and each service category, stores the risky service transaction data into a block chain node of the bank server, sends risk prompt information to the tellers, receives feedback information carrying the tellers' digital signatures, and stores the feedback information into the block chain. The specific process includes:
S601, acquiring the service handling duration when a teller handles services, and determining, according to the service handling duration and the probability density function of the service handling duration of each service category, the probability of the service handling duration corresponding to the teller when handling the services of each service category; uploading to the block chain the service handling data whose probability of service handling duration is less than a set threshold value, sending risk prompt information to the teller, and receiving feedback information carrying the teller's digital signature;
S602, acquiring the service handling data of each service category, screening out the service handling data with the minimum probability of service handling duration, judging that service handling data to be risky, sending risk prompt information to the teller, and receiving feedback information carrying the teller's digital signature.
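The two end-of-day screens in S601 and S602 can be sketched together. The record layout (`category` and `duration` fields), the caller-supplied `probability_of` callable and the name `end_of_day_flags` are assumptions made for illustration; block chain storage and teller notification are outside the scope of the sketch.

```python
def end_of_day_flags(records, probability_of, threshold):
    """Return (below_threshold, rarest_per_category).

    below_threshold: records whose duration probability is under the
    threshold (S601). rarest_per_category: for each service category, the
    record with the smallest duration probability (S602).
    """
    scored = [(probability_of(r["category"], r["duration"]), r) for r in records]
    below_threshold = [r for p, r in scored if p < threshold]
    rarest = {}
    for p, r in scored:
        category = r["category"]
        if category not in rarest or p < rarest[category][0]:
            rarest[category] = (p, r)
    return below_threshold, {c: r for c, (p, r) in rarest.items()}
```

The second screen always yields one record per category, even on a day when no record falls below the threshold.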
It should be noted that although the operations of the method of the present invention have been described in the above embodiments and the accompanying drawings in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the operations shown must be performed, to achieve the desired results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
Having described the method of an exemplary embodiment of the present invention, a system for risk analysis based on business process data of an exemplary embodiment of the present invention is described next with reference to FIG. 8.
For implementation of the system for risk analysis based on business transaction data, reference may be made to implementation of the above method, and repeated descriptions are omitted. The term "module" or "unit" used hereinafter may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.
Based on the same inventive concept, the present invention further provides a system for risk analysis based on business transaction data, as shown in fig. 8, the system includes:
the service category clustering analysis module 810 is configured to obtain multidimensional data of each service category in a service category set of a bank outlet, perform clustering analysis on the service categories according to the multidimensional data, and divide the service category set into a plurality of service category subsets;
a service handling data analysis module 820, configured to obtain service handling data of each banking outlet in a predetermined area, and determine a service category vector, a risk category vector, and a main risk category corresponding to each banking outlet; wherein, the components of the service category vector correspond to the service category subsets one by one;
the bank branch clustering analysis module 830 is configured to perform clustering analysis on the bank branches in the predetermined area according to the service category vector, the risk category vector, and the main risk category corresponding to each bank branch to obtain a plurality of bank branch subsets;
a probability density function determination module 840, configured to determine the service transaction duration of each service category according to the service transaction data of the teller in each bank branch subset when processing the service, so as to obtain a probability density function of the service transaction duration of each service category;
the risk analysis module 850 is used for acquiring the business handling data when the teller handles the business in real time by the bank outlets, performing risk analysis on the business handling data according to the probability density function of the business handling time of each business category, storing the business handling data with risks into the block chain nodes of the bank outlets, uploading the business handling data to the bank server, sending risk prompt information to the teller, receiving feedback information of the digital signature of the teller, and storing the feedback information into the block chain.
In an embodiment, risk analysis module 850 is further configured to:
after the bank outlets stop business every day, the bank server carries out risk analysis on business handling data of all tellers of all bank outlets on the day, determines risks of all tellers and all business types, stores the business handling data with the risks into a block chain node of the bank server, sends risk prompt information to the tellers, receives feedback information of digital signatures of the tellers, and stores the feedback information into a block chain.
In an embodiment, the business transaction data analysis module 820 is specifically configured to:
for each banking outlet in the predetermined area, determining, according to the corresponding historical service data, the service handling data of the services belonging to each service category subset among all service handling data, and setting the business category vector of the banking outlet; the length of the business category vector is equal to the number of business category subsets, the components of the business category vector correspond to the business category subsets one by one, and the value of each component is equal to the business volume, in the business handling data of the banking outlet, of the business category subset corresponding to that component;
determining the risk category of each bank outlet and the risk probability of the bank outlet about each risk category according to the risk data in the historical service data of each bank outlet, and setting the risk category vector of the bank outlet; the length of the risk category vector is equal to the number of risk categories of the bank outlets, the components of the risk category vector correspond to the risk categories one by one, and the value of the component is set as the risk probability of the bank outlets about the risk category corresponding to the component;
and calculating the difference value between the risk probability and the corresponding preset threshold value, and taking the risk category with the maximum difference value as the main risk category of the banking outlet.
In an embodiment, the bank branch clustering analysis module 830 is specifically configured to:
determining a distance function corresponding to the business category vector according to the Euclidean distance of the business category vector; determining a distance function corresponding to the risk category vector according to the Euclidean distance of the risk category vector;
determining a distance function corresponding to the bank outlets according to the distance function corresponding to the business category vector and the distance function corresponding to the risk category vector;
and clustering the bank outlets in the preset area according to the determined distance function corresponding to the bank outlets to obtain a plurality of bank outlet subsets.
In one embodiment, risk analysis module 850 is specifically configured to:
determining the service class and the corresponding service handling time length of a teller according to service handling data obtained in real time when the teller handles the service;
the banking outlet sends the service category and the service handling duration to the bank server, and the bank server determines the probability corresponding to the service handling data according to the probability density function of the service handling duration of the service category and the service handling duration; if the probability is smaller than a set threshold value, it is judged to be risky; otherwise, it is judged to be risk-free.
In an embodiment, the probability density function determining module 840 is specifically configured to:
when the data volume of the service handling data is larger than a preset threshold value, determining a probability density function of service handling duration of a service category corresponding to the service handling data;
for each service type, determining a confidence interval corresponding to the service type according to the probability density function of the service handling duration of the service type;
when the service volume processed by the bank outlets is larger than the threshold value, the bank server encrypts confidence intervals corresponding to all service types and sends the confidence intervals to the bank outlets;
the bank outlets determine the service types and corresponding service handling durations according to the service handling data acquired in real time when the tellers handle the services;
if the banking outlet stores the confidence interval of the service category, determining whether the teller's service handling is risky according to the confidence interval: if the service handling duration is not within the confidence interval, it is judged to be risky; otherwise, it is judged to be risk-free;
if the banking outlet does not store the confidence interval of the service class, the banking outlet sends the service class and the service handling time to a banking server, and the banking server determines the probability corresponding to the service handling data according to the probability density function of the service handling time of the service class and the service handling time; and if the probability is smaller than a set threshold value, judging that the risk exists, otherwise, judging that the risk does not exist.
In one embodiment, risk analysis module 850 is specifically configured to:
acquiring the service handling duration when a teller handles services, and determining, according to the service handling duration and the probability density function of the service handling duration of each service category, the probability of the service handling duration corresponding to the teller when handling the services of each service category; uploading to the block chain the service handling data whose probability of service handling duration is smaller than a set threshold value, sending risk prompt information to the teller, and receiving feedback information carrying the teller's digital signature;
acquiring the service handling data of each service category, screening out the service handling data with the minimum probability of service handling duration, judging that service handling data to be risky, sending risk prompt information to the teller, and receiving feedback information carrying the teller's digital signature.
It should be noted that although several modules of the system for risk analysis based on business transaction data are mentioned in the above detailed description, such partitioning is merely exemplary and not mandatory. Indeed, the features and functionality of two or more of the modules described above may be embodied in one module according to embodiments of the invention. Conversely, the features and functions of one module described above may be further divided and embodied by a plurality of modules.
Based on the aforementioned inventive concept, as shown in fig. 9, the present invention further provides a computer device 900, which includes a memory 910, a processor 920 and a computer program 930 stored on the memory 910 and operable on the processor 920, wherein the processor 920 executes the computer program 930 to implement the aforementioned method for risk analysis based on business transaction data.
Based on the foregoing inventive concept, the present invention provides a computer-readable storage medium storing a computer program, which when executed by a processor implements the foregoing method for risk analysis based on business transaction data.
Based on the aforementioned inventive concept, the present invention proposes a computer program product comprising a computer program which, when executed by a processor, implements a method for risk analysis based on business transaction data.
The method and system for risk analysis based on business handling data cluster the service categories so as to divide the service category set into a plurality of service category subsets; analyze the business handling data of each bank outlet in a predetermined area to determine the business category vector, the risk category vector and the main risk category corresponding to each bank outlet; and perform cluster analysis on the bank outlets in the predetermined area according to those vectors and main risk categories to obtain a plurality of bank outlet subsets. For each bank outlet subset, the service handling duration of each service category is determined from the service handling data recorded when tellers handle services, yielding a probability density function of the service handling duration of each service category. A bank outlet then acquires, in real time, the service handling data generated when a teller handles services, performs risk analysis on that data according to the probability density function of the service handling duration of each service category, stores the risky service handling data in the blockchain node of the bank outlet and uploads it to the bank server, sends risk prompt information to the teller, receives feedback information carrying the teller's digital signature, and stores the feedback information in the blockchain. In this way, abnormal handling behavior by tellers can be discovered, risk is reduced to a certain extent, and the security of banking business handling is improved.
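The outlet-level quantities summarized above can be illustrated with a short sketch: the main risk category is the risk category whose risk probability exceeds its preset threshold by the largest margin, and the distance between two bank outlets combines the Euclidean distances of their business category vectors and risk category vectors. The text does not state how the two distances are combined, so the weighted sum, the equal weights, and all sample values are assumptions.

```python
import math


def main_risk_category(risk_probs, thresholds):
    """Risk category with the largest (risk probability - preset threshold)."""
    return max(risk_probs, key=lambda c: risk_probs[c] - thresholds[c])


def outlet_distance(a, b, w_business=0.5, w_risk=0.5):
    """Distance between two outlets from their two vectors.

    The weighted sum and the weights are assumptions; the source only says
    the outlet distance is determined from the two Euclidean distances.
    """
    d_business = math.dist(a["business"], b["business"])
    d_risk = math.dist(a["risk"], b["risk"])
    return w_business * d_business + w_risk * d_risk


# Hypothetical risk probabilities and preset thresholds for one outlet.
risk_probs = {"fraud": 0.08, "operational": 0.20, "compliance": 0.05}
thresholds = {"fraud": 0.05, "operational": 0.25, "compliance": 0.04}
primary = main_risk_category(risk_probs, thresholds)

# Hypothetical outlets: business volumes per service category subset,
# and risk probabilities per risk category.
outlet_a = {"business": [3.0, 4.0], "risk": [0.1, 0.2]}
outlet_b = {"business": [0.0, 0.0], "risk": [0.1, 0.2]}
distance = outlet_distance(outlet_a, outlet_b)
```

Because business volumes and risk probabilities live on very different scales, a practical implementation would likely normalize the business category vector before combining the two distances. The resulting function could be passed to any clustering routine that accepts a custom pairwise distance.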
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone skilled in the art may, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features; such modifications, changes or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present invention, and they shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (17)

1. A method for risk analysis based on business transaction data, comprising:
obtaining multidimensional data of each service category in a service category set of a bank outlet, carrying out cluster analysis on the service categories according to the multidimensional data, and dividing the service category set into a plurality of service category subsets;
acquiring business handling data of each bank outlet in a predetermined area, and determining a business category vector, a risk category vector and a main risk category corresponding to each bank outlet, wherein the components of the business category vector correspond one-to-one to the service category subsets;
performing cluster analysis on the bank outlets in the predetermined area according to the business category vector, the risk category vector and the main risk category corresponding to each bank outlet to obtain a plurality of bank outlet subsets;
for each bank outlet subset, determining the service handling duration of each service category according to the service handling data recorded when tellers handle services, to obtain a probability density function of the service handling duration of each service category;
acquiring, by a bank outlet in real time, service handling data when a teller handles services, performing risk analysis on the service handling data according to the probability density function of the service handling duration of each service category, storing the risky service handling data in a blockchain node of the bank outlet and uploading it to a bank server, sending risk prompt information to the teller, receiving feedback information carrying the teller's digital signature, and storing the feedback information in the blockchain.
2. The method of claim 1, further comprising:
after the bank outlets close for business each day, performing, by the bank server, risk analysis on the business handling data of all tellers of all bank outlets for that day, determining the risk of each teller and of each service category, storing the risky business handling data in a blockchain node of the bank server, sending risk prompt information to the tellers, receiving feedback information carrying the tellers' digital signatures, and storing the feedback information in the blockchain.
3. The method as claimed in claim 1, wherein the step of obtaining the business transaction data of each banking outlet in the predetermined area and determining the business category vector, the risk category vector and the main risk category corresponding to each banking outlet comprises:
for each bank outlet in the predetermined area, determining, according to the corresponding historical service data, the service handling data that belongs to each service category subset among all service handling data, and setting the business category vector of the bank outlet; wherein the length of the business category vector is equal to the number of service category subsets, the components of the business category vector correspond one-to-one to the service category subsets, and the value of each component is equal to the business volume, in the business handling data of the bank outlet, of the service category subset corresponding to that component;
determining, according to the risk data in the historical service data of each bank outlet, the risk categories of the bank outlet and the risk probability of the bank outlet for each risk category, and setting the risk category vector of the bank outlet; wherein the length of the risk category vector is equal to the number of risk categories of the bank outlet, the components of the risk category vector correspond one-to-one to the risk categories, and the value of each component is set to the risk probability of the bank outlet for the risk category corresponding to that component;
and calculating, for each risk category, the difference between the risk probability and the corresponding preset threshold, and taking the risk category with the largest difference as the main risk category of the bank outlet.
4. The method as claimed in claim 3, wherein performing cluster analysis on the bank outlets in the predetermined area according to the business category vector, the risk category vector and the main risk category corresponding to each bank outlet to obtain a plurality of bank outlet subsets comprises:
determining a distance function corresponding to the business category vector according to the Euclidean distance between business category vectors; determining a distance function corresponding to the risk category vector according to the Euclidean distance between risk category vectors;
determining a distance function corresponding to the bank outlets according to the distance function corresponding to the business category vector and the distance function corresponding to the risk category vector;
and clustering the bank outlets in the predetermined area according to the determined distance function corresponding to the bank outlets to obtain a plurality of bank outlet subsets.
5. The method as claimed in claim 1, wherein acquiring, by the bank outlet in real time, the service handling data when the teller handles services, performing risk analysis on the service handling data according to the probability density function of the service handling duration of each service category, storing the risky service handling data in the blockchain node of the bank outlet and uploading it to the bank server, sending risk prompt information to the teller, receiving feedback information carrying the teller's digital signature, and storing the feedback information in the blockchain comprises:
determining the service category and the corresponding service handling duration according to the service handling data acquired in real time when the teller handles the service;
sending, by the bank outlet, the service category and the service handling duration to the bank server, and determining, by the bank server, the probability corresponding to the service handling data according to the service handling duration and the probability density function of the service handling duration of the service category; and if the probability is smaller than a set threshold, judging that a risk exists; otherwise, judging that no risk exists.
6. The method of claim 1, wherein determining the service handling duration of each service category according to the service handling data when the teller handles services in each bank outlet subset, to obtain the probability density function of the service handling duration of each service category, comprises:
when the data volume of the service handling data is larger than a preset threshold, determining the probability density function of the service handling duration of the service category corresponding to the service handling data;
for each service category, determining a confidence interval corresponding to the service category according to the probability density function of the service handling duration of the service category;
when the service volume processed by a bank outlet is larger than the threshold, encrypting, by the bank server, the confidence intervals corresponding to all service categories and sending them to the bank outlet;
determining, by the bank outlet, the service category and the corresponding service handling duration according to the service handling data acquired in real time when the teller handles services;
if the bank outlet stores the confidence interval of the service category, determining according to the confidence interval whether the teller's service handling is risky: if the service handling duration is not within the confidence interval, judging that a risk exists; otherwise, judging that no risk exists;
if the bank outlet does not store the confidence interval of the service category, sending, by the bank outlet, the service category and the service handling duration to the bank server, and determining, by the bank server, the probability corresponding to the service handling data according to the service handling duration and the probability density function of the service handling duration of the service category; and if the probability is smaller than a set threshold, judging that a risk exists; otherwise, judging that no risk exists.
7. The method as claimed in claim 2, wherein, after the bank outlets close for business each day, performing, by the bank server, risk analysis on the business handling data of all tellers of all bank outlets for that day, determining the risk of each teller and of each service category, storing the risky business handling data in the blockchain node of the bank server, sending risk prompt information to the tellers, receiving feedback information carrying the tellers' digital signatures, and storing the feedback information in the blockchain comprises:
acquiring the service handling duration when a teller handles services, and determining, according to the service handling duration and the probability density function of the service handling duration of each service category, the probability of the teller's service handling duration for each service category; uploading to the blockchain the service handling data whose service handling duration probability is smaller than a set threshold, sending risk prompt information to the teller, and receiving feedback information carrying the teller's digital signature;
acquiring the service handling data of each service category, screening out the service handling data with the lowest service handling duration probability, judging that data to be risky, sending risk prompt information to the teller, and receiving feedback information carrying the teller's digital signature.
8. A system for risk analysis based on business transaction data, comprising:
the service class clustering analysis module is used for acquiring multidimensional data of each service category in a service category set of a bank outlet, performing cluster analysis on the service categories according to the multidimensional data, and dividing the service category set into a plurality of service category subsets;
the business handling data analysis module is used for acquiring business handling data of each bank outlet in a predetermined area and determining a business category vector, a risk category vector and a main risk category corresponding to each bank outlet, wherein the components of the business category vector correspond one-to-one to the service category subsets;
the bank outlet cluster analysis module is used for performing cluster analysis on the bank outlets in the predetermined area according to the business category vector, the risk category vector and the main risk category corresponding to each bank outlet to obtain a plurality of bank outlet subsets;
the probability density function determining module is used for determining, for each bank outlet subset, the service handling duration of each service category according to the service handling data recorded when tellers handle services, to obtain a probability density function of the service handling duration of each service category;
the risk analysis module is used for acquiring, by a bank outlet in real time, service handling data when a teller handles services, performing risk analysis on the service handling data according to the probability density function of the service handling duration of each service category, storing the risky service handling data in a blockchain node of the bank outlet and uploading it to a bank server, sending risk prompt information to the teller, receiving feedback information carrying the teller's digital signature, and storing the feedback information in the blockchain.
9. The system of claim 8, wherein the risk analysis module is further configured to:
after the bank outlets close for business each day, performing, by the bank server, risk analysis on the business handling data of all tellers of all bank outlets for that day, determining the risk of each teller and of each service category, storing the risky business handling data in a blockchain node of the bank server, sending risk prompt information to the tellers, receiving feedback information carrying the tellers' digital signatures, and storing the feedback information in the blockchain.
10. The system of claim 8, wherein the business handling data analysis module is specifically configured to:
for each bank outlet in the predetermined area, determining, according to the corresponding historical service data, the service handling data that belongs to each service category subset among all service handling data, and setting the business category vector of the bank outlet; wherein the length of the business category vector is equal to the number of service category subsets, the components of the business category vector correspond one-to-one to the service category subsets, and the value of each component is equal to the business volume, in the business handling data of the bank outlet, of the service category subset corresponding to that component;
determining, according to the risk data in the historical service data of each bank outlet, the risk categories of the bank outlet and the risk probability of the bank outlet for each risk category, and setting the risk category vector of the bank outlet; wherein the length of the risk category vector is equal to the number of risk categories of the bank outlet, the components of the risk category vector correspond one-to-one to the risk categories, and the value of each component is set to the risk probability of the bank outlet for the risk category corresponding to that component;
and calculating, for each risk category, the difference between the risk probability and the corresponding preset threshold, and taking the risk category with the largest difference as the main risk category of the bank outlet.
11. The system of claim 10, wherein the bank outlet cluster analysis module is specifically configured to:
determining a distance function corresponding to the business category vector according to the Euclidean distance between business category vectors; determining a distance function corresponding to the risk category vector according to the Euclidean distance between risk category vectors;
determining a distance function corresponding to the bank outlets according to the distance function corresponding to the business category vector and the distance function corresponding to the risk category vector;
and clustering the bank outlets in the predetermined area according to the determined distance function corresponding to the bank outlets to obtain a plurality of bank outlet subsets.
12. The system of claim 8, wherein the risk analysis module is specifically configured to:
determining the service category and the corresponding service handling duration according to the service handling data acquired in real time when the teller handles the service;
sending, by the bank outlet, the service category and the service handling duration to the bank server, and determining, by the bank server, the probability corresponding to the service handling data according to the service handling duration and the probability density function of the service handling duration of the service category; and if the probability is smaller than a set threshold, judging that a risk exists; otherwise, judging that no risk exists.
13. The system of claim 8, wherein the probability density function determining module is specifically configured to:
when the data volume of the service handling data is larger than a preset threshold value, determining a probability density function of service handling duration of a service category corresponding to the service handling data;
for each service category, determining a confidence interval corresponding to the service category according to the probability density function of the service handling duration of the service category;
when the service volume processed by a bank outlet is larger than the threshold, encrypting, by the bank server, the confidence intervals corresponding to all service categories and sending them to the bank outlet;
determining, by the bank outlet, the service category and the corresponding service handling duration according to the service handling data acquired in real time when the teller handles services;
if the bank outlet stores the confidence interval of the service category, determining according to the confidence interval whether the teller's service handling is risky: if the service handling duration is not within the confidence interval, judging that a risk exists; otherwise, judging that no risk exists;
if the bank outlet does not store the confidence interval of the service category, sending, by the bank outlet, the service category and the service handling duration to the bank server, and determining, by the bank server, the probability corresponding to the service handling data according to the service handling duration and the probability density function of the service handling duration of the service category; and if the probability is smaller than a set threshold, judging that a risk exists; otherwise, judging that no risk exists.
14. The system of claim 9, wherein the risk analysis module is specifically configured to:
acquiring the service handling duration when a teller handles services, and determining, according to the service handling duration and the probability density function of the service handling duration of each service category, the probability of the teller's service handling duration for each service category; uploading to the blockchain the service handling data whose service handling duration probability is smaller than a set threshold, sending risk prompt information to the teller, and receiving feedback information carrying the teller's digital signature;
acquiring the service handling data of each service category, screening out the service handling data with the lowest service handling duration probability, judging that data to be risky, sending risk prompt information to the teller, and receiving feedback information carrying the teller's digital signature.
15. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 7 when executing the computer program.
16. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1 to 7.
17. A computer program product, characterized in that the computer program product comprises a computer program which, when being executed by a processor, carries out the method of any one of claims 1 to 7.
CN202210318187.4A 2022-03-29 2022-03-29 Method and system for risk analysis based on business handling data Pending CN114638723A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210318187.4A CN114638723A (en) 2022-03-29 2022-03-29 Method and system for risk analysis based on business handling data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210318187.4A CN114638723A (en) 2022-03-29 2022-03-29 Method and system for risk analysis based on business handling data

Publications (1)

Publication Number Publication Date
CN114638723A true CN114638723A (en) 2022-06-17

Family

ID=81950764

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210318187.4A Pending CN114638723A (en) 2022-03-29 2022-03-29 Method and system for risk analysis based on business handling data

Country Status (1)

Country Link
CN (1) CN114638723A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115099664A (en) * 2022-07-08 2022-09-23 Bank of China Ltd Method and device for controlling internal risk of bank

Similar Documents

Publication Publication Date Title
US20210092160A1 (en) Data set creation with crowd-based reinforcement
US20180006900A1 (en) Predictive anomaly detection in communication systems
CN113626241B (en) Abnormality processing method, device, equipment and storage medium for application program
CN113762377B (en) Network traffic identification method, device, equipment and storage medium
CN111552509A (en) Method and device for determining dependency relationship between interfaces
Hidayat et al. Forecast analysis of research chance on AES algorithm to encrypt during data transmission on cloud computing
CN114638723A (en) Method and system for risk analysis based on business handling data
CN117236656B (en) Informationized management method and system for engineering project
CN114638693A (en) Method and system for determining service type range of bank outlets
CN114723145A (en) Method and system for determining number of intelligent counters based on transaction amount
Ferdiana New approach of ensemble method to improve performance of ids using S-sdn classifier
CN111507397A (en) Abnormal data analysis method and device
CN114926260A (en) Method and system for processing audit risk of bank outlets
CN114707853A (en) Terminal configuration method and system for bank outlets
US11823064B2 (en) Enterprise market volatility prediction through synthetic DNA and mutant nucleotides
US11823065B2 (en) Enterprise market volatility predictions through synthetic DNA and mutant nucleotides
Makkar et al. MFC: A Multishot Approach to Federated Data Clustering
CN114782167A (en) Method and system for controlling customer transaction risk of bank outlets
Hamza et al. Evolutionary constrained optimization with dynamic changes and uncertainty in the objective function
WO2023149120A1 (en) Information processing device, information processing method, and program
US20220358371A1 (en) Digital transaction ledger with dna-related ledger parameter
CN114997269A (en) Face recognition processing method and system for bank outlets
CN114926296A (en) Risk control method and device for new business type of bank
CN114862471A (en) Bank product delivery method and device
CN114925079A (en) Bank customer mobile phone number change processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination