CN112837142A - Financial risk model training method and device - Google Patents

Publication number
CN112837142A
Authority
CN
China
Prior art keywords
data
risk model
training
risk
financial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110071185.5A
Other languages
Chinese (zh)
Inventor
张东凯
吴勇
李宁
陈亚君
蔡朴锐
卢世温
林莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp filed Critical China Construction Bank Corp
Priority to CN202110071185.5A priority Critical patent/CN112837142A/en
Publication of CN112837142A publication Critical patent/CN112837142A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Finance (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Accounting & Taxation (AREA)
  • Evolutionary Biology (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention relates to the technical field of machine learning, and in particular to a financial risk model training method and device. The method comprises: acquiring raw data related to financial risk; constructing feature factors from the raw data; and training a pre-constructed risk model with the feature factor data according to a preset algorithm to obtain a target risk model. The technical scheme provided by the application can improve the accuracy of the risk model.

Description

Financial risk model training method and device
Technical Field
The invention relates to the technical field of machine learning, in particular to a financial risk model training method and device.
Background
With the rapid development of computer and internet technologies, the businesses that various industries conduct over the internet have become increasingly diverse. Supervision of local financial institutions is gradually shifting to a mode based mainly on off-site supervision, in which services are transacted online and local financial institutions are monitored online in real time. In practice, a risk early warning model is usually used to predict risk from an institution's behavior data, and the prediction result indicates the degree of risk in that data. The accuracy of the risk early warning model is therefore of great importance.
Disclosure of Invention
The purpose of this application is to improve the accuracy of risk early warning models for financial institutions. The technical scheme adopted by the application is as follows:
In a first aspect, an embodiment of the present application discloses a financial risk model training method, the method comprising:
acquiring raw data related to financial risk;
constructing feature factors from the raw data;
and training a pre-constructed risk model with the feature factor data according to a preset algorithm to obtain a target risk model.
Further, the acquiring raw data related to financial risk comprises:
acquiring historical business data and fund data of a target financial institution;
according to a preset risk judgment rule, carrying out risk annotation on the historical business data and the fund data;
and determining the business data and fund data for which risk annotation has been completed as the raw data.
Further, the acquiring raw data related to financial risk further comprises:
acquiring government affair data and public opinion data of an area where a target financial institution is located;
according to a preset risk judgment rule, carrying out risk annotation on the government affair data and the public opinion data;
and determining the government affair data and the public opinion data which finish the risk marking as original data.
Further, the preset risk judgment rule includes: a business blacklist and whitelist and business abnormality rules obtained from within the target financial institution and determined as the preset risk judgment rule; and/or
a risk judgment rule for the government affair data and the public opinion data, set according to big data analysis and expert judgment rules.
Further, the acquiring government affair data and public opinion data of the area where the target financial institution is located comprises:
applying to a government affair system and a public opinion analysis system for a data interface;
and acquiring government affair data and public opinion data from the government affair system and the public opinion analysis system through the data interface.
Further, constructing the feature factors from the raw data comprises:
constructing at least one feature factor of the raw data according to the attribute features of the acquired raw data, wherein the feature factors include:
a basic feature factor, a deviation feature factor, and a cross feature factor.
Further, the method further comprises: grouping the characteristic factor data according to a preset permutation and combination rule; each group of characteristic factor data at least comprises two types of data;
training the pre-constructed risk model by utilizing the first class characteristic factor data in each group;
and verifying the target risk model by using the second class characteristic factor data in each group.
Further, the predetermined permutation and combination rule includes, but is not limited to:
combining two classes of feature factors that are adjacent in the time dimension into one group, according to the time at which the raw data underlying the feature factor data was generated; or
randomly dividing the characteristic factor data in the same time range into two types and combining the two types of characteristic factor data into one group.
Further, after the target risk model is verified by using the second class characteristic factor data in each group, the method further includes:
obtaining the verification result;
when the verification result is not in accordance with a set standard result, determining a correction variable according to the difference between the verification result and the standard result;
and correcting the target risk model according to the correction variable.
Further, the training the pre-constructed risk model by using the feature factor data according to the preset algorithm includes but is not limited to:
and training the pre-constructed risk model by utilizing the characteristic factor data according to a preset random forest algorithm or a decision tree algorithm.
Further, the training of the pre-constructed risk model by using the characteristic factor data according to the preset random forest algorithm comprises:
using bootstrap resampling, randomly drawing K samples with replacement from the feature factor data, M times, to form M feature factor data sets, where M and K are both positive integers greater than or equal to 1;
and training the pre-constructed risk model by using the M characteristic factor data sets to obtain a target risk model.
Further, training the pre-constructed risk model with the feature factor data according to the preset decision tree algorithm comprises:
initializing a level parameter corresponding to the decision tree algorithm;
and training on the target training data in a training set with the CART algorithm, and obtaining the original risk model when the number of layers grown by the decision tree reaches the level parameter.
In a second aspect, an embodiment of the present application provides a financial risk model training apparatus, where the apparatus includes: an acquisition module, a construction module, a storage module and a training module, wherein,
the acquisition module is used for acquiring original data related to financial risks;
the construction module is used for constructing the characteristic factors of the original data according to the original data;
the storage module is used for storing a preset algorithm and a pre-constructed risk model;
and the training module is used for training the pre-constructed risk model by utilizing the characteristic factor data according to a preset algorithm to obtain a target risk model.
Further, the device also comprises a data processing module and a verification module, wherein,
the storage module is also used for storing a preset permutation and combination rule;
the data processing module is further configured to group the characteristic factor data according to a predetermined permutation and combination rule; each group of characteristic factor data at least comprises two types of data;
the training module is specifically used for training the pre-constructed risk model by using the first class characteristic factor data in each group;
and the verification module is used for verifying the target risk model by utilizing the second class characteristic factor data in each group.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory;
the memory is used for storing operation instructions;
the processor is configured to execute the method in any of the embodiments by calling the operation instruction.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the computer program implements the method of any one of the above embodiments.
The financial risk model training scheme provided by the embodiments of the application comprises: acquiring raw data related to financial risk; constructing feature factors from the raw data; and training a pre-constructed risk model with the feature factor data according to a preset algorithm to obtain a target risk model. The beneficial effects of this technical scheme include producing feasible and effective predictions from a large data source in a relatively short time and improving the accuracy of the risk early warning model. In addition, part of the data is used to verify and correct the trained model, which further improves the accuracy of the risk model and its usefulness as an aid to risk identification. The scheme can thereby help local financial supervision bodies maintain financial security and foster a sound environment for the financial industry.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
FIG. 1 is a schematic flow chart illustrating a method for training a financial risk model according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart illustrating a method for training a financial risk model according to an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of a financial risk model training apparatus according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present invention.
It should be noted that, unless specifically stated otherwise, the singular forms "a," "an," and "the" as used herein may include the plural forms. The terms "first," "second," and the like are used only to describe the solution clearly and are not intended to limit the objects themselves; the "first" and "second" objects may be different terminals, devices, users, and so on, or may be the same terminal, device, or user. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items. In addition, "at least one" in the embodiments of the present application means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and covers three relationships; for example, "A and/or B" may mean: A alone, both A and B, or B alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following" and similar expressions refer to any combination of the listed items, including any combination of single or plural items. For example, at least one of a, b, or c may represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where a, b, and c may each be single or multiple.
Financial risk control is an important safeguard for the healthy operation and management of a local financial supervision system. The core of financial risk control lies in the application of risk early warning models. The accuracy of a risk early warning model depends to a great extent on the granularity of the institution's historical behavior data: if the data is not fine-grained enough, the trained model identifies risk inefficiently and with low accuracy. However, finer-grained historical behavior data may expose the institution's private information. For this reason, the embodiments of the application start from the dimensions of business data, fund data, government affair data, and public opinion data related to financial risk, and refine the granularity of each dimension to improve the accuracy of the financial risk model.
Fig. 1 shows a schematic flow chart of the financial risk model training method provided in an embodiment of the present application. As shown in fig. 1, the method mainly includes:
S101, acquiring raw data related to financial risk;
S102, constructing feature factors from the raw data;
In the embodiment of the present application, the raw data related to financial risk includes historical business data and fund data, and may further include government affair data and public opinion data of the area where the target financial institution (the financial institution to be assessed) is located. The raw data of each dimension is acquired as follows:
(1) In an alternative embodiment, obtaining raw data in the form of the target financial institution's historical business data and fund data comprises:
Step 1, acquiring historical business data and fund data of the target financial institution, that is, of the financial institution to be assessed. The business data includes, but is not limited to, transfer business data, deposit business data, loan business data, and insurance business data; the fund data is the amounts, fund flow directions, and fund transfers involved in those businesses.
Step 2, performing risk annotation on the historical business data and the fund data according to a preset risk judgment rule;
Step 3, determining the business data and fund data for which risk annotation has been completed as the raw data.
(2) In an alternative embodiment, obtaining raw data in the form of government affair data and public opinion data related to financial risk comprises:
Step 1, acquiring government affair data and public opinion data of the area where the target financial institution is located. One way to do so is to apply to the government affair system and the public opinion analysis system for a data interface, and then obtain the government affair data and public opinion data from those systems through the data interface. The government affair data includes litigation cases involving the financial institution's clients that are published in the court system (the corresponding government affair system may be a court database system), industrial and commercial abnormality information (the corresponding government affair system may be an industrial and commercial administration system), and the like. Public opinion data includes, but is not limited to, company business information of the financial institution's clients published in online news.
Step 2, performing risk annotation on the government affair data and the public opinion data according to a preset risk judgment rule;
Step 3, determining the government affair data and public opinion data for which risk annotation has been completed as raw data.
On the basis of the above embodiment, in a further embodiment, the preset risk judgment rule includes: a business blacklist and whitelist and business abnormality rules, obtained from systems inside the target financial institution or from systems related to the stored raw data and determined as the preset risk judgment rule, which can be used to perform risk annotation on business data, fund data, government affair data, and public opinion data; and/or a risk judgment rule for the government affair data and the public opinion data, set according to big data analysis and expert judgment rules. An expert judgment rule is a risk standard for the government affair data and the public opinion data derived from experts' judgments and assessments of risk.
The raw data in the above embodiment is mainly obtained from the internal big data of local financial organizations, banks, governments, and the like, for example daily business handling data, financial statement data, account fund flows, industrial and commercial data, court information, and public opinion information related to the local financial organization. Each local financial organization is labeled as high or low risk according to the preset risk judgment rule; data related to a high-risk organization is labeled high risk and data related to a low-risk organization is labeled low risk, which yields the raw training data.
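The rule-based risk annotation described above can be illustrated with a minimal Python sketch. The record fields, the list contents, and the amount threshold are all hypothetical stand-ins chosen for illustration; the source does not specify the rules themselves.

```python
from dataclasses import dataclass


@dataclass
class Record:
    institution_id: str
    amount: float  # transaction amount; an assumed field, not named in the source


# Hypothetical rule inputs: a blacklist, a whitelist, and one abnormality rule.
BLACKLIST = {"inst-003"}
WHITELIST = {"inst-001"}
ABNORMAL_AMOUNT = 1_000_000.0  # assumed threshold, not taken from the source


def annotate(record: Record) -> str:
    """Return 'high' or 'low' according to the preset risk judgment rules."""
    if record.institution_id in BLACKLIST:
        return "high"
    if record.institution_id in WHITELIST:
        return "low"
    # Neither list applies, so fall back to the business abnormality rule.
    return "high" if record.amount > ABNORMAL_AMOUNT else "low"


records = [
    Record("inst-001", 5_000_000.0),
    Record("inst-002", 20_000.0),
    Record("inst-003", 100.0),
    Record("inst-004", 2_500_000.0),
]
labels = [annotate(r) for r in records]
```

In this sketch the lists take precedence over the abnormality rule, which is only one possible ordering; the annotated records then serve as the raw training data.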
In the above embodiment, constructing the feature factors from the raw data includes: constructing a basic feature factor, a deviation feature factor, or a cross feature factor of the raw data according to the attribute features of the acquired raw data. The feature factors are the training data required for model training and mainly comprise basic features, deviation features, and cross features.
Basic features are the basic fields of the raw data, mainly reserved fields and statistical features over a particular set, for example the total amount of assets.
Deviation features are the distances by which an individual deviates from the mean, minimum, maximum, and sum of its group, for example the rate at which an institution's total assets deviate from the industry average of total assets.
Cross features are features built from several angles rather than one, or new features generated by combining existing features, for example the asset-liability ratio = total liabilities / total assets.
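As a rough illustration of the three kinds of feature factor, the following Python sketch derives one of each from a few toy records. The field names and figures are hypothetical; only the two example formulas (deviation from the industry average and the asset-liability ratio) come from the text.

```python
from statistics import mean

# Hypothetical raw records; field names and figures are illustrative only.
raw = [
    {"id": "A", "industry": "lending", "total_assets": 100.0, "total_liabilities": 40.0},
    {"id": "B", "industry": "lending", "total_assets": 300.0, "total_liabilities": 240.0},
    {"id": "C", "industry": "guarantee", "total_assets": 50.0, "total_liabilities": 10.0},
]


def build_feature_factors(rows):
    # Group total assets by industry so each record can be compared with its peers.
    by_industry = {}
    for row in rows:
        by_industry.setdefault(row["industry"], []).append(row["total_assets"])
    industry_mean = {name: mean(values) for name, values in by_industry.items()}

    features = []
    for row in rows:
        avg = industry_mean[row["industry"]]
        features.append({
            "id": row["id"],
            # Basic feature: a field carried over from the raw data.
            "total_assets": row["total_assets"],
            # Deviation feature: relative deviation from the industry average.
            "assets_deviation_rate": (row["total_assets"] - avg) / avg,
            # Cross feature: asset-liability ratio = total liabilities / total assets.
            "asset_liability_ratio": row["total_liabilities"] / row["total_assets"],
        })
    return features


features = build_feature_factors(raw)
```

Deviations from the group minimum, maximum, and sum would follow the same pattern as the deviation from the mean shown here.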
S103, training the pre-constructed risk model by using the characteristic factor data according to a preset algorithm to obtain a target risk model.
In a preferred embodiment, as shown in fig. 2, the step of training the financial risk model according to the embodiment of the present application may include:
S201, acquiring raw data related to financial risk;
S202, constructing feature factors from the raw data;
S203, grouping the feature factor data according to a predetermined permutation and combination rule, where each group of feature factor data includes at least two classes of data;
In the embodiment of the present application, the predetermined permutation and combination rules include, but are not limited to, the following two:
(1) Combining two classes of feature factors that are adjacent in the time dimension into one group, according to the time at which the underlying raw data was generated. For example, if raw data from 2015 to 2020 is obtained, this rule divides it into three groups, {2015, 2016}, {2017, 2018}, and {2019, 2020}. In the first group, the feature factors of the 2015 raw data can serve as the training data set and those of the 2016 data as the test data set; the other two groups are treated in the same way.
(2) Randomly dividing the feature factor data within the same time range into two classes and combining the two classes into one group. For example, the feature factors of the raw data of each year from 2015 to 2020 are randomly divided into two classes, one used as training data and the other as test data.
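The two grouping rules can be sketched in Python as follows. The function names, the 50/50 split fraction, and the fixed seed are assumptions made for illustration; the text only requires that each group contain two classes of data.

```python
import random


def group_adjacent_years(data_by_year):
    """Rule (1): pair adjacent years; the earlier year trains, the later year tests."""
    years = sorted(data_by_year)
    groups = []
    # Step by two so each year belongs to exactly one group; with an odd
    # number of years the last year is left unpaired in this sketch.
    for i in range(0, len(years) - 1, 2):
        groups.append({
            "train": data_by_year[years[i]],
            "test": data_by_year[years[i + 1]],
        })
    return groups


def random_split(rows, train_fraction=0.5, seed=0):
    """Rule (2): randomly divide rows from one time range into two classes."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return {"train": shuffled[:cut], "test": shuffled[cut:]}


# Toy feature factor data for 2015 through 2020, four rows per year.
data_by_year = {year: [f"{year}-row{i}" for i in range(4)] for year in range(2015, 2021)}
time_groups = group_adjacent_years(data_by_year)
random_groups = [random_split(rows) for rows in data_by_year.values()]
```

With the six years of the example, rule (1) yields the three groups {2015, 2016}, {2017, 2018}, and {2019, 2020}, while rule (2) yields one train/test pair per year.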
S204, training the pre-constructed risk model by utilizing the first class characteristic factor data in each group; the first class of feature factor data is a learning sample data set used to train the machine learning model to determine parameters of the machine learning model.
S205, verifying the target risk model by using the second class of feature factor data in each group. The second class of feature factor data is used to test the discriminative ability of the trained machine learning model so that its accuracy can be improved.
S206, obtaining the verification result;
S207, when the verification result does not meet a set standard result, determining a correction variable according to the difference between the verification result and the standard result, and correcting the target risk model according to the correction variable.
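Steps S205 to S207 can be sketched as below. The source does not say which metric the verification result uses or how the correction variable is applied, so this sketch assumes accuracy as the verification result, a hypothetical standard of 0.8, and the shortfall between the two as the correction variable.

```python
def verify(predictions, labels):
    """Return the accuracy of the predictions on the verification data."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def correction_variable(result, standard):
    """Return the shortfall against the standard, or 0.0 when the standard is met."""
    return max(0.0, standard - result)


# Toy model outputs and true labels for the second class of feature factor data.
predictions = ["high", "low", "low", "high", "low"]
labels = ["high", "low", "high", "high", "high"]

accuracy = verify(predictions, labels)  # 3 of 5 predictions match
gap = correction_variable(accuracy, standard=0.8)  # 0.8 is an assumed standard
needs_correction = gap > 0.0
```

How the model is then corrected (for example by retraining with different parameters) is left open here, because the text does not specify it.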
In the above embodiments, the preset algorithm includes, but is not limited to, a random forest algorithm or a decision tree algorithm.
In a further embodiment, the training of the pre-constructed risk model by using the feature factor data according to a preset random forest algorithm includes:
Step 1, using bootstrap resampling, randomly drawing K samples with replacement from the feature factor data, M times, to form M feature factor data sets, where M and K are both positive integers greater than or equal to 1;
Step 2, training the pre-constructed risk model with the M feature factor data sets to obtain the target risk model.
The random forest algorithm has many advantages. The most prominent is that each tree in the algorithm grows to the greatest possible extent, without pruning, which allows the model to learn in greater depth and detail. At the same time, two sources of randomness, randomly selected samples and randomly selected features, are built into the algorithm, which makes the model less prone to overfitting even as it learns in depth.
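The bootstrap step can be sketched in a few lines of Python. The function name and the fixed seed are illustrative assumptions; the sketch covers only the sample randomness of Step 1, not the random feature selection or the tree training.

```python
import random


def bootstrap_datasets(rows, m, k, seed=0):
    """Draw M data sets of K samples each, sampling with replacement."""
    rng = random.Random(seed)
    # random.choices samples with replacement, so a row may appear
    # several times in one data set and not at all in another.
    return [rng.choices(rows, k=k) for _ in range(m)]


rows = list(range(10))  # stand-in for feature factor records
datasets = bootstrap_datasets(rows, m=5, k=10)
```

In a random forest, each of the M data sets would typically train one tree, and the trees' predictions are combined, for example by majority vote.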
In a further optional embodiment, training the pre-constructed risk model with the feature factor data according to a preset decision tree algorithm includes:
Step 1, initializing a level parameter corresponding to the decision tree algorithm;
Step 2, training on the target training data in the training set with the CART algorithm, and obtaining the original risk model when the number of layers grown by the decision tree reaches the level parameter. The method specifically comprises the following steps:
Based on the financial risk model training method shown in fig. 1 and fig. 2, another aspect of the present application provides a financial risk model training apparatus. As shown in fig. 3, the apparatus may include an acquisition module 301, a construction module 302, a storage module 303, and a training module 304, wherein:
the acquisition module 301 is used for acquiring raw data related to financial risk;
the construction module 302 is used for constructing feature factors from the raw data;
the storage module 303 is used for storing a preset algorithm and a pre-constructed risk model;
and the training module 304 is used for training the pre-constructed risk model with the feature factor data according to the preset algorithm to obtain a target risk model.
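One way to picture how these modules cooperate is the small Python sketch below. The class name, the callable-based design, and the toy lambdas are all hypothetical; the storage module is omitted for brevity, and the "model" is just a mean so that the data flow stays visible.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class RiskModelTrainer:
    acquire: Callable[[], List[Any]]             # role of the acquisition module
    construct: Callable[[List[Any]], List[Any]]  # role of the construction module
    train: Callable[[List[Any]], Any]            # role of the training module

    def run(self) -> Any:
        raw = self.acquire()            # acquire raw data
        features = self.construct(raw)  # build feature factors
        return self.train(features)     # train and return the target model


trainer = RiskModelTrainer(
    acquire=lambda: [1.0, 2.0, 3.0],
    construct=lambda raw: [x * 2 for x in raw],
    train=lambda features: sum(features) / len(features),  # toy "model": the mean
)
model = trainer.run()
```

Passing each module in as a callable keeps the sketch agnostic about whether a module is implemented in software, hardware, or a combination.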
Further, the apparatus comprises a data processing module 305 and a verification module 306, wherein:
the storage module 303 is further configured to store a predetermined permutation and combination rule, where the predetermined permutation and combination rule includes but is not limited to: combining two classes of feature factors that are adjacent in the time dimension into one group, according to the time at which the underlying raw data was generated; or randomly dividing the feature factor data within the same time range into two classes and combining the two classes into one group;
the data processing module 305 is configured to group the feature factor data according to the predetermined permutation and combination rule, where each group of feature factor data includes at least two classes of data;
the training module 304 is specifically configured to train the pre-constructed risk model by using the first class of feature factor data in each group;
and the verification module 306 is used for verifying the target risk model by using the second class of feature factor data in each group.
In a further embodiment, the acquisition module 301 comprises an acquisition unit, a labeling unit, and a determining unit, wherein:
the acquisition unit is specifically used for acquiring historical business data and fund data of a target financial institution;
the labeling unit is specifically used for performing risk annotation on the historical business data and the fund data according to a preset risk judgment rule;
the determining unit is used for determining the business data and fund data for which risk annotation has been completed as the raw data.
In a further embodiment, the acquisition unit, the labeling unit, and the determining unit of the acquisition module 301 are further configured as follows:
the acquisition unit is further used for acquiring government affair data and public opinion data of the area where the target financial institution is located, which comprises: applying to a government affair system and a public opinion analysis system for a data interface; and acquiring the government affair data and public opinion data from those systems through the data interface;
the labeling unit is further used for performing risk annotation on the government affair data and the public opinion data according to the preset risk judgment rule;
the determining unit is further used for determining the government affair data and public opinion data for which risk annotation has been completed as raw data.
In a further optional embodiment, the preset risk judgment rule stored in the storage module 303 includes: acquiring and determining a black and white list of the service and a service abnormal rule in the target financial institution as a preset risk judgment rule; and/or setting a risk judgment rule of the government affair data and the public opinion data according to big data analysis and an expert judgment rule.
In a specific embodiment, the 302 construction module is specifically configured to construct, according to the attribute feature of the obtained raw data, at least one feature factor of the raw data, which is as follows: a base eigenfactor, a bias eigenfactor, a cross eigenfactor.
In a further optional embodiment, the apparatus further comprises 307 a correction module, wherein the 307 correction module is configured to determine a correction variable according to a difference between the verification result and a set standard result when the verification result does not meet the set standard result; and correcting the target risk model according to the correction variable.
In a further optional embodiment, the training module 304 is specifically configured to train the pre-constructed risk model by using the feature factor data according to a preset random forest algorithm or decision tree algorithm.
Further, the training, by the training module 304, of the pre-constructed risk model by using the feature factor data according to the preset random forest algorithm includes:
using bootstrap resampling to randomly draw K samples, with replacement, from the feature factor data M times, forming M feature factor data sets, wherein M and K are both positive integers;
and training the pre-constructed risk model by using the M feature factor data sets to obtain the target risk model.
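The resampling step above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `bootstrap_datasets` and the `seed` parameter are assumptions added for reproducibility and are not part of the application.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def bootstrap_datasets(
    samples: Sequence[T], m: int, k: int, seed: Optional[int] = None
) -> List[List[T]]:
    """Draw K samples with replacement, M times, giving M data sets."""
    if m < 1 or k < 1:
        raise ValueError("M and K must be positive integers")
    rng = random.Random(seed)
    # random.Random.choices samples with replacement, so a record may
    # appear several times in one data set and not at all in another.
    return [rng.choices(samples, k=k) for _ in range(m)]
```

In a random forest, each of the M data sets would typically be used to train one tree, and the trees' predictions would then be combined, for example by majority vote.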
Optionally, the training, by the training module 304, of the pre-constructed risk model by using the feature factor data according to the preset decision tree algorithm includes:
initializing a level parameter corresponding to the decision tree algorithm;
and training on target training data in a training set by using the CART algorithm, and obtaining the original risk model when the number of levels grown by the decision tree reaches the level parameter.
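A depth-limited CART classifier can be sketched in a few dozen lines. The version below is a simplified illustration, not the application's implementation: it assumes numeric features, uses Gini impurity as the split criterion, and treats the level parameter as a maximum tree depth (`max_depth`). All names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature index, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def grow(rows, labels, depth, max_depth):
    """Grow a binary tree; stop once depth reaches max_depth (the level parameter)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth:
        return majority  # leaf: predict the majority class
    split = best_split(rows, labels)
    if split is None:
        return majority  # no split improves impurity
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": grow([rows[i] for i in left], [labels[i] for i in left], depth + 1, max_depth),
        "right": grow([rows[i] for i in right], [labels[i] for i in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree


def depth_of(tree):
    """Number of split levels below the root (0 for a bare leaf)."""
    if not isinstance(tree, dict):
        return 0
    return 1 + max(depth_of(tree["left"]), depth_of(tree["right"]))
```

Calling `grow(rows, labels, 0, max_depth)` returns a nested dictionary whose depth never exceeds `max_depth`, which corresponds to stopping growth once the level parameter is reached.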
It is understood that the above-mentioned components of the financial risk model training apparatus in the present embodiment have functions of implementing the corresponding steps of the method in the embodiments shown in fig. 1 and fig. 2. The function can be realized by hardware, and can also be realized by executing corresponding software by hardware. The hardware or software includes one or more modules or means corresponding to the functions described above. The modules and devices can be software and/or hardware, and the modules and devices can be realized independently or integrated by a plurality of modules and devices. For the functional description of each module and apparatus, reference may be specifically made to the corresponding description of the method in the embodiment shown in fig. 1 and fig. 2, and therefore, the beneficial effects that can be achieved by the method may refer to the beneficial effects in the corresponding method provided above, and details are not described here again.
It is understood that the illustrated structure of the embodiment of the present invention does not constitute a specific limitation to the specific structure of the financial risk model training apparatus. In other embodiments of the present application, the financial risk model training apparatus may include more or fewer components than shown, or combine certain components, or split certain components, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The embodiment of the application provides an electronic device, which comprises a processor and a memory;
a memory for storing operating instructions;
and the processor is used for executing the financial risk model training method provided by any embodiment of the application by calling the operation instruction.
As an example, fig. 4 shows a schematic structural diagram of an electronic device to which the embodiment of the present application is applicable, and as shown in fig. 4, the electronic device 400 includes: a processor 401 and a memory 403. Wherein the processor 401 is coupled to the memory 403, such as via a bus 402. Optionally, the electronic device 400 may also include a transceiver 404. It should be noted that the transceiver 404 is not limited to one in practical applications. It is to be understood that the illustrated structure of the embodiment of the present invention does not constitute a specific limitation to the specific structure of the electronic device 400. In other embodiments of the present application, electronic device 400 may include more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware. Optionally, the electronic device may further include a display screen 405 for displaying images or receiving operation instructions of a user as needed.
The processor 401 is applied in the embodiment of the present application to implement the method shown in the foregoing method embodiment. The transceiver 404 may include a receiver and a transmitter, and is applied in the embodiment of the present application to enable the electronic device to communicate with other devices.
The Processor 401 may be a CPU (Central Processing Unit), a general purpose Processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other Programmable logic device, a transistor logic device, a hardware component, or any combination thereof. Which may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor 401 may also be a combination of computing functions, e.g., comprising one or more microprocessors, a combination of a DSP and a microprocessor, or the like.
Processor 401 may also include one or more processing units. For example, the processor 401 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a memory, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a Neural-network Processing Unit (NPU), etc. The different processing units may be separate devices or may be integrated into one or more processors. The controller may be the nerve center and command center of the electronic device 400. The controller can generate an operation control signal according to the instruction operation code and the timing signal to control instruction fetching and instruction execution. A memory may also be provided in the processor 401 for storing instructions and data. In some embodiments, the memory in the processor 401 is a cache memory. This memory may hold instructions or data that the processor 401 has just used or uses repeatedly. If the processor 401 needs to use the instructions or data again, it can fetch them directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 401, thereby improving the efficiency of the system.
The processor 401 may run the financial risk model training method provided in the embodiment of the present application, so as to reduce the complexity of user operation, improve the degree of intelligence of the terminal device, and improve the user experience. The processor 401 may include different devices. For example, when a CPU and a GPU are integrated, they may cooperate to execute the financial risk model training method provided in the embodiment of the present application, with part of the algorithms executed by the CPU and another part executed by the GPU, so as to obtain higher processing efficiency.
Bus 402 may include a path that transfers information between the above components. The bus 402 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 402 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 4, but this does not indicate only one bus or one type of bus.
The memory 403 may be a ROM (Read-Only Memory) or another type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or another type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read-Only Memory), a CD-ROM (Compact Disc Read-Only Memory) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), a high-speed random access memory, a non-volatile memory such as at least one magnetic disk storage device, a flash memory device or a universal flash storage (UFS), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
Optionally, the memory 403 is used for storing application program codes for executing the scheme of the present application, and is controlled by the processor 401 to execute. The processor 401 is configured to execute the application program code stored in the memory 403 to implement the financial risk model training method provided in any embodiment of the present application.
The memory 403 may be used to store computer-executable program code, which includes instructions. The processor 401 performs the various functional applications and data processing of the electronic device 400 by executing the instructions stored in the memory 403. The memory 403 may include a program storage area and a data storage area. The program storage area may store the code of the operating system, application programs, and the like. The data storage area may store data created during use of the electronic device 400 (e.g., images and video captured by a camera application), and the like.
The memory 403 may also store one or more computer programs corresponding to the financial risk model training method provided in the embodiments of the present application. The one or more computer programs stored in the memory 403 and configured to be executed by the one or more processors 401 include instructions that may be used to perform the steps of the respective embodiments described above.
Of course, the codes of the financial risk model training method provided in the embodiment of the present application may also be stored in the external memory. In this case, the processor 401 may execute the code of the financial risk model training method stored in the external memory through the external memory interface, and the processor 401 may control the flow of executing the financial risk model.
The display screen 405 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the electronic device 400 may include 1 or N display screens 405, N being a positive integer greater than 1. The display screen 405 may be used to display information input by or provided to the user as well as various graphical user interfaces (GUIs). For example, the display screen 405 may display a photograph, video, web page, or file, etc.
The electronic device provided by the embodiment of the present application is applicable to any embodiment of the above method, and therefore, the beneficial effects that can be achieved by the electronic device can refer to the beneficial effects in the corresponding method provided above, and are not described again here.
The embodiment of the application provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the method for training a financial risk model shown in the above method embodiment is implemented.
The computer-readable storage medium provided in the embodiments of the present application is applicable to any embodiment of the foregoing method, and therefore, the beneficial effects that can be achieved by the computer-readable storage medium can refer to the beneficial effects in the corresponding method provided above, and are not described herein again.
The embodiment of the present application further provides a computer program product, which when running on a computer, causes the computer to execute the above related steps to implement the method in the above embodiment. The computer program product provided in the embodiments of the present application is applicable to any of the embodiments of the method described above, and therefore, the beneficial effects that can be achieved by the computer program product can refer to the beneficial effects in the corresponding method provided above, and are not described herein again.
The financial risk model training scheme provided by the embodiment of the application comprises: acquiring original data related to financial risks; constructing feature factors of the original data according to the original data; and training the pre-constructed risk model by using the feature factor data according to a preset algorithm to obtain a target risk model. The technical scheme provided by the embodiment of the application can produce feasible and effective prediction results for large-scale data sources in a relatively short time, and greatly improves the accuracy of the risk early-warning model. Meanwhile, part of the data is used to verify the constructed and trained model and to correct it, which further improves the accuracy of the risk model, so that the target risk model provides better auxiliary identification. The scheme can thereby more effectively assist local financial regulatory bodies in maintaining financial security and fostering a sound environment for the financial industry.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation; for example, a plurality of units or components may be combined or integrated into another apparatus, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in another form.
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The above is only a description of specific embodiments of the present application, and the scope of protection of the present application is not limited thereto. Any person skilled in the art can readily conceive of changes or substitutions, and can make improvements and refinements, within the technical scope disclosed in the present application; these changes, substitutions, improvements and refinements shall also fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.

Claims (16)

1. A financial risk model training method, the method comprising:
acquiring original data related to financial risks;
constructing a characteristic factor of the original data according to the original data;
and training the pre-constructed risk model by utilizing the characteristic factor data according to a preset algorithm to obtain a target risk model.
2. The method of claim 1, wherein the obtaining raw data relating to financial risk comprises:
acquiring historical business data and fund data of a target financial institution;
according to a preset risk judgment rule, carrying out risk annotation on the historical business data and the fund data;
and determining the historical business data and the fund data for which the risk annotation has been completed as the original data.
3. The method of claim 1 or 2, wherein the obtaining raw data relating to financial risk further comprises:
acquiring government affair data and public opinion data of an area where a target financial institution is located;
according to a preset risk judgment rule, carrying out risk annotation on the government affair data and the public opinion data;
and determining the government affair data and the public opinion data which finish the risk marking as original data.
4. The financial risk model training method of claim 3, wherein the preset risk judgment rule comprises:
obtaining the business blacklist and whitelist and the business anomaly rules of the target financial institution and determining them as the preset risk judgment rule; and/or,
and setting a risk judgment rule of the government affair data and the public opinion data according to big data analysis and an expert judgment rule.
5. The financial risk model training method according to claim 3 or 4, wherein the acquiring government affairs data and public opinion data of the region where the target financial institution is located comprises:
applying to a government affair system and a public opinion analysis system for a data interface;
and acquiring government affair data and public opinion data from the government affair system and the public opinion analysis system through the data interface.
6. The financial risk model training method of claim 5, wherein the constructing feature factors of the raw data from the raw data comprises:
according to the attribute characteristics of the obtained original data, constructing at least one of the following characteristic factors of the original data:
a base characteristic factor, a bias characteristic factor, a cross characteristic factor.
7. The financial risk model training method of claim 1 or 6, wherein the method further comprises:
grouping the characteristic factor data according to a preset permutation and combination rule; each group of characteristic factor data at least comprises two types of data;
training the pre-constructed risk model by utilizing the first class characteristic factor data in each group;
and verifying the target risk model by using the second class characteristic factor data in each group.
8. The financial risk model training method of claim 7, wherein the predetermined permutation and combination rules include, but are not limited to:
combining two types of characteristic factor data that are adjacent in the time dimension into one group, according to the time at which the original data of the characteristic factor data was generated; or,
randomly dividing the characteristic factor data in the same time range into two types and combining the two types of characteristic factor data into one group.
9. The method of claim 8, wherein after validating the target risk model using the second type of feature factor data in each set, the method further comprises:
obtaining the verification result;
when the verification result is not in accordance with a set standard result, determining a correction variable according to the difference between the verification result and the standard result;
and correcting the target risk model according to the correction variable.
10. The method for training the financial risk model according to claim 1 or 9, wherein the training the pre-constructed risk model by using the feature factor data according to a preset algorithm includes but is not limited to:
and training the pre-constructed risk model by utilizing the characteristic factor data according to a preset random forest algorithm or a decision tree algorithm.
11. The financial risk model training method of claim 10, wherein the training of the pre-constructed risk model using the feature factor data according to a preset random forest algorithm comprises:
using bootstrap resampling to randomly draw K samples, with replacement, from the characteristic factor data M times, forming M characteristic factor data sets; wherein M and K are both positive integers;
and training the pre-constructed risk model by using the M characteristic factor data sets to obtain a target risk model.
12. The financial risk model training method of claim 10, wherein the training of the pre-constructed risk model using the feature factor data according to a preset decision tree algorithm comprises:
initializing a level parameter corresponding to the decision tree algorithm;
and training on target training data in a training set by using the CART algorithm, and obtaining the original risk model when the number of levels grown by the decision tree reaches the level parameter.
13. A financial risk model training apparatus, the apparatus comprising: an acquisition module, a construction module, a storage module and a training module, wherein,
the acquisition module is used for acquiring original data related to financial risks;
the construction module is used for constructing the characteristic factors of the original data according to the original data;
the storage module is used for storing a preset algorithm and a pre-constructed risk model;
and the training module is used for training the pre-constructed risk model by utilizing the characteristic factor data according to a preset algorithm to obtain a target risk model.
14. The financial risk model training apparatus of claim 13, wherein the apparatus further comprises a data processing module and a verification module, wherein,
the storage module is also used for storing a preset permutation and combination rule;
the data processing module is further configured to group the characteristic factor data according to a predetermined permutation and combination rule; each group of characteristic factor data at least comprises two types of data;
the training module is specifically used for training the pre-constructed risk model by using the first class characteristic factor data in each group;
and the verification module is used for verifying the target risk model by utilizing the second class characteristic factor data in each group.
15. An electronic device comprising a processor and a memory;
the memory is used for storing operation instructions;
the processor is used for executing the method of any one of claims 1-12 by calling the operation instruction.
16. A computer-readable storage medium, characterized in that the storage medium has stored thereon a computer program which, when being executed by a processor, carries out the method of any one of claims 1-12.
CN202110071185.5A 2021-01-19 2021-01-19 Financial risk model training method and device Pending CN112837142A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110071185.5A CN112837142A (en) 2021-01-19 2021-01-19 Financial risk model training method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110071185.5A CN112837142A (en) 2021-01-19 2021-01-19 Financial risk model training method and device

Publications (1)

Publication Number Publication Date
CN112837142A true CN112837142A (en) 2021-05-25

Family

ID=75928721

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110071185.5A Pending CN112837142A (en) 2021-01-19 2021-01-19 Financial risk model training method and device

Country Status (1)

Country Link
CN (1) CN112837142A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113837764A (en) * 2021-09-22 2021-12-24 平安科技(深圳)有限公司 Risk early warning method and device, electronic equipment and storage medium
CN113837764B (en) * 2021-09-22 2023-07-25 平安科技(深圳)有限公司 Risk early warning method, risk early warning device, electronic equipment and storage medium
CN115408702A (en) * 2022-11-01 2022-11-29 浙江城云数字科技有限公司 Stacking interface operation risk level evaluation method and application thereof
CN115408702B (en) * 2022-11-01 2023-02-14 浙江城云数字科技有限公司 Stacking interface operation risk grade evaluation method and application thereof

Similar Documents

Publication Publication Date Title
CN113822494B (en) Risk prediction method, device, equipment and storage medium
WO2020020088A1 (en) Neural network model training method and system, and prediction method and system
CN110009174B (en) Risk recognition model training method and device and server
US20180365594A1 (en) Systems and methods for generative learning
CN112541786A (en) Site selection method and device for network points, electronic equipment and storage medium
CN112488719A (en) Account risk identification method and device
CN112837142A (en) Financial risk model training method and device
CN114693192A (en) Wind control decision method and device, computer equipment and storage medium
CN114398557A (en) Information recommendation method and device based on double portraits, electronic equipment and storage medium
CN115859302A (en) Source code vulnerability detection method, device, equipment and storage medium
Chen et al. Survey on ai sustainability: Emerging trends on learning algorithms and research challenges
CN111709415A (en) Target detection method, target detection device, computer equipment and storage medium
CN114282258A (en) Screen capture data desensitization method and device, computer equipment and storage medium
CN112651782A (en) Behavior prediction method, device, equipment and medium based on zoom dot product attention
CN116166999A (en) Abnormal transaction data identification method, device, computer equipment and storage medium
CN114973374A (en) Expression-based risk evaluation method, device, equipment and storage medium
CN115204971A (en) Product recommendation method and device, electronic equipment and computer-readable storage medium
CN114373098A (en) Image classification method and device, computer equipment and storage medium
CN113256191A (en) Classification tree-based risk prediction method, device, equipment and medium
CN113343882A (en) Crowd counting method and device, electronic equipment and storage medium
CN115546192B (en) Livestock quantity identification method, device, equipment and storage medium
CN117314756B (en) Verification and protection method and device based on remote sensing image, computer equipment and storage medium
CN113657910B (en) Real name authentication method, device, electronic equipment and readable storage medium
CN112541443B (en) Invoice information extraction method, invoice information extraction device, computer equipment and storage medium
Shrmali Categorization of Cloud Computing & Deep Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination