CN117011080A - Financial risk prediction method, apparatus, device, medium and program product - Google Patents

Financial risk prediction method, apparatus, device, medium and program product

Info

Publication number
CN117011080A
Authority
CN
China
Prior art keywords
risk prediction
data
risk
financial
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310986777.9A
Other languages
Chinese (zh)
Inventor
王亚欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202310986777.9A
Publication of CN117011080A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/12 Accounting
    • G06Q40/125 Finance or payroll
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2123/00 Data types
    • G06F2123/02 Data types in the time domain, e.g. time-series data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Strategic Management (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Economics (AREA)
  • Evolutionary Biology (AREA)
  • General Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Biomedical Technology (AREA)
  • Technology Law (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Human Resources & Organizations (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Tourism & Hospitality (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The disclosure provides a financial risk prediction method, which can be applied to the field of artificial intelligence and the field of financial technology. The method comprises the following steps: acquiring data to be processed, wherein the data to be processed comprises at least one of financial report data, associated company information, associated person information and company stock price information; processing the data to be processed by adopting a pre-trained risk prediction model to obtain a risk prediction result; determining a risk threshold; and determining that there is a financial risk in the case that the risk prediction result is greater than the risk threshold. The risk prediction model comprises a plurality of encoders and a classifier, wherein the encoders are used for processing the data to be processed to obtain a plurality of groups of feature vectors, and the classifier is used for fusing the plurality of groups of feature vectors and calculating the risk prediction result. The present disclosure also provides a financial risk prediction apparatus, device, storage medium and program product.

Description

Financial risk prediction method, apparatus, device, medium and program product
Technical Field
The present disclosure relates to the field of artificial intelligence and the field of finance, and in particular to a financial risk prediction method, apparatus, device, medium and program product.
Background
In recent years, the macroeconomic situation has been complex, financial risk events of various kinds have occurred frequently, and commercial banks have faced greater operating pressure. Corporate business is a key focus for banks, so identifying, evaluating and responding to corporate business risk is of great practical significance to the banking industry. The financial risk of corporate clients is a link that is difficult to control in bank risk management; predicting it in advance allows prevention and control measures to be taken so that the bank's economic losses are minimized.
For important clients such as listed companies, financial risk is usually judged by analyzing the client's financial reports and key financial indicators (including liquidity, leverage and profitability indicators), and risk ranking is performed by averaging, weighting and similar operations on the financial report indicators. This approach often lacks reliability, because it does not make full use of multi-dimensional information related to the listed company, such as the association relations among companies, the potential influence of the company's legal representatives and senior executives, changes in the company's stock price, and company news. These factors may cause sharp fluctuations in financial risk.
Meanwhile, as financial business moves online, it is difficult for banks to obtain comprehensive customer identity verification information, which increases operational risk, and information fraud techniques keep evolving, so financial fraud occurs frequently. Taking the financial risk assessment of a listed company as an example, the company's financial report is the main basis for analyzing its financial risk. However, financial report data suffers from time lag, high-dimensional sparsity, missing values and noise, and some companies embellish their financial report data or even commit financial fraud. Given how well company risk can be concealed, financial report data alone is often insufficient to predict company risk accurately.
Disclosure of Invention
In view of the above, the present disclosure provides a financial risk prediction method, apparatus, device, medium and program product that improve prediction accuracy, for at least partially solving the above technical problems.
According to a first aspect of the present disclosure, there is provided a financial risk prediction method comprising: acquiring data to be processed, wherein the data to be processed comprises at least one of financial report data, associated company information, associated person information and company stock price information; processing the data to be processed by adopting a pre-trained risk prediction model to obtain a risk prediction result; determining a risk threshold; and determining that there is a financial risk in the case that the risk prediction result is greater than the risk threshold. The risk prediction model comprises a plurality of encoders and a classifier, wherein the encoders are used for processing the data to be processed to obtain a plurality of groups of feature vectors, and the classifier is used for fusing the plurality of groups of feature vectors and calculating the risk prediction result.
According to an embodiment of the present disclosure, processing the data to be processed using a pre-trained risk prediction model to obtain a risk prediction result includes: constructing a heterogeneous graph according to the associated company information and the associated person information, wherein the heterogeneous graph comprises a master node and a plurality of slave nodes; initializing attributes of the plurality of slave nodes according to the master node to obtain attribute vectors; determining position vectors of the plurality of slave nodes in the heterogeneous graph; fusing the attribute vectors and the position vectors to obtain the structural feature of the master node; and calculating the risk prediction result according to the structural feature of the master node.
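The graph-based steps in this embodiment can be illustrated with a minimal sketch. The source does not specify how attribute vectors or position vectors are computed, so everything concrete here is an assumption made for illustration: the attribute vector is a one-hot encoding of the node type, the position vector is the inverse hop distance from the master node, fusion is concatenation followed by mean pooling, and the names `build_hetero_graph`, `hop_distances` and `master_structural_feature` are hypothetical. A trained graph neural network would replace the hand-written pooling.

```python
from collections import deque

# Hypothetical node types; the source only distinguishes companies and persons.
NODE_TYPES = ("company", "person")

def build_hetero_graph(master, company_links, person_links):
    """Build an undirected heterogeneous graph as {node: (type, neighbours)}."""
    graph = {master: ("company", set())}
    for a, b in company_links:
        for n in (a, b):
            graph.setdefault(n, ("company", set()))
        graph[a][1].add(b)
        graph[b][1].add(a)
    for person, company in person_links:
        graph.setdefault(person, ("person", set()))
        graph.setdefault(company, ("company", set()))
        graph[person][1].add(company)
        graph[company][1].add(person)
    return graph

def hop_distances(graph, master):
    """Breadth-first search: hops from the master node to each reachable node."""
    dist = {master: 0}
    queue = deque([master])
    while queue:
        node = queue.popleft()
        for nb in sorted(graph[node][1]):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def master_structural_feature(graph, master):
    """Fuse each slave node's attribute and position vectors, then mean-pool."""
    fused = []
    for node, hops in hop_distances(graph, master).items():
        if node == master:
            continue
        node_type = graph[node][0]
        attribute = [1.0 if node_type == t else 0.0 for t in NODE_TYPES]  # one-hot type (assumed)
        position = [1.0 / hops]  # closer slave nodes get a larger value (assumed)
        fused.append(attribute + position)  # fusion by concatenation (assumed)
    if not fused:
        return [0.0] * (len(NODE_TYPES) + 1)
    return [sum(col) / len(fused) for col in zip(*fused)]

graph = build_hetero_graph(
    "ListedCo",
    company_links=[("ListedCo", "SubsidiaryA"), ("SubsidiaryA", "SupplierB")],
    person_links=[("Executive1", "ListedCo")],
)
feature = master_structural_feature(graph, "ListedCo")
```

Here `feature` has three components: the share of company slave nodes, the share of person slave nodes, and the mean inverse hop distance to the master node.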
According to an embodiment of the present disclosure, processing the data to be processed using a pre-trained risk prediction model to obtain a risk prediction result further includes: extracting features of the company stock price information to obtain a stock price time sequence feature vector; and calculating the risk prediction result according to the structural feature of the master node and the stock price time sequence feature vector.
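As a rough illustration of turning a stock price series into a time sequence feature vector, the sketch below computes three hand-crafted statistics. The source only requires an algorithm capable of extracting time sequence characteristics and does not name one, so the choice of features (mean log return, volatility, maximum drawdown) and the function name `price_sequence_features` are assumptions; a learned sequence encoder could be used instead.

```python
import math
from statistics import mean, pstdev

def price_sequence_features(prices):
    """Return [mean log return, volatility, maximum drawdown] for a price series."""
    if len(prices) < 2 or any(p <= 0 for p in prices):
        raise ValueError("need at least two positive prices")
    # Log returns between consecutive observations.
    returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    # Maximum drawdown: largest relative fall from a running peak.
    peak, max_drawdown = prices[0], 0.0
    for p in prices:
        peak = max(peak, p)
        max_drawdown = max(max_drawdown, (peak - p) / peak)
    return [mean(returns), pstdev(returns), max_drawdown]

features = price_sequence_features([10.0, 11.0, 9.9, 10.4, 8.0])
```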
According to an embodiment of the present disclosure, processing the data to be processed using a pre-trained risk prediction model to obtain a risk prediction result further includes: extracting features of the financial report data to obtain a financial report feature vector; and calculating the risk prediction result according to the structural feature of the master node, the stock price time sequence feature vector and the financial report feature vector.
According to an embodiment of the present disclosure, calculating the risk prediction result according to the structural feature of the master node, the stock price time sequence feature vector and the financial report feature vector includes: determining weights of the structural feature of the master node, the stock price time sequence feature vector and the financial report feature vector; according to the weights, fusing at least two of the structural feature of the master node, the stock price time sequence feature vector and the financial report feature vector to obtain an embedded vector; and calculating the risk prediction result according to the embedded vector.
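The weighted fusion described here can be sketched as follows. The source does not say how the weights are determined or how the embedded vector is mapped to a result, so this example assumes softmax-normalized weights, feature vectors of equal dimension, and a logistic output layer; `fuse`, `risk_score` and all numeric values are hypothetical.

```python
import math

def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(vectors, scores):
    """Weighted sum of equally sized feature vectors (the embedded vector)."""
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all feature vectors must share one dimension")
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]

def risk_score(embedding, coef, bias):
    """Map the embedded vector to a value in (0, 1) with a logistic function."""
    z = sum(c * x for c, x in zip(coef, embedding)) + bias
    return 1.0 / (1.0 + math.exp(-z))

structural = [0.2, 0.8]   # structural feature of the master node (toy values)
price_seq = [0.6, 0.4]    # stock price time sequence feature vector (toy values)
report = [1.0, 0.0]       # financial report feature vector (toy values)

# Equal raw scores give each feature vector a weight of one third.
embedding = fuse([structural, price_seq, report], scores=[0.0, 0.0, 0.0])
score = risk_score(embedding, coef=[1.0, -1.0], bias=0.0)
```

In a trained model the raw scores, coefficients and bias would be learned parameters rather than constants.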
According to an embodiment of the present disclosure, determining that there is a financial risk in the event that the risk prediction result is greater than a risk threshold comprises: determining a difference value between the risk prediction result and a risk threshold value; in the event that the difference is greater than a difference threshold, a determination is made that there is a financial risk.
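The two-threshold decision in this embodiment reduces to a short check. This is a sketch only; the function name `has_financial_risk` and the example values are assumptions, and the source does not state how either threshold is chosen.

```python
def has_financial_risk(prediction, risk_threshold, difference_threshold=0.0):
    """Flag risk only when the prediction exceeds the risk threshold by a margin."""
    difference = prediction - risk_threshold
    return difference > difference_threshold

clear_case = has_financial_risk(0.8, risk_threshold=0.5, difference_threshold=0.2)
borderline = has_financial_risk(0.6, risk_threshold=0.5, difference_threshold=0.2)
```

A positive `difference_threshold` keeps predictions that only marginally exceed the risk threshold from being flagged.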
According to an embodiment of the present disclosure, acquiring data to be processed includes: acquiring company news data; associated company information and/or associated person information is determined based at least on the company news data.
According to an embodiment of the present disclosure, before processing the data to be processed using the pre-trained risk prediction model, the method further comprises: preprocessing the data to be processed to obtain data with uniform format; processing the data with uniform format by adopting a pre-trained risk prediction model to obtain a risk prediction result; the preprocessing comprises missing value filling, data rejection, discrete feature coding and feature normalization.
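The four preprocessing operations named above can be sketched in plain Python. The source does not give the concrete rules, so this example assumes rejection of records with too many missing numeric fields, mean imputation, min-max normalization and one-hot coding; the field names and the function `preprocess` are hypothetical.

```python
def preprocess(rows, numeric_keys, discrete_key, max_missing=1):
    """Return one numeric feature list per retained record."""
    # Data rejection: drop records with too many missing numeric fields.
    kept = [dict(r) for r in rows
            if sum(r.get(k) is None for k in numeric_keys) <= max_missing]
    if not kept:
        return []
    # Missing value filling: use the mean of the observed values in each column.
    for k in numeric_keys:
        observed = [r[k] for r in kept if r.get(k) is not None]
        fill = sum(observed) / len(observed) if observed else 0.0
        for r in kept:
            if r.get(k) is None:
                r[k] = fill
    # Feature normalization: min-max scaling of each numeric column to [0, 1].
    for k in numeric_keys:
        values = [r[k] for r in kept]
        lo, span = min(values), max(values) - min(values)
        for r in kept:
            r[k] = (r[k] - lo) / span if span else 0.0
    # Discrete feature coding: one-hot over the sorted category set.
    categories = sorted({r[discrete_key] for r in kept})
    return [[r[k] for k in numeric_keys]
            + [1.0 if r[discrete_key] == c else 0.0 for c in categories]
            for r in kept]

rows = [
    {"revenue": 100.0, "debt_ratio": 0.4, "industry": "bank"},
    {"revenue": None, "debt_ratio": 0.6, "industry": "retail"},
    {"revenue": 300.0, "debt_ratio": 0.8, "industry": "bank"},
    {"revenue": None, "debt_ratio": None, "industry": "retail"},  # rejected
]
features = preprocess(rows, ["revenue", "debt_ratio"], "industry")
```

Every retained record ends up as a list of the same length, which is one way to read "data with uniform format".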
According to an embodiment of the present disclosure, training of a risk prediction model includes: determining a plurality of groups of training sets according to the data to be processed; training the risk prediction model by adopting a plurality of groups of training sets to obtain a plurality of initial risk prediction models; testing the initial risk prediction models to obtain initial risk prediction results; and training the risk prediction model by repeatedly adopting a plurality of groups of training sets until the variance of a plurality of initial risk prediction results is smaller than a variance threshold.
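The variance-based stopping rule can be illustrated with a toy training loop. The source names neither the model type nor the test metric, so this sketch assumes a one-dimensional logistic model trained by stochastic gradient descent and uses test accuracy as the "initial risk prediction result"; `train_until_stable`, the synthetic data and all hyperparameters are hypothetical.

```python
import math
import random
from statistics import pvariance

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_epoch(model, samples, lr=0.5):
    """One pass of stochastic gradient descent on a 1-D logistic model (w, b)."""
    w, b = model
    for x, y in samples:
        err = sigmoid(w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return (w, b)

def accuracy(model, samples):
    """Fraction of samples whose predicted class matches the label."""
    w, b = model
    hits = sum((sigmoid(w * x + b) > 0.5) == bool(y) for x, y in samples)
    return hits / len(samples)

def train_until_stable(data, test_set, k=3, variance_threshold=1e-3,
                       max_rounds=50, seed=0):
    """Train k models on k training sets until their test results agree."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    training_sets = [shuffled[i::k] for i in range(k)]  # k disjoint training sets
    models = [(0.0, 0.0)] * k
    for round_no in range(1, max_rounds + 1):
        models = [train_epoch(m, s) for m, s in zip(models, training_sets)]
        results = [accuracy(m, test_set) for m in models]
        variance = pvariance(results)
        if variance < variance_threshold:
            break  # the models now give consistent results
    return models, results, variance, round_no

# Synthetic data: the label is 1 when the single feature is positive.
data = [(x / 10, 1 if x > 0 else 0) for x in range(-20, 21) if x != 0]
test_set = [(-1.5, 0), (-0.5, 0), (0.5, 1), (1.5, 1)]
models, results, variance, rounds = train_until_stable(data, test_set)
```

The `max_rounds` cap is an added safeguard so the loop terminates even if the results never converge.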
A second aspect of the present disclosure provides a financial risk prediction apparatus comprising: the acquisition module is used for acquiring data to be processed, wherein the data to be processed comprises at least one of financial report data, associated company information, associated person information and company stock price information; the processing module is used for processing the data to be processed by adopting a pre-trained risk prediction model to obtain a risk prediction result, wherein the risk prediction model comprises a plurality of encoders and a classifier, the encoders are used for processing the data to be processed to obtain a plurality of groups of feature vectors, and the classifier is used for fusing the plurality of groups of feature vectors and calculating to obtain the risk prediction result; a first determining module for determining a risk threshold; and a second determining module for determining that there is a financial risk if the risk prediction result is greater than the risk threshold.
A third aspect of the present disclosure provides an electronic device, comprising: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of the embodiments described above.
A fourth aspect of the present disclosure also provides a computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method of any of the embodiments described above.
A fifth aspect of the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the method of any of the embodiments described above.
Compared with the prior art, the financial risk prediction method, the device, the electronic equipment, the storage medium and the program product provided by the disclosure have at least the following beneficial effects:
(1) According to the financial risk prediction method, the risk prediction result is obtained by processing high-dimensional, multi-source heterogeneous data with the risk prediction model. Using multiple factors allows more complex and richer semantic information to be learned. The stock price sequence data of the listed company is also fully used, which gives banks a new source of information for predicting company financial risk and improves assessment accuracy.
(2) According to the financial risk prediction method, features of the associated company information and the associated person information are extracted by constructing a heterogeneous graph. By reproducing the equity network and investment network of the listed company, initializing node attributes, and combining them with the connection relations among nodes, more information can be provided for the vector representation learning of the graph neural network.
(3) According to the financial risk prediction method, features of the company stock price information are extracted with an algorithm capable of extracting the time sequence characteristics of data, and combining them with the structural features of the heterogeneous graph associated with the listed company further improves the accuracy of financial risk prediction.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be more apparent from the following description of embodiments of the disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of a financial risk prediction method, apparatus, device, medium and program product according to an embodiment of the present disclosure;
FIG. 2A schematically illustrates a flow chart of a financial risk prediction method according to an embodiment of the present disclosure;
FIG. 2B schematically illustrates a flow diagram of a financial risk prediction method according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart of a method of processing data to be processed according to an embodiment of the disclosure;
FIG. 4 schematically illustrates a heterogeneous graph of company and personnel relationships in accordance with an embodiment of the present disclosure;
FIG. 5 schematically illustrates a block diagram of a financial risk prediction model in accordance with an embodiment of the present disclosure;
FIG. 6 schematically illustrates a flow chart of a method of processing data to be processed according to another embodiment of the disclosure;
FIG. 7A schematically illustrates a flow chart of a method of processing data to be processed according to yet another embodiment of the disclosure;
FIG. 7B schematically illustrates a flow chart of a method of processing data to be processed according to yet another embodiment of the disclosure;
FIG. 8 schematically illustrates a method flow diagram for determining that there is a financial risk in accordance with an embodiment of the present disclosure;
FIG. 9 schematically illustrates a flow chart of a method of acquiring data to be processed according to an embodiment of the disclosure;
FIG. 10 schematically illustrates a flow chart of a financial risk prediction method according to another embodiment of the present disclosure;
FIG. 11 schematically illustrates a flow chart of a method of processing data to be processed according to an embodiment of the disclosure;
FIG. 12 schematically illustrates a block diagram of a financial risk prediction apparatus according to an embodiment of the present disclosure; and
FIG. 13 schematically illustrates a block diagram of an electronic device adapted to implement a financial risk prediction method according to an embodiment of the disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where expressions like "at least one of A, B and C, etc." are used, the expressions should generally be interpreted in accordance with the meaning as commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" shall include, but not be limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.).
Embodiments of the present disclosure provide a financial risk prediction method, apparatus, device, medium, and program product. It should be noted that they may be used in the financial field and may also be used in any field other than the financial field; their application fields are not limited.
In the technical solution of the present disclosure, the user information involved (including but not limited to user personal information, user image information, and user equipment information such as position information) and the data involved (including but not limited to data for analysis, stored data and displayed data) are information and data authorized by the user or fully authorized by all parties. The collection, storage, use, processing, transmission, provision, disclosure and application of the related data all comply with the relevant laws, regulations and standards of the relevant countries and regions, necessary security measures are taken, public order and good morals are not violated, and corresponding operation entries are provided for the user to choose to grant or refuse authorization.
Embodiments of the present disclosure provide a financial risk prediction method, comprising: acquiring data to be processed, wherein the data to be processed comprises at least one of financial report data, associated company information, associated person information and company stock price information; processing the data to be processed by adopting a pre-trained risk prediction model to obtain a risk prediction result; determining a risk threshold; and determining that there is a financial risk in the case that the risk prediction result is greater than the risk threshold. The risk prediction model comprises a plurality of encoders and a classifier, wherein the encoders are used for processing the data to be processed to obtain a plurality of groups of feature vectors, and the classifier is used for fusing the plurality of groups of feature vectors and calculating the risk prediction result. The risk prediction result is obtained by processing high-dimensional, multi-source heterogeneous data with the risk prediction model. Using multiple factors allows more complex and richer semantic information to be learned. The stock price sequence data of the listed company is also fully used, which gives banks a new source of information for predicting company financial risk and improves assessment accuracy.
Fig. 1 schematically illustrates an application scenario diagram of a financial risk prediction method, apparatus, device, medium and program product according to an embodiment of the present disclosure.
As shown in fig. 1, an application scenario 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as financial applications, shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only) may be installed on the terminal devices 101, 102, 103. In particular, data such as the financial report data of, for example, a listed company, associated company information, company-related personnel information, company-related news, and company stock prices may be collected by the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the financial risk prediction method provided by the embodiments of the present disclosure may be generally performed by the server 105. Accordingly, the financial risk prediction apparatus provided by embodiments of the present disclosure may be generally provided in the server 105. The financial risk prediction method provided by the embodiments of the present disclosure may also be performed by a server or a cluster of servers other than the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the financial risk prediction apparatus provided by the embodiments of the present disclosure may also be provided in a server or server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The financial risk prediction method of the embodiments of the present disclosure will be described in detail below with reference to FIGS. 2A to 11, based on the scenario described in FIG. 1.
Fig. 2A schematically illustrates a flow chart of a financial risk prediction method according to an embodiment of the present disclosure. Fig. 2B schematically illustrates a flow diagram of a financial risk prediction method according to an embodiment of the disclosure.
As shown in fig. 2A, the financial risk prediction method of this embodiment includes, for example, operations S210 to S240, and the financial risk prediction method may be executed by a computer program on corresponding computer hardware.
In operation S210, data to be processed including at least one of financial report data, associated company information, associated person information, and company stock price information is acquired.
In embodiments of the present disclosure, the user's consent or authorization may be obtained prior to obtaining the user's information. For example, before operation S210, a request to acquire user information may be issued to the user. In case the user agrees or authorizes that the user information can be acquired, the operation S210 is performed.
In operation S220, the data to be processed is processed by using a pre-trained risk prediction model, so as to obtain a risk prediction result. The risk prediction model comprises a plurality of encoders and a classifier, wherein the encoders are used for processing data to be processed to obtain a plurality of groups of feature vectors, and the classifier is used for fusing the plurality of groups of feature vectors and calculating to obtain a risk prediction result.
In operation S230, a risk threshold is determined.
In operation S240, it is determined that there is a financial risk in case the risk prediction result is greater than the risk threshold.
For example, the financial risk prediction method of the present disclosure is used to predict the financial risk of a listed company before a financial transaction is conducted with that company. Accordingly, the data to be processed includes financial report data of the listed company, information on companies associated with the listed company, associated person information, stock price information of the listed company, and the like. Association here is to be understood as covering, for example, other companies in which the listed company invests or with which it has business dealings, major shareholders of the listed company, and other companies in which those shareholders have invested.
For example, as shown in fig. 2B, to predict the financial risk of a certain listed company, the financial report data, associated company information, associated person information, and company stock price information of the listed company are collected as the data to be processed. These data cover the financial status of the company, background information on the associated companies and persons, and the market's response to the company.
First, the data to be processed is processed using a pre-trained risk prediction model. The model includes a plurality of encoders and a classifier. Each encoder is responsible for processing one part of the data to be processed and generating a set of feature vectors. For example, one encoder processes the financial report data, another encoder processes the associated company information, and so on. The feature vectors generated by each encoder capture the important information of the corresponding data type. The classifier then fuses the feature vectors and calculates a risk prediction result. The classifier takes into account the weights of the different features and the relations between them, so that the risk condition reflected by the data to be processed as a whole is considered. During prediction, a staff member sets a risk threshold. If the risk prediction result exceeds the threshold, it is determined that the company has a financial risk, which means that the company may be facing financial distress, poor management, or other adverse conditions. In this way, the financial condition of the company and related information can be considered from multiple perspectives using big data analysis and machine learning techniques to predict potential financial risk, so that potential problems can be discovered earlier and corresponding measures can be taken to reduce risk and protect the interests of the company.
For example, an electronics technology company is taken as an example to illustrate the financial risk prediction method of the present disclosure. The following data to be processed is collected: financial report data, associated company information, associated person information, and company stock price information. The financial report data includes financial indicators of the company such as the net profit cash ratio, asset-liability ratio, gross profit, operating profit ratio, operating income growth rate, fixed asset ratio, and equity ratio. These data provide information on the financial status, income status, cash flow status, and the like of the company. The associated company information concerns other companies related to the company, such as partners, suppliers, and customers. These data can reveal the company's business dealings, associated risks, business partners, and the like. The associated person information includes background information on persons associated with the company, such as senior executives, board members, and shareholders. Such data may provide information on the background of key decision makers, their professional experience, conflicts of interest, and the like. The company stock price information concerns historical data on the company's stock price and the market's reactions. These data may reflect the market's assessment and expectations of the company and the risk it perceives.
The data to be processed is then processed using a pre-trained risk prediction model, which consists of a plurality of encoders and a classifier. For example, the financial report data may be processed by an encoder into a set of feature vectors covering financial ratios (e.g., liability ratio, liquidity ratio), revenue growth rate, cash flow conditions, and the like. The associated company information may be processed by an encoder into another set of feature vectors covering the associated companies' industry status, financial health, collaboration history, and the like. The associated person information and the company stock price information may likewise be processed by encoders into corresponding feature vectors. Finally, the classifier considers all the feature vectors together and calculates a risk prediction result. If the risk prediction result exceeds a predetermined risk threshold, it may be determined that the company has a financial risk. For example, the financial ratios may show that the company's liability ratio is too high, its revenue growth rate is declining, and its cash flow condition is poor; the associated company information may show that suppliers or partners having business dealings with the company are in financial distress; the associated person information may show that senior executives have improper associations with other companies or a record of past management failures; and the company's stock price may be falling continuously. Any of these may cause the risk prediction result to exceed the threshold, so that the company is determined to have a financial risk.
With this method, the company's financial condition, associated company information, associated person information, and market response are considered together using multidimensional data, so that potential financial risk is predicted and corresponding measures can be taken in advance to reduce risk and protect the interests of the company.
For example, the initial value of the risk threshold may be set to 0.5 as an empirical value, and then adaptively adjusted according to multiple training and testing results.
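The flow of operations S210 to S240 can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the function names, the toy encoders, and the averaging classifier are hypothetical stand-ins chosen only to show how several encoders feed one classifier whose output is compared with a risk threshold initialised to 0.5.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def encode_report(report: Dict[str, float]) -> Vector:
    # Hypothetical encoder: uses the liability (debt) ratio as a single feature.
    return [report["debt_ratio"]]


def encode_prices(prices: Sequence[float]) -> Vector:
    # Hypothetical encoder: relative price drop over the window, floored at 0.
    first, last = prices[0], prices[-1]
    return [max(0.0, (first - last) / first)]


def mean_classifier(feature_groups: Sequence[Vector]) -> float:
    # Hypothetical classifier: averages all features (a trained model would go here).
    values = [v for group in feature_groups for v in group]
    return sum(values) / len(values) if values else 0.0


def predict_risk(data: Dict[str, object],
                 encoders: Dict[str, Callable[[object], Vector]],
                 classifier: Callable[[Sequence[Vector]], float]) -> float:
    # S220: each encoder handles one kind of data; the classifier fuses the results.
    groups = [encode(data[name]) for name, encode in encoders.items() if name in data]
    return classifier(groups)


def has_financial_risk(risk_score: float, risk_threshold: float = 0.5) -> bool:
    # S230/S240: a financial risk is determined when the score exceeds the threshold.
    return risk_score > risk_threshold
```

With a debt ratio of 0.9 and a stock price falling from 10.0 to 6.0, the toy score is the mean of 0.9 and 0.4, which exceeds the initial threshold of 0.5.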
Fig. 3 schematically illustrates a flow chart of a method of processing data to be processed according to an embodiment of the disclosure. Fig. 4 schematically illustrates a heterogeneous graph of company and person relationships according to an embodiment of the present disclosure. Fig. 5 schematically illustrates a block diagram of a financial risk prediction model according to an embodiment of the present disclosure.
According to an embodiment of the present disclosure, as shown in fig. 3, data to be processed is processed, for example, through operations S321 to S325.
In operation S321, a heterogeneous graph is constructed according to the associated company information and the associated person information, the heterogeneous graph including a master node and a plurality of slave nodes.
In operation S322, attribute initialization is performed on the plurality of slave nodes according to the master node, to obtain an attribute vector.
In operation S323, position vectors of the plurality of slave nodes in the heterogeneous graph are determined.
In operation S324, the attribute vector and the position vector are fused to obtain the structural feature of the master node.
In operation S325, a risk prediction result is calculated according to the structural features of the master node.
For example, as shown in fig. 4, a listed company is taken as the master node, and a plurality of persons and a plurality of companies associated with the listed company are taken as slave nodes to construct the heterogeneous graph, where the persons associated with the listed company may also be associated with other companies. The heterogeneous graph can be divided into a plurality of sub-graphs according to the different combinations of companies and persons.
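The graph construction of operation S321 can be sketched as follows. This is only an illustration under assumptions: the class `HeteroGraph`, the function `build_graph`, and the relation names are hypothetical, and each sub-graph is taken here to be the set of edges of one relation type.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class HeteroGraph:
    master: str                                             # the company under assessment
    node_types: Dict[str, str] = field(default_factory=dict)
    edges: Dict[str, Set[Tuple[str, str]]] = field(default_factory=dict)

    def add_node(self, name: str, node_type: str) -> None:
        # Keep the first type recorded for a node.
        self.node_types.setdefault(name, node_type)

    def add_edge(self, src: str, dst: str, edge_type: str) -> None:
        self.edges.setdefault(edge_type, set()).add((src, dst))

    def slave_nodes(self) -> List[str]:
        # Every node other than the master node.
        return sorted(n for n in self.node_types if n != self.master)

    def subgraph(self, edge_type: str) -> List[Tuple[str, str]]:
        # Edges of a single relation type (one assumed way to split into sub-graphs).
        return sorted(self.edges.get(edge_type, set()))


def build_graph(master: str,
                company_links: List[Tuple[str, str]],
                person_links: List[Tuple[str, str, str]]) -> HeteroGraph:
    graph = HeteroGraph(master=master)
    graph.add_node(master, "company")
    for company, relation in company_links:
        graph.add_node(company, "company")
        graph.add_edge(master, company, relation)
    for person, company, relation in person_links:
        # A person may be linked to the master node and to other companies.
        graph.add_node(person, "person")
        graph.add_node(company, "company")
        graph.add_edge(person, company, relation)
    return graph
```

Storing edges per relation type keeps the two node types (company, person) and several edge types in one structure, which is the property that makes the graph heterogeneous.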
For example, as shown in fig. 5, an encoder learns the structural representation of each heterogeneous graph based on HAN (Heterogeneous Graph Attention Network). Since nodes such as legal representatives and senior executives in the heterogeneous graph carry no attributes of their own, the node attributes need to be initialized according to the master node, namely the listed company. The attributes of each node are then represented as a vector, and the fusion of the attribute vector and the position vector of the node in the graph is input into the HAN as the vector of the current node, finally yielding the embedded representation of the node, as in equation (1). A heterogeneous graph contains multiple types of nodes and edges; for each type, a corresponding node-level representation is computed using an attention mechanism, and the resulting plurality of node-level representations is then passed through a semantic-level attention calculation module to obtain the final structural vector representation Z.
Wherein one term of the equations represents the semantic-level vector (i.e., the attribute vector of each node), which can be understood as a weight, and another represents the node-level vector (i.e., the position vector of the node in the graph); w and α are hyper-parameters, and h is the fused vector.
It will be appreciated that network structures other than HAN may also be employed to extract feature vectors from the heterogeneous graph, including, for example, but not limited to RGCN, RGAT, HGAT, GNN-FiLM, etc.
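The two steps described above, fusing a node's attribute vector with its position vector and then combining several node-level representations through semantic-level attention, can be sketched as follows. This is a simplified illustration and not the formulas of the disclosure: fusion by concatenation is an assumption, and the attention scores are passed in directly rather than learned as they would be in a HAN.

```python
import math
from typing import List, Sequence


def fuse_attribute_and_position(attribute: Sequence[float],
                                position: Sequence[float]) -> List[float]:
    # Assumed fusion: simple concatenation of the two vectors.
    return list(attribute) + list(position)


def softmax(scores: Sequence[float]) -> List[float]:
    # Numerically stable softmax over attention scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def semantic_attention(embeddings: Sequence[Sequence[float]],
                       scores: Sequence[float]) -> List[float]:
    # Weighted sum of per-type embeddings; weights are the softmax of the scores.
    if len(embeddings) != len(scores):
        raise ValueError("one score is required per embedding")
    weights = softmax(scores)
    dim = len(embeddings[0])
    return [sum(w * e[k] for w, e in zip(weights, embeddings)) for k in range(dim)]
```

With equal scores the result is the plain average of the per-type embeddings; a higher score shifts the structural vector towards the corresponding node or edge type.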
Fig. 6 schematically illustrates a flow chart of a method of processing data to be processed according to another embodiment of the disclosure.
According to an embodiment of the present disclosure, as shown in fig. 6, data to be processed is processed, for example, through operations S621 to S622.
In operation S621, feature extraction is performed on the company stock price information to obtain a stock price time sequence feature vector.
In operation S622, a risk prediction result is calculated according to the structural feature of the master node and the stock price time sequence feature vector.
For example, the company stock price information may be time series data. As shown in fig. 5, an LSTM (Long Short-Term Memory) network is used to extract features from the time series data of the company's stock price, yielding a hidden representation of the stock price (namely, the stock price time sequence feature vector). Then, an MLP (Multilayer Perceptron) fuses the stock price time sequence feature vector with the structural vector representation Z to obtain an embedded vector, from which the risk prediction result of the listed company is calculated. The financial risk prediction model based on the graph neural network consists of an encoder E and a classifier M, and its objective function is given by formula (4):
wherein y_i represents the predicted risk probability value, and the two node sets appearing in the formula denote the positive category (nodes with financial risk) and the negative category (nodes without financial risk) in the graph network, respectively. By optimizing the objective function, a risk prediction probability value can be calculated.
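One objective that is consistent with these definitions is a binary cross-entropy summed over the positive and negative node sets. The sketch below assumes that form; it is a common choice for this kind of node classification and is not confirmed by the text as formula (4). The function and parameter names are hypothetical.

```python
import math
from typing import Dict, Iterable


def cross_entropy_objective(probabilities: Dict[str, float],
                            positive_nodes: Iterable[str],
                            negative_nodes: Iterable[str],
                            eps: float = 1e-12) -> float:
    # Assumed form: -sum(log y_i) over risky nodes - sum(log(1 - y_i)) over the rest.
    loss = 0.0
    for node in positive_nodes:
        loss -= math.log(max(probabilities[node], eps))
    for node in negative_nodes:
        loss -= math.log(max(1.0 - probabilities[node], eps))
    return loss
```

Under this assumption, minimising the objective pushes the predicted probability towards 1 for nodes with financial risk and towards 0 for nodes without it.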
It can be understood that, besides LSTM, the stock price time sequence feature vector can also be extracted using network models such as GRU, BiLSTM, BiGRU, Transformer, or RNN. Besides MLP, network models such as DNN and CNN can also be adopted to fuse the multidimensional feature vectors to obtain the embedded vector.
Fig. 7A schematically illustrates a flow chart of a method of processing data to be processed according to yet another embodiment of the disclosure.
According to an embodiment of the present disclosure, as shown in fig. 7A, data to be processed is processed, for example, through operations S721 to S722.
In operation S721, feature extraction is performed on the financial report data to obtain a financial report feature vector.
In operation S722, a risk prediction result is calculated according to the structural feature of the master node, the stock price time sequence feature vector and the financial report feature vector.
For example, an MLP (Multilayer Perceptron) may be used directly to fuse the financial report feature vector, the stock price time sequence feature vector, and the structural vector representation Z to obtain an embedded vector.
Fig. 7B schematically illustrates a flow chart of a method of processing data to be processed according to yet another embodiment of the disclosure.
According to an embodiment of the present disclosure, as shown in fig. 7B, data to be processed is processed, for example, by operations S7221 to S7223.
In operation S7221, the respective weights of the structural feature of the master node, the stock price time sequence feature vector, and the financial report feature vector are determined.
In operation S7222, at least two of the structural feature of the master node, the stock price time sequence feature vector, and the financial report feature vector are fused according to the weights to obtain an embedded vector.
In operation S7223, a risk prediction result is calculated from the embedded vector.
For example, as shown in fig. 5, a feature vector may be extracted from the company's financial report data and fused with the stock price time sequence feature vector and the structural vector representation Z to obtain an embedded vector. For each listed company node, the features consist of three parts: the structural feature, the stock price time sequence feature, and the financial report feature. Finally, an attention-based feature fusion module fuses the three features into an embedded vector according to their weights, and the resulting final representation is passed through an MLP (multi-layer perceptron) to calculate a risk probability value, which can be used for the financial risk assessment task of the listed company. Fusing the multidimensional feature vectors according to weights improves the compatibility of the model and allows the actual feature attributes of different nodes to be reflected.
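Operations S7221 to S7223 can be sketched as a weighted sum followed by a scoring layer. The sketch is illustrative only and rests on assumptions: all feature vectors are taken to have already been projected to a common dimension, the attention scores are supplied rather than learned, and a single logistic layer stands in for the MLP. All names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def attention_fuse(features: Dict[str, Sequence[float]],
                   scores: Dict[str, float]) -> List[float]:
    # S7221: turn per-feature scores into weights with a softmax.
    names = list(features)
    if len(names) < 2:
        raise ValueError("at least two feature vectors are required")
    dim = len(features[names[0]])
    if any(len(features[n]) != dim for n in names):
        raise ValueError("feature vectors must share one dimension")
    top = max(scores[n] for n in names)
    exps = {n: math.exp(scores[n] - top) for n in names}
    total = sum(exps.values())
    weights = {n: exps[n] / total for n in names}
    # S7222: the embedded vector is the weighted sum of the feature vectors.
    return [sum(weights[n] * features[n][k] for n in names) for k in range(dim)]


def risk_probability(embedded: Sequence[float],
                     layer_weights: Sequence[float],
                     bias: float = 0.0) -> float:
    # S7223: a single logistic layer used here as a stand-in for the MLP.
    z = sum(w * x for w, x in zip(layer_weights, embedded)) + bias
    return 1.0 / (1.0 + math.exp(-z))
```

Because the weights sum to one, a feature source with a low score contributes little to the embedded vector without being discarded outright.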
Fig. 8 schematically illustrates a flow chart of a method of determining that a financial risk exists according to an embodiment of the present disclosure.
According to an embodiment of the present disclosure, as shown in fig. 8, it is determined that a certain listed company has a financial risk, for example, through operations S841 to S842.
In operation S841, a difference between the risk prediction result and the risk threshold value is determined.
In operation S842, it is determined that there is a financial risk if the difference is greater than the difference threshold.
For example, the degree of financial risk is determined based on the difference between the probability value of the generated risk prediction result and the risk threshold. If the difference is greater than the difference threshold, the probability that the listed company will incur a financial risk is considered large; if the difference is less than the difference threshold, that probability is considered small. This embodiment thus further constrains how a financial risk is determined: a financial risk is determined to exist only after the risk threshold has been exceeded by a certain margin, which improves the stability of the model and avoids false alarms caused by an unreasonably chosen threshold or by minor fluctuations in the data.
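Operations S841 and S842 amount to a margin test, as in the following minimal sketch. The function name and the default difference threshold of 0.1 are assumptions made for illustration; the text does not give a value.

```python
def has_financial_risk_with_margin(risk_score: float,
                                   risk_threshold: float = 0.5,
                                   difference_threshold: float = 0.1) -> bool:
    # S841: difference between the risk prediction result and the risk threshold.
    difference = risk_score - risk_threshold
    # S842: a financial risk is determined only if the margin is large enough.
    return difference > difference_threshold
```

A score of 0.55 exceeds the risk threshold of 0.5 but not by the required margin, so no financial risk is reported, whereas a score of 0.7 is reported.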
Fig. 9 schematically illustrates a flow chart of a method of acquiring data to be processed according to an embodiment of the disclosure.
According to an embodiment of the present disclosure, as shown in fig. 9, data to be processed is acquired, for example, through operations S911 to S912.
In operation S911, company news data is acquired.
In operation S912, associated company information and/or associated person information is determined based at least on the company news data.
For example, when the financial risk prediction method is used, the associated company information and/or the associated person information may be determined from company news data. Suppose a new energy company is analyzed using the financial risk prediction method of the present disclosure. In addition to the financial report data, the company stock price information, and the associated person information, the associated company information and some further associated person information can be obtained from the company news data in the data to be processed. Company news data may include press releases, announcements, media reports, and the like concerning the company. By analyzing these data, it may be found, for example, that the new energy company has established a cooperative relationship with a certain battery manufacturer and plans to jointly develop a new generation of battery technology. This battery manufacturer is then one of the company's associated companies. This information can be incorporated into the data to be processed and thus further affects the risk prediction result.
In addition, company news data may also provide associated person information. For example, the news mentions that the sponsor and CEO of the new energy company have entered into a strategic collaboration with a well-known investor who enjoys a high reputation in the industry. This investor is one of the persons associated with the company. By incorporating this associated person information into the data to be processed, the financial risk of the company can be assessed more fully. After the data to be processed is processed using the pre-trained risk prediction model, a risk prediction result is obtained. If the result exceeds the preset risk threshold, it may be determined that the company has a financial risk, a determination that draws on the associated company information and associated person information found in the company news data. Therefore, by determining the associated company information and the associated person information from company news data, the data to be processed can be used more fully, and the accuracy and comprehensiveness of the financial risk prediction can be improved.
The dataset is likely to contain missing or inconsistent data. Applying an algorithm to such noisy data affects the accuracy of the prediction result, so the data to be processed may also be preprocessed before it is processed using the pre-trained risk prediction model, in order to improve the accuracy of the model's predictions.
FIG. 10 schematically illustrates a flow chart of a financial risk prediction method according to another embodiment of the present disclosure.
According to an embodiment of the present disclosure, as shown in fig. 10, the financial risk prediction method further includes, for example, operations S1010 to S1020, before processing the data to be processed using the pre-trained risk prediction model.
In operation S1010, the data to be processed is preprocessed to obtain data in a unified format. The preprocessing comprises missing value filling, data rejection, discrete feature coding, and feature normalization.
In operation S1020, the data in the unified format is processed using the pre-trained risk prediction model to obtain a risk prediction result.
For example, missing value filling: missing data may be filled in using the mean or the median. Data rejection: some associated companies may be mere shell companies with no actual business operations and no reference value, and their data can be rejected. Discrete feature coding: discrete data features may be converted into computable numerical sequences using one-hot encoding. Feature normalization: different evaluation indicators often have different dimensions and units, which affects the result of data analysis; to eliminate the influence of dimension among indicators, the data is standardized so that the indicators are comparable. After standardization of the raw data, all indicators are on the same order of magnitude and are suitable for comprehensive comparison and evaluation.
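The four preprocessing steps can be sketched with the Python standard library as follows. The helper names, the `has_operations` flag used to mark shell companies, and the choice of min-max scaling for normalization are assumptions made for illustration.

```python
from statistics import median
from typing import Dict, List, Optional


def fill_missing(values: List[Optional[float]]) -> List[float]:
    # Missing value filling: replace None with the median of the known values.
    known = [v for v in values if v is not None]
    middle = median(known)
    return [middle if v is None else v for v in values]


def reject_shell_companies(companies: List[Dict[str, object]]) -> List[Dict[str, object]]:
    # Data rejection: drop records flagged as having no actual business operations.
    return [c for c in companies if c.get("has_operations", True)]


def one_hot(values: List[str]) -> List[List[int]]:
    # Discrete feature coding: one column per category, in sorted order.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    return [[1 if index[v] == i else 0 for i in range(len(categories))] for v in values]


def min_max_normalize(values: List[float]) -> List[float]:
    # Feature normalization: rescale to [0, 1] so indicators become comparable.
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]
```

For instance, `fill_missing([1.0, None, 3.0])` fills the gap with the median 2.0, and `min_max_normalize([10.0, 20.0, 30.0])` maps the values to 0.0, 0.5, and 1.0.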
Fig. 11 schematically illustrates a flow chart of a method of training the risk prediction model according to an embodiment of the disclosure.
According to an embodiment of the present disclosure, as shown in fig. 11, a risk prediction model is trained, for example, by operations S1110 to S1140.
In operation S1110, a plurality of training sets are determined according to the data to be processed.
In operation S1120, the risk prediction model is trained using the plurality of training sets to obtain a plurality of initial risk prediction models.
In operation S1130, the plurality of initial risk prediction models are tested respectively to obtain a plurality of initial risk prediction results.
In operation S1140, the training of the risk prediction model using the plurality of training sets is repeated until the variance of the plurality of initial risk prediction results is less than a variance threshold.
For example, for a training dataset, the model is trained using 5-fold cross-validation: different trained models are obtained on 5 different training subsets, and the final financial risk prediction model is obtained by reducing the variance of their results.
For example, labeled sample data is used, covering both cases of good financial condition and cases of financial risk. First, the sample data is divided into a plurality of different subsets, some of which are used to train the model and the rest to validate it. For example, the sample data may be divided into 5 subsets, 4 of which are used to train the model and 1 of which is used to validate it. This process is then repeated so that each subset serves as the validation set once. In this way, different sample combinations are taken into account and the performance of the model is evaluated more fully. In each training and validation step, the model processes the data to be processed using the encoders to generate feature vectors. The classifier is then trained on these feature vectors and makes predictions from them, yielding risk prediction results. By comparing the predicted results with the actual labels, the accuracy and performance of the model can be assessed. The purpose of cross-validation is to find the average performance of the model over different datasets and thus avoid bias due to the selection of a particular dataset. During training, parameters can be adjusted and the model improved according to the validation results, so as to improve the prediction capability and generalization capability of the model. By repeatedly performing cross-validation and model optimization, a financial risk prediction model with higher accuracy and stability can be obtained. Such a model can be applied to unlabeled data to predict future financial risk and to help decision makers take timely action to reduce risk and protect the interests of the company.
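The cross-validation loop and the variance check of operation S1140 can be sketched as follows. This is a simplified illustration: `train` and `evaluate` are hypothetical callables standing in for model training and testing, the folds are formed by striding over the samples, and the population variance of the per-fold results is used as the stability measure.

```python
from statistics import pvariance
from typing import Callable, Iterator, List, Sequence, Tuple


def k_fold_splits(samples: Sequence, k: int = 5) -> Iterator[Tuple[List, List]]:
    # Split into k folds; each fold serves as the validation set exactly once.
    folds = [list(samples[i::k]) for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield training, validation


def cross_validate(samples: Sequence,
                   train: Callable[[List], object],
                   evaluate: Callable[[object, List], float],
                   k: int = 5) -> Tuple[List[float], float]:
    # Train one initial model per split and collect the per-fold results.
    results = []
    for training, validation in k_fold_splits(samples, k):
        model = train(training)
        results.append(evaluate(model, validation))
    return results, pvariance(results)


def is_stable(variance: float, variance_threshold: float) -> bool:
    # Training is repeated until the variance falls below the threshold.
    return variance < variance_threshold
```

In use, `cross_validate` would be called repeatedly, with parameters adjusted between rounds, until `is_stable` returns true for the chosen variance threshold.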
Based on the financial risk prediction method, the disclosure further provides a financial risk prediction device. The device will be described in detail below in connection with fig. 12.
FIG. 12 schematically illustrates a block diagram of a financial risk prediction apparatus according to an embodiment of the present disclosure.
As shown in fig. 12, the financial risk prediction apparatus 1200 of this embodiment includes, for example: an acquisition module 1210, a processing module 1220, a first determination module 1230, and a second determination module 1240.
The acquisition module 1210 is configured to acquire data to be processed, where the data to be processed includes at least one of financial report data, associated company information, associated person information, and company stock price information. In an embodiment, the acquisition module 1210 may be configured to perform the operation S210 described above, which is not repeated here.
The processing module 1220 is configured to process the data to be processed using a pre-trained risk prediction model to obtain a risk prediction result, where the risk prediction model includes a plurality of encoders and a classifier, the plurality of encoders are configured to process the data to be processed to obtain a plurality of sets of feature vectors, and the classifier is configured to fuse the plurality of sets of feature vectors and calculate the risk prediction result. In an embodiment, the processing module 1220 may be configured to perform the operation S220 described above, which is not repeated here.
The first determination module 1230 is configured to determine a risk threshold. In an embodiment, the first determination module 1230 may be configured to perform the operation S230 described above, which is not repeated here.
The second determination module 1240 is configured to determine that there is a financial risk if the risk prediction result is greater than the risk threshold. In an embodiment, the second determination module 1240 may be configured to perform the operation S240 described above, which is not repeated here.
Any of the acquisition module 1210, the processing module 1220, the first determination module 1230, and the second determination module 1240 may be combined in one module to be implemented, or any of them may be split into a plurality of modules according to an embodiment of the present disclosure. Alternatively, at least some of the functionality of one or more of the modules may be combined with at least some of the functionality of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the acquisition module 1210, the processing module 1220, the first determination module 1230, and the second determination module 1240 may be implemented at least in part as hardware circuitry, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging circuitry, or in any one of or a suitable combination of three of software, hardware, and firmware. Alternatively, at least one of the acquisition module 1210, the processing module 1220, the first determination module 1230 and the second determination module 1240 may be at least partially implemented as computer program modules which, when executed, may perform the corresponding functions.
Fig. 13 schematically illustrates a block diagram of an electronic device adapted to implement a financial risk prediction method according to an embodiment of the disclosure.
As shown in fig. 13, an electronic device 1300 according to an embodiment of the present disclosure includes a processor 1301 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1302 or a program loaded from a storage portion 1308 into a Random Access Memory (RAM) 1303. Processor 1301 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. Processor 1301 may also include on-board memory for caching purposes. Processor 1301 may include a single processing unit or multiple processing units for performing different actions of the method flow according to embodiments of the present disclosure.
In the RAM 1303, various programs and data necessary for the operation of the electronic apparatus 1300 are stored. The processor 1301, the ROM 1302, and the RAM 1303 are connected to each other through a bus 1304. The processor 1301 performs various operations of the method flow according to the embodiment of the present disclosure by executing programs in the ROM 1302 and/or the RAM 1303. Note that the program may be stored in one or more memories other than the ROM 1302 and the RAM 1303. Processor 1301 may also perform various operations of the method flow according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the disclosure, the electronic device 1300 may also include an input/output (I/O) interface 1305, the input/output (I/O) interface 1305 also being connected to the bus 1304. The electronic device 1300 may also include one or more of the following components connected to the I/O interface 1305: an input section 1306 including a keyboard, a mouse, and the like; an output portion 1307 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker, and the like; a storage portion 1308 including a hard disk or the like; and a communication section 1309 including a network interface card such as a LAN card, a modem, or the like. The communication section 1309 performs a communication process via a network such as the internet. The drive 1310 is also connected to the I/O interface 1305 as needed. Removable media 1311, such as magnetic disks, optical disks, magneto-optical disks, semiconductor memory, and the like, is installed as needed on drive 1310 so that a computer program read therefrom is installed as needed into storage portion 1308.
The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs that, when executed, implement a financial risk prediction method according to an embodiment of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include ROM 1302 and/or RAM 1303 described above and/or one or more memories other than ROM 1302 and RAM 1303.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowcharts. The program code, when executed in a computer system, causes the computer system to implement the financial risk prediction method provided by embodiments of the present disclosure.
The above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed when the computer program is executed by the processor 1301. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
In one embodiment, the computer program may be based on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program can also be transmitted, distributed over a network medium in the form of signals, downloaded and installed via the communication portion 1309, and/or installed from the removable medium 1311. The computer program may include program code that may be transmitted using any appropriate network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In such embodiments, the computer program may be downloaded and installed from a network via the communication portion 1309 and/or installed from the removable medium 1311. The above-described functions defined in the system of the embodiments of the present disclosure are performed when the computer program is executed by the processor 1301. The systems, devices, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
According to embodiments of the present disclosure, program code for performing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages, and in particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, C, or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be combined and/or incorporated in a variety of ways, even if such combinations or incorporations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be variously combined and/or incorporated without departing from the spirit and teachings of the present disclosure. All such combinations and/or incorporations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.

Claims (13)

1. A method of financial risk prediction comprising:
acquiring data to be processed, wherein the data to be processed comprises at least one of financial report data, associated company information, associated person information and company stock price information;
processing the data to be processed using a pre-trained risk prediction model to obtain a risk prediction result;
determining a risk threshold;
determining that there is a financial risk if the risk prediction result is greater than the risk threshold;
wherein the risk prediction model comprises a plurality of encoders and a classifier, the encoders are configured to process the data to be processed to obtain a plurality of groups of feature vectors, and the classifier is configured to fuse the plurality of groups of feature vectors and calculate the risk prediction result.
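As a non-limiting illustration of claim 1, the following minimal sketch wires several encoders to a classifier and compares the result with a risk threshold. The encoder (`summary_encoder`), the logistic classifier, the data-source names and all numeric values are assumptions made for illustration only; the disclosure does not prescribe them.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]
Encoder = Callable[[Sequence[float]], Vector]


def summary_encoder(values: Sequence[float]) -> Vector:
    """Hypothetical encoder: maps one data source to [mean, max]."""
    return [sum(values) / len(values), max(values)]


def classify(groups: Sequence[Vector], weights: Sequence[float], bias: float) -> float:
    """Hypothetical classifier: concatenates the feature groups and applies a logistic layer."""
    fused = [x for group in groups for x in group]
    if len(fused) != len(weights):
        raise ValueError("one weight is required per fused feature")
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def predict_risk(
    data: Dict[str, Sequence[float]],
    encoders: Dict[str, Encoder],
    weights: Sequence[float],
    bias: float,
    risk_threshold: float,
) -> Tuple[float, bool]:
    """Encodes each data source, fuses the feature groups, and applies the threshold."""
    # Sorted names give a stable feature order for the weight vector.
    groups = [encoders[name](data[name]) for name in sorted(encoders)]
    score = classify(groups, weights, bias)
    return score, score > risk_threshold


data = {"financial_report": [0.2, 0.4], "stock_price": [1.0, 3.0]}
encoders = {"financial_report": summary_encoder, "stock_price": summary_encoder}
score, has_risk = predict_risk(data, encoders, [1.0, 1.0, 1.0, 1.0], -5.0, 0.5)
```

In a trained model the weights and bias would be learned rather than set by hand; the sketch only shows how the claimed steps connect.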
2. The method of claim 1, wherein processing the data to be processed using a pre-trained risk prediction model to obtain a risk prediction result comprises:
constructing a heterogeneous graph according to the associated company information and the associated person information, wherein the heterogeneous graph comprises a master node and a plurality of slave nodes;
initializing the attributes of the plurality of slave nodes according to the master node to obtain attribute vectors;
determining position vectors of the plurality of slave nodes in the heterogeneous graph;
fusing the attribute vectors and the position vectors to obtain the structural feature of the master node; and
calculating the risk prediction result according to the structural feature of the master node.
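One possible reading of the steps of claim 2 can be sketched as follows. The claim does not specify how attributes are initialized, how positions are encoded, or how the two are fused, so the choices below (a one-hot node type as the attribute vector, the inverse hop distance from the master node as the position vector, and concatenation followed by mean pooling as the fusion) are assumptions, as are all function and variable names.

```python
from collections import deque
from typing import Dict, List, Sequence, Tuple

NODE_TYPES = ("company", "person")  # assumed node types of the graph


def hop_distances(edges: Sequence[Tuple[str, str]], master: str) -> Dict[str, int]:
    """Breadth-first search giving each reachable node's hop distance from the master node."""
    adjacency: Dict[str, List[str]] = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    dist = {master: 0}
    queue = deque([master])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def structural_feature(
    master: str, node_types: Dict[str, str], edges: Sequence[Tuple[str, str]]
) -> List[float]:
    """Fuses slave-node attribute and position vectors into one vector for the master node."""
    dist = hop_distances(edges, master)
    slaves = [n for n in dist if n != master]
    if not slaves:
        return [0.0] * (len(NODE_TYPES) + 1)
    rows = []
    for n in slaves:
        attribute = [1.0 if node_types[n] == t else 0.0 for t in NODE_TYPES]
        position = [1.0 / dist[n]]  # closer slave nodes get a larger value
        rows.append(attribute + position)  # fusion by concatenation
    # Mean pooling over all slave nodes yields the master node's structural feature.
    return [sum(col) / len(rows) for col in zip(*rows)]


node_types = {"A": "company", "B": "company", "C": "person"}
edges = [("A", "B"), ("B", "C")]
feature = structural_feature("A", node_types, edges)
```

Here company A is the master node, B is a directly associated company and C is a person associated through B, so the resulting vector is `[0.5, 0.5, 0.75]`.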
3. The method of claim 2, wherein processing the data to be processed using a pre-trained risk prediction model to obtain a risk prediction result further comprises:
extracting features from the company stock price information to obtain a stock price time-series feature vector; and
calculating the risk prediction result according to the structural feature of the master node and the stock price time-series feature vector.
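The feature extraction of claim 3 is not tied to a particular encoder. As a stand-in for a learned sequence encoder, the sketch below derives a small hand-crafted vector from a price series: mean log return, volatility of log returns, and maximum drawdown. These three statistics and the function name are assumptions chosen for illustration, not features named in the disclosure.

```python
import math
from typing import List, Sequence


def stock_price_features(prices: Sequence[float]) -> List[float]:
    """Returns [mean log return, volatility, maximum drawdown] for a price series."""
    if len(prices) < 2 or min(prices) <= 0:
        raise ValueError("need at least two strictly positive prices")
    returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    mean_return = sum(returns) / len(returns)
    variance = sum((r - mean_return) ** 2 for r in returns) / len(returns)
    peak = prices[0]
    max_drawdown = 0.0
    for p in prices:
        peak = max(peak, p)  # highest price seen so far
        max_drawdown = max(max_drawdown, (peak - p) / peak)
    return [mean_return, math.sqrt(variance), max_drawdown]


features = stock_price_features([100.0, 110.0, 99.0, 120.0])
```

A vector of this kind could then be passed to the classifier together with the structural feature of the master node.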
4. The method of claim 3, wherein processing the data to be processed using a pre-trained risk prediction model to obtain a risk prediction result further comprises:
extracting features from the financial report data to obtain a financial report feature vector; and
calculating the risk prediction result according to the structural feature of the master node, the stock price time-series feature vector and the financial report feature vector.
5. The method of claim 4, wherein calculating the risk prediction result according to the structural feature of the master node, the stock price time-series feature vector, and the financial report feature vector comprises:
determining weights of the structural feature of the master node, the stock price time-series feature vector and the financial report feature vector;
fusing, according to the weights, at least two of the structural feature of the master node, the stock price time-series feature vector and the financial report feature vector to obtain an embedded vector; and
calculating the risk prediction result according to the embedded vector.
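The weighted fusion of claim 5 might look like the sketch below. It assumes that the feature vectors have already been projected to a common dimension and that the weights are obtained by a softmax over per-source relevance scores; neither assumption comes from the disclosure, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Turns arbitrary relevance scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def fuse(vectors: Sequence[Sequence[float]], scores: Sequence[float]) -> List[float]:
    """Weighted sum of equally sized feature vectors, giving one embedded vector."""
    if len(vectors) < 2 or len(vectors) != len(scores):
        raise ValueError("need at least two vectors and one score per vector")
    if len({len(v) for v in vectors}) != 1:
        raise ValueError("all vectors must share the same dimension")
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(len(vectors[0]))]


structural = [1.0, 2.0]   # structural feature of the master node (toy values)
stock = [3.0, 4.0]        # stock price time-series feature vector (toy values)
embedded = fuse([structural, stock], [0.0, 0.0])
```

With equal scores the two sources receive equal weight, so `embedded` is `[2.0, 3.0]`; a larger score for one source shifts the embedded vector toward that source.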
6. The method of claim 1, wherein the determining that there is a financial risk if the risk prediction result is greater than the risk threshold comprises:
determining a difference between the risk prediction result and the risk threshold;
and determining that the financial risk exists if the difference is greater than a difference threshold.
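Claim 6 effectively adds a margin on top of the risk threshold. A minimal sketch, with a hypothetical function name and toy values:

```python
def has_financial_risk(score: float, risk_threshold: float, difference_threshold: float) -> bool:
    """Flags risk only when the score exceeds the risk threshold by more than the margin."""
    difference = score - risk_threshold
    return difference > difference_threshold


clear_case = has_financial_risk(0.8, 0.5, 0.2)       # exceeds the threshold by about 0.3
borderline_case = has_financial_risk(0.6, 0.5, 0.2)  # exceeds the threshold by about 0.1
```

The margin keeps scores that only narrowly exceed the risk threshold from being reported as a financial risk.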
7. The method of claim 1, wherein the acquiring the data to be processed comprises:
acquiring company news data;
and determining the associated company information and/or the associated person information at least according to the company news data.
8. The method of claim 1, wherein prior to processing the data to be processed using a pre-trained risk prediction model, the method further comprises:
preprocessing the data to be processed to obtain data in a uniform format; and
processing the data in the uniform format using the pre-trained risk prediction model to obtain the risk prediction result;
wherein the preprocessing comprises missing value filling, data rejection, discrete feature coding and feature normalization.
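The four preprocessing operations named in claim 8 can be illustrated on a list of records. The concrete choices below are assumptions: records with too many missing numeric fields are rejected, remaining gaps are filled with the column mean, the discrete field is one-hot encoded, and numeric fields are min-max normalized. Field names and values are invented for the example.

```python
from typing import Dict, List, Optional, Sequence, Union

Record = Dict[str, Union[Optional[float], str]]


def preprocess(
    records: Sequence[Record],
    numeric_keys: Sequence[str],
    categorical_key: str,
    max_missing: int = 1,
) -> List[List[float]]:
    """Returns one uniformly formatted numeric row per retained record."""
    # Data rejection: drop records with more than max_missing empty numeric fields.
    kept = [r for r in records if sum(r.get(k) is None for k in numeric_keys) <= max_missing]
    if not kept:
        return []
    columns: Dict[str, List[float]] = {}
    for k in numeric_keys:
        present = [r[k] for r in kept if r.get(k) is not None]
        fill = sum(present) / len(present) if present else 0.0
        # Missing value filling with the column mean.
        values = [r[k] if r.get(k) is not None else fill for r in kept]
        low, span = min(values), max(values) - min(values)
        # Feature normalization to the range [0, 1].
        columns[k] = [(v - low) / span if span else 0.0 for v in values]
    # Discrete feature coding as a one-hot vector.
    categories = sorted({r[categorical_key] for r in kept})
    rows = []
    for i, r in enumerate(kept):
        one_hot = [1.0 if r[categorical_key] == c else 0.0 for c in categories]
        rows.append([columns[k][i] for k in numeric_keys] + one_hot)
    return rows


records = [
    {"revenue": 10.0, "debt": 2.0, "sector": "bank"},
    {"revenue": None, "debt": 4.0, "sector": "tech"},
    {"revenue": 30.0, "debt": 6.0, "sector": "bank"},
    {"revenue": None, "debt": None, "sector": "tech"},
]
rows = preprocess(records, ["revenue", "debt"], "sector")
```

The fourth record is rejected because both numeric fields are missing, and the missing revenue of the second record is filled with the mean of the observed revenues before normalization.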
9. The method of claim 1, wherein the training of the risk prediction model comprises:
determining a plurality of training sets according to the data to be processed;
training the risk prediction model with the plurality of training sets to obtain a plurality of initial risk prediction models;
testing the plurality of initial risk prediction models respectively to obtain a plurality of initial risk prediction results; and
repeatedly training the risk prediction model with the plurality of training sets until the variance of the plurality of initial risk prediction results is smaller than a variance threshold.
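The stopping rule of claim 9 can be read as: train one model per training set, test every model, and keep training until the test results agree closely enough. The sketch below follows that reading with a one-feature logistic model trained by stochastic gradient descent. The model, the learning rate, the toy training sets and the use of a single test input are all assumptions made for illustration.

```python
import math
from statistics import pvariance
from typing import List, Sequence, Tuple

Model = Tuple[float, float]   # (weight, bias) of a one-feature logistic model
Sample = Tuple[float, int]    # (feature value, label)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_epoch(model: Model, samples: Sequence[Sample], lr: float = 0.5) -> Model:
    """One pass of stochastic gradient descent on the log loss."""
    w, b = model
    for x, y in samples:
        error = sigmoid(w * x + b) - y
        w -= lr * error * x
        b -= lr * error
    return w, b


def train_until_stable(
    training_sets: Sequence[Sequence[Sample]],
    test_x: float,
    variance_threshold: float,
    max_rounds: int = 1000,
) -> Tuple[List[Model], List[float], bool]:
    """Trains one model per training set until their test results have low variance."""
    models: List[Model] = [(0.0, 0.0) for _ in training_sets]
    results: List[float] = []
    for _ in range(max_rounds):
        models = [train_epoch(m, s) for m, s in zip(models, training_sets)]
        results = [sigmoid(w * test_x + b) for w, b in models]
        if pvariance(results) < variance_threshold:
            return models, results, True
    return models, results, False  # round limit reached before the models agreed


training_sets = [
    [(-1.0, 0), (1.0, 1)],
    [(-2.0, 0), (2.0, 1)],
    [(-1.0, 0), (2.0, 1)],
]
models, results, converged = train_until_stable(training_sets, 1.5, 1e-4)
```

The `max_rounds` limit is a safeguard added for the sketch; the claim itself only states the variance condition.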
10. A financial risk prediction apparatus, comprising:
an acquisition module configured to acquire data to be processed, wherein the data to be processed comprises at least one of financial report data, associated company information, associated person information and company stock price information;
a processing module configured to process the data to be processed using a pre-trained risk prediction model to obtain a risk prediction result, wherein the risk prediction model comprises a plurality of encoders and a classifier, the encoders are configured to process the data to be processed to obtain a plurality of groups of feature vectors, and the classifier is configured to fuse the plurality of groups of feature vectors and calculate the risk prediction result;
a first determining module configured to determine a risk threshold; and
a second determining module configured to determine that there is a financial risk if the risk prediction result is greater than the risk threshold.
11. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-9.
12. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1-9.
13. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 9.
CN202310986777.9A 2023-08-07 2023-08-07 Financial risk prediction method, apparatus, device, medium and program product Pending CN117011080A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310986777.9A CN117011080A (en) 2023-08-07 2023-08-07 Financial risk prediction method, apparatus, device, medium and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310986777.9A CN117011080A (en) 2023-08-07 2023-08-07 Financial risk prediction method, apparatus, device, medium and program product

Publications (1)

Publication Number Publication Date
CN117011080A true CN117011080A (en) 2023-11-07

Family

ID=88565179

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310986777.9A Pending CN117011080A (en) 2023-08-07 2023-08-07 Financial risk prediction method, apparatus, device, medium and program product

Country Status (1)

Country Link
CN (1) CN117011080A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination