CN113486124B - Bank bad asset management and management system, method, equipment and storage medium based on PCA and knowledge graph technology - Google Patents

Info

Publication number
CN113486124B
Authority
CN
China
Prior art keywords
asset
knowledge graph
bad
principal component
assets
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110564109.8A
Other languages
Chinese (zh)
Other versions
CN113486124A (en)
Inventor
陈伟
李丽华
王红宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jialian International Auction Co ltd
Shandong Jialian Electronic Commerce Co ltd
Original Assignee
Jialian International Auction Co ltd
Shandong Jialian Electronic Commerce Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jialian International Auction Co ltd, Shandong Jialian Electronic Commerce Co ltd
Priority to CN202110564109.8A
Publication of CN113486124A
Application granted
Publication of CN113486124B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a system, method, device and storage medium for managing bank bad assets based on PCA and knowledge graph technology. The system comprises an asset display module, an asset classification analysis module and an asset disposal module. The asset display module provides an overview of asset information and real-time refreshing of view data. The asset classification analysis module classifies and displays assets by type; displays the attributes of the assets as a knowledge graph and uses PCA to condense the many attributes in the knowledge graph into a few key indicators for display; and analyzes the knowledge graph with preset rules, combined with manual analysis and judgment, to give the priority, timing and mode of disposal for the bad assets to be disposed of. The asset disposal module implements playback and duplication of the knowledge graph. By reducing the dimensionality of the asset attributes, the invention addresses the excessive dimensionality, management difficulty and inaccurate analysis of bad asset information, and lets managers focus on the principal component indicators.

Description

Bank bad asset management and management system, method, equipment and storage medium based on PCA and knowledge graph technology
Technical Field
The invention relates to a bank asset management system, method, device and storage medium based on a knowledge graph and big data analysis. It provides financial institutions such as banks with asset management analysis and with decision analysis for bad asset disposal operations, and belongs to the technical field of online asset transactions.
Background
At present, centrally administered state-owned enterprises, banks and similar institutions face the objective need to revitalize stock assets, preserve and increase asset value, prevent asset loss, optimize the allocation of existing resources, expand high-quality incremental supply and achieve a dynamic balance of supply and demand. Bank bad assets are at the trillion level. Promoting effective management of these assets through technical means such as an asset management system, and improving the efficiency of bad asset disposal, has significant economic and social value.
In bank asset management, and bad asset management in particular, the following problems currently need to be solved. First, the classification and management of the stock assets of enterprises and companies is not uniform: an enterprise holds a large number of assets of many types spread across regions, so the current state of the assets is hard to grasp clearly and intuitively. Second, there is no scientific and effective means of technical analysis for querying and analyzing, in real time, asset valuation, disposal mode, disposal strategy, and expected and realized income, so disposal is inefficient and assets depreciate or are lost. The main cause is that the attribute indicators of assets, especially bad assets, have too many dimensions, which increases the complexity and difficulty of analysis; moreover, the indicators may be correlated with one another, so that information overlaps and the analysis is distorted. As a result, asset managers cannot grasp the core elements of the assets clearly and intuitively, and cannot manage, analyze and make decisions on the assets accurately and effectively.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a bank asset management system based on a knowledge graph and big data analysis.
The invention also provides a bank asset management method based on a knowledge graph and big data analysis.
The invention displays the attributes of bank bad assets as a knowledge graph, so that the graph forms a portrait of each asset. The graph is rendered with d3.js + vue.js, the asset graph data is stored in a neo4j graph database, and the knowledge graph supports editing of nodes, attributes and relationships. This mainly solves the problem that assets are not displayed clearly and intuitively.
The invention applies principal component analysis (PCA) to the node list of the asset knowledge graph to reduce the dimensionality of the asset attributes. The more than 30 attributes of a bank bad asset are condensed into 3 to 4 principal components, and by focusing on these reduced indicators the asset manager can grasp the information on the bad asset. In the knowledge graph display, the nodes belonging to each principal component are distinguished by different colors and arrangements; the knowledge graph of the bad asset is simplified to show only the principal components, which narrows what the manager has to attend to and improves the efficiency of asset management. This mainly solves the problems of excessive dimensionality, management difficulty and inaccurate analysis of bad asset information, and lets managers focus on the principal component indicators.
The invention also provides computer equipment and a computer readable storage medium.
Interpretation of terms:
1. A knowledge graph (Knowledge Graph) is, in library and information science, a family of graphs that display the development and structure of knowledge; it uses visualization techniques to describe knowledge resources and their carriers, and to mine, analyze, construct, draw and display knowledge and the interrelations among knowledge resources and carriers.
2. The relational database is a database established on the basis of a relational model, and data in the database is processed by means of mathematical concepts and methods such as set algebra and the like. Various entities in the real world and various connections between entities are represented by relational models.
3. d3.js + vue.js technology. d3 stands for Data-Driven Documents; d3.js is a JavaScript library for data-driven documents and one of the most popular web visualization libraries, used by many other charting plug-ins. d3.js is mainly used to manipulate data, turning it into graphics by means of HTML, SVG and CSS. The invention uses d3.js to visualize the nodes and relationships of the knowledge graph. vue.js is a progressive framework for building user interfaces; it is a JavaScript library for building Web interfaces that provides reactive data binding and composable view components through an API that is as simple as possible, which makes MVVM simpler. vue.js is currently one of the mainstream front-end frameworks for web development.
4. Scree plot. The eigenvalue points in the plot are drawn as circles that resemble scree (loose stones), hence the name. It is also called a steep-slope plot, which is likewise descriptive: the circles are connected by a line that usually looks like a steep hillside. A scree plot shows the eigenvalues associated with the components or factors, in descending order, against the component or factor number. It is used in principal component analysis and factor analysis to assess visually which components or factors account for most of the variability in the data. The ideal pattern is a steep curve, followed by a bend, and then a flat or horizontal line. The components or factors on the steep part of the curve, before the first point where the flat trend begins, are retained.
5. Neo4j is a high-performance NoSQL graph database that stores structured data as a network (graph) rather than in tables. In contrast to relational databases, graph databases are good at handling large volumes of complex, interconnected, loosely structured data.
The technical scheme of the invention is as follows:
A bank bad asset management system based on PCA and knowledge graph technology comprises an asset display module and an asset classification analysis module;
the asset display module is configured to display the overall situation, the distribution and the classified statistics of bad asset projects, providing an overview of asset information and real-time refreshing of view data. The overall situation of bad asset projects refers to the total number of households, the total amount, and the proportion of bad assets in the total amount, as viewed through the overview. The distribution of bad asset projects, i.e. their regional distribution, refers to the number of households with bad assets in each region or city, for example 1,000 in Jinan City and 2,000 in Qingdao City. The classified statistics of bad asset projects refer to the number of households by type of bad asset collateral, or the number of households by the five-level classification of bank bad assets;
the asset classification analysis module is configured to: classify and display assets by type, such as movable property, real property, equity and creditor's rights;
and display the attributes of the assets as a knowledge graph, wherein the attributes of the assets include movable property, real property, equity and creditor's rights, and use PCA to condense the many attributes in the knowledge graph into a few key indicators for display.
Preferably, the bank bad asset management system further comprises an asset disposal module. The asset disposal module is configured to implement playback and duplication of the knowledge graph, specifically: during asset disposal, stages such as decision analysis, analysis results and disposal results are displayed through the knowledge graph; timestamp and step attributes are added to the nodes and relationships of the knowledge graph as it is created and modified; the knowledge graph is refreshed by time period and step; staged duplication of the knowledge graph is achieved by incrementing the time period or step, and playback is achieved by decrementing the time period or step.
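The step-based refresh described for the asset disposal module can be sketched as follows. This is a minimal illustration only, assuming an in-memory graph in which every node and link carries an integer `step` attribute; the function names, the dictionary layout and the sample data are hypothetical and are not taken from the patent.

```python
def snapshot(graph, step):
    """Return the nodes and links whose `step` attribute is <= `step`."""
    nodes = [n for n in graph["nodes"] if n["step"] <= step]
    visible = {n["id"] for n in nodes}
    links = [
        l for l in graph["links"]
        if l["step"] <= step and l["source"] in visible and l["target"] in visible
    ]
    return {"nodes": nodes, "links": links}


def duplicate(graph, last_step):
    """Staged duplication: yield one snapshot per step while the step increases."""
    for step in range(1, last_step + 1):
        yield snapshot(graph, step)


def play_back(graph, last_step):
    """Playback: yield one snapshot per step while the step decreases."""
    for step in range(last_step, 0, -1):
        yield snapshot(graph, step)


# Hypothetical disposal process recorded in three steps.
disposal_graph = {
    "nodes": [
        {"id": "project", "step": 1},
        {"id": "decision analysis", "step": 1},
        {"id": "analysis result", "step": 2},
        {"id": "disposal result", "step": 3},
    ],
    "links": [
        {"source": "project", "target": "decision analysis", "step": 1},
        {"source": "decision analysis", "target": "analysis result", "step": 2},
        {"source": "analysis result", "target": "disposal result", "step": 3},
    ],
}

forward_sizes = [len(s["nodes"]) for s in duplicate(disposal_graph, 3)]
backward_sizes = [len(s["nodes"]) for s in play_back(disposal_graph, 3)]
print(forward_sizes, backward_sizes)
```

The same filter could be applied to a timestamp attribute instead of a step counter; a step counter is used here only because it keeps the example short.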
Preferably, displaying the attributes of the assets as a knowledge graph and using PCA to condense the many attributes in the knowledge graph into a few key indicators for display is implemented as follows:
Step one: knowledge graph initial data display
A. Reading the bad asset project ledger information, which comprises an attribute field name, a type and an example, and displaying it as a list;
B. Automatically generating the knowledge graph of the bad asset project, specifically: storing the attribute field names in the bad asset project ledger information as nodes of the knowledge graph, the examples as attributes of those nodes, and the types as relationships between the nodes, and storing the knowledge graph in a neo4j database; this is implemented with existing techniques.
The knowledge graph of the bad asset project is displayed using d3.js + vue.js. The knowledge graph comprises a root node and leaf nodes: the root node is the name of the asset project and carries a loan number attribute that serves as its unique identifier, and the leaf nodes are displayed according to their direct association with the attributes of the bad asset project.
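The mapping in step B (field names to nodes, examples to node attributes, types to relationships) can be sketched as below. This is an illustrative in-memory version, not the neo4j implementation the patent refers to; the function name, the dictionary keys and the sample ledger entries are assumptions. The `nodes`/`links` layout with `source` and `target` keys is chosen because it resembles the input commonly passed to d3 force layouts.

```python
def build_asset_graph(project_name, loan_number, ledger):
    """Build a root node plus one leaf node per ledger entry.

    `ledger` is a list of dicts with the keys "field", "type" and "example",
    mirroring the three columns of the ledger information.
    """
    root = {"id": 0, "name": project_name, "properties": {"loan_number": loan_number}}
    nodes, links = [root], []
    for i, entry in enumerate(ledger, start=1):
        # Field name -> node; example value -> attribute of that node.
        nodes.append({"id": i, "name": entry["field"],
                      "properties": {"example": entry["example"]}})
        # Type -> relationship between the root and the leaf node.
        links.append({"source": 0, "target": i, "type": entry["type"]})
    return {"nodes": nodes, "links": links}


# Hypothetical ledger rows; the real fields are listed in the ledger table.
ledger = [
    {"field": "sType", "type": "mortgage", "example": "real estate"},
    {"field": "Pvalue", "type": "valuation", "example": 1200000},
]
graph = build_asset_graph("Sample asset project", "LN-0001", ledger)
print(len(graph["nodes"]), len(graph["links"]))
```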
Step two: sample data preparation
Extracting a number of mortgage-type assets from the bad asset project ledger information, converting enumerated string fields into numeric enumeration codes, and removing fields that are irrelevant or do not take part in the statistical calculation, including the client name and client certificate information, to obtain the sample data;
loading the sample data with IBM SPSS Statistics software;
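The sample preparation in step two can be sketched as follows, assuming the ledger rows are available as Python dictionaries. The function name, the field `customer_name` and the sample values are hypothetical; the integer codes are assigned in order of first appearance, which is only one possible encoding.

```python
def prepare_samples(ledger_rows, drop_fields, enum_fields):
    """Drop non-statistical fields and map enumerated strings to integer codes."""
    codebooks = {field: {} for field in enum_fields}
    samples = []
    for row in ledger_rows:
        sample = {}
        for field, value in row.items():
            if field in drop_fields:
                continue  # e.g. client name, client certificate information
            if field in enum_fields:
                book = codebooks[field]
                # Assign the next free code the first time a string is seen.
                value = book.setdefault(value, len(book) + 1)
            sample[field] = value
        samples.append(sample)
    return samples, codebooks


rows = [
    {"customer_name": "A", "sType": "real estate", "Pvalue": 120.0},
    {"customer_name": "B", "sType": "vehicle", "Pvalue": 15.5},
    {"customer_name": "C", "sType": "real estate", "Pvalue": 98.0},
]
samples, codebooks = prepare_samples(rows, drop_fields={"customer_name"},
                                     enum_fields={"sType"})
print(samples)
print(codebooks)
```

Keeping the code book alongside the samples makes it possible to translate the numeric codes back to their labels when the results are displayed.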
Step three: analyzing the sample data by principal component analysis (PCA) and extracting the principal components;
obtaining the scree plot (steep-slope test), removing the principal components (factors) on the flat part beyond the slope, and retaining the principal components on the slope; as shown in fig. 3, the abscissa Component Number is the index of the principal component and the ordinate Eigenvalue is the eigenvalue (variance) of the corresponding principal component; in fig. 3, the turning point between the steep slope and the gentle slope lies between the 3rd and 4th principal components, and the variances of the 4th to 10th principal components are small and differ little, so extracting 3 principal components is appropriate.
Calculating the variance contribution rates of the retained principal components and keeping the principal component result, which is a table comprising the principal component name, the entity name and the entity Chinese name;
naming the principal components, reading the principal component analysis result and the related fields from the sample data, and generating a simplified knowledge graph that displays the principal components.
Preferably, naming the principal components, reading the principal component analysis result and the related fields from the sample data, and generating the simplified knowledge graph comprises the following steps:
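The patent performs step three in IBM SPSS Statistics. As a rough illustration of what that step computes, the sketch below derives the eigenvalues of the correlation matrix in plain Python and keeps the components whose eigenvalue exceeds 1. It is not the SPSS procedure: the Jacobi rotation solver, the function names and the four-column sample are assumptions made for the example.

```python
import math


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows`."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(p)]
    z = [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]


def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-18):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, descending."""
    a = [row[:] for row in matrix]
    p = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(p) for j in range(p) if i != j)
        if off < tol:
            break
        for i in range(p - 1):
            for j in range(i + 1, p):
                if abs(a[i][j]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (i, j) entry.
                theta = 0.5 * math.atan2(2 * a[i][j], a[i][i] - a[j][j])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(p):  # columns i and j: A <- A R
                    aki, akj = a[k][i], a[k][j]
                    a[k][i], a[k][j] = c * aki + s * akj, -s * aki + c * akj
                for k in range(p):  # rows i and j: A <- R^T A
                    aik, ajk = a[i][k], a[j][k]
                    a[i][k], a[j][k] = c * aik + s * ajk, -s * aik + c * ajk
    return sorted((a[i][i] for i in range(p)), reverse=True)


def retained_components(eigenvalues, minimum=1.0):
    """Keep components with eigenvalue > `minimum`; report contribution rates."""
    total = sum(eigenvalues)
    return [(i + 1, ev, ev / total)
            for i, ev in enumerate(eigenvalues) if ev > minimum]


# Hypothetical sample: columns 1-2 and columns 3-4 are perfectly correlated pairs.
sample = [[1, 2, 1, -1], [2, 4, -1, 1], [3, 6, -1, 1], [4, 8, 1, -1]]
eigenvalues = jacobi_eigenvalues(correlation_matrix(sample))
kept = retained_components(eigenvalues)
for index, eigenvalue, rate in kept:
    print(index, round(eigenvalue, 3), round(rate, 3))
```

Plotting `eigenvalues` against their index gives the scree plot; the eigenvalue-greater-than-1 rule used here matches the extraction setting described for the SPSS dialog.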
1) reading the principal component results line by line, adding all principal component names to a set S1 and all entity names to a set S2;
2) pruning the knowledge graph of the bad asset project generated in step B to form the principal component knowledge graph, comprising the following steps:
a. deleting all relations of the root node;
b. traversing the set S2, traversing all nodes in the knowledge graph aiming at each element En in the set S2, deleting all relationships of the current node and deleting the current node if the entity names of the current node and the element En are the same, otherwise, not doing any operation; after traversing, only the root node and the nodes in the set S2 exist in the knowledge graph;
c. traversing the set S1 and generating one node for each element, giving a node set M of n nodes, where n is the number of principal components;
for each node Mi in the node set M, using the index i of the node in the set S1 to find the element set Q at the same index in the set S2, traversing the element set Q, generating a node Lj for each element Qj in Q, and associating the node Lj with Mi;
d. generating a direct association between the m1 node and the root node using CQL (Cypher Query Language).
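Steps 1) and 2) c and d above can be sketched as follows. The sketch only collects S1 and S2 and generates parameterized Cypher statements as strings; it does not connect to a database. The node labels `AssetProject`, `PrincipalComponent` and `Attribute`, the relationship types `HAS_COMPONENT` and `INCLUDES`, the function names and the abbreviated rows are all assumptions made for illustration, and the sketch links every principal component node to the root, which is one reading of step d.

```python
def collect_sets(rows):
    """Read (principal component name, entity name, Chinese name) rows into S1 and S2.

    S1 is the ordered list of principal component names; S2[i] is the list of
    entity names that belong to S1[i].
    """
    s1, s2 = [], []
    for component, entity, _chinese_name in rows:
        if component not in s1:
            s1.append(component)
            s2.append([])
        s2[s1.index(component)].append(entity)
    return s1, s2


def simplified_graph_statements(loan_number, s1, s2):
    """Return (Cypher text, parameters) pairs for the simplified graph."""
    statements = []
    for i, component in enumerate(s1):
        # Node Mi for the principal component, attached to the root node.
        statements.append((
            "MATCH (r:AssetProject {loan_number: $loan}) "
            "MERGE (m:PrincipalComponent {name: $pc}) "
            "MERGE (r)-[:HAS_COMPONENT]->(m)",
            {"loan": loan_number, "pc": component},
        ))
        for field in s2[i]:
            # Node Lj for each element Qj, associated with Mi.
            statements.append((
                "MATCH (m:PrincipalComponent {name: $pc}) "
                "MERGE (l:Attribute {name: $field}) "
                "MERGE (m)-[:INCLUDES]->(l)",
                {"pc": component, "field": field},
            ))
    return statements


# Abbreviated, hypothetical rows; the Chinese-name column is left empty here.
rows = [
    ("asset ledger information", "Pinterest", ""),
    ("asset ledger information", "Ltotal", ""),
    ("mortgage condition information", "sType", ""),
    ("asset disposition information", "Package", ""),
]
s1, s2 = collect_sets(rows)
statements = simplified_graph_statements("LN-0001", s1, s2)
print(s1)
print(s2)
print(len(statements))
```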
A bank asset management method based on knowledge graph and big data analysis comprises the following steps:
(1) asset display: displaying the overall situation, the distribution and the classified statistics of bad asset projects, providing an overview of asset information and real-time refreshing of view data;
(2) asset classification: classifying and displaying assets by type, such as movable property, real property, equity and creditor's rights; displaying the attributes of the assets as a knowledge graph, wherein the attributes of the assets include movable property, real property, equity and creditor's rights, and using PCA to condense the many attributes in the knowledge graph into a few key indicators for display.
A computer device comprising a memory storing a computer program and a processor implementing the steps of a method of bank asset management based on knowledge-graph and big-data analysis when executing the computer program.
A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, carries out the steps of a method of bank asset management based on knowledge-graph and big-data analysis.
The invention has the beneficial effects that:
1. The attributes of bank bad assets are displayed as a knowledge graph, so that the graph forms a portrait of each asset; this solves the problem that bad assets are not displayed clearly and intuitively.
2. PCA is applied to the node list of the asset knowledge graph; reducing the dimensionality of the asset attributes solves the problems of excessive dimensionality, management difficulty and inaccurate analysis of bad asset information, and lets managers focus on the principal component indicators.
3. The system adopts knowledge graph playback, which implements playback and duplication of the disposal process and makes it much easier for managers to analyze and summarize asset disposal information.
Drawings
FIG. 1 is a schematic view of an asset knowledge graph;
FIG. 2(a) is a first schematic diagram illustrating the operation steps of analyzing sample data by PCA principal component analysis;
FIG. 2(b) is a second schematic diagram of the operation steps of analyzing sample data using PCA principal component analysis;
FIG. 2(c) is a third schematic diagram of the operation steps of analyzing sample data using PCA principal component analysis;
FIG. 3 is a schematic illustration of a scree plot;
FIG. 4 is a schematic diagram of a reduced knowledge graph showing principal components.
Detailed Description
The invention is further described below with reference to the drawings and examples, but is not limited thereto.
Example 1
A bank bad asset management and management system based on PCA and knowledge graph technology comprises an asset display module and an asset classification analysis module;
the asset display module is configured to display the overall situation, the distribution and the classified statistics of bad asset projects, providing an overview of asset information and real-time refreshing of view data. The overall situation of bad asset projects refers to the total number of households, the total amount, and the proportion of bad assets in the total amount, as viewed through the overview. The distribution of bad asset projects, i.e. their regional distribution, refers to the number of households with bad assets in each region or city, for example 1,000 in Jinan City and 2,000 in Qingdao City. The classified statistics of bad asset projects refer to the number of households by type of bad asset collateral, or the number of households by the five-level classification of bank bad assets; for example, 100 property mortgages, 200 vehicle mortgages and 1,000 stock mortgages. Loans are divided into five categories by degree of risk, namely normal, special mention, substandard, doubtful and loss, and the latter three categories are bad loans; for example, the total amount of doubtful assets is 1 million and the total amount of loss assets is 2 million.
The asset classification analysis module is configured to: classify and display assets by type, such as movable property, real property, equity and creditor's rights;
and display the attributes of the assets as a knowledge graph, wherein the attributes of the assets include movable property, real property, equity and creditor's rights, and use PCA to condense the many attributes in the knowledge graph into a few key indicators for display.
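The overview statistics described for the asset display module in this example can be sketched as a simple aggregation. The record layout, the function name and the sample figures are hypothetical; the three bad categories are written here with the labels substandard, doubtful and loss for the last three levels of the five-level classification.

```python
from collections import Counter

# Last three levels of the five-level loan classification.
BAD_LEVELS = {"substandard", "doubtful", "loss"}


def overview(assets):
    """Summarize bad assets: totals, share of the total amount, and breakdowns."""
    bad = [a for a in assets if a["level"] in BAD_LEVELS]
    total_amount = sum(a["amount"] for a in assets)
    bad_amount = sum(a["amount"] for a in bad)
    amount_by_level = Counter()
    for a in bad:
        amount_by_level[a["level"]] += a["amount"]
    return {
        "households": len(bad),
        "bad_amount": bad_amount,
        "bad_ratio": bad_amount / total_amount if total_amount else 0.0,
        "by_city": Counter(a["city"] for a in bad),
        "by_collateral": Counter(a["collateral"] for a in bad),
        "amount_by_level": dict(amount_by_level),
    }


assets = [
    {"city": "Jinan", "collateral": "property", "level": "doubtful", "amount": 100},
    {"city": "Qingdao", "collateral": "vehicle", "level": "loss", "amount": 200},
    {"city": "Qingdao", "collateral": "stock", "level": "normal", "amount": 700},
]
summary = overview(assets)
print(summary)
```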
Example 2
The bank bad asset management system based on PCA and knowledge graph technology according to Example 1, differing in that:
the bank bad asset management system further comprises an asset disposal module. The asset disposal module is configured to implement playback and duplication of the knowledge graph, specifically: during asset disposal, stages such as decision analysis, analysis results and disposal results are displayed through the knowledge graph; timestamp and step attributes are added to the nodes and relationships of the knowledge graph as it is created and modified; the knowledge graph is refreshed by time period and step; staged duplication of the knowledge graph is achieved by incrementing the time period or step, and playback is achieved by decrementing the time period or step.
Example 3
The bank bad asset management system based on PCA and knowledge graph technology according to Example 1, differing in that:
displaying the attributes of the assets as a knowledge graph and using PCA to condense the many attributes in the knowledge graph into a few key indicators for display is implemented as follows:
Step one: knowledge graph initial data display
A. Reading the bad asset project ledger information, which comprises an attribute field name, a type and an example, as shown in Table 1, and displaying it as a list;
TABLE 1
[Table 1 is provided as an image (BDA0003080086610000071) in the original publication.]
B. Automatically generating the knowledge graph of the bad asset project, specifically: storing the attribute field names in the bad asset project ledger information as nodes of the knowledge graph, the examples as attributes of those nodes, and the types as relationships between the nodes, and storing the knowledge graph in a neo4j database; this is implemented with existing techniques.
The knowledge graph of the bad asset project is displayed using d3.js + vue.js. The knowledge graph comprises a root node and leaf nodes: the root node is the name of the asset project and carries a loan number attribute that serves as its unique identifier, and the leaf nodes are displayed according to their direct association with the attributes of the bad asset project.
Step two: sample data preparation
Extracting a number of mortgage-type assets from the bad asset project ledger information, converting enumerated string fields into numeric enumeration codes, and removing fields that are irrelevant or do not take part in the statistical calculation, including the client name and client certificate information, to obtain the sample data;
loading the sample data with IBM SPSS Statistics software; an example of the loaded sample data is shown in Table 2:
TABLE 2
[Table 2 is provided as an image (BDA0003080086610000081) in the original publication.]
Step three: analyzing the sample data by using a Principal Component Analysis (PCA) method, and extracting principal components;
obtaining the scree plot (steep-slope test), removing the principal components (factors) on the flat part beyond the slope, and retaining the principal components on the slope; as shown in fig. 3, the abscissa Component Number is the index of the principal component and the ordinate Eigenvalue is the eigenvalue (variance) of the corresponding principal component; in fig. 3, the turning point between the steep slope and the gentle slope lies between the 3rd and 4th principal components, and the variances of the 4th to 10th principal components are small and differ little, so extracting 3 principal components is appropriate.
Calculating the variance contribution rates of the retained principal components and keeping the principal component result, which is a table comprising the principal component name, the entity name and the entity Chinese name;
naming the principal components, reading the principal component analysis result and the related fields from the sample data, and generating a simplified knowledge graph that displays the principal components.
The data is input into the IBM SPSS tool, which outputs the results automatically;
in the [Factor Analysis] dialog box, since all fields have been converted to numbers, all attribute fields are selected for factor analysis, as shown in fig. 2(a).
In the [Factor Analysis: Extraction] dialog box: select 'Principal components' under 'Method'; select 'Correlation matrix' in the 'Analyze' group (the principal components are solved from the correlation coefficients); select 'Unrotated factor solution' in the 'Display' group (the principal component loading matrix); select 'Scree plot' to display the scree plot, i.e. the contribution rate plot; in the 'Extract' group, select 'Eigenvalues over' and enter 1 (principal components are extracted on the principle that the eigenvalue is greater than 1); as shown in fig. 2(b).
In the [Factor Analysis: Factor Scores] dialog box, select 'Display factor score coefficient matrix' (the coefficient matrix of the principal component expressions); as shown in fig. 2(c).
Click the [Options] button in the [Factor Analysis] dialog box to open the [Factor Analysis: Options] dialog box.
In the 'Missing Values' group, select 'Exclude cases listwise'.
In the 'Coefficient Display Format' group, 'Sorted by size' sorts by principal component (factor) loading; 'Suppress absolute values less than' defaults to 0.1, meaning that only loadings greater than 0.1 in absolute value are listed.
This completes the selection of analysis options for the sample data; the analysis is then run, and the results are interpreted as follows:
Examine the scree plot (steep-slope test) and remove the principal components (factors) on the flat part beyond the slope; in fig. 3 the curve flattens out from the 7th factor onward, so 7 factors are retained.
The variance contribution table is shown in Table 3. Taking the factors with eigenvalues greater than 1 yields 7 components, whose variance contribution rates are 12.131%, 5.718%, 5.315%, 3.446%, 2.489%, 1.479% and 1.155%, respectively.
TABLE 3
Total Variance Explained
[Table 3 is provided as an image (BDA0003080086610000101) in the original publication.]
Extraction Method: Principal Component Analysis.
The principal component (factor loading) matrix is shown in Table 4:
TABLE 4
Component Matrix(a)
[Table 4 is provided as an image (BDA0003080086610000111) in the original publication.]
Extraction Method: Principal Component Analysis.
a. 7 components extracted.
From the table above it can be seen that:
the first principal component is highly correlated with Pinterest, Pdeal, Qinterest, capital_out, LLEft, Pasum, Ltotal, extract_sum, Ctate and Lpur;
the second principal component is highly correlated with spo_economy, estamate and sType;
the third principal component is highly correlated with Package, dealType, Pvalue and PayType.
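One common way to read such a loading matrix is to assign each field to the component on which its absolute loading is largest. The sketch below illustrates that reading only; the text does not state the exact rule used, and the function name `group_by_dominant_component`, the 0.5 threshold and all loading values are hypothetical.

```python
from typing import Dict, List


def group_by_dominant_component(
    loadings: Dict[str, List[float]], min_abs: float = 0.5
) -> Dict[int, List[str]]:
    """Group fields under the component (1-based) with the largest absolute loading.

    Fields whose largest absolute loading is below min_abs are left ungrouped.
    """
    groups: Dict[int, List[str]] = {}
    for field, row in loadings.items():
        best = max(range(len(row)), key=lambda k: abs(row[k]))
        if abs(row[best]) >= min_abs:
            groups.setdefault(best + 1, []).append(field)
    return groups


# Field names are taken from the text; the loading values are made up for illustration.
toy_loadings = {
    "Pinterest": [0.91, 0.10, 0.05],
    "sType": [0.12, 0.84, -0.20],
    "Package": [0.08, -0.15, -0.77],
    "other": [0.20, 0.30, 0.10],
}
print(group_by_dominant_component(toy_loadings))
```

The sign of a loading only indicates the direction of the correlation, so the comparison is made on absolute values.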
Because the main purpose of the analysis is to reduce the dimensionality of the data so that managers can focus on the core indicator fields, only the first 3 of the 7 principal components obtained through PCA are selected here, since their cumulative contribution rate already reaches 70% (as shown in table 5).
TABLE 5
Total Variance Explained
Figure BDA0003080086610000121
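The selection by cumulative contribution rate can be sketched as follows. This is a minimal illustration: the function name `components_needed` and the ratios are hypothetical, and only the 70% target is taken from the text.

```python
from itertools import accumulate
from typing import List


def components_needed(ratios: List[float], target: float = 0.70) -> int:
    """Return how many leading components are needed for the cumulative
    variance contribution to reach the target."""
    for count, cumulative in enumerate(accumulate(ratios), start=1):
        if cumulative >= target:
            return count
    return len(ratios)  # target never reached: keep every component


# Hypothetical contribution ratios, already sorted from largest to smallest.
toy_ratios = [0.40, 0.20, 0.15, 0.10]
print(components_needed(toy_ratios))
```

The ratios must be sorted in descending order first, otherwise the count returned is not the smallest possible one.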
The principal components are named as follows:
principal component 1 is named "asset ledger information";
principal component 2 is named "mortgage condition information";
principal component 3 is named "asset disposition information".
The principal component results obtained from the SPSS analysis are recorded in tabular form, as shown in table 6:
TABLE 6
Figure BDA0003080086610000122
Figure BDA0003080086610000131
After the principal component results are named, the principal component analysis results and the related fields are read from the sample data to obtain the related information and relationships, and a simplified knowledge graph displaying the principal components is generated, comprising the following steps:
1) reading the principal component results line by line, adding all principal component names (column 2 of table 6) to the set S1, and adding all entity names (column 3 of table 6) to the set S2; for example, after all data are read, S1 = [asset ledger information, mortgage condition information, asset disposition information] and S2 = [[Pinterest, Pcap, Qinterest, captal_out, LLEft, Pasum, Ltotal, contact_sum, Ctate, Lpur], [spo_economi, estimate, sType], [Package, DealType, Pvalue, PayType]].
2) pruning the knowledge graph of the bad asset project generated in step B to form the principal component knowledge graph, comprising the following steps:
a. deleting all relations of the root node;
b. traversing the set S2 and, for each element En in S2, traversing all nodes in the knowledge graph: if the entity name of the current node is the same as En, all relationships of the current node are deleted; a node whose entity name matches no element of S2 is deleted; after the traversal, only the root node and the nodes in the set S2 remain in the knowledge graph;
in the graph view shown in fig. 1, pruning the graph into the principal component knowledge graph means traversing all nodes except the master node; for the current node Node_k, the set S2 is traversed, and if the entity name of Node_k exists in S2, all relationships of the node are deleted; if, after S2 has been traversed, the entity name of Node_k does not exist in S2, the node is deleted. At this point, only the root node and the nodes whose entity names appear in S2 remain in the view, and all non-principal nodes have been deleted.
c. traversing the set S1 and generating a node set M for the elements of S1, wherein the node set M comprises n nodes, and n is the number of corresponding principal components;
the generation method is as follows: the entities, the attributes of the entities and the relationships between the entities are written as a CQL statement of neo4j, for example create (Mn:TargetNode {name: 'asset ledger information'}); a connection to the neo4j database is made through the Java SDN driver, and the CQL statement is executed and inserted into the neo4j database, thereby generating the node Mi.
For each node Mi in the node set M, finding the element set Q at the same index in the set S2 according to the sequence number i of the node in the set S1, traversing the element set Q, generating a node Lj for each element Qj in the element set Q, and associating the node Lj with Mi;
the node Lj for each element Qj in the element set Q and the association between the node Lj and Mi are generated as follows: the relationships between the entities are written as a CQL statement of neo4j, for example create (Mn:TargetNode {name: 'asset ledger information'}); a connection to the neo4j database is made through the Java SDN driver, and the CQL statement is executed and inserted into the neo4j database, thereby generating the relationship.
d. generating the direct association between the m1 node and the master node using CQL.
Through the above steps, the simplified knowledge graph displaying the principal components is generated; the result is shown in fig. 4.
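Steps 1) and 2) above can be sketched in Python on an in-memory graph. This is an illustration under several assumptions, not the patented implementation: the function names, the relationship types `HAS_FIELD` and `HAS_COMPONENT`, and the sample rows are hypothetical; the field nodes kept by pruning are assumed to be reused rather than created again; and step d is read as linking every component node to the root. Only the `TargetNode` label comes from the text. The statements are returned as strings with parameters and are not executed against a database here.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def build_sets(rows: List[Tuple[str, str]]) -> Tuple[List[str], List[List[str]]]:
    """Step 1): S1 holds component names in order, S2 the entity names per component."""
    s1: List[str] = []
    s2: List[List[str]] = []
    for component, entity in rows:
        if component not in s1:
            s1.append(component)
            s2.append([])
        s2[s1.index(component)].append(entity)
    return s1, s2


def prune(nodes: Set[str], edges: Set[Edge], root: str,
          s2: List[List[str]]) -> Tuple[Set[str], Set[Edge]]:
    """Steps a-b: cut the relationships of the root and of principal nodes,
    then delete every non-principal node."""
    principal = {name for group in s2 for name in group}
    protected = principal | {root}
    edges = {(a, b) for a, b in edges if a not in protected and b not in protected}
    kept = {n for n in nodes if n in protected}
    edges = {(a, b) for a, b in edges if a in kept and b in kept}
    return kept, edges


def regroup_statements(root: str, s1: List[str],
                       s2: List[List[str]]) -> List[Tuple[str, Dict[str, str]]]:
    """Steps c-d: Cypher statements (with parameters) for component nodes and links."""
    statements: List[Tuple[str, Dict[str, str]]] = []
    for component, fields in zip(s1, s2):
        statements.append(("CREATE (m:TargetNode {name: $component})",
                           {"component": component}))
        for field in fields:
            statements.append((
                "MATCH (m:TargetNode {name: $component}), (l {name: $field}) "
                "CREATE (m)-[:HAS_FIELD]->(l)",
                {"component": component, "field": field}))
        statements.append((
            "MATCH (r {name: $root}), (m:TargetNode {name: $component}) "
            "CREATE (r)-[:HAS_COMPONENT]->(m)",
            {"root": root, "component": component}))
    return statements


# Hypothetical rows shaped like columns 2 and 3 of table 6.
rows = [("asset ledger information", "Pinterest"),
        ("asset ledger information", "Ltotal"),
        ("mortgage condition information", "sType")]
s1, s2 = build_sets(rows)
nodes, edges = prune({"root", "Pinterest", "Ltotal", "sType", "remark"},
                     {("root", "Pinterest"), ("Pinterest", "remark"), ("root", "remark")},
                     "root", s2)
statements = regroup_statements("root", s1, s2)
for query, params in statements:
    print(query, params)
```

Passing the names as parameters instead of concatenating them into the statement text avoids quoting problems when a component or field name contains an apostrophe.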
Example 4
A bank asset management method based on knowledge graph and big data analysis, comprising the following steps:
(1) Asset display: displaying the overall situation of the bad asset projects, the distribution of the bad asset projects and the classification statistics of the bad asset projects, and providing an overview of the asset information and refreshing of real-time view data;
(2) Asset classification: displaying by asset type, such as movable property, real property, equity and creditor's rights; the attributes of the assets, which comprise movable property, real property, equity and creditor's rights, are displayed in the form of a knowledge graph, and multiple attributes in the knowledge graph are extracted by means of PCA into several key indicators for display.
Example 5
A computer device comprising a memory storing a computer program and a processor, wherein the processor, when executing the computer program, implements the steps of the bank asset management method based on knowledge graph and big data analysis described in Example 4.
Example 6
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the bank asset management method based on knowledge graph and big data analysis described in Example 4.

Claims (6)

1. A bank bad asset management and management system based on PCA and knowledge graph technology is characterized by comprising an asset display module and an asset classification analysis module;
the asset display module is configured to: display the overall situation of the bad asset projects, the distribution of the bad asset projects and the classification statistics of the bad asset projects, and provide an overview of the asset information and refreshing of real-time view data; the overall situation of the bad asset projects refers to the total number of accounts, the total amount and the proportion of the bad assets; the distribution of the bad asset projects, namely their regional distribution, refers to the number of bad asset accounts distributed across regions and cities; the classification statistics of the bad asset projects refer to the number of accounts counted by bad asset mortgage type, or the number of accounts counted by the five-level classification of bank bad assets;
the asset classification analysis module is configured to: display by asset type, such as movable property, real property, equity and creditor's rights;
the attributes of the assets are displayed in the form of a knowledge graph, the attributes of the assets comprising movable property, real property, equity and creditor's rights, and multiple attributes in the knowledge graph are extracted by means of PCA into several key indicators for display;
wherein displaying the attributes of the assets in the form of a knowledge graph and extracting multiple attributes in the knowledge graph by means of PCA into several key indicators for display is specifically implemented as follows:
step one: knowledge graph initial data display
A. reading the bad asset project ledger information, which comprises attribute field names, types and examples, and displaying the bad asset project ledger information as a list;
B. automatically generating a knowledge graph of the bad asset project, specifically comprising: storing the attribute field names in the bad asset project ledger information as nodes of the knowledge graph, storing the examples in the bad asset project ledger information as attributes of the nodes of the knowledge graph, storing the types in the bad asset project ledger information as relationships among the nodes of the knowledge graph, and storing the knowledge graph in a neo4j database; and displaying the knowledge graph of the bad asset project using d3.js + vue.js;
step two: sample data preparation
extracting a plurality of balance type assets from the bad asset project ledger information, converting enumerated string fields into enumerated numeric values, and removing fields that are irrelevant or do not participate in the statistical calculation, to obtain sample data;
loading the sample data using IBM SPSS Statistics software;
step three: analyzing the sample data using principal component analysis (PCA) and extracting principal components;
obtaining a scree plot, removing the principal components on the flat part of the curve in the scree plot, and retaining the principal components on the sloped part;
calculating the variance contribution rates of the retained principal components on the sloped part, and retaining the principal component results by variance contribution rate, wherein the principal component results are a table comprising principal component names, entity names and entity Chinese names;
naming the principal component results, reading the principal component analysis results and the related fields from the sample data, and generating a simplified knowledge graph displaying the principal components.
2. The bank bad asset management and management system based on PCA and knowledge graph technology as claimed in claim 1, further comprising an asset disposal module; the asset disposal module is configured to implement playback and review of the knowledge graph, specifically: in the asset disposal process, the decision analysis, analysis results, disposal results and other processes are displayed through the knowledge graph; timestamps and step attributes are added to the knowledge graph nodes and relationships when the knowledge graph is created and modified; the knowledge graph is refreshed by time period and step; staged review of the knowledge graph is achieved by incrementing the time period or step, and playback is achieved by decrementing the time period or step.
3. The bank bad asset management and management system based on PCA and knowledge graph technology as claimed in claim 1, wherein naming the principal component results, reading the principal component analysis results and the related fields from the sample data, and generating the simplified knowledge graph displaying the principal components comprises the following steps:
1) reading the principal component results line by line, and adding all principal component names to a set S1; adding all entity names to the collection S2;
2) pruning the knowledge graph of the bad asset project generated in step B to form the principal component knowledge graph, comprising the following steps:
a. deleting all relations of the root node;
b. traversing the set S2 and, for each element En in S2, traversing all nodes in the knowledge graph: if the entity name of the current node is the same as En, all relationships of the current node are deleted; a node whose entity name matches no element of S2 is deleted; after the traversal, only the root node and the nodes in the set S2 remain in the knowledge graph;
c. traversing the set S1, and generating a node set M for each element in the set S1, wherein the node set M comprises n nodes, and n refers to the number of corresponding principal components;
for each node Mi in the node set M, finding an element set Q corresponding to the subscript of the same sequence number in the set S2 according to the sequence number i of the node in the set S1, traversing the element set Q, generating a node Lj for each element Qj in the element set Q, and associating the node Lj with the Mi;
d. and generating direct association relation between the m1 node and the main node.
4. A bank asset management method based on knowledge graph and big data analysis is characterized by comprising the following steps:
(1) Asset display: displaying the overall situation of the bad asset projects, the distribution of the bad asset projects and the classification statistics of the bad asset projects, and providing an overview of the asset information and refreshing of real-time view data;
(2) Asset classification: displaying by asset type, such as movable property, real property, equity and creditor's rights; the attributes of the assets, which comprise movable property, real property, equity and creditor's rights, are displayed in the form of a knowledge graph, and multiple attributes in the knowledge graph are extracted by means of PCA into several key indicators for display;
wherein displaying the attributes of the assets in the form of a knowledge graph and extracting multiple attributes in the knowledge graph by means of PCA into several key indicators for display is specifically implemented as follows:
step one: knowledge graph initial data display
A. reading the bad asset project ledger information, which comprises attribute field names, types and examples, and displaying the bad asset project ledger information as a list;
B. automatically generating a knowledge graph of the bad asset project, specifically comprising: storing the attribute field names in the bad asset project ledger information as nodes of the knowledge graph, storing the examples in the bad asset project ledger information as attributes of the nodes of the knowledge graph, storing the types in the bad asset project ledger information as relationships among the nodes of the knowledge graph, and storing the knowledge graph in a neo4j database; and displaying the knowledge graph of the bad asset project using d3.js + vue.js;
step two: sample data preparation
extracting a plurality of balance type assets from the bad asset project ledger information, converting enumerated string fields into enumerated numeric values, and removing fields that are irrelevant or do not participate in the statistical calculation, to obtain sample data;
loading the sample data using IBM SPSS Statistics software;
step three: analyzing the sample data using principal component analysis (PCA) and extracting principal components;
obtaining a scree plot, removing the principal components on the flat part of the curve in the scree plot, and retaining the principal components on the sloped part;
calculating the variance contribution rates of the retained principal components on the sloped part, and retaining the principal component results by variance contribution rate, wherein the principal component results are a table comprising principal component names, entity names and entity Chinese names;
naming the principal component results, reading the principal component analysis results and the related fields from the sample data, and generating a simplified knowledge graph displaying the principal components.
5. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor when executing the computer program performs the steps of the method for bank asset management based on knowledge-graph and big-data analysis of claim 4.
6. A computer-readable storage medium, having stored thereon a computer program, wherein the computer program, when executed by a processor, performs the steps of the method for bank asset management based on knowledge-graph and big-data analysis of claim 4.
CN202110564109.8A 2021-05-24 2021-05-24 Bank bad asset management and management system, method, equipment and storage medium based on PCA and knowledge graph technology Active CN113486124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110564109.8A CN113486124B (en) 2021-05-24 2021-05-24 Bank bad asset management and management system, method, equipment and storage medium based on PCA and knowledge graph technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110564109.8A CN113486124B (en) 2021-05-24 2021-05-24 Bank bad asset management and management system, method, equipment and storage medium based on PCA and knowledge graph technology

Publications (2)

Publication Number Publication Date
CN113486124A CN113486124A (en) 2021-10-08
CN113486124B true CN113486124B (en) 2022-02-11

Family

ID=77932988

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110564109.8A Active CN113486124B (en) 2021-05-24 2021-05-24 Bank bad asset management and management system, method, equipment and storage medium based on PCA and knowledge graph technology

Country Status (1)

Country Link
CN (1) CN113486124B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108932340A (en) * 2018-07-13 2018-12-04 华融融通(北京)科技有限公司 The construction method of financial knowledge mapping under a kind of non-performing asset operation field
CN109584046A (en) * 2018-11-29 2019-04-05 广州广永投资管理有限公司 A kind of pair of non-performing asset information data carries out depth excavation and analysis method and system
CN111061859A (en) * 2019-12-02 2020-04-24 深圳追一科技有限公司 Data processing method and device based on knowledge graph and computer equipment
CN112069327A (en) * 2020-09-04 2020-12-11 西南大学 Knowledge graph construction method and system for teaching resources of online education classroom
WO2021041241A1 (en) * 2019-08-26 2021-03-04 Healthpointe Solutions, Inc. System and method for defining a user experience of medical data systems through a knowledge graph

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019841A (en) * 2018-07-24 2019-07-16 南京涌亿思信息技术有限公司 Construct data analysing method, the apparatus and system of debtor's knowledge mapping

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
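"A PCA approach for fast retrieval of structural patterns in attributed graphs"; Lei Xu et al.; IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics); 20011031; pp. 812-817 *
"Construction and application of an asset association model based on knowledge graphs"; Research Group of Industrial and Commercial Bank of China Jiangsu Branch; 《金融纵横》; 20190525; pp. 15-20 *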

Also Published As

Publication number Publication date
CN113486124A (en) 2021-10-08


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant