CN113989018A

CN113989018A - Risk management method, risk management device, electronic equipment and medium

Info

Publication number: CN113989018A
Application number: CN202111244065.7A
Authority: CN
Inventors: 张珺珺; 苏宗国; 陈道斌; 金阳; 孟岩
Original assignee: Industrial and Commercial Bank of China Ltd ICBC
Current assignee: Industrial and Commercial Bank of China Ltd ICBC
Priority date: 2021-10-25
Filing date: 2021-10-25
Publication date: 2022-01-28

Abstract

The disclosure provides a risk management method, a risk management device, an electronic device and a medium, which can be used in the technical field of big data. The risk management method comprises: acquiring enterprise data, wherein the enterprise data comprises parties and event relationships between the parties, the event relationships comprise m categories of data, each category of data comprises n pieces of event information, and m and n are each an integer greater than or equal to 1; constructing a knowledge graph according to the enterprise data, wherein the parties and the event information are nodes of the knowledge graph, the event relationships are edges of the knowledge graph, and the event information is used to explain the corresponding edges; calculating, based on the knowledge graph constructed from each category of data, a reduction weight of the n event-information edges between two parties; calculating, based on the reduction weights, an aggregation weight ω_vu of the m category-data edges between the two parties; and performing risk management on the enterprise according to the aggregation weight.

Description

Risk management method, device, electronic equipment and medium

Technical Field

The present disclosure relates to the technical field of big data, and more particularly, to a risk management method, apparatus, electronic device and medium.

Background

At present, China's small and micro enterprise financing market has huge potential. According to data from the relevant departments, China has more than 100 million market entities, of which more than 70 million are individually owned businesses, leaving broad room for business growth. As an important pillar of economic development and social stability, small and micro enterprises play an indispensable role in promoting the orderly flow of talent, maintaining market vitality, and driving technological innovation.

Compared with large and medium-sized enterprises, small and micro enterprises remain in a weak position in market competition, and financing problems make it difficult for them to keep their operations stable and continuous. Bank loans are an important means of corporate financing. To relieve the operating pressure on small and micro enterprises and meet their strong financing needs, banks have gradually expanded their small and micro loan business, and a variety of credit products have emerged. The accompanying problem is that the lower repayment ability of small and micro enterprises may lead to a high proportion of overdue loans.

The traditional financial risk control system relies mainly on expert rules and the indicators defined in the Basel Accords. Small and micro enterprises, however, are at a natural disadvantage in providing information about themselves. Their "opaque" and "internalized" asymmetric data make it difficult for banks to gauge the real credit risk of small and micro enterprise customers, so managing credit products for small and micro enterprises is much harder than for large enterprises. An undifferentiated risk control model cannot reasonably predict the risk of most small and micro enterprises, and a new data analysis and processing method is needed to solve the problems of the small and micro credit business.

Summary of the Invention

In view of this, the present disclosure provides a knowledge graph-based risk management method, apparatus, electronic device and computer-readable storage medium that predict risk well and are convenient to use.

One aspect of the present disclosure provides a knowledge graph-based risk management method, including: acquiring enterprise data, wherein the enterprise data includes parties and event relationships between the parties, the event relationships include m categories of data, each category of data includes n pieces of event information, m is an integer greater than or equal to 1, and n is an integer greater than or equal to 1; constructing a knowledge graph according to the enterprise data, wherein the parties and the event information are nodes of the knowledge graph, the event relationships are edges of the knowledge graph, and the event information is used to explain the corresponding edges; calculating, based on the knowledge graph constructed from each category of data, a reduction weight of the n event-information edges between two of the parties; calculating, according to the reduction weights, an aggregation weight ω_vu of the m category-data edges between the two parties; and performing risk management on the enterprise according to the aggregation weight.

According to the knowledge graph-based risk management method of the embodiments of the present disclosure, risk management can be performed on an enterprise by building a knowledge graph and calculating the aggregation weights of its edges. For example, by using label propagation over the knowledge graph, the impact that the risk of one customer during the loan duration has on its related customers, such as upstream and downstream customers, can be taken into account, thereby improving the ability to provide inclusive financial services to small and micro enterprises. The present disclosure uses event information as nodes of the knowledge graph to describe the corresponding edges, so that the edges do not need to carry attributes. This reduces the redundancy of the edges and makes the knowledge graph respond faster when used.

In some embodiments, the reduction weight is calculated by a formula whose symbols are defined as follows: v denotes the head node of the two parties, u denotes the tail node of the two parties, E(v, u) denotes the set of the n event-information edges whose head node is v and whose tail node is u, E(v) denotes the set of directed edges whose head node is v, and t denotes the current time. The remaining terms of the formula denote, respectively, the initial weight of edge l′ at the time of its corresponding event, the time weight of edge l′, the initial weight of edge l at the time of its corresponding event, and the time weight of edge l.

The aggregation weight is calculated by a formula in which R(v, u) denotes the set of the m category-data edges whose head node is v and whose tail node is u, and C_r denotes a constant coefficient corresponding to each category of data.
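The formulas themselves are not reproduced in this text, so the following is only a minimal sketch of one reading that is consistent with the symbol definitions. It assumes that the reduction weight is the sum of initial weight times time weight over the edges in E(v, u), normalized by the same sum over all edges in E(v) within one category, and that the aggregation weight is the sum over categories of C_r times the per-category reduction weight. The exponential time decay, the `half_life_days` parameter, and every name in the code are illustrative assumptions, not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventEdge:
    head: str           # v, the party the edge starts from
    tail: str           # u, the party the edge points to
    category: str       # r, e.g. "fund_flow" or "guarantee" (hypothetical labels)
    init_weight: float  # initial weight at the time of the event
    event_time: float   # event time, in days on an arbitrary clock


def time_weight(edge, now, half_life_days=90.0):
    # Assumed form: the weight halves every half_life_days after the event.
    age = max(0.0, now - edge.event_time)
    return 0.5 ** (age / half_life_days)


def reduction_weight(edges, v, u, category, now):
    # Assumed form: share of v's time-weighted outgoing weight that goes to u.
    out_v = [e for e in edges if e.head == v and e.category == category]
    denom = sum(e.init_weight * time_weight(e, now) for e in out_v)
    if denom == 0.0:
        return 0.0
    numer = sum(e.init_weight * time_weight(e, now) for e in out_v if e.tail == u)
    return numer / denom


def aggregation_weight(edges, v, u, coeffs, now):
    # Assumed form: sum over categories r of C_r times the reduction weight.
    return sum(c * reduction_weight(edges, v, u, r, now) for r, c in coeffs.items())
```

Under these assumptions each per-category reduction weight lies between 0 and 1, so the coefficients C_r alone control how much each category contributes to ω_vu.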

In some embodiments, the m categories of data include at least one of capital flow data, investment data, guarantee data, and person-enterprise association data.

In some embodiments, the n pieces of event information include event information under the same category of data for n time periods.

In some embodiments, before the knowledge graph is constructed according to the enterprise data, the method further includes cleaning the enterprise data, wherein cleaning the enterprise data includes one of: deduplicating the data, filling in missing features in the data, and handling abnormal features in the data.
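As a minimal sketch of the cleaning step, the function below applies all three operations in sequence, although the text requires only one of them. The record layout, the default values, and the clamping rule for abnormal values are assumptions made for illustration.

```python
def clean_records(records, defaults, bounds):
    """Deduplicate records, fill missing features, and clamp abnormal values.

    records:  list of dicts with hashable values (hypothetical layout)
    defaults: {field: value used when the field is missing or None}
    bounds:   {field: (low, high)} range treated as normal for the field
    """
    seen = set()
    cleaned = []
    for rec in records:
        key = tuple(sorted(rec.items()))  # identical records share one key
        if key in seen:
            continue                      # data deduplication
        seen.add(key)
        row = dict(rec)
        for field, default in defaults.items():
            if row.get(field) is None:
                row[field] = default      # feature filling
        for field, (low, high) in bounds.items():
            if row.get(field) is not None:
                # abnormal feature handling: clamp into the normal range
                row[field] = min(max(row[field], low), high)
        cleaned.append(row)
    return cleaned
```

Clamping is only one possible treatment of abnormal features. Dropping or flagging such records would fit the text equally well.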

In some embodiments, constructing the knowledge graph according to the enterprise data includes: constructing a schema layer according to the parties and the m categories of data, wherein the schema layer includes nodes established from the parties and from the m categories in the category data, and connecting edges between nodes established from the events in each category of data; and importing the data of each category into the corresponding schema layer.

In some embodiments, the category data is structured data, and importing the data of the category data into the corresponding schema layer includes: converting the category data into Resource Description Framework (RDF) data; and importing the RDF data into the corresponding schema layer.
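The conversion of a structured row into RDF data can be sketched as follows. The sketch emits N-Triples lines and models the event as a node placed between the two parties, so the edges themselves carry no attributes. The `http://example.org/kg/` namespace, the predicate names, and the row fields are hypothetical, and literal values are assumed to contain no characters that need escaping.

```python
BASE = "http://example.org/kg/"  # hypothetical namespace, not from the disclosure


def row_to_triples(row, category):
    """Convert one structured row into N-Triples lines.

    The event becomes its own node: party -> event -> party, with the
    event details attached to the event node rather than to an edge.
    """
    event = f"<{BASE}event/{category}/{row['id']}>"
    source = f"<{BASE}party/{row['from']}>"
    target = f"<{BASE}party/{row['to']}>"
    return [
        f"{source} <{BASE}{category}> {event} .",
        f"{event} <{BASE}target> {target} .",
        f'{event} <{BASE}amount> "{row["amount"]}" .',
        f'{event} <{BASE}time> "{row["time"]}" .',
    ]
```

The resulting lines could then be loaded by whatever triple store or graph database holds the schema layer. The text does not name one.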

In some embodiments, performing risk management on the enterprise according to the aggregation weight includes: performing label propagation on the nodes of the knowledge graph according to the aggregation weight; dividing the nodes into communities according to the result of the label propagation to obtain the community sizes; calculating the degree centrality of the nodes of the knowledge graph according to the community sizes; and performing risk prediction using the degree centrality as an input of a risk prediction model.

In some embodiments, performing risk management on the enterprise according to the aggregation weight includes: performing label propagation on the nodes of the knowledge graph according to the aggregation weight; and performing risk prediction according to the result of the label propagation.
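The propagation, community division, and centrality steps can be sketched as below. This is a generic weighted label propagation, not necessarily the variant intended by the disclosure. It assumes that edges are treated as undirected during propagation, that ties are broken by the smallest label, and that degree centrality is the number of neighbours inside a node's own community divided by the community size minus one. All names are illustrative.

```python
from collections import Counter, defaultdict


def propagate_labels(weights, max_iter=20):
    """Weighted label propagation over {(v, u): aggregation weight}."""
    nbrs = defaultdict(dict)
    for (v, u), w in weights.items():
        nbrs[v][u] = nbrs[v].get(u, 0.0) + w
        nbrs[u][v] = nbrs[u].get(v, 0.0) + w
    labels = {n: n for n in nbrs}  # every node starts in its own community
    for _ in range(max_iter):
        changed = False
        for n in sorted(nbrs):     # fixed visiting order keeps the run deterministic
            score = defaultdict(float)
            for m, w in nbrs[n].items():
                score[labels[m]] += w
            # adopt the label with the largest total weight, smallest label on ties
            best = min(score, key=lambda lab: (-score[lab], lab))
            if best != labels[n]:
                labels[n], changed = best, True
        if not changed:
            break
    return labels


def community_degree_centrality(weights, labels):
    """Neighbours inside the node's community, normalized by community size - 1."""
    size = Counter(labels.values())  # community sizes
    nbrs = defaultdict(set)
    for v, u in weights:
        if v != u:
            nbrs[v].add(u)
            nbrs[u].add(v)
    return {
        n: (sum(labels[m] == lab for m in nbrs[n]) / (size[lab] - 1)
            if size[lab] > 1 else 0.0)
        for n, lab in labels.items()
    }
```

The centrality values returned here would be one feature column fed to the risk prediction model. The model itself is outside the scope of this sketch.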

In some embodiments, the knowledge graph-based risk management method further includes visualizing the knowledge graph, wherein visualizing the knowledge graph includes at least one of: node retrieval based on the knowledge graph, subgraph walking based on the knowledge graph, path exploration based on the knowledge graph, and self-loop exploration based on the knowledge graph.

In some embodiments, the node retrieval includes: in response to an input node name, displaying the node related to the node name and the associated information of that node; the subgraph walking includes: in response to a manual click on an event relationship, displaying the node corresponding to the clicked event relationship and all edges emitted from that node; the path exploration includes: obtaining and displaying the path relationship between two nodes; and the self-loop exploration includes: obtaining and displaying the nodes whose path relationship forms a closed loop.
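Path exploration and self-loop exploration reduce to standard graph searches. The sketch below covers only those two operations, assuming the graph is available as a plain adjacency mapping. The function names and the choice of breadth-first search are illustrative.

```python
from collections import deque


def find_path(adj, src, dst):
    """Return a shortest directed path from src to dst, or None if there is none."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk the predecessor links back to src
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in adj.get(node, ()):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


def nodes_on_cycles(adj):
    """Return the nodes that can reach themselves along directed edges."""
    result = set()
    for start in adj:
        for nxt in adj[start]:
            # a cycle through start exists if some successor leads back to it
            if find_path(adj, nxt, start) is not None:
                result.add(start)
                break
    return result
```

A closed loop found this way, for example A guaranteeing B, B guaranteeing C, and C guaranteeing A, is the kind of structure the self-loop view is meant to surface.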

Another aspect of the present disclosure provides a knowledge graph-based risk management apparatus, including: an acquisition module configured to acquire enterprise data, wherein the enterprise data includes parties and event relationships between the parties, the event relationships include m categories of data, each category of data includes n pieces of event information, m is an integer greater than or equal to 1, and n is an integer greater than or equal to 1; a construction module configured to construct a knowledge graph according to the enterprise data, wherein the parties and the event information are nodes of the knowledge graph, the event relationships are edges of the knowledge graph, and the event information is used to explain the corresponding edges; a first calculation module configured to calculate, based on the knowledge graph constructed from each category of data, a reduction weight of the n event-information edges between two of the parties; a second calculation module configured to calculate, according to the reduction weights, an aggregation weight ω_vu of the m category-data edges between the two parties; and a management module configured to perform risk management on the enterprise according to the aggregation weight.

Another aspect of the present disclosure provides an electronic device including one or more processors and one or more memories, wherein the memories are used to store executable instructions which, when executed by the processors, implement the method described above.

Another aspect of the present disclosure provides a computer-readable storage medium storing computer-executable instructions which, when executed, implement the method described above.

Brief Description of the Drawings

The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:

FIG. 1 schematically shows an exemplary system architecture to which the method and apparatus according to embodiments of the present disclosure can be applied;

FIG. 2 schematically shows a flowchart of a knowledge graph-based risk management method according to an embodiment of the present disclosure;

FIG. 3 schematically shows a schematic diagram of a knowledge graph according to an embodiment of the present disclosure;

FIG. 4 schematically shows a flowchart of constructing a knowledge graph according to enterprise data according to an embodiment of the present disclosure;

FIG. 5 schematically shows a flowchart of importing the data of the category data into the corresponding schema layer according to an embodiment of the present disclosure;

FIG. 6 schematically shows a flowchart of performing risk management on an enterprise according to the aggregation weight according to an embodiment of the present disclosure;

FIG. 7 schematically shows a flowchart of performing risk management on an enterprise according to the aggregation weight according to an embodiment of the present disclosure;

FIG. 8 schematically shows a flowchart of visualizing the knowledge graph according to an embodiment of the present disclosure;

FIG. 9 schematically shows a flowchart of visualizing the knowledge graph according to an embodiment of the present disclosure;

FIG. 10 schematically shows a flowchart of visualizing the knowledge graph according to an embodiment of the present disclosure;

FIG. 11 schematically shows a flowchart of visualizing the knowledge graph according to an embodiment of the present disclosure;

FIG. 12 schematically shows a block diagram of a knowledge graph-based risk management apparatus according to an embodiment of the present disclosure;

FIG. 13 schematically shows a block diagram of an electronic device according to an embodiment of the present disclosure.

Detailed Description

Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood, however, that these descriptions are exemplary only and are not intended to limit the scope of the present disclosure. In the following detailed description, for ease of explanation, numerous specific details are set forth to provide a thorough understanding of the embodiments of the present disclosure. It will be apparent, however, that one or more embodiments may be practiced without these specific details. In addition, descriptions of well-known structures and techniques are omitted below to avoid unnecessarily obscuring the concepts of the present disclosure. In the technical solution of the present disclosure, the acquisition, storage and use of any user personal information involved comply with the relevant laws and regulations, are subject to the necessary confidentiality measures, and do not violate public order and good morals.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the present disclosure. The terms "comprising", "including" and the like as used herein indicate the presence of the stated features, steps, operations and/or components, but do not preclude the presence or addition of one or more other features, steps, operations or components.

Where an expression such as "at least one of A, B, or C" is used, it should generally be interpreted as commonly understood by those skilled in the art (for example, "a system having at least one of A, B, or C" includes, but is not limited to, systems having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B, and C). The terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature defined as "first" or "second" may expressly or implicitly include one or more of that feature.

Most current loan-duration management models still have limitations. They are built only on the data features of the small and micro enterprise customer itself, whereas most data in the financial industry is relationship data involving multiple customers, so the available data is not fully used. They also cannot take into account the impact that the risk of one customer during the loan duration has on its related customers, such as upstream and downstream customers. This overlooks an important aspect of loan-duration management and limits the ability to provide inclusive financial services to small and micro enterprises.

Embodiments of the present disclosure provide a knowledge graph-based risk management method, a risk management apparatus, an electronic device, a computer-readable storage medium, and a computer program. The knowledge graph-based risk management method includes: acquiring enterprise data, wherein the enterprise data includes parties and event relationships between the parties, the event relationships include m categories of data, each category of data includes n pieces of event information, m is an integer greater than or equal to 1, and n is an integer greater than or equal to 1; constructing a knowledge graph according to the enterprise data, wherein the parties and the event information are nodes of the knowledge graph, the event relationships are edges of the knowledge graph, and the event information is used to explain the corresponding edges; calculating, based on the knowledge graph constructed from each category of data, a reduction weight of the n event-information edges between two parties; calculating, according to the reduction weights, an aggregation weight ω_vu of the m category-data edges between the two parties; and performing risk management on the enterprise according to the aggregation weight.

It should be noted that the knowledge graph-based risk management method, risk management apparatus, electronic device, computer-readable storage medium and computer program of the present disclosure can be used in the field of big data, and can also be used in any field other than big data, such as the financial field. The field of application of the present disclosure is not limited here.

FIG. 1 schematically shows an exemplary system architecture 100 to which the knowledge graph-based risk management method, risk management apparatus, electronic device, computer-readable storage medium and computer program according to an embodiment of the present disclosure can be applied. It should be noted that FIG. 1 is only an example of a system architecture to which the embodiments of the present disclosure can be applied, intended to help those skilled in the art understand the technical content of the present disclosure. It does not mean that the embodiments of the present disclosure cannot be used in other devices, systems, environments or scenarios.

As shown in FIG. 1, the system architecture 100 according to this embodiment may include terminal devices 101, 102 and 103, a network 104 and a server 105. The network 104 is the medium used to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.

A user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102 and 103, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients, and social platform software (examples only).

The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.

The server 105 may be a server that provides various services, such as a back-end management server (example only) that supports the websites browsed by users on the terminal devices 101, 102, and 103. The back-end management server can analyze and otherwise process received data such as user requests, and feed the processing results (such as web pages, information, or data obtained or generated according to the user requests) back to the terminal devices.

It should be noted that the knowledge graph-based risk management method provided by the embodiments of the present disclosure may generally be executed by the server 105. Correspondingly, the risk management apparatus provided by the embodiments of the present disclosure may generally be provided in the server 105. The method may also be executed by a server or server cluster that is different from the server 105 and can communicate with the terminal devices 101, 102, 103 and/or the server 105. Correspondingly, the risk management apparatus may also be provided in a server or server cluster that is different from the server 105 and can communicate with the terminal devices 101, 102, 103 and/or the server 105.

It should be understood that the numbers of terminal devices, networks and servers in FIG. 1 are merely illustrative. There can be any number of terminal devices, networks and servers, as the implementation requires.

Based on the scenario described in FIG. 1, the knowledge graph-based risk management method of the embodiments of the present disclosure is described in detail below with reference to FIGS. 2 to 11.

FIG. 2 schematically shows a flowchart of a knowledge graph-based risk management method according to an embodiment of the present disclosure.

As shown in FIG. 2, the knowledge graph-based risk management method of this embodiment includes operations S210 to S250.

In operation S210, enterprise data is acquired, wherein the enterprise data includes parties and event relationships between the parties, the event relationships include m categories of data, each category of data includes n pieces of event information, m is an integer greater than or equal to 1, and n is an integer greater than or equal to 1.

It should be noted that the parties may all be enterprises, may all be individuals, or may be a mix of enterprises and individuals. Accordingly, an event relationship may exist between two enterprises, between two individuals, or between an enterprise and an individual.

The event relationships include m categories of data. In one possible implementation, the m categories of data include at least one of capital flow data, investment data, guarantee data, and person-enterprise association data.

The capital flow data can be understood as capital flow transaction data between the parties, that is, capital flow transaction data between two enterprises, between two individuals, or between an enterprise and an individual.

The investment data can be understood as investment relationship data between the parties, that is, an investment relationship between two enterprises, between two individuals, or between an enterprise and an individual.

The guarantee data can be understood as guarantee relationship data between the parties, that is, a guarantee relationship between two enterprises, between two individuals, or between an enterprise and an individual.

The person-enterprise association data can be understood as data on the association between an enterprise and individuals, for example, between an enterprise and its legal representative, second person in charge, statutory insurance beneficiary, financial officer, shareholders, general manager, company contact person, chairman, and other persons in charge.

其中，每个类别数据包括n个事件信息，作为一种可能实现的方式，n个事件信息包括n个时间段的同一类别数据下的事件信息。例如，在资金流数据的类别数据下，每个时间段发生的事件归为一个事件信息，进一步例如设定时间段为一个季度，一年可以有4个时间段，则当事方和当事方之间在资金流数据下存在4个事件信息。每个事件信息可以包括但不限于资金流出当事方、交易资金金额、交易时间和资金流入当事方。Wherein, each category of data includes n pieces of event information. As a possible implementation manner, the n pieces of event information include event information under the same category of data in n time periods. For example, under the category data of capital flow data, events that occur in each time period are classified as one event information. For example, if the time period is set as a quarter, and there can be 4 time periods in a year, the parties and the parties There are 4 event information under the capital flow data between parties. Each event information may include, but is not limited to, funds outflow parties, transaction funds amount, transaction time, and funds inflow parties.

又如，在投资数据的类别数据下，每个时间段发生的事件归为一个事件信息，进一步例如设定时间段为一个季度，一年可以有4个时间段，则当事方和当事方之间在投资数据下存在4个事件信息。每个事件信息可以包括但不限于投资当事方、投资金额、投资发生时间和受资当事方。For another example, under the category data of investment data, events that occur in each time period are classified as one event information. For example, if the time period is set as a quarter, and there can be 4 time periods in a year, the parties and There are 4 event information under investment data between parties. Each event information may include, but is not limited to, the investing party, the investment amount, the time the investment occurred, and the funded party.

又如，在担保数据的类别数据下，每个时间段发生的事件归为一个事件信息，进一步例如设定时间段为一个季度，一年可以有4个时间段，则当事方和当事方之间在担保数据下存在4个事件信息。每个事件信息可以包括但不限于担保当事方、担保金额、担保时间和受保当事方。For another example, under the category data of guarantee data, events that occur in each time period are classified as one event information. There are 4 event information under the guarantee data between the parties. Each event information may include, but is not limited to, the insured party, the insured amount, the insured time, and the insured party.

再如，在人企关联数据的类别数据下，每个时间段发生的事件归为一个事件信息，每个时间段内一企业与该企业的法定代表人、第二负责人、保险法定受益人、财务主管、股东、总经理、单位联系人、董事长和其他负责人等建立的关联关系分别为一个事件信息。进一步例如设定时间段为一个季度，一年可以有4个时间段，则该企业与该企业的法定代表人、第二负责人、保险法定受益人、财务主管、股东、总经理、单位联系人、董事长和其他负责人等分别存在4个事件信息。每个事件信息可以包括但不限于企业当事方、个人任职职位、开始任职时间和结束任职时间。For another example, under the category data of person-enterprise association data, events that occur in each time period are classified as one event information, and the legal representative, second responsible person, and insurance legal beneficiary of an enterprise and the enterprise in each time period. , financial director, shareholder, general manager, unit contact person, chairman of the board and other persons in charge, etc., are each associated with one event information. Further, for example, if the time period is set as a quarter, and there can be 4 time periods in a year, the enterprise shall contact the legal representative, the second person in charge, the legal beneficiary of insurance, the financial director, the shareholder, the general manager, and the unit of the enterprise. There are 4 event information respectively for the person, chairman and other responsible persons. Each event information may include, but is not limited to, the business party, the individual's job title, the start date and the end date.

当然，时间段的划分并不限于按季度划分，时间段的划分可以按照年、月、日或者任意的时间区间，这里对时间段不做过多限制。Of course, the division of time periods is not limited to division by quarters, and the division of time periods may be based on years, months, days, or any time interval, and there are no restrictions on time periods here.
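The grouping of events into one piece of event information per time period can be sketched as follows. This is only an illustration assuming quarterly periods; the record fields (`source`, `target`, `category`, `amount`, `time`) and the function names are hypothetical and do not come from the disclosure.

```python
from collections import defaultdict
from datetime import date


def quarter_of(d):
    """Return (year, quarter) for a date, with quarters numbered 1 to 4."""
    return d.year, (d.month - 1) // 3 + 1


def bucket_events(events):
    """Group raw events so that each (source, target, category, period)
    key holds the events forming one piece of event information."""
    buckets = defaultdict(list)
    for ev in events:
        key = (ev["source"], ev["target"], ev["category"], quarter_of(ev["time"]))
        buckets[key].append(ev)
    return dict(buckets)


# Hypothetical capital flow events between enterprise A and enterprise B.
events = [
    {"source": "A", "target": "B", "category": "capital_flow",
     "amount": 30_000, "time": date(2021, 1, 15)},
    {"source": "A", "target": "B", "category": "capital_flow",
     "amount": 20_000, "time": date(2021, 2, 20)},
    {"source": "A", "target": "B", "category": "capital_flow",
     "amount": 10_000, "time": date(2021, 5, 3)},
]
buckets = bucket_events(events)
```

Switching to monthly or yearly periods would only require replacing `quarter_of` with a different period function.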

In operation S220, a knowledge graph is constructed according to the enterprise data, wherein the parties and the event information are nodes of the knowledge graph, the event relationships are edges of the knowledge graph, and the event information is used to explain the corresponding edges.

For example, referring to FIG. 3, a capital flow transaction and an investment relationship exist between enterprise A and enterprise B. In the capital flow transaction, enterprise A is the fund outflow party, the transaction amount is 50,000, the transaction time is t₁, and enterprise B is the fund inflow party. In the investment relationship, enterprise A is the investor, the investment amount is 1 million, the transaction time is t₂, and enterprise B is the invested party.

A guarantee relationship exists between enterprise A and enterprise C, in which enterprise A is the guarantor, the guarantee amount is 500,000, the guarantee time is t₃, and enterprise C is the guaranteed party. A person-enterprise association exists between enterprise A and individual a: individual a is a shareholder of enterprise A, with start time t₄ and end time t₅. A guarantee relationship exists between enterprise D and enterprise B, in which enterprise B is the guarantor, the guarantee amount is 400,000, the guarantee time is t₆, and enterprise D is the guaranteed party.

A capital flow transaction exists between enterprise D and individual a, in which enterprise D is the fund outflow party, the transaction amount is 70,000, the transaction time is t₇, and individual a is the fund inflow party. A person-enterprise association exists between enterprise C and individual b: individual b is the financial director of enterprise C, with start time t₈ and end time t₉. A person-enterprise association exists between enterprise C and individual c: individual c is the chairman of enterprise C, with start time t₁₀ and end time t₁₁.

Based on the above enterprise data, enterprise A, enterprise B, enterprise C, enterprise D, individual a, individual b, and individual c can be used as nodes of the knowledge graph, and the nodes are connected by edges according to the event relationships. The following pieces of event information are also used as nodes of the knowledge graph, each explaining its corresponding edge: "enterprise A is the fund outflow party, the transaction amount is 50,000, the transaction time is t₁, and enterprise B is the fund inflow party"; "enterprise A is the investor, the investment amount is 1 million, the transaction time is t₂, and enterprise B is the invested party"; "enterprise A is the guarantor, the guarantee amount is 500,000, the guarantee time is t₃, and enterprise C is the guaranteed party"; "individual a is a shareholder of enterprise A, with start time t₄ and end time t₅"; "enterprise B is the guarantor, the guarantee amount is 400,000, the guarantee time is t₆, and enterprise D is the guaranteed party"; "enterprise D is the fund outflow party, the transaction amount is 70,000, the transaction time is t₇, and individual a is the fund inflow party"; "individual b is the financial director of enterprise C, with start time t₈ and end time t₉"; and "individual c is the chairman of enterprise C, with start time t₁₀ and end time t₁₁".
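The FIG. 3 example can be sketched as a small in-memory graph in which parties and event information are both nodes and each event relationship is a directed edge that points to the event node explaining it. The class, the method names, the relation names, and the choice of edge direction for person-enterprise associations (enterprise to individual) are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    nodes: dict = field(default_factory=dict)  # node id -> attribute dict
    edges: list = field(default_factory=list)  # (head, relation, tail, event_id) tuples

    def add_party(self, name):
        """Register an enterprise or individual as a party node."""
        self.nodes.setdefault(name, {"type": "party"})

    def add_event(self, event_id, relation, head, tail, **info):
        """Add a directed edge head -> tail plus an event node that explains it."""
        self.add_party(head)
        self.add_party(tail)
        self.nodes[event_id] = {"type": "event", "relation": relation, **info}
        self.edges.append((head, relation, tail, event_id))


kg = KnowledgeGraph()
kg.add_event("e1", "capital_flow", "A", "B", amount=50_000, time="t1")
kg.add_event("e2", "investment", "A", "B", amount=1_000_000, time="t2")
kg.add_event("e3", "guarantee", "A", "C", amount=500_000, time="t3")
kg.add_event("e4", "person_enterprise", "A", "a",
             position="shareholder", start="t4", end="t5")
kg.add_event("e5", "guarantee", "B", "D", amount=400_000, time="t6")
kg.add_event("e6", "capital_flow", "D", "a", amount=70_000, time="t7")
kg.add_event("e7", "person_enterprise", "C", "b",
             position="financial director", start="t8", end="t9")
kg.add_event("e8", "person_enterprise", "C", "c",
             position="chairman", start="t10", end="t11")
```

Keeping the event information in its own node, rather than as a bare edge label, lets attributes such as amount and time be stored and displayed alongside the relationship.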

In one possible implementation, as shown in FIG. 4, operation S220 of constructing the knowledge graph according to the enterprise data includes operations S221 to S222.

In operation S221, a schema layer is constructed according to the parties and the m categories of data, wherein the schema layer includes nodes established according to the parties and the m categories in the m categories of data, and edges between the nodes established according to the events in each category of data.

In operation S222, the data in the category data is imported into the corresponding schema layer.

The schema layer can be understood as the structural layer of the knowledge graph. After the structure of the knowledge graph has been set up in operation S221, the specific data in the category data can be imported into the corresponding schema layer. Operations S221 and S222 thus make it convenient to build the knowledge graph.

Further, as shown in FIG. 5, the category data is structured data, and operation S222 of importing the data in the category data into the corresponding schema layer includes operations S2221 to S2222.

In operation S2221, the category data is converted into Resource Description Framework (RDF) data.

In operation S2222, the RDF data is imported into the corresponding schema layer.

Operations S2221 to S2222 make it convenient to import the data in the category data into the corresponding schema layer, and in turn to construct the knowledge graph.
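As a rough sketch of operation S2221, one structured guarantee record can be serialized as RDF triples in N-Triples syntax using only string formatting. The namespace `http://example.org/kg/`, the predicate names, and the record fields are hypothetical; a production system would more likely rely on an RDF library and a published ontology.

```python
BASE = "http://example.org/kg/"  # hypothetical namespace, for illustration only


def uri(local):
    """Wrap a local name in the hypothetical namespace as an N-Triples IRI."""
    return f"<{BASE}{local}>"


def literal(value):
    """Serialize a Python value as a plain N-Triples string literal."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"'


def guarantee_to_triples(row):
    """Turn one structured guarantee record into a list of N-Triples lines."""
    ev = uri(f"event/{row['id']}")
    return [
        f"{ev} {uri('type')} {uri('GuaranteeEvent')} .",
        f"{ev} {uri('guarantor')} {uri('party/' + row['guarantor'])} .",
        f"{ev} {uri('guaranteed')} {uri('party/' + row['guaranteed'])} .",
        f"{ev} {uri('amount')} {literal(row['amount'])} .",
        f"{ev} {uri('time')} {literal(row['time'])} .",
    ]


row = {"id": "g1", "guarantor": "A", "guaranteed": "C", "amount": 500000, "time": "t3"}
triples = guarantee_to_triples(row)
```

Each output line is one (subject, predicate, object) statement, which is the unit that would then be loaded into the schema layer in operation S2222.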

In operation S230, based on the knowledge graph constructed from each category of data, the reduction weight of the n event information edges between two parties is calculated.

In one implementable manner, the reduction weight is calculated as follows:

[Formula not reproduced in the source text.]

Here, v denotes the head node of the two parties, u denotes the tail node of the two parties, E(v, u) denotes the set of n event information edges whose head node is v and whose tail node is u, and E(v) denotes the set of directed edges whose head node is v. The formula further involves the time weight of edge l′, the initial weight of edge l at the time the corresponding event occurred, and the time weight of edge l; the symbols for these three quantities are not reproduced in the source text.

In operation S240, the aggregation weight ω_vu of the m category data edges between the two parties is calculated according to the reduction weights. In one implementable manner, the aggregation weight is calculated as follows:

[Formula not reproduced in the source text.]

Here, R(v, u) denotes the set of m category data edges whose head node is v and whose tail node is u, and C_r denotes the constant coefficient corresponding to each category of data.
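Because the aggregation formula itself is not available in the text, the following is only a sketch under an explicit assumption: that the aggregation weight is a linear combination of the per-category reduction weights, with C_r as the coefficient of category r. The disclosure may define the combination differently. The category names and all numeric values are hypothetical.

```python
def aggregate_weight(reduction_weights, coefficients):
    """ASSUMED form: sum over categories r in R(v, u) of C_r * reduction weight of r.

    reduction_weights maps each category present between v and u to its
    reduction weight; coefficients maps every category to its constant C_r.
    """
    return sum(coefficients[r] * w for r, w in reduction_weights.items())


# Hypothetical reduction weights for the edges from A to B (two categories present).
reduction_A_B = {"capital_flow": 0.4, "investment": 0.2}

# Hypothetical constant coefficients C_r, one per category.
coefficients = {
    "capital_flow": 1.0,
    "investment": 2.0,
    "guarantee": 1.5,
    "person_enterprise": 0.5,
}

omega_A_B = aggregate_weight(reduction_A_B, coefficients)
```

Under this assumption, categories with no edge between the two parties simply contribute nothing to ω_vu.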

In operation S250, risk management is performed on the enterprise according to the aggregation weight.

In one possible implementation, as shown in FIG. 6, operation S250 of performing risk management on the enterprise according to the aggregation weight includes operations S251 to S254.

In operation S251, label propagation is performed on the nodes of the knowledge graph according to the aggregation weights. For example, a target node can be set as the center, with multiple neighbor nodes connected to it. The edge between the target node and each neighbor node has a corresponding aggregation weight, and the aggregation weights determine which neighbor node's label is propagated to the target node.

As a further example, suppose the target node is A, with neighbor nodes B, C, and D. The label of target node A is label 1, the labels of neighbor nodes B and C are label 2, and the label of neighbor node D is label 3. The aggregation weights determine whether label 2 or label 3 is propagated to target node A: if label 2 is propagated, the new label of target node A is label 2; if label 3 is propagated, the new label of target node A is label 3.
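A single label update for the target node can be sketched as below. The disclosure does not spell out the selection rule, so this sketch assumes the common weighted label propagation rule: the target adopts the label whose neighbors carry the largest total aggregation weight. The weight values and the tie-breaking rule are hypothetical.

```python
from collections import defaultdict


def propagate_label(target, neighbors, labels, weights):
    """Return the label with the largest total aggregation weight among the
    target's neighbors (one update step of weighted label propagation)."""
    score = defaultdict(float)
    for nb in neighbors:
        score[labels[nb]] += weights[(target, nb)]
    # Ties are broken by label name so that the result is deterministic.
    return max(sorted(score), key=lambda lab: score[lab])


labels = {"A": "label1", "B": "label2", "C": "label2", "D": "label3"}

# Hypothetical aggregation weights on the edges between A and its neighbors.
weights = {("A", "B"): 0.2, ("A", "C"): 0.3, ("A", "D"): 0.4}

new_label = propagate_label("A", ["B", "C", "D"], labels, weights)
```

With these values, label 2 collects 0.2 + 0.3 from B and C and outweighs label 3 from D; raising the weight of the edge to D above 0.5 would flip the outcome to label 3.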

In operation S252, communities are divided according to the result of the label propagation, and the community sizes are obtained. After label propagation has been performed on the knowledge graph, each node has a new label, and the nodes can be divided into communities according to these labels. For example, the nodes with label 1 form community 1, the nodes with label 2 form community 2, and the nodes with label 3 form community 3. The number of nodes and edges in each community can be understood as the size of that community.

In operation S253, the degree centrality of the nodes of the knowledge graph is calculated according to the community sizes.

In operation S254, the degree centrality is used as an input of a risk prediction model to perform risk prediction. With the degree centrality as input, the risk prediction model can predict the risk of each node, for example which nodes are overdue customers, which are customers that are not overdue, and which are customers about to become overdue.
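Operations S252 and S253 can be sketched as follows. The disclosure does not give the exact centrality formula, so this sketch assumes the usual normalized degree centrality, computed inside each community: the number of a node's edges within its community divided by the community's node count minus one. The node labels and edges are hypothetical.

```python
from collections import defaultdict


def communities_from_labels(labels):
    """Group nodes that share a label into one community."""
    groups = defaultdict(set)
    for node, lab in labels.items():
        groups[lab].add(node)
    return dict(groups)


def degree_centrality(edges, labels):
    """ASSUMED definition: intra-community degree / (community size - 1)."""
    groups = communities_from_labels(labels)
    degree = defaultdict(int)
    for u, v in edges:
        if labels[u] == labels[v]:  # count only edges inside a community
            degree[u] += 1
            degree[v] += 1
    result = {}
    for node, lab in labels.items():
        size = len(groups[lab])
        result[node] = degree[node] / (size - 1) if size > 1 else 0.0
    return result


labels = {"A": "c1", "B": "c1", "C": "c1", "D": "c2"}
edges = [("A", "B"), ("A", "C"), ("C", "D")]
centrality = degree_centrality(edges, labels)
```

The resulting per-node values are the kind of feature that would be passed to the risk prediction model in operation S254.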

In another possible implementation, as shown in FIG. 7, operation S250 of performing risk management on the enterprise according to the aggregation weight includes operations S255 to S256.

In operation S255, label propagation is performed on the nodes of the knowledge graph according to the aggregation weights. For example, a target node can be set as the center, with multiple neighbor nodes connected to it. The edge between the target node and each neighbor node has a corresponding aggregation weight, and the aggregation weights determine which neighbor node's label is propagated to the target node.

As a further example, suppose the target node is A, with neighbor nodes B, C, and D. The label of target node A is "non-risk customer", the labels of neighbor nodes B and C are "non-risk customer", and the label of neighbor node D is "risk customer". The aggregation weights determine whether "risk customer" or "non-risk customer" is propagated to target node A: if "risk customer" is propagated, the new label of target node A is "risk customer"; if "non-risk customer" is propagated, the new label of target node A is "non-risk customer".

In operation S256, risk prediction is performed according to the result of the label propagation. After label propagation has been performed on the knowledge graph, each node has a new label, and the new label gives the prediction of whether the node is a risk customer or a non-risk customer.

In some embodiments of the present disclosure, referring to FIG. 2, operation S001 is further included before operation S220 of constructing the knowledge graph according to the enterprise data.

In operation S001, the enterprise data is cleaned, wherein cleaning the enterprise data includes one of: data deduplication, completion of missing features in the data, and handling of abnormal features in the data. In some specific examples, data deduplication can be understood as removing duplicate records. For example, if the enterprise data contains both the record "enterprise A is the guarantor, the guarantee amount is 2 million, and enterprise B is the guaranteed party" and the record "enterprise B is the guaranteed party and receives a guarantee of 2 million from enterprise A", the two records are duplicates, and one of them can be deleted in the deduplication operation.

In some specific examples, feature completion can be understood as filling in features that are missing from the data. For example, if the enterprise data contains a record in which enterprise A is the guarantor, the guarantee amount is empty, and enterprise B is the guaranteed party, the guarantee amount is a missing feature and can be filled in.

In some specific examples, abnormal feature handling can be understood as replacing or deleting features that are abnormal. For example, if the enterprise data contains a record in which enterprise A is the guarantor, the guarantee amount is negative, and enterprise B is the guaranteed party, the guarantee amount is an abnormal feature and can be replaced or deleted.

Cleaning the enterprise data yields more standardized enterprise data as the basis for building the knowledge graph, so that the resulting knowledge graph is more accurate.
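The three cleaning steps can be sketched on guarantee records as follows. This is a simplified illustration: the record fields, the function name, and especially the fill value for a missing guarantee amount are hypothetical, since the disclosure does not state how a missing amount is to be filled.

```python
def clean_guarantees(records, fill_amount):
    """Fill missing amounts, drop abnormal (negative) amounts, and remove
    duplicate (guarantor, guaranteed, amount) records, keeping the first."""
    seen = set()
    cleaned = []
    for rec in records:
        amount = rec.get("amount")
        if amount is None:
            amount = fill_amount  # feature completion (fill value is an assumption)
        if amount < 0:
            continue              # abnormal feature: delete the record
        key = (rec["guarantor"], rec["guaranteed"], amount)
        if key in seen:
            continue              # duplicate record: keep only one copy
        seen.add(key)
        cleaned.append({**rec, "amount": amount})
    return cleaned


records = [
    {"guarantor": "A", "guaranteed": "B", "amount": 2_000_000},
    {"guarantor": "A", "guaranteed": "B", "amount": 2_000_000},  # duplicate
    {"guarantor": "A", "guaranteed": "C", "amount": None},       # missing amount
    {"guarantor": "A", "guaranteed": "D", "amount": -5},         # abnormal amount
]
cleaned = clean_guarantees(records, fill_amount=100_000)
```

Here the abnormal record is deleted; replacing the negative amount with a corrected value would be the other option the text mentions.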

In some embodiments of the present disclosure, as shown in FIG. 2, the knowledge graph-based risk management method further includes operation S260.

In operation S260, the knowledge graph is visualized, wherein, as shown in FIG. 8 to FIG. 11, visualizing the knowledge graph in operation S260 includes at least one of operations S261 to S264.

In operation S261, node retrieval is performed based on the knowledge graph. In some examples, node retrieval may include: in response to an input node name, displaying the node related to that name and its associated information. For example, with target node A as the input, the display can show all neighbor nodes connected to target node A, the neighbors of those neighbor nodes, the edges between target node A and each neighbor node, the edges between neighbor nodes, and the event information nodes that explain the corresponding edges. This shows the position and importance of target node A within the whole knowledge graph.

In operation S262, a subgraph walk is performed based on the knowledge graph. In some examples, the subgraph walk may include: in response to a user manually clicking an event relationship, displaying the node corresponding to the clicked event relationship and all edges issued from that node. When a user wants to view the nodes and edges related to an event relationship, the user can click that event relationship in the knowledge graph, and the node corresponding to the clicked event relationship and all edges issued from it are displayed.

In operation S263, path exploration is performed based on the knowledge graph. In some examples, path exploration includes: obtaining and displaying the path relationship between two nodes, wherein clicking two nodes in the knowledge graph obtains and displays the path relationship between them.

In operation S264, self-loop exploration is performed based on the knowledge graph. In some examples, self-loop exploration includes: obtaining and displaying the nodes whose path relationship forms a closed loop. For example, suppose that in the knowledge graph node A invests in node B, node B guarantees node C, node D is the chairman of node C, and funds flow from node D into node A. The knowledge graph then contains a closed loop from node A back to node A, and visualizing the knowledge graph makes it possible to obtain and display the path relationship from node A back to node A together with all nodes on it.
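Path exploration and self-loop exploration can both be sketched with a single breadth-first search over a directed adjacency map. The disclosure does not name a search algorithm, so breadth-first search is an assumption here, as is treating the chairman relationship as an edge from C to D so that the example forms a directed loop.

```python
from collections import deque


def find_path(graph, start, goal):
    """Breadth-first search for a shortest directed path from start to goal.

    A path must use at least one edge, so calling it with start == goal
    looks for a closed loop through that node. Returns None if no path exists.
    """
    queue = deque([[start]])
    visited = set()
    while queue:
        path = queue.popleft()
        for nxt in graph.get(path[-1], []):
            if nxt == goal:
                return path + [nxt]
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


# A invests in B, B guarantees C, D is chairman of C, D's funds flow into A.
graph = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]}

path_A_to_C = find_path(graph, "A", "C")  # path exploration between two nodes
loop_at_A = find_path(graph, "A", "A")    # self-loop exploration from node A
```

The returned node list is what a visualization layer would highlight, together with the event information nodes attached to each edge on the path.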

Operations S261 to S264 diversify the ways in which the knowledge graph can be used, increasing its usability and practicality.

The knowledge graph-based risk management method according to embodiments of the present disclosure is described in detail below. It should be understood that the following description is merely illustrative and does not specifically limit the present disclosure.

According to embodiments of the present disclosure, the method is mainly divided into the construction of the knowledge graph and the application of the knowledge graph. Knowledge graph construction is mainly divided into knowledge acquisition, knowledge modeling, knowledge extraction, knowledge storage, and knowledge visualization.

Specifically, knowledge acquisition screens the bank's internal data for the structured data needed to build the knowledge graph. Knowledge modeling builds an ontology model from general domain knowledge and the experience of domain experts, and uses it as the schema layer of the knowledge graph. Knowledge extraction follows the structure of the ontology model and converts the structured data from the knowledge acquisition step into RDF format data (Resource Description Framework data), thereby completing the data layer of the knowledge graph. Knowledge storage stores the constructed knowledge graph in a graph database designed for RDF format data. Knowledge visualization accesses the graph database through an HTTP service to visualize the knowledge graph and present it to credit business personnel.

In knowledge acquisition, the target customers of the present disclosure are small and micro enterprises that have successfully used the Business Quick Loan (经营快贷) product, and only samples whose loan term (maturity date) on the loan note falls between May 2018 and April 2019 are taken. The length of the sliding time period is set to 12 months, that is, the observation period of each sample is 12 months, so the 12 months before the maturity date of each small or micro enterprise are taken as its time window.

Small and micro enterprises have few connections with one another, so the relationship edges among them are sparse and the risk propagation paths between them cannot be calculated comprehensively. In addition, actual business requires small and micro enterprises to be modeled within the whole financial business network rather than only in terms of the associations among themselves. Therefore, taking the target small and micro enterprise customers as the center and following capital flow, investment, guarantee, and person-enterprise relationships as paths, the associated customers that are not small and micro enterprise loan customers are added. The specific expansion rules are as follows.

(1) Capital flow relationship: taking the target customer as the center, all of its capital flow transaction data within the corresponding time window is taken. Because the number of transactions is very large, the transaction data of each small or micro enterprise is aggregated by month, that is, the transaction amounts with the same counterparty are summed and a new transaction record is generated.

(2) Investment relationship: taking the target customer as the center, all of its investment data within the corresponding time window is taken.

(3) Guarantee relationship: taking the target customer as the center, all data on the guarantee circles it belongs to within the corresponding time window is taken, and each guarantee circle is split, that is, a new guarantee data record is generated between the target customer and every other customer in the guarantee circle.

(4) Person-enterprise relationship: taking the target customer as the center, all of its person-enterprise association data within the corresponding time window is taken, including the target customer's second responsible person, legal representative, legal insurance beneficiary, financial director, shareholders, general manager, unit contact person, chairman, and other responsible persons. For the individual customers found through these associations, all of their capital flow transaction data within the corresponding time window is likewise taken and processed in the same way as the capital flow relationship data above.
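Two of the expansion rules above, monthly aggregation of capital flow transactions per counterparty and the splitting of a guarantee circle into pairwise records, can be sketched as follows. The record fields, the ISO date strings, and the function names are hypothetical.

```python
from collections import defaultdict


def aggregate_monthly(transactions):
    """Sum amounts per (customer, counterparty, year-month), producing one
    new transaction record for each such key."""
    totals = defaultdict(float)
    for tx in transactions:
        month = tx["date"][:7]  # "YYYY-MM" prefix of an ISO date string
        totals[(tx["customer"], tx["counterparty"], month)] += tx["amount"]
    return [
        {"customer": c, "counterparty": p, "month": m, "amount": total}
        for (c, p, m), total in sorted(totals.items())
    ]


def split_guarantee_circle(target, circle):
    """Generate one guarantee record between the target customer and every
    other customer in the same guarantee circle."""
    return [(target, other) for other in circle if other != target]


transactions = [
    {"customer": "A", "counterparty": "B", "date": "2018-06-03", "amount": 100.0},
    {"customer": "A", "counterparty": "B", "date": "2018-06-21", "amount": 250.0},
    {"customer": "A", "counterparty": "B", "date": "2018-07-02", "amount": 80.0},
    {"customer": "A", "counterparty": "D", "date": "2018-06-10", "amount": 40.0},
]
monthly = aggregate_monthly(transactions)
pairs = split_guarantee_circle("A", ["A", "C", "E"])
```

Monthly aggregation trades away individual transaction detail for a much smaller number of event records per pair of parties.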

After the customer expansion is complete, the data must also be cleaned. Cleaning mainly applies adjustments suited to the different characteristics of the different data, including record deduplication, filling of missing features, and exception handling.

(1) Record deduplication: in the original records of capital flow transaction data, if both parties to a transaction are customers of the bank, the same transaction is recorded twice. The two records differ only in their debit/credit direction, so the debit/credit direction of all records must be unified and the duplicate records removed.

(2) Filling of missing features: the cause of the missing feature is analyzed first, and the handling is then chosen according to the cause. There are essentially three options: deleting the sample, deleting the feature variable, or filling in the missing value. Specifically, customers of other banks have no ID primary key in the data. Because a large number of samples is involved, deleting them is not appropriate, so the missing value is filled by using the MD5 hash of the customer account number in place of the ID primary key.

(3) Exception handling: outliers in the data are identified first by combining business knowledge with the results of exploratory data analysis, and second by applying methods commonly used in statistics. The outliers are then handled accordingly to eliminate their influence on model training. For example, the transaction amount in capital flow transaction data should be positive, so capital flow transaction records with a negative transaction amount are deleted.
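The three cleaning steps for capital flow records can be sketched as follows. The field names (`account`, `counterparty`, `direction`, `amount`, `time`) and the `"debit"`/`"credit"` values are hypothetical stand-ins for whatever the source records contain. `hashlib.md5` is the Python standard library call; MD5 serves here only as a stable surrogate identifier, not as a security measure.

```python
import hashlib


def surrogate_id(account_number):
    """MD5 hex digest of an account number, used where the ID primary key is missing."""
    return hashlib.md5(account_number.encode("utf-8")).hexdigest()


def normalize(record):
    """Express a record as (payer, payee, amount, time), whichever side recorded it."""
    if record["direction"] == "debit":  # the account holder paid the counterparty
        payer, payee = record["account"], record["counterparty"]
    else:                               # "credit": the account holder received funds
        payer, payee = record["counterparty"], record["account"]
    return payer, payee, record["amount"], record["time"]


def clean_transactions(records):
    """Drop records with non-positive amounts, unify direction, remove duplicates."""
    unique = {normalize(r) for r in records if r["amount"] > 0}
    return sorted(unique)


records = [
    {"account": "A", "counterparty": "B", "direction": "debit", "amount": 100, "time": "t1"},
    {"account": "B", "counterparty": "A", "direction": "credit", "amount": 100, "time": "t1"},
    {"account": "A", "counterparty": "C", "direction": "debit", "amount": -5, "time": "t2"},
]
cleaned = clean_transactions(records)
external_id = surrogate_id("abc")  # placeholder account number
```

Once both copies of a transaction are expressed in the same payer-to-payee direction they become identical tuples, so a set is enough to remove the duplicate.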

In knowledge modeling, the logical structure of a knowledge graph is mainly divided into a data layer and a schema layer. The data layer contains a large amount of factual information expressed as triples such as (entity, relationship, entity) or (entity, attribute, attribute value). Storing this data in a graph database forms a large-scale entity-relationship network, which in turn forms the knowledge graph. The schema layer is the core of the knowledge graph. It is built on top of the data layer and stores refined knowledge. The schema layer is usually managed with an ontology model, that is, the ontology model's support for axioms, rules, and constraints is used to standardize the connections among objects such as entities, relationships, and the types and attributes of entities.

The present disclosure builds the small and micro enterprise financial knowledge graph in a top-down manner: the ontology (also called the data schema) of the knowledge graph is defined first, and the data layer of the knowledge graph is then generated. This manner of construction is generally suitable for domain knowledge graphs. In defining the ontology, the process starts from the top-level concepts and refines them step by step into a well-structured hierarchy. After the ontology has been defined, the entity data is added to the concepts of the ontology one item at a time, and the underlying entity data automatically generates the knowledge graph according to the associations between the ontology concepts.

Following the above method, knowledge of the financial domain is explored in depth, the data set from the knowledge acquisition stage is surveyed as a whole, and a small and micro enterprise financial ontology model is built by analyzing the semantic associations between concepts and attributes in the domain. The ontology model contains five types of entity nodes: enterprise and/or individual customers, person-enterprise association events, capital transaction events, guarantee events, and investment events. It contains four types of association relationships: enterprise to individual, fund inflow to fund outflow, guarantor to guaranteed, and investor to invested. The attributes of each type of entity node are as follows.

(1) Enterprise/individual customer: enterprise ID primary key, enterprise name, and enterprise category.

(2) Person-enterprise association event: position held by the individual, start time of the position, and end time of the position.

(3) Capital transaction event: transaction event ID primary key, transaction event amount, and transaction event.

(4) Guarantee event: guarantee circle ID primary key and guarantee circle establishment time.

(5) Investment event: investment event ID primary key, investment event amount, investment event start time, and investment event end time.

The present disclosure instantiates the association relationships between customers as entity nodes rather than as simple relationship edges, for example by instantiating a capital transaction event node in place of a capital transaction edge. The reason is that in the small and micro enterprise financial knowledge graph the relationship edges themselves carry a good deal of information, such as the transaction amount and the transaction time, and credit personnel pay close attention to such attributes in actual business scenarios. Representing the association with a single edge in the graph would lose a large amount of key information. The small and micro enterprise financial knowledge graph therefore instantiates four types of association relationship events, so that the attributes of the associations can be displayed more comprehensively and in a flatter form.

At present, the data layer of a knowledge graph generally uses the RDF (Resource Description Framework) model to represent data. RDF is a standard data model, developed by the World Wide Web Consortium (W3C) for the Semantic Web, for representing and exchanging machine-understandable information. It uses Web identifiers (URIs) to identify resources and uses properties and property values to describe them. In an RDF graph, each resource has an HTTP URI as its unique address. An RDF graph is defined as a finite set of triples (s, p, o). Each triple is a factual statement in which s is the subject, p is the predicate, and o is the object; (s, p, o) means either that s and o are connected by the relation p, or that s has the property p with value o. A triple in RDF is sometimes called a statement, and in a knowledge graph it is also called a piece of knowledge.
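As a small illustration of the triple model, the sketch below stores (s, p, o) triples as plain Python tuples and matches them against a pattern. The `http://example.org/` URIs and the `match` helper are made up for this example and do not come from the disclosure or from any RDF library.

```python
# Each triple is (subject, predicate, object); the URIs are illustrative only.
EX = "http://example.org/"
triples = {
    (EX + "firmA", EX + "name", "Firm A"),
    (EX + "firmA", EX + "guarantees", EX + "firmB"),
    (EX + "firmB", EX + "name", "Firm B"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# "Whom does firmA guarantee?" expressed as a pattern with an open object.
guaranteed = match(triples, s=EX + "firmA", p=EX + "guarantees")
```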

An RDF-oriented database is one way to extract knowledge from a relational database: database table names map directly to RDF classes, fields map to properties of those classes, and the relationships between classes can be derived from the tables that represent relationships.
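The table-to-class and field-to-property mapping described above might look like the following sketch. The function name, the `Customer` table, and the subject naming scheme `table/key` are assumptions chosen for illustration.

```python
def rows_to_triples(table, rows, key):
    """Map relational rows to triples: table -> class, column -> property."""
    out = []
    for row in rows:
        subject = f"{table}/{row[key]}"
        # The table name becomes the class of the subject.
        out.append((subject, "rdf:type", table))
        for column, value in row.items():
            if column != key:
                out.append((subject, column, value))
    return out


rows = [{"id": "C001", "name": "Firm A", "category": "enterprise"}]
triples = rows_to_triples("Customer", rows, key="id")
```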

In knowledge storage, a core problem in managing knowledge graph data in RDF format is how to store RDF data sets efficiently and answer SPARQL queries quickly. Broadly, there are two entirely different approaches. The first uses an existing, mature database management system (for example, a relational database system) to store the knowledge graph data: SPARQL queries over the RDF knowledge graph are converted into queries for that system, such as SQL queries for a relational database, and existing relational database products or related technologies answer them. The second develops a native storage and query system for RDF knowledge graph data (native RDF, that is, a graph database system), which takes the characteristics of RDF knowledge graph management into account and is optimized from the bottom layer of the database system.

The present disclosure chooses the native, RDF-graph-oriented storage and management scheme. When a large number of relationships must be described, a traditional relational database is overwhelmed; it is suited to cases with many entities but relatively simple relationships between them. For small and micro enterprises, the relationships between entities are very complex, data often has to be recorded on the relationships themselves, and most operations on the data involve relationships, so an RDF graph database is the more reasonable choice. It not only improves runtime performance but also greatly improves development efficiency and reduces maintenance cost.

In knowledge visualization, a knowledge graph platform for small and micro enterprises can be built to give credit officers a human-computer interaction interface and to assist them in the dynamic management of small and micro enterprise customers during the duration period. The platform visualizes the knowledge graph of small and micro enterprises and provides four main graph-based functions: node retrieval, subgraph walking, path exploration, and self-loop exploration. They are described below.

(1) Node retrieval: Node retrieval is the most basic and most central function of the knowledge graph of small and micro enterprises. Given a target node name, the platform responds quickly and displays the target node and related information in the visual interface. The interface shows the attribute information of the target node and its neighbor nodes within a certain number of hops on the graph, that is, the entities associated with the target entity. Because the knowledge graph supports deep association queries, a credit officer can also obtain the structure of the whole relationship network around the target node, such as the complete upstream and downstream enterprise chain or the complete guarantee circle in which the node sits, and thereby understand the position and importance of the target node in the network.

(2) Subgraph walking: In the knowledge graph of small and micro enterprises, the number and types of edges differ from node to node, and so do the number and types of neighbor nodes. No single fixed pattern can therefore describe the feature profile of every node comprehensively. Instead, a credit officer can obtain the association information of a node in a targeted way by walking heuristically on the node's neighborhood subgraph. Subgraph walking derives from graph-based random walk algorithms. A random walk is a method of extracting the structural features of a graph: the algorithm constructs several random walkers, each walker is initialized at some node, and at each step it visits a randomly chosen neighbor of the current node.

The main difference between subgraph walking on the knowledge graph of small and micro enterprises and a random walk is that the next node to visit can be chosen manually rather than at random. For example, guarantee relationships between enterprises are an important but very sparse kind of data among the features of small and micro enterprises. If the target node has several guarantee edges in the knowledge graph, the credit officer can choose to walk along those edges; if the node has no guarantee relationship, the officer can walk along other relationships instead, obtaining a comprehensive feature profile of each enterprise in a targeted way.
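A guided walk of this kind can be sketched as below. In the disclosure the credit officer picks the next node by hand; here a `preferred` relation type stands in for that manual choice, which is an assumption made so the example can run unattended. The graph and relation names are illustrative.

```python
def guided_walk(graph, start, preferred, steps):
    """Walk the graph, following `preferred` relation edges when available.

    graph maps node -> list of (relation, neighbor). Where a random walker
    would pick any neighbor, this walker takes the first unvisited neighbor
    reached via the preferred relation, else the first unvisited neighbor.
    """
    path, visited, node = [start], {start}, start
    for _ in range(steps):
        candidates = [(r, n) for r, n in graph.get(node, []) if n not in visited]
        if not candidates:
            break
        chosen = next((n for r, n in candidates if r == preferred),
                      candidates[0][1])
        path.append(chosen)
        visited.add(chosen)
        node = chosen
    return path


graph = {
    "A": [("fund_flow", "B"), ("guarantee", "C")],
    "C": [("guarantee", "D"), ("fund_flow", "B")],
    "D": [("invest", "E")],
}
walk = guided_walk(graph, "A", preferred="guarantee", steps=3)
```

The walk follows guarantee edges from A to C to D and falls back to the investment edge at D, where no guarantee edge exists.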

(3) Path exploration: Given two target node names, the platform can explore whether a path exists between the two nodes on the graph. A credit officer can check whether a small or micro enterprise customer has a path to a leading or listed enterprise in its industry, or to an enterprise in the industry that has defaulted on a loan, as a reference for risk assessment. In addition, two enterprises may have no direct association, that is, no edge between their two nodes in the graph, yet be related in several ways through intermediate nodes, such as competition, shareholding control, or investment and financing. By analyzing the paths between the two enterprises, the credit officer can identify such indirect associations.
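Path exploration between two named nodes can be illustrated with a breadth-first search. The disclosure does not say which search the platform uses, so BFS is an assumption here, and the node names are invented.

```python
from collections import deque


def find_path(adj, source, target):
    """Breadth-first search; returns a shortest path as a node list, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Rebuild the path by following parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


adj = {"small_firm": ["supplier"], "supplier": ["listed_firm"], "listed_firm": []}
path = find_path(adj, "small_firm", "listed_firm")
```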

(4) Self-loop exploration: Given a target node name, the platform can check whether one or more self-loop paths through that node exist in the graph and, if so, present them to the credit officer. A self-loop on the financial knowledge graph of small and micro enterprises means that the customer may pose a loan arbitrage risk: a loan issued to the customer flows through several other customers and then returns to the customer itself. Such cases require focused attention and separate analysis by credit officers.
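Self-loop exploration amounts to finding directed cycles that pass through the target node. The sketch below uses a depth-first search with a length cap; the function name and the `max_len` parameter are assumptions for illustration.

```python
def cycles_through(adj, target, max_len=6):
    """Enumerate simple directed cycles that start and end at `target`."""
    found = []

    def dfs(node, path):
        for nxt in adj.get(node, []):
            if nxt == target:
                found.append(path + [target])
            elif nxt not in path and len(path) < max_len:
                dfs(nxt, path + [nxt])

    dfs(target, [target])
    return found


# Funds leave A, pass through B and C, and return to A.
adj = {"A": ["B"], "B": ["C", "D"], "C": ["A"], "D": []}
loops = cycles_through(adj, "A")
```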

Knowledge graph application is divided into four main parts: edge weight modeling, community division, graph feature calculation, and risk propagation prediction.

In edge weight modeling, the financial knowledge graph of small and micro enterprises constructed above, once the attributes on its nodes are ignored, can be regarded as a directed multigraph, that is, a directed graph in which, unlike a directed simple graph, some pair of nodes is joined by more than one edge. In this knowledge graph, fund flow relationships and investment relationships are both directed edges: the direction of a fund flow edge represents the direction in which funds flow out and in, and the direction of an investment edge distinguishes the investor from the investee. Two customers may also be linked by several kinds of relationship, and they may have fund flow transactions in several different months, which yields several fund flow edges. The knowledge graph is therefore a directed multigraph. To model graph features for its nodes, the multigraph must first be reduced to a simple graph, after which network feature indicators can be computed directly.

Based on the constructed knowledge graph G, suppose the set of association relationships in the graph is {R_1, R_2, R_3, ..., R_n}. G is divided by relationship into several relation graphs G_r1, G_r2, G_r3, ..., G_rn, where each relation graph G_ri contains only the edges of the knowledge graph whose relationship is ri. In G_ri, all edges between any two connected nodes are of the same type, but their times and other attributes may differ. A weight is computed for each edge according to its attribute values, and finally the weights of all edges between the two nodes are summed so that the edges are reduced to one.

Specifically, for a node v in a relation graph G_ri and a node u in the set of out-edge neighbors of v in G_ri, the multiple edges from v to u are reduced to a single new edge whose weight is the reduction weight. (The membership condition, the reduction weight formula, and several of its symbols are given as images in the source publication and are not reproduced in this text.) In that formula, v denotes the head node of the two parties and u denotes the tail node; E(v, u) denotes the set of n event information edges with head node v and tail node u; E(v) denotes the set of directed edges with head node v; and the remaining symbols denote, respectively, the time weight of an edge l′, the initial weight of an edge l at the time the corresponding event occurred, and the time weight of edge l. In the fund flow relationship, the initial weight is the transaction amount attribute of the transaction event; in the investment relationship, it is the investment amount attribute of the investment event; in all other relationships, the initial weight is 1. The larger the span between the creation time t_l of an edge and the current time t, the smaller the time weight and the less important the edge.
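Since the reduction weight formula itself is not reproduced in this text, the following is only a sketch under explicit assumptions: the weight of the reduced edge (v, u) is taken to be the sum over E(v, u) of initial weight times time weight, divided by the sum of the time weights over E(v), and the time weight is taken to be an exponential decay in t − t_l. Both choices, along with the names `time_weight`, `reduction_weights`, and `rate`, are hypothetical and may differ from the formula in the publication.

```python
import math


def time_weight(t_now, t_edge, rate=0.1):
    """Assumed exponential decay: older edges get smaller weights."""
    return math.exp(-rate * (t_now - t_edge))


def reduction_weights(edges, t_now, rate=0.1):
    """Collapse parallel edges of one relation graph into one weight per (v, u).

    edges: list of (v, u, initial_weight, t_edge).
    Assumed form: sum over E(v, u) of initial_weight * time_weight, divided by
    the sum of time weights over all out-edges E(v) of the head node v.
    """
    numer, denom = {}, {}
    for v, u, w0, t_edge in edges:
        tw = time_weight(t_now, t_edge, rate)
        numer[(v, u)] = numer.get((v, u), 0.0) + w0 * tw
        denom[v] = denom.get(v, 0.0) + tw
    return {(v, u): s / denom[v] for (v, u), s in numer.items()}


# Two transfers A -> B in different months and one transfer A -> C.
edges = [("A", "B", 100.0, 10), ("A", "B", 300.0, 12), ("A", "C", 200.0, 12)]
weights = reduction_weights(edges, t_now=12, rate=0.0)
```

With `rate=0.0` every time weight is 1, so the example isolates the summation over parallel edges; a positive rate makes the month-10 transfer count for less than the month-12 transfers.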

After the multiple edges within each relation graph G_ri have been aggregated, all relation graphs are merged to obtain a new association graph. For a node v in this graph, if node u ∈ N_out(v), the new edge from v to u carries the aggregation weight ω_vu (the formula is given as an image in the source publication and is not reproduced in this text). Here N_out(v) is the set of out-edge neighbor nodes of node v in the new association graph, R(v, u) denotes the set of m category data edges with head node v and tail node u, and C_r denotes the constant coefficient corresponding to each category of data.
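One plausible reading of the definitions above is that ω_vu is a coefficient-weighted sum of the per-relation reduction weights over R(v, u). The sketch below implements that reading; the weighted-sum form is an assumption (the formula is not reproduced in this text), and the coefficient values are invented.

```python
def aggregate_weights(per_relation, coefficients):
    """Merge per-relation reduction weights into one weight per ordered pair.

    per_relation: {relation: {(v, u): weight}}
    coefficients: {relation: C_r}; the values used below are made up.
    Assumed form: the coefficient-weighted sum over relations linking v to u.
    """
    merged = {}
    for relation, weights in per_relation.items():
        c = coefficients[relation]
        for pair, w in weights.items():
            merged[pair] = merged.get(pair, 0.0) + c * w
    return merged


per_relation = {
    "fund_flow": {("A", "B"): 0.5, ("A", "C"): 0.25},
    "guarantee": {("A", "B"): 1.0},
}
omega = aggregate_weights(per_relation, {"fund_flow": 1.0, "guarantee": 2.0})
```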

The result is a directed simple graph in which any two adjacent nodes are joined by at most one edge in each direction, and the larger the weight of an edge, the more closely the two nodes are associated.

In community division, because the graph contains a large number of nodes, node graph features computed globally differ little from one another and contain many repeated values, so the results cannot be used as model features. The label propagation algorithm is therefore applied to divide the new graph into communities based on edge weights, and the graph features of nodes are computed within the resulting community subgraphs.

The basic process of community division by label propagation is as follows. Each node in the graph is first assigned a different community label, and the labels then propagate through the knowledge graph. At every propagation step, each node updates its own label according to the labels of its neighbors; specifically, each node joins the label category whose edges to it have the largest total weight among its neighbors. As the labels propagate, each set of closely connected nodes eventually reaches a consensus and its labels stop changing. A community division algorithm based on label propagation chooses the propagation path according to the weights of the edges between nodes, that is, according to how closely they are connected, and finally places closely associated enterprise entities in the same community. Compared with clustering based on feature similarity, this makes it easier to determine latent information such as the group to which an enterprise belongs and how risk propagates.
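The propagation step described above can be sketched in a few lines. This version treats the graph as undirected, updates nodes in sorted order, and breaks ties toward the smallest label so that the result is deterministic; those three choices are assumptions of this sketch (label propagation is usually run with random order and random tie-breaking), and the example weights are invented.

```python
def label_propagation(weights, max_iter=20):
    """Weighted label propagation on an undirected view of the graph.

    weights: {(v, u): weight}. Each node starts with its own label and then
    adopts the label whose incident edge weights sum highest among neighbors.
    """
    neighbors = {}
    for (v, u), w in weights.items():
        neighbors.setdefault(v, {})[u] = neighbors.get(v, {}).get(u, 0.0) + w
        neighbors.setdefault(u, {})[v] = neighbors.get(u, {}).get(v, 0.0) + w
    labels = {n: n for n in neighbors}
    for _ in range(max_iter):
        changed = False
        for node in sorted(neighbors):
            score = {}
            for nb, w in neighbors[node].items():
                score[labels[nb]] = score.get(labels[nb], 0.0) + w
            # Highest total weight wins; ties go to the smallest label.
            best = min(score, key=lambda lab: (-score[lab], lab))
            if best != labels[node]:
                labels[node], changed = best, True
        if not changed:
            break
    return labels


# Two tight triangles joined by one weak edge (C, D).
weights = {
    ("A", "B"): 5.0, ("B", "C"): 5.0, ("C", "A"): 5.0,
    ("C", "D"): 0.1,
    ("D", "E"): 5.0, ("E", "F"): 5.0, ("F", "D"): 5.0,
}
labels = label_propagation(weights)
```

The weak edge between C and D is outweighed on both sides, so the two triangles settle into separate communities.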

Computing the graph features of nodes within separate community subgraphs ignores the edges between communities and wastes information. To make full use of the data, the present disclosure defines the node features computed within a community as local feature information. Each community is then abstracted as a single node, and the weighted sum of all edges between two communities is used as the edge between the two community nodes, yielding a community association network. The features of each community node in this network are computed and used as the global feature information of the original customer nodes in that community. Considering global and local features together makes full use of the limited data.
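Building the community association network can be sketched as follows. The sketch sums the aggregated edge weights of all edges crossing between two communities; whether the disclosure applies any further weighting in this sum is not stated, so a plain sum of edge weights is assumed. Community labels `c1` and `c2` are illustrative.

```python
def community_network(weights, labels):
    """Collapse each community to one node and sum inter-community weights."""
    merged = {}
    for (v, u), w in weights.items():
        cv, cu = labels[v], labels[u]
        if cv != cu:  # intra-community edges stay in the local subgraph
            merged[(cv, cu)] = merged.get((cv, cu), 0.0) + w
    return merged


weights = {("A", "B"): 5.0, ("B", "D"): 0.5, ("A", "E"): 0.25, ("D", "E"): 5.0}
labels = {"A": "c1", "B": "c1", "D": "c2", "E": "c2"}
network = community_network(weights, labels)
```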

In graph feature calculation, the present disclosure selects six indicators, namely degree centrality, eigenvector centrality, betweenness centrality, closeness centrality, HITS value, and PageRank value, to compute features in the local and global dimensions. These graph features are combined with traditional variable features to build a logistic regression model that predicts the loan overdue risk of small and micro customers. In the end, 12 variables enter the model, including three graph features: degree centrality in the global dimension, in-edge closeness centrality in the local dimension, and hub value in the local dimension. This indicates that these three features discriminate the overdue risk of small and micro enterprise customers best and can serve as model variables that supplement the feature dimensions of traditional prediction models.
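Two of the indicators named above, degree centrality and the HITS hub value, can be computed with the short sketch below. It ignores edge weights and normalizes hub scores to sum to 1; both are simplifications assumed for this example, and the function names are hypothetical.

```python
def degree_centrality(edges, nodes):
    """Degree centrality: number of incident edges divided by (n - 1)."""
    degree = {n: 0 for n in nodes}
    for v, u in edges:
        degree[v] += 1
        degree[u] += 1
    scale = max(len(nodes) - 1, 1)
    return {n: d / scale for n, d in degree.items()}


def hits_hubs(edges, nodes, iters=50):
    """HITS hub scores by power iteration, normalized to sum to 1."""
    hub = {n: 1.0 for n in nodes}
    for _ in range(iters):
        # Authority of u: sum of hub scores of nodes pointing to u.
        auth = {n: 0.0 for n in nodes}
        for v, u in edges:
            auth[u] += hub[v]
        # Hub of v: sum of authority scores of nodes v points to.
        new_hub = {n: 0.0 for n in nodes}
        for v, u in edges:
            new_hub[v] += auth[u]
        norm = sum(new_hub.values()) or 1.0
        hub = {n: h / norm for n, h in new_hub.items()}
    return hub


nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("A", "D")]
centrality = degree_centrality(edges, nodes)
hubs = hits_hubs(edges, nodes)
```

Run once on a community subgraph and once on the community association network, functions like these would give the local and global variants of a feature.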

In risk propagation prediction, besides computing graph features, the present disclosure also applies the label propagation algorithm to classify and predict defaulting nodes. The label propagation algorithm assumes that the label of each node should match the label of most of its neighbors. That is, when a small or micro enterprise customer is not itself overdue but several closely associated neighbor customers are, the risk is very likely to spread to that customer. The label propagation algorithm can predict this transmission of risk, so that an early warning can be issued and the risk avoided.

First, a label is added to each node to indicate whether the customer's loan is overdue. The labels are then propagated iteratively over the graph, updating the labels of the nodes. After a certain number of iterations, the nodes are checked for label changes. Some nodes with unknown labels are found to have been propagated into risk nodes; an unknown label denotes a customer outside the bank who holds no loan with the bank, and its propagation into a risk node provides a reference for later loan admission. A small number of customers' labels change from not overdue to overdue, which indicates that these customers are susceptible to transmitted risk and are key targets of duration-period management.
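The iterative relabeling can be sketched as a weighted majority vote among labeled neighbors. The update schedule is not specified in the text, so this sketch assumes synchronous updates, lets ties keep the current label, and uses invented label strings and customer names.

```python
def propagate_risk(neighbors, labels, iters=5):
    """Iteratively relabel nodes by weighted majority vote of labeled neighbors.

    neighbors: {node: {neighbor: weight}}
    labels: {node: "overdue" | "ok" | None}; None marks an unknown label.
    Assumption: synchronous updates, and ties keep the node's current label.
    """
    current = dict(labels)
    for _ in range(iters):
        updated = dict(current)
        for node, nbrs in neighbors.items():
            votes = {}
            for nb, w in nbrs.items():
                if current.get(nb) is not None:
                    votes[current[nb]] = votes.get(current[nb], 0.0) + w
            if not votes:
                continue
            top = max(votes.values())
            winners = [lab for lab, s in votes.items() if s == top]
            if len(winners) == 1:
                updated[node] = winners[0]
        if updated == current:
            break
        current = updated
    return current


neighbors = {
    "X": {"D1": 3.0, "D2": 2.0, "G": 1.0},
    "D1": {"X": 3.0, "D2": 4.0},
    "D2": {"X": 2.0, "D1": 4.0},
    "G": {"X": 1.0, "H": 5.0},
    "H": {"G": 5.0},
    "U": {"D1": 1.0},
}
labels = {"X": "ok", "D1": "overdue", "D2": "overdue",
          "G": "ok", "H": "ok", "U": None}
result = propagate_risk(neighbors, labels)
```

In this toy graph, customer X starts as not overdue but is tied mainly to two overdue customers and is relabeled overdue; the unknown customer U, linked only to an overdue customer, is also flagged.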

Based on the above knowledge-graph-based risk management method, the present disclosure also provides a knowledge-graph-based risk management apparatus 10. The risk management apparatus 10 is described in detail below with reference to FIG. 12.

FIG. 12 schematically shows a structural block diagram of the knowledge-graph-based risk management apparatus 10 according to an embodiment of the present disclosure.

The knowledge-graph-based risk management apparatus 10 includes an acquisition module 1, a construction module 2, a first calculation module 3, a second calculation module 4, and a management module 5.

The acquisition module 1 is configured to perform operation S210: acquiring enterprise data, where the enterprise data includes parties and event relationships between the parties, the event relationships include m categories of data, and each category of data includes n pieces of event information, m being an integer greater than or equal to 1 and n being an integer greater than or equal to 1.

The construction module 2 is configured to perform operation S220: constructing a knowledge graph from the enterprise data, where the parties and the event information are nodes of the knowledge graph, the event relationships are edges of the knowledge graph, and the event information is used to explain the corresponding edges.

The first calculation module 3 is configured to perform operation S230: based on the knowledge graph constructed from each category of data, calculating the reduction weight of the n event information edges between two parties.

The second calculation module 4 is configured to perform operation S240: calculating, from the reduction weights, the aggregation weight ω_vu of the m category data edges between the two parties.

The management module 5 is configured to perform operation S250: performing risk management on the enterprise according to the aggregation weight.

Because the risk management apparatus 10 is configured on the basis of the risk management method, its beneficial effects are the same as those of the method and are not repeated here.

In addition, according to embodiments of the present disclosure, any number of the acquisition module 1, the construction module 2, the first calculation module 3, the second calculation module 4, and the management module 5 may be combined and implemented in one module, or any one of them may be split into several modules. Alternatively, at least part of the functions of one or more of these modules may be combined with at least part of the functions of other modules and implemented in one module.

According to embodiments of the present disclosure, at least one of the acquisition module 1, the construction module 2, the first calculation module 3, the second calculation module 4, and the management module 5 may be implemented at least partly as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on chip, a system on substrate, a system on package, or an application-specific integrated circuit (ASIC); or in hardware or firmware by any other reasonable way of integrating or packaging circuits; or in any one of software, hardware, and firmware, or any suitable combination of them.

Alternatively, at least one of the acquisition module 1, the construction module 2, the first calculation module 3, the second calculation module 4, and the management module 5 may be implemented at least partly as a computer program module which, when run, performs the corresponding function.

FIG. 13 schematically shows a block diagram of an electronic device suitable for implementing the knowledge-graph-based risk management method according to an embodiment of the present disclosure.

As shown in FIG. 13, an electronic device 900 according to an embodiment of the present disclosure includes a processor 901, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage section 908 into a random access memory (RAM) 903. The processor 901 may include, for example, a general-purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset, and/or a special-purpose microprocessor (e.g., an application-specific integrated circuit (ASIC)). The processor 901 may also include on-board memory for caching purposes. The processor 901 may include a single processing unit or multiple processing units for performing the different actions of the method flow according to embodiments of the present disclosure.

The RAM 903 stores the various programs and data required for the operation of the electronic device 900. The processor 901, the ROM 902, and the RAM 903 are connected to one another through a bus 904. The processor 901 performs the various operations of the method flow according to embodiments of the present disclosure by executing programs in the ROM 902 and/or the RAM 903. Note that the programs may also be stored in one or more memories other than the ROM 902 and the RAM 903, and the processor 901 may likewise perform those operations by executing programs stored in the one or more memories.

According to an embodiment of the present disclosure, the electronic device 900 may further include an input/output (I/O) interface 905, which is also connected to the bus 904. The electronic device 900 may further include one or more of the following components connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output section 907 including a cathode ray tube (CRT) or liquid crystal display (LCD) and a speaker; a storage section 908 including a hard disk and the like; and a communication section 909 including a network interface card such as a LAN card or a modem. The communication section 909 performs communication processing via a network such as the Internet. A drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 910 as needed, so that a computer program read from it can be installed into the storage section 908 as needed.

The present disclosure also provides a computer-readable storage medium. The medium may be included in the device/apparatus/system described in the above embodiments, or it may exist separately without being assembled into that device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to embodiments of the present disclosure.

According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, but is not limited to, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include the ROM 902 and/or the RAM 903 described above, and/or one or more memories other than the ROM 902 and the RAM 903.

Embodiments of the present disclosure also include a computer program product comprising a computer program that contains program code for performing the method shown in the flowcharts. When the computer program product runs in a computer system, the program code causes the computer system to implement the method of the embodiments of the present disclosure.

When the computer program is executed by the processor 901, it performs the above functions defined in the system/apparatus of the embodiments of the present disclosure. According to embodiments of the present disclosure, the systems, apparatuses, modules, units, and the like described above may be implemented by computer program modules.

In one embodiment, the computer program may reside on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may be transmitted and distributed as a signal over a network medium, downloaded and installed through the communication section 909, and/or installed from the removable medium 911. The program code contained in the computer program may be transmitted over any suitable network medium, including but not limited to wireless or wired media, or any suitable combination of the foregoing.

In such an embodiment, the computer program may be downloaded and installed from the network through the communication section 909 and/or installed from the removable medium 911. When the computer program is executed by the processor 901, it performs the above functions defined in the system of the embodiments of the present disclosure. According to embodiments of the present disclosure, the systems, devices, apparatuses, modules, units, and the like described above may be implemented by computer program modules.

According to embodiments of the present disclosure, the program code for executing the computer program provided by the embodiments may be written in any combination of one or more programming languages. Specifically, these computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or in assembly/machine language. The programming languages include, but are not limited to, Java, C++, Python, the "C" language, and similar programming languages. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or a server. Where a remote computing device is involved, it may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computing device (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented by a special-purpose hardware-based system that performs the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.

Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or integrated in many ways, even if such combinations or integrations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or integrated in many ways without departing from the spirit and teachings of the present disclosure. All such combinations and/or integrations fall within the scope of the present disclosure.

Embodiments of the present disclosure have been described above. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the various embodiments cannot be used in combination to advantage. The scope of the present disclosure is defined by the appended claims and their equivalents. Those skilled in the art can make various substitutions and modifications without departing from the scope of the present disclosure, and all such substitutions and modifications shall fall within the scope of the present disclosure.

Claims

1. A risk management method based on a knowledge graph, characterized by comprising the following steps:

acquiring enterprise data, wherein the enterprise data comprises parties and an event relation between the parties, the event relation comprises m category data, each category data comprises n event information, m is an integer greater than or equal to 1, and n is an integer greater than or equal to 1;

constructing a knowledge graph according to the enterprise data, wherein the parties and the event information are nodes of the knowledge graph, the event relation is an edge of the knowledge graph, and the event information is used for explaining the corresponding edge;

calculating, based on the knowledge graph constructed for each of the category data, a reduction weight of the n event information edges between two of the parties;

calculating an aggregation weight ω_vu of the m category data edges between the two parties according to the reduction weight; and

performing risk management on the enterprise according to the aggregation weight.

2. The method of knowledge-graph-based risk management according to claim 1, wherein the reduction weight is calculated as follows:

wherein v represents the head node of the two parties, u represents the tail node of the two parties, E(v, u) represents the set of the n event information edges with head node v and tail node u, E(v) represents the set of directed edges with head node v,

indicating the initial weight of the edge l' at the corresponding event occurrence time, t indicating the current time,

the temporal weight of the edge l' is represented,

indicating the initial weight of the edge l at the corresponding event occurrence time,

represents the temporal weight of the edge l; and

the calculation method of the aggregation weight is as follows:

wherein R(v, u) represents the set of the m category data edges with head node v and tail node u, and C_r represents the constant coefficient corresponding to each of the category data.
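The two weights defined in claim 2 can be illustrated with a minimal Python sketch. The formulas themselves are not reproduced in this text, so the sketch rests on assumptions: the reduction weight is taken to be the share of v's time-weighted outgoing weight, within one category, that goes to u, and the temporal weight is taken to be an exponential decay. The names `EventEdge`, `temporal_weight`, `decay_rate`, and the category labels are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EventEdge:
    head: str              # head node v
    tail: str              # tail node u
    category: str          # category r of the event relation (hypothetical labels)
    initial_weight: float  # weight of the edge at the event occurrence time
    event_time: float      # event occurrence time, e.g. in days


def temporal_weight(edge, now, decay_rate=0.01):
    """Assumed exponential decay; the source's temporal weight is not reproduced here."""
    return math.exp(-decay_rate * max(0.0, now - edge.event_time))


def reduction_weight(edges, v, u, category, now):
    """Assumed form: time-weighted sum over E(v, u) divided by the sum over E(v)."""
    out_v = [e for e in edges if e.head == v and e.category == category]  # E(v)
    total = sum(e.initial_weight * temporal_weight(e, now) for e in out_v)
    if total == 0.0:
        return 0.0
    to_u = sum(e.initial_weight * temporal_weight(e, now)
               for e in out_v if e.tail == u)                            # E(v, u)
    return to_u / total


def aggregation_weight(edges, v, u, coefficients, now):
    """Assumed form: sum over categories r in R(v, u) of C_r times the reduction weight."""
    categories = {e.category for e in edges if e.head == v and e.tail == u}
    return sum(coefficients.get(r, 0.0) * reduction_weight(edges, v, u, r, now)
               for r in categories)
```

Under these assumptions, an older event contributes less than a recent one of the same initial weight, and the coefficients C_r let one category (for example, guarantees) count more than another when the per-category weights are combined.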

3. The method of knowledge-graph-based risk management according to claim 1, wherein the m category data includes at least one of capital flow data, investment data, guarantee data, and corporate-related data.

4. The method of knowledge-graph-based risk management according to claim 1, wherein the n event information comprises event information under the same category data for n time periods.

5. The method of knowledge-graph based risk management according to claim 1, further comprising, prior to said building a knowledge-graph from the enterprise data:

cleansing the enterprise data, wherein cleansing the enterprise data comprises one of: data deduplication, completion of features in the data, and processing of anomalous features in the data.
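A minimal Python sketch of the three cleansing operations named in claim 5 follows. The claim does not say how each operation is performed, so the choices here are assumptions: exact-match deduplication, treating negative amounts as anomalous, and filling missing amounts with the median of the known values. The function name `cleanse` and the field name `amount` are hypothetical, and the sketch chains all three operations purely for illustration.

```python
from statistics import median


def cleanse(records, numeric_field="amount"):
    """Return cleansed copies of the records; the input list is left unchanged."""
    # 1. Data deduplication: drop records that repeat an earlier one exactly.
    seen, unique = set(), []
    for record in records:
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(record))

    # 2. Anomalous feature processing: a negative amount is treated as missing
    #    (an assumed rule for this sketch).
    for record in unique:
        value = record.get(numeric_field)
        if value is not None and value < 0:
            record[numeric_field] = None

    # 3. Feature completion: fill missing amounts with the median of known ones.
    known = [r[numeric_field] for r in unique if r.get(numeric_field) is not None]
    fill = median(known) if known else 0.0
    for record in unique:
        if record.get(numeric_field) is None:
            record[numeric_field] = fill
    return unique
```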

6. The method of knowledge-graph based risk management according to claim 1, wherein said building a knowledge-graph from the enterprise data comprises:

constructing a schema layer from the party and the m category data, wherein the schema layer comprises nodes established from the party and m categories of the m category data, and edges between the nodes established from events in each of the category data; and

importing the data in the category data into the corresponding schema layer.

7. The method of knowledge-graph-based risk management according to claim 6, wherein the category data is structured data, and importing the data in the category data into a corresponding schema layer comprises:

converting the category data into resource description framework data; and

importing the resource description framework data into the corresponding schema layer.
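The conversion of structured category data into resource description framework (RDF) data in claim 7 can be sketched as follows. The sketch emits N-Triples lines using only the Python standard library. The namespace `http://example.org/kg/`, the row fields (`event_id`, `head`, `tail`, `category`, `amount`, `date`), and the property names are all hypothetical. Modeling each event as its own node follows claim 1, where event information is a node that explains the corresponding edge.

```python
from urllib.parse import quote

BASE = "http://example.org/kg/"  # hypothetical namespace, for illustration only


def iri(kind, name):
    """Build an IRI term such as <http://example.org/kg/party/A>."""
    return f"<{BASE}{kind}/{quote(str(name), safe='')}>"


def literal(value):
    """Build a plain string literal, escaping backslashes and double quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def row_to_triples(row):
    """Convert one structured row into N-Triples lines."""
    head = iri("party", row["head"])
    tail = iri("party", row["tail"])
    event = iri("event", row["event_id"])
    return [
        # The event relation is the edge between the two party nodes.
        f"{head} {iri('relation', row['category'])} {tail} .",
        # The event information node explains that edge.
        f"{event} {iri('property', 'source')} {head} .",
        f"{event} {iri('property', 'target')} {tail} .",
        f"{event} {iri('property', 'amount')} {literal(row['amount'])} .",
        f"{event} {iri('property', 'date')} {literal(row['date'])} .",
    ]
```

The resulting lines could then be loaded into whatever triple store or graph database holds the schema layer. The source does not name one.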

8. The method of knowledge-graph-based risk management according to claim 1, wherein said risk managing an enterprise according to the aggregation weight comprises:

performing label propagation on the nodes of the knowledge graph according to the aggregation weight;

carrying out community division according to the label propagation result to obtain the community scale;

calculating the degree centrality of the nodes of the knowledge graph according to the community scale; and

performing risk prediction by taking the degree centrality as an input of a risk prediction model.
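The first three steps of claim 8 can be sketched in Python as below. The claim does not specify the propagation rule or how the community scale enters the centrality, so this sketch makes assumptions: edges are treated as undirected during propagation, each node adopts the label with the largest total aggregation weight among its neighbors (ties broken by the smaller label), and degree centrality is the number of neighbors inside a node's own community divided by the community size minus one. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def propagate_labels(weights, max_rounds=20):
    """weights maps (v, u) to an aggregation weight; returns (labels, neighbors)."""
    neighbors = defaultdict(dict)
    for (v, u), w in weights.items():
        neighbors[v][u] = neighbors[v].get(u, 0.0) + w
        neighbors[u][v] = neighbors[u].get(v, 0.0) + w

    labels = {node: node for node in neighbors}  # every node starts in its own community
    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbors):           # fixed order keeps the result deterministic
            score = defaultdict(float)
            for neighbor, w in neighbors[node].items():
                score[labels[neighbor]] += w
            best = min(score, key=lambda label: (-score[label], label))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels, neighbors


def community_degree_centrality(labels, neighbors):
    """Community sizes, plus degree centrality normalized by community scale (assumed form)."""
    sizes = Counter(labels.values())
    centrality = {}
    for node, label in labels.items():
        inside = sum(1 for neighbor in neighbors[node] if labels[neighbor] == label)
        size = sizes[label]
        centrality[node] = inside / (size - 1) if size > 1 else 0.0
    return sizes, centrality
```

The resulting centrality values would then serve as one input feature of the risk prediction model. The model itself is not described in the claim and is not sketched here.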

9. The method of knowledge-graph-based risk management according to claim 1, wherein said risk managing an enterprise according to the aggregation weight comprises:

performing label propagation on the nodes of the knowledge graph according to the aggregation weight; and

performing risk prediction according to the label propagation result.

10. The method of knowledge-graph-based risk management according to any one of claims 1-9, further comprising visualizing the knowledge-graph, wherein the visualizing the knowledge-graph comprises: at least one of node retrieval based on the knowledge graph, sub-graph walking based on the knowledge graph, path exploration based on the knowledge graph, and self-loop exploration based on the knowledge graph.

11. The method of knowledge-graph-based risk management according to claim 10, wherein:

the node retrieval comprises: in response to an input node name, displaying a node related to the node name and associated information of the node;

the subgraph walking comprises: in response to an operation of manually clicking an event relation, displaying the node corresponding to the clicked event relation and all edges emanating from the node;

the path exploration comprises: acquiring and displaying a path relation between two nodes; and

the self-loop exploration comprises: acquiring and displaying nodes having a closed-loop path relation.
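Path exploration and self-loop exploration can be sketched on a plain adjacency mapping. The claim does not say which kind of path is displayed or how closed loops are detected, so the sketch assumes a shortest directed path found by breadth-first search, and treats a node as lying on a closed loop when one of its successors can reach it again. The function names and the graph representation (a dict from node to list of successor nodes) are hypothetical.

```python
from collections import deque


def find_path(graph, source, target):
    """Shortest directed path from source to target as a list of nodes, or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for successor in graph.get(node, ()):
            if successor not in previous:
                previous[successor] = node
                queue.append(successor)
    return None


def nodes_on_closed_loops(graph):
    """Nodes that can reach themselves again through one or more directed edges."""
    return {
        start
        for start in graph
        if any(find_path(graph, successor, start) is not None
               for successor in graph[start])
    }
```

For a graph such as `{"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": []}`, `find_path(graph, "A", "D")` follows A, B, C, D, and `nodes_on_closed_loops(graph)` contains A, B, and C but not D. In a risk setting, such loops might correspond to circular guarantees or circular fund flows.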

12. A knowledge-graph-based risk management device, comprising:

an acquisition module, configured to acquire enterprise data, wherein the enterprise data comprises parties and an event relation between the parties, the event relation comprises m category data, each category data comprises n event information, m is an integer greater than or equal to 1, and n is an integer greater than or equal to 1;

a construction module, configured to construct a knowledge graph according to the enterprise data, where the party and the event information are nodes of the knowledge graph, the event relationship is an edge of the knowledge graph, and the event information is used to explain the corresponding edge;

a first calculation module, configured to calculate, based on the knowledge graph constructed for each of the category data, a reduction weight of the n event information edges between two of the parties;

a second calculation module, configured to calculate an aggregation weight ω_vu of the m category data edges between the two parties according to the reduction weight; and

a management module, configured to perform risk management on the enterprise according to the aggregation weight.

13. An electronic device, comprising:

one or more processors; and

one or more memories for storing executable instructions that, when executed by the one or more processors, implement the method of any one of claims 1-11.

14. A computer-readable storage medium having stored thereon executable instructions that when executed by a processor implement a method according to any one of claims 1 to 11.