CN109597894B - Correlation model generation method and device, and data correlation method and device - Google Patents


Publication number
CN109597894B
Authority
CN
China
Prior art keywords
entity
service
external data
index set
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811159278.8A
Other languages
Chinese (zh)
Other versions
CN109597894A (en)
Inventor
杨树波
于君泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd
Priority to CN201811159278.8A
Publication of CN109597894A
Application granted
Publication of CN109597894B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Asset management; Financial planning or analysis

Abstract

The specification provides a method and device for generating an association model and a method and device for associating data. The data association method comprises: obtaining external data, wherein the external data comprises regulatory treaties, policy regulations, cases and/or news; preprocessing the external data to obtain a treaty index set corresponding to the external data; performing entity extraction according to the treaty index set to obtain a first entity corresponding to the treaty index set; and obtaining, according to the first entity and a pre-generated association model, service data associated with the first entity and a first association degree.

Description

Correlation model generation method and device, and data correlation method and device
Technical Field
The present application relates to the field of computer automatic relationship recognition technology, and in particular, to a method and apparatus for generating a correlation model, a method and apparatus for associating data, a computing device, and a storage medium.
Background
As new technologies continue to emerge, conventional regulatory compliance methods struggle to keep pace with the rapid development of the financial technology industry. Traditional financial companies rely on the experience of regulatory compliance professionals to analyze and judge whether a business is compliant, which is inefficient and demands extensive regulatory compliance experience from personnel.
Disclosure of Invention
In view of the above, the embodiments of the present application provide a method and apparatus for generating a correlation model, a method and apparatus for correlating data, a computing device and a storage medium, so as to solve the technical defects existing in the prior art.
In a first aspect, an embodiment of the present specification discloses a method for generating a correlation model, including:
acquiring external data and business data, wherein the external data comprises regulatory treaties, policy regulations, cases and/or news;
preprocessing the external data and the service data to respectively obtain a treaty index set corresponding to the external data and a service index set corresponding to the service data;
entity extraction is carried out according to the treaty index set and the business index set, and a first entity corresponding to the treaty index set and a second entity corresponding to the business index set are respectively obtained;
determining an entity relationship between the first entity and the second entity;
training an association model through the first entity, the second entity and the entity relation to obtain the association model, wherein the association model enables the first entity to be associated with the second entity, and the association degree of the first entity and the second entity is output.
In a second aspect, embodiments of the present disclosure disclose a data association method, including:
obtaining external data, wherein the external data comprises regulatory treaties, policy regulations, cases and/or news;
preprocessing the external data and obtaining a treaty index set corresponding to the external data;
entity extraction is carried out according to the treaty index set, and a first entity corresponding to the treaty index set is obtained;
and obtaining service data and a first association degree associated with the first entity according to the first entity and a pre-generated association model.
In a third aspect, an embodiment of the present disclosure discloses a data association method, including:
acquiring service data;
preprocessing the service data and obtaining a service index set corresponding to the service data;
entity extraction is carried out according to the service index set, and a second entity corresponding to the service index set is obtained;
and obtaining external data and a second association degree associated with the second entity according to the second entity and a pre-generated association model, wherein the external data comprises regulatory treaties, policy regulations, cases and/or news.
In a fourth aspect, an embodiment of the present specification discloses a correlation model generating apparatus, including:
a first acquisition module configured to acquire external data and business data, wherein the external data includes regulatory treaties, policy regulations, cases, and/or news;
the first preprocessing module is configured to preprocess the external data and the service data to respectively obtain a treaty index set corresponding to the external data and a service index set corresponding to the service data;
the first extraction module is configured to perform entity extraction according to the treaty index set and the business index set to respectively obtain a first entity corresponding to the treaty index set and a second entity corresponding to the business index set;
a first determination module configured to determine an entity relationship between the first entity and the second entity;
the first training module is configured to train the association model through the first entity, the second entity and the entity relation to obtain the association model, the association model enables the first entity to be associated with the second entity, and the association degree of the first entity and the second entity is output.
In a fifth aspect, embodiments of the present disclosure disclose a data association apparatus, including:
a second acquisition module configured to acquire external data, wherein the external data includes regulatory treaties, policy regulations, cases, and/or news;
the second preprocessing module is configured to preprocess the external data and obtain a treaty index set corresponding to the external data;
the second extraction module is configured to perform entity extraction according to the treaty index set and obtain a first entity corresponding to the treaty index set;
and the first obtaining module is configured to obtain service data and a first association degree associated with the first entity according to the first entity and a pre-generated association model.
In a sixth aspect, embodiments of the present disclosure disclose a data association apparatus, including:
a third acquisition module configured to acquire service data;
the third preprocessing module is configured to preprocess the service data and obtain a service index set corresponding to the service data;
the third extraction module is configured to perform entity extraction according to the service index set and obtain a second entity corresponding to the service index set;
and a second obtaining module configured to obtain external data and a second degree of association associated with the second entity according to the second entity and a pre-generated association model, wherein the external data comprises regulatory treaties, policy regulations, cases and/or news.
In a seventh aspect, embodiments of the present specification also disclose a computing device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, wherein the processor, when executing the instructions, implements the steps of the correlation model generation method or the data correlation method described above.
In an eighth aspect, the present specification embodiment also discloses a computer-readable storage medium storing computer instructions that, when executed by a processor, implement the steps of the correlation model generation method or the data correlation method described above.
The specification provides a method and a device for generating a correlation model, a method and a device for correlating data, a computing device and a storage medium. The data correlation method comprises: obtaining external data, wherein the external data comprises regulatory treaties, policy regulations, cases and/or news; preprocessing the external data to obtain a treaty index set corresponding to the external data; performing entity extraction according to the treaty index set to obtain a first entity corresponding to the treaty index set; and obtaining, according to the first entity and a pre-generated association model, service data associated with the first entity and a first association degree.
Drawings
FIG. 1 is a flowchart of a method for generating a correlation model according to an embodiment of the present disclosure;
FIG. 2 is a flowchart of a method for generating a correlation model according to an embodiment of the present disclosure;
FIG. 3 is a flow chart of a method for associating data according to an embodiment of the present disclosure;
FIG. 4 is a flow chart of a method for associating data according to an embodiment of the present disclosure;
FIG. 5 is a flow chart of a method for associating data according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of an association model generating apparatus according to an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a data association device according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a data association device according to an embodiment of the present disclosure;
FIG. 9 is a block diagram of a computing device according to an embodiment of the present disclosure.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. The present application may be embodied in many other forms than those herein described, and those skilled in the art will readily appreciate that the present application may be similarly embodied without departing from the spirit or essential characteristics thereof, and therefore the present application is not limited to the specific embodiments disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first may also be referred to as a second, and similarly, a second may also be referred to as a first, without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination", depending on the context.
First, terms related to one or more embodiments of the present application will be explained.
Compliance: the business activities of a commercial bank are consistent with laws, rules and guidelines.
Quantification: a goal or task is specifically defined and can be clearly measured.
Knowledge graph: a knowledge graph is essentially a semantic network, a graph-based data structure consisting of nodes (points) and edges. In a knowledge graph, each node represents an "entity" existing in the real world, and each edge represents a "relationship" between entities. A knowledge graph is a highly effective way to represent relationships. Put simply, a knowledge graph is a relationship network obtained by linking together all kinds of heterogeneous information. Knowledge graphs provide the ability to analyze problems from a "relational" perspective.
NLP: natural language processing.
In the present application, a method and apparatus for generating a correlation model, a method and apparatus for correlating data, a computing device and a storage medium are provided, and detailed description is given one by one in the following embodiments.
Referring to FIG. 1, one or more embodiments of the present description provide a flowchart of a correlation model generation method.
As can be seen from fig. 1, the association model comprises input parameters and output parameters, wherein the input parameters comprise a first entity, a second entity and an entity relationship between the first entity and the second entity.
The first entity is obtained in the following manner:
obtaining external data, wherein the external data comprises regulatory treaties, policy regulations, cases and/or news;
preprocessing the external data and obtaining a treaty index set corresponding to the external data;
and extracting the entity according to the treaty index set, and obtaining a first entity corresponding to the treaty index set.
The second entity is obtained in the following manner:
acquiring service data;
preprocessing the service data and obtaining a service index set corresponding to the service data;
and extracting the entity according to the service index set, and obtaining a second entity corresponding to the service index set.
In addition, the entity relationship between the first entity and the second entity comprises a first entity relationship and a second entity relationship, wherein the first entity relationship is an initial relationship between the first entity and the second entity established through expert experience, and the second entity relationship is a potential relationship which is deduced through a pre-generated association model according to a knowledge graph established by the first entity, the second entity and the initial relationship.
The output of the association model is either the second entity and a first association degree, given the first entity, or the first entity and a second association degree, given the second entity. The first association degree is the degree of influence of the first entity on the second entity, and the second association degree is the degree of influence of the second entity on the first entity.
For example, the first entity is an entity extracted from the treaty index set formed from regulatory treaties, policy regulations, cases or news, and the second entity is an entity extracted from the business index set formed by refining the business of each product line. After the first entity, the second entity and the relationships between them are modeled with machine learning, the association model serves two goals. First, from the perspective of a treaty, it shows which services the treaty affects and to what extent. Second, from the perspective of a business, it shows which treaties affect the business and to what extent.
When an industry policy changes or a penalty case occurs, the pre-generated association model can identify which company products or businesses are affected and to what extent; when the company's products or services are adjusted or added, the pre-generated association model can identify which regulations affect those products or services and to what extent.
In one or more embodiments of the present disclosure, the generated association model adopts a common machine learning algorithm, rules, and the like to perform compliance risk identification, so that compliance of the service can be intelligently identified, and the rule information and the service are integrated in a graph association relationship through a knowledge graph technology, and then potential relationship reasoning is performed on the graph structure by adopting the machine learning technology, so that compliance risk of each service can be more comprehensively identified.
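The two lookup directions described above can be sketched as a small in-memory index. This is only an illustration: the class name `AssociationIndex`, its methods, the entity names and the degree values are all hypothetical, and a single degree is stored per pair for simplicity, whereas the specification distinguishes a first and a second association degree.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class AssociationIndex:
    """Hypothetical stand-in for the output side of a trained association model."""

    def __init__(self) -> None:
        # first entity (treaty side) -> {second entity (business side): degree}
        self._by_first: Dict[str, Dict[str, float]] = defaultdict(dict)
        # second entity (business side) -> {first entity (treaty side): degree}
        self._by_second: Dict[str, Dict[str, float]] = defaultdict(dict)

    def add(self, first_entity: str, second_entity: str, degree: float) -> None:
        # Simplifying assumption: the same degree is used in both directions.
        self._by_first[first_entity][second_entity] = degree
        self._by_second[second_entity][first_entity] = degree

    def affected_services(self, first_entity: str) -> List[Tuple[str, float]]:
        """Treaty perspective: which services are affected, strongest first."""
        pairs = self._by_first.get(first_entity, {})
        return sorted(pairs.items(), key=lambda kv: kv[1], reverse=True)

    def applicable_treaties(self, second_entity: str) -> List[Tuple[str, float]]:
        """Business perspective: which treaties affect a service, strongest first."""
        pairs = self._by_second.get(second_entity, {})
        return sorted(pairs.items(), key=lambda kv: kv[1], reverse=True)


index = AssociationIndex()
index.add("anti-money-laundering rule", "third-party payment", 0.9)
index.add("anti-money-laundering rule", "micro-loan", 0.4)
index.add("consumer credit rule", "micro-loan", 0.8)

print(index.affected_services("anti-money-laundering rule"))
print(index.applicable_treaties("micro-loan"))
```

Keeping two dictionaries makes each direction a direct lookup; a real model would compute the degrees rather than store hand-written values.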
Referring to fig. 2, one or more embodiments of the present disclosure provide a flowchart of a correlation model generation method, including steps 202 to 210.
Step 202: external data and business data are obtained, wherein the external data includes regulatory treaties, policy regulations, cases, and/or news.
In one or more embodiments of the present disclosure, the external data includes, but is not limited to, regulatory treaties, policy regulations, cases, and/or news, and may include release meeting information of some competitors or other industry-related information, etc.
The business data includes, but is not limited to, business data and risk data for business lines such as Alipay, wealth management, micro-loans, insurance, international business, payment finance, and Koubei.
Step 204: preprocessing the external data and the service data to respectively obtain a treaty index set corresponding to the external data and a service index set corresponding to the service data.
In one or more embodiments of the present disclosure, preprocessing the external data and the service data includes the steps of:
Step one: analyzing the external data with natural language processing technology, and converting the analyzed external data into indexes related to the service to form the treaty index set.
In one or more embodiments of the present disclosure, analyzing the external data with natural language processing (NLP) technology means normalizing the text of the obtained external data, segmenting it into words, extracting keywords, performing semantic understanding, and then decomposing the text.
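A minimal sketch of the normalization, word segmentation and keyword extraction stages, assuming English text and simple frequency-based keywords. The function names, the stopword list and the sample notice are hypothetical; Chinese text would need a dedicated word segmenter, and semantic understanding is not covered here.

```python
import re
from collections import Counter
from typing import List

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "for", "to", "and", "in", "on", "is", "was"}


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def segment(text: str) -> List[str]:
    """Split normalized text into word tokens, keeping hyphenated words whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text)


def extract_keywords(text: str, top_n: int = 3) -> List[str]:
    """Return the most frequent non-stopword tokens, ties broken alphabetically."""
    tokens = [t for t in segment(normalize(text)) if t not in STOPWORDS]
    ranked = sorted(Counter(tokens).items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]


notice = (
    "The regulator fined a payment company. "
    "The payment company failed to verify payment accounts."
)
print(extract_keywords(notice))  # ['payment', 'company', 'accounts']
```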
Converting the analyzed external data into service-related indexes to form the treaty index set proceeds as follows. Product information related to the service is extracted from the decomposed external data, where each product has a set of general service indexes that describe its service conditions. A mapping relationship is then established between the decomposed external data and the service indexes of each product; that is, the external data is converted into service-related indexes. Through this processing, the possible impact of external data on internal services can be perceived.
For example, if decomposing a piece of external data shows that it relates to a product, say third-party payment, then converting the external data into service-related indexes means determining which service indexes the third-party payment product has and establishing a mapping relationship between those service indexes and the external data. Establishing this mapping relationship is the conversion.
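The mapping step in this example can be sketched as follows. The catalogue `PRODUCT_INDEXES`, the function `map_to_indexes` and the substring matching rule are assumptions made for illustration; "loan balance" and "overdue rate" are invented index names, while "transaction amount" and "number of users" follow the payment example given in this description.

```python
from typing import Dict, List

# Hypothetical catalogue: each product has a set of general service indexes.
PRODUCT_INDEXES: Dict[str, List[str]] = {
    "third-party payment": ["transaction amount", "number of users"],
    "micro-loan": ["loan balance", "overdue rate"],
}


def map_to_indexes(document: str, product_indexes: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map a decomposed external document to the service indexes of each product it mentions.

    Matching is a plain case-insensitive substring test, which stands in for
    the NLP-based product recognition described in the text.
    """
    text = document.lower()
    return {
        product: list(indexes)
        for product, indexes in product_indexes.items()
        if product in text
    }


doc = "Regulator issues new rules for third-party payment institutions"
mapping = map_to_indexes(doc, PRODUCT_INDEXES)
print(mapping)
```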
Step two: extracting the service indexes of the service data according to preset conditions to form the service index set.
In one or more embodiments of the present disclosure, the preset conditions include, but are not limited to, theme, etc., and may be set according to actual requirements, which is not limited in any way in the present disclosure.
Extracting service indexes of the service data according to preset conditions means dividing internal services by topic and producing the service index sets in a uniform way. A service index set may include, but is not limited to, related information such as the product, the data source, the caliber (the standard adopted for the statistics), or an index keyword description.
In actual use, indexes are extracted for each product line's service to form the service index set. These indexes are the set of general service indexes of each product mentioned above, and they fully reflect the development state of each product line's service.
In practice, the service indexes extracted differ from one product line to another. For example, for a payment product, the extracted service indexes may include the transaction amount, the number of users, and the like.
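One possible shape for a service index set is a list of records grouped by topic. The sketch below assumes a hypothetical `ServiceIndex` record carrying the fields named above (product, data source, caliber); the topics, data sources and caliber descriptions are invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class ServiceIndex:
    """Hypothetical record for one service index."""
    name: str
    product: str
    topic: str
    data_source: str
    caliber: str  # the standard adopted for the statistic


def build_service_index_set(
    indexes: Iterable[ServiceIndex], topics: Iterable[str]
) -> Dict[str, List[ServiceIndex]]:
    """Group indexes by topic, keeping only the topics named in the preset condition."""
    wanted = set(topics)
    grouped: Dict[str, List[ServiceIndex]] = defaultdict(list)
    for index in indexes:
        if index.topic in wanted:
            grouped[index.topic].append(index)
    return dict(grouped)


catalog = [
    ServiceIndex("transaction amount", "payment", "payment", "orders table", "daily settled total"),
    ServiceIndex("number of users", "payment", "payment", "accounts table", "monthly active accounts"),
    ServiceIndex("loan balance", "micro-loan", "credit", "loans table", "month-end outstanding principal"),
]

index_set = build_service_index_set(catalog, topics=["payment"])
print([index.name for index in index_set["payment"]])
```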
The relationship between the treaty index set and the business index set is illustrated with a practical case. Suppose the external data is a case in which a regulatory agency penalizes a third-party company. The case information is extracted to determine which product of the third-party company the penalty targets, what penalty the regulatory agency imposed on the company, and what the cost is.
The external data is then analyzed to determine which product the penalty corresponds to, and the product information is used to check whether the company has such a product. If it does, the product has a corresponding index system, and associating that index system with the penalty shows the influence of the penalty on the company's product.
Step 206: extracting entities according to the treaty index set and the business index set to respectively obtain a first entity corresponding to the treaty index set and a second entity corresponding to the business index set.
In one or more embodiments of the present disclosure, the treaty index set includes legal treaties, case information, industry information, business experience documents, etc. related to the financial industry.
Performing entity extraction according to the treaty index set to obtain the first entity corresponding to the treaty index set comprises:
extracting the entities in the treaty index set using NLP technology, wherein the NLP technology comprises basic capabilities such as building word vectors for the financial industry, named entity recognition, extraction of keywords characteristic of the financial industry, and extraction of article center words, and these basic capabilities are used to extract the corresponding entities and attributes from a large treaty index set.
In one or more embodiments of the present disclosure, the service index set includes structured service indexes and basic information about external companies. The structured service indexes may include, but are not limited to, the product name, the launch time, and the like. The basic information about an external company may include, but is not limited to, its business registration information, legal person, equity, business complaints, and the like.
Performing entity extraction according to the service index set to obtain the second entity corresponding to the service index set comprises:
extracting the entities of the service index set according to expert experience and the structure of the knowledge graph, wherein the structure of the knowledge graph comprises domains, types, attributes and the like.
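The two extraction paths can be sketched under strong simplifications: a dictionary lookup stands in for named entity recognition on the treaty side, and a fixed schema of domain, type and attributes stands in for the expert-defined knowledge graph structure on the business side. The `Entity` class, the `GAZETTEER` contents and the record keys are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:
    """Hypothetical entity shaped after the knowledge graph structure: domain, type, attributes."""
    name: str
    domain: str
    entity_type: str
    attributes: Dict[str, str] = field(default_factory=dict)


# Hypothetical term dictionary standing in for a trained named-entity recognizer.
GAZETTEER: Dict[str, str] = {
    "anti-money laundering": "regulation topic",
    "central bank": "regulator",
}


def extract_first_entities(text: str) -> List[Entity]:
    """Find known regulatory terms in treaty-side text (dictionary lookup, not real NER)."""
    lowered = text.lower()
    return [
        Entity(name=term, domain="regulation", entity_type=entity_type)
        for term, entity_type in GAZETTEER.items()
        if term in lowered
    ]


def extract_second_entity(record: Dict[str, str]) -> Entity:
    """Turn one structured service record into a product entity with attributes."""
    attributes = {key: value for key, value in record.items() if key != "product_name"}
    return Entity(
        name=record["product_name"],
        domain="business",
        entity_type="product",
        attributes=attributes,
    )


first = extract_first_entities("The central bank tightened anti-money laundering checks.")
second = extract_second_entity({"product_name": "third-party payment", "launch_time": "2015"})
print([entity.name for entity in first], second)
```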
Step 208: an entity relationship between the first entity and the second entity is determined.
In one or more embodiments of the present disclosure, referring to fig. 3, determining the entity relationship between the first entity and the second entity includes steps 302 to 306.
Step 302: a first entity relationship between the first entity and the second entity is determined.
In one or more embodiments of the present disclosure, the first entity relationship between the first entity and the second entity may be established according to an expert experience.
Step 304: constructing a knowledge graph according to the first entity, the second entity and the first entity relationship.
Step 306: obtaining a second entity relationship between the first entity and the second entity according to the knowledge graph and a pre-generated association model.
In one or more embodiments of the present description, potential relationships that cannot be determined from expert experience need to be mined by inference using machine learning.
First, a knowledge graph is constructed from the first entity, the second entity and the first entity relationship; then a second entity relationship between the first entity and the second entity is obtained according to the knowledge graph and a pre-generated association model.
In practical use, constructing the knowledge graph means building a relationship network of the first entity and the second entity, in which the first entity and the second entity are nodes. A random walk algorithm then samples each node in the relationship network in turn to generate node sequences. Finally, each node in the node sequences is vectorized based on a network embedding learning model, and the knowledge graph is constructed from the vectorized representation of each node.
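The random walk sampling and node vectorization can be sketched in plain Python. Note the substitution: instead of a learned network embedding model, the sketch builds co-occurrence count vectors from the walks, which is a much simpler stand-in chosen to keep the example self-contained. The edges, walk parameters and function names are assumptions.

```python
import math
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Graph = Dict[str, List[str]]


def build_graph(edges: Sequence[Tuple[str, str]]) -> Graph:
    """Build an undirected relationship network from (first entity, second entity) pairs."""
    graph: Dict[str, List[str]] = defaultdict(list)
    for a, b in edges:
        graph[a].append(b)
        graph[b].append(a)
    return dict(graph)


def random_walks(graph: Graph, walk_length: int, walks_per_node: int, seed: int = 0) -> List[List[str]]:
    """Start walks from every node in turn; each step moves to a uniformly random neighbor."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in graph:
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks


def cooccurrence_vectors(walks: List[List[str]], nodes: List[str], window: int = 2) -> Dict[str, List[float]]:
    """Vectorize nodes by counting which nodes appear near them in the walks.

    This is a stand-in for a network embedding learning model, not one itself.
    """
    position = {node: i for i, node in enumerate(nodes)}
    vectors = {node: [0.0] * len(nodes) for node in nodes}
    for walk in walks:
        for i, node in enumerate(walk):
            for j in range(max(0, i - window), min(len(walk), i + window + 1)):
                if j != i:
                    vectors[node][position[walk[j]]] += 1.0
    return vectors


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical expert-defined initial relationships.
edges = [
    ("AML rule", "third-party payment"),
    ("AML rule", "micro-loan"),
    ("credit rule", "micro-loan"),
]
graph = build_graph(edges)
walks = random_walks(graph, walk_length=5, walks_per_node=10)
vectors = cooccurrence_vectors(walks, list(graph))
print(cosine(vectors["credit rule"], vectors["third-party payment"]))
```

A similarity score between two nodes that share no direct edge, such as "credit rule" and "third-party payment" here, is one way a potential relationship could be surfaced for review.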
Step 210: training an association model through the first entity, the second entity and the entity relation to obtain the association model, wherein the association model enables the first entity to be associated with the second entity, and the association degree of the first entity and the second entity is output.
In one or more embodiments of the present disclosure, the output of the association model is either the second entity and a first association degree, given the first entity, or the first entity and a second association degree, given the second entity. The first association degree is the degree of influence of the first entity on the second entity, and the second association degree is the degree of influence of the second entity on the first entity.
In one or more embodiments of the present disclosure, the generated association model adopts a machine learning algorithm, expert experience, and the like to perform compliance risk identification, so that the compliance risk of the service can be intelligently identified, external data, such as the structure of regulatory compliance information, and service data, such as the internal service development state, can be quantized, and the quantized information is integrated in a graph association relationship through a knowledge graph, and then a machine learning technology is adopted to perform potential relationship reasoning on the graph structure, so that the association model trained according to the relationship among the entities of the external data, the service data, and the knowledge graph can more comprehensively identify the compliance risk of each service.
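The specification does not name a particular learning algorithm, so the following is only one possible sketch: a tiny logistic regression trained by gradient descent on entity pairs, whose output probability is read as the association degree. The two features (whether an expert linked the pair, and a graph-based similarity score), the training samples and all function names are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (pair features, 1 if associated else 0)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(samples: List[Sample], epochs: int = 1000, lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit logistic regression weights with full-batch gradient descent."""
    n_features = len(samples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in samples:
            score = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(score) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias


def association_degree(weights: List[float], bias: float, features: Sequence[float]) -> float:
    """Predicted probability that the pair is associated, used here as the association degree."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


# Hypothetical training pairs: [expert link present (0 or 1), graph similarity score].
training_pairs: List[Sample] = [
    ([1.0, 0.9], 1),
    ([1.0, 0.7], 1),
    ([0.0, 0.2], 0),
    ([0.0, 0.1], 0),
]
weights, bias = train(training_pairs)

strong = association_degree(weights, bias, [1.0, 0.8])
weak = association_degree(weights, bias, [0.0, 0.15])
print(strong > weak)
```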
Referring to fig. 4, one or more embodiments of the present description provide a data association method comprising steps 402 to 408.
Step 402: external data is obtained, wherein the external data includes regulatory treaties, policy regulations, cases, and/or news.
Step 404: preprocessing the external data to obtain a treaty index set corresponding to the external data.
Step 406: extracting entities according to the treaty index set to obtain a first entity corresponding to the treaty index set.
Step 408: obtaining service data and a first association degree associated with the first entity according to the first entity and a pre-generated association model.
In one or more embodiments of the present description, the external data may be obtained through a crawler system.
In addition, for the preprocessing of the external data and the extracting of the first entity, reference may be made to the above embodiments, which are not described in detail in this specification.
In one or more embodiments of the present disclosure, if the external data includes an industry policy change or a penalty case, the pre-generated association model can identify what company products the industry policy change or penalty case affects and to what extent.
In one or more embodiments of the present disclosure, the data association method identifies the service data associated with the external data based on a pre-generated association model and determines the degree of influence of the external data on the service data. The service data associated with the external data, and the degree of that influence, are thus analyzed and identified automatically and comprehensively, with high efficiency and accuracy.
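Steps 404 to 408 can be sketched as a small pipeline in which the preprocessing and extraction stages are passed in as functions. Everything here is hypothetical: the table `model` stands in for the pre-generated association model, the token-based stages are toy placeholders, and the `min_degree` threshold is an added convenience not found in the specification.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical table standing in for the pre-generated association model:
# first entity -> {service data name: first association degree}
AssociationTable = Dict[str, Dict[str, float]]


def associate_external_data(
    external_text: str,
    preprocess: Callable[[str], List[str]],
    extract_entities: Callable[[List[str]], List[str]],
    model: AssociationTable,
    min_degree: float = 0.0,
) -> List[Tuple[str, str, float]]:
    """Return (first entity, service data, first association degree), strongest first."""
    index_set = preprocess(external_text)          # step 404
    first_entities = extract_entities(index_set)   # step 406
    results = []
    for entity in first_entities:                  # step 408
        for service, degree in model.get(entity, {}).items():
            if degree >= min_degree:
                results.append((entity, service, degree))
    return sorted(results, key=lambda item: item[2], reverse=True)


model: AssociationTable = {"penalty": {"third-party payment": 0.9, "micro-loan": 0.3}}


def simple_preprocess(text: str) -> List[str]:
    """Toy preprocessing: lowercase and split on whitespace."""
    return text.lower().split()


def simple_extract(tokens: List[str]) -> List[str]:
    """Toy extraction: keep tokens the model knows as first entities."""
    return [token for token in tokens if token in model]


hits = associate_external_data("Regulator announces penalty", simple_preprocess, simple_extract, model)
print(hits)
```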
Referring to fig. 5, one or more embodiments of the present description provide a data association method, including steps 502 through 508.
Step 502: acquiring service data.
Step 504: preprocessing the service data to obtain a service index set corresponding to the service data.
Step 506: extracting entities according to the service index set to obtain a second entity corresponding to the service index set.
Step 508: external data and a second degree of association associated with the second entity are obtained according to the second entity and a pre-trained association model, wherein the external data includes regulatory treaties, policy regulations, cases and/or news.
In one or more embodiments of the present disclosure, the preprocessing of the service data and the extracting of the second entity may refer to the foregoing embodiments, which are not repeated herein.
In one or more embodiments of the present disclosure, if the business data includes a product, when the product is adjusted or added, the pre-generated association model can identify which external data the product will be affected by and to what extent.
In one or more embodiments of the present disclosure, the data association method identifies the external data that affects the service data based on a pre-generated association model, and determines the degree of influence of the identified external data on the service data. The external data associated with the service data and the degree of influence, expressed as the association degree, are thus analyzed and identified automatically and comprehensively, with high efficiency and accuracy.
Referring to fig. 6, one or more embodiments of the present disclosure provide an association model generating apparatus, including:
a first acquisition module 602 configured to acquire external data and business data, wherein the external data includes regulatory treaties, policy regulations, cases, and/or news;
a first preprocessing module 604, configured to preprocess the external data and the service data, so as to obtain a treaty index set corresponding to the external data and a service index set corresponding to the service data respectively;
a first extraction module 606, configured to perform entity extraction according to the treaty index set and the service index set, so as to obtain a first entity corresponding to the treaty index set and a second entity corresponding to the service index set respectively;
a first determination module 608 configured to determine an entity relationship between the first entity and the second entity;
a first training module 610 configured to train an association model using the first entity, the second entity and the entity relationship, wherein the association model associates the first entity with the second entity and outputs the degree of association between the first entity and the second entity.
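By way of non-limiting illustration, the training performed by the first training module can be sketched with a one-feature logistic regression. The specification does not name a learning algorithm; the Jaccard overlap feature, the gradient-descent settings and the function names below are assumptions chosen only to show how labelled entity pairs could yield a model that outputs an association degree.

```python
import math
from typing import List, Set, Tuple

# (index terms of first entity, index terms of second entity, 1 if related else 0)
Sample = Tuple[Set[str], Set[str], int]


def overlap(first: Set[str], second: Set[str]) -> float:
    """Jaccard overlap between the index terms of two entities (an assumed feature)."""
    union = first | second
    return len(first & second) / len(union) if union else 0.0


def train(samples: List[Sample], epochs: int = 2000, lr: float = 0.5) -> Tuple[float, float]:
    """Fit the weight and bias of a one-feature logistic regression by gradient descent."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for first, second, label in samples:
            x = overlap(first, second)
            error = 1.0 / (1.0 + math.exp(-(weight * x + bias))) - label
            grad_w += error * x
            grad_b += error
        weight -= lr * grad_w / len(samples)
        bias -= lr * grad_b / len(samples)
    return weight, bias


def association_degree(model: Tuple[float, float], first: Set[str], second: Set[str]) -> float:
    """Association degree in the open interval (0, 1) for one first/second entity pair."""
    weight, bias = model
    return 1.0 / (1.0 + math.exp(-(weight * overlap(first, second) + bias)))
```

In this sketch the labels play the role of the entity relationship, and the sigmoid output plays the role of the association degree; a practical model would use richer features than term overlap.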
Optionally, the first preprocessing module 604 includes:
a first analysis submodule configured to analyze the external data by adopting a natural language processing technology, and convert the analyzed external data into indexes related to the service to form the treaty index set; and
a first extraction submodule configured to extract service indexes of the service data according to preset conditions to form the service index set.
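By way of non-limiting illustration, the two submodules can be sketched as follows. The keyword vocabulary `TERM_TO_INDEX` is a crude stand-in for the natural language processing step, and treating the preset conditions as a list of field names is likewise an assumption; neither is taken from the specification.

```python
import re
from typing import Dict, List, Set

# Hypothetical vocabulary mapping words found in external data to service-related indexes.
TERM_TO_INDEX: Dict[str, str] = {
    "refund": "refund_period",
    "interest": "interest_rate",
    "identity": "identity_verification",
}


def treaty_index_set(external_text: str) -> Set[str]:
    """Tokenise the text (a stand-in for NLP parsing) and map known words to indexes."""
    tokens = re.findall(r"[a-z]+", external_text.lower())
    return {TERM_TO_INDEX[token] for token in tokens if token in TERM_TO_INDEX}


def service_index_set(service_data: Dict[str, object], preset_fields: List[str]) -> Set[str]:
    """Keep only the fields named by the preset condition and present in the service data."""
    return {field for field in preset_fields if field in service_data}
```

Because both functions emit names from the same index vocabulary, a treaty index such as `interest_rate` can be matched directly against a service index of the same name.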
Optionally, the first determination module 608 includes:
a first entity relationship determination submodule configured to determine a first entity relationship between the first entity and the second entity;
a knowledge graph construction sub-module configured to construct a knowledge graph from the first entity, the second entity, and the first entity relationship;
a second entity relationship determination submodule configured to obtain a second entity relationship between the first entity and the second entity according to the knowledge graph and a pre-generated association model.
Optionally, the first acquisition module 602 is further configured to obtain the external data through a crawler system.
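By way of non-limiting illustration, the three submodules can be sketched as follows. The specification does not state how the second entity relationship is inferred from the knowledge graph, so this sketch substitutes a simple path-based heuristic for the pre-generated association model: an unlinked pair is proposed when a three-edge path joins it. The function names and the heuristic itself are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Edges = Dict[str, Set[str]]


def build_graph(first_relations: Iterable[Tuple[str, str]]) -> Tuple[Edges, Edges]:
    """Store expert-defined (first entity, second entity) edges in both directions."""
    by_first: Edges = defaultdict(set)
    by_second: Edges = defaultdict(set)
    for first, second in first_relations:
        by_first[first].add(second)
        by_second[second].add(first)
    return by_first, by_second


def infer_second_relations(first_relations: Iterable[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Propose unlinked (first, second) pairs joined by a three-edge path in the graph."""
    by_first, by_second = build_graph(list(first_relations))
    inferred: Set[Tuple[str, str]] = set()
    for first, seconds in by_first.items():
        for second in seconds:
            for sibling in by_second[second]:        # first entities sharing this neighbour
                for candidate in by_first[sibling]:  # second entities the sibling is linked to
                    if candidate not in seconds:
                        inferred.add((first, candidate))
    return inferred
```

In this sketch, if two treaty entities both relate to the same service entity and one of them also relates to a further service entity, the other is proposed as potentially related to that further entity as well.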
In one or more embodiments of the present disclosure, the association model generating apparatus uses common machine learning algorithms and rules to perform compliance risk identification, so that the compliance of a service can be identified intelligently. Knowledge graph technology integrates the association relationships between treaty information and services, and machine learning is then used to infer potential relationships over the graph structure, so that the compliance risk of each service can be identified more comprehensively.
Referring to fig. 7, one or more embodiments of the present disclosure provide a data association apparatus, including:
a second acquisition module 702 configured to acquire external data, wherein the external data comprises regulatory treaties, policy regulations, cases, and/or news;
a second preprocessing module 704, configured to preprocess the external data, and obtain a treaty index set corresponding to the external data;
a second extraction module 706, configured to perform entity extraction according to the treaty index set, and obtain a first entity corresponding to the treaty index set;
a first obtaining module 708 configured to obtain, according to the first entity and a pre-generated association model, service data associated with the first entity and a first degree of association.
Optionally, the second preprocessing module 704 is further configured to:
analyze the external data by adopting a natural language processing technology, and convert the analyzed external data into indexes related to the service to form the treaty index set.
Optionally, the second acquisition module 702 is configured to obtain the external data through a crawler system.
In one or more embodiments of the present disclosure, the data association device identifies the service data associated with the external data based on a pre-generated association model, and determines the degree of influence of the external data on that service data. The associated service data and the degree of influence, expressed as the association degree, are thus analyzed and identified automatically and comprehensively, with high efficiency and accuracy.
Referring to fig. 8, one or more embodiments of the present disclosure provide a data association apparatus, including:
a third acquisition module 802 configured to acquire service data;
a third preprocessing module 804, configured to preprocess the service data, and obtain a service index set corresponding to the service data;
a third extraction module 806, configured to perform entity extraction according to the service index set, and obtain a second entity corresponding to the service index set;
a second obtaining module 808 configured to obtain external data and a second degree of association associated with the second entity according to the second entity and a pre-generated association model, wherein the external data comprises regulatory treaties, policy regulations, cases and/or news.
Optionally, the third preprocessing module 804 is configured to:
extract the service indexes of the service data according to preset conditions to form the service index set.
In one or more embodiments of the present disclosure, the data association device identifies the external data that affects the service data based on a pre-generated association model, and determines the degree of influence of the identified external data on the service data. The external data associated with the service data and the degree of influence, expressed as the association degree, are thus analyzed and identified automatically and comprehensively, with high efficiency and accuracy.
Fig. 9 is a block diagram illustrating a configuration of a computing device 100 according to an embodiment of the present description. The components of the computing device 100 include, but are not limited to, a memory 110 and a processor 120. Processor 120 is coupled to memory 110 via bus 130 and database 150 is used to store data.
Computing device 100 also includes access device 140, which enables computing device 100 to communicate via one or more networks 160. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the Internet. The access device 140 may include one or more of any type of network interface, wired or wireless (e.g., a Network Interface Card (NIC)), such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the other components of computing device 100 described above and not shown in FIG. 9 may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device illustrated in FIG. 9 is for exemplary purposes only and is not intended to limit the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 100 may be any type of stationary or mobile computing device including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 100 may also be a mobile or stationary server.
The computing device comprises a memory, a processor and computer instructions stored on the memory and executable on the processor, wherein the processor executes the instructions to implement the steps of the association model generation method or the data association method as described above.
An embodiment of the present application also provides a computer-readable storage medium storing computer instructions that, when executed by a processor, implement the steps of the correlation model generation method or the data correlation method described above.
The above is an illustrative scheme of the computer-readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solution of the association model generating method or the data association method; for details of the storage medium that are not described here, reference may be made to the description of the association model generating method or the data association method.
The technical carriers involved in payment in the embodiments of the application may include near field communication (Near Field Communication, NFC), Wi-Fi, 3G/4G/5G, POS machine card swiping technology, two-dimensional code scanning technology, bar code scanning technology, Bluetooth, infrared, short message (Short Message Service, SMS), multimedia message (Multimedia Message Service, MMS) and the like.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a given jurisdiction; for example, in certain jurisdictions, in accordance with legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
It should be noted that, for simplicity of description, the foregoing method embodiments are each expressed as a series of combined actions; however, those skilled in the art should understand that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in another order or simultaneously. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all required for the present application.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the application disclosed above are intended only to assist in the explanation of the application. The alternative embodiments do not describe every detail exhaustively, nor do they limit the application to the precise forms disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the application and its practical application, and thereby to enable others skilled in the art to best understand and utilize the application. The application is limited only by the claims and their full scope and equivalents.

Claims (18)

1. A correlation model generation method, comprising:
acquiring external data and business data, wherein the external data comprises supervision treaties, policy regulations, cases and/or news;
preprocessing the external data and the service data to respectively obtain a treaty index set corresponding to the external data and a service index set corresponding to the service data, wherein a mapping relation exists between the treaty index in the treaty index set and the service index in the service index set;
performing entity extraction according to the treaty index set and the service index set to obtain, respectively, a first entity corresponding to the treaty index set and a second entity corresponding to the service index set;
determining an entity relationship between the first entity and the second entity, wherein the entity relationship comprises a first entity relationship and a second entity relationship, the first entity relationship is established through expert experience, and the second entity relationship is obtained by constructing a knowledge graph according to the first entity, the second entity and the first entity relationship and reasoning with a pre-generated association model;
training an association model through the first entity, the second entity and the entity relationship to obtain the association model, wherein the association model associates the first entity with the second entity and outputs the association degree of the first entity and the second entity, and the association degree is used for representing the degree of influence of the external data on the service data.
2. The method of claim 1, wherein preprocessing the external data and the service data to obtain a set of treaty indicators corresponding to the external data and a set of service indicators corresponding to the service data, respectively, comprises:
analyzing the external data by adopting a natural language processing technology, and converting the analyzed external data into indexes related to business to form the treaty index set; and
extracting the service indexes of the service data according to preset conditions to form the service index set.
3. The method of claim 1, wherein obtaining external data comprises:
acquiring the external data through a crawler system.
4. A method of data association, comprising:
obtaining external data, wherein the external data comprises regulatory treaties, policy regulations, cases and/or news;
preprocessing the external data and obtaining a treaty index set corresponding to the external data;
performing entity extraction according to the treaty index set to obtain a first entity corresponding to the treaty index set;
training an association model through the first entity, a second entity determined based on service data and an entity relationship between the first entity and the second entity to obtain the association model, wherein the entity relationship comprises a first entity relationship and a second entity relationship, the first entity relationship is established through expert experience, and the second entity relationship is obtained by constructing a knowledge graph according to the first entity, the second entity and the first entity relationship and reasoning with a pre-generated association model;
and obtaining service data and a first association degree associated with the first entity according to the first entity and the pre-generated association model.
5. The method of claim 4, wherein preprocessing the external data and obtaining a set of treaty indicators corresponding to the external data comprises:
analyzing the external data by adopting a natural language processing technology, and converting the analyzed external data into indexes related to the service to form the treaty index set.
6. The method of claim 4, wherein obtaining external data comprises:
acquiring the external data through a crawler system.
7. A method of data association, comprising:
acquiring service data;
preprocessing the service data and obtaining a service index set corresponding to the service data;
performing entity extraction according to the service index set to obtain a second entity corresponding to the service index set;
training an association model through the second entity, a first entity determined based on external data and an entity relationship between the first entity and the second entity to obtain the association model, wherein the entity relationship comprises a first entity relationship and a second entity relationship, the first entity relationship is established through expert experience, and the second entity relationship is obtained by constructing a knowledge graph according to the first entity, the second entity and the first entity relationship and reasoning with a pre-generated association model;
and obtaining external data and a second association degree associated with the second entity according to the second entity and the pre-generated association model, wherein the external data comprises supervision treaties, policy regulations, cases and/or news.
8. The method of claim 7, wherein preprocessing the service data and obtaining a service indicator set corresponding to the service data comprises:
extracting the service indexes of the service data according to preset conditions to form the service index set.
9. An association model generation apparatus, comprising:
a first acquisition module configured to acquire external data and business data, wherein the external data includes regulatory treaties, policy regulations, cases, and/or news;
the first preprocessing module is configured to preprocess the external data and the service data to respectively obtain a treaty index set corresponding to the external data and a service index set corresponding to the service data, wherein a mapping relation exists between the treaty index in the treaty index set and the service index in the service index set;
a first extraction module configured to perform entity extraction according to the treaty index set and the service index set to respectively obtain a first entity corresponding to the treaty index set and a second entity corresponding to the service index set;
a first determining module configured to determine an entity relationship between the first entity and the second entity, wherein the entity relationship comprises a first entity relationship and a second entity relationship, the first entity relationship is established through expert experience, and the second entity relationship is obtained by constructing a knowledge graph according to the first entity, the second entity and the first entity relationship and reasoning with a pre-generated association model;
a first training module configured to train an association model through the first entity, the second entity and the entity relationship to obtain the association model, wherein the association model associates the first entity with the second entity and outputs the association degree of the first entity and the second entity, and the association degree is used for representing the degree of influence of the external data on the service data.
10. The apparatus of claim 9, wherein the first preprocessing module comprises:
a first analysis submodule configured to analyze the external data by adopting a natural language processing technology, and convert the analyzed external data into indexes related to the service to form the treaty index set; and
a first extraction submodule configured to extract service indexes of the service data according to preset conditions to form the service index set.
11. The apparatus of claim 9, wherein the first acquisition module is further configured to acquire external data via a crawler system.
12. A data association apparatus, comprising:
a second acquisition module configured to acquire external data, wherein the external data includes regulatory treaties, policy regulations, cases, and/or news;
a second preprocessing module configured to preprocess the external data and obtain a treaty index set corresponding to the external data;
a second extraction module configured to perform entity extraction according to the treaty index set and obtain a first entity corresponding to the treaty index set;
a first obtaining module configured to train an association model through the first entity, a second entity determined based on service data and an entity relationship between the first entity and the second entity to obtain the association model, wherein the entity relationship comprises a first entity relationship and a second entity relationship, the first entity relationship is established through expert experience, and the second entity relationship is obtained by constructing a knowledge graph according to the first entity, the second entity and the first entity relationship and reasoning with a pre-generated association model; and to obtain service data and a first association degree associated with the first entity according to the first entity and the pre-generated association model.
13. The apparatus of claim 12, wherein the second preprocessing module is further configured to:
analyze the external data by adopting a natural language processing technology, and convert the analyzed external data into indexes related to the service to form the treaty index set.
14. The apparatus of claim 12, wherein the second acquisition module is configured to acquire external data via a crawler system.
15. A data association apparatus, comprising:
a third acquisition module configured to acquire service data;
a third preprocessing module configured to preprocess the service data and obtain a service index set corresponding to the service data;
a third extraction module configured to perform entity extraction according to the service index set and obtain a second entity corresponding to the service index set;
a second obtaining module configured to train an association model through the second entity, a first entity determined based on external data and an entity relationship between the first entity and the second entity to obtain the association model, wherein the entity relationship comprises a first entity relationship and a second entity relationship, the first entity relationship is established through expert experience, and the second entity relationship is obtained by constructing a knowledge graph according to the first entity, the second entity and the first entity relationship and reasoning with a pre-generated association model; and to obtain external data and a second association degree associated with the second entity according to the second entity and the pre-generated association model, wherein the external data comprises supervision treaties, policy regulations, cases and/or news.
16. The apparatus of claim 15, wherein the third preprocessing module is configured to:
extract the service indexes of the service data according to preset conditions to form the service index set.
17. A computing device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, wherein the processor, when executing the instructions, implements the steps of the method of any one of claims 1-3, 4-6, or 7-8.
18. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the method of any one of claims 1-3, 4-6 or 7-8.
CN201811159278.8A 2018-09-30 2018-09-30 Correlation model generation method and device, and data correlation method and device Active CN109597894B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811159278.8A CN109597894B (en) 2018-09-30 2018-09-30 Correlation model generation method and device, and data correlation method and device

Publications (2)

Publication Number Publication Date
CN109597894A CN109597894A (en) 2019-04-09
CN109597894B true CN109597894B (en) 2023-10-03

Family

ID=65957345

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811159278.8A Active CN109597894B (en) 2018-09-30 2018-09-30 Correlation model generation method and device, and data correlation method and device

Country Status (1)

Country Link
CN (1) CN109597894B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110187678B (en) * 2019-04-19 2021-11-05 广东省智能制造研究所 Information storage and digital application system of processing equipment in manufacturing industry
CN111488741A (en) * 2020-04-14 2020-08-04 税友软件集团股份有限公司 Tax knowledge data semantic annotation method and related device
CN111754104A (en) * 2020-06-22 2020-10-09 平安资产管理有限责任公司 Service index execution method and system
CN112749284B (en) * 2020-12-31 2021-12-17 平安科技(深圳)有限公司 Knowledge graph construction method, device, equipment and storage medium

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104133848A (en) * 2014-07-01 2014-11-05 中央民族大学 Tibetan language entity knowledge information extraction method
CN105095195A (en) * 2015-07-03 2015-11-25 北京京东尚科信息技术有限公司 Method and system for human-machine questioning and answering based on knowledge graph
CN105468583A (en) * 2015-12-09 2016-04-06 百度在线网络技术(北京)有限公司 Entity relationship obtaining method and device
CN106354710A (en) * 2016-08-18 2017-01-25 清华大学 Neural network relation extracting method
CN106372118A (en) * 2016-08-24 2017-02-01 武汉烽火普天信息技术有限公司 Large-scale media text data-oriented online semantic comprehension search system and method
CN106776711A (en) * 2016-11-14 2017-05-31 浙江大学 A kind of Chinese medical knowledge mapping construction method based on deep learning
CN106815293A (en) * 2016-12-08 2017-06-09 中国电子科技集团公司第三十二研究所 System and method for constructing knowledge graph for information analysis
CN107273349A (en) * 2017-05-09 2017-10-20 清华大学 A kind of entity relation extraction method and server based on multilingual
CN107291687A (en) * 2017-04-27 2017-10-24 同济大学 It is a kind of based on interdependent semantic Chinese unsupervised open entity relation extraction method
CN107358315A (en) * 2017-06-26 2017-11-17 深圳市金立通信设备有限公司 A kind of information forecasting method and terminal
CN107657063A (en) * 2017-10-30 2018-02-02 合肥工业大学 The construction method and device of medical knowledge collection of illustrative plates
CN107665252A (en) * 2017-09-27 2018-02-06 深圳证券信息有限公司 A kind of method and device of creation of knowledge collection of illustrative plates
CN107783973A (en) * 2016-08-24 2018-03-09 慧科讯业有限公司 The methods, devices and systems being monitored based on domain knowledge spectrum data storehouse to the Internet media event
CN107909274A (en) * 2017-11-17 2018-04-13 平安科技(深圳)有限公司 Enterprise investment methods of risk assessment, device and storage medium
CN107967267A (en) * 2016-10-18 2018-04-27 中兴通讯股份有限公司 A kind of knowledge mapping construction method, apparatus and system
CN108460136A (en) * 2018-03-08 2018-08-28 国网福建省电力有限公司 Electric power O&M information knowledge map construction method
CN108563620A (en) * 2018-04-13 2018-09-21 上海财梵泰传媒科技有限公司 The automatic writing method of text and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180253650A9 (en) * 2014-08-06 2018-09-06 Prysm, Inc. Knowledge To User Mapping in Knowledge Automation System
US20160041720A1 (en) * 2014-08-06 2016-02-11 Kaybus, Inc. Knowledge automation system user interface

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
甘丽新等.基于句法语义特征的中文实体关系抽取.《计算机研究与发展》.2016,284-302. *

Also Published As

Publication number Publication date
CN109597894A (en) 2019-04-09

Similar Documents

Publication Publication Date Title
CN109597894B (en) Correlation model generation method and device, and data correlation method and device
WO2021047186A1 (en) Method, apparatus, device, and storage medium for processing consultation dialogue
CN110020660A (en) Use the integrity assessment of the unstructured process of artificial intelligence (AI) technology
CN110929043B (en) Service problem extraction method and device
Alamsyah et al. Dynamic large scale data on twitter using sentiment analysis and topic modeling
CN107402912B (en) Method and device for analyzing semantics
CN110633577B (en) Text desensitization method and device
CN109933782B (en) User emotion prediction method and device
CN110472008B (en) Intelligent interaction method and device
CN110019758B (en) Core element extraction method and device and electronic equipment
KR102100214B1 (en) Method and apparatus for analysing sales conversation based on voice recognition
CN110955750A (en) Combined identification method and device for comment area and emotion polarity, and electronic equipment
Wiratama et al. Sentiment analysis of application user feedback in Bahasa Indonesia using multinomial naive bayes
CN111242710A (en) Business classification processing method and device, service platform and storage medium
CN108197106B (en) Product competition analysis method, device and system based on deep learning
CN115099310A (en) Method and device for training model and classifying enterprises
CN107609921A (en) Data processing method and server
CN111639494A (en) Case affair relation determining method and system
CN113869049B (en) Fact extraction method and device with legal attribute based on legal consultation problem
CN114356860A (en) Dialog generation method and device
CN111552846B (en) Method and device for identifying suspicious relationships
CN112528887B (en) Auditing method and device
CN115080732A (en) Complaint work order processing method and device, electronic equipment and storage medium
CN111784313B (en) Service processing method and device
CN115618968B (en) New idea discovery method and device, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200924

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20200924

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: P.O. Box 847, fourth floor, Capital Building, Grand Cayman, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

GR01 Patent grant