CN109597894A - Correlation model generation method and device, and data association method and device - Google Patents


Info

Publication number
CN109597894A
Authority
CN
China
Prior art keywords
instance
external data
provision
business datum
operational indicator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811159278.8A
Other languages
Chinese (zh)
Other versions
CN109597894B (en)
Inventor
杨树波
于君泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201811159278.8A
Publication of CN109597894A
Application granted
Publication of CN109597894B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Asset management; Financial planning or analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Human Resources & Organizations (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This specification provides a correlation model generation method and device, and a data association method and device. The data association method includes: obtaining external data, where the external data includes regulatory provisions, policies and regulations, cases and/or news; preprocessing the external data to obtain a provision index set corresponding to the external data; performing entity extraction on the provision index set to obtain a first entity corresponding to the provision index set; and obtaining, according to the first entity and a pre-generated correlation model, the business data associated with the first entity together with a first degree of association.

Description

Correlation model generation method and device, and data association method and device
Technical field
This application relates to the technical field of automatic relationship recognition by computer, and in particular to a correlation model generation method and device, a data association method and device, a computing device, and a storage medium.
Background
As new technologies emerge, traditional regulatory compliance methods struggle to keep pace with the rapid development of the financial technology industry. Traditional financial companies rely on the experience of regulatory compliance professionals to analyze and judge the compliance of their business. This is inefficient, and it also demands extensive regulatory compliance experience from the personnel involved.
Summary of the invention
In view of this, the embodiments of this application provide a correlation model generation method and device, a data association method and device, a computing device, and a storage medium, to address technical deficiencies in the prior art.
In a first aspect, an embodiment of this specification discloses a correlation model generation method, comprising:
Obtaining external data and business data, where the external data includes regulatory provisions, policies and regulations, cases and/or news;
Preprocessing the external data and the business data to obtain, respectively, a provision index set corresponding to the external data and an operational indicator set corresponding to the business data;
Performing entity extraction on the provision index set and the operational indicator set to obtain, respectively, a first entity corresponding to the provision index set and a second entity corresponding to the operational indicator set;
Determining the entity relationship between the first entity and the second entity;
Training a correlation model with the first entity, the second entity, and the entity relationship to obtain the correlation model, where the correlation model associates the first entity with the second entity and outputs the degree of association between the first entity and the second entity.
In a second aspect, an embodiment of this specification discloses a data association method, comprising:
Obtaining external data, where the external data includes regulatory provisions, policies and regulations, cases and/or news;
Preprocessing the external data to obtain a provision index set corresponding to the external data;
Performing entity extraction on the provision index set to obtain a first entity corresponding to the provision index set;
Obtaining, according to the first entity and a pre-generated correlation model, the business data associated with the first entity and a first degree of association.
In a third aspect, an embodiment of this specification discloses a data association method, comprising:
Obtaining business data;
Preprocessing the business data to obtain an operational indicator set corresponding to the business data;
Performing entity extraction on the operational indicator set to obtain a second entity corresponding to the operational indicator set;
Obtaining, according to the second entity and a pre-generated correlation model, the external data associated with the second entity and a second degree of association, where the external data includes regulatory provisions, policies and regulations, cases and/or news.
In a fourth aspect, an embodiment of this specification discloses a correlation model generation device, comprising:
A first acquisition module, configured to obtain external data and business data, where the external data includes regulatory provisions, policies and regulations, cases and/or news;
A first preprocessing module, configured to preprocess the external data and the business data to obtain, respectively, a provision index set corresponding to the external data and an operational indicator set corresponding to the business data;
A first extraction module, configured to perform entity extraction on the provision index set and the operational indicator set to obtain, respectively, a first entity corresponding to the provision index set and a second entity corresponding to the operational indicator set;
A first determining module, configured to determine the entity relationship between the first entity and the second entity;
A first training module, configured to train a correlation model with the first entity, the second entity, and the entity relationship to obtain the correlation model, where the correlation model associates the first entity with the second entity and outputs the degree of association between the first entity and the second entity.
In a fifth aspect, an embodiment of this specification discloses a data association device, comprising:
A second acquisition module, configured to obtain external data, where the external data includes regulatory provisions, policies and regulations, cases and/or news;
A second preprocessing module, configured to preprocess the external data to obtain a provision index set corresponding to the external data;
A second extraction module, configured to perform entity extraction on the provision index set to obtain a first entity corresponding to the provision index set;
A first obtaining module, configured to obtain, according to the first entity and a pre-generated correlation model, the business data associated with the first entity and a first degree of association.
In a sixth aspect, an embodiment of this specification discloses a data association device, comprising:
A third acquisition module, configured to obtain business data;
A third preprocessing module, configured to preprocess the business data to obtain an operational indicator set corresponding to the business data;
A third extraction module, configured to perform entity extraction on the operational indicator set to obtain a second entity corresponding to the operational indicator set;
A second obtaining module, configured to obtain, according to the second entity and a pre-generated correlation model, the external data associated with the second entity and a second degree of association, where the external data includes regulatory provisions, policies and regulations, cases and/or news.
In a seventh aspect, an embodiment of this specification further discloses a computing device, including a memory, a processor, and computer instructions stored in the memory and executable on the processor, where the processor, when executing the instructions, implements the steps of the correlation model generation method or the data association method described above.
In an eighth aspect, an embodiment of this specification further discloses a computer-readable storage medium storing computer instructions that, when executed by a processor, implement the steps of the correlation model generation method or the data association method described above.
This specification provides a correlation model generation method and device, a data association method and device, a computing device, and a storage medium. The data association method includes: obtaining external data, where the external data includes regulatory provisions, policies and regulations, cases and/or news; preprocessing the external data to obtain a provision index set corresponding to the external data; performing entity extraction on the provision index set to obtain a first entity corresponding to the provision index set; and obtaining, according to the first entity and a pre-generated correlation model, the business data associated with the first entity and a first degree of association.
Brief description of the drawings
Fig. 1 is a flowchart of a correlation model generation method provided by an embodiment of this specification;
Fig. 2 is a flowchart of a correlation model generation method provided by an embodiment of this specification;
Fig. 3 is a flowchart of a data association method provided by an embodiment of this specification;
Fig. 4 is a flowchart of a data association method provided by an embodiment of this specification;
Fig. 5 is a flowchart of a data association method provided by an embodiment of this specification;
Fig. 6 is a schematic structural diagram of a correlation model generation device provided by an embodiment of this specification;
Fig. 7 is a schematic structural diagram of a data association device provided by an embodiment of this specification;
Fig. 8 is a schematic structural diagram of a data association device provided by an embodiment of this specification;
Fig. 9 is a structural block diagram of a computing device provided by an embodiment of this specification.
Detailed description
Many specific details are set forth in the following description to provide a thorough understanding of this application. However, this application can be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from the substance of this application. This application is therefore not limited by the specific implementations disclosed below.
The terms used in one or more embodiments of this specification are for the purpose of describing particular embodiments only and are not intended to limit one or more embodiments of this specification. The singular forms "a", "said" and "the" used in one or more embodiments of this specification and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" used in one or more embodiments of this specification refers to and includes any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, and so on may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of one or more embodiments of this specification, a first may also be referred to as a second, and similarly a second may also be referred to as a first. Depending on the context, the word "if" as used here can be interpreted as "when", "upon", or "in response to determining".
First, the terms involved in one or more embodiments of the invention are explained.
Compliance: the business activities of a commercial bank being consistent with laws, rules, and standards.
Quantification: making a target or task specific so that it can be clearly measured.
Knowledge graph: a knowledge graph is essentially a semantic network, a graph-based data structure composed of nodes (Point) and edges (Edge). In a knowledge graph, each node represents an "entity" that exists in the real world, and each edge represents a "relationship" between entities. A knowledge graph is the most effective way to represent relationships. Generally speaking, a knowledge graph is a relationship network obtained by linking together all kinds of heterogeneous information (Heterogeneous Information). A knowledge graph provides the ability to analyze problems from the perspective of "relationships".
NLP: short for natural language processing.
This application provides a correlation model generation method and device, a data association method and device, a computing device, and a storage medium, which are described in detail one by one in the following embodiments.
Referring to Fig. 1, one or more embodiments of this specification provide a flowchart of a correlation model generation method.
As shown in Fig. 1, the correlation model includes input parameters and output parameters, where the input parameters include a first entity, a second entity, and the entity relationship between the first entity and the second entity.
The first entity is obtained as follows:
Obtain external data, where the external data includes regulatory provisions, policies and regulations, cases and/or news;
Preprocess the external data to obtain the provision index set corresponding to the external data;
Perform entity extraction on the provision index set to obtain the first entity corresponding to the provision index set.
The second entity is obtained as follows:
Obtain business data;
Preprocess the business data to obtain the operational indicator set corresponding to the business data;
Perform entity extraction on the operational indicator set to obtain the second entity corresponding to the operational indicator set.
In addition, the entity relationship between the first entity and the second entity includes a first entity relationship and a second entity relationship. The first entity relationship is the initial relationship between the first entity and the second entity established from expert experience. The second entity relationship is a potential relationship inferred by the pre-generated correlation model from a knowledge graph built on the first entity, the second entity, and the initial relationship.
The output parameters of the correlation model include the second entity and a first degree of association, output according to the first entity, or the first entity and a second degree of association, output according to the second entity. The first degree of association is the degree of influence of the first entity on the second entity, and the second degree of association is the degree of influence of the second entity on the first entity.
For example, the first entity is an entity extracted from the provision index set formed from regulatory provisions, policies and regulations, cases, or current news, and the second entity is an entity extracted from the operational indicator set formed by refining the business of each product line. After machine learning techniques are used to build the correlation model from the first entity, the second entity, and the relationship between them, the correlation model achieves two goals. First, from the provision perspective, it shows which businesses a given provision affects and to what degree. Second, from the business perspective, it shows which provisions affect a given business and to what degree.
That is, when a policy changes or a penalty case occurs in the industry, the pre-generated correlation model can identify which products or businesses are affected and to what degree. Likewise, when the company adjusts or adds a product or business, the pre-generated correlation model can identify which provisions affect that product or business and to what degree.
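The two query directions described above can be pictured with a small lookup structure. The following is only an illustrative sketch: the class `AssociationIndex`, its method names, and the sample entities and scores are all hypothetical, and a real correlation model would compute the degrees of association rather than store hand-entered values.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Scored = List[Tuple[str, float]]


class AssociationIndex:
    """Hypothetical store for the scored links a correlation model might output."""

    def __init__(self) -> None:
        # first entity (provision side) -> {second entity: first degree of association}
        self._first_to_second: Dict[str, Dict[str, float]] = defaultdict(dict)
        # second entity (business side) -> {first entity: second degree of association}
        self._second_to_first: Dict[str, Dict[str, float]] = defaultdict(dict)

    def add(self, first_entity: str, second_entity: str,
            first_degree: float, second_degree: float) -> None:
        self._first_to_second[first_entity][second_entity] = first_degree
        self._second_to_first[second_entity][first_entity] = second_degree

    def affected_business(self, first_entity: str) -> Scored:
        """Provision perspective: which businesses are affected, strongest first."""
        return self._ranked(self._first_to_second.get(first_entity, {}))

    def applicable_provisions(self, second_entity: str) -> Scored:
        """Business perspective: which provisions apply, strongest first."""
        return self._ranked(self._second_to_first.get(second_entity, {}))

    @staticmethod
    def _ranked(links: Dict[str, float]) -> Scored:
        return sorted(links.items(), key=lambda item: item[1], reverse=True)


# Made-up sample links for illustration only.
index = AssociationIndex()
index.add("reserve-fund custody rule", "third-party payment", 0.9, 0.7)
index.add("reserve-fund custody rule", "wealth management", 0.4, 0.2)
index.add("online lending limit", "micro-loan", 0.8, 0.6)
```

Keeping two separate maps reflects the text's distinction between the first and second degrees of association, which need not be equal.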
In one or more embodiments of this specification, the generated correlation model uses common machine learning algorithms, rules, and the like to identify compliance risk, so the compliance of a business can be recognized intelligently. Knowledge graph technology integrates provision information and business information into a graph of association relationships, and machine learning techniques then infer potential relationships on the graph structure, so that the compliance risk of each business can be identified more comprehensively.
Referring to Fig. 2, one or more embodiments of this specification provide a flowchart of a correlation model generation method, including step 202 to step 210.
Step 202: Obtain external data and business data, where the external data includes regulatory provisions, policies and regulations, cases and/or news.
In one or more embodiments of this specification, the external data includes but is not limited to regulatory provisions, policies and regulations, cases and/or news, and may also include information from competitors' press conferences or other industry-related information.
The business data includes but is not limited to business data from Alipay, wealth management, micro-loans, insurance, international business, payment finance, Koubei, and risk data.
Step 204: Preprocess the external data and the business data to obtain, respectively, the provision index set corresponding to the external data and the operational indicator set corresponding to the business data.
In one or more embodiments of this specification, preprocessing the external data and the business data includes the following steps.
Step 1: Analyze the external data using natural language processing technology, and convert the analyzed external data into business-related indicators to form the provision index set.
In one or more embodiments of this specification, analyzing the external data using natural language processing technology means using natural language processing (NLP) technology to normalize, tokenize, extract keywords from, and semantically interpret the text of the acquired external data, and then break the text down into its components.
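As a rough illustration of the normalization, tokenization, and keyword extraction stages named above, the sketch below uses only the Python standard library. It is an assumption-laden toy: the stop-word list, the function names, and the frequency-based keyword ranking are all hypothetical, the semantic understanding stage is not modeled, and real Chinese regulatory text would need a proper word segmenter.

```python
import re
import unicodedata
from collections import Counter
from typing import List

# Hypothetical stop-word list; a production system would use a domain-specific one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "or", "in", "for",
             "must", "be", "shall", "with", "by", "on"}


def normalize(text: str) -> str:
    """Unify character forms and case, and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text).lower()
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> List[str]:
    """Split normalized text into alphanumeric tokens, keeping hyphenated words whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text)


def extract_keywords(text: str, top_k: int = 3) -> List[str]:
    """Rank non-stop-word tokens by frequency (ties broken alphabetically)."""
    tokens = [t for t in tokenize(normalize(text)) if t not in STOPWORDS]
    counts = Counter(tokens)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_k]]


SAMPLE = ("Payment institutions must deposit customer reserve funds. "
          "Reserve funds of payment institutions shall be held in custody.")
keywords = extract_keywords(SAMPLE)
```

Frequency counting stands in here for whatever keyword extraction the actual system uses; the source does not specify the algorithm.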
Converting the analyzed external data into business-related indicators to form the provision index set means extracting the product information of the business involved in the broken-down external data. Each product has a general set of operational indicators that describe its business status. A mapping relationship is then established between the broken-down external data and the operational indicators of each product, which completes the conversion of the external data into business-related indicators. Processing the data in this way makes it possible to perceive from the external data the impact it may have on internal business.
For example, breaking down a piece of external data may reveal that it involves a certain product, such as third-party payment. Converting that external data into business-related indicators then means establishing a mapping relationship: first determine which operational indicators third-party payment corresponds to, then map those operational indicators to the external data. Establishing this mapping relationship is the conversion.
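The mapping from a piece of external data to the operational indicators of the product it mentions could be sketched as follows. The catalogue contents, the function name `build_provision_index`, and the simple substring match are assumptions made for illustration; the micro-loan indicators in particular are invented placeholders.

```python
from typing import Dict, List

# Hypothetical product -> operational indicator catalogue.
PRODUCT_INDICATORS: Dict[str, List[str]] = {
    "third-party payment": ["trading volume", "transaction amount", "number of users"],
    "micro-loan": ["loan balance", "overdue rate"],  # placeholder indicators
}


def build_provision_index(doc_id: str, text: str,
                          catalogue: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Map one external document to the indicators of every product it mentions."""
    lowered = text.lower()
    rows: List[Dict[str, str]] = []
    for product, indicators in catalogue.items():
        if product in lowered:  # naive substring match stands in for NLP analysis
            for indicator in indicators:
                rows.append({"doc_id": doc_id, "product": product, "indicator": indicator})
    return rows


rows = build_provision_index(
    "case-001",
    "Regulator fines a Third-Party Payment company over reserve funds.",
    PRODUCT_INDICATORS,
)
```

Each returned row is one mapping between the external document and an operational indicator, which is the "conversion" the example describes.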
Step 2: Extract the operational indicators of the business data according to a preset condition to form the operational indicator set.
In one or more embodiments of this specification, the preset condition includes but is not limited to a theme and can be set according to actual needs; this specification places no restriction on it.
Extracting the operational indicators of the business data according to a preset condition means dividing the internal business by theme and outputting a unified operational indicator set. The operational indicator set may include but is not limited to information such as the related product, the data source, the caliber (the standard used for statistics), and a keyword description of the indicator.
In practice, the business of each product line is refined into indicators to form the operational indicator set. These indicators are the general set of operational indicators for each product introduced above, and they can fully reflect the development status of each product line's business.
In practice, the operational indicators extracted differ from one product line to another. For a payment product, for example, the extracted operational indicators may include trading volume, transaction amount, number of users, and so on.
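One possible shape for an entry in the operational indicator set, grouped by theme, is sketched below. The field names follow the information listed above (related product, data source, caliber, keyword description), but the record type `OperationalIndicator`, the helper `group_by_theme`, and all sample values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class OperationalIndicator:
    name: str         # e.g. "transaction amount"
    theme: str        # the preset condition used to divide internal business
    product: str      # related product
    data_source: str  # where the figure is read from
    caliber: str      # the standard used for statistics
    keywords: str     # keyword description of the indicator


def group_by_theme(indicators: Iterable[OperationalIndicator]) -> Dict[str, List[OperationalIndicator]]:
    """Divide indicators by theme to form a unified operational indicator set."""
    grouped: Dict[str, List[OperationalIndicator]] = defaultdict(list)
    for indicator in indicators:
        grouped[indicator.theme].append(indicator)
    return dict(grouped)


# Made-up sample records.
INDICATORS = [
    OperationalIndicator("trading volume", "payment", "third-party payment",
                         "order ledger", "settled orders per day", "payment, volume"),
    OperationalIndicator("number of users", "payment", "third-party payment",
                         "account table", "monthly active accounts", "payment, users"),
    OperationalIndicator("loan balance", "lending", "micro-loan",
                         "loan ledger", "outstanding principal at month end", "loan, balance"),
]
indicator_set = group_by_theme(INDICATORS)
```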
The relationship between the provision index set and the operational indicator set is illustrated below with a real case. Suppose the external data includes a penalty imposed by a regulator on a third-party company. The case information can then be extracted to determine which product of the third-party company the penalty targets, what kind of penalty the regulator imposed on the company, and how large the fine is.
After the external data has been analyzed and the product targeted by the penalty determined, the product information is used to check whether the company itself has such a product. If it does, that product corresponds to an indicator system, and associating that indicator system with the penalty shows the impact of the penalty on the company's own products.
Step 206: Perform entity extraction on the provision index set and the operational indicator set to obtain, respectively, the first entity corresponding to the provision index set and the second entity corresponding to the operational indicator set.
In one or more embodiments of this specification, the provision index set includes laws and provisions relevant to the financial industry, case information, industry news, business experience documents, and so on.
Performing entity extraction on the provision index set to obtain the corresponding first entity then includes:
Extracting the entities in the provision index set using NLP technology. The NLP technology includes basic capabilities such as constructing word vectors for the financial industry, named entity recognition, keyword extraction tailored to the financial industry, and extraction of the central words of articles. These basic capabilities are then used to extract the corresponding entities and attributes from a large number of provision index sets.
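The named entity recognition capability mentioned above is not specified further in the source. As a deliberately simple stand-in, the sketch below matches text against a small dictionary (gazetteer) of known terms; the dictionary contents, the labels, and the function name `extract_entities` are hypothetical, and a real system would more likely use a trained sequence-labeling model.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical gazetteer: surface form -> entity type.
GAZETTEER: Dict[str, str] = {
    "third-party payment": "product",
    "reserve funds": "regulated object",
    "fine": "penalty",
}


def extract_entities(text: str, gazetteer: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (term, type) pairs in order of first appearance in the text."""
    lowered = text.lower()
    found = []
    for term, label in gazetteer.items():
        # Word boundaries prevent "fine" from matching inside "defined".
        for match in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            found.append((match.start(), term, label))
    found.sort()
    return [(term, label) for _, term, label in found]


entities = extract_entities(
    "The regulator imposed a fine on a third-party payment company over reserve funds.",
    GAZETTEER,
)
```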
In one or more embodiments of this specification, the operational indicator set includes structured operational indicators, basic information about external companies, and so on. The structured operational indicators may include but are not limited to the product name, launch time, and so on. The basic information about an external company may include but is not limited to its industrial and commercial registration information, legal representative, equity, industrial and commercial complaints, and so on.
Performing entity extraction on the operational indicator set to obtain the corresponding second entity then includes:
Performing entity extraction on the operational indicator set according to expert experience and the structure of the knowledge graph, where the structure of the knowledge graph includes, for example, domains, types, and attributes.
Step 208: Determine the entity relationship between the first entity and the second entity.
In one or more embodiments of this specification, referring to Fig. 3, determining the entity relationship between the first entity and the second entity includes step 302 to step 306.
Step 302: Determine the first entity relationship between the first entity and the second entity.
In one or more embodiments of this specification, the first entity relationship between the first entity and the second entity can be established from expert experience.
Step 304: Construct a knowledge graph from the first entity, the second entity, and the first entity relationship.
Step 306: Obtain the second entity relationship between the first entity and the second entity according to the knowledge graph and the pre-generated correlation model.
In one or more embodiments of this specification, potential relationships that cannot be determined from expert experience need to be mined by inference using machine learning methods.
A knowledge graph is first constructed from the first entity, the second entity, and the first entity relationship, and the second entity relationship between the first entity and the second entity is then obtained according to the knowledge graph and the pre-generated correlation model.
In practice, constructing the knowledge graph means constructing the relationship network of the first entity and the second entity, in which the first entity and the second entity are the nodes. A random walk algorithm then samples the nodes of the relationship network in sequence to generate node sequences. Finally, a network embedding learning model produces a vector representation of each node in the node sequences, and the knowledge graph is built from these vector representations.
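The random-walk sampling step can be sketched in a few lines. This is a minimal illustration under stated assumptions: the relationship network is treated as an undirected graph, walks are uniform (each neighbor equally likely), and the function names and sample relations are hypothetical. The source does not name the embedding model; one common choice for turning such node sequences into vectors is a skip-gram model, as in DeepWalk, which is not implemented here.

```python
import random
from typing import Dict, List, Tuple


def build_graph(relations: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Build an undirected adjacency list from (first entity, second entity) pairs."""
    graph: Dict[str, List[str]] = {}
    for first_entity, second_entity in relations:
        graph.setdefault(first_entity, []).append(second_entity)
        graph.setdefault(second_entity, []).append(first_entity)
    return graph


def random_walks(graph: Dict[str, List[str]], walk_length: int,
                 walks_per_node: int, seed: int = 0) -> List[List[str]]:
    """Sample fixed-length uniform random walks starting from every node."""
    rng = random.Random(seed)  # seeded so the sampling is repeatable
    walks: List[List[str]] = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbours = graph[walk[-1]]
                if not neighbours:
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Made-up expert-defined initial relations between provisions and products.
RELATIONS = [
    ("provision A", "product X"),
    ("provision A", "product Y"),
    ("provision B", "product Y"),
]
graph = build_graph(RELATIONS)
walks = random_walks(graph, walk_length=4, walks_per_node=2)
```

The resulting node sequences play the role of "sentences" for the embedding model: nodes that co-occur in walks, such as provision B and product X via product Y, end up with similar vectors, which is what lets potential relationships surface.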
Step 210: Train a correlation model with the first entity, the second entity, and the entity relationship to obtain the correlation model, where the correlation model associates the first entity with the second entity and outputs the degree of association between the first entity and the second entity.
In one or more embodiments of this specification, the output of the correlation model includes the second entity and the first degree of association, output according to the first entity, or the first entity and the second degree of association, output according to the second entity. The first degree of association is the degree of influence of the first entity on the second entity, and the second degree of association is the degree of influence of the second entity on the first entity.
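The source does not state how the degree of association is computed. One plausible reading, given that nodes are represented as vectors, is to score a pair of entities by the cosine similarity of their embeddings. The sketch below illustrates that assumption only; the function names and the two-dimensional sample vectors are invented, and cosine similarity is symmetric, so it cannot by itself distinguish the first degree of association from the second.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_associations(query: Sequence[float],
                      candidates: Dict[str, Sequence[float]],
                      top_k: int = 2) -> List[Tuple[str, float]]:
    """Score every candidate entity against the query vector, strongest first."""
    scored = [(name, cosine(query, vector)) for name, vector in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


# Made-up embeddings of second entities (products).
PRODUCT_VECTORS = {
    "third-party payment": [0.9, 0.1],
    "micro-loan": [0.1, 0.9],
    "wealth management": [0.6, 0.6],
}
provision_vector = [1.0, 0.0]  # made-up embedding of a first entity
ranked = rank_associations(provision_vector, PRODUCT_VECTORS)
```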
In one or more embodiments of this specification, the generated correlation model uses machine learning algorithms, expert experience, and the like to identify compliance risk, so the compliance risk of a business can be recognized intelligently. External data, such as regulatory compliance provision information, can be structured, and business data, such as the development status of internal business, can be quantified. The quantified information is then integrated into a graph of association relationships through the knowledge graph, and machine learning techniques infer potential relationships on the graph structure. A correlation model trained on the external data, the business data, and the relationships between the entities in the knowledge graph can therefore identify the compliance risk of each business more comprehensively.
Referring to Fig. 4, one or more embodiments of this specification provide a data association method, including step 402 to step 408.
Step 402: Obtain external data, where the external data includes regulatory provisions, policies and regulations, cases and/or news.
Step 404: Preprocess the external data to obtain the provision index set corresponding to the external data.
Step 406: Perform entity extraction on the provision index set to obtain the first entity corresponding to the provision index set.
Step 408: Obtain, according to the first entity and the pre-generated correlation model, the business data associated with the first entity and the first degree of association.
In one or more embodiments of this specification, the external data can be obtained through a crawler system.
For the preprocessing of the external data and the extraction of the first entity, refer to the embodiments above; they are not repeated here.
In one or more embodiments of this specification, if the external data includes an industry policy change or a penalty case, the pre-generated correlation model can identify which products the policy change or penalty case affects and to what degree.
In one or more embodiments of this specification, the data association method can identify the business data associated with the external data on the basis of the pre-generated correlation model and can determine the degree of influence of the external data on that business data. The system automatically and comprehensively analyzes and identifies the business data associated with the external data together with the degree of association, that is, the degree of influence, with high efficiency and high accuracy.
Referring to Fig. 5, one or more embodiments of this specification provide a data correlation method, including step 502 to step 508.
Step 502: obtain business data.
Step 504: preprocess the business data to obtain the operational indicator set corresponding to the business data.
Step 506: perform entity extraction according to the operational indicator set to obtain the second instance corresponding to the operational indicator set.
Step 508: obtain, according to the second instance and the pre-trained correlation model, the external data associated with the second instance and the second degree of association, wherein the external data includes regulatory provisions, policies and regulations, cases, and/or news.
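The reverse direction, from business data to external data, might look like the sketch below. It assumes a structured business record and treats the product type as the second instance; the field names, the `MODEL` contents, and the scores are hypothetical and serve only to show the ranking in step 508.

```python
from typing import NamedTuple


class Association(NamedTuple):
    external_data: str
    second_degree: float


# Hypothetical model contents: second instance -> associated external data.
MODEL = {
    "consumer loan": [
        Association("Case: penalty for excessive lending rates", 0.4),
        Association("Regulation: interest-rate cap", 0.9),
    ],
    "payment": [Association("News: new payment licensing rules", 0.7)],
}


def operational_indicators(record):
    """Step 504: keep only the fields treated here as operational indicators."""
    wanted = ("product_type", "annual_rate")
    return {key: record[key] for key in wanted if key in record}


def second_instance(indicators):
    """Step 506: in this sketch the product type serves as the second instance."""
    return indicators.get("product_type")


def related_external_data(record, top_k=3):
    """Step 508: rank associated external data by second degree of association."""
    entity = second_instance(operational_indicators(record))
    hits = MODEL.get(entity, [])
    return sorted(hits, key=lambda a: a.second_degree, reverse=True)[:top_k]


record = {"product_type": "consumer loan", "annual_rate": 0.18, "owner": "team-a"}
for hit in related_external_data(record):
    print(f"{hit.second_degree:.2f}  {hit.external_data}")
```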
In one or more embodiments of this specification, for the preprocessing of the business data and the extraction of the second instance, reference may be made to the above embodiments; details are not repeated here.
In one or more embodiments of this specification, if the business data includes a certain product, then when that product is adjusted or a product is added, the pre-generated correlation model can identify which external data affects the product, and to what degree.
In one or more embodiments of this specification, the data correlation method can identify, based on the pre-generated correlation model, the external data that influences the business data and determine the degree of influence of the identified external data on the business data. The system thus automatically and comprehensively analyzes and identifies the external data associated with the business data together with the degree of association, that is, the degree of influence, with high efficiency and high accuracy.
Referring to Fig. 6, one or more embodiments of this specification provide a correlation model generating device, comprising:
a first acquisition module 602, configured to obtain external data and business data, wherein the external data includes regulatory provisions, policies and regulations, cases, and/or news;
a first preprocessing module 604, configured to preprocess the external data and the business data to obtain, respectively, the provision index set corresponding to the external data and the operational indicator set corresponding to the business data;
a first extraction module 606, configured to perform entity extraction according to the provision index set and the operational indicator set to obtain, respectively, the first instance corresponding to the provision index set and the second instance corresponding to the operational indicator set;
a first determining module 608, configured to determine the entity relationship between the first instance and the second instance;
a first training module 610, configured to train a correlation model with the first instance, the second instance, and the entity relationship to obtain the correlation model, wherein the correlation model associates the first instance with the second instance and outputs the degree of association between the first instance and the second instance.
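To make the role of the training module concrete, here is a deliberately simple sketch that estimates a degree of association as the fraction of training samples in which a pair was labeled as related. This frequency estimate is a swapped-in stand-in, not the training procedure of this specification; the function name and the sample data are hypothetical.

```python
from collections import Counter


def train_frequency_model(samples):
    """Estimate a degree of association for each (first, second) pair.

    `samples` holds (first instance, second instance, related) triples, where
    `related` is True when an entity relationship was determined for the pair.
    """
    related_count = Counter()
    total_count = Counter()
    for first, second, related in samples:
        total_count[(first, second)] += 1
        related_count[(first, second)] += int(related)
    return {pair: related_count[pair] / total_count[pair] for pair in total_count}


# Made-up training triples.
samples = [
    ("interest-rate cap", "consumer loan", True),
    ("interest-rate cap", "consumer loan", True),
    ("interest-rate cap", "consumer loan", False),
    ("interest-rate cap", "payment", False),
]
model = train_frequency_model(samples)
```

A real system would likely replace the counting step with a learned model over graph features; the input and output shapes are the part this sketch is meant to convey.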
Optionally, the first preprocessing module 604 includes:
a first analysis submodule, configured to analyze the external data using natural language processing techniques and convert the analyzed external data into business-related indicators to form the provision index set; and
a first extraction submodule, configured to extract the operational indicators of the business data according to preset conditions to form the operational indicator set.
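Extraction according to preset conditions could, for instance, be expressed as a table of threshold rules. The specification does not say what the preset conditions are, so the field names, comparison operators, and thresholds below are assumptions made purely for illustration.

```python
import operator

# Hypothetical preset conditions: (field name, comparison, threshold).
PRESET_CONDITIONS = [
    ("daily_transactions", operator.ge, 10_000),
    ("complaint_rate", operator.gt, 0.01),
    ("overdue_rate", operator.gt, 0.05),
]


def extract_operational_indicators(business_record, conditions=PRESET_CONDITIONS):
    """Return the fields of `business_record` that satisfy a preset condition."""
    indicator_set = {}
    for name, compare, threshold in conditions:
        value = business_record.get(name)
        if value is not None and compare(value, threshold):
            indicator_set[name] = value
    return indicator_set


record = {"daily_transactions": 25_000, "complaint_rate": 0.004, "overdue_rate": 0.08}
indicators = extract_operational_indicators(record)
```

Here the complaint rate is dropped because it does not exceed its threshold, while the other two fields are kept as operational indicators.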
Optionally, the first determining module 608 includes:
a first entity relationship determining submodule, configured to determine a first entity relationship between the first instance and the second instance;
a knowledge graph construction submodule, configured to construct a knowledge graph according to the first instance, the second instance, and the first entity relationship;
a second entity relationship determining submodule, configured to obtain a second entity relationship between the first instance and the second instance according to the knowledge graph and the pre-generated correlation model.
Optionally, the first acquisition module 602 is further configured to obtain the external data by a crawler system.
In one or more embodiments of this specification, the correlation model generating device performs compliance risk identification using common machine learning algorithms, rules, and the like, so that the compliance of a business can be identified intelligently. Knowledge graph technology integrates the provision information and the business into graph association relationships, and machine learning techniques then infer potential relationships on the graph structure, so that the compliance risk of each business can be identified more comprehensively.
Referring to Fig. 7, one or more embodiments of this specification provide a data association device, comprising:
a second acquisition module 702, configured to obtain external data, wherein the external data includes regulatory provisions, policies and regulations, cases, and/or news;
a second preprocessing module 704, configured to preprocess the external data to obtain the provision index set corresponding to the external data;
a second extraction module 706, configured to perform entity extraction according to the provision index set to obtain the first instance corresponding to the provision index set;
a first obtaining module 708, configured to obtain, according to the first instance and the pre-generated correlation model, the business data associated with the first instance and the first degree of association.
Optionally, the second preprocessing module 704 is further configured to:
analyze the external data using natural language processing techniques and convert the analyzed external data into business-related indicators to form the provision index set.
Optionally, the second acquisition module 702 is configured to obtain the external data by a crawler system.
In one or more embodiments of this specification, the data association device can identify, based on the pre-generated correlation model, the business data associated with the external data and determine the degree of influence of the external data on that business data. The system thus automatically and comprehensively analyzes and identifies the business data associated with the external data together with the degree of association, that is, the degree of influence, with high efficiency and high accuracy.
Referring to Fig. 8, one or more embodiments of this specification provide a data association device, comprising:
a third acquisition module 802, configured to obtain business data;
a third preprocessing module 804, configured to preprocess the business data to obtain the operational indicator set corresponding to the business data;
a third extraction module 806, configured to perform entity extraction according to the operational indicator set to obtain the second instance corresponding to the operational indicator set;
a second obtaining module 808, configured to obtain, according to the second instance and the pre-generated correlation model, the external data associated with the second instance and the second degree of association, wherein the external data includes regulatory provisions, policies and regulations, cases, and/or news.
Optionally, the third preprocessing module 804 is configured to:
extract the operational indicators of the business data according to preset conditions to form the operational indicator set.
In one or more embodiments of this specification, the data association device can identify, based on the pre-generated correlation model, the external data that influences the business data and determine the degree of influence of the identified external data on the business data. The system thus automatically and comprehensively analyzes and identifies the external data associated with the business data together with the degree of association, that is, the degree of influence, with high efficiency and high accuracy.
Fig. 9 is a structural block diagram of a computing device 100 according to an embodiment of this specification. The components of the computing device 100 include, but are not limited to, a memory 110 and a processor 120. The processor 120 is connected to the memory 110 via a bus 130, and a database 150 is used to store data.
The computing device 100 further includes an access device 140, which enables the computing device 100 to communicate via one or more networks 160. Examples of these networks include the public switched telephone network (PSTN), a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks such as the Internet. The access device 140 may include one or more of any type of wired or wireless network interface (for example, a network interface card (NIC)), such as an IEEE 802.11 wireless local area network (WLAN) interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a near-field communication (NFC) interface, and so on.
In an embodiment of this specification, the above components of the computing device 100 and other components not shown in Fig. 9 may also be connected to each other, for example via a bus. It should be understood that the structural block diagram of the computing device shown in Fig. 9 is for illustrative purposes only and does not limit the scope of this specification. Those skilled in the art may add or replace other components as needed.
The computing device 100 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (for example, a tablet computer, personal digital assistant, laptop computer, notebook computer, netbook, etc.), a mobile phone (for example, a smartphone), a wearable computing device (for example, a smart watch, smart glasses, etc.) or another type of mobile device, or a stationary computing device such as a desktop computer or PC. The computing device 100 may also be a mobile or stationary server.
The computing device includes a memory, a processor, and computer instructions stored in the memory and executable on the processor; when executing the instructions, the processor implements the steps of the correlation model generation method or the data correlation method described above.
An embodiment of the present application further provides a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the correlation model generation method or the data correlation method described above.
The above is an exemplary scheme of the computer-readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the correlation model generation method or data correlation method described above belong to the same concept; for details not described in the technical solution of the storage medium, refer to the description of the technical solution of the correlation model generation method or data correlation method.
The technical carriers involved in payment described in the embodiments of the present application may include, for example, near-field communication (NFC), Wi-Fi, 3G/4G/5G, POS card-swiping technology, QR code scanning technology, barcode scanning technology, Bluetooth, infrared, Short Message Service (SMS), Multimedia Message Service (MMS), and the like.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content included in the computer-readable medium may be appropriately added to or reduced according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
It should be noted that, for simplicity of description, each of the foregoing method embodiments is described as a series of action combinations, but those skilled in the art should understand that the present application is not limited by the described order of actions, because according to the present application certain steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of other embodiments.
The preferred embodiments of the present application disclosed above are intended only to help explain the present application. The alternative embodiments do not describe all details exhaustively, nor do they limit the invention to the specific implementations described. Obviously, many modifications and variations can be made in light of the content of this specification. These embodiments were selected and specifically described in order to better explain the principles and practical applications of the present application, so that those skilled in the art can well understand and use the present application. The present application is limited only by the claims and their full scope and equivalents.

Claims (20)

1. A correlation model generation method, characterized by comprising:
obtaining external data and business data, wherein the external data includes regulatory provisions, policies and regulations, cases, and/or news;
preprocessing the external data and the business data to obtain, respectively, a provision index set corresponding to the external data and an operational indicator set corresponding to the business data;
performing entity extraction according to the provision index set and the operational indicator set to obtain, respectively, a first instance corresponding to the provision index set and a second instance corresponding to the operational indicator set;
determining an entity relationship between the first instance and the second instance;
training a correlation model with the first instance, the second instance, and the entity relationship to obtain the correlation model, wherein the correlation model associates the first instance with the second instance and outputs a degree of association between the first instance and the second instance.
2. The method according to claim 1, characterized in that preprocessing the external data and the business data to obtain, respectively, the provision index set corresponding to the external data and the operational indicator set corresponding to the business data comprises:
analyzing the external data using natural language processing techniques and converting the analyzed external data into business-related indicators to form the provision index set; and
extracting the operational indicators of the business data according to preset conditions to form the operational indicator set.
3. The method according to claim 1, characterized in that determining the entity relationship between the first instance and the second instance comprises:
determining a first entity relationship between the first instance and the second instance;
constructing a knowledge graph according to the first instance, the second instance, and the first entity relationship;
obtaining a second entity relationship between the first instance and the second instance according to the knowledge graph and a pre-generated correlation model.
4. The method according to claim 1, characterized in that obtaining the external data comprises:
obtaining the external data by a crawler system.
5. A data correlation method, characterized by comprising:
obtaining external data, wherein the external data includes regulatory provisions, policies and regulations, cases, and/or news;
preprocessing the external data to obtain a provision index set corresponding to the external data;
performing entity extraction according to the provision index set to obtain a first instance corresponding to the provision index set;
obtaining, according to the first instance and a pre-generated correlation model, business data associated with the first instance and a first degree of association.
6. The method according to claim 5, characterized in that preprocessing the external data to obtain the provision index set corresponding to the external data comprises:
analyzing the external data using natural language processing techniques and converting the analyzed external data into business-related indicators to form the provision index set.
7. The method according to claim 5, characterized in that obtaining the external data comprises:
obtaining the external data by a crawler system.
8. A data correlation method, characterized by comprising:
obtaining business data;
preprocessing the business data to obtain an operational indicator set corresponding to the business data;
performing entity extraction according to the operational indicator set to obtain a second instance corresponding to the operational indicator set;
obtaining, according to the second instance and a pre-generated correlation model, external data associated with the second instance and a second degree of association, wherein the external data includes regulatory provisions, policies and regulations, cases, and/or news.
9. The method according to claim 8, characterized in that preprocessing the business data to obtain the operational indicator set corresponding to the business data comprises:
extracting the operational indicators of the business data according to preset conditions to form the operational indicator set.
10. A correlation model generating device, characterized by comprising:
a first acquisition module, configured to obtain external data and business data, wherein the external data includes regulatory provisions, policies and regulations, cases, and/or news;
a first preprocessing module, configured to preprocess the external data and the business data to obtain, respectively, a provision index set corresponding to the external data and an operational indicator set corresponding to the business data;
a first extraction module, configured to perform entity extraction according to the provision index set and the operational indicator set to obtain, respectively, a first instance corresponding to the provision index set and a second instance corresponding to the operational indicator set;
a first determining module, configured to determine an entity relationship between the first instance and the second instance;
a first training module, configured to train a correlation model with the first instance, the second instance, and the entity relationship to obtain the correlation model, wherein the correlation model associates the first instance with the second instance and outputs a degree of association between the first instance and the second instance.
11. The device according to claim 10, characterized in that the first preprocessing module comprises:
a first analysis submodule, configured to analyze the external data using natural language processing techniques and convert the analyzed external data into business-related indicators to form the provision index set; and
a first extraction submodule, configured to extract the operational indicators of the business data according to preset conditions to form the operational indicator set.
12. The device according to claim 10, characterized in that the first determining module comprises:
a first entity relationship determining submodule, configured to determine a first entity relationship between the first instance and the second instance;
a knowledge graph construction submodule, configured to construct a knowledge graph according to the first instance, the second instance, and the first entity relationship;
a second entity relationship determining submodule, configured to obtain a second entity relationship between the first instance and the second instance according to the knowledge graph and a pre-generated correlation model.
13. The device according to claim 10, characterized in that the first acquisition module is further configured to obtain the external data by a crawler system.
14. A data association device, characterized by comprising:
a second acquisition module, configured to obtain external data, wherein the external data includes regulatory provisions, policies and regulations, cases, and/or news;
a second preprocessing module, configured to preprocess the external data to obtain a provision index set corresponding to the external data;
a second extraction module, configured to perform entity extraction according to the provision index set to obtain a first instance corresponding to the provision index set;
a first obtaining module, configured to obtain, according to the first instance and a pre-generated correlation model, business data associated with the first instance and a first degree of association.
15. The device according to claim 14, characterized in that the second preprocessing module is further configured to:
analyze the external data using natural language processing techniques and convert the analyzed external data into business-related indicators to form the provision index set.
16. The device according to claim 14, characterized in that the second acquisition module is configured to obtain the external data by a crawler system.
17. A data association device, characterized by comprising:
a third acquisition module, configured to obtain business data;
a third preprocessing module, configured to preprocess the business data to obtain an operational indicator set corresponding to the business data;
a third extraction module, configured to perform entity extraction according to the operational indicator set to obtain a second instance corresponding to the operational indicator set;
a second obtaining module, configured to obtain, according to the second instance and a pre-generated correlation model, external data associated with the second instance and a second degree of association, wherein the external data includes regulatory provisions, policies and regulations, cases, and/or news.
18. The device according to claim 17, characterized in that the third preprocessing module is configured to:
extract the operational indicators of the business data according to preset conditions to form the operational indicator set.
19. A computing device comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, characterized in that the processor, when executing the instructions, implements the steps of the method according to any one of claims 1-4, 5-7, or 8-9.
20. A computer-readable storage medium storing computer instructions, characterized in that the instructions, when executed by a processor, implement the steps of the method according to any one of claims 1-4, 5-7, or 8-9.
CN201811159278.8A 2018-09-30 2018-09-30 Correlation model generation method and device, and data correlation method and device Active CN109597894B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811159278.8A CN109597894B (en) 2018-09-30 2018-09-30 Correlation model generation method and device, and data correlation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811159278.8A CN109597894B (en) 2018-09-30 2018-09-30 Correlation model generation method and device, and data correlation method and device

Publications (2)

Publication Number Publication Date
CN109597894A true CN109597894A (en) 2019-04-09
CN109597894B CN109597894B (en) 2023-10-03

Family

ID=65957345

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811159278.8A Active CN109597894B (en) 2018-09-30 2018-09-30 Correlation model generation method and device, and data correlation method and device

Country Status (1)

Country Link
CN (1) CN109597894B (en)

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104133848A (en) * 2014-07-01 2014-11-05 中央民族大学 Tibetan language entity knowledge information extraction method
CN105095195A (en) * 2015-07-03 2015-11-25 北京京东尚科信息技术有限公司 Method and system for human-machine questioning and answering based on knowledge graph
US20160041720A1 (en) * 2014-08-06 2016-02-11 Kaybus, Inc. Knowledge automation system user interface
CN105468583A (en) * 2015-12-09 2016-04-06 百度在线网络技术(北京)有限公司 Entity relationship obtaining method and device
CN106354710A (en) * 2016-08-18 2017-01-25 清华大学 Neural network relation extracting method
CN106372118A (en) * 2016-08-24 2017-02-01 武汉烽火普天信息技术有限公司 Large-scale media text data-oriented online semantic comprehension search system and method
CN106776711A (en) * 2016-11-14 2017-05-31 浙江大学 A kind of Chinese medical knowledge mapping construction method based on deep learning
CN106815293A (en) * 2016-12-08 2017-06-09 中国电子科技集团公司第三十二研究所 System and method for constructing knowledge graph for information analysis
CN107273349A (en) * 2017-05-09 2017-10-20 清华大学 A kind of entity relation extraction method and server based on multilingual
CN107291687A (en) * 2017-04-27 2017-10-24 同济大学 It is a kind of based on interdependent semantic Chinese unsupervised open entity relation extraction method
US20170308792A1 (en) * 2014-08-06 2017-10-26 Prysm, Inc. Knowledge To User Mapping in Knowledge Automation System
CN107358315A (en) * 2017-06-26 2017-11-17 深圳市金立通信设备有限公司 A kind of information forecasting method and terminal
CN107657063A (en) * 2017-10-30 2018-02-02 合肥工业大学 The construction method and device of medical knowledge collection of illustrative plates
CN107665252A (en) * 2017-09-27 2018-02-06 深圳证券信息有限公司 A kind of method and device of creation of knowledge collection of illustrative plates
CN107783973A (en) * 2016-08-24 2018-03-09 慧科讯业有限公司 The methods, devices and systems being monitored based on domain knowledge spectrum data storehouse to the Internet media event
CN107909274A (en) * 2017-11-17 2018-04-13 平安科技(深圳)有限公司 Enterprise investment methods of risk assessment, device and storage medium
CN107967267A (en) * 2016-10-18 2018-04-27 中兴通讯股份有限公司 A kind of knowledge mapping construction method, apparatus and system
CN108460136A (en) * 2018-03-08 2018-08-28 国网福建省电力有限公司 Electric power O&M information knowledge map construction method
CN108563620A (en) * 2018-04-13 2018-09-21 上海财梵泰传媒科技有限公司 The automatic writing method of text and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
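Gan Lixin et al.: "Chinese entity relation extraction based on syntactic and semantic features", Journal of Computer Research and Development (《计算机研究与发展》), pages 284 - 302 *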

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110187678A (en) * 2019-04-19 2019-08-30 广东省智能制造研究所 Manufacturing process equipment information storage and digitalization application system
CN111488741A (en) * 2020-04-14 2020-08-04 税友软件集团股份有限公司 Tax knowledge data semantic annotation method and related device
CN111754104A (en) * 2020-06-22 2020-10-09 平安资产管理有限责任公司 Service index execution method and system
CN112749284A (en) * 2020-12-31 2021-05-04 平安科技(深圳)有限公司 Knowledge graph construction method, device, equipment and storage medium
CN112749284B (en) * 2020-12-31 2021-12-17 平安科技(深圳)有限公司 Knowledge graph construction method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN109597894B (en) 2023-10-03

Similar Documents

Publication Publication Date Title
CN108763445B (en) Method, device, computer equipment and storage medium for constructing a patent knowledge base
CN109635117B (en) Method and device for recognizing user intention based on knowledge graph
CN109597894A (en) A kind of correlation model generation method and device, a kind of data correlation method and device
Choi et al. Item-level RFID for enhancement of customer shopping experience in apparel retail
CN110688454A (en) Method, device, equipment and storage medium for processing consultation conversation
CN105512687A (en) Emotion classification model training and textual emotion polarity analysis method and system
CN111966716A (en) Data processing method and device
CN110008977B (en) Clustering model construction method and device
CN113449046A (en) Model training method, system and related device based on enterprise knowledge graph
CN108197106B (en) Product competition analysis method, device and system based on deep learning
Yang Financial big data management and control and artificial intelligence analysis method based on data mining technology
CN112016850A (en) Service evaluation method and device
CN109284500A (en) Information transmission system and method based on investment promotion workflow and reading preference
Che et al. Bank telemarketing forecasting model based on t-SNE-SVM
CN110209767A (en) User portrait construction method
Mei et al. Research on e-commerce coupon user behavior prediction technology based on decision tree algorithm
Modrušan et al. Intelligent Public Procurement Monitoring System Powered by Text Mining and Balanced Indicators
CN111209394A (en) Text classification processing method and device
Feng Data analysis and prediction modeling based on deep learning in E-commerce
TWM583089U (en) Smart credit risk assessment system
Kuo et al. Integration of artificial immune system and k-means algorithm for customer clustering
Hou Financial Abnormal Data Detection System Based on Reinforcement Learning
Wang et al. Color trend prediction method based on genetic algorithm and extreme learning machine
Li Consumer behavior analysis model based on machine learning
CN112435103B (en) Intelligent recommendation method and system for postmortem diversity interpretation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200924

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20200924

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

GR01 Patent grant