CN109472032A - Method, apparatus, server and storage medium for determining an entity relationship diagram - Google Patents

Publication number
CN109472032A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811355514.3A
Other languages
Chinese (zh)
Inventor
火莽
火一莽
张志远
张自峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ruian Technology Co Ltd
Original Assignee
Beijing Ruian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Ruian Technology Co Ltd
Priority to CN201811355514.3A
Publication of CN109472032A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295: Named entity recognition

Abstract

The invention discloses a method, apparatus, server and storage medium for determining an entity relationship diagram. The method comprises: determining at least one entity in target data and extracting the entity relationships between the entities; determining the reliability of each entity relationship; determining, according to the ranking of the reliabilities, the target entity relationships and the corresponding target entity pairs; and connecting the target entity pairs based on the target entity relationships to form an entity relationship diagram, which is then stored. This technical solution addresses the storage overhead and cumbersome processing caused by existing ways of storing entity and relationship data, simplifies the determination of the entity relationship diagram, improves efficiency, and saves storage space.

Description

Method, apparatus, server and storage medium for determining an entity relationship diagram
Technical field
Embodiments of the present invention relate to the field of computer technology, and in particular to a method, apparatus, server and storage medium for determining an entity relationship diagram.
Background
Faced with an ever-growing mass of information, it is particularly important to quickly select the information that is actually needed and to classify, extract and restructure it.
Against this background, information extraction technology emerged. Broadly speaking, the objects of information extraction can be media such as text, images, speech or video; in practical applications, extraction is usually performed on text. Text information extraction is a technique for extracting entities and relationships of specified types from natural language text. It mainly involves three aspects: processing unstructured natural language text, selectively extracting the specified information from the text, and representing the extracted information as structured data. To this end, the prior art uses information extraction to extract the relationships between pairs of entities and store them, and then processes these relationships to form the final relationship diagram. This approach not only occupies a large amount of storage space, but is also cumbersome and inefficient.
Summary of the invention
Embodiments of the present invention provide a method, apparatus, server and storage medium for determining an entity relationship diagram, so as to simplify the determination of the entity relationship diagram, improve efficiency and save storage space.
In a first aspect, an embodiment of the present invention provides a method for determining an entity relationship diagram, comprising:
determining at least one entity in target data, and extracting the entity relationships between the entities;
determining the reliability of each entity relationship;
determining, according to the ranking of the reliabilities, the target entity relationships and the corresponding target entity pairs;
connecting the target entity pairs based on the target entity relationships to form an entity relationship diagram, and storing it.
Further, determining at least one entity in the target data comprises:
performing semantic parsing on the keywords of the target data;
determining at least one entity in the target data according to the parsing result.
Further, after determining at least one entity in the target data, the method further comprises:
performing disambiguation and merging on the entities to obtain at least one standard entity.
Further, performing disambiguation and merging on the entities to obtain at least one standard entity comprises:
disambiguating each entity according to set disambiguation rules;
calculating the attribute similarity of the disambiguated entities;
determining the entity similarity of the disambiguated entities according to the attribute similarities;
merging the disambiguated entities according to the entity similarities to obtain at least one standard entity.
Further, extracting the entity relationships between the entities comprises:
determining the entity relationships that exist between the entities according to preset rules, and extracting each entity relationship.
Further, determining the reliability of each entity relationship comprises:
determining the source coefficient of the entity according to the source of the entity;
determining the time coefficient of the entity according to the generation time of the entity;
determining the frequency-of-occurrence coefficient of the entity according to how often the entity occurs within a preset time;
determining the reliability of each entity relationship according to the source coefficient, the time coefficient and the frequency-of-occurrence coefficient.
Further, before determining at least one entity in the target data, the method further comprises:
capturing raw data, and analyzing and extracting the data features of the raw data;
integrating according to the data features to obtain the target data.
In a second aspect, an embodiment of the present invention further provides an apparatus for determining an entity relationship diagram, the apparatus comprising:
a first determining module, configured to determine at least one entity in target data and extract the entity relationships between the entities;
a second determining module, configured to determine the reliability of each entity relationship;
a third determining module, configured to determine, according to the ranking of the reliabilities, the target entity relationships and the corresponding target entity pairs;
a constructing module, configured to connect the target entity pairs based on the target entity relationships to form an entity relationship diagram, and to store it.
In a third aspect, an embodiment of the present invention further provides a server, comprising:
one or more processors;
a memory, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method for determining an entity relationship diagram described in the first aspect.
In a fourth aspect, an embodiment of the present invention further provides a storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method for determining an entity relationship diagram described in the first aspect.
Embodiments of the present invention provide a method, apparatus, server and storage medium for determining an entity relationship diagram. At least one entity in the target data is determined and the entity relationships between the entities are extracted; the reliability of each entity relationship is determined; the target entity relationships and the corresponding target entity pairs are determined according to the ranking of the reliabilities; and the target entity pairs are connected based on the target entity relationships to form an entity relationship diagram, which is stored. This addresses the storage overhead and cumbersome processing caused by existing ways of storing entity and relationship data, simplifies the determination of the entity relationship diagram, improves efficiency, and saves storage space.
Brief description of the drawings
Fig. 1 is a flowchart of a method for determining an entity relationship diagram provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of a method for determining an entity relationship diagram provided by Embodiment 2 of the present invention;
Fig. 3 is a schematic diagram of a parse tree provided by Embodiment 2 of the present invention;
Fig. 4 is a structural diagram of an apparatus for determining an entity relationship diagram provided by Embodiment 3 of the present invention;
Fig. 5 is a structural diagram of a server provided by Embodiment 4 of the present invention.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment 1
Fig. 1 is a flowchart of a method for determining an entity relationship diagram provided by Embodiment 1 of the present invention. This embodiment is applicable to determining an entity relationship diagram from entities and entity relationships. The method can be executed by an apparatus for determining an entity relationship diagram, and the apparatus is integrated in a server. Referring to Fig. 1, the method comprises the following steps:
S110: Determine at least one entity in target data, and extract the entity relationships between the entities.
Target data may be data obtained based on set rules, or a particular class of data selected according to actual needs. The set rules may be rules such as data cleaning, data integration and data normalization, which convert unstructured data into structured data to meet the user's needs. Structured data, also called row data, is data logically expressed and implemented through a two-dimensional table structure; it follows data format and length specifications and is mainly stored and managed in relational databases. Unstructured data is data whose structure is irregular or incomplete, that has no predefined data model, and that is inconvenient to represent in two-dimensional database tables, such as office documents, text, pictures, reports of all kinds, images and audio/video information. The particular class of data may be data of a certain standard that the user selects according to actual needs, such as an industry standard.
An entity in the data is a set of things of one kind, and can be a person, place, event, object or organization. A place can be a real geographical location or a virtual address such as a network address or IP address. An event can be a specific incident. An object can be a physical object or a virtual object: a physical object can be a tree, an animal or a piece of clothing, and a virtual object can be a stock or a bill. An organization can be a specific institution or group, such as a song and dance troupe. In practical applications, each piece of data may contain multiple entities and may reflect the associations between them; the association between entities is called an entity relationship. For example, in "Xiao Ming has an apple", "Xiao Ming" is a person entity, "apple" is a physical object entity, and the entity relationship between the two is "owns". Note that there may be one or more entity relationships between different entities, and the same pair of entities may also have different entity relationships. For example, the entity relationship between a person and an object can be "owns" or "uses", the entity relationship between a person and an organization is "belongs to", and the entity relationship between two persons can be "friend" or "same person".
Specifically, the method for determining the entities of the target data can be selected according to actual needs. For example, the target data can be analyzed, summarized and generalized manually to determine its entities. An automated approach can also be used, in which the target data is input into a preset entity determination model and the model outputs the entities; the entity determination model can be a Chinese named entity recognition model based on CRF (Conditional Random Field). An entity relationship diagram comprises the entities and the relationships between them, so after the entities of the target data are determined, the relationships between the entities must also be extracted. It should be understood that several entity relationships may exist between two entities; for example, the entity relationship between a person and an object can be "owns" or "uses". For this reason, the entity relationship between the entities can be determined by parsing the semantics of the target data, and that entity relationship is then extracted, providing the basis for determining the entity relationship diagram.
S120: Determine the reliability of each entity relationship.
Reliability is an index that reflects whether an entity relationship is true. If the reliability is greater than or equal to a preset value, the extracted entity relationship truly reflects the relationship between the two entities in the target data. Conversely, if the reliability is less than the preset value, the extracted entity relationship is not credible and cannot be used to determine the entity relationship diagram. The preset value can be set according to actual needs and is not limited in this embodiment. Specifically, a method for calculating the reliability can be determined from the features of the entities, and the reliability of each entity relationship is then calculated with that method; the features of the entities can be selected according to actual needs.
S130: Determine the target entity relationships and the corresponding target entity pairs according to the ranking of the reliabilities.
A target entity relationship is an entity relationship whose reliability satisfies the preset value, and a target entity pair is the entity pair corresponding to a target entity relationship. For example, if the reliability of the entity relationship "owns" satisfies the preset value and the entities corresponding to "owns" are "Xiao Ming" and "apple", then "owns" is a target entity relationship and "Xiao Ming" and "apple" form the corresponding target entity pair. Specifically, the calculated reliabilities are sorted from high to low, and for each reliability that is greater than or equal to the preset value, the target entity relationship and its corresponding target entity pair are determined.
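Step S130 can be sketched as a sort followed by a threshold filter. This is a minimal illustration under stated assumptions, not the patented implementation: the `Relation` record, the function name `select_target_relations`, the reliability scores and the threshold 0.6 are all hypothetical.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """A candidate entity relationship (hypothetical record layout)."""
    head: str
    label: str
    tail: str
    reliability: float


def select_target_relations(relations, threshold):
    """Sort candidates by reliability, high to low, and keep those that
    are greater than or equal to the preset value."""
    ranked = sorted(relations, key=lambda r: r.reliability, reverse=True)
    return [r for r in ranked if r.reliability >= threshold]


# Illustrative scores only; the patent does not give concrete numbers.
candidates = [
    Relation("Xiao Ming", "owns", "apple", 0.82),
    Relation("Xiao Ming", "uses", "apple", 0.35),
    Relation("Xiao Ming", "belongs to", "song and dance troupe", 0.91),
]

targets = select_target_relations(candidates, threshold=0.6)
# Each target relationship carries its target entity pair.
target_pairs = [(r.head, r.tail) for r in targets]
```

Keeping the relationship and its entity pair in one record avoids a separate lookup when the graph is built in the next step.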
S140: Connect the target entity pairs based on the target entity relationships to form an entity relationship diagram, and store it.
An entity relationship diagram, also called an Entity-Relationship diagram, is a structure diagram used in database design that comprises entities and the entity relationships between them. With an entity relationship diagram, the relationships between the entities can be seen at a glance and the required information can be obtained quickly. Specifically, connecting the corresponding target entity pairs according to the target entity relationships forms the entity relationship diagram. Once generated, the entity relationship diagram can be stored in a graph database. A graph database is a non-relational database that uses graph theory to store the relationships between entities; for example, the entity relationship diagram can be stored in a Neo4j graph database. The benefit of storing it this way is that, compared with a relational database, a graph database can present results to the user more clearly and intuitively, so that users can pick out the information they need and save time.
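The connection step in S140 can be illustrated with an in-memory adjacency map. This is only a sketch: the function name `build_entity_graph` and the triple layout are assumptions, and persisting the result to a graph database such as Neo4j is deliberately left out.

```python
from collections import defaultdict


def build_entity_graph(triples):
    """Connect target entity pairs by their target relationships.

    `triples` is an iterable of (head, relationship, tail) tuples.
    Returns a dict mapping each entity to its outgoing
    (relationship, tail) edges.
    """
    graph = defaultdict(list)
    for head, label, tail in triples:
        graph[head].append((label, tail))
        # Make sure the tail is present as a node even without outgoing edges.
        graph.setdefault(tail, [])
    return dict(graph)


graph = build_entity_graph([
    ("Xiao Ming", "owns", "apple"),
    ("Xiao Ming", "belongs to", "song and dance troupe"),
])
```

Only relationships that passed the reliability filter reach this step, which is how the method avoids storing every extracted pairwise relationship.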
Embodiment 1 of the present invention provides a method for determining an entity relationship diagram. At least one entity in the target data is determined and the entity relationships between the entities are extracted; the reliability of each entity relationship is determined; the target entity relationships and the corresponding target entity pairs are determined according to the ranking of the reliabilities; and the target entity pairs are connected based on the target entity relationships to form an entity relationship diagram, which is stored. This addresses the storage overhead and cumbersome processing caused by existing ways of storing entity and relationship data, simplifies the determination of the entity relationship diagram, improves efficiency, and saves storage space.
It should be understood that different entities may have identical or very similar meanings; for example, Zhang San and Zhang Asan may denote the same person. The same entity may also have several meanings; for example, "bank" can denote a financial institution or a riverbank. Such cases affect the accuracy of the entity relationship diagram. Therefore, after the entities of the target data are determined, they need to be disambiguated and merged to improve the accuracy of the entity relationship diagram.
Specifically, on the basis of the above embodiment, after determining at least one entity in the target data, the method further comprises:
performing disambiguation and merging on the entities to obtain at least one standard entity.
Disambiguation eliminates the ambiguity of an entity in order to determine the meaning of the entity in the data. Specifically, dictionary-based disambiguation, that is, disambiguation based on semantic definitions, can be used. Suppose the i-th definition of Word in the dictionary contains the vocabulary Other_i. If a sentence contains Word and Other_i also appears in it, the meaning of Word in that sentence is taken to be the i-th definition in the dictionary. Merging combines entities, that is, entities that are substantially identical are merged into one entity; for example, Zhang San and Zhang Dasan are unified as Zhang San.
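The dictionary-based rule above can be sketched as follows. The sense dictionary, its cue words and the function name `disambiguate` are hypothetical and chosen only to illustrate the "Word plus Other_i" test; a real dictionary would be far larger.

```python
# Hypothetical sense dictionary: each word maps to an ordered list of
# (definition, cue words), where the cue words play the role of Other_i.
SENSE_DICTIONARY = {
    "bank": [
        ("financial institution", {"money", "deposit", "loan"}),
        ("riverbank", {"river", "water", "shore"}),
    ],
}


def disambiguate(word, sentence_tokens, dictionary=SENSE_DICTIONARY):
    """Return the first definition of `word` whose cue words also occur
    in the sentence, or None if no definition matches."""
    tokens = set(sentence_tokens)
    if word not in tokens:
        return None
    for definition, cue_words in dictionary.get(word, []):
        if cue_words & tokens:  # Word and Other_i appear in the same sentence
            return definition
    return None


sense_a = disambiguate("bank", "he sat on the bank of the river".split())
sense_b = disambiguate("bank", "she made a deposit at the bank".split())
```

Returning None when no cue word matches leaves the entity ambiguous instead of guessing, so a later rule or a human reviewer can decide.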
To further illustrate the disambiguation and merging process, "performing disambiguation and merging on the entities to obtain at least one standard entity" can be embodied as:
disambiguating each entity according to set disambiguation rules;
calculating the attribute similarity of the disambiguated entities;
determining the entity similarity of the disambiguated entities according to the attribute similarities;
merging the disambiguated entities according to the entity similarities to obtain at least one standard entity.
Specifically, the disambiguation rules can be set according to actual needs; for example, they can be semantics-based disambiguation rules, whose process has been described above and is not repeated here. The attributes of an entity are its characteristics; for example, a student (entity) has attributes such as student number, name, age and gender. Attribute similarity is the similarity of the same attribute across entities of the same class. When the similarity of two attributes is greater than the preset attribute similarity, the entities that the two attributes represent are the same entity. In practical applications, the attribute similarity can be used as the entity similarity of the entities corresponding to the attributes. For example, if attribute 1 corresponds to entity 1 and attribute 2 corresponds to entity 2, and the attribute similarity of attribute 1 and attribute 2 is 70%, then the entity similarity of entity 1 and entity 2 is also taken to be 70%.
Optionally, semantic analysis is performed on the disambiguated entities, and semantically similar entities are placed in one group; for example, Zhang San, Zhang Dasan and Zhang San 1990 form one group. To determine whether the three denote the same person, the attribute similarity of Zhang Dasan and Zhang San 1990 and the attribute similarity of Zhang San and Zhang Dasan can be calculated. Assume the preset attribute similarity value is 90%. If the attribute similarity of Zhang San and Zhang Dasan is greater than or equal to 90%, Zhang San and Zhang Dasan are determined to be the same person. If the attribute similarity of Zhang Dasan and Zhang San 1990 is less than 90%, then, since Zhang San and Zhang Dasan are the same person, Zhang San and Zhang San 1990 are determined to be two different people. Further, since the attribute similarity values show that Zhang San and Zhang Dasan denote the same entity, the two can be unified, that is, substantially identical entities are merged into one entity to obtain a standard entity. This improves the reliability of the data and saves data storage space.
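The merging step can be sketched with a string similarity measure and the 90% threshold from the example. Several choices here are assumptions that the patent does not specify: `difflib.SequenceMatcher` as the attribute similarity, the mean over shared attributes as the entity similarity, a greedy first-match grouping, and all attribute values.

```python
from difflib import SequenceMatcher


def attribute_similarity(a, b):
    """Similarity of two attribute values in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, a, b).ratio()


def entity_similarity(e1, e2):
    """Mean attribute similarity over the attributes both entities have."""
    shared = sorted(set(e1["attrs"]) & set(e2["attrs"]))
    if not shared:
        return 0.0
    scores = [attribute_similarity(e1["attrs"][k], e2["attrs"][k]) for k in shared]
    return sum(scores) / len(scores)


def merge_entities(entities, threshold=0.9):
    """Greedily merge each entity into the first standard entity whose
    similarity reaches the threshold; otherwise start a new standard entity."""
    standard = []
    for entity in entities:
        for canon in standard:
            if entity_similarity(canon, entity) >= threshold:
                canon["aliases"].append(entity["name"])
                break
        else:
            standard.append(
                {"name": entity["name"], "attrs": dict(entity["attrs"]), "aliases": []}
            )
    return standard


# Illustrative attribute values only.
entities = [
    {"name": "Zhang San", "attrs": {"phone": "13800000000", "birth_year": "1985", "city": "Beijing"}},
    {"name": "Zhang Dasan", "attrs": {"phone": "13800000000", "birth_year": "1985", "city": "Beijing"}},
    {"name": "Zhang San 1990", "attrs": {"phone": "13911112222", "birth_year": "1990", "city": "Shanghai"}},
]
standard_entities = merge_entities(entities)
```

The name is deliberately excluded from the comparison, since the point of merging is that differently named mentions may still be the same entity.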
Embodiment 2
Fig. 2 is a flowchart of a method for determining an entity relationship diagram provided by Embodiment 2 of the present invention, which is optimized on the basis of the above embodiment. Specifically, the method comprises the following steps:
S210: Capture raw data, and analyze and extract the data features of the raw data.
The raw data can be unstructured log data, and can be captured from sources such as mail or the network by means such as crawler scripts or data aggregation. A crawler is a way of automatically capturing information from the Internet; for example, a Python crawler can be used to crawl log data or to extract data from mail. Data features are the features or characteristics of the data and reflect the information that the data represents; examples are a mobile phone number, an ID card number or a WeChat ID. Optionally, after the raw data is obtained, it is preprocessed. Preprocessing includes, but is not limited to, data cleaning and data transformation. Data cleaning can be data deduplication and data noise reduction, to improve data quality. Data transformation mainly normalizes the data, converting data in different formats into a preset format. After the raw data has been preprocessed, its data features are extracted to generate structured data.
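The preprocessing and feature extraction in S210 can be sketched as follows. This is a simplified illustration under assumptions: the log lines are invented, normalization is reduced to whitespace and case folding, and the only data feature extracted is an 11-digit mobile phone number matched by a hypothetical pattern.

```python
import re

# Assumed pattern for an 11-digit mobile phone number.
PHONE_PATTERN = re.compile(r"\b1[3-9]\d{9}\b")


def clean(records):
    """Normalize whitespace and case, then drop empty lines and duplicates
    while preserving the original order."""
    seen, cleaned = set(), []
    for record in records:
        normalized = " ".join(record.split()).lower()
        if normalized and normalized not in seen:
            seen.add(normalized)
            cleaned.append(normalized)
    return cleaned


def extract_features(record):
    """Turn one cleaned log line into a structured record."""
    return {"text": record, "phones": PHONE_PATTERN.findall(record)}


raw_logs = [
    "user login phone=13800000000",
    "user login phone=13800000000",    # exact duplicate
    "  USER pay   phone=13911112222 ",  # irregular spacing and case
    "",
]
target_data = [extract_features(r) for r in clean(raw_logs)]
```

Cleaning before extraction means each feature is extracted once per distinct record, which keeps the resulting target data small.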
S220: Integrate according to the data features to obtain the target data.
After the data features are extracted, they can be arranged in a certain order to obtain the target data, and the generated target data is stored in a Kafka cache. Kafka is a distributed publish-subscribe messaging system and is used here to store the data.
S230: Perform semantic parsing on the keywords of the target data.
Specifically, keywords reflect the subject of the target data. The target data integrated from the data features contains multiple keywords, and semantic parsing of the keywords is needed to determine their meaning.
S240: Determine at least one entity in the target data according to the parsing result.
As an example, if keyword A denotes a geographical location after parsing, keyword A can be determined to be a place entity according to the parsing result. Optionally, after an entity is determined, it can be assigned a label, from which the type of the current entity and the meaning it represents can be determined.
S250: Determine the entity relationships that exist between the entities according to preset rules, and extract each entity relationship.
The preset rules are the basis for determining the entity relationships that exist between the entities. For example, the entity relationships that may exist between entity A and entity B can be determined first, and the actual entity relationship of the two entities is then selected from these candidates in light of the context. Take "Zhang San and Li Si are good friends": according to the preset rules, it is first determined that Zhang San and Li Si may be the same person or may be two people; in light of the context, it can then be determined that Zhang San and Li Si are two people and that their entity relationship is "friend". Note that the entities here can be the entities determined from the target data, or the standard entities obtained after disambiguation and merging. Optionally, after the entities of the target data are determined, the entity relationships between them can be obtained manually according to the preset rules. An automated approach such as machine learning can also be used: the preset rules are input into a machine learning model in advance, the target data is then input into the model, and the model outputs the entity relationships of the entities. The manual and automated approaches were introduced in the above embodiment and are not repeated here. Further, after the entity relationships are extracted, they are applied between the labeled entities. To determine the reliability of the extracted entity relationships, they are imported into a knowledge base for learning after extraction, and the entity relationships are adjusted according to the learning result.
Optionally, the relationships between the entities can also be determined by building a parse tree. A parse tree reflects the syntactic, semantic and logical relationships between words and between phrases in the target data. As an example, take "I play basketball": "I" is a noun, denoted NN; "play" is a verb, denoted Vt; and "basketball" is a noun, denoted NN. According to the principle of the parse tree, the derivation probability of "I" is 0.5 and its tree path is "I"; the derivation probability of "play" is 1.0 and its tree path is "play"; and the derivation probability of "basketball" is 0.5 and its tree path is "basketball". The combination of "play" and "basketball" satisfies the VP rule, where VP denotes a verb phrase, with a derivation probability of 0.5. The combination of NN and VP satisfies the S rule, where S denotes a sentence, with a derivation probability of 0.25, and the tree path is "I play basketball". The resulting parse tree is shown schematically in Fig. 3. Note that, within the same sentence, the derivation probabilities of the same part of speech sum to 1.
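The derivation probabilities in the "I play basketball" example can be reproduced by multiplying probabilities up the tree, as in a probabilistic context-free grammar. The lexical probabilities below come from the text; the rule probabilities of 1.0 for VP and S are an assumption chosen because they reproduce the stated values 0.5 and 0.25, and the tuple encoding of the tree is hypothetical.

```python
# Lexical derivation probabilities from the example.
LEXICAL = {
    ("NN", "I"): 0.5,
    ("NN", "basketball"): 0.5,
    ("Vt", "play"): 1.0,
}
# Assumed rule probabilities (not stated in the text).
RULES = {
    ("VP", ("Vt", "NN")): 1.0,
    ("S", ("NN", "VP")): 1.0,
}


def tree_probability(tree):
    """Probability of a parse tree.

    A leaf is (label, word); an inner node is (label, left, right).
    """
    if len(tree) == 2:
        label, word = tree
        return LEXICAL[(label, word)]
    label, left, right = tree
    rule_probability = RULES[(label, (left[0], right[0]))]
    return rule_probability * tree_probability(left) * tree_probability(right)


verb_phrase = ("VP", ("Vt", "play"), ("NN", "basketball"))
sentence = ("S", ("NN", "I"), verb_phrase)

vp_probability = tree_probability(verb_phrase)     # 1.0 * 1.0 * 0.5
sentence_probability = tree_probability(sentence)  # 1.0 * 0.5 * 0.5
```

The two NN entries sum to 1 and the single Vt entry is 1, which matches the note that the derivation probabilities of the same part of speech sum to 1.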
After the parse tree is obtained, machine learning is applied starting from the word vectors at the top of the parse tree: the grammar of the sentence is iterated over and merged until a vector representation of the sentence is obtained, and the entity relationships of the entities are determined from that vector representation. The machine learning model can be a recurrent neural network.
S260: Determine the reliability of each entity relationship.
To further clarify how the reliability is determined, "determining the reliability of each entity relationship" is embodied below. Specifically, determining the reliability of each entity relationship comprises:
determining the source coefficient of the entity according to the source of the entity;
determining the time coefficient of the entity according to the generation time of the entity;
determining the frequency-of-occurrence coefficient of the entity according to how often the entity occurs within a preset time;
determining the reliability of each entity relationship according to the source coefficient, the time coefficient and the frequency-of-occurrence coefficient.
Specifically, the source of an entity is the origin of the data corresponding to that entity, and the source coefficient of the entity is determined from it. For example, if the data source corresponding to the entity is reliable, such as Baidu Baike or a certain standard, the corresponding source coefficient is high. Conversely, if the data source corresponding to the entity is unreliable, the corresponding source coefficient is low. Which data sources count as reliable and which as unreliable can be set according to actual needs.
The generation time of an entity is the time of the data corresponding to the entity, for example whether the data is from 2018 or from 2017. In this embodiment, the closer the time of the selected data is to the current time, the higher the corresponding time coefficient; for example, the time coefficient of an entity corresponding to 2018 data is higher than that of an entity corresponding to 2001 data.
The frequency of occurrence of an entity is how often the entity occurs within a preset time; the higher the frequency, the higher the frequency-of-occurrence coefficient. The preset time can be configured according to actual needs. Specifically, the reliability can be calculated with a set formula, for example: reliability = source coefficient × time coefficient × frequency-of-occurrence coefficient. Note that the source coefficient, the time coefficient and the frequency-of-occurrence coefficient are all percentages.
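The product formula above can be sketched as follows. Only the multiplication of the three coefficients comes from the text; the source table, the linear time decay, the saturation-based frequency coefficient and every constant are hypothetical choices made to keep each coefficient in the range 0 to 1.

```python
# Hypothetical source coefficients; which sources count as reliable is
# configurable according to actual needs.
SOURCE_COEFFICIENTS = {"encyclopedia": 0.9, "forum": 0.4}
DEFAULT_SOURCE_COEFFICIENT = 0.1


def time_coefficient(data_year, current_year, horizon=20):
    """Newer data scores higher; linear decay over `horizon` years (assumed)."""
    age = max(0, current_year - data_year)
    return max(0.0, 1.0 - age / horizon)


def frequency_coefficient(occurrences, saturation=10):
    """More occurrences within the preset time score higher, capped at 1."""
    return min(occurrences, saturation) / saturation


def reliability(source, data_year, occurrences, current_year=2018):
    """reliability = source coefficient * time coefficient * frequency coefficient."""
    source_coefficient = SOURCE_COEFFICIENTS.get(source, DEFAULT_SOURCE_COEFFICIENT)
    return (
        source_coefficient
        * time_coefficient(data_year, current_year)
        * frequency_coefficient(occurrences)
    )


recent = reliability("encyclopedia", data_year=2018, occurrences=10)
old = reliability("encyclopedia", data_year=2001, occurrences=10)
```

Because the coefficients are multiplied, a low value in any one of them pulls the whole reliability down, so a frequently mentioned relationship from an unreliable source still scores low.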
S270: Determine the target entity relationships and the corresponding target entity pairs according to the ranking of the reliabilities.
S280: Connect the target entity pairs based on the target entity relationships to form an entity relationship diagram, and store it.
Embodiment 2 of the present invention provides a method for determining an entity relationship diagram that, on the basis of the above embodiment, optimizes "determining at least one entity in target data and extracting the entity relationships between the entities" and "determining the reliability of each entity relationship". The raw data is preprocessed and the target data is obtained by integration according to the extracted data features, and the reliability of each entity relationship is determined from the source coefficient, time coefficient and frequency-of-occurrence coefficient of the entity. This improves data quality and increases the reliability of the entity relationship diagram.
Embodiment three
Fig. 4 is a structural diagram of an apparatus for determining an entity relationship diagram provided by Embodiment three of the present invention. The apparatus can execute the method for determining an entity relationship diagram described in the above embodiments. Specifically, the apparatus includes:
A first determining module 410, configured to determine at least one entity in target data and to extract the entity relationships between the entities;
A second determining module 420, configured to determine the reliability of each entity relationship;
A third determining module 430, configured to determine the target entity relationships and the corresponding target entity pairs according to the ranking of the reliabilities;
A constituting module 440, configured to connect the target entity pairs based on the target entity relationships, constitute an entity relationship diagram, and store it.
Embodiment three of the present invention provides an apparatus for determining an entity relationship diagram. The apparatus determines at least one entity in target data and extracts the entity relationships between the entities, determines the reliability of each entity relationship, determines the target entity relationships and the corresponding target entity pairs according to the ranking of the reliabilities, and connects the target entity pairs based on the target entity relationships to constitute and store an entity relationship diagram. This solves problems such as the storage overhead and cumbersome process caused by existing storage of entity and relationship data, simplifies the process of determining the entity relationship diagram, improves efficiency, and saves storage space.
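The way the four modules cooperate can be sketched as a simple pipeline. The class name, the callable signatures and the triple representation below are hypothetical and serve only to show the data flow between modules 410 to 440; the patent does not prescribe any particular software structure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Relation = Tuple[str, str, str]  # (head entity, relation, tail entity)


@dataclass
class DiagramDeterminingDevice:
    """Hypothetical wiring of modules 410, 420, 430 and 440."""
    first_determining: Callable[[str], List[Relation]]                    # module 410
    second_determining: Callable[[Relation], float]                       # module 420
    third_determining: Callable[[Dict[Relation, float]], List[Relation]]  # module 430
    constituting: Callable[[List[Relation]], dict]                        # module 440

    def run(self, target_data: str) -> dict:
        relations = self.first_determining(target_data)
        scores = {rel: self.second_determining(rel) for rel in relations}
        targets = self.third_determining(scores)
        return self.constituting(targets)
```

Keeping each module behind a plain callable means any one stage, for example the reliability scoring, can be replaced without touching the others.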
On the basis of the above embodiment, the first determining module 410 includes:
A parsing unit, configured to perform semantic parsing on the keywords of the target data;
An entity determining unit, configured to determine at least one entity in the target data according to the parsing result.
On the basis of the above embodiment, the apparatus further includes:
A processing module, configured to perform disambiguation and merging on the entities after the at least one entity in the target data has been determined, to obtain at least one standard entity.
On the basis of the above embodiment, the processing module includes:
A disambiguation unit, configured to disambiguate each entity according to a set disambiguation rule;
A computing unit, configured to calculate the attribute similarities of the disambiguated entities;
A determining unit, configured to determine the entity similarity of the disambiguated entities according to the attribute similarities;
A merging unit, configured to merge the disambiguated entities according to the entity similarities, to obtain at least one standard entity.
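The similarity and merging units listed above can be sketched as follows, taking entities that have already been disambiguated. Representing an entity as a mapping from attribute name to a set of values, using Jaccard similarity per attribute, averaging the attribute similarities into an entity similarity, and merging greedily above a threshold are all assumptions chosen for illustration; the text names none of them.

```python
def attribute_similarity(a: set, b: set) -> float:
    """Jaccard similarity of two attribute value sets (an assumed metric)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def entity_similarity(e1: dict, e2: dict) -> float:
    """Average the attribute similarities over every attribute either entity has."""
    keys = set(e1) | set(e2)
    if not keys:
        return 0.0
    total = sum(attribute_similarity(e1.get(k, set()), e2.get(k, set())) for k in keys)
    return total / len(keys)


def merge_entities(entities: list, threshold: float = 0.5) -> list:
    """Greedily merge each entity into the first standard entity it resembles."""
    standard = []
    for entity in entities:
        for existing in standard:
            if entity_similarity(existing, entity) >= threshold:
                # Union the attribute values into the existing standard entity.
                for key, values in entity.items():
                    existing[key] = existing.get(key, set()) | values
                break
        else:
            # No sufficiently similar standard entity yet: start a new one.
            standard.append({k: set(v) for k, v in entity.items()})
    return standard
```

The greedy pass depends on input order; a clustering step over all pairwise similarities would avoid that, at higher cost.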
On the basis of the above embodiment, the first determining module 410 further includes:
An extracting unit, configured to determine, according to preset rules, the entity relationships that exist between the entities, and to extract those entity relationships.
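One possible reading of rule-based extraction is pattern matching over the text, sketched below. The two regular-expression rules, the relation names and the function name are hypothetical examples; the patent does not state what form the preset rules take.

```python
import re

# Hypothetical preset rules: relation name -> pattern with "head" and "tail" groups.
PRESET_RULES = {
    "works_at": re.compile(r"(?P<head>\w+) works at (?P<tail>\w+)"),
    "located_in": re.compile(r"(?P<head>\w+) is located in (?P<tail>\w+)"),
}


def extract_relations(text, entities, rules=PRESET_RULES):
    """Return (head, relation, tail) triples whose two ends are known entities."""
    found = []
    for relation, pattern in rules.items():
        for match in pattern.finditer(text):
            head, tail = match.group("head"), match.group("tail")
            if head in entities and tail in entities:
                found.append((head, relation, tail))
    return found
```

Restricting matches to previously determined entities keeps the extraction tied to the output of the entity determination step.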
On the basis of the above embodiment, the second determining module 420 includes:
A first determining unit, configured to determine the source coefficient of the entity according to the source of the entity;
A second determining unit, configured to determine the time coefficient of the entity according to the generation time of the entity;
A third determining unit, configured to determine the frequency-of-occurrence coefficient of the entity according to the frequency with which the entity occurs within a preset time;
A fourth determining unit, configured to determine the reliability of each entity relationship according to the source coefficient, the time coefficient and the frequency-of-occurrence coefficient.
On the basis of the above embodiment, the apparatus further includes:
A grabbing module, configured to grab raw data, and to analyze and extract the data features of the raw data;
An integrating module, configured to integrate the data according to the data features to obtain the target data.
Embodiment four
Fig. 5 is a structural diagram of a server provided by Embodiment four of the present invention. Referring to Fig. 5, the server includes a processor 510, a memory 520, an input device 530 and an output device 540. The server may have one or more processors 510; Fig. 5 takes one processor 510 as an example. The processor 510, the memory 520, the input device 530 and the output device 540 in the server may be connected by a bus or in other ways; Fig. 5 takes connection by a bus as an example.
As a computer-readable storage medium, the memory 520 can be used to store software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the method for determining an entity relationship diagram in the embodiments of the present invention. The processor 510 runs the software programs, instructions and modules stored in the memory, thereby executing the various functional applications and data processing of the terminal, that is, implementing the method for determining an entity relationship diagram of the above embodiments.
The memory 520 mainly includes a program storage area and a data storage area. The program storage area can store the operating system and the application programs required by at least one function; the data storage area can store data created through use of the terminal, and so on. In addition, the memory 520 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. In some examples, the memory 520 may further include memory located remotely from the processor, and such remote memory can be connected to the terminal through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The input device 530 can be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control. The output device 540 may include display devices such as a display screen and audio devices such as a loudspeaker and a buzzer.
The server provided by Embodiment four of the present invention belongs to the same inventive concept as the method for determining an entity relationship diagram provided by the above embodiments. For technical details not described in detail in this embodiment, refer to the above embodiments; this embodiment has the same beneficial effects as executing the method for determining an entity relationship diagram.
Embodiment five
Embodiment five of the present invention also provides a storage medium on which a computer program is stored; when executed by a processor, the program implements the method for determining an entity relationship diagram described in any embodiment of the present invention.
Of course, in the storage medium provided by this embodiment of the present invention, the computer-executable instructions are not limited to the operations of the method for determining an entity relationship diagram described above; they can also perform the related operations in the method for determining an entity relationship diagram provided by any embodiment of the present invention, with the corresponding functions and beneficial effects.
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software together with the necessary general-purpose hardware, and of course can also be implemented by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), flash memory (FLASH), hard disk or optical disc, and includes several instructions that cause a computer device (which may be a robot, a personal computer, a server, a network device, or the like) to execute the method for determining an entity relationship diagram described in the embodiments of the present invention.
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments and substitutions can be made without departing from the scope of protection of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to those embodiments; it may include other equivalent embodiments without departing from the inventive concept, and its scope is determined by the scope of the appended claims.

Claims (10)

1. A method for determining an entity relationship diagram, characterized by comprising:
determining at least one entity in target data, and extracting the entity relationships between the entities;
determining the reliability of each entity relationship;
determining target entity relationships and the corresponding target entity pairs according to the ranking of the reliabilities;
connecting the target entity pairs based on the target entity relationships, constituting an entity relationship diagram, and storing it.
2. The method according to claim 1, characterized in that determining at least one entity in the target data comprises:
performing semantic parsing on the keywords of the target data;
determining at least one entity in the target data according to the parsing result.
3. The method according to claim 2, characterized in that, after determining at least one entity in the target data, the method further comprises:
performing disambiguation and merging on the entities to obtain at least one standard entity.
4. The method according to claim 3, characterized in that performing disambiguation and merging on the entities to obtain at least one standard entity comprises:
disambiguating each entity according to a set disambiguation rule;
calculating the attribute similarities of the disambiguated entities;
determining the entity similarity of the disambiguated entities according to the attribute similarities;
merging the disambiguated entities according to the entity similarities to obtain at least one standard entity.
5. The method according to claim 1, characterized in that extracting the entity relationships between the entities comprises:
determining, according to preset rules, the entity relationships that exist between the entities, and extracting those entity relationships.
6. The method according to claim 1, characterized in that determining the reliability of each entity relationship comprises:
determining the source coefficient of the entity according to the source of the entity;
determining the time coefficient of the entity according to the generation time of the entity;
determining the frequency-of-occurrence coefficient of the entity according to the frequency with which the entity occurs within a preset time;
determining the reliability of each entity relationship according to the source coefficient, the time coefficient and the frequency-of-occurrence coefficient.
7. The method according to claim 1, characterized in that, before determining at least one entity in the target data, the method further comprises:
grabbing raw data, and analyzing and extracting the data features of the raw data;
integrating the data according to the data features to obtain the target data.
8. An apparatus for determining an entity relationship diagram, characterized by comprising:
a first determining module, configured to determine at least one entity in target data and to extract the entity relationships between the entities;
a second determining module, configured to determine the reliability of each entity relationship;
a third determining module, configured to determine target entity relationships and the corresponding target entity pairs according to the ranking of the reliabilities;
a constituting module, configured to connect the target entity pairs based on the target entity relationships, constitute an entity relationship diagram, and store it.
9. A server, characterized by comprising:
one or more processors;
a memory, configured to store one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for determining an entity relationship diagram according to any one of claims 1-7.
10. A storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method for determining an entity relationship diagram according to any one of claims 1-7.
CN201811355514.3A 2018-11-14 2018-11-14 A kind of determination method, apparatus, server and the storage medium of entity relationship diagram Pending CN109472032A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811355514.3A CN109472032A (en) 2018-11-14 2018-11-14 A kind of determination method, apparatus, server and the storage medium of entity relationship diagram

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811355514.3A CN109472032A (en) 2018-11-14 2018-11-14 A kind of determination method, apparatus, server and the storage medium of entity relationship diagram

Publications (1)

Publication Number Publication Date
CN109472032A true CN109472032A (en) 2019-03-15

Family

ID=65672962

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811355514.3A Pending CN109472032A (en) 2018-11-14 2018-11-14 A kind of determination method, apparatus, server and the storage medium of entity relationship diagram

Country Status (1)

Country Link
CN (1) CN109472032A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101882259A (en) * 2009-05-06 2010-11-10 日电(中国)有限公司 Method and equipment for filtering entity relationship instance
CN104933164A (en) * 2015-06-26 2015-09-23 华南理工大学 Method for extracting relations among named entities in Internet massive data and system thereof
US20160092549A1 (en) * 2014-09-26 2016-03-31 International Business Machines Corporation Information Handling System and Computer Program Product for Deducing Entity Relationships Across Corpora Using Cluster Based Dictionary Vocabulary Lexicon
CN106294744A (en) * 2016-08-11 2017-01-04 上海动云信息科技有限公司 Interest recognition methods and system
US20170277856A1 (en) * 2016-03-24 2017-09-28 Fujitsu Limited Healthcare risk extraction system and method
CN107992480A (en) * 2017-12-25 2018-05-04 东软集团股份有限公司 A kind of method, apparatus for realizing entity disambiguation and storage medium, program product
CN108363816A (en) * 2018-03-21 2018-08-03 北京理工大学 Open entity relation extraction method based on sentence justice structural model

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110427623A (en) * 2019-07-24 2019-11-08 深圳追一科技有限公司 Semi-structured document Knowledge Extraction Method, device, electronic equipment and storage medium
CN110427623B (en) * 2019-07-24 2021-09-21 深圳追一科技有限公司 Semi-structured document knowledge extraction method and device, electronic equipment and storage medium
CN110674304A (en) * 2019-10-09 2020-01-10 北京明略软件系统有限公司 Entity disambiguation method and device, readable storage medium and electronic equipment
CN110851586A (en) * 2019-10-22 2020-02-28 陈华 Bank operation data processing system and method, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190315