CN108399157A - Dynamic extraction method for entity-attribute relationships, server, and readable storage medium - Google Patents

Info

Publication number
CN108399157A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711389560.0A
Other languages
Chinese (zh)
Other versions
CN108399157B (en)
Inventor
陈虹
董振江
王宇
龚乐君
李涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Nanjing University of Posts and Telecommunications
Original Assignee
ZTE Corp
Application filed by ZTE Corp
Priority to CN201711389560.0A
Publication of CN108399157A
Application granted
Publication of CN108399157B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 - Named entity recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 - Creation of semantic tools, e.g. ontology or thesauri

Abstract

The invention discloses a dynamic extraction method for entity-attribute relationships. The method includes: obtaining text data; and, based on a dynamic entity-attribute relationship library and a training model, dynamically extracting the various features of entities and attributes from the text data. The invention also provides a server and a readable storage medium. With the invention, a dynamic entity-attribute relationship library and a training model are constructed, and the various features of entities and attributes can be extracted automatically from text data.

Description

Dynamic extraction method for entity-attribute relationships, server, and readable storage medium
Technical field
The present invention relates to the field of Internet technology, and in particular to a dynamic extraction method for entity-attribute relationships, a server, and a readable storage medium.
Background
With the rapid development of the Internet and the arrival of the big-data era, technologies and services in certain specialized fields, such as telecommunications, face the opportunities and challenges of technology upgrades and service updates. These fields generate a large amount of knowledge and specialized terminology and have become knowledge-intensive industries. As the volume of information in the telecommunications field grows, it forms a very large and unordered information resource base, in which unstructured or semi-structured text data carries a wealth of valuable telecom information. Named entities are important linguistic units that carry information in text, and recognizing them is an indispensable step in obtaining valuable information. Different entities have different attributes; entities of the same class have roughly the same attributes, and only the attribute values differ.
Named entity recognition includes the recognition of entities and the extraction of attributes. In the general domain, entity recognition assigns the entities in a text to particular semantic types. Existing methods fall into three main categories: dictionary-based, statistics-based, and rule-based. Specifically:
Dictionary-based methods find named entities mainly by string matching against a dictionary, but a comprehensive entity library is usually not available, and the matching is time-consuming.
Rule-based methods add lexical, syntactic, and semantic rules to the entity recognition process and identify the various types of named entities by rule matching. However, rule-based methods are limited by the need to add rules manually.
Statistics-based methods are trained on manually annotated or raw corpora. They first build a language model and then estimate the model parameters on the training data, which makes them easier to port to other languages and new domains. Statistics-based methods mainly use statistical models such as hidden Markov models, maximum entropy models, support vector machines, and conditional random fields. The task of attribute extraction is to build an attribute table for each entity semantic class and to extract the attribute values. Attribute extraction relies mainly on pattern matching and statistics-based methods, but it has so far been studied far less than entity recognition. The prior art for extracting entities and attribute relationships therefore still has shortcomings and defects.
Summary of the invention
The main purpose of the present invention is to provide a dynamic extraction method for entity-attribute relationships, a server, and a readable storage medium, intended to solve the problem that the knowledge bases and corpora of specific technical fields are incomplete.
To achieve the above purpose, the present invention provides a dynamic extraction method for entity-attribute relationships, the method including the steps of:
obtaining text data;
based on a dynamic entity-attribute relationship library and a training model, dynamically extracting the various features of entities and attributes from the text data.
In addition, to achieve the above purpose, the present invention also provides a server, the server including a processor and a memory;
the processor is configured to execute a dynamic extraction program for entity-attribute relationships stored in the memory, so as to implement the above method.
In addition, to achieve the above purpose, the present invention also provides a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the above method.
The dynamic extraction method for entity-attribute relationships, the server, and the readable storage medium proposed by the present invention obtain text data and, based on a dynamic entity-attribute relationship library and a training model, dynamically extract the various features of entities and attributes from the text data. A dynamic entity-attribute relationship library and a training model are thereby constructed, and the various features of entities and attributes can be extracted automatically from text data.
Brief description of the drawings
Fig. 1 is a flow diagram of the dynamic extraction method for entity-attribute relationships provided by the first embodiment of the present application;
Fig. 2 is a first additional flow diagram of the dynamic extraction method for entity-attribute relationships provided by the first embodiment of the present application;
Fig. 3 is a first sub-flow diagram of the dynamic extraction method for entity-attribute relationships provided by the first embodiment of the present application;
Fig. 4 is an example diagram of the dynamic extraction method for entity-attribute relationships provided by the first embodiment of the present application;
Fig. 5 is a second additional flow diagram of the dynamic extraction method for entity-attribute relationships provided by the first embodiment of the present application;
Fig. 6 is a second sub-flow diagram of the dynamic extraction method for entity-attribute relationships provided by the first embodiment of the present application;
Fig. 7 is a schematic diagram of the server hardware architecture provided by the second embodiment of the present application;
Fig. 8 is a module diagram of the dynamic extraction program for entity-attribute relationships in Fig. 7.
The realization of the purpose, the functional features, and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description
It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
In the following description, suffixes such as "module", "component", or "unit" used to denote elements are only intended to facilitate the description of the present invention and have no specific meaning in themselves. Therefore, "module", "component", and "unit" may be used interchangeably.
First embodiment
Fig. 1 is a flow diagram of the dynamic extraction method for entity-attribute relationships provided by the first embodiment of the present application. In Fig. 1, the method includes the following steps:
Step 110: obtain text data;
Step 120: based on a dynamic entity-attribute relationship library and a training model, dynamically extract the various features of entities and attributes from the text data.
Specifically, when text data is obtained, the various features of entities and attributes are dynamically extracted from the text data based on the pre-established entity-attribute relationship library and entity-attribute relationship training model, and are structured into entity-attribute pairs to give the result of the dynamic extraction.
Once the entity-attribute relationship library and the training model have been established, the relationships between entities and attributes in text data can be recognized, the various features can be extracted dynamically, and the entity-attribute relationship corpus of the training model can be continuously and dynamically expanded. A corpus of more complete scale is thereby obtained as the training corpus, which improves the performance of the statistical-machine-learning-based method for automatically extracting entities and attributes from large volumes of text, so that entities and attributes can be extracted from large volumes of text automatically and comprehensively.
Optionally, as shown in Fig. 2, before step 110, the method further includes:
Step 210: capture multiple sample data;
Step 220: build an entity-attribute relationship library according to the multiple sample data;
Step 230: expand the entity-attribute relationship library according to preset feature rules.
Specifically, a large amount of sample data is obtained: crawler technology is used, with typical keywords of the relevant field (for example, the telecommunications field), to crawl text data related to that field from the Internet. The crawled sample data is studied, and an entity-attribute seed table is built automatically using the entity-attribute-value (EAV) model to serve as the seed library of entity-attribute relationships.
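As an illustration only, a seed table of the kind described above might be represented as follows. This is a minimal sketch under stated assumptions: the embodiment names the EAV model but does not specify a data layout, so the `EavRow` type, the `build_seed_table` helper, and the sample rows are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class EavRow(NamedTuple):
    """One row of an entity-attribute-value table (hypothetical layout)."""
    entity: str
    attribute: str
    value: str


def build_seed_table(rows):
    """Group EAV rows into a nested mapping {entity: {attribute: value}}."""
    table = defaultdict(dict)
    for row in rows:
        table[row.entity][row.attribute] = row.value
    return dict(table)


# Hypothetical telecom sample rows, invented for this example.
SAMPLE_ROWS = [
    EavRow("broadband package", "monthly fee", "100 yuan"),
    EavRow("broadband package", "access mode", "fiber"),
    EavRow("leased-line access", "access mode", "dedicated line"),
]

seed_table = build_seed_table(SAMPLE_ROWS)
```

Storing one row per (entity, attribute, value) triple lets entities of the same class share attribute names while differing only in attribute values, which matches the observation made in the background section.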
Using preset feature rules, the text is split so that, after preprocessing such as sentence splitting and word segmentation, preset key characters or keywords are retained, and the retained key characters or keywords are added to the entity-attribute relationship library. Taking the telecommunications field as an example, these key characters or keywords may be "package", "logical", "phone", "display", and so on; when they are detected, they are added to the entity-attribute relationship library.
Optionally, as shown in Fig. 3, step 230 specifically includes:
Step 310: receive a character string statement;
Step 320: determine whether the character string statement contains a preset keyword in the entity-attribute relationship library; if so, go to step 330; if not, do nothing;
Step 330: divide the character string statement into one or more substring statements;
Step 340: determine whether the matching degree between each substring statement and the preset keywords in the entity-attribute relationship library reaches a preset threshold; if so, the entity in the substring statement already exists in the original entity-attribute relationship library and nothing is done; if not, go to step 350;
Step 350: add the substring statement to the entity-attribute relationship library.
Specifically, a character string statement entered by the user is detected and received. If the statement is found to contain a preset key character or keyword, it is simplified by regular expressions into one or more substring statements. The substring statements are then matched for similarity against the entities in the entity-attribute relationship library. The similarity matching proceeds as follows: a similarity threshold (for example, 1) is set. If the matching degree between a substring statement and an entity in the library is 1, the entity in the substring statement already exists in the original library and no expansion is needed. Conversely, if the matching degree does not reach 1, the original library does not contain the entity in the substring statement, and the library needs to be expanded. Preferably, if several entities fall short of the similarity threshold, the entity with the highest similarity is added to the entity-attribute relationship library.
For example, Fig. 4 is a display diagram of expanding the entity-attribute relationship library. In Fig. 4, when the received query is "I want to learn about WiMAX and Internet access via leased line", entity 1 is obtained as "WiMAX" with a similarity result of 0.800000011920929Pts, and the information corresponding to entity 1 is: service introduction, access mode, terminal, wireless network card, and fault analysis; entity 2 is obtained as "Internet access via leased line", and the information corresponding to entity 2 is the service introduction. Because the similarity between entity 1 and the entity-attribute relationship library is less than 1, entity 1 is added to the library.
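Steps 310 to 350 can be sketched roughly as follows. This is an illustration under stated assumptions, not the disclosed implementation: the embodiment does not name a similarity measure, so the sketch substitutes the ratio from Python's standard `difflib.SequenceMatcher`; the splitting rule (punctuation and the word "and") and the names `similarity` and `expand_library` are hypothetical. For simplicity the sketch adds every substring that falls short of the threshold, rather than only the highest-scoring one.

```python
import re
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1]; 1.0 means the two strings are identical."""
    return SequenceMatcher(None, a, b).ratio()


def expand_library(statement, library, keywords, threshold=1.0):
    """Add unseen substrings of `statement` to `library` (a set).

    Returns the list of substrings that were added.
    """
    # Step 320: do nothing unless the statement contains a preset keyword.
    if not any(keyword in statement for keyword in keywords):
        return []
    # Step 330: split into substrings (assumed rule: punctuation or "and").
    parts = [p.strip() for p in re.split(r"[,;.]|\band\b", statement)]
    added = []
    for part in filter(None, parts):
        # Step 340: best match against the entities already in the library.
        best = max((similarity(part, e) for e in library), default=0.0)
        # Step 350: below the threshold means the entity is new.
        if best < threshold:
            library.add(part)
            added.append(part)
    return added
```

With a threshold of 1, only an exact match counts as an existing entity; a lower threshold would treat near-duplicates as already present and keep the library smaller.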
Optionally, as shown in Fig. 5, the following steps are further included after step 110:
Step 510: annotate the entities and attributes in the text data according to the entity-attribute relationship library;
Step 520: study the annotated corpus in order to select the features of entities and attributes.
Specifically, the crawled text data is annotated in XML using the entity-attribute relationship library, forming a text entity-attribute corpus for the specific field. The annotated corpus is studied, and features of entities and attributes are selected according to the characteristics of the entities and attributes in the text, for example contextual features, part-of-speech features, and lexical features, so that the various features in the text can be extracted.
Further, words, sentences, and the like that may constitute entities can also be selected for annotation and expansion. For example, if the entity "package" already exists in the relationship library and "package A", "package B", and the like appear in other text data, "package A" and "package B" can also be annotated as entities, and the newly annotated entities are added to the entity-attribute relationship library.
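The XML annotation step might look like the following sketch. The tag names `sentence`, `entity`, and `attribute`, the `annotate` function, and the longest-match-first strategy are assumptions made for illustration; the embodiment states only that XML is used for annotation.

```python
from xml.sax.saxutils import escape


def annotate(text, entities, attributes):
    """Wrap known entity and attribute mentions in XML tags."""
    terms = [(t, "entity") for t in entities]
    terms += [(t, "attribute") for t in attributes]
    # Try longer terms first so that "package A" wins over "package".
    terms.sort(key=lambda pair: len(pair[0]), reverse=True)
    out = []
    i = 0
    while i < len(text):
        for term, tag in terms:
            if term and text.startswith(term, i):
                out.append(f"<{tag}>{escape(term)}</{tag}>")
                i += len(term)
                break
        else:
            # No term starts here: copy one character, escaped for XML.
            out.append(escape(text[i]))
            i += 1
    return "<sentence>" + "".join(out) + "</sentence>"
```

Matching longer terms first reflects the "package" versus "package A" example above: the more specific mention is annotated as a single entity rather than being split.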
Optionally, the dynamic extraction method for entity-attribute relationships of the present application further includes establishing an entity-attribute relationship training model. As shown in Fig. 6, establishing the entity-attribute relationship training model specifically includes the following steps:
Step 610: capture multiple text corpora;
Step 620: process the text corpora into one or more corpus files in a preset format;
Step 630: train on the one or more corpus files to generate a model file;
Step 640: annotate the model file using the feature function set in the model file and a preset algorithm.
Specifically, the text corpus is preprocessed to generate one or more training corpus files in a preset format, at the character level and at the word level, for example a training file in the prescribed format, a test file, and a correct-answer file for evaluation.
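The preprocessing into a preset format could be sketched as below. The embodiment does not define the format, so this sketch assumes a column layout of a kind commonly used by CRF toolkits: one token and its tag per line, separated by a tab, with a blank line between sentences. The BIO tagging scheme, the labels `ENT` and `ATTR`, and the function names are likewise assumptions.

```python
def to_bio_lines(tokens, spans):
    """Convert tokens plus labeled spans to 'token<TAB>tag' lines.

    `spans` holds (start, end, label) triples with `end` exclusive.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the span
        for k in range(start + 1, end):
            tags[k] = f"I-{label}"          # remaining tokens of the span
    return [f"{tok}\t{tag}" for tok, tag in zip(tokens, tags)]


def write_corpus(sentences):
    """Join sentences into one file body, blank line between sentences."""
    blocks = ["\n".join(to_bio_lines(toks, spans)) for toks, spans in sentences]
    return "\n\n".join(blocks) + "\n"
```

The same function can serve both granularities mentioned above: passing single characters as tokens yields a character-level file, and passing segmented words yields a word-level file.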
The corpus files generated in the preprocessing stage are used to generate the training file; in this embodiment, the training file can be generated with the software development kit (SDK) provided by CRF. Using the feature function set and parameters in the model file, the Viterbi labeling algorithm is applied to obtain the globally optimal annotation result for the test input data.
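For readers unfamiliar with Viterbi labeling, the following is a generic textbook sketch of the algorithm, not the CRF toolkit's own code or API. It assumes that the feature functions and parameters have already been reduced to additive (log-space) scores: one score per label at each position (`emissions`) and one score per label-to-label transition (`transitions`). Missing scores default to 0, and all names are hypothetical.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions:   list of {label: score}, one dict per position
    transitions: {(previous_label, label): score}
    """
    if not emissions:
        return []
    # best[t][y]: score of the best path that ends in label y at position t
    best = [{y: emissions[0].get(y, 0.0) for y in labels}]
    back = []  # back[t - 1][y]: predecessor of y on that best path
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(
                labels,
                key=lambda p: best[-1][p] + transitions.get((p, y), 0.0),
            )
            scores[y] = (
                best[-1][prev]
                + transitions.get((prev, y), 0.0)
                + emissions[t].get(y, 0.0)
            )
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

Because each position keeps only the best path per label, the search is linear in sentence length while still returning the globally optimal sequence under the given scores.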
Optionally, the process of establishing the entity-attribute relationship training model may further include:
determining the precision, recall, and F-measure of the annotated model file.
Specifically, in this embodiment, the annotation results are compared with the reference answers to obtain the precision, recall, and F-measure of the recognition.
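The comparison of annotation results with reference answers can be sketched as follows, using the standard definitions of precision, recall, and the balanced F-measure (F1). Treating each annotation as a hashable item such as a (mention, label) pair is an assumption of this sketch, as is the function name `evaluate`.

```python
def evaluate(predicted, gold):
    """Return (precision, recall, f1) for predicted vs. reference items."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    # Precision: share of predicted items that are correct.
    precision = correct / len(predicted) if predicted else 0.0
    # Recall: share of reference items that were found.
    recall = correct / len(gold) if gold else 0.0
    # F1: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

For example, if three items are predicted, four are in the reference answers, and two are common to both, precision is 2/3, recall is 1/2, and F1 is 4/7.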
In practical applications, the above process is repeated each time text data is obtained, so that the dynamic entity-attribute relationship library and the training model are built up dynamically. To let the model learn new knowledge when samples are limited, the selected elements are added to the vocabulary. As the number of data samples grows, telecom entities are recognized automatically by learning from large volumes of data, which expands the scale of the named entity library. The dynamically constructed telecom entity-attribute corpus yields a corpus of fairly complete scale as the training corpus, which improves the performance of the statistical-machine-learning-based method for automatically extracting entities and attributes from large volumes of text, so that entities and attributes can be extracted from large volumes of text automatically and comprehensively.
The dynamic extraction method for entity-attribute relationships provided by this embodiment obtains text data and, based on a dynamic entity-attribute relationship library and a training model, dynamically extracts the various features of entities and attributes from the text data. A dynamic entity-attribute relationship library and a training model are thereby constructed, and the various features of entities and attributes can be extracted automatically from text data.
Second embodiment
Fig. 7 is a schematic diagram of the server hardware architecture provided by the second embodiment of the present application. In Fig. 7, the server includes a memory 710, a processor 720, and a dynamic extraction program 730 for entity-attribute relationships that is stored on the memory 710 and can run on the processor 720. In this embodiment, the dynamic extraction program 730 includes a series of computer program instructions stored on the memory 710; when these instructions are executed by the processor 720, the dynamic extraction operations for entity-attribute relationships of the embodiments of the present invention can be implemented. In some embodiments, the dynamic extraction program 730 can be divided into one or more modules according to the specific operations implemented by each part of the computer program instructions. As shown in Fig. 8, the dynamic extraction program 730 includes: a data acquisition module 810, a dynamic extraction module 820, a relationship library construction module 830, an expansion module 840, an annotation module 850, a feature selection module 860, and a model construction module 870. Specifically:
The data acquisition module 810 is configured to obtain text data;
The dynamic extraction module 820 is configured to dynamically extract the various features of entities and attributes from the text data based on a dynamic entity-attribute relationship library and a training model.
Specifically, when the data acquisition module 810 obtains text data, the dynamic extraction module 820 dynamically extracts the various features of entities and attributes from the text data based on the pre-established entity-attribute relationship library and entity-attribute relationship training model, and structures them into entity-attribute pairs to give the result of the dynamic extraction.
Once the entity-attribute relationship library and the training model have been established, the dynamic extraction module 820 can recognize the relationships between entities and attributes in text data, dynamically extract the various features, and continuously and dynamically expand the entity-attribute relationship corpus of the training model. A corpus of more complete scale is thereby obtained as the training corpus, which improves the performance of the statistical-machine-learning-based method for automatically extracting entities and attributes from large volumes of text, so that entities and attributes can be extracted from large volumes of text automatically and comprehensively.
The data acquisition module 810 is further configured to capture multiple sample data;
The relationship library construction module 830 is configured to build an entity-attribute relationship library according to the multiple sample data;
The expansion module 840 is configured to expand the entity-attribute relationship library according to preset feature rules.
Specifically, the data acquisition module 810 obtains a large amount of sample data by using crawler technology, with typical keywords of the relevant field (for example, the telecommunications field), to crawl text data related to that field from the Internet. The crawled sample data is studied, and an entity-attribute seed table is built automatically using EAV to serve as the seed library of entity-attribute relationships.
Using preset feature rules, the text is split so that, after preprocessing such as sentence splitting and word segmentation, preset key characters or keywords are retained, and the retained key characters or keywords are added to the entity-attribute relationship library. Taking the telecommunications field as an example, these key characters or keywords may be "package", "logical", "phone", "display", and so on; when they are detected, they are added to the entity-attribute relationship library.
Optionally, as shown in Fig. 3, the expansion module 840 is specifically configured to:
receive a character string statement;
determine whether the character string statement contains a preset keyword in the entity-attribute relationship library; if so, divide the character string statement into one or more substring statements;
determine whether the matching degree between each substring statement and the preset keywords in the entity-attribute relationship library reaches a preset threshold; if so, the entity in the substring statement already exists in the original entity-attribute relationship library and nothing is done; if not, add the substring statement to the entity-attribute relationship library.
Specifically, a character string statement entered by the user is detected and received. If the statement is found to contain a preset key character or keyword, it is simplified by regular expressions into one or more substring statements. The substring statements are then matched for similarity against the entities in the entity-attribute relationship library. The similarity matching proceeds as follows: a similarity threshold (for example, 1) is set. If the matching degree between a substring statement and an entity in the library is 1, the entity in the substring statement already exists in the original library and no expansion is needed. Conversely, if the matching degree does not reach 1, the original library does not contain the entity in the substring statement, and the library needs to be expanded. Preferably, if several entities fall short of the similarity threshold, the entity with the highest similarity is added to the entity-attribute relationship library.
For example, Fig. 4 is a display diagram of expanding the entity-attribute relationship library. In Fig. 4, when the received query is "I want to learn about WiMAX and Internet access via leased line", entity 1 is obtained as "WiMAX" with a similarity result of 0.800000011920929Pts, and the information corresponding to entity 1 is: service introduction, access mode, terminal, wireless network card, and fault analysis; entity 2 is obtained as "Internet access via leased line", and the information corresponding to entity 2 is the service introduction. Because the similarity between entity 1 and the entity-attribute relationship library is less than 1, entity 1 is added to the library.
The annotation module 850 is configured to annotate the entities and attributes in the text data according to the entity-attribute relationship library;
The feature selection module 860 is configured to study the annotated corpus in order to select the features of entities and attributes.
Specifically, the annotation module 850 annotates the crawled text data in XML using the entity-attribute relationship library, forming a text entity-attribute corpus for the specific field. The feature selection module 860 studies the annotated corpus and selects features of entities and attributes according to the characteristics of the entities and attributes in the text, for example contextual features, part-of-speech features, and lexical features, so that the various features in the text can be extracted.
Further, words, sentences, and the like that may constitute entities can also be selected for annotation and expansion. For example, if the entity "package" already exists in the relationship library and "package A", "package B", and the like appear in other text data, "package A" and "package B" can also be annotated as entities, and the newly annotated entities are added to the entity-attribute relationship library.
The model construction module 870 is configured to establish the entity-attribute relationship training model and includes: a preprocessing unit 871, a training unit 872, an annotation unit 873, and an evaluation unit 874. Specifically:
The preprocessing unit 871 is configured to process the multiple captured text corpora into one or more corpus files in a preset format;
The training unit 872 is configured to train on the one or more corpus files to generate a model file;
The annotation unit 873 is configured to annotate the model file using the feature function set in the model file and a preset algorithm;
The evaluation unit 874 is configured to determine the precision, recall, and F-measure of the annotated model file.
Specifically, the text corpus is preprocessed to generate one or more training corpus files in a preset format, at the character level and at the word level, for example a training file in the prescribed format, a test file, and a correct-answer file for evaluation.
The corpus files generated in the preprocessing stage are used to generate the training file; in this embodiment, the training file can be generated with the SDK provided by CRF. Using the feature function set and parameters in the model file, the Viterbi labeling algorithm is applied to obtain the globally optimal annotation result for the test input data.
In this embodiment, the annotation results are compared with the reference answers to obtain the precision, recall, and F-measure of the recognition.
In practical applications, the above process is repeated each time text data is obtained, so that the dynamic entity-attribute relationship library and the training model are built up dynamically. To let the model learn new knowledge when samples are limited, the selected elements are added to the vocabulary. As the number of data samples grows, telecom entities are recognized automatically by learning from large volumes of data, which expands the scale of the named entity library. The dynamically constructed telecom entity-attribute corpus yields a corpus of fairly complete scale as the training corpus, which improves the performance of the statistical-machine-learning-based method for automatically extracting entities and attributes from large volumes of text, so that entities and attributes can be extracted from large volumes of text automatically and comprehensively.
In the server provided by this embodiment, the data acquisition module 810 obtains text data, and the dynamic extraction module 820, based on a dynamic entity-attribute relationship library and a training model, dynamically extracts the various features of entities and attributes from the text data. A dynamic entity-attribute relationship library and a training model are thereby constructed, and the various features of entities and attributes can be extracted automatically from text data.
Third embodiment
The embodiment of the present application also provides a kind of computer readable storage mediums.Here computer readable storage medium It is stored with one or more program.Wherein, computer readable storage medium may include volatile memory, such as at random Access memory;Memory can also include nonvolatile memory, such as read-only memory, flash memory, hard disk or solid State hard disk;Memory can also include the combination of the memory of mentioned kind.When one in computer readable storage medium or Multiple programs can be executed by one or more processor, to realize that the entity that above-mentioned first embodiment is provided is closed with attribute The dynamic abstracting method of system.
It should be noted that herein, the terms "include", "comprise" or its any other variant are intended to non- It is exclusive to include, so that process, method, article or device including a series of elements include not only those elements, But also include other elements that are not explicitly listed, or further include for this process, method, article or device institute Intrinsic element.In the absence of more restrictions, the element limited by sentence "including a ...", it is not excluded that There is also other identical elements in process, method, article or device including the element.
The embodiments of the present invention are for illustration only, can not represent the quality of embodiment.
Through the above description of the embodiments, those skilled in the art can be understood that above-described embodiment Method can add the mode of required general hardware platform to realize by software, naturally it is also possible to by hardware, but many situations It is lower the former be more preferably embodiment.Based on this understanding, technical scheme of the present invention is substantially in other words to the prior art The part to contribute can be expressed in the form of software products, which is stored in a storage and is situated between In matter (such as ROM/RAM, magnetic disc, CD), including some instructions are used so that a station terminal (can be mobile phone, computer, clothes Be engaged in device, air conditioner or the network equipment etc.) execute method described in each embodiment of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the invention is not limited to the specific embodiments above, which are merely illustrative rather than restrictive. Inspired by the present invention, those of ordinary skill in the art can devise many other forms without departing from the purpose of the invention and the scope protected by the claims, all of which fall within the protection of the present invention.

Claims (10)

1. A dynamic extraction method of entity and attribute relationships, characterized in that the method comprises the steps of:
obtaining text data;
dynamically extracting the features of entities and attributes from the text data based on an entity attribute relationship library and a training model.
2. The dynamic extraction method of entity and attribute relationships according to claim 1, characterized in that, before obtaining the text data, the method further comprises:
capturing a plurality of sample data;
building the entity attribute relationship library according to the plurality of sample data.
3. The dynamic extraction method of entity and attribute relationships according to claim 2, characterized in that the method further comprises: expanding the entity attribute relationship library according to preset feature rules.
4. The dynamic extraction method of entity and attribute relationships according to claim 3, characterized in that expanding the entity attribute relationship library according to preset feature rules comprises:
receiving a character string sentence;
determining whether the character string sentence contains a preset keyword in the entity attribute relationship library;
if so, dividing the character string sentence into one or more substring sentences;
determining whether the matching degree between each substring sentence and the preset keywords in the entity attribute relationship library reaches a preset threshold;
if not, adding the substring sentence to the entity attribute relationship library.
5. The dynamic extraction method of entity and attribute relationships according to claim 1, characterized in that, after obtaining the text data, the method further comprises:
annotating entities and attributes in the text data according to the entity attribute relationship library;
learning from the annotated corpus to select features of entities and attributes.
6. The dynamic extraction method of entity and attribute relationships according to claim 1, characterized in that, before obtaining the text data, the method further comprises:
establishing an entity attribute relationship training model.
7. The dynamic extraction method of entity and attribute relationships according to claim 6, characterized in that establishing the entity attribute relationship training model comprises:
capturing a plurality of text corpora;
processing the text corpora into one or more corpus files in a preset format;
training on the one or more corpus files to generate a model file;
labeling the model file by means of the feature function set in the model file and a preset algorithm.
8. The dynamic extraction method of entity and attribute relationships according to claim 7, characterized in that the method further comprises:
determining the accuracy rate, recall rate, and F-measure of the labeled model file.
9. A server, characterized in that the server comprises a processor and a memory;
the processor is configured to execute a dynamic extraction program of entity and attribute relationships stored in the memory, so as to implement the method of any one of claims 1-8.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the method of any one of claims 1-8.
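
For illustration only, the library expansion steps of claim 4 and the evaluation measures of claim 8 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation disclosed in the patent: the claims do not specify how a sentence is split, how the matching degree is computed, or the value of the preset threshold, so the punctuation-based split, the `difflib.SequenceMatcher` ratio, the 0.8 threshold, and all function names (`matching_degree`, `expand_library`, `precision_recall_f`) are hypothetical choices. The sketch also assumes that the "accuracy rate" of claim 8 corresponds to precision and that the F-measure is the usual weighted harmonic mean of precision and recall.

```python
from __future__ import annotations

import re
from difflib import SequenceMatcher

# Hypothetical splitting rule: the claims only say the sentence is divided
# into substring sentences, so splitting on punctuation is an assumption.
SPLIT_PATTERN = re.compile(r"[,;.!?，；。！？]+")


def matching_degree(a: str, b: str) -> float:
    """Assumed similarity measure: difflib ratio in the range [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def expand_library(sentence: str, library: set[str], threshold: float = 0.8) -> list[str]:
    """Apply the claim 4 steps to one sentence and return the substrings added."""
    # Proceed only if the sentence contains a keyword already in the library.
    if not any(keyword in sentence for keyword in library):
        return []
    # Divide the sentence into non-empty substring sentences.
    parts = [p.strip() for p in SPLIT_PATTERN.split(sentence) if p.strip()]
    added = []
    for part in parts:
        # Best matching degree against the keywords known before this call.
        best = max((matching_degree(part, k) for k in library), default=0.0)
        # Below the threshold: treat the substring as new and queue it.
        if best < threshold:
            added.append(part)
    library.update(added)
    return added


def precision_recall_f(predicted: set, gold: set, beta: float = 1.0) -> tuple[float, float, float]:
    """Return (precision, recall, F-measure) for predicted versus gold items."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_measure = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_measure
```

In this sketch, calling `expand_library("blood pressure, heart rate", {"blood pressure"})` would leave the existing keyword untouched and add only the unmatched substring; the threshold controls how aggressively near-duplicates of existing keywords are rejected.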
CN201711389560.0A 2017-12-21 2017-12-21 Dynamic extraction method of entity and attribute relationship, server and readable storage medium Active CN108399157B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711389560.0A CN108399157B (en) 2017-12-21 2017-12-21 Dynamic extraction method of entity and attribute relationship, server and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711389560.0A CN108399157B (en) 2017-12-21 2017-12-21 Dynamic extraction method of entity and attribute relationship, server and readable storage medium

Publications (2)

Publication Number Publication Date
CN108399157A true CN108399157A (en) 2018-08-14
CN108399157B CN108399157B (en) 2023-08-18

Family

ID=63094325

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711389560.0A Active CN108399157B (en) 2017-12-21 2017-12-21 Dynamic extraction method of entity and attribute relationship, server and readable storage medium

Country Status (1)

Country Link
CN (1) CN108399157B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104160390A (en) * 2012-03-06 2014-11-19 微软公司 Entity augmentation service from latent relational data
CN104572125A (en) * 2015-01-28 2015-04-29 中国农业银行股份有限公司 Methods and devices for drawing and storing entity relation diagrams
CN104933164A (en) * 2015-06-26 2015-09-23 华南理工大学 Method for extracting relations among named entities in Internet massive data and system thereof
CN106202382A (en) * 2016-07-08 2016-12-07 南京缘长信息科技有限公司 Link instance method and system

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109726398A (en) * 2018-12-27 2019-05-07 北京奇安信科技有限公司 A kind of Entity recognition and determined property method, system, equipment and medium
CN109726398B (en) * 2018-12-27 2023-07-07 奇安信科技集团股份有限公司 Entity identification and attribute judgment method, system, equipment and medium
CN111858948A (en) * 2019-04-30 2020-10-30 杭州海康威视数字技术股份有限公司 Ontology construction method and device, electronic equipment and storage medium
CN110457686A (en) * 2019-07-23 2019-11-15 福建奇点时空数字科技有限公司 A kind of information technology data entity attribute abstracting method based on deep learning
CN112434530A (en) * 2019-08-06 2021-03-02 富士通株式会社 Information processing apparatus, information processing method, and computer program
CN111611799A (en) * 2020-05-07 2020-09-01 北京智通云联科技有限公司 Dictionary and sequence labeling model based entity attribute extraction method, system and equipment
CN111611799B (en) * 2020-05-07 2023-06-02 北京智通云联科技有限公司 Entity attribute extraction method, system and equipment based on dictionary and sequence labeling model

Also Published As

Publication number Publication date
CN108399157B (en) 2023-08-18

Similar Documents

Publication Publication Date Title
CN110781276B (en) Text extraction method, device, equipment and storage medium
CN108628971B (en) Text classification method, text classifier and storage medium for unbalanced data set
CN107679039B (en) Method and device for determining statement intention
CN108287858B (en) Semantic extraction method and device for natural language
CN108399157A (en) Dynamic abstracting method, server and the readable storage medium storing program for executing of entity and relation on attributes
US11409642B2 (en) Automatic parameter value resolution for API evaluation
CN105426354B (en) The fusion method and device of a kind of vector
CN111488468B (en) Geographic information knowledge point extraction method and device, storage medium and computer equipment
CN110555440B (en) Event extraction method and device
CN108549723B (en) Text concept classification method and device and server
CN105760363B (en) Word sense disambiguation method and device for text file
CN103365834B (en) Language Ambiguity eliminates system and method
CN109299277A (en) The analysis of public opinion method, server and computer readable storage medium
CN106874397B (en) Automatic semantic annotation method for Internet of things equipment
CN108536673B (en) News event extraction method and device
Armouty et al. Automated keyword extraction using support vector machine from Arabic news documents
CN113722438A (en) Sentence vector generation method and device based on sentence vector model and computer equipment
CN112860896A (en) Corpus generalization method and man-machine conversation emotion analysis method for industrial field
CN112069312A (en) Text classification method based on entity recognition and electronic device
CN104573030A (en) Textual emotion prediction method and device
CN112183102A (en) Named entity identification method based on attention mechanism and graph attention network
CN109271624A (en) A kind of target word determines method, apparatus and storage medium
CN110968664A (en) Document retrieval method, device, equipment and medium
CN110196910B (en) Corpus classification method and apparatus
CN110309355A (en) Generation method, device, equipment and the storage medium of content tab

Legal Events

Date Code Title Description
PB01 Publication
TA01 Transfer of patent application right

Effective date of registration: 20180829

Address after: 518000 Zhongnan communication tower, South China Road, Nanshan District high tech Industrial Park, Shenzhen, Guangdong

Applicant after: ZTE Corp.

Applicant after: NANJING University OF POSTS AND TELECOMMUNICATIONS

Address before: 518000 Zhongnan communication tower, South China Road, Nanshan District high tech Industrial Park, Shenzhen, Guangdong

Applicant before: ZTE Corp.

SE01 Entry into force of request for substantive examination
GR01 Patent grant