CN106815293A - System and method for constructing knowledge graph for information analysis - Google Patents

System and method for constructing knowledge graph for information analysis

Info

Publication number
CN106815293A
CN106815293A CN201611124399.XA CN201611124399A CN106815293A CN 106815293 A CN106815293 A CN 106815293A CN 201611124399 A CN201611124399 A CN 201611124399A CN 106815293 A CN106815293 A CN 106815293A
Authority
CN
China
Prior art keywords
data
relation
module
analysis
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611124399.XA
Other languages
Chinese (zh)
Inventor
王金华
姜春涛
丘定
姜鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
No32 Research Institute Of China Electronics Technology Group Corp
Original Assignee
No32 Research Institute Of China Electronics Technology Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by No32 Research Institute Of China Electronics Technology Group Corp
Priority to CN201611124399.XA
Publication of CN106815293A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A system and method for constructing a knowledge graph for intelligence analysis. The system comprises: a data acquisition module, which cleans and lightly preprocesses the acquired data and outputs it to a text extraction module; the text extraction module, which cleans and preprocesses the acquired structured and unstructured data and passes the clean data to an entity recognition module; the entity recognition module, which segments the text into words, tags each word with its part of speech, extracts terms, and passes the result to a semantic analysis module; the semantic analysis module, which analyzes and extracts the relations between ontology concepts, generates a semantic metadata model with an ontology construction tool, and outputs the model to an entity relation extraction module; and the entity relation extraction module, which extracts classification relations and non-classification relations and finally generates the knowledge graph. The invention combines syntactic training with association rules, which reduces external input and manual intervention while allowing entity relations to be identified continuously.

Description

A system and method for constructing a knowledge graph for intelligence analysis
Technical field
The present invention relates to natural language processing, computer information processing technology, and Chinese knowledge base applications, and in particular to the field of knowledge graph construction.
Background
Recent years have been an era of data explosion, with data volume growing by roughly 50% every year. Research institutions and search engine vendors around the world have worked hard to process this massive data, mine its latent value, and improve retrieval quality and efficiency. As projects such as Linking Open Data have advanced, the number of Semantic Web data sources has surged and a large amount of RDF (Resource Description Framework) data has been published. The Internet is changing from a Document Web, which contains only web pages and the hyperlinks between them, into a Data Web, which describes a wide variety of entities and the rich relations between them. Against this background, search engine companies such as Google, Baidu, and Sogou have each built knowledge graphs on this basis (Knowledge Graph, Zhixin, and Zhilifang, respectively) to improve search quality, opening the way to semantic search. At the same time, massive, complex, and heterogeneous intelligence information calls for fast analysis, mining, and association: intelligence entities must be mined out quickly and linked into a very large intelligence knowledge graph, and the data environment that processes the intelligence must be strong in data throughput and feedback timeliness. Building a complete, high-quality knowledge graph is an indispensable key technology for achieving these capabilities.
Security and intelligence agencies around the world build systems or develop technologies to collect, fuse, manage, and analyze intelligence big data effectively and to extract valuable intelligence from it. For example, US government intelligence agencies continuously monitor Internet activity and telecom carrier subscriber information at home and abroad through the PRISM program. The US military also invested heavily in companies such as Palantir as early as five years ago, and Palantir, with its mature database and powerful data association analysis technology, assisted the Obama administration in the operation to hunt down bin Laden. After the September 11 terrorist attacks, because data on suspected terrorists who endangered national security was scattered across different agencies, the Terrorist Screening Center (hereinafter TSC) was established under Homeland Security Presidential Directive 6 of 2003. The center is subordinate to and led by the FBI. It is a joint body made up of representatives of the Department of Justice, the Department of Homeland Security, the Department of State, and other departments, and is mainly responsible for identifying suspicious or potential terrorists. It has three divisions: information technology, administrative services, and operations. The National Counterterrorism Center (NCTC) and the FBI pass the lists of known and suspected foreign and domestic terrorists, respectively, through the TSC into the Terrorist Screening Database (TSDB), and the identity information of known and suspected terrorists in the TSDB is added, modified, or deleted according to the daily exchanges between the databases. The TSC operates a round-the-clock call center and has built a worldwide network of characteristic information for terrorist identification, covering location, residence, contact details, and transaction records. This helps law enforcement agencies determine whether a person encountered during terrorist screening matches an entry in the database, and the available information is then supplied to the law enforcement officers who carry out routine screening. The TSC also cooperates actively with fusion centers across the United States to ensure that they transmit information on known and suspected terrorists accurately and promptly, and it authorizes law enforcement officers of the FBI and other agencies to take part in screening, thereby effectively preventing further terrorist attacks. The characteristic information network has thus become a core supporting technology of the US National Counterterrorism Center. The US military's TIA program ("Total Information Awareness", also "Terrorism Information Awareness") uses advanced methods to collect, process, and analyze large-scale terrorism data, with the ultimate aim of stopping terrorist attacks at the source. Its main approach is the EELD (Evidence Extraction and Link Discovery) sub-project, which extracts from unstructured text the evidence of relations and associations among people, organizations, places, and events and assembles it into a knowledge graph. The relations, whereabouts, and activities of terrorists are then modeled and analyzed for associations, which has played an important role in identifying IS terrorist activity. Technically, the TIA program focuses on: 1) the architecture for building a large-scale counterterrorism database; 2) new methods for populating the database from existing sources, creating new sources, and creating new mining, fusion, and refinement algorithms; 3) new knowledge graph models for analyzing and associating database information, so as to obtain actionable intelligence.
By contrast, domestic use of and research into this technology are still very limited. Faced with massive and complex intelligence data spanning many fields, there is an urgent need for a method that cleans and processes the data and turns it into a knowledge graph of high practical value. Knowledge graph construction generally consists of data extraction, Chinese word segmentation, entity recognition, and relation recognition. Relation recognition is currently the hardest of these problems. Its basic principle is entity co-occurrence plus relation labeling. Existing methods either cannot keep improving the relation labels or depend heavily on external knowledge input and manual intervention.
Among existing invention patents, the patent "A reading-domain knowledge graph construction method oriented to books" (publication number CN103488724A, 2014.01.01) describes a method that obtains knowledge from the Internet and integrates a general knowledge graph, then iteratively expands the concepts and entities related to a book with reference to the general knowledge graph, extracts entity relations by combining entity Infobox tables with conventional relations, and finally annotates the core entities in the e-book in order from longest to shortest entity name and links them to the book knowledge graph, so as to deliver intelligent knowledge recommendation. That knowledge graph is built for one particular book, so the method is severely limited. The keywords of the book must also be defined manually during construction, so building a large number of knowledge graphs consumes a great deal of labor, and manual errors easily make the graphs inaccurate. Above all, its scope of application is too narrow for processing massive and complex intelligence data. The patent "Knowledge graph construction method and device based on structured data" (publication number CN104462501A, 2015.03.25) provides a method and device that obtain one or more pieces of structured data containing entity names and the corresponding entity attribute information, extract the mapping between the entity names and their attribute information, generate the corresponding data structure pairs, and store the generated pairs as knowledge graph data. That method handles only structured data, whereas most of the data that needs processing in practice is unstructured, so its applicability is also very narrow and it cannot meet the requirements here.
Summary of the invention
In view of the defects of the prior art, the object of the present invention is to provide a system and method for constructing a knowledge graph for intelligence analysis that combine syntactic training with association rules, which both reduces external input and manual intervention and allows entity relations to be identified continuously.
To achieve this object, the present invention adopts the following technical solution:
A system for constructing a knowledge graph for intelligence analysis, comprising:
a data acquisition module, which cleans and lightly preprocesses the collected data and then outputs it to the text extraction module;
a text extraction module, which cleans and preprocesses the collected structured and unstructured data, determines whether each file is damaged, applies unified encoding conversion and traditional/simplified Chinese conversion to the collected files, and passes the cleaned and preprocessed data to the entity recognition module;
an entity recognition module, which, for the clean text data it receives, first segments the text into words, then tags the segmented words with their parts of speech, extracts terms once tagging is complete, and passes the extracted result to the semantic analysis module;
a semantic analysis module, which analyzes and extracts the relations between ontology concepts, generates a semantic metadata model with an ontology editing tool, and outputs the model to the entity relation extraction module;
an entity relation extraction module, which extracts classification relations and non-classification relations and finally generates the knowledge graph.
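The module chain listed above can be pictured as a sequence of stages that each consume the previous stage's output. The following is a minimal sketch, not the patent's implementation: the `Document` carrier, the `run_pipeline` helper, and the three toy stage functions are hypothetical names introduced only for illustration, and each stage body is a trivial placeholder for the real module.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


@dataclass
class Document:
    """Hypothetical carrier object handed from one module to the next."""
    raw: str
    clean: str = ""
    terms: List[str] = field(default_factory=list)
    relations: List[Triple] = field(default_factory=list)


Stage = Callable[[Document], Document]


def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    """Apply the modules in order; each stage receives the previous output."""
    for stage in stages:
        doc = stage(doc)
    return doc


def text_extraction(doc: Document) -> Document:
    # Placeholder cleaning: collapse runs of whitespace.
    doc.clean = " ".join(doc.raw.split())
    return doc


def entity_recognition(doc: Document) -> Document:
    # Placeholder recognition: treat capitalized tokens as entities.
    doc.terms = [w for w in doc.clean.split() if w[0].isupper()]
    return doc


def relation_extraction(doc: Document) -> Document:
    # Placeholder extraction: link neighbouring entities with a dummy label.
    doc.relations = [(a, "related_to", b)
                     for a, b in zip(doc.terms, doc.terms[1:])]
    return doc
```

Calling `run_pipeline(Document("  Alice   met Bob  "), [text_extraction, entity_recognition, relation_extraction])` yields a document whose `relations` list holds the single placeholder triple linking the two recognized names. Keeping the stages as plain callables makes it easy to swap a placeholder for a real module later.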
The data acquisition module is implemented by a crawler system for intelligence big data that performs targeted crawling around a specific target.
The entity recognition module comprises a word segmentation module, a part-of-speech tagging module, and a term analysis module. For the clean text data received, the word segmentation module first segments the text and extracts all words according to the segmentation rules and the dictionary; the part-of-speech tagging module then tags the segmented words; and once tagging is complete, the term analysis module extracts terms according to the term bank.
The semantic analysis module extracts a grammar library and an ontology library through ontology integration and, using grammar and semantic normalization together with grammar and semantic learning algorithms, generates an information extraction rule base. Relying on the grammar library, the ontology library, and the information extraction rule base generated in this process, it extracts the relations between ontology concepts through grammatical and semantic analysis.
The present invention also provides a method for constructing a knowledge graph for intelligence analysis, implemented by any of the systems described above, comprising:
Step 1: crawl data using a crawler system for intelligence big data that performs targeted crawling around a specific target;
Step 2: clean and filter the data collected in step 1, filtering out damaged data, converting the various text encodings to a unified UTF-8 encoding, and applying preprocessing operations, including traditional/simplified Chinese conversion, to the converted text;
Step 3: perform named entity recognition on the data cleaned and preprocessed in step 2, including word segmentation, part-of-speech tagging, and term analysis;
Step 4: generate an information extraction rule base and, together with the grammar library and ontology library generated in the process, perform grammatical and semantic analysis, then generate a semantic metadata model with an ontology editing tool and output it to the semantic relation recognition of the next step;
Step 5: perform semantic relation recognition and extraction on the semantic metadata model delivered by step 4.
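The cleaning in step 2 (discard damaged data, unify the encoding, convert traditional to simplified Chinese) can be sketched as follows. This is an illustration under stated assumptions rather than the patent's code: it assumes each file arrives with a declared source encoding, treats undecodable or empty input as "damaged", and uses a four-entry toy conversion table; the function names `clean_document` and `to_utf8` are hypothetical.

```python
from typing import Optional

# Toy traditional-to-simplified table (illustrative only). A real system
# would load a complete conversion dictionary instead of these four pairs.
TRAD_TO_SIMP = str.maketrans({"報": "报", "識": "识", "圖": "图", "譜": "谱"})


def clean_document(raw: bytes, declared_encoding: str) -> Optional[str]:
    """Return normalized text, or None when the input is treated as damaged."""
    if not raw:
        return None
    try:
        # Strict decoding: any invalid byte sequence marks the file as damaged.
        text = raw.decode(declared_encoding)
    except (UnicodeDecodeError, LookupError):
        # LookupError covers an unknown or misspelled encoding name.
        return None
    return text.translate(TRAD_TO_SIMP)


def to_utf8(text: str) -> bytes:
    """Serialize cleaned text in the unified UTF-8 encoding."""
    return text.encode("utf-8")
```

A Big5-encoded input such as `"情報分析".encode("big5")` passes through `clean_document(..., "big5")` and comes out as simplified text ready for `to_utf8`. Detecting the encoding automatically is deliberately left out, because byte-level guessing between Chinese encodings is error-prone.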
In step 1, the crawled text formats include Office, PDF, XML, and HTML; data may also be imported from a dedicated database.
Step 3 comprises three sub-steps:
word segmentation step 206, part-of-speech tagging step 208, and term analysis step 210. Segmentation step 206 segments the cleaned and preprocessed text data according to the dictionary and the selected segmentation rules. Once segmentation is finished, part-of-speech tagging step 208 tags the words produced by step 206 according to the dictionary and the part-of-speech tagging rule base. Steps 206 and 208 may also be performed together, so that parts of speech are tagged while the text is being segmented. Term analysis step 210 then analyzes and extracts terms according to the term bank, which is generated by term extraction and glossary integration, and filters out useless words. Through steps 206, 208, and 210, named entities are extracted according to the named entity pattern library, which is generated by integrating machine learning with the dictionary.
To compensate for the weaknesses of dictionary-based statistical segmentation, CRF (conditional random field) segmentation is used, which takes into account not only word frequency information but also the surrounding context.
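The dictionary-driven part of segmentation and part-of-speech tagging can be illustrated with forward maximum matching. Note the assumption: the patent only says that segmentation follows "the dictionary and the selected segmentation rules", so forward maximum matching is a stand-in rule chosen for this sketch, and the `segment_and_tag` function, the lexicon contents, and the `"UNK"` tag are all hypothetical. The CRF model itself is not shown; characters tagged `"UNK"` mark the spans where, according to the text, the CRF would take over.

```python
from typing import Dict, List, Tuple


def segment_and_tag(text: str, lexicon: Dict[str, str]) -> List[Tuple[str, str]]:
    """Segment and tag in one pass using forward maximum matching.

    `lexicon` maps each known word to its part-of-speech tag. At every
    position the longest lexicon word is taken; a character that no entry
    covers is emitted on its own with the tag "UNK".
    """
    max_len = max((len(word) for word in lexicon), default=1)
    result: List[Tuple[str, str]] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            word = text[i:i + size]
            if word in lexicon:
                result.append((word, lexicon[word]))
                i += size
                break
        else:
            # Out-of-vocabulary character: a candidate for the statistical model.
            result.append((text[i], "UNK"))
            i += 1
    return result
```

With a lexicon containing "构建" (verb) and "知识图谱" (noun), the input "构建知识图谱" is split into those two tagged words. Tagging during the same pass mirrors the remark above that steps 206 and 208 can be performed together.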
Step 5 comprises classification relation extraction and non-classification relation extraction. One method of non-classification relation extraction is as follows:
Non-classification relations are extracted by combining association rules with grammatical relations. Association rules are used to measure the strength of the correlation between a pair of concepts and a verb, which serves as the criterion for selecting candidate concept pairs that have a semantic relation. Within the domain text collection, the semantic relations between concepts are found through dependency and sentence structure analysis, since semantic relations between concepts are usually expressed and connected by verbs. After syntactic analysis has selected the candidate relation set, association rule mining confirms the suitable relation set.
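The association-rule check described above can be sketched by scoring rules of the form {concept A, concept B} → verb with the usual support and confidence measures. This is a simplified illustration: it assumes syntactic analysis has already reduced each sentence to a set of concepts and a set of verbs, the thresholds are arbitrary, and the name `mine_relations` is hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import List, Set, Tuple

Sentence = Tuple[Set[str], Set[str]]  # (concepts, verbs) found in one sentence


def mine_relations(sentences: List[Sentence],
                   min_support: float = 0.2,
                   min_confidence: float = 0.6) -> List[Tuple[str, str, str, float]]:
    """Keep (concept, verb, concept) candidates whose rule is strong enough.

    support    = sentences containing the pair and the verb / all sentences
    confidence = sentences containing the pair and the verb / sentences
                 containing the pair
    """
    total = len(sentences)
    pair_count: Counter = Counter()
    rule_count: Counter = Counter()
    for concepts, verbs in sentences:
        # Sorting makes the pair order canonical; direction is not modeled here.
        for pair in combinations(sorted(concepts), 2):
            pair_count[pair] += 1
            for verb in verbs:
                rule_count[(pair, verb)] += 1
    kept = []
    for (pair, verb), hits in rule_count.items():
        support = hits / total
        confidence = hits / pair_count[pair]
        if support >= min_support and confidence >= min_confidence:
            kept.append((pair[0], verb, pair[1], confidence))
    # Strongest rules first.
    return sorted(kept, key=lambda item: -item[3])
```

In a toy corpus where "agency" and "suspect" co-occur three times, twice with "monitor" and once with "interview", only the "monitor" rule reaches the 0.6 confidence threshold. The surviving triples are the relation set that would then go to a domain expert for feedback.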
Step 5 comprises classification relation extraction and non-classification relation extraction. A second method of non-classification relation extraction is as follows:
A hybrid method that combines rules with machine learning extracts non-classification relations from domain text. The extraction process is:
1) learn rules manually from domain text to form a rule base;
2) match the learned rules against other corpus texts to form a corpus sentence-pattern library;
3) annotate the training corpus manually using the sentence-pattern library, then train with the CRF machine learning algorithm to generate a trained model;
4) test with the test corpus and the trained model and verify the results manually; based on the test results, supplement and adjust the training corpus and retrain until the precision and recall of the model reach a certain level;
5) use the adjusted model, combined with the rule matching results, to extract non-classification entity relations from the actual corpus.
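The train, test, and supplement cycle in item 4) can be sketched as a loop that stops once precision and recall both reach a target. The sketch is hedged in several ways: the CRF training itself is not implemented and is represented by a caller-supplied `train` function, manual verification and corpus supplementation are represented by a `supplement` function, and the 0.8 target and the names `precision_recall` and `refine` are assumptions, since the text only says "a certain level".

```python
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]         # (head, relation, tail)
Model = Callable[[str], Set[Triple]]  # maps a text to the triples it extracts


def precision_recall(predicted: Set[Triple], gold: Set[Triple]) -> Tuple[float, float]:
    """Precision = correct / predicted; recall = correct / gold."""
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


def refine(corpus: List[Triple],
           train: Callable[[List[Triple]], Model],
           supplement: Callable[[List[Triple]], List[Triple]],
           test_texts: List[str],
           gold: Set[Triple],
           target: float = 0.8,
           max_rounds: int = 5) -> Tuple[Model, float, float, int]:
    """Retrain until precision and recall both reach `target`."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    for round_no in range(1, max_rounds + 1):
        model = train(corpus)                 # stands in for CRF training
        predicted: Set[Triple] = set()
        for text in test_texts:
            predicted |= model(text)
        precision, recall = precision_recall(predicted, gold)
        if precision >= target and recall >= target:
            break
        corpus = supplement(corpus)           # stands in for manual adjustment
    return model, precision, recall, round_no
```

The loop returns the last trained model together with its scores and the number of rounds used, so a caller can tell whether the target was reached or the round limit was hit.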
The beneficial effects of the technical solution of this patent are as follows:
1. Starting from the collection of intelligence information, the solution performs text segmentation, part-of-speech tagging, term analysis, named entity recognition, syntactic analysis, semantic analysis, and entity relation extraction in sequence, and finally builds the knowledge graph automatically or semi-automatically, greatly reducing the labor needed to build a high-quality knowledge graph.
2. It addresses the problem that technical reconnaissance intelligence comes from many sources with heavy traffic, and that high-density data is difficult to record reliably and analyze efficiently.
3. It can quickly, automatically, and efficiently mine intelligence entities and associate them, automatically or semi-automatically, into a very large intelligence knowledge graph.
4. It provides the ability to quickly analyze, mine, and associate massive, complex, and heterogeneous intelligence information, completing the semi-automatic construction of the intelligence knowledge graph.
5. It greatly improves machine-intelligence-based automatic intelligence recognition, intelligence association confirmation, and intelligence mining and analysis. Because a high-quality knowledge graph underpins efficient knowledge graph query and analysis, this patent also directly improves the efficiency with which a big data platform queries and analyzes complex relation graphs.
Brief description of the drawings
Other features, objects, and advantages of the present invention will become more apparent from the detailed description of non-limiting embodiments given with reference to the following drawings:
Fig. 1 is a system block diagram of the automatic or semi-automatic graph construction for intelligence analysis provided by the present invention;
Fig. 2 is a flow chart of the method;
Fig. 3 is a flow chart of non-classification relation extraction based on association rules;
Fig. 4 is a flow chart of non-classification relation extraction based on rules and machine learning.
Detailed description
The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the present invention but do not limit it in any way. It should be noted that those of ordinary skill in the art can make various changes and improvements without departing from the inventive concept, and all of these fall within the scope of protection of the present invention.
The system and method for automatic or semi-automatic graph construction for intelligence analysis provided by the present invention comprise four main parts: intelligence collection, named entity recognition, grammatical and semantic analysis, and entity relation extraction. The core technologies are named entity recognition and entity relation extraction.
Intelligence collection is implemented with a widely used data crawler system. Compared with a general-purpose crawler that targets breadth, the crawler is improved and optimized into a crawler system for intelligence big data that performs targeted crawling around a specific target. The crawled data is cleaned and lightly preprocessed and then output to the named entity recognition module.
Named entity recognition comprises a word segmentation module, a part-of-speech tagging module, and a term analysis module.
Word segmentation and part-of-speech tagging can be performed together, so that parts of speech are tagged while the text is segmented. Basic segmentation and tagging rely on a segmentation rule base and on a library that maps dictionary words to their parts of speech. In practice, however, this approach recognizes ambiguous words and new words poorly. Chinese word segmentation in particular faces polysemy, ambiguity, and a constant stream of new Internet terms, so the segmentation lexicon must be expanded continually. To compensate for the weaknesses of dictionary-based statistical segmentation, the present invention adopts a segmentation technique based on conditional random fields (CRF), which considers not only word frequency information but also the surrounding context and therefore learns to handle ambiguous and new words well. The present invention combines a continually expanded segmentation lexicon with this relatively mature segmentation technique. Where the segmentation lexicon and the part-of-speech library suffice, segmentation and tagging are completed from them, which is both accurate and very efficient. When ambiguous words, new words, or cases that the lexicon and part-of-speech library cannot resolve are encountered, the CRF completes the work, and the lexicon and part-of-speech library are expanded with the result. After segmentation and tagging, the output goes to the term analysis module.
Term analysis generally means extracting terms from the glossary into the term bank. The term bank must be partitioned and maintained by domain. After term analysis, the output goes to the grammatical and semantic analysis module.
Grammatical and semantic analysis. A grammar library and an ontology library are extracted through ontology integration and, using grammar and semantic normalization together with grammar and semantic learning algorithms, an information extraction rule base is generated. Relying on the grammar library, ontology library, and information extraction rule base generated in this process, the relations between ontology concepts are extracted through grammatical and semantic analysis; a semantic metadata model is then generated with an ontology editing tool and output to the entity relation extraction module.
Entity relation extraction comprises classification relation extraction and non-classification relation extraction. The difficulty lies in non-classification relation extraction, which can be divided into two separate problems: 1) finding the relation that exists between a pair of concepts; 2) labeling that relation according to its semantics. This patent uses two methods to extract non-classification relations.
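The difference between the two relation types can be made concrete by storing both as labelled triples in one graph. This is only an illustrative data structure, not the semantic metadata model of the patent: the `is_a` label for classification relations, the helper names `build_graph` and `ancestors`, and the sample facts (loosely based on the TSC example in the background section) are assumptions made for the sketch.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)
IS_A = "is_a"                  # label used here for classification relations


def build_graph(triples: Iterable[Triple]) -> DefaultDict[str, List[Tuple[str, str]]]:
    """Index triples by head entity: head -> [(relation, tail), ...]."""
    graph: DefaultDict[str, List[Tuple[str, str]]] = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def ancestors(graph, concept: str) -> List[str]:
    """Follow only is_a edges upward; other (verb-labelled) edges are ignored."""
    found: List[str] = []
    stack = [concept]
    while stack:
        node = stack.pop()
        for relation, tail in graph.get(node, []):
            if relation == IS_A and tail not in found:
                found.append(tail)
                stack.append(tail)
    return found
```

Given the triples `("FBI", "is_a", "law_enforcement_agency")`, `("law_enforcement_agency", "is_a", "organization")`, and `("FBI", "leads", "TSC")`, the `ancestors` walk returns the two classification parents, while the verb-labelled "leads" edge remains available as a non-classification relation.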
The first method extracts non-classification relations by combining association rules with grammatical relations. The present invention uses association rules to measure the strength of the correlation between a pair of concepts and a verb (the strength is defined by the confidence of the extracted association rule), which serves as the criterion for selecting candidate concept pairs that have a semantic relation. The present invention also proposes finding the semantic relations between concepts in the domain text collection through dependency and sentence structure analysis, since semantic relations between concepts are usually expressed and connected by verbs. After syntactic analysis has selected the candidate relation set, association rule mining confirms the suitable relation set, and finally, with feedback from domain experts, the most suitable relation set is provided to the ontology engineer. The method not only finds relations between concepts automatically but also assigns suitable labels to them, which greatly eases the burden on knowledge engineers when building a domain ontology.
The second method is a hybrid that combines rules with machine learning to extract non-classification relations from domain text. The extraction process is: 1) learn rules manually from domain text to form a rule base; 2) match the learned rules against other corpus texts to form a corpus sentence-pattern library; 3) annotate the training corpus manually using the sentence-pattern library, then train with the CRF machine learning algorithm to generate a trained model; 4) test with the test corpus and the trained model and verify the results manually, then, based on the test results, supplement and adjust the training corpus and retrain until the precision and recall of the model reach a certain level; 5) use the adjusted model, combined with the rule matching results, to extract non-classification entity relations from the actual corpus. The method effectively improves the extraction of non-classification relations from a term set, and the manually assisted CRF method can also extract more terms, effectively supplementing the existing term set.
As shown in Fig. 1, the system 101 for automatic or semi-automatic graph construction for intelligence analysis provided by the embodiment of the present invention comprises: data acquisition module 102, text extraction module 103, entity recognition module 104, semantic analysis module 108, entity relation extraction module 109, and knowledge graph 110. Entity recognition module 104 comprises three sub-modules: word segmentation module 105, part-of-speech tagging module 106, and term analysis module 107. Data acquisition module 102 is implemented with a widely used data crawler system. Compared with a general-purpose crawler that targets breadth, the crawler has been improved and optimized into a crawler system for intelligence big data that performs targeted crawling around a specific target. The crawled data is cleaned and lightly preprocessed and then output to text extraction module 103. Text extraction module 103 cleans and preprocesses the collected structured and unstructured data, determines whether each file is damaged, applies unified encoding conversion to the collected files, then performs operations such as traditional/simplified Chinese conversion, and passes the cleaned and preprocessed data to entity recognition module 104. Entity recognition module 104 comprises three sub-modules: word segmentation module 105, part-of-speech tagging module 106, and term analysis module 107. For the clean text data received, word segmentation module 105 first segments the text and extracts all words according to the segmentation rules and the dictionary; part-of-speech tagging module 106 then tags the words segmented by module 105; and once tagging is complete, term analysis module 107 extracts terms according to the term bank and passes the extracted result to semantic analysis module 108. Semantic analysis module 108 extracts a grammar library and an ontology library through ontology integration and, using grammar and semantic normalization together with grammar and semantic learning algorithms, generates an information extraction rule base. Relying on the grammar library, ontology library, and information extraction rule base generated in this process, it extracts the relations between ontology concepts through grammatical and semantic analysis, generates a semantic metadata model with an ontology editing tool, and outputs the model to entity relation extraction module 109. Entity relation extraction module 109 extracts classification relations and non-classification relations and finally generates knowledge graph 110.
Fig. 2 shows the flow for automatically or semi-automatically constructing a knowledge graph for intelligence analysis, which comprises the following steps:
Step 201 crawls data using a widely used data crawler system. Compared with a general-purpose crawler that aims at breadth, we have improved and optimized the crawler so that it performs targeted crawling of intelligence big data around a specific objective. The crawled text may be in Office, PDF, XML, or HTML format; data can also be imported from a private database.
Step 202 cleans and filters the data collected in step 201: damaged data is filtered out, the various text encodings are converted to a unified UTF-8 encoding, and preprocessing operations such as traditional-to-simplified Chinese conversion are applied to the converted text. After cleaning and preprocessing, the data collected in step 201 proceeds to the next step, named entity recognition (step 203).
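The cleaning in step 202 can be sketched as follows. This is only an illustration under stated assumptions: the helper name `clean_document`, the idea that each file arrives with a declared source encoding, and the four-character traditional-to-simplified table are all hypothetical and are not specified by the patent. A practical system would need a complete conversion table and some form of encoding detection.

```python
from typing import Optional

# Assumption: a toy traditional-to-simplified table for illustration only;
# a real conversion table covers thousands of characters.
TRAD_TO_SIMP = str.maketrans({"體": "体", "國": "国", "資": "资", "訊": "讯"})


def clean_document(raw: bytes, source_encoding: str) -> Optional[bytes]:
    """Return the document as simplified-Chinese UTF-8, or None if damaged.

    A document is treated as damaged when it is empty or when its bytes
    cannot be decoded with the declared (hypothetical) source encoding.
    """
    if not raw:
        return None
    try:
        text = raw.decode(source_encoding)
    except (UnicodeDecodeError, LookupError):
        # Undecodable bytes or an unknown codec name: filter the file out.
        return None
    simplified = text.translate(TRAD_TO_SIMP).strip()
    return simplified.encode("utf-8")
```

For example, `clean_document("資訊".encode("big5"), "big5")` yields the UTF-8 bytes of "资讯", while a byte string that is not valid in its declared encoding is dropped.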
Step 203, named entity recognition, comprises three sub-steps: word segmentation (step 206), part-of-speech tagging (step 208), and term analysis (step 210). Step 206 segments the text data cleaned and preprocessed in step 202 according to a dictionary and the selected segmentation rules 207. Once segmentation is finished, step 208 tags the words produced by step 206 with their parts of speech according to the dictionary and the part-of-speech tagging rule base 209. After step 208 is complete, step 210 analyzes and extracts terms according to the term base 211, which is generated by term extraction and glossary integration 212, and filters out useless words such as auxiliary verbs, adverbs, and adjectives. After steps 206, 208, and 210, named entities are extracted according to the named entity pattern base 204, which is built by machine learning and dictionary integration 205, and are passed to the next step, syntactic analysis 213.
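The three sub-steps can be illustrated with a small sketch. The patent only says that segmentation follows "a dictionary and the selected segmentation rules"; forward maximum matching is used here as one assumed rule, and the toy dictionary, the tag abbreviations, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical dictionary: word -> part-of-speech tag
# (n = noun, v = verb, d = adverb, u = auxiliary, a = adjective).
DICTIONARY: Dict[str, str] = {
    "情报": "n", "分析": "v", "系统": "n", "自动": "d",
    "构建": "v", "知识图谱": "n", "的": "u",
}
# Tags dropped during term analysis (auxiliaries, adverbs, adjectives).
FILTERED_TAGS = {"u", "d", "a"}


def segment(text: str, dictionary: Dict[str, str]) -> List[str]:
    """Forward maximum matching: take the longest dictionary word at each position."""
    max_len = max((len(word) for word in dictionary), default=1)
    words: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


def tag(words: List[str], dictionary: Dict[str, str]) -> List[Tuple[str, str]]:
    """Look up each word's tag; unknown words get the placeholder tag 'x'."""
    return [(word, dictionary.get(word, "x")) for word in words]


def extract_terms(tagged: List[Tuple[str, str]]) -> List[str]:
    """Keep only words whose tag is not in the filtered set."""
    return [word for word, pos in tagged if pos not in FILTERED_TAGS]
```

Calling `extract_terms(tag(segment(text, DICTIONARY), DICTIONARY))` chains the three sub-steps in the order the description gives. A production system would replace the dictionary lookup with the rule bases and term base the description refers to.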
Step 213 (syntactic analysis) and step 215 (semantic analysis) extract the syntax library 214 and the ontology library 216 through ontology learning and ontology integration 217 and, combining learning algorithms for grammatical and semantic relations with declarative specifications 220, generate a rule base 219 for information extraction. Relying on the syntax library 214, ontology library 216, and information-extraction rule base 219 generated in this process, syntactic analysis 213 and semantic analysis 215 are performed, a semantic metadata model is generated with an ontology editing tool, and the model is output to the next step, semantic relation recognition 218.
Step 218, entity relation extraction, performs semantic relation recognition and extraction on the semantic metadata model delivered by step 215, including classification relation extraction and non-categorical relation extraction. This patent extracts non-categorical relations using two methods.
As shown in Fig. 3, this embodiment provides a flow for extracting non-categorical relations based on association rules, which extracts non-categorical relations by combining association rules with grammatical relations. For the domain text set 301, step 302 processes the text and builds an index; text processing here includes word segmentation and entity identification. Using the concept candidate set 303 and the occurrence frequency of identified relations 304, the semantic relations between concepts are found through dependency and sentence-structure analysis between concepts, generating a candidate relation set. After the candidate relation set has been selected by syntactic analysis, association rule mining 306 confirms the suitable relation set; finally, with feedback from domain experts 307, the most suitable relation set is provided to the ontology engineer. This method not only finds relations between concepts automatically but also assigns suitable labels to them, which can greatly reduce the burden on knowledge engineers when building a domain ontology.
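The association rule mining step 306 can be sketched with the standard support and confidence measures. The patent does not say which measures or thresholds it uses, so the choice of support and confidence, the threshold parameters, and the function name `mine_relations` are assumptions; the input is taken to be the (concept, verb, concept) candidates already produced by syntactic analysis.

```python
from collections import Counter
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (concept, verb, concept)


def mine_relations(
    triples: List[Triple],
    min_support: float,
    min_confidence: float,
) -> List[Tuple[str, str, str, float, float]]:
    """Keep candidate relations that are frequent and strongly tied to their verb.

    support    = occurrences of (concept, verb, concept) / all candidates
    confidence = occurrences of (concept, verb, concept) / occurrences of the pair
    """
    total = len(triples)
    pair_counts = Counter((a, b) for a, _, b in triples)
    triple_counts = Counter(triples)
    kept = []
    for (a, verb, b), count in triple_counts.items():
        support = count / total
        confidence = count / pair_counts[(a, b)]
        if support >= min_support and confidence >= min_confidence:
            kept.append((a, verb, b, support, confidence))
    # Highest confidence first, then highest support, then alphabetical.
    return sorted(kept, key=lambda r: (-r[4], -r[3], r[:3]))
```

The surviving relations, each labeled by its verb, would then go to the domain experts for feedback, as the description states.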
As shown in Fig. 4, this embodiment provides a flow for extracting non-categorical relations based on rules and machine learning. Rules are learned manually from the domain text 401 to form a rule base 402; the learned rules are matched against other corpus texts to form a corpus sentence-pattern library 403. Using the sentence-pattern library 403, step 408 manually annotates the training corpus, and step 405 then trains on it with the CRF machine learning algorithm to generate a trained model. Using the test corpus and the trained model, step 409 performs testing and manual verification; according to the test results, step 404 supplements and adjusts the training corpus, and training is repeated until the precision and recall of the trained model reach a certain level. The adjusted model, combined with the rule-matching results, is then used to extract non-categorical entity relations from the actual corpus. This method effectively improves the extraction of non-categorical relations from a term set; the manually assisted CRF method can also extract more terms and thereby supplement the existing term set.
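The stopping criterion of the retraining loop (testing in step 409, adjustment in step 404) can be sketched as a precision and recall check on extracted relations. The patent only requires that both values "reach a certain level", so the threshold parameters and the function names below are assumptions; CRF training itself is outside this sketch.

```python
from typing import Set, Tuple

Relation = Tuple[str, str, str]  # (entity, relation label, entity)


def precision_recall(predicted: Set[Relation], gold: Set[Relation]) -> Tuple[float, float]:
    """Compare relations extracted by the model with manually verified ones."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def training_can_stop(
    predicted: Set[Relation],
    gold: Set[Relation],
    min_precision: float,
    min_recall: float,
) -> bool:
    """True when both metrics reach the (assumed) target levels."""
    precision, recall = precision_recall(predicted, gold)
    return precision >= min_precision and recall >= min_recall
```

While `training_can_stop` returns False, the training corpus would be supplemented and adjusted and the model retrained, following the loop in the description.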
Specific embodiments of the invention have been described above. It should be understood that the invention is not limited to the particular embodiments described; those skilled in the art may make various changes or modifications within the scope of the claims without affecting the substance of the invention. Where no conflict arises, the embodiments of this application and the features in the embodiments may be combined with one another arbitrarily.

Claims (10)

1. A system for constructing a knowledge graph for intelligence analysis, characterized by comprising:
a data acquisition module, which cleans and simply preprocesses the collected data and then outputs it to the text abstraction module;
a text abstraction module, which performs data cleaning and preprocessing on the collected structured and unstructured data, determines whether files are damaged, performs operations on the collected files including unified encoding conversion and traditional-to-simplified Chinese conversion, and passes the data to the entity recognition module once cleaning and preprocessing are complete;
an entity recognition module, which, for the clean text data received, first segments the text into words, then tags the segmented words with parts of speech, and after tagging is complete extracts terms and passes the extraction result to the semantic analysis module;
a semantic analysis module, which analyzes and extracts the relations between ontologies, generates a semantic metadata model with an ontology editing tool, and then outputs it to the entity relation extraction module;
an entity relation extraction module, which extracts relations, including classification relations and non-categorical relations, and ultimately produces the knowledge graph.
2. The system for constructing a knowledge graph for intelligence analysis according to claim 1, characterized in that the data acquisition module is implemented with a crawler system for intelligence big data that performs targeted crawling around a specific objective.
3. The system for constructing a knowledge graph for intelligence analysis according to claim 1, characterized in that the entity recognition module comprises a word segmentation module, a part-of-speech tagging module, and a term analysis module; for the clean text data received, the word segmentation module first segments the text and extracts all words according to segmentation rules and a dictionary, the part-of-speech tagging module then tags the words produced by the word segmentation module, and after tagging is complete the term analysis module extracts terms according to a term base.
4. The system for constructing a knowledge graph for intelligence analysis according to claim 1, characterized in that the semantic analysis module uses ontology integration to extract a syntax library and an ontology library, combines grammatical and semantic normalization with grammatical and semantic learning algorithms to generate a rule base for information extraction, and, relying on the syntax library, ontology library, and information-extraction rule base generated in the process, extracts the relations between ontologies through syntactic and semantic analysis.
5. A method for constructing a knowledge graph for intelligence analysis, characterized in that it is implemented by the system according to any one of claims 1 to 4 and comprises:
Step 1, crawling data with a crawler system for intelligence big data that performs targeted crawling around a specific objective;
Step 2, cleaning and filtering the data collected in step 1: filtering out damaged data, converting the various text encodings to a unified UTF-8 encoding, and applying preprocessing operations, including traditional-to-simplified Chinese conversion, to the converted text;
Step 3, performing named entity recognition on the data cleaned and preprocessed in step 2, including word segmentation, part-of-speech tagging, and term analysis;
Step 4, generating an information-extraction rule base and, together with the syntax library and ontology library generated in the process, performing syntactic analysis and semantic analysis, generating a semantic metadata model with an ontology editing tool, and outputting it to the next step, semantic relation recognition;
Step 5, performing semantic relation recognition and extraction on the semantic metadata model delivered by step 4.
6. The method for constructing a knowledge graph for intelligence analysis according to claim 5, characterized in that, in step 1, the crawled text formats include Office, PDF, XML, and HTML, or the data is imported from a private database.
7. The method for constructing a knowledge graph for intelligence analysis according to claim 5, characterized in that step 3 comprises three sub-steps:
word segmentation step 206, part-of-speech tagging step 208, and term analysis step 210, wherein word segmentation step 206 segments the cleaned and preprocessed text data according to a dictionary and the selected segmentation rules; once segmentation is finished, part-of-speech tagging step 208 tags the words produced by step 206 according to the dictionary and the part-of-speech tagging rule base; alternatively, word segmentation step 206 and part-of-speech tagging step 208 are performed together, so that parts of speech are tagged during segmentation; after completion, term analysis step 210 analyzes and extracts terms according to the term base generated by term extraction and glossary integration and filters out useless words; after steps 206, 208, and 210, named entities are extracted according to the named entity pattern base built by machine learning and dictionary integration.
8. The method for constructing a knowledge graph for intelligence analysis according to claim 7, characterized in that, to address the shortcomings of dictionary-based statistical segmentation, CRF segmentation is used, which considers not only word frequency information but also the context.
9. The method for constructing a knowledge graph for intelligence analysis according to claim 5, characterized in that step 5 comprises classification relation extraction and non-categorical relation extraction, the non-categorical relation extraction being performed as follows:
non-categorical relations are extracted by combining association rules with grammatical relations; association rules are used to measure the strength of the correlation between a pair of concepts and a verb, so as to confirm candidate concept pairs between which a semantic relation exists; meanwhile, within the domain text set, the semantic relations between concepts are found through dependency and sentence-structure analysis between concepts, the semantic relation between concepts generally being expressed and connected by a verb; after the candidate relation set has been selected by syntactic analysis, association rule mining confirms the suitable relation set.
10. The method for constructing a knowledge graph for intelligence analysis according to claim 5, characterized in that step 5 comprises classification relation extraction and non-categorical relation extraction, the non-categorical relation extraction being performed as follows:
a hybrid method combining rules and machine learning extracts non-categorical relations from domain text, the extraction process being:
1) rules are learned manually from the domain text to form a rule base;
2) the learned rules are matched against other corpus texts to form a corpus sentence-pattern library;
3) using the corpus sentence-pattern library, the training corpus is annotated manually and then trained on with the CRF machine learning algorithm to generate a trained model;
4) testing and manual verification are performed with the test corpus and the trained model; according to the test results, the training corpus is supplemented and adjusted and training is repeated until the precision and recall of the trained model reach a certain level;
5) the adjusted trained model, combined with the rule-matching results, is used to extract non-categorical entity relations from the actual corpus.
CN201611124399.XA 2016-12-08 2016-12-08 System and method for constructing knowledge graph for information analysis Pending CN106815293A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611124399.XA CN106815293A (en) 2016-12-08 2016-12-08 System and method for constructing knowledge graph for information analysis

Publications (1)

Publication Number Publication Date
CN106815293A true CN106815293A (en) 2017-06-09

Family

ID=59106968

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611124399.XA Pending CN106815293A (en) 2016-12-08 2016-12-08 System and method for constructing knowledge graph for information analysis

Country Status (1)

Country Link
CN (1) CN106815293A (en)

Cited By (119)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291700A (en) * 2017-07-17 2017-10-24 广州特道信息科技有限公司 Entity word recognition method and device
CN107341264A (en) * 2017-07-19 2017-11-10 东北大学 A kind of electronic health record system and method for supporting custom entities
CN107391684A (en) * 2017-07-24 2017-11-24 深信服科技股份有限公司 A kind of method and system for threatening information generation
CN107392433A (en) * 2017-06-27 2017-11-24 北京神州泰岳软件股份有限公司 A kind of method and apparatus for extracting enterprise's incidence relation information
CN107480128A (en) * 2017-07-17 2017-12-15 广州特道信息科技有限公司 The segmenting method and device of Chinese text
CN107577670A (en) * 2017-09-15 2018-01-12 清华大学 A kind of terminology extraction method based on definition with relation
CN107609478A (en) * 2017-08-09 2018-01-19 广州思涵信息科技有限公司 A kind of real-time analysis of the students system and method for matching classroom knowledge content
CN107704634A (en) * 2017-11-04 2018-02-16 辽宁工程技术大学 A kind of method for forming knowledge and building knowledge chain
CN107766483A (en) * 2017-10-13 2018-03-06 华中科技大学 The interactive answering method and system of a kind of knowledge based collection of illustrative plates
CN107807917A (en) * 2017-09-27 2018-03-16 风变科技(深圳)有限公司 Method for extracting content of text, device, system and storage medium
CN107870966A (en) * 2017-08-11 2018-04-03 成都萌想科技有限责任公司 A kind of recruitment general regulations data pick-up method based on semantic model
CN107908671A (en) * 2017-10-25 2018-04-13 南京擎盾信息科技有限公司 Knowledge mapping construction method and system based on law data
CN107908621A (en) * 2017-11-16 2018-04-13 东华大学 Tumor of breast risk assessment system based on ultrasonic examination report text data
CN107967290A (en) * 2017-10-09 2018-04-27 国家计算机网络与信息安全管理中心 A kind of knowledge mapping network establishing method and system, medium based on magnanimity scientific research data
CN108228701A (en) * 2017-10-23 2018-06-29 武汉大学 A kind of system for realizing Chinese near-nature forest language inquiry interface
CN108256063A (en) * 2018-01-15 2018-07-06 中国人民解放军国防科技大学 Knowledge base construction method for network security
CN108255815A (en) * 2018-02-07 2018-07-06 苏州金螳螂文化发展股份有限公司 The segmenting method and device of text
CN108304541A (en) * 2018-01-31 2018-07-20 刘世洪 The structure system and method for user preferences modeling UIM based on technique transfers platform
CN108647192A (en) * 2018-03-27 2018-10-12 常熟鑫沐奇宝软件开发有限公司 A method of generating virtual reality work script with natural language processing technique
CN108763195A (en) * 2018-05-02 2018-11-06 武汉烽火普天信息技术有限公司 A kind of non-limiting type relation excavation method based on interdependent syntax and pattern rules
CN108829696A (en) * 2018-04-18 2018-11-16 西安理工大学 Towards knowledge mapping node method for auto constructing in metro design code
CN108874878A (en) * 2018-05-03 2018-11-23 众安信息技术服务有限公司 A kind of building system and method for knowledge mapping
CN109064339A (en) * 2018-09-12 2018-12-21 张连祥 A kind of method and system of intelligence inquiry merchant vector environmental information
CN109086391A (en) * 2018-07-27 2018-12-25 北京光年无限科技有限公司 A kind of method and system constructing knowledge mapping
CN109117477A (en) * 2018-07-17 2019-01-01 广州大学 Non-categorical Relation extraction method, apparatus, equipment and medium towards Chinese field
CN109165337A (en) * 2018-10-17 2019-01-08 珠海市智图数研信息技术有限公司 A kind of method and system of knowledge based map construction bidding field association analysis
CN109189942A (en) * 2018-09-12 2019-01-11 山东大学 A kind of construction method and device of patent data knowledge mapping
CN109189943A (en) * 2018-09-19 2019-01-11 中国电子科技集团公司信息科学研究院 A kind of capability knowledge extracts and the method for capability knowledge map construction
CN109241046A (en) * 2018-08-30 2019-01-18 天津做票君机器人科技有限公司 A kind of inventory information recognition methods of negotiation by draft robot and identifier
CN109241532A (en) * 2018-08-30 2019-01-18 天津做票君机器人科技有限公司 A kind of the vote buying information identifying method and identifier of negotiation by draft robot
CN109241295A (en) * 2018-08-31 2019-01-18 北京天广汇通科技有限公司 A kind of extracting method of special entity relationship in unstructured data
CN109241078A (en) * 2018-08-30 2019-01-18 中国地质大学(武汉) A kind of knowledge mapping hoc queries method based on hybrid database
CN109241199A (en) * 2018-08-08 2019-01-18 广州初星科技有限公司 A method of it is found towards financial knowledge mapping
CN109359299A (en) * 2018-09-28 2019-02-19 中国电子科技集团公司信息科学研究院 A kind of internet of things equipment ability ontology based on commodity data is from construction method
CN109376353A (en) * 2018-09-04 2019-02-22 国家电网公司华东分部 A kind of power grid start-up operation ticket generating means and method based on natural language processing
CN109446337A (en) * 2018-09-19 2019-03-08 中国信息通信研究院 A kind of knowledge mapping construction method and device
CN109492916A (en) * 2018-11-16 2019-03-19 东南大学 A kind of nuclear power regulation model building method based on ontology
CN109522418A (en) * 2018-11-08 2019-03-26 杭州费尔斯通科技有限公司 A kind of automanual knowledge mapping construction method
CN109582933A (en) * 2018-11-13 2019-04-05 北京合享智慧科技有限公司 A kind of method and relevant apparatus of determining text novelty degree
CN109597885A (en) * 2018-12-11 2019-04-09 福建亿榕信息技术有限公司 A kind of Knowledge Map construction method and storage medium
CN109597894A (en) * 2018-09-30 2019-04-09 阿里巴巴集团控股有限公司 A kind of correlation model generation method and device, a kind of data correlation method and device
CN109635117A (en) * 2018-12-26 2019-04-16 零犀(北京)科技有限公司 A kind of knowledge based spectrum recognition user intention method and device
CN109657072A (en) * 2018-12-13 2019-04-19 北京百分点信息科技有限公司 A kind of intelligent search WEB system and method applied to government's aid decision
CN109670051A (en) * 2018-12-14 2019-04-23 北京百度网讯科技有限公司 Knowledge mapping method for digging, device, equipment and storage medium
CN109729171A (en) * 2019-01-10 2019-05-07 七彩安科智慧科技有限公司 A kind of construction method of small town cognition matrix Internet of Things
CN109753664A (en) * 2019-01-21 2019-05-14 广州大学 A kind of concept extraction method, terminal device and the storage medium of domain-oriented
CN109783484A (en) * 2018-12-29 2019-05-21 北京航天云路有限公司 The construction method and system of the data service platform of knowledge based map
CN109857917A (en) * 2018-12-21 2019-06-07 中国科学院信息工程研究所 Towards the security knowledge map construction method and system for threatening information
CN109947903A (en) * 2019-03-15 2019-06-28 北京金山数字娱乐科技有限公司 A kind of Chinese idiom querying method and device
CN109977233A (en) * 2019-03-15 2019-07-05 北京金山数字娱乐科技有限公司 A kind of idiom knowledge map construction method and device
CN110096599A (en) * 2019-04-30 2019-08-06 长沙知了信息科技有限公司 The generation method and device of knowledge mapping
CN110119469A (en) * 2019-05-22 2019-08-13 北京计算机技术及应用研究所 A kind of data collection and transmission and method towards darknet
CN110162792A (en) * 2019-05-24 2019-08-23 国家电网有限公司 Electric network data management method and device
CN110175334A (en) * 2019-06-05 2019-08-27 苏州派维斯信息科技有限公司 Text knowledge's extraction system and method based on customized knowledge slot structure
CN110210025A (en) * 2019-05-29 2019-09-06 广州伟宏智能科技有限公司 A kind of conversion method based on Text Feature Extraction
CN110209828A (en) * 2018-02-12 2019-09-06 北大方正集团有限公司 Case querying method and case inquiry unit, computer equipment and storage medium
CN110210038A (en) * 2019-06-13 2019-09-06 北京百度网讯科技有限公司 Kernel entity determines method and its system, server and computer-readable medium
CN110209839A (en) * 2019-06-18 2019-09-06 卓尔智联(武汉)研究院有限公司 Agricultural knowledge map construction device, method and computer readable storage medium
CN110222127A (en) * 2019-06-06 2019-09-10 中国电子科技集团公司第二十八研究所 The converging information method, apparatus and equipment of knowledge based map
CN110287481A (en) * 2019-05-29 2019-09-27 西南电子技术研究所(中国电子科技集团公司第十研究所) Name entity corpus labeling training system
CN110321432A (en) * 2019-06-24 2019-10-11 拓尔思信息技术股份有限公司 Textual event information extracting method, electronic device and non-volatile memory medium
CN110347844A (en) * 2019-07-15 2019-10-18 中国人民解放军战略支援部队航天工程大学 A kind of space object knowledge map construction system
CN110399605A (en) * 2018-04-17 2019-11-01 富士施乐株式会社 Information processing unit and the computer-readable medium for storing program
CN110458397A (en) * 2019-07-05 2019-11-15 苏州热工研究院有限公司 A kind of nuclear material military service performance information extracting method
CN110458471A (en) * 2019-08-19 2019-11-15 绍兴数纺科技有限公司 Standardize dyestuff information management system
CN110516077A (en) * 2019-08-20 2019-11-29 北京中亦安图科技股份有限公司 Knowledge mapping construction method and device towards enterprise's market conditions
CN110633469A (en) * 2019-09-10 2019-12-31 陈绪平 Method for accurately understanding Chinese sentence meaning
CN110674308A (en) * 2019-08-23 2020-01-10 上海科技发展有限公司 Scientific and technological word list expansion method, device, terminal and medium based on grammar mode
CN110717049A (en) * 2019-08-29 2020-01-21 四川大学 Text data-oriented threat information knowledge graph construction method
CN110795932A (en) * 2019-09-30 2020-02-14 中国地质大学(武汉) Geological report text information extraction method based on geological ontology
CN110825839A (en) * 2019-11-07 2020-02-21 成都国腾实业集团有限公司 Incidence relation analysis method for targets in text information
CN110888991A (en) * 2019-11-28 2020-03-17 哈尔滨工程大学 Sectional semantic annotation method in weak annotation environment
CN110990525A (en) * 2019-11-15 2020-04-10 华融融通(北京)科技有限公司 Natural language processing-based public opinion information extraction and knowledge base generation method
CN111104478A (en) * 2019-09-05 2020-05-05 李轶 Domain concept semantic drift exploration method
CN111126065A (en) * 2019-12-02 2020-05-08 南京医渡云医学技术有限公司 Information extraction method and device for natural language text
CN111177399A (en) * 2019-12-04 2020-05-19 华瑞新智科技(北京)有限公司 Knowledge graph construction method and device
CN111178075A (en) * 2019-12-19 2020-05-19 厦门快商通科技股份有限公司 Online customer service log analysis method, device and equipment
CN111277560A (en) * 2019-12-24 2020-06-12 普世(南京)智能科技有限公司 Safe information acquisition, import and compilation method and system based on high-bandwidth physical isolation unidirectional transmission
CN111339253A (en) * 2020-02-25 2020-06-26 中国建设银行股份有限公司 Method and device for extracting article information
CN111476034A (en) * 2020-04-07 2020-07-31 同方赛威讯信息技术有限公司 Legal document information extraction method and system based on combination of rules and models
CN111488497A (en) * 2019-01-25 2020-08-04 北京沃东天骏信息技术有限公司 Similarity determination method and device for character string set, terminal and readable medium
CN111488741A (en) * 2020-04-14 2020-08-04 税友软件集团股份有限公司 Tax knowledge data semantic annotation method and related device
CN111597349A (en) * 2020-04-30 2020-08-28 西安理工大学 Rail transit standard entity relation automatic completion method based on artificial intelligence
CN111651995A (en) * 2020-06-07 2020-09-11 上海建科工程咨询有限公司 Accident information automatic extraction method and system based on deep circulation neural network
CN111666425A (en) * 2020-06-10 2020-09-15 深圳开思时代科技有限公司 Automobile accessory searching method based on semantic knowledge
CN111738445A (en) * 2020-05-26 2020-10-02 山东大学 Design knowledge fusion reasoning method supporting product rapid innovation
CN111753022A (en) * 2020-06-17 2020-10-09 第四范式(北京)技术有限公司 Method, device and equipment for constructing knowledge graph and readable storage medium
CN111753021A (en) * 2020-06-17 2020-10-09 第四范式(北京)技术有限公司 Method, device and equipment for constructing knowledge graph and readable storage medium
CN111858575A (en) * 2020-08-05 2020-10-30 杭州锘崴信息科技有限公司 Private data analysis method and system
CN111859968A (en) * 2020-06-15 2020-10-30 深圳航天科创实业有限公司 Text structuring method, text structuring device and terminal equipment
CN111897914A (en) * 2020-07-20 2020-11-06 杭州叙简科技股份有限公司 Entity information extraction and knowledge graph construction method for field of comprehensive pipe gallery
CN111914569A (en) * 2020-08-10 2020-11-10 哈尔滨安天科技集团股份有限公司 Prediction method and device based on fusion map, electronic equipment and storage medium
CN111984640A (en) * 2020-08-04 2020-11-24 中国科学技术大学智慧城市研究院(芜湖) Portrait construction method based on multi-element heterogeneous data
CN112084329A (en) * 2020-07-31 2020-12-15 西安理工大学 Semantic analysis method for entity recognition and relation extraction tasks
CN112241734A (en) * 2020-10-15 2021-01-19 首域科技(杭州)有限公司 Method and system for diagnosing equipment fault through knowledge graph and Bayesian network
CN112287686A (en) * 2020-08-13 2021-01-29 新智道枢(上海)科技有限公司 Warning safety protection method based on semantic analysis
CN112307767A (en) * 2020-11-09 2021-02-02 国网福建省电力有限公司 Bi-LSTM technology-based regulation and control knowledge modeling method
CN112328811A (en) * 2020-11-12 2021-02-05 国衡智慧城市科技研究院(北京)有限公司 Word spectrum clustering intelligent generation method based on same type of phrases
CN112364649A (en) * 2020-09-08 2021-02-12 平安医疗健康管理股份有限公司 Named entity identification method and device, computer equipment and storage medium
CN112417083A (en) * 2020-11-12 2021-02-26 福建亿榕信息技术有限公司 Method for constructing and deploying text entity relationship extraction model and storage device
CN112463960A (en) * 2020-10-30 2021-03-09 完美世界控股集团有限公司 Entity relationship determination method and device, computing equipment and storage medium
CN112800243A (en) * 2021-02-04 2021-05-14 天津德尔塔科技有限公司 Project budget analysis method and system based on knowledge graph
CN112860913A (en) * 2021-02-24 2021-05-28 广州汇通国信科技有限公司 Ontology creation method of knowledge graph
CN112861515A (en) * 2021-02-08 2021-05-28 上海天壤智能科技有限公司 Interactive knowledge definition and processing method, system, device and readable medium
CN113127503A (en) * 2021-03-18 2021-07-16 中国科学院国家空间科学中心 Automatic information extraction method and system for aerospace information
CN113220672A (en) * 2021-04-26 2021-08-06 中国人民解放军军事科学院国防科技创新研究院 Military and civil fusion policy information database system
CN113239201A (en) * 2021-05-20 2021-08-10 国网上海市电力公司 Scientific and technological literature classification method based on knowledge graph
CN113297252A (en) * 2021-05-28 2021-08-24 北京信息科技大学 Data query service method with mode being unaware
CN113326700A (en) * 2021-02-26 2021-08-31 西安理工大学 ALBert-based complex heavy equipment entity extraction method
CN113505191A (en) * 2021-03-26 2021-10-15 中国航空无线电电子研究所 Ontology-based avionics system architecture model construction method
CN113609838A (en) * 2021-07-14 2021-11-05 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Document information extraction and mapping method and system
CN113761226A (en) * 2021-11-10 2021-12-07 中国电子科技集团公司第二十八研究所 Ontology construction method of multi-modal airport data
CN114186561A (en) * 2021-10-20 2022-03-15 福建亿榕信息技术有限公司 Electronic file association analysis method and system based on knowledge graph
CN114490626A (en) * 2022-04-18 2022-05-13 成都数融科技有限公司 Financial information analysis method and system based on parallel computing
CN115169362A (en) * 2022-09-08 2022-10-11 北京瀚语科技有限公司 HowNet natural language processing method, system and application
CN115358201A (en) * 2022-08-03 2022-11-18 浙商期货有限公司 Processing method and system for delivery and research report in futures field
CN115796160A (en) * 2022-12-09 2023-03-14 南阳理工学院 Method and device for cleaning redundant thesis data based on lexical affixes and storage medium
CN116483940A (en) * 2023-04-26 2023-07-25 深圳市国房云数据技术服务有限公司 Method for extracting and structuring data of whole-flow type document
CN113609838B (en) * 2021-07-14 2024-05-24 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Document information extraction and mapping method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103699663A (en) * 2013-12-27 2014-04-02 中国科学院自动化研究所 Hot event mining method based on large-scale knowledge base
CN105183869A (en) * 2015-09-16 2015-12-23 分众(中国)信息技术有限公司 Building knowledge mapping database and construction method thereof
CN105653522A (en) * 2016-01-21 2016-06-08 中国农业大学 Non-classified relation recognition method for plant field

Cited By (164)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107392433B (en) * 2017-06-27 2018-09-04 北京神州泰岳软件股份有限公司 Method and apparatus for extracting enterprise association relationship information
CN107392433A (en) * 2017-06-27 2017-11-24 北京神州泰岳软件股份有限公司 Method and apparatus for extracting enterprise association relationship information
CN107480128A (en) * 2017-07-17 2017-12-15 广州特道信息科技有限公司 Word segmentation method and device for Chinese text
CN107291700A (en) * 2017-07-17 2017-10-24 广州特道信息科技有限公司 Entity word recognition method and device
CN107341264A (en) * 2017-07-19 2017-11-10 东北大学 Electronic health record system and method supporting custom entities
CN107341264B (en) * 2017-07-19 2020-09-25 东北大学 Electronic medical record retrieval system and method supporting user-defined entity
CN107391684A (en) * 2017-07-24 2017-11-24 深信服科技股份有限公司 Method and system for generating threat information
CN107391684B (en) * 2017-07-24 2020-12-11 深信服科技股份有限公司 Method and system for generating threat information
CN107609478A (en) * 2017-08-09 2018-01-19 广州思涵信息科技有限公司 Real-time student analysis system and method matching classroom knowledge content
CN107870966A (en) * 2017-08-11 2018-04-03 成都萌想科技有限责任公司 Recruitment general regulations data extraction method based on semantic model
CN107577670A (en) * 2017-09-15 2018-01-12 清华大学 Term extraction method based on definition and relation
CN107577670B (en) * 2017-09-15 2020-09-22 清华大学 Term extraction method based on definition and relation
CN107807917A (en) * 2017-09-27 2018-03-16 风变科技(深圳)有限公司 Text content extraction method, device, system and storage medium
CN107967290A (en) * 2017-10-09 2018-04-27 国家计算机网络与信息安全管理中心 Knowledge graph network construction method, system and medium based on massive scientific research data
CN107766483A (en) * 2017-10-13 2018-03-06 华中科技大学 Interactive question answering method and system based on knowledge graph
CN108228701A (en) * 2017-10-23 2018-06-29 武汉大学 System for realizing Chinese near-natural-language query interface
CN107908671A (en) * 2017-10-25 2018-04-13 南京擎盾信息科技有限公司 Knowledge graph construction method and system based on legal data
CN107908671B (en) * 2017-10-25 2022-02-01 南京擎盾信息科技有限公司 Knowledge graph construction method and system based on legal data
CN107704634A (en) * 2017-11-04 2018-02-16 辽宁工程技术大学 Method for forming knowledge and building knowledge chain
CN107908621A (en) * 2017-11-16 2018-04-13 东华大学 Breast tumor risk assessment system based on ultrasonic examination report text data
CN108256063A (en) * 2018-01-15 2018-07-06 中国人民解放军国防科技大学 Knowledge base construction method for network security
CN108304541A (en) * 2018-01-31 2018-07-20 刘世洪 System and method for constructing user preference model UIM based on technology transfer platform
CN108255815A (en) * 2018-02-07 2018-07-06 苏州金螳螂文化发展股份有限公司 Word segmentation method and device for text
CN110209828A (en) * 2018-02-12 2019-09-06 北大方正集团有限公司 Case query method, case query device, computer device and storage medium
CN110209828B (en) * 2018-02-12 2021-08-27 北大方正集团有限公司 Case query method, case query device, computer device and storage medium
CN108647192A (en) * 2018-03-27 2018-10-12 常熟鑫沐奇宝软件开发有限公司 Method for generating virtual reality working script by natural language processing technology
CN108647192B (en) * 2018-03-27 2022-04-12 常熟鑫沐奇宝软件开发有限公司 Method for generating virtual reality working script by natural language processing technology
CN110399605A (en) * 2018-04-17 2019-11-01 富士施乐株式会社 Information processing device and computer-readable medium storing program
CN108829696A (en) * 2018-04-18 2018-11-16 西安理工大学 Automatic construction method for knowledge graph nodes oriented to metro design code
CN108763195A (en) * 2018-05-02 2018-11-06 武汉烽火普天信息技术有限公司 Non-restricted relation mining method based on dependency syntax and pattern rules
CN108763195B (en) * 2018-05-02 2022-01-18 武汉烽火普天信息技术有限公司 Dependency syntax and mode rule-based non-restricted relationship mining method
CN108874878A (en) * 2018-05-03 2018-11-23 众安信息技术服务有限公司 System and method for constructing knowledge graph
CN109117477B (en) * 2018-07-17 2022-01-28 广州大学 Chinese field-oriented non-classification relation extraction method, device, equipment and medium
CN109117477A (en) * 2018-07-17 2019-01-01 广州大学 Chinese field-oriented non-classification relation extraction method, device, equipment and medium
CN109086391A (en) * 2018-07-27 2018-12-25 北京光年无限科技有限公司 Method and system for constructing knowledge graph
CN109086391B (en) * 2018-07-27 2022-07-01 北京光年无限科技有限公司 Method and system for constructing knowledge graph
CN109241199A (en) * 2018-08-08 2019-01-18 广州初星科技有限公司 Financial knowledge graph discovery method
CN109241199B (en) * 2018-08-08 2022-09-23 上海旭荣网络科技有限公司 Financial knowledge graph discovery method
CN109241532A (en) * 2018-08-30 2019-01-18 天津做票君机器人科技有限公司 Vote buying information identification method and identifier of negotiation by draft robot
CN109241078A (en) * 2018-08-30 2019-01-18 中国地质大学(武汉) Knowledge graph organization query method based on mixed database
CN109241046A (en) * 2018-08-30 2019-01-18 天津做票君机器人科技有限公司 Inventory information recognition method and identifier of negotiation by draft robot
CN109241078B (en) * 2018-08-30 2021-07-20 中国地质大学(武汉) Knowledge graph organization query method based on mixed database
CN109241295B (en) * 2018-08-31 2021-12-24 北京天广汇通科技有限公司 Method for extracting specific entity relation in unstructured data
CN109241295A (en) * 2018-08-31 2019-01-18 北京天广汇通科技有限公司 Method for extracting specific entity relation in unstructured data
CN109376353B (en) * 2018-09-04 2022-09-16 国家电网公司华东分部 Natural language processing-based power grid starting operation ticket generation device and method
CN109376353A (en) * 2018-09-04 2019-02-22 国家电网公司华东分部 Natural language processing-based power grid starting operation ticket generation device and method
CN109064339A (en) * 2018-09-12 2018-12-21 张连祥 Method and system for intelligently querying merchant vector environmental information
CN109189942A (en) * 2018-09-12 2019-01-11 山东大学 Construction method and device for patent data knowledge graph
CN109189943A (en) * 2018-09-19 2019-01-11 中国电子科技集团公司信息科学研究院 Method for capability knowledge extraction and capability knowledge graph construction
CN109446337A (en) * 2018-09-19 2019-03-08 中国信息通信研究院 Knowledge graph construction method and device
CN109446337B (en) * 2018-09-19 2020-10-13 中国信息通信研究院 Knowledge graph construction method and device
CN109359299A (en) * 2018-09-28 2019-02-19 中国电子科技集团公司信息科学研究院 Self-construction method for Internet of Things device capability ontology based on commodity data
CN109597894B (en) * 2018-09-30 2023-10-03 创新先进技术有限公司 Correlation model generation method and device, and data correlation method and device
CN109597894A (en) * 2018-09-30 2019-04-09 阿里巴巴集团控股有限公司 Correlation model generation method and device, and data correlation method and device
CN109165337B (en) * 2018-10-17 2021-10-15 珠海市智图数研信息技术有限公司 Method and system for establishing bid and ask field association analysis based on knowledge graph
CN109165337A (en) * 2018-10-17 2019-01-08 珠海市智图数研信息技术有限公司 Method and system for establishing bid and ask field association analysis based on knowledge graph
CN109522418A (en) * 2018-11-08 2019-03-26 杭州费尔斯通科技有限公司 Semi-automatic knowledge graph construction method
CN109522418B (en) * 2018-11-08 2020-05-12 杭州费尔斯通科技有限公司 Semi-automatic knowledge graph construction method
CN109582933A (en) * 2018-11-13 2019-04-05 北京合享智慧科技有限公司 Method and related apparatus for determining text novelty degree
CN109492916A (en) * 2018-11-16 2019-03-19 东南大学 Ontology-based nuclear power regulation model construction method
CN109597885A (en) * 2018-12-11 2019-04-09 福建亿榕信息技术有限公司 Knowledge graph construction method and storage medium
CN109657072A (en) * 2018-12-13 2019-04-19 北京百分点信息科技有限公司 Intelligent search WEB system and method applied to government assisted decision-making
CN109670051A (en) * 2018-12-14 2019-04-23 北京百度网讯科技有限公司 Knowledge graph mining method, device, equipment and storage medium
CN109857917A (en) * 2018-12-21 2019-06-07 中国科学院信息工程研究所 Threat-information-oriented security knowledge graph construction method and system
CN109635117A (en) * 2018-12-26 2019-04-16 零犀(北京)科技有限公司 Method and device for recognizing user intention based on knowledge graph
CN109783484A (en) * 2018-12-29 2019-05-21 北京航天云路有限公司 Construction method and system for data service platform based on knowledge graph
CN109729171B (en) * 2019-01-10 2021-07-30 七彩安科智慧科技有限公司 Method for constructing town cognitive matrix Internet of things
CN109729171A (en) * 2019-01-10 2019-05-07 七彩安科智慧科技有限公司 Method for constructing town cognitive matrix Internet of things
CN109753664A (en) * 2019-01-21 2019-05-14 广州大学 Domain-oriented concept extraction method, terminal device and storage medium
CN111488497B (en) * 2019-01-25 2023-05-12 北京沃东天骏信息技术有限公司 Similarity determination method and device for character string set, terminal and readable medium
CN111488497A (en) * 2019-01-25 2020-08-04 北京沃东天骏信息技术有限公司 Similarity determination method and device for character string set, terminal and readable medium
CN109977233A (en) * 2019-03-15 2019-07-05 北京金山数字娱乐科技有限公司 Idiom knowledge graph construction method and device
CN109947903B (en) * 2019-03-15 2023-02-07 北京金山数字娱乐科技有限公司 Idiom query method and device
CN109947903A (en) * 2019-03-15 2019-06-28 北京金山数字娱乐科技有限公司 Idiom query method and device
CN110096599B (en) * 2019-04-30 2023-03-21 长沙知了信息科技有限公司 Knowledge graph generation method and device
CN110096599A (en) * 2019-04-30 2019-08-06 长沙知了信息科技有限公司 Knowledge graph generation method and device
CN110119469A (en) * 2019-05-22 2019-08-13 北京计算机技术及应用研究所 Darknet-oriented data collection and transmission system and method
CN110162792A (en) * 2019-05-24 2019-08-23 国家电网有限公司 Electric network data management method and device
CN110210025A (en) * 2019-05-29 2019-09-06 广州伟宏智能科技有限公司 Conversion method based on text extraction
CN110287481B (en) * 2019-05-29 2022-06-14 西南电子技术研究所(中国电子科技集团公司第十研究所) Named entity corpus labeling training system
CN110287481A (en) * 2019-05-29 2019-09-27 西南电子技术研究所(中国电子科技集团公司第十研究所) Named entity corpus labeling training system
CN110175334B (en) * 2019-06-05 2023-06-27 苏州派维斯信息科技有限公司 Text knowledge extraction system and method based on custom knowledge slot structure
CN110175334A (en) * 2019-06-05 2019-08-27 苏州派维斯信息科技有限公司 Text knowledge extraction system and method based on custom knowledge slot structure
CN110222127A (en) * 2019-06-06 2019-09-10 中国电子科技集团公司第二十八研究所 Information aggregation method, apparatus and equipment based on knowledge graph
CN110210038B (en) * 2019-06-13 2023-01-10 北京百度网讯科技有限公司 Core entity determining method, system, server and computer readable medium thereof
CN110210038A (en) * 2019-06-13 2019-09-06 北京百度网讯科技有限公司 Core entity determining method, system, server and computer readable medium thereof
CN110209839A (en) * 2019-06-18 2019-09-06 卓尔智联(武汉)研究院有限公司 Agricultural knowledge graph construction device and method and computer readable storage medium
CN110209839B (en) * 2019-06-18 2021-07-27 卓尔智联(武汉)研究院有限公司 Agricultural knowledge graph construction device and method and computer readable storage medium
CN110321432B (en) * 2019-06-24 2021-11-23 拓尔思信息技术股份有限公司 Text event information extraction method, electronic device and nonvolatile storage medium
CN110321432A (en) * 2019-06-24 2019-10-11 拓尔思信息技术股份有限公司 Text event information extraction method, electronic device and nonvolatile storage medium
CN110458397A (en) * 2019-07-05 2019-11-15 苏州热工研究院有限公司 Method for extracting nuclear material service performance information
CN110347844A (en) * 2019-07-15 2019-10-18 中国人民解放军战略支援部队航天工程大学 Space object knowledge graph construction system
CN110458471B (en) * 2019-08-19 2022-05-20 绍兴数纺科技有限公司 Standardized dye information management system
CN110458471A (en) * 2019-08-19 2019-11-15 绍兴数纺科技有限公司 Standardized dye information management system
CN110516077A (en) * 2019-08-20 2019-11-29 北京中亦安图科技股份有限公司 Knowledge graph construction method and device oriented to enterprise market conditions
CN110674308A (en) * 2019-08-23 2020-01-10 上海科技发展有限公司 Scientific and technological word list expansion method, device, terminal and medium based on grammar mode
CN110717049B (en) * 2019-08-29 2020-12-04 四川大学 Text data-oriented threat information knowledge graph construction method
CN110717049A (en) * 2019-08-29 2020-01-21 四川大学 Text data-oriented threat information knowledge graph construction method
CN111104478A (en) * 2019-09-05 2020-05-05 李轶 Domain concept semantic drift exploration method
CN110633469A (en) * 2019-09-10 2019-12-31 陈绪平 Method for accurately understanding Chinese sentence meaning
CN110795932A (en) * 2019-09-30 2020-02-14 中国地质大学(武汉) Geological report text information extraction method based on geological ontology
CN110825839A (en) * 2019-11-07 2020-02-21 成都国腾实业集团有限公司 Incidence relation analysis method for targets in text information
CN110990525A (en) * 2019-11-15 2020-04-10 华融融通(北京)科技有限公司 Natural language processing-based public opinion information extraction and knowledge base generation method
CN110888991A (en) * 2019-11-28 2020-03-17 哈尔滨工程大学 Sectional semantic annotation method in weak annotation environment
CN110888991B (en) * 2019-11-28 2023-12-01 哈尔滨工程大学 Sectional type semantic annotation method under weak annotation environment
CN111126065A (en) * 2019-12-02 2020-05-08 南京医渡云医学技术有限公司 Information extraction method and device for natural language text
CN111126065B (en) * 2019-12-02 2024-03-15 医渡云(北京)技术有限公司 Information extraction method and device for natural language text
CN111177399B (en) * 2019-12-04 2023-06-16 华瑞新智科技(北京)有限公司 Knowledge graph construction method and device
CN111177399A (en) * 2019-12-04 2020-05-19 华瑞新智科技(北京)有限公司 Knowledge graph construction method and device
CN111178075A (en) * 2019-12-19 2020-05-19 厦门快商通科技股份有限公司 Online customer service log analysis method, device and equipment
CN111277560A (en) * 2019-12-24 2020-06-12 普世(南京)智能科技有限公司 Safe information acquisition, import and compilation method and system based on high-bandwidth physical isolation unidirectional transmission
CN111339253A (en) * 2020-02-25 2020-06-26 中国建设银行股份有限公司 Method and device for extracting article information
CN111476034A (en) * 2020-04-07 2020-07-31 同方赛威讯信息技术有限公司 Legal document information extraction method and system based on combination of rules and models
CN111488741A (en) * 2020-04-14 2020-08-04 税友软件集团股份有限公司 Tax knowledge data semantic annotation method and related device
CN111597349A (en) * 2020-04-30 2020-08-28 西安理工大学 Rail transit standard entity relation automatic completion method based on artificial intelligence
CN111597349B (en) * 2020-04-30 2022-10-11 西安理工大学 Rail transit standard entity relation automatic completion method based on artificial intelligence
CN111738445A (en) * 2020-05-26 2020-10-02 山东大学 Design knowledge fusion reasoning method supporting product rapid innovation
CN111651995A (en) * 2020-06-07 2020-09-11 上海建科工程咨询有限公司 Accident information automatic extraction method and system based on deep recurrent neural network
CN111666425B (en) * 2020-06-10 2023-04-18 深圳开思时代科技有限公司 Automobile accessory searching method based on semantic knowledge
CN111666425A (en) * 2020-06-10 2020-09-15 深圳开思时代科技有限公司 Automobile accessory searching method based on semantic knowledge
CN111859968A (en) * 2020-06-15 2020-10-30 深圳航天科创实业有限公司 Text structuring method, text structuring device and terminal equipment
CN111753022A (en) * 2020-06-17 2020-10-09 第四范式(北京)技术有限公司 Method, device and equipment for constructing knowledge graph and readable storage medium
CN111753021A (en) * 2020-06-17 2020-10-09 第四范式(北京)技术有限公司 Method, device and equipment for constructing knowledge graph and readable storage medium
CN111897914A (en) * 2020-07-20 2020-11-06 杭州叙简科技股份有限公司 Entity information extraction and knowledge graph construction method for field of comprehensive pipe gallery
CN111897914B (en) * 2020-07-20 2023-09-19 杭州叙简科技股份有限公司 Entity information extraction and knowledge graph construction method for comprehensive pipe rack field
CN112084329B (en) * 2020-07-31 2024-02-02 西安理工大学 Semantic analysis method for entity identification and relation extraction tasks
CN112084329A (en) * 2020-07-31 2020-12-15 西安理工大学 Semantic analysis method for entity recognition and relation extraction tasks
CN111984640A (en) * 2020-08-04 2020-11-24 中国科学技术大学智慧城市研究院(芜湖) Portrait construction method based on multi-element heterogeneous data
CN111858575A (en) * 2020-08-05 2020-10-30 杭州锘崴信息科技有限公司 Private data analysis method and system
CN111858575B (en) * 2020-08-05 2024-04-19 杭州锘崴信息科技有限公司 Private data analysis method and system
CN111914569A (en) * 2020-08-10 2020-11-10 哈尔滨安天科技集团股份有限公司 Prediction method and device based on fusion map, electronic equipment and storage medium
CN111914569B (en) * 2020-08-10 2023-07-21 安天科技集团股份有限公司 Fusion map-based prediction method and device, electronic equipment and storage medium
CN112287686A (en) * 2020-08-13 2021-01-29 新智道枢(上海)科技有限公司 Warning safety protection method based on semantic analysis
CN112364649B (en) * 2020-09-08 2022-07-19 深圳平安医疗健康科技服务有限公司 Named entity identification method and device, computer equipment and storage medium
CN112364649A (en) * 2020-09-08 2021-02-12 平安医疗健康管理股份有限公司 Named entity identification method and device, computer equipment and storage medium
CN112241734A (en) * 2020-10-15 2021-01-19 首域科技(杭州)有限公司 Method and system for diagnosing equipment fault through knowledge graph and Bayesian network
CN112463960A (en) * 2020-10-30 2021-03-09 完美世界控股集团有限公司 Entity relationship determination method and device, computing equipment and storage medium
CN112307767A (en) * 2020-11-09 2021-02-02 国网福建省电力有限公司 Bi-LSTM technology-based regulation and control knowledge modeling method
CN112328811A (en) * 2020-11-12 2021-02-05 国衡智慧城市科技研究院(北京)有限公司 Word spectrum clustering intelligent generation method based on same type of phrases
CN112417083B (en) * 2020-11-12 2022-05-17 福建亿榕信息技术有限公司 Method for constructing and deploying text entity relationship extraction model and storage device
CN112417083A (en) * 2020-11-12 2021-02-26 福建亿榕信息技术有限公司 Method for constructing and deploying text entity relationship extraction model and storage device
CN112800243A (en) * 2021-02-04 2021-05-14 天津德尔塔科技有限公司 Project budget analysis method and system based on knowledge graph
CN112861515A (en) * 2021-02-08 2021-05-28 上海天壤智能科技有限公司 Interactive knowledge definition and processing method, system, device and readable medium
CN112861515B (en) * 2021-02-08 2022-11-11 上海天壤智能科技有限公司 Interactive knowledge definition and processing method, system, device and readable medium
CN112860913A (en) * 2021-02-24 2021-05-28 广州汇通国信科技有限公司 Ontology creation method of knowledge graph
CN112860913B (en) * 2021-02-24 2024-03-08 广州汇通国信科技有限公司 Ontology creation method of knowledge graph
CN113326700B (en) * 2021-02-26 2024-05-14 西安理工大学 ALBert-based complex heavy equipment entity extraction method
CN113326700A (en) * 2021-02-26 2021-08-31 西安理工大学 ALBert-based complex heavy equipment entity extraction method
CN113127503A (en) * 2021-03-18 2021-07-16 中国科学院国家空间科学中心 Automatic information extraction method and system for aerospace information
CN113505191A (en) * 2021-03-26 2021-10-15 中国航空无线电电子研究所 Ontology-based avionics system architecture model construction method
CN113220672A (en) * 2021-04-26 2021-08-06 中国人民解放军军事科学院国防科技创新研究院 Military and civil fusion policy information database system
CN113239201A (en) * 2021-05-20 2021-08-10 国网上海市电力公司 Scientific and technological literature classification method based on knowledge graph
CN113297252A (en) * 2021-05-28 2021-08-24 北京信息科技大学 Data query service method with mode being unaware
CN113609838B (en) * 2021-07-14 2024-05-24 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Document information extraction and mapping method and system
CN113609838A (en) * 2021-07-14 2021-11-05 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Document information extraction and mapping method and system
CN114186561A (en) * 2021-10-20 2022-03-15 福建亿榕信息技术有限公司 Electronic file association analysis method and system based on knowledge graph
CN113761226A (en) * 2021-11-10 2021-12-07 中国电子科技集团公司第二十八研究所 Ontology construction method of multi-modal airport data
CN114490626B (en) * 2022-04-18 2022-08-16 成都数融科技有限公司 Financial information analysis method and system based on parallel computing
CN114490626A (en) * 2022-04-18 2022-05-13 成都数融科技有限公司 Financial information analysis method and system based on parallel computing
CN115358201A (en) * 2022-08-03 2022-11-18 浙商期货有限公司 Processing method and system for delivery and research report in futures field
CN115169362A (en) * 2022-09-08 2022-10-11 北京瀚语科技有限公司 HowNet natural language processing method, system and application
CN115796160B (en) * 2022-12-09 2024-04-09 南阳理工学院 Thesis redundant data cleaning method and device based on lexical affix and storage medium
CN115796160A (en) * 2022-12-09 2023-03-14 南阳理工学院 Method and device for cleaning redundant thesis data based on lexical affixes and storage medium
CN116483940A (en) * 2023-04-26 2023-07-25 深圳市国房云数据技术服务有限公司 Method for extracting and structuring data of whole-flow type document

Similar Documents

Publication Publication Date Title
CN106815293A (en) System and method for constructing knowledge graph for information analysis
CN104391942B (en) Short text feature expansion method based on semantic graph
CN111078889B (en) Method for extracting relationship between medicines based on various attentions and improved pre-training
CN107526799A (en) Knowledge graph construction method based on deep learning
CN108073569A (en) Legal cognition method, device and medium based on multi-level multi-dimensional semantic understanding
Zhou et al. Recognizing software bug-specific named entity in software bug repository
CN109241199B (en) Financial knowledge graph discovery method
CN109871449A (en) End-to-end zero-shot learning method based on semantic description
Cao et al. Toward accurate link between code and software documentation
Almazroi et al. COVID-19 Cases Prediction in Saudi Arabia Using Tree-based Ensemble Models.
CN116244446A (en) Social media cognitive threat detection method and system
Fang et al. CyberEyes: cybersecurity entity recognition model based on graph convolutional network
Tianxiong et al. Identifying Chinese event factuality with convolutional neural networks
CN117574898A (en) Domain knowledge graph updating method and system based on power grid equipment
Guo et al. Research on named entity recognition for information extraction
Qiu et al. NeuroSPE: A neuro‐net spatial relation extractor for natural language text fusing gazetteers and pretrained models
CN106156316A (en) Method and system for associating proper names with native places in a big data environment
CN116186422A (en) Disease-related public opinion analysis system based on social media and artificial intelligence
Zhu et al. A Semantic Similarity Computing Model based on Siamese Network for Duplicate Questions Identification.
Parolin et al. Come-ke: A new transformers based approach for knowledge extraction in conflict and mediation domain
CN110377690A (en) Information acquisition method and system based on long-range relation extraction
CN113886524A (en) Network security threat event extraction method based on short text
Zhou et al. Ontology-based information extraction from environmental regulations for supporting environmental compliance checking
Liao et al. Detecting duplicate questions in stack overflow via semantic and relevance approaches
Bäck Domain similarity metrics for predicting transfer learning performance

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20170609