CN106815293A - System and method for constructing knowledge graph for information analysis - Google Patents
- Publication number
- CN106815293A (application number CN201611124399.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- relation
- module
- analysis
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/288—Entity relationship models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A system and method for constructing a knowledge graph for intelligence analysis. The system comprises: a data acquisition module, which cleans and lightly preprocesses the collected data and outputs it to a text extraction module; the text extraction module, which cleans and preprocesses the collected structured and unstructured data and passes the clean data to an entity recognition module; the entity recognition module, which segments the text into words, tags each word with its part of speech, extracts terms, and passes the result to a semantic analysis module; the semantic analysis module, which analyzes and extracts the relations between ontology concepts, generates a semantic metadata model with an ontology construction tool, and outputs the model to an entity relation extraction module; and the entity relation extraction module, which extracts taxonomic and non-taxonomic relations and finally generates the knowledge graph. By combining syntactic training with association rules, the invention reduces external input and manual intervention while continuing to identify entity relations.
Description
Technical field
The present invention relates to natural language processing, computer information processing technology, and Chinese knowledge base applications, and in particular to the construction of knowledge graphs.
Background art
Recent years have been an era of data explosion, with data volume growing at roughly 50% per year. To process this mass of data, mine its latent value, and improve retrieval quality and efficiency, major research institutions and search engine vendors worldwide have invested great effort. As projects such as Linking Open Data gathered pace, the number of Semantic Web data sources surged and large amounts of RDF (Resource Description Framework) data were published. The Internet is changing from a Document Web, which contains only web pages and the hyperlinks between them, into a Data Web, which describes entities and the rich relations among them. Against this background, search engine companies such as Google, Baidu, and Sogou have each built knowledge graphs, named Knowledge Graph, Zhixin, and Zhilifang (Knowledge Cube) respectively, to improve search quality, which opened the way to semantic search. At the same time, massive, complex, and heterogeneous intelligence information calls for the ability to analyze, mine, and associate data quickly: to extract intelligence target entities and link them into a large intelligence knowledge graph, in a data environment that is strong in both processing volume and feedback timeliness. Building a complete, high-quality knowledge graph is the indispensable key technology for these capabilities.
Security and intelligence agencies around the world are building systems or developing technology to collect, fuse, manage, and analyze intelligence big data and to derive valuable intelligence from it. For example, US government intelligence agencies have used the PRISM program to continuously monitor Internet activity and telecom-carrier user information at home and abroad. The US military also invested substantial funds, as early as five years ago, in companies such as Palantir; with its comprehensive databases and powerful data association analysis, Palantir helped the Obama administration in the operation to hunt down bin Laden. After the September 11 attacks, data on terrorist suspects who endangered national security was scattered across different agencies. In response, the Terrorist Screening Center (hereinafter TSC) was established under Homeland Security Presidential Directive 6 of 2003. The center is subordinate to and led by the FBI and is a joint body staffed by representatives of the Department of Justice, the Department of Homeland Security, the Department of State, and other departments; its main task is to identify suspicious or potential terrorists. It has three subordinate departments: the Information Technology Office, Administrative Management Services, and Operations. The National Counterterrorism Center (NCTC) and the FBI pass lists of known and suspected foreign and domestic terrorists, respectively, through the TSC into the Terrorist Screening Database (TSDB), and identity records of known and suspected terrorists in the TSDB are added, modified, or deleted through daily exchanges between the databases. The TSC runs a round-the-clock call center and has built a worldwide network of identifying characteristics of terrorists, including location, residence, contact details, and transaction records. This helps law enforcement agencies determine whether a person encountered during screening matches a record in the database, and the available information is then supplied to the officers carrying out daily screening. The TSC also cooperates actively with fusion centers across the United States to ensure that they pass on information about known and suspected terrorists accurately and promptly, and it authorizes law enforcement officers from the FBI and other agencies to take part in screening, effectively preventing further terrorist attacks. The characteristic information network has thus become a core supporting technology of the US National Counterterrorism Center. The US military's TIA program ("Total Information Awareness" or "Terrorism Information Awareness") uses advanced methods to collect, process, and analyze large-scale terrorism data, with the ultimate aim of stopping terrorist attacks at the source. Its main method, carried out in the EELD (Evidence Extraction and Link Discovery) sub-project, is to extract evidence of the relations and associations among people, organizations, places, and events from unstructured text to form a knowledge graph, and then to model and analyze the relations, whereabouts, and activities of terrorists; this played an important role in identifying IS terrorist activity. Technically, the TIA program focuses on: 1) building the architecture of a large-scale counterterrorism database; 2) new methods for populating the database from existing sources, creating new sources, and creating new mining, fusion, and refinement algorithms; and 3) new knowledge graph models for analyzing and associating database information, so as to obtain actionable intelligence.
By contrast, domestic use of and research on these technologies remains very limited. Faced with massive, complex intelligence data spanning many fields, a method is urgently needed to clean this data and turn it into a knowledge graph of high practical value. Building a knowledge graph usually involves data extraction, Chinese word segmentation, entity recognition, and relation recognition. Relation recognition is currently the hardest of these problems; its basic principle is entity co-occurrence plus relation labeling. Existing methods either cannot keep improving their relation labels or depend heavily on external knowledge input and manual intervention.
Among existing invention patents, "A book-oriented reading domain knowledge graph construction method" (publication number CN103488724A, 2014.01.01) describes a method that integrates a general knowledge graph from knowledge obtained on the Internet, iteratively expands the concepts and entities related to a book with reference to that general graph, extracts entity relations by combining entity Infobox tables with conventional relation extraction, and finally marks the core entities in the e-book in order of entity length, from longest to shortest, and links them to the book knowledge graph to enable intelligent knowledge recommendation. This approach builds a graph for one particular book, so it is severely limited. The keywords of the book must be defined manually during construction, so building many graphs consumes a great deal of labor, and manual errors easily make the graph inaccurate. Above all, its scope is too narrow to handle massive, complex intelligence data. "Knowledge graph construction method and device based on structured data" (publication number CN104462501A, 2015.03.25) obtains one or more pieces of structured data containing entity names and the attribute information of the corresponding entities, extracts the mappings between the entity names and their attribute information, generates the corresponding data structure pairs, and stores those pairs as knowledge graph data. It handles structured data only, whereas most of the data that needs processing in practice is unstructured, so its applicability is also narrow and it cannot meet the requirements here.
Summary of the invention
In view of the defects of the prior art, the object of the invention is to provide a system and method for constructing a knowledge graph for intelligence analysis that combine syntactic training with association rules, which both reduces external input and manual intervention and allows entity relations to be identified continuously.
To achieve this object, the invention adopts the following technical solution:
A system for constructing a knowledge graph for intelligence analysis, comprising:
a data acquisition module, which cleans and lightly preprocesses the collected data and outputs it to a text extraction module;
a text extraction module, which cleans and preprocesses the collected structured and unstructured data, including checking whether a file is damaged, converting files to a unified encoding, and converting between traditional and simplified Chinese, and then passes the data to an entity recognition module;
an entity recognition module, which segments the received clean text into words, tags the segmented words with parts of speech, extracts terms once tagging is complete, and passes the result to a semantic analysis module;
a semantic analysis module, which analyzes and extracts the relations between ontology concepts, generates a semantic metadata model with an ontology editing tool, and outputs it to an entity relation extraction module; and
an entity relation extraction module, which extracts taxonomic and non-taxonomic relations and finally generates the knowledge graph.
The data acquisition module is implemented by a crawler system for intelligence big data that crawls in a targeted way around a specific target.
The entity recognition module comprises a word segmentation module, a part-of-speech tagging module, and a term analysis module. For the clean text received, the word segmentation module first segments the text and extracts all words according to the segmentation rules and the dictionary; the part-of-speech tagging module then tags the segmented words; and once tagging is complete, the term analysis module extracts terms according to the terminology bank.
The semantic analysis module extracts a grammar library and an ontology library through ontology integration, and generates an information extraction rule base by combining grammatical and semantic normalization with grammatical and semantic learning algorithms. Relying on the grammar library, the ontology library, and the information extraction rule base generated in this process, it finally extracts the relations between ontology concepts through grammatical and semantic analysis.
The invention also provides a method for constructing a knowledge graph for intelligence analysis, implemented by any of the systems above, comprising:
Step 1: crawl data with a crawler system for intelligence big data that crawls in a targeted way around a specific target.
Step 2: clean and filter the data collected in step 1: filter out damaged data, convert the various text encodings to UTF-8, and preprocess the converted text, including conversion between traditional and simplified Chinese.
Step 3: perform named entity recognition on the data cleaned and preprocessed in step 2, including word segmentation, part-of-speech tagging, and term analysis.
Step 4: generate the information extraction rule base and, together with the grammar library and ontology library generated in the process, carry out grammatical and semantic analysis; then generate a semantic metadata model with an ontology editing tool and output it to the semantic relation recognition of the next step.
Step 5: perform semantic relation recognition and extraction on the semantic metadata model delivered by step 4.
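Step 2 above can be sketched as follows. This is a minimal illustration only, not the implementation of the invention: the candidate encoding list, the tiny traditional-to-simplified table, and the function names `decode_or_none` and `clean` are all assumptions introduced here.

```python
# Hypothetical candidate encodings, tried in order; the source does not list them.
CANDIDATE_ENCODINGS = ("utf-8", "gb18030")

# Tiny illustrative traditional-to-simplified table (an assumption);
# a real conversion table has thousands of entries.
T2S = str.maketrans({"國": "国", "報": "报", "識": "识", "圖": "图", "譜": "谱"})


def decode_or_none(raw: bytes):
    """Return decoded text, or None when the input is empty or fits no encoding.

    This is only a weak damage check: multi-byte decoders accept many
    byte sequences, so some damaged files will still decode.
    """
    if not raw:
        return None
    for encoding in CANDIDATE_ENCODINGS:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return None


def clean(raw: bytes):
    """Filter unusable input, then emit simplified-Chinese text as UTF-8 bytes."""
    text = decode_or_none(raw)
    if text is None:
        return None
    return text.translate(T2S).strip().encode("utf-8")
```

Trying UTF-8 first is a deliberate choice in this sketch, because UTF-8 decoding is strict and rarely succeeds on text that was written in another encoding.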
In step 1, the crawled content includes Office, PDF, XML, and HTML files, or data imported from private databases.
Step 3 comprises three sub-steps: word segmentation step 206, part-of-speech tagging step 208, and term analysis step 210. Word segmentation step 206 segments the cleaned and preprocessed text according to the dictionary and the selected segmentation rules. After segmentation, part-of-speech tagging step 208 tags the words produced by step 206 according to the dictionary and the part-of-speech tagging rule base; alternatively, steps 206 and 208 are performed together, so that words are tagged as they are segmented. Term analysis step 210 then analyzes and extracts terms according to the terminology bank, which is generated by integrating term extraction with a glossary, and filters out useless words. After steps 206, 208, and 210, named entities are extracted according to the named entity pattern library, which is generated by integrating machine learning with the dictionary.
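The three sub-steps can be illustrated with a small sketch in which segmentation and tagging are performed together. The source does not name its segmentation rule; forward maximum matching is assumed here, and the toy `LEXICON`, the `TERM_BANK`, and the tag letters are hypothetical.

```python
# Hypothetical toy lexicon mapping each word to a part-of-speech tag.
LEXICON = {"知识": "n", "图谱": "n", "知识图谱": "n", "构建": "v", "的": "u", "系统": "n"}
# Hypothetical domain terminology bank.
TERM_BANK = {"知识图谱", "系统"}
MAX_LEN = max(len(word) for word in LEXICON)


def segment_and_tag(text):
    """Forward maximum matching: take the longest lexicon word at each position."""
    tagged, i = [], 0
    while i < len(text):
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            word = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if word in LEXICON or size == 1:
                tagged.append((word, LEXICON.get(word, "x")))  # "x" marks unknown
                i += size
                break
    return tagged


def extract_terms(tagged):
    """Keep only the words found in the terminology bank."""
    return [word for word, _ in tagged if word in TERM_BANK]
```

For the input "构建知识图谱的系统", the longest match wins, so "知识图谱" is kept as one word rather than split into "知识" and "图谱".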
To compensate for the weaknesses of dictionary-based statistical segmentation, a CRF (conditional random field) segmentation technique is used, which considers not only word frequency but also the surrounding context.
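CRF-based segmenters are commonly framed as character tagging, where each character receives a label and the labels are then grouped into words. The source does not give its tag set or feature templates; the sketch below assumes the common B/M/E/S scheme (begin, middle, end, single) and a simple previous/current/next character window. It shows only feature extraction and decoding of a tag sequence, not CRF training.

```python
def char_features(chars, i):
    """Context features for the character at position i (assumed window of one)."""
    return {
        "cur": chars[i],
        "prev": chars[i - 1] if i > 0 else "<BOS>",
        "next": chars[i + 1] if i + 1 < len(chars) else "<EOS>",
    }


def tags_to_words(chars, tags):
    """Group characters into words from B/M/E/S tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends on E (end) or S (single)
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that stops mid-word
        words.append(current)
    return words
```

The context window is what lets a trained model separate ambiguous or unseen words that a frequency-only dictionary lookup would miss.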
Step 5 comprises taxonomic relation extraction and non-taxonomic relation extraction. In one approach, non-taxonomic relations are extracted as follows:
Non-taxonomic relations are extracted by combining association rules with grammatical relations. Association rules measure the strength of the correlation between a pair of concepts and a verb, and strongly correlated concept pairs become candidates for concept pairs confirmed to have a semantic relation. At the same time, within the domain text collection, semantic relations between concepts are found through dependency and sentence structure analysis, since a semantic relation between concepts is usually expressed and connected by a verb. Once syntactic analysis has selected the candidate relation set, association rule mining confirms the suitable relation set.
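The confirmation step can be sketched as follows, assuming that syntactic analysis has already produced candidate (concept, verb, concept) triples. The confidence of the rule "concept pair implies verb" is the share of the pair's co-occurrences that carry that verb. The function name and both thresholds are hypothetical values chosen for illustration.

```python
from collections import Counter


def confirm_relations(candidates, min_support=2, min_confidence=0.6):
    """Keep candidate triples whose association rule is strong enough.

    candidates: iterable of (concept_a, verb, concept_b) tuples, one per
    sentence in which syntactic analysis found the verb linking the pair.
    """
    candidates = list(candidates)
    pair_count = Counter((a, b) for a, _, b in candidates)
    triple_count = Counter(candidates)
    confirmed = {}
    for (a, verb, b), n in triple_count.items():
        support = pair_count[(a, b)]   # how often the pair co-occurs
        confidence = n / support       # P(verb | pair)
        if support >= min_support and confidence >= min_confidence:
            confirmed[(a, verb, b)] = confidence
    return confirmed
```

The verb of a confirmed triple doubles as the relation label, which addresses both finding the relation and naming it.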
Alternatively, step 5 extracts non-taxonomic relations as follows:
A hybrid method that combines rules with machine learning extracts non-taxonomic relations from domain text. The extraction process is:
1) learn rules manually from domain text to form a rule base;
2) match the learned rules against other corpus texts to form a corpus sentence-pattern library;
3) using the sentence-pattern library, annotate a training corpus manually, then train with the CRF machine learning algorithm to produce a trained model;
4) test with a test corpus and the trained model and verify the results manually; according to the test results, supplement and adjust the training corpus and retrain until the precision and recall of the model reach a set level;
5) using the adjusted model together with the rule matching results, extract non-taxonomic entity relations from the actual corpus.
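Two parts of this loop lend themselves to a short sketch: rule matching (item 2) and the precision and recall check that decides whether to retrain (item 4). The single regular-expression rule, its `located_in` label, and the function names are hypothetical; CRF training itself is omitted.

```python
import re

# Hypothetical hand-written rule: "<X>位于<Y>" signals a located-in relation.
RULES = [("located_in", re.compile(r"(?P<head>\w+?)位于(?P<tail>\w+)"))]


def match_rules(sentences):
    """Collect sentences that match a rule, with the extracted head and tail."""
    hits = []
    for sentence in sentences:
        for label, pattern in RULES:
            m = pattern.search(sentence)
            if m:
                hits.append((label, m.group("head"), m.group("tail"), sentence))
    return hits


def precision_recall(predicted, gold):
    """Precision and recall of predicted relations against a verified gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

In a loop of this kind, retraining would continue while either value stays below the chosen level; the source does not state that level.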
The technical solution of this patent brings the following beneficial effects:
1. Starting from the collection of intelligence information, it carries out word segmentation, part-of-speech tagging, term analysis, named entity recognition, syntactic analysis, semantic analysis, and entity relation extraction in sequence, and finally builds the knowledge graph automatically or semi-automatically, greatly reducing the labor needed to build a high-quality knowledge graph.
2. It addresses the problem that technical reconnaissance data, which comes from many sources with heavy traffic and high density, is difficult to record reliably and analyze efficiently.
3. It can quickly and automatically mine intelligence target entities and associate them, automatically or semi-automatically, into a large intelligence knowledge graph.
4. It enables fast analysis, mining, and association of massive, complex, heterogeneous intelligence, completing the semi-automatic construction of the intelligence knowledge graph.
5. It greatly improves machine-intelligence-based automatic intelligence identification, intelligence association confirmation, and intelligence mining and analysis. Because a high-quality knowledge graph underpins efficient graph query and analysis, this patent also directly improves the efficiency with which a big data platform queries and analyzes complex relation graphs.
Brief description of the drawings
Other features, objects, and advantages of the invention will become more apparent from the detailed description of non-limiting embodiments with reference to the following drawings:
Fig. 1 is a block diagram of the system provided by the invention for automatic or semi-automatic construction of a graph for intelligence analysis;
Fig. 2 is a flow chart;
Fig. 3 is a flow chart of non-taxonomic relation extraction based on association rules;
Fig. 4 is a flow chart of non-taxonomic relation extraction based on rules and machine learning.
Detailed description
The invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to understand the invention further but do not limit it in any way. It should be noted that those of ordinary skill in the art can make changes and improvements without departing from the inventive concept, and these all fall within the scope of protection of the invention.
The system and method provided by the invention for automatic or semi-automatic graph construction for intelligence analysis comprise four main parts: intelligence collection, named entity recognition, grammatical and semantic analysis, and entity relation extraction. The core technologies are named entity recognition and entity relation extraction.
Intelligence collection is implemented with a widely used data crawler system. Unlike a general-purpose crawler that aims for breadth, the crawler has been improved and optimized into a crawler system for intelligence big data that crawls in a targeted way around a specific target. The crawled data is cleaned and lightly preprocessed and then output to the named entity recognition module.
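The difference between a breadth-oriented crawler and a targeted one can be illustrated with a best-first frontier that expands only pages relevant to the target. The source does not describe how its crawler was optimized, so everything below is an assumption: pages and links are in-memory dictionaries standing in for real fetching, and the keyword-overlap score and threshold are hypothetical.

```python
import heapq


def relevance(text, target_terms):
    """Fraction of the target terms that occur in the page text."""
    if not target_terms:
        return 0.0
    return sum(term in text for term in target_terms) / len(target_terms)


def focused_crawl(pages, links, seed, target_terms, threshold=0.5):
    """Visit pages best-first, keeping and expanding only relevant ones.

    pages: dict mapping url -> text (stands in for fetching a page)
    links: dict mapping url -> list of outgoing urls
    """
    frontier = [(-relevance(pages[seed], target_terms), seed)]
    seen, kept = {seed}, []
    while frontier:
        neg_score, url = heapq.heappop(frontier)
        if -neg_score < threshold:
            continue  # off-topic page: neither kept nor expanded
        kept.append(url)
        for nxt in links.get(url, []):
            if nxt in pages and nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (-relevance(pages[nxt], target_terms), nxt))
    return kept
```

Because off-topic pages are never expanded, the crawl stays near the target instead of spreading across the whole link graph.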
Named entity recognition comprises a word segmentation module, a part-of-speech tagging module, and a term analysis module.
Word segmentation and part-of-speech tagging can be performed together, so that words are tagged as they are segmented. Basic segmentation and tagging rely on a segmentation rule base, a dictionary, and a library that relates words to parts of speech. In practice, however, this approach recognizes ambiguous words and new words poorly. Chinese word segmentation in particular must cope with polysemy, ambiguity, and a steady stream of new Internet terms, so the segmentation lexicon needs constant expansion. To compensate for the weaknesses of dictionary-based statistical segmentation, the invention uses a segmentation technique based on conditional random fields (CRF). It considers not only word frequency but also the surrounding context, and so learns well and handles ambiguous and new words better. The invention combines a continuously expanded segmentation lexicon with this relatively mature segmentation technique. Segmentation and tagging are completed from the segmentation lexicon and the part-of-speech library where possible, which is both accurate and efficient; when ambiguous or new words are encountered that the lexicon and part-of-speech library cannot resolve, CRF completes the work and the lexicon and part-of-speech library are expanded. The segmented and tagged output goes to the term analysis module.
Term analysis extracts terms from a glossary into the terminology bank. The terminology bank needs to be partitioned and maintained by domain. The output of term analysis goes to the grammatical and semantic analysis module.
Grammatical and semantic analysis. A grammar library and an ontology library are extracted through ontology integration, and an information extraction rule base is generated by combining grammatical and semantic normalization with grammatical and semantic learning algorithms. Relying on the grammar library, ontology library, and information extraction rule base generated in this process, the relations between ontology concepts are extracted through grammatical and semantic analysis; an ontology editing tool then generates the semantic metadata model, which is output to the entity relation extraction module.
Entity relation extraction comprises taxonomic relation extraction and non-taxonomic relation extraction. The difficulty lies in non-taxonomic relation extraction, which divides into two distinct problems: 1) discovering that a relation exists between a pair of concepts; and 2) labeling that relation according to its semantics. This patent uses two methods to extract non-taxonomic relations.
The first method extracts non-taxonomic relations by combining association rules with grammatical relations. The invention uses association rules to measure the strength of the correlation between a pair of concepts and a verb (the strength is defined by the confidence of the extracted association rule), and strongly correlated pairs become candidates for concept pairs confirmed to have a semantic relation. The invention also proposes finding semantic relations between concepts in the domain text collection through dependency and sentence structure analysis, since such relations are usually expressed and connected by verbs. Once syntactic analysis has selected the candidate relation set, association rule mining confirms the suitable relation set, and with feedback from domain experts the most suitable relation set is finally provided to the ontology engineer. The method not only discovers relations between concepts automatically but also assigns them suitable labels, which greatly lightens the knowledge engineer's burden when building a domain ontology.
The second method is a hybrid method that combines rules with machine learning to extract non-taxonomic
relations from domain text. Its extraction process is: 1) rules are learned manually from domain text to form
a rule base; 2) the learned rules are matched against other corpus texts to form a corpus sentence-pattern
library; 3) using the corpus sentence-pattern library, a training corpus is annotated manually and then
trained with the CRF machine learning algorithm to generate a trained model; 4) the trained model is tested
on a test corpus and verified manually, and according to the test results the training corpus is supplemented
and adjusted and the model is retrained, until the accuracy and recall of the trained model reach a certain
level; 5) the adjusted trained model is combined with the rule matching results to extract non-taxonomic
entity relations from the actual corpus. This method effectively improves the extraction of non-taxonomic
relations from a term set; the manually assisted CRF method can also extract additional terms, effectively
supplementing the existing term set.
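Steps 1) and 2) of the hybrid method can be sketched as follows: a small hand-written rule base is matched
against corpus sentences, and the hits form a sentence-pattern library that could later be annotated for CRF
training. The rule base, the relation labels, and the English sample sentences are hypothetical and serve only
to illustrate the idea; the patent itself targets Chinese text.

```python
import re

# Hypothetical hand-written rule base: relation label -> pattern with two named slots.
RULE_BASE = {
    "acquires": re.compile(r"(?P<head>[A-Z]\w+) (?:acquired|acquires) (?P<tail>[A-Z]\w+)"),
    "located_in": re.compile(r"(?P<head>[A-Z]\w+) is (?:located|based) in (?P<tail>[A-Z]\w+)"),
}

def match_rules(sentences, rule_base=RULE_BASE):
    """Return (sentence, label, head, tail) for every rule hit."""
    hits = []
    for sentence in sentences:
        for label, pattern in rule_base.items():
            for m in pattern.finditer(sentence):
                hits.append((sentence, label, m.group("head"), m.group("tail")))
    return hits

corpus = [
    "Acme acquired Initech last year.",
    "Initech is based in Austin.",
    "The weather was mild.",
]
# Sentences that matched a rule, ready for manual annotation.
sentence_pattern_library = match_rules(corpus)
```

Sentences that match no rule, such as the third one, are simply left out of the library.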
As shown in Fig. 1, the system 101 for automatic and semi-automatic construction of a knowledge graph for
intelligence analysis provided by the embodiment of the present invention includes: a data acquisition module
102, a text extraction module 103, an entity recognition module 104, a semantic analysis module 108, an entity
relationship extraction module 109, and a knowledge graph 110. The entity recognition module 104 includes
three submodules: a word segmentation module 105, a part-of-speech tagging module 106, and a term analysis
module 107. The data acquisition module 102 is implemented with a data crawler system. Compared with a
general-purpose data crawler that aims at breadth, we improved and optimized the crawler so that it becomes a
crawler system for intelligence big data that crawls in a targeted manner around a specific objective. The
crawled data is cleaned and given simple preprocessing, and is then output to the text extraction module 103.
The text extraction module 103 performs data cleaning and preprocessing on the collected structured and
unstructured data: it checks whether each file is damaged, converts the collected files to a unified
encoding, and then performs operations such as conversion between traditional and simplified Chinese. After
cleaning and preprocessing, the data is sent to the entity recognition module 104. The entity recognition
module 104 includes three submodules: the word segmentation module 105, the part-of-speech tagging module
106, and the term analysis module 107. For the clean text data received, the word segmentation module 105
first segments the text and extracts all words according to the segmentation rules and the dictionary; the
part-of-speech tagging module 106 then tags the words produced by the word segmentation module 105. After
tagging is complete, the term analysis module 107 extracts terms according to the term bank, and the
extraction result is sent to the semantic analysis module 108. The semantic analysis module 108 extracts a
grammar library and an ontology library through ontology integration and, combining grammatical and semantic
specifications with grammatical and semantic learning algorithms, generates a rule base for information
extraction. Relying on the grammar library, ontology library, and information extraction rule base generated
in this process, it extracts the relations between ontologies through grammatical and semantic analysis,
generates a semantic metadata model with an ontology editing tool, and outputs the model to the entity
relationship extraction module 109. The entity relationship extraction module 109 extracts taxonomic and
non-taxonomic relations and finally generates the knowledge graph 110.
Fig. 2 shows the flow for automatic and semi-automatic construction of a knowledge graph for intelligence
analysis, which specifically includes the following steps:
In step 201, data is crawled with a widely used data crawler system. Compared with a general-purpose data
crawler that aims at breadth, we improved and optimized the crawler so that it becomes a crawler system for
intelligence big data that crawls in a targeted manner around a specific objective. The crawled text may be
in Office, PDF, XML, or HTML format; data can of course also be imported from a private database.
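The patent does not say how the crawler is made targeted. One possible reading, sketched below under that
assumption, is that links are followed only from pages that are sufficiently relevant to a set of target
terms. The relevance measure, the threshold, and the in-memory `pages` map (which stands in for real network
access) are all hypothetical.

```python
from collections import deque

def relevance(text, target_terms):
    """Fraction of target terms that appear in the text (case-insensitive substring test)."""
    lowered = text.lower()
    return sum(term.lower() in lowered for term in target_terms) / len(target_terms)

def focused_crawl(pages, seed, target_terms, threshold=0.5):
    """Breadth-first crawl over an in-memory site map.

    `pages` maps a URL to (page_text, outgoing_links). Links are only
    followed from pages whose relevance reaches `threshold`.
    """
    queue, seen, collected = deque([seed]), {seed}, []
    while queue:
        url = queue.popleft()
        text, links = pages.get(url, ("", []))
        if relevance(text, target_terms) < threshold:
            continue  # off-topic page: do not store it or expand its links
        collected.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return collected

# Hypothetical site map used in place of real HTTP requests.
pages = {
    "a": ("Port security incident report", ["b", "c"]),
    "b": ("Recipe for noodle soup", ["d"]),
    "c": ("Follow-up report on port security", []),
    "d": ("Port security budget", []),
}
collected = focused_crawl(pages, "a", ["port", "security"])
```

Page "d" is on topic but is never reached, because the only link to it sits on the off-topic page "b". This
is the usual trade-off of a focused crawler compared with one that aims at breadth.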
In step 202, the data collected in step 201 is cleaned and filtered: damaged data is filtered out, the
various text encodings are converted to unified UTF-8 encoding, and the converted text undergoes
preprocessing operations such as conversion between traditional and simplified Chinese. After the data
collected in step 201 has been cleaned and preprocessed, the flow proceeds to the next step, named entity
recognition step 203.
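The cleaning in step 202 can be sketched roughly as follows. The list of candidate encodings and the
four-character traditional-to-simplified table are assumptions made for this example; a real system would
need proper encoding detection and a complete conversion table.

```python
# Tiny illustrative traditional -> simplified table (assumption, far from complete).
TRAD_TO_SIMP = str.maketrans({"資": "资", "訊": "讯", "圖": "图", "譜": "谱"})

def decode_to_text(raw, encodings=("utf-8", "gb18030")):
    """Try the candidate encodings in order; return None if none of them fits."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return None

def clean_document(raw):
    """Filter unusable input, normalize to simplified Chinese, and re-encode as UTF-8."""
    text = decode_to_text(raw)
    if text is None or not text.strip():
        return None  # treated as damaged or empty and filtered out
    return text.strip().translate(TRAD_TO_SIMP).encode("utf-8")

# GBK/GB18030 bytes for the two characters of "中文"; they are not valid UTF-8.
gbk_bytes = b"\xd6\xd0\xce\xc4"
normalized = clean_document(gbk_bytes)
```

Trying encodings in a fixed order is a simplification: some legacy byte sequences decode without error under
the wrong encoding, so a production pipeline would usually add a detection step.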
Named entity recognition step 203 in fact comprises three sub-steps: word segmentation step 206,
part-of-speech tagging step 208, and term analysis step 210. Step 206 segments the text data cleaned and
preprocessed in step 202 according to the dictionary and selected segmentation rules 207. Once segmentation
is finished, part-of-speech tagging step 208 tags the words produced by step 206 according to the dictionary
and part-of-speech tagging rule base 209. After step 208 is complete, term analysis step 210 is carried out:
according to the term bank 211 generated by term extraction and glossary integration 212, terms are analyzed
and extracted, and useless words such as auxiliary verbs, adverbs, and adjectives are filtered out. After
steps 206, 208, and 210, named entities are extracted according to the named entity pattern library 204
generated by machine learning and dictionary integration 205, and are passed to the next step, syntactic
analysis 213.
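The three sub-steps can be illustrated with a toy example. The patent does not name a segmentation
algorithm, so forward maximum matching is used here as one common dictionary-based choice. The dictionary,
the tag names, and the term bank are hypothetical.

```python
# Hypothetical toy dictionary: word -> part-of-speech tag.
DICTIONARY = {
    "知识": "n", "图谱": "n", "知识图谱": "n",
    "可以": "aux", "快速": "adv", "构建": "v", "系统": "n",
}
TERM_BANK = {"知识图谱", "系统"}       # hypothetical term bank
STOP_POS = {"aux", "adv", "adj"}       # parts of speech filtered out as useless
MAX_WORD_LEN = max(len(w) for w in DICTIONARY)

def segment(text):
    """Forward maximum matching: at each position take the longest dictionary word."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in DICTIONARY:
                words.append(candidate)  # unknown single characters pass through
                i += size
                break
    return words

def tag(words):
    """Look up each word's tag in the dictionary; unknown words get 'x'."""
    return [(w, DICTIONARY.get(w, "x")) for w in words]

def extract_terms(tagged):
    """Drop filtered parts of speech and keep only words found in the term bank."""
    return [w for w, pos in tagged if pos not in STOP_POS and w in TERM_BANK]

words = segment("系统可以快速构建知识图谱")
terms = extract_terms(tag(words))
```

Because the longest match wins, "知识图谱" is kept as one word rather than being split into "知识" and "图谱".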
Syntactic analysis step 213 and semantic analysis step 215 extract the grammar library 214 and the ontology
library 216 through ontology learning and ontology integration 217 and, combining the learning algorithms for
grammatical and semantic relation learning with the semantic specifications 220, generate a rule base 219 for
information extraction. Relying on the grammar library 214, ontology library 216, and information extraction
rule base 219 generated in this process, the flow performs syntactic analysis 213 and semantic analysis 215,
generates a semantic metadata model with an ontology editing tool, and outputs the model to the next step,
semantic relation recognition 218.
In step 218, entity relationship extraction performs semantic relation recognition and extraction on the
semantic metadata model delivered by step 215, including taxonomic relation extraction and non-taxonomic
relation extraction. This patent extracts non-taxonomic relations using two methods.
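To make the two kinds of relation concrete, the sketch below stores extracted relations as triples in a small
in-memory graph. Taxonomic relations are represented by an assumed "is_a" label, and every other label is
treated as non-taxonomic. The class, the labels, and the sample entities are hypothetical and are not
prescribed by the patent.

```python
from collections import defaultdict

class KnowledgeGraph:
    """Minimal triple store: head -> {(relation, tail)}."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, head, relation, tail):
        self._edges[head].add((relation, tail))

    def neighbors(self, head, relation=None):
        """Tails reachable from `head`, optionally restricted to one relation label."""
        return sorted(t for r, t in self._edges.get(head, ()) if relation in (None, r))

    def ancestors(self, entity):
        """Follow taxonomic "is_a" edges upward and collect every ancestor."""
        found, stack = [], [entity]
        while stack:
            for parent in self.neighbors(stack.pop(), "is_a"):
                if parent not in found:
                    found.append(parent)
                    stack.append(parent)
        return found

kg = KnowledgeGraph()
kg.add("sedan", "is_a", "car")                # taxonomic relation
kg.add("car", "is_a", "vehicle")              # taxonomic relation
kg.add("sedan", "manufactured_by", "company") # non-taxonomic relation
```

Taxonomic edges form a hierarchy that can be walked with `ancestors`, while non-taxonomic edges carry the
verb-like labels that the two extraction methods are meant to discover.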
As shown in Fig. 3, the present embodiment provides a flow for non-taxonomic relation extraction based on
association rules, which extracts non-taxonomic relations by combining association rules with grammatical
relations. In step 302, the domain text set 301 undergoes text processing and indexing; text processing here
includes word segmentation and entity identification. Using the concept candidate set 303, the frequency of
relation occurrences 304 is identified, and the semantic relations between concepts are found by analyzing
the dependencies and sentence structure between concepts, producing a candidate relation set. After the
candidate relation set has been selected by syntactic analysis, association rule mining 306 confirms a
suitable relation set; finally, with the feedback of domain experts 307, the most suitable relation set is
provided to the ontology engineer. The method not only discovers relations between concepts automatically but
also assigns suitable labels to those relations. It can greatly reduce the burden on knowledge engineers when
building a domain ontology.
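The candidate-generation part of this flow can be approximated as follows. Instead of real dependency
parsing, the sketch uses a deliberately crude stand-in: a verb that appears between two consecutive concept
mentions in a part-of-speech-tagged sentence is taken as the candidate relation label. The tag names, the
concept set, and the English sample sentences are assumptions made for this example.

```python
from collections import Counter

def candidate_relations(tagged_sentence, concepts):
    """Return (concept, verb, concept) triples for consecutive concept mentions
    that have at least one verb between them (the verb nearest the second
    concept is used as the label)."""
    triples = []
    last_concept, verbs_between = None, []
    for word, pos in tagged_sentence:
        if word in concepts:
            if last_concept is not None and verbs_between:
                triples.append((last_concept, verbs_between[-1], word))
            last_concept, verbs_between = word, []
        elif pos == "v":
            verbs_between.append(word)
    return triples

def relation_frequencies(tagged_sentences, concepts):
    """Count how often each candidate relation occurs across the text set."""
    counts = Counter()
    for sentence in tagged_sentences:
        counts.update(candidate_relations(sentence, concepts))
    return counts

concepts = {"company", "product", "agency", "report"}
tagged_sentences = [
    [("company", "n"), ("quickly", "adv"), ("releases", "v"), ("product", "n")],
    [("company", "n"), ("releases", "v"), ("product", "n")],
    [("agency", "n"), ("publishes", "v"), ("report", "n")],
]
frequencies = relation_frequencies(tagged_sentences, concepts)
```

The resulting counts are the kind of input that association rule mining could then score, with domain experts
reviewing the surviving relations. A positional heuristic like this one mislabels many sentences, which is
why the flow relies on proper syntactic analysis.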
As shown in Fig. 4, the present embodiment provides a flow for non-taxonomic relation extraction based on
rules and machine learning. Rules are learned manually from the domain text 401 to form the rule base 402.
The learned rules are matched against other corpus texts to form the corpus sentence-pattern library 403.
Using the corpus sentence-pattern library 403, step 408 annotates the training corpus manually, and step 405
then trains on it with the CRF machine learning algorithm to generate a trained model. Using a test corpus
and the trained model, step 409 performs testing and manual verification; according to the test results,
step 404 supplements and adjusts the training corpus, and the model is retrained until its accuracy and
recall reach a certain level. The adjusted trained model is then combined with the rule matching results to
extract non-taxonomic entity relations from the actual corpus. This method effectively improves the
extraction of non-taxonomic relations from a term set; the manually assisted CRF method can also extract
additional terms, effectively supplementing the existing term set.
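The stopping criterion of the retraining loop can be sketched as below. The sketch reads "accuracy" as
precision, which is an assumption, and the 0.8 thresholds are likewise hypothetical, since the patent only
says "a certain level". The CRF training itself is not shown.

```python
def precision_recall(predicted, gold):
    """Precision and recall of predicted relation triples against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

def good_enough(predicted, gold, min_precision=0.8, min_recall=0.8):
    """True once both metrics reach the (assumed) target levels."""
    precision, recall = precision_recall(predicted, gold)
    return precision >= min_precision and recall >= min_recall

# Hypothetical relation triples, abbreviated to single letters.
gold = {"a", "b", "c", "d"}
first_round = {"a", "b", "c", "e"}   # output before the corpus is adjusted
second_round = {"a", "b", "c", "d"}  # output after supplementing the corpus
```

In this example the first round reaches a precision and recall of 0.75 each, so the training corpus would be
supplemented and the model retrained; the second round passes the check.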
Specific embodiments of the present invention have been described above. It should be understood that the
invention is not limited to the specific embodiments described above; those skilled in the art may make
various changes or modifications within the scope of the claims without affecting the substance of the
invention. Where no conflict arises, the embodiments of the present application and the features in the
embodiments may be combined with one another arbitrarily.
Claims (10)
1. A system for constructing a knowledge graph for intelligence analysis, characterized by comprising:
a data acquisition module, which cleans and simply preprocesses the collected data and then outputs it to a
text extraction module;
the text extraction module, which performs data cleaning and preprocessing on the collected structured and
unstructured data, determines whether files are damaged, performs operations on the collected files including
unified encoding conversion and conversion between traditional and simplified Chinese, and sends the data to
an entity recognition module after cleaning and preprocessing are complete;
the entity recognition module, which, for the clean text data received, first segments the text into words,
then tags the segmented words with parts of speech, and, after tagging is complete, extracts terms and sends
the extraction result to a semantic analysis module;
the semantic analysis module, which extracts the relations between ontologies by analysis, generates a
semantic metadata model with an ontology editing tool, and outputs the model to an entity relationship
extraction module; and
the entity relationship extraction module, which extracts taxonomic and non-taxonomic relations and finally
generates the knowledge graph.
2. The system for constructing a knowledge graph for intelligence analysis according to claim 1,
characterized in that the data acquisition module is implemented by a crawler system for intelligence big
data that crawls in a targeted manner around a specific objective.
3. The system for constructing a knowledge graph for intelligence analysis according to claim 1,
characterized in that the entity recognition module includes a word segmentation module, a part-of-speech
tagging module, and a term analysis module; for the clean text data received, the word segmentation module
first segments the text and extracts all words according to the segmentation rules and the dictionary, the
part-of-speech tagging module then tags the words produced by the word segmentation module, and, after
tagging is complete, the term analysis module extracts terms according to the term bank.
4. The system for constructing a knowledge graph for intelligence analysis according to claim 1,
characterized in that the semantic analysis module extracts a grammar library and an ontology library through
ontology integration, generates a rule base for information extraction by combining grammatical and semantic
specifications with grammatical and semantic learning algorithms, and, relying on the grammar library,
ontology library, and information extraction rule base generated in this process, extracts the relations
between ontologies through grammatical and semantic analysis.
5. A method for constructing a knowledge graph for intelligence analysis, characterized in that it is
implemented by the system according to any one of claims 1 to 4 and comprises:
Step 1, crawling data with a crawler system for intelligence big data that can crawl in a targeted manner
around a specific objective;
Step 2, cleaning and filtering the data collected in step 1, filtering out damaged data, converting the
various text encodings to unified UTF-8 encoding, and performing preprocessing operations on the converted
text, including conversion between traditional and simplified Chinese;
Step 3, performing named entity recognition on the data cleaned and preprocessed in step 2, including word
segmentation, part-of-speech tagging, and term analysis;
Step 4, generating an information extraction rule base and, in combination with the grammar library and
ontology library generated in the process, performing syntactic analysis and semantic analysis, generating a
semantic metadata model with an ontology editing tool, and outputting the model to the next step, semantic
relation recognition;
Step 5, performing semantic relation recognition and extraction on the semantic metadata model delivered by
step 4.
6. The method for constructing a knowledge graph for intelligence analysis according to claim 5,
characterized in that, in step 1, the crawled text formats include Office, PDF, XML, and HTML, or the data is
imported from a private database.
7. The method for constructing a knowledge graph for intelligence analysis according to claim 5,
characterized in that step 3 includes three sub-steps:
word segmentation step 206, part-of-speech tagging step 208, and term analysis step 210, wherein word
segmentation step 206 segments the cleaned and preprocessed text data according to the dictionary and
selected segmentation rules; once segmentation is finished, part-of-speech tagging step 208 tags the words
produced by step 206 according to the dictionary and the part-of-speech tagging rule base, or,
alternatively, word segmentation step 206 and part-of-speech tagging step 208 are performed together, so that
parts of speech are tagged during segmentation; after completion, term analysis step 210 is carried out, in
which terms are analyzed and extracted according to the term bank generated by term extraction and glossary
integration, and useless words are filtered out; after steps 206, 208, and 210, named entities are extracted
according to the named entity pattern library generated by machine learning and dictionary integration.
8. The method for constructing a knowledge graph for intelligence analysis according to claim 7,
characterized in that, to address the deficiencies of dictionary-based and statistical word segmentation, a
CRF word segmentation technique is used, which considers not only the frequency with which words occur but
also the context.
9. The method for constructing a knowledge graph for intelligence analysis according to claim 5,
characterized in that step 5 includes taxonomic relation extraction and non-taxonomic relation extraction,
the non-taxonomic relation extraction method being as follows:
non-taxonomic relations are extracted by combining association rules with grammatical relations; association
rules are used to measure the strength of the correlation between a pair of concepts and a verb, in order to
confirm candidate concept pairs that have a semantic relation; meanwhile, in a domain text set, the semantic
relations between concepts are found by analyzing the dependencies and sentence structure between concepts,
the semantic relations between concepts usually being expressed and connected by verbs; after a candidate
relation set has been selected by syntactic analysis, association rule mining confirms a suitable relation
set.
10. The method for constructing a knowledge graph for intelligence analysis according to claim 5,
characterized in that step 5 includes taxonomic relation extraction and non-taxonomic relation extraction,
the non-taxonomic relation extraction method being as follows:
a hybrid method combining rules with machine learning extracts non-taxonomic relations from domain text, and
its extraction process is:
1) rules are learned manually from domain text to form a rule base;
2) the learned rules are matched against other corpus texts to form a corpus sentence-pattern library;
3) using the corpus sentence-pattern library, a training corpus is annotated manually and then trained with
the CRF machine learning algorithm to generate a trained model;
4) the trained model is tested on a test corpus and verified manually, and according to the test results the
training corpus is supplemented and adjusted and the model is retrained, until the accuracy and recall of the
trained model reach a certain level;
5) the adjusted trained model is combined with the rule matching results to extract non-taxonomic entity
relations from the actual corpus.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611124399.XA CN106815293A (en) | 2016-12-08 | 2016-12-08 | System and method for constructing knowledge graph for information analysis |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611124399.XA CN106815293A (en) | 2016-12-08 | 2016-12-08 | System and method for constructing knowledge graph for information analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106815293A (en) | 2017-06-09 |
Family
ID=59106968
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611124399.XA Pending CN106815293A (en) | 2016-12-08 | 2016-12-08 | System and method for constructing knowledge graph for information analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106815293A (en) |
Cited By (119)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107291700A (en) * | 2017-07-17 | 2017-10-24 | 广州特道信息科技有限公司 | Entity word recognition method and device |
CN107341264A (en) * | 2017-07-19 | 2017-11-10 | 东北大学 | A kind of electronic health record system and method for supporting custom entities |
CN107391684A (en) * | 2017-07-24 | 2017-11-24 | 深信服科技股份有限公司 | A kind of method and system for threatening information generation |
CN107392433A (en) * | 2017-06-27 | 2017-11-24 | 北京神州泰岳软件股份有限公司 | A kind of method and apparatus for extracting enterprise's incidence relation information |
CN107480128A (en) * | 2017-07-17 | 2017-12-15 | 广州特道信息科技有限公司 | The segmenting method and device of Chinese text |
CN107577670A (en) * | 2017-09-15 | 2018-01-12 | 清华大学 | A kind of terminology extraction method based on definition with relation |
CN107609478A (en) * | 2017-08-09 | 2018-01-19 | 广州思涵信息科技有限公司 | A kind of real-time analysis of the students system and method for matching classroom knowledge content |
CN107704634A (en) * | 2017-11-04 | 2018-02-16 | 辽宁工程技术大学 | A kind of method for forming knowledge and building knowledge chain |
CN107766483A (en) * | 2017-10-13 | 2018-03-06 | 华中科技大学 | The interactive answering method and system of a kind of knowledge based collection of illustrative plates |
CN107807917A (en) * | 2017-09-27 | 2018-03-16 | 风变科技(深圳)有限公司 | Method for extracting content of text, device, system and storage medium |
CN107870966A (en) * | 2017-08-11 | 2018-04-03 | 成都萌想科技有限责任公司 | A kind of recruitment general regulations data pick-up method based on semantic model |
CN107908671A (en) * | 2017-10-25 | 2018-04-13 | 南京擎盾信息科技有限公司 | Knowledge mapping construction method and system based on law data |
CN107908621A (en) * | 2017-11-16 | 2018-04-13 | 东华大学 | Tumor of breast risk assessment system based on ultrasonic examination report text data |
CN107967290A (en) * | 2017-10-09 | 2018-04-27 | 国家计算机网络与信息安全管理中心 | A kind of knowledge mapping network establishing method and system, medium based on magnanimity scientific research data |
CN108228701A (en) * | 2017-10-23 | 2018-06-29 | 武汉大学 | A kind of system for realizing Chinese near-nature forest language inquiry interface |
CN108256063A (en) * | 2018-01-15 | 2018-07-06 | 中国人民解放军国防科技大学 | Knowledge base construction method for network security |
CN108255815A (en) * | 2018-02-07 | 2018-07-06 | 苏州金螳螂文化发展股份有限公司 | The segmenting method and device of text |
CN108304541A (en) * | 2018-01-31 | 2018-07-20 | 刘世洪 | The structure system and method for user preferences modeling UIM based on technique transfers platform |
CN108647192A (en) * | 2018-03-27 | 2018-10-12 | 常熟鑫沐奇宝软件开发有限公司 | A method of generating virtual reality work script with natural language processing technique |
CN108763195A (en) * | 2018-05-02 | 2018-11-06 | 武汉烽火普天信息技术有限公司 | A kind of non-limiting type relation excavation method based on interdependent syntax and pattern rules |
CN108829696A (en) * | 2018-04-18 | 2018-11-16 | 西安理工大学 | Towards knowledge mapping node method for auto constructing in metro design code |
CN108874878A (en) * | 2018-05-03 | 2018-11-23 | 众安信息技术服务有限公司 | A kind of building system and method for knowledge mapping |
CN109064339A (en) * | 2018-09-12 | 2018-12-21 | 张连祥 | A kind of method and system of intelligence inquiry merchant vector environmental information |
CN109086391A (en) * | 2018-07-27 | 2018-12-25 | 北京光年无限科技有限公司 | A kind of method and system constructing knowledge mapping |
CN109117477A (en) * | 2018-07-17 | 2019-01-01 | 广州大学 | Non-categorical Relation extraction method, apparatus, equipment and medium towards Chinese field |
CN109165337A (en) * | 2018-10-17 | 2019-01-08 | 珠海市智图数研信息技术有限公司 | A kind of method and system of knowledge based map construction bidding field association analysis |
CN109189942A (en) * | 2018-09-12 | 2019-01-11 | 山东大学 | A kind of construction method and device of patent data knowledge mapping |
CN109189943A (en) * | 2018-09-19 | 2019-01-11 | 中国电子科技集团公司信息科学研究院 | A kind of capability knowledge extracts and the method for capability knowledge map construction |
CN109241046A (en) * | 2018-08-30 | 2019-01-18 | 天津做票君机器人科技有限公司 | A kind of inventory information recognition methods of negotiation by draft robot and identifier |
CN109241532A (en) * | 2018-08-30 | 2019-01-18 | 天津做票君机器人科技有限公司 | A kind of the vote buying information identifying method and identifier of negotiation by draft robot |
CN109241295A (en) * | 2018-08-31 | 2019-01-18 | 北京天广汇通科技有限公司 | A kind of extracting method of special entity relationship in unstructured data |
CN109241078A (en) * | 2018-08-30 | 2019-01-18 | 中国地质大学(武汉) | A kind of knowledge mapping hoc queries method based on hybrid database |
CN109241199A (en) * | 2018-08-08 | 2019-01-18 | 广州初星科技有限公司 | A method of it is found towards financial knowledge mapping |
CN109359299A (en) * | 2018-09-28 | 2019-02-19 | 中国电子科技集团公司信息科学研究院 | A kind of internet of things equipment ability ontology based on commodity data is from construction method |
CN109376353A (en) * | 2018-09-04 | 2019-02-22 | 国家电网公司华东分部 | A kind of power grid start-up operation ticket generating means and method based on natural language processing |
CN109446337A (en) * | 2018-09-19 | 2019-03-08 | 中国信息通信研究院 | A kind of knowledge mapping construction method and device |
CN109492916A (en) * | 2018-11-16 | 2019-03-19 | 东南大学 | A kind of nuclear power regulation model building method based on ontology |
CN109522418A (en) * | 2018-11-08 | 2019-03-26 | 杭州费尔斯通科技有限公司 | A kind of automanual knowledge mapping construction method |
CN109582933A (en) * | 2018-11-13 | 2019-04-05 | 北京合享智慧科技有限公司 | A kind of method and relevant apparatus of determining text novelty degree |
CN109597885A (en) * | 2018-12-11 | 2019-04-09 | 福建亿榕信息技术有限公司 | A kind of Knowledge Map construction method and storage medium |
CN109597894A (en) * | 2018-09-30 | 2019-04-09 | 阿里巴巴集团控股有限公司 | A kind of correlation model generation method and device, a kind of data correlation method and device |
CN109635117A (en) * | 2018-12-26 | 2019-04-16 | 零犀(北京)科技有限公司 | A kind of knowledge based spectrum recognition user intention method and device |
CN109657072A (en) * | 2018-12-13 | 2019-04-19 | 北京百分点信息科技有限公司 | A kind of intelligent search WEB system and method applied to government's aid decision |
CN109670051A (en) * | 2018-12-14 | 2019-04-23 | 北京百度网讯科技有限公司 | Knowledge mapping method for digging, device, equipment and storage medium |
CN109729171A (en) * | 2019-01-10 | 2019-05-07 | 七彩安科智慧科技有限公司 | A kind of construction method of small town cognition matrix Internet of Things |
CN109753664A (en) * | 2019-01-21 | 2019-05-14 | 广州大学 | A kind of concept extraction method, terminal device and the storage medium of domain-oriented |
CN109783484A (en) * | 2018-12-29 | 2019-05-21 | 北京航天云路有限公司 | The construction method and system of the data service platform of knowledge based map |
CN109857917A (en) * | 2018-12-21 | 2019-06-07 | 中国科学院信息工程研究所 | Towards the security knowledge map construction method and system for threatening information |
CN109947903A (en) * | 2019-03-15 | 2019-06-28 | 北京金山数字娱乐科技有限公司 | A kind of Chinese idiom querying method and device |
CN109977233A (en) * | 2019-03-15 | 2019-07-05 | 北京金山数字娱乐科技有限公司 | A kind of idiom knowledge map construction method and device |
CN110096599A (en) * | 2019-04-30 | 2019-08-06 | 长沙知了信息科技有限公司 | The generation method and device of knowledge mapping |
CN110119469A (en) * | 2019-05-22 | 2019-08-13 | 北京计算机技术及应用研究所 | A kind of data collection and transmission and method towards darknet |
CN110162792A (en) * | 2019-05-24 | 2019-08-23 | 国家电网有限公司 | Electric network data management method and device |
CN110175334A (en) * | 2019-06-05 | 2019-08-27 | 苏州派维斯信息科技有限公司 | Text knowledge's extraction system and method based on customized knowledge slot structure |
CN110210025A (en) * | 2019-05-29 | 2019-09-06 | 广州伟宏智能科技有限公司 | A kind of conversion method based on Text Feature Extraction |
CN110209828A (en) * | 2018-02-12 | 2019-09-06 | 北大方正集团有限公司 | Case querying method and case inquiry unit, computer equipment and storage medium |
CN110210038A (en) * | 2019-06-13 | 2019-09-06 | 北京百度网讯科技有限公司 | Kernel entity determines method and its system, server and computer-readable medium |
CN110209839A (en) * | 2019-06-18 | 2019-09-06 | 卓尔智联(武汉)研究院有限公司 | Agricultural knowledge map construction device, method and computer readable storage medium |
CN110222127A (en) * | 2019-06-06 | 2019-09-10 | 中国电子科技集团公司第二十八研究所 | The converging information method, apparatus and equipment of knowledge based map |
CN110287481A (en) * | 2019-05-29 | 2019-09-27 | 西南电子技术研究所(中国电子科技集团公司第十研究所) | Name entity corpus labeling training system |
CN110321432A (en) * | 2019-06-24 | 2019-10-11 | 拓尔思信息技术股份有限公司 | Textual event information extracting method, electronic device and non-volatile memory medium |
CN110347844A (en) * | 2019-07-15 | 2019-10-18 | 中国人民解放军战略支援部队航天工程大学 | A kind of space object knowledge map construction system |
CN110399605A (en) * | 2018-04-17 | 2019-11-01 | 富士施乐株式会社 | Information processing unit and the computer-readable medium for storing program |
CN110458397A (en) * | 2019-07-05 | 2019-11-15 | 苏州热工研究院有限公司 | A kind of nuclear material military service performance information extracting method |
CN110458471A (en) * | 2019-08-19 | 2019-11-15 | 绍兴数纺科技有限公司 | Standardize dyestuff information management system |
CN110516077A (en) * | 2019-08-20 | 2019-11-29 | 北京中亦安图科技股份有限公司 | Knowledge mapping construction method and device towards enterprise's market conditions |
CN110633469A (en) * | 2019-09-10 | 2019-12-31 | 陈绪平 | Method for accurately understanding Chinese sentence meaning |
CN110674308A (en) * | 2019-08-23 | 2020-01-10 | 上海科技发展有限公司 | Scientific and technological word list expansion method, device, terminal and medium based on grammar mode |
CN110717049A (en) * | 2019-08-29 | 2020-01-21 | 四川大学 | Text data-oriented threat information knowledge graph construction method |
CN110795932A (en) * | 2019-09-30 | 2020-02-14 | 中国地质大学(武汉) | Geological report text information extraction method based on geological ontology |
CN110825839A (en) * | 2019-11-07 | 2020-02-21 | 成都国腾实业集团有限公司 | Incidence relation analysis method for targets in text information |
CN110888991A (en) * | 2019-11-28 | 2020-03-17 | 哈尔滨工程大学 | Sectional semantic annotation method in weak annotation environment |
CN110990525A (en) * | 2019-11-15 | 2020-04-10 | 华融融通(北京)科技有限公司 | Natural language processing-based public opinion information extraction and knowledge base generation method |
CN111104478A (en) * | 2019-09-05 | 2020-05-05 | 李轶 | Domain concept semantic drift exploration method |
CN111126065A (en) * | 2019-12-02 | 2020-05-08 | 南京医渡云医学技术有限公司 | Information extraction method and device for natural language text |
CN111177399A (en) * | 2019-12-04 | 2020-05-19 | 华瑞新智科技(北京)有限公司 | Knowledge graph construction method and device |
CN111178075A (en) * | 2019-12-19 | 2020-05-19 | 厦门快商通科技股份有限公司 | Online customer service log analysis method, device and equipment |
CN111277560A (en) * | 2019-12-24 | 2020-06-12 | 普世(南京)智能科技有限公司 | Safe information acquisition, import and compilation method and system based on high-bandwidth physical isolation unidirectional transmission |
CN111339253A (en) * | 2020-02-25 | 2020-06-26 | 中国建设银行股份有限公司 | Method and device for extracting article information |
CN111476034A (en) * | 2020-04-07 | 2020-07-31 | 同方赛威讯信息技术有限公司 | Legal document information extraction method and system based on combination of rules and models |
CN111488497A (en) * | 2019-01-25 | 2020-08-04 | 北京沃东天骏信息技术有限公司 | Similarity determination method and device for character string set, terminal and readable medium |
CN111488741A (en) * | 2020-04-14 | 2020-08-04 | 税友软件集团股份有限公司 | Tax knowledge data semantic annotation method and related device |
CN111597349A (en) * | 2020-04-30 | 2020-08-28 | 西安理工大学 | Rail transit standard entity relation automatic completion method based on artificial intelligence |
CN111651995A (en) * | 2020-06-07 | 2020-09-11 | 上海建科工程咨询有限公司 | Accident information automatic extraction method and system based on deep circulation neural network |
CN111666425A (en) * | 2020-06-10 | 2020-09-15 | 深圳开思时代科技有限公司 | Automobile accessory searching method based on semantic knowledge |
CN111738445A (en) * | 2020-05-26 | 2020-10-02 | 山东大学 | Design knowledge fusion reasoning method supporting product rapid innovation |
CN111753022A (en) * | 2020-06-17 | 2020-10-09 | 第四范式(北京)技术有限公司 | Method, device and equipment for constructing knowledge graph and readable storage medium |
CN111753021A (en) * | 2020-06-17 | 2020-10-09 | 第四范式(北京)技术有限公司 | Method, device and equipment for constructing knowledge graph and readable storage medium |
CN111858575A (en) * | 2020-08-05 | 2020-10-30 | 杭州锘崴信息科技有限公司 | Private data analysis method and system |
CN111859968A (en) * | 2020-06-15 | 2020-10-30 | 深圳航天科创实业有限公司 | Text structuring method, text structuring device and terminal equipment |
CN111897914A (en) * | 2020-07-20 | 2020-11-06 | 杭州叙简科技股份有限公司 | Entity information extraction and knowledge graph construction method for field of comprehensive pipe gallery |
CN111914569A (en) * | 2020-08-10 | 2020-11-10 | 哈尔滨安天科技集团股份有限公司 | Prediction method and device based on fusion map, electronic equipment and storage medium |
CN111984640A (en) * | 2020-08-04 | 2020-11-24 | 中国科学技术大学智慧城市研究院(芜湖) | Portrait construction method based on multi-element heterogeneous data |
CN112084329A (en) * | 2020-07-31 | 2020-12-15 | 西安理工大学 | Semantic analysis method for entity recognition and relation extraction tasks |
CN112241734A (en) * | 2020-10-15 | 2021-01-19 | 首域科技(杭州)有限公司 | Method and system for diagnosing equipment fault through knowledge graph and Bayesian network |
CN112287686A (en) * | 2020-08-13 | 2021-01-29 | 新智道枢(上海)科技有限公司 | Warning safety protection method based on semantic analysis |
CN112307767A (en) * | 2020-11-09 | 2021-02-02 | 国网福建省电力有限公司 | Bi-LSTM technology-based regulation and control knowledge modeling method |
CN112328811A (en) * | 2020-11-12 | 2021-02-05 | 国衡智慧城市科技研究院(北京)有限公司 | Word spectrum clustering intelligent generation method based on same type of phrases |
CN112364649A (en) * | 2020-09-08 | 2021-02-12 | 平安医疗健康管理股份有限公司 | Named entity identification method and device, computer equipment and storage medium |
CN112417083A (en) * | 2020-11-12 | 2021-02-26 | 福建亿榕信息技术有限公司 | Method for constructing and deploying text entity relationship extraction model and storage device |
CN112463960A (en) * | 2020-10-30 | 2021-03-09 | 完美世界控股集团有限公司 | Entity relationship determination method and device, computing equipment and storage medium |
CN112800243A (en) * | 2021-02-04 | 2021-05-14 | 天津德尔塔科技有限公司 | Project budget analysis method and system based on knowledge graph |
CN112860913A (en) * | 2021-02-24 | 2021-05-28 | 广州汇通国信科技有限公司 | Ontology creation method of knowledge graph |
CN112861515A (en) * | 2021-02-08 | 2021-05-28 | 上海天壤智能科技有限公司 | Interactive knowledge definition and processing method, system, device and readable medium |
CN113127503A (en) * | 2021-03-18 | 2021-07-16 | 中国科学院国家空间科学中心 | Automatic information extraction method and system for aerospace information |
CN113220672A (en) * | 2021-04-26 | 2021-08-06 | 中国人民解放军军事科学院国防科技创新研究院 | Military and civil fusion policy information database system |
CN113239201A (en) * | 2021-05-20 | 2021-08-10 | 国网上海市电力公司 | Scientific and technological literature classification method based on knowledge graph |
CN113297252A (en) * | 2021-05-28 | 2021-08-24 | 北京信息科技大学 | Data query service method with mode being unaware |
CN113326700A (en) * | 2021-02-26 | 2021-08-31 | 西安理工大学 | ALBert-based complex heavy equipment entity extraction method |
CN113505191A (en) * | 2021-03-26 | 2021-10-15 | 中国航空无线电电子研究所 | Ontology-based avionics system architecture model construction method |
CN113609838A (en) * | 2021-07-14 | 2021-11-05 | 华东计算技术研究所(中国电子科技集团公司第三十二研究所) | Document information extraction and mapping method and system |
CN113761226A (en) * | 2021-11-10 | 2021-12-07 | 中国电子科技集团公司第二十八研究所 | Ontology construction method of multi-modal airport data |
CN114186561A (en) * | 2021-10-20 | 2022-03-15 | 福建亿榕信息技术有限公司 | Electronic file association analysis method and system based on knowledge graph |
CN114490626A (en) * | 2022-04-18 | 2022-05-13 | 成都数融科技有限公司 | Financial information analysis method and system based on parallel computing |
CN115169362A (en) * | 2022-09-08 | 2022-10-11 | 北京瀚语科技有限公司 | HowNet natural language processing method, system and application |
CN115358201A (en) * | 2022-08-03 | 2022-11-18 | 浙商期货有限公司 | Processing method and system for delivery and research report in futures field |
CN115796160A (en) * | 2022-12-09 | 2023-03-14 | 南阳理工学院 | Method and device for cleaning redundant thesis data based on lexical affixes and storage medium |
CN116483940A (en) * | 2023-04-26 | 2023-07-25 | 深圳市国房云数据技术服务有限公司 | Method for extracting and structuring data of whole-flow type document |
CN113609838B (en) * | 2021-07-14 | 2024-05-24 | 华东计算技术研究所(中国电子科技集团公司第三十二研究所) | Document information extraction and mapping method and system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103699663A (en) * | 2013-12-27 | 2014-04-02 | 中国科学院自动化研究所 | Hot event mining method based on large-scale knowledge base |
CN105183869A (en) * | 2015-09-16 | 2015-12-23 | 分众(中国)信息技术有限公司 | Building knowledge mapping database and construction method thereof |
CN105653522A (en) * | 2016-01-21 | 2016-06-08 | 中国农业大学 | Non-classified relation recognition method for plant field |
- 2016-12-08 CN CN201611124399.XA patent/CN106815293A/en active Pending
Cited By (164)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107392433B (en) * | 2017-06-27 | 2018-09-04 | 北京神州泰岳软件股份有限公司 | A kind of method and apparatus of extraction enterprise incidence relation information |
CN107392433A (en) * | 2017-06-27 | 2017-11-24 | 北京神州泰岳软件股份有限公司 | A kind of method and apparatus for extracting enterprise's incidence relation information |
CN107480128A (en) * | 2017-07-17 | 2017-12-15 | 广州特道信息科技有限公司 | The segmenting method and device of Chinese text |
CN107291700A (en) * | 2017-07-17 | 2017-10-24 | 广州特道信息科技有限公司 | Entity word recognition method and device |
CN107341264A (en) * | 2017-07-19 | 2017-11-10 | 东北大学 | A kind of electronic health record system and method for supporting custom entities |
CN107341264B (en) * | 2017-07-19 | 2020-09-25 | 东北大学 | Electronic medical record retrieval system and method supporting user-defined entity |
CN107391684A (en) * | 2017-07-24 | 2017-11-24 | 深信服科技股份有限公司 | A kind of method and system for threatening information generation |
CN107391684B (en) * | 2017-07-24 | 2020-12-11 | 深信服科技股份有限公司 | Method and system for generating threat information |
CN107609478A (en) * | 2017-08-09 | 2018-01-19 | 广州思涵信息科技有限公司 | A kind of real-time analysis of the students system and method for matching classroom knowledge content |
CN107870966A (en) * | 2017-08-11 | 2018-04-03 | 成都萌想科技有限责任公司 | A kind of recruitment general regulations data pick-up method based on semantic model |
CN107577670A (en) * | 2017-09-15 | 2018-01-12 | 清华大学 | A kind of terminology extraction method based on definition with relation |
CN107577670B (en) * | 2017-09-15 | 2020-09-22 | 清华大学 | Term extraction method based on definition and relation |
CN107807917A (en) * | 2017-09-27 | 2018-03-16 | 风变科技(深圳)有限公司 | Method for extracting content of text, device, system and storage medium |
CN107967290A (en) * | 2017-10-09 | 2018-04-27 | 国家计算机网络与信息安全管理中心 | A kind of knowledge mapping network establishing method and system, medium based on magnanimity scientific research data |
CN107766483A (en) * | 2017-10-13 | 2018-03-06 | 华中科技大学 | The interactive answering method and system of a kind of knowledge based collection of illustrative plates |
CN108228701A (en) * | 2017-10-23 | 2018-06-29 | 武汉大学 | A kind of system for realizing Chinese near-nature forest language inquiry interface |
CN107908671A (en) * | 2017-10-25 | 2018-04-13 | 南京擎盾信息科技有限公司 | Knowledge mapping construction method and system based on law data |
CN107908671B (en) * | 2017-10-25 | 2022-02-01 | 南京擎盾信息科技有限公司 | Knowledge graph construction method and system based on legal data |
CN107704634A (en) * | 2017-11-04 | 2018-02-16 | 辽宁工程技术大学 | A kind of method for forming knowledge and building knowledge chain |
CN107908621A (en) * | 2017-11-16 | 2018-04-13 | 东华大学 | Tumor of breast risk assessment system based on ultrasonic examination report text data |
CN108256063A (en) * | 2018-01-15 | 2018-07-06 | 中国人民解放军国防科技大学 | Knowledge base construction method for network security |
CN108304541A (en) * | 2018-01-31 | 2018-07-20 | 刘世洪 | The structure system and method for user preferences modeling UIM based on technique transfers platform |
CN108255815A (en) * | 2018-02-07 | 2018-07-06 | 苏州金螳螂文化发展股份有限公司 | The segmenting method and device of text |
CN110209828A (en) * | 2018-02-12 | 2019-09-06 | 北大方正集团有限公司 | Case querying method and case inquiry unit, computer equipment and storage medium |
CN110209828B (en) * | 2018-02-12 | 2021-08-27 | 北大方正集团有限公司 | Case query method, case query device, computer device and storage medium |
CN108647192A (en) * | 2018-03-27 | 2018-10-12 | 常熟鑫沐奇宝软件开发有限公司 | A method of generating virtual reality work script with natural language processing technique |
CN108647192B (en) * | 2018-03-27 | 2022-04-12 | 常熟鑫沐奇宝软件开发有限公司 | Method for generating virtual reality working script by natural language processing technology |
CN110399605A (en) * | 2018-04-17 | 2019-11-01 | 富士施乐株式会社 | Information processing unit and the computer-readable medium for storing program |
CN108829696A (en) * | 2018-04-18 | 2018-11-16 | 西安理工大学 | Towards knowledge mapping node method for auto constructing in metro design code |
CN108763195A (en) * | 2018-05-02 | 2018-11-06 | 武汉烽火普天信息技术有限公司 | A kind of non-limiting type relation excavation method based on interdependent syntax and pattern rules |
CN108763195B (en) * | 2018-05-02 | 2022-01-18 | 武汉烽火普天信息技术有限公司 | Dependency syntax and mode rule-based non-restricted relationship mining method |
CN108874878A (en) * | 2018-05-03 | 2018-11-23 | 众安信息技术服务有限公司 | A kind of building system and method for knowledge mapping |
CN109117477B (en) * | 2018-07-17 | 2022-01-28 | 广州大学 | Chinese field-oriented non-classification relation extraction method, device, equipment and medium |
CN109117477A (en) * | 2018-07-17 | 2019-01-01 | 广州大学 | Non-categorical Relation extraction method, apparatus, equipment and medium towards Chinese field |
CN109086391A (en) * | 2018-07-27 | 2018-12-25 | 北京光年无限科技有限公司 | A kind of method and system constructing knowledge mapping |
CN109086391B (en) * | 2018-07-27 | 2022-07-01 | 北京光年无限科技有限公司 | Method and system for constructing knowledge graph |
CN109241199A (en) * | 2018-08-08 | 2019-01-18 | 广州初星科技有限公司 | A method of it is found towards financial knowledge mapping |
CN109241199B (en) * | 2018-08-08 | 2022-09-23 | 上海旭荣网络科技有限公司 | Financial knowledge graph discovery method |
CN109241532A (en) * | 2018-08-30 | 2019-01-18 | 天津做票君机器人科技有限公司 | A kind of the vote buying information identifying method and identifier of negotiation by draft robot |
CN109241078A (en) * | 2018-08-30 | 2019-01-18 | 中国地质大学(武汉) | A kind of knowledge mapping hoc queries method based on hybrid database |
CN109241046A (en) * | 2018-08-30 | 2019-01-18 | 天津做票君机器人科技有限公司 | A kind of inventory information recognition methods of negotiation by draft robot and identifier |
CN109241078B (en) * | 2018-08-30 | 2021-07-20 | 中国地质大学(武汉) | Knowledge graph organization query method based on mixed database |
CN109241295B (en) * | 2018-08-31 | 2021-12-24 | 北京天广汇通科技有限公司 | Method for extracting specific entity relation in unstructured data |
CN109241295A (en) * | 2018-08-31 | 2019-01-18 | 北京天广汇通科技有限公司 | A kind of extracting method of special entity relationship in unstructured data |
CN109376353B (en) * | 2018-09-04 | 2022-09-16 | 国家电网公司华东分部 | Natural language processing-based power grid starting operation ticket generation device and method |
CN109376353A (en) * | 2018-09-04 | 2019-02-22 | 国家电网公司华东分部 | A kind of power grid start-up operation ticket generating means and method based on natural language processing |
CN109064339A (en) * | 2018-09-12 | 2018-12-21 | 张连祥 | A kind of method and system of intelligence inquiry merchant vector environmental information |
CN109189942A (en) * | 2018-09-12 | 2019-01-11 | 山东大学 | A kind of construction method and device of patent data knowledge mapping |
CN109189943A (en) * | 2018-09-19 | 2019-01-11 | 中国电子科技集团公司信息科学研究院 | A kind of capability knowledge extracts and the method for capability knowledge map construction |
CN109446337A (en) * | 2018-09-19 | 2019-03-08 | 中国信息通信研究院 | A kind of knowledge mapping construction method and device |
CN109446337B (en) * | 2018-09-19 | 2020-10-13 | 中国信息通信研究院 | Knowledge graph construction method and device |
CN109359299A (en) * | 2018-09-28 | 2019-02-19 | 中国电子科技集团公司信息科学研究院 | A kind of internet of things equipment ability ontology based on commodity data is from construction method |
CN109597894B (en) * | 2018-09-30 | 2023-10-03 | 创新先进技术有限公司 | Correlation model generation method and device, and data correlation method and device |
CN109597894A (en) * | 2018-09-30 | 2019-04-09 | 阿里巴巴集团控股有限公司 | A kind of correlation model generation method and device, a kind of data correlation method and device |
CN109165337B (en) * | 2018-10-17 | 2021-10-15 | 珠海市智图数研信息技术有限公司 | Method and system for establishing bid and ask field association analysis based on knowledge graph |
CN109165337A (en) * | 2018-10-17 | 2019-01-08 | 珠海市智图数研信息技术有限公司 | A kind of method and system of knowledge based map construction bidding field association analysis |
CN109522418A (en) * | 2018-11-08 | 2019-03-26 | 杭州费尔斯通科技有限公司 | A kind of automanual knowledge mapping construction method |
CN109522418B (en) * | 2018-11-08 | 2020-05-12 | 杭州费尔斯通科技有限公司 | Semi-automatic knowledge graph construction method |
CN109582933A (en) * | 2018-11-13 | 2019-04-05 | 北京合享智慧科技有限公司 | A kind of method and relevant apparatus of determining text novelty degree |
CN109492916A (en) * | 2018-11-16 | 2019-03-19 | 东南大学 | A kind of nuclear power regulation model building method based on ontology |
CN109597885A (en) * | 2018-12-11 | 2019-04-09 | 福建亿榕信息技术有限公司 | A kind of Knowledge Map construction method and storage medium |
CN109657072A (en) * | 2018-12-13 | 2019-04-19 | 北京百分点信息科技有限公司 | A kind of intelligent search WEB system and method applied to government's aid decision |
CN109670051A (en) * | 2018-12-14 | 2019-04-23 | 北京百度网讯科技有限公司 | Knowledge mapping method for digging, device, equipment and storage medium |
CN109857917A (en) * | 2018-12-21 | 2019-06-07 | 中国科学院信息工程研究所 | Towards the security knowledge map construction method and system for threatening information |
CN109635117A (en) * | 2018-12-26 | 2019-04-16 | 零犀(北京)科技有限公司 | A kind of knowledge based spectrum recognition user intention method and device |
CN109783484A (en) * | 2018-12-29 | 2019-05-21 | 北京航天云路有限公司 | The construction method and system of the data service platform of knowledge based map |
CN109729171B (en) * | 2019-01-10 | 2021-07-30 | 七彩安科智慧科技有限公司 | Method for constructing town cognitive matrix Internet of things |
CN109729171A (en) * | 2019-01-10 | 2019-05-07 | 七彩安科智慧科技有限公司 | A kind of construction method of small town cognition matrix Internet of Things |
CN109753664A (en) * | 2019-01-21 | 2019-05-14 | 广州大学 | A kind of concept extraction method, terminal device and the storage medium of domain-oriented |
CN111488497B (en) * | 2019-01-25 | 2023-05-12 | 北京沃东天骏信息技术有限公司 | Similarity determination method and device for character string set, terminal and readable medium |
CN111488497A (en) * | 2019-01-25 | 2020-08-04 | 北京沃东天骏信息技术有限公司 | Similarity determination method and device for character string set, terminal and readable medium |
CN109977233A (en) * | 2019-03-15 | 2019-07-05 | 北京金山数字娱乐科技有限公司 | A kind of idiom knowledge map construction method and device |
CN109947903B (en) * | 2019-03-15 | 2023-02-07 | 北京金山数字娱乐科技有限公司 | Idiom query method and device |
CN109947903A (en) * | 2019-03-15 | 2019-06-28 | 北京金山数字娱乐科技有限公司 | A kind of Chinese idiom querying method and device |
CN110096599B (en) * | 2019-04-30 | 2023-03-21 | 长沙知了信息科技有限公司 | Knowledge graph generation method and device |
CN110096599A (en) * | 2019-04-30 | 2019-08-06 | 长沙知了信息科技有限公司 | The generation method and device of knowledge mapping |
CN110119469A (en) * | 2019-05-22 | 2019-08-13 | 北京计算机技术及应用研究所 | A kind of data collection and transmission and method towards darknet |
CN110162792A (en) * | 2019-05-24 | 2019-08-23 | 国家电网有限公司 | Electric network data management method and device |
CN110210025A (en) * | 2019-05-29 | 2019-09-06 | 广州伟宏智能科技有限公司 | A kind of conversion method based on Text Feature Extraction |
CN110287481B (en) * | 2019-05-29 | 2022-06-14 | 西南电子技术研究所(中国电子科技集团公司第十研究所) | Named entity corpus labeling training system |
CN110287481A (en) * | 2019-05-29 | 2019-09-27 | 西南电子技术研究所(中国电子科技集团公司第十研究所) | Name entity corpus labeling training system |
CN110175334B (en) * | 2019-06-05 | 2023-06-27 | 苏州派维斯信息科技有限公司 | Text knowledge extraction system and method based on custom knowledge slot structure |
CN110175334A (en) * | 2019-06-05 | 2019-08-27 | 苏州派维斯信息科技有限公司 | Text knowledge's extraction system and method based on customized knowledge slot structure |
CN110222127A (en) * | 2019-06-06 | 2019-09-10 | 中国电子科技集团公司第二十八研究所 | The converging information method, apparatus and equipment of knowledge based map |
CN110210038B (en) * | 2019-06-13 | 2023-01-10 | 北京百度网讯科技有限公司 | Core entity determining method, system, server and computer readable medium thereof |
CN110210038A (en) * | 2019-06-13 | 2019-09-06 | 北京百度网讯科技有限公司 | Kernel entity determines method and its system, server and computer-readable medium |
CN110209839A (en) * | 2019-06-18 | 2019-09-06 | 卓尔智联(武汉)研究院有限公司 | Agricultural knowledge map construction device, method and computer readable storage medium |
CN110209839B (en) * | 2019-06-18 | 2021-07-27 | 卓尔智联(武汉)研究院有限公司 | Agricultural knowledge graph construction device and method and computer readable storage medium |
CN110321432B (en) * | 2019-06-24 | 2021-11-23 | 拓尔思信息技术股份有限公司 | Text event information extraction method, electronic device and nonvolatile storage medium |
CN110321432A (en) * | 2019-06-24 | 2019-10-11 | 拓尔思信息技术股份有限公司 | Textual event information extracting method, electronic device and non-volatile memory medium |
CN110458397A (en) * | 2019-07-05 | 2019-11-15 | 苏州热工研究院有限公司 | A kind of nuclear material military service performance information extracting method |
CN110347844A (en) * | 2019-07-15 | 2019-10-18 | 中国人民解放军战略支援部队航天工程大学 | A kind of space object knowledge map construction system |
CN110458471B (en) * | 2019-08-19 | 2022-05-20 | 绍兴数纺科技有限公司 | Standardized dye information management system |
CN110458471A (en) * | 2019-08-19 | 2019-11-15 | 绍兴数纺科技有限公司 | Standardize dyestuff information management system |
CN110516077A (en) * | 2019-08-20 | 2019-11-29 | 北京中亦安图科技股份有限公司 | Knowledge mapping construction method and device towards enterprise's market conditions |
CN110674308A (en) * | 2019-08-23 | 2020-01-10 | 上海科技发展有限公司 | Scientific and technological word list expansion method, device, terminal and medium based on grammar mode |
CN110717049B (en) * | 2019-08-29 | 2020-12-04 | 四川大学 | Text data-oriented threat information knowledge graph construction method |
CN110717049A (en) * | 2019-08-29 | 2020-01-21 | 四川大学 | Text data-oriented threat information knowledge graph construction method |
CN111104478A (en) * | 2019-09-05 | 2020-05-05 | 李轶 | Domain concept semantic drift exploration method |
CN110633469A (en) * | 2019-09-10 | 2019-12-31 | 陈绪平 | Method for accurately understanding Chinese sentence meaning |
CN110795932A (en) * | 2019-09-30 | 2020-02-14 | 中国地质大学(武汉) | Geological report text information extraction method based on geological ontology |
CN110825839A (en) * | 2019-11-07 | 2020-02-21 | 成都国腾实业集团有限公司 | Incidence relation analysis method for targets in text information |
CN110990525A (en) * | 2019-11-15 | 2020-04-10 | 华融融通(北京)科技有限公司 | Natural language processing-based public opinion information extraction and knowledge base generation method |
CN110888991A (en) * | 2019-11-28 | 2020-03-17 | 哈尔滨工程大学 | Sectional semantic annotation method in weak annotation environment |
CN110888991B (en) * | 2019-11-28 | 2023-12-01 | 哈尔滨工程大学 | Sectional type semantic annotation method under weak annotation environment |
CN111126065A (en) * | 2019-12-02 | 2020-05-08 | 南京医渡云医学技术有限公司 | Information extraction method and device for natural language text |
CN111126065B (en) * | 2019-12-02 | 2024-03-15 | 医渡云(北京)技术有限公司 | Information extraction method and device for natural language text |
CN111177399B (en) * | 2019-12-04 | 2023-06-16 | 华瑞新智科技(北京)有限公司 | Knowledge graph construction method and device |
CN111177399A (en) * | 2019-12-04 | 2020-05-19 | 华瑞新智科技(北京)有限公司 | Knowledge graph construction method and device |
CN111178075A (en) * | 2019-12-19 | 2020-05-19 | 厦门快商通科技股份有限公司 | Online customer service log analysis method, device and equipment |
CN111277560A (en) * | 2019-12-24 | 2020-06-12 | 普世(南京)智能科技有限公司 | Safe information acquisition, import and compilation method and system based on high-bandwidth physical isolation unidirectional transmission |
CN111339253A (en) * | 2020-02-25 | 2020-06-26 | 中国建设银行股份有限公司 | Method and device for extracting article information |
CN111476034A (en) * | 2020-04-07 | 2020-07-31 | 同方赛威讯信息技术有限公司 | Legal document information extraction method and system based on combination of rules and models |
CN111488741A (en) * | 2020-04-14 | 2020-08-04 | 税友软件集团股份有限公司 | Tax knowledge data semantic annotation method and related device |
CN111597349A (en) * | 2020-04-30 | 2020-08-28 | 西安理工大学 | Rail transit standard entity relation automatic completion method based on artificial intelligence |
CN111597349B (en) * | 2020-04-30 | 2022-10-11 | 西安理工大学 | Rail transit standard entity relation automatic completion method based on artificial intelligence |
CN111738445A (en) * | 2020-05-26 | 2020-10-02 | 山东大学 | Design knowledge fusion reasoning method supporting product rapid innovation |
CN111651995A (en) * | 2020-06-07 | 2020-09-11 | 上海建科工程咨询有限公司 | Accident information automatic extraction method and system based on deep circulation neural network |
CN111666425B (en) * | 2020-06-10 | 2023-04-18 | 深圳开思时代科技有限公司 | Automobile accessory searching method based on semantic knowledge |
CN111666425A (en) * | 2020-06-10 | 2020-09-15 | 深圳开思时代科技有限公司 | Automobile accessory searching method based on semantic knowledge |
CN111859968A (en) * | 2020-06-15 | 2020-10-30 | 深圳航天科创实业有限公司 | Text structuring method, text structuring device and terminal equipment |
CN111753022A (en) * | 2020-06-17 | 2020-10-09 | 第四范式(北京)技术有限公司 | Method, device and equipment for constructing knowledge graph and readable storage medium |
CN111753021A (en) * | 2020-06-17 | 2020-10-09 | 第四范式(北京)技术有限公司 | Method, device and equipment for constructing knowledge graph and readable storage medium |
CN111897914A (en) * | 2020-07-20 | 2020-11-06 | 杭州叙简科技股份有限公司 | Entity information extraction and knowledge graph construction method for field of comprehensive pipe gallery |
CN111897914B (en) * | 2020-07-20 | 2023-09-19 | 杭州叙简科技股份有限公司 | Entity information extraction and knowledge graph construction method for comprehensive pipe rack field |
CN112084329B (en) * | 2020-07-31 | 2024-02-02 | 西安理工大学 | Semantic analysis method for entity identification and relation extraction tasks |
CN112084329A (en) * | 2020-07-31 | 2020-12-15 | 西安理工大学 | Semantic analysis method for entity recognition and relation extraction tasks |
CN111984640A (en) * | 2020-08-04 | 2020-11-24 | 中国科学技术大学智慧城市研究院(芜湖) | Portrait construction method based on multi-element heterogeneous data |
CN111858575A (en) * | 2020-08-05 | 2020-10-30 | 杭州锘崴信息科技有限公司 | Private data analysis method and system |
CN111858575B (en) * | 2020-08-05 | 2024-04-19 | 杭州锘崴信息科技有限公司 | Private data analysis method and system |
CN111914569A (en) * | 2020-08-10 | 2020-11-10 | 哈尔滨安天科技集团股份有限公司 | Prediction method and device based on fusion map, electronic equipment and storage medium |
CN111914569B (en) * | 2020-08-10 | 2023-07-21 | 安天科技集团股份有限公司 | Fusion map-based prediction method and device, electronic equipment and storage medium |
CN112287686A (en) * | 2020-08-13 | 2021-01-29 | 新智道枢(上海)科技有限公司 | Warning safety protection method based on semantic analysis |
CN112364649B (en) * | 2020-09-08 | 2022-07-19 | 深圳平安医疗健康科技服务有限公司 | Named entity identification method and device, computer equipment and storage medium |
CN112364649A (en) * | 2020-09-08 | 2021-02-12 | 平安医疗健康管理股份有限公司 | Named entity identification method and device, computer equipment and storage medium |
CN112241734A (en) * | 2020-10-15 | 2021-01-19 | 首域科技(杭州)有限公司 | Method and system for diagnosing equipment fault through knowledge graph and Bayesian network |
CN112463960A (en) * | 2020-10-30 | 2021-03-09 | 完美世界控股集团有限公司 | Entity relationship determination method and device, computing equipment and storage medium |
CN112307767A (en) * | 2020-11-09 | 2021-02-02 | 国网福建省电力有限公司 | Bi-LSTM technology-based regulation and control knowledge modeling method |
CN112328811A (en) * | 2020-11-12 | 2021-02-05 | 国衡智慧城市科技研究院(北京)有限公司 | Word spectrum clustering intelligent generation method based on same type of phrases |
CN112417083B (en) * | 2020-11-12 | 2022-05-17 | 福建亿榕信息技术有限公司 | Method for constructing and deploying text entity relationship extraction model and storage device |
CN112417083A (en) * | 2020-11-12 | 2021-02-26 | 福建亿榕信息技术有限公司 | Method for constructing and deploying text entity relationship extraction model and storage device |
CN112800243A (en) * | 2021-02-04 | 2021-05-14 | 天津德尔塔科技有限公司 | Project budget analysis method and system based on knowledge graph |
CN112861515A (en) * | 2021-02-08 | 2021-05-28 | 上海天壤智能科技有限公司 | Interactive knowledge definition and processing method, system, device and readable medium |
CN112861515B (en) * | 2021-02-08 | 2022-11-11 | 上海天壤智能科技有限公司 | Interactive knowledge definition and processing method, system, device and readable medium |
CN112860913A (en) * | 2021-02-24 | 2021-05-28 | 广州汇通国信科技有限公司 | Ontology creation method of knowledge graph |
CN112860913B (en) * | 2021-02-24 | 2024-03-08 | 广州汇通国信科技有限公司 | Ontology creation method of knowledge graph |
CN113326700B (en) * | 2021-02-26 | 2024-05-14 | 西安理工大学 | ALBert-based complex heavy equipment entity extraction method |
CN113326700A (en) * | 2021-02-26 | 2021-08-31 | 西安理工大学 | ALBert-based complex heavy equipment entity extraction method |
CN113127503A (en) * | 2021-03-18 | 2021-07-16 | 中国科学院国家空间科学中心 | Automatic information extraction method and system for aerospace information |
CN113505191A (en) * | 2021-03-26 | 2021-10-15 | 中国航空无线电电子研究所 | Ontology-based avionics system architecture model construction method |
CN113220672A (en) * | 2021-04-26 | 2021-08-06 | 中国人民解放军军事科学院国防科技创新研究院 | Military and civil fusion policy information database system |
CN113239201A (en) * | 2021-05-20 | 2021-08-10 | 国网上海市电力公司 | Scientific and technological literature classification method based on knowledge graph |
CN113297252A (en) * | 2021-05-28 | 2021-08-24 | 北京信息科技大学 | Data query service method with mode being unaware |
CN113609838B (en) * | 2021-07-14 | 2024-05-24 | 华东计算技术研究所(中国电子科技集团公司第三十二研究所) | Document information extraction and mapping method and system |
CN113609838A (en) * | 2021-07-14 | 2021-11-05 | 华东计算技术研究所(中国电子科技集团公司第三十二研究所) | Document information extraction and mapping method and system |
CN114186561A (en) * | 2021-10-20 | 2022-03-15 | 福建亿榕信息技术有限公司 | Electronic file association analysis method and system based on knowledge graph |
CN113761226A (en) * | 2021-11-10 | 2021-12-07 | 中国电子科技集团公司第二十八研究所 | Ontology construction method of multi-modal airport data |
CN114490626B (en) * | 2022-04-18 | 2022-08-16 | 成都数融科技有限公司 | Financial information analysis method and system based on parallel computing |
CN114490626A (en) * | 2022-04-18 | 2022-05-13 | 成都数融科技有限公司 | Financial information analysis method and system based on parallel computing |
CN115358201A (en) * | 2022-08-03 | 2022-11-18 | 浙商期货有限公司 | Processing method and system for delivery and research report in futures field |
CN115169362A (en) * | 2022-09-08 | 2022-10-11 | 北京瀚语科技有限公司 | HowNet natural language processing method, system and application |
CN115796160B (en) * | 2022-12-09 | 2024-04-09 | 南阳理工学院 | Thesis redundant data cleaning method and device based on lexical affix and storage medium |
CN115796160A (en) * | 2022-12-09 | 2023-03-14 | 南阳理工学院 | Method and device for cleaning redundant thesis data based on lexical affixes and storage medium |
CN116483940A (en) * | 2023-04-26 | 2023-07-25 | 深圳市国房云数据技术服务有限公司 | Method for extracting and structuring data of whole-flow type document |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106815293A (en) | System and method for constructing knowledge graph for information analysis | |
CN104391942B (en) | Short essay eigen extended method based on semantic collection of illustrative plates | |
CN111078889B (en) | Method for extracting relationship between medicines based on various attentions and improved pre-training | |
CN107526799A (en) | A kind of knowledge mapping construction method based on deep learning | |
CN108073569A (en) | A kind of law cognitive approach, device and medium based on multi-layer various dimensions semantic understanding | |
Zhou et al. | Recognizing software bug-specific named entity in software bug repository | |
CN109241199B (en) | Financial knowledge graph discovery method | |
CN109871449A (en) | A kind of zero sample learning method end to end based on semantic description | |
Cao et al. | Toward accurate link between code and software documentation | |
Almazroi et al. | COVID-19 Cases Prediction in Saudi Arabia Using Tree-based Ensemble Models. | |
CN116244446A (en) | Social media cognitive threat detection method and system | |
Fang et al. | CyberEyes: cybersecurity entity recognition model based on graph convolutional network | |
Tianxiong et al. | Identifying chinese event factuality with convolutional neural networks | |
CN117574898A (en) | Domain knowledge graph updating method and system based on power grid equipment | |
Guo et al. | Research on named entity recognition for information extraction | |
Qiu et al. | NeuroSPE: A neuro‐net spatial relation extractor for natural language text fusing gazetteers and pretrained models | |
CN106156316A (en) | Special name under a kind of big data environment and native place correlating method and system | |
CN116186422A (en) | Disease-related public opinion analysis system based on social media and artificial intelligence | |
Zhu et al. | A Semantic Similarity Computing Model based on Siamese Network for Duplicate Questions Identification. | |
Parolin et al. | Come-ke: A new transformers based approach for knowledge extraction in conflict and mediation domain | |
CN110377690A (en) | A kind of information acquisition method and system based on long-range Relation extraction | |
CN113886524A (en) | Network security threat event extraction method based on short text | |
Zhou et al. | Ontology-based information extraction from environmental regulations for supporting environmental compliance checking | |
Liao et al. | Detecting duplicate questions in stack overflow via semantic and relevance approaches | |
Bäck | Domain similarity metrics for predicting transfer learning performance | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170609 |