CN108595708A - An abnormal-information text classification method based on a knowledge graph - Google Patents

An abnormal-information text classification method based on a knowledge graph

Info

Publication number
CN108595708A
Authority
CN
China
Prior art keywords
vector
entity
text
knowledge
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810443976.4A
Other languages
Chinese (zh)
Inventor
张日崇
马宏远
王飞
杜翠兰
王玥
赵晓航
怀进鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
National Computer Network and Information Security Management Center
Original Assignee
Beihang University
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University and National Computer Network and Information Security Management Center
Priority to CN201810443976.4A
Publication of CN108595708A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention proposes an abnormal-information text classification method based on a knowledge graph. A domain knowledge graph is built first, and entity recognition and entity linking are constructed on top of it. A text feature representation vector v_text and an entity feature representation vector v_ent are then built. Finally, the two vectors are concatenated into a new text representation vector v_merge that incorporates knowledge features, and classification training on this new vector yields the final classification result.

Description

An abnormal-information text classification method based on a knowledge graph
Technical field
The present invention relates to a classification method, and in particular to an abnormal-information text classification method based on a knowledge graph.
Background
With the development of the Internet and the continuous growth of online information, people rely on networks more and more. As information sharing and service promotion on the network keep expanding, the security of web content has become a prominent problem. An abnormal-information identification method that is both highly accurate and highly extensible is therefore urgently needed to safeguard network security for society and for individuals.
In the prior art, abnormal-information detection methods fall into two main classes. The first uses keyword filtering or manual modeling of abnormal information: a keyword filter list is drawn up by hand and matched against the text. The second is text classification based on statistics and machine learning, such as support vector machines, the K-nearest-neighbor algorithm, and decision trees. Neither class achieves satisfactory results: their application scenarios are limited, and it is hard to balance accuracy against extensibility. Keyword filtering depends on a manually compiled keyword list, which is mechanical and scales poorly. New words constantly emerge online, a hand-made list cannot cover all abnormal information, and the method cannot understand or screen harmful information at the semantic level. Current content-based information filtering models rely on large numbers of hand-written rules, but harmful web content takes many forms, and hand-written rules cannot enumerate them all. Data mining techniques and machine-learning neural network models have also been applied to abnormal-information identification, but they ignore the domain prior knowledge involved in the text. Most methods work only from surface features, modeling text semantics through word frequencies or semantic vectors of the words in the text. They can exploit only shallow features such as co-occurrence, and struggle to capture the deeper semantic information in the text, such as commonality and inclusion relations among the things mentioned, and common-sense prior knowledge that the text leaves unstated.
At present, the knowledge graph has become an important tool for semantic linking in big-data analysis and for converting multi-source heterogeneous Internet data into descriptions of concrete things in the objective world. Building a knowledge graph provides an effective basis for unified description of data, effective integration, association discovery, and knowledge reasoning. Knowledge-graph visualization techniques describe knowledge resources and their carriers, and mine, analyze, construct, and draw knowledge and the connections within it. With the appearance and development of large-scale knowledge bases such as WordNet and DBpedia, a large amount of knowledge has become openly accessible, and knowledge features obtained from knowledge bases are increasingly used in natural language processing tasks. Following the success of neural-network language models in vectorizing text features through word embedding, comparable results have been achieved in representing knowledge features, for example the series of knowledge-base entity and relation embedding methods from TransE to TransR. However, the existing knowledge representation learning methods are mostly used for problems internal to the knowledge base, such as relation inference and link prediction, and mostly model knowledge in isolation. They have not been applied to abnormal-information text identification.
Summary of the invention
The present invention proposes an abnormal-information text classification method based on a knowledge graph. A domain knowledge graph is built first, and entity recognition and entity linking models are constructed on top of it. A text feature representation vector v_text and an entity feature representation vector v_ent are then built. Finally, the two vectors are concatenated into a new text representation vector v_merge that incorporates knowledge features, and classification training on this new vector yields the final classification result.
The invention detects abnormal information in short texts on the basis of both text and knowledge graph, through knowledge-graph-based entity recognition and linking and through short-text classification on joint text and knowledge-graph features. An external knowledge base is introduced to assist deep semantic mining and feature representation of the text. The rich extended information in the knowledge base, such as entity relations, categories, and attributes, supports the extraction of deep semantic relations in the text. Knowledge-graph-based entity disambiguation and linking resolves word ambiguity, and the mappings between full names, abbreviations, and aliases in the knowledge base are used to handle referring expressions in the text. Finally, the knowledge-base information of the linked entities is added to model training as auxiliary features, improving the reliability of abnormal-text classification.
Description of the drawings
Fig. 1 is the domain entity relationship system diagram of one embodiment of the invention;
Fig. 2 is the attribute fusion and disambiguation flow chart of one embodiment of the invention;
Fig. 3 is the entity recognition model architecture of the invention;
Fig. 4 is the flow chart of classification based on joint text and knowledge-graph features according to the invention.
Detailed description
To make the purpose, technical solution, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. The specific embodiments described here serve only to explain the invention and are not intended to limit it. In addition, the technical features involved in the embodiments described below can be combined with one another as long as they do not conflict.
As shown in Fig. 1, in an embodiment of abnormal-information text detection for the political and tax-related domains, a domain knowledge graph must be built and a domain entity library established. The graph is built by extracting political and economic data from news portals, microblogs, WeChat official accounts, and forums, supplemented by semi-structured data from the main Chinese online encyclopedias (Baidu Baike, Hudong Baike, and Chinese Wikipedia). Network data is multi-source: it comes from multiple channels such as news portals, microblogs, WeChat official accounts, and forums, and each platform has its own style of expression and content structure, so the multi-source data must be processed and fused. First, a rule-based crawler extracts structured data from web pages, and simple rules are designed to clean the raw data (filtering special characters, converting between simplified and traditional Chinese, and so on) and to normalize it (unifying the formats of times, dates, and the like). Next, the entries obtained from the Chinese encyclopedia sites that relate to the crawled political and economic news and microblog posts are taken as entities. Using the original entry tags in the encyclopedia data, a simple K-means clustering algorithm assigns each entity a category, forming the classification system. The descriptions of different aspects of an entry in the encyclopedia form the entity's attributes, and the hyperlinks in an entry's description are used to establish associations between entities, connecting isolated entities into a graph. At this point a preliminary knowledge graph has been obtained from the multi-source heterogeneous data, providing the basis for subsequent work.
Because data from different sources differ in form of expression and in quality, knowledge fusion is required. Knowledge fusion comprises two main tasks: entity alignment, and attribute fusion and disambiguation. Entity alignment uses features in three dimensions (entity name, entity category, and entity description) and a semantic similarity algorithm to find the list of entities that should be aligned. All attribute information to be fused for the same entity is collected into a set, the attribute fusion and disambiguation flow shown in Fig. 2 is applied, and the resulting complete entity data is stored in the database. This essentially completes the construction of the knowledge base. The storage medium is the Neo4j graph database, and the completed knowledge base is queried by calling the API access interface that Neo4j provides.
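The entity alignment step can be illustrated with a minimal sketch. The patent does not name its semantic similarity algorithm, so the character-bigram Jaccard measure, the weights, and the threshold below are all assumptions chosen for illustration, as are the function and field names.

```python
def char_bigrams(text):
    """Return the set of character bigrams of a string (works for Chinese or English)."""
    text = text.strip().lower()
    if len(text) == 1:
        return {text}
    return {text[i:i + 2] for i in range(len(text) - 1)}


def jaccard(a, b):
    """Jaccard similarity of two strings over character bigrams (stand-in measure)."""
    sa, sb = char_bigrams(a), char_bigrams(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def entity_similarity(e1, e2, weights=(0.5, 0.2, 0.3)):
    """Combine the three dimensions: name, category, description (hypothetical weights)."""
    w_name, w_cat, w_desc = weights
    same_category = 1.0 if e1["category"] == e2["category"] else 0.0
    return (w_name * jaccard(e1["name"], e2["name"])
            + w_cat * same_category
            + w_desc * jaccard(e1["description"], e2["description"]))


def align(entities, threshold=0.6):
    """Return index pairs of entities similar enough to be merged (hypothetical threshold)."""
    pairs = []
    for i in range(len(entities)):
        for j in range(i + 1, len(entities)):
            if entity_similarity(entities[i], entities[j]) >= threshold:
                pairs.append((i, j))
    return pairs


entities = [
    {"name": "Beihang University", "category": "organization",
     "description": "a university in Beijing"},
    {"name": "Beihang University", "category": "organization",
     "description": "university located in Beijing"},
    {"name": "income tax", "category": "concept",
     "description": "a tax on income"},
]
print(align(entities))
```

The pairwise loop is quadratic in the number of entities; a real knowledge base would presumably block candidates by name or category first.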
To extract the knowledge-graph information associated with a text, an entity recognition method must label the entity nouns or phrases the text involves and link each of them to its corresponding entity in the knowledge base. The main task of entity recognition is to identify named entities mentioned in natural language text, such as person names, place names, and organization names, and optionally to classify them coarsely. Almost all current approaches treat this as a sequence labeling problem similar to word segmentation. Each word in a sentence is tagged with the "BIO" scheme: "B" marks the beginning of an entity name, "I" marks the middle or end of an entity name, and "O" marks a word outside any entity name. A machine learning model, such as a conditional random field (CRF) or a recurrent neural network, is then trained on the labeled data set. The invention uses the combined BiLSTM+CRF model shown in Fig. 3. The text is first encoded by a long short-term memory (LSTM) network, with the word vector of each word in the text as the LSTM input. The output is the probability that each word carries a given tag. This output serves as the input of the CRF, whose transition probability matrix is randomly initialized, and an inference algorithm finds the tag sequence with the highest probability.
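The decoding half of a BiLSTM+CRF tagger can be sketched as follows. This is an illustration only: the emission scores stand in for BiLSTM outputs, the transition scores stand in for learned CRF parameters, and Viterbi decoding is assumed as the inference algorithm, which the patent does not name. All numbers are made up.

```python
TAGS = ["B", "I", "O"]


def viterbi(emissions, transitions):
    """Return the highest-scoring tag sequence.

    emissions[i][t]   : score of tag t at position i (stand-in for BiLSTM output)
    transitions[p][t] : score of moving from tag p to tag t (stand-in for CRF weights)
    """
    k = len(TAGS)
    score = [list(emissions[0])]
    back = []
    for i in range(1, len(emissions)):
        row, ptr = [], []
        for t in range(k):
            best_p = max(range(k), key=lambda p: score[-1][p] + transitions[p][t])
            row.append(score[-1][best_p] + transitions[best_p][t] + emissions[i][t])
            ptr.append(best_p)
        score.append(row)
        back.append(ptr)
    # Follow the back-pointers from the best final tag.
    path = [max(range(k), key=lambda t: score[-1][t])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return [TAGS[t] for t in path]


def bio_to_spans(tokens, tags, sep=""):
    """Collect entity mentions from a BIO-tagged token sequence."""
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                spans.append(sep.join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:  # "O", or a stray "I" with no preceding "B"
            if current:
                spans.append(sep.join(current))
            current = []
    if current:
        spans.append(sep.join(current))
    return spans


# Rows and columns follow TAGS order; the only constraint modeled is that "I" may not follow "O".
transitions = [[0, 0, 0], [0, 0, 0], [0, -10, 0]]
emissions = [[0.5, 0, 1], [0, 2, 0], [0, 0, 1]]
tags = viterbi(emissions, transitions)
print(tags, bio_to_spans(["张", "三", "说"], tags))
```

Taken position by position, the emissions would favor "O" for the first token, which makes the following "I" invalid. The transition penalty is what lets the decoder choose "B" instead.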
The main task of entity linking is to find the entity in the knowledge base that corresponds to a named-entity mention. This involves disambiguating entities with the same name, for example deciding whether "Zhang San" in "Zhang San is a great leader" should be linked to the entity for the leader Zhang San or to the entity for the biography "Zhang San". Given a standard (fully annotated) data set, the invention builds a probabilistic model by statistical learning to perform disambiguation, identifies the entity with the highest probability, and returns its entity ID. When no fully annotated data set is available, all candidate entities are first enumerated by searching the knowledge base, and are then ranked and filtered by rules using indicators such as entity popularity, the relevance of the entity category to the original text, or the similarity between the entity information and the original text.
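The rule-based fallback for entity linking can be sketched as a weighted ranking of candidates. The scoring formula, the weights, the field names, and the entity IDs below are hypothetical; the patent only lists the indicators (popularity, category relevance, text similarity) without saying how they are combined.

```python
def token_overlap(a, b):
    """Jaccard overlap between the lower-cased token sets of two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def rank_candidates(context, candidates, w_popularity=0.4, w_similarity=0.6):
    """Sort candidate entities from most to least plausible for the given context."""
    def score(candidate):
        return (w_popularity * candidate["popularity"]
                + w_similarity * token_overlap(context, candidate["description"]))
    return sorted(candidates, key=score, reverse=True)


candidates = [
    {"id": "biography_zhang_san", "popularity": 0.5,
     "description": "biography book about one person"},
    {"id": "leader_zhang_san", "popularity": 0.5,
     "description": "political leader of the country"},
]
ranked = rank_candidates("Zhang San is a great leader", candidates)
print(ranked[0]["id"])
```

Whitespace tokenization is used here only to keep the sketch self-contained. Chinese text would need a word segmenter before any overlap measure makes sense.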
Traditional text representation vectorizes text as one-hot vectors or sequences of TF-IDF values. The words obtained by segmenting all texts in the data set are counted, and a vocabulary is obtained after filtering out low-frequency words and stop words. If the vocabulary size is n, the text is represented by a vector v_d ∈ R^n whose i-th component indicates whether the i-th vocabulary word occurs in the text: 1 if it occurs and 0 if it does not. The TF-IDF value of the word can be used instead for better results. This representation, however, suffers from prominent problems such as excessive dimensionality, data sparsity, and a weak ability to encode semantic similarity.
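For comparison, the traditional representation can be sketched in a few lines. The function names and the toy corpus are illustrative, and the TF-IDF variant shown (relative term frequency times log(N/df)) is one common formulation, not necessarily the one the authors had in mind.

```python
import math
from collections import Counter


def build_vocab(docs, min_count=1, stop_words=()):
    """Vocabulary after dropping low-frequency words and stop words, in sorted order."""
    counts = Counter(word for doc in docs for word in doc)
    return sorted(w for w, c in counts.items()
                  if c >= min_count and w not in stop_words)


def binary_vector(doc, vocab):
    """v_d in R^n: component i is 1 if the i-th vocabulary word occurs in doc, else 0."""
    present = set(doc)
    return [1 if w in present else 0 for w in vocab]


def tfidf_vector(doc, vocab, docs):
    """Same layout as binary_vector, with TF-IDF weights instead of 0/1."""
    tf = Counter(doc)
    vec = []
    for w in vocab:
        df = sum(1 for d in docs if w in d)
        idf = math.log(len(docs) / df) if df else 0.0
        vec.append(tf[w] / len(doc) * idf if doc else 0.0)
    return vec


docs = [["tax", "fraud", "alert"], ["tax", "news"]]
vocab = build_vocab(docs, stop_words={"news"})
print(vocab, binary_vector(docs[1], vocab), tfidf_vector(docs[0], vocab, docs))
```

Even in this toy corpus the word "tax" gets weight zero because it occurs in every document, and each vector is as long as the whole vocabulary, which is where the sparsity and dimensionality problems come from.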
To solve these problems, the invention uses word embedding: each word is represented as a vector, so that similarity between words becomes a measure of cosine distance between vectors. Word vectors are trained with word2vec on news data from the most recent year and on Chinese Wikipedia data.
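The cosine measure itself is simple to state in code. The two-dimensional vectors below are toy values, not trained word2vec embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Parallel vectors score 1 and orthogonal vectors score 0, regardless of vector length.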
A text is treated as a sequence of words (w_1, w_2, w_3, ...) in order of occurrence. Let the word2vec vector of word w_i be v_{w_i}, with vector length k. Concatenating the word vectors of all s words of the text yields the text feature representation vector v_text ∈ R^{s×k}.
The main goal of a knowledge graph is to describe the entities and concepts that exist in the real world and the associations among them. Through "entity-relation-entity" triples, a knowledge graph maps real-world entities and concepts into a semantic network. It can effectively address the low value density of open Internet big data, and is especially suited to entity-related and semantics-related information retrieval tasks. Entity relations, however, are hard to apply directly in text classification algorithms. The invention therefore represents the semantic information of the knowledge base as dense low-dimensional real-valued vectors, performing representation learning over the entities and relations in the knowledge graph.
The invention represents entities and relations as vectors with the TransE model. In each triple instance (head, relation, tail), the relation is regarded as a vector addition (translation) from the head entity to the tail entity. The vectors h, r, and t of head, relation, and tail are adjusted continually so that h + r comes as close as possible to t, ideally h + r = t.
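The TransE objective can be made concrete with a small sketch. The distance function follows directly from h + r = t. The margin-based ranking loss against a corrupted triple is the usual way TransE is trained, but the patent does not spell out its training procedure, so treat that part and the toy vectors as assumptions.

```python
import math


def transe_distance(h, r, t):
    """L2 norm of h + r - t; zero exactly when h + r = t."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def margin_loss(positive, negative, margin=1.0):
    """Hinge loss that pushes a true triple at least `margin` closer than a corrupted one."""
    return max(0.0, margin + transe_distance(*positive) - transe_distance(*negative))


h, r = [1.0, 0.0], [0.0, 1.0]
true_tail, far_tail, near_tail = [1.0, 1.0], [3.0, 1.0], [1.5, 1.0]
print(transe_distance(h, r, true_tail))
print(margin_loss((h, r, true_tail), (h, r, far_tail)))
print(margin_loss((h, r, true_tail), (h, r, near_tail)))
```

A corrupted triple that is already far away contributes no loss, while one that sits within the margin does, which is what drives the "continual adjustment" of the vectors.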
After TransE has learned representations of the entities and relations in the knowledge graph, each entity and relation can be represented by a k-dimensional vector v_{e_i}. The vector representation of knowledge features follows: using knowledge-graph-based entity recognition and linking, the entities linked from a text (w_1, w_2, w_3, ...) are (e_1, e_2, ..., e_t). Concatenating the entity vectors of all t entities yields the entity feature representation vector v_ent ∈ R^{t×k}.
Concatenating the text feature representation vector and the entity feature representation vector yields a text representation that incorporates knowledge features: v_merge = v_text ⊕ v_ent, where ⊕ denotes concatenation.
The new text representation vector v_merge replaces the original representation vector v_text, which was based on plain text features alone, in training the target model. This extends the features of the target text, adds support for deep semantic information, and improves the quality and completeness of the model.
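The construction of v_text, v_ent, and v_merge can be sketched with plain lists. Because both matrices have k columns, the sketch assumes that concatenation means appending the t entity rows after the s word rows, giving an (s + t) × k matrix. The source text does not state the axis explicitly, and the vectors below are toy values.

```python
def stack_rows(vectors, k):
    """Stack vectors of length k into a matrix (a list of rows)."""
    for v in vectors:
        if len(v) != k:
            raise ValueError(f"expected vectors of length {k}, got {len(v)}")
    return [list(v) for v in vectors]


def build_v_merge(word_vectors, entity_vectors, k):
    """Return (v_text, v_ent, v_merge) with shapes s x k, t x k and (s + t) x k."""
    v_text = stack_rows(word_vectors, k)
    v_ent = stack_rows(entity_vectors, k)
    # Assumption: entity rows are appended after the word rows.
    v_merge = v_text + v_ent
    return v_text, v_ent, v_merge


word_vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # s = 3 words, k = 2
entity_vectors = [[0.9, 0.8]]                          # t = 1 linked entity
v_text, v_ent, v_merge = build_v_merge(word_vectors, entity_vectors, k=2)
print(len(v_merge), len(v_merge[0]))
```

This layout requires the word2vec and TransE vectors to share the same dimension k, which matches the notation used for both in the description.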
The classification model of the invention, shown in Fig. 4, is trained on v_merge. A CNN deep learning model performs the classification: the v_merge vectors are assembled into the representation matrix of the text and fed into the CNN layers, and the result is finally fed into a fully connected network classifier for model training, giving the final classification result. This ensures that the model captures the deep semantic information of the text and improves classification quality and reliability.
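The forward pass of such a classifier can be sketched without any framework. The layer layout below (one convolution over row windows, ReLU, max-over-time pooling, one fully connected layer, softmax) is a common text-CNN design and is assumed here. The patent does not give filter sizes, layer counts, or training details, and the weights are hand-picked toy values rather than trained parameters.

```python
import math


def conv_max_pool(matrix, filt, bias=0.0):
    """Slide an h x k filter over the rows, apply ReLU, and keep the largest activation."""
    h = len(filt)
    best = 0.0  # starting at 0 folds the ReLU into the max
    for start in range(len(matrix) - h + 1):
        activation = bias + sum(filt[i][j] * matrix[start + i][j]
                                for i in range(h) for j in range(len(filt[i])))
        best = max(best, activation)
    return best


def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(matrix, filters, fc_weights, fc_bias):
    """Convolution + pooling features, then a fully connected layer and softmax."""
    features = [conv_max_pool(matrix, f) for f in filters]
    logits = [b + sum(w * x for w, x in zip(row, features))
              for row, b in zip(fc_weights, fc_bias)]
    return softmax(logits)


v_merge = [[1, 0], [0, 1], [1, 1]]                      # toy 3 x 2 representation matrix
filters = [[[1, 0], [0, 1]], [[-1, -1], [-1, -1]]]      # two filters spanning 2 rows each
probs = classify(v_merge, filters,
                 fc_weights=[[1, 0], [-1, 0]], fc_bias=[0, 0])
print(probs)
```

In practice the filters and fully connected weights would be learned by backpropagation on labeled abnormal and normal texts, which this sketch leaves out.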
Finally, it should be noted that the above embodiments merely illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions recorded in those embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the invention.

Claims (6)

1. An abnormal-information text classification method based on a knowledge graph, characterized in that a domain knowledge graph is built first; entity recognition and entity linking based on the domain knowledge graph are constructed; a text feature representation vector v_text and an entity feature representation vector v_ent are then built; finally, the text feature representation vector and the entity feature representation vector are concatenated into a new text representation vector v_merge that incorporates knowledge features, and classification training is performed on the new text representation vector to obtain the final classification result.
2. The method of claim 1, characterized in that building the domain knowledge graph comprises extracting multi-source data from different platforms, processing and fusing it, establishing entity categories and associations between entities, and then performing knowledge fusion; the knowledge fusion comprises two steps, entity alignment and attribute fusion and disambiguation; in the entity alignment step, features in three dimensions (entity name, entity category, and entity description) are used with a semantic similarity algorithm to find the list of entities that should be aligned, and all attribute information to be fused for the same entity is collected into a set.
3. The method of claim 1, characterized in that the graph-based entity recognition is built using a combined BiLSTM+CRF model: the text is first encoded with an LSTM, the word vector of each word in the text serving as the LSTM input; the output is the probability that each word carries a given tag and serves as the input of the CRF; the transition probability matrix is randomly initialized, and the tag sequence with the highest probability is found by an inference algorithm; the graph-based entity linking is built by establishing a probabilistic model through statistical learning on a standard data set, performing disambiguation, identifying the entity with the highest probability, and returning its entity ID.
4. The method of claim 1, characterized in that the text feature representation vector is built using word embedding: each word is represented as a vector, similarity between words is converted into a measure of cosine distance between vectors, text features are learned by a neural network, and the word-vector dimensionality is reduced at the same time; letting the word2vec vector of word w_i be v_{w_i} with vector length k, the word vectors of all words of the text are concatenated to obtain the text feature representation vector v_text, where s is the number of words and v_text ∈ R^{s×k}.
5. The method of claim 4, characterized in that the entity feature representation vector is built by performing representation learning on the entities and relations in the knowledge graph with the TransE algorithm, each entity and relation being represented by a k-dimensional vector v_{e_i}; the entities linked from a text (w_1, w_2, w_3, ...) are (e_1, e_2, ..., e_t), and the entity vectors of all entities are concatenated to obtain the entity feature representation vector v_ent, where t is the number of entities and v_ent ∈ R^{t×k}.
6. The method of claim 5, characterized in that the text feature representation vector and the entity feature representation vector are concatenated into the new text representation vector incorporating knowledge features as v_merge = v_text ⊕ v_ent, where ⊕ denotes concatenation; classification training is then performed with a CNN deep learning model: the v_merge vectors are assembled into the representation matrix of the text and fed into the CNN layers, and the result is finally fed into a fully connected network classifier for model training to obtain the final classification result.
CN201810443976.4A 2018-05-10 2018-05-10 An abnormal-information text classification method based on a knowledge graph Pending CN108595708A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810443976.4A CN108595708A (en) 2018-05-10 2018-05-10 An abnormal-information text classification method based on a knowledge graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810443976.4A CN108595708A (en) 2018-05-10 2018-05-10 An abnormal-information text classification method based on a knowledge graph

Publications (1)

Publication Number Publication Date
CN108595708A true CN108595708A (en) 2018-09-28

Family

ID=63637073

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810443976.4A Pending CN108595708A (en) 2018-05-10 2018-05-10 An abnormal-information text classification method based on a knowledge graph

Country Status (1)

Country Link
CN (1) CN108595708A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107526798A (en) * 2017-08-18 2017-12-29 武汉红茶数据技术有限公司 Joint entity recognition and normalization method and model based on neural network
CN107644014A (en) * 2017-09-25 2018-01-30 南京安链数据科技有限公司 Named entity recognition method based on bidirectional LSTM and CRF
CN107992480A (en) * 2017-12-25 2018-05-04 东软集团股份有限公司 Method and apparatus for realizing entity disambiguation, storage medium, and program product

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIN WANG et al.: "Combining Knowledge with Deep Convolutional Neural Networks for Short Text Classification", 《Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence》 *
Xu Zenglin et al.: "A Review of Knowledge Graph Techniques", 《Journal of University of Electronic Science and Technology of China》 *

Cited By (82)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020093356A1 (en) * 2018-11-06 2020-05-14 Beijing Didi Infinity Technology And Development Co., Ltd. Artificial intelligent systems and methods for using structurally simpler learner model to mimic behaviors of structurally more complicated reference model
US10872300B2 (en) 2018-11-06 2020-12-22 Beijing Didi Infinity Technology And Development Co., Ltd. Artificial intelligent systems and methods for using a structurally simpler learner model to mimic behaviors of a structurally more complicated reference model
CN111144574A (en) * 2018-11-06 2020-05-12 北京嘀嘀无限科技发展有限公司 Artificial intelligence system and method for training learner model using instructor model
CN111144574B (en) * 2018-11-06 2023-03-24 北京嘀嘀无限科技发展有限公司 Artificial intelligence system and method for training learner model using instructor model
CN109582802B (en) * 2018-11-30 2020-11-03 国信优易数据股份有限公司 Entity embedding method, device, medium and equipment
CN109582802A (en) * 2018-11-30 2019-04-05 国信优易数据有限公司 Entity embedding method, device, medium and equipment
CN109543041A (en) * 2018-11-30 2019-03-29 安徽听见科技有限公司 Method and device for generating language model scores
CN109614615B (en) * 2018-12-04 2022-04-22 联想(北京)有限公司 Entity matching method and device and electronic equipment
CN109614615A (en) * 2018-12-04 2019-04-12 联想(北京)有限公司 Entity matching method and device and electronic equipment
CN109657238A (en) * 2018-12-10 2019-04-19 宁波深擎信息科技有限公司 Knowledge graph-based context identification completion method, system, terminal and medium
CN109657238B (en) * 2018-12-10 2023-10-13 宁波深擎信息科技有限公司 Knowledge graph-based context identification completion method, system, terminal and medium
CN109684394A (en) * 2018-12-13 2019-04-26 北京百度网讯科技有限公司 Document creation method, device, equipment and storage medium
CN109726253A (en) * 2018-12-21 2019-05-07 义橙网络科技(上海)有限公司 Method, device, equipment and medium for constructing a talent graph and talent portrait
CN110490251A (en) * 2019-03-08 2019-11-22 腾讯科技(深圳)有限公司 Artificial intelligence-based prediction classification model obtaining method and device and storage medium
CN110490251B (en) * 2019-03-08 2022-07-01 腾讯科技(深圳)有限公司 Artificial intelligence-based prediction classification model obtaining method and device and storage medium
CN109977419A (en) * 2019-04-09 2019-07-05 福建奇点时空数字科技有限公司 Knowledge graph construction system
CN110046260A (en) * 2019-04-16 2019-07-23 广州大学 Knowledge graph-based darknet topic discovery method and system
CN110069779A (en) * 2019-04-18 2019-07-30 腾讯科技(深圳)有限公司 Symptom entity identification method of medical text and related device
CN110069779B (en) * 2019-04-18 2023-01-10 腾讯科技(深圳)有限公司 Symptom entity identification method of medical text and related device
CN110245228A (en) * 2019-04-29 2019-09-17 阿里巴巴集团控股有限公司 Method and apparatus for determining text categories
US20220147715A1 (en) * 2019-05-16 2022-05-12 Huawei Technologies Co., Ltd. Text processing method, model training method, and apparatus
CN110263324B (en) * 2019-05-16 2021-02-12 华为技术有限公司 Text processing method, model training method and device
CN110263324A (en) * 2019-05-16 2019-09-20 华为技术有限公司 Text processing method, model training method and device
CN110188147A (en) * 2019-05-22 2019-08-30 厦门无常师教育科技有限公司 Knowledge graph-based document entity relationship discovery method and system
CN111985242A (en) * 2019-05-22 2020-11-24 中国信息安全测评中心 Text labeling method and device
CN110263178B (en) * 2019-06-03 2023-05-12 南京航空航天大学 WordNet-to-Neo4J mapping method, semantic detection method and semantic calculation expansion interface generation method
CN110263178A (en) * 2019-06-03 2019-09-20 南京航空航天大学 WordNet-to-Neo4J mapping method, semantic detection method and semantic calculation expansion interface generation method
CN110399261A (en) * 2019-06-13 2019-11-01 中国科学院信息工程研究所 System alarm clustering method based on co-occurrence graph
CN110275928B (en) * 2019-06-24 2022-11-22 复旦大学 Iterative entity relation extraction method
CN110275928A (en) * 2019-06-24 2019-09-24 复旦大学 Iterative entity relation extraction method
CN110297908A (en) * 2019-07-01 2019-10-01 中国医学科学院医学信息研究所 Diagnosis and treatment program prediction method and device
CN110390324A (en) * 2019-07-27 2019-10-29 苏州过来人科技有限公司 Resume layout analysis algorithm fusing visual and text features
CN110633366B (en) * 2019-07-31 2022-12-16 国家计算机网络与信息安全管理中心 Short text classification method, device and storage medium
CN110633366A (en) * 2019-07-31 2019-12-31 国家计算机网络与信息安全管理中心 Short text classification method, device and storage medium
CN110516073A (en) * 2019-08-30 2019-11-29 北京百度网讯科技有限公司 Text classification method, device, equipment and medium
CN110910243A (en) * 2019-09-26 2020-03-24 山东佳联电子商务有限公司 Property right transaction method based on reconfigurable big data knowledge map technology
CN110825882A (en) * 2019-10-09 2020-02-21 西安交通大学 Knowledge graph-based information system management method
CN110825882B (en) * 2019-10-09 2022-03-01 西安交通大学 Knowledge graph-based information system management method
CN110955780A (en) * 2019-10-12 2020-04-03 中国人民解放军国防科技大学 Entity alignment method for knowledge graph
CN110955780B (en) * 2019-10-12 2022-10-14 中国人民解放军国防科技大学 Entity alignment method for knowledge graph
CN110750647A (en) * 2019-10-17 2020-02-04 北京华宇信息技术有限公司 Construction method of ELP model of multi-source heterogeneous information data
CN110750647B (en) * 2019-10-17 2020-07-31 北京华宇信息技术有限公司 Method for constructing ELP model of multi-source heterogeneous information data
CN110955764A (en) * 2019-11-19 2020-04-03 百度在线网络技术(北京)有限公司 Scene knowledge graph generation method, man-machine conversation method and related equipment
CN110955764B (en) * 2019-11-19 2021-04-06 百度在线网络技术(北京)有限公司 Scene knowledge graph generation method, man-machine conversation method and related equipment
CN111028952B (en) * 2019-11-27 2023-08-04 云知声智能科技股份有限公司 Method and device for constructing Chinese medical implication knowledge graph
CN111028952A (en) * 2019-11-27 2020-04-17 云知声智能科技股份有限公司 Method and device for constructing Chinese medical implication knowledge graph
CN110990533A (en) * 2019-11-29 2020-04-10 支付宝(杭州)信息技术有限公司 Method and device for determining standard text corresponding to query text
CN110990533B (en) * 2019-11-29 2023-08-25 支付宝(杭州)信息技术有限公司 Method and device for determining standard text corresponding to query text
CN111191031A (en) * 2019-12-24 2020-05-22 上海大学 Entity relation classification method of unstructured text based on WordNet and IDF
CN111191047A (en) * 2019-12-31 2020-05-22 武汉理工大学 Knowledge graph construction method for human-computer cooperation disassembly task
CN111209738A (en) * 2019-12-31 2020-05-29 浙江大学 Multi-task named entity recognition method combining text classification
CN111414393A (en) * 2020-03-26 2020-07-14 湖南科创信息技术股份有限公司 Semantic similar case retrieval method and equipment based on medical knowledge graph
CN111563166B (en) * 2020-05-28 2024-02-13 浙江学海教育科技有限公司 Pre-training model method for classifying mathematical problems
CN111563166A (en) * 2020-05-28 2020-08-21 浙江学海教育科技有限公司 Pre-training model method for mathematical problem classification
CN111737489A (en) * 2020-06-17 2020-10-02 广联达科技股份有限公司 Building information retrieval method, device, equipment and readable storage medium
WO2021259002A1 (en) * 2020-06-23 2021-12-30 平安科技(深圳)有限公司 Decision tree-based method and apparatus for outputting abnormal data sources, and computer device
CN112084331A (en) * 2020-08-27 2020-12-15 清华大学 Text processing method, text processing device, model training method, model training device, computer equipment and storage medium
CN112597298A (en) * 2020-10-14 2021-04-02 上海勃池信息技术有限公司 Deep learning text classification method fusing knowledge maps
CN112182249A (en) * 2020-10-23 2021-01-05 四川大学 Automatic classification method and device for aviation safety report
CN112417448A (en) * 2020-11-15 2021-02-26 复旦大学 Anti-aging enhancement method for malicious software detection model based on API (application programming interface) relational graph
CN112417448B (en) * 2020-11-15 2022-03-18 复旦大学 Anti-aging enhancement method for malicious software detection model based on API (application programming interface) relational graph
CN112559737A (en) * 2020-11-20 2021-03-26 和美(深圳)信息技术股份有限公司 Node classification method and system of knowledge graph
CN114548103A (en) * 2020-11-25 2022-05-27 马上消费金融股份有限公司 Training method of named entity recognition model and recognition method of named entity
CN114548103B (en) * 2020-11-25 2024-03-29 马上消费金融股份有限公司 Named entity recognition model training method and named entity recognition method
CN112632994B (en) * 2020-12-03 2023-09-01 大箴(杭州)科技有限公司 Method, device and equipment for determining basic attribute characteristics based on text information
CN112632994A (en) * 2020-12-03 2021-04-09 大箴(杭州)科技有限公司 Method, device and equipment for determining basic attribute characteristics based on text information
CN112801706A (en) * 2021-02-04 2021-05-14 北京云上曲率科技有限公司 Game user behavior data mining method and system
CN112801706B (en) * 2021-02-04 2024-02-02 北京云上曲率科技有限公司 Game user behavior data mining method and system
CN112906361A (en) * 2021-02-09 2021-06-04 上海明略人工智能(集团)有限公司 Text data labeling method and device, electronic equipment and storage medium
CN113094715A (en) * 2021-04-20 2021-07-09 国家计算机网络与信息安全管理中心 Network security dynamic early warning system based on knowledge graph
CN113254615A (en) * 2021-05-31 2021-08-13 中国移动通信集团陕西有限公司 Text processing method, device, equipment and medium
CN113449104A (en) * 2021-06-22 2021-09-28 上海明略人工智能(集团)有限公司 Label enhancement model construction method and system, electronic equipment and storage medium
CN113641766A (en) * 2021-07-15 2021-11-12 北京三快在线科技有限公司 Relationship identification method and device, storage medium and electronic equipment
CN113722509A (en) * 2021-09-07 2021-11-30 中国人民解放军32801部队 Knowledge graph data fusion method based on entity attribute similarity
CN113722509B (en) * 2021-09-07 2022-03-01 中国人民解放军32801部队 Knowledge graph data fusion method based on entity attribute similarity
CN113590802A (en) * 2021-09-27 2021-11-02 北京明略软件系统有限公司 Session content abnormity detection method and device, electronic equipment and storage medium
CN114064901B (en) * 2021-11-26 2022-08-26 重庆邮电大学 Book comment text classification method based on knowledge graph word meaning disambiguation
CN114064901A (en) * 2021-11-26 2022-02-18 重庆邮电大学 Book comment text classification method based on knowledge graph word meaning disambiguation
CN113963357A (en) * 2021-12-16 2022-01-21 北京大学 Knowledge graph-based sensitive text detection method and system
CN113963357B (en) * 2021-12-16 2022-03-11 北京大学 Knowledge graph-based sensitive text detection method and system
CN117040926A (en) * 2023-10-08 2023-11-10 北京网藤科技有限公司 Industrial control network security feature analysis method and system applying knowledge graph
CN117040926B (en) * 2023-10-08 2024-01-26 北京网藤科技有限公司 Industrial control network security feature analysis method and system applying knowledge graph

Similar Documents

Publication Publication Date Title
CN108595708A (en) Knowledge graph-based abnormal information text classification method
Niu et al. Multi-modal multi-scale deep learning for large-scale image annotation
CN108628828B (en) Combined extraction method based on self-attention viewpoint and holder thereof
CN112069811B (en) Electronic text event extraction method with multi-task interaction enhancement
CN110532328B (en) Text concept graph construction method
CN110909164A (en) Text enhancement semantic classification method and system based on convolutional neural network
CN114444516B (en) Cantonese rumor detection method based on deep semantic perception map convolutional network
CN113157859B (en) Event detection method based on upper concept information
CN112541337B (en) Document template automatic generation method and system based on recurrent neural network language model
CN110457585B (en) Negative text pushing method, device and system and computer equipment
CN114881043B (en) Deep learning model-based legal document semantic similarity evaluation method and system
CN113569050A (en) Method and device for automatically constructing government affair field knowledge map based on deep learning
Kumar et al. Hybrid fusion based approach for multimodal emotion recognition with insufficient labeled data
CN112733547A (en) Chinese question semantic understanding method by utilizing semantic dependency analysis
CN113011126A (en) Text processing method and device, electronic equipment and computer readable storage medium
CN115952794A (en) Chinese-Tai cross-language sensitive information recognition method fusing bilingual sensitive dictionary and heterogeneous graph
Celikyilmaz et al. A graph-based semi-supervised learning for question-answering
CN117765450B (en) Video language understanding method, device, equipment and readable storage medium
Samih et al. Enhanced sentiment analysis based on improved word embeddings and XGboost.
Mahmud et al. Deep learning based sentiment analysis from Bangla text using glove word embedding along with convolutional neural network
CN113486143A (en) User portrait generation method based on multi-level text representation and model fusion
CN113761128A (en) Event key information extraction method combining domain synonym dictionary and pattern matching
Cai et al. Multi‐level deep correlative networks for multi‐modal sentiment analysis
CN117216617A (en) Text classification model training method, device, computer equipment and storage medium
Wang et al. RSRNeT: a novel multi-modal network framework for named entity recognition and relation extraction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180928