CN109271632A - Supervised word vector learning method - Google Patents

Supervised word vector learning method

Info

Publication number
CN109271632A
CN109271632A
Authority
CN
China
Prior art keywords
word vector
word
vector
model
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811075603.2A
Other languages
Chinese (zh)
Other versions
CN109271632B (en)
Inventor
覃勋辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Star Cube Digital Technology Co., Ltd.
Original Assignee
Chongqing Xiezhi Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Xiezhi Technology Co., Ltd.
Priority to CN201811075603.2A
Publication of CN109271632A
Application granted
Publication of CN109271632B
Legal status: Active (current)
Anticipated expiration

Links

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/284: Lexical analysis, e.g. tokenisation or collocates
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/084: Backpropagation, e.g. using gradient descent
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The present patent application discloses a supervised word vector learning method, relating to the field of natural language processing methods, and comprising the following steps: step 1, building a deep learning network model by adding a word relationship classification model on top of the word2vec neural network model; step 2, inputting several adjacent input word vectors together with a certain specified word vector into the deep learning network model and performing multi-task learning; step 3, repeating step 2 and iterating the computation to obtain the optimized word2vec neural network model and word relationship classification model. The present patent application can obtain the relationship between a word vector and the specified word vector at the same time as that word vector is computed.

Description

Supervised word vector learning method
Technical field
The present invention relates to the field of natural language processing methods, and in particular to a supervised word vector learning method.
Background art
A word vector (word embedding), the vector representation of a word, is a common operation in natural language processing and a basic technology behind Internet services such as search engines, advertising systems, and recommender systems.
A word vector can be understood simply as the vectorized expression of a word, abstracting an entity into a mathematical description. For example, the word "apple" may be expressed as [0.4, 0.5, 0.9, ...] and "banana" as [0.3, 0.8, 0.1, ...]. Different dimensions of the vector characterize different features, so the values on different dimensions represent different semantics.
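For illustration, similarity between two such vectors is typically measured with cosine similarity. A minimal sketch in Python, reusing the toy values above (which are examples, not trained embeddings):

```python
import numpy as np

# Toy word vectors from the example above, not trained embeddings.
apple = np.array([0.4, 0.5, 0.9])
banana = np.array([0.3, 0.8, 0.1])

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two vectors: u.v / (|u||v|)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine_similarity(apple, banana))  # values near 1 suggest similar words
```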
Natural language processing (NLP for short) is a subdiscipline at the intersection of artificial intelligence and linguistics. The field investigates how to process and use natural language: enabling computers to "understand" human language, converting data into natural language, and converting natural language into forms that computer programs can handle more easily.
Natural language processing today includes many approaches, among which word2vec is a now-common family of models used for the task. Word2vec relies on skip-grams or the continuous bag-of-words model (CBOW) to build neural word embeddings, obtaining word vectors through a neural network model. Compared with skip-grams, CBOW better meets the requirements of everyday interchange between natural language and machine language.
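As a point of reference, a minimal sketch of training such word vectors with the third-party gensim library (gensim is not part of this patent; the corpus and parameters below are illustrative assumptions):

```python
from gensim.models import Word2Vec  # gensim >= 4.0

# A tiny illustrative corpus; a real corpus contains many sentences.
sentences = [
    ["I", "eat", "an", "apple"],
    ["I", "eat", "a", "banana"],
    ["apple", "and", "banana", "are", "fruit"],
]

# sg=0 selects CBOW; window=2 takes two context words on each side;
# vector_size is the word-vector dimensionality n.
model = Word2Vec(sentences, vector_size=100, window=2, min_count=1, sg=0)

print(model.wv["apple"][:5])                   # the learned word vector
print(model.wv.similarity("apple", "banana"))  # unsupervised similarity only
```

Note that this baseline yields only distributional similarity; it cannot say whether "apple" and "banana" are synonyms, appositions, or in a hypernym-hyponym relation, which is exactly the gap the method below addresses.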
Although word2vec can perform natural language processing, word ambiguity and incoherent sentences frequently occur. The root cause is word2vec's unsupervised mechanism: word2vec considers only the relationship between a word and its surrounding words, so when two synonyms appear in different contexts, the word vectors trained for them naturally differ widely. Among the word vectors learned by word2vec from a large corpus, the words close to a given word in the vector space include synonyms, appositions, hypernyms and hyponyms, related words, and so on, but word2vec cannot distinguish these relationships. Many NLP tasks need such word-to-word relationships, yet the word vectors obtained by existing learning methods do not provide them.
Summary of the invention
The invention is intended to provide a supervised word vector learning method that can obtain not only the word vectors corresponding to natural language but also predict the relationship between two word vectors.
The supervised word vector learning method in this scheme comprises the following steps:
Step 1: build a deep learning network model by adding a word relationship classification model on top of the word2vec neural network model.
Step 2: input several adjacent input word vectors together with a certain specified word vector into the deep learning network model and perform multi-task learning.
Step 3: repeat step 2 and iterate the computation to obtain the optimized word2vec neural network model and word relationship classification model.
The present invention has the following advantages:
The present invention proposes a supervised word vector generation method based on word-to-word relationships. On top of the existing word2vec, the method adds a word relationship classification model for computing word-to-word relationships and uses the multi-task learning mechanism of neural networks to learn word vectors and word-to-word relationships simultaneously. After training is complete, not only can the word vector corresponding to a word be obtained, but the word relationship of two words can also be predicted. Such word relationships play a very important role in several technical fields of natural language processing, such as text similarity computation and information retrieval.
In addition, telling the neural network the prior knowledge of words during training helps eliminate insufficient learning of low-frequency words.
Further, before step 1, the corpus text is segmented into words, and a vocabulary and initial word vectors corresponding to the vocabulary are established.
By collecting corpora and establishing the vocabulary and initial word vectors, the newly built deep learning network model can be given its initial training.
Further, before step 1, the relationships between the word vectors in the corpus text are annotated according to the vocabulary.
Through the word vectors annotated with relationships, the output vectors and word relationships of the deep learning network model can be corrected by backward learning, so that the parameters of the word2vec neural network model and the word relationship classification model inside the deep learning network model can be optimized.
Here, the corpus text is collected from the Internet using a crawler and from existing corpus repositories.
The corpus text in existing corpus repositories is relatively complete but not up to date; Internet language crawled from the web supplements the corpus text of the existing repositories, so that the established vocabulary and initial word vectors both reflect contemporary language characteristics.
Further, in step 1, the word relationship classification model includes a sequentially connected input layer, concatenation layer, fully connected layer, and probability layer. The concatenation layer concatenates the output vector Wi computed by the word2vec neural network model and the specified vector Wk input to the word relationship classification model according to the following formula: [Wi, Wk, Wi-Wk, Wi∘Wk, cos(Wi, Wk)].
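A minimal sketch of this concatenation step (NumPy; the function name is chosen here for illustration, and ∘ is read as the element-wise product):

```python
import numpy as np

def relation_features(wi: np.ndarray, wk: np.ndarray) -> np.ndarray:
    """Build the row vector [Wi, Wk, Wi-Wk, Wi∘Wk, cos(Wi, Wk)]."""
    cos = np.dot(wi, wk) / (np.linalg.norm(wi) * np.linalg.norm(wk))
    return np.concatenate([wi, wk, wi - wk, wi * wk, [cos]])

wi, wk = np.random.rand(100), np.random.rand(100)
print(relation_features(wi, wk).shape)  # (4 * 100 + 1,) = (401,)
```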
Through the word relationship classification model, the relationships between the initial word vectors are annotated accordingly, which makes it convenient to compute with these relationships during the subsequent training and computation process.
Further, in step 2, the input word vectors and the specified word vector are defined from the initial word vectors.
All word vectors are initialized as vectors of a specified, equal length.
Further, in step 2, the continuous bag-of-words model is used, and the multiple word vectors adjacent to the output word vector serve as the input word vectors of the word2vec neural network model.
The continuous bag-of-words model is currently the main model used for natural language processing in word2vec, but the word vectors within each bag are not related to one another, which means the word vectors finally computed also have difficulty establishing accurate and correct relationships with other word vectors; the present invention effectively solves this problem by adding the word relationship classification model. Moreover, superimposing the continuous bag-of-words model on the neural network model greatly reduces the number of layers to compute and the number of iterations, reducing the amount of computation, so that natural language can be quickly processed into standard word vectors for subsequent applications.
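For orientation, a minimal sketch of the CBOW forward computation (PyTorch, a third-party library not named in the patent; the flat output layer stands in for the Huffman-tree hierarchical softmax used in the embodiment, and all sizes are illustrative):

```python
import torch
import torch.nn as nn

class CBOW(nn.Module):
    """Predict the center word from the average of its context word vectors."""

    def __init__(self, vocab_size: int = 10000, dim: int = 100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)  # input word vectors
        self.out = nn.Linear(dim, vocab_size)       # flat stand-in for
                                                    # hierarchical softmax

    def forward(self, context_ids: torch.Tensor) -> torch.Tensor:
        hidden = self.embed(context_ids).mean(dim=1)  # average context vectors
        return self.out(hidden)                       # logits over vocabulary
```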
Further, in step 2, when performing multi-task learning, while the word2vec neural network model computes the output vector Wi, the word relationship classification model computes the relationship label(Wi, Wk) of Wi and Wk.
While the word2vec neural network is trained with the initial vectors, the word relationship classification model is trained at the same time; the trained deep learning network model can obtain the relationship label(Wi, Wk) of Wi and Wk while obtaining the output word vector Wi.
Further, in step 2, the word2vec neural network optimizes the neural network parameters through the error backpropagation mechanism; the error includes the classification error of the Huffman tree and the word relationship classification error.
This optimizes the computed output vector Wi and the word2vec neural network model.
Further, in step 2, the word relationship classification model optimizes the fully connected layer parameters through the neural network error backpropagation mechanism.
Using the word vectors annotated with relationships, the relationships computed by the word relationship classification model are compared against the annotations, and the fully connected layer parameters are corrected and updated accordingly, optimizing the computed label(Wi, Wk) and the word relationship classification model.
Further, in step 3, multiple randomly selected input vectors and a specified vector are respectively input into the word2vec neural network model and the word relationship classification model, and an output word vector together with the relationship between that output word vector and the specified word vector is computed.
When the word2vec neural network model and the word relationship classification model trained through many iterations are put to use, the relationship between the output word vector and the specified word vector can be obtained synchronously while the output word vector is obtained.
Brief description of the drawings
Fig. 1 is a flowchart of an embodiment of the present invention.
Fig. 2 is an operational framework diagram of an embodiment of the present invention.
Detailed description of the embodiments
The invention is further described below through a specific embodiment:
The embodiment is substantially as shown in Fig. 1. The supervised word vector learning method in this embodiment comprises the following steps:
Step 1: establish a corpus text library and segment the corpus text into words; the segmentation can use existing tools such as LTP or jieba, or even be done by hand. After segmentation, a vocabulary is established; the vocabulary is a set composed of individual words. Initial word vectors are then selected at random.
When establishing the corpus text library, corpus text is collected through existing Chinese corpus resources such as the "CCS" dictionary, "HowNet", and "Dacilin", and through a crawler on the Internet, forming multiple large corpus texts; these large corpus texts are then built into a corpus text library for retrieval.
After the corpus text library is established, the corpus text in the library is segmented. Meanwhile, the initial word vector of each word is defined as W0 = {w1, ..., wn}, where W0 is the word vector and w1 to wn are the feature values of the word vector in n different dimensions, n being the word vector feature dimensionality set by word2vec.
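A minimal sketch of this initialization (the vocabulary, the dimensionality n, and the uniform range are illustrative assumptions; word2vec implementations commonly initialize within a small uniform range):

```python
import numpy as np

n = 100                      # word-vector feature dimensionality set by word2vec
vocab = ["apple", "banana"]  # vocabulary built from the segmented corpus

rng = np.random.default_rng(seed=42)
# W0 = {w1, ..., wn}: one randomly selected initial vector per vocabulary word.
initial_vectors = {w: rng.uniform(-0.5 / n, 0.5 / n, size=n) for w in vocab}
```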
Step 2: annotate the related words of each word according to the vocabulary. The annotation can follow existing dictionaries such as Dacilin and CCS, or can be done by hand.
Related words include synonyms, appositions, hypernyms, hyponyms, unrelated words, and so on. When establishing the connections between words, first, in accordance with the prior art, the relationships between words are established using the word relationships in existing corpus resources such as the "CCS" dictionary, "HowNet", and "Dacilin". Relying only on existing resources, however, the word-to-word relationships provided are incomplete. In this embodiment, we therefore construct word-to-word relationships in the following manner: all word relationships for Wi are denoted label(Wi, Wk), where i and k belong to {1, ..., n}. The word relationships are {synonym, apposition, hypernym, hyponym, unrelated word, unknown}. Unknown word relationships are labeled "unknown", and these word relationships do not participate in training.
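A minimal sketch of this labeling scheme (the enum values mirror the relationship set above; the sample pair reuses the "Zhang San"/"Li Si" synonym example given later in this description):

```python
from enum import Enum

class WordRelation(Enum):
    SYNONYM = 0
    APPOSITION = 1
    HYPERNYM = 2
    HYPONYM = 3
    UNRELATED = 4
    UNKNOWN = 5  # excluded from training

# label(Wi, Wk): annotated relationships between word pairs.
labels = {("Zhang San", "Li Si"): WordRelation.SYNONYM}

# Pairs labeled UNKNOWN do not participate in training.
training_pairs = {p: r for p, r in labels.items() if r is not WordRelation.UNKNOWN}
```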
Step 3: build the deep learning network structure.
The output word vector is computed using the word2vec neural network model, with the initial word vectors embedded through the CBOW model during the computation. Meanwhile, while the output word vector is computed, the relationship between that output word vector and the specified word vector is computed synchronously by the word relationship classification model.
Specifically, as shown in Fig. 2, using the CBOW model of word2vec, the input word vectors Wi-m to Wi+m are input into the word2vec neural network model, and the output word vector Wi is computed. Wi is then input into the hierarchical softmax of the Huffman model to compute the probability of the output vector; in accordance with the prior art, the probability output by the Huffman model is used to correct the neural network parameters and the output vector of the neural network model, making the output word vector obtained from the neural network model more accurate.
While the output word vector is computed, the relationship between the output word vector Wi and the specified word vector Wk is computed through the word relationship classification model.
Specifically, the word relationship classification model includes a sequentially connected input layer, concatenation layer, fully connected layer, and softmax probability layer.
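A minimal sketch of such a classifier (PyTorch; the class name and layer sizes are illustrative assumptions, and softmax is deferred to the loss function, as is idiomatic):

```python
import torch
import torch.nn as nn

class WordRelationClassifier(nn.Module):
    """Concatenation layer -> fully connected layer -> softmax probability layer."""

    def __init__(self, dim: int = 100, num_relations: int = 6):
        super().__init__()
        # Input: [Wi, Wk, Wi-Wk, Wi∘Wk] plus the scalar cos(Wi, Wk).
        self.fc = nn.Linear(4 * dim + 1, num_relations)

    def forward(self, wi: torch.Tensor, wk: torch.Tensor) -> torch.Tensor:
        cos = nn.functional.cosine_similarity(wi, wk, dim=-1)
        feats = torch.cat([wi, wk, wi - wk, wi * wk, cos.unsqueeze(-1)], dim=-1)
        return self.fc(feats)  # logits; softmax is applied in the loss
```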
Step 4: multi-task learning.
While multiple word vectors are input to word2vec, the specified word vector Wk is input to the word relationship classification model through its input layer. Then, while the neural network model outputs Wi, Wi and Wk pass through the concatenation layer: the two vectors are recombined into one row vector according to the basic mathematical formula, the recombined row vector being [Wi, Wk, Wi-Wk, Wi∘Wk, cos(Wi, Wk)]. This row vector is then remapped through the fully connected layer network, and finally the softmax classifier performs the word relationship classification and error computation, yielding the relationship between the two word vectors arranged according to the predetermined dimensions.
Suppose word2vec selects CBOW and the CBOW window is 2m+1. [Wi-m, ..., Wi+m] is the vectorized corpus data of one window, excluding Wi. Wk is a related word of Wi, i.e., the specified word vector, and the relationship of the two word vectors is expressed as label(Wi, Wk); this variable represents the relationship between Wi and Wk. In this embodiment, label(Wi, Wk) equals the similarity probabilities computed for each of {synonym, apposition, hypernym, hyponym, unrelated word, unknown}, and label(Wi, Wk) is computed by the word relationship classification model.
This method jointly trains the output of the neural network model and the classifier model against the annotations; the losses of the two models, expressed in logarithmic probability form, are added to obtain the loss function of the whole network, as follows:
Loss = log P(Wi | Wi-m, ..., Wi+m) + s * log P(label(Wi, Wk) | Wi, Wk)
where s is a preset coefficient, for example s = 0.5.
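A minimal sketch of this joint loss (PyTorch; a flat softmax over the vocabulary is used as an illustrative stand-in for the Huffman-tree factorization of P(Wi | Wi-m, ..., Wi+m) described above):

```python
import torch
import torch.nn.functional as F

s = 0.5  # preset coefficient from the embodiment

def joint_loss(word_logits, target_word_id, relation_logits, relation_id):
    """log P(Wi | context) + s * log P(label(Wi, Wk) | Wi, Wk), negated for minimization."""
    log_p_word = -F.cross_entropy(word_logits, target_word_id)
    log_p_rel = -F.cross_entropy(relation_logits, relation_id)
    return -(log_p_word + s * log_p_rel)
```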
After the loss function is obtained, the network parameters are learned from it using the mechanism of neural network error backpropagation, where the network parameters are those contained in the neural network; by continuously correcting the network parameters through the loss function, the output word vectors obtained by the neural network model become more accurate. Meanwhile, the fully connected layer parameters in the classifier model are learned using the error backpropagation mechanism, so that they are continuously trained and optimized during the stepwise computation of word vector relationships, enabling the final word relationship classification model to accurately compute the relationship between two word vectors.
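A minimal joint-training step under the same illustrative assumptions (autograd supplies the error backpropagation mechanism; CBOW, WordRelationClassifier, and joint_loss are the sketches above, and the learning rate is an assumption):

```python
import torch

cbow, rel = CBOW(), WordRelationClassifier()
opt = torch.optim.SGD(list(cbow.parameters()) + list(rel.parameters()), lr=0.025)

def train_step(context_ids, target_id, wk_id, relation_id):
    wi = cbow.embed(context_ids).mean(dim=1)  # treat the hidden vector as Wi
    word_logits = cbow.out(wi)
    rel_logits = rel(wi, cbow.embed(wk_id))
    loss = joint_loss(word_logits, target_id, rel_logits, relation_id)
    opt.zero_grad()
    loss.backward()  # one backward pass updates both sub-models
    opt.step()
    return loss.item()
```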
Step 5: update the network parameters and the fully connected layer parameters to obtain the optimized deep neural network model, i.e., the updated word2vec neural network model and word relationship classification model.
In concrete application, the word vectors adjacent to a randomly input word vector are passed through the neural network model to obtain the output word vector Wi; at the same time, the specified word vector Wk is input, and the relationship label(Wi, Wk) between the word vector Wi and the word vector Wk is obtained through the iteratively trained classifier model.
Through the above steps, after training is complete, not only can the word vector corresponding to a word be obtained, but the relationship between that word vector and the specified word vector can also be computed by the classifier model.
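Putting the pieces together, a usage sketch under the same illustrative assumptions (all identifiers come from the sketches above, and the token ids are arbitrary):

```python
import torch

cbow = CBOW()
relation_model = WordRelationClassifier()

context_ids = torch.tensor([[3, 17, 42, 8]])  # [Wi-m, ..., Wi+m], excluding Wi
wi = cbow.embed(context_ids).mean(dim=1)      # output word vector Wi (illustrative)
wk = cbow.embed(torch.tensor([256]))          # specified word vector Wk

probs = torch.softmax(relation_model(wi, wk), dim=-1)
print(probs)  # P(label(Wi, Wk)) over {synonym, ..., unknown}
```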
On top of the existing word2vec, this method adds a classifier of word-to-word relationships and uses the multi-task learning mechanism of neural networks to learn word vectors and word-to-word relationships simultaneously; while the CBOW word vector model is being learned, the relationship of each word vector to other word vectors is predicted and defined through the word relationship classification model. As shown in Fig. 2, the method embodies the joint training of the outputs of two networks against the annotations: the left network is the word2vec CBOW network based on a Huffman tree, and the right network is the word relationship classifier network. The logarithmic probability forms of the losses of the two networks are added to serve as the loss function of the network. After training is complete, not only can the word vector corresponding to a word be obtained, but the word relationship of two words can also be predicted. Such word relationships play a very important role in several technical fields of natural language processing, such as text similarity computation and information retrieval.
In addition, telling the neural network the prior knowledge of words during training helps eliminate insufficient learning of low-frequency words. For example, "Zhang San" and "Li Si" are synonyms. "Li Si" appears many times in the training text and can be considered sufficiently trained, whereas "Zhang San" appears rarely and, under traditional word2vec, cannot be sufficiently trained at all. In the network of the present invention, when "Zhang San" is trained, the word-to-word classifier network and the word vector of "Li Si" can update the "Zhang San" word vector through the error backpropagation mechanism, so the network of the invention helps eliminate the insufficient learning of low-frequency words.
Similarly, by telling the neural network the prior knowledge of words during training, the word-to-word classifier network strengthens the distinction and connection between two input word vectors that have a prior relationship, overcoming the deficiency of the original word2vec network model in which word vectors rely only on the mechanism related to the surrounding text.
What has been described above is only an embodiment of the present invention; common knowledge such as well-known specific structures and characteristics is not described at excessive length in the scheme. A person of ordinary skill in the art knows all the ordinary technical knowledge in the technical field to which the invention belongs before the filing date or the priority date, can know all of the prior art in the field, and has the ability to apply routine experimental means before that date; under the enlightenment provided by this application, a person of ordinary skill in the art can improve and implement this scheme in combination with their own abilities, and some typical known structures or known methods should not become obstacles to implementing this application. It should be pointed out that, for those skilled in the art, several modifications and improvements can also be made without departing from the structure of the invention, and these should also be regarded as within the protection scope of the present invention; they will not affect the effect of implementing the invention or the practicability of the patent. The scope of protection claimed by this application shall be based on the content of the claims, and the records in the specification, such as the specific embodiments, can be used to interpret the content of the claims.

Claims (10)

1. A supervised word vector learning method, characterized by comprising the following steps:
step 1: building a deep learning network model by adding a word relationship classification model on top of a word2vec neural network model;
step 2: inputting several adjacent input word vectors together with a certain specified word vector into the deep learning network model and performing multi-task learning;
step 3: repeating step 2 and iterating the computation to obtain the optimized word2vec neural network model and word relationship classification model.
2. The supervised word vector learning method according to claim 1, characterized in that: before step 1, the corpus text is segmented into words, and a vocabulary and initial word vectors corresponding to the vocabulary are established.
3. The supervised word vector learning method according to claim 2, characterized in that: according to the vocabulary, the relationships between the word vectors in the corpus text are annotated.
4. The supervised word vector learning method according to claim 1, characterized in that: in step 1, the word relationship classification model includes a sequentially connected input layer, concatenation layer, fully connected layer, and probability layer; the concatenation layer concatenates the output vector Wi computed by the word2vec neural network model and the specified vector Wk input to the word relationship classification model according to the following formula: [Wi, Wk, Wi-Wk, Wi∘Wk, cos(Wi, Wk)].
5. The supervised word vector learning method according to claim 2, characterized in that: in step 2, the input word vectors and the specified word vector are defined from the initial word vectors.
6. The supervised word vector learning method according to claim 5, characterized in that: in step 2, the continuous bag-of-words model is used, and the multiple word vectors adjacent to the output word vector serve as the input word vectors of the word2vec neural network model.
7. The supervised word vector learning method according to claim 4, characterized in that: in step 2, when performing multi-task learning, while the word2vec neural network model computes the output vector Wi, the word relationship classification model computes the relationship label(Wi, Wk) of Wi and Wk.
8. The supervised word vector learning method according to claim 1, characterized in that: in step 2, the word2vec neural network optimizes the neural network parameters through the error backpropagation mechanism, the error including the classification error of the Huffman tree and the word relationship classification error.
9. The supervised word vector learning method according to claim 1, characterized in that: in step 2, the word relationship classification model optimizes the fully connected layer parameters through the neural network error backpropagation mechanism.
10. The supervised word vector learning method according to claim 1, characterized in that: in step 3, multiple randomly selected input vectors and a specified vector are respectively input into the word2vec neural network model and the word relationship classification model, and an output word vector together with the relationship between that output word vector and the specified word vector is computed.
CN201811075603.2A 2018-09-14 2018-09-14 Supervised word vector learning method Active CN109271632B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811075603.2A CN109271632B (en) 2018-09-14 2018-09-14 Supervised word vector learning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811075603.2A CN109271632B (en) 2018-09-14 2018-09-14 Supervised word vector learning method

Publications (2)

Publication Number Publication Date
CN109271632A 2019-01-25
CN109271632B CN109271632B (en) 2023-05-26

Family

ID=65188340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811075603.2A Active CN109271632B (en) 2018-09-14 2018-09-14 Supervised word vector learning method

Country Status (1)

Country Link
CN (1) CN109271632B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110825875A (en) * 2019-11-01 2020-02-21 科大讯飞股份有限公司 Text entity type identification method and device, electronic equipment and storage medium
CN110852077A (en) * 2019-11-13 2020-02-28 泰康保险集团股份有限公司 Method, device, medium and electronic equipment for dynamically adjusting Word2Vec model dictionary
CN111444346A (en) * 2020-03-31 2020-07-24 广州大学 Word vector confrontation sample generation method and device for text classification
CN112989032A (en) * 2019-12-17 2021-06-18 医渡云(北京)技术有限公司 Entity relationship classification method, apparatus, medium and electronic device

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105975478A (en) * 2016-04-09 2016-09-28 北京交通大学 Word vector analysis-based online article belonging event detection method and device
CN106557462A (en) * 2016-11-02 2017-04-05 数库(上海)科技有限公司 Name entity recognition method and system
CN106682220A (en) * 2017-01-04 2017-05-17 华南理工大学 Online traditional Chinese medicine text named entity identifying method based on deep learning
US20170139899A1 (en) * 2015-11-18 2017-05-18 Le Holdings (Beijing) Co., Ltd. Keyword extraction method and electronic device
CN106933806A (en) * 2017-03-15 2017-07-07 北京大数医达科技有限公司 The determination method and apparatus of medical synonym
CN107145503A (en) * 2017-03-20 2017-09-08 中国农业大学 Remote supervision non-categorical relation extracting method and system based on word2vec
CN107247780A (en) * 2017-06-12 2017-10-13 北京理工大学 A kind of patent document method for measuring similarity of knowledge based body
CN107291693A (en) * 2017-06-15 2017-10-24 广州赫炎大数据科技有限公司 A kind of semantic computation method for improving term vector model
CN107895000A (en) * 2017-10-30 2018-04-10 昆明理工大学 A kind of cross-cutting semantic information retrieval method based on convolutional neural networks
CN107895051A (en) * 2017-12-08 2018-04-10 宏谷信息科技(珠海)有限公司 A kind of stock news quantization method and system based on artificial intelligence
CN108280058A (en) * 2018-01-02 2018-07-13 中国科学院自动化研究所 Relation extraction method and apparatus based on intensified learning
US10037362B1 (en) * 2017-07-24 2018-07-31 International Business Machines Corpoation Mining procedure dialogs from source content
CN108388654A (en) * 2018-03-01 2018-08-10 合肥工业大学 A kind of sensibility classification method based on turnover sentence semantic chunk partition mechanism
CN108388914A (en) * 2018-02-26 2018-08-10 中译语通科技股份有限公司 A kind of grader construction method, grader based on semantic computation

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170139899A1 (en) * 2015-11-18 2017-05-18 Le Holdings (Beijing) Co., Ltd. Keyword extraction method and electronic device
CN105975478A (en) * 2016-04-09 2016-09-28 北京交通大学 Word vector analysis-based online article belonging event detection method and device
CN106557462A (en) * 2016-11-02 2017-04-05 数库(上海)科技有限公司 Name entity recognition method and system
CN106682220A (en) * 2017-01-04 2017-05-17 华南理工大学 Online traditional Chinese medicine text named entity identifying method based on deep learning
CN106933806A (en) * 2017-03-15 2017-07-07 北京大数医达科技有限公司 The determination method and apparatus of medical synonym
CN107145503A (en) * 2017-03-20 2017-09-08 中国农业大学 Remote supervision non-categorical relation extracting method and system based on word2vec
CN107247780A (en) * 2017-06-12 2017-10-13 北京理工大学 A kind of patent document method for measuring similarity of knowledge based body
CN107291693A (en) * 2017-06-15 2017-10-24 广州赫炎大数据科技有限公司 A kind of semantic computation method for improving term vector model
US10037362B1 (en) * 2017-07-24 2018-07-31 International Business Machines Corpoation Mining procedure dialogs from source content
CN107895000A (en) * 2017-10-30 2018-04-10 昆明理工大学 A kind of cross-cutting semantic information retrieval method based on convolutional neural networks
CN107895051A (en) * 2017-12-08 2018-04-10 宏谷信息科技(珠海)有限公司 A kind of stock news quantization method and system based on artificial intelligence
CN108280058A (en) * 2018-01-02 2018-07-13 中国科学院自动化研究所 Relation extraction method and apparatus based on intensified learning
CN108388914A (en) * 2018-02-26 2018-08-10 中译语通科技股份有限公司 A kind of grader construction method, grader based on semantic computation
CN108388654A (en) * 2018-03-01 2018-08-10 合肥工业大学 A kind of sensibility classification method based on turnover sentence semantic chunk partition mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ALEXIS CONNEAU ET AL: "Supervised Learning of Universal Sentence Representations from Natural Language Inference Data", Computer Science > Computation and Language *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110825875A (en) * 2019-11-01 2020-02-21 科大讯飞股份有限公司 Text entity type identification method and device, electronic equipment and storage medium
CN110825875B (en) * 2019-11-01 2022-12-06 科大讯飞股份有限公司 Text entity type identification method and device, electronic equipment and storage medium
CN110852077A (en) * 2019-11-13 2020-02-28 泰康保险集团股份有限公司 Method, device, medium and electronic equipment for dynamically adjusting Word2Vec model dictionary
CN110852077B (en) * 2019-11-13 2023-03-31 泰康保险集团股份有限公司 Method, device, medium and electronic equipment for dynamically adjusting Word2Vec model dictionary
CN112989032A (en) * 2019-12-17 2021-06-18 医渡云(北京)技术有限公司 Entity relationship classification method, apparatus, medium and electronic device
CN111444346A (en) * 2020-03-31 2020-07-24 广州大学 Word vector confrontation sample generation method and device for text classification
CN111444346B (en) * 2020-03-31 2023-04-18 广州大学 Word vector confrontation sample generation method and device for text classification

Also Published As

Publication number Publication date
CN109271632B (en) 2023-05-26

Similar Documents

Publication Publication Date Title
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
CN109992783B (en) Chinese word vector modeling method
CN112163426B (en) Relationship extraction method based on combination of attention mechanism and graph long-time memory neural network
CN111241294B (en) Relationship extraction method of graph convolution network based on dependency analysis and keywords
CN110377903B (en) Sentence-level entity and relation combined extraction method
CN109271632A (en) A kind of term vector learning method of supervision
CN106326212B (en) A kind of implicit chapter relationship analysis method based on level deep semantic
CN109657239A (en) The Chinese name entity recognition method learnt based on attention mechanism and language model
CN110222163A (en) A kind of intelligent answer method and system merging CNN and two-way LSTM
CN109344399A (en) A kind of Text similarity computing method based on the two-way lstm neural network of stacking
CN110555084B (en) Remote supervision relation classification method based on PCNN and multi-layer attention
CN107273426B (en) A kind of short text clustering method based on deep semantic route searching
CN111177383B (en) Text entity relation automatic classification method integrating text grammar structure and semantic information
CN111522965A (en) Question-answering method and system for entity relationship extraction based on transfer learning
CN104615767A (en) Searching-ranking model training method and device and search processing method
CN109933792B (en) Viewpoint type problem reading and understanding method based on multilayer bidirectional LSTM and verification model
CN108647191A (en) It is a kind of based on have supervision emotion text and term vector sentiment dictionary construction method
CN110866542A (en) Depth representation learning method based on feature controllable fusion
CN111241303A (en) Remote supervision relation extraction method for large-scale unstructured text data
Chen et al. Deep neural networks for multi-class sentiment classification
CN114417851A (en) Emotion analysis method based on keyword weighted information
CN111428481A (en) Entity relation extraction method based on deep learning
CN103678318A (en) Multi-word unit extraction method and equipment and artificial neural network training method and equipment
CN110569355B (en) Viewpoint target extraction and target emotion classification combined method and system based on word blocks
CN114048314A (en) Natural language steganalysis method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Qin Hong Hui

Inventor after: Du Ruo

Inventor after: Xiang Hai

Inventor after: Hou Cong

Inventor after: Liu Ke

Inventor before: Qin Hong Hui

CB03 Change of inventor or designer information
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231127

Address after: Room 1208-208, 12th Floor, Building 2, Fuhai Center, Daliushu, Haidian District, Beijing, 100081

Patentee after: Beijing Zhicheng Excellence Technology Co.,Ltd.

Address before: 401120 No. 1, Floor 3, Building 11, Internet Industrial Park, No. 106, West Section of Jinkai Avenue, Yubei District, Chongqing

Patentee before: CHONGQING XIEZHI TECHNOLOGY Co.,Ltd.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20231220

Address after: No. 1-0286, Juhe 6th Street, Jufuyuan Industrial Park, Tongzhou Economic Development Zone, Beijing, 101127

Patentee after: Beijing Star Cube Digital Technology Co.,Ltd.

Address before: Room 1208-208, 12th Floor, Building 2, Fuhai Center, Daliushu, Haidian District, Beijing, 100081

Patentee before: Beijing Zhicheng Excellence Technology Co.,Ltd.

TR01 Transfer of patent right