CN109271632B - Supervised word vector learning method - Google Patents

Supervised word vector learning method

Info

Publication number
CN109271632B
CN109271632B CN201811075603.2A CN201811075603A CN109271632B CN 109271632 B CN109271632 B CN 109271632B CN 201811075603 A CN201811075603 A CN 201811075603A CN 109271632 B CN109271632 B CN 109271632B
Authority
CN
China
Prior art keywords
word
word vector
vector
neural network
relation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811075603.2A
Other languages
Chinese (zh)
Other versions
CN109271632A (en)
Inventor
覃勋辉
杜若
向海
侯聪
刘科
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Star Cube Digital Technology Co ltd
Original Assignee
Chongqing Xiezhi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Xiezhi Technology Co ltd filed Critical Chongqing Xiezhi Technology Co ltd
Priority to CN201811075603.2A priority Critical patent/CN109271632B/en
Publication of CN109271632A publication Critical patent/CN109271632A/en
Application granted granted Critical
Publication of CN109271632B publication Critical patent/CN109271632B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a supervised word vector learning method, which relates to the field of natural language processing methods and comprises the following steps: step one, a word relation classification model is added on the basis of a word2vec neural network model, and a deep learning network model is built; step two, a plurality of adjacent input word vectors and a certain specified word vector are input into the deep learning network model for multitask learning; and step three, step two is repeated and iterative computation is performed to obtain an optimized word2vec neural network model and word relation classification model. The method can calculate word vectors and also obtain the relationship between a word vector and a specified word vector.

Description

Supervised word vector learning method
Technical Field
The invention relates to the field of natural language processing methods, in particular to a supervised word vector learning method.
Background
Word vector (word embedding), the vector representation of words, is a common operation in natural language processing and a common basic technology behind internet services such as search engines, advertising systems and recommendation systems.
A word vector can be understood simply as a vectorized word: the word is abstracted into a mathematical description of an entity. For example, the word "apple" may be expressed as [0.4, 0.5, 0.9, ...] and "banana" as [0.3, 0.8, 0.1, ...]; different dimensions of the vector characterize different features and represent different semantics.
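As an informal illustration of this idea (the numeric values and the use of cosine similarity below are illustrative assumptions, not taken from the patent), a word vector is simply an array of feature values, and the closeness of two words can be measured directly on those arrays:

```python
# Minimal sketch: word vectors as arrays of feature values, with cosine
# similarity as one possible closeness measure. The values are made up.
import numpy as np

word_vectors = {
    "apple":  np.array([0.4, 0.5, 0.9]),
    "banana": np.array([0.3, 0.8, 0.1]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two word vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(word_vectors["apple"], word_vectors["banana"]))
```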
Natural language processing (NLP) is a branch discipline of artificial intelligence and linguistics. The field studies how to process and use natural language: letting a computer "understand" human language, converting computer data into natural language, and converting natural language into a form that is easier for a computer program to process.
Natural language processing now includes many approaches, and word2vec is a family of models that is now relatively common for natural language processing. Word2vec relies on skip-gram or continuous bag-of-words (CBOW) architectures to build neural word embeddings, obtaining word vectors with a neural network model. Compared with skip-gram, CBOW better matches the everyday need of exchanging natural language with machine language.
Although word2vec can perform natural language processing, word ambiguity and mis-learned words often occur, because word2vec has no supervision mechanism: it only considers the relationship between a word and its surrounding words, so when the surrounding words of two synonyms differ, the word vectors trained for the two synonyms also differ. Among the word vectors learned by word2vec on a large corpus, the words at a small distance from a given word in the word vector space include synonyms, co-located words, contextual words, related words and so on, but word2vec does not distinguish between these relationships. Many NLP tasks require such word-to-word relationships, yet the word vectors obtained by existing learning methods do not provide them.
Disclosure of Invention
The invention aims to provide a supervised word vector learning method which not only can obtain word vectors corresponding to natural language, but also can predict the relationship between two word vectors.
The supervised word vector learning method in the scheme comprises the following steps:
step one, a word relation classification model is added on the basis of a word2vec neural network model, and a deep learning network model is built;
step two, inputting a plurality of adjacent input word vectors and a certain specified word vector into the deep learning network model for multitask learning;
and step three, repeating step two and performing iterative computation to obtain an optimized word2vec neural network model and word relation classification model.
The invention has the advantages that:
the invention provides a supervised word vector generation method based on word and word relation. According to the method, a word relation classification model for calculating word and word relation is added on the basis of the existing word2vec, and a neural network multitask learning mechanism is adopted to learn word vectors and word relation simultaneously. After training is completed, not only the word vector corresponding to the word can be obtained, but also the word relationship of the two words can be predicted. The word relation has very important roles in a plurality of technical fields such as text similarity calculation, information retrieval and the like of natural language.
In addition, prior knowledge about words is imparted to the neural network during training, which eliminates insufficient learning of low-frequency words.
Further, before the first step, the corpus text is segmented, and a word list and an initial word vector corresponding to the word list are established.
A word list and initial word vectors are established from the collected corpus to perform initial training of the newly built deep learning network model.
Further, before the first step, the relationships between the word vectors in the corpus text are labeled for each word vector according to the word list.
Through the word vectors labeled with relationships, the output vector and the word relation of the deep learning network model can be learned and corrected by back propagation, so that the parameters of the word2vec neural network model and the word relation classification model within the deep learning network model are optimized.
Further, crawlers are adopted to collect corpus texts from the internet and from corpus books.
The corpus texts in corpus books are complete but not up to date; by using crawlers to collect web expressions from the internet as a supplement to the corpus texts in existing corpus books, the established word list and initial word vectors can reflect the language characteristics of the present day.
Further, in the first step, the word relation classification model includes an input layer, a splicing layer, a fully connected layer and a probability layer which are sequentially connected; the splicing layer splices the output vector Wi obtained through word2vec neural network model calculation and the specified vector Wk input into the word relation classification model according to the following formula: [Wi, Wk, Wi-Wk, Wi∘Wk, cos(Wi, Wk)].
Through the word relation classification model, the relationships between the initial word vectors are labeled correspondingly, which makes it convenient to calculate the relationships jointly during training.
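A minimal sketch of the splicing layer described above, assuming Wi and Wk are NumPy arrays of the same length and reading "Wi∘Wk" as the element-wise product (an assumption; the patent does not spell out the operator):

```python
import numpy as np

def splice(wi: np.ndarray, wk: np.ndarray) -> np.ndarray:
    """Build the spliced row vector [Wi, Wk, Wi-Wk, Wi∘Wk, cos(Wi, Wk)]."""
    cos = float(np.dot(wi, wk) / (np.linalg.norm(wi) * np.linalg.norm(wk)))
    return np.concatenate([wi, wk, wi - wk, wi * wk, [cos]])

wi, wk = np.random.randn(100), np.random.randn(100)
features = splice(wi, wk)   # length 4*100 + 1 = 401, fed to the fully connected layer
```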
Further, in step two, the input word vectors and the specified word vector are defined from the initial word vectors.
All word vectors are initialized as vectors of the same specified length.
Further, in the second step, the continuous bag-of-words model is adopted to input a plurality of word vectors adjacent to the output word vector into the word2vec neural network model as input word vectors.
The continuous bag-of-words model is the main model used for natural language processing in word2vec, but the word vectors within each bag have no relation correspondence with one another, so it is difficult for the finally calculated word vectors to establish accurate relationships with other word vectors. The invention effectively solves this problem by adding a word relation classification model. Because the neural network model is combined with the continuous bag-of-words model, the number of layers and the number of iterations can be greatly reduced, the amount of calculation is reduced, and natural language can be processed into standard word vectors more quickly for subsequent applications.
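The following is a minimal sketch of how a CBOW-style input can be formed; the averaging of the context vectors and all names are illustrative assumptions rather than the patent's exact implementation:

```python
import numpy as np

def cbow_context_vector(embeddings: np.ndarray, token_ids: list, i: int, m: int) -> np.ndarray:
    """Average the vectors of the m words on each side of position i (the 2m neighbours)."""
    context_ids = token_ids[max(0, i - m):i] + token_ids[i + 1:i + 1 + m]
    return embeddings[context_ids].mean(axis=0)

# embeddings: (vocab_size, n) matrix of word vectors; token_ids: one segmented sentence as ids.
```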
Further, in step two, when multitask learning is performed, the word2vec neural network model calculates the output vector Wi while the word relation classification model calculates the relation label(Wk, Wi) of Wi and Wk.
The word2vec neural network is trained using the initial vectors while the word relation classification model is trained at the same time; the trained deep learning network model can obtain the relation label(Wk, Wi) of Wi and Wk at the same time as the output word vector Wi.
Further, in the second step, the word2vec neural network optimizes the neural network parameters through an error back propagation mechanism, and the errors include the classification error of the Huffman tree and the word relation classification error.
The calculated output vector Wi and the word2vec neural network model are thereby optimized.
Further, in the second step, the word relation classification model optimizes the fully connected layer parameters through the neural network error back propagation mechanism.
The relationships calculated by the word relation classification model are compared with the word vectors whose relationships have been labeled, and the fully connected layer parameters are corrected and updated accordingly, so that the calculated label(Wk, Wi) and the word relation classification model are optimized.
In step three, a plurality of randomly selected input vectors and specified vectors are respectively input into the word2vec neural network model and the word relation classification model, and the output word vector and the relationship between the output word vector and the specified word vector are obtained through calculation.
After the word2vec neural network model and the word relation classification model have been trained through multiple iterations, the relationship between the output word vector and the specified word vector can be obtained synchronously with the output word vector when the word2vec neural network model is used.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention.
Fig. 2 is a diagram of an operation framework according to an embodiment of the present invention.
Detailed Description
The following is a further detailed description of the embodiments:
An embodiment is substantially as shown in Figure 1. The supervised word vector learning method in this embodiment comprises the following steps:
First, a corpus text library is established and word segmentation is performed on the corpus text; segmentation can use existing tools such as LTP or jieba, or even manual word segmentation. After segmentation, a word list is established, which is the set formed by the words, and initial word vectors are selected randomly.
When the corpus text library is built, corpus texts are collected from existing Chinese lexical resources such as the CCS dictionary, HowNet and the word forest (Cilin), and from the internet by crawlers, to form several large corpus texts, which are assembled into a corpus text library.
After the corpus text library is established, word segmentation is performed on the corpus texts in the library. Meanwhile, the initial word vector of each word is defined as W0 = {w1, ..., wn}, where W0 is the word vector, w1 to wn are the feature values of the word vector in n different dimensions, and n is the word vector feature dimension set by word2vec.
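A minimal sketch of this preparatory step, assuming random uniform initialization and n = 100 (both are illustrative choices; the patent only states that the initial word vectors are selected randomly):

```python
import numpy as np

def build_vocab_and_vectors(segmented_corpus, n: int = 100, seed: int = 0):
    """Build the word list and one random initial vector W0 = {w1, ..., wn} per word."""
    rng = np.random.default_rng(seed)
    vocab = sorted({word for sentence in segmented_corpus for word in sentence})
    word2id = {word: idx for idx, word in enumerate(vocab)}
    init_vectors = rng.uniform(-0.5 / n, 0.5 / n, size=(len(vocab), n))
    return word2id, init_vectors

corpus = [["苹果", "是", "水果"], ["香蕉", "是", "水果"]]   # already segmented sentences
word2id, init_vectors = build_vocab_and_vectors(corpus)
```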
In the second step, relation words are labeled for each word according to the word list; labeling can be done according to an existing dictionary, such as the word forest (Cilin) or the CCS dictionary, or manually.
The relation words include synonyms, co-located words, hypernyms, hyponyms, irrelevant words and so on. When word-word relationships are established, the word relationships in existing corpus books, such as the CCS dictionary, HowNet and the word forest (Cilin), are adopted first. If only existing corpus books are used, the word-word relationships they provide are incomplete, so in this embodiment word relationships are also constructed in the following way: all word relationships for Wi are denoted by label(Wi, Wk), with i and k belonging to {1, ..., N}. The word relations are {synonym, co-located word, hypernym, hyponym, irrelevant word, unknown}. Unknown word relationships are labeled "unknown", and these word relationships do not participate in training.
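A minimal sketch of how the labelled pairs might be stored, with pairs whose relation is "unknown" excluded from training; the dictionary layout and the example pairs are illustrative assumptions:

```python
RELATIONS = ["synonym", "co-located word", "hypernym", "hyponym", "irrelevant word", "unknown"]

# Relations gathered from corpus books (CCS dictionary, HowNet, the word forest) or manual labelling.
relation_labels = {
    ("苹果", "香蕉"): "co-located word",   # apple / banana: co-located words (co-hyponyms)
    ("苹果", "水果"): "hyponym",            # apple is a hyponym of fruit
}

def label(wi: str, wk: str) -> str:
    """label(Wi, Wk): the relation of the pair, or 'unknown' if it was never labelled."""
    return relation_labels.get((wi, wk), "unknown")

# Only labelled pairs participate in training; "unknown" pairs are skipped.
training_pairs = [(a, b, RELATIONS.index(r)) for (a, b), r in relation_labels.items() if r != "unknown"]
```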
Thirdly, building a deep learning network structure.
The output word vector is calculated by the word2vec neural network model, and the initial word vectors are embedded using the CBOW model during the calculation. Meanwhile, while the output word vector is calculated, the word relation classification model synchronously calculates the relationship between the output word vector and the specified word vector.
Specifically, as shown in Fig. 2, the input word vectors Wi-m to Wi+m are input into the word2vec neural network model using the CBOW model of word2vec, and the output word vector Wi is calculated. Wi is then input into the Huffman-tree model (hierarchical softmax) to calculate the probability of the output vector, and the neural network parameters and the output vector of the neural network model are corrected, in the existing manner, according to the probability output by the Huffman model, so that the output word vector obtained from the neural network model is more accurate.
While the output word vector is calculated, the relationship between the output word vector Wi and the specified word vector Wk is calculated by the word relation classification model.
Specifically, the word relation classification model comprises an input layer, a splicing layer, a fully connected layer (full connected layer) and a probability layer (softmax), which are sequentially connected.
Fourth, multitask learning.
While a plurality of word vectors are input into word2vec, the specified word vector Wk is input into the word relation classification model through the input layer of the word relation model. When the neural network model outputs Wi, Wi and Wk are fed into the splicing layer, where the two vectors are recombined according to a basic mathematical formula into a row vector; the recombined row vector is [Wi, Wk, Wi-Wk, Wi∘Wk, cos(Wi, Wk)]. The recombined row vector is then remapped through the network of the fully connected layer, and finally word relation classification and error calculation are realized through the softmax classifier, yielding the relationship between the two word vectors arranged according to the preset dimensions.
Assume that word2vec uses CBOW and the CBOW window is 2m+1. Wi-m, ..., Wi+m are the vectorized corpus data of the window other than Wi. Wk is a relation word of Wi, i.e. the specified word vector, and the relationship between these two word vectors is denoted label(Wi, Wk); this variable represents the relationship of Wi and Wk. In this embodiment, label(Wi, Wk) is equal to the similarity probability calculated for each feature dimension in {synonym, co-located word, hypernym, hyponym, irrelevant word, unknown}, and is calculated by the word relation classification model.
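The following PyTorch sketch shows one way the two branches could be wired together: a CBOW branch that predicts the centre word Wi from its 2m neighbours, and a relation branch that classifies the spliced pair (Wi, Wk). For simplicity it uses an ordinary softmax over the vocabulary instead of the Huffman-tree hierarchical softmax named in the patent, and all layer sizes and names are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SupervisedWord2Vec(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 100, num_relations: int = 6):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, dim)            # the word vectors being learned
        self.word_output = nn.Linear(dim, vocab_size)              # stand-in for the hierarchical softmax
        self.relation_fc = nn.Linear(4 * dim + 1, num_relations)   # fully connected layer of the classifier

    def forward(self, context_ids, center_ids, specified_ids):
        # CBOW branch: average the 2m context vectors and score every word in the vocabulary.
        context_mean = self.embeddings(context_ids).mean(dim=1)
        word_logits = self.word_output(context_mean)

        # Relation branch: splice [Wi, Wk, Wi-Wk, Wi∘Wk, cos(Wi, Wk)] and classify with softmax.
        wi = self.embeddings(center_ids)
        wk = self.embeddings(specified_ids)
        cos = F.cosine_similarity(wi, wk, dim=-1).unsqueeze(-1)
        spliced = torch.cat([wi, wk, wi - wk, wi * wk, cos], dim=-1)
        relation_logits = self.relation_fc(spliced)
        return word_logits, relation_logits
```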
The method adopts joint training of the outputs and labels of the neural network model and the classifier model; the losses of the two models are expressed in log-probability form and added to obtain the loss function of the whole network, which is as follows:
Loss=logP(Wi|Wi-m,...,Wi+m)+s*logP(label(Wi,Wk)|Wi,Wk)
where s is a predetermined coefficient, for example, s=0.5.
After the loss function is obtained, a neural network error back propagation mechanism is adopted to learn the network parameters using the loss function; the network parameters belong to the neural network itself and are continuously corrected by the loss function, so that the output word vector obtained by the neural network model becomes more accurate. Meanwhile, the fully connected layer parameters in the classifier model are learned using the error back propagation mechanism, so that they are continuously trained and optimized as the word vector relationships are gradually calculated, and the finally obtained word relation classification model can accurately calculate the relationship between two word vectors.
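Continuing the SupervisedWord2Vec sketch above, one training step under the joint loss could look as follows; cross-entropy is the negative of the log-probabilities in the formula, so minimising it maximises the stated objective, and the SGD optimiser and learning rate are illustrative assumptions (only s = 0.5 comes from the description):

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, context_ids, center_ids, specified_ids, relation_ids, s: float = 0.5):
    word_logits, relation_logits = model(context_ids, center_ids, specified_ids)
    word_loss = F.cross_entropy(word_logits, center_ids)            # -logP(Wi | Wi-m, ..., Wi+m)
    relation_loss = F.cross_entropy(relation_logits, relation_ids)  # -logP(label(Wi, Wk) | Wi, Wk)
    loss = word_loss + s * relation_loss
    optimizer.zero_grad()
    loss.backward()   # error back propagation updates the embeddings, CBOW output and fully connected layer
    optimizer.step()
    return float(loss)

# model = SupervisedWord2Vec(vocab_size=len(word2id))
# optimizer = torch.optim.SGD(model.parameters(), lr=0.025)
```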
Fifth, the network parameters and the fully connected layer parameters are updated to obtain the optimized deep neural network model, namely the updated word2vec neural network model and word relation classification model.
In a specific application, word vectors adjacent to a certain word vector are input, and the output word vector Wi is obtained through the neural network model; meanwhile, the specified word vector Wk is input, and the relationship label(Wi, Wk) between the word vector Wi and the related Wk is obtained through iterative calculation of the classifier model.
Through the above steps, after training is completed, not only can the word vector corresponding to a word be obtained, but the relationship between that word vector and a specified word vector can also be calculated by the classifier model.
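A minimal sketch of this use of the trained network, continuing the sketches above (all names are assumptions): the embedding of the word gives its word vector Wi, and the relation branch gives the predicted relation to the specified word Wk:

```python
import torch

@torch.no_grad()
def predict(model, context_ids, center_id, specified_id, relations):
    """Return the learned word vector Wi and the predicted relation label(Wi, Wk)."""
    _, relation_logits = model(context_ids, center_id, specified_id)
    wi = model.embeddings(center_id)                 # the output word vector Wi
    predicted = relations[int(relation_logits[0].argmax())]
    return wi, predicted

# wi, rel = predict(model, context_ids, center_id, specified_id, RELATIONS)
```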
According to the method, a word-word relation classifier is added on the basis of the existing word2vec, and a neural network multitask learning mechanism is adopted to learn word vectors and word relationships simultaneously; in the learning process of the CBOW word vector model, the relationship between each word vector and other word vectors is predicted and defined by the word relation classification model. As shown in Fig. 2, the method trains the outputs and labels of two networks jointly: the left network is a word2vec CBOW network based on a Huffman tree, and the right network is a word relation classification network. The losses of the left and right networks are expressed in log-probability form and added as the loss function of the whole network. After training is completed, not only can the word vector corresponding to a word be obtained, but the word relationship of two words can also be predicted. Word relationships play a very important role in many technical fields of natural language processing, such as text similarity calculation and information retrieval.
In addition, prior knowledge about words is imparted to the neural network during training, which eliminates insufficient learning of low-frequency words. For example, "Zhang San" and "Li Si" are synonyms: "Li Si" appears many times in the training text and is considered fully trained, while "Zhang San" appears rarely and, according to traditional word2vec, would not be fully trained. In the network of the invention, when "Zhang San" is trained, its word vector can also be updated through the error back propagation mechanism, based on the word-word classification network and the word vector of "Li Si", so the network of the invention helps eliminate insufficient learning of low-frequency words.
Similarly, because prior knowledge about words is imparted to the neural network during training, the word-word classification network strengthens the distinction and the connection between the word vectors of two input words that have a prior relationship, overcoming the shortcoming of the original word2vec network model that word vectors depend only on the surrounding text.
The foregoing is merely an embodiment of the present invention. Specific structures and characteristics that are common knowledge in the art are not described here in detail; a person of ordinary skill in the art knows the common general knowledge in the technical field as of the application date or the priority date, can learn all the prior art in the field, and has the ability to apply the conventional experimental means of that date, so that such a person can, in the light of this application, complete and implement the present embodiment in combination with his or her own abilities, and some typical known structures or known methods should not become an obstacle to implementing this application. It should be noted that modifications and improvements can be made by those skilled in the art without departing from the structure of the present invention; these should also be regarded as falling within the protection scope of the present invention and do not affect the effect of implementing the invention or the utility of the patent. The protection scope of this application shall be subject to the content of the claims, and the description of the specific embodiments in the specification can be used to interpret the content of the claims.

Claims (8)

1. A supervised word vector learning method is characterized in that: the method comprises the following steps:
step one, a word relation classification model is added on the basis of a word2vec neural network model, and a deep learning network model is built;
step two, inputting a plurality of adjacent input word vectors and a certain appointed word vector into a deep learning network model for multitask learning;
repeating the second step, and performing iterative computation to obtain an optimized word2vec neural network model and a word relation classification model;
after training is completed, not only can word vectors corresponding to words be obtained, but also the relation between the word vectors and the appointed word vectors can be calculated according to the classifier model;
in the second step, the word2vec neural network optimizes the neural network parameters through an error back propagation mechanism, wherein the errors comprise classification errors of Huffman trees and word relation classification errors;
in the third step, a plurality of randomly selected input vectors and specified vectors are respectively input into a word2vec neural network model and a word relation classification model, and an output word vector and the relation between the output word vector and the specified word vector are obtained through calculation.
2. The supervised word vector learning method of claim 1, wherein: before the first step, the corpus text is segmented, and a word list and an initial word vector corresponding to the word list are established.
3. The supervised word vector learning method of claim 2, wherein: and labeling the relation between each word vector and each word vector in the corpus text according to the word list.
4. The supervised word vector learning method of claim 1, wherein: in the first step, the word relation classification model comprises an input layer, a splicing layer, a full-connection layer and a probability layer which are connected in sequence; the splicing layer splices an output vector Wi obtained through word2vec neural network model calculation and a designated vector Wk input into a word relation classification model according to the following formula: [Wi, Wk, Wi-Wk, Wi ∘ Wk, cos(Wi, Wk)].
5. The supervised word vector learning method of claim 2, wherein: in step two, an input word vector and a specified word vector are defined by the initial word vector.
6. The supervised word vector learning method of claim 5, wherein: in the second step, a plurality of word vectors adjacent to the output word vector are input to the neural network model of word2vec as input word vectors by adopting the continuous bag-of-words model.
7. The supervised word vector learning method of claim 4, wherein: in the second step, when the multitask learning is performed, the word2vec neural network model calculates an output vector Wi, and simultaneously, the word relation classification model calculates a relation label(Wk, Wi) of Wi and Wk.
8. The supervised word vector learning method of claim 1, wherein: in the second step, the word relation classification model optimizes the parameters of the full-connection layer through a neural network error back propagation mechanism.
CN201811075603.2A 2018-09-14 2018-09-14 Supervised word vector learning method Active CN109271632B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811075603.2A CN109271632B (en) 2018-09-14 2018-09-14 Supervised word vector learning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811075603.2A CN109271632B (en) 2018-09-14 2018-09-14 Supervised word vector learning method

Publications (2)

Publication Number Publication Date
CN109271632A CN109271632A (en) 2019-01-25
CN109271632B true CN109271632B (en) 2023-05-26

Family

ID=65188340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811075603.2A Active CN109271632B (en) 2018-09-14 2018-09-14 Supervised word vector learning method

Country Status (1)

Country Link
CN (1) CN109271632B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110825875B (en) * 2019-11-01 2022-12-06 科大讯飞股份有限公司 Text entity type identification method and device, electronic equipment and storage medium
CN110852077B (en) * 2019-11-13 2023-03-31 泰康保险集团股份有限公司 Method, device, medium and electronic equipment for dynamically adjusting Word2Vec model dictionary
CN112989032A (en) * 2019-12-17 2021-06-18 医渡云(北京)技术有限公司 Entity relationship classification method, apparatus, medium and electronic device
CN111444346B (en) * 2020-03-31 2023-04-18 广州大学 Word vector confrontation sample generation method and device for text classification

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108280058A (en) * 2018-01-02 2018-07-13 中国科学院自动化研究所 Relation extraction method and apparatus based on intensified learning
US10037362B1 * 2017-07-24 2018-07-31 International Business Machines Corporation Mining procedure dialogs from source content
CN108388914A (en) * 2018-02-26 2018-08-10 中译语通科技股份有限公司 A kind of grader construction method, grader based on semantic computation

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170139899A1 (en) * 2015-11-18 2017-05-18 Le Holdings (Beijing) Co., Ltd. Keyword extraction method and electronic device
CN105975478A (en) * 2016-04-09 2016-09-28 北京交通大学 Word vector analysis-based online article belonging event detection method and device
CN106557462A (en) * 2016-11-02 2017-04-05 数库(上海)科技有限公司 Name entity recognition method and system
CN106682220A (en) * 2017-01-04 2017-05-17 华南理工大学 Online traditional Chinese medicine text named entity identifying method based on deep learning
CN106933806A (en) * 2017-03-15 2017-07-07 北京大数医达科技有限公司 The determination method and apparatus of medical synonym
CN107145503A (en) * 2017-03-20 2017-09-08 中国农业大学 Remote supervision non-categorical relation extracting method and system based on word2vec
CN107247780A (en) * 2017-06-12 2017-10-13 北京理工大学 A kind of patent document method for measuring similarity of knowledge based body
CN107291693B (en) * 2017-06-15 2021-01-12 广州赫炎大数据科技有限公司 Semantic calculation method for improved word vector model
CN107895000B (en) * 2017-10-30 2021-06-18 昆明理工大学 Cross-domain semantic information retrieval method based on convolutional neural network
CN107895051A (en) * 2017-12-08 2018-04-10 宏谷信息科技(珠海)有限公司 A kind of stock news quantization method and system based on artificial intelligence
CN108388654B (en) * 2018-03-01 2020-03-17 合肥工业大学 Sentiment classification method based on turning sentence semantic block division mechanism

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10037362B1 * 2017-07-24 2018-07-31 International Business Machines Corporation Mining procedure dialogs from source content
CN108280058A (en) * 2018-01-02 2018-07-13 中国科学院自动化研究所 Relation extraction method and apparatus based on intensified learning
CN108388914A (en) * 2018-02-26 2018-08-10 中译语通科技股份有限公司 A kind of grader construction method, grader based on semantic computation

Also Published As

Publication number Publication date
CN109271632A (en) 2019-01-25

Similar Documents

Publication Publication Date Title
CN109271632B (en) Supervised word vector learning method
CN109840287B (en) Cross-modal information retrieval method and device based on neural network
CN108052512B (en) Image description generation method based on depth attention mechanism
CN110377903B (en) Sentence-level entity and relation combined extraction method
CN108733742B (en) Global normalized reader system and method
CN111914067B (en) Chinese text matching method and system
US20240177047A1 (en) Knowledge grap pre-training method based on structural context infor
CN110222163A (en) A kind of intelligent answer method and system merging CNN and two-way LSTM
CN109657239A (en) The Chinese name entity recognition method learnt based on attention mechanism and language model
CN108132931A (en) A kind of matched method and device of text semantic
CN105938485A (en) Image description method based on convolution cyclic hybrid model
CN110688489B (en) Knowledge graph deduction method and device based on interactive attention and storage medium
CN111898374B (en) Text recognition method, device, storage medium and electronic equipment
CN111274794B (en) Synonym expansion method based on transmission
CN111368870A (en) Video time sequence positioning method based on intra-modal collaborative multi-linear pooling
CN108536735B (en) Multi-mode vocabulary representation method and system based on multi-channel self-encoder
CN110399454B (en) Text coding representation method based on transformer model and multiple reference systems
CN112801762B (en) Multi-mode video highlight detection method and system based on commodity perception
CN116204674B (en) Image description method based on visual concept word association structural modeling
CN111191461B (en) Remote supervision relation extraction method based on course learning
CN114881042A (en) Chinese emotion analysis method based on graph convolution network fusion syntax dependence and part of speech
CN113779190B (en) Event causal relationship identification method, device, electronic equipment and storage medium
CN114492459A (en) Comment emotion analysis method and system based on convolution of knowledge graph and interaction graph
CN114492451A (en) Text matching method and device, electronic equipment and computer readable storage medium
CN116680407A (en) Knowledge graph construction method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
CB03 Change of inventor or designer information

Inventor after: Qin Hong Hui

Inventor after: Du Ruo

Inventor after: Xiang Hai

Inventor after: Hou Cong

Inventor after: Liu Ke

Inventor before: Qin Hong Hui

GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20231127

Address after: Room 1208-208, 12th Floor, Building 2, Fuhai Center, Daliushu, Haidian District, Beijing, 100081

Patentee after: Beijing Zhicheng Excellence Technology Co.,Ltd.

Address before: 401120 No. 1, Floor 3, Building 11, Internet Industrial Park, No. 106, West Section of Jinkai Avenue, Yubei District, Chongqing

Patentee before: CHONGQING XIEZHI TECHNOLOGY Co.,Ltd.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20231220

Address after: No. 1-0286, Juhe 6th Street, Jufuyuan Industrial Park, Tongzhou Economic Development Zone, Beijing, 101127

Patentee after: Beijing Star Cube Digital Technology Co.,Ltd.

Address before: Room 1208-208, 12th Floor, Building 2, Fuhai Center, Daliushu, Haidian District, Beijing, 100081

Patentee before: Beijing Zhicheng Excellence Technology Co.,Ltd.