CN108021555A - Question similarity measurement method based on deep convolutional neural network

Question similarity measurement method based on deep convolutional neural network

Info

Publication number
CN108021555A
CN108021555A (application CN201711162561.1A)
Authority
CN
China
Prior art keywords
question
neural network
convolutional neural
sentence
vector
Prior art date
Legal status
Pending
Application number
CN201711162561.1A
Other languages
Chinese (zh)
Inventor
张家重
赵亚欧
付宪瑞
王玉奎
Current Assignee
Inspur Financial Information Technology Co Ltd
Original Assignee
Inspur Financial Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Inspur Financial Information Technology Co Ltd
Priority to CN201711162561.1A
Publication of CN108021555A
Legal status: Pending


Classifications

    • G06F40/211 — Handling natural language data; natural language analysis; syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F16/2462 — Information retrieval; query processing; approximate or statistical queries
    • G06F16/3329 — Information retrieval; natural language query formulation or dialogue systems
    • G06N3/045 — Neural networks; architecture; combinations of networks


Abstract

The present invention provides a question similarity measurement method based on a deep convolutional neural network, comprising the following steps: S1, generate a raw corpus from pages related to the knowledge domain, collect the Chinese characters occurring in the raw corpus, and generate a character vector for each Chinese character; S2, replace each Chinese character in a question with its character vector to obtain the character-vector set corresponding to the question, and pass this set through a convolutional neural network to obtain the corresponding sentence-meaning vector; S3, combine the questions in pairs and take the absolute value of the cosine of the sentence-meaning vectors of two questions as the similarity between them. By analysing individual characters, the method avoids the influence of word-segmentation errors on subsequent analysis; by extracting whole-sentence features from the entire question with a convolutional neural network, it avoids the sentence-meaning fragmentation caused by word-level similarity matrices.

Description

Question similarity measurement method based on deep convolutional neural network
Technical Field
The invention relates to a question similarity measuring method, in particular to a question similarity measuring method based on a deep convolutional neural network.
Background
The main functions of a financial self-service robot are business consultation, business handling, cash deposit and withdrawal, user guidance and the like. The business consultation function can be understood as a Chinese question-answering system for the banking domain; its key technology is to compute the similarity between a question asked by the user and the questions in a bank question bank and to return the answer of the most similar question. Because natural language, especially spoken language, offers many different ways to express a question with the same meaning, computing the similarity between questions according to their real semantics has become an urgent problem.
Traditional question similarity calculation methods generally fall into two types: keyword-matching methods and machine-learning methods. Keyword-matching methods compute the similarity of two sentences mainly by comparing the counts, positions, order and other information of the keywords the two questions share. Such methods are computationally simple, but they often perform poorly on long sentences and especially on synonymous expressions phrased differently. Machine-learning methods analyse a domain knowledge base to build a model between questions and question semantics, and use it to compute the similarity between different questions. These methods are computationally more complex, but they handle synonyms better and have therefore gradually become mainstream.
In recent years, with the success of deep learning in speech, image and related fields, it has also been introduced into similarity calculation. Chinese patent CN106776545A, "A method for calculating similarity between short texts through a deep convolutional neural network", is a typical example: it first segments the questions into words, then converts each word into a word vector, and finally feeds the similarity matrix formed by all word vectors of the two questions into a convolutional neural network to compute the similarity.
The method mainly has the following problems:
First, Chinese word segmentation cannot be made completely accurate, and its accuracy is closely tied to the specific domain. In the banking domain, for example, the many technical terms generally lower the segmentation accuracy, and this lower accuracy affects all subsequent computation.
Second, such methods use a similarity matrix between word vectors as the measure of question similarity, which decomposes the similarity between questions into similarities between words and destroys the overall semantics of the questions.
Disclosure of Invention
In view of the defects in the prior art, the invention aims to provide a question similarity measurement method based on a deep convolutional neural network, which is used for calculating the similarity between questions according to the implicit semantics between the questions.
The purpose of the invention is realized by the following technical scheme: a question similarity measurement method based on a deep convolutional neural network comprises the following steps:
S1, generate a raw corpus from pages related to the knowledge domain, collect the Chinese characters occurring in the raw corpus, and generate a character vector for each Chinese character;
S2, replace each Chinese character in a question with its character vector to obtain the character-vector set corresponding to the question; pass this set through a convolutional neural network to obtain the corresponding sentence-meaning vector;
S3, combine the questions in pairs, and take the absolute value of the cosine of the sentence-meaning vectors of two questions as the similarity between them.
The technical scheme of the invention is further defined as follows: the method for generating the raw corpus through the knowledge domain related pages in the step S1 comprises the following steps:
S11, write a web crawler in the Python language and crawl web pages related to the knowledge domain;
S12, preprocess the pages by removing markup, invalid characters, mathematical formulas, pictures and tables, then merge all pages into an original raw corpus;
S13, split the original raw corpus at punctuation marks so that every sentence becomes a number of clauses, one clause per line, and merge all clauses into the final raw corpus.
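A minimal sketch of the clause-splitting of steps S12-S13 (the crawling of S11 and the tag-stripping are assumed already done; the function name and the exact punctuation set are illustrative, not specified by the patent):

```python
import re

def build_corpus(pages):
    """Merge cleaned page texts into a corpus with one clause per line (steps S12-S13)."""
    clauses = []
    for text in pages:
        # Split every sentence into clauses at Chinese and Western punctuation.
        for clause in re.split(r"[，。！？；,.!?;]", text):
            clause = clause.strip()
            if clause:
                clauses.append(clause)
    return "\n".join(clauses)
```

For example, a page containing one two-clause sentence yields two corpus lines, one clause per line.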
As a further improvement of the present invention, in step S1 the character vectors are generated with the skip-gram algorithm of the word2vec tool; the window size of the skip-gram algorithm is set to 2, and the algorithm is configured with 3500 common characters plus a UNK token, where UNK replaces any uncommon character outside the 3500 common characters.
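The vocabulary and windowing choices above (3500 common characters plus UNK, window size 2) can be illustrated with a toy generator of skip-gram training pairs. This sketches the data preparation only; the patent trains the vectors with the word2vec tool itself, and these function names are hypothetical:

```python
from collections import Counter

def make_vocab(corpus_chars, max_size=3500):
    """Keep the max_size most frequent characters; all others will map to 'UNK'."""
    counts = Counter(corpus_chars)
    return {ch for ch, _ in counts.most_common(max_size)}

def skipgram_pairs(sentence, vocab, window=2):
    """(center, context) pairs with the patent's window size of 2; rare chars become 'UNK'."""
    toks = [ch if ch in vocab else "UNK" for ch in sentence]
    pairs = []
    for i, center in enumerate(toks):
        # Context positions within `window` characters on either side of the center.
        for j in range(max(0, i - window), min(len(toks), i + window + 1)):
            if j != i:
                pairs.append((center, toks[j]))
    return pairs
```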
As a further improvement of the present invention, the convolutional neural network of step S2 includes a convolution layer and a pooling layer. The convolution layer uses kernels of size 2 × 200, where 2 means that only the association between 2 adjacent characters is considered and 200 is the dimension of the character vector; the number of convolution kernels is 100-200. The pooling layer uses 1-max pooling, i.e., taking the maximum over each dimension of the convolved features.
As a further improvement of the present invention, the convolutional neural network of step S2 computes the sentence-meaning vector as follows:
1) A sentence $S$ contains $n$ Chinese characters, each corresponding to a $d$-dimensional character vector $v_i$; after replacement the sentence is represented as $S' = \{v_1, v_2, \ldots, v_n\}$;
2) $S'$ is input to the convolution layer of the convolutional neural network, which produces the convolved result $u_i^k$ by
$$u_i^k = f(W_k \cdot c_i + b_k)$$
where $c_i = [v_i, v_{i+1}]$ ($0 < i < n$) is the vector formed by concatenating two adjacent character vectors, $W_k$ is the $k$-th convolution kernel matrix of the convolutional neural network, and $b_k$ is the bias vector of the $k$-th kernel;
3) the convolved result $u_i^k$ is input to the pooling layer, which produces the pooled result $p_k$; the pooling is 1-max pooling, computed as
$$p_k = \max_{0 < i < n} u_i^k$$
where $\max$ takes the maximum over all inputs $u_i^k$ of the previous layer;
4) steps 2) and 3) are repeated, the number of repetitions being 1-3;
5) the output $p_k$ of the pooling layer of the last repetition is the sentence-meaning vector of $S$.
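Steps 1) to 5) can be sketched in plain Python. Here the kernels are flattened 2d-dimensional vectors, tanh is an assumed nonlinearity (the patent does not name the activation), and a single convolution-plus-pooling pass is shown:

```python
import math

def sentence_vector(char_vecs, kernels, biases):
    """Bigram convolution followed by 1-max pooling, one pass of steps 2)-3).

    char_vecs : list of d-dimensional character vectors (the replaced sentence S').
    kernels   : list of K flattened kernels W_k, each of length 2*d.
    biases    : list of K scalar biases b_k.
    Returns the K-dimensional pooled vector (p_1, ..., p_K).
    """
    pooled = []
    for W, b in zip(kernels, biases):
        activations = []
        for i in range(len(char_vecs) - 1):
            c = char_vecs[i] + char_vecs[i + 1]        # c_i = [v_i, v_{i+1}]
            z = sum(w * x for w, x in zip(W, c)) + b   # W_k . c_i + b_k
            activations.append(math.tanh(z))           # assumed nonlinearity f
        pooled.append(max(activations))                # 1-max pooling over positions
    return pooled
```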
As a further improvement of the present invention, in step S3 the absolute value of the cosine of the sentence-meaning vectors of two questions is computed by
$$\mathrm{sim}(x, y) = \left| \frac{x \cdot y}{\|x\|\,\|y\|} \right|$$
where $x$ and $y$ are the sentence-meaning vectors of question 1 and question 2 respectively, and the range of $\mathrm{sim}(x, y)$ is $[0, 1]$.
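The absolute-cosine similarity can be written directly; this sketch takes plain Python lists as sentence-meaning vectors:

```python
import math

def sim(x, y):
    """Absolute value of the cosine of two sentence-meaning vectors; range [0, 1]."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return abs(dot / (norm_x * norm_y))
```

The absolute value is what keeps opposite-pointing vectors from producing a negative similarity: `sim([1, 0], [-1, 0])` is 1.0 rather than -1.0.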
As a further improvement of the present invention, the training method of the convolutional neural network that computes the sentence-meaning vectors is:
1) cluster the questions by their answers, so that questions with the same answer form one cluster; combine the questions within a cluster in pairs to generate positive samples, combine the questions across clusters in pairs to generate negative samples, and merge all positive and negative samples into a training set;
2) configure a convolutional neural network on the TensorFlow framework with a maximum of 1000 training iterations, an L2-regularized mean-square-error loss, a batch size of 400, 200 convolution-kernel features, a kernel size of 2 × 200, and 1-max pooling in the pooling layer;
3) take a sample from the training set, replace it with the corresponding character-vector sample, and compute the sentence-meaning vectors of the sample through the whole network;
4) compute the absolute cosine of the sample's sentence-meaning vectors to obtain the sample similarity, and adjust the network weights according to the error between this similarity and the sample label;
5) repeat steps 2) to 4) until the maximum number of training iterations is reached.
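The pairing logic of training step 1) can be sketched as follows; the function name and the 1/0 labels are illustrative, and `qa_pairs` is assumed to be a list of (question, answer) tuples:

```python
from collections import defaultdict
from itertools import combinations

def build_training_set(qa_pairs):
    """Cluster questions by answer; same-cluster pairs are positives (label 1),
    cross-cluster pairs are negatives (label 0)."""
    clusters = defaultdict(list)
    for question, answer in qa_pairs:
        clusters[answer].append(question)
    groups = list(clusters.values())
    samples = []
    for group in groups:                         # positives: within a cluster
        for q1, q2 in combinations(group, 2):
            samples.append((q1, q2, 1))
    for g1, g2 in combinations(groups, 2):       # negatives: across clusters
        for q1 in g1:
            for q2 in g2:
                samples.append((q1, q2, 0))
    return samples
```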
The outstanding effects of the invention are:
(1) Individual characters are analysed, so word-segmentation errors cannot affect the subsequent analysis.
(2) The convolutional neural network extracts whole-sentence features from the question as a whole, avoiding the sentence-meaning fragmentation caused by word similarity matrices.
(3) An absolute-value function is added to the original cosine similarity formula so that its range is [0, 1]; this resolves the mismatch between the range of the sigmoid function (a common neural-network activation) and that of the cosine function, and also prevents negative similarities.
(4) The data needed to train the deep learning model is generated, establishing the association between questions and their semantics.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a schematic diagram of the convolutional neural network structure of the present invention.
FIG. 3 is a flowchart of the similarity analysis method according to an embodiment of the present invention.
Detailed Description
Example one
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
Referring to fig. 1-3, the question similarity measurement method based on the deep convolutional neural network of the present invention includes the following steps:
S1, generate a raw corpus from pages related to the knowledge domain, collect the Chinese characters occurring in the raw corpus, and generate a character vector for each Chinese character;
S2, replace each Chinese character in a question with its character vector to obtain the character-vector set corresponding to the question; pass this set through a convolutional neural network to obtain the corresponding sentence-meaning vector;
S3, combine the questions in pairs, and take the absolute value of the cosine of the sentence-meaning vectors of two questions as the similarity between them.
The rules and modes of operation in steps S1 to S3 above are described in detail below. This embodiment applies the method of the invention to similarity detection for questions in the financial field.
Step S1: generate the financial-field corpus. The specific implementation steps are as follows:
Step 1: write a web crawler in the Python language and crawl finance-related web pages; the sites crawled in this embodiment include the websites of the major banks, the finance sections of the major portals, professional financial websites and the like.
Step 2: preprocess the pages by removing markup, invalid characters, mathematical formulas, pictures, tables and the like, then merge all pages into the original raw corpus.
Step 3: process the raw corpus further by splitting it at punctuation marks so that every sentence becomes individual clauses, one clause per line, and merge all clauses into the final financial corpus.
Next, generate the financial-field character vectors. The specific implementation steps are as follows:
Step 1: configure a word2vec program on the TensorFlow framework, specifically: the skip-gram algorithm with an nce_loss loss function, a sliding window of 2, a character-vector feature size of 200, 3,000,000 training iterations, a mini-batch size of 128, 10 negative samples, and a dictionary size of 3501. These settings can be adjusted as circumstances require.
Step 2: use this program to learn from the financial corpus generated in step S1 and produce a character vector for each entry in the dictionary.
Step S2: the computation by which the convolutional neural network obtains the sentence-meaning vector is as follows:
Step 1: a sentence $S$ contains $n$ Chinese characters, each corresponding to a $d$-dimensional character vector $v_i$; after replacement the sentence is represented as $S' = \{v_1, v_2, \ldots, v_n\}$.
Step 2: $S'$ is input to the convolution layer of the convolutional neural network, which produces the convolved result $u_i^k$ by
$$u_i^k = f(W_k \cdot c_i + b_k)$$
where $c_i = [v_i, v_{i+1}]$ ($0 < i < n$) is the vector formed by concatenating two adjacent character vectors, $W_k$ is the $k$-th convolution kernel matrix of the network, and $b_k$ is the bias vector of the $k$-th kernel.
Step 3: the convolved result $u_i^k$ is input to the pooling layer, which produces the pooled result $p_k$ using 1-max pooling:
$$p_k = \max_{0 < i < n} u_i^k$$
where $\max$ takes the maximum over all inputs $u_i^k$ of the previous layer.
Step 4: steps 2 and 3 are repeated, the number of repetitions being 1-3.
Step 5: the output $p_k$ of the pooling layer of the last repetition is the sentence-meaning vector of $S$.
In step S3, the absolute value of the cosine of the sentence-meaning vectors of two questions is computed by
$$\mathrm{sim}(x, y) = \left| \frac{x \cdot y}{\|x\|\,\|y\|} \right|$$
where $x$ and $y$ are the sentence-meaning vectors of question 1 and question 2 respectively, and the range of $\mathrm{sim}(x, y)$ is $[0, 1]$.
Step S2: the training of the convolutional neural network that generates the sentence-meaning vectors proceeds as follows:
Step 1: cluster the questions by their answers, so that questions with the same answer form one cluster; combine the questions within a cluster in pairs to generate positive samples, combine the questions across clusters in pairs to generate negative samples, and merge all positive and negative samples into a training set.
Step 2: configure a convolutional neural network program on the TensorFlow framework, specifically: a maximum of 1000 training iterations, an L2-regularized mean square error (MSE) loss, a batch size of 400, 200 convolution-kernel features, a kernel size of 2 × 200, and 1-max pooling in the pooling layer.
Step 3: use the training sample set T generated in step 1.
Step 4: randomly take a training sample from T, sample = (Sen_1, Sen_2, p), and replace it with a pair of character-vector sets S_vec1, S_vec2; specifically, the i-th Chinese character C_i of question Sen_1 is replaced by the character vector generated in step S1. For example, take the question 我想转账 ("I want to transfer money"), whose characters are 我, 想, 转 and 账. Assuming the vector of 我 is {0.5, 0.7, 0.6} and the vectors of 想, 转 and 账 are {0.1, 0.2, 0.5}, {0.2, 0.3, 0.7} and {0.9, 0.2, 0.7} respectively, the whole sentence is represented as the combination of all the vectors, i.e. { {0.5, 0.7, 0.6}, {0.1, 0.2, 0.5}, {0.2, 0.3, 0.7}, {0.9, 0.2, 0.7} }.
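The character-to-vector replacement of step 4 reduces to a dictionary lookup, shown here with the toy 3-dimensional vectors of the example (the real vectors are 200-dimensional, and the UNK fallback mirrors the dictionary setup of step S1):

```python
def sentence_to_vectors(sentence, char_vectors, unk="UNK"):
    """Replace every character of a question with its trained character vector."""
    return [char_vectors.get(ch, char_vectors.get(unk)) for ch in sentence]
```

With the example vectors, `sentence_to_vectors("我想转账", vecs)` returns the four character vectors in sentence order.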
Step 5: input S_vec1 and S_vec2 into the convolutional neural network to obtain the sentence-meaning vectors S_rep1 and S_rep2.
Step 6: compute the similarity of S_rep1 and S_rep2 with the absolute-cosine formula $\mathrm{sim}(x, y) = \left| x \cdot y / (\|x\|\,\|y\|) \right|$, then adjust the neural-network weights according to the error between this similarity and the sample label.
Step 7: repeat steps 2 to 6 until a termination condition is met (such as reaching the maximum number of training iterations or a specified error). This embodiment uses the maximum number of training iterations as the termination condition.
After the above steps, this embodiment can use the trained convolutional neural network to measure question similarity. The specific steps are:
Step 1: load the trained convolutional neural network model.
Step 2: load the bank question-answer library; convert every question S_req_i in the library into its sentence-meaning vector S_rep_i as in step S2.
Step 3: receive the user's question S_request and convert it into the sentence-meaning vector S_rep_request as in step S2.
Step 4: compute in turn, with the improved cosine function, the similarity sim_i between S_rep_request and each S_rep_i; take the maximum similarity sim_max, and return the answer of the question corresponding to sim_max as the final answer.
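Steps 1-4 of the answering phase amount to a nearest-neighbour search under the absolute-cosine similarity. In this sketch, `bank` is an assumed list of (sentence_vector, answer) pairs precomputed from the question-answer library, and the function names are illustrative:

```python
import math

def _abs_cosine(x, y):
    """Absolute cosine similarity, as in step S3."""
    dot = sum(a * b for a, b in zip(x, y))
    return abs(dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))))

def answer_question(user_vec, bank):
    """Return the answer whose question vector is most similar to the user's question vector."""
    best_vec, best_answer = max(bank, key=lambda item: _abs_cosine(user_vec, item[0]))
    return best_answer
```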
This embodiment analyses individual characters, so word-segmentation errors cannot affect the subsequent analysis. The convolutional neural network extracts whole-sentence features from the question as a whole, avoiding the sentence-meaning fragmentation caused by word similarity matrices. An absolute-value function is added to the original cosine similarity formula so that its range is [0, 1]; this resolves the mismatch between the range of the sigmoid function (a common neural-network activation) and that of the cosine function, and also prevents negative similarities. The data needed to train the deep learning model is generated, establishing the association between questions and their semantics. Besides the embodiment above, the invention admits other embodiments; all technical solutions formed by equivalent substitution or equivalent transformation fall within the scope of protection of the present invention.

Claims (7)

1. A question similarity measurement method based on a deep convolutional neural network is characterized by comprising the following steps:
S1, generating a raw corpus from pages related to the knowledge domain, collecting the Chinese characters occurring in the raw corpus, and generating a character vector for each Chinese character;
S2, replacing each Chinese character in a question with its character vector to obtain the character-vector set corresponding to the question, the set being passed through a convolutional neural network to obtain the corresponding sentence-meaning vector; and
S3, combining the questions in pairs and taking the absolute value of the cosine of the sentence-meaning vectors of two questions as the similarity between them.
2. The question similarity measurement method based on a deep convolutional neural network according to claim 1, wherein the raw corpus is generated from the knowledge-domain-related pages in step S1 as follows:
S11, writing a web crawler in the Python language and crawling web pages related to the knowledge domain;
S12, preprocessing the pages by removing markup, invalid characters, mathematical formulas, pictures and tables, and merging all pages into an original raw corpus; and
S13, splitting the original raw corpus at punctuation marks so that every sentence becomes a number of clauses, one clause per line, and merging all clauses into the final raw corpus.
3. The question similarity measurement method based on a deep convolutional neural network according to claim 1, wherein in step S1 the character vectors are generated with the skip-gram algorithm of the word2vec tool, the window size of the skip-gram algorithm is set to 2, and the algorithm is configured with 3500 common characters plus a UNK token, the UNK token replacing any uncommon character outside the 3500 common characters.
4. The question similarity measurement method based on a deep convolutional neural network according to claim 1, wherein the convolutional neural network of step S2 includes a convolution layer and a pooling layer; the convolution layer uses kernels of size 2 × 200, where 2 means that only the association between 2 adjacent characters is considered and 200 is the dimension of the character vector, and the number of convolution kernels is 100-200; the pooling layer uses 1-max pooling, i.e., taking the maximum over each dimension of the convolved features.
5. The question similarity measurement method based on a deep convolutional neural network according to claim 1, wherein the convolutional neural network obtains the sentence-meaning vector in step S2 as follows:
1) a sentence $S$ contains $n$ Chinese characters, each corresponding to a $d$-dimensional character vector $v_i$; after replacement the sentence is represented as $S' = \{v_1, v_2, \ldots, v_n\}$;
2) $S'$ is input to the convolution layer of the convolutional neural network, which produces the convolved result $u_i^k$ by
$$u_i^k = f(W_k \cdot c_i + b_k)$$
where $c_i = [v_i, v_{i+1}]$ ($0 < i < n$) is the vector formed by concatenating two adjacent character vectors, $W_k$ is the $k$-th convolution kernel matrix of the convolutional neural network, and $b_k$ is the bias vector of the $k$-th kernel;
3) the convolved result $u_i^k$ is input to the pooling layer, which produces the pooled result $p_k$ using 1-max pooling:
$$p_k = \max_{0 < i < n} u_i^k$$
where $\max$ takes the maximum over all inputs $u_i^k$ of the previous layer;
4) steps 2) and 3) are repeated, the number of repetitions being 1-3; and
5) the output $p_k$ of the pooling layer of the last repetition is the sentence-meaning vector of $S$.
6. The question similarity measurement method based on a deep convolutional neural network according to claim 1, wherein in step S3 the absolute value of the cosine of the sentence-meaning vectors of two questions is computed by
$$\mathrm{sim}(x, y) = \left| \frac{x \cdot y}{\|x\|\,\|y\|} \right|$$
where $x$ and $y$ are the sentence-meaning vectors of question 1 and question 2 respectively, and the range of $\mathrm{sim}(x, y)$ is $[0, 1]$.
7. The question similarity measurement method based on a deep convolutional neural network according to claim 1, wherein the convolutional neural network that computes the sentence-meaning vectors is trained as follows:
1) clustering the questions by their answers so that questions with the same answer form one cluster, combining the questions within a cluster in pairs to generate positive samples, combining the questions across clusters in pairs to generate negative samples, and merging all positive and negative samples into a training set;
2) configuring a convolutional neural network on the TensorFlow framework with a maximum of 1000 training iterations, an L2-regularized mean-square-error loss, a batch size of 400, 200 convolution-kernel features, a kernel size of 2 × 200, and 1-max pooling in the pooling layer;
3) taking a sample from the training set, replacing it with the corresponding character-vector sample, and computing the sentence-meaning vectors of the sample through the whole network;
4) computing the absolute cosine of the sample's sentence-meaning vectors to obtain the sample similarity, and adjusting the network weights according to the error between this similarity and the sample label; and
5) repeating steps 2) to 4) until the maximum number of training iterations is reached.
CN201711162561.1A 2017-11-21 2017-11-21 Question similarity measurement method based on deep convolutional neural network Pending CN108021555A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711162561.1A CN108021555A (en) 2017-11-21 2017-11-21 Question similarity measurement method based on deep convolutional neural network

Publications (1)

Publication Number Publication Date
CN108021555A (en) 2018-05-11

Family

ID=62080014

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711162561.1A Pending CN108021555A (en) 2017-11-21 2017-11-21 Question similarity measurement method based on deep convolutional neural network

Country Status (1)

Country Link
CN (1) CN108021555A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9659248B1 (en) * 2016-01-19 2017-05-23 International Business Machines Corporation Machine learning and training a computer-implemented neural network to retrieve semantically equivalent questions using hybrid in-memory representations
CN106776545A (en) * 2016-11-29 2017-05-31 西安交通大学 Method for calculating similarity between short texts through deep convolutional neural network
CN106815311A (en) * 2016-12-21 2017-06-09 杭州朗和科技有限公司 Question matching method and device
CN106844741A (en) * 2017-02-13 2017-06-13 哈尔滨工业大学 Domain-specific question answering method
CN106897568A (en) * 2017-02-28 2017-06-27 北京大数医达科技有限公司 Method and apparatus for structuring medical records

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109102809A (en) * 2018-06-22 2018-12-28 北京光年无限科技有限公司 Dialogue method and system for an intelligent robot
CN108984694A (en) * 2018-07-04 2018-12-11 龙马智芯(珠海横琴)科技有限公司 Web page processing method and device, storage medium, and electronic device
CN109062892A (en) * 2018-07-10 2018-12-21 东北大学 Chinese sentence similarity calculation method based on Word2Vec
CN109241249A (en) * 2018-07-16 2019-01-18 阿里巴巴集团控股有限公司 Method and device for determining burst problem
CN109241249B (en) * 2018-07-16 2021-09-14 创新先进技术有限公司 Method and device for determining burst problem
CN109145290A (en) * 2018-07-25 2019-01-04 东北大学 Semantic similarity calculation method based on word vector and self-attention mechanism
CN109145290B (en) * 2018-07-25 2020-07-07 东北大学 Semantic similarity calculation method based on word vector and self-attention mechanism
CN109101494A (en) * 2018-08-10 2018-12-28 哈尔滨工业大学(威海) Method, device, and computer-readable storage medium for calculating Chinese sentence semantic similarity
CN110969005A (en) * 2018-09-29 2020-04-07 航天信息股份有限公司 Method and device for determining similarity between entity corpora
CN110969005B (en) * 2018-09-29 2023-10-31 航天信息股份有限公司 Method and device for determining similarity between entity corpora
CN109543179A (en) * 2018-11-05 2019-03-29 北京康夫子科技有限公司 Method and system for normalizing colloquial symptom descriptions
CN111666482B (en) * 2019-03-06 2022-08-02 珠海格力电器股份有限公司 Query method and device, storage medium and processor
CN111666482A (en) * 2019-03-06 2020-09-15 珠海格力电器股份有限公司 Query method and device, storage medium and processor
CN109918491B (en) * 2019-03-12 2022-07-29 焦点科技股份有限公司 Intelligent customer service question matching method based on knowledge base self-learning
CN109918491A (en) * 2019-03-12 2019-06-21 焦点科技股份有限公司 Intelligent customer service question matching method based on knowledge base self-learning
CN111753081A (en) * 2019-03-28 2020-10-09 百度(美国)有限责任公司 Text classification system and method based on deep SKIP-GRAM network
CN111753081B (en) * 2019-03-28 2023-06-09 百度(美国)有限责任公司 System and method for text classification based on deep SKIP-GRAM network
CN110032635B (en) * 2019-04-22 2023-01-20 齐鲁工业大学 Problem pair matching method and device based on depth feature fusion neural network
CN110032635A (en) * 2019-04-22 2019-07-19 齐鲁工业大学 Problem pair matching method and device based on depth feature fusion neural network
CN110309503A (en) * 2019-05-21 2019-10-08 昆明理工大学 Subjective question scoring model and scoring method based on deep learning BERT-CNN
CN110348024A (en) * 2019-07-23 2019-10-18 天津汇智星源信息技术有限公司 Intelligent identification system based on legal knowledge graph
CN111669410A (en) * 2020-07-24 2020-09-15 中国航空油料集团有限公司 Industrial control network negative sample data generation method, device, server and medium

Similar Documents

Publication Publication Date Title
CN108021555A (en) Question sentence similarity measurement method based on deep convolutional neural networks
CN110442760B (en) Synonym mining method and device for question-answer retrieval system
CN107562792B (en) Question-answer matching method based on deep learning
CN106997376B (en) Question and answer sentence similarity calculation method based on multi-level features
CN106776545B (en) Method for calculating similarity between short texts through deep convolutional neural network
CN107818164A (en) Intelligent question answering method and system
CN111831789B (en) Question-answering text matching method based on multi-layer semantic feature extraction structure
CN111563384B (en) Evaluation object identification method and device for E-commerce products and storage medium
CN112035730B (en) Semantic retrieval method and device and electronic equipment
CN111368049A (en) Information acquisition method and device, electronic equipment and computer readable storage medium
CN104615767A (en) Searching-ranking model training method and device and search processing method
CN110362678A (en) Method and apparatus for automatically extracting Chinese text keywords
CN112052319B (en) Intelligent customer service method and system based on multi-feature fusion
CN111444704A (en) Network security keyword extraction method based on deep neural network
CN113486645A (en) Text similarity detection method based on deep learning
Rahman et al. NLP-based automatic answer script evaluation
CN110334204B (en) Exercise similarity calculation recommendation method based on user records
CN112434533A (en) Entity disambiguation method, apparatus, electronic device, and computer-readable storage medium
Mahmoodvand et al. Semi-supervised approach for Persian word sense disambiguation
CN114169447B (en) Event detection method based on self-attention convolution bidirectional gating cyclic unit network
CN116757188A (en) Cross-language information retrieval training method based on alignment query entity pairs
CN110287396A (en) Text matching method and device
CN111767388B (en) Candidate pool generation method
CN111159405B (en) Irony detection method based on background knowledge
CN110413956B (en) Text similarity calculation method based on bootstrapping

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180511