CN107729309B

CN107729309B - Deep learning-based Chinese semantic analysis method and device

Info

Publication number: CN107729309B
Application number: CN201610658579.XA
Authority: CN
Inventors: 郑骁庆; 陈军; 吕永; 尚国强
Original assignee: Fudan University; ZTE Corp
Current assignee: Fudan University; ZTE Corp
Priority date: 2016-08-11
Filing date: 2016-08-11
Publication date: 2022-11-08
Anticipated expiration: 2036-08-11
Also published as: WO2018028077A1; CN107729309A

Abstract

The invention discloses a method and a device for Chinese semantic analysis based on deep learning, which relate to the technical field of natural language processing, and the method comprises the following steps: the mobile terminal obtains a standard Chinese text by performing standardized processing on the acquired Chinese text; the mobile terminal identifies special type vocabularies and/or custom vocabularies and/or Chinese names of the standard Chinese text, and takes an identification result as a constraint condition; the mobile terminal obtains a Chinese word segmentation and part-of-speech tagging model according to the constraint conditions and by utilizing deep learning, and performs Chinese word segmentation and part-of-speech analysis on the standardized Chinese text to obtain the word segmentation and part-of-speech of the standardized Chinese text; and the mobile terminal performs Chinese semantic analysis on the standardized Chinese text by utilizing the word segmentation, the part of speech and/or the naming identification type of the standardized Chinese text.

Description

Deep learning-based Chinese semantic analysis method and device

Technical Field

The invention relates to the technical field of natural language processing, in particular to a Chinese semantic analysis method and device based on deep learning.

Background

Chinese natural language understanding has advanced substantially; in particular, a large body of research results has been produced on Chinese word segmentation and part-of-speech analysis. Although automatic analysis technology for Chinese still lags behind that for English and Japanese, the accumulated research has made it possible to develop systems capable of high-level semantic analysis and understanding and to put them into practical use. Applying semantic analysis technology can greatly improve a system's level of intelligence and its ability to respond. Semantic analysis is both a key point and a difficult point of text information analysis and processing, and it is the basis of information extraction, user intention analysis, information fusion, question answering, intelligent reasoning and the like.

On the other hand, deep learning is a recent breakthrough in artificial intelligence research; it ended a period of about ten years in which artificial intelligence made little progress, and it has rapidly influenced industry. Unlike a narrow artificial intelligence system, which simulates a function and can only complete one specific task, deep learning is a general artificial intelligence technique that can cope with a variety of situations and problems. It has been applied very successfully in fields such as image recognition and speech recognition, and has also produced results in natural language processing (mainly for English).

Disclosure of Invention

The technical problem solved by the scheme provided by the embodiment of the invention is that the automatic analysis of Chinese semantics is inaccurate.

The method for Chinese semantic analysis based on deep learning provided by the embodiment of the invention comprises the following steps:

the mobile terminal obtains a standard Chinese text by performing standardized processing on the acquired Chinese text;

the mobile terminal performs special type vocabulary recognition and/or user-defined vocabulary recognition and/or Chinese naming recognition on the standard Chinese text, and takes a recognition result as a constraint condition;

the mobile terminal obtains a Chinese word segmentation and part-of-speech tagging model according to the constraint conditions and by utilizing deep learning, and performs Chinese word segmentation and part-of-speech analysis on the standardized Chinese text to obtain the word segmentation and part-of-speech of the standardized Chinese text;

and the mobile terminal performs Chinese semantic analysis on the standardized Chinese text by utilizing the word segmentation, the part of speech and/or the naming identification type of the standardized Chinese text.

Preferably, the step in which the mobile terminal performs special type vocabulary recognition and/or user-defined vocabulary recognition and/or Chinese naming recognition on the standard Chinese text and takes the recognition result as a constraint condition includes:

the mobile terminal performs special type vocabulary recognition on the standard Chinese text by using a special type vocabulary template to obtain a special type vocabulary recognition result of the standard Chinese text, and the obtained special type vocabulary recognition result is used as a first constraint condition.

and the mobile terminal carries out user-defined vocabulary recognition on the standard Chinese text by using the user-defined dictionary to obtain a user-defined vocabulary recognition result of the standard Chinese text, and the obtained user-defined vocabulary recognition result is used as a second constraint condition.

the mobile terminal conducts Chinese naming recognition on the standard Chinese text by utilizing a Chinese naming recognition model obtained through deep learning to obtain a Chinese naming recognition result of the standard Chinese text, and the obtained Chinese naming recognition result is used as a third constraint condition.

Preferably, the constraint condition includes at least one of a first constraint condition, a second constraint condition, and a third constraint condition, or a combination thereof.

Preferably, the step in which the mobile terminal performs Chinese semantic analysis on the standardized Chinese text by using the word segmentation, part of speech and/or naming recognition type of the standardized Chinese text includes:

and the mobile terminal classifies the standard Chinese text according to the characters of the standard Chinese text and a Chinese sentence model based on a convolutional neural network with dynamic k-max pooling to obtain a sentence classification result of the standard Chinese text.

the mobile terminal determines a bidirectional LSTM (Long Short-Term Memory) Chinese semantic role labeling model according to the sentence classification result, and performs semantic role labeling on each word segment and symbol of the standard Chinese text according to the word segments, parts of speech and/or naming types of the standard Chinese text and the bidirectional LSTM Chinese semantic role labeling model, to obtain the semantic role labeling result of the standard Chinese text.

and the mobile terminal carries out structured processing on the standard Chinese text according to the semantic role marking result and the event model of the standard Chinese text, and extracts key information of the standard Chinese text.

Preferably, the key information of the standard Chinese text includes an event name, key attributes, and attribute values.

The device for Chinese semantic analysis based on deep learning provided by the embodiment of the invention comprises the following components:

the normalization processing module is used for performing normalization processing on the acquired Chinese text to obtain a normalized Chinese text;

the recognition module is used for carrying out special type vocabulary recognition and/or user-defined vocabulary recognition and/or Chinese naming recognition on the standard Chinese text, and taking a recognition result as a constraint condition;

and the analysis module is used for performing Chinese word segmentation and part-of-speech analysis on the standard Chinese text according to the constraint conditions and by utilizing deep learning to obtain a Chinese word segmentation and part-of-speech tagging model, obtaining the segmentation and part-of-speech of the standard Chinese text, and performing Chinese semantic analysis on the standard Chinese text by utilizing the segmentation and part-of-speech and/or naming identification type of the standard Chinese text.

According to the scheme provided by the embodiment of the invention, semantic analysis is performed on input Chinese sentences, structured analysis results are output, and tasks that require high-level semantic analysis support, such as event analysis, information extraction and sentiment analysis, are completed using the structured analysis results.

Drawings

FIG. 1 is a flowchart of a method for deep learning-based Chinese semantic analysis according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of an apparatus for deep learning-based Chinese semantic analysis according to an embodiment of the present invention;

FIG. 3 is a block diagram of Chinese semantic analysis according to an embodiment of the present invention;

FIG. 4 is a diagram of a Chinese sequence annotation network model structure according to an embodiment of the present invention;

FIG. 5 is a block diagram of a convolutional neural network based on pooling with dynamic k-max according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of semantic role labeling of bidirectional LSTM provided by an embodiment of the present invention.

Detailed Description

The preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings, and it should be understood that the preferred embodiments described below are only for the purpose of illustrating and explaining the present invention, and are not to be construed as limiting the present invention.

Fig. 1 is a flowchart of a method for deep learning-based Chinese semantic analysis according to an embodiment of the present invention. As shown in Fig. 1, the method includes:

step S101: the mobile terminal obtains a standard Chinese text by performing standardized processing on the acquired Chinese text;

step S102: the mobile terminal identifies special type vocabularies and/or custom vocabularies and/or Chinese names of the standard Chinese text, and takes an identification result as a constraint condition;

step S103: the mobile terminal obtains a Chinese word segmentation and part-of-speech tagging model according to the constraint conditions and by deep learning, and performs Chinese word segmentation and part-of-speech analysis on the standardized Chinese text to obtain the word segmentation and part-of-speech of the standardized Chinese text;

step S104: and the mobile terminal performs Chinese semantic analysis on the standardized Chinese text by utilizing the word segmentation, the part of speech and/or the naming identification type of the standardized Chinese text.

The mobile terminal performs special type vocabulary recognition and/or user-defined vocabulary recognition and/or Chinese naming recognition on the standard Chinese text, and takes the recognition result as a constraint condition, wherein the constraint condition comprises the following steps: the mobile terminal performs special type vocabulary recognition on the standard Chinese text by using a special type vocabulary template to obtain a special type vocabulary recognition result of the standard Chinese text, and the obtained special type vocabulary recognition result is used as a first constraint condition.

The mobile terminal performs special type vocabulary recognition and/or user-defined vocabulary recognition and/or Chinese naming recognition on the standard Chinese text, and takes the recognition result as a constraint condition, wherein the constraint condition comprises the following steps: and the mobile terminal carries out user-defined vocabulary recognition on the standard Chinese text by using the user-defined dictionary to obtain a user-defined vocabulary recognition result of the standard Chinese text, and the obtained user-defined vocabulary recognition result is used as a second constraint condition.

The mobile terminal performs special type vocabulary recognition and/or user-defined vocabulary recognition and/or Chinese naming recognition on the standard Chinese text, and takes the recognition result as a constraint condition, wherein the constraint condition comprises the following steps: the mobile terminal conducts Chinese naming recognition on the standard Chinese text by utilizing a Chinese naming recognition model obtained through deep learning to obtain a Chinese naming recognition result of the standard Chinese text, and the obtained Chinese naming recognition result is used as a third constraint condition.

Wherein the constraint condition comprises at least one of a first constraint condition, a second constraint condition and a third constraint condition or a combination thereof.

The special type vocabulary recognition, user-defined vocabulary recognition and/or Chinese naming recognition act as a pre-segmentation and pre-tagging step: the special type vocabulary, user-defined vocabulary and/or Chinese named entities recognized in this step are not segmented or part-of-speech tagged again in the subsequent word segmentation and part-of-speech tagging step, and thereby form the constraint condition.

The mobile terminal performs Chinese semantic analysis on the standard Chinese text by using the segmentation, the part of speech and/or the naming identification type of the standard Chinese text, and the method comprises the following steps: and the mobile terminal classifies the standard Chinese text according to the characters of the standard Chinese text and a Chinese sentence model based on a convolutional neural network with dynamic k-max pooling to obtain a sentence classification result of the standard Chinese text.

The mobile terminal performs Chinese semantic analysis on the standardized Chinese text by using the word segmentation, the part of speech and/or the naming recognition type of the standardized Chinese text, which comprises the following step: the mobile terminal determines a bidirectional long short-term memory (LSTM) Chinese semantic role labeling model according to the sentence classification result, and performs semantic role labeling on each word segment and symbol of the standard Chinese text according to the word segments, parts of speech and/or naming types of the standard Chinese text and the bidirectional LSTM Chinese semantic role labeling model, to obtain the semantic role labeling result of the standard Chinese text.

The mobile terminal performs Chinese semantic analysis on the standardized Chinese text by using the word segmentation, the part of speech and/or the naming recognition type of the standardized Chinese text, which comprises the following step: the mobile terminal performs structured processing on the standard Chinese text according to the semantic role labeling result and the event model of the standard Chinese text, and extracts the key information of the standard Chinese text. Specifically, the key information of the standard Chinese text includes an event name, key attributes, and attribute values.

Fig. 2 is a schematic diagram of a device for deep learning-based Chinese semantic analysis according to an embodiment of the present invention. As shown in Fig. 2, the device includes: a normalization processing module 201, configured to perform normalization processing on the acquired Chinese text to obtain a normalized Chinese text; a recognition module 202, configured to perform special type vocabulary recognition and/or user-defined vocabulary recognition and/or Chinese naming recognition on the standard Chinese text, and to take the recognition result as a constraint condition; and an analysis module 203, configured to obtain a Chinese word segmentation and part-of-speech tagging model according to the constraint conditions and by using deep learning, to perform Chinese word segmentation and part-of-speech analysis on the normalized Chinese text to obtain its word segmentation and part of speech, and to perform Chinese semantic analysis on the normalized Chinese text by using its word segmentation, part of speech and/or naming recognition type.

The analysis module 203 comprises: a sentence classification unit, configured to classify the standard Chinese text according to its characters and a Chinese sentence model based on a convolutional neural network with dynamic k-max pooling, to obtain a sentence classification result of the standard Chinese text.

The analysis module 203 further comprises: a semantic role labeling unit, configured to determine a bidirectional long short-term memory (LSTM) Chinese semantic role labeling model according to the sentence classification result, and to perform semantic role labeling on elements such as single characters, word segments and special type words in the standard Chinese text according to the word segments, parts of speech and/or naming recognition types of the standard Chinese text and the bidirectional LSTM Chinese semantic role labeling model, to obtain the semantic role labeling result of the standard Chinese text.

The analysis module 203 further comprises: a structured processing unit, configured to perform structured processing on the standard Chinese text according to the semantic role labeling result and the event model of the standard Chinese text, and to extract the key information of the standard Chinese text. Specifically, the key information of the standard Chinese text includes an event name, key attributes and attribute values. The event name may correspond to the sentence classification result. For example, for the text of short messages received by the terminal, the sentence classification model may distinguish categories such as bank bills, flights and trains, appointments and weather forecasts, and the resulting sentence category can be used as the event name. The key attributes come from the semantic role labeling result. For example, in a bank bill short message, the key attributes are labeled as categories such as bill date, consumption amount, repayment date and repayment amount, and the attribute values are the specific values in the original short message text that correspond to these categories, such as a specific date or a specific amount.
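The relationship between event name, key attributes and attribute values can be illustrated with a minimal Python sketch. The class `Event`, the function `build_event`, the category name `bank_bill` and the role names are hypothetical and chosen only for illustration; the embodiment does not prescribe a data structure.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    name: str                                        # sentence classification result
    attributes: dict = field(default_factory=dict)   # key attribute -> attribute value


def build_event(sentence_class, labeled_segments):
    """Collect (segment, role) pairs into an event; segments labeled 'O' are skipped."""
    event = Event(name=sentence_class)
    for segment, role in labeled_segments:
        if role == "O":
            continue
        # Segments that share a role are concatenated in order of appearance.
        event.attributes[role] = event.attributes.get(role, "") + segment
    return event


# Hypothetical semantic role labeling output for a bank bill short message.
labeled = [
    ("消费", "O"),
    ("1200.50", "consumption_amount"),
    ("元", "consumption_amount"),
    ("还款日", "O"),
    ("2016-09-05", "repayment_date"),
]
event = build_event("bank_bill", labeled)
print(event)
```

Here the sentence category supplies the event name, each non-"O" role becomes a key attribute, and the text under that role becomes the attribute value.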

Fig. 3 is a schematic diagram of the modules for Chinese semantic analysis according to an embodiment of the present invention. As shown in Fig. 3, a deep learning technique is used to perform semantic analysis on an input Chinese sentence, a structured analysis result is output, and tasks that require high-level semantic analysis support, such as event analysis, information extraction and sentiment analysis, are completed using the structured analysis result. The modules specifically include:

Text normalization processing: the input Chinese sentence is normalized, which comprises: unified encoding, conversion of traditional Chinese characters to simplified Chinese characters, conversion of full-width characters to half-width characters, special character conversion, and replacement of non-standard expressions (for example, Internet slang is replaced by standard expressions).

Custom vocabulary recognition: a user-defined dictionary is used to recognize user-defined vocabulary, which comprises: application domain vocabulary, idioms, food, places, works, equipment, person names, place names, and organization names.

Special type vocabulary recognition: templates are defined for recognizing e-mail addresses, websites, dates, times, percentages, quantifiers, currencies, telephone numbers, numbers and foreign words; the items of these types contained in the input sentence are recognized with the templates and replaced by special characters.

Chinese naming recognition: a corpus for Chinese naming recognition is prepared, and the Chinese sequence labeling network model shown in Fig. 4 is used to train a Chinese naming recognition model. The model recognizes the person names, place names and organization names in the input sentence, that is, it identifies the specific person names, place names and organization names in the sentence and saves the corresponding naming types (for example, "Person", "Location" and "Organization", respectively).

Chinese word segmentation and part-of-speech tagging: with the results of special type vocabulary recognition and/or custom vocabulary recognition and/or Chinese naming recognition as constraints, a joint Chinese word segmentation and part-of-speech tagging corpus is prepared, and the Chinese sequence labeling network model shown in Fig. 4 is used to train a model that jointly labels word segmentation and part of speech. The model then performs joint word segmentation and part-of-speech analysis on the input sentence.

Sentence classification: before semantic role labeling, sentences are classified using the sentence semantic representation generated by the convolutional neural network with dynamic k-max pooling shown in Fig. 5, and input sentences that are of no interest to the application are filtered out. A Chinese sentence classification model based on this network is trained on a sentence classification corpus that contains a balanced set of sentences of each type together with negative sample sentences (Chinese sentences of no interest to the application); the model then classifies input sentences and filters out those of no interest to the application.

Semantic role labeling: a bidirectional LSTM semantic labeling network model is selected according to the sentence classification result (that is, different parsing models are used for different sentence categories), and the bidirectional LSTM semantic labeling network shown in Fig. 6 then performs semantic role labeling on the sentence using the word segments, parts of speech and/or naming types of the standard text. For training, a semantic role labeling corpus is prepared for each sentence category from the word segments, parts of speech and/or naming types, and a bidirectional LSTM Chinese semantic role labeling model is trained on it; the model is then used to label the semantic roles of sentences.

Event analysis: according to the semantic role labeling result, the structured representation produced by semantic analysis is packaged with an event template, and the event name, key attributes and attribute values are extracted.
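The first stages of the pipeline above (normalization, template-based recognition and dictionary-based recognition) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the mapping `TRAD_TO_SIMP`, the patterns in `TEMPLATES`, the type names and the span format are hypothetical, Unicode NFKC normalization stands in for full-width to half-width conversion, and a real system would need a complete traditional-to-simplified table.

```python
import re
import unicodedata

# Hypothetical, tiny traditional-to-simplified table (illustration only).
TRAD_TO_SIMP = {"銀": "银", "賬": "账", "單": "单"}

# Hypothetical templates for a few special vocabulary types; earlier entries win.
TEMPLATES = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")),
    ("URL", re.compile(r"https?://[^\s,。]+")),
    ("DATE", re.compile(r"\d{4}-\d{1,2}-\d{1,2}")),
    ("NUMBER", re.compile(r"\d+(?:\.\d+)?")),
]


def normalize(text):
    # NFKC folds full-width letters, digits and punctuation to half-width forms.
    text = unicodedata.normalize("NFKC", text)
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def find_constraints(text, custom_words):
    """Return non-overlapping (start, end, type) spans: templates first, then dictionary."""
    spans = []
    taken = [False] * len(text)

    def claim(start, end, kind):
        # Accept a span only if none of its characters is already covered.
        if not any(taken[start:end]):
            spans.append((start, end, kind))
            for i in range(start, end):
                taken[i] = True

    for kind, pattern in TEMPLATES:
        for match in pattern.finditer(text):
            claim(match.start(), match.end(), kind)
    # Longer dictionary words are tried first so they win over their substrings.
    for word in sorted(custom_words, key=len, reverse=True):
        start = text.find(word)
        while start != -1:
            claim(start, start + len(word), "CUSTOM")
            start = text.find(word, start + 1)
    return sorted(spans)


# \uff0d is a full-width hyphen and \uff0c a full-width comma.
raw = "账单日２０１６\uff0d０８\uff0d１１\uff0c消费１２００元\uff0c招商銀行"
text = normalize(raw)
spans = find_constraints(text, {"招商银行"})
print(text)
print(spans)
```

The spans returned here play the role of constraints: the later segmentation and tagging step would treat each span as a unit that is not segmented again.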

The training corpus for semantic role labeling is formatted with one word segment per row, in the order in which the segments appear in the sentence. Each row has 5 columns, which are, in order: the word segment itself (e-mail addresses, websites, dates, times, percentages, quantifiers, currencies, telephone numbers, foreign words and the like are replaced by English tags; single characters and punctuation marks also count as independent segments), the semantic tag ("O" denotes a class unrelated to the task), the part-of-speech tag, the naming recognition tag, and the original form of the segment in the sentence. Sentence samples are separated by an empty row.
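A corpus in this five-column layout could be read with a short Python routine such as the one below. The column separator is not stated in the text, so a tab is assumed here; the function name `parse_corpus` and all tag values in the sample are hypothetical.

```python
def parse_corpus(text):
    """Split a corpus into sentences; each row becomes a 5-tuple
    (segment, semantic_tag, pos_tag, naming_tag, original_form)."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # An empty row closes the current sentence sample.
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split("\t")  # assumption: tab-separated columns
        if len(columns) != 5:
            raise ValueError(f"expected 5 columns, got {len(columns)}: {line!r}")
        current.append(tuple(columns))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample: two sentences, with an amount replaced by the tag MONEY.
sample = (
    "消费\tO\tv\tO\t消费\n"
    "MONEY\tconsumption_amount\tm\tO\t1200元\n"
    "\n"
    "今天\tO\tt\tO\t今天\n"
)
sentences = parse_corpus(sample)
print(len(sentences), sentences[0][1])
```

Keeping the original form in the fifth column lets the attribute value be recovered after the segment itself has been replaced by a tag.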

When a deep learning-based sequence labeling task such as Chinese word segmentation, part-of-speech tagging or Chinese naming recognition is performed, the decoding algorithm takes the results of special type vocabulary recognition and/or custom vocabulary recognition as constraints (for Chinese word segmentation and part-of-speech tagging, the Chinese naming recognition result can be added to the constraints). This comprises:

(1) The types of the e-mail, the website, the date, the time, the percentage, the quantifier, the currency, the telephone number, the foreign words and the like are identified in advance through the template.

(2) User-defined vocabularies are supported, including domain vocabulary, idioms, food, places, works, equipment, person names, place names, organization names and the like.

(3) Viterbi decoding is performed on the prediction output of the deep learning network, with the results of special type vocabulary recognition and/or user-defined vocabulary recognition as constraints.
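One possible way to apply such constraints during Viterbi decoding is to forbid, at each constrained position, every tag that contradicts the pre-recognized span. The Python sketch below illustrates this idea; the function name, the `allowed` argument, the tag set and the all-zero transition scores are assumptions made for the example and are not taken from the embodiment.

```python
import math


def constrained_viterbi(emissions, transitions, tags, allowed=None):
    """emissions[i][t]: network score of tag t at position i.
    transitions[p][t]: score of moving from tag p to tag t.
    allowed[i]: set of tags permitted at position i, or None for no constraint."""

    def emit(i, tag):
        if allowed is not None and allowed[i] is not None and tag not in allowed[i]:
            return -math.inf  # tag excluded by a pre-recognition constraint
        return emissions[i][tag]

    best = [{t: emit(0, t) for t in tags}]
    back = []
    for i in range(1, len(emissions)):
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][t])
            scores[t] = best[-1][prev] + transitions[prev][t] + emit(i, t)
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


tags = ["B", "I", "E", "S"]
transitions = {p: {t: 0.0 for t in tags} for p in tags}   # assumption: neutral transitions
emissions = [{"B": 0.1, "I": 0.0, "E": 0.1, "S": 1.0} for _ in range(4)]

free_path = constrained_viterbi(emissions, transitions, tags)
# Positions 1 and 2 were pre-recognized as one two-character word.
fixed_path = constrained_viterbi(emissions, transitions, tags,
                                 allowed=[None, {"B"}, {"E"}, None])
print(free_path, fixed_path)
```

Without constraints the network's preference for single-character words wins everywhere; with the constraint, the pre-recognized word is kept intact while the remaining positions are still decoded freely.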

FIG. 4 is a structural diagram of the Chinese sequence labeling network model provided in the embodiment of the present invention, which can be used for Chinese naming recognition and for Chinese word segmentation and part-of-speech tagging (note: the training corpora, the trained model data and the constraint conditions differ between tasks). As shown in FIG. 4, the deep learning Chinese sequence labeling network model takes a Chinese sentence as input and outputs sequence labeling results character by character (covering Chinese characters, punctuation marks and any other characters that may occur in a sentence). The label set is formed by combining word segmentation labels with task-specific labels. Taking Chinese naming recognition as an example, if "PER" represents the person-name label, the following sentence:

"诸葛亮是刘备军事集团的军师。" ("Zhuge Liang is the military adviser of Liu Bei's military group.")

The corresponding labeling results are:

"B_PER I_PER E_PER O B_PER E_PER O O O O O O O O".

wherein: "B" represents the beginning character of the vocabulary, "I" represents the middle character of the vocabulary, "E" represents the ending character of the vocabulary, and "O" represents a character unrelated to the task. Also, "S" represents a character capable of being individually formed into a word (e.g., a single character or a punctuation mark).
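To make the label scheme concrete, the following Python sketch turns a character-level label sequence back into named entities. The function `extract_entities` is hypothetical, and the Chinese character string is an assumed rendering of the example sentence, chosen because it has 14 characters to match the 14 labels.

```python
def extract_entities(chars, labels):
    """Rebuild (text, type) entities from per-character B/I/E/S/O labels."""
    entities, buffer, kind = [], [], None
    for ch, label in zip(chars, labels):
        position, _, label_type = label.partition("_")
        if position == "S" and label_type:
            entities.append((ch, label_type))      # single-character entity
            buffer, kind = [], None
        elif position == "B":
            buffer, kind = [ch], label_type        # start a new entity
        elif position == "I" and buffer and label_type == kind:
            buffer.append(ch)                      # continue the entity
        elif position == "E" and buffer and label_type == kind:
            buffer.append(ch)
            entities.append(("".join(buffer), kind))
            buffer, kind = [], None
        else:
            # "O" or an inconsistent label: drop any partial entity.
            buffer, kind = [], None
    return entities


chars = list("诸葛亮是刘备军事集团的军师。")   # assumed original sentence
labels = "B_PER I_PER E_PER O B_PER E_PER O O O O O O O O".split()
entities = extract_entities(chars, labels)
print(entities)
```

The two person names are recovered from the B/I/E spans, while every character labeled "O" is ignored.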

The label of a character is typically related to its surrounding characters, so a window model is used, i.e. the character and surrounding characters are taken as input when estimating the likelihood that the current character belongs to a label (see fig. 4). If the window size is set to 5, this character and two characters on the left and right thereof are indicated as an input window. If the number of characters on the left and right is less than the specified size of the window, padding is used instead.
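The window construction with padding can be sketched in a few lines of Python. The padding symbol `<PAD>` and the function name `context_windows` are assumptions for illustration.

```python
PAD = "<PAD>"  # hypothetical padding symbol


def context_windows(chars, size=5):
    """Return one window of `size` characters centred on each input character."""
    if size % 2 == 0:
        raise ValueError("window size must be odd so the character sits in the centre")
    half = size // 2
    padded = [PAD] * half + list(chars) + [PAD] * half
    return [padded[i:i + size] for i in range(len(chars))]


windows = context_windows("刘备军", size=5)
for window in windows:
    print(window)
```

Each character yields exactly one window, so the number of windows equals the sentence length regardless of the window size.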

Each input character is converted to a corresponding vector representation by lookup in a word vector table. The representation of each character may be randomly generated or pre-trained with an unsupervised method. The vectors of the characters in a window are then concatenated to form the feature representation of that window. This representation passes through a linear network layer (the intermediate hidden layer), is transformed nonlinearly by a Sigmoid function, and finally passes through another linear layer that outputs a vector whose length equals the number of task labels; each element of the vector represents the likelihood of the corresponding label.
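The forward computation for one window (lookup, concatenation, linear layer, Sigmoid, linear layer) can be sketched in plain Python as below. The class `WindowTagger`, the random initialization range and the small dimensions used in the example are assumptions for illustration, and training is omitted.

```python
import math
import random


def linear(weights, bias, vector):
    # One output per weight row: dot(row, vector) + bias.
    return [sum(w * x for w, x in zip(row, vector)) + b for row, b in zip(weights, bias)]


def sigmoid(vector):
    return [1.0 / (1.0 + math.exp(-x)) for x in vector]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class WindowTagger:
    def __init__(self, vocab, num_tags, dim, window, hidden, seed=0):
        rng = random.Random(seed)
        # Randomly generated character vectors (they could also be pre-trained).
        self.embedding = {ch: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for ch in vocab}
        self.w1 = random_matrix(hidden, dim * window, rng)
        self.b1 = [0.0] * hidden
        self.w2 = random_matrix(num_tags, hidden, rng)
        self.b2 = [0.0] * num_tags

    def score_window(self, window_chars):
        # Concatenate the vectors of all characters in the window.
        features = [x for ch in window_chars for x in self.embedding[ch]]
        hidden = sigmoid(linear(self.w1, self.b1, features))
        return linear(self.w2, self.b2, hidden)   # one score per tag


vocab = ["<PAD>", "刘", "备", "军"]
tagger = WindowTagger(vocab, num_tags=4, dim=8, window=5, hidden=16)
scores = tagger.score_window(["<PAD>", "<PAD>", "刘", "备", "军"])
print(scores)
```

The example uses small dimensions to stay fast; the same code accepts larger values for `dim` and `hidden`.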

Given a Chinese sentence, the network outputs a matrix in which each element f_θ(t | i) is an estimate of the likelihood that the i-th character of the sentence carries label t, where θ denotes the parameters of the network. Because neighboring labels depend strongly on each other in sequence labeling, a matrix A is introduced whose element A_ij indicates the likelihood of moving from label i to label j (A is also included in the parameter set θ). Given a sentence s[1:n] containing n characters, a score can then be computed for any label sequence t[1:n] of equal length from the network outputs and the matrix A.

With the parameters given, the Viterbi decoding algorithm can be used to obtain the label sequence with the highest score as the labeling result.
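As a hedged illustration, a sequence score of this kind is commonly formed by summing, along the sentence, the transition score and the network score of each label. The expression below is an assumption consistent with the definitions of f_θ and A given above, not a formula quoted from this text; t_0 denotes an assumed start label.

```latex
\mathrm{score}\left(s_{[1:n]}, t_{[1:n]}, \theta\right)
  = \sum_{i=1}^{n} \left( A_{t_{i-1} t_i} + f_{\theta}(t_i \mid i) \right)
```

Under this form, Viterbi decoding finds the label sequence that maximizes the sum in time linear in the sentence length.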

Training requires that, on the training set, the probability of the correct label sequence of each sample be as large as possible, where (s, t) denotes one sample in the training set, consisting of a sentence s and its correct label sequence t.

Training uses the gradient descent method: all parameters of the network are updated along the gradient of this objective, with λ denoting the learning step size.
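One common way to write such an objective and update rule is sketched below. It is an assumption consistent with the description above rather than the exact formulas of the embodiment; score(s, t, θ) stands for the score of label sequence t for sentence s, and the probability is obtained by normalizing over all label sequences t' of the same length.

```latex
\begin{aligned}
p(t \mid s, \theta) &= \frac{\exp \mathrm{score}(s, t, \theta)}{\sum_{t'} \exp \mathrm{score}(s, t', \theta)} \\
\theta^{*} &= \arg\max_{\theta} \sum_{(s,t)} \log p(t \mid s, \theta) \\
\theta &\leftarrow \theta + \lambda \, \frac{\partial \log p(t \mid s, \theta)}{\partial \theta}
\end{aligned}
```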

The Chinese sequence labeling network and the learning algorithm based on deep learning are characterized in that:

(1) The input Chinese sentence undergoes the necessary preprocessing, which comprises: unified encoding, conversion of traditional Chinese characters to simplified Chinese characters, conversion of full-width characters to half-width characters, special character conversion, replacement of non-standard expressions, and uniform conversion of the recognized e-mail addresses, websites, dates, times, percentages, quantifiers, currencies, telephone numbers, numbers and foreign words into special characters.

(2) When Viterbi decoding is used, the results of the user-defined vocabulary recognition, the special type vocabulary recognition and the Chinese naming recognition are used as constraints.

(3) A network configuration with 100-dimensional word vectors, a window size of 3 or 5, and 300 neurons in the intermediate hidden layer is used (the specific parameters depend on the size of the corpus sample set).

Fig. 5 is a structural diagram of a convolutional neural network with dynamic k-max pooling according to an embodiment of the present invention. As shown in Fig. 5, the network takes a Chinese sentence as input, generates a semantic representation of the whole sentence, and predicts from that representation the task-related category to which the sentence belongs.

The network first converts each character in the input sentence into a corresponding vector representation by looking it up in a word vector table. The representation of each character may be randomly generated or pre-trained with an unsupervised method. After conversion, the sentence forms a feature matrix.

In the second step, on each dimension of the feature matrix, a convolution converts the features inside a window of the set size into a new feature. The window slides from left to right across the feature matrix and produces higher-level features equal in number to the columns of the feature matrix. Different convolution kernels are used for different dimensions, which yields a feature map of the input feature matrix. A set of different convolution kernels may be used at the same time to generate multiple feature maps. On each feature map, k-max pooling extracts the k most significant features, that is, the k largest feature values in each dimension, while keeping those values in the order in which they appear in the input feature map. A hardtanh nonlinear function is then applied to the k-max pooled result matrix. The second step can be stacked in several layers, each new layer placed on top of the previous one. The k value of the k-max pooling in the last layer is fixed (a hyper-parameter of the model); the k value of each earlier layer is the larger of the last layer's k value and the value of the formula (H - h)/H × L rounded up.

In the third step, all feature values obtained from the last layer are concatenated to form the semantic representation of the whole sentence. On the basis of this representation, the sentence type is predicted through a linear layer and a Softmax layer.
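The order-preserving k-max pooling and the layer-dependent choice of k can be sketched in Python as follows. The reading of the k formula used in `dynamic_k`, with H as the total number of convolutional layers, h as the index of the current layer and L as the sentence length, is an assumption based on the usual formulation of dynamic k-max pooling; the function names are hypothetical.

```python
import math


def k_max_pooling(values, k):
    """Keep the k largest values of one feature-map row in their original order."""
    if k >= len(values):
        return list(values)
    top = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in sorted(top)]


def dynamic_k(layer, total_layers, sentence_length, k_top):
    """Assumed rule: max(k_top, ceil((H - h) / H * L))."""
    scaled = math.ceil((total_layers - layer) / total_layers * sentence_length)
    return max(k_top, scaled)


pooled = k_max_pooling([0.5, 0.1, 0.9, 0.3, 0.7], k=3)
k_first = dynamic_k(layer=1, total_layers=2, sentence_length=18, k_top=5)
k_last = dynamic_k(layer=2, total_layers=2, sentence_length=18, k_top=5)
print(pooled, k_first, k_last)
```

Sorting the selected indices, rather than the values, is what keeps the pooled features in their original left-to-right order.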

Because of the Softmax layer, the network output can be regarded as a probability distribution over the categories. Training uses gradient descent, and the goal of training is to increase the probability of correct predictions on the training set while reducing the probability of wrong predictions.
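The output layer and one training update can be illustrated as follows. This is a sketch under stated assumptions: it covers only the final linear and Softmax layers, uses the negative log-likelihood of the correct category as the objective (one common way to raise the probability of correct predictions), and all function names are hypothetical.

```python
import numpy as np


def softmax(scores: np.ndarray) -> np.ndarray:
    """Turn raw category scores into a probability distribution."""
    exp = np.exp(scores - np.max(scores))  # subtract the max for numerical stability
    return exp / exp.sum()


def predict_proba(sentence_vec, weight, bias):
    """Linear layer followed by Softmax."""
    return softmax(weight @ sentence_vec + bias)


def nll_loss(probs: np.ndarray, gold_class: int) -> float:
    """Negative log-probability assigned to the correct category."""
    return float(-np.log(probs[gold_class]))


def sgd_step(sentence_vec, weight, bias, gold_class, lr=0.1):
    """One gradient-descent update of the linear layer for a single example."""
    grad_scores = predict_proba(sentence_vec, weight, bias)
    grad_scores[gold_class] -= 1.0  # gradient of the loss w.r.t. the raw scores
    new_weight = weight - lr * np.outer(grad_scores, sentence_vec)
    new_bias = bias - lr * grad_scores
    return new_weight, new_bias
```

Each update raises the probability of the correct category for that example and lowers the probabilities of the others.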

The Chinese sentence classification model based on the convolutional neural network with the dynamic k-max pooling is characterized in that:

(1) The input Chinese sentence undergoes the necessary preprocessing, which comprises: unifying the character encoding, converting traditional Chinese characters to simplified Chinese characters, converting full-width characters to half-width characters, converting special characters, replacing non-standard phrases, and uniformly converting recognized e-mail addresses, web addresses, dates, times, percentages, quantifiers, currency amounts, telephone numbers, numbers and foreign words into special symbols.

(2) The model takes characters (including Chinese characters, punctuation marks and any other characters that may appear in a sentence) as input, which suits Chinese well and prevents Chinese word segmentation errors from propagating into the sentence classification task.

(3) One-dimensional convolution is used, and the number of columns of the feature map output by a convolutional layer is the same as the number of columns of the input feature matrix, which increases the processing speed of the network.

(4) The network employs two convolutional layers, wherein: the window size of the first layer is 5 and its number of feature maps is 2; the window size of the second layer is 3 and its number of feature maps is 3. The k value of the k-max pooling of the last layer is 5.
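The one-dimensional, per-dimension convolution of features (3) and (4) can be sketched as follows. This is an illustration only: zero padding at the sentence boundaries is an assumption made here so that the output has as many columns as the input, and the function name is hypothetical. The sketch covers a single layer; how the feature maps of the first layer are combined in the second layer is not specified above and is not modeled.

```python
import numpy as np


def conv1d_per_dimension(features: np.ndarray, kernels: np.ndarray) -> np.ndarray:
    """Row-wise 1-D convolution with a separate kernel for every dimension.

    features: (d, n) matrix, one column per character.
    kernels:  (m, d, w) array, m feature maps with one length-w kernel per dimension.
    Returns an (m, d, n) array: m feature maps, each with n columns.
    """
    n_maps, dim, window = kernels.shape
    if features.shape[0] != dim:
        raise ValueError("kernel and feature dimensions differ")
    n_cols = features.shape[1]
    left = (window - 1) // 2
    right = window - 1 - left
    # Assumed zero padding so that every column has a full window around it.
    padded = np.pad(features, ((0, 0), (left, right)))
    out = np.empty((n_maps, dim, n_cols))
    for col in range(n_cols):
        patch = padded[:, col:col + window]               # shape (d, w)
        out[:, :, col] = np.sum(kernels * patch, axis=2)  # shape (m, d)
    return out
```

With a window of 5 and 2 feature maps, as in the first layer above, a feature matrix of shape (d, n) yields an output of shape (2, d, n).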

Fig. 6 is a schematic diagram of semantic role labeling with a bidirectional LSTM according to an embodiment of the present invention. As shown in Fig. 6, different semantic role labeling models are used for different sentence classification results. When labeling semantic roles, the word segments, parts of speech and/or name recognition types are arranged in order and used as input, and the sentence is labeled word segment by word segment using the semantic tag set associated with the sentence category.

The input to the network at each time step (corresponding to each word of the input sentence) is the concatenation of the vectors into which the current word, its part of speech and/or its name recognition type are converted (the name recognition type is the category assigned in Chinese name recognition, for example "Person", "Location" and "Organization" for the names of persons, places and organizations respectively). Two LSTMs process the input sentence from left to right (forward) and from right to left (backward), respectively. For each word, each LSTM outputs a vector representation, and the concatenation of the forward and backward LSTM outputs serves as the vector representation of the word (fusing information about the word itself and its left and right context). This representation is fed to a linear layer to predict the tag to which the word belongs.
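The input construction and the bidirectional pass can be sketched with NumPy as follows. This is a minimal sketch, not the model of the embodiment: it uses a plain LSTM cell with randomly initialized weights, borrows the dimensions given later in the description (30, 10 and 10 for the three vectors, and 50 blocks per LSTM, read here as a hidden size of 50), and all function names and the example tokens are hypothetical.

```python
import numpy as np

WORD_DIM, POS_DIM, TYPE_DIM, HIDDEN = 30, 10, 10, 50
INPUT_DIM = WORD_DIM + POS_DIM + TYPE_DIM


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_lstm(rng, input_dim=INPUT_DIM, hidden=HIDDEN):
    """Random input weights, recurrent weights and bias for the four gates."""
    return (rng.normal(0, 0.1, (4 * hidden, input_dim)),
            rng.normal(0, 0.1, (4 * hidden, hidden)),
            np.zeros(4 * hidden))


def lstm_pass(inputs, params):
    """Run one LSTM over the rows of `inputs`; return one output vector per row."""
    w_in, w_rec, bias = params
    hidden = w_rec.shape[1]
    h, c = np.zeros(hidden), np.zeros(hidden)
    outputs = []
    for x in inputs:
        i, f, o, g = np.split(w_in @ x + w_rec @ h + bias, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)


def bilstm(inputs, fwd_params, bwd_params):
    """Concatenate the forward outputs with the re-aligned backward outputs."""
    forward = lstm_pass(inputs, fwd_params)
    backward = lstm_pass(inputs[::-1], bwd_params)[::-1]
    return np.concatenate([forward, backward], axis=1)


def embed(tokens, word_emb, pos_emb, type_emb):
    """Concatenate word, part-of-speech and type vectors for every token."""
    return np.stack([np.concatenate([word_emb[w], pos_emb[p], type_emb[t]])
                     for w, p, t in tokens])


rng = np.random.default_rng(0)
tokens = [("5714", "D", "DIGIT"), ("的", "U", "NONE"), ("账户", "NN", "NONE")]
word_emb = {w: rng.normal(size=WORD_DIM) for w, _, _ in tokens}
pos_emb = {p: rng.normal(size=POS_DIM) for _, p, _ in tokens}
type_emb = {t: rng.normal(size=TYPE_DIM) for _, _, t in tokens}
inputs = embed(tokens, word_emb, pos_emb, type_emb)
features = bilstm(inputs, init_lstm(rng), init_lstm(rng))
print(inputs.shape, features.shape)  # (3, 50) (3, 100)
```

Each row of `features` would then be passed to a linear layer that scores the candidate tags for that word.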

The dependency between the predicted word tags can be further exploited on top of the bidirectional LSTM model, giving the bidirectional LSTM with transition probabilities. That is, given a Chinese sentence, the network outputs a matrix in which each element f_θ(t | i) is an estimate of the likelihood that the i-th word in the sentence belongs to tag t, where θ denotes the parameters of the network. In the semantic labeling task, since there is also a certain dependency between adjacent tags, a matrix A_ij is introduced to indicate the likelihood of jumping from tag i to tag j (A is also included in the parameter set θ). Given a sentence s[1:n] containing n words, a score can be estimated for any tag sequence t[1:n] of equal length:
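The score formula itself is absent from this text. Assuming the usual sentence-level form for models of this kind, and using only the quantities f_θ and A defined above, the score can be sketched as the sum of the per-word tag scores and the transition scores between adjacent tags:

```latex
s\left(s_{[1:n]},\, t_{[1:n]},\, \theta\right)
  = \sum_{i=1}^{n} f_{\theta}\left(t_i \mid i\right)
  + \sum_{i=2}^{n} A_{t_{i-1}\, t_i}
```

Some formulations add a further term for the transition from an initial tag to t_1; whether the original formula includes one is not recoverable from the text.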

Given the network parameters, the Viterbi decoding algorithm can be used to obtain the tag sequence with the highest score as the labeling result. Training requires that, on the training set, the probability of the correct semantic label sequence for each sample be maximized. If the current network parameters produce a wrong prediction, the gradient of the objective function with respect to each parameter is calculated, and the parameters are updated accordingly by gradient descent.
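Viterbi decoding over the network scores and the transition matrix can be sketched as follows. This is a minimal sketch under stated assumptions: the sequence score is taken to be the sum of the per-word scores f_θ(t | i) and the transition scores A between adjacent tags, no score is attached to the first tag's predecessor, and the function name is hypothetical.

```python
import numpy as np


def viterbi_decode(emissions: np.ndarray, transitions: np.ndarray):
    """Return the highest-scoring tag sequence and its score.

    emissions:   (n, T) array, emissions[i, t] is the score of tag t for word i.
    transitions: (T, T) array, transitions[p, t] is the score of jumping from p to t.
    """
    n_words, n_tags = emissions.shape
    score = emissions[0].copy()              # best score of a path ending in each tag
    backptr = np.zeros((n_words, n_tags), dtype=int)
    for i in range(1, n_words):
        # cand[p, t]: best path ending in tag p at word i-1, then jumping to tag t.
        cand = score[:, None] + transitions
        backptr[i] = np.argmax(cand, axis=0)
        score = np.max(cand, axis=0) + emissions[i]
    best_last = int(np.argmax(score))
    path = [best_last]
    for i in range(n_words - 1, 0, -1):      # follow the back-pointers to the start
        path.append(int(backptr[i, path[-1]]))
    path.reverse()
    return path, float(score[best_last])
```

The cost grows linearly with the sentence length and quadratically with the number of tags, which avoids enumerating every possible tag sequence.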

The Chinese semantic role labeling model of the bidirectional LSTM is characterized in that:

(1) At each time step (corresponding to each word of the input sentence), the LSTM network takes as input the concatenation of the vectors corresponding to the word segment, its part of speech and/or its name recognition type.

(2) The input Chinese sentence undergoes the necessary preprocessing, which comprises: unifying the character encoding, converting traditional Chinese characters to simplified Chinese characters, converting full-width characters to half-width characters, converting special characters, replacing non-standard phrases, and uniformly converting recognized e-mail addresses, web addresses, dates, times, percentages, quantifiers, currency amounts, telephone numbers, numbers and foreign words into special symbols.

(3) A bidirectional LSTM is used to generate a feature representation for each Chinese word.

(4) The model uses the following key parameters: the dimension of the vocabulary feature vector is 30, the dimension of the part-of-speech feature vector is 10, the dimension of the type feature vector is 10, the number of blocks of each LSTM is 50, and each Block comprises 1 Cell unit.

(5) For the bidirectional LSTM with transition probabilities, the transition probabilities between semantic tags are introduced as well, and Viterbi decoding is then used to label the semantic roles of the Chinese sentence.

The following is a description of specific embodiments of the present invention:

For example, the mobile phone receives a short message: "Your account with tail number 5714 completed a cash deposit transaction at 11:15 on July 16; the amount is 1300.00 yuan and the balance is 3456.03 yuan. [Agricultural Bank of China]".

First, the original text is normalized. For example, some short messages write symbols such as "[" and "]" in different forms, so the various symbols, full-width and half-width forms, and other variant forms need to be standardized; unifying them makes subsequent processing easier.
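The symbol and width normalization can be sketched as follows. This is an illustration only: the bracket table is a hypothetical example of variant forms, the function name is an assumption, and the other normalization steps (encoding unification, traditional-to-simplified conversion, phrase replacement) are not covered.

```python
# Full-width ASCII variants occupy U+FF01..U+FF5E, exactly 0xFEE0 above
# their half-width counterparts U+0021..U+007E.
FULL_TO_HALF_OFFSET = 0xFEE0

# Hypothetical table of bracket variants mapped to one standard form.
BRACKET_VARIANTS = {"【": "[", "】": "]", "〔": "[", "〕": "]"}


def normalize(text: str) -> str:
    """Map full-width characters and bracket variants to a single standard form."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == "\u3000":                      # ideographic (full-width) space
            ch = " "
        elif 0xFF01 <= code <= 0xFF5E:          # full-width ASCII range
            ch = chr(code - FULL_TO_HALF_OFFSET)
        out.append(BRACKET_VARIANTS.get(ch, ch))
    return "".join(out)
```

After this step, for instance, a bank name enclosed in "【" and "】" and one enclosed in "[" and "]" are handled identically by the later recognition steps.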

Then the words of special types are recognized, mainly by searching the text string with regular expressions, which recognizes:

3-6: DIGIT 5714

11-16: DATE 07月16日 (July 16)

17-22: TIME 11时15分 (11:15)

35-42: CURRENCY 1300.00元 (1300.00 yuan)

46-53: CURRENCY 3456.03元 (3456.03 yuan)

At the same time, the punctuation marks in the text, such as the positions of "[" and "]", can be identified.

Using the name recognition unit or the custom dictionary (a specific word that the name recognition unit cannot recognize can usually be added to the custom dictionary; for example, bank keywords are added to the custom dictionary in advance), the following can be recognized:

56-61: BANK 中国农业银行 (Agricultural Bank of China)

Note: the two numbers in the first column are the start and end positions of the special word in the original text (characters are counted from 0).
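The recognition of special-type words and custom-dictionary words can be sketched with Python's `re` module. This is a sketch under stated assumptions: the Chinese sentence is reconstructed from the character positions listed above rather than quoted from the original message, and the patterns, the dictionary and the function name are hypothetical examples.

```python
import re

# Chinese text of the example message, reconstructed from the listed positions (assumption).
TEXT = "您尾号5714的账户于07月16日11时15分完成一笔现存交易,金额为1300.00元,余额3456.03元。[中国农业银行]"

# More specific patterns come first so that a bare number is the last resort.
SPECIAL = re.compile(
    r"(?P<DATE>\d{1,2}月\d{1,2}日)"
    r"|(?P<TIME>\d{1,2}时\d{1,2}分)"
    r"|(?P<CURRENCY>\d+(?:\.\d+)?元)"
    r"|(?P<DIGIT>\d+)"
)

# Hypothetical custom dictionary: word -> type label.
CUSTOM_DICT = {"中国农业银行": "BANK"}


def recognize(text: str):
    """Return (start, end, type, word) tuples; end is inclusive, counting from 0."""
    spans = [(m.start(), m.end() - 1, m.lastgroup, m.group())
             for m in SPECIAL.finditer(text)]
    for word, label in CUSTOM_DICT.items():
        start = text.find(word)
        while start != -1:
            spans.append((start, start + len(word) - 1, label, word))
            start = text.find(word, start + len(word))
    return sorted(spans)


for start, end, label, word in recognize(TEXT):
    print(f"{start}-{end}: {label} {word}")
```

On the reconstructed sentence this yields the positions 3-6, 11-16, 17-22, 35-42, 46-53 and 56-61 given in the example.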

Then, after preprocessing, the recognized words form the constraints for the next step (that is, these words are not segmented or part-of-speech tagged again). The constraints can be represented by a string that gives the word position and part of speech of each character, for example:

"O O O B_D I_D I_D E_D O O O O B_NT I_NT I_NT I_NT I_NT E_NT B_NT I_NT I_NT I_NT I_NT E_NT O O O O O O O O S_PU O O O B_D I_D I_D I_D I_D I_D I_D E_D S_PU O O B_D I_D I_D I_D I_D I_D I_D E_D S_PU S_PU B_NR I_NR I_NR I_NR I_NR E_NR S_PU"

In the above, "O" denotes other characters, for which word segmentation and part-of-speech recognition are performed in the next step. "B_D" represents the beginning of a numeric word, "I_D" the middle of a numeric word, and "E_D" the end of a numeric word. The part before the underscore "_" indicates the position of the character within the word and the part after it indicates the part of speech, so that segmentation and part-of-speech tagging are performed jointly. "B", "I" and "E" indicate that a character is at the beginning, in the middle, or at the end of a word, respectively. "S" represents a single-character word; for example, a punctuation mark is represented by "S_PU". "NT" represents a temporal noun and "NR" a proper noun; other parts of speech, such as verbs and adjectives, may be specified in advance.
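Building the constraint string from the recognized spans can be sketched as follows. This is a minimal sketch, assuming the example sentence as reconstructed from the character positions; the mapping from recognition types to parts of speech, the punctuation set and the function name are hypothetical.

```python
# Chinese text of the example message, reconstructed from the listed positions (assumption).
TEXT = "您尾号5714的账户于07月16日11时15分完成一笔现存交易,金额为1300.00元,余额3456.03元。[中国农业银行]"

# Recognized spans: (start, end inclusive, recognition type).
SPANS = [(3, 6, "DIGIT"), (11, 16, "DATE"), (17, 22, "TIME"),
         (35, 42, "CURRENCY"), (46, 53, "CURRENCY"), (56, 61, "BANK")]

# Hypothetical mapping from recognition type to part-of-speech tag.
TYPE_TO_POS = {"DIGIT": "D", "CURRENCY": "D", "DATE": "NT", "TIME": "NT", "BANK": "NR"}
PUNCTUATION = set(",。[]")


def build_constraints(text, spans):
    """Return one tag per character: B/I/E/S plus part of speech, or O if unconstrained."""
    tags = ["O"] * len(text)
    for start, end, label in spans:
        pos = TYPE_TO_POS[label]
        if start == end:
            tags[start] = f"S_{pos}"
            continue
        tags[start] = f"B_{pos}"
        for i in range(start + 1, end):
            tags[i] = f"I_{pos}"
        tags[end] = f"E_{pos}"
    for i, ch in enumerate(text):
        if tags[i] == "O" and ch in PUNCTUATION:
            tags[i] = "S_PU"      # punctuation outside any span is a single-character word
    return tags


print(" ".join(build_constraints(TEXT, SPANS)))
```

Characters left as "O" are the only ones for which the segmentation and tagging model is free to choose a label.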

After word segmentation and part-of-speech tagging, each word in the text is separated, with the word before "/" and its part of speech after "/", as follows (each token is a gloss of one Chinese word, and hyphens join multi-word glosses):

"you/PN tail-number/NN 5714/D of/U account/NN at/P July-16/NT 11:15/NT complete/V one/D [classifier]/M cash-deposit/V transaction/V ,/PU amount/NN is/V 1300.00-yuan/D ,/PU balance/NN 3456.03-yuan/D ./PU [/PU Agricultural-Bank-of-China/NR ]/PU"

In the above example, the word "tail number" is a common noun, denoted by "NN"; the word "5714" is a number, denoted by "D"; the word "transaction" is a verb, denoted by "V"; and the word "[" is punctuation, denoted by "PU". By analogy, the normalized text is segmented into word segment units (single characters and punctuation marks also count as independent word segments), and the part of speech of each word in the text is tagged.

During semantic analysis, words of special types can be represented uniformly, that is, each such word is replaced by a tag symbol, which gives:

"you/PN tail-number/NN DIGIT/D of/U account/NN at/P DATE/NT TIME/NT complete/V one/D [classifier]/M cash-deposit/V transaction/V ,/PU amount/NN is/V CURRENCY/D ,/PU balance/NN CURRENCY/D ./PU [/PU BANK/NR ]/PU"

Through semantic analysis based on the word segments, parts of speech and/or name recognition types, the words of interest to the user can be extracted. For a bank notification short message, for example, key information such as the date, time, account number, amount credited or debited, balance and bank name can be extracted. The key information, that is, the semantic roles, is labeled after the corresponding words, separated from them by "/". A "/" followed by "O" marks content that does not need to be extracted.

Semantic analysis result for this example: "you/O tail-number/O 5714/ACCOUNT of/O account/O at/O July-16/DATE 11:15/TIME complete/O one/O [classifier]/O cash-deposit/O transaction/O ,/O amount/O is/O 1300.00-yuan/INCOME ,/O balance/O 3456.03-yuan/BALANCE ./O [/O Agricultural-Bank-of-China/BANK ]/O".

Here "ACCOUNT", "DATE", "TIME", "INCOME", "BALANCE" and "BANK" are semantic role labels attached to the corresponding word segments.

Finally, prompting, interaction and the like are carried out in an interface or an application according to the extracted key information. For example, on receiving the above short message, the user may be shown the following prompt:

Event: account credit

Account number: 5714

Date: July 16

Time: 11:15

Amount credited: 1300.00 yuan

Balance: 3456.03 yuan

Bank: Agricultural Bank of China
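Turning the semantic role labeling result into such a structured prompt can be sketched as follows. This is an illustration under stated assumptions: the Chinese tokens are reconstructed from the example, the event name is supplied by the caller (for instance from the sentence classification result), and the function name and record layout are hypothetical.

```python
# Semantic role labeling result as (word, role) pairs; "O" marks words to ignore.
LABELED = [
    ("您", "O"), ("尾号", "O"), ("5714", "ACCOUNT"), ("的", "O"), ("账户", "O"),
    ("于", "O"), ("07月16日", "DATE"), ("11时15分", "TIME"), ("完成", "O"),
    ("一", "O"), ("笔", "O"), ("现存", "O"), ("交易", "O"), (",", "O"),
    ("金额", "O"), ("为", "O"), ("1300.00元", "INCOME"), (",", "O"),
    ("余额", "O"), ("3456.03元", "BALANCE"), ("。", "O"),
    ("[", "O"), ("中国农业银行", "BANK"), ("]", "O"),
]


def extract_key_info(labeled_tokens, event_name):
    """Collect every word with a semantic role into an attribute-value record."""
    info = {"EVENT": event_name}
    for word, role in labeled_tokens:
        if role != "O":
            info[role] = word   # a repeated role keeps the last value in this sketch
    return info


record = extract_key_info(LABELED, "account credit")
for key, value in record.items():
    print(f"{key}: {value}")
```

The resulting record holds an event name together with attribute and value pairs, which an application can render as a prompt or use for further interaction.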

The scheme provided by the embodiment of the invention adopts a deep-learning-based Chinese sequence labeling network and learning algorithm, a Chinese sentence classification model based on a convolutional neural network with dynamic k-max pooling, a Chinese semantic role labeling model based on a bidirectional LSTM with transition probabilities, and a way of integrating these key technologies. The developed system can be deployed on mobile computing platforms with relatively limited computing resources, such as mobile phones, can complete complex Chinese semantic analysis tasks without additional computing resources or equipment, and can greatly improve the response speed and user satisfaction of related applications.

Although the present invention has been described in detail hereinabove, the present invention is not limited thereto, and various modifications can be made by those skilled in the art in light of the principle of the present invention. Thus, modifications made in accordance with the principles of the present invention should be understood to fall within the scope of the present invention.

Claims

1. A method for Chinese semantic analysis based on deep learning comprises the following steps:

2. The method of claim 1, wherein the mobile terminal performs special type vocabulary recognition and/or custom vocabulary recognition and/or Chinese naming recognition on the canonical Chinese text, and the recognition result is used as a constraint condition, comprising:

3. The method of claim 1, wherein the mobile terminal performs special type vocabulary recognition and/or custom vocabulary recognition and/or Chinese naming recognition on the canonical Chinese text, and the recognition result is used as a constraint condition, comprising:

4. The method of claim 1, wherein the mobile terminal performs special type vocabulary recognition and/or custom vocabulary recognition and/or Chinese naming recognition on the canonical Chinese text, and the recognition result is used as a constraint condition, comprising:

the mobile terminal conducts Chinese naming recognition on the standard Chinese text by utilizing the Chinese naming recognition model obtained through deep learning to obtain a Chinese naming recognition result of the standard Chinese text, and the obtained Chinese naming recognition result is used as a third constraint condition.

5. The method of any of claims 2-4, the constraints comprising at least one of a first constraint, a second constraint, and a third constraint, or a combination thereof.

6. The method according to any of claims 1-5, wherein the mobile terminal performing the Chinese semantic analysis on the normalized Chinese text by using the segmentation, the part of speech and/or the named recognition type of the normalized Chinese text comprises:

7. The method of claim 6, wherein the mobile terminal performing the Chinese semantic analysis on the normalized Chinese text using the segmentation, part of speech and/or named recognition type of the normalized Chinese text comprises:

and the mobile terminal determines a Chinese semantic role labeling model of the bidirectional long-and-short-term memory LSTM according to the sentence classification result, and performs semantic role labeling on each participle and symbol of the standard Chinese text according to the participle, part of speech and/or name identification type of the standard Chinese text and the Chinese semantic role labeling model of the bidirectional long-and-short-term memory LSTM to obtain a semantic role labeling result of the standard Chinese text.

8. The method of claim 7, wherein the mobile terminal performing the Chinese semantic analysis on the normalized Chinese text using the segmentation, the part of speech, and/or the named recognition type of the normalized Chinese text comprises:

and the mobile terminal carries out structural processing on the standard Chinese text according to the semantic role labeling result and the event model of the standard Chinese text, and extracts key information of the standard Chinese text.

9. The method of claim 8, the key information of the canonical chinese text including an event name, a key attribute, and an attribute value.

10. An apparatus for deep learning based Chinese semantic analysis, comprising:

and the analysis module is used for performing Chinese word segmentation and part-of-speech analysis on the standardized Chinese text according to the constraint conditions and by utilizing deep learning to obtain a Chinese word segmentation and part-of-speech tagging model, obtaining the segmentation and part-of-speech of the standardized Chinese text, and performing Chinese semantic analysis on the standardized Chinese text by utilizing the segmentation, part-of-speech and/or naming identification types of the standardized Chinese text.