CN110569500A - Text semantic recognition method and device, computer equipment and storage medium - Google Patents

Text semantic recognition method and device, computer equipment and storage medium Download PDF

Info

Publication number
CN110569500A
CN110569500A (application CN201910666457.9A)
Authority
CN
China
Prior art keywords
text
vector
word
character
neural network
Prior art date
Legal status
Pending
Application number
CN201910666457.9A
Other languages
Chinese (zh)
Inventor
韩铃
张然
Current Assignee
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd filed Critical Ping An International Smart City Technology Co Ltd
Priority to CN201910666457.9A
Publication of CN110569500A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415: Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The application relates to the technical field of natural language processing and provides a text semantic recognition method and apparatus, a computer device, and a storage medium. The method comprises the following steps: determining the text characters contained in a target text and the segmented word to which each text character belongs; calculating a character vector for each text character and a word vector for each segmented word; splicing the character vector of each text character with the word vector of its segmented word to obtain a spliced vector for that text character; inputting the character vectors into a first neural network layer to obtain a first feature vector, and inputting the spliced vectors into the first neural network layer to obtain a second feature vector; and inputting a comprehensive feature vector, obtained by splicing the first and second feature vectors, into a second neural network layer to obtain the semantic type of the target text. The method improves the accuracy of text semantic recognition.

Description

Text semantic recognition method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of natural language processing technologies, and in particular to a text semantic recognition method and apparatus, a computer device, and a storage medium.
Background
Semantic recognition of text data is often required in applications such as intelligent customer service and instant messaging. With the development of the internet, text semantic recognition technology is applied more and more widely, especially in intelligent customer service. For example, to answer text input by a user accurately, an intelligent customer service system generally needs to perform semantic recognition on that input and determine the real meaning it expresses, so as to answer the user's question accurately and quickly. Likewise, in instant messaging, to prevent users from transmitting uncivilized language such as profanity through an instant messaging platform, a computer device generally needs to examine the text a user inputs and detect any sensitive words appearing in it, so as to avoid spreading such language during instant messaging.
At present, most text semantic analysis techniques use keyword matching. If text data contains keywords that are not recorded in the keyword library, its semantics cannot be accurately identified; that is, recognition accuracy is limited by keyword coverage, so the accuracy of text semantic recognition is low.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a text semantic recognition method, apparatus, computer device and storage medium.
A text semantic recognition method, the method comprising:
determining the text characters contained in a target text and the segmented word to which each text character belongs;
calculating a character vector for each text character and a word vector for each segmented word;
splicing the character vector of each text character with the word vector of the segmented word to which it belongs, to obtain a spliced vector for that text character;
inputting the character vectors into a first neural network layer to obtain a first feature vector, and inputting the spliced vectors into the first neural network layer to obtain a second feature vector;
and inputting a comprehensive feature vector, obtained by splicing the first feature vector and the second feature vector, into a second neural network layer to obtain the semantic type of the target text.
In one embodiment, the method further comprises:
obtaining a sample text;
extracting character vectors and word vectors of the sample text based on a pre-trained first neural network layer;
assigning a character number to each character vector and each word vector;
writing the character vectors, the word vectors, and their corresponding character numbers into a preset text.
Calculating the character vector of each text character and the word vector of each segmented word then comprises:
assigning a character number to each text character and each segmented word;
and reading, from the preset text based on the character numbers, the character vector corresponding to each text character and the word vector corresponding to each segmented word.
In one embodiment, the first neural network layer comprises a convolutional layer and a pooling layer, and inputting the character vectors into the first neural network layer to obtain the first feature vector comprises:
splicing the character vectors corresponding to all text characters contained in the target text to obtain a character vector matrix of the target text;
using the character vector matrix as the input of the convolutional layer, which performs a convolution operation on the matrix to obtain character-vector convolution features;
and using the convolution features as the input of the pooling layer, which projects the maximum feature value of each vector in the convolution features to obtain the first feature vector.
In one embodiment, inputting the comprehensive feature vector obtained by splicing the first feature vector and the second feature vector into the second neural network layer comprises:
inputting the word vectors into the first neural network layer to obtain a third feature vector;
and inputting a comprehensive feature vector obtained by splicing the first, second, and third feature vectors into the second neural network layer.
In one embodiment, the second neural network layer comprises a random-inactivation (dropout) layer and a fully connected layer, and inputting the comprehensive feature vector obtained by splicing the first and second feature vectors into the second neural network layer to obtain the semantic type of the target text comprises:
splicing the first feature vector and the second feature vector to obtain the comprehensive feature vector of the target text;
using the comprehensive feature vector as the input of the dropout layer, which projects each datum in the vector according to a preset sparsity probability to obtain a sparse feature vector;
using the sparse feature vector as the input of the fully connected layer, which performs a classification operation on it to obtain a prediction probability for each semantic type;
and selecting the semantic type with the largest prediction probability as the semantic type of the target text.
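The dropout-and-classification steps of this embodiment can be sketched in a few lines. This is a minimal illustration with toy weights; the function names, weights, and sparsity probability are illustrative, not taken from the patent:

```python
import math
import random

def random_inactivation(features, sparse_prob, training=True):
    # Dropout: zero each value with probability sparse_prob during training,
    # rescaling survivors so the expected magnitude is unchanged.
    if not training:
        return features[:]
    return [0.0 if random.random() < sparse_prob else x / (1.0 - sparse_prob)
            for x in features]

def fully_connected(features, weights, biases):
    # One dense layer followed by softmax: one probability per semantic type.
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy comprehensive feature vector and two semantic types
# (category 1: profanity, category 0: polite phrase).
vec = [0.2, -0.1, 0.7, 0.4]
weights = [[0.1, 0.0, 0.3, 0.2],    # row for category 0
           [-0.1, 0.2, -0.3, 0.1]]  # row for category 1
probs = fully_connected(random_inactivation(vec, 0.5, training=False),
                        weights, [0.0, 0.0])
semantic_type = max(range(len(probs)), key=probs.__getitem__)
```

Selecting the semantic type with the largest prediction probability is the final argmax step; dropout is disabled at prediction time, as in the call above.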
A text semantic recognition apparatus, the apparatus comprising:
a text determining module, configured to determine the text characters contained in a target text and the segmented word to which each text character belongs;
a vector calculation module, configured to calculate a character vector for each text character and a word vector for each segmented word;
a vector splicing module, configured to splice the character vector of each text character with the word vector of the corresponding segmented word to obtain a spliced vector for that text character;
a feature vector acquisition module, configured to input the character vectors into a first neural network layer to obtain a first feature vector, and to input the spliced vectors into the first neural network layer to obtain a second feature vector;
and a semantic type acquisition module, configured to input a comprehensive feature vector obtained by splicing the first and second feature vectors into a second neural network layer to obtain the semantic type of the target text.
In one embodiment, the apparatus further comprises a preset sample generation module, configured to obtain a sample text; extract character vectors and word vectors of the sample text based on a pre-trained first neural network layer; assign a character number to each character vector and each word vector; and write the character vectors, the word vectors, and their character numbers into a preset text. The vector calculation module is further configured to assign a character number to each text character and each segmented word, and to read, from the preset text based on the character numbers, the character vector of each text character and the word vector of each segmented word.
In one embodiment, the feature vector acquisition module is further configured to splice the character vectors of all text characters contained in the target text into a character vector matrix of the target text; to use the matrix as the input of the convolutional layer, which performs a convolution operation on it to obtain character-vector convolution features; and to use the convolution features as the input of the pooling layer, which projects the maximum feature value of each vector in the convolution features to obtain the first feature vector.
A computer device, comprising a memory storing a computer program and a processor which, when executing the computer program, implements the steps of the above text semantic recognition method.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned text semantic recognition method.
The text semantic recognition method and apparatus, computer device, and storage medium determine the text characters contained in a target text and the segmented word to which each character belongs, calculate the character vector of each text character and the word vector of its segmented word, and splice them to obtain a spliced vector for each character. Representing the text through multiple kinds of feature vectors in this way enhances the feature dimensions of the text representation. The several vector representations of the target text are then input separately into the first neural network layer for feature extraction, the first and second feature vectors output by that layer are spliced into a comprehensive feature vector, and the comprehensive feature vector is classified to obtain the semantic type. Because the comprehensive feature vector expresses the semantic features of the target text more fully, the accuracy of text semantic recognition is improved.
Drawings
FIG. 1 is a diagram of an application scenario of a text semantic recognition method in one embodiment;
FIG. 2 is a flow diagram illustrating a text semantic recognition method according to one embodiment;
FIG. 3 is a schematic flow chart illustrating the generation of the preset text in one embodiment;
FIG. 4 is a block diagram of an exemplary text semantic recognition apparatus;
FIG. 5 is a block diagram showing the structure of a text semantic recognition apparatus according to another embodiment;
FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The text semantic recognition method provided by the application can be applied in the environment shown in fig. 1. The method is applied in a text semantic system comprising a terminal 102 and a server 104, which communicate via a network. The method can be carried out entirely on the terminal 102, which collects the target text to be recognized and recognizes its semantic type locally; alternatively, the terminal 102 acquires the target text and transmits it over the network to the server 104, which recognizes its semantic type using the text semantic recognition method. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, or portable wearable device, and the server 104 may be implemented as an independent server or as a cluster of servers.
In one embodiment, as shown in fig. 2, a text semantic recognition method is provided. Taking its application to the server in fig. 1 as an example, the method includes the following steps:
Step S202: determine the text characters contained in the target text and the segmented word to which each text character belongs.
Text characters are the individual characters obtained by segmenting the target text; a text character may be a letter, a number, a Chinese character, or a symbol. Word segmentation is the process of cutting the target text into individual words, i.e., recombining a continuous character sequence into a word sequence according to certain specifications. Segmentation may be performed with string-matching-based, semantics-based, or statistics-based methods.
Specifically, the terminal acquires the target text, which may be text obtained through speech recognition or text directly input by the user at the terminal, and transmits it to the server over a wireless or wired connection, for example radio frequency, NFC (near field communication), Bluetooth, or a wireless network.
The server segments the received target text character by character to obtain the text characters it contains, arranges them in their order of appearance in the target text to obtain a character sequence, and deletes from this sequence any characters that appear in a stop-word list, obtaining a preprocessed character sequence. Stop words are words or characters with no processing value for the natural language processing task that should therefore be filtered out; they include English characters, numbers, mathematical symbols, punctuation marks, frequently used single Chinese characters, and the like.
The server then examines each character in the character sequence and attaches identifiers to repeated characters so that different words containing the same character can be distinguished. It segments the identifier-tagged character sequence using a pre-constructed segmentation lexicon to obtain an identifier-tagged word sequence, and, based on the preprocessed character sequence, determines from this word sequence the segmented word to which each character belongs.
In one embodiment, the segmentation lexicon may be built from the Xinhua dictionary or other similar published works, or according to the intelligent customer service scenario; the constructed lexicon may be stored in the server's database or sent to the cloud.
In one embodiment, the target text may also be obtained by the server itself; for example, the server may fetch the required text data from a web page as the target text and then determine its text characters and the segmented word to which each belongs.
For example, suppose the target text is "深圳市市政府在市民中心" ("The Shenzhen municipal government is in the citizen center"). The server first segments the text by character, obtaining the character sequence 深/圳/市/市/政/府/在/市/民/中/心, and deletes the characters in the stop-word list (here 在), obtaining the preprocessed sequence 深/圳/市/市/政/府/市/民/中/心. It then tags the repeated character 市, yielding the tagged sequence 深/圳/市01/市02/政/府/市03/民/中/心, and segments that sequence into the word sequence 深圳市01 / 市02政府 / 市03民中心. Although the character 市 occurs in three different words, the segmented word to which each occurrence belongs can be distinguished by its identifier.
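The character tagging and character-to-word mapping described above can be sketched as follows. This is a toy illustration: the tagging format and function names are assumptions, and a real system would use a proper Chinese segmenter rather than a hand-supplied word list:

```python
# Toy sketch: tag repeated characters, then map each character to its word.
from collections import Counter

def tag_repeated_chars(chars):
    seen = Counter()
    tagged = []
    for ch in chars:
        seen[ch] += 1
        # Only characters that occur more than once get an occurrence tag.
        tagged.append(f"{ch}{seen[ch]:02d}" if chars.count(ch) > 1 else ch)
    return tagged

def char_to_word(chars, words):
    # Map every character position to the segmented word that contains it.
    mapping, i = [], 0
    for w in words:
        for _ in w:
            mapping.append((chars[i], w))
            i += 1
    return mapping

chars = list("深圳市市政府在市民中心")
tagged = tag_repeated_chars(chars)          # 市 becomes 市01, 市02, 市03
mapping = char_to_word(chars, ["深圳市", "市政府", "在", "市民中心"])
```

With this mapping, every occurrence of 市 resolves unambiguously to its own word, which is what the identifier scheme in the example achieves.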
Step S204: calculate the character vector of each text character and the word vector of each segmented word.
Character vectors and word vectors are multi-dimensional representations of the target text.
Specifically, after obtaining the text characters of the target text and the segmented word to which each character belongs, the server looks up, in a pre-trained character vector library or word vector library, the character vector of each text character and the word vector of each segmented word. Alternatively, the server may encode the text characters and segmented words with a preset vector encoding rule to obtain the corresponding vectors.
Step S206: splice the character vector of each text character with the word vector of the corresponding segmented word to obtain a spliced vector for that text character.
A spliced vector is formed by joining several text vectors according to a preset rule and carries the representation dimensions of all of them.
Specifically, based on the character vectors and word vectors of the target text, the server splices the character vector of each text character with the word vector of the segmented word it belongs to, obtaining a spliced vector for that character and hence spliced vectors for all text characters contained in the target text; the order in which the character vector and word vector are spliced is not restricted.
In one embodiment, the server instead adds or multiplies, element by element, the character vector of each text character and the word vector of its segmented word to obtain the spliced vector.
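A minimal sketch of the splicing step with toy four-dimensional vectors; real systems would use pre-trained embeddings of a few hundred dimensions, and all values here are illustrative:

```python
def splice(char_vec, word_vec):
    # Concatenate a character vector with the vector of the word it belongs to.
    return char_vec + word_vec

char_vec = [1.0, 1.0, 2.0, 2.0]   # toy vector for the character 深
word_vec = [0.5, 0.0, 0.5, 1.0]   # toy vector for the word 深圳市

spliced = splice(char_vec, word_vec)   # 8-dimensional spliced vector

# Variant from the embodiment above: element-wise sum instead of concatenation.
added = [c + w for c, w in zip(char_vec, word_vec)]
```

Concatenation preserves both representations in full, while the element-wise variant keeps the original dimensionality at the cost of mixing the two sources.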
Step S208: input the character vectors into the first neural network layer to obtain a first feature vector, and input the spliced vectors into the first neural network layer to obtain a second feature vector.
The first neural network layer mainly generates, from the vector features of the input target text, features carrying the contextual semantics of that text.
Specifically, the server inputs the character vectors of all text characters in the target text into the first neural network layer, which extracts features from them to obtain the first feature vector. Likewise, the server inputs the spliced vectors of all text characters into the first neural network layer, which extracts features from them to obtain the second feature vector.
Step S210: input the comprehensive feature vector obtained by splicing the first and second feature vectors into the second neural network layer to obtain the semantic type of the target text.
The comprehensive feature vector is formed by splicing, according to a preset rule, the feature vectors output by the first neural network layer. The second neural network layer mainly classifies the comprehensive feature vector of the input target text by semantic type, and the semantic type of the target text is determined according to its semantic relations.
Specifically, the server splices the first and second feature vectors of the target text into a comprehensive feature vector and passes it to the second neural network layer, which classifies it by semantic type to obtain the semantic type of the target text, fully accounting for the context and implied words of the text. For example, when recognizing whether the target text is profane or polite, two semantic types can be defined: category 1, the text is profanity; category 0, the text is a polite phrase.
In this embodiment, the text characters contained in the target text and the segmented words to which they belong are determined, the character vector of each text character and the word vector of its segmented word are calculated and spliced into a spliced vector for that character, and representing the text through multiple kinds of feature vectors in this way enhances the feature dimensions of the text representation. The several vector representations of the target text are then input separately into the first neural network layer for feature extraction, the first and second feature vectors output by that layer are spliced into a comprehensive feature vector, and the comprehensive feature vector is classified to obtain the semantic type. Because the comprehensive feature vector expresses the semantic features of the target text more fully, the accuracy of text semantic recognition is improved.
In one embodiment, as shown in fig. 3, the method further includes generating the preset text: step S302, obtain a sample text; step S304, extract the character vectors and word vectors of the sample text based on a pre-trained first neural network layer; step S306, assign a character number to each character vector and each word vector; step S308, write the character vectors, the word vectors, and their character numbers into the preset text. Calculating the character vector of each text character and the word vector of each segmented word then comprises: assigning a character number to each text character and each segmented word; and reading, from the preset text based on the character numbers, the character vector of each text character and the word vector of each segmented word.
The preset text is a pre-constructed, indexed text containing the character vectors and the word vectors together with their indexes.
Specifically, before the character vectors of text characters and the word vectors of segmented words can be calculated, a preset text supporting index queries over those vectors must be constructed. The server obtains a sample text and its known semantic type from a terminal or a web page, extracts the character vectors and word vectors of the sample text based on the pre-trained first neural network layer, and numbers the extracted vectors, obtaining a mapping from character vectors to numbers and from word vectors to numbers. It then writes the character vectors, word vectors, and their character numbers into the preset text, forming vectors indexed by character number.
Given the text characters contained in the target text and the segmented words they belong to, the server assigns a character number to each text character and each segmented word, obtaining mappings from characters and segmented words to their numbers. The character vector of each text character is then retrieved from the preset text by querying its character number, and likewise the word vector of each segmented word.
In one embodiment, the character numbers may have number types, and character vectors and word vectors are numbered according to their types, which may be the same or different. For example, character vectors may be numbered with natural numbers, while word vectors are numbered with natural numbers or English letters.
For example, if the target text is "深圳市" (Shenzhen city) and the character number of the text character 深 is 01, then querying the preset text for number 01 returns the character vector (1,1,2,2,0,0).
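A minimal sketch of such a number-indexed lookup. The tab-separated file format and function names are assumptions for illustration; the patent does not specify the on-disk layout of the preset text:

```python
# Hypothetical format for the preset text: one line per entry,
# "<character number>\t<comma-separated vector>".
import io

preset_text = io.StringIO("01\t1,1,2,2,0,0\n02\t0,1,1,0,2,2\n")

def load_preset(fh):
    # Build an in-memory index from character number to stored vector.
    index = {}
    for line in fh:
        number, vec = line.strip().split("\t")
        index[number] = [float(x) for x in vec.split(",")]
    return index

index = load_preset(preset_text)
vector_of_shen = index["01"]   # character number 01 maps to the vector of 深
```

Once loaded, every vector lookup is a single dictionary access, which is what makes number-based retrieval fast at recognition time.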
In this embodiment, because the preset text containing the character vectors and word vectors is built in advance, the vectors of the target text's characters and segmented words are obtained simply by looking up the preset text with their character numbers. The vectors can thus be obtained accurately and quickly, improving both the speed and the accuracy of obtaining the semantic type of the target text.
In one embodiment, the first neural network layer includes a convolutional layer and a pooling layer; inputting the word vector into the first neural network layer to obtain a first feature vector comprises: splicing word vectors corresponding to all text characters contained in the target text to obtain a word vector matrix of the target text; the word vector matrix is used as the input of a convolution layer, and the convolution layer is used for carrying out convolution operation on the word vector matrix to obtain a word vector convolution characteristic; and the word vector convolution characteristics are used as the input of a pooling layer, and the pooling layer is used for projecting the maximum characteristic value in each vector in the word vector convolution characteristics to obtain a first characteristic vector.
the convolution layer is mainly used for performing convolution operation on the input features, extracting context information of the input features, reducing dimensionality of the input features and achieving compression of the input features. The pooling layer is mainly used for sampling the neurons output by convolution, so that the number of the neurons is reduced, and the calculation amount of the neural network is reduced.
Specifically, the first neural network layer includes a convolution layer and a pooling layer, where the output of the convolution layer serves as the input of the pooling layer. The server splices the word vectors corresponding to all text characters contained in the target text to obtain the word vector matrix of the target text, where the width of the word vector matrix is consistent with the number of text characters contained in the target text. Further, the server inputs the word vector matrix into the convolution layer of the first neural network layer and performs a convolution operation on it through the convolution layer to obtain the word vector convolution features; the word vector convolution features are composed of the neurons output by the convolution operation. The server then inputs the word vector convolution features into the pooling layer, which samples them by selecting the maximum feature value in each vector of the convolution features for projection, obtaining the first feature vector of the target text.
In one embodiment, the convolution layer performs the convolution operation on the word vector matrix to obtain the word vector convolution features as follows: the server convolves the word vector matrix with a plurality of preset convolution kernels and outputs the multi-dimensional column vector corresponding to each convolution kernel. Specifically, after one convolution kernel is applied to the word vector matrix once, the convolution layer outputs one multi-dimensional column vector; after the word vector matrix is convolved with a plurality of convolution kernels, the convolution layer outputs a plurality of multi-dimensional column vectors. The width of a multi-dimensional column vector (i.e., its number of rows) is (number of word vectors - width of the convolution kernel) / stride of the convolution kernel + 1. The length of the convolution kernel is consistent with the length of the word vector, the stride of the convolution kernel may be 1, and the width of the convolution kernel lies in the range [1, number of word vectors].
In one embodiment, each obtained multi-dimensional column vector is activated through an activation function to obtain a plurality of activated multi-dimensional column vectors. The activation function applies a nonlinear transformation to the multi-dimensional column vectors, improving the expressive power of the first neural network layer of the semantic recognition model. Suitable activation functions include the linear rectification function (ReLU) and the radial basis kernel function.
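The convolution, activation, and max-pooling path described above can be sketched in NumPy; the kernel sizes and all-ones values are illustrative assumptions. Each kernel spans the full vector dimension, the output column-vector length follows the formula (number of word vectors - kernel width) / stride + 1, a ReLU activation is applied, and pooling keeps the largest value per column vector.

```python
import numpy as np

def conv_maxpool(matrix: np.ndarray, kernels, stride: int = 1) -> np.ndarray:
    """Convolve a (num_chars x dim) vector matrix with each kernel, apply ReLU,
    then max-pool each resulting column vector to a single feature value."""
    n, _ = matrix.shape
    features = []
    for kernel in kernels:                       # each kernel: (width x dim)
        width = kernel.shape[0]
        out_len = (n - width) // stride + 1      # rows of the output column vector
        col = np.array([np.sum(matrix[i * stride:i * stride + width] * kernel)
                        for i in range(out_len)])
        col = np.maximum(col, 0.0)               # ReLU activation
        features.append(col.max())               # max-over-time pooling
    return np.array(features)

# Word-vector matrix for a 4-character text with 3-dimensional vectors.
word_matrix = np.ones((4, 3))
first_feature = conv_maxpool(word_matrix, [np.ones((2, 3)), np.ones((3, 3))])
```

With these all-ones inputs the width-2 kernel yields column values of 6 and the width-3 kernel yields 9, so `first_feature` is `[6.0, 9.0]` — one pooled value per kernel, matching the one-feature-per-kernel structure of the first feature vector.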
In one embodiment, inputting the splicing vector into the first neural network layer to obtain the second feature vector comprises: concatenating the splicing vectors of all text characters contained in the target text to obtain a splicing vector matrix of the target text; the splicing vector matrix is used as the input of the convolution layer, and the convolution layer performs a convolution operation on it to obtain splicing vector convolution features; the splicing vector convolution features are used as the input of the pooling layer, and the pooling layer projects the maximum feature value in each vector of the convolution features to obtain the second feature vector.
In this embodiment, the word vector matrix obtained by concatenating the word vectors of the text characters is input to the convolution layer; the word vector convolution features obtained by the convolution operation are input to the pooling layer, and the first feature vector is obtained by projecting the maximum feature value in each vector of the convolution features. Because the convolution features obtained through the convolution operation include the context semantic features of the target text, they represent the semantics implied by the target text well, further improving the accuracy of identifying the target text.
In one embodiment, inputting the comprehensive feature vector obtained by splicing the first feature vector and the second feature vector into the second neural network layer comprises: inputting the word vectors of the text participles into the first neural network layer to obtain a third feature vector; and inputting the comprehensive feature vector obtained by splicing the first feature vector, the second feature vector and the third feature vector into the second neural network layer.
Specifically, the server inputs the word vectors of the text participles into the first neural network layer and performs a convolution operation on them through the first neural network layer to obtain the third feature vector of the target text. The server then splices the first, second and third feature vectors of the target text to obtain the spliced comprehensive feature vector, and inputs the comprehensive feature vector into the second neural network layer for semantic type classification.
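The splicing of the three feature vectors into one comprehensive feature vector is plain concatenation; the dimensions and values below are illustrative, not from the patent.

```python
import numpy as np

# Illustrative first, second and third feature vectors of a target text.
first = np.array([0.5, 1.0])
second = np.array([2.0, 0.0, 1.5])
third = np.array([1.0])

# Comprehensive feature vector fed to the second neural network layer:
# the concatenation preserves every feature from all three sources.
comprehensive = np.concatenate([first, second, third])
```

The resulting vector's length is simply the sum of the three feature-vector lengths, which is why adding the third feature vector enlarges the feature dimensions available to the classifier.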
In this embodiment, the word vectors corresponding to the text participles are input into the first neural network layer to obtain the third feature vector; the first, second and third feature vectors are spliced into the comprehensive feature vector, which is input into the second neural network layer. Representing the target text with this additional feature vector enriches the semantic features available for classification.
In one embodiment, the second neural network layer includes a random inactivation layer and a fully connected layer. Inputting the comprehensive feature vector obtained by splicing the first feature vector and the second feature vector into the second neural network layer to obtain the semantic type of the target text comprises the following steps: splicing the first feature vector and the second feature vector to obtain the comprehensive feature vector of the target text; the comprehensive feature vector is used as the input of the random inactivation layer, which dilutes it to obtain a sparse feature vector; the sparse feature vector is used as the input of the fully connected layer, which performs a classification operation on it to obtain the prediction probability corresponding to each semantic type; and the semantic type with the maximum prediction probability is selected as the semantic type of the target text.
The random inactivation (dropout) layer is mainly used to sparsify the input comprehensive feature vector by zeroing some of its elements, which prevents overfitting of the neural network and reduces its computation load. The fully connected layer is mainly used to classify the sparsified comprehensive feature vector to obtain the classification result.
Specifically, the second neural network layer includes a random inactivation layer and a fully connected layer, where the output of the random inactivation layer serves as the input of the fully connected layer. The server splices the first feature vector and the second feature vector of the target text to obtain its comprehensive feature vector. Further, the server inputs the comprehensive feature vector into the random inactivation layer, which sparsifies it according to a set sparse probability: each element of the comprehensive feature vector is projected (retained) with that probability. For example, if the output of the pooling layer is the one-dimensional sequence [1,2,3,4]^T and the sparse probability is set to 0.5, each number in the sequence survives the projection with probability 0.5, so the output of the random inactivation layer may be [0,2,0,4]^T or [0,0,0,4]^T.
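The random-inactivation step can be sketched as below. Following the description above, each element survives with the set sparse probability; no rescaling of the surviving elements is applied, since the text does not mention any (standard dropout implementations often rescale at training time).

```python
import numpy as np

def random_inactivation(vec: np.ndarray, sparse_prob: float,
                        rng: np.random.Generator) -> np.ndarray:
    """Keep each element with probability sparse_prob, zero it otherwise."""
    mask = rng.random(vec.shape) < sparse_prob
    return vec * mask

rng = np.random.default_rng(0)
sparse = random_inactivation(np.array([1.0, 2.0, 3.0, 4.0]), 0.5, rng)
# Each element of `sparse` is either 0 or its original value,
# e.g. [0, 2, 0, 4] or [0, 0, 0, 4] as in the example above.
```

Because the mask is random, different calls yield different sparse patterns, which is what prevents the downstream fully connected layer from over-relying on any single feature.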
The server then inputs the sparse feature vector into the fully connected layer, which performs the classification operation: the prediction probability corresponding to each semantic type is calculated from the trained weight parameters of the fully connected layer, and each prediction probability output by the layer corresponds to one semantic type. The server selects the semantic type with the maximum prediction probability as the semantic type of the target text.
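A minimal sketch of the fully connected classification step, assuming trained weights `W` and bias `b` (the values are illustrative): one score per semantic type, and the type with the maximum score is selected.

```python
import numpy as np

def classify(sparse_vec: np.ndarray, W: np.ndarray, b: np.ndarray) -> int:
    """Fully connected layer: scores = W @ x + b; pick the arg-max type."""
    scores = W @ sparse_vec + b
    return int(np.argmax(scores))

# Two semantic types, four-dimensional sparse feature vector (made-up weights).
W = np.array([[0.2, 0.0, 0.1, 0.0],
              [0.0, 0.5, 0.0, 0.3]])
b = np.zeros(2)
predicted_type = classify(np.array([0.0, 2.0, 0.0, 4.0]), W, b)
```

Here the scores are [0.0, 2.2], so semantic type index 1 is selected; in the trained model each row of `W` encodes the learned pattern for one semantic type.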
In one embodiment, the second neural network layer further includes a logistic regression (softmax) layer. Specifically, the prediction probabilities corresponding to the semantic types are used as the input of the softmax layer, which normalizes them to obtain the probability corresponding to each semantic type; the semantic type with the maximum probability is selected as the semantic type of the target text.
For example, suppose the fully connected layer outputs prediction scores a and b, where a corresponds to semantic type 1 and b corresponds to semantic type 0. After normalization with the softmax function, the output probabilities of the two semantic types are e^a/(e^a + e^b) and e^b/(e^a + e^b), and the semantic type corresponding to the maximum probability is selected as the semantic type of the target text.
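The softmax normalization can be written directly from its definition; the scores below stand in for the a and b of the example.

```python
import numpy as np

def softmax(scores: np.ndarray) -> np.ndarray:
    """Normalize scores into probabilities: exp(s_i) / sum_j exp(s_j).
    Shifting by the max is a standard numerical-stability trick that
    does not change the result."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

probs = softmax(np.array([2.0, 1.0]))   # e.g. a = 2.0, b = 1.0
```

The outputs always sum to 1, so they can be read as probabilities over the semantic types, and the arg-max of the probabilities equals the arg-max of the raw scores.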
In one embodiment, the neural network model comprises the first neural network layer and the second neural network layer, and its training process comprises the following steps: obtaining a sample text and a known label; determining the sample text characters contained in the sample text and the sample text participle to which each sample text character belongs; calculating the sample character vector corresponding to each sample text character and the sample word vector corresponding to each sample text participle, and splicing the sample character vector of each sample text character with the sample word vector of its sample text participle to obtain the sample splicing vector of the corresponding sample text character; inputting the sample character vectors into the first neural network layer to be trained to obtain a sample first feature vector, and inputting the sample splicing vectors into the first neural network layer to be trained to obtain a sample second feature vector; inputting the sample comprehensive feature vector obtained by splicing the sample first feature vector and the sample second feature vector into the second neural network layer to be trained to obtain the predicted semantic type of the sample text; calculating a loss value from the predicted semantic type and the known label, and propagating the loss value to each layer of the neural network model by gradient back-propagation to obtain the gradient of each layer's parameters; and adjusting the parameters of each layer of the neural network model according to the gradients until the loss value satisfies the training stop condition.
Adjusting the parameters of each layer of the neural network model specifically includes adjusting the weight parameters of the fully connected layer and the kernels of the convolution layer. The function that calculates the loss value may be a cross-entropy loss function. The gradient back-propagation method may be batch gradient descent (BGD), mini-batch gradient descent (MBGD), or stochastic gradient descent (SGD).
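One gradient-descent update under a cross-entropy loss can be sketched for the final classification layer. For a softmax output followed by cross-entropy, the gradient of the loss with respect to the scores is simply (probabilities - one-hot label). This is a generic SGD sketch under those assumptions, not the patent's exact training procedure.

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())
    return e / e.sum()

def sgd_step(W: np.ndarray, b: np.ndarray, x: np.ndarray,
             label: int, lr: float = 0.5) -> float:
    """One SGD update of a softmax classifier; returns the cross-entropy loss."""
    probs = softmax(W @ x + b)
    loss = -np.log(probs[label])
    grad = probs.copy()
    grad[label] -= 1.0           # d(loss)/d(scores) for softmax + cross-entropy
    W -= lr * np.outer(grad, x)  # gradient w.r.t. the weight parameters
    b -= lr * grad               # gradient w.r.t. the bias
    return float(loss)

W, b = np.zeros((2, 3)), np.zeros(2)
x = np.array([1.0, 0.0, 1.0])    # stand-in for a sample feature vector
losses = [sgd_step(W, b, x, label=0) for _ in range(3)]
```

Repeating the step on the same sample drives the loss down, which is the "adjust parameters according to the gradient until the loss value reaches the stop condition" loop in miniature; BGD and MBGD differ only in how many samples contribute to each gradient.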
In this embodiment, the first feature vector and the second feature vector are spliced into the comprehensive feature vector of the target text; the comprehensive feature vector is sparsified through the random inactivation layer of the second neural network layer to obtain the sparse feature vector; the sparse feature vector is classified through the fully connected layer to obtain the prediction probability corresponding to each semantic type; and the semantic type corresponding to the maximum prediction probability is selected as the semantic type of the target text. Representing the semantic features of the target text with the comprehensive feature vector and sparsifying it reduces the computation load of the computer device while improving the classification accuracy.
It should be understood that although the steps in the flow diagrams of fig. 2-3 are displayed in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, there is no strict order limitation on the performance of these steps, and they may be performed in other orders. Moreover, at least some of the steps in fig. 2-3 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 4, there is provided a text semantic recognition apparatus 400, including: a text determination module 402, a vector calculation module 404, a vector concatenation module 406, a feature vector acquisition module 408, and a semantic type acquisition module 410, wherein:
a text determining module 402, configured to determine text characters included in the target text and text participles to which each text character belongs.
A vector calculation module 404, configured to calculate the word vector corresponding to each text character and the word vector corresponding to each text participle.
A vector splicing module 406, configured to splice the word vector of each text character with the word vector of the text participle to which the text character belongs, to obtain the splicing vector of the corresponding text character.
A feature vector obtaining module 408, configured to input the word vector into the first neural network layer to obtain a first feature vector, and input the splicing vector into the first neural network layer to obtain a second feature vector.
A semantic type obtaining module 410, configured to input the comprehensive feature vector obtained by splicing the first feature vector and the second feature vector into the second neural network layer to obtain the semantic type of the target text.
In one embodiment, as shown in fig. 5, the apparatus further includes a preset sample generating module 412, configured to: obtain a sample text; extract the character vectors and word vectors of the sample text based on the pre-trained first neural network layer; carry out character numbering on the character vectors and the word vectors respectively; and write the character vectors, the word vectors and their corresponding character numbers into the preset text. The vector calculation module is further configured to carry out character numbering on each text character and each text participle, and to read, based on the character numbers, the word vector corresponding to each text character and the word vector corresponding to each text participle from the preset text.
In one embodiment, the feature vector acquisition module is further configured to splice the word vectors corresponding to all text characters contained in the target text to obtain the word vector matrix of the target text; the word vector matrix is used as the input of the convolution layer, and the convolution layer performs a convolution operation on it to obtain the word vector convolution features; the word vector convolution features are used as the input of the pooling layer, and the pooling layer projects the maximum feature value in each vector of the convolution features to obtain the first feature vector.
In one embodiment, the semantic type obtaining module is further configured to input the word vector into the first neural network layer to obtain a third feature vector, and to input the comprehensive feature vector obtained by splicing the first feature vector, the second feature vector and the third feature vector into the second neural network layer.
In one embodiment, the semantic type obtaining module is further configured to splice the first feature vector and the second feature vector to obtain a comprehensive feature vector of the target text; the comprehensive characteristic vector is used as the input of a random inactivation layer, and the random inactivation layer is used for projecting each data in the comprehensive characteristic vector according to a preset sparse probability to obtain a sparse characteristic vector; the sparse feature vectors are used as the input of a full connection layer, and the full connection layer is used for carrying out classification operation on the sparse feature vectors to obtain the prediction probability corresponding to each semantic type; and selecting the semantic type with the maximum prediction probability as the semantic type of the target text.
In this embodiment, the text characters contained in the target text and the text participle to which each text character belongs are determined; the word vector corresponding to each text character and the word vector of the text participle to which it belongs are calculated and spliced to obtain the splicing vector of the corresponding text character. Representing the text through multiple feature vectors by vector splicing on the text characters enhances the feature dimensions of the text representation. Further, the multiple vector representations of the target text are input into the first neural network layer for feature extraction, the first feature vector and the second feature vector output by the first neural network layer are spliced into the comprehensive feature vector, and the comprehensive feature vector of the target text is classified to obtain the semantic type. Because the comprehensive feature vector expresses the semantic features of the target text more fully, the accuracy of text semantic recognition is improved.
For the specific limitations of the text semantic recognition apparatus, reference may be made to the above limitations of the text semantic recognition method, which are not repeated here. Each module of the text semantic recognition apparatus may be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in or independent of a processor in the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus, wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store the preset text and the word vectors corresponding to the text characters and the word vectors corresponding to the text participles contained in the target text. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a text semantic recognition method.
Those skilled in the art will appreciate that the structure shown in fig. 6 is merely a block diagram of a portion of the structure related to the present disclosure and does not limit the computer device to which the present disclosure applies; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, there is provided a computer device comprising a memory storing a computer program and a processor implementing the following steps when the processor executes the computer program: determining text characters contained in a target text and text participles to which each text character belongs; calculating word vectors corresponding to text characters and word vectors corresponding to text word segmentation; splicing the word vector of each text character with the word vector of the corresponding text participle to obtain a spliced vector of the corresponding text character; inputting the word vector into a first neural network layer to obtain a first feature vector, and inputting the splicing vector into the first neural network layer to obtain a second feature vector; and inputting the comprehensive characteristic vector obtained by splicing the first characteristic vector and the second characteristic vector into a second neural network layer to obtain the semantic type of the target text.
In one embodiment, the processor, when executing the computer program, further performs the steps of: obtaining a sample text; extracting the character vectors and word vectors of the sample text based on the pre-trained first neural network layer; carrying out character numbering on the character vectors and the word vectors respectively; and writing the character vectors, the word vectors and their corresponding character numbers into the preset text. Calculating the word vectors corresponding to the text characters and the word vectors corresponding to the text participles comprises: carrying out character numbering on each text character and each text participle; and reading, based on the character numbers, the word vector corresponding to each text character and the word vector corresponding to each text participle from the preset text.
In one embodiment, the processor, when executing the computer program, further performs the steps of: splicing word vectors corresponding to all text characters contained in the target text to obtain a word vector matrix of the target text; the word vector matrix is used as the input of a convolution layer, and the convolution layer is used for carrying out convolution operation on the word vector matrix to obtain a word vector convolution characteristic; and the word vector convolution characteristics are used as the input of a pooling layer, and the pooling layer is used for projecting the maximum characteristic value in each vector in the word vector convolution characteristics to obtain a first characteristic vector.
In one embodiment, the processor, when executing the computer program, further performs the steps of: inputting the word vector into the first neural network layer to obtain a third feature vector; and inputting the comprehensive characteristic vector obtained by splicing the first characteristic vector, the second characteristic vector and the third characteristic vector into a second neural network layer.
In one embodiment, the processor, when executing the computer program, further performs the steps of: splicing the first feature vector and the second feature vector to obtain a comprehensive feature vector of the target text; the comprehensive characteristic vector is used as the input of a random inactivation layer, and the random inactivation layer is used for projecting each data in the comprehensive characteristic vector according to a preset sparse probability to obtain a sparse characteristic vector; the sparse feature vectors are used as the input of a full connection layer, and the full connection layer is used for carrying out classification operation on the sparse feature vectors to obtain the prediction probability corresponding to each semantic type; and selecting the semantic type with the maximum prediction probability as the semantic type of the target text.
In this embodiment, the text characters contained in the target text and the text participle to which each text character belongs are determined; the word vector corresponding to each text character and the word vector of the text participle to which it belongs are calculated and spliced to obtain the splicing vector of the corresponding text character. Representing the text through multiple feature vectors by vector splicing on the text characters enhances the feature dimensions of the text representation. Further, the multiple vector representations of the target text are input into the first neural network layer for feature extraction, the first feature vector and the second feature vector output by the first neural network layer are spliced into the comprehensive feature vector, and the comprehensive feature vector of the target text is classified to obtain the semantic type. Because the comprehensive feature vector expresses the semantic features of the target text more fully, the accuracy of text semantic recognition is improved.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of: determining text characters contained in a target text and text participles to which each text character belongs; calculating word vectors corresponding to text characters and word vectors corresponding to text word segmentation; splicing the word vector of each text character with the word vector of the corresponding text participle to obtain a spliced vector of the corresponding text character; inputting the word vector into a first neural network layer to obtain a first feature vector, and inputting the splicing vector into the first neural network layer to obtain a second feature vector; and inputting the comprehensive characteristic vector obtained by splicing the first characteristic vector and the second characteristic vector into a second neural network layer to obtain the semantic type of the target text.
In one embodiment, the computer program, when executed by the processor, implements the steps of: obtaining a sample text; extracting the character vectors and word vectors of the sample text based on the pre-trained first neural network layer; carrying out character numbering on the character vectors and the word vectors respectively; and writing the character vectors, the word vectors and their corresponding character numbers into the preset text. Calculating the word vectors corresponding to the text characters and the word vectors corresponding to the text participles comprises: carrying out character numbering on each text character and each text participle; and reading, based on the character numbers, the word vector corresponding to each text character and the word vector corresponding to each text participle from the preset text.
In one embodiment, the computer program, when executed by the processor, implements the steps of: splicing the word vectors corresponding to all text characters contained in the target text to obtain the word vector matrix of the target text; the word vector matrix is used as the input of the convolution layer, and the convolution layer performs a convolution operation on it to obtain the word vector convolution features; and the word vector convolution features are used as the input of the pooling layer, and the pooling layer projects the maximum feature value in each vector of the convolution features to obtain the first feature vector.
In one embodiment, the computer program when executed by the processor implements the steps of: inputting the word vector into the first neural network layer to obtain a third feature vector; and inputting the comprehensive characteristic vector obtained by splicing the first characteristic vector, the second characteristic vector and the third characteristic vector into a second neural network layer.
In one embodiment, the computer program when executed by the processor implements the steps of: splicing the first feature vector and the second feature vector to obtain a comprehensive feature vector of the target text; the comprehensive characteristic vector is used as the input of a random inactivation layer, and the random inactivation layer is used for projecting each data in the comprehensive characteristic vector according to a preset sparse probability to obtain a sparse characteristic vector; the sparse feature vectors are used as the input of a full connection layer, and the full connection layer is used for carrying out classification operation on the sparse feature vectors to obtain the prediction probability corresponding to each semantic type; and selecting the semantic type with the maximum prediction probability as the semantic type of the target text.
In this embodiment, the text characters contained in the target text and the text participle to which each text character belongs are determined; the word vector corresponding to each text character and the word vector of the text participle to which it belongs are calculated and spliced to obtain the splicing vector of the corresponding text character. Representing the text through multiple feature vectors by vector splicing on the text characters enhances the feature dimensions of the text representation. Further, the multiple vector representations of the target text are input into the first neural network layer for feature extraction, the first feature vector and the second feature vector output by the first neural network layer are spliced into the comprehensive feature vector, and the comprehensive feature vector of the target text is classified to obtain the semantic type. Because the comprehensive feature vector expresses the semantic features of the target text more fully, the accuracy of text semantic recognition is improved.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing related hardware; the computer program can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; nevertheless, any such combination should be considered within the scope of this specification as long as it contains no contradiction.
The above examples express only several embodiments of the present application, and although their description is specific and detailed, they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within its scope of protection. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A method of text semantic recognition, the method comprising:
determining text characters contained in a target text and the text participle to which each text character belongs;
calculating a character vector corresponding to each text character and a word vector corresponding to each text participle;
splicing the character vector of each text character with the word vector of its corresponding text participle to obtain a spliced vector for that text character;
inputting the character vectors into a first neural network layer to obtain a first feature vector, and inputting the spliced vectors into the first neural network layer to obtain a second feature vector; and
inputting a comprehensive feature vector, obtained by splicing the first feature vector and the second feature vector, into a second neural network layer to obtain the semantic type of the target text.
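The data flow of claim 1 can be sketched in miniature as below; the `first_network` reduction (a columnwise max) and all vector values are stand-ins assumed for illustration, since the claim does not fix the internals of either network layer.

```python
def first_network(vectors):
    """Stand-in for the first neural network layer: reduce a list of
    vectors to a single feature vector via a columnwise max."""
    return [max(col) for col in zip(*vectors)]

char_vecs = [[0.1, 0.2], [0.3, 0.4]]             # toy per-character vectors
word_vec = [1.0, 1.1]                            # toy participle vector (shared)
spliced = [cv + word_vec for cv in char_vecs]    # splicing step of claim 1

first_feature = first_network(char_vecs)         # first feature vector
second_feature = first_network(spliced)          # second feature vector
comprehensive = first_feature + second_feature   # comprehensive feature vector
print(len(comprehensive))  # 2 + 4 = 6 values fed to the second network layer
```

The comprehensive feature vector is what the second neural network layer (see claim 5) classifies into a semantic type.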
2. The method of claim 1, further comprising:
obtaining a sample text;
extracting character vectors and word vectors of the sample text based on a pre-trained first neural network layer;
numbering the character vectors and the word vectors respectively;
writing the character vectors, the word vectors and the corresponding character numbers into a preset text;
wherein calculating the character vector corresponding to each text character and the word vector corresponding to each text participle comprises:
assigning a character number to each text character and each text participle; and
reading, from the preset text based on the character numbers, the character vector corresponding to each text character and the word vector corresponding to each text participle.
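Claim 2's number-keyed lookup could be sketched as follows; the line format of the "preset text" (number, token, and vector on one tab-separated line) is a hypothetical choice for illustration, as the patent does not specify the file layout.

```python
# Assumed "preset text" format: number<TAB>token<TAB>space-separated vector.
preset_lines = [
    "0\t深\t0.1 0.2",
    "1\t圳\t0.3 0.4",
    "2\t深圳\t1.0 1.1",
]

number_of, table = {}, {}
for line in preset_lines:
    num, token, vec = line.split("\t")
    number_of[token] = int(num)                       # character numbering
    table[int(num)] = [float(x) for x in vec.split()]

def lookup(token):
    """Read the vector of a text character or participle via its number."""
    return table[number_of[token]]

print(lookup("深圳"))  # [1.0, 1.1]
```

Precomputing vectors into a numbered table means recognition-time lookups are dictionary reads rather than repeated embedding computations.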
3. The method of claim 1, wherein the first neural network layer comprises a convolutional layer and a pooling layer, and inputting the character vectors into the first neural network layer to obtain the first feature vector comprises:
splicing the character vectors corresponding to all text characters contained in the target text to obtain a character-vector matrix of the target text;
taking the character-vector matrix as the input of the convolutional layer, the convolutional layer performing a convolution operation on the character-vector matrix to obtain character-vector convolution features; and
taking the character-vector convolution features as the input of the pooling layer, the pooling layer projecting the maximum feature value of each vector in the convolution features to obtain the first feature vector.
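A minimal pure-Python sketch of claim 3's convolution-plus-max-pooling follows; the matrix size, the two filters, and their values are toy assumptions, not parameters from the patent.

```python
def conv1d(matrix, kernel):
    """Slide a width-k kernel over the character-vector matrix (valid
    padding), producing one feature value per window position."""
    n, k = len(matrix), len(kernel)
    return [sum(w * x
                for row, krow in zip(matrix[i:i + k], kernel)
                for w, x in zip(krow, row))
            for i in range(n - k + 1)]

def max_pool(feature_map):
    """Keep only the maximum feature value, as the pooling layer does."""
    return max(feature_map)

# Toy character-vector matrix: 4 characters, 2-dim vectors (sizes assumed).
matrix = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
kernels = [[[1.0, 0.0], [0.0, 1.0]],   # two hypothetical width-2 filters
           [[0.5, 0.5], [0.5, 0.5]]]
first_feature_vector = [max_pool(conv1d(matrix, k)) for k in kernels]
print(first_feature_vector)
```

Max-over-time pooling yields one value per filter, so the first feature vector's length equals the number of convolution filters regardless of text length.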
4. The method of claim 1, wherein inputting the comprehensive feature vector obtained by splicing the first feature vector and the second feature vector into the second neural network layer comprises:
inputting the word vectors into the first neural network layer to obtain a third feature vector; and
inputting a comprehensive feature vector obtained by splicing the first feature vector, the second feature vector and the third feature vector into the second neural network layer.
5. The method of claim 1, wherein the second neural network layer comprises a random-inactivation (dropout) layer and a fully connected layer, and inputting the comprehensive feature vector obtained by splicing the first feature vector and the second feature vector into the second neural network layer to obtain the semantic type of the target text comprises:
splicing the first feature vector and the second feature vector to obtain the comprehensive feature vector of the target text;
taking the comprehensive feature vector as the input of the random-inactivation layer, the random-inactivation layer projecting each datum in the comprehensive feature vector according to a preset sparsity probability to obtain a sparse feature vector;
taking the sparse feature vector as the input of the fully connected layer, the fully connected layer performing a classification operation on the sparse feature vector to obtain a prediction probability for each semantic type; and
selecting the semantic type with the maximum prediction probability as the semantic type of the target text.
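Claim 5's random inactivation, fully connected classification, and maximum-probability selection could be sketched as follows; the sparsity probability, weights, and feature values are illustrative assumptions, and the inverted-dropout rescaling is one common convention rather than something the patent specifies.

```python
import math
import random

def random_inactivation(vec, p, rng):
    """Random-inactivation (dropout) layer: zero each entry with probability
    p, rescaling the survivors (inverted dropout, training-time use)."""
    return [0.0 if rng.random() < p else x / (1.0 - p) for x in vec]

def fully_connected_softmax(vec, weights, bias):
    """Fully connected layer + softmax: a prediction probability per type."""
    logits = [sum(w * x for w, x in zip(col, vec)) + b
              for col, b in zip(weights, bias)]
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
comprehensive = [0.2, -0.4, 0.9, 0.1]             # toy comprehensive vector
sparse = random_inactivation(comprehensive, 0.5, rng)  # sparse feature vector
weights = [[1.0, 0.0, 0.0, 0.0],                  # 3 hypothetical semantic types
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 1.0]]
bias = [0.0, 0.0, 0.0]
probs = fully_connected_softmax(sparse, weights, bias)
prediction = max(range(len(probs)), key=probs.__getitem__)
print(prediction)  # index of the semantic type with the maximum probability
```

The argmax over the softmax output implements the final step of claim 5: the type with the largest prediction probability becomes the semantic type of the target text.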
6. An apparatus for text semantic recognition, the apparatus comprising:
a text determining module, configured to determine text characters contained in a target text and the text participle to which each text character belongs;
a vector calculation module, configured to calculate a character vector corresponding to each text character and a word vector corresponding to each text participle;
a vector splicing module, configured to splice the character vector of each text character with the word vector of its corresponding text participle to obtain a spliced vector for that text character;
a feature vector acquisition module, configured to input the character vectors into a first neural network layer to obtain a first feature vector, and to input the spliced vectors into the first neural network layer to obtain a second feature vector; and
a semantic type acquisition module, configured to input a comprehensive feature vector obtained by splicing the first feature vector and the second feature vector into a second neural network layer to obtain the semantic type of the target text.
7. The apparatus of claim 6, further comprising a preset sample generation module configured to obtain a sample text; extract character vectors and word vectors of the sample text based on a pre-trained first neural network layer; number the character vectors and the word vectors respectively; and write the character vectors, the word vectors and the corresponding character numbers into a preset text; wherein the vector calculation module is further configured to assign a character number to each text character and each text participle, and to read, from the preset text based on the character numbers, the character vector corresponding to each text character and the word vector corresponding to each text participle.
8. The apparatus of claim 6, wherein the feature vector acquisition module is further configured to splice the character vectors corresponding to all text characters contained in the target text to obtain a character-vector matrix of the target text; take the character-vector matrix as the input of a convolutional layer, the convolutional layer performing a convolution operation on the character-vector matrix to obtain character-vector convolution features; and take the character-vector convolution features as the input of a pooling layer, the pooling layer projecting the maximum feature value of each vector in the convolution features to obtain the first feature vector.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 5 when executing the computer program.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 5.
CN201910666457.9A 2019-07-23 2019-07-23 Text semantic recognition method and device, computer equipment and storage medium Pending CN110569500A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910666457.9A CN110569500A (en) 2019-07-23 2019-07-23 Text semantic recognition method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110569500A true CN110569500A (en) 2019-12-13

Family

ID=68773147

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910666457.9A Pending CN110569500A (en) 2019-07-23 2019-07-23 Text semantic recognition method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110569500A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108875034A (en) * 2018-06-25 2018-11-23 湖南丹尼尔智能科技有限公司 A kind of Chinese Text Categorization based on stratification shot and long term memory network
CN109684626A (en) * 2018-11-16 2019-04-26 深思考人工智能机器人科技(北京)有限公司 Method for recognizing semantics, model, storage medium and device
CN109815339A (en) * 2019-01-02 2019-05-28 平安科技(深圳)有限公司 Based on TextCNN Knowledge Extraction Method, device, computer equipment and storage medium
CN109918500A (en) * 2019-01-17 2019-06-21 平安科技(深圳)有限公司 File classification method and relevant device based on convolutional neural networks

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111126059A (en) * 2019-12-24 2020-05-08 上海风秩科技有限公司 Method and device for generating short text and readable storage medium
CN111160042A (en) * 2019-12-31 2020-05-15 重庆觉晓教育科技有限公司 Text semantic parsing method and device
CN111160042B (en) * 2019-12-31 2023-04-28 重庆觉晓科技有限公司 Text semantic analysis method and device
CN111368552A (en) * 2020-02-26 2020-07-03 北京市公安局 Network user group division method and device for specific field
CN111261162B (en) * 2020-03-09 2023-04-18 北京达佳互联信息技术有限公司 Speech recognition method, speech recognition apparatus, and storage medium
CN111261162A (en) * 2020-03-09 2020-06-09 北京达佳互联信息技术有限公司 Speech recognition method, speech recognition apparatus, and storage medium
CN111552812A (en) * 2020-04-29 2020-08-18 深圳数联天下智能科技有限公司 Method and device for determining relation category between entities and computer equipment
CN111552812B (en) * 2020-04-29 2023-05-12 深圳数联天下智能科技有限公司 Method, device and computer equipment for determining relationship category between entities
CN111343203A (en) * 2020-05-18 2020-06-26 国网电子商务有限公司 Sample recognition model training method, malicious sample extraction method and device
CN111343203B (en) * 2020-05-18 2020-08-28 国网电子商务有限公司 Sample recognition model training method, malicious sample extraction method and device
CN111814461A (en) * 2020-07-09 2020-10-23 科大讯飞股份有限公司 Text processing method, related device and readable storage medium
CN111814461B (en) * 2020-07-09 2024-05-31 科大讯飞股份有限公司 Text processing method, related equipment and readable storage medium
CN111859986A (en) * 2020-07-27 2020-10-30 中国平安人寿保险股份有限公司 Semantic matching method, device, equipment and medium based on multitask twin network
CN113743077A (en) * 2020-08-14 2021-12-03 北京京东振世信息技术有限公司 Method and device for determining text similarity
CN113743077B (en) * 2020-08-14 2023-09-29 北京京东振世信息技术有限公司 Method and device for determining text similarity
CN112069837A (en) * 2020-09-17 2020-12-11 湖北亿咖通科技有限公司 Natural language processing method based on neural network and electronic equipment
WO2022057406A1 (en) * 2020-09-17 2022-03-24 湖北亿咖通科技有限公司 Natural language processing method based on neural network, and electronic device
CN112507698B (en) * 2020-12-07 2024-05-24 深圳市优必选科技股份有限公司 Word vector generation method, device, terminal equipment and computer readable storage medium
CN112507698A (en) * 2020-12-07 2021-03-16 深圳市优必选科技股份有限公司 Word vector generation method and device, terminal equipment and computer readable storage medium
CN114969316B (en) * 2021-02-24 2024-04-26 腾讯科技(深圳)有限公司 Text data processing method, device, equipment and medium
CN114969316A (en) * 2021-02-24 2022-08-30 腾讯科技(深圳)有限公司 Text data processing method, device, equipment and medium
CN112949477A (en) * 2021-03-01 2021-06-11 苏州美能华智能科技有限公司 Information identification method and device based on graph convolution neural network and storage medium
CN112949477B (en) * 2021-03-01 2024-03-15 苏州美能华智能科技有限公司 Information identification method, device and storage medium based on graph convolution neural network
CN113095065A (en) * 2021-06-10 2021-07-09 北京明略软件系统有限公司 Chinese character vector learning method and device
CN113095065B (en) * 2021-06-10 2021-09-17 北京明略软件系统有限公司 Chinese character vector learning method and device
CN113343711A (en) * 2021-06-29 2021-09-03 南方电网数字电网研究院有限公司 Work order generation method, device, equipment and storage medium
CN113343711B (en) * 2021-06-29 2024-05-10 南方电网数字电网研究院有限公司 Work order generation method, device, equipment and storage medium
CN113590820A (en) * 2021-07-16 2021-11-02 杭州网易智企科技有限公司 Text processing method, device, medium and electronic equipment
CN114491040B (en) * 2022-01-28 2022-12-02 北京百度网讯科技有限公司 Information mining method and device
CN114491040A (en) * 2022-01-28 2022-05-13 北京百度网讯科技有限公司 Information mining method and device
CN115062118B (en) * 2022-07-26 2023-01-31 神州医疗科技股份有限公司 Dual-channel information extraction method and device, electronic equipment and medium
CN115062118A (en) * 2022-07-26 2022-09-16 神州医疗科技股份有限公司 Dual-channel information extraction method and device, electronic equipment and medium
CN116029354A (en) * 2022-08-09 2023-04-28 中国搜索信息科技股份有限公司 Text pair-oriented Chinese language model pre-training method

Similar Documents

Publication Publication Date Title
CN110598206B (en) Text semantic recognition method and device, computer equipment and storage medium
CN110569500A (en) Text semantic recognition method and device, computer equipment and storage medium
US10963637B2 (en) Keyword extraction method, computer equipment and storage medium
CN111160017B (en) Keyword extraction method, phonetics scoring method and phonetics recommendation method
CN110765265B (en) Information classification extraction method and device, computer equipment and storage medium
CN110021439B (en) Medical data classification method and device based on machine learning and computer equipment
CN108595695B (en) Data processing method, data processing device, computer equipment and storage medium
CN110444198B (en) Retrieval method, retrieval device, computer equipment and storage medium
CN109815333B (en) Information acquisition method and device, computer equipment and storage medium
CN110334179B (en) Question-answer processing method, device, computer equipment and storage medium
CN111859916B (en) Method, device, equipment and medium for extracting key words of ancient poems and generating poems
CN110362798B (en) Method, apparatus, computer device and storage medium for judging information retrieval analysis
CN113722438A (en) Sentence vector generation method and device based on sentence vector model and computer equipment
CN109033427B (en) Stock screening method and device, computer equipment and readable storage medium
CN112632248A (en) Question answering method, device, computer equipment and storage medium
CN112766319A (en) Dialogue intention recognition model training method and device, computer equipment and medium
CN114547315A (en) Case classification prediction method and device, computer equipment and storage medium
CN111583911A (en) Speech recognition method, device, terminal and medium based on label smoothing
CN113254613A (en) Dialogue question-answering method, device, equipment and storage medium
CN114841161A (en) Event element extraction method, device, equipment, storage medium and program product
CN114139551A (en) Method and device for training intention recognition model and method and device for recognizing intention
CN110929724A (en) Character recognition method, character recognition device, computer equipment and storage medium
CN113673225A (en) Method and device for judging similarity of Chinese sentences, computer equipment and storage medium
CN113536784A (en) Text processing method and device, computer equipment and storage medium
CN112765330A (en) Text data processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination