CN110598206B - Text semantic recognition method and device, computer equipment and storage medium

Info

Publication number
CN110598206B
Authority
CN
China
Prior art keywords
text
character
hidden layer
word
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910744603.5A
Other languages
Chinese (zh)
Other versions
CN110598206A
Inventor
卢清明 (Lu Qingming)
张然 (Zhang Ran)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd
Priority to CN201910744603.5A
Publication of CN110598206A
Priority to PCT/CN2020/104679 (published as WO2021027533A1)
Application granted
Publication of CN110598206B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The application relates to the technical field of natural language processing, and provides a text semantic recognition method, a text semantic recognition device, computer equipment and a storage medium. The method comprises the following steps: calculating the word vector of each text character in the target text and the word vector of each text participle; splicing the word vector of each text character with the word vector of the text participle it belongs to, to obtain the splicing vector of that text character; sequentially inputting the word vectors and splicing vectors of the text characters into a first neural network according to the forward appearance order of the text characters in the target text to obtain a first text feature; sequentially inputting the word vectors and splicing vectors corresponding to the text characters into a second neural network according to the reverse appearance order of the text characters in the target text to obtain a second text feature; and inputting the comprehensive text feature obtained by splicing the first text feature and the second text feature into a third neural network to obtain the semantic type of the target text. By adopting the method, the accuracy of text semantic recognition is improved.

Description

Text semantic recognition method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of natural language processing technologies, and in particular, to a text semantic recognition method, apparatus, computer device, and storage medium.
Background
With the development of the internet, text semantic recognition technology is more and more widely applied. Especially in the field of intelligent question answering, in order to accurately answer the questions users consult, the speech input by a user generally needs to be converted into text data, semantic recognition is then performed on the text data, and the real meaning expressed by the text data is determined, so that user questions can be answered accurately and quickly.
On network platforms, in order to keep online language civilized and improve the user experience, text semantic recognition technology is usually adopted to perform semantic recognition on texts published on the network, so as to recognize texts carrying semantic information such as violence, vulgar topics, sensitive topics and commercial advertisements.
At present, most text semantic analysis technologies rely on keyword matching: a keyword database is constructed in advance, and the text to be recognized is matched against the keywords in the database to identify sensitive words. However, the semantics of keywords not recorded in the database cannot be accurately recognized; that is, the coverage of the keyword database limits the accuracy of text semantic recognition, so the accuracy of this approach is low.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a text semantic recognition method, apparatus, computer device and storage medium.
A method of text semantic recognition, the method comprising:
calculating a word vector of each text character in the target text and a word vector of each text participle;
splicing the word vector of each text character with the word vector of the text participle to which the character belongs, to obtain a splicing vector of the corresponding text character;
according to the forward appearance sequence of the text characters in the target text, sequentially inputting word vectors and splicing vectors corresponding to the text characters into different hidden layers of a first neural network to obtain first text characteristics of the target text based on the forward appearance sequence;
sequentially inputting word vectors and splicing vectors corresponding to a plurality of text characters into different hidden layers of a second neural network according to the reverse appearance sequence of the text characters in the target text to obtain second text characteristics of the target text based on the reverse appearance sequence;
and inputting the comprehensive text features obtained by splicing the first text features and the second text features into a third neural network to obtain the semantic type of the target text.
In one embodiment, the method further comprises:
obtaining a sample text;
extracting character vectors and word vectors of the sample text based on a pre-trained first neural network;
respectively carrying out character numbering on the character vectors and the word vectors;
writing the character vectors, the word vectors and the corresponding character numbers into a preset file;
the calculating of the word vector of each text character and the word vector of each text participle comprises:
carrying out character numbering on each text character and each text participle;
and reading a word vector corresponding to each text character and a word vector corresponding to each text word in the preset file based on the character number.
In one embodiment, the sequentially inputting word vectors and concatenation vectors corresponding to a plurality of text characters into different hidden layers of a first neural network according to a forward appearance sequence of the text characters in the target text to obtain a first text feature of the target text based on the forward appearance sequence includes:
inputting word vectors and splicing vectors corresponding to the text characters in the current sequence into a current hidden layer of a first neural network according to the forward appearance sequence of the text characters in the target text;
inputting character features output by the current hidden layer, word vectors corresponding to the next sequential text characters and splicing vectors into the next hidden layer of the first neural network;
and taking the next hidden layer as the current hidden layer and iterating until the last sequential text character, to obtain the first text feature of the target text based on the forward appearance sequence.
In one embodiment, the current hidden layer comprises a first sub hidden layer and a second sub hidden layer; the inputting the word vector and the splicing vector corresponding to the current sequential text character into the current hidden layer of the first neural network comprises:
taking the word vector and the output of the previous hidden layer as the input of a first sub hidden layer, wherein the first sub hidden layer is used for projecting the word vector according to the weight of each neuron node corresponding to the first sub hidden layer to obtain a first sub character feature;
and taking the first sub-character features and the splicing vectors as the input of a second sub-hidden layer, wherein the second sub-hidden layer is used for projecting the splicing vectors according to the weight of each neuron node corresponding to the second sub-hidden layer to obtain second sub-character features which are taken as the output of the current hidden layer.
In one embodiment, the first neural network further comprises a random inactivation layer, the method further comprising:
and taking the first text feature as the input of the random inactivation layer, wherein the random inactivation layer is used for projecting each data in the first text feature according to a preset sparse probability to obtain a sparse feature vector as the output of the first neural network.
In one embodiment, the sequentially inputting word vectors and splicing vectors corresponding to a plurality of text characters into different hidden layers of a second neural network according to a reverse appearance sequence of the text characters in the target text to obtain a second text feature of the target text based on the reverse appearance sequence includes:
inputting the first text feature, the word vector corresponding to the first sequence text character and the splicing vector into a first hidden layer of a second neural network according to the reverse appearance sequence of the text character in the target text to obtain the character feature output by the first hidden layer;
taking a second hidden layer of the second neural network as a current hidden layer, and taking a second sequence text character as a current sequence text character;
inputting the character features output by the previous hidden layer, the word vectors corresponding to the current sequential text characters and the splicing vectors into the current hidden layer of the second neural network;
and taking the next hidden layer as the current hidden layer and iterating until the last sequential text character, to obtain the second text feature of the target text based on the reverse appearance sequence.
In one embodiment, the third neural network comprises an attention mechanism layer, a random inactivation layer and a fully connected layer; and splicing the first text feature and the second text feature and inputting the result into the third neural network to obtain the semantic type of the target text comprises the following steps:
splicing the first text characteristic and the second text characteristic to obtain a comprehensive text characteristic of the target text;
the comprehensive text features are used as input of the attention mechanism layer, and the attention mechanism layer is used for weighting each datum in the comprehensive text features to obtain weighted features;
the weighted features are used as input of the random inactivation layer, and the random inactivation layer is used for projecting each data in the weighted features according to a preset sparse probability to obtain sparse features;
the sparse features are used as the input of a full connection layer, and the full connection layer is used for carrying out classification operation on the sparse features to obtain the prediction probability corresponding to each semantic type;
and selecting the semantic type with the maximum prediction probability as the semantic type of the target text.
An apparatus for text semantic recognition, the apparatus comprising:
the vector calculation module is used for calculating a word vector of each text character and a word vector of each text word in the target text;
the vector splicing module is used for splicing the word vector of each text character with the word vector of the text participle to obtain a spliced vector of the corresponding text character;
the first text feature acquisition module is used for sequentially inputting word vectors and splicing vectors corresponding to a plurality of text characters into different hidden layers of a first neural network according to the forward appearance sequence of the text characters in the target text to obtain first text features of the target text based on the forward appearance sequence;
the second text feature acquisition module is used for sequentially inputting the word vectors and the splicing vectors corresponding to the text characters into different hidden layers of a second neural network according to the reverse appearance sequence of the text characters in the target text to obtain second text features of the target text based on the reverse appearance sequence;
and the semantic type acquisition module is used for inputting the comprehensive text features obtained by splicing the first text features and the second text features into a third neural network to obtain the semantic type of the target text.
A computer device, comprising a memory storing a computer program and a processor which, when executing the computer program, implements the steps of the above text semantic recognition method.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of the above text semantic recognition method.
According to the text semantic recognition method, device, computer equipment and storage medium, the word vector corresponding to each text character and the word vector of the text participle it belongs to are obtained through calculation and spliced to obtain the splicing vector corresponding to each text character. Representing the text through multiple feature vectors in this way enhances the feature dimensions of the text representation. Further, the word vectors and splicing vectors are input into different hidden layers of different neural networks in forward and reverse order, so that the related information of the text characters is captured more fully and the context semantics among the text characters are mined; the first text feature and the second text feature output by the two neural networks are spliced into a comprehensive feature that expresses the semantic features of the target text more fully, thereby improving the accuracy of text semantic recognition.
Drawings
FIG. 1 is a diagram of an application scenario of a text semantic recognition method in one embodiment;
FIG. 2 is a flow diagram that illustrates a method for semantic text recognition, according to one embodiment;
FIG. 3 is a schematic flow chart illustrating the generation of a preset file in one embodiment;
FIG. 4 is a block diagram of an exemplary text semantic recognition apparatus;
FIG. 5 is a block diagram showing the structure of a text semantic recognition apparatus according to another embodiment;
FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The text semantic recognition method provided by the application can be applied to the application environment shown in fig. 1. The text semantic recognition method is applied to a text semantic system. The text semantic system includes a terminal 102 and a server 104. Wherein the terminal 102 and the server 104 communicate via a network. The text semantic recognition method can be completed in the terminal 102 or the server 104, and the terminal 102 can collect a target text to be recognized and recognize a semantic type on the terminal 102 by adopting the text semantic recognition method. Or the terminal 102 may acquire a target text to be recognized, and then transmit the target text to the server 104 through network connection, and the server 104 recognizes the semantic type of the target text by using the text semantic recognition method. The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices, and the server 104 may be implemented by an independent server or a server cluster formed by a plurality of servers.
In one embodiment, as shown in fig. 2, a text semantic recognition method is provided, which is described by taking the method as an example applied to the server in fig. 1, and includes the following steps:
step S202, calculating a word vector of each text character and a word vector of each text participle in the target text.
The text characters are a plurality of independent characters obtained by segmenting the target text, and may specifically be letters, numbers, words or symbols. Text word segmentation refers to the process of segmenting the target text into single words, i.e., recombining a continuous character sequence into a word sequence according to a certain specification; it can be performed with word segmentation methods based on character string matching, on semantics, or on statistics. The word vectors of the text characters and the word vectors of the text participles are used to provide a multi-dimensional representation of the target text.
Specifically, the server determines each text character contained in the target text and the text participle to which each text character belongs according to the obtained target text, and obtains the word vector corresponding to each text character and the word vector corresponding to each text participle by matching against a pre-trained character vector library and a pre-trained word vector library. The server can also encode the obtained text characters and text participles through a preset vector encoding rule to obtain the corresponding word vectors.
In one embodiment, the specific step of obtaining the target text includes: the terminal acquires a plurality of target texts, wherein the target texts can be recognized texts obtained through voice recognition or texts directly input by a user at the terminal. And the terminal transmits the acquired target text to the server. The target text can also be obtained from a network platform, and related target texts are obtained from the network through a crawler technology.
In one embodiment, the step of determining each text character contained in the target text and the text participle to which each text character belongs includes: the server segments the received target text character by character to obtain the text characters contained in the target text; the obtained text characters are arranged according to their order in the target text to obtain a character sequence of the target text, and the text characters belonging to the stop word list are deleted from the character sequence to obtain the preprocessed character sequence. Stop words are words or characters that have no processing value and need to be filtered out in natural language processing tasks; they include English characters, numbers, mathematical characters, punctuation marks, frequently used single Chinese characters, and the like.
The server detects each character in the character sequence, and performs character identification on the same character to distinguish different words corresponding to the same character; performing word segmentation processing on the character sequence with the character identifier by utilizing a pre-constructed word segmentation word bank to obtain a word sequence with the character identifier; based on the preprocessed character sequence, the server determines the text participle to which each character belongs from the word sequence.
In one embodiment, the word stock can be established on the basis of a Xinhua dictionary or other similar published books when the word stock is established, and the word stock can also be established according to an intelligent customer service scene. The constructed word segmentation word bank can be stored in a database of the server or sent to the cloud.
In one embodiment, the target text may also be obtained by the server, for example, the server may obtain the required text data from the web page as the target text, and further determine each text character of the target text and the text participle to which each text character belongs.
For example, suppose the acquired target text is "深圳市市政府在市民中心" ("The Shenzhen municipal government is in the Citizen Center"). First, the server segments the target text character by character to obtain the character sequence "Shen/Zhen/Shi/Shi/Zheng/Fu/Zai/Shi/Min/Zhong/Xin". The characters belonging to the stop word list (here the character "Zai", 在) are deleted to obtain the preprocessed character sequence "Shen/Zhen/Shi/Shi/Zheng/Fu/Shi/Min/Zhong/Xin". Further, identical characters are given character identifiers, yielding the character sequence "Shen/Zhen/Shi01/Shi02/Zheng/Fu/Shi03/Min/Zhong/Xin", and word segmentation on this sequence yields the word sequence "Shenzhen-Shi01 / Shi02-Zheng-Fu (municipal government) / Shi03-Min (citizens) / Zhong-Xin (center)". Although the text character "Shi" (city) appears in three different words, the text participle to which each occurrence belongs can be distinguished by its character identifier.
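As a rough illustration of this preprocessing, the following Python sketch (not taken from the patent; the stop-character list is a simplified assumption) segments a text into single characters, removes stop characters, and tags repeated characters with identifiers. Mapping each character to the participle it belongs to would additionally require a word segmentation lexicon, which is omitted here.

```python
from collections import defaultdict

def preprocess(text, stop_chars):
    """Split the text into single characters and drop characters on the stop list."""
    return [ch for ch in text if ch not in stop_chars]

def tag_repeats(chars):
    """Attach identifiers to characters that occur more than once."""
    total = defaultdict(int)
    for ch in chars:
        total[ch] += 1
    seen = defaultdict(int)
    tagged = []
    for ch in chars:
        if total[ch] > 1:
            seen[ch] += 1
            tagged.append(f"{ch}{seen[ch]:02d}")
        else:
            tagged.append(ch)
    return tagged

chars = preprocess("深圳市市政府在市民中心", stop_chars={"在"})
print(tag_repeats(chars))
# ['深', '圳', '市01', '市02', '政', '府', '市03', '民', '中', '心']
```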
Step S204, splicing the word vector of each text character with the word vector of the text participle to obtain a splicing vector of the corresponding text character.
The splicing vector is formed by splicing a plurality of text vectors according to a preset rule, and carries the representation dimensions of those vectors.
Specifically, based on the obtained word vectors of the target text, the server splices the word vector corresponding to each text character with the word vector of the participle to which the character belongs to obtain the splicing vector corresponding to that text character, thereby obtaining the splicing vectors of all text characters contained in the target text; the order in which the two vectors are spliced is not restricted.
In one embodiment, the server adds or multiplies the word vector corresponding to each text character and the word vector of the text participle to which the text character belongs to obtain a splicing vector of the corresponding text character.
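A minimal sketch of this splicing step, assuming numpy and toy vector dimensions (the values and sizes are illustrative, not from the patent):

```python
import numpy as np

char_vec = np.array([1.0, 2.0, 2.0])   # word vector of a text character (assumed)
word_vec = np.array([0.5, 0.3])        # word vector of its text participle (assumed)

splice_vec = np.concatenate([char_vec, word_vec])  # splicing by concatenation
print(splice_vec)                                  # [1.  2.  2.  0.5 0.3]

# The element-wise alternatives mentioned above require equal dimensions:
word_vec_same_dim = np.array([0.5, 0.3, 0.1])
splice_by_add = char_vec + word_vec_same_dim       # splicing by addition
splice_by_mul = char_vec * word_vec_same_dim       # splicing by multiplication
```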
Step S206, according to the forward appearance sequence of the text characters in the target text, sequentially inputting the word vectors and the splicing vectors corresponding to the text characters into different hidden layers of the first neural network to obtain a first text feature of the target text based on the forward appearance sequence.
The first neural network is mainly used for generating, from the features of the target text input according to the forward appearance order of its text characters, a feature carrying the context semantics of the target text based on the forward appearance order. The first neural network includes a plurality of hidden layers, and the hidden layers may have the same or different numbers of neuron nodes. The first neural network is a recurrent neural network, such as a long short-term memory (LSTM) network or a plain recurrent neural network (RNN).
Specifically, the target text comprises a plurality of text characters, the server calculates a word vector and a splicing vector corresponding to each text character, the word vectors and the splicing vectors obtained through calculation are sequenced according to a forward appearance sequence of the text characters in the target text, further, the word vectors and the splicing vectors sequenced according to the forward appearance sequence are sequentially input into different hidden layers of a first neural network for feature extraction, mutual information of different text characters is obtained, and first text features based on the forward appearance sequence are obtained.
In one embodiment, the word vectors of the text characters, the word vectors of the text participles they belong to, and the concatenation vectors corresponding to a plurality of text characters may be sequentially input into different hidden layers of the first neural network, so as to obtain the first text feature of the target text based on the forward appearance order.
Step S208, sequentially inputting the word vectors and the splicing vectors corresponding to the text characters into different hidden layers of the second neural network according to the reverse appearance sequence of the text characters in the target text to obtain second text characteristics of the target text based on the reverse appearance sequence.
The second neural network is mainly used for generating, from the features of the target text input according to the reverse appearance order of its text characters, a feature carrying the context semantics of the target text based on the reverse appearance order. The second neural network includes a plurality of hidden layers, and the hidden layers may have the same or different numbers of neuron nodes. The second neural network is a recurrent neural network, such as a long short-term memory (LSTM) network or a plain recurrent neural network (RNN).
Specifically, based on word vectors and splicing vectors corresponding to all text characters in the obtained target text, the word vectors and the splicing vectors corresponding to the text characters are sequentially input into different hidden layers of a second neural network according to a reverse sequence of the text characters appearing in the target text, and feature extraction is performed on the input word vectors and the splicing vectors through the second neural network to obtain second text features based on the reverse appearance sequence.
In one embodiment, the maximum number of text characters that the first neural network or the second neural network can accept may be preset; if the number of text characters in the current target text is less than this maximum, the word-vector matrix formed from the target text is completed with zero vectors, and the completed word-vector matrix is used as the input of the first neural network or the second neural network.
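A hedged sketch of this zero-vector completion; the maximum character count and vector dimension are assumptions:

```python
import numpy as np

def pad_to_max(vectors, max_chars, dim):
    """Stack per-character vectors into a (max_chars, dim) matrix, padding with zeros."""
    padded = np.zeros((max_chars, dim))
    for i, v in enumerate(vectors[:max_chars]):
        padded[i] = v
    return padded

matrix = pad_to_max([np.array([1.0, 2.0]), np.array([3.0, 4.0])], max_chars=5, dim=2)
# rows 2 to 4 of the matrix remain zero vectors
```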
Step S210, inputting the comprehensive text feature obtained by splicing the first text feature and the second text feature into a third neural network to obtain the semantic type of the target text.
The comprehensive text feature is a text feature formed by splicing the output of the first neural network and the output of the second neural network according to a preset rule. And the third neural network is mainly used for classifying according to semantic types of comprehensive text features corresponding to the input target text so as to obtain the semantic types of the target text. The semantic type is to determine the type of the target text according to the semantic relation of the target text.
Specifically, based on the obtained first text feature and second text feature of the target text, the server splices them to obtain the comprehensive text feature of the target text; further, the comprehensive text feature is transmitted to the third neural network, which classifies it by semantic class to obtain the semantic type of the target text, fully taking into account the semantics of the text context and of implied words. For example, when recognizing whether the target text is profanity or polite language, two semantic types can be set: class 1, the text is profanity; class 0, the text is polite language.
In this embodiment, the word vector corresponding to each text character and the word vector of the text participle it belongs to are obtained through calculation and spliced to obtain the splicing vector corresponding to each text character. Representing the text through multiple feature vectors in this way enhances the feature dimensions of the text representation. Further, the word vectors and splicing vectors are input into different hidden layers of different neural networks in forward and reverse order, so that the related information of the text characters is captured more fully and the context semantics among the text characters are mined; the first text feature and the second text feature output by the two neural networks are spliced into a comprehensive feature that expresses the semantic features of the target text more fully, thereby improving the accuracy of text semantic recognition.
In an embodiment, as shown in fig. 3, the method further includes the step of generating a preset file:
step S302, a sample text is acquired.
Step S304, extracting the character vectors and word vectors of the sample text based on the pre-trained first neural network.
Step S306, character numbering is carried out on the character vectors and the word vectors respectively.
Step S308, writing the character vectors, the word vectors and the corresponding character numbers into a preset file.
Calculating the word vector corresponding to each text character and the word vector corresponding to each text participle comprises: carrying out character numbering on each text character and each text participle; and reading, based on the character numbers, the word vector corresponding to each text character and the word vector corresponding to each text participle from the preset file.
The preset file is an indexed file constructed in advance, which comprises the character vectors and their indexes and the word vectors and their indexes.
Specifically, before the word vector of a text character and the word vector corresponding to a text participle are calculated, a preset file supporting index queries of the character vectors and word vectors needs to be constructed. The server acquires a sample text and its known semantic type from a terminal or a webpage, extracts the character vectors and word vectors of the sample text based on a pre-trained first neural network, and respectively numbers the extracted character vectors and word vectors to obtain the mapping relation between each character vector and its number and between each word vector and its number. The server writes the character vectors, the word vectors and the corresponding character numbers into a preset file to form character vectors and word vectors indexed by character number.
Based on the obtained text characters contained in the target text and the text participles to which the text characters belong, the server respectively carries out character numbering on each text character and each text participle to obtain the mapping relation between the text characters and the character numbers and the mapping relation between the text participles and the character numbers. And inquiring from a preset file according to the character number of each text character to obtain a word vector of the corresponding text character, and inquiring from the preset file according to the character number of each text word to obtain a word vector of the corresponding text word.
In one embodiment, the character number may include a number type. And respectively carrying out character numbering on the character vector and the word vector according to the numbering types, wherein the numbering types of the character vector and the word vector can be the same or different. For example, character numbering is performed on the word vectors according to natural numbers; the word vectors may be numbered according to natural numbers or english letters.
For example, the target text is "Shenzhen city", and if the character number of the text character "Shen" is 01, the word vector corresponding to the character number 01 obtained by querying from the preset file is (1, 2).
In this embodiment, a preset file containing the character vectors and word vectors is established in advance; when the word vectors of the target text are to be calculated, the vectors corresponding to the text characters and the text participles are obtained by querying the preset file with their character numbers. The word vector of each text character and the word vector of each text participle can thus be obtained accurately and quickly, which improves the speed and accuracy of obtaining the semantic type of the target text.
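One possible realization of the preset file, assuming a JSON layout (the patent only requires that the vectors and their character numbers be written to some file):

```python
import json
import numpy as np

def write_preset_file(path, char_vectors, word_vectors):
    """char_vectors / word_vectors map character numbers to vectors."""
    payload = {
        "chars": {str(num): list(vec) for num, vec in char_vectors.items()},
        "words": {str(num): list(vec) for num, vec in word_vectors.items()},
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f, ensure_ascii=False)

def read_vector(path, number, kind="chars"):
    """Look up the vector stored under a character number."""
    with open(path, encoding="utf-8") as f:
        payload = json.load(f)
    return np.array(payload[kind][str(number)])

write_preset_file("preset.json", {"01": (1, 2, 2)}, {"A1": (0.5, 0.3)})
print(read_vector("preset.json", "01"))  # [1 2 2]
```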
In one embodiment, sequentially inputting word vectors and splicing vectors corresponding to a plurality of text characters into different hidden layers of a first neural network according to a forward appearance sequence of the text characters in a target text, and obtaining a first text feature of the target text based on the forward appearance sequence comprises: inputting word vectors and splicing vectors corresponding to the text characters in the current sequence into a current hidden layer of a first neural network according to the forward appearance sequence of the text characters in a target text; inputting character features output by the current hidden layer, word vectors corresponding to the next sequential text characters and splicing vectors into the next hidden layer of the first neural network; and iterating by taking the next hidden layer as the current hidden layer until the last sequential text characters obtain the first text characteristic of the target text based on the forward appearance sequence.
Specifically, according to the forward appearance order of the text characters in the target text, the server inputs the word vector and splicing vector corresponding to the first sequential text character into the first hidden layer of the first neural network, and the input vectors are projected through the first hidden layer to obtain the character feature corresponding to that text character. The first sequential text character is the character that comes first when all text characters in the target text are ordered according to the preset appearance order; the last sequential text character is the character that comes last in that order.
The server then takes the first hidden layer as the current hidden layer and obtains the word vector and splicing vector corresponding to the current hidden layer according to the forward appearance order of the text characters in the target text. The weights of the neuron nodes in the current hidden layer are preset and may be the same or different. The server then performs a nonlinear mapping on the features input to the current hidden layer according to the preset weights of the neuron nodes to obtain the character feature output by the current hidden layer. The nonlinear mapping may adopt an activation function such as the sigmoid (S-shaped) function, the tanh (hyperbolic tangent) function, or the ReLU (rectified linear unit) function.
The server inputs the character feature output by the current hidden layer and the word vector and splicing vector corresponding to the next sequential text character into the next hidden layer of the first neural network, takes the next hidden layer as the current hidden layer, and executes the following steps in a loop: obtain the word vector and splicing vector corresponding to the current hidden layer according to the forward appearance order of the text characters in the target text; with the weights of the neuron nodes in the current hidden layer preset, apply a nonlinear mapping to the character feature output by the previous hidden layer and the word vector and splicing vector corresponding to the current hidden layer, according to those weights, to obtain the character feature output by the current hidden layer. The iteration repeats with the next hidden layer as the current hidden layer until the word vector and splicing vector corresponding to the last sequential text character have been input into the current hidden layer; the character feature it outputs is taken as the first text feature of the target text based on the forward appearance order.
In one embodiment, the current hidden layer comprises a first sub hidden layer and a second sub hidden layer; inputting the word vector and the splicing vector corresponding to the current sequential text character into the current hidden layer of the first neural network comprises the following steps: taking the word vector and the output of the previous hidden layer as the input of a first sub hidden layer, wherein the first sub hidden layer is used for projecting the word vector according to the weight of each neuron node corresponding to the first sub hidden layer to obtain a first sub character characteristic; and the first sub-character features and the splicing vectors are used as the input of a second sub-hidden layer, and the second sub-hidden layer is used for projecting the splicing vectors according to the weight of each neuron node corresponding to the second sub-hidden layer to obtain second sub-character features which are used as the output of the current hidden layer.
Specifically, the current hidden layer comprises a first sub hidden layer and a second sub hidden layer; the server inputs word vectors corresponding to text characters sequentially positioned at the first position into the first sub-hidden layer, and the word vectors are projected through the weight of each neuron node preset in the first sub-hidden layer to obtain first character features output by the first sub-hidden layer; further, the server takes the splicing vector corresponding to the first character feature and the text character with the appearance sequence positioned at the first position as the input of the second sub-hidden layer, and performs nonlinear mapping on the input feature according to the preset weight of each neuron node in the second sub-hidden layer, so as to obtain the character feature output by the second sub-hidden layer and take the character feature as the output of the first hidden layer.
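A hedged numpy sketch of one hidden layer with its two sub hidden layers; the tanh projections and weight shapes are assumptions consistent with the formula given later in the text:

```python
import numpy as np

def hidden_layer(h_prev, word_vec, splice_vec, W1, b1, W2, b2):
    # first sub hidden layer: project the word vector with the previous layer's output
    sub1 = np.tanh(W1 @ np.concatenate([h_prev, word_vec]) + b1)
    # second sub hidden layer: project the splicing vector with the first sub feature
    sub2 = np.tanh(W2 @ np.concatenate([sub1, splice_vec]) + b2)
    return sub2  # output of the current hidden layer
```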
In one embodiment, the current hidden layer comprises a first sub hidden layer and a second sub hidden layer; inputting the word vector and the splicing vector corresponding to the current sequential text character into the current hidden layer of the first neural network comprises the following steps: the server takes the splicing vector and the output of the previous hidden layer as the input of the first sub hidden layer, which projects the splicing vector according to the weights of its neuron nodes to obtain a first sub character feature; and takes the first sub character feature and the word vector as the input of the second sub hidden layer, which projects the word vector according to the weights of its neuron nodes to obtain a second sub character feature as the output of the current hidden layer.
In one embodiment, according to the forward appearance order of the text characters in the target text, the server sequentially inputs the word vector of each text character, the word vector of the text participle it belongs to, and the splicing vector into different hidden layers of the first neural network, and the first text feature of the target text based on the forward appearance order is obtained.
Specifically, according to the forward appearance order of the text characters in the target text, the server inputs the two word vectors and the splicing vector corresponding to the current sequential text character into the current hidden layer of the first neural network, and obtains the character feature output by the current hidden layer by applying a nonlinear mapping to the character feature output by the previous hidden layer and the vectors corresponding to the current hidden layer, according to the weights of the neuron nodes of the current hidden layer. The server then inputs the character feature output by the current hidden layer together with the vectors corresponding to the next sequential text character into the next hidden layer of the first neural network, takes the next hidden layer as the current hidden layer, and repeats this step until the last sequential text character, obtaining the first text feature of the target text based on the forward appearance order.
In one embodiment, the current hidden layer comprises a first sub hidden layer, a second sub hidden layer and a third sub hidden layer; inputting the character's word vector, the participle's word vector and the splicing vector corresponding to the current sequential text character into the current hidden layer of the first neural network comprises: taking the character's word vector and the output of the previous hidden layer as the input of the first sub hidden layer, which projects that word vector according to the weights of its neuron nodes to obtain a first sub character feature; taking the first sub character feature and the participle's word vector as the input of the second sub hidden layer, which projects that word vector according to the weights of its neuron nodes to obtain a second sub character feature; and taking the second sub character feature and the splicing vector as the input of the third sub hidden layer, which projects the splicing vector according to the weights of its neuron nodes to obtain a third sub character feature as the output of the current hidden layer. The projection order of the three vectors is not fixed and can be set arbitrarily; for example, the participle's word vector could instead be projected in the first sub hidden layer.
In one embodiment, the following example specifically illustrates how the server obtains the character feature output by the current hidden layer by nonlinear mapping, from the word vector and splicing vector corresponding to the current hidden layer, the weights of the neuron nodes of the current hidden layer, and the character feature output by the previous hidden layer.
For example, suppose the weight matrix of the neuron nodes of the current hidden layer is denoted W_f, the word vector corresponding to the current hidden layer is x_t, the splicing vector corresponding to the current hidden layer is y_t, the character feature output by the previous hidden layer is h_{t-1}, and the nonlinear function is tanh. Then the character feature f_t output by the current hidden layer can be calculated by the following formula:
f_t = tanh(W_f [h_{t-1}, x_t, y_t] + b_f), where b_f is the bias of the current hidden layer.
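A minimal numpy sketch of this recurrence, iterated over the text characters in forward appearance order (all shapes are assumptions):

```python
import numpy as np

def first_network(word_vecs, splice_vecs, W_f, b_f, hidden_dim):
    h = np.zeros(hidden_dim)  # no previous character feature before the first layer
    for x_t, y_t in zip(word_vecs, splice_vecs):
        h = np.tanh(W_f @ np.concatenate([h, x_t, y_t]) + b_f)  # f_t
    return h  # first text feature based on the forward appearance order
```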
In one embodiment, the first neural network further comprises a random inactivation layer, and the method further comprises: and the first text feature is used as the input of a random inactivation layer, and the random inactivation layer is used for projecting each data in the first text feature according to a preset sparse probability to obtain a sparse feature vector as the output of the first neural network.
The random inactivation layer (dropout) is mainly used for performing sparse processing on the input first text feature and performing zeroing processing on partial elements of the first text feature, so that overfitting of the neural network is prevented, and meanwhile, the calculation amount of the neural network is reduced.
Specifically, the server inputs the first text feature into the random inactivation layer, which performs sparse processing on it according to a set sparse probability: each data element in the first text feature is projected according to the sparse probability, where the sparse probability is the probability that an element survives the projection, to obtain a sparse feature vector. For example, if the first text feature is the one-dimensional sequence [1, 2, 3, 4]^T and the sparse probability is set to 0.5, each number in the sequence survives the projection with probability 0.5; the result output by the random inactivation layer may be, for instance, [0, 2, 0, 4]^T or [0, 0, 0, 4]^T.
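A hedged sketch of the random inactivation layer: each element survives with the preset sparse probability and is zeroed otherwise. Whether surviving values are rescaled (as in inverted dropout) is an implementation choice the text does not specify:

```python
import numpy as np

def random_inactivation(features, keep_prob=0.5):
    mask = np.random.rand(*features.shape) < keep_prob  # elementwise keep/zero decision
    return features * mask

print(random_inactivation(np.array([1.0, 2.0, 3.0, 4.0])))  # e.g. [0. 2. 0. 4.]
```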
In one embodiment, sequentially inputting the word vectors and splicing vectors corresponding to a plurality of text characters into different hidden layers of the second neural network according to the reverse appearance sequence of the text characters in the target text to obtain the second text feature of the target text based on the reverse appearance sequence comprises: inputting the first text feature and the word vector and splicing vector corresponding to the first sequential text character into the first hidden layer of the second neural network according to the reverse appearance sequence of the text characters in the target text, to obtain the character feature output by the first hidden layer; taking the second hidden layer of the second neural network as the current hidden layer, and the second sequential text character as the current sequential text character; inputting the character feature output by the previous hidden layer and the word vector and splicing vector corresponding to the current sequential text character into the current hidden layer of the second neural network; and taking the next hidden layer as the current hidden layer and iterating until the last sequential text character, obtaining the second text feature of the target text based on the reverse appearance sequence.
Specifically, the first neural network and the second neural network are connected in series, the output of the first neural network serving as the input of the second neural network. The word vector and splicing vector corresponding to the first sequential text character are obtained according to the reverse appearance order of the text characters in the target text; the server inputs the first text feature output by the first neural network together with this word vector and splicing vector into the first hidden layer of the second neural network to obtain the character feature output by the first hidden layer. The second hidden layer of the second neural network is then taken as the current hidden layer and the second sequential text character as the current sequential text character. Further, the character feature output by the previous hidden layer and the word vector and splicing vector corresponding to the current sequential text character are input into the current hidden layer of the second neural network, and a nonlinear mapping is applied to the input features according to the preset weights of the neuron nodes in the current hidden layer to obtain the character feature output by the current hidden layer. The server then inputs the character feature output by the current hidden layer and the word vector and splicing vector corresponding to the next sequential text character into the next hidden layer of the second neural network, and iterates with the next hidden layer as the current hidden layer until the last sequential text character, obtaining the second text feature of the target text based on the reverse appearance order.
In one embodiment, the current hidden layer of the second neural network comprises a first sub-hidden layer and a second sub-hidden layer; inputting the character features output by the previous hidden layer, the word vectors corresponding to the current sequential text characters and the splicing vectors into the current hidden layer of the second neural network comprises the following steps: taking the word vector and the character features output by the previous hidden layer as the input of a first sub hidden layer, wherein the first sub hidden layer is used for projecting the word vector according to the weight of each neuron node corresponding to the first sub hidden layer to obtain first sub character features; and the first sub-character features and the splicing vectors are used as the input of a second sub-hidden layer, and the second sub-hidden layer is used for projecting the splicing vectors according to the weight of each neuron node corresponding to the second sub-hidden layer to obtain second sub-character features which are used as the output of the current hidden layer.
In one embodiment, inputting the second text feature into a third neural network, and obtaining the semantic type of the target text includes: the second text feature is used as the input of an attention mechanism layer, and the attention mechanism layer is used for weighting each datum in the second text feature to obtain a weighted feature; the weighted features are used as input of a random inactivation layer, and the random inactivation layer is used for projecting each datum in the weighted features according to a preset sparse probability to obtain sparse features; the sparse features are used as the input of a full connection layer, and the full connection layer is used for carrying out classification operation on the sparse features to obtain the prediction probability corresponding to each semantic type; and selecting the semantic type with the maximum prediction probability as the semantic type of the target text.
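An illustrative sketch of this third-network head: attention-style weighting over the input feature, random inactivation, a fully connected layer, and selection of the semantic type with the largest predicted probability. The form of the attention scoring and all weight shapes are assumptions:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def third_network(feature, attn_w, fc_W, fc_b, keep_prob=0.5):
    weights = softmax(attn_w * feature)           # one attention weight per element
    weighted = weights * feature                  # weighted feature
    mask = np.random.rand(*weighted.shape) < keep_prob
    sparse = weighted * mask                      # random inactivation
    probs = softmax(fc_W @ sparse + fc_b)         # prediction probability per type
    return int(np.argmax(probs))                  # semantic type with max probability
```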
In one embodiment, the server sequentially inputs the word vector of each text character, the word vector of the text participle it belongs to, and the splicing vector into different hidden layers of the second neural network according to the reverse appearance order of the text characters in the target text, obtaining the second text feature of the target text based on the reverse appearance order.
In one embodiment, the second neural network further comprises a random inactivation layer, and the method further comprises: and the server takes the second text characteristics as the input of a random inactivation layer, and the random inactivation layer is used for projecting each data in the second text characteristics according to a preset sparse probability to obtain a sparse characteristic vector as the output of the second neural network.
The processing of the target text by the first neural network and the second neural network is described below with a specific embodiment. For example, for the target text "Shenzhen City" (深圳市), suppose the text character "Shen" corresponds to the word vector (1, 2, 2) and the splicing vector (1, 1); the text character "Zhen" corresponds to the word vector (1, 2, 3, 4) and the splicing vector (2, 1); and the text character "Shi" ("City") corresponds to the word vector (0, 2, 5) and the splicing vector (3, 1). According to the forward appearance order of the text characters in the target text, the word vectors are stacked into the word-vector matrix of the target text and the splicing vectors into the splicing-vector matrix (both matrices are shown as figures in the original publication).
The word vector and splicing vector corresponding to the first sequential text character are input into the first hidden layer, which is taken as the current hidden layer; the current hidden layer comprises a first sub hidden layer and a second sub hidden layer. The word vector (1, 2, 2) is input into the first sub hidden layer of the current hidden layer to obtain the first sub character feature output by the first sub hidden layer; further, the first sub character feature and the splicing vector (1, 1) are input into the second sub hidden layer, and the input features are nonlinearly mapped according to the preset weights of the neuron nodes in the second sub hidden layer, so that the character feature output by the second sub hidden layer is obtained and serves as the output of the current hidden layer.
Further, moving to the next text character in the forward appearance order gives the corresponding word vector (1, 2, 3, 4) and splicing vector (2, 1), and the second hidden layer is taken as the current hidden layer. The output of the previous hidden layer and the word vector (1, 2, 3, 4) are input into the first sub hidden layer of the current hidden layer to obtain its output; that output and the splicing vector (2, 1) are input into the second sub hidden layer of the current hidden layer, and the input features are nonlinearly mapped according to the preset weights of the neuron nodes in that sub hidden layer to obtain the character feature output by the second sub hidden layer, which serves as the output of the current hidden layer. This continues until the output of the previous hidden layer and the word vector (0, 2, 5) and splicing vector (3, 1) corresponding to the last sequential text character are input into the current hidden layer to obtain its output, and this output result is used as the first text feature output by the first neural network.
In one embodiment, the first neural network and the second neural network are connected in series, and the second text features output by the second neural network are input into the third neural network; wherein the output of the first neural network is used as the input of the second neural network.
Based on the reverse appearance sequence of the text characters in the target text, the word vectors of the target text form the sequence ((0, 2, 5), (1, 2, 3, 4), (1, 2)), and the splicing vectors form the sequence ((3, 1), (2, 1), (1, 1)).
According to the reverse appearance sequence of the text characters, the first text feature and the word vector and splicing vector corresponding to the first sequence text character are input into the first hidden layer of the second neural network to obtain the character feature output by that hidden layer; the first hidden layer of the second neural network is taken as the current hidden layer, and the second sequence text character as the current sequence text character. That is, the first text feature and the word vector (0, 2, 5) are input into the first sub hidden layer to obtain the output of the first sub hidden layer of the current hidden layer. This output and the splicing vector (3, 1) are input into the second sub hidden layer of the current hidden layer, and the input features are nonlinearly mapped according to the preset weights of the neuron nodes in that sub hidden layer, so that the character feature output by the second sub hidden layer of the current hidden layer is obtained and serves as the output of the current hidden layer.
The text characters are then advanced in the reverse appearance sequence to obtain the corresponding word vector (1, 2, 3, 4) and splicing vector (2, 1), and the second hidden layer is taken as the current hidden layer. The output of the previous hidden layer and the word vector (1, 2, 3, 4) are input into the first sub hidden layer of the current hidden layer to obtain the output of that sub hidden layer; this output and the splicing vector (2, 1) are input into the second sub hidden layer of the current hidden layer, and the input features are nonlinearly mapped according to the preset weights of the neuron nodes in that sub hidden layer, so that the character feature output by the second sub hidden layer of the current hidden layer is obtained and serves as the output of the current hidden layer. This continues until the output of the previous hidden layer and the word vector (1, 2) and splicing vector (1, 1) corresponding to the last sequence text character are input into the current hidden layer to obtain its output, which serves as the second text feature output by the second neural network. The second text feature is input into the third neural network to obtain the semantic type of the target text. By connecting the first neural network and the second neural network in series, with the output of the first neural network serving as the input of the second, the first text feature based on the forward sequence is input into the second neural network together with the word vectors and splicing vectors based on the reverse appearance sequence, so that the mutual information among text characters is mined more fully; in particular, context information between characters is obtained well even when the text characters are far apart.
In one embodiment, the first neural network and the second neural network are connected in parallel, and the first text feature output by the first neural network and the second text feature output by the second neural network are spliced and input into the third neural network. The first neural network and the second neural network are used for processing the target text in parallel, so that the data processing speed can be improved.
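Reusing the forward_pass function and the chars/splices vectors from the sketch above, the two wirings can be illustrated as follows; this is again only a sketch, with the splicing in the parallel case taken to be a simple concatenation.

```python
import numpy as np  # continues the forward_pass sketch above

# Serial: the first text feature seeds the first hidden layer of the second network,
# and the second text feature alone is then passed to the third neural network.
first_feature = forward_pass(chars, splices)
second_feature_serial = forward_pass(chars[::-1], splices[::-1], h0=first_feature)

# Parallel: both passes run independently and their outputs are spliced.
second_feature = forward_pass(chars[::-1], splices[::-1])
composite_feature = np.concatenate([first_feature, second_feature])
```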
In this embodiment, multiple features (the word vectors and splicing vectors) corresponding to the text characters are input into the first neural network and the second neural network, and the hidden layers in the two networks compute over the input features recurrently to obtain the first text feature and the second text feature that represent the semantics of the target text. Thus, even when the interval between text characters is large, the relevant information of the predicted position is acquired well, and this information does not decay as the number of recurrence steps increases.
In one embodiment, the third neural network includes an attention mechanism layer, a random inactivation layer and a full connection layer; splicing the first text feature and the second text feature and inputting the result into the third neural network to obtain the semantic type of the target text comprises the following steps: splicing the first text feature and the second text feature to obtain a comprehensive text feature of the target text; taking the comprehensive text feature as the input of the attention mechanism layer, which weights each datum in the comprehensive text feature to obtain weighted features; taking the weighted features as the input of the random inactivation layer, which projects each datum in the weighted features according to a preset sparse probability to obtain sparse features; taking the sparse features as the input of the full connection layer, which performs a classification operation on the sparse features to obtain the prediction probability corresponding to each semantic type; and selecting the semantic type with the maximum prediction probability as the semantic type of the target text.
Specifically, the server splices the first text feature and the second text feature of the target text to obtain the comprehensive text feature of the target text. The server then inputs the comprehensive text feature into the attention mechanism layer, which computes a coefficient sequence from each datum in the comprehensive text feature according to pre-trained coefficient weight parameters; activates the coefficient sequence with a nonlinear activation function; and normalizes the activated coefficient sequence with a logistic regression (softmax) function to obtain the coefficient probability corresponding to each datum in the comprehensive text feature, where each coefficient probability lies in the interval [0, 1]. The obtained coefficient probabilities are multiplied by the corresponding data of the comprehensive text feature to yield the weighted features, which serve as the output of the attention mechanism layer.
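A minimal sketch of that weighting follows, assuming the coefficient sequence is an elementwise product with the trained coefficient weights and that tanh is the nonlinear activation; both choices, and all names, are assumptions since the patent does not fix them.

```python
import numpy as np

def attention_weighting(composite_features: np.ndarray, coeff_weights: np.ndarray) -> np.ndarray:
    coeffs = coeff_weights * composite_features   # coefficient sequence (elementwise is assumed)
    activated = np.tanh(coeffs)                   # nonlinear activation
    exp = np.exp(activated - activated.max())
    coeff_probs = exp / exp.sum()                 # softmax: coefficient probabilities in [0, 1]
    return coeff_probs * composite_features       # weighted features

weighted = attention_weighting(np.array([0.5, 2.0, -1.0]), np.array([0.3, 0.8, 0.1]))
```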
Further, the server inputs the weighted features output by the attention mechanism layer into the random inactivation layer, which sparsifies the weighted features according to the set sparse probability: each datum in the weighted features is projected according to the sparse probability, yielding the sparse features, where the sparse probability refers to the probability that a datum is retained after projection.
Further, the server inputs the sparse features into the full connection layer, which performs a classification operation on them: the prediction probability corresponding to each semantic type is calculated from the trained weight parameters of the full connection layer, and each prediction probability output by the full connection layer corresponds to one semantic type. The server selects the semantic type with the maximum prediction probability as the semantic type of the target text.
In one embodiment, the third neural network layer further includes a logistic regression layer (softmax layer), specifically including: and taking the prediction probability corresponding to each semantic type as the input of a softmax layer, wherein the softmax layer is used for carrying out normalization processing on each prediction probability to obtain the probability corresponding to each semantic type, and selecting the semantic type with the maximum probability as the semantic type of the target text.
For example, the full connection layer outputs the prediction probabilities a and b, where a corresponds to semantic type 1 and b corresponds to semantic type 0. After normalization with the softmax function, the normalized output probability of each semantic type is obtained: e^a / (e^a + e^b) for semantic type 1 and e^b / (e^a + e^b) for semantic type 0.
And selecting the semantic type corresponding to the maximum probability as the semantic type of the target text.
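For instance, with assumed values a = 2.0 and b = 0.5 (illustrative numbers, not from the patent), the computation is:

```python
import math

a, b = 2.0, 0.5  # assumed outputs of the full connection layer
pa = math.exp(a) / (math.exp(a) + math.exp(b))  # probability of semantic type 1 (~0.818)
pb = math.exp(b) / (math.exp(a) + math.exp(b))  # probability of semantic type 0 (~0.182)
# pa > pb, so semantic type 1 is selected as the semantic type of the target text.
```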
In one embodiment, the neural network comprises a first neural network, a second neural network and a third neural network, and the training process of the neural network model comprises the following steps: obtaining a sample text and a known label, determining sample text characters contained in the sample text and sample text participles to which each sample text character belongs, calculating sample word vectors corresponding to the sample text characters and sample word vectors corresponding to the sample text participles, and splicing the sample word vectors corresponding to each sample text character with the sample word vectors of the sample text participles to obtain corresponding sample spliced vectors of the sample text characters; according to the forward appearance sequence of the sample text characters in the sample text, sequentially inputting sample word vectors and sample splicing vectors corresponding to the sample text characters to a first neural network to be trained to obtain a first text characteristic of a sample; according to the reverse appearance sequence of the sample text characters in the sample text, sequentially inputting sample word vectors and sample splicing vectors corresponding to the sample text characters to a second neural network to be trained to obtain second text characteristics of the sample;
inputting sample comprehensive text characteristics obtained by splicing the sample first text characteristics and the sample second text characteristics into a third neural network to be trained to obtain a predicted semantic type of the sample text; calculating a loss value according to the predicted semantic type and the known label, and transmitting the loss value to each layer of the neural network model by a reverse gradient propagation method to obtain the gradient of each layer of parameters; and adjusting parameters of each layer in the neural network model according to the gradient until the determined loss value reaches a training stop condition.
Adjusting the parameters of each layer in the neural network model specifically comprises adjusting the weight parameters of the full connection layer and the weight and bias parameters of each hidden layer in the first neural network and the second neural network. The function used to calculate the loss value may be a cross-entropy loss function. The reverse gradient propagation method may be batch gradient descent (BGD), mini-batch gradient descent (MBGD), or stochastic gradient descent (SGD).
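The training loop can be sketched as follows, using PyTorch purely as an illustrative framework (the patent names none); `model` stands in for the three connected neural networks, `sample_batches` for an assumed iterable of sample vectors with known labels, and the stop condition is likewise an assumption.

```python
import torch
import torch.nn as nn

def train(model: nn.Module, sample_batches, epochs: int = 10, lr: float = 0.01):
    criterion = nn.CrossEntropyLoss()                       # cross-entropy loss function
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)  # SGD; BGD/MBGD differ only in batch size
    for _ in range(epochs):
        for word_vecs, splice_vecs, labels in sample_batches:
            logits = model(word_vecs, splice_vecs)  # predicted semantic-type scores
            loss = criterion(logits, labels)        # loss between prediction and known label
            optimizer.zero_grad()
            loss.backward()                         # reverse gradient propagation to every layer
            optimizer.step()                        # adjust weight/bias parameters by the gradients
            if loss.item() < 1e-3:                  # illustrative training stop condition
                return
```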
In this embodiment, the attention mechanism layer of the third neural network weights the comprehensive text features so that features carrying high mutual information between text characters are highlighted and features carrying low mutual information are weakened; the random inactivation layer then sparsifies the weighted features, and the full connection layer classifies the resulting sparse features to obtain the prediction probability corresponding to each semantic type, the semantic type with the maximum prediction probability being selected as the semantic type of the target text. Representing the semantics of the target text by the comprehensive text features, together with the weighting and sparsification, strengthens the context semantics of the text characters, reduces the computation load on the computer device, and improves the classification accuracy for the target text.
It should be understood that although the various steps in the flow charts of fig. 2-3 are shown in an order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the illustrated order and may be performed in other orders. Moreover, at least some of the steps in fig. 2-3 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and their order of performance is not necessarily sequential; they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 4, there is provided a text semantic recognition apparatus 400, including: a vector calculation module 402, a vector concatenation module 404, a first text feature acquisition module 406, a second text feature acquisition module 408, and a semantic type acquisition module 410, wherein:
The vector calculation module 402 is configured to calculate a word vector of each text character and a word vector of each text word in the target text.
The vector splicing module 404 is configured to splice the word vector of each text character with the word vector of the text participle to which the text character belongs, so as to obtain the splicing vector of the corresponding text character.
The first text feature obtaining module 406 is configured to sequentially input word vectors and concatenation vectors corresponding to the multiple text characters into different hidden layers of the first neural network according to a forward appearance sequence of the text characters in the target text, so as to obtain a first text feature of the target text based on the forward appearance sequence.
The second text feature obtaining module 408 is configured to sequentially input word vectors and concatenation vectors corresponding to the plurality of text characters into different hidden layers of the second neural network according to a reverse appearance sequence of the text characters in the target text, so as to obtain a second text feature of the target text based on the reverse appearance sequence.
The semantic type obtaining module 410 is configured to input the comprehensive text feature obtained by splicing the first text feature and the second text feature into a third neural network, so as to obtain the semantic type of the target text.
In one embodiment, according to the forward appearance sequence of the text characters in the target text, the first neural network inputs the word vector and splicing vector corresponding to the current sequence text character into its current hidden layer, inputs the character features output by the current hidden layer together with the word vector and splicing vector corresponding to the next sequence text character into its next hidden layer, and iterates with the next hidden layer as the current hidden layer until the last sequence text character, obtaining the first text feature of the target text based on the forward appearance sequence.
In one embodiment, as shown in fig. 5, the apparatus further includes a preset sample generation module 412, which extracts the character vectors and word vectors of the sample text based on the pre-trained first neural network layer; numbers the character vectors and the word vectors respectively; and writes the character vectors, the word vectors and the corresponding character numbers into a preset file. Calculating the word vector of each text character and the word vector of each text participle then comprises: numbering each text character and each text participle; and reading, based on the character number, the word vector corresponding to each text character and the word vector corresponding to each text participle from the preset file.
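A minimal sketch of such a preset file and the number-based lookup follows, assuming a JSON layout; the patent does not specify the file format, and every name here is illustrative.

```python
import json

def write_preset_file(char_vectors, word_vectors, path="preset_vectors.json"):
    # Assign a character number to each token and persist the vectors with their numbers.
    char_index = {t: i for i, t in enumerate(char_vectors)}
    word_index = {t: i for i, t in enumerate(word_vectors)}
    payload = {
        "char_index": char_index,
        "word_index": word_index,
        "char_vectors": {str(char_index[t]): v for t, v in char_vectors.items()},
        "word_vectors": {str(word_index[t]): v for t, v in word_vectors.items()},
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f, ensure_ascii=False)

def read_vectors(text_chars, text_words, path="preset_vectors.json"):
    # Number each text character / text participle via the index, then read its vector.
    with open(path, encoding="utf-8") as f:
        preset = json.load(f)
    char_vecs = [preset["char_vectors"][str(preset["char_index"][c])] for c in text_chars]
    word_vecs = [preset["word_vectors"][str(preset["word_index"][w])] for w in text_words]
    return char_vecs, word_vecs
```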
In one embodiment, the first text feature obtaining module is further configured to input word vectors and concatenation vectors corresponding to text characters in a current order into a current hidden layer of the first neural network according to a forward appearance order of the text characters in the target text; inputting character features output by the current hidden layer, word vectors corresponding to the next sequential text characters and splicing vectors into the next hidden layer of the first neural network; and taking the next hidden layer as the current hidden layer, returning to the step of inputting the character features output by the current hidden layer, the word vectors corresponding to the next sequential text characters and the splicing vectors into the next hidden layer of the first neural network until the final sequential text characters, and obtaining the first text features of the target text based on the forward appearance sequence.
In one embodiment, the second text feature obtaining module is further configured to input the first text feature, the word vector corresponding to the first sequence text character, and the concatenation vector into a first hidden layer of a second neural network according to a reverse appearance sequence of the text character in the target text; inputting the output of the first hidden layer, the word vector corresponding to the current sequential text character and the splicing vector into the current hidden layer of the second neural network; inputting character features output by the current hidden layer, word vectors corresponding to the next sequential text characters and splicing vectors into the next hidden layer of the second neural network; and taking the next hidden layer as the current hidden layer, returning to the step of inputting the character features output by the current hidden layer, the word vectors corresponding to the next sequential text characters and the splicing vectors into the next hidden layer of the second neural network until the last sequential text characters, and obtaining second text features of the target text based on the reverse appearance sequence.
In an embodiment, the first text feature obtaining module is further configured to use a word vector and an output of a previous hidden layer as inputs of a first sub hidden layer, where the first sub hidden layer is configured to project the word vector according to weights of each neuron node corresponding to the first sub hidden layer to obtain a first sub character feature; and the first sub-character features and the splicing vectors are used as the input of a second sub-hidden layer, and the second sub-hidden layer is used for projecting the splicing vectors according to the weight of each neuron node corresponding to the second sub-hidden layer to obtain second sub-character features which are used as the output of the current hidden layer.
In an embodiment, the first text feature obtaining module is further configured to use the first text feature as an input of a random inactivation layer, and the random inactivation layer is configured to project each data in the first text feature according to a preset sparse probability to obtain a sparse feature vector, which is used as an output of the first neural network.
In one embodiment, the semantic type obtaining module is further configured to splice the first text feature and the second text feature to obtain a comprehensive text feature of the target text; the comprehensive text features are used as input of an attention mechanism layer, and the attention mechanism layer is used for weighting each datum in the comprehensive text features to obtain weighted features; the weighted features are used as input of a random inactivation layer, and the random inactivation layer is used for projecting each data in the weighted features according to a preset sparse probability to obtain sparse features; the sparse features are used as the input of a full connection layer, and the full connection layer is used for carrying out classification operation on the sparse features to obtain the prediction probability corresponding to each semantic type; and selecting the semantic type with the maximum prediction probability as the semantic type of the target text.
In this embodiment, the word vector corresponding to each text character and the word vector of its text participle are calculated and spliced to obtain the splicing vector of the corresponding text character; representing the text through multiple feature vectors in this way enhances the feature dimensions of the text representation. Further, inputting the word vectors and splicing vectors into different hidden layers of different neural networks in the forward and reverse sequences allows the relevant information of the text characters to be acquired more fully and the context semantics among text characters to be mined; splicing the first text feature output by the first neural network with the second text feature output by the second neural network yields a comprehensive feature that expresses the semantics of the target text more fully, improving the accuracy of text semantic recognition.
For the specific definition of the text semantic recognition device, reference may be made to the above definition of the text semantic recognition method, which is not described herein again. The modules in the text semantic recognition device can be wholly or partially implemented by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer equipment is used for storing preset files, word vectors corresponding to text characters contained in the target text and word vectors corresponding to text word segmentation. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a text semantic recognition method.
It will be appreciated by those skilled in the art that the configuration shown in fig. 6 is a block diagram of only a portion of the configuration associated with the present application and does not limit the computing device to which the present application may be applied; a particular computing device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor when executing the computer program performs the steps of the text semantic recognition method provided in any one of the embodiments of the present application.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the text semantic recognition method provided in any one of the embodiments of the present application.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware related to instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
All possible combinations of the technical features in the above embodiments may not be described for the sake of brevity, but should be considered as being within the scope of the present disclosure as long as there is no contradiction between the combinations of the technical features.
The above examples only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent application shall be subject to the appended claims.

Claims (10)

1. A method of text semantic recognition, the method comprising:
calculating a word vector of each text character and a word vector of each text word in the target text;
splicing the word vector of each text character with the word vector of the text participle to which it belongs to obtain the splicing vector of the corresponding text character;
according to the forward appearance sequence of the text characters in the target text, sequentially inputting word vectors and splicing vectors corresponding to the text characters into different hidden layers of a first neural network to obtain first text characteristics of the target text based on the forward appearance sequence;
sequentially inputting word vectors and splicing vectors corresponding to a plurality of text characters into different hidden layers of a second neural network according to the reverse appearance sequence of the text characters in the target text to obtain second text characteristics of the target text based on the reverse appearance sequence;
and inputting the comprehensive text features obtained by splicing the first text features and the second text features into a third neural network to obtain the semantic type of the target text.
2. The method of claim 1, further comprising:
obtaining a sample text;
extracting a character vector and a word vector of the sample text based on a pre-trained first neural network layer;
respectively carrying out character numbering on the character vectors and the word vectors;
writing the character vectors, the word vectors and the corresponding character numbers into a preset file;
the calculating the word vector of each text character and the word vector of each text participle in the target text comprises the following steps:
carrying out character numbering on each text character and each text participle;
and reading a word vector corresponding to each text character and a word vector corresponding to each text participle in the preset file based on the character number.
3. The method of claim 1, wherein sequentially inputting word vectors and concatenation vectors corresponding to a plurality of text characters into different hidden layers of a first neural network according to a forward appearance sequence of the text characters in the target text, and obtaining a first text feature of the target text based on the forward appearance sequence comprises:
inputting word vectors and splicing vectors corresponding to the text characters in the current sequence into a current hidden layer of a first neural network according to the forward appearance sequence of the text characters in the target text;
inputting character features output by the current hidden layer, word vectors corresponding to the next sequential text characters and splicing vectors into the next hidden layer of the first neural network;
and iterating by taking the next hidden layer as the current hidden layer until the last sequential text character, so as to obtain the first text feature of the target text based on the forward appearance sequence.
4. The method of claim 3, wherein the current hidden layer comprises a first sub hidden layer and a second sub hidden layer; the inputting the word vector and the splicing vector corresponding to the current sequential text character into the current hidden layer of the first neural network comprises:
taking the word vector and the output of the previous hidden layer as the input of a first sub hidden layer, wherein the first sub hidden layer is used for projecting the word vector according to the weight of each neuron node corresponding to the first sub hidden layer to obtain a first sub character feature;
and taking the first sub-character features and the splicing vector as the input of a second sub-hidden layer, wherein the second sub-hidden layer is used for projecting the splicing vector according to the weight of each neuron node corresponding to the second sub-hidden layer to obtain second sub-character features, and the second sub-character features are taken as the output of the current hidden layer.
5. The method of claim 3, wherein the first neural network further comprises a random inactivation layer, the method further comprising:
and taking the first text feature as the input of the random inactivation layer, wherein the random inactivation layer is used for projecting each data in the first text feature according to a preset sparse probability to obtain a sparse feature vector as the output of the first neural network.
6. The method of claim 1, wherein the sequentially inputting word vectors and splicing vectors corresponding to a plurality of text characters into different hidden layers of a second neural network according to a reverse appearance sequence of the text characters in the target text to obtain a second text feature of the target text based on the reverse appearance sequence comprises:
inputting the first text feature, the word vector corresponding to the first sequence text character and the splicing vector into a first hidden layer of a second neural network according to the reverse appearance sequence of the text characters in the target text to obtain the character feature output by the first hidden layer;
taking a second hidden layer of the second neural network as a current hidden layer, and taking a second sequence text character as a current sequence text character;
inputting character features output by a previous hidden layer, word vectors corresponding to the current sequential text characters and splicing vectors into a current hidden layer of a second neural network;
and iterating by taking the next hidden layer as the current hidden layer until the last sequential text character, so as to obtain the second text feature of the target text based on the reverse appearance sequence.
7. The method of claim 1, wherein the third neural network layer comprises an attention mechanism layer and a fully-connected layer; splicing the first text feature and the second text feature and inputting the spliced first text feature and second text feature into a third neural network, wherein obtaining the semantic type of the target text comprises the following steps:
splicing the first text feature and the second text feature to obtain a comprehensive text feature of the target text;
the comprehensive text features are used as input of the attention mechanism layer, and the attention mechanism layer is used for weighting each datum in the comprehensive text features to obtain weighted features;
the weighted features are used as input of a random inactivation layer, and the random inactivation layer is used for projecting each data in the weighted features according to a preset sparse probability to obtain sparse features;
the sparse features are used as the input of a full connection layer, and the full connection layer is used for carrying out classification operation on the sparse features to obtain the prediction probability corresponding to each semantic type;
and selecting the semantic type with the maximum prediction probability as the semantic type of the target text.
8. An apparatus for semantic recognition of text, the apparatus comprising:
the vector calculation module is used for calculating a word vector of each text character and a word vector of each text word in the target text;
the vector splicing module is used for splicing the word vector of each text character with the word vector of the text participle to obtain a spliced vector of the corresponding text character;
the first text feature acquisition module is used for sequentially inputting word vectors and splicing vectors corresponding to a plurality of text characters into different hidden layers of a first neural network according to the forward appearance sequence of the text characters in the target text to obtain first text features of the target text based on the forward appearance sequence;
the second text feature acquisition module is used for sequentially inputting word vectors and splicing vectors corresponding to a plurality of text characters into different hidden layers of a second neural network according to the reverse appearance sequence of the text characters in the target text to obtain second text features of the target text based on the reverse appearance sequence;
and the semantic type acquisition module is used for inputting the comprehensive text features obtained by splicing the first text features and the second text features into a third neural network to obtain the semantic type of the target text.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN201910744603.5A 2019-08-13 2019-08-13 Text semantic recognition method and device, computer equipment and storage medium Active CN110598206B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910744603.5A CN110598206B (en) 2019-08-13 2019-08-13 Text semantic recognition method and device, computer equipment and storage medium
PCT/CN2020/104679 WO2021027533A1 (en) 2019-08-13 2020-07-25 Text semantic recognition method and apparatus, computer device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910744603.5A CN110598206B (en) 2019-08-13 2019-08-13 Text semantic recognition method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110598206A CN110598206A (en) 2019-12-20
CN110598206B true CN110598206B (en) 2023-04-07

Family

ID=68854117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910744603.5A Active CN110598206B (en) 2019-08-13 2019-08-13 Text semantic recognition method and device, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN110598206B (en)
WO (1) WO2021027533A1 (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598206B (en) * 2019-08-13 2023-04-07 平安国际智慧城市科技股份有限公司 Text semantic recognition method and device, computer equipment and storage medium
CN111309901A (en) * 2020-01-19 2020-06-19 北京海鑫科金高科技股份有限公司 Short text classification method and device
CN111353035B (en) * 2020-03-11 2021-02-19 镁佳(北京)科技有限公司 Man-machine conversation method and device, readable storage medium and electronic equipment
CN111581335B (en) * 2020-05-14 2023-11-24 腾讯科技(深圳)有限公司 Text representation method and device
CN111814461B (en) * 2020-07-09 2024-05-31 科大讯飞股份有限公司 Text processing method, related equipment and readable storage medium
CN111859862B (en) * 2020-07-22 2024-03-22 海尔优家智能科技(北京)有限公司 Text data labeling method and device, storage medium and electronic device
CN112417859A (en) * 2020-11-24 2021-02-26 北京明略昭辉科技有限公司 Intention recognition method, system, computer device and computer-readable storage medium
CN112786108B (en) * 2021-01-21 2023-10-24 北京百度网讯科技有限公司 Training method, device, equipment and medium of molecular understanding model
CN114969316B (en) * 2021-02-24 2024-04-26 腾讯科技(深圳)有限公司 Text data processing method, device, equipment and medium
CN112949477B (en) * 2021-03-01 2024-03-15 苏州美能华智能科技有限公司 Information identification method, device and storage medium based on graph convolution neural network
CN112818699A (en) * 2021-03-03 2021-05-18 深圳前海微众银行股份有限公司 Risk analysis method, device, equipment and computer-readable storage medium
CN112926339B (en) * 2021-03-09 2024-02-09 北京小米移动软件有限公司 Text similarity determination method, system, storage medium and electronic equipment
CN113128241A (en) * 2021-05-17 2021-07-16 口碑(上海)信息技术有限公司 Text recognition method, device and equipment
CN113157900A (en) * 2021-05-27 2021-07-23 中国平安人寿保险股份有限公司 Intention recognition method and device, computer equipment and storage medium
CN113408268B (en) * 2021-06-22 2023-01-13 平安科技(深圳)有限公司 Slot filling method, device, equipment and storage medium
CN113590820A (en) * 2021-07-16 2021-11-02 杭州网易智企科技有限公司 Text processing method, device, medium and electronic equipment
CN113904851A (en) * 2021-10-11 2022-01-07 中国电信股份有限公司 Network information processing method, user plane function system, medium, and electronic device
CN113886885A (en) * 2021-10-21 2022-01-04 平安科技(深圳)有限公司 Data desensitization method, data desensitization device, equipment and storage medium
CN114003007A (en) * 2021-10-28 2022-02-01 国家石油天然气管网集团有限公司华南分公司 Control system operation data based analysis method and related device
CN113889281B (en) * 2021-11-17 2024-05-03 华美浩联医疗科技(北京)有限公司 Chinese medical intelligent entity identification method and device and computer equipment
CN114297987B (en) * 2022-03-09 2022-07-19 杭州实在智能科技有限公司 Document information extraction method and system based on text classification and reading understanding
CN115116437B (en) * 2022-04-07 2024-02-09 腾讯科技(深圳)有限公司 Speech recognition method, device, computer equipment, storage medium and product
CN115409038A (en) * 2022-08-26 2022-11-29 湖北星纪时代科技有限公司 Natural language processing method and device, electronic equipment and storage medium
CN116756579B (en) * 2023-08-22 2023-12-12 腾讯科技(深圳)有限公司 Training method of large language model and text processing method based on large language model

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109918500A (en) * 2019-01-17 2019-06-21 平安科技(深圳)有限公司 File classification method and relevant device based on convolutional neural networks

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180018573A1 (en) * 2016-07-12 2018-01-18 Xerox Corporation Vector operators for distributional entailment
CN107632987B (en) * 2016-07-19 2018-12-07 腾讯科技(深圳)有限公司 A kind of dialogue generation method and device
CN108287858B (en) * 2017-03-02 2021-08-10 腾讯科技(深圳)有限公司 Semantic extraction method and device for natural language
CN107025219B (en) * 2017-04-19 2019-07-26 厦门大学 A kind of word insertion representation method based on internal Semantic hierarchy
CN107680579B (en) * 2017-09-29 2020-08-14 百度在线网络技术(北京)有限公司 Text regularization model training method and device, and text regularization method and device
CN108334492B (en) * 2017-12-05 2021-11-02 腾讯科技(深圳)有限公司 Text word segmentation and instant message processing method and device
CN108170675A (en) * 2017-12-27 2018-06-15 哈尔滨福满科技有限责任公司 A kind of name entity recognition method based on deep learning towards medical field
CN108132931B (en) * 2018-01-12 2021-06-25 鼎富智能科技有限公司 Text semantic matching method and device
CN108717409A (en) * 2018-05-16 2018-10-30 联动优势科技有限公司 A kind of sequence labelling method and device
CN109376240A (en) * 2018-10-11 2019-02-22 平安科技(深圳)有限公司 A kind of text analyzing method and terminal
CN109684626A (en) * 2018-11-16 2019-04-26 深思考人工智能机器人科技(北京)有限公司 Method for recognizing semantics, model, storage medium and device
CN110598206B (en) * 2019-08-13 2023-04-07 平安国际智慧城市科技股份有限公司 Text semantic recognition method and device, computer equipment and storage medium

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109918500A (en) * 2019-01-17 2019-06-21 平安科技(深圳)有限公司 File classification method and relevant device based on convolutional neural networks

Also Published As

Publication number Publication date
CN110598206A (en) 2019-12-20
WO2021027533A1 (en) 2021-02-18

Similar Documents

Publication Publication Date Title
CN110598206B (en) Text semantic recognition method and device, computer equipment and storage medium
CN110569500A (en) Text semantic recognition method and device, computer equipment and storage medium
CN111160017B (en) Keyword extraction method, phonetics scoring method and phonetics recommendation method
CN110347835B (en) Text clustering method, electronic device and storage medium
CN111108501B (en) Context-based multi-round dialogue method, device, equipment and storage medium
CN108536800B (en) Text classification method, system, computer device and storage medium
CN112613308B (en) User intention recognition method, device, terminal equipment and storage medium
CN107808011B (en) Information classification extraction method and device, computer equipment and storage medium
CN106991085B (en) Entity abbreviation generation method and device
CN109376222B (en) Question-answer matching degree calculation method, question-answer automatic matching method and device
CN110750965B (en) English text sequence labeling method, english text sequence labeling system and computer equipment
CN108520041B (en) Industry classification method and system of text, computer equipment and storage medium
CN109543007A (en) Put question to data creation method, device, computer equipment and storage medium
CN111259113B (en) Text matching method, text matching device, computer readable storage medium and computer equipment
CN112307164A (en) Information recommendation method and device, computer equipment and storage medium
CN111191457A (en) Natural language semantic recognition method and device, computer equipment and storage medium
CN110990532A (en) Method and device for processing text
CN111859916B (en) Method, device, equipment and medium for extracting key words of ancient poems and generating poems
CN111985228A (en) Text keyword extraction method and device, computer equipment and storage medium
CN112667782A (en) Text classification method, device, equipment and storage medium
CN113536784B (en) Text processing method, device, computer equipment and storage medium
CN111859979A (en) Ironic text collaborative recognition method, ironic text collaborative recognition device, ironic text collaborative recognition equipment and computer readable medium
CN112256863A (en) Method and device for determining corpus intentions and electronic equipment
CN114357151A (en) Processing method, device and equipment of text category identification model and storage medium
CN112215629B (en) Multi-target advertisement generating system and method based on construction countermeasure sample

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant