CN112597757A - Word detection method and device, storage medium and electronic device - Google Patents

Word detection method and device, storage medium and electronic device Download PDF

Info

Publication number
CN112597757A
CN112597757A (application number CN202011407627.0A)
Authority
CN
China
Prior art keywords
word
sentence
hidden state
state vector
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011407627.0A
Other languages
Chinese (zh)
Inventor
杨聪聪
朱海刚
王鹏
田江
向小佳
丁永建
李璠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Everbright Technology Co ltd
Original Assignee
Everbright Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Everbright Technology Co ltd filed Critical Everbright Technology Co ltd
Priority to CN202011407627.0A
Publication of CN112597757A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/205 - Parsing
    • G06F40/211 - Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/30 - Semantic analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/049 - Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the invention provide a word detection method and apparatus, a storage medium, and an electronic device. The method includes: determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence; enhancing the relationship between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector; splicing the enhanced word hidden state vector and the sentence hidden state vector to obtain a spliced hidden state vector; and detecting the target word in the sentence by using the spliced hidden state vector, so as to determine a word associated with the target word in the sentence. The invention solves the problem of low word detection accuracy in the related art and achieves the effect of accurately detecting associated words in sentences.

Description

Word detection method and device, storage medium and electronic device
Technical Field
Embodiments of the present invention relate to the field of communications, and in particular to a word detection method and apparatus, a storage medium, and an electronic device.
Background
With the continually growing volume of information in network texts, the ability to quickly and efficiently identify the entity information contained in text is of great significance to many industries. Most current named entity recognition models use context encoders (for example, Long Short-Term Memory networks (LSTM) and Convolutional Neural Networks (CNN)) to obtain contextual state representations of words, which are used to predict the final entity labels. Although these word representations can learn the current context information, their sentence-level semantic information is still weak.
In view of the above technical problems, no effective solution has been proposed in the related art.
Disclosure of Invention
The embodiment of the invention provides a word detection method and device, a storage medium and an electronic device, and aims to at least solve the problem of low word detection accuracy in the related art.
According to an embodiment of the present invention, there is provided a word detection method including: determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence; enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector; splicing the enhanced word hidden state vector and the sentence hidden state vector to obtain a spliced hidden state vector; and detecting the target word in the sentence by using the spliced hidden state vector so as to determine a word associated with the target word in the sentence.
According to another embodiment of the present invention, there is provided a word detection apparatus including: the first determining module is used for determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence; the first enhancement module is used for enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector; the first splicing module is used for splicing the enhanced word hidden state vector and the sentence hidden state vector to obtain a spliced hidden state vector; and the first detection module is used for detecting the target word in the sentence by using the splicing hidden state vector so as to determine a word associated with the target word in the sentence.
In an exemplary embodiment, the first determining module includes: a first determining unit, configured to input the word vector of the target word into a first bidirectional Long Short-Term Memory (BiLSTM) layer to obtain the word hidden state vector of the target word output by the first BiLSTM layer; and a second determining unit, configured to input the sentence vector of the sentence into the first BiLSTM layer to obtain the sentence hidden state vector of the sentence output by the first BiLSTM layer.
In an exemplary embodiment, the first enhancement module includes: a first processing unit, configured to input the word hidden state vector into a Self-Attention layer, so that the Self-Attention layer captures the relationship between the target word and the other words, and to output the enhanced word hidden state vector.
In an exemplary embodiment, the first detecting module includes: a third determining unit, configured to input the spliced hidden state vector into a second BiLSTM layer, so as to obtain a target hidden state vector of the target word output by the second BiLSTM layer; a first searching unit, configured to search for target tag information corresponding to tag information of the target hidden state vector in the sentence; a fourth determining unit, configured to determine a word corresponding to the target tag information as a word associated with the target word.
In an exemplary embodiment, the apparatus further includes: a first mapping module, configured to map the target word to a distributed representation space before determining the word hidden state vector of the target word in the sentence to be processed and the sentence hidden state vector of the sentence, so as to capture the semantic and syntactic characteristics of the target word by using the distributed representation space; and a second determining module, configured to determine the word vector of the target word in a preset word vector library based on the semantic and syntactic characteristics of the target word.
In an exemplary embodiment, the apparatus further includes: an encoding module, configured to encode the sentence by using a Convolutional Neural Network (CNN) model before determining the word hidden state vector of the target word in the sentence to be processed and the sentence hidden state vector of the sentence, so as to obtain the sentence vector of the sentence, where the CNN model includes N channels, and N is a natural number greater than 1.
In an exemplary embodiment, the encoding module includes: a first capturing unit, configured to capture word information in the sentence by using a filter in a first channel of the N channels to obtain a first vector, where the first channel is used to represent a character-level vector; a second capturing unit, configured to capture word information in the sentence by using a filter in a second channel of the N channels to obtain a second vector, where the second channel is used to represent a word-level vector; and a fifth determining unit, configured to determine the sentence vector of the sentence from the first vector and the second vector.
According to a further embodiment of the present invention, there is also provided a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
According to yet another embodiment of the present invention, there is also provided an electronic device, including a memory in which a computer program is stored and a processor configured to execute the computer program to perform the steps in any of the above method embodiments.
According to the method, the word hidden state vector of the target word in the sentence to be processed and the sentence hidden state vector of the sentence are determined; the relationship between the target word and other words is enhanced based on the word hidden state vector to obtain an enhanced word hidden state vector; the enhanced word hidden state vector and the sentence hidden state vector are spliced to obtain a spliced hidden state vector; and the target word in the sentence is detected by using the spliced hidden state vector, so as to determine the word associated with the target word in the sentence. In this way, each word obtains rich semantic information, and entity labels can be predicted more accurately. Therefore, the problem of low word detection accuracy can be solved, and the effect of accurately detecting associated words in sentences can be achieved.
Drawings
Fig. 1 is a block diagram of a hardware configuration of a mobile terminal of a word detection method according to an embodiment of the present invention;
FIG. 2 is a flow diagram of a method of detecting words, according to an embodiment of the invention;
FIG. 3 is a diagram of a sentence representation model based on multi-channel CNN according to an embodiment of the present invention;
FIG. 4 is a diagram of an NER model based on sentence semantics and a Self-Attention mechanism according to an embodiment of the present invention;
fig. 5 is a block diagram of a word detection apparatus according to an embodiment of the present invention.
Detailed Description
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The method embodiments provided in the embodiments of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking operation on a mobile terminal as an example, fig. 1 is a block diagram of the hardware structure of a mobile terminal implementing the word detection method of the present invention. As shown in fig. 1, the mobile terminal may include one or more processors 102 (only one is shown in fig. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data, and may further include a transmission device 106 for communication functions and an input/output device 108. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and does not limit the structure of the mobile terminal. For example, the mobile terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 may be used to store computer programs, for example, software programs and modules of application software, such as computer programs corresponding to the word detection method in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer programs stored in the memory 104, so as to implement the above-mentioned method. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the mobile terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the mobile terminal. In one example, the transmission device 106 includes a Network adapter (NIC), which can be connected to other Network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
In the present embodiment, a word detection method is provided, and fig. 2 is a flowchart of a word detection method according to an embodiment of the present invention, as shown in fig. 2, the flowchart includes the following steps:
step S202, determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence;
step S204, enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector;
step S206, splicing the enhanced word hidden state vector and sentence hidden state vector to obtain a spliced hidden state vector;
and S208, detecting the target words in the sentence by using the spliced hidden state vector so as to determine the words associated with the target words in the sentence.
The execution subject of the above steps may be a server, etc., but is not limited thereto.
Optionally, the present embodiment includes, but is not limited to, application in a scenario in which semantic information of a word is detected.
Through the above steps, the word hidden state vector of the target word in the sentence to be processed and the sentence hidden state vector of the sentence are determined; the relationship between the target word and other words is enhanced based on the word hidden state vector to obtain an enhanced word hidden state vector; the enhanced word hidden state vector and the sentence hidden state vector are spliced to obtain a spliced hidden state vector; and the target word in the sentence is detected by using the spliced hidden state vector, so as to determine the word associated with the target word in the sentence. In this way, each word obtains rich semantic information, and entity labels can be predicted more accurately. Therefore, the problem of low word detection accuracy can be solved, and the effect of accurately detecting associated words in sentences can be achieved.
Optionally, in this embodiment, determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence includes:
s1, inputting the word vector of the target word into the first BiLSTM layer to obtain the word hidden state vector of the target word output by the first BiLSTM layer;
s2, inputting the sentence vector of the sentence into the first BiLSTM layer to obtain the sentence hidden state vector of the sentence output by the first BiLSTM layer.
Optionally, in this embodiment, for example, if the target word is "Beijing", the word vector corresponding to "Beijing" is input into the first BiLSTM layer to obtain the word hidden state vector of "Beijing". If the sentence is "Beijing welcomes you", the sentence vector of "Beijing welcomes you" is input into the first BiLSTM layer to obtain the sentence hidden state vector of "Beijing welcomes you".
Optionally, in this embodiment, the word vector of the target word may be looked up in a set of pre-trained 100-dimensional word vectors.
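As an illustrative sketch of this step (the dimensions, batch size, and variable names below are assumptions, not the patented implementation), the first BiLSTM layer can be written in PyTorch as:

```python
# Minimal sketch of obtaining word hidden state vectors from a first
# BiLSTM layer, corresponding to steps s1/s2 above.
import torch
import torch.nn as nn

EMB_DIM, HID_DIM = 100, 128  # 100-dim word vectors, as mentioned above

bilstm1 = nn.LSTM(EMB_DIM, HID_DIM, batch_first=True, bidirectional=True)

# word_vectors: (batch, seq_len, EMB_DIM), e.g. pre-trained 100-dim vectors
word_vectors = torch.randn(1, 5, EMB_DIM)

# Each word's hidden state is the concatenation of the forward and
# backward LSTM states, giving shape (batch, seq_len, 2 * HID_DIM).
word_hidden, _ = bilstm1(word_vectors)
```

The same layer can encode the whole sentence; the sentence hidden state vector is then derived from the sequence of per-word states.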
In an exemplary embodiment, enhancing the relationship between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector includes:
s1, inputting the word hidden state vector into the Self-Attention layer, so that the Self-Attention layer captures the relationship between the target word and other words, and outputting the enhanced word hidden state vector.
Optionally, in this embodiment, the Self-Attention layer employs a self-attention mechanism.
Optionally, in this embodiment: compared with a plain RNN, BiLSTM can handle somewhat longer sentences; however, as sentences grow longer, the dependency information between distant words becomes weaker, so long-range dependencies between words are not captured well. The Self-Attention mechanism can directly capture the relationship between any two words in a sentence regardless of their distance, and captures the syntactic and semantic features within the sentence well, making it a good complement to BiLSTM. The Self-Attention (or internal attention) mechanism is a special attention mechanism in which the input sequence attends to itself: attention weights of the sequence with respect to itself are computed to find relationships inside the sequence. Query, Key, and Value are all the same sequence X, i.e., Attention(X, X, X), and the final output is Y = MultiHead(X, X, X).
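A minimal PyTorch sketch of this step, assuming the enhanced states are produced by a standard multi-head self-attention module (the dimensionality and head count are illustrative assumptions):

```python
# Self-Attention step: queries, keys and values are all the same
# sequence X (the word hidden states), i.e. Y = MultiHead(X, X, X).
import torch
import torch.nn as nn

D_MODEL = 256  # 2 * HID_DIM from the first BiLSTM (assumed)
attn = nn.MultiheadAttention(embed_dim=D_MODEL, num_heads=4, batch_first=True)

X = torch.randn(1, 5, D_MODEL)   # word hidden state vectors
# Every word attends to every other word, regardless of distance,
# which is exactly the long-range behavior BiLSTM lacks.
Y, attn_weights = attn(X, X, X)  # Y: enhanced word hidden state vectors
```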
In one exemplary embodiment, detecting a target word in a sentence using a stitched hidden state vector to determine a word associated with the target word in the sentence comprises:
s1, inputting the spliced hidden state vector into the second BiLSTM layer to obtain a target hidden state vector of the target word output by the second BiLSTM layer;
s2, searching target label information corresponding to the label information of the target hidden state vector in the sentence;
s3, the word corresponding to the target label information is determined as the word associated with the target word.
Optionally, in this embodiment, the mutual information between adjacent tags is useful; for example, the tag I-PER cannot follow B-LOC. The tag sequence can be jointly decoded using a CRF layer, which allows the model to find the optimal path among all possible tag sequences.
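For illustration, the joint decoding can be sketched as a plain Viterbi search over emission and transition scores (a minimal sketch, not the patented CRF implementation; a trained CRF learns the transition scores so that invalid moves such as B-LOC to I-PER become strongly negative):

```python
# Viterbi decoding over per-word emission scores and tag-to-tag
# transition scores; start/end transitions are omitted for brevity.
import torch

def viterbi_decode(emissions: torch.Tensor, transitions: torch.Tensor) -> list:
    """emissions: (seq_len, num_tags); transitions[i, j]: score of tag i -> tag j."""
    seq_len, num_tags = emissions.shape
    score = emissions[0]  # best score of a path ending in each tag so far
    backpointers = []
    for t in range(1, seq_len):
        # candidate[i, j] = best path ending in tag i, then moving to tag j
        candidate = score.unsqueeze(1) + transitions + emissions[t].unsqueeze(0)
        score, best_prev = candidate.max(dim=0)
        backpointers.append(best_prev)
    path = [int(score.argmax())]
    for best_prev in reversed(backpointers):
        path.append(int(best_prev[path[-1]]))
    return path[::-1]  # optimal tag sequence, left to right
```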
In an exemplary embodiment, before determining the word hidden state vector of the target word and the sentence hidden state vector of the sentence in the sentence to be processed, the method further comprises:
s1, mapping the target word to a distributed representation space, so as to capture the semantic and syntactic characteristics of the target word by using the distributed representation space;
s2, determining the word vector of the target word in a preset word vector library based on the semantic and syntactic characteristics of the target word.
Optionally, in this embodiment, the distributed representation space is a low-dimensional dense vector representation space capable of capturing the semantic and syntactic characteristics of words; the word embedding layers for Chinese and English may differ due to their linguistic characteristics.
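To make the lookup step concrete, here is a minimal PyTorch sketch of retrieving a word vector from a preset word vector library (the vocabulary, matrix, and names are illustrative assumptions):

```python
# Looking up a target word's distributed representation in a preset
# word vector library; the pretrained matrix here is a stand-in.
import torch
import torch.nn as nn

vocab = {"<unk>": 0, "Beijing": 1, "welcomes": 2, "you": 3}
pretrained = torch.randn(len(vocab), 100)  # stand-in for trained 100-dim vectors
embedding = nn.Embedding.from_pretrained(pretrained, freeze=False)

idx = torch.tensor([vocab.get("Beijing", vocab["<unk>"])])
word_vector = embedding(idx)  # distributed representation, shape (1, 100)
```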
For example, in the English named entity recognition task, the English word embedding layer generally consists of word-level vectors and character-level vectors. Using pre-trained word vectors achieves a significant performance improvement over randomly initialized word vectors; thus, Stanford's publicly available pre-trained 300-dimensional word vectors can be used as the word-level vectors. In addition, different English words share surface or morphological similarities; for example, words with regular suffixes may share some character-level features. Character-level features are therefore used to handle unknown words. First, a character-level vector representation model is built with a BiLSTM:
$\overrightarrow{h_i} = \overrightarrow{\mathrm{LSTM}}(c_i, \overrightarrow{h_{i-1}})$ (Formula 1)

$\overleftarrow{h_i} = \overleftarrow{\mathrm{LSTM}}(c_i, \overleftarrow{h_{i+1}})$ (Formula 2)

$h_c = [\overleftarrow{h_1}; \overrightarrow{h_f}]$ (Formula 3)

where $c_i$ represents one character of the character sequence $(c_1, c_2, \ldots, c_f)$ and $f$ represents the length of the character sequence. $\overrightarrow{h_i}$ and $\overleftarrow{h_i}$ are the hidden state vectors of the forward LSTM and the backward LSTM, respectively, and $h_c$ is the finally obtained character-level vector.
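A minimal PyTorch sketch of Formulas 1 to 3 (character embedding size, hidden size, and variable names are assumptions):

```python
# Character-level vector: run a BiLSTM over the character sequence and
# concatenate the final backward and forward states (Formula 3).
import torch
import torch.nn as nn

CHAR_EMB, CHAR_HID = 30, 25
char_lstm = nn.LSTM(CHAR_EMB, CHAR_HID, batch_first=True, bidirectional=True)

chars = torch.randn(1, 7, CHAR_EMB)      # (c_1, ..., c_f), here f = 7
_, (h_n, _) = char_lstm(chars)           # h_n: (2, batch, CHAR_HID)
h_fwd, h_bwd = h_n[0], h_n[1]            # final forward / backward states
h_c = torch.cat([h_bwd, h_fwd], dim=-1)  # character-level vector h_c
```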
The Chinese word embedding layer differs from the English one. Because Chinese lacks natural word delimiters, its input representation depends heavily on word segmentation, and early Chinese named entity recognition proposed three embedding methods: character embedding, word embedding, and position embedding. Because word segmentation methods are numerous and inconsistent and segmentation quality is often poor, incorrect segmentation can propagate errors into downstream tasks and degrade the performance of the final model. In Chinese Natural Language Processing (NLP) tasks under neural network models, character-level representations almost always outperform word-level representations; for some tasks, character-level representation alone achieves the best performance, and adding word-level information is counterproductive. Thus, pre-trained Chinese 100-dimensional character vectors can be used as the input of the model.
In an exemplary embodiment, before determining the word hidden state vector of the target word and the sentence hidden state vector of the sentence in the sentence to be processed, the method further comprises:
s1, encoding the sentence by using a Convolutional Neural Network (CNN) model to obtain the sentence vector of the sentence, where the CNN model includes N channels, and N is a natural number greater than 1.
In an exemplary embodiment, encoding the sentence using the Convolutional Neural Network (CNN) model to obtain the sentence vector of the sentence includes:
s1, capturing word information in a sentence by using a filter in a first channel of the N channels to obtain a first vector, wherein the first channel is used for representing a character-level vector;
s2, capturing word information in the sentence by using a filter in a second channel of the N channels to obtain a second vector, wherein the second channel is used for representing a word-level vector;
s3, determining the sentence vector of the sentence from the first vector and the second vector.
Optionally, in this embodiment: although the bidirectional LSTM can learn the context information of the current word, the global semantic information available to the current word is still weak. This embodiment therefore provides a sentence representation model based on a multi-channel convolutional neural network. Concatenating the sentence representation with the context word states obtained via BiLSTM makes better use of the global context. CNN sentence encoders typically use a single-channel structure with a 1-D convolutional layer to generate the sentence representation; here, a multi-channel CNN is used to model the sentence representation.
As shown in FIG. 3, the embedding layer has two channels, each of which is a set of vectors: the first channel holds character-level vectors and the second channel holds word-level vectors. Each filter is applied to both channels to capture adjacent word information, and a max-pooling layer then selects the strongest feature as part of the current sentence representation, as in the following formulas:
$m_i = f(W \cdot x_{i:i+k-1} + b)$

$g_i = \mathrm{pooling}([m_1, m_2, \ldots, m_f])$ (Formula 4)

$h_S = [g_1; g_2; \ldots; g_o]$ (Formula 5)

where $k$ is the size of the filter, $W$ and $b$ are the filter weights and bias, $o$ is the number of filter types, $g_i$ is the feature obtained by one type of filter, $x_i$ represents one word of the input sentence $(x_1, x_2, \ldots, x_n)$, and $h_S$ is the sentence-level representation of the input sentence.
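A hedged PyTorch sketch of Formulas 4 and 5 (filter sizes, filter count, and dimensions are assumptions; for brevity the character-level and word-level channels are treated as one pre-combined embedding):

```python
# Multi-channel CNN sentence encoder: filters of several sizes slide
# over the embedded sentence; max-pooling keeps the strongest feature
# per filter type (Formula 4), and the results are concatenated (Formula 5).
import torch
import torch.nn as nn

EMB_DIM = 100
filter_sizes = [2, 3, 4]  # o = 3 filter types (assumed sizes)
convs = nn.ModuleList(
    [nn.Conv1d(EMB_DIM, 64, kernel_size=k) for k in filter_sizes]
)

sentence = torch.randn(1, 9, EMB_DIM)  # (x_1, ..., x_n), n = 9
x = sentence.transpose(1, 2)           # Conv1d expects (batch, channels, n)
# g_i = pooling over the feature map of each filter type
g = [conv(x).relu().max(dim=-1).values for conv in convs]
h_S = torch.cat(g, dim=-1)             # sentence-level representation h_S
```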
The invention is illustrated below with reference to specific examples:
the embodiment provides a sentence representation model carrying semantic information of sentences. By splicing the context word expression and the sentence expression, each word can obtain abundant semantic information, and entity label prediction can be better carried out. In addition, a Self-authorization mechanism is applied in the model, and the Self-authorization mechanism can directly capture the long-distance dependence of any two words in the sentence and better capture the global dependence of the whole sentence. The model provided by the invention is not only suitable for English data sets, but also suitable for Chinese data sets.
As shown in fig. 4, the model is divided into 5 parts: a word embedding layer, a first BiLSTM layer, a Self-Attention layer, a part that combines the word hidden states with the sentence-level vector, and a second BiLSTM layer. The word embeddings pass through the first BiLSTM layer to obtain the hidden state vector representations of the words and the vector representation of the sentence; the word hidden states are then fed into the Self-Attention layer to strengthen the ability to capture relationships between words. $h_S$ is the sentence-level vector representation generated by the multi-channel-CNN-based sentence representation model. Finally, the word hidden states are spliced with the sentence-level vector and fed into the second BiLSTM layer to obtain the final hidden state vectors for label prediction. Considering the mutual information between neighboring tags is useful; for example, it is unlikely that tag I-PER will follow B-LOC. Thus, the tag sequence is finally decoded jointly using a CRF layer, which allows the model to find the optimal path among all possible tag sequences.
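Putting the five parts together, an end-to-end sketch of the forward pass might look as follows (dimensions, head count, tag-set size, and wiring details are assumptions for illustration; CRF decoding is applied to the returned emission scores as sketched earlier):

```python
# Sketch of the five-part model: first BiLSTM, Self-Attention,
# splicing with the sentence vector h_S, second BiLSTM, emissions.
import torch
import torch.nn as nn

class SentenceSemanticsNER(nn.Module):
    def __init__(self, emb_dim=100, hid=128, num_tags=9):
        super().__init__()
        self.bilstm1 = nn.LSTM(emb_dim, hid, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hid, num_heads=4, batch_first=True)
        # The second BiLSTM consumes [enhanced word state; sentence vector h_S].
        self.bilstm2 = nn.LSTM(4 * hid, hid, batch_first=True, bidirectional=True)
        self.emit = nn.Linear(2 * hid, num_tags)  # emission scores for CRF decoding

    def forward(self, word_emb, h_s):
        # word_emb: (batch, n, emb_dim); h_s: (batch, 2 * hid) sentence vector
        word_h, _ = self.bilstm1(word_emb)               # first BiLSTM layer
        enhanced, _ = self.attn(word_h, word_h, word_h)  # Self-Attention layer
        h_s_tiled = h_s.unsqueeze(1).expand(-1, word_emb.size(1), -1)
        spliced = torch.cat([enhanced, h_s_tiled], dim=-1)  # splice with h_S
        final_h, _ = self.bilstm2(spliced)               # second BiLSTM layer
        return self.emit(final_h)                        # decode with a CRF layer

model = SentenceSemanticsNER()
emissions = model(torch.randn(1, 6, 100), torch.randn(1, 256))  # (1, 6, 9)
```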
In summary, in this embodiment, the context word representation and the sentence representation are spliced so that each word obtains rich semantic information and entity tag prediction is performed better. In addition, since BiLSTM cannot directly capture the long-distance dependency between any two words in a sentence, Self-Attention is applied to capture such dependencies directly, so that the global dependencies of the whole sentence are better captured. The proposed model is applicable not only to English data sets but also to Chinese data sets.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
In this embodiment, a word detection device is further provided, and the device is used to implement the foregoing embodiments and preferred embodiments, and the description of the device that has been already made is omitted. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.
Fig. 5 is a block diagram of a word detection apparatus according to an embodiment of the present invention, as shown in fig. 5, the apparatus including:
a first determining module 52, configured to determine a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence;
a first enhancing module 54, configured to enhance a relationship between the target word and another word based on the word hidden state vector to obtain an enhanced word hidden state vector;
a first splicing module 56, configured to splice the enhanced word hidden state vector and sentence hidden state vector to obtain a spliced hidden state vector;
a first detection module 58, configured to detect the target word in the sentence by using the concatenated hidden state vector to determine a word associated with the target word in the sentence.
In an exemplary embodiment, the first determining module includes:
a first determining unit, configured to input the word vector of the target word into a first BiLSTM layer to obtain the word hidden state vector of the target word output by the first BiLSTM layer;
and a second determining unit, configured to input the sentence vector of the sentence into the first BiLSTM layer to obtain the sentence hidden state vector of the sentence output by the first BiLSTM layer.
In an exemplary embodiment, the first enhancement module includes:
a first processing unit, configured to input the word hidden state vector into a Self-Attention layer, so that the Self-Attention layer captures the relationship between the target word and the other words, and to output the enhanced word hidden state vector.
In an exemplary embodiment, the first detecting module includes:
a third determining unit, configured to input the spliced hidden state vector into a second BiLSTM layer, so as to obtain a target hidden state vector of the target word output by the second BiLSTM layer;
a first searching unit, configured to search for target tag information corresponding to tag information of the target hidden state vector in the sentence;
a fourth determining unit, configured to determine a word corresponding to the target tag information as a word associated with the target word.
In an exemplary embodiment, the apparatus further includes: a first mapping module, configured to map the target word to a distributed representation space before determining the word hidden state vector of the target word in the sentence to be processed and the sentence hidden state vector of the sentence, so as to capture the semantic and syntactic characteristics of the target word by using the distributed representation space;
and the second determining module is used for determining the word vector of the target word in a preset word vector library based on the semantic meaning and the syntactic characteristics of the target word.
In an exemplary embodiment, the apparatus further includes:
the system comprises a coding module and a processing module, wherein the coding module is used for coding a sentence by using a Cellular Neural Network (CNN) model before determining a word hidden state vector of a target word in the sentence to be processed and a sentence hidden state vector of the sentence to obtain the sentence vector of the sentence, the CNN model comprises N channels, and N is a natural number greater than 1.
In an exemplary embodiment, the encoding module includes:
a first capturing unit, configured to capture word information in the sentence by using a filter in a first channel of the N channels to obtain a first vector, where the first channel is used to represent a character-level vector;
a second capturing unit, configured to capture word information in the sentence by using a filter in a second channel of the N channels to obtain a second vector, where the second channel is used to represent a word-level vector;
a fifth determining unit, configured to determine the sentence vector of the sentence from the first vector and the second vector.
It should be noted that, the above modules may be implemented by software or hardware, and for the latter, the following may be implemented, but not limited to: the modules are all positioned in the same processor; alternatively, the modules are respectively located in different processors in any combination.
Embodiments of the present invention also provide a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above-mentioned method embodiments when executed.
In the present embodiment, the above-mentioned computer-readable storage medium may be configured to store a computer program for executing the steps of:
s1, determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence;
s2, enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector;
s3, splicing the enhanced word hidden state vector and sentence hidden state vector to obtain a spliced hidden state vector;
and S4, detecting the target words in the sentence by using the spliced hidden state vector so as to determine the words associated with the target words in the sentence.
In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to: various media capable of storing computer programs, such as a usb disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
In an exemplary embodiment, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
In an exemplary embodiment, the processor may be configured to execute the following steps by a computer program:
s1, determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence;
s2, enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector;
s3, splicing the enhanced word hidden state vector and sentence hidden state vector to obtain a spliced hidden state vector;
and S4, detecting the target words in the sentence by using the spliced hidden state vector so as to determine the words associated with the target words in the sentence.
For specific examples in this embodiment, reference may be made to the examples described in the above embodiments and exemplary embodiments, and details of this embodiment are not repeated herein.
It will be apparent to those skilled in the art that the various modules or steps of the invention described above may be implemented using a general purpose computing device, they may be centralized on a single computing device or distributed across a network of computing devices, and they may be implemented using program code executable by the computing devices, such that they may be stored in a memory device and executed by the computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into various integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for detecting a word, comprising:
determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence;
enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector;
splicing the enhanced word hidden state vector and the sentence hidden state vector to obtain a spliced hidden state vector;
detecting the target word in the sentence by using the concatenated hidden state vector to determine a word associated with the target word in the sentence.
2. The method of claim 1, wherein determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence comprises:
inputting the word vector of the target word into a first bidirectional Long Short-Term Memory (BiLSTM) layer to obtain a word hidden state vector of the target word output by the first BiLSTM layer;
and inputting the sentence vector of the sentence into the first BiLSTM layer to obtain the sentence hidden state vector of the sentence output by the first BiLSTM layer.
3. The method of claim 1, wherein enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector comprises:
inputting the word hidden state vector into a Self-Attention layer, so that the Self-Attention layer captures the relationship between the target word and the other words, and outputting the enhanced word hidden state vector.
4. The method of claim 1, wherein detecting the target word in the sentence using the concatenated hidden state vector to determine a word associated with the target word in the sentence comprises:
inputting the spliced hidden state vector into a second BiLSTM layer to obtain a target hidden state vector of the target word output by the second BiLSTM layer;
searching target label information corresponding to the label information of the target hidden state vector in the sentence;
determining a word corresponding to the target tag information as a word associated with the target word.
5. The method of claim 1, wherein prior to determining a word hidden state vector for a target word in a sentence to be processed and a sentence hidden state vector for the sentence, the method further comprises:
mapping the target word to a distributed representation space to capture semantic and syntactic characteristics of the target word using the distributed representation space;
determining a word vector of the target word in a preset word vector library based on the semantic meaning and the syntactic characteristics of the target word.
6. The method of claim 1, wherein prior to determining a word hidden state vector for a target word in a sentence to be processed and a sentence hidden state vector for the sentence, the method further comprises:
and encoding the sentence by using a Convolutional Neural Network (CNN) model to obtain a sentence vector of the sentence, wherein the CNN model comprises N channels, and N is a natural number greater than 1.
7. The method of claim 6, wherein encoding the sentence using a Convolutional Neural Network (CNN) model to obtain a sentence vector of the sentence comprises:
capturing word information in the sentence by using a filter in a first channel of the N channels to obtain a first vector, wherein the first channel is used for representing a character-level vector;
capturing word information in the sentence by using a filter in a second channel of the N channels to obtain a second vector, wherein the second channel is used for representing a word-level vector;
determining the sentence vector of the sentence from the first vector and the second vector.
8. An apparatus for detecting a word, comprising:
the system comprises a first determining module, a second determining module and a third determining module, wherein the first determining module is used for determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence;
the first enhancement module is used for enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector;
the first splicing module is used for splicing the enhanced word hidden state vector and the sentence hidden state vector to obtain a spliced hidden state vector;
a first detection module, configured to detect the target word in the sentence by using the concatenated hidden state vector, so as to determine a word associated with the target word in the sentence.
9. A computer-readable storage medium, in which a computer program is stored, wherein the computer program is arranged to perform the method of any of claims 1 to 7 when executed.
10. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, and wherein the processor is arranged to execute the computer program to perform the method of any of claims 1 to 7.
CN202011407627.0A 2020-12-04 2020-12-04 Word detection method and device, storage medium and electronic device Pending CN112597757A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011407627.0A CN112597757A (en) 2020-12-04 2020-12-04 Word detection method and device, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011407627.0A CN112597757A (en) 2020-12-04 2020-12-04 Word detection method and device, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN112597757A true CN112597757A (en) 2021-04-02

Family

ID=75188276

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011407627.0A Pending CN112597757A (en) 2020-12-04 2020-12-04 Word detection method and device, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN112597757A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408812A (en) * 2018-09-30 2019-03-01 北京工业大学 A method of the sequence labelling joint based on attention mechanism extracts entity relationship
CN110188190A (en) * 2019-04-03 2019-08-30 阿里巴巴集团控股有限公司 Talk with analytic method, device, server and readable storage medium storing program for executing
CN110704598A (en) * 2019-09-29 2020-01-17 北京明略软件系统有限公司 Statement information extraction method, extraction device and readable storage medium
CN111160008A (en) * 2019-12-18 2020-05-15 华南理工大学 Entity relationship joint extraction method and system
CN111222336A (en) * 2019-12-25 2020-06-02 北京明略软件系统有限公司 Method and device for identifying unknown entity
CN111597816A (en) * 2020-05-22 2020-08-28 北京慧闻科技(集团)有限公司 Self-attention named entity recognition method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨聪聪: "面向对话领域的命名实体识别方法研究与应用", 中国优秀硕士学位论文全文数据库 信息科技辑, no. 7, pages 29 - 40 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113158643A (en) * 2021-04-27 2021-07-23 广东外语外贸大学 Novel text readability assessment method and system
CN113158643B (en) * 2021-04-27 2024-05-28 广东外语外贸大学 Novel text readability evaluation method and system

Similar Documents

Publication Publication Date Title
CN112199375B (en) Cross-modal data processing method and device, storage medium and electronic device
CN109918560B (en) Question and answer method and device based on search engine
US10796224B2 (en) Image processing engine component generation method, search method, terminal, and system
US11651014B2 (en) Source code retrieval
Sridhar et al. Fake news detection and analysis using multitask learning with BiLSTM CapsNet model
CN113298197B (en) Data clustering method, device, equipment and readable storage medium
CN114358203A (en) Training method and device for image description sentence generation module and electronic equipment
CN112182167B (en) Text matching method and device, terminal equipment and storage medium
Zhang et al. Image region annotation based on segmentation and semantic correlation analysis
CN113761868A (en) Text processing method and device, electronic equipment and readable storage medium
CN113986950A (en) SQL statement processing method, device, equipment and storage medium
CN115640520A (en) Method, device and storage medium for pre-training cross-language cross-modal model
CN116050352A (en) Text encoding method and device, computer equipment and storage medium
CN112597757A (en) Word detection method and device, storage medium and electronic device
CN110705258A (en) Text entity identification method and device
CN114398903B (en) Intention recognition method, device, electronic equipment and storage medium
CN113408282B (en) Method, device, equipment and storage medium for topic model training and topic prediction
CN110502741B (en) Chinese text recognition method and device
CN111222328A (en) Label extraction method and device and electronic equipment
CN115129885A (en) Entity chain pointing method, device, equipment and storage medium
CN112749251B (en) Text processing method, device, computer equipment and storage medium
CN112836057B (en) Knowledge graph generation method, device, terminal and storage medium
CN114398482A (en) Dictionary construction method and device, electronic equipment and storage medium
CN114818727A (en) Key sentence extraction method and device
CN113961701A (en) Message text clustering method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination