CN112597757A - Word detection method and device, storage medium and electronic device - Google Patents

Word detection method and device, storage medium and electronic device Download PDF

Info

Publication number
CN112597757A
CN112597757A (application number CN202011407627.0A)
Authority
CN
China
Prior art keywords
word
sentence
hidden state
state vector
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011407627.0A
Other languages
Chinese (zh)
Inventor
杨聪聪
朱海刚
王鹏
田江
向小佳
丁永建
李璠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Everbright Technology Co ltd
Original Assignee
Everbright Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Everbright Technology Co ltd filed Critical Everbright Technology Co ltd
Priority to CN202011407627.0A
Publication of CN112597757A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/205 - Parsing
    • G06F40/211 - Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/30 - Semantic analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/049 - Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the invention provide a word detection method and apparatus, a storage medium, and an electronic device. The method includes: determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence; enhancing the relationship between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector; splicing the enhanced word hidden state vector and the sentence hidden state vector to obtain a spliced hidden state vector; and detecting the target word in the sentence by using the spliced hidden state vector, so as to determine a word associated with the target word in the sentence. The invention solves the problem of low word detection accuracy in the related art and achieves the effect of accurately detecting associated words in sentences.

Description

Word detection method and device, storage medium and electronic device
Technical Field
Embodiments of the present invention relate to the field of communications, and in particular to a word detection method and apparatus, a storage medium, and an electronic device.
Background
With the continually growing volume of information in network texts, the ability to quickly and efficiently identify the entity information contained in text is of great significance to many industries. Most current named entity recognition models use context encoders (for example, Long Short-Term Memory networks (LSTM) and Convolutional Neural Networks (CNN)) to obtain contextual state representations of words, which are used to predict the final entity labels. Although these word representations can learn the current context information, their sentence-level semantic information is still weak.
In view of the above technical problems, no effective solution has been proposed in the related art.
Disclosure of Invention
The embodiment of the invention provides a word detection method and device, a storage medium and an electronic device, and aims to at least solve the problem of low word detection accuracy in the related art.
According to an embodiment of the present invention, there is provided a word detection method including: determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence; enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector; splicing the enhanced word hidden state vector and the sentence hidden state vector to obtain a spliced hidden state vector; and detecting the target word in the sentence by using the spliced hidden state vector so as to determine a word associated with the target word in the sentence.
According to another embodiment of the present invention, there is provided a word detection apparatus including: the first determining module is used for determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence; the first enhancement module is used for enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector; the first splicing module is used for splicing the enhanced word hidden state vector and the sentence hidden state vector to obtain a spliced hidden state vector; and the first detection module is used for detecting the target word in the sentence by using the splicing hidden state vector so as to determine a word associated with the target word in the sentence.
In an exemplary embodiment, the first determining module includes: a first determining unit, configured to input the word vector of the target word into a first bidirectional Long Short-Term Memory (BiLSTM) layer to obtain the word hidden state vector of the target word output by the first BiLSTM layer; and a second determining unit, configured to input the sentence vector of the sentence into the first BiLSTM layer to obtain the sentence hidden state vector of the sentence output by the first BiLSTM layer.
In an exemplary embodiment, the first enhancement module includes: a first processing unit, configured to input the word hidden state vector into a Self-Attention layer, so that the Self-Attention layer captures the relationship between the target word and the other words, and to output the enhanced word hidden state vector.
In an exemplary embodiment, the first detecting module includes: a third determining unit, configured to input the spliced hidden state vector into a second BiLSTM layer, so as to obtain a target hidden state vector of the target word output by the second BiLSTM layer; a first searching unit, configured to search for target tag information corresponding to tag information of the target hidden state vector in the sentence; a fourth determining unit, configured to determine a word corresponding to the target tag information as a word associated with the target word.
In an exemplary embodiment, the apparatus further includes: a first mapping module, configured to map the target word to a distributed representation space before determining the word hidden state vector of the target word in the sentence to be processed and the sentence hidden state vector of the sentence, so as to capture the semantic and syntactic characteristics of the target word by using the distributed representation space; and a second determining module, configured to determine the word vector of the target word in a preset word vector library based on the semantic and syntactic characteristics of the target word.
In an exemplary embodiment, the apparatus further includes: an encoding module, configured to encode the sentence by using a Convolutional Neural Network (CNN) model before determining the word hidden state vector of the target word in the sentence to be processed and the sentence hidden state vector of the sentence, so as to obtain the sentence vector of the sentence, where the CNN model includes N channels, and N is a natural number greater than 1.
In an exemplary embodiment, the encoding module includes: a first capturing unit, configured to capture word information in the sentence by using a filter in a first channel of the N channels to obtain a first vector, where the first channel is used to represent a character-level vector; a second capturing unit, configured to capture word information in the sentence by using a filter in a second channel of the N channels to obtain a second vector, where the second channel is used to represent a word-level vector; and a fifth determining unit, configured to determine the sentence vector of the sentence from the first vector and the second vector.
According to a further embodiment of the present invention, there is also provided a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
According to yet another embodiment of the present invention, there is also provided an electronic device, including a memory in which a computer program is stored and a processor configured to execute the computer program to perform the steps in any of the above method embodiments.
According to the method, the word hidden state vector of the target word in the sentence to be processed and the sentence hidden state vector of the sentence are determined; the relationship between the target word and other words is enhanced based on the word hidden state vector to obtain an enhanced word hidden state vector; the enhanced word hidden state vector and the sentence hidden state vector are spliced to obtain a spliced hidden state vector; and the target word in the sentence is detected by using the spliced hidden state vector, so as to determine the word associated with the target word in the sentence. In this way, each word obtains rich semantic information, and entity labels can be predicted more accurately. Therefore, the problem of low word detection accuracy can be solved, and the effect of accurately detecting associated words in sentences can be achieved.
Drawings
Fig. 1 is a block diagram of a hardware configuration of a mobile terminal of a word detection method according to an embodiment of the present invention;
FIG. 2 is a flow diagram of a method of detecting words, according to an embodiment of the invention;
FIG. 3 is a diagram of a sentence representation model based on multi-channel CNN according to an embodiment of the present invention;
FIG. 4 is a diagram of an NER model based on sentence semantics and a Self-Attention mechanism according to an embodiment of the present invention;
fig. 5 is a block diagram of a word detection apparatus according to an embodiment of the present invention.
Detailed Description
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The method embodiments provided in the embodiments of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking operation on a mobile terminal as an example, fig. 1 is a block diagram of the hardware structure of a mobile terminal implementing the word detection method of the present invention. As shown in fig. 1, the mobile terminal may include one or more processors 102 (only one is shown in fig. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data, and may further include a transmission device 106 for communication functions and an input/output device 108. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and does not limit the structure of the mobile terminal. For example, the mobile terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 may be used to store computer programs, for example, software programs and modules of application software, such as computer programs corresponding to the word detection method in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer programs stored in the memory 104, so as to implement the above-mentioned method. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the mobile terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the mobile terminal. In one example, the transmission device 106 includes a Network adapter (NIC), which can be connected to other Network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
In the present embodiment, a word detection method is provided, and fig. 2 is a flowchart of a word detection method according to an embodiment of the present invention, as shown in fig. 2, the flowchart includes the following steps:
step S202, determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence;
step S204, enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector;
step S206, splicing the enhanced word hidden state vector and sentence hidden state vector to obtain a spliced hidden state vector;
and S208, detecting the target words in the sentence by using the spliced hidden state vector so as to determine the words associated with the target words in the sentence.
The execution subject of the above steps may be a server, etc., but is not limited thereto.
Optionally, the present embodiment includes, but is not limited to, application in a scenario in which semantic information of a word is detected.
Through the above steps, the word hidden state vector of the target word in the sentence to be processed and the sentence hidden state vector of the sentence are determined; the relationship between the target word and other words is enhanced based on the word hidden state vector to obtain an enhanced word hidden state vector; the enhanced word hidden state vector and the sentence hidden state vector are spliced to obtain a spliced hidden state vector; and the target word in the sentence is detected by using the spliced hidden state vector, so as to determine the word associated with the target word in the sentence. In this way, each word obtains rich semantic information, and entity labels can be predicted more accurately. Therefore, the problem of low word detection accuracy can be solved, and the effect of accurately detecting associated words in sentences can be achieved.
Optionally, in this embodiment, determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence includes:
s1, inputting the word vector of the target word into the first BiLSTM layer to obtain the word hidden state vector of the target word output by the first BiLSTM layer;
s2, inputting the sentence vector of the sentence into the first BiLSTM layer to obtain the sentence hidden state vector of the sentence output by the first BiLSTM layer.
Optionally, in this embodiment, for example, if the target word is "Beijing", the word vector corresponding to "Beijing" is input into the first BiLSTM layer to obtain the word hidden state vector of "Beijing". If the sentence is "Beijing welcomes you", the sentence vector of "Beijing welcomes you" is input into the first BiLSTM layer to obtain the sentence hidden state vector of "Beijing welcomes you".
Optionally, in this embodiment, the word vector of the target word may be looked up in a set of pre-trained 100-dimensional word vectors.
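As an illustrative sketch of this step (the dimensions, batch size, and variable names below are assumptions, not the patented implementation), the first BiLSTM layer can be written in PyTorch as:

```python
# Minimal sketch of obtaining word hidden state vectors from a first
# BiLSTM layer, corresponding to steps s1/s2 above.
import torch
import torch.nn as nn

EMB_DIM, HID_DIM = 100, 128  # 100-dim word vectors, as mentioned above

bilstm1 = nn.LSTM(EMB_DIM, HID_DIM, batch_first=True, bidirectional=True)

# word_vectors: (batch, seq_len, EMB_DIM), e.g. pre-trained 100-dim vectors
word_vectors = torch.randn(1, 5, EMB_DIM)

# Each word's hidden state is the concatenation of the forward and
# backward LSTM states, giving shape (batch, seq_len, 2 * HID_DIM).
word_hidden, _ = bilstm1(word_vectors)
```

The same layer can encode the whole sentence; the sentence hidden state vector is then derived from the sequence of per-word states.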
In an exemplary embodiment, enhancing the relationship between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector includes:
s1, inputting the word hidden state vector into the Self-Attention layer, so that the Self-Attention layer captures the relationship between the target word and other words, and outputting the enhanced word hidden state vector.
Optionally, in this embodiment, the Self-Attention layer employs a self-attention mechanism.
Optionally, in this embodiment: compared with a plain RNN, BiLSTM can handle somewhat longer sentences; however, as sentences grow longer, the dependency information between distant words becomes weaker, so long-range dependencies between words are not captured well. The Self-Attention mechanism can directly capture the relationship between any two words in a sentence regardless of their distance, and captures the syntactic and semantic features within the sentence well, making it a good complement to BiLSTM. The Self-Attention (or internal attention) mechanism is a special attention mechanism in which the input sequence attends to itself: attention weights of the sequence with respect to itself are computed to find relationships inside the sequence. Query, Key, and Value are all the same sequence X, i.e., Attention(X, X, X), and the final output is Y = MultiHead(X, X, X).
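A minimal PyTorch sketch of this step, assuming the enhanced states are produced by a standard multi-head self-attention module (the dimensionality and head count are illustrative assumptions):

```python
# Self-Attention step: queries, keys and values are all the same
# sequence X (the word hidden states), i.e. Y = MultiHead(X, X, X).
import torch
import torch.nn as nn

D_MODEL = 256  # 2 * HID_DIM from the first BiLSTM (assumed)
attn = nn.MultiheadAttention(embed_dim=D_MODEL, num_heads=4, batch_first=True)

X = torch.randn(1, 5, D_MODEL)   # word hidden state vectors
# Every word attends to every other word, regardless of distance,
# which is exactly the long-range behavior BiLSTM lacks.
Y, attn_weights = attn(X, X, X)  # Y: enhanced word hidden state vectors
```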
In one exemplary embodiment, detecting a target word in a sentence using a stitched hidden state vector to determine a word associated with the target word in the sentence comprises:
s1, inputting the spliced hidden state vector into the second BiLSTM layer to obtain a target hidden state vector of the target word output by the second BiLSTM layer;
s2, searching target label information corresponding to the label information of the target hidden state vector in the sentence;
s3, the word corresponding to the target label information is determined as the word associated with the target word.
Optionally, in this embodiment, the mutual information between adjacent tags is useful; for example, the tag I-PER cannot follow B-LOC. The tag sequence can be jointly decoded using a CRF layer, which allows the model to find the optimal path among all possible tag sequences.
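For illustration, the joint decoding can be sketched as a plain Viterbi search over emission and transition scores (a minimal sketch, not the patented CRF implementation; a trained CRF learns the transition scores so that invalid moves such as B-LOC to I-PER become strongly negative):

```python
# Viterbi decoding over per-word emission scores and tag-to-tag
# transition scores; start/end transitions are omitted for brevity.
import torch

def viterbi_decode(emissions: torch.Tensor, transitions: torch.Tensor) -> list:
    """emissions: (seq_len, num_tags); transitions[i, j]: score of tag i -> tag j."""
    seq_len, num_tags = emissions.shape
    score = emissions[0]  # best score of a path ending in each tag so far
    backpointers = []
    for t in range(1, seq_len):
        # candidate[i, j] = best path ending in tag i, then moving to tag j
        candidate = score.unsqueeze(1) + transitions + emissions[t].unsqueeze(0)
        score, best_prev = candidate.max(dim=0)
        backpointers.append(best_prev)
    path = [int(score.argmax())]
    for best_prev in reversed(backpointers):
        path.append(int(best_prev[path[-1]]))
    return path[::-1]  # optimal tag sequence, left to right
```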
In an exemplary embodiment, before determining the word hidden state vector of the target word and the sentence hidden state vector of the sentence in the sentence to be processed, the method further comprises:
s1, mapping the target word to a distributed representation space, so as to capture the semantic and syntactic characteristics of the target word by using the distributed representation space;
s2, determining the word vector of the target word in a preset word vector library based on the semantic and syntactic characteristics of the target word.
Optionally, in this embodiment, the distributed representation space is a low-dimensional dense vector representation space capable of capturing the semantic and syntactic characteristics of words; the word embedding layers for Chinese and English may differ due to their linguistic characteristics.
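To make the lookup step concrete, here is a minimal PyTorch sketch of retrieving a word vector from a preset word vector library (the vocabulary, matrix, and names are illustrative assumptions):

```python
# Looking up a target word's distributed representation in a preset
# word vector library; the pretrained matrix here is a stand-in.
import torch
import torch.nn as nn

vocab = {"<unk>": 0, "Beijing": 1, "welcomes": 2, "you": 3}
pretrained = torch.randn(len(vocab), 100)  # stand-in for trained 100-dim vectors
embedding = nn.Embedding.from_pretrained(pretrained, freeze=False)

idx = torch.tensor([vocab.get("Beijing", vocab["<unk>"])])
word_vector = embedding(idx)  # distributed representation, shape (1, 100)
```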
For example, in the English named entity recognition task, the English word embedding layer generally consists of word-level vectors and character-level vectors. Using pre-trained word vectors achieves a significant performance improvement over randomly initialized word vectors; thus, Stanford's publicly available pre-trained 300-dimensional word vectors can be used as the word-level vectors. In addition, different English words share surface or morphological similarities; for example, words with regular suffixes may share some character-level features. Character-level features are therefore used to handle unknown words. First, a character-level vector representation model is built with a BiLSTM:
$\overrightarrow{h_i} = \overrightarrow{\mathrm{LSTM}}(c_i, \overrightarrow{h_{i-1}})$ (Formula 1)

$\overleftarrow{h_i} = \overleftarrow{\mathrm{LSTM}}(c_i, \overleftarrow{h_{i+1}})$ (Formula 2)

$h_c = [\overleftarrow{h_1}; \overrightarrow{h_f}]$ (Formula 3)

where $c_i$ represents one character of the character sequence $(c_1, c_2, \ldots, c_f)$ and $f$ represents the length of the character sequence. $\overrightarrow{h_i}$ and $\overleftarrow{h_i}$ are the hidden state vectors of the forward LSTM and the backward LSTM, respectively, and $h_c$ is the finally obtained character-level vector.
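A minimal PyTorch sketch of Formulas 1 to 3 (character embedding size, hidden size, and variable names are assumptions):

```python
# Character-level vector: run a BiLSTM over the character sequence and
# concatenate the final backward and forward states (Formula 3).
import torch
import torch.nn as nn

CHAR_EMB, CHAR_HID = 30, 25
char_lstm = nn.LSTM(CHAR_EMB, CHAR_HID, batch_first=True, bidirectional=True)

chars = torch.randn(1, 7, CHAR_EMB)      # (c_1, ..., c_f), here f = 7
_, (h_n, _) = char_lstm(chars)           # h_n: (2, batch, CHAR_HID)
h_fwd, h_bwd = h_n[0], h_n[1]            # final forward / backward states
h_c = torch.cat([h_bwd, h_fwd], dim=-1)  # character-level vector h_c
```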
The Chinese word embedding layer differs from the English one. Because Chinese lacks natural word delimiters, its input representation depends heavily on word segmentation, and early Chinese named entity recognition proposed three embedding methods: character embedding, word embedding, and position embedding. Because word segmentation methods are numerous and inconsistent and segmentation quality is often poor, incorrect segmentation can propagate errors into downstream tasks and degrade the performance of the final model. In Chinese Natural Language Processing (NLP) tasks under neural network models, character-level representations almost always outperform word-level representations; for some tasks, character-level representation alone achieves the best performance, and adding word-level information is counterproductive. Thus, pre-trained Chinese 100-dimensional character vectors can be used as the input of the model.
In an exemplary embodiment, before determining the word hidden state vector of the target word and the sentence hidden state vector of the sentence in the sentence to be processed, the method further comprises:
s1, encoding the sentence by using a Convolutional Neural Network (CNN) model to obtain the sentence vector of the sentence, where the CNN model includes N channels, and N is a natural number greater than 1.
In an exemplary embodiment, encoding the sentence using the Convolutional Neural Network (CNN) model to obtain the sentence vector of the sentence includes:
s1, capturing word information in a sentence by using a filter in a first channel of the N channels to obtain a first vector, wherein the first channel is used for representing a character-level vector;
s2, capturing word information in the sentence by using a filter in a second channel of the N channels to obtain a second vector, wherein the second channel is used for representing a word-level vector;
s3, determining the sentence vector of the sentence from the first vector and the second vector.
Optionally, in this embodiment: although the bidirectional LSTM can learn the context information of the current word, the global semantic information available to the current word is still weak. This embodiment therefore provides a sentence representation model based on a multi-channel convolutional neural network. Concatenating the sentence representation with the context word states obtained via BiLSTM makes better use of the global context. CNN sentence encoders typically use a single-channel structure with a 1-D convolutional layer to generate the sentence representation; here, a multi-channel CNN is used to model the sentence representation.
As shown in FIG. 3, the embedding layer has two channels, each of which is a set of vectors: the first channel holds character-level vectors and the second channel holds word-level vectors. Each filter is applied to both channels to capture adjacent word information, and a max-pooling layer then selects the strongest feature as part of the current sentence representation, as in the following formulas:
$m_i = f(W \cdot x_{i:i+k-1} + b)$

$g_i = \mathrm{pooling}([m_1, m_2, \ldots, m_f])$ (Formula 4)

$h_S = [g_1; g_2; \ldots; g_o]$ (Formula 5)

where $k$ is the size of the filter, $W$ and $b$ are the filter weights and bias, $o$ is the number of filter types, $g_i$ is the feature obtained by one type of filter, $x_i$ represents one word of the input sentence $(x_1, x_2, \ldots, x_n)$, and $h_S$ is the sentence-level representation of the input sentence.
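A hedged PyTorch sketch of Formulas 4 and 5 (filter sizes, filter count, and dimensions are assumptions; for brevity the character-level and word-level channels are treated as one pre-combined embedding):

```python
# Multi-channel CNN sentence encoder: filters of several sizes slide
# over the embedded sentence; max-pooling keeps the strongest feature
# per filter type (Formula 4), and the results are concatenated (Formula 5).
import torch
import torch.nn as nn

EMB_DIM = 100
filter_sizes = [2, 3, 4]  # o = 3 filter types (assumed sizes)
convs = nn.ModuleList(
    [nn.Conv1d(EMB_DIM, 64, kernel_size=k) for k in filter_sizes]
)

sentence = torch.randn(1, 9, EMB_DIM)  # (x_1, ..., x_n), n = 9
x = sentence.transpose(1, 2)           # Conv1d expects (batch, channels, n)
# g_i = pooling over the feature map of each filter type
g = [conv(x).relu().max(dim=-1).values for conv in convs]
h_S = torch.cat(g, dim=-1)             # sentence-level representation h_S
```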
The invention is illustrated below with reference to specific examples:
the embodiment provides a sentence representation model carrying semantic information of sentences. By splicing the context word expression and the sentence expression, each word can obtain abundant semantic information, and entity label prediction can be better carried out. In addition, a Self-authorization mechanism is applied in the model, and the Self-authorization mechanism can directly capture the long-distance dependence of any two words in the sentence and better capture the global dependence of the whole sentence. The model provided by the invention is not only suitable for English data sets, but also suitable for Chinese data sets.
As shown in fig. 4, the model is divided into 5 parts: a word embedding layer, a first BiLSTM layer, a Self-Attention layer, a part that combines the word hidden states with the sentence-level vector, and a second BiLSTM layer. The word embeddings pass through the first BiLSTM layer to obtain the hidden state vector representations of the words and the vector representation of the sentence; the word hidden states are then fed into the Self-Attention layer to strengthen the ability to capture relationships between words. $h_S$ is the sentence-level vector representation generated by the multi-channel-CNN-based sentence representation model. Finally, the word hidden states are spliced with the sentence-level vector and fed into the second BiLSTM layer to obtain the final hidden state vectors for label prediction. Considering the mutual information between neighboring tags is useful; for example, it is unlikely that tag I-PER will follow B-LOC. Thus, the tag sequence is finally decoded jointly using a CRF layer, which allows the model to find the optimal path among all possible tag sequences.
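Putting the five parts together, an end-to-end sketch of the forward pass might look as follows (dimensions, head count, tag-set size, and wiring details are assumptions for illustration; CRF decoding is applied to the returned emission scores as sketched earlier):

```python
# Sketch of the five-part model: first BiLSTM, Self-Attention,
# splicing with the sentence vector h_S, second BiLSTM, emissions.
import torch
import torch.nn as nn

class SentenceSemanticsNER(nn.Module):
    def __init__(self, emb_dim=100, hid=128, num_tags=9):
        super().__init__()
        self.bilstm1 = nn.LSTM(emb_dim, hid, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hid, num_heads=4, batch_first=True)
        # The second BiLSTM consumes [enhanced word state; sentence vector h_S].
        self.bilstm2 = nn.LSTM(4 * hid, hid, batch_first=True, bidirectional=True)
        self.emit = nn.Linear(2 * hid, num_tags)  # emission scores for CRF decoding

    def forward(self, word_emb, h_s):
        # word_emb: (batch, n, emb_dim); h_s: (batch, 2 * hid) sentence vector
        word_h, _ = self.bilstm1(word_emb)               # first BiLSTM layer
        enhanced, _ = self.attn(word_h, word_h, word_h)  # Self-Attention layer
        h_s_tiled = h_s.unsqueeze(1).expand(-1, word_emb.size(1), -1)
        spliced = torch.cat([enhanced, h_s_tiled], dim=-1)  # splice with h_S
        final_h, _ = self.bilstm2(spliced)               # second BiLSTM layer
        return self.emit(final_h)                        # decode with a CRF layer

model = SentenceSemanticsNER()
emissions = model(torch.randn(1, 6, 100), torch.randn(1, 256))  # (1, 6, 9)
```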
In summary, in this embodiment, the context word representation and the sentence representation are spliced so that each word obtains rich semantic information and entity tag prediction is performed better. In addition, since BiLSTM cannot directly capture the long-distance dependency between any two words in a sentence, Self-Attention is applied to capture such dependencies directly, so that the global dependencies of the whole sentence are better captured. The proposed model is applicable not only to English data sets but also to Chinese data sets.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
In this embodiment, a word detection device is further provided, and the device is used to implement the foregoing embodiments and preferred embodiments, and the description of the device that has been already made is omitted. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.
Fig. 5 is a block diagram of a word detection apparatus according to an embodiment of the present invention, as shown in fig. 5, the apparatus including:
a first determining module 52, configured to determine a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence;
a first enhancing module 54, configured to enhance a relationship between the target word and another word based on the word hidden state vector to obtain an enhanced word hidden state vector;
a first splicing module 56, configured to splice the enhanced word hidden state vector and sentence hidden state vector to obtain a spliced hidden state vector;
a first detection module 58, configured to detect the target word in the sentence by using the concatenated hidden state vector to determine a word associated with the target word in the sentence.
In an exemplary embodiment, the first determining module includes:
a first determining unit, configured to input the word vector of the target word into a first BiLSTM layer to obtain the word hidden state vector of the target word output by the first BiLSTM layer;
and a second determining unit, configured to input the sentence vector of the sentence into the first BiLSTM layer to obtain the sentence hidden state vector of the sentence output by the first BiLSTM layer.
In an exemplary embodiment, the first enhancement module includes:
a first processing unit, configured to input the word hidden state vector into a Self-Attention layer, so that the Self-Attention layer captures the relationship between the target word and the other words, and to output the enhanced word hidden state vector.
In an exemplary embodiment, the first detecting module includes:
a third determining unit, configured to input the spliced hidden state vector into a second BiLSTM layer, so as to obtain a target hidden state vector of the target word output by the second BiLSTM layer;
a first searching unit, configured to search for target tag information corresponding to tag information of the target hidden state vector in the sentence;
a fourth determining unit, configured to determine a word corresponding to the target tag information as a word associated with the target word.
In an exemplary embodiment, the apparatus further includes: a first mapping module, configured to map the target word to a distributed representation space before determining the word hidden state vector of the target word in the sentence to be processed and the sentence hidden state vector of the sentence, so as to capture the semantic and syntactic characteristics of the target word by using the distributed representation space;
and the second determining module is used for determining the word vector of the target word in a preset word vector library based on the semantic meaning and the syntactic characteristics of the target word.
In an exemplary embodiment, the apparatus further includes:
the system comprises a coding module and a processing module, wherein the coding module is used for coding a sentence by using a Cellular Neural Network (CNN) model before determining a word hidden state vector of a target word in the sentence to be processed and a sentence hidden state vector of the sentence to obtain the sentence vector of the sentence, the CNN model comprises N channels, and N is a natural number greater than 1.
In an exemplary embodiment, the encoding module includes:
a first capturing unit, configured to capture word information in the sentence by using a filter in a first channel of the N channels to obtain a first vector, where the first channel is used to represent a character-level vector;
a second capturing unit, configured to capture word information in the sentence by using a filter in a second channel of the N channels to obtain a second vector, where the second channel is used to represent a word-level vector;
a fifth determining unit, configured to determine the sentence vector of the sentence from the first vector and the second vector.
It should be noted that, the above modules may be implemented by software or hardware, and for the latter, the following may be implemented, but not limited to: the modules are all positioned in the same processor; alternatively, the modules are respectively located in different processors in any combination.
Embodiments of the present invention also provide a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above-mentioned method embodiments when executed.
In the present embodiment, the above-mentioned computer-readable storage medium may be configured to store a computer program for executing the steps of:
s1, determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence;
s2, enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector;
s3, splicing the enhanced word hidden state vector and sentence hidden state vector to obtain a spliced hidden state vector;
and S4, detecting the target words in the sentence by using the spliced hidden state vector so as to determine the words associated with the target words in the sentence.
In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to: various media capable of storing computer programs, such as a usb disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
In an exemplary embodiment, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
In an exemplary embodiment, the processor may be configured to execute the following steps by a computer program:
s1, determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence;
s2, enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector;
s3, splicing the enhanced word hidden state vector and sentence hidden state vector to obtain a spliced hidden state vector;
and S4, detecting the target words in the sentence by using the spliced hidden state vector so as to determine the words associated with the target words in the sentence.
For specific examples in this embodiment, reference may be made to the examples described in the above embodiments and exemplary embodiments, and details of this embodiment are not repeated herein.
It will be apparent to those skilled in the art that the various modules or steps of the invention described above may be implemented using a general purpose computing device, they may be centralized on a single computing device or distributed across a network of computing devices, and they may be implemented using program code executable by the computing devices, such that they may be stored in a memory device and executed by the computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into various integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for detecting a word, comprising:
determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence;
enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector;
splicing the enhanced word hidden state vector and the sentence hidden state vector to obtain a spliced hidden state vector;
detecting the target word in the sentence by using the concatenated hidden state vector to determine a word associated with the target word in the sentence.
2. The method of claim 1, wherein determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence comprises:
inputting the word vector of the target word into a first bidirectional Long Short-Term Memory (BiLSTM) layer to obtain a word hidden state vector of the target word output by the first BiLSTM layer;
and inputting the sentence vector of the sentence into the first BiLSTM layer to obtain the sentence hidden state vector of the sentence output by the first BiLSTM layer.
3. The method of claim 1, wherein enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector comprises:
inputting the word hidden state vector into a Self-Attention layer, so that the Self-Attention layer captures the relationship between the target word and the other words, and outputting the enhanced word hidden state vector.
4. The method of claim 1, wherein detecting the target word in the sentence using the concatenated hidden state vector to determine a word associated with the target word in the sentence comprises:
inputting the spliced hidden state vector into a second BiLSTM layer to obtain a target hidden state vector of the target word output by the second BiLSTM layer;
searching target label information corresponding to the label information of the target hidden state vector in the sentence;
determining a word corresponding to the target tag information as a word associated with the target word.
5. The method of claim 1, wherein prior to determining a word hidden state vector for a target word in a sentence to be processed and a sentence hidden state vector for the sentence, the method further comprises:
mapping the target word to a distributed representation space to capture semantic and syntactic characteristics of the target word using the distributed representation space;
determining a word vector of the target word in a preset word vector library based on the semantic meaning and the syntactic characteristics of the target word.
6. The method of claim 1, wherein prior to determining a word hidden state vector for a target word in a sentence to be processed and a sentence hidden state vector for the sentence, the method further comprises:
and encoding the sentence by using a Convolutional Neural Network (CNN) model to obtain a sentence vector of the sentence, wherein the CNN model comprises N channels, and N is a natural number greater than 1.
7. The method of claim 6, wherein encoding the sentence using a Convolutional Neural Network (CNN) model to obtain a sentence vector of the sentence comprises:
capturing word information in the sentence by using a filter in a first channel of the N channels to obtain a first vector, wherein the first channel is used for representing a character-level vector;
capturing word information in the sentence by using a filter in a second channel of the N channels to obtain a second vector, wherein the second channel is used for representing a word-level vector;
determining the sentence vector of the sentence from the first vector and the second vector.
8. An apparatus for detecting a word, comprising:
the system comprises a first determining module, a second determining module and a third determining module, wherein the first determining module is used for determining a word hidden state vector of a target word in a sentence to be processed and a sentence hidden state vector of the sentence;
the first enhancement module is used for enhancing the relation between the target word and other words based on the word hidden state vector to obtain an enhanced word hidden state vector;
the first splicing module is used for splicing the enhanced word hidden state vector and the sentence hidden state vector to obtain a spliced hidden state vector;
a first detection module, configured to detect the target word in the sentence by using the concatenated hidden state vector, so as to determine a word associated with the target word in the sentence.
9. A computer-readable storage medium, in which a computer program is stored, wherein the computer program is arranged to perform the method of any of claims 1 to 7 when executed.
10. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, and wherein the processor is arranged to execute the computer program to perform the method of any of claims 1 to 7.
CN202011407627.0A 2020-12-04 2020-12-04 Word detection method and device, storage medium and electronic device Pending CN112597757A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011407627.0A CN112597757A (en) 2020-12-04 2020-12-04 Word detection method and device, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011407627.0A CN112597757A (en) 2020-12-04 2020-12-04 Word detection method and device, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN112597757A true CN112597757A (en) 2021-04-02

Family

ID=75188276

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011407627.0A Pending CN112597757A (en) 2020-12-04 2020-12-04 Word detection method and device, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN112597757A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408812A (en) * 2018-09-30 2019-03-01 北京工业大学 A method of the sequence labelling joint based on attention mechanism extracts entity relationship
CN110188190A (en) * 2019-04-03 2019-08-30 阿里巴巴集团控股有限公司 Talk with analytic method, device, server and readable storage medium storing program for executing
CN110704598A (en) * 2019-09-29 2020-01-17 北京明略软件系统有限公司 Statement information extraction method, extraction device and readable storage medium
CN111160008A (en) * 2019-12-18 2020-05-15 华南理工大学 Entity relationship joint extraction method and system
CN111222336A (en) * 2019-12-25 2020-06-02 北京明略软件系统有限公司 Method and device for identifying unknown entity
CN111597816A (en) * 2020-05-22 2020-08-28 北京慧闻科技(集团)有限公司 Self-attention named entity recognition method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨聪聪: "面向对话领域的命名实体识别方法研究与应用", 中国优秀硕士学位论文全文数据库 信息科技辑, no. 7, pages 29 - 40 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113158643A (en) * 2021-04-27 2021-07-23 广东外语外贸大学 Novel text readability assessment method and system
CN113158643B (en) * 2021-04-27 2024-05-28 广东外语外贸大学 Novel text readability evaluation method and system

Similar Documents

Publication Publication Date Title
CN112199375B (en) Cross-modal data processing method and device, storage medium and electronic device
CN109918560B (en) Question and answer method and device based on search engine
US10796224B2 (en) Image processing engine component generation method, search method, terminal, and system
US11651014B2 (en) Source code retrieval
Sridhar et al. Fake news detection and analysis using multitask learning with BiLSTM CapsNet model
CN113298197B (en) Data clustering method, device, equipment and readable storage medium
CN114358203A (en) Training method and device for image description sentence generation module and electronic equipment
CN112182167B (en) Text matching method and device, terminal equipment and storage medium
Zhang et al. Image region annotation based on segmentation and semantic correlation analysis
CN113761868A (en) Text processing method and device, electronic equipment and readable storage medium
CN113986950A (en) SQL statement processing method, device, equipment and storage medium
CN115640520A (en) Method, device and storage medium for pre-training cross-language cross-modal model
CN116050352A (en) Text encoding method and device, computer equipment and storage medium
CN112597757A (en) Word detection method and device, storage medium and electronic device
CN110705258A (en) Text entity identification method and device
CN114398903B (en) Intention recognition method, device, electronic equipment and storage medium
CN113408282B (en) Method, device, equipment and storage medium for topic model training and topic prediction
CN110502741B (en) Chinese text recognition method and device
CN111222328A (en) Label extraction method and device and electronic equipment
CN115129885A (en) Entity chain pointing method, device, equipment and storage medium
CN112749251B (en) Text processing method, device, computer equipment and storage medium
CN112836057B (en) Knowledge graph generation method, device, terminal and storage medium
CN114398482A (en) Dictionary construction method and device, electronic equipment and storage medium
CN114818727A (en) Key sentence extraction method and device
CN113961701A (en) Message text clustering method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination