CN111222334A - Named entity identification method, device, equipment and medium

Named entity identification method, device, equipment and medium


Publication number
CN111222334A
CN111222334A
Authority
CN
China
Prior art keywords
named entity
embedding
model
sequence
input
Prior art date: 2019-11-15
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911124011.XA
Other languages
Chinese (zh)
Inventor
姚志强
周曦
李继伟
杜晓薇
郝东
赵云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Honghuang Intelligent Technology Co ltd
Original Assignee
Guangzhou Honghuang Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.): 2019-11-15
Filing date: 2019-11-15
Publication date: 2020-06-02
Application filed by Guangzhou Honghuang Intelligent Technology Co ltd
Priority to CN201911124011.XA
Publication of CN111222334A
Legal status: Pending


Abstract

The invention provides a named entity identification method, device, equipment, and medium, wherein the method comprises the following steps: acquiring a natural-language dialog input by a user; preprocessing the corpus information in the dialog; and recognizing the corpus information by using a pre-trained named entity model to obtain the corresponding named entities. Compared with traditional named entity recognition methods, the named entity recognition module is trained in advance, and the acquired natural-language dialog, after preprocessing, is input to the named entity recognition module to recognize the named entities in the corpus information; on the one hand, the method does not rely on syntactic parse trees or rule-based matching; on the other hand, it reduces the requirement for scenario-specific training data, so less training data is needed.

Description

Named entity identification method, device, equipment and medium
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a named entity identification method, device, equipment, and medium.
Background
Human-computer dialog is a sub-field of artificial intelligence; in plain terms, it means that a person can interact with a computer, such as a human-computer dialog system, through human language (i.e., natural language). Through this interaction, the human-computer dialog system can understand a person's intentions and needs, and thereby complete tasks such as searching for songs, placing shopping orders, and controlling devices.
However, in existing dialog systems, named entity recognition suffers from scarce labeled data, high labeling difficulty, and non-standard labeling, while conventional named entity recognition models are usually syntax-based and must rely on syntactic parse trees or rule matching; a new named entity recognition method is therefore urgently needed.
Disclosure of Invention
In view of the above shortcomings of the prior art, the present invention provides a named entity identification method, device, equipment, and medium, which are used to solve the problems of relying on parse trees or rules and of lacking training data in existing named entity identification processes.
To achieve the above and other related objects, the present invention provides a named entity recognition method, including:
acquiring a natural language-based dialog input by a user;
preprocessing the corpus information in the conversation;
and identifying the corpus information by utilizing a pre-trained named entity model to obtain a corresponding named entity.
Another object of the present invention is to provide a named entity recognition apparatus, comprising:
the dialogue acquisition module is used for acquiring dialogue which is input by a user and is based on natural language;
the preprocessing module is used for preprocessing the corpus information in the conversation;
and the named entity identification module is used for identifying the corpus information by utilizing a pre-trained named entity model to obtain a corresponding named entity.
It is another object of the invention to provide an apparatus comprising:
one or more processors; and
one or more machine-readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform the named entity identification method described above.
It is also an object of the invention to provide one or more machine-readable media having stored thereon instructions that, when executed by one or more processors, cause an apparatus to perform the named entity recognition method described above.
As described above, the named entity identification method, apparatus, device and medium provided by the present invention have the following beneficial effects:
compared with traditional named entity recognition methods, the named entity recognition module is trained in advance, and the acquired natural-language dialog, after preprocessing, is input to the named entity recognition module to recognize the named entities in the corpus information; on the one hand, this does not rely on syntactic parse trees or rule-based matching; on the other hand, it reduces the requirement for scenario-specific training data, so less training data is needed.
Drawings
fig. 1 is a flowchart of a named entity identification method according to an embodiment of the present invention;
fig. 2 is a flowchart of named entity model training in the named entity identification method according to an embodiment of the present invention;
fig. 3 is a flowchart of embedded vector generation in human-machine dialog named entity recognition according to an embodiment of the present invention;
fig. 4 is a block diagram of a named entity recognition apparatus according to an embodiment of the present invention;
fig. 5 is a block diagram of the named entity recognition module in the named entity recognition apparatus according to an embodiment of the present invention;
fig. 6 is a block diagram of the embedded vector generation unit in the named entity recognition apparatus according to an embodiment of the present invention;
fig. 7 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present invention;
fig. 8 is a schematic diagram of a hardware structure of a terminal device according to another embodiment of the present invention.
Description of the element reference numerals
1 dialog acquisition module
2 preprocessing module
3 named entity recognition module
31 embedded vector generation unit
32 named entity feature generation unit
33 named entity discrimination unit
311 segmentation subunit
312 sequence combination subunit
313 embedding extraction subunit
314 vector output subunit
1100 input device
1101 first processor
1102 output device
1103 first memory
1104 communication bus
1200 processing component
1201 second processor
1202 second memory
1203 communication component
1204 power supply component
1205 multimedia component
1206 voice component
1207 input/output interface
1208 sensor component
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments only illustrate the basic idea of the present invention; they show only the components related to the present invention rather than the number, shape, and size of components in actual implementation, in which the type, quantity, and proportion of components may vary freely and the component layout may be more complicated.
Referring to fig. 1, which is a flowchart of a named entity identification method according to an embodiment of the present invention, the method includes:
step S1, acquiring a natural language-based dialog input by a user;
Here, the user inputs a natural-language dialog (human language understandable by a human or a machine) through the terminal; the dialog may take the form of voice, text, or pictures, but is not limited to these forms.
Step S2, preprocessing the corpus information in the dialog;
the corpus information (language material) in the dialog is normalized, for example, the corpus can be normalized through simple scaling, sample-by-sample mean subtraction, or feature normalization, so as to improve the precision of the corpus and facilitate subsequent feature extraction.
And step S3, recognizing the corpus information by using a pre-trained named entity model to obtain a corresponding named entity.
The named entity model is used to identify the named entities in the corpus information (entities with special meanings, including person names, place names, organization names, proper nouns, and the like). Specifically, taking the user behavior "air ticket booking" as an example, the corpus related to "air ticket booking" includes time, place, and the like; in the user statement "help me book a flight from Shenzhen to Beijing at ten o'clock tomorrow", the named entities include the time "ten o'clock tomorrow" and the places "Shenzhen" and "Beijing".
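To make the expected output concrete, the following hedged sketch shows the example utterance annotated with BIO labels and the entities collected from them; the tokenization, the label inventory (LOC, TIME), and the helper code are illustrative assumptions, not prescribed by the patent:

```python
# Illustrative BIO annotation of the example utterance.
tokens = ["help", "me", "book", "a", "flight", "from", "Shenzhen",
          "to", "Beijing", "at", "ten", "o'clock", "tomorrow"]
labels = ["O", "O", "O", "O", "O", "O", "B-LOC",
          "O", "B-LOC", "O", "B-TIME", "I-TIME", "I-TIME"]

# Collect labeled spans into (entity type, entity tokens) pairs.
entities, current = [], None
for token, label in zip(tokens, labels):
    if label.startswith("B-"):
        current = (label[2:], [token])
        entities.append(current)
    elif label.startswith("I-") and current is not None:
        current[1].append(token)
    else:
        current = None

print([(kind, " ".join(words)) for kind, words in entities])
# [('LOC', 'Shenzhen'), ('LOC', 'Beijing'), ('TIME', "ten o'clock tomorrow")]
```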
In this embodiment, a named entity recognition module is trained in advance; a natural-language dialog is acquired, the corpus information in the dialog is preprocessed, and the named entities in the corpus information are recognized by the named entity recognition module. Compared with traditional named entity recognition methods, on the one hand, the method does not depend on syntactic parse trees or rule matching; on the other hand, it reduces the requirement for scenario-specific training data, so less training data is needed. The method can be widely applied in fields such as voice assistants, intelligent customer service, smart speakers, and chat robots.
Referring to fig. 2, which is a flowchart of named entity model training in the named entity identification method according to an embodiment of the present invention, the details are as follows:
step S301, constructing an embedded vector of an input sequence;
Since the original input sequence in the training sample input by the user needs to be converted into a vectorized representation (see fig. 3 for details), the steps are as follows:
step S3011, dividing the original input sequence in the training sample into characters, words or a plurality of grammar units; i.e., re-slicing;
step S3012, inputting the sequence by adopting single or multiple granularity combinations according to the time sequence; that is, the segmented sequence is rearranged, wherein the granularity of the arranged time sequence may be a word, a plurality of grammar units or a combination thereof, and an input at one time point may be referred to as one input unit of the input sequence.
Step S3013, extracting embeddings of each unit of the input sequence based on any one or several of the dimensions of semantic embedding, glyph embedding, or pronunciation embedding;
The embedding types include semantic embedding, glyph embedding, and pronunciation embedding. For example, for the semantic embedding of each unit in the input sequence, the embedding vectors can either be trained from scratch or loaded directly from pre-trained embedding vectors. A deep convolutional neural network is adopted to extract the glyph embedding of each unit of the input sequence, i.e., in a pre-training manner: the glyph image of each character is input, a classification model is constructed to predict each image's corresponding ID in the font library, and a fully connected layer is used for classification so as to minimize the cross-entropy loss and determine the output. Any one of a recurrent neural network, a long short-term memory network, or a recursive neural network is adopted to extract the pronunciation embedding of each unit of the input sequence, likewise in a pre-training manner: the pronunciation of each character is input, a classification model is constructed to predict each pronunciation's corresponding ID in the character library, and a fully connected layer is used for classification so as to minimize the cross-entropy loss and determine the output.
Specifically, the embedding vector draws on the three aspects of semantics, glyph, and pronunciation; the three share and mutually reinforce semantic information, so that semantic information is fully utilized. On the one hand, this improves the accuracy and efficiency of the embedding-vector representation; on the other hand, it reduces the amount of training data required.
Word embedding extraction may be performed by table lookup; for example, the embedding of a word or N-gram (several grammar units) can be obtained by lookup in a table of word or N-gram embeddings, or can be generated by word embedding.
Step S3014, fusing the embedding vectors of the multiple embedding types to generate the embedding vector of the input sequence.
Specifically, if multiple embedding types are selected, the embeddings of the multiple types are fused to obtain the final embedding vector of the input sequence.
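A minimal sketch of the fusion in step S3014, assuming the per-unit embeddings of the three types have already been extracted; concatenation is used here as one plausible fusion operation, since the patent does not fix the specific operation:

```python
import torch

def fuse_embeddings(semantic: torch.Tensor, glyph: torch.Tensor,
                    pronunciation: torch.Tensor) -> torch.Tensor:
    """Fuse the per-unit embeddings of the selected embedding types by
    concatenation; summation or gating would also fit the description."""
    return torch.cat([semantic, glyph, pronunciation], dim=-1)

# One sequence of 13 input units with assumed per-type dimensions.
fused = fuse_embeddings(torch.randn(13, 128),   # semantic embedding
                        torch.randn(13, 64),    # glyph embedding
                        torch.randn(13, 32))    # pronunciation embedding
assert fused.shape == (13, 224)                 # final embedding vector
```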
With this approach, introducing multiple embedding types and embedding modes for the dialog content input by the user can reduce the quantity of training data required for a specific scenario, and the embedding-vector representation is formed at multiple granularities and over multiple scopes. On the one hand, easily obtained data can substitute for data that is hard to obtain in a specific scenario, reducing the demand for training data; on the other hand, this avoids the traditional pitfall of capturing only keywords, and instead accurately matches the embedding vectors to the user's real intention and domain. In addition, knowledge from other domains can be introduced through a pre-trained feature generation model, which also reduces the training data required.
Step S302, constructing a named entity feature generation model and generating features of an input sequence;
specifically, a pre-trained named entity feature generation model is adopted, training data are reduced, the named entity feature generation model is trained by adopting a two-way long and short memory network or a Transformer model, features with different granularities and dimensions required by a user in a fusion session can be generated, and meanwhile, the accuracy of feature extraction is improved subsequently.
Step S303, constructing a named entity model, and generating a predicted named entity sequence as an output named entity;
Specifically, the named entity model is trained using a maximum likelihood estimation algorithm on the basis of a conditional random field (CRF) model; the training criterion of the maximum likelihood estimation algorithm is as follows:
$$P(y \mid x) = \frac{\exp(\mathrm{Score}(x, y))}{\sum_{y'} \exp(\mathrm{Score}(x, y'))}$$
where x is the input sequence, y is the output sequence, Score(x, y) is the score when the input sequence is x and the output sequence is y, exp is the exponential function with base e, and the denominator sums over all candidate output sequences y'.
$$\mathrm{Score}(x, y) = \sum_{i} \Big( \Psi_{\mathrm{EMIT}}(y_i \to x_i) + \Psi_{\mathrm{TRANS}}(y_{i-1} \to y_i) \Big)$$
In the above formula, $\Psi_{\mathrm{EMIT}}(y_i \to x_i)$ denotes the potential that the label $y_i$ emits the corresponding input unit $x_i$, taken from the output features of the feature generation model; $\Psi_{\mathrm{TRANS}}(y_{i-1} \to y_i)$ is the potential of transitioning from the label $y_{i-1}$ at time $i-1$ to the label $y_i$ at time $i$, a training parameter of the CRF (conditional random field); and $i$ is summed over the length of the input sequence. The CRF is trained with the maximum likelihood estimation algorithm, and the loss function is computed to produce the output sequence.
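The score and the maximum-likelihood training loss above can be sketched for a single sequence as follows; the emission scores stand in for $\Psi_{\mathrm{EMIT}}$ (the output features of the feature generation model) and the transition matrix for $\Psi_{\mathrm{TRANS}}$. This is a minimal illustration of the training criterion, not the patent's implementation:

```python
import torch

def crf_negative_log_likelihood(emissions: torch.Tensor,
                                transitions: torch.Tensor,
                                tags: torch.Tensor) -> torch.Tensor:
    """CRF loss -log P(y|x) for one sequence, assuming:
    emissions: (seq_len, num_tags) per-unit scores (Psi_EMIT),
    transitions: (num_tags, num_tags) trainable matrix (Psi_TRANS),
    tags: (seq_len,) gold label indices."""
    seq_len, _ = emissions.shape

    # Score(x, y): emission plus transition potentials along the gold path.
    score = emissions[0, tags[0]]
    for i in range(1, seq_len):
        score = score + transitions[tags[i - 1], tags[i]] + emissions[i, tags[i]]

    # log of the sum over all candidate paths y', via the forward algorithm.
    alpha = emissions[0]
    for i in range(1, seq_len):
        alpha = torch.logsumexp(alpha.unsqueeze(1) + transitions, dim=0) + emissions[i]
    log_partition = torch.logsumexp(alpha, dim=0)

    return log_partition - score

emissions = torch.randn(13, 5, requires_grad=True)    # from the feature model
transitions = torch.randn(5, 5, requires_grad=True)   # CRF training parameter
tags = torch.randint(0, 5, (13,))
crf_negative_log_likelihood(emissions, transitions, tags).backward()
```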
In addition, the named entity model can also adopt minimized cross-entropy loss as the training criterion; the cross-entropy loss is specifically as follows:
$$\mathcal{L} = -\sum_{j=1}^{L} \sum_{c=1}^{M} y_{j,c} \log p_{j,c}$$
where y is the label, p is the predicted probability for each input unit, M is the number of task categories, and L is the sentence length; the loss function is computed per character or word until it converges to the minimum cross-entropy loss.
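A minimal sketch of this alternative per-unit cross-entropy criterion, with assumed shapes (L = 13 input units, M = 5 label categories):

```python
import torch
import torch.nn as nn

logits = torch.randn(13, 5)              # one prediction per input unit
gold = torch.randint(0, 5, (13,))        # gold label per input unit
loss = nn.functional.cross_entropy(logits, gold)  # minimized until convergence
```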
In this embodiment, the original input sequence is expressed with embedding vectors, the features of the input sequence are generated by the named entity feature generation model, and these features are input to the named entity model, which outputs the named entity sequence.
Referring to fig. 4, which is a block diagram of a named entity recognition apparatus according to an embodiment of the present invention, the apparatus includes:
the dialogue acquisition module 1 is used for acquiring a dialogue based on natural language input by a user;
the preprocessing module 2 is used for preprocessing the corpus information in the conversation;
The corpus in the dialog is normalized to obtain the preprocessed corpus information in the dialog.
and the named entity recognition module 3 recognizes the corpus information by utilizing a pre-trained named entity model to obtain a corresponding named entity.
In an embodiment, see fig. 5 for details, which is a block diagram of the named entity recognition module in the named entity recognition apparatus according to an embodiment of the present invention; the details are as follows:
an embedded vector generation unit 31, configured to convert the original input sequence in the training samples into an input sequence expressed in embedding vectors;
a named entity feature generation unit 32 configured to construct a named entity feature generation model that generates features of the input sequence;
a named entity discrimination unit 33, configured to construct a named entity model that generates the predicted named entity sequence.
Specifically, see fig. 6 for details, which is a block diagram of the embedded vector generation unit in human-machine dialog named entity recognition according to an embodiment of the present invention; the details are as follows:
a segmentation subunit 311, configured to segment the original input sequence in the training sample into characters, words, or a plurality of grammar units;
a sequence combination subunit 312, configured to compose the input sequence from single or multiple granularities in time order;
an embedding extraction sub-unit 313 for extracting each unit of the input sequence based on any one or several dimensions of semantic embedding, glyph embedding, or pronunciation embedding;
a vector output subunit 314, configured to fuse the embedded vectors of the multiple embedding types that generate the input sequence.
Specifically, a deep convolutional neural network is employed to extract glyph embedding for each unit of the input sequence.
Specifically, the pronunciation embedding of each unit of the input sequence is extracted using any one of a recurrent neural network, a long short-term memory network, or a recursive neural network.
Specifically, the named entity feature generation model is trained using a bidirectional long short-term memory network or a Transformer model.
Specifically, the named entity model is trained using a maximum likelihood estimation algorithm on the basis of a conditional random field model, thereby obtaining the named entity model that outputs the predicted named entity sequence.
In this embodiment, the named entity recognition apparatus and the named entity recognition method are in a one-to-one correspondence, and specific functions and technical effects can be obtained by referring to the above embodiments, which are not described herein again.
An embodiment of the present application further provides a device, which may include: one or more processors; and one or more machine-readable media having instructions stored thereon that, when executed by the one or more processors, cause the device to perform the method of fig. 1. In practical applications, the device may serve as a terminal device or as a server; examples of terminal devices include smart phones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptops, vehicle-mounted computers, desktop computers, set-top boxes, smart televisions, wearable devices, and the like.
This embodiment also provides a non-volatile readable storage medium, where one or more modules (programs) are stored in the storage medium; when the one or more modules are applied to a device, the device may be caused to execute the instructions of the steps included in the named entity identification method of fig. 1 according to this embodiment.
Fig. 7 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present application. As shown in fig. 7, the terminal device may include: an input device 1100, a first processor 1101, an output device 1102, a first memory 1103, and at least one communication bus 1104. The communication bus 1104 is used to implement communication connections between the elements. The first memory 1103 may include a high-speed RAM memory, and may also include a non-volatile storage NVM, such as at least one disk memory, and the first memory 1103 may store various programs for performing various processing functions and implementing the method steps of the present embodiment.
Alternatively, the first processor 1101 may be, for example, a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a controller, a microcontroller, a microprocessor, or other electronic components, and the first processor 1101 is coupled to the input device 1100 and the output device 1102 through a wired or wireless connection.
Optionally, the input device 1100 may include a variety of input devices, such as at least one of a user-oriented user interface, a device-oriented device interface, a software programmable interface, a camera, and a sensor. Optionally, the device interface facing the device may be a wired interface for data transmission between devices, or may be a hardware plug-in interface (e.g., a USB interface, a serial port, etc.) for data transmission between devices; optionally, the user-facing user interface may be, for example, a user-facing control key, a voice input device for receiving voice input, and a touch sensing device (e.g., a touch screen with a touch sensing function, a touch pad, etc.) for receiving user touch input; optionally, the programmable interface of the software may be, for example, an entry for a user to edit or modify a program, such as an input pin interface or an input interface of a chip; the output devices 1102 may include output devices such as a display, audio, and the like.
In this embodiment, the processor of the terminal device performs the functions of the modules of the named entity recognition apparatus in each device; for the specific functions and technical effects, refer to the foregoing embodiments, which are not repeated here.
Fig. 8 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present application; fig. 8 is a specific embodiment of fig. 7 in an implementation. As shown, the terminal device of this embodiment may include a second processor 1201 and a second memory 1202.
The second processor 1201 executes the computer program code stored in the second memory 1202 to implement the method described in fig. 4 in the above embodiment.
The second memory 1202 is configured to store various types of data to support operations at the terminal device. Examples of such data include instructions for any application or method operating on the terminal device, such as messages, pictures, videos, and so forth. The second memory 1202 may include a Random Access Memory (RAM) and may also include a non-volatile memory (non-volatile memory), such as at least one disk memory.
Optionally, the second processor 1201 is provided in the processing component 1200. The terminal device may further include: a communication component 1203, a power supply component 1204, a multimedia component 1205, a voice component 1206, an input/output interface 1207, and/or a sensor component 1208. The specific components included in the terminal device are set according to actual requirements, which is not limited in this embodiment.
The processing component 1200 generally controls the overall operation of the terminal device. The processing component 1200 may include one or more second processors 1201 to execute instructions to perform all or a portion of the steps of the named entity recognition method described above. Further, the processing component 1200 can include one or more modules that facilitate interaction between the processing component 1200 and other components. For example, the processing component 1200 can include a multimedia module to facilitate interaction between the multimedia component 1205 and the processing component 1200.
The power supply component 1204 provides power to the various components of the terminal device; it may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the terminal device.
The multimedia components 1205 include a display screen that provides an output interface between the terminal device and the user. In some embodiments, the display screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the display screen includes a touch panel, the display screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
The voice component 1206 is configured to output and/or input voice signals. For example, the voice component 1206 includes a microphone (MIC) configured to receive external voice signals when the terminal device is in an operational mode, such as a voice recognition mode. The received voice signal may further be stored in the second memory 1202 or transmitted via the communication component 1203. In some embodiments, the voice component 1206 further comprises a speaker for outputting voice signals.
The input/output interface 1207 provides an interface between the processing component 1200 and peripheral interface modules, which may be click wheels, buttons, etc. These buttons may include, but are not limited to: a volume button, a start button, and a lock button.
The sensor component 1208 includes one or more sensors for providing various aspects of status assessment for the terminal device. For example, the sensor component 1208 may detect the open/closed state of the terminal device, the relative positioning of components, and the presence or absence of user contact with the terminal device. The sensor component 1208 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact, including detecting the distance between the user and the terminal device. In some embodiments, the sensor component 1208 may also include a camera or the like.
The communication component 1203 is configured to facilitate communications between the terminal device and other devices in a wired or wireless manner. The terminal device may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one embodiment, the terminal device may include a SIM card slot therein for inserting a SIM card therein, so that the terminal device may log onto a GPRS network to establish communication with the server via the internet.
As can be seen from the above, the communication component 1203, the voice component 1206, the input/output interface 1207 and the sensor component 1208 involved in the embodiment of fig. 8 can be implemented as the input device in the embodiment of fig. 7.
The foregoing embodiments merely illustrate the principles and effects of the present invention and are not intended to limit the invention. Any person skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed by the present invention shall still be covered by the claims of the present invention.

Claims (16)

1. A named entity recognition method, comprising:
acquiring a natural language-based dialog input by a user;
preprocessing the corpus information in the conversation;
and identifying the corpus information by utilizing a pre-trained named entity model to obtain a corresponding named entity.
2. The named entity recognition method of claim 1, wherein the training process of the named entity model comprises:
generating an input sequence expressed by an embedded vector from an original input sequence corresponding to the corpus information in the training sample;
constructing a named entity feature generation model for generating input sequence features;
a named entity model is constructed that generates a predicted sequence of named entities.
3. The named entity recognition method of claim 2, wherein the step of generating the original input sequence as an embedded vector-expressed input sequence comprises:
dividing an original input sequence corresponding to the corpus information in the training sample into characters, words or a plurality of grammar units;
composing the input sequence from single or multiple granularities in time order;
extracting embeddings of each unit of the input sequence based on any one or more of the dimensions of semantic embedding, glyph embedding, or pronunciation embedding;
fusing multiple embedding types of embedding vectors that generate the input sequence.
4. The named entity recognition method of claim 3, wherein the glyph embedding of each unit of the input sequence is extracted using a deep convolutional neural network.
5. The named entity recognition method of claim 3, wherein the pronunciation embedding of each unit of the input sequence is extracted using any one of a recurrent neural network, a long short-term memory network, or a recursive neural network.
6. The named entity recognition method of claim 2, wherein the named entity feature generation model is trained using a bidirectional long short-term memory network or a Transformer model.
7. The method of claim 2, wherein the named entity model that generates the predicted sequence of named entities is obtained by training with a maximum likelihood estimation algorithm based on a conditional random field algorithm model.
8. A named entity recognition apparatus, comprising:
the dialogue acquisition module is used for acquiring dialogue which is input by a user and is based on natural language;
the preprocessing module is used for preprocessing the corpus information in the conversation;
and the named entity identification module is used for identifying the corpus information by utilizing a pre-trained named entity model to obtain a corresponding named entity.
9. The named entity recognition device of claim 8, wherein the named entity recognition module comprises:
the embedded vector generating unit is used for generating an original input sequence corresponding to the corpus information in the training sample into an input sequence expressed by an embedded vector;
the named entity feature generation unit is used for constructing a named entity feature generation model for generating input sequence features;
and the named entity distinguishing unit is used for constructing a named entity model for generating the predicted named entity sequence.
10. The named entity recognition device of claim 9, wherein the embedded vector generation unit comprises:
the segmentation subunit is used for segmenting an original input sequence corresponding to the corpus information in the training sample into characters, words or a plurality of grammar units;
a sequence combination subunit for composing the input sequence from single or multiple granularities in time order;
an embedding extraction subunit, configured to extract each unit of the input sequence based on any one or several dimensions of semantic embedding, glyph embedding, or pronunciation embedding;
and the vector output subunit is used for fusing the embedded vectors of the plurality of embedding types for generating the input sequence.
11. The named entity recognition device of claim 10, wherein the glyph embedding of each unit of the input sequence is extracted using a deep convolutional neural network.
12. The named entity recognition device of claim 10, wherein the pronunciation embedding of each unit of the input sequence is extracted using any one of a recurrent neural network, a long short-term memory network, or a recursive neural network.
13. The named entity recognition device of claim 9, wherein the named entity feature generation model is trained using a bidirectional long short-term memory network or a Transformer model.
14. The named entity recognition device of claim 9, wherein the named entity model that generates the predicted sequence of named entities is trained using a maximum likelihood estimation algorithm based on a conditional random field algorithm model.
15. An apparatus, comprising:
one or more processors;
and one or more machine readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform the method recited by one or more of claims 1-7.
16. One or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform the method recited by one or more of claims 1-7.

Priority Applications (1)

Application number: CN201911124011.XA
Priority date: 2019-11-15
Filing date: 2019-11-15
Title: Named entity identification method, device, equipment and medium


Publications (1)

Publication number: CN111222334A
Publication date: 2020-06-02

Family ID: 70807700





Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090326923A1 (en) * 2006-05-15 2009-12-31 Panasonic Corporatioin Method and apparatus for named entity recognition in natural language
CN107797992A (en) * 2017-11-10 2018-03-13 北京百分点信息科技有限公司 Name entity recognition method and device
CN110232192A (en) * 2019-06-19 2019-09-13 中国电力科学研究院有限公司 Electric power term names entity recognition method and device
CN110334357A (en) * 2019-07-18 2019-10-15 北京香侬慧语科技有限责任公司 A kind of method, apparatus, storage medium and electronic equipment for naming Entity recognition

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738008A (en) * 2020-07-20 2020-10-02 平安国际智慧城市科技股份有限公司 Entity identification method, device and equipment based on multilayer model and storage medium
CN111738008B (en) * 2020-07-20 2021-04-27 深圳赛安特技术服务有限公司 Entity identification method, device and equipment based on multilayer model and storage medium


Legal Events

PB01: Publication
CB02: Change of applicant information
Address after: 511458 room 1011, No.26, Jinlong Road, Nansha District, Guangzhou City, Guangdong Province (only for office use)
Applicant after: Guangzhou yunconghonghuang Intelligent Technology Co., Ltd
Address before: 511458 room 1011, No.26, Jinlong Road, Nansha District, Guangzhou City, Guangdong Province (only for office use)
Applicant before: GUANGZHOU HONGHUANG INTELLIGENT TECHNOLOGY Co.,Ltd.
SE01: Entry into force of request for substantive examination
RJ01: Rejection of invention patent application after publication (application publication date: 20200602)