US11681872B2 - Language sequence labeling method and apparatus, storage medium, and computing device - Google Patents


Info

Publication number
US11681872B2
Authority
US
United States
Prior art keywords
representation
language sequence
word
embedding representation
hidden
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US17/355,120
Other languages
English (en)
Other versions
US20210319181A1 (en)
Inventor
Fandong Meng
Yijin Liu
Jinchao Zhang
Jie Zhou
Jinan Xu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Assigned to TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED reassignment TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: LIU, YIJIN, MENG, FANDONG, XU, JINAN, ZHANG, JINCHAO, ZHOU, JIE
Publication of US20210319181A1
Application granted granted Critical
Publication of US11681872B2
Active legal-status Critical Current
Adjusted expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/151Transformation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/103Formatting, i.e. changing of presentation of documents
    • G06F40/117Tagging; Marking up; Designating a block; Setting of attributes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/126Character encoding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/58Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • Embodiments of the present disclosure relate to the field of data processing technologies, and specifically, to a language sequence labeling method and apparatus, a storage medium, and a computing device.
  • Natural language processing (NLP) technology generally includes technologies such as text processing, semantic understanding, machine translation, robot question answering, and knowledge graphs.
  • Sequence labeling is basic work of NLP, and also a challenging issue in NLP.
  • Sequence labeling mainly includes part-of-speech labeling, named entity recognition, and the like.
  • A main task of named entity recognition is to recognize, in text, proper nouns such as person names, place names, and organization names, and meaningful phrases such as times and dates.
  • A sequence labeling task is an important part of information extraction, and its effect has a great impact on machine translation, intelligent dialog systems, and the like.
  • Main models for sequence labeling include common machine learning models and neural network models.
  • A neural network model can achieve a better effect on a sequence labeling task with the assistance of a small number of artificial features.
  • Embodiments of the present disclosure provide a language sequence labeling method and apparatus, a storage medium, and a computing device.
  • A language sequence labeling method is provided, performed by a computing device, the method including: reading a first embedding representation of a language sequence, the first embedding representation including a character-level word embedding representation, a pre-trained word embedding representation, and a global word embedding representation of the language sequence, the global word embedding representation referring to a global context representation of the language sequence; performing first depth transformation (DT) encoding on the first embedding representation based on a first DT recurrent neural network (RNN), to output a first hidden-layer state representation corresponding to each word in the language sequence; and decoding the first hidden-layer state representations of the language sequence, to obtain a labeling result of one or more elements in the language sequence.
  • DT: depth transformation; RNN: recurrent neural network.
  • A language sequence labeling apparatus is provided, including: a sequence labeling encoder, including: a first reading module, configured to read a first embedding representation of a language sequence, the first embedding representation including a character-level word embedding representation, a pre-trained word embedding representation, and a global word embedding representation of the language sequence, the global word embedding representation referring to a global context representation of the language sequence; a first DT module, configured to perform first DT encoding on the first embedding representation based on a first DT RNN, to output a first hidden-layer state representation corresponding to each word in the language sequence; and a sequence labeling decoder, configured to decode the first hidden-layer state representations of the language sequence, to obtain a labeling result of one or more elements in the language sequence.
  • A non-transitory computer-readable storage medium is provided, storing computer program instructions, the computer program instructions, when executed by a processor, causing the processor to perform: reading a first embedding representation of a language sequence, the first embedding representation including a character-level word embedding representation, a pre-trained word embedding representation, and a global word embedding representation of the language sequence, the global word embedding representation referring to a global context representation of the language sequence; performing first depth transformation (DT) encoding on the first embedding representation based on a first DT recurrent neural network (RNN), to output a first hidden-layer state representation corresponding to each word in the language sequence; and decoding the first hidden-layer state representations of the language sequence, to obtain a labeling result of one or more elements in the language sequence.
  • A computing device is provided, including a processor and a memory storing a computer program, the computer program being configured to, when executed on the processor, cause the processor to perform: reading a first embedding representation of a language sequence, the first embedding representation including a character-level word embedding representation, a pre-trained word embedding representation, and a global word embedding representation of the language sequence, the global word embedding representation referring to a global context representation of the language sequence; performing first depth transformation (DT) encoding on the first embedding representation based on a first DT recurrent neural network (RNN), to output a first hidden-layer state representation corresponding to each word in the language sequence; and decoding the first hidden-layer state representations of the language sequence, to obtain a labeling result of one or more elements in the language sequence.
  • FIG. 1 A is a schematic structural diagram of an implementation environment involved in an embodiment of the present disclosure.
  • FIG. 1 B is a schematic diagram of an application scenario in which language sequence labeling is used as an underlying technology according to an embodiment of the present disclosure.
  • FIG. 2 is a schematic diagram of an application scenario in which language sequence labeling is used as an underlying technology according to another embodiment of the present disclosure.
  • FIG. 3 is a schematic diagram of an encoder-decoder architecture for sequence labeling according to an embodiment of the present disclosure.
  • FIG. 4 is a diagram of an architecture for sequence labeling that is based on DT and on which global information enhancement has been performed according to an embodiment of the present disclosure.
  • FIG. 5 is a flowchart of a language sequence labeling method according to an embodiment of the present disclosure.
  • FIG. 6 is a flowchart of a language sequence labeling method according to another embodiment of the present disclosure.
  • FIG. 7 is a schematic diagram of a language sequence labeling apparatus according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic diagram of a language sequence labeling apparatus according to another embodiment of the present disclosure.
  • FIG. 9 shows an exemplary system including an exemplary computing device that represents one or more systems and/or devices that can implement various technologies described herein.
  • In embodiments of the present disclosure, language sequence labeling may be used as an underlying processing application, and may be used for resolving problems such as Chinese word segmentation, part-of-speech labeling, and named entity recognition.
  • A language sequence labeling task is an important part of information extraction, and may be specifically applied to machine translation, intelligent dialog systems, and the like.
  • Named entity recognition is an important basic tool in application fields such as information extraction, question answering systems, syntax analysis, and machine translation.
  • FIG. 1 A is a schematic structural diagram of an implementation environment involved in an embodiment of the present disclosure.
  • A language sequence labeling system 100 includes a server 110, a network 120, a terminal device 130, and a user 140.
  • The server 110 includes a processor and a memory.
  • A method embodiment in the present disclosure is performed by the processor executing instructions stored in the memory.
  • The server 110 includes a language sequence labeling apparatus 111 and a training database 112.
  • A client 130-1 is installed on the terminal device 130.
  • The client 130-1, as an application program for receiving a language sequence, can receive voice or text entered by the user and obtain a to-be-labeled language sequence from the voice or text. The terminal device 130 then sends the to-be-labeled language sequence to the server 110, and the language sequence labeling apparatus 111 in the server 110 labels and analyzes the language sequence.
  • The training database 112 stores a pre-trained word vector table.
  • The language sequence labeling apparatus 111 obtains pre-trained word vectors from the training database 112 for forming a first embedding representation and a second embedding representation.
  • The language sequence labeling apparatus 111 obtains a labeling result of the language sequence by constructing a DT RNN, understands an intention of the user according to the labeling result, and determines a message to be returned to the terminal device 130, so as to display the message on the client 130-1 and achieve human-machine interaction.
  • The labeling result may be a labeled part of speech for each word in the language sequence, a recognized/labeled named entity in the language sequence, a translated sequence in another language, and the like.
  • The server 110 may be one server, a server cluster that includes a plurality of servers, or a cloud computing service center.
  • The network 120 may connect the server 110 and the terminal device 130 in a wireless or wired manner.
  • The terminal device 130 may be an intelligent terminal, including a smartphone, a tablet computer, a laptop computer, or the like.
  • FIG. 1 B schematically shows an application scenario of an intelligent dialog system to which one embodiment of the present disclosure is applied as an underlying technology, and specifically shows a human-machine interaction interface 200 therein.
  • The user may enter a language sequence including voice or text, as shown by 210 and 220.
  • A machine can understand the intention of the user, for example, what the user's question is, by performing, at a backend, sequence labeling analysis on a language sequence such as 210 "Where is the washroom" entered by the user.
  • Corresponding replies provided by the machine to the question of the user are shown by 230 and 240.
  • In an example of named entity recognition, entities with specific meanings in text, including person names, place names, organization names, proper nouns, and the like, are recognized.
  • Labels are added to the named entities, and a result is outputted: Einstein [person name], Germany [place name].
  • FIG. 2 schematically shows an application scenario of machine translation to which one embodiment of the present disclosure is applied as an underlying technology, and specifically shows a human-machine interaction interface 300 therein.
  • A plurality of technology providers may be provided, for example, a translator 321 and a Tencent AIlab 322, to provide the user with various backend services such as text translation 311, voice translation 312, and image translation 313.
  • In text translation 311, a language sequence entered in the box 331 on the left can be translated into a language sequence in the box 332 on the right.
  • For example, a Chinese sentence is entered in the box 331 on the left.
  • Sequence labeling processing in one embodiment of the present disclosure can be used as an underlying application to analyze and process the entered language at a backend.
  • In one embodiment of the present disclosure, the core architecture is an encoder-decoder solution.
  • An encoder processes variable-length input and establishes a fixed-length vector representation.
  • A decoder generates a variable-length (target) sequence based on the encoded vector representation.
  • FIG. 3 is a schematic diagram of an encoder-decoder architecture for sequence labeling. As shown in FIG. 3, part-of-speech analysis in sequence labeling is used as an example.
  • An encoded vector representation outputted by the encoder is expressed as [z_1, z_2, . . . , z_d].
  • FIG. 4 is a diagram of an architecture for sequence labeling that is based on DT and on which global information enhancement has been performed according to an embodiment of the present disclosure, and the architecture may be applied to a computing device, for example, the server 110 in FIG. 1 A .
  • In FIG. 4, a global information encoder 401, a sequence labeling encoder 402, and a sequence labeling decoder 403 are included.
  • An example in which bidirectional DT processing is performed on the language sequence is used. The bidirectional DT processing is performed on the language sequence in forward and reverse orders: x_1, x_2, . . . , x_n and x_n, x_{n-1}, . . . , x_1, respectively.
  • DT refers to increasing processing depths between a plurality of adjacent time steps by using a multi-level non-linear recurrent unit in a neural network structure.
  • In FIG. 4, each square represents a DT recurrent neuron, briefly referred to as a DT unit.
  • Identifiers are provided as follows: a square with a rightward shadow identifies a forward (left-to-right) DT unit; a square with a leftward shadow identifies a backward (right-to-left) DT unit; a square with no shadow identifies a unidirectional DT unit; and a rounded square identifies a word embedding.
  • The global information encoder 401 reads a second embedding representation of the language sequence x_1, x_2, . . . , x_n, including character-level word embedding representations c_1, c_2, . . . , c_n and pre-trained word embedding representations w_1, w_2, . . . , w_n.
  • The character-level word embedding representation c_n and the pre-trained word embedding representation w_n each correspond to a subword x_n in the inputted language sequence.
  • The character-level word embedding representation c_n is a word vector learned at the character level, obtained by performing convolution processing on the subword x_n at the character level.
  • The pre-trained word embedding representation w_n is a word vector obtained for the subword x_n by looking up a pre-trained and stored word vector table.
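The two embeddings above can be sketched as follows in numpy. This is an illustrative reconstruction, not the patented implementation: the dimensions, the toy character vocabulary, the random weight tables, and the max-pooling after the character convolution are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the patent does not fix these dimensions.
CHAR_DIM, WORD_DIM, N_FILTERS, KERNEL = 8, 16, 12, 3

char_vocab = {ch: i for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz")}
char_table = rng.standard_normal((len(char_vocab), CHAR_DIM))

# Stand-in for a pre-trained and stored word vector table.
word_table = {"where": rng.standard_normal(WORD_DIM),
              "is": rng.standard_normal(WORD_DIM)}

conv_w = rng.standard_normal((N_FILTERS, KERNEL * CHAR_DIM))

def char_level_embedding(word: str) -> np.ndarray:
    """Convolve over the word's character vectors, then max-pool."""
    chars = np.stack([char_table[char_vocab[c]] for c in word])
    windows = [chars[i:i + KERNEL].ravel()
               for i in range(len(chars) - KERNEL + 1)]
    feats = np.stack([conv_w @ w for w in windows])  # (n_windows, N_FILTERS)
    return feats.max(axis=0)                         # pool over positions

c = char_level_embedding("where")          # character-level word embedding c_n
w = word_table["where"]                    # pre-trained word embedding w_n
second_embedding = np.concatenate([c, w])  # input read by the encoder
print(second_embedding.shape)              # (28,)
```

In practice the character convolution weights and word vector table would be learned or pre-trained rather than randomly initialized.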
  • The DT is performed by constructing a DT RNN.
  • The DT RNN includes gated recurrent units (GRUs) improved through linear transformation.
  • The GRU is a variant of the long short-term memory (LSTM) network.
  • The LSTM is a temporal RNN suitable for processing and predicting events with relatively long intervals and delays in a time series.
  • An RNN is a type of recursive neural network in which sequence data is used as input, recursion is performed in the sequence evolution direction, and all nodes (recurrent units) are connected in a chain.
  • The GRU maintains the effect of the LSTM with a simpler structure, which helps resolve the vanishing gradient problem in RNNs.
  • A DT unit 4011 in the global information encoder 401 represents a second DT RNN, including one layer of linear transformation enhanced gated recurrent units (L-GRUs) and one layer of transition gated recurrent units (T-GRUs).
  • L-GRUs are adopted at the bottom layer, and T-GRUs are adopted at the upper layer.
  • As can be understood by a person skilled in the art, other numbers of layers of T-GRUs, typically two to three, may alternatively be adopted.
  • The GRU includes an input layer, a hidden layer, and an output layer.
  • In the GRU equations: ⊙ is an element-wise product; W is a to-be-learned network parameter; x_t is the input encoding vector at moment t; r_t is a reset gate; z_t is an update gate; and σ is a sigmoid weight coefficient, so that the values of r_t and z_t are within [0, 1].
  • The T-GRU is a type of GRU; because it does not appear in the first layer of the DT RNN, the T-GRU does not have an input encoding vector x_t like the GRU.
  • W is a to-be-learned network parameter, and l_t is a linear gate. The candidate activation of the L-GRU augments the standard GRU candidate with a linearly transformed input:
  • h̃_t = tanh(W_xh x_t + r_t ⊙ (W_hh h_{t-1})) + l_t ⊙ (W_x x_t)  (10)
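The L-GRU and T-GRU steps described above can be sketched as follows. This is a minimal numpy reconstruction under the standard GRU gating equations, with the L-GRU candidate following Eq. (10); the weight initialization, dimensions, and toy input are assumptions for illustration, not the patented implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LGRUCell:
    """Linear transformation enhanced GRU: a GRU whose candidate state
    additionally receives a linearly transformed input, scaled by a
    linear gate l_t (cf. Eq. 10)."""
    def __init__(self, d_in, d_h, rng):
        g = lambda *s: rng.standard_normal(s) * 0.1
        self.Wxz, self.Whz = g(d_h, d_in), g(d_h, d_h)  # update gate
        self.Wxr, self.Whr = g(d_h, d_in), g(d_h, d_h)  # reset gate
        self.Wxh, self.Whh = g(d_h, d_in), g(d_h, d_h)  # candidate
        self.Wxl, self.Whl = g(d_h, d_in), g(d_h, d_h)  # linear gate
        self.Wx = g(d_h, d_in)                          # linear input term

    def step(self, x_t, h_prev):
        z = sigmoid(self.Wxz @ x_t + self.Whz @ h_prev)
        r = sigmoid(self.Wxr @ x_t + self.Whr @ h_prev)
        l = sigmoid(self.Wxl @ x_t + self.Whl @ h_prev)
        h_cand = (np.tanh(self.Wxh @ x_t + r * (self.Whh @ h_prev))
                  + l * (self.Wx @ x_t))
        return (1.0 - z) * h_prev + z * h_cand

class TGRUCell:
    """Transition GRU: no input x_t; gates depend on h_{t-1} only."""
    def __init__(self, d_h, rng):
        g = lambda *s: rng.standard_normal(s) * 0.1
        self.Whz, self.Whr, self.Whh = g(d_h, d_h), g(d_h, d_h), g(d_h, d_h)

    def step(self, h_prev):
        z = sigmoid(self.Whz @ h_prev)
        r = sigmoid(self.Whr @ h_prev)
        h_cand = np.tanh(r * (self.Whh @ h_prev))
        return (1.0 - z) * h_prev + z * h_cand

# One deep transition per time step: an L-GRU at the bottom layer,
# then T-GRUs deepening the transition before the next time step.
rng = np.random.default_rng(1)
lgru, tgrus = LGRUCell(4, 8, rng), [TGRUCell(8, rng), TGRUCell(8, rng)]
h = np.zeros(8)
for x_t in np.eye(4):                 # toy 4-step input sequence
    h = lgru.step(x_t, h)
    for t_cell in tgrus:
        h = t_cell.step(h)
print(h.shape)  # (8,)
```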
  • The second DT encoding is performed in the global information encoder 401 in a bidirectional manner. Therefore, a concatenating unit 4012 concatenates the results obtained after forward and reverse DT processing of the same subword x_n, and an information aggregation processing unit 4013 reduces dimensions, to obtain a global word embedding vector g.
  • The information aggregation processing may include average pooling, maximum pooling, or an attention mechanism.
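The three aggregation options can be sketched over toy bidirectional states as follows; the state dimensions and the additive attention weight vector u are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n_words, d = 5, 6
fwd = rng.standard_normal((n_words, d))      # forward DT states
bwd = rng.standard_normal((n_words, d))      # backward DT states
states = np.concatenate([fwd, bwd], axis=1)  # (n_words, 2d) per-word concat

g_avg = states.mean(axis=0)                  # average pooling
g_max = states.max(axis=0)                   # maximum pooling

# Simple attention over positions (weight vector u is an assumption).
u = rng.standard_normal(2 * d)
scores = states @ u
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                         # softmax over words
g_att = alpha @ states                       # attention-weighted global vector
```

Any of `g_avg`, `g_max`, or `g_att` could then serve as the global word embedding vector g.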
  • The sequence labeling encoder 402 enhances the language sequence embedding representation by using the global word embedding vector g outputted by the global information encoder 401.
  • The sequence labeling encoder 402 reads a first embedding representation of the language sequence, including character-level word embedding representations c_1, c_2, . . . , c_n shown by 4021, pre-trained word embedding representations w_1, w_2, . . . , w_n shown by 4022, and the global word embedding representation g shown by 4023.
  • The character-level word embedding representation c_n and the pre-trained word embedding representation w_n each correspond to the subword x_n in the inputted language sequence.
  • c_n, w_n, and the global word embedding vector g corresponding to the subword x_n are concatenated to form the first embedding representation of the language sequence.
  • The character-level word embedding representation c_t is obtained according to a convolutional neural network (CNN).
  • The pre-trained word embedding representation w_t is obtained by looking up a lookup table.
  • The global word embedding representation g is a global context representation obtained through pre-encoding computation for the language sequence, that is, extracted from the bidirectional second DT RNN by the global information encoder 401.
  • The sequence labeling encoder 402 then performs first DT encoding on the first embedding representation of the read language sequence based on the first DT RNN.
  • The first DT encoding is performed in a bidirectional manner.
  • L-GRUs are adopted at the bottom layer, and T-GRUs are adopted at the remaining layers.
  • The number of layers of T-GRUs adopted is usually two to five. As can be understood by a person skilled in the art, other numbers of layers of T-GRUs may alternatively be adopted.
  • In FIG. 4, the sequence labeling encoder 402 adopts one layer of L-GRUs.
  • A concatenating unit 4025 concatenates the results obtained after forward and reverse DT processing of the same subword x_n, to obtain a first hidden-layer state representation h_t corresponding to each word.
  • The sequence labeling decoder 403 reads, at each moment t, from the sequence labeling encoder 402, the first hidden-layer state representation h_t corresponding to the current word, and performs decoding based on the label information y_{t-1} at the previous moment. Specifically, the following steps are included:
  • First, perform, for each word, DT on the first hidden-layer state representation of the word based on a third DT RNN, to obtain a second hidden-layer state representation s_t corresponding to the word.
  • The sequence labeling decoder 403 adopts a unidirectional structure and performs unidirectional DT.
  • The structure of the recurrent neuron DT of the sequence labeling decoder 403 is similar to those in the global information encoder 401 and the sequence labeling encoder 402: L-GRUs are in the first layer (referring to 4031 in FIG. 4), and T-GRUs are in the remaining layers.
  • FIG. 5 schematically shows a flowchart of a language sequence labeling method according to an embodiment of the present disclosure.
  • The method is performed by a computing device such as the server 110 in FIG. 1 A .
  • The method specifically includes the following steps:
  • Step 501: Read a first embedding representation of a language sequence, the first embedding representation including a character-level word embedding representation, a pre-trained word embedding representation, and a global word embedding representation of the language sequence, the global word embedding representation referring to a global context representation of the language sequence.
  • The first embedding representation includes a character-level word embedding representation c_n, a pre-trained word embedding representation w_n, and a global word embedding representation g of the language sequence.
  • The character-level word embedding representation c_n is a word vector learned at the character level, obtained by performing convolution processing on a subword x_n at the character level.
  • The pre-trained word embedding representation w_n is obtained for the subword x_n by looking up a pre-trained and stored word vector table.
  • The global word embedding representation g is a global context representation obtained through pre-encoding computation for the language sequence.
  • Step 502: Perform first DT encoding on the first embedding representation based on a first DT RNN, to output a first hidden-layer state representation corresponding to each word in the language sequence.
  • The first DT encoding includes performing the encoding on the first embedding representation in forward and reverse directions respectively, that is, bidirectional DT performed forward from left to right and reversely from right to left.
  • For bidirectional DT, the DT encoding results obtained in the forward and reverse directions need to be concatenated.
  • The DT encoding results obtained in the forward and reverse directions are concatenated, to obtain the first hidden-layer state representation corresponding to each word.
  • A DT unit at the bottom layer is an L-GRU, and DT units at the remaining layers are T-GRUs.
  • The number of layers of T-GRUs adopted is usually two to five. As can be understood by a person skilled in the art, other numbers of layers of T-GRUs may alternatively be adopted.
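The bidirectional pass and per-word concatenation of step 502 can be sketched as follows, with a generic recurrent transition standing in for the L-GRU/T-GRU stack; all weights, sizes, and the toy input are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
d_in, d_h, n = 6, 5, 4
embeddings = rng.standard_normal((n, d_in))  # first embedding of each word
Wx = rng.standard_normal((d_h, d_in))
Wh = rng.standard_normal((d_h, d_h))

def step(x, h):
    # Generic recurrent transition; stands in for the deep L-GRU/T-GRU step.
    return np.tanh(Wx @ x + Wh @ h)

def run(seq):
    h, out = np.zeros(d_h), []
    for x in seq:
        h = step(x, h)
        out.append(h)
    return out

fwd = run(embeddings)                   # left-to-right pass
bwd = run(embeddings[::-1])[::-1]       # right-to-left pass, realigned
h_states = [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]
print(h_states[0].shape)  # (10,)
```

Each `h_states[t]` plays the role of the first hidden-layer state representation h_t for word t.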
  • Step 503: Decode the first hidden-layer state representations, to obtain a labeling result of the language sequence.
  • The first hidden-layer state representations of all words of the language sequence are decoded, to obtain the labeling result of one or more elements of the language sequence.
  • The one or more elements may be words and/or phrases in the language sequence.
  • Each word and/or phrase in the language sequence may have a labeling result.
  • Alternatively, only words and/or phrases belonging to one or more certain categories are labeled.
  • In the embodiments of the present disclosure, a sequence labeling method based on a DT architecture and enhanced with global information is provided, so that the transformation process between adjacent states of an RNN can be deepened. Meanwhile, the local information of each word is enhanced by the global information encoder, yielding a more comprehensive feature representation and thereby improving prediction accuracy.
  • FIG. 6 schematically shows a flowchart of a language sequence labeling method according to another embodiment of the present disclosure.
  • The method is performed by a computing device such as the server 110 in FIG. 1 A . Based on the procedure of the method in FIG. 5, FIG. 6 specifically includes the following steps.
  • Step 601: Construct a DT RNN by using an L-GRU and a T-GRU.
  • The constructed DT RNN includes a plurality of layers of GRUs, where the numbers of layers of L-GRUs and T-GRUs used are configurable.
  • The first DT RNN used in the sequence labeling encoder 402 may include one layer of L-GRUs and at least two layers of T-GRUs.
  • The second DT RNN used in the global information encoder 401 may include one layer of L-GRUs and one layer of T-GRUs.
  • The layer of L-GRUs is located at the bottom layer of the network, and the layer of T-GRUs is located above it.
  • Step 602: Read a second embedding representation of the language sequence, the second embedding representation including a character-level word embedding representation and a pre-trained word embedding representation.
  • The character-level word embedding representation c_n and the pre-trained word embedding representation w_n each correspond to a subword x_n in the inputted language sequence.
  • The character-level word embedding representation c_n is a word vector learned at the character level, obtained by performing convolution processing on the subword x_n at the character level.
  • The pre-trained word embedding representation is a word vector obtained for the subword x_n by looking up a pre-trained and stored word vector table.
  • Step 603: Perform second DT encoding on the second embedding representation based on a second DT RNN, to obtain the global word embedding representation.
  • The second DT encoding includes performing DT encoding on the read second embedding representation in a forward direction (from left to right) and a reverse direction (from right to left).
  • The results of the DT encoding in the forward and reverse directions are concatenated, and then information aggregation is performed.
  • The information aggregation may include maximum pooling or average pooling.
  • The second DT encoding is performed through the second DT RNN including the L-GRU and the T-GRU.
  • The L-GRU is located at the first layer of recurrent units in the second DT RNN, and the T-GRU is located at another layer of recurrent units in the DT RNN.
  • In one embodiment, the number of layers of T-GRUs in the second DT RNN is 1.
  • As can be understood by a person skilled in the art, other numbers of layers of T-GRUs, such as two to three, may exist in the second DT RNN.
  • After steps 501 and 502 are performed, steps 604 and 605 are performed.
  • Step 604 Perform, for the each word, DT on the first hidden-layer state representation of the word based on a third DT RNN, to obtain a second hidden-layer state representation.
  • the DT being performed may be unidirectional DT.
  • a unidirectional DT unit is included in 4031 .
  • Step 605 Obtain a labeling result of the language sequence based on the second hidden-layer state representation.
  • a plurality of labels are preset, and linear transformation is performed on the second hidden-layer state representation and label information at a previous moment, to obtain a probability that the word belongs to each label.
  • a probability that each word belongs to each label in the label set Y can be obtained and used as a label prediction result of the word, that is, the labeling result of the language sequence is obtained.
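Steps 604 and 605 can be sketched as a greedy decoding loop; the label set, the embedding of the previous label, and the single weight matrix `W` are illustrative assumptions:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def decode_labels(states, labels, label_emb, W):
    """For each word, combine the second hidden-layer state with the label
    information at the previous moment, apply a linear transformation, and
    take a softmax over the label set Y (greedy decoding sketch)."""
    prev = np.zeros(label_emb.shape[1])        # no label before the first word
    out = []
    for h in states:
        logits = np.concatenate([h, prev]) @ W # linear transformation
        probs = softmax(logits)                # probability for each label
        k = int(probs.argmax())
        out.append(labels[k])
        prev = label_emb[k]                    # feed the label info forward
    return out

rng = np.random.default_rng(0)
labels = ["B-PER", "I-PER", "O"]               # hypothetical label set Y
d, dl = 4, 3
label_emb = rng.normal(size=(len(labels), dl))
W = rng.normal(size=(d + dl, len(labels)))
states = rng.normal(size=(5, d))               # second hidden-layer states
print(decode_labels(states, labels, label_emb, W))
```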
  • FIG. 7 is a schematic diagram of a language sequence labeling apparatus 700 according to an embodiment of the present disclosure.
  • the apparatus 700 may be applied to a computing device such as the server 110 in FIG. 1 A .
  • the language sequence labeling apparatus 700 includes a sequence labeling encoder 701 and a sequence labeling decoder 702 .
  • the sequence labeling encoder 701 includes a first reading module 7011 and a first DT module 7012 .
  • the first reading module 7011 is configured to read a first embedding representation of a language sequence, the first embedding representation including a character-level word embedding representation, a pre-trained word embedding representation, and a global word embedding representation of the language sequence, the global word embedding representation referring to a global context representation of the language sequence.
  • the first DT module 7012 is configured to perform first DT encoding on the first embedding representation based on a first DT RNN, to output a first hidden-layer state representation corresponding to each word in the language sequence.
  • the sequence labeling decoder 702 is configured to decode the first hidden-layer state representation, to obtain a labeling result of the language sequence.
  • FIG. 8 is a schematic diagram of a language sequence labeling apparatus 800 according to another embodiment of the present disclosure.
  • the apparatus 800 may be applied to a computing device such as the server 110 in FIG. 1 A .
  • the language sequence labeling apparatus 800 further includes a global information encoder 703 .
  • the global information encoder 703 is configured to obtain the global word embedding representation, and includes:
  • a second reading module 7031 configured to read a second embedding representation of the language sequence, the second embedding representation including the character-level word embedding representation and the pre-trained word embedding representation;
  • a second DT module 7032 configured to perform second DT encoding on the second embedding representation based on a second DT RNN, to obtain the global word embedding representation.
  • the global information encoder 703 further includes:
  • an information aggregation module 7033 configured to perform information aggregation on a result obtained after the second DT encoding, to obtain the global word embedding representation.
  • the global information encoder 703 may perform bidirectional DT encoding, that is, perform the DT encoding from left to right and from right to left. The two directions of the bidirectional DT encoding differ only in the direction in which the sequence is input.
  • the first DT module 7012 is configured to perform the first DT encoding on the first embedding representation in forward and reverse directions respectively; and concatenate DT encoding results obtained in the forward and reverse directions, to obtain the first hidden-layer state representation corresponding to the each word.
  • the apparatus 800 further includes:
  • a construction module 704 configured to construct a DT RNN by using a L-GRU and a T-GRU.
  • the first DT RNN includes one layer of L-GRUs and at least two layers of T-GRUs.
  • sequence labeling decoder 702 includes:
  • a third DT module 7022 configured to perform, for the each word, DT on the first hidden-layer state representation of the word based on a third DT RNN, to obtain a second hidden-layer state representation
  • a labeling module 7023 configured to obtain the labeling result of the language sequence based on the second hidden-layer state representation.
  • sequence labeling decoder 702 further includes:
  • a setting module 7021 configured to preset a plurality of labels.
  • the labeling module 7023 is configured to perform linear transformation on the second hidden-layer state representation and label information at a previous moment, to obtain a probability that the word belongs to each label.
  • the first DT module 7012 and the second DT module 7032 perform bidirectional DT. However, the third DT module 7022 performs unidirectional DT.
  • the term "unit" in this disclosure may refer to a software unit, a hardware unit, or a combination thereof.
  • a software unit (e.g., a computer program) may be developed using a computer programming language; a hardware unit may be implemented using processing circuitry and/or memory.
  • each unit can be implemented using one or more processors (or processors and memory); likewise, a processor (or processors and memory) can be used to implement one or more units.
  • each unit can be part of an overall unit that includes the functionalities of the unit.
  • a state corresponding to an i-th word is h i (the hidden-layer state h i L output by the topmost layer L for the word).
  • the second DT module 7032 performs average pooling on encoding representations of all the words, to obtain a final global representation
  • the sequence labeling solution in the embodiments of the present disclosure exhibits a better labeling effect, and can more accurately identify a named entity, a syntax block, a part-of-speech, and other information in a sentence, thereby optimizing an existing relevant application system such as a micro dialog system.
  • F1 is the harmonic mean of precision and recall.
  • the labeling indicator F1 of the sequence labeling is used as an example. Actual tests show that the value of F1 is increased in a plurality of aspects by the solution of the embodiments of the present disclosure.
  • Table 1 schematically shows comparisons of F1 performance of various solutions in named entity recognition and syntax block recognition.
  • for the named entity recognition in the sequence labeling, the value of F1 is increased by 0.32 over the 91.64 of the related art, and for the syntax block recognition in the sequence labeling, the value of F1 is increased by 0.14 over the 95.29 of the related art.
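For reference, the F1 values in Tables 1 to 3 follow the standard definition, which can be computed as:

```python
def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# e.g. 92 correct entities out of 100 predicted, with 100 gold entities:
print(round(f1_score(tp=92, fp=8, fn=8), 4))   # precision = recall = 0.92 -> F1 = 0.92
```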
  • Table 2 shows performance comparisons with the stacked RNN.
  • although the stacked RNN can form an extremely deep structure, the transformation depth between successive hidden-layer states at the word level is still relatively shallow.
  • the hidden-layer state along an axis of a sequence is simply fed to a corresponding position of a higher layer, that is, only a position sensing feature is transmitted in a depth architecture.
  • each internal state at a word position in a global encoder is transformed into a vector of a fixed size.
  • a context sensing representation provides more general and informative features of a sentence.
  • the stacked RNN with parameter values similar to those in one embodiment of the present disclosure is used.
  • for the stacked RNN with a comparable number of parameters in Table 2, there is still a big gap between the stacked RNN and the technical solution of one embodiment of the present disclosure. As shown in Table 2, one embodiment of the present disclosure achieves better performance than the stacked RNN by using a smaller number of parameters.
  • a value of F1 in one embodiment of the present disclosure is 91.96, which is 1.02 higher than that of the stacked RNN. Therefore, it is confirmed that the technical solution of one embodiment of the present disclosure can effectively utilize global information to learn more useful representations of a sequence labeling task.
  • Table 3 shows results of the model ablation experiment, that is, values of F1 for a named entity recognition task that are obtained when respectively removing one of the character-level word embedding representation (that is, 4021 in FIG. 4 ), the pre-trained word embedding representation (that is, 4022 in FIG. 4 ), the global word embedding representation (that is, 4023 in FIG. 4 ), and the DT RNN (that is, 4024 in FIG. 4 ) and retaining the other three components.
  • global word embedding representation is used in one embodiment of the present disclosure to enhance an input of a sequence labeling encoder
  • global word embedding information may be enhanced in another manner such as being used as an input of a sequence labeling decoder or being used as an input of a softmax classification layer.
  • the technical solution in the embodiments of the present disclosure has the best effect.
  • the global word embedding representation, a multi-granularity character-level word embedding representation, and the pre-trained word embedding representation are used as the input of the sequence labeling encoder.
  • a more specific and richer representation can be learned for each word position, thereby improving an overall effect of the model.
  • the global information and a feature space of another hidden-layer state are relatively similar.
  • FIG. 9 shows an exemplary system 900 , including an exemplary computing device 910 that represents one or more systems and/or devices that can implement various technologies described herein.
  • the computing device 910 may be, for example, a server of a service provider, a device associated with a client (for example, a client device), a system-on-a-chip, and/or any other suitable computing device or computing system.
  • the language sequence labeling apparatus 700 in FIG. 7 or the language sequence labeling apparatus 800 in FIG. 8 may be in a form of the computing device 910 .
  • the language sequence labeling apparatus 700 and the language sequence labeling apparatus 800 each may be implemented as a computer program in a form of a sequence labeling application 916 .
  • the exemplary computing device 910 shown in the figure includes a processing system 911 , one or more computer-readable media 912 , and one or more I/O interfaces 913 communicatively coupled to each other.
  • the computing device 910 may further include a system bus or another data and command transmission system, which couples various components with each other.
  • the system bus may include any one or a combination of different bus structures such as a memory bus or memory controller, a peripheral bus, a universal serial bus, and/or a processor or a local bus that uses any one of various bus architectures.
  • Various other examples such as control and data lines are also contemplated.
  • the processing system 911 represents a function of using hardware to perform one or more operations. Therefore, the processing system 911 is illustrated as including a hardware element 914 that can be configured as a processor, a functional block, or the like. This may include an application-specific integrated circuit (ASIC) implemented in hardware or another logic device formed by using one or more semiconductors.
  • ASIC application-specific integrated circuit
  • the hardware element 914 is not limited by a material from which the hardware element is formed or a processing mechanism adopted therein.
  • the processor may include (a plurality of) semiconductors and/or transistors (for example, an electronic integrated circuit (IC)).
  • processor-executable instructions may be electronically executable instructions.
  • the computer-readable medium 912 is illustrated as including a memory/storage apparatus 915 .
  • the memory/storage apparatus 915 represents a memory/storage capacity associated with one or more computer-readable media.
  • the memory/storage apparatus 915 may include a volatile medium (for example, a random access memory (RAM)) and/or a non-volatile medium (for example, a read-only memory (ROM), a flash memory, an optical disc, or a magnetic disk).
  • the memory/storage apparatus 915 may include a non-removable medium (for example, a RAM, a ROM, or a solid state hard drive) and a removable medium (for example, a flash memory, a removable hard disk drive, or an optical disc).
  • the computer-readable medium 912 may be configured in various other manners as described further below.
  • the one or more I/O interfaces 913 represent functions of allowing a user to input commands and information to the computing device 910 , and in one embodiment, allowing information to be presented to the user and/or other components or devices using various input/output devices.
  • input devices include a keyboard, a cursor control device (for example, a mouse), a microphone (for example, used for voice input), a scanner, touch functionality (for example, capacitive or other sensors that are configured to detect physical touch), and a camera (for example, visible or invisible wavelengths such as infrared frequencies can be used to detect movement that does not involve touch as a gesture), and so forth.
  • Examples of output devices include a display device (for example, a monitor or a projector), a speaker, a printer, a network card, a tactile response device, and the like. Therefore, the computing device 910 may be configured in various manners as described further below to support user interaction.
  • the computing device 910 further includes the sequence labeling application 916 .
  • the sequence labeling application 916 may be, for example, a software instance of the language sequence labeling apparatus 700 or the language sequence labeling apparatus 800 described above, and is combined with other elements in the computing device 910 to implement the technologies described herein.
  • modules include a routine, a program, an object, an element, a component, a data structure, and the like for executing a particular task or implementing a particular abstract data type.
  • Terms “module”, “function”, and “component” used herein generally represent software, hardware, or a combination thereof.
  • Implementations of the described modules and technologies may be stored on a specific form of computer-readable medium or transmitted across a specific form of computer-readable medium.
  • the computer-readable medium may include various media accessible by the computing device 910 .
  • the computer-readable medium may include a “computer-readable storage medium” and a “computer-readable signal medium”.
  • the “computer-readable storage medium” refers to a medium and/or device capable of permanently storing information, and/or a tangible storage apparatus. Therefore, the computer-readable storage medium is a non-signal carrying medium.
  • the computer-readable storage medium includes volatile and non-volatile, or removable and non-removable media, and/or hardware such as a storage device implemented with methods or technologies suitable for storing information (for example, computer-readable instructions, a data structure, a program module, a logic element/circuit, or other data).
  • Examples of the computer-readable storage medium may include, but are not limited to, a RAM, a ROM, an EEPROM, a flash memory or another memory technology, a CD-ROM, a digital versatile disk (DVD) or another optical storage apparatus, hardware, a cartridge tape, a magnetic tape, a magnetic disk storage apparatus or another magnetic storage device, or another storage device, a tangible medium or a product that is suitable for storing expected information and accessible by a computer.
  • the “computer-readable signal medium” refers to a signal-carrying medium configured to send instructions to hardware of the computing device 910 through a network.
  • the signal medium may typically embody computer-readable instructions, a data structure, a program module, or other data in a modulated data signal such as a carrier, a data signal, or another transmission mechanism.
  • the signal medium may further include any information transmission medium.
  • a modulated data signal refers to a signal that encodes information in the signal in such a manner as to set or change one or more of its features.
  • a communication medium includes a wired medium such as a wired network or a directly connected wired medium and wireless mediums such as acoustic, RF, infrared, and other wireless mediums.
  • the hardware element 914 and the computer-readable medium 912 represent instructions, a module, programmable device logic, and/or fixed device logic implemented in a form of hardware, which may be configured to implement at least some aspects of the technologies described herein in some embodiments.
  • the hardware element may include an integrated circuit or a system-on-a-chip, an ASIC, a field programmable gate array (FPGA), a complex programmable logic device (CPLD), and another implementation in silicon or a component of another hardware device.
  • the hardware element may be used as a processing device for executing program tasks defined by the instructions, modules and/or logic embodied by the hardware element, and a hardware device for storing instructions for execution, for example, the computer-readable storage medium described above.
  • the software, the hardware or the program modules and other program modules may be implemented as one or more instructions and/or logic embodied by one or more hardware elements 914 in a specific form of computer-readable storage medium.
  • the computing device 910 may be configured to implement specific instructions and/or functions corresponding to software and/or hardware modules. Therefore, for example, the computer-readable storage medium and/or the hardware element 914 of the processing system may be configured to at least partially implement, in a form of hardware, the module as a module executable by the computing device 910 as software.
  • the instructions and/or functions may be executable/operable by one or more products (for example, one or more computing devices 910 and/or processing systems 911 ) to implement the technologies, modules, and examples described herein.
  • the computing device 910 may have various different configurations.
  • the computing device 910 may be implemented as a computer-type device such as a personal computer, a desktop computer, a multi-screen computer, a laptop computer, or a netbook.
  • the computing device 910 may further be implemented as a mobile apparatus-type device including a mobile device such as a mobile phone, a portable music player, a portable game device, a tablet computer, a multi-screen computer, or the like.
  • the computing device 910 may also be implemented as a television-type device, including a device with or connected to a generally larger screen in a casual viewing environment.
  • the devices include a television, a set-top box, a game console, and the like.
  • the technologies described herein may be supported by the various configurations of the computing device 910 and are not limited to specific examples of the technologies described herein.
  • the functions may also be implemented completely or partially on a “cloud” 920 by using a distributed system such as through a platform 922 as described below.
  • the cloud 920 includes and/or represents the platform 922 for resources 924 .
  • the platform 922 abstracts underlying functions of hardware (for example, a server) and software resources of the cloud 920 .
  • the resources 924 may include applications and/or data that can be used when computer processing is performed on a server remote from the computing device 910 .
  • the resources 924 may further include a service provided through the Internet and/or through a subscriber network such as a cellular or a Wi-Fi network.
  • the platform 922 may abstract resources and functions to connect the computing device 910 to another computing device.
  • the platform 922 may further be configured to abstract scaling of resources to provide a corresponding level of scale for demand encountered for the resources 924 implemented through the platform 922. Therefore, in an embodiment of interconnected devices, implementations of the functions described herein may be distributed throughout the system 900. For example, the functions may be partially implemented on the computing device 910 and through the platform 922 that abstracts the functions of the cloud 920.
  • each functional module may be implemented in a single module, implemented in a plurality of modules, or implemented as a part of other functional modules.
  • the functionality described as being performed by the single module may be performed by a plurality of different modules. Therefore, a reference to a specific functional module is only considered as a reference to an appropriate module for providing the described functionality, rather than indicating a strict logical or physical structure or organization. Therefore, the embodiments of the present disclosure may be implemented in the single module, or may be physically and functionally distributed between different modules and circuits.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Machine Translation (AREA)
US17/355,120 2019-06-05 2021-06-22 Language sequence labeling method and apparatus, storage medium, and computing device Active 2040-09-23 US11681872B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201910486896.1A CN110196967A (zh) 2019-06-05 2019-06-05 基于深度转换架构的序列标注方法和装置
CN201910486896.1 2019-06-05
PCT/CN2020/093679 WO2020244475A1 (zh) 2019-06-05 2020-06-01 用于语言序列标注的方法、装置、存储介质及计算设备

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/093679 Continuation WO2020244475A1 (zh) 2019-06-05 2020-06-01 用于语言序列标注的方法、装置、存储介质及计算设备

Publications (2)

Publication Number Publication Date
US20210319181A1 US20210319181A1 (en) 2021-10-14
US11681872B2 true US11681872B2 (en) 2023-06-20

Family

ID=67754015

Family Applications (1)

Application Number Title Priority Date Filing Date
US17/355,120 Active 2040-09-23 US11681872B2 (en) 2019-06-05 2021-06-22 Language sequence labeling method and apparatus, storage medium, and computing device

Country Status (4)

Country Link
US (1) US11681872B2 (zh)
JP (1) JP7431833B2 (zh)
CN (1) CN110196967A (zh)
WO (1) WO2020244475A1 (zh)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110196967A (zh) * 2019-06-05 2019-09-03 腾讯科技(深圳)有限公司 基于深度转换架构的序列标注方法和装置
CN111274820B (zh) * 2020-02-20 2023-04-07 齐鲁工业大学 一种基于神经网络的智能医疗命名实体识别方法和装置
CN111353295A (zh) * 2020-02-27 2020-06-30 广东博智林机器人有限公司 序列标注方法、装置、存储介质及计算机设备
CN112818676B (zh) * 2021-02-02 2023-09-26 东北大学 一种医学实体关系联合抽取方法
CN113220876B (zh) * 2021-04-16 2022-12-06 山东师范大学 一种用于英文文本的多标签分类方法及系统
CN114153942B (zh) * 2021-11-17 2024-03-29 中国人民解放军国防科技大学 一种基于动态注意力机制的事件时序关系抽取方法
CN114548080B (zh) * 2022-04-24 2022-07-15 长沙市智为信息技术有限公司 一种基于分词增强的中文错字校正方法及系统
CN115545035B (zh) * 2022-11-29 2023-02-17 城云科技(中国)有限公司 一种文本实体识别模型及其构建方法、装置及应用

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107025219A (zh) 2017-04-19 2017-08-08 厦门大学 一种基于内部语义层次结构的词嵌入表示方法
US20170308790A1 (en) 2016-04-21 2017-10-26 International Business Machines Corporation Text classification by ranking with convolutional neural networks
CN108717409A (zh) 2018-05-16 2018-10-30 联动优势科技有限公司 一种序列标注方法及装置
CN108829818A (zh) 2018-06-12 2018-11-16 中国科学院计算技术研究所 一种文本分类方法
CN110196967A (zh) 2019-06-05 2019-09-03 腾讯科技(深圳)有限公司 基于深度转换架构的序列标注方法和装置

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9239828B2 (en) * 2013-12-05 2016-01-19 Microsoft Technology Licensing, Llc Recurrent conditional random fields
CN108304911B (zh) * 2018-01-09 2020-03-13 中国科学院自动化研究所 基于记忆神经网络的知识抽取方法以及系统和设备
CN108628823B (zh) * 2018-03-14 2022-07-01 中山大学 结合注意力机制和多任务协同训练的命名实体识别方法
CN108846017A (zh) * 2018-05-07 2018-11-20 国家计算机网络与信息安全管理中心 基于Bi-GRU和字向量的大规模新闻文本的端到端分类方法
CN109408812A (zh) * 2018-09-30 2019-03-01 北京工业大学 一种基于注意力机制的序列标注联合抽取实体关系的方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170308790A1 (en) 2016-04-21 2017-10-26 International Business Machines Corporation Text classification by ranking with convolutional neural networks
CN107025219A (zh) 2017-04-19 2017-08-08 厦门大学 一种基于内部语义层次结构的词嵌入表示方法
CN108717409A (zh) 2018-05-16 2018-10-30 联动优势科技有限公司 一种序列标注方法及装置
CN108829818A (zh) 2018-06-12 2018-11-16 中国科学院计算技术研究所 一种文本分类方法
CN110196967A (zh) 2019-06-05 2019-09-03 腾讯科技(深圳)有限公司 基于深度转换架构的序列标注方法和装置

Non-Patent Citations (12)

* Cited by examiner, † Cited by third party
Title
Alan Akbik et al., "Contextual String Embeddings for Sequence Labeling," Proceedings of the 27th International Conference on Computational Linguistics, pp. 1638-1649, Aug. 2018. 12 pages.
Antonio Valerio Miceli Barone et al., "Deep architectures for neural machine translation," in WMT, 2017. 9 pages.
Dong, et al., "Deep learning for named entity recognition on Chinese electronic medical records: Combining deep transfer learning with multitask bi-directional LSTM RNN", Plos One, May 2, 2019 16 pages.
Dzmitry Bahdanau et al., "Neural Machine Translation by Jointly Learning to Align and Translate," in ICLR, 2015. 15 pages.
Ling Luo et al., "An attention-based BiLSTM-CRF approach to document level chemical named entity recognition," Bioinformatics, 34(8), Oxford University Press, Nov. 24, 2017, pp. 1381-1388. 8 pages.
Marek Rei et al., "Attending to Characters in Neural Sequence Labeling Models," Proceedings of Coling 2016, Dec. 17, 2016, pp. 309-318. 10 pages.
Matthew E. Peters et al., "Semi-supervised sequence tagging with bidirectional language models," in ACL, 2017. 10 pages.
Mingxuan Wang et al., "Deep Neural Machine Translation with Linear Associative Unit," in ACL, 2017. 10 pages.
Minh-Thang Luong et al., "Effective Approaches to Attention-based Neural Machine Translation," in EMNLP, 2015 10 pages.
Razvan Pascanu et al., "How to Construct Deep Recurrent Neural Networks," in ICLR, 2014. 13 pages.
The Japan Patent Office (JPO) Notification of Reasons for Refusal for Application No. 2021-539998 dated Aug. 1, 2022 5 pages (including translation).
The World Intellectual Property Organization (WIPO) International Search Report for PCT/CN2020/093679 dated Aug. 27, 2020 6 Pages (including translation).

Also Published As

Publication number Publication date
JP7431833B2 (ja) 2024-02-15
US20210319181A1 (en) 2021-10-14
WO2020244475A1 (zh) 2020-12-10
JP2022517971A (ja) 2022-03-11
CN110196967A (zh) 2019-09-03

Similar Documents

Publication Publication Date Title
US11681872B2 (en) Language sequence labeling method and apparatus, storage medium, and computing device
CN108959246B (zh) 基于改进的注意力机制的答案选择方法、装置和电子设备
US11157693B2 (en) Stylistic text rewriting for a target author
EP3568852B1 (en) Training and/or using an encoder model to determine responsive action(s) for natural language input
CN107066464B (zh) 语义自然语言向量空间
US20180276525A1 (en) Method and neural network system for human-computer interaction, and user equipment
CN109582956B (zh) 应用于句子嵌入的文本表示方法和装置
CN109313650B (zh) 在自动聊天中生成响应
CN112685565A (zh) 基于多模态信息融合的文本分类方法、及其相关设备
CN110704576B (zh) 一种基于文本的实体关系抽取方法及装置
CN111931517B (zh) 文本翻译方法、装置、电子设备以及存储介质
US11769018B2 (en) System and method for temporal attention behavioral analysis of multi-modal conversations in a question and answer system
CN110795541B (zh) 文本查询方法、装置、电子设备及计算机可读存储介质
CN110377733B (zh) 一种基于文本的情绪识别方法、终端设备及介质
US20230061778A1 (en) Conversation information processing method, apparatus, computer- readable storage medium, and device
CN113204618A (zh) 基于语义增强的信息识别方法、装置、设备及存储介质
CN111666400A (zh) 消息获取方法、装置、计算机设备及存储介质
US11531927B2 (en) Categorical data transformation and clustering for machine learning using natural language processing
CN113961679A (zh) 智能问答的处理方法、系统、电子设备及存储介质
US20230205994A1 (en) Performing machine learning tasks using instruction-tuned neural networks
CN112906368B (zh) 行业文本增量方法、相关装置及计算机程序产品
CN111506717B (zh) 问题答复方法、装置、设备及存储介质
WO2023116572A1 (zh) 一种词句生成方法及相关设备
CN114970666B (zh) 一种口语处理方法、装置、电子设备及存储介质
CN115050371A (zh) 语音识别方法、装置、计算机设备和存储介质

Legal Events

Date Code Title Description
AS Assignment

Owner name: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MENG, FANDONG;LIU, YIJIN;ZHANG, JINCHAO;AND OTHERS;REEL/FRAME:056626/0372

Effective date: 20210621

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STCF Information on status: patent grant

Free format text: PATENTED CASE