CA3166784A1 - Human-machine interactive speech recognizing method and system for intelligent devices - Google Patents
Human-machine interactive speech recognizing method and system for intelligent devices
- Publication number
- CA3166784A1
- Authority
- CA
- Canada
- Prior art keywords
- slot
- vector
- intent
- term
- hidden state
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; Man-machine interfaces
Abstract
A speech recognition method for human-machine interaction of a smart apparatus and a system, pertaining to the technical field of speech recognition, and improving the accuracy of speech recognition by means of joint optimization training of intent detection and slot filling. The method comprises: performing word segmentation on speech data of a user's question to obtain an original word sequence, and generating a vector representation of the original word sequence by means of embedding processing; performing weighting processing on a hidden state vector h_i and a slot context vector c_i^S to obtain a slot label model y_i^S; performing weighting processing on a hidden state vector h_T and an intent context vector c^I to obtain an intent prediction model y^I; joining the slot context vector c_i^S and the intent context vector c^I by means of a slot gate g, and obtaining a transformed representation of the slot label model y_i^S by means of the slot gate g; and constructing an objective function for joint optimization of the intent prediction model y^I and the transformed slot label model y_i^S, and performing intent detection on the speech data of the user's question on the basis of the objective function.
Description
HUMAN-MACHINE INTERACTIVE SPEECH RECOGNIZING METHOD AND
SYSTEM FOR INTELLIGENT DEVICES
BACKGROUND OF THE INVENTION
Technical Field [0001] The present invention relates to the technical field of speech recognition, and more particularly to a human-machine interactive speech recognizing method and system for an intelligent device.
Description of Related Art
[0002] With the development of internet technology, more and more intelligent devices employ speech for human-machine interaction.
Currently available speech interactive systems include Siri, Xiaomi, Cortana, Avatar Framework, and Duer, etc. As compared with traditional human-machine interaction based on manual input, speech-based human-machine interaction offers convenience, high efficiency, and a broad range of application scenarios. During the process of speech recognition, intent recognition and slot filling techniques are key to ensuring the accuracy of speech recognition results.
[0003] Intent recognition can be abstracted as a classification problem: a classifier, typified by a CNN + knowledge approach, is employed to train an intent recognition model, into which a semantic representation of knowledge is further introduced, in addition to word embedding of the users' speech questions, to enhance the generalization capability of the representation layer. It has been found in practical application, however, that such a model suffers from slot-information filling deviation, whereby the accuracy of the intent recognition model is adversely affected. As regards slot filling, its essence is to formalize a sentence sequence into a labeled sequence, and there are many frequently used sequence labeling methods, such as the hidden Markov model and the conditional random field model; but these slot filling models cannot satisfy practical application requirements under specific application scenarios, because the lack of contextual information leaves slots ambiguous under different semantic intents.
Seen as such, in the state of the art the two models are trained independently, with no combined optimization of the intent recognition task and the slot filling task, so that the finally trained models suffer from low recognition accuracy in speech recognition, and user experience is degraded.
SUMMARY OF THE INVENTION
[0004] The objective of the present invention is to provide a human-machine interactive speech recognizing method and system for an intelligent device, to enhance accuracy of speech recognition by jointly optimizing and training intent recognition and slot filling.
[0005] To achieve the above objective, according to one aspect, the present invention provides a human-machine interactive speech recognizing method for an intelligent device, the method comprising:
[0006] subjecting a speech question of a user to a term-segmenting process to obtain an original term sequence, and vectorizing the original term sequence through an embedding process;
[0007] calculating a hidden state vector h_i and a slot context vector c_i^S of each term segmentation vector, and weighting the hidden state vector h_i and the slot context vector c_i^S to thereafter obtain a slot label model y_i^S;
[0008] calculating a hidden state vector h_T and an intent context vector c^I of the vectorized original term sequence, and weighting the hidden state vector h_T and the intent context vector c^I to thereafter obtain an intent prediction model y^I;
[0009] employing a slot gate g to join the slot context vector c_i^S and the intent context vector c^I, and generating a transformed representation of the slot label model y_i^S through the slot gate g; and
[0010] jointly optimizing the intent prediction model y^I and the transformed slot label model y_i^S to construct a target function, and performing intent recognition on the speech question of the user based on the target function.
[0011] Preferably, the step of subjecting a speech question of a user to a term-segmenting process to obtain an original term sequence, and vectorizing the original term sequence through an embedding process includes:
[0012] receiving the speech question of the user and transforming the speech question to a recognizable text, and employing a tokenizer to term-segment the recognizable text and obtain the original term sequence; and
[0013] subjecting the original term sequence to a word embedding process, and realizing a vector representation of each segmented term in the original term sequence.
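Paragraphs [0012] and [0013] describe tokenization followed by word embedding. A minimal NumPy sketch of this step follows; the vocabulary, embedding size, and random embedding table are illustrative assumptions, not values from the patent.

```python
import numpy as np

# Illustrative vocabulary and randomly initialized embedding table
# (in practice the tokenizer and embeddings would be trained components).
rng = np.random.default_rng(0)
vocab = {"play": 0, "some": 1, "jazz": 2, "music": 3}
EMB_DIM = 8
embedding_table = rng.normal(size=(len(vocab), EMB_DIM))

def embed(text):
    """Whitespace term segmentation followed by an embedding-table lookup."""
    terms = text.lower().split()
    ids = [vocab[t] for t in terms]
    return terms, embedding_table[ids]   # shape (T, EMB_DIM)

terms, X = embed("play some jazz music")
print(terms, X.shape)
```

The matrix X (one row per segmented term) is what the later BiLSTM encoding steps consume.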
[0014] Preferably, the step of calculating a hidden state vector h_i and a slot context vector c_i^S of each term segmentation vector, and weighting the hidden state vector h_i and the slot context vector c_i^S to thereafter obtain a slot label model y_i^S includes:
[0015] employing a bidirectional LSTM network to encode each term segmentation vector, and outputting the hidden state vector h_i corresponding to each term segmentation vector;
[0016] calculating the slot context vector c_i^S, to which each term segmentation vector corresponds, through the formula c_i^S = Σ_j α_{i,j}^S · h_j, wherein α_{i,j}^S represents an attention weight of a slot, whose calculation formula is α_{i,j}^S = exp(e_{i,j}) / Σ_{k=1}^T exp(e_{i,k}), e_{i,j} = σ(W_{he}^S · h_j), where σ represents a slot activation function, and W_{he}^S represents a slot weight matrix;
and
[0017] constructing a slot label model y_i^S = softmax(W_{hy}^S · (h_i + c_i^S)) based on the hidden state vector h_i and the slot context vector c_i^S.
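The slot-attention computation in paragraphs [0015] to [0017] can be sketched in NumPy as follows. Note that, as written in the patent, the score e_{i,j} depends only on h_j and not on i, so a single context vector is shared across positions in this sketch; tanh is assumed for the activation σ, and all weight matrices are random placeholders rather than trained values.

```python
import numpy as np

rng = np.random.default_rng(1)
T, H, NUM_SLOT_LABELS = 4, 6, 5
h = rng.normal(size=(T, H))                   # hidden state vectors h_1..h_T (assumed given by the BiLSTM)
W_he = rng.normal(size=(H, H))                # slot weight matrix W_he^S (placeholder)
W_hy = rng.normal(size=(NUM_SLOT_LABELS, H))  # output matrix W_hy^S (placeholder)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# e_j = sigma(W_he^S h_j): one scalar score per position j
# (sigma taken as tanh, reduced to a scalar by summation as an assumption)
scores = np.tanh(h @ W_he.T).sum(axis=-1)     # shape (T,)
alpha = softmax(scores)                       # attention weights alpha_{i,j}
c_S = alpha @ h                               # slot context vector c_i^S, shape (H,)

# y_i^S = softmax(W_hy^S (h_i + c_i^S)): one label distribution per term
y_S = softmax((h + c_S) @ W_hy.T)             # shape (T, NUM_SLOT_LABELS)
print(y_S.shape)
```

Each row of y_S is a probability distribution over slot labels for the corresponding segmented term.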
[0018] Further, the step of calculating a hidden state vector h_T and an intent context vector c^I of the vectorized original term sequence, and weighting the hidden state vector h_T and the intent context vector c^I to thereafter obtain an intent prediction model y^I includes:
[0019] employing a hidden unit in the bidirectional LSTM network to encode the vectorized original term sequence, and obtaining the hidden state vector h_T;
[0020] calculating the intent context vector c^I of the original term sequence through the formula c^I = Σ_j α_j^I · h_T, wherein α_j^I represents an attention weight of an intent, whose calculation formula is α_j^I = exp(e_j) / Σ_{k=1}^T exp(e_k), e_j = σ^I(W_{he}^I · h_T), where σ^I represents an intent activation function, and W_{he}^I represents an intent weight matrix; and
[0021] constructing an intent prediction model y^I = softmax(W_{hy}^I · (h_T + c^I)) based on the hidden state vector h_T and the intent context vector c^I.
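The intent branch in paragraphs [0019] to [0021] can be sketched the same way, again with random placeholder weights. Because the score e_j here depends only on h_T, every position receives the same score, the attention weights are uniform, and c^I reduces to h_T itself; the sketch keeps the explicit weighted sum to mirror the formula.

```python
import numpy as np

rng = np.random.default_rng(2)
T, H, NUM_INTENTS = 4, 6, 3
h_T = rng.normal(size=H)                    # final BiLSTM hidden state h_T
W_he_I = rng.normal(size=(H, H))            # intent weight matrix W_he^I (placeholder)
W_hy_I = rng.normal(size=(NUM_INTENTS, H))  # output matrix W_hy^I (placeholder)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# e_j = sigma^I(W_he^I h_T): identical for every j, so alpha is uniform
e = np.full(T, np.tanh(W_he_I @ h_T).sum())
alpha = softmax(e)                                   # each weight is 1/T
c_I = (alpha[:, None] * h_T[None, :]).sum(axis=0)    # c^I = sum_j alpha_j h_T

# y^I = softmax(W_hy^I (h_T + c^I)): one distribution over intents
y_I = softmax(W_hy_I @ (h_T + c_I))
print(y_I.shape)
```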
[0022] Preferably, the step of employing a slot gate g to join the slot context vector c_i^S and the intent context vector c^I, and generating a transformed representation of the slot label model y_i^S through the slot gate g includes:
[0023] formally representing the slot gate g as g = v · tanh(c_i^S + W · c^I), wherein v represents a weight vector obtained by training, and W represents a weight matrix obtained by training; and
[0024] formally representing the transformation of the slot label model y_i^S through the slot gate g as:
[0025] y_i^S = softmax(W_{hy}^S · (h_i + c_i^S · g)).
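The slot gate of paragraphs [0023] to [0025] can be sketched as follows: a scalar gate per term, computed from the slot and intent context vectors, rescales each slot context vector before the softmax. All tensors below are random placeholders standing in for trained values.

```python
import numpy as np

rng = np.random.default_rng(3)
T, H, NUM_SLOT_LABELS = 4, 6, 5
h = rng.normal(size=(T, H))          # hidden state vectors h_i
c_S = rng.normal(size=(T, H))        # per-term slot context vectors c_i^S
c_I = rng.normal(size=H)             # intent context vector c^I
v = rng.normal(size=H)               # trained weight vector v (placeholder)
W = rng.normal(size=(H, H))          # trained weight matrix W (placeholder)
W_hy = rng.normal(size=(NUM_SLOT_LABELS, H))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# g = v . tanh(c_i^S + W c^I): one scalar gate per term
g = np.tanh(c_S + c_I @ W.T) @ v                 # shape (T,)

# y_i^S = softmax(W_hy^S (h_i + c_i^S * g))
y_S = softmax((h + c_S * g[:, None]) @ W_hy.T)   # shape (T, NUM_SLOT_LABELS)
print(g.shape, y_S.shape)
```

The gate g lets the intent context modulate how much slot context enters each slot prediction, which is the fusion mechanism the patent relies on.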
[0026] Optionally, the target function constructed by jointly optimizing the intent prediction model y^I and the transformed slot label model y_i^S is:
[0027] p(y^S, y^I | X) = p(y^I | X) · Π_{i=1}^T p(y_i^S | X), wherein p(y^S, y^I | X) represents a conditional probability for outputting slot filling and intent prediction at a given original term sequence, where X is the vectorized original term sequence.
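The joint objective above is simply the intent probability multiplied by the per-term slot probabilities; in training one would maximize its logarithm. A numeric sketch with illustrative probability values (not taken from the patent):

```python
import numpy as np

p_intent = 0.9                         # p(y^I | X), illustrative value
p_slots = np.array([0.8, 0.95, 0.7])   # p(y_i^S | X) for each term, illustrative

# p(y^S, y^I | X) = p(y^I | X) * prod_i p(y_i^S | X)
joint = p_intent * np.prod(p_slots)

# Equivalent log form, as would be used for a training loss
log_joint = np.log(p_intent) + np.log(p_slots).sum()

print(round(float(joint), 4))   # 0.4788
```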
[0028] Preferably, the step of performing intent recognition on the speech question of the user based on the target function includes:
Date Regue/Date Received 2022-07-04
[0029] sequentially obtaining intent conditional probabilities, to which the various segmented terms in the original term sequence correspond, through the target function;
and
[0030] screening therefrom a segmented term with the maximum probability value and recognizing the segmented term as the intent of the speech question of the user.
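The screening step in paragraphs [0029] and [0030] amounts to an argmax over the per-term conditional probabilities. A minimal sketch with illustrative terms and probabilities:

```python
import numpy as np

# Illustrative segmented terms and their intent conditional probabilities
# as would be produced by the target function (values are assumptions).
terms = ["play", "some", "jazz", "music"]
probs = np.array([0.62, 0.05, 0.21, 0.12])

# Screen the term with the maximum probability value as the user's intent
best = terms[int(np.argmax(probs))]
print(best)   # play
```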
[0031] In comparison with prior-art technology, the human-machine interactive speech recognizing method for an intelligent device provided by the present invention achieves the following advantageous effects.
[0032] In the human-machine interactive speech recognizing method for an intelligent device provided by the present invention, the speech question of the user as obtained is firstly transformed to a recognizable text, a term segmenting process is carried out on the basis of the recognizable text to generate an original term sequence, which is then subjected to a word embedding process to realize vector representation; thereafter, a slot label model y_i^S and an intent prediction model y^I are respectively constructed on the basis of the vectorized original term sequence, wherein the slot label model y_i^S is constructed by calculating a hidden state vector h_i and a slot context vector c_i^S of each term segmentation vector and weighting the two, while the intent prediction model y^I is constructed by calculating a hidden state vector h_T and an intent context vector c^I of the original term sequence and weighting the two. In order to fuse the intent prediction model y^I with the slot label model y_i^S, a decoder layer is additionally added to the existing encoder-decoder framework to construct the intent prediction model y^I, the slot context vector c_i^S and the intent context vector c^I are joined by introducing a slot gate g, and the intent prediction model y^I and the transformed slot label model y_i^S are finally jointly optimized to obtain a target function; the target function is employed to sequentially obtain the intent conditional probabilities to which the various segmented terms in the original term sequence correspond, and the segmented term with the maximum probability value is screened therefrom and recognized as the intent of the speech question of the user, so as to ensure accuracy of speech recognition.
[0033] According to another aspect, the present invention provides a human-machine interactive speech recognizing system for an intelligent device, wherein the system is applied to the human-machine interactive speech recognizing method for an intelligent device as recited in the foregoing technical solution, and the system comprises:
[0034] a term segmentation processing unit, for subjecting a speech question of a user to a term-segmenting process to obtain an original term sequence, and vectorizing the original term sequence through an embedding process;
[0035] a first calculating unit, for calculating a hidden state vector h_i and a slot context vector c_i^S of each term segmentation vector, and weighting the hidden state vector h_i and the slot context vector c_i^S to thereafter obtain a slot label model y_i^S;
[0036] a second calculating unit, for calculating a hidden state vector h_T and an intent context vector c^I of the vectorized original term sequence, and weighting the hidden state vector h_T and the intent context vector c^I to thereafter obtain an intent prediction model y^I;
[0037] a model transforming unit, for employing a slot gate g to join the slot context vector c_i^S and the intent context vector c^I, and generating a transformed representation of the slot label model y_i^S through the slot gate g; and
[0038] a joint optimization unit, for jointly optimizing the intent prediction model y^I and the transformed slot label model y_i^S to construct a target function, and performing intent recognition on the speech question of the user based on the target function.
[0039] Preferably, the term segmentation processing unit includes:
[0040] a term-segmenting module, for receiving the speech question of the user and transforming the speech question to a recognizable text, and employing a tokenizer to term-segment the recognizable text and obtain the original term sequence; and
[0041] an embedding processing module, for subjecting the original term sequence to a word embedding process, and realizing a vector representation of each segmented term in the original term sequence.
[0042] Preferably, the first calculating unit includes:
[0043] a hidden state calculating module, for employing a bidirectional LSTM network to encode each term segmentation vector, and outputting the hidden state vector h_i corresponding to each term segmentation vector;
[0044] a slot context calculating module, for calculating the slot context vector c_i^S, to which each term segmentation vector corresponds, through the formula c_i^S = Σ_j α_{i,j}^S · h_j, wherein α_{i,j}^S represents an attention weight of a slot, whose calculation formula is α_{i,j}^S = exp(e_{i,j}) / Σ_{k=1}^T exp(e_{i,k}), e_{i,j} = σ(W_{he}^S · h_j), where σ represents a slot activation function, and W_{he}^S represents a slot weight matrix; and
[0045] a slot label model module, for constructing a slot label model y_i^S = softmax(W_{hy}^S · (h_i + c_i^S)) based on the hidden state vector h_i and the slot context vector c_i^S.
[0046] As compared with prior-art technology, the advantageous effects achieved by the human-machine interactive speech recognizing system for an intelligent device provided by the present invention are identical with the advantageous effects achievable by the human-machine interactive speech recognizing method for an intelligent device provided by the foregoing technical solution, so these are not redundantly described in this context.
BRIEF DESCRIPTION OF THE DRAWINGS
[0047] The drawings described here are meant to provide further understanding of the present invention, and constitute part of the present invention. The exemplary embodiments of the present invention and the descriptions thereof are meant to explain the present invention, rather than to restrict the present invention. In the drawings:
[0048] Fig. 1 is a flowchart schematically illustrating the human-machine interactive speech recognizing method for an intelligent device in Embodiment 1 of the present invention;
[0049] Fig. 2 is an exemplary view illustrating the encoder-decoder fusion model in Embodiment 1 of the present invention;
[0050] Fig. 3 is an exemplary view illustrating the slot gate g in Fig. 2; and
[0051] Fig. 4 is a block diagram illustrating the structure of the human-machine interactive speech recognizing system for an intelligent device in Embodiment 2 of the present invention.
[0052] Reference numerals:
[0053] 1 - term segmentation processing unit; 2 - first calculating unit
[0054] 3 - second calculating unit; 4 - model transforming unit
[0055] 5 - joint optimization unit
DETAILED DESCRIPTION OF THE INVENTION
[0056] To make the objectives, features and advantages of the present invention more lucid and clear, the technical solutions in the embodiments of the present invention are clearly and comprehensively described below with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the embodiments as described are merely partial, rather than the entire, embodiments of the present invention.
All other embodiments obtainable by persons ordinarily skilled in the art on the basis of the embodiments in the present invention without spending creative effort shall all fall within the protection scope of the present invention.
[0057] Embodiment 1
[0058] Fig. 1 is a flowchart schematically illustrating the human-machine interactive speech recognizing method for an intelligent device in Embodiment 1 of the present invention.
Referring to Fig. 1, the human-machine interactive speech recognizing method for an intelligent device provided by this embodiment comprises:
[0059] subjecting a speech question of a user to a term-segmenting process to obtain an original term sequence, and vectorizing the original term sequence through an embedding process;
calculating a hidden state vector h_i and a slot context vector c_i^S of each term segmentation vector, and weighting the hidden state vector h_i and the slot context vector c_i^S to thereafter obtain a slot label model y_i^S; calculating a hidden state vector h_T and an intent context vector c^I of the vectorized original term sequence, and weighting the hidden state vector h_T and the intent context vector c^I to thereafter obtain an intent prediction model y^I;
employing a slot gate g to join the slot context vector c_i^S and the intent context vector c^I, and generating a transformed representation of the slot label model y_i^S through the slot gate g; and jointly optimizing the intent prediction model y^I and the transformed slot label model y_i^S to construct a target function, and performing intent recognition on the speech question of the user based on the target function.
[0060] In the human-machine interactive speech recognizing method for an intelligent device provided by this embodiment, the speech question of the user as obtained is firstly transformed to a recognizable text, a term segmenting process is carried out on the basis of the recognizable text to generate an original term sequence, which is then subjected to a word embedding process to realize vector representation; thereafter, a slot label model y_i^S and an intent prediction model y^I are respectively constructed on the basis of the vectorized original term sequence, wherein the slot label model y_i^S is constructed by calculating a hidden state vector h_i and a slot context vector c_i^S of each term segmentation vector and weighting the two, while the intent prediction model y^I is constructed by calculating a hidden state vector h_T and an intent context vector c^I of the original term sequence and weighting the two. As shown in Fig. 2, in order to fuse the intent prediction model y^I with the slot label model y_i^S, a decoder layer is additionally added to the existing encoder-decoder framework to construct the intent prediction model y^I, the slot context vector c_i^S and the intent context vector c^I are joined by introducing a slot gate g, and the intent prediction model y^I and the transformed slot label model y_i^S are finally jointly optimized to obtain a target function; the target function is employed to sequentially obtain the intent conditional probabilities to which the various segmented terms in the original term sequence correspond, and the segmented term with the maximum probability value is subsequently screened therefrom and recognized as the intent of the speech question of the user, so as to ensure accuracy of speech recognition.
[0061] Specifically, the step of subjecting a speech question of a user to a term-segmenting process to obtain an original term sequence, and vectorizing the original term sequence through an embedding process in the foregoing embodiment includes:
[0062] receiving the speech question of the user and transforming the speech question to a recognizable text, and employing a tokenizer to term-segment the recognizable text and obtain the original term sequence; and subjecting the original term sequence to a word embedding process, and realizing a vector representation of each segmented term in the original term sequence.
[0063] As should be noted, the step of calculating a hidden state vector h_i and a slot context vector c_i^S of each term segmentation vector, and weighting the hidden state vector h_i and the slot context vector c_i^S to thereafter obtain a slot label model y_i^S in the foregoing embodiment includes:
[0064] employing a bidirectional LSTM network to encode each term segmentation vector, and outputting the hidden state vector h_i corresponding to each term segmentation vector;
calculating the slot context vector c_i^S, to which each term segmentation vector corresponds, through the formula c_i^S = Σ_j α_{i,j}^S · h_j, wherein α_{i,j}^S represents an attention weight of a slot, whose calculation formula is α_{i,j}^S = exp(e_{i,j}) / Σ_{k=1}^T exp(e_{i,k}), e_{i,j} = σ(W_{he}^S · h_j), where σ represents a slot activation function, and W_{he}^S represents a slot weight matrix; and constructing a slot label model y_i^S = softmax(W_{hy}^S · (h_i + c_i^S)) based on the hidden state vector h_i and the slot context vector c_i^S.
[0065] During specific implementation, after plural term segmentation vectors have been input to the bidirectional LSTM network, the hidden state vectors h_i can be correspondingly output on a one-by-one basis. In the formula c_i^S = Σ_j α_{i,j}^S · h_j of the slot context vector, α_{i,j}^S represents the attention weight of the slot, i represents the ith term segmentation vector, and j represents the jth element in the ith term segmentation vector. Specifically, the calculation formula of the attention weight of the slot is α_{i,j}^S = exp(e_{i,j}) / Σ_{k=1}^T exp(e_{i,k}), e_{i,j} = σ(W_{he}^S · h_j), where T represents the total number of elements in the term segmentation vector, and k indexes the elements from 1 to T. In addition, the slot activation function σ and the slot weight matrix W_{he}^S can be derived on the basis of vector matrix training of the original term sequence; the specific training processes are conventional technical means frequently employed in this technical field, so they are not redundantly described in this embodiment.
[0066] The step of calculating a hidden state vector h_T and an intent context vector c^I of the vectorized original term sequence, and weighting the hidden state vector h_T and the intent context vector c^I to thereafter obtain an intent prediction model y^I in the foregoing embodiment includes:
[0067] employing a hidden unit in the bidirectional LSTM network to encode the vectorized original term sequence, and obtaining the hidden state vector h_T; calculating the intent context vector c^I of the original term sequence through the formula c^I = Σ_j α_j^I · h_T, wherein α_j^I represents an attention weight of an intent, whose calculation formula is α_j^I = exp(e_j) / Σ_{k=1}^T exp(e_k), e_j = σ^I(W_{he}^I · h_T), where σ^I represents an intent activation function, and W_{he}^I represents an intent weight matrix; and constructing an intent prediction model y^I = softmax(W_{hy}^I · (h_T + c^I)) based on the hidden state vector h_T and the intent context vector c^I.
[0068] During specific implementation, the method of training the intent prediction model y^I is the same as the method of training the slot label model y_i^S; the difference rests in the fact that the hidden state vector h_T is obtained by a hidden unit in the bidirectional LSTM network after one-dimensional transformation of the vector matrix. Formula c^I = Σ_j α_j^I · h_j is subsequently invoked to calculate the intent context vector c^I of the original term sequence, where α_j^I represents an attention weight of an intent, its calculation formula being α_j^I = exp(e_j) / Σ_k exp(e_k), e_j = σ^I(W_he^I · h_j), wherein σ^I represents an intent activation function and W_he^I represents an intent weight matrix. As regards the intent activation function σ^I and the intent weight matrix W_he^I, these can be derived on the basis of training over the processed one-dimensional vectors; the specific training processes are conventional technical means frequently employed in this technical field, so they are not redundantly described in this embodiment.
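The intent branch described above can be sketched as follows. This is one possible reading of the formulas (attention scores computed per hidden state h_j, the final state H[-1] serving as h_T); all shapes and names are illustrative assumptions, not the patent's concrete components.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def intent_prediction(H, W_he_I, W_hy_I):
    """Intent-branch sketch: y^I = softmax(W_hy^I (h_T + c^I)).

    H      : (T, d) BiLSTM hidden states; H[-1] serves as h_T.
    W_he_I : (d,) intent attention weight vector.
    W_hy_I : (n_intents, d) intent output weight matrix.
    """
    e = sigmoid(H @ W_he_I)              # e_j = sigma^I(W_he^I . h_j)
    alpha = np.exp(e) / np.exp(e).sum()  # attention weights over positions
    c_I = alpha @ H                      # intent context vector c^I
    h_T = H[-1]                          # final hidden state
    return softmax(W_hy_I @ (h_T + c_I))
```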
[0069] Moreover, the step of employing a slot gate g to join the slot context vector c_i^S and the intent context vector c^I, and generating a transformed representation of the slot label model y_i^S through the slot gate g in the foregoing embodiment includes:
[0070] formally representing the slot gate g as g = v · tanh(c_i^S + W · c^I), wherein v represents a weight vector obtained by training, and W represents a weight matrix obtained by training; and formally representing the transformation of the slot label model y_i^S through the slot gate g as y_i^S = softmax(W_hy^S · (h_i + c_i^S · g)). Fig. 3 shows a structure model of the slot gate g.
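The slot gate and the gated slot label model can be sketched as below. This is a minimal NumPy illustration of the two formulas just given, under the assumption that the gate g is a scalar; the variable names are placeholders.

```python
import numpy as np

def slot_gate(c_S, c_I, v, W):
    """g = v . tanh(c^S + W . c^I): a scalar gate through which the intent
    context modulates the slot context, per the formula above.
    c_S, c_I : (d,) slot and intent context vectors; v : (d,); W : (d, d).
    """
    return float(v @ np.tanh(c_S + W @ c_I))

def gated_slot_label(h_i, c_S, g, W_hy):
    """y_i^S = softmax(W_hy^S (h_i + c^S * g)).
    h_i : (d,) hidden state; W_hy : (n_labels, d) output weight matrix.
    """
    z = W_hy @ (h_i + c_S * g)
    e = np.exp(z - z.max())  # stable softmax
    return e / e.sum()
```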
[0071] Preferably, the target function constructed by jointly optimizing the intent prediction model y^I and the transformed slot label model y_i^S in the foregoing embodiment is:
[0072] p(y^S, y^I | X) = p(y^I | X) · Π_{i=1}^{T} p(y_i^S | X), wherein p(y^S, y^I | X) represents a conditional probability for outputting slot filling and intent prediction at a given original term sequence, where X represents the vectorized original term sequence. After expansion, p(y^S, y^I | X) = p(y^I | x_1, …, x_T) · Π_{i=1}^{T} p(y_i^S | x_1, …, x_T), where x_i represents the ith term segmentation vector, and T represents the total number of term segmentation vectors. Through calculation of the target function, intent probability values of the various term segmentation vectors are obtained, and the segmented term with the maximum probability value is screened out of the various term segmentation vectors and recognized as the intent of the speech question of the user.
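The factorized objective and the screening step above can be sketched as follows. The functions are illustrative stand-ins (log space is used for numerical stability, an assumption not stated in the text), not the patent's training code.

```python
import numpy as np

def joint_log_prob(intent_prob, slot_probs):
    """log p(y^S, y^I | X) = log p(y^I | X) + sum_i log p(y_i^S | X):
    the factorized target function above, evaluated in log space."""
    return np.log(intent_prob) + np.log(np.asarray(slot_probs)).sum()

def screen_intent(terms, probs):
    """Screen out the segmented term with the maximum probability value
    and take it as the recognized intent of the user's speech question."""
    return terms[int(np.argmax(probs))]
```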
[0073] Embodiment 2
[0074] Referring to Fig. 1 and Fig. 4, this embodiment provides a human-machine interactive speech recognizing system for an intelligent device, the system comprising:
Date Reçue/Date Received 2022-07-04
[0075] a term segmentation processing unit 1, for subjecting a speech question of a user to a term-segmenting process to obtain an original term sequence, and vectorizing the original term sequence through an embedding process;
[0076] a first calculating unit 2, for calculating a hidden state vector h_i and a slot context vector c_i^S of each term segmentation vector, and weighting the hidden state vector h_i and the slot context vector c_i^S to thereafter obtain a slot label model y_i^S;
[0077] a second calculating unit 3, for calculating a hidden state vector h_T and an intent context vector c^I of the vectorized original term sequence, and weighting the hidden state vector h_T and the intent context vector c^I to thereafter obtain an intent prediction model y^I;
[0078] a model transforming unit 4, for employing a slot gate g to join the slot context vector c_i^S and the intent context vector c^I, and generating a transformed representation of the slot label model y_i^S through the slot gate g; and
[0079] a joint optimization unit 5, for jointly optimizing the intent prediction model y^I and the transformed slot label model y_i^S to construct a target function, and performing intent recognition on the speech question of the user based on the target function.
Specifically, the term segmentation processing unit includes:
[0080] a term-segmenting module, for receiving the speech question of the user and transforming the speech question to a recognizable text, and employing a tokenizer to term-segment the recognizable text and obtain the original term sequence; and
[0081] an embedding processing module, for subjecting the original term sequence to a word embedding process, and realizing a vector representation of each segmented term in the original term sequence.
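The term-segmenting and embedding modules can be sketched together as below. This is a hypothetical stand-in: a whitespace split substitutes for a real tokenizer, and a plain lookup matrix substitutes for trained word embeddings; none of the names are the patent's actual components.

```python
import numpy as np

def vectorize_question(text, vocab, emb_table, unk_index=0):
    """Term-segment the recognized text and embed each segmented term.

    text      : recognizable text transformed from the speech question.
    vocab     : dict mapping a term to its row in emb_table.
    emb_table : (vocab_size, d) matrix standing in for trained embeddings.
    Returns the original term sequence and its (T, d) vector representation.
    """
    terms = text.split()                            # original term sequence
    idx = [vocab.get(t, unk_index) for t in terms]  # term -> vocabulary id
    return terms, emb_table[idx]                    # one vector per term
```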
[0082] Specifically, the first calculating unit includes:
[0083] a hidden state calculating module, for employing a bidirectional LSTM network to encode each term segmentation vector, and outputting the hidden state vector h_i corresponding to each term segmentation vector;
[0084] a slot context calculating module, for calculating the slot context vector c_i^S, to which each term segmentation vector corresponds, through formula c_i^S = Σ_j α_{i,j}^S · h_j, wherein α_{i,j}^S represents an attention weight of a slot, its calculation formula is α_{i,j}^S = exp(e_{i,j}) / Σ_{k=1}^{T} exp(e_{i,k}), e_{i,j} = σ(W_he^S · h_j), where σ represents a slot activation function, and W_he^S represents a slot weight matrix; and
[0085] a slot label model module, for constructing a slot label model y_i^S = softmax(W_hy^S · (h_i + c_i^S)) based on the hidden state vector h_i and the slot context vector c_i^S.
[0086] As compared with prior-art technology, the advantageous effects achieved by the human-machine interactive speech recognizing system for an intelligent device provided by this embodiment of the present invention are identical with the advantageous effects achievable by the human-machine interactive speech recognizing method for an intelligent device provided by the foregoing Embodiment 1, so these are not redundantly described in this context.
[0087] As understandable to persons ordinarily skilled in the art, all or part of the steps in the method of the present invention can be realized via a program that instructs the relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, performs the various steps of the method in the foregoing embodiment, wherein the storage medium can be a ROM/RAM, a magnetic disk, an optical disk, a memory card, etc.
[0088] What the above describes is merely directed to specific modes of execution of the present invention, but the protection scope of the present invention is not restricted thereby. Any change or replacement easily conceivable to persons skilled in the art within the technical range disclosed by the present invention shall be covered by the protection scope of the present invention. Accordingly, the protection scope of the present invention shall be based on the protection scope as claimed in the Claims.
Claims (10)
1. A human-machine interactive speech recognizing method for an intelligent device, characterized in comprising:
subjecting a speech question of a user to a term-segmenting process to obtain an original term sequence, and vectorizing the original term sequence through an embedding process;
calculating a hidden state vector h_i and a slot context vector c_i^S of each term segmentation vector, and weighting the hidden state vector h_i and the slot context vector c_i^S to thereafter obtain a slot label model y_i^S;
calculating a hidden state vector h_T and an intent context vector c^I of the vectorized original term sequence, and weighting the hidden state vector h_T and the intent context vector c^I to thereafter obtain an intent prediction model y^I;
employing a slot gate g to join the slot context vector c_i^S and the intent context vector c^I, and generating a transformed representation of the slot label model y_i^S through the slot gate g; and
jointly optimizing the intent prediction model y^I and the transformed slot label model y_i^S to construct a target function, and performing intent recognition on the speech question of the user based on the target function.
2. The method according to Claim 1, characterized in that the step of subjecting a speech question of a user to a term-segmenting process to obtain an original term sequence, and vectorizing the original term sequence through an embedding process includes:
receiving the speech question of the user and transforming the speech question to a recognizable text, and employing a tokenizer to term-segment the recognizable text and obtain the original term sequence; and subjecting the original term sequence to a word embedding process, and realizing a vector representation of each segmented term in the original term sequence.
3. The method according to Claim 1, characterized in that the step of calculating a hidden state vector h_i and a slot context vector c_i^S of each term segmentation vector, and weighting the hidden state vector h_i and the slot context vector c_i^S to thereafter obtain a slot label model y_i^S includes:
employing a bidirectional LSTM network to encode each term segmentation vector, and outputting the hidden state vector h_i corresponding to each term segmentation vector;
calculating the slot context vector c_i^S, to which each term segmentation vector corresponds, through formula c_i^S = Σ_j α_{i,j}^S · h_j, wherein α_{i,j}^S represents an attention weight of a slot, its calculation formula is α_{i,j}^S = exp(e_{i,j}) / Σ_{k=1}^{T} exp(e_{i,k}), e_{i,j} = σ(W_he^S · h_j), where σ represents a slot activation function, and W_he^S represents a slot weight matrix; and
constructing a slot label model y_i^S = softmax(W_hy^S · (h_i + c_i^S)) based on the hidden state vector h_i and the slot context vector c_i^S.
4. The method according to Claim 1, characterized in that the step of calculating a hidden state vector h_T and an intent context vector c^I of the vectorized original term sequence, and weighting the hidden state vector h_T and the intent context vector c^I to thereafter obtain an intent prediction model y^I includes:
employing a hidden unit in the bidirectional LSTM network to encode the vectorized original term sequence, and obtaining the hidden state vector h_T;
calculating the intent context vector c^I of the original term sequence through formula c^I = Σ_j α_j^I · h_j, wherein α_j^I represents an attention weight of an intent, its calculation formula is α_j^I = exp(e_j) / Σ_k exp(e_k), e_j = σ^I(W_he^I · h_j), where σ^I represents an intent activation function, and W_he^I represents an intent weight matrix; and
constructing an intent prediction model y^I = softmax(W_hy^I · (h_T + c^I)) based on the hidden state vector h_T and the intent context vector c^I.
5. The method according to Claim 1, characterized in that the step of employing a slot gate g to join the slot context vector c_i^S and the intent context vector c^I, and generating a transformed representation of the slot label model y_i^S through the slot gate g includes:
formally representing the slot gate g as g = v · tanh(c_i^S + W · c^I), wherein v represents a weight vector obtained by training, and W represents a weight matrix obtained by training; and formally representing the transformation of the slot label model y_i^S through the slot gate g as y_i^S = softmax(W_hy^S · (h_i + c_i^S · g)).
6. The method according to Claim 1, characterized in that the target function constructed by jointly optimizing the intent prediction model y^I and the transformed slot label model y_i^S is:
p(y^S, y^I | X) = p(y^I | X) · Π_{i=1}^{T} p(y_i^S | X), wherein p(y^S, y^I | X) represents a conditional probability for outputting slot filling and intent prediction at a given original term sequence, where X is the vectorized original term sequence.
7. The method according to Claim 6, characterized in that the step of performing intent recognition on the speech question of the user based on the target function includes:
sequentially obtaining intent conditional probabilities, to which the various segmented terms in the original term sequence correspond, through the target function; and screening therefrom a segmented term with the maximum probability value and recognizing the segmented term as the intent of the speech question of the user.
8. A human-machine interactive speech recognizing system for an intelligent device, characterized in comprising:
a term segmentation processing unit, for subjecting a speech question of a user to a term-segmenting process to obtain an original term sequence, and vectorizing the original term sequence through an embedding process;
a first calculating unit, for calculating a hidden state vector h_i and a slot context vector c_i^S of each term segmentation vector, and weighting the hidden state vector h_i and the slot context vector c_i^S to thereafter obtain a slot label model y_i^S;
a second calculating unit, for calculating a hidden state vector h_T and an intent context vector c^I of the vectorized original term sequence, and weighting the hidden state vector h_T and the intent context vector c^I to thereafter obtain an intent prediction model y^I;
a model transforming unit, for employing a slot gate g to join the slot context vector c_i^S and the intent context vector c^I, and generating a transformed representation of the slot label model y_i^S through the slot gate g; and
a joint optimization unit, for jointly optimizing the intent prediction model y^I and the transformed slot label model y_i^S to construct a target function, and performing intent recognition on the speech question of the user based on the target function.
9. The system according to Claim 8, characterized in that the term segmentation processing unit includes:
a term-segmenting module, for receiving the speech question of the user and transforming the speech question to a recognizable text, and employing a tokenizer to term-segment the recognizable text and obtain the original term sequence; and an embedding processing module, for subjecting the original term sequence to a word embedding process, and realizing a vector representation of each segmented term in the original term sequence.
10. The system according to Claim 8, characterized in that the first calculating unit includes:
a hidden state calculating module, for employing a bidirectional LSTM network to encode each term segmentation vector, and outputting the hidden state vector h_i corresponding to each term segmentation vector;
a slot context calculating module, for calculating the slot context vector c_i^S, to which each term segmentation vector corresponds, through formula c_i^S = Σ_j α_{i,j}^S · h_j, wherein α_{i,j}^S represents an attention weight of a slot, its calculation formula is α_{i,j}^S = exp(e_{i,j}) / Σ_{k=1}^{T} exp(e_{i,k}), e_{i,j} = σ(W_he^S · h_j), where σ represents a slot activation function, and W_he^S represents a slot weight matrix; and
a slot label model module, for constructing a slot label model y_i^S = softmax(W_hy^S · (h_i + c_i^S)) based on the hidden state vector h_i and the slot context vector c_i^S.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910002748.8 | 2019-01-02 | ||
CN201910002748.8A CN109785833A (en) | 2019-01-02 | 2019-01-02 | Human-computer interaction audio recognition method and system for smart machine |
PCT/CN2019/106778 WO2020140487A1 (en) | 2019-01-02 | 2019-09-19 | Speech recognition method for human-machine interaction of smart apparatus, and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CA3166784A1 true CA3166784A1 (en) | 2020-07-09 |
Family
ID=66499837
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA3166784A Pending CA3166784A1 (en) | 2019-01-02 | 2019-09-19 | Human-machine interactive speech recognizing method and system for intelligent devices |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN109785833A (en) |
CA (1) | CA3166784A1 (en) |
WO (1) | WO2020140487A1 (en) |
Families Citing this family (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109785833A (en) * | 2019-01-02 | 2019-05-21 | 苏宁易购集团股份有限公司 | Human-computer interaction audio recognition method and system for smart machine |
CN110532355B (en) * | 2019-08-27 | 2022-07-01 | 华侨大学 | Intention and slot position joint identification method based on multitask learning |
CN110750628A (en) * | 2019-09-09 | 2020-02-04 | 深圳壹账通智能科技有限公司 | Session information interaction processing method and device, computer equipment and storage medium |
CN110795532A (en) * | 2019-10-18 | 2020-02-14 | 珠海格力电器股份有限公司 | Voice information processing method and device, intelligent terminal and storage medium |
CN110853626B (en) * | 2019-10-21 | 2021-04-20 | 成都信息工程大学 | Bidirectional attention neural network-based dialogue understanding method, device and equipment |
CN110827816A (en) * | 2019-11-08 | 2020-02-21 | 杭州依图医疗技术有限公司 | Voice instruction recognition method and device, electronic equipment and storage medium |
CN111090728B (en) * | 2019-12-13 | 2023-05-26 | 车智互联(北京)科技有限公司 | Dialogue state tracking method and device and computing equipment |
CN111062209A (en) * | 2019-12-16 | 2020-04-24 | 苏州思必驰信息科技有限公司 | Natural language processing model training method and natural language processing model |
CN111177381A (en) * | 2019-12-21 | 2020-05-19 | 深圳市傲立科技有限公司 | Slot filling and intention detection joint modeling method based on context vector feedback |
WO2021140447A1 (en) * | 2020-01-06 | 2021-07-15 | 7Hugs Labs | System and method for controlling a plurality of devices |
CN111339770B (en) * | 2020-02-18 | 2023-07-21 | 百度在线网络技术(北京)有限公司 | Method and device for outputting information |
CN111833849A (en) * | 2020-03-10 | 2020-10-27 | 北京嘀嘀无限科技发展有限公司 | Method for speech recognition and speech model training, storage medium and electronic device |
CN113505591A (en) * | 2020-03-23 | 2021-10-15 | 华为技术有限公司 | Slot position identification method and electronic equipment |
CN111597342B (en) * | 2020-05-22 | 2024-01-26 | 北京慧闻科技(集团)有限公司 | Multitasking intention classification method, device, equipment and storage medium |
CN113779975B (en) * | 2020-06-10 | 2024-03-01 | 北京猎户星空科技有限公司 | Semantic recognition method, device, equipment and medium |
CN112069828B (en) * | 2020-07-31 | 2023-07-04 | 飞诺门阵(北京)科技有限公司 | Text intention recognition method and device |
CN112800190B (en) * | 2020-11-11 | 2022-06-10 | 重庆邮电大学 | Intent recognition and slot value filling joint prediction method based on Bert model |
CN112765959B (en) * | 2020-12-31 | 2024-05-28 | 康佳集团股份有限公司 | Intention recognition method, device, equipment and computer readable storage medium |
CN114969339B (en) * | 2022-05-30 | 2023-05-12 | 中电金信软件有限公司 | Text matching method and device, electronic equipment and readable storage medium |
CN115358186B (en) * | 2022-08-31 | 2023-11-14 | 南京擎盾信息科技有限公司 | Generating method and device of slot label and storage medium |
CN115273849B (en) * | 2022-09-27 | 2022-12-27 | 北京宝兰德软件股份有限公司 | Intention identification method and device for audio data |
CN117151121B (en) * | 2023-10-26 | 2024-01-12 | 安徽农业大学 | Multi-intention spoken language understanding method based on fluctuation threshold and segmentation |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10319375B2 (en) * | 2016-12-28 | 2019-06-11 | Amazon Technologies, Inc. | Audio message extraction |
CN107491541B (en) * | 2017-08-24 | 2021-03-02 | 北京丁牛科技有限公司 | Text classification method and device |
CN108415923B (en) * | 2017-10-18 | 2020-12-11 | 北京邮电大学 | Intelligent man-machine conversation system of closed domain |
CN108417205B (en) * | 2018-01-19 | 2020-12-18 | 苏州思必驰信息科技有限公司 | Semantic understanding training method and system |
CN108876527A (en) * | 2018-06-06 | 2018-11-23 | 北京京东尚科信息技术有限公司 | Method of servicing and service unit, using open platform and storage medium |
CN108874782B (en) * | 2018-06-29 | 2019-04-26 | 北京寻领科技有限公司 | A kind of more wheel dialogue management methods of level attention LSTM and knowledge mapping |
CN109065053B (en) * | 2018-08-20 | 2020-05-15 | 百度在线网络技术(北京)有限公司 | Method and apparatus for processing information |
CN109785833A (en) * | 2019-01-02 | 2019-05-21 | 苏宁易购集团股份有限公司 | Human-computer interaction audio recognition method and system for smart machine |
-
2019
- 2019-01-02 CN CN201910002748.8A patent/CN109785833A/en not_active Withdrawn
- 2019-09-19 CA CA3166784A patent/CA3166784A1/en active Pending
- 2019-09-19 WO PCT/CN2019/106778 patent/WO2020140487A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2020140487A1 (en) | 2020-07-09 |
CN109785833A (en) | 2019-05-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA3166784A1 (en) | Human-machine interactive speech recognizing method and system for intelligent devices | |
US10373610B2 (en) | Systems and methods for automatic unit selection and target decomposition for sequence labelling | |
TWI530940B (en) | Method and apparatus for acoustic model training | |
CN111738251B (en) | Optical character recognition method and device fused with language model and electronic equipment | |
US20220351487A1 (en) | Image Description Method and Apparatus, Computing Device, and Storage Medium | |
CN112100349A (en) | Multi-turn dialogue method and device, electronic equipment and storage medium | |
CN111916067A (en) | Training method and device of voice recognition model, electronic equipment and storage medium | |
CN108710704B (en) | Method and device for determining conversation state, electronic equipment and storage medium | |
CN113011186B (en) | Named entity recognition method, named entity recognition device, named entity recognition equipment and computer readable storage medium | |
CN106202056B (en) | Chinese word segmentation scene library update method and system | |
CN112992125B (en) | Voice recognition method and device, electronic equipment and readable storage medium | |
CN115617955B (en) | Hierarchical prediction model training method, punctuation symbol recovery method and device | |
CN116861995A (en) | Training of multi-mode pre-training model and multi-mode data processing method and device | |
CN114913590B (en) | Data emotion recognition method, device and equipment and readable storage medium | |
CN115100582B (en) | Model training method and device based on multi-mode data | |
CN113609284A (en) | Method and device for automatically generating text abstract fused with multivariate semantics | |
CN116259075A (en) | Pedestrian attribute identification method based on prompt fine tuning pre-training large model | |
CN114387537A (en) | Video question-answering method based on description text | |
CN113113024A (en) | Voice recognition method and device, electronic equipment and storage medium | |
CN116341651A (en) | Entity recognition model training method and device, electronic equipment and storage medium | |
CN114860938A (en) | Statement intention identification method and electronic equipment | |
CN113870863A (en) | Voiceprint recognition method and device, storage medium and electronic equipment | |
CN116522905B (en) | Text error correction method, apparatus, device, readable storage medium, and program product | |
CN115408494A (en) | Text matching method integrating multi-head attention alignment | |
US11321527B1 (en) | Effective classification of data based on curated features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request |
Effective date: 20220704 |