WO2023168838A1 - Sentence text recognition method and device, storage medium, and electronic device - Google Patents

Sentence text recognition method and device, storage medium, and electronic device

Info

Publication number
WO2023168838A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
ingredient
component
text
label
Prior art date
Application number
PCT/CN2022/096405
Other languages
English (en)
French (fr)
Inventor
刘建国
王迪
李昱涧
Original Assignee
青岛海尔科技有限公司
海尔智家股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 青岛海尔科技有限公司, 海尔智家股份有限公司 filed Critical 青岛海尔科技有限公司
Publication of WO2023168838A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Definitions

  • the present disclosure relates to the field of smart home technology, and specifically to a sentence text recognition method and device, a storage medium and an electronic device.
  • In the field of NLP (Natural Language Processing), it is often necessary to identify the intention expressed by data accurately and efficiently.
  • In the related art, recognition is typically performed by feeding training data into a pre-built prediction model and taking the output result as the intention expressed by the training data.
  • With this approach, the accuracy and rationality of the recognition model have a decisive impact on predicting the intention of the training data.
  • In addition, the recognition model does not take the training data itself into account when predicting the intention, which may cause the intention recognized by the model to differ greatly from the intention actually expressed by the training data.
  • Embodiments of the present disclosure provide a sentence text recognition method and device, a storage medium, and an electronic device, so as to at least solve the problem in the related art that the accuracy of identifying the intention expressed by sentence text is low.
  • According to an embodiment of the present disclosure, a sentence text recognition method is provided, including:
  • acquiring the sentence text collected by a smart device as the target sentence text to be recognized;
  • recognizing the target sentence text through a target component recognition model to obtain target component features corresponding to the target sentence text, wherein the target component recognition model is obtained by training an initial component recognition model with a first text sample annotated with component features, and the target component features are used to indicate the language components of the target sentence text;
  • identifying, according to the target component features and the target sentence text, a target intention feature corresponding to the target sentence text, wherein the target intention feature is used to indicate the operation intention of the target sentence text for the smart device.
  • In an exemplary embodiment, recognizing the target sentence text through the target component recognition model includes:
  • inputting the target sentence text into a component label recognition layer included in the target component recognition model to obtain a plurality of target characters output by the component label recognition layer, a component label corresponding to each target character, and a component label probability corresponding to each component label, wherein the target sentence text includes the plurality of target characters, the component label is used to indicate the language component to which the corresponding target character is allowed to belong, and the component label probability is used to indicate the probability that the corresponding target character belongs to the corresponding component label;
  • inputting the plurality of target characters, the component label corresponding to each target character, and the component label probability corresponding to each component label into a component label determination layer to obtain, as a component recognition result, a plurality of target component labels output by the component label determination layer in one-to-one correspondence with the plurality of target characters.
  • In an exemplary embodiment, inputting the target sentence text into the component label recognition layer included in the target component recognition model to obtain the plurality of target characters output by the component label recognition layer, the component label corresponding to each target character, and the component label probability corresponding to each component label includes:
  • inputting the target sentence text into a preprocessing network included in the component label recognition layer to obtain a plurality of word vectors output by the preprocessing network in one-to-one correspondence with the plurality of target characters;
  • inputting the plurality of word vectors into a component label recognition network included in the component label recognition layer to obtain the plurality of word vectors output by the component label recognition network, the component label corresponding to each word vector, and the component label probability corresponding to each component label.
  • In an exemplary embodiment, inputting the plurality of target characters, the component label corresponding to each target character, and the component label probability corresponding to each component label into the component label determination layer to obtain, as the component recognition result, the plurality of target component labels in one-to-one correspondence with the plurality of target characters includes:
  • selecting, through the component label determination layer, candidate component labels that satisfy a target constraint condition from the component labels corresponding to each target character, wherein the target constraint condition is a constraint on the language components in a sentence;
  • obtaining, through the component label determination layer, from the candidate component labels a component label whose component label probability satisfies a target probability condition as the target component label corresponding to each target character, so as to obtain, as the component recognition result, the plurality of target component labels in one-to-one correspondence with the plurality of target characters.
  • In an exemplary embodiment, identifying, according to the target component features and the target sentence text, the target intention feature corresponding to the target sentence text includes:
  • recognizing, through a target intention recognition model, the target sentence text carrying the target component features, wherein the target intention recognition model is obtained by training an initial intention recognition model with a second text sample that is annotated with intention features and carries component features;
  • acquiring the intention recognition result output by the target intention recognition model as the target intention feature.
  • In an exemplary embodiment, recognizing, through the target intention recognition model, the target sentence text carrying the target component features includes:
  • inputting the target sentence text into a target entity recognition model to obtain target entity features output by the target entity recognition model, wherein the target entity features are used to indicate the entities included in the target sentence text, and the target entity recognition model is obtained by training an initial entity recognition model with a third text sample annotated with entity features;
  • inputting the target component features and the target entity features into the target intention recognition model to obtain the intention recognition result output by the target intention recognition model.
  • In an exemplary embodiment, before the target sentence text is input into the target entity recognition model, the method further includes:
  • acquiring the third text sample annotated with entity features, wherein the entity features are used to characterize operation information of a control operation performed on the smart device;
  • training the initial entity recognition model with the third text sample annotated with the entity features to obtain the target entity recognition model.
  • According to another embodiment of the present disclosure, a sentence text recognition device is also provided, including:
  • a first acquisition module, configured to acquire the sentence text collected by a smart device as the target sentence text to be recognized;
  • a first recognition module, configured to recognize the target sentence text through a target component recognition model to obtain target component features corresponding to the target sentence text, wherein the target component recognition model is obtained by training an initial component recognition model with a first text sample annotated with component features, and the target component features are used to indicate the language components of the target sentence text;
  • a second recognition module, configured to identify, according to the target component features and the target sentence text, a target intention feature corresponding to the target sentence text, wherein the target intention feature is used to indicate the operation intention of the target sentence text for the smart device.
  • According to yet another aspect of the embodiments of the present disclosure, a computer-readable storage medium is also provided; the storage medium stores a computer program, and the computer program is configured to execute the above sentence text recognition method when running.
  • According to yet another aspect of the embodiments of the present disclosure, an electronic device is also provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the above sentence text recognition method through the computer program.
  • Through the present disclosure, the sentence text collected by the smart device is acquired as the target sentence text to be recognized; the target sentence text is recognized through the target component recognition model to obtain the target component features corresponding to the target sentence text, wherein the target component recognition model is obtained by training the initial component recognition model with the first text sample annotated with component features, and the target component features are used to indicate the language components of the target sentence text; and the target intention feature corresponding to the target sentence text is identified according to the target component features and the target sentence text, wherein the target intention feature is used to indicate the operation intention of the target sentence text for the smart device.
  • That is, if the sentence text collected by the smart device is acquired as the target sentence text to be recognized, the language components of the target sentence text can be identified through the target component recognition model as the target component features, and the target intention feature is identified by combining the target sentence text with the language components it contains, thereby achieving accurate identification of the operation intention of the target sentence text for the smart device.
  • Figure 1 is a schematic diagram of the hardware environment of a sentence text recognition method according to an embodiment of the present disclosure
  • Figure 2 is a flow chart of a sentence text recognition method according to an embodiment of the present disclosure
  • Figure 3 is a flow chart for identifying component features corresponding to sentence text through a target component recognition model according to an embodiment of the present disclosure
  • Figure 4 is a flow chart for identifying component features of sentence text according to an embodiment of the present disclosure
  • Figure 5 is an architectural diagram of an optional BiLSTM model according to an embodiment of the present disclosure.
  • Figure 6 is an optional model architecture diagram for identifying the intention of a target statement according to an embodiment of the present disclosure
  • Figure 7 is a schematic diagram of identifying language components of sentence text according to an embodiment of the present disclosure.
  • Figure 8 is an optional model architecture diagram for identifying language components of a target sentence according to an embodiment of the present disclosure
  • Figure 9 is a schematic diagram of a scene of voice interaction between a user and a smart speaker according to an embodiment of the present disclosure.
  • Figure 10 is a schematic diagram of a scene of voice interaction between a user and a smart TV according to an embodiment of the present disclosure
  • Figure 11 is a structural block diagram of a sentence text recognition device according to an embodiment of the present disclosure.
  • a method for recognizing sentence text is provided.
  • This sentence text recognition method is widely applied in whole-house intelligent digital control application scenarios such as the smart family (Smart Home), smart home, smart home device ecosystem, and smart residence (Intelligence House) ecosystem.
  • the above sentence text recognition method can be applied to the hardware environment composed of the terminal device 102 and the server 104 as shown in FIG. 1 .
  • Figure 1 is a schematic diagram of the hardware environment of a sentence text recognition method according to an embodiment of the present disclosure.
  • the server 104 is connected to the terminal device 102 through a network and may be configured to provide services (such as application services) for the terminal or a client installed on the terminal.
  • a database may be set up on the server or independently of the server to provide data storage services for the server 104, and cloud computing and/or edge computing services may be configured on the server or independently of the server to provide data computing services for the server 104.
  • the above-mentioned network may include but is not limited to at least one of the following: wired network, wireless network.
  • the above-mentioned wired network may include but is not limited to at least one of the following: wide area network, metropolitan area network, and local area network.
  • the above-mentioned wireless network may include at least one of the following: WIFI (Wireless Fidelity, Wireless Fidelity), Bluetooth.
  • the terminal device 102 may be, but is not limited to, a PC (Personal Computer), a mobile phone, a tablet, a smart air conditioner, a smart hood, a smart refrigerator, a smart oven, a smart stove, a smart washing machine, a smart water heater, a smart washing equipment, a smart Dishwasher, smart projection equipment, smart TV, smart clothes drying rack, smart curtains, smart audio and video, smart sockets, smart audio, smart speakers, smart fresh air equipment, smart kitchen and bathroom equipment, smart bathroom equipment, smart sweeping robot, smart window cleaning Robots, smart mopping robots, smart air purification equipment, smart steamers, smart microwave ovens, smart kitchen appliances, smart purifiers, smart water dispensers, smart door locks, etc.
  • FIG. 2 is a flow chart of a sentence text recognition method according to an embodiment of the present disclosure. As shown in Figure 2, the process includes the following steps:
  • Step S202 Obtain the sentence text collected by the smart device as the target sentence text to be recognized;
  • Step S204 Recognize the target sentence text through a target component recognition model to obtain target component features corresponding to the target sentence text, wherein the target component recognition model is obtained by training an initial component recognition model with a first text sample annotated with component features, and the target component features are used to indicate the language components of the target sentence text;
  • Step S206 Identify, according to the target component features and the target sentence text, the target intention feature corresponding to the target sentence text, wherein the target intention feature is used to indicate the operation intention of the target sentence text for the smart device.
  • Through the above steps, the language components of the target sentence text can be identified as the target component features through the target component recognition model, and the target sentence text and the language components it contains are combined to identify the target intention features, achieving accurate identification of the operation intention of the target sentence text for the smart device.
  • the smart device can, but is not limited to, convert a voice command issued by the user into the corresponding sentence text, or convert text content input by the user on the smart device into the corresponding sentence text, and so on. In this way, the language content that the user wants to express can be obtained in a variety of ways, which makes it convenient for the user to operate in multiple ways and improves the user's operating experience.
  • smart devices may, but are not limited to, include devices that support voice interaction with users and perform corresponding operations according to user instructions, etc.
  • smart devices may include, but are not limited to, smart air conditioners, smart Hood, smart refrigerator, smart oven, smart stove, smart washing machine, smart water heater, smart washing equipment, smart dishwasher, smart projection equipment, smart TV, smart clothes drying rack, smart curtains, smart sockets, smart audio, smart speakers, Smart fresh air equipment, smart kitchen and bathroom equipment, smart bathroom equipment, smart sweeping robot, smart window cleaning robot, smart mopping robot, smart air purification equipment, smart steamer, smart microwave oven, smart kitchen treasure, smart purifier, smart water dispenser , smart door locks, smart car air conditioners, smart wipers, smart car speakers, smart car refrigerators, etc.
  • the target component recognition model can be, but is not limited to, used to identify the language components of sentence text as the component features corresponding to the sentence text.
  • Figure 3 is a flow chart of identifying the component features corresponding to sentence text through the target component recognition model according to an embodiment of the present disclosure. As shown in Figure 3, the process can, but is not limited to, include the following steps:
  • Step S301 input the target sentence text into the target component recognition model
  • Step S302 The target component recognition model identifies the language components of the target sentence text
  • Step S303 Output the language components of the target sentence text identified by the target component recognition model as target component features.
  • the language components of the target sentence text may include, but are not limited to, at least one of the following: subject, predicate, object, attributive, adverbial, complement, subject clause, predicate clause, object clause, attributive Clauses, adverbial clauses, complement clauses, etc., by identifying the linguistic components of the sentence text, make full use of the information contained in the sentence text, and improve the accuracy of identifying the sentence text.
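  • For illustration only, the following is a minimal sketch of what an annotated training sample might look like, assuming a BIO-style tagging scheme built from the SUB/PRE/OBJ component abbreviations mentioned later in this document; the example sentence and labels are hypothetical and are not taken from the patent.

```python
# Illustrative only: a hypothetical annotated sample using SUB/PRE/OBJ component
# abbreviations in a BIO-style scheme (B- starts a component, I- continues it).
sample = {
    "text": ["turn", "on", "the", "living", "room", "light"],
    "labels": ["B-PRE", "I-PRE", "B-OBJ", "I-OBJ", "I-OBJ", "I-OBJ"],
}

# A "first text sample" for training is simply a collection of such annotated sentences.
first_text_sample = [sample]
```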
  • the first text sample may be obtained in the following manner, but is not limited to: obtaining initial text samples; annotating the language components of each text sample in the initial text samples to obtain the first text sample annotated with component features.
  • the target sentence text may be recognized in the following manner, but is not limited to: the target sentence text is input into the component label recognition layer included in the target component recognition model to obtain the multiple target characters output by the component label recognition layer, the component label corresponding to each target character, and the component label probability corresponding to each component label, wherein the target sentence text includes the multiple target characters, the component label is used to indicate the language component to which the corresponding target character is allowed to belong, and the component label probability is used to indicate the probability that the corresponding target character belongs to the corresponding component label;
  • the multiple target characters, the component label corresponding to each target character, and the component label probability corresponding to each component label are input into the component label determination layer to obtain, as the component recognition result, the multiple target component labels output by the component label determination layer in one-to-one correspondence with the multiple target characters.
  • each target character in the target sentence text may, but is not limited to, correspond to one or more component labels, and each component label has a component label probability corresponding to it.
  • based on the output of the component label recognition layer, the component label determination layer outputs one component label in one-to-one correspondence with each target character.
  • the component features of the target sentence text can be identified through, but are not limited to, the component label recognition layer and the component label determination layer included in the target component recognition model.
  • Figure 4 is a flow chart of identifying the component features of sentence text according to an embodiment of the present disclosure. As shown in Figure 4, the process can, but is not limited to, include the following steps:
  • Step S401 Input the target sentence text into the component label recognition layer;
  • Step S402 The component label recognition layer identifies and outputs the multiple target characters in the target sentence text, the component labels that each target character may correspond to, and the component label probability corresponding to each component label;
  • Step S403 Input the multiple target characters, the component label corresponding to each target character, and the component label probability corresponding to each component label into the component label determination layer;
  • Step S404 The component label determination layer outputs a target component label corresponding to each target character.
  • the multiple target characters output by the component label recognition layer, the component label corresponding to each target character, and the component label probability corresponding to each component label can be obtained in the following manner: the target sentence text is input into the preprocessing network included in the component label recognition layer to obtain multiple word vectors output by the preprocessing network in one-to-one correspondence with the multiple target characters; the multiple word vectors are input into the component label recognition network included in the component label recognition layer to obtain the multiple word vectors output by the component label recognition network, the component label corresponding to each word vector, and the component label probability corresponding to each component label.
  • the preprocessing network may be, but is not limited to, used to convert each target character in the target sentence text into a word vector corresponding to that character.
  • the preprocessing network may, but is not limited to, include a network with a BERT (Bidirectional Encoder Representations from Transformers) model architecture, or a network with a Roberta (A Robustly Optimized BERT Pretraining Approach) model architecture, and so on.
  • the Roberta model has a relatively strong ability to obtain dynamic word vectors and optimizes the network structure in terms of model details, training strategy, and data; it can quickly and accurately convert each target character of the target sentence text into the corresponding word vector, which saves the time of converting text into word vectors and improves the efficiency of that conversion.
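  • As a minimal sketch of this preprocessing step, the snippet below obtains one contextual word vector per character; it assumes the HuggingFace transformers library, and the checkpoint name and example sentence are only illustrative assumptions, not details specified by this document.

```python
# Sketch: one contextual vector per character from a RoBERTa-style encoder
# (HuggingFace transformers assumed; checkpoint name is only an example).
import torch
from transformers import AutoTokenizer, AutoModel

checkpoint = "hfl/chinese-roberta-wwm-ext"  # example RoBERTa-style checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
encoder = AutoModel.from_pretrained(checkpoint)

sentence = "把空调温度调高一点"  # "turn the air conditioner temperature up a bit"
# Chinese BERT/RoBERTa tokenizers split the text character by character, so each
# target character maps to one token (plus the [CLS]/[SEP] special tokens).
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = encoder(**inputs)

# One contextual word vector per token; this sequence is what the component
# label recognition network (e.g., a BiLSTM) consumes.
word_vectors = outputs.last_hidden_state  # shape: (1, seq_len, hidden_size)
```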
  • the component label recognition network may be, but is not limited to, used to predict the component labels corresponding to the multiple input word vectors and the component label probability corresponding to each component label.
  • the component label recognition network may, but is not limited to, include a network with an LSTM (Long Short-Term Memory) model architecture, or a network with a BiLSTM (Bi-directional Long Short-Term Memory) model architecture, and so on. Figure 5 is an architecture diagram of an optional BiLSTM model according to an embodiment of the present disclosure. As shown in Figure 5, when the BiLSTM model predicts the component labels corresponding to "EU rejects German call", it can, but is not limited to, use forward prediction and backward prediction and concatenate the forward and backward results, predicting "EU" as the "B-SUB" label (SUB denotes the subject), "rejects" as "B-PRE" (PRE denotes the predicate), "German" as "B-ATT" (ATT denotes the attributive), and "call" as "B-OBJ" (OBJ denotes the object).
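  • A minimal PyTorch sketch of such a component label recognition network is shown below: a bidirectional LSTM over the word vectors followed by a linear layer that scores every component label for every character. All dimensions and the label count are illustrative assumptions.

```python
# Sketch of a BiLSTM-based component label recognition network (PyTorch).
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, input_dim=768, hidden_dim=256, num_labels=13):
        super().__init__()
        # Forward and backward passes over the sequence, concatenated per step.
        self.bilstm = nn.LSTM(input_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        self.label_scorer = nn.Linear(2 * hidden_dim, num_labels)

    def forward(self, word_vectors):               # (batch, seq_len, input_dim)
        hidden_states, _ = self.bilstm(word_vectors)
        # Unnormalized score of every component label for every character.
        return self.label_scorer(hidden_states)    # (batch, seq_len, num_labels)

# Example: score a batch of word vectors produced by the preprocessing network.
tagger = BiLSTMTagger()
scores = tagger(torch.randn(1, 9, 768))
```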
  • the component label probabilities may, but are not limited to, include unnormalized probabilities (that is, an individual component label probability may, but is not limited to, be greater than 1, or greater than or equal to 0 and less than or equal to 1), or normalized probabilities (that is, every component label probability may, but is not limited to, be greater than or equal to 0 and less than or equal to 1), and so on.
  • the component label recognition layer may, but is not limited to, include a component label recognition network and a component label probability normalization network, or only a component label recognition network; the component label probability normalization network may, but is not limited to, be used to normalize the component label probabilities output by the component label recognition network.
  • the component label probability normalization network can include, but is not limited to, a network with a Softmax (classification) model architecture, and so on.
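  • The small sketch below illustrates the normalization performed by a Softmax layer: unnormalized label scores for one character are turned into probabilities that sum to 1. The numbers are arbitrary examples.

```python
# Sketch: softmax normalization of per-character component label scores.
import torch
import torch.nn.functional as F

logits = torch.tensor([2.1, 0.3, -1.0, 0.8])   # one unnormalized score per label
probabilities = F.softmax(logits, dim=-1)      # ≈ [0.67, 0.11, 0.03, 0.18]
print(probabilities.sum())                     # ≈ 1.0
```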
  • the component recognition result may be obtained in the following manner, but is not limited to: candidate component labels that satisfy the target constraint condition are selected through the component label determination layer from the component labels corresponding to each target character, wherein the target constraint condition is a constraint on the language components in a sentence;
  • the component label determination layer obtains, from the candidate component labels, a component label whose component label probability satisfies the target probability condition as the target component label corresponding to each target character, so as to obtain, as the component recognition result, the multiple target component labels in one-to-one correspondence with the multiple target characters.
  • the component label determination layer may be, but is not limited to, used to output a target component label in one-to-one correspondence with each target character.
  • the component label determination layer may, but is not limited to, adopt a CRF (Conditional Random Field) model or the like; the CRF model can make full use of the information in the BiLSTM model, which improves the accuracy of the target component label output by the CRF for each target character.
  • the target constraints can be, but are not limited to, learned by the component label determination layer from the target sentence text, for example: the first word of a sentence should be a "B-" or "O-" label rather than an "I-" label; in a sequence such as "B-label1 I-label1 I-label2 ...", the labels should belong to the same component category; and "O I-label" is invalid, because the beginning of a component span should be "B-" rather than "I-".
  • by learning these constraints, the accuracy of the labels predicted by the component label recognition network can be improved.
  • the component label with the highest component label probability among the candidate component labels to which each target character may belong may be used as the target component label of that target character; alternatively, the set of component labels that maximizes the sum of the component label probabilities over the multiple target characters in the target sentence text may be used as the multiple target component labels, and so on.
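  • The snippet below is a simplified, per-character sketch of the determination step just described: candidate labels that violate a BIO-style constraint are filtered out, and the remaining label with the highest probability is kept. A real CRF-based determination layer makes this choice jointly over the whole sentence; the labels and probabilities here are illustrative.

```python
# Sketch: filter candidate labels by a BIO-style constraint, then keep the one
# whose probability is highest (a per-character simplification of a CRF).
def choose_labels(candidates):
    """candidates: per character, a dict mapping label -> probability."""
    chosen = []
    for label_probs in candidates:
        prev = chosen[-1] if chosen else "O"
        # An "I-" label is only allowed when it continues the previous category.
        allowed = {
            label: p for label, p in label_probs.items()
            if not (label.startswith("I-") and prev[2:] != label[2:])
        }
        chosen.append(max(allowed, key=allowed.get))
    return chosen

candidates = [
    {"B-SUB": 0.6, "I-OBJ": 0.4},
    {"I-SUB": 0.5, "B-PRE": 0.3, "O": 0.2},
]
print(choose_labels(candidates))   # ['B-SUB', 'I-SUB']
```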
  • the operation intention of the target sentence text for the smart device can be identified by, but is not limited to, combining the target component features and the target sentence text.
  • the text to be recognized may be a long sentence containing a clause, or a short sentence consisting of several language components; in either case, the information in the target sentence text can be fully utilized to accurately identify its meaning.
  • by combining the language components contained in the sentence text with the target sentence text itself, the target intention features of the target sentence text can be accurately identified, thereby improving the accuracy of identifying the operation intention of the target sentence text for the smart device.
  • the target intention feature corresponding to the target sentence text may be identified in the following manner, but is not limited to: the target sentence text carrying the target component features is recognized through a target intention recognition model, wherein the target intention recognition model is obtained by training an initial intention recognition model with a second text sample that is annotated with intention features and carries component features; and the intention recognition result output by the target intention recognition model is acquired as the target intention feature.
  • the target sentence text carrying the target component features output by the target component recognition model may be input into the target intention recognition model, and the operation intention for the smart device output by the target intention recognition model is taken as the target intention feature.
  • the target sentence text carrying the target component features can be recognized in the following manner, but is not limited to: the target sentence text is input into a target entity recognition model to obtain the target entity features output by the target entity recognition model, wherein the target entity features are used to indicate the entities included in the target sentence text, and the target entity recognition model is obtained by training an initial entity recognition model with a third text sample annotated with entity features; the target component features and the target entity features are input into the target intention recognition model to obtain the intention recognition result output by the target intention recognition model.
  • the entities included in the target sentence text recognized by the target entity recognition model may, but are not limited to, be used as the target entity features; the target component features and the target entity features are input into the target intention recognition model, and the intention recognition result output by the target intention recognition model is taken as the operation intention, corresponding to the target sentence text, for the smart device.
  • Figure 6 is an optional model architecture diagram for identifying the target sentence intention according to an embodiment of the present disclosure.
  • the operation intention of the target sentence for the smart device can be identified by, but is not limited to, combining the target component recognition model, the target entity recognition model, and the target intention recognition model, so that the language components of the target sentence text and the entities included in the target sentence text are combined for intent recognition; this makes full use of the information included in the target sentence text and improves the accuracy of identifying the operation intention of the target sentence text for the smart device.
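  • A minimal PyTorch sketch of this combination is given below: pooled component features and entity features are concatenated and fed to a small classifier that outputs an operation-intent distribution. The dimensions, pooling choice, and intent count are illustrative assumptions rather than details from the patent.

```python
# Sketch: combine component features and entity features for intent recognition.
import torch
import torch.nn as nn

class IntentRecognizer(nn.Module):
    def __init__(self, feature_dim=256, num_intents=8):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(2 * feature_dim, feature_dim),
            nn.ReLU(),
            nn.Linear(feature_dim, num_intents),
        )

    def forward(self, component_features, entity_features):
        # (batch, seq_len, dim) -> (batch, dim): average over the sentence.
        pooled = torch.cat([component_features.mean(dim=1),
                            entity_features.mean(dim=1)], dim=-1)
        return self.classifier(pooled)              # operation-intent logits

model = IntentRecognizer()
intent_logits = model(torch.randn(1, 9, 256), torch.randn(1, 9, 256))
```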
  • the target entity recognition model may be obtained in the following manner, but is not limited to: acquiring a third text sample annotated with entity features, wherein the entity features are used to characterize the operation information of a control operation performed on the smart device; and training the initial entity recognition model with the third text sample annotated with the entity features to obtain the target entity recognition model.
  • the operation information of a control operation performed on the smart device may, but is not limited to, include the operation time, operation location, operation manner, operated device, operating mode of the device, and so on.
  • Figure 7 is a schematic diagram of identifying language components of sentence text according to an embodiment of the present disclosure. As shown in Figure 7, the method may, but is not limited to, include the following steps:
  • Step S701 Collect and clean text data
  • Step S702 Determine the component labels to be annotated in the text data and their number.
  • the component labels may, but are not limited to, include at least one of the following: subject (SUB), predicate (PRE), object (OBJ), attributive (ATT), adverbial (ADV), complement (COM), subject clause, predicate clause, object clause, attributive clause, adverbial clause, complement clause, and the like;
  • Step S703 Label the language components of the text data. You can, but are not limited to, label the language components of each statement text in the text data according to the determined labeling rules to obtain sample data for model training;
  • Step S704 The sample data may be, but is not limited to, divided into a training set, a validation set, and a test set to obtain training data;
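  • A small sketch of such a split is shown below; the 8:1:1 ratio and the random seed are example choices, since the document does not specify them.

```python
# Sketch: shuffle annotated samples and split them into train/validation/test sets.
import random

def split_samples(samples, train=0.8, val=0.1, seed=42):
    samples = samples[:]
    random.Random(seed).shuffle(samples)
    n = len(samples)
    n_train, n_val = int(n * train), int(n * val)
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])

train_set, val_set, test_set = split_samples(list(range(1000)))
```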
  • Step S705 Input the training data into the Roberta pre-training model for vectorization.
  • the vectorization can be, but is not limited to, divided into three parts: input-ids (a tensor composed of the word ids (identifiers) in the input data), segment-ids (a tensor composed of the sentence ids in the input data), and input-mask (the mask of the input data).
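  • For illustration, the snippet below produces the three inputs named above with a BERT/RoBERTa-style tokenizer from the HuggingFace transformers library, where segment-ids are exposed as token_type_ids and input-mask as attention_mask; the library, checkpoint, and example sentence are assumptions.

```python
# Sketch: building input-ids, segment-ids, and input-mask for one sentence.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")  # example
encoded = tokenizer("打开客厅的灯", padding="max_length", max_length=16,
                    return_token_type_ids=True)

input_ids = encoded["input_ids"]           # word ids of the input characters
segment_ids = encoded["token_type_ids"]    # sentence ids (all 0 for one sentence)
input_mask = encoded["attention_mask"]     # 1 for real tokens, 0 for padding
```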
  • Step S706 The BiLSTM model predicts the component label corresponding to each word vector and the component label probability corresponding to each component label. The multiple word vectors output by the Roberta pre-training model can, but are not limited to, be used as the input of the BiLSTM model; the n-dimensional word vectors obtained as input are fed into each time step of the BiLSTM network to obtain the hidden state sequence of the BiLSTM layer.
  • the learnable parameters of the BiLSTM model can be, but are not limited to, updated using the BPTT (Back-Propagation Through Time) algorithm; the difference between this model and a general model in the forward and backward passes is that the hidden-layer computation must be carried out over all time steps.
  • Step S707 The Softmax layer normalizes the probability of each component label. The multiple word vectors output by the BiLSTM, the component label corresponding to each word vector, and the component label probability corresponding to each component label can, but are not limited to, be input into the logit layer, where the logit layer is the input of the Softmax layer; the Softmax layer outputs the multiple word vectors, the component labels corresponding to each word vector, and the normalized component label probability corresponding to each component label;
  • Step S708 The CRF layer outputs the predicted component label corresponding to each word vector. The multiple word vectors output by the Softmax layer, the component label corresponding to each word vector, and the normalized component label probability corresponding to each component label can, but are not limited to, be input into the CRF layer.
  • the CRF layer can add constraints to the final predicted component labels to ensure that the predicted labels are valid; these constraints are learned automatically from the training data set during the training of the CRF layer.
  • the CRF uses the output of the LSTM for the i-th tag at each time step t as a point function in its feature functions, which introduces non-linearity into the original CRF.
  • the overall model is still a large framework with the CRF as the main body, so that the information in the LSTM can be fully reused, and a globally optimal output sequence can finally be obtained.
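  • The following compact sketch illustrates what the CRF layer contributes at prediction time: a transition matrix scores label-to-label moves, and Viterbi decoding selects the label sequence that is best for the whole sentence rather than per character. The label set, emission scores, and transition values are illustrative, not learned values from the patent.

```python
# Sketch: Viterbi decoding over per-character label scores with a transition matrix.
import numpy as np

def viterbi(emissions, transitions):
    """emissions: (seq_len, num_labels) per-character label scores (e.g. from the
    BiLSTM/Softmax layers); transitions: (num_labels, num_labels) scores for
    moving from one label to the next."""
    seq_len, num_labels = emissions.shape
    score = emissions[0].copy()
    backptr = np.zeros((seq_len, num_labels), dtype=int)
    for t in range(1, seq_len):
        total = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    best = [int(score.argmax())]
    for t in range(seq_len - 1, 0, -1):
        best.append(int(backptr[t][best[-1]]))
    return best[::-1]

labels = ["O", "B-SUB", "I-SUB"]
emissions = np.array([[0.1, 0.8, 0.6],      # character 1
                      [0.2, 0.3, 0.9]])     # character 2
transitions = np.array([[0.0, 0.0, -9.0],   # O     -> I-SUB is forbidden
                        [0.0, 0.0, 1.0],    # B-SUB -> I-SUB is encouraged
                        [0.0, 0.0, 0.5]])
print([labels[i] for i in viterbi(emissions, transitions)])  # ['B-SUB', 'I-SUB']
```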
  • Step S709 Calculate the loss between the predicted component labels and the true component labels. After the loss with respect to the true labels of the training data is calculated, the iterations over each epoch are repeated and the parameters of the network nodes are continuously updated through the BPTT algorithm, so that the loss gradually decreases and the model finally converges; once the model is optimized, the loss can be kept small, and the trained model has a high accuracy on new data.
  • Step S710 When the loss is less than or equal to the loss threshold, model deployment is completed. In the overall process of sentence intent parsing, new data is passed into the model to obtain predicted labels, and precise identification of the intent is completed in combination with expert rules.
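  • A minimal training-loop sketch for steps S709 and S710 is given below, reusing the BiLSTM tagger sketch from earlier; the data loader, learning rate, epoch count, and loss threshold are placeholders, a per-character cross-entropy loss stands in for the CRF likelihood described above, and back-propagation through time is handled automatically by autograd when the loss is back-propagated through the unrolled LSTM.

```python
# Sketch: per-epoch training with a loss threshold as the stopping condition.
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(tagger.parameters(), lr=1e-3)  # tagger: earlier sketch
loss_threshold = 0.05                        # example value, not from the patent

for epoch in range(100):
    epoch_loss = 0.0
    for word_vectors, gold_labels in train_loader:          # placeholder loader
        scores = tagger(word_vectors)                        # (batch, seq, labels)
        loss = criterion(scores.flatten(0, 1), gold_labels.flatten())
        optimizer.zero_grad()
        loss.backward()                                      # BPTT through the LSTM
        optimizer.step()
        epoch_loss += loss.item()
    if epoch_loss / len(train_loader) <= loss_threshold:     # converged: deploy
        break
```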
  • Figure 8 is an optional model architecture diagram for identifying language components of a target sentence according to an embodiment of the present disclosure.
  • the above steps S701 to S710 can be, but are not limited to, used with the model architecture shown in Figure 8: high-precision component identification of sentence text can be completed by building sentence component labeling rules and a sentence component label prediction neural network model; the Roberta pre-training model can be, but is not limited to, used to complete the embedding of the input words, making word vectorization simple and efficient while the vectors contain richer information and meaning, which improves the accuracy of vectorization.
  • the state transition matrix of the CRF model is used to greatly improve the effectiveness of label prediction.
  • the model structure improves the training speed and prediction accuracy, and provides a new processing method in the field of intent recognition.
  • FIG 9 is a schematic diagram of a scene of voice interaction between a user and a smart speaker according to an embodiment of the present disclosure.
  • the user can, but is not limited to, utter the voice command "the sound is too loud" while the smart speaker is playing music, and the smart speaker can, but is not limited to, recognize that the user's operation intention for the smart speaker is to lower the playback volume and respond to the voice command accordingly.
  • FIG 10 is a schematic diagram of a scene of voice interaction between a user and a smart TV according to an embodiment of the present disclosure.
  • the smart TV can be, but is not limited to, in a power-off state. If the user expresses a voice command of "turn on the TV" , it can, but is not limited to, recognize that the user's intention to operate the smart TV is to control the smart TV to turn on, and then it can, but is not limited to, respond to the user's voice command to control the smart TV to turn on.
  • the shapes of the smart speakers and smart TVs are not limited.
  • a cylindrical smart speaker is exemplified.
  • a rectangular shape is taken as an example.
  • the shapes of smart speakers and smart TVs can be any shape that meets the production process and user needs, and this disclosure does not limit this.
  • the method according to the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform; of course, it can also be implemented by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present disclosure, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium (such as ROM/RAM, disk, CD), including several instructions to cause a terminal device (which can be a mobile phone, computer, server, or network device, etc.) to execute the methods of various embodiments of the present disclosure.
  • Figure 11 is a structural block diagram of a sentence text recognition device according to an embodiment of the present disclosure; as shown in Figure 11, it includes:
  • the first acquisition module 1102 is configured to acquire the sentence text collected by the smart device as the target sentence text to be recognized;
  • the first recognition module 1104 is configured to recognize the target sentence text through a target component recognition model to obtain the target component features corresponding to the target sentence text, wherein the target component recognition model is obtained by training an initial component recognition model with a first text sample annotated with component features, and the target component features are used to indicate the language components of the target sentence text;
  • the second recognition module 1106 is configured to identify, according to the target component features and the target sentence text, the target intention feature corresponding to the target sentence text, wherein the target intention feature is used to indicate the operation intention of the target sentence text for the smart device.
  • through the above modules, the language components of the target sentence text can be identified as the target component features through the target component recognition model, and the target sentence text and the language components it contains are combined to identify the target intention features, achieving accurate identification of the operation intention of the target sentence text for the smart device.
  • the first identification module includes:
  • the first recognition unit is configured to input the target sentence text into the component label recognition layer included in the target component recognition model to obtain the multiple target characters output by the component label recognition layer, the component label corresponding to each target character, and the component label probability corresponding to each component label, wherein the target sentence text includes the multiple target characters, the component label is used to indicate the language component to which the corresponding target character is allowed to belong, and the component label probability is used to indicate the probability that the corresponding target character belongs to the corresponding component label;
  • the processing unit is configured to input the multiple target characters, the component label corresponding to each target character, and the component label probability corresponding to each component label into the component label determination layer to obtain, as the component recognition result, the multiple target component labels output by the component label determination layer in one-to-one correspondence with the multiple target characters.
  • the first identification unit is configured as:
  • the processing unit is configured to:
  • the component label determination layer selects candidate component labels that satisfy the target constraint condition from the component labels corresponding to each target character, wherein the target constraint condition is a constraint on the language components in a sentence;
  • a component label whose component label probability satisfies the target probability condition is obtained from the candidate component labels as the target component label corresponding to each target character, so as to obtain, as the component recognition result, the multiple target component labels in one-to-one correspondence with the multiple target characters.
  • the second identification module includes:
  • the second recognition unit is configured to recognize, through a target intention recognition model, the target sentence text carrying the target component features, wherein the target intention recognition model is obtained by training an initial intention recognition model with a second text sample that is annotated with intention features and carries component features;
  • the acquisition unit is configured to acquire the intention recognition result output by the target intention recognition model as the target intention feature.
  • the second identification unit is set to:
  • input the target sentence text into a target entity recognition model to obtain the target entity features output by the target entity recognition model, wherein the target entity features are used to indicate the entities included in the target sentence text, and the target entity recognition model is obtained by training an initial entity recognition model with a third text sample annotated with entity features;
  • input the target component features and the target entity features into the target intention recognition model to obtain the intention recognition result output by the target intention recognition model.
  • the device further includes:
  • the second acquisition module is configured to acquire the third text sample annotated with entity features before the target sentence text is input into the target entity recognition model, wherein the entity features are used to characterize the operation information of a control operation performed on the smart device;
  • a training module configured to train the initial entity recognition model using the third text sample marked with the entity characteristics to obtain the target entity recognition model.
  • An embodiment of the present disclosure also provides a storage medium that includes a stored program, wherein the method of any of the above items is executed when the program is run.
  • the above-mentioned storage medium may be configured to store program code for performing the following steps:
  • S1 acquire the sentence text collected by the smart device as the target sentence text to be recognized;
  • S2 recognize the target sentence text through a target component recognition model to obtain the target component features corresponding to the target sentence text, wherein the target component recognition model is obtained by training an initial component recognition model with a first text sample annotated with component features, and the target component features are used to indicate the language components of the target sentence text;
  • S3 identify, according to the target component features and the target sentence text, the target intention feature corresponding to the target sentence text, wherein the target intention feature is used to indicate the operation intention of the target sentence text for the smart device.
  • Embodiments of the present disclosure also provide an electronic device, including a memory and a processor.
  • a computer program is stored in the memory, and the processor is configured to run the computer program to perform the steps in any of the above method embodiments.
  • the above-mentioned electronic device may further include a transmission device and an input-output device, wherein the transmission device is connected to the above-mentioned processor, and the input-output device is connected to the above-mentioned processor.
  • the above-mentioned processor may be configured to perform the following steps through a computer program:
  • S1 acquire the sentence text collected by the smart device as the target sentence text to be recognized;
  • S2 recognize the target sentence text through a target component recognition model to obtain the target component features corresponding to the target sentence text, wherein the target component recognition model is obtained by training an initial component recognition model with a first text sample annotated with component features, and the target component features are used to indicate the language components of the target sentence text;
  • S3 identify, according to the target component features and the target sentence text, the target intention feature corresponding to the target sentence text, wherein the target intention feature is used to indicate the operation intention of the target sentence text for the smart device.
  • the above storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a mobile hard disk, a magnetic disk, an optical disk, and other media that can store program code.
  • It will be apparent to those skilled in the art that the modules or steps of the present disclosure may be implemented by a general-purpose computing device; they may be centralized on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they may be implemented with program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device; in some cases, the steps shown or described may be performed in an order different from that described herein, or they may be made into individual integrated circuit modules, or multiple modules or steps among them may be made into a single integrated circuit module. As such, the present disclosure is not limited to any specific combination of hardware and software.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The present disclosure provides a sentence text recognition method and device, a storage medium, and an electronic device, relating to the field of smart home technology. The sentence text recognition method includes: acquiring the sentence text collected by a smart device as the target sentence text to be recognized; recognizing the target sentence text through a target component recognition model to obtain target component features corresponding to the target sentence text; and identifying, according to the target component features and the target sentence text, a target intention feature corresponding to the target sentence text, wherein the target intention feature is used to indicate the operation intention of the target sentence text for the smart device. The above technical solution solves the problem in the related art that the accuracy of recognizing the intention expressed by sentence text is low.

Description

Sentence text recognition method and device, storage medium, and electronic device
The present disclosure claims priority to the Chinese patent application filed with the China Patent Office on March 9, 2022, with application number 202210234269.0 and invention title "Sentence text recognition method and device, storage medium and electronic device", the entire contents of which are incorporated into the present disclosure by reference.
Technical Field
The present disclosure relates to the field of smart home technology, and in particular to a sentence text recognition method and device, a storage medium, and an electronic device.
Background
In the field of NLP (Natural Language Processing), it is often necessary to identify the intention expressed by data accurately and efficiently. In the existing art, training data is usually fed into a pre-built recognition model, and the prediction result output by the recognition model is taken as the intention expressed by the training data. With such an implementation, on the one hand, the accuracy and rationality of the recognition model have a decisive impact on predicting the intention of the training data; on the other hand, the recognition model does not take the training data into account when predicting the intention, which may cause the intention recognized by the model to differ greatly from the intention actually expressed by the training data.
No effective solution has yet been proposed for the problem in the related art that the accuracy of recognizing the intention expressed by sentence text is low.
Summary
Embodiments of the present disclosure provide a sentence text recognition method and device, a storage medium, and an electronic device, so as to at least solve the problem in the related art that the accuracy of recognizing the intention expressed by sentence text is low.
According to an embodiment of the present disclosure, a sentence text recognition method is provided, including:
acquiring the sentence text collected by a smart device as the target sentence text to be recognized;
recognizing the target sentence text through a target component recognition model to obtain target component features corresponding to the target sentence text, wherein the target component recognition model is obtained by training an initial component recognition model with a first text sample annotated with component features, and the target component features are used to indicate the language components of the target sentence text;
identifying, according to the target component features and the target sentence text, a target intention feature corresponding to the target sentence text, wherein the target intention feature is used to indicate the operation intention of the target sentence text for the smart device.
In an exemplary embodiment, recognizing the target sentence text through the target component recognition model includes:
inputting the target sentence text into a component label recognition layer included in the target component recognition model to obtain a plurality of target characters output by the component label recognition layer, a component label corresponding to each target character, and a component label probability corresponding to each component label, wherein the target sentence text includes the plurality of target characters, the component label is used to indicate the language component to which the corresponding target character is allowed to belong, and the component label probability is used to indicate the probability that the corresponding target character belongs to the corresponding component label;
inputting the plurality of target characters, the component label corresponding to each target character, and the component label probability corresponding to each component label into a component label determination layer to obtain, as a component recognition result, a plurality of target component labels output by the component label determination layer in one-to-one correspondence with the plurality of target characters.
In an exemplary embodiment, inputting the target sentence text into the component label recognition layer included in the target component recognition model to obtain the plurality of target characters output by the component label recognition layer, the component label corresponding to each target character, and the component label probability corresponding to each component label includes:
inputting the target sentence text into a preprocessing network included in the component label recognition layer to obtain a plurality of word vectors output by the preprocessing network in one-to-one correspondence with the plurality of target characters;
inputting the plurality of word vectors into a component label recognition network included in the component label recognition layer to obtain the plurality of word vectors output by the component label recognition network, the component label corresponding to each word vector, and the component label probability corresponding to each component label.
In an exemplary embodiment, inputting the plurality of target characters, the component label corresponding to each target character, and the component label probability corresponding to each component label into the component label determination layer to obtain, as the component recognition result, the plurality of target component labels output by the component label determination layer in one-to-one correspondence with the plurality of target characters includes:
selecting, through the component label determination layer, candidate component labels that satisfy a target constraint condition from the component labels corresponding to each target character, wherein the target constraint condition is a constraint on the language components in a sentence;
obtaining, through the component label determination layer, from the candidate component labels a component label whose component label probability satisfies a target probability condition as the target component label corresponding to each target character, so as to obtain, as the component recognition result, the plurality of target component labels in one-to-one correspondence with the plurality of target characters.
In an exemplary embodiment, identifying, according to the target component features and the target sentence text, the target intention feature corresponding to the target sentence text includes:
recognizing, through a target intention recognition model, the target sentence text carrying the target component features, wherein the target intention recognition model is obtained by training an initial intention recognition model with a second text sample that is annotated with intention features and carries component features;
acquiring the intention recognition result output by the target intention recognition model as the target intention feature.
In an exemplary embodiment, recognizing, through the target intention recognition model, the target sentence text carrying the target component features includes:
inputting the target sentence text into a target entity recognition model to obtain target entity features output by the target entity recognition model, wherein the target entity features are used to indicate the entities included in the target sentence text, and the target entity recognition model is obtained by training an initial entity recognition model with a third text sample annotated with entity features;
inputting the target component features and the target entity features into the target intention recognition model to obtain the intention recognition result output by the target intention recognition model.
In an exemplary embodiment, before the target sentence text is input into the target entity recognition model, the method further includes:
acquiring the third text sample annotated with entity features, wherein the entity features are used to characterize operation information of a control operation performed on the smart device;
training the initial entity recognition model with the third text sample annotated with the entity features to obtain the target entity recognition model.
According to another embodiment of the present disclosure, a sentence text recognition device is also provided, including:
a first acquisition module, configured to acquire the sentence text collected by a smart device as the target sentence text to be recognized;
a first recognition module, configured to recognize the target sentence text through a target component recognition model to obtain target component features corresponding to the target sentence text, wherein the target component recognition model is obtained by training an initial component recognition model with a first text sample annotated with component features, and the target component features are used to indicate the language components of the target sentence text;
a second recognition module, configured to identify, according to the target component features and the target sentence text, a target intention feature corresponding to the target sentence text, wherein the target intention feature is used to indicate the operation intention of the target sentence text for the smart device.
According to yet another aspect of the embodiments of the present disclosure, a computer-readable storage medium is also provided; the computer-readable storage medium stores a computer program, and the computer program is configured to execute the above sentence text recognition method when running.
According to yet another aspect of the embodiments of the present disclosure, an electronic device is also provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the above sentence text recognition method through the computer program.
Through the present disclosure, the sentence text collected by the smart device is acquired as the target sentence text to be recognized; the target sentence text is recognized through the target component recognition model to obtain the target component features corresponding to the target sentence text, wherein the target component recognition model is obtained by training the initial component recognition model with the first text sample annotated with component features, and the target component features are used to indicate the language components of the target sentence text; and the target intention feature corresponding to the target sentence text is identified according to the target component features and the target sentence text, wherein the target intention feature is used to indicate the operation intention of the target sentence text for the smart device. That is, if the sentence text collected by the smart device is acquired as the target sentence text to be recognized, the language components of the target sentence text can be identified through the target component recognition model as the target component features, and the target intention feature is identified by combining the target sentence text with the language components it contains, thereby achieving accurate identification of the operation intention of the target sentence text for the smart device. The above technical solution solves the problem in the related art that the accuracy of recognizing the intention expressed by sentence text is low, and achieves the technical effect of improving that accuracy.
Brief Description of the Drawings
The accompanying drawings described here are incorporated into and constitute a part of this specification; they illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the principles of the present disclosure.
In order to describe the technical solutions in the embodiments of the present disclosure or in the existing art more clearly, the accompanying drawings needed in the description of the embodiments or the existing art are briefly introduced below. Obviously, for those of ordinary skill in the art, other drawings can also be obtained from these drawings without creative effort.
Figure 1 is a schematic diagram of the hardware environment of a sentence text recognition method according to an embodiment of the present disclosure;
Figure 2 is a flow chart of a sentence text recognition method according to an embodiment of the present disclosure;
Figure 3 is a flow chart of identifying the component features corresponding to sentence text through a target component recognition model according to an embodiment of the present disclosure;
Figure 4 is a flow chart of identifying the component features of sentence text according to an embodiment of the present disclosure;
Figure 5 is an architecture diagram of an optional BiLSTM model according to an embodiment of the present disclosure;
Figure 6 is an optional model architecture diagram for identifying the intention of a target sentence according to an embodiment of the present disclosure;
Figure 7 is a schematic diagram of identifying the language components of sentence text according to an embodiment of the present disclosure;
Figure 8 is an optional model architecture diagram for identifying the language components of a target sentence according to an embodiment of the present disclosure;
Figure 9 is a schematic diagram of a scene of voice interaction between a user and a smart speaker according to an embodiment of the present disclosure;
Figure 10 is a schematic diagram of a scene of voice interaction between a user and a smart TV according to an embodiment of the present disclosure;
Figure 11 is a structural block diagram of a sentence text recognition device according to an embodiment of the present disclosure.
Detailed Description
In order to enable those skilled in the art to better understand the solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present disclosure. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present disclosure. Based on the embodiments in the present disclosure, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present disclosure.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of the present disclosure are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of the present disclosure described herein can be implemented in an order other than those illustrated or described herein. In addition, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units clearly listed, but may include other steps or units that are not clearly listed or that are inherent to the process, method, product, or device.
According to one aspect of the embodiments of the present disclosure, a sentence text recognition method is provided. The sentence text recognition method is widely applied in whole-house intelligent digital control application scenarios such as the smart family (Smart Home), smart home, smart home device ecosystem, and smart residence (Intelligence House) ecosystem. Optionally, in this embodiment, the above sentence text recognition method may be applied to the hardware environment composed of the terminal device 102 and the server 104 as shown in Figure 1. Figure 1 is a schematic diagram of the hardware environment of a sentence text recognition method according to an embodiment of the present disclosure. As shown in Figure 1, the server 104 is connected to the terminal device 102 through a network and may be configured to provide services (such as application services) for the terminal or a client installed on the terminal; a database may be set up on the server or independently of the server to provide data storage services for the server 104, and cloud computing and/or edge computing services may be configured on the server or independently of the server to provide data computing services for the server 104.
The above network may include, but is not limited to, at least one of the following: a wired network and a wireless network. The above wired network may include, but is not limited to, at least one of the following: a wide area network, a metropolitan area network, and a local area network; the above wireless network may include, but is not limited to, at least one of the following: WIFI (Wireless Fidelity) and Bluetooth. The terminal device 102 may be, but is not limited to, a PC (Personal Computer), a mobile phone, a tablet computer, a smart air conditioner, a smart range hood, a smart refrigerator, a smart oven, a smart stove, a smart washing machine, a smart water heater, smart washing equipment, a smart dishwasher, smart projection equipment, a smart TV, a smart clothes-drying rack, smart curtains, smart audio-video equipment, smart sockets, a smart sound system, a smart speaker, smart fresh-air equipment, smart kitchen and bathroom equipment, smart bathroom equipment, a smart sweeping robot, a smart window-cleaning robot, a smart mopping robot, smart air purification equipment, a smart steamer, a smart microwave oven, a smart kitchen water heater, a smart purifier, a smart water dispenser, a smart door lock, and the like.
This embodiment provides a sentence text recognition method applied to the above computer terminal. Figure 2 is a flow chart of a sentence text recognition method according to an embodiment of the present disclosure. As shown in Figure 2, the process includes the following steps:
Step S202: acquire the sentence text collected by a smart device as the target sentence text to be recognized;
Step S204: recognize the target sentence text through a target component recognition model to obtain target component features corresponding to the target sentence text, wherein the target component recognition model is obtained by training an initial component recognition model with a first text sample annotated with component features, and the target component features are used to indicate the language components of the target sentence text;
Step S206: identify, according to the target component features and the target sentence text, a target intention feature corresponding to the target sentence text, wherein the target intention feature is used to indicate the operation intention of the target sentence text for the smart device.
Through the above steps, if the sentence text collected by the smart device is acquired as the target sentence text to be recognized, the language components of the target sentence text can be identified through the target component recognition model as the target component features, and the target intention feature is identified by combining the target sentence text with the language components it contains, thereby achieving accurate identification of the operation intention of the target sentence text for the smart device. The above technical solution solves the problem in the related art that the accuracy of recognizing the intention expressed by sentence text is low, and achieves the technical effect of improving that accuracy.
In the technical solution provided in the above step S202, the smart device may, but is not limited to, convert an acquired voice command issued by the user into corresponding sentence text, or convert text content input by the user on the smart device into corresponding sentence text, and so on. In this way, the language content that the user wants to express can be obtained in a variety of ways, which makes it convenient for the user to operate in multiple ways and improves the user's operating experience.
Optionally, in this embodiment, the smart device may, but is not limited to, include a device that supports voice interaction with the user and performs corresponding operations according to the user's instructions; for example, the smart device may, but is not limited to, include a smart air conditioner, a smart range hood, a smart refrigerator, a smart oven, a smart stove, a smart washing machine, a smart water heater, smart washing equipment, a smart dishwasher, smart projection equipment, a smart TV, a smart clothes-drying rack, smart curtains, smart sockets, a smart sound system, a smart speaker, smart fresh-air equipment, smart kitchen and bathroom equipment, smart bathroom equipment, a smart sweeping robot, a smart window-cleaning robot, a smart mopping robot, smart air purification equipment, a smart steamer, a smart microwave oven, a smart kitchen water heater, a smart purifier, a smart water dispenser, a smart door lock, a smart vehicle air conditioner, smart windshield wipers, a smart vehicle speaker, a smart vehicle refrigerator, and the like.
在上述步骤S204提供的技术方案中,目标成分识别模型可以但不限于用于识别语句文本所具有的语言成分作为该语句文本对应的成分特征,图3是根据本公开实施例的通过目标成分识别模型识别语句文本对应的成分特征的流程图,如图3所示,可以但不限于包括以下步骤:
步骤S301,将目标语句文本输入目标成分识别模型;
步骤S302,目标成分识别模型对目标语句文本所具有的语言成分进行识别;
步骤S303,输出目标成分识别模型识别出的目标语句文本所具有的语言成分作为目标成分特征。
可选地,在本实施例中,目标语句文本所具有的语言成分可以但不限于包括以下至少之一:主语、谓语、宾语、定语、状语、补语、主语从句、谓语从句、宾语从句、定语从句、状语从句、和补语从句等等,通过识别语句文本所具有的语言成分,实现了充分利用语句文本所包括的信息,提升了识别语句文本的准确性。
可选地，在本实施例中，可以但不限于通过以下方式获取第一文本样本：获取初始文本样本；标注初始文本样本中的每一个文本样本所具有的语言成分，得到标注了成分特征的第一文本样本。
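为便于理解上述标注方式，下面给出一条标注了成分特征的第一文本样本的示意性示例（以Python字典表示，采用按字的BIO成分标签形式；样例语句、标签集合与标注粒度均为假设，并非对标注规则的限定）：

# 示意性示例：一条按字标注了成分标签（BIO格式）的第一文本样本
# 样例语句与标签划分均为假设，实际以所确定的标注规则为准
sample = {
    "text": "打开客厅的空调",
    "labels": ["B-PRE", "I-PRE",              # “打开”作谓语
               "B-ATT", "I-ATT", "I-ATT",     # “客厅的”作定语
               "B-OBJ", "I-OBJ"],             # “空调”作宾语
}
assert len(sample["text"]) == len(sample["labels"])   # 按字标注，文字与标签一一对应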
在一个示例性实施例中,可以但不限于通过以下方式对目标语句文本进行识别:将所述目标语句文本输入所述目标成分识别模型所包括的成分标签识别层,得到所述成分标签识别层输出的多个目标文字,每个目标文字对应的成分标签和每个成分标签对应的成分标签概率,其中,所述目标语句文本包括所述多个目标文字,所述成分标签用于指示允许对应的所述每个目标文字所属于的语言成分,所述成分标签概率用于指示对应的所述每个目标文字属于对应的所述成分标签的概率;将多个目标文字,每个目标文字对应的成分标签和每个成分标签对应的成分标签概率输入成分标签判定层,得到所述成分标签判定层输出的与所述多个目标文字一一对应的多个目标成分标签作为成分识别结果。
可选地，在本实施例中，目标语句文本中的每个目标文字可以但不限于对应一个或者多个成分标签，每个成分标签都有和该成分标签对应的成分标签概率，成分标签判定层根据成分标签识别层的输出结果，输出与每个目标文字一一对应的成分标签。
可选地,在本实施例中,可以但不限于通过目标成分识别模型包括的成分标签识别层和成分标签判定层来识别目标语句文本的成分特征,图4是根据本公开实施例的识别语句文本的成分特征的流程图,如图4所示,可以但不限于包括以下步骤:
步骤S401,将目标语句文本输入成分标签识别层;
步骤S402,成分标签识别层识别并输出目标语句文本中的多个目标文字,每个目标文字可能对应的成分标签和每个成分标签对应的成分标签概率;
步骤S403,将多个目标文字,每个目标文字对应的成分标签和每个成分标签对应的成分标签概率输入成分标签判定层;
步骤S404,成分标签判定层输出与每个目标文字一一对应的目标成分标签。
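下面用一个示意性的Python代码草图说明步骤S401~S404中成分标签识别层与成分标签判定层的数据流向（类名、接口与内部实现均为假设，仅作示意，并非本公开限定的实现方式）：

import torch.nn as nn

class ComponentRecognizer(nn.Module):
    """示意性草图：目标成分识别模型 = 成分标签识别层 + 成分标签判定层"""

    def __init__(self, label_layer, decision_layer):
        super().__init__()
        self.label_layer = label_layer        # 成分标签识别层：输出每个目标文字的候选成分标签及标签概率
        self.decision_layer = decision_layer  # 成分标签判定层：结合约束条件输出最终的目标成分标签

    def forward(self, sentence):
        # 步骤S401~S402：识别并输出多个目标文字、候选成分标签及其标签概率
        chars, label_probs = self.label_layer(sentence)
        # 步骤S403~S404：输出与每个目标文字一一对应的目标成分标签
        target_labels = self.decision_layer(chars, label_probs)
        return list(zip(chars, target_labels))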
在一个示例性实施例中，可以但不限于通过以下方式得到成分标签识别层输出的多个目标文字，每个目标文字对应的成分标签和每个成分标签对应的成分标签概率：将所述目标语句文本输入所述成分标签识别层所包括的预处理网络，得到所述预处理网络输出的与所述多个目标文字一一对应的多个词向量；将所述多个词向量输入所述成分标签识别层所包括的成分标签识别网络，得到所述成分标签识别网络输出的所述多个词向量，每个词向量对应的成分标签和每个成分标签对应的成分标签概率。
可选地,在本实施例中,预处理网络可以但不限于用于将目标语句文本中的每个目标文字转换为与每个目标文字一一对应的词向量,预处理网络可以但不限于包括BERT(Bidirectional Encoder Representation from Transformers,双向预训练方法)模型架构的网络,或者,Roberta(A Robustly Optimized BERT Pretraining Approach,鲁棒性优化的预训练方法)模型架构的网络等等,Roberta模型具有较强的获取动态词向量的能力,在模型细节、训练策略、数据层面三方面优化了网络结构,可以快速准确地将目标语句文本的每个目标文字转换为对应的词向量,节约了将文字转换为对应的词向量的时间,提高了将文字转换为对应的词向量的效率。
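作为示意，下面的Python代码草图展示了如何用Roberta类预训练模型把目标语句文本逐字转换为词向量（假设已安装transformers库；示例中的模型名称仅为假设，可替换为任意兼容的中文预训练模型）：

import torch
from transformers import AutoTokenizer, AutoModel

model_name = "hfl/chinese-roberta-wwm-ext"        # 模型名称仅为示例假设
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name)

text = "把电视打开"
inputs = tokenizer(text, return_tensors="pt")     # 包含input_ids、attention_mask等张量
with torch.no_grad():
    outputs = encoder(**inputs)
word_vectors = outputs.last_hidden_state          # 形状为[1, 序列长度, 隐藏维度]的逐字词向量
print(word_vectors.shape)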
可选地，在本实施例中，成分标签识别网络可以但不限于用于预测输入的多个词向量对应的成分标签和每个成分标签对应的成分标签概率，成分标签识别网络可以但不限于包括LSTM(Long Short-Term Memory,长短时记忆)模型架构的网络，或者，BiLSTM(Bi-directional Long Short-Term Memory,双向长短时记忆)模型架构的网络等等，图5是根据本公开实施例的可选的BiLSTM模型的架构图，如图5所示，BiLSTM模型在进行“EU rejects German call”对应的成分标签预测时，可以但不限于利用前向预测和后向预测，通过将前向预测的结果和后向预测的结果拼接，将“EU”预测为“B-SUB”成分标签，其中，“SUB”代表着主语(SUBJECT)，将“rejects”预测为“B-PRE”成分标签，其中，“PRE”代表着谓语动词(PREDICATE)，将“German”预测为“B-ATT”，其中，“ATT”代表着定语(ATTRIBUTE)，将“call”预测为“B-OBJ”，其中，“OBJ”代表着宾语(OBJECT)，提高了预测词向量对应的成分标签的准确率，并且BiLSTM模型具有较强的鲁棒性，较少地受到工程特征的影响，能够稳定运行。
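下面给出一个示意性的BiLSTM成分标签识别网络的Python代码草图：前向与后向的隐状态在特征维度上拼接后，经线性层得到每个文字在各成分标签上的（未归一化）得分；标签集合与维度取值均为假设：

import torch
import torch.nn as nn

LABELS = ["O", "B-SUB", "I-SUB", "B-PRE", "I-PRE",
          "B-OBJ", "I-OBJ", "B-ATT", "I-ATT"]       # 示例标签集合（假设）

class BiLSTMTagger(nn.Module):
    """示意性草图：BiLSTM读取词向量序列，输出每个文字在各成分标签上的得分"""

    def __init__(self, hidden_dim=768, lstm_dim=256, num_labels=len(LABELS)):
        super().__init__()
        self.bilstm = nn.LSTM(hidden_dim, lstm_dim, batch_first=True,
                              bidirectional=True)    # 前向预测与后向预测的结果拼接
        self.classifier = nn.Linear(2 * lstm_dim, num_labels)

    def forward(self, word_vectors):                 # [batch, seq_len, hidden_dim]
        hidden, _ = self.bilstm(word_vectors)        # [batch, seq_len, 2*lstm_dim]
        return self.classifier(hidden)               # 未归一化的成分标签得分（logits）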
可选地,在本实施例中,成分标签概率可以但不限于包括未归一化的概率(即各个成分标签概率可以但不限于为大于1,或者大于或者等于0,并且小于或者等于1),或者,归一化的概率(即各个成分标签概率可以但不限于均大于或者等于0,并且小于或者等于1)等等。
可选地,在本实施例中,成分标签识别层可以但不限于包括成分标签识别网络和成分标签概率归一化网络,或者,成分标签识别网络;成分标签概率归一化网络可以但不限于用于将成分标签识别网络输出的成分标签概率进行归一化,成分标签概率归一化网络可以但不限于包括采用Softmax(分类网络)模型架构的网络等等。
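成分标签概率的归一化可以示意性地用Softmax实现，下面是一个极简的Python示例（得分数值均为假设的示例数据）：

import torch

logits = torch.tensor([[2.1, 0.3, -1.0, 0.7]])      # 某个文字在4个成分标签上的未归一化得分（示例数据）
probs = torch.softmax(logits, dim=-1)               # 归一化后各成分标签概率均位于[0,1]且和为1
print(probs, probs.sum())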
在一个示例性实施例中,可以但不限于通过以下方式得到成分识别结果:通过所述成分标签判定层从所述每个目标文字对应的成分标签中筛选出满足目标约束条件的候选成分标签,其中,所述目标约束条件为语句中对语言成分的约束条件;通过所述成分标签判定层从所述候选成分标签中获取所对应的成分标签概率满足目标概率条件的成分标签作为所述每个目标文字所对应的目标成分标签,得到与所述多个目标文字一一对应的多个目标成分标签作为所述成分识别结果。
可选地,在本实施例中,成分标签判定层可以但不限于用于输出与每个目标文字一一对应的目标成分标签,成分标签判定层可以但不限于包括采用CRF(Conditional Random Field,条件随机场)模型等等,CRF模型可以充分利用BiLSTM模型中的信息,提高了CRF输出的每个目标文字对应的目标成分标签的准确性。
可选地，在本实施例中，目标约束条件可以但不限于由成分标签判定层从训练数据中学习到，比如：句子的第一个单词应该是“B-label”或“O-label”而不是“I-label”，“B-label1 I-label2 I-label3……”中，label1、label2和label3应该是同一种成分类别，“O I-label”是错误的，开头应该是“B-”而不是“I-”。通过学习这些约束，可以提高成分标签识别网络所预测出的预测标签的准确性。
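下面的Python代码草图示意如何把上述约束编码为CRF层的转移约束（以把非法转移的得分置为极小值的方式实现；标签集合与实现细节均为假设，仅作说明）：

import torch

example_labels = ["O", "B-SUB", "I-SUB", "B-PRE", "I-PRE"]   # 示例标签集合（假设）
NEG_INF = -1e4

def build_transition_constraints(labels):
    """将文中的成分标签约束编码为句首得分与转移得分矩阵，非法项置为极小值"""
    num = len(labels)
    start = torch.zeros(num)            # start[j]：句首取标签j的得分
    trans = torch.zeros(num, num)       # trans[i][j]：从标签i转移到标签j的得分
    for j, to_label in enumerate(labels):
        if to_label.startswith("I-"):
            start[j] = NEG_INF          # 约束：句子开头应该是“B-”或“O”，而不是“I-”
        for i, from_label in enumerate(labels):
            if to_label.startswith("I-"):
                # 约束：“O I-label”是错误的；“I-xxx”之前必须是同一成分类别的“B-xxx”或“I-xxx”
                if from_label == "O" or from_label[2:] != to_label[2:]:
                    trans[i][j] = NEG_INF
    return start, trans

start_scores, transition_scores = build_transition_constraints(example_labels)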
可选地，在本实施例中，可以但不限于将候选成分标签中的每个目标文字可能属于的成分标签中的成分标签概率最大的成分标签作为每个目标文字的目标成分标签，或者，将目标语句文本中的多个目标文字的成分标签概率的和值最大的多个成分标签作为多个目标成分标签等等。
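下面用一个极简的Python示例说明上述两种取标签的方式：逐字取概率最大的成分标签，或在若干候选标签序列中取概率和值最大的一条（概率数值与候选序列均为假设的示例数据）：

import torch

probs = torch.tensor([[0.7, 0.2, 0.1],              # 每行是一个目标文字在3个成分标签上的概率（示例数据）
                      [0.1, 0.6, 0.3],
                      [0.2, 0.3, 0.5]])

# 方式一：对每个目标文字取概率最大的成分标签
per_char_best = probs.argmax(dim=-1)

# 方式二：在已满足约束条件的候选标签序列中，取成分标签概率和值最大的一条序列
candidates = [[0, 1, 2], [0, 1, 1]]                  # 候选成分标签序列（假设已通过约束筛选）
best_seq = max(candidates,
               key=lambda seq: sum(probs[t, l].item() for t, l in enumerate(seq)))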
在上述步骤S206提供的技术方案中，可以但不限于通过结合目标成分特征和目标语句文本识别目标语句文本对智能设备的操作意图，在智能对话系统中，往往需要及时准确地识别出文本的意图，需要识别的文本可能是包含从句部分的长句，也可能是包括几个语言成分的短句，结合目标语句文本可以充分利用目标语句文本中的信息准确地识别目标语句文本的含义，结合目标语句文本所具有的语言成分和目标语句，可以准确地识别出目标语句文本的目标意图特征，实现了提升识别目标语句文本对智能设备的操作意图的准确率。
在一个示例性实施例中,可以但不限于通过以下方式识别目标语句文本对应的目标意图特征:通过目标意图识别模型对携带了所述目标成分特征的所述目标语句文本进行识别,其中,所述目标意图识别模型是使用标注了意图特征的携带了成分特征的第二文本样本对初始意图识别模型进行训练得到的;获取所述目标意图识别模型输出的意图识别结果作为所述目标意图特征。
可选地,在本实施例中,可以但不限于将目标成分识别模型输出的携带有目标成分特征的目标语句文本输入目标意图识别模型,将目标意图识别模型输出的目标语句文本对智能设备的操作意图作为目标意图特征。
在一个示例性实施例中,可以但不限于通过以下方式对携带了目标成分特征的目标语句文本进行识别:将所述目标语句文本输入目标实体识别模型,得到所述目标实体识别模型输出的目标实体特征,其中,所述目标实体特征用于指示所述目标语句文本中所包括的实体,所述目标实体识别模型是使用标注了实体特征的第三文本样本对初始实体识别模型进行训练得到的;将所述目标成分特征和所述目标实体特征输入所述目标意图识别模型,得到所述目标意图识别模型输出的所述意图识别结果。
可选地，在本实施例中，可以但不限于将目标实体识别模型识别出的目标语句文本中所包括的实体作为目标实体特征，可以但不限于将目标成分特征和目标实体特征输入目标意图识别模型，将目标意图识别模型输出的意图识别结果作为目标语句文本所对应的对智能设备的操作意图，图6是根据本公开实施例的一种可选的识别目标语句意图的模型架构图，如图6所示，可以但不限于通过结合目标成分识别模型、目标实体识别模型、和目标意图识别模型来识别目标语句对智能设备的操作意图，实现了将目标语句文本所具有的语言成分和目标语句文本中所包括的实体结合进行意图识别，充分利用了目标语句文本所包括的信息，提升了识别出目标语句文本对智能设备的操作意图的准确性。
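按图6的思路，三个模型的串联可以示意性地写成如下Python代码草图（函数名、参数与返回结构均为假设，仅用于说明数据流向）：

def recognize_intent(sentence, component_model, entity_model, intent_model):
    """示意性草图：结合目标成分特征与目标实体特征识别操作意图"""
    component_features = component_model(sentence)   # 目标成分特征：语句所具有的语言成分
    entity_features = entity_model(sentence)         # 目标实体特征：语句中所包括的实体
    # 目标意图识别模型结合两类特征，输出对智能设备的操作意图
    return intent_model(sentence, component_features, entity_features)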
在一个示例性实施例中,可以但不限于通过以下方式得到目标实体识别模型:获取标注了实体特征的第三文本样本,其中,所述实体特征用于表征对所述智能设备执行的控制操作的操作信息;使用标注了所述实体特征的所述第三文本样本对所述初始实体识别模型进行训练,得到所述目标实体识别模型。
可选地,在本实施例中,对智能设备执行的控制操作的操作信息可以但不限于包括对智能设备执行控制操作的操作时间,操作地点,操作模式,操作设备及该设备的操作模式等等。
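作为示意，一条标注了实体特征的第三文本样本可以表示为如下形式（语句、槽位名称与取值均为假设，仅作说明）：

# 示意性示例：标注了实体特征（操作信息）的第三文本样本
entity_sample = {
    "text": "晚上八点把卧室空调调到制冷模式",
    "entities": {
        "操作时间": "晚上八点",
        "操作地点": "卧室",
        "操作设备": "空调",
        "操作模式": "制冷模式",
    },
}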
为了更好地理解上述语句文本的识别过程，以下再结合可选实施例对上述语句文本的识别流程进行说明，但不用于限定本公开实施例的技术方案。
在本实施例中提供了一种语句文本的识别方法,图7是根据本公开实施例的识别语句文本的语言成分的示意图,如图7所示,该方法可以但不限于包括如下步骤:
步骤S701:进行文本数据的收集、清洗处理;
步骤S702：确定文本数据标注的成分标签和数量，成分标签可以但不限于包括以下至少之一：主语(SUB)、谓语(PRE)、宾语(OBJ)、定语(ATT)、状语(ADV,Adverbial)、补语(COM,Complement)、主语从句、谓语从句、宾语从句、定语从句、状语从句、和补语从句等等；
步骤S703:标注文本数据所具有的语言成分,可以但不限于根据确定好的标注规则,标注文本数据中各个语句文本中所具有的语言成分,得到模型训练的样本数据;
步骤S704:可以但不限于将样本数据切分为训练集、验证集和测试集,得到训练数据;
步骤S705:将训练数据输入到Roberta预训练模型进行向量化,向量化可以但不限于分为三个模块input-ids(输入数据中词语id(identification,标识)组成的张量)、segment-ids(输入数据中的句子id(标识)组成的张量)、input-mask(输入数据掩码)。对三个向量化的结果做融合得到Embedding(词向量)的输出;
步骤S706:BiLSTM模型预测每个词向量对应的成分标签和每个成分标签对应的成分标签概率,可以但不限于将Roberta预训练模型输出的多个词向量作为BiLSTM模型的输入,该输入获取的n维字向量作为BiLSTM神经网络各个时间步的输入,得到BiLSTM层的隐状态序列。BiLSTM模型学习参数的更新可以但不限于使用BPTT(Back-Propagation Through Time,时序反向传播)算法,该模型在forward(前向)和backward(后向)阶段与一般模型不同之处在于隐藏层对于所有的time step(步长)都要展开计算。
步骤S707:Softmax层对每个成分标签概率进行归一化处理,可以但不限于将BiLSTM输出的多个词向量,每个词向量对应的成分标签和每个成分标签对应的成分标签概率输入logit(逻辑)层,其中,logit层是Softmax层的输入,Softmax层输出多个词向量,每个词向量对应的成分标签和每个成分标签对应的归一化的成分标签概率;
步骤S708：CRF层输出每个词向量对应的预测成分标签，可以但不限于将Softmax层输出的多个词向量，每个词向量对应的成分标签和每个成分标签对应的归一化的成分标签概率输入CRF层，CRF层可以向最终的预测成分标签添加一些约束，以确保预测成分标签为有效的，这些约束在该CRF层训练过程中由训练数据集自动学习得到。CRF则将LSTM在每个t时刻在第i个tag(分类)上的输出作为特征函数中的点函数，在原本的CRF中引入了非线性。整体模型还是以CRF为主体的大框架，使LSTM中的信息得到充分的再利用，最终能够得到全局最优的输出序列。
步骤S709：计算预测成分标签与真实成分标签之间的损失度，经过与训练数据的真实标签进行loss(损失度)的计算，再循环进行每个epoch(时期)的迭代，通过BPTT算法不断更新神经网络节点的参数，使loss逐渐下降最终达到模型收敛状态，且模型经过优化后能保证loss较小，训练完成的模型对新数据具有较高的准确率（步骤S705~S709的串联示例可参考下文的示意性训练代码草图）。
步骤S710:在损失度小于或者等于损失度阈值的情况下,模型部署完成,在进行语句意图解析的整个流程中,将新数据传入到该模型中,即可得到预测标签,结合专家规则,完成意图的精确识别。
图8是根据本公开实施例的可选的识别目标语句所具有的语言成分的模型架构图,上述步骤S701~步骤S710可以但不限于用于如图8所示的模型架构中,可以但不限于通过构建句子成分标注规则和句子成分标签预测神经网络模型,完成高精度的语句文本的成分识别;可以但不限于使用Roberta预训练模型完成输入字词的Embedding,使得字词向量化变得简单高效,且向量化包含的信息和含义更加丰富,提升了向量化的准确性;同时利用CRF模型的状态转移矩阵,使得标签预测有效性大大提升。该模型结构提升了训练的速度和预测准确性,在意图识别领域提供了一种新的处理方式。
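下面给出一个串联步骤S705~S709的示意性训练代码草图（Python）：假设已安装transformers与pytorch-crf库，并复用前文草图中的BiLSTMTagger与LABELS；模型名称、超参数与标签编号均为假设，仅用于说明“向量化—BiLSTM打分—CRF计算损失—反向传播更新参数”的流程，并非本公开限定的实现方式：

import torch
from torch.optim import Adam
from torchcrf import CRF                               # 假设已安装 pytorch-crf 库
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")   # 模型名称为示例假设
encoder = AutoModel.from_pretrained("hfl/chinese-roberta-wwm-ext")
tagger = BiLSTMTagger()                                # 复用前文草图中的BiLSTM成分标签识别网络
crf = CRF(num_tags=len(LABELS), batch_first=True)
optimizer = Adam(list(tagger.parameters()) + list(crf.parameters()), lr=1e-3)

def train_step(text, label_ids):
    """对应步骤S705~S709：向量化 -> BiLSTM打分 -> CRF计算损失 -> 反向传播更新参数"""
    inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False)
    with torch.no_grad():                              # 本草图中冻结预训练编码器，仅训练BiLSTM与CRF
        word_vectors = encoder(**inputs).last_hidden_state
    emissions = tagger(word_vectors)                   # [1, 序列长度, 标签数]
    tags = torch.tensor([label_ids])                   # 真实成分标签序列
    loss = -crf(emissions, tags)                       # CRF给出对数似然，取负作为损失度
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# 用法示例（语句与标签编号均为假设）：
# train_step("打开客厅的空调", [3, 4, 7, 8, 8, 5, 6])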
用户可以但不限于与智能音箱或者其它的智能设备(比如:智能窗帘,智能热水器,或者智能电视等等)进行语音交互,图9是根据本公开实施例的用户与智能音箱语音交互的场景示意图,如图9所示,用户可以但不限于在智能音箱播放音乐的过程中,表达出“声音太大了”的语音指令,可以但不限于识别出用户对智能音箱的操作意图可以但不限于包括将智能音箱播放音乐的音量调小10%(或者15%,5%等等),那么可以但不限于响应用户的语音指令,将智能音箱播放音乐的音量调小10%。
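识别出操作意图之后，对意图的响应可以示意性地表示为如下Python代码草图（意图名称、设备接口与调整比例均为假设，仅作说明）：

def apply_intent(speaker, intent):
    """示意性草图：把识别出的操作意图映射为对智能音箱的控制动作"""
    current = speaker.get_volume()                     # get_volume/set_volume为假设的设备接口
    if intent == "音量调小":
        speaker.set_volume(int(current * 0.9))         # 例如将音量调小10%
    elif intent == "音量调大":
        speaker.set_volume(int(current * 1.1))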
图10是根据本公开实施例的用户与智能电视语音交互的场景示意图,如图10所示,智能电视可以但不限于为关机的状态,如果获取到用户表达出“把电视打开”的语音指令,可以但不限于识别出用户对智能电视的操作意图为控制智能电视开机,那么可以但不限于响应用户的语音指令,控制智能电视开机。
需要说明的是,在本实施例中,对智能音箱和智能电视的形状不做限定,在图9中仅以外形为圆柱形的智能音箱进行举例说明,在图10中仅以外形为矩形状的智能电视进行举例说明,智能音箱和智能电视的形状可以是任何符合生产工艺和用户需求的形状,本公开对此不做限定。
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到根据上述实施例的方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本公开的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如ROM/RAM、磁碟、光盘)中,包括若干指令用以使得一台终端设备(可以是手机,计算机,服务器,或者网络设备等)执行本公开各个实施例的方法。
图11是根据本公开实施例的一种语句文本的识别装置的结构框图;如图11所示,包括:
第一获取模块1102,设置为获取智能设备采集到的语句文本作为待识别的目标语句文本;
第一识别模块1104,设置为通过目标成分识别模型对所述目标语句文本进行识别,得到所述目标语句文本对应的目标成分特征,其中,所述目标成分识别模型是使用标注了成分特征的第一文本样本对初始成分识别模型进行训练得到的,所述目标成分特征用于指示所述目标语句文本所具有的语言成分;
第二识别模块1106,设置为根据所述目标成分特征和所述目标语句文本,识别所述目标语句文本对应的目标意图特征,其中,所述目标意图特征用于指示所述目标语句文本对所述智能设备的操作意图。
通过上述实施例，如果获取到智能设备采集到的语句文本作为待识别的目标语句文本，可以通过目标成分识别模型识别目标语句文本所具有的语言成分作为目标成分特征，通过将目标语句文本和目标语句文本中所具有的语言成分结合来识别目标意图特征，实现了准确地识别目标语句文本对智能设备的操作意图。采用上述技术方案，解决了相关技术中，识别语句文本所表达意图的准确率较低等问题，实现了提高识别语句文本所表达意图的准确率的技术效果。
在一个示例性实施例中,所述第一识别模块,包括:
第一识别单元,设置为将所述目标语句文本输入所述目标成分识别模型所包括的成分标签识别层,得到所述成分标签识别层输出的多个目标文字,每个目标文字对应的成分标签和每个成分标签对应的成分标签概率,其中,所述目标语句文本包括所述多个目标文字,所述成分标签用于指示允许对应的所述每个目标文字所属于的语言成分,所述成分标签概率用于指示对应的所述每个目标文字属于对应的所述成分标签的概率;
处理单元,设置为将多个目标文字,每个目标文字对应的成分标签和每个成分标签对应的成分标签概率输入成分标签判定层,得到所述成分标签判定层输出的与所述多个目标文字一一对应的多个目标成分标签作为成分识别结果。
在一个示例性实施例中,所述第一识别单元,设置为:
将所述目标语句文本输入所述成分标签识别层所包括的预处理网络,得到所述预处理网络输出的与所述多个目标文字一一对应的多个词向量;
将所述多个词向量输入所述成分标签识别层所包括的成分标签识别网络,得到所述成分标签识别网络输出的所述多个词向量,每个词向量对应的成分标签和每个成分标签对应的成分标签概率。
在一个示例性实施例中,所述处理单元,设置为:
通过所述成分标签判定层从所述每个目标文字对应的成分标签中筛选出满足目标约束条件的候选成分标签,其中,所述目标约束条件为语句中对语言成分的约束条件;
通过所述成分标签判定层从所述候选成分标签中获取所对应的成分标签概率满足目标概率条件的成分标签作为所述每个目标文字所对应的目标成分标签,得到与所述多个目标文字一一对应的多个目标成分标签作为所述成分识别结果。
在一个示例性实施例中,所述第二识别模块,包括:
第二识别单元,设置为通过目标意图识别模型对携带了所述目标成分特征的所述目标语句文本进行识别,其中,所述目标意图识别模型是使用标注了意图特征的携带了成分特征的第二文本样本对初始意图识别模型进行训练得到的;
获取单元,设置为获取所述目标意图识别模型输出的意图识别结果作为所述目标意图特征。
在一个示例性实施例中，所述第二识别单元，设置为：
将所述目标语句文本输入目标实体识别模型,得到所述目标实体识别模型输出的目标实体特征,其中,所述目标实体特征用于指示所述目标语句文本中所包括的实体,所述目标实体识别模型是使用标注了实体特征的第三文本样本对初始实体识别模型进行训练得到的;
将所述目标成分特征和所述目标实体特征输入所述目标意图识别模型,得到所述目标意图识别模型输出的所述意图识别结果。
在一个示例性实施例中，所述装置还包括：
第二获取模块,设置为在所述将所述目标语句文本输入目标实体识别模型之前,获取标注了实体特征的第三文本样本,其中,所述实体特征用于表征对所述智能设备执行的控制操作的操作信息;
训练模块,设置为使用标注了所述实体特征的所述第三文本样本对所述初始实体识别模型进行训练,得到所述目标实体识别模型。
本公开的实施例还提供了一种存储介质,该存储介质包括存储的程序,其中,上述程序运行时执行上述任一项的方法。
可选地,在本实施例中,上述存储介质可以被设置为存储用于执行以下步骤的程序代码:
S1,获取智能设备采集到的语句文本作为待识别的目标语句文本;
S2,通过目标成分识别模型对所述目标语句文本进行识别,得到所述目标语句文本对应的目标成分特征,其中,所述目标成分识别模型是使用标注了成分特征的第一文本样本对初始成分识别模型进行训练得到的,所述目标成分特征用于指示所述目标语句文本所具有的语言成分;
S3,根据所述目标成分特征和所述目标语句文本,识别所述目标语句文本对应的目标意图特征,其中,所述目标意图特征用于指示所述目标语句文本对所述智能设备的操作意图。
本公开的实施例还提供了一种电子装置,包括存储器和处理器,该存储器中存储有计算机程序,该处理器被设置为运行计算机程序以执行上述任一项方法实施例中的步骤。
可选地,上述电子装置还可以包括传输设备以及输入输出设备,其中,该传输设备和上述处理器连接,该输入输出设备和上述处理器连接。
可选地,在本实施例中,上述处理器可以被设置为通过计算机程序执行以下步骤:
S1,获取智能设备采集到的语句文本作为待识别的目标语句文本;
S2,通过目标成分识别模型对所述目标语句文本进行识别,得到所述目标语句文本对应的目标成分特征,其中,所述目标成分识别模型是使用标注了成分特征的第一文本样本对初始成分识别模型进行训练得到的,所述目标成分特征用于指示所述目标语句文本所具有的语言成分;
S3,根据所述目标成分特征和所述目标语句文本,识别所述目标语句文本对应的目标意图特征,其中,所述目标意图特征用于指示所述目标语句文本对所述智能设备的操作意图。
可选地,在本实施例中,上述存储介质可以包括但不限于:U盘、只读存储器(Read-Only Memory,简称为ROM)、随机存取存储器(Random Access Memory,简称为RAM)、移动硬盘、磁碟或者光盘等各种可以存储程序代码的介质。
可选地,本实施例中的具体示例可以参考上述实施例及可选实施方式中所描述的示例,本实施例在此不再赘述。
显然,本领域的技术人员应该明白,上述的本公开的各模块或各步骤可以用通用的计算装置来实现,它们可以集中在单个的计算装置上,或者分布在多个计算装置所组成的网络上,可选地,它们可以用计算装置可执行的程序代码来实现,从而,可以将它们存储在存储装置中由计算装置来执行,并且在某些情况下,可以以不同于此处的顺序执行所示出或描述的步骤,或者将它们分别制作成各个集成电路模块,或者将它们中的多个模块或步骤制作成单个集成电路模块来实现。这样,本公开不限制于任何特定的硬件和软件结合。
以上所述仅是本公开的优选实施方式,应当指出,对于本技术领域的普通技术人员来说,在不脱离本公开原理的前提下,还可以做出若干改进和润饰,这些改进和润饰也应视为本公开的保护范围。

Claims (16)

  1. 一种语句文本的识别方法,包括:
    获取智能设备采集到的语句文本作为待识别的目标语句文本;
    通过目标成分识别模型对所述目标语句文本进行识别,得到所述目标语句文本对应的目标成分特征,其中,所述目标成分识别模型是使用标注了成分特征的第一文本样本对初始成分识别模型进行训练得到的,所述目标成分特征用于指示所述目标语句文本所具有的语言成分;
    根据所述目标成分特征和所述目标语句文本,识别所述目标语句文本对应的目标意图特征,其中,所述目标意图特征用于指示所述目标语句文本对所述智能设备的操作意图。
  2. 根据权利要求1所述的方法,其中,所述通过目标成分识别模型对所述目标语句文本进行识别,包括:
    将所述目标语句文本输入所述目标成分识别模型所包括的成分标签识别层,得到所述成分标签识别层输出的多个目标文字,每个目标文字对应的成分标签和每个成分标签对应的成分标签概率,其中,所述目标语句文本包括所述多个目标文字,所述成分标签用于指示允许对应的所述每个目标文字所属于的语言成分,所述成分标签概率用于指示对应的所述每个目标文字属于对应的所述成分标签的概率;
    将多个目标文字,每个目标文字对应的成分标签和每个成分标签对应的成分标签概率输入成分标签判定层,得到所述成分标签判定层输出的与所述多个目标文字一一对应的多个目标成分标签作为成分识别结果。
  3. 根据权利要求2所述的方法,其中,所述将所述目标语句文本输入所述目标成分识别模型所包括的成分标签识别层,得到所述成分标签识别层输出的多个目标文字,每个目标文字对应的成分标签和每个成分标签对应的成分标签概率,包括:
    将所述目标语句文本输入所述成分标签识别层所包括的预处理网络,得到所述预处理网络输出的与所述多个目标文字一一对应的多个词向量;
    将所述多个词向量输入所述成分标签识别层所包括的成分标签识别网络,得到所述成分标签识别网络输出的所述多个词向量,每个词向量对应的成分标签和每个成分标签对应的成分标签概率。
  4. 根据权利要求2所述的方法,其中,所述将多个目标文字,每个目标文字对应的成分标签和每个成分标签对应的成分标签概率输入成分标签判定层,得到所述成分标签判定层输出的与所述多个目标文字一一对应的多个目标成分标签作为成分识别结果,包括:
    通过所述成分标签判定层从所述每个目标文字对应的成分标签中筛选出满足目标约束条件的候选成分标签,其中,所述目标约束条件为语句中对语言成分的约束条件;
    通过所述成分标签判定层从所述候选成分标签中获取所对应的成分标签概率满足目标概率条件的成分标签作为所述每个目标文字所对应的目标成分标签,得到与所述多个目标文字一一对应的多个目标成分标签作为所述成分识别结果。
  5. 根据权利要求1-4任一项所述的方法,其中,所述根据所述目标成分特征和所述目标语句文本,识别所述目标语句文本对应的目标意图特征,包括:
    通过目标意图识别模型对携带了所述目标成分特征的所述目标语句文本进行识别,其中,所述目标意图识别模型是使用标注了意图特征的携带了成分特征的第二文本样本对初始意图识别模型进行训练得到的;
    获取所述目标意图识别模型输出的意图识别结果作为所述目标意图特征。
  6. 根据权利要求5所述的方法,其中,所述通过目标意图识别模型对携带了所述目标成分特征的所述目标语句文本进行识别,包括:
    将所述目标语句文本输入目标实体识别模型，得到所述目标实体识别模型输出的目标实体特征，其中，所述目标实体特征用于指示所述目标语句文本中所包括的实体，所述目标实体识别模型是使用标注了实体特征的第三文本样本对初始实体识别模型进行训练得到的；
    将所述目标成分特征和所述目标实体特征输入所述目标意图识别模型,得到所述目标意图识别模型输出的所述意图识别结果。
  7. 根据权利要求6所述的方法,其中,在所述将所述目标语句文本输入目标实体识别模型之前,所述方法还包括:
    获取标注了实体特征的第三文本样本,其中,所述实体特征用于表征对所述智能设备执行的控制操作的操作信息;
    使用标注了所述实体特征的所述第三文本样本对所述初始实体识别模型进行训练,得到所述目标实体识别模型。
  8. 一种语句文本的识别装置,包括:
    第一获取模块,设置为获取智能设备采集到的语句文本作为待识别的目标语句文本;
    第一识别模块,设置为通过目标成分识别模型对所述目标语句文本进行识别,得到所述目标语句文本对应的目标成分特征,其中,所述目标成分识别模型是使用标注了成分特征的第一文本样本对初始成分识别模型进行训练得到的,所述目标成分特征用于指示所述目标语句文本所具有的语言成分;
    第二识别模块,设置为根据所述目标成分特征和所述目标语句文本,识别所述目标语句文本对应的目标意图特征,其中,所述目标意图特征用于指示所述目标语句文本对所述智能设备的操作意图。
  9. 根据权利要求8所述的装置,其中,所述第一识别模块,包括:
    第一识别单元，设置为将所述目标语句文本输入所述目标成分识别模型所包括的成分标签识别层，得到所述成分标签识别层输出的多个目标文字，每个目标文字对应的成分标签和每个成分标签对应的成分标签概率，其中，所述目标语句文本包括所述多个目标文字，所述成分标签用于指示允许对应的所述每个目标文字所属于的语言成分，所述成分标签概率用于指示对应的所述每个目标文字属于对应的所述成分标签的概率；
    处理单元,设置为将多个目标文字,每个目标文字对应的成分标签和每个成分标签对应的成分标签概率输入成分标签判定层,得到所述成分标签判定层输出的与所述多个目标文字一一对应的多个目标成分标签作为成分识别结果。
  10. 根据权利要求9所述的装置,其中,所述第一识别单元,设置为:
    将所述目标语句文本输入所述成分标签识别层所包括的预处理网络,得到所述预处理网络输出的与所述多个目标文字一一对应的多个词向量;
    将所述多个词向量输入所述成分标签识别层所包括的成分标签识别网络,得到所述成分标签识别网络输出的所述多个词向量,每个词向量对应的成分标签和每个成分标签对应的成分标签概率。
  11. 根据权利要求9所述的装置,其中,所述处理单元,设置为:
    通过所述成分标签判定层从所述每个目标文字对应的成分标签中筛选出满足目标约束条件的候选成分标签,其中,所述目标约束条件为语句中对语言成分的约束条件;
    通过所述成分标签判定层从所述候选成分标签中获取所对应的成分标签概率满足目标概率条件的成分标签作为所述每个目标文字所对应的目标成分标签,得到与所述多个目标文字一一对应的多个目标成分标签作为所述成分识别结果。
  12. 根据权利要求8-11任一项所述的装置,其中,所述第二识别模块,包括:
    第二识别单元,设置为通过目标意图识别模型对携带了所述目标成分特征的所述目标语句文本进行识别,其中,所述目标意图识别模型是使用标注了意图特征的携带了成分特征的第二文本样本对初始意图识别模型进行训练得到的;
    获取单元，设置为获取所述目标意图识别模型输出的意图识别结果作为所述目标意图特征。
  13. 根据权利要求12所述的装置,其中,所述第二识别单元,设置为:
    将所述目标语句文本输入目标实体识别模型,得到所述目标实体识别模型输出的目标实体特征,其中,所述目标实体特征用于指示所述目标语句文本中所包括的实体,所述目标实体识别模型是使用标注了实体特征的第三文本样本对初始实体识别模型进行训练得到的;
    将所述目标成分特征和所述目标实体特征输入所述目标意图识别模型,得到所述目标意图识别模型输出的所述意图识别结果。
  14. 根据权利要求13所述的装置,其中,所述装置还包括:
    第二获取模块,设置为在所述将所述目标语句文本输入目标实体识别模型之前,获取标注了实体特征的第三文本样本,其中,所述实体特征用于表征对所述智能设备执行的控制操作的操作信息;
    训练模块,设置为使用标注了所述实体特征的所述第三文本样本对所述初始实体识别模型进行训练,得到所述目标实体识别模型。
  15. 一种计算机可读的存储介质,所述计算机可读的存储介质包括存储的程序,其中,所述程序运行时执行权利要求1至7中任一项所述的方法。
  16. 一种电子装置,包括存储器和处理器,所述存储器中存储有计算机程序,所述处理器被设置为通过所述计算机程序执行权利要求1至7中任一项所述的方法。
PCT/CN2022/096405 2022-03-09 2022-05-31 语句文本的识别方法和装置、存储介质及电子装置 WO2023168838A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210234269.0A CN114676689A (zh) 2022-03-09 2022-03-09 语句文本的识别方法和装置、存储介质及电子装置
CN202210234269.0 2022-03-09

Publications (1)

Publication Number Publication Date
WO2023168838A1 true WO2023168838A1 (zh) 2023-09-14

Family

ID=82073019

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/096405 WO2023168838A1 (zh) 2022-03-09 2022-05-31 语句文本的识别方法和装置、存储介质及电子装置

Country Status (2)

Country Link
CN (1) CN114676689A (zh)
WO (1) WO2023168838A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115269774A (zh) * 2022-06-30 2022-11-01 青岛海尔科技有限公司 文本意图的识别方法和装置、存储介质和电子装置
CN115826627A (zh) * 2023-02-21 2023-03-21 白杨时代(北京)科技有限公司 一种编队指令的确定方法、系统、设备及存储介质
CN116662555B (zh) * 2023-07-28 2023-10-20 成都赛力斯科技有限公司 一种请求文本处理方法、装置、电子设备及存储介质

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108491443A (zh) * 2018-02-13 2018-09-04 上海好体信息科技有限公司 由计算机实施的与用户对话的方法和计算机系统
CN111079405A (zh) * 2019-11-29 2020-04-28 微民保险代理有限公司 文本信息识别方法、装置、存储介质和计算机设备
US20200234700A1 (en) * 2017-07-14 2020-07-23 Cognigy Gmbh Method for conducting dialog between human and computer
CN111738018A (zh) * 2020-06-24 2020-10-02 深圳前海微众银行股份有限公司 一种意图理解方法、装置、设备及存储介质
CN111931513A (zh) * 2020-07-08 2020-11-13 泰康保险集团股份有限公司 一种文本的意图识别方法及装置
CN113032568A (zh) * 2021-04-02 2021-06-25 同方知网(北京)技术有限公司 一种基于bert+bilstm+crf并融合句型分析的查询意图识别方法
US20220044081A1 (en) * 2020-12-09 2022-02-10 Beijing Baidu Netcom Science And Technology Co., Ltd. Method for recognizing dialogue intention, electronic device and storage medium
CN114138963A (zh) * 2021-12-01 2022-03-04 北京比特易湃信息技术有限公司 基于句法分析的意图识别模型

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200234700A1 (en) * 2017-07-14 2020-07-23 Cognigy Gmbh Method for conducting dialog between human and computer
CN108491443A (zh) * 2018-02-13 2018-09-04 上海好体信息科技有限公司 由计算机实施的与用户对话的方法和计算机系统
CN111079405A (zh) * 2019-11-29 2020-04-28 微民保险代理有限公司 文本信息识别方法、装置、存储介质和计算机设备
CN111738018A (zh) * 2020-06-24 2020-10-02 深圳前海微众银行股份有限公司 一种意图理解方法、装置、设备及存储介质
CN111931513A (zh) * 2020-07-08 2020-11-13 泰康保险集团股份有限公司 一种文本的意图识别方法及装置
US20220044081A1 (en) * 2020-12-09 2022-02-10 Beijing Baidu Netcom Science And Technology Co., Ltd. Method for recognizing dialogue intention, electronic device and storage medium
CN113032568A (zh) * 2021-04-02 2021-06-25 同方知网(北京)技术有限公司 一种基于bert+bilstm+crf并融合句型分析的查询意图识别方法
CN114138963A (zh) * 2021-12-01 2022-03-04 北京比特易湃信息技术有限公司 基于句法分析的意图识别模型

Also Published As

Publication number Publication date
CN114676689A (zh) 2022-06-28

Similar Documents

Publication Publication Date Title
WO2023168838A1 (zh) 语句文本的识别方法和装置、存储介质及电子装置
CN108062388B (zh) 人机对话的回复生成方法和装置
CN109684456B (zh) 基于物联网能力知识图谱的场景能力智能问答系统
CN109101624A (zh) 对话处理方法、装置、电子设备及存储介质
CN116229955A (zh) 基于生成式预训练gpt模型的交互意图信息确定方法
CN113255366B (zh) 一种基于异构图神经网络的方面级文本情感分析方法
CN115424615A (zh) 智能设备语音控制方法、装置、设备及存储介质
Morioka et al. Multiscale recurrent neural network based language model.
CN116910223B (zh) 一种基于预训练模型的智能问答数据处理系统
CN115941369A (zh) 智能家居联动的问答数据采集方法、设备、介质和系统
CN114925158A (zh) 语句文本的意图识别方法和装置、存储介质及电子装置
CN114818690A (zh) 一种评论信息生成的方法、装置及存储介质
CN117708680B (zh) 一种用于提升分类模型准确度的方法及装置、存储介质、电子装置
CN117706954B (zh) 一种用于场景生成的方法及装置、存储介质、电子装置
CN117390175B (zh) 基于bert的智能家居使用事件抽取方法
CN114385805B (zh) 一种提高深度文本匹配模型适应性的小样本学习方法
CN112951235B (zh) 一种语音识别方法及装置
CN110070093B (zh) 一种基于对抗学习的远监督关系抽取去噪方法
CN117807215A (zh) 一种基于模型的语句多意图识别方法、装置及设备
CN117010378A (zh) 语义转换方法和装置、存储介质及电子装置
CN116049361A (zh) 目标对话模型的训练方法及装置、电子装置
CN118051625A (zh) 一种用于优化场景生成模型的方法及装置、存储介质、电子装置
CN115271207A (zh) 一种基于门控图神经网络的序列关系预测方法以及装置
CN116432658A (zh) 语音数据的处理方法和装置、存储介质及电子装置
Zhang et al. Multi-domain adaptation for cross-domain semantic slot filling

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22930467

Country of ref document: EP

Kind code of ref document: A1