WO2023173596A1 - Intent recognition method and apparatus for sentence text, storage medium and electronic apparatus - Google Patents

Intent recognition method and apparatus for sentence text, storage medium and electronic apparatus

Info

Publication number
WO2023173596A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
entity
initial
recognition model
label
Prior art date
Application number
PCT/CN2022/096435
Other languages
English (en)
French (fr)
Inventor
刘建国
王迪
李昱涧
Original Assignee
青岛海尔科技有限公司
海尔智家股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 青岛海尔科技有限公司, 海尔智家股份有限公司
Publication of WO2023173596A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Definitions

  • the present disclosure relates to the field of smart home technology, and specifically to a method and device for recognizing the intention of sentence text, a storage medium and an electronic device.
  • NLP: Natural Language Processing
  • in the prior art, training data is often annotated with labels such as person names, organization names, place names, currencies, and percentages; the annotated training data is then fed into a pre-built recognition model, and the model's prediction is taken as the intent expressed by the training data.
  • in smart-device control scenarios, however, such labels have low relevance to smart devices, which may reduce the accuracy of identifying the intent expressed by the user's language.
  • Embodiments of the present disclosure provide a method and device, a storage medium, and an electronic device for identifying the intention of a sentence text, so as to at least solve the problem in the related art of low accuracy in identifying the intention expressed by the sentence text.
  • a method for intent recognition of sentence text, including: obtaining the sentence text collected by a smart device as the target sentence text to be recognized;
  • performing entity recognition on the target sentence text to obtain a target entity label, where the target entity label is used to characterize the target operation information of a target control operation corresponding to the target sentence text, and the target control operation is the control operation that the target sentence text indicates is to be performed on the smart device;
  • according to the target entity label, identifying the target intention feature corresponding to the target sentence text, where the target intention feature is used to indicate the operation intention of the target sentence text toward the smart device.
  • performing entity recognition on the target sentence text to obtain a target entity label includes:
  • the target sentence text is input into the target entity recognition model, where the target entity recognition model is obtained by training an initial entity recognition model using text samples annotated with entity labels;
  • the entity labels include: operation time, operation location, operation resource attributes, operation device, and operation mode;
  • before inputting the target sentence text into the target entity recognition model, the method further includes:
  • the model parameters of the initial entity recognition model are adjusted according to the loss value until the training cutoff condition is met, and the target entity recognition model is obtained.
  • inputting the text sample into the initial entity recognition model to obtain the initial entity label output by the initial entity recognition model includes: inputting the text sample into an initial label prediction layer, and inputting the initial predicted label output by the initial label prediction layer into an initial condition constraint layer to obtain the initial entity label output by the initial condition constraint layer; where the initial entity recognition model includes the initial label prediction layer and the initial condition constraint layer, the initial entity recognition model is used to predict the predicted labels corresponding to the input parameters and the prediction probability corresponding to each predicted label, and the initial condition constraint layer is used to add constraints to the predicted labels predicted by the initial entity recognition model and the prediction probability corresponding to each predicted label, to obtain entity labels that satisfy the constraints;
  • the adjusting the model parameters of the initial entity recognition model according to the loss value includes: adjusting the prediction parameters of the initial label prediction layer and the constraints of the initial condition constraint layer according to the loss value. , wherein the model parameters of the initial entity recognition model include the prediction parameters and the constraint conditions.
  • inputting the text sample into the initial entity recognition model includes:
  • the text vector is input into the initial entity recognition model.
  • identifying the target intention characteristics corresponding to the target sentence text according to the target entity tag includes:
  • the target intent recognition model is obtained by training an initial intent recognition model using entity label samples marked with intent features;
  • identifying the target entity tag through a target intent recognition model includes:
  • the target component characteristics and the target entity label are input into the target intention recognition model, and the intention recognition result output by the target intention recognition model is obtained.
  • a device for recognizing the intention of sentence text including:
  • the acquisition module is configured to obtain the sentence text collected by the smart device as the target sentence text to be recognized;
  • the first recognition module is configured to perform entity recognition on the target sentence text to obtain a target entity label, where the target entity label is used to characterize the target operation information of the target control operation corresponding to the target sentence text, and the target control operation is the control operation that the target sentence text indicates is to be performed on the smart device;
  • the second identification module is configured to identify the target intention feature corresponding to the target sentence text according to the target entity tag, wherein the target intention feature is used to indicate the operating intention of the target sentence text for the smart device.
  • a computer-readable storage medium stores a computer program, where the computer program is configured to execute the above intent recognition method for sentence text when run.
  • an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the above intent recognition method for sentence text through the computer program.
  • the sentence text collected by the smart device is obtained as the target sentence text to be recognized; entity recognition is performed on the target sentence text to obtain the target entity label, where the target entity label is used to characterize the target operation information of the target control operation corresponding to the target sentence text, and the target control operation is the control operation that the target sentence text indicates is to be performed on the smart device; according to the target entity label, the target intention feature corresponding to the target sentence text is identified, where the target intention feature is used to indicate the operation intention of the target sentence text toward the smart device.
  • that is, if the target sentence text to be recognized is obtained, the target operation information of the control operation that the target sentence text indicates is to be performed on the smart device is identified and used as the target entity label, and the operation intention of the target sentence text toward the smart device is identified based on the target entity label; using a target entity label that is highly correlated with the operation intention toward the smart device improves the accuracy of identifying the operation intention of the target sentence text toward the smart device.
  • the above technical solution solves the problem in the related art of low accuracy in identifying the intention expressed by sentence text, and achieves the technical effect of improving that accuracy.
  • Figure 1 is a schematic diagram of the hardware environment of an intention recognition method for sentence text according to an embodiment of the present disclosure
  • Figure 2 is a flow chart of an intention recognition method for sentence text according to an embodiment of the present disclosure
  • Figure 3 is an architectural diagram of an optional BiLSTM model according to an embodiment of the present disclosure
  • Figure 4 is an overall model architecture diagram of optional recognition of the intention of sentence text according to an embodiment of the present disclosure
  • Figure 5 is a flow chart for identifying target intent features of a target sentence text according to an embodiment of the present disclosure
  • Figure 6 is a schematic diagram of an intention recognition method of sentence text according to an embodiment of the present disclosure.
  • Figure 7 is an optional model architecture diagram for identifying language components of a target sentence according to an embodiment of the present disclosure
  • Figure 8 is a schematic diagram of a scene of voice interaction between a user and a smart speaker according to an embodiment of the present disclosure
  • Figure 9 is a schematic diagram of a scene of voice interaction between a user and a smart TV according to an embodiment of the present disclosure.
  • Figure 10 is a structural block diagram of a device for recognizing the intention of sentence text according to an embodiment of the present disclosure.
  • an intention recognition method of sentence text is provided.
  • the intent recognition method for sentence text is widely used in whole-house intelligent digital control application scenarios such as the Smart Home, smart household, smart home device ecosystem, and Intelligence House ecosystem.
  • the above intention recognition method of sentence text can be applied to the hardware environment composed of the terminal device 102 and the server 104 as shown in FIG. 1 .
  • the server 104 is connected to the terminal device 102 through a network and can be configured to provide services (such as application services) to the terminal or to a client installed on the terminal.
  • a database can be set up on the server or independently of the server and configured to provide data storage services for the server 104; cloud computing and/or edge computing services can be configured on the server or independently of the server and configured to provide data computing services for the server 104.
  • the above-mentioned network may include but is not limited to at least one of the following: wired network, wireless network.
  • the above-mentioned wired network may include but is not limited to at least one of the following: wide area network, metropolitan area network, and local area network.
  • the above-mentioned wireless network may include at least one of the following: Wi-Fi (Wireless Fidelity), Bluetooth.
  • the terminal device 102 may be, but is not limited to, a PC (Personal Computer), a mobile phone, a tablet, a smart air conditioner, a smart range hood, a smart refrigerator, a smart oven, a smart stove, a smart washing machine, a smart water heater, smart washing equipment, a smart dishwasher, smart projection equipment, a smart TV, a smart clothes-drying rack, smart curtains, smart audio-video equipment, smart sockets, a smart stereo, a smart speaker, smart fresh-air equipment, smart kitchen and bathroom equipment, smart bathroom fixtures, a smart sweeping robot, a smart window-cleaning robot, a smart mopping robot, smart air purification equipment, a smart steam oven, a smart microwave oven, smart kitchen appliances, a smart purifier, a smart water dispenser, a smart door lock, etc.
  • FIG. 2 is a flow chart of an intent recognition method for sentence text according to an embodiment of the present disclosure. As shown in Figure 2, the process includes the following steps:
  • Step S202: Obtain the sentence text collected by the smart device as the target sentence text to be recognized;
  • Step S204: Perform entity recognition on the target sentence text to obtain a target entity label, where the target entity label is used to characterize the target operation information of the target control operation corresponding to the target sentence text, and the target control operation is the control operation that the target sentence text indicates is to be performed on the smart device;
  • Step S206: Identify the target intention feature corresponding to the target sentence text according to the target entity label, where the target intention feature is used to indicate the operation intention of the target sentence text toward the smart device.
  • through the above steps, if the target sentence text to be recognized is obtained, the target operation information of the control operation that the target sentence text indicates is to be performed on the smart device is identified as the target entity label, and the operation intention of the target sentence text toward the smart device is identified according to the target entity label; through a target entity label that is highly correlated with the operation intention toward the smart device, the accuracy of identifying the operation intention of the target sentence text toward the smart device is improved.
  • the above technical solution is used to solve the problem of low accuracy in identifying the intention expressed by the sentence text in related technologies, and achieve the technical effect of improving the accuracy of identifying the intention expressed by the sentence text.
  • the smart device can, but is not limited to, convert a voice command issued by the user into the corresponding sentence text, or convert text content input by the user on the smart device into the corresponding sentence text, and so on; this makes it possible to obtain the language content the user wants to express in multiple ways, allows the user to operate in multiple ways, and improves the user's operating experience.
  • smart devices may include, but are not limited to, devices that support performing corresponding operations according to user voice instructions, etc.
  • smart devices may include, but are not limited to, smart air conditioners, smart range hoods, smart refrigerators, smart ovens, smart stoves, smart washing machines, smart water heaters, smart washing equipment, smart dishwashers, smart projection equipment, smart TVs, smart clothes-drying racks, smart curtains, smart sockets, smart stereos, smart speakers, smart fresh-air equipment, smart kitchen and bathroom equipment, smart bathroom fixtures, smart sweeping robots, smart window-cleaning robots, smart mopping robots, smart air purification equipment, smart steam ovens, smart microwave ovens, smart kitchen water heaters, smart purifiers, smart water dispensers, smart door locks, smart in-car air conditioners, smart wipers, smart in-car speakers, smart in-car refrigerators, etc.
  • the target operation information of the control operation that the target sentence text indicates is to be performed on the smart device may be obtained, but is not limited to, by performing entity recognition on the target sentence text.
  • the target operation information may include, but is not limited to, operation information related to the target control operation that the sentence text indicates is to be performed on the smart device.
  • the target operation information may be, but is not limited to, used as the target entity label, which improves the correlation between the target entity label and the control operation that the target sentence text indicates is to be performed on the smart device.
  • the target entity label may be obtained in the following manner, but is not limited to it: inputting the target sentence text into a target entity recognition model, where the target entity recognition model is obtained by training an initial entity recognition model using text samples annotated with entity labels;
  • the entity labels include: operation time, operation location, operation resource attributes, operation device, and operation mode; the target entity label output by the target entity recognition model is obtained.
  • the target sentence text can be, but is not limited to, input into the target entity recognition model, and the operation time, operation location, operation resource attributes, operation device, operation mode, and so on of the control operation performed on the smart device, as output by the target entity recognition model, serve as the target entity label.
  • the operating location may include, but is not limited to, a room or a functional area.
  • a room is, for example, a room marked with a label such as a bedroom, living room, study room, kitchen, or video room;
  • a functional area is, for example, an indoor area marked with a specific function, such as an entertainment area, a cooking area, a study area, a laundry area, or a dressing area.
  • the operation resource attributes may include, but are not limited to, operation resources (for example: audio resources such as songs and audiobooks, and video resources such as TV series and movies) and the performers of operation resources (for example: the singer who performs a song, the leading actor of a movie, and so on).
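  • as an illustration of this tag scheme, the sketch below shows one possible BIO-style annotation of a training sentence using the tag types named above; the tag names and the sentence are illustrative assumptions, not the patent's actual annotation data:

```python
# Illustrative only: a possible BIO tagging of one training sentence
# ("tomorrow, play Jay Chou's songs in the bedroom") for the entity-tag
# types named above. The tag names are assumptions, not the patent's scheme.
sample = {
    "text":   ["明", "天", "在", "卧", "室", "放", "周", "杰", "伦", "的", "歌"],
    "labels": ["B-TIME", "I-TIME",                     # 明天  -> operation time
               "O",
               "B-ROOM", "I-ROOM",                     # 卧室  -> operation location
               "B-PAT",                                # 放    -> operation mode (play)
               "B-SINGER", "I-SINGER", "I-SINGER",     # 周杰伦 -> operation resource attribute
               "O",
               "B-RES"],                               # 歌    -> operation resource
}
assert len(sample["text"]) == len(sample["labels"])   # one tag per character
```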
  • the target entity recognition model may be obtained in the following manner, but is not limited to it: the text sample is input into the initial entity recognition model to obtain the initial entity label output by the initial entity recognition model; the initial entity label and the entity label annotated on the text sample are input into a preset loss function to obtain a loss value; the model parameters of the initial entity recognition model are adjusted according to the loss value until a training cutoff condition is met, yielding the target entity recognition model.
  • the initial entity recognition model may be trained, but is not limited to, by using text samples annotated with entity labels, and the training cutoff condition may include, but is not limited to: the loss value between the initial entity labels and the entity labels annotated on the text samples is less than or equal to a loss-value threshold, or the loss tends to a constant, or the number of training iterations reaches a predetermined number.
  • at that point the initial entity recognition model has converged; the model parameters that make it converge may, but are not limited to, be used as the target model parameters to obtain the target entity recognition model.
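  • a minimal sketch of this training procedure, assuming a PyTorch-style model and a placeholder loss function (the patent specifies neither an optimizer nor concrete hyperparameters):

```python
import torch

def train_entity_model(model, batches, loss_fn, max_epochs=50, loss_threshold=0.01):
    """Sketch of the training loop described above: adjust the initial entity
    recognition model's parameters from the loss until a cutoff condition
    (loss below a threshold, loss plateau, or max epochs) is met.
    `model`, `batches`, and `loss_fn` are placeholders, not the patent's code."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    prev_loss = float("inf")
    for epoch in range(max_epochs):
        epoch_loss = 0.0
        for text_vectors, gold_labels in batches:
            predicted = model(text_vectors)          # initial entity labels
            loss = loss_fn(predicted, gold_labels)   # preset loss function
            optimizer.zero_grad()
            loss.backward()                          # backprop (BPTT through the BiLSTM)
            optimizer.step()                         # adjust model parameters
            epoch_loss += loss.item()
        # training cutoff: small loss, loss no longer decreasing, or max epochs
        if epoch_loss <= loss_threshold or abs(prev_loss - epoch_loss) < 1e-6:
            break
        prev_loss = epoch_loss
    return model  # the trained target entity recognition model
```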
  • the initial entity label can be obtained in the following manner, but is not limited to it: the text sample is input into the initial label prediction layer; the initial predicted labels output by the initial label prediction layer are input into the initial condition constraint layer to obtain the initial entity label output by the initial condition constraint layer; where the initial entity recognition model includes the initial label prediction layer and the initial condition constraint layer, and the initial entity recognition model is used to predict the predicted labels corresponding to the input parameters and the prediction probability corresponding to each predicted label.
  • the initial condition constraint layer is used to add constraints to the predicted labels predicted by the initial entity recognition model and to the prediction probability corresponding to each predicted label, yielding entity labels that satisfy the constraints.
  • the initial label prediction layer may, but is not limited to, include a network using the LSTM (Long Short-Term Memory) model architecture, or a network using the BiLSTM (Bi-directional Long Short-Term Memory) model architecture, and so on.
  • Figure 3 is an architecture diagram of an optional BiLSTM model according to an embodiment of the present disclosure. As shown in Figure 3, when predicting the entity labels corresponding to "Play loudspeaker in bedroom", the BiLSTM model can, but is not limited to, perform forward prediction and backward prediction and concatenate the two results, predicting "Play" as the "B-PAT" label, where "PAT" represents the play mode (PATTERN), "loudspeaker" as the "B-DEV" label, where "DEV" represents the operation device (DEVICE), "in" as the "O" label, where "O" represents other (OTHER), and "bedroom" as the "B-ROOM" label, where "ROOM" represents the operation room.
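  • a minimal sketch of such a label prediction layer, assuming PyTorch and illustrative dimensions (the patent does not disclose concrete layer sizes):

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Sketch of the label prediction layer: a bidirectional LSTM whose
    forward and backward hidden states are concatenated per token and
    projected to per-tag scores. All dimensions are illustrative assumptions."""
    def __init__(self, embed_dim=768, hidden=256, num_tags=13):
        super().__init__()
        self.bilstm = nn.LSTM(embed_dim, hidden, batch_first=True,
                              bidirectional=True)    # forward + backward pass
        self.to_tags = nn.Linear(2 * hidden, num_tags)

    def forward(self, word_vectors):                 # (batch, seq_len, embed_dim)
        states, _ = self.bilstm(word_vectors)        # (batch, seq_len, 2*hidden)
        return self.to_tags(states)                  # unnormalized per-tag scores

# e.g. word vectors for "Play loudspeaker in bedroom" -> 4 tokens
scores = BiLSTMTagger()(torch.randn(1, 4, 768))
print(scores.shape)  # torch.Size([1, 4, 13]): one score per token per tag
```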
  • each character in the text sample may, but is not limited to, correspond to one or more predicted labels.
  • each predicted label has a prediction probability corresponding to it.
  • the initial condition constraint layer may, but is not limited to, add constraints to the output of the initial label prediction layer and output predicted labels that satisfy the constraints and correspond one-to-one to each character in the text sample.
  • the prediction probability corresponding to each predicted label may, but is not limited to, be unnormalized (that is, the prediction probability may be greater than 1, or lie between 0 and 1 inclusive) or normalized (that is, the prediction probability is greater than or equal to 0 and less than or equal to 1), and so on.
  • the prediction probability corresponding to each predicted label can be normalized through, but is not limited to, a Softmax (classification network) model.
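  • for instance, a small sketch of how softmax maps unnormalized per-tag scores to normalized probabilities (the concrete numbers are made up for illustration):

```python
import torch

# Unnormalized per-tag scores (logits) for one token can exceed 1 or be
# negative; softmax maps them to probabilities in [0, 1] that sum to 1.
logits = torch.tensor([2.3, 0.1, -1.2, 0.4])   # e.g. B-PAT, B-DEV, O, B-ROOM
probs = torch.softmax(logits, dim=-1)
print(probs, probs.sum())                       # all in [0, 1], sums to 1.0
```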
  • the initial condition constraint layer may, but is not limited to, use a CRF (Conditional Random Field) model, and so on.
  • the CRF model can make full use of the information in the BiLSTM model, improving the accuracy of the predicted labels output by the CRF.
  • the constraints can be, but are not limited to, learned from the text samples by the initial condition constraint layer, for example: the first word of a sentence should be a "B-" or "O" label rather than an "I-" label; in "B-label1 I-label1 I-label2 ...", label1 and label2 should be the same component category; "O I-label" is wrong, as the beginning should be "B-" rather than "I-".
  • by learning these constraints, the accuracy of the predicted labels predicted by the initial label prediction layer can be effectively improved.
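  • the sketch below expresses these learned constraints as explicit hard rules for illustration; an actual CRF layer learns soft transition scores from data rather than applying hand-written rules, and the tag set here is an assumption:

```python
# Hand-written stand-in for the tag-transition constraints a CRF learns.
tags = ["O", "B-DEV", "I-DEV", "B-ROOM", "I-ROOM"]

def transition_allowed(prev: str, nxt: str) -> bool:
    if nxt.startswith("I-"):
        # "I-x" must continue a "B-x" or "I-x" of the same entity type,
        # so "O I-x" is invalid and a sentence cannot start with "I-x".
        return prev.startswith(("B-", "I-")) and prev[2:] == nxt[2:]
    return True  # "O" and any "B-" tag may follow anything

allowed = {(p, n): transition_allowed(p, n) for p in tags for n in tags}
assert not allowed[("O", "I-DEV")]        # the "O I-label" error
assert not allowed[("B-DEV", "I-ROOM")]   # mixed entity types
assert allowed[("B-DEV", "I-DEV")]
```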
  • the method may include, but is not limited to, determining, among the multiple predicted labels corresponding to each character in the text sample, the predicted label with the highest prediction probability as the entity label, or determining, as the entity labels, the sequence of predicted labels over the characters of the text sample whose prediction probabilities have the largest sum.
  • the model parameters of the initial entity recognition model may be adjusted, but are not limited to, in the following manner: the prediction parameters of the initial label prediction layer and the constraints of the initial condition constraint layer are adjusted according to the loss value, where the model parameters of the initial entity recognition model include the prediction parameters and the constraints.
  • the prediction parameters of the initial label prediction layer and the constraints of the initial condition constraint layer may be, but are not limited to, adjusted according to the loss value when the loss value is greater than the loss-value threshold or has not yet tended to a constant value.
  • text samples may be input into the initial entity recognition model in the following manner, but are not limited to it: the text samples are vectorized to obtain text vectors corresponding to the text samples; the text vectors are input into the initial entity recognition model.
  • a BERT (Bidirectional Encoder Representation from Transformers) model, a Roberta (A Robustly Optimized BERT Pretraining Approach) model, or the like converts each character in the text sample into the word vector corresponding to that character.
  • the Roberta model has a strong ability to obtain dynamic word vectors and optimizes the network structure at the levels of model details, training strategy, and data; the Roberta model can quickly and accurately convert each target character of the target sentence text into the corresponding word vector, saving the time of converting text into word vectors and improving the efficiency of that conversion.
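  • a sketch of this vectorization step using the Hugging Face transformers library; the checkpoint name is an assumption for illustration and is not named in the patent:

```python
# Sketch only: convert a sentence into contextual word vectors with a
# RoBERTa-style Chinese encoder (checkpoint name is an assumption).
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")
encoder = AutoModel.from_pretrained("hfl/chinese-roberta-wwm-ext")

enc = tokenizer("打开卧室的空调", return_tensors="pt")
# enc carries the three modules named in step S605 below: input_ids (word
# ids), token_type_ids (sentence/segment ids), attention_mask (input mask).
outputs = encoder(**enc)
word_vectors = outputs.last_hidden_state   # one contextual vector per token
print(word_vectors.shape)                  # (1, seq_len, 768)
```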
  • the intention of the control operation that the target sentence text performs on the smart device can be identified through, but is not limited to, a target entity label that is highly correlated with the control operation performed on the smart device.
  • in smart-home appliance control scenarios, an intelligent dialogue system often needs to accurately identify the operation intention toward the smart device expressed by the user's language, and then control the smart device to perform the operation the user wants; identifying the operation intention of the target sentence text toward the smart device through the target entity label achieves intent recognition that is fast, effective, and accurate.
  • the target intent feature may be identified in the following manner, but is not limited to it: the target entity label is recognized through a target intent recognition model, where the target intent recognition model is obtained by training an initial intent recognition model using entity label samples annotated with intent features; the intent recognition result output by the target intent recognition model is obtained as the target intent feature.
  • the target entity label output by the target entity recognition model may be, but is not limited to, input into the target intent recognition model.
  • the target intent recognition model recognizes the intention of the control operation, indicated by the target entity label, to be performed on the smart device, and outputs the target intent feature corresponding to the target entity label.
  • FIG. 4 is an overall model architecture diagram for optionally identifying the intention of a sentence text according to an embodiment of the present disclosure, as shown in FIG. 4 .
  • the target entity label may be identified in the following manner, but is not limited to it: performing language component analysis on the target sentence text to obtain target component features; inputting the target component features and the target entity label into the target intent recognition model to obtain the intent recognition result output by the target intent recognition model.
  • the target component features may, but are not limited to, include at least one of the following: subject, predicate, object, attributive, adverbial, complement, subject clause, predicate clause, object clause, attributive clause, adverbial clause, complement clause, and so on; by analyzing the language components of the target sentence text, the utilization of the information contained in the sentence text is improved.
  • the language components of the target sentence text can be, but are not limited to, combined with the target entity label, and the intention of the control operation performed by the target sentence text on the smart device is identified through the target intent recognition model, thereby making full use of the information contained in the sentence text and improving the accuracy of identifying the operation intention of the target sentence text toward the smart device (a sketch follows below).
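  • since the patent does not disclose the internals of the target intent recognition model, the sketch below is only one plausible way to combine the two inputs: encode the component features and the entity labels as fixed-size vectors, concatenate them, and classify; all names and sizes are assumptions:

```python
import torch
import torch.nn as nn

class IntentClassifier(nn.Module):
    """Illustrative sketch of a target intent recognition model: it takes the
    sentence's component features (subject/predicate/object, ...) and its
    entity labels together and scores candidate intents. Feature encodings
    and layer sizes are assumptions, not the patent's design."""
    def __init__(self, component_dim=32, entity_dim=32, num_intents=20):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(component_dim + entity_dim, 64),  # concatenated features
            nn.ReLU(),
            nn.Linear(64, num_intents),
        )

    def forward(self, component_feats, entity_feats):
        joint = torch.cat([component_feats, entity_feats], dim=-1)
        return self.mlp(joint)                          # scores over intents

scores = IntentClassifier()(torch.randn(1, 32), torch.randn(1, 32))
intent = scores.argmax(dim=-1)                          # most likely intent id
```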
  • the target intention characteristics of the target sentence text can be identified through, but not limited to, the target entity recognition model and the target intention recognition model, combined with the language components of the target sentence text.
  • Figure 5 is a flow chart of identifying the target intent feature of target sentence text according to an embodiment of the present disclosure.
  • Figure 5 can be, but is not limited to, applied to the model architecture shown in Figure 4 above; as shown in Figure 5, the process can, but is not limited to, include the following steps:
  • Step S501: Obtain the target sentence text;
  • Step S502: Input the target sentence text into the target entity recognition model;
  • Step S503: The target entity recognition model recognizes the entity labels of the target sentence text and obtains the target entity label;
  • Step S504: The target entity recognition model outputs the target entity label;
  • Step S505: Input the target entity label into the target intent recognition model;
  • Step S506: The target intent recognition model identifies the target intent feature of the target sentence text by combining the target component features and the target entity label;
  • Step S507: The target intent recognition model outputs the target intent feature.
  • Figure 6 is a schematic diagram of a method for identifying the intention of a statement text according to an embodiment of the present disclosure. As shown in Figure 6, it may, but is not limited to, include the following steps:
  • Step S601: Collect and clean the text data;
  • Step S602: Determine the entity tags to be annotated on the text data and their number.
  • the entity tags may, but are not limited to, include at least one of the following: time (i.e., the above-mentioned operation time), room (i.e., the above-mentioned operation location), resource (i.e., the above-mentioned operation resource attributes), singer (i.e., the above-mentioned operation resource attributes), device (i.e., the above-mentioned operation device), and mode (i.e., the above-mentioned operation mode), and so on;
  • Step S603: Annotate the entity tags of the text data. Manual labeling can, but is not limited to, be performed according to the determined annotation rules, marking each entity-word component in each sentence of the text data, to obtain the sample data for model training;
  • Step S604: The sample data may be, but is not limited to, split into a training set, a validation set, and a test set to obtain the training data;
  • Step S605: Input the training data into the Roberta pre-trained model for vectorization.
  • the vectorization can be, but is not limited to, divided into three modules: input-ids (a tensor composed of the word ids (identifications) in the input data), segment-ids (a tensor composed of the sentence ids in the input data), and input-mask (the input data mask); the three vectorization results are fused to obtain the Embedding (word vector) output;
  • Step S606: The BiLSTM model predicts the entity label corresponding to each word vector and the entity-label probability corresponding to each entity label. The multiple word vectors output by the Roberta pre-trained model can, but are not limited to, be used as the input of the BiLSTM model; the n-dimensional character vectors thus obtained serve as the input of each time step of the BiLSTM network, yielding the hidden-state sequence of the BiLSTM layer.
  • the learnable parameters of the BiLSTM model can be updated, but are not limited to, using the BPTT (Back-Propagation Through Time) algorithm; in the forward and backward phases this model differs from a general model in that the hidden-layer computation must be unrolled over all time steps;
  • Step S607: The Softmax layer normalizes the entity-label probability corresponding to each entity label. The multiple word vectors output by the BiLSTM, the entity label corresponding to each word vector, and the entity-label probability corresponding to each entity label can be, but are not limited to, input into the logit layer, where the logit layer is the input of the Softmax layer; the Softmax layer outputs the multiple word vectors, the entity label corresponding to each word vector, and the normalized entity-label probability corresponding to each entity label;
  • Step S608: The CRF layer outputs the predicted entity label corresponding to each word vector. The multiple word vectors output by the Softmax layer, the entity label corresponding to each word vector, and the normalized entity-label probability corresponding to each entity label may be, but are not limited to, input into the CRF layer.
  • the CRF layer can add some constraints to the final predicted entity label to ensure that the predicted entity label is valid. These constraints are automatically learned from the training data set during the training process of the CRF layer.
  • CRF uses the output of LSTM on the i-th tag (classification) at each time t as a point function in the feature function, which introduces nonlinearity into the original CRF.
  • the overall model is still a large framework with CRF as the main body, so that the information in LSTM can be fully reused, and finally the globally optimal output sequence can be obtained;
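  • a sketch of this CRF scoring and decoding step using the third-party pytorch-crf package (an assumption; the patent names no library):

```python
# Sketch only: the CRF layer scores whole tag sequences and
# Viterbi-decodes the globally optimal one.
import torch
from torchcrf import CRF

num_tags = 13
crf = CRF(num_tags, batch_first=True)
emissions = torch.randn(1, 4, num_tags)   # per-token tag scores from the BiLSTM
tags = torch.randint(num_tags, (1, 4))    # gold labels, used during training

loss = -crf(emissions, tags)              # negative log-likelihood to minimize
best_path = crf.decode(emissions)         # globally optimal label sequence
print(best_path)                          # e.g. [[5, 1, 0, 3]]
```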
  • Step S609: Calculate the loss between the predicted entity labels and the real entity labels. After the loss against the real labels of the training data is computed, each epoch iterates and the parameters of the neural-network nodes are continuously updated through the BPTT algorithm, so that the loss gradually decreases and the model finally reaches a converged state; after optimization the model keeps the loss small, and the trained model achieves a high accuracy rate on new data;
  • Step S610: When the loss is less than or equal to the loss threshold, model deployment is complete. During sentence-intent parsing, new data is passed into the model to obtain predicted labels which, combined with expert rules, complete the precise identification of the intent.
  • Figure 7 is an optional model architecture diagram for identifying language components of a target sentence according to an embodiment of the present disclosure.
  • the above steps S601 to S610 can be, but are not limited to, used with the model architecture shown in Figure 7: named-entity annotation rules and a named-entity recognition neural network are constructed for smart home appliance scenes, and the language components of the sentence text are combined to complete high-accuracy intent recognition; the Roberta pre-trained model can be, but is not limited to, used to complete the embedding (word vectors) of the input words, making word vectorization simple and efficient while the vectors carry richer information and meaning, improving the accuracy of vectorization; at the same time, using the state-transition matrix of the CRF model greatly improves the effectiveness of label prediction.
  • the model structure improves training speed and prediction accuracy, and provides a new processing method in the field of intent recognition.
  • FIG 8 is a schematic diagram of a scene of voice interaction between users and smart speakers according to an embodiment of the present disclosure.
  • the user can, but is not limited to, issue the voice command "play the next song" while the smart speaker is playing a song; the entity tags identified for "play the next song" can, but are not limited to, include a device tag (corresponding to the smart speaker) and a mode tag (corresponding to the next song).
  • based on the device tag and the mode tag, the intention of the control operation of "play the next song" on the smart speaker can, but is not limited to, be identified as controlling the smart speaker to play the next song in the current playlist; the smart speaker can then, but is not limited to, be controlled in response to the user's voice command to play the next song in the current playlist.
  • Figure 9 is a schematic diagram of a scene of voice interaction between a user and a smart TV according to an embodiment of the present disclosure.
  • the smart TV can be, but is not limited to, playing sports news when the user's utterance "the screen is too bright" is obtained.
  • the entity tags identified for "the screen is too bright" may, but are not limited to, include a device tag (corresponding to the smart TV screen) and a mode tag (corresponding to "bright"); based on the device tag and the mode tag, the intention can, but is not limited to, be identified as reducing the display brightness of the smart TV screen.
  • in response to the user's voice command, the smart TV screen can be, but is not limited to, made to reduce the display brightness by 5% (or 10%, 15%, and so on).
  • the shapes of the smart speaker and the smart TV are not limited; the figures take a cylindrical smart speaker and a rectangular smart TV as examples, and the shapes of smart speakers and smart TVs can be any shape that meets the production process and user needs, which this disclosure does not limit.
  • the method according to the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform; it can, of course, also be implemented by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present disclosure, in essence or in the part that contributes over the existing technology, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions to cause a terminal device (which can be a mobile phone, a computer, a server, a network device, or the like) to execute the methods of the various embodiments of the present disclosure.
  • Figure 10 is a structural block diagram of an intention recognition device for sentence text according to an embodiment of the present disclosure; as shown in Figure 10, it includes:
  • the acquisition module 102 is configured to acquire the sentence text collected by the smart device as the target sentence text to be recognized;
  • the first identification module 104 is configured to perform entity recognition on the target sentence text to obtain a target entity tag, wherein the target entity tag is used to characterize the target operation information of the target control operation corresponding to the target sentence text, and the The target control operation is a control operation that the target sentence text indicates to be performed on the smart device;
  • the second identification module 106 is configured to identify the target intention feature corresponding to the target sentence text according to the target entity label, where the target intention feature is used to indicate the operation intention of the target sentence text toward the smart device.
  • if the target sentence text to be recognized is obtained, the target operation information of the control operation that it indicates is to be performed on the smart device is identified as the target entity label, and the operation intention of the target sentence text toward the smart device is identified according to the target entity label.
  • through target entity labels that are highly correlated with the operation intention toward the smart device, the accuracy of identifying the operation intention of the target sentence text toward the smart device is improved.
  • the above technical solution solves the problem in the related art of low accuracy in identifying the intention expressed by sentence text, and achieves the technical effect of improving that accuracy.
  • the first identification module includes:
  • the first input unit is configured to input the target sentence text into a target entity recognition model, where the target entity recognition model is obtained by training an initial entity recognition model using text samples annotated with entity labels, and the entity labels include: operation time, operation location, operation resource attributes, operation device, and operation mode;
  • the first acquisition unit is configured to acquire the target entity label output by the target entity recognition model.
  • the device further includes:
  • the first input module is configured to input the text sample into the initial entity recognition model before inputting the target sentence text into the target entity recognition model to obtain the initial entity label output by the initial entity recognition model;
  • the second input module is configured to input the initial entity label and the entity label marked by the text sample into a preset loss function to obtain a loss value
  • the adjustment module is configured to adjust the model parameters of the initial entity recognition model according to the loss value until the training cutoff condition is met to obtain the target entity recognition model.
  • the first input module is configured to: input the text sample into the initial label prediction layer; input the initial predicted labels output by the initial label prediction layer into the initial condition constraint layer to obtain the initial entity label output by the initial condition constraint layer;
  • where the initial entity recognition model includes the initial label prediction layer and the initial condition constraint layer, the initial entity recognition model is used to predict the predicted labels corresponding to the input parameters and the prediction probability corresponding to each predicted label, and the initial condition constraint layer is used to add constraints to the predicted labels predicted by the initial entity recognition model and the prediction probability corresponding to each predicted label, to obtain entity labels that satisfy the constraints;
  • the adjustment module is configured to adjust the prediction parameters of the initial label prediction layer and the constraint conditions of the initial condition constraint layer according to the loss value, wherein the model parameters of the initial entity recognition model include the prediction parameters and the constraints.
  • the first input module includes:
  • a vectorization unit configured to vectorize the text sample to obtain a text vector corresponding to the text sample
  • the second input unit is configured to input the text vector into the initial entity recognition model.
  • the second identification module includes:
  • An identification unit configured to identify the target entity tag through a target intention recognition model, wherein the target intention recognition model is obtained by training an initial intention recognition model using entity tag samples marked with intention features;
  • the second acquisition unit is configured to acquire the intention recognition result output by the target intention recognition model as the target intention feature.
  • the identification unit is configured as:
  • the target component characteristics and the target entity label are input into the target intention recognition model, and the intention recognition result output by the target intention recognition model is obtained.
  • An embodiment of the present disclosure also provides a storage medium that includes a stored program, wherein the method of any of the above items is executed when the program is run.
  • the above-mentioned storage medium may be configured to store program code for performing the following steps:
  • S1: Obtain the sentence text collected by the smart device as the target sentence text to be recognized;
  • S2: Perform entity recognition on the target sentence text to obtain a target entity label, where the target entity label is used to characterize the target operation information of the target control operation corresponding to the target sentence text, and the target control operation is the control operation that the target sentence text indicates is to be performed on the smart device;
  • S3: According to the target entity label, identify the target intention feature corresponding to the target sentence text, where the target intention feature is used to indicate the operation intention of the target sentence text toward the smart device.
  • Embodiments of the present disclosure also provide an electronic device, including a memory and a processor.
  • a computer program is stored in the memory, and the processor is configured to run the computer program to perform the steps in any of the above method embodiments.
  • the above-mentioned electronic device may further include a transmission device and an input-output device, wherein the transmission device is connected to the above-mentioned processor, and the input-output device is connected to the above-mentioned processor.
  • the above-mentioned processor may be configured to perform the following steps through a computer program:
  • S1: Obtain the sentence text collected by the smart device as the target sentence text to be recognized;
  • S2: Perform entity recognition on the target sentence text to obtain a target entity label, where the target entity label is used to characterize the target operation information of the target control operation corresponding to the target sentence text, and the target control operation is the control operation that the target sentence text indicates is to be performed on the smart device;
  • S3: According to the target entity label, identify the target intention feature corresponding to the target sentence text, where the target intention feature is used to indicate the operation intention of the target sentence text toward the smart device.
  • the above storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, and other media that can store program code.
  • the modules or steps of the present disclosure can be implemented using general-purpose computing devices; they can be concentrated on a single computing device or distributed across a network composed of multiple computing devices; optionally, they may be implemented in program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device, and in some cases the steps shown or described may be performed in a sequence different from that herein, or they may each be made into individual integrated circuit modules, or multiple modules or steps among them may be made into a single integrated circuit module. As such, the present disclosure is not limited to any specific combination of hardware and software.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Machine Translation (AREA)

Abstract

The present disclosure provides an intent recognition method and apparatus for sentence text, a storage medium, and an electronic apparatus, relating to the field of smart home technology. The intent recognition method for sentence text includes: obtaining the sentence text collected by a smart device as the target sentence text to be recognized; performing entity recognition on the target sentence text to obtain a target entity label, where the target entity label is used to characterize the target operation information of a target control operation corresponding to the target sentence text, and the target control operation is the control operation that the target sentence text indicates is to be performed on the smart device; and, according to the target entity label, identifying the target intent feature corresponding to the target sentence text, where the target intent feature is used to indicate the operation intention of the target sentence text toward the smart device. The above technical solution solves problems in the related art such as the low accuracy of identifying the intent expressed by sentence text.

Description

Intent recognition method and apparatus for sentence text, storage medium, and electronic apparatus
The present disclosure claims priority to the Chinese patent application filed with the Chinese Patent Office on March 15, 2022, with application number 202210252555.X and invention title "语句文本的意图识别方法和装置、存储介质及电子装置" (Intent recognition method and apparatus for sentence text, storage medium, and electronic apparatus), the entire contents of which are incorporated into the present disclosure by reference.
Technical Field
The present disclosure relates to the field of smart home technology, and specifically to an intent recognition method and apparatus for sentence text, a storage medium, and an electronic apparatus.
Background
In the field of NLP (Natural Language Processing), the intent expressed by data often needs to be identified promptly and accurately. In the prior art, training data is usually annotated with labels such as person names, organization names, place names, currencies, and percentages; the annotated training data is then fed into a pre-built recognition model, and the model's prediction is taken as the intent expressed by the training data. In smart-device control scenarios, however, labels such as person names, organization names, place names, currencies, and percentages have low relevance to smart devices, which may reduce the accuracy of identifying the intent expressed by the user's language.
No effective solution has yet been proposed for problems in the related art such as the low accuracy of identifying the intent expressed by sentence text.
Summary
Embodiments of the present disclosure provide an intent recognition method and apparatus for sentence text, a storage medium, and an electronic apparatus, so as to at least solve problems in the related art such as the low accuracy of identifying the intent expressed by sentence text.
According to one embodiment of the present disclosure, an intent recognition method for sentence text is provided, including: obtaining the sentence text collected by a smart device as the target sentence text to be recognized;
performing entity recognition on the target sentence text to obtain a target entity label, where the target entity label is used to characterize the target operation information of a target control operation corresponding to the target sentence text, and the target control operation is the control operation that the target sentence text indicates is to be performed on the smart device;
according to the target entity label, identifying the target intent feature corresponding to the target sentence text, where the target intent feature is used to indicate the operation intention of the target sentence text toward the smart device.
In an exemplary embodiment, performing entity recognition on the target sentence text to obtain a target entity label includes:
inputting the target sentence text into a target entity recognition model, where the target entity recognition model is obtained by training an initial entity recognition model using text samples annotated with entity labels, the entity labels including: operation time, operation location, operation resource attributes, operation device, and operation mode;
obtaining the target entity label output by the target entity recognition model.
In an exemplary embodiment, before inputting the target sentence text into the target entity recognition model, the method further includes:
inputting the text samples into the initial entity recognition model to obtain the initial entity labels output by the initial entity recognition model;
inputting the initial entity labels and the entity labels annotated on the text samples into a preset loss function to obtain a loss value;
adjusting the model parameters of the initial entity recognition model according to the loss value until a training cutoff condition is met, to obtain the target entity recognition model.
In an exemplary embodiment, inputting the text samples into the initial entity recognition model to obtain the initial entity labels output by the initial entity recognition model includes: inputting the text samples into an initial label prediction layer; inputting the initial predicted labels output by the initial label prediction layer into an initial condition constraint layer to obtain the initial entity labels output by the initial condition constraint layer; where the initial entity recognition model includes the initial label prediction layer and the initial condition constraint layer, the initial entity recognition model is used to predict the predicted labels corresponding to the input parameters and the prediction probability corresponding to each predicted label, and the initial condition constraint layer is used to add constraints to the predicted labels predicted by the initial entity recognition model and the prediction probability corresponding to each predicted label, to obtain entity labels that satisfy the constraints;
adjusting the model parameters of the initial entity recognition model according to the loss value includes: adjusting the prediction parameters of the initial label prediction layer and the constraints of the initial condition constraint layer according to the loss value, where the model parameters of the initial entity recognition model include the prediction parameters and the constraints.
In an exemplary embodiment, inputting the text samples into the initial entity recognition model includes:
vectorizing the text samples to obtain text vectors corresponding to the text samples;
inputting the text vectors into the initial entity recognition model.
In an exemplary embodiment, identifying the target intent feature corresponding to the target sentence text according to the target entity label includes:
recognizing the target entity label through a target intent recognition model, where the target intent recognition model is obtained by training an initial intent recognition model using entity label samples annotated with intent features;
obtaining the intent recognition result output by the target intent recognition model as the target intent feature.
In an exemplary embodiment, recognizing the target entity label through the target intent recognition model includes:
performing language component parsing on the target sentence text to obtain target component features;
inputting the target component features and the target entity label into the target intent recognition model to obtain the intent recognition result output by the target intent recognition model.
According to another embodiment of the present disclosure, an intent recognition apparatus for sentence text is further provided, including:
an acquisition module, configured to obtain the sentence text collected by a smart device as the target sentence text to be recognized;
a first recognition module, configured to perform entity recognition on the target sentence text to obtain a target entity label, where the target entity label is used to characterize the target operation information of a target control operation corresponding to the target sentence text, and the target control operation is the control operation that the target sentence text indicates is to be performed on the smart device;
a second recognition module, configured to identify, according to the target entity label, the target intent feature corresponding to the target sentence text, where the target intent feature is used to indicate the operation intention of the target sentence text toward the smart device.
According to yet another aspect of the embodiments of the present disclosure, a computer-readable storage medium is further provided, in which a computer program is stored, where the computer program is configured to execute the above intent recognition method for sentence text when run.
According to yet another aspect of the embodiments of the present disclosure, an electronic apparatus is further provided, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor executes the above intent recognition method for sentence text through the computer program.
In the embodiments of the present disclosure, the sentence text collected by a smart device is obtained as the target sentence text to be recognized; entity recognition is performed on the target sentence text to obtain a target entity label, where the target entity label is used to characterize the target operation information of a target control operation corresponding to the target sentence text, and the target control operation is the control operation that the target sentence text indicates is to be performed on the smart device; according to the target entity label, the target intent feature corresponding to the target sentence text is identified, where the target intent feature is used to indicate the operation intention of the target sentence text toward the smart device. That is, if the target sentence text to be recognized is obtained, the target operation information of the control operation that the target sentence text indicates is to be performed on the smart device is identified as the target entity label, and the operation intention of the target sentence text toward the smart device is identified according to the target entity label. Through target entity labels that are highly correlated with the operation intention toward the smart device, the accuracy of identifying the operation intention of the target sentence text toward the smart device is improved. The above technical solution solves problems in the related art such as the low accuracy of identifying the intent expressed by sentence text, achieving the technical effect of improving that accuracy.
Brief Description of the Drawings
The accompanying drawings here are incorporated into and constitute a part of this specification; they illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the principles of the present disclosure.
To describe the technical solutions in the embodiments of the present disclosure or the prior art more clearly, the following briefly introduces the drawings needed in the description of the embodiments or the prior art. Obviously, a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Figure 1 is a schematic diagram of the hardware environment of an intent recognition method for sentence text according to an embodiment of the present disclosure;
Figure 2 is a flow chart of an intent recognition method for sentence text according to an embodiment of the present disclosure;
Figure 3 is an architecture diagram of an optional BiLSTM model according to an embodiment of the present disclosure;
Figure 4 is an overall model architecture diagram for optionally recognizing the intent of sentence text according to an embodiment of the present disclosure;
Figure 5 is a flow chart of identifying the target intent feature of target sentence text according to an embodiment of the present disclosure;
Figure 6 is a schematic diagram of an intent recognition method for sentence text according to an embodiment of the present disclosure;
Figure 7 is an optional model architecture diagram for identifying the language components of a target sentence according to an embodiment of the present disclosure;
Figure 8 is a schematic diagram of a voice interaction scene between a user and a smart speaker according to an embodiment of the present disclosure;
Figure 9 is a schematic diagram of a voice interaction scene between a user and a smart TV according to an embodiment of the present disclosure;
Figure 10 is a structural block diagram of an intent recognition apparatus for sentence text according to an embodiment of the present disclosure.
Detailed Description
To enable those skilled in the art to better understand the solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings in the embodiments of the present disclosure. Obviously, the described embodiments are only part of the embodiments of the present disclosure, not all of them. Based on the embodiments in the present disclosure, all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the protection scope of the present disclosure.
It should be noted that the terms "first", "second", and the like in the specification, claims, and above drawings of the present disclosure are used to distinguish similar objects, and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present disclosure described here can be implemented in orders other than those illustrated or described here. In addition, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units clearly listed, but may include other steps or units not clearly listed or inherent to the process, method, product, or device.
According to one aspect of the embodiments of the present disclosure, an intent recognition method for sentence text is provided. This intent recognition method for sentence text is widely applied in whole-house intelligent digital control application scenarios such as the Smart Home, smart household, smart home device ecosystem, and Intelligence House ecosystem. Optionally, in this embodiment, the above intent recognition method for sentence text can be applied to the hardware environment composed of a terminal device 102 and a server 104 as shown in Figure 1. As shown in Figure 1, the server 104 is connected to the terminal device 102 through a network, and can be configured to provide services (such as application services) to the terminal or to a client installed on the terminal; a database can be set up on the server or independently of the server, configured to provide data storage services for the server 104; and cloud computing and/or edge computing services can be configured on the server or independently of the server, configured to provide data computing services for the server 104.
The above network may include, but is not limited to, at least one of the following: a wired network, a wireless network. The wired network may include, but is not limited to, at least one of the following: a wide area network, a metropolitan area network, a local area network; the wireless network may include, but is not limited to, at least one of the following: Wi-Fi (Wireless Fidelity), Bluetooth. The terminal device 102 may be, but is not limited to, a PC (Personal Computer), a mobile phone, a tablet, a smart air conditioner, a smart range hood, a smart refrigerator, a smart oven, a smart stove, a smart washing machine, a smart water heater, smart washing equipment, a smart dishwasher, smart projection equipment, a smart TV, a smart clothes-drying rack, smart curtains, smart audio-video equipment, smart sockets, a smart stereo, a smart speaker, smart fresh-air equipment, smart kitchen and bathroom equipment, smart bathroom fixtures, a smart sweeping robot, a smart window-cleaning robot, a smart mopping robot, smart air purification equipment, a smart steam oven, a smart microwave oven, a smart kitchen water heater, a smart purifier, a smart water dispenser, a smart door lock, and so on.
This embodiment provides an intent recognition method for sentence text, applied to the above computer terminal. Figure 2 is a flow chart of an intent recognition method for sentence text according to an embodiment of the present disclosure; as shown in Figure 2, the process includes the following steps:
Step S202: Obtain the sentence text collected by a smart device as the target sentence text to be recognized;
Step S204: Perform entity recognition on the target sentence text to obtain a target entity label, where the target entity label is used to characterize the target operation information of a target control operation corresponding to the target sentence text, and the target control operation is the control operation that the target sentence text indicates is to be performed on the smart device;
Step S206: According to the target entity label, identify the target intent feature corresponding to the target sentence text, where the target intent feature is used to indicate the operation intention of the target sentence text toward the smart device.
Through the above steps, if the target sentence text to be recognized is obtained, the target operation information of the control operation that the target sentence text indicates is to be performed on the smart device is identified as the target entity label, and the operation intention of the target sentence text toward the smart device is identified according to the target entity label. Through target entity labels highly correlated with the operation intention toward the smart device, the accuracy of identifying the operation intention of the target sentence text toward the smart device is improved. The above technical solution solves problems in the related art such as the low accuracy of identifying the intent expressed by sentence text, achieving the technical effect of improving that accuracy.
In the technical solution provided in the above step S202, the smart device can, but is not limited to, convert a voice instruction issued by the user into the corresponding sentence text, or convert text content input by the user on the smart device into the corresponding sentence text, and so on. This makes it possible to obtain the language content the user wants to express in multiple ways, allows the user to operate in multiple ways, and improves the user's operating experience.
Optionally, in this embodiment, the smart device may include, but is not limited to, a device that supports performing corresponding operations according to the user's voice instructions, for example: a smart air conditioner, a smart range hood, a smart refrigerator, a smart oven, a smart stove, a smart washing machine, a smart water heater, smart washing equipment, a smart dishwasher, smart projection equipment, a smart TV, a smart clothes-drying rack, smart curtains, smart sockets, a smart stereo, a smart speaker, smart fresh-air equipment, smart kitchen and bathroom equipment, smart bathroom fixtures, a smart sweeping robot, a smart window-cleaning robot, a smart mopping robot, smart air purification equipment, a smart steam oven, a smart microwave oven, a smart kitchen water heater, a smart purifier, a smart water dispenser, a smart door lock, a smart in-car air conditioner, smart wipers, a smart in-car speaker, a smart in-car refrigerator, and so on.
In the technical solution provided in the above step S204, the target operation information of the control operation that the target sentence text indicates is to be performed on the smart device can, but is not limited to, be obtained by performing entity recognition on the target sentence text. The target operation information may include, but is not limited to, operation information related to the target control operation that the sentence text indicates is to be performed on the smart device. The target operation information can, but is not limited to, be used as the target entity label, which improves the correlation between the target entity label and the control operation that the target sentence text indicates is to be performed on the smart device.
In an exemplary embodiment, the target entity label can, but is not limited to, be obtained as follows: inputting the target sentence text into a target entity recognition model, where the target entity recognition model is obtained by training an initial entity recognition model using text samples annotated with entity labels, the entity labels including: operation time, operation location, operation resource attributes, operation device, and operation mode; and obtaining the target entity label output by the target entity recognition model.
Optionally, in this embodiment, the target sentence text can, but is not limited to, be input into the target entity recognition model, and the operation time, operation location, operation resource attributes, operation device, operation mode, and so on of the control operation performed by the target sentence text on the smart device, as output by the target entity recognition model, are taken as the target entity label.
Optionally, in this embodiment, the operation location may include, but is not limited to, a room or a functional area. A room is, for example, a room marked with a label such as bedroom, living room, study, kitchen, or home theater; a functional area is, for example, an indoor area marked with a specific function, such as an entertainment area, cooking area, study area, laundry area, or dressing area.
Optionally, in this embodiment, the operation resource attributes may include, but are not limited to, operation resources (for example: audio resources such as songs and audiobooks, and video resources such as TV series and movies) and the performers of operation resources (for example: the singer who performs a song, the leading actor of a movie, and so on).
在一个示例性实施例中,可以但不限于通过以下方式得到目标实体识别模型:将所述文本样本输入所述初始实体识别模型,得到所述初始实体识别模型输出的初始实体标签;将所述初始实体标签与所述文本样本所标注的实体标签输入预设的损失函数,得到损失值;根据所述损失值对所述初始实体识别模型的模型参数进行调整,直至满足训练截止条件,得到所述目标实体识别模型。
可选地,在本实施例中,可以但不限于通过使用标注有实体标签的文本样本对初始实体识别模型进行训练,训练截止条件可以包括但不限于初始实体标签和文本样本所标注的实体标签之间的损失值小于或者等于损失值阈值,或者损失度趋于恒定,或者训练次数达到预定次数。此时初始实体识别模型收敛,可以但不限于将使得初始实体识别模型收敛的模型参数作为目标模型参数,得到目标实体识别模型。
在一个示例性实施例中,可以但不限于通过以下方式得到初始实体标签:将所述文本样本输入初始标签预测层;将所述初始标签预测层输出的初始预测标签输入初始条件约束层,得到所述初始条件约束层输出的所述初始实体标签;其中,所述初始实体识别模型包括所述初始标签预测层和所述初始条件约束层,所述初始实体识别模型用于预测输入参数对应的预测标签和每个预测标签对应的预测概率,所述初始条件约束层用于对所述初始实体识别模型预测出的预测标签和每个预测标签对应的预测概率添加约束条件得到满足所述约束条件的实体标签;
可选地,在本实施例中,初始标签预测层可以但不限于包括采用LSTM(Long Short-Term Memory,长短时记忆)模型架构的网络,或者,采用BiLSTM(Bi-directional Long Short-Term Memory,双向长短时记忆)模型架构的网络等等。图3是根据本公开实施例的可选的BiLSTM模型的架构图,如图3所示,BiLSTM模型在预测“Play loudspeaker inbedroom(播放在卧室的音箱)”对应的实体标签时,可以但不限于进行前向预测和后向预测,并将前向预测的结果和后向预测的结果拼接,将“Play”预测为“B-PAT”标签,其中,“PAT”代表着播放模式(PATTERN),将“loudspeaker”预测为“B-DEV”标签,其中,“DEV”代表着操作资源(DEVICE),将“in”预测为“O”标签,其中,“O”标签代表着其它(OTHER),将“bedroom”预测为“B-ROOM”标签,其中,“ROOM”代表着操作房间,提高了预测标签与对智能设备执行的控制操作的操作信息之间的相关度,提升了预测标签的准确率,并且BiLSTM模型具有较强的鲁棒性,较少的受到工程特征的影响,能够稳定运行。
Optionally, in this embodiment, each character in a text sample may, but is not limited to, correspond to one or more predicted labels, and each predicted label has a corresponding prediction probability. The initial conditional constraint layer may, but is not limited to, apply constraint conditions to the output of the initial label prediction layer and output the predicted labels, one per character of the text sample, that satisfy the constraint conditions.
Optionally, in this embodiment, the prediction probability corresponding to each predicted label may be, but is not limited to, an unnormalized score (that is, a value that may exceed 1, or lie between 0 and 1 inclusive) or a normalized probability (that is, a value between 0 and 1 inclusive). The prediction probabilities may, but are not limited to, be normalized by a Softmax (classification network) layer.
Optionally, in this embodiment, the initial conditional constraint layer may include, but is not limited to, a CRF (Conditional Random Field) model. The CRF model can make full use of the information in the BiLSTM model, improving the accuracy of the predicted labels output by the CRF.
Optionally, in this embodiment, the constraint conditions may, but are not limited to, be learned by the initial conditional constraint layer from the text samples, for example: the first word of a sentence should bear a "B-" or "O" label rather than an "I-" label; in "B-label1 I-label1 I-label2 ...", label1 and label2 should belong to the same component category; and "O I-label" is invalid, since an entity should begin with "B-" rather than "I-". Learning these constraints effectively improves the accuracy of the predicted labels produced by the initial label prediction layer.
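Such constraints can be pictured as forbidden transitions in a CRF transition matrix. The following is a minimal sketch of that idea, assuming constraints are hard-coded rather than learned as the disclosure describes; the score value and function names are assumptions.

```python
import numpy as np

# Sketch: encode the BIO constraints as forbidden transitions by giving
# them a very low score, so decoding never yields an invalid sequence.
NEG_INF = -1e4

def init_transition_scores(tags):
    n = len(tags)
    trans = np.zeros((n, n))
    for i, prev in enumerate(tags):
        for j, curr in enumerate(tags):
            # "O I-label" is invalid: an I- tag must follow a B- or I-
            # tag of the same category, never O or another category.
            if curr.startswith("I-"):
                if not (prev.endswith(curr[2:]) and prev[0] in "BI"):
                    trans[i, j] = NEG_INF
    return trans

def start_scores(tags):
    # The first tag of a sentence must be B- or O, never I-.
    return np.array([NEG_INF if t.startswith("I-") else 0.0 for t in tags])
```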
Optionally, in this embodiment, the method may include, but is not limited to, determining, for each character of the text sample, the predicted label with the highest prediction probability as the entity label, or determining as the entity labels the sequence of predicted labels over the characters of the text sample whose prediction probabilities sum to the maximum.
In an exemplary embodiment, the model parameters of the initial entity recognition model may, but are not limited to, be adjusted in the following manner: adjusting the prediction parameters of the initial label prediction layer and the constraint conditions of the initial conditional constraint layer according to the loss value, where the model parameters of the initial entity recognition model include the prediction parameters and the constraint conditions.
Optionally, in this embodiment, when the loss value is greater than the loss-value threshold, or has not yet tended toward a constant, the prediction parameters of the initial label prediction layer and the constraint conditions of the initial conditional constraint layer may, but are not limited to, be adjusted according to the loss value.
In an exemplary embodiment, the text samples may, but are not limited to, be input into the initial entity recognition model in the following manner: vectorizing the text samples to obtain text vectors corresponding to the text samples; and inputting the text vectors into the initial entity recognition model.
Optionally, in this embodiment, each character of a text sample may, but is not limited to, be converted into a corresponding word vector by a BERT (Bidirectional Encoder Representation from Transformers) model, a Roberta (A Robustly Optimized BERT Pretraining Approach) model, or the like. The Roberta model has a strong ability to produce dynamic word vectors and optimizes the network structure in three respects: model details, training strategy, and data. At the same time, the Roberta model can quickly and accurately convert each character of the target sentence text into a corresponding word vector, saving conversion time and improving conversion efficiency.
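For illustration, such vectorization may, but is not limited to, be performed with a pretrained RoBERTa-style checkpoint through the Hugging Face transformers library; the checkpoint name below is an assumption, and any comparable Chinese RoBERTa checkpoint would play the same role.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Sketch: turn each character of a text sample into a word vector with
# a RoBERTa-style pretrained model. The checkpoint name is an assumption.
tokenizer = AutoTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")
encoder = AutoModel.from_pretrained("hfl/chinese-roberta-wwm-ext")

text = "播放卧室的音箱"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = encoder(**inputs)

word_vectors = outputs.last_hidden_state  # (1, seq_len, 768): one vector per token
```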
In the technical solution provided in step S206, the intention of the control operation that the target sentence text instructs to be performed on the smart device may, but is not limited to, be recognized through the target entity label, which is highly correlated with that control operation. In smart-home appliance control scenarios, an intelligent dialogue system often needs to accurately recognize the operation intention that the user's language expresses toward the smart device, and then control the smart device to perform the operation the user wants. Recognizing the operation intention of the target sentence text through the target entity label achieves fast, effective, and accurate intention recognition.
In an exemplary embodiment, the target intention feature may, but is not limited to, be recognized in the following manner: recognizing the target entity label through a target intention recognition model, where the target intention recognition model is obtained by training an initial intention recognition model with entity label samples annotated with intention features; and acquiring the intention recognition result output by the target intention recognition model as the target intention feature.
Optionally, in this embodiment, the target entity label output by the target entity recognition model may, but is not limited to, be input into the target intention recognition model; the target intention recognition model recognizes the intention of the control operation, indicated by the target entity label, to be performed on the smart device, and outputs the target intention feature corresponding to the target entity label. Fig. 4 is an overall model architecture diagram for optionally recognizing the intention of sentence text according to an embodiment of the present disclosure.
In an exemplary embodiment, the target entity label may, but is not limited to, be recognized in the following manner: performing linguistic component parsing on the target sentence text to obtain target component features; and inputting the target component features and the target entity label into the target intention recognition model to obtain the intention recognition result output by the target intention recognition model.
Optionally, in this embodiment, the target component features may include, but are not limited to, at least one of the following: subject, predicate, object, attributive, adverbial, complement, subject clause, predicate clause, object clause, attributive clause, adverbial clause, complement clause, and so on. Analyzing the linguistic components of the target sentence text improves the utilization of the information contained in the sentence text.
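As one possible realization of such component parsing, a dependency parser can approximate subject, predicate, and object extraction. The sketch below uses spaCy with its zh_core_web_sm pipeline purely as an example; the disclosure does not mandate any particular parser, and the dependency label names are those of that pipeline, not of the disclosure.

```python
import spacy

# Sketch: approximate component features with a dependency parse.
# spaCy and the zh_core_web_sm model are one possible, assumed choice.
nlp = spacy.load("zh_core_web_sm")

def component_features(sentence: str) -> dict:
    doc = nlp(sentence)
    feats = {}
    for token in doc:
        if token.dep_ == "nsubj":            # subject
            feats["subject"] = token.text
        elif token.dep_ == "ROOT":           # main predicate
            feats["predicate"] = token.text
        elif token.dep_ in ("dobj", "obj"):  # object (label varies by model)
            feats["object"] = token.text
    return feats
```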
Optionally, in this embodiment, the linguistic components of the target sentence text may, but are not limited to, be combined with the target entity label, and the intention of the control operation that the target sentence text instructs to be performed on the smart device may be recognized through the target intention recognition model. This makes full use of the information contained in the sentence text and improves the accuracy of recognizing the operation intention of the target sentence text with respect to the smart device.
Optionally, in this embodiment, the target intention feature of the target sentence text may, but is not limited to, be recognized through the target entity recognition model and the target intention recognition model, combined with the linguistic components of the target sentence text. Fig. 5 is a flowchart of recognizing the target intention feature of target sentence text according to an embodiment of the present disclosure; Fig. 5 may, but is not limited to, be applied to the model architecture shown in Fig. 4 above. As shown in Fig. 5, the flow may include, but is not limited to, the following steps:
Step S501: acquiring the target sentence text;
Step S502: inputting the target sentence text into the target entity recognition model;
Step S503: the target entity recognition model recognizes the entity labels of the target sentence text to obtain the target entity label;
Step S504: the target entity recognition model outputs the target entity label;
Step S505: inputting the target entity label into the target intention recognition model;
Step S506: the target intention recognition model recognizes the target intention feature of the target sentence text by combining the target component features and the target entity label;
Step S507: the target intention recognition model outputs the target intention feature.
For a better understanding of the above process of recognizing the intention of sentence text, the flow is described below with reference to optional embodiments, which are not intended to limit the technical solutions of the embodiments of the present disclosure.
In this embodiment, a method for recognizing the intention of sentence text is provided. Fig. 6 is a schematic diagram of a method for recognizing the intention of sentence text according to an embodiment of the present disclosure. As shown in Fig. 6, the method may include, but is not limited to, the following steps:
Step S601: collecting and cleaning the text data;
Step S602: determining the entity labels for annotating the text data and their number; the entity labels may include, but are not limited to, at least one of: time (i.e., the operation time above), room (i.e., the operation location above), resource (i.e., the operation resource attribute above), singer (i.e., the operation resource attribute above), device (i.e., the operation device above), and mode (i.e., the operation mode above);
Step S603: annotating the entity labels of the text data; manual annotation may, but is not limited to, be performed according to the determined annotation rules, labeling each entity-word component in each sentence of the text data to obtain the sample data for model training;
Step S604: the sample data may, but is not limited to, be split into a training set, a validation set, and a test set to obtain the training data;
Step S605: inputting the training data into the Roberta pretrained model for vectorization. The vectorization may, but is not limited to, be divided into three modules: input-ids (a tensor of word ids (identifications) in the input data), segment-ids (a tensor of sentence ids in the input data), and input-mask (a mask over the input data). The three vectorized results are fused to obtain the Embedding (word vector) output;
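The three modules named in step S605 map directly onto the tensors a BERT/RoBERTa-style tokenizer produces, as the following sketch shows; the checkpoint name is again an assumption.

```python
from transformers import AutoTokenizer

# Sketch: the three vectorization modules of step S605 as tokenizer outputs.
# The checkpoint name is an assumption (Chinese RoBERTa checkpoints
# typically ship a BERT-style tokenizer that returns all three tensors).
tokenizer = AutoTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")
enc = tokenizer("打开客厅的空调", return_tensors="pt")

input_ids = enc["input_ids"]             # input-ids: word-id tensor
segment_ids = enc.get("token_type_ids")  # segment-ids: sentence-id tensor
input_mask = enc["attention_mask"]       # input-mask: mask over the input data
```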
Step S606: the BiLSTM model predicts the entity label corresponding to each word vector and the entity-label probability corresponding to each entity label. The multiple word vectors output by the Roberta pretrained model may, but are not limited to, be used as the input of the BiLSTM model; the n-dimensional character vectors obtained from this input serve as the input of the BiLSTM neural network at each time step, yielding the hidden-state sequence of the BiLSTM layer. The learned parameters of the BiLSTM model may, but are not limited to, be updated with the BPTT (Back-Propagation Through Time) algorithm; in its forward and backward phases this model differs from ordinary models in that the hidden layer must be unrolled and computed over all time steps;
Step S607: the Softmax layer normalizes the entity-label probability corresponding to each entity label. The multiple word vectors output by the BiLSTM, each word vector's entity labels, and each entity label's probability may, but are not limited to, be input into the logit layer, where the logit layer is the input of the Softmax layer; the Softmax layer then outputs the multiple word vectors, each word vector's entity labels, and each entity label's normalized probability;
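The normalization in step S607 is the standard Softmax over per-tag scores, as in this small sketch (the score values are made up for illustration):

```python
import torch
import torch.nn.functional as F

# Sketch: the logit layer's unnormalized per-tag scores are normalized by
# Softmax into probabilities in [0, 1] that sum to 1 for each token.
logits = torch.tensor([[2.3, 0.1, -1.2, 4.0]])  # scores for 4 tags (example values)
probs = F.softmax(logits, dim=-1)               # approx. [[0.15, 0.02, 0.00, 0.83]]
```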
Step S608: the CRF layer outputs the predicted entity label corresponding to each word vector. The multiple word vectors output by the Softmax layer, each word vector's entity labels, and each entity label's normalized probability may, but are not limited to, be input into the CRF layer. The CRF layer can add constraints to the final predicted entity labels to ensure they are valid; these constraints are learned automatically from the training data set during the training of the CRF layer. The CRF takes the LSTM's output for the i-th tag at each time step t as the point function in its feature functions, which introduces non-linearity into the original CRF. The overall model is still a large framework with the CRF as the main body, so the information in the LSTM is fully reused and a globally optimal output sequence can finally be obtained;
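One possible realization of such a CRF layer is the pytorch-crf package, which takes per-tag scores as emissions, learns the transition constraints during training, and decodes the globally optimal tag sequence. A hedged sketch, with shapes and values chosen only for illustration:

```python
import torch
from torchcrf import CRF  # the pytorch-crf package: one possible CRF layer

num_tags = 13
crf = CRF(num_tags, batch_first=True)

emissions = torch.randn(1, 6, num_tags)    # per-tag scores: (batch, seq, tags)
tags = torch.randint(0, num_tags, (1, 6))  # gold tag ids, used during training
mask = torch.ones(1, 6, dtype=torch.bool)

loss = -crf(emissions, tags, mask=mask)        # negative log-likelihood to minimize
best_paths = crf.decode(emissions, mask=mask)  # globally optimal tag sequence
```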
Step S609: computing the loss between the predicted entity labels and the true entity labels. The loss is computed against the true labels of the training data, and each epoch is iterated in turn; the parameters of the neural network nodes are continuously updated by the BPTT algorithm so that the loss decreases gradually until the model reaches a converged state. After optimization the model guarantees a small loss, and the trained model achieves a high accuracy on new data;
Step S610: when the loss is less than or equal to the loss threshold, model deployment is complete. In the overall flow of sentence intention parsing, new data is passed into the model to obtain the predicted labels, which, combined with expert rules, complete precise intention recognition.
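The "predicted labels plus expert rules" step might, for example, take the form of a small rule table mapping (device tag, mode tag) pairs to concrete intents; the two rules below mirror the smart-speaker and smart-TV scenarios described later, and the rule keys and action names are illustrative assumptions.

```python
# Sketch of the expert-rule step: map recognized entity values to an
# intent. The rule table and action names are illustrative assumptions.
EXPERT_RULES = {
    ("loudspeaker", "next"): "play_next_track",     # speaker scenario (Fig. 8)
    ("tv_screen", "bright"): "decrease_brightness", # TV scenario (Fig. 9)
}

def resolve_intent(entities: dict) -> str:
    key = (entities.get("DEV"), entities.get("PAT"))
    return EXPERT_RULES.get(key, "unknown_intent")
```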
Fig. 7 is an architecture diagram of an optional model for recognizing the linguistic components of a target sentence according to an embodiment of the present disclosure. The above steps S601 to S610 may, but are not limited to, be used in the model architecture shown in Fig. 7. High-accuracy intention recognition may, but is not limited to, be achieved by constructing named-entity annotation rules for the smart-appliance scenario and a named-entity recognition neural network model, combined with the linguistic components of the sentence text. The Roberta pretrained model may, but is not limited to, be used to produce the Embedding (word vectors) of the input characters and words, making vectorization simple and efficient, enriching the information and meaning the vectors carry, and improving vectorization accuracy. At the same time, the state-transition matrix of the CRF model greatly improves the validity of label prediction. In addition, this model structure improves training speed and prediction accuracy, providing a new way of processing in the field of intention recognition.
The user may, but is not limited to, interact by voice with a smart speaker or another smart device (for example, a smart washing machine, a smart refrigerator, or a smart desk lamp). Fig. 8 is a schematic diagram of a scenario of voice interaction between a user and a smart speaker according to an embodiment of the present disclosure. As shown in Fig. 8, while the smart speaker is playing a song, the user may, but is not limited to, issue the voice instruction "play the next one". The recognized entity labels corresponding to "play the next one" may include, but are not limited to, a device label (corresponding to the smart speaker) and a mode label (corresponding to the next one). According to the device label and the mode label, the intention of the control operation that "play the next one" instructs to be performed on the smart speaker may be recognized as controlling the smart speaker to play the next song in the current playlist, and the smart speaker may then, but is not limited to, be controlled to play the next song in the current playlist in response to the user's voice instruction.
Fig. 9 is a schematic diagram of a scenario of voice interaction between a user and a smart TV according to an embodiment of the present disclosure. As shown in Fig. 9, the smart TV may, but is not limited to, be playing sports news. If the user's voice instruction "the screen is too bright" is acquired, the recognized entity labels corresponding to "the screen is too bright" may include, but are not limited to, a device label (corresponding to the smart TV screen) and a mode label (corresponding to bright). According to the device label and the mode label, the intention of the control operation that "the screen is too bright" instructs to be performed on the smart TV may be recognized as lowering the display brightness of the smart TV screen, and the display brightness of the smart TV screen may then, but is not limited to, be lowered by 5% (or 10%, 15%, etc.) in response to the user's voice instruction.
It should be noted that, in this embodiment, the shapes of the smart speaker and the smart TV are not limited; Fig. 8 merely illustrates a cylindrical smart speaker as an example, and Fig. 9 a rectangular smart TV. The smart speaker and the smart TV may take any shape that meets production processes and user needs, and the present disclosure places no limitation on this.
From the description of the above implementations, those skilled in the art can clearly understand that the methods according to the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present disclosure, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods of the various embodiments of the present disclosure.
Fig. 10 is a structural block diagram of an apparatus for recognizing the intention of sentence text according to an embodiment of the present disclosure. As shown in Fig. 10, the apparatus includes:
an acquisition module 102, configured to acquire sentence text collected by a smart device as target sentence text to be recognized;
a first recognition module 104, configured to perform entity recognition on the target sentence text to obtain a target entity label, where the target entity label is used to characterize target operation information of a target control operation corresponding to the target sentence text, and the target control operation is a control operation that the target sentence text instructs to be performed on the smart device;
a second recognition module 106, configured to recognize, according to the target entity label, a target intention feature corresponding to the target sentence text, where the target intention feature is used to indicate the operation intention of the target sentence text with respect to the smart device.
Through the above embodiment, when target sentence text to be recognized is acquired, the target operation information of the control operation that the target sentence text instructs to be performed on the smart device is recognized as the target entity label, and the operation intention of the target sentence text with respect to the smart device is recognized according to the target entity label. Because the target entity label is highly correlated with the operation intention toward the smart device, the accuracy of recognizing that operation intention is improved. This technical solution solves the problem in the related art of low accuracy in identifying the intention expressed by sentence text and achieves the technical effect of improving that accuracy.
In an exemplary embodiment, the first recognition module includes:
a first input unit, configured to input the target sentence text into a target entity recognition model, where the target entity recognition model is obtained by training an initial entity recognition model with text samples annotated with entity labels, and the entity labels include: operation time, operation location, operation resource attribute, operation device, and operation mode;
a first acquisition unit, configured to acquire the target entity label output by the target entity recognition model.
In an exemplary embodiment, the apparatus further includes:
a first input module, configured to, before the target sentence text is input into the target entity recognition model, input the text samples into the initial entity recognition model to obtain initial entity labels output by the initial entity recognition model;
a second input module, configured to input the initial entity labels and the entity labels annotated on the text samples into a preset loss function to obtain a loss value;
an adjustment module, configured to adjust the model parameters of the initial entity recognition model according to the loss value until a training cut-off condition is met, thereby obtaining the target entity recognition model.
In an exemplary embodiment,
the first input module is configured to: input the text samples into an initial label prediction layer; and input the initial predicted labels output by the initial label prediction layer into an initial conditional constraint layer to obtain the initial entity labels output by the initial conditional constraint layer; where the initial entity recognition model includes the initial label prediction layer and the initial conditional constraint layer, the initial entity recognition model is used to predict the predicted labels corresponding to the input parameters and the prediction probability corresponding to each predicted label, and the initial conditional constraint layer is used to apply constraint conditions to the predicted labels predicted by the initial entity recognition model and their prediction probabilities so as to obtain entity labels that satisfy the constraint conditions;
the adjustment module is configured to: adjust the prediction parameters of the initial label prediction layer and the constraint conditions of the initial conditional constraint layer according to the loss value, where the model parameters of the initial entity recognition model include the prediction parameters and the constraint conditions.
In an exemplary embodiment, the first input module includes:
a vectorization unit, configured to vectorize the text samples to obtain text vectors corresponding to the text samples;
a second input unit, configured to input the text vectors into the initial entity recognition model.
In an exemplary embodiment, the second recognition module includes:
a recognition unit, configured to recognize the target entity label through a target intention recognition model, where the target intention recognition model is obtained by training an initial intention recognition model with entity label samples annotated with intention features;
a second acquisition unit, configured to acquire the intention recognition result output by the target intention recognition model as the target intention feature.
In an exemplary embodiment, the recognition unit is configured to:
perform linguistic component parsing on the target sentence text to obtain target component features;
input the target component features and the target entity label into the target intention recognition model to obtain the intention recognition result output by the target intention recognition model.
An embodiment of the present disclosure further provides a storage medium including a stored program, where the program, when run, performs any one of the methods above.
Optionally, in this embodiment, the storage medium may be configured to store program code for performing the following steps:
S1, acquiring sentence text collected by a smart device as target sentence text to be recognized;
S2, performing entity recognition on the target sentence text to obtain a target entity label, where the target entity label is used to characterize target operation information of a target control operation corresponding to the target sentence text, and the target control operation is a control operation that the target sentence text instructs to be performed on the smart device;
S3, recognizing, according to the target entity label, a target intention feature corresponding to the target sentence text, where the target intention feature is used to indicate the operation intention of the target sentence text with respect to the smart device.
An embodiment of the present disclosure further provides an electronic device including a memory and a processor, where a computer program is stored in the memory and the processor is configured to run the computer program to perform the steps in any one of the method embodiments above.
Optionally, the electronic device may further include a transmission device and an input/output device, where the transmission device is connected to the processor and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to perform the following steps through the computer program:
S1, acquiring sentence text collected by a smart device as target sentence text to be recognized;
S2, performing entity recognition on the target sentence text to obtain a target entity label, where the target entity label is used to characterize target operation information of a target control operation corresponding to the target sentence text, and the target control operation is a control operation that the target sentence text instructs to be performed on the smart device;
S3, recognizing, according to the target entity label, a target intention feature corresponding to the target sentence text, where the target intention feature is used to indicate the operation intention of the target sentence text with respect to the smart device.
Optionally, in this embodiment, the storage medium may include, but is not limited to: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store program code.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the above embodiments and optional implementations, which are not repeated here.
Obviously, those skilled in the art should understand that the modules or steps of the present disclosure described above can be implemented with general-purpose computing devices; they can be concentrated on a single computing device or distributed over a network composed of multiple computing devices. Optionally, they can be implemented with program code executable by computing devices, so that they can be stored in a storage device and executed by the computing devices; in some cases, the steps shown or described can be performed in an order different from that given here, or they can be made into individual integrated circuit modules, or multiple of the modules or steps can be made into a single integrated circuit module. Thus, the present disclosure is not limited to any specific combination of hardware and software.
The above are only preferred implementations of the present disclosure. It should be pointed out that, for those of ordinary skill in the art, several improvements and refinements can be made without departing from the principles of the present disclosure, and these improvements and refinements should also be regarded as falling within the protection scope of the present disclosure.

Claims (16)

  1. A method for recognizing the intention of sentence text, comprising:
    acquiring sentence text collected by a smart device as target sentence text to be recognized;
    performing entity recognition on the target sentence text to obtain a target entity label, wherein the target entity label is used to characterize target operation information of a target control operation corresponding to the target sentence text, and the target control operation is a control operation that the target sentence text instructs to be performed on the smart device;
    recognizing, according to the target entity label, a target intention feature corresponding to the target sentence text, wherein the target intention feature is used to indicate the operation intention of the target sentence text with respect to the smart device.
  2. The method according to claim 1, wherein the performing entity recognition on the target sentence text to obtain a target entity label comprises:
    inputting the target sentence text into a target entity recognition model, wherein the target entity recognition model is obtained by training an initial entity recognition model with text samples annotated with entity labels, and the entity labels comprise: operation time, operation location, operation resource attribute, operation device, and operation mode; and acquiring the target entity label output by the target entity recognition model.
  3. The method according to claim 2, wherein, before the inputting the target sentence text into a target entity recognition model, the method further comprises:
    inputting the text samples into the initial entity recognition model to obtain initial entity labels output by the initial entity recognition model;
    inputting the initial entity labels and the entity labels annotated on the text samples into a preset loss function to obtain a loss value;
    adjusting model parameters of the initial entity recognition model according to the loss value until a training cut-off condition is met, to obtain the target entity recognition model.
  4. The method according to claim 3, wherein
    the inputting the text samples into the initial entity recognition model to obtain initial entity labels output by the initial entity recognition model comprises: inputting the text samples into an initial label prediction layer; and inputting initial predicted labels output by the initial label prediction layer into an initial conditional constraint layer to obtain the initial entity labels output by the initial conditional constraint layer; wherein the initial entity recognition model comprises the initial label prediction layer and the initial conditional constraint layer, the initial entity recognition model is used to predict predicted labels corresponding to input parameters and a prediction probability corresponding to each predicted label, and the initial conditional constraint layer is used to apply constraint conditions to the predicted labels predicted by the initial entity recognition model and the prediction probability corresponding to each predicted label to obtain entity labels satisfying the constraint conditions;
    the adjusting model parameters of the initial entity recognition model according to the loss value comprises: adjusting prediction parameters of the initial label prediction layer and the constraint conditions of the initial conditional constraint layer according to the loss value, wherein the model parameters of the initial entity recognition model comprise the prediction parameters and the constraint conditions.
  5. The method according to claim 3, wherein the inputting the text samples into the initial entity recognition model comprises:
    vectorizing the text samples to obtain text vectors corresponding to the text samples;
    inputting the text vectors into the initial entity recognition model.
  6. The method according to any one of claims 1 to 5, wherein the recognizing, according to the target entity label, a target intention feature corresponding to the target sentence text comprises:
    recognizing the target entity label through a target intention recognition model, wherein the target intention recognition model is obtained by training an initial intention recognition model with entity label samples annotated with intention features;
    acquiring an intention recognition result output by the target intention recognition model as the target intention feature.
  7. The method according to claim 6, wherein the recognizing the target entity label through a target intention recognition model comprises:
    performing linguistic component parsing on the target sentence text to obtain target component features;
    inputting the target component features and the target entity label into the target intention recognition model to obtain the intention recognition result output by the target intention recognition model.
  8. An apparatus for recognizing the intention of sentence text, comprising:
    an acquisition module, configured to acquire sentence text collected by a smart device as target sentence text to be recognized;
    a first recognition module, configured to perform entity recognition on the target sentence text to obtain a target entity label, wherein the target entity label is used to characterize target operation information of a target control operation corresponding to the target sentence text, and the target control operation is a control operation that the target sentence text instructs to be performed on the smart device;
    a second recognition module, configured to recognize, according to the target entity label, a target intention feature corresponding to the target sentence text, wherein the target intention feature is used to indicate the operation intention of the target sentence text with respect to the smart device.
  9. The apparatus according to claim 8, wherein the first recognition module comprises:
    a first input unit, configured to input the target sentence text into a target entity recognition model, wherein the target entity recognition model is obtained by training an initial entity recognition model with text samples annotated with entity labels, and the entity labels comprise: operation time, operation location, operation resource attribute, operation device, and operation mode;
    a first acquisition unit, configured to acquire the target entity label output by the target entity recognition model.
  10. The apparatus according to claim 9, wherein the apparatus further comprises:
    a first input module, configured to, before the target sentence text is input into the target entity recognition model, input the text samples into the initial entity recognition model to obtain initial entity labels output by the initial entity recognition model;
    a second input module, configured to input the initial entity labels and the entity labels annotated on the text samples into a preset loss function to obtain a loss value;
    an adjustment module, configured to adjust model parameters of the initial entity recognition model according to the loss value until a training cut-off condition is met, to obtain the target entity recognition model.
  11. The apparatus according to claim 10, wherein
    the first input module is configured to: input the text samples into an initial label prediction layer; and input initial predicted labels output by the initial label prediction layer into an initial conditional constraint layer to obtain the initial entity labels output by the initial conditional constraint layer; wherein the initial entity recognition model comprises the initial label prediction layer and the initial conditional constraint layer, the initial entity recognition model is used to predict predicted labels corresponding to input parameters and a prediction probability corresponding to each predicted label, and the initial conditional constraint layer is used to apply constraint conditions to the predicted labels predicted by the initial entity recognition model and the prediction probability corresponding to each predicted label to obtain entity labels satisfying the constraint conditions;
    the adjustment module is configured to: adjust prediction parameters of the initial label prediction layer and the constraint conditions of the initial conditional constraint layer according to the loss value, wherein the model parameters of the initial entity recognition model comprise the prediction parameters and the constraint conditions.
  12. The apparatus according to claim 10, wherein the first input module comprises:
    a vectorization unit, configured to vectorize the text samples to obtain text vectors corresponding to the text samples;
    a second input unit, configured to input the text vectors into the initial entity recognition model.
  13. The apparatus according to any one of claims 8 to 12, wherein the second recognition module comprises:
    a recognition unit, configured to recognize the target entity label through a target intention recognition model, wherein the target intention recognition model is obtained by training an initial intention recognition model with entity label samples annotated with intention features;
    a second acquisition unit, configured to acquire an intention recognition result output by the target intention recognition model as the target intention feature.
  14. The apparatus according to claim 13, wherein the recognition unit is configured to:
    perform linguistic component parsing on the target sentence text to obtain target component features;
    input the target component features and the target entity label into the target intention recognition model to obtain the intention recognition result output by the target intention recognition model.
  15. A computer-readable storage medium, comprising a stored program, wherein the program, when run, performs the method according to any one of claims 1 to 7.
  16. An electronic device, comprising a memory and a processor, wherein a computer program is stored in the memory, and the processor is configured to perform, through the computer program, the method according to any one of claims 1 to 7.
PCT/CN2022/096435 2022-03-15 2022-05-31 Method and apparatus for recognizing the intention of sentence text, storage medium, and electronic device WO2023173596A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210252555.X 2022-03-15
CN202210252555.XA CN114925158A (zh) 2022-03-15 Method and apparatus for recognizing the intention of sentence text, storage medium, and electronic device

Publications (1)

Publication Number Publication Date
WO2023173596A1 true WO2023173596A1 (zh) 2023-09-21

Family

ID=82805044

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/096435 WO2023173596A1 (zh) 2022-03-15 2022-05-31 Method and apparatus for recognizing the intention of sentence text, storage medium, and electronic device

Country Status (2)

Country Link
CN (1) CN114925158A (zh)
WO (1) WO2023173596A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116662555B * 2023-07-28 2023-10-20 Chengdu Seres Technology Co., Ltd. Request text processing method and apparatus, electronic device, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287480A * 2019-05-27 2019-09-27 Guangzhou Duoyi Network Co., Ltd. Named entity recognition method and apparatus, storage medium, and terminal device
CN112100349A * 2020-09-03 2020-12-18 Shenzhen Shuliantianxia Intelligent Technology Co., Ltd. Multi-turn dialogue method and apparatus, electronic device, and storage medium
WO2021212682A1 * 2020-04-21 2021-10-28 Ping An International Smart City Technology Co., Ltd. Knowledge extraction method and apparatus, electronic device, and storage medium
CN113593565A * 2021-09-29 2021-11-02 Shenzhen Dashenghuojia Technology Co., Ltd. Smart home device management and control method and system

Also Published As

Publication number Publication date
CN114925158A (zh) 2022-08-19

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22931621

Country of ref document: EP

Kind code of ref document: A1