CN117892770A

CN117892770A - Training method of air conditioner control model and air conditioner control method

Info

Publication number: CN117892770A
Application number: CN202311800172.2A
Authority: CN
Inventors: 吴雪燕; 李绍斌; 贾巨涛; 黄鑫
Original assignee: Gree Electric Appliances Inc of Zhuhai; Zhuhai Lianyun Technology Co Ltd
Current assignee: Gree Electric Appliances Inc of Zhuhai; Zhuhai Lianyun Technology Co Ltd
Priority date: 2023-12-25
Filing date: 2023-12-25
Publication date: 2024-04-16

Abstract

The invention provides a training method of an air conditioner control model and an air conditioner control method. The training method comprises the following steps: acquiring control information, and inputting the control information into an initial model, wherein the initial model comprises an encoder and a decoder, a plurality of labeling labels are arranged in the control information, and each labeling label has a labeling probability; receiving the control information through the encoder of the initial model, and extracting a feature vector based on the control information; receiving the feature vector through the decoder of the initial model, and obtaining an output label and an output probability of the output label based on the feature vector; and training the initial model according to the output probability of the output label and the labeling probability of the labeling label to obtain an air conditioner control model. The trained air conditioner control model can improve the accuracy of air conditioner control.

Description

Training method of air conditioner control model and air conditioner control method

Technical Field

The embodiment of the invention relates to the field of intelligent home, in particular to a training method of an air conditioner control model, an air conditioner control method, a training device of the air conditioner control model, an air conditioner control device, electronic equipment and a computer readable storage medium.

Background

With the rapid development of artificial intelligence and the Internet of Things, intelligent home systems have become increasingly popular. As an important component of the home system, the air conditioner can adjust environmental states such as the indoor temperature and improve the comfort of the environment.

Currently, a user mainly controls an air conditioner through a remote controller or through intelligent terminal equipment such as a mobile phone, sending the corresponding instructions through the keys or options provided by the equipment.

The scheme can only control the air conditioner based on simple keys or options, and the control is not accurate enough.

Disclosure of Invention

In view of the foregoing, embodiments of the present invention have been made to provide a training method of an air conditioner control model, an air conditioner control method, an electronic device, and a computer-readable storage medium, which overcome or at least partially solve the foregoing problems.

In a first aspect, an embodiment of the present application discloses a training method for an air conditioner control model, including:

acquiring control information, and inputting the control information into an initial model; the initial model comprises an encoder and a decoder, wherein a plurality of labeling labels are arranged in the control information, and the labeling labels have labeling probability;

receiving the control information by the encoder of the initial model, and extracting a feature vector based on the control information;

receiving the feature vector through the decoder of the initial model, and obtaining an output label and an output probability of the output label based on the feature vector;

and training the initial model according to the output probability of the output label and the labeling probability of the labeling label to obtain an air conditioner control model.

In a second aspect, an embodiment of the present application discloses an air conditioner control method, including:

acquiring control information, and inputting the control information into an air conditioner control model;

after the air conditioner control model receives the control information, an output label is obtained;

generating an air conditioner control instruction according to the output label, and controlling an air conditioner based on the air conditioner control instruction; wherein the air conditioner control model is trained by the method of the first aspect.

In a third aspect, an embodiment of the present application discloses a training device for an air conditioner control model, including:

the information acquisition module is used for acquiring control information and inputting the control information into the initial model; the initial model comprises an encoder and a decoder, wherein a plurality of labeling labels are arranged in the control information, and the labeling labels have labeling probability;

a feature vector module for receiving the control information through the encoder of the initial model and extracting a feature vector based on the control information;

the probability output module is used for receiving the feature vector through the decoder of the initial model and obtaining an output label and the output probability of the output label based on the feature vector;

and the model training module is used for training the initial model to obtain an air conditioner control model according to the output probability of the output label and the labeling probability of the labeling label.

In a fourth aspect, an embodiment of the present application discloses an air conditioner control device, including:

the control acquisition module is used for acquiring control information and inputting the control information into the air conditioner control model;

the label acquisition module is used for acquiring an output label after the air conditioner control model receives the control information;

the instruction control module is used for generating an air conditioner control instruction according to the output label and controlling an air conditioner based on the air conditioner control instruction; the air conditioner control model is trained by the method of the first aspect.

In a fifth aspect, an embodiment of the present application further discloses an electronic device, including a processor and a memory, where the memory stores a program or instructions executable on the processor, where the program or instructions implement the training method of the air conditioner control model according to the first aspect and the steps of the air conditioner control method according to the second aspect when executed by the processor.

In a sixth aspect, embodiments of the present application further disclose a computer readable storage medium, where a program or an instruction is stored, where the program or the instruction implement the steps of the training method of the air conditioner control model according to the first aspect and the air conditioner control method according to the second aspect when executed by a processor.

In the embodiment of the application, control information is acquired and input into an initial model; the initial model comprises an encoder and a decoder, a plurality of labeling labels are arranged in the control information, and each labeling label has a labeling probability; the control information is received by the encoder of the initial model, and a feature vector is extracted based on the control information; the feature vector is received through the decoder of the initial model, and an output label and an output probability of the output label are obtained based on the feature vector; and the initial model is trained according to the output probability of the output label and the labeling probability of the labeling label to obtain an air conditioner control model. An air conditioner control model that maps control information to output labels is thus obtained, so that the air conditioner can be controlled according to the output label, and the accuracy of air conditioner control is improved.

Drawings

FIG. 1 is a flow chart of steps of a training method of an air conditioner control model according to an embodiment of the present invention;

FIG. 2 is a flowchart illustrating steps of a training method of an air conditioner control model according to an embodiment of the present invention;

FIG. 3 is a flowchart illustrating steps for controlling an air conditioner according to an embodiment of the present invention;

FIG. 4 is a block diagram of a training device for an air conditioner control model according to an embodiment of the present invention;

FIG. 5 is a block diagram of an air conditioner control device according to an embodiment of the present invention;

FIG. 6 is a block diagram of an electronic device provided by an embodiment of the invention;

FIG. 7 is a block diagram of still another electronic device provided by an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.

In recent years, research on artificial-intelligence-based technologies such as computer vision, deep learning, machine learning, image processing and image recognition has advanced significantly. Artificial Intelligence (AI) is an emerging science and technology that studies and develops theories, methods, techniques and application systems for simulating and extending human intelligence. It is a comprehensive discipline that involves many technical categories, such as chips, big data, cloud computing, the Internet of Things, distributed storage, deep learning, machine learning and neural networks. Deep Learning (DL) is a research direction within the field of Machine Learning (ML). Deep learning learns the inherent regularities and representation hierarchies of sample data, and the information obtained during this learning is of great help in interpreting data such as text, images and sounds. Its ultimate goal is to give machines the ability to analyze and learn like humans, so that they can recognize text, image and sound data. Deep learning has achieved many results in search technology, data mining, machine learning, machine translation, natural language processing, multimedia learning, speech, recommendation and personalization techniques, and other related fields. By making machines imitate human activities such as seeing, hearing and thinking, deep learning can solve complex pattern recognition problems.

The embodiment of the application is mainly applied to the field of processing of the control instruction of the air conditioner. Of course, the foregoing is merely an exemplary listing of possible scenarios of the methods provided by the embodiments of the present application, and is not meant to limit the embodiments of the present application.

Referring to fig. 1, a step flowchart of a training method of an air conditioner control model provided in an embodiment of the present application is shown, where the method includes:

step 101, obtaining control information, and inputting the control information into an initial model; the initial model comprises an encoder and a decoder, wherein a plurality of labeling labels are arranged in the control information, and the labeling labels have labeling probability;

In the embodiment of the invention, the control information can be a control instruction. The control instruction can be sent by a user through an air conditioner remote controller, or through an air conditioner control application program installed on terminal equipment such as a mobile phone, a tablet or a computer. How the control information is acquired is not limited here. The control information comprises a plurality of labeling labels, and each labeling label has a labeling probability. For example, if the control information is "turn on the air conditioner and raise the temperature to twenty-five degrees", the labeling labels may be "turn on", "air conditioner", "raise", "temperature", "twenty-five degrees", and the labeling probability of each labeling label is 1.

The initial model includes an encoder and a decoder. The encoder converts the original data into a low-dimensional representation that captures the key features of the data, and the decoder converts the low-dimensional representation back into the original data space to recover the original data. That the initial model includes an encoder and a decoder means that the structure of the model is composed of an encoder and a decoder; the initial model may adopt Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Bi-directional Recurrent Neural Networks (BiRNN), Long Short-Term Memory networks (LSTM), Gated Recurrent Units (GRU), and the like.

Step 102, receiving the control information through the encoder of the initial model, and extracting a feature vector based on the control information;

In an embodiment of the invention, the encoder is responsible for representing the input, that is, for converting the input into features. The control information is received through the encoder of the initial model, and the encoder performs feature extraction on the control information to obtain a feature vector. For example, the initial model may include an input layer, a hidden layer and an output layer; the part from the input layer to the hidden layer may be referred to as the encoder, and the part from the hidden layer to the output layer as the decoder. When the hidden layer has fewer neurons than the input layer, the input data is characterized by fewer feature neurons, which reduces the dimension of the data, i.e., extracts features.

For example, the control information may be "I want to turn on the air conditioner", for which a binary vector representation such as [1,0, 1] may be obtained, where [1,0] represents "on" and [1,0] represents "air conditioner"; this feature vector can be used to represent turning on the air conditioner.
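The input layer, hidden layer and output layer arrangement described for step 102 can be sketched as follows. This is a minimal illustration only: the layer sizes (4 inputs, 2 hidden units), the tanh activation and the hand-picked weights are assumptions that do not come from the application, and a trained model would learn its own weights.

```python
import math

def dense(vector, weights, biases):
    """One fully connected layer followed by a tanh activation."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, vector)) + b)
        for row, b in zip(weights, biases)
    ]

# Hypothetical, hand-picked weights; a trained model would learn these.
ENCODER_W = [[0.5, -0.2, 0.1, 0.3],
             [-0.4, 0.6, 0.2, -0.1]]   # 4 inputs -> 2 hidden units
ENCODER_B = [0.0, 0.1]
DECODER_W = [[0.7, -0.3],
             [0.2, 0.5],
             [-0.6, 0.4],
             [0.1, 0.9]]               # 2 hidden units -> 4 outputs
DECODER_B = [0.0, 0.0, 0.0, 0.0]

def encode(inputs):
    """Input layer to hidden layer: compress the input into features."""
    return dense(inputs, ENCODER_W, ENCODER_B)

def decode(features):
    """Hidden layer to output layer: map the features back to output space."""
    return dense(features, DECODER_W, DECODER_B)

features = encode([1.0, 0.0, 0.0, 1.0])  # 2-dimensional feature vector
outputs = decode(features)               # 4-dimensional output
```

Because the hidden layer is smaller than the input layer, `features` has fewer entries than the input, which is the dimension reduction the paragraph refers to.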

Step 103, receiving the feature vector through the decoder of the initial model, and obtaining an output label and the output probability of the output label based on the feature vector;

In the embodiment of the invention, the decoder of the initial model receives the feature vector and, according to the feature vector, outputs a result and the probability of that result, thereby obtaining the output label and the output probability of the output label.
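One common way to turn decoder scores into an output label with an output probability is a softmax over candidate labels. The application does not state which mechanism is used, so the softmax, the label set and the score values below are assumptions chosen for illustration.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def decode_label(scores, labels):
    """Return the most likely label together with its probability."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]

# Hypothetical label set and decoder scores, for illustration only.
LABELS = ["turn on", "turn off", "raise", "lower"]
SCORES = [2.0, 0.1, 0.5, -1.0]
output_label, output_probability = decode_label(SCORES, LABELS)
```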

Step 104, training the initial model according to the output probability of the output label and the labeling probability of the labeling label to obtain an air conditioner control model.

In the embodiment of the invention, a loss function compares the predicted output of the model with the expected output, based on the output probability of the output label and the labeling probability of the labeling label, in order to evaluate the performance of the model and optimize it. If the deviation between the output probability of the output label and the labeling probability of the labeling label is large, the loss value is large; if the deviation is small or the values are nearly identical, the loss value is small. Therefore, based on the output probability of the output label and the labeling probability of the labeling label, a suitable loss function can be selected to train the initial model, and the model parameters of the initial model are adjusted continuously to obtain the air conditioner control model.

In summary, in the embodiment of the present application, control information is acquired and input into an initial model; the initial model comprises an encoder and a decoder, a plurality of labeling labels are arranged in the control information, and each labeling label has a labeling probability; the control information is received by the encoder of the initial model, and a feature vector is extracted based on the control information; the feature vector is received through the decoder of the initial model, and an output label and an output probability of the output label are obtained based on the feature vector; and the initial model is trained according to the output probability of the output label and the labeling probability of the labeling label to obtain an air conditioner control model. An air conditioner control model that maps control information to output labels is thus obtained, so that the air conditioner can be controlled according to the output label, and the accuracy of air conditioner control is improved.

Referring to fig. 2, a step flowchart of a training method of an air conditioner control model according to an embodiment of the present application is shown, where the method includes:

step 201, obtaining control information, and inputting the control information into an initial model; the initial model comprises an encoder and a decoder, wherein a plurality of labeling labels are arranged in the control information, and the labeling labels have labeling probability;

Step 204, receiving the control information through the encoder of the initial model, and extracting a feature vector based on the control information;

step 205, receiving the feature vector by the decoder of the initial model, and obtaining an output label and an output probability of the output label based on the feature vector;

step 206, training the initial model according to the output probability of the output label and the labeling probability of the labeling label to obtain an air conditioner control model.

For steps 201 and 204 to 206, refer to the embodiment of FIG. 1; they are not described again here.

Optionally, step 206 may specifically include:

a substep 2061 of calculating a loss value between the output probability and the labeling probability by a preset loss function based on the output probability of the output label and the labeling probability of the labeling label; the preset loss function is one of a cross entropy loss function and a mean square error loss function;

a substep 2062 of adjusting the weight parameters of the initial model according to the loss value until the loss value reaches a preset expected value, to obtain an air conditioner control model.

In the embodiment of the invention, the initial model is trained based on the output probability of the output label and the labeling probability of the labeling label. The loss value between the output probability and the labeling probability may be calculated by a preset loss function, which defines the error between the prediction for an individual training sample and the true value. The preset loss function may be a cross entropy loss function (CELF, Cross Entropy Loss Function) or a mean square error loss function (MSE, Mean Square Error), or another loss function such as the mean absolute error loss function (MAEL, Mean Absolute Error Loss) or the quantile loss function (QL, Quantile Loss).

The loss may be calculated using a mean square error loss function. The mean square error is the mean of the squared differences between the predicted values and the target values, that is, the mean of the squared differences between the output probability of the output label and the labeling probability of the labeling label. The curve of the mean square error loss function is smooth, continuous and differentiable everywhere, so a gradient descent algorithm can be used. In addition, the gradient decreases as the error decreases, which aids convergence: even with a fixed learning rate, the loss can converge to the minimum fairly quickly.
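The two preset loss functions named above can be sketched as follows, comparing output probabilities against labeling probabilities. The binary form of the cross entropy and the example probability values are assumptions made for illustration; the application does not fix either.

```python
import math

def mse_loss(outputs, targets):
    """Mean of the squared differences between outputs and targets."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)

def cross_entropy_loss(outputs, targets, eps=1e-12):
    """Binary cross entropy averaged over all labels."""
    total = 0.0
    for o, t in zip(outputs, targets):
        o = min(max(o, eps), 1.0 - eps)  # keep log() away from 0
        total += -(t * math.log(o) + (1.0 - t) * math.log(1.0 - o))
    return total / len(outputs)

# Labeling probabilities are 1 for every labeling label, as in the example above.
labeling = [1.0, 1.0]
close_loss = mse_loss([0.9, 0.8], labeling)  # small deviation, small loss
far_loss = mse_loss([0.5, 0.4], labeling)    # large deviation, large loss
```

In both functions a larger deviation between the output probability and the labeling probability produces a larger loss value.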

The weight parameters of the initial model are adjusted according to the loss value: the loss function is differentiated to obtain the gradient of the loss function with respect to each weight parameter of the initial model. This gradient represents the magnitude and direction of the contribution of each model parameter to the loss function in the current state, i.e. the direction and magnitude of the parameter update. Through a gradient descent algorithm, the model parameters are adjusted according to the gradient, so that the loss function gradually decreases and the model performance gradually improves. When the loss value has decreased to a preset expected value, the parameter combination that gives the most accurate predictions is considered to have been found, the training of the initial model can be considered complete, and the final air conditioner control model is obtained.

For example, assuming that the initial model is a linear regression model with parameters w and b, the partial derivatives of the loss function with respect to w and b need to be calculated. The values of w and b are then updated using these partial derivatives so that the loss function value decreases, eventually bringing the loss value to the expected value.
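The linear regression example can be made concrete with a short gradient descent loop. This is a sketch under stated assumptions: the mean square error loss, the learning rate, the target loss and the toy data (generated from y = 2x + 1) are all chosen for illustration.

```python
def train_linear(xs, ys, lr=0.05, target_loss=1e-4, max_steps=10000):
    """Fit y = w * x + b by gradient descent on the mean square error."""
    w, b = 0.0, 0.0
    n = len(xs)
    loss = float("inf")
    for _ in range(max_steps):
        preds = [w * x + b for x in xs]
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
        if loss <= target_loss:  # stop once the expected loss value is reached
            break
        # Partial derivatives of the loss with respect to w and b.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        # Move each parameter against its gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss

# Toy data generated from y = 2x + 1 (an assumption for this example).
w, b, final_loss = train_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```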

In general, samples for which the output probability of the model is greater than 0.5 may be classified as the positive class, and samples below 0.5 as the negative class. In practical applications, however, a threshold of 0.5 is not always the most suitable. For example, when the number of positive samples is very small, requiring a model output probability greater than 0.5 may cause many positive samples to be incorrectly classified as negative. The classification threshold may therefore be adjusted. For example, when the threshold is adjusted to 0.6, a sample is classified as positive only if the output probability of the model for that sample is greater than 0.6. This adjustment may be achieved by modifying the way the loss function is calculated.
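The effect of moving the threshold can be shown with a small helper. The function name and the probability values are hypothetical; the sketch only illustrates how the same model outputs are classified differently under thresholds of 0.5 and 0.6.

```python
def classify(probabilities, threshold=0.5):
    """Mark a sample as positive only if its probability exceeds the threshold."""
    return [p > threshold for p in probabilities]

# Hypothetical model output probabilities for three samples.
probs = [0.55, 0.7, 0.3]
default_classes = classify(probs)        # threshold 0.5
stricter_classes = classify(probs, 0.6)  # threshold raised to 0.6
```

With the stricter threshold, the first sample (0.55) moves from the positive class to the negative class, while the other two are unchanged.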

According to the embodiment of the disclosure, a loss value between the output probability and the labeling probability is calculated through a preset loss function based on the output probability of the output label and the labeling probability of the labeling label; and adjusting the weight parameters of the initial model according to the loss value until the loss value reaches a preset expected value to obtain the air conditioner control model. The accuracy of the output result of the model can be improved, and the accuracy of controlling the air conditioner through the air conditioner control model is further improved.

Optionally, step 201 may include sub-steps 2011-2012:

a substep 2011, acquiring a control instruction and environmental information of a user through terminal equipment; the environmental information includes temperature;

sub-step 2012, splitting the control instruction and the environmental information to obtain a plurality of information units, and adding corresponding part-of-speech tags to the plurality of information units respectively to obtain control information including the plurality of information units.

In the embodiment of the disclosure, the terminal device may be any one of a plurality of smart home devices that are linked to each other through a network to form a smart home system. Any terminal device in the smart home system can acquire a control instruction of the user or environmental information. The control instruction may be input by the user through any smart home device, i.e. the terminal device. For example, the user may input a voice instruction through an intelligent voice assistant device or through an intelligent refrigerator, and a voice instruction acquired by one smart home device may in turn be obtained by the other smart home devices. Alternatively, the user may input a text instruction through the display screen of a smart home device. It should be noted that when a voice instruction of the user is obtained, the voice instruction may be converted into a text instruction.

The environmental information may include temperature, specifically, indoor temperature and outdoor temperature, and also air humidity, air quality, illumination intensity, etc., and the specific content of the environmental information is not particularly limited herein. The environmental information may be obtained by some smart home devices with corresponding sensors, for example, the air quality may be detected by the sensor of the air purifier, the indoor temperature may be obtained by the temperature sensor of the smart home device, etc.

The acquired control instruction and environmental information are split to obtain a plurality of information units, that is, the instruction is converted into a structured data format that the model can process easily. For example, the instruction text is split into individual vocabulary units: "adjust indoor temperature to twenty-five degrees" may be split into "adjust", "indoor", "temperature", "adjust to", "twenty-five degrees", and so on. In addition, part-of-speech tagging can be performed on each information unit, adding a corresponding part-of-speech tag to each of the plurality of information units. For example, "adjust" may be labeled as a verb, "indoor" as a noun, "temperature" as a noun, "adjust to" as a verb, "twenty-five degrees" as a numeral, and so on.

It should be understood that part-of-speech tagging is essentially a classification problem, i.e. classifying the information units in a corpus by part of speech. The part of speech of a word is determined by its meaning, morphology and grammatical function in the language to which it belongs. Parts of speech are not mutually exclusive sets: a word may belong to more than one part of speech, so part-of-speech tags depend on context. To add the part-of-speech tags, a rule-based part-of-speech tagging method, a part-of-speech tagging method based on a statistical model, a method combining statistics and rules, and the like can be adopted.

When a part-of-speech tagging method based on a statistical model is adopted, a Hidden Markov Model (HMM), a Conditional Random Field (CRF) or the like may be used. The statistical model can be trained on a large corpus of labeled data, in which each word is assigned its correct part-of-speech tag. The control information finally obtained comprises a plurality of information units with part-of-speech tags.
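The splitting and tagging of sub-step 2012 can be sketched with a rule-based lexicon lookup, the simplest of the tagging methods listed above. Everything specific here is an assumption for illustration: the lexicon contents, the whitespace-based splitting (which separates English words rather than the multi-word units of the example), the tag names, and the way environmental readings are appended as extra units.

```python
# Hypothetical rule-based lexicon; a real system might use an HMM or CRF tagger.
POS_LEXICON = {
    "adjust": "verb",
    "indoor": "noun",
    "outdoor": "noun",
    "temperature": "noun",
    "degrees": "noun",
    "to": "preposition",
    "twenty-five": "numeral",
}

def split_units(text):
    """Split text into lowercase information units on whitespace."""
    return text.lower().split()

def tag_units(units):
    """Attach a part-of-speech tag to each unit; unknown words get 'unknown'."""
    tagged = []
    for unit in units:
        if unit.replace(".", "", 1).isdigit():
            tag = "numeral"  # digit strings such as "28" or "28.5"
        else:
            tag = POS_LEXICON.get(unit, "unknown")
        tagged.append((unit, tag))
    return tagged

def build_control_information(instruction, environment):
    """Combine the instruction and environmental readings into tagged units."""
    units = split_units(instruction)
    for name, value in environment.items():
        units.extend(split_units(name))
        units.append(str(value))
    return tag_units(units)

control_information = build_control_information(
    "adjust indoor temperature to twenty-five degrees",
    {"indoor temperature": 28},
)
```

A lookup table cannot resolve words that belong to more than one part of speech, which is one reason the statistical methods mentioned above take context into account.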

By implementing the embodiment of the disclosure, a control instruction of a user and environmental information are acquired through terminal equipment, the environmental information including temperature; the control instruction and the environmental information are split to obtain a plurality of information units, and corresponding part-of-speech tags are added to the plurality of information units respectively to obtain control information comprising the plurality of information units. Relevant control information can thus be obtained conveniently and comprehensively to train the initial model, and the applicability of the air conditioner control model is improved.

Optionally, after step 201, the method further includes steps 202-203:

step 202, identifying text content of the control instruction and text content of the environment information;

step 203, when there are a plurality of control instructions or pieces of environment information with the same text content, retaining one of them.

In the embodiment of the present disclosure, after the control instruction and the environment information are acquired, their content needs to be identified. When there are a plurality of control instructions or pieces of environment information with the same text content, only one of them is retained. For example, the user may issue several identical control instructions through different smart home devices: a first control instruction, entered as text through the air conditioning control application on a mobile phone, is "raise temperature to twenty-five degrees", and a second control instruction, entered by voice through an intelligent voice assistant device, is also "raise temperature to twenty-five degrees". After the second, spoken instruction is converted into a text instruction, the two text instructions represent the same control content.

Duplicate data records may therefore arise when data is collected; they occupy a significant amount of memory and introduce noise into model training, so they need to be filtered out or deduplicated. Accordingly, among a plurality of control instructions with the same content, only one is retained. The same applies to the environmental information: the same environmental information may be acquired several times by different smart home devices, so only one piece of environmental information with a given content is retained.
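Retaining one record per text content can be sketched as an order-preserving deduplication. Treating records as equal after lowercasing and collapsing whitespace is an assumption added here; the application only requires that the text content be the same.

```python
def deduplicate(records):
    """Keep the first record for each distinct text content."""
    seen = set()
    kept = []
    for text in records:
        # Assumed normalization: ignore case and repeated whitespace.
        key = " ".join(text.lower().split())
        if key not in seen:
            seen.add(key)
            kept.append(text)
    return kept

instructions = [
    "raise temperature to twenty-five degrees",   # typed in the phone app
    "Raise temperature to  twenty-five degrees",  # transcribed from voice
    "turn on the air conditioner",
]
unique_instructions = deduplicate(instructions)
```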

It will be appreciated that further data processing may be performed on the acquired control instructions and environmental information. One example is filling in missing values: the data set, i.e. the control instructions and the environmental information, may contain missing values that were not recorded or were lost for some reason. To train the model better, these missing values need to be filled in. They can be filled with the mean, median, mode, etc., which keeps the amount of data unchanged. For time series data, interpolation methods such as linear interpolation and Lagrange interpolation may be used to estimate the missing values.
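Two of the filling strategies mentioned above, mean filling and linear interpolation, can be sketched as follows. Representing a missing value as `None` and the sample temperature readings are assumptions made for this example.

```python
from statistics import mean

def fill_with_mean(values):
    """Replace each missing value (None) with the mean of the known values."""
    known = [v for v in values if v is not None]
    average = mean(known)
    return [average if v is None else v for v in values]

def interpolate_linear(values):
    """Fill gaps between known points of a time series on a straight line.

    Gaps before the first or after the last known value are left as None.
    """
    result = list(values)
    known = [i for i, v in enumerate(result) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (result[right] - result[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = result[left] + step * (i - left)
    return result

# Hypothetical indoor temperature readings with missing entries.
mean_filled = fill_with_mean([20.0, None, 24.0])
interpolated = interpolate_linear([20.0, None, None, 26.0])
```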

In addition, outliers can be removed. The data set may contain outliers caused by measurement errors, data entry errors, etc., which may negatively affect model training. Outliers can be inspected and handled with visualization tools such as scatter plots and box plots, or values beyond a certain range can be clipped to the boundary values, so that the outliers do not greatly affect the model.

By implementing the embodiment of the disclosure, the text content of the control instruction and the text content of the environmental information are identified; when there are a plurality of control instructions or pieces of environmental information with the same text content, one of them is retained. This reduces the negative influence of the training data, namely the control instructions and environmental information, improves the performance of the air conditioner control model, and further improves the accuracy of air conditioner control through the air conditioner control model.
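Clipping values that fall outside a range to the boundary values can be sketched in a few lines. The bounds of 10 and 40 degrees and the sample readings are hypothetical; suitable bounds depend on the sensor and the data.

```python
def clip_outliers(values, lower, upper):
    """Clip every value to the closed range [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]

# Hypothetical indoor temperature readings; 95.0 and -40.0 are sensor glitches.
readings = [18.0, 24.5, 95.0, -40.0]
clipped = clip_outliers(readings, lower=10.0, upper=40.0)
```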

Optionally, step 204 may include sub-steps 2041-2042:

Sub-step 2041, performing text embedding on the plurality of information units through the encoder of the initial model to obtain feature vectors corresponding to the plurality of information units;

Sub-step 2042, obtaining a feature vector corresponding to the control information according to the feature vectors corresponding to the plurality of information units.

In the embodiment of the disclosure, text embedding is performed on the plurality of information units by the encoder of the initial model to obtain the feature vectors corresponding to the information units. The encoder may include a text embedding layer, whose function is to convert the words in the text into vector representations so as to capture the relationships between words in a high-dimensional space. The feature vectors of the information units may be obtained using one-hot encoding, word embedding (Word2Vec), or the like.

After the text embedding layer, a position encoding layer may be included. Because the same vocabulary can carry different semantics at different positions, this layer adds position information to the vectors to compensate for the embedding's lack of it. Each information unit can be represented by a one-dimensional vector, and the feature vector of the control information can be obtained based on the feature vectors of the plurality of information units.

According to the embodiment of the disclosure, text embedding is performed on the plurality of information units by the encoder of the initial model to obtain the feature vectors corresponding to the information units, and the feature vector corresponding to the control information is obtained from those feature vectors. In this way the feature vector of the control information can be extracted, and the ability to capture the features of the control information is improved.
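The embedding and position encoding steps can be illustrated with a small sketch. It uses one-hot encoding, which the description names, and adds a sinusoidal position encoding in the style used by Transformer models; the sinusoidal form, the tiny vocabulary, and the choice to represent the control information as the sequence of per-unit vectors are all assumptions made for illustration, not details stated in the application.

```python
import math


def one_hot(index, size):
    """Return a one-hot vector of the given size with a 1.0 at `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def positional_encoding(position, dim):
    """Sinusoidal position encoding (assumed form, Transformer-style)."""
    enc = []
    for i in range(dim):
        angle = position / (10000 ** (2 * (i // 2) / dim))
        enc.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return enc


def encode_units(units, vocab):
    """Embed each information unit and add its position encoding.

    Returns one vector per unit; together they stand in for the feature
    vector of the control information (an assumption of this sketch).
    """
    dim = len(vocab)
    features = []
    for pos, unit in enumerate(units):
        emb = one_hot(vocab[unit], dim)  # raises KeyError for unknown units
        pe = positional_encoding(pos, dim)
        features.append([e + p for e, p in zip(emb, pe)])
    return features


# Hypothetical vocabulary and information units.
vocab = {"raise": 0, "the": 1, "temperature": 2, "28": 3}
features = encode_units(["raise", "the", "temperature"], vocab)
```

A learned embedding such as Word2Vec would replace `one_hot` with a lookup into a trained matrix; the position encoding step would stay the same.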

Optionally, step 205 may include sub-steps 2051-2052:

Sub-step 2051, receiving the feature vector by the decoder to obtain a probability distribution output by the decoder; the probability distribution comprises a plurality of candidate output labels and output probabilities corresponding to the candidate output labels;

Sub-step 2052, determining a target output tag and an output probability of the target output tag from the plurality of candidate output tags based on a maximum probability in the probability distribution.

In the embodiment of the disclosure, the feature vector is received by the decoder and may be passed to a linear layer and a classification layer, and the classification layer outputs the index value of the label with the highest probability. With the index value, the corresponding label can be looked up in the label dictionary. That is, the output of the classification layer may be a probability distribution that assigns a probability to each label. For example, if the initial model is used to classify a piece of text as news, science and technology, or entertainment, the model output may be news with probability 0.1, science and technology with probability 0.1, and entertainment with probability 0.8. The probabilities in the distribution sum to 1.

The label corresponding to the maximum probability in the probability distribution is the output label of the model, and the output probability of the output label of the model is actually the maximum probability in the probability distribution.
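The selection of the output label can be sketched as follows. The sketch assumes that the classification layer applies a softmax to the scores from the linear layer, which is a common choice but is not stated in the application; the function names are hypothetical, and the label dictionary reuses the three-class example given above.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution that sums to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_label(probabilities, label_dict):
    """Return the label with the maximum probability and that probability."""
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return label_dict[index], probabilities[index]


# Label dictionary mapping index values to labels.
label_dict = {0: "news", 1: "science and technology", 2: "entertainment"}
label, probability = pick_label([0.1, 0.1, 0.8], label_dict)
```

Here `label` is "entertainment" and `probability` is 0.8, matching the example distribution.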

Referring to fig. 3, a step flowchart of an air conditioner control method according to an embodiment of the present application is shown, where the method includes:

Step 301, obtaining control information, and inputting the control information into an air conditioner control model;

Step 302, after the air conditioner control model receives the control information, an output label is obtained;

Step 303, generating an air conditioner control instruction according to the output tag, and controlling an air conditioner based on the air conditioner control instruction.

In the embodiment of the present disclosure, the control information is acquired; it may be a control instruction, environmental information, and so on, as described in the foregoing embodiments and not repeated here. The control information is input into the air conditioner control model, and after the model receives the control information, an output tag is obtained, which may be, for example, "raise temperature". Based on the output tag, an air conditioner control instruction is generated to control the air conditioner. For other details, refer to the embodiments of fig. 1 or fig. 2, which are not repeated here.
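Turning an output tag into a control instruction can be sketched as a simple mapping. Only the tag "raise temperature" appears in the description; the tag "lower temperature", the instruction format, and the one-degree step are hypothetical and chosen for illustration.

```python
def generate_instruction(output_tag, current_setpoint, step=1.0):
    """Map an output tag to an air conditioner control instruction.

    The dictionary format and the default step are assumptions of this sketch.
    """
    if output_tag == "raise temperature":
        return {"command": "set_temperature", "value": current_setpoint + step}
    if output_tag == "lower temperature":  # hypothetical second tag
        return {"command": "set_temperature", "value": current_setpoint - step}
    raise ValueError(f"no instruction defined for tag: {output_tag!r}")


instruction = generate_instruction("raise temperature", current_setpoint=26.0)
```

Raising an error for an unknown tag keeps the air conditioner from acting on a label the mapping does not cover.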

Referring to fig. 4, a training device for an air conditioner control model provided in an embodiment of the present application includes:

an information acquisition module 401, configured to acquire control information, and input the control information into an initial model; the initial model comprises an encoder and a decoder, wherein a plurality of labeling labels are arranged in the control information, and the labeling labels have labeling probability;

a feature vector module 402 for receiving the control information by the encoder of the initial model and extracting a feature vector based on the control information;

a probability output module 403, configured to receive the feature vector through the decoder of the initial model, and obtain an output tag and an output probability of the output tag based on the feature vector;

and the model training module 404 is configured to train the initial model to obtain an air conditioner control model according to the output probability of the output label and the labeling probability of the labeling label.

Optionally, the model training module includes:

the loss calculation sub-module is used for calculating a loss value between the output probability and the labeling probability through a preset loss function based on the output probability of the output label and the labeling probability of the labeling label; the preset loss function is one of a cross entropy loss function and a mean square error loss function;

and the parameter adjustment sub-module is used for adjusting the weight parameter of the initial model according to the loss value until the loss value reaches a preset expected value so as to obtain an air conditioner control model.
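The loss calculation and parameter adjustment can be sketched as follows. The two loss functions are the ones named above; the training loop is a deliberately tiny stand-in in which the "weights" are just the scores fed to a softmax and are updated by plain gradient descent. The learning rate, the expected loss value of 0.05, and all function names are assumptions for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(output_probs, label_probs, eps=1e-12):
    """Cross entropy between output and labeling probabilities."""
    return -sum(y * math.log(p + eps) for p, y in zip(output_probs, label_probs))


def mean_squared_error(output_probs, label_probs):
    """Mean squared error between output and labeling probabilities."""
    return sum((p - y) ** 2 for p, y in zip(output_probs, label_probs)) / len(output_probs)


def train(weights, label_probs, expected_loss=0.05, lr=0.5, max_steps=10000):
    """Adjust the weights until the loss reaches the preset expected value."""
    loss = float("inf")
    for _ in range(max_steps):
        probs = softmax(weights)
        loss = cross_entropy(probs, label_probs)
        if loss <= expected_loss:
            break
        # Gradient of softmax cross entropy with respect to the scores is p - y.
        weights = [w - lr * (p - y) for w, p, y in zip(weights, probs, label_probs)]
    return weights, loss


# Labeling probability: the third label is the correct one.
trained_weights, final_loss = train([0.0, 0.0, 0.0], [0.0, 0.0, 1.0])
```

A real model would update the encoder and decoder weights through backpropagation, but the stopping rule is the same: training ends once the loss value reaches the preset expected value.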

Optionally, the information acquisition module includes:

acquiring a control instruction and environment information of a user through terminal equipment; the environmental information includes temperature;

splitting the control instruction and the environment information to obtain a plurality of information units, and respectively adding corresponding part-of-speech tags to the plurality of information units to obtain control information comprising the plurality of information units.
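The splitting and tagging step can be illustrated with a toy sketch. It assumes English text split on whitespace and a small hand-written part-of-speech lexicon; a real system would use a trained word segmenter and tagger, and the lexicon, tag names, and function names here are hypothetical.

```python
# Hypothetical part-of-speech lexicon; unknown words receive the tag "X".
POS_LEXICON = {
    "raise": "VERB",
    "the": "DET",
    "temperature": "NOUN",
    "indoor": "ADJ",
}


def tag_unit(unit):
    """Return a part-of-speech tag for one information unit."""
    if unit.replace(".", "", 1).isdigit():
        return "NUM"
    return POS_LEXICON.get(unit, "X")


def build_control_information(instruction, environment):
    """Split both texts into information units and attach a tag to each."""
    units = (instruction + " " + environment).lower().split()
    return [(unit, tag_unit(unit)) for unit in units]


control_information = build_control_information(
    "Raise the temperature", "indoor temperature 30"
)
```

The result is a list of (information unit, tag) pairs that can then be passed to the encoder.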

Optionally, the apparatus further comprises:

the identification module is used for identifying the text content of the control instruction and the text content of the environment information;

and the reservation module is used for reserving one of the control instructions or the environment information of the same text contents when the control instructions or the environment information of the same text contents exist.

Optionally, the feature vector module includes:

the unit vector sub-module is used for text embedding the information units through the encoder of the initial model to obtain feature vectors corresponding to the information units;

and the information vector sub-module is used for obtaining the feature vector corresponding to the control information according to the feature vectors corresponding to the information units.

Optionally, the probability output module includes:

the probability output sub-module is used for receiving the characteristic vector through the decoder and obtaining probability distribution output by the decoder; the probability distribution comprises a plurality of candidate output labels and output probabilities corresponding to the candidate output labels;

and the final output sub-module is used for determining a target output label and the output probability of the target output label from the plurality of candidate output labels according to the maximum probability in the probability distribution.

Referring to fig. 5, an air conditioner control device provided in an embodiment of the present application includes:

a control acquisition module 501, configured to acquire control information, and input the control information to an air conditioner control model;

the tag obtaining module 502 is configured to obtain an output tag after the air conditioner control model receives the control information;

the instruction control module 503 is configured to generate an air conditioner control instruction according to the output tag, and control an air conditioner based on the air conditioner control instruction.

Fig. 6 illustrates a block diagram of an electronic device 600, according to an example embodiment. For example, the electronic device 600 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.

Referring to fig. 6, an electronic device 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.

The processing component 602 generally controls overall operation of the electronic device 600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 602 may include one or more processors 620 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 602 can include one or more modules that facilitate interaction between the processing component 602 and other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.

The memory 604 is used to store various types of data to support operations at the electronic device 600. Examples of such data include instructions for any application or method operating on the electronic device 600, contact data, phonebook data, messages, pictures, multimedia, and so forth. The memory 604 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.

The power supply component 606 provides power to the various components of the electronic device 600. The power supply components 606 can include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 600.

The multimedia component 608 includes a screen that provides an output interface between the electronic device 600 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action, but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. When the electronic device 600 is in an operational mode, such as a shooting mode or a multimedia mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capabilities.

The audio component 610 is for outputting and/or inputting audio signals. For example, the audio component 610 includes a Microphone (MIC) for receiving external audio signals when the electronic device 600 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 604 or transmitted via the communication component 616. In some embodiments, audio component 610 further includes a speaker for outputting audio signals.

The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.

The sensor assembly 614 includes one or more sensors for providing status assessments of various aspects of the electronic device 600. For example, the sensor assembly 614 may detect the on/off state of the electronic device 600 and the relative positioning of components, such as the display and keypad of the electronic device 600. The sensor assembly 614 may also detect a change in position of the electronic device 600 or of a component of the electronic device 600, the presence or absence of user contact with the electronic device 600, the orientation or acceleration/deceleration of the electronic device 600, and a change in temperature of the electronic device 600. The sensor assembly 614 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact. The sensor assembly 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 614 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

The communication component 616 is utilized to facilitate communication between the electronic device 600 and other devices, either in a wired or wireless manner. The electronic device 600 may access a wireless network based on a communication standard, such as WiFi, an operator network (e.g., 2G, 3G, 4G, or 5G), or a combination thereof. In one exemplary embodiment, the communication component 616 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 616 further includes a Near Field Communication (NFC) module to facilitate short range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, the electronic device 600 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for implementing the training methods for air conditioner control models provided by embodiments of the present application.

In an exemplary embodiment, a non-transitory computer-readable storage medium is also provided, such as memory 604, including instructions executable by processor 620 of electronic device 600 to perform the above-described method. For example, the non-transitory storage medium may be ROM, Random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.

Fig. 7 is a block diagram of an electronic device 700, as shown in an exemplary embodiment. For example, the electronic device 700 may be provided as a server. Referring to fig. 7, electronic device 700 includes a processing component 722 that further includes one or more processors and memory resources represented by memory 732 for storing instructions, such as application programs, executable by processing component 722. The application programs stored in memory 732 may include one or more modules that each correspond to a set of instructions. In addition, the processing component 722 is configured to execute instructions to perform a training method of an air conditioner control model provided in an embodiment of the present application.

The electronic device 700 may also include a power supply component 726 configured to perform power management of the electronic device 700, a wired or wireless network interface 750 configured to connect the electronic device 700 to a network, and an input/output (I/O) interface 758. The electronic device 700 may operate based on an operating system stored in memory 732, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.

Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.

It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims

1. A training method for an air conditioner control model, comprising:

2. The method of claim 1, wherein training the initial model to obtain an air conditioning control model based on the output probability of the output tag and the labeling probability of the labeling tag comprises:

calculating a loss value between the output probability and the labeling probability through a preset loss function based on the output probability of the output label and the labeling probability of the labeling label; the preset loss function is one of a cross entropy loss function and a mean square error loss function;

and adjusting the weight parameters of the initial model according to the loss value until the loss value reaches a preset expected value to obtain an air conditioner control model.

3. The method of claim 1, wherein the step of obtaining control information comprises:

4. A method according to claim 3, further comprising, after the step of acquiring the control instruction and the environmental information of the user by the terminal device:

identifying text content of the control instruction and text content of the environmental information;

when there are a plurality of the control instructions or the environment information of the same text content, one of the control instructions or the environment information of the same text content is retained.

5. A method according to claim 3, wherein the step of receiving the control information by the encoder of the initial model, converting the control information into feature vectors, comprises:

text embedding is carried out on the plurality of information units through the encoder of the initial model to obtain feature vectors corresponding to the plurality of information units;

and obtaining the feature vector corresponding to the control information according to the feature vector corresponding to the information units.

6. The method of claim 1, wherein the step of receiving the feature vector by the decoder of the initial model, converting the feature vector to an output label and an output probability of the output label comprises:

receiving the feature vector through the decoder to obtain probability distribution output by the decoder; the probability distribution comprises a plurality of candidate output labels and output probabilities corresponding to the candidate output labels;

and determining a target output tag and the output probability of the target output tag from the plurality of candidate output tags according to the maximum probability in the probability distribution.

7. An air conditioner control method, comprising:

generating an air conditioner control instruction according to the output tag, and controlling an air conditioner based on the air conditioner control instruction; wherein the air conditioning control model is trained by the method of any one of claims 1-6.

8. A training device for an air conditioner control model, comprising:

9. An air conditioner control device, comprising:

the instruction control module is used for generating an air conditioner control instruction according to the output tag and controlling an air conditioner based on the air conditioner control instruction; wherein the air conditioning control model is trained by the method of any one of claims 1-6.

10. An electronic device, comprising: a memory and a processor, the memory having stored thereon a computer program which, when executed by the processor, performs the method of any of claims 1 to 7.

11. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored program, wherein the program, when run, controls a device in which the computer readable storage medium is located to perform the method of any one of claims 1 to 7.