WO2020006993A1

WO2020006993A1 - Intelligent household electrical appliance control method and intelligent household electrical appliance control device

Info

Publication number: WO2020006993A1
Application number: PCT/CN2018/122256
Authority: WO
Inventors: 杨赛赛; 陈翀; 万会; 宋德超; 连圆圆; 秦萍; 冯德兵
Original assignee: 珠海格力电器股份有限公司
Priority date: 2018-07-06
Filing date: 2018-12-20
Publication date: 2020-01-09
Also published as: CN110687802A

Abstract

Disclosed are an intelligent household electrical appliance control method and an intelligent household electrical appliance control device (5), which belong to the field of intelligent household electrical appliance control. The intelligent household electrical appliance control method comprises: obtaining parameter information (S101); obtaining, on the basis of the parameter information, a control action corresponding to the parameter information by means of a preset model, wherein the preset model includes a reinforcement learning model that can be adjusted according to a comfort evaluation result (S102); and controlling operation according to the control action (S103). On the basis of the obtained parameter information, control actions with good evaluation results are output to control operation of the intelligent household electrical appliance, so that the control actions executed by the appliance satisfy the comfort demands of a user, thereby achieving comfort control and improving the user experience.

Description

Intelligent household appliance control method and intelligent household appliance control device

Related applications

This application claims priority from a Chinese patent application filed on July 6, 2018, with application number 201810734605.1, entitled "A Smart Home Appliance Control Method and Smart Home Appliance Control Device", which is incorporated herein by reference in its entirety.

Technical field

The present application relates to the field of smart home appliance control, and in particular, to a smart home appliance control method and a smart home appliance control device.

Background

Smart home appliances can improve the comfort of home life. An air conditioner, for example, can provide users with a comfortable ambient temperature.

At present, an air conditioner is controlled as follows: the user sets the operating temperature, and the air conditioner performs feedback adjustment according to the ambient temperature of the room so that the room temperature is maintained at the set temperature. In terms of comfort control, users usually set the air conditioner according to their own bodily sensation. In this case, the operating state set by the user each time may not be the most comfortable operating state, which results in a poor user experience.

Therefore, in terms of the comfort control and user experience of smart home appliances, there is still a need for improvement.

Summary of the invention

In order to overcome the problems in the related technology at least to a certain extent, the present application discloses a smart home appliance control method and a smart home appliance control device.

In order to achieve the above purpose, this application uses the following technical solutions:

A smart home appliance control method includes:

Acquiring parameter information;

Obtaining, based on the parameter information, a control action corresponding to the parameter information through a preset model, wherein the preset model includes a reinforcement learning model, and the reinforcement learning model can be adjusted according to a comfort evaluation result;

Controlling operation according to the control action.

Preferably, the acquiring parameter information includes:

Acquiring environmental parameter information; and/or,

Acquiring the smart home appliance's own parameter information.

Preferably, the acquiring environmental parameter information includes:

Acquiring environmental parameter information collected and/or configured by the smart home appliance itself; and/or,

Acquiring environmental parameter information collected and/or configured by a device external to the smart home appliance.

Preferably, the preset model further includes a state transition model;

The obtaining a control action corresponding to the parameter information based on the parameter information through a preset model includes:

Obtaining state parameters corresponding to the parameter information through the state transition model based on the parameter information, and the state transition model is used to represent a correspondence between the parameter information and the state parameters;

Based on the state parameter, a control action is generated by the reinforcement learning model, and the reinforcement learning model is used to represent a correspondence between the state parameter and the control action.

Preferably, the state transition model includes one or more of a state comparison table, a neural network model, and a preset logic rule.

Preferably, the reinforcement learning model can be adjusted according to the comfort evaluation results, including:

The probability that the reinforcement learning model outputs the control action can be adjusted according to the comfort evaluation result.

Preferably, it further includes:

Acquiring the comfort evaluation result after controlling operation according to the control action;

Updating the reinforcement learning model according to the comfort evaluation result.

Preferably, the obtaining the comfort evaluation result after running according to the control action includes:

Obtaining state parameters before and after performing a corresponding operation according to the control action;

Calculating a first comfort value and a second comfort value according to a preset comfort evaluation algorithm, wherein the first comfort value is a comfort value corresponding to a state parameter before performing a corresponding operation according to the control action, and the second comfort value is a comfort value corresponding to a state parameter after performing a corresponding operation according to the control action;

According to the first comfort value and the second comfort value, the comfort evaluation result is obtained.

Preferably, the comfort evaluation algorithm sets the same or different weights corresponding to each state parameter.

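Preferably, the obtaining the comfort evaluation result after controlling operation according to the control action includes:

Obtaining a comfort evaluation result fed back by a user after a corresponding operation is performed according to the control action.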

Preferably, the comfort evaluation result includes a positive evaluation result or a negative evaluation result, and the updating the reinforcement learning model according to the comfort evaluation result includes:

If the comfort evaluation result is a positive evaluation result, increasing the output probability of the control action; or,

If the comfort evaluation result is a negative evaluation result, the output probability of the control action is reduced.

A smart home appliance control device includes:

A first obtaining module, configured to obtain parameter information;

A second obtaining module, configured to obtain, based on the parameter information, a control action corresponding to the parameter information through a preset model, wherein the preset model includes a reinforcement learning model, and the reinforcement learning model can be adjusted according to a comfort evaluation result;

A control module, configured to control operation according to the control action.

Preferably, the first obtaining module is specifically configured to:

Obtain environmental parameter information; and/or,

Obtain the smart home appliance's own parameter information.

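Preferably, in the second obtaining module, the preset model further includes a state transition model, and the second obtaining module is specifically configured to:

Obtain state parameters corresponding to the parameter information through the state transition model based on the parameter information, where the state transition model is used to represent a correspondence between the parameter information and the state parameters;

Generate a control action through the reinforcement learning model based on the state parameters, where the reinforcement learning model is used to represent a correspondence between the state parameters and the control action.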

Preferably, the smart home appliance control device further includes:

An evaluation module, configured to obtain the comfort evaluation result after controlling operation according to the control action;

An update module is configured to update the reinforcement learning model according to the comfort evaluation result.

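Preferably, the evaluation module is specifically configured to:

Obtain state parameters before and after performing a corresponding operation according to the control action;

Calculate a first comfort value and a second comfort value according to a preset comfort evaluation algorithm, wherein the first comfort value is a comfort value corresponding to a state parameter before performing a corresponding operation according to the control action, and the second comfort value is a comfort value corresponding to a state parameter after performing a corresponding operation according to the control action;

Obtain the comfort evaluation result according to the first comfort value and the second comfort value.

Preferably, the evaluation module is further specifically configured to:

Obtain a comfort evaluation result fed back by a user after a corresponding operation is performed according to the control action.

Preferably, the comfort evaluation result includes a positive evaluation result or a negative evaluation result, and the update module is specifically configured to:

If the comfort evaluation result is a positive evaluation result, increase the output probability of the control action; or,

If the comfort evaluation result is a negative evaluation result, reduce the output probability of the control action.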

This application adopts the above technical solutions and has at least the following beneficial effects:

Based on the obtained parameter information, this application obtains the control action corresponding to the parameter information through a preset model. The preset model includes a reinforcement learning model, which can be adjusted according to the comfort evaluation result. When the operation of the smart home appliance is controlled in this way, the control actions it performs can meet the user's comfort needs, thereby achieving comfort control of smart home appliances and improving the user experience.

It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and should not limit the present application.

Brief description of the drawings

In order to explain the technical solutions in the embodiments of the present application or the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are merely some embodiments of the present application. For those of ordinary skill in the art, other drawings can be obtained from the disclosed drawings without creative effort.

FIG. 1 is a schematic flowchart of a smart home appliance control method disclosed by an embodiment of the present application;

FIG. 2 is a schematic structural diagram of a state transition model disclosed by an embodiment of the present application;

FIG. 3 is a schematic flowchart of a smart home appliance control method disclosed by another embodiment of the present application;

FIG. 4 is a schematic flowchart of a smart home appliance control method disclosed by another embodiment of the present application;

FIG. 5 is a schematic structural diagram of a smart home appliance control device disclosed by an embodiment of the present application;

FIG. 6 is a schematic structural diagram of a smart home appliance control device disclosed by another embodiment of the present application.

Detailed description

In order to make the purpose, technical solution, and advantages of the present application clearer, the technical solution of the present application will be described in detail below. Obviously, the described embodiments are only a part of the embodiments of the present application, but not all the embodiments. Based on the embodiments in the present application, all other implementations obtained by a person of ordinary skill in the art without making creative efforts fall within the protection scope of the present application.

FIG. 1 is a schematic flowchart of a smart home appliance control method according to an embodiment of the present application. As shown in FIG. 1, the smart home appliance control method includes the following steps:

Step S101: Acquire parameter information.

It can be understood that the parameter information is related to the comfort of the smart home appliance's control action, and the parameter information may be environmental parameter information. In one embodiment, the environmental parameter information may be collected and/or configured by the smart home appliance itself, so that the smart home appliance obtains the environmental parameter information by itself. For example, the smart home appliance uses its own sensors to collect environmental parameter information such as indoor temperature, humidity, and particle information; as another example, room information such as room size, orientation, and lighting is configured in the smart home appliance. Alternatively, the environmental parameter information may be collected and/or configured by a device external to the smart home appliance, and the smart home appliance receives it from that external device. For example, the smart home appliance receives local weather information, such as local temperature, humidity, rain, and snow, sent by a cloud server over the network. Alternatively, the smart home appliance may be associated with other smart home appliances and sensors and receive the environmental parameter information they collect. For example, it is associated with other smart home appliances and receives the temperature and humidity information they collect; or it is associated with door and window sensors, which detect whether doors and windows are open or closed and send this status information to the smart home appliance. Alternatively, the smart home appliance receives information sent by the control center of the smart home system, such as room information configured in that control center.

From the perspective of acquisition, the above-mentioned environmental parameter information may be obtained by the smart home appliance itself or received from external devices.

From the perspective of specific information, in a specific embodiment, the environmental parameter information may include at least one of the following: local weather information, such as temperature, humidity, rain, snow, etc.; room information of the room where the smart home appliance is located, such as space size, orientation, lighting conditions, etc.; and information about other devices in the room where the smart home appliance is located, such as door and window status information indicating whether doors or windows are open or closed.

In addition, the parameter information may also be parameter information of the smart home appliance, such as the running time information of the smart home appliance.

Through the above embodiments, diversification of parameter information can be achieved, and the diversified parameter information is comprehensively used to control smart home appliances. Smart home appliances are controlled by responding to the diversified parameter information, instead of being operated by users themselves, which can improve the user experience.
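As an illustration only, the parameter information described above could be gathered into a single record as sketched below. The field names, units, and the `merge_sources` helper are hypothetical and do not appear in the application; the sketch merely shows how readings from the appliance itself and from external devices might be combined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParameterInfo:
    """Hypothetical container for parameter information (all names assumed)."""
    indoor_temp_c: Optional[float] = None        # appliance's own sensor
    indoor_humidity_pct: Optional[float] = None  # appliance's own sensor
    outdoor_temp_c: Optional[float] = None       # local weather, e.g. from a cloud server
    window_open: Optional[bool] = None           # door and window sensor
    room_area_m2: Optional[float] = None         # configured room information
    runtime_minutes: Optional[float] = None      # appliance's own parameter information


def merge_sources(*sources: dict) -> ParameterInfo:
    """Combine readings from several sources; later sources override earlier ones."""
    merged = {}
    for source in sources:
        for key, value in source.items():
            if value is not None:  # ignore readings a source could not provide
                merged[key] = value
    return ParameterInfo(**merged)


own_sensors = {"indoor_temp_c": 29.0, "indoor_humidity_pct": 65.0}
external = {"outdoor_temp_c": 18.0, "window_open": True}
info = merge_sources(own_sensors, external)
```

Fields that no source supplies simply remain `None`, which leaves it to the later stages to decide how to treat missing information.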

In a specific embodiment, the smart home appliance includes, but is not limited to, a smart air conditioner. Taking a smart air conditioner as an example, controlling it in response to diverse parameter information, rather than relying on the user's own operation, can improve the user experience and, in some cases, also save energy. For example, if hot local weather suddenly turns cold without the user noticing, the smart air conditioner can adjust its operation according to the current local weather and thereby save energy.

Step S102: Based on the parameter information, a control action corresponding to the parameter information is obtained through a preset model. The preset model includes a reinforcement learning model, and the reinforcement learning model can be adjusted according to a comfort evaluation result.

In the above solution, the control action is generated by the reinforcement learning model, and the reinforcement learning model can be adjusted according to the comfort evaluation result, so that the control action generated after adjustment matches the user's comfort experience.

In some embodiments, the preset model may further include a state transition model. Accordingly, obtaining the control action corresponding to the parameter information through the preset model based on the parameter information includes:

Obtaining state parameters corresponding to the parameter information through the state transition model based on the parameter information, where the state transition model is used to represent a correspondence between the parameter information and the state parameters;

Generating a control action through the reinforcement learning model based on the state parameters, where the reinforcement learning model is used to represent a correspondence between the state parameters and the control action.

FIG. 2 is a schematic structural diagram of a state transition model provided by an embodiment of the present application. As shown in FIG. 2, the input of the state transition model 20 is parameter information and the output is state parameters. For example, the parameter information includes door and window closing conditions 201, weather environment conditions 202 (including temperature, humidity, rain, snow, etc.), and room information 203 (including space size, orientation, lighting, etc.). The state parameters are preset and fixed; FIG. 2 takes three state parameters as an example. The specific state parameters can be set according to the actual situation. For example, when the smart home appliance is an air conditioner, the state parameters may include temperature, humidity, and lighting, and may be three types of states determined by comprehensively evaluating a variety of parameter information.

The state transition model 20 may be a manually set logic rule, a state comparison table, a neural network structure, or a combination of the three. The output is a simplified mapping of the input information, and the specific output parameter types depend on the actual control target. Generally, establishing this mapping requires extraction from, or training on, a large amount of actual case data. For example, if extraction from actual case data determines that parameter information A corresponds to state parameter B, then when the obtained parameter information is A, the state transition model yields state parameter B.

State conversion makes the method better suited to situations where the parameter information is complicated. The complex parameter information is converted into state parameters through a mapping relationship, so that the state parameters summarize the complex parameter information. This simplifies data processing and spares the reinforcement learning model the processing burden of handling complex and numerous parameter information directly.
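A minimal sketch of such a state transition model is given below, combining preset logic rules with a state comparison table. The thresholds, the table entries, and the names `LIGHTING_TABLE` and `to_state` are assumptions made for illustration; the application does not specify them, and a neural network could be used instead.

```python
# Hypothetical state comparison table: (room orientation, weather) -> lighting state.
LIGHTING_TABLE = {
    ("south", "clear"): "bright",
    ("south", "rain"): "dim",
    ("north", "clear"): "dim",
    ("north", "rain"): "dark",
}


def to_state(params: dict) -> tuple:
    """Map raw parameter information to three discrete state parameters."""
    # Logic rules with assumed thresholds for temperature and humidity.
    temp = params["indoor_temp_c"]
    if temp >= 28:
        temperature = "hot"
    elif temp <= 18:
        temperature = "cold"
    else:
        temperature = "mild"
    humidity = "humid" if params["humidity_pct"] >= 70 else "normal"
    # Table lookup for lighting; unknown combinations fall back to "dim".
    lighting = LIGHTING_TABLE.get((params["orientation"], params["weather"]), "dim")
    return (temperature, humidity, lighting)


state = to_state({
    "indoor_temp_c": 30.0,
    "humidity_pct": 75.0,
    "orientation": "south",
    "weather": "clear",
})
```

In this sketch many different raw readings collapse onto a small, fixed set of state tuples, which is the simplification the state transition model is meant to provide.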

The input of the reinforcement learning model is the state parameter, and the output is the control action. Taking the air conditioner as an example, the state parameter is, for example, a temperature drop, and the control action is, for example, raising the temperature. The output probability of each control action of the reinforcement learning model can be adjusted according to the comfort evaluation result, so that each control action output by the reinforcement learning model can be the control action with the best comfort evaluation result.

In one embodiment, the reinforcement learning model can be adjusted according to the comfort evaluation result, including: the probability that the reinforcement learning model outputs the control action can be adjusted according to the comfort evaluation result.

It can be understood that the output probability of the control action can be adjusted according to the comfort evaluation result, so that the generated control action can be most suitable for the user's comfort experience.
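One simple way to picture a model whose output probabilities can be adjusted is a table of per-state action weights from which the control action is sampled. The sketch below is only such an illustration: the class name `ActionPolicy`, the action names, and the uniform initial weights are assumptions, and the application does not prescribe a particular reinforcement learning algorithm.

```python
import random


class ActionPolicy:
    """Tabular policy: one adjustable weight per (state, action) pair."""

    def __init__(self, actions, seed=None):
        self.actions = list(actions)
        self.weights = {}  # state -> list of weights, aligned with self.actions
        self.rng = random.Random(seed)

    def probabilities(self, state):
        """Return the current output probability of every control action."""
        # Unseen states start with equal weights (an assumption of this sketch).
        weights = self.weights.setdefault(state, [1.0] * len(self.actions))
        total = sum(weights)
        return {a: w / total for a, w in zip(self.actions, weights)}

    def choose(self, state):
        """Sample a control action according to the current probabilities."""
        probs = self.probabilities(state)
        return self.rng.choices(
            self.actions, weights=[probs[a] for a in self.actions], k=1
        )[0]


policy = ActionPolicy(["raise_temperature", "hold", "lower_temperature"], seed=0)
action = policy.choose("temperature_drop")
```

Raising or lowering an entry in `weights` changes how often the corresponding control action is output for that state.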

Step S103: Control operation according to the control action.

It can be understood that the reinforcement learning model is obtained by continuous updating according to the comfort evaluation results of control actions. Because the control action output by the reinforcement learning model can be adjusted according to the comfort evaluation result, for a specific state parameter the reinforcement learning model outputs the control action with the best evaluation result corresponding to that state parameter to control the operation of the smart home appliance. The control action performed by the smart home appliance thus achieves comfort control, thereby improving the user experience.

FIG. 3 is a schematic flowchart of a smart home appliance control method according to another embodiment of the present application. As shown in FIG. 3, the smart home appliance control method further includes the following steps:

Step S104: Obtain the comfort evaluation result after the control operation is performed according to the control action.

The comfort evaluation result reflects the degree of comfort experienced by the user after the control action is performed. It can be understood that once a control action has a new comfort evaluation result, how often that control action occurs in the future operation of the smart home appliance will be adjusted based on the new comfort evaluation result.

The comfort evaluation result may be a feedback result of the user, and / or the comfort evaluation result may also be calculated according to a preset algorithm.

For example, in one embodiment, acquiring the comfort evaluation result after controlling the operation of the smart home appliance according to the control action includes:

Obtaining a comfort evaluation result fed back by a user after a corresponding operation is performed according to the control action.

Taking the air conditioner as an example, after the air conditioner performs a control action, the user's intuitive evaluation of how comfortable the environment feels can serve as the comfort evaluation result for that control action.

For another example, in another embodiment, acquiring the comfort evaluation result after running according to the control action includes:

Obtaining state parameters before and after performing a corresponding operation according to the control action;

Calculating a first comfort value and a second comfort value according to a preset comfort evaluation algorithm, wherein the first comfort value is a comfort value corresponding to a state parameter before performing a corresponding operation according to the control action, and the second comfort value is a comfort value corresponding to a state parameter after performing a corresponding operation according to the control action;

Obtaining the comfort evaluation result according to the first comfort value and the second comfort value.

Taking the air conditioner as an example, before the air conditioner performs a new control action, its state parameters correspond to the first comfort value; after the air conditioner performs the new control action, its new state parameters correspond to the second comfort value. The first comfort value and the second comfort value are compared numerically. If the second comfort value is greater than the first comfort value, the comfort evaluation result is a positive evaluation result, and the comfort evaluation of the control action is better; if the second comfort value is less than the first comfort value, the comfort evaluation result is a negative evaluation result, and the comfort evaluation of the control action is poor.

In the above solution, the comfort value corresponding to the state parameters is obtained through a preset comfort evaluation algorithm. The comfort evaluation algorithm may be a comparison table between state parameters and comfort values, or may be a formula or the like. The same or different weights can be set for each state parameter in the comfort evaluation algorithm, and the state parameters are quantified according to their weights to obtain the corresponding comfort value.
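The weighted comfort evaluation described above can be sketched as a formula-based algorithm, as follows. The weights, ideal values, and scales are invented for illustration and are not taken from the application; the "neutral" outcome for equal comfort values is likewise an assumption, since the application only defines positive and negative results.

```python
# Assumed weights, ideal values and normalisation scales per state parameter.
WEIGHTS = {"temperature": 0.5, "humidity": 0.3, "lighting": 0.2}
IDEAL = {"temperature": 25.0, "humidity": 50.0, "lighting": 300.0}
SCALE = {"temperature": 10.0, "humidity": 50.0, "lighting": 300.0}


def comfort_value(state: dict) -> float:
    """Weighted comfort score in [0, 1]; 1 means every parameter is at its ideal."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        deviation = abs(state[name] - IDEAL[name]) / SCALE[name]
        total += weight * max(0.0, 1.0 - deviation)
    return total


def evaluate(before: dict, after: dict) -> str:
    """Compare the first (before) and second (after) comfort values."""
    first = comfort_value(before)
    second = comfort_value(after)
    if second > first:
        return "positive"
    if second < first:
        return "negative"
    return "neutral"  # equal values: not defined in the application, assumed here


before = {"temperature": 30.0, "humidity": 50.0, "lighting": 300.0}
after = {"temperature": 26.0, "humidity": 50.0, "lighting": 300.0}
result = evaluate(before, after)
```

A comparison table could replace `comfort_value` without changing `evaluate`, since only the two resulting comfort values are compared.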

Step S105: Update the reinforcement learning model according to the comfort evaluation result.

It can be understood that after the reinforcement learning model is updated according to the comfort evaluation result, the control action generated by the reinforcement learning model is adjusted accordingly, so that the reinforcement learning model outputs the control action with the best evaluation result.

In one embodiment, the comfort evaluation result includes a positive evaluation result or a negative evaluation result, and updating the reinforcement learning model according to the comfort evaluation result includes:

If the comfort evaluation result is a positive evaluation result, increasing the output probability of the control action; or,

If the comfort evaluation result is a negative evaluation result, reducing the output probability of the control action.

Taking the air conditioner as an example, if the comfort evaluation result is a positive evaluation result, the indoor environment after the air conditioner performs the control action is more comfortable for the user, and, according to the positive evaluation, the probability that the control action occurs in the subsequent operation of the air conditioner is increased. If the comfort evaluation result is a negative evaluation result, the indoor environment after the air conditioner performs the control action is less comfortable for the user, and, according to the negative evaluation, the probability that the control action occurs in the subsequent operation of the air conditioner is reduced. It can be understood that in the actual operation of the air conditioner, the above process is repeated many times. If a control action repeatedly receives positive evaluations, the indoor environment after the control action is performed gives the user a very good comfort experience, and the probability of performing this control action in the future is correspondingly high. The control action of the air conditioner is thus adjusted toward optimal comfort, thereby improving the comfort control of air conditioner operation and improving the user experience.
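The update rule above can be illustrated with a small function that scales the probability of the evaluated control action and renormalises the distribution. The multiplicative rule and the `step` size are assumptions of this sketch; the application only requires that a positive result increases, and a negative result reduces, the output probability.

```python
def update_probabilities(probs: dict, action: str, result: str, step: float = 0.1) -> dict:
    """Return new action probabilities after one comfort evaluation result."""
    if result == "positive":
        factor = 1.0 + step   # make the action more likely
    elif result == "negative":
        factor = 1.0 - step   # make the action less likely
    else:
        raise ValueError("result must be 'positive' or 'negative'")
    scaled = dict(probs)
    scaled[action] *= factor
    total = sum(scaled.values())
    # Renormalise so the probabilities still sum to 1.
    return {a: p / total for a, p in scaled.items()}


probs = {"raise_temperature": 0.5, "lower_temperature": 0.5}
probs = update_probabilities(probs, "raise_temperature", "positive")
```

Applying the function repeatedly with positive results for the same control action drives its probability toward 1, which matches the behaviour described for a control action that is evaluated positively many times.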

FIG. 4 is a schematic flowchart of a smart home appliance control method according to another embodiment of the present application. As shown in FIG. 4, the smart home appliance control method includes the following steps:

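S21. Acquiring parameter information, including: acquiring environmental parameter information, and/or acquiring the smart home appliance's own parameter information.

S22. Based on the parameter information, obtaining a control action corresponding to the parameter information through a preset model, wherein the preset model includes a state transition model and a reinforcement learning model, and the reinforcement learning model can be adjusted according to a comfort evaluation result, including:

Obtaining a state parameter corresponding to the parameter information through the state transition model, where the state transition model is used to represent a correspondence between the parameter information and the state parameter;

Generating a control action through the reinforcement learning model based on the state parameter, where the reinforcement learning model is used to represent a correspondence between the state parameter and the control action.

S23. Controlling operation according to the control action.

S24. Obtaining the comfort evaluation result after the control operation according to the control action, including:

Obtaining state parameters before and after performing a corresponding operation according to the control action;

Calculating a first comfort value and a second comfort value according to a preset comfort evaluation algorithm, wherein the first comfort value is a comfort value corresponding to a state parameter before performing a corresponding operation according to the control action, and the second comfort value is a comfort value corresponding to a state parameter after performing a corresponding operation according to the control action;

Obtaining the comfort evaluation result according to the first comfort value and the second comfort value.

S25. The comfort evaluation result includes a positive evaluation result or a negative evaluation result, and updating the reinforcement learning model according to the comfort evaluation result includes: if the comfort evaluation result is a positive evaluation result, increasing the output probability of the control action; or, if the comfort evaluation result is a negative evaluation result, reducing the output probability of the control action.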

It can be understood that, in step S24, acquiring the comfort evaluation result after the control operation according to the control action may further include:

A comfort evaluation result fed back by a user is obtained after a corresponding operation is performed according to the control action. The evaluation by the user feedback is used as the evaluation of the control action.

It can be understood that in the actual operation of smart home appliances, the above steps are repeated many times, so that the control actions of the smart home appliance are adjusted toward the optimal user experience. This improves the comfort control of smart home appliance operation, makes smart home appliance control more precise, and improves the user experience.
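The repeated cycle of steps S22 to S25 can be sketched as a loop like the one below. Everything in it is hypothetical: the action names, the weight table, the simulated evaluator that always prefers `lower_setpoint` in a "hot" state, and the fixed `step`. The state is passed in directly, standing in for the output of steps S21 and the state transition model.

```python
import random

ACTIONS = ["raise_setpoint", "hold", "lower_setpoint"]


def run_cycle(weights, state, apply_action, evaluate, rng, step=0.1):
    """One pass through S22 (choose), S23 (control), S24 (evaluate), S25 (update)."""
    state_weights = weights.setdefault(state, {a: 1.0 for a in ACTIONS})
    action = rng.choices(
        ACTIONS, weights=[state_weights[a] for a in ACTIONS], k=1
    )[0]                                         # S22: sample a control action
    apply_action(action)                         # S23: control operation
    result = evaluate(action)                    # S24: comfort evaluation result
    if result == "positive":                     # S25: adjust the output weight
        state_weights[action] *= 1.0 + step
    elif result == "negative":
        state_weights[action] *= 1.0 - step
    return action, result


def simulate(cycles=200, seed=0):
    """Repeat the cycle with a simulated user who only likes 'lower_setpoint'."""
    rng = random.Random(seed)
    weights, log = {}, []

    def evaluate(action):
        return "positive" if action == "lower_setpoint" else "negative"

    for _ in range(cycles):
        run_cycle(weights, "hot", log.append, evaluate, rng)
    return weights["hot"], log


final_weights, history = simulate()
```

In this simulation the weight of the positively evaluated action can only grow while the others can only shrink, so over many repetitions the preferred control action comes to dominate the output.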

It should be pointed out that the above embodiments of the smart home appliance control method are not limited to being applied to the smart air conditioner embodiment, and can also be applied to other smart home appliances, such as smart air purifiers.

FIG. 5 is a schematic structural diagram of a smart home appliance control device according to an embodiment of the present application. As shown in FIG. 5, the smart home appliance control device 5 includes:

A first obtaining module 51, configured to obtain parameter information;

A second obtaining module 52, configured to obtain, based on the parameter information, a control action corresponding to the parameter information through a preset model, wherein the preset model includes a reinforcement learning model, and the reinforcement learning model can be adjusted according to a comfort evaluation result;

A control module 53, configured to control operation according to the control action.

It can be understood that the smart home appliance control device 5 obtains parameter information through the first obtaining module 51, and the second obtaining module 52 obtains the control action of the smart home appliance based on the parameter information. In the second obtaining module 52, the reinforcement learning model can be adjusted according to the comfort evaluation result, so that the control action generated by the reinforcement learning model is the control action with the best comfort evaluation result. The control module 53 therefore controls the operation of the smart home appliance according to the control action to suit the user's comfort experience.

In one embodiment, the first obtaining module 51 is specifically configured to:

Obtain environmental parameter information; and/or,

Obtain the smart home appliance's own parameter information.

It can be understood that the parameter information is related to the comfort of the smart home appliance's control action, and the parameter information may be environmental parameter information. In one embodiment, the environmental parameter information may be collected and/or configured by the smart home appliance itself, so that the smart home appliance obtains the environmental parameter information by itself. For example, the smart home appliance uses its own sensors to collect environmental parameter information such as indoor temperature, humidity, and particle information; as another example, room information such as room size, orientation, and lighting is configured in the smart home appliance. Alternatively, the environmental parameter information may be collected and/or configured by a device external to the smart home appliance, and the smart home appliance receives it from that external device. For example, the smart home appliance receives local weather information, such as local temperature, humidity, rain, and snow, sent by a cloud server over the network. Alternatively, the smart home appliance may be associated with other smart home appliances and sensors and receive the environmental parameter information they collect. For example, it is associated with other smart home appliances and receives the temperature and humidity information they collect; or it is associated with door and window sensors, which detect whether doors and windows are open or closed and send this status information to the smart home appliance. Alternatively, the smart home appliance receives information sent by the control center of the smart home system, such as room information configured in that control center.

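In one embodiment, in the second obtaining module 52, the preset model further includes a state transition model, and the second obtaining module 52 is specifically configured to:

Obtain state parameters corresponding to the parameter information through the state transition model based on the parameter information, where the state transition model is used to represent a correspondence between the parameter information and the state parameters;

Generate a control action through the reinforcement learning model based on the state parameters, where the reinforcement learning model is used to represent a correspondence between the state parameters and the control action.

State conversion makes the device suited to situations where the parameter information is complicated. The complex parameter information is converted into state parameters through a mapping relationship, so that the state parameters summarize the complex parameter information. This simplifies data processing and spares the reinforcement learning model the processing burden of handling complex and numerous parameter information directly.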

FIG. 6 is a schematic structural diagram of a smart home appliance control device according to another embodiment of the present application. As shown in FIG. 6, the smart home appliance control device 5 further includes:

An evaluation module 54 configured to obtain the comfort evaluation result after the control operation is performed according to the control action;

An update module 55 is configured to update the reinforcement learning model according to the comfort evaluation result.

It can be understood that the evaluation module 54 obtains the comfort evaluation result of a control action. Once a control action has a new comfort evaluation result, how often that control action appears in the subsequent operation of the smart home appliance can be adjusted based on the new result. After the update module 55 updates the reinforcement learning model according to the comfort evaluation result, the control actions generated by the reinforcement learning model are adjusted accordingly, so as to arrive at the control action with the best comfort evaluation result. With the above modules, this process is repeated many times during the actual operation of the smart home appliance, and the control action of the smart home appliance is adjusted toward the optimal user comfort experience, thereby making the comfort control of the smart home appliance more precise and improving the user experience.
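
One way to picture how the modules cooperate in a single control cycle is the Python sketch below. The class `ControlDevice`, its field names, and the toy room with a 24 °C target are all assumptions made for illustration; the application does not prescribe any particular interfaces between the modules.

```python
from dataclasses import dataclass
from typing import Callable

# Structural sketch only: each module is modelled as a plain callable.
# Names and signatures are hypothetical, not taken from the application.


@dataclass
class ControlDevice:
    obtain_parameters: Callable[[], dict]          # first obtaining module
    select_action: Callable[[dict], str]           # second obtaining module 52
    run_action: Callable[[str], None]              # control module
    evaluate: Callable[[dict, dict], float]        # evaluation module 54
    update: Callable[[dict, str, float], None]     # update module 55

    def step(self) -> float:
        """Run one cycle: observe, act, observe again, evaluate, update."""
        before = self.obtain_parameters()
        action = self.select_action(before)
        self.run_action(action)
        after = self.obtain_parameters()
        result = self.evaluate(before, after)
        self.update(before, action, result)
        return result


# Toy wiring: a room that cools by 1 degree per "cool" action (assumed).
room = {"temp_c": 30.0}
log = []


def run_action(action):
    if action == "cool":
        room["temp_c"] -= 1.0


device = ControlDevice(
    obtain_parameters=lambda: dict(room),
    select_action=lambda p: "cool" if p["temp_c"] > 24.0 else "hold",
    run_action=run_action,
    # Evaluation here is how much closer the room moved to an assumed 24 C target.
    evaluate=lambda b, a: abs(b["temp_c"] - 24.0) - abs(a["temp_c"] - 24.0),
    update=lambda p, a, r: log.append((a, r)),
)
result = device.step()
```

In this toy run the update callable merely records the evaluated action; in a real device it would adjust the reinforcement learning model, and `step()` would be called repeatedly during operation.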

In one embodiment, the evaluation module 54 is specifically configured to:

In the above solution, the comfort value corresponding to the state parameters is obtained through a preset comfort evaluation algorithm. The comfort evaluation algorithm may be a comparison table between state parameters and comfort values, or a formula or the like. The same or different weights can be set for each state parameter in the comfort evaluation algorithm, and the state parameters are quantified according to their weights to obtain the corresponding comfort value.
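
The weighted evaluation can be sketched as follows, combined with the before/after comparison of a first and a second comfort value recited in the claims. The target values, weights, and normalization scales are invented for illustration, and the "neutral" outcome for equal values is an added assumption, since the application only names positive and negative results.

```python
# Sketch of a formula-style comfort evaluation algorithm.
# TARGETS, WEIGHTS and SCALES are hypothetical example values.
TARGETS = {"temp_c": 25.0, "humidity_pct": 50.0}
WEIGHTS = {"temp_c": 0.7, "humidity_pct": 0.3}
SCALES = {"temp_c": 10.0, "humidity_pct": 50.0}


def comfort_value(state):
    """Weighted comfort score in [0, 1]; 1 means every parameter is on target."""
    score = 0.0
    for name, weight in WEIGHTS.items():
        # Normalized deviation from the target, capped at 1.
        deviation = min(abs(state[name] - TARGETS[name]) / SCALES[name], 1.0)
        score += weight * (1.0 - deviation)
    return score


def evaluate(before, after):
    """Compare the first (before) and second (after) comfort values."""
    first = comfort_value(before)
    second = comfort_value(after)
    if second > first:
        return "positive"
    if second < first:
        return "negative"
    return "neutral"  # assumption: the source does not define the tie case


outcome = evaluate(
    {"temp_c": 30.0, "humidity_pct": 70.0},
    {"temp_c": 26.0, "humidity_pct": 55.0},
)
```

Giving temperature a larger weight than humidity, as in this example, means that a control action which fixes temperature at the cost of slightly worse humidity can still be evaluated positively.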

In one embodiment, the evaluation module 54 is further specifically configured to:

In one embodiment, the update module 55 is specifically configured to:

Taking the air conditioner as an example, if the comfort evaluation result is a positive evaluation result, the indoor environment after the air conditioner performs the control action is more comfortable for the user; according to the positive evaluation, the probability that the control action occurs in the subsequent operation of the air conditioner is increased. If the comfort evaluation result is a negative evaluation result, the indoor environment after the air conditioner performs the control action is less comfortable for the user; according to the negative evaluation, the probability that the control action occurs in the subsequent operation of the air conditioner is reduced. It can be understood that this process is repeated many times in the actual operation of the air conditioner. If a control action repeatedly receives positive evaluations, the indoor environment after that control action is performed is indeed very comfortable for the user, and the probability of performing it in the future becomes correspondingly high. The control action of the air conditioner can thus be adjusted toward optimal comfort, improving the comfort control of the air conditioner's operation and the user experience.
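
The probability adjustment described above can be illustrated with a small Python sketch. It uses a softmax over per-action preference values, which is one common way to realize such an update and is chosen here only for illustration; the application does not specify the update rule. The class `ActionPolicy`, the step size, and the action names are hypothetical.

```python
import math
import random

# Sketch of adjusting action output probabilities from positive/negative
# evaluations. The softmax-over-preferences scheme is an assumed stand-in.


class ActionPolicy:
    def __init__(self, actions, step=0.5):
        self.actions = list(actions)
        self.step = step
        self.prefs = {}  # state -> {action: preference value}

    def _prefs(self, state):
        # Unseen states start with equal preference for every action.
        return self.prefs.setdefault(state, {a: 0.0 for a in self.actions})

    def probabilities(self, state):
        prefs = self._prefs(state)
        top = max(prefs.values())  # subtract the maximum for numerical stability
        weights = {a: math.exp(p - top) for a, p in prefs.items()}
        total = sum(weights.values())
        return {a: w / total for a, w in weights.items()}

    def choose(self, state, rng=random):
        probs = self.probabilities(state)
        return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]

    def update(self, state, action, positive):
        # A positive evaluation raises the preference, a negative one lowers it.
        self._prefs(state)[action] += self.step if positive else -self.step


policy = ActionPolicy(["cool", "fan_only", "hold"])
state = ("hot", "humid", "closed")
before = policy.probabilities(state)["cool"]
policy.update(state, "cool", positive=True)
after = policy.probabilities(state)["cool"]
```

Because the probabilities are renormalized after every update, raising one action's probability lowers the others in the same state, so repeated positive evaluations gradually concentrate the output on the action the user finds comfortable.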

It should be noted that the above-mentioned application embodiments of the smart home appliance control device 5 include, but are not limited to, the embodiments of smart air conditioners, and can also be applied to other smart home appliances, such as smart air purifiers.

It can be understood that the same or similar parts in the above embodiments can be referred to each other. For the content that is not described in detail in some embodiments, refer to the same or similar content in other embodiments.

It should be noted that, in the description of the present application, the terms "first", "second" and the like are only used for descriptive purposes, and cannot be understood to indicate or imply relative importance. In addition, in the description of this application, unless otherwise stated, the meaning of "a plurality" means at least two.

Any process or method description in a flowchart or otherwise described herein can be understood as representing a module, fragment, or portion of code that includes one or more executable instructions for implementing a particular logical function or step of a process. The scope of the preferred embodiments of the present application also includes additional implementations in which the functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order, depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application pertain.

It should be understood that each part of the application may be implemented by hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system. Alternatively, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits, application specific integrated circuits with suitable combinational logic gate circuits, programmable gate arrays (PGA), field programmable gate arrays (FPGA), etc.

A person of ordinary skill in the art can understand that all or part of the steps of the methods in the foregoing embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer-readable storage medium, and when executed, the program performs one or a combination of the steps of the method embodiments.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing module, or each unit may exist separately physically, or two or more units may be integrated into one module. The above integrated modules can be implemented in the form of hardware or software functional modules. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.

The aforementioned storage medium may be a read-only memory, a magnetic disk, or an optical disk.

In the description of this specification, a description referring to the terms “one embodiment”, “some embodiments”, “examples”, “specific examples”, or “some examples” and the like means that specific features, structures, materials, or characteristics described in conjunction with the embodiment or example are included in at least one embodiment or example of the present application. In this specification, such schematic expressions do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

Although the embodiments of the present application have been shown and described above, it can be understood that the above embodiments are exemplary and should not be construed as limitations on the present application. Those skilled in the art can make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present application.

Claims

A method for controlling a smart home appliance, comprising:

obtaining parameter information;

obtaining, based on the parameter information, a control action corresponding to the parameter information through a preset model, wherein the preset model includes a reinforcement learning model, and the reinforcement learning model can be adjusted according to a comfort evaluation result; and

controlling operation according to the control action.
The smart home appliance control method according to claim 1, wherein the acquiring parameter information comprises:

Obtain environmental parameter information, and / or,

Get the parameter information of the smart home appliance.
The method for controlling a smart home appliance according to claim 2, wherein the acquiring environmental parameter information comprises:

Obtain the environmental parameter information collected and / or configured by the smart appliance itself; and / or

Obtain environmental parameter information collected and / or configured by external devices of smart appliances.
The smart home appliance control method according to any one of claims 1 to 3, wherein the preset model further includes a state transition model;

The obtaining a control action corresponding to the parameter information based on the parameter information through a preset model includes:

Obtaining state parameters corresponding to the parameter information through the state transition model based on the parameter information, and the state transition model is used to represent a correspondence between the parameter information and the state parameters;

Based on the state parameter, a control action is generated by the reinforcement learning model, and the reinforcement learning model is used to represent a correspondence between the state parameter and the control action.
The smart home appliance control method according to claim 4, wherein:

The state transition model includes one or more of a state comparison table, a neural network model, and a preset logic rule.
The smart home appliance control method according to claim 4, wherein the reinforcement learning model can be adjusted according to a comfort evaluation result, comprising:

The probability that the reinforcement learning model outputs the control action can be adjusted according to the comfort evaluation result.
The smart home appliance control method according to claim 4, further comprising:

Acquiring the comfort evaluation result after controlling operation according to the control action;

Updating the reinforcement learning model according to the comfort evaluation result.
The method for controlling a smart home appliance according to claim 7, wherein the obtaining the comfort evaluation result after running according to the control action comprises:

Obtaining state parameters before and after performing a corresponding operation according to the control action;

Calculating a first comfort value and a second comfort value according to a preset comfort evaluation algorithm, wherein the first comfort value is a comfort value corresponding to a state parameter before performing a corresponding operation according to the control action, and the second comfort value is a comfort value corresponding to a state parameter after performing a corresponding operation according to the control action;

According to the first comfort value and the second comfort value, the comfort evaluation result is obtained.
The smart home appliance control method according to claim 8, wherein the comfort evaluation algorithm sets the same or different weights corresponding to each state parameter.
The method for controlling a smart home appliance according to claim 7 or 8, wherein the acquiring the comfort evaluation result after running according to the control action comprises:

A comfort evaluation result fed back by a user is obtained after a corresponding operation is performed according to the control action.
The smart home appliance control method according to claim 7 or 8, wherein the comfort evaluation result comprises a positive evaluation result or a negative evaluation result, and the updating the reinforcement learning model according to the comfort evaluation result comprises:

If the comfort evaluation result is a positive evaluation result, increasing the output probability of the control action; or,

If the comfort evaluation result is a negative evaluation result, the output probability of the control action is reduced.
A smart home appliance control device, comprising:

A first obtaining module, configured to obtain parameter information;

A second acquisition module, configured to obtain, based on the parameter information, a control action corresponding to the parameter information through a preset model, wherein the preset model includes a reinforcement learning model, and the reinforcement learning model can be adjusted according to a comfort evaluation result;

A control module, configured to control operation according to the control action.
The smart home appliance control device according to claim 12, wherein the first acquisition module is specifically configured to:

Obtain environmental parameter information, and / or,

Get the parameter information of the smart home appliance.
The smart home appliance control device according to claim 12, wherein in the second acquisition module, the preset model further comprises a state transition model;

The obtaining a control action corresponding to the parameter information based on the parameter information through a preset model includes:

Obtaining state parameters corresponding to the parameter information through the state transition model based on the parameter information, and the state transition model is used to represent a correspondence between the parameter information and the state parameters;

Based on the state parameter, a control action is generated by the reinforcement learning model, and the reinforcement learning model is used to represent a correspondence between the state parameter and the control action.
The smart home appliance control device according to any one of claims 12 to 14, wherein the smart home appliance control device further comprises:

An evaluation module, configured to obtain the comfort evaluation result after controlling operation according to the control action;

An update module is configured to update the reinforcement learning model according to the comfort evaluation result.
The smart home appliance control device according to claim 15, wherein the evaluation module is specifically configured to:

Obtaining state parameters before and after performing a corresponding operation according to the control action;

Calculating a first comfort value and a second comfort value according to a preset comfort evaluation algorithm, wherein the first comfort value is a comfort value corresponding to a state parameter before performing a corresponding operation according to the control action, and the second comfort value is a comfort value corresponding to a state parameter after performing a corresponding operation according to the control action;

According to the first comfort value and the second comfort value, the comfort evaluation result is obtained.
The smart home appliance control device according to claim 15, wherein the evaluation module is further configured to:

A comfort evaluation result fed back by a user is obtained after a corresponding operation is performed according to the control action.
The control device for a smart home appliance according to claim 15, wherein the comfort evaluation result includes a positive evaluation result or a negative evaluation result, and the update module is specifically configured to:

If the comfort evaluation result is a positive evaluation result, increase the output probability of the control action; or,

If the comfort evaluation result is a negative evaluation result, reduce the output probability of the control action.