CN116974212A - Equipment control method and device based on multi-mode information

Info

Publication number: CN116974212A
Application number: CN202311034187.2A
Authority: CN
Inventors: 陈小平; 孙欢
Original assignee: Foshan Viomi Electrical Technology Co Ltd
Current assignee: Foshan Viomi Electrical Technology Co Ltd
Priority date: 2023-08-16
Filing date: 2023-08-16
Publication date: 2023-10-31

Abstract

The invention discloses a device control method and apparatus based on multi-modal data. The method comprises: collecting multi-modal data corresponding to a target environment, wherein the multi-modal data comprises at least two of voice data, text data, radar point cloud data, image data and video data; performing a clustering operation on the multi-modal data based on a predetermined clustering algorithm to obtain a data clustering result; generating, according to the data clustering result, control parameters corresponding to a target control device, wherein the target control device is used for controlling a plurality of devices in the target environment; and controlling the target control device, according to those control parameters, to execute the corresponding device control operation. Implementing the invention can therefore improve the flexibility of device control, which in turn helps improve its accuracy.

Description

Equipment control method and device based on multi-mode information

Technical Field

The present invention relates to the field of data analysis technologies, and in particular, to a device control method and apparatus based on multi-mode information.

Background

A smart home is an organic system that connects all the devices in a scene by technical means. In recent years, smart homes have been widely welcomed by users for their practicality and convenience.

In practical applications, a user who needs to use a device typically interacts with a control device (such as a switch) through a single interaction mode. For example, the user gives the control device a specific voice instruction to turn on a lamp; if that voice instruction is not accurate enough, the control device may fail to turn the lamp on. It is therefore important to provide a technical solution that improves the flexibility of device control and thereby its accuracy.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a device control method and device based on multi-mode data, which can improve the control flexibility of the device and is beneficial to improving the control accuracy of the device.

In order to solve the technical problem, a first aspect of the present invention discloses a device control method based on multi-mode data, the method comprising:

collecting multi-modal data corresponding to a target environment, wherein the multi-modal data comprises at least two of voice data, text data, radar point cloud data, image data and video data;

based on a predetermined clustering algorithm, performing clustering operation on the multi-mode data to obtain a data clustering result;

generating control parameters corresponding to target control equipment according to the data clustering result, wherein the target control equipment is used for controlling a plurality of equipment in the target environment;

and controlling the target control equipment to execute equipment control operation corresponding to the control parameters according to the control parameters corresponding to the target control equipment.

As an optional implementation manner, in the first aspect of the present invention, before the performing, based on the predetermined clustering algorithm, a clustering operation on the multi-mode data, to obtain a data clustering result, the method further includes:

judging whether the multi-mode data meets preset control conditions or not;

when judging that the multi-modal data meets the preset control condition, triggering execution of the step of performing the clustering operation on the multi-modal data based on the predetermined clustering algorithm to obtain the data clustering result;

when judging that the multi-modal data does not meet the preset control conditions, acquiring a historical multi-modal data set corresponding to the target environment;

screening historical multi-modal data which meets the preset control conditions and has data similarity with the multi-modal data larger than a preset similarity threshold value from the historical multi-modal data set to serve as target multi-modal data;

and calibrating the multi-modal data according to the target multi-modal data, and re-executing the operation of judging whether the multi-modal data meets the preset control condition.

As an optional implementation manner, in the first aspect of the present invention, the determining whether the multi-mode data meets a preset control condition includes:

for each multi-modal sub-data in the multi-modal data, judging whether the multi-modal sub-data meets the preset screening condition corresponding to the multi-modal sub-data, to obtain a screening judgment result corresponding to the multi-modal sub-data;

according to all the screening judging results, multi-mode sub-data meeting corresponding preset screening conditions in the multi-mode data are determined to be target sub-data;

judging whether at least two target sub-data exist in the multi-mode data;

when judging that at least two target sub-data exist in the multi-mode data, determining that the multi-mode data meet the preset control conditions;

and when judging that fewer than two target sub-data exist in the multi-modal data, determining that the multi-modal data does not meet the preset control condition.

As an optional implementation manner, in the first aspect of the present invention, the voice data includes one or more of a voice duration, a voice decibel, and a voice voiceprint;

wherein the judging, for each multi-modal sub-data in the multi-modal data, whether the multi-modal sub-data meets the preset screening condition corresponding to the multi-modal sub-data, to obtain the screening judgment result corresponding to the multi-modal sub-data, includes:

when the multi-modal data comprises the voice data, judging whether the voice duration is greater than or equal to a preset voice duration and whether the voice decibel is greater than or equal to a preset voice decibel;

when the voice duration is greater than or equal to the preset voice duration and the voice decibel is greater than or equal to the preset voice decibel, searching a preset voice voiceprint set for a preset voice voiceprint matching the voice voiceprint, wherein the preset voice voiceprint set comprises at least one preset voice voiceprint;

when the preset voice voiceprint matched with the voice voiceprint exists in the preset voice voiceprint set, determining that the voice data meets preset screening conditions corresponding to the voice data;

when the preset voice voiceprint matched with the voice voiceprint does not exist in the preset voice voiceprint set, determining that the voice data does not meet preset screening conditions corresponding to the voice data;

and when the voice duration is smaller than the preset voice duration or the voice decibel is smaller than the preset voice decibel, determining that the voice data does not meet the preset screening condition corresponding to the voice data.
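By way of illustration only, the voice screening described above might be sketched as follows. The threshold values, the `VoiceSample` type and the use of a plain string as a voiceprint are assumptions made for this sketch; the text only states that the thresholds and the voiceprint set are preset, and a practical system would presumably match voiceprints with a speaker-verification model rather than by set membership.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical thresholds; the source only calls them "preset".
MIN_DURATION_S = 0.5
MIN_DECIBEL = 40.0


@dataclass
class VoiceSample:
    duration_s: float  # voice duration in seconds
    decibel: float     # voice loudness in dB
    voiceprint: str    # stand-in for a voiceprint embedding or identifier


def voice_passes_screening(sample: VoiceSample, enrolled: set[str]) -> bool:
    """Return True when the voice data meets its preset screening condition."""
    # First check: duration and loudness must both reach their thresholds.
    if sample.duration_s < MIN_DURATION_S or sample.decibel < MIN_DECIBEL:
        return False
    # Second check: only then look the voiceprint up in the preset set.
    # Exact membership is an illustrative simplification.
    return sample.voiceprint in enrolled
```

The voiceprint lookup is deliberately placed after the two threshold checks, mirroring the order of the judgments in the text.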

In an optional implementation manner, in a first aspect of the present invention, the generating, according to the data clustering result, a control parameter corresponding to a target control device includes:

analyzing the data clustering result to obtain a clustering analysis result;

determining equipment use requirements corresponding to target users according to the clustering analysis results, wherein the target users are users in the target environment;

screening at least one device with device functions meeting the device use requirements from a plurality of devices in the target environment as target device according to the device use requirements;

and generating control parameters corresponding to the target control equipment according to the equipment use requirements, wherein equipment control operation corresponding to the control parameters is used for controlling the target control equipment to adjust equipment operation parameters of the target equipment.
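As a rough sketch of the screening and parameter-generation steps above, assuming each device advertises its functions as a set of strings: the `Device` type, the function labels and the shape of the parameter dictionary are all hypothetical and are not specified in the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    functions: set[str] = field(default_factory=set)  # hypothetical function labels


def select_target_devices(devices: list[Device], requirement: str) -> list[Device]:
    """Keep the devices whose functions satisfy the device use requirement."""
    return [d for d in devices if requirement in d.functions]


def build_control_parameters(targets: list[Device], requirement: str) -> dict:
    """Assemble an illustrative payload for the target control device."""
    return {
        "requirement": requirement,
        "targets": [d.name for d in targets],
    }
```

For a living room containing an air conditioner, a humidifier and a lamp, a "cooling" requirement would select only the air conditioner under these assumptions.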

In an optional implementation manner, in a first aspect of the present invention, the determining, according to the result of the cluster analysis, a device usage requirement corresponding to the target user includes:

determining target information corresponding to the target environment according to the clustering analysis result, wherein the target information comprises scene information corresponding to the target environment and user information of a target user;

according to the user information of the target user, determining the user state information of the target user and the equipment using tendency corresponding to the target user;

analyzing the target information to obtain an information change trend corresponding to the target information, wherein the information change trend corresponding to the target information comprises a sub-information change trend corresponding to each type of target sub-information;

and determining the equipment use requirement corresponding to the target user according to the user state information, the equipment using tendency and the information change trend.

As an optional implementation manner, in the first aspect of the present invention, the scene information corresponding to the target environment includes scene attribute information and/or scene environment parameters; the scene attribute information includes one or more of a scene space size, a scene type and a scene position; the scene environment parameters include one or more of a scene temperature, a scene humidity and a scene brightness;

The user information of the target user comprises one or more of user attribute information, user sign information, user action information, user expression information, user emotion information and user position information; the user attribute information includes a combination of one or more of a user gender, a user age, and a user body type; the user sign information comprises one or more of a user body temperature, a user breathing frequency and a user heartbeat frequency; the user location information includes user coordinates and a user action trajectory.
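One possible way to represent the target information enumerated above is sketched below. The field names, types and units are assumptions chosen for illustration; the text lists only the categories, and every field is optional because each category is described as "one or more of".

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SceneInfo:
    # Scene attribute information
    space_size_m2: Optional[float] = None
    scene_type: Optional[str] = None  # e.g. "living_room"
    position: Optional[str] = None
    # Scene environment parameters
    temperature_c: Optional[float] = None
    humidity_pct: Optional[float] = None
    brightness_lux: Optional[float] = None


@dataclass
class UserInfo:
    # User attribute information
    gender: Optional[str] = None
    age: Optional[int] = None
    body_type: Optional[str] = None
    # User sign information
    body_temperature_c: Optional[float] = None
    breathing_rate_per_min: Optional[float] = None
    heart_rate_bpm: Optional[float] = None
    # Action, expression and emotion information
    action: Optional[str] = None
    expression: Optional[str] = None
    emotion: Optional[str] = None
    # User position information: coordinates plus action trajectory
    coordinates: Optional[tuple[float, float]] = None
    trajectory: list[tuple[float, float]] = field(default_factory=list)


@dataclass
class TargetInfo:
    scene: SceneInfo
    user: UserInfo
```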

The second aspect of the invention discloses a device control apparatus based on multi-modal data, the apparatus comprising:

an acquisition module, configured to collect multi-modal data corresponding to a target environment, wherein the multi-modal data comprises at least two of voice data, text data, radar point cloud data, image data and video data;

the clustering module is used for executing clustering operation on the multi-mode data based on a predetermined clustering algorithm to obtain a data clustering result;

the generation module is used for generating control parameters corresponding to target control equipment according to the data clustering result, and the target control equipment is used for controlling a plurality of equipment in the target environment;

and the control module is used for controlling the target control equipment to execute equipment control operation corresponding to the control parameters according to the control parameters corresponding to the target control equipment.

As an alternative embodiment, in the second aspect of the present invention, the apparatus further includes:

a judging module, configured to judge whether the multi-modal data meets the preset control condition before the clustering module performs the clustering operation on the multi-modal data based on the predetermined clustering algorithm to obtain the data clustering result, and, when the multi-modal data is judged to meet the preset control condition, to trigger the clustering module to perform the clustering operation on the multi-modal data based on the predetermined clustering algorithm to obtain the data clustering result;

the acquisition module is used for acquiring a historical multi-mode data set corresponding to the target environment when the judgment module judges that the multi-mode data does not meet the preset control condition;

the screening module is used for screening historical multi-modal data which meets the preset control conditions and has data similarity with the multi-modal data larger than a preset similarity threshold value from the historical multi-modal data set to serve as target multi-modal data;

and the calibration module is used for calibrating the multi-modal data according to the target multi-modal data and re-triggering the judging module to execute the operation of judging whether the multi-modal data meets the preset control condition.

In a second aspect of the present invention, as an optional implementation manner, the determining module determines whether the multi-mode data meets a preset control condition includes:

judging whether at least two target sub-data exist in the multi-mode data;

As an optional implementation manner, in the second aspect of the present invention, the voice data includes one or more of a combination of a voice duration, a voice decibel, and a voice voiceprint;

the specific way for the judging module to judge whether the multi-mode sub-data meets the preset screening conditions corresponding to the multi-mode sub-data for each multi-mode sub-data in the multi-mode data to obtain the screening judging result corresponding to the multi-mode sub-data comprises the following steps:

In a second aspect of the present invention, as an optional implementation manner, the specific manner of generating, by the generating module, the control parameter corresponding to the target control device according to the data clustering result includes:

analyzing the data clustering result to obtain a clustering analysis result;

In a second aspect of the present invention, the specific manner of determining, according to the cluster analysis result, the device usage requirement corresponding to the target user includes:

As an optional implementation manner, in the second aspect of the present invention, the scene information corresponding to the target environment includes scene attribute information and/or scene environment parameters; the scene attribute information includes one or more of a scene space size, a scene type and a scene position; the scene environment parameters include one or more of a scene temperature, a scene humidity and a scene brightness;

In a third aspect, the present invention discloses another device control apparatus based on multi-modal data, the apparatus comprising:

a memory storing executable program code;

a processor coupled to the memory;

the processor invokes the executable program code stored in the memory to execute the device control method based on the multi-mode data disclosed in the first aspect of the present invention.

A fourth aspect of the present invention discloses a computer storage medium storing computer instructions which, when invoked, are adapted to perform the method of controlling a device based on multimodal data disclosed in the first aspect of the present invention.

Compared with the prior art, the embodiment of the invention has the following beneficial effects:

In the embodiment of the invention, multi-modal data corresponding to a target environment is collected, wherein the multi-modal data comprises at least two of voice data, text data, radar point cloud data, image data and video data; a clustering operation is performed on the multi-modal data based on a predetermined clustering algorithm to obtain a data clustering result; control parameters corresponding to a target control device are generated according to the data clustering result, wherein the target control device is used for controlling a plurality of devices in the target environment; and the target control device is controlled, according to those control parameters, to execute the corresponding device control operation. The invention can thus cluster the multi-modal data collected for the target environment and derive from the clustering result the control parameters that drive the target control device, realizing intelligent device control based on multi-modal data. This improves the flexibility with which multi-modal data is applied and with which control parameters are determined, and hence the flexibility of device control; it helps improve the accuracy and convenience of device control, gives the user a smart home that intelligently recognizes device control needs, and helps improve the user's experience of the smart home.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic diagram of a scene to which a device control method based on multi-modal data is applied according to an embodiment of the present invention;

FIG. 2 is a schematic flow chart of a device control method based on multi-modal data according to an embodiment of the present invention;

FIG. 3 is a flow chart of another method for controlling a device based on multi-modal data according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a device control apparatus based on multi-modal data according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of another device control apparatus based on multi-modal data according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of yet another device control apparatus based on multi-modal data according to an embodiment of the present invention.

Detailed Description

In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention.

The terms "first", "second" and the like in the description, in the claims and in the above-described figures are used to distinguish between different objects and not necessarily to describe a sequential or chronological order. Furthermore, the terms "comprise" and "have", as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, apparatus, product, or device that comprises a list of steps or elements is not limited to those listed, but may optionally include other steps or elements that are not listed or that are inherent to such process, method, product, or device.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the invention. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.

The invention discloses a device control method and apparatus based on multi-modal data, which can cluster the multi-modal data collected for a target environment based on a clustering algorithm to obtain a data clustering result, and generate from that result the control parameters that drive the target control device to execute the corresponding device control operation. Intelligent device control based on multi-modal data is thereby realized. This improves the flexibility with which multi-modal data is applied and with which control parameters are determined, and hence the flexibility of device control; it helps improve the accuracy and convenience of device control, gives the user a smart home that intelligently recognizes device control needs, and helps improve the user's experience of the smart home. A detailed description follows.

To better understand the device control method and apparatus based on multi-modal data described in the present invention, a scenario to which the method applies is described first. The method applies to any scenario in which a device needs to be controlled, for example a scenario in which devices are controlled based on multi-modal data. Taking that scenario as an example, it may be as shown in FIG. 1, which is a schematic diagram of a scenario to which a device control method based on multi-modal data applies according to an embodiment of the present invention. As shown in FIG. 1, the diagram takes a living room area as an example. The living room area may include devices (such as an air conditioner, a humidifier and a lamp) and a control device (such as a smart switch), where the control device integrates the control functions of a plurality of devices, that is, it can be used to control the plurality of devices; the area may further include a user and furniture (e.g., a sofa).

Still further, the following operations may be performed for the devices within the living room area: collecting multi-modal data corresponding to a target environment, wherein the multi-modal data comprises at least two of voice data, text data, radar point cloud data, image data and video data; performing a clustering operation on the multi-modal data based on a predetermined clustering algorithm to obtain a data clustering result; generating control parameters corresponding to a target control device according to the data clustering result, wherein the target control device is used for controlling a plurality of devices in the target environment; and controlling the target control device, according to those control parameters, to execute the corresponding device control operation.

Further, the data clustering result can be analyzed to obtain a cluster analysis result. From the user's action (for example, the user is holding a fan and fanning) and the user's voice (for example, the user says that he or she is hot), it is determined that the user's device use requirement is cooling; the target device is then determined according to that requirement, and control parameters corresponding to the target control device are generated to adjust the operating parameters of the target device.
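The fan-and-"hot" example above can be mimicked with a small rule-based sketch. This is not the analysis described in the embodiment, which derives the requirement from a cluster analysis result; the cue tables and the simple vote counting below are hypothetical stand-ins that merely show how an action cue and a speech cue can agree on one device use requirement.

```python
from __future__ import annotations

from typing import Optional

# Hypothetical cue tables mapping observations to device use requirements.
ACTION_CUES = {"fanning": "cooling", "shivering": "heating"}
SPEECH_CUES = {"hot": "cooling", "cold": "heating", "dark": "lighting"}


def infer_requirement(action: Optional[str], transcript: Optional[str]) -> Optional[str]:
    """Combine an action label and a speech transcript into one requirement."""
    votes: dict[str, int] = {}
    # One vote from the recognized action, if it is a known cue.
    if action in ACTION_CUES:
        requirement = ACTION_CUES[action]
        votes[requirement] = votes.get(requirement, 0) + 1
    # One vote per cue word found in the transcript.
    for word in (transcript or "").lower().split():
        requirement = SPEECH_CUES.get(word.strip(".,!?"))
        if requirement is not None:
            votes[requirement] = votes.get(requirement, 0) + 1
    if not votes:
        return None
    # The requirement with the most supporting cues wins.
    return max(votes, key=votes.get)
```

With the action "fanning" and the utterance "I am so hot", both cues vote for cooling, so the sketch returns "cooling".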

It should be noted that the scene diagram shown in FIG. 1 only illustrates one of the scenarios to which the device control method based on multi-modal data applies; it does not exclude other scenarios to which the method and apparatus apply, and it does not limit the type, shape, size, function, etc. of the devices and the control device.

The above describes, as an example, one scenario to which the device control method and apparatus based on multi-modal data are applied, and the device control method and apparatus based on multi-modal data are described in detail below.

Example 1

Referring to fig. 2, fig. 2 is a flow chart of a device control method based on multi-mode data according to an embodiment of the invention. The device control method based on the multi-mode data described in fig. 2 may be applied to a device control apparatus based on the multi-mode data, where the apparatus may include one of a control device, a control terminal, a control system, and a server, where the server includes a local server or a cloud server, and the device control method based on the multi-mode data may also be applied to a control apparatus corresponding to an intelligent home, and embodiments of the present invention are not limited. As shown in fig. 2, the device control method based on multi-modal data may include the operations of:

101. Collecting multi-modal data corresponding to the target environment.

In the embodiment of the invention, the multi-modal data comprises at least two of voice data, text data, radar point cloud data, image data and video data. The target environment may include at least one target scene; the target scene may be one of the functionally distinct areas of a house, such as a living room, a bedroom, a kitchen, a bathroom or a study, or it may be an indoor place such as a shopping mall, a restaurant or a movie theater. The voice data can be collected by a recording device; the image data can be collected by a device with an image acquisition function; the video data can be collected by a device with a video recording function; the text data can be obtained by recognizing text information in the voice data or the image data, or from user input; and the radar point cloud data can be collected by a lidar.

102. Performing a clustering operation on the multi-modal data based on a predetermined clustering algorithm to obtain a data clustering result.

In the embodiment of the invention, the clustering algorithm may be one of a K-means-based deep clustering algorithm, a spectral-clustering-based deep clustering algorithm, a subspace-based deep clustering algorithm and a GMM-based deep clustering algorithm, which is not limited in the embodiment of the invention; the data clustering result comprises the classification results of all the multi-modal data.
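For orientation, the K-means idea underlying the first of these options can be sketched in plain Python. Note that this is ordinary K-means (Lloyd's algorithm) over feature vectors that are assumed to have been extracted already, not the deep clustering variant named in the text; the function name, the fixed iteration count and the caller-supplied initial centroids are choices made for this sketch.

```python
from __future__ import annotations

import math


def kmeans(points: list[list[float]], centroids: list[list[float]], iterations: int = 10) -> list[int]:
    """Plain K-means; returns one cluster index per input point."""
    centroids = [list(c) for c in centroids]  # copy so the caller's list is untouched
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

In a deep clustering setting, the `points` would typically be embeddings produced by a neural encoder for each modality rather than raw sensor values.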

103. Generating control parameters corresponding to the target control device according to the data clustering result.

In the embodiment of the present invention, the target control device is used for controlling a plurality of devices in the target environment. Optionally, the target control device may be a control device that integrates the control functions of a plurality of devices in the target environment; for example, the target control device may be a smart switch, which is not limited in the embodiment of the present invention. The control parameters corresponding to the target control device are used for controlling the target control device to adjust the devices in the target environment.

104. Controlling the target control device, according to the control parameters corresponding to the target control device, to execute the device control operation corresponding to the control parameters.

Therefore, the method described in the embodiment of the invention can cluster the multi-modal data collected for the target environment based on the clustering algorithm to obtain a data clustering result, and generate from that result the control parameters that drive the target control device to execute the corresponding device control operation. Intelligent device control based on multi-modal data is thereby realized. This improves the flexibility with which multi-modal data is applied and with which control parameters are determined, and hence the flexibility of device control; it helps improve the accuracy and convenience of device control, gives the user a smart home that intelligently recognizes device control needs, and helps improve the user's experience of the smart home.
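Steps 101 to 104 can be pictured as a four-stage pipeline. The sketch below assumes each stage is supplied as a callable and that the multi-modal data is a dictionary keyed by modality; the function name, the callables and the dictionary layout are hypothetical and serve only to show the order of operations.

```python
from __future__ import annotations

from typing import Callable


def control_from_multimodal(
    collect: Callable[[], dict],
    cluster: Callable[[dict], dict],
    generate: Callable[[dict], dict],
    execute: Callable[[dict], None],
) -> dict:
    """Run the collect, cluster, generate and execute stages in order."""
    data = collect()  # step 101: collect multi-modal data
    if len(data) < 2:
        # The method requires at least two modalities.
        raise ValueError("multi-modal data needs at least two modalities")
    result = cluster(data)     # step 102: data clustering result
    params = generate(result)  # step 103: control parameters
    execute(params)            # step 104: target control device acts
    return params
```

Passing the stages in as callables keeps the sketch independent of any particular sensor, clustering library or smart-switch interface.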

In an alternative embodiment, before performing a clustering operation on the multi-modal data based on a predetermined clustering algorithm to obtain a data clustering result, the method may further include the following operations:

judging whether the multi-mode data meets preset control conditions or not;

when judging that the multi-mode data meets the preset control condition, triggering and executing the clustering operation on the multi-mode data based on a preset clustering algorithm to obtain the data clustering result;

when judging that the multi-modal data does not meet the preset control condition, acquiring a historical multi-modal data set corresponding to the target environment;

screening historical multi-modal data which meets preset control conditions and has data similarity with the multi-modal data larger than a preset similarity threshold value from the historical multi-modal data set to serve as target multi-modal data;

and calibrating the multi-modal data according to the target multi-modal data, and re-executing the operation of judging whether the multi-modal data meets the preset control conditions.

The historical multi-modal data set corresponding to the target environment comprises at least one piece of historical multi-modal data corresponding to the target environment; if it comprises several pieces, they correspond to different historical moments. When the multi-modal data includes image data, the multi-modal data may be calibrated by performing operations such as target edge extraction and target recognition on both the target multi-modal data and the current multi-modal data, comparing the recognized targets, and calibrating the image data according to the comparison result so as to infer the targets in the image data. When the multi-modal data includes voice data, the multi-modal data may be calibrated by removing noise from the voice data, raising the voice decibel level of the voice data, and the like, then comparing the voice data in the multi-modal data with the voice data in the target multi-modal data and calibrating the voice data according to the comparison result so as to infer the information contained in the voice data.

Therefore, before the clustering operation is performed on the multi-modal data, the optional embodiment can judge whether the multi-modal data meets the preset control condition, if so, the step of performing the clustering operation on the multi-modal data is performed, if not, the historical multi-modal data which meets the preset control condition and has the data similarity greater than the preset similarity threshold value with the multi-modal data is screened out from the historical multi-modal data set corresponding to the target environment and is used as the target multi-modal data, the multi-modal data is calibrated according to the target multi-modal data, and the judging operation is performed again, so that the judging efficiency of the multi-modal data meeting the condition can be improved, the data screening efficiency is improved, and the fusion analysis efficiency of the multi-modal data is improved.
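The judge, screen, calibrate and re-judge flow described above can be sketched as follows. This is only an illustrative sketch and not the implementation of the embodiment: the function name `prepare_for_clustering`, the callables `meets_condition`, `similarity` and `calibrate`, the default threshold value, the choice of the most similar historical sample, and the `max_rounds` bound are all assumptions introduced here.

```python
from typing import Callable, Optional, Sequence

Sample = dict  # hypothetical representation: modality name -> observed value


def prepare_for_clustering(
    sample: Sample,
    history: Sequence[Sample],
    meets_condition: Callable[[Sample], bool],
    similarity: Callable[[Sample, Sample], float],
    calibrate: Callable[[Sample, Sample], Sample],
    threshold: float = 0.8,   # assumed value of the preset similarity threshold
    max_rounds: int = 3,      # assumed bound; the source does not limit the retries
) -> Optional[Sample]:
    """Return data that may be clustered, or None if calibration cannot help."""
    for _ in range(max_rounds):
        if meets_condition(sample):
            return sample  # the clustering operation would be triggered here
        # Keep historical data that meets the condition and is similar enough.
        candidates = [
            h for h in history
            if meets_condition(h) and similarity(sample, h) > threshold
        ]
        if not candidates:
            return None
        # Assumption: use the most similar candidate as the target data.
        target = max(candidates, key=lambda h: similarity(sample, h))
        sample = calibrate(sample, target)
    return sample if meets_condition(sample) else None
```

A caller would pass the returned data to the clustering step and skip clustering when the result is None.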

In this alternative embodiment, determining whether the multimodal data satisfies the preset control condition may include the following operations:

according to all screening judging results, determining multi-mode sub-data meeting corresponding preset screening conditions in the multi-mode data as target sub-data;

Judging whether at least two target sub-data exist in the multi-mode data;

when judging that at least two target sub-data exist in the multi-mode data, determining that the multi-mode data meet preset control conditions;

when judging that less than two kinds of target sub-data exist in the multi-mode data, determining that the multi-mode data do not meet the preset control conditions.

For example, assuming that the multi-modal data includes text data, image data and voice data, if only the image data in the multi-modal data satisfies the corresponding preset screening condition, but the text data and the voice data do not satisfy the corresponding preset screening condition, that is, only one target sub-data exists in the multi-modal data, it is determined that the multi-modal data does not satisfy the preset control condition.

It can be seen that, the optional embodiment can also determine whether each multi-mode sub-data meets the corresponding preset screening condition, if so, the multi-mode sub-data is determined to be the target sub-data, if at least two kinds of target sub-data exist in the multi-mode data, the multi-mode data is determined to meet the preset control condition, if less than two kinds of target sub-data exist in the multi-mode data, the multi-mode data is determined to not meet the preset control condition, so that the accuracy of determining that the multi-mode data meets the condition can be further improved, the accuracy of screening data is further improved, and further the accuracy of fusion analysis of the multi-mode data is facilitated, so that the accuracy of determining the equipment parameters is improved.
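As a minimal sketch of the "at least two kinds of target sub-data" rule, the check can be written as below. The function name `meets_control_condition` and the idea of passing one screening predicate per modality are assumptions made for illustration only.

```python
from typing import Callable, Mapping


def meets_control_condition(
    sub_data: Mapping[str, object],
    screens: Mapping[str, Callable[[object], bool]],
) -> bool:
    """True when at least two kinds of sub-data pass their own screening condition."""
    # Sub-data that passes its screening condition is treated as target sub-data.
    target_sub_data = [
        name for name, value in sub_data.items()
        if name in screens and screens[name](value)
    ]
    return len(target_sub_data) >= 2
```

With text, image and voice data where only the image data passes its screening condition, the function returns False, which matches the example given above.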

In this alternative embodiment, the voice data includes a combination of one or more of a voice duration, a voice decibel, and a voice voiceprint;

for each multi-mode sub-data in the multi-mode data, determining whether the multi-mode sub-data meets a preset screening condition corresponding to the multi-mode sub-data, and obtaining a screening determination result corresponding to the multi-mode sub-data may include the following operations:

when the multi-mode data comprises voice data, judging whether the voice duration is greater than or equal to the preset voice duration and whether the voice decibel is greater than or equal to the preset voice decibel;

when the fact that the preset voice voiceprint matched with the voice voiceprint does not exist in the preset voice voiceprint set is found, determining that the voice data does not meet preset screening conditions corresponding to the voice data;

When the voice duration is judged to be smaller than the preset voice duration or the voice decibel is judged to be smaller than the preset voice decibel, determining that the voice data does not meet the preset screening condition corresponding to the voice data.

Optionally, the preset voice voiceprint may be a voice voiceprint stored in advance under a preset security measure. Further optionally, the user age and/or user identity may be determined from the voice voiceprint. Still further optionally, a preset voice voiceprint corresponding to each device may be set. For example, because using a kitchen range may pose a fire safety risk, the preset voice voiceprints corresponding to the kitchen range may be set to include only the voice voiceprints of users who meet an age condition (such as adults), and the voice voiceprints of users who do not meet the age condition (such as children) are not allowed to be set as preset voice voiceprints. That is, children are not allowed to control the kitchen range, which reduces the probability of safety accidents.

It can be seen that, when the voice duration of the voice data is greater than or equal to the preset voice duration and the voice decibel is greater than or equal to the preset voice decibel, this optional embodiment can also search the preset voice voiceprint set for a preset voice voiceprint that matches the voice voiceprint of the voice data. If a match exists, the voice data is determined to meet the corresponding preset screening condition. If no matching preset voice voiceprint exists, or the voice duration is less than the preset voice duration, or the voice decibel is less than the preset voice decibel, the voice data is determined not to meet the corresponding preset screening condition. This improves the accuracy of judging whether the voice data meets the condition, which is beneficial to improving the accuracy of judging whether the multi-mode data meets the condition and the accuracy of the fusion analysis of the multi-mode data.
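The voice screening condition can be illustrated with a short sketch. The `VoiceData` class, the function name, and the representation of a voiceprint as a plain string identifier are assumptions; a real system would compare voiceprint features with some similarity measure rather than test exact set membership.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class VoiceData:
    duration_s: float
    decibel: float
    voiceprint: str  # hypothetical: an identifier standing in for voiceprint features


def voice_passes_screening(
    voice: VoiceData,
    min_duration_s: float,
    min_decibel: float,
    preset_voiceprints: Set[str],
) -> bool:
    """Apply the duration, decibel and voiceprint checks in the order described."""
    if voice.duration_s < min_duration_s or voice.decibel < min_decibel:
        return False
    # Only search the preset voiceprint set once duration and decibel are acceptable.
    return voice.voiceprint in preset_voiceprints
```

Keeping one preset voiceprint set per device, for example a kitchen range set that contains only adult voiceprints, gives the per-device restriction described above.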

In this optional embodiment, optionally, for each multi-mode sub-data in the multi-mode data, determining whether the multi-mode sub-data meets a preset screening condition corresponding to the multi-mode sub-data, to obtain a screening determination result corresponding to the multi-mode sub-data may include the following operations:

when the multi-mode data comprises image data, identifying a target user in the image data and position information corresponding to the target user;

judging whether a target user is in a preset area of a target environment;

when the target user is judged to be in the preset area of the target environment, determining that the image data meets the preset screening conditions corresponding to the image data;

when the target user is judged not to be in the preset area of the target environment, determining that the image data does not meet the preset screening conditions corresponding to the image data.

If the target user cannot be identified from the image data, or the target user is located in an area where the action information or the expression information corresponding to the target user is difficult to identify, it is determined that the target user is not in the preset area of the target environment.

It can be seen that, this optional embodiment can also determine whether the user is in the preset area of the target environment after identifying the user in the image data and the position information corresponding to the user, if it is determined that the user is in the preset area of the target environment, the image data meets the corresponding preset screening condition, if it is determined that the user is not in the preset area of the target environment, it is determined that the image data does not meet the corresponding preset screening condition, so that the function of intelligently determining the position of the user is implemented, the determination accuracy of the position of the user is improved, thereby improving the judgment accuracy of the condition satisfaction of the image data, further being beneficial to generating the equipment control parameters for the position of the user, and improving the accuracy of the equipment control.
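A minimal sketch of the image screening condition follows. Modelling the preset area as an axis-aligned rectangle and the user position as a pair of coordinates are assumptions made for illustration; the source does not specify the shape of the preset area or the coordinate system.

```python
from typing import Optional, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: x_min, y_min, x_max, y_max


def image_passes_screening(
    user_position: Optional[Tuple[float, float]],
    preset_area: Rect,
) -> bool:
    """True when an identified user stands inside the preset area."""
    # A user who could not be identified is treated as outside the preset area.
    if user_position is None:
        return False
    x, y = user_position
    x_min, y_min, x_max, y_max = preset_area
    return x_min <= x <= x_max and y_min <= y <= y_max
```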

Example two

Referring to fig. 3, fig. 3 is a flow chart of a device control method based on multi-mode data according to an embodiment of the invention. The device control method based on multi-mode data described in fig. 3 may be applied to a device control apparatus based on multi-mode data, where the apparatus may include one of a control device, a control terminal, a control system, and a server, and the server includes a local server or a cloud server. The method may also be applied to a control apparatus corresponding to a smart home, which is not limited in the embodiment of the present invention. As shown in fig. 3, the device control method based on multi-mode data may include the following operations:

201. Collecting multi-mode data corresponding to the target environment.

202. Performing a clustering operation on the multi-mode data based on a predetermined clustering algorithm to obtain a data clustering result.

203. Analyzing the data clustering result to obtain a clustering analysis result.

In the embodiment of the invention, the clustering analysis result represents the information contained in each data clustering result, or the result of analyzing the degree of association among a plurality of data clustering results.

204. Determining the equipment use requirement corresponding to the target user according to the clustering analysis result.

In the embodiment of the invention, the target user is a user in the target environment.

205. Screening, according to the equipment use requirement, at least one device whose device function meets the equipment use requirement from a plurality of devices in the target environment as a target device.

206. Generating control parameters corresponding to the target control equipment according to the equipment use requirement.

In the embodiment of the invention, the device control operation corresponding to the control parameters is used for controlling the target control equipment to adjust the device operation parameters of the target device.

207. Controlling the target control equipment to execute the equipment control operation corresponding to the control parameters according to the control parameters corresponding to the target control equipment.

In the embodiment of the present invention, for other detailed descriptions of step 201 to step 202 and step 207, please refer to the detailed descriptions of step 101 to step 104 in the first embodiment; they are not repeated here.

Therefore, the method described in the embodiment of the invention can perform a clustering operation, based on a clustering algorithm, on the collected multi-mode data corresponding to the target environment to obtain a data clustering result, generate control parameters corresponding to the target control equipment according to the data clustering result, and control the target control equipment to execute the equipment control operation corresponding to the control parameters. This realizes intelligent control of equipment based on multi-mode data, improves the application flexibility of the multi-mode data, and thereby improves the flexibility of determining the control parameters and the control flexibility of the equipment. It is also beneficial to improving the control accuracy and control convenience of the equipment, provides users with a smart home that intelligently identifies equipment control requirements, and is beneficial to improving the user's experience of the smart home. In addition, a clustering analysis result can be obtained by analyzing the data clustering result, the equipment use requirement corresponding to the user can be determined according to the clustering analysis result, at least one device whose device function meets the equipment use requirement can be screened out as the target device, and the control parameters corresponding to the target control equipment can be generated according to the equipment use requirement. This realizes intelligent analysis of user requirements based on the clustering result of the multi-mode data, improves the flexibility and accuracy of determining user requirements, and thereby improves the generation accuracy of the control parameters and the control accuracy of the equipment.
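Steps 201 to 207 can be pictured as one control cycle in which each step feeds the next. The sketch below only shows that ordering; the function name `run_control_cycle`, all parameter names, the representation of device functions as sets of strings, and passing the screened target devices into parameter generation are assumptions, and each step is supplied by the caller because the source does not fix a clustering algorithm or a parameter format.

```python
from typing import Callable, Dict, List, Set


def run_control_cycle(
    collect: Callable[[], dict],
    cluster: Callable[[dict], list],
    analyze: Callable[[list], dict],
    infer_requirement: Callable[[dict], str],
    devices: Dict[str, Set[str]],            # device name -> functions it offers
    make_parameters: Callable[[str, List[str]], dict],
    execute: Callable[[dict], None],
) -> dict:
    data = collect()                               # step 201
    clusters = cluster(data)                       # step 202
    analysis = analyze(clusters)                   # step 203
    requirement = infer_requirement(analysis)      # step 204
    targets = [name for name, functions in devices.items()
               if requirement in functions]        # step 205
    parameters = make_parameters(requirement, targets)  # step 206
    execute(parameters)                            # step 207
    return parameters
```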

In an alternative embodiment, determining the device usage requirement corresponding to the target user according to the result of the cluster analysis may include the following operations:

determining target information corresponding to a target environment according to a cluster analysis result, wherein the target information comprises scene information corresponding to the target environment and user information of a target user;

according to the user information of the target user, determining the user state information of the target user and the equipment use tendency corresponding to the target user;

analyzing the target information to obtain an information change trend corresponding to the target information, wherein the information change trend corresponding to the target information comprises sub-information change trends corresponding to each type of target sub-information;

Optionally, if the target information includes target sub-information that is attribute information, such as scene attribute information and user attribute information, and the attribute information is basically stable and unchanged, the target sub-information may be selected not to be analyzed, and an information change trend corresponding to the target sub-information does not need to be obtained.

For example, if the determined user state is that the user feels very hot, the determined device use tendency is a tendency to use a device with a cooling function, the scene information in the target information includes the scene temperature, and the sub-information change trend corresponding to the scene temperature is that the temperature is rising, the equipment use requirement can be determined to be starting a device with a cooling function (such as an air conditioner and/or a fan).

Therefore, according to the alternative embodiment, the target information corresponding to the target environment can be determined according to the clustering analysis result, the user state information and the equipment use tendency corresponding to the user are determined according to the user information in the target information, the target information is analyzed to obtain the information change tendency corresponding to the target information, the equipment use requirement is determined according to the user state information, the equipment use tendency and the information change tendency, the analysis accuracy of the multi-mode data can be improved, the determination accuracy of the equipment use requirement is improved, the generation accuracy of the control parameters is improved, and the control accuracy of the equipment is improved.
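The way the three inputs combine into a device use requirement can be sketched with the cooling example. The function names, the string labels, the first-to-last comparison used to estimate the trend, and the `tolerance` value are assumptions chosen for illustration; the source does not define how a change trend is computed.

```python
from typing import Optional, Sequence


def change_trend(readings: Sequence[float], tolerance: float = 0.5) -> str:
    """Classify a series of readings as rising, falling or stable."""
    if len(readings) < 2:
        return "stable"
    delta = readings[-1] - readings[0]  # assumed: compare last and first reading
    if delta > tolerance:
        return "rising"
    if delta < -tolerance:
        return "falling"
    return "stable"


def device_use_requirement(
    user_state: str,
    usage_tendency: str,
    temperature_trend: str,
) -> Optional[str]:
    """Combine user state, device use tendency and trend into a requirement."""
    if (user_state == "feels very hot"
            and usage_tendency == "cooling"
            and temperature_trend == "rising"):
        return "start cooling device"
    return None  # other combinations are outside this sketch
```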

In this optional embodiment, optionally, the scene information corresponding to the target environment includes scene attribute information and/or scene environment parameters, and the scene attribute information includes one or more of a combination of a scene space size, a scene type, and a scene position; the scene environment parameters include a combination of one or more of scene temperature, scene humidity, and scene brightness;

the user information of the target user comprises one or more of user attribute information, user sign information, user action information, user expression information, user emotion information and user position information; the user attribute information includes a combination of one or more of a user gender, a user age, and a user body type; the user sign information comprises a combination of one or more of a user body temperature, a user breathing frequency, and a user heartbeat frequency; the user location information includes user coordinates and a user action trajectory.

Therefore, the optional embodiment can further explain the specific information types of the scene information corresponding to the target environment and the user information corresponding to the target environment in detail, namely, diversified information can be obtained through analysis according to the clustering analysis result, and the analysis accuracy and the analysis flexibility of the multi-mode data can be improved, so that the accuracy and the reliability of the determination of the equipment use requirement can be improved, and the generation accuracy of the control parameters can be improved.
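The information types listed above can be grouped into simple records. The class names, field names and units below are assumptions; every field is optional because the source states that each group may contain one or more of the listed items.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SceneInfo:
    # Scene attribute information
    space_size_m2: Optional[float] = None
    scene_type: Optional[str] = None
    scene_position: Optional[str] = None
    # Scene environment parameters
    temperature_c: Optional[float] = None
    humidity_pct: Optional[float] = None
    brightness_lux: Optional[float] = None


@dataclass
class UserInfo:
    # User attribute information
    gender: Optional[str] = None
    age: Optional[int] = None
    body_type: Optional[str] = None
    # User sign information
    body_temperature_c: Optional[float] = None
    breathing_rate_per_min: Optional[float] = None
    heart_rate_per_min: Optional[float] = None
    # Action, expression and emotion information
    action: Optional[str] = None
    expression: Optional[str] = None
    emotion: Optional[str] = None
    # User location information: coordinates and action trajectory
    coordinates: Optional[Tuple[float, float]] = None
    trajectory: List[Tuple[float, float]] = field(default_factory=list)


@dataclass
class TargetInfo:
    scene: SceneInfo
    user: UserInfo
```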

In this optional embodiment, further optionally, determining, according to the user information of the target user, the user status information of the target user and the device usage tendency corresponding to the target user may include the following operations:

analyzing the user action information to obtain action meanings corresponding to the user action information;

analyzing the user expression information to obtain expression meanings corresponding to the user expression information;

judging whether the emotion similarity between the user emotion information represented by the action meaning and the user emotion information represented by the expression meaning is higher than a preset emotion similarity or not;

when the emotion similarity between the user emotion information represented by the action meaning and the user emotion information represented by the expression meaning is higher than the preset emotion similarity, determining the user state of the target user according to the action meaning, the expression meaning and the user sign information;

determining the device use tendency corresponding to the target user according to the action meaning, the expression meaning and the user state.

For example, if the user action information is fanning oneself, the user expression information is frowning, and the user's heartbeat is accelerated, the action meaning obtained by analysis is that the user feels hot and needs to cool down, and the expression meaning obtained by analysis is dissatisfaction or discomfort. The user emotion information indicated by both the action meaning and the expression meaning is judged to be a negative emotion, so the user state of the target user is determined, according to the action meaning, the expression meaning and the accelerated heartbeat, to be feeling very hot, and the device use tendency corresponding to the target user can be determined to be starting a cooling device.

It can be seen that, this optional embodiment can also determine whether the emotion similarity between the identified user emotion information is higher than the preset emotion similarity according to the action meaning corresponding to the user action information obtained by analysis and the expression meaning corresponding to the user expression information, if yes, the user state is determined according to the action meaning, the expression meaning and the user sign information, and the equipment using tendency is determined according to the action meaning, the expression meaning and the user state, so that the intelligent recognition of the user emotion is realized, the accuracy of determining the user state and the equipment using tendency can be improved, thereby improving the accuracy of determining the equipment using requirement, and being beneficial to improving the accuracy of generating the equipment control parameters.
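One possible way to picture the emotion similarity check is to attach a valence score in the range [-1, 1] to each meaning and compare the two scores. The valence representation, the similarity formula, the threshold value, and the string labels in this sketch are all assumptions; the source does not say how emotion similarity is measured or what happens when the similarity is not higher than the preset value, so the sketch simply returns None in that case.

```python
from typing import Optional, Tuple

Meaning = Tuple[str, float]  # hypothetical: (meaning label, emotion valence in [-1, 1])


def emotion_similarity(valence_a: float, valence_b: float) -> float:
    """Map the distance between two valence scores to a similarity in [0, 1]."""
    return 1.0 - abs(valence_a - valence_b) / 2.0


def infer_state_and_tendency(
    action: Meaning,
    expression: Meaning,
    heartbeat_accelerated: bool,
    threshold: float = 0.8,  # assumed value of the preset emotion similarity
) -> Optional[Tuple[str, str]]:
    """Return (user state, device use tendency) when both emotions agree."""
    action_meaning, action_valence = action
    expression_meaning, expression_valence = expression
    if emotion_similarity(action_valence, expression_valence) <= threshold:
        return None
    if action_meaning == "needs cooling" and expression_meaning == "uncomfortable":
        # The sign information (heartbeat) refines the user state.
        state = "feels very hot" if heartbeat_accelerated else "feels hot"
        return state, "start cooling device"
    return None  # other meanings are outside this sketch
```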

Example three

Referring to fig. 4, fig. 4 is a schematic structural diagram of a device control apparatus based on multi-mode data according to an embodiment of the present invention. The device control apparatus based on multi-mode data described in fig. 4 may include one of a control device, a control terminal, a control system, and a server, where the server includes a local server or a cloud server, and the device control apparatus based on multi-mode data may also be applied to a control apparatus corresponding to an intelligent home, which is not limited in the embodiment of the present invention. As shown in fig. 4, the device control apparatus based on multi-modality data may include:

the acquisition module 301 is configured to acquire multi-modal data corresponding to a target environment, where the multi-modal data includes at least two of voice data, text data, radar point cloud data, image data, and image data;

the clustering module 302 is configured to perform a clustering operation on the multi-mode data based on a predetermined clustering algorithm, so as to obtain a data clustering result;

the generating module 303 is configured to generate control parameters corresponding to a target control device according to a data clustering result, where the target control device is configured to control a plurality of devices in a target environment;

the control module 304 is configured to control the target control device to execute a device control operation corresponding to the control parameter according to the control parameter corresponding to the target control device.

Therefore, the device described by the embodiment of the invention can execute clustering operation on the multi-mode data corresponding to the acquired target environment based on the clustering algorithm to obtain the data clustering result, and generate the control parameters corresponding to the target control equipment according to the data clustering result to control the target control equipment to execute the equipment control operation corresponding to the control parameters, so that the intelligent control on the equipment based on the multi-mode data is realized, the application flexibility of the multi-mode data can be improved, the determination flexibility of the control parameters is improved, the control flexibility of the equipment is further improved, the control accuracy and the control convenience of the equipment are improved, intelligent home for intelligently identifying the equipment control requirements are provided for users, and the use experience of the users on the intelligent home is improved.
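The relationship among modules 301 to 304 can be sketched as a small class that wires four injected components together. The class name, attribute names and the `run_once` method are assumptions; the sketch only illustrates that each module consumes the output of the previous one and does not reflect any particular hardware or software partitioning.

```python
from typing import Callable


class MultiModalControlApparatus:
    """Wires together the four modules of the apparatus (names are illustrative)."""

    def __init__(
        self,
        acquisition: Callable[[], dict],      # module 301
        clustering: Callable[[dict], list],   # module 302
        generating: Callable[[list], dict],   # module 303
        control: Callable[[dict], None],      # module 304
    ) -> None:
        self.acquisition = acquisition
        self.clustering = clustering
        self.generating = generating
        self.control = control

    def run_once(self) -> dict:
        data = self.acquisition()
        clustering_result = self.clustering(data)
        parameters = self.generating(clustering_result)
        self.control(parameters)
        return parameters
```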

In an alternative embodiment, as shown in fig. 5, the apparatus may further include:

a judging module 305, configured to judge whether the multi-modal data meets a preset control condition before the clustering module 302 performs a clustering operation on the multi-modal data based on a predetermined clustering algorithm to obtain a data clustering result; when judging that the multi-mode data meets the preset control condition, triggering the clustering module 302 to execute the clustering operation on the multi-mode data based on the preset clustering algorithm to obtain the data clustering result;

An obtaining module 306, configured to obtain a historical multimodal data set corresponding to the target environment when the judging module 305 judges that the multimodal data does not meet the preset control condition;

a screening module 307, configured to screen, from the historical multimodal data set, historical multimodal data that meets a preset control condition and has a data similarity with the multimodal data that is greater than a preset similarity threshold, as target multimodal data;

the calibration module 308 is configured to calibrate the multi-modal data according to the target multi-modal data, and retrigger the judging module 305 to perform an operation of judging whether the multi-modal data meets the preset control condition.

Therefore, before the clustering operation is performed on the multi-modal data, the device described in the optional embodiment can judge whether the multi-modal data meets the preset control condition, if so, execute the step of performing the clustering operation on the multi-modal data, and if not, screen the historical multi-modal data which meets the preset control condition and has the data similarity with the multi-modal data larger than the preset similarity threshold value from the historical multi-modal data set corresponding to the target environment as the target multi-modal data, calibrate the multi-modal data according to the target multi-modal data, and re-execute the judging operation, so that the judging efficiency of the multi-modal data meeting the condition can be improved, the data screening efficiency is improved, and further the fusion analysis efficiency of the multi-modal data is facilitated.

In this alternative embodiment, the specific manner of determining whether the multimodal data meets the preset control condition by the determining module 305 may include:

judging whether at least two target sub-data exist in the multi-mode data;

It can be seen that the device described in this optional embodiment may further determine whether each multi-mode sub-data meets a corresponding preset screening condition, if so, determine the multi-mode sub-data as target sub-data, and if at least two kinds of target sub-data exist in the multi-mode data, determine that the multi-mode data meets a preset control condition, and if less than two kinds of target sub-data exist in the multi-mode data, determine that the multi-mode data does not meet the preset control condition, thereby further improving accuracy in determining that the multi-mode data meets the condition, further improving accuracy in screening data, and further being beneficial to accuracy in fusion analysis of the multi-mode data, so as to improve accuracy in determining parameters of the device.

the specific manner of the determining module 305 to determine, for each multi-mode sub-data in the multi-mode data, whether the multi-mode sub-data meets the preset screening condition corresponding to the multi-mode sub-data, to obtain the screening determination result corresponding to the multi-mode sub-data may include:

It can be seen that the device described in this optional embodiment may further find whether a preset voice voiceprint matching the voice voiceprint of the voice data exists in the preset voice voiceprint set when the voice time length of the voice data is greater than or equal to the preset voice time length and the voice decibel is greater than or equal to the preset voice decibel, if so, determine that the voice data meets the corresponding preset screening condition, and if not, determine that the voice data does not meet the corresponding preset screening condition, and improve the judgment accuracy of the voice data meeting condition, thereby improving the screening accuracy of the voice data, further being beneficial to improving the judgment accuracy of the multi-mode data meeting condition and being beneficial to the fusion analysis accuracy of the multi-mode data.

In another alternative embodiment, the specific manner of generating, by the generating module 303, the control parameter corresponding to the target control device according to the data clustering result may include:

Analyzing the data clustering result to obtain a clustering analysis result;

according to the clustering analysis result, determining the equipment use requirement corresponding to a target user, wherein the target user is a user in a target environment;

according to the equipment use requirement, at least one equipment with equipment functions meeting the equipment use requirement is screened out from a plurality of equipment in a target environment to serve as target equipment;

and generating control parameters corresponding to the target control equipment according to equipment use requirements, wherein equipment control operation corresponding to the control parameters is used for controlling the target control equipment to adjust equipment operation parameters of the target equipment.

Therefore, the device described by implementing the alternative embodiment can obtain the clustering analysis result by analyzing the data clustering result, determine the equipment use requirement corresponding to the user according to the clustering analysis result, screen at least one device with the equipment function meeting the equipment use requirement according to the equipment use requirement as the target device, and generate the control parameter corresponding to the target control device according to the equipment use requirement, so that the intelligent analysis of the user requirement based on the clustering result of the multi-mode data is realized, the determination flexibility and the determination accuracy of the user requirement can be improved, the generation accuracy of the control parameter is improved, and the control accuracy of the equipment is further improved.

In this alternative embodiment, the specific manner of determining, by the generating module 303, the device usage requirement corresponding to the target user according to the cluster analysis result may include:

It can be seen that the device described in this optional embodiment may further determine, according to the result of the cluster analysis, target information corresponding to the target environment, determine, according to user information in the target information, user status information and a device usage trend corresponding to the user, and analyze the target information to obtain an information change trend corresponding to the target information, and determine, according to the user status information, the device usage trend and the information change trend, a device usage requirement, so as to improve accuracy in analyzing the multi-mode data, thereby improving accuracy in determining the device usage requirement, further improving accuracy in generating control parameters, and being beneficial to improving accuracy in controlling the device.

In this alternative embodiment, the scene information corresponding to the target environment includes scene attribute information and/or scene environment parameters, and the scene attribute information includes one or more of a combination of a scene space size, a scene type, and a scene position; the scene environment parameters include a combination of one or more of scene temperature, scene humidity, and scene brightness;

Therefore, the device described by implementing the alternative embodiment can also specify the specific information types of the scene information corresponding to the target environment and the user information corresponding to the target environment, namely, can obtain diversified information according to the analysis of the clustering analysis result, and can improve the analysis accuracy and the analysis flexibility of the multi-mode data, thereby being beneficial to improving the determination accuracy and the reliability of the equipment use requirement and further being beneficial to improving the generation accuracy of the control parameters.

Example four

Referring to fig. 6, fig. 6 is a schematic structural diagram of another device control apparatus based on multi-mode data according to an embodiment of the present invention. As shown in fig. 6, the device control apparatus based on multi-modality data may include:

a memory 401 storing executable program codes;

a processor 402 coupled with the memory 401;

the processor 402 invokes executable program codes stored in the memory 401 to perform the steps in the device control method based on multimodal data described in the first or second embodiment of the present invention.

Example five

The embodiment of the invention discloses a computer storage medium which stores computer instructions for executing the steps in the device control method based on multi-mode data described in the first or second embodiment of the invention when the computer instructions are called.

Example six

An embodiment of the present invention discloses a computer program product including a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform the steps in the device control method based on multi-modal data described in the first or second embodiment of the present invention.

The apparatus embodiments described above are merely illustrative. The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without undue effort.

From the above detailed description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solutions may be embodied, in essence or in part, in the form of a software product stored in a computer-readable storage medium, including read-only memory (ROM), random access memory (RAM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data.

Finally, it should be noted that the device control method and apparatus based on multi-modal data disclosed in the embodiments of the present invention are merely preferred embodiments, intended only to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the various embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims

1. A method for controlling a device based on multi-modal data, the method comprising:

2. The device control method based on multi-modal information according to claim 1, wherein before the clustering operation is performed on the multi-modal data based on the predetermined clustering algorithm, the method further comprises:

determining whether the multi-modal data satisfies a preset control condition;

3. The device control method based on multi-modal information according to claim 2, wherein the determining whether the multi-modal data satisfies the preset control condition includes:

determining whether at least two target sub-data exist in the multi-modal data;

4. The device control method based on multi-modal information according to claim 3, wherein the voice data includes one or more of a voice duration, a voice decibel level, and a voiceprint;

5. The device control method based on multi-modal information according to any one of claims 1 to 4, wherein the generating control parameters corresponding to the target control device according to the data clustering result includes:

analyzing the data clustering result to obtain a clustering analysis result;

6. The device control method based on multi-modal information according to claim 5, wherein the determining, according to the cluster analysis result, a device usage requirement corresponding to a target user includes:

7. The device control method based on multi-modal information according to claim 6, wherein the scene information corresponding to the target environment includes scene attribute information and/or scene environment parameters, the scene attribute information including one or more of a scene space size, a scene type, and a scene position; the scene environment parameters include one or more of scene temperature, scene humidity, and scene brightness;

8. A device control apparatus based on multi-modal data, the apparatus comprising:

9. A device control apparatus based on multi-modal data, the apparatus comprising:

a memory storing executable program code;

a processor coupled to the memory;

the processor invokes the executable program code stored in the memory to perform the device control method based on multi-modal data of any one of claims 1 to 7.

10. A computer storage medium storing computer instructions which, when invoked, perform the device control method based on multi-modal data of any one of claims 1 to 7.