CN107247923A - Instruction identification method and device, storage equipment, mobile terminal and electric appliance - Google Patents
- Publication number
- CN107247923A CN201710353318.1A
- Authority
- CN
- China
- Prior art keywords
- image
- area
- recognition
- sound
- operator
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/166—Detection; Localisation; Normalisation using acquisition arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Acoustics & Sound (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Computational Linguistics (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses an instruction identification method, an instruction identification device, a storage device, a mobile terminal and an electric appliance. The method comprises the following steps: analyzing sound in a current scene, determining a sound source of the sound, and locating, by voice, the area where the sound source is located; analyzing a first image area, which is the part of the image of the current scene that lies in the area where the sound source is located, and determining whether a set image recognition operation exists in the first image area; and when the image recognition operation exists in the first image area, converting the image recognition operation into a required instruction. The scheme of the invention can overcome the defects of the prior art, such as low recognition efficiency, poor positioning accuracy and poor user experience, and has the beneficial effects of high recognition efficiency, good positioning accuracy and good user experience.
Description
Technical Field
The invention belongs to the technical field of electric appliances, and particularly relates to an instruction identification method, an instruction identification device, a storage device, a mobile terminal and an electric appliance, in particular to an air conditioner control method based on image and voice identification, a device corresponding to the method, a storage device storing an instruction corresponding to the method, a mobile terminal capable of executing the instruction corresponding to the method, and an air conditioner with the device corresponding to the method.
Background
Image recognition is a technique that uses a computer to process, analyze, and understand images in order to identify targets and objects in various patterns. Lacking auxiliary positioning, existing image recognition technology has to analyze and compute over the whole collected image continuously in order to extract the relevant feature information. This large amount of computation easily causes recognition lag, and recognition and positioning errors occur.
In the prior art, the lack of auxiliary positioning leads to defects such as low recognition efficiency, poor positioning accuracy and poor user experience.
Disclosure of Invention
The invention aims to overcome the above defects by providing an instruction identification method, an instruction identification device, a storage device, a mobile terminal and an electric appliance, so as to solve the prior-art problem of low identification efficiency caused by the lack of auxiliary positioning and the resulting need to analyze the acquired images continuously, and to achieve the effect of improving the identification efficiency.
The invention provides an instruction identification method, which comprises the following steps: analyzing sound in a current scene, determining a sound source of the sound, and locating, by voice, the area where the sound source is located; analyzing a first image area, which is the part of the image of the current scene that lies in the area where the sound source is located, and determining whether a set image recognition operation exists in the first image area; and when the image recognition operation exists in the first image area, converting the image recognition operation into a required instruction.
Optionally, the method further comprises: acquiring the sound in the current scene before analyzing the sound in the current scene; and/or, before analyzing the first image area of the image of the current scene in the area where the sound source is located, acquiring at least one of the image of the current scene and the first image area; and/or, before analyzing the first image area, determining whether the area where the sound source is located, as obtained by the voice positioning, is valid, so that the first image area is analyzed only when the area where the sound source is located is valid.
Optionally, the acquiring the sound in the current scene includes:
receiving the sound in the current scene collected by a sound collection module; wherein the sound collection module includes at least one of a microphone and a sound sensor;
and/or, the acquiring at least one of the image in the current scene and the first image region includes: receiving at least one of the image in the current scene and the first image region, as acquired by an image acquisition module; wherein the image acquisition module includes at least one of a camera, an infrared sensor, a CCD image sensor and an ultrasonic sensor.
Optionally, the method further comprises: when the image recognition operation does not exist in the first image area, analyzing a second image area except the first image area in the image to determine whether the image recognition operation exists in the second image area; or analyzing all image areas in the image to determine whether the image recognition operation exists in all the image areas.
Optionally, the determining whether there is a set image recognition operation in the first image region includes: performing face recognition on the first image area, and determining whether an operator exists in the first image area; when the operator exists in the first image area, determining whether the operator has the image recognition operation, and when the operator has the image recognition operation, determining that the image recognition operation exists in the first image area.
Optionally, determining whether the operator has the image recognition operation includes: locating, by image, the position of the operator in the first image region; locking onto the operator according to the position, and tracking the action of the operator; and performing gesture recognition on the action, and determining whether the action is a set operation, so that when the action is a gesture operation, the action is determined to be an image recognition operation.
In accordance with the above method, another aspect of the present invention provides an instruction recognition apparatus, including: a voice recognition module, configured to analyze the sound in the current scene, determine the sound source of the sound, and locate, by voice, the area where the sound source is located; and an image recognition module, configured to analyze a first image area of the image of the current scene in the area where the sound source is located and determine whether a set image recognition operation exists in the first image area; the image recognition module is further configured to convert the image recognition operation into a required instruction when the image recognition operation exists in the first image region.
Optionally, the apparatus further comprises: a receiving module, configured to acquire the sound in the current scene before the sound in the current scene is analyzed; and/or the receiving module is further configured to acquire at least one of the image in the current scene and the first image area before the first image area of the image of the current scene in the area where the sound source is located is analyzed; and/or the image recognition module is further configured to determine, before analyzing the first image area, whether the area where the sound source is located, as obtained by the voice positioning, is valid, so that the first image area is analyzed only when the area where the sound source is located is valid.
Optionally, the receiving module includes: a sound collection module; the receiving module is configured to receive the sound in the current scene collected by the sound collection module; wherein the sound collection module includes at least one of a microphone and a sound sensor; and/or, the receiving module further comprises: an image acquisition module; the receiving module is further configured to receive at least one of the image in the current scene and the first image region, as acquired by the image acquisition module; wherein the image acquisition module includes at least one of a camera, an infrared sensor, a CCD image sensor and an ultrasonic sensor.
Optionally, the image recognition module is further configured to, when the image recognition operation does not exist in the first image region, analyze a second image region of the image other than the first image region to determine whether the image recognition operation exists in the second image region; or analyze all image areas in the image to determine whether the image recognition operation exists in any of the image areas.
Optionally, the image recognition module includes: the face recognition module is used for carrying out face recognition on the first image area and determining whether an operator exists in the first image area; and the action recognition module is used for determining whether the operator has the image recognition operation when the operator exists in the first image area, and determining that the image recognition operation exists in the first image area when the operator has the image recognition operation.
Optionally, the determining, by the action recognition module, whether the operator has the image recognition operation includes: locating, by image, the position of the operator in the first image region; locking onto the operator according to the position, and tracking the action of the operator; and performing gesture recognition on the action, and determining whether the action is a set operation, so that when the action is a gesture operation, the action is determined to be an image recognition operation.
In accordance with the above method, another aspect of the present invention provides a storage device in which a plurality of instructions are stored; the instructions are adapted to be loaded by a processor to execute the instruction identification method.
In accordance with the above method, another aspect of the present invention provides a mobile terminal, including: a processor for executing a plurality of instructions; and a memory for storing the plurality of instructions; wherein the instructions are stored by the memory and are loaded and executed by the processor.
In accordance with at least one of the storage device, the mobile terminal, and the apparatus, the present invention provides an electrical appliance, including: at least one of the instruction recognition device, the storage device and the mobile terminal.
Optionally, the appliance includes: at least one of an air conditioner, a refrigerator, a television, a water heater, a water dispenser, an air purifier and a smoke exhaust ventilator.
According to the scheme of the invention, the voice recognition is used for assisting the positioning, so that the image recognition efficiency and the image recognition accuracy are improved.
Furthermore, by combining image recognition with voice recognition positioning, the scheme of the invention improves the accuracy with which image recognition control locks onto the orientation of the operator, thereby improving the accuracy of subsequent control.
Furthermore, by locating the position of the sound source through voice recognition, the scheme of the invention provides a positioning reference for locating the position of the operator through image recognition, and improves the response speed and accuracy of image recognition for controlling the air conditioner.
Furthermore, the scheme of the invention determines the direction of the person through voice recognition positioning, thereby giving a reference to image recognition, so that the accurate position of the person in the image is located quickly and the efficiency and accuracy of image recognition are improved.
Therefore, by performing recognition and positioning through a combination of voice and images, the scheme of the invention solves the prior-art problem of low recognition efficiency caused by the lack of auxiliary positioning and the resulting need to analyze the acquired images continuously. The defects of low recognition efficiency, poor positioning accuracy and poor user experience in the prior art are thereby overcome, and the beneficial effects of high recognition efficiency, good positioning accuracy and good user experience are achieved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a flowchart illustrating an embodiment of an instruction recognition method according to the present invention;
FIG. 2 is a flowchart illustrating an embodiment of determining whether there is a set image recognition operation in the first image region according to the method of the present invention;
FIG. 3 is a flowchart illustrating an embodiment of determining whether the operator has the image recognition operation in the method of the present invention;
FIG. 4 is a schematic diagram illustrating an embodiment of an instruction recognition apparatus according to the present invention;
FIG. 5 is a diagram illustrating a structure of a mobile terminal according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an embodiment of an electrical appliance (e.g., an air conditioner) according to the present invention;
FIG. 7 is a schematic diagram of an embodiment of a speech recognition area in an electrical appliance (e.g., an air conditioner) according to the present invention;
FIG. 8 is a control flow diagram of an electrical appliance (e.g., an air conditioner) according to an embodiment of the present invention.
The reference numbers in the embodiments of the present invention are as follows, in combination with the accompanying drawings:
102-a receiving module; 1022-a sound collection module; 1024-an image acquisition module; 104-a speech recognition module; 106-an image recognition module; 200-a mobile terminal; 202-a memory; 204-processor.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the specific embodiments of the present invention and the accompanying drawings. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
According to an embodiment of the present invention, an instruction recognition method (e.g., an air conditioner control method based on image and voice recognition) is provided, as shown in fig. 1, which is a schematic flow chart of an embodiment of the method of the present invention. The instruction identification method may include:
At step S110, the sound in the current scene is analyzed, the sound source of the sound is determined, and the area where the sound source is located is located by voice (e.g., the direction, position, etc. of the sound source are located by voice through the voice recognition module 104).
For example: the sound is analyzed, the sound source of the sound is determined, and the area where the sound source is located is located.
For example: in voice recognition, the sound source is located through normal speech or a voice command, and the position of the sound source is calculated to obtain voice positioning position information.
For example: the voice recognition module 104 can locate a sound source and can perform voice recognition on the sound from the sound source. For example: the voice recognition module 104 can analyze the voice information collected by the microphone and locate the sound source.
For example: the sound can be analyzed and recognized first, and the sound source is then located.
For example: the voice recognition module 104 collects and analyzes voice and locates sound sources in real time, and transfers or shares the sound source information with the image recognition module 106.
At step S120, a first image region of the image of the current scene in the region where the sound source is located is analyzed to determine whether there is a set image recognition operation (e.g., a gesture operation) in the first image region.
For example: taking the area where the sound source is located as a first recognition area, the image recognition module 106 analyzes the first image area of the image that lies in the first recognition area, and determines whether an image recognition operation (for example, an image recognition operation on the electrical appliance to be controlled) exists in the first image area.
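As a minimal sketch of how a located sound source could be turned into a first image area, the following assumes that the voice recognition module reports a horizontal bearing in degrees relative to the camera axis and that the bearing maps linearly onto image columns. The function name, the field-of-view value and the strip width are hypothetical and are not taken from the patent.

```python
from typing import Optional, Tuple


def first_image_region(
    bearing_deg: float,
    image_width: int,
    fov_deg: float = 90.0,    # assumed horizontal field of view of the camera
    half_width_px: int = 80,  # assumed half-width of the strip to analyze
) -> Optional[Tuple[int, int]]:
    """Return (left, right) pixel columns around the sound source, or None.

    bearing_deg is 0 on the camera axis, negative to the left and positive
    to the right. The linear angle-to-column mapping is an assumption; a
    real lens would need calibration.
    """
    if abs(bearing_deg) > fov_deg / 2:
        return None  # the sound source lies outside the camera's view
    center = (bearing_deg / fov_deg + 0.5) * image_width
    left = max(0, int(center - half_width_px))
    right = min(image_width, int(center + half_width_px))
    return left, right
```

Returning None for a bearing outside the field of view leaves the caller free to fall back to a search of the whole image.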
For example: since both image recognition and voice recognition have a positioning function, the position of the sound source located by voice recognition can be transmitted to the image recognition module 106, so that the image recognition module 106 preferentially analyzes the image of the sound source region; this improves the efficiency of image recognition and positioning, and the operator can be found more quickly.
For example: when position information of the sound source is available, the image recognition module 106 preferentially analyzes the image area corresponding to the position of the sound source and searches for a control operator there, so as to determine quickly whether there is an image control requirement at the position of the sound source and to provide service more quickly; if no operator is found there, the other areas are searched for an operator. In this way the position of the operator can be located more quickly, and services and interactions such as gesture operation are provided for the operator.
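The priority order described above can be sketched as follows. This is only an illustration under stated assumptions: `find_operation`, `detect` and the region values are hypothetical names, and `detect` stands in for whatever recognizer the image recognition module applies to a region.

```python
from typing import Callable, Optional, Sequence, TypeVar

Region = TypeVar("Region")
Operation = TypeVar("Operation")


def find_operation(
    first_region: Optional[Region],
    other_regions: Sequence[Region],
    detect: Callable[[Region], Optional[Operation]],
) -> Optional[Operation]:
    """Check the sound-source region first, then fall back to the rest."""
    ordered = [] if first_region is None else [first_region]
    ordered.extend(other_regions)
    for region in ordered:
        operation = detect(region)
        if operation is not None:
            return operation  # stop as soon as an operation is found
    return None
```

When the operator is in the sound-source region, the recognizer runs once instead of once per region, which is the efficiency gain the text describes.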
Optionally, a specific process of determining whether there is a set image recognition operation in the first image region in step S120 may be further described with reference to a flowchart of an embodiment of determining whether there is a set image recognition operation in the first image region in the method of the present invention shown in fig. 2.
Step S210, performing face recognition on the first image region, and determining whether there is an operator in the first image region.
For example: through face recognition, whether the first image area contains a person can be determined.
For example: face entry (entering the operator's face information into the image recognition system): by making a sound or issuing a voice command (e.g., face entry), the operator can cause the voice recognition module 104 to locate the operator's position and cause the image recognition module 106 to locate the person whose face information is to be entered, so that the face information is collected.
Step S220, when the operator exists in the first image area, determining whether the operator has the image recognition operation, so that when the operator has the image recognition operation, determining that the image recognition operation exists in the first image area; or when the operator does not have the image recognition operation, determining that the image recognition operation is not present in the first image region.
For example: the image recognition module 106 has the function of analyzing and recognizing the captured image. For example, it has functions such as gesture recognition and face detection (face detection can, for example, detect whether a certain object is a person); it can analyze the collected picture (namely the collected image) to locate the position of the operator, lock onto and track the gesture pattern of the operator, and convert the gesture pattern into a specific air conditioner control instruction, thereby realizing control of the air conditioner.
For example: gesture operation: by making a sound or issuing a voice command (e.g., gesture control), the operator can cause the voice recognition module 104 to locate the operator's position, so that the image recognition system can locate the operator quickly and accurately.
Therefore, whether an operator exists or not is determined through face recognition, and image recognition operation is determined through action recognition when the operator exists, so that the recognition efficiency is high, and the accuracy is good.
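Steps S210 and S220 amount to a two-stage check, which might be sketched as below. The function and parameter names are hypothetical; `detect_face` and `detect_action` are placeholders for the face recognition and action recognition steps, whose internals the patent does not specify.

```python
from typing import Callable, Optional


def operation_in_region(
    region: object,
    detect_face: Callable[[object], Optional[object]],
    detect_action: Callable[[object, object], Optional[str]],
) -> Optional[str]:
    """S210: look for an operator; S220: only then look for an operation."""
    operator = detect_face(region)
    if operator is None:
        return None  # no operator, so action recognition is skipped
    return detect_action(region, operator)
```

Skipping action recognition when no face is found keeps the more expensive step off regions that contain no person.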
More optionally, a specific process of determining whether the operator has the image recognition operation in step S220 may be further described with reference to a flowchart of an embodiment of determining whether the operator has the image recognition operation in the method of the present invention shown in fig. 3.
Step S310, the position of the operator in the first image area is located through images.
And step S320, locking the operator according to the position, and tracking the action of the operator.
Step S330, performing gesture recognition on the action, and determining whether the action is a set operation (for example, a gesture operation), so that when the action is the gesture operation, the action is determined to be an image recognition operation; or when the action is not a gesture operation, determining that the action is not an image recognition operation.
For example: the image recognition operation may include a gesture operation: by making a sound or issuing a voice command (e.g., gesture control), the operator can cause the voice recognition module to locate the operator's position, so that the image recognition system can locate the operator quickly and accurately.
For example: the image recognition operation may further include face entry (entering the operator's face information into the image recognition system): by making a sound or issuing a voice command (e.g., face entry), the operator can cause the voice recognition module to locate the operator's position and cause the image recognition module to locate the person whose face information is to be entered, so that the face information is collected.
For example: image recognition control (for example, gesture control requires locating and tracking a gesture, and face detection control requires locating and detecting a face) is mostly based on analysis and recognition of the collected images: the position of the operator is located first, and the actions of the operator are then tracked and analyzed to complete a series of control operations on the air conditioner. However, a complex scene background may cause positioning to fail; for example, if the background picture contains a human figure, misrecognition occurs easily and gesture control cannot be performed.
Therefore, the position of the operator is positioned, the action of the operator is locked and tracked, and then image recognition operations such as gestures of the operator are acquired, so that the reliability is high, and the accuracy is good.
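The locking and tracking of steps S310 and S320 could be approximated as follows. The patent does not name a tracking technique; this sketch swaps in simple nearest-neighbour association between frames, and `track_operator`, `max_jump` and the coordinate format are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def track_operator(
    start: Point,
    frames: Sequence[Sequence[Point]],
    max_jump: float = 50.0,  # assumed largest plausible movement per frame, in pixels
) -> List[Point]:
    """Follow the detection closest to the locked operator from frame to frame."""
    track = [start]
    for detections in frames:
        if not detections:
            continue  # nothing detected in this frame
        nearest = min(detections, key=lambda p: math.dist(p, track[-1]))
        if math.dist(nearest, track[-1]) <= max_jump:
            track.append(nearest)  # same operator, slightly moved
        # otherwise the detection is treated as someone else and ignored
    return track
```

The distance limit is what keeps a figure in the background from taking over the track once the operator has been locked.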
At step S130, when the image recognition operation exists in the first image region, the image recognition operation is converted (e.g., in a set conversion manner) into a desired instruction (e.g., a control instruction, an operation instruction, etc. capable of controlling an electrical appliance to be controlled in the current scene).
When the image recognition operation is recognized, the control instruction corresponding to that image recognition operation can be determined according to the preset correspondence between image recognition operations and control instructions.
For example: hovering the palm facing the air conditioner camera for 1-5 seconds (for example, 2 seconds) represents a gesture operation that wakes up image recognition.
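A hover gesture of this kind, and its conversion through a preset correspondence, might be sketched as follows. The sample format, the pixel tolerance, and the instruction name `WAKE_IMAGE_CONTROL` are hypothetical; only the 2-second hold time comes from the example above.

```python
import math
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float, float]  # (time in seconds, x, y) of the tracked palm

# Hypothetical preset correspondence between operations and instructions.
INSTRUCTIONS = {"palm_hover": "WAKE_IMAGE_CONTROL"}


def is_hover(samples: Sequence[Sample], hold_s: float = 2.0,
             radius_px: float = 20.0) -> bool:
    """True if the palm stays within radius_px of one point for hold_s seconds."""
    if not samples:
        return False
    anchor_t, anchor_x, anchor_y = samples[0]
    for t, x, y in samples[1:]:
        if math.hypot(x - anchor_x, y - anchor_y) > radius_px:
            anchor_t, anchor_x, anchor_y = t, x, y  # palm moved: restart the timer
        elif t - anchor_t >= hold_s:
            return True
    return False


def to_instruction(samples: Sequence[Sample]) -> Optional[str]:
    """Convert a recognized hover into its preset instruction, else None."""
    return INSTRUCTIONS["palm_hover"] if is_hover(samples) else None
```

Keeping the correspondence in a table means further gestures can be added without changing the recognition code.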
Therefore, the sound source is positioned through voice recognition, the recognition area of image recognition can be reduced, the image recognition efficiency is improved, the operation process is simple, and the recognition efficiency is high.
In an alternative embodiment, the method may further include: before analyzing the sound in the current scene in step S110, the sound in the current scene is acquired.
For example: collecting voice.
Therefore, by acquiring the sound in the current scene, accurate and reliable basis can be provided for voice recognition.
Optionally, the acquiring the sound in the current scene may include: receiving the sound in the current scene collected by the sound collection module 1022. The sound collection module 1022 may include at least one of a microphone and a sound sensor.
For example: a microphone may be further disposed in cooperation with the sound collection module 1022 to amplify the sound collected by the sound collection module 1022, which is beneficial for the speech recognition module 104 to receive and recognize.
For example: in general, a microphone array (e.g., two-microphone, four-microphone, eight-microphone) is used for voice acquisition in the voice recognition control so as to locate the orientation of a speaker, then the speaker is locked and tracked, the voice of the speaker is analyzed and recognized, and the voice is converted into a control instruction to complete the control of the device.
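The patent does not state how the microphone array computes the orientation of the speaker. As one possibility, the sketch below uses the standard far-field time-difference-of-arrival relation for two microphones, sin(theta) = c * delay / spacing. The function name and the way the delay is obtained are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second in air at about 20 degrees Celsius


def bearing_from_delay(delay_s: float, mic_spacing_m: float) -> float:
    """Bearing in degrees from broadside for a far-field source and two microphones.

    delay_s is the arrival-time difference between the two microphones,
    assumed to have been measured elsewhere (for example by cross-correlation).
    """
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement noise
    return math.degrees(math.asin(ratio))
```

A pair of microphones resolves only a left-right angle and cannot tell front from back, which is one reason arrays with four or eight microphones are also used.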
For example: a voice device may include a voice acquisition module (e.g., a microphone) and a voice recognition module 104. For example: the microphone is connected to the voice recognition module 104.
Therefore, the sound is acquired through the sound acquisition module, the acquisition mode is simple and convenient, and the acquired sound is high in reliability.
In an alternative embodiment, the method may further include: before analyzing a first image area of the current scene in the area where the sound source is located in step S120, at least one of the image and the first image area in the current scene is obtained.
For example: acquiring at least one of the image in the current scene and the first image region may mean acquiring all images in the current scene first, or acquiring the partial image in the first image region first.
For example: collecting an image.
For example: the air conditioner is provided with an image device and a voice device.
For example: acquiring sound and images in a current scene (such as the environment where an electric appliance to be controlled is located).
For example: an image device may include: an image acquisition module 1024 (e.g., a camera) and an image recognition module 106. For example: the camera is connected to an image recognition module 106.
Therefore, by acquiring the image in the current scene, accurate and reliable basis can be provided for image recognition, and convenience is good.
Optionally, the acquiring at least one of the image in the current scene and the first image region may include: receive at least one of the first image region, the image in the current scene captured by image capture module 1024. The image acquisition module 1024 may include: at least one of a camera, an infrared sensor, a CCD image sensor and an ultrasonic sensor.
Therefore, the image is acquired through the image acquisition module, the acquisition mode is simple and convenient, and the acquired sound reliability is high.
In an alternative embodiment, the method may further include: before analyzing, in step S120, the first image region of the image of the current scene that lies in the region where the sound source is located, determining whether the region where the sound source is located, as obtained by the voice localization, is valid (for example, whether the region where the sound source is located is within the current scene, whether it is within a region that the electrical appliance to be controlled can recognize, whether the sound was made by a person, whether the sound is noise, and the like); when the region where the sound source is located is valid, analyzing the first image region; or, when the region where the sound source is located is invalid, skipping the analysis of the first image region, in which case all image regions may be analyzed to determine whether the image recognition operation exists in any of them.
For example: referring to the example shown in fig. 7, the voice localization information is valid when the area where the localized sound source is located falls within the image boundary area plus the D area.
For example: the image recognition module 106 determines whether the voice localization information is valid. When it is valid, the module maps the localized position into the image frame and searches a preset range around that position for an image operation signal, and performs the operation if one is found; otherwise, a normal search is carried out.
Therefore, determining the validity of the sound source localization information further improves the efficiency and reliability of image recognition and makes the device more user-friendly.
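The validity check described above can be sketched as follows. This is only an illustrative sketch: it assumes that fig. 7 defines validity as the localized position falling within the image width extended by a margin D on each side, and the function names, the pixel coordinate convention, and the use of None for an unusable localization result are all hypothetical.

```python
def is_localization_valid(source_x, image_width, margin_d):
    """Return True when the sound-source position lies within the image
    boundary extended by a margin D on each side (assumed reading of fig. 7)."""
    if source_x is None:  # no usable localization result, e.g. noise
        return False
    return -margin_d <= source_x <= image_width + margin_d


def choose_search_mode(source_x, image_width, margin_d):
    """Pick a local search around the source, or the normal full search."""
    if is_localization_valid(source_x, image_width, margin_d):
        return "local"
    return "full"
```

For instance, with a 640-pixel-wide image and D = 50, a source at column 100 would lead to a local search, while a source at column -60 would fall back to the normal search.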
In an alternative embodiment, the method may further include: when the image recognition operation is absent from the first image region (e.g., when the image recognition operation is absent from the first image region and/or when the region in which the sound source is located is invalid), a second image region of the image other than the first image region is analyzed to determine whether the image recognition operation is present in the second image region.
In an alternative embodiment, the method may further include: when the image recognition operation is absent from the first image region (e.g., when the image recognition operation is absent from the first image region and/or when the sound source is in an invalid region), all image regions in the image (e.g., all image regions including the first image region and the second image region) are analyzed to determine whether the image recognition operation is present in all image regions.
For example: when the voice localization information is invalid, all image frames are searched normally for an image operation signal; if one is found, the operation is performed, and otherwise recognition is repeated.
Therefore, when no image recognition operation exists in the area where the sound source is located, or when the localization of that area is invalid, image recognition is performed on the other areas or on all areas, which improves the comprehensiveness and accuracy of image recognition and makes the device more user-friendly.
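The search order described above (first image area first, then the remaining areas) might be sketched as below. The region identifiers and the `detect` callable are hypothetical stand-ins for the image recognition step; the embodiment does not prescribe how the image is divided into areas.

```python
def find_operation(regions, first_region, detect):
    """Search the first image area, then fall back to the remaining areas.

    regions: list of region identifiers covering the whole image.
    first_region: region at the sound source, or None if localization is invalid.
    detect: hypothetical callable returning an operation name or None.
    """
    order = []
    if first_region is not None:
        order.append(first_region)
    # The second image area is everything except the first image area.
    order.extend(r for r in regions if r != first_region)
    for region in order:
        operation = detect(region)
        if operation is not None:
            return region, operation
    return None, None  # nothing found: the caller may re-identify
```

When the localization is invalid, passing `first_region=None` makes the function degrade to a plain search of all areas.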
Extensive testing has shown that, with the technical solution of this embodiment, voice-recognition-assisted localization improves the efficiency and accuracy of image recognition.
According to the embodiment of the invention, an instruction recognition device (for example, an air conditioner control device based on image and voice recognition) corresponding to the instruction recognition method is also provided. Referring to fig. 4, a schematic diagram of an embodiment of the apparatus of the present invention is shown. The instruction recognition apparatus may include: a speech recognition module 104 and an image recognition module 106.
In an alternative example, the speech recognition module 104 may be configured to analyze the sound in the current scene, determine a sound source of the sound, and perform speech localization on the area where the sound source is located (e.g., by the speech recognition module 104, the direction, the position, etc. of the sound source are speech-localized). The specific functions and processes of the speech recognition module 104 are shown in step S110.
For example: the sound is analyzed, the sound source of the sound is determined, and the area where the sound source is located is localized.
For example: in voice recognition, the sound source is localized from ordinary speech or a voice command, and the position of the sound source is calculated to obtain the voice localization position information.
For example: the speech recognition module 104 can locate a sound source and can perform speech recognition on the sound source sound. For example: the voice recognition module 104 can analyze the voice information collected by the microphone and locate the sound source.
For example: the sound source can be analyzed and identified firstly, and then the sound source is positioned.
For example: the voice recognition module 104 collects and analyzes sound and locates the sound source in real time, and transfers the sound source information to, or shares it with, the image recognition module 106.
In an optional example, the image recognition module 106 may be configured to analyze a first image area of the current scene in the area where the sound source is located, and determine whether a set image recognition operation (e.g., a gesture operation) is performed in the first image area. The specific functions and processes of the image recognition module 106 are shown in step S120.
For example: with the area where the sound source is located used as a first recognition area, the image recognition module 106 analyzes the first image area, i.e., the part of the image that lies in the first recognition area, and determines whether an image recognition operation (for example, an image recognition operation directed at the electrical appliance to be controlled) exists in the first image area.
For example: based on the positioning function of both the image recognition and the voice recognition, the position of the sound source positioned by the voice recognition can be transmitted to the image recognition module 106, so that the image recognition module 106 preferentially analyzes the image of the sound source region, the efficiency of image recognition and positioning is improved, and an operator can be found out more quickly.
For example: when sound source position information is available, the image recognition module 106 preferentially analyzes the image area corresponding to the position of the sound source and searches it for a control operator, so as to quickly determine whether there is an image control request at the position of the sound source and to provide service sooner; if no operator is found there, the other areas are searched, so that the operator can be located more quickly and provided with services and interactions such as gesture operation.
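One possible way to turn a localized sound source direction into the image area to be analyzed first is sketched below. It is an assumption-laden illustration, not the method of the embodiment: it assumes a pinhole camera co-located with the microphones, an azimuth of 0 degrees on the optical axis, and a hypothetical half-width for the area around the projected column.

```python
import math


def first_image_area(azimuth_deg, image_width, fov_deg, half_width_px):
    """Map a sound-source azimuth to a horizontal pixel range.

    Assumes a pinhole camera co-located with the microphones, with
    azimuth 0 on the optical axis and positive angles to the right.
    Returns None when the source lies outside the field of view.
    """
    if abs(azimuth_deg) >= fov_deg / 2:
        return None
    # Focal length in pixels, derived from the horizontal field of view.
    focal_px = (image_width / 2) / math.tan(math.radians(fov_deg / 2))
    center = image_width / 2 + focal_px * math.tan(math.radians(azimuth_deg))
    left = max(0, int(center - half_width_px))
    right = min(image_width, int(center + half_width_px))
    return left, right
```

A source outside the field of view yields None, which a caller could treat in the same way as an invalid localization.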
In an optional example, the image recognition module 106 may be further configured to, when the image recognition operation is performed in the first image region, convert (e.g., in a set conversion manner) the image recognition operation into a desired instruction (e.g., a control instruction, an operation instruction, etc., capable of controlling an electrical appliance to be controlled in a current scene). The specific function and processing of the image recognition module 106 are also referred to in step S130.
When the image recognition operation is recognized, the control instruction corresponding to the corresponding image recognition operation can be determined according to the preset corresponding relation between the image recognition operation and the control instruction.
For example: holding the palm still, facing the air conditioner camera, for 1-5 seconds (for example, 2 seconds) represents the gesture operation that wakes up image recognition.
Therefore, the sound source is positioned through voice recognition, the recognition area of image recognition can be reduced, the image recognition efficiency is improved, the operation process is simple, and the recognition efficiency is high.
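The preset correspondence between image recognition operations and control instructions mentioned above could be held in a simple lookup table, as in the following sketch. Only the palm-hover wake-up gesture and its example duration of 2 seconds come from the description; the other gesture names and all instruction strings are hypothetical.

```python
# Hypothetical gesture names and instruction strings, for illustration only.
GESTURE_TO_INSTRUCTION = {
    "palm_hover": "wake_image_control",
    "swipe_up": "temperature_up",
    "swipe_down": "temperature_down",
}


def to_instruction(gesture, hover_seconds=0.0, min_hover=2.0):
    """Convert a recognized operation into an instruction, or None."""
    if gesture == "palm_hover" and hover_seconds < min_hover:
        return None  # palm not held long enough to count as a wake-up
    # Unknown gestures map to None, i.e. no instruction is issued.
    return GESTURE_TO_INSTRUCTION.get(gesture)
```

Keeping the correspondence in a table means that new gestures can be added without changing the conversion logic.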
Optionally, the image recognition module 106 may include: a face recognition module and an action recognition module.
In an alternative specific example, the face recognition module may be configured to perform face recognition on the first image region, and determine whether there is an operator in the first image region. The specific functions and processes of the face recognition module are also shown in step S210.
For example: through face recognition, whether the first image area contains a person can be determined.
For example: face entry (entering the operator's face information into the image recognition system): by making a sound or issuing a voice command (such as a face entry command), the operator enables the voice recognition module 104 to locate the operator's position, so that the image recognition module 106 is directed to the position of the person whose face information is to be entered and collects the face information.
In an alternative specific example, the action recognition module (for example: gesture recognition module) may be configured to determine whether the operator has the image recognition operation when the operator is in the first image area, and determine that the image recognition operation is in the first image area when the operator has the image recognition operation; or when the operator does not have the image recognition operation, determining that the image recognition operation is not present in the first image region. The specific function and processing of the motion recognition module are also shown in step S220.
For example: the image recognition module 106 has a function of analyzing and recognizing the captured image, for example: the system has the functions of gesture recognition, face detection (for example, the face detection can detect whether a certain object is a person) and the like, can analyze and position the position of an operator on a collected picture (namely a collected image), locks and tracks the gesture pattern of the operator, converts the gesture pattern into a specific air conditioner control instruction and realizes the control of the air conditioner.
For example: gesture operation: the operator may cause the voice recognition module 104 to be positioned at the operator's location by sounding a sound or voice command (e.g., gesture control) so that the image recognition system can be quickly and accurately positioned at the operator.
Therefore, whether an operator exists or not is determined through face recognition, and image recognition operation is determined through action recognition when the operator exists, so that the recognition efficiency is high, and the accuracy is good.
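The two-stage decision of steps S210 and S220 can be sketched as a simple gate. The `detect_face` and `detect_action` callables are hypothetical placeholders for the face recognition module and the action recognition module.

```python
def has_image_operation(region, detect_face, detect_action):
    """Return True only if an operator is present and performs an operation."""
    if not detect_face(region):  # step S210: is there an operator?
        return False
    # Step S220: action recognition runs only once an operator is found.
    return detect_action(region) is not None
```

Because action recognition is skipped when no face is found, areas without an operator cost only one face detection pass in this sketch.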
More optionally, the determining, by the action recognition module, whether the operator has the image recognition operation may specifically include: locating, by image localization, the position of the operator in the first image region. The specific function and processing of the action recognition module are also described in step S310.
For example: the action recognition module can comprise an image positioning module, and image positioning can be carried out through the image positioning module.
More optionally, the determining, by the action recognition module, whether the operator has the image recognition operation may further include: locking onto the operator according to the position, and tracking the action of the operator. The specific function and processing of the action recognition module are also described in step S320.
For example: the action recognition module can comprise a locking tracking module, and locking and tracking can be performed through the locking tracking module.
More optionally, the determining, by the action recognition module, whether the operator has the image recognition operation may further include: performing gesture recognition on the action and determining whether the action is a set operation (such as a gesture operation), so that the action is determined to be an image recognition operation when it is a gesture operation, or is determined not to be an image recognition operation when it is not a gesture operation. The specific function and processing of the action recognition module are also described in step S330.
For example: the image recognition operation may include: gesture operation: the operator can make the voice recognition module locate at the position of the operator by making sound or voice command (such as gesture control), so that the image recognition system can quickly and accurately locate the operator.
For example: the image recognition operation may further include: face entry (entry of operator face information into the image recognition system): the operator can make the voice recognition module locate at the position of the operator by making a sound or voice command (such as face input), and make the image recognition module locate at the position of the person to be input with face information, so as to collect the face information.
For example: the action recognition module may include a gesture recognition module, through which action recognition (e.g., gesture recognition) may be performed.
For example: image recognition control (for example, gesture control requires locating and tracking a gesture, and face detection control requires locating and detecting a face) is mostly based on analyzing and recognizing captured images: the position of the operator is located first, and the operator's actions are then tracked and analyzed to complete a series of control operations on the air conditioner. However, a complex scene background may cause localization to fail; for example, if the background picture contains a human figure, misrecognition is likely and gesture control cannot be performed.
Therefore, the position of the operator is positioned, the action of the operator is locked and tracked, and then image recognition operations such as gestures of the operator are acquired, so that the reliability is high, and the accuracy is good.
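The locate, lock-and-track, and gesture recognition steps (S310 to S330) might be approximated as follows. This sketch assumes a hypothetical detector that returns one horizontal position per person and a tracker that yields the hand position per frame; the nearest-to-source locking rule and the travel threshold are illustrative choices, not taken from the embodiment.

```python
def lock_operator(detections, source_x):
    """Lock onto the detected person closest to the sound-source column.

    detections: list of (person_id, x_center) pairs from a hypothetical detector.
    """
    if not detections:
        return None
    return min(detections, key=lambda d: abs(d[1] - source_x))[0]


def classify_motion(hand_xs, min_travel=100):
    """Classify a tracked hand trajectory (x positions, one per frame)."""
    if len(hand_xs) < 2:
        return None
    travel = hand_xs[-1] - hand_xs[0]
    if travel >= min_travel:
        return "swipe_right"
    if travel <= -min_travel:
        return "swipe_left"
    return None  # the motion is not a set operation
```

Locking onto the person nearest the sound source is one way to avoid selecting a figure in the background, which is the failure mode the description attributes to image-only localization.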
In an alternative embodiment, the apparatus may further include: a receiving module 102.
In an alternative example, the receiving module 102 may be configured to obtain the sound in the current scene before the analyzing of the sound in the current scene.
For example: voice is collected.
Therefore, by acquiring the sound in the current scene, accurate and reliable basis can be provided for voice recognition.
Optionally, the receiving module 102 may include: the sound collection module 1022.
In an alternative specific example, the receiving module 102 may be configured to receive, through the sound collecting module 1022, the sound collected by the sound collecting module 1022 in the current scene. The sound collection module 1022 may include: at least one of a microphone, a sound sensor.
For example: a microphone may be further disposed in cooperation with the sound collection module 1022 to amplify the sound collected by the sound collection module 1022, which is beneficial for the speech recognition module 104 to receive and recognize.
For example: in general, a microphone array (e.g., two-microphone, four-microphone, eight-microphone) is used for voice acquisition in the voice recognition control so as to locate the orientation of a speaker, then the speaker is locked and tracked, the voice of the speaker is analyzed and recognized, and the voice is converted into a control instruction to complete the control of the device.
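For a two-microphone array, the speaker's orientation is commonly estimated from the time difference of arrival between the two microphones. The sketch below uses the standard far-field relation sin(angle) = c * delay / spacing; it is a generic illustration and is not stated in the description, and the function name and the fixed speed of sound are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees Celsius


def azimuth_from_delay(delay_s, mic_spacing_m):
    """Estimate the arrival angle in degrees from a two-microphone delay.

    Far-field assumption: sin(angle) = c * delay / spacing, with 0 degrees
    broadside to the array. Returns None for a physically impossible delay.
    """
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    if abs(ratio) > 1.0:
        return None
    return math.degrees(math.asin(ratio))
```

A delay that is too large for the microphone spacing returns None, which could be treated as an unusable localization result.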
For example: a speech device may include a sound acquisition module (e.g., a microphone) and a voice recognition module 104. For example: the microphone is connected to the voice recognition module 104.
Therefore, acquiring the sound through the sound acquisition module is simple and convenient, and the acquired sound is highly reliable.
In an optional example, the receiving module 102 may be further configured to acquire at least one of the image of the current scene and the first image region before the first image region of the image of the current scene, which lies in the area where the sound source is located, is analyzed.
For example: acquiring at least one of the image in the current scene and the first image region may mean that the whole image of the current scene is acquired first, or that only the partial image in the first image region is acquired first.
For example: an image is collected.
For example: the air conditioner is provided with an image device and a voice device.
For example: acquiring sound and images in a current scene (such as the environment where an electric appliance to be controlled is located).
For example: an image device may include: an image acquisition module 1024 (e.g., a camera) and an image recognition module 106. For example: the camera is connected to an image recognition module 106.
Therefore, by acquiring the image in the current scene, accurate and reliable basis can be provided for image recognition, and convenience is good.
Optionally, the receiving module 102 may further include: an image acquisition module 1024.
In an optional specific example, the receiving module 102 may be further configured to receive at least one of the image in the current scene and the first image region acquired by the image acquiring module 1024. The image acquisition module 1024 may include: at least one of a camera, an infrared sensor, a CCD image sensor and an ultrasonic sensor.
Therefore, acquiring the image through the image acquisition module is simple and convenient, and the acquired image is highly reliable.
In an optional embodiment, the image recognition module 106 may be further configured to determine, before the first image area of the image of the current scene that lies in the area where the sound source is located is analyzed, whether the area where the sound source is located, as obtained by the voice localization, is valid (for example, whether the area where the sound source is located is within the current scene, whether it is within an area that the electrical appliance to be controlled can recognize, whether the sound was made by a person, whether the sound is noise, and the like); when the area where the sound source is located is valid, to analyze the first image area; or, when the area where the sound source is located is invalid, to skip the analysis of the first image area, in which case all image areas may be analyzed to determine whether the image recognition operation exists in any of them.
For example: referring to the example shown in fig. 7, the voice localization information is valid when the area where the localized sound source is located falls within the image boundary area plus the D area.
For example: the image recognition module 106 determines whether the voice localization information is valid. When it is valid, the module maps the localized position into the image frame and searches a preset range around that position for an image operation signal, and performs the operation if one is found; otherwise, a normal search is carried out.
Therefore, determining the validity of the sound source localization information further improves the efficiency and reliability of image recognition and makes the device more user-friendly.
In an optional embodiment, the image recognition module 106 may be further configured to analyze a second image region of the image except the first image region to determine whether the image recognition operation exists in the second image region when the image recognition operation does not exist in the first image region and/or when the region where the sound source is located is invalid.
In an optional embodiment, the image recognition module 106 may be further configured to analyze all image areas (e.g., all image areas including the first image area and the second image area) in the image to determine whether the image recognition operation is performed in all image areas when the image recognition operation is not performed in the first image area and/or when the area where the sound source is located is invalid.
For example: when the voice localization information is invalid, all image frames are searched normally for an image operation signal; if one is found, the operation is performed, and otherwise recognition is repeated.
Therefore, when no image recognition operation exists in the area where the sound source is located, or when the localization of that area is invalid, image recognition is performed on the other areas or on all areas, which improves the comprehensiveness and accuracy of image recognition and makes the device more user-friendly.
Since the processes and functions implemented by the apparatus of this embodiment substantially correspond to the embodiments, principles and examples of the method shown in fig. 1 to 3, the description of this embodiment is not detailed, and reference may be made to the related descriptions in the foregoing embodiments, which are not repeated herein.
Through a large number of tests, the technical scheme of the invention improves the accuracy of locking the direction of an operator by image recognition control through combining image recognition with voice recognition positioning, thereby improving the accuracy of subsequent control.
According to the embodiment of the invention, a storage device corresponding to the instruction recognition method is also provided. The storage device stores a plurality of instructions; the instructions are adapted to be loaded by a processor to execute the instruction recognition method.
For example: the instructions are loaded by a processor (such as the processor 204) to execute the steps of the instruction recognition method.
Since the processing and functions implemented by the storage device of this embodiment substantially correspond to the embodiments, principles, and examples of the methods shown in fig. 1 to fig. 3, details are not described in the description of this embodiment, and reference may be made to the related descriptions in the foregoing embodiments, which are not described herein again.
Through a large number of tests, the technical scheme of the invention is adopted to identify and position the sound source position through voice, so as to provide positioning reference for image identification and positioning of the position of an operator, and improve the response speed and accuracy of the air conditioner controlled by image identification.
According to the embodiment of the invention, the mobile terminal corresponding to the instruction identification method is also provided. Referring to fig. 5, a schematic structural diagram of an embodiment of a mobile terminal according to the present invention is shown. The mobile terminal may include: a memory 202 and a processor 204.
Optionally, memory 202 may be used to store a plurality of instructions.
Optionally, the processor 204 may be configured to execute a plurality of instructions.
The instructions are stored in the memory 202 and are loaded and executed by the processor 204.
Since the processes and functions implemented by the mobile terminal of this embodiment substantially correspond to the embodiments, principles, and examples of the methods shown in fig. 1 to fig. 3, details are not described in the description of this embodiment, and reference may be made to the related descriptions in the foregoing embodiments, which are not described herein again.
Through a large number of tests, the technical scheme of the invention is adopted to determine the orientation of the person through voice recognition and positioning, thereby providing reference for image recognition, quickly positioning the accurate position of the person in the image and improving the image recognition efficiency and accuracy.
According to the embodiment of the invention, an electric appliance corresponding to the instruction recognition device is also provided. The appliance may include: at least one of the above-mentioned instruction recognition device, the above-mentioned storage apparatus, and the above-mentioned mobile terminal.
Optionally, the electric appliance may include: at least one of an air conditioner (for example, an air conditioner capable of performing automatic instruction recognition based on image recognition), a refrigerator, a television, a water heater, a water dispenser, an air purifier and a range hood.
For example: the electric appliance may include household appliances such as an air conditioner, a water heater, a water dispenser, a television and an air purifier.
For example: referring to the examples shown in fig. 6 and 8, the main controller of the air conditioner may be configured to be adapted to the voice recognition module 104, the image recognition module 106, and the like.
The camera may be arranged to work with the image recognition module 106, and the microphone may be arranged to work with the voice recognition module 104.
Since the processes and functions implemented by the electrical appliance of this embodiment substantially correspond to the embodiments, principles, and examples of at least one of the storage device, the mobile terminal, and the instruction recognition device shown in the foregoing, no details are given in the description of this embodiment, and reference may be made to the related descriptions in the foregoing embodiments, which are not described herein again.
After a large number of tests and verifications, the technical scheme of the invention is adopted to perform face recognition in a face authentication registration mode, which is beneficial to improving the accuracy and reliability of face recognition and further improving the accuracy of reliably playing instruction recognition information to an object to be identified by an instruction.
Through a large number of tests, the technical scheme of the invention is adopted to identify and position the sound source position through voice, so as to provide positioning reference for image identification and positioning of the position of an operator, and improve the response speed and accuracy of the air conditioner controlled by image identification.
In summary, it is readily understood by those skilled in the art that the advantageous modes described above can be freely combined and superimposed without conflict.
The above description is only an example of the present invention, and is not intended to limit the present invention, and it is obvious to those skilled in the art that various modifications and variations can be made in the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.
Claims (16)
1. An instruction recognition method, comprising:
analyzing sound in a current scene, determining a sound source of the sound, and performing voice localization on an area where the sound source is located;
analyzing a first image area of an image of the current scene, the first image area lying in the area where the sound source is located, and determining whether a set image recognition operation exists in the first image area;
and when the image recognition operation exists in the first image area, converting the image recognition operation into a required instruction.
2. The method of claim 1, further comprising:
acquiring the sound in the current scene before analyzing the sound in the current scene;
and/or,
before analyzing a first image area of the image of the current scene in the area where the sound source is located, acquiring at least one of the image of the current scene and the first image area;
and/or,
before analyzing a first image area of the current scene image in the area of the sound source, determining whether the area of the sound source obtained by the voice positioning is effective; so that the first image area is only analyzed when the area where the sound source is located is valid.
3. The method of claim 2,
the acquiring the sound in the current scene includes:
receiving the sound in the current scene collected by a sound collection module; wherein, the sound collection module includes: at least one of a microphone, a sound sensor;
and/or,
the acquiring at least one of the image in the current scene and the first image region includes:
receiving at least one of the image in the current scene and the first image region acquired by an image acquisition module; wherein the image acquisition module includes: at least one of a camera, an infrared sensor, a CCD image sensor and an ultrasonic sensor.
4. The method of any one of claims 1-3, further comprising:
when the image recognition operation is absent in the first image region,
analyzing a second image area in the image except the first image area to determine whether the image identification operation exists in the second image area; or,
analyzing all image areas in the image to determine whether the image recognition operation exists in all the image areas.
5. The method according to any one of claims 1-4, wherein said determining whether there is a set image recognition operation in said first image region comprises:
carrying out face recognition on the first image area, and determining whether an operator exists in the first image area;
when the operator exists in the first image area, determining whether the operator has the image recognition operation, and when the operator has the image recognition operation, determining that the image recognition operation exists in the first image area.
6. The method of claim 5, wherein determining whether the operator has the image recognition operation comprises:
locating, by image localization, the position of the operator in the first image region;
locking onto the operator according to the position, and tracking the action of the operator;
and performing gesture recognition on the action, and determining whether the action is a set operation, so that when the action is a gesture operation, the action is determined to be an image recognition operation.
7. An instruction recognition apparatus, comprising:
a voice recognition module, configured to analyze sound in a current scene, determine a sound source of the sound, and perform voice localization on an area where the sound source is located;
the image recognition module is used for analyzing a first image area of the current scene image in the area where the sound source is located and determining whether a set image recognition operation exists in the first image area;
the image recognition module is further configured to convert the image recognition operation into a required instruction when the image recognition operation exists in the first image region.
8. The apparatus of claim 7, further comprising:
a receiving module, configured to acquire the sound in the current scene before the sound in the current scene is analyzed;
and/or,
the receiving module is further configured to acquire at least one of the image of the current scene and the first image area before the first image area of the image of the current scene, which lies in the area where the sound source is located, is analyzed;
and/or,
the image recognition module is further configured to determine whether the area where the sound source is located obtained by the voice positioning is valid before analyzing a first image area where the image of the current scene is located in the area where the sound source is located; so that the first image area is only analyzed when the area where the sound source is located is valid.
9. The apparatus of claim 8,
the receiving module comprises:
a sound collection module, wherein the receiving module is configured to receive, through the sound collection module, the sound in the current scene collected by the sound collection module; wherein the sound collection module comprises at least one of a microphone and a sound sensor;
and/or,
the receiving module further comprises:
an image acquisition module, wherein the receiving module is further configured to receive at least one of the image of the current scene and the first image area acquired by the image acquisition module; wherein the image acquisition module comprises at least one of a camera, an infrared sensor, a CCD image sensor and an ultrasonic sensor.
10. The apparatus of any of claims 7-9, wherein:
the image recognition module is further configured to, when the image recognition operation does not exist in the first image area,
analyze a second image area in the image other than the first image area to determine whether the image recognition operation exists in the second image area; or,
analyze all image areas in the image to determine whether the image recognition operation exists in all the image areas.
11. The apparatus according to any one of claims 7-10, wherein the image recognition module comprises:
a face recognition module, configured to perform face recognition on the first image area and determine whether an operator exists in the first image area;
and an action recognition module, configured to determine whether the operator performs the image recognition operation when the operator exists in the first image area, and to determine that the image recognition operation exists in the first image area when the operator performs the image recognition operation.
12. The apparatus of claim 11, wherein the action recognition module determining whether the operator performs the image recognition operation specifically comprises:
locating, by image localization, the position of the operator in the first image area;
locking onto the operator according to the position, and tracking the action of the operator;
and performing gesture recognition on the action, and determining whether the action is a set gesture operation, so that when the action is the set gesture operation, the action is determined to be the image recognition operation.
13. A storage device having a plurality of instructions stored therein, the plurality of instructions being adapted to be loaded by a processor to perform the instruction recognition method of any one of claims 1-6.
14. A mobile terminal, comprising:
a processor for executing a plurality of instructions;
a memory to store a plurality of instructions;
wherein the plurality of instructions are stored by the memory, and are loaded and executed by the processor to perform the instruction recognition method of any of claims 1-6.
15. An electrical appliance, comprising: at least one of an instruction recognition apparatus according to any of claims 7-12, a storage device according to claim 13, and a mobile terminal according to claim 14.
16. The electrical appliance according to claim 15, wherein the electrical appliance comprises at least one of an air conditioner, a refrigerator, a television, a water heater, a water dispenser, an air purifier and a smoke exhaust ventilator.
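The control flow recited in claims 5, 6 and 10 to 12 (face recognition on the first image area, gesture recognition on the operator's action, and a fallback to the remaining image areas) might be sketched as follows. This is a minimal illustration only, not the applicant's implementation: the `Region` tuple, the `Operator` record, the `SET_GESTURES` table and the `find_operator` callback are all hypothetical stand-ins for the face recognition, tracking and gesture recognition stages, which the claims do not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional, Tuple

# Hypothetical representation of an image area: (x, y, width, height).
Region = Tuple[int, int, int, int]


@dataclass(frozen=True)
class Operator:
    """Hypothetical result of face recognition plus action tracking."""
    position: Tuple[int, int]  # position of the operator in the image area
    action: str                # label of the tracked action


# Hypothetical table of set gesture operations and the instructions they map to.
SET_GESTURES: Dict[str, str] = {
    "swipe_up": "temperature_up",
    "swipe_down": "temperature_down",
}

FindOperator = Callable[[Region], Optional[Operator]]


def recognize_operation(region: Region, find_operator: FindOperator) -> Optional[str]:
    """Return the set gesture found in `region`, or None (claims 5 and 6)."""
    operator = find_operator(region)  # stand-in for face recognition
    if operator is None:
        return None  # no operator in this image area
    # Stand-in for gesture recognition: only set gestures count.
    return operator.action if operator.action in SET_GESTURES else None


def identify_instruction(
    first_area: Region,
    fallback_areas: Iterable[Region],
    find_operator: FindOperator,
) -> Optional[str]:
    """Check the first image area, then the fallback areas (claim 10)."""
    for area in (first_area, *fallback_areas):
        operation = recognize_operation(area, find_operator)
        if operation is not None:
            # Convert the image recognition operation into an instruction.
            return SET_GESTURES[operation]
    return None
```

In this sketch the first image area is the one selected by voice localization, and the fallback areas are only examined when no set gesture is found there, which mirrors the order of the claims. Passing a dictionary's `get` method as `find_operator` is enough to exercise the logic without a camera.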
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710353318.1A CN107247923A (en) | 2017-05-18 | 2017-05-18 | Instruction identification method and device, storage equipment, mobile terminal and electric appliance |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710353318.1A CN107247923A (en) | 2017-05-18 | 2017-05-18 | Instruction identification method and device, storage equipment, mobile terminal and electric appliance |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107247923A (en) | 2017-10-13 |
Family
ID=60017030
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710353318.1A Pending CN107247923A (en) | 2017-05-18 | 2017-05-18 | Instruction identification method and device, storage equipment, mobile terminal and electric appliance |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107247923A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110486859A (en) * | 2019-08-28 | 2019-11-22 | 广东美的制冷设备有限公司 | Multi-split air conditioner and its control method, control device and readable storage medium storing program for executing |
CN110505399A (en) * | 2019-08-13 | 2019-11-26 | 聚好看科技股份有限公司 | Control method, device and the acquisition terminal of Image Acquisition |
CN110916576A (en) * | 2018-12-13 | 2020-03-27 | 成都家有为力机器人技术有限公司 | Cleaning method based on voice and image recognition instruction and cleaning robot |
CN111145252A (en) * | 2019-11-11 | 2020-05-12 | 云知声智能科技股份有限公司 | Sound source direction judging system assisted by images on child robot |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102298443A (en) * | 2011-06-24 | 2011-12-28 | 华南理工大学 | Smart home voice control system combined with video channel and control method thereof |
CN103685905A (en) * | 2012-09-17 | 2014-03-26 | 联想(北京)有限公司 | Photographing method and electronic equipment |
CN103900204A (en) * | 2014-03-25 | 2014-07-02 | 四川长虹电器股份有限公司 | Air conditioner adjustment method and air conditioner |
CN104954673A (en) * | 2015-06-11 | 2015-09-30 | 广东欧珀移动通信有限公司 | Camera rotating control method and user terminal |
CN105204628A (en) * | 2015-09-01 | 2015-12-30 | 涂悦 | Voice control method based on visual awakening |
CN105975079A (en) * | 2016-05-17 | 2016-09-28 | 珠海格力电器股份有限公司 | Information processing method and device for air conditioner |
CN106203259A (en) * | 2016-06-27 | 2016-12-07 | 旗瀚科技股份有限公司 | The mutual direction regulating method of robot and device |
CN106440192A (en) * | 2016-09-19 | 2017-02-22 | 珠海格力电器股份有限公司 | Household appliance control method, device and system and intelligent air conditioner |
Non-Patent Citations (1)
Title |
---|
黄仕翰: "Gesture and Voice Remote Control Technology for Home Appliances" (手势及声音遥控家电技术), 《HTTP://IR.LIB.NCU.EDU.TW/HANDLE/987654321/48423》 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110916576A (en) * | 2018-12-13 | 2020-03-27 | 成都家有为力机器人技术有限公司 | Cleaning method based on voice and image recognition instruction and cleaning robot |
CN110505399A (en) * | 2019-08-13 | 2019-11-26 | 聚好看科技股份有限公司 | Control method, device and the acquisition terminal of Image Acquisition |
CN110486859A (en) * | 2019-08-28 | 2019-11-22 | 广东美的制冷设备有限公司 | Multi-split air conditioner and its control method, control device and readable storage medium storing program for executing |
CN110486859B (en) * | 2019-08-28 | 2020-12-22 | 广东美的制冷设备有限公司 | Multi-split air conditioning system, control method and device thereof and readable storage medium |
CN111145252A (en) * | 2019-11-11 | 2020-05-12 | 云知声智能科技股份有限公司 | Sound source direction judging system assisted by images on child robot |
CN111145252B (en) * | 2019-11-11 | 2023-05-30 | 云知声智能科技股份有限公司 | Sound source direction judging system assisted by images on children robot |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103970260B (en) | A kind of non-contact gesture control method and electric terminal equipment | |
CN107247923A (en) | Instruction identification method and device, storage equipment, mobile terminal and electric appliance | |
US11373646B2 (en) | Household appliance control method, device and system, and intelligent air conditioner by determining user sound source location based on analysis of mouth shape | |
CN106225174B (en) | Air conditioner control method and system and air conditioner | |
WO2018210219A1 (en) | Device-facing human-computer interaction method and system | |
US11847857B2 (en) | Vehicle device setting method | |
CN106991395B (en) | Information processing method and device and electronic equipment | |
CN110434853B (en) | Robot control method, device and storage medium | |
US11263446B2 (en) | Method for person re-identification in closed place, system, and terminal device | |
WO2021136975A1 (en) | Image processing methods and apparatuses, electronic devices, and storage media | |
CN110223690A (en) | The man-machine interaction method and device merged based on image with voice | |
CN108363557A (en) | Man-machine interaction method, device, computer equipment and storage medium | |
WO2006080161A1 (en) | Speech content recognizing device and speech content recognizing method | |
CN109373518B (en) | Air conditioner and voice control device and voice control method thereof | |
JP2000356674A (en) | Sound source identification device and its identification method | |
US20210201478A1 (en) | Image processing methods, electronic devices, and storage media | |
CN103677254B (en) | Method and apparatus for recording operation | |
CN112102546A (en) | Man-machine interaction control method, talkback calling method and related device | |
CN110941992A (en) | Smile expression detection method and device, computer equipment and storage medium | |
CN106813669A (en) | The modification method and device of movable information | |
CN111291671A (en) | Gesture control method and related equipment | |
CN113934307B (en) | Method for starting electronic equipment according to gestures and scenes | |
CN112767931A (en) | Voice interaction method and device | |
JP2019176332A (en) | Speech extracting device and speech extracting method | |
CN111077997B (en) | Click-to-read control method in click-to-read mode and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20171013 |