CN109886070A - Device control method, apparatus, storage medium and device - Google Patents
- Publication number: CN109886070A (application CN201811584166.7A)
- Authority
- CN
- China
- Prior art keywords
- gesture
- gestures
- image
- frame images
- continuous
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- User Interface Of Digital Computer (AREA)
- Image Analysis (AREA)
Abstract
The present invention provides a device control method, apparatus, storage medium and device. The method comprises: when a voice command is received from a user, identifying the sound source direction of the voice command so as to determine the direction in which the user is located; driving the camera of the device to turn toward the direction of the user, and capturing N consecutive frames of gesture images of the user's dynamic gesture; and performing gesture recognition on the captured N consecutive gesture-image frames to identify the meaning of the dynamic gesture, so that the device is controlled according to that gesture meaning. The scheme provided by the invention can expand the effective field of view of the camera.
Description
Technical field
The present invention relates to the field of device control, and more particularly to a device control method, apparatus, storage medium and device.
Background technique
Currently, the camera on an air conditioner has a limited field of view, so a user performing gesture control must remain within that field of view; otherwise the camera cannot capture the user's gesture motion. Moreover, existing gesture recognition generally captures static gestures and classifies the meaning each gesture expresses, which makes for a poor user experience.
Summary of the invention
A primary object of the present invention is to overcome the above defects of the prior art and to provide a device control method, apparatus, storage medium and device, so as to solve the prior-art problem that the camera's field of view is limited and the user must stay within it when performing gesture control.
One aspect of the present invention provides a device control method, comprising: when a voice command is received from a user, identifying the sound source direction of the voice command so as to determine the direction in which the user is located; driving the camera of the device to turn toward the direction of the user, and capturing N consecutive frames of gesture images of the user's dynamic gesture; and performing gesture recognition on the captured N consecutive gesture-image frames to identify the gesture meaning of the dynamic gesture, so as to control the device according to that gesture meaning.
Optionally, performing gesture recognition on the captured N consecutive gesture-image frames to identify the gesture meaning of the dynamic gesture comprises: identifying the gesture region of the dynamic gesture in each frame of the N consecutive gesture-image frames, and performing preset processing based on the gesture region to obtain a gesture-region image for each frame; performing dual-channel convolution on the gesture-region image of each frame to obtain N consecutive feature maps of the gesture-region images; performing convolution on the N consecutive feature maps with a 3D convolutional neural network model to obtain motion feature information of the gesture region; and recognizing the N consecutive feature maps with a preset recurrent neural network model and the motion feature information of the gesture region, thereby obtaining the gesture meaning of the dynamic gesture.
Optionally, performing preset processing based on the gesture region to obtain the gesture-region image of each frame comprises: blurring or removing the image regions of each frame other than the gesture region, so as to obtain the gesture-region image of each frame.
Optionally, performing dual-channel convolution on the gesture-region image of each frame to obtain the N consecutive feature maps of the gesture-region images comprises: convolving the color image and the depth image of the gesture-region image of each frame separately to obtain a color feature map and a depth feature map; and merging the color feature map and the depth feature map to obtain the feature map of the gesture-region image of that frame.
Optionally, the device comprises an air conditioner, and the method further comprises: after determining the direction in which the user is located, controlling the air-outlet direction of the air conditioner according to the direction of the user.
Another aspect of the present invention provides a device control apparatus, comprising: a voice recognition unit for identifying, when a voice command is received from a user, the sound source direction of the voice command so as to determine the direction in which the user is located; a driving unit for driving the camera of the device to turn toward the direction of the user; an image acquisition unit for capturing N consecutive frames of gesture images of the user's dynamic gesture; an image recognition unit for performing gesture recognition on the N consecutive gesture-image frames captured by the image acquisition unit to identify the gesture meaning of the dynamic gesture; and a control unit for controlling the device according to the gesture meaning identified by the image recognition unit.
Optionally, the image recognition unit comprises: a gesture region recognition unit for identifying the gesture region of the dynamic gesture in each frame of the N consecutive gesture-image frames and performing preset processing based on the gesture region to obtain a gesture-region image for each frame; a dual-channel processing unit for performing dual-channel convolution on the gesture-region image of each frame to obtain N consecutive feature maps of the gesture-region images; a 3D convolution processing unit for performing convolution on the N consecutive feature maps with a 3D convolutional neural network model to obtain motion feature information of the gesture region; and a gesture meaning recognition unit for recognizing the N consecutive feature maps with a preset recurrent neural network model and the motion feature information of the gesture region, thereby obtaining the gesture meaning of the dynamic gesture.
Optionally, the gesture region recognition unit performs the preset processing based on the gesture region by blurring or removing the image regions of each frame other than the gesture region, so as to obtain the gesture-region image of each frame.
Optionally, the dual-channel processing unit performs the dual-channel convolution on the gesture-region image of each frame by: convolving the color image and the depth image of the gesture-region image of each frame separately to obtain a color feature map and a depth feature map; and merging the color feature map and the depth feature map to obtain the feature map of the gesture-region image of that frame.
Optionally, the device comprises an air conditioner, and the control unit is further configured to control, after the direction in which the user is located has been determined, the air-outlet direction of the air conditioner according to the direction of the user.
Another aspect of the present invention provides a storage medium on which a computer program is stored; when the program is executed by a processor, the steps of any of the foregoing methods are implemented.
A further aspect of the present invention provides a device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the steps of any of the foregoing methods are implemented.
A further aspect of the present invention provides a device comprising any of the foregoing device control apparatuses.
According to the technical scheme of the present invention, the direction of the user is determined by identifying the sound source direction of the user's voice, and the camera is driven to turn according to that direction; this expands the camera's effective field of view, fuses voice and image interaction, and improves the intelligent interaction between the user and the air conditioner. The invention performs gesture-region recognition on the gesture images and applies preset processing to obtain gesture-region images; by blurring or cropping away irrelevant information in the images, the amount of image data to be processed is reduced. Dual-channel convolution of the N consecutive captured gesture-image frames yields N consecutive feature maps, which are recognized with a 3D convolutional neural network model to obtain motion features and thereby identify the meaning of the dynamic gesture; this improves the recognition rate of gesture recognition, enables recognition of dynamic gestures, and enhances the degree of gesture discrimination.
Detailed description of the invention
The drawings described herein are provided to aid further understanding of the present invention and constitute a part of it; the illustrative embodiments of the invention and their descriptions explain the invention and do not improperly limit it. In the drawings:
Fig. 1 is a schematic diagram of one embodiment of the device control method provided by the present invention;
Fig. 2 is a flow diagram of a specific embodiment of the step of performing gesture recognition on the captured N consecutive gesture-image frames to identify the gesture meaning of the dynamic gesture, according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of dual-channel convolution of the gesture-region image of each frame, according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a specific embodiment of performing gesture recognition on continuously captured gesture images, according to an embodiment of the present invention;
Fig. 5 is a structural schematic diagram of one embodiment of the device control apparatus provided by the present invention;
Fig. 6 is a structural schematic diagram of a specific embodiment of the image recognition unit, according to an embodiment of the present invention.
Specific embodiment
To make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the invention are described clearly and completely below in conjunction with specific embodiments and the corresponding drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects and not to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the invention described herein can be implemented in sequences other than those illustrated or described. Moreover, the terms "comprising" and "having", and any variants thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
Fig. 1 is a schematic diagram of one embodiment of the device control method provided by the present invention. As shown in Fig. 1, according to one embodiment of the invention, the device control method includes at least step S110, step S120 and step S130.
Step S110: when a voice command is received from a user, identify the sound source direction of the voice command so as to determine the direction in which the user is located. Specifically, the sound source direction of the voice command is identified by a deep-learning voice module; the sound source direction is the direction of the user who issued the voice command, i.e. the direction of the user relative to the device.
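The patent attributes sound-source localization to an unspecified deep-learning voice module. As a hedged illustration of the underlying geometry only, the sketch below estimates direction from the time difference of arrival (TDOA) between two microphones — a classical stand-in, not the patented method; all names and values are hypothetical:

```python
import numpy as np

SOUND_SPEED = 343.0  # speed of sound in air, m/s

def estimate_direction(sig_left, sig_right, mic_distance_m, sample_rate):
    """Estimate the sound-source azimuth from the inter-microphone delay.

    Classical TDOA via cross-correlation; the patent instead uses an
    unspecified deep-learning voice module, so this is only an analogue.
    """
    corr = np.correlate(sig_left, sig_right, mode="full")
    lag = np.argmax(corr) - (len(sig_right) - 1)  # delay in samples
    tdoa = lag / sample_rate                      # delay in seconds
    # Clamp to the physically possible range before taking arcsin.
    sin_theta = np.clip(tdoa * SOUND_SPEED / mic_distance_m, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_theta)))  # 0 deg = straight ahead

# Simulated capture: an impulse that reaches the left microphone
# 5 samples before the right one (source is off to the left).
fs, mic_spacing = 16000, 0.2
pulse = np.zeros(256)
pulse[100] = 1.0
left = pulse
right = np.roll(pulse, 5)  # same impulse, delayed at the right mic
angle = estimate_direction(left, right, mic_spacing, fs)
# angle comes out negative, i.e. toward the left microphone
```

The sign convention (negative toward the left microphone) is an arbitrary choice of this sketch.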
Step S120: drive the camera of the device to turn toward the direction in which the user is located, and capture N consecutive frames of gesture images of the user's dynamic gesture. Specifically, the camera of the device is driven to turn toward the user's direction. To recognize the user's dynamic gesture, N consecutive gesture-image frames of the dynamic gesture must be captured continuously; the N consecutive frames are then recognized to obtain the gesture meaning of the user's dynamic gesture. The value of N is determined by the speed at which the user performs a motion; what matters most is that the meaning of a gesture can be recognized.
Step S130: perform gesture recognition on the captured N consecutive gesture-image frames to identify the gesture meaning of the dynamic gesture, so as to control the device according to that gesture meaning.
Fig. 2 is a flow diagram of a specific embodiment of the step of performing gesture recognition on the captured N consecutive gesture-image frames to identify the gesture meaning of the dynamic gesture, according to an embodiment of the present invention. As shown in Fig. 2, step S130 includes step S131, step S132, step S133 and step S134.
Step S131: identify the gesture region of the dynamic gesture in each frame of the N consecutive gesture-image frames, and perform preset processing based on the gesture region to obtain a gesture-region image for each frame.
The gesture region of the dynamic gesture is the region of interest (ROI) in the gesture image; the same region of interest, i.e. the gesture region, is identified in each frame of the N consecutive frames. A convolutional feature recognition model for the gesture region can be trained in advance, so that the gesture region of each frame is identified with this model. After the gesture region in each frame has been identified, preset processing can be performed based on it to obtain the gesture-region image of that frame. Specifically, the image regions of each frame other than the gesture region (i.e. the regions of non-interest) are blurred or removed, yielding the gesture-region image of each frame. Blurring or removing the image regions carrying irrelevant information reduces the image data to be processed and thus improves processing speed.
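The blurring or removal of regions of non-interest can be sketched as a simple ROI mask. In this numpy stand-in (the function name and the mean-fill "blur" are assumptions, not the patent's implementation), background pixels stop carrying information while the gesture region is preserved:

```python
import numpy as np

def isolate_gesture_region(frame, roi, mode="remove"):
    """Keep only the gesture ROI of a frame; blur or zero out the rest.

    frame: HxW array; roi: (top, bottom, left, right). Mode "remove"
    zeroes the background; "blur" replaces it with its own mean value,
    a crude stand-in for the blurring the patent describes.
    """
    top, bottom, left, right = roi
    mask = np.zeros(frame.shape, dtype=bool)
    mask[top:bottom, left:right] = True
    out = frame.astype(float).copy()
    if mode == "remove":
        out[~mask] = 0.0
    else:  # flatten background detail to its mean intensity
        out[~mask] = out[~mask].mean()
    return out

frame = np.arange(36, dtype=float).reshape(6, 6)
roi = (1, 4, 2, 5)  # rows 1-3, cols 2-4 are assumed to hold the hand
clean = isolate_gesture_region(frame, roi)
# Background pixels are zeroed; ROI pixels keep their original values.
```

Either mode leaves the frame dimensions unchanged, so downstream convolution code needs no special cases.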
Optionally, while the camera of the device is driven to turn toward the direction of the user, the attitude angle of the camera can be acquired in real time. To expand the camera's effective field of view when the user is not within it, the user's direction is judged by sound, the camera is driven to turn toward that direction, and the camera's attitude angle is acquired as it rotates. The attitude angle, together with the known position and mounting height of the camera, is used to determine the distance between the user and the device; knowing that distance makes it easier to determine the size of the gesture region.
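The distance estimate from attitude angle and mounting height reduces to simple trigonometry, and a pinhole-camera relation then bounds the expected gesture-region size. A sketch under assumed geometry (all numbers and names are illustrative, not from the patent):

```python
import math

def user_distance(camera_height_m, pitch_down_deg):
    """Horizontal distance to the user from the camera's downward pitch
    angle and known mounting height. The patent only states that the
    attitude angle plus position/height yield the distance; this is one
    plausible geometric reading.
    """
    return camera_height_m / math.tan(math.radians(pitch_down_deg))

def expected_roi_width_px(real_hand_width_m, distance_m, focal_px):
    """Pinhole-camera scaling: a nearer hand covers more pixels, which
    is how knowing the distance helps bound the gesture-region size."""
    return focal_px * real_hand_width_m / distance_m

d = user_distance(2.0, 30.0)            # camera 2 m up, tilted 30 deg down
w = expected_roi_width_px(0.18, d, 800) # ~18 cm hand, 800 px focal length
# d is about 3.46 m, so the hand should span roughly 42 px
```

With the ROI width bounded this way, the gesture-region detector can reject candidate regions that are implausibly large or small for the measured distance.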
Step S132: perform dual-channel convolution on the gesture-region image of each frame to obtain the N consecutive feature maps of the gesture-region images of the N consecutive gesture-image frames.
Specifically, the color image and the depth image of the gesture-region image of each frame are convolved separately to obtain a color feature map and a depth feature map; the color feature map and the depth feature map are merged to obtain the feature map of the gesture-region image of that frame, and finally the N consecutive feature maps of the gesture-region images of the N consecutive frames are obtained.
Fig. 3 is a schematic diagram of dual-channel convolution of the gesture-region image of each frame, according to an embodiment of the present invention. As shown in Fig. 3, the two image channels are used to obtain richer feature information about the gesture region. The color image is the original image, and the depth image is generally a gray-level image of the original that must be mapped to the same location information as the color image. The color image passes through several convolution stages to produce a color feature map, and the depth image likewise passes through several convolution stages to produce a depth feature map; the color feature map and the depth feature map are then merged into the feature map of the gesture-region image of the current frame.
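The two-branch scheme of Fig. 3 — one convolution path per image channel, merged feature maps — can be illustrated per frame as follows. This numpy sketch uses a single convolution per branch and channel stacking as the merge; a real model would use learned kernels and deeper stacks, so everything here is a simplified assumption:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-mode 2D convolution (really cross-correlation, as in CNNs)."""
    kh, kw = kernel.shape
    h, w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

def dual_channel_features(color, depth, k_color, k_depth):
    """One color and one depth feature map, merged by channel stacking,
    standing in for the patent's two convolution branches."""
    f_color = conv2d(color, k_color)
    f_depth = conv2d(depth, k_depth)
    return np.stack([f_color, f_depth])  # shape (2, H', W')

rng = np.random.default_rng(0)
color = rng.random((8, 8))   # gesture-region color channel (toy data)
depth = rng.random((8, 8))   # aligned depth channel (toy data)
k = np.ones((3, 3)) / 9.0    # simple averaging kernels for both branches
feat = dual_channel_features(color, depth, k, k)
# feat.shape == (2, 6, 6): two fused channels for this frame
```

Stacking is only one possible merge; concatenation along the channel axis followed by a 1x1 convolution would be another common choice.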
Step S133: perform convolution on the N consecutive feature maps with a 3D convolutional neural network model to obtain motion feature information of the gesture region.
Specifically, 3D convolution is applied by the 3D convolutional neural network model to the N consecutive feature maps of the gesture-region images, yielding motion-related information of the gesture region, such as the direction of motion.
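How a 3D convolution exposes motion across the stack of N feature maps can be seen with a hand-rolled kernel that differences consecutive frames. This toy example (the kernel and the moving blob are illustrative, not the patented network) responds only where the gesture region moves:

```python
import numpy as np

def conv3d(volume, kernel):
    """Valid-mode 3D convolution over a (frames, H, W) feature volume."""
    kt, kh, kw = kernel.shape
    t = volume.shape[0] - kt + 1
    h = volume.shape[1] - kh + 1
    w = volume.shape[2] - kw + 1
    out = np.empty((t, h, w))
    for a in range(t):
        for i in range(h):
            for j in range(w):
                out[a, i, j] = np.sum(
                    volume[a:a+kt, i:i+kh, j:j+kw] * kernel)
    return out

# A hand-shaped blob moving one pixel to the right per frame, 4 frames.
frames = np.zeros((4, 5, 8))
for t in range(4):
    frames[t, 2, t + 1] = 1.0

# Temporal-difference kernel: responds where the value changes between
# consecutive frames, i.e. exactly where there is motion.
motion_kernel = np.zeros((2, 1, 1))
motion_kernel[0, 0, 0], motion_kernel[1, 0, 0] = -1.0, 1.0
motion = conv3d(frames, motion_kernel)
# Positive response where the blob arrives, negative where it leaves,
# so the sign pattern along the row encodes the direction of motion.
```

A trained 3D CNN learns many such spatio-temporal kernels instead of one fixed difference, but the mechanism is the same.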
Step S134: recognize the N consecutive feature maps with a preset recurrent neural network model and the motion feature information of the gesture region, obtaining the gesture meaning of the dynamic gesture.
Specifically, the N consecutive feature maps and the motion feature information of the gesture region are input to a preset RNN (recurrent neural network) model, and the softmax function in the model derives the gesture meaning of the dynamic gesture from the N consecutive feature maps — for example, the semantics of the gesture or the device control command to which the gesture corresponds.
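The RNN-plus-softmax step can be sketched with a vanilla recurrent cell run over per-frame feature vectors. The weights below are untrained random placeholders; the point is only the data flow from frame features to gesture-class probabilities, not the patent's actual model:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def rnn_classify(frame_feats, Wxh, Whh, Why):
    """Run a vanilla RNN over per-frame feature vectors and apply softmax
    to the final hidden state, yielding gesture-class probabilities."""
    h = np.zeros(Whh.shape[0])
    for x in frame_feats:          # one step per captured frame
        h = np.tanh(Wxh @ x + Whh @ h)
    return softmax(Why @ h)

rng = np.random.default_rng(1)
n_frames, feat_dim, hidden, n_classes = 6, 10, 8, 4
feats = rng.standard_normal((n_frames, feat_dim))  # toy frame features
probs = rnn_classify(
    feats,
    rng.standard_normal((hidden, feat_dim)) * 0.1,   # input weights
    rng.standard_normal((hidden, hidden)) * 0.1,     # recurrent weights
    rng.standard_normal((n_classes, hidden)) * 0.1,  # output weights
)
gesture_id = int(np.argmax(probs))  # index of the recognized gesture class
```

In practice the class index would be mapped to a gesture meaning or device command by a lookup table produced during training.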
The above steps can also be understood with reference to Fig. 4, which is a schematic diagram of a specific embodiment of performing gesture recognition on continuously captured gesture images, according to an embodiment of the present invention. As shown in Fig. 4, the consecutive images are the gesture images of the user's dynamic gesture continuously captured by the camera, i.e. the consecutive frames acquired during image acquisition. They are convolved by the dual-channel model; the motion feature information of the gesture region is obtained with the 3D_CNN convolution model and fed into the RNN recurrent neural network model; and the meaning of the user's dynamic gesture is obtained via the softmax function in the model.
Once the gesture meaning of the dynamic gesture has been identified, the device can be controlled according to that meaning. The device may specifically be an electric appliance, such as an air conditioner. Optionally, when the device is an air conditioner, the air-outlet direction of the air conditioner can be controlled according to the direction of the user after that direction has been determined.
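Controlling the air-outlet direction from the user's direction, plus dispatching a recognized gesture to a device command, might look like the following sketch; the louver swing range, the command names, and the gesture-id table are all hypothetical, not taken from the patent:

```python
def airflow_louver_angle(user_azimuth_deg, max_swing_deg=60.0):
    """Map the user's azimuth (from sound-source localization) to a
    louver angle, clamped to the air conditioner's mechanical swing
    range. Range and convention are illustrative assumptions."""
    return max(-max_swing_deg, min(max_swing_deg, user_azimuth_deg))

# Hypothetical mapping from recognized gesture class index to a command.
GESTURE_COMMANDS = {
    0: "power_toggle",
    1: "temperature_up",
    2: "temperature_down",
    3: "fan_speed_cycle",
}

angle = airflow_louver_angle(75.0)       # user far to one side -> clamped
cmd = GESTURE_COMMANDS.get(2, "noop")    # unknown ids fall back to no-op
```

The fallback to a no-op keeps an unrecognized or low-confidence gesture from triggering an unintended device action.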
Fig. 5 is a structural schematic diagram of one embodiment of the device control apparatus provided by the present invention. As shown in Fig. 5, the device control apparatus 100 includes a voice recognition unit 110, a driving unit 120, an image acquisition unit 130, an image recognition unit 140 and a control unit 150.
The voice recognition unit 110 is configured to identify, when a voice command is received from a user, the sound source direction of the voice command so as to determine the direction in which the user is located; the driving unit 120 is configured to drive the camera of the device to turn toward the direction of the user; the image acquisition unit 130 is configured to capture N consecutive frames of gesture images of the user's dynamic gesture; the image recognition unit 140 is configured to perform gesture recognition on the N consecutive gesture-image frames captured by the image acquisition unit so as to identify the gesture meaning of the dynamic gesture; and the control unit 150 is configured to control the device according to the gesture meaning identified by the image recognition unit.
When a voice command is received from the user, the voice recognition unit 110 identifies the sound source direction of the voice command so as to determine the direction in which the user is located, and the driving unit 120 drives the camera of the device to turn toward that direction. Specifically, the voice recognition unit 110 identifies the sound source direction of the voice command with a deep-learning voice module; the sound source direction is the direction of the user who issued the voice command, i.e. the direction of the user relative to the device. The driving unit 120 then drives the camera of the device to turn toward the direction in which the user is located.
The image acquisition unit 130 captures N consecutive frames of gesture images of the user's dynamic gesture. Specifically, to recognize the user's dynamic gesture, N consecutive gesture-image frames of the dynamic gesture must be captured continuously; the N consecutive frames are then recognized to obtain the gesture meaning of the user's dynamic gesture. The value of N is determined by the speed at which the user performs a motion; what matters most is that the meaning of a gesture can be recognized.
The image recognition unit 140 performs gesture recognition on the N consecutive gesture-image frames captured by the image acquisition unit 130 to identify the gesture meaning of the dynamic gesture, so that the device can be controlled according to that meaning.
Fig. 6 is a structural schematic diagram of a specific embodiment of the image recognition unit, according to an embodiment of the present invention. As shown in Fig. 6, the image recognition unit 140 includes a gesture region recognition unit 141, a dual-channel processing unit 142, a 3D convolution processing unit 143 and a gesture meaning recognition unit 144.
The gesture region recognition unit 141 is configured to identify the gesture region of the dynamic gesture in each frame of the N consecutive gesture-image frames and to perform preset processing based on the gesture region to obtain a gesture-region image for each frame.
The gesture region of the dynamic gesture is the region of interest (ROI) in the gesture image; the gesture region recognition unit 141 identifies the same region of interest, i.e. the gesture region, in each frame of the N consecutive frames. A convolutional feature recognition model for the gesture region can be trained in advance, so that the gesture region recognition unit 141 identifies the gesture region of each frame with this model. After the gesture region in each frame has been identified, preset processing can be performed based on it to obtain the gesture-region image of that frame. Specifically, the gesture region recognition unit 141 blurs or removes the image regions of each frame other than the gesture region (i.e. the regions of non-interest), yielding the gesture-region image of each frame. Blurring or removing the image regions carrying irrelevant information reduces the image data to be processed and thus improves processing speed.
Optionally, while the camera of the device is driven to turn toward the direction of the user, the attitude angle of the camera can be acquired in real time. To expand the camera's effective field of view when the user is not within it, the user's direction is judged by sound, the camera is driven to turn toward that direction, and the camera's attitude angle is acquired as it rotates. The attitude angle, together with the known position and mounting height of the camera, is used to determine the distance between the user and the device; knowing that distance makes it easier to determine the size of the gesture region.
The dual-channel processing unit 142 is configured to perform dual-channel convolution on the gesture-region image of each frame so as to obtain the N consecutive feature maps of the gesture-region images of the N consecutive gesture-image frames. Specifically, the dual-channel processing unit 142 convolves the color image and the depth image of the gesture-region image of each frame separately to obtain a color feature map and a depth feature map, merges the color feature map and the depth feature map to obtain the feature map of the gesture-region image of that frame, and finally obtains the N consecutive feature maps of the gesture-region images of the N consecutive frames.
As described above with reference to Fig. 3, the two image channels are used to obtain richer feature information about the gesture region: the color image and the depth image (the latter mapped to the same location information as the color image) each pass through several convolution stages, and the resulting color and depth feature maps are merged into the feature map of the gesture-region image of the current frame.
The 3D convolution processing unit 143 is configured to perform convolution on the N consecutive feature maps with a 3D convolutional neural network model to obtain the motion feature information of the gesture region. Specifically, the 3D convolution processing unit 143 applies 3D convolution, via the 3D convolutional neural network model, to the N consecutive feature maps of the gesture-region images of the N consecutive frames, yielding motion-related information of the gesture region, such as the direction of motion.
The gesture meaning recognition unit 144 is configured to recognize the N consecutive feature maps with a preset recurrent neural network model and the motion feature information of the gesture region, obtaining the gesture meaning of the dynamic gesture. Specifically, the gesture meaning recognition unit 144 inputs the N consecutive feature maps and the motion feature information of the gesture region to a preset RNN recurrent neural network model, and the softmax function in the model derives the gesture meaning of the dynamic gesture from the N consecutive feature maps — for example, the semantics of the gesture or the device control command to which the gesture corresponds.
After the image recognition unit 140 has identified the gesture meaning of the dynamic gesture, the control unit 150 can control the device according to that meaning. Optionally, the device may specifically be an electric appliance, such as an air conditioner. When the device is an air conditioner, the control unit 150 can control the air-outlet direction of the air conditioner according to the direction in which the user is located, after the voice recognition unit 110 has determined that direction.
The present invention also provides a storage medium corresponding to the device control method, on which a computer program is stored; when the program is executed by a processor, the steps of any of the foregoing methods are implemented. The present invention also provides a device corresponding to the device control method, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the steps of any of the foregoing methods are implemented. The present invention also provides a device corresponding to the device control apparatus, comprising any of the foregoing device control apparatuses.
Accordingly, the solution provided by the present invention determines the direction of the user by identifying the sound source direction of the user's voice,
drives the camera to turn according to the user's direction while obtaining the camera attitude angle in real time, and ties this to the related intelligent dialogue. This
enlarges the camera's field of view, fuses voice interaction with image interaction, and improves the intelligence of the interaction between the user and the air conditioner. The present invention
identifies the gesture region in the gesture image and applies default processing to obtain a gesture region image; by blurring or removing
the irrelevant information in the image, the amount of image data to be processed is reduced. Dual-channel convolution is applied to the collected continuous N frames of gesture images
to obtain continuous N frames of feature maps, which are then recognized using a 3D convolutional neural network model to
obtain motion features and identify the meaning of the dynamic gesture. This improves the recognition rate of gesture recognition and enables dynamic
gesture recognition.
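The region masking and dual-channel (color + depth) convolution steps summarized above can be sketched with a toy NumPy implementation. This is a minimal stand-in, not the patented model: the "removal" variant of the default processing is shown, and a single shared kernel replaces the learned convolutional layers:

```python
import numpy as np

def mask_gesture_region(frame: np.ndarray, box: tuple) -> np.ndarray:
    """Zero out everything outside the gesture bounding box (the
    'removal' variant of the default processing); box = (y0, y1, x0, x1)."""
    y0, y1, x0, x1 = box
    out = np.zeros_like(frame)
    out[y0:y1, x0:x1] = frame[y0:y1, x0:x1]
    return out

def conv2d_valid(img: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Naive single-channel 'valid' 2-D convolution."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def dual_channel_features(color: np.ndarray, depth: np.ndarray,
                          kernel: np.ndarray) -> np.ndarray:
    """Convolve the color and depth channels separately, then stack
    (fuse) the two feature maps - a toy stand-in for the dual-channel
    convolution processing described above."""
    return np.stack([conv2d_valid(color, kernel),
                     conv2d_valid(depth, kernel)])
```

Repeating `dual_channel_features` over N consecutive masked frames yields the continuous N frames of feature maps that a 3D convolutional network would then consume.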
The functions described herein may be implemented in hardware, in software executed by a processor, in firmware, or in any combination thereof.
If implemented in software executed by a processor, the functions may be stored on, or transmitted over, a computer-readable
medium as one or more instructions or code. Other examples and implementations are within the scope and spirit of the present invention and the appended
claims. For example, owing to the nature of software, the functions described above can be implemented using software executed by a processor,
hardware, firmware, hard wiring, or any combination of these. In addition, the functional units may be integrated
in one processing unit, each unit may exist alone physically, or two or more units may be integrated
in one unit.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other
ways. The apparatus embodiments described above are merely illustrative; for example, the division into units may be
a division by logical function, and there may be other ways of dividing in actual implementation: multiple units or components may be combined or
integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual
coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection of units or modules
through some interfaces, and may be electrical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as the control
apparatus may or may not be physical units; they may be located in one place or distributed over multiple
units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product,
it may be stored in a computer-readable storage medium. Based on this understanding, the essence of the technical solution of the present invention,
or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software
product. The computer software product is stored in a storage medium and includes several instructions to cause a computer
device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods of the
embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive,
a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic
disk, or an optical disc.
The above are only embodiments of the present invention and are not intended to limit the present invention. For those skilled in the
art, the present invention may have various modifications and variations. Any modification,
equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included within the scope of the claims of the present invention.
Claims (13)
1. A device control method, characterized by comprising:
when a voice command of a user is received, identifying the sound source direction of the voice command to determine the direction in which the
user is located;
driving a camera of the device to turn to the direction in which the user is located, and collecting continuous
N frames of gesture images of a dynamic gesture of the user;
performing gesture recognition on the collected continuous N frames of gesture images to identify the gesture meaning of the dynamic gesture,
so as to control the device according to the gesture meaning.
2. The method according to claim 1, characterized in that performing gesture recognition on the collected continuous N frames of gesture
images to identify the gesture meaning of the dynamic gesture comprises:
identifying the gesture region of the dynamic gesture in each frame of the continuous N frames of gesture images, and performing
default processing based on the gesture region to obtain a gesture region image of each frame of gesture image;
performing dual-channel convolution processing on the gesture region image of each frame of gesture image, to obtain continuous N frames of
feature maps of the gesture region images of the continuous N frames of gesture images;
performing convolution processing on the continuous N frames of feature maps using a 3D convolutional neural network model, to obtain
motion feature information of the gesture region;
recognizing the continuous N frames of feature maps using a preset recurrent neural network model and the motion feature information
of the gesture region, to obtain the gesture meaning of the dynamic gesture.
3. The method according to claim 2, characterized in that performing default processing based on the gesture region to obtain the
gesture region image of each frame of gesture image comprises:
performing blurring processing or removal processing on image regions other than the gesture region in each frame of gesture image,
to obtain the gesture region image of each frame of gesture image.
4. The method according to claim 2 or 3, characterized in that performing dual-channel convolution processing on the gesture region
image of each frame of gesture image, to obtain the continuous N frames of feature maps of the gesture region images of the continuous N frames of gesture images,
comprises:
performing convolution processing respectively on the color image and the depth image of the gesture region image of each frame of gesture image, to obtain
a color feature map and a depth feature map;
fusing the color feature map and the depth feature map, to obtain the feature map of the gesture region image of each frame of
gesture image.
5. The method according to any one of claims 1-4, characterized in that the device comprises an air conditioner, and the method
further comprises:
after determining the direction in which the user is located, controlling the air outlet direction of the air conditioner according to the direction in which
the user is located.
6. A device control apparatus, characterized by comprising:
a voice recognition unit, configured to identify, when a voice command of a user is received, the sound source direction of the voice command, to
determine the direction in which the user is located;
a driving unit, configured to drive a camera of the device to turn to the direction in which the user is located;
an image collection unit, configured to collect continuous N frames of gesture images of a dynamic gesture of the user;
an image recognition unit, configured to perform gesture recognition on the continuous N frames of gesture images collected by the image collection unit,
to identify the gesture meaning of the dynamic gesture;
a control unit, configured to control the device according to the gesture meaning identified by the image recognition unit.
7. The apparatus according to claim 6, characterized in that the image recognition unit comprises:
a gesture region recognition unit, configured to identify the gesture region of the dynamic gesture in each frame of the continuous
N frames of gesture images, and perform default processing based on the gesture region to obtain a gesture region image of each frame of gesture image;
a dual-channel processing unit, configured to perform dual-channel convolution processing on the gesture region image of each frame of gesture
image, to obtain continuous N frames of feature maps of the gesture region images of the continuous N frames of gesture images;
a 3D convolution processing unit, configured to perform convolution processing on the continuous N frames of feature maps using a 3D convolutional neural network model,
to obtain motion feature information of the gesture region;
a gesture meaning recognition unit, configured to recognize the continuous N frames of feature maps using a preset recurrent neural network model
and the motion feature information of the gesture region, to obtain the gesture meaning of the dynamic gesture.
8. The apparatus according to claim 7, characterized in that the gesture region recognition unit performing default processing based on
the gesture region to obtain the gesture region image of each frame of gesture image comprises:
performing blurring processing or removal processing on image regions other than the gesture region in each frame of gesture image,
to obtain the gesture region image of each frame of gesture image.
9. The apparatus according to claim 7 or 8, characterized in that the dual-channel processing unit performing dual-channel convolution
processing on the gesture region image of each frame of gesture image, to obtain the continuous N frames of feature maps of the gesture region
images of the continuous N frames of gesture images, comprises:
performing convolution processing respectively on the color image and the depth image of the gesture region image of each frame of gesture image, to obtain
a color feature map and a depth feature map;
fusing the color feature map and the depth feature map, to obtain the feature map of the gesture region image of each frame of
gesture image.
10. The apparatus according to any one of claims 6-9, characterized in that the device comprises an air conditioner, and the control
unit is further configured to:
after the direction in which the user is located is determined, control the air outlet direction of the air conditioner according to the direction in which
the user is located.
11. A storage medium, characterized in that a computer program is stored thereon, and when the program is executed by a processor,
the steps of the method according to any one of claims 1-5 are implemented.
12. A device, characterized by comprising a processor, a memory, and a computer program stored in the memory and executable on the
processor, wherein the processor implements the steps of the method according to any one of claims 1-5 when executing the program.
13. A device, characterized by comprising the device control apparatus according to any one of claims 6-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811584166.7A CN109886070A (en) | 2018-12-24 | 2018-12-24 | A kind of apparatus control method, device, storage medium and equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109886070A true CN109886070A (en) | 2019-06-14 |
Family
ID=66925251
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811584166.7A Pending CN109886070A (en) | 2018-12-24 | 2018-12-24 | A kind of apparatus control method, device, storage medium and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109886070A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111860346A (en) * | 2020-07-22 | 2020-10-30 | 苏州臻迪智能科技有限公司 | Dynamic gesture recognition method and device, electronic equipment and storage medium |
CN112750437A (en) * | 2021-01-04 | 2021-05-04 | 欧普照明股份有限公司 | Control method, control device and electronic equipment |
WO2021135432A1 (en) * | 2019-12-31 | 2021-07-08 | Midea Group Co., Ltd. | System and method of hand gesture detection |
WO2023169123A1 (en) * | 2022-03-11 | 2023-09-14 | 深圳地平线机器人科技有限公司 | Device control method and apparatus, and electronic device and medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103491301A (en) * | 2013-09-18 | 2014-01-01 | 潍坊歌尔电子有限公司 | System and method for regulating and controlling camera of electronic equipment |
CN103994541A (en) * | 2014-04-21 | 2014-08-20 | 美的集团股份有限公司 | Air direction switching method and system based on voice control |
CN104902203A (en) * | 2015-05-19 | 2015-09-09 | 广东欧珀移动通信有限公司 | Video recording method based on rotary camera, and terminal |
CN105353634A (en) * | 2015-11-30 | 2016-02-24 | 北京地平线机器人技术研发有限公司 | Household appliance and method for controlling operation by gesture recognition |
CN107526440A (en) * | 2017-08-28 | 2017-12-29 | 四川长虹电器股份有限公司 | The intelligent electric appliance control method and system of gesture identification based on decision tree classification |
CN107808131A (en) * | 2017-10-23 | 2018-03-16 | 华南理工大学 | Dynamic gesture identification method based on binary channel depth convolutional neural networks |
CN108009499A (en) * | 2017-11-30 | 2018-05-08 | 宁波高新区锦众信息科技有限公司 | A kind of intelligent home control system based on dynamic hand gesture recognition |
CN108852349A (en) * | 2018-05-17 | 2018-11-23 | 浙江大学 | A kind of moving decoding method using Cortical ECoG signal |
Non-Patent Citations (1)
Title |
---|
Ye Renzhen: "Research on Video Image Processing: Vision Algorithms in Surveillance Scenarios", 30 September 2018, Huazhong University of Science and Technology Press * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109886070A (en) | A kind of apparatus control method, device, storage medium and equipment | |
US11188783B2 (en) | Reverse neural network for object re-identification | |
CN104331168B (en) | Display adjusting method and electronic equipment | |
Baee et al. | Medirl: Predicting the visual attention of drivers via maximum entropy deep inverse reinforcement learning | |
CN106682602B (en) | Driver behavior identification method and terminal | |
CN105095882B (en) | The recognition methods of gesture identification and device | |
CN102081918B (en) | Video image display control method and video image display device | |
CN107578023A (en) | Man-machine interaction gesture identification method, apparatus and system | |
CN106934392B (en) | Vehicle logo identification and attribute prediction method based on multi-task learning convolutional neural network | |
US20200081524A1 (en) | Method and appartus for data capture and evaluation of ambient data | |
CN103310187B (en) | Face-image based on facial quality analysis is prioritized | |
WO2018064047A1 (en) | Performing operations based on gestures | |
US11775054B2 (en) | Virtual models for communications between autonomous vehicles and external observers | |
CN106997236A (en) | Based on the multi-modal method and apparatus for inputting and interacting | |
CN111989689A (en) | Method for identifying objects within an image and mobile device for performing the method | |
JP2022530605A (en) | Child state detection method and device, electronic device, storage medium | |
CN110276229A (en) | Target object regional center localization method and device | |
US11403560B2 (en) | Training apparatus, image recognition apparatus, training method, and program | |
CN106897659A (en) | The recognition methods of blink motion and device | |
Gupta et al. | Online detection and classification of dynamic hand gestures with recurrent 3d convolutional neural networks | |
CN109190516A (en) | A kind of static gesture identification method based on volar edge contour vectorization | |
CN104281839A (en) | Body posture identification method and device | |
CN107944398A (en) | Based on depth characteristic association list diagram image set face identification method, device and medium | |
CN113266975B (en) | Vehicle-mounted refrigerator control method, device, equipment and storage medium | |
CN104021384A (en) | Face recognition method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190614 |