WO2018076371A1 - Gesture recognition method, network training method, apparatus and equipment - Google Patents


Info

Publication number
WO2018076371A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
motion
action
image
frequency domain
Prior art date
Application number
PCT/CN2016/104121
Other languages
French (fr)
Chinese (zh)
Inventor
崔健
Original Assignee
深圳市大疆创新科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司 filed Critical 深圳市大疆创新科技有限公司
Priority to PCT/CN2016/104121 priority Critical patent/WO2018076371A1/en
Priority to CN201680029871.3A priority patent/CN107735796A/en
Publication of WO2018076371A1 publication Critical patent/WO2018076371A1/en

Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/0011Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots associated with a remote control arrangement
    • G05D1/0016Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots associated with a remote control arrangement characterised by the operator's input device
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/0011Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots associated with a remote control arrangement
    • G05D1/0033Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots associated with a remote control arrangement by having the operator tracking the vehicle either by direct line of sight or via one or more cameras located remotely from the vehicle
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/10Simultaneous control of position or course in three dimensions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Definitions

  • the present application relates to the field of communications technologies, and in particular, to a motion recognition method, a network training method, an apparatus, and a device.
  • current gesture recognition works by identifying the start point and the end point of the data waveform. In this mode, when the user moves without making a gesture, the various motions are difficult to distinguish from gestures; the start point and the end point are then mislocated, making gesture recognition erroneous or impossible, so the accuracy and reliability of gesture recognition are low.
  • the embodiment of the invention provides a motion recognition method, a network training method, an apparatus, and a device, which can improve the accuracy and reliability of recognizing a user's gesture actions.
  • an embodiment of the present invention provides a motion recognition apparatus, including: an acquisition module and a processing module;
  • An acquiring module configured to acquire motion data detected by an external device for the current motion
  • a processing module configured to convert the motion data acquired by the acquiring module into frequency domain data, and use the frequency domain data to identify an action corresponding to the frequency domain data.
  • the embodiment of the present invention further provides a motion recognition apparatus, including: a first acquisition module, a second acquisition module, a fusion module, and a processing module;
  • a first acquiring module configured to acquire motion data detected by the external device for the current motion, and acquire feature data corresponding to the current motion according to the motion data
  • a second acquiring module configured to acquire an image collected for the current action, and process the image to obtain image recognition data corresponding to the current action
  • a fusion module configured to fuse feature data corresponding to the current action and image recognition data to obtain fusion data
  • a processing module configured to identify an action corresponding to the merged data.
  • the embodiment of the present invention further provides a network training device based on motion recognition, including: a first acquiring module, a second acquiring module, a determining module, and a processing module;
  • a first acquiring module configured to acquire motion data detected by an external device for a preset motion
  • a second acquiring module configured to acquire an image collected for the preset action, and process the image to obtain image recognition data corresponding to the preset action
  • a determining module configured to identify an action corresponding to the motion data
  • a processing module configured to perform supervised learning on the image recognition data acquired by the second acquiring module by using the action identified by the determining module, and to train the preset network model based on the image recognition data after the supervised learning.
  • the embodiment of the present invention further provides a motion recognition method, including:
  • an embodiment of the present invention further provides a motion recognition method, including:
  • the embodiment of the present invention further provides a network training method based on motion recognition, including:
  • supervised learning is performed on the image recognition data by using the identified action, and the preset network model is trained based on the image recognition data after the supervised learning.
  • the embodiment of the present invention further provides a motion recognition device, including: a processor and a communication interface, where the processor is connected to the communication interface;
  • the communication interface is configured to acquire motion data detected by an external device for a current motion
  • the processor is configured to convert the motion data into frequency domain data, and use the frequency domain data to identify an action corresponding to the frequency domain data.
  • the embodiment of the present invention further provides a motion recognition device, including: a processor, a communication interface, and an image acquisition device, where the processor is respectively connected to the image acquisition device and the communication interface, where
  • the communication interface is configured to acquire motion data detected by an external device for a current motion
  • the image obtaining device is configured to collect an image for the current motion
  • the processor is configured to acquire the image collected by the image acquiring device for the current action, process the image to obtain image recognition data corresponding to the current action, acquire feature data corresponding to the current action according to the motion data, fuse the feature data corresponding to the current action with the image recognition data to obtain fused data, and identify the action corresponding to the fused data.
  • the embodiment of the present invention further provides a network training device based on motion recognition, comprising: an image acquiring device, a processor, and a communication interface, wherein the processor is respectively connected to the image acquiring device and the communication interface, and wherein:
  • the communication interface is configured to acquire motion data detected by an external device for a preset motion
  • the image obtaining device is configured to collect an image for the preset action
  • the processor is configured to acquire the image collected by the image acquiring device for the preset action, and process the image to obtain image recognition data corresponding to the preset action; identify the action corresponding to the motion data; perform supervised learning on the image recognition data by using the identified action; and train the preset network model based on the image recognition data after the supervised learning.
  • an embodiment of the present invention further provides an aircraft, including the motion recognition device, for identifying an action.
  • an embodiment of the present invention further provides an aircraft, including the motion recognition device, for identifying an action.
  • an embodiment of the present invention further provides an aircraft, including the motion recognition-based network training device, for training a network model for motion recognition.
  • in the embodiments of the present invention, the motion data of the external device can be acquired and converted into frequency domain data, and the action corresponding to the frequency domain data identified; or the motion data and the image recognition data can be fused to obtain fused data, and the fused data used to identify the corresponding action; or the action corresponding to the motion data can be determined and used to perform supervised learning on the image recognition data. Each approach enhances the accuracy and reliability of motion recognition and has good robustness.
  • FIG. 1 is a schematic diagram of an aircraft control system according to an embodiment of the present invention.
  • FIG. 2 is a schematic flowchart of a motion recognition method according to an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of another motion recognition method according to an embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of a network training method based on motion recognition according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a motion recognition apparatus according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of another motion recognition apparatus according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a network training apparatus based on motion recognition according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a motion recognition device according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of another motion recognition apparatus according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of a network training device based on motion recognition according to an embodiment of the present invention.
  • FIG. 1 provides a schematic diagram of an aircraft control system including an aircraft such as a drone 110 and a wearable device 120, wherein the drone 110 includes a flight body, a pan/tilt head, and an imaging device 130.
  • the flying body includes a plurality of rotors and a rotor motor that drives the rotor to rotate, thereby providing the power required for the drone 110 to fly.
  • the imaging device 130 is mounted on the flying body through the pan/tilt.
  • the imaging device 130 is used for image or video capture during flight of the drone 110, including but not limited to multi-spectral imagers, hyperspectral imagers, visible light cameras, infrared cameras, and the like.
  • the pan/tilt is a multi-axis transmission and stabilization system, including multiple rotating shafts and pan/tilt motors.
  • the pan/tilt motor compensates for the shooting angle of the imaging device 130 by adjusting the rotation angle of the rotating shaft, and prevents or reduces shake of the imaging device 130 through an appropriate buffer mechanism.
  • imaging device 130 can be mounted on the flying body either directly or by other means.
  • the wearable device 120 is worn by the operator and communicates with the drone 110 through wireless communication, thereby controlling the flight process of the drone 110 and the photographing process of the imaging device 130.
  • the wearable device 120 has a built-in motion sensor.
  • the motion sensor senses the movement of the hand and outputs corresponding motion data, and the drone is controlled accordingly based on the motion data.
  • the imaging device on the drone can also capture image data of the motion of a human body; the limb motion is recognized according to the image data, and the aircraft can be controlled accordingly based on the recognized limb motion.
  • the embodiment of the invention discloses a motion recognition method, a network training method based on the motion recognition method, a related device and related equipment, which can improve the accuracy and reliability of the motion recognition, and has good robustness. The details are explained below.
  • FIG. 2 is a schematic flowchart diagram of a motion recognition method according to an embodiment of the present invention. Specifically, as shown in FIG. 2, the motion recognition method in the embodiment of the present invention may include the following steps:
  • the technical solution of the embodiment of the present invention may be applied to an external device, to a controlled device corresponding to the motion recognition, such as an aircraft, or to another independent motion recognition device, which is not limited in the embodiment of the present invention.
  • the external device may be a wearable device or a handheld device, such as a wristband, a watch, or a smart ring. The external device is configured with a motion sensor, such as an Inertial Measurement Unit (IMU). When the external device moves or makes an action such as a gesture, the motion sensor outputs corresponding motion data, which the external device detects; the motion data may be one or both of angular acceleration and acceleration.
  • using the frequency domain data to identify the action corresponding to the frequency domain data may specifically be: inputting the frequency domain data into a network model, so as to identify the action corresponding to the frequency domain data by using the network model.
  • features of the frequency domain data may be further extracted, for example by extraction and superposition, and the extracted features input into a network model to identify the action corresponding to the features, improving the efficiency of action recognition.
  • the frequency domain data may be obtained by subjecting the motion data to Fourier transform.
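For illustration only (this sketch is not part of the original disclosure), the Fourier-transform step can be written in Python with numpy; the sampling rate, window length, and function name are assumed values, not taken from the source:

```python
import numpy as np

def to_frequency_domain(motion_data, fs=100.0):
    """Convert a window of motion data (e.g. IMU acceleration samples)
    into frequency domain magnitudes via a real FFT.

    motion_data: 1-D array of samples; fs: assumed sampling rate in Hz.
    Returns (freqs, magnitudes)."""
    x = np.asarray(motion_data, dtype=float)
    x = x - x.mean()                      # remove the DC (gravity/offset) component
    spectrum = np.fft.rfft(x)             # one-sided FFT of a real-valued signal
    magnitudes = np.abs(spectrum) / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs, magnitudes

# Example: a 2 Hz oscillation (a synthetic "gesture") sampled at 100 Hz for 5 s
t = np.arange(0, 5.0, 0.01)
accel = np.sin(2 * np.pi * 2.0 * t)
freqs, mags = to_frequency_domain(accel, fs=100.0)
print(freqs[np.argmax(mags)])  # dominant frequency: 2.0
```

Because the spectrum discards the time axis, two executions of the same gesture at slightly different moments yield near-identical feature vectors, which is the alignment property the description relies on.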
  • the network model may be a neural network, or other network models. The embodiment of the present invention uses a neural network as an example for description.
  • the acquired motion data and/or the frequency domain data may also be normalized.
  • current gesture recognition is performed on time domain data. Because gesture actions performed over a period of time cannot be aligned to the same length of time, it is difficult to distinguish the various gestures from their time domain waveforms, and there are large differences in the data when different people make the same gesture. The recognition effect is therefore poor when gesture recognition is performed directly in the time domain.
  • the embodiment of the present invention performs motion recognition in the frequency domain. Since the frequency domain has no time axis, the gestures are naturally aligned, and the frequency components of the same gesture made by different people are very similar, which greatly improves the recognition of gesture actions.
  • the motion data acquired by the external device tends to be noisy, while the useful frequency range of the motion data is very low, which can make recognition of the gesture actions corresponding to the motion data inaccurate.
  • the obtained raw motion data can therefore be subjected to low-pass filtering with a relatively low pass band, and the amount of acquired motion data can be reduced by lowering its sampling rate, thereby reducing the computational cost of the algorithm.
  • the acquired frequency domain data may be normalized to facilitate data processing by the network model.
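A minimal preprocessing sketch, for illustration only: a moving average stands in for the low-pass filter (the source does not specify the filter design), and the decimation factor, kernel width, and function name are assumptions:

```python
import numpy as np

def preprocess(motion_data, decimate_by=4, kernel=8):
    """Low-pass filter, downsample, and normalize raw sensor samples.

    decimate_by and kernel are illustrative values, not from the source."""
    x = np.asarray(motion_data, dtype=float)
    # crude low-pass: a moving average suppresses high-frequency noise
    smoothed = np.convolve(x, np.ones(kernel) / kernel, mode="same")
    # reduce the sampling rate to cut the data volume and compute cost
    downsampled = smoothed[::decimate_by]
    # normalize to zero mean / unit variance for the network model
    return (downsampled - downsampled.mean()) / (downsampled.std() + 1e-8)

rng = np.random.default_rng(0)
raw = np.sin(np.linspace(0, 20, 400)) + 0.1 * rng.standard_normal(400)
features = preprocess(raw)
print(features.shape)  # (100,)
```

In practice a proper filter (e.g. Butterworth) with anti-aliasing before decimation would replace the moving average; the shape of the pipeline is what this sketch shows.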
  • the motion data may include data obtained by sampling the data output by the motion sensor of the external device within a preset time period; or the motion data may include data obtained by sampling the data output by the motion sensor of the external device a preset number of times.
  • the data output by the motion sensor can be collected over a preset time period. For example, a large number of tests found that a person's gesture generally does not exceed 5 s, so the network model can be continuously trained on, or gesture recognition performed on, the motion data collected every 5 s.
  • the embodiment of the invention performs gesture recognition by converting the motion data into frequency domain data, that is, the gesture is recognized from the entire segment of data, without identifying the start point and end point of the gesture within the 5 s, which improves the accuracy of motion recognition.
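The whole-segment approach can be sketched as slicing the continuous sensor stream into fixed-length windows and classifying each window as a unit; the sampling rate and function name below are assumptions (only the 5 s window length comes from the source):

```python
import numpy as np

def segment_windows(stream, fs=100, window_s=5.0):
    """Split a continuous sensor stream into fixed-length windows.

    Each window is classified as a whole, so no gesture start/end point
    needs to be detected inside it."""
    win = int(fs * window_s)
    n = len(stream) // win
    return np.asarray(stream[: n * win]).reshape(n, win)

rng = np.random.default_rng(0)
stream = rng.standard_normal(1730)      # ~17 s of samples at 100 Hz
windows = segment_windows(stream)
print(windows.shape)  # (3, 500)
```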
  • the motion data detected by the external device for a preset action may also be acquired; the motion data corresponding to the preset action is converted into frequency domain data, and the frequency domain data corresponding to the preset action, together with the preset action, is used to train the network model.
  • the network model is trained before using the network model to identify the action.
  • several stable gesture actions can be defined in advance, different users can perform these gesture actions, and the motion data obtained from a large number of different users performing them (such as the data output by a watch IMU) can be collected, converted into frequency domain data, and used to train a network model such as a neural network.
  • features of the frequency domain data may be further extracted, for example by extraction and superposition, and the extracted features taken as inputs with the preset action as the output to train the network model, such as a neural network, thereby improving the stability and reliability of the network model through a large amount of network training and improving the reliability of motion recognition based on the network model.
  • a regularization process may be employed on the motion data and/or the frequency domain data during training to reduce overfitting.
  • the network model is trained by various gesture actions performed by the user using an external device, thereby improving the recognition rate of the gesture action and reducing the false detection rate of the action.
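The training setup described above (frequency-domain features in, predefined gesture class out) can be illustrated with a deliberately tiny model; the softmax regression, synthetic spectra, and all parameter values here are stand-ins for the neural network and real user data the source describes:

```python
import numpy as np

rng = np.random.default_rng(0)

def gesture_spectrum(dominant_bin, n_bins=32):
    """Synthetic frequency-domain feature vector for one gesture sample:
    energy concentrated at a dominant bin plus noise (illustrative data,
    not from the source)."""
    x = 0.1 * rng.random(n_bins)
    x[dominant_bin] += 1.0
    return x

# Two predefined gestures, distinguished by their dominant frequency bin
X = np.array([gesture_spectrum(b) for b in [5] * 50 + [12] * 50])
y = np.array([0] * 50 + [1] * 50)

# Minimal softmax-regression "network model" trained by gradient descent
W = np.zeros((32, 2))
for _ in range(300):
    logits = X @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = X.T @ (p - np.eye(2)[y]) / len(X)
    W -= 1.0 * grad

pred = (X @ W).argmax(axis=1)
print((pred == y).mean())  # training accuracy on this toy data
```

A real deployment would substitute a neural network and data from many users, with regularization as the description notes, but the input/output contract is the same.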
  • the aircraft may also be controlled according to the identified action.
  • the control instruction corresponding to the gesture action may be generated, and the control instruction is sent to the controlled device, such as an aircraft, to enable the aircraft to perform an operation corresponding to the gesture action.
  • a plurality of gestures such as a gesture action 1, a gesture action 2, a gesture action 3, and the like, may be predefined, and a control function corresponding to each gesture action may be further preset to control the corresponding controlled device.
  • taking a watch as the external device and aircraft control as an example: when the user performs gesture action 1 with the hand wearing the watch, the aircraft can automatically start and take off; when the aircraft is in flight, the user can perform gesture action 2 and the aircraft enters the surround selfie function.
  • switching among the plurality of selfie modes may be controlled by gesture action 1, and while in a selfie mode, the user may perform gesture action 2 again to exit the selfie mode, and so on.
  • the user simply raises the hand wearing the watch toward the aircraft and performs gesture action 1, at which point the aircraft can enter this mode of flight: the aircraft flies on a spherical surface centered on the user, and flies according to the position pointed to by the user's finger.
  • the aircraft can adjust the flight radius, and it can automatically detect whether the user is pointing at the ground and safely control its height to prevent a crash.
  • the user can exit this mode by a preset gesture action, such as pointing at the aircraft and performing gesture action 1.
  • the user can also perform gesture action 3 to control the aircraft to make a safe landing. This enables flight control of the aircraft without needing a remote controller, enhancing the user experience.
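The gesture-to-command dispatch described above amounts to a lookup table. As a sketch only: the gesture numbers follow the source's example, while the command names and function are assumptions:

```python
# Hypothetical mapping from recognized gesture actions to aircraft
# control instructions (command names are illustrative, not from the source).
GESTURE_COMMANDS = {
    1: "takeoff_or_toggle_mode",   # gesture action 1: start/take off, switch modes
    2: "orbit_selfie",             # gesture action 2: enter/exit surround selfie
    3: "safe_landing",             # gesture action 3: land safely
}

def dispatch(gesture_id):
    """Return the control instruction for a recognized gesture, or None
    for an unrecognized one, so spurious motions are ignored."""
    return GESTURE_COMMANDS.get(gesture_id)

print(dispatch(3))  # safe_landing
print(dispatch(9))  # None
```

Returning None for unknown gestures reflects the description's emphasis on a low false detection rate: an unclassified motion triggers no command.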
  • the motion data of the external device is acquired and converted into frequency domain data, and the frequency domain data is used to identify the corresponding action. This effectively avoids false detections, keeps the false detection rate low, and further improves the accuracy and reliability of motion recognition, with good robustness.
  • FIG. 3 is a schematic flowchart diagram of another motion recognition method according to an embodiment of the present invention. Specifically, as shown in FIG. 3, the motion recognition method in the embodiment of the present invention may include the following steps:
  • the technical solution of the embodiment of the present invention may be specifically applied to the controlled device corresponding to the motion recognition, such as an aircraft, or may be specifically applied to other independent motion recognition devices, which is not limited in the embodiment of the present invention.
  • the external device may be a wearable device or a handheld device such as a wristband, a watch, a smart ring, or the like.
  • the acquired motion data may be data collected by a motion sensor such as an IMU disposed in the external device.
  • the feature data may be the frequency domain data converted from the acquired motion data; or it may be obtained by converting the acquired motion data into frequency domain data and further extracting features of the frequency domain data, for example obtaining the features by extraction, superposition, and the like.
  • the image recognition data may be obtained by an image acquisition device, such as a camera (specifically, a camera disposed on the aircraft), detecting the current motion (i.e., capturing an image corresponding to the motion), and processing the resulting image data.
  • the fused data includes features of the motion data and features of the image recognition data, which increases the features available for motion recognition. This avoids errors that arise when an action (such as a gesture) is recognized from the image alone, for example when the user performing the action is not in the image acquired by the image acquisition device, or appears too small in the image, causing the motion recognition to be erroneous or even impossible, as well as cases where the action cannot be recognized from the external device alone.
  • the identifying the action corresponding to the fused data may be specifically: inputting the fused data into a network model, to identify an action corresponding to the fused data by using the network model.
  • the network model may be a neural network, or may be another network model.
  • the embodiment of the present invention uses a neural network as an example for description.
  • the motion data in the embodiment of the present invention may include data obtained by sampling the data output by the motion sensor of the external device within a preset time period, or data obtained by sampling the data output by the motion sensor of the external device a preset number of times.
  • acquiring the feature data corresponding to the current action according to the motion data may specifically be: converting the motion data into frequency domain data, and acquiring the feature data corresponding to the current action according to the frequency domain data.
  • the merging the feature data and the image identification data corresponding to the current action to obtain the fused data may be specifically: merging the frequency domain data and the image identification data to obtain fused data.
  • the obtained motion data, such as IMU data, may be converted into frequency domain data, and the feature data determined based on the frequency domain data, for example by directly using the frequency domain data as the feature data, or by performing feature extraction on the frequency domain data, such as processing by extraction, superposition, and the like, and using the result as the feature data.
  • fused data including the feature data and the image recognition data may be generated, which is equivalent to fusing the features of the two data sources (image recognition data and motion data), so that gesture recognition can be performed based on the fused data, and device control performed based on the recognized gesture action.
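The simplest reading of this fusion step is feature concatenation. The sketch below is illustrative only (the source ultimately inserts the IMU features into an intermediate network layer rather than concatenating raw vectors); the function name and dimensions are assumptions:

```python
import numpy as np

def fuse(imu_features, image_features):
    """Fuse frequency-domain IMU features with image-derived features by
    concatenating them into one feature vector for the recognizer."""
    a = np.asarray(imu_features, dtype=float).ravel()
    b = np.asarray(image_features, dtype=float).ravel()
    return np.concatenate([a, b])

rng = np.random.default_rng(0)
fused = fuse(rng.standard_normal(32), rng.standard_normal(128))
print(fused.shape)  # (160,)
```

Either data source alone can fail (user out of frame, or no wearable signal); the fused vector carries both, which is the robustness argument made in the text.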
  • the motion data detected by the external device for the preset action may be acquired, and feature data corresponding to the preset action acquired according to the motion data; image recognition data obtained by detecting the preset action may be acquired; the image recognition data corresponding to the preset action may be fused with the feature data corresponding to the preset action to obtain fusion data corresponding to the preset action; and the network model may be trained using the fusion data corresponding to the preset action together with the preset action.
  • the obtained motion data, such as the IMU data, is converted to obtain frequency domain data, and the feature data is determined based on the frequency domain data: the frequency domain data may be used directly as the feature data, or the frequency domain data may be subjected to feature extraction, such as processing by extraction, superposition, and the like, and the result used as the feature data.
  • gestures are learned from the image through deep learning; an intermediate layer holds the learned image recognition data, and the feature data from the IMU can be inserted into that intermediate layer to obtain the fused data of the two, after which learning continues, which is equivalent to fusing the features of the two data sources (image recognition data and motion data). Training in this way means that the gesture action can still be recognized from the fused motion data even when the gesture is difficult to recognize from the image, and device control can then be performed based on the gesture action.
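The mid-layer insertion described above can be sketched as a forward pass in which the IMU feature vector joins the network at an intermediate layer; all layer sizes are illustrative and the weights are random (untrained), since only the wiring is the point here:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative layer sizes (not from the source):
# image branch: 128-d input -> 64-d intermediate layer
# fused head:   (64 + 32)-d -> 4 gesture classes
W_img = rng.standard_normal((128, 64))
W_out = rng.standard_normal((64 + 32, 4))

def forward(image_vec, imu_features):
    mid = np.tanh(image_vec @ W_img)             # learned image recognition data
    fused = np.concatenate([mid, imu_features])  # insert IMU features mid-network
    return fused @ W_out                         # continue to the gesture logits

logits = forward(rng.standard_normal(128), rng.standard_normal(32))
print(logits.shape)  # (4,)
```

Because the output head sees both branches, gradient updates during continued training flow into the image pathway and the IMU pathway alike, which is what lets the model fall back on motion features when the image is uninformative.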
  • the aircraft may also be controlled according to the identified action.
  • a plurality of gestures may be predefined, and a control function corresponding to each gesture action may be further preset to control a corresponding controlled device such as an aircraft.
  • taking aircraft control as an example, the operation of the aircraft can be controlled through gesture recognition, for example controlling takeoff, landing, selfie shooting, and directed flight, thereby enhancing the user experience.
  • the acquired motion data of the external device and the image recognition data may be fused; specifically, the motion data may be converted into frequency domain data, the feature data of the current action determined based on the frequency domain data, and the fused data obtained by fusing the feature data with the image recognition data, so that the fused data can be used to identify the corresponding action, thereby improving the accuracy and reliability of motion recognition, with good robustness.
  • this avoids the problems of low recognition accuracy, or even failure to recognize, that arise when only the motion data of the external device, or only image recognition, is used.
  • FIG. 4 is a schematic flowchart diagram of a network training method based on motion recognition according to an embodiment of the present invention.
  • the network training method in the embodiment of the present invention may include the following steps:
  • the technical solution of the embodiment of the present invention may be specifically applied to the controlled device corresponding to the action identification, such as an aircraft, or may be specifically applied to other independent network training devices, which is not limited in the embodiment of the present invention.
  • the external device may be a wearable device or a handheld device such as a wristband, a watch, a smart ring, or the like.
  • the acquired motion data may be data collected by a motion sensor, such as an IMU, disposed in the external device. Further, in the embodiment of the present invention, the feature data acquired according to the motion data and corresponding to the preset action is also referred to as the feature data corresponding to the motion data; the feature data is used to determine a specific gesture action.
  • the motion data in the embodiment of the present invention may include data obtained by sampling the data acquired by the motion sensor of the external device within a preset time period, or data obtained by sampling the data output by the motion sensor a preset number of times.
  • the image recognition data may be obtained by an image acquisition device, such as a camera (specifically, a camera disposed on the aircraft), detecting the current motion (i.e., capturing an image corresponding to the motion), and processing the resulting image data.
  • the network model may be a neural network, or may be another network model.
  • the embodiment of the present invention uses a neural network as an example for description.
• the performing of supervised learning on the image identification data by using the identified action may specifically be: using deep learning, taking the image recognition data as the input and the identified action as the target output for supervised learning. Since the feature dimension of the collected images is large, deep learning can reduce the dimension of the image recognition features, thereby improving the stability and reliability of the network model obtained by training on the learned image recognition data.
• in an optional embodiment, the network model may be a network model corresponding to recognizing actions by image recognition, and the network model may be trained by using the action corresponding to the identified motion data together with the image recognition data.
• for example, taking a wristband as the external device, the correspondence between the motion data features acquired by the wristband (which may be data obtained by processing the motion data, such as filtering, or frequency domain data obtained by applying a Fourier transform to the motion data) and gesture actions can be pre-trained.
• when the user performs a gesture action, the motion data collected by the wristband and the image recognition data are acquired synchronously, the features of the motion data are extracted, and the action corresponding to the currently collected wristband motion data features is recognized based on the pre-trained correspondence between the wristband motion data features and gesture actions.
• further, deep learning can be used, taking the image recognition data as the input and the recognized action as the target output, to perform supervised learning on the image recognition data.
• the network model can then be trained by using the dimension-reduced image recognition data, determining the correspondence between the image recognition data and gesture actions for subsequent gesture recognition.
• subsequently, the current image recognition data can be obtained from the acquired current image, and the current gesture action can be quickly and accurately recognized by the network model, so that device control can be performed based on the gesture action.
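Once a gesture action is recognized, dispatching it to a device operation is a simple lookup. In this sketch, both the gesture labels and the command names are hypothetical, chosen only to illustrate the control functions (take off, land, self-timer, fly) mentioned in the embodiment:

```python
# Hypothetical mapping from recognized gesture labels to aircraft
# operations; neither the labels nor the command names are taken
# from the embodiment.
GESTURE_COMMANDS = {
    "wave_up": "take_off",
    "wave_down": "land",
    "circle": "self_timer",
    "swipe": "fly",
}

def control_aircraft(gesture):
    # Look up the control operation for a recognized gesture; gestures
    # without a defined command are ignored rather than acted on.
    return GESTURE_COMMANDS.get(gesture, "no_op")
```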
• in an optional embodiment, the network model may be a network model corresponding to recognizing actions based on both the motion data and the image recognition data, and the network model may be trained by using the fusion data of the acquired motion data and the image recognition data.
  • the feature data corresponding to the preset action may be acquired according to the motion data.
• the image recognition data after the supervised learning is used to train the preset network model, which may specifically be: fusing the feature data with the image recognition data after the supervised learning to obtain fusion data, and training the preset network model by using the fusion data.
  • the external device is still used as an example of a wristband, and the corresponding relationship between the motion data feature and the gesture motion collected by the wristband can be pre-trained.
• when the user performs a gesture action, the motion data collected by the wristband and the image recognition data are acquired synchronously, the features of the motion data are extracted, and the action corresponding to the currently collected wristband motion data features is recognized based on the pre-trained correspondence. Further, deep learning is used, taking the image recognition data as the input and the recognized action as the target output, to perform supervised learning on the image recognition data.
• the network model can then be trained by using the dimension-reduced image identification data together with the feature data of the motion data, that is, by using the fusion data of the two. Subsequently, for a given gesture action, the fusion data of the learned current image recognition data and the feature data of the current motion data can be input to the network model, so that the current gesture action is quickly and accurately recognized, and device control can be further performed based on the gesture action.
• the acquiring of the feature data corresponding to the current action according to the motion data may specifically be: converting the motion data into frequency domain data, and acquiring the feature data of the current action according to the frequency domain data.
• the feature data is determined based on the frequency domain data; for example, the converted frequency domain data may be used directly as the feature data, or feature extraction may be performed on the frequency domain data, such as taking the data obtained after extraction, superposition, and similar processing as the feature data.
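A sketch of these two choices, direct use of the frequency domain data versus feature extraction by truncation and superposition (the bin count and 6-axis layout are assumptions for illustration):

```python
import numpy as np

def to_frequency_domain(motion_data):
    # Fourier-transform the time-domain motion data; the magnitude
    # spectrum is insensitive to where the gesture starts in the window.
    return np.abs(np.fft.rfft(motion_data, axis=0))

def extract_feature_data(motion_data, n_bins=None, superimpose=False):
    # Option 1: use the converted frequency domain data directly.
    # Option 2: perform feature extraction, e.g. keep only the first
    # n_bins low-frequency bins (extraction) and/or sum across the
    # sensor axes (superposition).
    f = to_frequency_domain(motion_data)
    if n_bins is not None:
        f = f[:n_bins]
    if superimpose:
        f = f.sum(axis=1)
    return f.flatten()

motion = np.random.randn(128, 6)  # assumed: 128 samples of 6-axis data
direct = extract_feature_data(motion)                                # (65 * 6,)
compact = extract_feature_data(motion, n_bins=16, superimpose=True)  # (16,)
```

The compact variant trades spectral detail for a much smaller feature vector fed to the network model.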
• subsequently, the current image identification data may also be obtained from images collected for the current action, and the gesture action corresponding to the current image recognition data may be recognized by the neural network trained with the above training method. The operation that the controlled device, such as an aircraft, needs to perform is then determined based on the pre-defined gesture actions and the control function corresponding to each gesture action; for example, gesture recognition can be used to control the aircraft to take off, land, take a self-timer photo, and fly, thereby enhancing the user experience.
• in the embodiment of the present invention, the motion data and the image recognition data are collected simultaneously when the user performs a gesture action, and the current gesture action is recognized based on the correspondence between the motion data and gesture actions; the recognized gesture action is then used, via deep learning, to perform supervised learning on the image recognition data, which improves the accuracy, reliability, and robustness of motion recognition.
  • FIG. 5 is a schematic structural diagram of a motion recognition apparatus according to an embodiment of the present invention.
• the motion recognition device in the embodiment of the present invention may be configured in an external device, in a controlled device corresponding to the motion recognition, such as an aircraft, or in another independent motion recognition device, and so on.
• the motion recognition apparatus 10 of the embodiment of the present invention may include an acquisition module 11 and a processing module 12, wherein:
  • the obtaining module 11 is configured to acquire motion data detected by an external device for the current motion
  • the processing module 12 is configured to convert the motion data acquired by the acquiring module 11 into frequency domain data, and use the frequency domain data to identify an action corresponding to the frequency domain data.
  • the external device may be a wearable device or a handheld device such as a wristband, a watch, a smart ring, or the like.
  • the external device is configured with a motion sensor such as an IMU.
  • the motion sensor configured by the external device outputs corresponding motion data, and the external device detects the motion data.
  • the processing module 12 is specifically configured to input the frequency domain data into a network model to identify an action corresponding to the frequency domain data by using the network model.
• the processing module 12 may further extract features of the frequency domain data, for example by extraction or superposition, and input the extracted features into the network model to identify the action corresponding to the features, thereby improving the efficiency of action recognition.
  • the frequency domain data may be obtained by subjecting the motion data to Fourier transform.
  • the network model may be a neural network, or other network models, which are not limited in the embodiment of the present invention.
  • the obtaining module 11 is further configured to acquire motion data detected by the external device for the preset motion
• the processing module 12 is further configured to convert the motion data corresponding to the preset action into frequency domain data, and to train the network model by using the frequency domain data corresponding to the preset action and the preset action.
  • the processing module 12 is further configured to perform regularization processing on the frequency domain data to reduce over-fitting of the frequency domain data.
  • the processing module 12 is further configured to perform normalization processing on the motion data.
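The normalization step can be sketched as follows; per-axis z-score normalization is one common choice here, since the embodiment does not specify the exact formula:

```python
import numpy as np

def normalize(motion_data, eps=1e-8):
    # Per-axis z-score normalization of raw motion data: zero mean and
    # unit variance per sensor axis, so axes with larger raw ranges do
    # not dominate the frequency-domain features.
    mean = motion_data.mean(axis=0)
    std = motion_data.std(axis=0)
    return (motion_data - mean) / (std + eps)
```

The small `eps` guards against division by zero on an axis whose readings are constant.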
• the motion data may include data obtained by sampling, within a preset time period, the data acquired by the motion sensor of the external device, or data obtained by taking a preset number of samples of the data output by the motion sensor of the external device.
  • the processing module 12 is further configured to control the aircraft according to the identified action.
• the processing module 12 may further generate a control instruction corresponding to the gesture action and send the control instruction to the controlled device, such as an aircraft, so that the aircraft performs the operation corresponding to the gesture action.
• in the embodiment of the present invention, the motion data of the external device is acquired and converted into frequency domain data, so that the frequency domain data is used to identify the corresponding action. This avoids having to detect the start and end points of the gesture data waveform, improves the recognition rate of motion recognition, reduces the false detection rate, and further improves the accuracy, reliability, and robustness of motion recognition.
  • FIG. 6 is a schematic structural diagram of another motion recognition apparatus according to an embodiment of the present invention.
  • the motion recognition device in the embodiment of the present invention may be specifically disposed in a controlled device corresponding to the motion recognition, such as an aircraft, or may be specifically disposed in another independent motion recognition device, etc., specifically, as shown in FIG. 6.
• the action recognition apparatus 20 of the embodiment of the present invention may include a first acquisition module 21, a second acquisition module 22, a fusion module 23, and a processing module 24, wherein:
  • the first obtaining module 21 is configured to acquire motion data detected by the external device for the current motion, and acquire feature data corresponding to the current motion according to the motion data;
  • the second acquiring module 22 is further configured to acquire an image collected for the current action, and process the image to obtain image recognition data corresponding to the current action;
  • the merging module 23 is configured to combine feature data corresponding to the current action and image identification data to obtain fused data.
  • the processing module 24 is configured to identify an action corresponding to the fused data.
  • the external device may be a wearable device or a handheld device such as a wristband, a watch, a smart ring, or the like.
  • the acquired motion data may be data collected by a motion sensor such as an IMU disposed in the external device.
• the feature data may refer to frequency domain data obtained by converting the acquired motion data (for example, by a Fourier transform), or may refer to features obtained by converting the acquired motion data into frequency domain data and then extracting features of the frequency domain data, for example by extraction or superposition.
  • the processing module 24 is specifically configured to input the merged data into a network model to identify an action corresponding to the merged data by using the network model.
  • the network model may be a neural network, or other network models, which are not limited in the embodiment of the present invention.
  • the first obtaining module 21 may be specifically configured to convert the motion data into frequency domain data when acquiring feature data corresponding to the current motion according to the motion data, to use the frequency domain data as Feature data corresponding to the current action;
  • the merging module 23 may be specifically configured to combine the frequency domain data and the image identification data acquired by the second acquiring module 22 to obtain fused data.
  • the first obtaining module 21 is further configured to acquire motion data detected by the external device for the preset action, and acquire feature data corresponding to the preset action according to the motion data;
  • the second acquiring module 22 is further configured to acquire an image collected for the preset action, and process the collected image to obtain image recognition data corresponding to the preset action;
  • the merging module 23 is further configured to combine the image identification data corresponding to the preset action with the feature data corresponding to the preset action to obtain fused data corresponding to the preset action;
  • the processing module 24 is further configured to train the network model by using the fused data corresponding to the preset action and the preset action.
• the motion data may include data obtained by sampling, within a preset time period, the data acquired by the motion sensor of the external device, or data obtained by taking a preset number of samples of the data output by the motion sensor of the external device.
  • the processing module 24 is further configured to control the aircraft according to the identified action.
• the motion data acquired from the external device and the image identification data may be fused. Specifically, the motion data may be converted into frequency domain data, the feature data of the current action may be determined based on the frequency domain data, and the fusion data may be obtained by fusing the feature data with the image identification data, so that the fusion data can be used to identify the corresponding action, thereby improving the accuracy, reliability, and robustness of motion recognition.
• thus, the problem of low recognition accuracy, or even failed recognition, caused by using only the motion data of the external device or only image recognition is avoided.
  • FIG. 7 is a schematic structural diagram of a network training apparatus based on motion recognition according to an embodiment of the present invention.
  • the motion recognition device in the embodiment of the present invention may be specifically disposed in a controlled device, such as an aircraft, corresponding to the motion recognition, or may be specifically configured in another independent network training device.
• the network training device 30 of the embodiment of the present invention may include a first obtaining module 31, a second obtaining module 32, a determining module 33, and a processing module 34, wherein:
  • the first obtaining module 31 is configured to acquire motion data detected by an external device for a preset motion
  • the second acquiring module 32 is further configured to acquire an image collected for the preset action, and process the image to obtain image recognition data corresponding to the preset action;
  • the determining module 33 is configured to identify an action corresponding to the motion data
• the processing module 34 is configured to perform supervised learning on the image identification data acquired by the second obtaining module 32 by using the action identified by the determining module 33, and to train a preset network model based on the image recognition data after the supervised learning.
  • the external device may be a wearable device or a handheld device such as a wristband, a watch, a smart ring, or the like.
  • the acquired motion data may be data collected by a motion sensor such as an IMU disposed in the external device.
  • the network model can be a neural network or other network models. The embodiment of the present invention uses a neural network as an example for description.
• the processing module 34 is specifically configured to use deep learning, taking the image recognition data acquired by the second acquiring module 32 as the input and the identified action as the target output, to perform the supervised learning.
  • the processing module 34 can reduce the image recognition feature dimension through deep learning, so as to improve the stability and reliability of the network model obtained based on the learned image recognition network training.
  • the first obtaining module 31 is further configured to acquire feature data corresponding to the preset action according to the motion data;
• the processing module 34 is further configured to fuse the feature data with the image recognition data after the supervised learning to obtain fusion data, and to train the preset network model by using the fusion data.
• the first obtaining module 31 is specifically configured to, when acquiring the feature data corresponding to the current action according to the motion data, convert the motion data into frequency domain data and use the frequency domain data as the feature data corresponding to the current action.
• the feature data is determined based on the frequency domain data; the obtaining module 31 may directly use the converted frequency domain data as the feature data, or may perform feature extraction on the frequency domain data, such as extraction and superposition, and use the resulting data as the feature data.
• in the embodiment of the present invention, the motion data and the image recognition data are collected simultaneously when the user performs a gesture action, and the current gesture action is recognized based on the correspondence between the motion data and gesture actions; the recognized gesture action is then used, via deep learning, to perform supervised learning on the image recognition data, which improves the accuracy, reliability, and robustness of motion recognition.
• the embodiment of the present invention further provides a computer storage medium, wherein the computer storage medium stores program instructions, and the program, when executed, may perform some or all of the steps of the motion recognition method in the embodiment corresponding to FIG. 2.
• the embodiment of the present invention further provides a computer storage medium, wherein the computer storage medium stores program instructions, and the program, when executed, may perform some or all of the steps of the motion recognition method in the embodiment corresponding to FIG. 3.
• the embodiment of the present invention further provides a computer storage medium, wherein the computer storage medium stores program instructions, and the program, when executed, may perform some or all of the steps of the motion-recognition-based network training method in the embodiment corresponding to FIG. 4.
  • FIG. 8 is a schematic structural diagram of a motion recognition device according to an embodiment of the present invention.
  • the motion recognition device in the embodiment of the present invention may be an external device such as a wristband, a watch, a ring, or the like, or may be a controlled device such as an aircraft, or may be another independent motion recognition device, or the like.
  • the motion recognition device 1 in the embodiment of the present invention may include: a communication interface 300, a memory 200, and a processor 100, and the processor 100 may be respectively connected to the communication interface 300 and the memory 200.
  • the motion recognition device 1 may further include a motion sensor, a camera, and the like.
• the communication interface 300 can include a wired interface, a wireless interface, and the like, and can be used to receive data transmitted by an external device, such as motion data collected by the external device for a certain gesture action of the user, or to transmit motion data acquired by the external device, and so on.
• the memory 200 may include a volatile memory, such as a random-access memory (RAM); the memory 200 may also include a non-volatile memory, such as a flash memory; the memory 200 may further include a combination of the above types of memories.
  • the processor 100 may be a central processing unit (CPU), a graphics processing unit (GPU), or the like.
  • the processor may further include a hardware chip.
  • the hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD) or a combination thereof.
  • the PLD may be a complex programmable logic device (CPLD) or a field-programmable gate array (FPGA).
  • the memory 200 is further configured to store program instructions.
  • the processor 100 can invoke the program instructions to implement the motion recognition method as shown in the embodiment of FIG. 2 of the present application.
• the communication interface 300 can be configured to acquire motion data detected by an external device for the current action;
• the processor 100 can invoke the program instructions stored in the memory 200 to execute the following: converting the motion data into frequency domain data, and identifying, by using the frequency domain data, the action corresponding to the frequency domain data.
  • the processor 100 is specifically configured to input the frequency domain data into a network model to identify an action corresponding to the frequency domain data by using the network model.
  • the network model may comprise a neural network.
  • the communication interface 300 is further configured to acquire motion data detected by the external device for the preset action
• the processor 100 is further configured to convert the motion data corresponding to the preset action into frequency domain data, and to train the network model by using the frequency domain data corresponding to the preset action and the preset action.
  • the processor 100 is further configured to perform regularization processing on the frequency domain data to reduce over-fitting of the frequency domain data.
  • the processor 100 is further configured to perform normalization processing on the motion data.
• the motion data may include data obtained by sampling, within a preset time period, the data acquired by the motion sensor of the external device, or data obtained by taking a preset number of samples of the data output by the motion sensor of the external device.
  • the processor 100 is further configured to control the aircraft according to the identified action.
• An embodiment of the present invention further provides an aircraft, including the motion recognition device according to any of the above embodiments of FIG. 8, which is configured to recognize an action.
  • FIG. 9 is a schematic structural diagram of another motion recognition device according to an embodiment of the present invention.
  • the motion recognition device in the embodiment of the present invention may be a controlled device such as an aircraft, or may be another independent motion recognition device, and the like.
• the motion recognition device 2 in the embodiment of the present invention may include: a communication interface 700, an image acquisition device 600, a memory 500, and a processor 400, and the processor 400 may be respectively connected to the communication interface 700, the image acquisition device 600, and the memory 500.
  • the motion recognition device 2 may further include a motion sensor.
  • the image acquisition device 600 may include a camera for acquiring an image, such as an image when the user performs a gesture.
  • the communication interface 700 can include a wired interface, a wireless interface, and the like, and can be used to receive data transmitted by the external device, such as receiving motion data collected by the external device for a certain gesture action of the user.
• the memory 500 may include a volatile memory, such as a random-access memory (RAM); the memory 500 may also include a non-volatile memory, such as a flash memory; the memory 500 may further include a combination of the above types of memories.
  • the processor 400 may be a central processing unit (CPU), a graphics processing unit (GPU), or the like.
  • the processor may further include a hardware chip.
  • the hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD) or a combination thereof.
  • the PLD may be a complex programmable logic device (CPLD) or a field-programmable gate array (FPGA).
  • the memory 500 is further configured to store program instructions.
  • the processor 400 can invoke the program instructions to implement the motion recognition method as shown in the embodiment of FIG. 3 of the present application.
  • the communication interface 700 is configured to acquire motion data detected by an external device for a current motion.
  • the image obtaining device 600 is configured to collect an image for the current motion
• the processor 400 is configured to process the image acquired by the image acquisition device to obtain image recognition data, acquire the feature data corresponding to the current action according to the motion data, fuse the feature data corresponding to the current action with the image identification data to obtain fusion data, and identify the action corresponding to the fusion data.
  • the processor 400 is specifically configured to input the merged data into a network model to identify an action corresponding to the merged data by using the network model.
  • the network model comprises a neural network.
  • the processor 400 is specifically configured to convert the motion data into frequency domain data, and fuse the frequency domain data and the image identification data to obtain fused data.
• specifically, the obtained motion data, such as the IMU data, can be transformed from the time domain to the frequency domain by a Fourier transform to obtain the frequency domain data, and the feature data is determined based on the frequency domain data; for example, the converted frequency domain data is used directly as the feature data, or feature extraction is performed on the frequency domain data, such as taking the data obtained after extraction, superposition, and similar processing as the feature data.
  • the communication interface 700 is further configured to acquire motion data detected by the external device for the preset action
  • the image obtaining device 600 is further configured to collect an image for the preset action
  • the processor 400 is further configured to process an image acquired by the image acquiring device to obtain image recognition data corresponding to the preset action, and acquire feature data corresponding to the preset action according to the motion data; Combining the image identification data corresponding to the preset action with the feature data corresponding to the preset action to obtain fusion data corresponding to the preset action; using the fusion data corresponding to the preset action and The preset action trains the network model.
• the motion data may include data obtained by sampling, within a preset time period, the data acquired by the motion sensor of the external device, or data obtained by taking a preset number of samples of the data output by the motion sensor of the external device.
  • the processor 400 is further configured to control the aircraft according to the identified action.
• An embodiment of the present invention further provides an aircraft, including the motion recognition device described above, which is configured to identify an action.
  • FIG. 10 is a schematic structural diagram of a network training device based on motion recognition according to an embodiment of the present invention.
  • the network training device in the embodiment of the present invention may be a controlled device such as an aircraft, or may be other independent network training devices, and the like.
• the network training device 3 in the embodiment of the present invention may include: a communication interface 1100, an image acquisition device 1000, a memory 900, and a processor 800, and the processor 800 may be respectively connected to the communication interface 1100, the image acquisition device 1000, and the memory 900.
• the network training device 3 may further include a motion sensor or the like.
  • the image acquisition device 1000 may include a camera for acquiring an image, such as an image when the user performs a gesture.
  • the communication interface 1100 can include a wired interface, a wireless interface, and the like, and can be used to receive data transmitted by the external device, such as receiving motion data collected by the external device for a certain gesture action of the user.
• the memory 900 may include a volatile memory, such as a random-access memory (RAM); the memory 900 may also include a non-volatile memory, such as a flash memory; the memory 900 may further include a combination of the above types of memories.
  • the processor 800 can be a central processing unit (CPU), a graphics processing unit (GPU), and the like.
  • the processor may further include a hardware chip.
  • the hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD) or a combination thereof.
  • the PLD may be a complex programmable logic device (CPLD) or a field-programmable gate array (FPGA).
  • the memory 900 is further configured to store program instructions.
  • the processor 800 can invoke the program instructions to implement a motion recognition based network training method as shown in the embodiment of FIG. 4 of the present application.
  • the communication interface 1100 is configured to acquire motion data detected by an external device for a preset action
  • the image obtaining device 1000 is configured to collect an image for the preset action
• the processor 800 is configured to process the image acquired by the image acquisition device to obtain image recognition data, identify the action corresponding to the motion data, perform supervised learning on the image identification data by using the identified action, and train the preset network model based on the image recognition data after the supervised learning.
  • the processor 800 is specifically configured to perform the supervised learning by using the image recognition data as an input and using the action as a target output.
• the processor 800 is further configured to acquire the feature data corresponding to the preset action according to the motion data, fuse the feature data with the image recognition data after the supervised learning to obtain fusion data, and train the preset network model by using the fusion data.
  • the processor 800 is specifically configured to convert the motion data into frequency domain data to use the frequency domain data as feature data corresponding to the current action.
  • the obtained motion data can be transformed from the time domain to the frequency domain by Fourier transform, and the feature data is determined based on the frequency domain data, for example by directly using the frequency domain data as the feature data, or by performing feature extraction on the frequency domain data, such as processing by extraction, superposition, and the like.
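As a concrete illustration of the bullet above, the step "transform the motion data to the frequency domain and use the magnitudes as feature data" can be sketched in pure Python. This is a minimal sketch under stated assumptions: the motion data is taken as a single-axis list of samples, and a naive DFT stands in for whatever Fourier implementation an actual embodiment would use.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Naive discrete Fourier transform; returns the magnitude of each
    frequency bin. Only the first half is kept, since real input yields
    a symmetric spectrum."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        s = sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
        mags.append(abs(s) / n)
    return mags

# A toy one-axis acceleration trace: a 2 Hz oscillation sampled at 16 Hz for 1 s.
trace = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
features = dft_magnitudes(trace)

# The energy concentrates in bin 2 (the 2 Hz component), so the feature
# vector cleanly encodes the dominant frequency of the motion.
peak_bin = max(range(len(features)), key=lambda k: features[k])
```

In a real system the feature vector would be computed per IMU axis and the per-axis spectra concatenated before further processing.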
  • the network model may comprise a neural network.
  • the motion data of the external device may be acquired and converted into frequency domain data, so that the frequency domain data is used to identify the corresponding action; or the motion data and the image recognition data may be fused to obtain fusion data, so that the fusion data is used to identify the corresponding action; or the action corresponding to the motion data may be determined and used to perform supervised learning on the image recognition data. This improves the accuracy and reliability of motion recognition, with better robustness.
An embodiment of the present invention further provides an aircraft, including
  • the motion recognition-based network training device according to any one of the foregoing embodiments described with reference to FIG. 10, configured to train the network model for motion recognition.
  • the disclosed apparatus and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the modules is only a logical function division.
  • there may be another division manner in actual implementation; for example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or module, and may be electrical, mechanical or otherwise.
  • the modules described as separate components may or may not be physically separated.
  • the components displayed as modules may or may not be physical modules, that is, they may be located in one place or distributed across multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional module in each embodiment of the present invention may be integrated into one processing module, or each module may exist physically separately, or two or more modules may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or in the form of hardware plus software function modules.
  • the above-described integrated modules implemented in the form of software function modules can be stored in a computer readable storage medium.
  • the software function modules described above are stored in a storage medium and include instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to perform part of the steps of the methods of the various embodiments of the present invention.
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A gesture recognition method, network training method, apparatus and equipment, the method comprising: acquiring movement data detected by an external device for a current gesture (101); and converting the movement data into frequency domain data, and using the frequency domain data to recognize a gesture corresponding to the frequency domain data (102). Therefore, the accuracy and reliability of gesture recognition may be improved.

Description

Motion Recognition Method, Network Training Method, Apparatus and Device

The disclosure of this patent document contains material that is subject to copyright protection. The copyright is owned by the copyright holder. The copyright owner has no objection to the reproduction by anyone of the patent document or the patent disclosure as it appears in the official records and files of the Patent and Trademark Office.

Technical Field

The present application relates to the field of communications technologies, and in particular to a motion recognition method, a network training method, an apparatus, and a device.

Background

With the continuous development of terminal technologies, more and more functions can be implemented on terminals, which brings great convenience to terminal users. For example, various wearable devices have emerged: a user can implement various functions by wearing a wearable device such as a wristband, for example viewing the time, recording motion data, or making calls, and can control other devices by having gesture actions made with the wearable device recognized, for example controlling an aircraft to take off, land, or change its flight path.

However, current gesture recognition identifies a gesture from the start point and end point of the recognition-data waveform. In this approach, the user may move in various ways even when not making a gesture, making the start and end points hard to distinguish, so that gesture actions are misrecognized or not recognized at all; the accuracy and reliability of such gesture recognition are low.
Summary of the Invention

Embodiments of the present invention provide a motion recognition method, a network training method, an apparatus, and a device, which can improve the accuracy and reliability of recognizing a user's gesture actions.
In a first aspect, an embodiment of the present invention provides a motion recognition apparatus, including an acquisition module and a processing module, where:
the acquisition module is configured to acquire motion data detected by an external device for a current action; and
the processing module is configured to convert the motion data acquired by the acquisition module into frequency domain data, and use the frequency domain data to identify an action corresponding to the frequency domain data.
In a second aspect, an embodiment of the present invention further provides a motion recognition apparatus, including a first acquisition module, a second acquisition module, a fusion module, and a processing module, where:
the first acquisition module is configured to acquire motion data detected by an external device for a current action, and acquire feature data corresponding to the current action according to the motion data;
the second acquisition module is configured to acquire an image collected for the current action, and process the image to obtain image recognition data corresponding to the current action;
the fusion module is configured to fuse the feature data corresponding to the current action with the image recognition data to obtain fusion data; and
the processing module is configured to identify an action corresponding to the fusion data.
In a third aspect, an embodiment of the present invention further provides a motion-recognition-based network training apparatus, including a first acquisition module, a second acquisition module, a determining module, and a processing module, where:
the first acquisition module is configured to acquire motion data detected by an external device for a preset action;
the second acquisition module is configured to acquire an image collected for the preset action, and process the image to obtain image recognition data corresponding to the preset action;
the determining module is configured to identify an action corresponding to the motion data; and
the processing module is configured to perform supervised learning on the image recognition data acquired by the second acquisition module by using the action identified by the determining module, and train a preset network model based on the image recognition data after the supervised learning.
In a fourth aspect, an embodiment of the present invention further provides a motion recognition method, including:
acquiring motion data detected by an external device for a current action; and
converting the motion data into frequency domain data, and using the frequency domain data to identify an action corresponding to the frequency domain data.
In a fifth aspect, an embodiment of the present invention further provides a motion recognition method, including:
acquiring motion data detected by an external device for a current action, and acquiring feature data corresponding to the current action according to the motion data;
acquiring an image collected for the current action, and processing the image to obtain image recognition data corresponding to the current action;
fusing the feature data corresponding to the current action with the image recognition data to obtain fusion data; and
identifying an action corresponding to the fusion data.
In a sixth aspect, an embodiment of the present invention further provides a motion-recognition-based network training method, including:
acquiring motion data detected by an external device for a preset action;
acquiring an image collected for the preset action, and processing the image to obtain image recognition data corresponding to the preset action;
identifying an action corresponding to the motion data; and
performing supervised learning on the image recognition data by using the identified action, and training a preset network model based on the image recognition data after the supervised learning.
In a seventh aspect, an embodiment of the present invention further provides a motion recognition device, including a processor and a communication interface, the processor being connected to the communication interface, where:
the communication interface is configured to acquire motion data detected by an external device for a current action; and
the processor is configured to convert the motion data into frequency domain data, and use the frequency domain data to identify an action corresponding to the frequency domain data.
In an eighth aspect, an embodiment of the present invention further provides a motion recognition device, including a processor, a communication interface, and an image acquisition apparatus, the processor being connected to the image acquisition apparatus and the communication interface respectively, where:
the communication interface is configured to acquire motion data detected by an external device for a current action;
the image acquisition apparatus is configured to collect an image for the current action; and
the processor is configured to acquire the image collected by the image acquisition apparatus for the current action, process the image to obtain image recognition data corresponding to the current action, acquire feature data corresponding to the current action according to the motion data, fuse the feature data corresponding to the current action with the image recognition data to obtain fusion data, and identify an action corresponding to the fusion data.
In a ninth aspect, an embodiment of the present invention further provides a motion-recognition-based network training device, including an image acquisition apparatus, a processor, and a communication interface, the processor being connected to the image acquisition apparatus and the communication interface respectively, where:
the communication interface is configured to acquire motion data detected by an external device for a preset action;
the image acquisition apparatus is configured to collect an image for the preset action; and
the processor is configured to acquire the image collected by the image acquisition apparatus for the preset action, process the image to obtain image recognition data corresponding to the preset action, identify an action corresponding to the motion data, perform supervised learning on the image recognition data by using the identified action, and train a preset network model based on the image recognition data after the supervised learning.
In a tenth aspect, an embodiment of the present invention further provides an aircraft, including:
a power system, configured to provide flight power for the aircraft; and
the motion recognition device according to any implementation of the seventh aspect, configured to recognize actions.
In an eleventh aspect, an embodiment of the present invention further provides an aircraft, including:
a power system, configured to provide flight power for the aircraft; and
the motion recognition device according to any implementation of the eighth aspect, configured to recognize actions.
In a twelfth aspect, an embodiment of the present invention further provides an aircraft, including:
a power system, configured to provide flight power for the aircraft; and
the motion-recognition-based network training device according to any implementation of the ninth aspect, configured to train a network model for motion recognition.
Implementing the embodiments of the present invention has the following beneficial effects:
The embodiments of the present invention may acquire motion data of an external device and convert the motion data into frequency domain data, so that the frequency domain data is used to identify the corresponding action; or fuse the motion data with image recognition data to obtain fusion data, so that the fusion data is used to identify the corresponding action; or determine the action corresponding to the motion data and use the determined action to perform supervised learning on the image recognition data. This improves the accuracy and reliability of motion recognition, with better robustness.
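The embodiments above do not fix a particular fusion operator for combining motion-derived feature data with image recognition data. A common minimal choice, sketched below purely as an assumption, is to normalize each modality's feature vector and concatenate them; the variable names and example vectors are illustrative, not part of the disclosure.

```python
def fuse(motion_features, image_features):
    """Minimal feature-level fusion: scale each vector to unit L2 norm
    so that neither modality dominates, then concatenate."""
    def l2_normalize(v):
        norm = sum(x * x for x in v) ** 0.5
        return [x / norm for x in v] if norm > 0 else list(v)
    return l2_normalize(motion_features) + l2_normalize(image_features)

motion_feat = [0.1, 0.8, 0.05, 0.05]   # e.g. DFT magnitudes from the IMU data
image_feat = [0.2, 0.7, 0.1]           # e.g. per-gesture scores from the image branch
fused = fuse(motion_feat, image_feat)
# `fused` would then be fed to the recognition model in place of either
# single-modality feature vector.
```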
Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, the accompanying drawings in the following description show merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative efforts.

FIG. 1 is a schematic diagram of an aircraft control system according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a motion recognition method according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of another motion recognition method according to an embodiment of the present invention;
FIG. 4 is a schematic flowchart of a motion-recognition-based network training method according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a motion recognition apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of another motion recognition apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a motion-recognition-based network training apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a motion recognition device according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of another motion recognition device according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a motion-recognition-based network training device according to an embodiment of the present invention.
Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts shall fall within the protection scope of the present invention.
FIG. 1 is a schematic diagram of an aircraft control system. The system includes an aircraft such as a drone 110 and a wearable device 120, where the drone 110 includes a flight body, a gimbal, and an imaging device 130. In this embodiment, the flight body includes a plurality of rotors and rotor motors that drive the rotors to rotate, thereby providing the power required for the drone 110 to fly. The imaging device 130 is mounted on the flight body through the gimbal and is used for image or video capture during the flight of the drone 110; it includes, but is not limited to, a multispectral imager, a hyperspectral imager, a visible-light camera, an infrared camera, and the like. The gimbal is a multi-axis transmission and stabilization system that includes a plurality of rotating shafts and gimbal motors. A gimbal motor compensates for the shooting angle of the imaging device 130 by adjusting the rotation angle of a rotating shaft, and prevents or reduces jitter of the imaging device 130 through a suitable damping mechanism. Of course, in other embodiments, the imaging device 130 may be mounted on the flight body directly or in other manners. The wearable device 120 is worn by an operator and communicates with the drone 110 wirelessly to control the flight of the drone 110 and the shooting process of the imaging device 130. Specifically, the wearable device 120 has a built-in motion sensor: when the wearable device moves with the operator's hand, the motion sensor senses the hand movement and outputs corresponding motion data, according to which the drone is controlled. In addition, the imaging device mounted on the drone may also capture image data of a person's body movements, recognize the body movements from the image data, and control the aircraft according to the recognized body movements.
Embodiments of the present invention disclose a motion recognition method, a network training method based on the motion recognition method, related apparatuses, and related devices, which can improve the accuracy and reliability of motion recognition with good robustness. They are described in detail below.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a motion recognition method according to an embodiment of the present invention. Specifically, as shown in FIG. 2, the motion recognition method in this embodiment may include the following steps:
101. Acquire motion data detected by an external device for the current action.
Optionally, the technical solution of this embodiment may be applied in the external device, in a controlled device corresponding to the motion recognition such as an aircraft, or in another independent motion recognition device; this is not limited in the embodiments of the present invention.
Optionally, the external device may be a wearable or handheld device such as a wristband, a watch, or a smart ring. The external device is configured with a motion sensor such as an inertial measurement unit (IMU). When the external device moves or an action such as a gesture is made, the motion sensor outputs corresponding motion data, which the external device detects; the motion data may be one or both of angular acceleration and acceleration.
102. Convert the motion data into frequency domain data, and use the frequency domain data to identify an action corresponding to the frequency domain data.
Optionally, using the frequency domain data to identify the corresponding action may specifically be: inputting the frequency domain data into a network model, so that the action corresponding to the frequency domain data is identified by the network model. Specifically, after the frequency domain data is obtained, features of the frequency domain data may be further extracted, for example by extraction and superposition, and the extracted features may be input into the network model to identify the corresponding action, improving recognition efficiency. The frequency domain data may be obtained by applying a Fourier transform to the motion data. The network model may be a neural network or another network model; the embodiments of the present invention take a neural network as an example.
Optionally, the acquired motion data and/or the frequency domain data may be regularized to reduce overfitting. Further optionally, the acquired motion data and/or the frequency domain data may also be normalized.
Specifically, current gesture recognition operates on time domain data. Because gestures performed within a certain period cannot be aligned to the same time length, it is difficult to distinguish gestures from time domain waveforms, and the data for the same gesture differs considerably from person to person; recognition performed directly in the time domain therefore works poorly. The embodiments of the present invention perform recognition in the frequency domain: since the frequency domain has no time axis, gestures are aligned, and the frequency components of the same gesture are very similar across different people, which greatly improves the recognition of gesture actions. In addition, the motion data acquired by the external device (such as the data output by an IMU) tends to be noisy, and its frequency range is very low, which makes gesture recognition based on that data inaccurate. Therefore, the acquired raw motion data may be low-pass filtered with a fairly low cut-off band, and the sampling rate of the motion data may be reduced to shrink the amount of data and lower the computational cost of the algorithm. Further, the acquired frequency domain data may be normalized to facilitate processing by the network model.
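The alignment argument above can be checked numerically: the magnitude spectrum of a signal is unchanged under a circular time shift, which idealizes the same gesture occurring at a different moment within the window. The sketch below uses a naive stdlib DFT; real gestures are of course not exact circular shifts, so this is a demonstration of the principle rather than of the disclosed method.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Magnitude of every DFT bin of a real-valued signal (naive O(n^2) DFT)."""
    n = len(samples)
    return [abs(sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]

# The "same gesture" occurring at two different moments inside the window:
n = 32
gesture = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 5 * t / n)
           for t in range(n)]
shifted = gesture[7:] + gesture[:7]   # circular shift by 7 samples

spec_a = magnitude_spectrum(gesture)
spec_b = magnitude_spectrum(shifted)

# The time-domain waveforms differ sample by sample, yet the magnitude
# spectra coincide up to floating-point error.
max_diff = max(abs(a - b) for a, b in zip(spec_a, spec_b))
```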
Optionally, the motion data may include data obtained by sampling the output of the motion sensor of the external device within a preset period, or data obtained by sampling that output a preset number of times. For example, the sensor output may be collected over a preset period: extensive testing shows that a human gesture generally lasts no more than 5 s, so the motion data collected in each 5 s window can be used continuously to train the network model or to recognize gesture actions. The embodiments of the present invention recognize gestures by converting the motion data into frequency domain data, that is, the whole segment of data is used to recognize the gesture, without identifying the start and end points of a gesture within the 5 s window, which improves the accuracy of motion recognition.
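The preprocessing pipeline described above (low-pass filtering, sampling-rate reduction, and whole 5 s windows with no start/end detection) can be sketched as follows. This is a hedged illustration: the moving-average filter stands in for whatever low-pass filter an implementation uses, and the 100 Hz input rate and decimation factor of 5 are assumed numbers, not values from the disclosure.

```python
def moving_average(samples, width=5):
    """Crude low-pass filter: average each sample with its neighbours."""
    half = width // 2
    out = []
    for i in range(len(samples)):
        lo, hi = max(0, i - half), min(len(samples), i + half + 1)
        out.append(sum(samples[lo:hi]) / (hi - lo))
    return out

def decimate(samples, factor):
    """Keep every `factor`-th sample to reduce the data volume."""
    return samples[::factor]

def fixed_windows(samples, rate_hz, seconds=5):
    """Cut the stream into whole windows of `seconds` length; each window is
    classified as a unit, so no start/end points need to be detected."""
    size = rate_hz * seconds
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]

raw = list(range(1000))            # placeholder for 10 s of raw IMU samples at 100 Hz
smooth = moving_average(raw)       # suppress sensor noise
low_rate = decimate(smooth, 5)     # 100 Hz -> 20 Hz, one fifth of the data
windows = fixed_windows(low_rate, rate_hz=20, seconds=5)
```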
Optionally, motion data detected by the external device for a preset action may also be acquired; the motion data corresponding to the preset action is converted into frequency domain data, and the network model is trained by using that frequency domain data together with the preset action. The network model is trained before it is used to recognize actions. Specifically, several highly stable gesture actions can be defined in advance and performed by different users; the motion data obtained when a large number of different users make these gestures (such as the data output by a watch IMU) is collected and converted into frequency domain data, which is used to train the network model such as a neural network. Specifically, after the frequency domain data is obtained, its features may be further extracted, for example by extraction, superposition, and similar processing; the extracted features serve as the input and the preset action as the target output for training the network model, so that extensive training improves the stability and reliability of the network model and of the motion recognition based on it. Further optionally, regularization may be applied to the motion data and/or the frequency domain data during training to reduce overfitting. Training the network model with the various gesture actions users make with the external device increases the gesture recognition rate and lowers the false detection rate.
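The supervised training pass described above (frequency features as input, preset gesture label as target output) can be sketched with a deliberately minimal model. A single softmax layer trained by gradient descent stands in for the patent's neural network; the toy feature vectors, class count, and hyperparameters are all assumptions for illustration only.

```python
import math

def train_softmax(features, labels, n_classes, epochs=200, lr=0.5):
    """Minimal stand-in for the network model: one softmax layer trained by
    gradient descent, with frequency features as input and the preset
    gesture label as the supervised target."""
    dim = len(features[0])
    w = [[0.0] * dim for _ in range(n_classes)]
    for _ in range(epochs):
        for x, y in zip(features, labels):
            scores = [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
            m = max(scores)                     # stabilize the exponentials
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            probs = [e / z for e in exps]
            for c in range(n_classes):
                grad = probs[c] - (1.0 if c == y else 0.0)
                for i in range(dim):
                    w[c][i] -= lr * grad * x[i]
    return w

def predict(w, x):
    scores = [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
    return scores.index(max(scores))

# Toy training set: two "gestures" whose energy sits in different frequency bins.
gesture_a = [[1.0, 0.1], [0.9, 0.2], [1.1, 0.0]]   # energy in bin 0
gesture_b = [[0.1, 1.0], [0.0, 0.9], [0.2, 1.1]]   # energy in bin 1
feats = gesture_a + gesture_b
labels = [0, 0, 0, 1, 1, 1]
w = train_softmax(feats, labels, n_classes=2)
```

A production system would use a deeper network and add the regularization the text mentions; the sketch only shows the input/target wiring.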
Further optionally, after the frequency domain data is used to identify the corresponding action, the aircraft may be controlled according to the identified action.
Specifically, after the external device such as a watch recognizes the current gesture action, it can generate the control instruction corresponding to that gesture and send it to the controlled device such as an aircraft, so that the aircraft performs the corresponding operation.
Specifically, a plurality of gestures may be predefined, for example gesture action 1, gesture action 2, gesture action 3, and so on, and a control function may be preset for each gesture action to control the corresponding controlled device. Taking a watch as the external device and an aircraft as the controlled device: when the user performs gesture action 1 with the hand wearing the watch, the aircraft may automatically start and take off; when the aircraft is in flight and the user performs gesture action 2, the aircraft may enter an orbiting selfie mode. Further, gesture action 1 may be set to switch among several selfie modes, and in selfie mode the user may perform gesture action 2 again to exit, and so on. As another example, suppose the aircraft is already in flight and not in selfie mode; the user simply raises the hand wearing the watch, points a finger at the aircraft, and performs gesture action 1, whereupon the aircraft enters a "fly where I point" mode: the aircraft flies on a sphere centered on the user with a radius equal to its current distance from the user, moving to wherever the user's finger points.
When it is recognized that the user is pointing at the aircraft and rotating the arm, the aircraft may adjust its flight radius, and it may automatically detect whether the user is pointing at the ground and limit its altitude to avoid crashing. In this mode, the user may exit by performing a preset gesture, such as pointing at the aircraft and performing gesture action 1. As yet another example, when the aircraft must make an emergency landing, the user may perform gesture action 3 to command a safe landing. Flight control is thus achieved without a remote controller, which improves the user experience.
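The gesture-to-command mapping described above can be sketched as a simple state-aware dispatch. The gesture numbering follows the examples in the text; the command names and state flags are hypothetical, since the disclosure does not fix a concrete instruction set:

```python
def command_for(gesture, in_flight, in_selfie_mode):
    """Map a recognized gesture number to an aircraft command, given the
    current flight state. All command names are illustrative."""
    if gesture == 1 and not in_flight:
        return "TAKEOFF"
    if gesture == 2:
        return "EXIT_SELFIE" if in_selfie_mode else "ENTER_ORBIT_SELFIE"
    if gesture == 3:
        return "EMERGENCY_LAND"
    return "IGNORE"  # unrecognized gesture: do nothing

cmd = command_for(gesture=2, in_flight=True, in_selfie_mode=False)
```

The controlled device would receive the resulting instruction over its control link and perform the corresponding operation.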
In the embodiment of the present invention, motion data of the external device is acquired and converted into frequency-domain data, and the frequency-domain data is used to recognize the corresponding action. This neatly avoids having to locate the start and end points of the gesture-data waveform, raises the recognition rate, lowers the false-detection rate, and further improves the accuracy, reliability, and robustness of action recognition.
Referring to FIG. 3, FIG. 3 is a schematic flowchart of another action recognition method according to an embodiment of the present invention. Specifically, as shown in FIG. 3, the action recognition method of the embodiment of the present invention may include the following steps:
201. Acquire motion data detected by an external device for a current action, and obtain feature data corresponding to the current action from the motion data.
Optionally, the technical solution of this embodiment may be applied in the controlled device corresponding to the action recognition, such as an aircraft, or in another standalone action recognition device; this embodiment imposes no limitation.
Optionally, the external device may be a wearable or handheld device, such as a wristband, a watch, or a smart ring. The acquired motion data may be data collected by a motion sensor, such as an IMU, provided in the external device. Further, in this embodiment the feature data may be the frequency-domain data converted from the acquired motion data, or features further extracted from that frequency-domain data, for example by decimation, superposition, or similar processing.
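The extraction and superposition processing mentioned above is not specified in detail by the disclosure; one plausible reading, sketched below, decimates the frequency bins and also sums adjacent groups of bins, then concatenates the two reduced views:

```python
import numpy as np

def extract_features(freq_data, step=4):
    """Reduce a frequency-domain vector two ways and concatenate the results:
    decimation keeps the first bin of every group of `step` bins, and
    superposition sums each group. `step` is an illustrative choice."""
    freq_data = np.asarray(freq_data, dtype=float)
    usable = len(freq_data) - len(freq_data) % step   # drop any ragged tail
    groups = freq_data[:usable].reshape(-1, step)
    decimated = groups[:, 0]         # decimation: one bin per group
    superposed = groups.sum(axis=1)  # superposition: group sums
    return np.concatenate([decimated, superposed])

feats = extract_features(np.arange(8.0))
```

Either reduced view (or both together, as here) could serve as the feature data fed to the network model.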
202. Acquire an image captured for the current action, and process the image to obtain image recognition data corresponding to the current action.
Optionally, the image recognition data may be obtained by an image acquisition apparatus such as a camera, in particular a camera mounted on the aircraft, which captures image data of the current action (that is, captures the image corresponding to the action); the image recognition data is the result of processing that image data.
203. Fuse the feature data corresponding to the current action with the image recognition data to obtain fused data.
The fused data includes both features of the motion data and features of the image recognition data, which enlarges the feature set available for action recognition. This avoids the failures of image-only recognition, where the gesturing user is outside the image captured by the image acquisition apparatus, or appears so small in it that the action is misrecognized or not recognized at all, and of device-only recognition, where a static gesture cannot be recognized from the external device's motion data alone. Fusing the features of both sources improves the action recognition rate and accuracy.
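The simplest realization of the fusion in step 203, assuming both branches already produce flat feature vectors (the disclosure does not fix the fusion operator), is concatenation:

```python
import numpy as np

def fuse(motion_features, image_features):
    """Fuse the two feature sources into one vector by concatenation, so the
    downstream recognizer sees both even when one source is weak (e.g. the
    user is outside or tiny in the captured image)."""
    return np.concatenate([np.ravel(motion_features), np.ravel(image_features)])

fused = fuse(np.zeros(96), np.ones(256))  # vector sizes are illustrative
```

The fused vector is what step 204 passes to the network model for recognition.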
204. Recognize the action corresponding to the fused data.
Optionally, recognizing the action corresponding to the fused data may specifically be: inputting the fused data into a network model so that the network model recognizes the corresponding action. The network model may be a neural network or another network model; this embodiment takes a neural network as an example.
Optionally, the motion data in this embodiment may include data obtained by sampling the output of the external device's motion sensor over a preset time period, or by sampling that output a preset number of times.
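The two windowing options just described, a fixed sample count or a fixed time span, might be collected as follows, where `read_imu` is a hypothetical callable returning one sensor reading:

```python
import time

def sample_fixed_count(read_imu, count=128):
    """Window the sensor output by a preset number of samples."""
    return [read_imu() for _ in range(count)]

def sample_fixed_duration(read_imu, seconds=0.05, poll_hz=1000):
    """Window the sensor output by a preset time span, polling at an
    assumed rate; the real sampling rate is device-dependent."""
    samples, deadline = [], time.monotonic() + seconds
    while time.monotonic() < deadline:
        samples.append(read_imu())
        time.sleep(1.0 / poll_hz)
    return samples

window = sample_fixed_count(lambda: (0.0, 0.0, 9.8), count=8)
```

Either windowing choice yields the fixed-shape motion-data block that the Fourier transform and feature extraction operate on.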
Further optionally, obtaining the feature data corresponding to the current action from the motion data may specifically be: converting the motion data into frequency-domain data and obtaining the feature data from the frequency-domain data. Further, fusing the feature data corresponding to the current action with the image recognition data may specifically be: fusing the frequency-domain data with the image recognition data to obtain the fused data. Specifically, the acquired motion data, such as IMU data, may be transformed from the time domain to the frequency domain by a Fourier transform, and the feature data determined from the result: the frequency-domain data may be used directly as the feature data, or features may be extracted from it, for example by decimation, superposition, or similar processing. Once the feature data and the image recognition data for the current action are obtained, fused data containing both can be generated, which amounts to merging the features of two data sources (image recognition data and motion data); gesture recognition can then be performed on the fused data, and device control can be performed based on the recognized gesture.
Further, motion data detected by the external device for a preset action may also be acquired, and feature data corresponding to the preset action obtained from it; image recognition data detected for the preset action is acquired; the image recognition data corresponding to the preset action and the feature data corresponding to the preset action are fused into fused data corresponding to the preset action; and the network model is trained using that fused data together with the preset action.
Specifically, the acquired motion data, such as IMU data, may be transformed from the time domain to the frequency domain by a Fourier transform to obtain frequency-domain data, from which the feature data is determined, either directly or by feature extraction such as decimation or superposition. The image branch learns gestures through deep learning; during deep learning, an intermediate layer holds the learned image recognition data, and the feature data derived from the IMU data can be inserted into such an intermediate layer to obtain the fused data of the two, after which learning continues. This amounts to merging the features of the two data sources (image recognition data and motion data). With this fusion scheme and extensive training, including training on image recognition data for gestures that are hard to recognize from images alone (for example when the gesturing user is outside the image or appears very small in it), gesture actions can still be recognized by drawing on the motion data even when the image alone is insufficient, and device control can then be performed based on the recognized gesture.
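Inserting the IMU-derived features into an intermediate layer, as described, can be sketched with plain NumPy. The layer sizes, the five-gesture output, and the random untrained weights are all illustrative assumptions; a real system would learn the weights as described in the text:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Illustrative sizes: 1024 raw image features, a 64-unit intermediate layer,
# 96 IMU-derived frequency features, 5 gesture classes.
W1 = rng.standard_normal((64, 1024)) * 0.01
W2 = rng.standard_normal((5, 64 + 96)) * 0.01

def forward(image_features, imu_features):
    """Run the image branch up to the intermediate layer, then inject the
    IMU features there before the final classification layer."""
    hidden = np.maximum(W1 @ image_features, 0.0)   # ReLU intermediate layer
    fused = np.concatenate([hidden, imu_features])  # mid-network fusion point
    logits = W2 @ fused                             # classify the fused vector
    return int(np.argmax(logits))

gesture = forward(rng.standard_normal(1024), rng.standard_normal(96))
```

Because the fusion happens before the final layer, the classifier can lean on the motion features whenever the image-derived hidden activations are uninformative.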
Further optionally, after the action corresponding to the fused data is recognized, an aircraft may be controlled according to the recognized action.
Specifically, a plurality of gestures may be predefined, and a control function preset for each gesture action to control the corresponding controlled device, such as an aircraft. Taking aircraft control as an example, gesture recognition can command takeoff, landing, selfie capture, "fly where I point," and similar operations, which improves the user experience.
In the embodiment of the present invention, the acquired motion data of the external device is fused with the image recognition data: specifically, the motion data may be converted into frequency-domain data, the feature data of the current action determined from it, and that feature data fused with the image recognition data to obtain fused data, from which the corresponding action can be recognized. This improves the accuracy, reliability, and robustness of action recognition, and avoids the prior-art problem that recognition based only on the external device's motion data, or only on images, has low accuracy or fails entirely.
Referring to FIG. 4, FIG. 4 is a schematic flowchart of a network training method based on action recognition according to an embodiment of the present invention. Specifically, as shown in FIG. 4, the network training method of the embodiment of the present invention may include the following steps:
301. Acquire motion data detected by an external device for a preset action.
Optionally, the technical solution of this embodiment may be applied in the controlled device corresponding to the action recognition, such as an aircraft, or in another standalone network training device; this embodiment imposes no limitation.
Optionally, the external device may be a wearable or handheld device, such as a wristband, a watch, or a smart ring. The acquired motion data may be data collected by a motion sensor, such as an IMU, provided in the external device. Further, in this embodiment, the feature data corresponding to the preset action that is obtained from the motion data is also called the feature data corresponding to the motion data; it is used to determine the specific gesture action.
Optionally, the motion data in this embodiment may include data obtained by sampling the output of the external device's motion sensor over a preset time period, or by sampling that output a preset number of times.
302. Acquire an image captured for the preset action, and process the image to obtain image recognition data corresponding to the preset action.
Optionally, the image recognition data may be obtained by an image acquisition apparatus such as a camera, in particular a camera mounted on the aircraft, which captures image data of the action (that is, captures the image corresponding to the action); the image recognition data is the result of processing that image data.
303. Recognize the action corresponding to the motion data.
304. Perform supervised learning on the image recognition data using the recognized action, and train a preset network model based on the image recognition data after the supervised learning.
The network model may be a neural network or another network model; this embodiment takes a neural network as an example.
Optionally, performing supervised learning on the image recognition data using the recognized action may specifically be: using deep learning with the image recognition data as the input and the recognized action as the target output. Specifically, because images yield a large number of feature dimensions, deep learning can reduce the dimensionality of the image recognition features, which improves the stability and reliability of the network model trained on the learned image recognition data.
Optionally, the network model may be one that recognizes actions from images, in which case it can be trained using the action corresponding to the recognized motion data together with the image recognition data. Specifically, taking a wristband as the external device, the correspondence between the motion-data features collected by the wristband (which may be data obtained by sampling, filtering, or similar processing of the motion data, or the frequency-domain data obtained by a Fourier transform of it, and so on) and gesture actions can be trained in advance. When the user performs a gesture, the motion data collected by the wristband and the image recognition data are captured synchronously, the features of the motion data are obtained, and, from the trained correspondence, the action matching the currently collected motion-data features of the wristband is recognized. Deep learning can further be used to perform supervised learning on the image recognition data, with the image recognition data as the input and the recognized action as the target output.
Once the learned, dimensionality-reduced image recognition data is obtained, it can be used to train the network model and establish the correspondence between image recognition data and gesture actions, so that during subsequent gesture recognition the current gesture can be recognized quickly and accurately from the acquired current image recognition data through the network model, and device control can be performed based on that gesture.
Optionally, the network model may be one that recognizes actions from both images and motion data, in which case it can be trained using the fused data of the acquired motion data and the image recognition data. Feature data corresponding to the preset action may then also be obtained from the motion data, and training the preset network model based on the image recognition data after the supervised learning may specifically be: fusing the feature data with the supervised-learned image recognition data to obtain fused data, and training the preset network model using the fused data. Specifically, again taking a wristband as the external device, the correspondence between the motion-data features collected by the wristband and gesture actions can be trained in advance. When the user performs a gesture, the wristband's motion data and the image recognition data are captured synchronously, the features of the motion data are obtained, and the action matching the current motion-data features is recognized from that correspondence. Deep learning can further be used to perform supervised learning on the image recognition data, with the image recognition data as the input and the recognized action as the target output.
Once the learned, dimensionality-reduced image recognition data is obtained, the network model can be trained using it together with the feature data of the motion data, that is, using the fused data of the two, so that during subsequent gesture recognition the fused data of the learned current image recognition data and the feature data of the current motion data, acquired for a given gesture, can be fed into the network model to recognize the current gesture quickly and accurately, and device control can then be performed based on that gesture.
Optionally, obtaining the feature data corresponding to the current action from the motion data may specifically be: converting the motion data into frequency-domain data and obtaining the feature data from it. Specifically, the feature data is determined from the frequency-domain data: the converted frequency-domain data may be used directly as the feature data, or features may be extracted from it, for example by decimation, superposition, or similar processing.
Further optionally, current image recognition data may also be obtained for a current action, and the gesture corresponding to that data recognized by the neural network trained as described above, so that the operation the controlled device, such as an aircraft, should perform can be determined from the predefined gestures and the control function assigned to each gesture; for example, gesture recognition can command the aircraft to take off, land, take a selfie, or "fly where I point," which improves the user experience.
In the embodiment of the present invention, motion data and image recognition data are collected simultaneously while the user performs a gesture, and the current gesture is recognized from the correspondence between motion data and gestures; the recognized gesture can then be used to supervise the image recognition data through deep learning, which improves the accuracy, reliability, and robustness of action recognition.
Referring to FIG. 5, FIG. 5 is a schematic structural diagram of an action recognition apparatus according to an embodiment of the present invention. Optionally, the action recognition apparatus of this embodiment may be provided in an external device, in the controlled device corresponding to the action recognition such as an aircraft, in another standalone action recognition device, and so on. Specifically, as shown in FIG. 5, the action recognition apparatus 10 of the embodiment of the present invention may include an acquisition module 11 and a processing module 12, wherein:
the acquisition module 11 is configured to acquire motion data detected by an external device for a current action; and
the processing module 12 is configured to convert the motion data acquired by the acquisition module 11 into frequency-domain data, and to use the frequency-domain data to recognize the action corresponding to it.
Optionally, the external device may be a wearable or handheld device, such as a wristband, a watch, or a smart ring. The external device is provided with a motion sensor such as an IMU; when the external device moves or an action such as a gesture is performed, the motion sensor outputs corresponding motion data, which the external device detects.
Optionally, in some embodiments,
the processing module 12 may be specifically configured to input the frequency-domain data into a network model so that the network model recognizes the action corresponding to the frequency-domain data.
Specifically, after obtaining the frequency-domain data, the processing module 12 may further extract its features, for example by decimation, superposition, or similar processing, and may input the extracted features into the network model to recognize the corresponding action, improving recognition efficiency. The frequency-domain data may be obtained by applying a Fourier transform to the motion data. The network model may be a neural network or another network model; this embodiment imposes no limitation.
Further, the acquisition module 11 may also be configured to acquire motion data detected by the external device for a preset action; and
the processing module 12 may also be configured to convert the motion data corresponding to the preset action into frequency-domain data, and to train the network model using that frequency-domain data together with the preset action.
Further optionally, in some embodiments,
the processing module 12 may also be configured to apply regularization to the frequency-domain data to reduce over-fitting to the frequency-domain data.
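The disclosure leaves the regularization scheme open. One standard technique for curbing over-fitting during training on such data is an L2 penalty added to the loss; the penalty target (here the model weights) and the hyper-parameter `lam` are illustrative assumptions, not details fixed by the text:

```python
import numpy as np

def regularized_loss(predictions, targets, weights, lam=1e-3):
    """Mean-squared error plus an L2 penalty, which discourages the large
    parameter values characteristic of an over-fitted model."""
    mse = np.mean((np.asarray(predictions) - np.asarray(targets)) ** 2)
    return mse + lam * np.sum(np.asarray(weights) ** 2)

loss = regularized_loss([0.0, 0.0], [0.0, 0.0], np.ones(4), lam=0.5)
```

Minimizing this combined loss instead of the raw error is what makes the trained model generalize better to users outside the training set.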
Further optionally, in some embodiments,
the processing module 12 may also be configured to normalize the motion data.
Optionally, the motion data may include data obtained by sampling the output of the external device's motion sensor over a preset time period, or by sampling that output a preset number of times.
Further optionally, in some embodiments,
the processing module 12 may also be configured to control an aircraft according to the recognized action.
Specifically, after the current gesture is recognized, the processing module 12 may further generate the control instruction corresponding to the gesture and send it to the controlled device, such as an aircraft, so that the aircraft performs the corresponding operation.
In the embodiment of the present invention, motion data of the external device is acquired and converted into frequency-domain data, and the frequency-domain data is used to recognize the corresponding action. This neatly avoids having to locate the start and end points of the gesture-data waveform, raises the recognition rate, lowers the false-detection rate, and further improves the accuracy, reliability, and robustness of action recognition.
Referring to FIG. 6, FIG. 6 is a schematic structural diagram of another action recognition apparatus according to an embodiment of the present invention. Optionally, the action recognition apparatus of this embodiment may be disposed in a controlled device corresponding to the action recognition, such as an aircraft, or in another independent action recognition device. Specifically, as shown in FIG. 6, the action recognition apparatus 20 of this embodiment may include a first acquisition module 21, a second acquisition module 22, a fusion module 23, and a processing module 24, where:
the first acquisition module 21 is configured to acquire motion data detected by an external device for a current action, and to acquire, according to the motion data, feature data corresponding to the current action;
the second acquisition module 22 is configured to acquire an image captured for the current action, and to process the image to obtain image recognition data corresponding to the current action;
the fusion module 23 is configured to fuse the feature data corresponding to the current action with the image recognition data to obtain fused data; and
the processing module 24 is configured to identify the action corresponding to the fused data.
Optionally, the external device may be a wearable or handheld device, such as a wristband, a watch, or a smart ring. The acquired motion data may be data collected by a motion sensor, such as an IMU, disposed in the external device. Further, in this embodiment, the feature data may refer to frequency domain data obtained by converting the acquired motion data (for example, by a Fourier transform), or to features further extracted from that frequency domain data, for example by decimation, superposition, or similar processing.
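As a concrete illustration of the conversion just described, the sketch below derives frequency-domain features from a window of motion-sensor samples using a plain discrete Fourier transform. It is a hypothetical sketch only: the embodiment states that the motion data is Fourier-transformed and that features may be further extracted, but the bin count and the use of raw DFT magnitudes here are illustrative assumptions, not the claimed implementation.

```python
import cmath
import math

def frequency_features(samples, n_bins=8):
    """Map a time-domain window of motion-sensor samples to
    frequency-domain features via a discrete Fourier transform.

    Illustrative sketch: `n_bins` and the choice of raw DFT
    magnitudes are assumptions; the embodiment leaves the exact
    feature choice open.
    """
    n = len(samples)
    mags = []
    for k in range(min(n_bins, n)):
        # k-th DFT coefficient of the window.
        coeff = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(coeff))
    return mags

# A gesture-like oscillation with 2 cycles per 16-sample window
# concentrates its energy in frequency bin 2.
window = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
feats = frequency_features(window)
peak_bin = max(range(len(feats)), key=lambda k: feats[k])
```

Because such features describe the frequency content of the whole window, no start or end point of the gesture waveform needs to be located, which is the benefit stated above.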
Optionally, in some embodiments,
the processing module 24 may be specifically configured to input the fused data into a network model, so as to identify, by means of the network model, the action corresponding to the fused data.
The network model may be a neural network or another network model, which is not limited in this embodiment of the present invention.
Further optionally, in some embodiments,
the first acquisition module 21 may be specifically configured to, when acquiring the feature data corresponding to the current action according to the motion data, convert the motion data into frequency domain data and use the frequency domain data as the feature data corresponding to the current action; and
the fusion module 23 may be specifically configured to fuse the frequency domain data with the image recognition data acquired by the second acquisition module 22 to obtain the fused data.
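The fusion operator itself is not specified by the embodiment; concatenating the frequency-domain features with the image recognition data into a single input vector is one minimal, commonly used choice, assumed here purely for illustration:

```python
def fuse(freq_features, image_features):
    """Concatenate frequency-domain motion features with image
    recognition features into one vector for the network model.

    Assumption: plain concatenation; the embodiment does not fix
    the fusion operator.
    """
    return list(freq_features) + list(image_features)

# e.g. 2 motion-derived features fused with 3 image-derived features
fused = fuse([0.1, 0.9], [0.3, 0.2, 0.5])
```

The fused vector would then be fed to the network model as a single input, as described above.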
Further optionally, in some embodiments,
the first acquisition module 21 is further configured to acquire motion data detected by the external device for a preset action, and to acquire, according to the motion data, feature data corresponding to the preset action;
the second acquisition module 22 is further configured to acquire an image captured for the preset action, and to process the captured image to obtain image recognition data corresponding to the preset action;
the fusion module 23 is further configured to fuse the image recognition data corresponding to the preset action with the feature data corresponding to the preset action to obtain fused data corresponding to the preset action; and
the processing module 24 is further configured to train the network model by using the fused data corresponding to the preset action together with the preset action.
Optionally, the motion data may include data obtained by sampling the output of the motion sensor of the external device over a preset time period, or by sampling that output a preset number of times.
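The two sampling strategies just described (a preset time period versus a preset number of samples) can be sketched as follows; the function names and the iterator-based sensor interface are hypothetical:

```python
def sample_fixed_count(sensor_stream, count):
    """Collect a preset number of motion-sensor readings."""
    window = []
    for reading in sensor_stream:
        window.append(reading)
        if len(window) == count:
            break
    return window

def sample_fixed_duration(timed_readings, duration):
    """Collect readings whose timestamps fall within a preset time
    period, measured from the first reading."""
    window = []
    start = None
    for ts, value in timed_readings:
        if start is None:
            start = ts
        if ts - start > duration:
            break
        window.append(value)
    return window

window = sample_fixed_count(iter([0.0, 0.1, 0.4, 0.2, 0.0, -0.1]), 4)
```

Either window would then be converted to the frequency domain as described above.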
Further optionally, in some embodiments,
the processing module 24 may be further configured to control an aircraft according to the identified action.
In this embodiment of the present invention, the acquired motion data of the external device and the image recognition data may be fused. Specifically, the motion data may be converted into frequency domain data, the feature data of the current action determined based on the frequency domain data, and the feature data fused with the image recognition data to obtain fused data, so that the fused data can be used to identify the corresponding action. This improves the accuracy, reliability, and robustness of action recognition, and avoids the prior-art problem of low or even failed recognition that arises when relying on the external device alone or on image recognition alone.
Referring to FIG. 7, FIG. 7 is a schematic structural diagram of a network training apparatus based on action recognition according to an embodiment of the present invention. Optionally, the apparatus of this embodiment may be disposed in a controlled device corresponding to the action recognition, such as an aircraft, or in another independent network training device. Specifically, as shown in FIG. 7, the network training apparatus 30 of this embodiment may include a first acquisition module 31, a second acquisition module 32, a determining module 33, and a processing module 34, where:
the first acquisition module 31 is configured to acquire motion data detected by an external device for a preset action;
the second acquisition module 32 is configured to acquire an image captured for the preset action, and to process the image to obtain image recognition data corresponding to the preset action;
the determining module 33 is configured to identify the action corresponding to the motion data; and
the processing module 34 is configured to perform supervised learning on the image recognition data acquired by the second acquisition module 32 by using the action identified by the determining module 33, and to train a preset network model based on the image recognition data after the supervised learning.
Optionally, the external device may be a wearable or handheld device, such as a wristband, a watch, or a smart ring. The acquired motion data may be data collected by a motion sensor, such as an IMU, disposed in the external device. The network model may be a neural network or another network model; this embodiment of the present invention takes a neural network as an example for description.
Further optionally, in some embodiments,
the processing module 34 may be specifically configured to perform supervised learning by means of deep learning, taking the image recognition data acquired by the second acquisition module 32 as the input and the identified action as the target output.
Specifically, because the features collected from images are of high dimensionality, the processing module 34 may reduce the dimensionality of the image recognition features through deep learning, so as to improve the stability and reliability of the network model trained on the learned image recognition data.
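The labeling scheme behind this supervised-learning step can be illustrated with a deliberately simplified stand-in for the deep network: the action identified from the motion data serves as the label for each image feature vector, and a nearest-centroid model is fit to those labeled vectors. This is a hypothetical sketch of the labeling idea only; the embodiment itself uses deep learning, which this toy model does not reproduce.

```python
def train_supervised(image_feature_vectors, action_labels):
    """Fit one centroid per action label over image feature vectors.

    The labels come from motion-data-based recognition, as described
    above; the centroid model is an illustrative stand-in for the
    deep network.
    """
    sums, counts = {}, {}
    for feats, label in zip(image_feature_vectors, action_labels):
        if label not in sums:
            sums[label] = [0.0] * len(feats)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], feats)]
        counts[label] += 1
    return {lbl: [s / counts[lbl] for s in vec] for lbl, vec in sums.items()}

def predict(centroids, feats):
    """Return the label whose centroid is closest to `feats`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lbl: sq_dist(centroids[lbl], feats))

# Image feature vectors labeled by the action recognized from motion data.
X = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
y = ["wave", "wave", "circle", "circle"]
model = train_supervised(X, y)
```

The point of the sketch is that no manual annotation is needed: the motion-data channel supplies the target outputs for the image channel.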
Further optionally, in some embodiments,
the first acquisition module 31 is further configured to acquire, according to the motion data, feature data corresponding to the preset action; and
the processing module 34 is further configured to fuse the feature data with the image recognition data after the supervised learning to obtain fused data, and to train the preset network model by using the fused data.
Further optionally, in some embodiments,
the first acquisition module 31 is specifically configured to, when acquiring the feature data corresponding to the current action according to the motion data, convert the motion data into frequency domain data and use the frequency domain data as the feature data corresponding to the current action.
Specifically, the feature data is determined based on the frequency domain data. For example, the acquisition module 31 may directly use the converted frequency domain data as the feature data, or may perform feature extraction on the frequency domain data, for example by decimation, superposition, or similar processing, and use the resulting data as the feature data.
In this embodiment of the present invention, motion data and image recognition data may be collected simultaneously while the user performs a gesture, and the current gesture may be identified based on the correspondence between the motion data and gestures. The identified gesture can then be used, in a deep learning manner, to perform supervised learning on the image recognition data, which improves the accuracy, reliability, and robustness of action recognition.
An embodiment of the present invention further provides a computer storage medium storing program instructions which, when executed, may perform some or all of the steps of the action recognition method in the embodiment corresponding to FIG. 2.
An embodiment of the present invention further provides a computer storage medium storing program instructions which, when executed, may perform some or all of the steps of the action recognition method in the embodiment corresponding to FIG. 3.
An embodiment of the present invention further provides a computer storage medium storing program instructions which, when executed, may perform some or all of the steps of the network training method based on action recognition in the embodiment corresponding to FIG. 4.
Referring now to FIG. 8, FIG. 8 is a schematic structural diagram of an action recognition device according to an embodiment of the present invention. The action recognition device of this embodiment may be an external device such as a wristband, a watch, or a ring; a controlled device such as an aircraft; or another independent action recognition device. Specifically, the action recognition device 1 of this embodiment may include a communication interface 300, a memory 200, and a processor 100, where the processor 100 may be connected to the communication interface 300 and the memory 200, respectively. Optionally, the action recognition device 1 may further include a motion sensor, a camera, and the like.
The communication interface 300 may include a wired interface, a wireless interface, and the like, and may be used to receive data transmitted by an external device, for example motion data collected by the external device for a gesture of the user, or to transmit motion data acquired by the external device.
The memory 200 may include a volatile memory such as a random-access memory (RAM); the memory 200 may also include a non-volatile memory such as a flash memory; the memory 200 may further include a combination of the above types of memory.
The processor 100 may be a central processing unit (CPU), a graphics processing unit (GPU), or the like. The processor may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), or the like.
Optionally, the memory 200 is further configured to store program instructions. The processor 100 may invoke the program instructions to implement the action recognition method shown in the embodiment of FIG. 2 of the present application.
Specifically, the communication interface 300 may be configured to acquire motion data detected by an external device for a current action; and
the processor 100 may invoke the program instructions stored in the memory 200 to perform the following:
converting the motion data into frequency domain data, and using the frequency domain data to identify the action corresponding to the frequency domain data.
Optionally, the processor 100 is specifically configured to input the frequency domain data into a network model, so as to identify, by means of the network model, the action corresponding to the frequency domain data.
Optionally, the network model may include a neural network.
Optionally, the communication interface 300 is further configured to acquire motion data detected by the external device for a preset action; and
the processor 100 is further configured to convert the motion data corresponding to the preset action into frequency domain data, and to train the network model by using the frequency domain data corresponding to the preset action together with the preset action.
Optionally, the processor 100 is further configured to regularize the frequency domain data so as to reduce overfitting to the frequency domain data.
Optionally, the processor 100 is further configured to normalize the motion data.
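For illustration, min-max scaling is one simple way the normalization mentioned here could be realized; the embodiment states only that the motion data is normalized, so the specific scaling rule below is an assumption:

```python
def normalize(motion_data):
    """Min-max normalize raw motion samples into [0, 1].

    Assumed normalization rule; applied before converting the
    motion data to the frequency domain.
    """
    lo, hi = min(motion_data), max(motion_data)
    if hi == lo:  # constant signal: avoid division by zero
        return [0.0] * len(motion_data)
    return [(x - lo) / (hi - lo) for x in motion_data]

scaled = normalize([2.0, 4.0, 6.0])
```

Normalization of this kind makes windows collected from different sensors or sessions comparable before the frequency-domain transform.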
Optionally, the motion data may include data obtained by sampling the output of the motion sensor of the external device over a preset time period, or by sampling that output a preset number of times.
Optionally, the processor 100 is further configured to control an aircraft according to the identified action.
An embodiment of the present invention further provides an aircraft, including:
a power system configured to provide flight power for the aircraft; and
the action recognition device according to any one of the embodiments corresponding to FIG. 8 above, configured to recognize actions.
Referring now to FIG. 9, FIG. 9 is a schematic structural diagram of another action recognition device according to an embodiment of the present invention. The action recognition device of this embodiment may be a controlled device such as an aircraft, or another independent action recognition device. Specifically, the action recognition device 2 of this embodiment may include a communication interface 700, an image acquisition apparatus 600, a memory 500, and a processor 400, where the processor 400 may be connected to the communication interface 700, the image acquisition apparatus 600, and the memory 500, respectively. Optionally, the action recognition device 2 may further include a motion sensor.
The image acquisition apparatus 600 may include a camera configured to capture images, for example images of the user performing a gesture.
The communication interface 700 may include a wired interface, a wireless interface, and the like, and may be used to receive data transmitted by an external device, for example motion data collected by the external device for a gesture of the user.
The memory 500 may include a volatile memory such as a random-access memory (RAM); the memory 500 may also include a non-volatile memory such as a flash memory; the memory 500 may further include a combination of the above types of memory.
The processor 400 may be a central processing unit (CPU), a graphics processing unit (GPU), or the like. The processor may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), or the like.
Optionally, the memory 500 is further configured to store program instructions. The processor 400 may invoke the program instructions to implement the action recognition method shown in the embodiment of FIG. 3 of the present application.
Specifically, the communication interface 700 is configured to acquire motion data detected by an external device for a current action;
the image acquisition apparatus 600 is configured to capture an image for the current action; and
the processor 400 is configured to process the image captured by the image acquisition apparatus to obtain image recognition data, acquire feature data corresponding to the current action according to the motion data, fuse the feature data corresponding to the current action with the image recognition data to obtain fused data, and identify the action corresponding to the fused data.
Optionally, the processor 400 is specifically configured to input the fused data into a network model, so as to identify, by means of the network model, the action corresponding to the fused data.
Optionally, the network model includes a neural network.
Optionally, the processor 400 is specifically configured to convert the motion data into frequency domain data, and to fuse the frequency domain data with the image recognition data to obtain the fused data.
Specifically, the acquired motion data, such as IMU data, may be transformed from the time domain to the frequency domain by a Fourier transform to obtain the frequency domain data, and the feature data may be determined based on the frequency domain data, for example by directly using the converted frequency domain data as the feature data, or by performing feature extraction on the frequency domain data, for example by decimation, superposition, or similar processing, and using the resulting data as the feature data.
Optionally, the communication interface 700 is further configured to acquire motion data detected by the external device for a preset action;
the image acquisition apparatus 600 is further configured to capture an image for the preset action; and
the processor 400 is further configured to process the image captured by the image acquisition apparatus to obtain image recognition data corresponding to the preset action; acquire, according to the motion data, feature data corresponding to the preset action; fuse the image recognition data corresponding to the preset action with the feature data corresponding to the preset action to obtain fused data corresponding to the preset action; and train the network model by using the fused data corresponding to the preset action together with the preset action.
Optionally, the motion data may include data obtained by sampling the output of the motion sensor of the external device over a preset time period, or by sampling that output a preset number of times.
Optionally, the processor 400 is further configured to control an aircraft according to the identified action.
An embodiment of the present invention further provides an aircraft, including:
a power system configured to provide flight power for the aircraft; and
the action recognition device according to any one of the embodiments corresponding to FIG. 9 above, configured to recognize actions.
Referring now to FIG. 10, FIG. 10 is a schematic structural diagram of a network training device based on action recognition according to an embodiment of the present invention. The network training device of this embodiment may be a controlled device such as an aircraft, or another independent network training device. Specifically, the network training device 3 of this embodiment may include a communication interface 1100, an image acquisition apparatus 1000, a memory 900, and a processor 800, where the processor 800 may be connected to the communication interface 1100, the image acquisition apparatus 1000, and the memory 900, respectively. Optionally, the network training device 3 may further include a motion sensor and the like.
The image acquisition apparatus 1000 may include a camera configured to capture images, for example images of the user performing a gesture.
The communication interface 1100 may include a wired interface, a wireless interface, and the like, and may be used to receive data transmitted by an external device, for example motion data collected by the external device for a gesture of the user.
The memory 900 may include a volatile memory such as a random-access memory (RAM); the memory 900 may also include a non-volatile memory such as a flash memory; the memory 900 may further include a combination of the above types of memory.
The processor 800 may be a central processing unit (CPU), a graphics processing unit (GPU), or the like. The processor may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), or the like.
Optionally, the memory 900 is further configured to store program instructions. The processor 800 may invoke the program instructions to implement the network training method based on action recognition shown in the embodiment of FIG. 4 of the present application.
Specifically, the communication interface 1100 is configured to acquire motion data detected by an external device for a preset action;
the image acquisition apparatus 1000 is configured to capture an image for the preset action; and
the processor 800 is configured to process the image captured by the image acquisition apparatus to obtain image recognition data; identify the action corresponding to the motion data; perform supervised learning on the image recognition data by using the identified action; and train a preset network model based on the image recognition data after the supervised learning.
Optionally, the processor 800 is specifically configured to perform supervised learning by means of deep learning, taking the image recognition data as the input and the action as the target output.
Optionally, the processor 800 is further configured to acquire, according to the motion data, feature data corresponding to the preset action; fuse the feature data with the image recognition data after the supervised learning to obtain fused data; and train the preset network model by using the fused data.
Optionally, the processor 800 is specifically configured to convert the motion data into frequency domain data, so as to use the frequency domain data as the feature data corresponding to the current action.
Specifically, the acquired motion data may be transformed from the time domain to the frequency domain by a Fourier transform to obtain the frequency domain data, and the feature data may be determined based on the frequency domain data, for example by directly using the frequency domain data as the feature data, or by performing feature extraction on it, for example by decimation, superposition, or similar processing, and using the resulting data as the feature data.
Optionally, the network model may include a neural network.
In the embodiments of the present invention, motion data of an external device may be acquired and converted into frequency domain data, and the frequency domain data used to identify the corresponding action; or the motion data and image recognition data may be fused to obtain fused data, and the fused data used to identify the corresponding action; or the action corresponding to the motion data may be determined, and the determined action used to perform supervised learning on the image recognition data. Each approach improves the accuracy, reliability, and robustness of action recognition.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not detailed in one embodiment, reference may be made to the related descriptions of the other embodiments.
An embodiment of the present invention further provides an aircraft, including:
a power system configured to provide flight power for the aircraft; and
the network training device based on action recognition according to any one of the embodiments corresponding to FIG. 10 above, configured to train the network model for action recognition.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses or modules, and may be electrical, mechanical or of other forms.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing module, or each module may exist physically on its own, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
An integrated module implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) or a processor to perform some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
A person skilled in the art can clearly understand that, for convenience and brevity of description, only the division into the functional modules described above is given as an example; in practical applications, the above functions may be assigned to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to perform all or some of the functions described above. For the specific working process of the device described above, reference may be made to the corresponding process in the foregoing method embodiments, and details are not repeated here.
Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present disclosure, not to limit them. Although the present disclosure has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present disclosure.

Claims (63)

  1. A motion recognition apparatus, comprising:
    an acquiring module, configured to acquire motion data detected by an external device for a current motion; and
    a processing module, configured to convert the motion data acquired by the acquiring module into frequency domain data, and to use the frequency domain data to identify a motion corresponding to the frequency domain data.
  2. The apparatus according to claim 1, wherein
    the processing module is specifically configured to input the frequency domain data into a network model, so as to identify, by means of the network model, a motion corresponding to the frequency domain data.
  3. The apparatus according to claim 2, wherein
    the network model comprises a neural network.
  4. The apparatus according to claim 2 or 3, wherein
    the acquiring module is further configured to acquire motion data detected by the external device for a preset motion; and
    the processing module is further configured to convert the motion data corresponding to the preset motion into frequency domain data, and to train the network model with the frequency domain data corresponding to the preset motion and the preset motion.
  5. The apparatus according to any one of claims 1 to 4, wherein
    the processing module is further configured to regularize the frequency domain data so as to reduce overfitting to the frequency domain data.
  6. The apparatus according to any one of claims 1 to 5, wherein
    the processing module is further configured to normalize the motion data.
  7. The apparatus according to any one of claims 1 to 6, wherein
    the motion data comprises data obtained by sampling the output of a motion sensor of the external device over a preset time period, or data obtained by sampling the output of the motion sensor of the external device a preset number of times.
  8. The apparatus according to any one of claims 1 to 7, wherein
    the processing module is further configured to control an aircraft according to the identified motion.
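Claim 7 describes two ways of obtaining the motion data: sampling the motion sensor's output over a preset time period, or taking a preset number of samples. As an illustrative, non-limiting sketch (not part of the claims), the two modes might look like the following; `read_sensor` and the sampling rate are hypothetical stand-ins for the external device's sensor interface:

```python
# Sketch of claim 7's two sampling modes. `read_sensor` is a hypothetical
# callable standing in for the external device's motion-sensor output.
import time

def sample_for_duration(read_sensor, seconds, rate_hz=100):
    """Collect samples over a preset time period."""
    samples, deadline = [], time.monotonic() + seconds
    while time.monotonic() < deadline:
        samples.append(read_sensor())
        time.sleep(1.0 / rate_hz)  # pace reads at roughly rate_hz
    return samples

def sample_fixed_count(read_sensor, count):
    """Collect a preset number of samples."""
    return [read_sensor() for _ in range(count)]

fake_sensor = iter(range(1000)).__next__   # hypothetical sensor stub
window = sample_fixed_count(fake_sensor, 8)
print(window)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Either mode yields a fixed-size or fixed-duration window that the later claims convert to frequency domain data.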
  9. A motion recognition apparatus, comprising:
    a first acquiring module, configured to acquire motion data detected by an external device for a current motion, and to acquire, according to the motion data, feature data corresponding to the current motion;
    a second acquiring module, configured to acquire an image captured for the current motion, and to process the image to obtain image recognition data corresponding to the current motion;
    a fusion module, configured to fuse the feature data corresponding to the current motion with the image recognition data to obtain fused data; and
    a processing module, configured to identify a motion corresponding to the fused data.
  10. The apparatus according to claim 9, wherein
    the processing module is specifically configured to input the fused data into a network model, so as to identify, by means of the network model, a motion corresponding to the fused data.
  11. The apparatus according to claim 10, wherein
    the network model comprises a neural network.
  12. The apparatus according to any one of claims 9 to 11, wherein
    the first acquiring module is specifically configured to, when acquiring the feature data corresponding to the current motion according to the motion data, convert the motion data into frequency domain data so as to use the frequency domain data as the feature data corresponding to the current motion; and
    the fusion module is specifically configured to fuse the frequency domain data with the image recognition data acquired by the second acquiring module to obtain the fused data.
  13. The apparatus according to claim 10 or 11, wherein
    the first acquiring module is further configured to acquire motion data detected by the external device for a preset motion, and to acquire, according to the motion data, feature data corresponding to the preset motion;
    the second acquiring module is further configured to acquire an image captured for the preset motion, and to process the captured image to obtain image recognition data corresponding to the preset motion;
    the fusion module is further configured to fuse the image recognition data corresponding to the preset motion with the feature data corresponding to the preset motion to obtain fused data corresponding to the preset motion; and
    the processing module is further configured to train the network model with the fused data corresponding to the preset motion and the preset motion.
  14. The apparatus according to any one of claims 9 to 13, wherein
    the motion data comprises data obtained by sampling the output of a motion sensor of the external device over a preset time period, or data obtained by sampling the output of the motion sensor of the external device a preset number of times.
  15. The apparatus according to any one of claims 9 to 14, wherein
    the processing module is further configured to control an aircraft according to the identified motion.
  16. A network training apparatus based on motion recognition, comprising:
    a first acquiring module, configured to acquire motion data detected by an external device for a preset motion;
    a second acquiring module, configured to acquire an image captured for the preset motion, and to process the image to obtain image recognition data corresponding to the preset motion;
    a determining module, configured to identify a motion corresponding to the motion data; and
    a processing module, configured to perform supervised learning on the image recognition data acquired by the second acquiring module using the motion identified by the determining module, and to train a preset network model based on the image recognition data after the supervised learning.
  17. The apparatus according to claim 16, wherein
    the processing module is specifically configured to perform the supervised learning by deep learning, taking the image recognition data acquired by the second acquiring module as the input and the motion as the target output.
  18. The apparatus according to claim 16 or 17, wherein
    the first acquiring module is further configured to acquire, according to the motion data, feature data corresponding to the preset motion; and
    the processing module is further configured to fuse the feature data with the image recognition data after the supervised learning to obtain fused data, and to train the preset network model with the fused data.
  19. The apparatus according to claim 18, wherein
    the first acquiring module is specifically configured to, when acquiring the feature data corresponding to the preset motion according to the motion data, convert the motion data into frequency domain data so as to use the frequency domain data as the feature data corresponding to the preset motion.
  20. The apparatus according to any one of claims 16 to 19, wherein
    the network model comprises a neural network.
  21. A motion recognition method, comprising:
    acquiring motion data detected by an external device for a current motion; and
    converting the motion data into frequency domain data, and using the frequency domain data to identify a motion corresponding to the frequency domain data.
  22. The method according to claim 21, wherein the using the frequency domain data to identify a motion corresponding to the frequency domain data comprises:
    inputting the frequency domain data into a network model, so as to identify, by means of the network model, a motion corresponding to the frequency domain data.
  23. The method according to claim 22, wherein
    the network model comprises a neural network.
  24. The method according to claim 22 or 23, further comprising:
    acquiring motion data detected by the external device for a preset motion; and
    converting the motion data corresponding to the preset motion into frequency domain data, and training the network model with the frequency domain data corresponding to the preset motion and the preset motion.
  25. The method according to any one of claims 21 to 24, further comprising:
    regularizing the frequency domain data so as to reduce overfitting to the frequency domain data.
  26. The method according to any one of claims 21 to 25, further comprising:
    normalizing the motion data.
  27. The method according to any one of claims 21 to 26, wherein
    the motion data comprises data obtained by sampling the output of a motion sensor of the external device over a preset time period, or data obtained by sampling the output of the motion sensor of the external device a preset number of times.
  28. The method according to any one of claims 21 to 27, further comprising, after the using the frequency domain data to identify a motion corresponding to the frequency domain data:
    controlling an aircraft according to the identified motion.
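Claims 21, 26 and 27 together describe taking a sampled window of motion-sensor data, normalizing it, and converting it into frequency domain data. The claims do not fix a particular transform; a minimal illustrative sketch, assuming the frequency domain data is the magnitude spectrum of a real FFT:

```python
# Sketch of claims 21/26: normalize a sampled window of motion data, then
# convert it to frequency domain data. The choice of rFFT magnitudes as the
# "frequency domain data" is an assumption for illustration.
import numpy as np

def to_frequency_features(samples, normalize=True):
    """Convert a 1-D window of motion-sensor samples into frequency domain data."""
    x = np.asarray(samples, dtype=float)
    if normalize:
        # Normalization of the motion data (claim 26): zero mean, unit variance.
        x = (x - x.mean()) / (x.std() + 1e-8)
    # Magnitude spectrum of the real FFT as the frequency domain data.
    return np.abs(np.fft.rfft(x))

# A 64-sample window containing exactly 5 cycles of a sinusoid:
t = np.arange(64)
window = np.sin(2 * np.pi * 5 * t / 64)
spectrum = to_frequency_features(window)
print(int(np.argmax(spectrum)))  # energy concentrates in frequency bin 5
```

A periodic wrist or arm motion shows up as a peak at its repetition frequency, which is what makes the frequency domain representation a useful input to the network model of claim 22.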
  29. A motion recognition method, comprising:
    acquiring motion data detected by an external device for a current motion, and acquiring, according to the motion data, feature data corresponding to the current motion;
    acquiring an image captured for the current motion, and processing the image to obtain image recognition data corresponding to the current motion;
    fusing the feature data corresponding to the current motion with the image recognition data to obtain fused data; and
    identifying a motion corresponding to the fused data.
  30. The method according to claim 29, wherein the identifying a motion corresponding to the fused data comprises:
    inputting the fused data into a network model, so as to identify, by means of the network model, a motion corresponding to the fused data.
  31. The method according to claim 30, wherein
    the network model comprises a neural network.
  32. The method according to any one of claims 29 to 31, wherein the acquiring, according to the motion data, feature data corresponding to the current motion comprises:
    converting the motion data into frequency domain data so as to use the frequency domain data as the feature data corresponding to the current motion; and
    the fusing the feature data corresponding to the current motion with the image recognition data to obtain fused data comprises:
    fusing the frequency domain data with the image recognition data to obtain the fused data.
  33. The method according to claim 30 or 31, further comprising:
    acquiring motion data detected by the external device for a preset motion, and acquiring, according to the motion data, feature data corresponding to the preset motion;
    acquiring an image captured for the preset motion, and processing the captured image to obtain image recognition data corresponding to the preset motion;
    fusing the image recognition data corresponding to the preset motion with the feature data corresponding to the preset motion to obtain fused data corresponding to the preset motion; and
    training the network model with the fused data corresponding to the preset motion and the preset motion.
  34. The method according to any one of claims 29 to 33, wherein
    the motion data comprises data obtained by sampling the output of a motion sensor of the external device over a preset time period, or data obtained by sampling the output of the motion sensor of the external device a preset number of times.
  35. The method according to any one of claims 29 to 34, further comprising, after the identifying a motion corresponding to the fused data:
    controlling an aircraft according to the identified motion.
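Claims 29 and 32 describe fusing the motion-derived feature data (frequency domain data) with the image recognition data to obtain fused data. The claims do not specify a fusion operator; a simple concatenation sketch, for illustration only:

```python
# Sketch of claims 29/32: fuse motion-derived features with image-recognition
# data. Concatenation is an assumed fusion operator, not mandated by the claims.
import numpy as np

def fuse(frequency_features, image_features):
    """Fuse motion-derived and image-derived features into one fused vector."""
    f = np.asarray(frequency_features, dtype=float).ravel()
    g = np.asarray(image_features, dtype=float).ravel()
    return np.concatenate([f, g])

freq = np.array([0.1, 0.9, 0.2])   # e.g. FFT magnitudes of the motion data
img = np.array([0.7, 0.3])         # e.g. image-recognition scores
fused = fuse(freq, img)
print(fused.shape)                 # (5,)
```

The fused vector then serves as a single input to the network model of claim 30, letting one classifier weigh both sensing modalities.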
  36. A network training method based on motion recognition, comprising:
    acquiring motion data detected by an external device for a preset motion;
    acquiring an image captured for the preset motion, and processing the image to obtain image recognition data corresponding to the preset motion;
    identifying a motion corresponding to the motion data; and
    performing supervised learning on the image recognition data using the identified motion, and training a preset network model based on the image recognition data after the supervised learning.
  37. The method according to claim 36, wherein the performing supervised learning on the image recognition data using the identified motion comprises:
    performing the supervised learning by deep learning, taking the image recognition data as the input and the motion as the target output.
  38. The method according to claim 36 or 37, further comprising:
    acquiring, according to the motion data, feature data corresponding to the preset motion;
    wherein the training a preset network model based on the image recognition data after the supervised learning comprises:
    fusing the feature data with the image recognition data after the supervised learning to obtain fused data; and
    training the preset network model with the fused data.
  39. The method according to claim 38, wherein the acquiring, according to the motion data, feature data corresponding to the preset motion comprises:
    converting the motion data into frequency domain data so as to use the frequency domain data as the feature data corresponding to the preset motion.
  40. The method according to any one of claims 36 to 39, wherein
    the network model comprises a neural network.
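Claims 36 to 38 describe using the known preset motion as the label to supervise training of a network model on the recognition data. As a hedged illustration only, the loop below uses a minimal logistic-regression stand-in for the "network model", trained by gradient descent on toy fused features; the patent itself contemplates a neural network and deep learning:

```python
# Sketch of supervised training (claims 36-38): the preset motion supplies the
# label y; a logistic-regression model stands in for the network model.
import numpy as np

def train_model(features, labels, lr=0.5, epochs=500):
    """Train a minimal logistic-regression 'network model' by gradient descent."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels, dtype=float)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid output
        grad = p - y                             # cross-entropy gradient
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

# Toy fused-feature data: two preset motions, separable in feature space.
X = np.array([[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]])
y = np.array([0, 0, 1, 1])
w, b = train_model(X, y)
preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
print(preds.tolist())  # [0, 0, 1, 1]
```

The same supervision pattern applies unchanged when the model is a deep neural network: the identified preset motion is the target output, the fused data is the input.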
  41. A motion recognition device, comprising a processor and a communication interface, the processor being connected to the communication interface, wherein
    the communication interface is configured to acquire motion data detected by an external device for a current motion; and
    the processor is configured to convert the motion data into frequency domain data, and to use the frequency domain data to identify a motion corresponding to the frequency domain data.
  42. The device according to claim 41, wherein
    the processor is specifically configured to input the frequency domain data into a network model, so as to identify, by means of the network model, a motion corresponding to the frequency domain data.
  43. The device according to claim 42, wherein
    the network model comprises a neural network.
  44. The device according to claim 42 or 43, wherein
    the communication interface is further configured to acquire motion data detected by the external device for a preset motion; and
    the processor is further configured to convert the motion data corresponding to the preset motion into frequency domain data, and to train the network model with the frequency domain data corresponding to the preset motion and the preset motion.
  45. The device according to any one of claims 41 to 44, wherein
    the processor is further configured to regularize the frequency domain data so as to reduce overfitting to the frequency domain data.
  46. The device according to any one of claims 41 to 45, wherein
    the processor is further configured to normalize the motion data.
  47. The device according to any one of claims 41 to 46, wherein
    the motion data comprises data obtained by sampling the output of a motion sensor of the external device over a preset time period, or data obtained by sampling the output of the motion sensor of the external device a preset number of times.
  48. The device according to any one of claims 41 to 47, wherein
    the processor is further configured to control an aircraft according to the identified motion.
  49. A motion recognition device, comprising a processor, a communication interface and an image capture apparatus, the processor being connected to the image capture apparatus and to the communication interface respectively, wherein
    the communication interface is configured to acquire motion data detected by an external device for a current motion;
    the image capture apparatus is configured to capture an image for the current motion; and
    the processor is configured to acquire the image captured by the image capture apparatus for the current motion, process the image captured by the image capture apparatus to obtain image recognition data corresponding to the current motion, acquire, according to the motion data, feature data corresponding to the current motion, fuse the feature data corresponding to the current motion with the image recognition data to obtain fused data, and identify a motion corresponding to the fused data.
  50. The device according to claim 49, wherein
    the processor is specifically configured to input the fused data into a network model, so as to identify, by means of the network model, a motion corresponding to the fused data.
  51. The device according to claim 50, wherein
    the network model comprises a neural network.
  52. The device according to any one of claims 49 to 51, wherein
    the processor is specifically configured to convert the motion data into frequency domain data, and to fuse the frequency domain data with the image recognition data to obtain the fused data.
  53. The device according to claim 50 or 51, wherein
    the communication interface is further configured to acquire motion data detected by the external device for a preset motion;
    the image capture apparatus is further configured to capture an image for the preset motion; and
    the processor is further configured to acquire the image captured by the image capture apparatus for the preset motion, and process the image captured by the image capture apparatus to obtain image recognition data corresponding to the preset motion; acquire, according to the motion data, feature data corresponding to the preset motion; fuse the image recognition data corresponding to the preset motion with the feature data corresponding to the preset motion to obtain fused data corresponding to the preset motion; and train the network model with the fused data corresponding to the preset motion and the preset motion.
  54. The device according to any one of claims 49 to 53, wherein
    the motion data comprises data obtained by sampling the output of a motion sensor of the external device over a preset time period, or data obtained by sampling the output of the motion sensor of the external device a preset number of times.
  55. The device according to any one of claims 49 to 54, wherein
    the processor is further configured to control an aircraft according to the identified motion.
  56. A network training device based on motion recognition, comprising an image capture apparatus, a processor and a communication interface, the processor being connected to the image capture apparatus and to the communication interface respectively, wherein
    the communication interface is configured to acquire motion data detected by an external device for a preset motion;
    the image capture apparatus is configured to capture an image for the preset motion; and
    the processor is configured to acquire the image captured by the image capture apparatus for the preset motion, process the image captured by the image capture apparatus to obtain image recognition data corresponding to the preset motion, identify a motion corresponding to the motion data, perform supervised learning on the image recognition data using the identified motion, and train a preset network model based on the image recognition data after the supervised learning.
  57. The device according to claim 56, wherein
    the processor is specifically configured to perform the supervised learning by deep learning, taking the image recognition data as the input and the motion as the target output.
  58. The device according to claim 56 or 57, wherein
    the processor is further configured to acquire, according to the motion data, feature data corresponding to the preset motion; fuse the feature data with the image recognition data after the supervised learning to obtain fused data; and train the preset network model with the fused data.
  59. The device according to any one of claims 56 to 58, wherein
    the processor is specifically configured to convert the motion data into frequency domain data so as to use the frequency domain data as the feature data corresponding to the preset motion.
  60. The device according to any one of claims 56 to 59, wherein
    the network model comprises a neural network.
  61. An aircraft, comprising:
    a power system, configured to power the aircraft; and
    the motion recognition device according to any one of claims 41 to 48, configured to identify a motion.
  62. An aircraft, comprising:
    a power system, configured to power the aircraft; and
    the motion recognition device according to any one of claims 49 to 55, configured to identify a motion.
  63. An aircraft, comprising:
    a power system, configured to power the aircraft; and
    the network training device based on motion recognition according to any one of claims 56 to 60, configured to train a network model for motion recognition.

Priority Applications (2)

- PCT/CN2016/104121 (WO2018076371A1), priority date 2016-10-31, filed 2016-10-31: Gesture recognition method, network training method, apparatus and equipment
- CN201680029871.3A (CN107735796A), priority date 2016-10-31, filed 2016-10-31: Action identification method, network training method, device and equipment


Publications (1)

- WO2018076371A1, published 2018-05-03

Family ID: 61201295

Country Status (2)

- CN: CN107735796A
- WO: WO2018076371A1

Families Citing this family (3)

* Cited by examiner, † Cited by third party

- CN109196518B* (priority 2018-08-23, published 2022-06-07), 合刃科技(深圳)有限公司: Gesture recognition method and device based on hyperspectral imaging
- CN111050266B* (priority 2019-12-20, published 2021-07-30), 朱凤邹: Method and system for performing function control based on earphone detection action
- CN116311539B* (priority 2023-05-19, published 2023-07-28), 亿慧云智能科技(深圳)股份有限公司: Sleep motion capturing method, device, equipment and storage medium based on millimeter waves

Citations (6)

* Cited by examiner, † Cited by third party

- CN101807245A* (priority 2010-03-02, published 2010-08-18), 天津大学: Artificial neural network-based multi-source gait feature extraction and identification method
- CN104484037A* (priority 2014-12-12, published 2015-04-01), 三星电子(中国)研发中心: Method for intelligent control by virtue of wearable device and wearable device
- CN204945794U* (priority 2015-08-24, published 2016-01-06), 武汉理工大学: Gesture-recognition-based wireless remote control car
- CN105817037A* (priority 2016-05-19, published 2016-08-03), 深圳大学: Toy air vehicle based on myoelectric control and control method thereof
- CN105867355A* (priority 2016-06-03, published 2016-08-17), 深圳市迪瑞特科技有限公司: Intelligent vehicle-mounted device system
- CN105955306A* (priority 2016-07-20, published 2016-09-21), 西安中科比奇创新科技有限责任公司: Wearable device and unmanned aerial vehicle control method and system based on wearable device
Also Published As

Publication number Publication date
CN107735796A (en) 2018-02-23

Similar Documents

Publication Publication Date Title
US11892859B2 (en) Remoteless control of drone behavior
US11914370B2 (en) System and method for providing easy-to-use release and auto-positioning for drone applications
US11861069B2 (en) Gesture operated wrist mounted camera system
US20230136669A1 (en) Event camera-based gaze tracking using neural networks
JP6732317B2 (en) Face activity detection method and apparatus, and electronic device
CN107239728B (en) Unmanned aerial vehicle interaction device and method based on deep learning attitude estimation
JP6503070B2 (en) Method for determining the position of a portable device
US11715231B2 (en) Head pose estimation from local eye region
CN110083202A (en) With the multi-module interactive of near-eye display
WO2018001245A1 (en) Robot control using gestures
EP3188128A1 (en) Information-processing device, information processing method, and program
US10971152B2 (en) Imaging control method and apparatus, control device, and imaging device
WO2018076371A1 (en) Gesture recognition method, network training method, apparatus and equipment
Abate et al. Remote 3D face reconstruction by means of autonomous unmanned aerial vehicles
WO2022082440A1 (en) Method, apparatus and system for determining target following strategy, and device and storage medium
Tu et al. Face and gesture based human computer interaction
CN115393962A (en) Motion recognition method, head-mounted display device, and storage medium
US20220265168A1 (en) Real-time limb motion tracking
WO2023151551A1 (en) Video image processing method and apparatus, and electronic device and storage medium
US20230143443A1 (en) Systems and methods of fusing computer-generated predicted image frames with captured images frames to create a high-dynamic-range video having a high number of frames per second
US20200341556A1 (en) Pattern embeddable recognition engine and method
Martins A human-machine interface using augmented reality glasses for applications in assistive robotics.
Mishra A review on learning-based algorithms for human activity recognition
KR20230168094A (en) Method, system and non-transitory computer-readable recording medium for processing image for analysis of nail
Bauer et al. User independent, multi-modal spotting of subtle arm actions with minimal training data

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 16919618

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry into the European phase

Ref document number: 16919618

Country of ref document: EP

Kind code of ref document: A1