CN106598226A - UAV (Unmanned Aerial Vehicle) man-machine interaction method based on binocular vision and deep learning


Info

Publication number
CN106598226A
Authority
CN
China
Prior art keywords
unmanned aerial vehicle
deep learning
man-machine interaction
binocular vision
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611030533.XA
Other languages
Chinese (zh)
Other versions
CN106598226B (en)
Inventor
侯永宏
叶秀峰
侯春萍
刘春源
陈艳芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN201611030533.XA
Publication of CN106598226A
Application granted
Publication of CN106598226B
Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Abstract

The invention relates to a UAV (Unmanned Aerial Vehicle) man-machine interaction method based on binocular vision and deep learning. An embedded image processing platform and a binocular camera are carried on the UAV. The embedded image processing platform is connected to the flight controller through an interface, and the UAV communicates with a ground station through the platform. The platform is provided with a Graphics Processing Unit (GPU) on which a convolutional neural network deep learning algorithm runs, performing parallel image computation while the camera captures video. According to the invention, the convolutional neural network is transplanted onto the embedded platform with its dedicated GPU, and the running speed is increased by accelerating the computation through parallel computing.

Description

A UAV man-machine interaction method based on binocular vision and deep learning
Technical field
The method belongs to the field of multimedia signal processing and specifically involves computer vision, deep learning and man-machine interaction technologies; in particular, it is a UAV man-machine interaction method based on binocular vision and deep learning.
Background technology
Human-computer interaction technology emerged together with the computer and has developed gradually alongside computer hardware and software; the appearance of new techniques keeps simplifying the interaction process. In recent years, with the appearance and development of artificial intelligence technology and the continuous progress and innovation of related hardware and software, realizing more convenient human-computer interaction has become a research hotspot, and various novel interaction techniques continue to emerge. At the same time, the rise and popularization of industries such as low-cost small unmanned aerial vehicles (UAVs) urgently require more convenient interactive control modes that lower the threshold of UAV operation, so that UAVs can be applied ever more widely.
UAV man-machine interaction is mainly realized with dedicated equipment such as remote controllers, joysticks and ground station software. Among these, remote controllers are the most important; although their operating difficulty has been reduced considerably with the development of UAV technology, a bulky remote controller still brings great inconvenience to the operator. Ground stations composed of mobile phones, computers and dedicated software make interaction more convenient in many cases, but only to a certain depth. In recent years, new interaction methods have emerged one after another, including approaches that wear dedicated auxiliary devices and use body-motion measurements or EEG signals as control signals to simplify UAV control. However, control modes that rely on dedicated auxiliary equipment remain expensive and troublesome to use.
Most popular UAVs on the market are equipped with cameras. The low cost of these cameras makes vision-based solutions the first choice for aerial photography, computer vision navigation and visual obstacle avoidance. By making full use of these cameras, a UAV interaction method of greater universality can be realized through computer vision by recognizing gestures in images. Existing computer-vision-based interaction methods, restricted by hardware and software, usually cannot interact at a sufficient distance, are easily disturbed by the environment, and cannot be applied in outdoor scenes. To improve the recognition accuracy of the gesture recognition algorithm for UAVs in outdoor environments, the present invention first performs gesture recognition with a deep learning method, controls the motion of the UAV with gestures, and thereby simplifies the difficulty of UAV operation.
Traditional action recognition algorithms have high computational complexity; lacking the necessary acceleration algorithms, their recognition speed is slow and their accuracy low.
Content of the invention
Based on a binocular camera and an embedded platform carried on the UAV, the present invention constructs a UAV man-machine interaction system based on computer vision and deep learning; the system can control the flight direction of the aircraft according to the gestures of a designated navigator on the ground.
1. Hardware system composition
The system consists of four parts: a UAV platform carrying a geographical positioning module, an embedded image processing platform, cameras, and a ground station.
The UAV platform is a multi-rotor UAV positioned by the geographical positioning module; the flight controller can keep the UAV hovering autonomously outdoors. The embedded image processing platform and the cameras are carried on the UAV, and the cameras capture high-definition images for subsequent processing.
The embedded platform, equipped with a graphics processing unit (GPU), provides sufficient computing capability for image processing. It is responsible for image acquisition, processing and action recognition; in practice, a high-performance mobile processor can serve this role. It connects to the flight controller through an interface and thus also serves as the communication link between the UAV and the ground. The platform runs an operating system on which the processing program executes.
The ground station monitors the state of the quadrotor aircraft, designates the navigator and checks real-time operation results; a notebook or an intelligent terminal can serve as the ground station.
2. Action recognition framework
The action recognition framework mainly includes: video preprocessing, color texture map generation, and training and classification of the convolutional neural network model.
1) According to the single-view image returned by the UAV, the UAV navigator is selected: using a mouse or touch screen, the body (upper-body) region of the navigator is circled;
2) The selected region is expanded according to human body proportions into a frame covering the whole body of the person; at the same time, the corresponding person region is selected in the other viewpoint. A tracking algorithm extracts the navigator region frame by frame from the video sequence to track the navigator, and according to the position of the set region, the image of the region occupied by the whole human body is segmented out;
3) Stereo matching: according to a block-matching-based stereo matching algorithm, the segmented images of the left and right viewpoints are matched to obtain the segmented stereo disparity map, which contains the navigator and the background;
4) The obtained disparity map is normalized, and the background is removed with a threshold, yielding a clean person image;
5) Adjacent frames of the person image are differenced to obtain a difference image sequence;
6) According to the generated picture sequence, the difference images of successive moments are encoded with different colors, and the person images produced over about 2 s of pictures are superimposed into a color texture map; using a sliding-window method, an approximately 2 s clip is taken every 5 frames;
7) A large number of color texture maps of the selected actions, performed by different operators under various circumstances, are collected to train the neural network. Training is carried out on a dedicated workstation; after training, the trained parameters are uploaded to the embedded image processing platform;
8) On the embedded image processing platform, the trained parameters are used to classify the color texture maps generated from real-time acquisition;
9) On the embedded image processing platform, tracking, segmentation, stereo matching, color texture map generation and classification run simultaneously in different threads so as to fully utilize the processing capability of the processor; meanwhile, image-related computation is accelerated by the graphics processor so that the processing speed meets the real-time requirement.
Advantages and beneficial effects of the present invention:
1. The present invention transplants the convolutional neural network onto an embedded platform configured with a dedicated graphics processor (GPU), and accelerates it through parallel computation, thereby increasing the running speed.
2. The present invention uses a target tracking algorithm to extract the operator's region from the video sequence, which effectively solves problems such as in-flight camera drift and complex background interference while reducing the amount of computation. Compared with other methods, it features a wide working range and high accuracy.
3. The present invention accelerates image-related computation with the graphics processor so that the processing speed meets the real-time requirement.
Description of the drawings
Fig. 1 is the hardware system connection diagram of the method;
Fig. 2 shows the stereo matching effect of step 2 in the embodiment;
Fig. 3 shows the image after background removal in step 3 of the embodiment;
Fig. 4 illustrates the composition principle of the color texture map;
Fig. 5 is the processing flowchart of the method.
Specific embodiment
The invention is further described below with reference to the accompanying drawings and specific embodiments. The following embodiments are descriptive, not restrictive, and do not limit the protection scope of the present invention.
A UAV man-machine interaction method based on binocular vision and deep learning comprises the following specific steps:
1) At system startup, according to the camera display content, the navigator is selected in the single-view frame shown by the ground station. A fast iterative tracking algorithm tracks the navigator, and according to the tracking result a low-resolution video sequence centered on the navigator is extracted from the high-resolution video.
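As a concrete illustration of this step, the minimal Python sketch below selects and tracks the navigator with OpenCV; the KCF tracker stands in for the patent's fast iterative algorithm, and the video file name and whole-body expansion factor are assumptions of the sketch.

```python
import cv2

# Hedged sketch of step 1 (KCF is a stand-in tracker; "uav_feed.mp4" and the
# 3x upper-body-to-whole-body expansion are hypothetical choices).
cap = cv2.VideoCapture("uav_feed.mp4")
ok, frame = cap.read()
box = cv2.selectROI("select navigator", frame)   # circle the upper body by mouse

tracker = cv2.TrackerKCF_create()                # requires opencv-contrib builds
tracker.init(frame, box)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    found, (x, y, w, h) = tracker.update(frame)
    if not found:
        continue
    # Expand the tracked upper-body box into a low-resolution crop centered
    # on the navigator; the whole body is assumed ~3x the box height.
    crop = frame[int(y):int(y + 3 * h), int(x):int(x + w)]
```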
2) A block-matching-based stereo matching algorithm performs stereo matching on the low-resolution part; this stereo matching is accelerated by the graphics processor. The parameters of this step also provide the maximum and minimum disparity values D_max and D_min. The stereo matching effect is shown in Fig. 2.
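A minimal sketch of this block-matching stereo step using OpenCV's StereoBM is given below; the search range and block size are illustrative assumptions, with numDisparities playing the role of the D_max - D_min range mentioned above.

```python
import cv2

# Hedged sketch of step 2 (assumed parameters: numDisparities=64 bounds the
# disparity search range, blockSize=15 is the matching window; the image
# file names are placeholders).
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = matcher.compute(left, right).astype("float32") / 16.0  # BM output is fixed-point x16
```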
3) The stereo matching result is processed: this step is mainly used to remove background noise. First, the matched image is normalized pixel by pixel. This operation involves a large amount of computation and is accelerated with the graphics processor (GPU), completed by multithreading across multiple GPU cores. The normalization formula is:

d' = min(255, (d - D_min) / (D_max - D_min) × 255)

where D_max and D_min denote the maximum and minimum gray values respectively and d is the gray value of the current pixel; d' is taken to be no greater than the maximum value on the right-hand side.
Secondly, with the method for thresholding, background noise is filtered:
Wherein threadhold is threshold value, and the selection of threshold value is relevant with resolution of video camera and camera spacing, needs to pass through Statistics determination is carried out to parallax distribution.In the middle of the experimental demonstration system of the present invention, we determine that value is 225 according to experiment. There was only the character image of people information through the image of thresholding.Image after filtering is shown in accompanying drawing 3.
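The per-pixel normalization and thresholding above can be written compactly with NumPy, as in the sketch below; only the threshold value 225 comes from the text, while the vectorized form is an assumption.

```python
import numpy as np

def normalize_and_mask(disp, threshold=225):
    """Min-max normalize a disparity map to [0, 255] and zero out pixels
    below the threshold (the background has small disparity / large
    distance). threshold=225 is the value reported for the demo system."""
    d_min, d_max = float(disp.min()), float(disp.max())
    norm = (disp - d_min) / max(d_max - d_min, 1e-6) * 255.0
    norm[norm < threshold] = 0           # remove background noise
    return norm.astype(np.uint8)
```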
4) Differences are taken between consecutive frames of the thresholded person image sequence to obtain a difference video sequence. Suppose the depth video sequence has n frames d_1, d_2, …, d_n, where d_i denotes the depth map of the i-th frame. Since the value of a pixel in a depth image represents the distance of that pixel position from the camera lens, the motion information between two adjacent depth frames can be described by computing the difference of the pixel values at the same position. The results of differencing adjacent frames are denoted m_1, m_2, …, m_(n-1).
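A short sketch of the frame differencing, under the assumption that the absolute per-pixel difference is used:

```python
import numpy as np

def difference_sequence(depth_frames):
    """Given thresholded depth frames d_1..d_n (list of 2-D uint8 arrays),
    return the difference maps m_1..m_{n-1}; taking the absolute
    difference is an assumption of this sketch."""
    return [np.abs(depth_frames[i + 1].astype(np.int16)
                   - depth_frames[i].astype(np.int16)).astype(np.uint8)
            for i in range(len(depth_frames) - 1)]
```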
5) In order to represent the temporal characteristics of an action in a single image, the present invention encodes the depth map sequence with color: the hue H of the HSV color space is varied so that different colors represent difference depths at different moments. Let h_max and h_min denote the range of hue used in the experiment. Then in the i-th depth map, every pixel position with a computed depth difference is encoded with the hue H_i, which increases linearly with the frame index from h_min to h_max:

H_i = h_min + (h_max - h_min) × i / n
6) In the superposition process, for a pixel position p(x, y), if it has a depth change value z_i on the difference map m_i, then from the sequence of depth change values z_1, z_2, …, z_i of this pixel position over the whole video sequence, the maximum depth change value z_max = z_k can be obtained, which determines the final color H_k assigned to this pixel position. After the above operation is performed on all pixel positions of the whole picture, the whole video sequence is compressed into a single colorful color texture picture, in which the spatial position of a pixel value describes the spatial characteristics of the action sequence and the color value of the pixel carries its temporal characteristics. The color synthesis schematic is shown in Fig. 4.
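The sketch below condenses steps 5) and 6): each pixel takes the hue of the moment of its largest depth change, so one image carries both the spatial and the temporal structure of the action. The hue range, the use of the change magnitude as the V channel and the OpenCV HSV conversion are assumptions of the sketch.

```python
import numpy as np
import cv2

def color_texture(diff_frames, h_min=0, h_max=120):
    """Compress difference maps m_1..m_{n-1} into one color texture image.
    Per pixel: k = argmax_i z_i picks the moment of maximal depth change,
    and the hue H_k encodes that moment (OpenCV's uint8 hue range is 0..179)."""
    stack = np.stack(diff_frames).astype(np.float32)      # shape (n-1, H, W)
    k = stack.argmax(axis=0)                              # time of max change
    z_max = stack.max(axis=0)                             # its magnitude
    n = stack.shape[0]
    hue = (h_min + (h_max - h_min) * k / max(n - 1, 1)).astype(np.uint8)
    sat = np.full_like(hue, 255)
    val = np.clip(z_max * 2.0, 0, 255).astype(np.uint8)   # assumed scaling
    return cv2.cvtColor(np.dstack([hue, sat, val]), cv2.COLOR_HSV2BGR)
```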
7) After the color texture map is obtained, the picture is learned and classified by a convolutional neural network (CNN) to complete action recognition (the experiment adopts the AlexNet network structure).
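The patent names the AlexNet structure but not the software framework; the following PyTorch sketch is therefore only a stand-in, with an assumed six output classes (5 control actions plus the non-control class) and a hypothetical parameter file uploaded from the training workstation.

```python
import torch
from torchvision import models

# Hedged sketch: AlexNet-structured classifier for color texture maps.
# num_classes=6 and "gesture_alexnet.pth" are assumptions, not from the patent.
net = models.alexnet(num_classes=6)
net.load_state_dict(torch.load("gesture_alexnet.pth"))  # hypothetical trained parameters
net.eval()

with torch.no_grad():
    texture = torch.rand(1, 3, 224, 224)        # placeholder color texture input
    action = net(texture).argmax(dim=1).item()  # index of the recognized action
```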
To meet the real-time requirement, the present invention exploits the time interval during camera video capture and the parallel processing capability of the embedded system, performing parallel computation on images while video is being captured. Both the image processing and the recognition process of the neural network are accelerated with the graphics processor. To increase running speed, the tracking algorithm adopted in the present invention limits the tracking range to the operator's face region only; subsequent processing then crops a larger region around the tracked area. For depth map computation, a fast block-matching-based algorithm is adopted, and the stereo matching frame rate reaches about 16 frames per second. Classification results are finally generated at a rate of 2 frames per second. The system software structure is shown in Fig. 5.
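One plausible shape for this multithreaded pipeline is a chain of worker threads connected by bounded queues, as in the hedged sketch below; the stage functions are hypothetical placeholders for tracking/segmentation, stereo matching with texture generation, and CNN classification.

```python
import queue
import threading

def stage(fn, q_in, q_out=None):
    """One pipeline stage: pull an item, process it, push it downstream.
    A None item is the shutdown sentinel and is propagated."""
    while True:
        item = q_in.get()
        if item is None:
            if q_out is not None:
                q_out.put(None)
            break
        result = fn(item)
        if q_out is not None:
            q_out.put(result)

# Hypothetical stage functions standing in for the real processing steps.
f_track = lambda frame: frame
f_stereo = lambda roi: roi
f_classify = lambda tex: print("recognized:", tex)

q1, q2, q3 = (queue.Queue(maxsize=4) for _ in range(3))
for fn, qi, qo in ((f_track, q1, q2), (f_stereo, q2, q3), (f_classify, q3, None)):
    threading.Thread(target=stage, args=(fn, qi, qo)).start()

for frame in ("frame0", "frame1"):   # stand-in for camera capture
    q1.put(frame)
q1.put(None)                         # shut the pipeline down
```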
Compared with the picture captured by a stationary camera, the picture captured by a UAV often drifts and shakes with the camera. The present invention therefore needs to collect and generate corresponding data sets in different environments according to the environment of use. The training data set generated by the demonstration system of the present invention includes camera drift, camera shake and background people walking about: several videos of 5 actions were acquired, about 2000 color texture pictures were generated for each action, and in addition a non-control class containing 3000 color texture maps was generated.
Below are the experimental results on the data set of the present invention and their explanation. Action gestures are converted into action commands. The neural network is first trained on a large workstation, and the training result is uploaded to the embedded image processing platform. In an outdoor environment, at distances of 6-13 m from the UAV, the operator performed each control instruction 20 times at a given distance, 100 operation instructions in total, walking left and right and performing non-control actions in between. Tests of the system show that within a range of 10 m the recognition accuracy of the system exceeds 90 percent, and the recognition effect is reliable and effective. The recognition results are shown in Table 1.
Table 1
During the collection of the color texture maps of the training set, in order to prevent overfitting from causing actions of different durations to be recognized incorrectly, the color texture maps must be synthesized from actions of different time lengths, that is, with different numbers of frames. In the demonstration system of the present invention, maps are synthesized with lengths of 30, 40 and 50 frames respectively. Also to avoid overfitting, the training data set is further extended by means of image rotation and resolution conversion.
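A small sketch of the two augmentations named here, image rotation and resolution conversion; the angle and scale values are illustrative assumptions.

```python
import cv2

def augment(img, angle=5.0, scale=0.75):
    """Hedged sketch of the augmentations in the text: a small rotation and
    a resolution change (downscale, then back up to the original size)."""
    h, w = img.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), angle, 1.0)
    rotated = cv2.warpAffine(img, M, (w, h))
    small = cv2.resize(rotated, (int(w * scale), int(h * scale)))
    return cv2.resize(small, (w, h))   # original size, reduced detail
```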
The above describes only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art can make several deformations and improvements without departing from the inventive concept, and these all belong to the protection scope of the present invention.

Claims (8)

1. A UAV man-machine interaction method based on binocular vision and deep learning, characterized in that: an embedded image processing platform and a binocular camera are carried on the UAV; the embedded image processing platform connects to the flight controller through an interface; the UAV communicates with the ground through the platform; the platform carries a graphics processor and runs a convolutional neural network deep learning algorithm, performing parallel computation on images while the camera captures video.
2. The UAV man-machine interaction method based on binocular vision and deep learning according to claim 1, characterized in that the specific steps of the interaction method are:
1) According to the single-view image returned by the UAV, the UAV navigator is selected: using a mouse or touch screen, the navigator's upper-body region is circled;
2) The selected region is expanded according to human body proportions into a frame covering the whole body of the person; at the same time, the corresponding person region is selected in the other viewpoint, and a tracking algorithm extracts the navigator region frame by frame from the video sequence to track the navigator; according to the position of the set region, the image of the region occupied by the whole human body is segmented out;
3) Stereo matching: according to a block-matching-based stereo matching algorithm, the segmented images of the left and right viewpoints are matched to obtain the segmented stereo disparity map, which contains the navigator and the background;
4) The obtained disparity map is normalized, and the background is removed with a threshold to obtain a clean person image;
5) Adjacent frames of the person image are differenced to obtain a difference image sequence;
6) According to the generated picture sequence, the difference images of successive moments are encoded with different colors, and the person images produced over about 2 s of pictures are superimposed into a color texture map; using a sliding-window method, an approximately 2 s clip is taken every 5 frames;
7) A large number of color texture maps of the selected actions, performed by different operators under various circumstances, are collected to train the neural network; training is carried out on a dedicated workstation, and after training the trained parameters are uploaded to the embedded image processing platform;
8) On the embedded image processing platform, the trained parameters are used to classify and recognize the color texture maps generated from real-time acquisition;
9) The recognized action command is sent to the flight controller to instruct the UAV to move.
3. The UAV man-machine interaction method based on binocular vision and deep learning according to claim 2, characterized in that: in step 2), the tracking range of the tracking algorithm is limited to the operator's face region only.
4. The UAV man-machine interaction method based on binocular vision and deep learning according to claim 2, characterized in that: in step 3), the stereo matching algorithm is accelerated by the graphics processor.
5. The UAV man-machine interaction method based on binocular vision and deep learning according to claim 2, characterized in that: in step 4), the normalization is accelerated by the graphics processor.
6. The UAV man-machine interaction method based on binocular vision and deep learning according to claim 2, characterized in that: in step 8), the classification and recognition are accelerated by the graphics processor.
7. The UAV man-machine interaction method based on binocular vision and deep learning according to claim 2, characterized in that: in step 7), the collected color texture maps for training are synthesized with lengths of 30, 40 and 50 frames respectively.
8. The UAV man-machine interaction method based on binocular vision and deep learning according to claim 2, characterized in that: in step 7), the training data set is extended by means of image rotation and resolution conversion.
CN201611030533.XA 2016-11-16 2016-11-16 UAV man-machine interaction method based on binocular vision and deep learning Expired - Fee Related CN106598226B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611030533.XA CN106598226B (en) 2016-11-16 2016-11-16 UAV man-machine interaction method based on binocular vision and deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611030533.XA CN106598226B (en) 2016-11-16 2016-11-16 UAV man-machine interaction method based on binocular vision and deep learning

Publications (2)

Publication Number Publication Date
CN106598226A true CN106598226A (en) 2017-04-26
CN106598226B CN106598226B (en) 2019-05-21

Family

ID=58591578

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611030533.XA Expired - Fee Related CN106598226B (en) 2016-11-16 UAV man-machine interaction method based on binocular vision and deep learning

Country Status (1)

Country Link
CN (1) CN106598226B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11893393B2 (en) 2017-07-24 2024-02-06 Tesla, Inc. Computational array microprocessor system with hardware arbiter managing memory requests
US11361457B2 (en) 2018-07-20 2022-06-14 Tesla, Inc. Annotation cross-labeling for autonomous control systems
CN115512173A (en) 2018-10-11 2022-12-23 特斯拉公司 System and method for training machine models using augmented data
US11816585B2 (en) 2018-12-03 2023-11-14 Tesla, Inc. Machine learning models operating at different frequencies for autonomous vehicles
US10956755B2 (en) 2019-02-19 2021-03-23 Tesla, Inc. Estimating object properties using visual image data

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130253733A1 (en) * 2012-03-26 2013-09-26 Hon Hai Precision Industry Co., Ltd. Computing device and method for controlling unmanned aerial vehicle in flight space
CN102665049A (en) * 2012-03-29 2012-09-12 中国科学院半导体研究所 Programmable visual chip-based visual image processing system
US20160327950A1 (en) * 2014-06-19 2016-11-10 Skydio, Inc. Virtual camera interface and other user interaction paradigms for a flying digital assistant
CN104808799A (en) * 2015-05-20 2015-07-29 成都通甲优博科技有限责任公司 Unmanned aerial vehicle capable of indentifying gesture and identifying method thereof
CN105222760A (en) * 2015-10-22 2016-01-06 一飞智控(天津)科技有限公司 The autonomous obstacle detection system of a kind of unmanned plane based on binocular vision and method
CN106022236A (en) * 2016-05-13 2016-10-12 上海宝宏软件有限公司 Action identification method based on human body contour
CN106020227A (en) * 2016-08-12 2016-10-12 北京奇虎科技有限公司 Control method and device for unmanned aerial vehicle

Cited By (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11487288B2 (en) 2017-03-23 2022-11-01 Tesla, Inc. Data synthesis for autonomous control systems
CN108496188A (en) * 2017-05-31 2018-09-04 深圳市大疆创新科技有限公司 Method, apparatus, computer system and the movable equipment of neural metwork training
US11403069B2 (en) 2017-07-24 2022-08-02 Tesla, Inc. Accelerated mathematical engine
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US11681649B2 (en) 2017-07-24 2023-06-20 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
CN107643826A (en) * 2017-08-28 2018-01-30 天津大学 A kind of unmanned plane man-machine interaction method based on computer vision and deep learning
CN107657233A (en) * 2017-09-28 2018-02-02 东华大学 Static sign language real-time identification method based on modified single multi-target detection device
CN108257145A (en) * 2017-12-13 2018-07-06 北京华航无线电测量研究所 A kind of UAV Intelligent based on AR technologies scouts processing system and method
CN108052901A (en) * 2017-12-13 2018-05-18 中国科学院沈阳自动化研究所 A kind of gesture identification Intelligent unattended machine remote control method based on binocular
CN108257145B (en) * 2017-12-13 2021-07-02 北京华航无线电测量研究所 Intelligent unmanned aerial vehicle reconnaissance processing system and method based on AR technology
CN108052901B (en) * 2017-12-13 2021-05-25 中国科学院沈阳自动化研究所 Binocular-based gesture recognition intelligent unmanned aerial vehicle remote control method
CN109155067A (en) * 2018-01-23 2019-01-04 深圳市大疆创新科技有限公司 The control method of moveable platform, equipment, computer readable storage medium
WO2019144300A1 (en) * 2018-01-23 2019-08-01 深圳市大疆创新科技有限公司 Target detection method and apparatus, and movable platform
US11227388B2 (en) 2018-01-23 2022-01-18 SZ DJI Technology Co., Ltd. Control method and device for mobile platform, and computer readable storage medium
US11561791B2 (en) 2018-02-01 2023-01-24 Tesla, Inc. Vector computational unit receiving data elements in parallel from a last row of a computational array
US11734562B2 (en) 2018-06-20 2023-08-22 Tesla, Inc. Data pipeline and deep learning system for autonomous driving
US11636333B2 (en) 2018-07-26 2023-04-25 Tesla, Inc. Optimizing neural network structures for embedded systems
CN110730899A (en) * 2018-08-23 2020-01-24 深圳市大疆创新科技有限公司 Control method and device for movable platform
CN110730899B (en) * 2018-08-23 2024-01-16 深圳市大疆创新科技有限公司 Control method and device for movable platform
US11562231B2 (en) 2018-09-03 2023-01-24 Tesla, Inc. Neural networks for embedded devices
CN109445465A (en) * 2018-10-17 2019-03-08 深圳市道通智能航空技术有限公司 Method for tracing, system, unmanned plane and terminal based on unmanned plane
US11665108B2 (en) 2018-10-25 2023-05-30 Tesla, Inc. QoS manager for system on a chip communications
CN109343565A (en) * 2018-10-29 2019-02-15 中国航空无线电电子研究所 A kind of UAV Intelligent ground control control method based on gesture perception identification
US11537811B2 (en) 2018-12-04 2022-12-27 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11908171B2 (en) 2018-12-04 2024-02-20 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
CN109858341A (en) * 2018-12-24 2019-06-07 北京澎思智能科技有限公司 A kind of Face detection and tracking method based on embedded system
CN109858341B (en) * 2018-12-24 2021-05-25 北京澎思科技有限公司 Rapid multi-face detection and tracking method based on embedded system
US11610117B2 (en) 2018-12-27 2023-03-21 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
CN109697428A (en) * 2018-12-27 2019-04-30 江西理工大学 Positioning system is identified based on the unmanned plane of RGB_D and depth convolutional network
US11748620B2 (en) 2019-02-01 2023-09-05 Tesla, Inc. Generating ground truth for machine learning from time series elements
US11567514B2 (en) 2019-02-11 2023-01-31 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
CN110096973A (en) * 2019-04-16 2019-08-06 东南大学 A kind of traffic police's gesture identification method separating convolutional network based on ORB algorithm and depth level
CN110222581A (en) * 2019-05-13 2019-09-10 电子科技大学 A kind of quadrotor drone visual target tracking method based on binocular camera
CN110222581B (en) * 2019-05-13 2022-04-19 电子科技大学 Binocular camera-based quad-rotor unmanned aerial vehicle visual target tracking method
CN110415236A (en) * 2019-07-30 2019-11-05 深圳市博铭维智能科技有限公司 A kind of method for detecting abnormality of the complicated underground piping based on double-current neural network
WO2021042375A1 (en) * 2019-09-06 2021-03-11 深圳市汇顶科技股份有限公司 Face spoofing detection method, chip, and electronic device
CN112997185A (en) * 2019-09-06 2021-06-18 深圳市汇顶科技股份有限公司 Face living body detection method, chip and electronic equipment
TWI784254B (en) * 2020-03-18 2022-11-21 中強光電股份有限公司 Unmanned aircraft and image recognition method thereof
US11875560B2 (en) 2020-03-18 2024-01-16 Coretronic Corporation Unmanned aerial vehicle and image recognition method thereof

Also Published As

Publication number Publication date
CN106598226B (en) 2019-05-21

Similar Documents

Publication Publication Date Title
CN106598226B (en) UAV man-machine interaction method based on binocular vision and deep learning
US10521928B2 (en) Real-time gesture recognition method and apparatus
EP3961485A1 (en) Image processing method, apparatus and device, and storage medium
CN107168527B (en) The first visual angle gesture identification and exchange method based on region convolutional neural networks
CN107688391B (en) Gesture recognition method and device based on monocular vision
CN108776773B (en) Three-dimensional gesture recognition method and interaction system based on depth image
CN102854983B (en) A kind of man-machine interaction method based on gesture identification
CN102638653B (en) Automatic face tracing method on basis of Kinect
CN103530619A (en) Gesture recognition method of small quantity of training samples based on RGB-D (red, green, blue and depth) data structure
CN107765855A (en) A kind of method and system based on gesture identification control machine people motion
CN102194443B (en) Display method and system for window of video picture in picture and video processing equipment
CN108960067A (en) Real-time train driver motion recognition system and method based on deep learning
CN103105924B (en) Man-machine interaction method and device
US10803604B1 (en) Layered motion representation and extraction in monocular still camera videos
WO2023146241A1 (en) System and method for generating a three-dimensional photographic image
CN112487981A (en) MA-YOLO dynamic gesture rapid recognition method based on two-way segmentation
Maher et al. Realtime human-UAV interaction using deep learning
Abed et al. Python-based Raspberry Pi for hand gesture recognition
CN113158833A (en) Unmanned vehicle control command method based on human body posture
WO2022267653A1 (en) Image processing method, electronic device, and computer readable storage medium
CN108052901A (en) A kind of gesture identification Intelligent unattended machine remote control method based on binocular
CN114241379A (en) Passenger abnormal behavior identification method, device and equipment and passenger monitoring system
CN116935008A (en) Display interaction method and device based on mixed reality
Niranjani et al. System application control based on Hand gesture using Deep learning
CN104123008B (en) A kind of man-machine interaction method and system based on static gesture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190521

Termination date: 20201116