WO2022141369A1 - Systems and methods for supporting automatic video capture and video editing

Systems and methods for supporting automatic video capture and video editing

Info

Publication number
WO2022141369A1
Authority
WO
WIPO (PCT)
Prior art keywords
implementations
video
movable object
uav
flight route
Prior art date
Application number
PCT/CN2020/142023
Other languages
French (fr)
Inventor
Luoxiao QIN
Wei Zhang
Yuqi Liu
Junbei SHANG
Original Assignee
SZ DJI Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by SZ DJI Technology Co., Ltd.
Priority to PCT/CN2020/142023 (WO2022141369A1)
Priority to PCT/CN2021/087611 (WO2022141955A1)
Priority to CN202180005825.0A (CN114556256A)
Priority to PCT/CN2021/087612 (WO2022141956A1)
Priority to CN202180006635.0A (CN114981746A)
Publication of WO2022141369A1
Priority to US18/215,729 (US20230359204A1)


Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/0094 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots involving pointing a payload, e.g. camera, weapon, sensor, towards a fixed or moving target
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20 Image signal generators
    • H04N13/204 Image signal generators using stereoscopic image cameras
    • H04N13/239 Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/695 Control of camera direction for changing a field of view, e.g. pan, tilt or based on tracking of objects
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B64 AIRCRAFT; AVIATION; COSMONAUTICS
    • B64U UNMANNED AERIAL VEHICLES [UAV]; EQUIPMENT THEREFOR
    • B64U2101/00 UAVs specially adapted for particular uses or applications
    • B64U2101/30 UAVs specially adapted for particular uses or applications for imaging, photography or videography
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B64 AIRCRAFT; AVIATION; COSMONAUTICS
    • B64U UNMANNED AERIAL VEHICLES [UAV]; EQUIPMENT THEREFOR
    • B64U2201/00 UAVs characterised by their flight controls
    • B64U2201/10 UAVs characterised by their flight controls autonomous, i.e. by navigating independently from ground or air stations, e.g. by using inertial navigation systems [INS]
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S3/00 Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received
    • G01S3/78 Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using electromagnetic waves other than radio waves
    • G01S3/782 Systems for determining direction or deviation from predetermined direction
    • G01S3/785 Systems for determining direction or deviation from predetermined direction using adjustment of orientation of directivity characteristics of a detector or detector system to give a desired condition of signal derived from that detector or detector system
    • G01S3/786 Systems for determining direction or deviation from predetermined direction using adjustment of orientation of directivity characteristics of a detector or detector system to give a desired condition of signal derived from that detector or detector system the desired condition being maintained automatically
    • G01S3/7864 T.V. type tracking systems
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/16 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using electromagnetic waves other than radio waves

Definitions

  • the disclosed implementations relate generally to automatic video capture and video editing and more specifically, to systems, methods, and user interfaces that enable automatic video capture and video editing for UAV aerial photography.
  • Movable objects can be used for performing surveillance, reconnaissance, and exploration tasks for military and civilian applications.
  • An unmanned aerial vehicle (UAV) is an example of a movable object.
  • a movable object may carry a payload for performing specific functions such as capturing images and video of a surrounding environment of the movable object or for tracking a specific target.
  • a movable object may track a target object moving on the ground or in the air. Movement control information for controlling a movable object is typically received by the movable object from a remote device and/or determined by the movable object.
  • UAV aerial photography involves a series of operations, such as camera settings, gimbal control, joystick control, and image composition and view finding. If a user wants to use the UAV to capture smooth videos with pleasing image composition, the user needs to adjust numerous parameters for the camera, gimbal, joystick, and image composition and view finding. This control process is relatively complex, so it is challenging for a user who is unfamiliar with aerial photography operations to determine satisfactory parameters in a short amount of time. Furthermore, if a user wants to capture a video that contains multiple shots of various scenes and edit them into a visually appealing and logically coherent video, the user still needs to manually capture each of the scenes and then combine and edit them.
  • Such systems and methods optionally complement or replace conventional methods for target tracking, image or video capture, and/or image or video editing.
  • a method is performed by an unmanned aerial vehicle (UAV).
  • the UAV receives, from a computing device that is communicatively connected to the UAV, a first input that includes identification of a target object.
  • the UAV determines a target type corresponding to the target object.
  • the UAV also determines a distance between the UAV and the target object.
  • the UAV selects automatically, from a plurality of predefined flight routes, a flight route for the UAV.
  • the UAV sends to the computing device the selected flight route for display on the computing device.
  • the selected flight route includes a plurality of paths of different trajectory modes.
  • after the sending, the UAV receives a second input from the computing device.
  • the UAV controls itself to fly autonomously according to the selected flight route, including capturing, by an image sensor of the UAV, a video feed having a field of view of the image sensor and corresponding to each path of the plurality of paths.
  • the UAV simultaneously stores the video feed while the video feed is being captured.
  • the UAV stores, with the video feed, tag information associated with the flight path and trajectory mode corresponding to a respective segment of the video feed.
  • the UAV simultaneously sends the video feed, and the tag information associated with the flight path and trajectory mode corresponding to a respective segment of the video feed, to the computing device for storage on the remote control device.
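The disclosure does not specify a storage format for the tag information. The minimal Python sketch below shows one hypothetical way a segment tag (path index, trajectory mode, and time range) could be kept alongside the recorded feed; all class names and fields are illustrative assumptions, not the patent's data structures.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SegmentTag:
    """Hypothetical tag record attached to one segment of the video feed."""
    path_index: int        # which path of the flight route produced this segment
    trajectory_mode: str   # e.g., "fly_backward_face_target" (assumed name)
    start_time_s: float    # segment start, seconds since recording began
    end_time_s: float      # segment end

@dataclass
class TaggedVideoFeed:
    """Video feed stored on the UAV and mirrored to the computing device."""
    frames: List[bytes] = field(default_factory=list)
    tags: List[SegmentTag] = field(default_factory=list)

    def append_frame(self, frame: bytes) -> None:
        self.frames.append(frame)

    def close_segment(self, path_index: int, mode: str, start_s: float, end_s: float) -> None:
        # Record which flight path and trajectory mode the just-captured segment belongs to.
        self.tags.append(SegmentTag(path_index, mode, start_s, end_s))
```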
  • the plurality of predefined flight routes include: a portrait flight route, a long range flight route, and a normal flight route.
  • each of the plurality of paths comprises a respective one or more of: a path distance, a velocity of the UAV, an acceleration of the UAV, a flight time, an angle of view, a starting altitude, an ending altitude, a pan-tilt-zoom (PTZ) setting of the image sensor, an optical zoom setting of the image sensor, a digital zoom setting of the image sensor, and a focal length of the image sensor.
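As a purely illustrative sketch of the automatic route selection described above, the snippet below represents a flight route as a list of paths carrying the parameters listed in the preceding item and picks a route from the target type and distance. The thresholds, path values, and trajectory-mode names are assumptions, not values from the disclosure.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Path:
    """One path of a flight route; fields mirror the parameters listed above."""
    trajectory_mode: str
    distance_m: float
    velocity_mps: float
    start_altitude_m: float
    end_altitude_m: float
    zoom: float = 1.0

@dataclass
class FlightRoute:
    name: str
    paths: List[Path]

# Illustrative routes only; the disclosure does not give numeric parameters.
PORTRAIT = FlightRoute("portrait", [Path("orbit_face_target", 30, 2, 5, 5)])
NORMAL = FlightRoute("normal", [Path("fly_backward_face_target", 60, 4, 10, 25)])
LONG_RANGE = FlightRoute("long_range", [Path("fly_backward_face_target", 300, 8, 20, 80)])

def select_route(target_type: str, distance_m: float) -> FlightRoute:
    """Illustrative matching strategy: people nearby get the portrait route,
    distant targets get the long-range route, everything else the normal route."""
    if target_type == "person" and distance_m < 20:
        return PORTRAIT
    if distance_m > 100:
        return LONG_RANGE
    return NORMAL
```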
  • sending to the computing device the selected flight route further comprises causing to be displayed on the computing device a preview of the selected flight route.
  • the preview comprises a three-dimensional or two-dimensional representation of the selected flight route.
  • the preview comprises a map of a vicinity of the UAV and the target object.
  • the trajectory modes include a first trajectory mode, corresponding to: the UAV traveling in a first direction; and a field of view of an image sensor of the UAV facing a second direction, opposite to the first direction.
  • the trajectory modes include a first trajectory mode, corresponding to: the UAV traveling in a first direction; and a field of view of an image sensor of the UAV facing the first direction.
  • the method further comprises rotating the field of view of an image sensor of the UAV from the first direction to a second direction, distinct from the first direction, while executing the first trajectory mode.
  • the trajectory modes include a first trajectory mode, corresponding to: the UAV traveling in a first direction; and a field of view of an image sensor of the UAV facing a second direction, perpendicular to the first direction.
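The trajectory modes above differ mainly in how the image sensor's field of view is oriented relative to the direction of travel. A minimal sketch, with assumed mode names and angle conventions:

```python
import math

# Camera yaw offset (degrees) relative to the UAV's direction of travel,
# one entry per trajectory mode described above (names are assumptions).
TRAJECTORY_MODE_YAW_OFFSET = {
    "face_backward": 180.0,   # camera faces opposite the travel direction
    "face_forward": 0.0,      # camera faces the travel direction
    "face_sideways": 90.0,    # camera faces perpendicular to the travel direction
}

def camera_heading(travel_heading_deg: float, mode: str) -> float:
    """Return the absolute camera heading for a given travel heading and mode."""
    return (travel_heading_deg + TRAJECTORY_MODE_YAW_OFFSET[mode]) % 360.0

# Example: flying due north (0 deg) in "face_backward" mode points the camera due south.
assert math.isclose(camera_heading(0.0, "face_backward"), 180.0)
```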
  • a method for editing a video is performed at a computing device.
  • the computing device has one or more processors and memory storing programs to be executed by the one or more processors.
  • the video includes a plurality of video segments captured by an unmanned aerial vehicle (UAV) during a flight route associated with a target object. Each of the video segments corresponds to a respective path of the flight route.
  • the computing device obtains a set of tag information for each of the plurality of video segments.
  • the computing device selects, from a plurality of video templates, a video template for the video.
  • the computing device extracts, from the plurality of video segments, one or more video sub-segments according to the tag information and the selected video template.
  • the computing device combines the extracted video sub-segments into a complete video of the flight route of the UAV.
  • the selected video template includes a plurality of scenes, each of the scenes corresponding to a respective subset of the tag information.
  • the plurality of scenes include: an opening scene, one or more intermediate scenes, and a concluding scene.
  • the selected video template comprises a theme.
  • the selected video template includes music that matches the theme.
  • the extracted video sub-segments are combined according to a time sequence in which the video sub-segments are captured.
  • the extracted video sub-segments are combined in a time sequence that is defined by the selected video template.
  • the method further comprises: prior to the extracting, receiving a user input specifying a total time duration of the video.
  • the method further comprises automatically allocating a time duration for each video sub-segment.
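A hedged sketch of the editing flow described above: a template's scenes are matched to tagged segments, the user-specified total duration is split across scenes, and the clipped sub-segments are kept in the template's order. Scene names, weights, and the matching rule are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Scene:
    name: str           # e.g., "opening", "intermediate", "concluding"
    wanted_mode: str    # trajectory mode whose footage fits this scene (assumed rule)
    weight: float       # share of the total duration given to this scene

@dataclass
class Segment:
    mode: str
    start_s: float
    end_s: float

def edit_video(segments: List[Segment], template: List[Scene], total_s: float) -> List[tuple]:
    """Return (start, end) clip ranges, one per scene, in the template's order."""
    weight_sum = sum(scene.weight for scene in template)
    clips = []
    for scene in template:
        allotted = total_s * scene.weight / weight_sum
        # Pick the first captured segment whose tag matches this scene's mode.
        match = next((seg for seg in segments if seg.mode == scene.wanted_mode), None)
        if match is None:
            continue
        clip_len = min(allotted, match.end_s - match.start_s)
        clips.append((match.start_s, match.start_s + clip_len))
    return clips
```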
  • the plurality of video templates comprise a plurality of themes.
  • the selected video template is selected based on a user input.
  • the flight route is one of a plurality of predefined flight routes.
  • the plurality of video templates are determined based on the flight route.
  • a UAV comprises an image sensor, one or more processors, memory, and one or more programs stored in the memory.
  • the programs are configured for execution by the one or more processors.
  • the one or more programs include instructions for performing any of the methods described herein.
  • a computing device includes one or more processors, memory, and one or more programs stored in the memory.
  • the programs are configured for execution by the one or more processors.
  • the one or more programs include instructions for performing any of the methods described herein.
  • a non-transitory computer-readable storage medium stores one or more programs configured for execution by a UAV having one or more processors and memory.
  • the one or more programs include instructions for performing any of the methods described herein.
  • a non-transitory computer-readable storage medium stores one or more programs configured for execution by a computing device having one or more processors and memory.
  • the one or more programs include instructions for performing any of the methods described herein.
  • Figure 1 illustrates an exemplary target tracking system according to some implementations.
  • Figures 2A to 2C illustrate respectively, an exemplary movable object, an exemplary carrier of a movable object, and an exemplary payload of a movable object according to some implementations.
  • Figure 3 illustrates an exemplary sensing system of a movable object according to some implementations.
  • Figure 4 is a block diagram illustrating an exemplary memory 118 of a movable object 102 according to some implementations.
  • Figure 5 illustrates an exemplary control unit of a target tracking system according to some implementations.
  • Figure 6 illustrates an exemplary computing device for controlling a movable object according to some implementations.
  • Figure 7 illustrates an exemplary configuration 700 of a movable object 102, a carrier 108, and a payload 110 according to some implementations.
  • Figures 8A and 8B illustrate process flows between a user, a computing device 126, and a movable object 102 according to some implementations.
  • Figure 9 provides a screen shot for selecting an object of interest as a target object according to some implementations.
  • Figure 10 provides a screen shot for displaying flight range information according to some implementations.
  • Figure 11 provides a screen shot for execution of a flight route by a movable object according to some implementations.
  • Figure 12 provides a screen shot during a video editing process according to some implementations.
  • Figure 13 illustrates a flight route matching strategy according to some implementations.
  • Figure 14 illustrates an exemplary normal flight route according to some implementations.
  • Figure 15 illustrates an exemplary portrait flight route according to some implementations.
  • Figure 16 illustrates an exemplary long range flight route according to some implementations.
  • Figure 17 illustrates an exemplary video segment download and extraction process at a computing device, in accordance with some implementations.
  • Figure 18 illustrates an exemplary video template matching strategy in accordance with some implementations.
  • Figure 19 illustrates another exemplary video template matching strategy in accordance with some implementations.
  • Figures 20A-20C provide a flowchart of a method that is performed by a UAV according to some implementations.
  • Figures 21A-21C provide a flowchart of a method for editing a video according to some implementations.
  • UAVs include, e.g., fixed-wing aircraft and rotary-wing aircraft such as helicopters, quadcopters, and aircraft having other numbers and/or configurations of rotors. It will be apparent to those skilled in the art that other types of movable objects may be substituted for UAVs as described below in accordance with implementations of the invention.
  • FIG. 1 illustrates an exemplary target tracking system 100 according to some implementations.
  • the target tracking system 100 includes a movable object 102 (e.g., a UAV) and a control unit 104.
  • the target tracking system 100 is used for tracking a target object 106 and/or for initiating tracking of the target object 106.
  • the target object 106 includes natural and/or man-made objects, such as geographical landscapes (e.g., mountains, vegetation, valleys, lakes, and/or rivers), buildings, and/or vehicles (e.g., aircraft, ships, cars, trucks, buses, vans, and/or motorcycles).
  • the target object 106 includes live subjects such as people and/or animals.
  • the target object 106 is a moving object, e.g., moving relative to a reference frame (such as the Earth and/or movable object 102) .
  • the target object 106 is static.
  • the target object 106 includes an active positioning and navigational system (e.g., a GPS system) that transmits information (e.g., location, positioning, and/or velocity information) about the target object 106 to the movable object 102, a control unit 104, and/or a computing device 126.
  • information may be transmitted to the movable object 102 via wireless communication from a communication unit of the target object 106 to a communication system 120 of the movable object 102, as illustrated in Figure 2A.
  • the movable object 102 includes a carrier 108 and/or a payload 110.
  • the carrier 108 is used to couple the payload 110 to the movable object 102.
  • the carrier 108 includes an element (e.g., a gimbal and/or damping element) to isolate the payload 110 from movement of the movable object 102.
  • the carrier 108 includes an element for controlling movement of the payload 110 relative to the movable object 102.
  • the payload 110 is coupled (e.g., rigidly coupled) to the movable object 102 (e.g., coupled via the carrier 108) such that the payload 110 remains substantially stationary relative to movable object 102.
  • the carrier 108 may be coupled to the payload 110 such that the payload is not movable relative to the movable object 102.
  • the payload 110 is mounted directly to the movable object 102 without requiring the carrier 108.
  • the payload 110 is located partially or fully within the movable object 102.
  • the movable object 102 is configured to communicate with the control unit 104, e.g., via wireless communications 124.
  • the movable object 102 may receive control instructions from the control unit 104 and/or send data (e.g., data from a movable object sensing system 122, Figure 2A) to the control unit 104.
  • control instructions may include, e.g., navigation instructions for controlling one or more navigational parameters of the movable object 102 such as a position, an orientation, an altitude, an attitude (e.g., aviation) and/or one or more movement characteristics of the movable object 102.
  • control instructions may include instructions for controlling one or more parameters of a carrier 108 and/or a payload 110.
  • the control instructions include instructions for directing movement of one or more of movement mechanisms 114 ( Figure 2A) of the movable object 102.
  • the control instructions may be used to control a flight of the movable object 102.
  • control instructions may include information for controlling operations (e.g., movement) of the carrier 108.
  • control instructions may be used to control an actuation mechanism of the carrier 108 so as to cause angular and/or linear movement of the payload 110 relative to the movable object 102.
  • control instructions are used to adjust one or more operational parameters for the payload 110, such as instructions for capturing one or more images, capturing video, adjusting a zoom level, powering on or off a component of the payload, adjusting an imaging mode (e.g., capturing still images or capturing video) , adjusting an image resolution, adjusting a focus, adjusting a viewing angle, adjusting a field of view, adjusting a depth of field, adjusting an exposure time, adjusting a shutter speed, adjusting a lens speed, adjusting an ISO, changing a lens and/or moving the payload 110 (and/or a part of payload 110, such as imaging device 214 (shown in Figure 2C) ) .
  • the control instructions are used to control the communication system 120, the sensing system 122, and/or another component of the movable object 102.
  • control instructions from the control unit 104 may include target information.
  • the movable object 102 is configured to communicate with a computing device 126 (e.g., an electronic device) .
  • the movable object 102 receives control instructions from the computing device 126 and/or sends data (e.g., data from the movable object sensing system 122) to the computing device 126.
  • communications from the computing device 126 to the movable object 102 are transmitted from computing device 126 to a cell tower 130 (e.g., via internet 128) and from the cell tower 130 to the movable object 102 (e.g., via RF signals) .
  • a satellite is used in lieu of or in addition to cell tower 130.
  • the target tracking system 100 includes additional control units 104 and/or computing devices 126 that are configured to communicate with the movable object 102.
  • Figure 2A illustrates an exemplary movable object 102 according to some implementations.
  • the movable object 102 includes processor (s) 116, memory 118, a communication system 120, and a sensing system 122, which are connected by data connections such as a control bus 112.
  • the control bus 112 optionally includes circuitry (sometimes called a chipset) that interconnects and controls communications between system components.
  • the movable object 102 is a UAV and includes components to enable flight and/or flight control. Although the movable object 102 is depicted as an aircraft in this example, this depiction is not intended to be limiting, and any suitable type of movable object may be used.
  • the movable object 102 includes movement mechanisms 114 (e.g., propulsion mechanisms) .
  • movement mechanisms 114 may refer to a single movement mechanism (e.g., a single propeller) or multiple movement mechanisms (e.g., multiple rotors) .
  • the movement mechanisms 114 may include one or more movement mechanism types such as rotors, propellers, blades, engines, motors, wheels, axles, magnets, and nozzles.
  • the movement mechanisms 114 are coupled to the movable object 102 at, e.g., the top, bottom, front, back, and/or sides.
  • the movement mechanisms 114 of a single movable object 102 may include multiple movement mechanisms each having the same type. In some implementations, the movement mechanisms 114 of a single movable object 102 include multiple movement mechanisms with different movement mechanism types.
  • the movement mechanisms 114 are coupled to the movable object 102 using any suitable means, such as support elements (e.g., drive shafts) or other actuating elements (e.g., one or more actuators 132) .
  • the actuator 132 (e.g., a movable object actuator) receives control signals from the processor (s) 116 (e.g., via the control bus 112) that activate the actuator 132 to cause movement of a movement mechanism 114.
  • the processor (s) 116 include an electronic speed controller that provides control signals to the actuators 132.
  • the movement mechanisms 114 enable the movable object 102 to take off vertically from a surface or land vertically on a surface without requiring any horizontal movement of the movable object 102 (e.g., without traveling down a runway) .
  • the movement mechanisms 114 are operable to permit the movable object 102 to hover in the air at a specified position and/or orientation.
  • one or more of the movement mechanisms 114 are controllable independently of one or more of the other movement mechanisms 114. For example, when the movable object 102 is a quadcopter, each rotor of the quadcopter is controllable independently of the other rotors of the quadcopter.
  • multiple movement mechanisms 114 are configured for simultaneous movement.
  • the movement mechanisms 114 include multiple rotors that provide lift and/or thrust to the movable object 102.
  • the multiple rotors are actuated to provide, e.g., vertical takeoff, vertical landing, and hovering capabilities to the movable object 102.
  • one or more of the rotors spin in a clockwise direction, while one or more of the rotors spin in a counterclockwise direction.
  • the number of clockwise rotors is equal to the number of counterclockwise rotors.
  • the rotation rate of each of the rotors is independently variable, e.g., for controlling the lift and/or thrust produced by each rotor, and thereby adjusting the spatial disposition, velocity, and/or acceleration of the movable object 102 (e.g., with respect to up to three degrees of translation and/or up to three degrees of rotation) .
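For context, independently varying rotor rates is what a conventional quadcopter mixer does. The sketch below is a textbook-style example, not taken from the disclosure, that maps thrust, roll, pitch, and yaw commands to four rotor commands; the sign conventions and rotor layout are assumptions.

```python
def mix_quadcopter(thrust: float, roll: float, pitch: float, yaw: float) -> list:
    """Illustrative 'X' configuration mixer: returns four rotor commands.
    Rotors 0 and 2 spin clockwise, 1 and 3 counterclockwise, so yaw is produced
    by speeding up one spin-direction pair while slowing the other."""
    return [
        thrust + roll + pitch - yaw,   # front-left  (CW)
        thrust - roll + pitch + yaw,   # front-right (CCW)
        thrust - roll - pitch - yaw,   # rear-right  (CW)
        thrust + roll - pitch + yaw,   # rear-left   (CCW)
    ]

# Example: pure thrust with a small roll command speeds up the left pair slightly.
print(mix_quadcopter(0.6, 0.05, 0.0, 0.0))
```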
  • the memory 118 stores one or more instructions, programs (e.g., sets of instructions) , modules, controlling systems and/or data structures, collectively referred to as “elements” herein.
  • One or more elements described with regard to the memory 118 are optionally stored by the control unit 104, the computing device 126, and/or another device.
  • an imaging device 214 ( Figure 2C) includes memory that stores one or more parameters described with regard to the memory 118.
  • the memory 118 stores a controlling system configuration that includes one or more system settings (e.g., as configured by a manufacturer, administrator, and/or user) .
  • identifying information for the movable object 102 is stored as a system setting of the system configuration.
  • the controlling system configuration includes a configuration for the imaging device 214.
  • the configuration for the imaging device 214 stores parameters such as position (e.g., relative to the image sensor 216) , a zoom level and/or focus parameters (e.g., amount of focus, selecting autofocus or manual focus, and/or adjusting an autofocus target in an image) .
  • Imaging property parameters stored by the imaging device configuration include, e.g., image resolution, image size (e.g., image width and/or height) , aspect ratio, pixel count, quality, focus distance, depth of field, exposure time, shutter speed, and/or white balance.
  • parameters stored by the imaging device configuration are updated in response to control instructions (e.g., generated by processor (s) 116 and/or received by the movable object 102 from the control unit 104 and/or the computing device 126) .
  • parameters stored by the imaging device configuration are updated in response to information received from the movable object sensing system 122 and/or the imaging device 214.
  • the carrier 108 is coupled to the movable object 102 and a payload 110 is coupled to the carrier 108.
  • the carrier 108 includes one or more mechanisms that enable the payload 110 to move relative to the movable object 102, as described further with respect to Figure 2B.
  • the payload 110 is rigidly coupled to the movable object 102 such that the payload 110 remains substantially stationary relative to the movable object 102.
  • the carrier 108 is coupled to the payload 110 such that the payload 110 is not movable relative to the movable object 102.
  • the payload 110 is coupled to the movable object 102 without requiring the use of the carrier 108.
  • the movable object 102 also includes the communication system 120, which enables communication between the movable object 102 and the control unit 104 and/or the computing device 126 (e.g., via wireless signals 124).
  • the communication system 120 includes transmitters, receivers, and/or transceivers for wireless communication.
  • the communication is a one-way communication, such that data is transmitted only from the movable object 102 to the control unit 104, or vice-versa.
  • communication is a two-way communication, such that data is transmitted from the movable object 102 to the control unit 104, as well as from the control unit 104 to the movable object 102.
  • the movable object 102 communicates with the computing device 126.
  • the movable object 102, the control unit 104, and/or the computing device 126 are connected to the Internet or other telecommunications network, e.g., such that data generated by the movable object 102, the control unit 104, and/or the computing device 126 is transmitted to a server for data storage and/or data retrieval (e.g., for display by a website) .
  • data generated by the movable object 102, the control unit 104, and/or the computing device 126 is stored locally on each of the respective devices.
  • the movable object 102 comprises a sensing system (e.g., the movable object sensing system 122) that includes one or more sensors, as described further with reference to Figure 3.
  • the movable object 102 and/or the control unit 104 use sensing data generated by sensors of the sensing system 122 to determine information such as a position of the movable object 102, an orientation of the movable object 102, movement characteristics of the movable object 102 (e.g., an angular velocity, an angular acceleration, a translational velocity, a translational acceleration, and/or a direction of motion along one or more axes), a distance between the movable object 102 and a target object, proximity (e.g., distance) of the movable object 102 to potential obstacles, weather conditions, locations of geographical features, and/or locations of man-made structures.
  • Figure 2B illustrates an exemplary carrier 108 according to some implementations.
  • the carrier 108 couples the payload 110 to the movable object 102.
  • the carrier 108 includes a frame assembly having one or more frame members 202.
  • the frame member (s) 202 are coupled with the movable object 102 and the payload 110.
  • the frame member (s) 202 support the payload 110.
  • the carrier 108 includes one or more mechanisms, such as one or more actuators 204, to cause movement of the carrier 108 and/or the payload 110.
  • the actuator 204 is, e.g., a motor, such as a hydraulic, pneumatic, electric, thermal, magnetic, and/or mechanical motor.
  • the actuator 204 causes movement of the frame member (s) 202.
  • the actuator 204 rotates the payload 110 with respect to one or more axes, such as one or more of: an X axis ( “pitch axis” ) , a Z axis ( “roll axis” ) , and a Y axis ( “yaw axis” ) , relative to the movable object 102. In some implementations, the actuator 204 translates the payload 110 along one or more axes relative to the movable object 102.
  • the carrier 108 includes a carrier sensing system 206 for determining a state of the carrier 108 or the payload 110.
  • the carrier sensing system 206 includes one or more of: motion sensors (e.g., accelerometers) , rotation sensors (e.g., gyroscopes) , potentiometers, and/or inertial sensors.
  • the carrier sensing system 206 includes one or more sensors of the movable object sensing system 122 as described below with respect to Figure 3.
  • Sensor data determined by the carrier sensing system 206 may include spatial disposition (e.g., position, orientation, or attitude) , movement information such as velocity (e.g., linear or angular velocity) and/or acceleration (e.g., linear or angular acceleration) of the carrier 108 and/or the payload 110.
  • the sensing data, as well as state information calculated from the sensing data, are used as feedback data to control the movement of one or more components (e.g., the frame member (s) 202, the actuator 204, and/or the damping element 208) of the carrier 108.
  • the carrier sensing system 206 is coupled to the frame member (s) 202, the actuator 204, the damping element 208, and/or the payload 110.
  • a sensor in the carrier sensing system 206 may measure movement of the actuator 204 (e.g., the relative positions of a motor rotor and a motor stator) and generate a position signal representative of the movement of the actuator 204 (e.g., a position signal representative of relative positions of the motor rotor and the motor stator) .
  • data generated by the sensors is received by processor (s) 116 and/or memory 118 of the movable object 102.
  • the coupling between the carrier 108 and the movable object 102 includes one or more damping elements 208.
  • the damping element (s) 208 are configured to reduce or eliminate movement of the load (e.g., the payload 110 and/or the carrier 108) caused by movement of the movable object 102.
  • the damping element (s) 208 may include active damping elements, passive damping elements, and/or hybrid damping elements having both active and passive damping characteristics.
  • the motion damped by the damping element (s) 208 may include vibrations, oscillations, shaking, and/or impacts. Such motions may originate from motions of the movable object 102, which are transmitted to the payload 110.
  • the motion may include vibrations caused by the operation of a propulsion system and/or other components of the movable object 102.
  • the damping element (s) 208 provide motion damping by isolating the payload 110 from the source of unwanted motion, by dissipating or reducing the amount of motion transmitted to the payload 110 (e.g., vibration isolation) .
  • the damping element (s) 208 reduce a magnitude (e.g., an amplitude) of the motion that would otherwise be experienced by the payload 110.
  • the motion damping applied by the damping element (s) 208 is used to stabilize the payload 110, thereby improving the quality of video and/or images captured by the payload 110 (e.g., using the imaging device 214, Figure 2C) .
  • the improved video and/or image quality reduces the computational complexity of processing steps required to generate an edited video based on the captured video, or to generate a panoramic image based on the captured images.
  • the damping element (s) 208 may be manufactured using any suitable material or combination of materials, including solid, liquid, or gaseous materials.
  • the materials used for the damping element (s) 208 may be compressible and/or deformable.
  • the damping element (s) 208 may be made of sponge, foam, rubber, gel, and the like.
  • the damping element (s) 208 may include rubber balls that are substantially spherical in shape.
  • the damping element (s) 208 may be substantially spherical, rectangular, and/or cylindrical in shape.
  • the damping element (s) 208 may include piezoelectric materials or shape memory materials.
  • the damping element (s) 208 may include one or more mechanical elements, such as springs, pistons, hydraulics, pneumatics, dashpots, shock absorbers, and/or isolators.
  • properties of the damping element (s) 208 are selected so as to provide a predetermined amount of motion damping.
  • the damping element (s) 208 have viscoelastic properties.
  • the properties of damping element (s) 208 may be isotropic or anisotropic.
  • the damping element (s) 208 provide motion damping equally along all directions of motion.
  • the damping element (s) 208 provide motion damping only along a subset of the directions of motion (e.g., along a single direction of motion) .
  • the damping element (s) 208 may provide damping primarily along the Y (yaw) axis. In this manner, the illustrated damping element (s) 208 reduce vertical motions.
  • the carrier 108 further includes a controller 210.
  • the controller 210 may include one or more controllers and/or processors.
  • the controller 210 receives instructions from the processor (s) 116 of the movable object 102.
  • the controller 210 may be connected to the processor (s) 116 via the control bus 112.
  • the controller 210 may control movement of the actuator 204, adjust one or more parameters of the carrier sensing system 206, receive data from carrier sensing system 206, and/or transmit data to the processor (s) 116.
  • Figure 2C illustrates an exemplary payload 110 according to some implementations.
  • the payload 110 includes a payload sensing system 212 and a controller 218.
  • the payload sensing system 212 may include an imaging device 214 (e.g., a camera) having an image sensor 216 with a field of view.
  • the payload sensing system 212 includes one or more sensors of the movable object sensing system 122, as described below with respect to Figure 3.
  • the payload sensing system 212 generates static sensing data (e.g., a single image captured in response to a received instruction) and/or dynamic sensing data (e.g., a series of images captured at a periodic rate, such as a video) .
  • the image sensor 216 is, e.g., a sensor that detects light, such as visible light, infrared light, and/or ultraviolet light.
  • the image sensor 216 includes, e.g., semiconductor charge-coupled device (CCD) , active pixel sensors using complementary metal-oxide-semiconductor (CMOS) or N-type metal-oxide-semiconductor (NMOS, Live MOS) technologies, or any other types of sensors.
  • Adjustable parameters of imaging device 214 include, e.g., width, height, aspect ratio, pixel count, resolution, quality, imaging mode, focus distance, depth of field, exposure time, shutter speed and/or lens configuration.
  • the imaging device 214 may be configured to capture videos and/or images at different resolutions (e.g., low, medium, high, or ultra-high resolutions, and/or high-definition or ultra-high-definition videos such as 720p, 1080i, 1080p, 1440p, 2000p, 2160p, 2540p, 4000p, and 4320p).
  • the payload 110 includes the controller 218.
  • the controller 218 may include one or more controllers and/or processors.
  • the controller 218 receives instructions from the processor (s) 116 of the movable object 102.
  • the controller 218 is connected to the processor (s) 116 via the control bus 112.
  • the controller 218 may adjust one or more parameters of one or more sensors of the payload sensing system 212, receive data from one or more sensors of payload sensing system 212, and/or transmit data, such as image data from the image sensor 216, to the processor (s) 116, the memory 118, and/or the control unit 104.
  • data generated by one or more sensors of the payload sensor system 212 is stored, e.g., by the memory 118.
  • data generated by the payload sensor system 212 are transmitted to the control unit 104 (e.g., via communication system 120) .
  • video is streamed from the payload 110 (e.g., the imaging device 214) to the control unit 104.
  • the control unit 104 displays, e.g., real-time (or slightly delayed) video received from the imaging device 214.
  • an adjustment of the orientation, position, altitude, and/or one or more movement characteristics of the movable object 102, the carrier 108, and/or the payload 110 is generated (e.g., by the processor (s) 116) based at least in part on configurations (e.g., preset and/or user configured in system configuration 400, Figure 4) of the movable object 102, the carrier 108, and/or the payload 110.
  • an adjustment that involves a rotation with respect to two axes is achieved solely by corresponding rotation of the movable object 102 around the two axes if the payload 110, including the imaging device 214, is rigidly coupled to the movable object 102 (and hence not movable relative to the movable object 102) and/or the payload 110 is coupled to the movable object 102 via a carrier 108 that does not permit relative movement between the imaging device 214 and the movable object 102.
  • the same two-axis adjustment may be achieved by, e.g., combining adjustments of both the movable object 102 and the carrier 108 if the carrier 108 permits the imaging device 214 to rotate around at least one axis relative to the movable object 102.
  • the carrier 108 can be controlled to implement the rotation around one or two of the two axes required for the adjustment, and the movable object 102 can be controlled to implement the rotation around one or two of the two axes.
  • the carrier 108 may include a one-axis gimbal that allows the imaging device 214 to rotate around one of the two axes required for adjustment while the rotation around the remaining axis is achieved by the movable object 102.
  • the same two-axis adjustment is achieved by the carrier 108 alone when the carrier 108 permits the imaging device 214 to rotate around two or more axes relative to the movable object 102.
  • the carrier 108 may include a two-axis or three-axis gimbal that enables the imaging device 214 to rotate around two or all three axes.
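One simple way to think about splitting such an adjustment is to send the axes the gimbal can drive to the carrier and the remainder to the UAV body. The sketch below illustrates that allocation under assumed axis names; it is not the patent's control law.

```python
from typing import Dict, Set, Tuple

def split_adjustment(required: Dict[str, float],
                     gimbal_axes: Set[str]) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Split a required rotation (degrees per axis) between the carrier and the UAV.
    Axes the gimbal can drive are handled by the carrier; the rest are handled
    by rotating the movable object itself."""
    gimbal_cmd = {axis: angle for axis, angle in required.items() if axis in gimbal_axes}
    body_cmd = {axis: angle for axis, angle in required.items() if axis not in gimbal_axes}
    return gimbal_cmd, body_cmd

# Example: a one-axis (pitch) gimbal handles pitch; the UAV yaws to cover the rest.
gimbal, body = split_adjustment({"pitch": -10.0, "yaw": 25.0}, {"pitch"})
assert gimbal == {"pitch": -10.0} and body == {"yaw": 25.0}
```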
  • Figure 3 illustrates an exemplary sensing system 122 of a movable object 102 according to some implementations.
  • one or more sensors of the movable object sensing system 122 are mounted to an exterior, or located within, or otherwise coupled to the movable object 102.
  • one or more sensors of the movable object sensing system 122 are components of the carrier sensing system 206 and/or the payload sensing system 212. Where sensing operations are described as being performed by the movable object sensing system 122 herein, it will be recognized that such operations are optionally performed by the carrier sensing system 206 and/or the payload sensing system 212.
  • the movable object sensing system 122 generates static sensing data (e.g., a single image captured in response to a received instruction) and/or dynamic sensing data (e.g., a series of images captured at a periodic rate, such as a video) .
  • the movable object sensing system 122 includes one or more image sensors 302, such as image sensor 308 (e.g., a left stereographic image sensor) and/or image sensor 310 (e.g., a right stereographic image sensor) .
  • the image sensors 302 capture, e.g., images, image streams (e.g., videos) , stereographic images, and/or stereographic image streams (e.g., stereographic videos) .
  • the image sensors 302 detect light, such as visible light, infrared light, and/or ultraviolet light.
  • the movable object sensing system 122 includes one or more optical devices (e.g., lenses) to focus or otherwise alter the light onto the one or more image sensors 302.
  • the image sensors 302 include, e.g., semiconductor charge-coupled devices (CCD) , active pixel sensors using complementary metal-oxide-semiconductor (CMOS) or N-type metal-oxide-semiconductor (NMOS, Live MOS) technologies, or any other types of sensors.
  • the movable object sensing system 122 includes one or more audio transducers 304.
  • the audio transducers 304 may include an audio output transducer 312 (e.g., a speaker) , and an audio input transducer 314 (e.g. a microphone, such as a parabolic microphone) .
  • the audio output transducer 312 and the audio input transducer 314 are used as components of a sonar system for tracking a target object (e.g., detecting location information of a target object) .
  • the movable object sensing system 122 includes one or more infrared sensors 306.
  • a distance measurement system includes a pair of infrared sensors, e.g., an infrared sensor 316 (such as a left infrared sensor) and an infrared sensor 318 (such as a right infrared sensor), or another sensor or sensor pair. The distance measurement system is used for measuring a distance between the movable object 102 and the target object 106.
  • the movable object sensing system 122 may include other sensors for sensing a distance between the movable object 102 and the target object 106, such as a Radio Detection and Ranging (RADAR) sensor, a Light Detection and Ranging (LiDAR) sensor, or any other distance sensor.
  • a system to produce a depth map includes one or more sensors or sensor pairs of the movable object sensing system 122 (such as a left stereographic image sensor 308 and a right stereographic image sensor 310; an audio output transducer 312 and an audio input transducer 314; and/or a left infrared sensor 316 and a right infrared sensor 318).
  • a pair of sensors in a stereo data system (e.g., a stereographic imaging system) simultaneously captures data, and a depth map is generated by the stereo data system using the simultaneously captured data.
  • a depth map is used for positioning and/or detection operations, such as detecting a target object 106, and/or detecting current location information of a target object 106.
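A depth map from a calibrated stereo pair commonly follows the pinhole relationship depth = focal length x baseline / disparity. The sketch below illustrates that standard relationship with placeholder camera parameters; the disclosure does not give specific values.

```python
def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Pinhole stereo model: depth (meters) from disparity (pixels)."""
    if disparity_px <= 0:
        return float("inf")   # no measurable disparity -> treat as far away
    return focal_px * baseline_m / disparity_px

def depth_map(disparities, focal_px=700.0, baseline_m=0.12):
    """Convert a 2-D grid of per-pixel disparities into a depth map (placeholder calibration)."""
    return [[depth_from_disparity(d, focal_px, baseline_m) for d in row] for row in disparities]

# Example: a 35-pixel disparity with a 0.12 m baseline and 700 px focal length is ~2.4 m away.
print(round(depth_from_disparity(35.0, 700.0, 0.12), 2))
```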
  • the movable object sensing system 122 includes one or more global positioning system (GPS) sensors, motion sensors (e.g., accelerometers) , rotation sensors (e.g., gyroscopes) , inertial sensors, proximity sensors (e.g., infrared sensors) and/or weather sensors (e.g., pressure sensor, temperature sensor, moisture sensor, and/or wind sensor) .
  • sensing data generated by one or more sensors of the movable object sensing system 122 and/or information determined using sensing data from one or more sensors of the movable object sensing system 122 are transmitted to the control unit 104 (e.g., via the communication system 120) .
  • data generated by one or more sensors of the movable object sensing system 122 and/or information determined using sensing data from one or more sensors of the movable object sensing system 122 is stored by the memory 118.
  • Figure 4 is a block diagram illustrating an exemplary memory 118 of a movable object 102 according to some implementations.
  • one or more elements illustrated in Figure 4 may be located in the control unit 104, the computing device 126, and/or another device.
  • the memory 118 stores a system configuration 400.
  • the system configuration 400 includes one or more system settings (e.g., as configured by a manufacturer, administrator, and/or user of the movable object 102) .
  • a constraint on one or more of orientation, position, attitude, and/or one or more movement characteristics of the movable object 102, the carrier 108, and/or the payload 110 is stored as a system setting of the system configuration 400.
  • the memory 118 stores a motion control module 402.
  • the motion control module 402 stores control instructions that are received from the control unit 104 and/or the computing device 126. The control instructions are used for controlling operation of the movement mechanisms 114, the carrier 108, and/or the payload 110.
  • memory 118 stores a tracking module 404.
  • the tracking module 404 generates tracking information for a target object 106 that is being tracked by the movable object 102.
  • the tracking information is generated based on images captured by the imaging device 214 and/or based on output from a video analysis module 406 (e.g., after pre-processing and/or processing operations have been performed on one or more images).
  • the tracking information may be generated based on analysis of gestures of a human target, which are captured by the imaging device 214 and/or analyzed by a gesture analysis module 403.
  • the tracking information generated by the tracking module 404 may include a location, a size, and/or other characteristics of the target object 106 within one or more images.
  • the tracking information generated by the tracking module 404 is transmitted to the control unit 104 and/or the computing device 126 (e.g., augmenting or otherwise combined with images and/or output from the video analysis module 406) .
  • the tracking information may be transmitted to the control unit 104 in response to a request from the control unit 104 and/or on a periodic basis (e.g., every 2 seconds, 5 seconds, 10 seconds, or 30 seconds) .
  • the memory 118 includes a video analysis module 406.
  • the video analysis module 406 performs processing operations on videos and images, such as videos and images captured by the imaging device 214.
  • the video analysis module 406 performs pre-processing on raw video and/or image data, such as re-sampling to assure the correctness of the image coordinate system, noise reduction, contrast enhancement, and/or scale space representation.
  • the processing operations performed on video and image data include feature extraction, image segmentation, data verification, image recognition, image registration, and/or image matching.
  • the output from the video analysis module 406 (e.g., after the pre-processing and/or processing operations have been performed) is transmitted to the control unit 104 and/or the computing device 126.
  • feature extraction is performed by the control unit 104, the processor (s) 116 of the movable object 102, and/or the computing device 126.
  • the video analysis module 406 may use neural networks to perform image recognition and/or classify object (s) that are included in the videos and/or images.
  • the video analysis module 406 may extract frames that include the target object 106, analyze features of the target object 106, and compare the features with characteristics of one or more predetermined recognizable target object types, thereby enabling the target object 106 to be recognized at a certain confidence level.
  • the memory 118 includes a gesture analysis module 403.
  • the gesture analysis module 403 processes gestures of one or more human targets.
  • the gestures may be captured by the imaging device 214.
  • the gesture analysis results may be fed into the tracking module 404 and/or the motion control module 402 to generate, respectively, tracking information and/or control instructions that are used for controlling operations of the movement mechanisms 114, the carrier 108, and/or the payload 110 of the movable object 102.
  • a calibration process may be performed before using gestures of a human target to control the movable object 102.
  • the gesture analysis module 403 may capture certain features of human gestures associated with a certain control command and store the gesture features in the memory 118. When a human gesture is received, the gesture analysis module 403 may extract features of the human gesture and compare these features with the stored features to determine whether the certain command should be performed.
  • the correlations between gestures and control commands associated with a certain human target may or may not be different from such correlations associated with another human target.
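A minimal sketch of the calibrate-then-match flow described above, using a Euclidean distance threshold over assumed feature vectors; the actual feature representation and matching rule are not specified in the disclosure.

```python
import math
from typing import Dict, List, Optional

class GestureMatcher:
    """Stores calibrated gesture features per command and matches new gestures to them."""

    def __init__(self, threshold: float = 0.5):
        self.calibrated: Dict[str, List[float]] = {}
        self.threshold = threshold

    def calibrate(self, command: str, features: List[float]) -> None:
        # Calibration step: remember the feature vector associated with a control command.
        self.calibrated[command] = features

    def match(self, features: List[float]) -> Optional[str]:
        # Return the command whose stored features are closest, if within the threshold.
        best, best_dist = None, float("inf")
        for command, stored in self.calibrated.items():
            dist = math.dist(stored, features)
            if dist < best_dist:
                best, best_dist = command, dist
        return best if best_dist <= self.threshold else None

matcher = GestureMatcher()
matcher.calibrate("take_photo", [0.9, 0.1, 0.0])
print(matcher.match([0.85, 0.15, 0.05]))   # close enough -> "take_photo"
```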
  • the memory 118 also includes a spatial relationship determination module 405.
  • the spatial relationship determination module 405 calculates one or more spatial relationships between the target object 106 and the movable object 102, such as a horizontal distance between the target object 106 and the movable object 102, and/or a pitch angle between the target object 106 and the movable object 102.
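A minimal sketch of the two spatial relationships named above, computed from relative positions in an assumed local east/north/up frame (the coordinate convention is an assumption):

```python
import math

def spatial_relationship(uav_pos, target_pos):
    """Return (horizontal_distance_m, pitch_angle_deg) from the UAV to the target.
    Positions are (east_m, north_m, altitude_m) in a shared local frame."""
    de = target_pos[0] - uav_pos[0]
    dn = target_pos[1] - uav_pos[1]
    dz = target_pos[2] - uav_pos[2]          # negative when the target is below the UAV
    horizontal = math.hypot(de, dn)
    pitch_deg = math.degrees(math.atan2(dz, horizontal))
    return horizontal, pitch_deg

# Example: a target 40 m away horizontally and 30 m below gives a pitch of about -36.9 degrees.
print(spatial_relationship((0, 0, 30), (40, 0, 0)))
```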
  • the memory 118 stores target information 408.
  • the target information 408 is received by the movable object 102 (e.g., via communication system 120) from the control unit 104, the computing device 126, the target object 106, and/or another movable object.
  • the target information 408 includes a time value (e.g., a time duration) and/or an expiration time indicating a period of time during which the target object 106 is to be tracked.
  • the target information 408 includes a flag (e.g., a label) indicating whether a target information entry includes specific tracked target information 412 and/or target type information 410.
  • the target information 408 includes target type information 410 such as color, texture, pattern, size, shape, and/or dimension.
  • the target type information 410 includes, but is not limited to, a predetermined recognizable object type and a general object type as identified by the video analysis module 406.
  • the target type information 410 includes features or characteristics for each type of target and is preset and stored in the memory 118.
  • the target type information 410 is provided to a user input device (e.g., the control unit 104) via user input.
  • the user may select a pre-existing target pattern or type (e.g., an object or a round object with a radius greater or less than a certain value) .
  • the target information 408 includes tracked target information 412 for a specific target object 106 being tracked.
  • the target information 408 may be identified by the video analysis module 406 by analyzing the target in a captured image.
  • the tracked target information 412 includes, e.g., an image of the target object 106, an initial position (e.g., location coordinates, such as pixel coordinates within an image) of the target object 106, and/or a size of the target object 106 within one or more images (e.g., images captured by the imaging device 214) .
  • a size of the target object 106 is stored, e.g., as a length (e.g., mm or other length unit) , an area (e.g., mm2 or other area unit) , a number of pixels in a line (e.g., indicating a length, width, and/or diameter) , a ratio of a length of a representation of the target in an image relative to a total image length (e.g., a percentage) , a ratio of an area of a representation of the target in an image relative to a total image area (e.g., a percentage) , a number of pixels indicating an area of target object 106, and/or a corresponding spatial relationship (e.g., a vertical distance and/or a horizontal distance) between the target object 106 and the movable object 102 (e.g., an area of the target object 106 changes based on a distance of the target object 106 from the movable object 102) .
  • one or more features (e.g., characteristics) of the target object 106 are determined from an image of the target object 106 (e.g., using image analysis techniques on images captured by the imaging device 214).
  • one or more features of the target object 106 are determined from an orientation and/or part or all of identified boundaries of the target object 106.
  • the tracked target information 412 includes pixel coordinates and/or a number of pixel counts to indicate, e.g., a size parameter, position, and/or shape of the target object 106.
  • one or more features of the tracked target information 412 are to be maintained as the movable object 102 tracks the target object 106 (e.g., the tracked target information 412 is to be maintained as images of the target object 106 are captured by the imaging device 214).
  • the tracked target information 412 is used to adjust the movable object 102, the carrier 108, and/or the imaging device 214, such that specific features of the target object 106 are substantially maintained.
  • the tracked target information 412 is determined based on one or more of the target types 410.
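To make the bookkeeping described above more concrete, the following Python sketch shows one way the tracked target information 412 could be represented in software. The class, field names, units, and tolerance check are illustrative assumptions and are not part of the disclosure.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class TrackedTargetInfo:
    """Hypothetical container for the tracked target information 412."""
    initial_position_px: Tuple[int, int]            # pixel coordinates of the target in an image
    area_ratio: float                               # target area as a fraction of the total image area
    length_mm: Optional[float] = None               # optional physical length estimate
    horizontal_distance_m: Optional[float] = None   # horizontal distance to the movable object
    vertical_distance_m: Optional[float] = None     # vertical distance to the movable object

    def is_maintained(self, current_area_ratio: float, tolerance: float = 0.1) -> bool:
        """Return True if the target's apparent size stays within tolerance; a
        tracking controller could use this to decide whether the movable object,
        carrier, or imaging device needs adjustment."""
        if self.area_ratio == 0.0:
            return False
        return abs(current_area_ratio - self.area_ratio) / self.area_ratio <= tolerance

# Example: a target initially occupying 4% of the frame, now occupying 3%.
info = TrackedTargetInfo(initial_position_px=(640, 360), area_ratio=0.04)
print(info.is_maintained(0.03))  # -> False (apparent size has drifted by 25%)
```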
  • the memory 118 also includes predetermined recognizable target type information 414.
  • the predetermined recognizable target type information 414 specifies one or more characteristics of certain predetermined recognizable target types (e.g., target type 1, target type 2, ..., target type n) .
  • Each predetermined recognizable target type may include one or more characteristics such as a size parameter (e.g., area, diameter, height, length and/or width) , position (e.g., relative to an image center and/or image boundary) , movement (e.g., speed, acceleration, altitude) and/or shape.
  • target type 1 may be a human target.
  • One or more characteristics associated with a human target may include a height in a range from about 1.4 meters to about 2 meters, a pattern comprising a head, a torso, and limbs, and/or a moving speed having a range from about 2 kilometers/hour to about 25 kilometers/hour.
  • target type 2 may be a car target.
• One or more characteristics associated with a car target may include a height in a range from about 1.4 meters to about 4.5 meters, a length in a range from about 3 meters to about 10 meters, a moving speed in a range from about 5 kilometers/hour to about 140 kilometers/hour, and/or a pattern of a sedan, an SUV, a truck, or a bus.
  • target type 3 may be a ship target.
• predetermined recognizable target types may also include: an airplane target, an animal target, other moving targets, and stationary (e.g., non-moving) targets such as a building and a statue.
  • Each predetermined target type may further include one or more subtypes, each of the subtypes having more specific characteristics thereby providing more accurate target classification results.
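The classification against predetermined recognizable target types described above can be illustrated with a minimal sketch that checks measured characteristics against the example ranges quoted for the human and car target types. The dictionary layout, function, and fallback label are assumptions.

```python
# Illustrative sketch of the predetermined recognizable target type information (414),
# using the example ranges quoted above. Not the disclosed data format.
PREDETERMINED_TARGET_TYPES = {
    "human": {"height_m": (1.4, 2.0), "speed_kmh": (2.0, 25.0)},
    "car":   {"height_m": (1.4, 4.5), "length_m": (3.0, 10.0), "speed_kmh": (5.0, 140.0)},
}

def classify_target(measurements: dict) -> str:
    """Return the first predetermined target type whose characteristic ranges
    all contain the measured values; fall back to a general object type."""
    for target_type, ranges in PREDETERMINED_TARGET_TYPES.items():
        matches = True
        for name, (low, high) in ranges.items():
            value = measurements.get(name)
            if value is None or not (low <= value <= high):
                matches = False
                break
        if matches:
            return target_type
    return "general"

# Example: an object 1.7 m tall moving at 5 km/h matches the human target type.
print(classify_target({"height_m": 1.7, "speed_kmh": 5.0}))  # -> "human"
```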
  • the target information 408 (including, e.g., the target type information 410 and the tracked target information 412) , and/or predetermined recognizable target information 414 is generated based on user input, such as a user input received at user input device 506 ( Figure 5) of the control unit 104. Additionally or alternatively, the target information 408 may be generated based on data from sources other than the control unit 104.
  • the target type information 410 may be based on previously stored images of the target object 106 (e.g., images captured by the imaging device 214 and stored by the memory 118) , other data stored by the memory 118, and/or data from data stores that are remote from the control unit 104 and/or the movable object 102.
• the target type information 410 is generated using a computer-generated image of the target object 106.
  • the target information 408 is used by the movable object 102 (e.g., the tracking module 404) to track the target object 106.
  • the target information 408 is used by a video analysis module 406 to identify and/or classify the target object 106.
  • target identification involves image recognition and/or matching algorithms based on, e.g., CAD-like object models, appearance-based methods, feature-based methods, and/or genetic algorithms.
  • target identification includes comparing two or more images to determine, extract, and/or match features contained therein.
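The feature-based comparison of two images mentioned above could, for example, be realized with a generic keypoint matcher. The sketch below uses OpenCV's ORB detector purely as an illustration; the disclosure does not specify a particular library or matching algorithm.

```python
import cv2  # OpenCV, used here only as an illustrative feature-matching backend

def match_features(image_a, image_b, max_matches: int = 50):
    """Detect ORB keypoints in two grayscale images and return the best matches,
    a rough stand-in for the feature-based identification step."""
    orb = cv2.ORB_create()
    kp_a, des_a = orb.detectAndCompute(image_a, None)
    kp_b, des_b = orb.detectAndCompute(image_b, None)
    if des_a is None or des_b is None:
        return []  # no detectable features in one of the images
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
    return matches[:max_matches]
```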
• the memory 118 also includes flight routes 416 (e.g., predefined flight routes) of the movable object 102, such as a portrait flight route 418, a long range flight route 420, and a normal flight route 422, as will be discussed below with respect to, e.g., Figures 8A-8B.
  • each of the flight routes 416 includes one or more paths, each of the one or more paths having a corresponding trajectory mode.
• the movable object 102 automatically selects one of the predefined flight routes according to a target type of the target object 106 and a distance between the movable object 102 and the target object 106.
• after automatically selecting a flight route 416, the movable object 102 further performs an automatic customization of the flight route, taking into consideration factors such as an updated distance between the movable object 102 and the target object 106, the presence of potential obstacle (s) and/or other structures (e.g., buildings and trees) , or weather conditions.
  • customization of the flight route includes modifying a rate of ascent of the movable object 102, an initial velocity of the movable object 102, and/or an acceleration of the movable object 102.
  • the customization is provided in part by a user.
  • the movable object 102 may cause the computing device 126 to display a library of trajectories that can be selected by the user.
  • the movable object 102 then automatically generates the paths of the flight route based on the user selections.
  • the flight routes 416 also include user defined flight route (s) 424, which are routes that are defined by the user.
• the memory 118 stores data 426 that are captured by the image sensor 216 during an autonomous flight, including video data 428 and image (s) 430.
  • the data 426 also includes audio data 432 that are captured by a microphone of the movable object 102 (e.g., the audio input transducer 314) .
• the data 426 is simultaneously stored on the movable object 102 as it is being captured.
• the memory 118 further stores metadata information along with the data 426.
  • the video data 428 may include tag information (e.g., metadata) associated with the flight path and trajectory mode corresponding to a respective segment of the video data 428.
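As an illustration of the kind of tag information that might accompany each video segment, the following sketch defines a hypothetical metadata record; the field names and values are assumptions, not the disclosed format.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class SegmentTag:
    """Hypothetical metadata stored with a segment of the video data (428)."""
    flight_route: str       # e.g., "normal", "portrait", "long_range"
    flight_path_index: int  # which path of the route produced this segment
    trajectory_mode: str    # e.g., "downward_spiral", "discovery_approach"
    start_time_s: float     # segment start time within the full recording
    end_time_s: float       # segment end time within the full recording

# Example: tag the fifth path of a normal route as a downward spiral.
tag = SegmentTag("normal", 5, "downward_spiral", 120.0, 150.0)
print(json.dumps(asdict(tag)))
```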
  • the memory 118 may store a subset of the modules and data structures identified above. Furthermore, the memory 118 may store additional modules and data structures not described above.
  • the programs, modules, and data structures stored in the memory 118, or a non-transitory computer readable storage medium of the memory 118 provide instructions for implementing respective operations in the methods described below. In some implementations, some or all of these modules may be implemented with specialized hardware circuits that subsume part or all of the module functionality.
  • One or more of the above identified elements may be executed by the one or more processors 116 of the movable object 102. In some implementations, one or more of the above identified elements is executed by one or more processors of a device remote from the movable object 102, such as the control unit 104 and/or the computing device 126.
  • Figure 5 illustrates an exemplary control unit 104 of target tracking system 100, in accordance with some implementations.
  • the control unit 104 communicates with the movable object 102 via the communication system 120, e.g., to provide control instructions to the movable object 102.
• although the control unit 104 is typically a portable (e.g., handheld) device, the control unit 104 need not be portable.
  • the control unit 104 is a dedicated control device (e.g., dedicated to operation of movable object 102) , a laptop computer, a desktop computer, a tablet computer, a gaming system, a wearable device (e.g., watches, glasses, gloves, and/or helmet) , a microphone, and/or a combination thereof.
• the control unit 104 typically includes one or more processor (s) 502, a communication system 510 (e.g., including one or more network or other communications interfaces) , memory 504, one or more input/output (I/O) interfaces (e.g., an input device 506 and/or a display 508) , and one or more communication buses 512 for interconnecting these components.
  • the input device 506 and/or the display 508 comprises a touchscreen display.
  • the touchscreen display optionally uses LCD (liquid crystal display) technology, LPD (light emitting polymer display) technology, or LED (light emitting diode) technology, although other display technologies are used in other implementations.
  • the touchscreen display and the processor (s) 502 optionally detect contact and any movement or breaking thereof using any of a plurality of touch sensing technologies now known or later developed, including but not limited to capacitive, resistive, infrared, and surface acoustic wave technologies, as well as other proximity sensor arrays or other elements for determining one or more points of contact with the touchscreen display.
  • the input device 506 includes one or more: joysticks, switches, knobs, slide switches, buttons, dials, keypads, keyboards, mice, audio transducers (e.g., microphones for voice control systems) , motion sensors, and/or gesture controls.
  • an I/O interface of the control unit 104 includes sensors (e.g., GPS sensors, and/or accelerometers) , audio output transducers (e.g., speakers) , and/or one or more tactile output generators for generating tactile outputs.
  • the input device 506 receives user input to control aspects of the movable object 102, the carrier 108, the payload 110, or a component thereof. Such aspects include, e.g., attitude (e.g., aviation) , position, orientation, velocity, acceleration, navigation, and/or tracking.
  • the input device 506 is manually set to one or more positions by a user. Each of the positions may correspond to a predetermined input for controlling the movable object 102.
  • the input device 506 is manipulated by a user to input control instructions for controlling the navigation of the movable object 102.
  • the input device 506 is used to input a flight mode for the movable object 102, such as auto pilot or navigation according to a predetermined navigation path.
  • the input device 506 is used to input a target tracking mode for the movable object 102, such as a manual tracking mode or an automatic tracking mode.
  • the user controls the movable object 102, e.g., the position, attitude, and/or orientation of the movable object 102, by changing a position of the control unit 104 (e.g., by tilting or otherwise moving the control unit 104) .
• a change in a position of the control unit 104 may be detected by one or more inertial sensors, and output of the one or more inertial sensors may be used to generate command data.
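One plausible way to turn the inertial-sensor output of a tilted control unit into command data is a proportional mapping from tilt angle to commanded velocity, as sketched below. The gains, limits, and axis conventions are illustrative assumptions.

```python
def tilt_to_velocity_command(pitch_deg: float, roll_deg: float,
                             gain_mps_per_deg: float = 0.2,
                             max_speed_mps: float = 5.0) -> dict:
    """Map control-unit pitch/roll (from its inertial sensors) to forward and
    lateral velocity commands for the movable object (assumed convention)."""
    clamp = lambda v: max(-max_speed_mps, min(max_speed_mps, v))
    forward = clamp(pitch_deg * gain_mps_per_deg)  # tilt forward -> fly forward
    lateral = clamp(roll_deg * gain_mps_per_deg)   # tilt right   -> fly right
    return {"forward_mps": forward, "lateral_mps": lateral}

print(tilt_to_velocity_command(10.0, -5.0))  # {'forward_mps': 2.0, 'lateral_mps': -1.0}
```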
  • the input device 506 is used to adjust an operational parameter of the payload, such as a parameter of the payload sensing system 212 (e.g., to adjust a zoom parameter of the imaging device 214) and/or a position of the payload 110 relative to the carrier 108 and/or the movable object 102.
• the input device 506 is used to indicate information about the target object 106, e.g., to select a target object 106 to track and/or to indicate the target type information 410.
  • the input device 506 is used for interaction with augmented image data.
  • an image displayed by the display 508 includes representations of one or more target objects 106.
  • representations of the one or more target objects 106 are augmented to indicate identified objects for potential tracking and/or a target object 106 that is currently being tracked. Augmentation includes, for example, a graphical tracking indicator (e.g., a box) adjacent to or surrounding a respective target object 106.
  • the input device 506 is used to select a target object 106 to track or to change the target object being tracked.
  • a target object 106 is selected when an area corresponding to a representation of the target object 106 is selected by e.g., a finger, stylus, mouse, joystick, or other component of the input device 506.
  • the specific target information 412 is generated when a user selects a target object 106 to track.
  • the control unit 104 may also be configured to allow a user to enter target information using any suitable method.
  • the input device 506 receives a selection of a target object 106 from one or more images (e.g., video or snapshot) displayed by the display 508.
  • the input device 506 receives input including a selection performed by a gesture around the target object 106 and/or a contact at a location corresponding to the target object 106 in an image.
  • computer vision or other techniques are used to determine a boundary of the target object 106.
  • input received at the input device 506 defines a boundary of the target object 106.
  • multiple targets are simultaneously selected.
  • a selected target is displayed with a selection indicator (e.g., a bounding box) to indicate that the target is selected for tracking.
  • the input device 506 receives input indicating information such as color, texture, shape, dimension, and/or other characteristics associated with a target object 106.
  • the input device 506 includes a keyboard to receive typed input indicating the target information 408.
  • the control unit 104 provides an interface that enables a user to select (e.g., using the input device 506) between a manual tracking mode and an automatic tracking mode.
  • the interface enables the user to select a target object 106 to track.
  • a user is enabled to manually select a representation of a target object 106 from an image displayed by the display 508 of the control unit 104.
  • Specific target information 412 associated with the selected target object 106 is transmitted to the movable object 102, e.g., as initial expected target information.
  • the input device 506 receives target type information 410 from a user input.
  • the movable object 102 uses the target type information 410, e.g., to automatically identify the target object 106 to be tracked and/or to track the identified target object 106.
  • manual tracking requires more user control of the tracking of the target and less automated processing or computation (e.g., image or target recognition) by the processor (s) 116 of the movable object 102, while automatic tracking requires less user control of the tracking process but more computation performed by the processor (s) 116 of the movable object 102 (e.g., by the video analysis module 406) .
  • allocation of control over the tracking process between the user and the onboard processing system is adjusted, e.g., depending on factors such as the surroundings of movable object 102, motion of the movable object 102, altitude of the movable object 102, the system configuration 400 (e.g., user preferences) , and/or available computing resources (e.g., CPU or memory) of the movable object 102, the control unit 104, and/or the computing device 126.
• relatively more control is allocated to the user when the movable object 102 is navigating in a relatively complex environment (e.g., with numerous buildings or obstacles, or indoors) than when the movable object 102 is navigating in a relatively simple environment (e.g., a wide open space or outdoors) .
• relatively more control is allocated to the user when the movable object 102 is at a lower altitude than when the movable object 102 is at a higher altitude.
  • more control is allocated to the movable object 102 if the movable object 102 is equipped with a high-speed processor adapted to perform complex computations relatively quickly.
  • the allocation of control over the tracking process between the user and the movable object 102 is dynamically adjusted based on one or more of the factors described herein.
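The dynamic allocation of tracking control described above could be approximated by a simple weighting over the listed factors. The sketch below is purely illustrative; the specific factors, weights, and thresholds are assumptions rather than the disclosed method.

```python
def user_control_share(environment_complexity: float, altitude_m: float,
                       onboard_cpu_score: float) -> float:
    """Return a value in [0, 1]: the fraction of tracking control allocated to
    the user, the remainder being handled by onboard processing. Inputs are
    normalized scores in [0, 1], except altitude (meters)."""
    share = 0.5
    share += 0.3 * environment_complexity         # complex surroundings -> more user control
    share -= 0.2 * min(altitude_m / 100.0, 1.0)   # higher altitude -> more automation
    share -= 0.2 * onboard_cpu_score              # fast onboard processor -> more automation
    return max(0.0, min(1.0, share))

# Dense urban scene, low altitude, modest onboard processor:
print(round(user_control_share(0.9, 20.0, 0.3), 2))  # -> 0.67
```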
  • the control unit 104 includes an electronic device (e.g., a portable electronic device) and an input device 506 that is a peripheral device that is communicatively coupled (e.g., via a wireless and/or wired connection) and/or mechanically coupled to the electronic device.
  • the control unit 104 includes a portable electronic device (e.g., a cellphone or a smart phone) and a remote control device (e.g., a standard remote control with a joystick) coupled to the portable electronic device.
  • the display device 508 displays information about the movable object 102, the carrier 108, and/or the payload 110, such as position, attitude, orientation, movement characteristics of the movable object 102, and/or distance between the movable object 102 and another object (e.g., the target object 106 and/or an obstacle) .
  • information displayed by the display device 508 includes images captured by the imaging device 214, tracking data (e.g., a graphical tracking indicator applied to a representation of the target object 106, such as a box or other shape around the target object 106 shown to indicate that target object 106 is currently being tracked) , and/or indications of control data transmitted to the movable object 102.
  • the images including the representation of the target object 106 and the graphical tracking indicator are displayed in substantially real-time as the image data and tracking information are received from the movable object 102 and/or as the image data is acquired.
  • the communication system 510 enables communication with the communication system 120 of the movable object 102, the communication system 610 ( Figure 6) of the computing device 126, and/or a base station (e.g., computing device 126) via a wired or wireless communication connection.
  • the communication system 510 transmits control instructions (e.g., navigation control instructions, target information, and/or tracking instructions) .
  • the communication system 510 receives data (e.g., tracking data from the payload imaging device 214, and/or data from movable object sensing system 122) .
  • the control unit 104 receives tracking data (e.g., via the wireless communications 124) from the movable object 102.
  • Tracking data is used by the control unit 104 to, e.g., display the target object 106 as the target is being tracked.
  • data received by the control unit 104 includes raw data (e.g., raw sensing data as acquired by one or more sensors) and/or processed data (e.g., raw data as processed by, e.g., the tracking module 404) .
  • the memory 504 stores instructions for generating control instructions automatically and/or based on input received via the input device 506.
  • the control instructions may include control instructions for operating the movement mechanisms 114 of the movable object 102 (e.g., to adjust the position, attitude, orientation, and/or movement characteristics of the movable object 102, such as by providing control instructions to the actuators 132) .
  • the control instructions adjust movement of the movable object 102 with up to six degrees of freedom.
  • the control instructions are generated to initialize and/or maintain tracking of the target object 106.
  • control instructions include instructions for adjusting the carrier 108 (e.g., instructions for adjusting the damping element 208, the actuator 204, and/or one or more sensors of the carrier sensing system 206) .
  • control instructions include instructions for adjusting the payload 110 (e.g., instructions for adjusting one or more sensors of the payload sensing system 212) .
• control instructions include control instructions for adjusting the operations of one or more sensors of the movable object sensing system 122.
  • the memory 504 also stores instructions for performing image recognition, target classification, spatial relationship determination, and/or gesture analysis that are similar to the corresponding functionalities discussed with reference to Figure 4.
  • the memory 504 may also store target information, such as tracked target information and/or predetermined recognizable target type information, as discussed in Figure 4.
• the input device 506 receives user input to control one aspect of the movable object 102 (e.g., the zoom of the imaging device 214) while a control application generates the control instructions for adjusting another aspect of the movable object 102 (e.g., to control one or more movement characteristics of the movable object 102) .
  • the control application includes, e.g., control module 402, tracking module 404 and/or a control application of control unit 104 and/or computing device 126.
• the input device 506 receives user input to control one or more movement characteristics of the movable object 102 while the control application generates the control instructions for adjusting a parameter of the imaging device 214. In this manner, a user is enabled to focus on controlling the navigation of the movable object 102 without having to provide input for tracking the target (e.g., tracking is performed automatically by the control application) .
  • allocation of tracking control between user input received at the input device 506 and the control application varies depending on factors such as, e.g., surroundings of the movable object 102, motion of the movable object 102, altitude of the movable object 102, system configuration (e.g., user preferences) , and/or available computing resources (e.g., CPU or memory) of the movable object 102, the control unit 104, and/or the computing device 126.
• relatively more control is allocated to the user when the movable object 102 is navigating in a relatively complex environment (e.g., with numerous buildings or obstacles, or indoors) than when the movable object 102 is navigating in a relatively simple environment (e.g., a wide open space or outdoors) .
• relatively more control is allocated to the user when the movable object 102 is at a lower altitude than when the movable object 102 is at a higher altitude.
• more control is allocated to the movable object 102 if the movable object 102 is equipped with a high-speed processor adapted to perform complex computations relatively quickly.
  • the allocation of control over the tracking process between the user and the movable object is dynamically adjusted based on one or more of the factors described herein.
  • FIG. 6 illustrates an exemplary computing device 126 for controlling movable object 102 according to some implementations.
  • the computing device 126 may be a server computer, a laptop computer, a desktop computer, a tablet, or a phone.
  • the computing device 126 typically includes one or more processor (s) 602 (e.g., processing units) , memory 604, a communication system 610 and one or more communication buses 612 for interconnecting these components.
  • the computing device 126 includes input/output (I/O) interfaces 606, such as a display 614 and/or an input device 616.
  • the computing device 126 is a base station that communicates (e.g., wirelessly) with the movable object 102 and/or the control unit 104.
  • the computing device 126 provides data storage, data retrieval, and/or data processing operations, e.g., to reduce the processing power and/or data storage requirements of movable object 102 and/or control unit 104.
• the computing device 126 is communicatively connected to a database (e.g., via the communication system 610) and/or the computing device 126 includes a database (e.g., a database connected to the communication bus 612) .
  • the communication system 610 includes one or more network or other communications interfaces.
  • the computing device 126 receives data from the movable object 102 (e.g., from one or more sensors of the movable object sensing system 122) and/or the control unit 104.
  • the computing device 126 transmits data to the movable object 102 and/or the control unit 104.
• the computing device 126 provides control instructions to the movable object 102.
  • the memory 604 stores instructions for performing image recognition, target classification, spatial relationship determination, and/or gesture analysis that are similar to the corresponding functionalities discussed with respect to Figure 4.
• the memory 604 may also store target information, such as the tracked target information 412 and/or the predetermined recognizable target type information 414 as discussed in Figure 4.
  • the memory 604 or a non-transitory computer-readable storage medium of the memory 604 stores an application 620, which enables interactions with and control over the movable object 102, and which enables data (e.g., audio, video and/or image data) captured by the movable object 102 to be displayed, downloaded, and/or post-processed.
• the application 620 may include a user interface 630, which enables interactions between a user of the computing device 126 and the movable object 102.
• the application 620 may include a video editing module 640, which enables a user of the computing device 126 to edit videos and/or images that have been captured by the movable object 102 during a flight associated with a target object 106, e.g., captured using the image sensor 216.
  • the memory 604 also stores templates 650, which may be used for generating edited videos.
• the memory 604 also stores data 660 that have been captured by the movable object 102 during one or more flights associated with a target object 106, including videos captured during those flights.
  • the data 660 may be organized according to flights 661 (e.g., for each flight route) by the movable object 102.
• the data for each of the flights 661 may include video data 662, images 663, and/or audio data 664.
• the memory 604 further stores tag information 666 (e.g., metadata information) along with the video data 662, the images 663, and the audio data 664.
  • the video data 662-1 corresponding to flight 1 661-1 may include tag information (e.g., metadata) associated with the flight path and trajectory mode corresponding to the flight 661-1.
  • the memory 604 also stores a web browser 670 (or other application capable of displaying web pages) , which enables a user to communicate over a network with remote computers or devices.
  • Each of the above identified executable modules, applications, or sets of procedures may be stored in one or more of the memory devices, and corresponds to a set of instructions for performing a function described above.
• the above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, or modules, and thus various subsets of these modules may be combined or otherwise rearranged in various implementations.
  • the memory 604 stores a subset of the modules and data structures identified above.
  • the memory 604 may store additional modules or data structures not described above.
  • Figure 7 illustrates an exemplary configuration 700 of a movable object 102, a carrier 108, and a payload 110 according to some implementations.
  • the configuration 700 is used to illustrate exemplary adjustments to an orientation, position, attitude, and/or one or more movement characteristics of the movable object 102, the carrier 108, and/or the payload 110, e.g., as used to perform initialization of target tracking and/or to track a target object 106.
  • the movable object 102 rotates around up to three orthogonal axes, such as X1 (pitch) 710, Y1 (yaw) 708 and Z1 (roll) 712 axes.
  • the rotations around the three axes are referred to herein as a pitch rotation 722, a yaw rotation 720, and a roll rotation 724, respectively.
• Angular velocities of the movable object 102 around the X1, Y1, and Z1 axes are referred to herein as ωX1, ωY1, and ωZ1, respectively.
• the movable object 102 engages in translational movements 728, 726, and 730 along the X1, Y1, and Z1 axes, respectively.
  • Linear velocities of the movable object 102 along the X1, Y1, and Z1 axes are referred to herein as VX1, VY1, and VZ1, respectively.
  • the payload 110 is coupled to the movable object 102 via the carrier 108. In some implementations, the payload 110 moves relative to the movable object 102 (e.g., the payload 110 is caused by the actuator 204 of the carrier 108 to move relative to the movable object 102) .
  • the payload 110 moves around and/or along up to three orthogonal axes, e.g., an X2 (pitch) axis 716, a Y2 (yaw) axis 714, and a Z2 (roll) axis 718.
  • the X2, Y2, and Z2 axes are parallel to the X1, Y1, and Z1 axes respectively.
  • the payload 110 includes the imaging device 214 (e.g., an optical module 702)
  • the roll axis Z2 718 is substantially parallel to an optical path or optical axis for the optical module 702.
  • the optical module 702 is optically coupled to the image sensor 216 (and/or one or more sensors of the movable object sensing system 122) .
  • the carrier 108 causes the payload 110 to rotate around up to three orthogonal axes, X2 (pitch) 716, Y2 (yaw) 714 and Z2 (roll) 718, e.g., based on control instructions provided to the actuator 204 of the carrier 108.
  • the rotations around the three axes are referred to herein as the pitch rotation 734, yaw rotation 732, and roll rotation 736, respectively.
• the angular velocities of the payload 110 around the X2, Y2, and Z2 axes are referred to herein as ωX2, ωY2, and ωZ2, respectively.
  • the carrier 108 causes the payload 110 to engage in translational movements 740, 738, and 742, along the X2, Y2, and Z2 axes, respectively, relative to the movable object 102.
• the linear velocities of the payload 110 along the X2, Y2, and Z2 axes are referred to herein as VX2, VY2, and VZ2, respectively.
  • the movement of the payload 110 may be restricted (e.g., the carrier 108 restricts movement of the payload 110, e.g., by constricting movement of the actuator 204 and/or by lacking an actuator capable of causing a particular movement) .
  • the movement of the payload 110 may be restricted to movement around and/or along a subset of the three axes X2, Y2, and Z2 relative to the movable object 102.
• the payload 110 is rotatable around the X2, Y2, and Z2 axes (e.g., the movements 832, 834, 836) , or any combination thereof, but is not movable along any of the axes (e.g., the carrier 108 does not permit the payload 110 to engage in the movements 838, 840, 842) .
  • the payload 110 is restricted to rotation around one of the X2, Y2, and Z2 axes.
  • the payload 110 is only rotatable about the Y2 axis (e.g., rotation 832) .
  • the payload 110 is restricted to rotation around only two of the X2, Y2, and Z2 axes.
  • the payload 110 is rotatable around all three of the X2, Y2, and Z2 axes.
  • the payload 110 is restricted to movement along the X2, Y2, or Z2 axis (e.g., the movements 838, 840, or 842) , or any combination thereof, and the payload 110 is not rotatable around any of the axes (e.g., the carrier 108 does not permit the payload 110 to engage in the movements 832, 834, or 836) .
• the payload 110 is restricted to movement along only one of the X2, Y2, and Z2 axes. For example, movement of the payload 110 is restricted to the movement 840 along the X2 axis.
  • the payload 110 is restricted to movement along only two of the X2, Y2, and Z2 axes. In some implementations, the payload 110 is movable along all three of the X2, Y2, and Z2 axes.
  • the payload 110 is able to perform both rotational and translational movement relative to the movable object 102.
  • the payload 110 is able to move along and/or rotate around one, two, or three of the X2, Y2, and Z2 axes.
  • the payload 110 is coupled to the movable object 102 directly without the carrier 108, or the carrier 108 does not permit the payload 110 to move relative to the movable object 102.
  • the attitude, position and/or orientation of the payload 110 is fixed relative to the movable object 102 in such cases.
  • adjustment of attitude, orientation, and/or position of the payload 110 is performed by adjustment of the movable object 102, the carrier 108, and/or the payload 110, such as an adjustment of a combination of two or more of the movable object 102, the carrier 108, and/or the payload 110.
• a rotation of 60 degrees around a given axis for the payload is achieved by a 60-degree rotation by the movable object 102 alone, a 60-degree rotation by the payload relative to the movable object 102 as effectuated by the carrier, or a combination of a 40-degree rotation by the movable object 102 and a 20-degree rotation by the payload 110 relative to the movable object 102.
  • a translational movement for the payload 110 is achieved via adjustment of the movable object 102, the carrier 108, and/or the payload 110 such as an adjustment of a combination of two or more of the movable object 102, carrier 108, and/or the payload 110.
  • a desired adjustment is achieved by adjustment of an operational parameter of the payload 110, such as an adjustment of a zoom level or a focal length of the imaging device 214.
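The 60-degree example above (e.g., a 40-degree body rotation combined with a 20-degree gimbal rotation) can be expressed as a simple split of a desired payload rotation between the carrier and the movable object. The preference for the gimbal and the travel-limit parameter in the sketch below are assumptions.

```python
def split_rotation(desired_deg: float, gimbal_limit_deg: float = 30.0):
    """Split a desired payload rotation between the carrier (gimbal) and the
    movable object itself, preferring the gimbal up to an assumed travel limit."""
    gimbal_deg = max(-gimbal_limit_deg, min(gimbal_limit_deg, desired_deg))
    body_deg = desired_deg - gimbal_deg
    return body_deg, gimbal_deg

# A 60-degree request with a 20-degree gimbal limit yields a 40-degree body
# rotation plus a 20-degree gimbal rotation, matching the example above.
print(split_rotation(60.0, gimbal_limit_deg=20.0))  # -> (40.0, 20.0)
```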
  • Figures 8A and 8B illustrate process flows between a user, the computing device 126 (e.g., the application 620) , and the movable object 102 (e.g., a UAV) according to some implementations.
  • the application 620 enables a user to control a flight route of the movable object 102 for automatic and efficient video capture.
  • the application 620 is communicatively connected with the movable object 102 to facilitate automatic recognition of a target type of the target object 106.
  • the movable object 102 is configured to capture live video (e.g., via the image sensor) during the flight route.
• a flight route may consist of flight paths. The flight route is designed so as to utilize a variety of aerial videography techniques.
• the application 620 also provides video-editing functionalities for editing a video that is captured by the movable object 102 during a flight route associated with a target object 106.
• the application 620 may generate an edited video that is composed of multiple scenes and matched with music, filters, and transitions. Because the video capture and editing processes are designed to be fully automated, the efficiency and quality of the entire process of flight, video capture, and video editing are significantly improved. Thus, the interactive experience of the users is enhanced.
  • Figure 8A illustrates an exemplary process flow 800 during an initialization and flight route selection phase.
  • a user may initiate an automated flight route video capture process by launching (804) an application (e.g., application 620, Figure 6) on the computing device 126.
  • the user may also initiate the automated flight route video capture process by other means, such as via the input device 506 of the control unit 104, or using facial recognition and/or hand gesture.
  • the user selects (806) a target object (e.g., the target object 106) .
  • the user selects an object of interest as a target object for the automated flight route video capture. This is illustrated in Figure 9.
  • the user may also select an object of interest as a target object 106 by entering the GPS coordinates of the object of interest via a user interface (e.g., the graphical user interface 630, Figure 6) .
• in response to user selection of the target object 106, the computing device 126 (e.g., the application 620) displays (808) a target zone and parameters associated with the target zone. This is described in Figure 10.
  • the movable object 102 determines (812) a target type and a distance between the movable object 102 and the target object 106. This is described in Figure 13.
  • the movable object 102 selects (814) a flight route. This is described in Figure 13.
  • the computing device 126 displays (818) the selected flight route.
• the movable object 102 (e.g., a UAV) flies (822) autonomously according to the customized flight route.
  • the computing device 126 displays (824) the progress of the flight route as it is being executed by the movable object 102. This is illustrated in Figure 11.
• the movable object 102 captures (826) a video feed having a field of view of the image sensor 216 for each of a plurality of paths of the flight route.
  • the movable object 102 stores (828) a video feed of the flight route as it is being captured (e.g., video data 428, Figure 4) .
  • the movable object 102 also stores video feed tag information along with the video feed.
• after completion of the flight route, the movable object 102 returns (830) to its starting position.
  • Figure 8B illustrates an exemplary process flow 850 during a video editing phase.
  • the computing device 126 retrieves (852) from the movable object 102 video segments of the captured video. Each video segment corresponds to a respective flight path of the flight route. This is described in Figure 17.
  • the user views (854) the captured video feed. This is illustrated in Figure 12.
• the user may view (856) the video segments.
  • the computing device 126 selects (864) a video template.
  • the video template is selected by the computing device 126 automatically.
  • the video template is selected by the user.
  • the user may select (858) a theme for the video.
  • the computing device 126 may display one or more templates corresponding to the user-selected theme.
  • the computing device may also automatically select a default video template.
  • the user may either confirm (860) or modify the video template selection.
  • the user may also input (862) a time duration of the video.
  • the user may also select (866) a resolution for the video. This is illustrated in Figures 12 and 19.
  • the computing device 126 determines (868) a time duration of each scene of the video.
  • the computing device 126 extracts (870) video sub-segments from the video segments.
• the computing device 126 combines (872) the extracted video sub-segments into an edited video.
  • the computing device 126 displays (874) the edited video.
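Steps 868 through 872 above (determining per-scene durations, extracting sub-segments, and combining them into an edited video) might be sketched as follows. The even split of the target duration across scenes and the use of each segment's opening portion are assumptions, not the disclosed editing algorithm.

```python
def plan_scenes(total_duration_s: float, num_scenes: int):
    """Step 868 (sketch): give each scene an equal share of the target duration."""
    return [total_duration_s / num_scenes] * num_scenes

def extract_sub_segments(segments, scene_durations):
    """Steps 870-872 (sketch): take the opening portion of each captured segment
    and lay the sub-segments end to end in the edited timeline. Each segment is
    a (start_s, end_s) interval within the source video."""
    edited, cursor = [], 0.0
    for (start_s, end_s), wanted in zip(segments, scene_durations):
        take = min(wanted, end_s - start_s)
        edited.append({"source": (start_s, start_s + take),
                       "timeline": (cursor, cursor + take)})
        cursor += take
    return edited

segments = [(0, 20), (20, 55), (55, 90)]  # per-flight-path video segments (seconds)
print(extract_sub_segments(segments, plan_scenes(30.0, 3)))
```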
  • Figure 9 provides a screen shot for selecting an object of interest as a target object according to some implementations.
  • the graphical user interface 630 displays a scene (e.g., a live scene or a previously recorded scene) that is captured by the image sensor 216 during a flight of the movable object 102.
  • the user may identify an object 902 (e.g., a structure) in the scene as an object of interest.
  • the user defines the object 902 as the target object 106 by selecting the object 902 on the graphical user interface 630, thereby causing an identifier 910 (e.g., an “X” ) to be displayed on the graphical user interface 630.
  • the user may initiate an automated flight route video capture process using the object 902 as a target object 106.
  • the graphical user interface 630 may display a notification (e.g., as a pop-up window) indicating that the movable object 102 will start a range of movements for target recognition and position estimation.
• the movable object 102 may determine a target type corresponding to the target object 106.
  • the movable object 102 may also determine a distance between the movable object 102 and the target object 106.
  • the movable object 102 may select automatically a predefined flight route, as described in further detail in Figure 13.
  • the graphical user interface 630 may display an updated view that includes a maximum height 904 for the flight route (e.g., “120 m” ) , a maximum distance 906 for the flight route (e.g., “200 m” ) , and a time duration 908 (e.g., “3m 0s” ) for the flight route, as illustrated in Figure 9.
  • the selected flight route may be customized.
  • the customization is performed automatically by the movable object 102.
  • the customization is provided in part by the user.
  • a flight route may be comprised of multiple flight paths, each having a different trajectory.
  • the movable object 102 may cause the computing device 126 to display a library of trajectories that can be selected by the user. Based on the user selection, the movable object 102 automatically generates flight paths corresponding to the trajectories.
  • the selected flight route may be customized according to a surrounding environment between the movable object 102 and the target object 106. For example, modification of a predefined flight route may be necessary to overcome obstacle (s) between the movable object 102 and the target object 106. In some implementations, the customization may also be based on an updated distance between the movable object 102 and the target object 106. For example, if the maximum altitude of the movable object 102 during the flight route is fixed, the rate of ascent of the movable object 102 may be higher when the target object 106 is nearer and the rate of ascent of the movable object 102 may be lower when the target object is farther.
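The rate-of-ascent example above reduces to simple kinematics when the maximum altitude is fixed: the nearer the target, the less time is available to climb, so the required climb rate is higher. A minimal sketch, assuming a climb completed over the horizontal approach distance at constant ground speed:

```python
def required_ascent_rate(max_altitude_m: float, horizontal_distance_m: float,
                         ground_speed_mps: float) -> float:
    """Climb rate (m/s) needed to reach the fixed maximum altitude by the time
    the movable object has covered the horizontal distance to the target."""
    time_to_target_s = horizontal_distance_m / ground_speed_mps
    return max_altitude_m / time_to_target_s

# Same 100 m ceiling: a target 50 m away demands twice the climb rate of one 100 m away.
print(required_ascent_rate(100.0, 50.0, 5.0))   # -> 10.0 m/s
print(required_ascent_rate(100.0, 100.0, 5.0))  # -> 5.0 m/s
```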
  • Figure 10 provides a screen shot for displaying a customized flight path according to some implementations.
  • the graphical user interface 630 displays a map 1002 that includes a current location 1004 of the target object 106 and its vicinity.
  • the map 1002 also includes a flight zone 1006 that indicates a surrounding area of the target object 106 that would be traversed by the movable object 102 during the flight.
  • the flight zone 1006 may be color-coded to show a range of altitudes that will be attained by the movable object 102 during the execution of the flight route, as illustrated in Figure 10.
• the graphical user interface 630 also displays a notification 1008 to the user (e.g., “The UAV will be flying in the identified fly zone.” ) .
• user selection of a “Confirm” affordance 1010 causes the movable object 102 to commence execution of the flight route.
  • User selection of the “Cancel” affordance 1012 causes the movable object 102 to refrain from executing the flight route.
  • the flight route may include a plurality of flight paths.
• the graphical user interface 630 may also display a preview of the flight paths of the flight route.
• the preview may comprise a simulated view (e.g., a three-dimensional or two-dimensional representation) of each of the flight paths from the perspective of the movable object 102.
• the three-dimensional representation may be based on a superposition of satellite images, e.g., images from Google Earth, corresponding to each of the flight paths. Stated another way, the three-dimensional representation enables the user to observe a simulated preview of the flight route using satellite imagery prior to execution of the flight route by the movable object 102.
  • Figure 11 provides a screen shot for execution of a flight route by a movable object 102 according to some implementations.
• in response to user selection of the “Confirm” affordance 1010 in Figure 10, the movable object 102 commences execution of the flight route without user intervention.
  • the graphical user interface 630 displays a bar 1102 that includes multiple segments, as illustrated in Figure 11. Each of the segments corresponds to a flight path with a corresponding trajectory mode, as will be described with respect to Figures 14 to 16.
  • the graphical user interface 630 also displays a status bar 1104 that indicates a degree of completion of the flight route and a description 1106 (e.g., “5/9 Downward spiral” ) of a current status of the flight route.
  • the executed flight route consists of nine flight paths, and the description 1106 “5/9 Downward spiral” indicates that the movable object 102 is currently executing the fifth flight path corresponding to a “downward spiral” trajectory mode.
  • the graphical user interface 630 also displays a current flight direction 1108 executed by the movable object 102.
• the flight route includes a plurality of flight paths, and the image sensor 216 is configured to capture a video feed for each of the flight paths with a corresponding capture setting.
  • the capture setting includes settings for an angle of view of the image sensor 216 with respect to the flight path, a pan tilt zoom (PTZ) setting of the image sensor 216, an optical zoom setting of the image sensor 216, a digital zoom setting of the image sensor 216, and/or a focal length of the image sensor 216. Exemplary details of flight routes, flight paths, and image sensor capture settings are described in further detail with respect to Figures 14 to 16.
  • Figure 12 provides a screen shot of an edited video according to some implementations.
  • the movable object 102 simultaneously transmits a live video stream to the computing device 126 for display on the graphical user interface 630 as the video is being captured.
• the computing device 126 (e.g., the application 620) automatically generates an edited video 1200 from the captured video.
• the user can select a playback affordance on the graphical user interface 630 that enables playback of the video as it is captured, and/or the edited video 1200. Further details of the video editing process are described with respect to Figures 17 to 19.
• the graphical user interface 630 also displays different themes (e.g., a “joyous” theme 1202, an “action” theme 1204, and a “scenic” theme 1206) that, when selected by the user, apply a different effect to the video.
  • the user may also select a resolution 1208 for the video.
• the edited video may be stored locally (e.g., on the computing device 126) or in the cloud. The user may also share the edited video via email and/or on social media platforms.
  • Figure 13 illustrates an exemplary flight route matching strategy according to some implementations.
  • the movable object 102 executes a flight route matching strategy (1302) to determine a flight route to be executed for video capture of the selected target object 106.
  • the movable object first determines a target type corresponding to the target object 106.
  • the movable object 102 may capture images of the target object (e.g., using the image sensors 302 and/or the image sensor 216) and use image matching algorithms to identify a target type corresponding to the target object 106.
• the movable object 102 determines (1304) whether the target object 106 is a person.
• in accordance with a determination that the target object is not a person, the movable object 102 selects a normal flight route (1310) . In accordance with a determination that the target object is (1308) a person, the movable object 102 selects a portrait flight route (1312) .
• the movable object 102 also determines a distance between the movable object 102 and the target object 106 (e.g., using a distance sensor such as the infrared sensors 306, a RADAR sensor, a LiDAR sensor, or GPS coordinates) , and further refines the selected flight route based on the distance.
  • the movable object 102 further determines if a distance between the movable object 102 and the target object 106 exceeds a threshold distance (e.g., over 100 meters) (step 1314) .
• in accordance with a determination that the distance does not (1316) exceed the threshold distance, the movable object 102 identifies the normal flight route (1320) as the flight route to be executed. In accordance with a determination that the distance exceeds (1318) the threshold distance, the movable object 102 identifies a long range flight route (1322) as the flight route to be executed.
• the movable object 102 determines if a distance between the movable object 102 and the target object 106 exceeds a threshold distance (e.g., over 100 meters) (step 1324) . In accordance with a determination that the distance exceeds (1326) the threshold distance, the movable object 102 identifies a long range flight route (1322) as the flight route to be executed. In accordance with a determination that the distance does not exceed (1328) the threshold distance, the movable object 102 identifies a portrait flight route (1330) as the flight route to be executed.
  • the flight route matching strategy 1302 includes more target types and/or flight routes than those described in Figure 13.
  • the movable object 102 may utilize other sensors and/or machine learning algorithms to identify target types other than those described in Figure 13 (e.g., target types such as animals, vehicles, mountains, buildings, and sculptures) .
• a user may override the identified flight route (e.g., by selecting a different flight route) .
• the user may also customize the flight route (e.g., by modifying a subset of the flight paths of the flight route) , and/or create new flight routes for execution by the movable object 102.
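The matching strategy of Figure 13 amounts to a two-level decision: target type first, then distance against a threshold. A minimal sketch is given below; the 100-meter threshold comes from the example above, while the function and route labels are assumptions.

```python
THRESHOLD_DISTANCE_M = 100.0  # example threshold quoted above

def match_flight_route(target_is_person: bool, distance_m: float) -> str:
    """Sketch of the Figure 13 strategy: person vs. non-person, then near vs.
    far relative to the threshold distance."""
    if distance_m > THRESHOLD_DISTANCE_M:
        return "long_range"  # far targets fall back to the long range route
    return "portrait" if target_is_person else "normal"

print(match_flight_route(True, 40.0))    # -> "portrait"
print(match_flight_route(False, 150.0))  # -> "long_range"
```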
• Figures 14, 15, and 16 illustrate, respectively, an exemplary normal flight route, an exemplary portrait flight route, and an exemplary long range flight route according to some implementations.
  • a respective flight route consists of multiple flight paths (e.g., trajectory modes) that are executed successively in the flight route.
• the numbers in parentheses (e.g., (0) , (1) , (2) , etc.) denote the order of execution of the flight paths; that is, the order of execution of the flight paths is flight path (0) , (1) , (2) , (3) , (4) , (5) , (6) , (7) , (8) , and (9) .
  • the block arrows represent a direction of travel of the movable object 102.
  • the triangle represents the image sensor 216 and the base of the triangle denotes the field of view of the image sensor 216.
  • the thin arrows represent a direction of rotation of the image sensor 216 as the movable object 102 transitions to a subsequent trajectory mode within the flight route.
  • the global x-, y-, and z-axes are also depicted in the figures.
  • Figure 14 illustrates an exemplary normal flight route according to some implementations.
  • the normal flight route is designed to be executed within the shortest possible time and using the smallest possible flight area (e.g., smallest possible flight zone) .
  • the flight paths are designed so as to display the target object 106 and the surrounding environment as richly as possible, taking into account different target types (e.g., target types such as animals, vehicles, buildings, sculptures, and scenery) .
  • the normal flight route includes a maximum allowable distance between a starting position of the movable object 102 and the target object 106 (e.g., 100 m, 150 m, or 200 m) and a maximum allowable flight altitude (e.g., 50 m, 80 m, or 100 m) . If the distance between the movable object 102 and the target object 106 exceeds the maximum allowable distance, the long range flight route will be selected instead (See Figure 13) .
  • the movable object 102 is configured to rotate with a fixed angle of rotation in a specific direction for flight paths (2) , (4) , and (5) .
  • the movable object 102 is configured to rotate 180° in a counterclockwise direction with respect to the y-axis (e.g., the yaw rotation 720, Figure 7) when executing flight path (2) .
• the movable object 102 is configured to rotate 180° in a clockwise direction with respect to the x-axis (e.g., the roll rotation 724, Figure 7) .
  • the image sensor 216 is configured to maintain a fixed angle (e.g., fixed with respect to the global x-, y-, and z-axes) in one or more flight paths. As illustrated in Figure 14, the image sensor 216 has a fixed angle with respect to the global x-axis for each of the flight paths (2) , (4) , and (5) .
  • the carrier 108 includes a gimbal and is configured to continuously adjust the gimbal (e.g., adjust an angle of rotation and/or a tilt angle) in order to maintain the image sensor 216 at the fixed angle.
• the image sensor 216 is configured to rotate about one or more axes in a flight path.
  • the image sensor is configured to rotate about the y-axis (e.g., yaw rotation 732, Figure 7) while the movable object 102 travels in an upward direction.
  • the image sensor 216 is configured to tilt and/or rotate between flight paths of the flight route. For example, as the movable object 102 transitions from flight path (8) to flight path (9) , the field of view of the image sensor 216 switches from facing a negative x-axis direction to a negative y-axis direction.
  • the flight route includes a flight path wherein the field of view of the image sensor 216 faces a direction that is the same as a direction of travel of the movable object 102.
  • flight path (9) the movable object 102 is traveling in a downward direction (e.g., negative y-axis direction) and the field of view of the image sensor 216 also faces the downward direction.
  • the flight route includes a flight path wherein the field of view of the image sensor 216 faces a direction that is opposite to a direction of travel of the movable object 102.
  • flight path (1) the movable object 102 is traveling away from the starting position of the movable object 102 whereas the field of view of the image sensor is facing the starting position of the movable object 102.
  • flight paths (1) , (3) , (6) , (7) , (8) , and (9) are trajectory modes in which the movable object 102 traverses a respective fixed distance.
  • the distance between the starting point of the movable object 102 (e.g., UAV starting position) and the target object 106 is D1
  • the distance between the farthest point of the flight route and the starting point of the movable object 102 is D2.
• the ratio of D2 to D1 should be as large as possible.
  • the distance D1 between the movable object 102 and the target object 106 should be between 2 m and 100 m.
  • Figure 15 illustrates an exemplary portrait flight route according to some implementations.
  • the portrait mode is used when the target object is a person.
• the portrait mode is designed so as to present people in a suitable proportion in the captured video. Accordingly, the movable object 102 should not be too far from the target object 106 while taking into consideration the surrounding environment.
  • the portrait flight route includes a maximum allowable distance between a starting position of the movable object 102 and the target object 106 (e.g., 40 m, 50 m, or 60 m) and a maximum allowable flight altitude (e.g., 30 m, 40 m, or 50 m) .
  • the portrait flight route includes a plurality of flight paths, each having a respective trajectory mode.
  • the movable object 102 is configured to rotate with a fixed angle of rotation in a specific direction for flight paths (2) , (3) , and (5) .
  • the movable object 102 is configured to travel a fixed distance for flight paths (1) , (4) , (6) , (7) , (8) , (9) and (10) .
• the image sensor 216 is also configured to rotate or tilt with respect to one or more axes, or maintain a fixed angle, as discussed previously with respect to Figure 14.
  • the portrait flight route is designed to enrich the video capture effects, for example by controlling a pan, tilt, and zoom (PTZ) setting of the image sensor 216.
• flight path (1) comprises a trajectory mode wherein the image sensor 216 performs a dolly zoom, dynamically adjusting a zoom factor and focal length during capture of the target object 106 as the movable object 102 moves away from the target object 106.
  • the starting position of the portrait flight route is predetermined.
  • the predetermined starting position is three meters high and five meters away from the target object 106, so that the size of the target object in images captured by the image sensor would be appropriate.
• the movable object 102 is configured to track the target object 106 continuously for flight paths (1) , (2) , (3) , (4) , (5) . If the target object 106 moves, the flight paths (1) , (2) , (3) , (4) , (5) will change accordingly.
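The dolly-zoom behaviour of flight path (1), in which the zoom factor and focal length are adjusted as the movable object moves away, follows the standard relation that focal length must scale with camera-to-target distance to keep the subject the same apparent size. The sketch below states that relation; the numeric values are illustrative only.

```python
def dolly_zoom_focal_length(initial_focal_mm: float, initial_distance_m: float,
                            current_distance_m: float) -> float:
    """Focal length that keeps the target the same apparent size as the
    camera-to-target distance changes (f2 = f1 * d2 / d1)."""
    return initial_focal_mm * current_distance_m / initial_distance_m

# Starting 5 m from the subject at 24 mm, backing off to 15 m calls for 72 mm.
print(dolly_zoom_focal_length(24.0, 5.0, 15.0))  # -> 72.0
```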
  • Figure 16 illustrates an exemplary long range flight route according to some implementations.
  • the long range flight route is designed to enrich the video capture effects, for example by controlling a pan, tilt, and zoom (PTZ) setting of the image sensor 216.
  • flight path (3) comprises a trajectory mode wherein the field of view of the image sensor 216, which is facing the target object 106, dynamically adjusts a zoom factor (e.g., from 3-6 times) during capture of the target object 106 as the movable object 102 is moving away from the target object 106 (e.g., the direction of travel of the movable object 102 is in the positive x direction) .
  • the long range flight route includes a maximum allowable distance between a starting position of the movable object 102 and the target object 106 (e.g., 100 m, 150 m, or 200 m) and a maximum allowable flight altitude (e.g., 80 m, 100 m, or 120 m) .
  • the flight paths include a discovery approach trajectory mode, as illustrated in flight path (3) of the normal flight route, flight path (3) of the portrait flight route, and flight path (1) of the long range flight route.
  • the carrier 108 controls a pitch angle of the gimbal such that the field of view of the image sensor 216 no longer follows the target object 106, but instead rotates down to 90° relative to a horizontal plane.
  • the movable object 102 is configured to fly towards the target object 106.
  • the movable object 102 determines in real time a current position of the movable object 102 in the flight route, and controls the gimbal to gradually lift at a certain speed to make the target object 106 appear in the field of view of the image sensor 216 again.
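The discovery approach trajectory mode described above can be pictured with the following Python sketch, which starts with the gimbal pointing straight down and rate-limits the lift toward the pitch that would center the target object. The geometry, step size, and names are assumptions for illustration only, not the control law of the movable object 102.

    import math

    def pitch_to_center_target_deg(horizontal_distance_m, relative_altitude_m):
        # 90 degrees means looking straight down; smaller values lift the view toward the horizon.
        return math.degrees(math.atan2(relative_altitude_m, horizontal_distance_m))

    def lift_gimbal(current_pitch_deg, desired_pitch_deg, max_step_deg=2.0):
        # Move at most max_step_deg per control tick so the lift appears gradual.
        step = max(-max_step_deg, min(max_step_deg, desired_pitch_deg - current_pitch_deg))
        return current_pitch_deg + step

    pitch_deg = 90.0  # start looking straight down
    for horizontal_distance_m in (80.0, 60.0, 40.0, 20.0, 10.0):
        desired_deg = pitch_to_center_target_deg(horizontal_distance_m, relative_altitude_m=30.0)
        pitch_deg = lift_gimbal(pitch_deg, desired_deg)
        print("%.0f m away -> commanded pitch %.1f deg" % (horizontal_distance_m, pitch_deg))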
  • the flight route of the movable object 102 (as well as the corresponding flight paths and trajectory modes) is highly dependent on the position of the target object 106 (and correspondingly, a distance between the movable object 102 and the target object 106) .
  • the movable object 102 continuously determines a current location of the target object 106 while executing the flight route (e.g., using the image sensors 302 and/or 216, the distance sensors, and/or GPS) , and modifies a current flight path and/or subsequent flight paths based on a presently determined location of the target object 106.
  • the movable object 102 may use a most recently determined location of the target object 106 as the current location of the target object 106, and update the location data once it relocates the target object 106.
  • the target object 106 is a moving object that changes its location while the flight route is being executed.
  • the movable object 102 may be configured to adjust a current flight path and/or subsequent flight path (s) for optimum video capture.
  • the movable object 102 is configured to return to its starting location after execution of the flight route.
  • a user may issue a command to pause or stop the movable object 102 during execution of a flight route (e.g., for collision prevention) .
  • the movable object 102 may hover at a fixed location (e.g., its most recent location prior to the user command) and await the user's follow-up operation.
  • the movable object 102 may be configured to automatically return to its starting location. If the user decides to continue with the video capture, the user may resume operation of the movable object 102.
  • the movable object 102 resumes its operation by executing the remaining portion of a flight path (i.e., the flight path that it was executing prior to receiving the user command) , and then following on to execute subsequent flight paths of the flight route. In some implementations, the movable object 102 resumes its operation by skipping the unfinished portion of a current flight path and commencing with the next flight path in the flight route. In some implementations, the movable object 102 resumes its operation by repeating a current flight path that was being executed prior to the user command, and then following on to execute subsequent flight path (s) of the flight route.
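The three resume behaviors described above can be summarized with the following Python sketch; the enum and function names are hypothetical and only illustrate how the remaining flight paths might be selected after a pause.

    from enum import Enum, auto

    class ResumeStrategy(Enum):
        FINISH_CURRENT_PATH = auto()   # complete the interrupted flight path, then continue
        SKIP_TO_NEXT_PATH = auto()     # abandon the interrupted path and start the next one
        REPEAT_CURRENT_PATH = auto()   # restart the interrupted path from its beginning

    def paths_to_execute_on_resume(flight_route, interrupted_index, strategy):
        """Return the (possibly partial) flight paths left to execute after a pause."""
        if strategy is ResumeStrategy.FINISH_CURRENT_PATH:
            return ([("remainder of", flight_route[interrupted_index])]
                    + [("all of", p) for p in flight_route[interrupted_index + 1:]])
        if strategy is ResumeStrategy.SKIP_TO_NEXT_PATH:
            return [("all of", p) for p in flight_route[interrupted_index + 1:]]
        return [("all of", p) for p in flight_route[interrupted_index:]]  # REPEAT_CURRENT_PATH

    route = ["path 1", "path 2", "path 3", "path 4"]
    for strategy in ResumeStrategy:
        print(strategy.name, paths_to_execute_on_resume(route, 1, strategy))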
  • the movable object 102 is configured to avoid obstacles during execution of a flight route.
  • a radius of the distance from the movable object 102 to the target object 106 is 100 meters
  • the image sensors 302 of the movable object 102 are located at a fixed position (e.g., in the front) of the movable object 102 and are in a forward-facing direction, and the field of view of the image sensor 216 is facing the target object 106, which is located at an angle offset (e.g., 44°) from the image sensors 302.
  • an internal spiral flight trajectory can be used to ensure obstacle avoidance within the visual field of view (e.g., the field of view of the image sensors 302) , but the surrounding radius will gradually shrink (e.g., by approximately 8% for every 30° of rotation) .
  • if the radius between the movable object 102 and the target object 106 is less than 64.8 m, obstacle avoidance can be performed based on a map of the target object location without the use of an internal spiral flight trajectory.
  • if the image sensors 302 and the image sensor 216 are both facing the target object 106 and the flight path is circular, obstacle avoidance cannot be achieved.
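The inward spiral described above can be sketched as follows in Python, with the radius shrinking by roughly 8% for every 30° of rotation around the target object; the waypoint representation and numbers are illustrative assumptions.

    import math

    def internal_spiral_waypoints(start_radius_m, shrink_per_step=0.08, step_deg=30.0, total_deg=180.0):
        """Return (angle_deg, radius_m, x_m, y_m) waypoints of an inward spiral around the target."""
        waypoints = []
        radius_m = start_radius_m
        for i in range(int(total_deg / step_deg) + 1):
            angle_deg = i * step_deg
            angle_rad = math.radians(angle_deg)
            waypoints.append((angle_deg, radius_m,
                              radius_m * math.cos(angle_rad), radius_m * math.sin(angle_rad)))
            radius_m *= (1.0 - shrink_per_step)  # shrink ~8% for every 30 degrees of rotation
        return waypoints

    for angle_deg, radius_m, x_m, y_m in internal_spiral_waypoints(100.0):
        print("%5.0f deg  radius %6.2f m  position (%7.2f, %7.2f)" % (angle_deg, radius_m, x_m, y_m))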
  • Figures 17 to 19 illustrate the video download, extraction, and template matching processes performed at the computing device 126 (e.g., by the application 620) .
  • Figure 17 illustrates an exemplary video segment download and extraction process at a computing device 126, in accordance with some implementations.
  • the image sensor 216 automatically captures video during a flight route of the movable object 102.
  • the movable object 102 is configured to store the video and its corresponding metadata information as the video is being captured (e.g., while the movable object 102 is executing the flight route) .
  • the movable object 102 may store the captured video and metadata locally on the movable object 102 (e.g., video data 428, Figure 4) . Additionally and/or alternatively, the movable object 102 may store the captured video and metadata on an external storage medium (e.g., an SD card) that is located on the movable object 102.
  • the movable object 102 simultaneously transmits a live video stream to the computing device 126 for display on the computing device 126 (e.g., on the graphical user interface 630) as the video is being captured.
  • the live video stream may also include metadata (e.g., tag information) corresponding to the live video stream.
  • the computing device 126 may store the video stream and metadata information (e.g., locally on the computing device 126 and/or in the database 614) .
  • the video that is stored on the movable object 102 has a higher resolution than the streamed video data.
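As an illustrative data-layout sketch only, the per-segment tag information stored alongside the captured video might resemble the following Python structure; the field names are assumptions, not the actual format used by the movable object 102 or the computing device 126.

    from dataclasses import dataclass

    @dataclass
    class SegmentTag:
        flight_path_id: int      # which path of the flight route produced the segment
        trajectory_mode: str     # e.g., "fixed distance", "fixed rotation", "dolly zoom"
        start_time_s: float      # offset of the segment within the captured video
        end_time_s: float
        gimbal_pitch_deg: float  # orientation of the field of view during the segment
        zoom_factor: float

    tags = [
        SegmentTag(1, "dolly zoom", 0.0, 12.5, -10.0, 1.0),
        SegmentTag(2, "fixed rotation", 12.5, 20.0, -10.0, 1.5),
    ]
    print(tags[0])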
  • the computing device 126 (e.g., the application 620) can use the extra bandwidth that has been freed up from video transmission to download the higher-resolution video data from the movable object 102 for post-processing and video editing.
  • the shaded regions represent video segments that are downloaded by the computing device 126.
  • the computing device 126 may download an entire video feed that is captured during a flight route, as illustrated by the shaded region representing “Captured Video Feed” in Figure 17.
  • the computing device 126 may download segments (e.g., portions) of a video feed. These are illustrated by “Level 1 download” and “Level 2 download” in Figure 17.
  • the downloading and segment selection process is performed automatically by the computing device 126 (i.e., without user intervention) .
  • the computing device 126 may use the tag information to identify segments of the captured video whose image quality may be poorer. These may include segments corresponding to changes in speed of the movable object 102, and/or arising from the movable object 102 transitioning from one flight path to another in the flight route.
  • the computing device 126 may download other segments of the captured video while refraining from downloading segments that are deemed to be of lower image quality. Accordingly, the amount of video data to be downloaded and the time required to download the data can be significantly reduced.
  • the downloading and segment selection process is performed by the computing device 126 with user input. For example, a user may identify certain portions of the video as being more interesting and/or of better image quality when viewing the streamed video, and download the video segments corresponding to these portions for subsequent use.
  • the computing device 126 simultaneously performs the Level 1 download and the Level 2 download that are illustrated in Figure 17. In some implementations, the computing device 126 sequentially performs the Level 1 download and the Level 2 download.
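A minimal sketch of the tag-driven segment selection described above, which skips segments that are likely to have poorer image quality (for example, segments captured during speed changes or transitions between flight paths). The quality heuristic and names are assumptions for illustration.

    def select_segments_for_download(tags, skip_reasons=("speed change", "path transition")):
        """Return the tags of segments worth downloading at full resolution."""
        return [tag for tag in tags if tag["capture_condition"] not in skip_reasons]

    captured = [
        {"segment": 0, "capture_condition": "steady"},
        {"segment": 1, "capture_condition": "path transition"},  # likely lower image quality
        {"segment": 2, "capture_condition": "steady"},
        {"segment": 3, "capture_condition": "speed change"},     # likely lower image quality
    ]
    for tag in select_segments_for_download(captured):
        print("download segment", tag["segment"])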
  • the edited video is generated based on a video template.
  • Figure 18 illustrates an exemplary video template matching strategy (1802) in accordance with some implementations.
  • the computing device 126 determines, based on the flight route (e.g., using the metadata of the captured video) , whether the captured video corresponds to the long range flight route (1322) , the normal flight route (1320) , or the portrait flight route (1330) . In accordance with a determination that the flight route corresponds to the long range flight route (1322) , the computing device 126 selects a first template strategy (e.g., template strategy A 1810) . In accordance with a determination that the flight route corresponds to the normal flight route (1320) or the portrait flight route (1330) , the computing device 126 selects a second template strategy (e.g., template strategy B 1812) .
  • each of the template strategies further comprises a plurality of templates with different themes (or styles) .
  • the first template strategy 1810 includes templates of different themes 1804 whereas the second template strategy 1812 includes templates of different themes 1806, for selection by the user.
  • each of the themes may generate a different effect on the video.
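The template strategy selection of Figure 18 can be sketched as follows, assuming for illustration that the long range flight route maps to template strategy A 1810 and the normal and portrait flight routes map to template strategy B 1812; the theme and template names are hypothetical.

    TEMPLATE_STRATEGY_A = {"joyous": "template A1", "action": "template A2", "scenic": "template A3"}
    TEMPLATE_STRATEGY_B = {"joyous": "template B1", "action": "template B2", "scenic": "template B3"}

    def select_template_strategy(flight_route_name):
        # Long range footage gets one set of templates; normal and portrait footage get another.
        if flight_route_name == "long range":
            return TEMPLATE_STRATEGY_A
        return TEMPLATE_STRATEGY_B

    strategy = select_template_strategy("portrait")
    print(strategy["scenic"])  # the user can then pick a theme and template within the strategy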
  • Figure 19 illustrates another exemplary video template matching strategy 1900 in accordance with some implementations.
  • the template matching strategy 1900 comprises a theme selection 1902.
  • the computing device 126 may display one or more themes that are available for selection by the user (e.g., a joyous theme, an action theme, a scenic theme, and an artistic theme) .
  • the computing device 126 may initiate a template selection step 1903, in which templates corresponding to the selected theme are presented to the user.
  • the computing device 126 also prompts the user to select a template.
  • the computing device 126 displays a list of all available templates to the user and prompts the user to select a template.
  • the computing device 126 automatically selects a theme and a template based on the flight route, which can be subsequently modified by the user.
  • the computing device 126 may prompt the user to select music (1904) that matches the theme. In some implementations, the music is automatically selected by the computing device 126.
  • each of the templates includes scenes 1906.
  • the scenes 1906 include an opening scene, one or more intermediate scenes (e.g., intermediate scene 1 and intermediate scene 2, Figure 19) and a concluding scene.
  • the computing device 126 determines, for each of the scenes 1906, one or more flight paths 1908 whose video segments may be used for the scene. In some implementations, the determination is based on the flight parameters (e.g., flight path, an angle and/or direction of field of view of the image sensor 216, a trajectory mode etc. ) that are extracted from the metadata information. In some implementations, one flight path may be used in more than one scene. For example, in Figure 19, a video segment of flight path 2 may be used in the second intermediate scene as well as in the concluding scene. Furthermore, the sequence in which the flight paths are executed in a flight route of the movable object 102 does not have any bearing on the scene (s) in which they may be used.
  • the flight routes illustrated in Figures 14 to 16 each consist of flight paths (0) to (9) that are executed in this order. Although flight path (0) is the first flight path to be executed, video segments from flight path (0) may be used in the opening scene as well as in the concluding scene, as illustrated in Figure 19.
  • the computing device 126 determines a total time duration for the edited video.
  • the total time duration may be defined by the user (see, e.g., step 862 in Figure 8B) .
  • the computing device 126 determines a corresponding time duration for each of the scenes.
  • the computing device 126 extracts, from the video segments for the corresponding scene, one or more video sub-segments 1910 according to a time duration of the scene.
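One way to picture the duration handling described above is the following Python sketch, which splits a user-specified total duration across the scenes and then trims a sub-segment of the required length out of a flight-path segment. The proportional split and the choice of the middle of the segment are assumptions for illustration, not the allocation rules of the computing device 126.

    def allocate_scene_durations(total_duration_s, scene_weights):
        """Split the requested total duration across scenes in proportion to their weights."""
        weight_sum = sum(scene_weights.values())
        return {scene: total_duration_s * weight / weight_sum for scene, weight in scene_weights.items()}

    def extract_sub_segment(segment_start_s, segment_end_s, wanted_duration_s):
        """Take a sub-segment of the wanted duration from the middle of a flight-path segment."""
        available_s = segment_end_s - segment_start_s
        duration_s = min(wanted_duration_s, available_s)
        start_s = segment_start_s + (available_s - duration_s) / 2.0
        return (start_s, start_s + duration_s)

    durations = allocate_scene_durations(
        30.0, {"opening": 1, "intermediate 1": 2, "intermediate 2": 2, "concluding": 1})
    print(durations)
    print(extract_sub_segment(12.5, 20.0, durations["opening"]))  # sub-segment for the opening scene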
  • the video template matching strategies illustrated in Figures 18 and 19 improve over existing post-editing templates that are available on the market.
  • the existing templates only allow a user to design general templates and perform basic video editing functions such as segmentation, changing a playback speed, image transformation (e.g., image translation, zoom, rotation, crop, mirror, mask, changing a degree of image transparency, keyframe editing) , changing a transition effect (e.g., basic transitions, mirror transitions, special effects transitions, and mask transitions) and other aspects of the effect that can be achieved by manual editing.
  • these editing effects do not improve the actual quality of the raw video capture.
  • the computing device 126 stores both the captured video as well as the corresponding metadata (e.g., tag) information.
  • information such as the time duration of each of the flight paths, as well as information about a prior scene, an angle of view, and the relationship between the various flight paths and the target object 106, is known.
  • Figures 20A-20C provide a flowchart of a method 2000 according to some implementations.
  • the method 2000 is performed (2002) by an unmanned aerial vehicle (UAV) (e.g., the movable object 102) .
  • the UAV receives (2006) , from a computing device 126 that is communicatively connected to the UAV, a first input that includes identification of a target object 106. This is illustrated in Figure 9.
  • the UAV determines (2010) a target type corresponding to the target object 106.
  • the target type may include a person, an animal, a vehicle, a building, a sculpture, a mountain, or the sea.
  • determining the target type further comprises employing image recognition algorithms.
  • the UAV determines (2012) a distance between the UAV and the target object 106.
  • the UAV may include GPS sensing technology.
  • the target object 106 may include GPS or ultra-wide band (UWB) .
  • the UAV may determine a distance between the UAV and the target object 106 using their respective locations, which are obtained via GPS.
  • a user may input coordinates corresponding to the target object 106 and the UAV determines the distance between the UAV and the target object 106 based on the coordinates.
  • a user may identify a location of the target object 106 using a map that is displayed on a user interface (e.g., a graphical user interface 630) .
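Where both the UAV and the target object 106 report GPS coordinates, the separation can be estimated with a standard great-circle (haversine) calculation, sketched below in Python; this is a generic illustration rather than the positioning method actually used by the UAV.

    import math

    def haversine_distance_m(lat1_deg, lon1_deg, lat2_deg, lon2_deg):
        """Great-circle distance between two latitude/longitude points, in meters."""
        earth_radius_m = 6371000.0
        lat1, lon1, lat2, lon2 = map(math.radians, (lat1_deg, lon1_deg, lat2_deg, lon2_deg))
        dlat, dlon = lat2 - lat1, lon2 - lon1
        a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
        return 2 * earth_radius_m * math.asin(math.sqrt(a))

    # Example: two points roughly 90 m apart.
    print(round(haversine_distance_m(22.5430, 113.9430, 22.5438, 113.9432)), "m")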
  • the UAV selects (2014) automatically, from a plurality of predefined flight routes, a flight route for the UAV.
  • the UAV employs a flight route matching strategy 1302 to select the flight route, as illustrated in Figure 13.
  • the plurality of predefined flight routes include (2016) a portrait flight route, a long range flight route, and a normal flight route. This is illustrated in Figures 13 to 16.
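A simplified sketch of the kind of decision the flight route matching strategy 1302 makes, assuming for illustration that a person target maps to the portrait flight route and that more distant targets map to the long range flight route; the threshold value and names are assumptions, not the exact rules of Figure 13.

    def select_flight_route(target_type, distance_m, long_range_threshold_m=50.0):
        if target_type == "person":
            return "portrait flight route"
        if distance_m > long_range_threshold_m:
            return "long range flight route"
        return "normal flight route"

    print(select_flight_route("person", 10.0))     # portrait flight route
    print(select_flight_route("building", 120.0))  # long range flight route
    print(select_flight_route("vehicle", 25.0))    # normal flight route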
  • the UAV automatically customizes the selected flight route by taking into consideration factors such as an updated distance between the UAV and the target object 106, the presence of potential obstacle (s) and/or other structures (e.g., buildings and trees) , or weather conditions.
  • customizing the flight route includes modifying a rate of ascent of the UAV, an initial velocity of the UAV, and/or an acceleration of the UAV.
  • the customization is provided in part by a user. For example, depending on the target type and the distance, the UAV may cause the user interface 630 to display a library of trajectories that can be selected by the user. The UAV then automatically generates the paths of the flight route based on the user selections.
  • the selected flight route includes (2020) a plurality of paths of different trajectory modes. This is illustrated in Figures 14 to 16.
  • each of the plurality of paths (2022) comprises a respective one or more of: a path distance, a velocity of the UAV, an acceleration of the UAV, a flight time, an angle of view, a starting altitude, an ending altitude, a pan tilt zoom (PTZ) setting of the image sensor, an optical zoom setting of the image sensor, a digital zoom setting of the image sensor, and a focal length of the image sensor.
  • the trajectory modes include (2024) a first trajectory mode, corresponding to: the UAV traveling in a first direction; and a field of view of an image sensor of the UAV facing a second direction, opposite to the first direction. This is illustrated in flight path (1) in Figure 14.
  • the trajectory modes include (2026) a first trajectory mode, corresponding to: the UAV traveling in a first direction; and a field of view of an image sensor of the UAV facing the first direction. This is illustrated in flight path (9) in Figure 14.
  • the method further comprises (2028) rotating the field of view of an image sensor of the UAV from the first direction to a second direction, distinct from the first direction, while executing the first trajectory mode.
  • the trajectory modes include (2030) a first trajectory mode, corresponding to: the UAV traveling in a first direction; and a field of view of an image sensor of the UAV facing a second direction, perpendicular to the first direction. This is illustrated in flight path (8) in Figure 14.
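Purely as an illustrative sketch, a flight path and its trajectory mode could be represented with a structure such as the following; the fields and the classification by the angle between the travel direction and the field of view are assumptions, not the data model of the disclosed UAV.

    from dataclasses import dataclass

    @dataclass
    class FlightPath:
        path_distance_m: float
        velocity_m_s: float
        flight_time_s: float
        starting_altitude_m: float
        ending_altitude_m: float
        travel_direction_deg: float  # heading of the UAV while flying this path
        fov_direction_deg: float     # heading of the image sensor's field of view
        zoom_factor: float

    def trajectory_mode(path):
        """Classify the path by the angle between travel direction and field of view."""
        offset = abs(path.fov_direction_deg - path.travel_direction_deg) % 360.0
        offset = min(offset, 360.0 - offset)
        if offset < 1.0:
            return "field of view faces the direction of travel"
        if abs(offset - 180.0) < 1.0:
            return "field of view faces opposite to the direction of travel"
        if abs(offset - 90.0) < 1.0:
            return "field of view is perpendicular to the direction of travel"
        return "field of view is offset by %.0f degrees" % offset

    path = FlightPath(40.0, 3.0, 13.3, 3.0, 10.0,
                      travel_direction_deg=0.0, fov_direction_deg=180.0, zoom_factor=1.0)
    print(trajectory_mode(path))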
  • the UAV sends (2032) to the computing device 126 the selected flight route for display on the computing device.
  • sending to the computing device 126 the selected flight route further comprises (2034) causing to be displayed on the computing device 126 a preview of the selected flight route.
  • the preview comprises (2036) a three-dimensional or two-dimensional representation of the selected flight route.
  • the three-dimensional representation may be based on a superposition of satellite images, e.g., images from Google Earth, corresponding to each of the flight paths of the selected flight route.
  • the preview comprises (2038) a map of a vicinity of the UAV and the target object. This is illustrated in Figure 10.
  • the UAV receives (2040) from the computing device 126 a second input.
  • the UAV receives from the computing device an input corresponding to user selection of the “Confirm” affordance 1010 in Figure 10.
  • the UAV controls (2042) the UAV to fly autonomously according to the selected flight route, including capturing by an image sensor of the UAV a video feed having a field of view of the image sensor and corresponding to each path of the plurality of paths.
  • the UAV simultaneously stores (2044) the video feed while the video feed is being captured (e.g., video data 428, Figure 4) .
  • the UAV stores (2046) with the video feed tag information associated with the flight path and trajectory mode corresponding to a respective segment of the video feed.
  • Figures 21A-21C provide a flowchart of a method 2100 according to some implementations.
  • the method 2100 comprises a method for editing (2102) a video.
  • the method 2100 is performed (2104) at a computing device 126.
  • the computing device 126 has (2106) one or more processors 602 and memory 604.
  • the memory 604 stores (2108) programs to be executed by the one or more processors.
  • the programs include an application 620.
  • the video includes (2110) a plurality of video segments captured by an unmanned aerial vehicle (UAV) (e.g., the movable object 102) during a flight route associated with a target object 106.
  • Each of the video segments corresponds (2112) to a respective path of the flight route.
  • the computing device 126 obtains (2114) a set of tag information for each of the plurality of video segments.
  • the computing device 126 selects (2118) , from a plurality of video templates, a video template for the video.
  • the selected video template includes (2120) a plurality of scenes. This is illustrated in Figure 19.
  • the plurality of scenes include (2122) : an opening scene, one or more intermediate scenes, and a concluding scene. This is illustrated in Figure 19.
  • the selected video template comprises (2128) a theme and includes music that matches the theme.
  • the plurality of video templates comprise (2130) a plurality of themes.
  • the plurality of themes may include a plurality of: an artistic theme, an action theme, a scenic theme, a dynamic theme, a rhythmic theme, and a joyous theme.
  • the computing device 126 extracts (2134) , from the plurality of video segments, one or more video sub-segments according to the tag information and the selected video template.
  • the method 2100 further comprises: prior to (2140) the extracting, receiving a user input specifying a total time duration of the video.
  • the time duration of the scene is (2142) based on the total time duration of the video.
  • the method 2100 further comprises automatically allocating (2144) a time for the video sub-segment.
  • the computing device 126 combines (2146) the extracted video sub-segments from the plurality of scenes into a complete video of the flight route of the UAV.
  • the extracted video sub-segments are (2148) combined according to a time sequence in which the video sub-segments are captured.
  • the extracted video sub-segments are (2150) combined in a time sequence that is defined by the selected video template.
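The two combination orders described above can be sketched as follows, assuming each extracted sub-segment carries its capture time and the position assigned to it by the selected template; the names and fields are illustrative only.

    def combine_by_capture_time(sub_segments):
        """Order the sub-segments by the time at which they were captured."""
        return [s["name"] for s in sorted(sub_segments, key=lambda s: s["capture_time_s"])]

    def combine_by_template_order(sub_segments):
        """Order the sub-segments by the position the selected template assigns to them."""
        return [s["name"] for s in sorted(sub_segments, key=lambda s: s["template_position"])]

    sub_segments = [
        {"name": "path 3 clip", "capture_time_s": 45.0, "template_position": 0},  # opening scene
        {"name": "path 1 clip", "capture_time_s": 5.0, "template_position": 2},
        {"name": "path 2 clip", "capture_time_s": 20.0, "template_position": 1},
    ]
    print(combine_by_capture_time(sub_segments))
    print(combine_by_template_order(sub_segments))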
  • Exemplary processing systems include, without limitation, one or more general purpose microprocessors (for example, single or multi-core processors) , application-specific integrated circuits, application-specific instruction-set processors, field-programmable gate arrays, graphics processing units, physics processing units, digital signal processing units, coprocessors, network processing units, audio processing units, encryption processing units, and the like.
  • memory (e.g., the memory 118, 504, or 604) can include, but is not limited to, any type of disk including floppy disks, optical discs, DVD, CD-ROMs, microdrive, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, DRAMs, VRAMs, DDR RAMs, flash memory devices, magnetic or optical cards, nanosystems (including molecular memory ICs) , or any type of media or device suitable for storing instructions and/or data.
  • features of the present invention can be incorporated in software and/or firmware for controlling the hardware of a processing system, and for enabling a processing system to interact with other mechanisms utilizing the results of the present invention.
  • software or firmware may include, but is not limited to, application code, device drivers, operating systems and execution environments/containers.
  • Communication systems as referred to herein optionally communicate via wired and/or wireless communication connections.
  • communication systems optionally receive and send RF signals, also called electromagnetic signals.
  • RF circuitry of the communication systems convert electrical signals to/from electromagnetic signals and communicate with communications networks and other communications devices via the electromagnetic signals.
  • RF circuitry optionally includes well-known circuitry for performing these functions, including but not limited to an antenna system, an RF transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a CODEC chipset, a subscriber identity module (SIM) card, memory, and so forth.
  • Communication systems optionally communicate with networks, such as the Internet, also referred to as the World Wide Web (WWW) , an intranet and/or a wireless network, such as a cellular telephone network, a wireless local area network (LAN) and/or a metropolitan area network (MAN) , and other devices by wireless communication.
  • Wireless communication connections optionally use any of a plurality of communications standards, protocols and technologies, including but not limited to Global System for Mobile Communications (GSM) , Enhanced Data GSM Environment (EDGE) , high-speed downlink packet access (HSDPA) , high-speed uplink packet access (HSUPA) , Evolution, Data-Only (EV-DO) , HSPA, HSPA+, Dual-Cell HSPA (DC-HSPDA) , long term evolution (LTE) , near field communication (NFC) , wideband code division multiple access (W-CDMA) , code division multiple access (CDMA) , time division multiple access (TDMA) , Bluetooth, Wireless Fidelity (Wi-Fi) (e.g., IEEE 802.11a, IEEE 802.11ac, IEEE 802.11ax, IEEE 802.11b, IEEE 802.11g and/or IEEE 802.11n) , voice over Internet Protocol (VoIP) , Wi-MAX, a protocol for e-mail
  • the term “if” may be construed to mean “when” or “upon” or “in response to determining” or “in accordance with a determination” or “in response to detecting, ” that a stated condition precedent is true, depending on the context.
  • the phrase “if it is determined [that a stated condition precedent is true] ” or “if [a stated condition precedent is true] ” or “when [a stated condition precedent is true] ” may be construed to mean “upon determining” or “in response to determining” or “in accordance with a determination” or “upon detecting” or “in response to detecting” that the stated condition precedent is true, depending on the context.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Studio Devices (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

A method performed by an unmanned aerial vehicle (UAV) (102) includes: the UAV (102) receives from a computing device (126) that is communicatively connected to the UAV (102) a first input that includes identification of a target object (106) (2006); in response to the first input, the UAV (102) determines a target type corresponding to the target object (106) (2010); the UAV (102) determines a distance between the UAV (102) and the target object (106) (2012); the UAV (102) also selects automatically, from a plurality of predefined flight routes, a flight route for the UAV (102) according to the determined target type and the distance (2014); the selected flight route includes a plurality of paths of different trajectory modes (2020); the UAV (102) sends to the computing device (126) the selected flight route for display on the computing device (126) (2032).

Description

Systems and Methods for Supporting Automatic Video Capture and Video Editing TECHNICAL FIELD
The disclosed implementations relate generally to automatic video capture and video editing and more specifically, to systems, methods, and user interfaces that enable automatic video capture and video editing for UAV aerial photography.
BACKGROUND
Movable objects can be used for performing surveillance, reconnaissance, and exploration tasks for military and civilian applications. An unmanned aerial vehicle (UAV) is an example of a movable object. A movable object may carry a payload for performing specific functions such as capturing images and video of a surrounding environment of the movable object or for tracking a specific target. For example, a movable object may track a target object moving on the ground or in the air. Movement control information for controlling a movable object is typically received by the movable object from a remote device and/or determined by the movable object.
SUMMARY
The advancement of UAV technology has enabled UAV aerial photography and videography. UAV aerial photography involves a series of operations, such as camera settings, gimbal control, joystick control, and image composition and view finding. If a user desires to use the UAV to capture smooth videos with beautiful image composition, the user needs to adjust numerous parameters for the camera, gimbal, joystick, and image composition and view finding. This control process is relatively complex. Thus, it is challenging for a user who is not familiar with aerial photography operations to determine satisfactory parameters in a short amount of time. Furthermore, if a user wants to capture a video that contains multiple shots of various scenes and edit them into a visually appealing and logically coherent video, the user will still need to manually capture each of the scenes and then combine and edit the scenes.
Accordingly, there is a need for improved systems and methods that support automatic capture and editing for UAV aerial photography and videography. Such systems and methods optionally complement or replace conventional methods for target tracking, image or video capture, and/or image or video editing.
In accordance with some implementations, a method is performed by an unmanned aerial vehicle (UAV) . The UAV receives, from a computing device that is communicatively connected to the UAV, a first input that includes identification of a target object. In response to the first input, the UAV determines a target type corresponding to the target object. The UAV also determines a distance between the UAV and the target object. The UAV selects automatically, from a plurality of predefined flight routes, a flight route for the UAV. The UAV sends to the computing device the selected flight route for display on the computing device.
In some implementations, the selected flight route includes a plurality of paths of different trajectory modes.
In some implementations, after the sending, the UAV receives from the computing device a second input. In response to the second input, the UAV controls the UAV to fly autonomously according to the selected flight route, including capturing by an image sensor of the UAV a video feed having a field of view of the image sensor and corresponding to each path of the plurality of paths.
In some instances, the UAV simultaneously stores the video feed while the video feed is being captured.
In some instances, the UAV stores with the video feed tag information associated with the flight path and trajectory mode corresponding to a respective segment of the video feed.
In some instances, the UAV simultaneously sends the video feed and tag information associated with the flight path and trajectory mode corresponding to a respective segment of the video feed to the computing device for storage on the remote control device.
In some implementations, the plurality of predefined flight routes include: a portrait flight route, a long range flight route, and a normal flight route.
In some implementations, each of the plurality of paths comprises a respective one or more of: a path distance, a velocity of the UAV, an acceleration of the UAV, a flight time, an angle of view, a starting altitude, an ending altitude, a pan tilt zoom (PTZ) setting of the image sensor, an optical zoom setting of the image sensor, a digital zoom setting of the image sensor, and a focal length of the image sensor.
In some implementations, sending to the computing device the selected flight route further comprises causing to be displayed on the computing device a preview of the selected flight route.
In some instances, the preview comprises a three-dimensional or two-dimensional representation of the selected flight route.
In some instances, the preview comprises a map of a vicinity of the UAV and the target object.
In some implementations, the trajectory modes include a first trajectory mode, corresponding to: the UAV traveling in a first direction; and a field of view of an image sensor of the UAV facing a second direction, opposite to the first direction.
In some implementations, the trajectory modes include a first trajectory mode, corresponding to: the UAV traveling in a first direction; and a field of view of an image sensor of the UAV facing the first direction.
In some instances, the method further comprises rotating the field of view of an image sensor of the UAV from the first direction to a second direction, distinct from the first direction, while executing the first trajectory mode.
In some implementations, the trajectory modes include a first trajectory mode, corresponding to: the UAV traveling in a first direction; and a field of view of an image sensor of the UAV facing a second direction, perpendicular to the first direction.
In accordance with another aspect of the present disclosure, a method for editing a video is performed at a computing device. The computing device has one or more processors and memory storing programs to be executed by the one or more processors. The video includes a plurality of video segments captured by an unmanned aerial vehicle (UAV) during a flight route associated with a target object. Each of the video segments corresponds to a respective path of the flight route. The computing device obtains a set of tag information for each of the plurality of video segments. The computing device selects, from a plurality of video templates, a video template for the video. The computing device extracts, from the plurality of video segments, one or more video sub-segments according to the tag information and the selected video template. The computing device combines the extracted video sub-segments into a complete video of the flight route of the UAV.
In some instances, the selected video template includes a plurality of scenes, each of the scenes corresponding to respective subset of the tag information.
In some instances, the plurality of scenes include: an opening scene, one or more intermediate scenes, and a concluding scene.
In some implementations, the selected video template comprises a theme, and the selected video template includes music that matches the theme.
In some implementations, the extracted video sub-segments are combined according to a time sequence in which the video sub-segments are captured.
In some implementations, the extracted video sub-segments are combined in a time sequence that is defined by the selected video template.
In some implementations, the method further comprises: prior to the extracting, receiving a user input specifying a total time duration of the video.
In some instances, the method further comprises automatically allocating a time for the video sub-segment.
In some implementations, the plurality of video templates comprise a plurality of themes.
In some implementations, the selected video template is selected based on a user input.
In some implementations, the flight route is one of a plurality of predefined flight routes.
In some implementations, the plurality of video templates are determined based on the flight route.
In some implementations, a UAV comprises an image sensor, one or more processors, memory, and one or more programs stored in the memory. The programs are configured for execution by the one or more processors. The one or more programs include instructions for performing any of the methods described herein.
In some implementations, a computing device includes one or more processors, memory, and one or more programs stored in the memory. The programs are configured for execution by the one or more processors. The one or more programs include instructions for performing any of the methods described herein.
In some implementations, a non-transitory computer-readable storage medium stores one or more programs configured for execution by a UAV having one or more processors and memory. The one or more programs include instructions for performing any of the methods described herein.
In some implementations, a non-transitory computer-readable storage medium stores one or more programs configured for execution by a computing device having one or more processors and memory. The one or more programs include instructions for performing any of the methods described herein.
Thus, methods, systems, and graphical user interfaces are disclosed that support automatic video capture and video editing for UAV aerial photography.
BRIEF DESCRIPTION OF THE DRAWINGS
For a better understanding of the aforementioned systems, methods, and graphical user interfaces, as well as additional systems, methods, and graphical user interfaces that provide UAV video capture and video editing, reference should be made to the Description of Implementations below, in conjunction with the following drawings in which like reference numerals refer to corresponding parts throughout the figures.
Figure 1 illustrates an exemplary target tracking system according to some implementations.
Figures 2A to 2C illustrate respectively, an exemplary movable object, an exemplary carrier of a movable object, and an exemplary payload of a movable object according to some implementations.
Figure 3 illustrates an exemplary sensing system of a movable object according to some implementations.
Figure 4 is a block diagram illustrating an exemplary memory 118 of a movable object 102 according to some implementations.
Figure 5 illustrates an exemplary control unit of a target tracking system according to some implementations.
Figure 6 illustrates an exemplary computing device for controlling a movable object according to some implementations.
Figure 7 illustrates an exemplary configuration 700 of a movable object 102, a carrier 108, and a payload 110 according to some implementations.
Figures 8A and 8B illustrate process flows between a user, a computing device 126, and a movable object 102 according to some implementations.
Figure 9 provides a screen shot for selecting an object of interest as a target object according to some implementations.
Figure 10 provides a screen shot for displaying flight range information according to some implementations.
Figure 11 provides a screen shot for execution of a flight route by a movable object according to some implementations.
Figure 12 provides a screen shot during a video editing process according to some implementations.
Figure 13 illustrates a flight route matching strategy according to some implementations.
Figure 14 illustrates an exemplary normal flight route according to some implementations.
Figure 15 illustrates an exemplary portrait flight route according to some implementations.
Figure 16 illustrates an exemplary long range flight route according to some implementations.
Figure 17 illustrates an exemplary video segment download and extraction process at a computing device, in accordance with some implementations.
Figure 18 illustrates an exemplary video template matching strategy in accordance with some implementations.
Figure 19 illustrates another exemplary video template matching strategy in accordance with some implementations.
Figures 20A-20C provide a flowchart of a method that is performed by a UAV according to some implementations.
Figures 21A-21C provide a flowchart of a method for editing a video according to some implementations.
DESCRIPTION OF IMPLEMENTATIONS
Reference will now be made in detail to implementations, examples of which are illustrated in the accompanying drawings. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that the present invention may be practiced without requiring these specific details. In other instances, well-known methods, procedures, components, circuits, and networks have not been described in detail so as not to unnecessarily obscure aspects of the implementations.
The following description uses an unmanned aerial vehicle (UAV) as an example of a movable object. UAVs include, e.g., fixed-wing aircrafts and rotary-wing aircrafts such as helicopters, quadcopters, and aircraft having other numbers and/or configurations of rotors. It will be apparent to those skilled in the art that other types of movable objects may be substituted for UAVs as described below in accordance with implementations of the invention.
Figure 1 illustrates an exemplary target tracking system 100 according to some implementations. The target tracking system 100 includes a movable object 102 (e.g., a UAV) and a control unit 104. In some implementations, the target tracking system 100 is used for tracking a target object 106 and/or for initiating tracking of the target object 106.
In some implementations, the target object 106 includes natural and/or man-made objects, such as geographical landscapes (e.g., mountains, vegetation, valleys, lakes, and/or rivers) , buildings, and/or vehicles (e.g., aircrafts, ships, cars, trucks, buses, vans, and/or motorcycles) . In some implementations, the target object 106 includes live subjects such as people and/or animals. In some implementations, the target object 106 is a moving object, e.g., moving relative to a reference frame (such as the Earth and/or movable object 102) . In some implementations, the target object 106 is static. In some implementations, the target object 106 includes an active positioning and navigational system (e.g., a GPS system) that transmits information (e.g., location, positioning, and/or velocity information) about the target object 106 to the movable object 102, a control unit 104, and/or a computing device 126. For example, information may be transmitted to the movable object 102 via wireless communication from a communication unit of the target object 106 to a communication system 120 of the movable object 102, as illustrated in Figure 2A.
In some implementations, the movable object 102 includes a carrier 108 and/or a payload 110. The carrier 108 is used to couple the payload 110 to the movable object 102. In some implementations, the carrier 108 includes an element (e.g., a gimbal and/or damping element) to isolate the payload 110 from movement of the movable object 102. In some implementations, the carrier 108 includes an element for controlling movement of the payload 110 relative to the movable object 102.
In some implementations, the payload 110 is coupled (e.g., rigidly coupled) to the movable object 102 (e.g., coupled via the carrier 108) such that the payload 110 remains substantially stationary relative to movable object 102. For example, the carrier 108 may be coupled to the payload 110 such that the payload is not movable relative to the movable object 102. In some implementations, the payload 110 is mounted directly to the movable object 102 without requiring the carrier 108. In some implementations, the payload 110 is located partially or fully within the movable object 102.
In some implementations, the movable object 102 is configured to communicate with the control unit 104, e.g., via wireless communications 124. For example, the movable object 102 may receive control instructions from the control unit 104 and/or send data (e.g., data from a movable object sensing system 122, Figure 2A) to the control unit 104.
In some implementations, the control instructions may include, e.g., navigation instructions for controlling one or more navigational parameters of the movable object 102 such as a position, an orientation, an altitude, an attitude (e.g., aviation) and/or one or more movement characteristics of the movable object 102. In some implementations, the control instructions may include instructions for controlling one or more parameters of a carrier 108 and/or a payload 110. In some implementations, the control instructions include instructions for directing movement of one or more of movement mechanisms 114 (Figure 2A) of the movable object 102. For example, the control instructions may be used to control a flight of the movable object 102. In some implementations, the control instructions may include information for controlling operations (e.g., movement) of the carrier 108. For example, the control instructions may be used to control an actuation mechanism of the carrier 108 so as to cause angular and/or linear movement of the payload 110 relative to the movable object 102. In some implementations, the control instructions are used to adjust one or more operational parameters for the payload 110, such as instructions for capturing one or more images, capturing video, adjusting a zoom level, powering on or off a component of the payload, adjusting an imaging mode (e.g., capturing still images or capturing video) , adjusting an image resolution, adjusting a focus, adjusting a viewing angle, adjusting a field of view, adjusting a depth of field, adjusting an exposure time, adjusting a shutter speed,  adjusting a lens speed, adjusting an ISO, changing a lens and/or moving the payload 110 (and/or a part of payload 110, such as imaging device 214 (shown in Figure 2C) ) . In some implementations, the control instructions are used to control the communication system 120, the sensing system 122, and/or another component of the movable object 102.
In some implementations, the control instructions from the control unit 104 may include target information.
In some implementations, the movable object 102 is configured to communicate with a computing device 126 (e.g., an electronic device) . For example, the movable object 102 receives control instructions from the computing device 126 and/or sends data (e.g., data from the movable object sensing system 122) to the computing device 126. In some implementations, communications from the computing device 126 to the movable object 102 are transmitted from computing device 126 to a cell tower 130 (e.g., via internet 128) and from the cell tower 130 to the movable object 102 (e.g., via RF signals) . In some implementations, a satellite is used in lieu of or in addition to cell tower 130.
In some implementations, the target tracking system 100 includes additional control units 104 and/or computing devices 126 that are configured to communicate with the movable object 102.
Figure 2A illustrates an exemplary movable object 102 according to some implementations. In some implementations, the movable object 102 includes processor (s) 116, memory 118, a communication system 120, and a sensing system 122, which are connected by data connections such as a control bus 112. The control bus 112 optionally includes circuitry (sometimes called a chipset) that interconnects and controls communications between system components.
In some implementations, the movable object 102 is a UAV and includes components to enable flight and/or flight control. Although the movable object 102 is  depicted as an aircraft in this example, this depiction is not intended to be limiting, and any suitable type of movable object may be used.
In some implementations, the movable object 102 includes movement mechanisms 114 (e.g., propulsion mechanisms) . Although the plural term “movement mechanisms” is used herein for convenience of reference, “movement mechanisms 114” may refer to a single movement mechanism (e.g., a single propeller) or multiple movement mechanisms (e.g., multiple rotors) . The movement mechanisms 114 may include one or more movement mechanism types such as rotors, propellers, blades, engines, motors, wheels, axles, magnets, and nozzles. The movement mechanisms 114 are coupled to the movable object 102 at, e.g., the top, bottom, front, back, and/or sides. In some implementations, the movement mechanisms 114 of a single movable object 102 may include multiple movement mechanisms each having the same type. In some implementations, the movement mechanisms 114 of a single movable object 102 include multiple movement mechanisms with different movement mechanism types. The movement mechanisms 114 are coupled to the movable object 102 using any suitable means, such as support elements (e.g., drive shafts) or other actuating elements (e.g., one or more actuators 132) . For example, the actuator 132 (e.g., movable object actuator) receives control signals from processor (s) 116 (e.g., via control bus 112) that activates the actuator to cause movement of a movement mechanism 114. For example, the processor (s) 116 include an electronic speed controller that provides control signals to the actuators 132.
In some implementations, the movement mechanisms 114 enable the movable object 102 to take off vertically from a surface or land vertically on a surface without requiring any horizontal movement of the movable object 102 (e.g., without traveling down a runway) . In some implementations, the movement mechanisms 114 are operable to permit the movable object 102 to hover in the air at a specified position and/or orientation. In some implementations, one or more of the movement mechanisms 114 are controllable independently of one or more of the other movement mechanisms 114. For example, when  the movable object 102 is a quadcopter, each rotor of the quadcopter is controllable independently of the other rotors of the quadcopter. In some implementations, multiple movement mechanisms 114 are configured for simultaneous movement.
In some implementations, the movement mechanisms 114 include multiple rotors that provide lift and/or thrust to the movable object 102. The multiple rotors are actuated to provide, e.g., vertical takeoff, vertical landing, and hovering capabilities to the movable object 102. In some implementations, one or more of the rotors spin in a clockwise direction, while one or more of the rotors spin in a counterclockwise direction. For example, the number of clockwise rotors is equal to the number of counterclockwise rotors. In some implementations, the rotation rate of each of the rotors is independently variable, e.g., for controlling the lift and/or thrust produced by each rotor, and thereby adjusting the spatial disposition, velocity, and/or acceleration of the movable object 102 (e.g., with respect to up to three degrees of translation and/or up to three degrees of rotation) .
In some implementations, the memory 118 stores one or more instructions, programs (e.g., sets of instructions) , modules, controlling systems and/or data structures, collectively referred to as “elements” herein. One or more elements described with regard to the memory 118 are optionally stored by the control unit 104, the computing device 126, and/or another device. In some implementations, an imaging device 214 (Figure 2C) includes memory that stores one or more parameters described with regard to the memory 118.
In some implementations, the memory 118 stores a controlling system configuration that includes one or more system settings (e.g., as configured by a manufacturer, administrator, and/or user) . For example, identifying information for the movable object 102 is stored as a system setting of the system configuration. In some implementations, the controlling system configuration includes a configuration for the imaging device 214. The configuration for the imaging device 214 stores parameters such as position (e.g., relative to the image sensor 216) , a zoom level and/or focus parameters (e.g., amount of focus, selecting  autofocus or manual focus, and/or adjusting an autofocus target in an image) . Imaging property parameters stored by the imaging device configuration include, e.g., image resolution, image size (e.g., image width and/or height) , aspect ratio, pixel count, quality, focus distance, depth of field, exposure time, shutter speed, and/or white balance. In some implementations, parameters stored by the imaging device configuration are updated in response to control instructions (e.g., generated by processor (s) 116 and/or received by the movable object 102 from the control unit 104 and/or the computing device 126) . In some implementations, parameters stored by the imaging device configuration are updated in response to information received from the movable object sensing system 122 and/or the imaging device 214.
In some implementations, the carrier 108 is coupled to the movable object 102 and a payload 110 is coupled to the carrier 108. In some implementations, the carrier 108 includes one or more mechanisms that enable the payload 110 to move relative to the movable object 102, as described further with respect to Figure 2B. In some implementations, the payload 110 is rigidly coupled to the movable object 102 such that the payload 110 remains substantially stationary relative to the movable object 102. For example, the carrier 108 is coupled to the payload 110 such that the payload 110 is not movable relative to the movable object 102. In some implementations, the payload 110 is coupled to the movable object 102 without requiring the use of the carrier 108.
As further depicted in Figure 2A, the movable object 102 also includes the communication system 120, which enables communication between the movable object 102 and the control unit 104 and/or the computing device 126 (e.g., via wireless signals 124) . In some implementations, the communication system 120 includes transmitters, receivers, and/or transceivers for wireless communication. In some implementations, the communication is a one-way communication, such that data is transmitted only from the movable object 102 to the control unit 104, or vice-versa. In some implementations, communication is a two-way communication, such that data is transmitted from the movable object 102 to the control unit 104, as well as from the control unit 104 to the movable object 102.
In some implementations, the movable object 102 communicates with the computing device 126. In some implementations, the movable object 102, the control unit 104, and/or the computing device 126 are connected to the Internet or other telecommunications network, e.g., such that data generated by the movable object 102, the control unit 104, and/or the computing device 126 is transmitted to a server for data storage and/or data retrieval (e.g., for display by a website) . In some implementations, data generated by the movable object 102, the control unit 104, and/or the computing device 126 is stored locally on each of the respective devices.
In some implementations, the movable object 102 comprises a sensing system (e.g., the movable object sensing system 122) that includes one or more sensors, as described further with reference to Figure 3. In some implementations, the movable object 102 and/or the control unit 104 use sensing data generated by sensors of the sensing system 122 to determine information such as a position of the movable object 102, an orientation of the movable object 102, movement characteristics of the movable object 102 (e.g., an angular velocity, an angular acceleration, a translational velocity, a translational acceleration and/or a direction of motion along one or more axes) , a distance between the movable object 102 and a target object, proximity (e.g., distance) of the movable object 102 to potential obstacles, weather conditions, locations of geographical features and/or locations of manmade structures.
Figure 2B illustrates an exemplary carrier 108 according to some implementations. In some implementations, the carrier 108 couples the payload 110 to the movable object 102.
In some implementations, the carrier 108 includes a frame assembly having one or more frame members 202. In some implementations, the frame member (s) 202 are  coupled with the movable object 102 and the payload 110. In some implementations, the frame member (s) 202 support the payload 110.
In some implementations, the carrier 108 includes one or more mechanisms, such as one or more actuators 204, to cause movement of the carrier 108 and/or the payload 110. In some implementations, the actuator 204 is, e.g., a motor, such as a hydraulic, pneumatic, electric, thermal, magnetic, and/or mechanical motor. In some implementations, the actuator 204 causes movement of the frame member (s) 202. In some implementations, the actuator 204 rotates the payload 110 with respect to one or more axes, such as one or more of: an X axis ( “pitch axis” ) , a Z axis ( “roll axis” ) , and a Y axis ( “yaw axis” ) , relative to the movable object 102. In some implementations, the actuator 204 translates the payload 110 along one or more axes relative to the movable object 102.
In some implementations, the carrier 108 includes a carrier sensing system 206 for determining a state of the carrier 108 or the payload 110. The carrier sensing system 206 includes one or more of: motion sensors (e.g., accelerometers) , rotation sensors (e.g., gyroscopes) , potentiometers, and/or inertial sensors. In some implementations, the carrier sensing system 206 includes one or more sensors of the movable object sensing system 122 as described below with respect to Figure 3. Sensor data determined by the carrier sensing system 206 may include spatial disposition (e.g., position, orientation, or attitude) , movement information such as velocity (e.g., linear or angular velocity) and/or acceleration (e.g., linear or angular acceleration) of the carrier 108 and/or the payload 110. In some implementations, the sensing data as well as state information calculated from the sensing data are used as feedback data to control the movement of one or more components (e.g., the frame member (s) 202, the actuator 204, and/or the damping element 208) of the carrier 108. In some implementations, the carrier sensing system 206 is coupled to the frame member (s) 202, the actuator 204, the damping element 208, and/or the payload 110. In some instances, a sensor in the carrier sensing system 206 (e.g., a potentiometer) may measure movement of the actuator 204 (e.g., the relative positions of a motor rotor and a motor stator) and generate a position signal representative of the movement of the actuator 204 (e.g., a position signal representative of relative positions of the motor rotor and the motor stator) . In some implementations, data generated by the sensors is received by the processor (s) 116 and/or the memory 118 of the movable object 102.
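By way of illustration only, the following sketch (in Python, with an assumed proportional gain and rate limit that are not specified in this disclosure) shows how sensor data from the carrier sensing system could be used as feedback to drive an actuator toward a commanded payload angle:

    def gimbal_feedback_step(commanded_angle_deg: float,
                             measured_angle_deg: float,
                             gain: float = 0.5,
                             max_rate_deg_s: float = 90.0) -> float:
        """Return an actuator rate command from one feedback iteration.

        measured_angle_deg would come from a carrier sensing system sensor
        (e.g., a potentiometer reporting the rotor/stator position of the actuator).
        """
        error = commanded_angle_deg - measured_angle_deg
        rate = gain * error
        # Saturate to the actuator's assumed rate limit.
        return max(-max_rate_deg_s, min(max_rate_deg_s, rate))

    # Example: the payload should point 10 degrees down in pitch, the sensor reads -4.
    print(gimbal_feedback_step(-10.0, -4.0))  # -> -3.0 deg/s toward the commanded angle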
In some implementations, the coupling between the carrier 108 and the movable object 102 includes one or more damping elements 208. The damping element (s) 208 are configured to reduce or eliminate movement of the load (e.g., the payload 110 and/or the carrier 108) caused by movement of the movable object 102. The damping element (s) 208 may include active damping elements, passive damping elements, and/or hybrid damping elements having both active and passive damping characteristics. The motion damped by the damping element (s) 208 may include vibrations, oscillations, shaking, and/or impacts. Such motions may originate from motions of the movable object 102, which are transmitted to the payload 110. For example, the motion may include vibrations caused by the operation of a propulsion system and/or other components of the movable object 102.
In some implementations, the damping element (s) 208 provide motion damping by isolating the payload 110 from the source of unwanted motion and/or by dissipating or reducing the amount of motion transmitted to the payload 110 (e.g., vibration isolation) . In some implementations, the damping element (s) 208 reduce a magnitude (e.g., an amplitude) of the motion that would otherwise be experienced by the payload 110. In some implementations, the motion damping applied by the damping element (s) 208 is used to stabilize the payload 110, thereby improving the quality of video and/or images captured by the payload 110 (e.g., using the imaging device 214, Figure 2C) . In some implementations, the improved video and/or image quality reduces the computational complexity of processing steps required to generate an edited video based on the captured video, or to generate a panoramic image based on the captured images.
In some implementations, the damping element (s) 208 may be manufactured using any suitable material or combination of materials, including solid, liquid, or gaseous materials. The materials used for the damping element (s) 208 may be compressible and/or deformable. In one example, the damping element (s) 208 may be made of sponge, foam, rubber, gel, and the like. In another example, the damping element (s) 208 may include rubber balls that are substantially spherical in shape. In other instances, the damping element (s) 208 may be substantially spherical, rectangular, and/or cylindrical in shape. In some implementations, the damping element (s) 208 may include piezoelectric materials or shape memory materials. In some implementations, the damping element (s) 208 may include one or more mechanical elements, such as springs, pistons, hydraulics, pneumatics, dashpots, shock absorbers, and/or isolators. In some implementations, properties of the damping element (s) 208 are selected so as to provide a predetermined amount of motion damping. In some instances, the damping element (s) 208 have viscoelastic properties. The properties of damping element (s) 208 may be isotropic or anisotropic. In some implementations, the damping element (s) 208 provide motion damping equally along all directions of motion. In some implementations, the damping element (s) 208 provide motion damping only along a subset of the directions of motion (e.g., along a single direction of motion) . For example, the damping element (s) 208 may provide damping primarily along the Y (yaw) axis. In this manner, the illustrated damping element (s) 208 reduce vertical motions.
In some implementations, the carrier 108 further includes a controller 210. The controller 210 may include one or more controllers and/or processors. In some implementations, the controller 210 receives instructions from the processor (s) 116 of the movable object 102. For example, the controller 210 may be connected to the processor (s) 116 via the control bus 112. In some implementations, the controller 210 may control movement of the actuator 204, adjust one or more parameters of the carrier sensing system 206, receive data from the carrier sensing system 206, and/or transmit data to the processor (s) 116.
Figure 2C illustrates an exemplary payload 110 according to some implementations. In some implementations, the payload 110 includes a payload sensing system 212 and a controller 218. The payload sensing system 212 may include an imaging device 214 (e.g., a camera) having an image sensor 216 with a field of view. In some implementations, the payload sensing system 212 includes one or more sensors of the movable object sensing system 122, as described below with respect to Figure 3.
The payload sensing system 212 generates static sensing data (e.g., a single image captured in response to a received instruction) and/or dynamic sensing data (e.g., a series of images captured at a periodic rate, such as a video) .
The image sensor 216 is, e.g., a sensor that detects light, such as visible light, infrared light, and/or ultraviolet light. In some implementations, the image sensor 216 includes, e.g., semiconductor charge-coupled device (CCD) , active pixel sensors using complementary metal-oxide-semiconductor (CMOS) or N-type metal-oxide-semiconductor (NMOS, Live MOS) technologies, or any other types of sensors. The image sensor 216 and/or the imaging device 214 captures images or image streams (e.g., videos) . Adjustable parameters of the imaging device 214 include, e.g., width, height, aspect ratio, pixel count, resolution, quality, imaging mode, focus distance, depth of field, exposure time, shutter speed and/or lens configuration. In some implementations, the imaging device 214 may be configured to capture videos and/or images at different resolutions (e.g., low, medium, high, or ultra-high resolutions, and/or high-definition or ultra-high-definition videos such as 720p, 1080i, 1080p, 1440p, 2000p, 2160p, 2540p, 4000p, and 4320p) .
In some implementations, the payload 110 includes the controller 218. The controller 218 may include one or more controllers and/or processors. In some implementations, the controller 218 receives instructions from the processor (s) 116 of the  movable object 102. For example, the controller 218 is connected to the processor (s) 116 via the control bus 112. In some implementations, the controller 218 may adjust one or more parameters of one or more sensors of the payload sensing system 212, receive data from one or more sensors of payload sensing system 212, and/or transmit data, such as image data from the image sensor 216, to the processor (s) 116, the memory 118, and/or the control unit 104.
In some implementations, data generated by one or more sensors of the payload sensing system 212 is stored, e.g., by the memory 118. In some implementations, data generated by the payload sensing system 212 is transmitted to the control unit 104 (e.g., via the communication system 120) . For example, video is streamed from the payload 110 (e.g., the imaging device 214) to the control unit 104. In this manner, the control unit 104 displays, e.g., real-time (or slightly delayed) video received from the imaging device 214.
In some implementations, an adjustment of the orientation, position, altitude, and/or one or more movement characteristics of the movable object 102, the carrier 108, and/or the payload 110 is generated (e.g., by the processor (s) 116) based at least in part on configurations (e.g., preset and/or user configured in the system configuration 400, Figure 4) of the movable object 102, the carrier 108, and/or the payload 110. For example, an adjustment that involves a rotation with respect to two axes (e.g., yaw and pitch) is achieved solely by a corresponding rotation of the movable object 102 around the two axes if the payload 110 including the imaging device 214 is rigidly coupled to the movable object 102 (and hence not movable relative to the movable object 102) and/or the payload 110 is coupled to the movable object 102 via a carrier 108 that does not permit relative movement between the imaging device 214 and the movable object 102. The same two-axis adjustment may be achieved by, e.g., combining adjustments of both the movable object 102 and the carrier 108 if the carrier 108 permits the imaging device 214 to rotate around at least one of the two axes relative to the movable object 102. In this case, the carrier 108 can be controlled to implement the rotation around one or both of the two axes required for the adjustment, and the movable object 102 can be controlled to implement the rotation around the remaining axis, if any. In some implementations, the carrier 108 may include a one-axis gimbal that allows the imaging device 214 to rotate around one of the two axes required for the adjustment while the rotation around the remaining axis is achieved by the movable object 102. In some implementations, the same two-axis adjustment is achieved by the carrier 108 alone when the carrier 108 permits the imaging device 214 to rotate around two or more axes relative to the movable object 102. In some implementations, the carrier 108 may include a two-axis or three-axis gimbal that enables the imaging device 214 to rotate around two or all three axes.
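By way of illustration only, the following sketch (in Python) shows one way the two-axis adjustment described above could be split between the carrier and the movable object, depending on which axes the carrier's gimbal supports; the axis names and data structures are hypothetical:

    def split_adjustment(required: dict, gimbal_axes: set) -> tuple:
        """Split a required rotation between the carrier 108 and the movable object 102.

        required: e.g. {"yaw": 15.0, "pitch": -5.0} in degrees.
        gimbal_axes: axes the carrier permits the imaging device to rotate around,
                     e.g. set() for a rigid mount, {"pitch"} for a one-axis gimbal,
                     {"yaw", "pitch", "roll"} for a three-axis gimbal.
        """
        carrier_cmd = {axis: angle for axis, angle in required.items() if axis in gimbal_axes}
        body_cmd = {axis: angle for axis, angle in required.items() if axis not in gimbal_axes}
        return carrier_cmd, body_cmd

    # One-axis (pitch) gimbal: pitch is handled by the carrier, yaw by the aircraft body.
    print(split_adjustment({"yaw": 15.0, "pitch": -5.0}, {"pitch"}))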
Figure 3 illustrates an exemplary sensing system 122 of a movable object 102 according to some implementations. In some implementations, one or more sensors of the movable object sensing system 122 are mounted to an exterior of, located within, or otherwise coupled to the movable object 102. In some implementations, one or more sensors of the movable object sensing system 122 are components of the carrier sensing system 206 and/or the payload sensing system 212. Where sensing operations are described herein as being performed by the movable object sensing system 122, it will be recognized that such operations are optionally performed by the carrier sensing system 206 and/or the payload sensing system 212.
In some implementations, the movable object sensing system 122 generates static sensing data (e.g., a single image captured in response to a received instruction) and/or dynamic sensing data (e.g., a series of images captured at a periodic rate, such as a video) .
In some implementations, the movable object sensing system 122 includes one or more image sensors 302, such as image sensor 308 (e.g., a left stereographic image sensor) and/or image sensor 310 (e.g., a right stereographic image sensor) . The image sensors 302 capture, e.g., images, image streams (e.g., videos) , stereographic images, and/or stereographic image streams (e.g., stereographic videos) . The image sensors 302 detect light, such as visible light, infrared light, and/or ultraviolet light. In some implementations, the movable object sensing system 122 includes one or more optical devices (e.g., lenses) to focus or  otherwise alter the light onto the one or more image sensors 302. In some implementations, the image sensors 302 include, e.g., semiconductor charge-coupled devices (CCD) , active pixel sensors using complementary metal-oxide-semiconductor (CMOS) or N-type metal-oxide-semiconductor (NMOS, Live MOS) technologies, or any other types of sensors.
In some implementations, the movable object sensing system 122 includes one or more audio transducers 304. The audio transducers 304 may include an audio output transducer 312 (e.g., a speaker) , and an audio input transducer 314 (e.g. a microphone, such as a parabolic microphone) . In some implementations, the audio output transducer 312 and the audio input transducer 314 are used as components of a sonar system for tracking a target object (e.g., detecting location information of a target object) .
In some implementations, the movable object sensing system 122 includes one or more infrared sensors 306. In some implementations, a distance measurement system includes a pair of infrared sensors, e.g., an infrared sensor 316 (such as a left infrared sensor) and an infrared sensor 318 (such as a right infrared sensor) , or another sensor or sensor pair. The distance measurement system is used for measuring a distance between the movable object 102 and the target object 106.
In some implementations, the movable object sensing system 122 may include other sensors for sensing a distance between the movable object 102 and the target object 106, such as a Radio Detection and Ranging (RADAR) sensor, a Light Detection and Ranging (LiDAR) sensor, or any other distance sensor.
In some implementations, a system to produce a depth map includes one or more sensors or sensor pairs of the movable object sensing system 122 (such as a left stereographic image sensor 308 and a right stereographic image sensor 310; an audio output transducer 312 and an audio input transducer 314; and/or a left infrared sensor 316 and a right infrared sensor 318) . In some implementations, a pair of sensors in a stereo data system (e.g., a stereographic imaging system) simultaneously captures data from different positions. In some implementations, a depth map is generated by a stereo data system using the simultaneously captured data. In some implementations, a depth map is used for positioning and/or detection operations, such as detecting a target object 106, and/or detecting current location information of a target object 106.
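By way of illustration only, the following sketch (in Python) shows the standard pinhole-camera relation that underlies a depth map computed from a rectified stereo pair; the focal length, baseline, and disparity values in the example are hypothetical:

    def stereo_depth_m(focal_length_px: float,
                       baseline_m: float,
                       disparity_px: float) -> float:
        """Depth of a scene point observed by a rectified stereo pair.

        Uses the standard relation depth = focal_length * baseline / disparity;
        applying it per pixel over matched points yields a depth map.
        """
        if disparity_px <= 0:
            raise ValueError("disparity must be positive for a finite depth")
        return focal_length_px * baseline_m / disparity_px

    # Example: 700 px focal length, 12 cm sensor baseline, 20 px disparity -> 4.2 m.
    print(stereo_depth_m(700.0, 0.12, 20.0))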
In some implementations, the movable object sensing system 122 includes one or more global positioning system (GPS) sensors, motion sensors (e.g., accelerometers) , rotation sensors (e.g., gyroscopes) , inertial sensors, proximity sensors (e.g., infrared sensors) and/or weather sensors (e.g., pressure sensor, temperature sensor, moisture sensor, and/or wind sensor) .
In some implementations, sensing data generated by one or more sensors of the movable object sensing system 122 and/or information determined using sensing data from one or more sensors of the movable object sensing system 122 are transmitted to the control unit 104 (e.g., via the communication system 120) . In some implementations, data generated by one or more sensors of the movable object sensing system 122 and/or information determined using sensing data from one or more sensors of the movable object sensing system 122 is stored by the memory 118.
Figure 4 is a block diagram illustrating an exemplary memory 118 of a movable object 102 according to some implementations. In some implementations, one or more elements illustrated in Figure 4 may be located in the control unit 104, the computing device 126, and/or another device.
In some implementations, the memory 118 stores a system configuration 400. The system configuration 400 includes one or more system settings (e.g., as configured by a manufacturer, administrator, and/or user of the movable object 102) . For example, a constraint on one or more of orientation, position, attitude, and/or one or more movement characteristics of the movable object 102, the carrier 108, and/or the payload 110 is stored as a system setting of the system configuration 400.
In some implementations, the memory 118 stores a motion control module 402. The motion control module 402 stores control instructions that are received from the control unit 104 and/or the computing device 126. The control instructions are used for controlling operation of the movement mechanisms 114, the carrier 108, and/or the payload 110.
In some implementations, the memory 118 stores a tracking module 404. In some implementations, the tracking module 404 generates tracking information for a target object 106 that is being tracked by the movable object 102. In some implementations, the tracking information is generated based on images captured by the imaging device 214 and/or based on output from the video analysis module 406 (e.g., after pre-processing and/or processing operations have been performed on one or more images) . Alternatively or in combination, the tracking information may be generated based on analysis of gestures of a human target, which are captured by the imaging device 214 and/or analyzed by a gesture analysis module 403. The tracking information generated by the tracking module 404 may include a location, a size, and/or other characteristics of the target object 106 within one or more images. In some implementations, the tracking information generated by the tracking module 404 is transmitted to the control unit 104 and/or the computing device 126 (e.g., augmenting or otherwise combined with images and/or output from the video analysis module 406) . For example, the tracking information may be transmitted to the control unit 104 in response to a request from the control unit 104 and/or on a periodic basis (e.g., every 2 seconds, 5 seconds, 10 seconds, or 30 seconds) .
In some implementations, the memory 118 includes a video analysis module 406. The video analysis module 406 performs processing operations on videos and images, such as videos and images captured by the imaging device 214. In some implementations, the video analysis module 406 performs pre-processing on raw video and/or image data, such as re-sampling to assure the correctness of the image coordinate system, noise reduction, contrast enhancement, and/or scale space representation. In some implementations, the  processing operations performed on video and image data (including data of videos and/or images that has been pre-processed) include feature extraction, image segmentation, data verification, image recognition, image registration, and/or image matching. In some implementations, the output from the video analysis module 406 (e.g., after the pre-processing and/or processing operations have been performed) is transmitted to the control unit 104 and/or the computing device 126. In some implementations, feature extraction is performed by the control unit 104, the processor (s) 116 of the movable object 102, and/or the computing device 126. In some implementations, the video analysis module 406 may use neural networks to perform image recognition and/or classify object (s) that are included in the videos and/or images. For example, the video analysis module 406 may extract frames that include the target object 106, analyze features of the target object 106, and compare the features with characteristics of one or more predetermined recognizable target object types, thereby enabling the target object 106 to be recognized at a certain confidence level.
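By way of illustration only, the following sketch (in Python, using OpenCV purely as one possible image-processing toolkit) shows a pre-processing and feature-extraction sequence of the kind attributed to the video analysis module 406; the specific filters and the ORB feature extractor are illustrative choices, not requirements of this disclosure. A captured frame would be passed in as a BGR array, e.g. one decoded from the video stream.

    import cv2  # OpenCV, used here only as one possible image-processing toolkit

    def preprocess_and_extract(frame_bgr):
        """Pre-process one captured frame and extract local features.

        Mirrors the kind of pipeline described for the video analysis module 406:
        noise reduction and contrast enhancement, followed by feature extraction.
        """
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        denoised = cv2.GaussianBlur(gray, (5, 5), 0)       # noise reduction
        enhanced = cv2.equalizeHist(denoised)              # contrast enhancement
        orb = cv2.ORB_create(nfeatures=500)                # local feature extractor
        keypoints, descriptors = orb.detectAndCompute(enhanced, None)
        return keypoints, descriptors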
In some implementations, the memory 118 includes a gesture analysis module 403. The gesture analysis module 403 processes gestures of one or more human targets. In some implementations, the gestures may be captured by the imaging device 214. In some implementations, after processing the gestures, the gesture analysis results may be fed into the tracking module 404 and/or the motion control module 402 to generate, respectively, tracking information and/or control instructions that are used for controlling operations of the movement mechanisms 114, the carrier 108, and/or the payload 110 of the movable object 102.
In some implementations, a calibration process may be performed before using gestures of a human target to control the movable object 102. For example, during the calibration process, the gesture analysis module 403 may capture certain features of human gestures associated with a certain control command and store the gesture features in the memory 118. When a human gesture is received, the gesture analysis module 403 may extract features of the human gesture and compare these features with the stored features to determine whether the corresponding command should be performed. In some implementations, the correlations between gestures and control commands associated with a certain human target may or may not be different from such correlations associated with another human target.
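By way of illustration only, the following sketch (in Python, with hypothetical feature vectors and a hypothetical distance threshold) shows how calibrated gesture features could be compared against newly extracted features to decide whether a stored control command is recognized:

    import math

    # Calibration: stored feature vectors (hypothetical values), one per control command.
    calibrated_gestures = {
        "start_recording": [0.9, 0.1, 0.4],
        "return_home":     [0.2, 0.8, 0.7],
    }

    def match_gesture(features, threshold: float = 0.3):
        """Compare extracted gesture features with the calibrated ones.

        Returns the best-matching command, or None if no stored gesture is close enough.
        """
        best_cmd, best_dist = None, float("inf")
        for command, stored in calibrated_gestures.items():
            dist = math.dist(features, stored)   # Euclidean distance between feature vectors
            if dist < best_dist:
                best_cmd, best_dist = command, dist
        return best_cmd if best_dist <= threshold else None

    print(match_gesture([0.85, 0.15, 0.45]))  # -> "start_recording"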
In some implementations, the memory 118 also includes a spatial relationship determination module 405. The spatial relationship determination module 405 calculates one or more spatial relationships between the target object 106 and the movable object 102, such as a horizontal distance between the target object 106 and the movable object 102, and/or a pitch angle between the target object 106 and the movable object 102.
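By way of illustration only, the following sketch (in Python) shows how a horizontal distance and a pitch angle between the movable object and the target object could be computed from their positions in a shared local frame; the coordinate convention is an assumption:

    import math

    def spatial_relationship(uav_pos, target_pos):
        """Horizontal distance and pitch angle from the movable object 102 to the target object 106.

        Positions are (x, y, z) tuples in metres in a shared local frame, z being altitude.
        """
        dx = target_pos[0] - uav_pos[0]
        dy = target_pos[1] - uav_pos[1]
        dz = target_pos[2] - uav_pos[2]
        horizontal = math.hypot(dx, dy)
        pitch_deg = math.degrees(math.atan2(dz, horizontal))  # negative when the target is below
        return horizontal, pitch_deg

    # Example: UAV at 30 m altitude, target 40 m away horizontally on the ground.
    print(spatial_relationship((0, 0, 30), (40, 0, 0)))  # -> (40.0, about -36.9 degrees)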
In some implementations, the memory 118 stores target information 408. In some implementations, the target information 408 is received by the movable object 102 (e.g., via communication system 120) from the control unit 104, the computing device 126, the target object 106, and/or another movable object.
In some implementations, the target information 408 includes a time value (e.g., a time duration) and/or an expiration time indicating a period of time during which the target object 106 is to be tracked. In some implementations, the target information 408 includes a flag (e.g., a label) indicating whether a target information entry includes specific tracked target information 412 and/or target type information 410.
In some implementations, the target information 408 includes target type information 410 such as color, texture, pattern, size, shape, and/or dimension. In some implementations, the target type information 410 includes, but is not limited to, a predetermined recognizable object type and a general object type as identified by the video analysis module 406. In some implementations, the target type information 410 includes features or characteristics for each type of target and is preset and stored in the memory 118. In some implementations, the target type information 410 is provided to a user input device  (e.g., the control unit 104) via user input. In some implementations, the user may select a pre-existing target pattern or type (e.g., an object or a round object with a radius greater or less than a certain value) .
In some implementations, the target information 408 includes tracked target information 412 for a specific target object 106 being tracked. The tracked target information 412 may be identified by the video analysis module 406 by analyzing the target in a captured image. The tracked target information 412 includes, e.g., an image of the target object 106, an initial position (e.g., location coordinates, such as pixel coordinates within an image) of the target object 106, and/or a size of the target object 106 within one or more images (e.g., images captured by the imaging device 214) . A size of the target object 106 is stored, e.g., as a length (e.g., mm or other length unit) , an area (e.g., mm² or other area unit) , a number of pixels in a line (e.g., indicating a length, width, and/or diameter) , a ratio of a length of a representation of the target in an image relative to a total image length (e.g., a percentage) , a ratio of an area of a representation of the target in an image relative to a total image area (e.g., a percentage) , a number of pixels indicating an area of the target object 106, and/or a corresponding spatial relationship (e.g., a vertical distance and/or a horizontal distance) between the target object 106 and the movable object 102 (e.g., an area of the target object 106 changes based on a distance of the target object 106 from the movable object 102) .
In some implementations, one or more features (e.g., characteristics) of the target object 106 are determined from an image of the target object 106 (e.g., using image analysis techniques on images captured by the imaging device 214) . For example, one or more features of the target object 106 are determined from an orientation and/or part or all of identified boundaries of the target object 106. In some implementations, the tracked target information 412 includes pixel coordinates and/or a number of pixel counts to indicate, e.g., a size parameter, position, and/or shape of the target object 106. In some implementations, one or more features of the tracked target information 412 are to be maintained as the movable object 102 tracks the target object 106 (e.g., the tracked target information 412 is to be maintained as images of the target object 106 are captured by the imaging device 214) . In some implementations, the tracked target information 412 is used to adjust the movable object 102, the carrier 108, and/or the imaging device 214, such that specific features of the target object 106 are substantially maintained. In some implementations, the tracked target information 412 is determined based on one or more of the target types 410.
In some implementations, the memory 118 also includes predetermined recognizable target type information 414. The predetermined recognizable target type information 414 specifies one or more characteristics of certain predetermined recognizable target types (e.g., target type 1, target type 2, ..., target type n) . Each predetermined recognizable target type may include one or more characteristics such as a size parameter (e.g., area, diameter, height, length and/or width) , position (e.g., relative to an image center and/or image boundary) , movement (e.g., speed, acceleration, altitude) and/or shape. For example, target type 1 may be a human target. One or more characteristics associated with a human target may include a height in a range from about 1.4 meters to about 2 meters, a pattern comprising a head, a torso, and limbs, and/or a moving speed in a range from about 2 kilometers/hour to about 25 kilometers/hour. In another example, target type 2 may be a car target. One or more characteristics associated with a car target may include a height in a range from about 1.4 meters to about 4.5 meters, a length in a range from about 3 meters to about 10 meters, a moving speed in a range from about 5 kilometers/hour to about 140 kilometers/hour, and/or a pattern of a sedan, an SUV, a truck, or a bus. In yet another example, target type 3 may be a ship target. Other types of predetermined recognizable target objects may include an airplane target, an animal target, other moving targets, and stationary (e.g., non-moving) targets such as a building or a statue. Each predetermined target type may further include one or more subtypes, each of the subtypes having more specific characteristics, thereby providing more accurate target classification results.
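By way of illustration only, the following sketch (in Python, with characteristic ranges copied loosely from the examples above and otherwise hypothetical) shows how measured characteristics could be compared against predetermined recognizable target types:

    # Hypothetical characteristic ranges for predetermined recognizable target types 414.
    TARGET_TYPES = {
        "human": {"height_m": (1.4, 2.0), "speed_kmh": (0.0, 25.0)},
        "car":   {"height_m": (1.4, 4.5), "speed_kmh": (0.0, 140.0)},
    }

    def classify_target(height_m: float, speed_kmh: float):
        """Return the target types whose stored characteristic ranges match the measurements."""
        matches = []
        for name, ranges in TARGET_TYPES.items():
            h_lo, h_hi = ranges["height_m"]
            s_lo, s_hi = ranges["speed_kmh"]
            if h_lo <= height_m <= h_hi and s_lo <= speed_kmh <= s_hi:
                matches.append(name)
        return matches

    # -> ['human', 'car']; further cues (shape, pattern, subtype) would disambiguate.
    print(classify_target(1.7, 5.0))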
In some implementations, the target information 408 (including, e.g., the target type information 410 and the tracked target information 412) and/or the predetermined recognizable target type information 414 is generated based on user input, such as a user input received at the user input device 506 (Figure 5) of the control unit 104. Additionally or alternatively, the target information 408 may be generated based on data from sources other than the control unit 104. For example, the target type information 410 may be based on previously stored images of the target object 106 (e.g., images captured by the imaging device 214 and stored by the memory 118) , other data stored by the memory 118, and/or data from data stores that are remote from the control unit 104 and/or the movable object 102. In some implementations, the target information 408 is generated using a computer-generated image of the target object 106.
In some implementations, the target information 408 is used by the movable object 102 (e.g., the tracking module 404) to track the target object 106. In some implementations, the target information 408 is used by a video analysis module 406 to identify and/or classify the target object 106. In some cases, target identification involves image recognition and/or matching algorithms based on, e.g., CAD-like object models, appearance-based methods, feature-based methods, and/or genetic algorithms. In some implementations, target identification includes comparing two or more images to determine, extract, and/or match features contained therein.
In some implementations, the memory 118 also includes flight routes 416 (e.g., predefined flight routes) of the movable object 102, such as a portrait flight route 418, a long range flight route 420, and a normal flight route 422, as will be discussed below with respect to, e.g., Figures 8A-8B. In some implementations, and as discussed below with respect to, e.g., Figures 14-16, each of the flight routes 416 includes one or more paths, each of the one or more paths having a corresponding trajectory mode. In some implementations, the movable object 102 automatically selects one of the predefined flight routes according to a target type of the target object 106 and a distance between the movable object 102 and the target object 106. In some implementations, after automatically selecting a flight route 416 for the movable object 102, the movable object 102 further performs an automatic customization of the flight route taking into consideration factors such as an updated distance between the movable object 102 and the target object 106, the presence of potential obstacle (s) and/or other structures (e.g., buildings and trees) , or weather conditions. In some implementations, customization of the flight route includes modifying a rate of ascent of the movable object 102, an initial velocity of the movable object 102, and/or an acceleration of the movable object 102. In some implementations, the customization is provided in part by a user. For example, depending on the target type and the distance, the movable object 102 may cause the computing device 126 to display a library of trajectories that can be selected by the user. The movable object 102 then automatically generates the paths of the flight route based on the user selections.
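By way of illustration only, the following sketch (in Python, with placeholder distance thresholds and parameter values that are not specified in this disclosure) shows one way a flight route could be selected from the target type and distance and then automatically customized:

    def select_flight_route(target_type: str, distance_m: float) -> str:
        """Pick one of the predefined flight routes 416 from the target type and distance.

        The thresholds below are placeholders; the document does not specify values.
        """
        if target_type == "human" and distance_m < 10.0:
            return "portrait"      # portrait flight route 418
        if distance_m > 100.0:
            return "long_range"    # long range flight route 420
        return "normal"            # normal flight route 422

    def customize_route(route: str, updated_distance_m: float, obstacles_present: bool) -> dict:
        """Sketch of automatic customization of an already-selected route."""
        params = {"route": route, "ascent_rate_mps": 2.0, "initial_velocity_mps": 5.0}
        if obstacles_present:
            params["ascent_rate_mps"] = 1.0          # climb more cautiously near obstacles
        if updated_distance_m > 150.0:
            params["initial_velocity_mps"] = 8.0     # cover ground faster for distant targets
        return params

    print(customize_route(select_flight_route("human", 6.0), 6.0, obstacles_present=False))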
In some implementations, the flight routes 416 also include user defined flight route (s) 424, which are routes that are defined by the user.
In some implementations, the memory 118 stores data 426 that are captured by the image sensor 216 during an autonomous flight, including video data 428 and image (s) 430. In some implementations, the data 426 also includes audio data 432 that are captured by a microphone of the movable object 102 (e.g., the audio input transducer 314) . In some implementations, the data 426 is stored on the movable object 102 as it is being captured. In some implementations, the memory 118 further stores metadata information with the data 426. For example, the video data 428 may include tag information (e.g., metadata) associated with the flight path and trajectory mode corresponding to a respective segment of the video data 428.
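By way of illustration only, the following sketch (in Python, with an illustrative field layout that is not defined in this disclosure) shows the kind of tag information that could be stored with a segment of the video data 428:

    import json
    import time

    def tag_video_segment(segment_id: int, flight_route: str, path_index: int,
                          trajectory_mode: str) -> str:
        """Build the kind of metadata tag described for the video data 428.

        The field names are illustrative; the document does not define a tag schema.
        """
        tag = {
            "segment_id": segment_id,
            "flight_route": flight_route,        # e.g. "portrait", "long_range", "normal"
            "path_index": path_index,            # which path of the route produced this segment
            "trajectory_mode": trajectory_mode,  # trajectory mode of that path
            "captured_at": time.time(),
        }
        return json.dumps(tag)

    print(tag_video_segment(3, "portrait", 1, "spiral_ascent"))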
The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various implementations. In some implementations, the memory 118 may store a subset of the modules and data structures identified above. Furthermore, the memory 118 may store  additional modules and data structures not described above. In some implementations, the programs, modules, and data structures stored in the memory 118, or a non-transitory computer readable storage medium of the memory 118, provide instructions for implementing respective operations in the methods described below. In some implementations, some or all of these modules may be implemented with specialized hardware circuits that subsume part or all of the module functionality. One or more of the above identified elements may be executed by the one or more processors 116 of the movable object 102. In some implementations, one or more of the above identified elements is executed by one or more processors of a device remote from the movable object 102, such as the control unit 104 and/or the computing device 126.
Figure 5 illustrates an exemplary control unit 104 of target tracking system 100, in accordance with some implementations. In some implementations, the control unit 104 communicates with the movable object 102 via the communication system 120, e.g., to provide control instructions to the movable object 102. Although the control unit 104 is typically a portable (e.g., handheld) device, the control unit 104 need not be portable. In some implementations, the control unit 104 is a dedicated control device (e.g., dedicated to operation of movable object 102) , a laptop computer, a desktop computer, a tablet computer, a gaming system, a wearable device (e.g., watches, glasses, gloves, and/or helmet) , a microphone, and/or a combination thereof.
The control unit 104 typically includes one or more processor (s) 502, a communication system 510 (e.g., including one or more network or other communications interfaces) , memory 504, one or more input/output (I/O) interfaces (e.g., an input device 506 and/or a display 508) , and one or more communication buses 512 for interconnecting these components.
In some implementations, the input device 506 and/or the display 508 comprises a touchscreen display. The touchscreen display optionally uses LCD (liquid  crystal display) technology, LPD (light emitting polymer display) technology, or LED (light emitting diode) technology, although other display technologies are used in other implementations. The touchscreen display and the processor (s) 502 optionally detect contact and any movement or breaking thereof using any of a plurality of touch sensing technologies now known or later developed, including but not limited to capacitive, resistive, infrared, and surface acoustic wave technologies, as well as other proximity sensor arrays or other elements for determining one or more points of contact with the touchscreen display.
In some implementations, the input device 506 includes one or more of: joysticks, switches, knobs, slide switches, buttons, dials, keypads, keyboards, mice, audio transducers (e.g., microphones for voice control systems) , motion sensors, and/or gesture controls. In some implementations, an I/O interface of the control unit 104 includes sensors (e.g., GPS sensors, and/or accelerometers) , audio output transducers (e.g., speakers) , and/or one or more tactile output generators for generating tactile outputs.
In some implementations, the input device 506 receives user input to control aspects of the movable object 102, the carrier 108, the payload 110, or a component thereof. Such aspects include, e.g., attitude (e.g., aviation) , position, orientation, velocity, acceleration, navigation, and/or tracking. For example, the input device 506 is manually set to one or more positions by a user. Each of the positions may correspond to a predetermined input for controlling the movable object 102. In some implementations, the input device 506 is manipulated by a user to input control instructions for controlling the navigation of the movable object 102. In some implementations, the input device 506 is used to input a flight mode for the movable object 102, such as auto pilot or navigation according to a predetermined navigation path.
In some implementations, the input device 506 is used to input a target tracking mode for the movable object 102, such as a manual tracking mode or an automatic tracking mode. In some implementations, the user controls the movable object 102, e.g., the position, attitude, and/or orientation of the movable object 102, by changing a position of the control unit 104 (e.g., by tilting or otherwise moving the control unit 104) . For example, a change in a position of the control unit 104 may be detected by one or more inertial sensors, and the output of the one or more inertial sensors may be used to generate command data. In some implementations, the input device 506 is used to adjust an operational parameter of the payload, such as a parameter of the payload sensing system 212 (e.g., to adjust a zoom parameter of the imaging device 214) and/or a position of the payload 110 relative to the carrier 108 and/or the movable object 102.
In some implementations, the input device 506 is used to indicate information about the target object 106, e.g., to select a target object 106 to track and/or to indicate the target type information 410. In some implementations, the input device 506 is used for interaction with augmented image data. For example, an image displayed by the display 508 includes representations of one or more target objects 106. In some implementations, representations of the one or more target objects 106 are augmented to indicate identified objects for potential tracking and/or a target object 106 that is currently being tracked. Augmentation includes, for example, a graphical tracking indicator (e.g., a box) adjacent to or surrounding a respective target object 106. In some implementations, the input device 506 is used to select a target object 106 to track or to change the target object being tracked. In some implementations, a target object 106 is selected when an area corresponding to a representation of the target object 106 is selected by, e.g., a finger, stylus, mouse, joystick, or other component of the input device 506. In some implementations, the specific target information 412 is generated when a user selects a target object 106 to track.
The control unit 104 may also be configured to allow a user to enter target information using any suitable method. In some implementations, the input device 506 receives a selection of a target object 106 from one or more images (e.g., video or snapshot) displayed by the display 508. For example, the input device 506 receives input including a selection performed by a gesture around the target object 106 and/or a contact at a location  corresponding to the target object 106 in an image. In some implementations, computer vision or other techniques are used to determine a boundary of the target object 106. In some implementations, input received at the input device 506 defines a boundary of the target object 106. In some implementations, multiple targets are simultaneously selected. In some implementations, a selected target is displayed with a selection indicator (e.g., a bounding box) to indicate that the target is selected for tracking. In some other implementations, the input device 506 receives input indicating information such as color, texture, shape, dimension, and/or other characteristics associated with a target object 106. For example, the input device 506 includes a keyboard to receive typed input indicating the target information 408.
In some implementations, the control unit 104 provides an interface that enables a user to select (e.g., using the input device 506) between a manual tracking mode and an automatic tracking mode. When the manual tracking mode is selected, the interface enables the user to select a target object 106 to track. For example, a user is enabled to manually select a representation of a target object 106 from an image displayed by the display 508 of the control unit 104. Specific target information 412 associated with the selected target object 106 is transmitted to the movable object 102, e.g., as initial expected target information.
In some implementations, when the automatic tracking mode is selected, the user does not provide input selecting a target object 106 to track. In some implementations, the input device 506 receives target type information 410 from a user input. In some implementations, the movable object 102 uses the target type information 410, e.g., to automatically identify the target object 106 to be tracked and/or to track the identified target object 106.
Typically, manual tracking requires more user control of the tracking of the target and less automated processing or computation (e.g., image or target recognition) by the processor (s) 116 of the movable object 102, while automatic tracking requires less user control of the tracking process but more computation performed by the processor (s) 116 of the movable object 102 (e.g., by the video analysis module 406) . In some implementations, allocation of control over the tracking process between the user and the onboard processing system is adjusted, e.g., depending on factors such as the surroundings of the movable object 102, motion of the movable object 102, altitude of the movable object 102, the system configuration 400 (e.g., user preferences) , and/or available computing resources (e.g., CPU or memory) of the movable object 102, the control unit 104, and/or the computing device 126. For example, relatively more control is allocated to the user when the movable object 102 is navigating in a relatively complex environment (e.g., with numerous buildings or obstacles, or indoors) than when the movable object 102 is navigating in a relatively simple environment (e.g., a wide open space or outdoors) . As another example, more control is allocated to the user when the movable object 102 is at a lower altitude than when the movable object 102 is at a higher altitude. As a further example, more control is allocated to the movable object 102 if the movable object 102 is equipped with a high-speed processor adapted to perform complex computations relatively quickly. In some implementations, the allocation of control over the tracking process between the user and the movable object 102 is dynamically adjusted based on one or more of the factors described herein.
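By way of illustration only, the following sketch (in Python, with heuristic weights that are purely placeholders) shows how the allocation of tracking control between the user and the onboard processing system could be adjusted dynamically from factors such as environment complexity, altitude, and available computing resources:

    def allocate_tracking_control(env_complexity: float, altitude_m: float,
                                  cpu_headroom: float) -> float:
        """Return the fraction of tracking control allocated to onboard automation (0..1).

        Inputs are normalized heuristics: env_complexity near 1 for cluttered or indoor
        scenes, cpu_headroom near 1 when ample compute is available. The weights are
        placeholders, not values taken from this disclosure.
        """
        score = 0.5
        score -= 0.3 * env_complexity                 # complex surroundings -> more user control
        score += 0.2 * min(altitude_m / 100.0, 1.0)   # higher altitude -> more automation
        score += 0.3 * cpu_headroom                   # spare compute -> more automation
        return max(0.0, min(1.0, score))

    # A cluttered, low-altitude flight with limited compute leans toward manual control.
    print(allocate_tracking_control(env_complexity=0.9, altitude_m=15.0, cpu_headroom=0.2))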
In some implementations, the control unit 104 includes an electronic device (e.g., a portable electronic device) and an input device 506 that is a peripheral device that is communicatively coupled (e.g., via a wireless and/or wired connection) and/or mechanically coupled to the electronic device. For example, the control unit 104 includes a portable electronic device (e.g., a cellphone or a smart phone) and a remote control device (e.g., a standard remote control with a joystick) coupled to the portable electronic device. In this example, an application executed by the cellphone generates control instructions based on input received at the remote control device.
In some implementations, the display device 508 displays information about the movable object 102, the carrier 108, and/or the payload 110, such as position, attitude, orientation, movement characteristics of the movable object 102, and/or distance between the movable object 102 and another object (e.g., the target object 106 and/or an obstacle) . In some implementations, information displayed by the display device 508 includes images captured by the imaging device 214, tracking data (e.g., a graphical tracking indicator applied to a representation of the target object 106, such as a box or other shape around the target object 106 shown to indicate that target object 106 is currently being tracked) , and/or indications of control data transmitted to the movable object 102. In some implementations, the images including the representation of the target object 106 and the graphical tracking indicator are displayed in substantially real-time as the image data and tracking information are received from the movable object 102 and/or as the image data is acquired.
The communication system 510 enables communication with the communication system 120 of the movable object 102, the communication system 610 (Figure 6) of the computing device 126, and/or a base station (e.g., computing device 126) via a wired or wireless communication connection. In some implementations, the communication system 510 transmits control instructions (e.g., navigation control instructions, target information, and/or tracking instructions) . In some implementations, the communication system 510 receives data (e.g., tracking data from the payload imaging device 214, and/or data from movable object sensing system 122) . In some implementations, the control unit 104 receives tracking data (e.g., via the wireless communications 124) from the movable object 102. Tracking data is used by the control unit 104 to, e.g., display the target object 106 as the target is being tracked. In some implementations, data received by the control unit 104 includes raw data (e.g., raw sensing data as acquired by one or more sensors) and/or processed data (e.g., raw data as processed by, e.g., the tracking module 404) .
In some implementations, the memory 504 stores instructions for generating control instructions automatically and/or based on input received via the input device 506. The control instructions may include control instructions for operating the movement mechanisms 114 of the movable object 102 (e.g., to adjust the position, attitude, orientation, and/or movement characteristics of the movable object 102, such as by providing control instructions to the actuators 132) . In some implementations, the control instructions adjust movement of the movable object 102 with up to six degrees of freedom. In some implementations, the control instructions are generated to initialize and/or maintain tracking of the target object 106. In some implementations, the control instructions include instructions for adjusting the carrier 108 (e.g., instructions for adjusting the damping element 208, the actuator 204, and/or one or more sensors of the carrier sensing system 206) . In some implementations, the control instructions include instructions for adjusting the payload 110 (e.g., instructions for adjusting one or more sensors of the payload sensing system 212) . In some implementations, the control instructions include control instructions for adjusting the operations of one or more sensors of the movable object sensing system 122.
In some implementations, the memory 504 also stores instructions for performing image recognition, target classification, spatial relationship determination, and/or gesture analysis that are similar to the corresponding functionalities discussed with reference to Figure 4. The memory 504 may also store target information, such as tracked target information and/or predetermined recognizable target type information, as discussed in Figure 4.
In some implementations, the input device 506 receives user input to control one aspect of the movable object 102 (e.g., the zoom of the imaging device 214) while a control application generates the control instructions for adjusting another aspect of the movable object 102 (e.g., to control one or more movement characteristics of the movable object 102) . The control application includes, e.g., the motion control module 402, the tracking module 404, and/or a control application of the control unit 104 and/or the computing device 126. For example, the input device 506 receives user input to control one or more movement characteristics of the movable object 102 while the control application generates the control instructions for adjusting a parameter of the imaging device 214. In this manner, a user is enabled to focus on controlling the navigation of the movable object 102 without having to provide input for tracking the target (e.g., tracking is performed automatically by the control application) .
In some implementations, allocation of tracking control between user input received at the input device 506 and the control application varies depending on factors such as, e.g., the surroundings of the movable object 102, motion of the movable object 102, altitude of the movable object 102, the system configuration (e.g., user preferences) , and/or available computing resources (e.g., CPU or memory) of the movable object 102, the control unit 104, and/or the computing device 126. For example, relatively more control is allocated to the user when the movable object 102 is navigating in a relatively complex environment (e.g., with numerous buildings or obstacles, or indoors) than when the movable object 102 is navigating in a relatively simple environment (e.g., a wide open space or outdoors) . As another example, more control is allocated to the user when the movable object 102 is at a lower altitude than when the movable object 102 is at a higher altitude. As a further example, more control is allocated to the movable object 102 if the movable object 102 is equipped with a high-speed processor adapted to perform complex computations relatively quickly. In some implementations, the allocation of control over the tracking process between the user and the movable object 102 is dynamically adjusted based on one or more of the factors described herein.
Figure 6 illustrates an exemplary computing device 126 for controlling movable object 102 according to some implementations. The computing device 126 may be a server computer, a laptop computer, a desktop computer, a tablet, or a phone. The computing device 126 typically includes one or more processor (s) 602 (e.g., processing units) , memory 604, a communication system 610 and one or more communication buses 612 for interconnecting these components. In some implementations, the computing device 126 includes input/output (I/O) interfaces 606, such as a display 614 and/or an input device 616.
In some implementations, the computing device 126 is a base station that communicates (e.g., wirelessly) with the movable object 102 and/or the control unit 104.
In some implementations, the computing device 126 provides data storage, data retrieval, and/or data processing operations, e.g., to reduce the processing power and/or data storage requirements of the movable object 102 and/or the control unit 104. For example, the computing device 126 is communicatively connected to a database (e.g., via the communication system 610) and/or the computing device 126 includes a database (e.g., a database connected to the communication bus 612) .
The communication system 610 includes one or more network or other communications interfaces. In some implementations, the computing device 126 receives data from the movable object 102 (e.g., from one or more sensors of the movable object sensing system 122) and/or the control unit 104. In some implementations, the computing device 126 transmits data to the movable object 102 and/or the control unit 104. For example, the computing device 126 provides control instructions to the movable object 102.
In some implementations, the memory 604 stores instructions for performing image recognition, target classification, spatial relationship determination, and/or gesture analysis that are similar to the corresponding functionalities discussed with respect to Figure 4. The memory 604 may also store target information, such as the tracked target information 408 and/or the predetermined recognizable target type information 414 as discussed in Figure 4.
In some implementations, the memory 604 or a non-transitory computer-readable storage medium of the memory 604 stores an application 620, which enables interactions with and control over the movable object 102, and which enables data (e.g., audio, video and/or image data) captured by the movable object 102 to be displayed, downloaded, and/or post-processed. The application 620 may include a user interface 630, which enables interactions between a user of the computing device 126 and the movable object 102. In some implementations, the application 620 may include a video editing module 640, which enables a user of the computing device 126 to edit videos and/or images that have been captured by the movable object 102 during a flight associated with a target object 106, e.g., captured using the image sensor 216.
In some implementations, the memory 604 also stores templates 650, which may be used for generating edited videos.
In some implementations, the memory 604 also stores data 660 that have been captured by the movable object 102 during flights associated with a target object 106, including videos captured by the movable object 102 during such flights. In some implementations, the data 660 may be organized according to flights 661 (e.g., for each flight route) performed by the movable object 102. The data for each of the flights 661 may include video data 662, images 663, and/or audio data 664. In some implementations, the memory 604 further stores tag information 666 (e.g., metadata information) with the video data 662, the images 663, and the audio data 664. For example, the video data 662-1 corresponding to flight 1 661-1 may include tag information (e.g., metadata) associated with the flight path and trajectory mode corresponding to the flight 661-1.
In some implementations, the memory 604 also stores a web browser 670 (or other application capable of displaying web pages) , which enables a user to communicate over a network with remote computers or devices.
Each of the above identified executable modules, applications, or sets of procedures may be stored in one or more of the memory devices, and corresponds to a set of instructions for performing a function described above. The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various implementations. In some implementations, the memory  604 stores a subset of the modules and data structures identified above. Furthermore, the memory 604 may store additional modules or data structures not described above.
Figure 7 illustrates an exemplary configuration 700 of a movable object 102, a carrier 108, and a payload 110 according to some implementations. The configuration 700 is used to illustrate exemplary adjustments to an orientation, position, attitude, and/or one or more movement characteristics of the movable object 102, the carrier 108, and/or the payload 110, e.g., as used to perform initialization of target tracking and/or to track a target object 106.
In some implementations, the movable object 102 rotates around up to three orthogonal axes, such as X1 (pitch) 710, Y1 (yaw) 708 and Z1 (roll) 712 axes. The rotations around the three axes are referred to herein as a pitch rotation 722, a yaw rotation 720, and a roll rotation 724, respectively. Angular velocities of the movable object 102 around the X1, Y1, and Z1 axes are referred to herein as ωX1, ωY1, and ωZ1, respectively. In some implementations, the movable object 102 engages in translational movements 728, 726, and 730 along the X1, Y1, and Z1 axes, respectively. Linear velocities of the movable object 102 along the X1, Y1, and Z1 axes (e.g., velocities of the translational movements 728, 726, and 730) are referred to herein as VX1, VY1, and VZ1, respectively.
In some implementations, the payload 110 is coupled to the movable object 102 via the carrier 108. In some implementations, the payload 110 moves relative to the movable object 102 (e.g., the payload 110 is caused by the actuator 204 of the carrier 108 to move relative to the movable object 102) .
In some implementations, the payload 110 moves around and/or along up to three orthogonal axes, e.g., an X2 (pitch) axis 716, a Y2 (yaw) axis 714, and a Z2 (roll) axis 718. The X2, Y2, and Z2 axes are parallel to the X1, Y1, and Z1 axes respectively. In some implementations, where the payload 110 includes the imaging device 214 (e.g., an optical module 702) , the roll axis Z2 718 is substantially parallel to an optical path or optical axis for the optical module 702. In some implementations, the optical module 702 is optically coupled to the image sensor 216 (and/or one or more sensors of the movable object sensing system 122) . In some implementations, the carrier 108 causes the payload 110 to rotate around up to three orthogonal axes, X2 (pitch) 716, Y2 (yaw) 714 and Z2 (roll) 718, e.g., based on control instructions provided to the actuator 204 of the carrier 108. The rotations around the three axes are referred to herein as the pitch rotation 734, yaw rotation 732, and roll rotation 736, respectively. The angular velocities of the payload 110 around the X2, Y2, and Z2 axes are referred to herein as ωX2, ωY2, and ωZ2, respectively. In some implementations, the carrier 108 causes the payload 110 to engage in translational movements 740, 738, and 742, along the X2, Y2, and Z2 axes, respectively, relative to the movable object 102. The linear velocities of the payload 110 along the X2, Y2, and Z2 axes are referred to herein as VX2, VY2, and VZ2, respectively.
In some implementations, the movement of the payload 110 may be restricted (e.g., the carrier 108 restricts movement of the payload 110, e.g., by constricting movement of the actuator 204 and/or by lacking an actuator capable of causing a particular movement) .
In some implementations, the movement of the payload 110 may be restricted to movement around and/or along a subset of the three axes X2, Y2, and Z2 relative to the movable object 102. For example, the payload 110 is rotatable around the X2, Y2, and Z2 axes (e.g., the rotations 734, 732, and 736) or any combination thereof, while the payload 110 is not movable along any of the axes (e.g., the carrier 108 does not permit the payload 110 to engage in the translational movements 740, 738, and 742) . In some implementations, the payload 110 is restricted to rotation around one of the X2, Y2, and Z2 axes. For example, the payload 110 is only rotatable about the Y2 axis (e.g., the yaw rotation 732) . In some implementations, the payload 110 is restricted to rotation around only two of the X2, Y2, and Z2 axes. In some implementations, the payload 110 is rotatable around all three of the X2, Y2, and Z2 axes.
In some implementations, the payload 110 is restricted to movement along the X2, Y2, or Z2 axis (e.g., the translational movements 740, 738, or 742) , or any combination thereof, and the payload 110 is not rotatable around any of the axes (e.g., the carrier 108 does not permit the payload 110 to engage in the rotations 734, 732, or 736) . In some implementations, the payload 110 is restricted to movement along only one of the X2, Y2, and Z2 axes. For example, movement of the payload 110 is restricted to the translational movement 740 along the X2 axis. In some implementations, the payload 110 is restricted to movement along only two of the X2, Y2, and Z2 axes. In some implementations, the payload 110 is movable along all three of the X2, Y2, and Z2 axes.
In some implementations, the payload 110 is able to perform both rotational and translational movement relative to the movable object 102. For example, the payload 110 is able to move along and/or rotate around one, two, or three of the X2, Y2, and Z2 axes.
In some implementations, the payload 110 is coupled to the movable object 102 directly without the carrier 108, or the carrier 108 does not permit the payload 110 to move relative to the movable object 102. In some implementations, the attitude, position and/or orientation of the payload 110 is fixed relative to the movable object 102 in such cases.
In some implementations, adjustment of attitude, orientation, and/or position of the payload 110 is performed by adjustment of the movable object 102, the carrier 108, and/or the payload 110, such as an adjustment of a combination of two or more of the movable object 102, the carrier 108, and/or the payload 110. For example, a rotation of 60 degrees around a given axis (e.g., the yaw axis) for the payload 110 is achieved by a 60-degree rotation by the movable object 102 alone, a 60-degree rotation by the payload 110 relative to the movable object 102 as effectuated by the carrier 108, or a combination of a 40-degree rotation by the movable object 102 and a 20-degree rotation by the payload 110 relative to the movable object 102.
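A minimal sketch of one way such a decomposition could be computed, assuming a simple policy in which the gimbal absorbs as much of the commanded yaw as its mechanical limit allows and the vehicle rotates the remainder; the 20-degree limit and the split policy are illustrative assumptions, not the actual control scheme.

```python
def split_yaw(desired_deg: float, gimbal_limit_deg: float = 20.0):
    """Split a desired payload yaw between the carrier (gimbal) and the vehicle.

    The gimbal takes up to its mechanical limit; the vehicle rotates the rest.
    """
    gimbal_deg = max(-gimbal_limit_deg, min(gimbal_limit_deg, desired_deg))
    vehicle_deg = desired_deg - gimbal_deg
    return vehicle_deg, gimbal_deg

# A 60-degree yaw with a 20-degree gimbal limit yields the 40/20 split in the text.
print(split_yaw(60.0))  # -> (40.0, 20.0)
```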
In some implementations, a translational movement for the payload 110 is achieved via adjustment of the movable object 102, the carrier 108, and/or the payload 110 such as an adjustment of a combination of two or more of the movable object 102, carrier 108, and/or the payload 110. In some implementations, a desired adjustment is achieved by adjustment of an operational parameter of the payload 110, such as an adjustment of a zoom level or a focal length of the imaging device 214.
Figures 8A and 8B illustrate process flows between a user, the computing device 126 (e.g., the application 620) , and the movable object 102 (e.g., a UAV) according to some implementations.
In accordance with some implementations of the present disclosure, the application 620 enables a user to control a flight route of the movable object 102 for automatic and efficient video capture. The application 620 is communicatively connected with the movable object 102 to facilitate automatic recognition of a target type of the target object 106. The movable object 102 is configured to capture live video (e.g., via the image sensor) during the flight route. In some implementations, and as described in Figures 14 to 16, a flight route may consist of multiple flight paths. The flight route is designed so as to utilize a variety of aerial videography techniques.
In accordance with some implementations of the present disclosure, the application 620 also provides video-editing functionalities for editing a video that is captured by the movable object 102 during a flight route associated with a target object 106. For example, the application 620 may generate an edited video that is composed of multiple scenes and matched with music, filters, and transitions. Because the video capture and editing processes are designed to be fully automated, the efficiency and quality of the entire process of flight, video capture, and video editing is significantly improved. Thus, the interactive experience of the users is enhanced.
Figure 8A illustrates an exemplary process flow 800 during an initialization and flight route selection phase.
In some implementations, a user may initiate an automated flight route video capture process by launching (804) an application (e.g., application 620, Figure 6) on the computing device 126. In some implementations, the user may also initiate the automated flight route video capture process by other means, such as via the input device 506 of the control unit 104, or using facial recognition and/or hand gestures.
The user selects (806) a target object (e.g., the target object 106) .
In some implementations, the user selects an object of interest as a target object for the automated flight route video capture. This is illustrated in Figure 9. In some implementations, the user may also select an object of interest as a target object 106 by entering the GPS coordinates of the object of interest via a user interface (e.g., the graphical user interface 630, Figure 6) .
In some implementations, in response to user selection of the target object 106, the computing device 126 (e.g., application 620) displays (808) a target zone and parameters associated with the target zone. This is described in Figure 10.
The user confirms (810) the parameters.
The movable object 102 determines (812) a target type and a distance between the movable object 102 and the target object 106. This is described in Figure 13.
The movable object 102 selects (814) a flight route. This is described in Figure 13.
The computing device 126 displays (818) the selected flight route.
The user confirms (820) the selected flight route. This is illustrated in Figure 10.
The movable object 102 (e.g., UAV) flies (822) autonomously according to the customized flight route.
In some implementations, the computing device 126 displays (824) the progress of the flight route as it is being executed by the movable object 102. This is illustrated in Figure 11.
The movable object 102 captures (826) a video feed having a field of view of the image sensor 216 for each of a plurality of paths of the flight route.
The movable object 102 stores (828) a video feed of the flight route as it is being captured (e.g., video data 428, Figure 4) . In some implementations, the movable object 102 also stores video feed tag information along with the video feed.
In some implementations, after completion of the flight route, the movable object 102 returns (830) to its starting position.
Figure 8B illustrates an exemplary process flow 850 during a video editing phase.
The computing device 126 retrieves (852) from the movable object 102 video segments of the captured video. Each video segment corresponds to a respective flight path of the flight route. This is described in Figure 17.
In some implementations, the user views (854) the captured video feed. This is illustrated in Figure 12.
In some implementations, the user may view (856) the video segments.
In some implementations, the computing device 126 selects (864) a video template. In some implementations, the video template is selected by the computing device 126 automatically. In some implementations, the video template is selected by the user. For example, the user may select (858) a theme for the video. In response to the user selection, the computing device 126 may display one or more templates corresponding to the  user-selected theme. In some implementations, the computing device may also automatically select a default video template. The user may either confirm (860) or modify the video template selection. The user may also input (862) a time duration of the video. The user may also select (866) a resolution for the video. This is illustrated in Figures 12 and 19.
The computing device 126 determines (868) a time duration of each scene of the video.
The computing device 126 extracts (870) video sub-segments from the video segments.
The computing device 126 combines (872) the extracted video sub-segments into an edited video.
The computing device 126 displays (874) the edited video.
Figure 9 provides a screen shot for selecting an object of interest as a target object according to some implementations. In this example, the graphical user interface 630 displays a scene (e.g., a live scene or a previously recorded scene) that is captured by the image sensor 216 during a flight of the movable object 102.
In some implementations, and as illustrated in Figure 9, the user may identify an object 902 (e.g., a structure) in the scene as an object of interest. In this example, the user defines the object 902 as the target object 106 by selecting the object 902 on the graphical user interface 630, thereby causing an identifier 910 (e.g., an “X” ) to be displayed on the graphical user interface 630. The user may initiate an automated flight route video capture process using the object 902 as a target object 106.
In some implementations, in response to user selection of the object 902 as a target object 106, the graphical user interface 630 may display a notification (e.g., as a pop-up window) indicating that the movable object 102 will start a range of movements for target recognition and position estimation. During the target recognition and position estimation process, the movable object 102 may determine a target type corresponding to the target object 106. The movable object 102 may also determine a distance between the movable object 102 and the target object 106. In some implementations, in accordance with the determined target type and the distance, the movable object 102 may select automatically a predefined flight route, as described in further detail in Figure 13.
In some implementations, after the range of movements has been completed, the graphical user interface 630 may display an updated view that includes a maximum height 904 for the flight route (e.g., “120 m” ) , a maximum distance 906 for the flight route (e.g., “200 m” ) , and a time duration 908 (e.g., “3m 0s” ) for the flight route, as illustrated in Figure 9.
In some implementations, the selected flight route may be customized. In some implementations, the customization is performed automatically by the movable object 102. In some implementations, the customization is provided in part by the user. In some implementations, and as illustrated in Figures 14 to 16, a flight route may comprise multiple flight paths, each having a different trajectory. Depending on the target type and the distance between the movable object 102 and the target object 106, the movable object 102 may cause the computing device 126 to display a library of trajectories that can be selected by the user. Based on the user selection, the movable object 102 automatically generates flight paths corresponding to the trajectories.
In some implementations, the selected flight route may be customized according to a surrounding environment between the movable object 102 and the target object 106. For example, modification of a predefined flight route may be necessary to overcome obstacle (s) between the movable object 102 and the target object 106. In some implementations, the customization may also be based on an updated distance between the movable object 102 and the target object 106. For example, if the maximum altitude of the  movable object 102 during the flight route is fixed, the rate of ascent of the movable object 102 may be higher when the target object 106 is nearer and the rate of ascent of the movable object 102 may be lower when the target object is farther.
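The altitude-based customization described above can be sketched as follows, assuming (purely for illustration) that the UAV climbs at a constant rate while closing the horizontal distance to the target at a fixed ground speed; the numeric values are hypothetical.

```python
def rate_of_ascent(max_altitude_m: float,
                   horizontal_distance_m: float,
                   ground_speed_mps: float) -> float:
    """Climb rate needed to reach the fixed maximum altitude by the time the
    UAV arrives above the target: nearer targets require faster climbs."""
    time_to_target_s = horizontal_distance_m / ground_speed_mps
    return max_altitude_m / time_to_target_s

# Same 80 m ceiling, two different target distances, 5 m/s ground speed.
print(rate_of_ascent(80.0, 50.0, 5.0))   # near target -> 8.0 m/s climb
print(rate_of_ascent(80.0, 200.0, 5.0))  # far target  -> 2.0 m/s climb
```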
Figure 10 provides a screen shot for displaying a customized flight path according to some implementations. In this example, the graphical user interface 630 displays a map 1002 that includes a current location 1004 of the target object 106 and its vicinity. The map 1002 also includes a flight zone 1006 that indicates a surrounding area of the target object 106 that would be traversed by the movable object 102 during the flight. The flight zone 1006 may be color-coded to show a range of altitudes that will be attained by the movable object 102 during the execution of the flight route, as illustrated in Figure 10. The graphical user interface 630 also displays a notification 1008 to the user (e.g., “The UAV will be flying in the identified fly zone. Please ensure that there is no obstruction” ) , a “Confirm” affordance 1010, and a “Cancel” affordance 1012. User selection of the “Confirm” affordance 1010 causes the movable object 102 to commence execution of the flight route. User selection of the “Cancel” affordance 1012 causes the movable object 102 to refrain from executing the flight route.
In some implementations, the flight route may include a plurality of flight paths. In some implementations, the graphical user interface 630 may also display a preview of the flight paths of the flight route. For example, the preview may comprise a simulated view (e.g., a three-dimensional or two-dimensional representation) of each of the flight paths from the perspective of the movable object 102. The three-dimensional representation may be based on a superposition of satellite images, e.g., images from Google Earth, corresponding to each of the flight paths. Stated another way, the three-dimensional representation enables the user to observe a simulated preview of the flight route using satellite imagery prior to execution of the flight route by the movable object 102.
Figure 11 provides a screen shot for execution of a flight route by a movable object 102 according to some implementations.
In some implementations, in response to user selection of the “Confirm” affordance 1010 in Figure 10, the movable object 102 commences execution of the flight route without user intervention. During the execution of the flight route, the graphical user interface 630 displays a bar 1102 that includes multiple segments, as illustrated in Figure 11. Each of the segments corresponds to a flight path with a corresponding trajectory mode, as will be described with respect to Figures 14 to 16. The graphical user interface 630 also displays a status bar 1104 that indicates a degree of completion of the flight route and a description 1106 (e.g., “5/9 Downward spiral” ) of a current status of the flight route. In this example, the executed flight route consists of nine flight paths, and the description 1106 “5/9 Downward spiral” indicates that the movable object 102 is currently executing the fifth flight path corresponding to a “downward spiral” trajectory mode. In some implementations, and as illustrated in Figure 11, the graphical user interface 630 also displays a current flight direction 1108 executed by the movable object 102.
In some implementations, the flight route includes a plurality of flight paths, and the image sensor 216 is configured to capture a video feed for each of the flight paths with a corresponding capture setting. The capture setting includes settings for an angle of view of the image sensor 216 with respect to the flight path, a pan tilt zoom (PTZ) setting of the image sensor 216, an optical zoom setting of the image sensor 216, a digital zoom setting of the image sensor 216, and/or a focal length of the image sensor 216. Exemplary details of flight routes, flight paths, and image sensor capture settings are described in further detail with respect to Figures 14 to 16.
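One possible representation of the per-path capture settings enumerated above is sketched below; the dictionary keys, the particular values, and the fallback defaults are illustrative assumptions only.

```python
# Hypothetical per-flight-path capture settings, keyed by flight path index.
capture_settings = {
    1: {"angle_of_view_deg": -15, "ptz": {"pan": 0, "tilt": -15, "zoom": 1.0},
        "optical_zoom": 1.0, "digital_zoom": 1.0, "focal_length_mm": 24},
    5: {"angle_of_view_deg": -90, "ptz": {"pan": 0, "tilt": -90, "zoom": 2.0},
        "optical_zoom": 2.0, "digital_zoom": 1.0, "focal_length_mm": 48},
}

def settings_for_path(path_index: int, defaults=None):
    """Return the capture settings for a path, falling back to defaults."""
    if defaults is None:
        defaults = {"angle_of_view_deg": 0,
                    "ptz": {"pan": 0, "tilt": 0, "zoom": 1.0},
                    "optical_zoom": 1.0, "digital_zoom": 1.0,
                    "focal_length_mm": 24}
    return capture_settings.get(path_index, defaults)

print(settings_for_path(5)["focal_length_mm"])  # -> 48
```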
Figure 12 provides a screen shot of an edited video according to some implementations.
In some implementations, the movable object 102 simultaneously transmits a live video stream to the computing device 126 for display on the graphical user interface 630 as the video is being captured.
In some implementations, after completion of the flight route by the movable object 102 (e.g., the video capture is completed) , the computing device 126 (e.g., the application 620) automatically generates an edited video 1200 from the captured video. The user can select a playback affordance on the graphical user interface 630 that enables playback of the video as it is captured, and/or the edited video 1200. Further details of the video editing process are described with respect to Figures 17 to 19. In some implementations, the graphical user interface 630 also displays different themes (e.g., a “joyous” theme 1202, an “action” theme 1204, and a “scenic” theme 1206) that, when selected by the user, generate a different effect on the video. The user may also select a resolution 1208 for the video. After the edited video is generated, the edited video may be stored locally (e.g., on the computing device 126) or in the cloud. The user may also share the edited video via email and/or on social media platforms.
Figure 13 illustrates an exemplary flight route matching strategy according to some implementations.
In some implementations, after a user selects an object of interest as a target object 106, the movable object 102 executes a flight route matching strategy (1302) to determine a flight route to be executed for video capture of the selected target object 106. To this end, the movable object first determines a target type corresponding to the target object 106. For example, the movable object 102 may capture images of the target object (e.g., using the image sensors 302 and/or the image sensor 216) and use image matching algorithms to identify a target type corresponding to the target object 106. Using the images and/or algorithms, the movable object determines (1304) whether the target object is a person. In accordance with a determination that the target object is not (1306) a person, the movable  object 102 selects a normal flight route (1310) . In accordance with a determination that the target object is (1308) a person, the movable object 102 selects a portrait flight route (1312) .
In some implementations, the movable object 102 also determines a distance between the movable object 102 (e.g., the UAV) and the target object 106 (e.g., using a distance sensor such as the infrared sensors 306, a RADAR sensor, a LiDAR sensor, or GPS coordinates) , and further refines the selected flight route based on the distance. Referring again to Figure 13, after determining that the target object 106 is not a person (1306) , the movable object 102 further determines if a distance between the movable object 102 and the target object 106 exceeds a threshold distance (e.g., over 100 meters) (step 1314) . In accordance with a determination that the distance does not (1316) exceed the threshold distance, the movable object 102 identifies the normal flight route (1320) as the flight route to be executed. In accordance with a determination that the distance exceeds (1318) the threshold distance, the movable object 102 identifies a long range flight route (1322) as the flight route to be executed.
If the target object is a person (1308) , the movable object 102 determines if a distance between the movable object 102 and the target object 106 exceeds a threshold distance (e.g., over 100 meters) (step 1324) . In accordance with a determination that the distance exceeds (1326) the threshold distance, the movable object 102 identifies the long range flight route (1322) as the flight route to be executed. In accordance with a determination that the distance does not exceed (1328) the threshold distance, the movable object 102 identifies a portrait flight route (1330) as the flight route to be executed.
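The decision tree of Figure 13 can be summarized in a compact sketch, assuming a single 100-meter threshold; the function name and the boolean person check are illustrative, and the actual matching strategy may involve additional target types and sensors.

```python
def match_flight_route(is_person: bool,
                       distance_m: float,
                       threshold_m: float = 100.0) -> str:
    """Select a flight route following the decision tree of Figure 13."""
    if distance_m > threshold_m:
        # Beyond the threshold, the long range route is used for any target type.
        return "long_range"
    return "portrait" if is_person else "normal"

print(match_flight_route(is_person=False, distance_m=60.0))   # -> "normal"
print(match_flight_route(is_person=True,  distance_m=60.0))   # -> "portrait"
print(match_flight_route(is_person=True,  distance_m=150.0))  # -> "long_range"
```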
In some implementations, the flight route matching strategy 1302 includes more target types and/or flight routes than those described in Figure 13. For example, the movable object 102 may utilize other sensors and/or machine learning algorithms to identify target types other than those described in Figure 13 (e.g., target types such as animals, vehicles, mountains, buildings, and sculptures) .
In some implementations, after the movable object 102 has identified a flight route as the flight route to be executed, a user may override the identified flight route (e.g., by selecting a different flight route) . In some implementations, the user may also customize the flight route (e.g., by modifying a subset of the flight paths of the flight route) , and/or create new flight routes for execution by the movable object 102.
Figures 14, 15, and 16 illustrate, respectively, an exemplary normal flight route, an exemplary portrait flight route, and an exemplary long range flight route according to some implementations. In these examples, a respective flight route consists of multiple flight paths (e.g., trajectory modes) that are executed successively in the flight route. In Figures 14 to 16, the numbers in parentheses (e.g., (0) , (1) , (2) etc. ) represent a respective flight path having a corresponding trajectory mode. The order of execution of the flight paths is flight path (0) , (1) , (2) , (3) , (4) , (5) , (6) , (7) , (8) , and (9) . The block arrows represent a direction of travel of the movable object 102. The triangle represents the image sensor 216 and the base of the triangle denotes the field of view of the image sensor 216. The thin arrows represent a direction of rotation of the image sensor 216 as the movable object 102 transitions to a subsequent trajectory mode within the flight route. The global x-, y-, and z-axes are also depicted in the figures.
Figure 14 illustrates an exemplary normal flight route according to some implementations.
In some implementations, the normal flight route is designed to be executed within the shortest possible time and using the smallest possible flight area (e.g., smallest possible flight zone) . In the normal flight route, the flight paths are designed so as to display the target object 106 and the surrounding environment as richly as possible, taking into account different target types (e.g., target types such as animals, vehicles, buildings, sculptures, and scenery) . In some implementations, the normal flight route includes a maximum allowable distance between a starting position of the movable object 102 and the  target object 106 (e.g., 100 m, 150 m, or 200 m) and a maximum allowable flight altitude (e.g., 50 m, 80 m, or 100 m) . If the distance between the movable object 102 and the target object 106 exceeds the maximum allowable distance, the long range flight route will be selected instead (See Figure 13) .
In the example of Figure 14, the movable object 102 is configured to rotate with a fixed angle of rotation in a specific direction for flight paths (2) , (4) , and (5) . For instance, the movable object 102 is configured to rotate 180° in a counterclockwise direction with respect to the y-axis (e.g., the yaw rotation 720, Figure 7) when executing flight path (2) . During the execution of flight path (5) , the movable object 102 is configured to rotate 180° in a clockwise direction with respect to the x-axis (e.g., the roll rotation 724, Figure 7) .
In some implementations, the image sensor 216 is configured to maintain a fixed angle (e.g., fixed with respect to the global x-, y-, and z-axes) in one or more flight paths. As illustrated in Figure 14, the image sensor 216 has a fixed angle with respect to the global x-axis for each of the flight paths (2) , (4) , and (5) . In some implementations, the carrier 108 includes a gimbal and is configured to continuously adjust the gimbal (e.g., adjust an angle of rotation and/or a tilt angle) in order to maintain the image sensor 216 at the fixed angle.
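A minimal sketch of how the carrier might compensate for vehicle motion to hold the image sensor at a fixed world-frame pitch; the single-axis model and the angle conventions are simplifying assumptions and do not reflect the actual gimbal control law.

```python
def gimbal_pitch_command(fixed_world_pitch_deg: float,
                         vehicle_pitch_deg: float) -> float:
    """Pitch the gimbal by the difference between the desired world-frame
    camera pitch and the vehicle's current pitch, so the camera angle with
    respect to the global axes stays constant as the vehicle moves."""
    return fixed_world_pitch_deg - vehicle_pitch_deg

# Camera should stay at -30 degrees relative to the horizon.
for vehicle_pitch in (0.0, 5.0, -8.0):
    print(gimbal_pitch_command(-30.0, vehicle_pitch))
# -> -30.0, -35.0, -22.0
```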
In some implementations, the image sensor 216 is configured to rotate about one or more axes in a flight path. For example, in flight path (7) , the image sensor is configured to rotate about the y-axis (e.g., yaw rotation 732, Figure 7) while the movable object 102 travels in an upward direction.
In some implementations, the image sensor 216 is configured to tilt and/or rotate between flight paths of the flight route. For example, as the movable object 102 transitions from flight path (8) to flight path (9) , the field of view of the image sensor 216 switches from facing a negative x-axis direction to a negative y-axis direction.
In some implementations, the flight route includes a flight path wherein the field of view of the image sensor 216 faces a direction that is the same as a direction of travel of the movable object 102. For example, in flight path (9) , the movable object 102 is traveling in a downward direction (e.g., negative y-axis direction) and the field of view of the image sensor 216 also faces the downward direction.
In some implementations, the flight route includes a flight path wherein the field of view of the image sensor 216 faces a direction that is opposite to a direction of travel of the movable object 102. For example, in flight path (1) , the movable object 102 is traveling away from the starting position of the movable object 102 whereas the field of view of the image sensor is facing the starting position of the movable object 102.
In the example of Figure 14, flight paths (1) , (3) , (6) , (7) , (8) , and (9) are trajectory modes in which the movable object 102 traverses a respective fixed distance. Suppose the distance between the starting point of the movable object 102 (e.g., the UAV starting position) and the target object 106 is D1, and the distance between the farthest point of the flight route and the starting point of the movable object 102 is D2. In order to distinguish the difference between D2 and D1, the ratio of D2 to D1 should be as large as possible. In some implementations, in order to take into account the video capture effect and the flight time, the distance D1 between the movable object 102 and the target object 106 should be between 2 m and 100 m.
Figure 15 illustrates an exemplary portrait flight route according to some implementations.
In some implementations, the portrait mode is used when the target object is a person. The portrait mode is designed so as to present people in a suitable proportion in the captured video. Accordingly, the movable object 102 should not be too far from the target object 106 while taking into consideration the surrounding environment.
In some implementations, the portrait flight route includes a maximum allowable distance between a starting position of the movable object 102 and the target object 106 (e.g., 40 m, 50 m, or 60 m) and a maximum allowable flight altitude (e.g., 30 m, 40 m, or 50 m) .
In some implementations, and as illustrated in Figure 15, the portrait flight route includes a plurality of flight paths, each having a respective trajectory mode. In the example of Figure 15, the movable object 102 is configured to rotate with a fixed angle of rotation in a specific direction for flight paths (2) , (3) , and (5) . The movable object 102 is configured to travel a fixed distance for flight paths (1) , (4) , (6) , (7) , (8) , (9) and (10) . The image sensor 216 is also configured to rotate or tilt with respect to one or more axes, or maintain a fixed angle, as discussed previously with respect to Figure 14.
In some implementations, the portrait flight route is designed to enrich the video capture effects, for example by controlling a pan, tilt, and zoom (PTZ) setting of the image sensor 216. In the example of Figure 15, flight path (1) comprises a trajectory mode wherein the image sensor performs a dolly zoom, dynamically adjusting the zoom factor and focal length during capture of the target object 106 as the movable object 102 moves away from the target object 106.
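The dolly-zoom relationship can be sketched as follows, assuming a pinhole camera model in which the focal length is scaled with the distance to the subject so the subject keeps a constant apparent size; the initial focal length and distances are illustrative values, not parameters of the described flight route.

```python
def dolly_zoom_focal_length(initial_focal_mm: float,
                            initial_distance_m: float,
                            current_distance_m: float) -> float:
    """Scale the focal length with distance so the target's image size is
    preserved while the UAV moves away (classic dolly-zoom effect)."""
    return initial_focal_mm * (current_distance_m / initial_distance_m)

# Starting 5 m from the subject at 24 mm, then backing away.
for d in (5.0, 10.0, 20.0):
    print(round(dolly_zoom_focal_length(24.0, 5.0, d), 1))
# -> 24.0, 48.0, 96.0
```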
In some implementations, the starting position of the portrait flight route is predetermined. For instance, the predetermined starting position is three meters high and five meters away from the target object 106, so that the size of the target object in images captured by the image sensor is appropriate.
In some implementations, the movable object 102 is configured to track the target object 106 continuously for flight paths (1) , (2) , (3) , (4) , and (5) . If the target object 106 moves, the flight paths (1) , (2) , (3) , (4) , and (5) will change accordingly.
Figure 16 illustrates an exemplary long range flight route according to some implementations.
In some implementations, the long range flight route is designed to enrich the video capture effects, for example by controlling a pan, tilt, and zoom (PTZ) setting of the image sensor 216. In the example of Figure 16, flight path (3) comprises a trajectory mode wherein the field of view of the image sensor 216, which is facing the target object 106, dynamically adjusts a zoom factor (e.g., from 3-6 times) during capture of the target object 106 as the movable object 102 is moving away from the target object 106 (e.g., the direction of travel of the movable object 102 is in the positive x direction) .
In some implementations, the long range flight route includes a maximum allowable distance between a starting position of the movable object 102 and the target object 106 (e.g., 100 m, 150 m, or 200 m) and a maximum allowable flight altitude (e.g., 80 m, 100 m, or 120 m) .
In some implementations, the flight paths include a discovery approach trajectory mode, as illustrated in flight path (3) of the normal flight route, flight path (3) of the portrait flight route, and flight path (1) of the long range flight route. In this mode, the carrier 108 controls a pitch angle of the gimbal such that the field of view of the image sensor 216 no longer follows the target object 106, but instead rotates down to 90° relative to a horizontal plane. The movable object 102 is configured to fly towards the target object 106. The movable object 102 determines in real time a current position of the movable object 102 in the flight route, and controls the gimbal to gradually lift at a certain speed to make the target object 106 appear on the field of view of the image sensor 216 again.
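The discovery approach could be parameterized roughly as shown below, where the gimbal pitch starts pointing straight down and is lifted as the UAV progresses along the path; the linear interpolation over route progress is an illustrative assumption rather than the actual control law.

```python
def discovery_gimbal_pitch(progress: float,
                           start_pitch_deg: float = -90.0,
                           end_pitch_deg: float = 0.0) -> float:
    """Gimbal pitch for the discovery approach: the camera starts looking
    straight down (-90 degrees) and is gradually lifted toward the horizon
    as the UAV progresses along the flight path (progress in [0, 1])."""
    progress = max(0.0, min(1.0, progress))
    return start_pitch_deg + progress * (end_pitch_deg - start_pitch_deg)

for p in (0.0, 0.5, 1.0):
    print(discovery_gimbal_pitch(p))  # -> -90.0, -45.0, 0.0
```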
As the above examples illustrate, the flight route of the movable object 102 (as well as the corresponding flight paths and trajectory modes) is highly dependent on the position of the target object 106 (and correspondingly, a distance between the movable object 102 and the target object 106) . In some implementations, the movable object 102 continuously determines a current location of the target object 106 while executing the flight route (e.g., using the image sensors 302 and/or 216, the distance sensors, and/or GPS) , and modifies a current flight path and/or subsequent flight paths based on a presently determined location of the target object 106. In some situations, if the movable object 102 is not able to locate the target object 106 while in flight (e.g., during execution of a flight path in the flight route) , the movable object 102 may use a most recently determined location of the target object 106 as the current location of the target object 106, and update the location data once it relocates the target object 106. In some implementations, the target object 106 is a moving object that changes its location while the flight route is being executed. The movable object 102 may be configured to adjust a current flight path and/or subsequent flight path (s) for optimum video capture.
In some implementations, the movable object 102 is configured to return to its starting location after execution of the flight route. In some implementations, a user may issue a command to pause or stop the movable object 102 during execution of a flight route (e.g., for collision prevention) . In response to the user command, the movable object 102 may hover at a fixed location (e.g., its most recent location prior to the user command) and await the user's follow-up operation. In some implementations, if the user does not provide a follow-up command after a certain period of time, the movable object 102 may be configured to automatically return to its starting location. If the user decides to continue with the video capture, the user may resume operation of the movable object 102. In some implementations, the movable object 102 resumes its operation by executing the remaining portion of a flight path (i.e., the flight path that it was executing prior to receiving the user command) , and then following on to execute subsequent flight paths of the flight route. In some implementations, the movable object 102 resumes its operation by skipping the unfinished portion of a current flight path and commencing with the next flight path in the flight route. In some implementations, the movable object 102 resumes its operation by repeating a current flight path that was being executed prior to the user command, and then following on to execute subsequent flight path (s) of the flight route.
In some implementations, the movable object 102 is configured to avoid obstacles during execution of a flight route. As an example, suppose that the distance from the movable object 102 to the target object 106 is 100 meters, the image sensors 302 of the movable object 102 are located at a fixed position (e.g., in the front) of the movable object 102 and are in a forward-facing direction, and the field of view of the image sensor 216 is facing the target object 106, which is located at an angle offset (e.g., 44°) from the image sensors 302. When the flight path has a trajectory of any radius, an internal spiral flight trajectory can be used to ensure obstacle avoidance within the visual field of view (e.g., the field of view of the image sensors 302) , but the surrounding radius will gradually shrink (e.g., by ~8% for every 30° of rotation) . When the distance between the movable object 102 and the target object 106 is less than 64.8 m, obstacle avoidance can be performed based on a map of the target object location without the use of an internal spiral flight trajectory. However, if the image sensors 302 and the image sensor 216 are both facing the target object 106, and the flight path is circular, no obstacle avoidance can be achieved.
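A back-of-the-envelope sketch of the internal spiral described above, assuming the radius shrinks by about 8% for every 30° of rotation; the 100 m starting radius and the 64.8 m hand-off threshold follow the example in the text, and the geometric-decay model is an assumption.

```python
def spiral_radius(initial_radius_m: float,
                  rotation_deg: float,
                  shrink_per_step: float = 0.08,
                  step_deg: float = 30.0) -> float:
    """Radius of the internal spiral after a given rotation, shrinking by
    ~8% for every 30 degrees flown around the target."""
    steps = rotation_deg / step_deg
    return initial_radius_m * (1.0 - shrink_per_step) ** steps

# Starting at 100 m, how far around the target until the radius drops below
# 64.8 m and map-based obstacle avoidance can take over?
angle_deg = 0.0
while spiral_radius(100.0, angle_deg) > 64.8:
    angle_deg += 30.0
print(angle_deg, round(spiral_radius(100.0, angle_deg), 1))  # -> 180.0 60.6
```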
In accordance with some implementations of the present disclosure, a computing device 126 (e.g., the application 620) provides a video editing feature that automatically edits videos captured by a movable object 102 during a flight route associated with a target object 106. Figures 17 to 19 illustrate this process.
Figure 17 illustrates an exemplary video segment download and extraction process at a computing device 126, in accordance with some implementations.
In some implementations, the image sensor 216 automatically captures video during a flight route of the movable object 102. The movable object 102 is configured to store the video and its corresponding metadata information as the video is being captured (e.g., while the movable object 102 is executing the flight route) . In some implementations, the movable object 102 may store the captured video and metadata locally on the movable object 102 (e.g., video data 428, Figure 4) . Additionally and/or alternatively, the movable object 102 may store the captured video and metadata on an external storage medium (e.g., an SD card) that is located on the movable object 102.
In some implementations, the movable object 102 simultaneously transmits a live video stream to the computing device 126 for display on the computing device 126 (e.g., on the graphical user interface 630) as the video is being captured. The live video stream may also include metadata (e.g., tag information) corresponding to the live video stream. In some implementations, the computing device 126 may store the video stream and metadata information (e.g., locally on the computing device 126 and/or in the database 614) .
In some implementations, the video that is stored on the movable object 102 (or on the external storage medium of the movable object 102) has a higher resolution than the streamed video data. Thus, after the video capture is completed (e.g., the movable object 102 has completed the flight route) , the computing device 126 (e.g., the application 620) can use the extra bandwidth that has been freed up from video transmission to download the higher-resolution video data from the movable object 102 for post-processing and video-editing.
In the example of Figure 17, the shaded regions represent video segments that are downloaded by the computing device 126. In some implementations, the computing device 126 may download an entire video feed that is captured during a flight route, as illustrated by the shaded region representing “Captured Video Feed” in Figure 17. In some implementations, instead of downloading an entire video feed, the computing device 126 may download segments (e.g., portions) of a video feed. These are illustrated by “Level 1 download” and “Level 2 download” in Figure 17.
In some implementations, the downloading and segment selection process is performed automatically by the computing device 126 (i.e., without user intervention) . For example, the computing device 126 may use the tag information to identify segments of the captured video whose image quality may be poorer. These may include segments  corresponding to changes in speed of the movable object 102, and/or arising from the movable object 102 transitioning from one flight path to another in the flight route. The computing device 126 may download other segments of the captured video while refraining from downloading segments that are deemed to be of lower image quality. Accordingly, the amount of video data to be downloaded and the time required to download the data can be significantly reduced.
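A rough sketch of this tag-based segment selection, assuming each tag records whether a clip spans a flight-path transition or a speed change; the field names and the segment identifiers are hypothetical and not part of the described metadata format.

```python
from typing import Dict, List

def select_segments_to_download(tags: List[Dict]) -> List[str]:
    """Keep segments that are not flagged as covering a flight-path
    transition or an abrupt speed change, since those tend to have poorer
    image quality and can be skipped to save download time."""
    selected = []
    for tag in tags:
        if tag.get("is_transition") or tag.get("speed_change"):
            continue  # refrain from downloading likely low-quality segments
        selected.append(tag["segment_id"])
    return selected

tags = [
    {"segment_id": "seg-01", "is_transition": False, "speed_change": False},
    {"segment_id": "seg-02", "is_transition": True,  "speed_change": False},
    {"segment_id": "seg-03", "is_transition": False, "speed_change": True},
    {"segment_id": "seg-04", "is_transition": False, "speed_change": False},
]
print(select_segments_to_download(tags))  # -> ['seg-01', 'seg-04']
```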
In some implementations, the downloading and segment selection process is performed by the computing device 126 with user input. For example, a user may identify certain portions of the video as being more interesting and/or of better image quality when viewing the streamed video, and download the video segments corresponding to these portions for subsequent use.
In some implementations, the computing device 126 simultaneously performs the Level 1 download and the Level 2 download that are illustrated in Figure 17. In some implementations, the computing device 126 sequentially performs the Level 1 download and the Level 2 download.
In some implementations, the edited video comprises a video template. Figure 18 illustrates an exemplary video template matching strategy (1802) in accordance with some implementations.
In the example of Figure 18, the computing device 126 determines, based on the flight route (e.g., using the metadata of the captured video) , whether the captured video corresponds to the long range flight route (1322) , the normal flight route (1320) , or the portrait flight route (1330) . In accordance with a determination that the flight route corresponds to the long range flight route (1322) , the computing device 126 selects a first template strategy (e.g., template strategy A 1810) . In accordance with a determination that the flight route corresponds to the normal flight route (1320) or the portrait flight route (1330) , the computing device 126 selects a second template strategy (e.g., template strategy B 1812) .
In some implementations, each of the template strategies further comprises a plurality of templates with different themes (or styles) . For example, the first template strategy 1810 includes templates of different themes 1804 whereas the second template strategy 1812 includes templates of different themes 1806, for selection by the user. As discussed previously in Figure 12, each of the themes may generate a different effect on the video.
Figure 19 illustrates another exemplary video template matching strategy 1900 in accordance with some implementations.
In some implementations, the template matching strategy 1900 comprises a theme selection 1902. For example, the computing device 126 may display one or more themes that are available for selection by the user (e.g., a joyous theme, an action theme, a scenic theme, and an artistic theme) . After the user selects a theme, the computing device 126 may initiate a template selection step 1903, in which templates corresponding to the selected theme are presented to the user. The computing device 126 also prompts the user to select a template.
In some implementations, instead of the separate steps of theme selection 1902 and template selection 1903, the computing device 126 displays a list of all available templates to the user and prompts the user to select a template.
In some implementations, the computing device 126 automatically selects a theme and a template based on the flight route, which can be subsequently modified by the user.
In some implementations, the computing device 126 may prompt the user to select music (1904) that matches the theme. In some implementations, the music is automatically selected by the computing device 126.
In some implementations, each of the templates includes scenes 1906. In the example of Figure 19, the scenes 1906 include an opening scene, one or more intermediate  scenes (e.g., intermediate scene 1 and intermediate scene 2, Figure 19) and a concluding scene.
The computing device 126 determines, for each of the scenes 1906, one or more flight paths 1908 whose video segments may be used for the scene. In some implementations, the determination is based on the flight parameters (e.g., flight path, an angle and/or direction of field of view of the image sensor 216, a trajectory mode, etc. ) that are extracted from the metadata information. In some implementations, one flight path may be used in more than one scene. For example, in Figure 19, a video segment of flight path 2 may be used in the second intermediate scene as well as in the concluding scene. Furthermore, the sequence in which the flight paths are executed in a flight route of the movable object 102 does not have any bearing on the scene (s) in which they may be used. For example, the flight routes illustrated in Figures 14 to 16 each consist of flight paths (0) to (9) that are executed in this order. While the flight path (0) is the first flight path to be executed, video segments from the flight path (0) may be used in the opening scene as well as in the concluding scene, as illustrated in Figure 19.
In some implementations, the computing device 126 determines a total time duration for the edited video. Alternatively, the total time duration may be defined by the user (see, e.g., step 862 in Figure 8B) . Based on the total time duration, the computing device 126 determines a corresponding time duration for each of the scenes. The computing device 126 extracts, from the video segments for the corresponding scene, one or more video sub-segments 1910 according to a time duration of the scene.
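A simplified sketch of how scene durations and sub-segments might be derived from a user-specified total duration; the even split across scenes and the choice of taking the middle of each segment are illustrative assumptions, not the actual extraction policy.

```python
from typing import Dict, List, Tuple

def allocate_scene_durations(total_s: float, scenes: List[str]) -> Dict[str, float]:
    """Split the total edited-video duration evenly across the template's scenes."""
    per_scene = total_s / len(scenes)
    return {scene: per_scene for scene in scenes}

def extract_sub_segment(segment: Tuple[float, float],
                        wanted_s: float) -> Tuple[float, float]:
    """Take a sub-segment of the wanted length from the middle of a
    (start, end) segment, assuming the middle holds the most stable footage."""
    start, end = segment
    length = min(wanted_s, end - start)
    mid = (start + end) / 2.0
    return (mid - length / 2.0, mid + length / 2.0)

scenes = ["opening", "intermediate_1", "intermediate_2", "concluding"]
durations = allocate_scene_durations(60.0, scenes)              # 15 s per scene
print(durations["opening"])                                      # -> 15.0
print(extract_sub_segment((10.0, 40.0), durations["opening"]))   # -> (17.5, 32.5)
```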
The video template matching strategies illustrated in Figures 18 and 19 improve over existing post-editing templates that are available on the market. The existing templates only allow a user to design general templates and perform basic video editing functions such as segmentation, changing a playback speed, image transformation (e.g., image translation, zoom, rotation, crop, mirror, mask, changing a degree of image transparency, keyframe editing) , changing a transition effect (e.g., basic transitions, mirror transitions, special effects transitions, and mask transitions) , and other aspects of the effect that can be achieved by manual editing. However, these editing effects do not improve the actual quality of the raw video capture.
In contrast to the abovementioned existing post-editing templates, which do not contain any a priori information about video capture, the computing device 126 stores both the captured video as well as the corresponding metadata (e.g., tag) information. In other words, information such as a time duration of each of the flight paths, as well as information of a prior scene, an angle of view, and the relationship between the various flight paths and the target object 106, is known. Based on this prior information and the multiple high-quality video shots that are taken automatically by the image sensor 216, one can design rich post-editing templates combined with professional post-editing skills.
Figures 20A-20C provide a flowchart of a method 2000 according to some implementations.
The method 2000 is performed (2002) by an unmanned aerial vehicle (UAV) (e.g., the movable object 102) .
The UAV receives (2006) , from a computing device 126 that is communicatively connected to the UAV, a first input that includes identification of a target object 106. This is illustrated in Figure 9.
In response (2008) to the first input, the UAV determines (2010) a target type corresponding to the target object 106. For example, the target type may include a person, an animal, a vehicle, a building, a sculpture, a mountain, or the sea. In some implementations, determining the target type further comprises employing image recognition algorithms.
The UAV determines (2012) , a distance between the UAV and the target object 106.
In some implementations, the UAV may include GPS sensing technology. The target object 106 may include GPS or ultra-wide band (UWB) . The UAV may determine a distance between the UAV and the target object 106 using their respective locations, which are obtained via GPS. In some implementations, a user may input coordinates corresponding to the target object 106 and the UAV determines the distance between the UAV and the target object 106 based on the coordinates. Alternatively, in some implementations, a user may identify a location of the target object 106 using a map that is displayed on a user interface (e.g., a graphical user interface 630) .
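As a sketch, the UAV-to-target distance from two GPS fixes could be approximated with the haversine formula; ignoring the altitude difference and using illustrative coordinates are simplifying assumptions.

```python
import math

def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two GPS fixes (altitude ignored)."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# UAV and target roughly 100 m apart (illustrative coordinates).
print(round(haversine_m(22.5431, 114.0579, 22.5440, 114.0579), 1))
```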
The UAV selects (2014) automatically, from a plurality of predefined flight routes, a flight route for the UAV. In some implementations, the UAV employs a flight route matching strategy 1302 to select the flight route, as illustrated in Figure 13.
In some implementations, the plurality of predefined flight routes include (2016) a portrait flight route, a long range flight route, and a normal flight route. This is illustrated in Figures 13 to 16.
In some implementations, the UAV automatically customizes the selected flight route by taking into consideration factors such as an updated distance between the UAV and the target object 106, presence of potential obstacle (s) and/or other structures (e.g., buildings and trees) , or weather conditions. In some implementations, customizing the flight route includes modifying a rate of ascent of the UAV, an initial velocity of the UAV, and/or an acceleration of the UAV. In some implementations, the customization is provided in part by a user. For example, depending on the target type and the distance, the UAV may cause the user interface 630 to display a library of trajectories that can be selected by the user. The UAV then automatically generates the paths of the flight route based on the user selections.
The selected flight route includes (2020) a plurality of paths of different trajectory modes. This is illustrated in Figures 14 to 16.
In some implementations, each of the plurality of paths (2022) comprises a respective one or more of: a path distance, a velocity of the UAV, an acceleration of the UAV, a flight time, an angle of view, a starting altitude, an ending altitude, a pan tilt zoom (PTZ) setting of the image sensor, an optical zoom setting of the image sensor, a digital zoom setting of the image sensor, and a focal length of the image sensor.
In some implementations, the trajectory modes include (2024) a first trajectory mode, corresponding to: the UAV traveling in a first direction; and a field of view of an image sensor of the UAV facing a second direction, opposite to the first direction. This is illustrated in flight path (1) in Figure 14.
In some implementations, the trajectory modes include (2026) a first trajectory mode, corresponding to: the UAV traveling in a first direction; and a field of view of an image sensor of the UAV facing the first direction. This is illustrated in flight path (9) in Figure 14.
In some instances, the method further comprises (2028) rotating the field of view of an image sensor of the UAV from the first direction to a second direction, distinct from the first direction, while executing the first trajectory mode.
In some implementations, the trajectory modes include (2030) a first trajectory mode, corresponding to: the UAV traveling in a first direction; and a field of view of an image sensor of the UAV facing a second direction, perpendicular to the first direction. This is illustrated in flight path (8) in Figure 14.
The UAV sends (2032) to the computing device 126 the selected flight route for display on the computing device.
In some implementations, sending to the computing device 126 the selected flight route further comprises (2034) causing to be displayed on the computing device 126 a preview of the selected flight route.
In some instances, the preview comprises (2036) a three-dimensional or two-dimensional representation of the selected flight route. For example, the three-dimensional representation may be based on a superposition of satellite images, e.g., images from Google Earth, corresponding to each of the flight paths of the selected flight route.
In some instances, the preview comprises (2038) a map of a vicinity of the UAV and the target object. This is illustrated in Figure 10.
In some implementations, after the sending, the UAV receives (2040) from the computing device 126 a second input. For example, the UAV receives from the computing device an input corresponding to user selection of the “Confirm” affordance 1010 in Figure 10.
In response to the second input, the UAV controls (2042) the UAV to fly autonomously according to the selected flight route, including capturing by an image sensor of the UAV a video feed having a field of view of the image sensor and corresponding to each path of the plurality of paths.
In some instances, the UAV simultaneously stores (2044) the video feed while the video feed is being captured (e.g., video data 428, Figure 4) .
In some instances, the UAV stores (2046) with the video feed tag information associated with the flight path and trajectory mode corresponding to a respective segment of the video feed.
Figures 21A-21C provide a flowchart of a method 2100 according to some implementations.
The method 2100 comprises a method for editing (2102) a video.
The method 2100 is performed (2104) at a computing device 126.
The computing device 126 has (2106) one or more processors 602 and memory 604.
The memory 604 stores (2108) programs to be executed by the one or more processors. In some implementations, the programs include an application 620.
The video includes (2110) a plurality of video segments captured by an unmanned aerial vehicle (UAV) (e.g., the movable object 102) during a flight route associated with a target object 106. This is illustrated in Figure 17.
Each of the video segments corresponds (2112) to a respective path of the flight route.
The computing device 126 obtains (2114) a set of tag information for each of the plurality of video segments.
The computing device 126 selects (2118) , from a plurality of video templates, a video template for the video.
The selected video template includes (2120) a plurality of scenes. This is illustrated in Figure 19.
In some implementations, the plurality of scenes include (2122) : an opening scene, one or more intermediate scenes, and a concluding scene. This is illustrated in Figure 19.
In some implementations, the selected video template comprises (2128) a theme and includes music that matches the theme.
In some implementations, the plurality of video templates comprise (2130) a plurality of themes. For example, the plurality of themes may include: an artistic theme, an action theme, a scenic theme, a dynamic theme, a rhythmic theme, and a joyous theme.
The computing device 126 extracts (2134) , from the plurality of video segments, one or more video sub-segments according to the tag information and the selected video template.
In some implementations, the method 2100 further comprises: prior to (2140) the extracting, receiving a user input specifying a total time duration of the video.
In some instances, the time duration of the scene is (2142) based on the total time duration of the video.
In some instances, the method 2100 further comprises automatically allocating (2144) a time duration for the video sub-segment.
The computing device 126 combines (2146) the extracted video sub-segments from the plurality of scenes into a complete video of the flight route of the UAV.
In some implementations, the extracted video sub-segments are (2148) combined according to a time sequence in which the video sub-segments are captured.
In some implementations, the extracted video sub-segments are (2150) combined in a time sequence that is defined by the selected video template.
Many features of the present invention can be performed in, using, or with the assistance of hardware, software, firmware, or combinations thereof. Consequently, features of the present invention may be implemented using a processing system. Exemplary processing systems (e.g., processor (s) 116, controller 210, controller 218, processor (s) 502 and/or processor (s) 602) include, without limitation, one or more general purpose microprocessors (for example, single or multi-core processors) , application-specific integrated circuits, application-specific instruction-set processors, field-programmable gate arrays, graphics processing units, physics processing units, digital signal processing units, coprocessors, network processing units, audio processing units, encryption processing units, and the like.
Features of the present invention can be implemented in, using, or with the assistance of a computer program product, such as a storage medium (media) or computer readable medium (media) having instructions stored thereon/in which can be used to program a processing system to perform any of the features presented herein. The storage medium (e.g., memory 118, 504, 604) can include, but is not limited to, any type of disk including floppy disks, optical discs, DVD, CD-ROMs, microdrive, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, DRAMs, VRAMs, DDR RAMs, flash memory devices, magnetic or optical cards, nanosystems (including molecular memory ICs), or any type of media or device suitable for storing instructions and/or data.
Stored on any one of the machine readable medium (media), features of the present invention can be incorporated in software and/or firmware for controlling the hardware of a processing system, and for enabling a processing system to interact with other mechanisms utilizing the results of the present invention. Such software or firmware may include, but is not limited to, application code, device drivers, operating systems and execution environments/containers.
Communication systems as referred to herein (e.g., communication systems 120, 510, 610) optionally communicate via wired and/or wireless communication connections. For example, communication systems optionally receive and send RF signals, also called electromagnetic signals. RF circuitry of the communication systems converts electrical signals to/from electromagnetic signals and communicates with communications networks and other communications devices via the electromagnetic signals. RF circuitry optionally includes well-known circuitry for performing these functions, including but not limited to an antenna system, an RF transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a CODEC chipset, a subscriber identity module (SIM) card, memory, and so forth. Communication systems optionally communicate with networks, such as the Internet, also referred to as the World Wide Web (WWW), an intranet and/or a wireless network, such as a cellular telephone network, a wireless local area network (LAN) and/or a metropolitan area network (MAN), and other devices by wireless communication. Wireless communication connections optionally use any of a plurality of communications standards, protocols and technologies, including but not limited to Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), high-speed downlink packet access (HSDPA), high-speed uplink packet access (HSUPA), Evolution, Data-Only (EV-DO), HSPA, HSPA+, Dual-Cell HSPA (DC-HSDPA), long term evolution (LTE), near field communication (NFC), wideband code division multiple access (W-CDMA), code division multiple access (CDMA), time division multiple access (TDMA), Bluetooth, Wireless Fidelity (Wi-Fi) (e.g., IEEE 802.11a, IEEE 802.11ac, IEEE 802.11ax, IEEE 802.11b, IEEE 802.11g and/or IEEE 802.11n), voice over Internet Protocol (VoIP), Wi-MAX, a protocol for e-mail (e.g., Internet message access protocol (IMAP) and/or post office protocol (POP)), instant messaging (e.g., extensible messaging and presence protocol (XMPP), Session Initiation Protocol for Instant Messaging and Presence Leveraging Extensions (SIMPLE), Instant Messaging and Presence Service (IMPS)), and/or Short Message Service (SMS), or any other suitable communication protocol, including communication protocols not yet developed as of the filing date of this document.
While various implementations of the present invention have been described above, it should be understood that they have been presented by way of example, and not limitation. It will be apparent to persons skilled in the relevant art that various changes in form and detail can be made therein without departing from the spirit and scope of the invention.
The present invention has been described above with the aid of functional building blocks illustrating the performance of specified functions and relationships thereof. The boundaries of these functional building blocks have often been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Any such alternate boundaries are thus within the scope and spirit of the invention.
The terminology used in the description of the various described implementations herein is for the purpose of describing particular implementations only and is not intended to be limiting. As used in the description of the various described implementations and the appended claims, the singular forms “a, ” “an, ” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “includes, ” “including, ” “comprises, ” and/or “comprising, ” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
As used herein, the term “if” may be construed to mean “when” or “upon” or “in response to determining” or “in accordance with a determination” or “in response to detecting, ” that a stated condition precedent is true, depending on the context. Similarly, the phrase “if it is determined [that a stated condition precedent is true] ” or “if [a stated condition precedent is true] ” or “when [a stated condition precedent is true] ” may be construed to mean “upon determining” or “in response to determining” or “in accordance with a determination” or “upon detecting” or “in response to detecting” that the stated condition precedent is true, depending on the context.
The foregoing description of the present invention has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed. The breadth and scope of the present invention should not be limited by any of the above-described exemplary implementations. Many modifications and variations will be apparent to the practitioner skilled in the art. The modifications and variations include any relevant combination of the disclosed features. The implementations were chosen and described in order to best explain the principles of the invention and its practical application, thereby enabling others skilled in the art to understand the invention for various implementations and with various modifications that are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the following claims and their equivalents.

Claims (30)

  1. A method performed by an unmanned aerial vehicle (UAV) , the method comprising:
    receiving, from a computing device that is communicatively connected to the UAV, a first input that includes identification of a target object;
    in response to the first input:
    determining a target type corresponding to the target object;
    determining a distance between the UAV and the target object;
    selecting automatically, from a plurality of predefined flight routes, a flight route for the UAV according to the determined target type and the distance; and
    sending to the computing device the selected flight route for display on the computing device.
  2. The method of claim 1, wherein the selected flight route includes a plurality of paths of different trajectory modes.
  3. The method of claim 2, further comprising:
    after the sending, receiving from the computing device a second input; and
    in response to the second input, controlling the UAV to fly autonomously according to the selected flight route, including capturing by an image sensor of the UAV a video feed having a field of view of the image sensor and corresponding to each path of the plurality of paths.
  4. The method of claim 3, further comprising simultaneously storing the video feed while the video feed is being captured.
  5. The method of claim 4, further comprising storing, with the video feed, tag information associated with the flight path and trajectory mode corresponding to a respective segment of the video feed.
  6. The method of claim 4, further comprising simultaneously sending the video feed and tag information associated with the flight path and trajectory mode corresponding to a respective segment of the video feed to the computing device for storage on the remote control device.
  7. The method of claim 1, wherein the plurality of predefined flight routes include: a portrait flight route, a long range flight route, and a normal flight route.
  8. The method of claim 1, wherein each of the plurality of paths comprises a respective one or more of:
    a path distance;
    a velocity of the UAV;
    an acceleration of the UAV;
    a flight time;
    an angle of view;
    a starting altitude;
    an ending altitude;
    a pan tilt zoom (PTZ) setting of the image sensor;
    an optical zoom setting of the image sensor;
    a digital zoom setting of the image sensor; and
    a focal length of the image sensor.
  9. The method of claim 1, wherein sending to the computing device the selected flight route further comprises causing to be displayed on the computing device a preview of the selected flight route.
  10. The method of claim 9, wherein the preview comprises a three-dimensional or two-dimensional representation of the selected flight route.
  11. The method of claim 9, wherein the preview comprises a map of a vicinity of the UAV and the target object.
  12. The method of claim 1, wherein the trajectory modes include a first trajectory mode, corresponding to:
    the UAV traveling in a first direction; and
    a field of view of an image sensor of the UAV facing a second direction, opposite to the first direction.
  13. The method of claim 1, wherein the trajectory modes include a first trajectory mode, corresponding to:
    the UAV traveling in a first direction; and
    a field of view of an image sensor of the UAV facing the first direction.
  14. The method of claim 13, further comprising:
    rotating the field of view of the image sensor from the first direction to a second direction, distinct from the first direction, while executing the first trajectory mode.
  15. The method of claim 1, wherein the trajectory modes include a first trajectory mode, corresponding to:
    the UAV traveling in a first direction; and
    a field of view of an image sensor of the UAV facing a second direction, perpendicular to the first direction.
  16. A method for editing a video performed at a computing device having one or more processors and memory storing programs to be executed by the one or more processors, wherein the video includes a plurality of video segments captured by an unmanned aerial vehicle (UAV) during a flight route associated with a target object, each of the video segments corresponding to a respective path of the flight route, the method comprising:
    obtaining a set of tag information for each of the plurality of video segments;
    selecting, from a plurality of video templates, a video template for the video;
    extracting, from the plurality of video segments, one or more video sub-segments according to the tag information and the selected video template; and
    combining the extracted video sub-segments into a complete video of the flight route of the UAV.
  17. The method of claim 16, wherein the selected video template includes a plurality of scenes, each of the scenes corresponding to a respective subset of the tag information.
  18. The method of claim 16, wherein the plurality of scenes include: an opening scene, one or more intermediate scenes, and a concluding scene.
  19. The method of claim 16, wherein:
    the selected video template comprises a theme; and
    the selected video template includes music that matches the theme.
  20. The method of claim 16, wherein the extracted video sub-segments are combined according to a time sequence in which the video sub-segments are captured.
  21. The method of claim 16, wherein the extracted video sub-segments are combined in a time sequence that is defined by the selected video template.
  22. The method of claim 16, further comprising:
    prior to the extracting, receiving a user input specifying a total time duration of the video.
  23. The method of claim 22, further comprising automatically allocating a time for the video sub-segment.
  24. The method of claim 16, wherein the plurality of video templates comprise a plurality of themes.
  25. The method of claim 16, wherein the selected video template is selected based on a user input.
  26. The method of claim 16, wherein:
    the flight route is one of a plurality of predefined flight routes; and
    the plurality of video templates are determined based on the flight route.
  27. An unmanned aerial vehicle (UAV), comprising:
    an image sensor;
    one or more processors; and
    memory having instructions stored thereon, which when executed by the one or more processors cause the processors to perform the method of any of claims 1-15.
  28. A computing device for editing a video, wherein the video includes a plurality of video segments captured by an unmanned aerial vehicle (UAV) during a flight route associated with a target object, each of the video segments corresponding to a respective path of the flight route, the computing device comprising:
    one or more processors; and
    memory having instructions stored thereon, which when executed by the one or more processors cause the processors to perform the method of any of claims 16-26.
  29. A non-transitory computer-readable storage medium having instructions stored thereon, which when executed by one or more processors of an unmanned aerial vehicle (UAV) cause the processors to perform the method of any of claims 1-15.
  30. A non-transitory computer-readable storage medium having instructions stored thereon, which when executed by one or more processors of a computing system cause the processors to perform the method of any of claims 16-26.
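For the capture-side method of claims 1-15, the automatic route-selection step can likewise be pictured as a small rule keyed on target type and distance. In the sketch below, the route names come from claim 7, while the target-type labels and distance thresholds are invented for illustration, since the claims only require that the route be selected automatically from a plurality of predefined routes.

  def select_flight_route(target_type: str, distance_m: float) -> str:
      # Hypothetical selection rule: nearby people get the portrait route,
      # distant targets the long range route, everything else the normal route.
      if target_type == "person" and distance_m < 10.0:
          return "portrait flight route"
      if distance_m > 100.0:
          return "long range flight route"
      return "normal flight route"

  # Example: a person detected 6 m from the UAV maps to the portrait route here.
  print(select_flight_route("person", 6.0))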
PCT/CN2020/142023 2020-12-31 2020-12-31 Systems and methods for supporting automatic video capture and video editing WO2022141369A1 (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
PCT/CN2020/142023 WO2022141369A1 (en) 2020-12-31 2020-12-31 Systems and methods for supporting automatic video capture and video editing
PCT/CN2021/087611 WO2022141955A1 (en) 2020-12-31 2021-04-15 Unmanned aerial vehicle control method, apparatus, and computer-readable storage medium
CN202180005825.0A CN114556256A (en) 2020-12-31 2021-04-15 Flight control method, video editing method, device, unmanned aerial vehicle and storage medium
PCT/CN2021/087612 WO2022141956A1 (en) 2020-12-31 2021-04-15 Flight control method, video editing method, device, unmanned aerial vehicle, and storage medium
CN202180006635.0A CN114981746A (en) 2020-12-31 2021-04-15 Unmanned aerial vehicle control method and device and computer readable storage medium
US18/215,729 US20230359204A1 (en) 2020-12-31 2023-06-28 Flight control method, video editing method, device, uav and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/142023 WO2022141369A1 (en) 2020-12-31 2020-12-31 Systems and methods for supporting automatic video capture and video editing

Related Child Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/087612 Continuation WO2022141956A1 (en) 2020-12-31 2021-04-15 Flight control method, video editing method, device, unmanned aerial vehicle, and storage medium

Publications (1)

Publication Number Publication Date
WO2022141369A1 true WO2022141369A1 (en) 2022-07-07

Family

ID=82258868

Family Applications (3)

Application Number Title Priority Date Filing Date
PCT/CN2020/142023 WO2022141369A1 (en) 2020-12-31 2020-12-31 Systems and methods for supporting automatic video capture and video editing
PCT/CN2021/087612 WO2022141956A1 (en) 2020-12-31 2021-04-15 Flight control method, video editing method, device, unmanned aerial vehicle, and storage medium
PCT/CN2021/087611 WO2022141955A1 (en) 2020-12-31 2021-04-15 Unmanned aerial vehicle control method, apparatus, and computer-readable storage medium

Family Applications After (2)

Application Number Title Priority Date Filing Date
PCT/CN2021/087612 WO2022141956A1 (en) 2020-12-31 2021-04-15 Flight control method, video editing method, device, unmanned aerial vehicle, and storage medium
PCT/CN2021/087611 WO2022141955A1 (en) 2020-12-31 2021-04-15 Unmanned aerial vehicle control method, apparatus, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN114981746A (en)
WO (3) WO2022141369A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023211616A1 (en) * 2022-04-27 2023-11-02 Snap Inc. Editing video captured by electronic devices using associated flight path information

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115435776B (en) * 2022-11-03 2023-03-14 成都沃飞天驭科技有限公司 Method and device for displaying three-dimensional airway route, aircraft and storage medium
CN115499596B (en) * 2022-11-18 2023-05-30 北京中科觅境智慧生态科技有限公司 Method and device for processing image
CN116109956A (en) * 2023-04-12 2023-05-12 安徽省空安信息技术有限公司 Unmanned aerial vehicle self-adaptive zooming high-precision target detection intelligent inspection method

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150208023A1 (en) * 2014-01-20 2015-07-23 H4 Engineering, Inc. Neural network for video editing
US20170244937A1 (en) * 2014-06-03 2017-08-24 Gopro, Inc. Apparatus and methods for aerial video acquisition
US20170300759A1 (en) * 2016-03-03 2017-10-19 Brigham Young University Automated multiple target detection and tracking system
US20180103197A1 (en) * 2016-10-06 2018-04-12 Gopro, Inc. Automatic Generation of Video Using Location-Based Metadata Generated from Wireless Beacons
CN108513641A (en) * 2017-05-08 2018-09-07 深圳市大疆创新科技有限公司 Unmanned plane filming control method, unmanned plane image pickup method, control terminal, unmanned aerial vehicle (UAV) control device and unmanned plane
CN109565605A (en) * 2016-08-10 2019-04-02 松下电器(美国)知识产权公司 Technique for taking generation method and image processor
CN110139149A (en) * 2019-06-21 2019-08-16 上海摩象网络科技有限公司 A kind of video optimized method, apparatus, electronic equipment
CN110692027A (en) * 2017-06-05 2020-01-14 杭州零零科技有限公司 System and method for providing easy-to-use release and automatic positioning of drone applications
CN110853032A (en) * 2019-11-21 2020-02-28 北京航空航天大学 Unmanned aerial vehicle video aesthetic quality evaluation method based on multi-mode deep learning

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105120146B (en) * 2015-08-05 2018-06-26 普宙飞行器科技(深圳)有限公司 It is a kind of to lock filming apparatus and image pickup method automatically using unmanned plane progress moving object
US10587790B2 (en) * 2015-11-04 2020-03-10 Tencent Technology (Shenzhen) Company Limited Control method for photographing using unmanned aerial vehicle, photographing method using unmanned aerial vehicle, mobile terminal, and unmanned aerial vehicle
US11295458B2 (en) * 2016-12-01 2022-04-05 Skydio, Inc. Object tracking by an unmanned aerial vehicle using visual sensors
CN106657779B (en) * 2016-12-13 2022-01-04 北京远度互联科技有限公司 Surrounding shooting method and device and unmanned aerial vehicle
US10375289B2 (en) * 2017-03-31 2019-08-06 Hangzhou Zero Zero Technology Co., Ltd. System and method for providing autonomous photography and videography
CN107990877B (en) * 2017-12-06 2020-07-10 华中师范大学 Internet-based unmanned aerial vehicle remote sensing interpretation field investigation system and method
CN110362098B (en) * 2018-03-26 2022-07-05 北京京东尚科信息技术有限公司 Unmanned aerial vehicle visual servo control method and device and unmanned aerial vehicle
CN108566513A (en) * 2018-03-28 2018-09-21 深圳臻迪信息技术有限公司 A kind of image pickup method of unmanned plane to moving target
WO2021035731A1 (en) * 2019-08-30 2021-03-04 深圳市大疆创新科技有限公司 Control method and apparatus for unmanned aerial vehicle, and computer readable storage medium

Also Published As

Publication number Publication date
WO2022141955A1 (en) 2022-07-07
WO2022141956A1 (en) 2022-07-07
CN114981746A (en) 2022-08-30

Similar Documents

Publication Publication Date Title
US10802491B2 (en) Methods and systems for target tracking
US11632497B2 (en) Systems and methods for controlling an image captured by an imaging device
US20220091607A1 (en) Systems and methods for target tracking
US11669987B2 (en) Obstacle avoidance during target tracking
US11019255B2 (en) Depth imaging system and method of rendering a processed image to include in-focus and out-of-focus regions of one or more objects based on user selection of an object
US11513511B2 (en) Techniques for image recognition-based aerial vehicle navigation
WO2022141369A1 (en) Systems and methods for supporting automatic video capture and video editing
US20230259132A1 (en) Systems and methods for determining the position of an object using an unmanned aerial vehicle
WO2022141187A1 (en) Systems and methods for controlling an unmanned aerial vehicle using a body-attached remote control

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20967708

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20967708

Country of ref document: EP

Kind code of ref document: A1