WO2019144296A1 - Control method and apparatus for a movable platform, and movable platform
- Publication number: WO2019144296A1 (application PCT/CN2018/073879)
- Authority: WIPO (PCT)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/10—Image acquisition
- Embodiments of the present invention relate to the field of control, and in particular, to a control method and apparatus for a movable platform, and to a movable platform.
- A movable platform (such as a drone) can track a target object so that the user always remains in the picture captured by the camera of the movable platform, without having to operate the handheld control terminal.
- Embodiments of the present invention provide a control method and apparatus for a movable platform, and a movable platform, so as to improve the reliability and robustness of target object tracking by the movable platform.
- an embodiment of the present invention provides a method for controlling a mobile platform, including:
- a detection frame of the palm of the target object is determined from the detection frames of the palms of the objects according to the joint points of the target object.
- an embodiment of the present invention provides a control device for a mobile platform, including: a processor and a memory;
- the memory is configured to store a computer program;
- the processor is configured to execute the computer program stored in the memory to perform:
- a detection frame of the palm of the target object is determined from the detection frames of the palms of the objects according to the joint points of the target object.
- an embodiment of the present invention provides a readable storage medium storing a computer program; when the computer program is executed, the control method of the movable platform according to the first aspect can be implemented.
- an embodiment of the present invention provides a mobile platform, including a photographing device, and the control device according to the second aspect.
- The control method, apparatus, and movable platform provided by the embodiments of the invention can determine the tracking frame of the feature part of the target object in the image output by the photographing device, and identify the joint points of all objects in the image and the detection frames of the palms of all objects.
- By using the joint points of the target object as an intermediate bridge, precise matching between the tracking frame of the feature part of the target object and the detection frame of the palm of the target object is achieved, so that the movable platform can stably and continuously recognize the detection frame of the palm of the target object, which solves the prior-art problem that the palm of the target object is easily matched to the wrong object.
- an embodiment of the present invention provides a method for controlling a mobile platform, including:
- Each of the tracking frames is matched with the detection frames in a mutually exclusive manner, or each of the detection frames is matched with the tracking frames in a mutually exclusive manner, to determine a plurality of matching results;
- the target tracking frame is updated by the target detection frame to obtain a tracking frame of the updated feature part.
- an embodiment of the present invention provides a control device for a mobile platform, including: a processor and a memory;
- the memory is configured to store a computer program;
- the processor is configured to execute the computer program stored in the memory to perform:
- Each of the tracking frames is matched with the detection frames in a mutually exclusive manner, or each of the detection frames is matched with the tracking frames in a mutually exclusive manner, to determine a plurality of matching results;
- the target tracking frame is updated by the target detection frame to obtain a tracking frame of the updated feature part.
- an embodiment of the present invention provides a readable storage medium storing a computer program; when the computer program is executed, the control method of the movable platform according to the fifth aspect can be implemented.
- an embodiment of the present invention provides a mobile platform, including a photographing apparatus, and the control apparatus according to the sixth aspect.
- The control method, apparatus, and movable platform provided by the embodiments of the present invention perform mutually exclusive matching between the detection frames and the tracking frames of the feature parts of all objects, and then update the successfully matched target tracking frame with the corresponding target detection frame to obtain the updated tracking frame of the feature part.
- The embodiments of the invention can thereby complete the update process for the tracking frames of the feature parts of all objects and improve the accuracy with which the movable platform tracks according to the tracking frame of the feature part of the tracked object. This solves the prior-art problem that interference from other objects and from similar regions of the background causes the movable platform to follow the wrong object, thereby providing a stable and reliable tracking object for the control of the movable platform in complex and varied user environments.
- FIG. 1 is a schematic diagram of an application scenario of a mobile platform photographing provided by the present invention.
- FIG. 2 is a flowchart of a method for controlling a mobile platform according to an embodiment of the present invention
- FIG. 3a is a schematic diagram of an image in a method for controlling a mobile platform according to an embodiment of the present invention
- FIG. 3b is a schematic diagram of a tracking frame of a feature part of a target object in a method for controlling a mobile platform according to an embodiment of the present invention
- 3c is a schematic diagram of a joint point of an object in a method for controlling a mobile platform according to an embodiment of the present invention
- FIG. 3d is a schematic diagram of a detection frame of a palm of an object in a method for controlling a mobile platform according to an embodiment of the present invention
- FIG. 4 is a flowchart of a method for determining a joint point of a target object from a joint point of an object according to a tracking frame of a feature part of a target object according to an embodiment of the present invention
- FIG. 5 is a flowchart of another method for determining a joint point of a target object from a joint point of an object according to a tracking frame of a feature part of a target object according to an embodiment of the present invention
- FIG. 6 is a flowchart of a method for determining a detection frame of a palm of a target object from a detection frame of a palm of the object according to a joint point of the target object according to an embodiment of the present invention
- FIG. 7 is a schematic structural diagram of a control apparatus of a mobile platform according to an embodiment of the present invention.
- FIG. 8 is a schematic structural diagram of a mobile platform according to an embodiment of the present invention.
- FIG. 9 is a flowchart of a method for controlling a mobile platform according to an embodiment of the present invention.
- FIG. 10 is a flowchart of a method for mutually matching each of the tracking frames with the detection frame or mutually matching the detection frame with the tracking frame to determine a plurality of matching results according to an embodiment of the present invention;
- FIG. 11 is a schematic structural diagram of a control apparatus of a mobile platform according to an embodiment of the present invention.
- FIG. 12 is a schematic structural diagram of a mobile platform according to an embodiment of the present invention.
- When a component is referred to as being "fixed" to another component, it can be directly on the other component, or an intervening component may also be present. When a component is considered to be "connected" to another component, it can be directly connected to the other component, or an intervening component may also be present.
- FIG. 1 is a schematic diagram of an application scenario of a mobile platform photographing provided by the present invention.
- the movable platform involved in the embodiments of the present invention may include, but is not limited to, a drone, an unmanned vehicle, and an unmanned ship.
- the mobile platform is specifically described by taking the drone 101 as an example.
- the drone 101 of the later-described portion can be replaced by a movable platform.
- the UAV 101 is provided with a pan/tilt head 102 that can be rotated.
- the PTZ 102 is provided with an imaging device 103.
- the UAV 101 can adjust the orientation of the imaging device 103 by controlling the posture of the PTZ 102, and the imaging device 103 can capture images of the environment, for example an image containing the object 104.
- the drone 101 is capable of transmitting the captured image to the control terminal 105 in real time and displaying the image on the display screen of the control terminal 105.
- the control terminal 105 can be one or more of a remote controller, a mobile phone, a laptop computer, and a tablet computer, which is not limited in this embodiment.
- A traditional palm tracking algorithm searches, within a target image region of the current image frame, for the palm tracking frame most similar to the palm tracking frame of the target object (the object tracked by the drone) in the image captured by the photographing device at a historical time (similarity in terms of position proximity, image-region size, and the appearance of the image inside the tracking frame), where the target image region may be determined from the position of the palm of the target object in the image captured by the photographing device at the historical time.
- However, the palms of different objects are almost indistinguishable in the image, and the traditional palm tracking algorithm cannot recognize whose palm it is.
- Therefore, a trained neural network is usually used to identify the detection frames of the palms of the objects, so as to match the target object with the palm of the target object. However, when the target object is very close to the drone and the drone itself is moving, the palm of the target object may exhibit motion blur in the image. In this case, it is difficult to continuously detect the detection frame of the palm of the target object with the neural network, and the real position of the palm of the target object may even have moved outside the target image region. If the palm search region is simply enlarged, the palm of the target object is more easily matched to another object.
- In the embodiment of the present invention, the control device of the movable platform is used as the execution body (not shown in FIG. 1). By using the joint points of the target object as an intermediate bridge, the control device of the movable platform can accurately determine the palm of the tracked target object, which solves the prior-art problem of matching errors caused by the inability to continuously detect the palm of the target object.
- the control method of the movable platform will be described in detail through a specific embodiment.
- FIG. 2 is a flowchart of a method for controlling a movable platform according to an embodiment of the present invention; FIG. 3a is a schematic diagram of an image in the control method of the movable platform according to an embodiment of the present invention; FIG. 3b is a schematic diagram of a tracking frame of a feature part of a target object in the control method; FIG. 3c is a schematic diagram of joint points of objects in the control method; and FIG. 3d is a schematic diagram of detection frames of the palms of objects in the control method.
- the control method of the mobile platform of this embodiment may include:
- the movable platform may be configured with a photographing device for photographing and outputting an image.
- the control device of the movable platform can receive an image output by the camera, and further, the processor of the control device can receive an image output by the camera.
- At least one object may be included in the image output by the camera, and at least one object includes at least one target object, wherein the target object is an object tracked by the movable platform.
- the control device of the movable platform can identify the tracking frame of the feature part of the target object from the image, and the control device of the mobile platform can identify the object in the environment by detecting the feature part in the image.
- the feature part may be a head, or a head and a shoulder, or may be a human body, which is not limited in this embodiment.
- The tracking frame of the feature part of the target object is an image region corresponding to the feature part of the target object.
- The implementation manners in which the control device of the movable platform determines the tracking frame of the feature part of the target object from the image may include the following:
- the control device of the mobile platform may determine a tracking frame of the feature portion of the target object from the image by using a tracking algorithm of the traditional target object. For example, after acquiring the current image frame, centering on the tracking frame of the feature part of the target object in the previous frame or the previous time of the current image, the image is extended in a local range, according to the image similarity function obtained by the training, In this local range, an image region that is most similar to the feature portion of the target object is determined, and the tracking frame of the feature portion of the target object is the image region.
- The training parameters in the image similarity function may include any one of a Euclidean distance, a block distance, a checkerboard distance, a weighted distance, a Bhattacharyya coefficient, and a Hausdorff distance.
- In addition, a core search algorithm such as a Kalman filter, a particle filter, a mean shift (Meanshift) algorithm, or an extended meanshift algorithm may be employed, or a correlation filter (Correlation Filter) algorithm, a random forest algorithm, a support vector machine (Support Vector Machine) algorithm, and the like, which are not limited in this embodiment.
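- As an illustration of this first implementation (a minimal sketch, not taken from the patent itself), the previous tracking frame can be enlarged into a local search window and the candidate region whose color histogram has the highest Bhattacharyya coefficient with the template can be selected; the helper names and the use of OpenCV histograms below are assumptions made for the example.

```python
import cv2
import numpy as np

def color_hist(patch, bins=16):
    """Hue histogram of a patch, normalized; a simple appearance model."""
    hsv = cv2.cvtColor(patch, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0], None, [bins], [0, 180])
    return cv2.normalize(hist, hist).flatten()

def bhattacharyya(h1, h2):
    """Bhattacharyya coefficient between two histograms: higher means more similar."""
    return float(np.sum(np.sqrt(h1 * h2)))

def track_by_local_search(image, prev_box, template_hist, expand=0.5, step=8):
    """Search an enlarged window around prev_box = (x, y, w, h) for the candidate
    region most similar to the template histogram, and return it as the new tracking frame."""
    x, y, w, h = prev_box
    dx, dy = int(w * expand), int(h * expand)
    H, W = image.shape[:2]
    best_box, best_score = prev_box, -1.0
    for cy in range(max(0, y - dy), min(H - h, y + dy) + 1, step):
        for cx in range(max(0, x - dx), min(W - w, x + dx) + 1, step):
            candidate = image[cy:cy + h, cx:cx + w]
            score = bhattacharyya(color_hist(candidate), template_hist)
            if score > best_score:
                best_score, best_box = score, (cx, cy, w, h)
    return best_box, best_score
```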
- In a second possible implementation manner, the tracking frame of the feature part of the target object can be determined from the image by the method provided in FIG. 9 below; for details, please refer to the later part of this document, which is not described here.
- The joint points of an object include at most 19 joint points: a left eye joint point, a right eye joint point, a nose joint point, a left ear joint point, a right ear joint point, a mouth joint point, a neck joint point, a left shoulder joint point, a right shoulder joint point, a left elbow joint point, a right elbow joint point, a left hand joint point, a right hand joint point, a left ankle joint point, a right ankle joint point, a left knee joint point, a right knee joint point, a left foot joint point, and a right foot joint point.
- the control device of the movable platform can identify the joint points of each object in the image, wherein each object corresponds to a set of joint points.
- In addition, the control device of the movable platform needs to identify the detection frame of the palm of each object in the image.
- the detection frame of the palm is an image area corresponding to the palm, wherein the detection frame may be represented in the form of image coordinates.
- the detection frame may be represented by the coordinates of the upper left corner of the image area and the coordinates of the lower right corner.
- The neural network can be obtained by training in advance on the palms of objects in a large number of offline images, and the control device of the movable platform can use the neural network to detect the image in real time and obtain a detection frame of the palm of each object; the network can return the position and size, in the image, of the detection frame of the palm of each object.
- the neural network can return the coordinates of the upper left and lower right corners of the detection frame of the palm of each object.
- the neural network may include a CNN, a normal deep neural network, a loop network, and the like, which is not limited in this embodiment.
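- Purely for illustration, the detection frames returned by such a detector can be represented by their corner coordinates as sketched below; the class and field names are assumptions, not definitions from the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Box:
    """Axis-aligned detection or tracking frame in image coordinates."""
    x1: float  # upper-left corner x
    y1: float  # upper-left corner y
    x2: float  # lower-right corner x
    y2: float  # lower-right corner y

    @property
    def center(self) -> Tuple[float, float]:
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)

def parse_palm_detections(raw: List[Tuple[float, float, float, float]]) -> List[Box]:
    """Convert raw (x1, y1, x2, y2) tuples returned by a palm detector into Box objects."""
    return [Box(*coords) for coords in raw]
```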
- each object corresponds to a set of joint points, and therefore, the control device of the movable platform needs to determine which set of joint points is the joint point of the target object.
- the control device of the movable platform can determine the joint point of the target object from the plurality of sets of joint points according to the tracking frame of the feature portion of the target object.
- After the joint points of the target object are determined according to the above steps, the control device of the movable platform determines the detection frame of the palm of the target object from the detection frames of the palms of the objects by comparing the matching relationship between the joint points of the target object and the detection frame of the palm of each object.
- The specific process in which the control method of the movable platform according to the embodiment of the present invention obtains the detection frame of the palm of the target object A is as follows:
- the tracking frame M of the feature portion of the target object A is determined from the image.
- the joint points of the objects A, B, and C in the image are identified, the joint points of the object A are represented by solid circles, the joint points of the object B are represented by open circles, and the joint points of the object C are represented by triangles.
- the detection frames of the palms of the objects A, B, and C in the image are identified as N1, N2, and N3, respectively.
- the joint point of the object A in FIG. 3c can be determined as the joint point of the target object according to the tracking frame M of the feature part of the target object A.
- The matching relationship between the joint points of the target object and the detection frames N1, N2, and N3 of the palms of the objects A, B, and C is then determined, and according to the matching relationship, N1 can be determined as the detection frame of the palm of the target object A.
- The control method of the movable platform provided by the embodiment of the present invention can determine the tracking frame of the feature part of the target object in the image output by the photographing device, and identify the joint points of all objects in the image and the detection frames of the palms of all objects.
- By using the joint points of the target object as an intermediate bridge, precise matching between the tracking frame of the feature part of the target object and the detection frame of the palm of the target object is achieved, so that the movable platform can stably and continuously recognize the detection frame of the palm of the target object, which solves the prior-art problem that the palm of the target object is easily matched to the wrong object.
- The control device of the movable platform not only needs to determine the target object, but also needs to determine the instruction issued by the target object according to the palm of the target object, so as to control the movable platform to perform corresponding actions according to the instruction.
- the control method of the movable platform of the embodiment further includes: identifying an action feature of the detection frame of the palm of the target object to control the action of the movable platform to perform the action feature indication.
- Since the target object controls the movable platform through its palm, the correspondence between the action of the palm of the target object and the action to be performed by the movable platform can be agreed upon in advance.
- The control device of the movable platform detects and analyzes the action characteristics of the detection frame of the palm of the target object in real time, and controls the movable platform to perform the action indicated by those characteristics.
- For example, the flying height of the movable platform, approaching or moving away from the target object, and the like can be controlled, which is not limited in this embodiment.
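- As a purely illustrative sketch (the gesture labels and platform commands below are hypothetical and not specified by the patent), the recognized action feature of the palm detection frame could be dispatched to a platform command such as changing altitude or approaching/retreating:

```python
from typing import Callable, Dict

def make_gesture_dispatcher(platform) -> Callable[[str], None]:
    """Map hypothetical gesture labels to hypothetical platform commands, e.g.
    adjusting flight height or approaching / moving away from the target object."""
    commands: Dict[str, Callable[[], None]] = {
        "palm_up": lambda: platform.change_altitude(+1.0),            # ascend
        "palm_down": lambda: platform.change_altitude(-1.0),          # descend
        "palm_push": lambda: platform.move_relative_to_target(-1.0),  # move away
        "palm_pull": lambda: platform.move_relative_to_target(+1.0),  # approach
    }

    def dispatch(gesture: str) -> None:
        action = commands.get(gesture)
        if action is not None:
            action()

    return dispatch
```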
- the specific manner of determining the joint point of the target object from the joint points of the object according to the tracking frame of the feature part of the target object in S205 includes multiple types.
- a specific manner of determining the joint point of the target object from the joint points of the object based on the tracking frame of the feature portion of the target object will be described in detail with reference to FIGS. 4 and 5.
- FIG. 4 is a flowchart of a method for determining the joint points of the target object from the joint points of the objects according to the tracking frame of the feature part of the target object, according to an embodiment of the present invention. As shown in FIG. 4, the method may include:
- S401. Determine the number of joint points of each object that are located within the target image area, wherein the target image area is determined according to the tracking frame of the feature part of the target object.
- S402. Determine, from the objects, the object having the largest number of joint points within the target image area.
- Since the target image region is determined according to the tracking frame of the feature part of the target object, the target image region may be the tracking frame of the feature part of the target object itself, or may be a larger region containing that tracking frame, which is not limited in this embodiment.
- The control device of the movable platform determines, for the joint points of each object, the number of joint points located within the target image region, obtains the largest of these numbers, and takes the object with the largest number of joint points as the target object; that is, the joint points of the target object are the joint points of the object with the largest number of joint points in the target image region.
- For example, there are two objects in the image, object 1 and object 2.
- the number of joint points of the joint point of the object 1 within the target image area is two, and the number of joint points of the joint point of the object 2 within the target image area is six.
- the object 2 has more joint points falling into the target image area, so the object 2 is determined as the target object, and the joint point of the object 2 is taken as the joint point of the target object.
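- A minimal sketch of this selection rule (helper names are illustrative, not from the patent): count, for each object's joint set, how many joints fall inside the target image region and take the object with the largest count.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Region = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def count_joints_in_region(joints: Sequence[Point], region: Region) -> int:
    """Number of joint points of one object that fall inside the target image region."""
    x1, y1, x2, y2 = region
    return sum(1 for (x, y) in joints if x1 <= x <= x2 and y1 <= y <= y2)

def select_target_joints(joints_per_object: List[Sequence[Point]], region: Region) -> Sequence[Point]:
    """Pick the joint set with the most joints inside the region (the FIG. 4 rule).
    In the example above, object 1 has 2 joints inside and object 2 has 6, so
    object 2's joints are returned."""
    return max(joints_per_object, key=lambda joints: count_joints_in_region(joints, region))
```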
- FIG. 5 is a flowchart of another method for determining the joint points of the target object from the joint points of the objects according to the tracking frame of the feature part of the target object, according to an embodiment of the present invention. As shown in FIG. 5, the method may include:
- S501 Determine a tracking frame of a predicted feature part of each object according to a joint point of each object.
- each set of joint points corresponds to one object
- The control device of the movable platform can predict the feature part of each object according to its joint points; that is, the predicted feature part of the object can be determined from the joint points of each object, and the predicted feature part can be represented by a tracking frame. In other words, a tracking frame of the predicted feature part of each object is determined according to the joint points of that object.
- the tracking frame of the predicted human body of each object can be determined according to the joint point of each object.
- the tracking frame of the predicted head of each object can be determined according to the joint point of each object.
- The coincidence degree between the tracking frame of the predicted feature part of each object and the tracking frame of the feature part of the target object is compared, and the tracking frame of the predicted feature part with the largest coincidence degree is taken as the tracking frame of the target predicted feature part. The object corresponding to the tracking frame of the target predicted feature part is taken as the target object, so that the joint points of the target object are the joint points of the object corresponding to the tracking frame of the target predicted feature part with the largest coincidence degree.
- For example, the coincidence degree between the tracking frame of the predicted feature part determined by the joint points of object 1 and the tracking frame of the feature part of the target object is 80%, while the coincidence degree between the tracking frame of the predicted feature part determined by the joint points of object 2 and the tracking frame of the feature part of the target object is 10%. The tracking frame of the predicted feature part with the largest coincidence degree is therefore that of object 1, and the joint points of object 1 are taken as the joint points of the target object.
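- The following sketch illustrates one way this rule could be implemented; the particular choices below (taking the predicted feature-part frame as the bounding box of the object's joints, and measuring the coincidence degree as intersection-over-union) are assumptions, since the patent does not prescribe them.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Frame = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def predicted_frame_from_joints(joints: Sequence[Point]) -> Frame:
    """Assumed predictor: the axis-aligned bounding box of the object's joint points."""
    xs, ys = [p[0] for p in joints], [p[1] for p in joints]
    return (min(xs), min(ys), max(xs), max(ys))

def frame_area(f: Frame) -> float:
    return max(0.0, f[2] - f[0]) * max(0.0, f[3] - f[1])

def coincidence(a: Frame, b: Frame) -> float:
    """Intersection-over-union as one possible coincidence-degree measure."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = frame_area(a) + frame_area(b) - inter
    return inter / union if union > 0 else 0.0

def select_target_joints_by_overlap(joints_per_object: List[Sequence[Point]],
                                    target_track: Frame) -> Sequence[Point]:
    """Pick the joint set whose predicted frame overlaps the target tracking frame most (the FIG. 5 rule)."""
    return max(joints_per_object,
               key=lambda joints: coincidence(predicted_frame_from_joints(joints), target_track))
```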
- In this way, the control device of the movable platform can determine the joint points of the target object from the joint points of the objects according to the tracking frame of the feature part, and can then determine the detection frame of the palm of the target object according to the determined joint points of the target object, as described below.
- FIG. 6 is a flowchart of a method for determining a detection frame of a palm of a target object from a detection frame of a palm of the object according to a joint point of the target object according to an embodiment of the present invention. As shown in FIG. 6, the method may include:
- S602. Determine a detection frame of the palm closest to the target joint point in the detection frame of the palm of the object as the detection frame of the palm of the target object.
- The control device of the movable platform can determine the type and position of each joint point from the image. Therefore, in order to facilitate matching the target object with the palm of the target object, one or more target joint points can be selected from the joint points of the target object.
- the target joint points include palm joint points and/or elbow joint points.
- The target joint point is closest to the detection frame of the palm of the target object; specifically, the distance between the target joint point and the center point of the detection frame of the palm of the target object is the smallest. Therefore, by comparing the distances between the target joint point and the detection frames of the palms of the objects, the detection frame of the palm closest to the target joint point can be determined as the detection frame of the palm of the target object.
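- A minimal sketch of this nearest-palm rule follows (helper names are illustrative); the distance is measured to the center point of each detection frame, as described above.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Frame = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def select_target_palm(palm_boxes: List[Frame], target_joint: Point) -> Frame:
    """Pick the palm detection frame whose center point is closest to the target
    joint point (e.g. the palm/wrist joint point or the elbow joint point)."""
    jx, jy = target_joint

    def dist_to_center(box: Frame) -> float:
        cx, cy = (box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0
        return ((cx - jx) ** 2 + (cy - jy) ** 2) ** 0.5

    return min(palm_boxes, key=dist_to_center)
```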
- The traditional target object tracking algorithm tracks a single feature part of the target object, such as using the human body of the target object as the tracking target, or using a preset part of the human body of the target object (for example, the head) as the tracking target.
- However, as the distance between the movable platform and the target object changes, the size ratio of the tracking frame of the feature part of the target object in the captured image also changes, which affects the tracking effect.
- When the distance between the movable platform and the target object is short, the tracking frame of the feature part of the target object is relatively large in the captured image, which may slow the tracking speed, making it easy to lose the tracked target object and deteriorating the reliability of tracking control; when the distance between the movable platform and the target object is long, the tracking frame of the feature part of the target object is small in the captured image, which may blur the features of the tracked target object and likewise deteriorate the reliability of tracking control.
- Therefore, in order to enable the control device of the movable platform to reliably track the target object in different scenarios, the specific manner of determining the tracking frame of the feature part of the target object in S202 will be described in detail.
- the tracking frame of the feature part of the target object is determined from the image as a tracking frame of the first feature part.
- The control device of the movable platform can acquire the tracking parameter of the target object, compare it against the preset first condition, and determine whether the tracking parameter of the target object satisfies the preset first condition.
- The tracking parameter of the target object satisfying the preset first condition means that the size ratio of the target object in the image is less than or equal to a preset first ratio threshold, and/or the distance between the target object and the movable platform is greater than or equal to a preset first distance.
- When the size ratio of the target object in the image is less than or equal to the preset first ratio threshold, or the distance between the target object and the movable platform is greater than or equal to the preset first distance, or both of the above conditions are satisfied, the area of the image occupied by the image region of the target object is small, and the entire target object can be within the image; the control device of the movable platform can then use the tracking frame of the first feature part as the tracking frame of the feature part of the target object.
- the first feature portion is a human body of the target object.
- the tracking frame of the feature part of the target object is determined from the image as a tracking frame of the second feature part.
- the movable platform can acquire the tracking parameter of the target object, compare the tracking parameter of the target object with the preset second condition, and determine whether the tracking parameter of the target object satisfies the preset second condition.
- the tracking parameter of the target object satisfies the preset second condition, including: the size ratio of the target object in the image is greater than or equal to a preset second ratio threshold, and/or the distance between the target object and the movable platform Less than or equal to the preset second distance.
- When the size ratio of the target object in the image is greater than or equal to the preset second ratio threshold, or the distance between the target object and the movable platform is less than or equal to the preset second distance, or both of the above conditions are satisfied, the area of the image occupied by the image region of the target object is large, and the overall image of the target object may have exceeded the boundary of the image.
- the control device of the movable platform may use the tracking frame of the second feature part as the tracking frame of the feature part of the target object.
- the second feature is the head of the target object, or the head and the shoulder.
- The control device of the movable platform distinguishes different scenarios by determining which preset condition the tracking parameters of the target object satisfy, so that it can select the appropriate feature part according to the current tracking parameters of the target object to identify the target object, and thus more accurately match the tracking frame of the feature part of the target object with the detection frame of the palm of the target object.
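- A minimal sketch of this scenario-dependent selection is given below; the threshold values and the size-ratio/distance inputs are placeholders, not values given by the patent.

```python
def select_feature_part(size_ratio: float, distance: float,
                        first_ratio: float = 0.2, first_distance: float = 10.0,
                        second_ratio: float = 0.5, second_distance: float = 3.0) -> str:
    """Choose which feature part to track from the target's tracking parameters.

    size_ratio: proportion of the image occupied by the target object.
    distance:   distance between the movable platform and the target object (meters).
    The threshold defaults are illustrative only.
    """
    if size_ratio <= first_ratio or distance >= first_distance:
        # First condition: the target is small / far away, so track the whole
        # human body (the first feature part).
        return "human_body"
    if size_ratio >= second_ratio or distance <= second_distance:
        # Second condition: the target is large / close, so track the head,
        # or the head and shoulders (the second feature part).
        return "head_and_shoulders"
    # Neither condition met: keep the previously used feature part (assumed fallback).
    return "unchanged"
```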
- FIG. 7 is a schematic structural diagram of a control device for a mobile platform according to an embodiment of the present invention.
- the control device 700 of the mobile platform of the present embodiment may include: a processor 701 and a memory 702;
- the memory 702 is configured to store a computer program
- the processor 701 is configured to execute the computer program stored in the memory to perform:
- a detection frame of the palm of the target object is determined from the detection frames of the palms of the objects according to the joint points of the target object.
- the processor 701 is further configured to identify an action feature of the detection frame of the palm of the target object to control the movable platform to perform the action feature indication action.
- the processor 701 is specifically configured to:
- a joint point of the object having the largest number of joint points is determined as a joint point of the target object.
- the processor 701 is specifically configured to:
- a joint point of the object corresponding to the tracking frame of the target predicted feature portion is determined as a joint point of the target object.
- the processor 701 is specifically configured to:
- a detection frame of the palm closest to the target joint point in the detection frame of the palm of the object is determined as a detection frame of the palm of the target object.
- the target joint point comprises a palm joint point and/or an elbow joint point.
- the processor 701 is specifically configured to:
- determining a tracking frame of the feature part of the target object from the image is a tracking frame of the first feature part.
- the tracking parameter of the target object satisfies the preset first condition, that is, the size ratio of the target object in the image is less than or equal to a preset first ratio threshold, and/or the distance of the target object from the movable platform is greater than or equal to a preset first distance.
- the first feature part is a human body of the target object.
- the processor 701 is specifically configured to:
- determining a tracking frame of the feature part of the target object from the image is a tracking frame of the second feature part.
- the tracking parameter of the target object satisfies the preset second condition, that is, the size ratio of the target object in the image is greater than or equal to a preset second ratio threshold, and/or The distance of the target object from the movable platform is less than or equal to a preset second distance.
- the second feature portion is a head of the target object, or a head and a shoulder.
- The control device 700 of the mobile platform may further include a bus 703 for connecting the processor 701 and the memory 702.
- control device of the mobile platform of the present embodiment can be used to perform the technical solutions in the foregoing method embodiments, and the implementation principles and technical effects thereof are similar, and details are not described herein again.
- FIG. 8 is a schematic structural diagram of a mobile platform according to an embodiment of the present invention.
- the mobile platform 800 of the present embodiment may include: a photographing device 801 and a control device 802.
- the photographing device 801 is configured to output an image.
- the control device 802 can adopt the structure of the device embodiment shown in FIG. 7, and correspondingly, the technical solution of any of the foregoing method embodiments can be executed, and the implementation principle and technical effects are similar, and details are not described herein again.
- the mobile platform 800 can be a drone.
- the number of the objects 104 may be one or more.
- the object 104 may include a target object, wherein the target object is an object tracked by the drone 101.
- the drone 101 can track the target object by the image captured by the imaging device 103.
- the target object is usually in motion, and the drone 101 will also shoot from different aerial perspectives, so the target object on the image will present different states.
- The traditional target object tracking algorithm only searches the current image frame for the image region most similar to the target object in the image captured by the photographing device at a historical time, so when the target object is occluded, or an interference region similar to the target object appears in the background (for example, when an interfering object appears in the background), the drone 101 easily follows the wrong target.
- In the embodiment of the present invention, the control device of the movable platform matches the tracking frames and the detection frames of the objects against each other and updates the tracking frames of the objects in real time, so that the movable platform can accurately identify the tracked object and complete a stable and continuous tracking and shooting process. This solves the prior-art problem that interference from other objects and from similar regions of the background causes the movable platform to follow the wrong object.
- the control method of the movable platform will be described in detail through a specific embodiment.
- FIG. 9 is a flowchart of a method for controlling a mobile platform according to an embodiment of the present invention. As shown in FIG. 9, the method for controlling a mobile platform according to this embodiment may include:
- the movable platform may be configured with a photographing device for taking and outputting an image.
- the control device of the movable platform can receive an image output by the camera, and further, the processor of the control device can receive an image output by the camera.
- at least one object is included in the image, and the object may be a person in the image. In this embodiment, the number of objects in the image is not limited.
- the control device of the movable platform can identify the detection frame of the feature portion of each object in the image.
- The detection frame of the feature part of each object is an image area corresponding to the feature part of the object, and the control device of the movable platform identifies each object in the environment by detecting the feature parts in the image, wherein the feature part can be a head, or a head and a shoulder, or a human body, which is not limited in this embodiment.
- the detection frame may be represented in the form of image coordinates.
- the detection frame may be represented by the coordinates of the upper left corner of the image area and the coordinates of the lower right corner.
- the detection frame of the feature portion of the object in the image may be determined by a preset neural network.
- the preset neural network may be a neural network trained on a feature part of a person in a large number of offline images.
- the control device of the movable platform can use the neural network to detect an image in real time and obtain a detection frame of a feature portion of each object.
- the neural network may include a CNN, a general deep neural network, a cyclic network, and the like, which is not limited in this embodiment.
- the control device of the movable platform may determine a tracking frame of the feature portion of each object in the image.
- the tracking frame of the feature part of each object is an image area corresponding to the feature part of the object, wherein the tracking frame can be represented in the form of image coordinates.
- For example, the tracking frame can be represented by the coordinates of the upper left corner and the coordinates of the lower right corner of the image area.
- the tracking frame of the feature portion of the object in the image may be determined according to the tracking frame of the feature portion of the object in the image captured by the historical time capturing device.
- Specifically, a traditional target object tracking algorithm may be used to determine the tracking frame of the feature part of each object in the image; that is, the tracking frame of the feature part of the object is obtained from the image captured by the photographing device at a historical time, where the image captured at the historical time may be an image captured by the photographing device before the current time.
- The parameters trained in the image similarity function include a Euclidean distance, a block distance, a checkerboard distance, a weighted distance, a Bhattacharyya coefficient, a Hausdorff distance, and the like.
- The control device of the movable platform may also adopt a core search algorithm, such as a Kalman filter, a particle filter, a mean shift (Meanshift) algorithm, or an extended meanshift algorithm, and may also use a correlation filter (Correlation Filter) algorithm, a random forest algorithm, a support vector machine (Support Vector Machine) algorithm, and the like; the embodiment is not limited to the above algorithms.
- There is no fixed execution order between S902 and S903; S902 and S903 may be executed simultaneously or sequentially.
- the control device of the movable platform needs to determine a matching relationship between the detection frame of the feature portion of the object and the tracking frame of the feature portion of the object.
- Each of the tracking frames can be matched with the detection frames in a mutually exclusive manner; that is, each tracking frame can match only one detection frame, and when there are multiple tracking frames, no two of them can match the same detection frame.
- Alternatively, each of the detection frames can be matched with the tracking frames in a mutually exclusive manner; that is, each detection frame can match only one tracking frame, and when there are multiple detection frames, no two of them can match the same tracking frame.
- S905. Determine, according to the multiple matching results, a target detection frame in the detection frame and a target tracking frame that successfully matches the target detection frame in the tracking frame.
- After obtaining the plurality of matching results, it may be determined from the matching results which matching combination of detection frames and tracking frames is the best matching combination.
- When a matching result indicates that a matching combination is the best matching combination, the detection frame in that combination is determined as the target detection frame, and the tracking frame successfully matched with the target detection frame is determined as the target tracking frame.
- The target detection frame can be used to update the target tracking frame among the tracking frames. Furthermore, the control device of the movable platform updates the target tracking frame by using the target detection frame, so that a more accurate tracking frame in the current image frame can be obtained; that is, the detection frame corrects the tracking frame.
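- How the target detection frame 'corrects' the target tracking frame is not prescribed here; one simple illustrative choice is to replace the tracking frame with the matched detection frame, or to blend the two coordinate-wise, as sketched below (the blending weight is an assumption).

```python
from typing import Tuple

Frame = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def update_track(track: Frame, detection: Frame, alpha: float = 1.0) -> Frame:
    """Correct the target tracking frame with the matched target detection frame.

    alpha = 1.0 simply replaces the tracking frame with the detection frame;
    0 < alpha < 1 blends the two frames coordinate-wise."""
    return tuple(alpha * d + (1.0 - alpha) * t for t, d in zip(track, detection))
```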
- In summary, the control method of the movable platform performs mutually exclusive matching between the detection frames and the tracking frames of the feature parts of all objects, and then updates the target tracking frame successfully matched with the target detection frame by using that target detection frame, to obtain the updated tracking frame of the feature part. The embodiment of the invention can thus complete the update process for the tracking frames of the feature parts of all objects and improve the accuracy with which the movable platform tracks according to the tracking frame of the feature part of the tracked object, solving the prior-art problem that interference from other objects and from similar regions of the background causes the movable platform to follow the wrong object, thereby providing a stable and reliable tracking object for the control of the movable platform in complex and varied user environments.
- The specific process in which each of the tracking frames is matched with the detection frames in a mutually exclusive manner, or each of the detection frames is matched with the tracking frames in a mutually exclusive manner, to determine the plurality of matching results is described in detail below.
- Each of the tracking frames is matched with the detection frames in a mutually exclusive manner to determine a plurality of matching results.
- Alternatively, each of the detection frames is matched with the tracking frames in a mutually exclusive manner to determine a plurality of matching results.
- For example, when the number of tracking frames is smaller than the number of detection frames, each tracking frame is matched among the detection frames in a mutually exclusive manner to obtain the plurality of matching results; conversely, when the number of detection frames is smaller than the number of tracking frames, each detection frame is matched among the tracking frames in a mutually exclusive manner to obtain the plurality of matching results.
- any one of the above methods may be selected for mutual exclusion matching.
- The specific manner in which, in FIG. 9, each of the tracking frames is matched with the detection frames in a mutually exclusive manner, or each of the detection frames is matched with the tracking frames in a mutually exclusive manner, to determine the plurality of matching results will now be described in detail.
- FIG. 10 is a flowchart of a method for mutually matching each of the tracking frames with the detection frame or mutually matching the detection frame with the tracking frame to determine a plurality of matching results according to an embodiment of the present invention. As shown in FIG. 10, the method may include:
- A comparison between the detection frames and the tracking frames may be used to determine a matching degree coefficient between each detection frame and each tracking frame, where the matching degree coefficient is a parameter representing the degree of similarity between a detection frame and a tracking frame, that is, a parameter indicating the degree of matching between them.
- The greater the matching degree coefficient, the higher the degree of similarity between the tracking frame and the detection frame corresponding to that coefficient.
- Specifically, determining the matching degree coefficient between each detection frame and each tracking frame includes: determining the coefficient according to at least one of the degree of similarity between the images in the detection frame and the tracking frame, the degree of coincidence between the detection frame and the tracking frame, and the degree of size matching between the detection frame and the tracking frame.
- The degree of similarity between the images in the detection frame and the tracking frame may be obtained by weighting and normalizing the images within the detection frame and the tracking frame to obtain color distributions, and the color distributions are used to characterize the similarity of the images within the detection frame and the tracking frame.
- The degree of coincidence between the detection frame and the tracking frame can be characterized by calculating the distance between the geometric centers of the detection frame and the tracking frame, or by calculating the ratio of the intersection to the union of the detection frame and the tracking frame.
- the degree of matching between the detection frame and the tracking frame can be characterized by calculating the ratio of the size of the detection frame and the tracking frame or the difference between the sizes.
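- As an illustrative sketch of such a coefficient (the weights and the particular combination below are assumptions; the text only requires that at least one of the cues be used, and the color-distribution similarity is omitted here for brevity), the following combines the frame overlap, the center distance, and the size similarity between a detection frame and a tracking frame.

```python
from typing import Tuple

Frame = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def frame_area(f: Frame) -> float:
    return max(0.0, f[2] - f[0]) * max(0.0, f[3] - f[1])

def overlap(a: Frame, b: Frame) -> float:
    """Coincidence degree as intersection-over-union of the two frames."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = frame_area(a) + frame_area(b) - inter
    return inter / union if union > 0 else 0.0

def matching_coefficient(det: Frame, trk: Frame, image_diag: float,
                         w_overlap: float = 0.5, w_center: float = 0.25,
                         w_size: float = 0.25) -> float:
    """Matching degree coefficient between a detection frame and a tracking frame;
    higher means more similar. Weights are illustrative."""
    # Center proximity: normalized distance between the geometric centers.
    dcx = (det[0] + det[2]) / 2.0 - (trk[0] + trk[2]) / 2.0
    dcy = (det[1] + det[3]) / 2.0 - (trk[1] + trk[3]) / 2.0
    center = 1.0 - min(1.0, ((dcx ** 2 + dcy ** 2) ** 0.5) / image_diag)
    # Size matching degree: ratio of the smaller frame area to the larger one.
    a_det, a_trk = frame_area(det), frame_area(trk)
    size = min(a_det, a_trk) / max(a_det, a_trk) if max(a_det, a_trk) > 0 else 0.0
    return w_overlap * overlap(det, trk) + w_center * center + w_size * size
```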
- S1002 Match each of the tracking frames to the detection frame in a mutually exclusive manner according to the matching degree coefficient or mutually match each of the detection frames with the tracking frame to determine a plurality of matching results.
- After determining the matching degree coefficient between each detection frame and each tracking frame, the control device of the movable platform matches each of the tracking frames with the detection frames in a mutually exclusive manner, or each of the detection frames with the tracking frames in a mutually exclusive manner. According to the matching degree coefficients, the matching result of each matching combination can be determined, and from the obtained plurality of matching results it can be determined which matching combination is the best matching combination.
- The following example explains in detail the process of matching each of the tracking frames with the detection frames (or each of the detection frames with the tracking frames) in a mutually exclusive manner according to the matching degree coefficients to determine the plurality of matching results, and of determining, according to the plurality of matching results, the target detection frame among the detection frames and the target tracking frame among the tracking frames that is successfully matched with the target detection frame.
- For example, there are two tracking frames determined from the image, namely tracking frame 1 and tracking frame 2, and three detection frames determined from the image, namely detection frame 1, detection frame 2, and detection frame 3.
- Table 1 shows the matching degree coefficient between each tracking frame and each detection frame, where Cij represents the matching degree coefficient between the i-th tracking frame and the j-th detection frame, i ≤ 2, j ≤ 3, and both i and j are positive integers.
- In the first matching combination, tracking frame 1 matches detection frame 1 and tracking frame 2 matches detection frame 2; the matching result can be represented by the sum of matching degree coefficients C11+C22;
- in the second, tracking frame 1 matches detection frame 1 and tracking frame 2 matches detection frame 3; the matching result can be represented by C11+C23;
- in the third, tracking frame 1 matches detection frame 2 and tracking frame 2 matches detection frame 1; the matching result can be represented by C12+C21;
- in the fourth, tracking frame 1 matches detection frame 2 and tracking frame 2 matches detection frame 3; the matching result can be represented by C12+C23;
- in the fifth, tracking frame 1 matches detection frame 3 and tracking frame 2 matches detection frame 1; the matching result can be represented by C13+C21;
- in the sixth, tracking frame 1 matches detection frame 3 and tracking frame 2 matches detection frame 2; the matching result can be represented by C13+C22.
- The matching combination with the largest sum is taken as the best, and the detection frames in it are determined as the target detection frames. For example, if C13+C22 is the largest value among the six matching results, it is determined that tracking frame 1 is successfully matched with detection frame 3 and tracking frame 2 is successfully matched with detection frame 2; the target tracking frames are tracking frame 1 and tracking frame 2, and the target detection frames are detection frame 3 and detection frame 2. In order to obtain more accurate tracking frames, tracking frame 1 can then be updated using detection frame 3, and tracking frame 2 can be updated using detection frame 2.
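- This exhaustive search over mutually exclusive assignments can be sketched as follows; for the small numbers of frames in the example, enumerating permutations is sufficient (a Hungarian-style assignment solver would be a more scalable alternative, but is not required by the text). The coefficient values below are made up purely to exercise the example.

```python
from itertools import permutations
from typing import List, Sequence, Tuple

def best_mutually_exclusive_match(coeff: Sequence[Sequence[float]]) -> Tuple[float, List[Tuple[int, int]]]:
    """Enumerate mutually exclusive assignments of tracking frames to detection frames.

    coeff[i][j] is the matching degree coefficient between the i-th tracking frame
    and the j-th detection frame (0-indexed). Returns the best total coefficient
    and the assignment as (tracking index, detection index) pairs."""
    n_track, n_det = len(coeff), len(coeff[0])
    best_score, best_pairs = float("-inf"), []
    # Assign each tracking frame a distinct detection frame (assumes n_track <= n_det).
    for det_choice in permutations(range(n_det), n_track):
        score = sum(coeff[i][j] for i, j in enumerate(det_choice))
        if score > best_score:
            best_score, best_pairs = score, list(enumerate(det_choice))
    return best_score, best_pairs

# Usage with a 2 x 3 coefficient matrix: C13 + C22 (0-indexed coeff[0][2] + coeff[1][1]) wins.
coeff = [[1.0, 2.0, 9.0],
         [3.0, 8.0, 2.0]]
print(best_mutually_exclusive_match(coeff))  # (17.0, [(0, 2), (1, 1)])
```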
- In step S901, the at least one object includes a target object, where the target object is the object tracked by the movable platform.
- a tracking frame of the feature portion of the object in the image captured by the current time capturing device is determined, wherein the tracking frame of the feature portion of the object includes a tracking frame of the feature portion of the target object.
- a detection frame of a feature portion of the object in the image captured by the current time photographing device is determined, wherein the detection frame of the feature portion of the object includes a detection frame of the feature portion of the target object.
- The updated tracking frames of the objects include the updated tracking frame of the target object, so that the tracking frame of the feature part of the target object in the image captured by the photographing device at the current time is updated. In this way, the tracking frame of the feature part of the target object in the image can be updated, which is also the second feasible implementation manner, mentioned in the foregoing section, for the movable platform to determine the tracking frame of the feature part of the target object from the image.
- In some cases, the at least one object includes both the target object and an interference object, where the interference object is an object other than the target object. The tracking frames of the feature parts of the objects acquired in S902 from the image captured by the photographing device at the current time then include the tracking frame of the feature part of the target object and the tracking frame of the feature part of the interference object, and the detection frames of the feature parts of the objects acquired in S903 include the detection frame of the feature part of the target object and the detection frame of the feature part of the interference object.
- When the tracking frame of the interference object does not match any detection frame, that tracking frame is deleted from the updated tracking frames of the feature parts.
- Specifically, when the tracking frame of the interference object does not match any detection frame, it indicates that the interference object may no longer be in the image captured by the photographing device at the current time. To be safe, the tracking frame of the interference object may continue to participate in the mutually exclusive matching with the detection frames for a preset time; if it still does not match any detection frame, the tracking frame of the interference object is removed from the updated tracking frames of the feature parts.
- the preset time can be 3 frames.
- Conversely, when one or more detection frames do not match any tracking frame, those detection frames are added to the updated tracking frames of the feature parts.
- When one or more of the detection frames do not match any tracking frame, it indicates that other objects, that is, objects other than those currently tracked, have appeared in the image captured by the photographing device at the current time. Therefore, on the basis of the updated tracking frames of the feature parts, the one or more unmatched detection frames are created as new tracking frames and added to the updated tracking frames of the feature parts. In this way, the interference of other objects with the target object can be fully considered, and the movable platform is prevented from mistakenly tracking another object as the target object.
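- A minimal sketch of the maintenance rules above, using a simple per-track miss counter; the Track structure, the miss threshold of 3 frames, and the assumption that a matching step has already produced (track index, detection index) pairs are illustrative choices, not requirements of the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple

Frame = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

@dataclass
class Track:
    box: Frame
    misses: int = 0  # consecutive frames without a matched detection

def maintain_tracks(tracks: List[Track], detections: List[Frame],
                    matches: List[Tuple[int, int]], max_misses: int = 3) -> List[Track]:
    """Update, prune, and create tracks after the mutually exclusive matching step."""
    matched_tracks = {i for i, _ in matches}
    matched_dets = {j for _, j in matches}

    # Matched tracks: correct the tracking frame with the matched detection frame.
    for i, j in matches:
        tracks[i].box = detections[j]
        tracks[i].misses = 0

    # Unmatched tracks: keep them for a preset time (e.g. 3 frames), then delete.
    survivors: List[Track] = []
    for i, trk in enumerate(tracks):
        if i not in matched_tracks:
            trk.misses += 1
        if trk.misses <= max_misses:
            survivors.append(trk)

    # Unmatched detections: newly appeared objects become new tracking frames.
    for j, det in enumerate(detections):
        if j not in matched_dets:
            survivors.append(Track(box=det))
    return survivors
```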
- By setting the tracking frame of the feature part of the target object differently for different scenarios, the control device of the movable platform enables the target object to be tracked reliably and continuously.
- the tracking frame of the feature portion of the object is a tracking frame of the first feature portion.
- The control device of the movable platform can acquire the tracking parameter of the target object, compare it against the preset first condition, and determine whether the tracking parameter of the target object satisfies the preset first condition.
- The tracking parameter of the target object satisfying the preset first condition means that the size ratio of the target object in the image is less than or equal to a preset first ratio threshold, and/or the distance between the target object and the movable platform is greater than or equal to a preset first distance.
- When the size ratio of the target object in the image is less than or equal to the preset first ratio threshold, or the distance between the target object and the movable platform is greater than or equal to the preset first distance, or both of the above conditions are satisfied, the area of the image occupied by the image region of the object is small, and the overall image of the object can be within the image; the control device of the movable platform can then use the tracking frame of the first feature part as the tracking frame of the feature part of the object.
- the first feature portion is a human body of the subject.
- Optionally, when the tracking parameter of the target object satisfies a preset second condition, the tracking frame of the feature part of the object is the tracking frame of a second feature part.
- In the embodiment of the present invention, the movable platform can acquire the tracking parameter of the target object, compare it with the preset second condition, and determine whether the tracking parameter satisfies the preset second condition. Optionally, the tracking parameter of the target object satisfying the preset second condition includes: the size ratio of the target object in the image is greater than or equal to a preset second ratio threshold, and/or the distance between the target object and the movable platform is less than or equal to a preset second distance.
- When the size ratio of the target object in the image is greater than or equal to the preset second ratio threshold, or the distance between the target object and the movable platform is less than or equal to the preset second distance, or both of the above conditions are satisfied, the partial image area of the object occupies a large area of the image and the whole image of the object may already exceed the boundary of the image, so the control device of the movable platform can use the tracking frame of the second feature part as the tracking frame of the feature part of the object.
- Optionally, the second feature part is the head of the object, or the head and the shoulders.
- In summary, the control device of the movable platform distinguishes different scenarios by determining which preset condition the tracking parameter of the target object satisfies, so that it can accurately acquire the tracking frame of the feature part of the object and achieve precise matching between the tracking frame of the feature part of the target object and the detection frame of the palm of the target object.
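The scenario check described above can be illustrated with the following minimal sketch. The threshold values and the way the size ratio and distance are obtained are assumptions for illustration only; the embodiment does not fix them.

```python
# Illustrative sketch: choosing which feature part's tracking frame to use,
# based on the target object's tracking parameters. All thresholds are assumed values.

FIRST_RATIO, FIRST_DIST = 0.15, 8.0     # first condition: target small in image / far away
SECOND_RATIO, SECOND_DIST = 0.45, 3.0   # second condition: target large in image / close by

def select_feature_part(size_ratio, distance_m):
    """size_ratio: fraction of the image occupied by the target object;
    distance_m: distance between the target object and the movable platform (metres)."""
    if size_ratio <= FIRST_RATIO or distance_m >= FIRST_DIST:
        return "human_body"           # first feature part: the whole body fits in the image
    if size_ratio >= SECOND_RATIO or distance_m <= SECOND_DIST:
        return "head_and_shoulders"   # second feature part: the body may exceed the image
    return "human_body"               # neither condition met: keep the default choice

# Example: a distant, small target is tracked by its whole body.
print(select_feature_part(size_ratio=0.08, distance_m=12.0))  # human_body
```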
- FIG. 11 is a schematic structural diagram of a control device for a mobile platform according to an embodiment of the present invention.
- As shown in FIG. 11, the control device 1100 of the mobile platform of the present embodiment may include: a processor 1101 and a memory 1102;
- the memory 1102 is configured to store a computer program;
- the processor 1101 is configured to execute the computer program stored in the memory to perform:
- acquiring an image captured by the camera at the current time, where the image includes at least one object;
- determining a detection frame of a feature part of the object in the image;
- determining a tracking frame of the feature part of the object in the image;
- mutually exclusively matching each of the tracking frames with the detection frames or mutually exclusively matching each of the detection frames with the tracking frames to determine a plurality of matching results;
- determining, according to the plurality of matching results, a target detection frame among the detection frames and a target tracking frame, among the tracking frames, that is successfully matched with the target detection frame;
- updating the target tracking frame with the target detection frame to obtain an updated tracking frame of the feature part.
- the processor 1101 is specifically configured to:
- a detection frame of a feature portion of the object in the image is determined by a preset neural network.
- the processor 1101 is specifically configured to:
- a tracking frame of a feature portion of the object in the image is determined according to a tracking frame of the feature portion of the object in an image captured by the camera at a historical time.
- the processor 1101 is specifically configured to:
- when the number of the tracking frames is less than the number of the detection frames, each of the tracking frames is mutually exclusively matched with the detection frames to determine the plurality of matching results.
- the processor 1101 is specifically configured to:
- when the number of the tracking frames is greater than the number of the detection frames, each of the detection frames is mutually exclusively matched with the tracking frames to determine the plurality of matching results.
- the processor 1101 is specifically configured to:
- determining a matching degree coefficient between each detection frame and each tracking frame; and mutually exclusively matching each of the tracking frames with the detection frames, or mutually exclusively matching each of the detection frames with the tracking frames, according to the matching degree coefficients to determine the plurality of matching results.
- the processor 1101 is specifically configured to:
- determining the matching degree coefficient between each detection frame and each tracking frame according to at least one of: the degree of similarity between the images within the detection frame and within the tracking frame, the degree of overlap between the detection frame and the tracking frame, and the degree to which the sizes of the detection frame and the tracking frame match.
- the at least one object includes: a target object and an interference object
- the tracking frame of the feature part of the object in the image includes: a tracking frame of the feature part of the target object and a tracking frame of the feature part of the interference object, wherein the detection frame of the feature part of the object in the image includes: a detection frame of a feature portion of the target object and a detection frame of the feature portion of the interference object, wherein the target object is an object tracked by the movable platform.
- Optionally, the processor 1101 is further configured to:
- when the tracking frame of the interference object cannot be matched with any detection frame, delete the tracking frame from the updated tracking frames of the feature parts.
- Optionally, the processor 1101 is further configured to:
- when one or more of the detection frames cannot be matched with any tracking frame, add the one or more detection frames to the updated tracking frames of the feature parts.
- Optionally, when the tracking parameter of the target object satisfies the preset first condition, the tracking frame of the feature part of the object is the tracking frame of the first feature part.
- Optionally, the tracking parameter of the target object satisfying the preset first condition includes: the size ratio of the target object in the image is less than or equal to the preset first ratio threshold, and/or the distance between the target object and the movable platform is greater than or equal to the preset first distance.
- Optionally, the first feature part is the human body of the object.
- Optionally, when the tracking parameter of the target object satisfies the preset second condition, the tracking frame of the feature part of the object is the tracking frame of the second feature part.
- Optionally, the tracking parameter of the target object satisfying the preset second condition includes: the size ratio of the target object in the image is greater than or equal to the preset second ratio threshold, and/or the distance between the target object and the movable platform is less than or equal to the preset second distance.
- Optionally, the second feature part is the head of the object, or the head and the shoulders.
- When the memory 1102 is a device independent of the processor 1101, the control device 1100 of the mobile platform may further include:
- a bus 1103, configured to connect the processor 1101 and the memory 1102.
- The control device of the mobile platform of the present embodiment can be used to perform the technical solutions in the foregoing method embodiments; the implementation principles and technical effects thereof are similar, and details are not described herein again.
- FIG. 12 is a schematic structural diagram of a mobile platform according to an embodiment of the present invention.
- the mobile platform 1200 of the present embodiment may include: a photographing device 1201 and a control device 1202.
- the photographing device 1201 is configured to output an image.
- the control device 1202 can adopt the structure of the device embodiment shown in FIG. 11 , and correspondingly, the technical solution of any of the foregoing method embodiments can be executed, and the implementation principle and technical effects thereof are similar, and details are not described herein again.
- the mobile platform 1200 can be a drone.
- A person of ordinary skill in the art can understand that all or part of the steps of the foregoing method embodiments may be implemented by hardware related to program instructions. The foregoing program may be stored in a computer-readable storage medium; when the program is executed, the steps of the foregoing method embodiments are performed.
- The foregoing storage medium includes various media that can store program code, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
本发明实施例提供一种可移动平台的控制方法、装置和可移动平台。该方法包括:获取拍摄装置输出的图像;从图像中确定目标对象的特征部位的跟踪框;识别图像中对象的关节点;识别图像中对象的手掌的检测框;根据特征部位的跟踪框从对象的关节点中确定目标对象的关节点;根据目标对象的关节点从对象的手掌的检测框中确定目标对象的手掌的检测框。本发明实施例实现了目标对象的特征部位的跟踪框与目标对象的手掌的检测框的精准匹配,提高了可移动平台识别目标对象的手掌的检测框的稳定性和持续性。
Description
本发明实施例涉及控制领域,尤其涉及一种可移动平台的控制方法、装置和可移动平台。
目前,可移动平台(例如无人机)能够实现对目标对象的跟踪,使得用户在脱离手持控制终端的前提下,能够始终处于可移动平台的拍摄装置的拍摄画面中。
然而,面对越来越复杂的应用场景和使用环境,现有的对目标对象的识别和跟踪的策略,不能持续稳定地识别和跟踪目标对象以及目标对象的手掌,在某些情况中,降低了可移动平台的有用性。
发明内容
本发明实施例提供一种可移动平台的控制方法、装置和可移动平台,以提高可移动平台对目标对象跟踪的可靠性和鲁棒性。
第一方面,本发明实施例提供一种可移动平台的控制方法,包括:
获取拍摄装置输出的图像;
从所述图像中确定目标对象的特征部位的跟踪框,其中,所述目标对象为所述可移动平台跟踪的对象;
识别所述图像中对象的关节点;
识别所述图像中对象的手掌的检测框;
根据所述特征部位的跟踪框从所述对象的关节点中确定所述目标对象的关节点;
根据所述目标对象的关节点从所述对象的手掌的检测框中确定所述目标对象的手掌的检测框。
第二方面,本发明实施例提供一种可移动平台的控制装置,包括:处理器和存储器;
所述存储器,用于存储计算机程序;
所述处理器,用于执行所述存储器存储的计算机程序,以执行:
获取拍摄装置输出的图像;
从所述图像中确定目标对象的特征部位的跟踪框,其中,所述目标对象为所述可移动平台跟踪的对象;
识别所述图像中对象的关节点;
识别所述图像中对象的手掌的检测框;
根据所述特征部位的跟踪框从所述对象的关节点中确定所述目标对象的关节点;
根据所述目标对象的关节点从所述对象的手掌的检测框中确定所述目标对象的手掌的检测框。
第三方面,本发明实施例提供一种可读存储介质,所述可读存储介质上存储有计算机程序;所述计算机程序在被执行时,实现如第一方面本发明实施例所述的可移动平台的控制方法。
第四方面,本发明实施例提供一种可移动平台,包括,拍摄装置,和如第二方面所述的控制装置。
本发明实施例提供的可移动平台的控制方法、装置和可移动平台,通过在摄像装置输出的图像中能够确定目标对象的特征部位的跟踪框,并识别图像中的所有对象的关节点和所有对象的手掌的检测框。根据目标对象的特征部位的跟踪框从所有对象的关节点中确定出目标对象的关节点,接着以目标对象的关节点为桥梁在所有对象的手掌的检测框中确定目标对象的手掌的检测框,实现了目标对象的特征部位的跟踪框与目标对象的手掌的检测框的精准匹配,使得可移动平台能够稳定持续的识别出目标对象的手掌的检测框,解决了现有技术中容易将目标对象与目标对象的手掌匹配错误的问题。
第五方面,本发明实施例提供一种可移动平台的控制方法,包括:
获取当前时刻拍摄装置拍摄的图像,其中,所述图像中包括至少一个对象;
确定所述图像中对象的特征部位的检测框;
确定所述图像中对象的特征部位的跟踪框;
将所述跟踪框中的每一个与所述检测框互斥地匹配或者将所述检测框中 的每一个与所述跟踪框互斥地匹配以确定多个匹配结果;
根据所述多个匹配结果确定所述检测框中的目标检测框以及所述跟踪框中与所述目标检测框匹配成功的目标跟踪框;
利用所述目标检测框对所述目标跟踪框进行更新以获取更新后的特征部位的跟踪框。
第六方面,本发明实施例提供一种可移动平台的控制装置,包括:处理器和存储器;
所述存储器,用于存储计算机程序;
所述处理器,用于执行所述存储器存储的计算机程序,以执行:
获取当前时刻拍摄装置拍摄的图像,其中,所述图像中包括至少一个对象;
确定所述图像中对象的特征部位的检测框;
确定所述图像中对象的特征部位的跟踪框;
将所述跟踪框中的每一个与所述检测框互斥地匹配或者将所述检测框中的每一个与所述跟踪框互斥地匹配以确定多个匹配结果;
根据所述多个匹配结果确定所述检测框中的目标检测框以及所述跟踪框中与所述目标检测框匹配成功的目标跟踪框;
利用所述目标检测框对所述目标跟踪框进行更新以获取更新后的特征部位的跟踪框。
第七方面,本发明实施例提供一种可读存储介质,所述可读存储介质上存储有计算机程序;所述计算机程序在被执行时,实现如第五方面本发明实施例所述的可移动平台的控制方法。
第八方面,本发明实施例提供一种可移动平台,包括,拍摄装置,和如第六方面所述的控制装置。
本发明实施例提供的可移动平台的控制方法、装置和可移动平台,通过所有对象的特征部位的检测框和跟踪框进行互斥性匹配,再利用匹配成功的目标检测框对与目标检测框匹配成功的目标跟踪框进行更新,得到更新后的特征部位的目标跟踪框。本发明实施例能够完成所有对象的特征部位的跟踪框的更新过程,根据更加准确的追踪对象的特征部位的跟踪框,提高了可移动平台跟踪的精准度,解决了现有技术中由于其他对象的干扰以及背景相似 区域的干扰而导致可移动平台错跟对象的问题,从而在复杂多变的用户使用环境中为可移动平台的控制提供稳定可靠的跟踪的对象。
为了更清楚地说明本发明实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作一简单地介绍,显而易见地,下面描述中的附图是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为本发明提供的可移动平台拍摄的应用场景示意图;
图2为本发明一实施例提供的可移动平台的控制方法的流程图;
图3a为本发明一实施例提供的可移动平台的控制方法中图像的示意图;
图3b为本发明一实施例提供的可移动平台的控制方法中目标对象的特征部位的跟踪框的示意图;
图3c为本发明一实施例提供的可移动平台的控制方法中对象的关节点的示意图;
图3d为本发明一实施例提供的可移动平台的控制方法中对象的手掌的检测框的示意图;
图4为本发明一实施例提供的根据目标对象的特征部位的跟踪框从对象的关节点中确定目标对象的关节点的方法的流程图;
图5为本发明一实施例提供的根据目标对象的特征部位的跟踪框从对象的关节点中确定目标对象的关节点的方法的流程图;
图6为本发明一实施例提供的根据目标对象的关节点从对象的手掌的检测框中确定目标对象的手掌的检测框的方法的流程图;
图7为本发明一实施例提供的可移动平台的控制装置的结构示意图;
图8为本发明一实施例提供的可移动平台的结构示意图;
图9为本发明一实施例提供的可移动平台的控制方法的流程图;
图10为本发明一实施例提供的将跟踪框中的每一个与检测框互斥地匹配或者将检测框中的每一个与跟踪框互斥地匹配以确定多个匹配结果的方法的流程图;
图11为本发明一实施例提供的可移动平台的控制装置的结构示意图;
图12为本发明一实施例提供的可移动平台的结构示意图。
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚地描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。
需要说明的是,当组件被称为“固定于”另一个组件,它可以直接在另一个组件上或者也可以存在居中的组件。当一个组件被认为是“连接”另一个组件,它可以是直接连接到另一个组件或者可能同时存在居中组件。
除非另有定义,本文所使用的所有的技术和科学术语与属于本发明的技术领域的技术人员通常理解的含义相同。本文中在本发明的说明书中所使用的术语只是为了描述具体的实施例的目的,不是旨在于限制本发明。本文所使用的术语“及/或”包括一个或多个相关的所列项目的任意的和所有的组合。
图1为本发明提供的可移动平台拍摄的应用场景示意图。本发明实施例中涉及的可移动平台可包括但不仅限于无人机、无人车、无人船。请参见图1,可移动平台以无人机101为例进行具体说明,本文后述部分的无人机101都可以使用可移动平台替代。在无人机101上设置有可以进行旋转的云台102,云台102上设置有拍摄装置103,无人机101通过控制云台102的姿态可以调整拍摄装置103的朝向,拍摄装置103可以拍摄获取环境图像,例如拍摄获取包含对象104的图像。无人机101能够将拍摄到的图像实时传输至控制终端105,并在控制终端105的显示屏上显示图像。其中,控制终端105可为遥控器、手机、手提电脑和平板电脑中的一种或多种,本实施例对此不做限定。
在使用手掌控制无人机的应用场景中,传统手掌的跟踪算法是在目标图像区域中检测当前图像帧中与历史时刻拍摄装置拍摄的图像中目标对象(无人机跟踪的对象)的手掌跟踪框最相似的手掌跟踪框(例如位置的靠近程度、图像区域大小相似程度、跟踪框内图像的相似程度),其中,目标图像区域可以是根据历史时刻拍摄装置拍摄的图像中目标对象的手掌 的位置确定的。目前,由于不同对象的手掌在图像上几乎不能区分,且传统手掌的跟踪算法无法识别人的手掌。因此,通常会采用训练好的神经网络来识别对象的手掌的检测框,实现目标对象与目标对象的手掌的匹配。然而,大多数时候,目标对象距离无人机很近,再加上无人机自身也会运动,易造成目标对象的手掌在图像上出现运动模糊。此时,采用神经网络的方法也很难持续检测到目标对象的手掌的检测框,甚至目标对象的手掌的真实位置早已经超出目标图像区域。如果单纯增大手掌的搜索区域,更加容易将目标对象的手掌错误匹配到其他对象身上。
在本发明实施例中,以可移动平台的控制装置为执行主体(图1中未示出),可移动平台的控制装置以识别目标对象的关节点为中间桥梁,能够精准确定跟踪目标对象的手掌,解决现有技术中由于无法持续检测到目标对象的手掌,而出现匹配错误的问题。下面,通过具体实施例,对可移动平台的控制方法进行详细说明。
图2为本发明一实施例提供的可移动平台的控制方法的流程图,图3a为本发明一实施例提供的可移动平台的控制方法中图像的示意图,图3b为本发明一实施例提供的可移动平台的控制方法中目标对象的特征部位的跟踪框的示意图,图3c为本发明一实施例提供的可移动平台的控制方法中对象的关节点的示意图,图3d为本发明一实施例提供的可移动平台的控制方法中对象的手掌的检测框的示意图。如图2所示,本实施例的可移动平台的控制方法可以包括:
S201、获取拍摄装置输出的图像。
在本发明实施例中,如前所述,可移动平台可以配置有拍摄装置,拍摄装置用于拍摄并输出图像。可移动平台的控制装置可以接收拍摄装置输出的图像,进一步地,控制装置的处理器可以接收拍摄装置输出的图像。
S202、从图像中确定目标对象的特征部位的跟踪框,其中,目标对象为可移动平台跟踪的对象。
在本发明实施例中,拍摄装置输出的图像中可以包括至少一个对象,另外,至少一个对象中至少包括目标对象,其中,目标对象为可移动平台跟踪的对象。
在本发明实施例中,可移动平台的控制装置可以从图像中识别出目标对 象的特征部位的跟踪框,即可移动平台的控制装置通过对图像中的特征部位的检测来识别环境中的对象,其中,特征部位可以为头部,或者头部和肩部,也可以为人体,本实施例对此不做限定,其中,目标对象的特征部位的跟踪框为目标对象的特征部位对应的图像区域,其中,跟踪框可以以图像坐标的形式表示,例如,跟踪框可以以图像区域的左上角的坐标和右下角的坐标来表示。
在本发明实施例中,可移动平台的控制装置从图像中确定目标对象的特征部位的跟踪框的实现方式可以包括以下几种:
第一种可行的实施方式,可移动平台的控制装置可以采用传统目标对象的跟踪算法从图像中确定目标对象的特征部位的跟踪框。例如,在获取到当前图像帧后,以当前图像的上一帧或上一时刻的目标对象的特征部位的跟踪框为中心,在图像扩展一个局部范围,根据训练得到的图像相似度函数,在这个局部范围内确定一个跟目标对象的特征部位最相似的图像区域,目标对象的特征部位的跟踪框便为这个图像区域。其中,该图像相似度函数中的训练参数可以包括欧氏距离、街区距离、棋盘距离、加权距离、巴特查理亚系数、Hausdorff距离中的任一。且除了可以采用上述相似性度量算法以外,还可以采用核心搜索算法,如卡尔曼滤波、粒子滤波器、均值漂移(Meanshift)算法,扩展的meanshift算法等,也可以采用自相关滤波器(Correlation Filter)算法、随机森林(Random forest)算法以及支持向量积(Support Vector Machine)算法等,本实施例对此不做限定。
第二种可行的实施方式,可以通过如下文图9所提供的方法来从图像中确定目标对象的特征部位的跟踪框,具体地,请参见本文后述部分,此处先不赘述。
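针对上述第一种可行的实施方式中“以上一帧跟踪框为中心、在局部范围内搜索最相似图像区域”的过程,下面给出一个极简的 Python 示意(仅为说明性草图,并非本发明限定的实现;其中以负的平均绝对灰度差近似图像相似度函数,搜索半径、步长等参数均为假设值):

```python
import numpy as np

def track_by_local_search(frame, prev_box, template, radius=20, step=4):
    """frame: 当前帧灰度图像 (H, W) 的 ndarray;
    prev_box: (x1, y1, x2, y2) 上一帧/上一时刻目标对象特征部位的跟踪框(整数像素坐标);
    template: 上一帧跟踪框内的图像块 (h, w);radius/step: 假设的搜索半径与步长。
    在局部范围内寻找与 template 最相似的图像区域,并将其作为当前帧的跟踪框返回。
    此处用负的平均绝对灰度差近似图像相似度函数,实际可替换为文中列举的其他度量。"""
    h, w = template.shape
    x1, y1 = int(prev_box[0]), int(prev_box[1])
    best_score, best_box = -np.inf, prev_box
    for dy in range(-radius, radius + 1, step):
        for dx in range(-radius, radius + 1, step):
            nx, ny = x1 + dx, y1 + dy
            if nx < 0 or ny < 0 or ny + h > frame.shape[0] or nx + w > frame.shape[1]:
                continue
            patch = frame[ny:ny + h, nx:nx + w].astype(np.float32)
            score = -float(np.mean(np.abs(patch - template.astype(np.float32))))
            if score > best_score:
                best_score, best_box = score, (nx, ny, nx + w, ny + h)
    return best_box
```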
S203、识别图像中对象的关节点。
在本发明实施例中,一个对象的关节点最多包括19个,其中,19个关节点包括:左眼关节点、右眼关节点、鼻子关节点、左耳关节点、右耳关节点、嘴巴关节点、颈部关节点、左肩关节点、右肩关节点、左肘关节点、右肘关节点、左手关节点、右手关节点、左胯关节点、右胯关节点、左膝关节点、右膝关节点、左脚关节点以及右脚关节点。
在本发明实施例中,由于图像中不限于包括一个对象和多个对象,因此, 可移动平台的控制装置可以对图像中每一个对象的关节点进行识别,其中,每一个对象对应一组关节点。其中,识别图像中对象的关节点的具体的技术方案可以参照现有技术,具体过程此处不再赘述。
S204、识别图像中对象的手掌的检测框。
在本发明实施例中,由于目标对象通过手掌,例如手掌的动作特征,来控制可移动平台执行手掌的动作特征指示的动作,因此,可移动平台的控制装置需要对图像中的每一个对象的手掌的检测框进行识别。其中,手掌的检测框为手掌对应的图像区域,其中,检测框可以以图像坐标的形式表示,例如,检测框可以以图像区域的左上角的坐标和右下角的坐标来表示。
在本发明实施例中,可以事先对大量离线图像中对象的手掌进行训练得到神经网络,可移动平台的控制装置可以采用该神经网络实时检测图像并得到每一个对象的手掌的检测框,该神经网络可以返回图像中每一个对象的手掌的检测框的在图像中的位置和大小,例如,该神经网络可以返回每一个对象的手掌的检测框的左上角和右下角的坐标。其中,该神经网络可以包括CNN、普通深度神经网络以及循环网络等,本实施例对此不做限定。
需要说明的是,上述S202-S204之间没有时序上的先后顺序,且S202、S203和S204可以同时执行,也可以顺序执行。
S205、根据特征部位的跟踪框从对象的关节点中确定目标对象的关节点。
在本发明实施例中,由于图像中可能会存在多个对象,每一个对象对应一组关节点,因此,可移动平台的控制装置需要确定到底哪一组关节点是目标对象的关节点。可移动平台的控制装置可以根据目标对象的特征部位的跟踪框从多组关节点中确定目标对象的关节点。
S206、根据目标对象的关节点从对象的手掌的检测框中确定目标对象的手掌的检测框。
在本发明实施例中,根据上述步骤能够确定目标对象的关节点,可移动平台的控制装置通过比较目标对象的关节点与每一个对象的手掌的检测框的匹配关系,从对象的手掌的检测框中确定目标对象的手掌的检测框。
在一个具体的实施例中,如图3a所示,当拍摄装置输出的图像中有三个对象A、B和C,结合图3b、图3c和图3d,采用本发明实施例的可移动平台的控制方法来获取目标对象A的手掌的检测框的具体过程是:
1、如图3b所示,从图像中确定目标对象A的特征部位的跟踪框M。
2、如图3c所示,识别图像中对象A、B和C的关节点,对象A的关节点用实心圆表示,对象B的关节点用空心圆表示,对象C的关节点用三角形表示。
3、如图3d所示,识别图像中对象A、B和C的手掌的检测框,分别为N1、N2和N3。
4、结合图3b和图3c,可以根据目标对象A的特征部位的跟踪框M确定图3c中对象A的关节点为目标对象的关节点。
5、结合图3c和图3d,确定目标对象的关节点与对象A、B和C的手掌的检测框N1、N2和N3之间的匹配关系,根据匹配关系可以确定N1为目标对象A的手掌的检测框。
本发明实施例提供的可移动平台的控制方法,通过在摄像装置输出的图像中能够确定目标对象的特征部位的跟踪框,并识别图像中的所有对象的关节点和所有对象的手掌的检测框。根据目标对象的特征部位的跟踪框从所有对象的关节点中确定出目标对象的关节点,接着以目标对象的关节点为桥梁在所有对象的手掌的检测框中确定目标对象的手掌的检测框,实现了目标对象的特征部位的跟踪框与目标对象的手掌的检测框的精准匹配,使得可移动平台能够稳定持续的识别出目标对象的手掌的检测框,解决了现有技术中容易将目标对象与目标对象的手掌匹配错误的问题。
可选地,在上述图2实施例的基础上,可移动平台的控制装置不仅需要确定目标对象,还需要根据目标对象的手掌来确定目标对象发出的指令,以控制可移动平台根据指令执行相应的动作。这样,在S206之后,本实施例的可移动平台的控制方法还包括:识别目标对象的手掌的检测框的动作特征,以控制可移动平台执行动作特征指示的动作。
在本发明实施例中,由于目标对象是通过手掌来控制可移动平台的控制装置来执行动作的,因此,可以将事先约定目标对象的手掌的动作与可移动平台的控制装置执行的动作之间的对应关系,进而可移动平台的控制装置通过实时检测并分析目标对象的手掌的检测框的动作特征,来控制可移动平台的控制装置执行动作特征指示的动作。例如,根据该动作特征可以控制可移动平台的飞行高度、靠近或远离目标对象等等,本实施例对此不做限定。
可选地,在上述图2实施例的基础上,S205中根据目标对象的特征部位的跟踪框从对象的关节点中确定目标对象的关节点的具体方式包括多种。下面,结合图4和图5,对根据目标对象的特征部位的跟踪框从对象的关节点中确定目标对象的关节点的具体方式对进行详细的说明。
图4为本发明一实施例提供的根据目标对象的特征部位的跟踪框从对象的关节点中确定目标对象的关节点的方法的流程图,如图4所示,所述方法可以包括:
S401、确定每一个对象的关节点中位于目标图像区域之内的关节点个数,其中,目标图像区域是根据目标对象的特征部位的跟踪框确定的。
S402、从对象中确定关节点个数最大的对象。
S403、将关节点个数最大的对象的关节点确定为目标对象的关节点。
在本发明实施例中,由于目标图像区域是根据目标对象的特征部位的跟踪框确定的,因此,目标图像区域可以为目标对象的特征部位的跟踪框,也可以为比目标对象的特征部位的跟踪框更大的区域,本实施例对此不做限定。
由于目标对象的关节点中一般有较多的关节点会落入目标对象的特征部位上,因此,针对每一个对象的关节点,可移动平台的控制装置需要确定位于目标图像区域之内的关节点个数,得到多个关节点个数中最大的关节点个数,并在所有对象中将关节点个数最大的对象作为目标对象,即目标对象的关节点为关节点个数最大的对象的关节点。
例如,图像中有两个对象,分别为对象1和对象2。在图像中,对象1的关节点位于目标图像区域之内的关节点个数为2个,对象2的关节点位于目标图像区域之内的关节点个数为6个。相比于对象1,对象2有较多的关节点落入了目标图像区域,因此将对象2确定为目标对象,将对象2的关节点作为目标对象的关节点。
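下面给出上述“统计落入目标图像区域内的关节点个数并取最大者”这一步骤的示意性草图(仅为示例;其中直接以目标对象特征部位的跟踪框作为目标图像区域,属于简化假设):

```python
def pick_target_by_joint_count(joints_per_object, target_region):
    """joints_per_object: {对象id: [(x, y), ...] 该对象的一组关节点坐标};
    target_region: (x1, y1, x2, y2),此处直接取目标对象特征部位的跟踪框;
    返回落入目标图像区域内关节点个数最多的对象id及其关节点。"""
    x1, y1, x2, y2 = target_region

    def inside(pt):
        return x1 <= pt[0] <= x2 and y1 <= pt[1] <= y2

    best_id = max(joints_per_object,
                  key=lambda oid: sum(inside(p) for p in joints_per_object[oid]))
    return best_id, joints_per_object[best_id]

# 对应文中示例:对象2有6个关节点落入目标图像区域,对象1只有2个,故对象2被确定为目标对象。
```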
图5为本发明一实施例提供的根据目标对象的特征部位的跟踪框从对象的关节点中确定目标对象的关节点的方法的流程图,如图5所示,所述方法可以包括:
S501、根据每一个对象的关节点确定每一个对象的预测特征部位的跟踪框。
S502、从预测特征部位的跟踪框中确定与目标对象的特征部位的跟踪框 重合程度最大的目标预测特征部位的跟踪框。
S503、将目标预测特征部位的跟踪框对应的对象的关节点确定为目标对象的关节点。
在本发明实施例中,每一组关节点对应一个对象,可移动平台的控制装置根据每一个对象的关节点可以预测该对象的特征部位,即可以根据每一个对象的关节点确定该对象的预测特征部位,其中,对象的预测特征部位可以以跟踪框来表示,即根据每一个对象的关节点确定每一个对象的预测特征部位的跟踪框。例如,当特征部位是人体时,即可以根据每一个对象的关节点确定每一个对象的预测人体的跟踪框。例如,当特征部位是头部时,即可以根据每一个对象的关节点确定每一个对象的预测头部的跟踪框。然后,比较每一个对象的预测特征部位的跟踪框和目标对象的特征部位的跟踪框的重合程度,得到重合程度最大的预测特征部位的跟踪框作为目标预测特征部位的跟踪框,将该目标预测特征部位的跟踪框所对应的对象作为目标对象,这样目标对象的关节点为重合程度最大的目标预测特征部位的跟踪框对应的对象的关节点。
例如,图像中有两个对象,分别为对象1和对象2。在图像中,对象1的关节点确定的预测特征部位的跟踪框与目标对象的特征部位的跟踪框的重合程度为80%,对象2的关节点确定的预测特征部位的跟踪框与目标对象的特征部位的跟踪框的重合程度为10%,相比对象2而言,重合程度最大的预测特征部位的跟踪框为对象1的预测特征部位的跟踪框,可以将对象1的关节点为目标对象的关节点。
综上所述,无论采用上述哪种方式,可移动平台的控制装置皆可以根据特征部位的跟踪框,从对象的关节点中确定目标对象的关节点,并根据确定的目标对象的关节点确定目标对象的特征部位的跟踪框和目标对象的手掌的检测框之间的对应关系,准确匹配目标对象和目标对象的手掌,避免目标对象与其他对象的手掌进行匹配,而导致目标对象控制可移动平台失败。
可选地,在上述实施例的基础上,结合图6对图2中根据目标对象的关节点从对象的手掌的检测框中确定目标对象的手掌的检测框的具体方式对进行详细的说明。
图6为本发明一实施例提供的根据目标对象的关节点从对象的手掌的检 测框中确定目标对象的手掌的检测框的方法的流程图,如图6所示,所述方法可以包括:
S601、从目标对象的关节点中确定目标关节点。
S602、将对象的手掌的检测框中距离目标关节点最近的手掌的检测框确定为目标对象的手掌的检测框。
在本发明实施例中,可移动平台的控制装置从图像中可以确定每一个关节点的类型和位置,因此,为了便于匹配目标对象与目标对象的手掌,可以从目标对象的关节点选择出一个或多个目标关节点。可选地,目标关节点包括手掌关节点和/或手肘关节点。在一般情况下,目标关节点与目标对象的手掌的检测框的距离是最近的,具体地,目标关节点与目标对象的手掌的检测框的中心点的距离是最近的,因此,可以比较目标关节点与每个对象的手掌的检测框的距离,将距离目标关节点最近的手掌的检测框作为目标对象的手掌的检测框。
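下面给出“根据目标关节点选取最近手掌检测框”的一个简化示意(仅为草图;以目标关节点到手掌检测框中心点的欧氏距离作为距离度量,函数名与数据结构均为示例性假设):

```python
import math

def pick_palm_box(target_joints, palm_boxes):
    """target_joints: [(x, y), ...] 目标对象的目标关节点(例如手掌关节点和/或手肘关节点);
    palm_boxes: [(x1, y1, x2, y2), ...] 图像中所有对象的手掌检测框;
    返回中心点距离目标关节点最近的手掌检测框,作为目标对象的手掌的检测框。"""
    def center(box):
        return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)

    def min_dist(box):
        cx, cy = center(box)
        return min(math.hypot(cx - jx, cy - jy) for jx, jy in target_joints)

    return min(palm_boxes, key=min_dist)
```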
传统目标对象的跟踪算法是对目标对象的单一特征部位进行跟踪,如将目标对象的人体作为跟踪目标,或者,将目标对象的人体的预设部位(例如人体的头部)作为跟踪目标。然而,在对目标对象的单一特征部位进行跟踪的过程中,由于可移动平台与目标对象的距离在变化,目标对象的特征部位的跟踪框在拍摄图像中的尺寸占比也随之变化,这样会影响跟踪的效果。例如,当可移动平台与目标对象的距离很近时,则目标对象的特征部位的跟踪框在拍摄图像中的尺寸占比较大,会造成跟踪速度变慢,进而容易造成目标对象跟踪丢失,跟踪控制的可靠性变差;当可移动平台与目标对象的距离较远时,则目标对象的特征部位的跟踪框在拍摄图像中的尺寸占比较小,会造成跟踪到的目标对象的特征模糊,跟踪控制的可靠性变差。因此,为了可移动平台的控制装置能够在不同场景下可靠地对目标对象进行跟踪,对S202中的确定目标对象的特征部位的跟踪框的具体方式进行详细的说明。
可选地,当目标对象的跟踪参数满足预设的第一条件时,从图像中确定目标对象的特征部位的跟踪框为第一特征部位的跟踪框。
在本发明实施例中,可移动平台的控制装置可以获取到目标对象的跟踪参数,并将目标对象的跟踪参数与预设的第一条件进行比较,判断目标对象的跟踪参数是否满足预设的第一条件。可选地,目标对象的跟踪参数满足预 设的第一条件包括:目标对象在图像中的尺寸占比小于或等于预设第一占比阈值,和/或,目标对象与可移动平台的距离大于或等于预设第一距离。
当目标对象在图像中的尺寸占比小于或等于预设第一占比阈值,或者目标对象与可移动平台的距离大于或等于预设第一距离,或者上述两种情况皆会满足时,目标对象的局部图像区域所占图像的面积小,整个目标对象能够处于图像中,可移动平台的控制装置可以将第一特征部位的跟踪框作为目标对象的特征部位的跟踪框。可选地,第一特征部位为目标对象的人体。
可选地,当目标对象的跟踪参数满足预设的第二条件时,从图像中确定目标对象的特征部位的跟踪框为第二特征部位的跟踪框。
在本发明实施例中,可移动平台可以获取到目标对象的跟踪参数,将目标对象的跟踪参数与预设的第二条件进行比较,判断目标对象的跟踪参数是否满足预设的第二条件。可选地,目标对象的跟踪参数满足预设的第二条件包括:目标对象在图像中的尺寸占比大于或等于预设第二占比阈值,和/或,目标对象与可移动平台的距离小于或等于预设第二距离。
当目标对象在图像中的尺寸占比大于或等于预设第一占比阈值,或者目标对象与可移动平台的距离小于或等于预设第一距离,或者上述两种情况皆会满足时,目标对象的局部图像区域所占图像的面积大,目标对象的整体图像可能已经超出图像边界,可移动平台的控制装置可以将第二特征部位的跟踪框作为目标对象的特征部位的跟踪框。可选地,第二特征部位为目标对象的头部,或者,头部和肩部。
综上所述,可移动平台的控制装置通过检测目标对象的跟踪参数所满足预设的条件来区分不同场景,使得可移动平台的控制装置能够根据与当前目标对象的跟踪参数相适应的特征部位对目标对象进行识别,更加精准地实现了目标对象的特征部位的跟踪框与目标对象的手掌的检测框的匹配。
图7为本发明一实施例提供的可移动平台的控制装置的结构示意图,如图7所示,本实施例的可移动平台的控制装置700可以包括:处理器701和存储器702;
所述存储器702,用于存储计算机程序;
所述处理器701,用于执行所述存储器存储的计算机程序,以执行:
获取拍摄装置输出的图像;
从所述图像中确定目标对象的特征部位的跟踪框,其中,所述目标对象为所述可移动平台跟踪的对象;
识别所述图像中对象的关节点;
识别所述图像中对象的手掌的检测框;
根据所述特征部位的跟踪框从所述对象的关节点中确定所述目标对象的关节点;
根据所述目标对象的关节点从所述对象的手掌的检测框中确定所述目标对象的手掌的检测框。
可选地,所述处理器701,还用于识别所述目标对象的手掌的检测框的动作特征,以控制所述可移动平台执行所述动作特征指示动作。
可选地,所述处理器701,具体用于:
确定每一个对象的关节点中位于目标图像区域之内的关节点个数,其中,所述目标图像区域是根据所述目标对象的特征部位的跟踪框确定的;
从所述对象中确定所述关节点个数最大的对象;
将所述关节点个数最大的对象的关节点确定为所述目标对象的关节点。
可选地,所述处理器701,具体用于:
根据每一个对象的关节点确定每一个对象的预测特征部位的跟踪框;
从所述预测特征部位的跟踪框中确定与目标对象的特征部位的跟踪框重合程度最大的目标预测特征部位的跟踪框;
将所述目标预测特征部位的跟踪框对应的对象的关节点确定为所述目标对象的关节点。
可选地,所述处理器701,具体用于:
从所述目标对象的关节点中确定目标关节点;
将所述对象的手掌的检测框中距离所述目标关节点最近的手掌的检测框确定为所述目标对象的手掌的检测框。
可选地,所述目标关节点包括手掌关节点和/或手肘关节点。
可选地,所述处理器701,具体用于:
当所述目标对象的跟踪参数满足预设的第一条件时,从所述图像中确定所述目标对象的特征部位的跟踪框为第一特征部位的跟踪框。
可选地,所述目标对象的跟踪参数满足预设的第一条件包括:所述目标 对象在所述图像中的尺寸占比小于或等于预设第一占比阈值,和/或,所述目标对象与所述可移动平台的距离大于或等于预设第一距离。
可选地,所述第一特征部位为所述目标对象的人体。
可选地,所述处理器701,具体用于:
当所述目标对象的跟踪参数满足预设的第二条件时,从所述图像中确定所述目标对象的特征部位的跟踪框为第二特征部位的跟踪框。
可选地,所述目标对象的跟踪参数满足预设的第二条件包括:所述目标对象在所述图像中的尺寸占比大于或等于预设第二占比阈值,和/或,所述目标对象与所述可移动平台的距离小于或等于预设第二距离。
可选地,所述第二特征部位为所述目标对象的头部,或者,头部和肩部。
当所述存储器702是独立于处理器701之外的器件时,所述可移动平台的控制装置700还可以包括:
总线703,用于连接处理器701和所述存储器702。
本实施例的可移动平台的控制装置,可以用于执行上述各方法实施例中的技术方案,其实现原理和技术效果类似,此处不再赘述。
图8为本发明一实施例提供的可移动平台的结构示意图,如图8所示,本实施例的可移动平台800可以包括:拍摄装置801和控制装置802。其中,所述拍摄装置801,用于输出图像。控制装置802可以采用图7所示装置实施例的结构,其对应地,可以执行上述任一方法实施例的技术方案,其实现原理和技术效果类似,此处不再赘述。
在一些实施例中,可移动平台800可以是无人机。
在无人机101对目标对象进行跟踪的场景中,对象104的个数可以为一个或多个,进一步地,对象104中可以包括目标对象,其中,目标对象为无人机101追踪的对象。无人机101能够通过拍摄装置103拍摄的图像对目标对象进行跟踪。然而,目标对象通常处于运动状态,无人机101也会从不同空中视角进行拍摄,因此图像上目标对象会呈现出不同的状态,传统目标对象的跟踪算法只是跟踪当前图像帧中与历史时刻拍摄装置拍摄的图像中目标对象最相似的图像区域,因此当目标对象被遮挡,或者背景上出现与目标对象相似的干扰区域,例如,背景上出现干扰对象时,无人机101就容易跟错到背景中的干扰对象身上;另外,在某些场景中,图像中有可能会存在多个对 象,如图1中,三个人在交叉运动时,多个对象在图像上原本就比较相似,因此,若只通过传统目标对象的跟踪算法对对象中的目标对象进行跟踪,往往容易跟错到另外一个对象身上。
为了解决上述问题,可移动平台的控制装置通过对目标对象的跟踪框和检测框进行互斥地匹配,能够用更加精准的对象的检测框来实时更新对象的跟踪框,使得可移动平台能够准确识别跟踪的对象,完成稳定持续的追踪和拍摄过程,解决现有技术中可移动平台由于其他对象的干扰而导致可移动平台错跟对象以及相似的干扰区域而导致错跟背景的问题。下面,通过具体实施例,对可移动平台的控制方法进行详细说明。
图9为本发明一实施例提供的可移动平台的控制方法的流程图,如图9所示,本实施例的可移动平台的控制方法可以包括:
S901、获取当前时刻拍摄装置拍摄的图像,其中,图像中包括至少一个对象。
在本发明实施例中,可移动平台可以配置有拍摄装置,拍摄装置用于拍摄并输出图像。可移动平台的控制装置可以接收拍摄装置输出的图像,进一步地,控制装置的处理器可以接收拍摄装置输出的图像。其中,在该图像中包括至少一个对象,对象可以为图像中的人,本实施例对图像中对象的个数不做限定。
S902、确定图像中对象的特征部位的检测框。
在本发明实施例中,可移动平台的控制装置可以对图像中的每一个对象的特征部位的检测框进行识别。其中,每一个对象的特征部位的检测框为该对象的特征部位对应的图像区域,可移动平台的控制装置通过对图像中的特征部位的检测来识别环境中的每一个对象,其中,特征部位可以为头部,或者头部和肩部,也可以为人体,本实施例对此不做限定。其中,检测框可以以图像坐标的形式表示,例如,检测框可以以图像区域的左上角的坐标和右下角的坐标来表示。
可选地,可以通过预设的神经网络确定图像中对象的特征部位的检测框。在本发明实施例中,预设的神经网络可以为对大量离线图像中人的特征部位训练得到的神经网络。可移动平台的控制装置可以使用该神经网络实时检测图像并得到每一个对象的特征部位的检测框。其中,该神经网络可以包括 CNN、普通深度神经网络、循环网络等,本实施例对此不做限定。
S903、确定图像中对象的特征部位的跟踪框。
在本发明实施例中,可移动平台的控制装置在获取拍摄装置拍摄的图像后,可以确定图像中每一个对象的特征部位的跟踪框。其中,每一个对象的特征部位的跟踪框为该对象的特征部位对应的图像区域,其中,跟踪框可以以图像坐标的形式表示,例如,跟踪框可以以图像区域的左上角的坐标和右下角的坐标来表示。
可选地,可以根据历史时刻拍摄装置拍摄的图像中对象的特征部位的跟踪框确定图像中对象的特征部位的跟踪框。具体地,可以采用传统目标对象的跟踪算法确定图像中每一个对象的特征部位的跟踪框,对于任意对象而言,根据历史时刻拍摄装置拍摄的图像得到该对象的特征部位的跟踪框,其中,历史时刻拍摄装置拍摄的图像可以为在当前时刻之前拍摄装置拍摄的图像。以当前图像的上一帧或上一时刻的该对象的特征部位的跟踪框为中心,扩展一个局部范围,且根据训练得到的图像相似度函数,在这个局部范围内确定一个跟该对象的特征部位最相似的图像区域,并将这个图像区域作为该对象的特征部位的跟踪框。其中,该图像相似度函数中训练的参数包括欧氏距离、街区距离、棋盘距离、加权距离、巴特查理亚系数、Hausdorff距离等。且除了可以采用上述相似性度量算法以外,可移动平台的控制装置还可以采用核心搜索算法,如卡尔曼滤波、粒子滤波器、均值漂移(Meanshift)算法,扩展的meanshift算法等,也可包括自相关滤波器(Correlation Filter)算法、随机森林(Random forest)算法以及支持向量积(Support Vector Machine)算法等,本实施例不限于上述算法。
需要说明的是,上述S902和S903之间没有时序上的先后顺序,且S902和S903可以同时执行,也可以顺序执行。
S904、将跟踪框中的每一个与检测框互斥地匹配或者将检测框中的每一个与跟踪框互斥地匹配以确定多个匹配结果。
在本发明实施例中,为了能够用更加精准的对象的检测框来实时更新对象的跟踪框以获取更加准确的对象的跟踪框,因此,在得到了对象的特征部位的检测框和跟踪框以后,可移动平台的控制装置需要确定对象的特征部位的检测框与对象的特征部位的跟踪框之间的匹配关系。
可选地,可以将跟踪框中的每一个与检测框互斥地匹配,即跟踪框中的每一个只能与检测框中的一个匹配,并且当跟踪框为多个时,跟踪框中的任意两个不能与同一个检测框匹配。可选地,可以将检测框中的每一个与跟踪框互斥地匹配,即检测框中的每一个只能与跟踪框中的一个进行匹配,并且当检测框为多个时,检测框中的任意两个不能与同一个跟踪框匹配。通过这种匹配方式,每一种匹配组合都可以确定出一个匹配结果,其中,匹配结果可以表示这种匹配组合是最佳的匹配组合的可能性。
S905、根据多个匹配结果确定检测框中的目标检测框以及跟踪框中与目标检测框匹配成功的目标跟踪框。
在本发明实施例中,在获得了多个匹配结果后,即可以根据匹配结果来判断哪个匹配结果对应的检测框与跟踪框的匹配组合是最佳的匹配组合。当匹配结果指示某一个匹配组合是最佳的匹配组合时,即可以确定该匹配组合中的检测框和检测框匹配成功,即可以将该匹配组合中的检测框确定为目标检测框,将与目标检测框匹配成功的跟踪框确定为目标跟踪框。
S906、利用目标检测框对目标跟踪框进行更新以获取更新后的特征部位的跟踪框。
在本发明实施例中,确定了目标检测框和目标跟踪框之后,由于检测框相对于跟踪框更加精准,因此,可以使用目标检测框对跟踪框中的目标跟踪框进行更新。进而,可移动平台的控制装置利用目标检测框对跟踪框中的目标跟踪框进行更新,这样,可以获取当前图像帧中更加精准的跟踪框,实现检测框对跟踪框的纠正。
本发明实施例提供的可移动平台的控制方法,通过所有对象的特征部位的检测框和跟踪框进行互斥性匹配,再利用匹配成功的目标检测框对与目标检测框匹配成功的目标跟踪框进行更新,得到更新后的特征部位的目标跟踪框。本发明实施例能够完成所有对象的特征部位的跟踪框的更新过程,根据更加准确的追踪对象的特征部位的跟踪框,提高了可移动平台跟踪的精准度,解决了现有技术中由于其他对象的干扰以及背景相似区域的干扰而导致可移动平台错跟对象的问题,从而在复杂多变的用户使用环境中为可移动平台的控制提供稳定可靠的跟踪的对象。
首先,在上述图9实施例的基础上,对将跟踪框中的每一个与检测框互 斥地匹配或者将检测框中的每一个与跟踪框互斥地匹配以确定多个匹配结果的具体过程进行详细的说明。
可选地,当跟踪框的个数小于检测框的个数时,将跟踪框中的每一个与检测框互斥地匹配以确定多个匹配结果。
可选地,当跟踪框的个数大于检测框的个数时,将检测框中的每一个与跟踪框互斥地匹配以确定多个匹配结果。
在本发明实施例中,当跟踪框的个数小于检测框的个数时,每一个跟踪框在检测框中进行互斥地匹配,得到最匹配的多个匹配结果。当跟踪框的个数大于检测框的个数时,每一个检测框在跟踪框中进行互斥性匹配,得到最匹配的多个匹配结果。其中,当跟踪框的个数等于检测框的个数时,可选择上述任一方式进行互斥性匹配。
可选地,在上述实施例的基础上,结合图10对图9中将跟踪框中的每一个与检测框互斥地匹配或者将检测框中的每一个与跟踪框互斥地匹配以确定多个匹配结果的具体方式对进行详细的说明。
图10为本发明一实施例提供的将跟踪框中的每一个与检测框互斥地匹配或者将检测框中的每一个与跟踪框互斥地匹配以确定多个匹配结果的方法的流程图,如图10所示,所述方法可以包括:
S1001、确定每一检测框与每一个跟踪框的之间的匹配程度系数。
在本发明实施例中,为了确定检测框和跟踪框中每一种匹配组合的匹配结果,可以通过检测框和跟踪框的比较以确定每一检测框与每一个跟踪框的之间的匹配程度系数,其中,匹配程度系数可以表示一个检测框与一个跟踪框之间相似程度的参数,即表示一个检测框与一个跟踪框之间匹配程度的参数。在某些情况中,匹配程度系数越大,可以表示与该匹配程度系数对应的跟踪框和检测框之间相似程度越高。
可选地,确定每一检测框与每一个跟踪框的之间的匹配程度系数包括:根据检测框和跟踪框内图像的相似程度、检测框和跟踪框的重合程度、检测框和跟踪框的大小匹配程度中的至少一个确定每一检测框与每一个跟踪框的之间的匹配程度系数。
在本发明实施例中,检测框和跟踪框内图像的相似程度,即检测框内的图像与跟踪框内的图像之间的相似程度,可以通过对检测框和跟踪框内的图像进行加权和归一化处理得到颜色分布,用颜色分布来表征检测框和跟踪框内图像的相似程度。
在本发明实施例中,检测框和跟踪框的重合程度,即检测框和跟踪框的位置匹配程度,可以通过计算检测框和跟踪框的几何中心的距离来表征检测框和跟踪框的重合程度,或者可以计算检测框与跟踪框之间的交集与并集之比来表征检测框和跟踪框的重合程度。
在本发明实施例中,检测框和跟踪框的大小匹配程度,即检测框和跟踪框在几何上的匹配程度,可以通过计算检测框和跟踪框的大小之比或者大小之差来表征。
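结合上述三种度量,匹配程度系数的计算可以用如下草图示意(仅为示例:以交并比表征重合程度、以面积之比表征大小匹配程度,框内图像相似程度作为外部输入,各权重均为假设值,并非本发明限定的计算方式):

```python
def iou(a, b):
    """两个框 (x1, y1, x2, y2) 的交并比,用于表征检测框与跟踪框的重合程度。"""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def size_match(a, b):
    """大小匹配程度:两框面积之比(较小值/较大值),取值范围 [0, 1]。"""
    area = lambda r: max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])
    sa, sb = area(a), area(b)
    return min(sa, sb) / max(sa, sb) if sa > 0 and sb > 0 else 0.0

def matching_coefficient(det_box, trk_box, appearance_sim,
                         w_app=0.4, w_iou=0.4, w_size=0.2):
    """appearance_sim: 检测框与跟踪框内图像的相似程度(0~1,例如由颜色分布得到,此处作为输入);
    返回值越大表示该检测框与该跟踪框越匹配;各权重为假设值。"""
    return (w_app * appearance_sim
            + w_iou * iou(det_box, trk_box)
            + w_size * size_match(det_box, trk_box))
```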
S1002、根据匹配程度系数将跟踪框中的每一个与检测框互斥地匹配或者将检测框中的每一个与跟踪框互斥地匹配以确定多个匹配结果。
在本发明实施例中,在确定了每一个检测框与每一个跟踪框的之间的匹配程度系数之后,可移动平台的控制装置将跟踪框中的每一个与检测框互斥地匹配或者将检测框中的每一个与跟踪框互斥地匹配,根据匹配程度系数能够确定每一个匹配组合的匹配结果,根据得到的多个匹配结果即可以确定哪一个匹配组合是最佳的匹配组合。
下面将举例对S1002中根据匹配程度系数将跟踪框中的每一个与检测框互斥地匹配或者将检测框中的每一个与跟踪框互斥地匹配以确定多个匹配结果,根据多个匹配结果确定检测框中的目标检测框以及跟踪框中与目标检测框匹配成功的目标跟踪框的过程进行详细的解释。
例如,从图像中确定的跟踪框有两个,分别为跟踪框1和跟踪框2,从图像中确定的检测框有三个,分别为检测框1、检测框2以及检测框3。如表1表明每一个跟踪框与每一个检测框之间的匹配程度系数,其中,Cij表示第i个跟踪框与第j个检测框之间的匹配程度系数,i≤2,j≤3,i和j皆为正整数。
表1 每一个跟踪框与每一个检测框的匹配程度系数

 | 检测框1 | 检测框2 | 检测框3 |
---|---|---|---|
跟踪框1 | C11 | C12 | C13 |
跟踪框2 | C21 | C22 | C23 |
将跟踪框1和跟踪框2中每一个与三个检测框互斥性匹配,可以得到6种互斥的匹配组合:
1、跟踪框1与检测框1匹配,跟踪框2与检测框2匹配,则在该匹配组合中,匹配结果可以用C11+C22的匹配程度系数之和表示;
2、跟踪框1与检测框1匹配,跟踪框2与检测框3匹配,则在该匹配组合中,匹配结果可以用C11+C23的匹配程度系数之和表示;
3、跟踪框1与检测框2匹配,跟踪框2与检测框1匹配,则在该匹配组合中,匹配结果可以用C12+C21的匹配程度系数之和表示;
4、跟踪框1与检测框2匹配,跟踪框2与检测框3匹配,则在该匹配组合中,匹配结果可以用C12+C23的匹配程度系数之和表示;
5、跟踪框1与检测框3匹配,跟踪框2与检测框1匹配,则在该匹配组合中,匹配结果可以用C13+C21的匹配程度系数之和表示;
6、跟踪框1与检测框3匹配,跟踪框2与检测框2匹配,则在该匹配组合中,匹配结果可以用C13+C22的匹配程度系数之和表示;
比较这6种匹配组合的匹配结果,从6种匹配结果中确定匹配程度系数之和的数值最大的一组匹配结果,将该组匹配结果对应的跟踪框确定为目标跟踪框,将与跟踪框匹配的检测框确定为目标检测框,例如,6种匹配结果中C13+C22数值最大,则确定跟踪框1应该与检测框3匹配成功,跟踪框2与检测框2匹配成功,则目标跟踪框为跟踪框1和跟踪框2,目标检测框为检测框3和检测框2。为了获取更加精准的跟踪框,可以使用检测框3更新跟踪框1,使用检测框2更新跟踪框2。
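上述“穷举互斥匹配组合并取匹配程度系数之和最大者”的过程,可以用如下 Python 草图示意(仅为说明;当框的数量较多时实际实现可改用更高效的分配算法,此处按文中描述的穷举方式给出,示例数值均为假设):

```python
from itertools import permutations

def best_assignment(coeff):
    """coeff[i][j]: 第 i 个跟踪框与第 j 个检测框的匹配程度系数(要求跟踪框个数 <= 检测框个数)。
    互斥地把每个跟踪框匹配到不同的检测框,返回匹配程度系数之和最大的匹配组合及该和。"""
    n_trk, n_det = len(coeff), len(coeff[0])
    best_sum, best_match = float("-inf"), None
    for assign in permutations(range(n_det), n_trk):   # 穷举所有互斥匹配组合
        s = sum(coeff[i][j] for i, j in enumerate(assign))
        if s > best_sum:
            best_sum, best_match = s, list(enumerate(assign))
    return best_match, best_sum

# 对应文中示例:2 个跟踪框、3 个检测框,若 C13 + C22 最大,
# 则跟踪框1与检测框3匹配成功、跟踪框2与检测框2匹配成功(数值为假设)。
coeff = [[0.2, 0.3, 0.9],   # C11, C12, C13
         [0.1, 0.8, 0.4]]   # C21, C22, C23
print(best_assignment(coeff))   # 输出 ([(0, 2), (1, 1)], ...),即跟踪框1对应检测框3,跟踪框2对应检测框2
```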
可选地,在上述实施例的基础上,步骤S901中,至少一个对象中包括目标对象,其中,目标对象为可移动平台跟踪的对象。S902中确定当前时刻拍摄装置拍摄的图像中对象的特征部位的跟踪框,其中,对象的特征部位的跟踪框包括目标对象的特征部位的跟踪框。S903中确定当前时刻拍摄装置拍摄的图像中对象的特征部位的检测框,其中,对象的特征部位的检测框包括目标对象的特征部位的检测框。
在使用如前所述方法,使用目标检测框更新目标跟踪框之后,更新后的对象的跟踪框中包括更新后的目标对象的跟踪框,这样就得到了当前时刻拍摄装置拍摄的图像中更新后的目标对象的特征部位的跟踪框。这种方式即可以更新得到图像中目标对象的特征部位的跟踪框,也是本文前述部分中提到的可移动平台从图像中确定目标对象的特征部位的跟踪框第二种可行的实施 方式。
可选地,在上述实施例的基础上,S901中,所述至少一个对象中包括目标对象和干扰对象,其中,干扰对象是对象中除目标对象之外的其他对象,S902中获取当前拍摄装置拍摄的图像中对象的特征部位的跟踪框,对象的特征部位的跟踪框包括目标对象的特征部位的跟踪框和干扰对象的特征部位的跟踪框,S903中获取当前拍摄装置拍摄的图像中对象的特征部位的检测框,对象的特征部位的检测框包括目标对象的特征部位的检测框和干扰对象的特征部位的检测框。通过这种方式即可以更新得到图像中目标对象的特征部位的跟踪框和干扰对象的特征部位的跟踪框,可以更加精准地对目标对象进行追踪。
可选地,当干扰对象的跟踪框匹配不到检测框时,将跟踪框从更新后的特征部位的跟踪框中删除。
在本发明实施例中,当干扰对象的跟踪框匹配不到检测框时,说明干扰对象可能已经不在当前时刻拍摄装置拍摄的图像中。可以在预设的时间内,继续对该干扰对象的跟踪框进行与检测框互斥性匹配,若该干扰对象的跟踪框仍匹配不到检测框,则将干扰对象的跟踪框从更新后的特征部位的跟踪框中删除。其中,预设的时间可为3帧。
可选地,当检测框中的一个或多个匹配不到跟踪框时,将一个或多个检测框添加到更新后的特征部位的跟踪框中。
在本发明实施例中,当检测框中的一个或多个匹配不到跟踪框时,说明在当前时刻拍摄装置拍摄的图像中出现了其他对象,即除目标对象以外的对象,因此,可在更新后的特征部位的跟踪框的基础上,将一个或多个检测框新建为新的跟踪框,添加到更新后的特征部位的跟踪框中,便能够充分考虑其他对象对目标对象的干扰,避免由于出现其他对象而导致可移动平台错将其他对象作为目标对象。
为了避免可移动平台与目标对象的距离对可移动平台跟踪控制目标对象所产生的不良效果,在本发明实施例中,通过不同的场景对目标对象的特征部位的跟踪框进行的具体设定,使得可移动平台的控制装置能够可靠且持续跟踪目标对象。
可选地,在目标对象的跟踪参数满足预设的第一条件时,对象的特征部 位的跟踪框为第一特征部位的跟踪框。
在本发明实施例中,可移动平台的控制装置可以获取到目标对象的跟踪参数,并将目标对象的跟踪参数与预设的第一条件进行比较,判断目标对象的跟踪参数是否满足预设的第一条件。可选地,目标对象的跟踪参数满足预设的第一条件包括:目标对象在图像中的尺寸占比小于或等于预设第一占比阈值,和/或,目标对象与可移动平台的距离大于或等于预设第一距离。
当目标对象在图像中的尺寸占比小于或等于预设第一占比阈值,或者目标对象与可移动平台的距离大于或等于预设第一距离,或者上述两种情况皆会满足时,对象的局部图像区域所占图像的面积小,对象的整体图像能够处于图像中,可移动平台的控制装置可以将第一特征部位的跟踪框作为对象的特征部位的跟踪框。可选地,第一特征部位为对象的人体。
可选地,在目标对象的跟踪参数满足预设的第二条件时,对象的特征部位的跟踪框为第二特征部位的跟踪框。
在本发明实施例中,可移动平台可以获取到目标对象的跟踪参数,将目标对象的跟踪参数与预设的第二条件进行比较,判断目标对象的跟踪参数是否满足预设的第二条件。可选地,目标对象的跟踪参数满足预设的第二条件包括:目标对象在图像中的尺寸占比大于或等于预设第二占比阈值,和/或,目标对象与可移动平台的距离小于或等于预设第二距离。
当目标对象在图像中的尺寸占比大于或等于预设第一占比阈值,或者目标对象与可移动平台的距离小于或等于预设第一距离,或者上述两种情况皆会满足时,对象的局部图像区域所占图像的面积大,对象的整体图像可能已经超出图像边界,可移动平台的控制装置可以将第二特征部位的跟踪框作为对象的特征部位的跟踪框。可选地,第二特征部位为对象的头部,或者,头部和肩部。
综上所述,可移动平台的控制装置通过确定目标对象的跟踪参数所满足预设的条件来区分不同场景,使得可移动平台的控制装置能够准确获取对象的特征部位的跟踪框,实现了目标对象的特征部位的跟踪框与目标对象的手掌的检测框的精准匹配。
图11为本发明一实施例提供的可移动平台的控制装置的结构示意图,如图11所示,本实施例的可移动平台的控制装置1100可以包括:处理器1101 和存储器1102;
所述存储器1102,用于存储计算机程序;
所述处理器1101,用于执行所述存储器存储的计算机程序,以执行:
获取当前时刻拍摄装置拍摄的图像,其中,所述图像中包括至少一个对象;
确定所述图像中对象的特征部位的检测框;
确定所述图像中对象的特征部位的跟踪框;
将所述跟踪框中的每一个与所述检测框互斥地匹配或者将所述检测框中的每一个与所述跟踪框互斥地匹配以确定多个匹配结果;
根据所述多个匹配结果确定所述检测框中的目标检测框以及所述跟踪框中与所述目标检测框匹配成功的目标跟踪框;
利用所述目标检测框对所述目标跟踪框进行更新以获取更新后的特征部位的跟踪框。
可选地,所述处理器1101,具体用于:
通过预设的神经网络确定所述图像中对象的特征部位的检测框。
可选地,所述处理器1101,具体用于:
根据历史时刻拍摄装置拍摄的图像中对象的特征部位的跟踪框确定所述图像中对象的特征部位的跟踪框。
可选地,所述处理器1101,具体用于:
当所述跟踪框的个数小于所述检测框的个数时,将所述跟踪框中的每一个与所述检测框互斥地匹配以确定多个匹配结果。
可选地,所述处理器1101,具体用于:
当所述跟踪框的个数大于所述检测框的个数时,将所述检测框中的每一个与所述跟踪框互斥地匹配以确定多个匹配结果。
可选地,所述处理器1101,具体用于:
确定每一检测框与每一个跟踪框的之间的匹配程度系数;
根据所述匹配程度系数将所述跟踪框中的每一个与所述检测框互斥地匹配或者将所述检测框中的每一个与所述跟踪框互斥地匹配以确定所述多个匹配结果。
可选地,所述处理器1101,具体用于:
根据所述检测框和所述跟踪框内图像的相似程度、所述检测框和所述跟踪框的重合程度、所述检测框和所述跟踪框的大小匹配程度中的至少一个确定每一检测框与每一个跟踪框的之间的匹配程度系数。
可选地,
所述至少一个对象中包括:目标对象和干扰对象;
所述图像中对象的特征部位的跟踪框包括:所述目标对象的特征部位的跟踪框和所述干扰对象的特征部位的跟踪框,所述图像中对象的特征部位的检测框包括:所述目标对象的特征部位的检测框和所述干扰对象的特征部位的检测框,其中,所述目标对象为所述可移动平台跟踪的对象。
可选地,所述处理器1101,还用于:
当所述干扰对象的跟踪框匹配不到所述检测框时,将所述跟踪框从所述更新后的特征部位的跟踪框中删除。
可选地,所述处理器1101,还用于:
当所述检测框中的一个或多个匹配不到所述跟踪框时,将所述一个或多个检测框添加到所述更新后的特征部位的跟踪框中。
可选地,在所述目标对象的跟踪参数满足预设的第一条件时,所述对象的特征部位的跟踪框为第一特征部位的跟踪框。
可选地,所述目标对象的跟踪参数满足预设的第一条件包括:所述目标对象在所述图像中的尺寸占比小于或等于预设第一占比阈值,和/或,所述目标对象与所述可移动平台的距离大于或等于预设第一距离。
可选地,所述第一特征部位为所述对象的人体。
可选地,在所述目标对象的跟踪参数满足预设的第二条件时,所述对象的特征部位的跟踪框为第二特征部位的跟踪框。
可选地,所述目标对象的跟踪参数满足预设的第二条件包括:所述目标对象在所述图像中的尺寸占比大于或等于预设第一占比阈值,和/或,所述目标对象与所述可移动平台的距离小于或等于预设第一距离。
可选地,所述第二特征部位为所述对象的头部,或者头部和肩部。
当所述存储器1102是独立于处理器1101之外的器件时,所述可移动平台的控制装置1100还可以包括:
总线1103,用于连接处理器1101和所述存储器1102。
本实施例的可移动平台的控制装置,可以用于执行上述各方法实施例中的技术方案,其实现原理和技术效果类似,此处不再赘述。
图12为本发明一实施例提供的可移动平台的结构示意图,如图12所示,本实施例的可移动平台1200可以包括:拍摄装置1201和控制装置1202。其中,所述拍摄装置1201,用于输出图像。控制装置1202可以采用图11所示装置实施例的结构,其对应地,可以执行上述任一方法实施例的技术方案,其实现原理和技术效果类似,此处不再赘述。
在一些实施例中,可移动平台1200可以是无人机。
本领域普通技术人员可以理解:实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成,前述的程序可以存储于一计算机可读取存储介质中,该程序在执行时,执行包括上述方法实施例的步骤;而前述的存储介质包括:只读内存(英文:Read-Only Memory,简称:ROM)、随机存取存储器(英文:Random Access Memory,简称:RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
最后应说明的是:以上各实施例仅用以说明本发明的技术方案,而非对其限制;尽管参照前述各实施例对本发明进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分或者全部技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本发明各实施例技术方案的范围。
Claims (58)
- 一种可移动平台的控制方法,其特征在于,包括:获取拍摄装置输出的图像;从所述图像中确定目标对象的特征部位的跟踪框,其中,所述目标对象为所述可移动平台跟踪的对象;识别所述图像中对象的关节点;识别所述图像中对象的手掌的检测框;根据所述特征部位的跟踪框从所述对象的关节点中确定所述目标对象的关节点;根据所述目标对象的关节点从所述对象的手掌的检测框中确定所述目标对象的手掌的检测框。
- 根据权利要求1所述的方法,其特征在于,所述方法还包括:识别所述目标对象的手掌的检测框的动作特征,以控制所述可移动平台执行所述动作特征指示的动作。
- 根据权利要求1所述的方法,其特征在于,所述根据所述特征部位的跟踪框从所述对象的关节点中确定所述目标对象的关节点,包括:确定每一个对象的关节点中位于目标图像区域之内的关节点个数,其中,所述目标图像区域是根据所述目标对象的特征部位的跟踪框确定的;从所述对象中确定所述关节点个数最大的对象;将所述关节点个数最大的对象的关节点确定为所述目标对象的关节点。
- 根据权利要求1所述的方法,其特征在于,所述根据所述特征部位的跟踪框从所述对象的关节点中确定所述目标对象的关节点,包括:根据每一个对象的关节点确定每一个对象的预测特征部位的跟踪框;从所述预测特征部位的跟踪框中确定与目标对象的特征部位的跟踪框重合程度最大的目标预测特征部位的跟踪框;将所述目标预测特征部位的跟踪框对应的对象的关节点确定为所述目标对象的关节点。
- 根据权利要求1所述的方法,其特征在于,所述根据所述目标对象的关节点从所述对象的手掌的检测框中确定所述目标对象的手掌的检测框,包括:从所述目标对象的关节点中确定目标关节点;将所述对象的手掌的检测框中距离所述目标关节点最近的手掌的检测框确定为所述目标对象的手掌的检测框。
- 根据权利要求5所述的方法,其特征在于,所述目标关节点包括手掌关节点和/或手肘关节点。
- 根据权利要求1所述的方法,其特征在于,所述从所述图像中确定目标对象的特征部位的跟踪框,包括:当所述目标对象的跟踪参数满足预设的第一条件时,从所述图像中确定所述目标对象的特征部位的跟踪框为第一特征部位的跟踪框。
- 根据权利要求7所述的方法,其特征在于,所述目标对象的跟踪参数满足预设的第一条件包括:所述目标对象在所述图像中的尺寸占比小于或等于预设第一占比阈值,和/或,所述目标对象与所述可移动平台的距离大于或等于预设第一距离。
- 根据权利要求7所述的方法,其特征在于,所述第一特征部位为所述目标对象的人体。
- 根据权利要求1所述的方法,其特征在于,所述从所述图像中确定目标对象的特征部位的跟踪框,包括:当所述目标对象的跟踪参数满足预设的第二条件时,从所述图像中确定所述目标对象的特征部位的跟踪框为第二特征部位的跟踪框。
- 根据权利要求10所述的方法,其特征在于,所述目标对象的跟踪参数满足预设的第二条件包括:所述目标对象在所述图像中的尺寸占比大于或等于预设第二占比阈值,和/或,所述目标对象与所述可移动平台的距离小于或等于预设第二距离。
- 根据权利要求10所述的方法,其特征在于,所述第二特征部位为所述目标对象的头部,或者头部和肩部。
- 一种可移动平台的控制装置,其特征在于,包括:处理器和存储器;所述存储器,用于存储计算机程序;所述处理器,用于执行所述存储器存储的计算机程序,以执行:获取拍摄装置输出的图像;从所述图像中确定目标对象的特征部位的跟踪框,其中,所述目标对象 为所述可移动平台跟踪的对象;识别所述图像中对象的关节点;识别所述图像中对象的手掌的检测框;根据所述特征部位的跟踪框从所述对象的关节点中确定所述目标对象的关节点;根据所述目标对象的关节点从所述对象的手掌的检测框中确定所述目标对象的手掌的检测框。
- 根据权利要求13所述的装置,其特征在于,所述处理器,还用于识别所述目标对象的手掌的检测框的动作特征,以控制所述可移动平台执行所述动作特征指示的动作。
- 根据权利要求13所述的装置,其特征在于,所述处理器,具体用于:确定每一个对象的关节点中位于目标图像区域之内的关节点个数,其中,所述目标图像区域是根据所述目标对象的特征部位的跟踪框确定的;从所述对象中确定所述关节点个数最大的对象;将所述关节点个数最大的对象的关节点确定为所述目标对象的关节点。
- 根据权利要求13所述的装置,其特征在于,所述处理器,具体用于:根据每一个对象的关节点确定每一个对象的预测特征部位的跟踪框;从所述预测特征部位的跟踪框中确定与目标对象的特征部位的跟踪框重合程度最大的目标预测特征部位的跟踪框;将所述目标预测特征部位的跟踪框对应的对象的关节点确定为所述目标对象的关节点。
- 根据权利要求13所述的装置,其特征在于,所述处理器,具体用于:从所述目标对象的关节点中确定目标关节点;将所述对象的手掌的检测框中距离所述目标关节点最近的手掌的检测框确定为所述目标对象的手掌的检测框。
- 根据权利要求17所述的装置,其特征在于,所述目标关节点包括手掌关节点和/或手肘关节点。
- 根据权利要求13所述的装置,其特征在于,所述处理器,具体用于:当所述目标对象的跟踪参数满足预设的第一条件时,从所述图像中确定所述目标对象的特征部位的跟踪框为第一特征部位的跟踪框。
- 根据权利要求19所述的装置,其特征在于,所述目标对象的跟踪参数满足预设的第一条件包括:所述目标对象在所述图像中的尺寸占比小于或等于预设第一占比阈值,和/或,所述目标对象与所述可移动平台的距离大于或等于预设第一距离。
- 根据权利要求19所述的装置,其特征在于,所述第一特征部位为所述目标对象的人体。
- 根据权利要求13所述的装置,其特征在于,所述处理器,具体用于:当所述目标对象的跟踪参数满足预设的第二条件时,从所述图像中确定所述目标对象的特征部位的跟踪框为第二特征部位的跟踪框。
- 根据权利要求22所述的装置,其特征在于,所述目标对象的跟踪参数满足预设的第二条件包括:所述目标对象在所述图像中的尺寸占比大于或等于预设第二占比阈值,和/或,所述目标对象与所述可移动平台的距离小于或等于预设第二距离。
- 根据权利要求22所述的装置,其特征在于,所述第二特征部位为所述目标对象的头部,或者,头部和肩部。
- 一种可移动平台,包括,拍摄装置,和如权利要求13-24任一项所述的控制装置。
- 一种可移动平台的控制方法,其特征在于,包括:获取当前时刻拍摄装置拍摄的图像,其中,所述图像中包括至少一个对象;确定所述图像中对象的特征部位的检测框;确定所述图像中对象的特征部位的跟踪框;将所述跟踪框中的每一个与所述检测框互斥地匹配或者将所述检测框中的每一个与所述跟踪框互斥地匹配以确定多个匹配结果;根据所述多个匹配结果确定所述检测框中的目标检测框以及所述跟踪框 中与所述目标检测框匹配成功的目标跟踪框;利用所述目标检测框对所述目标跟踪框进行更新以获取更新后的特征部位的跟踪框。
- 根据权利要求26所述的方法,其特征在于,所述确定所述图像中对象的特征部位的检测框,包括:通过预设的神经网络确定所述图像中对象的特征部位的检测框。
- 根据权利要求26所述的方法,其特征在于,所述确定所述图像中对象的特征部位的跟踪框,包括:根据历史时刻拍摄装置拍摄的图像中对象的特征部位的跟踪框确定所述图像中对象的特征部位的跟踪框。
- 根据权利要求26所述的方法,其特征在于,所述将所述跟踪框中的每一个与所述检测框互斥地匹配或者将所述检测框中的每一个与所述跟踪框互斥地匹配以确定多个匹配结果,包括:当所述跟踪框的个数小于所述检测框的个数时,将所述跟踪框中的每一个与所述检测框互斥地匹配以确定多个匹配结果。
- 根据权利要求26所述的方法,其特征在于,所述将所述跟踪框中的每一个与所述检测框互斥地匹配或者将所述检测框中的每一个与所述跟踪框互斥地匹配以确定多个匹配结果,包括:当所述跟踪框的个数大于所述检测框的个数时,将所述检测框中的每一个与所述跟踪框互斥地匹配以确定多个匹配结果。
- 根据权利要求26所述的方法,其特征在于,所述将所述跟踪框中的每一个与所述检测框互斥地匹配或者将所述检测框中的每一个与所述跟踪框互斥地匹配以确定多个匹配结果,包括:确定每一检测框与每一个跟踪框的之间的匹配程度系数;根据所述匹配程度系数将所述跟踪框中的每一个与所述检测框互斥地匹配或者将所述检测框中的每一个与所述跟踪框互斥地匹配以确定所述多个匹配结果。
- 根据权利要求31所述的方法,其特征在于,所述确定每一检测框与每一个跟踪框的之间的匹配程度系数,包括:根据所述检测框和所述跟踪框内图像的相似程度、所述检测框和所述跟 踪框的重合程度、所述检测框和所述跟踪框的大小匹配程度中的至少一个确定每一检测框与每一个跟踪框的之间的匹配程度系数。
- 根据权利要求26所述的方法,其特征在于,所述至少一个对象中包括:目标对象和干扰对象;所述图像中对象的特征部位的跟踪框包括:所述目标对象的特征部位的跟踪框和所述干扰对象的特征部位的跟踪框,所述图像中对象的特征部位的检测框包括:所述目标对象的特征部位的检测框和所述干扰对象的特征部位的检测框,其中,所述目标对象为所述可移动平台跟踪的对象。
- 根据权利要求33所述的方法,其特征在于,所述方法还包括:当所述干扰对象的跟踪框匹配不到所述检测框时,将所述跟踪框从所述更新后的特征部位的跟踪框中删除。
- 根据权利要求33所述的方法,其特征在于,所述方法还包括:当所述检测框中的一个或多个匹配不到所述跟踪框时,将所述一个或多个检测框添加到所述更新后的特征部位的跟踪框中。
- 根据权利要求33所述的方法,其特征在于,在所述目标对象的跟踪参数满足预设的第一条件时,所述对象的特征部位的跟踪框为第一特征部位的跟踪框。
- 根据权利要求36所述的方法,其特征在于,所述目标对象的跟踪参数满足预设的第一条件包括:所述目标对象在所述图像中的尺寸占比小于或等于预设第一占比阈值,和/或,所述目标对象与所述可移动平台的距离大于或等于预设第一距离。
- 根据权利要求36所述的方法,其特征在于,所述第一特征部位为所述对象的人体。
- 根据权利要求33所述的方法,其特征在于,在所述目标对象的跟踪参数满足预设的第二条件时,所述对象的特征部位的跟踪框为第二特征部位的跟踪框。
- 根据权利要求39所述的方法,其特征在于,所述目标对象的跟踪参数满足预设的第二条件包括:所述目标对象在所述图像中的尺寸占比大于或等于预设第一占比阈值,和/或,所述目标对象与所述可移动平台的距离小于或等于预设第一距离。
- 根据权利要求39所述的方法,其特征在于,所述第二特征部位为所述对象的头部,或者头部和肩部。
- 一种可移动平台的控制装置,其特征在于,包括:处理器和存储器;所述存储器,用于存储计算机程序;所述处理器,用于执行所述存储器存储的计算机程序,以执行:获取当前时刻拍摄装置拍摄的图像,其中,所述图像中包括至少一个对象;确定所述图像中对象的特征部位的检测框;确定所述图像中对象的特征部位的跟踪框;将所述跟踪框中的每一个与所述检测框互斥地匹配或者将所述检测框中的每一个与所述跟踪框互斥地匹配以确定多个匹配结果;根据所述多个匹配结果确定所述检测框中的目标检测框以及所述跟踪框中与所述目标检测框匹配成功的目标跟踪框;利用所述目标检测框对所述目标跟踪框进行更新以获取更新后的特征部位的跟踪框。
- 根据权利要求42所述的装置,其特征在于,所述处理器,具体用于:通过预设的神经网络确定所述图像中对象的特征部位的检测框。
- 根据权利要求42所述的装置,其特征在于,所述处理器,具体用于:根据历史时刻拍摄装置拍摄的图像中对象的特征部位的跟踪框确定所述图像中对象的特征部位的跟踪框。
- 根据权利要求42所述的装置,其特征在于,所述处理器,具体用于:当所述跟踪框的个数小于所述检测框的个数时,将所述跟踪框中的每一个与所述检测框互斥地匹配以确定多个匹配结果。
- 根据权利要求42所述的装置,其特征在于,所述处理器,具体用于:当所述跟踪框的个数大于所述检测框的个数时,将所述检测框中的每一个与所述跟踪框互斥地匹配以确定多个匹配结果。
- 根据权利要求42所述的装置,其特征在于,所述处理器,具体用于:确定每一检测框与每一个跟踪框的之间的匹配程度系数;根据所述匹配程度系数将所述跟踪框中的每一个与所述检测框互斥地匹配或者将所述检测框中的每一个与所述跟踪框互斥地匹配以确定所述多个匹配结果。
- 根据权利要求47所述的装置,其特征在于,所述处理器,具体用于:根据所述检测框和所述跟踪框内图像的相似程度、所述检测框和所述跟踪框的重合程度、所述检测框和所述跟踪框的大小匹配程度中的至少一个确定每一检测框与每一个跟踪框的之间的匹配程度系数。
- 根据权利要求42所述的装置,其特征在于,所述至少一个对象中包括:目标对象和干扰对象;所述图像中对象的特征部位的跟踪框包括:所述目标对象的特征部位的跟踪框和所述干扰对象的特征部位的跟踪框,所述图像中对象的特征部位的检测框包括:所述目标对象的特征部位的检测框和所述干扰对象的特征部位的检测框,其中,所述目标对象为所述可移动平台跟踪的对象。
- 根据权利要求49所述的装置,其特征在于,所述处理器,还用于:当所述干扰对象的跟踪框匹配不到所述检测框时,将所述跟踪框从所述更新后的特征部位的跟踪框中删除。
- 根据权利要求49所述的装置,其特征在于,所述处理器,还用于:当所述检测框中的一个或多个匹配不到所述跟踪框时,将所述一个或多个检测框添加到所述更新后的特征部位的跟踪框中。
- 根据权利要求49所述的装置,其特征在于,在所述目标对象的跟踪参数满足预设的第一条件时,所述对象的特征部位的跟踪框为第一特征部位的跟踪框。
- 根据权利要求52所述的装置,其特征在于,所述目标对象的跟踪参数满足预设的第一条件包括:所述目标对象在所述图像中的尺寸占比小于或 等于预设第一占比阈值,和/或,所述目标对象与所述可移动平台的距离大于或等于预设第一距离。
- 根据权利要求52所述的装置,其特征在于,所述第一特征部位为所述对象的人体。
- 根据权利要求49所述的装置,其特征在于,在所述目标对象的跟踪参数满足预设的第二条件时,所述对象的特征部位的跟踪框为第二特征部位的跟踪框。
- 根据权利要求55所述的装置,其特征在于,所述目标对象的跟踪参数满足预设的第二条件包括:所述目标对象在所述图像中的尺寸占比大于或等于预设第一占比阈值,和/或,所述目标对象与所述可移动平台的距离小于或等于预设第一距离。
- 根据权利要求55所述的装置,其特征在于,所述第二特征部位为所述对象的头部,或者头部和肩部。
- 一种可移动平台,包括,拍摄装置,和如权利要求42-57任一项所述的控制装置。
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201880032197.3A CN110651274A (zh) | 2018-01-23 | 2018-01-23 | 可移动平台的控制方法、装置和可移动平台 |
PCT/CN2018/073879 WO2019144296A1 (zh) | 2018-01-23 | 2018-01-23 | 可移动平台的控制方法、装置和可移动平台 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/073879 WO2019144296A1 (zh) | 2018-01-23 | 2018-01-23 | 可移动平台的控制方法、装置和可移动平台 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019144296A1 true WO2019144296A1 (zh) | 2019-08-01 |
Family
ID=67394528
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/073879 WO2019144296A1 (zh) | 2018-01-23 | 2018-01-23 | 可移动平台的控制方法、装置和可移动平台 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110651274A (zh) |
WO (1) | WO2019144296A1 (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112639815A (zh) * | 2020-03-27 | 2021-04-09 | 深圳市大疆创新科技有限公司 | 目标跟踪方法、目标跟踪装置、可移动平台和存储介质 |
CN112753210A (zh) * | 2020-04-26 | 2021-05-04 | 深圳市大疆创新科技有限公司 | 可移动平台及其控制方法、存储介质 |
CN112784680A (zh) * | 2020-12-23 | 2021-05-11 | 中国人民大学 | 一种人流密集场所锁定密集接触者的方法和系统 |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115862144B (zh) * | 2022-12-23 | 2023-06-23 | 杭州晨安科技股份有限公司 | 一种摄像机手势识别方法 |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101212658A (zh) * | 2007-12-21 | 2008-07-02 | 北京中星微电子有限公司 | 一种目标跟踪方法及装置 |
CN101271520A (zh) * | 2008-04-01 | 2008-09-24 | 北京中星微电子有限公司 | 一种确定图像中的特征点位置的方法及装置 |
CN102982557A (zh) * | 2012-11-06 | 2013-03-20 | 桂林电子科技大学 | 基于深度相机的空间手势姿态指令处理方法 |
US20130176430A1 (en) * | 2012-01-06 | 2013-07-11 | Pelco, Inc. | Context aware moving object detection |
CN103559491A (zh) * | 2013-10-11 | 2014-02-05 | 北京邮电大学 | 人体动作捕获及姿态分析系统 |
CN104700088A (zh) * | 2015-03-23 | 2015-06-10 | 南京航空航天大学 | 一种基于单目视觉移动拍摄下的手势轨迹识别方法 |
CN105760832A (zh) * | 2016-02-14 | 2016-07-13 | 武汉理工大学 | 基于Kinect传感器的逃犯识别方法 |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7916895B2 (en) * | 2007-05-07 | 2011-03-29 | Harris Corporation | Systems and methods for improved target tracking for tactical imaging |
CN103198492A (zh) * | 2013-03-28 | 2013-07-10 | 沈阳航空航天大学 | 一种人体运动捕获方法 |
- 2018-01-23: WO PCT/CN2018/073879, patent WO2019144296A1, status: active, Application Filing
- 2018-01-23: CN CN201880032197.3A, patent CN110651274A, status: active, Pending
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101212658A (zh) * | 2007-12-21 | 2008-07-02 | 北京中星微电子有限公司 | 一种目标跟踪方法及装置 |
CN101271520A (zh) * | 2008-04-01 | 2008-09-24 | 北京中星微电子有限公司 | 一种确定图像中的特征点位置的方法及装置 |
US20130176430A1 (en) * | 2012-01-06 | 2013-07-11 | Pelco, Inc. | Context aware moving object detection |
CN102982557A (zh) * | 2012-11-06 | 2013-03-20 | 桂林电子科技大学 | 基于深度相机的空间手势姿态指令处理方法 |
CN103559491A (zh) * | 2013-10-11 | 2014-02-05 | 北京邮电大学 | 人体动作捕获及姿态分析系统 |
CN104700088A (zh) * | 2015-03-23 | 2015-06-10 | 南京航空航天大学 | 一种基于单目视觉移动拍摄下的手势轨迹识别方法 |
CN105760832A (zh) * | 2016-02-14 | 2016-07-13 | 武汉理工大学 | 基于Kinect传感器的逃犯识别方法 |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112639815A (zh) * | 2020-03-27 | 2021-04-09 | 深圳市大疆创新科技有限公司 | 目标跟踪方法、目标跟踪装置、可移动平台和存储介质 |
CN112753210A (zh) * | 2020-04-26 | 2021-05-04 | 深圳市大疆创新科技有限公司 | 可移动平台及其控制方法、存储介质 |
CN112784680A (zh) * | 2020-12-23 | 2021-05-11 | 中国人民大学 | 一种人流密集场所锁定密集接触者的方法和系统 |
CN112784680B (zh) * | 2020-12-23 | 2024-02-02 | 中国人民大学 | 一种人流密集场所锁定密集接触者的方法和系统 |
Also Published As
Publication number | Publication date |
---|---|
CN110651274A (zh) | 2020-01-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108615248B (zh) | 相机姿态追踪过程的重定位方法、装置、设备及存储介质 | |
JP6433149B2 (ja) | 姿勢推定装置、姿勢推定方法およびプログラム | |
CN107990899B (zh) | 一种基于slam的定位方法和系统 | |
WO2019144296A1 (zh) | 可移动平台的控制方法、装置和可移动平台 | |
US10559062B2 (en) | Method for automatic facial impression transformation, recording medium and device for performing the method | |
WO2019228196A1 (zh) | 一种全景视频的目标跟踪方法和全景相机 | |
WO2021135827A1 (zh) | 视线方向确定方法、装置、电子设备及存储介质 | |
US10217221B2 (en) | Place recognition algorithm | |
US11417095B2 (en) | Image recognition method and apparatus, electronic device, and readable storage medium using an update on body extraction parameter and alignment parameter | |
CN111094895B (zh) | 用于在预构建的视觉地图中进行鲁棒自重新定位的系统和方法 | |
US11922658B2 (en) | Pose tracking method, pose tracking device and electronic device | |
JP7272024B2 (ja) | 物体追跡装置、監視システムおよび物体追跡方法 | |
CN110874865A (zh) | 三维骨架生成方法和计算机设备 | |
CN105095853B (zh) | 图像处理装置及图像处理方法 | |
US10861185B2 (en) | Information processing apparatus and method of controlling the same | |
JP2019191981A (ja) | 行動認識装置、モデル構築装置及びプログラム | |
CN108446672A (zh) | 一种基于由粗到细脸部形状估计的人脸对齐方法 | |
JP2014164446A (ja) | 背景モデル構築装置、背景モデル構築方法、およびプログラム | |
JP6922348B2 (ja) | 情報処理装置、方法、及びプログラム | |
JP6276713B2 (ja) | 画像データ処理方法、画像データ処理装置および画像データ処理プログラム | |
JP6305856B2 (ja) | 画像処理装置、画像処理方法、およびプログラム | |
WO2020149149A1 (en) | Information processing apparatus, information processing method, and program | |
WO2022174603A1 (zh) | 一种位姿预测方法、位姿预测装置及机器人 | |
WO2022040994A1 (zh) | 手势识别方法及装置 | |
WO2022110059A1 (zh) | 视频处理、景别识别方法、终端设备和拍摄系统 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18902449; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 18902449; Country of ref document: EP; Kind code of ref document: A1 |