WO2019127192A1

WO2019127192A1 - Image processing method and apparatus

Info

Publication number: WO2019127192A1
Application number: PCT/CN2017/119291
Authority: WO
Inventors: 周游; 朱振宇; 刘洁
Original assignee: 深圳市大疆创新科技有限公司
Priority date: 2017-12-28
Filing date: 2017-12-28
Publication date: 2019-07-04
Also published as: CN109074661A

Abstract

Embodiments of the present application provide an image processing method and apparatus. In a highly dynamic scenario with a human subject, the present application fully considers the features of limbs or joints of the human subject, thereby improving computation accuracy of a depth map of the human subject. The method comprises: determining a pointing vector pointing from a target point on a target human subject in an image to at least one joint, and determining a positional relationship between the target point and at least one pixel point; and calculating a parallax of the target point according to the pointing vector, the positional relationship, and a parallax of the at least one pixel point, wherein a penalty coefficient of a global energy function of an SGM algorithm is adjusted according to the pointing vector and the positional relationship, and the parallax of the target point is calculated on the basis of the parallax of the at least one pixel point by using the global energy function with the adjusted penalty coefficient.

Description

Image processing method and device

Copyright statement

The disclosure of this patent document contains material that is subject to copyright protection. This copyright is the property of the copyright holder. The copyright owner has no objection to the reproduction of the patent document or the patent disclosure in the official records and files of the Patent and Trademark Office.

Technical field

Embodiments of the present application relate to the field of image processing, and more particularly, to an image processing method and apparatus.

Background technique

Human beings are entering the information age, and computers are increasingly entering almost all areas. As an important area of intelligent computing, computer vision has been greatly developed and applied. Computer vision relies on the imaging system instead of the visual organ as an input sensitive means. The most common is the camera, which can form a basic vision system by a dual camera.

The binocular camera system can capture two photos at the same moment and at different angles through two cameras, and then through the difference between the two photos and the position and angle relationship between the two cameras, the triangle relationship can be used to calculate the scene and The distance map of the camera, that is, the depth map can be obtained. In the final analysis, the binocular camera system obtains the depth information of the scene by the difference of two photos at different angles at the same time.

However, for high dynamic scenes, there will be some cases where the depth map is invalid. Even if there is a dynamic adjustment exposure strategy for the foreground, there are still some cases that do not work well.

Summary of the invention

The embodiment of the present application provides an image processing method and device, which can fully consider the features of the limbs or joints of the living body in a high dynamic scene with a living body, so that the depth map calculation of the living body is more accurate.

In one aspect, an image processing method is provided, including: determining a pointing vector of a target point on a target living body in an image to at least one joint, and determining a positional relationship between the target point and at least one pixel point; a pointing vector, and the positional relationship, adjusting a penalty coefficient of a global energy function of a Semi-Global Matching (SGM) algorithm; based on a disparity of the at least one pixel, using the penalty coefficient adjusted The global energy function calculates the parallax of the target point.

In another aspect, an image processing apparatus is provided, including a determining unit and a calculating unit; wherein the determining unit is configured to: determine a pointing vector of a target point on a target living body in the image to at least one joint, and determine a positional relationship between the target point and at least one pixel point; the calculating unit is configured to: adjust a penalty coefficient of a global energy function of the SGM algorithm according to the pointing vector, and the positional relationship; based on the at least one pixel point The parallax of the target point is calculated by using the global energy function adjusted by the penalty coefficient.

In another aspect, an image processing apparatus is provided, comprising a memory and a processor, the memory storing code, the processor being capable of calling code in the memory to perform a method of determining a target point on a target living body in the image to at least a pointing vector at a joint, and determining a positional relationship of the target point with at least one pixel; adjusting a penalty coefficient of a global energy function of the SGM algorithm according to the pointing vector, and the positional relationship; based on the at least one The parallax of the pixel is calculated by using the global energy function adjusted by the penalty coefficient to calculate the parallax of the target point.

In another aspect, a computer storage medium is provided, the medium storing code for determining a pointing vector of a target point on a target living body in the image to at least one joint, and determining the target point and a positional relationship of at least one pixel; adjusting a penalty coefficient of a global energy function of the SGM algorithm according to the pointing vector, and the positional relationship; and adjusting a penalty coefficient based on the parallax of the at least one pixel The global energy function calculates the parallax of the target point.

In another aspect, a computer program product is provided, the computer program product comprising code for determining a pointing vector of a target point on a target living body in an image to at least one joint, and determining the target point a positional relationship with at least one pixel; adjusting a penalty coefficient of a global energy function of the SGM algorithm according to the pointing vector, and the positional relationship; and adjusting the penalty coefficient based on the parallax of the at least one pixel The global energy function calculates the parallax of the target point.

Therefore, in the embodiment of the present application, a pointing vector of a target point on a target living body in an image to at least one joint is determined, and a positional relationship between the target point and at least one pixel point is determined; according to the pointing vector, And the positional relationship, adjusting a penalty coefficient of the global energy function of the SGM algorithm; calculating a parallax of the target point by using a global energy function adjusted by the penalty coefficient based on the parallax of the at least one pixel point, In the high dynamic scene with living body, the penalty coefficient in the semi-global matching algorithm is adjusted by fully considering the features of the limb or joint of the living body, and the fixed penalty coefficient is avoided, so that the depth map calculation of the living body is more accurate.

DRAWINGS

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings to be used in the embodiments or the prior art description will be briefly described below. Obviously, the drawings in the following description are only some of the present application. For the embodiments, those skilled in the art can obtain other drawings according to the drawings without any creative work.

FIG. 1 is a schematic diagram of the disconnection of depth information in a high dynamic scenario.

2 is a schematic diagram of an image processing method according to an embodiment of the present application.

FIG. 3 is a schematic diagram of segmenting a living body using a PAF algorithm according to an embodiment of the present application.

Figure 4 is a schematic view of the ground.

Figure 5 is a schematic illustration of a limb vector field.

Figure 6 is a schematic illustration of a limb vector field.

Figure 7 is an image of a thermography.

FIG. 8 is a schematic diagram of an image processing apparatus according to an embodiment of the present application.

FIG. 9 is a schematic diagram of an image processing apparatus according to an embodiment of the present application.

FIG. 10 is a schematic view of a drone according to an embodiment of the present application.

Detailed ways

The technical solutions in the embodiments of the present application are described in conjunction with the accompanying drawings in the embodiments of the present application. It is obvious that the described embodiments are a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without departing from the inventive scope are the scope of the present application.

It should be noted that, when a component is "fixedly connected" or "connected" to another component in the embodiment of the present application, or when one component is "fixed" to another component, it may be directly on another component, or There can be a centered component.

Unless otherwise indicated, all technical and scientific terms used in the embodiments of the present application have the same meaning The terminology used in the present application is for the purpose of describing particular embodiments and is not intended to limit the scope of the application. The term "and/or" used in this application includes any and all combinations of one or more of the associated listed.

For living organisms (for example, people), due to the fact that in a highly dynamic environment, when the depth information is acquired, some depth maps are invalid. Even with the dynamic adjustment exposure strategy for the foreground, there are still some cases. The depth map is invalid. For example, when the scene is switched, the dynamic exposure strategy still needs convergence time, which will cause the depth map to be invalid. For example, as shown in Figure 1 for the human arm part, there is more invalid depth information, resulting in 3D depth. There are cases of limb disconnection on the map. Among them, the left side of Figure 1 is the photograph taken, and the right side is the depth information.

For the UAV and the language, the route planning obstacle avoidance needs to use the 3D depth map, that is, the quality of the depth map directly affects the success or failure of the obstacle avoidance.

Therefore, the embodiments of the present application provide the following solutions, and better depth information can be obtained.

It should be understood that the depth information of the embodiment of the present application may be used for the obstacle avoidance of the UAV, and may also be used for other scenarios, which is not specifically limited in this embodiment of the present application.

It should be understood that the embodiment of the present application may use the image captured by the binocular camera to calculate the depth information, and may also use the image captured by the monocular camera to calculate the depth information, which is not specifically limited in the embodiment of the present application.

The embodiments of the present application can be used for aerial vehicles or other vehicles with multiple cameras, such as unmanned cars, auto-flying drones, VR/AR glasses, dual-camera mobile phones, smart cars with visual systems, etc. device.

FIG. 2 is a schematic flowchart of an image processing method 100 according to an embodiment of the present application. The method 100 includes at least a portion of the following.

In 110, a pointing vector of a target point on the target living body in the image to at least one joint is determined, and a positional relationship of the target point with the at least one pixel point is determined.

Optionally, each limb mentioned in the embodiment of the present application may be divided by a joint. For example, the limb mentioned in the embodiment of the present application may include a head, a hand, an upper arm, a lower arm, a thigh, and a lower leg. Department and so on.

Optionally, the living body mentioned in the embodiment of the present application may be a person, and of course, other living bodies, such as cats, dogs, elephants, and birds.

Optionally, in the embodiment of the present application, the target living body may be segmented from the image in advance. Further, a pointing vector of the target point on the target living body to the at least one joint is determined, and a positional relationship between the target point and the at least one pixel point is determined. Wherein, the target point refers to a target pixel point, and the at least one pixel point may be an adjacent pixel point of the target pixel point.

Specifically, on the image, the limb joint of the living body can be determined; according to the vector field of the limb joint of the living body, the connection relationship of the limb joint of the living body is determined; according to the connection relationship of the limb joint of the living body, from the image Split the target living body.

Specifically, the living body can be segmented by using Part Affinity Fields (PAFs).

Specifically, a in FIG. 3 may be configured as a Part Confidence Maps as shown by b in FIG. 3, and further configured as a limb-associated field as shown by c in FIG. 3 ( Part Affinity Fields (PAF), so that the human body can be segmented according to the limb-associated field shown by c in FIG. 3, as shown by d in FIG.

In the process of constructing a joint articulation map of the limb as shown by b in FIG. 3, the joint portion of the human body, for example, the wrist joint, the elbow joint, the shoulder joint, etc., can be found by the Convolutional Neural Network (CNN). . Among them, the body joint confidence map can be obtained for the human body k

Value at position p

It can be constructed as follows:

Where x _j,k is the groundtruth position of the limb j of the human body k in the image, and σ defines an extension of the peak, wherein the peak corresponds to each visible limb of each human body.

The limb-associated field refers to the pointing relationship between the limb joints, for example, the shoulder joint points to the elbow joint, and the elbow joint points to the wrist joint. Wherein, as shown in FIG. 4, x _j1,k and x _j2,k are the true positions of the joints j ₁ and j ₂ of the limb c connecting the human body k, and the limb-associated vector field of the point p can be defined according to the following formula 2:

It can be seen from the above formula 2 that if the point p is on the limb c, then

The value is the unit vector of the point from j1 to j2, if not on limb c, then

The value is 0. among them,

The set of points on the limb c can be defined as points within the line segment within the distance threshold, which satisfy the following Equation 3 and Equation 4:

Where the limb width σ _l is the pixel distance,

Is the length of the limb, and v _⊥ is a vector perpendicular to v.

Therefore, after determining the vector field of the limb joint of the living body according to the above-described scheme or the similar scheme, the connection relationship of the limb joint of the living body can be determined; and the connection relationship of the limb joint of the living body is segmented from the image according to the connection relationship of the limb joint of the living body The target living body.

Optionally, in the embodiment of the present application, the pointing vector of the target point to the at least one joint may be determined according to the pointing relationship of the limb joint of the target living body.

Specifically, as shown in FIG. 4, after the relationship from the elbow joint to the wrist is determined, a pointing vector at any point in the lower arm to the joint can be obtained.

Optionally, in the embodiment of the present application, before the target living body is segmented from the image according to the connection relationship of the limb joint of the living body, the target is initially segmented from the image by using thermal imaging. Life form. Among them, FIG. 7 shows a human body image acquired by means of thermal imaging. Of course, in the embodiment of the present application, the target living body may be initially segmented without using thermal imaging.

Alternatively, the living body may be segmented based on a minimum spanning tree graph segmentation method. Specifically, the following operations may be included.

Step 1: First convert the image into a graph, and get a graph G=(V, E). For graph G, there are n vertex v and m edges e.

The set of divided regions obtained by the following steps S = (C ₁ , ..., C _r ), where C ₁ , ..., _Cr are a subset of the vertices.

Step 2: Arrange the weights w of the edges E in a non-decreasing manner to obtain a set π=(o ₁ , . . . , o _m ).

Step 3: Starting from the block area S ⁰ , in the set S ⁰ , each vertex is a self-forming subset (each pixel is a separate area).

Step 4: Repeat step 4 for S ^q , q=1,...,m.

Step 5: Calculate S ^q by S ^q-1 as described below. Here, v _i and v _{j are used to} represent the two vertices of the q- _th edge connection, denoted as o _q = (v _i , v _j ). If v _i and v _j are independent and unconnected vertices in S ^q-1 , and the edge weight w(o _q ) of the connection is less than a certain threshold

Then join the subset where v _i and v _j are merged, otherwise it remains unchanged. Specifically, the subset of regions containing vi in S ^q-1 is recorded as

And the record containing v _j is

in case

And

Merge subset

versus

A new set S ^{q is obtained} , otherwise S ^q =S ^q-1 remains unchanged.

among them,

Where int is defined as the inner difference, that is, the largest weight in the minimum small spanning tree MST composed of the subset C ₁ , and the weight is defined as w(e)=w(v _i , v _j )=|I(p _i ) -I(p _j )|.

Step 6: After traversing the m connection edge lines, return the final result S ^m , which is S.

In 120, according to the pointing vector and the positional relationship, the penalty coefficient of the global energy function of the Semi-Global Matching (SGM) algorithm is adjusted.

In 130, based on the parallax of the at least one pixel, the global energy function adjusted by the penalty coefficient is used to calculate the parallax of the target point.

It should be understood that, in the embodiment of the present application, in addition to using the SGM algorithm to calculate the parallax of the target point, other algorithms may be used to calculate the parallax of the target point. In the following example, the SGM is used as an example to illustrate how to use the SGM to calculate the target. The parallax of the point.

In order to facilitate a clearer understanding of the present application, the SGM algorithm will be described below.

A disparity map can be formed by selecting the disparity of each pixel, and a global energy function related to the disparity map can be set to minimize the energy function to achieve optimal solution for each pixel. The purpose of parallax. Wherein, the energy function can be as shown in the following formula 6:

Where D refers to the disparity map, E(D) is the energy function corresponding to the disparity map; p, q represents a certain pixel in the image; N _p refers to the adjacent pixel of the pixel p; C(p, D _p ) Refers to the cost of the pixel when the current pixel disparity is D _p ; P ₁ is a penalty coefficient, which is applicable to those pixels in which the disparity value of the pixel adjacent to the pixel p differs by 1 from the disparity value of p; P ₂ It is a penalty coefficient, which is applicable to those pixels in which the disparity values of pixels adjacent to the pixel p differ by more than one from the disparity values of p.

Using the above function formula 6 to find the optimal solution time in the two-dimensional image is large, so the problem is approximately decomposed into a plurality of one-dimensional problems, that is, a linear problem. And every one-dimensional problem can be solved with dynamic programming. Since 1 pixel usually has 8 adjacent pixels (of course, there may be other numbers of adjacent pixels), it is generally decomposed into 8 one-dimensional problems.

For each one-dimensional solution, Equation 7 below can be used:

Where r refers to a direction pointing to the current pixel p, which can be understood here as an adjacent pixel point of the pixel p in the direction.

L _r (p,d) represents the minimum cost value when the parallax of the current pixel point p is d along the current direction.

Wherein, the minimum value can be selected from the minimum of 4 possible candidate values:

The first possible value is the smallest value when the current pixel is equal to the previous pixel's disparity value.

The second and third types may be the smallest cost + penalty coefficient P ₁ when the current pixel is 1 (more than 1 or less) from the previous pixel disparity value.

The fourth possibility is that the difference between the current pixel point and the previous pixel disparity value is greater than 1, and its minimum cost value + penalty coefficient P ₂ .

In addition, the generation value of the current pixel point can also be subtracted from the minimum value of the previous pixel point when taking different disparity values. This is because L _r (p,d) will grow with the right shift of the current pixel. To prevent the value from overflowing, you can keep it at a small value.

Among them, C(p,d) can be calculated by the following formulas 8 and 9:

C(p,d)=min(d(p,pd,I _L ,I _R ),d(pd,p,I _R ,I _L ))

When the cost value of each direction is calculated separately, the multiple values, for example, the cumulative value of the value of the eight directions, and the disparity value with the smallest accumulated value may be selected as the final disparity value of the pixel, for example, Accumulate by Equation 10 below:

S(p,d)=∑L _r (p,d) Equation 10

It can be seen from the above Equation 7 that if the target pixel point and the adjacent pixel point are expected to adopt the same disparity, the penalty coefficient P ₁ and the penalty coefficient P ₂ can be set larger, so that the target pixel point and the adjacent pixel can be increased. The pixel points adopt the same probability of parallax. If the difference between the target pixel point and the adjacent pixel point is expected to be large, the penalty coefficient P ₂ can be set smaller, and the penalty coefficient P ₁ can be set larger. The probability that the large target pixel points and the adjacent pixel points adopt a larger difference of the parallax, if the difference between the target pixel point and the adjacent pixel point is expected to be small, the penalty coefficient P ₁ can be set smaller, and the penalty will be The coefficient P _{2 is} set larger, which increases the probability that the difference between the target pixel and the adjacent pixel takes a smaller parallax.

Here, the ground shown in FIG. 5 can be used as an example for description. For the ground, in the 2D image, the ground is in the up and down direction, the depth changes sequentially, and in the left and right direction, the depth is basically the same.

Therefore, in FIG. 5, the time corresponding to vertical and horizontal four directions (path), and since the depth of the left-right direction is the same, so the penalty parameter P ₁ in the horizontal direction and P ₂ is large, the algorithm will tend in the horizontal direction The same parallax is selected on the top; in the up and down direction, the smaller penalty parameters P ₁ and P _{2 are given} , and the algorithm tends to select different parallaxes in the up and down direction.

Optionally, the penalty coefficient is adjusted according to an angle between the pointing vector and a vector corresponding to the positional relationship.

In one implementation, when the absolute value of the disparity value is greater than or equal to the predetermined disparity, the modulus of the difference between the angle and the 90 degree is positively correlated with the penalty coefficient.

For example, the image shown in Fig. 6 is a bevel in the direction of extension of the arm, that is, in the direction from the elbow joint to the wrist joint, the distance from the lens is gradually changed, and the depth is changed, analogous to Fig. 5 In the up and down direction of the ground portion, given the larger penalty parameters P ₁ and P ₂ , the algorithm tends to choose different parallaxes in the direction of the elbow joint to the wrist joint. The direction perpendicular to the arm, perpendicular to the elbow joint to the wrist joint, is basically at a distance, that is, the depth is substantially the same, analogous to the left and right direction of the ground in Figure 5, dynamically increasing the penalty parameters P ₁ and P ₂ The algorithm tends to choose similar or even the same parallax.

Optionally, determining, according to the limb edge of the target living body, the at least one pixel point is on the target living body.

Specifically, when the adjacent pixel points are located on the living body, it is more meaningful to determine the penalty coefficient by using the positional relationship between the adjacent pixel points and the target point, and the target point to the pointing vector of the at least one joint, so that the target can be determined according to the target When the edge of the limb of the living body is determined to be the target living body, the method 100 of the embodiment of the present application can be used to calculate the parallax of the target point.

Of course, if the target point is a pixel point at the edge of the limb, and the pixel used to calculate the depth information is a pixel point other than the target living body, a smaller penalty P ₂ can be set, and a larger penalty coefficient is further set. P ₁ , thus allowing the calculated parallax jump.

Optionally, in the embodiment of the present application, only the target living body segmented by the PAF method in the embodiment of the present application may be referred to, and the pixel point of the edge of the target living body is determined, and the positional relationship between the pixel based on the edge and the adjacent pixel point is determined. , adjusting the penalty coefficient, regardless of the pointing vector of the target point to at least one joint.

Optionally, in the embodiment of the present application, the depth information of the target living body may be calculated according to the disparity of the at least one target point.

Specifically, after calculating the parallax of each pixel of the target living body, the depth information of the living body can be calculated. Among them, the parallax can be inversely proportional to the depth.

Alternatively, the depth can be calculated by the following formula 11:

Where d is the depth, b is the distance between the left and right cameras, f is the focal length of the camera, and d _p is the disparity.

It can be seen from the above formula 1 that since b and f are physical properties and generally remain unchanged, d is inversely proportional to d _p . For objects at close range, the depth is small, then the parallax is larger, while for distant objects, the depth is larger and the corresponding parallax is smaller.

Optionally, in the embodiment of the present application, the first speed may be determined according to the depth of the target living body, where the direction of the first speed is a direction from the target living body to the unmanned driving device; according to the first speed And a second speed determining a control speed for controlling the unmanned device, the second speed being a speed input by the controller; and controlling the flight of the unmanned device according to the control speed. Optionally, the magnitude of the first velocity is inversely proportional to the depth. The unmanned device may be a drone or a driverless car. The following description will be made by taking an unmanned aerial vehicle as an example.

Specifically, when the drone flies closer to the living body, the Repulsive Force Field is used to "bounce" the drone to achieve the obstacle bypass.

Here, the repulsion field can be constructed by referring to the following universal gravitation formula 12, and the specific expression can refer to Equation 13.

Here, the life mass m _obstacle can take a larger constant value, where m _drone is the mass of the _drone , and G is also a constant value, so the constant k=G·m _obstacle can be defined, thereby obtaining the next Equation 14:

Where D _x is the depth information of the living body, and the depth of each pixel of the living body can be averaged.

Then, the speed of the repulsive field planning can be obtained by the constant acceleration formula in Equation 15 below.

The speed corresponding to the repulsive field points away from the direction of the living body, and the user-controlled drone has a speed. The two speeds are superimposed by the vector to generate a new speed. As the final planned speed, as the speed loop command of the control system, Eventually, the obstacles will be bypassed.

Moreover, further, in the embodiment of the present application, a complete strategy is proposed for the low-altitude flying drone, which is used for detecting the human body and other animals, and is used for optimizing the calculation of the depth map, thereby obtaining more accurate. Obstacle observation can guarantee the safety of people, and can better realize the planning of bypass for unmanned obstacle avoidance routes.

FIG. 8 is a schematic block diagram of an image processing apparatus 200 according to an embodiment of the present application. As shown in FIG. 8, the device 200 includes a determining unit 210 and a calculating unit 220;

The determining unit 210 is configured to: determine a pointing vector of the target point on the target living body in the image to the at least one joint, and determine a positional relationship between the target point and the at least one pixel point;

The calculating unit 220 is configured to: adjust a penalty coefficient of a global energy function of the semi-global matching SGM algorithm according to the pointing vector and the position relationship;

A parallax of the target point is calculated based on a parallax of the at least one pixel point using a global energy function adjusted by the penalty coefficient.

Optionally, the calculating unit 220 is further configured to:

The penalty coefficient is adjusted according to an angle between the pointing vector and a vector corresponding to the positional relationship.

Optionally, when the absolute value of the disparity value is greater than or equal to a predetermined disparity, a modulus of the difference between the angle and 90 degrees is positively correlated with the penalty coefficient.

Optionally, as shown in FIG. 8, the device 200 further includes a first dividing unit 230, configured to:

Determining a limb joint of a living body on the image;

Determining a connection relationship of the limb joints of the living body according to a vector field of the limb joint of the living body;

The target living body is segmented from the image according to a connection relationship of the limb joints of the living body.

Optionally, the determining unit 210 is further configured to:

Determining a pointing vector of the target point to at least one joint according to a pointing relationship of a limb joint of the target living body.

Optionally, the determining unit 210 is further configured to:

Determining that the at least one pixel point is on the target living body according to a limb edge of the target living body.

Optionally, as shown in FIG. 8, the device 200 further includes a second dividing unit 240, configured to:

The target living body is initially segmented from the image by means of thermal imaging.

Optionally, as shown in FIG. 8, the device 200 further includes a control unit 250, configured to:

Calculating a depth of the target living body according to a disparity of the at least one target point;

Determining a first speed according to a depth of the target living body, the direction of the first speed being a direction from the target living body to the unmanned device;

Determining, according to the first speed and the second speed, a control speed for controlling the unmanned device, wherein the second speed is a speed input by the controller;

The unmanned device is controlled according to the control speed.

Optionally, the magnitude of the first velocity is inversely proportional to the depth.

It should be understood that the image processing device may implement corresponding operations in the method 100, and for brevity, no further details are provided herein.

FIG. 9 is a schematic block diagram of an image processing apparatus 400 according to an embodiment of the present application.

Alternatively, the image processing device 400 may include a plurality of different components that may be integrated circuits (ICs), or portions of integrated circuits, discrete electronic devices, or other suitable for use in a circuit board (such as a motherboard). Modules, or additional boards, may also be incorporated as part of a computer system.

Alternatively, the image processing device can include a processor 410 and a storage medium 420 coupled to the processor 410.

Processor 410 may include one or more general purpose processors, such as a central processing unit (CPU), or a processing device or the like. Specifically, the processor 410 may be a complex instruction set computing (CISC) microprocessor, a very long instruction word (VLIW) microprocessor, and implements micro-processing of multiple instruction set combinations. Device. The processor may also be one or more dedicated processors, such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), and a digital signal processor. , DSP).

Processor 410 can be in communication with storage medium 420. The storage medium 420 may be a magnetic disk, an optical disk, a read only memory (ROM), a flash memory, or a phase change memory. The storage medium 420 can store instructions stored by the processor and/or can cache some of the information stored from the external storage device.

Optionally, in addition to the processor 420 and the storage medium 420, the image processing apparatus may include a display controller and/or display device unit 430, a transceiver 440, a video input output unit 450, an audio input output unit 460, and other input and output units 470. . These components included in image processing device 400 may be interconnected by a bus or internal connection.

Alternatively, the transceiver 440 can be a wired transceiver or a wireless transceiver, such as a WIFI transceiver, a satellite transceiver, a Bluetooth transceiver, a wireless cellular telephone transceiver, or combinations thereof.

Optionally, the video input and output unit 450 may include an image processing subsystem such as a camera, including a photo sensor, a charge coupled device (CCD) or a complementary metal-oxide semiconductor (CMOS) light. Sensor for use in shooting functions.

Alternatively, the audio input and output unit 460 may include a speaker, a microphone, an earpiece, and the like.

Alternatively, other input and output devices 470 may include a storage device, a universal serial bus (USB) port, a serial port, a parallel port, a printer, a network interface, and the like.

Optionally, the image processing device 400 can perform the operations shown in the method 100. For brevity, details are not described herein again.

Alternatively, the image processing device 400 or 400 may be located in a mobile device. The mobile device can be moved in any suitable environment, for example, in the air (eg, a fixed-wing aircraft, a rotorcraft, or an aircraft with neither a fixed wing nor a rotor), in water (eg, a ship or submarine), on land. (for example, a car or train), space (for example, a space plane, satellite or detector), and any combination of the above. The mobile device can be an unmanned car, an auto-flying drone, a VR/AR glasses, a dual-camera mobile phone, a smart car with a vision system, and the like.

FIG. 10 is a schematic block diagram of a removable device 500 in accordance with an embodiment of the present application. As shown in FIG. 10, the mobile device 500 includes a carrier 510 and a load 520. The description of the mobile device in Figure 14 as a drone is for illustrative purposes only. The load 520 may not be connected to the mobile device via the carrier 510. The removable device 500 can also include a power system 530, a sensing system 540 and a communication system 550 and an image processing device 562 and a photographing system 564.

Power system 530 can include an electronic governor (referred to as an ESC), one or more propellers, and one or more electric machines corresponding to one or more propellers. The motor and the propeller are disposed on the corresponding arm; the electronic governor is configured to receive a driving signal generated by the flight controller, and provide a driving current to the motor according to the driving signal to control the rotation speed and/or steering of the motor. The motor is used to drive the propeller to rotate to power the UAV's flight, which enables the UAV to achieve one or more degrees of freedom of motion. In certain embodiments, the UAV can be rotated about one or more axes of rotation. For example, the above-described rotating shaft may include a roll axis, a pan axis, and a pitch axis. It should be understood that the motor can be a DC motor or an AC motor. In addition, the motor can be a brushless motor or a brush motor.

The sensing system 540 is used to measure the attitude information of the UAV, that is, the position information and state information of the UAV in space, for example, three-dimensional position, three-dimensional angle, three-dimensional speed, three-dimensional acceleration, and three-dimensional angular velocity. The sensing system may include, for example, a gyroscope, an electronic compass, an Inertial Measurement Unit ("IMU"), a vision sensor, a Global Positioning System (GPS), and a barometer. At least one of them. The flight controller is used to control the flight of the UAV, for example, the UAV flight can be controlled based on the attitude information measured by the sensing system. It should be understood that the flight controller may control the UAV in accordance with pre-programmed program instructions, or may control the UAV in response to one or more control commands from the operating device.

Communication system 550 is capable of communicating with wireless terminal 590 via a terminal device 580 having communication system 570. Communication system 550 and communication system 570 can include a plurality of transmitters, receivers, and/or transceivers for wireless communication. The wireless communication herein may be one-way communication, for example, only the mobile device 500 may transmit data to the terminal device 580. Alternatively, the wireless communication may be two-way communication, and the data may be transmitted from the mobile device 500 to the terminal device 580 or may be transmitted by the terminal device 580 to the mobile device 500.

Alternatively, terminal device 580 can provide control data for one or more of mobile device 500, carrier 510, and load 520, and can receive information transmitted by mobile device 500, carrier 510, and load 520. The control data provided by terminal device 580 can be used to control the status of one or more of mobile device 500, carrier 510, and load 520. Optionally, a carrier 510 and load 520 include a communication module for communicating with the terminal device 580.

It can be understood that the image processing device 562 included in the mobile device shown in FIG. 10 can perform the method 100, which is not described herein for brevity.

The foregoing is only a specific embodiment of the present application, but the scope of protection of the present application is not limited thereto, and any person skilled in the art can easily think of changes or substitutions within the technical scope disclosed in the present application. It should be covered by the scope of protection of this application. Therefore, the scope of protection of the present application should be determined by the scope of the claims.

The above is only a specific embodiment of the present application, but the scope of protection of the present application is not limited thereto, and any person skilled in the art can easily think of changes or substitutions within the technical scope disclosed in the present application, and should cover Within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of protection of the claims.

Claims

An image processing method, comprising:

Determining a pointing vector of the target point on the target living body in the image to the at least one joint, and determining a positional relationship of the target point and the at least one pixel point;

Adjusting a penalty coefficient of a global energy function of the semi-global matching SGM algorithm according to the pointing vector and the positional relationship;

A parallax of the target point is calculated based on a parallax of the at least one pixel point using a global energy function adjusted by the penalty coefficient.
The method according to claim 1, wherein the penalty coefficient of the global energy function of the semi-global matching SGM algorithm is adjusted according to the pointing vector and the positional relationship, including:

The penalty coefficient is adjusted according to an angle between the pointing vector and a vector corresponding to the positional relationship.
The method according to claim 2, wherein, when the absolute value of the disparity value is greater than or equal to a predetermined disparity, a modulus of the difference between the angle and 90 degrees is positively correlated with the penalty coefficient.
The method according to any one of claims 1 to 3, wherein before the determining a target point on the target living body in the image to the pointing vector at the at least one joint, the method further comprises:

Determining a limb joint of a living body on the image;

Determining a connection relationship of the limb joints of the living body according to a vector field of the limb joint of the living body;

The target living body is segmented from the image according to a connection relationship of the limb joints of the living body.
The method according to claim 4, wherein the determining a pointing vector of the target point on the target living body in the image to the at least one joint comprises:

Determining a pointing vector of the target point to at least one joint according to a pointing relationship of a limb joint of the target living body.
The method according to claim 4 or 5, wherein before the determining the positional relationship between the target point and the at least one pixel, the method further comprises:

Determining that the at least one pixel point is on the target living body based on a limb edge of the target living body.
The method according to any one of claims 4 to 6, wherein said said living body is segmented from said image based on said connection relationship of limb joints of said living body The method also includes:

The target living body is initially segmented from the image by means of thermal imaging.
The method according to any one of claims 1 to 7, wherein the method further comprises:

Calculating a depth of the target living body according to a disparity of the at least one target point;

Determining a first speed according to a depth of the target living body, the direction of the first speed being a direction from the target living body to the unmanned device;

Determining, according to the first speed and the second speed, a control speed for controlling the unmanned device, wherein the second speed is a speed input by the controller;

The unmanned device is controlled according to the control speed.
The method of claim 8 wherein the magnitude of the first velocity is inversely proportional to the depth.
An image processing device, comprising: a determining unit and a calculating unit; wherein

The determining unit is configured to: determine a pointing vector of the target point on the target living body in the image to the at least one joint, and determine a positional relationship between the target point and the at least one pixel point;

The calculating unit is configured to: adjust a penalty coefficient of a global energy function of the semi-global matching SGM algorithm according to the pointing vector and the positional relationship;

A parallax of the target point is calculated based on a parallax of the at least one pixel point using a global energy function adjusted by the penalty coefficient.
The device according to claim 10, wherein the calculating unit is further configured to:

The penalty coefficient is adjusted according to an angle between the pointing vector and a vector corresponding to the positional relationship.
The apparatus according to claim 11, wherein a modulus of a difference between the included angle and 90 degrees is positively correlated with the penalty coefficient when an absolute value of the parallax difference value is greater than or equal to a predetermined parallax.
The device according to any one of claims 10 to 12, further comprising a first dividing unit for:

Determining a limb joint of a living body on the image;

Determining a connection relationship of the limb joints of the living body according to a vector field of the limb joint of the living body;

The target living body is segmented from the image according to a connection relationship of the limb joints of the living body.
The device according to claim 13, wherein the determining unit is further configured to:

Determining a pointing vector of the target point to at least one joint according to a pointing relationship of a limb joint of the target living body.
The device according to claim 13 or 14, wherein the determining unit is further configured to:

Determining that the at least one pixel point is on the target living body according to a limb edge of the target living body.
The device according to any one of claims 13 to 15, further comprising a second dividing unit for:

The target living body is initially segmented from the image by means of thermal imaging.
The device according to any one of claims 10 to 16, further comprising a control unit for:

Calculating a depth of the target living body according to a disparity of the at least one target point;

Determining a first speed according to a depth of the target living body, the direction of the first speed being a direction from the target living body to the unmanned device;

Determining, according to the first speed and the second speed, a control speed for controlling the unmanned device, wherein the second speed is a speed input by the controller;

The unmanned device is controlled according to the control speed.
The apparatus of claim 17 wherein the magnitude of the first velocity is inversely proportional to the depth.