WO2020237611A1 - Image processing method and apparatus, control terminal and mobile device


Info

Publication number
WO2020237611A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
target object
depth information
target
area
Prior art date
Application number
PCT/CN2019/089425
Other languages
French (fr)
Chinese (zh)
Inventor
蔡剑钊
赵峰
周游
Original Assignee
深圳市大疆创新科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to CN201980008862.XA (CN111602139A)
Priority to PCT/CN2019/089425
Publication of WO2020237611A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/50 Depth or shape recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Definitions

  • The present invention belongs to the field of image processing technology, and in particular relates to an image processing method and device, a control terminal, and a movable device.
  • Computer vision technology replaces human visual organs with imaging systems to achieve tracking and positioning of target objects.
  • In existing schemes, a depth information map is drawn to represent the depth information of the target.
  • Solution 1: referring to Figure 1, feature detection is performed on an image 1, acquired by the imaging system, that includes the target object 2; a feature frame 3 containing the target object 2 and part of the background picture 4 is delimited, and the depth information of the target object 2 is calculated from all the pixels in the feature frame 3 so as to draw the depth information map of the target object 2.
  • Solution 2: an image semantic segmentation algorithm or a semantic parsing algorithm is applied directly to recognize the target object 2 in the image 1, and the depth information map of the target object 2 is drawn from the result.
  • However, in Solution 1, drawing the depth information map from the feature frame 3 introduces a great deal of useless information, such as the background picture 4 and the depth information of unimportant parts of the target object 2. As a result, the final depth information map cannot accurately express the target object 2, and the tracking and positioning accuracy for the target object 2 is poor.
  • In Solution 2, the algorithm is applied to the whole of image 1, which requires more computing resources and incurs higher processing costs.
  • In view of this, the present invention provides an image processing method and device, a control terminal, and a movable device, to solve the prior-art problems that determining the three-dimensional physical information of an object requires large computing resources, resulting in high processing costs, and that the tracking and positioning accuracy for the object is poor.
  • the present invention is implemented as follows:
  • In a first aspect, an embodiment of the present invention provides an image processing method, which may include: acquiring a target image including a target object; determining a target area in the target image, where at least the main body of the target object is located in the target area; determining, in the target area, the subject feature area of the target object; determining the depth information of the target object according to the initial depth information of the subject feature area; and determining the three-dimensional physical information of the target object according to the depth information.
  • In another aspect, an embodiment of the present invention provides an image processing device, and the image processing device may include:
  • a receiver, configured to perform: acquiring a target image including a target object;
  • a processor, configured to execute: determining a target area in the target image, where at least the main body of the target object is located in the target area; determining, in the target area, the subject feature area of the target object; determining the depth information of the target object according to the initial depth information of the subject feature area; and determining the three-dimensional physical information of the target object according to the depth information.
  • In another aspect, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed by a processor, the steps of the image processing method described above are implemented.
  • In another aspect, a control terminal is provided, which includes the image processing device, a transmitting device, and a receiving device. The transmitting device sends a shooting instruction to a movable device, the receiving device receives the image captured by the movable device, and the image processing device processes the image.
  • In another aspect, a movable device is provided, including a photographing device and further comprising an image processing device; the image processing device receives the image captured by the photographing device and performs image processing.
  • The present invention acquires a target image including the target object; determines a target area in the target image, in which at least the main body of the target object is located; determines, in the target area, the subject feature area of the target object; determines the depth information of the target object according to the initial depth information of the subject feature area; and determines the three-dimensional physical information of the target object according to the depth information.
  • On the one hand, the present invention removes the interference of the background, of occlusions, and of the non-subject parts of the target object in the target image, which reduces the probability of introducing useless information into the depth calculation and improves the accuracy of the three-dimensional physical information.
  • On the other hand, the present invention scans and processes only the subject feature area to obtain the three-dimensional physical information of the corresponding target object; compared with scanning and processing the entire target image directly, the amount of calculation is reduced and processing efficiency is improved.
  • FIG. 1 is a flowchart of steps of an image processing method provided by an embodiment of the present invention
  • FIG. 2 is a schematic diagram of a target image provided by an embodiment of the present invention.
  • Fig. 3 is a schematic diagram of another target image provided by an embodiment of the present invention.
  • FIG. 4 is a flowchart of specific steps of an image processing method provided by an embodiment of the present invention.
  • FIG. 5 is a flowchart of specific steps of another image processing method provided by an embodiment of the present invention.
  • Fig. 6 is a schematic diagram of another target image provided by an embodiment of the present invention.
  • FIG. 7 is a scene diagram of acquiring initial depth information of a target object according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of another target image provided by an embodiment of the present invention.
  • FIG. 9 is a probability distribution diagram of a timing matching operation provided by an embodiment of the present invention.
  • FIG. 10 is a block diagram of an image processing device according to an embodiment of the present invention.
  • FIG. 11 is a block diagram of a movable device according to an embodiment of the present invention.
  • FIG. 12 is a schematic diagram of the hardware structure of a control terminal provided by an embodiment of the present invention.
  • Fig. 1 is a flowchart of steps of an image processing method provided by an embodiment of the present invention. As shown in Fig. 1, the method may include:
  • Step 101 Obtain a target image including a target object.
  • The image processing method provided by the embodiment of the present invention can be applied to a movable device, such as an unmanned aerial vehicle, an unmanned vehicle, an unmanned boat, or a handheld camera.
  • Such movable equipment is usually provided with an image processing device having a photographing function, and its normal operation depends on the image processing device photographing the objects around the movable equipment and processing their depth information.
  • Taking an unmanned vehicle as an example: when driving autonomously, it needs to use the image processing device installed on it to collect images of objects in its surrounding environment in real time and to process those images to obtain the objects' depth information. The unmanned vehicle can then use the depth information to determine the orientation of objects and achieve autonomous driving.
  • In this step, the target image including the target object is acquired; specifically, the camera in the image processing device may be used to capture one or more target images containing the target object.
  • Step 102 Determine a target area in the target image, where at least the main body of the target object is located in the target area.
  • After the target image is acquired, the target area in the target image can be further determined to detect the object in the target image. The target area includes at least the main body of the target object; that is, the target area may completely or partially overlap the main body of the target object.
  • Referring to FIG. 2, which shows a schematic diagram of a target image provided by an embodiment of the present invention, the target image 10 includes a human target object 11 and two street lamps 12 in the background. If the entire target image 10 were scanned directly to determine the depth information of the human target object 11, the calculation would be excessive; moreover, the irrelevant background and the information of the street lamps 12 would be introduced into the process, making errors in the depth information of the human target object 11 more likely.
  • Therefore, the area where the human target object 11 is located can be roughly selected by the target area frame 13, which may include the entire human target object 11 and a small part of the background area.
  • the main body part of the target object may be determined first, and the target area at least includes the main body part.
  • Delimiting the target area frame 13 to include the entire human target object 11 and then scanning only the area within the frame reduces the amount of calculation to a certain extent; the target area frame 13 also filters out the two irrelevant street lamps 12 in the background, reducing the probability of error when calculating the depth information of the human target object 11.
  • Selecting the area where the human target object 11 is located through the target area frame 13 can be implemented in the following two ways (see the sketch after this list):
  • Manner 1: the target area frame 13 is generated by receiving the user's frame-selection operation, and the area where the human target object 11 is located is selected through the target area frame 13.
  • Manner 2: through deep learning, a recognition model is trained to recognize the human target object 11 in the target image 10, so that after the target image 10 is input to the recognition model, the model automatically outputs the target area frame 13 containing the human target object 11. This is similar to current face-region positioning technology and is not repeated here.
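  • As an illustration of Manner 2, the following minimal sketch locates a person-shaped target area frame with a pretrained pedestrian detector. This is only a stand-in: the patent trains its own recognition model, while the OpenCV HOG people detector, the input file name, and the window stride below are assumptions made for the example.

```python
import cv2

# Sketch of Manner 2: produce a target area frame around a person.
# The OpenCV HOG pedestrian detector stands in for the patent's trained
# recognition model; "target_image.png" is a hypothetical input file.
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

image = cv2.imread("target_image.png")
boxes, weights = hog.detectMultiScale(image, winStride=(8, 8))
if len(boxes) > 0:
    x, y, w, h = boxes[0]                 # target area frame (cf. frame 13)
    target_area = image[y:y + h, x:x + w]
```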
  • the shape of the target area is preferably a rectangle.
  • the shape of the target area may also be a circle, an irregular shape, etc., according to actual needs, which is not limited in the embodiment of the present invention.
  • Step 103 Determine the subject characteristic area of the target object in the target area.
  • the subject feature area can accurately reflect the orientation of the target object.
  • Specifically, the subject feature area can represent the center of mass of the entire target object, so that the trajectory generated by the movement of the subject feature area can be taken as the trajectory generated when the target object moves.
  • In the target area, the subject feature area of the target object can be determined. Taking FIG. 2 as an example, the human torso (that is, the area ABCD in FIG. 2) can be defined as the subject feature area, so that the subject feature area is further delimited within the target area frame 13 to reduce the variance generated during the subsequent calculation of depth information.
  • Referring to FIG. 3, the target area frame 23 encircles the entire car 21 and a partial occlusion area 22. The car 21 is a relatively regularly shaped object, so the area of the car 21 outside the occlusion area 22 in the target area frame 23 can be defined as the subject feature area, reducing the variance the occlusion area 22 would introduce into the subsequent calculation of depth information.
  • Step 104 Determine the depth information of the target object according to the initial depth information of the subject feature area.
  • the perception of depth information is the prerequisite for humans to produce stereoscopic object vision.
  • The depth information refers to the number of bits used to store each pixel in the image, which determines the number of colors that each pixel of a color image may have, or the number of gray levels that each pixel of a grayscale image may have.
  • In the embodiment of the present invention, the depth information of an object can be a grayscale image containing the depth information of each pixel; the magnitude of the depth information is expressed by the depth of the gray level, and the grayscale image represents the distance between the object and the camera through a grayscale gradient.
  • Step 103 has determined the subject feature area of the target object, so the initial depth information of the subject feature area can now be used to determine the depth information of the target object.
  • the initial depth information of the subject feature area can be acquired in multiple ways.
  • For example, a current movable device may be configured with a binocular camera module, so the depth information of the target object can be acquired by passive ranging sensing. This method uses two cameras separated by a certain distance to capture two images of the same target object at the same time, uses a stereo matching algorithm to find the pixel points corresponding to the subject feature area in the two images, and then calculates the disparity information according to the triangulation principle; the disparity information can be converted into the initial depth information that characterizes the subject feature area in the scene. A sketch follows.
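  • The sketch below shows a minimal version of this passive pipeline, assuming already-rectified left/right images. The file names, focal length, and baseline are illustrative assumptions, and block matching stands in for whichever stereo matching algorithm the system actually uses.

```python
import cv2
import numpy as np

# Passive ranging sketch: stereo matching -> disparity -> depth by
# triangulation. Assumes rectified grayscale inputs.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0  # BM output is fixed-point

focal_px = 700.0     # assumed focal length in pixels
baseline_m = 0.12    # assumed spacing between the two optical centers
depth_m = np.zeros_like(disparity)
valid = disparity > 0
depth_m[valid] = focal_px * baseline_m / disparity[valid]  # Z = f*b/d
```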
  • Alternatively, the initial depth information of the subject feature area can be acquired by active ranging sensing. The most distinctive feature of active ranging sensing is that the energy emitted by the device itself is used to collect the initial depth information, which also makes the acquisition of the depth image independent of the acquisition of the color image. Accordingly, in the embodiment of the present invention, continuous near-infrared pulses can be emitted toward the target object by the movable device, and a sensor on the movable device receives the light pulses reflected by the target object. From the transmission delay of the light pulses, the distance of the target object relative to the emitter can be inferred, finally yielding a depth image containing the initial depth information corresponding to the subject feature area of the target object.
  • Specifically, the depth information of the corresponding target object can be obtained by averaging the initial depth information of the subject feature area (see the sketch below).
  • In other words, the embodiment of the present invention scans and processes only the local subject feature area of the target image to obtain the depth information of the corresponding target object; compared with scanning and processing the entire target image directly, the amount of calculation is reduced and processing efficiency is improved.
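  • A small sketch of the averaging just described, under the assumption that a depth map and the polygon corners of the subject feature area (the ABCD region) are already available; the depth values and corner coordinates are mock values.

```python
import cv2
import numpy as np

# Average the initial depth values inside the subject feature area to get
# one depth value for the target object.
depth_m = np.random.uniform(4.8, 5.4, size=(480, 640)).astype(np.float32)  # mock depth map
corners = np.array([[120, 80], [220, 80], [230, 260], [110, 260]],
                   dtype=np.int32)                     # assumed A, B, C, D
mask = np.zeros(depth_m.shape, dtype=np.uint8)
cv2.fillConvexPoly(mask, corners, 255)
subject_depth = float(depth_m[(mask == 255) & (depth_m > 0)].mean())
```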
  • Step 105 Determine three-dimensional physical information of the target object according to the depth information.
  • After the depth information of the target object is determined, the three-dimensional physical information of the target object can be further determined according to the depth information; the three-dimensional physical information can be used to indicate the orientation and trajectory of the target object.
  • As noted above, the depth information of an object can be a grayscale image containing the depth information of each pixel; the magnitude of the depth information is expressed by the depth of the gray level, and the grayscale image represents the distance of the object from the camera through a grayscale gradient.
  • Therefore, the depth information of the target object can be converted into a grayscale image; by calculating the grayscale gradient values in the grayscale image and using the correspondence between gradient value and distance, the distance between the target object and the movable device can be determined. From this, the position coordinates of the target object at different times can be determined and associated with the corresponding times to obtain the three-dimensional physical information of the target object.
  • Optionally, the position coordinates of the target object at different times can also be associated with the corresponding times and plotted on a specific map to obtain the three-dimensional physical information of the target object.
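  • One way to realize this step is to back-project the target's image position and depth through the pinhole camera model and log the result against time; the intrinsics and sample measurements below are assumed example values, not parameters from the patent.

```python
import numpy as np

# Turn (pixel position, depth) into a 3-D coordinate and accumulate a
# timestamped trajectory. fx, fy, cx, cy are assumed camera intrinsics.
fx, fy, cx, cy = 700.0, 700.0, 320.0, 240.0

def to_3d(u, v, depth_m):
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m])

samples = [(0.0, (180, 170, 5.2)), (0.1, (185, 170, 5.0))]  # mock (t, (u, v, Z))
trajectory = [(t, to_3d(u, v, z)) for t, (u, v, z) in samples]
```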
  • In summary, the image processing method provided by the embodiment of the present invention acquires a target image including a target object; determines a target area in the target image, in which at least the main body of the target object is located; determines, in the target area, the subject feature area of the target object; determines the depth information of the target object according to the initial depth information of the subject feature area; and determines the three-dimensional physical information of the target object according to the depth information.
  • On the one hand, the present invention removes the interference of the background, of occlusions, and of the non-subject parts of the target object in the target image, which reduces the probability of introducing useless information into the depth calculation and improves the accuracy of the three-dimensional physical information.
  • On the other hand, the present invention scans and processes only the subject feature area to obtain the three-dimensional physical information of the corresponding target object; compared with scanning and processing the entire target image directly, the amount of calculation is reduced and processing efficiency is improved.
  • Fig. 4 is a flowchart of specific steps of an image processing method provided by an embodiment of the present invention. As shown in Fig. 4, the method may include:
  • Step 201 Obtain a target image including the target object.
  • Step 202 Determine a target area in the target image, and at least a main body of the target object is located in the target area.
  • For details of this step, please refer to step 102 above; they are not repeated here.
  • Step 203 Divide the target area into multiple sub-areas by extracting edge features of the target area.
  • An edge feature indicates an obviously changing edge or a discontinuous area in an image. Since edges are the boundary lines between different areas of an image, an edge image can be a binary image, and the purpose of edge detection is to capture areas where brightness changes sharply. Ideally, edge detection on the target area yields edge features composed of a series of continuous curves representing object boundaries, and the intersections of these edge features divide the entire target area into multiple sub-areas.
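  • A minimal sketch of Step 203, using Canny edges and their enclosed contours as the sub-region split; the input crop, the thresholds, and the OpenCV 4 findContours signature are assumptions (the patent does not name a particular edge detector).

```python
import cv2

# Divide the target area into sub-areas along edge features.
image = cv2.imread("target_image.png")      # hypothetical input
target_area = image[60:300, 100:240]        # assumed target-area crop
gray = cv2.cvtColor(target_area, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)           # illustrative thresholds
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
sub_regions = [cv2.boundingRect(c) for c in contours]   # (x, y, w, h) boxes
```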
  • Step 204 Determine the classification categories of the multiple sub-regions through the classification model.
  • Step 204 may also be implemented by determining the classification categories of the multiple sub-areas through a convolutional neural network model, or by determining them through a classifier.
  • the training data set can be used to train the classification model.
  • the classification model is used to classify the categories of each sub-region.
  • Specifically, the training process of the classification model can include: training the classification model with the correspondence between regions of preset patterns and the classifications of those patterns, so that, given an input region, the classification model outputs the classification corresponding to that region.
  • multiple sub-regions of the target region can be input into the trained classification model, and the classification model will output the classification category of each sub-region.
  • Step 205 Combine the sub-areas corresponding to the target classification category among the multiple sub-areas to obtain the subject feature area.
  • the target classification category matching the subject feature area can be determined first, and then the subregions corresponding to the target classification category are connected to obtain the subject feature area.
  • For example, the human torso can be defined as the subject feature area; the target classification category is then determined to be the human-torso category, and the sub-areas corresponding to the human-torso category are combined to obtain the subject feature area (see the sketch below).
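  • A hedged sketch of Steps 204 and 205: each sub-region is classified and the regions labelled with the target category are merged into one subject feature area. The classify function below is only a stand-in for the trained classification model, and the sub-region boxes are mock values.

```python
# Classify sub-regions and merge those in the target category.
sub_regions = [(10, 20, 40, 80), (15, 100, 50, 90), (0, 0, 15, 15)]  # mock (x, y, w, h)

def classify(box):
    # Stand-in for the trained classification model (assumption); a real
    # system would classify the cropped pixels, e.g. with a CNN.
    x, y, w, h = box
    return "torso" if w * h > 1000 else "background"

def merge_boxes(boxes):
    x1 = min(x for x, y, w, h in boxes)
    y1 = min(y for x, y, w, h in boxes)
    x2 = max(x + w for x, y, w, h in boxes)
    y2 = max(y + h for x, y, w, h in boxes)
    return x1, y1, x2 - x1, y2 - y1

torso_boxes = [b for b in sub_regions if classify(b) == "torso"]
subject_feature_area = merge_boxes(torso_boxes)   # combined region (cf. ABCD)
```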
  • the offset of the contour of the main feature region is less than or equal to a preset threshold.
  • Specifically, the subject feature area is defined such that, when the target object is under force or in motion, the offset of the contour of the subject feature area is less than or equal to a preset threshold; that is, the subject feature area remains relatively stable while the target object moves or is under force, which avoids introducing too much useless information when the depth information of the target object is calculated later.
  • Measuring the offset of the contour of the subject feature area may specifically include: acquiring consecutive frame images containing the target object at a fixed shooting angle of view, and recording as the offset either the displacement difference of the subject feature area's contour between adjacent frame images, or the displacement difference between the contour in one frame image and the contour in a frame image several frames earlier.
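  • This offset measure can be sketched as the displacement of the contour centroid between two frames, compared against the preset threshold; centroid displacement is one simple realization chosen here as an assumption, and the contours and threshold are mock values.

```python
import numpy as np

# Record the displacement of the subject feature contour between frames
# as the offset. Contours are Nx2 pixel arrays.
def contour_offset(contour_prev, contour_curr):
    return float(np.linalg.norm(contour_curr.mean(axis=0) -
                                contour_prev.mean(axis=0)))

PRESET_THRESHOLD = 3.0                      # pixels, illustrative
c_t0 = np.array([[0, 0], [10, 0], [10, 20], [0, 20]], dtype=np.float32)
c_t1 = c_t0 + np.array([1.0, 0.5])          # mock contour in the next frame
is_stable = contour_offset(c_t0, c_t1) <= PRESET_THRESHOLD
```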
  • Step 206 Determine the depth information of the target object according to the initial depth information of the subject feature area.
  • For details of this step, refer to step 104 above; they are not repeated here.
  • Step 207 Determine the position coordinates of the target object at different times according to the depth information.
  • As described above, the depth information of an object can be a grayscale image containing the depth information of each pixel; the magnitude of the depth information is expressed by the depth of the gray level, and the grayscale image represents the object's distance from the camera through a grayscale gradient.
  • Therefore, the depth information of the target object can be converted into a grayscale image; by calculating the grayscale gradient values in the grayscale image and using the correspondence between gradient value and distance, the spacing between the target object and the movable device can be determined.
  • As the movable device operates, the depth information is constantly updated, so a new grayscale image can be obtained from the updated depth information, and the position coordinates of the target object at different times can be determined from the distance between the target object and the movable device.
  • Step 208 Determine the three-dimensional physical information of the target object according to the position coordinates of the target object at different times.
  • the position coordinates of the target object at different times can be correlated with the corresponding times to obtain the three-dimensional physical information of the target object.
  • Optionally, the position coordinates of the target object at different times can also be associated with the corresponding times and plotted on a specific map to obtain the three-dimensional physical information of the target object.
  • In summary, the image processing method acquires a target image including a target object; determines a target area in the target image, in which at least the main body of the target object is located; determines, in the target area, the subject feature area of the target object; determines the depth information of the target object according to the initial depth information of the subject feature area; and determines the three-dimensional physical information of the target object according to the depth information.
  • On the one hand, the present invention removes the interference of the background, of occlusions, and of the non-subject parts of the target object in the target image, which reduces the probability of introducing useless information into the depth calculation and improves the accuracy of the three-dimensional physical information.
  • On the other hand, the present invention scans and processes only the subject feature area to obtain the three-dimensional physical information of the corresponding target object; compared with scanning and processing the entire target image directly, the amount of calculation is reduced and processing efficiency is improved.
  • FIG. 5 is a flowchart of specific steps of an image processing method provided by an embodiment of the present invention. As shown in FIG. 5, the method may include:
  • Step 301 At a preset moment, acquire a first image and a second image of the target object through the binocular camera module.
  • a binocular camera module can be used to determine the initial depth information of the target object.
  • The binocular camera module includes a first camera and a second camera whose optical centers are separated by a fixed distance. It is a device that, based on the principle of binocular parallax, obtains the three-dimensional geometric information of a target object from multiple images.
  • Referring to FIG. 6, which shows a schematic diagram of a target image provided by an embodiment of the present invention: at a preset time T1, the first image 30 of the target object may be acquired by the first camera of the binocular camera module, and the second image 40 of the target object is acquired by the second camera of the binocular camera module.
  • Step 302 Determine a target area in the first image and the second image, where at least the main body of the target object is located in the target area.
  • Specifically, the first target area 31 in the first image 30 and the second target area 41 in the second image 40 can be determined; for the method of determining the target area in an image, refer to step 102 above, which is not repeated here.
  • Step 303 In the target area, determine the subject characteristic area of the target object.
  • Specifically, the first subject feature area EFGH in the first target area 31 and the second subject feature area E'F'G'H' in the second target area 41 can be determined; for the method of determining the subject feature area of the target object within a target area, refer to step 103 above, which is not repeated here.
  • Step 304 Match the first subject feature area of the first image against the second image, and/or match the second subject feature area of the second image against the first image, and calculate the initial depth information.
  • In practice, using the binocular camera module to obtain the initial depth information of the target object involves four steps: camera calibration, binocular correction, binocular matching, and calculation of depth information.
  • Camera calibration is a process of eliminating the distortion of the camera due to the characteristics of the optical lens. Through camera calibration, the internal and external parameters and distortion parameters of the first camera and the second camera of the binocular camera module can be obtained.
  • Binocular correction: after the first image and the second image are acquired, the internal and external parameters and distortion parameters of the first camera and the second camera obtained by camera calibration are used to remove distortion from, and align, the first image and the second image, yielding distortion-free first and second images.
  • Binocular matching: the first subject feature area of the first image is matched against the second image, and/or the second subject feature area of the second image is matched against the first image.
  • For example, the pixels in the first subject feature area EFGH can be matched against the pixels of the entire second image 40, or the pixels in the second subject feature area E'F'G'H' can be matched against the pixels of the entire first image 30; alternatively, both matching operations can be performed.
  • The function of binocular matching is to match the pixels corresponding to the same scene in the left and right views (that is, the first image and the second image). Its purpose is to obtain the disparity value; once the disparity value is obtained, the depth information can be calculated. A sketch of one possible matching approach follows.
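  • The patent leaves the matching algorithm open; one simple realization is to search for the first subject feature area inside the second image by normalized cross-correlation, as in this sketch. The file names and crop coordinates are assumptions.

```python
import cv2

# Locate the first subject feature area (EFGH crop) inside the second
# image with template matching; the peak location gives the horizontal
# shift from which a disparity could be read off.
first_image = cv2.imread("first.png", cv2.IMREAD_GRAYSCALE)
second_image = cv2.imread("second.png", cv2.IMREAD_GRAYSCALE)

template = first_image[80:260, 110:230]        # assumed EFGH bounding box
result = cv2.matchTemplate(second_image, template, cv2.TM_CCOEFF_NORMED)
_, score, _, top_left = cv2.minMaxLoc(result)  # best match and its score
```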
  • In the embodiment of the present invention, the first subject feature area of the first image can be matched against the second image, and/or the second subject feature area of the second image can be matched against the first image.
  • Optionally, step 304 may specifically include:
  • Sub-step 3041 Match the first subject feature area of the first image against the second image, and/or match the second subject feature area of the second image against the first image, to obtain the disparity value.
  • Before the depth information can be calculated, the disparity value between the first camera and the second camera needs to be computed, which specifically includes:
  • matching the first subject feature area of the first image against the second image, and/or matching the second subject feature area of the second image against the first image; specifically, this can be implemented by matching the feature pixel points extracted from the first subject feature area in the second image, and/or matching the feature pixel points extracted from the second subject feature area in the first image.
  • A feature pixel is a pixel whose gray-value change is greater than a preset threshold, or a pixel on an image edge whose curvature is greater than a preset curvature value; it can be a point of drastic change on the target object, such as a corner point or a boundary point. Further, the feature pixels extracted from the first subject feature area are matched in the second image, and/or the feature pixels extracted from the second subject feature area are matched in the first image.
  • For example, the feature pixels extracted in the first subject feature area may be the four corner points E, F, G, and H of the human torso, and the feature pixels extracted in the second subject feature area may be the four corner points E', F', G', and H' of the human torso.
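  • Corner-like feature pixels such as E, F, G, and H can be extracted with any corner detector; the sketch below uses Shi-Tomasi corners as a stand-in for the patent's unspecified extractor, with an assumed input file.

```python
import cv2

# Extract up to four strong corners inside the first subject feature area;
# with maxCorners=4 these can play the role of points E, F, G, H.
roi = cv2.imread("first_subject_area.png", cv2.IMREAD_GRAYSCALE)  # hypothetical
corners = cv2.goodFeaturesToTrack(roi, maxCorners=4,
                                  qualityLevel=0.01, minDistance=10)
```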
  • Then, the initial depth information is calculated according to the disparity value.
  • From the focal length of the binocular camera module, the distance between the optical centers of the first camera and the second camera, and the disparity value, the initial depth information of the target object can be calculated.
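  • The patent names these three quantities without writing out the relation; the textbook triangulation form, stated here as an assumption consistent with the quantities listed above, is

    Z = (f × b) / d

    where Z is the initial depth, f the focal length in pixels, b the distance between the two optical centers, and d the disparity in pixels. For example, with f = 700 px, b = 0.12 m, and d = 16.8 px, Z = 700 × 0.12 / 16.8 = 5 m.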
  • After step 301, the method may further include:
  • Step 305 Acquire a first image and a second image of the target object through the binocular camera module at multiple times.
  • In the embodiment of the present invention, weight values can be applied to the initial depth information so that the calculated depth information of the target object is more stable and accurate.
  • A key feature pixel point is a point that remains relatively stable at different times and whose relative position is unlikely to change. Therefore, in this step it is first necessary to acquire the first image and the second image of the target object at multiple times through the binocular camera module.
  • FIG. 8 shows a schematic diagram of a target image provided by an embodiment of the present invention.
  • At time T1, the first camera of the binocular camera module acquires the first image 30 of the target object and the second camera acquires the second image 40; at time T2, the first camera acquires the third image 50 of the target object and the second camera acquires the fourth image 60.
  • Then, the first target area 31 in the first image 30, the second target area 41 in the second image 40, the third target area 51 in the third image 50, and the fourth target area 61 in the fourth image 60 are determined.
  • Likewise, the first subject feature area EFGH in the first target area 31, the second subject feature area E'F'G'H' in the second target area 41, the third subject feature area IJKL in the third target area 51, and the fourth subject feature area I'J'K'L' in the fourth target area 61 are determined.
  • Step 306 Match the first subject feature area of the first image against the second image acquired at the corresponding time, and/or match the second subject feature area of the second image against the first image acquired at the corresponding time.
  • At each moment, the image obtained by the first camera can be matched against the image obtained by the second camera to determine whether relatively stable key feature pixel points exist between the two images captured at the same time. Specifically, the first subject feature area of the first image is matched against the second image acquired at the corresponding time, and/or the second subject feature area of the second image is matched against the first image acquired at the corresponding time; for example, the pixels in the first subject feature area EFGH can be matched against the pixels of the entire second image 40, or the pixels in the second subject feature area E'F'G'H' against the pixels of the entire first image 30.
  • Step 307 Determine the first number of successful matching times of the matching process.
  • In this step, if a pixel in the image obtained by the first camera has not changed position relative to the corresponding pixel in the image obtained by the second camera, the pixel can be judged to have matched successfully at that moment, which increases its confidence as a key feature pixel. By counting the matching results for the pixel points at each moment, the first number of successful matches is obtained.
  • Step 308 Perform matching processing between the feature regions in the multiple first images acquired at different times.
  • In the embodiment of the present invention, matching processing can also be performed between the feature areas of the first images at different times; for example, the pixel points in the first subject feature area EFGH at time T1 can be matched against the pixels in the third subject feature area IJKL at time T2.
  • Step 309 Determine the second number of successful matching times of the matching process.
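  • A hedged sketch of Steps 308 and 309: track the T1 feature pixels into the T2 frame with pyramidal Lucas-Kanade optical flow, counting successful tracks as the second number of successful matches. LK flow is a stand-in for the patent's unspecified temporal matching, and the point coordinates and file names are assumptions.

```python
import cv2
import numpy as np

# Track E, F, G, H from the T1 first image into the T2 first image and
# count how many points track successfully (c2).
img_t1 = cv2.imread("first_T1.png", cv2.IMREAD_GRAYSCALE)
img_t2 = cv2.imread("first_T2.png", cv2.IMREAD_GRAYSCALE)

pts = np.array([[[120, 80]], [[220, 80]], [[230, 260]], [[110, 260]]],
               dtype=np.float32)                      # mock E, F, G, H
new_pts, status, err = cv2.calcOpticalFlowPyrLK(img_t1, img_t2, pts, None)
second_success_count = int(status.sum())              # c2
```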
  • After step 304 and step 309, the method may further include:
  • Step 310 Set a weight value for the initial depth information according to the first number of successful matches and the second number of successful matches, the weight value being positively correlated with the number of successful matches.
  • Determining the weight value P_t corresponding to the initial depth information from the first number of successful matches c1 and the second number of successful matches c2 can be realized according to formula 1.
  • Taking point E as an example: the first number of successful matches c1 can be obtained, and at the same time the initial depth information Ed of point E can be calculated by binocular matching and an initial confidence assigned to point E; the second number of successful matches c2 can likewise be obtained, and the weight value P_t of point E is then calculated according to formula 1.
  • In the same way, the weight values P_t of points F, G, and H can be obtained. It should be noted that the parameters 60 and 5 in formula 1 can be adjusted based on experience and requirements, which is not limited in the embodiment of the present invention.
  • Step 311 Perform a weighted average calculation according to the initial depth information and the weight value corresponding to the initial depth information to obtain the depth information of the target object.
  • The weight value P_t of point E is applied to the initial depth information corresponding to point E in the subject feature area, so that the confidence of each pixel point contributes to a weighted average, making the calculated depth information of the target object more stable and accurate.
  • For example, the pixels E, F, G, and H in the first subject feature area EFGH are taken as feature pixels; the weight value P_t1 and initial depth information Ed are computed for point E, the weight value P_t2 and initial depth information Fd for point F, the weight value P_t3 and initial depth information Gd for point G, and the weight value P_t4 and initial depth information Hd for point H. A weighted average of these finally yields the depth information of the target object.
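  • The weighted averaging of Step 311 then reduces to the computation below. Formula 1 for the weights is not reproduced in this text, so the weight values here are mock numbers; only the averaging itself is illustrated.

```python
# Weighted average of the corner points' initial depths (Ed, Fd, Gd, Hd)
# using their weight values (P_t1..P_t4); all numbers are mock values.
initial_depth = {"E": 5.1, "F": 5.3, "G": 4.9, "H": 5.2}
weight = {"E": 0.9, "F": 0.8, "G": 0.7, "H": 0.85}

target_depth = (sum(weight[k] * initial_depth[k] for k in initial_depth)
                / sum(weight.values()))
```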
  • Referring to FIG. 9, which shows a probability distribution diagram of the temporal matching operation provided by an embodiment of the present invention: the abscissa is the number of frames, used to index the consecutive frames of first and second images of the target object acquired by the binocular camera module (the first image and the second image acquired at one moment can be regarded as one frame), and the ordinate is the probability, used to indicate the degree of confidence.
  • Step 312 Determine three-dimensional physical information of the target object according to the depth information.
  • For details of this step, refer to step 105 above; they are not repeated here.
  • In summary, the image processing method acquires a target image including a target object; determines a target area in the target image, in which at least the main body of the target object is located; determines, in the target area, the subject feature area of the target object; determines the depth information of the target object according to the initial depth information of the subject feature area; and determines the three-dimensional physical information of the target object according to the depth information.
  • On the one hand, the present invention removes the interference of the background, of occlusions, and of the non-subject parts of the target object in the target image, which reduces the probability of introducing useless information into the depth calculation and improves the accuracy of the three-dimensional physical information.
  • On the other hand, the present invention scans and processes only the subject feature area to obtain the three-dimensional physical information of the corresponding target object; compared with scanning and processing the entire target image directly, the amount of calculation is reduced and processing efficiency is improved.
  • FIG. 10 is a block diagram of an image processing apparatus according to an embodiment of the present invention.
  • the image processing apparatus 400 may include: a receiver 401 and a processor 402;
  • the receiver 401 is configured to perform: acquiring a target image including a target object;
  • the processor 402 is configured to execute: determining a target area in the target image, where at least the main body of the target object is located in the target area; determining, in the target area, the subject feature area of the target object; determining the depth information of the target object according to the initial depth information of the subject feature area; and determining the three-dimensional physical information of the target object according to the depth information.
  • Optionally, the processor 402 is further configured to execute: dividing the target area into multiple sub-areas by extracting edge features of the target area, and merging the sub-areas corresponding to the target classification category to obtain the subject feature area.
  • Optionally, the offset of the contour of the subject feature area is less than or equal to a preset threshold.
  • Optionally, the processor 402 is further configured to execute: determining the classification categories of the multiple sub-areas.
  • Optionally, the receiver 401 is further configured to perform: acquiring, at a preset moment, the first image and the second image of the target object through the binocular camera module.
  • Optionally, the processor 402 is further configured to execute: determining the depth information of the target object.
  • Optionally, the processor 402 is further configured to execute: calculating the initial depth information.
  • Optionally, the processor 402 is further configured to execute: matching the feature pixels extracted from the first subject feature area in the second image, and/or matching the feature pixels extracted from the second subject feature area in the first image.
  • A feature pixel is a pixel whose gray-value change is greater than a preset threshold, or a pixel on an image edge whose curvature is greater than a preset curvature value.
  • Optionally, the receiver 401 is further configured to perform: acquiring the first image and the second image of the target object through the binocular camera module at multiple times.
  • Optionally, the processor 402 is further configured to execute: determining the first number of successful matches of the matching processing.
  • Optionally, the processor 402 is further configured to execute: determining the second number of successful matches of the matching processing.
  • Optionally, the processor 402 is further configured to execute: performing a weighted average calculation according to the initial depth information and the corresponding weight values to obtain the depth information of the target object.
  • Optionally, the processor 402 is further configured to execute: determining the three-dimensional physical information of the target object according to the position coordinates of the target object at different times.
  • In summary, the image processing device acquires a target image including a target object; determines a target area in the target image, in which at least the main body of the target object is located; determines, in the target area, the subject feature area of the target object; determines the depth information of the target object according to the initial depth information of the subject feature area; and determines the three-dimensional physical information of the target object according to the depth information.
  • On the one hand, the present invention removes the interference of the background, of occlusions, and of the non-subject parts of the target object in the target image, which reduces the probability of introducing useless information into the depth calculation and improves the accuracy of the three-dimensional physical information.
  • On the other hand, the present invention scans and processes only the subject feature area to obtain the three-dimensional physical information of the corresponding target object; compared with scanning and processing the entire target image directly, the amount of calculation is reduced and processing efficiency is improved.
  • The embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the image processing method are implemented.
  • An embodiment of the present invention also provides a control terminal, which includes the image processing device, a transmitting device, and a receiving device. The transmitting device sends a shooting instruction to a movable device, the receiving device receives the image captured by the movable device, and the image processing device processes the image.
  • Referring to FIG. 11, an embodiment of the present invention also provides a movable device 500, including a photographing device 501 and further including the image processing device 400 described in FIG. 10; the image processing device 400 receives the image captured by the photographing device 501 and performs image processing.
  • the movable device 500 further includes a controller 502 and a power system 503, and the controller 502 controls the power output of the power system 503 according to the processing result processed by the image processing device 400.
  • The power system includes a motor that drives the propellers and a motor that drives the movement of the pan/tilt head; the controller 502 can therefore change the posture of the movable device 500 or the orientation of the pan/tilt head (that is, the orientation of the photographing device 501) according to the image processing result.
  • the image processing device 400 is integrated in the controller 502.
  • the movable device 500 includes at least one of a drone, an unmanned vehicle, an unmanned boat, and a handheld camera.
  • Referring to FIG. 12, the control terminal 600 includes, but is not limited to: a radio frequency unit 601, a network module 602, an audio output unit 603, an input unit 604, a sensor 605, a display unit 606, a user input unit 607, an interface unit 608, a memory 609, a processor 610, a power supply 611, and other components.
  • Those skilled in the art can understand that the structure of the control terminal shown in FIG. 12 does not constitute a limitation on the control terminal; the control terminal may include more or fewer components than shown, combine certain components, or arrange the components differently.
  • the control terminal includes, but is not limited to,
  • The radio frequency unit 601 can be used to receive and send signals during information transmission or a call; specifically, it receives downlink data from the base station and passes it to the processor 610 for processing, and it sends uplink data to the base station.
  • the radio frequency unit 601 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like.
  • the radio frequency unit 601 can also communicate with the network and other devices through a wireless communication system.
  • the control terminal provides users with wireless broadband Internet access through the network module 602, such as helping users to send and receive emails, browse web pages, and access streaming media.
  • the audio output unit 603 can convert the audio data received by the radio frequency unit 601 or the network module 602 or stored in the memory 609 into audio signals and output them as sounds. Moreover, the audio output unit 603 may also provide audio output related to a specific function performed by the control terminal 600 (for example, call signal reception sound, message reception sound, etc.).
  • the audio output unit 603 includes a speaker, a buzzer, a receiver, and the like.
  • the input unit 604 is used to receive audio or video signals.
  • the input unit 604 may include a graphics processing unit (GPU) 6041 and a microphone 6042.
  • The graphics processor 6041 processes the image data of static pictures or videos obtained by an image capture device (such as a camera) in video capture mode or image capture mode.
  • the processed image frame may be displayed on the display unit 606.
  • the image frame processed by the graphics processor 6041 may be stored in the memory 609 (or other storage medium) or sent via the radio frequency unit 601 or the network module 602.
  • the microphone 6042 can receive sound, and can process such sound into audio data.
  • the processed audio data can be converted into a format that can be sent to the mobile communication base station via the radio frequency unit 601 for output in the case of a telephone call mode.
  • the control terminal 600 also includes at least one sensor 605, such as a light sensor, a motion sensor, and other sensors.
  • the light sensor includes an ambient light sensor and a proximity sensor.
  • the ambient light sensor can adjust the brightness of the display panel 6061 according to the brightness of the ambient light.
  • The proximity sensor can turn off the display panel 6061 and/or the backlight when the control terminal 600 is moved to the ear.
  • As a kind of motion sensor, the accelerometer can detect the magnitude of acceleration in various directions (usually three axes) and, when stationary, the magnitude and direction of gravity; it can be used to recognize the posture of the control terminal (such as horizontal/vertical screen switching, related games, and magnetometer attitude calibration) and for vibration-recognition functions (such as a pedometer or tapping). The sensor 605 may also include a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, etc., which are not described here.
  • the display unit 606 is used to display information input by the user or information provided to the user.
  • the display unit 606 may include a display panel 6061, and the display panel 6061 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), etc.
  • the user input unit 607 may be used to receive inputted numeric or character information, and generate key signal input related to user settings and function control of the control terminal.
  • the user input unit 607 includes a touch panel 6071 and other input devices 6072.
  • The touch panel 6071, also called a touch screen, can collect the user's touch operations on or near it (for example, operations performed on or near the touch panel 6071 with a finger, a stylus, or any other suitable object or accessory).
  • Optionally, the touch panel 6071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the user's touch position and the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 610, and receives and executes the commands sent by the processor 610.
  • the touch panel 6071 can be implemented in multiple types such as resistive, capacitive, infrared, and surface acoustic wave.
  • the user input unit 607 may also include other input devices 6072.
  • other input devices 6072 may include, but are not limited to, a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackball, mouse, and joystick, which will not be repeated here.
  • the touch panel 6071 can cover the display panel 6061.
  • When the touch panel 6071 detects a touch operation on or near it, it transmits the operation to the processor 610 to determine the type of the touch event, and the processor 610 then provides the corresponding visual output on the display panel 6061 according to the type of the touch event.
  • Although in FIG. 12 the touch panel 6071 and the display panel 6061 are two independent components realizing the input and output functions of the control terminal, in some embodiments the touch panel 6071 and the display panel 6061 can be integrated to realize the input and output functions of the control terminal; this is not limited here.
  • The interface unit 608 is an interface for connecting an external device to the control terminal 600.
  • For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and so on.
  • The interface unit 608 can be used to receive input (for example, data information or power) from an external device and transmit the received input to one or more elements in the control terminal 600, or to transfer data between the control terminal 600 and an external device.
  • the memory 609 can be used to store software programs and various data.
  • the memory 609 may mainly include a storage program area and a storage data area.
  • The storage program area may store an operating system, application programs required by at least one function (such as a sound playback function or an image playback function), and the like; the storage data area may store data created by the use of the mobile phone (such as audio data and a phone book).
  • the memory 609 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
  • the processor 610 is the control center of the control terminal. It uses various interfaces and lines to connect the various parts of the entire control terminal, and performs the various functions of the control terminal and processes data by running or executing the software programs and/or modules stored in the memory 609 and calling the data stored in the memory 609, thereby monitoring the control terminal as a whole.
  • the processor 610 may include one or more processing units; preferably, the processor 610 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs and the like, and the modem processor mainly handles wireless communication. It can be understood that the foregoing modem processor may alternatively not be integrated into the processor 610.
  • the control terminal 600 may also include a power supply 611 (such as a battery) for supplying power to various components.
  • the power supply 611 may be logically connected to the processor 610 through a power management system, so that functions such as charging, discharging, and power consumption management are implemented through the power management system.
  • the control terminal 600 also includes some functional modules that are not shown, which will not be described here.
  • the embodiment of the present invention also provides a control terminal, including a processor 610, a memory 609, and a computer program stored on the memory 609 and executable on the processor 610. When the computer program is executed by the processor 610, each process of the foregoing image processing method embodiment is implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
  • the embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, each process of the above-mentioned image processing method embodiment is implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here. The computer-readable storage medium is, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
  • an embodiment of the present invention also provides a movable device, including a photographing device and the image processing device described with reference to FIG. 10; the image processing device receives an image photographed by the photographing device and performs image processing.
  • the movable device further includes a controller and a power system, and the controller controls the power output of the power system according to the processing result of the image processing device.
  • the image processing device may be integrated in the controller.
  • the movable device includes at least one of a drone, an unmanned vehicle, an unmanned boat, and a handheld photographing device.
  • the embodiments of the present application can be provided as a method, a control terminal, or a computer program product. Therefore, the present application may take the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, and the instruction apparatus implements the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing terminal equipment, so that a series of operation steps are executed on the computer or other programmable terminal equipment to produce computer-implemented processing, such that the instructions executed on the computer or other programmable terminal equipment provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.

Abstract

An image processing method and apparatus, a control terminal and a mobile device. The image processing method comprises: acquiring a target image comprising a target object (101); determining a target area in the target image (102); in the target area, determining a main body feature area of the target object (103); determining, according to initial depth information of the main body feature area, depth information of the target object (104); and determining, according to the depth information, three-dimensional physical information of the target object (105). According to the present invention, during the process of obtaining the depth information, the interference of a background, an obstruction, and a non-main body part of the target object in the target image is removed, and therefore, the probability of introducing useless information during the process of computing the depth information is reduced and the accuracy of the three-dimensional physical information is improved. In addition, in the present invention, scanning and processing are performed with regard to a main body feature area, so as to obtain three-dimensional physical information of a corresponding target object; and compared with directly performing scanning and processing on an overall target image, the amount of computation is reduced and processing efficiency is increased.

Description

Image processing method, device, control terminal and movable device

Technical field

The present invention belongs to the field of image processing technology, and particularly relates to an image processing method and device, a control terminal and a movable device.

Background
As an important branch of intelligent computing, computer vision technology has been extensively developed and applied. Computer vision technology replaces the human visual organs with an imaging system, so as to track and position a target object.

At present, to track and position a target object, the depth information of the target object must first be obtained, that is, a depth map (Depth Map) representing the depth information of the target must be acquired. There are currently two ways to obtain such a depth map. In the first solution, referring to FIG. 1, feature detection is performed on an image 1 that is acquired by the imaging system and includes a target object 2, and a feature frame 3 containing the target object 2 and part of a background picture 4 is delimited, so that the depth information of the target object 2 is calculated from all the pixels in the feature frame 3 and the depth map of the target object 2 is drawn. In the second solution, an image semantic segmentation algorithm or a semantic parsing algorithm is applied directly to the image 1 to recognize the target object 2, and the depth map of the target object 2 is drawn from the result.

However, in the first solution, because the delimited feature frame 3 contains both the target object 2 and part of the background picture 4, a large amount of useless information is introduced when the depth map is drawn from the feature frame 3, such as the depth information of the background picture 4 and the depth information of some unimportant parts of the target object 2. As a result, the final depth map cannot accurately express the target object 2, and the tracking and positioning accuracy for the target object 2 is poor. In the second solution, the algorithm is applied to the whole of image 1, which requires large computing resources and leads to high processing cost.
Summary of the invention

The present invention provides an image processing method, device, control terminal and movable device, so as to solve the problems in the prior art that determining the three-dimensional physical information of an object requires large computing resources, which leads to high processing cost, and that the tracking and positioning accuracy for the object is poor.

In order to solve the above technical problems, the present invention is implemented as follows:

In a first aspect, an embodiment of the present invention provides an image processing method, which may include:

acquiring a target image including a target object;

determining a target area in the target image, where at least the main body part of the target object is located in the target area;

determining, in the target area, a subject feature area of the target object;

determining the depth information of the target object according to initial depth information of the subject feature area; and

determining three-dimensional physical information of the target object according to the depth information.
In a second aspect, an embodiment of the present invention provides an image processing device, which may include a receiver and a processor.

The receiver is configured to acquire a target image including a target object.

The processor is configured to:

determine a target area in the target image, where at least the main body part of the target object is located in the target area;

determine, in the target area, a subject feature area of the target object;

determine the depth information of the target object according to initial depth information of the subject feature area; and

determine three-dimensional physical information of the target object according to the depth information.

In a third aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the image processing method described above.

In a fourth aspect, an embodiment of the present invention provides a control terminal, which includes the image processing device, a transmitting device and a receiving device, where the transmitting device sends a shooting instruction to a movable device, the receiving device receives an image taken by the movable device, and the image processing device processes the image.

In a fifth aspect, an embodiment of the present invention provides a movable device including a photographing device, where the movable device further includes an image processing device, and the image processing device receives an image taken by the photographing device and performs image processing.

In the embodiments of the present invention, a target image including a target object is acquired; a target area in the target image is determined, where at least the main body part of the target object is located in the target area; a subject feature area of the target object is determined in the target area; the depth information of the target object is determined according to initial depth information of the subject feature area; and three-dimensional physical information of the target object is determined according to the depth information. In the process of obtaining the depth information, the present invention removes the interference of the background, occluding objects and non-subject parts of the target object in the target image, which reduces the probability of introducing useless information when calculating the depth information and improves the accuracy of the three-dimensional physical information. In addition, the present invention scans and processes only the subject feature area to obtain the three-dimensional physical information of the corresponding target object; compared with scanning and processing the entire target image directly, this reduces the amount of calculation and improves the processing efficiency.
Brief description of the drawings

FIG. 1 is a flowchart of the steps of an image processing method provided by an embodiment of the present invention;

FIG. 2 is a schematic diagram of a target image provided by an embodiment of the present invention;

FIG. 3 is a schematic diagram of another target image provided by an embodiment of the present invention;

FIG. 4 is a flowchart of the specific steps of an image processing method provided by an embodiment of the present invention;

FIG. 5 is a flowchart of the specific steps of another image processing method provided by an embodiment of the present invention;

FIG. 6 is a schematic diagram of another target image provided by an embodiment of the present invention;

FIG. 7 is a scene diagram of acquiring the initial depth information of a target object provided by an embodiment of the present invention;

FIG. 8 is a schematic diagram of another target image provided by an embodiment of the present invention;

FIG. 9 is a probability distribution diagram of a temporal matching operation provided by an embodiment of the present invention;

FIG. 10 is a block diagram of an image processing device provided by an embodiment of the present invention;

FIG. 11 is a block diagram of a movable device provided by an embodiment of the present invention;

FIG. 12 is a schematic diagram of the hardware structure of a control terminal provided by an embodiment of the present invention.
Detailed description

The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, rather than all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
FIG. 1 is a flowchart of the steps of an image processing method provided by an embodiment of the present invention. As shown in FIG. 1, the method may include the following steps.

Step 101: acquire a target image including a target object.

The image processing method provided by the embodiment of the present invention can be applied to a movable device such as a drone, an unmanned vehicle, an unmanned boat or a handheld photographing device. A movable device is usually equipped with an image processing device having a photographing function, and its normal operation depends on the image processing device photographing objects around the movable device and obtaining the depth information of those objects through processing.

For example, when an unmanned vehicle drives autonomously, it needs the image processing device installed on it to collect images of objects in its surroundings in real time, and to obtain the depth information of each object by further processing the images. The unmanned vehicle can then use the depth information to determine the orientation of the objects and thus realize autonomous driving.

In this step, the target image including the target object can be acquired by a camera in the image processing device, which captures one or more images whose picture contains the target object.
Step 102: determine a target area in the target image, where at least the main body part of the target object is located in the target area.

Specifically, after the target image including the target object is acquired, the target area in the target image can be further determined in order to detect the object in the target image. The target area may contain at least the main body part of the target object; that is, the target area may completely or partially overlap at least the main body part of the target object.

FIG. 2 shows a schematic diagram of a target image provided by an embodiment of the present invention. The target image 10 includes a human target object 11 and two street lamps 12 in the background. If the entire target image 10 is scanned directly to determine the depth information of the human target object 11, the amount of calculation is excessive, and the irrelevant background and the information of the street lamps 12 are introduced into the calculation, so the probability of error in the depth information of the human target object 11 is high.

Therefore, in the embodiment of the present invention, the area where the human target object 11 is located can be roughly selected by a target area frame 13, which may contain the entire human target object 11 and a small part of the background area.

In addition, when the target object is large or irregularly shaped, the main body part of the target object may be determined first, and the target area is made to contain at least that main body part.

Compared with scanning the entire target image 10 directly, first delimiting the target area frame 13 containing the entire human target object 11 and then scanning only the area within the frame reduces the amount of calculation to a certain extent; moreover, the target area frame 13 filters out the two irrelevant street lamps 12 in the background, which reduces the probability of error when calculating the depth information of the human target object 11.

Specifically, selecting the area where the human target object 11 is located with the target area frame 13 can be implemented in the following two ways.

Way 1: a frame selection operation of the user is received, the target area frame 13 is generated accordingly, and the area where the human target object 11 is located is selected with the target area frame 13.

Way 2: a recognition model capable of recognizing and locating the human target object 11 in the target image 10 is trained through deep learning, so that after the target image 10 is input into the recognition model, the model automatically outputs the target area frame 13 containing the human target object 11. This way is similar to current face region positioning technology and is not described in detail here.

It should be noted that the shape of the target area is preferably a rectangle; of course, depending on actual requirements, the target area may also be circular, irregular, etc., which is not limited in the embodiment of the present invention.
Step 103: determine, in the target area, a subject feature area of the target object.

In the embodiment of the present invention, the subject feature area can accurately reflect the orientation of the target object. For example, when the movable device locates the motion trajectory of the target object, the subject feature area can represent the center of mass of the entire target object, so that the trajectory produced by the movement of the subject feature area can be taken as the trajectory produced by the movement of the target object.

Specifically, the subject feature area of the target object can be determined according to the type of the target object.

For example, referring to FIG. 2, when the target object is a human, the large movements of the limbs during motion cause a large variance in the measured depth information, so the human torso (the area ABCD in FIG. 2) can be defined as the subject feature area. The subject feature area is thus further delimited within the target area frame 13 to reduce the variance in the subsequent depth calculation.

As another example, referring to FIG. 3, when the target object in the target image 20 is a car 21 and the captured target image 20 contains an occluded area 22, the target area frame 23 encloses the entire car 21 as well as part of the occluded area 22. Since the car 21 is an object with a relatively regular shape, the area of the car 21 within the target area frame 23 but outside the occluded area 22 can be defined as the subject feature area, so as to reduce the variance caused by the occluded area 22 in the subsequent depth calculation.
Step 104: determine the depth information of the target object according to the initial depth information of the subject feature area.

In the embodiment of the present invention, the perception of depth information is the prerequisite for human stereoscopic vision. In an image, the depth information refers to the number of bits used to store each pixel, which determines the number of possible colors of each pixel of a color image, or the number of possible gray levels of each pixel of a grayscale image.

In reality, an object within the observation range of the human eye exhibits a depth change from near to far. For example, when a ruler is placed horizontally on a desktop and the user looks at it while standing at the end where the scale begins, the scale marks appear to grow from small to large, and as the line of sight moves towards the other end of the ruler, the intervals between the marks seem to shrink. This is the effect of depth information on human vision.

In the field of computer vision, the depth information of an object can be a grayscale map that contains the depth information of every pixel; the magnitude of the depth is expressed by the gray level, and the grayscale map thus represents, through its gray gradient, how far the object is from the camera.

Therefore, in the embodiment of the present invention, by acquiring the depth information of objects near the movable device, operations such as orientation positioning and ranging can be performed on those objects, thereby improving the intelligent experience of the movable device.

Specifically, since step 103 has determined the subject feature area of the target object, the depth information of the target object can be determined from the initial depth information of the subject feature area.

The initial depth information of the subject feature area can be acquired in several ways. In one implementation, current movable devices may be configured with a binocular camera module, so the depth information of the target object can be acquired by passive ranging sensing: two cameras separated by a fixed distance acquire two images of the same target object at the same time, the pixels corresponding to the subject feature area are found in the two images by a stereo matching algorithm, and disparity information is then calculated according to the triangulation principle; the disparity information can be converted into the initial depth information that characterizes the subject feature area in the scene.

In another implementation, the initial depth information of the subject feature area can be acquired by active ranging sensing. Compared with passive ranging sensing, the most distinctive feature of active ranging sensing is that the energy emitted by the device itself is used to collect the initial depth information, which ensures that the acquisition of the depth image is independent of the acquisition of the color image. Therefore, in the embodiment of the present invention, the movable device may emit continuous near-infrared pulses towards the target object and use its sensor to receive the light pulses reflected back by the target object; by comparing the phase difference between the emitted light pulses and the reflected light pulses, the transmission delay of the light pulses can be deduced, and from it the distance of the target object relative to the emitter, finally yielding a depth image containing the initial depth information corresponding to the subject feature area of the target object.
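The passage gives no formulas, but under a standard reading of phase-based time-of-flight ranging (our assumption, not the patent's), the round-trip delay is recovered from the measured phase difference and converted to distance as:

```latex
% Round-trip delay \tau from the phase difference \Delta\varphi measured at
% modulation frequency f_mod; the factor 2 accounts for the out-and-back path.
\[
  \tau = \frac{\Delta\varphi}{2\pi f_{\mathrm{mod}}},
  \qquad
  Z = \frac{c\,\tau}{2}
\]
```

where c is the speed of light and Z is the distance of the reflecting point from the emitter.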
Further, after the initial depth information of the subject feature area is determined, the depth information of the corresponding target object can be obtained by averaging the initial depth information over the subject feature area. Because the interference of the background, occluding objects and non-subject parts of the target object in the target image has been removed, the probability of introducing useless information when calculating the depth information is reduced and the accuracy of the depth information is improved. In addition, the embodiment of the present invention scans and processes only the local subject feature area of the target image to obtain the depth information of the corresponding target object; compared with scanning and processing the entire target image directly, this reduces the amount of calculation and improves the processing efficiency.
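A minimal sketch of this averaging step, assuming a per-pixel depth map and a binary mask of the subject feature area are already available (both argument names are illustrative):

```python
import numpy as np

def object_depth(depth_map: np.ndarray, subject_mask: np.ndarray) -> float:
    """Average the per-pixel initial depth over the subject feature area.

    depth_map    -- HxW array of per-pixel depth values (0 marks invalid pixels)
    subject_mask -- HxW boolean array, True inside the subject feature area
    """
    valid = subject_mask & (depth_map > 0)  # ignore pixels with no depth reading
    if not valid.any():
        raise ValueError("no valid depth samples in the subject feature area")
    return float(depth_map[valid].mean())
```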
Step 105: determine the three-dimensional physical information of the target object according to the depth information.

In this step, to track the target object, its three-dimensional physical information must be obtained. Therefore, the three-dimensional physical information of the target object can be further determined according to the depth information of the target object; the three-dimensional physical information can be used to indicate the orientation of the target object and its trajectory during motion.

Specifically, the depth information of an object can be a grayscale map that contains the depth information of every pixel; the magnitude of the depth is expressed by the gray level, and the gray gradient represents how far the object is from the camera. The depth information of the target object can be converted into a grayscale map, and by calculating the gray gradient values in the map and using the correspondence between gray gradient values and distance, the distance between the target object and the movable device can be determined, and thus the position coordinates of the target object at different moments. The position coordinates of the target object at different moments can then be associated with the corresponding moments, and optionally plotted on a concrete map, to obtain the three-dimensional physical information of the target object.
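One way to turn a depth reading at a pixel into a position coordinate is standard pinhole back-projection; the intrinsics, pixel values and timestamps below are hypothetical, not taken from the patent:

```python
import numpy as np

def pixel_to_camera_coords(u, v, z, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth z into 3-D camera coordinates
    using the pinhole model: X = (u - cx) * z / fx, Y = (v - cy) * z / fy."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# hypothetical intrinsics and a tracked subject-centroid pixel at two moments
fx = fy = 700.0
cx, cy = 320.0, 240.0
p_t1 = pixel_to_camera_coords(350, 260, z=4.2, fx=fx, fy=fy, cx=cx, cy=cy)
p_t2 = pixel_to_camera_coords(360, 258, z=4.0, fx=fx, fy=fy, cx=cx, cy=cy)
trajectory = {"T1": p_t1, "T2": p_t2}  # position coordinates keyed by moment
```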
In summary, the image processing method provided by the embodiment of the present invention acquires a target image including a target object; determines a target area in the target image, where at least the main body part of the target object is located in the target area; determines a subject feature area of the target object in the target area; determines the depth information of the target object according to the initial depth information of the subject feature area; and determines the three-dimensional physical information of the target object according to the depth information. In the process of obtaining the depth information, the interference of the background, occluding objects and non-subject parts of the target object in the target image is removed, which reduces the probability of introducing useless information when calculating the depth information and improves the accuracy of the three-dimensional physical information. In addition, scanning and processing only the subject feature area, compared with scanning and processing the entire target image directly, reduces the amount of calculation and improves the processing efficiency.

FIG. 4 is a flowchart of the specific steps of an image processing method provided by an embodiment of the present invention. As shown in FIG. 4, the method may include the following steps.

Step 201: acquire a target image including a target object.

For details of this step, refer to step 101 above; they are not repeated here.

Step 202: determine a target area in the target image, where at least the main body part of the target object is located in the target area.

For details of this step, refer to step 102 above; they are not repeated here.

Step 203: divide the target area into multiple sub-areas by extracting the edge features of the target area.

In the embodiment of the present invention, edge features represent the clearly changing edges or discontinuous areas in an image. Since an edge is the boundary line between different areas of an image, an edge image can be a binary image, and the purpose of edge detection is to capture the areas where the brightness changes sharply. Ideally, performing edge detection on the target area yields edge features composed of a series of continuous curves that represent the boundaries of objects, and the intersections between the edge features divide the entire target area into multiple sub-areas.
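A minimal sketch of this edge-based split, using OpenCV's Canny detector and connected-component labelling (the thresholds and the assumption of a BGR crop of the target area are illustrative, not the patent's):

```python
import cv2
import numpy as np

def split_into_subregions(target_area: np.ndarray):
    """Split the target area into sub-regions bounded by detected edges.

    target_area -- BGR crop of the target area frame
    Returns the number of sub-regions and an HxW label map (0 = edge pixels).
    """
    gray = cv2.cvtColor(target_area, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)  # binary edge map; thresholds illustrative
    # treat edge pixels as boundaries; label the connected non-edge regions
    non_edge = (edges == 0).astype(np.uint8)
    num_labels, labels = cv2.connectedComponents(non_edge)
    return num_labels - 1, labels     # labels 1..num_labels-1 are sub-regions
```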
Step 204: determine the classification categories of the multiple sub-areas through a classification model.

Optionally, step 204 may also be implemented by determining the classification categories of the multiple sub-areas through a convolutional neural network model, or through a classifier.

Based on deep learning, a classification model can be trained on a training data set; the classification model is used to assign a category to each sub-area. Specifically, the training process may use the correspondence between areas with preset patterns and the categories to which those patterns belong, so that after training, the classification model can take an area as input and output the category of that area.

In this step, the multiple sub-areas of the target area can be input into the trained classification model, and the model outputs the classification category of each sub-area.
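The passage does not fix any architecture, so purely as an illustration, a tiny convolutional classifier over fixed-size sub-region crops might look like the following sketch (the class set, input size and layer sizes are all assumptions):

```python
import torch
import torch.nn as nn

class SubRegionClassifier(nn.Module):
    """Tiny CNN mapping a fixed-size sub-region crop to one of n_classes
    (e.g. torso, limb, background); architecture is illustrative only."""
    def __init__(self, n_classes: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Linear(32 * 16 * 16, n_classes)  # assumes 64x64 inputs

    def forward(self, x):                # x: (N, 3, 64, 64)
        h = self.features(x)             # -> (N, 32, 16, 16)
        return self.head(h.flatten(1))   # class logits per crop

model = SubRegionClassifier()
crops = torch.randn(8, 3, 64, 64)    # eight resized sub-region crops (dummy)
pred = model(crops).argmax(dim=1)    # predicted class index per sub-region
```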
Step 205: among the multiple sub-areas, merge the sub-areas corresponding to a target classification category to obtain the subject feature area.

In this step, the target classification category matching the subject feature area can be determined first, and the sub-areas corresponding to the target classification category are then connected to obtain the subject feature area.

For example, when the target object is a human, the large movements of the limbs during motion cause a large variance in the measured depth information, so the human torso can be defined as the subject feature area and the target classification category is set to the human-torso category; merging the sub-areas corresponding to the human-torso category yields the subject feature area.
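Merging then amounts to taking the union of the sub-regions whose predicted class matches the target class; a sketch, assuming a label map and per-region predictions of the kind produced above:

```python
import numpy as np

def subject_feature_mask(labels: np.ndarray, classes: dict, target_class: str):
    """Union all sub-regions whose predicted class equals the target class.

    labels       -- HxW map of sub-region ids (e.g. from connectedComponents)
    classes      -- mapping {sub_region_id: predicted class name}
    target_class -- e.g. "torso" for a human target
    """
    keep = [rid for rid, cls in classes.items() if cls == target_class]
    return np.isin(labels, keep)  # boolean mask of the subject feature area
```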
Optionally, when the target object is under force or in motion, the offset of the contour of the subject feature area is less than or equal to a preset threshold.

Specifically, the subject feature area is defined such that, when the target object is under force or in motion, the offset of its contour is less than or equal to a preset threshold; that is, the subject feature area remains relatively stable while the target object moves or is under force, so as to avoid introducing too much useless information when the depth information of the target object is calculated later.

The offset of the contour of the subject feature area can be measured as follows: under a fixed shooting angle, consecutive frame images including the target object are acquired, and the displacement difference of the contour of the subject feature area between adjacent frames is recorded as the offset; alternatively, the displacement difference between the contour of the subject feature area in one frame and that in a frame several frames earlier is recorded as the offset.
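The text leaves the exact offset metric open; one plausible reading, sketched below, measures the displacement of the contour's centroid between two frames (the masks are assumed to be non-empty):

```python
import cv2
import numpy as np

def contour_offset(mask_prev: np.ndarray, mask_curr: np.ndarray) -> float:
    """Displacement of the subject-area contour between two frames, taken as
    the distance between contour centroids (one reading of 'offset')."""
    def centroid(mask):
        # image moments of the binary mask; m00 > 0 since masks are non-empty
        m = cv2.moments(mask.astype(np.uint8), binaryImage=True)
        return np.array([m["m10"] / m["m00"], m["m01"] / m["m00"]])
    return float(np.linalg.norm(centroid(mask_curr) - centroid(mask_prev)))

# stable subject area: the offset stays below a preset pixel threshold
# is_stable = contour_offset(mask_t0, mask_t1) <= PRESET_THRESHOLD
```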
Step 206: determine the depth information of the target object according to the initial depth information of the subject feature area.

For details of this step, refer to step 104 above; they are not repeated here.

Step 207: determine the position coordinates of the target object at different moments according to the depth information.

In the embodiment of the present invention, the depth information of an object can be a grayscale map that contains the depth information of every pixel; the magnitude of the depth is expressed by the gray level, and the gray gradient represents how far the object is from the camera.

Therefore, the depth information of the target object can be converted into a grayscale map, and by calculating the gray gradient values in the map and using the correspondence between gray gradient values and distance, the distance between the target object and the movable device can be determined. When the target object moves, its depth information is continuously updated, so a new grayscale map can be obtained from the updated depth information; the position coordinates of the target object at different moments are thus determined from the distance between the target object and the movable device at each moment.

Step 208: determine the three-dimensional physical information of the target object according to its position coordinates at different moments.

In the embodiment of the present invention, the position coordinates of the target object at different moments can be associated with the corresponding moments to obtain the three-dimensional physical information of the target object; the coordinates may also be plotted, together with the corresponding moments, on a concrete map.

In summary, in the process of obtaining the depth information, the image processing method provided by the embodiment of the present invention removes the interference of the background, occluding objects and non-subject parts of the target object in the target image, which reduces the probability of introducing useless information when calculating the depth information and improves the accuracy of the three-dimensional physical information; moreover, scanning and processing only the subject feature area, compared with scanning and processing the entire target image directly, reduces the amount of calculation and improves the processing efficiency.
FIG. 5 is a flowchart of the specific steps of another image processing method provided by an embodiment of the present invention. As shown in FIG. 5, the method may include the following steps.

Step 301: at a preset moment, acquire a first image and a second image of the target object through a binocular camera module.

In the embodiment of the present invention, a binocular camera module can be used to determine the initial depth information of the target object. The binocular camera module includes a first camera and a second camera whose optical centers are fixed and whose optical-center spacing is fixed; it is a device that obtains the three-dimensional geometric information of a target object from multiple images based on the principle of binocular parallax. Specifically, referring to FIG. 6, which shows a schematic diagram of a target image provided by an embodiment of the present invention, at a preset moment T1 the first camera of the binocular camera module acquires a first image 30 of the target object, and at the same time the second camera acquires a second image 40 of the target object.

Step 302: determine the target areas in the first image and the second image, where at least the main body part of the target object is located in each target area.

In this step, referring to FIG. 6, a first target area 31 in the first image 30 and a second target area 41 in the second image 40 can be determined. For the method of determining the target area in an image, refer to step 102 above; it is not repeated here.

Step 303: determine, in the target areas, the subject feature areas of the target object.

In this step, referring to FIG. 6, a first subject feature area EFGH in the first target area 31 and a second subject feature area E'F'G'H' in the second target area 41 can be determined. For the method of determining the subject feature area of the target object in a target area, refer to step 103 above; it is not repeated here.
Step 304: perform matching processing on the first subject feature area of the first image and the second image, and/or on the second subject feature area of the second image and the first image, and calculate the initial depth information.

In practice, obtaining the initial depth information of the target object with the binocular camera module involves four steps: camera calibration, binocular rectification, binocular matching, and depth calculation.

Camera calibration: camera calibration is the process of eliminating the imaging distortion caused by the characteristics of the optical lenses. Through camera calibration, the intrinsic and extrinsic parameters and the distortion parameters of the first and second cameras of the binocular camera module are obtained.

Binocular rectification: after the first image and the second image are acquired, the intrinsic and extrinsic parameters and distortion parameters obtained from camera calibration are used to perform distortion removal and row alignment on the two images, yielding an undistorted first image and second image.

Binocular matching: the first subject feature area of the first image is matched against the second image, and/or the second subject feature area of the second image is matched against the first image.

Specifically, referring to FIG. 6, the pixels in the first subject feature area EFGH can be matched against the pixels of the entire second image 40, or the pixels in the second subject feature area E'F'G'H' can be matched against the pixels of the entire first image 30, or both matching operations can be performed. The purpose of binocular matching is to match the pixels corresponding to the same scene in the left and right views (that is, the first image and the second image), so as to obtain the disparity values; once the disparity values are obtained, the depth information can be calculated.

In the embodiment of the present invention, since the subject feature area that accurately reflects the center of mass of the target object has already been determined, the first subject feature area of the first image can be matched against the second image, and/or the second subject feature area of the second image can be matched against the first image; either way achieves the purpose of binocular matching. Performing binocular matching only on the subject feature area, compared with performing binocular matching on the entire first and second images directly, reduces the amount of calculation and improves the processing efficiency.
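A sketch of ROI-restricted binocular matching with OpenCV's semi-global matcher; the file names, box coordinates and matcher parameters are illustrative, and the frames are assumed to be already rectified:

```python
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # hypothetical frames
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

x, y, w, h = 180, 120, 160, 240   # hypothetical subject-feature-area box
x0 = max(0, x - 64)               # pad leftwards to keep room for disparity
roi_l = left[y:y + h, x0:x + w]   # crop both views with the same padded box
roi_r = right[y:y + h, x0:x + w]

stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=5)
disp = stereo.compute(roi_l, roi_r).astype(float) / 16.0  # SGBM output is x16
```

Matching only the padded crop, rather than the full frames, is what gives the calculation saving the paragraph above describes.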
Optionally, step 304 may specifically include the following sub-steps.

Sub-step 3041: match the first subject feature area of the first image against the second image, and/or match the second subject feature area of the second image against the first image, to obtain the disparity values.

In this step, the depth-calculation part of determining the depth information can be performed. Calculating the depth information first requires calculating the disparity between the first camera and the second camera, as follows.

Referring to FIG. 7, which shows a scene diagram of acquiring the initial depth information of a target object provided by an embodiment of the present invention, P is a point in the subject feature area of the target object, and OR and OT are the optical centers of the first camera and the second camera, respectively. The imaging points of point P on the photoreceptors of the two cameras are P and P' (the imaging planes are shown rotated to lie in front of the lenses), f is the focal length of the cameras, and B is the distance between the camera centers. Let the disparity between point P and point P' be dis; then dis = B - (Xr - Xt).

Optionally, matching the first subject feature area of the first image against the second image, and/or matching the second subject feature area of the second image against the first image, can be implemented by matching the feature pixels extracted from the first subject feature area in the second image, and/or matching the feature pixels extracted from the second subject feature area in the first image.

Optionally, a feature pixel is a pixel whose gray-value change in the image is greater than a preset threshold, or whose curvature on an image edge is greater than a preset curvature value.

In the embodiment of the present invention, in order to further reduce the amount of data processed when calculating the initial depth information, the feature pixels extracted from the first subject feature area can be matched in the second image, and/or the feature pixels extracted from the second subject feature area can be matched in the first image. Such feature pixels can be points with sharply changing characteristics, such as corner points and boundary points of the target object.
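Such feature pixels can be picked out with a standard corner detector; a sketch with illustrative parameters (the patent does not prescribe a particular detector):

```python
import cv2
import numpy as np

gray = cv2.imread("subject_area.png", cv2.IMREAD_GRAYSCALE)  # hypothetical crop

# Corner-like pixels (sharp intensity/curvature changes), e.g. the four
# torso corners E, F, G, H mentioned below; parameter values are illustrative.
corners = cv2.goodFeaturesToTrack(gray, maxCorners=50,
                                  qualityLevel=0.01, minDistance=8)
corners = corners.reshape(-1, 2) if corners is not None else np.empty((0, 2))
```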
For example, referring to FIG. 6, the feature pixels extracted from the first subject feature area can be the four corner points E, F, G and H of the human torso, and the feature pixels extracted from the second subject feature area can be the four corner points E', F', G' and H' of the human torso.
Sub-step 3042: calculate the initial depth information according to the disparity values.

In this step, referring to FIG. 7, let the initial depth information be Z. After the disparity dis = B - (Xr - Xt) is obtained, the similar-triangle relation (B - (Xr - Xt))/B = (Z - f)/Z gives the initial depth information Z = (fB)/(Xr - Xt).
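Written out, the relations used in this and the previous sub-step are:

```latex
\[
  \mathrm{dis} = B - (X_r - X_t),
  \qquad
  \frac{B - (X_r - X_t)}{B} = \frac{Z - f}{Z}
  \;\Longrightarrow\;
  Z = \frac{fB}{X_r - X_t}
\]
```

Multiplying the similar-triangle relation through by BZ gives ZB - Z(X_r - X_t) = ZB - fB, from which Z(X_r - X_t) = fB and the stated result follows.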
Therefore, the initial depth information of the target object can be calculated from the focal length of the binocular camera module, the optical-center distance between the first camera and the second camera, and the disparity value.
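Applied per pixel over the matched subject feature area, this is a one-line conversion (a sketch; the focal length is assumed to be in pixels, so the depth carries the units of the baseline):

```python
import numpy as np

def depth_from_disparity(disp: np.ndarray, f: float, B: float) -> np.ndarray:
    """Z = f * B / d for every matched pixel; invalid (d <= 0) pixels -> 0.

    f -- focal length in pixels, B -- optical-centre baseline (e.g. metres)
    """
    z = np.zeros_like(disp, dtype=float)
    valid = disp > 0
    z[valid] = f * B / disp[valid]
    return z
```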
Optionally, after step 301, the method may further include the following steps.

Step 305: at multiple moments, acquire a first image and a second image of the target object through the binocular camera module.

In the embodiment of the present invention, a temporal matching operation can also be used to determine the key feature pixels in the subject feature area, and a corresponding weight value is assigned to each key feature pixel according to its confidence. In the process of calculating the depth information of the target object from the initial depth information, these weight values can be applied to the initial depth information, making the calculated depth information of the target object more stable and accurate.
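One simple way to apply such weights, sketched under the assumption that each key feature pixel carries an initial depth and a temporal-matching confidence (the numbers are illustrative):

```python
import numpy as np

def weighted_object_depth(depths: np.ndarray, confidences: np.ndarray) -> float:
    """Fuse per-keypoint initial depths into one object depth, weighting each
    key feature pixel by its temporal-matching confidence."""
    w = confidences / confidences.sum()   # normalize weights to sum to 1
    return float(np.dot(w, depths))

# e.g. four stable torso corners with confidences from repeated matching
depth = weighted_object_depth(np.array([4.1, 4.3, 4.2, 4.2]),
                              np.array([0.9, 0.7, 0.8, 0.9]))
```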
具体的,关键特征像素点可以为不同时刻相对较为稳定且不易发生相对位置变化的点。因此,在该步骤中,首先需要通过双目摄像模组在多个时刻,获取目标物体的第一图像以及第二图像。Specifically, the key feature pixel point may be a point that is relatively stable at different times and is unlikely to change in relative position. Therefore, in this step, it is first necessary to acquire the first image and the second image of the target object at multiple times through the binocular camera module.
For example, referring to Fig. 8, which shows a schematic diagram of target images provided by an embodiment of the present invention: at time T1, the first camera of the binocular camera module acquires a first image 30 of the target object while the second camera acquires a second image 40; at time T2, the first camera acquires a third image 50 of the target object while the second camera acquires a fourth image 60.
Further, a first target area 31 can be determined in the first image 30, a second target area 41 in the second image 40, a third target area 51 in the third image 50, and a fourth target area 61 in the fourth image 60.
Further, a first subject feature region EFGH can be determined in the first target area 31, a second subject feature region E'F'G'H' in the second target area 41, a third subject feature region IJKL in the third target area 51, and a fourth subject feature region I'J'K'L' in the fourth target area 61.
Step 306: Match the first subject feature region of the first image against the second image acquired at the corresponding time, and/or match the second subject feature region of the second image against the first image acquired at the corresponding time.
In this step, for images captured at the same instant, the image acquired by the first camera can be matched against the image acquired by the second camera to determine whether relatively stable key feature pixels exist in the two images. Specifically, the first subject feature region of the first image may be matched against the second image acquired at the corresponding time, and/or the second subject feature region of the second image may be matched against the first image acquired at the corresponding time. Referring to Fig. 8, the pixels in the first subject feature region EFGH may be matched against the pixels of the entire second image 40, or the pixels in the second subject feature region E'F'G'H' may be matched against the pixels of the entire first image 30; alternatively, both matching operations may be performed.
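One plausible realization of this region-against-image matching is normalized template matching of a small patch around each feature pixel; the sketch below assumes OpenCV, and the patch size and acceptance score are illustrative assumptions.

    import cv2

    def match_pixel(src_img, dst_img, pt, patch=11, min_score=0.8):
        # Cut a small patch around the feature pixel (x, y) in the source image.
        x, y = int(pt[0]), int(pt[1])
        h = patch // 2
        if x - h < 0 or y - h < 0:
            return None  # the pixel lies too close to the image border
        tmpl = src_img[y - h:y + h + 1, x - h:x + h + 1]
        if tmpl.shape[:2] != (patch, patch):
            return None
        # Slide the patch over the whole other image and keep the best
        # normalized-correlation response.
        res = cv2.matchTemplate(dst_img, tmpl, cv2.TM_CCOEFF_NORMED)
        _, score, _, loc = cv2.minMaxLoc(res)
        # loc is the top-left corner of the best match; re-center it.
        return (loc[0] + h, loc[1] + h) if score >= min_score else None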
Step 307: Determine a first matching success count of the matching processing.
In this step, if the position coordinates of a pixel in the image acquired by the first camera have not changed relative to the corresponding pixel in the image acquired by the second camera, it can be determined that this pixel matched successfully at that instant, which increases the confidence of this pixel as a key feature pixel. By accumulating the matching results for each pixel at every instant, the first matching success count is obtained.
For example, referring to Fig. 8, if the positions of the pixels E, F, G, and H in the first subject feature region EFGH have not changed relative to the positions of the pixels E', F', G', and H' in the second subject feature region E'F'G'H', the first matching success count of pixels E, F, G, and H is incremented by one.
步骤308、将不同时刻获取的多个所述第一图像中的特征区域之间进行匹配处理。Step 308: Perform matching processing between the feature regions in the multiple first images acquired at different times.
In this step, matching processing can be performed between the feature regions of the first images acquired at different times.
例如,参照图8,可以将T1时刻下第一主体特征区域EFGH中的像素点与T2时刻下第三主体特征区域IJKL中的像素点进行匹配处理。For example, referring to FIG. 8, the pixel points in the first subject feature region EFGH at time T1 can be matched with the pixels in the third subject feature region IJKL at time T2.
Step 309: Determine a second matching success count of the matching processing.
For example, referring to Fig. 8, if the positions of the pixels E, F, G, and H in the first subject feature region EFGH have not changed relative to the positions of the pixels I, J, K, and L in the third subject feature region IJKL, the second matching success count of pixels E, F, G, and H is incremented by one.
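The bookkeeping of steps 306 to 309 can be sketched as two counters per tracked pixel: c1 for same-instant (stereo) matches and c2 for cross-instant (temporal) matches. The tolerance value below is an assumption; the disclosure only requires that the relative position be unchanged.

    from collections import defaultdict

    # c1: stereo (same-instant) matching successes, as in steps 306-307;
    # c2: temporal (cross-instant) matching successes, as in steps 308-309.
    match_counts = defaultdict(lambda: {"c1": 0, "c2": 0})

    def record_match(pixel_id, kind, position_change, tol=1.5):
        # A match counts as successful when the relative position of the
        # matched pair is (nearly) unchanged; tol is an assumed tolerance.
        if position_change <= tol:
            key = "c1" if kind == "stereo" else "c2"
            match_counts[pixel_id][key] += 1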
可选的,在步骤304和步骤309之后,还可以包括:Optionally, after step 304 and step 309, it may further include:
Step 310: Set a weight value for the initial depth information according to the first matching success count and the second matching success count, the weight value being positively correlated with the number of successful matches.
In this step, the weight value P_t corresponding to the initial depth information can be determined from the first matching success count c1 and the second matching success count c2 according to Formula 1:

(Formula 1: in the published document this equation appears only as an image, PCTCN2019089425-appb-000001, and is not reproduced here.)
Specifically, referring to Fig. 8, at time T1, for point E in the first image 30, the first matching success count c1 can be obtained from the matching operation between point E and point E' in the second image; at the same time, the initial depth information Ed of point E can be computed by binocular matching, and the initial confidence of point E is set as:

(Initial-confidence expression: in the published document this appears only as an image, PCTCN2019089425-appb-000002, and is not reproduced here.)
At time T2, for point E in the first image 30, the second matching success count c2 can be obtained from the matching operation between point E and point I in the third image, and the weight value P_t of point E is then calculated according to Formula 1 above.
In the same way, the weight values P_t of point F, point G, and point H can be obtained. It should be noted that the parameters 60 and 5 in Formula 1 can be updated and set based on experience and requirements, which is not limited in this embodiment of the present invention.
步骤311、根据所述初始深度信息,以及所述初始深度信息对应的权重值,进行加权平均计算,得到所述目标物体的深度信息。Step 311: Perform a weighted average calculation according to the initial depth information and the weight value corresponding to the initial depth information to obtain the depth information of the target object.
When the initial depth information of the subject feature region is averaged to obtain the depth information of the corresponding target object, the weight value P_t of point E is applied to the initial depth information corresponding to point E in the subject feature region, so that the averaging is weighted by the confidence of each pixel and the computed depth information of the target object is more stable and accurate.
For example, referring to Fig. 8, assume the pixels E, F, G, and H in the first subject feature region EFGH have been determined to be feature pixels, and that the weight value P_t1 and the initial depth information Ed have been computed for point E, the weight value P_t2 and the initial depth information Fd for point F, the weight value P_t3 and the initial depth information Gd for point G, and the weight value P_t4 and the initial depth information Hd for point H. The depth information of the target object is then finally obtained by the weighted average:

depth = (P_t1·Ed + P_t2·Fd + P_t3·Gd + P_t4·Hd) / (P_t1 + P_t2 + P_t3 + P_t4)

(In the published document this equation appears only as an image, PCTCN2019089425-appb-000003; the expression above is reconstructed from the weighted-average description in step 311.)
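This weighted average can be sketched directly; the function below assumes per-pixel (weight, initial depth) pairs such as (P_t1, Ed) through (P_t4, Hd), and the name and error handling are illustrative.

    def target_depth(weighted_points):
        # weighted_points: (weight P_t, initial depth) pairs, e.g.
        # [(Pt1, Ed), (Pt2, Fd), (Pt3, Gd), (Pt4, Hd)].
        total = sum(w for w, _ in weighted_points)
        if total == 0:
            raise ValueError("at least one weight must be positive")
        return sum(w * d for w, d in weighted_points) / total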
Further, in practical applications, Fig. 9 shows a probability distribution diagram of the temporal matching operation provided by an embodiment of the present invention. The abscissa is the frame number, where the first image and the second image of the target object acquired by the binocular camera module at one of the multiple instants are regarded as one frame of the continuous multi-frame sequence. The ordinate is the probability, which represents the confidence. When matching is performed between the feature regions of the first images acquired at different times, consecutive successful matches raise the confidence and therefore the weight; conversely, failed matches gradually lower the confidence and therefore the weight. As shown in Fig. 9, after consecutive successful matches the confidence gradually rises, up to a maximum of 100%, and once matching failures occur the confidence gradually falls, down to 0%.
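The rise-and-decay behaviour of Fig. 9 can be sketched as a simple per-frame confidence update clamped to [0, 1]; the step sizes are assumptions, since the disclosure specifies only the qualitative shape of the curve.

    def update_confidence(conf, matched, up=0.1, down=0.2):
        # Move toward 100% on a successful frame-to-frame match and toward
        # 0% on a failure; up and down are assumed step sizes.
        conf = conf + up if matched else conf - down
        return min(1.0, max(0.0, conf))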
步骤312、根据所述深度信息确定所述目标物体的三维物理信息。Step 312: Determine three-dimensional physical information of the target object according to the depth information.
该步骤具体可以参照上述步骤105,此处不再赘述。For details of this step, refer to the above step 105, which will not be repeated here.
In summary, in the image processing method provided by the embodiments of the present invention, a target image including a target object is acquired; a target area is determined in the target image, with at least the main body of the target object located in the target area; a subject feature region of the target object is determined within the target area; the depth information of the target object is determined from the initial depth information of the subject feature region; and the three-dimensional physical information of the target object is determined from the depth information. Because the interference of the background, occluding objects, and non-subject parts of the target object in the target image is removed when obtaining the depth information, the probability of introducing useless information into the depth computation is reduced and the accuracy of the three-dimensional physical information is improved. In addition, the present invention scans and processes only the subject feature region to obtain the three-dimensional physical information of the corresponding target object; compared with scanning and processing the entire target image directly, this reduces the amount of computation and improves processing efficiency.
图10是本发明实施例提供的一种图像处理装置的框图,如图10所示,该图像处理装置400可以包括:接收器401和处理器402;FIG. 10 is a block diagram of an image processing apparatus according to an embodiment of the present invention. As shown in FIG. 10, the image processing apparatus 400 may include: a receiver 401 and a processor 402;
所述接收器401用于执行:获取包括目标物体的目标图像;The receiver 401 is configured to perform: acquiring a target image including a target object;
所述处理器402用于执行:The processor 402 is configured to execute:
确定所述目标图像中的目标区域,所述目标物体至少主体部分位于所述目标区域内;Determining a target area in the target image, and at least a main body of the target object is located in the target area;
在所述目标区域中,确定所述目标物体的主体特征区域;In the target area, determine the main feature area of the target object;
根据所述主体特征区域的初始深度信息,确定所述目标物体的深度信息;Determine the depth information of the target object according to the initial depth information of the subject feature area;
根据所述深度信息确定所述目标物体的三维物理信息。The three-dimensional physical information of the target object is determined according to the depth information.
可选的,所述处理器402还用于执行:Optionally, the processor 402 is further configured to execute:
通过提取所述目标区域的边缘特征,将所述目标区域划分为多个子区域;Dividing the target area into multiple sub-areas by extracting edge features of the target area;
通过分类模型,确定多个所述子区域的分类类别;Determine the classification categories of multiple sub-regions through a classification model;
在多个所述子区域中,合并与目标分类类别对应的子区域,得到所述主体特征区域。Among the multiple sub-regions, sub-regions corresponding to the target classification category are merged to obtain the subject feature region.
可选的,当所述目标物体处于受力或运动状态时,所述主体特征区域的轮廓的偏移量小于或等于预设阈值。Optionally, when the target object is under force or in motion, the offset of the contour of the main feature region is less than or equal to a preset threshold.
可选的,所述处理器402还用于执行:Optionally, the processor 402 is further configured to execute:
通过卷积神经网络模型,确定多个所述子区域的分类类别;Using a convolutional neural network model to determine the classification categories of multiple sub-regions;
或,通过分类器,确定多个所述子区域的分类类别。Or, through a classifier, the classification categories of multiple sub-regions are determined.
可选的,所述接收器401还用于执行:Optionally, the receiver 401 is further configured to perform:
在预设时刻,通过双目摄像模组获取所述目标物体的第一图像以及第二图像。At a preset moment, the first image and the second image of the target object are acquired through the binocular camera module.
可选的,所述处理器402还用于执行:Optionally, the processor 402 is further configured to execute:
Perform matching processing between the first subject feature region of the first image and the second image, and/or between the second subject feature region of the second image and the first image, to calculate the initial depth information;
根据所述初始深度信息,确定所述目标物体的深度信息。According to the initial depth information, the depth information of the target object is determined.
可选的,所述处理器402还用于执行:Optionally, the processor 402 is further configured to execute:
Perform matching processing between the first subject feature region of the first image and the second image, and/or between the second subject feature region of the second image and the first image, to obtain a disparity value;
根据所述视差值,计算得到所述初始深度信息。According to the disparity value, the initial depth information is calculated.
可选的,所述处理器402还用于执行:Optionally, the processor 402 is further configured to execute:
Perform matching processing, in the second image, on the feature pixels extracted from the first feature region; and/or perform matching processing, in the first image, on the feature pixels extracted from the second subject feature region.
Optionally, a feature pixel is a pixel whose gray-value variation in the image exceeds a preset threshold, or whose curvature on an image edge exceeds a preset curvature value.
可选的,所述接收器401还用于执行:Optionally, the receiver 401 is further configured to perform:
在多个时刻,通过所述双目摄像模组获取所述目标物体的第一图像以及第二图像。At multiple times, the first image and the second image of the target object are acquired through the binocular camera module.
可选的,所述处理器402还用于执行:Optionally, the processor 402 is further configured to execute:
Perform matching processing between the first subject feature region of the first image and the second image acquired at the corresponding time, and/or between the second subject feature region of the second image and the first image acquired at the corresponding time;
Determine a first matching success count of the matching processing.
可选的,所述处理器402还用于执行:Optionally, the processor 402 is further configured to execute:
将不同时刻获取的多个所述第一图像中的特征区域之间进行匹配处理;Performing matching processing between the feature regions in the multiple first images acquired at different times;
Determine a second matching success count of the matching processing.
可选的,所述处理器402还用于执行:Optionally, the processor 402 is further configured to execute:
Set a weight value for the initial depth information according to the first matching success count and the second matching success count, the weight value being positively correlated with the number of successful matches;
根据所述初始深度信息,以及所述初始深度信息对应的权重值,进行加权平均计算,得到所述目标物体的深度信息。According to the initial depth information and the weight value corresponding to the initial depth information, a weighted average calculation is performed to obtain the depth information of the target object.
可选的,所述处理器402还用于执行:Optionally, the processor 402 is further configured to execute:
根据所述深度信息,确定所述目标物体在不同时刻下的位置坐标;Determine the position coordinates of the target object at different times according to the depth information;
根据所述目标物体在不同时刻下的位置坐标,确定所述目标物体的三维物理信息。The three-dimensional physical information of the target object is determined according to the position coordinates of the target object at different times.
In summary, the image processing apparatus provided by the embodiments of the present invention acquires a target image including a target object; determines a target area in the target image, with at least the main body of the target object located in the target area; determines a subject feature region of the target object within the target area; determines the depth information of the target object from the initial depth information of the subject feature region; and determines the three-dimensional physical information of the target object from the depth information. Because the interference of the background, occluding objects, and non-subject parts of the target object in the target image is removed when obtaining the depth information, the probability of introducing useless information into the depth computation is reduced and the accuracy of the three-dimensional physical information is improved. In addition, the present invention scans and processes only the subject feature region to obtain the three-dimensional physical information of the corresponding target object; compared with scanning and processing the entire target image directly, this reduces the amount of computation and improves processing efficiency.
本发明实施例还提供一种计算机可读存储介质,所述计算机可读存储介质上存储计算机程序,所述计算机程序被处理器执行时实现所述的图像处理方法的步骤。The embodiment of the present invention also provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the steps of the image processing method are implemented.
An embodiment of the present invention further provides a control terminal, which includes the image processing apparatus, a transmitting device, and a receiving device. The transmitting device sends a shooting instruction to a movable device, the receiving device receives an image captured by the movable device, and the image processing apparatus processes the image.
Referring to Fig. 11, an embodiment of the present invention further provides a movable device 500, which includes a photographing apparatus 501 and the image processing apparatus 400 described in Fig. 10; the image processing apparatus 400 receives images captured by the photographing apparatus 501 and performs image processing on them.
Optionally, the movable device 500 further includes a controller 502 and a power system 503, and the controller 502 controls the power output of the power system 503 according to the processing result of the image processing apparatus 400.
具体的,动力系统包括驱动桨的电机以及驱动云台动作的电机,故控制器502可以根据图像处理结果可以改变可移动设备500的姿态或云台朝向(即拍摄装置501朝向)。Specifically, the power system includes a motor that drives the propeller and a motor that drives the movement of the pan/tilt. Therefore, the controller 502 can change the posture of the movable device 500 or the orientation of the pan/tilt (that is, the orientation of the camera 501) according to the image processing result.
可选的,所述图像处理装置400集成于所述控制器502中。Optionally, the image processing device 400 is integrated in the controller 502.
可选的,所述可移动设备500包括无人机、无人车、无人船、手持拍摄设备中的至少一种。Optionally, the movable device 500 includes at least one of a drone, an unmanned vehicle, an unmanned boat, and a handheld camera.
Fig. 12 is a schematic diagram of the hardware structure of a control terminal implementing various embodiments of the present invention. The control terminal 600 includes, but is not limited to, a radio frequency unit 601, a network module 602, an audio output unit 603, an input unit 604, a sensor 605, a display unit 606, a user input unit 607, an interface unit 608, a memory 609, a processor 610, and a power supply 611. Those skilled in the art will understand that the control terminal structure shown in Fig. 12 does not limit the control terminal; the control terminal may include more or fewer components than shown, combine certain components, or arrange the components differently. In the embodiments of the present invention, the control terminal includes, but is not limited to, a mobile phone, tablet computer, notebook computer, palmtop computer, vehicle-mounted terminal, wearable device, pedometer, and the like.
It should be understood that, in this embodiment of the present invention, the radio frequency unit 601 can be used to receive and send signals during information transmission or a call; specifically, it receives downlink data from a base station and passes it to the processor 610 for processing, and sends uplink data to the base station. Generally, the radio frequency unit 601 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier, and a duplexer. In addition, the radio frequency unit 601 can also communicate with a network and other devices through a wireless communication system.
控制终端通过网络模块602为用户提供了无线的宽带互联网访问,如帮助用户收发电子邮件、浏览网页和访问流式媒体等。The control terminal provides users with wireless broadband Internet access through the network module 602, such as helping users to send and receive emails, browse web pages, and access streaming media.
音频输出单元603可以将射频单元601或网络模块602接收的或者在存储器609中存储的音频数据转换成音频信号并且输出为声音。而且,音频输出单元603还可以提供与控制终端600执行的特定功能相关的音频输出(例如,呼叫信号接收声音、消息接收声音等等)。音频输出单元603包括扬声器、蜂鸣器以及受话器等。The audio output unit 603 can convert the audio data received by the radio frequency unit 601 or the network module 602 or stored in the memory 609 into audio signals and output them as sounds. Moreover, the audio output unit 603 may also provide audio output related to a specific function performed by the control terminal 600 (for example, call signal reception sound, message reception sound, etc.). The audio output unit 603 includes a speaker, a buzzer, a receiver, and the like.
The input unit 604 is configured to receive audio or video signals. The input unit 604 may include a graphics processing unit (GPU) 6041 and a microphone 6042. The graphics processor 6041 processes image data of still pictures or video obtained by an image capture apparatus (such as a camera) in video capture mode or image capture mode. The processed image frames may be displayed on the display unit 606. Image frames processed by the graphics processor 6041 may be stored in the memory 609 (or another storage medium) or sent via the radio frequency unit 601 or the network module 602. The microphone 6042 can receive sound and process it into audio data; in telephone call mode, the processed audio data can be converted into a format that can be sent to a mobile communication base station via the radio frequency unit 601.
The control terminal 600 further includes at least one sensor 605, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor and a proximity sensor; the ambient light sensor can adjust the brightness of the display panel 6061 according to the ambient light, and the proximity sensor can turn off the display panel 6061 and/or the backlight when the control terminal 600 is moved to the ear. As a motion sensor, an accelerometer can detect the magnitude of acceleration in all directions (generally three axes) and, when stationary, the magnitude and direction of gravity; it can be used to recognize the attitude of the control terminal (such as landscape/portrait switching, related games, and magnetometer attitude calibration) and for vibration-recognition functions (such as pedometer and tapping). The sensor 605 may further include a fingerprint sensor, pressure sensor, iris sensor, molecular sensor, gyroscope, barometer, hygrometer, thermometer, infrared sensor, and the like, which are not described here.
显示单元606用于显示由用户输入的信息或提供给用户的信息。显示单元606可包括显示面板6061,可以采用液晶显示器(Liquid Crystal Display,LCD)、有机发光二极管(Organic Light-Emitting Diode,OLED)等形式来配置显示面板6061。The display unit 606 is used to display information input by the user or information provided to the user. The display unit 606 may include a display panel 6061, and the display panel 6061 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), etc.
The user input unit 607 may be configured to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the control terminal. Specifically, the user input unit 607 includes a touch panel 6071 and other input devices 6072. The touch panel 6071, also called a touch screen, can collect the user's touch operations on or near it (for example, operations performed on or near the touch panel 6071 with a finger, stylus, or any other suitable object or accessory). The touch panel 6071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the user's touch position, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 610, and receives and executes commands sent by the processor 610. In addition, the touch panel 6071 may be implemented as a resistive, capacitive, infrared, or surface-acoustic-wave panel. Besides the touch panel 6071, the user input unit 607 may also include other input devices 6072, which may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick; these are not described here.
Further, the touch panel 6071 may cover the display panel 6061. When the touch panel 6071 detects a touch operation on or near it, it transmits the operation to the processor 610 to determine the type of touch event, and the processor 610 then provides corresponding visual output on the display panel 6061 according to the type of touch event. Although in the figure the touch panel 6071 and the display panel 6061 are two independent components implementing the input and output functions of the control terminal, in some embodiments the touch panel 6071 and the display panel 6061 may be integrated to implement the input and output functions of the control terminal; this is not limited here.
The interface unit 608 is an interface for connecting an external device to the control terminal 600. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 608 may be used to receive input (for example, data information or power) from an external device and transmit the received input to one or more elements within the control terminal 600, or may be used to transfer data between the control terminal 600 and an external device.
The memory 609 may be used to store software programs and various data. The memory 609 may mainly include a program storage area and a data storage area; the program storage area may store an operating system and application programs required by at least one function (such as a sound playback function and an image playback function), while the data storage area may store data created according to the use of the mobile phone (such as audio data and a phone book). In addition, the memory 609 may include a high-speed random access memory and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The processor 610 is the control center of the control terminal. It connects all parts of the control terminal using various interfaces and lines, and performs the various functions of the control terminal and processes data by running or executing the software programs and/or modules stored in the memory 609 and invoking the data stored in the memory 609, thereby monitoring the control terminal as a whole. The processor 610 may include one or more processing units; preferably, the processor 610 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 610.
The control terminal 600 may further include a power supply 611 (such as a battery) that supplies power to the various components. Preferably, the power supply 611 may be logically connected to the processor 610 through a power management system, thereby implementing functions such as charge management, discharge management, and power consumption management through the power management system.
另外,控制终端600包括一些未示出的功能模块,在此不再赘述。In addition, the control terminal 600 includes some functional modules not shown, which will not be repeated here.
Preferably, an embodiment of the present invention further provides a control terminal, including a processor 610, a memory 609, and a computer program stored in the memory 609 and executable on the processor 610. When the computer program is executed by the processor 610, each process of the above image processing method embodiments is implemented and the same technical effects can be achieved; to avoid repetition, details are not repeated here.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, each process of the above image processing method embodiments is implemented and the same technical effects can be achieved; to avoid repetition, details are not repeated here. The computer-readable storage medium may be, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Referring to Fig. 12, an embodiment of the present invention further provides a movable device, including a photographing apparatus and the image processing apparatus described in Fig. 10; the image processing apparatus receives images captured by the photographing apparatus and performs image processing on them.
Optionally, the movable device further includes a controller and a power system, and the controller controls the power output of the power system according to the processing result of the image processing apparatus.
可选的,所述图像处理装置集成于所述控制器中。Optionally, the image processing device is integrated in the controller.
可选的,所述可移动设备包括无人机、无人车、无人船、手持拍摄设备中的至少一种。Optionally, the movable equipment includes at least one of a drone, an unmanned vehicle, an unmanned boat, and a handheld camera.
本说明书中的各个实施例均采用递进的方式描述,每个实施例重点说明的都是与其他实施例的不同之处,各个实施例之间相同相似的部分互相参见即可。The various embodiments in this specification are described in a progressive manner. Each embodiment focuses on the differences from other embodiments, and the same or similar parts between the various embodiments can be referred to each other.
本领域内的技术人员应明白,本申请的实施例可提供为方法、控制终端、或计算机程序产品。因此,本申请可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且, 本申请可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。Those skilled in the art should understand that the embodiments of the present application can be provided as a method, a control terminal, or a computer program product. Therefore, the present application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, this application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.
This application is described with reference to flowcharts and/or block diagrams of the method, terminal device (system), and computer program product according to this application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing terminal device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, where the instruction apparatus implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, so that a series of operation steps are executed on the computer or other programmable terminal device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
尽管已描述了本申请的优选实施例,但本领域内的技术人员一旦得知了基本创造性概念,则可对这些实施例做出另外的变更和修改。所以,所附权利要求意欲解释为包括优选实施例以及落入本申请范围的所有变更和修改。Although the preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn the basic creative concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of the present application.
Finally, it should also be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or terminal device that includes the element.
The image processing method and control terminal provided by this application have been described in detail above. Specific examples are used herein to explain the principles and implementations of this application, and the description of the above embodiments is only intended to help understand the method of this application and its core idea. Meanwhile, for those of ordinary skill in the art, there will be changes in the specific implementation and scope of application based on the idea of this application. In summary, the content of this specification should not be construed as limiting this application.

Claims (35)

  1. 一种图像处理方法,其特征在于,所述方法包括:An image processing method, characterized in that the method includes:
    获取包括目标物体的目标图像;Acquiring a target image including the target object;
    确定所述目标图像中的目标区域,所述目标物体至少主体部分位于所述目标区域内;Determining a target area in the target image, and at least a main body of the target object is located in the target area;
    在所述目标区域中,确定所述目标物体的主体特征区域;In the target area, determine the main feature area of the target object;
    根据所述主体特征区域的初始深度信息,确定所述目标物体的深度信息;Determine the depth information of the target object according to the initial depth information of the subject feature area;
    根据所述深度信息确定所述目标物体的三维物理信息。The three-dimensional physical information of the target object is determined according to the depth information.
  2. 根据权利要求1所述的方法,其特征在于,所述在所述目标区域中,确定所述目标物体的主体特征区域的步骤,包括:The method according to claim 1, wherein the step of determining the main characteristic area of the target object in the target area comprises:
    通过提取所述目标区域的边缘特征,将所述目标区域划分为多个子区域;Dividing the target area into multiple sub-areas by extracting edge features of the target area;
    通过分类模型,确定多个所述子区域的分类类别;Determine the classification categories of multiple sub-regions through a classification model;
    在多个所述子区域中,合并与目标分类类别对应的子区域,得到所述主体特征区域。Among the multiple sub-regions, sub-regions corresponding to the target classification category are merged to obtain the subject feature region.
  3. 根据权利要求1至2任一所述的方法,其特征在于,当所述目标物体处于受力或运动状态时,所述主体特征区域的轮廓的偏移量小于或等于预设阈值。The method according to any one of claims 1 to 2, wherein when the target object is under force or in motion, the offset of the contour of the subject characteristic region is less than or equal to a preset threshold.
  4. 根据权利要求2所述的方法,其特征在于,所述通过分类模型,确定多个所述子区域的分类类别的步骤,包括:The method according to claim 2, wherein the step of determining the classification categories of a plurality of the sub-regions through a classification model comprises:
    通过卷积神经网络模型,确定多个所述子区域的分类类别;Using a convolutional neural network model to determine the classification categories of multiple sub-regions;
    或,通过分类器,确定多个所述子区域的分类类别。Or, through a classifier, the classification categories of multiple sub-regions are determined.
  5. 根据权利要求1所述的方法,其特征在于,所述获取包括目标物体的目标图像的步骤,包括:The method according to claim 1, wherein the step of obtaining a target image including a target object comprises:
    在预设时刻,通过双目摄像模组获取所述目标物体的第一图像以及第二图像。At a preset moment, the first image and the second image of the target object are acquired through the binocular camera module.
  6. 根据权利要求5所述的方法,其特征在于,所述根据所述主体特征区域中像素点的初始深度信息,确定所述目标物体的深度信息的步骤,包括:The method according to claim 5, wherein the step of determining the depth information of the target object according to the initial depth information of the pixels in the subject feature area comprises:
    performing matching processing between the first subject feature region of the first image and the second image, and/or between the second subject feature region of the second image and the first image, to calculate the initial depth information;
    根据所述初始深度信息,确定所述目标物体的深度信息。According to the initial depth information, the depth information of the target object is determined.
  7. The method according to claim 6, wherein performing matching processing between the first subject feature region of the first image and the second image, and/or between the second subject feature region of the second image and the first image, to calculate the initial depth information specifically comprises:
    performing matching processing between the first subject feature region of the first image and the second image, and/or between the second subject feature region of the second image and the first image, to obtain a disparity value;
    根据所述视差值,计算得到所述初始深度信息。According to the disparity value, the initial depth information is calculated.
  8. The method according to claim 6 or 7, wherein performing matching processing between the first subject feature region of the first image and the second image, and/or between the second subject feature region of the second image and the first image specifically comprises:
    performing matching processing, in the second image, on the feature pixels extracted from the first feature region; and/or performing matching processing, in the first image, on the feature pixels extracted from the second subject feature region.
  9. The method according to claim 8, wherein a feature pixel is a pixel whose gray-value variation in the image exceeds a preset threshold or whose curvature on an image edge exceeds a preset curvature value.
  10. The method according to claim 6, wherein after performing matching processing between the first subject feature region of the first image and the second image, and/or between the second subject feature region of the second image and the first image, to calculate the initial depth information, the method further comprises:
    在多个时刻,通过所述双目摄像模组获取所述目标物体的第一图像以及第二图像。At multiple times, the first image and the second image of the target object are acquired through the binocular camera module.
  11. 根据权利要求10所述的方法,在多个时刻,通过所述双目摄像模组获取所述目标物体的第一图像以及第二图像之后,还包括:The method according to claim 10, after acquiring the first image and the second image of the target object through the binocular camera module at multiple times, further comprising:
    performing matching processing between the first subject feature region of the first image and the second image acquired at the corresponding time, and/or between the second subject feature region of the second image and the first image acquired at the corresponding time;
    determining a first matching success count of the matching processing.
  12. 根据权利要求11所述的方法,其特征在于,在多个时刻,通过所述双目摄像模组获取所述目标物体的第一图像以及第二图像之后,还包括:The method according to claim 11, wherein after acquiring the first image and the second image of the target object through the binocular camera module at multiple times, the method further comprises:
    将不同时刻获取的多个所述第一图像中的特征区域之间进行匹配处理;Performing matching processing between the feature regions in the multiple first images acquired at different times;
    determining a second matching success count of the matching processing.
  13. 根据权利要求12所述的方法,所述根据所述初始深度信息,确定所述目标物体的深度信息,具体包括:The method according to claim 12, wherein the determining the depth information of the target object according to the initial depth information specifically includes:
    setting a weight value for the initial depth information according to the first matching success count and the second matching success count, the weight value being positively correlated with the number of successful matches;
    根据所述初始深度信息,以及所述初始深度信息对应的权重值,进行加权平均计算,得到所述目标物体的深度信息。According to the initial depth information and the weight value corresponding to the initial depth information, a weighted average calculation is performed to obtain the depth information of the target object.
  14. 根据权利要求1所述的方法,其特征在于,所述根据所述深度信息确定所述目标物体的三维物理信息,包括:The method of claim 1, wherein the determining three-dimensional physical information of the target object according to the depth information comprises:
    根据所述深度信息,确定所述目标物体在不同时刻下的位置坐标;Determine the position coordinates of the target object at different times according to the depth information;
    根据所述目标物体在不同时刻下的位置坐标,确定所述目标物体的三维物理信息。The three-dimensional physical information of the target object is determined according to the position coordinates of the target object at different times.
  15. 一种图像处理装置,其特征在于,所述装置包括:接收器和处理器;An image processing device, characterized in that the device includes: a receiver and a processor;
    所述接收器用于执行:获取包括目标物体的目标图像;The receiver is configured to perform: acquiring a target image including a target object;
    所述处理器用于执行:The processor is used to execute:
    确定所述目标图像中的目标区域,所述目标物体至少主体部分位于所述目标区域内;Determining a target area in the target image, and at least a main body of the target object is located in the target area;
    在所述目标区域中,确定所述目标物体的主体特征区域;In the target area, determine the main feature area of the target object;
    根据所述主体特征区域的初始深度信息,确定所述目标物体的深度信息;Determine the depth information of the target object according to the initial depth information of the subject feature area;
    根据所述深度信息确定所述目标物体的三维物理信息。The three-dimensional physical information of the target object is determined according to the depth information.
  16. 根据权利要求15所述的装置,其特征在于,所述处理器还用于执行:The apparatus according to claim 15, wherein the processor is further configured to execute:
    通过提取所述目标区域的边缘特征,将所述目标区域划分为多个子区域;Dividing the target area into multiple sub-areas by extracting edge features of the target area;
    通过分类模型,确定多个所述子区域的分类类别;Determine the classification categories of multiple sub-regions through a classification model;
    在多个所述子区域中,合并与目标分类类别对应的子区域,得到所述主体特征区域。Among the multiple sub-regions, sub-regions corresponding to the target classification category are merged to obtain the subject feature region.
  17. 根据权利要求15至16任一所述的装置,其特征在于,当所述目标物体处于受力或运动状态时,所述主体特征区域的轮廓的偏移量小于或等于预设阈值。The device according to any one of claims 15 to 16, wherein when the target object is under force or in motion, the offset of the contour of the subject characteristic region is less than or equal to a preset threshold.
  18. 根据权利要求16所述的装置,其特征在于,所述处理器还用于执行:The device according to claim 16, wherein the processor is further configured to execute:
    通过卷积神经网络模型,确定多个所述子区域的分类类别;Using a convolutional neural network model to determine the classification categories of multiple sub-regions;
    或,通过分类器,确定多个所述子区域的分类类别。Or, through a classifier, the classification categories of multiple sub-regions are determined.
  19. 根据权利要求15所述的装置,其特征在于,所述接收器还用于执行:The device according to claim 15, wherein the receiver is further configured to perform:
    在预设时刻,通过双目摄像模组获取所述目标物体的第一图像以及第二图像。At a preset moment, the first image and the second image of the target object are acquired through the binocular camera module.
  20. 根据权利要求19所述的装置,其特征在于,所述处理器还用于执行:The device according to claim 19, wherein the processor is further configured to execute:
    Perform matching processing between the first subject feature region of the first image and the second image, and/or between the second subject feature region of the second image and the first image, to calculate the initial depth information;
    根据所述初始深度信息,确定所述目标物体的深度信息。According to the initial depth information, the depth information of the target object is determined.
  21. 根据权利要求20所述的装置,其特征在于,所述处理器还用于执行:The apparatus according to claim 20, wherein the processor is further configured to execute:
    Perform matching processing between the first subject feature region of the first image and the second image, and/or between the second subject feature region of the second image and the first image, to obtain a disparity value;
    根据所述视差值,计算得到所述初始深度信息。According to the disparity value, the initial depth information is calculated.
  22. 根据权利要求20或21所述的装置,其特征在于,所述处理器还用于执行:The device according to claim 20 or 21, wherein the processor is further configured to execute:
    Perform matching processing, in the second image, on the feature pixels extracted from the first feature region; and/or perform matching processing, in the first image, on the feature pixels extracted from the second subject feature region.
  23. The device according to claim 22, wherein a feature pixel is a pixel whose gray-value variation in the image exceeds a preset threshold or whose curvature on an image edge exceeds a preset curvature value.
  24. The device according to claim 20, wherein the receiver is further configured to:
    acquire, at a plurality of moments, the first image and the second image of the target object through the binocular camera module.
  25. The device according to claim 24, wherein the processor is further configured to:
    match the first subject feature region of the first image against the second image acquired at the corresponding moment, and/or match the second subject feature region of the second image against the first image acquired at the corresponding moment;
    determine a first number of successful matches of the matching process.
  26. The device according to claim 25, wherein the processor is further configured to:
    match the feature regions of the plurality of first images acquired at different moments against one another;
    determine a second number of successful matches of the matching process.
  27. The device according to claim 26, wherein the processor is further configured to:
    set a weight value for the initial depth information according to the first number of successful matches and the second number of successful matches, the weight value being positively correlated with the number of successful matches;
    perform a weighted-average calculation according to the initial depth information and the corresponding weight values, to obtain the depth information of the target object.
  28. The device according to claim 15, wherein the processor is further configured to:
    determine position coordinates of the target object at different moments according to the depth information;
    determine three-dimensional physical information of the target object according to the position coordinates of the target object at different moments.
  29. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the image processing method according to any one of claims 1 to 14.
  30. A control terminal, comprising the image processing device according to any one of claims 15 to 28, a transmitting device and a receiving device, wherein the transmitting device sends a shooting instruction to a movable device, the receiving device receives an image captured by the movable device, and the image processing device processes the image.
  31. The control terminal according to claim 30, wherein the movable device comprises at least one of an unmanned aerial vehicle, an unmanned vehicle, an unmanned boat, and a handheld shooting device.
  32. A movable device comprising a shooting device, wherein the movable device further comprises the image processing device according to any one of claims 15 to 28, and the image processing device receives an image captured by the shooting device and performs image processing on it.
  33. The movable device according to claim 32, wherein the movable device further comprises a controller and a power system, and the controller controls the power output of the power system according to a processing result of the image processing device.
  34. The movable device according to claim 33, wherein the image processing device is integrated in the controller.
  35. The movable device according to claim 32, wherein the movable device comprises at least one of an unmanned aerial vehicle, an unmanned vehicle, an unmanned boat, and a handheld shooting device.
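The claimed processing steps can be illustrated with short sketches. The examples below are editorial illustrations only, not part of the claims: all function names, parameters, thresholds and library choices (OpenCV/NumPy in Python) are assumptions, and each sketch shows one plausible realization rather than the patented method itself. This first sketch follows claims 16 and 18: split the target region along its edge features, classify each sub-region, and merge the sub-regions of the target category into the subject feature region. The classify callable stands in for the claimed classification model (a CNN or other classifier); segment_and_merge is an invented name.

```python
import cv2
import numpy as np

def segment_and_merge(target_region, classify, target_label):
    """Split a target region along its edges, classify each sub-region,
    and merge sub-regions of the target class into one subject mask.

    target_region: BGR crop of the tracked object (numpy array)
    classify:      callable(sub_image) -> int label (e.g. a CNN or SVM)
    target_label:  label whose sub-regions form the subject feature region
    """
    gray = cv2.cvtColor(target_region, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)                  # edge features
    # Connected components of the non-edge area give the sub-regions.
    num, labels = cv2.connectedComponents((edges == 0).astype(np.uint8))
    subject_mask = np.zeros(gray.shape, dtype=np.uint8)
    for i in range(1, num):                           # label 0 is background
        mask = labels == i
        if mask.sum() < 25:                           # skip tiny fragments
            continue
        sub = cv2.bitwise_and(target_region, target_region,
                              mask=mask.astype(np.uint8))
        if classify(sub) == target_label:             # classification model
            subject_mask[mask] = 255                  # merge into the result
    return subject_mask
```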
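Claims 20 and 21 compute an initial depth by matching the subject feature regions across the binocular pair and converting the resulting disparity with the pinhole relation depth = focal_length × baseline / disparity. A minimal sketch, assuming a rectified stereo pair and using semi-global block matching as one possible matcher (the claims do not prescribe a specific matching algorithm):

```python
import cv2
import numpy as np

def region_depth(left_gray, right_gray, subject_mask, focal_px, baseline_m):
    """Estimate the initial depth of the subject feature region from a
    rectified stereo pair, via depth = focal * baseline / disparity."""
    matcher = cv2.StereoSGBM_create(minDisparity=0,
                                    numDisparities=64,
                                    blockSize=7)
    # StereoSGBM returns fixed-point disparities scaled by 16.
    disp = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
    valid = (disp > 0) & (subject_mask > 0)    # matched subject pixels only
    if not valid.any():
        return None                            # matching failed
    d = np.median(disp[valid])                 # robust disparity estimate
    return focal_px * baseline_m / d           # initial depth in meters
```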
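Claim 23 defines feature pixels as points whose gray-value variation exceeds a threshold or whose curvature on an image edge exceeds a preset value. A sketch of such a selector, where the Harris corner response is an assumed stand-in for the edge-curvature criterion:

```python
import cv2
import numpy as np

def feature_pixels(gray, grad_thresh=40.0):
    """Return (row, col) coordinates of feature pixels: points with a
    large local gray-value change, or high-curvature (corner-like)
    points approximated by the Harris response."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    grad_mag = cv2.magnitude(gx, gy)            # gray-value change
    harris = cv2.cornerHarris(np.float32(gray), 2, 3, 0.04)
    corners = harris > 0.01 * harris.max()      # high-curvature points
    return np.argwhere((grad_mag > grad_thresh) | corners)
```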
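Claims 25 to 27 fuse the initial depths obtained at multiple moments by a weighted average whose weights grow with the number of successful matches. A sketch assuming the match counts have already been collected; making each weight directly proportional to its count is one simple choice satisfying the claimed positive correlation:

```python
import numpy as np

def fuse_depths(initial_depths, match_counts):
    """Weighted-average fusion: each initial depth is weighted by its
    number of successful matches (counts assumed positive)."""
    depths = np.asarray(initial_depths, dtype=np.float64)
    weights = np.asarray(match_counts, dtype=np.float64)
    return float(np.average(depths, weights=weights))
```

For example, fuse_depths([2.1, 2.3, 1.9], [5, 8, 2]) weighs the second measurement most heavily because it matched successfully most often.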
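Claim 28 derives three-dimensional physical information from the target's position coordinates at different moments. A sketch that back-projects pixel-plus-depth to camera coordinates with assumed pinhole intrinsics and then estimates average velocity, one example of such physical information:

```python
import numpy as np

def to_3d(u, v, depth, fx, fy, cx, cy):
    """Back-project a pixel (u, v) with its depth to camera-frame
    coordinates using the pinhole model (intrinsics fx, fy, cx, cy)."""
    return np.array([(u - cx) * depth / fx,
                     (v - cy) * depth / fy,
                     depth])

def velocity(positions, timestamps):
    """Average velocity of the target from its 3D positions at
    different moments (one kind of 3D physical information)."""
    p = np.asarray(positions, dtype=np.float64)
    t = np.asarray(timestamps, dtype=np.float64)
    return (p[-1] - p[0]) / (t[-1] - t[0])
```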
PCT/CN2019/089425 2019-05-31 2019-05-31 Image processing method and apparatus, control terminal and mobile device WO2020237611A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980008862.XA CN111602139A (en) 2019-05-31 2019-05-31 Image processing method and device, control terminal and mobile device
PCT/CN2019/089425 WO2020237611A1 (en) 2019-05-31 2019-05-31 Image processing method and apparatus, control terminal and mobile device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/089425 WO2020237611A1 (en) 2019-05-31 2019-05-31 Image processing method and apparatus, control terminal and mobile device

Publications (1)

Publication Number Publication Date
WO2020237611A1

Family

ID=72191934

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/089425 WO2020237611A1 (en) 2019-05-31 2019-05-31 Image processing method and apparatus, control terminal and mobile device

Country Status (2)

Country Link
CN (1) CN111602139A (en)
WO (1) WO2020237611A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114281096A (en) * 2021-11-09 2022-04-05 中时讯通信建设有限公司 Unmanned aerial vehicle tracking control method, device and medium based on target detection algorithm
WO2022213364A1 (en) * 2021-04-09 2022-10-13 Oppo广东移动通信有限公司 Image processing method, image processing apparatus, terminal, and readable storage medium

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111984017A (en) * 2020-08-31 2020-11-24 苏州三六零机器人科技有限公司 Cleaning equipment control method, device and system and computer readable storage medium
CN111970568B (en) * 2020-08-31 2021-07-16 上海松鼠课堂人工智能科技有限公司 Method and system for interactive video playing
CN112163562B (en) * 2020-10-23 2021-10-22 珠海大横琴科技发展有限公司 Image overlapping area calculation method and device, electronic equipment and storage medium
CN112433529B (en) * 2020-11-30 2024-02-27 东软睿驰汽车技术(沈阳)有限公司 Moving object determining method, device and equipment
CN112967249B (en) * 2021-03-03 2023-04-07 南京工业大学 Intelligent identification method for manufacturing errors of prefabricated pier reinforcing steel bar holes based on deep learning
CN112598698B (en) * 2021-03-08 2021-05-18 南京爱奇艺智能科技有限公司 Long-time single-target tracking method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102779347A (en) * 2012-06-14 2012-11-14 清华大学 Method and device for tracking and locating target for aircraft
CN104573715A (en) * 2014-12-30 2015-04-29 百度在线网络技术(北京)有限公司 Recognition method and device for image main region
CN105786016A (en) * 2016-03-31 2016-07-20 深圳奥比中光科技有限公司 Unmanned plane and RGBD image processing method
CN106887018A (en) * 2015-12-15 2017-06-23 株式会社理光 Solid matching method, controller and system
US9896205B1 (en) * 2015-11-23 2018-02-20 Gopro, Inc. Unmanned aerial vehicle with parallax disparity detection offset from horizontal
CN108475072A (en) * 2017-04-28 2018-08-31 深圳市大疆创新科技有限公司 A kind of tracking and controlling method, device and aircraft

Also Published As

Publication number Publication date
CN111602139A (en) 2020-08-28

Similar Documents

Publication Publication Date Title
WO2020237611A1 (en) Image processing method and apparatus, control terminal and mobile device
US11481923B2 (en) Relocalization method and apparatus in camera pose tracking process, device, and storage medium
WO2020216054A1 (en) Sight line tracking model training method, and sight line tracking method and device
US20210343041A1 (en) Method and apparatus for obtaining position of target, computer device, and storage medium
US11398044B2 (en) Method for face modeling and related products
CN111223143B (en) Key point detection method and device and computer readable storage medium
CN108989672B (en) Shooting method and mobile terminal
CN109685915B (en) Image processing method and device and mobile terminal
US20220309836A1 (en) Ai-based face recognition method and apparatus, device, and medium
CN109272473B (en) Image processing method and mobile terminal
CN109241832B (en) Face living body detection method and terminal equipment
CN107730460B (en) Image processing method and mobile terminal
CN108881544B (en) Photographing method and mobile terminal
CN111031234B (en) Image processing method and electronic equipment
CN109544445B (en) Image processing method and device and mobile terminal
CN115526983A (en) Three-dimensional reconstruction method and related equipment
CN111008929B (en) Image correction method and electronic equipment
CN111091519B (en) Image processing method and device
CN109840476B (en) Face shape detection method and terminal equipment
CN110908517A (en) Image editing method, image editing device, electronic equipment and medium
CN110555815A (en) Image processing method and electronic equipment
CN110443752B (en) Image processing method and mobile terminal
CN111405361A (en) Video acquisition method, electronic equipment and computer readable storage medium
CN109345636B (en) Method and device for obtaining virtual face image
CN110930372A (en) Image processing method, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19930786

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19930786

Country of ref document: EP

Kind code of ref document: A1