WO2019019160A1 - Method for acquiring image information, image processing device, and computer storage medium - Google Patents


Info

Publication number: WO2019019160A1
Application number: PCT/CN2017/094932
Authority: WIPO (PCT)
Prior art keywords: point, image, detected, feature value, depth
Other languages: French (fr), Chinese (zh)
Inventor: 阳光
Original Assignee: 深圳配天智能技术研究院有限公司
Application filed by 深圳配天智能技术研究院有限公司
Priority to PCT/CN2017/094932
Priority to CN201780092646.9A (CN110800020B)
Publication of WO2019019160A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof

Definitions

  • The invention belongs to the technical field of information analysis, and in particular relates to an image information acquisition method, an image processing device, and a computer storage medium.
  • Binocular stereo vision is an important branch of computer vision. It simulates the principle of human vision, using a computer to passively perceive distance: two identical cameras image the same object from different positions to obtain a stereoscopic image pair, the pixel correspondences between the two images are matched, and the offset (disparity) between matched pixels is converted by the principle of triangulation into the three-dimensional information of the object. Once the depth information of the object is obtained, the actual distance between the object and the camera, the three-dimensional size of the object, and the actual distance between any two points can be calculated.
  • In the prior art, additional cameras are generally used to capture the scene from multiple angles, and the depth information of the occluded pixels is then restored from the multiple images captured at those angles, so that the depth of the occluded portion of the object is obtained and blind spots are eliminated. However, when the object is occluded in several directions, a camera must correspondingly be added for each such direction, which increases hardware cost.
  • Embodiments of the present invention provide an image information acquisition method, an image processing device, and a computer storage medium for reducing unnecessary hardware expenditure in multi-view stereo vision detection.
  • A first aspect of the embodiments of the present invention provides a method for acquiring image information, including:
  • acquiring an actual image feature value of a point to be detected, where the point to be detected is a pixel in a first image or a second image and is not included in a matching area, the matching area being an area included in both the first image and the second image; the first image is captured by a first camera, the second image is captured by a second camera, and the first image and the second image are images of the same target taken at different angles;
  • acquiring a feature value set corresponding to the first image and the second image, the feature value set including the actual image feature value of each pixel in the matching area;
  • finding a target pixel in the matching area according to the actual image feature value of the point to be detected and the feature value set, where the difference rate between the actual image feature value of the target pixel and that of the point to be detected is smaller than a first preset difference rate value;
  • taking the depth of the target pixel as the depth of the point to be detected.
  • A second aspect of the embodiments of the present invention provides an image processing device, including a memory, a processor, and a sensor;
  • the memory is configured to store operation instructions;
  • the processor is configured to: acquire the actual image feature value of the point to be detected, where the point to be detected is a pixel in the first image or the second image and is not included in the matching area, the matching area being an area included in both the first image and the second image, the first image being captured by a first camera, the second image being captured by a second camera, and the first image and the second image being images of the same target taken at different angles; acquire the feature value set corresponding to the first image and the second image, the feature value set including the actual image feature value of each pixel in the matching area; find the target pixel in the matching area according to the actual image feature value of the point to be detected and the feature value set, where the difference rate between the actual image feature value of the target pixel and that of the point to be detected is smaller than the first preset difference rate value; and take the depth of the target pixel as the depth of the point to be detected;
  • the sensor is configured to acquire the first image and the second image.
  • A third aspect of the embodiments of the present invention provides a computer storage medium including instructions which, when run on a computer, cause the computer to perform the methods described in the above aspects.
  • In the technical solution provided by the embodiments of the present invention, the first camera and the second camera capture the same target at different angles to obtain the first image and the second image. A pixel that exists only in the first image or only in the second image is called a point to be detected, and the area included in both images is called the matching area. The actual image feature value of the point to be detected is obtained by analyzing the image that contains it, and the actual image feature values of the pixels in the matching area are obtained by analyzing the first image and the second image, yielding a feature value set. A target pixel whose actual image feature value differs from that of the point to be detected by less than the first preset difference rate value is then found from the feature value set, and its depth is taken as the depth of the point to be detected. Because the occluded point is restored by searching the unoccluded matching area, no additional camera is required, which reduces unnecessary hardware expenditure.
  • FIG. 1 is a schematic diagram of an application scenario according to an embodiment of the present invention.
  • FIG. 2A is a flowchart of an embodiment of an image information acquisition method according to an embodiment of the present invention.
  • FIG. 2B is a technical schematic of depth calculation according to an embodiment of the present invention.
  • FIG. 3A is a flowchart of another embodiment of an image information acquisition method according to an embodiment of the present invention.
  • FIG. 3B is a technical schematic of determining an epipolar line according to an embodiment of the present invention.
  • FIG. 4 is a device diagram of an embodiment of an image processing device according to an embodiment of the present invention.
  • The embodiments of the present invention are applicable to the application scenario shown in FIG. 1. Point a and point b on object A are projected onto sensor 1 through lens 1, but point a and point b are occluded from the line of sight of lens 2, so the effective depths of point a and point b cannot be calculated. In the prior art, additional cameras are used to shoot from multiple angles to eliminate such blind spots; however, when the object is occluded in multiple directions, multiple cameras are correspondingly required, which increases hardware cost. In view of this, in the embodiments of the present invention, no additional camera is needed: the point to be detected in the occluded portion, that is, the portion that exists only in the first image or the second image, is restored by finding a target pixel in the unoccluded portion, i.e., the matching area, which reduces unnecessary hardware expenditure.
  • For ease of understanding, an embodiment of the image information acquisition method in the embodiments of the present invention, shown in FIG. 2A, includes the following steps. In step 201, the actual image feature value of the point to be detected is acquired. The first camera captures the first image and the second camera captures the second image; since the two images are taken of the same target at different angles, they contain both a common area and differing areas. For convenience of description, the common area of the first image and the second image is called the matching area (in practical applications it may also be called the coincident area). The device may analyze the image containing the point to be detected to obtain its actual image feature value, where the point to be detected is a pixel in the first image or the second image that is not included in the matching area. In the embodiments of the present invention, the actual image feature value may be an actual blur value or an actual sharpness value, among others, which is not limited here.
  • In practical applications, there are various ways to obtain the actual image feature value, for example a gray-level variance algorithm, in which f(x, y) denotes the gray value of the image at point (x, y), Nx and Ny denote the width and height of the image, respectively, f̄ denotes the mean gray value of the image, and s denotes the actual image feature value of point (x, y).
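The equation images for the gray-level variance formula are not reproduced in this extract. A minimal sketch of one common reading, in which each pixel's feature value is its squared deviation from the image's mean gray level (a local-window variance is an equally common variant), might look as follows; the function name and the NumPy-based implementation are illustrative assumptions, not the patent's reference code:

```python
import numpy as np

def gray_variance_feature(image: np.ndarray) -> np.ndarray:
    """Per-pixel 'actual image feature value' s(x, y).

    Assumes the gray-variance reading s(x, y) = (f(x, y) - mean)^2,
    where mean is the average gray value over the whole Nx x Ny image.
    """
    f = image.astype(np.float64)
    mean = f.mean()          # mean gray value over the image
    return (f - mean) ** 2   # squared deviation at each pixel
```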
  • It should be noted that the embodiments of the present invention can be applied to multi-view stereo vision technology, that is, to systems including at least two cameras; for convenience of description, the embodiments use two cameras as an example.
  • In step 202, the first image and the second image are analyzed to obtain the feature value set corresponding to the two images, where the feature value set includes the actual image feature value of each pixel in the matching area, that is, the area included in both the first image and the second image.
  • It should be noted that the device obtains the actual image feature value of the point to be detected in step 201 and the feature value set in step 202, but the two steps have no required order: step 201 may be performed first, step 202 may be performed first, or they may be performed simultaneously, which is not limited here.
  • After obtaining the feature value set and the actual image feature value of the point to be detected, the device finds the target pixel in the matching area according to the two, where the difference rate between the actual image feature value of the target pixel and that of the point to be detected is smaller than the first preset difference rate value. The difference rate between a matching-area pixel and the point to be detected can be computed in various ways; for example, if the actual image feature value of the matching-area pixel is a and that of the point to be detected is b, the difference rate can be obtained by the formula (a - b)/a, and the obtained difference rate is compared with the first preset difference rate value. If it is smaller, the matching-area pixel is determined to be the target pixel.
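As a concrete illustration of this search, a minimal sketch follows; the dictionary layout of the feature value set and the absolute-value guard on the ratio are assumptions made for the example, not details fixed by the patent:

```python
def find_target_pixel(feature_set, b, max_diff_rate):
    """Return the first matching-area pixel whose feature value a
    satisfies |(a - b) / a| < max_diff_rate, or None if none exists.

    feature_set:   dict mapping (x, y) -> actual image feature value a
    b:             actual image feature value of the point to be detected
    max_diff_rate: the first preset difference rate value
    """
    for point, a in feature_set.items():
        if a != 0 and abs((a - b) / a) < max_diff_rate:
            return point
    return None
```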
  • In step 204, the depth of the target pixel is taken as the depth of the point to be detected. After the device determines the target pixel in the matching area, since the target pixel is included in both the first image and the second image, the device can determine its depth according to a preset algorithm and take that depth as the depth of the point to be detected.
  • Here, depth refers to the distance from a point in the scene to the XY plane containing the camera center. In practical applications, a depth map can be used to represent the depth information of each point in the scene; that is, each pixel in the depth map records the distance from some point in the scene to the XY plane containing the camera center.
  • There are several ways to determine the depth of a pixel. For example, special hardware can actively acquire the depth of each pixel in the image, such as emitting a signal into the scene with an infrared pulsed light source and then detecting the reflected infrared light with an infrared sensor. Alternatively, based on the traditional computer stereo vision method, two images of the same scene obtained from two different viewpoints (or multiple viewpoint images) are stereo-matched to recover the depth information of the object, including: (1) stereo-matching the image pair to obtain a disparity image of corresponding points; and (2) computing depth from the relationship between the disparity and the depth of corresponding points.
  • Referring to FIG. 2B, the depth of a pixel can then be computed with the formula Z = B*f/(x - x'), where Z denotes the depth of the pixel, B denotes the distance between the optical center of the first camera and that of the second camera, f denotes the focal length of the first or second camera, x and x' are the distances from the pixel to the projection of the respective camera center on the image plane, and their difference (x - x') represents the disparity of the pixel.
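A minimal sketch of this disparity-to-depth relation, with the unit conventions and the division-by-zero guard added as assumptions for the example:

```python
def depth_from_disparity(baseline_b: float, focal_f: float,
                         x: float, x_prime: float) -> float:
    """Depth Z = B * f / (x - x'), per the triangulation relation above.

    baseline_b: distance between the two optical centers (same unit as Z)
    focal_f:    focal length expressed in pixels
    x, x_prime: pixel coordinates of the projections in the two images
    """
    disparity = x - x_prime
    if disparity == 0:
        raise ValueError("zero disparity: point is at infinity")
    return baseline_b * focal_f / disparity
```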
  • As can be seen from the above technical solution, the device finds the target pixel in the matching area using the feature value set corresponding to the first image and the second image together with the actual image feature value of the point to be detected, takes the depth of that pixel as the depth of the point to be detected, and thereby computes the depth of the point to be detected without adding a camera, which reduces unnecessary hardware expenditure.
  • FIG. 3A is a flowchart of another embodiment of the image information acquisition method according to an embodiment of the present invention. Steps 301 to 302 in FIG. 3A are similar to steps 201 to 202 in FIG. 2A and are not described again here.
  • In step 303, it is determined whether a target actual image feature value exists in the feature value set, that is, a value whose difference rate from the actual image feature value of the point to be detected is smaller than the first preset difference rate value (in other words, a value that is the same as the value to be detected or within an acceptable error range of it). If such a value exists, step 304 is performed; if not, step 306 is performed.
  • In step 304, the device selects one pixel from the pixels corresponding to the target actual image feature value as the target pixel. If the device determines that target actual image feature values exist in the feature value set, it can be understood that there may be one or more such values, each corresponding to one or more pixels, so the device may randomly select one of the corresponding pixels as the target pixel. It should be noted that, in practical applications, there are various ways to select the target pixel; for example, the pixel whose actual image feature value differs least from that of the point to be detected may also be selected, so the selection method is not limited here.
  • In step 305, the depth of the target pixel is taken as the depth of the point to be detected. Step 305 in FIG. 3A is similar to step 204 in FIG. 2A and is not described again here.
  • In step 306, the device selects one pixel from the matching area, referred to in the embodiments of the present invention as the first reference point, and obtains the reference values of the first reference point, which include at least a reference actual image feature value and a reference theoretical image feature value. The reference actual image feature value of the first reference point may be obtained by image detection, in a manner similar to how step 201 of the embodiment shown in FIG. 2A obtains the actual image feature value of the point to be detected, and is not described again here; the reference theoretical image feature value of the first reference point can be obtained by a preset calculation formula.
  • Next, the device calculates the theoretical image feature value of the point to be detected from its detected actual image feature value. The calculation may proceed as follows: let the reference actual image feature value of the first reference point be R1, its reference theoretical image feature value be M1, the actual image feature value of the point to be detected be R2, and the theoretical image feature value of the point to be detected be M2. In practical applications it can be assumed that R1/M1 ≈ R2/M2, so the theoretical image feature value of the point to be detected can be estimated from this formula.
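A minimal sketch of this proportional estimate, with the linear solve written out; the function name and the zero guard are assumptions for the example:

```python
def estimate_theoretical_feature(r1: float, m1: float, r2: float) -> float:
    """Estimate M2 from the assumption R1 / M1 ~= R2 / M2.

    r1, m1: reference actual and theoretical feature values of the
            first reference point (a pixel in the matching area)
    r2:     actual image feature value of the point to be detected
    """
    if r1 == 0:
        raise ValueError("reference actual feature value must be nonzero")
    return r2 * m1 / r1   # M2 = R2 * M1 / R1
```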
  • The depth of the point to be detected is then calculated from its theoretical image feature value according to a preset formula (the formula itself is not reproduced in this extract) whose quantities are: n, the aperture value of the camera; c, the theoretical blur value of the point to be detected; U, the depth of the point to be detected; F, the focal length of the camera lens; d, a lens parameter that is fixed once the camera system is fixed (glossed in the extract, like F, as the focal length of the lens); and m, the depth of field, which can be understood as the distance between the nearest and farthest parts of the subject, measured from the front edge of the camera lens or other imager, that appear acceptably sharp. Since n, c, F, d, and m are all known, the depth U of the point to be detected can be calculated.
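Since the extract does not reproduce the preset formula, the sketch below is purely an illustration of how U can be recovered once a blur model in these quantities is fixed: it inverts a standard thin-lens blur-circle relation numerically. The relation, the bisection solver, and the variable ranges are all assumptions for the example, not the patent's formula:

```python
def blur_of_depth(u, n, big_f, u_focus):
    """Standard thin-lens blur-circle diameter for an object at depth u,
    with f-number n, focal length big_f, and focus distance u_focus
    (all lengths in one unit). Stands in for the unreproduced formula."""
    return (big_f ** 2 / n) * abs(u - u_focus) / (u * (u_focus - big_f))

def depth_from_blur(c, n, big_f, u_focus, lo, hi, iters=60):
    """Solve blur_of_depth(u) = c for u on [lo, hi] by bisection,
    assuming the blur grows monotonically there (true for u beyond
    the focus plane, where blur rises from 0 toward an asymptote)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if blur_of_depth(mid, n, big_f, u_focus) < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```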
  • After the depth of the point to be detected has been estimated in this way, it is verified, to ensure that the estimated depth is reliable. The verification proceeds as follows.
  • The part of the matching area that lies in the first image is referred to as the first area, and the part that lies in the second image is referred to as the second area. The device may use a contour extraction method from the prior art to find a first closed edge in the first area and a second closed edge in the second area. The purpose of contour extraction is to obtain the peripheral contour features of an image; its steps may include first finding any point on the contour of the extracted edge image as a starting point, then searching the neighborhood of that point in one direction, continually finding the next contour boundary point of the detected image, and finally obtaining the complete contour area and the closed edge of that contour area.
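The boundary-following procedure described above is what OpenCV's contour finding implements; a minimal sketch using that library (the use of OpenCV 4, the Otsu binarization, and the choice of the largest contour are assumptions for the example) might be:

```python
import cv2
import numpy as np

def closed_edge(region: np.ndarray) -> np.ndarray:
    """Extract one closed outer contour from an 8-bit grayscale region.

    Returns an (N, 2) array of (x, y) boundary points; the largest
    external contour stands in for the 'closed edge' of the region.
    """
    _, binary = cv2.threshold(region, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    if not contours:
        raise ValueError("no contour found in region")
    largest = max(contours, key=cv2.contourArea)
    return largest.reshape(-1, 2)
```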
  • If the first closed edge and the second closed edge match, a first world point and a second world point are found from the first closed edge (or the second closed edge) and the epipolar plane, and the depth of the resulting target intersection is taken as the target value, as described below. After the device finds the first closed edge in the first area and the second closed edge in the second area, since there may be one or more closed edges on each side, it is necessary to determine which second closed edge matches the first closed edge. The first and second closed edges may be matched by a preset matching algorithm; specifically, the points on the first closed edge may be correlated with the points on each second closed edge, for example by accumulating the correlation values of each point on the first closed edge with each point on a given second closed edge to obtain an accumulated value, and the second closed edge with the largest accumulated value among the second closed edges is considered to match the first closed edge.
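A minimal sketch of this accumulation-based matching; the normalized-correlation score, the resampling of each edge to a fixed number of points, and the function names are assumptions for the example:

```python
import numpy as np

def resample(edge: np.ndarray, n: int = 64) -> np.ndarray:
    """Resample an (N, 2) closed edge to n evenly spaced points."""
    idx = np.linspace(0, len(edge) - 1, n).astype(int)
    return edge[idx].astype(np.float64)

def match_closed_edge(first_edge, second_edges, n=64):
    """Return the candidate second edge with the largest accumulated
    correlation against the first edge, per the scheme above."""
    a = resample(first_edge, n)
    a = (a - a.mean(axis=0)) / (a.std(axis=0) + 1e-9)
    best, best_score = None, -np.inf
    for cand in second_edges:
        b = resample(cand, n)
        b = (b - b.mean(axis=0)) / (b.std(axis=0) + 1e-9)
        score = float((a * b).sum())   # accumulated correlation value
        if score > best_score:
            best, best_score = cand, score
    return best
```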
  • The epipolar geometry used here, illustrated in FIG. 3B, is as follows. For a space point P, the projection on the imaging plane of the first camera is P1 and the projection on the imaging plane of the second camera is P2; C1 and C2 are the optical centers of the first and second cameras, respectively, that is, the origins of the two camera coordinate systems. The line connecting C1 and C2 is the baseline. The intersection of the baseline with the imaging plane of the first camera is called e1, the epipole of the first camera, and the intersection of the baseline with the imaging plane of the second camera is called e2, the epipole of the second camera; the epipoles are the projection coordinates of the two optical centers C1 and C2 on the corresponding camera imaging planes. The triangular plane formed by P, C1, and C2 is called the epipolar plane π. The intersection lines l1 and l2 of π with the two camera imaging planes are called epipolar lines; l1 is the epipolar line corresponding to point P1, and l2 is the epipolar line corresponding to point P2.
  • A point M is taken from the first closed edge, and the epipolar plane formed by point M, the optical center of the first camera, and the optical center of the second camera is determined. The intersection line of this plane with the imaging plane of the first camera is an epipolar line, and in the two-dimensional plane this epipolar line has at least two intersection points with the first closed edge. These intersection points are referred to as the first world point and the second world point. In the quadrilateral formed by the four points consisting of the first world point, the second world point, the optical center of the first camera, and the optical center of the second camera, the point where the diagonals intersect is found; this target intersection's depth is determined as the target value used to verify the depth of the point to be detected. It can be understood that, in practical applications, the point may instead be taken from the second closed edge to form the plane with the two optical centers, which is not limited here.
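A minimal sketch of the diagonal-intersection step, working with 2D coordinates inside the epipolar plane; the shared coordinate frame, the assumed vertex ordering, and the parametric line-line solve are assumptions for the example:

```python
import numpy as np

def diagonal_intersection(w1, w2, c1, c2):
    """Intersect the diagonals (w1 -> c2) and (w2 -> c1) of the
    quadrilateral with vertex order w1, w2, c1, c2.

    All arguments are 2D points in a common coordinate frame within the
    epipolar plane. Solves w1 + t*(c2 - w1) = w2 + s*(c1 - w2); raises
    if the diagonals are parallel (singular 2x2 system).
    """
    w1, w2, c1, c2 = (np.asarray(p, dtype=float) for p in (w1, w2, c1, c2))
    d1, d2 = c2 - w1, c1 - w2
    a = np.column_stack((d1, -d2))
    t, _ = np.linalg.solve(a, w2 - w1)
    return w1 + t * d1
```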
  • It is then verified whether the depth of the point to be detected is greater than the target value: when it is, step 314 is performed; when it is not, step 315 is performed.
  • In step 314, the device confirms that the depth of the point to be detected is reliable, that is, confirms that the depth of the point to be detected passes verification.
  • In step 315, when the depth of the point to be detected is not greater than the target value, the device takes the average of the depths of the pixels adjacent to the point to be detected as its depth. The adjacent pixels may include pixels in the matching area, whose depths are known, and pixels outside the matching area. It should be noted that the pixels outside the matching area must be points whose depths have already been estimated and verified; that is, if a pixel adjacent to the point to be detected is outside the matching area and its estimated depth has not passed verification, that pixel's depth is not taken into account when computing the average.
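A minimal sketch of this neighbor-averaging fallback; the 4-neighborhood, the dictionary of depths, and the function name are assumptions for the example:

```python
def fallback_depth(point, depth_map, verified):
    """Average the depths of verified neighbors of `point`.

    point:     (x, y) pixel coordinates of the point to be detected
    depth_map: dict mapping (x, y) -> depth
    verified:  set of (x, y) whose depth is known or has passed
               verification (all matching-area pixels qualify)
    """
    x, y = point
    neighbors = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    depths = [depth_map[p] for p in neighbors
              if p in verified and p in depth_map]
    if not depths:
        raise ValueError("no verified neighbor depths available")
    return sum(depths) / len(depths)
```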
  • The first image and the second image are then combined into a depth image, that is, the image obtained by matching the matching areas of the two images. After the first area of the first image and the second area of the second image are matched, the two areas are superimposed to obtain the combined image, i.e., the depth image, which contains the matching area and an occlusion area; the occlusion area consists of the part of the first image outside the first area and the part of the second image outside the second area.
  • Since the gray level that each pixel of a gray image can take is related to the depth of that pixel, when the depth distribution of some region in the occlusion area does not correspond to the gray value distribution of the corresponding image, the device performs smoothing preprocessing on that region. Such a region is referred to as a sub-occlusion area, that is, a region whose depth distribution does not correspond to its gray value distribution; for example, the depth distribution of the sub-occlusion area increases progressively while its gray value distribution first decreases and then increases, so the device needs to perform smoothing preprocessing on the sub-occlusion area.
  • Within the sub-occlusion area, a target occlusion point is selected, where the depth difference between the target occlusion point and its adjacent points is greater than a preset difference. After the device determines the target occlusion point, it obtains the depths of the adjacent points of the target occlusion point, averages them to obtain the adjacent points' mean depth, and takes this mean depth as the depth of the target occlusion point.
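A minimal sketch of this smoothing pass over a depth map stored as a 2D array; the 4-neighborhood and the in-place update order are assumptions for the example:

```python
import numpy as np

def smooth_sub_occlusion(depth: np.ndarray, region, max_jump: float):
    """Replace outlier depths inside `region` with their neighbor mean.

    depth:    2D array of depths (modified in place)
    region:   iterable of (row, col) pixels in the sub-occlusion area
    max_jump: preset difference; a pixel whose depth differs from the
              mean of its neighbors by more than this is treated as a
              target occlusion point and overwritten with that mean
    """
    rows, cols = depth.shape
    for r, c in region:
        nbrs = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
        vals = [depth[i, j] for i, j in nbrs
                if 0 <= i < rows and 0 <= j < cols]
        if not vals:
            continue
        mean = float(np.mean(vals))
        if abs(depth[r, c] - mean) > max_jump:
            depth[r, c] = mean
```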
  • Through steps 315 to 317, the device combines the first image and the second image into a depth image and preprocesses it; this process is an optional step, which is not limited here.
  • The present invention further provides an image processing device. FIG. 4 is a device diagram of a device according to an embodiment of the present invention.
  • The device 40 includes a memory 410, a processor 420, and a sensor 430. The memory 410 is configured to store operation instructions; the processor 420 is configured to perform the steps of the above method by invoking the operation instructions stored in the memory 410; and the sensor 430 is configured to acquire the first image and the second image.
  • The processor 420 may also be referred to as a central processing unit (CPU). The memory 410 is configured to store operation instructions and data so that the processor 420 can invoke the operation instructions to implement the corresponding operations, and may include read-only memory and random access memory; a portion of the memory 410 may also include non-volatile random access memory (NVRAM).
  • The device 40 also includes a bus system 440 that couples the components of the device 40, including the memory 410, the processor 420, and the sensor 430; in addition to a data bus, the bus system 440 may also include a power bus, a control bus, a status signal bus, and the like. For clarity of description, the various buses are all labeled as bus system 440 in the figure.
  • The processor 420 may be an integrated circuit chip with signal processing capability; in implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 420 or by instructions in the form of software. The processor 420 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • The methods, steps, and logical block diagrams disclosed in the embodiments of the present invention may thus be implemented or carried out. The general-purpose processor may be a microprocessor, any conventional processor, or the like. The steps of the methods disclosed in the embodiments of the present invention may be embodied directly in a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor.
  • The software modules may be located in a computer storage medium that is conventional in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers, and the like. The computer storage medium is located in the memory 410, and the processor 420 reads the information in the memory 410 and completes the steps of the above method in conjunction with its hardware.
  • Optionally, the specific implementation by which the processor 420 finds the target pixel in the matching area according to the actual image feature value of the point to be detected and the feature value set may be: determining whether a target actual image feature value exists in the feature value set, the difference rate between the target actual image feature value and the actual image feature value of the point to be detected being smaller than the first preset difference rate value; and if so, selecting one pixel from the pixels corresponding to the target actual image feature value as the target pixel.
  • Optionally, the processor 420 may also invoke the operation instructions in the memory 410 to perform the following steps: obtaining the reference theoretical image feature value of a first reference point according to a preset calculation formula, the first reference point being a pixel in the matching area; calculating the theoretical image feature value of the point to be detected according to the reference values of the first reference point and the actual image feature value of the point to be detected, where the reference values of the first reference point include its reference actual image feature value and its reference theoretical image feature value; and calculating the depth of the point to be detected from its theoretical image feature value.
  • Optionally, the processor 420 may also invoke the operation instructions in the memory 410 to perform the following step: taking the average of the depths of the pixels adjacent to the point to be detected as the depth of the point to be detected.
  • Optionally, the processor 420 may also invoke the operation instructions in the memory 410 to perform the following steps: finding, by the contour extraction method, a first closed edge in the first area and a second closed edge in the second area, the first area being the part of the matching area in the first image and the second area being the part of the matching area in the second image; if the first closed edge and the second closed edge match, finding the first world point and the second world point according to the first closed edge (or the second closed edge) and the epipolar plane; and taking the depth of the target intersection as the target value.
  • Optionally, the processor 420 may also invoke the operation instructions in the memory 410 to perform the following steps: combining the first image and the second image into a depth image, the depth image including the matching area and an occlusion area, the occlusion area including the part of the first image outside the first area and the part of the second image outside the second area; and preprocessing the occlusion area, which includes a sub-occlusion area.
  • In the above embodiment, the specific implementation by which the processor 420 preprocesses the sub-occlusion area may be: taking the mean depth of the adjacent points of the target occlusion point as the depth of the target occlusion point.
  • In summary, in the technical solution provided by the embodiments of the present invention, the first camera and the second camera capture the same target at different angles to obtain the first image and the second image; a pixel that exists only in the first image or only in the second image is referred to as a point to be detected, and the area included in both images is referred to as the matching area. The actual image feature value of the point to be detected is obtained by analyzing the image that contains it, a target pixel whose actual image feature value differs from that of the point to be detected by less than the first preset difference rate value is found, and the depth of the target pixel is taken as the depth of the point to be detected. No camera needs to be added to handle the occluded portion, that is, the portion that exists only in the first image or the second image: the point to be detected in the occluded portion is restored by finding the target pixel in the unoccluded portion, i.e., the matching area, which reduces unnecessary hardware expenditure.
  • The disclosed system, apparatus, and method may be implemented in other manners. The device embodiments described above are merely illustrative; for example, the division into units is only a division by logical function, and in actual implementation there may be other divisions: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • The mutual coupling, direct coupling, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
  • The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
  • In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as a standalone product, it may be stored in a computer-readable storage medium. The medium includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the various embodiments of the present invention. The foregoing storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)
  • Studio Devices (AREA)

Abstract

Disclosed in an embodiment of the present invention are a method for acquiring image information, an image processing device, and a computer storage medium, which are used during multi-view stereoscopic vision detection so as to reduce unnecessary hardware expenditure. The method in the embodiment of the present invention comprises: acquiring an actual image feature value of a point to be detected, the point to be detected being a pixel point in a first image or a second image, and the point to be detected not being contained within a matching region, wherein the matching region is a region comprised in both the first image and the second image; acquiring a feature value set corresponding to the first image and the second image, the feature value set comprising actual image feature values of each pixel point in the matching region; according to the actual image feature value of the point to be detected and the feature value set, searching in the matching region for a target pixel point, wherein the rate of difference between an actual image feature value of the target pixel point and the actual image feature value of the point to be detected is less than a first pre-configured rate of difference value; using the depth of the target pixel point as the depth of the point to be detected.

Description

Image information acquisition method, image processing device, and computer storage medium

Technical field

The invention belongs to the technical field of information analysis, and in particular relates to an image information acquisition method, an image processing device, and a computer storage medium.

Background art

Binocular stereo vision is an important branch of computer vision. It simulates the principle of human vision, using a computer to passively perceive distance: two identical cameras image the same object from different positions to obtain a stereoscopic image pair, the pixel correspondences between the images are matched, and the offset (disparity) between matched pixels is converted by the principle of triangulation into the three-dimensional information of the object. Once the depth information of the object is obtained, the actual distance between the object and the camera, the three-dimensional size of the object, and the actual distance between any two points can be calculated.

However, in practical applications, when the object is occluded, the lines of sight of the two cameras are blocked and the depth cannot be effectively calculated, resulting in large visual error. In the prior art, additional cameras are generally used to capture the scene from multiple angles, and the depth information of the occluded pixels is then restored from the multiple images captured at those angles, so that the depth of the occluded portion of the object is obtained and blind spots are eliminated.

However, adding cameras to eliminate blind spots means that, when the object is occluded in several directions, a camera must correspondingly be added for each such direction, which increases hardware cost.

Summary of the invention

Embodiments of the present invention provide an image information acquisition method, an image processing device, and a computer storage medium for reducing unnecessary hardware expenditure in multi-view stereo vision detection.
本发明实施例的第一方面提供一种图像信息获取方法,包括:A first aspect of the embodiments of the present invention provides a method for acquiring image information, including:
获取待检测点的实际图像特征值,所述待检测点为第一图像或者第二图像中的像素点,所述待检测点不包含在匹配区域内,所述匹配区域为所述第一图像和所述第二图像中都包括的区域,所述第一图像由第一摄像机拍摄,所述第二图像由第二摄像机拍摄,所述第一图像和所述第二图像为在不同角度拍摄同一目标所得到的图像; Acquiring an actual image feature value of the to-be-detected point, where the to-be-detected point is a pixel in the first image or the second image, the to-be-detected point is not included in the matching area, and the matching area is the first image And an area included in the second image, the first image is captured by a first camera, the second image is captured by a second camera, and the first image and the second image are taken at different angles Images obtained from the same target;
获取所述第一图像和所述第二图像对应的特征值集合,所述特征值集合包括所述匹配区域中各像素点的实际图像特征值;Acquiring a set of feature values corresponding to the first image and the second image, where the set of feature values includes actual image feature values of each pixel in the matching region;
根据所述待检测点的实际图像特征值和所述特征值集合在所述匹配区域中找出目标像素点,所述目标像素点的实际图像特征值与所述待检测点的实际图像特征值的差率小于第一预置差率值;Finding a target pixel point in the matching area according to the actual image feature value of the to-be-detected point and the feature value set, an actual image feature value of the target pixel point and an actual image feature value of the to-be-detected point The difference is less than the first preset difference value;
将所述目标像素点的深度作为所述待检测点的深度。The depth of the target pixel is taken as the depth of the point to be detected.
A second aspect of the embodiments of the present invention provides an image processing device, including:

a memory, a processor, and a sensor;

the memory being configured to store operation instructions;

the processor being configured to: acquire the actual image feature value of the point to be detected, where the point to be detected is a pixel in the first image or the second image and is not included in the matching area, the matching area being an area included in both the first image and the second image, the first image being captured by a first camera, the second image being captured by a second camera, and the first image and the second image being images of the same target taken at different angles; acquire the feature value set corresponding to the first image and the second image, the feature value set including the actual image feature value of each pixel in the matching area; find the target pixel in the matching area according to the actual image feature value of the point to be detected and the feature value set, where the difference rate between the actual image feature value of the target pixel and that of the point to be detected is smaller than the first preset difference rate value; and take the depth of the target pixel as the depth of the point to be detected;

the sensor being configured to acquire the first image and the second image.

A third aspect of the embodiments of the present invention provides a computer storage medium including instructions which, when run on a computer, cause the computer to perform the methods described in the above aspects.
In the technical solution provided by the embodiments of the present invention, the first camera and the second camera capture the same target at different angles to obtain the first image and the second image. A pixel that exists only in the first image or only in the second image is called a point to be detected, and the area included in both images is called the matching area. The actual image feature value of the point to be detected is obtained by analyzing the image that contains it, and the actual image feature values of the pixels in the matching area are obtained by analyzing the first image and the second image, yielding a feature value set. A target pixel whose actual image feature value differs from that of the point to be detected by less than the first preset difference rate value is then found from the feature value set, and its depth is taken as the depth of the point to be detected. In this embodiment, no additional camera is needed to handle the occluded portion, that is, the portion that exists only in the first image or the second image: the point to be detected in the occluded portion is restored by finding the target pixel in the unoccluded portion, i.e., the matching area, which reduces unnecessary hardware expenditure.
Brief description of the drawings

FIG. 1 is a schematic diagram of an application scenario according to an embodiment of the present invention;

FIG. 2A is a flowchart of an embodiment of an image information acquisition method according to an embodiment of the present invention;

FIG. 2B is a technical schematic of depth calculation according to an embodiment of the present invention;

FIG. 3A is a flowchart of another embodiment of an image information acquisition method according to an embodiment of the present invention;

FIG. 3B is a technical schematic of determining an epipolar line according to an embodiment of the present invention;

FIG. 4 is a device diagram of an embodiment of an image processing device according to an embodiment of the present invention.
Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons skilled in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

The terms "first", "second", "third", "fourth", and so on (if present) in the specification, the claims, and the above drawings are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can be implemented in an order other than that illustrated or described. Moreover, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
The embodiments of the present invention are applicable to the application scenario shown in FIG. 1. Point a and point b on object A are projected onto sensor 1 through lens 1, but point a and point b are occluded from the line of sight of lens 2, so the effective depths of point a and point b cannot be calculated. In the prior art, additional cameras are used to shoot from multiple angles to eliminate such blind spots; however, when the object is occluded in multiple directions, multiple cameras are correspondingly required, which increases hardware cost.

In view of this, in the embodiments of the present invention, no additional camera is needed to handle the occluded portion, that is, the portion that exists only in the first image or the second image (the first image and the second image being images of the same target taken at different angles): the point to be detected in the occluded portion is restored by finding a target pixel in the unoccluded portion, i.e., the matching area, which reduces unnecessary hardware expenditure.
For ease of understanding, the specific flow in the embodiments of the present invention is described below. Referring to FIG. 2A, an embodiment of the image information acquisition method in the embodiments of the present invention includes the following steps.

201. Acquire the actual image feature value of the point to be detected.

Binocular stereo vision is one of the most commonly used methods for measuring distance and recovering a 3D scene: it calculates depth from the parallax between two images of the same target taken by two cameras. In the embodiments of the present invention, the first camera captures the first image and the second camera captures the second image. Since the two images are taken of the same target at different angles, they contain both a common area and differing areas; for convenience of description, the common area of the first image and the second image is called the matching area (in practical applications it may also be called the coincident area). The device may analyze the image containing the point to be detected to obtain its actual image feature value, where the point to be detected is a pixel in the first image or the second image that is not included in the matching area. In the embodiments of the present invention, the actual image feature value may be an actual blur value or an actual sharpness value, among others, which is not limited here.
It should be noted that, in practical applications, there are various ways to obtain the actual image feature value, for example a gray-level variance algorithm, which can be computed with the following formulas (reconstructed here from the variable definitions, the equation images not being reproduced in this extract):

f̄ = (1 / (Nx * Ny)) * Σx Σy f(x, y)

s = (f(x, y) - f̄)²

where f̄ denotes the mean gray value of the image, f(x, y) denotes the gray value of the image at point (x, y), Nx and Ny denote the width and height of the image, respectively, and s denotes the actual image feature value of point (x, y).
It should be noted that the embodiments of the present invention can be applied to multi-view stereo vision technology, that is, to systems including at least two cameras; for convenience of description, the embodiments of the present invention use two cameras as an example.
与步骤201中获取待检测点的实际图像特征值的方式类似,步骤202中通过图像分析检测第一图像和第二图像得到第一图像和第二图像对应的特征值集合,其中该特征值集合包括匹配区域即第一图像和第二图像中都包括的区域中每个像素点的实际图像特征值。Similar to the manner of obtaining the actual image feature value of the point to be detected in step 201, in step 202, the first image and the second image are detected by image analysis to obtain a feature value set corresponding to the first image and the second image, wherein the feature value set The actual image feature value of each pixel in the matching area, that is, the area included in the first image and the second image is included.
需要说明的是,设备通过步骤201获得待检测点的实际图像特征值,根据步骤202获得特征值集合,而这两个过程并不存在先后关系,可以先执行步骤201,也可以先执行步骤202,或者同时执行,具体此处不做限定。It is to be noted that the device obtains the actual image feature value of the point to be detected in step 201, and obtains the feature value set according to step 202. The two processes do not have a sequence relationship. Step 201 may be performed first, or step 202 may be performed first. , or at the same time, specifically not limited here.
203、根据待检测点的实际图像特征值和特征值集合在匹配区域中找出目标像素点。203. Find a target pixel point in the matching area according to the actual image feature value and the feature value set of the point to be detected.
设备获得特征值集合和待检测点的实际图像特征值后,再根据两者在匹配区域中找出目标像素点,其中,目标像素点的实际图像特征值与所述待检测点的实际图像特征值的差率小于第一预置差率值,本发明实施例中,计算匹配区域的像素点与待检测点的实际图像特征值的方式有多种,例如,设匹配区域的像素点的实际图像特征值为a,待检测点的实际图像特征值为b,则差率可以通过公式(a-b)/a得到,并将得到的差率与第一预置差率值进行比较,若得到的差率小于第一预置差率值,则将该匹配区域的像素点确定为目标像素点。After obtaining the feature value set and the actual image feature value of the point to be detected, the device finds the target pixel point in the matching area according to the two, wherein the actual image feature value of the target pixel point and the actual image feature of the point to be detected In the embodiment of the present invention, there are various ways to calculate the actual image feature values of the pixel of the matching area and the point to be detected, for example, the actual pixel point of the matching area is set. If the image feature value is a, and the actual image feature value of the point to be detected is b, the difference rate can be obtained by the formula (ab)/a, and the obtained difference rate is compared with the first preset difference rate value. If the difference is smaller than the first preset difference value, the pixel of the matching area is determined as the target pixel.
204、将目标像素点的深度作为待检测点的深度。204. The depth of the target pixel is used as the depth of the point to be detected.
设备在匹配区域中确定了目标像素点后,由于目标像素点为第一图像和第二图像中包括的像素点,故设备可以根据预置算法确定目标像素点的深度,并将该目标像素点的深度作为待检测点的深度。其中,深度指的是场景中某个点到相机中心所在的XY平面的距离。实际应用中,可以用一张深度图来表示场景中各个点的深度信息,即深度图中的每一个像素记录了场景中的某一个点到相机中心所在的XY平面的距离。另外,确定像素点的深度的方式有多种,例如,可以利用特殊的硬件设备来主动获取图像中每个像素点的深度信息,如利用红外脉冲光源向场景发射信号,然后用红外传感器来检测场景中物体反射回来的红外光,从而确定图像中每一个像素点到摄像机的距离;或者,基于传统的计算机立体视觉方法,通过利用在两个不同的视点来获得的同一景物的两幅 图像或多个视点图像进行立体匹配来恢复物体的深度信息,包括:(1)对图像对进行立体匹配,得到对应点的视差图像;(2)根据对应点的视差与深度的关系计算出深度,从而将视差图像转化为深度图像。因此本发明实施例中,请参照图2B,可以运用如下公式计算像素点的深度:Z=B*f/(x-x’),其中,O和O’分别表示第一摄像头和第二摄像头,Z用于表示像素点的深度,B用于表示第一摄像机的光心与第二摄像机的光心的距离,f用于表示第一摄像机或第二摄像机的焦距,x和x’对应的是像素点和相机中心在图像平面上的投影点的距离,两者的差值即(x-x’)用于表示像素点的视差。After the device determines the target pixel in the matching area, since the target pixel is the pixel included in the first image and the second image, the device may determine the depth of the target pixel according to a preset algorithm, and the target pixel is The depth is taken as the depth of the point to be detected. Where depth refers to the distance from a point in the scene to the XY plane where the camera center is located. In practical applications, a depth map can be used to represent the depth information of each point in the scene, that is, each pixel in the depth map records the distance from a certain point in the scene to the XY plane where the camera center is located. In addition, there are various ways to determine the depth of a pixel. For example, a special hardware device can be used to actively acquire depth information of each pixel in the image, such as using an infrared pulse light source to transmit a signal to a scene, and then detecting by using an infrared sensor. The infrared light reflected back from the object in the scene to determine the distance from each pixel in the image to the camera; or, based on the traditional computer stereo vision method, by using two images of the same scene obtained at two different viewpoints The image or the plurality of viewpoint images are stereo-matched to restore the depth information of the object, including: (1) performing stereo matching on the image pair to obtain a parallax image of the corresponding point; and (2) calculating the depth according to the relationship between the parallax and the depth of the corresponding point. , thereby converting the parallax image into a depth image. Therefore, in the embodiment of the present invention, referring to FIG. 2B, the depth of the pixel point can be calculated by using the following formula: Z=B*f/(x-x'), where O and O' represent the first camera and the second camera, respectively. Z is used to indicate the depth of the pixel point, B is used to indicate the distance between the optical center of the first camera and the optical center of the second camera, and f is used to indicate the focal length of the first camera or the second camera, corresponding to x and x' It is the distance between the pixel point and the projection point of the camera center on the image plane, and the difference between them (x-x') is used to represent the parallax of the pixel point.
As can be seen from the above technical solution, in this embodiment of the present invention the device finds the target pixel in the matching area from the feature value set corresponding to the first image and the second image together with the actual image feature value of the point to be detected, takes the depth of the target pixel as the depth of the point to be detected, and thus computes the depth of the point to be detected without adding an extra camera, reducing unnecessary hardware expenditure.
For ease of understanding, the image information acquisition method of the embodiments of the present invention is described in detail below. Referring to FIG. 3A, FIG. 3A is a flowchart of another embodiment of the image information acquisition method according to an embodiment of the present invention.
301. Obtain the actual image feature value of the point to be detected.
302. Obtain the feature value set corresponding to the first image and the second image.
In this embodiment of the present invention, steps 301 and 302 in FIG. 3A are similar to steps 201 and 202 in FIG. 2A and are not repeated here.
303. Determine whether a target actual image feature value exists in the feature value set; if so, perform step 304; if not, perform step 306.
After obtaining the actual image feature value of the point to be detected and the feature value set, the device determines whether a target actual image feature value exists in the feature value set, where the difference rate between the target actual image feature value and the actual image feature value of the point to be detected is smaller than the first preset difference rate value; that is, the device judges whether the target actual image feature value equals the actual image feature value of the point to be detected or differs from it within an acceptable error range. If such a value exists, step 304 is performed; if not, step 306 is performed.
304. The device selects one pixel from the pixels corresponding to the target actual image feature value as the target pixel.
If the device determines that a target actual image feature value exists in the feature value set, it can be understood that there may be one or more target actual image feature values and, correspondingly, one or more pixels; the device may therefore randomly select one of the corresponding pixels as the target pixel.
It should be noted that, in practice, there are several ways to select the target pixel; for example, the pixel with the smallest difference rate against the actual image feature value of the point to be detected may also be selected as the target pixel. The way the target pixel is selected is therefore not limited here.
305. Take the depth of the target pixel as the depth of the point to be detected.
In this embodiment of the present invention, step 305 in FIG. 3A is similar to step 204 in FIG. 2A and is not repeated here.
306. Obtain the reference value of a first reference point.
When it is determined that no target actual image feature value exists in the feature value set, the device selects a pixel from the matching area, referred to in this embodiment of the present invention as the first reference point, and obtains the reference value of the first reference point, where the reference value includes at least a reference actual image feature value and a reference theoretical image feature value. The reference actual image feature value of the first reference point can be obtained by image detection techniques, in a way similar to how step 201 of the embodiment shown in FIG. 2A obtains the actual image feature value of the point to be detected, and is not repeated here. In addition, in practice the reference theoretical image feature value of the first reference point can be obtained from a preset calculation formula, for example: C = d*F²/(2nU²*m), where n denotes the aperture value of the camera; C denotes the theoretical image feature value of the first reference point (in this formula, the theoretical image feature value is a theoretical blur value); U denotes the depth of the first reference point; F denotes the focal length of the camera lens; d is a constant for a fixed camera system; and m is the depth of field, which can be understood as the range of distances in front of and behind the subject within which the camera lens or other imager produces a sharp image.
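A one-line sketch of the preset formula C = d*F²/(2nU²m) — names are assumptions; the disclosure does not prescribe an implementation:

```python
def theoretical_blur(n, F, U, d, m):
    """Theoretical image feature (blur) value C = d*F**2 / (2*n*U**2*m),
    with n the aperture value, F the lens focal length, U the depth of
    the reference point, d the camera-system constant, and m the depth
    of field."""
    return d * F**2 / (2 * n * U**2 * m)
```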
307. Compute the theoretical image feature value of the point to be detected from the reference value of the first reference point and the actual image feature value of the point to be detected.
After the device has obtained the reference value of the first reference point, it computes the theoretical image feature value of the point to be detected from the detected actual image feature value of the point to be detected. The computation can proceed as follows: let the reference actual image feature value of the first reference point be R1, its reference theoretical image feature value be M1, the actual image feature value of the point to be detected be R2, and the theoretical image feature value of the point to be detected be M2. In practice it can be assumed that R1/M1 ≈ R2/M2, so the theoretical image feature value of the point to be detected can be estimated from this relation.
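The proportional relation R1/M1 ≈ R2/M2 solves directly for M2; a sketch (names assumed):

```python
def estimate_theoretical_value(R1, M1, R2):
    """Estimate the theoretical image feature value M2 of the point to be
    detected from R1/M1 ≈ R2/M2, i.e. M2 ≈ R2 * M1 / R1 (R1 nonzero)."""
    return R2 * M1 / R1
```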
308. Compute the depth of the point to be detected from its theoretical image feature value.
After the device has obtained the theoretical image feature value of the point to be detected, it computes the depth of the point to be detected from a preset formula; for example, the depth of the point to be detected may be computed as follows:
2ncU²/F² = d/m, where n denotes the aperture value of the camera; c denotes the theoretical blur value of the point to be detected; U denotes the depth of the point to be detected; F denotes the focal length of the camera lens; d is a constant for a fixed camera system; and m is the depth of field, which can be understood as the range of distances in front of and behind the subject within which the camera lens or other imager produces a sharp image. Since n, c, F, d and m are all known, the depth U of the point to be detected can be computed.
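Solving 2ncU²/F² = d/m for U gives U = F·sqrt(d/(2ncm)); a sketch under the same naming assumptions as above:

```python
import math

def depth_from_blur(n, c, F, d, m):
    """Depth of the point to be detected from 2*n*c*U**2/F**2 = d/m,
    i.e. U = F * sqrt(d / (2*n*c*m))."""
    return F * math.sqrt(d / (2 * n * c * m))
```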
309. Use a contour extraction method to find a first closed edge in a first region and a second closed edge in a second region.
After the device has computed the depth of the point to be detected, it verifies that depth to ensure it is trustworthy. In this embodiment of the present invention, the part of the matching area lying in the first image is called the first region, and the part lying in the second image is called the second region. The device can use a contour extraction method from the prior art to find a first closed edge in the first region and a second closed edge in the second region. The purpose of contour extraction is to obtain the outer contour features of an image; its steps can include first finding an arbitrary point on the contour of the image being processed as a starting point, then searching the neighborhood of that point along one direction, repeatedly finding the next contour boundary point of the image, and finally obtaining the complete contour region and the closed edge of that region.
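The disclosure leaves the contour extraction method open; as one possible realization (an assumption, using OpenCV's boundary-following findContours rather than the hand-rolled neighborhood search described above), the closed edges of a region could be obtained as follows:

```python
import cv2

def closed_edges(region_gray):
    """Outer closed contours of a grayscale region (uint8 image).

    Otsu thresholding followed by findContours is one way to obtain
    the closed edges of the region's outer outline.
    """
    _, binary = cv2.threshold(region_gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return contours
```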
310. When the first closed edge matches the second closed edge, find a first world point and a second world point from the first closed edge or the second closed edge and the epipolar plane.
311. Determine a target intersection point from the first world point and the second world point.
312. Take the depth of the target intersection point as the target value.
After the device has found the first closed edge in the first region and the second closed edge in the second region, since there may be one or more first closed edges and second closed edges, the second closed edge matching each first closed edge must be determined. In practice, the match between a first closed edge and a second closed edge can be determined with a preset matching algorithm, which can include correlating the points on the first closed edge with the points on a second closed edge: for example, accumulating the correlation value of every point on the first closed edge with every point on each second closed edge, and taking the second closed edge with the largest accumulated value among all second closed edges as the one matching that first closed edge.
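A sketch of the accumulated-correlation matching (the correlation function itself is left as an assumption, since the disclosure does not fix one):

```python
import numpy as np

def match_closed_edge(first_edge, second_edges, corr):
    """Index of the second closed edge best matching `first_edge`.

    first_edge   : (N, 2) array of points on the first closed edge
    second_edges : list of point arrays, one per candidate second edge
    corr         : point-to-point correlation function corr(p, q) -> float
                   (an assumed callable; the patent does not specify it)
    """
    scores = [sum(corr(p, q) for p in first_edge for q in edge)
              for edge in second_edges]
    return int(np.argmax(scores))  # largest accumulated value wins
```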
Suppose there is a point P on the target photographed by the first camera and the second camera; its projection on the imaging plane of the first camera is P1 and its projection on the imaging plane of the second camera is P2, as shown in FIG. 3B, where C1 and C2 are the optical centers of the first camera and the second camera respectively, i.e. the origins of the camera coordinate systems. In epipolar geometry, the line joining C1 and C2 is called the baseline. The intersection of the baseline with the imaging plane of the first camera is called intersection e1, which is the epipole of the first camera; likewise, the intersection of the baseline with the imaging plane of the second camera is called intersection e2, which is the epipole of the second camera. They are the projections of one camera's optical center onto the other camera's imaging plane. The triangular plane formed by P, C1 and C2 is called the epipolar plane π. The intersection lines l1 and l2 of π with the two imaging planes are called epipolar lines; l1 is the epipolar line corresponding to point P1, and l2 is the epipolar line corresponding to point P2.
After the first closed edge and the second closed edge have been matched, a point M is taken anywhere on the first closed edge, and the epipolar plane formed by point M, the optical center of the first camera and the optical center of the second camera is determined. The intersection line of that plane with the imaging plane of the first camera is the epipolar line. In the two-dimensional plane, this epipolar line has at least two intersection points with the first closed edge; for ease of description, these intersection points are called the first world point and the second world point. Within the quadrilateral formed by four points — the first world point, the second world point, the optical center of the first camera and the optical center of the second camera — the point where the diagonals of the quadrilateral intersect is taken as the target intersection point, and the depth of that target intersection point is determined as the target value used to verify the depth of the point to be detected. It can be understood that, in practice, a point may instead be taken from the second closed edge to form the epipolar plane with the optical centers of the first camera and the second camera; this is not limited here.
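A sketch of the diagonal intersection (assuming, for illustration only, that all four points are expressed as 2-D coordinates in the epipolar plane and that the diagonals of the quadrilateral are w1–w2 and c1–c2):

```python
import numpy as np

def diagonal_intersection(w1, w2, c1, c2):
    """Intersection of the diagonals w1-w2 and c1-c2 of the quadrilateral
    formed by the two world points and the two optical centers.

    Solves w1 + t*(w2 - w1) = c1 + s*(c2 - c1) for t and s; raises
    if the diagonals are parallel (singular system).
    """
    w1, w2, c1, c2 = map(np.asarray, (w1, w2, c1, c2))
    a = np.column_stack((w2 - w1, c1 - c2))
    b = c1 - w1
    t, _ = np.linalg.solve(a, b)
    return w1 + t * (w2 - w1)
```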
313. Verify whether the depth of the point to be detected is greater than the target value; if so, perform step 314; if not, perform step 315.
After the device obtains the target value, it verifies against that value whether the depth of the point to be detected is trustworthy. The target value is compared with the depth of the point to be detected: when the depth of the point to be detected is greater than the target value, step 314 is performed; when it is not, step 315 is performed.
314. Confirm that the depth of the point to be detected passes verification.
When the depth of the point to be detected is greater than the target value, the device confirms that the depth of the point to be detected is trustworthy, i.e. that it passes verification.
315. Take the average of the depths of the pixels adjacent to the point to be detected as the depth of the point to be detected.
When the depth of the point to be detected is not greater than the target value, the device takes the average of the depths of the pixels adjacent to the point to be detected as its depth. The adjacent pixels include several pixels, which may include pixels in the matching area, whose depths are known, and pixels outside the matching area. It should be noted that a pixel outside the matching area must be one whose depth has already been estimated and has passed verification; that is, if an adjacent pixel of the point to be detected is outside the matching area and its estimated depth has not passed verification, its depth is not considered when computing the average depth of the adjacent pixels.
In addition, it can be understood that there are several ways to select the pixels adjacent to the point to be detected, for example selecting the pixels directly adjacent to it; this is not limited here.
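A sketch of the neighbor average, using the directly adjacent (8-connected) pixels as one possible adjacency choice; the mask of usable neighbors (known or verified depths) is an assumption of this illustration:

```python
import numpy as np

def neighbor_depth_average(depth, usable, r, c):
    """Average depth of the pixels directly adjacent to (r, c), counting
    only neighbors whose depth is known or has passed verification.

    depth  : 2-D array of depths
    usable : boolean mask of the same shape flagging usable neighbors
    """
    vals = []
    rows, cols = depth.shape
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr, dc) != (0, 0) and 0 <= rr < rows and 0 <= cc < cols \
                    and usable[rr, cc]:
                vals.append(depth[rr, cc])
    return float(np.mean(vals)) if vals else None
```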
316. Combine the first image and the second image into a depth image.
After the device obtains the depth of the point to be detected, it combines the first image and the second image into a depth image, i.e. the image obtained by superimposing the matching regions of the two images. In this embodiment of the present invention, the first region of the first image matches the second region of the second image, so after the first region of the first image and the second region of the second image are superimposed, the resulting image is the depth image, which includes the matching area and an occlusion area, where the occlusion area includes the part of the first image outside the first region and the part of the second image outside the second region.
317. Select a target occlusion point from a sub-occlusion area.
Since the depth of a pixel determines the number of gray levels that each pixel of a grayscale image may take, when the depth distribution of some region of the occlusion area does not correspond to the gray-value distribution of that region in the corresponding image, the device performs smoothing preprocessing on that region. In this embodiment of the present invention, for ease of description, that region is called the sub-occlusion area; that is, the depth distribution of the sub-occlusion area does not correspond to its gray-value distribution. For example, the depth distribution of the sub-occlusion area may increase progressively while its gray-value distribution first decreases and then increases; the device therefore needs to perform smoothing preprocessing on the sub-occlusion area and select a target occlusion point in it, where the depth difference between the target occlusion point and its adjacent points is greater than a preset difference.
318. Take the average depth of the points adjacent to the target occlusion point as the depth of the target occlusion point.
After the device determines the target occlusion point, it obtains the depths of the points adjacent to the target occlusion point, averages them to obtain the average depth of the adjacent points, and can take that average as the depth of the target occlusion point.
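Steps 317 and 318 together amount to a local outlier-smoothing pass over the sub-occlusion area; a sketch (array names and the 8-connected neighborhood are assumptions):

```python
import numpy as np

def smooth_sub_occlusion(depth, region_mask, preset_diff):
    """Replace every target occlusion point in the sub-occlusion area,
    i.e. every point whose depth differs from the mean depth of its
    adjacent points by more than the preset difference, with that mean.
    """
    out = depth.copy()
    rows, cols = depth.shape
    for r, c in zip(*np.nonzero(region_mask)):
        neigh = [depth[rr, cc]
                 for rr in range(max(r - 1, 0), min(r + 2, rows))
                 for cc in range(max(c - 1, 0), min(c + 2, cols))
                 if (rr, cc) != (r, c)]
        mean = float(np.mean(neigh))
        if abs(float(depth[r, c]) - mean) > preset_diff:
            out[r, c] = mean  # step 318: adopt the neighbors' average
    return out
```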
It should be noted that the device performs the smoothing preprocessing of the first image and the second image through steps 316 to 318; in practice, this process is an optional step and is not limited here.
The present invention further provides a device. Referring to FIG. 4, which is a block diagram of the device in an embodiment of the present invention, the device 40 includes a memory 410, a processor 420, and a sensor 430.
The memory 410 is configured to store operation instructions.
The processor 420 invokes the operation instructions stored in the memory 410 to perform the following steps:
obtaining the actual image feature value of a point to be detected, where the point to be detected is a pixel in a first image or a second image and is not included in a matching area, the matching area is an area included in both the first image and the second image, the first image is captured by a first camera, the second image is captured by a second camera, and the first image and the second image are images of the same target captured from different angles; obtaining the feature value set corresponding to the first image and the second image, where the feature value set includes the actual image feature values of the pixels in the matching area; finding a target pixel in the matching area from the actual image feature value of the point to be detected and the feature value set, where the difference rate between the actual image feature value of the target pixel and the actual image feature value of the point to be detected is smaller than a first preset difference rate value; and taking the depth of the target pixel as the depth of the point to be detected.
The sensor 430 is configured to acquire the first image and the second image.
It should be noted that, in this embodiment, the processor 420 may also be called a central processing unit (CPU).
The memory 410 is configured to store operation instructions and data so that the processor 420 can invoke the operation instructions to perform the corresponding operations, and may include a read-only memory and a random access memory. Part of the memory 410 may also include a non-volatile random access memory (NVRAM).
The device 40 further includes a bus system 440, which couples the components of the device 40 together; these components include the memory 410, the processor 420 and the sensor 430. Besides a data bus, the bus system 440 may also include a power bus, a control bus, a status signal bus, and the like. For clarity, however, all the buses are labeled bus system 440 in the figure.
In this embodiment it should also be noted that the method disclosed in the foregoing embodiments of the present invention may be applied to, or implemented by, the processor 420. The processor 420 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the foregoing method may be completed by integrated logic circuits of hardware in the processor 420 or by instructions in the form of software. The processor 420 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps and logical block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in the embodiments of the present invention may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may reside in a computer storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The computer storage medium resides in the memory 410; the processor 420 reads the information in the memory 410 and completes the steps of the foregoing method in combination with its hardware.
In the foregoing embodiment, the processor 420 may find the target pixel in the matching area from the actual image feature value of the point to be detected and the feature value set specifically by:
determining whether a target actual image feature value exists in the feature value set, where the difference rate between the target actual image feature value and the actual image feature value is smaller than the first preset difference rate value; and, if it exists, selecting one pixel from the pixels corresponding to the target actual image feature value as the target pixel.
In another possible embodiment, the processor 420 may further invoke the operation instructions in the memory 410 to perform the following steps:
when no target actual image feature value exists in the feature value set, obtaining the reference theoretical image feature value of a first reference point according to a preset calculation formula, the first reference point being a pixel in the matching area; computing the theoretical image feature value of the point to be detected from the reference value of the first reference point and the actual image feature value of the point to be detected, the reference value of the first reference point including the reference actual image feature value and the reference theoretical image feature value of the first reference point; and computing the depth of the point to be detected from the theoretical image feature value of the point to be detected.
In another possible embodiment, the processor 420 may further invoke the operation instructions in the memory 410 to perform the following steps:
verifying whether the depth of the point to be detected is greater than a target value;
if so, confirming that the depth of the point to be detected passes verification;
if not, taking the average of the depths of the pixels adjacent to the point to be detected as the depth of the point to be detected.
In another possible embodiment, the processor 420 may further invoke the operation instructions in the memory 410 to perform the following steps:
finding, by a contour extraction method, a first closed edge in a first region and a second closed edge in a second region, the first region being the area of the matching area in the first image and the second region being the area of the matching area in the second image;
when the first closed edge and the second closed edge match, finding a first world point and a second world point from the first closed edge or the second closed edge and the epipolar plane;
determining a target intersection point from the first world point and the second world point;
taking the depth of the target intersection point as the target value.
In another possible embodiment, the processor 420 may further invoke the operation instructions in the memory 410 to perform the following steps:
combining the first image and the second image into a depth image, the depth image including the matching area and an occlusion area, the occlusion area including the part of the first image outside the first region and the part of the second image outside the second region;
when the depth distribution of a sub-occlusion area does not correspond to the gray-value distribution of the sub-occlusion area in the corresponding image, preprocessing the occlusion area, the sub-occlusion area being contained in the occlusion area.
In the foregoing embodiment, the processor 420 may preprocess the sub-occlusion area specifically by:
selecting a target occlusion point from the sub-occlusion area, the depth difference between the target occlusion point and its adjacent points being greater than a preset difference;
taking the average depth of the points adjacent to the target occlusion point as the depth of the target occlusion point.
In the above embodiments, the first camera and the second camera photograph the same target from different angles to obtain the first image and the second image; a pixel that exists only in the first image or only in the second image is called a point to be detected, and the area included in both the first image and the second image is called the matching area. The actual image feature value of the point to be detected is obtained by analyzing the image containing it, and the actual image feature values of the pixels in the matching area are obtained by analyzing the first image and the second image, yielding the feature value set. A target pixel is found from the actual image feature value of the point to be detected and the feature value set, the difference rate between the actual image feature value of the target pixel and the actual image feature value of the point to be detected being smaller than a first preset value, and the depth of the target pixel is obtained and used as the depth of the point to be detected. In these embodiments, no extra camera is needed to eliminate the occluded part, i.e. the part that exists only in the first image or the second image; the points to be detected in the occluded part are recovered by finding target pixels in the unoccluded part, i.e. the matching area, which reduces unnecessary hardware expenditure.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division into units is only a division by logical function, and other divisions are possible in actual implementation: for example, several units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over several network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on this understanding, the essence of the technical solutions of the present invention, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are merely intended to describe the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of their technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (15)

  1. A method for acquiring image information, wherein the method is applied to multi-view stereo vision technology, the method comprising:
    obtaining an actual image feature value of a point to be detected, wherein the point to be detected is a pixel in a first image or a second image, the point to be detected is not included in a matching area, the matching area is an area included in both the first image and the second image, the first image is captured by a first camera, the second image is captured by a second camera, and the first image and the second image are images obtained by photographing the same target from different angles;
    obtaining a feature value set corresponding to the first image and the second image, wherein the feature value set comprises actual image feature values of pixels in the matching area;
    finding a target pixel in the matching area according to the actual image feature value of the point to be detected and the feature value set, wherein a difference rate between the actual image feature value of the target pixel and the actual image feature value of the point to be detected is smaller than a first preset difference rate value;
    taking the depth of the target pixel as the depth of the point to be detected.
  2. The method for acquiring image information according to claim 1, wherein finding the target pixel in the matching area according to the actual image feature value of the point to be detected and the feature value set comprises:
    determining whether a target actual image feature value exists in the feature value set, wherein a difference rate between the target actual image feature value and the actual image feature value is smaller than the first preset difference rate value;
    if it exists, selecting one pixel from the pixels corresponding to the target actual image feature value as the target pixel.
  3. The method for acquiring image information according to claim 2, wherein after determining whether the target actual image feature value exists in the feature value set, the method further comprises:
    when it is determined that the target actual image feature value does not exist in the feature value set, obtaining a reference theoretical image feature value of a first reference point according to a preset calculation formula, wherein the first reference point is a pixel in the matching area;
    computing a theoretical image feature value of the point to be detected according to a reference value of the first reference point and the actual image feature value of the point to be detected, wherein the reference value of the first reference point comprises a reference actual image feature value of the first reference point and the reference theoretical image feature value;
    computing the depth of the point to be detected according to the theoretical image feature value of the point to be detected.
  4. The method for acquiring image information according to claim 1, wherein after taking the depth of the target pixel as the depth of the point to be detected, the method further comprises:
    verifying whether the depth of the point to be detected is greater than a target value;
    if so, confirming that the depth of the point to be detected passes verification;
    if not, taking an average of depths of pixels adjacent to the point to be detected as the depth of the point to be detected.
  5. The method for acquiring image information according to claim 4, wherein before verifying whether the depth of the point to be detected is greater than the target value, the method further comprises:
    finding, by a contour extraction method, a first closed edge in a first region and a second closed edge in a second region, wherein the first region is the area of the matching area in the first image and the second region is the area of the matching area in the second image;
    when the first closed edge and the second closed edge match, finding a first world point and a second world point according to the first closed edge or the second closed edge and an epipolar plane;
    determining a target intersection point according to the first world point and the second world point;
    taking the depth of the target intersection point as the target value.
  6. The method for acquiring image information according to claim 5, wherein after obtaining the depth of the target pixel as the depth of the point to be detected, the method further comprises:
    combining the first image and the second image into a depth image, wherein the depth image comprises the matching area and an occlusion area, and the occlusion area comprises the part of the first image outside the first region and the part of the second image outside the second region;
    when a depth distribution of a sub-occlusion area does not correspond to a gray-value distribution of the sub-occlusion area in the corresponding image, preprocessing the occlusion area, wherein the sub-occlusion area is contained in the occlusion area.
  7. The method for acquiring image information according to claim 6, wherein preprocessing the sub-occlusion area comprises:
    selecting a target occlusion point from the sub-occlusion area, wherein a depth difference between the target occlusion point and adjacent points of the target occlusion point is greater than a preset difference;
    taking an average depth of the adjacent points of the target occlusion point as the depth of the target occlusion point.
  8. An image processing device, wherein the image processing device comprises:
    a memory, a processor, and a sensor;
    wherein the memory is configured to store operation instructions;
    the processor is configured to: obtain an actual image feature value of a point to be detected, wherein the point to be detected is a pixel in a first image or a second image, the point to be detected is not included in a matching area, the matching area is an area included in both the first image and the second image, the first image is captured by a first camera, the second image is captured by a second camera, and the first image and the second image are images obtained by photographing the same target from different angles; obtain a feature value set corresponding to the first image and the second image, wherein the feature value set comprises actual image feature values of pixels in the matching area; find a target pixel in the matching area according to the actual image feature value of the point to be detected and the feature value set, wherein a difference rate between the actual image feature value of the target pixel and the actual image feature value of the point to be detected is smaller than a first preset difference rate value; and take the depth of the target pixel as the depth of the point to be detected;
    the sensor is configured to acquire the first image and the second image.
  9. The image processing device according to claim 8, wherein the processor is configured to:
    determine whether a target actual image feature value exists in the feature value set, wherein a difference rate between the target actual image feature value and the actual image feature value is smaller than the first preset difference rate value; and, if it exists, select one pixel from the pixels corresponding to the target actual image feature value as the target pixel.
  10. The image processing device according to claim 9, wherein the processor is further configured to:
    when the target actual image feature value does not exist in the feature value set, obtain a reference theoretical image feature value of a first reference point according to a preset calculation formula, wherein the first reference point is a pixel in the matching area; compute a theoretical image feature value of the point to be detected according to a reference value of the first reference point and the actual image feature value of the point to be detected, wherein the reference value of the first reference point comprises a reference actual image feature value of the first reference point and the reference theoretical image feature value; and compute the depth of the point to be detected according to the theoretical image feature value of the point to be detected.
  11. The image processing device according to claim 9, wherein the processor is further configured to:
    verify whether the depth of the point to be detected is greater than a target value; if so, confirm that the depth of the point to be detected passes verification; if not, take an average of depths of pixels adjacent to the point to be detected as the depth of the point to be detected.
  12. The image processing device according to claim 11, wherein the processor is further configured to:
    find, by a contour extraction method, a first closed edge in a first region and a second closed edge in a second region, wherein the first region is the area of the matching area in the first image and the second region is the area of the matching area in the second image; determine whether the first closed edge and the second closed edge match; when the first closed edge and the second closed edge match, find a first world point and a second world point according to the first closed edge or the second closed edge and an epipolar plane; determine a target intersection point according to the first world point and the second world point; and take the depth of the target intersection point as the target value.
  13. The image processing device according to claim 12, wherein the processor is further configured to:
    combine the first image and the second image into a depth image, wherein the depth image comprises the matching area and an occlusion area, and the occlusion area comprises the part of the first image outside the first region and the part of the second image outside the second region; and when a depth distribution of a sub-occlusion area does not correspond to a gray-value distribution of the sub-occlusion area in the corresponding image, preprocess the occlusion area, wherein the sub-occlusion area is contained in the occlusion area.
  14. The image processing device according to claim 13, wherein the processor is specifically configured to:
    select a target occlusion point from the sub-occlusion area, wherein a depth difference between the target occlusion point and adjacent points of the target occlusion point is greater than a preset difference; and take an average depth of the adjacent points of the target occlusion point as the depth of the target occlusion point.
  15. A computer storage medium comprising instructions that, when run on a computer, cause the computer to perform the method for acquiring image information according to any one of claims 1 to 7.
PCT/CN2017/094932 2017-07-28 2017-07-28 Method for acquiring image information, image processing device, and computer storage medium WO2019019160A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2017/094932 WO2019019160A1 (en) 2017-07-28 2017-07-28 Method for acquiring image information, image processing device, and computer storage medium
CN201780092646.9A CN110800020B (en) 2017-07-28 2017-07-28 Image information acquisition method, image processing equipment and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/094932 WO2019019160A1 (en) 2017-07-28 2017-07-28 Method for acquiring image information, image processing device, and computer storage medium

Publications (1)

Publication Number Publication Date
WO2019019160A1 true WO2019019160A1 (en) 2019-01-31

Family

ID=65039334

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/094932 WO2019019160A1 (en) 2017-07-28 2017-07-28 Method for acquiring image information, image processing device, and computer storage medium

Country Status (2)

Country Link
CN (1) CN110800020B (en)
WO (1) WO2019019160A1 (en)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111627061B (en) * 2020-06-03 2023-07-11 如你所视(北京)科技有限公司 Pose detection method and device, electronic equipment and storage medium
CN113484852B (en) * 2021-07-07 2023-11-07 烟台艾睿光电科技有限公司 Distance measurement method and system


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102204262A (en) * 2008-10-28 2011-09-28 皇家飞利浦电子股份有限公司 Generation of occlusion data for image properties
US20130135439A1 (en) * 2011-11-29 2013-05-30 Fujitsu Limited Stereoscopic image generating device and stereoscopic image generating method
CN104574331A (en) * 2013-10-22 2015-04-29 中兴通讯股份有限公司 Data processing method, device, computer storage medium and user terminal
CN103679739A (en) * 2013-12-26 2014-03-26 清华大学 Virtual view generating method based on shielding region detection
CN105279786A (en) * 2014-07-03 2016-01-27 顾海松 Method and system for obtaining object three-dimensional model
CN104063702A (en) * 2014-07-16 2014-09-24 中南大学 Three-dimensional gait recognition based on shielding recovery and partial similarity matching
CN105184780A (en) * 2015-08-26 2015-12-23 京东方科技集团股份有限公司 Prediction method and system for stereoscopic vision depth
CN106355570A (en) * 2016-10-21 2017-01-25 昆明理工大学 Binocular stereoscopic vision matching method combining depth characteristics

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117575886A (en) * 2024-01-15 2024-02-20 之江实验室 Image edge detector, detection method, electronic equipment and medium
CN117575886B (en) * 2024-01-15 2024-04-05 之江实验室 Image edge detector, detection method, electronic equipment and medium

Also Published As

Publication number Publication date
CN110800020A (en) 2020-02-14
CN110800020B (en) 2021-07-09


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17919379

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17919379

Country of ref document: EP

Kind code of ref document: A1