CN112907559A - Monocular camera-based depth map generation device and method - Google Patents

Monocular camera-based depth map generation device and method

Info

Publication number
CN112907559A
CN112907559A (application CN202110281368.XA)
Authority
CN
China
Prior art keywords
camera
monocular camera
depth map
realsense
rgb image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110281368.XA
Other languages
Chinese (zh)
Other versions
CN112907559B (en)
Inventor
屠礼芬
宋伟
彭祺
李春生
余振宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei Engineering University
Original Assignee
Hubei Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei Engineering University filed Critical Hubei Engineering University
Priority to CN202110281368.XA priority Critical patent/CN112907559B/en
Publication of CN112907559A publication Critical patent/CN112907559A/en
Application granted granted Critical
Publication of CN112907559B publication Critical patent/CN112907559B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10028 Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The invention relates to a monocular camera-based depth map generation device comprising a monocular camera and a RealSense camera: the monocular camera is mounted on a first pan/tilt head, the RealSense camera on a second pan/tilt head, the two cameras fit closely together, and their optical axes are parallel. The invention also relates to a monocular camera-based depth map generation method comprising the following steps: collect one monocular camera RGB image; collect one RealSense camera RGB image and one RealSense camera depth map; down-sample to obtain the down-sampled monocular camera RGB image; perform superpixel segmentation to obtain the segmented monocular camera RGB image; perform feature point matching to obtain the matching depth map; perform region segmentation to obtain the partitioned depth map; compute the average depth of each region and fill the corresponding region to obtain the filled depth map; and up-sample to obtain the monocular camera depth map. The method fits a depth map while keeping the high precision and field of view of the monocular camera RGB image; the cost is low; and no hardware calibration, no learning or modeling, and no prior knowledge are required.

Description

Monocular camera-based depth map generation device and method
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a monocular camera-based depth map generation device and method.
Background
With the continuous development of artificial intelligence applications, combining image depth with RGB information has become increasingly common. Compared with RGB information alone, depth information introduces the distance from a target to the camera, adding a spatial dimension; a scene can thus be better understood, and detection or recognition accuracy is markedly improved. An image containing depth information is called a depth map.
The prior art has several methods for generating a depth map as follows:
1. Conventional hardware acquisition methods:
This is the most direct depth map generation technique: hardware such as lidar, Kinect, or RealSense is simply used to acquire the depth information of the scene and thereby the depth map. The advantages and disadvantages of these devices are as follows:
Lidar:
Advantages: higher precision.
Disadvantages: it acquires three-dimensional point cloud information but lacks an RGB image, i.e., texture information is lost.
Kinect/RealSense:
Advantages: the RGB image and the depth map are obtained simultaneously; the devices are inexpensive and easy to popularize.
Disadvantages: the RGB image has low resolution and low contrast, and the field of view is limited.
2. Image-processing-based methods, such as the mainstream binocular stereo matching:
After the cameras are calibrated, a depth map is obtained through feature point matching together with global and local matching. The advantages and disadvantages of this approach are as follows:
Advantages: a depth map of higher precision can be generated, and good RGB image information is retained.
Disadvantages: the cameras require complex calibration, and once calibrated their relative positions must not move, so flexibility is poor; moreover, the hardware used in this solution must be customized, so the cost is not low.
3. Monocular depth estimation methods:
A depth map is obtained by traditional machine learning or by deep learning. The advantages and disadvantages of this class of solutions are as follows:
Advantages: the hardware cost is low.
Disadvantages: learning and modeling must be performed first, which requires large data sets and a complex training process, so the approach is not suitable for popularization.
Disclosure of Invention
The invention aims to solve the above problems by providing a monocular camera-based depth map generation device and method that fit a depth map while keeping the high precision and unchanged field of view of the monocular camera's RGB image; no hardware calibration, no scene learning or modeling, and no large amount of prior knowledge are required, which reduces the application cost.
To this end, the technical solution provided by the invention is as follows:
a depth map generating device based on a monocular camera comprises the monocular camera and a RealSense camera; wherein:
the monocular camera is arranged on a quick-mounting plate of the first cloud deck; the base of the first tripod head is fixedly arranged on the tripod head fixing plate;
the RealSense camera is arranged on a quick installation plate of the second cloud deck; the base of the second holder is fixedly arranged on the holder fixing plate;
the monocular camera is tightly matched with the RealSense camera; the optical axis of the monocular camera is parallel to the optical axis of the RealSense camera.
Preferably, the monocular camera fits closely with the RealSense camera in the horizontal direction.
Preferably, the monocular camera fits closely with the RealSense camera in the vertical direction.
Preferably, it is characterized in that: the monocular camera is arranged on a quick-mounting plate of the first holder through a conversion frame made of a tough material and used for buffering and resisting shock; the RealSense camera is installed on the fast-installation plate of the second holder through a conversion frame made of a tough material and used for buffering and resisting shock.
Preferably, the monocular camera is provided with cooling fins in four directions, namely, up, down, left and right.
A monocular camera-based depth map generation method using the depth map generation device comprises the following steps:
S100. Aim the optical axes of the monocular camera and the RealSense camera at an image acquisition target simultaneously.
S200. Collect one monocular camera RGB image of the target with the monocular camera; collect one RealSense camera RGB image and one RealSense camera depth map of the target with the RealSense camera.
The pixels of the RealSense camera RGB image correspond one-to-one with the pixels of the RealSense camera depth map.
S300. Down-sample the monocular camera RGB image so that its resolution is reduced to that of the RealSense camera RGB image, obtaining the down-sampled monocular camera RGB image.
S400. Perform a superpixel segmentation operation on the down-sampled monocular camera RGB image, obtaining the segmented monocular camera RGB image.
S500. Perform a feature point matching operation between the down-sampled monocular camera RGB image and the RealSense camera RGB image, obtaining the matching depth map.
S600. Perform region segmentation on the matching depth map according to the segmented monocular camera RGB image, obtaining the partitioned depth map; the partitioned depth map consists of a number of partitioned regions.
S700. Region by region, compute the average of the depth values of all pixels in each partitioned region, take that average as the region's depth value, and fill the region with it, obtaining the filled depth map.
S800. Up-sample the filled depth map so that its resolution is raised to that of the monocular camera RGB image, obtaining the monocular camera depth map; then output the monocular camera depth map as the result of the depth map generation method.
Preferably, the feature point matching operation in S500 specifically comprises the following operations:
S510. For each pixel of the down-sampled monocular camera RGB image, search the RealSense camera RGB image for a pixel that matches it.
S520. According to the search result:
if a pixel of the down-sampled monocular camera RGB image has a matching pixel in the RealSense camera RGB image, assign that pixel the depth value of the corresponding pixel in the RealSense camera depth map;
otherwise, set the gray value of that pixel of the down-sampled monocular camera RGB image to 0.
Compared with the prior art, the invention has the following advantages:
1. Because the monocular camera and the RealSense camera are fastened closely together with their optical axes nearly coincident, the RGB image acquired by the RealSense camera can be feature-matched against the high-precision RGB image acquired by the monocular camera; a depth map can thus be fitted while keeping the high precision and the field of view of the monocular camera RGB image unchanged, overcoming the loss of texture information in the lidar solution.
2. Because no customized equipment is used, the high cost of the image-processing-based solutions is overcome.
3. Because three-dimensional coordinates are not computed from multi-camera image coordinates, no hardware calibration is required; and because the depth is measured rather than learned, no scene learning or modeling and no large amount of prior knowledge are needed. This overcomes the drawbacks of the monocular depth estimation solutions and makes the method suitable for popularization and application.
Drawings
Fig. 1 is a schematic front view of a monocular camera-based depth map generating device according to an embodiment of the present invention;
FIG. 2 is a front view of an apparatus according to an embodiment of the present invention;
fig. 3 is a schematic flowchart of a monocular camera-based depth map generation method according to an embodiment of the present invention;
FIG. 4 is the image I_RGB in an embodiment of the present invention;
FIG. 5 is the image R_RGB in an embodiment of the present invention;
FIG. 6 is the image R_D in an embodiment of the present invention;
FIG. 7 is the feature point detection result on I_RGB in an embodiment of the present invention;
FIG. 8 is the feature point detection result on R_RGB in an embodiment of the present invention;
FIG. 9 is the feature point matching result in an embodiment of the present invention;
FIG. 10 is the image S_RGB after superpixel segmentation in an embodiment of the present invention;
FIG. 11 is the image I_D in an embodiment of the present invention;
FIG. 12 is a schematic diagram of the variation of each image according to the algorithm flow according to the embodiment of the present invention.
Wherein: 1. the camera comprises a monocular camera, a RealSense camera, a first tripod head, a second tripod head, a tripod head fixing plate, a conversion frame, a radiating fin and a tripod, wherein the monocular camera comprises 2. the RealSense camera, 3. the first tripod head, 4. the second tripod head, 5. the tripod head fixing plate, 6. the conversion frame, 7. the radiating fin and 8. the tripod.
Detailed Description
The present invention is further illustrated by the following examples, which are intended to be purely exemplary and not to limit the scope of the invention; various equivalent modifications that occur to those skilled in the art upon reading this disclosure likewise fall within the scope of the appended claims.
As shown in fig. 1 (front view), a monocular camera-based depth map generating apparatus includes a monocular camera 1 and a RealSense camera 2.
In this embodiment, the monocular camera 1 is an industrial camera; specifically, a micro-vision RS-A14K-GC8 industrial camera is used.
In this embodiment, the RealSense camera 2 is an Intel RealSense D415 depth camera; alternatively, an Intel RealSense D435 depth camera may be used.
Wherein:
the monocular camera 1 is installed on a quick-mounting plate of the first cloud deck 3; the base of the first pan/tilt head 3 is fixedly mounted on the pan/tilt fixing plate 5.
In this embodiment, the monocular camera 1 is mounted on the quick mount plate of the first pan/tilt head 3 via the conversion frame 6 made of a flexible material for buffering and shock resistance.
The RealSense camera 2 is arranged on a quick-mounting plate of the second cloud deck 4; the base of the second pan/tilt head 4 is fixedly mounted on the pan/tilt head fixing plate 5.
In this embodiment, the RealSense camera 2 is mounted on the fast-mounting plate of the second pan/tilt head 4 through a conversion frame 6 made of a flexible material for buffering and shock resistance.
The optical axis of the monocular camera 1 is parallel to the optical axis of the RealSense camera 2. The monocular camera 1 fits closely with the RealSense camera 2 in the horizontal direction, or fits closely in the vertical direction.
The purpose of this arrangement is that the same scene captured by the two cameras has the same depth, so that the RealSense camera depth map can be used to fit, and finally obtain, the monocular camera depth map.
In this embodiment, the monocular camera 1 and the RealSense camera 2 fit closely in the horizontal direction.
In this embodiment, a tripod 8 is mounted below the pan/tilt fixing plate 5, and by adjusting the tripod 8 the optical axes of the monocular camera 1 and the RealSense camera 2 are both kept horizontal.
For the monocular camera 1, the absolute depth generated may contain a very small error, because the image planes of the two cameras do not completely coincide; the relative depths of different objects in the scene, however, are not affected.
In this embodiment, heat sinks 7 are mounted on the monocular camera 1 on its top, bottom, left, and right sides, because an industrial camera draws considerable power and heats up readily in use, making heat dissipation necessary.
Fig. 2 is a front view of the device according to this embodiment.
As shown in fig. 3, the monocular camera-based depth map generation method using the depth map generation device comprises the following steps:
S100. Aim the optical axes of the monocular camera 1 and the RealSense camera 2 at an image acquisition target simultaneously.
S200. Collect one monocular camera RGB image of the image acquisition target with the monocular camera 1; for convenience of description, this image is denoted I_RGB below. Collect one RealSense camera RGB image and one RealSense camera depth map of the same target with the RealSense camera 2; the RealSense camera RGB image is denoted R_RGB, and the RealSense camera depth map is denoted R_D.
In this example, I_RGB is shown in fig. 4, R_RGB in fig. 5, and R_D in fig. 6.
Comparing fig. 4 and fig. 5, it is clear that although the optical axes of the industrial camera and the RealSense camera 2 are not perfectly collinear, the imaged scenes are very close, since the cameras are mounted side by side.
Comparing fig. 5 and fig. 6, the points of R_RGB and R_D correspond one-to-one and coincide; that is, the pixels of R_RGB correspond one-to-one with the pixels of R_D.
S300. Down-sample I_RGB so that its resolution is reduced to that of R_RGB, obtaining the down-sampled monocular camera RGB image; for convenience of description, this image is denoted i_RGB below.
The reason for this step is as follows: the industrial camera in this embodiment has a resolution of 4384 × 3288, while the RealSense camera 2 offers a variety of resolutions; for algorithmic reasons, the RealSense resolution must be chosen with the same aspect ratio as the industrial camera.
In this embodiment the RealSense camera 2 is an Intel RealSense D415; changing it to an Intel RealSense D435 achieves the same effect. However, both models output depth maps at a maximum resolution of 1280 × 720, clearly far poorer than the industrial camera's resolution. As mentioned above, the aspect ratio must match the industrial camera's: 4384 : 3288 = 4 : 3, and of the available modes only 640 × 480 has this ratio, so the RealSense resolution can only be chosen as 640 × 480 in this embodiment; this is the image size of R_RGB.
On the other hand, industrial applications demand high image definition and contrast, so the picture quality of R_RGB is unusable and only I_RGB will serve; yet I_RGB lacks a corresponding depth map. Herein lies the contradiction, and this contradiction is the fundamental problem the invention solves. Briefly, the object of the invention is to generate a depth map I_D in point-to-point correspondence with I_RGB.
The practical meaning of step S300 is therefore to down-sample I_RGB into a new image of the same resolution as R_RGB, namely i_RGB; a minimal sketch follows.
S400. Perform a superpixel segmentation operation on i_RGB, obtaining the segmented monocular camera RGB image; for convenience of description, this image is denoted S_RGB below.
In this embodiment, the superpixel segmentation divides the image into regions according to the scene to be analyzed.
A superpixel is an irregular block of adjacent pixels with similar texture, color, and brightness that carries a certain visual significance; superpixel segmentation partitions the image into such blocks, as sketched below.
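As a sketch of S400, the superpixel labels can be produced with scikit-image's SLIC; the patent does not name a particular superpixel algorithm, so SLIC and the parameter values here are assumptions. i_RGB is the down-sampled image from the previous sketch.

```python
import cv2
from skimage.segmentation import slic

# S400: segment i_RGB into irregular blocks of similar color/texture.
# SLIC expects RGB channel order; OpenCV loads BGR, hence the conversion.
S_RGB = slic(
    cv2.cvtColor(i_RGB, cv2.COLOR_BGR2RGB),
    n_segments=400,    # assumed region count; tuned to the scene in practice
    compactness=10.0,  # trade-off between color similarity and compact shape
)
# S_RGB is a label image: S_RGB[y, x] is the superpixel index of pixel (x, y).
```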
S500. Perform a feature point matching operation between i_RGB and R_RGB, obtaining the matching depth map; for convenience of description, this map is denoted i_DP below. The operation specifically comprises:
S510. For each pixel of i_RGB, search R_RGB for a pixel that matches it.
S520. According to the search result:
if a pixel of i_RGB has a matching pixel in R_RGB, the match is successful, and that pixel of i_RGB is assigned the depth value of the pixel of R_D corresponding to the matched pixel of R_RGB; the two successfully paired pixels are called feature points;
otherwise, the match has failed, and the gray value of that pixel of i_RGB is set to 0.
In step S500, the feature point matching operation serves to eliminate points with large error and retain good matches. This is because the industrial camera and the RealSense camera 2 differ considerably in model and in field-of-view angle, so the two sides of the image from the camera with the larger field of view contain areas with no counterpart. However, since the two cameras are mounted side by side, the overlapping portion in the middle of the field of view is highly similar, which reduces the matching difficulty; most matched point pairs therefore arise in that region.
For example, consider a pair of matched points with image coordinates (m, n) in i_RGB and (m', n') in R_RGB. Because the two cameras differ in model, the two coordinates generally differ; but since the cameras are mounted close together, left-right or top-bottom, with their front positions aligned, the absolute depths of the same scene point are similar and the relative depths are the same. Since the pixels of R_RGB and R_D correspond one-to-one, the depth value of R_D at (m', n') is taken as the depth value of i_RGB at (m, n). Applying this correspondence to every matched pair generates the depth values at all matched positions, forming a new image: i_DP. A sketch follows.
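The following is a minimal sketch of S510-S520 under the same assumptions as the previous sketches. The patent does not fix a feature detector, so ORB with brute-force Hamming matching and cross-checking stands in here; the depth file name is hypothetical.

```python
import cv2
import numpy as np

# RealSense depth map, pixel-aligned with R_RGB (16-bit depth is typical).
R_D = cv2.imread("realsense_depth.png", cv2.IMREAD_UNCHANGED)

# S510: detect and match feature points between i_RGB and R_RGB.
orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(cv2.cvtColor(i_RGB, cv2.COLOR_BGR2GRAY), None)
kp2, des2 = orb.detectAndCompute(cv2.cvtColor(R_RGB, cv2.COLOR_BGR2GRAY), None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)

# S520: unmatched pixels stay 0; each matched pixel of i_RGB receives the
# depth of the corresponding pixel of R_D (R_RGB and R_D are pixel-aligned).
i_DP = np.zeros(R_D.shape[:2], dtype=R_D.dtype)
for match in matches:
    x, y = map(int, kp1[match.queryIdx].pt)     # (m, n) in i_RGB
    xr, yr = map(int, kp2[match.trainIdx].pt)   # (m', n') in R_RGB
    i_DP[y, x] = R_D[yr, xr]
```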
In this embodiment, fig. 7 shows the feature point detection result on I_RGB, fig. 8 the result on R_RGB, and fig. 9 the feature point matching result. Although the image quality of R_RGB suffers from its low resolution, the effect on the feature points is small: the detected feature points are essentially consistent with those of the high-resolution industrial camera, and the matching result is good.
In this embodiment, fig. 10 shows S_RGB obtained by superpixel segmentation.
S600. Perform region segmentation on i_DP according to S_RGB, obtaining the partitioned depth map; for convenience of description, this map is denoted i_DPA below. i_DPA consists of a number of partitioned regions.
This step prepares the irregular pixel blocks obtained by the superpixel segmentation of S400 for filling: each irregular block is treated as a unit and will be filled with a single depth value. The meaning of this step is thus to partition the sparsely scattered depth values at the feature points of i_DP according to the regions of S_RGB, generating i_DPA; i_DPA is the partitioning result.
S700. Region by region, compute the average of the depth values of all pixels in the region, take that average as the region's depth value, and fill the region with it, obtaining the filled depth map; for convenience of description, this map is denoted i_D below.
It should be noted that in practice, scenes with sparse feature points can contain regions with no feature points at all; such regions are filled with the background gray value 0 (see the sketch below).
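A minimal sketch of S600-S700, continuing the arrays of the previous sketches: the sparse depths in i_DP are grouped by the superpixel labels S_RGB, and each region is filled with the mean of the matched depths it contains; regions without feature points keep the background value 0, as noted above.

```python
import numpy as np

# S600 + S700: partition i_DP by superpixel label and fill each region
# with the average depth of the feature points inside it.
i_D = np.zeros_like(i_DP)
for label in np.unique(S_RGB):
    region = (S_RGB == label)        # one irregular pixel block of S_RGB
    depths = i_DP[region]
    valid = depths[depths > 0]       # depth values transferred in S520
    if valid.size > 0:
        i_D[region] = int(valid.mean())
```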
S800. Up-sample i_D so that its resolution is raised to that of I_RGB, obtaining the monocular camera depth map; for convenience of description, this map is denoted I_D below. I_D is then output as the result of the depth map generation method; a sketch follows.
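S800 is again a single resize, now upward; a sketch under the same assumptions. Nearest-neighbour interpolation is an assumed choice that keeps the per-region fills crisp instead of blending depth values across region borders.

```python
import cv2

# S800: restore the industrial camera's full resolution.
H, W = I_RGB.shape[:2]
I_D = cv2.resize(i_D, (W, H), interpolation=cv2.INTER_NEAREST)
```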
It should be noted in particular that, except for the areas where no feature points were detected and the areas that lie outside what the RealSense camera 2 can image, the depth values of I_D correspond point-to-point with those of I_RGB.
In this example, I_D is shown in fig. 11.
The black areas in fig. 6 and 11 are both depth missing areas.
Comparing fig. 6 and fig. 11, it can be seen that the R_D generated automatically by the RealSense camera 2 is relatively complete, with depth missing only in small areas at the two sides of the image, whereas the depth map generated by this algorithm has more missing areas; those areas, however, are mainly distributed around the border of the image.
The depth missing of this algorithm has two main causes:
1. Although the industrial camera and the RealSense camera 2 are mounted side by side, their optical axes are not perfectly aligned, so the captured pictures do not completely overlap, causing depth loss around the periphery.
2. Some areas are smooth and lack feature points; this can occur both in the middle and around the periphery of the image. As figs. 7 to 11 show, apart from these depth-missing regions, the regions that do have depth give good results.
It can therefore be concluded that the invention remedies the defects of the prior art well and can markedly improve the precision of industrial detection or recognition.
A final supplementary note on fig. 12, which schematically shows how each image changes along the algorithm flow: in the figure, the unmatched pixels of i_DP are shown in white to represent the image more clearly, whereas in the actual algorithm, as described in S520, their gray value is set to 0; likewise, the region dividing lines drawn in i_DPA are only there to express the meaning of the algorithm more clearly, whereas in the actual algorithm, as described in S600, there are no dividing lines.
In the foregoing detailed description, various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments of the subject matter require more features than are expressly recited in each claim. Rather, as the following claims reflect, invention lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate preferred embodiment of the invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
What has been described above includes examples of one or more embodiments. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the aforementioned embodiments, but one of ordinary skill in the art may recognize that many further combinations and permutations of various embodiments are possible. Accordingly, the embodiments described herein are intended to embrace all such alterations, modifications and variations that fall within the scope of the appended claims. Furthermore, to the extent that the term "includes" is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term "comprising" as "comprising" is interpreted when employed as a transitional word in a claim. Furthermore, any use of the term "or" in the specification of the claims is intended to mean a "non-exclusive or".
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (7)

1. A monocular camera-based depth map generation device, characterized in that it comprises a monocular camera (1) and a RealSense camera (2), wherein:
the monocular camera (1) is mounted on the quick-release plate of a first pan/tilt head (3); the base of the first pan/tilt head (3) is fixedly mounted on a pan/tilt fixing plate;
the RealSense camera (2) is mounted on the quick-release plate of a second pan/tilt head (4); the base of the second pan/tilt head (4) is fixedly mounted on the pan/tilt fixing plate;
the monocular camera (1) fits closely against the RealSense camera (2); the optical axis of the monocular camera (1) is parallel to the optical axis of the RealSense camera (2).
2. The monocular camera-based depth map generation device according to claim 1, characterized in that the monocular camera (1) and the RealSense camera (2) fit closely in the horizontal direction.
3. The monocular camera-based depth map generation device according to claim 1, characterized in that the monocular camera (1) and the RealSense camera (2) fit closely in the vertical direction.
4. The monocular camera-based depth map generation device according to claim 2 or 3, characterized in that the monocular camera (1) is mounted on the quick-release plate of the first pan/tilt head (3) through a conversion frame (6) made of a flexible material for cushioning and shock absorption, and the RealSense camera (2) is mounted on the quick-release plate of the second pan/tilt head (4) through a conversion frame (6) of the same material.
5. The monocular camera-based depth map generation device according to claim 4, characterized in that the monocular camera (1) is fitted with heat sinks (7) on its top, bottom, left, and right sides.
6. A monocular camera-based depth map generation method using the depth map generation device according to any one of claims 1 to 5, characterized by comprising the following steps:
S100. aiming the optical axes of the monocular camera (1) and the RealSense camera (2) at an image acquisition target simultaneously;
S200. collecting one monocular camera RGB image of the image acquisition target with the monocular camera (1), and collecting one RealSense camera RGB image and one RealSense camera depth map of the image acquisition target with the RealSense camera (2),
wherein the pixels of the RealSense camera RGB image correspond one-to-one with the pixels of the RealSense camera depth map;
S300. down-sampling the monocular camera RGB image so that its resolution is reduced to that of the RealSense camera RGB image, obtaining the down-sampled monocular camera RGB image;
S400. performing a superpixel segmentation operation on the down-sampled monocular camera RGB image, obtaining the segmented monocular camera RGB image;
S500. performing a feature point matching operation between the down-sampled monocular camera RGB image and the RealSense camera RGB image, obtaining the matching depth map;
S600. performing region segmentation on the matching depth map according to the segmented monocular camera RGB image, obtaining the partitioned depth map, the partitioned depth map consisting of a number of partitioned regions;
S700. region by region, computing the average of the depth values of all pixels in each partitioned region, taking that average as the region's depth value, and filling the region with it, obtaining the filled depth map;
S800. up-sampling the filled depth map so that its resolution is raised to that of the monocular camera RGB image, obtaining the monocular camera depth map, and then outputting the monocular camera depth map as the result of the depth map generation method.
7. The monocular camera-based depth map generation method according to claim 6, characterized in that the feature point matching operation in S500 specifically comprises the following operations:
S510. for each pixel of the down-sampled monocular camera RGB image, searching the RealSense camera RGB image for a pixel that matches it;
S520. according to the search result:
if a pixel of the down-sampled monocular camera RGB image has a matching pixel in the RealSense camera RGB image, assigning that pixel the depth value of the corresponding pixel in the RealSense camera depth map;
otherwise, setting the gray value of that pixel of the down-sampled monocular camera RGB image to 0.
CN202110281368.XA 2021-03-16 2021-03-16 Depth map generation device based on monocular camera Active CN112907559B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110281368.XA CN112907559B (en) 2021-03-16 2021-03-16 Depth map generation device based on monocular camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110281368.XA CN112907559B (en) 2021-03-16 2021-03-16 Depth map generation device based on monocular camera

Publications (2)

Publication Number Publication Date
CN112907559A true CN112907559A (en) 2021-06-04
CN112907559B CN112907559B (en) 2022-06-07

Family

ID=76105192

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110281368.XA Active CN112907559B (en) 2021-03-16 2021-03-16 Depth map generation device based on monocular camera

Country Status (1)

Country Link
CN (1) CN112907559B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105184784A (en) * 2015-08-28 2015-12-23 西交利物浦大学 Motion information-based method for monocular camera to acquire depth information
US20180189565A1 (en) * 2015-08-28 2018-07-05 Imperial College Of Science, Technology And Medicine Mapping a space using a multi-directional camera
CN105115445A (en) * 2015-09-14 2015-12-02 杭州光珀智能科技有限公司 Three-dimensional imaging system and imaging method based on combination of depth camera and binocular vision
CN109166149A (en) * 2018-08-13 2019-01-08 武汉大学 A kind of positioning and three-dimensional wire-frame method for reconstructing and system of fusion binocular camera and IMU
CN110519502A (en) * 2019-09-24 2019-11-29 远形时空科技(北京)有限公司 A kind of sensor and implementation method having merged depth camera and general camera
CN111242080A (en) * 2020-01-21 2020-06-05 南京航空航天大学 Power transmission line identification and positioning method based on binocular camera and depth camera

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YI ZHANG et al.: "Depth Inpainting Algorithm of RGB-D Camera", IEEE, 31 December 2018 (2018-12-31) *
彭祺, 屠礼芬: "三维模型的空间匹配与拼接" [Spatial matching and stitching of three-dimensional models], 《计算机工程与科学》 [Computer Engineering & Science], 31 March 2017 (2017-03-31) *

Also Published As

Publication number Publication date
CN112907559B (en) 2022-06-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant