CN117788535A - Image pair registration method and device, electronic equipment and storage medium
- Publication number: CN117788535A
- Application number: CN202311841081.3A
- Authority: CN (China)
- Prior art keywords: image, camera, image pair, pixel point
- Classification: Image Analysis
- Legal status: Pending
Abstract
The invention discloses an image pair registration method and apparatus, an electronic device, and a storage medium. The method comprises the following steps: registering a first camera and a second camera using a checkerboard calibration plate, wherein the first camera collects a transmission image corresponding to the light transmitted through a beam splitter, and the second camera collects a reflection image corresponding to the light reflected by the beam splitter; acquiring an image pair of the same target through the registered image pair acquisition device, wherein the image pair comprises a first image and a second image; predicting the optical flow corresponding to each pixel point in a reference image through a deep learning network; and aligning the first image with the second image according to the optical flow corresponding to each pixel point in the reference image. With this technical scheme, the registered beam-splitting image pair acquisition device acquires the image pair, and algorithm-level alignment of the image pair is realized by predicting optical flow, which ensures consistency of the image pair content, realizes accurate registration of the image pair, and improves the image quality of the acquired image pair.
Description
Technical Field
Embodiments of the invention relate to the technical field of image processing, and in particular to an image pair registration method and apparatus, an electronic device, and a storage medium.
Background
When training a deep-learning neural network model, it is often necessary to acquire image pairs as training data, for example dark-bright image pairs at different illumination levels. The existing approach usually uses two cameras to shoot the same scene or object separately. This has some feasibility for image pair acquisition in static scenes, but because different cameras differ in parameters, performance, and shooting angle toward the same object, the content consistency of images shot by different cameras cannot be guaranteed: pixel positions across the images are difficult to match, and the obtained image pair effectively contains errors at many pixel positions. If the scene or object changes dynamically, the difference between the contents shot by the two cameras becomes larger and the registration accuracy between the images becomes lower, which degrades the training effect of the neural network model and the prediction capability of the model.
Disclosure of Invention
The invention provides an image pair registration method and apparatus, an electronic device, and a storage medium, so as to realize accurate registration of image pairs and improve the image quality of acquired image pairs.
In a first aspect, an embodiment of the present invention provides an image pair registration method, including:
registering a first camera and a second camera in an image pair acquisition device using a checkerboard calibration plate, wherein the image pair acquisition device further comprises a beam splitter, the first camera is used for acquiring a transmission image of the light transmitted through the beam splitter, and the second camera is used for acquiring a reflection image of the light reflected by the beam splitter;
acquiring an image pair of the same target through the registered image pair acquisition device, wherein the image pair comprises a first image acquired by the first camera and a second image acquired by the second camera;
predicting the optical flow corresponding to each pixel point in a reference image through a deep learning network, wherein the reference image is the first image or the second image;
and aligning the first image with the second image according to the optical flow corresponding to each pixel point in the reference image.
In a second aspect, an embodiment of the present invention provides an image pair registration apparatus, including:
the first registration module is used for registering a first camera and a second camera in an image pair acquisition device using a checkerboard calibration plate, wherein the image pair acquisition device further comprises a beam splitter, the first camera is used for acquiring a transmission image of the light transmitted through the beam splitter, and the second camera is used for acquiring a reflection image of the light reflected by the beam splitter;
the acquisition module is used for acquiring an image pair of the same target through the registered image pair acquisition device, wherein the image pair comprises a first image acquired by the first camera and a second image acquired by the second camera;
the optical flow prediction module is used for predicting the optical flow corresponding to each pixel point in a reference image through a deep learning network, wherein the reference image is the first image or the second image;
and the second registration module is used for aligning the first image with the second image according to the optical flow corresponding to each pixel point in the reference image.
In a third aspect, an embodiment of the present invention provides an electronic device, including:
one or more processors;
a storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the image pair registration method as described in the first aspect.
In a fourth aspect, embodiments of the present invention also provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the image pair registration method according to the first aspect.
The embodiments of the invention provide an image pair registration method and apparatus, an electronic device, and a storage medium. The method comprises the following steps: registering a first camera and a second camera in an image pair acquisition device using a checkerboard calibration plate, wherein the image pair acquisition device further comprises a beam splitter, the first camera is used for acquiring a transmission image of the light transmitted through the beam splitter, and the second camera is used for acquiring a reflection image of the light reflected by the beam splitter; acquiring an image pair of the same target through the registered image pair acquisition device, wherein the image pair comprises a first image acquired by the first camera and a second image acquired by the second camera; predicting the optical flow corresponding to each pixel point in a reference image through a deep learning network, wherein the reference image is the first image or the second image; and aligning the first image with the second image according to the optical flow corresponding to each pixel point in the reference image. With this technical scheme, the registered beam-splitting image pair acquisition device acquires the image pair, and the two cameras can acquire the reflection image and the transmission image of the same target at the same time, so the content of the acquired image pair is consistent and is not affected by motion or change of the target or the surrounding environment; the method is therefore applicable to both static and dynamic scenes. In addition, on the basis of registering the first camera and the second camera, algorithm-level alignment of the image pair is further realized by predicting optical flow, which improves the registration precision of the image pair and thus the image quality of the acquired image pair.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
Fig. 1 is a flowchart of an image pair registration method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an image pair acquisition device according to an embodiment of the present invention;
Fig. 3 is a flowchart of an image pair registration method according to another embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an image pair registration apparatus according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. Furthermore, embodiments of the invention and features of the embodiments may be combined with each other without conflict. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
Before discussing exemplary embodiments in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart depicts steps as a sequential process, many of the steps may be implemented in parallel, concurrently, or with other steps. Furthermore, the order of the steps may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figures. The processes may correspond to methods, functions, procedures, subroutines, and the like.
It should be noted that the concepts of "first," "second," and the like in the embodiments of the present invention are merely used to distinguish between different devices, modules, units, or other objects, and are not intended to limit the order or interdependence of functions performed by the devices, modules, units, or other objects.
Fig. 1 is a flowchart of an image pair registration method according to an embodiment of the present invention; the embodiment is applicable to registering an acquired image pair. Specifically, the image pair registration method may be performed by an image pair registration apparatus, which may be implemented in software and/or hardware and integrated in an electronic device. The electronic device includes, but is not limited to: devices with image processing capability such as desktop computers, notebook computers, smart phones, and servers.
As shown in fig. 1, the method specifically includes the following steps:
s110, registering a first camera and a second camera in the image pair acquisition equipment by adopting a checkerboard calibration plate, wherein the image pair acquisition equipment further comprises a spectroscope, the first camera is used for acquiring a transmission image of transmission light corresponding to the spectroscope, and the second camera is used for acquiring a reflection image of reflection light corresponding to the spectroscope.
In this embodiment, the image pair acquisition device includes two cameras whose captured images are initially inconsistent in space, so the two cameras must first be physically registered so that the images they capture correspond at the pixel level. The two cameras each shoot the checkerboard calibration plate, yielding two images: the transmission image and the reflection image formed by the light from the checkerboard calibration plate after passing through the beam splitter. These two images can be used to physically register the two cameras, ensuring that the shooting contents of the two cameras are consistent and that the cameras achieve a high matching degree when shooting the same target.
Fig. 2 is a schematic structural diagram of an image pair acquisition device according to an embodiment of the present invention. Specifically, the image pair acquisition device includes a beam splitter, a first camera, and a second camera (camera 1 and camera 2 in Fig. 2, respectively). The two cameras can be fixed on two mutually perpendicular inner walls of a black box by means of optical brackets. In addition, a checkerboard calibration plate needs to be prepared. Through the beam splitter, images of the checkerboard calibration plate can be acquired in pairs: as shown in Fig. 2, the light from the checkerboard calibration plate is split by the beam splitter and captured by camera 1 and camera 2, respectively. A shading film can additionally be used to make the two cameras collect images with different illuminance, for example camera 1 collecting low-illuminance images and camera 2 collecting normal-illuminance images.
It should be noted that this embodiment adopts a beam-splitting image pair acquisition device: the beam splitter divides the light from the same target into two beams, each captured and imaged by its corresponding camera. On this basis, the two cameras can acquire the reflection image and the transmission image of the same target simultaneously, so the acquired images are consistent in content and unaffected by motion or change of the target or the surrounding environment, making the method suitable for both static and dynamic scenes. In addition, the black-white alternation of the checkerboard calibration plate can be exploited to design a matching degree function that guides the registration of the two cameras, so the registration is realized effectively.
S120, acquiring an image pair of the same target through the registered image pair acquisition device, wherein the image pair comprises a first image acquired by a first camera and a second image acquired by a second camera.
The registered image pair acquisition device can acquire an image pair of the same target (such as a specified object or scene). The image acquired by the first camera is called the first image, the image acquired by the second camera is called the second image, and one image of the pair can be selected as the reference image for the subsequent algorithm-level registration. Each pixel point in the reference image (which may be called a source pixel point) has a corresponding pixel point at the corresponding position in the non-reference image (the second image if the reference image is the first image, and vice versa), and the pixel points in the area surrounding the corresponding pixel point have a certain similarity to the source pixel point.
S130, predicting optical flow corresponding to each pixel point in a reference image through a deep learning network, wherein the reference image is the first image or the second image.
In this embodiment, image pair acquisition can begin once the physical registration of the two cameras is complete. However, due to camera lenses, limited equipment precision, and similar factors, the acquired image pair still has some inconsistency; the two simultaneously acquired images can be aligned at the algorithm level by predicting optical flow, which further improves the registration precision. Through the deep learning network, the similarity between each source pixel point in the reference image and the corresponding pixel points in the non-reference image can be used to predict the optical flow of each source pixel point. The feature similarity may be determined by comparing differences between the pixel values of the two images, based on cosine similarity, mean-square error (MSE), structural similarity (SSIM), peak signal-to-noise ratio (PSNR), a feature matching algorithm, or the like. On this basis, the feature similarity of each source pixel point can be expressed as a vector, and the vectors of all source pixel points are combined into a feature map.
The deep learning network predicts and outputs the optical flow of each source pixel point from the feature map. Optical flow may be used to estimate the direction and speed of motion of pixels across images; the offset of each pixel can be inferred by analyzing the pixel differences between the images, thereby achieving target tracking and registration. In this embodiment, the deep learning network may be trained in advance on training data so that it can perform optical flow prediction from the feature map with high accuracy.
S140, aligning the first image with the second image according to the optical flow corresponding to each pixel point in the reference image.
The pixel coordinate offset of the reference image relative to the non-reference image can be determined according to the optical flow corresponding to each pixel point in the reference image, so that the two images can be aligned, and high-precision image registration is realized.
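As an illustration of this alignment step, here is a minimal sketch, not the patent's own implementation: it assumes the predicted optical flow arrives as an H x W x 2 array of per-pixel (dx, dy) offsets from the reference image into the non-reference image, and the function name and interface are hypothetical.

```python
import cv2
import numpy as np

def warp_with_flow(non_ref, flow):
    """Resample the non-reference image onto the reference image's grid.

    non_ref: H x W x C image; flow: H x W x 2 array where flow[y, x]
    holds the (dx, dy) offset of reference pixel (x, y) into non_ref.
    """
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    # Bilinear lookup: output pixel (x, y) samples non_ref at (x+dx, y+dy).
    return cv2.remap(non_ref, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```

After this resampling, each pixel of the warped non-reference image covers the same scene point as the reference pixel at the same coordinates, which is what the alignment in S140 requires.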
According to the image pair registration method provided by this embodiment of the invention, the registered beam-splitting image pair acquisition device acquires the image pair, and the two cameras can acquire the reflection image and the transmission image of the same target at the same time. This ensures that the content of the acquired image pair is consistent and unaffected by motion or change of the target or the surrounding environment, so the method is applicable to both static and dynamic scenes. In addition, on the basis of registering the first camera and the second camera, algorithm-level alignment of the image pair is further realized by predicting optical flow, which improves the registration precision of the image pair and thus the image quality of the acquired image pair.
In an embodiment, the image pair may serve as training data for a neural network model. Such a model may be used to change the brightness of an image, for example to improve the visibility of dim video. In general, the dim-bright image pairs used as training data are synthesized, so a model trained on them cannot be applied well to dim video collected in real scenes; when the scene is completely stationary, a dim image is easy to collect and a bright image can be obtained by lengthening the exposure time, but this approach cannot be applied to dynamic scenes. In this embodiment, the image pair acquisition device further includes a shading film. By placing the shading film on one side of the beam splitter, the first camera and the second camera can simultaneously acquire images of the same target at different illuminance, so image pairs of different illuminance (lower and higher illuminance, or dim-bright) can be collected and used as training data for a neural network model for illumination conversion. On this basis, the acquired images are consistent in content, meet practical requirements because they come from real scenes, and are applicable to both static and dynamic scenes, giving the method wide applicability.
Fig. 3 is a flowchart of an image pair registration method according to another embodiment of the present invention. This embodiment refines the image pair registration method on the basis of the foregoing embodiment and describes the registration process in detail. Technical details not described in this embodiment may be found in any of the above embodiments.
In this embodiment, realizing image pair registration mainly involves two registration processes: physical registration of the cameras (S210-S230) and algorithm-level registration of the image pair (S250-S270). The algorithm-level registration may be performed after each acquisition of an image pair, while the physical registration of the cameras may be performed once after the image pair acquisition device is assembled, periodically, or on demand; this embodiment does not limit this.
Specifically, in one embodiment, as shown in fig. 3, the method specifically includes the following steps:
s210, calibrating the first camera so that an imaging plane of the first camera is parallel to a plane where the checkerboard calibration plate is located.
In this embodiment, calibrating the first camera so that the imaging plane of the first camera is parallel to the plane where the checkerboard calibration plate is located includes:
calibrating an intrinsic matrix of the first camera, adjusting the position of the first camera in the image pair acquisition device until the transmission image meets a first preset condition, and determining a first extrinsic matrix of the first camera based on the intrinsic matrix of the first camera;
and adjusting the angle of the first camera according to the first extrinsic matrix so that the imaging plane of the first camera is parallel to the plane where the checkerboard calibration plate is located.
Specifically, considering that the images acquired by the two cameras are spatially inconsistent, the two cameras are physically registered with the checkerboard calibration plate so that the images they acquire correspond one-to-one at the pixel level. First, the checkerboard calibration plate is placed perpendicular to the ground, and the transmittance ratio of the beam splitter can be selected according to the illumination requirement. The first camera (camera 1) is calibrated to obtain its intrinsic matrix (the intrinsics are the parameters needed to project a scene point onto the imaging plane, including focal length, principal point position, pixel size, and the like). Referring to Fig. 2, after calibrating the intrinsic matrix of the first camera, the position of the first camera is adjusted until the transmission image it acquires meets a first preset condition, and the first extrinsic matrix of the first camera is determined based on the intrinsic matrix. In one embodiment, the first preset condition is that the checkerboard calibration plate shot by the first camera lies in the central area of the transmission image and occupies an area ratio greater than a preset threshold. The extrinsic matrix describes the position and pose of the camera in the world coordinate system, for example as a rotation matrix and a translation vector; according to the first extrinsic matrix, the angle of the first camera can be adjusted so that its imaging plane is parallel to the plane of the checkerboard calibration plate.
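To make this step concrete, here is a sketch of intrinsic calibration and extrinsic estimation using OpenCV. The checkerboard pattern size and square size are assumed values, not given in the patent, and the check for the first preset condition (plate centered with sufficient area ratio) would be implemented separately.

```python
import cv2
import numpy as np

PATTERN = (9, 6)       # inner-corner count of the checkerboard (assumed)
SQUARE_SIZE = 20.0     # edge length of one square in mm (assumed)

# 3D corner coordinates on the calibration plate (its own z = 0 plane).
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_SIZE

def calibrate_intrinsics(images):
    """Estimate the intrinsic matrix K and distortion coefficients
    from several views of the checkerboard."""
    obj_pts, img_pts, size = [], [], None
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, PATTERN)
        if found:
            obj_pts.append(objp)
            img_pts.append(corners)
            size = gray.shape[::-1]
    _, K, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)
    return K, dist

def estimate_extrinsics(img, K, dist):
    """Rotation and translation of the plate relative to the camera.
    R close to the identity means the imaging plane is (nearly)
    parallel to the calibration plate, which is the adjustment target."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    assert found, "calibration plate not detected"
    _, rvec, tvec = cv2.solvePnP(objp, corners, K, dist)
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec
```

The same routine applies to the second camera in S220 below, with the reflection image in place of the transmission image.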
S220, calibrating the second camera so that the imaging plane of the second camera is parallel to the plane where the reflected image of the checkerboard calibration plate is located.
In this embodiment, calibrating the second camera so that its imaging plane is parallel to the plane of the reflected image of the checkerboard calibration plate includes: calibrating an intrinsic matrix of the second camera, adjusting the position of the second camera in the image pair acquisition device until the reflection image meets a second preset condition, and determining a second extrinsic matrix of the second camera based on the intrinsic matrix of the second camera; and adjusting the angle of the second camera according to the second extrinsic matrix so that the imaging plane of the second camera is parallel to the plane where the reflected image of the checkerboard calibration plate is located.
The second camera (camera 2) is calibrated to obtain its intrinsic matrix (the intrinsics are the parameters needed to project a scene point onto the imaging plane, including focal length, principal point position, pixel size, and the like). Referring to Fig. 2, after calibrating the intrinsic matrix of the second camera, the position of the second camera is adjusted until the reflection image it acquires meets a second preset condition, and the second extrinsic matrix of the second camera is determined based on the intrinsic matrix. In one embodiment, the second preset condition is that the checkerboard calibration plate in the reflection image lies in the central area of the reflection image and occupies an area ratio greater than a preset threshold. According to the second extrinsic matrix, the angle of the second camera can be adjusted so that its imaging plane is parallel to the plane of the reflected image of the checkerboard calibration plate.
S230, fixing the position of the first camera; moving the second camera while keeping its imaging plane parallel to the plane where the reflected image of the checkerboard calibration plate is located, until the matching value between the transmission image acquired by the first camera and the reflection image acquired by the second camera reaches its maximum; and then fixing the position of the second camera, thereby realizing the registration of the first camera and the second camera, wherein the matching value is calculated based on an absolute value loss function.
The absolute value loss function (L1 loss) may be used to represent the matching degree of the images acquired by the first camera and the second camera: because of the black-white alternation of the checkerboard, the lower the L1 loss value, the higher the matching degree. In this embodiment, given the specific structure of the checkerboard image, the matching degree function is convex within a certain spatial range, so every step that moves toward the peak of the matching degree raises the matching degree. The registration of the image pair acquisition device is completed by fixing the position of the first camera and moving the second camera while keeping its imaging plane parallel to the plane of the reflected image of the checkerboard calibration plate, for example by using a motor to repeatedly adjust the second camera along its three translational degrees of freedom (z, x, y: forward-backward, up-down, left-right) in turn, until the matching degree cannot be improved in any direction. This registration process has simple steps, requires no complex calculation, and places low demands on equipment precision, yet physically registers the two cameras accurately, ensures that the images shot by the two cameras are spatially consistent, and provides a basis for the subsequent acquisition of high-quality image pairs.
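A minimal sketch of this matching value and the per-axis search follows; the capture interface (`capture_pair`) and motor interface (`move_axis`) are hypothetical placeholders for the actual hardware control, and the step size is an assumed constant.

```python
import numpy as np

def match_score(transmitted, reflected):
    """Negative mean absolute difference (negated L1 loss): higher means
    better aligned. The black-white checkerboard makes this score peak
    sharply at pixel-level alignment."""
    return -np.mean(np.abs(transmitted.astype(np.float32)
                           - reflected.astype(np.float32)))

def register_second_camera(capture_pair, move_axis, step=0.1):
    """Greedy hill-climb over the three translational axes.

    capture_pair(): returns the current (transmitted, reflected) images;
    move_axis(axis, delta): commands the motor (hypothetical interface).
    """
    improved = True
    while improved:
        improved = False
        for axis in ("z", "x", "y"):
            for delta in (step, -step):
                before = match_score(*capture_pair())
                move_axis(axis, delta)
                if match_score(*capture_pair()) > before:
                    improved = True           # keep the move
                else:
                    move_axis(axis, -delta)   # undo the move
    # The loop exits when no single-axis step raises the matching score,
    # i.e., the matching value has reached its (local) maximum.
```

Because the matching degree function is convex near alignment, this greedy per-axis search converges to the maximum without complex computation.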
S240, acquiring an image pair of the same target through the registered image pair acquisition equipment.
In this embodiment, after the physical registration of the first camera and the second camera in the image pair acquisition device is completed, an image pair of the same target is acquired by the device: the first image is acquired by the registered first camera and the second image by the registered second camera, and the first image and the second image are then further aligned at the algorithm level.
S250, generating a feature map according to the feature similarity between each pixel point in the reference image and the corresponding region of the non-reference image in the image pair.
In this embodiment, feature similarity is calculated between each source pixel point and the pixel points in the corresponding region of the non-reference image, and a feature map is then generated from the feature similarities of all source pixel points; the feature map serves as input to the deep learning network for optical flow prediction.
In an embodiment, generating a feature map according to feature similarity between each pixel point in the reference image and a corresponding region of the non-reference image in the image pair includes:
s2510, for each pixel point in the reference image, calculating cosine similarity between the pixel point and all pixel points in a corresponding area with a set size in a non-reference image in the image pair, and obtaining a similarity matrix corresponding to the pixel point;
S2520, generating a feature map according to the similarity matrix corresponding to each pixel point.
In this embodiment, the feature similarity between each source pixel point in the reference image and the pixel points in the corresponding region of the non-reference image is measured by cosine similarity, and the feature map is obtained by fusing the similarity matrices of all source pixel points. Specifically, features of the reference image and the non-reference image can each be extracted by a recognition network pre-trained on ImageNet and used as the feature points to be matched between the two images. For each source pixel point of the reference image, the cosine similarity is computed against the features of all pixel points within a window of set size (such as w by w) around the corresponding pixel point in the non-reference image, yielding a similarity matrix for each source pixel point. The set size w may be chosen based on the result of the physical camera registration: if the equipment used for camera registration has low precision, w may take a larger value; if the precision is high, w may take a smaller value.
Optionally, generating the feature map according to the similarity matrix corresponding to each pixel point includes: respectively expanding the similarity matrixes corresponding to the pixel points in sequence to obtain corresponding one-dimensional vectors; and splicing the one-dimensional vectors corresponding to the pixel points in a space domain to obtain a feature map.
Specifically, the similarity matrix of each source pixel point is unfolded in order into a one-dimensional vector, and all the one-dimensional vectors are spliced in the spatial domain to obtain an overall similarity feature map, which is used as the input of the deep learning network; the network then predicts the optical flow of each source pixel point. Unfolding and splicing the similarity matrices in the spatial domain lets the deep learning network take the offsets of surrounding pixel points into account, which avoids outlier offsets in the prediction result, makes the predicted optical flow smoother overall, and improves the accuracy of optical flow prediction.
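The following sketch illustrates this feature-map construction. It assumes per-pixel feature vectors of dimension C have already been extracted for both images (the feature backbone itself is out of scope here) and uses a w by w search window; the output stacks each pixel's unrolled similarity matrix over the spatial grid, matching the splicing described above.

```python
import numpy as np

def build_similarity_map(ref_feat, nonref_feat, w=5):
    """Per-pixel cosine similarity over a w x w search window.

    ref_feat, nonref_feat: (H, W, C) feature maps of the reference and
    non-reference images. Returns an (H, W, w*w) array: each pixel's
    similarity matrix unrolled into a one-dimensional vector, so the
    spatial grid of these vectors is the network's input feature map.
    """
    h, wid, c = ref_feat.shape
    r = w // 2
    padded = np.pad(nonref_feat, ((r, r), (r, r), (0, 0)), mode="edge")
    eps = 1e-8
    out = np.zeros((h, wid, w * w), np.float32)
    for y in range(h):
        for x in range(wid):
            f = ref_feat[y, x]
            patch = padded[y:y + w, x:x + w].reshape(-1, c)  # (w*w, C)
            out[y, x] = patch @ f / (np.linalg.norm(patch, axis=1)
                                     * np.linalg.norm(f) + eps)
    return out
```

In practice this double loop would be vectorized, or computed as a correlation layer on the GPU, but the result is the same (H, W, w*w) feature map.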
S260, inputting the feature map to a deep learning network, and predicting the optical flow corresponding to each pixel point in the reference image through the deep learning network.
S270, aligning the first image with the second image according to the optical flow corresponding to each pixel point in the reference image.
Specifically, according to the optical flow of each source pixel point, the image coordinates of the corresponding pixel point in the non-reference image can be obtained, thereby defining the coordinate transformation relation between the first image and the second image; the first image is aligned with the second image, and the algorithm-level registration of the image pair is completed.
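One way to realize the combination of optical flow and a projection matrix mentioned later in this document is sketched below: sampled flow correspondences are used to fit a robust homography, which then defines the coordinate transformation between the two images. The patent does not spell out this exact procedure, so treat it as one plausible reading rather than the definitive implementation.

```python
import cv2
import numpy as np

def align_with_flow_homography(non_ref, flow, n_samples=2000):
    """Fit a projection (homography) matrix from flow correspondences
    and warp the non-reference image onto the reference image's grid."""
    h, w = flow.shape[:2]
    ys = np.random.randint(0, h, n_samples)
    xs = np.random.randint(0, w, n_samples)
    # Reference pixel (x, y) corresponds to (x + dx, y + dy) in non_ref.
    src = np.stack([xs + flow[ys, xs, 0], ys + flow[ys, xs, 1]], axis=1)
    dst = np.stack([xs, ys], axis=1).astype(np.float32)
    H, _ = cv2.findHomography(src.astype(np.float32), dst,
                              method=cv2.RANSAC, ransacReprojThreshold=3.0)
    return cv2.warpPerspective(non_ref, H, (w, h))
```

RANSAC keeps the projection-matrix prior while discarding outlier flow vectors, which is consistent with the robustness claims made for the combined approach.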
In an embodiment, the registered image pair has higher quality and can be used to train a neural network model. For example, image pairs with different illuminance can be acquired by arranging the shading film on the beam splitter and used as training data, providing a reliable basis for training a neural network model for image illumination processing or conversion (such as a low-illumination enhancement model); this avoids inconsistency between the model's training data and real application-scene data and can improve the performance of the neural network model.
In an embodiment, before registering image pairs with the deep learning network, the method further comprises: training the deep learning network with a structural loss function and/or a cycle-consistency loss function. The deep learning network is a full convolution network, which supports input of any scale. The training data comprise multiple groups of sample image pairs, where each group may comprise a source image and a corresponding target image. When training the deep learning network, the feature map corresponding to the source image is used as input, the network predicts the optical flow, and pixels of the source image are mapped into the other image; a structural loss function (Structural Similarity Loss, SSIM Loss) and/or a cycle-consistency loss function (Cycle-Consistency Loss) is then used to calculate the loss between the mapped image and the target image. The network is trained to minimize this loss, continually adjusting the network parameters until the requirements are met; the resulting deep learning network can accurately predict optical flow with good performance.
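Here is a sketch of one training step under these losses. `flow_net` stands for the full convolution network, `ssim` for an SSIM implementation (for example from the pytorch_msssim package), and the cycle term uses the negated flow as an approximate inverse; the patent does not specify the exact cycle formulation, so that choice is an assumption.

```python
import torch
import torch.nn.functional as F

def warp(img, flow):
    """Bilinear resampling of img (N, C, H, W) by flow (N, 2, H, W)."""
    n, _, h, w = img.shape
    gy, gx = torch.meshgrid(torch.arange(h, device=img.device),
                            torch.arange(w, device=img.device),
                            indexing="ij")
    x = gx.unsqueeze(0) + flow[:, 0]              # (N, H, W)
    y = gy.unsqueeze(0) + flow[:, 1]
    # Normalize sampling coordinates to [-1, 1] for grid_sample.
    grid = torch.stack((2 * x / (w - 1) - 1, 2 * y / (h - 1) - 1), dim=3)
    return F.grid_sample(img, grid, align_corners=True)

def train_step(flow_net, ssim, optimizer, feat_map, src, dst):
    """One step: SSIM loss between the mapped source and the target,
    plus a cycle-consistency term (warp forward, then approximately
    back, and compare with the original source)."""
    flow = flow_net(feat_map)                     # (N, 2, H, W)
    mapped = warp(src, flow)
    ssim_loss = 1 - ssim(mapped, dst)
    cycled = warp(mapped, -flow)                  # approximate inverse flow
    cycle_loss = F.l1_loss(cycled, src)
    loss = ssim_loss + cycle_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Weighting the two loss terms (here simply summed) is another free choice the patent leaves open.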
In other embodiments, other network architectures capable of predicting optical flow may be employed as the deep learning network.
According to the image pair registration method provided by this embodiment of the invention, the cameras are physically registered using a checkerboard calibration plate, exploiting how easily a matching degree can be computed on a checkerboard; the procedure is simple and efficient, lowers the precision requirements on the optical system and the equipment, and saves cost. The algorithm-level registration and subsequent image alignment combine optical flow with a projection matrix, which retains the prior of the projection-matrix method while resolving the misalignment caused by slight image deformation, ensuring the consistency of the image pair and achieving better optical flow prediction and more accurate registration. The deep learning network is built as a full convolution network and trained with a structural loss function and/or a cycle-consistency loss function, giving strong robustness and high precision, realizing accurate registration of the image pair, and improving the image quality of the acquired image pairs.
Fig. 4 is a schematic structural diagram of an image pair registration apparatus according to an embodiment of the present invention. The image pair registration apparatus provided in this embodiment includes:
a first registration module 310, configured to register a first camera and a second camera in an image pair acquisition device using a checkerboard calibration plate, where the image pair acquisition device further includes a beam splitter, the first camera is configured to acquire a transmission image of the light transmitted through the beam splitter, and the second camera is configured to acquire a reflection image of the light reflected by the beam splitter;
the acquisition module 320 is configured to acquire, by using the registered image pair acquisition device, an image pair of the same target, where the image pair includes a first image acquired by the first camera and a second image acquired by the second camera;
the optical flow prediction module 330 is configured to predict, through the deep learning network, an optical flow corresponding to each pixel point in a reference image, where the reference image is a first image or a second image;
the second registration module 340 is configured to align the first image with the second image according to the optical flow corresponding to each pixel point in the reference image.
With the image pair registration apparatus of this embodiment, the registered beam-splitting image pair acquisition device acquires the image pair, and algorithm-level alignment of the image pair is realized by predicting optical flow, which ensures the consistency of the image pair content, realizes accurate registration of the image pair, and improves the image quality of the acquired image pair.
On the basis of the above embodiment, the first registration module 310 includes:
the first calibration unit is used for calibrating the first camera so that an imaging plane of the first camera is parallel to a plane where the checkerboard calibration plate is located;
the second calibration unit is used for calibrating the second camera so that the imaging plane of the second camera is parallel to the plane where the reflected image of the checkerboard calibration plate is located;
and a registration unit, configured to fix the position of the first camera; move the second camera while keeping its imaging plane parallel to the plane where the reflected image of the checkerboard calibration plate is located, until the matching value between the transmission image acquired by the first camera and the reflection image acquired by the second camera reaches its maximum; and then fix the position of the second camera, thereby realizing the registration of the first camera and the second camera, where the matching value is calculated based on an absolute value loss function.
On the basis of the above embodiment, the steps of calibrating the first camera so that its imaging plane is parallel to the plane where the checkerboard calibration plate is located, and calibrating the second camera so that its imaging plane is parallel to the plane where the reflected image of the checkerboard calibration plate is located, include:
calibrating an intrinsic matrix of the first camera, adjusting the position of the first camera in the image pair acquisition device until the transmission image meets a first preset condition, and determining a first extrinsic matrix of the first camera based on the intrinsic matrix of the first camera;
adjusting the angle of the first camera according to the first extrinsic matrix so that the imaging plane of the first camera is parallel to the plane where the checkerboard calibration plate is located;
calibrating an intrinsic matrix of the second camera, adjusting the position of the second camera in the image pair acquisition device until the reflection image meets a second preset condition, and determining a second extrinsic matrix of the second camera based on the intrinsic matrix of the second camera;
and adjusting the angle of the second camera according to the second extrinsic matrix so that the imaging plane of the second camera is parallel to the plane where the reflected image of the checkerboard calibration plate is located.
Based on the above embodiment, the optical flow prediction module 330 includes:
the generation sub-module is used for generating a feature map according to the feature similarity between each pixel point in the reference image and the corresponding region of the non-reference image in the image pair;
And the prediction sub-module is used for inputting the feature map to a deep learning network, and predicting the optical flow corresponding to each pixel point in the reference image through the deep learning network.
On the basis of the above embodiment, the first registration module 310 includes:
a first adjusting unit, configured to shoot, through the first camera, a transmission image of the checkerboard calibration plate passing through the beam splitter; determine a first extrinsic parameter of the first camera when the proportion of the imaging area of the checkerboard calibration plate in the shot picture exceeds a set threshold and the imaging area lies in the middle of the picture; and adjust the position of the first camera according to the first extrinsic parameter so that the imaging plane of the first camera is parallel to the plane where the checkerboard calibration plate is located;
a second adjusting unit, configured to shoot, through the second camera, a reflection image of the checkerboard calibration plate passing through the beam splitter; determine a second extrinsic parameter of the second camera when the proportion of the imaging area of the checkerboard calibration plate in the shot picture exceeds a set threshold and the imaging area lies in the middle of the picture; and adjust the position of the second camera according to the second extrinsic parameter so that the imaging plane of the second camera is parallel to the plane where the reflected image of the checkerboard calibration plate is located;
and a third adjusting unit, configured to adjust the position of the second camera so that the matching value between the first image and the second image reaches its maximum in every direction, where the matching value is calculated based on an absolute value loss function.
On the basis of the above embodiment, generating the submodule includes:
a computing unit, configured to compute, for each pixel point in the reference image, the cosine similarity with all pixel points in a corresponding area of set size in the non-reference image of the image pair, obtaining a similarity matrix corresponding to that pixel point;
and the generating unit is used for generating a feature map according to the similarity matrix corresponding to each pixel point.
On the basis of the above embodiment, the generating unit includes:
the unfolding subunit is used for respectively unfolding the similarity matrixes corresponding to the pixel points in sequence to obtain corresponding one-dimensional vectors;
and a splicing subunit, configured to splice the one-dimensional vectors corresponding to the pixel points in the spatial domain to obtain the feature map.
On the basis of the above embodiment, the image pair acquisition device further includes a shading film disposed on one side of the beam splitter, and the first image and the second image correspond to different illuminance.
On the basis of the above embodiment, the device further includes:
a training module, configured to train the deep learning network with a structural loss function and/or a cycle-consistency loss function;
the deep learning network is a full convolution network and comprises an input layer, a convolution layer, a batch normalization layer and an output layer.
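A minimal PyTorch sketch consistent with the layer list above follows; the depth and channel widths are assumptions, since the patent only names the layer types.

```python
import torch.nn as nn

class FlowNet(nn.Module):
    """Fully convolutional flow predictor: input layer, convolution and
    batch normalization layers, and a two-channel output layer (dx, dy).
    Being all-convolutional, it accepts input of any spatial size."""
    def __init__(self, in_channels, mid_channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, mid_channels, 3, padding=1),  # input layer
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, mid_channels, 3, padding=1), # conv layer
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, 2, 3, padding=1),            # output layer
        )

    def forward(self, x):
        return self.body(x)
```

With the feature map from the similarity construction above, `in_channels` would be w*w (for example 25 when w = 5).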
The image pair registration device provided by the embodiment can be used for executing the image pair registration method provided by any embodiment, and has corresponding functions and beneficial effects.
Fig. 5 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. The electronic device 10 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device 10 may also represent various forms of mobile equipment, such as personal digital assistants, cellular telephones, smartphones, user equipment, wearable devices (e.g., helmets, eyeglasses, watches, etc.), and other similar computing equipment. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks, wireless networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above.
In some embodiments, the methods of the above embodiments may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. One or more steps of the methods described above may be performed when the computer program is loaded into RAM 13 and executed by processor 11. Alternatively, in other embodiments, processor 11 may be configured to perform any of the embodiment methods described above in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, and which can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device 10, the electronic device 10 having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the electronic device 10. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and a server are typically remote from each other and typically interact through a communication network; the client-server relationship arises from computer programs running on the respective computers. The server can be a cloud server, also called a cloud computing server or cloud host, a host product in a cloud computing service system that overcomes the defects of difficult management and weak service scalability found in traditional physical hosts and VPS services.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.
Claims (10)
1. A method of image pair registration, comprising:
registering a first camera and a second camera in image pair acquisition equipment using a checkerboard calibration plate, wherein the image pair acquisition equipment further comprises a beam splitter, the first camera is configured to acquire a transmission image corresponding to light transmitted by the beam splitter, and the second camera is configured to acquire a reflection image corresponding to light reflected by the beam splitter;
acquiring an image pair of the same target through the registered image pair acquisition equipment, wherein the image pair comprises a first image acquired by the first camera and a second image acquired by the second camera;
predicting the optical flow corresponding to each pixel point in a reference image through a deep learning network, wherein the reference image is the first image or the second image;
and aligning the first image with the second image according to the optical flow corresponding to each pixel point in the reference image.
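For illustration, the final alignment step of claim 1 can be sketched as a flow-based warp. The sketch below is not taken from the patent: it assumes the predicted optical flow is stored as a per-pixel (dx, dy) displacement on the reference-image grid and that OpenCV and NumPy are available; the function name and array layout are illustrative choices.

```python
import cv2
import numpy as np

def align_with_flow(non_ref_img: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Warp the non-reference image onto the reference-image grid.

    flow has shape (H, W, 2); flow[y, x] holds the (dx, dy) displacement
    predicted for the reference-image pixel (x, y).
    """
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    # Each reference pixel samples the non-reference image at its
    # flow-displaced location; cv2.remap interpolates bilinearly.
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(non_ref_img, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```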
2. The method of claim 1, wherein registering the first camera and the second camera in the image pair acquisition equipment using the checkerboard calibration plate comprises:
calibrating the first camera so that an imaging plane of the first camera is parallel to a plane where the checkerboard calibration plate is located;
calibrating the second camera so that an imaging plane of the second camera is parallel to a plane where the reflected image of the checkerboard calibration plate is located;
and fixing the position of the first camera, moving the second camera while keeping its imaging plane parallel to the plane of the reflected image of the checkerboard calibration plate until the matching value between the transmission image acquired by the first camera and the reflection image acquired by the second camera reaches its maximum, and then fixing the position of the second camera to complete the registration of the first camera and the second camera, wherein the matching value is calculated based on an absolute-value loss function.
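Claim 2 does not spell out how the matching value is derived from the absolute-value loss function. One plausible reading, sketched below purely as an assumption, is to negate the mean absolute intensity difference, so that the "maximum matching value" sought while moving the second camera corresponds to the minimum L1 loss between the two views (any mirror flip of the reflected view is assumed to be handled before the comparison):

```python
import numpy as np

def matching_value(transmission_img: np.ndarray, reflection_img: np.ndarray) -> float:
    """Negated mean absolute difference (L1 loss) as the matching value."""
    a = transmission_img.astype(np.float32)
    b = reflection_img.astype(np.float32)
    # Larger matching value = smaller L1 loss = better-aligned views.
    return float(-np.mean(np.abs(a - b)))
```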
3. The method of claim 2, wherein calibrating the first camera such that the imaging plane of the first camera is parallel to the plane of the checkerboard calibration plate, and calibrating the second camera such that the imaging plane of the second camera is parallel to the plane of the reflected image of the checkerboard calibration plate, comprises:
calibrating an intrinsic matrix of the first camera, adjusting the position of the first camera in the image pair acquisition equipment until the transmission image satisfies a first preset condition, and determining a first extrinsic matrix of the first camera based on the intrinsic matrix of the first camera;
adjusting the angle of the first camera according to the first extrinsic matrix so that the imaging plane of the first camera is parallel to the plane where the checkerboard calibration plate is located;
calibrating an intrinsic matrix of the second camera, adjusting the position of the second camera in the image pair acquisition equipment until the reflected image satisfies a second preset condition, and determining a second extrinsic matrix of the second camera based on the intrinsic matrix of the second camera;
and adjusting the angle of the second camera according to the second extrinsic matrix so that the imaging plane of the second camera is parallel to the plane where the reflected image of the checkerboard calibration plate is located.
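The parallelism condition in claim 3 can be illustrated with standard OpenCV calibration primitives: once the intrinsic matrix is known, the extrinsic pose recovered from a checkerboard view gives the plate normal in camera coordinates, and the imaging plane is parallel to the plate exactly when that normal aligns with the optical axis. The sketch below is illustrative only; the 9x6 inner-corner pattern and 25 mm square size are assumptions, not values from the patent.

```python
import cv2
import numpy as np

def plate_tilt_deg(img_gray, K, dist, pattern_size=(9, 6), square_mm=25.0):
    """Angle between the camera imaging plane and the checkerboard plane.

    K and dist are the camera's intrinsic matrix and distortion
    coefficients, obtained beforehand (e.g. with cv2.calibrateCamera);
    a tilt of 0 degrees means the two planes are parallel.
    """
    found, corners = cv2.findChessboardCorners(img_gray, pattern_size)
    if not found:
        raise RuntimeError("checkerboard not detected")
    # 3-D corner coordinates in the board frame (the board is its z=0 plane).
    objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern_size[0],
                           0:pattern_size[1]].T.reshape(-1, 2) * square_mm
    # Extrinsic pose of the board relative to the camera.
    _, rvec, _ = cv2.solvePnP(objp, corners, K, dist)
    R, _ = cv2.Rodrigues(rvec)
    # R[:, 2] is the board normal in camera coordinates; the imaging
    # plane's normal is the optical axis (0, 0, 1), so cos(tilt) = |R[2, 2]|.
    return float(np.degrees(np.arccos(np.clip(abs(R[2, 2]), 0.0, 1.0))))
```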
4. The method of claim 1, wherein predicting the optical flow corresponding to each pixel point in the reference image through the deep learning network comprises:
generating a feature map according to the feature similarity between each pixel point in the reference image and the corresponding region of the non-reference image in the image pair;
inputting the feature map into the deep learning network, and predicting the optical flow corresponding to each pixel point in the reference image through the deep learning network.
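The patent does not disclose the architecture of the deep learning network. As a hedged stand-in, a small convolutional head that maps the similarity feature map (k*k channels per pixel point, built as in claims 5 and 6) to a two-channel flow field might look like the following; every layer width here is an arbitrary illustrative choice:

```python
import torch
import torch.nn as nn

class FlowHead(nn.Module):
    """Illustrative stand-in for the deep learning network of claim 4."""

    def __init__(self, k: int = 9):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(k * k, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 2, 3, padding=1),  # (dx, dy) per reference pixel
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, k*k, H, W) similarity feature map -> (B, 2, H, W) flow.
        return self.net(feat)
```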
5. The method of claim 4, wherein generating a feature map according to the feature similarity between each pixel point in the reference image and the corresponding region of the non-reference image in the image pair comprises:
for each pixel point in the reference image, calculating the cosine similarity between the pixel point and every pixel point in a corresponding region of set size in the non-reference image of the image pair, to obtain a similarity matrix corresponding to the pixel point;
and generating a feature map according to the similarity matrix corresponding to each pixel point.
6. The method of claim 5, wherein generating the feature map according to the similarity matrix corresponding to each pixel point comprises:
flattening, in sequence, the similarity matrices corresponding to the pixel points into corresponding one-dimensional vectors;
and concatenating the one-dimensional vectors corresponding to the pixel points over the spatial domain to obtain the feature map.
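Claims 5 and 6 together describe what is commonly called a correlation volume. A compact PyTorch sketch follows; it assumes per-pixel feature vectors have already been extracted by some backbone (the patent does not specify one) and a k x k search region, both of which are assumptions for illustration. Because the features are unit-normalized first, the dot product over the channel dimension is exactly the cosine similarity, and unfold already delivers each pixel's k x k similarity matrix unrolled into a one-dimensional vector, stacked over the spatial domain:

```python
import torch
import torch.nn.functional as F

def similarity_feature_map(ref_feat: torch.Tensor,
                           other_feat: torch.Tensor,
                           k: int = 9) -> torch.Tensor:
    """Cosine-similarity feature map over a k x k search region.

    ref_feat / other_feat: (B, C, H, W) per-pixel feature vectors.
    Returns (B, k*k, H, W): channel i is the similarity between each
    reference pixel and the i-th position of its search window.
    """
    ref_n = F.normalize(ref_feat, dim=1)      # unit length, so the dot
    other_n = F.normalize(other_feat, dim=1)  # product is cosine similarity
    B, C, H, W = ref_n.shape
    # Gather the k x k neighbourhood around every non-reference pixel:
    # unfold yields (B, C*k*k, H*W), reshaped to (B, C, k*k, H, W).
    windows = F.unfold(other_n, kernel_size=k, padding=k // 2)
    windows = windows.view(B, C, k * k, H, W)
    return (ref_n.unsqueeze(2) * windows).sum(dim=1)
```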
7. The method of any one of claims 1-6, wherein the image pair acquisition equipment further comprises a light-shielding film disposed on one side of the beam splitter, such that the first image and the second image correspond to different illuminations.
8. An image pair registration apparatus, comprising:
a first registration module, configured to register a first camera and a second camera in image pair acquisition equipment using a checkerboard calibration plate, wherein the image pair acquisition equipment further comprises a beam splitter, the first camera is configured to acquire a transmission image corresponding to light transmitted by the beam splitter, and the second camera is configured to acquire a reflection image corresponding to light reflected by the beam splitter;
an acquisition module, configured to acquire an image pair of the same target through the registered image pair acquisition equipment, wherein the image pair comprises a first image acquired by the first camera and a second image acquired by the second camera;
an optical flow prediction module, configured to predict the optical flow corresponding to each pixel point in a reference image through a deep learning network, wherein the reference image is the first image or the second image;
and a second registration module, configured to align the first image with the second image according to the optical flow corresponding to each pixel point in the reference image.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the image pair registration method as claimed in any one of claims 1-7.
10. A computer readable storage medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements the image pair registration method as claimed in any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311841081.3A CN117788535A (en) | 2023-12-28 | 2023-12-28 | Image pair registration method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117788535A (en) | 2024-03-29
Family
ID=90383144
Family Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---|
CN202311841081.3A Pending CN117788535A (en) | 2023-12-28 | 2023-12-28 | Image pair registration method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117788535A (en) |
Similar Documents
Publication | Title
---|---
US20230419437A1 | Systems and methods for fusing images
US8723926B2 | Parallax detecting apparatus, distance measuring apparatus, and parallax detecting method
JP5075757B2 | Image processing apparatus, image processing program, image processing method, and electronic apparatus
US20210044725A1 | Camera-specific distortion correction
CN112862877B | Method and apparatus for training an image processing network and image processing
KR20200044676A | Method and apparatus for active depth sensing and calibration method thereof
JP2009230537A | Image processor, image processing program, image processing method, and electronic equipment
JP2018503066A | Accuracy measurement of image-based depth detection system
CN112991180A | Image splicing method, device, equipment and storage medium
US11803982B2 | Image processing device and three-dimensional measuring system
CN114125411B | Projection device correction method, projection device correction device, storage medium and projection device
US20190355101A1 | Image refocusing
CN113643414A | Three-dimensional image generation method and device, electronic equipment and storage medium
US8817246B2 | Lens test device and method
CN113724391A | Three-dimensional model construction method and device, electronic equipment and computer readable medium
CN110658918B | Positioning method, device and medium for eyeball tracking camera of video glasses
CN116347056A | Image focusing method, device, computer equipment and storage medium
US20220012905A1 | Image processing device and three-dimensional measuring system
JP2009301181A | Image processing apparatus, image processing program, image processing method and electronic device
CN112752088B | Depth image generation method and device, reference image generation method and electronic equipment
CN117788535A | Image pair registration method and device, electronic equipment and storage medium
CN114255177B | Exposure control method, device, equipment and storage medium in imaging
JP2013148467A | Measurement device, method, and program
CN117336459B | Three-dimensional video fusion method and device, electronic equipment and storage medium
JP2005216191A | Stereo image processing apparatus and method
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination