CN116503299A - Method, system and storage medium for fusing targets among vehicle-mounted multiple cameras - Google Patents
Method, system and storage medium for fusing targets among vehicle-mounted multiple cameras
- Publication number
- CN116503299A (application number CN202310483852.XA)
- Authority
- CN
- China
- Prior art keywords
- camera
- target
- target frame
- frame
- vehicle
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G06T5/80—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T7/74—Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20092—Interactive image processing based on input by user
- G06T2207/20104—Interactive definition of region of interest [ROI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a method, a system and a storage medium for fusing targets among vehicle-mounted multiple cameras, wherein the method comprises the following steps. Step 1: acquire the images shot by the vehicle-mounted cameras at the same moment, the internal reference matrix of each camera, the external reference matrix between adjacent cameras and the baseline distance between adjacent cameras, and perform target detection with a neural network to obtain all targets on each image. Step 2: determine the non-overlapping target frames. Step 3: uniformly sample the region inside each non-overlapping target frame to obtain a sampling point set, and reproject the sampling point set onto the adjacent camera image according to an epipolar constraint error minimization criterion to obtain a reprojected rectangular frame. Step 4: logically judge the current camera's original target frame, the reprojected rectangular frame and the adjacent camera's target frame, and fuse targets according to the judgment result. Step 5: traverse all cameras and repeat steps 2-4 to process the target frames under all cameras. The method can efficiently and accurately judge the uniqueness of targets among multiple cameras, and solves the problem of multi-camera fusion for occluded targets.
Description
Technical Field
The invention relates to the technical field of target fusion, in particular to a method, a system and a storage medium for fusing targets among vehicle-mounted multiple cameras.
Background
Vision, as a core sensing means of intelligent driving, can intuitively perceive dynamic and static targets during vehicle driving and ensure the safe operation of the vehicle. Because the field of view of a single camera is limited, multiple cameras generally need to be arranged to perceive the full 360-degree surroundings of the vehicle, and the images of adjacent cameras usually have overlapping areas. On the premise of ensuring full 360-degree perception, judging the uniqueness of a target in the overlapping area efficiently and accurately is also important for the planning and control of vehicle driving.
In the prior art, Chinese patent publication No. CN113077511B discloses a multi-camera target matching and tracking method and device for an automobile. That method projects image targets onto the vehicle coordinate system using a projection matrix between the camera coordinate system and the vehicle coordinate system, and matches targets across multiple video streams by calculating differences in three-dimensional position coordinates together with appearance similarity. As an industry consensus, a monocular image lacks depth information, so full-image depth cannot be recovered through a projection matrix to compute three-dimensional coordinates (x, y, z) in a world coordinate system; however, if the world coordinate z = 0 of a pixel is known (such a pixel is generally called a grounding point), its x and y values in the world coordinate system can be calculated through the projection matrix. That method treats the vehicle coordinate system as a world coordinate system lying on flat ground; its coordinate calculation essentially assumes that the bottom edge of the target frame lies on the ground plane and that the ground is horizontal, so that the x and y values of the grounding point at the bottom edge of the target frame can be obtained in the vehicle coordinate system. However, during actual driving the road surface is usually not a standard horizontal plane, and the bottom edge of a target frame is not necessarily located on the driving surface: for example, a traffic sign sits above the road, and a sidewalk is higher than the driving surface. Coordinates calculated in this way therefore have large errors.
To improve accuracy, that method further uses image similarity as a basis for judgment; however, when target frames occlude each other, the image-similarity approach is computationally expensive and still cannot accurately distinguish the occluded targets.
Therefore, that method has limitations in use. What vehicle driving requires is an efficient and accurate multi-camera target fusion method that adapts to different roads, does not restrict the position of the target in the image, and can handle targets that are partially occluded.
Disclosure of Invention
The invention aims to solve the problems in the prior art, and provides a method, a system and a storage medium for target fusion among vehicle-mounted multiple cameras, which can efficiently and accurately judge the uniqueness of targets among multiple cameras without restricting the driving conditions of the vehicle, and solve the problem of multi-camera fusion of occluded targets.
To solve the above technical problems, the technical scheme of the invention is as follows:
the invention provides a target fusion method among vehicle-mounted multiple cameras, which comprises the following steps:
step 1: acquiring images shot by the vehicle-mounted multi-camera at the same time, an internal reference matrix of each camera, an external reference matrix of an adjacent camera and a base distance between the adjacent cameras, and performing target detection by using a neural network to acquire all targets on each image;
step 2: selecting each camera to determine a non-overlapping target frame;
step 3: uniformly sampling the area in the non-overlapping target frame to obtain a sampling point set, and re-projecting the sampling point set onto an adjacent camera image according to an epipolar constraint error minimization criterion to obtain a re-projected rectangular frame;
step 4: performing logic judgment on an original target frame, a reprojected rectangular frame and an adjacent camera target frame of the current camera, and performing target fusion processing according to judgment results;
step 5: and traversing all cameras, and repeating the steps 2-4 to process the target frames under all cameras.
Further, the neural network includes: a convolutional neural network, a Transformer network.
Further, each camera is selected to determine a non-overlapping target frame, and the specific steps are as follows:
judging whether overlapping exists among original target frames under the currently selected camera, and if so, acquiring a non-overlapping area of each target frame as a non-overlapping target frame; if there is no overlap, the target frame itself is taken as a non-overlapping target frame.
Further, the sampling point set is reprojected onto an adjacent camera image according to an epipolar constraint error minimization criterion to obtain a reprojected rectangular frame, which comprises the following specific steps:
(a) Carrying out distortion correction on the sampling points to obtain undistorted coordinates Po;
(b) Using the internal reference matrix of each camera and the external reference matrix of the adjacent camera, obtaining the fundamental matrix F between the two cameras, and from F and the undistorted sampling point coordinate Po, calculating the epipolar line equation corresponding to the sampling point Po on the adjacent camera image:
le=FPo
(c) According to the epipolar line equation le, setting a distance threshold Td, and according to the point-to-line distance formula, obtaining all pixel points Pt_i (i = 0, 1, ..., n) on the adjacent image whose distance to the epipolar line is smaller than Td;
(d) Based on the sampling point Po and each pixel point Pt_i, carrying out three-dimensional reconstruction for Pt_i to obtain the three-dimensional coordinates Wt(Xt, Yt, Zt) of the corresponding point in the adjacent camera coordinate system;
(e) According to the external parameter matrices R and T between the current camera and the adjacent camera, converting the three-dimensional coordinates in the adjacent camera coordinate system into the three-dimensional coordinates Wo in the current camera coordinate system;
(f) According to the internal reference matrix of the current camera, projecting the three-dimensional coordinates Wo back onto the current camera image, and recording the reprojected points as Pr_i (i = 0, 1, ..., n);
(g) Calculating the distance between the sampling point Po and each reprojected point Pr_i; the reprojected point with the minimum distance determines the optimal reprojection point Pt;
(h) Determining, from the optimal reprojection points of all the sampling points, their circumscribed (bounding) rectangle as the reprojected rectangular frame of the current camera's non-overlapping target frame on the adjacent camera image.
Further, the expression of three-dimensional reconstruction is:
Zt = B*f/d
Xt = Zt*Pt_i_x/f
Yt = Zt*Pt_i_y/f
where B is the baseline distance between the current camera and the adjacent camera, f is the camera focal length, d is the disparity between the points Po and Pt_i, Pt_i_x is the pixel x coordinate of the point Pt_i, and Pt_i_y is its pixel y coordinate.
Further, the three-dimensional coordinate conversion formula is:
Wo=RWt+T
where Wo represents the three-dimensional coordinates in the current camera coordinate system and Wt the three-dimensional coordinates in the adjacent camera coordinate system.
Further, in step 4, the original target frame of the current camera, the reprojected rectangular frame and the target frame of the adjacent camera are logically judged, and target fusion processing is performed according to the judgment result, specifically:
calculating the intersection-over-union (IoU) of the reprojected rectangular frame and the adjacent camera target frame, and setting the IoU threshold as Tiou;
if the IoU of the adjacent camera target frame with the reprojected rectangular frame is larger than the threshold Tiou, the adjacent camera target frame and the current camera original target frame corresponding to the reprojected frame are judged to be the same target, and target fusion is carried out;
if the IoU of the adjacent camera target frame with the reprojected rectangular frame is smaller than or equal to the threshold Tiou, the adjacent camera target frame and the current camera original target frame corresponding to the reprojected rectangular frame are not the same target, and target fusion is not carried out.
A second aspect of the present invention provides a system for target fusion among vehicle-mounted multiple cameras, the system comprising a memory and a processor, wherein the memory stores a vehicle-mounted multi-camera target fusion program which, when executed by the processor, implements the following steps:
step 1: acquiring images shot by the vehicle-mounted multi-camera at the same time, an internal reference matrix of each camera, an external reference matrix of an adjacent camera and a base distance between the adjacent cameras, and performing target detection by using a neural network to acquire all targets on each image;
step 2: selecting each camera to determine a non-overlapping target frame;
step 3: uniformly sampling the region in the non-overlapping target frame to obtain a sampling point set, and re-projecting the sampling point set onto an adjacent camera image according to an epipolar constraint error minimization criterion to obtain a re-projected rectangular frame;
step 4: performing logic judgment on an original target frame, a reprojected rectangular frame and an adjacent camera target frame of the current camera, and performing target fusion processing according to judgment results;
step 5: and traversing all cameras, and repeating the steps 2-4 to process the target frames under all cameras.
Further, each camera is selected to determine a non-overlapping target frame, and the specific steps are as follows:
judging whether overlapping exists among original target frames under the currently selected camera, and if so, acquiring a non-overlapping area of each target frame as a non-overlapping target frame; if there is no overlap, the target frame itself is taken as a non-overlapping target frame.
A third aspect of the present invention provides a storage medium, the storage medium storing a vehicle-mounted multi-camera target fusion program which, when executed by a processor, implements the steps of the vehicle-mounted multi-camera target fusion method described above.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
the invention obtains the target frame of the reprojection rectangular frame by utilizing the minimum rule according to the epipolar constraint error for the sampling point set, and can efficiently and accurately judge the uniqueness of the target among multiple cameras without limiting the running condition of the vehicle; and meanwhile, the problem of fusion among multiple cameras of the shielding target is solved by uniformly sampling the area in the non-overlapping target frame.
Drawings
Fig. 1 is a flowchart of a method for fusing targets between multiple cameras on a vehicle according to an embodiment of the invention.
FIG. 2 is a flow chart of determining non-overlapping target boxes according to an embodiment of the invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description. It should be noted that, in the case of no conflict, the embodiments of the present application and the features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those described herein, and therefore the scope of the present invention is not limited to the specific embodiments disclosed below.
Example 1
As shown in fig. 1, the first aspect of the present invention provides a method for fusing targets between multiple cameras on a vehicle, comprising the following steps:
step 1: acquiring images shot by the vehicle-mounted multi-camera at the same time, an internal reference matrix of each camera, an external reference matrix of an adjacent camera and a base distance between the adjacent cameras, and performing target detection by using a neural network to acquire all targets on each image;
in the present invention, the reference matrix: the method is used for describing the relation between the three-dimensional world coordinates and the two-dimensional pixel coordinate system, wherein the three-dimensional coordinates can obtain a determined image pixel coordinate through an internal reference matrix, and the pixel coordinate can obtain a plurality of three-dimensional world coordinates of non-unique solutions through the internal reference matrix, which are parameters of each camera; external parameter matrix: three-dimensional coordinate points are converted from one coordinate system to another three-dimensional coordinate system, and pose transformation between different cameras is described. The internal reference matrix and the external reference matrix can be obtained through a camera calibration method, the method is more and universal, and the invention is not limited.
The invention uses a neural network to perform target detection; each target detected in an image corresponds to, and is enclosed by, a target frame. The invention is not limited to a specific neural network: a convolutional one-stage detection network such as the YOLO series may be used, as may a two-stage detection network such as Faster R-CNN and its optimized variants, or a Transformer network such as DETR and its variants.
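Purely as an illustrative sketch (the patent does not prescribe any programming interface), the detection stage of step 1 can be organized as below. The function `detect` is a hypothetical stand-in for the neural-network detector, and the (x1, y1, x2, y2) pixel-box convention is an assumption for illustration:

```python
def detect(image):
    """Hypothetical stand-in for a neural-network detector (e.g. a
    YOLO-style one-stage network or DETR); returns target frames as
    (x1, y1, x2, y2) pixel boxes. Two boxes are fabricated here."""
    return [(100, 120, 180, 260), (150, 130, 230, 270)]

def detect_all_cameras(images_by_camera):
    """Step 1: run detection on the synchronized frame of every camera,
    keeping the per-camera target frames for steps 2-5."""
    return {cam: detect(img) for cam, img in images_by_camera.items()}

# Two adjacent cameras captured at the same moment (image data elided).
targets = detect_all_cameras({"front": None, "front_left": None})
```

In a real system `detect` would wrap the chosen network's inference call; only the per-camera box lists matter for the later steps.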
Step 2: selecting each camera to determine a non-overlapping target frame;
fig. 2 shows a flow of determining non-overlapping target boxes.
It should be noted that, selecting each camera to determine a non-overlapping target frame includes the following specific steps:
judging whether overlapping exists among original target frames under the currently selected camera, and if so, acquiring a non-overlapping area of each target frame as a non-overlapping target frame;
if there is no overlap, the target frame itself is taken as a non-overlapping target frame.
It should be noted that, by determining the non-overlapping target frame and further determining the non-overlapping region, sampling is facilitated for the non-overlapping region to obtain the sampling point set.
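A minimal sketch of this step 2 logic follows. The heuristic of cutting away the smaller overlap slice along x or y is an assumption for illustration (the patent only requires obtaining a non-overlapping region), as is the (x1, y1, x2, y2) box convention:

```python
def intersect(a, b):
    """Intersection rectangle of two (x1, y1, x2, y2) boxes, or None."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return (x1, y1, x2, y2) if x1 < x2 and y1 < y2 else None

def non_overlapping_frame(box, others):
    """Shrink `box` away from every overlapping box; if nothing
    overlaps, the target frame itself is returned unchanged."""
    x1, y1, x2, y2 = box
    for other in others:
        inter = intersect((x1, y1, x2, y2), other)
        if inter is None:
            continue
        ix1, iy1, ix2, iy2 = inter
        if ix2 - ix1 <= iy2 - iy1:      # cheaper to cut along x
            if other[0] > x1:           # overlap on the right edge
                x2 = min(x2, other[0])
            else:                       # overlap on the left edge
                x1 = max(x1, other[2])
        else:                           # cheaper to cut along y
            if other[1] > y1:
                y2 = min(y2, other[1])
            else:
                y1 = max(y1, other[3])
    return (x1, y1, x2, y2)
```

For a box occluded on its right edge, this yields the visible left portion, which is then uniformly sampled in step 3.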
Step 3: uniformly sampling the area in the non-overlapping target frame to obtain a sampling point set, and re-projecting the sampling point set onto an adjacent camera image according to an epipolar constraint error minimization criterion to obtain a re-projected rectangular frame;
it should be noted that, the polar constraint error minimization criterion can accurately establish a re-projection rectangular frame of the target frame under one camera image coordinate system under another camera image coordinate system.
The sampling point set is re-projected onto an adjacent camera image according to an epipolar constraint error minimization criterion to obtain a re-projected rectangular frame, and the specific steps are as follows:
(a) Carrying out distortion correction on the sampling points to obtain undistorted coordinates Po;
(b) Using the internal reference matrix of each camera and the external reference matrix of the adjacent camera, obtaining the fundamental matrix F between the two cameras, and from F and the undistorted sampling point coordinate Po, calculating the epipolar line equation corresponding to the sampling point Po on the adjacent camera image:
le=FPo
(c) According to the epipolar line equation le, setting a distance threshold Td, and according to the point-to-line distance formula, obtaining all pixel points Pt_i (i = 0, 1, ..., n) on the adjacent image whose distance to the epipolar line is smaller than Td;
(d) Based on the sampling point Po and each pixel point Pt_i, carrying out three-dimensional reconstruction for Pt_i to obtain the three-dimensional coordinates Wt(Xt, Yt, Zt) of the corresponding point in the adjacent camera coordinate system;
the expression for three-dimensional reconstruction is:
Zt=B*f/d
Xt=Z*Pt i _x/f
Yt=Z*Pt i _y/f
wherein B is the base distance between the current camera and the adjacent camera, f is the focal length of the camera, d is Po and Pt i Parallax of dot, pt i X is the point Pt i Pixel x coordinate, pt i Y is the point Pt i Is defined as the pixel y coordinate of (c).
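The three formulas above can be checked with a short sketch. It assumes a rectified camera pair (disparity purely along x) and pixel coordinates already expressed relative to the principal point; `triangulate` is an illustrative name, not from the patent:

```python
def triangulate(B, f, po, pt):
    """Depth from disparity: Zt = B*f/d, Xt = Zt*x/f, Yt = Zt*y/f.

    B: baseline distance, f: focal length in pixels, po/pt: matched
    pixel coordinates (relative to the principal point) in the current
    and adjacent image; d is the disparity between them."""
    d = po[0] - pt[0]
    if d == 0:
        raise ValueError("zero disparity: point at infinity")
    Zt = B * f / d
    Xt = Zt * pt[0] / f
    Yt = Zt * pt[1] / f
    return (Xt, Yt, Zt)
```

For example, with B = 0.5 m, f = 1000 px and a 20 px disparity, the point reconstructs at a depth of 25 m.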
(e) According to the external parameter matrices R and T between the current camera and the adjacent camera, converting the three-dimensional coordinates in the adjacent camera coordinate system into the three-dimensional coordinates Wo in the current camera coordinate system;
the three-dimensional coordinate conversion formula is:
Wo=RWt+T
where Wo represents the three-dimensional coordinates in the current camera coordinate system and Wt the three-dimensional coordinates in the adjacent camera coordinate system.
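The conversion Wo = RWt + T is a single affine map. As a minimal sketch (assuming R is a 3×3 rotation matrix and T a 3-vector; NumPy is an implementation choice, not a requirement of the patent):

```python
import numpy as np

def to_current_frame(R, T, Wt):
    """Wo = R @ Wt + T: map a 3-D point from the adjacent camera's
    coordinate system into the current camera's coordinate system."""
    return np.asarray(R, float) @ np.asarray(Wt, float) + np.asarray(T, float)
```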
(f) According to the internal reference matrix of the current camera, projecting the three-dimensional coordinates Wo back onto the current camera image, and recording the reprojected points as Pr_i (i = 0, 1, ..., n);
(g) Calculating the distance between the sampling point Po and each reprojected point Pr_i; the reprojected point with the minimum distance determines the optimal reprojection point Pt;
(h) Determining, from the optimal reprojection points of all the sampling points, their circumscribed (bounding) rectangle as the reprojected rectangular frame of the current camera's non-overlapping target frame on the adjacent camera image.
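The per-sampling-point search of steps (b), (c) and (g) can be sketched as follows. This is a minimal illustration assuming the fundamental matrix F is already known from calibration; `reproject` is a hypothetical callable wrapping steps (d)-(f), i.e. triangulation, frame conversion and projection back onto the current image:

```python
import numpy as np

def epipolar_line(F, po):
    """Step (b): le = F @ Po (homogeneous) gives the line a*x+b*y+c = 0
    on the adjacent image that the match of Po must lie on."""
    a, b, c = np.asarray(F, float) @ np.array([po[0], po[1], 1.0])
    return a, b, c

def line_distance(line, pt):
    """Step (c): point-to-line distance, compared against threshold Td."""
    a, b, c = line
    return abs(a * pt[0] + b * pt[1] + c) / np.hypot(a, b)

def best_match(F, po, candidates, Td, reproject):
    """Steps (c) and (g): keep candidate pixels within Td of the
    epipolar line, then pick the one whose reprojection Pr_i lands
    closest to the original sampling point Po."""
    line = epipolar_line(F, po)
    near = [p for p in candidates if line_distance(line, p) < Td]
    if not near:
        return None
    return min(near, key=lambda p: np.hypot(reproject(p)[0] - po[0],
                                            reproject(p)[1] - po[1]))
```

Running `best_match` over every sampling point and taking the bounding rectangle of the winners yields the reprojected rectangular frame of step (h).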
Step 4: performing logic judgment on an original target frame, a reprojected rectangular frame and an adjacent camera target frame of the current camera, and performing target fusion processing according to judgment results;
it should be noted that, by performing logic judgment on the original target frame, the reprojection rectangular frame and the adjacent camera target frame of the current camera, the target fusion processing is performed according to the judgment result, and the specific process is as follows:
calculating the intersection-over-union (IoU) of the reprojected rectangular frame and the adjacent camera target frame, and setting the IoU threshold as Tiou;
if the IoU of the adjacent camera target frame with the reprojected rectangular frame is larger than the threshold Tiou, the adjacent camera target frame and the current camera original target frame corresponding to the reprojected frame are judged to be the same target, and target fusion is carried out;
if the IoU of the adjacent camera target frame with the reprojected rectangular frame is smaller than or equal to the threshold Tiou, the adjacent camera target frame and the current camera original target frame corresponding to the reprojected rectangular frame are not the same target, and target fusion is not carried out.
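The IoU judgment above can be sketched as a minimal illustration; the (x1, y1, x2, y2) box convention and the function names are assumptions:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    if inter == 0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def same_targets(reproj_box, neighbor_boxes, Tiou):
    """Step 4: neighbor target frames whose IoU with the reprojected
    frame exceeds Tiou are judged to be the same physical target."""
    return [b for b in neighbor_boxes if iou(reproj_box, b) > Tiou]
```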
Step 5: and traversing all cameras, and repeating the steps 2-4 to process the target frames under all cameras.
A second aspect of the present invention provides a system for target fusion among vehicle-mounted multiple cameras, the system comprising a memory and a processor, wherein the memory stores a vehicle-mounted multi-camera target fusion program which, when executed by the processor, implements the following steps:
step 1: acquiring images shot by the vehicle-mounted multi-camera at the same time, an internal reference matrix of each camera, an external reference matrix of an adjacent camera and a base distance between the adjacent cameras, and performing target detection by using a neural network to acquire all targets on each image;
step 2: selecting each camera to determine a non-overlapping target frame;
step 3: uniformly sampling the area in the non-overlapping target frame to obtain a sampling point set, and re-projecting the sampling point set onto an adjacent camera image according to an epipolar constraint error minimization criterion to obtain a re-projected rectangular frame;
step 4: performing logic judgment on an original target frame, a reprojected rectangular frame and an adjacent camera target frame of the current camera, and performing target fusion processing according to judgment results;
step 5: and traversing all cameras, and repeating the steps 2-4 to process the target frames under all cameras.
Further, each camera is selected to determine a non-overlapping target frame, and the specific steps are as follows:
judging whether overlapping exists among original target frames under the currently selected camera, and if so, acquiring a non-overlapping area of each target frame as a non-overlapping target frame; if there is no overlap, the target frame itself is taken as a non-overlapping target frame.
A third aspect of the present invention provides a storage medium, the storage medium storing a vehicle-mounted multi-camera target fusion program which, when executed by a processor, implements the steps of the vehicle-mounted multi-camera target fusion method described above.
It is to be understood that the above examples of the present invention are provided by way of illustration only and do not limit the embodiments of the present invention. Other variations or modifications based on the above teachings will be apparent to those of ordinary skill in the art; it is neither necessary nor possible to exhaustively list all embodiments here. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention is intended to be covered by the protection scope of the claims.
Claims (10)
1. A method for fusing targets among vehicle-mounted multiple cameras, characterized by comprising the following steps:
step 1: acquiring images shot by the vehicle-mounted multi-camera at the same time, an internal reference matrix of each camera, an external reference matrix of an adjacent camera and a base distance between the adjacent cameras, and performing target detection by using a neural network to acquire all targets on each image;
step 2: selecting each camera to determine a non-overlapping target frame;
step 3: uniformly sampling the region in the non-overlapping target frame to obtain a sampling point set, and re-projecting the sampling point set onto an adjacent camera image according to an epipolar constraint error minimization criterion to obtain a re-projected rectangular frame;
step 4: performing logic judgment on an original target frame, a reprojected rectangular frame and an adjacent camera target frame of the current camera, and performing target fusion processing according to judgment results;
step 5: and traversing all cameras, and repeating the steps 2-4 to process the target frames under all cameras.
2. The method for fusing targets among vehicle-mounted multiple cameras of claim 1, wherein the neural network comprises: a convolutional neural network, a Transformer network.
3. The method for merging targets among multiple cameras in a vehicle according to claim 1, wherein selecting each camera to determine a non-overlapping target frame comprises the following specific steps:
judging whether overlapping exists among original target frames under the currently selected camera, and if so, acquiring a non-overlapping area of each target frame as a non-overlapping target frame; if there is no overlap, the target frame itself is taken as a non-overlapping target frame.
4. The method for fusing targets among vehicle-mounted multiple cameras according to claim 1, wherein the sampling point set is reprojected onto the adjacent camera image according to an epipolar constraint error minimization criterion to obtain a reprojected rectangular frame, and the specific steps are as follows:
(a) Carrying out distortion correction on the sampling points to obtain undistorted coordinates Po;
(b) Acquiring the fundamental matrix F between the current camera and the adjacent camera by using the internal reference matrix of each camera and the external reference matrix of the adjacent camera, and, based on F and the undistorted sampling-point coordinate Po, calculating the epipolar line equation corresponding to the sampling point Po on the adjacent camera image:
le = F·Po
(c) According to the epipolar line equation le, setting a distance threshold Td and, using the point-to-line distance formula, obtaining all pixel points Pt_i (i = 0, 1, ..., n) on the adjacent camera image whose distance to the epipolar line is smaller than Td;
(d) Based on each pair of corresponding points (Po, Pt_i), performing three-dimensional reconstruction of the pixel point Pt_i to obtain its three-dimensional coordinates Wt = (Xt, Yt, Zt) in the adjacent camera coordinate system;
(e) According to the external reference matrices R and T between the current camera and the adjacent camera, converting the three-dimensional coordinate Wt in the adjacent camera coordinate system into the three-dimensional coordinate Wo in the current camera coordinate system;
(f) According to the internal reference matrix of the current camera, projecting the three-dimensional coordinates Wo back onto the current camera image to obtain reprojected image coordinates, recording the converted points as Pr_i (i = 0, 1, ..., n);
(g) Calculating the distance between the sampling point Po and each converted point Pr_i, and selecting the candidate with the minimum reprojection distance to determine the optimal reprojection point Pt;
(h) According to the optimal reprojection points of all the sampling points, determining their maximum circumscribed rectangle as the reprojected rectangular frame of the current camera's non-overlapping target frame on the adjacent camera image.
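The epipolar-line search of steps (b)–(c) can be sketched as follows. This is a hedged illustration, not the claimed implementation: it assumes an already-computed fundamental matrix F and an already-undistorted sampling point, and simply collects pixels whose point-to-line distance to le = F·Po is below Td.

```python
import numpy as np

def epipolar_candidates(F, po, pixels, td):
    """Pixels on the adjacent image close to the epipolar line le = F @ Po.

    F: 3x3 fundamental matrix between the current and adjacent cameras.
    po: undistorted (x, y) sampling-point coordinate in the current image.
    pixels: (n, 2) array of candidate pixel coordinates in the adjacent image.
    td: distance threshold Td in pixels.
    """
    # epipolar line coefficients (a, b, c): a*x + b*y + c = 0
    a, b, c = F @ np.array([po[0], po[1], 1.0])
    # point-to-line distance |a*x + b*y + c| / sqrt(a^2 + b^2)
    dist = np.abs(a * pixels[:, 0] + b * pixels[:, 1] + c) / np.hypot(a, b)
    return pixels[dist < td]
```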
5. The method for fusing targets among multiple cameras on a vehicle according to claim 4, wherein the expression of three-dimensional reconstruction is:
Zt = B*f/d
Xt = Zt*Pt_i_x/f
Yt = Zt*Pt_i_y/f
wherein B is the base distance between the current camera and the adjacent camera, f is the focal length of the camera, d is the disparity between the points Po and Pt_i, Pt_i_x is the pixel x coordinate of the point Pt_i, and Pt_i_y is the pixel y coordinate of the point Pt_i.
6. The method for fusing targets among multiple cameras in a vehicle according to claim 4, wherein the three-dimensional coordinate transformation formula is:
Wo=RWt+T
wherein Wo represents the three-dimensional coordinates in the current camera coordinate system, Wt the three-dimensional coordinates in the adjacent camera coordinate system, and R and T the external reference matrices between the two cameras.
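The claim-6 transformation Wo = R·Wt + T is a single rigid-body change of frame. A minimal sketch:

```python
import numpy as np

def to_current_frame(R, T, Wt):
    """Map a 3D point from the adjacent camera coordinate system into the
    current camera coordinate system via Wo = R @ Wt + T (claim 6).

    R: 3x3 rotation matrix; T: length-3 translation vector; Wt: 3D point.
    """
    return R @ np.asarray(Wt, dtype=float) + np.asarray(T, dtype=float)
```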
7. The method of claim 1, wherein in step 4 the logical judgment is performed on the original target frame of the current camera, the reprojected rectangular frame, and the adjacent camera target frame, and the target fusion process is performed according to the judgment result, specifically:
calculating the intersection ratio of the re-projection rectangular frame and the adjacent camera target frame, and setting the threshold value of the intersection ratio as Tiou;
if the intersection ratio of the adjacent camera target frame and the reprojected rectangular frame is larger than the threshold Tiou, judging that the adjacent camera target frame and the current camera original target frame corresponding to the reprojected frame are the same target, and carrying out target fusion;
if the intersection ratio of the adjacent camera target frame and the reprojection rectangular frame is smaller than or equal to the threshold Tiou, the current camera original target frame corresponding to the adjacent camera target frame and the reprojection rectangular frame is not the same target, and target fusion is not carried out.
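The intersection-ratio test of claim 7 can be sketched as follows. The default threshold value 0.5 is an illustrative assumption; the patent only names the threshold Tiou without fixing its value.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    def area(r):
        return (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def same_target(reproj_box, neighbor_box, tiou=0.5):
    """Claim-7 decision: fuse only when the intersection ratio of the
    reprojected rectangular frame and the adjacent camera target frame
    exceeds Tiou (tiou=0.5 is an assumed, not claimed, value)."""
    return iou(reproj_box, neighbor_box) > tiou
```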
8. A vehicle-mounted multi-camera inter-target fusion system, the system comprising a memory and a processor, the memory containing a vehicle-mounted multi-camera inter-target fusion method program which, when executed by the processor, implements the steps of the vehicle-mounted multi-camera inter-target fusion method according to any one of claims 1-6:
step 1: acquiring images shot by the vehicle-mounted multi-camera at the same time, an internal reference matrix of each camera, an external reference matrix of an adjacent camera and a base distance between the adjacent cameras, and performing target detection by using a neural network to acquire all targets on each image;
step 2: selecting each camera to determine a non-overlapping target frame;
step 3: uniformly sampling the region in the non-overlapping target frame to obtain a sampling point set, and re-projecting the sampling point set onto an adjacent camera image according to an epipolar constraint error minimization criterion to obtain a re-projected rectangular frame;
step 4: performing logic judgment on an original target frame, a reprojected rectangular frame and an adjacent camera target frame of the current camera, and performing target fusion processing according to judgment results;
step 5: and traversing all cameras, and repeating the steps 2-4 to process the target frames under all cameras.
9. The system for merging targets among multiple cameras in a vehicle according to claim 7, wherein each camera is selected to determine a non-overlapping target frame, comprising the following steps:
judging whether overlapping exists among original target frames under the currently selected camera, and if so, acquiring a non-overlapping area of each target frame as a non-overlapping target frame; if there is no overlap, the target frame itself is taken as a non-overlapping target frame.
10. A storage medium, wherein the storage medium includes a vehicle-mounted multi-camera inter-target fusion method program, and the vehicle-mounted multi-camera inter-target fusion method program, when executed by a processor, implements the steps of a vehicle-mounted multi-camera inter-target fusion method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310483852.XA CN116503299A (en) | 2023-04-28 | 2023-04-28 | Method, system and storage medium for fusing targets among vehicle-mounted multiple cameras |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116503299A true CN116503299A (en) | 2023-07-28 |
Family
ID=87319868
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310483852.XA Pending CN116503299A (en) | 2023-04-28 | 2023-04-28 | Method, system and storage medium for fusing targets among vehicle-mounted multiple cameras |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116503299A (en) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111368706B (en) | Data fusion dynamic vehicle detection method based on millimeter wave radar and machine vision | |
JP3868876B2 (en) | Obstacle detection apparatus and method | |
KR102516326B1 (en) | Camera extrinsic parameters estimation from image lines | |
US11393126B2 (en) | Method and apparatus for calibrating the extrinsic parameter of an image sensor | |
CN109903341A (en) | Join dynamic self-calibration method outside a kind of vehicle-mounted vidicon | |
CN106127787B (en) | A kind of camera calibration method based on Inverse projection | |
CN103177439A (en) | Automatically calibration method based on black and white grid corner matching | |
CN111815710B (en) | Automatic calibration method for fish-eye camera | |
CN110827361B (en) | Camera group calibration method and device based on global calibration frame | |
CN111443704B (en) | Obstacle positioning method and device for automatic driving system | |
CN114283391A (en) | Automatic parking sensing method fusing panoramic image and laser radar | |
JP4344860B2 (en) | Road plan area and obstacle detection method using stereo image | |
CN112950696A (en) | Navigation map generation method and generation device and electronic equipment | |
CN111382591B (en) | Binocular camera ranging correction method and vehicle-mounted equipment | |
CN111860270B (en) | Obstacle detection method and device based on fisheye camera | |
CN113869422A (en) | Multi-camera target matching method, system, electronic device and readable storage medium | |
CN111046809B (en) | Obstacle detection method, device, equipment and computer readable storage medium | |
CN110738696B (en) | Driving blind area perspective video generation method and driving blind area view perspective system | |
CN116503299A (en) | Method, system and storage medium for fusing targets among vehicle-mounted multiple cameras | |
Geiger | Monocular road mosaicing for urban environments | |
Zhao et al. | Extrinsic calibration of a small fov lidar and a camera | |
CN115797405A (en) | Multi-lens self-adaptive tracking method based on vehicle wheel base | |
CN112884845B (en) | Indoor robot obstacle positioning method based on single camera | |
CN114926332A (en) | Unmanned aerial vehicle panoramic image splicing method based on unmanned aerial vehicle mother vehicle | |
Son et al. | Detection of nearby obstacles with monocular vision for earthmoving operations |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||