CN116258740A - Vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion - Google Patents

Vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion

Info

Publication number
CN116258740A
Authority
CN
China
Prior art keywords
image
vehicle
target
visible light
gray
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111454700.4A
Other languages
Chinese (zh)
Inventor
邓亮
金瑞鸣
谢正华
王金磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou Xingyu Automotive Lighting Systems Co Ltd
Original Assignee
Changzhou Xingyu Automotive Lighting Systems Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changzhou Xingyu Automotive Lighting Systems Co Ltd filed Critical Changzhou Xingyu Automotive Lighting Systems Co Ltd
Priority to CN202111454700.4A priority Critical patent/CN116258740A/en
Publication of CN116258740A publication Critical patent/CN116258740A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/215 Motion-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/269 Analysis of motion using gradient-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/292 Multi-camera tracking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10048 Infrared image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20164 Salient point detection; Corner detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20224 Image subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30248 Vehicle exterior or interior
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of artificial-intelligence image processing and computer vision, and in particular to a method for detecting targets in automotive driving scenes based on multi-camera pixel fusion, comprising the following steps: S1, capturing the scene in front of the vehicle with several cameras of different focal lengths and imaging modes, and combining the captured images into a fused image with a pixel-fusion algorithm; S2, applying a recognition and tracking algorithm to the fused image to track vehicles and pedestrians within a preset distance and obtain vehicle and pedestrian information; S3, feeding the vehicle and pedestrian information into a multi-scale, deep-learning target-detection model to obtain the position and class of each target pedestrian and vehicle in every frame; and S4, sending the position and class information of the target pedestrians and vehicles in each frame to the body controller. The disclosed method uses a simple fusion algorithm, recognizes small vehicle and pedestrian targets quickly and at long range at night, and effectively improves target-recognition accuracy.

Description

Vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion
Technical Field
The invention relates to the technical field of artificial-intelligence image processing and computer vision, and in particular to a method for detecting targets in automotive driving scenes based on multi-camera pixel fusion.
Background
With the spread of intelligent-driving technology, vehicle-mounted image processing and recognition has become widely used because it is inexpensive and can efficiently detect and identify target position, size, class and other information. In the field of forward target detection and recognition in particular, it enables ADAS driver-assistance functions such as daytime vehicle detection, pedestrian collision warning and lane-line detection.
Existing ADAS forward-view image recognition mainly uses a single visible-light lens with a fixed focal length, which captures clear images within a 30° field of view and a range of 300 m ahead when the ambient illumination is good. For targets beyond 300 m, however, the imaging becomes blurred and the resolution poor; at night the ambient illumination is low, low-reflectivity targets have poor contrast, or strong incident light over-exposes the sensor so that target and background merge into one. In practice, long-range, low-illumination and over-exposed scenes severely limit the accuracy with which this monocular forward-view technology recognizes vehicle and pedestrian targets.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in the prior art, distant targets are imaged with blur and poor resolution, and at night the target merges into the background. To solve these problems the invention provides a method for detecting targets in automotive driving scenes based on multi-camera pixel fusion, which fuses the images captured by several cameras to improve imaging resolution in dark environments and to capture distant targets with good imaging quality.
The technical solution adopted to solve the technical problem is as follows: a vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion, comprising the following steps:
S1, capturing the scene in front of the vehicle with several cameras of different focal lengths and imaging modes, and fusing the captured images into a fused image with a pixel-fusion algorithm;
S2, applying a recognition and tracking algorithm to the fused image to track vehicles and pedestrians within a preset distance and obtain their motion information;
S3, feeding the motion information of the vehicles and pedestrians into a multi-scale, deep-learning target-detection model to obtain the position and class of each target pedestrian and vehicle in every frame;
and S4, sending the position and class information of the target pedestrians and vehicles in each frame to the body controller.
Specifically, in step S1 the several cameras with different focal lengths and imaging modes comprise a telephoto long-range camera, a mid-focus camera and a near-infrared camera.
Specifically, in step S2, during the day the preset tracking distance is 500 m for vehicles and 200 m for pedestrians; at night it is 600 m for vehicles and 150 m for pedestrians.
Further, step S1 comprises the following steps:
S11, applying MSRCR enhancement to the telephoto image and the mid-focus image, shrinking the telephoto image according to the focal-length ratio and embedding it into the mid-focus image, then performing visible-light pixel fusion to obtain a visible-light fused image;
S12, measuring the gray level of the visible-light image: if the average gray level is below 40 it is judged to be night and the near-infrared fill lamp is switched on before the near-infrared camera captures its image; if the average gray level is 40 or more it is judged to be day and the near-infrared camera captures its image directly;
S13, applying MSR enhancement to the near-infrared image captured by the near-infrared camera in step S12, decomposing the visible-light image into its YUV channels, superimposing the infrared and visible images according to their respective luminance differences, and converting back to RGB to complete the pixel fusion.
Specifically, step S13 comprises:
S131, converting the visible-light image into a gray image and computing the luminance component common to the visible-light gray image and the near-infrared gray image;
S132, subtracting the common luminance component obtained in step S131 from the gray values of the visible-light gray image and of the near-infrared gray image to obtain the component unique to each of them;
S133, creating an all-black RGB image whose R channel is the unique component of the near-infrared gray image, whose G channel is the unique component of the visible-light gray image and whose B channel is the absolute difference of the two unique components, giving a new image R', G', B', which is then converted into Y', U', V';
S134, converting the original visible-light image into Y, U, V, replacing the Y channel with the Y' of step S133 and converting Y', U, V back into an RGB image, which completes the fusion of the visible-light and near-infrared images.
Specifically, step S3 comprises the following steps:
for the fused image sequence, if the frame count has reached 5, the deep-learning target-detection model is invoked to obtain the position and class of the target pedestrians and vehicles; if the frame count has not reached 5, corner detection and LK optical-flow tracking are used to keep tracking the position change of each target.
Preferably, the deep-learning target-detection model is yolov3, an image-detection model built on a deep CNN convolutional neural network.
Optionally, the tracking with corner detection and the LK pyramidal optical-flow algorithm comprises:
S321, within each target-position ROI of the previous frame, i.e. the rectangular region bounded by its top-left and bottom-right coordinates, finding several Harris corner points inside and storing their positions;
S322, building an image pyramid from the gray images of the previous frame and the current frame with a fine-to-coarse layering (0-n) strategy;
S323, substituting each corner point of the previous frame and its estimate in the current frame into the pyramidal iterative LK optical-flow algorithm and, starting from the top layer, computing the gray-level difference of each corner point on each layer;
S324, checking whether the gray-level difference is below 0.03 or the number of iterations exceeds 20: if so, stopping the iteration on the current layer and obtaining the coordinates of each corner point in the current frame; if not, correcting with the iteration result and returning to step S323;
S325, computing each target-position ROI of the current frame from the corner coordinates obtained for the current frame;
S326, checking whether the current frame is the 5th frame: if not, taking all target-position ROIs of the current frame as the ROIs of the previous frame and repeating steps S321-S325 to obtain the target-position ROIs of each frame; if it is the 5th frame, calling the yolov3 detection model to correct the target-position ROIs.
Preferably, in step S321 the number of Harris corner points is 10.
The vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion according to the invention further comprises the following step:
S5, the body controller further processes the detection result and either outputs a video annotated with the detection result or sends a warning signal directly to the driver.
The vehicle-mounted multi-target tracking method based on multi-camera pixel fusion has the following beneficial effects. First, visible-light imaging is combined with near-infrared imaging, and near-infrared fill light is used when night visibility is low. Second, the strengths of visible light for lines and texture and of near infrared for brightness at night are extracted separately and fused, improving imaging resolution in dark environments. Then a yolov3 model identifies the obstacles in the image; the model is thoroughly trained on objects in dark environments before recognition, so its recognition rate is high. Finally the recognition and detection result is transmitted to the body controller over a CAN signal; the body controller processes the result further and either outputs a video annotated with the detection result or sends a warning signal directly to the driver. The fusion algorithm of the vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion is simple and its target-tracking computation small, so it is well suited to deployment on a vehicle embedded platform; it recognizes small vehicle and pedestrian targets quickly and at long range at night, and it effectively improves target-recognition accuracy.
Drawings
The invention will be further described with reference to the drawings and examples.
FIG. 1 is the overall flow chart of the vehicle-mounted multi-target tracking method based on multi-camera pixel fusion according to the invention;
FIG. 2 is the flow chart for fusing the infrared image with the visible-light image in the vehicle-mounted multi-target tracking method based on multi-camera pixel fusion according to the invention;
FIG. 3 is the flow chart of the LK target-tracking algorithm of the vehicle-mounted multi-target tracking method based on multi-camera pixel fusion according to the invention;
FIG. 4 shows a daytime detection result of the vehicle-mounted multi-target tracking method based on multi-camera pixel fusion after the images of the telephoto long-range, mid-focus and near-infrared cameras have been fused;
FIG. 5 shows a detection result of the vehicle-mounted multi-target tracking method based on multi-camera pixel fusion after the images of the telephoto long-range, mid-focus and near-infrared cameras have been fused.
Detailed Description
The invention will now be described in further detail with reference to the accompanying drawings. The drawings are simplified schematic representations which merely illustrate the basic structure of the invention and therefore show only the structures which are relevant to the invention.
As shown in figs. 1 to 5, in the best embodiment of the invention a vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion comprises the following steps:
S1, capturing the scene in front of the vehicle with a telephoto long-range camera (25 mm focal length), a mid-focus camera (12 mm focal length) and a near-infrared camera (12 mm focal length, 850 nm narrow band, with infrared fill light), and fusing the images captured by the cameras into a fused image with a pixel-fusion algorithm;
the step S1 includes: s11, respectively performing MSRCR enhancement processing on the long-focus image and the middle-focus image, shrinking the long-focus image according to the focal length ratio, embedding the long-focus image into the middle-focus image, and performing visible light pixel fusion to obtain a visible light fusion image; s12, detecting the gray level of the visible light image, if the average gray level of the visible light image is less than 40, judging that the night is reached, starting a near infrared light supplementing lamp to shoot a near infrared camera, supplementing light for the near infrared camera, improving the image quality and the detection distance, and if the average gray level of the visible light image is more than or equal to 40, judging that the day is reached, and directly shooting by the near infrared camera; s13, performing MSR enhancement processing on the near-infrared image shot by the near-infrared camera in the step S12, expanding the visible light image according to YUV channels, superposing according to respective brightness differences of infrared light and visible light, and converting the images into RGB for pixel fusion.
The mid-focus camera image is blurred for distant targets and its resolution low, while the telephoto camera image has excellent imaging quality for distant targets. Fusing the telephoto and mid-focus images at the pixel level makes the target information conspicuous and the detail rich; the multi-scale deep-learning detection model is then called, and the recognition accuracy for both large near targets and small distant targets is higher than with a single focal length, without the detection model having to be called repeatedly, so the floating-point workload is small and detection is faster. In this method the focal-length ratio of the two images is fixed; the mid-focus image is taken as the reference, and the centre of the reference image is replaced by the telephoto image scaled down according to the focal-length ratio, which completes the pixel-level fusion.
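A minimal sketch of this centre-embedding step follows; the function name, the use of OpenCV and the assumption that the two cameras share the same optical axis and sensor format are ours, not the patent's.

```python
import cv2

def embed_tele_into_mid(mid_img, tele_img, f_tele=25.0, f_mid=12.0):
    """Shrink the telephoto image by the focal-length ratio and replace the
    centre of the mid-focus image with it (pixel-level fusion sketch)."""
    f_r = f_tele / f_mid                              # focal-length ratio f_r = f_l / f_m
    h, w = tele_img.shape[:2]
    small = cv2.resize(tele_img, (int(round(w / f_r)), int(round(h / f_r))))

    H, W = mid_img.shape[:2]
    sh, sw = small.shape[:2]
    y0, x0 = (H - sh) // 2, (W - sw) // 2             # centre of the mid-focus frame
    fused = mid_img.copy()
    fused[y0:y0 + sh, x0:x0 + sw] = small             # embed the shrunken telephoto image
    return fused
```

With the 25 mm and 12 mm lenses of this embodiment the ratio is about 2.08, so the telephoto frame is pasted back at a bit under half its original size.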
The near-infrared image improves the contrast of the target image at night, in poor illumination or in fog, and supplements the visible-light image with information. Its texture information is sparse, however, and in good daytime illumination the visible-light image still has the advantage. Pixel-level fusion of the near-infrared and visible images therefore lets the two compensate each other, day and night, and improves target recognizability. In this method the near-infrared camera and the mid-focus visible-light camera use the same CMOS sensor chip and fixed-focus lenses with the same optical parameters, so the camera spacing causes only a negligible scale deviation for targets beyond 10 m. The visible-light image is decomposed into its YUV channels (luminance and chrominance), supplemented with the near-infrared image, and converted back to an RGB image, which completes the pixel-level fusion.
Step S13 further comprises: S131, converting the visible-light image into a gray image and computing the luminance component common to the visible-light gray image and the near-infrared gray image; S132, subtracting the common luminance component obtained in step S131 from the gray values of the visible-light gray image and of the near-infrared gray image to obtain the component unique to each; S133, creating an all-black RGB image whose R channel is the unique component of the near-infrared gray image, whose G channel is the unique component of the visible-light gray image and whose B channel is the absolute difference of the two unique components, returning to step S131 until the whole image has been traversed and the new image R', G', B' is obtained, and finally converting the new R', G', B' image into Y', U', V'; S134, converting the original visible-light image into Y, U, V, replacing the Y channel with the Y' of step S133 and converting Y', U, V back into an RGB image, which completes the fusion of the visible-light and near-infrared images.
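A compact sketch of steps S131-S134 follows. The patent defines the common component as the gray value shared per pixel by the two images; reading that as the per-pixel minimum, the vectorised (rather than pixel-by-pixel) formulation, and the function name are our assumptions.

```python
import cv2
import numpy as np

def fuse_visible_nir(visible_bgr, nir_gray):
    """Steps S131-S134 sketch: fuse a visible-light image with a registered,
    single-channel near-infrared image through common/unique gray components."""
    vis_gray = cv2.cvtColor(visible_bgr, cv2.COLOR_BGR2GRAY)

    # S131: component common to both gray images (per-pixel minimum assumed)
    common = np.minimum(vis_gray, nir_gray)

    # S132: components unique to each image
    d_vis = cv2.subtract(vis_gray, common)
    d_nir = cv2.subtract(nir_gray, common)

    # S133: new image [R', G', B'] = [dNIR, dVIS, |dNIR - dVIS|], then take its Y'
    new_rgb = cv2.merge([d_nir, d_vis, cv2.absdiff(d_nir, d_vis)])
    y_new = cv2.cvtColor(new_rgb, cv2.COLOR_RGB2YUV)[:, :, 0]

    # S134: replace the Y channel of the original visible image with Y'
    yuv = cv2.cvtColor(visible_bgr, cv2.COLOR_BGR2YUV)
    yuv[:, :, 0] = y_new
    return cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR)
```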
S2, a recognition and tracking algorithm is applied to the fused image to track vehicles and pedestrians within a preset distance and obtain their motion information; during the day the preset tracking distance is 500 m for vehicles and 200 m for pedestrians, and at night it is 600 m for vehicles and 150 m for pedestrians.
S3, the motion information of the vehicles and pedestrians is fed into a multi-scale, deep-learning target-detection model to obtain the position and class of each target pedestrian and vehicle in every frame; the class is either pedestrian or vehicle. For the fused image sequence, if the frame count has reached 5 the deep-learning target-detection model is invoked to obtain the position and class of the target pedestrians and vehicles; the model used is yolov3 (You Only Look Once, an end-to-end fast target-detection model built on a deep CNN convolutional neural network). If the frame count has not reached 5, corner detection and LK optical-flow tracking keep tracking the position change of each target, which reduces the floating-point workload, raises processing speed and suits rapid deployment on a vehicle embedded computing platform.
The tracking with corner detection and the LK pyramidal optical-flow algorithm comprises the following steps:
S321, within each target-position ROI of the previous frame, i.e. the rectangular region bounded by its top-left and bottom-right coordinates, finding 10 Harris corner points inside and storing their positions; S322, building an image pyramid from the gray images of the previous frame and the current frame with a fine-to-coarse layering (0-n) strategy; S323, substituting each corner point of the previous frame and its estimate in the current frame into the pyramidal iterative LK optical-flow algorithm and, starting from the top layer, computing the gray-level difference of each corner point on each layer; S324, checking whether the gray-level difference is below 0.03 or the number of iterations exceeds 20: if so, stopping the iteration on the current layer and obtaining the coordinates of each corner point in the current frame; if not, correcting with the iteration result and returning to step S323; S325, computing each target-position ROI of the current frame from the corner coordinates obtained for the current frame; S326, checking whether the current frame is the 5th frame: if not, taking all target-position ROIs of the current frame as the ROIs of the previous frame and repeating steps S321-S325 to obtain the target-position ROIs of each frame; if it is the 5th frame, calling the yolov3 detection model to correct the target-position ROIs.
And S4, sending the obtained position and type information of the target pedestrian and the vehicle in each frame to a vehicle body controller.
S5, the body controller further processes the detection result and either outputs a video annotated with the detection result or sends a warning signal directly to the driver. The body controller then checks whether video acquisition and processing is finished; if not, it returns to step S1 to continue acquiring and processing video, and if so, it closes the video-processing program.
The general flow of the invention is shown in figure 1:
S11, the telephoto and mid-focus images are imported and MSRCR image enhancement and registration are applied to each;
the focal-length ratio of the telephoto and mid-focus fixed-focus lenses is f_r = f_l / f_m, where f_l is the telephoto focal length and f_m the mid-focus focal length; the width w and height h of the telephoto image are scaled down by the factor 1/f_r, a region of the same size is cut out of the centre of the mid-focus image, and the shrunken telephoto image is embedded in its place;
S12, the fused visible-light image is converted to gray and its gray level judged: if the image is dark overall and its average gray level is below 40, the infrared fill lamp is switched on to provide fill light for the near-infrared camera;
S13, the near-infrared camera image is imported and MSR enhancement applied, the visible-light image is decomposed into its YUV channels, the luminance levels are computed together with the near-infrared image, the visible-light Y channel is replaced, and the result is converted back to RGB, completing the fusion of the near-infrared and visible-light images (see FIG. 2 and its description for details).
In step S3 the number of fused frames is counted and the yolov3 target-detection model is called once every 5 frames. The model uses multiple convolution and pooling layers, which effectively reduces the complexity of image processing, and multi-scale anchor points, which make it more sensitive to small targets. After transfer learning on night-time fused visible/near-infrared images, the adapted yolov3 model outputs the ROI (region-of-interest) position and the class of every target, even at night.
For the intermediate frames between two detection frames, Harris corner sets are searched inside all the ROIs output by the detection model for the detection frame; the LK pyramidal optical-flow algorithm then finds the corresponding new corner set in the current frame, and the target-position ROIs of the current frame are computed and output.
As shown in fig. 2, fig. 2 is the flowchart of the YUV-channel pixel-level fusion of the infrared image and the visible-light image: the visible-light image obtained by fusing the telephoto and mid-focus images is fused in turn with the near-infrared image; the only difference between the daytime and night-time near-infrared images is that at night the near-infrared fill light is used.
S131, the visible-light image is converted to a gray image and the gray component common to each pixel of it and of the near-infrared image is computed: I_c(i,j) = I_v(i,j) ∩ I_i(i,j), where I_v is the visible-light gray level, I_i the near-infrared gray level and I_c their common component, i.e. the gray component shared by each pixel of the visible-light and near-infrared images;
S132, the common component I_c is subtracted to obtain the unique gray components: for the visible-light image ΔI_v(i,j) = I_v(i,j) - I_c(i,j), and for the near-infrared image ΔI_i(i,j) = I_i(i,j) - I_c(i,j);
S133, a new all-black RGB image is created whose R channel is the near-infrared unique component, whose G channel is the visible-light unique component and whose B channel is the absolute difference of the two, i.e. the new image [R', G', B'] = [ΔI_i, ΔI_v, |ΔI_i - ΔI_v|]; this image is then converted to the channels [Y', U', V'];
S134, the original visible-light image is converted to [Y, U, V], the Y channel is replaced by the Y' of the new image, and [Y', U, V] is converted back to an RGB image, completing the fusion of the visible-light and near-infrared images.
As shown in fig. 3, detection and tracking run continuously on the fused images: on every 5th frame (frames 0, 5, 10, ...) the transfer-learned yolov3 detects the ROI position and class of each target in the image, and LK pyramidal optical-flow tracking is applied to the other frames (frames 1-4 of each group):
S321, within each target-position ROI of frame 0, i.e. the rectangular region bounded by the top-left coordinate (x_0, y_0) and the bottom-right coordinate (x', y'), 10 Harris corner points are found inside and their positions p_i^k = (x_i^k, y_i^k), i = 1, ..., 10, are stored, the frame index k running from 0 up to 4.
S322, an image pyramid is built from the gray images of frame 0 and frame 1 with a fine-to-coarse layering (0-n) strategy; layer 0 is the original gray image I_0, and the gray level of layer l (l ∈ (1, n)) is
I_l(x, y) = 1/4·I_{l-1}(2x, 2y) + 1/8·[I_{l-1}(2x-1, 2y) + I_{l-1}(2x+1, 2y) + I_{l-1}(2x, 2y-1) + I_{l-1}(2x, 2y+1)] + 1/16·[I_{l-1}(2x-1, 2y-1) + I_{l-1}(2x+1, 2y+1) + I_{l-1}(2x-1, 2y+1) + I_{l-1}(2x+1, 2y-1)].
S323, each corner point u_i = (u_x, u_y) of frame 0 and its estimate v_i in frame 1 are substituted into the pyramidal iterative LK optical-flow algorithm; starting from the top layer, the gray-level difference of each corner point is computed on each layer,
δI_k(x, y) = I_l(x, y) - J_l(x + g_x^l + ν_x^{k-1}, y + g_y^l + ν_y^{k-1}),
where I_l and J_l are the layer-l gray images of frame 0 and frame 1, g^l = (g_x^l, g_y^l) is the optical flow transferred between layers, and ν^{k-1} = (ν_x^{k-1}, ν_y^{k-1}) is the iterative correction.
S324, the iteration proceeds according to this difference, ν^k = ν^{k-1} + Z^{-1}·b_k, where Z is the spatial gradient matrix and b_k the mismatch vector formed from the gray-level differences; the iteration on the current layer stops when the gray-level difference is below 0.03 or the number of iterations exceeds 20.
The gray-level difference is then computed iteratively on layer l-1, the optical flow being transferred between layers as g^{l-1} = 2·(g^l + ν^k).
The computation continues down to the lowest layer, giving the coordinates of each corner point in frame 1, p_i^1 = p_i^0 + g^0 + ν^K, where ν^K is the converged correction on the lowest layer.
S325, the ROI position of each target is computed from its corner coordinates; each subsequent frame computes the ROI positions in the same way, and from the 5th frame onward the yolov3 detection model is called again to correct them.
According to the vehicle-mounted multi-target tracking method based on multi-camera pixel fusion, visible-light imaging is combined with near-infrared imaging, and near-infrared fill light is used when night visibility is low. The strengths of visible light for lines and texture and of near infrared for brightness at night are extracted separately and fused, improving imaging resolution in dark environments. A yolov3 model then identifies the obstacles in the image; the model is thoroughly trained on objects in dark environments before recognition, so its recognition rate is high. Finally the recognition and detection result is transmitted to the body controller over a CAN signal; the body controller processes the result further and either outputs a video annotated with the detection result or sends a warning signal directly to the driver. The fusion algorithm of the vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion is simple and its target-tracking computation small, so it is well suited to deployment on a vehicle embedded platform; it recognizes small vehicle and pedestrian targets quickly and at long range at night and effectively improves target-recognition accuracy.
Taking the above preferred embodiments of the invention as illustration, persons skilled in the relevant art can make various changes and modifications without departing from the scope of the technical idea of the invention. The technical scope of the invention is not limited to the description but must be determined by the scope of the claims.

Claims (10)

1. A vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion, characterized by comprising the following steps:
S1, capturing the scene in front of the vehicle with several cameras of different focal lengths and imaging modes, and fusing the images captured by the cameras into a fused image with a pixel-fusion algorithm;
S2, applying a recognition and tracking algorithm to the fused image to track vehicles and pedestrians within a preset distance and obtain their motion information;
S3, feeding the motion information of the vehicles and pedestrians into a multi-scale, deep-learning target-detection model to obtain the position and class of each target pedestrian and vehicle in every frame;
and S4, sending the position and class information of the target pedestrians and vehicles in each frame to the body controller.
2. The vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion according to claim 1, characterized in that in step S1 the several cameras with different focal lengths and imaging modes comprise a telephoto long-range camera, a mid-focus camera and a near-infrared camera.
3. The vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion according to claim 1, characterized in that in step S2, during the day the preset tracking distance is 500 m for vehicles and 200 m for pedestrians, and at night the preset tracking distance is 600 m for vehicles and 150 m for pedestrians.
4. The vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion according to claim 2, characterized in that step S1 comprises the following steps:
S11, applying MSRCR enhancement to the telephoto image and the mid-focus image, shrinking the telephoto image according to the focal-length ratio and embedding it into the mid-focus image, then performing visible-light pixel fusion to obtain a visible-light fused image;
S12, measuring the gray level of the visible-light image: if the average gray level is below 40 it is judged to be night and the near-infrared fill lamp is switched on before the near-infrared camera captures its image; if the average gray level is 40 or more it is judged to be day and the near-infrared camera captures its image directly;
S13, applying MSR enhancement to the near-infrared image captured by the near-infrared camera in step S12, decomposing the visible-light image into its YUV channels, superimposing the infrared and visible images according to their respective luminance differences, and converting back to RGB to complete the pixel fusion.
5. The vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion according to claim 4, characterized in that step S13 comprises:
S131, converting the visible-light image into a gray image and computing the luminance component common to the visible-light gray image and the near-infrared gray image;
S132, subtracting the common luminance component obtained in step S131 from the gray values of the visible-light gray image and of the near-infrared gray image to obtain the component unique to each of them;
S133, creating an all-black RGB image whose R channel is the unique component of the near-infrared gray image, whose G channel is the unique component of the visible-light gray image and whose B channel is the absolute difference of the two unique components, giving a new image R', G', B', which is then converted into Y', U', V';
S134, converting the original visible-light image into Y, U, V, replacing the Y channel with the Y' of step S133 and converting Y', U, V back into an RGB image, which completes the fusion of the visible-light and near-infrared images.
6. The vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion according to claim 5, characterized in that step S3 comprises the following steps:
for the fused image sequence, if the frame count has reached 5, the deep-learning target-detection model is invoked to obtain the position and class of the target pedestrians and vehicles; if the frame count has not reached 5, corner detection and LK optical-flow tracking are used to keep tracking the position change of each target.
7. The vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion according to claim 6, characterized in that the deep-learning target-detection model is yolov3, an image-detection model built on a deep CNN convolutional neural network.
8. The vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion according to claim 6, characterized in that the tracking with corner detection and the LK pyramidal optical-flow algorithm comprises:
S321, within each target-position ROI of the previous frame, i.e. the rectangular region bounded by its top-left and bottom-right coordinates, finding several Harris corner points inside and storing their positions;
S322, building an image pyramid from the gray images of the previous frame and the current frame with a fine-to-coarse layering (0-n) strategy;
S323, substituting each corner point of the previous frame and its estimate in the current frame into the pyramidal iterative LK optical-flow algorithm and, starting from the top layer, computing the gray-level difference of each corner point on each layer;
S324, checking whether the gray-level difference is below 0.03 or the number of iterations exceeds 20: if so, stopping the iteration on the current layer and obtaining the coordinates of each corner point in the current frame; if not, correcting with the iteration result and returning to step S323;
S325, computing each target-position ROI of the current frame from the corner coordinates obtained for the current frame;
S326, checking whether the current frame is the 5th frame: if not, taking all target-position ROIs of the current frame as the ROIs of the previous frame and repeating steps S321-S325 to obtain the target-position ROIs of each frame; if it is the 5th frame, calling the yolov3 detection model to correct the target-position ROIs.
9. The vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion according to claim 8, characterized in that in step S321 the number of Harris corner points is 10.
10. The vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion according to claim 1, characterized by further comprising:
S5, the body controller further processes the detection result and either outputs a video annotated with the detection result or sends a warning signal directly to the driver.
CN202111454700.4A 2021-12-01 2021-12-01 Vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion Pending CN116258740A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111454700.4A CN116258740A (en) 2021-12-01 2021-12-01 Vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111454700.4A CN116258740A (en) 2021-12-01 2021-12-01 Vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion

Publications (1)

Publication Number Publication Date
CN116258740A true CN116258740A (en) 2023-06-13

Family

ID=86679714

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111454700.4A Pending CN116258740A (en) 2021-12-01 2021-12-01 Vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion

Country Status (1)

Country Link
CN (1) CN116258740A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117784801A (en) * 2024-02-27 2024-03-29 锐驰激光(深圳)有限公司 Tracking obstacle avoidance method, device, equipment and storage medium
CN117784801B (en) * 2024-02-27 2024-05-28 锐驰激光(深圳)有限公司 Tracking obstacle avoidance method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110569704B (en) Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
JP4970516B2 (en) Surrounding confirmation support device
US10504214B2 (en) System and method for image presentation by a vehicle driver assist module
JP7268001B2 (en) Arithmetic processing unit, object identification system, learning method, automobile, vehicle lamp
CN115082924B (en) Three-dimensional target detection method based on monocular vision and radar pseudo-image fusion
CN111967498A (en) Night target detection and tracking method based on millimeter wave radar and vision fusion
CN103770708A (en) Dynamic rearview mirror adaptive dimming overlay through scene brightness estimation
CN111965636A (en) Night target detection method based on millimeter wave radar and vision fusion
US11403767B2 (en) Method and apparatus for detecting a trailer, tow-ball, and coupler for trailer hitch assistance and jackknife prevention
CN109919026B (en) Surface unmanned ship local path planning method
CN112731436B (en) Multi-mode data fusion travelable region detection method based on point cloud up-sampling
CN110751206A (en) Multi-target intelligent imaging and identifying device and method
CN111768332A (en) Splicing method of vehicle-mounted all-around real-time 3D panoramic image and image acquisition device
CN113223044A (en) Infrared video target detection method combining feature aggregation and attention mechanism
CN113781562A (en) Lane line virtual and real registration and self-vehicle positioning method based on road model
CN116258740A (en) Vehicle-mounted forward-looking multi-target tracking method based on multi-camera pixel fusion
CN111860270B (en) Obstacle detection method and device based on fisheye camera
Gu et al. Radar-enhanced image fusion-based object detection for autonomous driving
CN113139986A (en) Integrated environment perception and multi-target tracking system
CN110738696B (en) Driving blind area perspective video generation method and driving blind area view perspective system
CN117173399A (en) Traffic target detection method and system of cross-modal cross-attention mechanism
CN116403186A (en) Automatic driving three-dimensional target detection method based on FPN Swin Transformer and Pointernet++
JP7498364B2 (en) Correcting camera images in the presence of rain, light and dirt
CN111833384B (en) Method and device for rapidly registering visible light and infrared images
CN115249269A (en) Object detection method, computer program product, storage medium, and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination