CN112364793A - Target detection and fusion method based on long-focus and short-focus multi-camera vehicle environment - Google Patents
- Publication number: CN112364793A
- Application number: CN202011288888.5A
- Authority: CN (China)
- Prior art keywords: focus, camera, short, long, target
- Prior art date: 2020-11-17
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06V20/56 — Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06F18/25 — Pattern recognition; Fusion techniques
- G06T7/80 — Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
- G06V10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
- G06T2207/10004 — Still image; Photographic image
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/30252 — Vehicle exterior; Vicinity of vehicle
Abstract
The invention claims a target detection and fusion method for a vehicle environment based on long-focus and short-focus multi-camera vision. The method comprises the following steps: 1. Perform target detection with a convolutional neural network on the images acquired by the long-focus and short-focus binocular cameras to obtain the positions of the target frames in the images captured by the cameras of different focal lengths at the same moment. 2. According to the camera imaging principle and the intrinsic and extrinsic parameters K, R, T obtained by camera calibration, derive the mapping relationship f of a spatial target point P between the long-focus and short-focus camera pixel coordinate systems. 3. Using the mapping relationship f, obtain from each target frame position in the long-focus image the position of the corresponding target frame in the short-focus image, and fuse it with the targets detected in the original short-focus image, thereby accomplishing target detection at different distances. The invention overcomes the limitation that a single-focal-length camera cannot adapt to target detection at different distances, and improves target detection accuracy in the vehicle environment. The method is also simple to use, low-cost, and runs in real time.
Description
Technical Field
The invention belongs to the technical field of intelligent vehicle environment perception, and particularly relates to a target detection and fusion method for a vehicle environment using long-focus and short-focus multi-camera vision.
Background
In recent years, with the rapid development of fields such as artificial intelligence and machine vision, autonomous driving has become an important area of academic and industrial research. Environment perception is one of the key technologies in an autonomous driving system and its most fundamental module: it acts as the eyes of the vehicle, informing it about the surrounding environment. Target detection, localization, and motion state estimation are the most basic functions of the environment perception module.
With the wide application of deep learning and the great increase in the computing power of computing devices, deep-learning-based perception has become an important foundation of the environment perception module. Vision-based environment perception mainly provides pedestrian detection, obstacle detection, lane line detection, drivable area detection, traffic sign recognition, and similar functions, and can localize targets when combined with stereo vision. At present, researchers at home and abroad have focused mainly on improving the target detection performance of a single-focal-length camera. In a complex working environment, however, the information acquired by a single-focal-length camera is limited, and such a camera alone cannot correctly detect targets at different distances, so missed detections are common. Cameras with different focal lengths can compensate for each other's shortcomings and combine their advantages to detect targets in the vehicle environment accurately. For example, a short-focus camera has a wide field of view; distant targets appear small in its image and are difficult for deep learning to detect, whereas nearby targets appear large and are easy to detect. A long-focus camera has a narrow field of view; distant targets appear large and are easy to detect, but nearby targets may fall outside its field of view and not be captured at all. Therefore, combining the respective advantages of short-focus and long-focus camera images makes it possible to detect targets at different distances, detect targets in the vehicle environment more accurately, and effectively avoid missed detections.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art by providing a target detection and fusion method for a vehicle environment based on long-focus and short-focus multi-camera vision. The technical scheme of the invention is as follows:
a target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment comprises the following steps:
Step 1: install a long-focus camera and a short-focus camera to form a binocular system, and calibrate the binocular system. Input the image sequences acquired by the long-focus and short-focus cameras into a deep-learning convolutional neural network, and obtain through target detection the positions of the target frames in the wide field of view and the narrow field of view of the binocular pair at the same moment;
Step 2: using the camera imaging principle, obtain the mapping relationship f of a spatial target point P between the long-focus and short-focus camera pixel coordinate systems from the target positions in the narrow field of view of the long-focus camera and the intrinsic and extrinsic parameters obtained by binocular calibration, and thereby obtain, for a target position p₁ in the narrow field of view of the long-focus camera, the corresponding target position p₂ in the wide field of view of the short-focus camera;
Step 3: fuse the target frames in the long-focus and short-focus images by comparing the target positions detected in the wide field of view of the short-focus camera with the positions, obtained in step 2, at which targets in the narrow field of view of the long-focus camera appear in the wide field of view of the short-focus camera.
Further, step 1 specifically comprises the following steps:
Step 2-1: set different focal lengths for the long-focus and short-focus cameras, mount the binocular camera system at the same height on top of the vehicle, and keep a certain baseline distance between the two cameras;
Step 2-2: calibrate with Zhang Zhengyou's calibration method to obtain the intrinsic parameters K and extrinsic parameters R, T of the binocular system, where K is the intrinsic parameter matrix containing the focal length, optical center, and other information of the camera, and R and T are the rotation matrix and translation vector of the long-focus camera relative to the short-focus camera, respectively.
Step 2-3: deep-learning target detection. Use the lightweight convolutional neural network YOLOv3-Tiny to detect targets in the images collected by the long-focus and short-focus binocular cameras at the same moment, which specifically comprises: dataset preparation, transfer learning, and network inference with target detection, yielding the positions of the target frames under the cameras with different focal lengths.
Further, step 2-1 sets the camera focal lengths and installs the camera system: two cameras with different focal lengths are used, the short-focus camera is placed on the left and the long-focus camera on the right, and the baseline length between the two cameras is b, forming a long-and-short-focus binocular vision system that is mounted at the front of the vehicle roof.
Further, step 2-2 calibrates the long-focus and short-focus binocular camera: a checkerboard calibration board is placed in front of the binocular camera, and the checkerboard must appear simultaneously in the fields of view of both the long-focus and short-focus cameras; the corner points of the checkerboard calibration board are captured by the binocular camera, and Zhang Zhengyou's calibration method is used to compute the intrinsic parameters K₁, K₂ of the cameras and the extrinsic parameters R and T between the binocular cameras.
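This calibration step can be prototyped with OpenCV's implementation of Zhang's method. The sketch below is illustrative only, not the patent's code: it assumes checkerboard corner detections have already been collected for frames in which the board is visible in both cameras at once, and the function and variable names are hypothetical.

```python
import numpy as np
import cv2

def calibrate_long_short_pair(obj_points, corners_long, corners_short, image_size):
    """obj_points: per-view (N, 3) board-corner coordinates in the board frame;
    corners_long / corners_short: per-view (N, 1, 2) pixel corners from each camera."""
    # Zhang's method per camera: intrinsics K1 (long focus) and K2 (short focus).
    _, K1, d1, _, _ = cv2.calibrateCamera(obj_points, corners_long, image_size, None, None)
    _, K2, d2, _, _ = cv2.calibrateCamera(obj_points, corners_short, image_size, None, None)
    # Stereo calibration: R, T map points from the long-focus camera frame to the
    # short-focus camera frame, i.e. the extrinsics between the binocular pair.
    _, K1, d1, K2, d2, R, T, _, _ = cv2.stereoCalibrate(
        obj_points, corners_long, corners_short, K1, d1, K2, d2, image_size,
        flags=cv2.CALIB_FIX_INTRINSIC)
    return K1, d1, K2, d2, R, T
```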
Further, the specific process of dataset preparation in step 2-3 is to merge a self-collected and labeled Chongqing urban traffic dataset with the open-source Pascal VOC 2012 dataset, and then apply data augmentation to the merged dataset to obtain more training samples;
the specific process of transfer learning is to load the merged dataset and train the YOLOv3-Tiny network starting from an existing pre-trained model;
network inference and target detection mean that, during normal operation of the intelligent vehicle, the YOLOv3-Tiny network loads the trained model weights and performs forward inference to complete the target detection task.
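For reference, YOLOv3-Tiny forward inference of this kind can be run with OpenCV's dnn module. The following is a minimal sketch under the assumption that standard Darknet cfg/weights files are available; the file names, input resolution, and confidence threshold are illustrative choices rather than values from the patent, and non-maximum suppression is omitted for brevity.

```python
import numpy as np
import cv2

def detect_targets(image, cfg="yolov3-tiny.cfg", weights="yolov3-tiny.weights", conf_th=0.5):
    """Return detections as (class_id, score, x_center, y_center, width, height) in pixels."""
    net = cv2.dnn.readNetFromDarknet(cfg, weights)
    h, w = image.shape[:2]
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    boxes = []
    for out in net.forward(net.getUnconnectedOutLayersNames()):
        for det in out:                # det = [cx, cy, bw, bh, objectness, class scores...]
            scores = det[5:]
            cls = int(np.argmax(scores))
            score = float(scores[cls])
            if score > conf_th:
                boxes.append((cls, score, det[0] * w, det[1] * h, det[2] * w, det[3] * h))
    return boxes
```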
Further, in step 2, a corresponding relationship between the long focus camera pixel coordinate system and the short focus camera pixel coordinate system is established by a camera imaging principle, and can be calculated by the following formula:
s₁p₁ = K₁P,  s₂p₂ = K₂(RP + T)
where P represents a point in real space; p₁ and p₂ represent the pixel points corresponding to P in the long-focus and short-focus camera pixel coordinate systems, respectively; K₁ and K₂ represent the intrinsic parameters of the long-focus and short-focus cameras, respectively; R, T denote the extrinsic parameters between the long-focus and short-focus binocular cameras; and s₁, s₂ represent the depth information of the point P in the long-focus and short-focus camera coordinate systems, respectively.
When using homogeneous coordinates, the above equation is written as follows:
p₁ = K₁P,  p₂ = K₂(RP + T)
From the above formulas, substituting P = K₁⁻¹p₁, the mapping relationship f between p₁ and p₂ is obtained:
p₂ = K₂RK₁⁻¹p₁ + K₂T
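As a quick numerical illustration of the mapping f, the sketch below applies the formula above to a single pixel. K1, K2, R, T are assumed to come from the calibration step, the final division by the third homogeneous component converts the result back to an ordinary pixel coordinate, and the names are illustrative rather than taken from the patent.

```python
import numpy as np

def map_long_to_short(p1_xy, K1, K2, R, T):
    """Map a pixel (x, y) from the long-focus image into the short-focus image
    using p2 = K2 * R * K1^-1 * p1 + K2 * T, as in the formula above."""
    p1 = np.array([p1_xy[0], p1_xy[1], 1.0])            # homogeneous pixel coordinate
    p2 = K2 @ R @ np.linalg.inv(K1) @ p1 + K2 @ np.asarray(T).reshape(3)
    return p2[:2] / p2[2]                                # back to an (x, y) pixel coordinate
```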
further, the step 3 specifically includes the following steps:
Step 3-1: for the i-th target frame Bₗ detected in the long-focus camera image, with position (xₗ, yₗ, wₗ, hₗ), obtain from the mapping relationship f the position (x′ₛ, y′ₛ, w′ₛ, h′ₛ) of the corresponding target frame B′ₛ in the short-focus camera image, where xₗ, yₗ, wₗ, hₗ denote the horizontal and vertical coordinates of the target center and the width and height of the target frame, and x′ₛ, y′ₛ, w′ₛ, h′ₛ denote the horizontal and vertical coordinates of the mapped target center and the width and height of the mapped target frame.
Step 3-2, calculating the mapped target frame B's(x′s,y′s,w′s,h′s) Target frame B detected from short-focus cameras(xs,ys,ws,hs) The cross-over ratio between IOU when IOU>When the threshold value t is reached, the long and short focal cameras detect the target frame; otherwise, at least one camera does not detect the target, and the calculation formula of the IOU is as follows:
Step 3-3: when IOU > threshold t, the target has been detected by both the long-focus and short-focus cameras. Considering the deviation of the actual mapping result, the scaling ratios Δw, Δh and the offsets Δx, Δy between B′ₛ and Bₛ are calculated as follows:

Δw = wₛ / w′ₛ

Δh = hₛ / h′ₛ

Δx = xₛ − x′ₛ

Δy = yₛ − y′ₛ
Step 3-4: when IOU < threshold t, the short-focus camera has not detected the target. In this case B′ₛ is restored to Bₛ, i.e. the position of Bₛ in the short-focus image is computed, using the restoration formulas:

wₛ = w′ₛ · Δw

hₛ = h′ₛ · Δh

xₛ = x′ₛ + Δx

yₛ = y′ₛ + Δy
Step 3-5: repeat steps 3-1 to 3-4 for all target frames, and complete target fusion according to the target positions and classes in the long-focus and short-focus cameras.
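A minimal sketch of the matching and restoration in steps 3-1 to 3-5 is given below. It assumes the long-focus detections have already been mapped into the short-focus image with the mapping f; the threshold value and the use of an average Δ over matched pairs are illustrative choices, not values specified by the patent.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two boxes in (x_center, y_center, width, height) form."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    x1 = max(ax - aw / 2, bx - bw / 2)
    y1 = max(ay - ah / 2, by - bh / 2)
    x2 = min(ax + aw / 2, bx + bw / 2)
    y2 = min(ay + ah / 2, by + bh / 2)
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def fuse(mapped_long_boxes, short_boxes, t=0.5):
    """mapped_long_boxes: long-focus detections B'_s mapped into the short-focus image;
    short_boxes: detections B_s from the short-focus image. Returns the fused box list."""
    fused = list(short_boxes)
    deltas, unmatched = [], []
    for bm in mapped_long_boxes:
        ious = [iou(bm, bs) for bs in short_boxes]
        j = int(np.argmax(ious)) if ious else -1
        if j >= 0 and ious[j] > t:
            bs = short_boxes[j]               # step 3-3: matched pair -> record Δx, Δy, Δw, Δh
            deltas.append((bs[0] - bm[0], bs[1] - bm[1], bs[2] / bm[2], bs[3] / bm[3]))
        else:
            unmatched.append(bm)              # step 3-4: target missed by the short-focus camera
    dx, dy, dw, dh = np.mean(deltas, axis=0) if deltas else (0.0, 0.0, 1.0, 1.0)
    for x, y, w, h in unmatched:              # restore B_s from B'_s with the Δ estimates
        fused.append((x + dx, y + dy, w * dw, h * dh))
    return fused
```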
The invention has the following advantages and beneficial effects:
the invention provides a target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment. In the field of unmanned driving, the target detection technology based on monocular vision is widely applied. The methods often have the problems of good detection effect of near targets and poor detection effect of far targets. This is because a single camera with a fixed focal length does not adapt well to the detection of objects at different positions, for example, short focus cameras have a wide field of view and distant objects have a small image, and are therefore difficult to detect by depth learning. The long-focus camera has narrow visual field, and a far target is imaged clearly, so that the detection is facilitated for deep learning, but a near target may not be in the long-focus visual field and therefore cannot be detected.
Therefore, the invention detects and fuses targets in the vehicle environment with a long-focus and short-focus multi-camera approach, which is an effective way to solve these problems. Its advantages are reflected in the following aspects:
(1) The invention uses a short-focus camera and a long-focus camera as sensors for target detection and fusion. Compared with target detection based on a single-focal-length camera, it is more accurate and performs better in practical applications. The method combines the advantages of the short-focus and long-focus cameras, compensates for the shortcomings of a single-focal-length camera, and improves the accuracy of target detection in the vehicle environment.
(2) Compared with a monocular camera, the long-focus/short-focus binocular pair obtains richer visual information in the vehicle environment and better accomplishes the detection of targets at different distances.
(3) On the basis of a self-made Chongqing traffic dataset, the method uses the lightweight convolutional neural network YOLOv3-Tiny to detect targets in the images. Compared with the standard YOLOv3 algorithm, detection is faster, can run in real time on embedded edge devices, and still achieves good detection accuracy.
(4) The IOU is commonly used in deep-learning target detection to measure the quality of a predicted target frame. The method innovatively adopts the IOU as the criterion for target matching, which greatly improves matching accuracy while keeping the computational complexity low, making it faster than traditional methods.
Drawings
FIG. 1 is a simplified flowchart of a target detection and fusion method based on a long and short-focus multi-camera vehicle environment according to a preferred embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical scheme for solving the technical problems is as follows:
the invention aims to provide a target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment. Through the independent camera of two different focus of installation at intelligent roof portion (keep certain baseline distance between), utilize target detection and fusion technique based on degree of depth learning, overcome the limitation of target detection task under the different distances, the effectual emergence of avoiding the target condition of louing examining, propose this technical scheme, as shown in fig. 1, include following step:
Step 1: install the long-focus and short-focus binocular camera and calibrate the binocular system. Input the image sequences acquired by the long-focus and short-focus cameras into the deep-learning convolutional neural network, and obtain through target detection the positions of the target frames in the wide field of view and the narrow field of view of the binocular pair at the same moment. The specific steps are as follows:
the short-focus camera and the long-focus camera are arranged on the top of a vehicle, the focal length of the cameras is set, a camera system is arranged, two cameras with different focal lengths are adopted, the short-focus camera is arranged on the left side, the long-focus camera is arranged on the right side, the length of a base line between the two cameras is b, a long-short-focus binocular vision system is formed, and the binocular vision system is arranged in front of the top of the vehicle.
(2) Calibrate the long-focus and short-focus binocular cameras: a checkerboard calibration board is placed in front of the binocular cameras, and the checkerboard must appear simultaneously in the fields of view of both cameras; the corner points of the checkerboard calibration board are captured by the binocular cameras, and Zhang Zhengyou's calibration method is used to compute the intrinsic parameters K of the cameras and the extrinsic parameters R and T between them, where K is the intrinsic parameter matrix containing the focal length, optical center, and other information of the camera, and R and T are the rotation matrix and translation vector of the long-focus camera relative to the short-focus camera, respectively.
(3) Deep-learning target detection: the lightweight convolutional neural network YOLOv3-Tiny is used to detect targets in the images collected by the long-focus and short-focus binocular cameras at the same moment, yielding the positions of the target frames under the cameras with different focal lengths.
Step 2: using the camera imaging principle, obtain the mapping relationship f of a spatial target point P between the long-focus and short-focus camera pixel coordinate systems from the target positions in the narrow field of view of the long-focus camera and the intrinsic and extrinsic parameters obtained by binocular calibration, and then obtain, for a target position p₁ in the narrow field of view of the long-focus camera, the corresponding target position p₂ in the wide field of view of the short-focus camera. The specific steps are as follows:
(1) According to the camera imaging principle, establish the relationship between the pixel coordinates p₁ and p₂ of a spatial point P in the long-focus and short-focus camera pixel coordinate systems, so that target positions not detected in the short-focus camera pixel coordinate system can be restored from the target positions in the long-focus camera pixel coordinate system.
(2) The corresponding relation between the long-focus camera pixel coordinate system and the short-focus camera pixel coordinate system is established by the camera imaging principle, and can be calculated by the following formula:
s₁p₁ = K₁P,  s₂p₂ = K₂(RP + T)
where P represents a point in real space; p₁ and p₂ represent the pixel points corresponding to P in the long-focus and short-focus camera pixel coordinate systems, respectively; K₁ and K₂ represent the intrinsic parameters of the long-focus and short-focus cameras, respectively; R, T denote the extrinsic parameters between the long-focus and short-focus binocular cameras; and s₁, s₂ represent the depth information of the point P in the long-focus and short-focus camera coordinate systems, respectively.
If homogeneous coordinates are used, the above equation can be written as follows:
p₁ = K₁P,  p₂ = K₂(RP + T)
(3) From the above formulas, substituting P = K₁⁻¹p₁, the mapping relationship f between p₁ and p₂ is obtained:

p₂ = K₂RK₁⁻¹p₁ + K₂T
and 3, analyzing the target position detected in the wide field of view of the short-focus camera and the target position corresponding to the target position in the narrow field of view of the long-focus camera in the wide field of view of the short-focus camera obtained in the step 2, and further performing fusion processing on the target frames in the long-focus and short-focus images. The method comprises the following specific steps:
(1) For the i-th target frame Bₗ detected in the long-focus camera image, with position (xₗ, yₗ, wₗ, hₗ), obtain from the mapping relationship f the position (x′ₛ, y′ₛ, w′ₛ, h′ₛ) of the corresponding target frame B′ₛ in the short-focus camera image, where xₗ, yₗ, wₗ, hₗ denote the horizontal and vertical coordinates of the target center and the width and height of the target frame, and x′ₛ, y′ₛ, w′ₛ, h′ₛ denote the horizontal and vertical coordinates of the mapped target center and the width and height of the mapped target frame.
(2) Calculate the intersection-over-union IOU between the mapped target frame B′ₛ(x′ₛ, y′ₛ, w′ₛ, h′ₛ) and a target frame Bₛ(xₛ, yₛ, wₛ, hₛ) detected by the short-focus camera. When IOU > threshold t, both the long-focus and short-focus cameras have detected the target frame; otherwise, at least one camera has not detected the target. The IOU is the ratio between the overlap area and the union area of the two frames:

IOU(B′ₛ, Bₛ) = area(B′ₛ ∩ Bₛ) / area(B′ₛ ∪ Bₛ)
(3) When IOU > threshold t, the target has been detected by both the long-focus and short-focus cameras. Considering the deviation of the actual mapping result, the scaling ratios Δw, Δh and the offsets Δx, Δy between B′ₛ and Bₛ are calculated as follows:

Δw = wₛ / w′ₛ

Δh = hₛ / h′ₛ

Δx = xₛ − x′ₛ

Δy = yₛ − y′ₛ
(4) When IOU < threshold t, the short-focus camera has not detected the target. In this case B′ₛ is restored to Bₛ, i.e. the position of Bₛ in the short-focus image is computed, using the restoration formulas:

wₛ = w′ₛ · Δw

hₛ = h′ₛ · Δh

xₛ = x′ₛ + Δx

yₛ = y′ₛ + Δy
(5) Repeat steps (1) to (4) for all target frames, and complete target fusion according to the target positions and classes in the long-focus and short-focus cameras.
The method illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.
Claims (7)
1. A target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment is characterized by comprising the following steps:
step 1: installing a long-focus camera and a short-focus camera to form a binocular system, and calibrating the binocular system; inputting the image sequences acquired by the long-focus and short-focus cameras into a deep-learning convolutional neural network, and obtaining through target detection the positions of the target frames in the wide field of view and the narrow field of view of the binocular pair at the same moment;
step 2: using the camera imaging principle, obtaining the mapping relationship f of a spatial target point P between the long-focus and short-focus camera pixel coordinate systems from the target positions in the narrow field of view of the long-focus camera and the intrinsic and extrinsic parameters obtained by binocular calibration, and thereby obtaining, for a target position p₁ in the narrow field of view of the long-focus camera, the corresponding target position p₂ in the wide field of view of the short-focus camera;
step 3: fusing the target frames in the long-focus and short-focus images by comparing the target positions detected in the wide field of view of the short-focus camera with the positions, obtained in step 2, at which targets in the narrow field of view of the long-focus camera appear in the wide field of view of the short-focus camera.
2. The target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment according to claim 1, wherein step 1 specifically comprises the following steps:
step 2-1: setting different focal lengths for the long-focus and short-focus cameras, mounting the binocular camera system at the same height on top of the vehicle, and keeping a certain baseline distance between the two cameras;
step 2-2: calibrating with Zhang Zhengyou's calibration method to obtain the intrinsic parameters K and extrinsic parameters R, T of the binocular system, where K is the intrinsic parameter matrix containing the focal length, optical center, and other information of the camera, and R and T are the rotation matrix and translation vector of the long-focus camera relative to the short-focus camera, respectively;
step 2-3: deep-learning target detection, in which the lightweight convolutional neural network YOLOv3-Tiny is used to detect targets in the images collected by the long-focus and short-focus binocular cameras at the same moment, specifically comprising: dataset preparation, transfer learning, and network inference with target detection, yielding the positions of the target frames under the cameras with different focal lengths.
3. The target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment according to claim 2, wherein step 2-1 sets the camera focal lengths and installs the camera system: two cameras with different focal lengths are used, the short-focus camera is placed on the left and the long-focus camera on the right, and the baseline length between the two cameras is b, forming a long-and-short-focus binocular vision system that is mounted at the front of the vehicle roof.
4. The target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment according to claim 2, wherein step 2-2 calibrates the long-focus and short-focus binocular camera: a checkerboard calibration board is placed in front of the binocular camera, and the checkerboard must appear simultaneously in the fields of view of both the long-focus and short-focus cameras; the corner points of the checkerboard calibration board are captured by the binocular camera, and Zhang Zhengyou's calibration method is used to compute the intrinsic parameters K₁, K₂ of the cameras and the extrinsic parameters R and T between the binocular cameras.
5. The target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment according to claim 2, wherein the specific process of dataset preparation in step 2-3 is to merge a self-collected and labeled Chongqing urban traffic dataset with the open-source Pascal VOC 2012 dataset, and then apply data augmentation to the merged dataset to obtain more training samples;
the specific process of transfer learning is to load the merged dataset and train the YOLOv3-Tiny network starting from an existing pre-trained model;
network inference and target detection mean that, during normal operation of the intelligent vehicle, the YOLOv3-Tiny network loads the trained model weights and performs forward inference to complete the target detection task.
6. The target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment according to any one of claims 1 to 5, wherein step 2 establishes the correspondence between the long-focus camera pixel coordinate system and the short-focus camera pixel coordinate system from the camera imaging principle, which is calculated by the following formula:
s₁p₁ = K₁P,  s₂p₂ = K₂(RP + T)
where P represents a point in real space; p₁ and p₂ represent the pixel points corresponding to P in the long-focus and short-focus camera pixel coordinate systems, respectively; K₁ and K₂ denote the intrinsic parameters of the long-focus and short-focus cameras, respectively; R, T denote the extrinsic parameters between the long-focus and short-focus binocular cameras; and s₁, s₂ represent the depth information of the point P in the long-focus and short-focus camera coordinate systems, respectively;
when using homogeneous coordinates, the above equation is written as follows:
p₁ = K₁P,  p₂ = K₂(RP + T)

from the above formulas, the mapping relationship f between p₁ and p₂ is obtained:

p₂ = K₂RK₁⁻¹p₁ + K₂T
7. The target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment according to claim 6, wherein step 3 specifically comprises the following steps:
step 3-1: for the i-th target frame Bₗ detected in the long-focus camera image, with position (xₗ, yₗ, wₗ, hₗ), obtaining from the mapping relationship f the position (x′ₛ, y′ₛ, w′ₛ, h′ₛ) of the corresponding target frame B′ₛ in the short-focus camera image, where xₗ, yₗ, wₗ, hₗ denote the horizontal and vertical coordinates of the target center and the width and height of the target frame, and x′ₛ, y′ₛ, w′ₛ, h′ₛ denote the horizontal and vertical coordinates of the mapped target center and the width and height of the mapped target frame;
step 3-2: calculating the intersection-over-union IOU between the mapped target frame B′ₛ(x′ₛ, y′ₛ, w′ₛ, h′ₛ) and a target frame Bₛ(xₛ, yₛ, wₛ, hₛ) detected by the short-focus camera; when IOU > threshold t, both the long-focus and short-focus cameras have detected the target frame; otherwise, at least one camera has not detected the target; the IOU is the ratio between the overlap area and the union area of the two frames:

IOU(B′ₛ, Bₛ) = area(B′ₛ ∩ Bₛ) / area(B′ₛ ∪ Bₛ)
step 3-3: when IOU > threshold t, the target has been detected by both the long-focus and short-focus cameras; considering the deviation of the actual mapping result, the scaling ratios Δw, Δh and the offsets Δx, Δy between B′ₛ and Bₛ are calculated as follows:

Δw = wₛ / w′ₛ

Δh = hₛ / h′ₛ

Δx = xₛ − x′ₛ

Δy = yₛ − y′ₛ
step 3-4: when IOU < threshold t, the short-focus camera has not detected the target; in this case B′ₛ is restored to Bₛ, i.e. the position of Bₛ in the short-focus image is computed, using the restoration formulas:

wₛ = w′ₛ · Δw

hₛ = h′ₛ · Δh

xₛ = x′ₛ + Δx

yₛ = y′ₛ + Δy
step 3-5: repeating steps 3-1 to 3-4 for all target frames, and completing target fusion according to the target positions and classes in the long-focus and short-focus cameras.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011288888.5A CN112364793A (en) | 2020-11-17 | 2020-11-17 | Target detection and fusion method based on long-focus and short-focus multi-camera vehicle environment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011288888.5A CN112364793A (en) | 2020-11-17 | 2020-11-17 | Target detection and fusion method based on long-focus and short-focus multi-camera vehicle environment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112364793A true CN112364793A (en) | 2021-02-12 |
Family
ID=74532438
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011288888.5A Pending CN112364793A (en) | 2020-11-17 | 2020-11-17 | Target detection and fusion method based on long-focus and short-focus multi-camera vehicle environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112364793A (en) |
- 2020-11-17: CN application CN202011288888.5A filed; patent CN112364793A (en), status Pending
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108171758A (en) * | 2018-01-16 | 2018-06-15 | 重庆邮电大学 | Polyphaser scaling method based on minimum time principle and transparent glass scaling board |
CN108257161A (en) * | 2018-01-16 | 2018-07-06 | 重庆邮电大学 | Vehicle environmental three-dimensionalreconstruction and movement estimation system and method based on polyphaser |
WO2019198381A1 (en) * | 2018-04-13 | 2019-10-17 | ソニー株式会社 | Information processing device, information processing method, and program |
CN109163657A (en) * | 2018-06-26 | 2019-01-08 | 浙江大学 | A kind of circular target position and posture detection method rebuild based on binocular vision 3 D |
CN109165629A (en) * | 2018-09-13 | 2019-01-08 | 百度在线网络技术(北京)有限公司 | It is multifocal away from visual barrier cognitive method, device, equipment and storage medium |
US20200089976A1 (en) * | 2018-09-13 | 2020-03-19 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and device of multi-focal sensing of an obstacle and non-volatile computer-readable storage medium |
CN109448054A (en) * | 2018-09-17 | 2019-03-08 | 深圳大学 | The target Locate step by step method of view-based access control model fusion, application, apparatus and system |
CN109815886A (en) * | 2019-01-21 | 2019-05-28 | 南京邮电大学 | A kind of pedestrian and vehicle checking method and system based on improvement YOLOv3 |
CN110378210A (en) * | 2019-06-11 | 2019-10-25 | 江苏大学 | A kind of vehicle and car plate detection based on lightweight YOLOv3 and long short focus merge distance measuring method |
CN110532937A (en) * | 2019-08-26 | 2019-12-03 | 北京航空航天大学 | Method for distinguishing is known to targeting accuracy with before disaggregated model progress train based on identification model |
CN111210478A (en) * | 2019-12-31 | 2020-05-29 | 重庆邮电大学 | Method, medium and system for calibrating external parameters of common-view-free multi-camera system |
Non-Patent Citations (3)
Title |
---|
SAFWAN WSHAH: "Deep Learning for Model Parameter Calibration in Power Systems", 2020 IEEE International Conference on Power Systems Technology (POWERCON) *
李星辰 (Li Xingchen): "Multi-object tracking algorithm incorporating YOLO detection", Computer Engineering & Science *
贾祥 (Jia Xiang): "Research on 3D vehicle environment reconstruction methods based on binocular vision", China Master's Theses Full-text Database *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113223094A (en) * | 2021-05-24 | 2021-08-06 | 深圳市智像科技有限公司 | Binocular imaging system, control method and device thereof, and storage medium |
CN116778360A (en) * | 2023-06-09 | 2023-09-19 | 北京科技大学 | Ground target positioning method and device for flapping-wing flying robot |
CN116778360B (en) * | 2023-06-09 | 2024-03-19 | 北京科技大学 | Ground target positioning method and device for flapping-wing flying robot |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109308693B (en) | Single-binocular vision system for target detection and pose measurement constructed by one PTZ camera | |
CN111462135B (en) | Semantic mapping method based on visual SLAM and two-dimensional semantic segmentation | |
CN110889829A (en) | Monocular distance measurement method based on fisheye lens | |
CN106529587B (en) | Vision course recognition methods based on object detection | |
CN105758426A (en) | Combined calibration method for multiple sensors of mobile robot | |
US20230063939A1 (en) | Electro-hydraulic varifocal lens-based method for tracking three-dimensional trajectory of object by using mobile robot | |
CN111091023B (en) | Vehicle detection method and device and electronic equipment | |
CN111738071B (en) | Inverse perspective transformation method based on motion change of monocular camera | |
CN114089329A (en) | Target detection method based on fusion of long and short focus cameras and millimeter wave radar | |
CN106920247A (en) | A kind of method for tracking target and device based on comparison network | |
CN114898314B (en) | Method, device, equipment and storage medium for detecting target of driving scene | |
CN110779491A (en) | Method, device and equipment for measuring distance of target on horizontal plane and storage medium | |
CN111260539A (en) | Fisheye pattern target identification method and system | |
CN112115913B (en) | Image processing method, device and equipment and storage medium | |
CN111932627B (en) | Marker drawing method and system | |
Cvišić et al. | Recalibrating the KITTI dataset camera setup for improved odometry accuracy | |
CN112364793A (en) | Target detection and fusion method based on long-focus and short-focus multi-camera vehicle environment | |
CN111967396A (en) | Processing method, device and equipment for obstacle detection and storage medium | |
CN114037762B (en) | Real-time high-precision positioning method based on registration of image and high-precision map | |
CN113379848A (en) | Target positioning method based on binocular PTZ camera | |
CN111239684A (en) | Binocular fast distance measurement method based on YoloV3 deep learning | |
CN111161305A (en) | Intelligent unmanned aerial vehicle identification tracking method and system | |
CN110992424A (en) | Positioning method and system based on binocular vision | |
CN104463240A (en) | Method and device for controlling list interface | |
CN111950370A (en) | Dynamic environment offline visual milemeter expansion method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20210212 |