CN117095370A - Multi-camera detection target fusion and blind supplementing method - Google Patents

Multi-camera detection target fusion and blind supplementing method

Info

Publication number
CN117095370A
CN117095370A (application CN202311186896.2A)
Authority
CN
China
Prior art keywords
target
coordinates
pixel
targets
coordinate system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311186896.2A
Other languages
Chinese (zh)
Inventor
胡丽娟
张琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chelutong Technology Chengdu Co ltd
Original Assignee
Chelutong Technology Chengdu Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chelutong Technology Chengdu Co ltd filed Critical Chelutong Technology Chengdu Co ltd
Priority to CN202311186896.2A priority Critical patent/CN117095370A/en
Publication of CN117095370A publication Critical patent/CN117095370A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/54Surveillance or monitoring of activities, e.g. for recognising suspicious objects of traffic, e.g. cars on the road, trains or boats
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a multi-camera detection target fusion and blind-area supplementing method, which comprises: installing a plurality of cameras at an intersection, acquiring a plurality of video source images, and establishing a pixel coordinate system; performing target detection on each video source image to obtain the target detection type; according to the target detection type and box characteristics, taking the pixel coordinates of a target pixel point in the pixel coordinate system for each detected target and converting them to obtain target coordinates; defining a specific area and a region of interest for each camera, and directly outputting all targets in the specific area; establishing a custom coordinate system and performing a bird's-eye-view perspective transformation on the target coordinates in the region of interest to obtain bird's-eye-view coordinates; and performing target fusion based on the bird's-eye-view coordinates and outputting all targets in the region of interest. The invention avoids the errors caused by fusing only surface features of the target box images, requires no joint calibration of the video source images, and saves a great deal of time and labor cost.

Description

Multi-camera detection target fusion and blind supplementing method
Technical Field
The invention relates to the technical field of intelligent traffic detection, and in particular to a multi-camera detection target fusion and blind-area supplementing method.
Background
Intelligent traffic plays a vital role in the sustainable and economic development of smart city construction. With rising living standards, the numbers of motor vehicles, non-motor vehicles and pedestrians in traffic scenes keep increasing, and counting traffic flow and traffic violations manually can no longer meet the growing requirements on roadside real-time perception, violation alarming, evidence preservation and the like. A scheme in which multiple roadside cameras cooperatively detect targets is therefore commonly used to compensate for blind areas on the road, improving the recall rate of target detection and reducing the miss rate.
Existing methods for fusing the detection results of multiple cameras include:
1. Feature-based fusion: target features such as color, texture and shape are extracted, and fusion is performed on these features.
2. Deep-learning-based target fusion: a deep learning model is obtained by jointly training on the images of multiple cameras, high-level semantic features are extracted from the images, and target fusion is realized using these features.
The above methods have at least the following disadvantages:
1. For fusion based on color, texture and shape features: in roadside projects the multiple cameras are usually installed so as to shoot toward the middle of the intersection from four directions. Because of the different shooting angles, the texture and shape of the same target differ slightly between cameras, and because each camera is affected differently by the illumination direction, the color of the target also differs slightly between cameras installed in different directions, which reduces the fusion accuracy.
2. Before the images of multiple cameras can be jointly trained, their image data must be jointly calibrated, which brings considerable time and labor cost.
Disclosure of Invention
The invention aims to provide a multi-camera detection target fusion and blind-area supplementing method which avoids the errors caused by fusing only surface features of the target box images, requires no joint calibration of the video source images, and saves a great deal of time and labor cost.
The embodiment of the invention is realized by the following technical scheme:
the method for fusing and blind supplementing of the detection targets by using the multiple cameras is characterized by comprising the following steps of:
installing a plurality of cameras at an intersection, acquiring a plurality of video source images, and establishing a pixel coordinate system;
respectively carrying out target detection on a plurality of video source images to obtain a target detection type;
according to the target detection type and box characteristics, taking the pixel coordinates of the target pixel points under the pixel coordinate system of each detected target, and converting the pixel coordinates to obtain target coordinates;
defining a specific area and an interested area of each camera, and directly outputting all targets in the specific area; establishing a custom coordinate system, and performing aerial view perspective transformation on target coordinates in the region of interest to obtain aerial view coordinates;
and performing target fusion based on the aerial view coordinates, and outputting all targets of the region of interest.
In one embodiment of the invention, the target detection types include motor vehicle targets and non-motor vehicle targets.
In one embodiment of the invention, the specific method of taking the pixel coordinates of a target pixel point in the pixel coordinate system for each detected target and converting them to obtain the target coordinates, according to the target detection type and box characteristics, is as follows:
when the target belongs to the non-motor vehicle targets, the conversion formula (formula I) is:
x = x0, y = y0 + h/2 - 2
where x and y are the target coordinates, x0 and y0 are the pixel coordinates of the box center, and h is the height of the box.
In one embodiment of the invention, the specific method of taking the pixel coordinates of a target pixel point in the pixel coordinate system for each detected target and converting them to obtain the target coordinates, according to the target detection type and box characteristics, further provides:
when the target belongs to the motor vehicle targets and its box satisfies the far-field relation (a condition on the y coordinate of the lower edge of the box, y0 + h/2, relative to the image height H), formula I is used as the conversion formula;
when the target belongs to the motor vehicle targets and does not satisfy the far-field relation, a threshold is set on the height-to-width ratio of the box;
when the aspect ratio of the target box is greater than the threshold, the conversion formula (formula II) is:
x = x0, y = y0 + h/4
when the aspect ratio of the target box is not greater than the threshold, the conversion formula (formula III) takes a pixel close to the center point of the lower edge of the box;
in the above formulas, x and y are the target coordinates, x0 and y0 are the pixel coordinates of the box center, h is the height of the box, w is the width of the box, and H is the height of a single frame image.
In one embodiment of the invention, the specific formula for performing the bird's-eye-view perspective transformation on the target coordinates in the region of interest to obtain the bird's-eye-view coordinates is:
u' = (k11·x + k12·y + k13) / (k31·x + k32·y + 1)
v' = (k21·x + k22·y + k23) / (k31·x + k32·y + 1)
where u' and v' are the bird's-eye-view coordinates, k11, k12, k13, k21, k22, k23, k31 and k32 are the transformation coefficients, and x and y are the target coordinates.
In one embodiment of the invention, the specific method of performing target fusion based on the bird's-eye-view coordinates and outputting all targets in the region of interest includes:
setting distance thresholds Dmin and Dmax, and traversing all targets in the region of interest of the camera Skmax with the largest number of targets and in the regions of interest of the other cameras Si;
for each target in the region of interest of camera Skmax, finding the target in the region of interest of camera Si with the smallest Euclidean distance to it, recorded as dmin;
comparing dmin with Dmin and Dmax: if dmin > Dmax, the two are not the same target; if dmin < Dmin, the two are the same target and are fused; if Dmin < dmin < Dmax, it is judged whether the target types of the two are consistent, and if consistent they are fused as the same target, otherwise fusion is attempted again on the next frame of the video source images.
In one embodiment of the invention, the specific method of performing target fusion based on the bird's-eye-view coordinates and outputting all targets in the region of interest further includes:
setting a fusion threshold, and stopping fusion of a target if the number of fusion attempts exceeds the fusion threshold without success.
The technical scheme of the embodiment of the invention has at least the following advantages and beneficial effects:
according to the invention, through a joint deployment scheme of a plurality of cameras, the blind areas caused by mutual shielding among targets in a road target detection scene are compensated, the omission rate of target detection is reduced, the reduction of the target detection accuracy caused by factors such as illumination, weather and the like is avoided, and the robustness of target detection is improved; meanwhile, by fusing the image information of the cameras, the position and the motion trail of the target can be tracked more accurately, the continuity of target tracking is enhanced, and the accuracy of target identification is improved.
Drawings
FIG. 1 is a flow chart of the steps of the present invention;
FIG. 2 is a view of an installation plan of an intersection camera;
FIG. 3 is a schematic view of a camera shooting an intersection;
FIG. 4 is a schematic diagram of the type of target detection;
FIG. 5 is a schematic illustration of a bird's eye perspective transformation;
FIG. 6 is a schematic diagram of target fusion;
FIG. 7 is a schematic diagram of a custom coordinate system;
fig. 8 is a schematic diagram of a pixel coordinate system.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1:
Referring to FIGS. 1-8, a multi-camera detection target fusion and blind-area supplementing method includes the following steps:
100. installing a plurality of cameras at an intersection, acquiring a plurality of video source images, and establishing a pixel coordinate system;
200. performing target detection on each of the video source images to obtain the target detection types;
300. according to the target detection type and box characteristics, taking the pixel coordinates of a target pixel point in the pixel coordinate system for each detected target, and converting the pixel coordinates to obtain the target coordinates;
400. defining a specific area and a region of interest for each camera, and directly outputting all targets in the specific area; establishing a custom coordinate system, and performing a bird's-eye-view perspective transformation on the target coordinates in the region of interest to obtain bird's-eye-view coordinates;
500. performing target fusion based on the bird's-eye-view coordinates, and outputting all targets in the region of interest.
In step 100, the pixel coordinate system is established as shown in fig. 8: the upper-left corner of the picture is taken as the origin of coordinates, the horizontal direction as the X axis, and the vertical direction as the Y axis.
The specific method of step 200 is:
performing target detection on each video source image with the target detection algorithm YOLOv8, obtaining the target detection type, and judging whether the target belongs to the motor vehicle targets or the non-motor vehicle targets.
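As an illustrative sketch only (not part of the original disclosure), step 200 could be implemented along the following lines with the Ultralytics YOLOv8 package; the class-name grouping into motor and non-motor vehicle targets, the weight file and the variable names are assumptions made for illustration:

```python
# Sketch of step 200: per-source detection with YOLOv8 (assumed Ultralytics API and class grouping).
from ultralytics import YOLO

MOTOR_CLASSES = {"car", "bus", "truck"}                  # assumed mapping to motor vehicle targets
NON_MOTOR_CLASSES = {"person", "bicycle", "motorcycle"}  # assumed mapping to non-motor vehicle targets

model = YOLO("yolov8n.pt")  # any YOLOv8 detection weights

def detect_targets(frame):
    """Return (kind, x0, y0, w, h) tuples for one video-source frame."""
    detections = []
    for box in model(frame)[0].boxes:
        name = model.names[int(box.cls)]
        x0, y0, w, h = box.xywh[0].tolist()  # box center (x0, y0), width w, height h
        if name in MOTOR_CLASSES:
            detections.append(("motor", x0, y0, w, h))
        elif name in NON_MOTOR_CLASSES:
            detections.append(("non_motor", x0, y0, w, h))
    return detections
```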
The specific method of step 300 is:
when the target belongs to the non-motor vehicle targets, the conversion formula (formula I) is:
x = x0, y = y0 + h/2 - 2
where x and y are the target coordinates, x0 and y0 are the pixel coordinates of the box center, and h is the height of the box;
when the target belongs to the motor vehicle targets and its box satisfies the far-field relation (a condition on the y coordinate of the lower edge of the box, y0 + h/2, relative to the image height H), formula I is used as the conversion formula;
when the target belongs to the motor vehicle targets and does not satisfy the far-field relation, a threshold is set on the height-to-width ratio of the box:
when the aspect ratio of the target box is greater than the threshold, the conversion formula (formula II) is:
x = x0, y = y0 + h/4
when the aspect ratio of the target box is not greater than the threshold, the conversion formula (formula III) takes a pixel close to the center point of the lower edge of the box;
in the above formulas, x and y are the target coordinates (the selected target pixel position in the pixel coordinate system), x0 and y0 are the pixel coordinates of the box center, h is the height of the box, w is the width of the box, and H is the height of a single frame image.
The specific formula of step 400 is:
u' = (k11·x + k12·y + k13) / (k31·x + k32·y + 1)
v' = (k21·x + k22·y + k23) / (k31·x + k32·y + 1)
where u' and v' are the bird's-eye-view coordinates, k11, k12, k13, k21, k22, k23, k31 and k32 are the transformation coefficients, and x and y are the target coordinates.
In step 400, the custom coordinate system is established as shown in fig. 7: a reference point is taken at the center of the target detection area at the intersection, for example the center point of the middle area of the intersection, and a suitable coordinate system is set. For example, according to the size of the intersection and taking into account the width of the road surface on both sides, a suitable position is selected as the origin of coordinates, for instance the intersection point of the extension lines of the perpendiculars of two adjacent roads of the intersection, and the directions of the x and y coordinate axes are set, giving the coordinate system Oxy shown schematically in fig. 7;
It should be noted that in a real scene the origin of the coordinate system Oxy often falls on a building, which makes it inconvenient to obtain the position of a selected calibration point on the image directly in the custom coordinate system. Therefore, when obtaining the corresponding point in the custom coordinate system, the coordinates of the point can first be obtained in the auxiliary coordinate system O'x'y' shown in fig. 7, and the coordinate values of the point in the coordinate system Oxy are then calculated by translation.
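A minimal sketch of this translation, assuming the offset of the auxiliary origin O' expressed in the Oxy frame is known (the offset values below are placeholders):

```python
# Translate a calibration point measured in the auxiliary frame O'x'y' into the custom frame Oxy.
# (dx, dy) is the assumed position of the O' origin in Oxy coordinates (placeholder values).
def to_oxy(xp, yp, dx=-15.0, dy=-20.0):
    return xp + dx, yp + dy
```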
The specific method of step 500 is:
setting distance thresholds Dmin and Dmax, and traversing all targets in the region of interest of the camera Skmax with the largest number of targets and in the regions of interest of the other cameras Si;
for each target in the region of interest of camera Skmax, finding the target in the region of interest of camera Si with the smallest Euclidean distance to it, recorded as dmin;
comparing dmin with Dmin and Dmax: if dmin > Dmax, the two are not the same target; if dmin < Dmin, the two are the same target and are fused; if Dmin < dmin < Dmax, it is judged whether the target types of the two are consistent, and if consistent they are fused as the same target, otherwise fusion is attempted again on the next frame of the video source images.
In one embodiment of the invention, the specific method of performing target fusion based on the target positions after the bird's-eye-view perspective transformation and outputting all targets at the intersection together with their types further includes:
setting a fusion threshold, and stopping fusion of a target if the number of fusion attempts exceeds the fusion threshold without success.
Example 2:
this example is a specific analysis of example 1.
In step 100, the number of cameras is preferably four, installed as shown in fig. 2, and the video source images are acquired from the cameras in real time.
In step 200, target detection is performed on each of the video source images using the target detection algorithm YOLOv8, and the obtained target detection types include motor vehicle targets and non-motor vehicle targets.
In step 300, as shown in fig. 3, a target in the middle of the intersection is perceived by multiple video sources; the perception result is the center point coordinates (x0, y0) and the width (w) and height (h) of the box, as in the target detection result example of fig. 4.
In order to obtain the pixel position of the target more accurately, so that it can be mapped more accurately into the custom coordinate system, a specific pixel point is selected as the target's pixel position based on the target type, a threshold set on the height-to-width ratio of the box, and the information of the whole box combined with the target's pixel position and box height.
1. For non-motor vehicle targets, such as pedestrians, bicycles and (electric) motorcycles, the ground occupancy and space occupancy of a single target are relatively small, and the pixel position of the target is obtained through formula I, namely the point two pixels above the center of the lower edge of the target box.
The center point of the lower edge of the box is not taken directly because, when the data are annotated, the box boundary is the minimum enclosing rectangle of the detected object, and a pixel lying exactly on the boundary is not an accurate target pixel.
2. For motor vehicle targets, such as cars, buses and vans, the ground occupancy and space occupancy of a single target are larger. When selecting a pixel point on the image as the target position, thresholds are set according to the aspect ratio of the target box, and the target pixel position (x, y) is calculated differently in the following cases; in this embodiment the box aspect-ratio threshold is set to 1.
(a) When the target satisfies the far-field relation, the target is considered to lie far away in the image field of view, and the target pixel position is obtained through formula I. The relation involves the y coordinate of the lower edge of the box and the height H of a single frame image.
(b) When the target does not satisfy the far-field relation, two cases are distinguished according to the height-to-width ratio h/w of the target box:
I. When the target box satisfies h/w > 1 (aspect ratio greater than the threshold), the target is considered to be travelling longitudinally relative to the field of view, in a forward or backward driving state (the front or the rear of the motor vehicle faces the image), and the pixel position of the target is obtained through formula II, taking this characteristic of the image into account.
Here the pixel position is not the center point of the box; instead, the box height is taken into account and the point obtained by shifting the box center downwards by 1/4 of the box height is used, which effectively avoids the problem that the ground location corresponding to the box center pixel is inconsistent with the actual position of the target.
II. When the target box satisfies h/w ≤ 1 (aspect ratio not greater than the threshold), the target is considered to be turning or travelling transversely relative to the field of view (a side of the vehicle is visible in the image), and the pixel position of the target is obtained through formula III. For a vehicle travelling transversely relative to the field of view, formula III gives a pixel position close to the center point of the lower edge of the box.
In this embodiment, in formulas I, II and III, x and y are the target coordinates, x0 and y0 are the pixel coordinates of the box center, h is the height of the box, and w is the width of the box.
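The selection rules described above can be condensed into a short sketch. This is only an illustration: the exact far-field relation is not spelled out in the text, so the comparison of the box lower edge with half the frame height used below is an assumption, and formula III is approximated here by the lower-edge rule of formula I:

```python
# Sketch of step 300: choose the target pixel (x, y) from a detection (x0, y0, w, h).
# The far-field test (lower edge above half the frame height H) is an assumption;
# the aspect-ratio threshold of 1, the h/4 shift and the "two pixels up" rule follow the text.
def target_pixel(kind, x0, y0, w, h, H, aspect_threshold=1.0):
    lower_edge_y = y0 + h / 2.0
    if kind == "non_motor":
        return x0, lower_edge_y - 2.0        # formula I: two pixels above the lower-edge center
    if lower_edge_y < H / 2.0:               # assumed far-field relation -> formula I
        return x0, lower_edge_y - 2.0
    if h / w > aspect_threshold:             # longitudinal travel -> formula II
        return x0, y0 + h / 4.0              # box center shifted down by h/4
    return x0, lower_edge_y - 2.0            # transverse travel -> formula III (near the lower edge)
```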
In step 400, the formula derivation method is as follows:
1. Four groups of pixel coordinates and custom coordinates are calibrated: four suitable pixel points are selected as the calibration points before perspective transformation.
2. According to the custom coordinate origin, the coordinate axis directions and the reference point at the intersection, the positions of the selected calibration points in the custom coordinate system are determined as the calibration points after perspective transformation; specifically:
(a) A coordinate system is customized at the intersection: a reference point is taken at the center of the target detection area, for example the center point of the middle area of the intersection, and a suitable coordinate system is set. For example, according to the size of the intersection and taking into account the width of the road surface on both sides, a suitable position is selected as the origin of coordinates, for instance the intersection point of the extension lines of the perpendiculars of two adjacent roads of the intersection, and the directions of the x and y coordinate axes are set, giving the coordinate system Oxy shown schematically in fig. 7.
(b) A suitable coordinate value is defined for the reference point in the Oxy coordinate system; in a real scene the origin of the coordinate system Oxy often falls on a building, which makes it inconvenient to obtain the position of a selected calibration point on the image directly in the custom coordinate system, so when obtaining the corresponding point in the custom coordinate system, the coordinates of the point can first be obtained in the coordinate system O'x'y', and the coordinate values of the point in the coordinate system Oxy are then calculated by translation.
3. For the four groups of one-to-one corresponding points calibrated on each video source image, the perspective transformation matrix of each video source is calculated separately; specifically:
(a) Let A, B, C and D be, in order, the four pixel calibration points before perspective transformation on a given video source image.
(b) Let A', B', C' and D' be the calibration points corresponding to A, B, C and D in the custom coordinate system; A', B', C' and D' are regarded as the points obtained by the perspective transformation of A, B, C and D from the pixel coordinate system into the custom coordinate system.
(c) Let M be the perspective transformation matrix and (x, y, 1) the homogeneous coordinates of an original image pixel point; the transformed homogeneous coordinates are:
(x', y', w')^T = M · (x, y, 1)^T, with M = [m11 m12 m13; m21 m22 m23; m31 m32 m33]
from which:
x' = m11·x + m12·y + m13, y' = m21·x + m22·y + m23, w' = m31·x + m32·y + m33
and the transformed coordinates are therefore:
u' = x'/w' = (m11·x + m12·y + m13) / (m31·x + m32·y + m33)
v' = y'/w' = (m21·x + m22·y + m23) / (m31·x + m32·y + m33)
Let m33 = 1; then:
u' = (m11·x + m12·y + m13) / (m31·x + m32·y + 1)
v' = (m21·x + m22·y + m23) / (m31·x + m32·y + 1)   (formula IV)
The perspective transformation matrix can be obtained by substituting the four groups of mapping points A, B, C, D and A', B', C', D' into formula IV and solving the resulting simultaneous equations.
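As a hedged sketch of this step (not the patent's own implementation), the perspective matrix and the bird's-eye-view mapping can be obtained with OpenCV from the four calibrated point pairs; the calibration values below are placeholders:

```python
# Sketch of step 400: bird's-eye-view perspective transform for one video source.
import cv2
import numpy as np

# Four calibration points A, B, C, D in the pixel coordinate system (placeholder values)
# and their counterparts A', B', C', D' in the custom coordinate system Oxy.
src = np.float32([[412, 655], [988, 640], [1180, 850], [260, 870]])
dst = np.float32([[-8.0, 12.0], [8.0, 12.0], [8.0, -4.0], [-8.0, -4.0]])

M = cv2.getPerspectiveTransform(src, dst)   # 3x3 matrix, solved with m33 normalised to 1

def to_birds_eye(points_xy):
    """Map target pixel positions, shape (N, 2), to bird's-eye-view coordinates (u', v')."""
    pts = np.float32(points_xy).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(pts, M).reshape(-1, 2)
```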
In step 400, the final goal of using multiple cameras for target detection at an intersection is to accurately output all targets and types within the effective detection range of the intersection. When the detection result is finally output, all targets are classified into two types.
One category is targets that do not require fusion: a specific area (ROS, region of special) is defined for each camera as a range that is effectively detected by that camera but is not covered by the field of view, or is poorly angled, for the other cameras, such as a lane area outside the pavement. Target detection in the ROS is completed by a single camera alone, and no Hungarian optimal matching fusion is needed.
The other category is targets that are output after Hungarian matching fusion: a region of interest (ROI) is defined for each camera as a range that is effectively detected by multiple video sources, such as the middle area of the intersection.
For the targets in the ROI of each camera:
1. First, the number of targets k in the ROI of each camera is counted, and the camera corresponding to the maximum value kmax is recorded;
2. The bird's-eye-view perspective transformation is applied to the ROI of each camera, converting all targets in each camera's ROI into the custom coordinate system.
In step 500, in the custom coordinate system, the detection targets in the ROI of the camera corresponding to kmax (denoted Skmax) are fused with the ROI detection targets of each of the other sources (denoted Si, i = 1, 2, 3) using the Hungarian optimal matching algorithm. The cost matrix is formed from the pairwise Euclidean distances, in the custom coordinate system, between the targets in the ROIs of different cameras. Distance thresholds Dmin and Dmax also need to be set. The targets of camera Skmax and camera Si are traversed; for each target in the ROI of Skmax, the target with the smallest Euclidean distance in the ROI of Si is found and the smallest distance value dmin is recorded: if dmin > Dmax, the two are considered not to be the same target; if dmin < Dmin, the two are considered the same target, and the type of the fused target is determined by the one with the larger box area, since the one with the larger box area is considered less likely to be occluded and its recognition accuracy is therefore higher; if Dmin < dmin < Dmax, it is checked whether the types of the two targets are consistent: if consistent, the two are considered the same target; if not, they are treated as targets to be fused again using the target detection data of the next frame, and if fusion does not succeed within 10 consecutive frames, they are no longer fused.
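A minimal sketch of this fusion rule, assuming SciPy's Hungarian solver and illustrative threshold values (Dmin, Dmax and the data layout are assumptions; the per-pair decisions mirror the description above):

```python
# Sketch of step 500: Hungarian matching of ROI targets between S_kmax and one other camera S_i.
import numpy as np
from scipy.optimize import linear_sum_assignment

D_MIN, D_MAX = 1.0, 3.0   # assumed distance thresholds, in custom-coordinate units

def fuse(targets_kmax, targets_i):
    """Each target is a dict with 'uv' (bird's-eye coords), 'type' and 'box_area'."""
    if not targets_kmax or not targets_i:
        return list(targets_kmax), []
    fused, retry = [], []
    cost = np.array([[np.hypot(a["uv"][0] - b["uv"][0], a["uv"][1] - b["uv"][1])
                      for b in targets_i] for a in targets_kmax])
    rows, cols = linear_sum_assignment(cost)          # Hungarian optimal matching on Euclidean cost
    for r, c in zip(rows, cols):
        a, b, d_min = targets_kmax[r], targets_i[c], cost[r, c]
        if d_min > D_MAX:                             # too far apart: not the same target
            continue
        if d_min < D_MIN:                             # same target: keep the larger-box detection's type
            fused.append(a if a["box_area"] >= b["box_area"] else b)
        elif a["type"] == b["type"]:                  # ambiguous distance but consistent type: fuse
            fused.append(a)
        else:                                         # ambiguous and inconsistent: retry on the next frame
            retry.append((a, b))
    return fused, retry
```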
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (7)

1. A multi-camera detection target fusion and blind-area supplementing method, characterized by comprising the following steps:
installing a plurality of cameras at an intersection, acquiring a plurality of video source images, and establishing a pixel coordinate system;
performing target detection on each of the video source images to obtain the target detection types;
according to the target detection type and box characteristics, taking the pixel coordinates of a target pixel point in the pixel coordinate system for each detected target, and converting the pixel coordinates to obtain the target coordinates;
defining a specific area and a region of interest for each camera, and directly outputting all targets in the specific area; establishing a custom coordinate system, and performing a bird's-eye-view perspective transformation on the target coordinates in the region of interest to obtain bird's-eye-view coordinates;
performing target fusion based on the bird's-eye-view coordinates, and outputting all targets in the region of interest.
2. The multi-camera detection target fusion and blind-area supplementing method according to claim 1, wherein the target detection types include motor vehicle targets and non-motor vehicle targets.
3. The multi-camera detection target fusion and blind-area supplementing method according to claim 2, wherein the specific method of taking the pixel coordinates of a target pixel point in the pixel coordinate system for each detected target and converting them to obtain the target coordinates, according to the target detection type and box characteristics, is as follows:
when the target belongs to the non-motor vehicle targets, the conversion formula (formula I) is:
x = x0, y = y0 + h/2 - 2
where x and y are the target coordinates, x0 and y0 are the pixel coordinates of the box center, and h is the height of the box.
4. The multi-camera detection target fusion and blind-area supplementing method according to claim 3, wherein the specific method of taking the pixel coordinates of a target pixel point in the pixel coordinate system for each detected target and converting them to obtain the target coordinates, according to the target detection type and box characteristics, further comprises:
when the target belongs to the motor vehicle targets and its box satisfies the far-field relation (a condition on the y coordinate of the lower edge of the box, y0 + h/2, relative to the image height H), formula I is used as the conversion formula;
when the target belongs to the motor vehicle targets and does not satisfy the far-field relation, a threshold is set on the height-to-width ratio of the box;
when the aspect ratio of the target box is greater than the threshold, the conversion formula (formula II) is:
x = x0, y = y0 + h/4
when the aspect ratio of the target box is not greater than the threshold, the conversion formula (formula III) takes a pixel close to the center point of the lower edge of the box;
in the above formulas, x and y are the target coordinates, x0 and y0 are the pixel coordinates of the box center, h is the height of the box, w is the width of the box, and H is the height of a single frame image.
5. The multi-camera detection target fusion and blind-area supplementing method according to claim 1, wherein the specific formula for performing the bird's-eye-view perspective transformation on the target coordinates in the region of interest to obtain the bird's-eye-view coordinates is:
u' = (k11·x + k12·y + k13) / (k31·x + k32·y + 1)
v' = (k21·x + k22·y + k23) / (k31·x + k32·y + 1)
where u' and v' are the bird's-eye-view coordinates, k11, k12, k13, k21, k22, k23, k31 and k32 are the transformation coefficients, and x and y are the target coordinates.
6. The multi-camera detection target fusion and blind-area supplementing method according to claim 1, wherein the specific method of performing target fusion based on the bird's-eye-view coordinates and outputting all targets in the region of interest comprises:
setting distance thresholds Dmin and Dmax, and traversing all targets in the region of interest of the camera Skmax with the largest number of targets and in the regions of interest of the other cameras Si;
for each target in the region of interest of camera Skmax, finding the target in the region of interest of camera Si with the smallest Euclidean distance to it, recorded as dmin;
comparing dmin with Dmin and Dmax: if dmin > Dmax, the two are not the same target; if dmin < Dmin, the two are the same target and are fused; if Dmin < dmin < Dmax, it is judged whether the target types of the two are consistent, and if consistent they are fused as the same target, otherwise fusion is attempted again on the next frame of the video source images.
7. The multi-camera detection target fusion and blind-area supplementing method according to claim 6, wherein the specific method of performing target fusion based on the bird's-eye-view coordinates and outputting all targets in the region of interest further comprises:
setting a fusion threshold, and stopping fusion of a target if the number of fusion attempts exceeds the fusion threshold without success.
CN202311186896.2A 2023-09-14 2023-09-14 Multi-camera detection target fusion and blind supplementing method Pending CN117095370A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311186896.2A CN117095370A (en) 2023-09-14 2023-09-14 Multi-camera detection target fusion and blind supplementing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311186896.2A CN117095370A (en) 2023-09-14 2023-09-14 Multi-camera detection target fusion and blind supplementing method

Publications (1)

Publication Number Publication Date
CN117095370A true CN117095370A (en) 2023-11-21

Family

ID=88773522

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311186896.2A Pending CN117095370A (en) 2023-09-14 2023-09-14 Multi-camera detection target fusion and blind supplementing method

Country Status (1)

Country Link
CN (1) CN117095370A (en)

Similar Documents

Publication Publication Date Title
US8457392B2 (en) Identifying an object in an image using color profiles
US8244027B2 (en) Vehicle environment recognition system
US8036424B2 (en) Field recognition apparatus, method for field recognition and program for the same
CN109299674B (en) Tunnel illegal lane change detection method based on car lamp
CN111448478A (en) System and method for correcting high-definition maps based on obstacle detection
US20100201814A1 (en) Camera auto-calibration by horizon estimation
JPH10512694A (en) Method and apparatus for detecting movement of an object in a continuous image
Siogkas et al. Random-walker monocular road detection in adverse conditions using automated spatiotemporal seed selection
CN110717445B (en) Front vehicle distance tracking system and method for automatic driving
CN112215306A (en) Target detection method based on fusion of monocular vision and millimeter wave radar
US10984264B2 (en) Detection and validation of objects from sequential images of a camera
Adamshuk et al. On the applicability of inverse perspective mapping for the forward distance estimation based on the HSV colormap
US20190180121A1 (en) Detection of Objects from Images of a Camera
CN111723778B (en) Vehicle distance measuring system and method based on MobileNet-SSD
CN114120283A (en) Method for distinguishing unknown obstacles in road scene three-dimensional semantic segmentation
CN104156727B (en) Lamplight inverted image detection method based on monocular vision
CN113762134B (en) Method for detecting surrounding obstacles in automobile parking based on vision
CN106803073A (en) DAS (Driver Assistant System) and method based on stereoscopic vision target
CN106340031A (en) Method and device for detecting moving object
CN117095370A (en) Multi-camera detection target fusion and blind supplementing method
WO2022142827A1 (en) Road occupancy information determination method and apparatus
DE102020213799A1 (en) Obstacle detection device and obstacle detection method
JP4055785B2 (en) Moving object height detection method and apparatus, and object shape determination method and apparatus
JP7456716B2 (en) Obstacle detection device and moving body
JP2712809B2 (en) Obstacle detection device for vehicles

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination