CN115909252A - Vehicle-mounted multi-camera target detection method, control device, storage medium and vehicle

Info

Publication number
CN115909252A
Authority
CN
China
Prior art keywords
target detection, loss function, target, cameras, loss
Legal status
Pending
Application number
CN202211627320.0A
Other languages
Chinese (zh)
Inventor
胡少晗
王皓
Current Assignee
Anhui Weilai Zhijia Technology Co Ltd
Original Assignee
Anhui Weilai Zhijia Technology Co Ltd
Filing date
2022-12-16
Publication date
2023-04-04
Application filed by Anhui Weilai Zhijia Technology Co Ltd
Priority to CN202211627320.0A

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems


Abstract

The invention relates to the technical field of automatic driving, and in particular to a vehicle-mounted multi-camera target detection method, a control device, a storage medium and a vehicle. It aims to solve the problem that inconsistent depth predictions from multiple vehicle-mounted cameras for the same object cause fusion errors and, in turn, misjudgments. To this end, the method constructs a joint loss function from a first loss function and a second loss function and uses it to jointly train the target detection models of the multiple cameras. Image data collected by the multiple cameras are then respectively input into the target detection models corresponding to the cameras to obtain multiple target detection results, and the multiple target detection results are fused to obtain a target detection fusion result. In this way, the consistency of target detection results among the multiple cameras is improved without losing the performance of each individual target detection model.

Description

Vehicle-mounted multi-camera target detection method, control device, storage medium and vehicle
Technical Field
The invention relates to the technical field of automatic driving, and particularly provides a vehicle-mounted multi-camera target detection method, a control device, a storage medium and a vehicle.
Background
In vehicle automatic driving technology, the output results of all sensors must be fused so as to obtain a full 360-degree view of the surroundings that assists automatic driving. Because the sensors' fields of view overlap, several cameras may all see the same object, in which case each of those cameras outputs its own prediction result for the object.
When fusing multi-sensor prediction results, fusion is generally performed according to the distance between objects, the intersection-over-union (IoU) between the object predicted by camera A, projected onto camera B, and the object predicted by camera B, and the similarity of the object as imaged by different cameras; if the difference is too large, the predictions are determined to belong to different objects. The fusion result obtained by multi-sensor fusion is generally much better than the prediction result of any single sensor. However, if the depths predicted by multiple cameras for the same object are inconsistent, and the inconsistency is large, the predictions are determined to be two distinct objects, producing a fusion error and a serious misjudgment.
Accordingly, there is a need in the art for a new multi-camera prediction fusion scheme to solve the above problems.
Disclosure of Invention
To overcome the above drawback, the present invention is proposed to solve, or at least partially solve, the problem that inconsistent depth predictions from multiple cameras arranged on a vehicle for the same object cause fusion errors and, in turn, misjudgments.
In a first aspect, the present invention provides a method of on-board multi-camera target detection, the method comprising:
the method comprises the steps that image data collected by a plurality of cameras are subjected to time synchronization and then are respectively input into target detection models corresponding to the cameras to obtain a plurality of target detection results, wherein the target detection models of the cameras are obtained by performing joint training through joint loss functions constructed by a first loss function and a second loss function, the first loss function is a consistency loss function of the target detection results among the target detection models, and the second loss function is a loss function between the target detection result and a true value of each target detection model; and
and fusing the target detection results to obtain a target detection fusion result.
In one technical solution of the above vehicle-mounted multi-camera target detection method, jointly training the target detection models of the multiple cameras by using a joint loss function constructed from a first loss function and a second loss function includes:
grouping image training data in a preset data set according to time stamps to obtain different groups of image training data, wherein each group of image training data has the same time stamp;
for each iteration of the joint training, respectively inputting a group of image training data into the plurality of target detection models to respectively obtain target detection results of the plurality of target detection models on the same target;
obtaining consistency loss among the target detection results according to the first loss function, and obtaining original loss between the target detection result and a true value of each target detection model according to the second loss function;
applying the joint loss function, and obtaining the joint loss of the current iteration according to the consistency loss and the original loss;
and according to the joint loss, reversely propagating and updating parameters of the plurality of target detection models so as to realize joint training of the plurality of target detection models.
In one embodiment of the above vehicle-mounted multi-camera target detection method, obtaining a consistency loss between the plurality of target detection results according to the first loss function includes:
obtaining a plurality of target detection results of a plurality of target detection models on the same target;
and according to the external parameters among the cameras and the target detection results, the first loss function is applied to obtain the consistency loss of the target detection results in the same coordinate system.
In one technical solution of the above vehicle-mounted multi-camera target detection method, obtaining multiple target detection results of multiple target detection models for the same target includes:
for the same target, each target detection model obtains a matching 2d frame matched with a real 2d frame of the target from a plurality of predicted detection 2d frames;
and averaging the 3d coordinates of the matched 2d frames to obtain a target detection result of the target detection model on the target.
In one embodiment of the above vehicle-mounted multi-camera target detection method, for the same target, obtaining, by each target detection model, a matching 2d frame matching a real 2d frame of the target from a plurality of predicted detection 2d frames, includes:
calculating the intersection ratio between each detection 2d frame and the real 2d frame;
comparing the cross-over ratio with a preset cross-over ratio threshold value;
and when the intersection ratio is greater than the intersection ratio threshold value, taking the corresponding detection 2d frame as the matching 2d frame.
In an embodiment of the above vehicle-mounted multi-camera target detection method, the obtaining, by applying the first loss function according to external parameters among the multiple cameras and the multiple target detection results, a consistency loss of the multiple target detection results in the same coordinate system includes:
converting target detection results of the targets detected by the plurality of target detection models into the same camera coordinate system according to external parameters among different cameras;
and acquiring consistency loss of the plurality of target detection results in the same coordinate system according to the consistency loss function and the target detection results of different cameras in the same camera coordinate system.
In one technical solution of the above vehicle-mounted multi-camera target detection method, the consistency loss function is a function of offsets between center coordinates of target detection results of the plurality of target detection models for the same target in the same camera coordinate system; and/or,
the weight value of the first loss function in the joint loss function is less than the weight value of the second loss function in the joint loss function.
In a second aspect, there is provided a control device comprising a processor and a storage device adapted to store a plurality of program codes, the program codes being adapted to be loaded and run by the processor to perform the vehicle-mounted multi-camera object detection method according to any one of the above-mentioned aspects of the vehicle-mounted multi-camera object detection method.
In a third aspect, a computer readable storage medium is provided, having stored therein a plurality of program codes adapted to be loaded and run by a processor to perform the vehicle mounted multi-camera object detection method according to any one of the above-mentioned aspects of the vehicle mounted multi-camera object detection method.
In a fourth aspect, a vehicle is provided, where a plurality of cameras are arranged on the vehicle, and the vehicle includes the control device in the above control device technical solution.
One or more of the above technical solutions of the present invention have at least one or more of the following beneficial effects:
In the technical solution implementing the invention, a first loss function and a second loss function are applied to construct a joint loss function, and the target detection models of a plurality of cameras are jointly trained, wherein the first loss function is a consistency loss function between the target detection results of the plurality of target detection models, and the second loss function is a loss function between the target detection result and the true value of each target detection model. Then, after time synchronization, the image data collected by the plurality of cameras are respectively input into the target detection models corresponding to the cameras to obtain a plurality of target detection results, and the plurality of target detection results are fused to obtain a target detection fusion result. With this configuration, the target detection models for the vehicle-mounted multiple cameras are obtained by joint training with the first loss function and the second loss function, so the consistency of the target detection results among the multiple cameras is improved without losing the performance of each target detection model, fusion errors in which the same object is judged as two objects are effectively avoided when the target detection results of the plurality of cameras are fused, and the accuracy of the target detection fusion result is improved.
Drawings
The disclosure of the present invention will become more readily understood with reference to the accompanying drawings. As is readily understood by those skilled in the art: these drawings are for illustrative purposes only and are not intended to be a limitation on the scope of the present disclosure. Wherein:
fig. 1 is a flow chart illustrating the main steps of a vehicle-mounted multi-camera target detection method according to an embodiment of the present invention;
FIG. 2 is image data acquired by an example in-vehicle front view FW camera;
FIG. 3 is image data collected by an example in-vehicle forward view FN camera;
FIG. 4 is image data collected by an example in-vehicle side view FL camera;
FIG. 5 is a schematic diagram of a main architecture for jointly training a plurality of target detection models according to an embodiment of the present invention;
fig. 6 is a schematic diagram comparing the target detection fusion results obtained by the vehicle-mounted multi-camera target detection method according to the present invention and the detection method in the prior art.
Detailed Description
Some embodiments of the invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are only for explaining the technical principle of the present invention, and are not intended to limit the scope of the present invention.
In the description of the present invention, a "module" or "processor" may include hardware, software, or a combination of both. A module may comprise hardware circuitry, various suitable sensors, communication ports, and memory, may comprise software components such as program code, or may be a combination of software and hardware. The processor may be a central processing unit, microprocessor, image processor, digital signal processor, or any other suitable processor. The processor has data and/or signal processing functionality. The processor may be implemented in software, hardware, or a combination thereof. Non-transitory computer-readable storage media include any suitable medium that can store program code, such as magnetic disks, hard disks, optical disks, flash memory, read-only memory, random-access memory, and the like. The term "A and/or B" denotes all possible combinations of A and B, such as A alone, B alone, or A and B. The term "at least one of A or B" or "at least one of A and B" has a meaning similar to "A and/or B" and may include A alone, B alone, or both A and B. The singular forms "a", "an" and "the" may include the plural forms as well.
In driving assistance technology, the surrounding environment is often perceived with multiple sensors arranged on the vehicle. Figs. 2 to 4 respectively show image data acquired at the same moment by a front-view FW camera, a front-view FN camera and a side-view FL camera arranged on a vehicle; the box in each image marks a truck that is present in the field of view of all three cameras, so each camera produces a prediction result for the truck, which can be the 3d coordinates of the truck. When these 3d coordinates are uniformly converted into the body coordinate system, the differences between them can be large (for example, up to 50 m), which may cause the truck to be predicted as several objects during multi-camera fusion, leading to a misjudgment in the fusion process.
Accordingly, a new on-board multi-camera object detection method is needed to solve the above problems.
Referring to fig. 1, fig. 1 is a flow chart illustrating main steps of a vehicle-mounted multi-camera target detection method according to an embodiment of the invention. As shown in fig. 1, the vehicle-mounted multi-camera target detection method in the embodiment of the present invention mainly includes the following steps S101 to S102.
Step S101: the method comprises the steps of performing time synchronization on image data collected by a plurality of cameras, and then respectively inputting the image data into target detection models corresponding to the cameras to obtain a plurality of target detection results, wherein the target detection models of the plurality of cameras are obtained by performing joint training by using a joint loss function constructed by a first loss function and a second loss function, the first loss function is a consistency loss function of the target detection results among the plurality of target detection models, and the second loss function is a loss function between the target detection result and a true value of each target detection model.
In this embodiment, a first loss function and a second loss function may be defined, where the first loss function is a consistency loss function of the target detection results among the plurality of target detection models, and the second loss function is a loss function between the target detection result and the true value of each target detection model. The first loss function and the second loss function can form a joint loss function with which the target detection models of the multiple cameras are jointly trained, so that the trained target detection models can be applied to target detection to obtain a target detection result for each camera. Here, the consistency loss is a quantity that evaluates the difference in 3d coordinates between the target detection results of different cameras for the same target, such as a difference in depth or an offset between center coordinates.
In one embodiment, the consistency loss function is a function of an offset between center coordinates of target detection results of a plurality of target detection models for the same target in the same camera coordinate system.
In one embodiment, the consistency loss function (i.e., the first loss function) can be constructed from the offsets of the target detection results of any two cameras along the x, y and z axes, as shown in the following formula (1):
L_{i,j} = L_{i,j,x}(x_i, x_j) + L_{i,j,y}(y_i, y_j) + L_{i,j,z}(z_i, z_j)
L_con = Σ_{i=1}^{N-1} Σ_{j=i+1}^{N} L_{i,j}   (1)
wherein L_con is the consistency loss function, N is the number of target detection results, L_{i,j,x}(x_i, x_j) is the offset of the x coordinate between the center coordinates of the target detection results of the ith camera and the jth camera for the same target, L_{i,j,y}(y_i, y_j) is the offset of the y coordinate between those center coordinates, and L_{i,j,z}(z_i, z_j) is the offset of the z coordinate between those center coordinates.
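For illustration only, a minimal sketch of this pairwise consistency loss is given below. It assumes the predicted 3d centers of the same target have already been converted into one common camera coordinate system, and it takes the per-axis offset terms to be L1 offsets, a form the patent does not fix; all names are hypothetical.

```python
import torch

def consistency_loss(centers: torch.Tensor) -> torch.Tensor:
    # centers: (N, 3) tensor holding the predicted 3d center of the same
    # target from each of the N cameras, in one common camera frame.
    n = centers.shape[0]
    loss = centers.new_zeros(())
    for i in range(n - 1):
        for j in range(i + 1, n):
            # Sum of x, y and z offsets between camera i and camera j
            # (L1 form assumed for the offset functions of formula (1)).
            loss = loss + torch.abs(centers[i] - centers[j]).sum()
    return loss
```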
In one embodiment, the target detection result of one camera may be selected as a reference, and offsets between the target detection results of other cameras and the reference may be obtained and summed to obtain the consistency loss function.
In one embodiment, the loss function (i.e., the second loss function) between the target detection result and the true value of a certain target detection model can be determined by the following formula (2):
L_k = L_k(Δx_k) + L_k(Δy_k) + L_k(Δz_k),  k = 1…N   (2)
wherein L_k is the second loss function of the kth target detection model, N is the number of target detection results, L_k(Δx_k) is the offset of the x coordinate between the center coordinate of the kth target detection result and the true-value coordinate, L_k(Δy_k) is the offset of the y coordinate between the center coordinate of the kth target detection result and the true-value coordinate, and L_k(Δz_k) is the offset of the z coordinate between the center coordinate of the kth target detection result and the true-value coordinate.
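A corresponding sketch of the per-model loss of formula (2), under the same assumed L1 form for the per-axis offset terms:

```python
import torch

def original_loss(pred_center: torch.Tensor, gt_center: torch.Tensor) -> torch.Tensor:
    # Per-axis offsets (Δx_k, Δy_k, Δz_k) between the predicted center of
    # the kth model's detection result and the ground-truth center.
    return torch.abs(pred_center - gt_center).sum()
```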
In one embodiment, the weight values in the joint loss function may be set for the first loss function and the second loss function, respectively, and the joint loss function may be obtained by weighting the first loss function and the second loss function, as shown in the following formula (3).
L_total = λ_1 · L_con + λ_2 · Σ_{k=1}^{N} L_k   (3)
wherein L_total is the joint loss function, λ_1 is the weight value of the first loss function, and λ_2 is the weight value of the second loss function.
In one embodiment, the weight value λ_1 of the first loss function is less than the weight value λ_2 of the second loss function. In this way, the first loss function does not dominate the joint training, so the consistency loss among the target detection models is reduced while the performance of each target detection model is preserved, and a trivial solution is avoided during the joint training.
In one embodiment, the weight value ratio between the first loss function and the second loss function may be 1.
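Combining the two sketches above, the joint loss of formula (3) might be assembled as follows; the numeric weights are placeholders chosen only to satisfy λ_1 < λ_2, not values taken from the patent:

```python
import torch

def joint_loss(centers: torch.Tensor, gt_centers: torch.Tensor,
               lambda1: float = 0.1, lambda2: float = 1.0) -> torch.Tensor:
    # Weighted sum of the consistency loss across models and the sum of
    # per-model losses against ground truth; lambda1 < lambda2 keeps the
    # consistency term from dominating the joint training.
    l_con = consistency_loss(centers)
    l_orig = sum(original_loss(c, g) for c, g in zip(centers, gt_centers))
    return lambda1 * l_con + lambda2 * l_orig
```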
Step S102: and fusing the multiple target detection results to obtain a target detection fusion result.
In this embodiment, a plurality of target detection results may be fused, thereby obtaining a target detection fusion result based on multiple cameras. The process of fusing the multiple target detection results is target-level detection result fusion (high-level fusion), that is, after the target-level target detection results are obtained from the image data of each camera, the multiple target-level detection results are fused.
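As a toy illustration of target-level fusion, the sketch below greedily merges per-camera detections whose 3d centers lie within a distance threshold in a common body coordinate system. This uses only the distance criterion mentioned in the background; the threshold and all names are hypothetical, and the patent does not prescribe a particular fusion algorithm.

```python
import numpy as np

def fuse_detections(detections, dist_threshold=2.0):
    # detections: list of (camera_id, center) pairs, where each center is a
    # np.ndarray of 3d coordinates in a common body coordinate system.
    fused = []  # each entry is a list of detections judged to be one object
    for cam_id, center in detections:
        for group in fused:
            if all(np.linalg.norm(center - c) < dist_threshold for _, c in group):
                group.append((cam_id, center))
                break
        else:  # no existing group is close enough: start a new object
            fused.append([(cam_id, center)])
    return fused
```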
Based on the above steps S101 to S102, the embodiment of the present invention applies a first loss function and a second loss function to construct a joint loss function and jointly trains the target detection models of the multiple cameras, where the first loss function is a consistency loss function between the target detection results of the multiple target detection models, and the second loss function is a loss function between the target detection result and the true value of each target detection model. Then, after time synchronization, the image data collected by the multiple cameras are respectively input into the target detection models corresponding to the cameras to obtain multiple target detection results, and the multiple target detection results are fused to obtain a target detection fusion result. With this configuration, the target detection models for the vehicle-mounted multiple cameras are obtained by joint training with the first loss function and the second loss function, so the consistency of the target detection results among the multiple cameras is improved without losing the performance of each target detection model, fusion errors in which the same object is judged as two objects are effectively avoided when the target detection results of the multiple cameras are fused, and the accuracy of the target detection fusion results is improved.
The process of jointly training the object detection models of multiple cameras is further described below.
In one implementation of the embodiment of the present invention, joint training of target detection models of multiple cameras may be performed through the following steps S201 to S205:
step S201: and grouping the image training data in the preset data set according to the time stamps to obtain different groups of image training data, wherein each group of image training data has the same time stamp.
In this embodiment, the image training data in the preset data set may be classified according to the time stamps, and the image training data with the same time stamp are taken as one group for the iterative training of the multiple target detection models. Referring to fig. 5, fig. 5 is a schematic diagram of the main architecture of joint training of multiple target detection models according to an embodiment of the present invention. As shown in fig. 5, still taking a front-view FW camera, a front-view FN camera and a side-view FL camera as an example, there are correspondingly 3 target detection models (shown schematically as a single box in the figure), and the image training data acquired by FW, FN and FL can be grouped as a preset data set according to the time stamps, so as to obtain a plurality of groups such as "FW1, FN1, FL1", "FW2, FN2, FL2", …, and so on.
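A minimal sketch of this grouping, assuming each training sample carries hypothetical 'timestamp', 'camera' and 'image' fields:

```python
from collections import defaultdict

def group_by_timestamp(dataset):
    # Bucket samples so that each group holds the frames of all cameras
    # sharing one timestamp, e.g. {"FW": FW1, "FN": FN1, "FL": FL1}.
    groups = defaultdict(dict)
    for sample in dataset:
        groups[sample["timestamp"]][sample["camera"]] = sample["image"]
    return list(groups.values())
```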
Step S202: for each iteration of the joint training, a group of image training data is respectively input into the plurality of target detection models so as to respectively obtain the target detection results of the plurality of target detection models on the same target.
In this embodiment, at each iteration of the joint training, a group of image training data may be input into the plurality of target detection models to obtain the target detection results of the plurality of target detection models for the same target. As shown in fig. 5, a group of image training data is input into the models to obtain target detection results of the different cameras for different targets (object 1, object 2, object 3, …).
In one embodiment, step S202 may further include the following steps S2021 and S2022:
step S2021: for the same target, each target detection model obtains a matching 2d box matching the real 2d box of the target from the predicted multiple detection 2d boxes.
In the present embodiment, step S2021 may further include the following step S20211 to step S20213:
step S20211: the intersection ratio between each detected 2d frame and the true 2d frame is calculated.
Step S20212: and comparing the intersection ratio with a preset intersection ratio threshold value.
Step S20213: and when the intersection ratio is greater than the intersection ratio threshold value, taking the corresponding detection 2d frame as a matching 2d frame.
Step S2022: the 3d coordinates of the matching 2d frames are averaged to obtain the target detection result of the target detection model for the target.
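Steps S20211 to S2022 can be sketched as follows, assuming axis-aligned 2d boxes given as (x1, y1, x2, y2) and one predicted 3d coordinate per detection box; the threshold value and all names are illustrative:

```python
import numpy as np

def iou(box_a, box_b):
    # Intersection ratio (IoU) of two axis-aligned boxes (x1, y1, x2, y2).
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def match_and_average(det_boxes, det_coords_3d, real_box, iou_threshold=0.5):
    # Keep detection 2d boxes whose IoU with the real 2d box exceeds the
    # threshold (matching 2d boxes), then average their 3d coordinates to
    # get the model's target detection result for this target.
    matched = [c for b, c in zip(det_boxes, det_coords_3d)
               if iou(b, real_box) > iou_threshold]
    return np.mean(matched, axis=0) if matched else None
```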
Step S203: and obtaining consistency loss among a plurality of target detection results according to the first loss function, and obtaining original loss between the target detection result and the true value of each target detection model according to the second loss function.
In the present embodiment, the consistency loss may be acquired according to the following steps S2031 and S2032:
step S2031: and obtaining a plurality of target detection results of the plurality of target detection models on the same target.
Step S2032: and according to external parameters among the cameras and a plurality of target detection results, applying a first loss function to obtain consistency loss of the plurality of target detection results in the same coordinate system. For example, the consistency loss may be calculated according to a consistency loss function shown in equation (1).
In this embodiment, as shown in fig. 5, if the target is object 2, the 3d coordinates of object 2 obtained by the different target detection models can be converted into one camera coordinate system (the FW camera coordinate system) to obtain loss(FW, FL) (the offset of object 2 between the target detection results of the FW camera and the FL camera) and loss(FW, FN) (the offset of object 2 between the target detection results of the FW camera and the FN camera), and the consistency loss is then obtained from loss(FW, FL) and loss(FW, FN).
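Converting a 3d center between camera coordinate systems is the standard rigid transform given by the extrinsic calibration (the "external parameters" among the cameras); a sketch with illustrative names:

```python
import numpy as np

def to_reference_frame(center_j, R_ij, t_ij):
    # Map a 3d center from camera j's frame into camera i's frame using
    # extrinsics (R_ij, t_ij), i.e. x_i = R_ij @ x_j + t_ij.
    return R_ij @ np.asarray(center_j) + t_ij
```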
Step S204: and applying a joint loss function to obtain the joint loss of the current iteration according to the consistency loss and the original loss.
In this embodiment, a joint loss function may be applied to obtain the joint loss of the current iteration based on the consistency loss and the original loss.
Step S205: and according to the joint loss, reversely propagating and updating parameters of the plurality of target detection models so as to realize joint training of the plurality of target detection models.
In this embodiment, parameters of the plurality of target detection models may be updated by back propagation according to the joint loss, so as to implement joint training of the plurality of target detection models.
In one embodiment, a gradient descent method may be applied to back-propagate and update the parameters of the target detection models so as to minimize the consistency loss of the target detection results obtained by the target detection models, thereby implementing the joint training of the multiple target detection models.
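Putting the pieces together, one iteration of the joint training might look like the sketch below (PyTorch-style, reusing the hypothetical helpers above; detect_center stands in for running one model and returning the matched, averaged 3d center, and the shared optimizer is an assumption consistent with the gradient descent mentioned here):

```python
import torch

def train_step(models, optimizer, group, gt_centers):
    # models: dict camera_name -> detection model;
    # group: dict camera_name -> image tensor for one timestamp.
    optimizer.zero_grad()
    centers = torch.stack(
        [detect_center(model, group[cam]) for cam, model in models.items()])
    loss = joint_loss(centers, gt_centers)  # joint loss of formula (3)
    loss.backward()   # gradients propagate back into every camera's model
    optimizer.step()  # parameters of all models are updated together
    return loss.item()
```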
In one embodiment, the training cutoff condition of the joint training is that a preset data set has been traversed a preset number of times. Those skilled in the art may also set the training cutoff condition according to actual application requirements, for example, that the prediction accuracy and the consistency of the target detection models both reach preset conditions.
In one embodiment, referring to fig. 6, fig. 6 is a schematic diagram of the comparison of the target detection fusion results obtained by the vehicle-mounted multi-camera target detection method according to the present invention and the detection method in the prior art. As shown in fig. 6, the top of fig. 6 is the target detection result of the prior art, and the bottom is the target fusion detection result obtained by applying the embodiment of the present invention. Compared with the detection result obtained by the detection method in the prior art, the target fusion detection result obtained by applying the vehicle-mounted multi-camera target detection method provided by the embodiment of the invention has fewer fusion errors, and the accuracy of multi-camera target detection is effectively improved.
It should be noted that, although the foregoing embodiments describe each step in a specific sequence, those skilled in the art will understand that, in order to achieve the effect of the present invention, different steps do not necessarily need to be executed in such a sequence, and they may be executed simultaneously (in parallel) or in other sequences, and these changes are all within the protection scope of the present invention.
It will be understood by those skilled in the art that all or part of the flow of the method of the above-described embodiment may be implemented by a computer program, which may be stored in a computer-readable storage medium and which, when executed by a processor, implements the steps of the above-described method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic diskette, an optical disk, a computer memory, a read-only memory, a random access memory, an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable storage medium may be appropriately increased or decreased as required by legislation and patent practice in a jurisdiction; for example, in some jurisdictions, computer-readable storage media do not include electrical carrier signals and telecommunication signals, in accordance with legislation and patent practice.
Furthermore, the invention also provides a control device. In an embodiment of the control device according to the present invention, the control device comprises a processor and a storage device. The storage device may be configured to store a program for executing the vehicle-mounted multi-camera target detection method of the above method embodiment, and the processor may be configured to execute the program in the storage device, including but not limited to the program for executing the vehicle-mounted multi-camera target detection method of the above method embodiment. For convenience of explanation, only the parts related to the embodiments of the present invention are shown, and specific technical details are not disclosed. The control device may be a device formed of various electronic apparatuses.
Further, the invention also provides a computer-readable storage medium. In an embodiment of the computer-readable storage medium according to the present invention, the computer-readable storage medium may be configured to store a program for executing the vehicle-mounted multi-camera target detection method of the above method embodiment, and the program may be loaded and run by a processor to implement the vehicle-mounted multi-camera target detection method described above. For convenience of explanation, only the parts related to the embodiments of the present invention are shown, and specific technical details are not disclosed. The computer-readable storage medium may be a storage device formed of various electronic devices; optionally, in the embodiment of the present invention, the computer-readable storage medium is a non-transitory computer-readable storage medium.
Further, the invention also provides a vehicle, in one embodiment of the vehicle according to the invention, a plurality of cameras are arranged on the vehicle, and the vehicle comprises the control device in the embodiment of the control device.
Further, it should be understood that, since the modules are only configured to illustrate the functional units of the apparatus of the present invention, the corresponding physical devices of the modules may be the processor itself, or a part of software, a part of hardware, or a part of a combination of software and hardware in the processor. Thus, the number of individual blocks in the figures is merely illustrative.
Those skilled in the art will appreciate that the various modules in the apparatus may be adaptively split or combined. Such splitting or combining of specific modules does not cause the technical solutions to deviate from the principle of the present invention, and therefore, the technical solutions after splitting or combining will fall within the protection scope of the present invention.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

Claims (10)

1. An on-vehicle multi-camera target detection method, characterized in that the method comprises:
the method comprises the steps that image data collected by a plurality of cameras are subjected to time synchronization and then are respectively input into target detection models corresponding to the cameras to obtain a plurality of target detection results, wherein the target detection models of the cameras are obtained by performing joint training through joint loss functions constructed by a first loss function and a second loss function, the first loss function is a consistency loss function of the target detection results among the target detection models, and the second loss function is a loss function between the target detection result and a true value of each target detection model; and
and fusing the target detection results to obtain a target detection fusion result.
2. The method of claim 1,
the target detection models of the multiple cameras are obtained by performing joint training by using a joint loss function constructed by a first loss function and a second loss function, and the method comprises the following steps:
grouping image training data in a preset data set according to time stamps to obtain different groups of image training data, wherein each group of image training data has the same time stamp;
for each iteration of the joint training, respectively inputting a group of image training data into the plurality of target detection models to respectively obtain target detection results of the plurality of target detection models on the same target;
obtaining consistency loss among the target detection results according to the first loss function, and obtaining original loss between the target detection result and a true value of each target detection model according to the second loss function;
applying the joint loss function, and obtaining the joint loss of the current iteration according to the consistency loss and the original loss;
and according to the joint loss, reversely propagating and updating parameters of the plurality of target detection models so as to realize joint training of the plurality of target detection models.
3. The method of claim 2,
obtaining consistency loss among the plurality of target detection results according to the first loss function, including:
obtaining a plurality of target detection results of a plurality of target detection models on the same target;
and according to the external parameters among the cameras and the target detection results, applying the first loss function to obtain the consistency loss of the target detection results in the same coordinate system.
4. The method of claim 3,
the obtaining of the multiple target detection results of the multiple target detection models on the same target includes:
for the same target, each target detection model obtains a matching 2d frame matched with a real 2d frame of the target from a plurality of predicted detection 2d frames;
and averaging the 3d coordinates of the matched 2d frames to obtain a target detection result of the target detection model on the target.
5. The method of claim 4,
for the same target, each target detection model obtains a matching 2d frame matching a real 2d frame of the target from a plurality of predicted detection 2d frames, including:
calculating the intersection ratio between each detection 2d frame and the real 2d frame;
comparing the cross-over ratio with a preset cross-over ratio threshold value;
and when the intersection ratio is greater than the intersection ratio threshold value, taking the corresponding detection 2d frame as the matching 2d frame.
6. The method of claim 3,
the obtaining consistency loss of the multiple target detection results in the same coordinate system by applying the first loss function according to the external parameters among the multiple cameras and the multiple target detection results includes:
converting target detection results of the targets detected by the plurality of target detection models into the same camera coordinate system according to external parameters among different cameras;
and acquiring consistency loss of the plurality of target detection results in the same coordinate system according to the consistency loss function and target detection results of different cameras in the same camera coordinate system.
7. The method according to any one of claims 1 to 6, wherein the consistency loss function is a function of an offset between center coordinates of target detection results of the plurality of target detection models for the same target under the same camera coordinate system; and/or,
the weight value of the first loss function in the joint loss function is less than the weight value of the second loss function in the joint loss function.
8. A control device comprising a processor and a memory device, said memory device being adapted to store a plurality of program codes, characterized in that said program codes are adapted to be loaded and run by said processor to perform the method according to any of claims 1 to 7.
9. A computer readable storage medium having stored therein a plurality of program codes, characterized in that the program codes are adapted to be loaded and run by a processor to perform the method of any of claims 1 to 7.
10. A vehicle provided with a plurality of cameras, characterized by comprising the control apparatus of claim 8.
CN202211627320.0A (priority date 2022-12-16, filed 2022-12-16): Vehicle-mounted multi-camera target detection method, control device, storage medium and vehicle. Pending; published as CN115909252A.

Priority Applications (1)

CN202211627320.0A, priority and filing date 2022-12-16: Vehicle-mounted multi-camera target detection method, control device, storage medium and vehicle

Publications (1)

CN115909252A, published 2023-04-04

Family

ID=86491106

Country Status (1)

CN: CN115909252A (en)

Cited By (1)

* Cited by examiner, † Cited by third party

CN117523431A * (priority date 2023-11-17, published 2024-02-06; University of Science and Technology of China): Firework detection method and device, electronic equipment and storage medium


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination