WO2023273041A1

WO2023273041A1 - Target detection method and apparatus in vehicle-road coordination, and roadside device

Info

Publication number: WO2023273041A1
Application number: PCT/CN2021/126163
Authority: WO
Inventors: 夏春龙
Original assignee: 阿波罗智联(北京)科技有限公司
Priority date: 2021-06-28
Filing date: 2021-10-25
Publication date: 2023-01-05
Also published as: CN113420682A; CN113420682B; JP2023536025A; JP7436670B2; KR20220091607A

Abstract

The present disclosure relates to the field of intelligent transportation, and in particular, to the technical field of image detection. Disclosed are a target detection method and apparatus in vehicle-road coordination, and a roadside device. A specific implementation solution comprises: performing target detection on an image, and obtaining candidate target regions in the image, confidence levels of the candidate target regions, and the degrees of blockage of the candidate target regions; updating the confidence levels of the candidate target regions on the basis of the intersection over union between the candidate target regions and the degrees of blockage of the candidate target regions; and detecting a target in the image from the candidate target regions according to the updated confidence levels. When the solution provided by embodiments of the present disclosure is used for target detection, the accuracy of target detection is improved.

Description

Target detection method, device and roadside equipment in vehicle-road coordination

This application claims the priority of the Chinese patent application with the application number 202110721853.4 submitted to the China Patent Office on June 28, 2021, and the invention title is "Target Detection Method, Device and Roadside Equipment in Vehicle-Infrastructure Coordination", the entire content of which is incorporated by reference incorporated in this application.

technical field

The present disclosure relates to the technical field of intelligent transportation, in particular to the technical field of image detection.

Background technique

In the application scenarios such as road monitoring and vehicle route planning of V2X (Vehicle to everything, wireless communication technology for vehicles), after obtaining the image collected by the image acquisition device, it is necessary to carry out an Detection, to locate the target in the image, and then trigger the processing operation for the above target, or combine the above target for vehicle path planning, etc. Therefore, a method for object detection in vehicle-road coordination is needed to detect objects in images.

Contents of the invention

The present disclosure provides a target detection method, device and roadside equipment in vehicle-road coordination.

According to an aspect of the present disclosure, there is provided a method for object detection in vehicle-road coordination, the method comprising:

Carrying out target detection on the image to obtain a candidate target area in the image, a confidence degree of the candidate target area, and an occlusion degree of the candidate target area;

Update the confidence of the candidate target area based on the intersection ratio between the candidate target areas and the occlusion degree of the candidate target area;

Objects in the image are detected from candidate object regions according to the updated confidence.

According to an aspect of the present disclosure, a device for detecting objects in vehicle-road coordination is provided, the device comprising:

An information obtaining module, configured to perform target detection on an image, and obtain a candidate target area in the image, a confidence degree of the candidate target area, and an occlusion degree of the candidate target area;

Confidence update module, for updating the confidence of the candidate target area based on the intersection ratio between the candidate target areas and the degree of occlusion of the candidate target area;

The object detection module is used to detect the object in the image from the candidate object area according to the updated confidence.

According to another aspect of the present disclosure, an electronic device is provided, including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

The memory stores instructions that can be executed by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can implement the method for object detection in vehicle-road coordination.

According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause the computer to execute a method for detecting objects in vehicle-road coordination.

According to another aspect of the present disclosure, a computer program product is provided, including a computer program, and when the computer program is executed by a processor, a method for detecting objects in vehicle-road coordination is implemented.

According to another aspect of the present disclosure, there is provided a roadside device, including the above-mentioned electronic device.

According to another aspect of the present disclosure, a cloud control platform is provided, including the above-mentioned electronic device.

It can be seen from the above that when applying the scheme provided by the embodiments of the present disclosure for object detection, firstly, the confidence of the candidate object area is updated according to the intersection ratio between the candidate object areas and the degree of occlusion of the candidate object area, and then based on the updated Confidence, to detect an object in an image from candidate object regions. Since the intersection ratio between candidate object regions can reflect the degree of overlap between candidate object regions, and the occlusion degree of candidate object regions can reflect the degree of occlusion of candidate object regions, therefore, according to the above intersection ratio and occlusion degree, update the candidate The confidence of the target area can refer to the overlap between the target areas, so that the updated confidence of the candidate target area is closer to the actual situation, so that the image can be detected according to the updated confidence, which can improve the accuracy of target detection .

It should be understood that what is described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will be readily understood through the following description.

Description of drawings

The accompanying drawings are used to better understand the present solution, and do not constitute a limitation to the present disclosure. in:

FIG. 1 is a schematic flowchart of a method for detecting objects in vehicle-road coordination according to an embodiment of the present disclosure;

FIG. 2 is a schematic diagram of an image provided according to an embodiment of the present disclosure;

Fig. 3a is a schematic structural diagram of a network model provided according to an embodiment of the present disclosure;

Fig. 3b is a schematic structural diagram of another network model provided according to an embodiment of the present disclosure;

Fig. 4 is a schematic structural diagram of an object detection device in vehicle-road coordination provided according to an embodiment of the present disclosure;

Fig. 5 is a schematic structural diagram of an electronic device provided according to an embodiment of the present disclosure.

detailed description

Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

Embodiments of the present disclosure provide a method, device, and roadside equipment for object detection in vehicle-road coordination.

In one embodiment of the present disclosure, a method for object detection in vehicle-road coordination is provided, the method includes:

Perform target detection on the image to obtain the candidate target area in the image, the confidence of the candidate target area, and the occlusion degree of the candidate target area;

Objects in the image are detected from candidate object regions based on the updated confidence.

Since the intersection ratio between candidate object regions can reflect the degree of overlap between candidate object regions, and the occlusion degree of candidate object regions can reflect the degree of occlusion of candidate object regions, therefore, according to the above intersection ratio and occlusion degree, update the candidate The confidence of the target area can refer to the overlap between the target areas, so that the updated confidence of the candidate target area is closer to the actual situation, so that the image can be detected according to the updated confidence, which can improve the accuracy of target detection .

The subject of execution of the embodiments of the present disclosure will be described below.

The execution subject of the embodiment of the present disclosure may be an electronic device integrated with a target detection function, wherein the above-mentioned electronic device may be: a desktop computer, a notebook computer, a server, an image acquisition device, and the like. Wherein, the image acquisition device may include: a video camera, a camera, a driving recorder, and the like.

The solutions provided by the embodiments of the present disclosure can be applied to target detection of images collected in application scenarios such as vehicle-road coordination V2X road monitoring and vehicle path planning.

In addition, the solutions provided by the embodiments of the present disclosure may also be used to perform target detection on images collected in other scenarios. For example, the above-mentioned other scenes may be scenes with a high density of people, such as subway stations, shopping malls, concerts, etc. If images are collected for such scenes, the people contained in the collected images are often dense, and it is easy for some people’s faces to be blocked by others. Situations where a person's face is occluded. The above-mentioned scene can also be a scene with relatively dense personnel such as the entrance of a museum, a bank lobby, etc. For image collection of such a scene, the face of the person may be blocked by other people or buildings in the collected image.

The above are only examples of application scenarios of the embodiments of the present disclosure, and do not limit the present disclosure.

The aforementioned objects may be human faces, animals, vehicles, and so on.

The object detection method in the vehicle-road coordination provided by the embodiments of the present disclosure will be described in detail below.

Referring to FIG. 1 , FIG. 1 is a schematic flowchart of a method for object detection in vehicle-road coordination provided by an embodiment of the present disclosure. The above method includes the following steps S101 - S103.

Step S101: Perform target detection on the image to obtain the candidate target area in the image, the confidence level of the candidate target area, and the occlusion degree of the candidate target area.

The foregoing images may be images acquired through image acquisition for a specific scene. The above-mentioned scenes can include vehicle driving scenes, parking lot scenes, etc. In this case, the above-mentioned objects can be vehicles; the above-mentioned scenes can also include public space scenes such as subway stations and high-speed rail stations. In this case, the above-mentioned objects can be people.

When performing target detection, in one embodiment, a preset target detection algorithm may be used to perform target detection on an image to obtain a candidate target area in the image, a confidence degree of the candidate target area, and an occlusion degree of the candidate target area.

The aforementioned preset target detection algorithm may be a detection algorithm adopted for different types of targets. For example, when the target is a person, a face detection algorithm, a human body detection algorithm, etc. can be used; when the target is a vehicle, a vehicle detection algorithm, a license plate detection algorithm, etc. can be used.

For other implementation manners of target detection, reference may be made to subsequent embodiments, and details are not described here.

The candidate target area refers to the area where the target may exist after target detection. Taking FIG. 2 as an example, the area surrounded by each rectangular frame in FIG. 2 is a candidate target area obtained by performing animal detection on the image.

The confidence of the candidate target area reflects: the possibility of the existence of the target in the candidate target area. The confidence level above can be expressed in decimals, percentages, and the like. The larger the value of the confidence degree, the higher the probability that the target exists in the candidate target area.

For example, when the target is a person, when the confidence of the candidate target area A is greater than the confidence of the candidate target area B, it means that the possibility of a person in the candidate target area A is higher than the possibility of a person in the candidate target area B.

The degree of occlusion of the candidate target area reflects: the degree of occlusion of the candidate target area. The above occlusion degree can be represented by decimals, percentages, etc., and can also be expressed by the occlusion level serial number, for example: the occlusion level serial number includes 1, 2, 3, wherein, the serial number 1 can indicate that the occlusion level is severe occlusion, and the serial number is 2 can indicate that the occlusion level is moderate occlusion, and the sequence number 3 can indicate that the occlusion level is light occlusion.

Step S102: Based on the intersection-over-union ratio between the candidate target areas and the degree of occlusion of the candidate target areas, update the confidence of the candidate target areas.

The intersection-over-union ratio between candidate object regions is used to describe the coincidence degree between two candidate object regions. When the intersection ratio is higher, it means that the overlap degree between the two candidate target regions is higher; when the intersection ratio is lower, it means that the overlap degree between the two candidate target regions is lower.

Specifically, the overlapping area between two candidate target regions can be calculated to obtain the first area, the sum of the areas of the two candidate target regions can be calculated to obtain the second area, and then the difference between the second area and the first area can be calculated to obtain the second area Three areas, the ratio between the first area and the third area is determined as the intersection ratio between the candidate target areas.

For example: the area of the candidate target area A is 48, and the area of the candidate target area B is 32, wherein, the overlapping area of the candidate target area A and the candidate target area B is 16, that is, the first area is 16, and the candidate target area A and the candidate target area B are 16. The total area of the candidate target area B is (46+32)=80, that is, the second area is 80, and the difference between the second area and the first area is calculated (80-16)=64, that is, the third area is 64, The ratio of the first area to the third area is calculated to obtain 16/64=0.25, and 0.25 is the intersection and union ratio between the candidate target areas.

In an implementation manner, a reference area may be selected from each candidate target area, and for every other candidate target area except the reference area in each candidate target area, the intersection ratio between the other candidate target area and the reference area is calculated, The calculated intersection and union ratio is determined as the intersection and union ratio for updating the confidence of the candidate target region.

The aforementioned reference region may be the region with the highest confidence among the candidate target regions.

In another implementation manner, for each candidate target area, an intersection and union ratio may be selected from the intersection and union ratios between the candidate target area and other candidate target areas, and the selected intersection and union ratio may be determined as the The intersection and union ratio of the confidence of the candidate target region is updated.

For example: the maximum intersection and union ratio, the average intersection and union ratio, the median intersection and union ratio, or the minimum intersection and union ratio may be selected from the above-mentioned multiple intersection and union ratios.

When updating the confidence of the candidate target area, in one embodiment, the adjustment coefficient can be calculated according to the preset first weight and second weight according to the intersection ratio between the candidate target areas and the degree of occlusion of the candidate target area , according to the calculated adjustment coefficient, update the confidence of the candidate target region.

Specifically, the product of the intersection ratio between candidate target areas and the first weight can be calculated, and the product of the occlusion degree of the candidate target area and the second weight can be calculated, and the sum of the two calculated products can be used as Adjustment coefficient.

For example: the intersection ratio between candidate target areas is 80%, the occlusion degree of candidate target areas is 50%, the preset first weight is 0.8, and the preset second weight is 0.2. Calculate the intersection of candidate target areas. The product between the combination ratio and the first weight is: 0.8*80%=64%, the product between the occlusion degree of the calculated candidate target area and the second weight is: 0.2*50%=10%, the calculated two The sum of the products is: 64%+10%=74%, thus the adjustment factor is 74%.

After the adjustment coefficient is calculated, the product of the adjustment coefficient and the confidence of the candidate target area may be calculated as the updated confidence of the candidate target area.

For other implementation manners of updating the confidence of the candidate target region, reference may be made to subsequent embodiments, and details are not described here.

Step S103: Detect the target in the image from the candidate target regions according to the updated confidence.

In an embodiment of the present disclosure, a candidate target area whose updated confidence is greater than a preset confidence threshold may be selected, and the target in the selected candidate target area is determined as the target in the image.

The aforementioned preset reliability threshold may be set by staff based on experience, for example: when the confidence is represented by a percentage, the preset reliability threshold may be 90%, 95% and so on.

An example is used to illustrate the above target determination process, assuming that the confidence levels of the updated candidate target areas are: 80%, 70%, 90%, and 95%, respectively, and the preset confidence threshold is 85%. Confidence levels are 90% and 95%, among which, the confidence level after the update of area 1 is 90%, and the confidence level after update of area 2 is 95%, so the target in area 1 and the target in area 2 are The goal.

In this way, for candidate target regions whose confidence is greater than the preset confidence threshold, the probability that these candidate target regions contain a target is higher than the probability that other candidate target regions contain a target. Therefore, the target in the candidate target area whose confidence is greater than the preset confidence threshold is determined as the target in the image, and the accuracy of the obtained target is higher.

In an embodiment of the present disclosure, a preset number of candidate target areas with the highest updated confidence may also be selected, and the target in the selected candidate target areas is determined as the target in the image.

The above-mentioned preset number can be set by the staff based on experience, for example: the above-mentioned preset number can be 1, 3, 5, etc.

An example is used to illustrate the above target determination process. Assume that the confidence levels of the target area are: 80%, 70%, 90%, and 95%, respectively, and the preset number is 3, among which the three with the highest updated confidence levels are 95%, 90%, and 80% respectively; Objects in the candidate object regions with updated confidence levels of 95%, 90%, and 80% are determined as objects in the image.

In this way, for the preset number of candidate target regions with the highest confidence, the possibility of containing the target in these candidate target regions is higher than the possibility of containing the target in other candidate regions. Therefore, the target in the preset number of candidate target areas with the highest confidence is determined as the target in the image, and the accuracy of the obtained target is relatively high.

It can be seen from the above that when applying the scheme provided by the embodiments of the present disclosure for object detection, firstly, the confidence of the candidate object area is updated according to the intersection ratio between the candidate object areas and the degree of occlusion of the candidate object area, and then based on the updated Confidence, to detect an object in an image from candidate object regions. Since the intersection ratio between candidate object regions can reflect the degree of overlap between candidate object regions, and the occlusion degree of candidate object regions can reflect the degree of occlusion of candidate object regions, therefore, according to the above intersection ratio and occlusion degree, update The confidence of the candidate target area can refer to the overlap between the candidate target areas, so that the updated confidence of the candidate target area is closer to the actual situation, so that the image can be detected according to the updated confidence, which can improve the accuracy of target detection. Accuracy.

In addition, in dense scenes, such as scenes with dense traffic of people and traffic, the occlusion of objects is especially serious. For the images of these dense scenes, the occlusion degree of each candidate target area is relatively high, so that the target in the candidate target area is incomplete, and the error of the confidence degree of the obtained candidate target area is relatively large. Update the confidence of the candidate target area by the occlusion degree of the candidate target area, which can effectively eliminate the influence of the error on the confidence of each candidate target area when it is occluded, so that the accuracy of the updated confidence is high, and then the detection is accurate. . Therefore, the solutions provided by the embodiments of the present disclosure can be better adapted to situations where occlusion exists in dense scenes, and improve the accuracy of object detection.

In order to accurately update the confidence degree of the candidate target region, in one embodiment of the present disclosure, the first region with the highest confidence degree can be cyclically selected from the region set, and according to the intersection ratio between other regions in the region set and the first region and other The degree of occlusion of the region is used to update the confidence of other regions. At this point, a confidence update operation is completed, and the above operations are performed cyclically until one region is included in the region set. Each confidence update operation can be called a cycle.

The above region set includes: regions that have not been selected in the candidate target regions. Specifically, at the beginning of the first cycle, the region set includes each candidate target region obtained in step S101; in each cycle, after the first region is selected from the region set, the region set no longer includes the selected first region. area.

At the beginning of the first cycle, the first area is: the area with the highest confidence among the candidate target areas obtained in step S101; in each subsequent cycle, the first area is: the updated each target area obtained after the last cycle. The region with the highest confidence in the region

The aforementioned other areas refer to areas in the area concentration except the first area. For example, the area set includes: area 1, area 2, and area 3, where area 1 is the first area, and areas other than the first area are area 2 and area 3, then area 2 and area 3 are other areas.

In each cycle, each region in the region set may be traversed, the confidence of each region is sorted from high to low, and the region with the highest confidence is determined as the first region. In addition, the first region may also be stored in the prediction set, and as the number of cycles increases, the number of first regions stored in the prediction set also increases.

The above cycle process will be described below with reference to specific examples.

Assume that each candidate target area obtained in step S101 is b1, b2, b3, ... bn.

At the beginning of the first cycle, the set of regions B = {b1, b2, b3, . . . , bn}. Wherein, the area with the highest confidence in each candidate target area is the area b1, so the area b1 is taken as the first area. In the area set B, the areas other than the area b1 are b2, b3, ..., bn, so {b2, b3, ..., bn} are other areas.

Update other areas {b2, b3, ..., bn} confidence. And the first region b1 can be added to the prediction set D, and the added prediction set D={b1};

At the beginning of the second cycle, since the area b1 has been selected as the first area, the area b1 is not included in the area set B, and the area set B={b2, b3, . . . , bn}. Among them, the area with the highest confidence in the updated {b2, b3, ..., bn} is the area b2, so the area b2 is taken as the first area. In the area set B, the areas other than the area b2 are b3, ..., bn, so {b3, ..., bn} are other areas.

According to the intersection ratio between other regions {b3,...,bn} and region b2 and the occlusion degree of other regions {b3,...,bn}, update the confidence of other regions {b3,...,bn}. And the first region b2 can be added to the prediction set D, and the added prediction set D={b1, b2}.

At the beginning of the third cycle, since the regions b1 and b2 have been selected as the first regions, then the region set B={b3,...,bn}. Among them, the area with the highest confidence in the updated {b3, ..., bn} is the area b3, so the area b3 is taken as the first area. In the area set B, the areas other than the area b3 are b4, ..., bn, so {b4, ..., bn} are other areas.

According to the intersection ratio between other regions {b4,...,bn} and region b3 and the occlusion degree of other regions {b4,...,bn}, update the confidence of other regions {b4,...,bn}. And the first region b3 can be added to the prediction set D, and the added prediction set D={b1, b2, b3}.

In a similar manner, until the number of regions in region set B is 1, the only region in region set B is directly added to prediction set D, and the cycle ends to obtain the updated confidence of each region.

In this way, in each cycle, the confidence of the region in the concentrated region is updated according to the intersection ratio between other regions in the concentrated region and the first region and the degree of occlusion of other regions. Among them, the degree of occlusion of other regions reflects the degree of occlusion of other regions. When the region is occluded, the accuracy of the confidence of the detected region is low. By introducing the degree of occlusion of other regions, the candidate target region after update can be The accuracy of the confidence degree is high; and the intersection ratio between other regions and the first region reflects the coincidence degree between other regions and the first region, and the first region is the region with the highest confidence, and the region with the highest confidence The degree of coincidence between them can also effectively adjust the confidence of other regions. Therefore, the confidence levels of other regions can be effectively updated according to the intersection ratio and the degree of occlusion in each cycle. Moreover, the accuracy of the updated confidence level can be further improved by looping and iterating the updating process.

In an embodiment of the present disclosure, when updating the confidence levels of other regions in each cycle, it may be implemented according to the following steps A1-A4.

Step A1: Calculate the intersection and union ratios between other areas in the area set and the first area.

Specifically, first calculate the overlapping area between other areas and the first area, calculate the total area of other areas and the first area; then calculate the difference between the above total area and the overlapping area to obtain the target area, and finally calculate the difference between the overlapping area and the target area The ratio between them is determined as the intersection and union ratio between the candidate target regions.

Step A2: Determine a first confidence adjustment value according to the intersection and union ratio and the preset intersection and union ratio threshold.

The above-mentioned preset intersection and union ratio thresholds may be set by staff based on experience, for example, the intersection and union ratio thresholds may be 90%, 95%, and so on.

In one embodiment, it may be judged whether the intersection ratio is less than a preset intersection ratio threshold, if yes, determine the first confidence adjustment value as the first preset value, and if no, determine the first confidence adjustment value is the second preset value.

Both the above-mentioned first preset value and the second preset value are set by the staff based on experience.

In an embodiment of the present disclosure, it may also be judged whether the intersection ratio is less than the preset intersection ratio threshold; if yes, determine the first confidence adjustment value is 1; if no, determine the first confidence adjustment value : The difference between 1 and the intersection and union ratio.

For example: Assuming that the preset intersection ratio threshold is 90%, when the intersection ratio between other regions and the first region is 95%, and the intersection ratio 95% is greater than the preset intersection ratio threshold 90%, determine the first Confidence adjustment value: 1-95% = 5%; when the intersection and union ratio between other areas and the first area is 50%, and the intersection and union ratio 50% is less than the preset intersection ratio threshold 90%, determine the first confidence Degree adjustment value: 1.

In this way, when the intersection and union ratio is less than the preset intersection and union ratio threshold, it means that the overlap between other areas and the first area is small, indicating that a small part of the image content in other areas is blocked, and the confidence of the other areas detected is The accuracy is high, in which case no adjustments to the confidences of the other regions may be made. Setting the first confidence adjustment value to 1 can realize that the confidence of the region is not adjusted. When the intersection ratio is not less than the preset intersection ratio threshold, it means that the overlap between other regions and the first region is relatively large, indicating that most of the image content in other regions is blocked, and the confidence of the other regions detected is The accuracy is low. In this case, the confidence of other regions needs to be adjusted. Setting the first confidence adjustment value to 1 and the difference between the intersection and union ratio can make the adjusted confidence approach the actual situation.

Step A3: Determine a second confidence adjustment value according to the degree of occlusion of other areas.

In an implementation manner, the product of the degree of occlusion of other regions and a preset adjustment coefficient may be calculated as the second confidence adjustment value.

The aforementioned preset adjustment coefficient can be set by the staff based on experience, for example: the preset adjustment coefficient can be 1.2, 1.5, etc.

In an embodiment of the present disclosure, the second confidence adjustment value g(occ_pred) may also be determined according to the following expression:

g(occ_pred)=α ^occ_pred

Among them, occ_pred is the occlusion degree of other areas, α is a preset constant, α>1.

Because α>1, the second confidence adjustment value g(occ_pred) increases as the occlusion degree of other regions increases.

Since the accuracy of the confidence of the region is low when the occlusion degree of the region is high, it is necessary to make a large adjustment to the confidence of the region so that the adjusted confidence is close to the actual situation. And because the second confidence degree adjustment value g(occ_pred) increases as the degree of occlusion of other regions increases, that is, the higher the degree of occlusion of other regions, the larger the second degree of confidence adjustment value. The confidence of other regions is greatly adjusted, so that the adjusted confidence of other regions is close to the actual situation.

Step A4: Using the first confidence adjustment value and the second confidence adjustment value, adjust the confidence of other regions.

In an embodiment of the present disclosure, the confidence of other regions can be adjusted according to the following expression:

S'=S*T1*T2

Wherein, S' represents the confidence degree of other regions after adjustment, S represents the confidence degree of other regions before adjustment, T1 represents the first confidence degree adjustment value, and T2 represents the second confidence degree adjustment value.

In this way, since the adjusted confidence level is the product of the first confidence level adjustment value, the second confidence level adjustment value and the confidence levels of other regions, and because the first confidence level adjustment value and the second confidence level adjustment value are different from The angle reflects the occlusion situation of other regions. Therefore, the above adjusted confidence refers to the occlusion situation of other regions, making the adjusted confidence closer to the actual situation.

In another embodiment, the product of the first confidence adjustment value, the second confidence adjustment value, and the confidence of other regions can also be calculated as the reference confidence, and the above reference can be adjusted by the preset confidence error value. Confidence, the adjusted reference confidence is obtained as the adjusted confidence of other regions.

Specifically, the product of the preset confidence error value and the parameter confidence may be calculated, and the calculated product may be determined as the adjusted confidence of other regions.

In this way, since the first confidence adjustment value is determined by the intersection and union ratio between other regions and the first region, the intersection and union ratio reflects the coincidence degree of other regions and the first region, and the second confidence adjustment value is based on the intersection and union ratio of other regions The degree of occlusion is determined, and the degree of occlusion reflects the degree of occlusion of other regions, and the first and second confidence adjustment values can reflect the occlusion of other regions from different angles. Therefore, when using the first confidence adjustment value and the second confidence adjustment value to adjust the confidence of other areas, since the first confidence adjustment value and the second confidence adjustment value reflect the occlusion of other areas from different angles, use When adjusting the first confidence level adjustment value and the second confidence level adjustment value, the confidence level is adjusted based on more accurate occlusion conditions of other regions, so that the adjusted confidence level is closer to the actual situation.

Hereinafter, a specific implementation process will be used to illustrate the above-mentioned manner of cyclically updating the confidence level.

Assume that each candidate target area is b1, b2, b3, and the preset intersection-over-union ratio threshold Nt is 90%, the confidence and occlusion degree of each candidate target area are shown in Table 1 below.

Table 1

候选目标区域candidate target area	置信度Confidence	被遮挡度degree of occlusion
区域b1area b1	Cv1Cv1	Co1Co1
区域b2area b2	Cv2Cv2	Co2Co2
区域b3area b3	Cv3Cv3	Co3Co3

At the beginning of the first cycle, the region set B={b1, b2, b3}, wherein the confidence Cv1 of the region b1 is the highest, the region b1 is the first region, and the regions b2 and b3 are other regions.

For the area b2, calculate the intersection ratio between the area b2 and the area b1, and determine the first confidence adjustment value according to the above intersection ratio and the preset intersection ratio threshold of 90%; and according to the occlusion degree Co2 of the area b2 , determine the second confidence adjustment value; according to the first confidence adjustment value and the second confidence adjustment value, adjust the confidence Cv2 of the region b2, and the updated confidence is Cv21.

For area b3, calculate the intersection ratio between area b3 and area b1, and determine the first confidence adjustment value according to the above intersection ratio and the preset intersection ratio threshold of 90%; and according to the occlusion degree Co3 of area b3 , determine the second confidence adjustment value; according to the first confidence adjustment value and the second confidence adjustment value, adjust the confidence Cv3 of the region b3, and the updated confidence is Cv31.

The confidences of the updated candidate target regions obtained in the first cycle are shown in Table 2 below.

Table 2

候选目标区域candidate target area	置信度Confidence
区域b1area b1	Cv1Cv1
区域b2area b2	Cv21Cv21
区域b3area b3	Cv31Cv31

At the beginning of the second cycle, since the region b1 has been selected, the region set B={b2, b3}, wherein the confidence Cv21 of the region b2 is the highest, the region b2 is the first region, and the region b3 is other regions.

For the area b3, calculate the intersection ratio between the area b3 and the area b2, and determine the first confidence adjustment value according to the above intersection ratio and the preset intersection ratio threshold of 90%; and according to the occlusion degree Co3 of the area b3 , determine the second confidence adjustment value; according to the first confidence adjustment value and the second confidence adjustment value, adjust the confidence Cv31 of the region b3, and the updated confidence is Cv311.

Since the regions b1 and b3 have been selected, the region set B={b3} contains one region, and the loop ends.

The finally updated confidence of each candidate target region is shown in Table 3 below.

table 3

候选目标区域candidate target area	置信度Confidence
区域b1area b1	Cv1Cv1
区域b2area b2	Cv21Cv21
区域b3area b3	Cv311Cv311

In one embodiment of the present disclosure, in the above step S101, target detection can be performed on the image for different target scales, and the candidate target areas of different scales in the image, the confidence of the candidate target areas, and the occlusion degree of the candidate target areas can be obtained .

The target scale refers to: the size of the target.

The target scale may be a preset scale value, for example, the target scale may be 16x16, 32x32, or 64x64.

Specifically, multi-layer feature extraction can be performed on the image, and then feature fusion is performed on different features to obtain features of different scales. The features of different scales are used to detect the target of the image, and the candidate target areas of different scales are obtained, and the confidence and occlusion degree of the candidate target areas of different scales are obtained.

In this way, since the image feature information contained in the candidate object regions of different scales is different, by obtaining the candidate object regions of different scales in the image, the feature information of the candidate object regions at different scales is enriched.

In one embodiment of the present disclosure, an image may be input into a pre-trained target detection model, and the candidate target areas in the image output by the target detection model, the confidence of the candidate target areas, and the occlusion degree of the candidate target areas may be obtained.

The above object detection model includes: an object detection layer for detecting candidate object areas in an image, and an occlusion degree prediction layer for predicting the degree of occlusion of the candidate object areas.

In one implementation, in addition to detecting the candidate target area in the image, the target detection layer can also calculate the confidence of the candidate target area. In this case, the network structure of the target detection model can be shown in Figure 3a. The target detection model includes Object detection layer and occlusion prediction layer.

Specifically, after the image is input to the target detection model, the target detection layer in the model detects the candidate target area in the image, calculates the confidence of the candidate target area, and transmits the detection result to the occlusion degree prediction layer; the occlusion degree prediction layer predicts Each candidate target area is occluded; the target detection model outputs the candidate target area, the confidence of the candidate target area, and the occlusion degree.

It can be known from the foregoing embodiments that when performing object detection on an image, for each object scale, candidate object regions of different scales, confidence levels of the candidate object regions, and occlusion degrees of the candidate object regions can be obtained. In this case, FPN (Feature Pyramid Networks, Feature Pyramid Network) can be added on the basis of the above network model. FPN is used to obtain candidate target areas of various scales, confidence and occlusion of candidate target areas.

The network structure of the network model after adding the FPN may be shown in FIG. 3b, and the network structure shown in FIG. 3b includes a backbone network (Backbone) and an FPN.

Among them, the backbone network is used to extract the features of the image, obtain the image features of different levels in the image, and input the image features of different levels into the FPN.

For example, when the backbone network is a convolutional neural network, each convolutional layer of the convolutional neural network can perform convolution operations on images to obtain image features at different levels.

FPN is used to perform feature fusion of image features at different levels to obtain image features of different scales, perform target detection based on image features of different scales, obtain candidate target areas of different scales, and obtain the confidence and occlusion degree of candidate target areas , realizing the divide-and-conquer processing of image features at different levels.

When training the target detection model, the sample image is used as the training sample, and the real candidate target area and the real occlusion degree in the sample image are used as the training labels to train the preset neural network model until the training end condition is met, and the obtained The trained object detection model.

The aforementioned preset neural network model may be a CNN (Conv Neural Network, convolutional neural network) model, an RNN (Recurrent Neural Network, recursive neural network) model, a DNN (Deep Neural Network, deep neural network) model, etc.

Specifically, after the sample image is input to the preset neural network model, the above preset neural network model performs target detection on the sample image, obtains the candidate target area and the degree of occlusion of the sample image, and calculates the candidate target area and the real target area and the difference between the occlusion degree of the candidate target area and the real occlusion degree, adjust the parameters of the neural network model according to the calculated difference, and iteratively adjust the parameters until the preset training end conditions are met.

The aforementioned training end conditions may be that the number of training times reaches a preset number of times, the model parameters meet the preset model parameter convergence conditions, and the like.

Since the target detection model is trained through a large number of training samples, during the training process, the target detection model learns the features of the target area and the occluded features in the sample image, therefore, the target detection model has strong robustness, so the target detection model is adopted. When the detection model performs target detection on an image, it can output accurate candidate target areas, confidence levels of candidate target areas, and occlusion degrees.

In the above step S101, in addition to using the target detection model to perform target detection on the image, the image can also be divided into multiple regions, and for each region, the image features in the region are extracted, and the candidate targets in the region are determined according to the image features area.

The aforementioned image features include: texture features, color features, edge features, and the like.

After each candidate target area is obtained, the confidence of each candidate target is predicted according to the image features of each candidate target area.

In addition, the degree of occlusion of each candidate target area may also be calculated according to the layer to which each candidate target area belongs and the location information.

Specifically, according to the layer to which the candidate target areas belong and the relative relationship between positions, it can be determined whether occlusion occurs between the candidate target areas, and the ratio between the occluded area and the area of the occluded area can be calculated as the candidate target area degree of occlusion.

For example: when the candidate target area A is located in the foreground layer and the candidate target area B is located in the background layer, and the position information of the candidate target area A and the candidate target area B overlaps, it can be determined that the candidate target area B is blocked, and the candidate target area B is calculated. The ratio of the shaded area of the target area B to the area of the candidate target area B is used as the shaded degree of the candidate target area B.

Corresponding to the above object detection method in vehicle-road coordination, an embodiment of the present disclosure further provides a device for detecting objects in vehicle-road coordination.

Referring to FIG. 4 , FIG. 4 is a schematic structural diagram of an object detection device in vehicle-road coordination provided by an embodiment of the present disclosure. The above-mentioned device includes the following modules 401-403.

An information obtaining module 401, configured to perform target detection on an image, and obtain a candidate target area in the image, a confidence degree of the candidate target area, and an occlusion degree of the candidate target area;

A confidence update module 402, configured to update the confidence of the candidate target region based on the intersection-over-union ratio between the candidate target regions and the degree of occlusion of the candidate target region;

The target detection module 403 is configured to detect the target in the image from the candidate target regions according to the updated confidence.

It can be seen from the above that when applying the scheme provided by the embodiments of the present disclosure for object detection, firstly, the confidence of the candidate object area is updated according to the intersection ratio between the candidate object areas and the degree of occlusion of the candidate object area, and then based on the updated Confidence, to detect an object in an image from candidate object regions. Since the intersection ratio between candidate object regions can reflect the degree of overlap between candidate object regions, and the occlusion degree of candidate object regions can reflect the degree of occlusion of candidate object regions, therefore, according to the above intersection ratio and occlusion degree, update The confidence of the candidate target area can refer to the overlap between the candidate target areas, so that the updated confidence of the candidate target area is more inclined to the actual situation. Therefore, the target detection is performed on the image according to the updated confidence level, which can improve the accuracy of target detection.

In an embodiment of the present disclosure, the confidence update module 402 is specifically configured to cyclically select the first region with the highest confidence from the region set, and according to the intersection ratio between other regions in the region set and the first region and other regions The degree of occlusion of other areas is updated until the set of areas includes an area, wherein the set of areas includes: an unselected area in the candidate target area.

In an embodiment of the present disclosure, the confidence update module 402 includes:

an intersection ratio calculation unit, configured to calculate intersection ratios between other regions in the region set and the first region;

A first adjustment value determination unit, configured to determine a first confidence adjustment value according to the intersection ratio and a preset intersection ratio threshold;

A second adjustment value determination unit, configured to determine a second confidence adjustment value according to the degree of occlusion of other areas;

A confidence adjustment unit, configured to adjust the confidence of other regions by using the first confidence adjustment value and the second confidence adjustment value.

In an embodiment of the present disclosure, the first adjustment value determination unit is specifically configured to judge whether the intersection-over-union ratio is smaller than a preset intersection-over-union ratio threshold; if yes, determine a first confidence adjustment value of 1; If not, determine that the adjusted value of the first confidence level is the difference between 1 and the intersection-over-union ratio.

In an embodiment of the present disclosure, the second adjustment value determination unit is specifically configured to determine the second confidence adjustment value g(occ_pred) according to the following expression:

g(occ_pred)=α ^occ_pred

In an embodiment of the present disclosure, the confidence adjustment unit is specifically configured to adjust the confidence of other regions according to the following expression:

S'=S*T1*T2

Wherein, S' represents the confidence degree of other regions after adjustment, S represents the confidence degree of other regions before adjustment, T1 represents the first confidence adjustment value, and T2 represents the second confidence adjustment value.

In an embodiment of the present disclosure, the target detection module 403 is specifically configured to select a candidate target area whose updated confidence is greater than a preset confidence threshold, and determine the target in the selected candidate target area as the target in the image or select a preset number of candidate target areas with the highest confidence after updating, and determine the targets in the selected candidate target areas as the targets in the image.

In this way, for candidate target regions whose confidence is greater than the preset confidence threshold, the probability that these candidate target regions contain a target is higher than the probability that other candidate target regions contain a target. Therefore, if the target in the candidate target area whose confidence is greater than the preset confidence threshold is determined as the target in the image, the accuracy of the obtained target is relatively high; for the preset number of candidate target areas with the largest confidence, these The probability of an object being contained in a candidate object region is higher than that of other candidate regions. Therefore, the target in the preset number of candidate target areas with the highest confidence is determined as the target in the image, and the accuracy of the obtained target is relatively high.

In an embodiment of the present disclosure, the information obtaining module 401 is specifically configured to perform target detection on an image for different target scales, and obtain candidate target areas of different scales in the image, confidence levels of candidate target areas, and candidate target areas. The degree of occlusion of the target area.

In an embodiment of the present disclosure, the information obtaining module 401 is specifically configured to input an image into a pre-trained target detection model, and obtain the candidate target area and the candidate target area in the image output by the target detection model. The confidence level and the occlusion degree of the candidate target area, wherein the target detection model includes: an object detection layer for detecting the candidate target area in the image and an occlusion degree prediction layer for predicting the occlusion degree of the candidate target area.

In this way, since the target detection model is trained through a large number of training samples, during the training process, the target detection model has learned the features of the target area and the occluded features in the sample image, so the target detection model has strong robustness, thus When the target detection model is used to detect the target in the image, it can output the accurate candidate target area, the confidence degree of the candidate target area and the degree of occlusion.

In the technical solution of this disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of user personal information involved are all in compliance with relevant laws and regulations, and do not violate public order and good customs.

According to the embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.

In one embodiment of the present disclosure, an electronic device is provided, including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

The memory stores instructions that can be executed by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can perform any vehicle-road coordination in the foregoing method embodiments. Object detection method.

In one embodiment of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, wherein the computer instructions are used to enable the computer to execute any vehicle-road coordination target in the foregoing method embodiments. Detection method.

In one embodiment of the present disclosure, a computer program product is provided, including a computer program. When the computer program is executed by a processor, any method for object detection in vehicle-road coordination in the foregoing method embodiments is implemented.

In one embodiment of the present disclosure, a roadside device is provided, including the above-mentioned electronic device.

In one embodiment of the present disclosure, a cloud control platform is provided, including the above-mentioned electronic device.

FIG. 5 shows a schematic block diagram of an example electronic device 500 that may be used to implement embodiments of the present disclosure. Electronic device is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are by way of example only, and are not intended to limit implementations of the disclosure described and/or claimed herein.

As shown in FIG. 5 , the device 500 includes a computing unit 501 that can execute according to a computer program stored in a read-only memory (ROM) 502 or loaded from a storage unit 508 into a random-access memory (RAM) 503. Various appropriate actions and treatments. In the RAM 503, various programs and data necessary for the operation of the device 500 can also be stored. The computing unit 501, ROM 502, and RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504 .

Multiple components in the device 500 are connected to the I/O interface 505, including: an input unit 506, such as a keyboard, a mouse, etc.; an output unit 507, such as various types of displays, speakers, etc.; a storage unit 508, such as a magnetic disk, an optical disk, etc. ; and a communication unit 509, such as a network card, a modem, a wireless communication transceiver, and the like. The communication unit 509 allows the device 500 to exchange information/data with other devices over a computer network such as the Internet and/or various telecommunication networks.

The computing unit 501 may be various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of computing units 501 include, but are not limited to, central processing units (CPUs), graphics processing units (GPUs), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, digital signal processing processor (DSP), and any suitable processor, controller, microcontroller, etc. The calculation unit 501 executes various methods and processes described above, for example, the object detection method in vehicle-road coordination. For example, in some embodiments, the object detection method in vehicle-road coordination can be implemented as a computer software program, which is tangibly contained in a machine-readable medium, such as the storage unit 508 . In some embodiments, part or all of the computer program may be loaded and/or installed on the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the object detection method in the vehicle-road coordination described above can be executed. Alternatively, in other embodiments, the calculation unit 501 may be configured in any other appropriate way (for example, by means of firmware) to execute the object detection method in vehicle-road coordination.

Various implementations of the systems and techniques described above herein can be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chips Implemented in a system of systems (SOC), load programmable logic device (CPLD), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include being implemented in one or more computer programs executable and/or interpreted on a programmable system including at least one programmable processor, the programmable processor Can be special-purpose or general-purpose programmable processor, can receive data and instruction from storage system, at least one input device, and at least one output device, and transmit data and instruction to this storage system, this at least one input device, and this at least one output device an output device.

Optionally, in addition to electronic equipment, the roadside equipment may also include communication components, etc., and the electronic equipment and communication components may be integrally integrated or separately provided. Electronic devices can obtain data from sensing devices (such as roadside cameras), such as pictures and videos, for image and video processing and data calculation. Optionally, the electronic device itself may also have the function of acquiring sensory data and communication functions, such as an AI camera, and the electronic device may directly perform image and video processing and data calculation based on the acquired sensory data.

Optionally, the cloud control platform performs processing on the cloud, and the electronic devices included in the cloud control platform can obtain data from sensing devices (such as roadside cameras), such as pictures and videos, to perform image and video processing and data calculation; the cloud control platform It can also be called vehicle-road collaborative management platform, edge computing platform, cloud computing platform, central system, cloud server, etc.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general-purpose computer, a special purpose computer, or other programmable data processing devices, so that the program codes, when executed by the processor or controller, make the functions/functions specified in the flow diagrams and/or block diagrams Action is implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. A machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media would include one or more wire-based electrical connections, portable computer discs, hard drives, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.

To provide for interaction with the user, the systems and techniques described herein can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user. ); and a keyboard and pointing device (eg, a mouse or a trackball) through which a user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and can be in any form (including Acoustic input, speech input or, tactile input) to receive input from the user.

The systems and techniques described herein can be implemented in a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., as a a user computer having a graphical user interface or web browser through which a user can interact with embodiments of the systems and techniques described herein), or including such backend components, middleware components, Or any combination of front-end components in a computing system. The components of the system can be interconnected by any form or medium of digital data communication, eg, a communication network. Examples of communication networks include: Local Area Network (LAN), Wide Area Network (WAN) and the Internet.

A computer system may include clients and servers. Clients and servers are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, a server of a distributed system, or a server combined with a blockchain.

It should be understood that steps may be reordered, added or deleted using the various forms of flow shown above. For example, each step described in the present disclosure may be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution disclosed in the present disclosure can be achieved, no limitation is imposed herein.

The above specific implementation manners are not intended to limit the protection scope of the present disclosure. It should be apparent to those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made depending on design requirements and other factors. Any modifications, equivalent replacements and improvements made within the spirit and principles of the present disclosure shall be included within the protection scope of the present disclosure.

Claims

A method for target detection in vehicle-road coordination, the method comprising:

Carrying out target detection on the image to obtain a candidate target area in the image, a confidence degree of the candidate target area, and an occlusion degree of the candidate target area;

Update the confidence of the candidate target area based on the intersection ratio between the candidate target areas and the occlusion degree of the candidate target area;

Objects in the image are detected from candidate object regions according to the updated confidence.
The method according to claim 1, wherein said updating the confidence of the candidate target area based on the intersection-over-union ratio between the candidate target areas and the degree of occlusion of the candidate target area includes:

Loop selects the first region with the highest confidence from the region set, and updates the confidence of other regions according to the intersection ratio between other regions in the region set and the first region and the degree of occlusion of other regions until the region set includes An area, wherein the area set includes: unselected areas in candidate target areas.
The method according to claim 2, wherein the updating the confidence of other regions according to the intersection ratio between other regions in the region set and the first region and the degree of occlusion of other regions includes:

calculating intersection ratios between other areas in the area set and the first area;

Determine a first confidence adjustment value according to the intersection-over-union ratio and a preset intersection-over-union ratio threshold;

Determine a second confidence adjustment value according to the degree of occlusion of other areas;

The confidence of other regions is adjusted by using the first confidence adjustment value and the second confidence adjustment value.
The method according to claim 3, wherein said determining the first confidence adjustment value according to the intersection-over-union ratio and the preset intersection-over-union ratio threshold comprises:

judging whether the intersection and union ratio is less than a preset intersection and union ratio threshold;

If yes, determine that the first confidence adjustment value is 1;

If not, determine that the adjusted value of the first confidence level is the difference between 1 and the intersection-over-union ratio.
The method according to claim 3, wherein said determining the second confidence adjustment value according to the degree of occlusion of other regions comprises:

Determine the second confidence adjustment value g(occ_pred) according to the following expression:

g(occ_pred)=α occ_pred

Among them, occ_pred is the occlusion degree of other areas, α is a preset constant, α>1.
The method according to any one of claims 3-5, wherein the adjusting the confidence of other regions by using the first confidence adjustment value and the second confidence adjustment value comprises:

Adjust the confidence of other regions according to the following expression:

S'=S*T1*T2

Wherein, S' represents the confidence degree of other regions after adjustment, S represents the confidence degree of other regions before adjustment, T1 represents the first confidence adjustment value, and T2 represents the second confidence adjustment value.
The method according to any one of claims 1-5, wherein the detecting the target in the image from the candidate target area according to the updated confidence level comprises:

Select a candidate target area whose updated confidence is greater than a preset confidence threshold, and determine the target in the selected candidate target area as the target in the image;

or

Selecting a preset number of candidate target areas with the highest confidence after updating, and determining the targets in the selected candidate target areas as targets in the image.
The method according to any one of claims 1-5, wherein the target detection is performed on the image to obtain the candidate target area in the image, the confidence of the candidate target area, and the occlusion degree of the candidate target area, include:

Target detection is performed on the image for different target scales, and candidate target areas of different scales in the image, confidence levels of the candidate target areas, and occlusion degrees of the candidate target areas are obtained.
The method according to any one of claims 1-5, wherein the target detection is performed on the image to obtain the candidate target area in the image, the confidence of the candidate target area, and the occlusion degree of the candidate target area, include:

Input the image into the pre-trained target detection model, and obtain the candidate target area in the image output by the target detection model, the confidence of the candidate target area, and the occlusion degree of the candidate target area, wherein the target detection model It includes: an object detection layer for detecting candidate object areas in an image, and an occlusion degree prediction layer for predicting the degree of occlusion of the candidate object areas.
A target detection device in vehicle-road coordination, the device comprising:

An information obtaining module, configured to perform target detection on an image, and obtain a candidate target area in the image, a confidence degree of the candidate target area, and an occlusion degree of the candidate target area;

Confidence update module, for updating the confidence of the candidate target area based on the intersection ratio between the candidate target areas and the degree of occlusion of the candidate target area;

The object detection module is used to detect the object in the image from the candidate object area according to the updated confidence.
The apparatus of claim 10, wherein,

The confidence update module is specifically used to circularly select the first area with the highest confidence from the area set, and update other areas according to the intersection ratio between other areas in the area set and the first area and the occlusion degree of other areas until the set of regions includes a region, wherein the set of regions includes: a region that has not been selected in the candidate target region.
The device according to claim 11, wherein the confidence update module comprises:

an intersection ratio calculation unit, configured to calculate intersection ratios between other regions in the region set and the first region;

A first adjustment value determination unit, configured to determine a first confidence adjustment value according to the intersection ratio and a preset intersection ratio threshold;

A second adjustment value determination unit, configured to determine a second confidence adjustment value according to the degree of occlusion of other areas;

A confidence level adjustment unit, configured to use the first confidence level adjustment value and the second confidence level adjustment value to adjust the confidence level of other regions.
The apparatus of claim 12, wherein,

The first adjustment value determination unit is specifically configured to judge whether the intersection ratio is less than a preset intersection ratio threshold; if yes, determine the first confidence adjustment value is 1; if no, determine the first adjustment value A confidence adjustment value is the difference between 1 and the cross-over-union ratio.
The device according to claim 12, wherein the second adjustment value determination unit is specifically configured to determine the second confidence adjustment value g(occ_pred) according to the following expression:

g(occ_pred)=α occ_pred

Among them, occ_pred is the occlusion degree of other areas, α is a preset constant, α>1.
Apparatus according to any one of claims 12-14, wherein,

The confidence adjustment unit is specifically configured to adjust the confidence of other regions according to the following expression:

S'=S*T1*T2

Wherein, S' represents the confidence degree of other regions after adjustment, S represents the confidence degree of other regions before adjustment, T1 represents the first confidence adjustment value, and T2 represents the second confidence adjustment value.
Apparatus according to any one of claims 10-14, wherein,

The target detection module is specifically configured to select a candidate target area whose confidence after updating is greater than a preset reliability threshold, and determine the target in the selected candidate target area as the target in the image; or select the updated confidence A preset number of candidate target areas with the highest degrees are selected, and the target in the selected candidate target area is determined as the target in the image.
Apparatus according to any one of claims 10-14, wherein,

The information obtaining module is specifically configured to perform target detection on images for different target scales, and obtain candidate target areas of different scales in the image, confidence levels of candidate target areas, and occlusion degrees of candidate target areas.
Apparatus according to any one of claims 10-14, wherein,

The information obtaining module is specifically configured to input the image into the pre-trained target detection model, and obtain the candidate target area in the image output by the target detection model, the confidence of the candidate target area, and the occlusion of the candidate target area degree, wherein the target detection model includes: a target detection layer for detecting candidate target areas in an image and an occlusion degree prediction layer for predicting the degree of occlusion of the candidate target areas.
An electronic device comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

The memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can perform any one of claims 1-9. Methods.
A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause the computer to execute the method according to any one of claims 1-9.
A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-9.
A roadside device comprising the electronic device as claimed in claim 19.
A cloud control platform, comprising the electronic device according to claim 19.