CN113011331B

CN113011331B - Method and device for detecting whether motor vehicle gives way to pedestrians, electronic equipment and medium

Info

Publication number: CN113011331B
Application number: CN202110295491.7A
Authority: CN
Inventors: 王健; 皖彦淇; 岳名扬; 祝偲博; 任慧慧; 杨珺淞; 申南玲; 白璐; 李昀浩; 马钰
Original assignee: Jilin University
Current assignee: Jilin University
Priority date: 2021-03-19
Filing date: 2021-03-19
Publication date: 2021-11-09
Anticipated expiration: 2041-03-19
Also published as: CN113011331A

Abstract

The invention is applicable to the technical field of intelligent transportation and provides a method, device, electronic device and storage medium for detecting whether a motor vehicle yields to pedestrians. The method performs target element recognition on captured video frames using a trained target recognition network to obtain a target element set, tracks the motion trajectories of the target elements in the set using a preset target tracking algorithm, and, based on the identified main crosswalk line and the tracked trajectories, uses a preset violation detection algorithm to detect whether the current video frame contains a violation in which a motor vehicle fails to yield to a pedestrian, thereby improving the accuracy of detecting such violations.

Description

Method, device, electronic device and medium for detecting whether a motor vehicle yields to pedestrians

Technical Field

The present invention belongs to the technical field of intelligent transportation, and in particular relates to a method, device, electronic device and storage medium for detecting whether a motor vehicle yields to pedestrians.

Background

Intelligent transportation is an inevitable product of the maturing of electronics, computing, automation and related technologies. In recent years, China has issued a series of policies to support its development, and developed regions such as the United States and Europe have widely applied intelligent transportation to their transport infrastructure. As the number of motor vehicles in China approaches saturation, using intelligent transportation systems to improve the coordination of people, vehicles and roads, and thereby raise transport efficiency, ease congestion, increase road network capacity and reduce traffic accidents, has become an urgent problem for China's traffic management departments. Traditional traffic violation detection relies on vehicle information captured by cameras, which is then reviewed and verified manually over multiple rounds to confirm whether the vehicle's behavior violates traffic rules. This approach consumes considerable time and labor, and manual review is subject to subjective factors such as reviewer fatigue and mood, so review efficiency is low and the results lack fairness and accuracy.

Even though many cities are rapidly making their traffic systems intelligent, and the detection and supervision of simple violations such as running red lights, crossing lane lines and occupying non-motorized lanes has been automated, how to achieve efficient, fast, accurate and fair automated detection of the more complex behavior of yielding to pedestrians, while saving labor, equipment and technology costs, remains a problem to be solved.

Summary of the Invention

The purpose of the present invention is to provide a method, device, electronic device and storage medium for detecting whether a motor vehicle yields to pedestrians, aiming to solve the problem that the prior art detects this with insufficient accuracy.

In one aspect, the present invention provides a method for detecting whether a motor vehicle yields to pedestrians, the method comprising the following steps:

performing target element recognition on captured video frames using a trained target recognition network to obtain a target element set;

tracking the motion trajectories of the target elements in the target element set using a preset target tracking algorithm;

based on the identified main crosswalk line and the tracked motion trajectories of the target elements, using a preset violation detection algorithm to detect whether the current video frame contains a violation in which a motor vehicle fails to yield to a pedestrian.

Preferably, the target recognition network is an SSD MobileNet-v2 network.

Preferably, before the step of performing target element recognition on the captured video frames using the trained target recognition network, the method further comprises:

if a preset crosswalk line recognition condition is met, identifying the main crosswalk line, wherein the crosswalk line recognition condition includes that the current video frame is the first frame of the video, or that the current video frame is a video frame corresponding to a preset crosswalk line recognition period, and/or that the frame interval between the current video frame and the previous video frame in which recognition was due but no crosswalk line was identified equals a preset first interval threshold.

Preferably, the first interval threshold is 10.

Preferably, the step of identifying the main crosswalk line further comprises:

performing crosswalk line instance segmentation on the current video frame using a trained segmentation network to obtain at least one coarsely positioned crosswalk detection box;

finely positioning the main crosswalk line based on all of the coarsely positioned crosswalk detection boxes.

Preferably, the segmentation network is a Mask R-CNN network, and the step of training the Mask R-CNN network comprises:

obtaining a lane line dataset;

if the lane line annotations in the lane line dataset contain only the side contours of the lane lines, preprocessing the lane lines to obtain a preprocessed lane line dataset, wherein the lane lines include crosswalk lines and the annotations of the preprocessed lane lines contain segmentation masks;

inputting the preprocessed lane line dataset into the Mask R-CNN network for training to obtain a trained Mask R-CNN network.

Preferably, the step of preprocessing the lane lines in the lane line dataset further comprises:

if the annotation of a lane line contains only one line, expanding it by a preset pixel width on each side, with the line as the axis;

if the annotation of a lane line contains two straight lines, connecting the head ends and the tail ends of the two lines respectively to form a closed quadrilateral region, and filling the closed quadrilateral region according to the category of the lane line.

Preferably, the pixel width is 5 pixels.
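The two preprocessing rules can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration: the function names, the representation of a mask as a nested list, the use of the category id as the fill value, the reading of "expanding on each side" as "all pixels within the preset distance of the line segment", and the restriction to convex quadrilaterals (two annotation lines that do not cross).

```python
def _dist_to_segment(px, py, ax, ay, bx, by):
    """Euclidean distance from point (px, py) to the segment a-b."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        t = 0.0
    else:
        # Projection parameter of the point onto the segment, clamped to [0, 1].
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    cx, cy = ax + t * dx, ay + t * dy
    return ((px - cx) ** 2 + (py - cy) ** 2) ** 0.5


def mask_from_single_line(a, b, height, width, spread=5, category=1):
    """Single-line annotation: mark every pixel within `spread` pixels of a-b."""
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if _dist_to_segment(x, y, a[0], a[1], b[0], b[1]) <= spread:
                mask[y][x] = category
    return mask


def _inside_convex(px, py, poly):
    """True if the point is inside or on the border of a convex polygon."""
    sign = 0
    n = len(poly)
    for i in range(n):
        ax, ay = poly[i]
        bx, by = poly[(i + 1) % n]
        cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
        if cross != 0:
            s = 1 if cross > 0 else -1
            if sign == 0:
                sign = s
            elif s != sign:
                return False
    return True


def mask_from_two_lines(line1, line2, height, width, category=1):
    """Two-line annotation: join heads and tails into a quadrilateral, fill it."""
    (head1, tail1), (head2, tail2) = line1, line2
    # Walk line 1, cross over at the tails, walk line 2 backwards, close at the heads.
    quad = [head1, tail1, tail2, head2]
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if _inside_convex(x, y, quad):
                mask[y][x] = category
    return mask
```

Points are (x, y) pairs and masks are indexed as `mask[y][x]`. A production pipeline would more likely rasterize with an image library; the brute-force loops are kept only to make the geometry explicit.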

Preferably, the step of finely positioning the main crosswalk line based on all of the coarsely positioned crosswalk detection boxes comprises:

based on the confidence of each coarsely positioned crosswalk detection box, using a preset conditional scoring algorithm to select the main crosswalk line from all of the coarsely positioned crosswalk detection boxes, wherein the conditional scoring algorithm takes into account the position of the coarsely positioned crosswalk detection box in the video frame and/or the proportion of the coarsely positioned crosswalk detection box relative to the video frame.

Preferably, the step of using the preset conditional scoring algorithm to select the main crosswalk line from all of the coarsely positioned crosswalk detection boxes comprises:

scoring each coarsely positioned crosswalk detection box using the preset conditional scoring algorithm to obtain a conditional total score for each box;

obtaining the coarsely positioned crosswalk detection box with the highest conditional total score and using it as the main crosswalk line detection box.

Preferably, the formula used by the conditional scoring algorithm is as follows:

[formula not reproduced in the source text]

The symbols of the formula denote, for the i-th coarsely positioned crosswalk detection box: its conditional total score; the horizontal and vertical distances of its upper-left corner from the upper-left corner of the current video frame; its width and height; and its confidence. In addition, height_IMG denotes the height of the video frame, I() is the indicator function, (R1_t/h, R2_t/h) denotes the first preset interval, (R1_w/h, R2_w/h) denotes the second preset interval, s1 denotes the first preset fixed value, and s2 denotes the second preset fixed value.

Preferably, the first preset interval is (0.4, 0.6), the first preset fixed value is 0.4, the second preset interval is (0.07, 0.14), and the second preset fixed value is 0.14.

Preferably, after the step of obtaining the coarsely positioned crosswalk detection box with the highest conditional total score, the method further comprises:

determining whether the ratio of the width of the coarsely positioned crosswalk detection box with the highest conditional total score to the width of the video frame is less than a preset width ratio threshold;

if it is less, extending that detection box by equal distances on both sides so that its width ratio to the video frame reaches the width ratio threshold, and using the extended detection box as the main crosswalk line detection box;

if it is not less, using that detection box as the main crosswalk line detection box.

Preferably, the width ratio threshold is 0.85.
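The selection and widening steps can be sketched as follows. This is only an illustration under stated assumptions: boxes are `(left, top, width, height)` tuples in pixels, the conditional total scores are supplied by the caller (the scoring formula itself is not reproduced here), and the function names are hypothetical. The text does not say what happens if the widened box extends past the frame border, so no clamping is applied.

```python
def expand_to_width_ratio(box, frame_width, ratio_threshold=0.85):
    """Widen `box` equally on both sides until width / frame_width hits the threshold.

    Boxes already at or above the threshold are returned unchanged.
    """
    left, top, width, height = box
    if width / frame_width >= ratio_threshold:
        return box
    target_width = ratio_threshold * frame_width
    pad = (target_width - width) / 2.0  # same extension on the left and on the right
    return (left - pad, top, target_width, height)


def pick_main_crosswalk(scored_boxes, frame_width, ratio_threshold=0.85):
    """Pick the highest-scoring box and widen it if it is too narrow.

    `scored_boxes` is a list of (conditional_total_score, box) pairs.
    """
    _, best_box = max(scored_boxes, key=lambda item: item[0])
    return expand_to_width_ratio(best_box, frame_width, ratio_threshold)
```

For a 1000-pixel-wide frame, a 200-pixel-wide winning box is extended by 325 pixels on each side to reach 850 pixels, while a 900-pixel-wide box is kept as is.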

Preferably, the target element set includes the category and detection box of each target element, the target tracking algorithm is a maximum intersection-over-union (IoU) screening method, and the step of tracking the motion trajectories of the target elements in the target element set using the preset target tracking algorithm comprises:

for each target element in the target element set, obtaining a first sequence box list set from all current tracking path sequence box lists, wherein the first sequence box list set contains all tracking path sequence box lists whose category is the same as that of the target element, each tracking path sequence box list contains the category of a tracked target element and the detection boxes of that target element in chronological order, and each tracking path sequence box list corresponds to the motion trajectory of one target element;

calculating the IoU between the detection box of the target element and the last detection box of each tracking path sequence box list in the first sequence box list set;

determining whether an IoU target box exists, the IoU target box being a last detection box whose IoU is greater than a preset IoU threshold;

if one exists, determining that the motion trajectory of the target element has been tracked, and adding the detection box of the target element to the tracking path sequence box list containing the last detection box with the largest IoU;

if none exists, determining that the target element is a newly appeared element, and creating a new tracking path sequence box list from the detection box and category of the target element.

Preferably, the IoU threshold is 0.75.
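The matching rule above can be sketched in a few lines. The data layout is an assumption made for illustration: boxes are `(x1, y1, x2, y2)` corner tuples, each tracking path sequence box list is a dict with a `category` and a chronological `boxes` list, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def assign_to_track(tracks, category, box, iou_threshold=0.75):
    """Append `box` to the best-matching track, or start a new track.

    Only tracks of the same category are considered. A track matches when the
    IoU with its last box is strictly greater than the threshold; among the
    matches, the one with the largest IoU wins. Returns the index of the track
    that received the box.
    """
    best_idx, best_iou = None, iou_threshold
    for idx, track in enumerate(tracks):
        if track["category"] != category:
            continue
        score = iou(box, track["boxes"][-1])
        if score > best_iou:
            best_idx, best_iou = idx, score
    if best_idx is None:
        # No last box exceeds the threshold: treat as a newly appeared element.
        tracks.append({"category": category, "boxes": [box]})
        return len(tracks) - 1
    tracks[best_idx]["boxes"].append(box)
    return best_idx
```

Each detection costs one IoU computation per existing track of its category, which is the linear-time behaviour the method relies on instead of running a re-identification network.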

Preferably, before the step of obtaining the first sequence box list set from all current tracking path sequence box lists, the method further comprises:

obtaining a second frame interval between the current video frame and the video frame corresponding to the last detection box in each current tracking path sequence box list;

discarding the tracking path sequence box lists whose second frame interval is not less than a preset second interval threshold.

Preferably, the second interval threshold is 5.
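A minimal sketch of this pruning step, assuming each track records the frame index of its last detection under a hypothetical `last_frame` key:

```python
def drop_stale_tracks(tracks, current_frame, max_gap=5):
    """Keep only tracks whose last detection is fewer than `max_gap` frames old.

    A track whose gap is equal to or larger than `max_gap` is discarded,
    matching the "not less than the threshold" wording.
    """
    return [t for t in tracks if current_frame - t["last_frame"] < max_gap]
```

With the preferred threshold of 5, a track last updated 4 frames ago survives and one last updated 5 frames ago is dropped.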

Preferably, after the step of determining whether there is a last detection box whose IoU is greater than the preset IoU threshold, the method further comprises:

if the IoU target box exists, applying a logarithmic test to the largest IoU and the second-largest IoU;

if the logarithmic test is passed, determining that the motion trajectory of the target element has been tracked;

if the logarithmic test is not passed, tracking the motion trajectory of the target element using a preset cosine distance comparison method;

the formula used by the logarithmic test is:

$$(\log_2 IoU_{max} - \log_2 IoU_{second\_max}) < \varepsilon$$

where $IoU_{max}$ denotes the largest IoU, $IoU_{second\_max}$ denotes the second-largest IoU, and $\varepsilon$ is a constant, namely the preset logarithmic test threshold.
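The quantity compared against ε can be computed as below. This sketch only evaluates the inequality as written; the text does not state explicitly whether satisfying it counts as passing or failing the test, so that mapping is left to the caller. The function names, the example value of ε, and the handling of a zero second-largest IoU (treated as an infinite gap) are assumptions.

```python
import math


def log2_iou_gap(iou_max, iou_second_max):
    """Return log2(iou_max) - log2(iou_second_max).

    A small gap means the two best candidates overlap the new box almost
    equally; a large gap means the best candidate clearly dominates.
    """
    if iou_second_max <= 0:
        return math.inf  # assumption: no real runner-up, gap is unbounded
    return math.log2(iou_max) - math.log2(iou_second_max)


def inequality_holds(iou_max, iou_second_max, epsilon):
    """Evaluate (log2 IoU_max - log2 IoU_second_max) < epsilon."""
    return log2_iou_gap(iou_max, iou_second_max) < epsilon
```

For example, IoUs of 0.8 and 0.4 give a gap of about 1 (a factor of two), while 0.80 and 0.78 give a gap of about 0.04, which is the ambiguous case where appearance features become useful.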

Preferably, the step of tracking the motion trajectory of the target element using the preset cosine distance comparison method comprises:

obtaining the latest feature map of the target element in the current video frame, and the first feature map and second feature map corresponding respectively to the last detection box and the second-to-last detection box of each tracking path sequence box list in a second sequence box list set, wherein the second sequence box list set contains all tracking path sequence box lists in which the IoU target boxes are located;

normalizing the latest feature map and all of the first and second feature maps to obtain feature maps of uniform size;

calculating the first cosine distance and the second cosine distance between the normalized latest feature map and each first feature map and each second feature map, respectively;

taking the logarithm of each first cosine distance and each second cosine distance to obtain the corresponding first feature similarity factor and second feature similarity factor;

performing a linear weighted calculation on each first feature similarity factor and its corresponding second feature similarity factor, and obtaining the minimum weighted feature similarity factor;

taking the motion trajectory corresponding to the tracking path sequence box list that corresponds to the minimum weighted feature similarity factor as the motion trajectory of the target element, and adding the detection box of the target element to that tracking path sequence box list.
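The cosine distance comparison can be sketched as follows. Several details are assumptions rather than part of the text: feature maps are taken to be already resized and flattened into equal-length vectors, the weights `w_last` and `w_prev` are placeholder values (the text gives none), a small `eps` is added before the logarithm so that identical features do not produce log(0), and all names are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity of two equal-length vectors, clamped at 0."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return max(0.0, 1.0 - dot / (norm_u * norm_v))


def pick_track_by_cosine(latest, candidates, w_last=0.7, w_prev=0.3, eps=1e-6):
    """Return the id of the candidate track with the smallest weighted factor.

    `candidates` is a list of (track_id, last_feature, second_last_feature).
    Each factor is the log of a cosine distance; the two factors of a track
    are combined linearly and the minimum over all tracks is selected.
    """
    best_id, best_factor = None, math.inf
    for track_id, f_last, f_prev in candidates:
        s_first = math.log(cosine_distance(latest, f_last) + eps)
        s_second = math.log(cosine_distance(latest, f_prev) + eps)
        factor = w_last * s_first + w_prev * s_second
        if factor < best_factor:
            best_id, best_factor = track_id, factor
    return best_id
```

Taking the logarithm stretches differences between very small distances, so two candidates that both look similar to the new detection are still separated clearly.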

Preferably, the preset violation detection algorithm is a violation warning point algorithm, and the step of using the preset violation detection algorithm to detect whether the current video frame contains a violation in which a motor vehicle fails to yield to a pedestrian comprises:

predicting the violation warning point in the current video frame from the motion trajectory of each pedestrian on the main crosswalk line;

determining whether the violation warning point lies within the upper half of the last detection box of any identified motor vehicle;

if it does, determining that the violation is detected in the current video frame.
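The containment check can be sketched as follows, taking the warning points as given inputs. The assumptions are: image coordinates with the origin at the top-left corner so that the "upper half" has the smaller y values, boxes as `(x1, y1, x2, y2)`, border points counted as inside, and hypothetical function names.

```python
def in_upper_half(point, box):
    """True if `point` lies in the upper half of `box` (y grows downward)."""
    x, y = point
    x1, y1, x2, y2 = box
    mid_y = (y1 + y2) / 2.0
    return x1 <= x <= x2 and y1 <= y <= mid_y


def violation_detected(warning_points, vehicle_last_boxes):
    """True if any warning point falls in the upper half of any vehicle's last box."""
    return any(
        in_upper_half(point, box)
        for point in warning_points
        for box in vehicle_last_boxes
    )
```

For a vehicle box spanning y = 100 to y = 200, a warning point at y = 120 triggers a detection and one at y = 180 does not.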

Preferably, the violation warning point is calculated as follows:

[formula not reproduced in the source text]

where x_wp and y_wp denote the coordinates of the violation warning point, the remaining pair of symbols denotes the coordinates of the center point of the m-th detection box in the pedestrian's tracking path sequence box list, and c denotes the number of detection boxes in the pedestrian's tracking path sequence box list.

Preferably, before the step of predicting the violation warning point in the current video frame from the motion trajectory of each pedestrian on the main crosswalk line, the method comprises:

determining whether each pedestrian in the current video frame is on the main crosswalk line;

the step of determining whether each pedestrian in the current video frame is on the main crosswalk line comprises:

determining whether the center point of the lower edge of the pedestrian's detection box lies within the crosswalk detection box;

if it lies within the crosswalk detection box, determining that the pedestrian is on the crosswalk line.
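A minimal sketch of this check, assuming `(x1, y1, x2, y2)` boxes in image coordinates (y grows downward, so the lower edge is at `y2`), border points counted as inside, and hypothetical function names:

```python
def bottom_center(box):
    """Center point of the lower edge of a box given as (x1, y1, x2, y2)."""
    x1, _, x2, y2 = box
    return ((x1 + x2) / 2.0, y2)


def on_crosswalk(pedestrian_box, crosswalk_box):
    """True if the pedestrian's bottom-center point lies in the crosswalk box."""
    cx, cy = bottom_center(pedestrian_box)
    x1, y1, x2, y2 = crosswalk_box
    return x1 <= cx <= x2 and y1 <= cy <= y2
```

Using the lower edge approximates where the pedestrian's feet touch the road, so a pedestrian whose upper body overlaps the crosswalk box in the image but who is standing on the kerb is not counted.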

Preferably, after the step of using the preset violation detection algorithm to detect whether the current video frame contains a violation in which a motor vehicle fails to yield to a pedestrian, the method further comprises:

if the violation is detected, identifying the license plate of the offending motor vehicle using a preset license plate recognition algorithm;

visually displaying the license plate, the violation and/or the time at which the violation occurred.

In another aspect, the present invention provides a device for detecting whether a motor vehicle yields to pedestrians, the device comprising:

a target element recognition unit, configured to perform target element recognition on captured video frames using a trained target recognition network to obtain a target element set;

a motion trajectory tracking unit, configured to track the motion trajectories of the target elements in the target element set using a preset target tracking algorithm; and

a violation recognition unit, configured to use a preset violation detection algorithm, based on the identified main crosswalk line and the tracked motion trajectories of the target elements, to detect whether the current video frame contains a violation in which a motor vehicle fails to yield to a pedestrian.

In another aspect, the present invention further provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method described above when executing the computer program.

In another aspect, the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method described above.

The present invention performs target element recognition on captured video frames using a trained target recognition network to obtain a target element set, tracks the motion trajectories of the target elements in the set using a preset target tracking algorithm, and, based on the identified main crosswalk line and the tracked trajectories, uses a preset violation detection algorithm to detect whether the current video frame contains a violation in which a motor vehicle fails to yield to a pedestrian, thereby improving the accuracy of detecting motor vehicles that fail to yield to pedestrians.

Brief Description of the Drawings

FIG. 1A is a flowchart of the method for detecting whether a motor vehicle yields to pedestrians provided by Embodiment 1 of the present invention;

FIG. 1B is an example rendering of target element recognition on a captured video frame provided by Embodiment 1 of the present invention;

FIG. 2 is a flowchart of identifying the main crosswalk line provided by Embodiment 2 of the present invention;

FIG. 3 is a flowchart of training the Mask R-CNN network provided by Embodiment 3 of the present invention;

FIG. 4 is a flowchart of tracking the motion trajectory of a target element using the maximum IoU screening method provided by Embodiment 4 of the present invention;

FIG. 5 is a flowchart of tracking the motion trajectory of a target element using the cosine distance comparison method provided by Embodiment 5 of the present invention;

FIG. 6A is a flowchart of detecting a violation in which a motor vehicle fails to yield to a pedestrian using the violation warning point algorithm provided by Embodiment 6 of the present invention;

FIG. 6B is an example diagram of using the violation warning point algorithm to detect whether the current video frame contains a violation in which a motor vehicle fails to yield to a pedestrian, provided by Embodiment 6 of the present invention;

FIG. 7 is a schematic structural diagram of the device for detecting whether a motor vehicle yields to pedestrians provided by Embodiment 7 of the present invention; and

FIG. 8 is a schematic structural diagram of the electronic device provided by Embodiment 5 of the present invention.

Detailed Description

In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only intended to explain the present invention and are not intended to limit it.

The specific implementation of the present invention is described in detail below with reference to specific embodiments:

Embodiment 1:

FIG. 1A shows the implementation flow of the method for detecting whether a motor vehicle yields to pedestrians provided by Embodiment 1 of the present invention. For ease of description, only the parts relevant to the embodiment are shown, detailed as follows:

In step S101, target element recognition is performed on the captured video frames using a trained target recognition network to obtain a target element set.

The embodiments of the present invention are applicable to electronic devices, specifically to surveillance cameras, or to digital video recorders, computers, servers and other electronic devices connected to surveillance cameras. In this embodiment, video of an intersection is captured by a camera installed at the intersection; the target elements may include motor vehicles, pedestrians and traffic lights, and the target element set may include the category, detection box and so on of each identified target element.

The target recognition network may be a network such as R-CNN, Fast R-CNN, Faster R-CNN, YOLO or SSD. Preferably, the target recognition network is an SSD MobileNet-v2 network, to increase the speed of target element recognition. When the trained target recognition network is used to perform target element recognition on the captured video frames, specifically, the trained network may run forward propagation on the captured video frame; from each group of anchor feature maps whose category is not background, the feature map with the highest confidence is selected; the category and detection box of the target element corresponding to each selected feature map are obtained; and the target element set is formed from these categories and detection boxes.

Before the trained target recognition network is used to perform target element recognition on the captured video frames, the target recognition network may be trained on a target recognition dataset to obtain the trained target recognition network. The targets in the dataset may include pedestrians, traffic lights, motor vehicles and so on, and may be further divided into categories such as motor vehicle, pedestrian, signal light, truck, bus or motorcycle.

在使用训练好的目标识别网络对采集到的视频帧进行目标元素识别的步骤之前，可以对每个视频帧进行人行横道线检测，基于摄像头负责的区域内可能存在多个人行横道线，而每个摄像头通常负责一个区域，因此可识别每个视频帧的主人行横道线，基于人行横道线在每个视频帧中的位置区域并非都有变化，且摄像头的视角通常固定，优选地，若满足预设的人行横道线识别条件，则识别主人行横道线，以在不浪费计算资源的前提下，及时对因摄像头位置、道路线更新等外界因素导致主人行横道线位置变化予以矫正。其中，人行横道线识别条件包括当前视频帧为视频首帧，或当前视频帧为预设的人行横道线识别周期对应的视频帧，和/或当前视频帧与前一该识别而未识别出人行横道线的视频帧的帧间隔等于预设的第一间隔阈值。进一步优选地，该第一间隔阈值为10，以在降低计算资源的前提下，满足实际检测需要。Before using the trained target recognition network to identify the target elements of the collected video frames, each video frame can be detected by crosswalk line. Usually responsible for one area, so the main crosswalk line of each video frame can be identified. Based on the position of the crosswalk line in each video frame, the area does not always change, and the viewing angle of the camera is usually fixed. Preferably, if the preset crosswalk is satisfied If the line identification conditions are met, the main crosswalk line is identified, so as to correct the position change of the main crosswalk line caused by external factors such as camera position and road line update in time without wasting computing resources. Wherein, the crosswalk line recognition conditions include that the current video frame is the first frame of the video, or the current video frame is a video frame corresponding to a preset crosswalk line recognition period, and/or the current video frame is the same as the previous one that was recognized but did not recognize the crosswalk line. The frame interval of the video frame is equal to the preset first interval threshold. Further preferably, the first interval threshold is 10, so as to meet the actual detection needs on the premise of reducing computing resources.

Specifically, a crosswalk line recognition period may be preset. Main crosswalk line recognition is performed when the current video frame is the first frame of the video, and is then performed periodically according to the set recognition period. During periodic recognition, if the frame interval between the current video frame and the previous video frame in which recognition was due but no crosswalk line was recognized equals the first interval threshold (for example, 10), crosswalk line recognition is performed on the current video frame. Here, "the previous video frame in which recognition was due but no crosswalk line was recognized" means a video frame on which crosswalk line recognition was required but in which no crosswalk line was found. The recognition period may be set by the user, for example once every 10 seconds or once every 30 frames. Alternatively, it may be determined jointly from the acquired type, cruise path and/or cruise period of the current camera. For example, for a camera without a pan-tilt unit, such as a fixed bullet camera, the shooting area usually does not change, so a relatively long recognition period may be set. For a camera with a pan-tilt unit, such as a dome camera, the camera can rotate and the shooting area changes; if automatic cruise is enabled on such a camera, the recognition period may be determined jointly from the cruise time and the cruise path.
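The recognition condition described above can be sketched as a small predicate. This is a minimal illustration rather than the patented implementation: the function name, the zero-based frame index and the frame-count form of the recognition period are assumptions made for the example.

```python
def should_detect_crosswalk(frame_idx, period, last_missed_idx=None,
                            first_interval_threshold=10):
    """Return True if main crosswalk recognition should run on this frame.

    frame_idx: zero-based index of the current video frame (assumption).
    period: recognition period expressed in frames (assumption; the text
        also allows a period in seconds or one derived from the camera).
    last_missed_idx: index of the previous frame on which recognition was
        due but no crosswalk line was found, or None if there is none.
    """
    # Condition 1: the first frame of the video.
    if frame_idx == 0:
        return True
    # Condition 2: a frame that falls on the recognition period.
    if frame_idx % period == 0:
        return True
    # Condition 3: retry a fixed number of frames after a failed recognition.
    if (last_missed_idx is not None
            and frame_idx - last_missed_idx == first_interval_threshold):
        return True
    return False
```

With a period of 30 frames, frames 0, 30, 60 and so on trigger recognition; if recognition failed on frame 30, frame 40 triggers a retry.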

FIG. 1B is an example of target element recognition on a collected video frame. The target elements identified in FIG. 1B include the detection boxes and confidences of the identified pedestrians, the detection boxes and confidences of the identified motor vehicles, and the already identified main crosswalk detection box.

In step S102, a preset target tracking algorithm is used to track the motion trajectories of the target elements in the target element set.

In this embodiment of the present invention, the target tracking algorithm may be an algorithm related to pedestrian and vehicle re-identification, such as the ReID algorithm, the SOTA algorithm or the PROVID algorithm. For each target element identified in the current video frame, a preset maximum intersection-over-union (IoU) screening method is preferably used to track the motion trajectory of the target element. Its computation has linear time complexity and avoids the large amount of computation required for the forward propagation of a relocation algorithm, which significantly improves detection efficiency and further improves detection accuracy. For the specific implementation of tracking the motion trajectories of the target elements with the preset maximum IoU screening method, see the description of Embodiment 4.

In step S103, based on the identified main crosswalk line and the tracked motion trajectories of the target elements, a preset violation detection algorithm is used to detect whether a motor vehicle fails to yield to pedestrians in the current video frame.

In this embodiment of the present invention, the violation detection algorithm may be an algorithm that judges whether the paths of a pedestrian and a motor vehicle intersect, and it is used to detect whether a motor vehicle fails to yield to pedestrians in the current video frame. Preferably, the violation detection algorithm is a violation warning point algorithm, which detects the violation by means of violation warning points. For the specific implementation of detecting the violation with the violation warning point algorithm, see the description of Embodiment 6; details are not repeated here.

After detecting whether a motor vehicle fails to yield to pedestrians in the current video frame, preferably, if the violation is detected, a preset license plate recognition algorithm is used to recognize the license plate of the offending motor vehicle, and the license plate, the violation and/or the time at which the violation occurred are displayed visually, so that the violation is presented in a visual form. The preset license plate recognition algorithm may be a license plate recognition algorithm such as HyperLPR.

In this embodiment of the present invention, a trained target recognition network is used to identify target elements in the collected video frames to obtain a target element set, a preset target tracking algorithm is used to track the motion trajectories of the target elements in the target element set, and, based on the identified main crosswalk line and the tracked motion trajectories, it is detected whether a motor vehicle fails to yield to pedestrians in the current video frame, thereby improving the accuracy of detecting motor vehicles that fail to yield to pedestrians.

Embodiment 2:

FIG. 2 shows the implementation flow for identifying the main crosswalk line provided by Embodiment 2 of the present invention. For ease of description, only the parts related to this embodiment are shown, detailed as follows:

In step S201, a trained segmentation network is used to perform crosswalk line instance segmentation on the current video frame to obtain at least one coarsely positioned crosswalk detection box.

In this embodiment of the present invention, a trained segmentation network is used to perform crosswalk line instance segmentation on the current video frame to obtain at least one coarsely positioned crosswalk detection box. The segmentation network may be a network such as R-CNN, Fast R-CNN or Faster R-CNN. Preferably, the segmentation network is a Mask R-CNN network, which produces high-quality segmentation results while detecting the targets in an image. For the specific implementation of training the Mask R-CNN network, see the description of Embodiment 3.

Based on the prior knowledge that the viewing angle of the camera is fixed and that road markings are relatively stable, the Mask R-CNN network is run once at initialization, and all coarsely positioned crosswalk detection boxes in its output are filtered and saved. Further, the confidence of each coarsely positioned crosswalk detection box can also be obtained. An example of the data format of the coarsely positioned crosswalk detection boxes is as follows:

crossdata = [{"roi": {"rois": [400, 300, 240, 50], "class_ids": 1}, "score": 0.85, "index": 0}, {"roi": {"rois": [800, 100, 1000, 150], "class_ids": 1}, "score": 0.65, "index": 1}]

In step S202, the main crosswalk line is precisely positioned based on all the coarsely positioned crosswalk detection boxes.

In this embodiment of the present invention, all the coarsely positioned crosswalk detection boxes may be sorted by confidence, and the box with the highest confidence may be taken as the main crosswalk detection box. Preferably, based on the confidence of each coarsely positioned crosswalk detection box, a preset conditional scoring algorithm is used to select the main crosswalk line from all the coarsely positioned crosswalk detection boxes. The conditional scoring algorithm takes into account the position of a coarsely positioned crosswalk detection box in the video frame and/or its size relative to the video frame, so that the main crosswalk line is selected more accurately.

When the preset conditional scoring algorithm is used to select the main crosswalk line from all the coarsely positioned crosswalk detection boxes, preferably, each coarsely positioned crosswalk detection box is scored with the algorithm to obtain its conditional total score, and the box with the highest conditional total score is taken as the main crosswalk line detection box. The formula used by the conditional scoring algorithm is as follows:

$$Score_i = conf_i + s1 \cdot I\left(R1_{t/h} < \frac{top_i}{height_{IMG}} < R2_{t/h}\right) + s2 \cdot I\left(R1_{w/h} < \frac{height_i}{height_{IMG}} < R2_{w/h}\right)$$

where Score_i denotes the conditional total score of the i-th coarsely positioned crosswalk detection box; left_i and top_i denote the horizontal and vertical distances of the upper-left corner of the i-th coarsely positioned crosswalk detection box from the upper-left corner of the current video frame; width_i and height_i denote the width and height of the i-th coarsely positioned crosswalk detection box; width_IMG and height_IMG denote the width and height of the video frame; conf_i denotes the confidence of the i-th coarsely positioned crosswalk detection box; I() is the indicator function; (R1_t/h, R2_t/h) denotes the first preset interval; (R1_w/h, R2_w/h) denotes the second preset interval; s1 denotes the first preset fixed value; and s2 denotes the second preset fixed value.

Specifically, if the ratio of the distance between the upper edge of a coarsely positioned crosswalk detection box and the upper edge of the video frame to the height of the video frame lies within the first preset interval, the conditional total score of the box is its confidence plus the first preset fixed value. If the ratio of the height of the box to the height of the video frame lies within the second preset interval, the conditional total score of the box is its confidence plus the second preset fixed value. If both conditions are met, the conditional total score of the box is its confidence plus the first preset fixed value and the second preset fixed value.

Further preferably, the first preset interval is (0.4, 0.6), the first preset fixed value is 0.4, the second preset interval is (0.07, 0.14), and the second preset fixed value is 0.14. These values are set according to experimental results and further improve the accuracy of main crosswalk positioning.
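The scoring rule can be sketched in a few lines of code. This is an illustrative sketch only: the `CrosswalkBox` class and the function names are hypothetical, boxes are assumed to be given as (left, top, width, height) in pixels, and the default intervals and bonuses are the preferred values stated above.

```python
from dataclasses import dataclass


@dataclass
class CrosswalkBox:
    """Hypothetical container for one coarsely positioned crosswalk box."""
    left: float
    top: float
    width: float
    height: float
    confidence: float


def conditional_score(box, frame_height,
                      top_interval=(0.4, 0.6), s1=0.4,
                      height_interval=(0.07, 0.14), s2=0.14):
    """Confidence plus a fixed bonus for each satisfied condition."""
    score = box.confidence
    # Condition 1: vertical position of the upper edge relative to the frame.
    if top_interval[0] < box.top / frame_height < top_interval[1]:
        score += s1
    # Condition 2: box height relative to the frame height.
    if height_interval[0] < box.height / frame_height < height_interval[1]:
        score += s2
    return score


def select_main_crosswalk(boxes, frame_height):
    """Return the box with the highest conditional total score."""
    return max(boxes, key=lambda b: conditional_score(b, frame_height))
```

Under these assumptions, a moderately confident box in the middle band of the frame can outrank a more confident box near the top edge, which is the intended effect of the bonuses.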

After the coarsely positioned crosswalk detection box with the highest conditional total score is obtained, preferably, it is judged whether the ratio of the width of this box to the width of the video frame is smaller than a preset width ratio threshold. If it is smaller, the box is extended by equal distances on both sides until its width ratio to the video frame reaches the width ratio threshold, and the extended box is taken as the main crosswalk line detection box. If it is not smaller, the box is taken as the main crosswalk line detection box as it is. Extending the main crosswalk detection box makes it possible to predict violation warning points more accurately from the pedestrians on the crosswalk line. Further preferably, the width ratio threshold is 0.85, a value set according to experimental results, which further improves the accuracy of violation warning point prediction.
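The symmetric extension can be sketched as follows. The function name is hypothetical, and the sketch does not clamp the result to the frame boundaries, since the text does not say how an extension beyond the frame edge is handled.

```python
def expand_to_width_ratio(left, width, frame_width, ratio_threshold=0.85):
    """Widen a box symmetrically until width / frame_width reaches the threshold.

    Returns the (left, width) of the resulting box. Boxes that already meet
    the threshold are returned unchanged.
    """
    if width / frame_width >= ratio_threshold:
        return left, width
    target_width = ratio_threshold * frame_width
    # Extend by the same distance on the left and on the right.
    delta = (target_width - width) / 2
    return left - delta, target_width
```

For a 1000-pixel-wide frame, a box with left edge 400 and width 200 becomes a box with left edge 75 and width 850.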

In this embodiment of the present invention, a trained segmentation network is used to perform crosswalk line instance segmentation on the current video frame to obtain at least one coarsely positioned crosswalk detection box, and the main crosswalk line is precisely positioned based on all the coarsely positioned crosswalk detection boxes. During precise positioning, all the coarsely positioned crosswalk detection boxes are sorted by confidence, and a preset conditional scoring algorithm is used to select the main crosswalk line from them, so that the main crosswalk line is selected more accurately.

Embodiment 3:

FIG. 3 shows the implementation flow for training the Mask R-CNN network provided by Embodiment 3 of the present invention. For ease of description, only the parts related to this embodiment are shown, detailed as follows:

In step S301, a lane line dataset is acquired.

In this embodiment of the present invention, the lane lines in the lane line dataset include crosswalk lines, and a lane line dataset such as BDD100K may be used.

In step S302, if the lane line labels in the lane line dataset contain only the side contours of the lane lines, the lane lines are preprocessed to obtain a preprocessed lane line dataset, in which the labels of the lane lines contain segmentation masks.

In this embodiment of the present invention, the label masks used for Mask R-CNN training must contain the pixels inside the object. Therefore, when the Mask R-CNN network is trained, if the lane line labels in the lane line dataset contain only the side contours of the lane lines, the lane lines are preprocessed to obtain a preprocessed lane line dataset, and the preprocessed dataset is input into the Mask R-CNN network for training to obtain the trained Mask R-CNN network, thereby meeting the training requirements of the Mask R-CNN network.

When the lane lines in the lane line dataset are preprocessed, preferably, if the label of a lane line contains only one line, the label is expanded by a preset pixel width on each side with that line as the axis; if the label of a lane line contains two straight lines, the start points and the end points of the two lines are connected respectively to form a closed quadrilateral region, and the region is filled according to the category of the lane line, so that the label mask of the preprocessed lane line contains the pixels inside the object.

Further preferably, the pixel width is 5 pixels, a value set according to actual test results.

In step S303, the preprocessed lane line dataset is input into the Mask R-CNN network for training to obtain the trained Mask R-CNN network.

Embodiment 4:

FIG. 4 shows the implementation flow for tracking the motion trajectory of a target element with the maximum IoU screening method provided by Embodiment 4 of the present invention. For ease of description, only the parts related to this embodiment are shown, including:

In step S401, a first sequence box list set is obtained from all current tracking path sequence box lists.

In this embodiment of the present invention, the first sequence box list set contains all tracking path sequence box lists whose category is the same as that of the target element. Each tracking path sequence box list contains the category of the tracked target element and the detection boxes of that target element arranged in chronological order, and corresponds to the motion trajectory of one target element. In other words, all the detection boxes in a tracking path sequence box list together constitute the motion trajectory of one target element.

In a target detection algorithm, a target may be missed in some frame, in which case no new box is added to the bounding box sequence of that object at the corresponding moment; if the object is detected again in later frames, tracking of the same target is interrupted. The target may also have disappeared, so that valid frames are lost. Preferably, therefore, before the first sequence box list set is obtained from all current tracking path sequence box lists, a second frame interval between the current video frame and the video frame corresponding to the last detection box of each current tracking path sequence box list is obtained, and every tracking path sequence box list whose second frame interval is not less than a preset second interval threshold is discarded. Specifically, for each tracking path sequence box list, the frame number of the video frame in which its last detection box was detected may additionally be recorded; this frame is called the last valid frame, and a video frame in which the target element was successfully detected without being missed is called a valid frame. A second interval threshold is set, and the difference between the frame number of the current video frame and that of the last valid frame of each tracking path sequence box list is calculated. If the difference is within the threshold, the tracking path sequence box list is retained; otherwise the target is considered to no longer exist in the video and the tracking path sequence box list is discarded.

Preferably, the second interval threshold is 5, a value set according to experimental results, which improves detection accuracy.
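The pruning rule can be illustrated with a short sketch. Representing each tracking path sequence box list as a dictionary with a `last_valid_frame` key is an assumption made for the example, as is the function name.

```python
def prune_stale_tracks(tracks, current_frame, second_interval_threshold=5):
    """Keep only tracks whose last valid frame is recent enough.

    tracks: list of dicts, each with a "last_valid_frame" frame number
        (hypothetical representation of a tracking path sequence box list).
    A track is discarded when the frame interval is not less than the
    threshold, as described in the text.
    """
    return [
        track for track in tracks
        if current_frame - track["last_valid_frame"] < second_interval_threshold
    ]
```

At frame 10 with a threshold of 5, tracks last seen at frames 10 and 6 are kept, while a track last seen at frame 5 is dropped.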

In step S402, the IoU between the detection box of the target element and the last detection box of each tracking path sequence box list in the first sequence box list set is calculated.

In this embodiment of the present invention, the IoU is calculated as:

$$IoU_j = \frac{Area(box_{cur} \cap box_j)}{Area(box_{cur} \cup box_j)}, \quad j \in [1, q]$$

where box_cur denotes the detection box of the target element, q denotes the total number of tracking path sequence box lists in the first sequence box list, box_j denotes the last detection box of the j-th tracking path sequence box list in the first sequence box list, and Area denotes the area of a detection box region.
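As an illustration, the IoU of two axis-aligned boxes can be computed as below. The (left, top, width, height) box layout and the function name are assumptions made for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (left, top, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle (zero if there is no overlap).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0
```

Identical boxes give 1.0, disjoint boxes give 0.0, and two 10 x 10 boxes offset by 5 pixels horizontally give 50 / 150.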

In step S403, it is judged whether an IoU target box exists. If so, step S404 is performed; otherwise step S405 is performed.

In this embodiment of the present invention, an IoU target box is a last detection box whose IoU is greater than a preset IoU threshold. Preferably, the IoU threshold is 0.75.

When two or more targets of the same category move in the same or similar directions, the latest frame of one target may be too close to the last valid frames of other targets, and the maximum IoU alone cannot effectively separate the trajectories of these targets. For this case, after judging whether there is a last detection box whose IoU is greater than the preset IoU threshold, preferably, if such an IoU target box exists, a logarithm test is applied to the maximum IoU and the second largest IoU. If the logarithm test is passed, step S404 is performed; otherwise another algorithm, such as a similarity-based algorithm, may be used to further track the motion trajectory of the target element.

The logarithm test is applied to the logarithms of IoU_max and IoU_{second_max}, where IoU_max denotes the maximum IoU, IoU_{second_max} denotes the second largest IoU, and ε is a constant denoting the preset logarithm test threshold.

Preferably, a preset cosine distance comparison method is used to track the motion trajectory of the target element, so that the trajectory is tracked by means of cosine distance. For the specific implementation of tracking the motion trajectory of the target element with the preset cosine distance comparison method, see the description of Embodiment 5.

In step S404, it is determined that the motion trajectory of the target element has been tracked, and the detection box of the target element is added to the tracking path sequence box list that contains the last detection box with the largest IoU.

In step S405, it is determined that the target element is a newly appearing element, and a new tracking path sequence box list is created from the detection box and category of the target element.
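Steps S401 to S405 can be sketched as a single association function. This is a simplified illustration under stated assumptions, not the patented implementation: tracks are represented as dictionaries with hypothetical keys `cls`, `boxes` and `last_valid_frame`, boxes are (left, top, width, height) tuples, and the optional logarithm test and cosine distance fallback are omitted.

```python
def iou(box_a, box_b):
    """Intersection over union of two (left, top, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def associate(detection_box, detection_cls, tracks, frame_idx,
              iou_threshold=0.75):
    """Attach a detection to the best matching track or start a new track."""
    # S401: only tracks of the same category are candidates.
    candidates = [t for t in tracks if t["cls"] == detection_cls]

    # S402: IoU against the last detection box of every candidate track.
    best_track, best_iou = None, 0.0
    for track in candidates:
        value = iou(detection_box, track["boxes"][-1])
        if value > best_iou:
            best_track, best_iou = track, value

    # S403 / S404: a match exists only if the best IoU exceeds the threshold.
    if best_track is not None and best_iou > iou_threshold:
        best_track["boxes"].append(detection_box)
        best_track["last_valid_frame"] = frame_idx
        return best_track

    # S405: otherwise the detection starts a new trajectory.
    new_track = {"cls": detection_cls, "boxes": [detection_box],
                 "last_valid_frame": frame_idx}
    tracks.append(new_track)
    return new_track
```

Each detection is compared once against the last box of each candidate track, so the cost per detection grows linearly with the number of tracks.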

Embodiment 5:

This embodiment is based on Embodiment 4. FIG. 5 shows the implementation flow for tracking the motion trajectory of a target element with the cosine distance comparison method provided by Embodiment 5 of the present invention. For ease of description, only the parts related to this embodiment are shown, including:

In step S501, the latest feature map of the target element in the current video frame is acquired, together with the first feature map and the second feature map corresponding respectively to the last detection box and the second-to-last detection box of each tracking path sequence box list in a second sequence box list set.

In this embodiment of the present invention, the feature map of the currently tracked target element in the current video frame is acquired; for ease of description, it is called the latest feature map of the target element (the target element to be tracked). The feature maps corresponding to the last detection box and the second-to-last detection box of each tracking path sequence box list in the second sequence box list set are also acquired; for ease of description, they are called the first feature map and the second feature map respectively. The first feature map and the second feature map are the feature maps in the corresponding valid frames. The second sequence box list set contains all the tracking path sequence box lists in which an IoU target box is located; in other words, it contains all the tracking path sequence box lists whose last detection box has an IoU greater than the preset IoU threshold.

In step S502, the latest feature map and all the first and second feature maps are normalized to obtain feature maps of a uniform size.

In step S503, the first cosine distance and the second cosine distance between the normalized latest feature map and each first feature map and each second feature map, respectively, are calculated.

In this embodiment of the present invention, the cosine distance is calculated as:

$$\cos(\theta)_{km} = \frac{A \cdot B_{km}}{\lVert A \rVert \, \lVert B_{km} \rVert}, \quad m = 1 \text{ or } 2, \; k \in [1, t]$$

where t denotes the number of tracking path sequence box lists in the third sequence box list; A denotes the normalized latest feature map; B_k1 denotes the normalized first feature map corresponding to the last detection box of the k-th tracking path sequence box list in the third sequence box list, that is, the k-th normalized first feature map; B_k2 denotes the normalized second feature map corresponding to the second-to-last detection box of the k-th tracking path sequence box list in the third sequence box list, that is, the k-th normalized second feature map; cos(θ)_k1 denotes the cosine distance between the latest feature map and the k-th normalized first feature map; and cos(θ)_k2 denotes the cosine distance between the latest feature map and the k-th normalized second feature map.
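The quantity cos(θ) can be computed as in the following sketch, which assumes that each normalized feature map has already been flattened into a plain sequence of numbers of equal length; the function name is hypothetical.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two equally sized, flattened feature maps."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # An all-zero feature map has no direction; report 0.0 in that case.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

The value is computed once against the first feature map and once against the second feature map of every candidate tracking path sequence box list.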

In step S504, the logarithm of each first cosine distance and of each second cosine distance is taken to obtain the corresponding first feature similarity factor and second feature similarity factor.

In this embodiment of the present invention, the feature similarity factors are calculated by taking logarithms of the cosine distances: the k-th first feature similarity factor is obtained from the k-th first cosine distance, and the k-th second feature similarity factor is obtained from the k-th second cosine distance.

In step S505, a linear weighted calculation is performed on each first feature similarity factor and the second feature similarity factor corresponding to it, and the minimum weighted feature similarity factor is obtained.

In this embodiment of the present invention, the minimum weighted feature similarity factor is calculated as:

$$F_{min} = \min_{k \in [1, t]} \left(a \cdot F_{k1} + b \cdot F_{k2}\right)$$

where F_k1 and F_k2 denote the k-th first feature similarity factor and the k-th second feature similarity factor, a denotes the weight of the first feature similarity factor, b denotes the weight of the second feature similarity factor, and the weight of the first feature similarity factor is greater than that of the second feature similarity factor, that is, a > b.

In step S506, the motion trajectory corresponding to the tracking path sequence box list that corresponds to the minimum weighted feature similarity factor is taken as the motion trajectory of the target element, and the detection box of the target element is added to that tracking path sequence box list.

In this embodiment of the present invention, the motion trajectory formed by the tracking path sequence box list corresponding to the minimum weighted feature similarity factor is the motion trajectory of the target element.
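Steps S505 and S506 amount to picking the candidate with the smallest weighted sum. The sketch below takes the feature similarity factors as inputs that have already been computed; the function name and the example weights a = 0.7 and b = 0.3 are assumptions, since the text only requires a > b.

```python
def select_track_index(first_factors, second_factors, a=0.7, b=0.3):
    """Index of the candidate track with the minimum weighted similarity factor.

    first_factors[k] and second_factors[k] belong to the k-th candidate
    tracking path sequence box list.
    """
    if a <= b:
        raise ValueError("the first weight must be greater than the second")
    if not first_factors or len(first_factors) != len(second_factors):
        raise ValueError("factor lists must be non-empty and of equal length")
    weighted = [a * f1 + b * f2
                for f1, f2 in zip(first_factors, second_factors)]
    # The smallest weighted factor identifies the matching trajectory.
    return min(range(len(weighted)), key=weighted.__getitem__)
```

The detection box of the target element would then be appended to the tracking path sequence box list at the returned index.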

Embodiment 6:

FIG. 6A shows the implementation flow for detecting, with the violation warning point algorithm, a motor vehicle that fails to yield to pedestrians, as provided by Embodiment 6 of the present invention. For ease of description, only the parts related to this embodiment are shown, including:

In step S601, the violation warning points in the current video frame are predicted from the motion trajectory of each pedestrian on the main crosswalk line.

在本发明实施例中，可基于每个行人的运动轨迹预测每个行人在下一视频帧中的位置，以根据预测出的行人的位置确定违规警示点，进一步地，可结合时间因素对行人下一帧出现的位置进行预测，即离当前帧越近，在预测中所起到的作用越大，更进一步地，可采用取对数的方式对每一个预测因子进行放大，从而使时间因素对预测结果的作用更加明显，最终将计算结果点(x_wp，y_wp)设为违规警示点。从而优选地，违规警示点的计算方式为：In the embodiment of the present invention, the position of each pedestrian in the next video frame can be predicted based on the motion trajectory of each pedestrian, so as to determine the violation warning point according to the predicted position of the pedestrian. Predict the position where a frame appears, that is, the closer it is to the current frame, the greater the role it plays in the prediction. Further, each predictor can be amplified by taking the logarithm, so that the time factor can be The effect of the prediction result is more obvious, and finally the calculation result point (x _wp , y _wp ) is set as the violation warning point. Therefore, preferably, the calculation method of the violation warning point is:

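where $x_{wp}$ and $y_{wp}$ respectively denote the coordinates of the violation warning point, the per-frame coordinate terms respectively denote the coordinates of the center point of the m-th detection frame in the pedestrian's tracking path sequence frame list, and c denotes the number of detection frames in that list.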
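Because the formula itself is not reproduced in this text, the following Python sketch only illustrates the idea described above: a time-weighted combination of the detection frame centers in which later frames count more and the weights are logarithmic. The specific weight `ln(m + 1)` and the function name `warning_point` are assumptions made for illustration and should not be read as the patented formula.

```python
import math

def warning_point(centers):
    """Log-weighted mean of a pedestrian's detection-frame centers.

    centers is a chronological list of (x, y) center points, oldest first.
    The weight ln(m + 1) for the m-th center (m = 1..c) is an assumption.
    """
    if not centers:
        raise ValueError("at least one center point is required")
    weights = [math.log(m + 1) for m in range(1, len(centers) + 1)]
    total = sum(weights)
    x_wp = sum(w * x for w, (x, _) in zip(weights, centers)) / total
    y_wp = sum(w * y for w, (_, y) in zip(weights, centers)) / total
    return x_wp, y_wp

# A pedestrian moving right: the result is pulled toward the newest center.
x_wp, y_wp = warning_point([(0.0, 5.0), (10.0, 5.0)])
```

With this weighting the result sits closer to the most recent center than a plain average would, which matches the stated intent that frames nearer the current frame have more influence.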

In a specific implementation, one violation warning point can be predicted from the motion trajectory of each pedestrian, so the current video frame may contain multiple violation warning points.

Before the violation warning point in the current video frame is predicted from the motion trajectory of each pedestrian on the main crosswalk line, it is preferably determined whether each pedestrian in the current video frame is on the main crosswalk line. This can be computed from the intersection-over-union (IoU) of the pedestrian detection frame and the crosswalk detection frame: if the IoU is greater than zero, the newly appeared pedestrian is determined to be on the crosswalk line. Preferably, it is instead determined whether the center point of the lower edge of the pedestrian detection frame lies within the crosswalk detection frame, which further reduces the amount of computation in the determination. The formula used to determine whether the center point of the lower edge of the pedestrian detection frame lies within the crosswalk line detection frame is:

where $(x_p, y_p)$ denotes the coordinates of the upper-left corner of the pedestrian detection frame; $width_p$ and $height_p$ respectively denote the width and height of the pedestrian detection frame; $left_{pc}$ and $top_{pc}$ respectively denote the horizontal and vertical distances of the upper-left corner of the main crosswalk detection frame from the upper-left corner of the current video frame; $width_{pc}$ and $height_{pc}$ respectively denote the width and height of the main crosswalk detection frame; and F(x) denotes the result, whose value is 0 or 1. F(x) = 0 indicates that the center point of the lower edge of the pedestrian detection frame lies within the crosswalk line detection frame, and F(x) = 1 indicates that it does not.
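The lower-edge check can be sketched in Python directly from the symbol definitions above. The formula image is not reproduced in this text, so treating the boundary as inclusive is an assumption, as is the function name `lower_edge_flag`. The return convention (0 for inside, 1 for outside) follows the description of F(x).

```python
def lower_edge_flag(x_p, y_p, width_p, height_p,
                    left_pc, top_pc, width_pc, height_pc):
    """Return 0 if the pedestrian frame's lower-edge center lies inside
    the main crosswalk detection frame, otherwise 1.

    Image coordinates are assumed: origin at the top-left, y growing down.
    Points exactly on the border count as inside (an assumption).
    """
    center_x = x_p + width_p / 2   # horizontal midpoint of the lower edge
    center_y = y_p + height_p      # lower edge of the pedestrian frame
    inside = (left_pc <= center_x <= left_pc + width_pc
              and top_pc <= center_y <= top_pc + height_pc)
    return 0 if inside else 1

# Pedestrian standing on a crosswalk that spans most of the frame width.
flag_on = lower_edge_flag(100, 200, 40, 100, 0, 280, 640, 60)
# Pedestrian whose feet are well above the crosswalk frame.
flag_off = lower_edge_flag(100, 50, 40, 100, 0, 280, 640, 60)
```

This needs only four comparisons per pedestrian, which is why it is cheaper than computing an IoU for every pedestrian and crosswalk pair.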

In step S602, it is determined whether the violation warning point lies within the upper half of the last detection frame of any recognized motor vehicle. If it does, step S603 is performed; if it does not, step S604 is performed.

In step S603, it is determined that a violation is detected in the current video frame.

In step S604, it is determined that no violation is detected in the current video frame.
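Steps S602 to S604 amount to a point-in-rectangle test against the upper half of each vehicle frame. The Python sketch below assumes frames are given as `(x, y, width, height)` with the origin at the top-left of the image and y increasing downward, so the upper half spans `y` to `y + height / 2`. The names and the inclusive boundary are illustrative assumptions.

```python
def in_upper_half(point, vehicle_box):
    """True if point lies in the upper half of vehicle_box.

    vehicle_box is (x, y, width, height) in image coordinates (assumed).
    """
    px, py = point
    x, y, w, h = vehicle_box
    return x <= px <= x + w and y <= py <= y + h / 2

def violation_detected(warning_points, vehicle_last_boxes):
    """S602-S604: a violation is reported if any warning point falls in
    the upper half of the last detection frame of any vehicle."""
    return any(in_upper_half(p, box)
               for p in warning_points
               for box in vehicle_last_boxes)

vehicles = [(100, 100, 80, 60)]            # one tracked vehicle
hit = violation_detected([(120, 110)], vehicles)
miss = violation_detected([(120, 150)], vehicles)
```

Since each pedestrian contributes its own warning point, the check iterates over all points and all vehicles in the frame.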

FIG. 6B shows an example of using the violation warning point algorithm to detect whether a motor vehicle fails to yield to pedestrians in the current video frame. In FIG. 6B, the violation warning point does not lie within the upper half of the last detection frame of the motor vehicle, so no such violation is present in FIG. 6B.

In this embodiment of the present invention, the predicted violation warning point in the current video frame is used to detect whether a motor vehicle fails to yield to pedestrians in that frame. Compared with algorithms that judge whether the forward paths of the motor vehicle and the pedestrian occlude each other, this method also applies when the pedestrian is stationary or has already passed, which further improves violation recognition accuracy and scene generalization.

Embodiment 7:

FIG. 7 shows the structure of the device for detecting whether a motor vehicle yields to pedestrians provided by Embodiment 7 of the present invention. For ease of description, only the parts related to this embodiment are shown, including:

a target element recognition unit 71, configured to perform target element recognition on the captured video frames using a trained target recognition network to obtain a target element set;

a motion trajectory tracking unit 72, configured to track the motion trajectories of the target elements in the target element set using a preset target tracking algorithm; and

a violation recognition unit 73, configured to detect, based on the recognized main crosswalk line and the tracked motion trajectories of the target elements, whether a motor vehicle fails to yield to pedestrians in the current video frame using a preset violation detection algorithm.

Preferably, the device includes:

a crosswalk recognition unit, configured to recognize the main crosswalk line if a preset crosswalk line recognition condition is met, wherein the crosswalk line recognition condition includes that the current video frame is the first frame of the video, or that the current video frame is a video frame corresponding to a preset crosswalk line recognition period, and/or that the frame interval between the current video frame and the previous video frame in which recognition was due but no crosswalk line was recognized equals a preset first interval threshold.
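The recognition condition can be read as a simple per-frame trigger, sketched below in Python. The first interval threshold of 10 is the value given in claim 4. The zero-based frame index, the parameter names, and the example period of 300 frames are hypothetical, and combining the three conditions with a plain "or" is one reading of the "or ... and/or" wording.

```python
def should_recognize_crosswalk(frame_idx, period, last_failed_idx=None,
                               first_interval_threshold=10):
    """Decide whether the main crosswalk line should be (re)recognized.

    frame_idx is a zero-based frame counter (assumption).
    last_failed_idx is the index of the most recent frame in which
    recognition was due but no crosswalk line was found, or None.
    """
    if frame_idx == 0:                      # first frame of the video
        return True
    if frame_idx % period == 0:             # periodic re-recognition
        return True
    if (last_failed_idx is not None
            and frame_idx - last_failed_idx == first_interval_threshold):
        return True                         # retry after a failed attempt
    return False

first = should_recognize_crosswalk(0, period=300)
periodic = should_recognize_crosswalk(600, period=300)
retry = should_recognize_crosswalk(310, period=300, last_failed_idx=300)
idle = should_recognize_crosswalk(305, period=300, last_failed_idx=300)
```

The retry branch lets the system recover quickly when a scheduled recognition finds no crosswalk, instead of waiting for the next full period.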

Preferably, the crosswalk recognition unit further includes:

a crosswalk segmentation unit, configured to perform crosswalk line instance segmentation on the current video frame using a trained segmentation network to obtain at least one coarsely positioned crosswalk detection frame; and

a fine positioning unit, configured to finely position the main crosswalk line based on all of the coarsely positioned crosswalk detection frames.

Preferably, the fine positioning unit further includes:

a fine positioning subunit, configured to screen the main crosswalk line from all of the coarsely positioned crosswalk detection frames using a preset conditional scoring algorithm based on the confidence of each coarsely positioned crosswalk detection frame, wherein the conditional scoring algorithm takes into account the position of the coarsely positioned crosswalk detection frame in the video frame and/or the proportion of the coarsely positioned crosswalk detection frame relative to the video frame.

Preferably, the fine positioning subunit further includes:

a conditional total score calculation unit, configured to score each coarsely positioned crosswalk detection frame using the preset conditional scoring algorithm to obtain the conditional total score of each coarsely positioned crosswalk detection frame; and

a main crosswalk determination unit, configured to obtain the coarsely positioned crosswalk detection frame with the highest conditional total score and use it as the main crosswalk line detection frame.

Preferably, the main crosswalk determination unit further includes:

a width ratio judgment unit, configured to determine whether the ratio of the width of the coarsely positioned crosswalk detection frame with the highest conditional total score to the width of the video frame is less than a preset width ratio threshold;

a first determination subunit, configured to, if the ratio is less than the width ratio threshold, expand the coarsely positioned crosswalk detection frame with the highest conditional total score by equal distances on both sides until its width ratio to the video frame reaches the width ratio threshold, and use the expanded frame as the main crosswalk line detection frame; and

a second determination subunit, configured to, if the ratio is not less than the width ratio threshold, use the coarsely positioned crosswalk detection frame with the highest conditional total score as the main crosswalk line detection frame.

Preferably, the width ratio threshold is 0.85.
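The width expansion performed by the first and second determination subunits can be sketched in Python as follows. Only the horizontal extent of the frame is modeled. The function name is hypothetical, and the sketch does not clamp the result to the image bounds because the text does not say how an off-center frame near the image edge should be handled.

```python
def expand_to_width_ratio(left, width, frame_width, ratio_threshold=0.85):
    """Widen a crosswalk frame symmetrically until width / frame_width
    reaches ratio_threshold; frames already wide enough are returned as-is.

    Returns (new_left, new_width). No clamping to the image is applied.
    """
    if width / frame_width >= ratio_threshold:
        return left, width
    target_width = ratio_threshold * frame_width
    pad = (target_width - width) / 2   # equal extension on each side
    return left - pad, target_width

new_left, new_width = expand_to_width_ratio(400, 200, 1000)
same_left, same_width = expand_to_width_ratio(50, 900, 1000)
```

For a 200-pixel-wide frame in a 1000-pixel-wide image, each side is extended by 325 pixels so that the expanded frame covers 85% of the image width.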

Preferably, the target element set includes the category and detection frame of each target element, the target tracking algorithm is a maximum intersection-over-union (IoU) screening method, and the motion trajectory tracking unit further includes:

a first set obtaining unit, configured to, for each target element in the target element set, obtain a first sequence frame list set from all current tracking path sequence frame lists, wherein the first sequence frame list set includes all tracking path sequence frame lists of the same category as the target element, each tracking path sequence frame list contains the category of a tracked target element and the detection frames of that target element arranged in chronological order, and each tracking path sequence frame list corresponds to the motion trajectory of one target element;

an IoU calculation unit, configured to calculate the IoU of the detection frame of the target element and the last detection frame of each tracking path sequence frame list in the first sequence frame list set;

a first judgment unit, configured to determine whether there is an IoU target frame, the IoU target frame being a last detection frame whose IoU is greater than a preset IoU threshold;

a trajectory determination unit, configured to, if an IoU target frame exists, determine that the motion trajectory of the target element has been tracked, and add the detection frame of the target element to the tracking path sequence frame list containing the last detection frame with the largest IoU; and

a new element discovery unit, configured to, if no IoU target frame exists, determine that the target element is a newly appeared element, and create a new tracking path sequence frame list from the detection frame and category of the target element.

Preferably, the IoU threshold is 0.75.
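A minimal Python sketch of the maximum IoU screening described by these units is shown below. It assumes frames are `(x, y, width, height)` tuples and that each track is a dictionary with a category and a chronological list of frames; these data structures and names are illustrative. The logarithmic test and the cosine distance fallback are deliberately left out.

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, width, height) frames."""
    ix = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    iy = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    if ix <= 0 or iy <= 0:
        return 0.0
    inter = ix * iy
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def assign_detection(box, category, tracks, iou_threshold=0.75):
    """Append box to the same-category track with the largest IoU above
    the threshold, or start a new track. Returns the track index."""
    best_idx, best_iou = None, iou_threshold
    for idx, track in enumerate(tracks):
        if track["category"] != category:
            continue                      # only compare within a category
        value = iou(box, track["boxes"][-1])
        if value > best_iou:              # strictly greater than threshold
            best_idx, best_iou = idx, value
    if best_idx is None:                  # no IoU target frame: new element
        tracks.append({"category": category, "boxes": [box]})
        return len(tracks) - 1
    tracks[best_idx]["boxes"].append(box)
    return best_idx

tracks = [{"category": "car", "boxes": [(0, 0, 10, 10)]}]
matched = assign_detection((1, 0, 10, 10), "car", tracks)
created = assign_detection((50, 50, 10, 10), "car", tracks)
```

In the example, the shifted frame overlaps the existing track's last frame with an IoU of about 0.82 and is appended to it, while the distant frame starts a second track.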

Preferably, the motion trajectory tracking unit further includes:

a second frame interval obtaining unit, configured to obtain the second frame interval between the current video frame and the video frame corresponding to the last detection frame in each current tracking path sequence frame list; and

a list discarding unit, configured to discard the tracking path sequence frame lists whose second frame interval is not less than a preset second interval threshold.

Preferably, the second interval threshold is 5.
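Discarding stale lists is a one-line filter, sketched here in Python. The `last_frame` field, which stores the index of the video frame that produced a track's last detection frame, is a hypothetical bookkeeping detail; the threshold of 5 is the value stated above.

```python
def prune_tracks(tracks, current_frame, second_interval_threshold=5):
    """Keep only tracks updated fewer than second_interval_threshold
    frames ago; tracks at or beyond the threshold are discarded."""
    return [t for t in tracks
            if current_frame - t["last_frame"] < second_interval_threshold]

tracks = [
    {"id": "fresh", "last_frame": 98},   # interval 2: kept
    {"id": "edge", "last_frame": 95},    # interval 5: discarded
    {"id": "stale", "last_frame": 80},   # interval 20: discarded
]
kept = prune_tracks(tracks, current_frame=100)
```

Running this before the IoU comparison keeps targets that have left the scene from capturing new detections that appear near their last known position.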

an IoU test unit, configured to, if an IoU target frame exists, apply a logarithmic test to the largest IoU and the second-largest IoU;

a test result determination unit, configured to, if the logarithmic test is passed, determine that the motion trajectory of the target element has been tracked; and

a tracking subunit, configured to, if the logarithmic test is not passed, track the motion trajectory of the target element using a preset cosine distance comparison method.
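Claim 6 gives the logarithmic test as the inequality log2(IoU_max) - log2(IoU_second_max) < ε. The Python sketch below only evaluates that inequality. It does not decide whether satisfying it counts as passing or failing the test, because this passage does not spell that out. The function names and the example value of ε are hypothetical, and both IoU values are assumed to be strictly positive.

```python
import math

def log_gap(iou_max, iou_second_max):
    """log2(IoU_max) - log2(IoU_second_max); both inputs must be > 0."""
    return math.log2(iou_max) - math.log2(iou_second_max)

def inequality_holds(iou_max, iou_second_max, epsilon):
    """Evaluate the inequality from claim 6 for a given threshold.

    How the result maps to pass or fail is left to the caller.
    """
    return log_gap(iou_max, iou_second_max) < epsilon

# Two candidates with nearly equal IoU give a small gap.
close = inequality_holds(0.80, 0.78, epsilon=0.5)
# A clear winner gives a gap of about one (a factor of two in IoU).
clear = inequality_holds(0.80, 0.40, epsilon=0.5)
```

Working in log space turns the comparison into a ratio test: a gap of 1 means the best IoU is twice the runner-up, regardless of their absolute sizes.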

Preferably, the tracking subunit further includes:

a feature map obtaining unit, configured to obtain the latest feature map of the target element in the current video frame, and the first feature map and second feature map corresponding respectively to the last detection frame and the second-to-last detection frame of each tracking path sequence frame list in a second sequence frame list set, wherein the second sequence frame list set includes the tracking path sequence frame lists containing all of the IoU target frames;

a normalization unit, configured to normalize the latest feature map and all of the first and second feature maps to obtain feature maps of uniform size;

a cosine distance calculation unit, configured to calculate the first cosine distance between the normalized latest feature map and each first feature map, and the second cosine distance between the normalized latest feature map and each second feature map;

a similarity factor calculation unit, configured to take the logarithm of each first cosine distance and each second cosine distance to obtain the corresponding first feature similarity factor and second feature similarity factor;

a weighting calculation unit, configured to perform a linear weighted calculation on each first feature similarity factor and its corresponding second feature similarity factor to obtain the minimum weighted feature similarity factor; and

a trajectory determination subunit, configured to take the motion trajectory corresponding to the tracking path sequence frame list that has the minimum weighted feature similarity factor as the motion trajectory of the target element, and add the detection frame of the target element to that tracking path sequence frame list.

Preferably, the preset violation detection algorithm is the violation warning point algorithm, and the violation recognition unit further includes:

a warning point prediction unit, configured to predict the violation warning point in the current video frame from the motion trajectory of each pedestrian on the main crosswalk line;

a positional relationship judgment unit, configured to determine whether the violation warning point lies within the upper half of the last detection frame of any recognized motor vehicle; and

a violation determination unit, configured to determine that a violation is detected in the current video frame if the violation warning point lies within the upper half of any motor vehicle detection frame in the current video frame.

Preferably, the violation recognition unit further includes:

a pedestrian position judgment unit, configured to determine whether each pedestrian in the current video frame is on the main crosswalk line;

The pedestrian position judgment unit further includes:

determining whether the center point of the lower edge of the pedestrian's detection frame lies within the crosswalk detection frame; and

a pedestrian position determination unit, configured to determine that the pedestrian is on the crosswalk line if the center point lies within the crosswalk detection frame.

In this embodiment of the present invention, each unit of the device for detecting whether a motor vehicle yields to pedestrians may be implemented by a corresponding hardware or software unit. The units may be independent software or hardware units, or may be integrated into a single software or hardware unit; this is not intended to limit the present invention. For the specific implementation of each unit, reference may be made to the description of the foregoing method embodiments, which is not repeated here.

Embodiment 8:

FIG. 8 shows the structure of the electronic device provided by Embodiment 8 of the present invention. For ease of description, only the parts related to this embodiment are shown.

The electronic device 8 of this embodiment includes a processor 80, a memory 81, and a computer program 82 stored in the memory 81 and executable on the processor 80. When the processor 80 executes the computer program 82, it implements the steps of the foregoing method embodiments, for example steps S101 to S103 shown in FIG. 1. Alternatively, when the processor 80 executes the computer program 82, it implements the functions of the units in the foregoing device embodiments, for example the functions of units 71 to 73 shown in FIG. 7.

Embodiment 9:

In this embodiment of the present invention, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program that, when executed by a processor, implements the steps of the foregoing method embodiments, for example steps S101 to S103 shown in FIG. 1. Alternatively, when executed by a processor, the computer program implements the functions of the units in the foregoing device embodiments, for example the functions of units 71 to 73 shown in FIG. 7.

The computer-readable storage medium of this embodiment may include any entity, device, or recording medium capable of carrying computer program code, for example memory such as ROM/RAM, a magnetic disk, an optical disc, or flash memory.

The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A method for detecting whether a motor vehicle yields to pedestrians, characterized in that the method comprises the following steps:

Use the trained target recognition network to perform target element recognition on the collected video frames to obtain a target element set;

Use a preset target tracking algorithm to track the motion trajectory of the target element in the target element set;

Based on the identified main crosswalk line and the tracked motion trajectories of the target elements, use a preset violation detection algorithm to detect whether a motor vehicle fails to yield to pedestrians in the current video frame;

The target element set includes the category and detection frame of each target element, the target tracking algorithm is a maximum intersection-over-union (IoU) screening method, and the step of using a preset target tracking algorithm to track the motion trajectories of the target elements in the target element set includes:

For each target element in the target element set, obtain a first sequence frame list set from all current tracking path sequence frame lists, wherein the first sequence frame list set includes all tracking path sequence frame lists of the same category as the target element, each tracking path sequence frame list contains the category of a tracked target element and the detection frames of that target element arranged in chronological order, and each tracking path sequence frame list corresponds to the motion trajectory of one target element;

Calculate the IoU of the detection frame of the target element and the last detection frame of each tracking path sequence frame list in the first sequence frame list set;

Determine whether there is an IoU target frame, the IoU target frame being a last detection frame whose IoU is greater than a preset IoU threshold;

If no IoU target frame exists, determine that the target element is a newly appeared element, and create a new tracking path sequence frame list from the detection frame and category of the target element;

If an IoU target frame exists, apply a logarithmic test to the largest IoU and the second-largest IoU;

If the logarithmic test is passed, determine that the motion trajectory of the target element has been tracked, and add the detection frame of the target element to the tracking path sequence frame list containing the last detection frame with the largest IoU;

If the logarithmic test is not passed, obtain the latest feature map of the target element in the current video frame, and the first feature map and second feature map corresponding respectively to the last detection frame and the second-to-last detection frame of each tracking path sequence frame list in a second sequence frame list set, wherein the second sequence frame list set includes the tracking path sequence frame lists containing all of the IoU target frames;

Normalizing the latest feature map and all the first feature maps and the second feature maps to obtain feature maps of uniform size;

Calculate the first cosine distance and the second cosine distance of the latest normalized feature map and each of the first feature maps and each of the second feature maps respectively;

For each of the first cosine distances and each of the second cosine distances, take the logarithm to obtain the corresponding first feature similarity factor and second feature similarity factor;

Perform linear weighting calculation on each of the first feature similarity factors and the second feature similarity factors corresponding to each of the first feature similarity factors to obtain a minimum weighted feature similarity factor;

Take the motion trajectory corresponding to the tracking path sequence frame list that has the minimum weighted feature similarity factor as the motion trajectory of the target element, and add the detection frame of the target element to that tracking path sequence frame list.

2. The method according to claim 1, characterized in that, before the step of using the trained target recognition network to perform target element recognition on the collected video frames, the method further comprises:

If a preset crosswalk line recognition condition is met, recognize the main crosswalk line, wherein the crosswalk line recognition condition includes that the current video frame is the first frame of the video, or that the current video frame is a video frame corresponding to a preset crosswalk line recognition period, and/or that the frame interval between the current video frame and the previous video frame in which recognition was due but no crosswalk line was recognized equals a preset first interval threshold;

The step of identifying the main crosswalk line also includes:

Use the trained segmentation network to segment the crosswalk line instance on the current video frame to obtain at least one coarsely positioned crosswalk detection frame;

Finely locate the main crosswalk line based on all the coarsely positioned crosswalk detection frames;

The segmentation network is a Mask R-CNN network, and the steps of training the Mask R-CNN network include:

Get the lane line dataset;

If the annotation of a lane line in the lane line data set contains only the edge contour of the lane line, preprocess the lane lines to obtain a preprocessed lane line data set, wherein the lane lines include crosswalk lines and the annotations of the preprocessed lane lines contain segmentation masks;

Input the preprocessed lane line data set into the Mask R-CNN network for training to obtain a trained Mask R-CNN network;

The step of preprocessing the lane lines in the lane line data set further includes:

If the annotation of the lane line contains only one line, take that line as the axis and extend it by a preset pixel width on both sides, wherein the pixel width is 5 pixels;

If the annotation of the lane line contains two straight lines, connect the head ends and the tail ends of the two straight lines respectively to form a closed quadrilateral area, and fill the closed quadrilateral area according to the category of the lane line.

3. The method according to claim 2, wherein the step of finely positioning the main crosswalk line based on all of the rough positioning crosswalk detection frames comprises:

Based on the confidence of each of the coarsely positioned crosswalk detection frames, use a preset conditional scoring algorithm to screen the main crosswalk line from all of the coarsely positioned crosswalk detection frames, wherein the conditional scoring algorithm takes into account the position of the coarsely positioned crosswalk detection frame in the video frame and/or the proportion of the coarsely positioned crosswalk detection frame relative to the video frame;

The step of screening the main crosswalk line from all the roughly positioned crosswalk detection frames using a preset conditional scoring algorithm, including:

Use a preset conditional scoring algorithm to score each of the roughly positioned pedestrian crossing detection frames to obtain a conditional total score of each of the roughly positioned pedestrian crossing detection frames;

Obtain the coarse-positioned crosswalk detection frame with the highest total condition score, and use the coarse-positioned crosswalk detection frame with the highest conditional total score as the main crosswalk line detection frame;

The formula used by the conditional scoring algorithm is as follows:

wherein the formula's confidence term represents the confidence of the i-th coarsely positioned crosswalk detection frame, $I(\cdot)$ is the indicator function, $(R1_{t/h}, R2_{t/h})$ represents the first preset interval, $(R1_{w/h}, R2_{w/h})$ represents the second preset interval, s1 represents the first preset fixed value, and s2 represents the second preset fixed value;

After the step of obtaining the coarsely positioned pedestrian crossing detection frame with the highest total score of the condition, it also includes:

Determine whether the width ratio of the coarse-positioned crosswalk detection frame with the highest conditional total score to the video frame is less than a preset width ratio threshold;

If it is less than, expand the coarse-positioned crosswalk detection frame with the highest conditional total score to both sides at equal distances, so that the width ratio of the coarse-positioned crosswalk detection frame with the highest conditional total score to the video frame reaches the width ratio threshold , taking the expanded crosswalk detection frame with the highest conditional total score as the main crosswalk line detection frame;

If it is not less than, the roughly positioned pedestrian crossing detection frame with the highest total condition score is used as the main pedestrian crossing line detection frame.

4. The method of claim 3, wherein the target recognition network is an SSD MobileNet-v2 network, the first interval threshold is 10, and the first preset interval is (0.4, 0.6), The first preset fixed value is 0.4, the second preset interval is (0.07, 0.14), the second preset fixed value is 0.14, and the width ratio threshold is 0.85.

5. The method of claim 1, wherein the IoU threshold is 0.75,

Before the step of obtaining the first sequence box list set according to all the current tracking path sequence box lists, the method further includes:

Obtain the second frame interval of the current video frame and the video frame corresponding to the last detection frame in each of the current tracking path sequence frame lists;

Discard the list of tracking path sequence boxes whose second frame interval is not less than a preset second interval threshold, where the second interval threshold is 5.

6. The method of claim 1, wherein

The formula used by the logarithmic test is:

$\log_2 \mathrm{IoU}_{max} - \log_2 \mathrm{IoU}_{second\_max} < \varepsilon$

wherein $\mathrm{IoU}_{max}$ represents the largest IoU, $\mathrm{IoU}_{second\_max}$ represents the second-largest IoU, and $\varepsilon$ is a constant representing the preset logarithmic test threshold.

7. The method of claim 1, wherein the preset violation detection algorithm is a violation warning point algorithm, and the step of using the preset violation detection algorithm to detect whether a motor vehicle fails to yield to pedestrians in the current video frame includes:

Predict the violation warning point in the current video frame according to the motion trajectory of each pedestrian on the main crosswalk;

Determine whether the violation warning point is within the upper half area of the last detection frame of any identified motor vehicle;

If it is, it is determined that the violation is detected in the current video frame;

The violation warning point is calculated by:

[formula missing from the source text]

where $x_{wp}$ and $y_{wp}$ respectively represent the coordinates of the violation warning point, [symbols missing from the source text] respectively represent the coordinates of the center point of the m-th detection box in the pedestrian tracking path sequence box list, and c represents the number of detection boxes in the pedestrian tracking path sequence box list;

Before the step of predicting the violation warning point in the current video frame according to the motion trajectory of each pedestrian on the main crosswalk, the method further includes:

Determine whether each pedestrian in the current video frame is on the main crosswalk line;

The step of judging whether each pedestrian in the current video frame is on the main crosswalk line includes:

Determine whether the center point of the lower edge of the pedestrian detection frame of the pedestrian is within the pedestrian crossing detection frame;

If it is within the pedestrian crossing detection frame, it is determined that the pedestrian is on the pedestrian crossing line;

After the step of using the preset illegal behavior detection algorithm to detect whether there is a violation of a motor vehicle failing to yield to pedestrians in the current video frame, the method further includes:

If the violation is detected, use a preset license plate recognition algorithm to identify the license plate of the motor vehicle where the violation occurs;

Visually display the license plate, the violation, and/or the time when the violation occurred.
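
The two geometric checks in claim 7 (whether a pedestrian is on the crosswalk, and whether the warning point falls in the upper half of a vehicle's last detection frame) can be sketched in Python as below. This is a sketch under stated assumptions: boxes are `(x1, y1, x2, y2)` in image coordinates with y increasing downward, boundaries count as inside, and all function names are hypothetical. The warning point is taken as an input because its formula is not restated here.

```python
def point_in_box(point, box):
    """True if (x, y) lies inside or on the border of (x1, y1, x2, y2)."""
    x, y = point
    x1, y1, x2, y2 = box
    return x1 <= x <= x2 and y1 <= y <= y2


def pedestrian_on_crosswalk(pedestrian_box, crosswalk_box):
    """Check the center of the pedestrian box's lower edge against the crosswalk box."""
    x1, _, x2, y2 = pedestrian_box
    foot_point = ((x1 + x2) / 2.0, y2)  # lower edge is y2 when y grows downward
    return point_in_box(foot_point, crosswalk_box)


def upper_half(box):
    """Upper half of a box in image coordinates (assumed: smaller y is higher)."""
    x1, y1, x2, y2 = box
    return (x1, y1, x2, (y1 + y2) / 2.0)


def violation_detected(warning_point, vehicle_last_boxes):
    """True if the warning point is in the upper half of any vehicle's last box."""
    return any(point_in_box(warning_point, upper_half(b))
               for b in vehicle_last_boxes)
```

For example, a pedestrian box `(40, 20, 60, 80)` has its foot point at `(50, 80)`, which lies inside a crosswalk box `(0, 70, 200, 100)`.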

8. A detection device for whether a motor vehicle yields to a pedestrian, wherein the device comprises:

a target element recognition unit, configured to perform target element recognition on the collected video frames using a trained target recognition network to obtain a target element set;

a motion trajectory tracking unit, used for tracking the motion trajectory of the target element in the target element set using a preset target tracking algorithm; and

an illegal behavior identification unit, configured to detect, based on the identified main crosswalk line and the tracked motion trajectory of the target element, whether there is a violation of a motor vehicle failing to yield to pedestrians in the current video frame by using a preset illegal behavior detection algorithm;

The target element set includes the category and detection frame of each target element, the target tracking algorithm is a maximum intersection-over-union (IoU) screening method, and the motion trajectory tracking unit further includes:

a first set obtaining unit, configured to obtain, for each target element in the target element set, a first sequence box list set according to all current tracking path sequence box lists, wherein the first sequence box list set contains all the tracking path sequence box lists with the same category as the target element, each tracking path sequence box list contains the category of the tracked target element and the detection frames of the target element arranged in chronological order, and each tracking path sequence box list corresponds to the motion trajectory of one target element;

an IoU calculation unit, configured to calculate the intersection-over-union (IoU) ratio between the detection frame of the target element and the last detection frame of each tracking path sequence box list in the first sequence box list set;

a first judging unit, configured to judge whether an IoU target frame exists, the IoU target frame being a last detection frame whose IoU ratio is greater than a preset IoU threshold;

a new element discovery unit, configured to determine that the target element is a newly appeared element if no IoU target frame exists, and to create a new tracking path sequence box list according to the detection frame and category of the target element;

a trajectory determination unit, configured to determine that the motion trajectory of the target element has been tracked if an IoU target frame exists, and to add the detection frame of the target element to the tracking path sequence box list containing the last detection frame with the largest IoU ratio;

The motion track tracking unit also includes:

an IoU test unit, configured to perform a logarithmic test on the largest IoU ratio and the second largest IoU ratio if the IoU target frame exists;

a test result determination unit, configured to determine the motion track tracked to the target element if the logarithm test is passed;

A tracking subunit, used for tracking the motion trajectory of the target element by adopting a preset cosine distance comparison method if the logarithm test is not passed;

The tracking subunit also includes:

a feature map acquisition unit, configured to acquire the latest feature map of the target element in the current video frame, and the first feature map and the second feature map corresponding respectively to the last detection frame and the second-to-last detection frame of each tracking path sequence box list in a second sequence box list set, wherein the second sequence box list set contains all the tracking path sequence box lists in which the IoU target frames are located;

a normalization processing unit, configured to perform normalization processing on the latest feature map and all the first feature maps and the second feature maps to obtain feature maps of uniform size;

a cosine distance calculation unit, configured to calculate a first cosine distance between the normalized latest feature map and each of the first feature maps, and a second cosine distance between the normalized latest feature map and each of the second feature maps;

a similarity factor calculation unit, used to obtain the corresponding first feature similarity factor and second feature similarity factor after taking the logarithm of each of the first cosine distances and each of the second cosine distances respectively;

a weighted calculation unit, configured to perform a linear weighted calculation on each first feature similarity factor and its corresponding second feature similarity factor, and to obtain the minimum weighted feature similarity factor; and

a trajectory determination subunit, configured to take the motion trajectory corresponding to the tracking path sequence box list with the minimum weighted feature similarity factor as the motion trajectory of the target element, and to add the detection frame of the target element to that tracking path sequence box list.
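
The cosine distance comparison performed by the tracking subunit can be sketched in Python as follows. This is a sketch under several assumptions that do not come from the patent: feature maps are 2-D lists of numbers, normalization to a uniform size is done by nearest-neighbour resampling, cosine distance is taken as one minus cosine similarity, a small `eps` is added before the logarithm so that a zero distance stays defined, and the weights `w_first` and `w_second` are placeholder values. All names are hypothetical.

```python
import math


def resize_nearest(fmap, size):
    """Resample a 2-D feature map to size x size by nearest neighbour (assumed method)."""
    h, w = len(fmap), len(fmap[0])
    return [[fmap[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def cosine_distance(a, b):
    """1 - cosine similarity of two equally sized 2-D maps, clamped at 0."""
    va = [v for row in a for v in row]
    vb = [v for row in b for v in row]
    dot = sum(x * y for x, y in zip(va, vb))
    na = math.sqrt(sum(x * x for x in va))
    nb = math.sqrt(sum(y * y for y in vb))
    if na == 0 or nb == 0:
        return 1.0
    return max(0.0, 1.0 - dot / (na * nb))


def similarity_factor(distance, eps=1e-9):
    """Logarithm of a cosine distance; eps is an assumption to avoid log(0)."""
    return math.log(distance + eps)


def best_track(latest, candidates, size=4, w_first=0.5, w_second=0.5):
    """Return the id of the candidate with the minimum weighted similarity factor.

    candidates: list of (track_id, first_map, second_map), where first_map and
    second_map belong to the last and second-to-last detection frames.
    """
    query = resize_nearest(latest, size)

    def weighted(candidate):
        _, first, second = candidate
        f1 = similarity_factor(cosine_distance(query, resize_nearest(first, size)))
        f2 = similarity_factor(cosine_distance(query, resize_nearest(second, size)))
        return w_first * f1 + w_second * f2

    return min(candidates, key=weighted)[0]
```

Since a smaller cosine distance gives a smaller logarithm, selecting the minimum weighted factor picks the track whose two most recent feature maps are most similar to the current detection.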

9. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.

10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 7 are implemented.