CN113469026B - Intersection retention event detection method and system based on machine learning - Google Patents

Intersection retention event detection method and system based on machine learning

Info

Publication number
CN113469026B
Authority
CN
China
Prior art keywords
image
frame
ith
retention
intersection
Prior art date
Legal status
Active
Application number
CN202110735123.XA
Other languages
Chinese (zh)
Other versions
CN113469026A (en)
Inventor
汪志涛
胡健萌
许乐
倪红波
李汪
唐崇伟
Current Assignee
Shanghai Intelligent Transportation Co ltd
Original Assignee
Shanghai Intelligent Transportation Co ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Intelligent Transportation Co ltd
Priority to CN202110735123.XA
Publication of CN113469026A
Application granted
Publication of CN113469026B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411: Classification techniques relating to the classification model, based on the proximity to a decision surface, e.g. support vector machines
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method and a system for detecting intersection retention events based on machine learning, belonging to the field of computer vision. The method comprises the following steps: collecting continuous multi-frame images of an intersection to be detected within a set time period; obtaining the position information and the speed information of each vehicle target frame in a real coordinate system from each frame image; for the ith frame image (i > 3), obtaining a multi-dimensional feature vector of the ith frame image according to the position information and the speed information of each vehicle target frame in the ith frame image and the three preceding frame images; based on a retention detection model, determining the retention state of the intersection to be detected at the time point corresponding to the ith frame image according to the multi-dimensional feature vector of the ith frame image; and determining the retention events of the intersection to be detected according to the retention states of the intersection at each time point. Because the determined multi-dimensional feature vector preserves continuity between consecutive frames and is robust, misjudgment of the prediction result is reduced and the accuracy of intersection retention event detection is improved.

Description

Intersection retention event detection method and system based on machine learning
Technical Field
The invention relates to the field of computer vision, in particular to a method and a system for detecting intersection retention events based on machine learning.
Background
For traffic congestion, scholars at home and abroad have given different definitions from different angles. Generally speaking, traffic congestion is a traffic condition in which traffic demand exceeds traffic supply. An intersection is defined as congested when vehicles are blocked on an approach lane outside a signal-controlled intersection and the queue length exceeds 250 m, or when vehicles fail to pass through a signal-controlled intersection within three green phases; a congested road section is defined as one where vehicles are blocked on the roadway and the queue length exceeds 1 km.
Existing traffic state (including traffic congestion) detection algorithms can be roughly divided into indirect and direct approaches. The indirect approach judges the traffic state from parameters such as the traffic flow, occupancy and delay time of upstream and downstream road sections. The direct approach detects the traffic state through a direct traffic state detection algorithm or manual monitoring, typically a video-image-based traffic state detection method.
However, the traffic retention state in an actual scene is a dynamically changing process. Most existing retention detection solutions, whether direct or indirect, only classify and judge the state at a single time node, without considering the correlation of information between consecutive frames; the resulting algorithms are therefore not robust, and false alarms or missed alarms easily occur in some special scenes.
Disclosure of Invention
The invention aims to provide a method and a system for detecting intersection retention events based on machine learning, which improve detection accuracy and reduce retention false alarms in actual scenes.
In order to achieve the purpose, the invention provides the following scheme:
a machine learning-based intersection retention event detection method, comprising:
collecting continuous multi-frame images of a road junction to be detected in a set time period;
obtaining the position information and the speed information of each vehicle target frame in a real coordinate system according to each frame of image; the vehicle target frame is a vehicle image frame which is calibrated in each frame image in advance aiming at each vehicle;
for the ith frame image (i > 3), obtaining a multi-dimensional feature vector of the ith frame image according to the position information and the speed information of each vehicle target frame in the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image;
based on a retention detection model, determining the retention state of the intersection to be detected at the time point corresponding to the ith frame image according to the multi-dimensional feature vector of the ith frame image; the retention state is dredging or retention;
determining a retention event of the intersection to be detected in the time period according to the retention state of the intersection to be detected at each time point;
the retention event spans from the beginning of a retention event to the end of a retention event; wherein, when the retention state of the intersection to be detected changes from dredging to retention, a retention event begins; and when the retention state of the intersection to be detected changes from retention to dredging, the retention event ends.
Optionally, the obtaining, according to each frame image, position information and speed information of each vehicle target frame in the real coordinate system specifically includes:
determining a spatial projection relation corresponding to an image coordinate system and a real coordinate system according to any frame of image;
for each frame of image, determining the position information of the image coordinate of the corresponding vehicle target frame in a real coordinate system according to the space projection relation and the image coordinate of each vehicle target frame in the image;
and obtaining the speed information of the vehicle target frame in the real coordinate system between the time points corresponding to two adjacent frame images, according to the moving distance, in the real coordinate system, of the image coordinates of the same vehicle target frame in the two frames and the time difference between them.
Optionally, the determining, according to any frame of image, a spatial projection relationship corresponding to an image coordinate system and a real coordinate system specifically includes:
any four image coordinates which are not on a straight line in the plane of the road surface are marked in any image;
acquiring real coordinates of the intersection to be detected corresponding to the coordinates of the four images;
and determining the spatial projection relation according to the four pairs of corresponding image coordinates and real coordinates.
Optionally, the obtaining, for the ith frame image (i > 3), a multi-dimensional feature vector of the ith frame image according to the position information and the speed information of each vehicle target frame in the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image specifically includes:
dividing the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image into four regions in equal proportion;
for each region, obtaining eight feature vectors of the region according to the position information and the speed information of each vehicle target frame of the region; the eight feature vectors comprise the number of vehicles, the transverse average speed of the vehicles, the longitudinal average speed of the vehicles, the area of the lane region, the average width ratio of the vehicle detection frame to the lane, the average height ratio of the vehicle detection frame to the lane, the change rate of the traffic flow and the stability of the picture;
and integrating the feature vectors of each region of the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image to obtain the multi-dimensional feature vector.
Optionally, the determining, based on the retention detection model, the retention state of the intersection to be detected at the time point corresponding to the ith frame image according to the multidimensional feature vector of the ith frame image specifically includes:
acquiring continuous multi-frame historical images of a road junction to be detected;
obtaining the position information and the speed information of each vehicle target frame in a real coordinate system according to each frame of historical image;
for the ith frame of historical image (i > 3), obtaining a multi-dimensional feature vector of the ith frame of historical image according to the position information and the speed information of each vehicle target frame in the ith, (i-1)th, (i-2)th and (i-3)th frames of historical images;
determining a corresponding retention state prediction confidence coefficient according to the multi-dimensional feature vector of the ith frame of historical image;
training a random forest classifier according to the multi-dimensional feature vectors of the historical images of the frames and the corresponding retention state prediction confidence coefficient to obtain a retention detection model;
based on the retention detection model, determining a retention state prediction confidence corresponding to the ith frame of image according to the multi-dimensional feature vector of the ith frame of image;
and determining the retention state of the intersection to be detected at the time point corresponding to the ith frame image according to the retention state prediction confidence corresponding to the ith frame image.
Optionally, the determining, based on the retention detection model, the retention state of the intersection to be detected at the time point corresponding to the ith frame image according to the multi-dimensional feature vector of the ith frame image further includes:
after determining a retention state prediction confidence corresponding to the ith frame of image based on the retention detection model, performing smoothing processing on the retention state prediction confidence corresponding to the ith frame of image to obtain a processed retention state prediction confidence; and the processed retention state prediction confidence coefficient is used for determining the retention state of the intersection to be detected at the time point corresponding to the ith frame of image.
Optionally, the processed retention state prediction confidence is obtained according to the following formula:
Score'_i = r × Score_i + (1 - r) × Score'_(i-1), i > 4;

Score'_4 = Score_4;

wherein Score'_i is the processed retention state prediction confidence of the ith frame image, Score_i is the retention state prediction confidence of the ith frame image, Score'_(i-1) is the processed retention state prediction confidence of the (i-1)th frame image, and r is a damping coefficient.
In order to achieve the above purpose, the invention also provides the following scheme:
a machine learning based intersection retention event detection system, the machine learning based intersection retention event detection system comprising:
the acquisition unit is used for acquiring continuous multi-frame images of the intersection to be detected within a set time period;
the position and speed determining unit is connected with the acquisition unit and is used for obtaining the position information and the speed information of each vehicle target frame in a real coordinate system according to each frame of image; the vehicle target frame is a vehicle image frame which is calibrated in each frame image in advance aiming at each vehicle;
a multi-dimensional feature vector determining unit, connected with the position and speed determining unit and used for obtaining, for the ith frame image (i > 3), a multi-dimensional feature vector of the ith frame image according to the position information and the speed information of each vehicle target frame in the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image;
the retention state determining unit is connected with the multi-dimensional characteristic vector determining unit and used for determining the retention state of the intersection to be detected at the time point corresponding to the ith frame of image according to the multi-dimensional characteristic vector of the ith frame of image based on a retention detection model; the retention state is dredging or retention;
the retention event determining unit is connected with the retention state determining unit and used for determining the retention events of the intersection to be detected in the time period according to the retention state of the intersection to be detected at each time point; a retention event spans from its beginning to its end; when the retention state of the intersection to be detected changes from dredging to retention, a retention event begins; and when the retention state of the intersection to be detected changes from retention to dredging, the retention event ends.
Optionally, the position and velocity determination unit comprises:
the spatial projection relation determining module is connected with the acquisition unit and used for determining a spatial projection relation corresponding to an image coordinate system and a real coordinate system according to any frame of image;
the position determining module is respectively connected with the acquisition unit and the spatial projection relation determining module and is used for determining the position information of the image coordinate of the corresponding vehicle target frame in a real coordinate system according to the spatial projection relation and the image coordinate of each vehicle target frame in each image;
and the speed determining module is connected with the position determining module and used for obtaining the speed information of the vehicle target frame in the real coordinate system between the time points corresponding to two adjacent frame images, according to the moving distance, in the real coordinate system, of the image coordinates of the same vehicle target frame in the two frames and the time difference between them.
Optionally, the spatial projection relationship determining module includes:
the image coordinate extraction submodule is connected with the acquisition unit and is used for calibrating any four image coordinates which are not on a straight line in a road surface plane in any image;
the real coordinate acquisition submodule is used for acquiring real coordinates of the intersection to be detected, which correspond to the four image coordinates;
and the projection relation determining submodule is respectively connected with the image coordinate extracting submodule and the real coordinate obtaining submodule and is used for determining a space projection relation according to four pairs of corresponding image coordinates and real coordinates.
According to the specific embodiments provided by the invention, the invention discloses the following technical effects: continuous multi-frame images of the intersection to be detected are collected within a set time period; the position information and the speed information of each vehicle target frame in the real coordinate system are obtained from each frame image; for the ith frame image (i > 3), a multi-dimensional feature vector of the ith frame image is obtained according to the position information and the speed information of the vehicle target frames in the ith frame image and the three preceding frame images, and because this multi-dimensional feature vector preserves continuity between consecutive frames and is robust, misjudgment of the prediction result is reduced; based on a retention detection model, the retention state of the intersection to be detected at the time point corresponding to the ith frame image is determined according to the multi-dimensional feature vector of the ith frame image; finally, the retention events of the intersection to be detected are determined according to the retention states at each time point, thereby improving the accuracy of intersection retention event detection.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained from them by those skilled in the art without inventive effort.
FIG. 1 is a flow chart of a machine learning based intersection retention event detection method of the present invention;
fig. 2 is a block diagram of the intersection retention event detection system based on machine learning according to the present invention.
Description of the symbols:
the device comprises an acquisition unit-1, a position and velocity determination unit-2, a multi-dimensional feature vector determination unit-3, a retention state determination unit-4 and a retention event determination unit-5.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
The invention aims to provide a method and a system for detecting intersection retention events based on machine learning. Continuous multi-frame images of the intersection to be detected are collected within a set time period; the position information and the speed information of each vehicle target frame in the real coordinate system are obtained from each frame image; for the ith frame image (i > 3), a multi-dimensional feature vector is obtained according to the position information and the speed information of the vehicle target frames in the ith frame image and the three preceding frame images, and because this multi-dimensional feature vector preserves continuity between consecutive frames and is robust, misjudgment of the prediction result is reduced. Based on a retention detection model, the retention state of the intersection to be detected at the time point corresponding to the ith frame image is determined according to the multi-dimensional feature vector of the ith frame image, and finally the retention events of the intersection to be detected are determined according to the retention states at each time point, thereby improving the accuracy of intersection retention event detection.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As shown in fig. 1, the intersection retention event detection method based on machine learning of the present invention includes:
S1: collecting continuous multi-frame images of the intersection to be detected in a set time period.
S2: obtaining the position information and the speed information of each vehicle target frame in a real coordinate system according to each frame of image; the vehicle target frame is a vehicle image frame which is calibrated in each frame image in advance aiming at each vehicle.
S3: for the ith frame image (i > 3), obtaining a multi-dimensional feature vector of the ith frame image according to the position information and the speed information of each vehicle target frame in the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image.
S4: based on a retention detection model, determining the retention state of the intersection to be detected at the time point corresponding to the ith frame image according to the multi-dimensional feature vector of the ith frame image; the retention state is dredging or retention.
S5: determining the retention events of the intersection to be detected in the time period according to the retention states of the intersection to be detected at all time points.
In this embodiment, a retention event spans from the beginning of the retention event to its end: when the retention state of the intersection to be detected changes from dredging to retention, a retention event begins; when the retention state changes from retention to dredging, the retention event ends.
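This transition logic maps directly onto the per-frame state sequence. Below is a minimal illustrative sketch in Python (not from the patent; the function name and the 0/1 encoding of retention/dredging are assumptions):

```python
# Sketch: derive retention events from per-frame retention states,
# where 1 means "retention" and 0 means "dredging" (free flow).
def extract_events(states, timestamps):
    """Return (start_time, end_time) pairs, one per retention event."""
    events, start = [], None
    for state, t in zip(states, timestamps):
        if state == 1 and start is None:
            start = t                      # dredging -> retention: event begins
        elif state == 0 and start is not None:
            events.append((start, t))      # retention -> dredging: event ends
            start = None
    if start is not None:                  # still retained at the period's end
        events.append((start, timestamps[-1]))
    return events
```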
In order to improve the degree of distinction of retention events, the intersection retention event detection method based on machine learning further comprises the following steps:
and acquiring the time interval of two adjacent retention events, and merging the two retention events if the time interval of the two adjacent retention events is smaller than an interval threshold.
And acquiring the time length of the retention event for each retention event, and deleting the retention event if the time length of the retention event is smaller than a time threshold.
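Both post-processing rules amount to a single pass over the event list. A hedged sketch, with `gap_thresh` and `min_duration` standing in for the interval threshold and time threshold, whose values the patent does not specify:

```python
# Sketch: merge retention events separated by less than gap_thresh,
# then drop events shorter than min_duration (threshold names are assumed).
def postprocess_events(events, gap_thresh, min_duration):
    merged = []
    for start, end in events:              # events assumed sorted by start time
        if merged and start - merged[-1][1] < gap_thresh:
            merged[-1] = (merged[-1][0], end)   # merge with the previous event
        else:
            merged.append((start, end))
    return [(s, e) for s, e in merged if e - s >= min_duration]
```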
Further, S2: obtaining the position information and the speed information of each vehicle target frame under the real coordinate system according to each frame of image, which specifically comprises the following steps:
S21: determining a spatial projection relation corresponding to the image coordinate system and the real coordinate system according to any frame of image.
S22: determining the position information of the image coordinates of the corresponding vehicle target frame in the real coordinate system according to the spatial projection relation and the image coordinates of each vehicle target frame in each frame of image.
S23: obtaining the speed information of the vehicle target frame in the real coordinate system between the time points corresponding to two adjacent frame images, according to the moving distance, in the real coordinate system, of the image coordinates of the same vehicle target frame in the two frames and the time difference between them.
As another embodiment, when determining the speed information, the position information of the center point of the vehicle target frame in the real coordinate system is acquired; the speed information of the center point in the real coordinate system between the time points corresponding to two adjacent frame images is then obtained from the moving distance, in the real coordinate system, of the image coordinates of that center point in the two frames and the time difference between them.
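In code, S23 reduces to projecting the target-frame center into real coordinates in two adjacent frames and dividing the displacement by the time difference. An illustrative sketch, assuming `H` is the 3 × 3 perspective matrix solved for in the calibration step described below:

```python
import numpy as np

def project(H, pt):
    """Map an image point (u, v) to real-world coordinates via the 3x3 matrix H."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return np.array([x / w, y / w])

def center_speed(H, center_prev, center_curr, dt):
    """Lateral/longitudinal speed of a target-frame center between two frames."""
    return (project(H, center_curr) - project(H, center_prev)) / dt
```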
Further, S21: according to any frame of image, determining a spatial projection relation corresponding to an image coordinate system and a real coordinate system, specifically comprising:
any four image coordinates not on a straight line in the road surface plane are specified in any one image.
And acquiring real coordinates of the intersection to be detected corresponding to the coordinates of the four images. In this embodiment, by obtaining an aerial photography top view of the intersection to be measured, the real coordinates corresponding to the four image coordinates are determined according to the aerial photography top view.
And determining the space projection relation according to the four pairs of corresponding image coordinates and real coordinates.
In this embodiment, the transformation formula of the spatial projection relationship is:

(x', y', w')^T = A · (u, v, w)^T

wherein u, v, w are the image coordinates, x', y', w' are the real coordinates, and A is the perspective transformation matrix composed of the 9 unknown parameters a_11 to a_33.
In addition, a general perspective transformation matrix has 9 unknown parameters, and four point pairs are normally required to solve for it. However, in the application scenario of the present invention, the purpose of calibration is to obtain the speed of the vehicle target, so only the relative position of the target frame in the real coordinate system is needed rather than its true position; in principle, three point pairs would therefore suffice.
Substituting the four pairs of corresponding image coordinates and real coordinates into the transformation formula gives four sets of equations; solving these equations yields the values of the 9 unknown parameters a_11 to a_33 of the perspective transformation matrix.
The real positions and the real speeds of all vehicles can be obtained through the space projection relation.
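For reference, the four-point calibration can be carried out with OpenCV's getPerspectiveTransform; the coordinates below are placeholders, not values from the patent:

```python
import cv2
import numpy as np

# Four non-collinear points on the road plane, in image pixels ...
img_pts = np.float32([[120, 400], [520, 410], [90, 700], [560, 690]])
# ... and their real-world counterparts (e.g. metres, from an aerial top view).
real_pts = np.float32([[0, 0], [12, 0], [0, 25], [12, 25]])

H = cv2.getPerspectiveTransform(img_pts, real_pts)  # 3x3 perspective matrix
```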
Preferably, S3: for the ith frame image (i > 3), obtaining a multi-dimensional feature vector of the ith frame image according to the position information and the speed information of each vehicle target frame in the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image, specifically includes:
S31: dividing the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image into four regions in equal proportion.
S32: for each region, obtaining the eight feature vectors of the region according to the position information and the speed information of each vehicle target frame in the region. In the present embodiment, the eight feature vectors are the number of vehicles, the vehicle lateral average speed, the vehicle longitudinal average speed, the lane area, the average width ratio of the vehicle detection frame to the lane, the average height ratio of the vehicle detection frame to the lane, the rate of change of the traffic flow, and the picture stability.
S33: integrating the feature vectors of each region of the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image to obtain the multi-dimensional feature vector. In this embodiment, the multi-dimensional feature vector is a 128-dimensional vector: 4 regions × 4 frames × 8 features.
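A sketch of the assembly in S31 to S33; `region_features` is a hypothetical helper returning the eight per-region features listed above:

```python
import numpy as np

def build_feature_vector(frames, region_features):
    """frames: data for the ith, (i-1)th, (i-2)th and (i-3)th frame images."""
    feats = [region_features(frame, region)     # eight features per region
             for frame in frames                # 4 frames
             for region in range(4)]            # 4 regions each
    vec = np.concatenate(feats)
    assert vec.shape == (128,)                  # 4 regions x 4 frames x 8 features
    return vec
```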
Aiming at the characteristics of road conditions when a traffic intersection is detained, the feature vector acquired by the invention mainly comprises static features and dynamic features. The static characteristics refer to the vehicle and lane characteristics of the current frame and are used for describing a lane detention static state under the current frame; the dynamic feature refers to vehicle and lane features of three frames extracted every 10s in the past, and is used for describing dynamic changes of a lane staying state in the past time period.
Specifically, the number of vehicles is the number of vehicles in a lane area of the intersection to be detected, and the characteristic is positively correlated with the retention state of the intersection to be detected.
The vehicle transverse average speed is the average speed of the vehicle in the horizontal direction in the lane area of the intersection to be detected, and the characteristic is negatively correlated with the retention state of the intersection to be detected.
The longitudinal average speed of the vehicle is the average speed of the vehicle in the vertical direction in the lane area of the intersection to be detected, and the characteristic is negatively correlated with the retention state of the intersection to be detected.
The lane area is the area that the lane region occupies in the image picture. Because cameras shoot from different distances and angles, lanes of the same physical size occupy different areas in different camera pictures, which introduces camera-dependent errors. Introducing this feature alleviates the problem.
The average width ratio of the vehicle detection frame to the lane is the average value of the ratio of each vehicle target frame to the lane width, and the characteristic represents the bearing capacity of the lateral direction of the lane area to the retention jam.
The average height ratio of the vehicle detection frame to the lane is the average value of the height ratio of each vehicle target frame to the lane, and the characteristic represents the bearing capacity of the longitudinal direction of the lane area to the retention jam.
The traffic flow change rate is obtained from the traffic flow of the lane region in each frame image: the traffic flow of the current frame minus that of the previous frame is the traffic flow change rate. This feature is negatively correlated with the retention state of the intersection to be detected.
In the embodiment, a dome (PTZ) camera is used to collect the multi-frame images of the intersection to be detected. Since a dome camera can rotate and the scene is affected by day-night changes, lighting conditions and weather, its picture is not stable. Therefore, for each camera that collects images of the intersection to be detected, the invention extracts a reference picture in advance and selects several sub-regions from the background part of the picture. The similarity of the two images is obtained by template matching the sub-regions of the reference picture against the current frame picture; the picture stability is this similarity.
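One possible realization of this feature uses OpenCV template matching; the sub-region choice and the normalized-correlation score are assumptions of this sketch:

```python
import cv2
import numpy as np

def picture_stability(reference, current, subregions):
    """Mean template-matching score of background patches of the reference
    frame against the current frame; subregions are (x, y, w, h) boxes."""
    scores = []
    for x, y, w, h in subregions:
        patch = reference[y:y + h, x:x + w]
        res = cv2.matchTemplate(current, patch, cv2.TM_CCOEFF_NORMED)
        scores.append(float(res.max()))   # best match of this patch anywhere
    return float(np.mean(scores))
```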
Further, S4: based on a retention detection model, determining the retention state of the intersection to be detected at the time point corresponding to the ith frame image according to the multi-dimensional feature vector of the ith frame image, specifically comprising:
S41: acquiring continuous multi-frame historical images of the intersection to be detected.
S42: obtaining the position information and the speed information of each vehicle target frame in the real coordinate system according to each frame of historical image.
S43: for the ith frame of historical image (i > 3), obtaining the multi-dimensional feature vector of the ith frame of historical image according to the position information and the speed information of each vehicle target frame in the ith, (i-1)th, (i-2)th and (i-3)th frames of historical images.
S44: determining the corresponding retention state prediction confidence according to the multi-dimensional feature vector of the ith frame of historical image.
S45: training a random forest classifier according to the multi-dimensional feature vector of each frame of historical image and the corresponding retention state prediction confidence to obtain the retention detection model.
S46: based on the retention detection model, determining a retention state prediction confidence corresponding to the ith frame of image according to the multi-dimensional feature vector of the ith frame of image;
S47: determining the retention state of the intersection to be detected at the time point corresponding to the ith frame image according to the retention state prediction confidence corresponding to the ith frame image.
Further, S4: based on the retention detection model, determining the retention state of the intersection to be detected at the time point corresponding to the ith frame image according to the multi-dimensional feature vector of the ith frame image further comprises:
after determining a retention state prediction confidence degree corresponding to the ith frame of image based on the retention detection model, performing smoothing processing on the retention state prediction confidence degree corresponding to the ith frame of image to obtain a processed retention state prediction confidence degree; and the processed retention state prediction confidence coefficient is used for determining the retention state of the intersection to be detected at the time point corresponding to the ith frame of image.
In this embodiment, the processed retention state prediction confidence is obtained according to the following formula:
Score'_i = r × Score_i + (1 - r) × Score'_(i-1), i > 4;

Score'_4 = Score_4;

wherein Score'_i is the processed retention state prediction confidence of the ith frame image, Score_i is the retention state prediction confidence of the ith frame image, Score'_(i-1) is the processed retention state prediction confidence of the (i-1)th frame image, and r is a damping coefficient.
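The recursion transcribes directly into code; the value of the damping coefficient r is not specified in the patent:

```python
def smooth_scores(scores, r):
    """scores[k] is Score_i for frame i = k + 4; returns the Score'_i series."""
    smoothed = [scores[0]]                         # Score'_4 = Score_4
    for s in scores[1:]:
        smoothed.append(r * s + (1 - r) * smoothed[-1])
    return smoothed
```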
The prediction result of the random forest classifier is the voting result of the decision tree, and the principle of training by the random forest classifier adopted by the invention is as follows:
dividing a sample data set into a training set and a test set;
each tree in the forest has the same probability distribution, and the classification error of the classifier depends on the classification capability of each decision tree and the relevance between the trees. Although the classification performance of a decision tree is relatively weak, through the integration of a large number of randomly generated decision trees, a test sample passes through each decision tree and is counted to determine the final classification result.
m new sample subsets are obtained by repeated random sampling with replacement from the entire training set (n samples). A decision tree is constructed for each sample subset, and the samples not drawn in each round form the corresponding m out-of-bag datasets. Each sample has 128 feature variables; for each decision node of each decision tree, y variables are randomly drawn, and the single variable with the strongest classification ability, together with a threshold, is selected. In the process of generating the decision trees, each tree grows as deep as possible and no pruning is performed. The generated decision trees are combined into a random forest, and the prediction result of the random forest classifier is determined comprehensively from the voting results of all the trees.
And after the training of the random forest classifier is finished, the feature vectors extracted from the test set are sent to the classifier, and the retention state prediction confidence of the current frame can be obtained.
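A hedged sketch of this train-and-score loop with scikit-learn's RandomForestClassifier; the hyperparameters and the synthetic stand-in data are illustrative only:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((1000, 128))      # stand-in for the 128-dim feature vectors
y = rng.integers(0, 2, 1000)     # stand-in retention labels (1 = retention)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
clf = RandomForestClassifier(n_estimators=200, max_features="sqrt")
clf.fit(X_train, y_train)

# Retention prediction confidence = fraction of trees voting "retention".
conf = clf.predict_proba(X_test)[:, 1]
```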
After smoothing, if the retention state prediction confidence is higher than a threshold value, the current frame image is judged to be in the retention state.
Because the retention state prediction confidence jumps discontinuously between adjacent frames, it needs to be smoothed; in this embodiment a window-averaging method is used, yielding a robust prediction result. With window smoothing, the confidence of each frame is updated in proportion to the damping coefficient r. When retention occurs at some frame of the video, the model's prediction score reaches the threshold through a gradual smoothing process, so false detections lasting only a few frames are avoided, and the prediction result of the method has continuity and robustness.
Because the retention state of the intersection to be detected is a dynamically changing process, detecting it from a single frame image alone has limitations. The method first calibrates the collected multi-frame images and obtains the real positions and real speeds of all vehicles by projection transformation. Second, a multi-dimensional retention feature vector is obtained from the vehicle-condition data in the multi-frame images within a time period and combined with a random forest classifier, so that the feature vector used for retention classification has continuity and robustness between consecutive frames, reducing misjudgment of the prediction result. Finally, window smoothing with the window-averaging algorithm is applied to the retention state prediction confidence to obtain a more robust prediction result.
As shown in fig. 2, the intersection retention event detection system based on machine learning of the present invention includes: an acquisition unit 1, a position and velocity determination unit 2, a multi-dimensional feature vector determination unit 3, a retention state determination unit 4, and a retention event determination unit 5.
The acquisition unit 1 is used for acquiring continuous multi-frame images of the intersection to be detected in a set time period.
The position and speed determining unit 2 is connected with the collecting unit 1, and the position and speed determining unit 2 is used for obtaining position information and speed information of each vehicle target frame in a real coordinate system according to each frame image; the vehicle target frame is a vehicle image frame which is calibrated in each frame image in advance aiming at each vehicle.
The multi-dimensional feature vector determination unit 3 is connected to the position and speed determination unit 2, and is configured to, for the ith frame image (i > 3), obtain a multi-dimensional feature vector of the ith frame image according to the position information and the speed information of each vehicle target frame in the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image.
The retention state determining unit 4 is connected to the multi-dimensional feature vector determining unit 3, and the retention state determining unit 4 is configured to determine, based on a retention detection model, a retention state of the intersection to be detected at a time point corresponding to the ith frame image according to the multi-dimensional feature vector of the ith frame image; the retention state is dredging or retention.
The retention event determining unit 5 is connected with the retention state determining unit 4, and is configured to determine the retention events of the intersection to be detected in the time period according to the retention state of the intersection to be detected at each time point. A retention event spans from its beginning to its end: when the retention state of the intersection to be detected changes from dredging to retention, a retention event begins; when the retention state changes from retention to dredging, the retention event ends.
Further, the position and velocity determination unit 2 includes: the device comprises a spatial projection relation determining module, a position determining module and a speed determining module.
The spatial projection relation determining module is connected with the acquisition unit 1, and is used for determining a spatial projection relation corresponding to an image coordinate system and a real coordinate system according to any frame of image.
The position determining module is respectively connected with the acquisition unit 1 and the spatial projection relation determining module, and is used for determining the position information of the image coordinates of the corresponding vehicle target frame in a real coordinate system according to the spatial projection relation and the image coordinates of each vehicle target frame in each frame of image.
The speed determining module is connected with the position determining module and is used for obtaining the speed information of the vehicle target frame in the real coordinate system between the time points corresponding to two adjacent frame images, according to the moving distance, in the real coordinate system, of the image coordinates of the same vehicle target frame in the two frames and the time difference between them.
Still further, the spatial projection relationship determination module includes: the system comprises an image coordinate extraction submodule, a real coordinate acquisition submodule and a projection relation determination submodule.
The image coordinate extraction submodule is connected with the acquisition unit 1 and is used for calibrating any four image coordinates which are not on a straight line in a road surface plane in any frame of image.
And the real coordinate acquisition submodule is used for acquiring real coordinates of the intersection to be detected, which correspond to the four image coordinates.
The projection relation determining submodule is respectively connected with the image coordinate extracting submodule and the real coordinate obtaining submodule and is used for determining a space projection relation according to four pairs of corresponding image coordinates and real coordinates.
Compared with the prior art, the intersection retention event detection system based on machine learning has the same beneficial effects as the intersection retention event detection method based on machine learning, and is not repeated herein.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (9)

1. A machine learning-based intersection retention event detection method is characterized by comprising the following steps:
collecting continuous multi-frame images of a road junction to be detected in a set time period;
obtaining the position information and the speed information of each vehicle target frame in a real coordinate system according to each frame of image; the vehicle target frame is a vehicle image frame which is calibrated in each frame image in advance aiming at each vehicle;
for the ith frame image (i > 3), obtaining a multi-dimensional feature vector of the ith frame image according to the position information and the speed information of each vehicle target frame in the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image;
wherein the obtaining, for the ith frame image (i > 3), a multi-dimensional feature vector of the ith frame image according to the position information and the speed information of each vehicle target frame in the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image specifically comprises:
dividing the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image into four regions in equal proportion;
for each region, obtaining eight feature vectors of the region according to the position information and the speed information of each vehicle target frame of the region, the eight feature vectors comprising the number of vehicles, the transverse average speed of the vehicles, the longitudinal average speed of the vehicles, the area of the lane region, the average width ratio of the vehicle detection frame to the lane, the average height ratio of the vehicle detection frame to the lane, the change rate of the traffic flow and the stability of the picture; wherein, for each camera collecting images of the intersection to be detected, a reference picture is extracted in advance and a plurality of sub-regions are selected from the background part of the picture; the similarity of the two images is obtained by template matching the sub-regions of the reference picture against the current frame picture; and the picture stability is the similarity of the two images;
integrating the feature vectors of each region of the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image to obtain the multi-dimensional feature vector;
based on a retention detection model, determining the retention state of the intersection to be detected at the time point corresponding to the ith frame image according to the multi-dimensional feature vector of the ith frame image; the retention state is dredging or retention;
determining a retention event of the intersection to be detected in the time period according to the retention state of the intersection to be detected at each time point;
the retention event spans from the beginning of a retention event to the end of a retention event; wherein, when the retention state of the intersection to be detected changes from dredging to retention, a retention event begins; and when the retention state of the intersection to be detected changes from retention to dredging, the retention event ends.
2. The intersection retention event detection method based on machine learning of claim 1, wherein the obtaining of the position information and the speed information of each vehicle target frame in the real coordinate system according to each frame image specifically comprises:
determining a spatial projection relation corresponding to an image coordinate system and a real coordinate system according to any frame of image;
for each frame of image, determining the position information of the image coordinate of the corresponding vehicle target frame in a real coordinate system according to the space projection relation and the image coordinate of each vehicle target frame in the image;
and obtaining the speed information of the vehicle target frame in the real coordinate system between the time points corresponding to two adjacent frame images, according to the moving distance, in the real coordinate system, of the image coordinates of the same vehicle target frame in the two frames and the time difference between them.
3. The machine learning-based intersection retention event detection method according to claim 2, wherein the determining a spatial projection relationship corresponding to an image coordinate system and a real coordinate system according to any frame of image specifically comprises:
any four image coordinates which are not on a straight line in the plane of the road surface are marked in any image;
acquiring real coordinates of the intersection to be detected corresponding to the coordinates of the four images;
and determining the spatial projection relation according to the four pairs of corresponding image coordinates and real coordinates.
4. The machine learning-based intersection retention event detection method according to claim 1, wherein the determining, based on the retention detection model, the retention state of the intersection to be detected at the time point corresponding to the ith frame image according to the multidimensional feature vector of the ith frame image specifically includes:
acquiring continuous multi-frame historical images of a road junction to be detected;
obtaining the position information and the speed information of each vehicle target frame in a real coordinate system according to each frame of historical image;
for the ith frame of historical image (i > 3), obtaining a multi-dimensional feature vector of the ith frame of historical image according to the position information and the speed information of each vehicle target frame in the ith, (i-1)th, (i-2)th and (i-3)th frames of historical images;
determining a corresponding retention state prediction confidence coefficient according to the multi-dimensional feature vector of the ith frame of historical image;
training a random forest classifier according to the multi-dimensional feature vector of each frame of historical image and the corresponding retention state prediction confidence coefficient to obtain a retention detection model;
based on the retention detection model, determining a retention state prediction confidence corresponding to the ith frame of image according to the multi-dimensional feature vector of the ith frame of image;
and determining the retention state of the intersection to be detected at the time point corresponding to the ith frame image according to the retention state prediction confidence corresponding to the ith frame image.
5. The intersection retention event detection method based on machine learning according to claim 4, wherein the retention detection model is used to determine the retention state of the intersection to be detected at the time point corresponding to the ith frame image according to the multidimensional feature vector of the ith frame image, and further includes:
after determining a retention state prediction confidence corresponding to the ith frame of image based on the retention detection model, performing smoothing processing on the retention state prediction confidence corresponding to the ith frame of image to obtain a processed retention state prediction confidence; and the processed retention state prediction confidence coefficient is used for determining the retention state of the intersection to be detected at the time point corresponding to the ith frame of image.
6. The machine learning-based intersection retention event detection method according to claim 5, wherein the processed retention state prediction confidence is obtained according to the following formula:
Score'_i = r × Score_i + (1 - r) × Score'_(i-1), i > 4;

Score'_4 = Score_4;

wherein Score'_i is the processed retention state prediction confidence of the ith frame image, Score_i is the retention state prediction confidence of the ith frame image, Score'_(i-1) is the processed retention state prediction confidence of the (i-1)th frame image, and r is a damping coefficient.
7. A machine learning based intersection retention event detection system, comprising:
the acquisition unit is used for acquiring continuous multi-frame images of the intersection to be detected within a set time period;
the position and speed determining unit is connected with the acquisition unit and is used for obtaining the position information and the speed information of each vehicle target frame in a real coordinate system according to each frame of image; the vehicle target frame is a vehicle image frame which is calibrated in each frame image in advance aiming at each vehicle;
a multi-dimensional feature vector determination unit, connected to the position and speed determination unit and configured to, for the ith frame image (i > 3), obtain a multi-dimensional feature vector of the ith frame image according to the position information and the speed information of each vehicle target frame in the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image, wherein:
dividing the ith frame image, the (i-1)th frame image, the (i-2)th frame image and the (i-3)th frame image into four regions in equal proportion;
for each region, obtaining eight feature vectors of the region according to the position information and the speed information of each vehicle target frame of the region, the eight feature vectors comprising the number of vehicles, the transverse average speed of the vehicles, the longitudinal average speed of the vehicles, the area of the lane region, the average width ratio of the vehicle detection frame to the lane, the average height ratio of the vehicle detection frame to the lane, the change rate of the traffic flow and the stability of the picture; wherein, for each camera collecting images of the intersection to be detected, a reference picture is extracted in advance and a plurality of sub-regions are selected from the background part of the picture; the similarity of the two images is obtained by template matching the sub-regions of the reference picture against the current frame picture; and the picture stability is the similarity of the two images;
integrating the feature vectors of each region of the ith frame image, the ith-1 frame image, the ith-2 frame image and the ith-3 frame image to obtain a multi-dimensional feature vector;
a retention state determining unit, connected with the multi-dimensional feature vector determining unit and configured to determine, based on a retention detection model, the retention state of the intersection to be detected at the time point corresponding to the ith frame image according to the multi-dimensional feature vector of the ith frame image, the retention state being either dredging or retention;
a retention event determining unit, connected with the retention state determining unit and configured to determine the retention events of the intersection to be detected within the time period according to the retention state of the intersection to be detected at each time point, a retention event spanning from its beginning to its end, wherein a change of the retention state of the intersection to be detected from dredging to retention marks the beginning of a retention event, and a change from retention to dredging marks its end.
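A minimal sketch of the per-region feature extraction of claim 7 follows, with the picture-stability feature computed by OpenCV template matching; the function names, the (x, y, w, h) box convention, and the per-camera choice of sub-regions are assumptions of the example:

```python
import cv2
import numpy as np

def picture_stability(reference, frame, subregions):
    """Mean template-matching similarity over background sub-regions.

    subregions: (x, y, w, h) boxes selected once per camera from the
    background of the reference picture (an assumed convention).
    """
    scores = []
    for x, y, w, h in subregions:
        template = reference[y:y + h, x:x + w]
        patch = frame[y:y + h, x:x + w]
        # Same-size patch and template yield a single correlation value.
        scores.append(float(cv2.matchTemplate(
            patch, template, cv2.TM_CCOEFF_NORMED).max()))
    return float(np.mean(scores))

def region_features(boxes, speeds, lane_area, lane_w, lane_h,
                    flow_change_rate, stability):
    """The eight features of one region: vehicle count, lateral and
    longitudinal average speeds, lane-region area, average width and
    height ratios of detection frames to the lane, traffic-flow change
    rate, and picture stability."""
    n = len(boxes)                        # boxes: one (x, y, w, h) per vehicle
    vx = float(np.mean([v[0] for v in speeds])) if n else 0.0
    vy = float(np.mean([v[1] for v in speeds])) if n else 0.0
    w_ratio = float(np.mean([b[2] / lane_w for b in boxes])) if n else 0.0
    h_ratio = float(np.mean([b[3] / lane_h for b in boxes])) if n else 0.0
    return [n, vx, vy, lane_area, w_ratio, h_ratio,
            flow_change_rate, stability]
```

Concatenating the eight features of all four regions over the ith through (i−3)th frames would give a 4 × 4 × 8 = 128-dimensional vector, though the claim itself only states that the per-region vectors are integrated.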
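And a sketch of the retention event determining unit's scan over per-frame states; the 0.5 cut-off that turns a confidence into a state is an assumption, since the claim does not fix one:

```python
def detect_events(smoothed_scores, threshold=0.5, start_frame=4):
    """Map per-frame confidences to dredging/retention states and
    return (start, end) frame indices of retention events: a change
    from dredging to retention begins an event, the reverse ends it."""
    events, start, state = [], None, "dredging"
    for i, score in enumerate(smoothed_scores, start=start_frame):
        new_state = "retention" if score >= threshold else "dredging"
        if state == "dredging" and new_state == "retention":
            start = i                     # a retention event begins
        elif state == "retention" and new_state == "dredging":
            events.append((start, i))     # the event ends
            start = None
        state = new_state
    return events                         # an event still open at the end is dropped
```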
8. The machine learning-based intersection retention event detection system of claim 7, wherein the position and speed determining unit comprises:
a spatial projection relation determining module, connected with the acquisition unit and configured to determine the spatial projection relation between the image coordinate system and the real coordinate system according to any one frame image;
a position determining module, connected with the acquisition unit and the spatial projection relation determining module respectively, and configured to determine, according to the spatial projection relation and the image coordinates of each vehicle target frame in each image, the position information of the corresponding vehicle target frame in the real coordinate system;
a speed determining module, connected with the position determining module and configured to obtain the speed information of a vehicle target frame in the real coordinate system at the time points corresponding to two adjacent frame images, according to the distance moved in the real coordinate system by the image coordinates of the same vehicle target frame between the two adjacent frame images and the time difference between the two frame images.
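A sketch of the position and speed modules of claim 8, under two assumptions made for the example: the spatial projection relation is realized as a 3 × 3 homography H (determined as in claim 9), and each vehicle target frame is located by a single image point such as its bottom-center:

```python
import numpy as np

def to_real(H, pt):
    """Project an image coordinate into the real (road-plane)
    coordinate system through the homography H."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return np.array([x / w, y / w])

def target_speed(H, pt_prev, pt_curr, dt):
    """Speed of one vehicle target frame between two adjacent frames:
    the distance moved in the real coordinate system divided by the
    time difference dt between the two frame images."""
    return np.linalg.norm(to_real(H, pt_curr) - to_real(H, pt_prev)) / dt
```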
9. The machine learning-based intersection retention event detection system of claim 8, wherein the spatial projection relation determining module comprises:
an image coordinate extraction sub-module, connected with the acquisition unit and configured to calibrate, in any one image, any four image coordinates on the road surface plane that are not all on one straight line;
a real coordinate acquisition sub-module, configured to acquire the real coordinates of the intersection to be detected corresponding to the four image coordinates;
a projection relation determining sub-module, connected with the image coordinate extraction sub-module and the real coordinate acquisition sub-module respectively, and configured to determine the spatial projection relation according to the four pairs of corresponding image and real coordinates.
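Four point pairs with no three image points collinear determine a plane-to-plane homography exactly, so one way to realize the projection relation determining sub-module is OpenCV's four-point perspective transform; the coordinate values below are purely illustrative:

```python
import cv2
import numpy as np

# Four calibrated image coordinates on the road surface, not all on
# one straight line, and their measured real-world counterparts
# (illustrative values, in pixels and metres respectively).
img_pts  = np.float32([[412, 710], [1530, 698], [980, 385], [240, 420]])
real_pts = np.float32([[0.0, 0.0], [18.5, 0.0], [12.0, 32.0], [-3.0, 28.0]])

# The 3x3 homography mapping image coordinates to road-plane coordinates.
H = cv2.getPerspectiveTransform(img_pts, real_pts)
```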
CN202110735123.XA 2021-06-30 2021-06-30 Intersection retention event detection method and system based on machine learning Active CN113469026B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110735123.XA CN113469026B (en) 2021-06-30 2021-06-30 Intersection retention event detection method and system based on machine learning

Publications (2)

Publication Number Publication Date
CN113469026A CN113469026A (en) 2021-10-01
CN113469026B (en) 2023-03-24

Family

ID=77874368

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110735123.XA Active CN113469026B (en) 2021-06-30 2021-06-30 Intersection retention event detection method and system based on machine learning

Country Status (1)

Country Link
CN (1) CN113469026B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115083179B (en) * 2022-08-23 2022-12-20 江苏鼎集智能科技股份有限公司 Intelligent traffic application service control system based on Internet of things

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106355903A (en) * 2016-09-13 2017-01-25 枣庄学院 Method for detecting vehicle flow of multiple lanes on basis of video analysis
CN110598511A (en) * 2018-06-13 2019-12-20 杭州海康威视数字技术股份有限公司 Method, device, electronic equipment and system for detecting green light running event

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104464307B * 2014-12-17 2016-08-10 合肥革绿信息科技有限公司 A video-based automatic detection method for tunnel traffic congestion events
CN105788272B * 2016-05-16 2018-03-16 杭州智诚惠通科技有限公司 A method and system for traffic flow congestion alarming
CN111275960A (en) * 2018-12-05 2020-06-12 杭州海康威视系统技术有限公司 Traffic road condition analysis method, system and camera
CN113841188B (en) * 2019-05-13 2024-02-20 日本电信电话株式会社 Traffic flow estimating device, traffic flow estimating method, and storage medium
CN111950329B (en) * 2019-05-16 2024-06-18 长沙智能驾驶研究院有限公司 Target detection and model training method, device, computer equipment and storage medium
CN110688922A (en) * 2019-09-18 2020-01-14 苏州奥易克斯汽车电子有限公司 Deep learning-based traffic jam detection system and detection method
CN111583668B (en) * 2020-05-27 2021-12-28 阿波罗智联(北京)科技有限公司 Traffic jam detection method and device, electronic equipment and storage medium
CN111899514A * 2020-08-19 2020-11-06 陇东学院 A congestion detection system based on artificial intelligence

Similar Documents

Publication Publication Date Title
US8582816B2 (en) Method and apparatus for video analytics based object counting
CN104303193B (en) Target classification based on cluster
Bibi et al. Automatic parking space detection system
US9373043B2 (en) Method and apparatus for detecting road partition
US12002225B2 (en) System and method for transforming video data into directional object count
CN103049787B (en) A kind of demographic method based on head shoulder feature and system
CN104239867B (en) License plate locating method and system
CN102819764B (en) Method for counting pedestrian flow from multiple views under complex scene of traffic junction
CN110992693B (en) Deep learning-based traffic congestion degree multi-dimensional analysis method
CN109241938B (en) Road congestion detection method and terminal
US20170032676A1 (en) System for detecting pedestrians by fusing color and depth information
CN110379168B (en) Traffic vehicle information acquisition method based on Mask R-CNN
US10438072B2 (en) Video data background tracking and subtraction with multiple layers of stationary foreground and background regions
CN110874592A (en) Forest fire smoke image detection method based on total bounded variation
CN110598511A (en) Method, device, electronic equipment and system for detecting green light running event
CN102915433A (en) Character combination-based license plate positioning and identifying method
CN101936730A (en) Vehicle queue length detection method and device
CN106372619B A robust vehicle detection and lane-by-lane arrival cumulative curve estimation method
CN102768802B (en) Method for judging road vehicle jam based on finite-state machine (FSM)
CN110889328A (en) Method, device, electronic equipment and storage medium for detecting road traffic condition
KR101026778B1 (en) Vehicle image detection apparatus
CN113469026B (en) Intersection retention event detection method and system based on machine learning
CN111127520A (en) Vehicle tracking method and system based on video analysis
Chen et al. Traffic congestion classification for nighttime surveillance videos
Shafie et al. Smart video surveillance system for vehicle detection and traffic flow control

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A Machine Learning Based Detection Method and System for Intersection Detention Events

Granted publication date: 20230324

Pledgee: China Construction Bank Corporation Shanghai Second Branch

Pledgor: SHANGHAI INTELLIGENT TRANSPORTATION Co.,Ltd.

Registration number: Y2024980017834