CN116883610A - Digital twin intersection construction method and system based on vehicle identification and track mapping - Google Patents


Info

Publication number
CN116883610A
CN116883610A (application CN202311146515.8A)
Authority
CN
China
Prior art keywords: vehicle, image, target, bounding box, intersection
Prior art date
Legal status
Pending
Application number
CN202311146515.8A
Other languages
Chinese (zh)
Inventor
杨亚宁
钱程扬
蒋如乔
王一梅
杨瑞雪
Current Assignee
Yuance Information Technology Co., Ltd.
Original Assignee
Yuance Information Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Yuance Information Technology Co., Ltd.
Priority to CN202311146515.8A
Publication of CN116883610A


Classifications

    • G08G1/0129 Traffic data processing for creating historical data or processing based on historical data
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/08 Neural network learning methods
    • G06T17/00 Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06T3/4038 Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06V10/46 Descriptors for shape, contour or point-related features, e.g. scale-invariant feature transform [SIFT] or bags of words [BoW]; salient regional features
    • G06V10/757 Matching configurations of points or features
    • G06V10/764 Recognition using classification, e.g. of video objects
    • G06V10/82 Recognition using neural networks
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0125 Traffic data processing
    • G08G1/0137 Measuring and analysing of parameters relative to traffic conditions for specific applications
    • H04N5/265 Mixing (studio circuits for special effects)
    • H04N7/18 Closed-circuit television [CCTV] systems
    • G06T2200/32 Indexing scheme involving image mosaicing
    • G06T2207/10016 Video; image sequence
    • G06T2207/30236 Traffic on road, railway or crossing
    • G06T2219/004 Annotating, labelling (3D models)
    • G06V2201/08 Detecting or categorising vehicles
    • Y02T10/40 Engine management systems (ICE-based road transport)

Abstract

The invention provides a digital twin intersection construction method and system based on vehicle identification and track mapping. In the method, a monitoring video is used as the data source: vehicle metadata is acquired by a target detection technique, vehicle tracks are acquired by a target tracking technique, and the geographic track of each vehicle is obtained through an image-to-geographic-coordinate mapping method. The vehicle information of primary interest is stored in a structured form, which reduces the data volume, compensates for the limitation that video data cannot be stored long-term, and facilitates subsequent application analysis. A digital twin intersection scene is then constructed by combining geographic video projection with three-dimensional scene visualization. Supported by the real-time vehicle tracks and vehicle metadata, a refined intersection scene is established, providing data support for optimizing intersection regulation and control schemes and thereby improving intersection throughput and road safety.

Description

Digital twin intersection construction method and system based on vehicle identification and track mapping
Technical Field
The invention belongs to the technical field of digital twins, and particularly relates to a digital twin intersection construction method and system based on vehicle identification and track mapping.
Background
Road intersections, with their many traffic participants and complex traffic rules, are the scenes most prone to congestion and traffic incidents in a traffic system, and therefore need more refined and intelligent management. Conventional intersection management relies on camera inspection and on-site policing, both of which lag in discovering and handling events and cannot cope with complex intersection environments.
An intelligent intersection based on digital twin technology is a new-generation traffic control system that combines cutting-edge technologies such as artificial intelligence, big data and the new-generation Internet of Things to establish all-round traffic data perception and multi-source data-driven decision making; a digital twin intersection can therefore achieve more refined and intelligent intersection management. Existing digital twin intersection construction schemes, however, are large and complex systems that must integrate intersection radar, electronic police/checkpoint cameras, edge computing equipment and the like. Their demands on software and hardware are high and the per-intersection cost is large, so they have only been deployed as pilot projects and are difficult to roll out city-wide.
Chinese patent document CN110009561A proposes a method and system for mapping surveillance video targets into a three-dimensional geographic scene model. The method first reads the video images of a monitoring camera, then collects pairs of corresponding (homonymous) points between the video and the three-dimensional geographic scene model, obtains their image coordinates and geographic coordinates, and establishes a mapping between the video image and geographic space. Video target detection then yields the target region and sub-image in each frame. Finally, the video-to-geographic-space mapping model projects each target sub-image into the geographic scene model to achieve visual fusion. That application, however, does not consider the continuity of targets across consecutive video frames: it cannot recognize the same target in the three-dimensional scene and therefore cannot properly display a target's dynamic track.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a digital twin intersection construction method and system based on vehicle identification and track mapping.
In a first aspect, the present invention provides a digital twin intersection construction method based on vehicle identification and trajectory mapping, including:
acquiring a monitoring video of a target direction of a target traffic intersection;
reading images in the monitoring video frame by frame;
performing target detection and target tracking on the vehicles in the image to obtain the target bounding boxes, confidence levels and vehicle types of all vehicles in the image, together with the identities of the target vehicles across adjacent video frames;
determining coordinates of the vehicle in the image;
determining geographic coordinates of the vehicle according to the coordinates of the vehicle in the image;
determining the real-time speed of the vehicle according to the geographic coordinates of the vehicle between adjacent frames;
determining a moving direction azimuth angle of the vehicle according to the geographic coordinates of the vehicle between adjacent frames to serve as a vehicle orientation;
determining the main color of an image contained in the target bounding box according to the image pixels in the target bounding box to serve as the vehicle color;
obtaining a license plate number corresponding to a vehicle;
adopting a two-dimensional and three-dimensional fusion rendering mode to construct a digital twin intersection static scene;
And adding a dynamic vehicle object in the digital twin intersection static scene according to the vehicle type, the geographic coordinates of the vehicle, the real-time speed of the vehicle, the vehicle orientation, the vehicle color and the vehicle license plate number.
Further, performing target detection and target tracking on the vehicles in the image to obtain the target bounding boxes, confidence levels and vehicle types of all vehicles in the image, together with the identities of the target vehicles across adjacent video frames, includes:

acquiring the image M_k corresponding to the k-th video frame, where M_k = [m_pq]_{w×h}; w is the image pixel width; h is the image pixel height; and m_pq is the pixel value at row p, column q of the image;

inputting the image corresponding to the k-th video frame into a YOLOv3 network, which outputs the target bounding box bbox, confidence confidence and vehicle type label of every vehicle in the image, where bbox = [bx, by, bw, bh]; bx is the abscissa of the upper-left pixel of the target bounding box; by is the ordinate of the upper-left pixel of the target bounding box; bw is the pixel width of the target bounding box; bh is the pixel height of the target bounding box; the confidence represents the agreement rate between the output vehicle type and the actual vehicle type in the target bounding box; vehicle types include car, bus and truck; all vehicle information detected in image M_k is expressed as an array over vehicles [bx_i, by_i, bw_i, bh_i, confidence_i, label_i], where bx_i is the abscissa of the upper-left pixel of the target bounding box of the i-th vehicle in image M_k; by_i is the ordinate of that pixel; bw_i is the pixel width of the i-th vehicle's target bounding box; bh_i is its pixel height; confidence_i is the agreement rate between the output vehicle type and the actual vehicle type in the i-th vehicle's target bounding box; and label_i is the output vehicle type of the i-th vehicle;

inputting all vehicle information detected in image M_k into a DeepSort model, which performs target tracking over the detection results of consecutive video frames and assigns a code id to every successfully tracked target vehicle; the DeepSort model outputs [bx_i, by_i, bw_i, bh_i, confidence_i, label_i, id_i], where id_i is the code of the i-th vehicle in image M_k.
Further, determining the coordinates of the vehicle in the image includes:

taking the midpoint of the lower border of the target bounding box corresponding to the target vehicle as the image coordinates (u_i, v_i) of the target vehicle, where u_i = bx_i + bw_i/2 and v_i = by_i - bh_i; bx_i is the abscissa of the upper-left pixel of the target bounding box of the i-th vehicle in image M_k; by_i is the ordinate of that pixel; bw_i is the pixel width of that target bounding box; and bh_i is its pixel height.
Further, determining the geographic coordinates of the vehicle according to the coordinates of the vehicle in the image includes:

taking a point coordinate (X_i, Y_i) on the geospatial plane to correspond to the coordinates (u_i, v_i) on the monitoring video plane;

and constructing the conversion expression between coordinates on the monitoring video plane and geographic coordinates:

(X_i, Y_i, 1)^T = M_t^(-1) · M_s^(-1) · H · (u_i, v_i, 1)^T

where (u_i, v_i, 1) is the homogeneous form of the image coordinates of the i-th vehicle; H is the homography matrix; M_t^(-1) is the inverse of the translation matrix M_t; M_s^(-1) is the inverse of the scaling matrix M_s; (X_i, Y_i, 1) is the homogeneous form of the point (X_i, Y_i); and T is the matrix transpose operator.
Further, determining the real-time speed of the vehicle according to the geographic coordinates of the vehicle between adjacent frames includes:

determining the real-time vehicle speed from the geographic coordinates of the vehicle in adjacent frames according to the following formulas:

Dist(t-1, t) = √((X_t - X_{t-1})² + (Y_t - Y_{t-1})²)

v_t = Dist(t-1, t) · f

where Dist(t-1, t) is the distance the vehicle moves in geographic space from the previous frame t-1 to the current frame t; X_t and Y_t are the abscissa and ordinate of the vehicle's geographic coordinates in the current frame t; X_{t-1} and Y_{t-1} are the abscissa and ordinate of the vehicle's geographic coordinates in the previous frame t-1; v_t is the speed at which the vehicle moves in geographic space from frame t-1 to frame t; and f is the video frame rate.
Further, determining the moving-direction azimuth of the vehicle from the geographic coordinates of the vehicle between adjacent frames, to serve as the vehicle orientation, includes:

calculating the moving-direction azimuth of the vehicle according to the following formula:

Azi = arctan2(X_t - X_{t-1}, Y_t - Y_{t-1})

where Azi is the moving-direction azimuth of the vehicle (measured from geographic north), and X_t, Y_t, X_{t-1}, Y_{t-1} are the abscissa and ordinate of the vehicle's geographic coordinates in the current frame t and the previous frame t-1, respectively.
Further, determining the main color of the image contained in the target bounding box from the image pixels in the target bounding box, to serve as the vehicle color, includes:

acquiring the RGB color values of the target pixels in the target bounding box;

converting the RGB color values of the target pixels in the target bounding box into HLS color values;

dividing the hues in the HLS color values sequentially into a plurality of groups, with the first target value as the interval;

calculating the hue group of each target pixel in the target bounding box according to the following formula:

group_j = ⌊h_j / α⌋

where group_j is the hue group of the j-th pixel in the target bounding box; h_j is the hue of the j-th pixel in the target bounding box; and α is the first target value;

traversing all hue groups, and taking the product of the number of the group with the most pixels and the first target value as the hue of the main color;

dividing the brightness in the HLS color values sequentially into a plurality of groups, with the second target value as the interval;

calculating the brightness group of each target pixel in the target bounding box according to the following formula:

group_j = ⌊l_j / β⌋

where group_j is the brightness group of the j-th pixel in the target bounding box; l_j is the brightness of the j-th pixel in the target bounding box; and β is the second target value;

traversing all brightness groups, and taking the product of the number of the group with the most pixels and the second target value as the brightness of the main color;

dividing the saturations in the HLS color values sequentially into a plurality of groups, with the third target value as the interval;

calculating the saturation group of each target pixel in the target bounding box according to the following formula:

group_j = ⌊s_j / γ⌋

where group_j is the saturation group of the j-th pixel in the target bounding box; s_j is the saturation of the j-th pixel in the target bounding box; and γ is the third target value;

traversing all saturation groups, and taking the product of the number of the group with the most pixels and the third target value as the saturation of the main color;

and converting the hue, brightness and saturation of the main color into the RGB color values of the main color, which serve as the RGB color values of the vehicle.
Further, constructing the digital twin intersection static scene by adopting a two-dimensional and three-dimensional fusion rendering mode includes:

rendering a two-dimensional vector base map by superimposing vector tiles under the Mapbox technology framework, the elements of the two-dimensional vector base map including water surfaces, green spaces, extruded building footprints, road surfaces, POI point labels, road line labels, boundaries and administrative regions;

rendering traffic vector elements by superimposing vector tiles under the Mapbox technology framework, the traffic vector elements including road surfaces, sidewalks, green belts, isolation barriers and road markings; the road markings include zebra crossings, lane lines, stop lines, channelization lines and turning arrows;

rendering three-dimensional traffic elements with Mapbox combined with the Threebox technology framework, loading true-scale three-dimensional models at their real point positions, the three-dimensional models including a signal lamp model, a road sign model, a street tree model and a street lamp model;

adding a camera three-dimensional model to the intersection scene at its actual geospatial position, and rotating the camera model to the camera's actual orientation;

and calculating the coverage of the camera's field of view, drawing a field-of-view surface layer over the intersection area as a vector polygon, and setting the layer semi-transparent.
Further, adding dynamic vehicle objects to the digital twin intersection static scene according to the vehicle type, the geographic coordinates of the vehicle, the real-time speed of the vehicle, the vehicle orientation, the vehicle color and the vehicle license plate number includes:

accessing the real-time video stream and mapping the video image onto the three-dimensional geographic space as a texture map using video projection, so as to fuse the video image with the three-dimensional geographic space pixel by pixel;

and using WebSocket for front-end/back-end data transmission: the front end receives the real-time vehicle data pushed by the back end, and draws and updates the vehicle models in the three-dimensional intersection scene.
In a second aspect, the present invention provides a digital twin intersection construction system based on vehicle identification and trajectory mapping, comprising:
the monitoring video acquisition module is used for acquiring a monitoring video of the target direction of the target traffic intersection;
the image reading module is used for reading images in the monitoring video frame by frame;
the vehicle detection tracking module is used for performing target detection and target tracking on the vehicles in the image to obtain the target bounding boxes, confidence levels and vehicle types of all vehicles in the image, together with the identities of the target vehicles across adjacent video frames;
The vehicle coordinate determining module is used for determining coordinates of the vehicle in the image;
the vehicle geographic coordinate determining module is used for determining geographic coordinates of the vehicle according to the coordinates of the vehicle in the image;
the vehicle speed determining module is used for determining the real-time speed of the vehicle according to the geographic coordinates of the vehicle between adjacent frames;
the vehicle orientation determining module is used for determining the moving direction azimuth angle of the vehicle according to the geographic coordinates of the vehicle between adjacent frames to serve as the vehicle orientation;
a vehicle color determination module for determining a subject color of an image contained in the target bounding box as a vehicle color according to the image pixels in the target bounding box;
the license plate number acquisition module is used for acquiring a license plate number corresponding to the vehicle;
the scene construction module is used for constructing a digital twin intersection static scene by adopting a two-dimensional and three-dimensional fusion rendering mode;
the vehicle adding module is used for adding dynamic vehicle objects in the digital twin intersection static scene according to the vehicle type, the geographic coordinates of the vehicle, the real-time speed of the vehicle, the vehicle orientation, the vehicle color and the vehicle license plate number.
The invention provides a digital twin intersection construction method and system based on vehicle identification and track mapping. In the method, a monitoring video is used as the data source: vehicle metadata is acquired by a target detection technique, vehicle tracks are acquired by a target tracking technique, and the geographic track of each vehicle is obtained through an image-to-geographic-coordinate mapping method. The vehicle information of primary interest is stored in a structured form, which reduces the data volume, compensates for the limitation that video data cannot be stored long-term, and facilitates subsequent application analysis. A digital twin intersection scene is then constructed by combining geographic video projection with three-dimensional scene visualization. Supported by the real-time vehicle tracks and vehicle metadata, a refined intersection scene is established, providing data support for optimizing intersection regulation and control schemes and thereby improving intersection throughput and road safety.
Drawings
In order to illustrate the technical solutions of the present invention more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below are merely embodiments of the present invention; other drawings can be derived from them by a person skilled in the art without inventive effort.
FIG. 1 is a flow chart of a digital twin intersection construction method based on vehicle identification and track mapping provided by an embodiment of the invention;
FIG. 2 is an exemplary diagram of a digital twin intersection scenario provided by an embodiment of the present invention;
fig. 3 is a block diagram of a digital twin intersection construction system based on vehicle identification and track mapping according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of the invention.
In an embodiment, as shown in fig. 1, an embodiment of the present invention provides a digital twin intersection construction method based on vehicle identification and trajectory mapping, including:
step 101, acquiring a monitoring video of a target direction of a target traffic intersection.
Traffic checkpoint (bayonet) cameras are generally mounted past the lane entrance of a traffic intersection, where lane lines, zebra crossings, approaching vehicles and license plate information can be captured clearly. The coverage of a single camera usually spans 2 to 3 complete lane entrances, so a matching number of cameras is deployed according to the actual number of lanes in each approach direction.
Step 102, reading images in the monitoring video frame by frame.
If more than one camera monitors the same direction and the adjacent fields of view overlap, the multiple monitoring video images are stitched. If there is only one camera, or several cameras whose fields of view do not overlap, no stitching is needed.

Image stitching is performed based on feature point matching. Adjacent monitoring images are stitched pairwise: with the camera's shooting direction as the reference, the video image on the left is recorded as the left image and the one on the right as the right image, and based on the homography principle the image from one view can be transformed into the image plane of the other. The premise for the left and right images to satisfy the homography transformation condition is that the two cameras have the same intrinsic parameters.

If two cameras monitor the same direction, the right image is transformed into the left image plane.

If more than two cameras monitor the same direction, the middle camera is selected as the reference image, and the images from the other views are transformed into the reference image plane.
The homography transformation of a specific image proceeds as follows:

1) Extract SIFT feature points and feature descriptors from the left and right images.

2) Match the feature descriptors extracted from the left and right images with a KNN algorithm to obtain matched feature point pairs.

3) Eliminate mismatched feature point pairs with the RANSAC algorithm.

4) Compute the homography matrix from the finally matched feature point pairs.

5) Apply the homography transformation and resampling to all pixels of the right image to obtain the homography-transformed right image.

The left image and the homography-transformed right image are then stitched, with left-image pixels selected where the images overlap, as sketched below.
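As an illustrative sketch of this stitching pipeline (Python with OpenCV; the file names and the 0.75 ratio-test threshold are assumed example values, not values taken from the patent):

    import cv2
    import numpy as np

    left = cv2.imread("left_view.jpg")      # hypothetical file names
    right = cv2.imread("right_view.jpg")

    # 1)-2) SIFT feature points and descriptors, matched with KNN (k=2) plus a ratio test
    sift = cv2.SIFT_create()
    kp_l, des_l = sift.detectAndCompute(left, None)
    kp_r, des_r = sift.detectAndCompute(right, None)
    matches = cv2.BFMatcher().knnMatch(des_r, des_l, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]

    # 3)-4) RANSAC inside findHomography rejects mismatched pairs and solves H
    src = np.float32([kp_r[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_l[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    # 5) Warp the right image into the left image plane and resample
    h, w = left.shape[:2]
    pano = cv2.warpPerspective(right, H, (w * 2, h))
    pano[0:h, 0:w] = left                   # prefer left-image pixels in the overlap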
Step 103, performing target detection and target tracking on the vehicles in the image to obtain the target bounding boxes, confidence levels and vehicle types of all vehicles in the image, together with the identities of the target vehicles across adjacent video frames.

The stitched and cropped video images are read frame by frame; a deep-learning target detection model identifies the vehicle objects in each monitoring image, and a tracking algorithm establishes the unique correspondence of each vehicle object between consecutive frames, thereby obtaining the vehicle's image position throughout the video.
Illustratively, acquire the image M_k corresponding to the k-th video frame, where M_k = [m_pq]_{w×h}; w is the image pixel width; h is the image pixel height; and m_pq is the pixel value at row p, column q of the image.

The task of target detection is to find the objects of interest in an image or video and detect their position and size. The image corresponding to the k-th video frame is input into a YOLOv3 network, which outputs the target bounding box bbox, confidence confidence and vehicle type label of every vehicle in the image, where bbox = [bx, by, bw, bh]; bx is the abscissa of the upper-left pixel of the target bounding box; by is the ordinate of the upper-left pixel of the target bounding box; bw is the pixel width of the target bounding box; and bh is the pixel height of the target bounding box. The confidence represents the agreement rate between the output vehicle type and the actual vehicle type in the target bounding box, with range [0, 1]. Vehicle types include car, bus and truck. All vehicle information detected in image M_k is expressed as an array over vehicles [bx_i, by_i, bw_i, bh_i, confidence_i, label_i], where bx_i is the abscissa of the upper-left pixel of the target bounding box of the i-th vehicle in image M_k; by_i is the ordinate of that pixel; bw_i is the pixel width of that bounding box; bh_i is its pixel height; confidence_i is the agreement rate between the output vehicle type and the actual vehicle type in the i-th vehicle's bounding box; and label_i is the output vehicle type of the i-th vehicle.
The target detection result only describes the vehicles in the current video frame; it contains no correspondence between the vehicles in preceding and following frames. Target tracking provides the unique correspondence of objects across video frames.

Object tracking is a technique that uses the contextual information of a video or image sequence to model an object's appearance and motion, thereby predicting its motion state and calibrating its position. The embodiment of the present invention concerns the multi-object tracking (MOT) problem, i.e., using video context to obtain the unique identities and positions of multiple target vehicles across adjacent video frames.
All vehicle information detected in image M_k is input into a DeepSort model, which performs target tracking over the detection results of consecutive video frames and assigns a code id to every successfully tracked target vehicle. The DeepSort model outputs [bx_i, by_i, bw_i, bh_i, confidence_i, label_i, id_i], where id_i is the code of the i-th vehicle in image M_k.
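A hedged sketch of this detection-and-tracking step follows. It assumes YOLOv3 Darknet weights loaded through OpenCV's DNN module and uses the third-party deep-sort-realtime package as a stand-in for the DeepSort model; the file paths, the 0.5 confidence threshold and the COCO class mapping are assumptions of this sketch, not specifics of the patent:

    import cv2
    from deep_sort_realtime.deepsort_tracker import DeepSort

    VEHICLE_LABELS = {2: "car", 5: "bus", 7: "truck"}   # COCO ids for the three vehicle types

    net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")  # hypothetical paths
    out_names = net.getUnconnectedOutLayersNames()
    tracker = DeepSort(max_age=30)

    def detect_and_track(frame):
        h, w = frame.shape[:2]
        blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
        net.setInput(blob)
        detections = []
        for output in net.forward(out_names):
            for row in output:
                scores = row[5:]
                cls = int(scores.argmax())
                conf = float(scores[cls])
                if cls in VEHICLE_LABELS and conf > 0.5:
                    cx, cy, bw, bh = row[0] * w, row[1] * h, row[2] * w, row[3] * h
                    bx, by = cx - bw / 2, cy - bh / 2   # upper-left corner: bbox = [bx, by, bw, bh]
                    detections.append(([bx, by, bw, bh], conf, VEHICLE_LABELS[cls]))
        # DeepSort keeps a persistent id for each vehicle across consecutive frames
        tracks = tracker.update_tracks(detections, frame=frame)
        return [(t.to_ltwh(), t.det_conf, t.det_class, t.track_id)
                for t in tracks if t.is_confirmed()]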
Step 104, determining coordinates of the vehicle in the image.
The target tracking result makes it possible to obtain the vehicle's position in the image. The midpoint of the lower border of the target bounding box of the target vehicle (the i-th vehicle) is taken as its image coordinates (u_i, v_i), where u_i = bx_i + bw_i/2 and v_i = by_i - bh_i; bx_i and by_i are the abscissa and ordinate of the upper-left pixel of the i-th vehicle's target bounding box in image M_k, and bw_i and bh_i are its pixel width and height.
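As a one-line illustration, the anchor point defined above can be read straight off the bounding box (a sketch; the bbox order follows bbox = [bx, by, bw, bh]):

    def anchor_point(bbox):
        bx, by, bw, bh = bbox
        return bx + bw / 2.0, by - bh   # u_i = bx_i + bw_i / 2, v_i = by_i - bh_i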
Step 105, determining geographic coordinates of the vehicle according to the coordinates of the vehicle in the image.
Image coordinates can only describe the vehicle's relative position on the image plane; they reveal nothing about its spatial position in the real world, cannot be expressed jointly with a three-dimensional map, and cannot support spatial analysis. The image pixel coordinates of each vehicle object must therefore be converted into geospatial coordinates. An orthophoto is used as the geospatial reference, and the conversion from image coordinates to geographic coordinates is realized by establishing a homography between the video frame and the orthophoto.

The monitoring video frame images real geographic space, and the embodiment of the invention represents that space with an orthophoto. To minimize errors caused by map projection, the orthophoto uses a local projected coordinate system; to minimize the error in selecting homonymous control points, its resolution should be no coarser than 0.2 m. Homonymous control points should lie as close to the edges of the video image as possible, and at least four non-collinear pairs are required; an example selection is shown in Table 1.
Table 1 Example of homonymous control point selection
Because geographic coordinates usually have large values while image coordinates span a small range, using raw geographic coordinates directly in the matrix computation produces large errors; the geographic coordinates are therefore offset and scaled so that the processed values are of the same magnitude as the image coordinates.
A point coordinate (X_i, Y_i) on the geospatial plane is taken to correspond to the coordinates (u_i, v_i) on the monitoring video plane.

The conversion expression between coordinates on the monitoring video plane and geographic coordinates is constructed as:

(X_i, Y_i, 1)^T = M_t^(-1) · M_s^(-1) · H · (u_i, v_i, 1)^T

where (u_i, v_i, 1) is the homogeneous form of the image coordinates of the i-th vehicle; H is the homography matrix; M_t^(-1) is the inverse of the translation matrix M_t; M_s^(-1) is the inverse of the scaling matrix M_s; (X_i, Y_i, 1) is the homogeneous form of the original geographic coordinates of the homonymous control point; and T is the matrix transpose operator. The homography matrix H can be solved by SVD matrix decomposition from four pairs of homonymous control points.
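A compact sketch of the whole conversion, including the offset (M_t) and scaling (M_s) treatment; the control-point coordinates below are made-up placeholders, and cv2.findHomography performs the least-squares/SVD solve:

    import numpy as np
    import cv2

    img_pts = np.array([[120, 660], [1790, 640], [300, 380], [1600, 370]], dtype=np.float64)
    geo_pts = np.array([[51234.1, 33456.2], [51288.7, 33454.9],
                        [51240.3, 33510.5], [51285.0, 33508.8]], dtype=np.float64)

    # Offset (translation M_t) and scale (M_s) the geographic coordinates so they
    # are of the same magnitude as the image coordinates
    offset = geo_pts.min(axis=0)
    scale = 1000.0 / (geo_pts - offset).max()
    work_pts = (geo_pts - offset) * scale

    H, _ = cv2.findHomography(img_pts, work_pts)   # image plane -> processed geo plane

    def image_to_geo(u, v):
        x, y, s = H @ np.array([u, v, 1.0])        # homogeneous image coordinates
        x, y = x / s, y / s                        # normalise the homogeneous scale
        return x / scale + offset[0], y / scale + offset[1]   # undo M_s, then M_t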
Step 106, determining the real-time speed of the vehicle according to the geographic coordinates of the vehicle between adjacent frames.
Illustratively, the real-time vehicle speed is determined from the geographic coordinates of the vehicle in adjacent frames according to the following formulas:

Dist(t-1, t) = √((X_t - X_{t-1})² + (Y_t - Y_{t-1})²)

v_t = Dist(t-1, t) · f

where Dist(t-1, t) is the distance the vehicle moves in geographic space from the previous frame t-1 to the current frame t; X_t and Y_t are the abscissa and ordinate of the vehicle's geographic coordinates in the current frame t; X_{t-1} and Y_{t-1} are those in the previous frame t-1; v_t is the speed at which the vehicle moves in geographic space from frame t-1 to frame t; and f is the video frame rate.
Step 107, determining the azimuth angle of the moving direction of the vehicle as the vehicle orientation according to the geographic coordinates of the vehicle between the adjacent frames.
Illustratively, the moving-direction azimuth of the vehicle is calculated according to the following formula:

Azi = arctan2(X_t - X_{t-1}, Y_t - Y_{t-1})

where Azi is the moving-direction azimuth of the vehicle, and X_t, Y_t, X_{t-1}, Y_{t-1} are as defined above.
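Both quantities can be computed together from consecutive geographic positions, as in the sketch below (the 25 fps frame rate is an assumed example; atan2 over the coordinate differences gives an azimuth measured clockwise from geographic north):

    import math

    def speed_and_azimuth(prev_xy, cur_xy, f=25.0):
        dx = cur_xy[0] - prev_xy[0]
        dy = cur_xy[1] - prev_xy[1]
        dist = math.hypot(dx, dy)       # Dist(t-1, t), in metres for metric coordinates
        speed = dist * f                # v_t: distance per frame interval (1/f seconds)
        azimuth = math.degrees(math.atan2(dx, dy)) % 360.0
        return speed, azimuth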
Step 108, determining the main color of the image contained in the target bounding box as the vehicle color according to the image pixels in the target bounding box.
Illustratively, this step includes acquiring the RGB color values of the target pixels in the target bounding box.

The RGB color values of the target pixels in the target bounding box are converted into HLS color values; the conversion can be implemented with cv2.cvtColor and the cv2.COLOR_BGR2HLS flag from the OpenCV image library.
Hues in the HLS color values are divided into a plurality of groups sequentially at intervals of a first target value.
The hue group of each target pixel in the target bounding box is calculated according to the following formula:

group_j = ⌊h_j / α⌋

where group_j is the hue group of the j-th pixel in the target bounding box; h_j is the hue of the j-th pixel in the target bounding box; and α is the first target value. The hue ranges over [0, 360] degrees. In this embodiment every interval of 10 degrees is taken as one numbered group, i.e. α = 10, giving 36 groups in turn.
Traversing all hue groups, and taking the product of the number of the group with the largest pixel number and the first target value as the hue of the main color.
The brightness in the HLS color values is divided into a plurality of groups sequentially with the second target value as an interval.
The brightness group of each target pixel in the target bounding box is calculated according to the following formula:

group_j = ⌊l_j / β⌋

where group_j is the brightness group of the j-th pixel in the target bounding box; l_j is the brightness of the j-th pixel in the target bounding box; and β is the second target value. The brightness ranges over [0, 1]. In this embodiment every interval of 0.1 is taken as one numbered group, i.e. β = 0.1, giving 10 groups in turn.
Traversing all brightness groups, taking the product of the number of the group with the largest pixel number and the second target value as the brightness of the main color.
The saturation in the HLS color values is divided into a plurality of groups sequentially with the third target value as an interval.
The saturation group of each target pixel in the target bounding box is calculated according to the following formula:

group_j = ⌊s_j / γ⌋

where group_j is the saturation group of the j-th pixel in the target bounding box; s_j is the saturation of the j-th pixel in the target bounding box; and γ is the third target value. The saturation ranges over [0, 1]. In this embodiment every interval of 0.1 is taken as one numbered group, i.e. γ = 0.1, giving 10 groups in turn.
Traversing all saturation groups, and taking the product of the number of the group with the largest pixel number and the third target value as the saturation of the main color.
The hue, brightness and saturation of the main color are converted into the RGB color values of the main color, which serve as the RGB color values of the vehicle. The conversion can be implemented with cv2.cvtColor and the cv2.COLOR_HLS2BGR flag from the OpenCV image library.
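A sketch of the whole dominant-color step under the grouping intervals above (alpha = 10, beta = gamma = 0.1); the handling of OpenCV's 8-bit HLS scaling (H stored as 0..180, L and S as 0..255) is an implementation detail of this sketch:

    import numpy as np
    import cv2

    def dominant_color(frame_bgr, bbox, alpha=10, beta=0.1, gamma=0.1):
        bx, by, bw, bh = [int(v) for v in bbox]
        patch = frame_bgr[by:by + bh, bx:bx + bw]
        hls = cv2.cvtColor(patch, cv2.COLOR_BGR2HLS).reshape(-1, 3).astype(np.float64)
        hue = hls[:, 0] * 2.0       # back to the 0..360 degree range
        lum = hls[:, 1] / 255.0     # brightness to 0..1
        sat = hls[:, 2] / 255.0     # saturation to 0..1

        def mode_of_groups(values, step):
            groups = np.floor(values / step).astype(int)   # group number per pixel
            best = int(np.bincount(groups).argmax())       # group with the most pixels
            return best * step                             # group number times interval

        h = mode_of_groups(hue, alpha)
        l = mode_of_groups(lum, beta)
        s = mode_of_groups(sat, gamma)
        pix = np.uint8([[[h / 2.0, l * 255.0, s * 255.0]]])  # one-pixel HLS image
        return cv2.cvtColor(pix, cv2.COLOR_HLS2BGR)[0, 0]    # main color of the vehicle (BGR)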
Step 109, obtaining the license plate number corresponding to the vehicle.
License plate information is recognized with the open-source framework HyperLPR. The specific flow is to input the vehicle image inside the vehicle bounding box into HyperLPR, which outputs the license plate information as a character string.
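A short sketch of this step, assuming the hyperlpr Python package and its HyperLPR_plate_recognition entry point:

    from hyperlpr import HyperLPR_plate_recognition

    def plate_of(frame_bgr, bbox):
        bx, by, bw, bh = [int(v) for v in bbox]
        crop = frame_bgr[by:by + bh, bx:bx + bw]       # vehicle sub-image
        results = HyperLPR_plate_recognition(crop)     # [[plate, confidence, box], ...]
        return results[0][0] if results else None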
Step 1010, constructing a digital twin intersection static scene by adopting a two-dimensional and three-dimensional fusion rendering mode.
In this step, as shown in fig. 2, a two-dimensional vector base map is rendered by superimposing vector tiles under the Mapbox technology framework; the elements of the base map include water surfaces, green spaces, extruded building footprints, road surfaces, POI point labels, road line labels, boundaries and administrative regions.
Traffic vector elements are rendered by superimposing vector tiles under the Mapbox technology framework; the traffic vector elements include road surfaces, sidewalks, green belts, isolation barriers and road markings, the road markings comprising zebra crossings, lane lines, stop lines, channelization lines and turning arrows.

Three-dimensional traffic elements are rendered with Mapbox combined with the Threebox technology framework, loading true-scale three-dimensional models at their real point positions; the three-dimensional models include a signal lamp model, a road sign model, a street tree model and a street lamp model.

A camera three-dimensional model is added to the intersection scene at its actual geospatial position and rotated to the camera's actual orientation.

The coverage of the camera's field of view is calculated, a field-of-view surface layer is drawn over the intersection area as a vector polygon, and the layer is set semi-transparent.
The camera's field of view, i.e. the geographic extent covered by the frame, can be represented by the quadrilateral enclosed by the geospatial positions corresponding to the four corner points of the video image. The geographic coordinates of the view corners are obtained by converting the corner image coordinates to geographic coordinates: for a frame of pixel width w and pixel height h, the four corner image coordinates are upper left [0, 0], lower left [0, h], lower right [w, h] and upper right [w, 0].

Since the frame may include a portion above the horizon, whose ground-plane geospatial coordinates are effectively at infinity and hard to express in three-dimensional space, the image corner points may be adjusted so that only points on the ground are chosen as view corners, for example upper left [0, hc], lower left [0, h], lower right [w, h] and upper right [w, hc], where hc is the pixel distance from the top of the image to a row just below the horizon.
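Reusing the image_to_geo mapping sketched earlier, the field-of-view polygon can be computed as follows (the frame size and the horizon cutoff hc are assumed example values):

    w, h, hc = 1920, 1080, 430                           # frame size and horizon cutoff

    corner_pixels = [(0, hc), (0, h), (w, h), (w, hc)]   # ground points only
    fov_polygon = [image_to_geo(u, v) for (u, v) in corner_pixels]
    # fov_polygon is the semi-transparent vector surface drawn over the intersection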
Step 1011, adding dynamic vehicle objects to the digital twin intersection static scene according to the vehicle type, the geographic coordinates of the vehicle, the real-time speed of the vehicle, the vehicle orientation, the vehicle color and the vehicle license plate number.
In the three-dimensional intersection scene, the real-time video stream is overlaid and rendered: the stream is accessed and the video image is mapped onto the three-dimensional geographic space as a texture using video projection, fusing the video image with the three-dimensional geographic space pixel by pixel, enhancing the realism of the scene, and cross-validating against the dynamic vehicle tracks, thereby building an interactive, interconnected digital twin intersection.

The video projection principle is, based on the camera pinhole imaging model, to construct in the three-dimensional scene a virtual camera consistent with the real camera's geographic position, viewing angle and imaging process, and to map the real camera's image as a texture into the virtual camera's rendered image, achieving the effect of pixel-by-pixel fusion of the monitoring video with the three-dimensional scene.

The parameters the virtual camera needs are obtained by camera calibration, which comprises intrinsic and extrinsic calibration; the intrinsic calibration can use the checkerboard method, and the extrinsics are computed with the EPnP algorithm.

Virtual camera setup and rendering are implemented on WebGL: camera imaging is simulated by computing the camera view matrix and the perspective projection matrix, and pixel-by-pixel texture mapping is implemented in the GPU fragment shader.
WebSocket is used for front-end/back-end data transmission: the front end receives the real-time vehicle data pushed by the back end, and draws and updates the vehicle models in the three-dimensional intersection scene.
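A minimal back-end sketch of this push channel, assuming a recent version of the third-party websockets package; the message shape, the ~25 Hz push rate and the read_latest_vehicle_records accessor are hypothetical choices of this sketch:

    import asyncio
    import json
    import websockets

    def read_latest_vehicle_records():
        # hypothetical accessor: returns the latest structured vehicle records
        # (type, position, speed, heading, color, plate) from the tracking pipeline
        return []

    async def push_vehicles(websocket):
        while True:
            await websocket.send(json.dumps(read_latest_vehicle_records()))
            await asyncio.sleep(0.04)                   # ~25 Hz, matching the video frame rate

    async def main():
        async with websockets.serve(push_vehicles, "0.0.0.0", 8765):
            await asyncio.Future()                      # run forever

    if __name__ == "__main__":
        asyncio.run(main())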
In this embodiment, a monitoring video is used as the data source: vehicle metadata is acquired by a target detection technique, vehicle tracks are acquired by a target tracking technique, and the geographic track of each vehicle is obtained through the image-to-geographic-coordinate mapping method. The vehicle information of primary interest is stored in a structured form, which reduces the data volume, compensates for the limitation that video data cannot be stored long-term, and facilitates subsequent application analysis. The digital twin intersection scene is constructed by combining geographic video projection with three-dimensional scene visualization. Supported by the real-time vehicle tracks and vehicle metadata, a refined intersection scene is established, providing data support for optimizing intersection regulation and control schemes and thereby improving intersection throughput and road safety.
Based on the same inventive concept, an embodiment of the invention further provides a digital twin intersection construction system based on vehicle identification and track mapping. Since the principle by which the system solves the problem is similar to that of the digital twin intersection construction method above, the implementation of the system can refer to the implementation of the method, and repeated description is omitted.
In another embodiment, a digital twin intersection construction system based on vehicle identification and trajectory mapping provided in an embodiment of the present invention, as shown in fig. 3, includes:
the monitoring video acquisition module 10 is used for acquiring a monitoring video of a target direction of a target traffic intersection.
The image reading module 20 is configured to read the image in the surveillance video frame by frame.
The vehicle detection tracking module 30 is configured to perform target detection and target tracking on the vehicles in the image to obtain the target bounding boxes, confidence levels and vehicle types of all vehicles in the image, together with the identities of the target vehicles across adjacent video frames.
The vehicle coordinate determination module 40 is configured to determine coordinates of the vehicle in the image.
The vehicle geographic coordinate determination module 50 is configured to determine geographic coordinates of the vehicle according to coordinates of the vehicle in the image.
The vehicle speed determining module 60 is configured to determine a real-time vehicle speed of the vehicle according to geographic coordinates of the vehicle between adjacent frames.
The vehicle orientation determining module 70 is configured to determine a moving direction azimuth of the vehicle as a vehicle orientation according to geographic coordinates of the vehicle between adjacent frames.
The vehicle color determining module 80 is configured to determine, as a vehicle color, a subject color of an image included in the target bounding box according to the image pixels in the target bounding box.
The license plate number acquisition module 90 is configured to acquire a license plate number corresponding to a vehicle.
The scene construction module 100 is configured to construct a digital twin intersection static scene by adopting a two-dimensional and three-dimensional fusion rendering mode.
The vehicle adding module 110 is configured to add a dynamic vehicle object in the digital twin intersection static scene according to the vehicle type, the geographic coordinates of the vehicle, the real-time speed of the vehicle, the vehicle orientation, the vehicle color and the license plate number of the vehicle.
For more specific working procedures of the above modules, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and no further description is given here.
In another embodiment, the invention provides a computer device comprising a processor and a memory; when the processor executes a computer program stored in the memory, it implements the digital twin intersection construction method based on vehicle identification and track mapping described above.
For more specific procedures of the above method, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and no further description is given here.
In another embodiment, the present invention provides a computer-readable storage medium storing a computer program; when executed by a processor, the computer program implements the steps of the digital twin intersection construction method based on vehicle identification and track mapping described above.
For more specific procedures of the above method, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and no further description is given here.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other. For the system, apparatus and storage medium disclosed in the embodiments, since it corresponds to the method disclosed in the embodiments, the description is relatively simple, and the relevant points refer to the description of the method section.
It will be apparent to those skilled in the art that the techniques of the embodiments of the present invention may be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk or an optical disc) and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute the method described in all or part of the embodiments of the present invention.
The invention has been described in detail in connection with specific embodiments and exemplary examples, but this description is not to be construed as limiting the invention. Those skilled in the art will understand that various equivalent substitutions, modifications or improvements may be made to the technical solution of the invention and its embodiments without departing from the spirit and scope of the invention, and all of these fall within the scope of the invention. The scope of the invention is defined by the appended claims.

Claims (10)

1. A digital twin intersection construction method based on vehicle identification and track mapping, characterized by comprising the following steps:
acquiring a monitoring video of a target direction of a target traffic intersection;
reading images in the monitoring video frame by frame;
performing target detection and target tracking on the vehicles in the image to obtain, for every vehicle in the image, a target bounding box, a confidence level, a vehicle type and an identity of the target vehicle across adjacent video frames;
determining coordinates of the vehicle in the image;
determining geographic coordinates of the vehicle according to the coordinates of the vehicle in the image;
determining the real-time speed of the vehicle according to the geographic coordinates of the vehicle between adjacent frames;
determining a moving direction azimuth angle of the vehicle according to the geographic coordinates of the vehicle between adjacent frames to serve as a vehicle orientation;
determining the main color of the image contained in the target bounding box according to the image pixels in the target bounding box, to serve as the vehicle color;
obtaining a license plate number corresponding to the vehicle;
adopting a two-dimensional and three-dimensional fusion rendering mode to construct a digital twin intersection static scene;
and adding a dynamic vehicle object in the digital twin intersection static scene according to the vehicle type, the geographic coordinates of the vehicle, the real-time speed of the vehicle, the vehicle orientation, the vehicle color and the vehicle license plate number.
2. The digital twin intersection construction method based on vehicle identification and track mapping according to claim 1, wherein performing target detection and target tracking on the vehicles in the image to obtain, for every vehicle in the image, a target bounding box, a confidence level, a vehicle type and an identity of the target vehicle across adjacent video frames comprises:
acquiring the image $M_k$ corresponding to the $k$-th video frame, where $M_k = [m_{pq}]_{h \times w}$; $w$ is the image pixel width; $h$ is the image pixel height; and $m_{pq}$ is the pixel value at row $p$, column $q$ of the image;
inputting the image corresponding to the $k$-th video frame into a YOLOv3 network to output the target bounding box $bbox$, confidence $confidence$ and vehicle type $label$ of every vehicle in the image, where $bbox = [bx, by, bw, bh]$; $bx$ is the abscissa of the upper-left corner pixel of the target bounding box; $by$ is the ordinate of the upper-left corner pixel of the target bounding box; $bw$ is the pixel width of the target bounding box; $bh$ is the pixel height of the target bounding box; $confidence$ represents the rate of agreement between the output vehicle type and the actual vehicle type within the target bounding box; the vehicle types include car, bus and truck; all the vehicle information detected in image $M_k$ is expressed as an array of vehicles whose $i$-th element is $(bx_i, by_i, bw_i, bh_i, confidence_i, label_i)$, where $bx_i$ is the abscissa of the upper-left corner pixel of the target bounding box of the $i$-th vehicle in image $M_k$; $by_i$ is the ordinate of the upper-left corner pixel of the target bounding box of the $i$-th vehicle in image $M_k$; $bw_i$ is the pixel width of the target bounding box of the $i$-th vehicle in image $M_k$; $bh_i$ is the pixel height of the target bounding box of the $i$-th vehicle in image $M_k$; $confidence_i$ is the rate of agreement between the output vehicle type and the actual vehicle type within the target bounding box of the $i$-th vehicle in image $M_k$; and $label_i$ is the output vehicle type of the $i$-th vehicle in image $M_k$;
inputting all the vehicle information detected in image $M_k$ into a DeepSort model to perform target tracking on the target detection results of consecutive video frames, and assigning an identity code $id$ to each successfully tracked target vehicle; the DeepSort model outputs, for the $i$-th vehicle, the result $(bx_i, by_i, bw_i, bh_i, confidence_i, label_i, id_i)$, where $id_i$ is the identity code of the $i$-th vehicle in image $M_k$.
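For illustration, one way this step could be realized is sketched below: a YOLOv3 detector loaded through OpenCV's DNN module from standard Darknet files, paired with the open-source deep_sort_realtime package as a stand-in for the claimed DeepSort model. The file names, the 0.5 confidence threshold and the COCO class mapping are assumptions, not part of the disclosure.

```python
# Sketch of YOLOv3 detection + DeepSort tracking (assumed third-party stand-ins).
import cv2
import numpy as np
from deep_sort_realtime.deepsort_tracker import DeepSort

VEHICLE_CLASSES = {2: "car", 5: "bus", 7: "truck"}    # COCO ids for the claimed vehicle types

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")   # assumed file names
out_names = net.getUnconnectedOutLayersNames()
tracker = DeepSort(max_age=30)            # assigns persistent ids across frames

def detect_and_track(frame):
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    detections = []
    for out in net.forward(out_names):
        for row in out:                   # row = [cx, cy, bw, bh, objectness, class scores...]
            scores = row[5:]
            cls = int(np.argmax(scores))
            conf = float(row[4] * scores[cls])
            if cls in VEHICLE_CLASSES and conf > 0.5:
                bw, bh = row[2] * w, row[3] * h
                bx, by = row[0] * w - bw / 2, row[1] * h - bh / 2   # upper-left corner (bx, by)
                detections.append(([bx, by, bw, bh], conf, VEHICLE_CLASSES[cls]))
    tracks = tracker.update_tracks(detections, frame=frame)
    return [t for t in tracks if t.is_confirmed()]      # each confirmed track carries track_id
```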
3. The digital twin intersection construction method based on vehicle identification and track mapping according to claim 1, wherein determining the coordinates of the vehicle in the image comprises:
taking the midpoint of the lower border of the target bounding box corresponding to the target vehicle as the image coordinates $(u_i, v_i)$ of the target vehicle, where $u_i = bx_i + bw_i/2$; $v_i = by_i - bh_i$; $bx_i$ is the abscissa of the upper-left corner pixel of the target bounding box of the $i$-th vehicle in image $M_k$; $by_i$ is the ordinate of the upper-left corner pixel of the target bounding box of the $i$-th vehicle in image $M_k$; $bw_i$ is the pixel width of the target bounding box of the $i$-th vehicle in image $M_k$; and $bh_i$ is the pixel height of the target bounding box of the $i$-th vehicle in image $M_k$.
4. The digital twin intersection construction method based on vehicle identification and track mapping according to claim 3, wherein determining the geographic coordinates of the vehicle from the coordinates of the vehicle in the image comprises:
taking a point coordinate $(X_i, Y_i)$ on the geospatial plane to correspond to the coordinates $(u_i, v_i)$ on the monitoring video plane;
constructing the conversion expression between the coordinates on the monitoring video plane and the geographic coordinates:

$(X_i, Y_i, 1)^T = M_t^{-1} \cdot M_s^{-1} \cdot H \cdot (u_i, v_i, 1)^T$

where $(u_i, v_i, 1)$ are the homogeneous image coordinates of the $i$-th vehicle; $H$ is a homography matrix; $M_t^{-1}$ is the inverse of the translation matrix $M_t$; $M_s^{-1}$ is the inverse of the scaling matrix $M_s$; $(X_i, Y_i, 1)$ are the homogeneous coordinates of the point $(X_i, Y_i)$; and $T$ is the matrix transpose operator.
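For illustration, the image-coordinate step of claim 3 and the plane conversion of this claim can be approximated with OpenCV's homography utilities. Here the claimed chain of $H$, $M_t^{-1}$ and $M_s^{-1}$ is folded into a single homography estimated once from pixel/geographic control-point pairs; the control-point values are made up, and a top-left pixel origin is assumed (under which the lower-border midpoint is $by + bh$, whereas the claim's $by - bh$ corresponds to a bottom-left convention).

```python
# Sketch: bounding box -> bottom-center image point -> geographic coordinates.
import cv2
import numpy as np

# Made-up control points: four pixel positions and their geographic counterparts
# in an assumed projected planar coordinate system (metres).
pixel_pts = np.float32([[102, 540], [1180, 562], [640, 210], [60, 260]])
geo_pts   = np.float32([[520104.2, 3458221.7], [520131.9, 3458219.4],
                        [520120.6, 3458288.0], [520101.8, 3458270.5]])
H, _ = cv2.findHomography(pixel_pts, geo_pts)     # folds H, M_t^-1, M_s^-1 into one matrix

def image_coords(bbox):
    """Midpoint of the lower border of bbox = [bx, by, bw, bh] (claim 3)."""
    bx, by, bw, bh = bbox
    return bx + bw / 2.0, by + bh                 # top-left pixel origin assumed

def to_geographic(u, v):
    """Map one image point to the geospatial plane (claim 4)."""
    pt = np.float32([[[u, v]]])                   # shape (1, 1, 2) as required
    X, Y = cv2.perspectiveTransform(pt, H)[0, 0]
    return float(X), float(Y)
```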
5. The digital twin intersection construction method based on vehicle identification and track mapping according to claim 1, wherein determining the real-time speed of the vehicle according to the geographic coordinates of the vehicle between adjacent frames comprises:
determining the real-time speed of the vehicle from the geographic coordinates of the vehicle between adjacent frames according to the following formulas:

$Dist_{(t-1,t)} = \sqrt{(X_t - X_{t-1})^2 + (Y_t - Y_{t-1})^2}$

$v_t = Dist_{(t-1,t)} \cdot f$

where $Dist_{(t-1,t)}$ is the distance the vehicle moves in geographic space from the previous frame $t-1$ to the current frame $t$; $X_t$ is the abscissa of the geographic coordinates of the vehicle in the current frame $t$; $X_{t-1}$ is the abscissa of the geographic coordinates of the vehicle in the previous frame $t-1$; $Y_t$ is the ordinate of the geographic coordinates of the vehicle in the current frame $t$; $Y_{t-1}$ is the ordinate of the geographic coordinates of the vehicle in the previous frame $t-1$; $v_t$ is the speed at which the vehicle moves in geographic space from the previous frame $t-1$ to the current frame $t$; and $f$ is the video frame rate.
6. The digital twin intersection construction method based on vehicle identification and track mapping according to claim 1, wherein determining the moving direction azimuth of the vehicle according to the geographic coordinates of the vehicle between adjacent frames, as the vehicle orientation, comprises:
calculating the moving direction azimuth of the vehicle according to the following formula:

$Azi = \operatorname{atan2}(X_t - X_{t-1},\ Y_t - Y_{t-1})$

where $Azi$ is the moving direction azimuth of the vehicle; $X_t$ is the abscissa of the geographic coordinates of the vehicle in the current frame $t$; $X_{t-1}$ is the abscissa of the geographic coordinates of the vehicle in the previous frame $t-1$; $Y_t$ is the ordinate of the geographic coordinates of the vehicle in the current frame $t$; and $Y_{t-1}$ is the ordinate of the geographic coordinates of the vehicle in the previous frame $t-1$.
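For illustration, claims 5 and 6 reduce to a few lines, assuming the geographic coordinates are in a projected planar system measured in metres and reading the azimuth as the clockwise angle from north (the +Y axis); both assumptions go beyond what the claims state.

```python
# Sketch of claims 5 and 6: adjacent-frame speed and moving-direction azimuth.
import math

def real_time_speed(prev_xy, cur_xy, f):
    """v_t = Dist(t-1, t) * f; adjacent frames are 1/f seconds apart."""
    dist = math.hypot(cur_xy[0] - prev_xy[0], cur_xy[1] - prev_xy[1])
    return dist * f                                # metres per second

def azimuth(prev_xy, cur_xy):
    """Azi in degrees, measured clockwise from north (assumed convention)."""
    dx = cur_xy[0] - prev_xy[0]
    dy = cur_xy[1] - prev_xy[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0
```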
7. The digital twin intersection construction method based on vehicle identification and track mapping according to claim 1, wherein determining the main color of the image contained in the target bounding box from the image pixels in the target bounding box, as the vehicle color, comprises:
acquiring RGB color values of target pixels in a target bounding box;
converting RGB color values of target pixels in the target bounding box into HLS color values;
dividing hues in the HLS color values into a plurality of groups sequentially with the first target value as an interval;
calculating the hue group of each target pixel in the target bounding box according to the following formula:

$G^{H}_{j} = \lfloor H_j / \alpha \rfloor$

where $G^{H}_{j}$ is the hue group of the $j$-th pixel in the target bounding box; $H_j$ is the hue of the $j$-th pixel in the target bounding box; and $\alpha$ is the first target value;
traversing all hue groups, and taking the product of the index of the group containing the largest number of pixels and the first target value as the hue of the main color;
dividing the brightness in the HLS color values into a plurality of groups sequentially with the second target value as an interval;
calculating the brightness group of each target pixel in the target bounding box according to the following formula:

$G^{L}_{j} = \lfloor L_j / \beta \rfloor$

where $G^{L}_{j}$ is the brightness group of the $j$-th pixel in the target bounding box; $L_j$ is the brightness of the $j$-th pixel in the target bounding box; and $\beta$ is the second target value;
traversing all brightness groups, and taking the product of the index of the group containing the largest number of pixels and the second target value as the brightness of the main color;
dividing the saturation in the HLS color value into a plurality of groups sequentially with the third target value as an interval;
calculating the saturation group of each target pixel in the target bounding box according to the following formula:

$G^{S}_{j} = \lfloor S_j / \gamma \rfloor$

where $G^{S}_{j}$ is the saturation group of the $j$-th pixel in the target bounding box; $S_j$ is the saturation of the $j$-th pixel in the target bounding box; and $\gamma$ is the third target value;
traversing all saturation groups, and taking the product of the index of the group containing the largest number of pixels and the third target value as the saturation of the main color;
converting the hue, brightness and saturation of the main color into the RGB color values of the main color, which serve as the RGB color values of the vehicle.
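For illustration, the binning procedure of this claim might look as follows with OpenCV, which stores 8-bit HLS with hue in [0, 180) and lightness/saturation in [0, 256); the interval values α = 10, β = 16, γ = 16 are arbitrary choices, since the claim leaves the three target values open.

```python
# Sketch of claim 7: main color of the bounding-box region via HLS binning.
import cv2
import numpy as np

def dominant_color(frame, bbox, alpha=10, beta=16, gamma=16):
    bx, by, bw, bh = (int(v) for v in bbox)
    crop = frame[by:by + bh, bx:bx + bw]                  # pixels inside the target bounding box
    hls = cv2.cvtColor(crop, cv2.COLOR_BGR2HLS).reshape(-1, 3)

    def main_value(channel, step):
        groups = channel // step                          # group index = floor(value / step)
        counts = np.bincount(groups)                      # pixel count per group
        return int(np.argmax(counts)) * step              # index of fullest group * interval

    h = main_value(hls[:, 0], alpha)                      # hue of the main color
    l = main_value(hls[:, 1], beta)                       # brightness (lightness) of the main color
    s = main_value(hls[:, 2], gamma)                      # saturation of the main color
    rgb = cv2.cvtColor(np.uint8([[[h, l, s]]]), cv2.COLOR_HLS2RGB)[0, 0]
    return tuple(int(c) for c in rgb)                     # RGB color values of the vehicle
```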
8. The method for constructing a digital twin intersection based on vehicle identification and track mapping according to claim 1, wherein the method for constructing a digital twin intersection static scene by adopting a two-dimensional and three-dimensional fusion rendering mode comprises the following steps:
superimposing vector tiles under the Mapbox technology framework to render a two-dimensional vector base map, the elements of which include water-system surfaces, green-space surfaces, extruded building surfaces, road surfaces, POI point markers, road line markers, boundaries and administrative areas;
superimposing vector tiles under the Mapbox technology framework to render traffic vector elements, which include road surfaces, sidewalks, green belts, isolation barriers and road markings; the road markings include zebra crossings, lane lines, stop lines, channelizing lines and turning arrows;
rendering three-dimensional traffic elements by combining Mapbox with the threebox technical framework, and loading three-dimensional models at their real point positions and at true scale, the three-dimensional models including a signal lamp model, a road sign model, a street tree model and a street lamp model;
adding a three-dimensional model of the surveillance camera into the intersection scene according to its actual geospatial position, and rotating the model according to the camera's actual orientation;
and calculating the coverage of the camera's field of view, drawing a field-of-view surface layer superimposed on the intersection area as a vector surface, and setting the layer to be semi-transparent.
9. The method for constructing a digital twin intersection based on vehicle identification and trajectory mapping according to claim 1, wherein adding a dynamic vehicle object in a static scene of the digital twin intersection according to a vehicle type, a geographic coordinate of the vehicle, a real-time speed of the vehicle, a vehicle orientation, a vehicle color and a license plate number of the vehicle comprises:
accessing a real-time video stream, and mapping the video image onto the three-dimensional geographic space as a texture map by means of video projection, so as to fuse the video image with the three-dimensional geographic space pixel by pixel;
and using the WebSocket technology for front-end/back-end data transmission, the front end receiving the real-time vehicle data pushed by the back end and drawing and updating the vehicle models in the three-dimensional intersection scene.
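For illustration, the back-end half of this data channel might be sketched with the Python websockets package (version 10.1 or later, where a single-argument handler is accepted); the JSON message schema and the 25 fps push cadence are assumptions.

```python
# Minimal back-end sketch for the WebSocket push of real-time vehicle data.
import asyncio
import json
import websockets

async def vehicle_feed(websocket):
    while True:
        # In the real system this state would come from the tracking pipeline;
        # the values below are a made-up example of one vehicle per message.
        msg = {
            "id": 17, "label": "car",
            "x": 520120.6, "y": 3458288.0,      # geographic coordinates
            "speed": 8.3,                       # m/s
            "azimuth": 145.0,                   # degrees clockwise from north
            "color": [30, 30, 120],             # RGB
            "plate": "ABC1234",                 # placeholder plate string
        }
        await websocket.send(json.dumps(msg))
        await asyncio.sleep(1 / 25)             # assumed 25 fps video frame rate

async def main():
    async with websockets.serve(vehicle_feed, "0.0.0.0", 8765):
        await asyncio.Future()                  # run until cancelled

if __name__ == "__main__":
    asyncio.run(main())
```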
10. A digital twin intersection construction system based on vehicle identification and trajectory mapping, comprising:
the monitoring video acquisition module is used for acquiring a monitoring video of the target direction of the target traffic intersection;
the image reading module is used for reading images in the monitoring video frame by frame;
the vehicle detection tracking module is used for performing target detection and target tracking on the vehicles in the image to obtain, for every vehicle in the image, a target bounding box, a confidence level, a vehicle type and an identity of the target vehicle across adjacent video frames;
the vehicle coordinate determining module is used for determining coordinates of the vehicle in the image;
the vehicle geographic coordinate determining module is used for determining geographic coordinates of the vehicle according to the coordinates of the vehicle in the image;
the vehicle speed determining module is used for determining the real-time speed of the vehicle according to the geographic coordinates of the vehicle between adjacent frames;
The vehicle orientation determining module is used for determining the moving direction azimuth angle of the vehicle according to the geographic coordinates of the vehicle between adjacent frames to serve as the vehicle orientation;
the vehicle color determination module is used for determining, as the vehicle color, the main color of the image contained in the target bounding box according to the image pixels in the target bounding box;
the license plate number acquisition module is used for acquiring a license plate number corresponding to the vehicle;
the scene construction module is used for constructing a digital twin intersection static scene by adopting a two-dimensional and three-dimensional fusion rendering mode;
the vehicle adding module is used for adding dynamic vehicle objects in the digital twin intersection static scene according to the vehicle type, the geographic coordinates of the vehicle, the real-time speed of the vehicle, the vehicle orientation, the vehicle color and the vehicle license plate number.
CN202311146515.8A 2023-09-07 2023-09-07 Digital twin intersection construction method and system based on vehicle identification and track mapping Pending CN116883610A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311146515.8A CN116883610A (en) 2023-09-07 2023-09-07 Digital twin intersection construction method and system based on vehicle identification and track mapping

Publications (1)

Publication Number Publication Date
CN116883610A true CN116883610A (en) 2023-10-13

Family

ID=88255432

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109166159A (en) * 2018-10-12 2019-01-08 腾讯科技(深圳)有限公司 Obtain the method, apparatus and terminal of the dominant hue of image
CN112990114A (en) * 2021-04-21 2021-06-18 四川见山科技有限责任公司 Traffic data visualization simulation method and system based on AI identification
CN113421289A (en) * 2021-05-17 2021-09-21 同济大学 High-precision vehicle track data extraction method for overcoming unmanned aerial vehicle shooting disturbance
CN114332153A (en) * 2021-12-28 2022-04-12 京东方科技集团股份有限公司 Vehicle speed detection and collision early warning method and electronic equipment
CN114387310A (en) * 2022-01-17 2022-04-22 北京联合大学 Urban trunk road traffic flow statistical method based on deep learning

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117115805A (en) * 2023-10-25 2023-11-24 园测信息科技股份有限公司 Random irregular object identification method and device under Unreal Engine platform
CN117115805B (en) * 2023-10-25 2024-02-09 园测信息科技股份有限公司 Random irregular object identification method and device under Unreal Engine platform
CN117152400A (en) * 2023-10-30 2023-12-01 武汉苍穹融新科技有限公司 Method and system for fusing multiple paths of continuous videos and three-dimensional twin scenes on traffic road
CN117152400B (en) * 2023-10-30 2024-03-19 武汉苍穹融新科技有限公司 Method and system for fusing multiple paths of continuous videos and three-dimensional twin scenes on traffic road
CN117275241A (en) * 2023-11-21 2023-12-22 湖南希赛网络科技有限公司 Traffic situation awareness and flow prediction visualization system based on digital twinning
CN117275241B (en) * 2023-11-21 2024-02-09 湖南希赛网络科技有限公司 Traffic situation awareness and flow prediction visualization system based on digital twinning
CN117392331A (en) * 2023-12-12 2024-01-12 天津市城市规划设计研究总院有限公司 Visualization method of CAD file in three-dimensional scene based on vector tile serialization
CN117392331B (en) * 2023-12-12 2024-03-12 天津市城市规划设计研究总院有限公司 Visualization method of CAD file in three-dimensional scene based on vector tile serialization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination