CN116152342A - Guideboard registration positioning method based on gradient - Google Patents
- Publication number
- CN116152342A CN116152342A CN202310229650.2A CN202310229650A CN116152342A CN 116152342 A CN116152342 A CN 116152342A CN 202310229650 A CN202310229650 A CN 202310229650A CN 116152342 A CN116152342 A CN 116152342A
- Authority
- CN
- China
- Prior art keywords
- guideboard
- image
- gradient
- positioning
- area
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
- G06V20/582—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of traffic signs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30244—Camera pose
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30248—Vehicle exterior or interior
- G06T2207/30252—Vehicle exterior; Vicinity of vehicle
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention relates to a gradient-based road sign registration and positioning method, comprising: A, building a database; B, coarse positioning and extraction of road signs: a, acquiring a road image; b, performing target detection on the road image with a trained YOLOv8 network model to coarsely locate the target road sign, identify the category of the current sign, retrieve its database record, and crop the coarse-positioning region image; c, converting the coarse-positioning region image from RGB space to HSV space; d, thresholding the converted image in HSV space to obtain the ROI image of the road sign region; C, gradient-based registration optimization; D, vehicle pose calculation. By using the latest version of the YOLO network model for coarse positioning of road signs, the invention solves the problem of detecting road sign target regions in complex scenes, improves the speed and reliability of the detection results, preserves the integrity of the road sign region, and effectively suppresses interference from other similar regions.
Description
Technical Field
The invention relates to a gradient-based road sign registration and positioning method, and belongs to the fields of digital image processing and computer vision.
Background Art
With the rapid growth of the national economy, car ownership has increased quickly and traffic congestion has become increasingly serious; intelligent transportation systems and autonomous driving have therefore become hot research topics for scholars at home and abroad. Among the enabling techniques, vehicle self-positioning is fundamental and extremely important. In underground parking lots, tunnels, or densely built urban centers, satellite signals are blocked, so the accuracy of satellite-based in-vehicle positioning systems degrades severely or the systems fail altogether. Against this background, vehicle positioning systems based on road signs are studied in depth, aiming to improve the accuracy and reliability of in-vehicle positioning in such areas.
The earliest and most widely deployed positioning technology is the Global Positioning System (GPS), but its performance drops noticeably in dense urban areas. A single sensor cannot deliver interference-free, high-precision positioning in heavy traffic; multi-sensor fusion systems are costly and inflexible, which hinders their large-scale productization; and most positioning systems that cascade sensors accumulate error in complex urban environments and congested road conditions, leading to large positioning errors.
Given these limitations, and with the development of computer vision, visual sensors are increasingly used for vehicle positioning, and systems combining binocular cameras, monocular cameras, and various other sensors keep emerging. Completing vehicle positioning with a monocular camera, however, generally requires building a more complex and precise reference database and yields lower accuracy.
Summary of the Invention
In view of the deficiencies of the prior art, the present invention provides a gradient-based road sign registration and positioning method.
The invention adopts a binocular-vision vehicle self-positioning algorithm and combines it with GPS or offline-downloaded route planning and a road sign database to achieve autonomous navigation; in particular, when intersections are crowded with vehicles or lane markings are occluded, road signs are used to accurately self-position the vehicle.
Terminology:
1. RGB color space: composed of three basic color channels, red (R), green (G), and blue (B); pixel values in each channel lie in [0, 255], and different colors are obtained by varying the three channel values and superimposing them.
2. HSV color space: describes a color by hue (H), saturation (S), and value (V), which is closer to the way the human eye perceives color.
3. getPerspectiveTransform function: computes the perspective transformation matrix from a source image to a target image from four pairs of point coordinates on the two images.
4. Camera intrinsic matrix: transforms 3D camera coordinates into 2D homogeneous image coordinates and can be written as K = [[f_x, 0, u_0], [0, f_y, v_0], [0, 0, 1]], where f is the focal length, f_x is the focal length in pixels along the x axis, f_y is the focal length in pixels along the y axis, and (u_0, v_0) is the position of the principal point of the image. These parameters are determined solely by the camera itself and do not change with the external environment.
5. Extrinsic matrix: realizes the transformation from the world coordinate system to the camera coordinate system and can be written as [R | T], where R is the rotation matrix, each column vector of which gives the direction of a world coordinate axis in the camera coordinate system, and T is the translation vector, i.e., the world origin expressed in the camera coordinate system.
6. Perspective transformation matrix: when two cameras image the same scene, there is a unique geometric correspondence between the two images that can be expressed as a matrix. One case in which the perspective transformation holds is when the two images are taken with coincident optical centers; another is when the photographed scene lies on a single plane, such as the road signs used in the present invention.
7. Homogeneous coordinates: an n-dimensional vector represented by an (n+1)-dimensional vector, used in the coordinate systems of projective geometry. The homogeneous form of a 2D coordinate (x, y) is usually written as the 3D coordinate (hx, hy, h); the scale factor h may take any value and is generally set to 1 to keep the two representations consistent.
8. Image registration: the process of matching and overlaying two or more images acquired at different times, by different sensors (imaging devices), or under different conditions (weather, illumination, camera position and angle, etc.).
9. Optical flow: the instantaneous velocity of the pixel motion of a spatially moving object on the observation imaging plane; when the time interval is very small (e.g., between two consecutive video frames), it also equals the displacement of the target point.
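The homogeneous-coordinate and perspective-transform machinery in terms 3, 6, and 7 can be illustrated with a short sketch (not part of the patent; a pure-numpy stand-in for OpenCV's getPerspectiveTransform, with made-up point coordinates):

```python
import numpy as np

def perspective_transform_from_points(src, dst):
    """Solve for the 3x3 perspective matrix M (with m8 = 1) that maps four
    source points to four destination points; each correspondence
    (x, y) -> (xp, yp) contributes two linear equations in m0..m7."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp]); b.append(xp)
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp]); b.append(yp)
    m = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(m, 1.0).reshape(3, 3)

def apply_homography(M, pt):
    """Map a 2D point through M using homogeneous coordinates (hx, hy, h)."""
    hx, hy, h = M @ np.array([pt[0], pt[1], 1.0])
    return hx / h, hy / h

# Four corners of a unit square mapped to a skewed quadrilateral
# (made-up coordinates).
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(0, 0), (2, 0.1), (2.2, 1.9), (0.1, 2)]
M = perspective_transform_from_points(src, dst)
print(np.allclose(apply_homography(M, (1, 1)), (2.2, 1.9)))  # True
```

Solving the eight linear equations from four correspondences fixes m0 to m7 with m8 normalized to 1, the same parameterization used for the matrix M in step C below.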
The technical solution of the present invention is as follows:
A gradient-based road sign registration and positioning method comprises the following steps:
A. Building a database
The database contains the following information for every road sign: geographic coordinates, the distances between the sign center and the lanes, and the background color. The geographic coordinates are the longitude and latitude of the sign; the distance information is the lateral distance between the sign center and each lane line; the background color is the color of the sign.
B. Coarse positioning and extraction of road signs
a. A binocular camera is mounted at the front of the vehicle to acquire road images in real time.
b. Target detection is performed on the road image obtained in step a with the trained YOLOv8 network model to coarsely locate the target road sign, identify the category of the current sign, retrieve its database record, and crop the coarse-positioning region image.
c. The coarse-positioning region image obtained in step b is converted from RGB space to HSV space.
d. The image processed in step c is thresholded in HSV space, and a mask image is obtained through morphological operations and connected-component aspect-ratio constraints, yielding the ROI image of the road sign region.
C. Gradient-based registration optimization
From the obtained mask images, four pairs of initial corresponding points are extracted to form an initial perspective transformation matrix; a gradient-based optimization algorithm then performs sub-pixel registration on the Gaussian-smoothed left and right road sign ROI images, and iterative updates yield an accurate perspective transformation matrix.
D. Vehicle pose calculation
The overall displacement of the road sign, i.e., the disparity, is computed, from which the position of the vehicle is obtained, including the lateral and normal distances of the camera relative to the sign, i.e., the distance of the vehicle from the sign and the lane it occupies.
Preferably, according to the present invention, step b is implemented as follows:
(1) Training the YOLOv8 network model:
Obtaining the training set: pictures containing road signs are selected and labeled; each label contains the coordinates of the sign in the picture and the category of the sign.
The YOLOv8 network model consists of a Backbone unit, a Neck unit, and a Head unit.
The Backbone unit comprises convolution modules, C2f modules, and an SPPF module; a convolution module (Conv module) comprises a convolution layer, a batch-normalization layer, and a SiLU activation layer; a C2f module comprises convolution modules, Bottleneck modules, and residual connections; the SPPF module comprises convolution layers and pooling layers.
The Neck unit comprises convolution modules, C2f modules, and upsampling layers.
The Head unit comprises a detection module (Detect module), which comprises convolution modules and convolution layers.
Each picture in the training set is scaled and convolved by the Backbone unit to obtain initial feature maps; the Neck unit re-extracts these to obtain intermediate feature maps at different scales; the intermediate feature maps at different scales are fed into the Head unit to obtain the road sign coordinates predicted by the YOLOv8 network model.
The loss is computed from the road sign coordinates predicted by the YOLOv8 network model and the ground-truth coordinates; the gradient for optimizing the model is obtained from the loss, and the model weights are updated. As the loss decreases, the prediction accuracy of the network rises, yielding a trained YOLOv8 network model.
The road image obtained in step a is then fed to the trained YOLOv8 network model for target detection, achieving coarse positioning of the target road sign, identifying the category of the current sign, and retrieving the database record.
(2) In the actual inference and test stage, the road image obtained in step a is input to the trained YOLOv8 network model to obtain the predicted coarse-positioning coordinates of the road sign and the sign's category.
(3) The coarse-positioning region given by those coordinates is set to 1 and all other regions are set to 0, producing the coarse-positioning region image.
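The box-to-mask conversion in step (3) can be sketched as follows (an illustration only; the frame size and box coordinates are hypothetical, and the real boxes come from the YOLOv8 detector):

```python
import numpy as np

def box_to_mask(shape, box):
    """Build the binary coarse-positioning image of step (3): pixels inside
    the predicted box are 1, everything else is 0.
    box = (x1, y1, x2, y2) in pixel coordinates."""
    mask = np.zeros(shape, dtype=np.uint8)
    x1, y1, x2, y2 = box
    mask[y1:y2, x1:x2] = 1
    return mask

# Hypothetical 480x640 frame with a detected sign at (200, 50)-(400, 150).
mask = box_to_mask((480, 640), (200, 50, 400, 150))
print(mask.sum())  # 200 * 100 = 20000 pixels inside the box
```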
Preferably, according to the present invention, in step c the pixels satisfying the following threshold ranges are separated in HSV space:
the saturation threshold range is 0.35 < S < 1 and the value threshold range is 0.35 < V < 1; the hue threshold range depends on the sign: for the rectangular blue road signs, 200 < H < 280 is set, and for extracting the red corner regions of the self-made standard signs, H > 330 or H < 30 is set.
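These thresholds use H in degrees and S, V in [0, 1]; as a sanity check (not part of the patent), the standard-library colorsys module reproduces this scale from an 8-bit RGB pixel:

```python
import colorsys

def rgb_to_hsv_degrees(r, g, b):
    """Convert an 8-bit RGB pixel to (H in degrees, S, V), the scale used
    by the thresholds above."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v

# A made-up blue-sign pixel: its hue falls inside the 200 < H < 280 band
# and its saturation and value clear the 0.35 thresholds.
h, s, v = rgb_to_hsv_degrees(30, 80, 200)
print(round(h), round(s, 2), round(v, 2))  # 222 0.85 0.78
```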
Preferably, according to the present invention, in step d:
Thresholding means: pixels within the threshold ranges are set to 255 and all other pixels to 0, giving a preliminary mask image.
Morphological operations mean: library functions from morphology are called to remove external noise and internal holes, and a closing operation is applied to repair possible edge discontinuities and eliminate most interference regions.
The connected-component aspect-ratio constraint means: candidate regions are constrained by the aspect ratio and area of the target region, leaving a final target region free of interference.
The minimum bounding rectangle of the target region is computed and set to 255 to obtain the mask image of the road sign region; the mask image is ANDed with the original sign image to finally obtain the ROI image of the road sign region.
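The thresholding and closing operations of step d can be sketched in pure numpy (an illustration of the idea only; the patent calls library morphology functions, and the sample pixels here are made up):

```python
import numpy as np

def hsv_threshold(h, s, v):
    """Preliminary mask image: 255 where a pixel matches the blue-sign
    ranges (200 < H < 280, 0.35 < S < 1, 0.35 < V < 1), 0 elsewhere."""
    keep = (h > 200) & (h < 280) & (s > 0.35) & (s < 1) & (v > 0.35) & (v < 1)
    return np.where(keep, 255, 0).astype(np.uint8)

def dilate(m):
    """3x3 binary dilation implemented with shifted copies of the mask."""
    p = np.pad(m, 1)
    out = np.zeros_like(m)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out |= p[1 + dy:1 + dy + m.shape[0], 1 + dx:1 + dx + m.shape[1]]
    return out

def close_binary(m):
    """Closing = dilation followed by erosion; fills small holes and gaps.
    Erosion is computed as the complement of the dilated complement."""
    return 1 - dilate(1 - dilate(m))

# Threshold two made-up HSV pixels: one blue-sign pixel, one not.
px = hsv_threshold(np.array([[220.0, 100.0]]),
                   np.array([[0.8, 0.8]]),
                   np.array([[0.9, 0.9]]))

# Closing fills a one-pixel hole in a 5x5 foreground region.
m = np.ones((5, 5), dtype=np.uint8)
m[2, 2] = 0
print(px[0, 0], px[0, 1], close_binary(m)[2, 2])  # 255 0 1
```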
Preferably, according to the present invention, step C is implemented as follows:
The vertex coordinates of the minimum bounding rectangles in the left and right mask images are extracted as four pairs of initial corresponding points, from which the initial perspective transformation matrix M is computed. The planar perspective projection relationship is given by formula (I):

x' ~ M x, with M = [[m_0, m_1, m_2], [m_3, m_4, m_5], [m_6, m_7, 1]]   (I)

where x = (x, y, 1) and x' = (x', y', 1) are homogeneous coordinates and ~ denotes equality up to scale; written out component-wise:

x' = (m_0 x + m_1 y + m_2) / (m_6 x + m_7 y + 1)
y' = (m_3 x + m_4 y + m_5) / (m_6 x + m_7 y + 1)

Taking the right image as the target image, the left image is warped by perspective projection to approach the right image, and the transformation matrix is updated iteratively as M <- (E + D)M, where E is the identity matrix and

D = [[d_0, d_1, d_2], [d_3, d_4, d_5], [d_6, d_7, 0]]   (V)

In formula (V), d_0 to d_7 correspond to the per-iteration update parameters of m_0 to m_7 in the matrix M.
Resampling the left image I_1 with the new transformation x' ~ (E + D)M x is then equivalent to transforming the already-resampled left image Ĩ_1 with x'' ~ (E + D)x, that is:

x'' = ((1 + d_0) x + d_1 y + d_2) / (d_6 x + d_7 y + 1)
y'' = (d_3 x + (1 + d_4) y + d_5) / (d_6 x + d_7 y + 1)

where x'' = (x'', y'', 1) is a homogeneous coordinate.
The pixel motion is estimated by minimizing the intensity error between the two images; the intensity error equations are:

E(d) = Σ_i [Ĩ_1(x''_i) - I_0(x_i)]^2   (X)
     ≈ Σ_i [g_i^T J_i d + e_i]^2   (XI)

In formulas (X) and (XI), g_i = ∇Ĩ_1(x_i) is the image gradient of the resampled left image Ĩ_1 at x_i, where x_i ranges over the road sign ROI region;
e_i = Ĩ_1(x_i) - I_0(x_i) is the intensity error between corresponding points of the resampled left image Ĩ_1 and the target image I_0;
d = (d_0, d_1, ..., d_7) is the motion update parameter, and J_i = J_d(x_i) is the Jacobian of the resampled point coordinate x''_i with respect to d, which corresponds to the optical flow induced by the instantaneous motion of the three-dimensional plane and, evaluated at d = 0, is given by:

J_d(x) = [[x, y, 1, 0, 0, 0, -x^2, -xy],
          [0, 0, 0, x, y, 1, -xy, -y^2]]   (XII)

The analytical least-squares solution is then obtained from:

A d = -b   (XIII)

where the Hessian matrix is

A = Σ_i J_i^T g_i g_i^T J_i   (XIV)

and the accumulated gradient is

b = Σ_i e_i J_i^T g_i   (XV)
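One solve of the normal equations above can be sketched numerically (a toy, not the patent's implementation: a synthetic ramp image pair, gradients from numpy, a small ridge term so the unconstrained parameters stay at zero, and a single update rather than the full iterative warp):

```python
import numpy as np

def jacobian(x, y):
    """J_d(x) at d = 0: the 2x8 optical-flow Jacobian of the planar motion."""
    return np.array([[x, y, 1, 0, 0, 0, -x * x, -x * y],
                     [0, 0, 0, x, y, 1, -x * y, -y * y]], dtype=float)

def solve_update(I1, I0):
    """Accumulate the Hessian A = sum J^T g g^T J and the accumulated
    gradient b = sum e_i J^T g over the ROI, then solve A d = -b for the
    eight motion-update parameters d0..d7."""
    gy, gx = np.gradient(I1.astype(float))
    A = np.zeros((8, 8))
    b = np.zeros(8)
    H, W = I1.shape
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            g = np.array([gx[y, x], gy[y, x]])      # image gradient g_i
            e = float(I1[y, x]) - float(I0[y, x])   # intensity error e_i
            Jg = jacobian(x, y).T @ g               # J_i^T g_i, an 8-vector
            A += np.outer(Jg, Jg)
            b += e * Jg
    # Tiny ridge keeps A invertible when some parameters are unconstrained.
    return np.linalg.solve(A + 1e-6 * np.eye(8), -b)

# Synthetic pair: I0 is an intensity ramp, I1 the same ramp shifted by one
# pixel along x, so the recovered update should be the x-translation d2 = -1.
xs = np.arange(32, dtype=float)
I0 = np.tile(xs, (32, 1))
I1 = np.tile(xs + 1.0, (32, 1))
d = solve_update(I1, I0)
print(round(d[2], 3))  # -1.0
```

On this pair the recovered update is dominated by d_2, the x-translation, as expected from the Jacobian's first row.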
Preferably, according to the present invention, in step D:
From the imaging principle of the binocular camera:

Z = f b / disp,  X = Z (u_L - u_0) / f

where X is the lateral distance from the camera to the center of the road sign; using the distances between the sign center and the lanes stored in the database, the lane occupied by the vehicle is deduced from X. Z is the normal distance from the camera to the plane of the sign, i.e., the distance from the vehicle to the sign.
O_L and O_R are the centers of the left and right apertures of the binocular camera; the distance between O_L and O_R is the baseline b of the camera, and f is the focal length. P(X, Y, Z) is a point in three-dimensional space that forms one image in each camera, denoted P_L and P_R; after rectification, the x-axis coordinates of P_L and P_R on the imaging plane are u_L and u_R, and the obtained disparity is disp = u_L - u_R.
At this point, combining the intrinsic and extrinsic parameters obtained from camera calibration, the lateral and normal distances of the camera relative to the road sign are obtained, i.e., the distance of the vehicle from the sign and the lane it occupies, realizing vehicle self-positioning.
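The disparity-to-position relations can be checked numerically (the calibration numbers below are made up for illustration):

```python
def stereo_position(uL, uR, u0, f, b):
    """Recover depth Z and lateral offset X from a rectified stereo pair:
    disparity disp = uL - uR, then Z = f * b / disp and
    X = Z * (uL - u0) / f."""
    disp = uL - uR
    Z = f * b / disp
    X = Z * (uL - u0) / f
    return X, Z

# Made-up calibration: f = 800 px, baseline b = 0.125 m, principal point
# u0 = 320 px; the sign center appears at uL = 400 px and uR = 384 px.
X, Z = stereo_position(400, 384, 320, 800, 0.125)
print(X, Z)  # 0.625 6.25
```

A 16-pixel disparity thus places the sign 6.25 m ahead and 0.625 m to the side of the optical axis; comparing X against the stored sign-center-to-lane distances then gives the lane.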
A computer device comprises a memory and a processor; the memory stores a computer program, and when the processor executes the computer program, the steps of the gradient-based road sign registration and positioning method are carried out.
A computer-readable storage medium stores a computer program which, when executed by a processor, carries out the steps of the gradient-based road sign registration and positioning method.
本发明的有益效果为:The beneficial effects of the present invention are:
1、本发明运用最高版本的YOLO(You Only Look Once version 8)网络模型进行路牌的粗定位,并与颜色提取相结合,实现了在复杂场景下检测路牌目标区域的问题,提高了检测结果的速度和可靠性,保证了路牌区域的完整性,且有效排除了其他相似区域的干扰。1. The present invention uses the highest version of the YOLO (You Only Look Once version 8) network model to perform rough positioning of road signs, and combines it with color extraction to solve the problem of detecting the target area of road signs in complex scenes, improve the speed and reliability of the detection results, ensure the integrity of the road sign area, and effectively eliminate the interference of other similar areas.
2、本发明充分利用路牌是一个平面的特性,将其作为一个特征点进行匹配,得到左右目路牌区域的透视变换矩阵,计算平面之间的整体光流进行立体匹配,避免了逐点计算的冗余,使得配准结果的鲁棒性更强,检测结果精度更高,速度更快。2. The present invention makes full use of the characteristic that the road sign is a plane, and uses it as a feature point for matching, obtains the perspective transformation matrix of the left and right road sign areas, calculates the overall optical flow between the planes for stereo matching, avoids the redundancy of point-by-point calculation, makes the registration result more robust, and the detection result more accurate and faster.
3、本发明提出使用一种简易的数据库系统,该数据库结构简单、数据量小,并且易于后期维护,数据库内容主要包括各个路牌的位置信息、底色以及路牌处道路的车道信息,即可实现车辆的车道级定位。3. The present invention proposes a simple database system with a simple structure, a small data volume, and easy later maintenance. The database mainly contains, for each road sign, its location information, its background color, and the lane information of the road at the sign, which is sufficient to achieve lane-level positioning of the vehicle.
附图说明BRIEF DESCRIPTION OF THE DRAWINGS
图1为本发明基于梯度的路牌配准定位方法的流程示意图;FIG1 is a schematic diagram of a flow chart of a gradient-based road sign registration and positioning method of the present invention;
图2为蓝色矩形的道路指路标志牌示意图;Figure 2 is a schematic diagram of a blue rectangular road sign;
图3(a)为标号为A1的自制标准路牌示意图;Figure 3(a) is a schematic diagram of a self-made standard road sign labeled A1;
图3(b)为标号为A2的自制标准路牌示意图;Figure 3(b) is a schematic diagram of a self-made standard road sign labeled A2;
图3(c)为标号为B3的自制标准路牌示意图;Figure 3(c) is a schematic diagram of a self-made standard road sign labeled B3;
图3(d)为标号为B4的自制标准路牌示意图;Figure 3(d) is a schematic diagram of a self-made standard road sign labeled B4;
图3(e)为标号为C5的自制标准路牌示意图;Figure 3(e) is a schematic diagram of a self-made standard road sign labeled C5;
图3(f)为标号为C6的自制标准路牌示意图;Figure 3(f) is a schematic diagram of a self-made standard road sign labeled C6;
图4(a)为YOLOv8网络模型的简略示意图;Figure 4(a) is a simplified schematic diagram of the YOLOv8 network model;
图4(b)为YOLOv8网络模型的结构示意图;Figure 4(b) is a schematic diagram of the structure of the YOLOv8 network model;
图4(c)为YOLOv8的Conv模块详细结构示意图;Figure 4(c) is a detailed structural diagram of the Conv module of YOLOv8;
图4(d)为YOLOv8的C2f模块详细结构示意图;Figure 4(d) is a detailed structural diagram of the C2f module of YOLOv8;
图4(e)为YOLOv8的Bottleneck模块详细结构示意图;Figure 4(e) is a detailed structural diagram of the Bottleneck module of YOLOv8;
图4(f)为YOLOv8的SPPF模块详细结构示意图;Figure 4(f) is a detailed structural diagram of the SPPF module of YOLOv8;
图4(g)为YOLOv8的Detect模块详细结构示意图;Figure 4(g) is a detailed structural diagram of the Detect module of YOLOv8;
图5为YOLOv8检测蓝色矩形路牌效果示意图;Figure 5 is a schematic diagram of the effect of YOLOv8 detecting a blue rectangular road sign;
图6为YOLOv8检测自制标准路牌效果示意图;Figure 6 is a schematic diagram of the effect of YOLOv8 detecting a self-made standard road sign;
图7为蓝色矩形路牌的ROI图像示意图;Figure 7 is a schematic diagram of the ROI image of a blue rectangular road sign;
图8为自制标准路牌的ROI图像示意图;FIG8 is a schematic diagram of a ROI image of a self-made standard road sign;
图9为迭代优化过程中误差变化示意图;FIG9 is a schematic diagram of error variation during iterative optimization;
图10为双目相机的成像模型示意图。FIG10 is a schematic diagram of an imaging model of a binocular camera.
具体实施方式DETAILED DESCRIPTION
下面结合实施例和说明书附图对本发明做进一步说明,但不限于此。The present invention will be further described below in conjunction with the embodiments and the accompanying drawings, but is not limited thereto.
实施例1Example 1
一种基于梯度的路牌配准定位方法,路牌为数据库中任意路牌,设置在道路右侧,且与路面垂直,以蓝色矩形的道路指路标志牌(如图2所示)和自制标准路牌为例,自制标准路牌为正方形平面指示牌,底色为白色,四个顶角处设有正方形红色区域,指示字符为字母加数字,用黑色标注,如图1所示,包括步骤如下:A road sign registration and positioning method based on gradient, the road sign is any road sign in the database, set on the right side of the road and perpendicular to the road surface, taking a blue rectangular road sign (as shown in FIG2 ) and a self-made standard road sign as examples, the self-made standard road sign is a square plane sign, the background color is white, and square red areas are set at the four top corners, and the indicating characters are letters and numbers, marked in black, as shown in FIG1 , including the following steps:
A、构建数据库A. Building a database
数据库包括各个路牌的以下信息:地理坐标、路牌中心与车道的距离信息、底色,地理坐标是指路牌所在的经度、纬度;路牌中心与车道的距离信息是指路牌中心与每条车道线的横向距离;底色是指路牌的颜色;本发明采用双目相机,对于不同路牌均适用,不需要提前采集路牌的尺寸大小。The database includes the following information of each road sign: geographic coordinates, distance information between the center of the road sign and the lane, and background color. The geographic coordinates refer to the longitude and latitude of the road sign; the distance information between the center of the road sign and the lane refers to the lateral distance between the center of the road sign and each lane line; the background color refers to the color of the road sign; the present invention uses a binocular camera, which is applicable to different road signs and does not require the size of the road sign to be collected in advance.
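The patent does not prescribe a storage format for this database; as a hypothetical sketch only (all field names and values below are invented for illustration), it could be as simple as a keyed table:

```python
# Hypothetical sketch of the road-sign database of step A.
# The patent only specifies WHAT is stored: geographic coordinates,
# sign-center-to-lane-line distances, and background color.
SIGN_DB = {
    "A1": {
        "lon": 117.12, "lat": 36.65,        # geographic coordinates (example values)
        "lane_offsets_m": [1.8, 5.4, 9.0],  # lateral distance from sign center to each lane line
        "background": "white",
    },
    "blue_rect_01": {
        "lon": 117.13, "lat": 36.66,
        "lane_offsets_m": [2.0, 5.6],
        "background": "blue",
    },
}

def lookup(sign_id):
    """Return the stored record for a detected sign category/ID."""
    return SIGN_DB[sign_id]
```

Because the data volume is small and the schema flat, such a table is easy to maintain and extend with new signs.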
B、路牌粗定位与提取B. Rough positioning and extraction of road signs
a、将双目相机安装在车辆前方以实时获取道路图像;双目相机的光轴平行且与车辆行驶方向相同,双目相机的左、右相机焦距相等且x轴正方向重合;a. Install a binocular camera in front of the vehicle to obtain road images in real time; the optical axis of the binocular camera is parallel and in the same direction as the vehicle's travel, the focal lengths of the left and right cameras of the binocular camera are equal, and the positive directions of the x-axis coincide;
b、将步骤a获得的道路图像利用训练好的YOLOv8网络模型进行目标检测,实现目标路牌的粗定位,并识别当前路牌的类别,本实施例所用路牌有7类,其中蓝色矩形路牌一类,自制标准路牌六类,路牌上标号分别为A1、A2、B3、B4、C5、C6,分别如图3(a)、图3(b)、图3(c)、图3(d)、图3(e)、图3(f)所示;b. Use the trained YOLOv8 network model to perform target detection on the road image obtained in step a to achieve rough positioning of the target road sign and identify the category of the current road sign. There are 7 types of road signs used in this embodiment, including one type of blue rectangular road signs and six types of self-made standard road signs. The numbers on the road signs are A1, A2, B3, B4, C5, and C6, as shown in Figures 3(a), 3(b), 3(c), 3(d), 3(e), and 3(f), respectively;
c、将步骤b获得的粗定位区域图像从RGB空间转换为HSV空间;在RGB空间三个颜色分量高度相关、分析困难,且易受到光照等影响,而HSV空间可通过调节饱和度和明度消除光照的影响,能够更准确的分离出某一颜色。c. Convert the roughly positioned area image obtained in step b from the RGB space to the HSV space. In the RGB space, the three color components are highly correlated, difficult to analyze, and easily affected by lighting, while the HSV space can eliminate the influence of lighting by adjusting saturation and brightness, and can more accurately separate a certain color.
d、将经过步骤c处理的图像在HSV空间进行阈值处理,并通过形态学操作、连通域长宽比例约束得到掩模图像,从而得到路牌区域ROI图像;d. Perform threshold processing on the image processed in step c in HSV space, and obtain a mask image through morphological operations and connected domain length-width ratio constraints, thereby obtaining a road sign area ROI image;
C、基于梯度的配准优化C. Gradient-based registration optimization
根据得到的掩模图像,初步得到四对初始对应点,从而得到初始的透视变换矩阵,再利用基于梯度的优化算法对高斯平滑后的左右目路牌区域ROI图像进行亚像素级配准,迭代更新得到准确的透视变换矩阵;Based on the obtained mask image, four pairs of initial corresponding points are initially obtained, thereby obtaining the initial perspective transformation matrix. Then, the gradient-based optimization algorithm is used to perform sub-pixel registration on the Gaussian-smoothed left and right road sign area ROI images, and the accurate perspective transformation matrix is obtained by iterative update.
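The initial matrix here is obtained from four corner pairs via OpenCV's getPerspectiveTransform. Purely as an illustration of what that call computes (helper names are mine, not from the patent), a minimal numpy equivalent that solves for the eight parameters m0..m7 with m8 fixed to 1 is:

```python
import numpy as np

def homography_from_4pts(src, dst):
    """Solve the 8 unknowns m0..m7 of a perspective transform (m8 = 1)
    from four source/destination point pairs, as getPerspectiveTransform does."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (m0*x + m1*y + m2) / (m6*x + m7*y + 1), similarly for v
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    m = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(m, 1.0).reshape(3, 3)

def apply_h(M, x, y):
    """Apply a homography to a point and dehomogenize."""
    p = M @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]
```

This initial estimate is then refined to sub-pixel accuracy by the gradient-based iteration described below.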
D、车辆位姿计算D. Vehicle posture calculation
上述步骤迭代更新得到了准确的透视变换矩阵,根据路牌中心点坐标和公式(I)求得路牌整体位移,即求得视差,进而得到车辆的位置信息,包括相机相对于路牌的横向和法线距离,即车辆相对路牌的距离和所在车道。The above steps are iteratively updated to obtain an accurate perspective transformation matrix. The overall displacement of the road sign is calculated according to the coordinates of the center point of the road sign and formula (I), that is, the parallax is calculated, and then the vehicle position information is obtained, including the lateral and normal distances of the camera relative to the road sign, that is, the distance of the vehicle relative to the road sign and the lane it is in.
实施例2Example 2
根据实施例1所述的一种基于梯度的路牌配准定位方法,其区别在于:The gradient-based road sign registration and positioning method according to Example 1 is different in that:
步骤b的具体实现过程包括:The specific implementation process of step b includes:
(1)训练YOLOv8网络模型:(1) Training YOLOv8 network model:
获取训练集:选取含有路牌的图片,标注上标签,标签包括图片中路牌的坐标信息以及路牌的类别;Get the training set: select pictures containing road signs and label them. The labels include the coordinate information of the road signs in the pictures and the categories of the road signs.
因为原始的模型不能较好的识别出路牌这样的目标信息,针对本发明所面临的实际问题,对每种路牌分别选取了100张含有路牌的图片进行训练,图片里面覆盖了较多的路牌场景,这样可以保证在新场景的路牌检测准确率,训练过程中的标签是图片中路牌的坐标信息以及路牌的类别,这样在推理阶段就能获取所需要的路牌坐标和类别信息便于下一步的操作。Because the original model cannot recognize target information such as road signs well, in order to address the practical problems faced by the present invention, 100 pictures containing road signs were selected for each type of road sign for training. The pictures cover a large number of road sign scenes, which can ensure the accuracy of road sign detection in new scenes. The labels in the training process are the coordinate information of the road signs in the pictures and the categories of the road signs. In this way, the required road sign coordinates and category information can be obtained in the inference stage to facilitate the next step of operation.
如图4(a)、图4(b)所示,YOLOv8网络模型包括Backbone单元、Neck单元、Head单元;As shown in Figure 4(a) and Figure 4(b), the YOLOv8 network model includes Backbone unit, Neck unit, and Head unit;
Backbone单元包括卷积模块、C2f模块、SPPF模块,其中,如图4(c)所示,卷积模块(Conv模块)包括卷积层、批归一化层和SiLU激活函数层;C2f模块中具体结构如图4(d)所示,C2f模块包括卷积模块、Bottleneck模块和残差结构模块,Bottleneck模块具体结构如图4(e)所示;SPPF模块中具体结构如图4(f)所示,SPPF模块中包括卷积层、池化层;The Backbone unit includes a convolution module, a C2f module, and an SPPF module. As shown in Figure 4(c), the convolution module (Conv module) includes a convolution layer, a batch normalization layer, and a SiLU activation function layer. The specific structure of the C2f module is shown in Figure 4(d). The C2f module includes a convolution module, a Bottleneck module, and a residual structure module. The specific structure of the Bottleneck module is shown in Figure 4(e). The specific structure of the SPPF module is shown in Figure 4(f). The SPPF module includes a convolution layer and a pooling layer.
输入放缩成高和宽均为640的图片后,经过Backbone单元得到尺寸分别为80×80、40×40和20×20的初始特征图;After the input is scaled to a 640×640 image, the Backbone unit produces initial feature maps of sizes 80×80, 40×40, and 20×20;
Neck单元包括卷积模块、C2f模块、上采样层;经过Neck单元后得到尺寸分别为80×80、40×40和20×20的中间特征图。The Neck unit includes a convolution module, a C2f module, and an upsampling layer; after the Neck unit, intermediate feature maps of sizes 80×80, 40×40, and 20×20 are obtained.
Head单元包括检测模块(Detect模块),如图4(g)所示,检测模块包括卷积模块、卷积层;经过Head单元后会得到预测的目标类别信息和目标坐标信息;The Head unit includes a detection module (Detect module), as shown in Figure 4(g), which includes a convolution module and a convolution layer. After passing through the Head unit, the predicted target category information and target coordinate information are obtained;
基于Backbone单元对训练集中的每张图片进行放缩(放缩成640×640的图片)及卷积操作,获得初始特征图;基于Neck单元对初始特征图进行二次提取,获得不同尺度的中间特征图;将不同尺度的中间特征图输入Head单元,得到YOLOv8网络模型预测的路牌坐标;Each image in the training set is scaled to 640×640 and convolved by the Backbone unit to obtain the initial feature maps; the Neck unit further extracts these into intermediate feature maps at different scales; the multi-scale intermediate feature maps are then fed into the Head unit to obtain the road-sign coordinates predicted by the YOLOv8 network model;
通过YOLOv8网络模型预测的路牌坐标和真实的路牌坐标计算损失,并由损失获得YOLOv8网络模型优化的梯度,图9为迭代优化过程中误差变化示意图;随着YOLOv8网络模型权重的更新,损失不断下降,网络预测的准确率不断上升,从而获得训练好的YOLOv8网络模型;The loss is computed from the road-sign coordinates predicted by the YOLOv8 network model and the ground-truth coordinates, and the gradient for optimizing the model is obtained from this loss (FIG. 9 shows the error variation during iterative optimization). As the model weights are updated, the loss steadily decreases and the prediction accuracy rises, yielding a well-trained YOLOv8 network model;
将步骤a获得的道路图像利用训练好的YOLOv8网络模型进行目标检测,实现目标路牌的粗定位,并识别当前路牌的类别,获取数据库信息;The road image obtained in step a is used to perform target detection using the trained YOLOv8 network model to achieve rough positioning of the target road sign, identify the category of the current road sign, and obtain database information;
(2)在实际的推理测试阶段,将步骤a获得的道路图像输入训练好的YOLOv8网络模型,得到预测的路牌粗定位坐标,以及路牌的种类;图5为YOLOv8检测蓝色矩形路牌效果示意图;图6为YOLOv8检测自制标准路牌效果示意图。(2) In the actual reasoning test phase, the road image obtained in step a is input into the trained YOLOv8 network model to obtain the predicted rough positioning coordinates of the road sign and the type of the road sign; FIG5 is a schematic diagram of the effect of YOLOv8 detecting a blue rectangular road sign; FIG6 is a schematic diagram of the effect of YOLOv8 detecting a self-made standard road sign.
(3)将路牌粗定位坐标得到的粗定位区域置1,其他区域置0,得到粗定位区域图像。(3) The coarse positioning area obtained by the coarse positioning coordinates of the road sign is set to 1, and the other areas are set to 0 to obtain a coarse positioning area image.
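Step (3) amounts to rasterizing the predicted bounding box; a minimal sketch, assuming the box is given as (x1, y1, x2, y2) pixel coordinates:

```python
import numpy as np

def coarse_mask(img_h, img_w, box):
    """Coarse-positioning image of step (3): pixels inside the YOLO
    bounding box are set to 1, all other pixels to 0.
    box: (x1, y1, x2, y2) in pixel coordinates (assumed format)."""
    x1, y1, x2, y2 = box
    m = np.zeros((img_h, img_w), dtype=np.uint8)
    m[y1:y2, x1:x2] = 1
    return m
```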
步骤c中,在RGB空间三个颜色分量高度相关、分析困难,且易受到光照等影响,而HSV空间可通过调节饱和度和明度消除光照的影响,能够更准确的分离出某一颜色。HSV空间中,分离出符合以下阈值范围的像素:In step c, the three color components in the RGB space are highly correlated, difficult to analyze, and easily affected by lighting, while the HSV space can eliminate the influence of lighting by adjusting saturation and brightness, and can more accurately separate a certain color. In the HSV space, separate the pixels that meet the following threshold range:
根据先验知识和实验测定,饱和度S的阈值取值范围为0.35<S<1,亮度V的阈值取值范围为0.35<V<1;色调H的阈值取值范围由路牌决定,对于矩形蓝色路牌,设置200<H<280,对于自制标准路牌提取四角红色区域,设置H>330或H<30。According to prior knowledge and experimental measurements, the threshold range of saturation S is 0.35<S<1, and the threshold range of brightness V is 0.35<V<1. The threshold range of hue H is determined by the road sign. For rectangular blue road signs, 200<H<280 is set. For self-made standard road signs to extract the four red corners, H>330 or H<30 is set.
步骤d中,阈值处理,是指:对于符合阈值范围的像素,设置为255,其余像素设置为0,得到初步掩模图像;In step d, threshold processing means: for pixels that meet the threshold range, set them to 255, and set the remaining pixels to 0, to obtain a preliminary mask image;
形态学操作,是指:调用morphology的库函数,去除外部噪点和内部孔洞,并通过闭运算的方式,解决可能存在的边缘不连续情况,消除大部分干扰区域,保证路牌区域是一个完整的连通区域;Morphological operation means: calling the morphology library function to remove external noise and internal holes, and solving possible edge discontinuities through closing operations, eliminating most interference areas, and ensuring that the road sign area is a complete connected area;
连通域长宽比例约束,是指:根据目标区域的长宽比例和面积大小,对所求区域进行约束,实验表明,至此可得到毫无干扰的最终目标区域;The connected domain length-width ratio constraint means: constraining the target region according to the length-width ratio and area size of the target region. Experiments show that the final target region without interference can be obtained.
因本发明所用自制红色标准路牌的四角区域是红色正方形,即宽高比是1,所以可限制宽高比大于0.8小于1.2;同时,由于粗定位区域中路牌的红色是最大的红色区域,可以进行面积约束,保留面积最大的四个区域作为最终目标区域;Since the four corner areas of the self-made red standard road sign used in the present invention are red squares, that is, the aspect ratio is 1, the aspect ratio can be limited to be greater than 0.8 and less than 1.2; at the same time, since the red color of the road sign in the rough positioning area is the largest red area, area constraints can be performed, and the four areas with the largest areas are retained as the final target areas;
至此,已排除所有干扰,现提取四个目标区域,并求其最小外接矩形,将其置为255,得到整个路牌区域的掩模图像,将掩模图像和路牌原图像进行与运算,最终得到了路牌区域的ROI图像,图7为蓝色矩形路牌的ROI图像示意图;图8为自制标准路牌的ROI图像示意图。At this point, all interference has been eliminated. Now, four target areas are extracted, and their minimum circumscribed rectangle is calculated and set to 255 to obtain the mask image of the entire road sign area. The mask image and the original road sign image are ANDed together to finally obtain the ROI image of the road sign area. Figure 7 is a schematic diagram of the ROI image of the blue rectangular road sign; Figure 8 is a schematic diagram of the ROI image of the homemade standard road sign.
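Steps c and d can be sketched as follows. This is an illustration only: it assumes the image is already in HSV with H in degrees (0–360) and S, V normalized to [0, 1], matching the thresholds stated above (note that OpenCV's own HSV scaling differs, H in 0–179 and S, V in 0–255, so values would need rescaling before cv2.inRange); the helper names are mine, and the morphology/connected-component passes done via library calls in the embodiment are omitted:

```python
import numpy as np

def blue_sign_mask(hsv):
    """Preliminary mask of step d for a blue rectangular sign:
    pixels satisfying 200<H<280, 0.35<S<1, 0.35<V<1 are set to 255,
    the rest to 0 (thresholds as given in the description)."""
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    keep = (h > 200) & (h < 280) & (s > 0.35) & (s < 1) & (v > 0.35) & (v < 1)
    return np.where(keep, 255, 0).astype(np.uint8)

def aspect_ok(w, h):
    """Connected-domain aspect-ratio constraint for the square red corner
    regions: width/height must lie in (0.8, 1.2)."""
    r = w / h
    return 0.8 < r < 1.2
```

In practice the mask would then be cleaned by morphological closing, constrained by `aspect_ok` and area, and ANDed with the original image to produce the ROI.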
步骤C的具体实现过程包括:The specific implementation process of step C includes:
提取左右目的掩模图像中最小外接矩形的顶点坐标,作为四对初始对应点,调用OpenCV中getPerspectiveTransform函数求得初始的透视变换矩阵M,平面透视投影变换关系如式(I)所示:Extract the vertex coordinates of the minimum circumscribed rectangle in the left and right target mask images as four pairs of initial corresponding points, and call the getPerspectiveTransform function in OpenCV to obtain the initial perspective transformation matrix M. The plane perspective projection transformation relationship is shown in formula (I):
其中x=(x,y,1)和x'=(x',y',1)是齐次坐标,~表示成比例,重写为:Where x = (x, y, 1) and x' = (x', y', 1) are homogeneous coordinates, ~ indicates proportionality, rewritten as:
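The equation images for (I) and its rewritten form are not reproduced in this text. From the symbols defined here and the parameters m0..m7 used below, the relation is presumably (a reconstruction, not the original figure):

```latex
% Planar perspective transform (I), reconstructed:
\mathbf{x}' \sim M\mathbf{x},\qquad
M=\begin{bmatrix} m_0 & m_1 & m_2\\ m_3 & m_4 & m_5\\ m_6 & m_7 & 1 \end{bmatrix}
% rewritten in inhomogeneous form:
x' = \frac{m_0 x + m_1 y + m_2}{m_6 x + m_7 y + 1},\qquad
y' = \frac{m_3 x + m_4 y + m_5}{m_6 x + m_7 y + 1}
```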
将右图作为目标图像,对左图进行透视投影变换以逼近右图,为了优化投影矩阵的八个参数,使用M←(E+D)M来迭代更新变换矩阵,其中E为单位矩阵,Take the right image as the target image and apply a perspective projection transformation to the left image to approximate it. To optimize the eight parameters of the projection matrix, the transformation matrix is iteratively updated as M←(E+D)M, where E is the identity matrix and:
式(XXII)中,d0至d7对应于M矩阵中m0至m7每次迭代的更新参数;In formula (XXII), d0 to d7 correspond to the update parameters of each iteration of m0 to m7 in the M matrix;
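Formula (XXII) is likewise an image; given that d0–d7 update m0–m7 and that the update is M←(E+D)M with E the identity, D is presumably:

```latex
% Incremental update matrix (XXII), reconstructed:
D=\begin{bmatrix} d_0 & d_1 & d_2\\ d_3 & d_4 & d_5\\ d_6 & d_7 & 0 \end{bmatrix}
```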
此时,使用新的变换x'~(E+D)Mx对左图图像I₁进行重采样,相当于用变换x″~(E+D)x对重采样后的左图图像Ĩ₁进行变换,即:At this time, the left image I₁ is resampled using the new transformation x'~(E+D)Mx, which is equivalent to applying the transformation x″~(E+D)x to the resampled left image Ĩ₁, that is:
其中,x″=(x″,y″,1)是齐次坐标,重写为:Here x″=(x″,y″,1) is in homogeneous coordinates; rewritten as:
为了恢复准确的透视变换关系,通过最小化两幅图像间的强度误差来估计像素的运动,则强度误差方程如下:In order to restore the accurate perspective transformation relationship, the pixel motion is estimated by minimizing the intensity error between the two images. The intensity error equation is as follows:
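The equation images for (XXVII)/(XXVIII) are not reproduced; consistent with the quantities defined just below (the error eᵢ, the image gradient, Jᵢ, and the update d), the standard form is presumably:

```latex
% Intensity error and its linearization in d, reconstructed:
E(\mathbf{d})=\sum_i\bigl[\tilde I_1(\mathbf{x}''_i)-I_0(\mathbf{x}_i)\bigr]^2
\;\approx\;\sum_i\bigl[\,\mathbf{g}_i^{\top} J_i\,\mathbf{d}+e_i\,\bigr]^2,
\qquad
e_i=\tilde I_1(\mathbf{x}_i)-I_0(\mathbf{x}_i),\quad
\mathbf{g}_i=\nabla\tilde I_1(\mathbf{x}_i)
```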
式(XXVII)、(XXVIII)中,∇Ĩ₁(xᵢ)是重采样后的左图图像Ĩ₁在xᵢ处的图像梯度,xᵢ的取值范围是路牌ROI区域;eᵢ=Ĩ₁(xᵢ)−I₀(xᵢ)是重采样后的左图图像Ĩ₁和目标图像I₀对应点的强度误差;In formulas (XXVII) and (XXVIII), ∇Ĩ₁(xᵢ) is the image gradient of the resampled left image Ĩ₁ at xᵢ, where xᵢ ranges over the road-sign ROI region; eᵢ = Ĩ₁(xᵢ) − I₀(xᵢ) is the intensity error between corresponding points of the resampled left image Ĩ₁ and the target image I₀;
d=(d0,d1,...,d7)是运动更新参数,Jᵢ=J_d(xᵢ)是重采样点坐标x″ᵢ关于d的雅可比矩阵,对应由三维平面的瞬时运动引起的光流,表示为:d = (d0, d1, ..., d7) is the motion update parameter, and Jᵢ = J_d(xᵢ) is the Jacobian matrix of the resampled point coordinates x″ᵢ with respect to d, corresponding to the optical flow induced by the instantaneous motion of the 3D plane, expressed as:
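The Jacobian image is not reproduced; for the 8-parameter projective update evaluated at d = 0, the standard form is presumably:

```latex
% Jacobian of x'' w.r.t. d at d = 0, reconstructed:
J_{\mathbf{d}}(\mathbf{x})=
\begin{bmatrix}
x & y & 1 & 0 & 0 & 0 & -x^2 & -xy\\
0 & 0 & 0 & x & y & 1 & -xy & -y^2
\end{bmatrix}
```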
此时,用最小二乘法得解析解:At this time, the least squares method is used to obtain the analytical solution:
Ad=-b (XXX)Ad=-b (XXX)
其中,海森矩阵:Among them, the Hessian matrix:
累积梯度:Accumulated gradient:
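The images for (XXXI)/(XXXII) are not reproduced; consistent with the text, presumably A = Σᵢ (Jᵢᵀgᵢ)(Jᵢᵀgᵢ)ᵀ and b = Σᵢ eᵢ Jᵢᵀgᵢ. Under that assumption, one least-squares update of the transformation can be sketched in numpy as below (the warping/resampling step itself, done with cv2.warpPerspective in practice, is omitted; function names are mine):

```python
import numpy as np

def projective_jacobian(x, y):
    # J_d(x): Jacobian of the warped coordinate w.r.t. the 8 update
    # parameters d0..d7, evaluated at d = 0 (assumed standard form).
    return np.array([
        [x, y, 1.0, 0.0, 0.0, 0.0, -x * x, -x * y],
        [0.0, 0.0, 0.0, x, y, 1.0, -x * y, -y * y],
    ])

def gauss_newton_update(points, errors, grads):
    """One update d solving A d = -b (formula (XXX)).
    points: (N,2) pixel coordinates x_i inside the sign ROI
    errors: (N,) intensity errors e_i
    grads:  (N,2) image gradients g_i of the resampled left image"""
    A = np.zeros((8, 8))
    b = np.zeros(8)
    for (x, y), e, g in zip(points, errors, grads):
        gJ = g @ projective_jacobian(x, y)   # row vector g_i^T J_i, shape (8,)
        A += np.outer(gJ, gJ)                # Hessian accumulation (assumed (XXXI))
        b += e * gJ                          # gradient accumulation (assumed (XXXII))
    # small ridge term keeps the solve well-posed on textureless regions
    return np.linalg.solve(A + 1e-6 * np.eye(8), -b)

def update_matrix(M, d):
    # M <- (E + D) M, with E the identity and D built from d0..d7
    D = np.array([[d[0], d[1], d[2]],
                  [d[3], d[4], d[5]],
                  [d[6], d[7], 0.0]])
    return (np.eye(3) + D) @ M
```

Iterating resample → error → solve → update until the error plateaus yields the sub-pixel perspective matrix used in step D.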
步骤D中,根据双目相机的成像原理可得:In step D, according to the imaging principle of the binocular camera, we can get:
其中,X是指相机到路牌中心的横向距离,根据数据库中路牌中心与车道的距离信息,推算出车辆所在车道;Z是指相机到路牌平面的法线距离,即车辆到路牌的距离;Among them, X refers to the lateral distance from the camera to the center of the road sign. The lane where the vehicle is located is calculated based on the distance information between the center of the road sign and the lane in the database; Z refers to the normal distance from the camera to the road sign plane, that is, the distance from the vehicle to the road sign;
双目相机的成像模型如图10所示,OL、OR为双目相机的左、右光圈中心,OL、OR之间的距离是双目相机的基线b,方框为成像平面,f为焦距;P(X,Y,Z)为三维空间中的一点(取左相机光心为原点坐标),P(X,Y,Z)在双目相机各成一像,记作PL和PR,校正后,PL和PR在成像平面x轴的坐标为uL和uR(以图像主点为原点,所以uR为负数),则上述步骤求得的视差disp=uL-uR;The imaging model of the binocular camera is shown in Figure 10. OL and OR are the left and right aperture centers of the binocular camera, the distance between OL and OR is the baseline b of the binocular camera, the box is the imaging plane, and f is the focal length; P(X, Y, Z) is a point in three-dimensional space (the optical center of the left camera is taken as the origin coordinate), and P(X, Y, Z) forms an image in each binocular camera, recorded as PL and PR . After correction, the coordinates of PL and PR on the x-axis of the imaging plane are uL and uR (the image principal point is taken as the origin, so uR is a negative number), then the parallax disp obtained in the above steps = uL - uR ;
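The formula image in step D is not reproduced; under the stated conventions (left optical center as origin, disp = uL − uR), it is presumably Z = f·b/disp and, by similar triangles, X = Z·uL/f. A minimal sketch:

```python
def locate_from_disparity(f, baseline, uL, uR):
    """Recover camera-to-sign distances from the stereo model of FIG. 10.
    f: focal length in pixels; baseline: b in meters; uL, uR: x-coordinates
    of the sign center in the left/right images, measured from each
    principal point (uR negative, per the description).
    Assumed reconstruction of the missing formula: Z = f*b/disp, X = Z*uL/f."""
    disp = uL - uR            # parallax from the registration step
    Z = f * baseline / disp   # normal distance to the sign plane
    X = Z * uL / f            # lateral distance to the sign center
    return X, Z
```

X is then compared against the lane offsets stored in the database to decide which lane the vehicle occupies.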
至此,结合相机标定获得的内外参数,得到了相机相对于路牌的横向和法线距离,即求得了车辆相对路牌距离和所在车道,实现车辆自定位。At this point, combined with the internal and external parameters obtained by camera calibration, the lateral and normal distances of the camera relative to the road sign are obtained, that is, the distance of the vehicle relative to the road sign and the lane it is in are obtained, thus achieving vehicle self-positioning.
实施例3Example 3
一种计算机设备,包括存储器和处理器,存储器存储有计算机程序,处理器执行计算机程序时实现实施例1或2基于梯度的路牌配准定位方法的步骤。A computer device comprises a memory and a processor, wherein the memory stores a computer program, and the processor, when executing the computer program, implements the steps of the gradient-based road sign registration and positioning method of Embodiment 1 or 2.
实施例4Example 4
一种计算机可读存储介质,其上存储有计算机程序,计算机程序被处理器执行时实现实施例1或2基于梯度的路牌配准定位方法的步骤。A computer-readable storage medium stores a computer program, which, when executed by a processor, implements the steps of the gradient-based road sign registration and positioning method of Embodiment 1 or 2.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310229650.2A CN116152342A (en) | 2023-03-10 | 2023-03-10 | Guideboard registration positioning method based on gradient |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116152342A true CN116152342A (en) | 2023-05-23 |
Family
ID=86350635
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310229650.2A Pending CN116152342A (en) | 2023-03-10 | 2023-03-10 | Guideboard registration positioning method based on gradient |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116152342A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116681885A (en) * | 2023-08-03 | 2023-09-01 | 国网安徽省电力有限公司超高压分公司 | Infrared image target identification method and system for power transmission and transformation equipment |
CN116895030A (en) * | 2023-09-11 | 2023-10-17 | 西华大学 | Insulator detection method based on target detection algorithm and attention mechanism |
CN117746649A (en) * | 2023-11-06 | 2024-03-22 | 南京城建隧桥智慧管理有限公司 | Tunnel traffic flow detection system and method based on YOLOv8 algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||