CN116152342A - Guideboard registration positioning method based on gradient - Google Patents


Info

Publication number
CN116152342A
CN116152342A (application CN202310229650.2A)
Authority
CN
China
Prior art keywords
guideboard
image
gradient
positioning
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310229650.2A
Other languages
Chinese (zh)
Inventor
陈辉
刘莹
吕传栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN202310229650.2A priority Critical patent/CN116152342A/en
Publication of CN116152342A publication Critical patent/CN116152342A/en
Pending legal-status Critical Current

Classifications

    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06N3/04 Neural networks; architecture, e.g. interconnection topology
    • G06N3/08 Neural networks; learning methods
    • G06T7/33 Determination of transform parameters for the alignment of images (image registration) using feature-based methods
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters (camera calibration)
    • G06T7/90 Determination of colour characteristics
    • G06V10/25 Determination of region of interest [ROI] or volume of interest [VOI]
    • G06V10/774 Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/82 Image or video recognition or understanding using neural networks
    • G06V20/582 Recognition of traffic signs
    • G06T2207/20081 Training; learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30244 Camera pose
    • G06T2207/30252 Vehicle exterior; vicinity of vehicle
    • Y02T10/40 Engine management systems


Abstract

The invention relates to a gradient-based road sign (guideboard) registration and positioning method, comprising: A. building a database; B. coarse positioning and extraction of the road sign: a. acquiring road images; b. performing target detection on the road images with a trained YOLOv8 network model to coarsely locate the target road sign, identify the category of the current sign, retrieve its database information, and obtain the coarse-positioning region image; c. converting the coarse-positioning region image from RGB space to HSV space; d. thresholding the converted image in HSV space to obtain the ROI image of the road-sign region; C. gradient-based registration optimization; D. vehicle pose calculation. The invention applies the latest YOLO network model to the coarse positioning of road signs, solving the problem of detecting the road-sign target region in complex scenes; it improves the speed and reliability of the detection results, preserves the integrity of the road-sign region, and effectively eliminates interference from other similar regions.

Description

A gradient-based road sign registration and positioning method

Technical Field

The invention relates to a gradient-based road sign registration and positioning method and belongs to the fields of digital image processing and computer vision.

Background Art

With the rapid growth of the national economy, car ownership has increased quickly and traffic congestion has become increasingly serious; intelligent transportation systems and autonomous driving technologies have become hot research topics at home and abroad. Among their enabling techniques, vehicle self-positioning is fundamental and extremely important. In underground parking lots, tunnels, or densely built urban centers, satellite signals are blocked, so the accuracy of satellite-based on-board positioning systems degrades severely or the systems fail to work at all. Against this background, the present work studies a road-sign-based vehicle positioning system in depth, aiming to improve the accuracy and reliability of on-board positioning in such areas.

The earliest and most widely deployed positioning technology is the Global Positioning System (GPS), but its performance drops markedly in dense urban areas. A single sensor cannot deliver interference-free, high-precision positioning when vehicles are dense; multi-sensor fusion systems are costly and inflexible, which hinders their large-scale productization; and most positioning systems that cascade sensors accumulate drift in complex urban environments and on congested roads, leading to large positioning errors.

Given the limitations of the above methods, and with the development of computer vision, visual sensors are increasingly used for vehicle positioning, and systems combining binocular cameras, monocular cameras, and various other sensors keep emerging. Vehicle positioning with a monocular camera, however, generally requires building a more complex and precise comparison database and achieves lower accuracy.

Summary of the Invention

In view of the deficiencies of the prior art, the present invention provides a gradient-based road sign registration and positioning method.

The present invention adopts a binocular-vision vehicle self-positioning algorithm and, combined with GPS or offline-downloaded route planning and a road-sign database, realizes autonomous navigation; in particular, when vehicles are dense at intersections and lane markings are occluded, road signs are used to achieve precise vehicle self-positioning.

Terminology:

1. RGB color space: composed of the three basic color channels red (R), green (G), and blue (B); pixel values in each channel range over [0, 255], and different colors are obtained by varying and superimposing the three channel values.

2. HSV color space: describes colors by three attributes, hue (H), saturation (S), and value (V), which is closer to the way the human eye perceives color.

3. getPerspectiveTransform function: computes the perspective transformation matrix from a source image to a target image from four pairs of corresponding point coordinates.
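For illustration, the matrix that getPerspectiveTransform returns can be computed with numpy alone; the following is a sketch of the underlying 8-unknown linear system, not the OpenCV implementation, and the function name is illustrative:

```python
import numpy as np

def get_perspective_transform(src, dst):
    # Solve the linear system that maps four source points onto four
    # destination points; same role as OpenCV's cv2.getPerspectiveTransform.
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    m = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(m, 1.0).reshape(3, 3)  # the last entry is fixed to 1
```

Mapping the unit square onto a square of side 2, for example, yields the pure scaling matrix diag(2, 2, 1).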

4. Camera intrinsic matrix: transforms 3D camera coordinates into 2D homogeneous image coordinates; it can be written as

    K = [ f_x   0   u_0
           0   f_y  v_0
           0    0    1  ]

where f is the focal length, f_x is the focal length expressed in pixels along the x-axis, f_y is the focal length expressed in pixels along the y-axis, and (u_0, v_0) is the actual position of the image principal point. These parameters are determined solely by the camera itself and do not change with the external environment.

5. Extrinsic matrix: realizes the transformation from the world coordinate system to the camera coordinate system and can be written as

    [ R  T
      0  1 ]

where R is the rotation matrix, each column vector of which gives the direction of a world-coordinate axis in the camera frame, and T is the translation, i.e. the world origin expressed in the camera frame.

6. Perspective transformation matrix: when two cameras image the same scene, a unique geometric correspondence exists between the two images and can be expressed as a matrix. The correspondence is an exact perspective transformation in two cases: the two images are taken with coincident optical centers, or the photographed scene lies on a single plane, as with the road signs used in the present invention.

7. Homogeneous coordinates: an originally n-dimensional vector represented by an (n+1)-dimensional vector, used in the coordinate systems of projective geometry. The homogeneous form of a 2D coordinate (x, y) is usually written as the 3D coordinate (hx, hy, h), where h is a scale factor that can take any value and is generally set to 1 so that the two representations agree.
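The convention can be illustrated in a few lines (helper names are illustrative):

```python
import numpy as np

def to_homogeneous(x, y, h=1.0):
    # (x, y) -> (hx, hy, h); h is the free scale factor, usually 1
    return np.array([h * x, h * y, h])

def from_homogeneous(p):
    # Divide through by the last component to recover the 2-D point
    return p[:2] / p[2]
```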

8. Image registration: the process of matching and overlaying two or more images acquired at different times, by different sensors (imaging devices), or under different conditions (weather, illumination, camera position and angle, etc.).

9. Optical flow: the instantaneous velocity of a space-moving object's pixel motion on the observation imaging plane; over a very small time interval (e.g. between two consecutive video frames), it equals the displacement of the target point.

The technical solution of the present invention is as follows:

A gradient-based road sign registration and positioning method comprises the following steps:

A. Building a database

The database contains the following information for each road sign: its geographic coordinates, the distances from the sign center to the lanes, and its background color. The geographic coordinates are the longitude and latitude of the sign; the sign-center-to-lane distance information is the lateral distance from the sign center to each lane line; the background color is the color of the sign.
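The database record described above can be sketched as a minimal Python mapping; the field names and sample values here are illustrative, not prescribed by the patent:

```python
# One record per road sign, keyed by the category label returned by the detector.
sign_database = {
    "B4": {
        "longitude": 117.02,      # illustrative geographic coordinates
        "latitude": 36.68,
        # lateral distance (m) from the sign center to each lane line
        "lane_offsets_m": [-5.25, -1.75, 1.75, 5.25],
        "background_color": "blue",
    },
}

def lookup_sign(category):
    # Retrieve the stored information for a detected sign category.
    return sign_database.get(category)
```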

B. Coarse positioning and extraction of the road sign

a. A binocular camera is mounted at the front of the vehicle to acquire road images in real time.

b. The road images obtained in step a are fed to the trained YOLOv8 network model for target detection, coarsely locating the target road sign, identifying the category of the current sign, retrieving its database information, and obtaining the coarse-positioning region image.

c. The coarse-positioning region image obtained in step b is converted from RGB space to HSV space.

d. The image processed in step c is thresholded in HSV space, and a mask image is obtained via morphological operations and connected-component aspect-ratio constraints, yielding the ROI image of the road-sign region.

C. Gradient-based registration optimization

From the mask images, four pairs of initial corresponding points are obtained, giving an initial perspective transformation matrix; a gradient-based optimization algorithm then performs sub-pixel registration on the Gaussian-smoothed left and right road-sign ROI images, and iterative updates yield an accurate perspective transformation matrix.

D. Vehicle pose calculation

The overall displacement of the road sign, i.e. the disparity, is computed, from which the vehicle's position information is obtained: the lateral and normal distances of the camera relative to the sign, i.e. the vehicle's distance from the sign and the lane it occupies.

According to a preferred embodiment of the present invention, the specific implementation of step b comprises:

(1) Training the YOLOv8 network model:

Obtaining the training set: pictures containing road signs are selected and annotated with labels; the labels comprise the coordinates of the road signs in the pictures and the categories of the signs.

The YOLOv8 network model comprises a Backbone unit, a Neck unit, and a Head unit.

The Backbone unit comprises convolution modules, C2f modules, and an SPPF module; the convolution module (Conv module) comprises a convolution layer, a batch-normalization layer, and a SiLU activation layer; the C2f module comprises convolution modules, Bottleneck modules, and a residual structure; the SPPF module comprises convolution layers and pooling layers.

The Neck unit comprises convolution modules, C2f modules, and upsampling layers.

The Head unit comprises detection modules (Detect modules); a detection module comprises convolution modules and convolution layers.

Each picture in the training set is scaled and convolved by the Backbone unit to obtain initial feature maps; the Neck unit re-extracts the initial feature maps to obtain intermediate feature maps at different scales; the intermediate feature maps at different scales are fed into the Head unit to obtain the road-sign coordinates predicted by the YOLOv8 network model.

A loss is computed from the road-sign coordinates predicted by the YOLOv8 network model and the ground-truth coordinates; the gradients for optimizing the network are obtained from this loss and used to update the network weights. As the loss decreases, the prediction accuracy of the network rises, yielding a trained YOLOv8 network model.

The road images obtained in step a are then processed by the trained YOLOv8 network model for target detection, achieving coarse positioning of the target road sign, identifying the category of the current sign, and retrieving its database information.

(2) In the actual inference stage, the road image obtained in step a is input into the trained YOLOv8 network model to obtain the predicted coarse-positioning coordinates of the road sign and the category of the sign.

(3) The coarse-positioning region given by the coarse road-sign coordinates is set to 1 and all other regions to 0, producing the coarse-positioning region image.
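Step (3) can be sketched as follows (numpy; the (x1, y1, x2, y2) box layout is an assumption about the detector output):

```python
import numpy as np

def coarse_region_image(shape, box):
    # shape: (height, width); box: (x1, y1, x2, y2) predicted by the detector.
    x1, y1, x2, y2 = box
    region = np.zeros(shape, dtype=np.uint8)
    region[y1:y2, x1:x2] = 1  # coarse-positioning area set to 1, rest stays 0
    return region
```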

According to a preferred embodiment of the present invention, in step c, the pixels satisfying the following threshold ranges are separated out in HSV space:

The saturation threshold is 0.35 < S < 1 and the value threshold is 0.35 < V < 1; the hue threshold depends on the sign: for rectangular blue signs, 200 < H < 280, and for extracting the red corner regions of the self-made standard signs, H > 330 or H < 30.
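These thresholds can be applied with the standard-library colorsys module (a sketch; note that colorsys reports H in [0, 1), so it is rescaled to degrees here, and the function name is illustrative):

```python
import colorsys

def sign_mask(pixels, sign="blue"):
    # Apply the thresholds above: 0.35 < S < 1, 0.35 < V < 1, and
    # 200 < H < 280 for blue rectangular signs, or H > 330 / H < 30 for the
    # red corner regions of the self-made standard signs (H in degrees).
    out = []
    for r, g, b in pixels:  # pixels as (R, G, B) in 0..255
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        h *= 360
        hue_ok = (200 < h < 280) if sign == "blue" else (h > 330 or h < 30)
        out.append(255 if hue_ok and 0.35 < s < 1 and 0.35 < v < 1 else 0)
    return out
```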

According to a preferred embodiment of the present invention, in step d:

Thresholding means: pixels within the threshold ranges are set to 255 and all others to 0, giving a preliminary mask image.

Morphological operations mean: library functions from morphology are called to remove external noise and internal holes, and a closing operation repairs any edge discontinuities and eliminates most interfering regions.

The connected-component aspect-ratio constraint means: the candidate regions are constrained by the aspect ratio and area of the target region, leaving a final target region free of interference.

The minimum enclosing rectangle of the target region is computed and set to 255 to obtain the mask image of the road-sign region; ANDing the mask image with the original road-sign image finally yields the ROI image of the road-sign region.
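The final masking step can be sketched with numpy (an axis-aligned stand-in for the minimum-enclosing-rectangle and bitwise-AND operations; real code would typically use OpenCV, and the helper name is illustrative):

```python
import numpy as np

def roi_from_mask(image, mask):
    # Fill the axis-aligned bounding rectangle of the mask with 255,
    # then AND it with the original image to keep only the sign region.
    ys, xs = np.nonzero(mask)
    rect = np.zeros_like(mask)
    rect[ys.min():ys.max() + 1, xs.min():xs.max() + 1] = 255
    return np.where(rect == 255, image, 0)
```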

According to a preferred embodiment of the present invention, the specific implementation of step C comprises:

The vertex coordinates of the minimum enclosing rectangles in the left and right mask images are extracted as four pairs of initial corresponding points, giving the initial perspective transformation matrix M. The planar perspective projection relationship is given by formula (I):

    x' ~ M x    (I)

where x = (x, y, 1) and x' = (x', y', 1) are homogeneous coordinates and ~ denotes equality up to scale. With M = [m0 m1 m2; m3 m4 m5; m6 m7 1], this can be rewritten componentwise as:

    x' = (m0·x + m1·y + m2) / (m6·x + m7·y + 1)    (II)

    y' = (m3·x + m4·y + m5) / (m6·x + m7·y + 1)    (III)

Taking the right image as the target image, the left image is warped by perspective projection to approximate the right image, and the transformation matrix is updated iteratively as M ← (E + D)M, where E is the identity matrix and

    D = [ d0 d1 d2
          d3 d4 d5
          d6 d7 0 ]    (V)

In formula (V), d0 through d7 are the per-iteration update parameters corresponding to m0 through m7 of the matrix M.
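The multiplicative update M ← (E + D)M of formula (V) can be sketched directly (a numpy sketch; D places d0 through d7 row-major with its corner entry fixed at 0):

```python
import numpy as np

def update_homography(M, d):
    # d = (d0, ..., d7); refine M as M <- (E + D) M with E the identity.
    D = np.array([[d[0], d[1], d[2]],
                  [d[3], d[4], d[5]],
                  [d[6], d[7], 0.0]])
    return (np.eye(3) + D) @ M
```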

Resampling the left image I1 with the new transformation x' ~ (E + D)Mx is then equivalent to applying the transformation x'' ~ (E + D)x to the already-resampled left image Ĩ1, defined by

    Ĩ1(x) = I1(Mx)

where x'' = (x'', y'', 1) is in homogeneous coordinates; written out componentwise:

    x'' = ((1 + d0)·x + d1·y + d2) / (d6·x + d7·y + 1)

    y'' = (d3·x + (1 + d4)·y + d5) / (d6·x + d7·y + 1)

The pixel motion is estimated by minimizing the intensity error between the two images:

    E = Σ_i [ Ĩ1(x''_i) − I0(x_i) ]²    (X)

      ≈ Σ_i [ g_i^T J_i d + e_i ]²    (XI)

In formulas (X) and (XI), g_i = ∇Ĩ1(x_i) is the image gradient of the resampled left image Ĩ1 at x_i, with x_i ranging over the road-sign ROI; e_i = Ĩ1(x_i) − I0(x_i) is the intensity error between corresponding points of the resampled left image Ĩ1 and the target image I0;

d = (d0, d1, ..., d7) is the motion update parameter, and J_i = J_d(x_i) is the Jacobian of the resampled point coordinate x''_i with respect to d, corresponding to the optical flow induced by the instantaneous motion of the 3D plane:

    J_d(x) = ∂x''/∂d = [ x  y  1  0  0  0  −x²  −x·y
                         0  0  0  x  y  1  −x·y  −y² ]    (XII)

An analytical solution is then obtained by least squares:

    A d = −b    (XIII)

where the Hessian is

    A = Σ_i J_i^T g_i g_i^T J_i

and the accumulated gradient is

    b = Σ_i e_i J_i^T g_i
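Assembling A and b and solving for d can be sketched as follows, assuming the gradients g_i and intensity errors e_i have already been sampled over the ROI (numpy-only; a least-squares solve stands in for the analytic solution, and all names are illustrative):

```python
import numpy as np

def jacobian(x, y):
    # J_d(x) at d = 0: the optical flow induced by each homography parameter.
    return np.array([[x, y, 1, 0, 0, 0, -x * x, -x * y],
                     [0, 0, 0, x, y, 1, -x * y, -y * y]])

def solve_update(points, grads, errors):
    # Accumulate the Hessian A and gradient b over the ROI, then solve
    # A d = -b in the least-squares sense (robust to rank deficiency).
    A = np.zeros((8, 8))
    b = np.zeros(8)
    for (x, y), g, e in zip(points, grads, errors):
        Jg = jacobian(x, y).T @ g      # 8-vector J_i^T g_i
        A += np.outer(Jg, Jg)
        b += e * Jg
    return np.linalg.lstsq(A, -b, rcond=None)[0]
```

With all intensity errors zero, the update is the zero vector, i.e. the registration has converged.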

According to a preferred embodiment of the present invention, in step D, the imaging principle of the binocular camera gives:

    Z = f · b / disp

    X = Z · (u_L − u_0) / f_x

where X is the lateral distance from the camera to the center of the road sign; combined with the sign-center-to-lane distances stored in the database, the lane occupied by the vehicle is deduced. Z is the normal distance from the camera to the road-sign plane, i.e. the distance from the vehicle to the sign.

O_L and O_R are the optical centers of the left and right cameras of the binocular rig; the distance between O_L and O_R is the baseline b, and f is the focal length. A point P(X, Y, Z) in 3D space is imaged by each camera, giving P_L and P_R; after rectification, the x-axis image coordinates of P_L and P_R are u_L and u_R, and the disparity is disp = u_L − u_R.

Combining this with the intrinsic and extrinsic parameters obtained from camera calibration yields the lateral and normal distances of the camera relative to the road sign, i.e. the vehicle's distance from the sign and the lane it occupies, thereby achieving vehicle self-positioning.
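Under the rectified stereo model just described, the position computation reduces to a few lines (a sketch; the lateral-offset relation X = Z·(u_L − u_0)/f is the standard pinhole similar-triangles formula assumed here, and all names are illustrative):

```python
def locate_vehicle(u_left, u_right, f, baseline, u0):
    # Disparity from the matched sign centers, then similar triangles:
    # Z = f * b / disp (normal distance), X = Z * (u_left - u0) / f (lateral).
    disp = u_left - u_right
    Z = f * baseline / disp
    X = Z * (u_left - u0) / f
    return X, Z
```

For example, with f = 1000 px, a 0.5 m baseline, a principal point at column 640, and matched sign centers at columns 700 and 650, the sign lies 10 m ahead and 0.6 m to the side.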

A computer device comprises a memory and a processor; the memory stores a computer program, and when the processor executes the computer program it implements the steps of the gradient-based road sign registration and positioning method.

A computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the gradient-based road sign registration and positioning method.

The beneficial effects of the present invention are:

1. The invention applies the latest version of the YOLO (You Only Look Once, version 8) network model to the coarse positioning of road signs and combines it with color extraction, solving the problem of detecting the road-sign target region in complex scenes; it improves the speed and reliability of the detection results, preserves the integrity of the road-sign region, and effectively eliminates interference from other similar regions.

2. The invention fully exploits the fact that a road sign is planar, matching the sign as a single feature to obtain the perspective transformation matrix between the left and right sign regions and computing the overall optical flow between the planes for stereo matching; this avoids the redundancy of point-by-point computation, making the registration more robust and the detection results more accurate and faster.

3. The invention proposes a simple database system whose structure is simple, whose data volume is small, and which is easy to maintain; its contents mainly comprise the location information and background color of each road sign and the lane information of the road at the sign, which suffices for lane-level vehicle positioning.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flow chart of the gradient-based road sign registration and positioning method of the present invention;

FIG. 2 is a schematic diagram of a blue rectangular road guide sign;

FIG. 3(a) is a schematic diagram of the self-made standard road sign labeled A1;

FIG. 3(b) is a schematic diagram of the self-made standard road sign labeled A2;

FIG. 3(c) is a schematic diagram of the self-made standard road sign labeled B3;

FIG. 3(d) is a schematic diagram of the self-made standard road sign labeled B4;

FIG. 3(e) is a schematic diagram of the self-made standard road sign labeled C5;

FIG. 3(f) is a schematic diagram of the self-made standard road sign labeled C6;

FIG. 4(a) is a simplified diagram of the YOLOv8 network model;

FIG. 4(b) is a structural diagram of the YOLOv8 network model;

FIG. 4(c) is a detailed structural diagram of the Conv module of YOLOv8;

FIG. 4(d) is a detailed structural diagram of the C2f module of YOLOv8;

FIG. 4(e) is a detailed structural diagram of the Bottleneck module of YOLOv8;

FIG. 4(f) is a detailed structural diagram of the SPPF module of YOLOv8;

FIG. 4(g) is a detailed structural diagram of the Detect module of YOLOv8;

FIG. 5 shows the detection of a blue rectangular road sign by YOLOv8;

FIG. 6 shows the detection of a self-made standard road sign by YOLOv8;

FIG. 7 is the ROI image of a blue rectangular road sign;

FIG. 8 is the ROI image of a self-made standard road sign;

FIG. 9 shows the error variation during iterative optimization;

FIG. 10 is the imaging model of the binocular camera.

DETAILED DESCRIPTION

The present invention is further described below with reference to the embodiments and the accompanying drawings, without being limited thereto.

Example 1

A gradient-based road sign registration and positioning method. The road sign is any sign in the database, mounted on the right side of the road and perpendicular to the road surface. A blue rectangular road guide sign (shown in FIG. 2) and self-made standard road signs are taken as examples; a self-made standard sign is a square planar sign with a white background, a square red area at each of its four corners, and an indicating character consisting of a letter plus a digit, marked in black. As shown in FIG. 1, the method comprises the following steps:

A. Building the database

The database stores the following information for each road sign: geographic coordinates, the distance between the sign center and the lanes, and the background color. The geographic coordinates are the longitude and latitude of the sign; the distance information is the lateral distance from the sign center to each lane line; the background color is the color of the sign. The present invention uses a binocular camera and is applicable to different road signs, so the physical size of each sign need not be collected in advance.
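As a sketch, one record of such a database might look as follows (all field names, sample values and the lane-lookup helper are hypothetical illustrations, not part of the patent):

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class SignRecord:
    longitude: float
    latitude: float
    background_color: str         # e.g. "blue" or "white"
    lane_offsets_m: List[float]   # lateral distance from sign center to each lane line

# hypothetical database keyed by sign label
db = {"A1": SignRecord(117.0, 36.7, "white", [1.5, 5.25, 9.0])}
rec = db["A1"]

def lane_index(rec: SignRecord, x: float) -> Optional[int]:
    """Given the measured lateral camera-to-sign distance X, return the index
    of the lane interval containing X, or None if X is outside all lanes."""
    for i in range(len(rec.lane_offsets_m) - 1):
        if rec.lane_offsets_m[i] <= x < rec.lane_offsets_m[i + 1]:
            return i
    return None
```

Once step D yields the lateral distance X, a lookup such as `lane_index(rec, X)` would give the lane the vehicle occupies.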

B. Coarse positioning and extraction of the road sign

a. A binocular camera is mounted at the front of the vehicle to acquire road images in real time. The optical axes of the two cameras are parallel and point in the driving direction; the left and right cameras have equal focal lengths and coincident positive x-axis directions.

b. The road image obtained in step a is fed to the trained YOLOv8 network model for target detection, achieving coarse positioning of the target sign and identifying its category. Seven sign classes are used in this embodiment: one class of blue rectangular signs and six classes of self-made standard signs, labeled A1, A2, B3, B4, C5 and C6, shown in FIG. 3(a) through FIG. 3(f) respectively.

c. The coarse-positioning region image obtained in step b is converted from RGB space to HSV space. In RGB space the three color components are highly correlated, hard to analyze, and sensitive to illumination, whereas HSV space can suppress illumination effects by adjusting saturation and value, separating a given color more accurately.

d. The image from step c is thresholded in HSV space, and a mask image is obtained through morphological operations and a connected-component aspect-ratio constraint, yielding the ROI image of the sign region.

C. Gradient-based registration optimization

From the obtained mask images, four pairs of initial corresponding points are extracted, giving an initial perspective transformation matrix. A gradient-based optimization algorithm then performs sub-pixel registration of the Gaussian-smoothed left and right sign-region ROI images, and iterative updates yield an accurate perspective transformation matrix.

D. Vehicle pose calculation

The iterative updates above yield an accurate perspective transformation matrix. From the sign center coordinates and formula (I), the overall displacement of the sign, that is, the disparity, is obtained, from which the vehicle position information is derived, including the lateral and normal distances of the camera relative to the sign, namely the distance of the vehicle from the sign and the lane it occupies.

Example 2

A gradient-based road sign registration and positioning method according to Example 1, differing in the following:

The specific implementation of step b comprises:

(1) Training the YOLOv8 network model:

Obtaining a training set: pictures containing road signs are selected and labeled; each label comprises the coordinates of the sign in the picture and the sign category.

Because the original model cannot reliably recognize targets such as road signs, for the practical problem addressed by the present invention, 100 pictures containing signs were selected for each sign type for training. The pictures cover a wide variety of sign scenes, which ensures detection accuracy in new scenes. The training labels are the sign coordinates and categories, so that the required coordinate and category information is available at inference time for the subsequent steps.

As shown in FIG. 4(a) and FIG. 4(b), the YOLOv8 network model comprises a Backbone unit, a Neck unit and a Head unit.

The Backbone unit comprises convolution (Conv) modules, C2f modules and an SPPF module. As shown in FIG. 4(c), the Conv module comprises a convolution layer, a batch-normalization layer and a SiLU activation function layer; the C2f module, detailed in FIG. 4(d), comprises Conv modules, Bottleneck modules and a residual structure, with the Bottleneck module detailed in FIG. 4(e); the SPPF module, detailed in FIG. 4(f), comprises convolution and pooling layers.

After the input is scaled to a height and width of 640, the Backbone unit produces initial feature maps with heights and widths of 80, 40 and 20.

The Neck unit comprises Conv modules, C2f modules and upsampling layers; it produces intermediate feature maps with heights and widths of 80, 40 and 20.

The Head unit comprises detection (Detect) modules; as shown in FIG. 4(g), the Detect module comprises Conv modules and convolution layers. The Head unit outputs the predicted target categories and target coordinates.

Each picture in the training set is scaled (to a height and width of 640) and convolved by the Backbone unit to obtain initial feature maps; the Neck unit re-extracts these into intermediate feature maps at different scales; the intermediate feature maps at different scales are fed to the Head unit to obtain the sign coordinates predicted by the YOLOv8 network model.
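The three output resolutions are consistent with detection strides of 8, 16 and 32 on a 640x640 input (the strides are an assumption based on common YOLO designs, not stated in the text):

```python
input_size = 640
strides = (8, 16, 32)  # assumed YOLOv8 detection strides
feature_sizes = [input_size // s for s in strides]
print(feature_sizes)   # [80, 40, 20]
```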

The loss is computed from the sign coordinates predicted by the YOLOv8 network model and the ground-truth sign coordinates, the gradient for optimizing the YOLOv8 network model is obtained from the loss (FIG. 9 shows the error variation during iterative optimization), and the model weights are updated; as the loss continuously decreases, the prediction accuracy of the network continuously increases, yielding a well-trained YOLOv8 network model.

The road image obtained in step a is fed to the trained YOLOv8 network model for target detection, achieving coarse positioning of the target sign, identifying the category of the current sign, and retrieving the corresponding database information.

(2) In the actual inference stage, the road image obtained in step a is input to the trained YOLOv8 network model to obtain the predicted coarse-positioning coordinates of the sign and its type. FIG. 5 shows YOLOv8 detection of a blue rectangular sign; FIG. 6 shows YOLOv8 detection of a self-made standard sign.

(3) The coarse-positioning region given by the predicted sign coordinates is set to 1 and all other regions to 0, producing the coarse-positioning region image.
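Step (3) amounts to rasterizing the predicted bounding box into a binary image; a minimal numpy sketch (function name and box values illustrative):

```python
import numpy as np

def coarse_mask(shape, box):
    """Binary mask: 1 inside the detected box (x1, y1, x2, y2), 0 elsewhere."""
    m = np.zeros(shape, dtype=np.uint8)
    x1, y1, x2, y2 = box
    m[y1:y2, x1:x2] = 1
    return m

# e.g. a 480x640 image with a detection at (100, 50)-(300, 200)
mask = coarse_mask((480, 640), (100, 50, 300, 200))
```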

In step c, the three color components in RGB space are highly correlated, hard to analyze, and sensitive to illumination, while HSV space can suppress illumination effects by adjusting saturation and value and separate a given color more accurately. In HSV space, the pixels satisfying the following threshold ranges are separated out:

From prior knowledge and experimental measurement, the threshold range of the saturation S is 0.35 < S < 1 and that of the value V is 0.35 < V < 1. The threshold range of the hue H depends on the sign: for the rectangular blue sign, 200 < H < 280; for extracting the four red corner areas of a self-made standard sign, H > 330 or H < 30.
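A small sketch of the stated thresholds as a per-pixel predicate (H in degrees on [0, 360), S and V normalized to [0, 1]; the function name is illustrative). Note that if this is implemented with OpenCV, H is stored halved on [0, 179] and S, V on [0, 255], so the bounds must be rescaled accordingly:

```python
def in_sign_range(h, s, v, color):
    """Patent thresholds: 0.35 < S < 1, 0.35 < V < 1, and a hue range
    depending on the sign color ('blue' rectangle or 'red' corner areas)."""
    if not (0.35 < s < 1 and 0.35 < v < 1):
        return False
    if color == "blue":
        return 200 < h < 280
    if color == "red":
        return h > 330 or h < 30   # red wraps around the hue circle
    raise ValueError(color)
```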

In step d, thresholding means: pixels within the threshold ranges are set to 255 and all other pixels to 0, giving a preliminary mask image.

Morphological operations mean: library functions of the morphology module are called to remove external noise and internal holes, and a closing operation resolves possible edge discontinuities, eliminating most interference regions and ensuring that the sign region is a single complete connected component.

The connected-component aspect-ratio constraint means: candidate regions are constrained by the aspect ratio and area of the target region; experiments show that this yields the final target region free of interference.

Because the four corner areas of the self-made red standard sign used in the present invention are red squares, i.e. of aspect ratio 1, the aspect ratio can be restricted to between 0.8 and 1.2; moreover, since the sign's red areas are the largest red regions inside the coarse-positioning region, an area constraint can be applied, keeping the four regions of largest area as the final target regions.
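The aspect-ratio and area constraints can be sketched as a filter over candidate connected-component boxes (boxes given as (x, y, w, h); names and sample values illustrative):

```python
def select_corner_regions(boxes, lo=0.8, hi=1.2, keep=4):
    """Filter candidate components (x, y, w, h) by the aspect-ratio
    constraint lo < w/h < hi, then keep the `keep` largest by area."""
    ok = [b for b in boxes if lo < b[2] / b[3] < hi]
    ok.sort(key=lambda b: b[2] * b[3], reverse=True)
    return ok[:keep]

# five roughly square candidates plus one elongated interference region
cands = [(0, 0, 20, 21), (5, 5, 50, 10), (9, 9, 22, 20),
         (1, 1, 5, 5), (3, 3, 18, 19), (7, 7, 30, 29)]
corners = select_corner_regions(cands)
```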

At this point all interference has been excluded: the four target regions are extracted, their minimum circumscribed rectangle is computed and set to 255 to obtain the mask image of the whole sign region, and an AND operation between the mask image and the original sign image finally yields the ROI image of the sign region. FIG. 7 shows the ROI image of a blue rectangular sign; FIG. 8 shows the ROI image of a self-made standard sign.
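A minimal numpy sketch of this last step, building the mask from the bounding rectangle of the four detected corner regions and ANDing it with the original image (an axis-aligned rectangle is assumed here for simplicity; the function name is illustrative):

```python
import numpy as np

def sign_roi(image, corner_mask):
    """corner_mask: binary image with the four corner regions set to 1.
    Build the axis-aligned bounding rectangle of all set pixels, then
    mask the original image with it."""
    ys, xs = np.nonzero(corner_mask)
    y1, y2 = ys.min(), ys.max() + 1
    x1, x2 = xs.min(), xs.max() + 1
    full_mask = np.zeros_like(corner_mask)
    full_mask[y1:y2, x1:x2] = 1
    return image * full_mask

img = np.arange(36).reshape(6, 6)
m = np.zeros((6, 6), dtype=int)
m[1, 1] = m[1, 4] = m[4, 1] = m[4, 4] = 1   # four "corner" pixels
roi = sign_roi(img, m)
```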

The specific implementation of step C comprises:

The vertex coordinates of the minimum circumscribed rectangles in the left and right mask images are extracted as four pairs of initial corresponding points, and the getPerspectiveTransform function of OpenCV is called to obtain the initial perspective transformation matrix M. The planar perspective projection transformation relation is given by formula (I):

$$x' \sim M x, \qquad M = \begin{pmatrix} m_0 & m_1 & m_2 \\ m_3 & m_4 & m_5 \\ m_6 & m_7 & 1 \end{pmatrix} \tag{I}$$

where x = (x, y, 1) and x' = (x', y', 1) are homogeneous coordinates and ~ denotes equality up to scale; rewritten per component:

$$x' = \frac{m_0 x + m_1 y + m_2}{m_6 x + m_7 y + 1}, \qquad y' = \frac{m_3 x + m_4 y + m_5}{m_6 x + m_7 y + 1}$$
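As a sketch of what OpenCV's getPerspectiveTransform computes from the four corresponding point pairs, the eight parameters m_0 to m_7 can be solved from a stacked 8x8 linear system (numpy only; the sample points and the helper name are illustrative, and the four points are assumed to be in general position):

```python
import numpy as np

def perspective_from_4pts(src, dst):
    """Solve m0..m7 of the planar perspective relation from four
    (x, y) -> (x', y') correspondences."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # x'(m6 x + m7 y + 1) = m0 x + m1 y + m2, and likewise for y'
        A.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y]); b.append(xp)
        A.append([0, 0, 0, x, y, 1, -yp * x, -yp * y]); b.append(yp)
    m = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(m, 1.0).reshape(3, 3)

src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(2, 1), (4, 1), (4, 3), (2, 3)]   # unit square scaled by 2, shifted
M = perspective_from_4pts(src, dst)
```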

Taking the right image as the target image, a perspective projection transformation is applied to the left image to approximate the right image. To optimize the eight parameters of the projection matrix, the transformation matrix is updated iteratively by M ← (E + D)M, where E is the identity matrix and

$$D = \begin{pmatrix} d_0 & d_1 & d_2 \\ d_3 & d_4 & d_5 \\ d_6 & d_7 & 0 \end{pmatrix} \tag{XXII}$$

In formula (XXII), d_0 to d_7 correspond to the per-iteration update parameters of m_0 to m_7 in the matrix M.

At this point, resampling the left image I_1 with the new transformation x' ~ (E + D)Mx is equivalent to resampling the already-warped left image $\tilde{I}_1(x') = I_1(Mx)$ with the transformation x'' ~ (E + D)x', namely:

$$x'' \sim (E + D)\,x'$$

where x'' = (x'', y'', 1) is a homogeneous coordinate; rewritten per component:

$$x'' = \frac{(1 + d_0)x' + d_1 y' + d_2}{d_6 x' + d_7 y' + 1}, \qquad y'' = \frac{d_3 x' + (1 + d_4)y' + d_5}{d_6 x' + d_7 y' + 1}$$

To recover the accurate perspective transformation relation, the pixel motion is estimated by minimizing the intensity error between the two images; the intensity error equations are:

$$E(\mathbf{d}) = \sum_i \left[\tilde{I}_1(x''_i) - I_0(x_i)\right]^2 \tag{XXVII}$$

$$\approx \sum_i \left[\mathbf{g}_i^T J_i\, \mathbf{d} + e_i\right]^2 \tag{XXVIII}$$

In formulas (XXVII) and (XXVIII), $\mathbf{g}_i^T = \nabla \tilde{I}_1(x'_i)$ is the image gradient of the resampled left image $\tilde{I}_1$ at $x_i$, where the value range of $x_i$ is the sign ROI region;

$e_i = \tilde{I}_1(x'_i) - I_0(x_i)$ is the intensity error between the resampled left image $\tilde{I}_1$ and the corresponding point of the target image $I_0$;

$\mathbf{d} = (d_0, d_1, \ldots, d_7)$ is the motion update parameter, and $J_i = J_{\mathbf{d}}(x_i)$ is the Jacobian of the resampled point coordinates $x''_i$ with respect to $\mathbf{d}$, corresponding to the optical flow induced by the instantaneous motion of the three-dimensional plane, expressed as:

$$J_{\mathbf{d}}(x) = \frac{\partial x''}{\partial \mathbf{d}} = \begin{pmatrix} x & y & 1 & 0 & 0 & 0 & -x^2 & -xy \\ 0 & 0 & 0 & x & y & 1 & -xy & -y^2 \end{pmatrix}$$

At this point, the least-squares method gives the analytical solution:

$$A\,\mathbf{d} = -\mathbf{b} \tag{XXX}$$

with the Hessian matrix:

$$A = \sum_i J_i^T\, \mathbf{g}_i \mathbf{g}_i^T\, J_i$$

and the accumulated gradient:

$$\mathbf{b} = \sum_i e_i\, J_i^T \mathbf{g}_i$$
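The least-squares machinery above can be sanity-checked on synthetic data: if the intensity errors e_i are generated exactly according to the linearization, then solving A d = -b must recover the motion parameters d. A numpy sketch under that assumption, with random vectors standing in for the image gradients g_i (all names are illustrative, not from the patent):

```python
import numpy as np

def jacobian(x, y):
    """2x8 Jacobian J_d(x) of the warped coordinates with respect to d."""
    return np.array([[x, y, 1, 0, 0, 0, -x * x, -x * y],
                     [0, 0, 0, x, y, 1, -x * y, -y * y]])

rng = np.random.default_rng(0)
pts = [(x, y) for x in np.linspace(-1, 1, 5) for y in np.linspace(-1, 1, 5)]
d_true = np.array([0.01, -0.02, 0.03, 0.02, -0.01, 0.015, 0.005, -0.004])

A = np.zeros((8, 8))
b = np.zeros(8)
for x, y in pts:
    g = rng.normal(size=2)          # stand-in for the image gradient g_i
    J = jacobian(x, y)
    e = -g @ J @ d_true             # error consistent with the linearization
    A += J.T @ np.outer(g, g) @ J   # Hessian accumulation
    b += e * (J.T @ g)              # accumulated gradient
d = np.linalg.solve(A, -b)          # solves A d = -b
```

By construction b = -A d_true, so the solve returns the true update; in the real algorithm e_i and g_i come from the resampled ROI images instead.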

In step D, the imaging principle of the binocular camera gives:

$$X = \frac{b\,u_L}{u_L - u_R}$$

$$Z = \frac{f\,b}{u_L - u_R}$$

where X is the lateral distance from the camera to the sign center, from which, using the sign-center-to-lane distance information in the database, the lane occupied by the vehicle is deduced; Z is the normal distance from the camera to the sign plane, i.e. the distance from the vehicle to the sign.

The imaging model of the binocular camera is shown in FIG. 10. O_L and O_R are the optical centers of the left and right cameras; the distance between O_L and O_R is the baseline b of the binocular camera; the boxes are the imaging planes and f is the focal length. P(X, Y, Z) is a point in three-dimensional space (taking the optical center of the left camera as the origin); it forms an image in each camera, denoted P_L and P_R. After rectification, the x-axis coordinates of P_L and P_R on the imaging planes are u_L and u_R (with the image principal point as origin, so u_R is negative), and the disparity obtained in the preceding steps is disp = u_L - u_R.
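The triangulation described above follows directly from disp = u_L - u_R; a short sketch (function name and the sample numbers are illustrative):

```python
def locate(u_l, u_r, f, b):
    """Triangulate lateral distance X and depth Z from rectified pixel
    coordinates u_l, u_r (principal point as origin), focal length f
    (in pixels) and baseline b (in meters)."""
    disp = u_l - u_r        # disparity
    z = f * b / disp        # normal distance to the sign plane
    x = u_l * z / f         # lateral distance to the sign center
    return x, z

# e.g. f = 1000 px, b = 0.5 m, sign center at u_l = 200 px, u_r = -50 px
x, z = locate(200.0, -50.0, 1000.0, 0.5)
```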

Combining this with the intrinsic and extrinsic parameters obtained from camera calibration yields the lateral and normal distances of the camera relative to the sign, i.e. the distance of the vehicle from the sign and the lane it occupies, achieving vehicle self-positioning.

Example 3

A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the gradient-based road sign registration and positioning method of Example 1 or 2.

Example 4

A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the gradient-based road sign registration and positioning method of Example 1 or 2.

Claims (8)

1. The guideboard registration positioning method based on the gradient is characterized by comprising the following steps:
A. building a database
The database includes the following information for each guideboard: geographic coordinates, distance information between the center of the guideboard and the lanes, and background color, wherein the geographic coordinates refer to the longitude and latitude of the guideboard; the distance information between the center of the guideboard and the lanes refers to the transverse distance between the center of the guideboard and each lane line; the background color refers to the color of the guideboard;
B. rough positioning and extraction of the guideboard
a. installing a binocular camera in front of a vehicle to acquire road images in real time;
b. carrying out target detection on the road image obtained in step a by using a trained YOLOv8 network model, realizing rough positioning of a target guideboard, identifying the category of the current guideboard, obtaining database information of the current guideboard, and obtaining a rough positioning area image;
c. converting the rough positioning area image obtained in step b from RGB space into HSV space;
d. performing threshold processing in HSV space on the image processed in step c, and obtaining a mask image through morphological operation and a connected-domain aspect-ratio constraint, thereby obtaining a guideboard region ROI image;
C. gradient-based registration optimization
According to the obtained mask images, four pairs of initial corresponding points are preliminarily obtained, so that an initial perspective transformation matrix is obtained; sub-pixel registration is carried out on the Gaussian-smoothed left and right guideboard region ROI images by using a gradient-based optimization algorithm, and an accurate perspective transformation matrix is obtained through iterative updating;
D. vehicle pose calculation
Obtaining the overall displacement of the guideboard, namely the parallax, and further obtaining the position information of the vehicle, including the transverse and normal distances of the camera relative to the guideboard, namely the distance of the vehicle relative to the guideboard and the lane where the vehicle is located.
2. The gradient-based guideboard registration positioning method of claim 1, wherein the specific implementation process of the step b comprises:
(1) Training a YOLOv8 network model:
acquiring a training set: selecting a picture containing the guideboard, labeling a label, wherein the label comprises coordinate information of the guideboard in the picture and the category of the guideboard;
the YOLOv8 network model comprises a Backbone unit, a Neck unit and a Head unit;
the Backbone unit comprises a convolution module, a C2f module and an SPPF module, wherein the convolution module comprises a convolution layer, a batch normalization layer and a SiLU activation function layer; the C2f module comprises a convolution module, a Bottleneck module and a residual structure module; the SPPF module comprises a convolution layer and a pooling layer;
the Neck unit comprises a convolution module, a C2f module and an up-sampling layer;
the Head unit comprises a detection module, wherein the detection module comprises a convolution module and a convolution layer;
scaling and convolving each picture in the training set based on a Backbone unit, so as to obtain an initial feature map; performing secondary extraction on the obtained initial feature map based on a Neck unit to obtain intermediate feature maps with different scales; inputting the obtained intermediate feature graphs with different scales into a Head unit to obtain guideboard coordinates predicted by a YOLOv8 network model;
calculating the loss from the guideboard coordinates predicted by the YOLOv8 network model and the real guideboard coordinates, obtaining the gradient for optimizing the YOLOv8 network model from the loss, and updating the weights of the YOLOv8 network model; as the loss continuously decreases, the accuracy of the network prediction continuously increases, so that a well-trained YOLOv8 network model is obtained;
b, performing target detection on the road image obtained in the step a by using a trained YOLOv8 network model, realizing rough positioning of a target guideboard, identifying the category of the current guideboard, and obtaining database information;
(2) In the actual reasoning test stage, inputting the road image obtained in the step a into a trained YOLOv8 network model to obtain predicted rough positioning coordinates of the guideboard and the type of the guideboard;
(3) Setting the rough positioning area obtained from the rough positioning coordinates of the guideboard to 1 and the other areas to 0, to obtain a rough positioning area image.
3. The method of gradient-based guideboard registration positioning of claim 1, wherein, in step c,
in HSV space, pixels are isolated that meet the following threshold ranges:
the threshold value range of the saturation S is 0.35<S<1, and the threshold value range of the brightness V is 0.35<V<1; the threshold value range of the hue H is determined by the guideboard: 200<H<280 is set for a rectangular blue guideboard, and H>330 or H<30 is set for extracting the four corner red areas of a self-made standard guideboard.
4. The method of gradient-based guideboard registration positioning of claim 1, wherein in step d,
thresholding refers to: setting 255 for pixels meeting a threshold range, and setting 0 for the rest pixels to obtain a preliminary mask image;
morphological operations refer to: calling a morphology library function to remove external noise points and internal holes, resolving possible edge discontinuities by means of a closing operation, and eliminating most interference areas;
the aspect ratio constraint of the connected domain refers to: constraining the area according to the aspect ratio and the area size of the target area to obtain a final target area without interference;
obtaining the minimum circumscribed rectangle of the target area, setting the minimum circumscribed rectangle as 255, obtaining a mask image of the guideboard area, performing AND operation on the mask image and the original guideboard image, and finally obtaining the ROI image of the guideboard area.
5. The gradient-based guideboard registration positioning method according to claim 1, wherein the specific implementation process of the step C includes:
extracting vertex coordinates of the minimum circumscribed rectangles in the left and right mask images as four pairs of initial corresponding points to obtain an initial perspective transformation matrix M, wherein the planar perspective projection transformation relation is shown in formula (I):

$$x' \sim M x, \qquad M = \begin{pmatrix} m_0 & m_1 & m_2 \\ m_3 & m_4 & m_5 \\ m_6 & m_7 & 1 \end{pmatrix} \tag{I}$$

where x = (x, y, 1) and x' = (x', y', 1) are homogeneous coordinates, and ~ denotes equality up to scale, rewritten as:

$$x' = \frac{m_0 x + m_1 y + m_2}{m_6 x + m_7 y + 1}, \qquad y' = \frac{m_3 x + m_4 y + m_5}{m_6 x + m_7 y + 1}$$

taking the right image as a target image, performing perspective projection transformation on the left image to approximate the right image, and iteratively updating the transformation matrix by M ← (E + D)M, wherein

$$D = \begin{pmatrix} d_0 & d_1 & d_2 \\ d_3 & d_4 & d_5 \\ d_6 & d_7 & 0 \end{pmatrix} \tag{V}$$

in formula (V), d_0 to d_7 correspond to the per-iteration update parameters of m_0 to m_7 in the matrix M;
at this time, resampling the left image I_1 with the new transformation x' ~ (E + D)Mx is equivalent to resampling the warped left image $\tilde{I}_1(x') = I_1(Mx)$ with the transformation x'' ~ (E + D)x', namely:

$$x'' \sim (E + D)\,x'$$

where x'' = (x'', y'', 1) is a homogeneous coordinate, rewritten as:

$$x'' = \frac{(1 + d_0)x' + d_1 y' + d_2}{d_6 x' + d_7 y' + 1}, \qquad y'' = \frac{d_3 x' + (1 + d_4)y' + d_5}{d_6 x' + d_7 y' + 1}$$

the motion of the pixels is estimated by minimizing the intensity error between the two images, the intensity error equations being as follows:

$$E(\mathbf{d}) = \sum_i \left[\tilde{I}_1(x''_i) - I_0(x_i)\right]^2 \tag{X}$$

$$\approx \sum_i \left[\mathbf{g}_i^T J_i\, \mathbf{d} + e_i\right]^2 \tag{XI}$$

in formulas (X) and (XI), $\mathbf{g}_i^T = \nabla \tilde{I}_1(x'_i)$ is the image gradient of the resampled left image $\tilde{I}_1$ at $x_i$, the value range of $x_i$ being the guideboard ROI area; $e_i = \tilde{I}_1(x'_i) - I_0(x_i)$ is the intensity error between the resampled left image $\tilde{I}_1$ and the corresponding point of the target image $I_0$; $\mathbf{d} = (d_0, d_1, \ldots, d_7)$ is the motion update parameter, and $J_i = J_{\mathbf{d}}(x_i)$ is the Jacobian of the resampled point coordinates $x''_i$ with respect to $\mathbf{d}$, corresponding to the optical flow caused by the instantaneous motion of a three-dimensional plane, expressed as:

$$J_{\mathbf{d}}(x) = \frac{\partial x''}{\partial \mathbf{d}} = \begin{pmatrix} x & y & 1 & 0 & 0 & 0 & -x^2 & -xy \\ 0 & 0 & 0 & x & y & 1 & -xy & -y^2 \end{pmatrix}$$

at this time, an analytical solution is obtained by the least square method:

$$A\,\mathbf{d} = -\mathbf{b} \tag{XIII}$$

wherein the Hessian matrix:

$$A = \sum_i J_i^T\, \mathbf{g}_i \mathbf{g}_i^T\, J_i$$

and the cumulative gradient:

$$\mathbf{b} = \sum_i e_i\, J_i^T \mathbf{g}_i$$
6. The method of gradient-based guideboard registration and positioning of claim 1, wherein, in step D,
the imaging principle of the binocular camera gives:

$$X = \frac{b\,u_L}{u_L - u_R}$$

$$Z = \frac{f\,b}{u_L - u_R}$$

wherein X is the transverse distance from the camera to the center of the guideboard, and the lane where the vehicle is located is calculated according to the distance information between the center of the guideboard and the lanes in the database; Z refers to the normal distance from the camera to the guideboard plane, i.e., the distance from the vehicle to the guideboard;
O_L and O_R are the optical centers of the left and right cameras of the binocular camera, the distance between O_L and O_R is the baseline b of the binocular camera, and f is the focal length; P(X, Y, Z) is a point in three-dimensional space, imaged in the two cameras as P_L and P_R; after rectification, the x-axis coordinates of P_L and P_R on the imaging planes are u_L and u_R, and the obtained parallax is disp = u_L - u_R;
the transverse and normal distances of the camera relative to the guideboard are obtained by combining the internal and external parameters obtained from camera calibration, namely the distance of the vehicle relative to the guideboard and the lane where the vehicle is located are obtained, realizing self-positioning of the vehicle.
7. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the gradient-based guideboard registration positioning method of any one of claims 1-6.
8. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the steps of the gradient-based guideboard registration positioning method of any one of claims 1-6.
CN202310229650.2A 2023-03-10 2023-03-10 Guideboard registration positioning method based on gradient Pending CN116152342A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310229650.2A CN116152342A (en) 2023-03-10 2023-03-10 Guideboard registration positioning method based on gradient

Publications (1)

Publication Number Publication Date
CN116152342A true CN116152342A (en) 2023-05-23

Family

ID=86350635

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310229650.2A Pending CN116152342A (en) 2023-03-10 2023-03-10 Guideboard registration positioning method based on gradient

Country Status (1)

Country Link
CN (1) CN116152342A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116681885A (en) * 2023-08-03 2023-09-01 国网安徽省电力有限公司超高压分公司 Infrared image target identification method and system for power transmission and transformation equipment
CN116681885B (en) * 2023-08-03 2024-01-02 国网安徽省电力有限公司超高压分公司 Infrared image target identification method and system for power transmission and transformation equipment
CN116895030A (en) * 2023-09-11 2023-10-17 西华大学 Insulator detection method based on target detection algorithm and attention mechanism
CN116895030B (en) * 2023-09-11 2023-11-17 西华大学 Insulator detection method based on target detection algorithm and attention mechanism
CN117746649A (en) * 2023-11-06 2024-03-22 南京城建隧桥智慧管理有限公司 Tunnel traffic flow detection system and method based on YOLOv8 algorithm

Similar Documents

Publication Publication Date Title
EP3735675B1 (en) Image annotation
CN109752701B (en) Road edge detection method based on laser point cloud
CN106651953B (en) A Vehicle Pose Estimation Method Based on Traffic Signs
CN108802785B (en) Vehicle self-positioning method based on high-precision vector map and monocular vision sensor
CN116152342A (en) Guideboard registration positioning method based on gradient
CN107045629B (en) A multi-lane line detection method
CN105930819B (en) Real-time city traffic lamp identifying system based on monocular vision and GPS integrated navigation system
CN107341453B (en) Lane line extraction method and device
Liang et al. Video stabilization for a camcorder mounted on a moving vehicle
CN106225787A (en) Unmanned aerial vehicle visual positioning method
CN112308913B (en) Vehicle positioning method and device based on vision and vehicle-mounted terminal
CN113903011A (en) Semantic map construction and positioning method suitable for indoor parking lot
CN115717894A (en) A high-precision vehicle positioning method based on GPS and common navigation maps
CN110415299B (en) Vehicle position estimation method based on set guideboard under motion constraint
CN108416798A (en) A Vehicle Distance Estimation Method Based on Optical Flow
CN113781562A (en) Lane line virtual and real registration and self-vehicle positioning method based on road model
CN112906616A (en) Lane line extraction and generation method
CN107220632B (en) A road image segmentation method based on normal feature
CN112749584A (en) Vehicle positioning method based on image detection and vehicle-mounted terminal
CN115239822A (en) Real-time visual identification and positioning method and system for multi-module space of split type flying vehicle
CN115100292A (en) An online calibration method of external parameters between lidar and camera in road environment
Bulatov et al. Context-based urban terrain reconstruction from images and videos
KR20220151572A (en) Method and System for change detection and automatic updating of road marking in HD map through IPM image and HD map fitting
CN114596369A (en) Map generation method and device, electronic equipment and computer storage medium
Li et al. Lane detection and road surface reconstruction based on multiple vanishing point & symposia

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination