CN115526987A - Label element reconstruction method, system, device and medium based on monocular camera - Google Patents

Label element reconstruction method, system, device and medium based on monocular camera

Info

Publication number
CN115526987A
Authority
CN
China
Prior art keywords
image
monocular
sign
information
signage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211156746.2A
Other languages
Chinese (zh)
Inventor
杨蒙蒙
江昆
温拓朴
杨殿阁
黄晋
唐雪薇
黄健强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202211156746.2A priority Critical patent/CN115526987A/en
Publication of CN115526987A publication Critical patent/CN115526987A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30248 Vehicle exterior or interior
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method, system, device and medium for reconstructing sign elements based on a monocular camera. The method comprises: acquiring a monocular image, a GNSS signal, an IMU signal and a wheel speed signal; performing perception processing on the acquired monocular image to obtain an image-perceived map element result; obtaining six-degree-of-freedom information of the vehicle based on the GNSS signal, the IMU signal and the wheel speed signal; and calculating roadside sign elements from the image-perceived map element result and the six-degree-of-freedom vehicle information to obtain three-dimensional information of the roadside signs. The invention can reconstruct sign elements using only a low-cost monocular camera, GNSS and IMU.

Description

Label element reconstruction method, system, device and medium based on monocular camera
Technical Field
The invention relates to a method, system, device and medium for reconstructing sign elements based on a monocular camera, and belongs to the technical field of environment construction for intelligent connected vehicles.
Background
High-precision maps are an important basis for high-level automated driving: they are key inputs that describe the road environment at decimetre to centimetre level.
The traditional method mainly uses lidar for collection, but lidar is expensive and hard to deploy at scale. Prior-art approaches instead reconstruct from crowdsourced monocular camera data, which is low-cost and can be installed at scale, but they lack depth measurement, making the reconstruction of roadside elements difficult.
Disclosure of Invention
In view of the above problems, it is an object of the present invention to provide a method, system, device and medium for reconstructing signage elements based on a monocular camera, which can achieve the reconstruction of signage elements.
In order to achieve the purpose of the invention, the technical scheme adopted by the invention is as follows:
in a first aspect, the present invention provides a method for reconstructing a signage component based on a monocular camera, comprising:
acquiring a monocular image, a GNSS signal, an IMU signal and a wheel speed signal;
the obtained monocular image is subjected to perception processing to obtain a map element result of image perception;
acquiring six-degree-of-freedom information of the vehicle based on the GNSS signal, the IMU signal and the wheel speed signal;
and calculating road side sign elements based on the map element result of image perception and six-degree-of-freedom information of the vehicle to obtain three-dimensional information of the road side signs.
Further, the monocular image must include a roadside sign.
Further, the obtaining of the map element result of image perception by perception processing of the acquired monocular image includes:
sign perception, which obtains mask data of the sign pixels in the monocular image;
and edge extraction, which performs edge extraction on the mask data of the roadside sign pixels and outputs a series of control points in the image representing the closed contour of the roadside sign element.
Further, signage perception, comprising:
scaling the monocular image;
setting an image segmentation model based on a convolutional neural network;
and inputting the monocular image into an image segmentation model, and performing forward calculation on the neural network through a GPU/FPGA/AI chip calculation unit carrying the image segmentation model to obtain a mask of the signboard pixels in the image.
Further, calculating road side sign elements based on the map element result of image perception and six-degree-of-freedom information of the vehicle to obtain three-dimensional information of the road side signs, and the method comprises the following steps:
predicting the positions of the sign element instances in different frame images using optical flow, according to the closed contour of the roadside sign element and the six-degree-of-freedom information of the vehicle, to obtain the closed contours of the sign at different moments;
and solving the closed outlines of the signs under the plurality of moments and the pose information of the six-degree-of-freedom vehicle through the set template information to finally obtain the three-dimensional information of the roadside signs.
Further, solving the closed outlines of the signs at multiple moments and the pose information of the six-degree-of-freedom vehicle through the set template information to finally obtain the three-dimensional information of the roadside signs, wherein the three-dimensional information comprises the following steps:
defining a label shape template;
parameterizing the signboard, wherein the parameterization comprises shape parameters and position parameters of the signboard;
defining an objective function:

E(s, p_c, R_o) = Σ_k Σ_{i=0}^{7} ‖ π(R_k · p_i^w + t_k) − p ‖²

wherein s denotes the shape parameter; π(R_k · p_i^w + t_k) represents the pixel position of the key point p_i^w projected onto the image; R_k, t_k represent the six-degree-of-freedom pose of the camera at time k; π represents the projection model of the pinhole camera; p represents the point of the closed contour nearest to the projected pixel position; and the superscript w denotes a point in the world coordinate system;
and optimizing the objective function using an L-M optimization algorithm to solve for the shape parameter and the position parameter of the sign.
Further, a sign shape template is defined, including rectangular templates and circular templates.
In a second aspect, the present invention further provides a system for reconstructing signage components based on a monocular camera, comprising:
a signal acquisition module configured to acquire a monocular image as well as a GNSS signal, an IMU signal, and a wheel speed signal;
the perception module is configured to perform perception processing on the acquired monocular image to obtain a map element result of image perception;
a positioning module configured to obtain six-degree-of-freedom information of the vehicle based on the GNSS signals, the IMU signals and the wheel speed signals;
and the roadside sign element calculation module is configured to calculate the roadside sign element based on the image-perceived map element result and the six-degree-of-freedom information of the vehicle to obtain the three-dimensional information of the roadside sign.
In a third aspect, the present invention further provides an electronic device comprising a processor and a memory storing computer program instructions, wherein the program instructions, when executed by the processor, implement the method for reconstructing sign elements based on a monocular camera.
In a fourth aspect, the present invention further provides a computer-readable storage medium having stored thereon computer program instructions, wherein the program instructions, when executed by a processor, are for implementing the monocular camera-based signage component reconstruction method.
Due to the adoption of the technical scheme, the invention has the following characteristics:
1. according to the method, a monocular image, a GNSS signal, an IMU signal and a wheel speed signal are obtained; the acquired monocular image is subjected to perception processing to obtain a map element result of image perception; acquiring six-degree-of-freedom information of the vehicle based on the GNSS signal, the IMU signal and the wheel speed signal; and calculating road side sign elements based on the map element result of image perception and six-degree-of-freedom information of the vehicle to obtain three-dimensional information of the road side signs, so that the sign elements can be reconstructed only by using low-cost monocular cameras, GNSS and IMU.
2. The invention can be applied to the reconstruction of various signs, including rectangular and circular signs.
In conclusion, the invention can be widely applied to the reconstruction of roadside signs.
Drawings
Various additional advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Like parts are designated with like reference numerals throughout the drawings. In the drawings:
fig. 1 is a schematic overall flow chart of a monocular vision roadside sign element reconstruction method according to an embodiment of the present invention.
Fig. 2 is a flow chart of roadside sign object calculation according to an embodiment of the present invention.
FIG. 3 is a signage parameterization template of an embodiment of the invention: (a) Is a rectangular template with the length of h and the width of w, and (b) is a circular template with the radius of r.
Detailed Description
It is to be understood that the terminology used herein is for the purpose of describing particular example embodiments only, and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprises," "comprising," "including," and "having" are inclusive and therefore specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof. The method steps, processes, and operations described herein are not to be construed as necessarily requiring their performance in the particular order described or illustrated, unless specifically identified as an order of performance. It should also be understood that additional or alternative steps may be used.
Although the terms first, second, third, etc. may be used herein to describe various elements, components, regions, layers and/or sections, these elements, components, regions, layers and/or sections should not be limited by these terms. These terms may be only used to distinguish one element, component, region, layer or section from another region, layer or section. Terms such as "first," "second," and other numerical terms when used herein do not imply a sequence or order unless clearly indicated by the context. Thus, a first element, component, region, layer or section discussed below could be termed a second element, component, region, layer or section without departing from the teachings of the example embodiments.
For convenience of description, spatially relative terms, such as "inner", "outer", "lower", "upper", and the like, may be used herein to describe one element or feature's relationship to another element or feature as illustrated in the figures. This spatially relative term is intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures.
Because the prior art uses the data of the crowdsourcing monocular camera to reconstruct and lacks the measurement of depth, the reconstruction of the roadside element is difficult to realize. The invention provides a method, a system, equipment and a medium for reconstructing a sign element based on a monocular camera, comprising the following steps: acquiring a monocular image, a GNSS signal, an IMU signal and a wheel speed signal; the acquired monocular image is subjected to perception processing to obtain a map element result of image perception; acquiring six-degree-of-freedom information of the vehicle based on the GNSS signal, the IMU signal and the wheel speed signal; and calculating road side sign elements based on the map element result of image perception and six-degree-of-freedom information of the vehicle to obtain three-dimensional information of the road side signs. Therefore, the invention can realize the reconstruction of the signage elements by using only low-cost monocular cameras, GNSS and IMU.
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
The first embodiment is as follows: the method for reconstructing a signage component based on a monocular camera provided by the embodiment comprises the following steps:
s1, acquiring a monocular image, a GNSS (global navigation satellite system) signal, an IMU (inertial measurement unit) signal and a wheel speed signal.
Specifically, the monocular image needs to include a roadside sign. The GNSS signal is used to obtain the absolute position of the vehicle, while the IMU signal and the wheel speed signal are used to obtain the continuous six-degree-of-freedom pose of the vehicle.
S2, the acquired monocular image is subjected to perception processing, and a map element result of image perception is obtained.
Specifically, the input of the perception processing is the monocular image and the output is a vectorized perception result of the signs in the image; the perception processing comprises the steps of sign perception and edge extraction, wherein:
S21, the sign perception process comprises the following steps:
scaling the input monocular image; in this embodiment the image is scaled, for example, to 768 × 480, but is not limited thereto;
setting up an image segmentation model based on a convolutional neural network;
inputting the scaled image into the image segmentation model and performing the forward computation of the neural network on a GPU/FPGA/AI-chip computing unit carrying the image segmentation model, to obtain a mask of the sign pixels in the image.
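By way of illustration only (not part of the claimed implementation), the scale-then-segment data flow of S21 can be sketched as follows; the convolutional segmentation network is replaced by a simple brightness threshold, which is purely a stand-in so the sketch stays self-contained:

```python
import numpy as np

TARGET_W, TARGET_H = 768, 480  # input resolution used in this embodiment

def resize_nearest(img: np.ndarray, w: int, h: int) -> np.ndarray:
    """Nearest-neighbour resize (stand-in for the production resizer)."""
    ys = (np.arange(h) * img.shape[0] / h).astype(int)
    xs = (np.arange(w) * img.shape[1] / w).astype(int)
    return img[ys][:, xs]

def segment_signs(img: np.ndarray) -> np.ndarray:
    """Return a binary mask of 'sign' pixels.

    The patent runs a CNN segmentation model on a GPU/FPGA/AI chip; here a
    brightness threshold stands in for the network so only the data flow
    (scale -> forward pass -> mask) is demonstrated.
    """
    scaled = resize_nearest(img, TARGET_W, TARGET_H)
    return (scaled > 128).astype(np.uint8)

# usage on a synthetic grey image containing one bright "sign" patch
frame = np.zeros((960, 1536), dtype=np.uint8)
frame[100:200, 200:400] = 255
mask = segment_signs(frame)
```

The mask has the scaled resolution (480 × 768), with 1 marking sign pixels.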
S22, edge extraction
Mask data of the roadside sign pixels is taken as input, edge extraction is performed with the Canny operator, and a series of control points in the image is output, representing the closed contour of the roadside sign element.
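As an illustrative sketch of S22: the patent applies the Canny operator, but for an already-binary mask the contour is simply every foreground pixel with at least one background 4-neighbour, which the following numpy sketch computes directly (a simplifying assumption, not the claimed operator):

```python
import numpy as np

def mask_boundary(mask: np.ndarray) -> np.ndarray:
    """Return the boundary pixels of a binary mask.

    A pixel is interior if all four of its 4-neighbours are foreground;
    the contour is the foreground minus the interior.
    """
    m = mask.astype(bool)
    interior = m.copy()
    interior[1:, :]  &= m[:-1, :]   # requires the pixel above to be foreground
    interior[:-1, :] &= m[1:, :]    # pixel below
    interior[:, 1:]  &= m[:, :-1]   # pixel to the left
    interior[:, :-1] &= m[:, 1:]    # pixel to the right
    return m & ~interior            # foreground minus interior = closed contour

mask = np.zeros((10, 10), dtype=np.uint8)
mask[2:8, 3:9] = 1                       # a filled 6x6 "sign"
edge = mask_boundary(mask)
control_points = np.argwhere(edge)       # (row, col) control points of the contour
```

For the 6 × 6 block this yields the 20 perimeter pixels as control points.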
And S3, obtaining six-degree-of-freedom information of the vehicle.
The GNSS signal, monocular image, IMU signal and wheel speed signal are processed, and six-degree-of-freedom information of the ego vehicle, comprising three-degree-of-freedom position and three-degree-of-freedom rotation, is obtained using existing GNSS/IMU/wheel-speed multi-sensor fusion techniques. Positioning may use GNSS or RTK (real-time kinematic, carrier-phase differential) signals to obtain results of different accuracy. In addition, other sensors, such as a visual odometer, may also be fused to achieve more accurate relative positioning.
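The fusion itself is existing technology and is not specified by the patent; as a minimal sketch of one ingredient, wheel-speed plus IMU yaw-rate dead reckoning propagates a planar pose, and a real system would correct this drifting estimate with GNSS or RTK fixes (correction step not shown; the function name and planar simplification are assumptions for illustration):

```python
import math

def dead_reckon(pose, v, yaw_rate, dt):
    """Propagate a planar (x, y, yaw) pose from wheel speed v and IMU yaw rate.

    This is only the prediction ingredient of the GNSS/IMU/wheel-speed
    fusion mentioned in the patent; the full 6-DOF estimate additionally
    fuses GNSS (or RTK) position fixes, e.g. in a Kalman filter.
    """
    x, y, yaw = pose
    x += v * dt * math.cos(yaw)
    y += v * dt * math.sin(yaw)
    yaw += yaw_rate * dt
    return (x, y, yaw)

# drive straight east at 10 m/s for 1 s, integrated in 100 steps
pose = (0.0, 0.0, 0.0)
for _ in range(100):
    pose = dead_reckon(pose, v=10.0, yaw_rate=0.0, dt=0.01)
```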
S4, as shown in FIG. 2, calculating the road side sign element by using the image sensing result and the six-degree-of-freedom information of the vehicle to obtain the three-dimensional information of the road side sign element, wherein the three-dimensional information comprises the following steps:
S41, predicting the positions of the sign element instances in different frames by optical-flow tracking, using the input closed contour of the roadside sign element and the vehicle six-degree-of-freedom information solved in the previous step, to obtain the closed contours of the sign at different moments.
S42, solving from the closed contours of the sign at multiple moments and the six-degree-of-freedom vehicle pose information, using the set template information, to finally obtain the three-dimensional information of the roadside sign. The solving process is as follows:
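The optical-flow prediction of S41 is typically performed with a pyramidal Lucas-Kanade tracker; the following self-contained sketch illustrates only the underlying principle, a single Gauss-Newton step estimating one global translation between two synthetic frames (the synthetic blob and tolerances are illustrative assumptions):

```python
import numpy as np

def lk_translation(img0: np.ndarray, img1: np.ndarray) -> np.ndarray:
    """Estimate a single translational flow (dx, dy) between two frames.

    One Lucas-Kanade step restricted to a global translation: from
    I1 - I0 ~= -v . grad(I0), solve the 2x2 normal equations A v = b.
    Production systems track each contour point with a pyramidal LK
    implementation; this only illustrates the principle used to predict
    sign contours into the next frame.
    """
    gy, gx = np.gradient(img0.astype(float))
    gt = img1.astype(float) - img0.astype(float)
    A = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                  [np.sum(gx * gy), np.sum(gy * gy)]])
    b = -np.array([np.sum(gx * gt), np.sum(gy * gt)])
    return np.linalg.solve(A, b)

# synthetic frame pair: a smooth blob shifted right by exactly 1 pixel
yy, xx = np.mgrid[0:64, 0:64]
blob = lambda cx, cy: np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / 50.0)
flow = lk_translation(blob(30, 32), blob(31, 32))
```

For the 1-pixel shift the estimate comes out close to (1, 0).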
and S421, defining two typical sign shape templates.
Specifically, as shown in fig. 3 (a), the sign shape template is defined as a rectangle, and the rectangle is composed of 8 key points, which are:
(0.5w,0.25h),(0.5w,-0.25h),(0.25w,-0.5h),(-0.25w,-0.5h),
(-0.5w,-0.25h),(-0.5w,0.25h),(-0.25w,0.5h),(0.25w,0.5h)
where w represents the width and h the height; these two variables are estimated in the solving module.
As shown in fig. 3 (b), the sign shape template is further defined as a circle, likewise represented by 8 key points lying on the circle:

(r, 0), (r/√2, r/√2), (0, r), (−r/√2, r/√2),
(−r, 0), (−r/√2, −r/√2), (0, −r), (r/√2, −r/√2)

wherein r is the radius of the circle.
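The two templates of S421 can be written down directly. Note that the circular template's key-point placement is an assumption (points evenly spaced at 45-degree intervals), since fig. 3 (b) is not reproduced in the text:

```python
import math

def rect_template(w: float, h: float):
    """Eight key points of the rectangular sign template (fig. 3 (a))."""
    return [( 0.5 * w,  0.25 * h), ( 0.5 * w, -0.25 * h),
            ( 0.25 * w, -0.5 * h), (-0.25 * w, -0.5 * h),
            (-0.5 * w, -0.25 * h), (-0.5 * w,  0.25 * h),
            (-0.25 * w,  0.5 * h), ( 0.25 * w,  0.5 * h)]

def circle_template(r: float):
    """Eight key points of the circular sign template (fig. 3 (b)).

    Assumption: the points are evenly spaced at 45-degree intervals on
    the circle of radius r, as the figure itself is not reproduced.
    """
    return [(r * math.cos(k * math.pi / 4), r * math.sin(k * math.pi / 4))
            for k in range(8)]
```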
S422, parameterization of the sign.
Specifically, the parameterization of the sign comprises two parts:
one part is the shape parameter: w and h for a rectangle, r for a circle;
the other part is the position parameter: p_c and R_o denote, respectively, the centre position of the sign and the rotation matrix of the sign relative to the world coordinate system. Assume the key points corresponding to one sign are p_0, p_1, …, p_7, with centre point p_c; take the lateral edge of the sign as the x-axis, the vertically upward edge as the y-axis, and the plane normal as the z-axis. With R_o the rotation matrix of the sign coordinate system relative to the world coordinate system, the points of the sign in three-dimensional space are:

p_i^w = R_o · p_i + p_c,  i = 0, 1, …, 7

where the superscript w denotes that a point is expressed in the world coordinate system, and p_i^w represents a key point of the sign in three-dimensional space.
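The mapping of S422, p_i^w = R_o · p_i + p_c, can be sketched directly (the sign frame has x lateral, y vertically upward and z along the plane normal, so template points have z = 0):

```python
import numpy as np

def sign_points_world(template_pts, R_o, p_c):
    """Map 2-D template key points into the world frame: p_i^w = R_o p_i + p_c.

    template_pts are the key points of S421; they are lifted to 3-D with
    z = 0 in the sign frame before rotation and translation.
    """
    pts = np.array([[x, y, 0.0] for x, y in template_pts])
    return pts @ np.asarray(R_o).T + np.asarray(p_c)
```

With the identity rotation and a symmetric template, the centroid of the world points is the sign centre p_c.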
S423, defining an objective function, and obtaining the optimal signage parameter by minimizing the objective function, where the objective function is defined as:
E(s, p_c, R_o) = Σ_k Σ_{i=0}^{7} ‖ π(R_k · p_i^w + t_k) − p ‖²

wherein π(R_k · p_i^w + t_k) represents the pixel position of the key point p_i^w projected onto the image; R_k, t_k represent the six-degree-of-freedom pose of the camera at time k; π represents the projection model of the pinhole camera; and p represents the point of the closed contour nearest to the projected pixel position.
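The objective of S423 can be transcribed as follows; the intrinsic matrix K and the synthetic data in the usage are illustrative assumptions:

```python
import numpy as np

def pinhole_project(X, K):
    """Pinhole projection pi: camera-frame 3-D point -> pixel (u, v)."""
    u = K[0, 0] * X[0] / X[2] + K[0, 2]
    v = K[1, 1] * X[1] / X[2] + K[1, 2]
    return np.array([u, v])

def objective(world_pts, poses, contours, K):
    """Sum over frames k and key points i of ||pi(R_k p_i^w + t_k) - p||^2,

    where p is the contour point nearest to the projection, a direct
    transcription of the objective of S423. world_pts: (N, 3) key points;
    poses: list of (R_k, t_k); contours: per-frame (M, 2) pixel contours.
    """
    total = 0.0
    for (R, t), contour in zip(poses, contours):
        for p_w in world_pts:
            uv = pinhole_project(R @ p_w + t, K)
            d2 = np.sum((contour - uv) ** 2, axis=1)  # squared pixel distances
            total += d2.min()                          # nearest contour point
    return total
```

In the full method this scalar is minimized over the shape and position parameters by L-M iteration (S424).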
S424, optimizing the objective function with the L-M (Levenberg-Marquardt) algorithm and solving for the shape parameter and position parameter of the sign, iteratively solving for the increments δt and δθ, where δt is the increment of p_c and δθ = (δθ_1, δθ_2, δθ_3) is the increment of R_o; from these increments the rotation matrix of the sign is updated.
Since a sign is generally a plane perpendicular to the ground, its orientation is a one-dimensional rotation around the z-axis; therefore, when updating R_o, the following formula is used:
R_o ← R_o · exp([0, 0, δθ_3]^)
In this way the updated R_o is guaranteed to remain a one-dimensional rotation around the z-axis, i.e. a rotation around the normal vector of the sign plane.
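The constrained update of S424 can be sketched as follows (function names are illustrative):

```python
import numpy as np

def exp_z(delta):
    """exp([0, 0, delta]^): rotation by angle delta about the z-axis."""
    c, s = np.cos(delta), np.sin(delta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def update_orientation(R_o, delta_theta3):
    """One L-M update step  R_o <- R_o . exp([0, 0, dtheta3]^).

    Restricting the increment to the third axis keeps the sign rotating
    only about its plane normal, as required in S424.
    """
    return R_o @ exp_z(delta_theta3)

# two quarter-turns compose to a half-turn; the result stays a valid rotation
R = update_orientation(update_orientation(np.eye(3), np.pi / 2), np.pi / 2)
```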
Example two: the first embodiment provides a method for reconstructing a signage component based on a monocular camera, and correspondingly, the first embodiment provides a system for reconstructing a signage component based on a monocular camera. The system provided in this embodiment may implement the method for reconstructing a signage component based on a monocular camera in the first embodiment, and the system may be implemented by software, hardware, or a combination of software and hardware. For convenience of description, the present embodiment is described with the functions divided into various units, which are described separately. Of course, the functions of the units may be implemented in the same software and/or hardware or in one or more pieces. For example, the system may comprise integrated or separate functional modules or units to perform the corresponding steps in the method of an embodiment. Since the system of the present embodiment is substantially similar to the method embodiment, the description process of the present embodiment is relatively simple, and reference may be made to part of the description of the first embodiment for relevant points.
Specifically, the system for reconstructing a signage component based on a monocular camera provided in this embodiment includes:
a signal acquisition module configured to acquire a monocular image as well as a GNSS signal, an IMU signal, and a wheel speed signal;
the perception module is configured to perform perception processing on the acquired monocular image to obtain a map element result of image perception;
a positioning module configured to obtain six-degree-of-freedom information of the vehicle based on the GNSS signals, the IMU signals and the wheel speed signals;
and the roadside sign element calculation module is configured to calculate the roadside sign elements based on the map element result sensed by the image and six-degree-of-freedom information of the vehicle to obtain three-dimensional information of the roadside sign.
In a preferred embodiment, the sensing module comprises a signage sensing module and an edge extraction module, wherein:
the label perception module comprises an image segmentation model based on a convolution neural network, wherein an input image is firstly scaled to 768 × 480, then the input image is sent into the neural network, and the neural network is subjected to forward calculation through a GPU/FPGA/AI chip calculation unit mounted on the image segmentation model to obtain a mask of label pixels in the image.
And the edge extraction module inputs mask data of the road side sign pixels, performs edge extraction through a Canny operator, and finally outputs a series of control points in the image to represent the closed contour of the road side sign element.
In a preferred embodiment, the positioning module is configured to obtain six degrees of freedom information for the vehicle.
Specifically, the positioning module takes as input the GNSS signal, the monocular image, the IMU signal and the wheel speed signal, and outputs the six-degree-of-freedom information of the ego vehicle. The positioning module is compatible with GNSS or RTK signals to obtain positioning results of different accuracy. In addition, other sensors may also be fused to obtain more accurate relative positioning.
In a preferred embodiment, specifically, as shown in fig. 2, the roadside sign element calculation module: the module inputs the closed contour of the road side sign element and the positioning information of the vehicle output by the sensing module, and outputs the three-dimensional information of the road side sign, and the module comprises the following components:
firstly, tracking calculation is carried out according to the input closed contour and the positioning information, and the closed contour of the label at different moments is obtained.
And then, inputting the closed outlines of the signs and the vehicle pose information at a plurality of moments into a solving module based on template information for solving to finally obtain the three-dimensional information of the roadside signs.
Example three: the present embodiment provides an electronic device corresponding to the method for reconstructing a signage component based on a monocular camera according to the first embodiment, where the electronic device may be an electronic device for a client, such as a mobile phone, a notebook computer, a tablet computer, a desktop computer, and the like, to execute the method according to the first embodiment.
The electronic equipment comprises a processor, a memory, a communication interface and a bus; the processor, the memory and the communication interface are connected through the bus to complete mutual communication. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The memory stores a computer program executable on the processor, and the processor, when executing the computer program, performs the method for reconstructing sign elements based on a monocular camera according to the first embodiment.
In some implementations, the logic instructions in the memory may be implemented in software functional units and stored in a computer readable storage medium when sold or used as a stand-alone product. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), an optical disk, and various other media capable of storing program codes.
In other implementations, the processor may be various general-purpose processors such as a Central Processing Unit (CPU), a Digital Signal Processor (DSP), and the like, and is not limited herein.
Example four: the monocular camera-based signage component reconstruction method of this embodiment may be embodied as a computer program product that may include a computer readable storage medium having computer readable program instructions embodied thereon for executing the monocular camera-based signage component reconstruction method of this embodiment.
The computer readable storage medium may be a tangible device that retains and stores instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any combination of the foregoing.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment. In the description herein, references to the description of "one embodiment," "some implementations," or the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of an embodiment of the specification. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A sign element reconstruction method based on a monocular camera, characterized by comprising the following steps:
acquiring a monocular image, a GNSS signal, an IMU signal and a wheel speed signal;
performing perception processing on the acquired monocular image to obtain an image-perception map element result;
acquiring six-degree-of-freedom information of the vehicle based on the GNSS signal, the IMU signal and the wheel speed signal;
and calculating roadside sign elements based on the image-perception map element result and the six-degree-of-freedom information of the vehicle, to obtain three-dimensional information of the roadside signs.
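The four claimed steps can be sketched as a minimal per-frame pipeline. All names below (`SensorFrame`, `perceive_sign_mask`, `fuse_pose`, `reconstruct_sign_elements`) are hypothetical illustrations, not terminology from the patent, and the perception and fusion bodies are trivial placeholders standing in for the claimed modules:

```python
# Minimal sketch of the claimed pipeline. Names are hypothetical; the
# perception and pose-fusion bodies are placeholders, not the patented method.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SensorFrame:
    image: List[List[int]]   # monocular grayscale image (rows of pixels)
    gnss: Tuple[float, float, float]                      # (lat, lon, alt)
    imu: Tuple[float, float, float, float, float, float]  # accel + gyro
    wheel_speed: float                                    # m/s

def perceive_sign_mask(image):
    # Placeholder perception: mark bright pixels as "sign" pixels.
    return [[1 if px > 128 else 0 for px in row] for row in image]

def fuse_pose(gnss, imu, wheel_speed):
    # Placeholder fusion: bundle inputs into a 6-DoF tuple
    # (x, y, z, roll, pitch, yaw); a real system would run a filter here.
    return (gnss[0], gnss[1], gnss[2], imu[3], imu[4], imu[5])

def reconstruct_sign_elements(frames: List[SensorFrame]):
    per_frame = []
    for f in frames:                                    # steps 1-3, per frame
        mask = perceive_sign_mask(f.image)              # image perception
        pose = fuse_pose(f.gnss, f.imu, f.wheel_speed)  # 6-DoF localization
        per_frame.append((mask, pose))
    # Step 4 (multi-frame roadside sign element calculation) would consume
    # `per_frame` to recover 3-D sign information; omitted in this sketch.
    return per_frame
```
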
2. The sign element reconstruction method based on a monocular camera of claim 1, wherein the monocular image is required to contain a roadside sign.
3. The sign element reconstruction method based on a monocular camera of claim 1, wherein performing perception processing on the acquired monocular image to obtain an image-perception map element result comprises:
sign perception: acquiring mask data of sign pixels in the monocular image;
and edge extraction: performing edge extraction on the mask data of the roadside sign pixels, and outputting a series of control points in the image that represent the closed contour of the roadside sign element.
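As an illustration of the edge-extraction step, here is a minimal sketch (a hypothetical helper, not the patent's algorithm) that turns a binary sign mask into contour control points:

```python
def mask_contour_points(mask):
    """Return boundary pixels of a binary mask as (row, col) control points.

    A pixel belongs to the contour if it is foreground and at least one of
    its 4-neighbours is background or lies outside the image. A production
    system would typically trace these into an ordered closed polygon.
    """
    h, w = len(mask), len(mask[0])
    points = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c]:
                continue
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if not (0 <= nr < h and 0 <= nc < w) or not mask[nr][nc]:
                    points.append((r, c))
                    break
    return points
```
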
4. The sign element reconstruction method based on a monocular camera of claim 3, wherein the sign perception comprises:
scaling the monocular image;
setting an image segmentation model based on a convolutional neural network;
and inputting the scaled monocular image into the image segmentation model, and performing forward computation of the neural network on a GPU/FPGA/AI-chip computing unit carrying the image segmentation model, to obtain a mask of the sign pixels in the image.
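A framework-agnostic sketch of this flow (scale, run the segmentation model, recover a full-resolution mask). The nearest-neighbour scaler and the lambda "model" below are stand-ins introduced for illustration; the patent's model is a convolutional neural network executed on a GPU/FPGA/AI chip:

```python
def scale_nearest(img, out_h, out_w):
    """Nearest-neighbour rescale of a 2-D image given as a list of rows."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def segment_signs(img, model, in_size=(64, 64)):
    """Scale the image to the model's input size, run the model, and
    rescale the predicted binary mask back to the original resolution."""
    h, w = len(img), len(img[0])
    scaled = scale_nearest(img, *in_size)
    mask_small = model(scaled)   # model: image -> binary mask, same size
    return scale_nearest(mask_small, h, w)
```
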
5. The sign element reconstruction method based on a monocular camera of claim 3, wherein calculating roadside sign elements based on the image-perception map element result and the six-degree-of-freedom information of the vehicle to obtain three-dimensional information of the roadside signs comprises:
predicting, by optical flow, the positions of sign element instances in different frame images according to the closed contour of the roadside sign element and the six-degree-of-freedom information of the vehicle, to obtain the closed contours of the sign at different times;
and solving, based on set template information, the closed contours of the sign at the plurality of times together with the six-degree-of-freedom vehicle pose information, to finally obtain the three-dimensional information of the roadside sign.
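The optical-flow prediction step can be illustrated by propagating contour control points through a dense flow field. This is a hedged sketch: `flow_u`/`flow_v` are assumed to be precomputed per-pixel displacement fields from some dense optical-flow method, which the patent does not specify:

```python
def propagate_contour(points, flow_u, flow_v):
    """Predict where contour control points (u, v) = (col, row) move in the
    next frame by sampling dense optical-flow displacement fields at the
    nearest pixel. flow_u/flow_v are HxW lists of per-pixel displacements."""
    predicted = []
    for u, v in points:
        r, c = int(round(v)), int(round(u))   # image row/col at the point
        predicted.append((u + flow_u[r][c], v + flow_v[r][c]))
    return predicted
```
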
6. The sign element reconstruction method based on a monocular camera of claim 5, wherein solving, based on the set template information, the closed contours of the sign at the plurality of times together with the six-degree-of-freedom vehicle pose information to finally obtain the three-dimensional information of the roadside sign comprises:
defining a sign shape template;
parameterizing the sign, including shape parameters and sign position parameters;
defining an objective function:

E = Σ_k Σ_i ‖ π(R_k p_i^w + t_k) − q̂_i^k ‖²

wherein π(R_k p_i^w + t_k) represents the pixel position of the template point p_i^w projected onto the image, R_k, t_k represent the six-degree-of-freedom pose of the camera at time k, π represents the pinhole-camera projection model, q̂_i^k represents the point on the closed contour p nearest to the projected template point, and the superscript w denotes a point in the world coordinate system;
and optimizing the objective function using the Levenberg-Marquardt (L-M) algorithm, to solve for the shape parameters and the position parameters of the sign.
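To illustrate the Levenberg-Marquardt step named in the claim, here is a minimal self-contained L-M solver fitting a circular sign template's parameters (centre and radius) to 2-D contour points. This is a didactic sketch with a fixed damping factor, not the patent's multi-frame projection objective:

```python
import math

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))) / M[r][r]
    return x

def lm_fit_circle(points, cx, cy, r, iters=100, lam=1e-3):
    """Levenberg-Marquardt fit of circle parameters (cx, cy, r) to points.
    Residual per point: distance from the point to the centre, minus r."""
    p = [cx, cy, r]
    for _ in range(iters):
        res, J = [], []
        for x, y in points:
            d = math.hypot(x - p[0], y - p[1]) or 1e-12
            res.append(d - p[2])
            # Jacobian row: d(residual)/d(cx, cy, r)
            J.append([(p[0] - x) / d, (p[1] - y) / d, -1.0])
        # Damped normal equations: (J^T J + lam*I) dp = -J^T res
        A = [[sum(Jk[i] * Jk[j] for Jk in J) for j in range(3)] for i in range(3)]
        for i in range(3):
            A[i][i] += lam
        b = [-sum(Jk[i] * e for Jk, e in zip(J, res)) for i in range(3)]
        p = [pi + di for pi, di in zip(p, solve3(A, b))]
    return p
```

A real implementation would additionally adapt the damping factor per iteration and stop on convergence; both are omitted here for brevity.
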
7. The sign element reconstruction method based on a monocular camera of claim 6, wherein the defined sign shape templates include a rectangular template and a circular template.
8. A sign element reconstruction system based on a monocular camera, characterized by comprising:
a signal acquisition module configured to acquire a monocular image as well as a GNSS signal, an IMU signal, and a wheel speed signal;
a perception module configured to perform perception processing on the acquired monocular image to obtain an image-perception map element result;
a positioning module configured to obtain six degree of freedom information of the vehicle based on the GNSS signals, the IMU signals, and the wheel speed signals;
and a roadside sign element calculation module configured to calculate roadside sign elements based on the image-perception map element result and the six-degree-of-freedom information of the vehicle, to obtain the three-dimensional information of the roadside sign.
9. An electronic device storing computer program instructions, wherein the program instructions, when executed by a processor, implement the sign element reconstruction method based on a monocular camera of any one of claims 1-7.
10. A computer-readable storage medium having computer program instructions stored thereon, wherein the program instructions, when executed by a processor, implement the sign element reconstruction method based on a monocular camera of any one of claims 1-7.
CN202211156746.2A 2022-09-22 2022-09-22 Label element reconstruction method, system, device and medium based on monocular camera Pending CN115526987A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211156746.2A CN115526987A (en) 2022-09-22 2022-09-22 Label element reconstruction method, system, device and medium based on monocular camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211156746.2A CN115526987A (en) 2022-09-22 2022-09-22 Label element reconstruction method, system, device and medium based on monocular camera

Publications (1)

Publication Number Publication Date
CN115526987A true CN115526987A (en) 2022-12-27

Family

ID=84699499

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211156746.2A Pending CN115526987A (en) 2022-09-22 2022-09-22 Label element reconstruction method, system, device and medium based on monocular camera

Country Status (1)

Country Link
CN (1) CN115526987A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116415353A * 2023-03-13 2023-07-11 Tsinghua University Modeling method for design requirements of perception system based on automatic driving function

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20220053513A * 2019-10-16 2022-04-29 Shanghai SenseTime Lingang Intelligent Technology Co., Ltd. Image data automatic labeling method and device
CN114463504A (en) * 2022-01-25 2022-05-10 清华大学 Monocular camera-based roadside linear element reconstruction method, system and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TUOPU WEN et al.: "Roadside HD Map Object Reconstruction Using Monocular Camera", IEEE Robotics and Automation Letters, 3 July 2022 (2022-07-03), pages 7722-7729 *


Similar Documents

Publication Publication Date Title
CN110008851B (en) Method and equipment for detecting lane line
US20210149022A1 (en) Systems and methods for 3d object detection
JP5430456B2 (en) Geometric feature extraction device, geometric feature extraction method, program, three-dimensional measurement device, object recognition device
US20180131924A1 (en) Method and apparatus for generating three-dimensional (3d) road model
Senlet et al. A framework for global vehicle localization using stereo images and satellite and road maps
Jeong et al. Hdmi-loc: Exploiting high definition map image for precise localization via bitwise particle filter
KR20210137893A (en) Method and system for determining position of vehicle
CN112327326A (en) Two-dimensional map generation method, system and terminal with three-dimensional information of obstacles
CN111174722A (en) Three-dimensional contour reconstruction method and device
CN115526987A (en) Label element reconstruction method, system, device and medium based on monocular camera
JP2019028653A (en) Object detection method and object detection device
JP7337617B2 (en) Estimation device, estimation method and program
Lu et al. Lane marking-based vehicle localization using low-cost GPS and open source map
Crombez et al. Using dense point clouds as environment model for visual localization of mobile robot
Li et al. On automatic and dynamic camera calibration based on traffic visual surveillance
CN114463504A (en) Monocular camera-based roadside linear element reconstruction method, system and storage medium
Cappelle et al. Localisation in urban environment using GPS and INS aided by monocular vision system and 3D geographical model
CN113313824B (en) Three-dimensional semantic map construction method
US11733373B2 (en) Method and device for supplying radar data
CN112528918A (en) Road element identification method, map marking method and device and vehicle
Lee et al. Semi-automatic framework for traffic landmark annotation
CN111860084B (en) Image feature matching and positioning method and device and positioning system
CN111060114A (en) Method and device for generating feature map of high-precision map
Kyutoku et al. Vehicle ego-localization with a monocular camera using epipolar geometry constraints
Winkens et al. Optical truck tracking for autonomous platooning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination