CN115620250A - Road surface element reconstruction method, device, electronic device and storage medium - Google Patents


Info

Publication number
CN115620250A
CN115620250A (application CN202211090620.XA)
Authority
CN
China
Prior art keywords
target
road surface
feature map
determining
uncertainty
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211090620.XA
Other languages
Chinese (zh)
Inventor
Name withheld at the inventor's request
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaozhi Ruoyu Intelligent Technology Co ltd
Original Assignee
Beijing Xiaozhi Ruoyu Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaozhi Ruoyu Intelligent Technology Co ltd filed Critical Beijing Xiaozhi Ruoyu Intelligent Technology Co ltd
Publication of CN115620250A
Legal status: Pending (current)

Classifications

    • G06V 20/56: Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06N 3/08: Computing arrangements based on biological models; neural networks; learning methods
    • G06T 17/05: Three-dimensional [3D] modelling; geographic models
    • G06T 7/246: Image analysis; analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06V 10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06V 10/40: Extraction of image or video features
    • G06V 10/82: Image or video recognition or understanding using pattern recognition or machine learning; using neural networks
    • G06T 2200/04: Indexing scheme for image data processing or generation involving 3D image data
    • G06T 2200/08: Indexing scheme involving all processing steps from image acquisition to 3D model generation
    • G06T 2207/10012: Image acquisition modality; stereo images
    • G06T 2207/20081: Special algorithmic details; training; learning
    • G06T 2207/20084: Special algorithmic details; artificial neural networks [ANN]
    • G06T 2207/30252: Subject of image; vehicle exterior; vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Geometry (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Remote Sensing (AREA)
  • Computer Graphics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

A road surface element reconstruction method, apparatus, electronic device, and storage medium are disclosed. The method includes: acquiring an environment image set comprising multi-frame images of different perspectives, collected at the current moment for the environment around a movable device by a plurality of image acquisition devices arranged at different orientations of the device, where each frame of image in the set contains at least one road surface target; generating, via a neural network and based on the environment image set, a first feature map containing semantic segmentation information in bird's-eye-view space; determining the pixel region of each road surface target in the first feature map based on the semantic segmentation information; mapping the pixel region of each road surface target in the first feature map into a preset coordinate system corresponding to the movable device to obtain a second feature map; performing multi-target tracking on the second feature map to obtain a tracking result; and generating a road surface element reconstruction result based on the tracking result. The method and apparatus can ensure the integrity of the road surface element reconstruction result.

Description

Road surface element reconstruction method, device, electronic device and storage medium
Technical Field
The present disclosure relates to driving technologies, and in particular, to a road surface element reconstruction method, apparatus, electronic device, and storage medium.
Background
Environmental perception is a research focus in fields such as autonomous driving and scene understanding. Road surface element reconstruction is often required during environmental perception, but existing reconstruction approaches only reconstruct and output road surface targets in front of a forward-looking camera; road surface targets outside the field of view of the forward-looking camera cannot be reconstructed.
Disclosure of Invention
Embodiments of the present disclosure provide a road surface element reconstruction method and apparatus, an electronic device, and a storage medium, aiming to solve the problem that existing road surface element reconstruction methods cannot reconstruct road surface targets outside the field of view of a forward-looking camera, which compromises the integrity of the road surface element reconstruction result.
According to an aspect of an embodiment of the present disclosure, there is provided a road surface element reconstruction method including:
acquiring an environment image set, the environment image set comprising: multi-frame images of different perspectives, collected at the current moment for the environment around a movable device by a plurality of image acquisition devices arranged at different orientations of the movable device, wherein each frame of image in the environment image set includes at least one road surface target;
generating a first feature map containing semantic segmentation information in a bird's-eye view space through a neural network based on the environment image set;
determining respective pixel regions of the road surface targets in the first feature map based on the semantic segmentation information;
mapping the pixel region of each road surface target in the first feature map into a preset coordinate system corresponding to the movable device to obtain a second feature map;
performing multi-target tracking on the second feature map to obtain a tracking result;
and generating a road surface element reconstruction result based on the tracking result.
According to another aspect of the embodiments of the present disclosure, there is provided a road surface element reconstructing apparatus including:
a first acquisition module configured to acquire an environment image set, the environment image set comprising: multi-frame images of different perspectives, collected at the current moment for the environment around a movable device by a plurality of image acquisition devices arranged at different orientations of the movable device, wherein each frame of image in the environment image set includes at least one road surface target;
a first generating module, configured to generate, based on the environment image set acquired by the first acquiring module, a first feature map including semantic segmentation information in a bird's-eye view space via a neural network;
a determining module, configured to determine the pixel region of each road surface target in the first feature map based on the semantic segmentation information contained in the first feature map generated by the first generating module;
a second obtaining module, configured to map the pixel region of each road surface target in the first feature map, as determined by the determining module, into a preset coordinate system corresponding to the movable device to obtain a second feature map;
a multi-target tracking module, configured to perform multi-target tracking on the second feature map obtained by the second obtaining module to obtain a tracking result;
and a second generating module, configured to generate a road surface element reconstruction result based on the tracking result obtained by the multi-target tracking module.
According to still another aspect of an embodiment of the present disclosure, there is provided a computer-readable storage medium storing a computer program for executing the above-described road surface element reconstructing method.
According to still another aspect of an embodiment of the present disclosure, there is provided an electronic device including:
a processor;
a memory for storing the processor-executable instructions;
and the processor is configured to read the executable instructions from the memory and execute the instructions to implement the above road surface element reconstruction method.
Based on the road surface element reconstruction method, apparatus, electronic device, and storage medium provided by the embodiments of the present disclosure, a first feature map containing semantic segmentation information in bird's-eye-view space may be generated via a neural network based on an environment image set that includes multi-frame images of different perspectives acquired by a plurality of image acquisition devices disposed at different orientations of a movable device. A second feature map may then be generated based on the first feature map, and a road surface element reconstruction result may be generated based on the tracking result obtained by performing multi-target tracking on the second feature map. In this way, the input to road surface element reconstruction is no longer the observation of a single forward-looking camera but multi-view observation from a plurality of image acquisition devices, and the road surface elements are ultimately reconstructed in bird's-eye-view space, which ensures the integrity of the road surface element reconstruction result.
The technical solution of the present disclosure is further described in detail by the accompanying drawings and examples.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following detailed description of the embodiments of the present disclosure when taken in conjunction with the accompanying drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure and not to limit the disclosure. In the drawings, like reference numbers generally represent like parts or steps.
Fig. 1 is a schematic view of a scenario to which the present disclosure is applicable.
Fig. 2 is a schematic flow chart of a road surface element reconstruction method according to an exemplary embodiment of the present disclosure.
Fig. 3 is a schematic diagram of a road surface element reconstruction result in an exemplary embodiment of the present disclosure.
Fig. 4 is a schematic flowchart of a road surface element reconstruction method according to another exemplary embodiment of the disclosure.
Fig. 5 is a schematic flow chart of a road surface element reconstruction method according to still another exemplary embodiment of the disclosure.
Fig. 6 is a schematic flow chart of a road surface element reconstruction method according to still another exemplary embodiment of the disclosure.
Fig. 7 is a schematic flow chart of a road surface element reconstruction method according to still another exemplary embodiment of the disclosure.
Fig. 8 is a schematic structural diagram of a road surface element reconstruction device according to an exemplary embodiment of the present disclosure.
Fig. 9 is a schematic structural diagram of a road surface element reconstruction device according to another exemplary embodiment of the present disclosure.
Fig. 10 is a schematic structural diagram of a road surface element reconstruction device according to still another exemplary embodiment of the present disclosure.
Fig. 11 is a schematic structural diagram of a road surface element reconstruction device according to still another exemplary embodiment of the present disclosure.
Fig. 12 is a block diagram of an electronic device provided in an exemplary embodiment of the present disclosure.
Detailed Description
Hereinafter, example embodiments according to the present disclosure will be described in detail with reference to the accompanying drawings. It should be understood that the described embodiments are merely a subset of the embodiments of the present disclosure, and that the present disclosure is not limited to the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
It will be understood by those skilled in the art that terms such as "first" and "second" in the embodiments of the present disclosure are used merely to distinguish one element from another, and imply neither a particular technical meaning nor a necessary logical order between them.
It is also understood that in embodiments of the present disclosure, "a plurality" may refer to two or more and "at least one" may refer to one, two or more.
It is also to be understood that any reference to any component, data, or structure in the embodiments of the present disclosure may be generally understood as one or more, unless explicitly defined otherwise or indicated to the contrary hereinafter.
In addition, the term "and/or" in the present disclosure merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone. The character "/" in the present disclosure generally indicates an "or" relationship between the preceding and following associated objects.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and the same or similar parts may be referred to each other, so that the descriptions thereof are omitted for brevity.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
The disclosed embodiments may be applied to electronic devices such as terminal devices, computer systems, servers, etc., which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with electronic devices, such as terminal devices, computer systems, servers, and the like, include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
Summary of the application
In some cases, a vehicle equipped with an Advanced Driving Assistance System (ADAS) needs to perform road surface element reconstruction.
In the process of implementing the present disclosure, the inventors found that existing road surface element reconstruction methods are implemented based on a conventional camera sensor: only road surface targets in front of a forward-looking camera arranged on the vehicle can be reconstructed and output, and road surface targets outside the field of view of the forward-looking camera cannot be reconstructed. As a result, the field of view covered by the road surface element reconstruction result is only that of the forward-looking camera, which is narrow, and the integrity of the reconstruction result cannot be ensured.
Exemplary System
As shown in fig. 1, a scene diagram to which the present disclosure is applicable may include: a vehicle 12, an electronic device 14, and a plurality of cameras 16; wherein the electronic device 14 may be located on the vehicle 12, or the electronic device 14 may not be located on the vehicle 12, but may be in remote communication with the vehicle 12; multiple cameras 16 may be provided at different orientations of the vehicle 12.
In a specific implementation, the plurality of cameras 16 may periodically collect environment images and provide them to the electronic device 14, and the electronic device 14 may execute the road surface element reconstruction method of the embodiments of the present disclosure based on the images provided by the plurality of cameras 16, obtaining a road surface element reconstruction result that covers a wider field of view and thereby ensuring the integrity of the road surface element reconstruction result.
Exemplary method
Fig. 2 is a schematic flow chart of a road surface element reconstruction method according to an exemplary embodiment of the present disclosure. The method shown in fig. 2 includes step 210, step 220, step 230, step 240, step 250 and step 260, each of which is described below.
Step 210, acquiring an environment image set, where the environment image set includes: multi-frame images of different perspectives, collected at the current moment for the environment around a movable device by a plurality of image acquisition devices arranged at different orientations of the movable device, and each frame of image in the environment image set includes at least one road surface target.
Optionally, the movable device may be a vehicle, such as vehicle 12 in FIG. 1, and the image acquisition device may be a camera, such as camera 16 in FIG. 1. The number of image acquisition devices arranged at different orientations of the movable device may be four, arranged respectively at the front, front-left, front-right, and rear of the movable device; of course, the number may also be three, five, or more than five.
It should be noted that the road surface objects involved in the embodiments of the present disclosure all belong to static road surface elements, including but not limited to zebra stripes, stop lines, road surface direction arrows, and the like.
Step 220, generating a first feature map containing semantic segmentation information in a Bird's Eye View (BEV) space through a neural network based on the environment image set.
In step 220, the set of environment images obtained in step 210 may be provided as an input to a neural network, and the neural network may perform an operation based on the input to generate a first feature map containing semantic segmentation information in the bird's-eye view space; the semantic segmentation information may include respective categories of all pixel points in the first feature map, and the categories of the pixel points include, but are not limited to, a zebra crossing category, a stop line category, a road surface direction arrow category, and the like; the first feature map may also be referred to as a BEV spatial segmentation result or BEV observation.
Step 230, determining the respective pixel regions of the road surface objects in the first feature map based on the semantic segmentation information.
Since the semantic segmentation information includes respective categories of all the pixels in the first feature map, in step 230, the respective pixel regions of each road surface target in the first feature map can be extracted efficiently and quickly by referring to the semantic segmentation information; the category of each pixel point in the pixel region corresponding to any road surface target is the category corresponding to the road surface target.
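For illustration only, the following is a minimal sketch of how the per-target pixel regions might be extracted from such a segmentation map; it is not the disclosed implementation, and the class IDs and the use of connected-component labeling are assumptions made for the example.

```python
import numpy as np
from scipy import ndimage

# Assumed class IDs for the example; the disclosure does not fix these values.
CLASSES = {1: "zebra_crossing", 2: "stop_line", 3: "road_arrow"}

def extract_pixel_regions(seg_map: np.ndarray):
    """seg_map: (H, W) array of per-pixel class IDs (the semantic
    segmentation information of the first feature map).
    Returns a list of (class_name, binary_mask) pairs, one per target."""
    regions = []
    for class_id, name in CLASSES.items():
        # Connected components separate multiple targets of the same class.
        labeled, num = ndimage.label(seg_map == class_id)
        for i in range(1, num + 1):
            regions.append((name, labeled == i))
    return regions
```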
Step 240, mapping the pixel region of each road surface target in the first feature map into a preset coordinate system corresponding to the movable device to obtain a second feature map.
For ease of understanding, the embodiments of the present disclosure are described taking as an example the case where the preset coordinate system corresponding to the movable device is the vehicle body coordinate system (Vehicle Coordinate System).
In step 240, a conversion relationship between the bird's-eye-view coordinate system and the vehicle body coordinate system may be established in advance, and the pixel regions of the road surface targets in the first feature map may be mapped onto a blank feature map of predetermined size in the vehicle body coordinate system to form the second feature map; the second feature map may include a rectangular frame for each road surface target.
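As a minimal sketch of this mapping, assuming the BEV grid is axis-aligned with the vehicle body frame so that only the resolution and origin differ (both parameter values below are illustrative, not taken from the disclosure):

```python
import numpy as np

def bev_pixels_to_vehicle(rows, cols, res=0.1, origin=(20.0, 10.0)):
    """Map BEV pixel indices to vehicle-body coordinates in meters.
    res: meters per pixel; origin: position of the vehicle in the BEV
    map, in meters. Both values are assumptions for the example."""
    x = origin[0] - rows * res  # vehicle forward axis
    y = origin[1] - cols * res  # vehicle left axis
    return np.stack([x, y], axis=-1)

# Example: map the pixels of one target mask into the vehicle frame.
mask = np.zeros((400, 200), dtype=bool)
mask[100:110, 50:60] = True
rows, cols = np.nonzero(mask)
pts_vehicle = bev_pixels_to_vehicle(rows, cols)
```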
Step 250, performing multi-object tracking (MOT) on the second feature map to obtain a tracking result.
It should be noted that, in the embodiments of the present disclosure, environment image sets may be acquired periodically, and after each acquisition the corresponding first feature map and second feature map may be generated, so that second feature maps are obtained periodically and form a sequence of multiple frames. In step 250, a multi-target tracking algorithm may be used to perform multi-target tracking on this sequence: the correspondence between road surface targets in consecutive frames is established, the same road surface target in adjacent frames is associated, and a unique tracking ID is assigned to it, thereby obtaining the tracking result.
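The excerpt does not specify the matcher; as a hedged sketch, a simple greedy nearest-center association with illustrative thresholds could look as follows:

```python
import numpy as np
from itertools import count

_next_id = count()  # source of unique tracking IDs

def associate(tracks, detections, max_dist=2.0):
    """tracks: dict {track_id: center (2,)}; detections: list of centers.
    Links each detection to the nearest free track; unmatched detections
    start new tracks with fresh IDs."""
    result, free = {}, dict(tracks)
    for det in detections:
        if free:
            tid, center = min(free.items(),
                              key=lambda kv: np.linalg.norm(kv[1] - det))
            if np.linalg.norm(center - det) < max_dist:
                result[tid] = det
                del free[tid]
                continue
        result[next(_next_id)] = det  # new road surface target
    return result
```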
And step 260, generating a road surface element reconstruction result based on the tracking result.
Alternatively, the effect map of the road element reconstruction result may be as shown in fig. 3, and the road element reconstruction result may be used to present information of the respective positions, sizes, and the like of the respective road objects existing in the environment around the movable apparatus.
In the embodiments of the present disclosure, a first feature map containing semantic segmentation information in bird's-eye-view space may be generated via a neural network based on an environment image set that includes multi-frame images of different perspectives acquired by a plurality of image acquisition devices disposed at different orientations of a movable device. A second feature map may be generated based on the first feature map, and a road surface element reconstruction result may be generated based on the tracking result obtained by performing multi-target tracking on the second feature map. In this way, the input to road surface element reconstruction is no longer the observation of a single forward-looking camera but multi-view observation from a plurality of image acquisition devices, and the road surface elements are ultimately reconstructed in bird's-eye-view space, which ensures the integrity of the road surface element reconstruction result.
Based on the embodiment shown in fig. 2, as shown in fig. 4, the method further comprises step 252 before step 260.
Step 252, obtaining a first target feature map sequence corresponding to a plurality of historical environment image sets, where the plurality of historical environment image sets are the environment image sets acquired at a plurality of historical moments by the plurality of image acquisition devices arranged at different orientations of the movable device, and the first target feature map sequence includes: multiple frames of historical second feature maps corresponding one-to-one to the plurality of historical environment image sets.
As described above, in the embodiments of the present disclosure, environment image sets may be acquired periodically. The environment image set obtained at the current moment may be called the current environment image set, and the second feature map corresponding to it the current second feature map. At each of the plurality of historical moments (i.e., moments other than the current moment), a corresponding environment image set may also have been obtained; such a set may be called a historical environment image set, and the second feature map corresponding to it a historical second feature map. The multiple frames of historical second feature maps corresponding one-to-one to the plurality of historical environment image sets are arranged in order to form the first target feature map sequence.
Step 260 may include step 2601, step 2603, step 2605, step 2607, and step 2609.
Step 2601, determining the tracked first road surface target based on the tracking result corresponding to the first target feature map sequence.
It should be noted that, by performing multi-target tracking on the first target feature map sequence, a tracking result corresponding to the first target feature map sequence may be obtained, and based on the obtained tracking result, it may be determined which road surface targets have been successfully tracked, and each of the road surface targets that have been successfully tracked may be used as a first road surface target.
Step 2603, taking the first state variable of the first road surface target in the last frame of historical second feature map in the first target feature map sequence as a reference state variable, and taking the first uncertainty corresponding to the first state variable as a reference uncertainty.
Alternatively, the state variables of any one of the road surface targets may include: three-dimensional coordinates of a center point of the rectangular frame of the road surface target, and a length, a width, and a yaw angle of the rectangular frame of the road surface target.
Optionally, the uncertainty of any state variable can be used to characterize the reliability of that state variable, and any uncertainty can be represented by a covariance matrix.
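For concreteness, a minimal sketch of one target's state variable and covariance, assuming the six-dimensional ordering (x, y, z, l, w, θ) and purely illustrative initial noise levels:

```python
import numpy as np

# State of one road surface target: center (x, y, z), length, width, yaw.
x0 = np.array([5.0, -1.0, 0.0, 3.0, 0.4, 0.02])

# Initial uncertainty as a 6x6 covariance matrix; the diagonal noise
# levels are illustrative (the disclosure derives them from experiments).
P0 = np.diag([0.5, 0.5, 0.1, 0.3, 0.1, 0.05]) ** 2
```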
Step 2605, determining a second state variable of the first road surface target corresponding to the current second feature map, and a second uncertainty of the second state variable, based on the reference state variable, the reference uncertainty, first observation information of the first road surface target in the current second feature map, and the observation noise.
Optionally, the first observation information of the first road surface target in the current second feature map may include the three-dimensional coordinates of the four vertices of the detection frame of the first road surface target in the current second feature map; the observation noise refers to the noise present when observing the current second feature map.
In one embodiment, step 2605 includes:
converting the first observation information into a world coordinate system to obtain second observation information;
determining a target Jacobian matrix of the second observation information relative to the state variable;
determining a Kalman gain based on the reference uncertainty, the target Jacobian matrix and the observation noise;
determining a second state variable based on the reference state variable, the Kalman gain, the second observation information and the target Jacobian matrix;
a second uncertainty is determined based on the identity matrix, the kalman gain, the target jacobian matrix, and the reference uncertainty.
It should be noted that the conversion relationship between the vehicle body coordinate system and the world coordinate system may be predetermined, specifically in the form of a conversion matrix. The first observation information may be regarded as observation information in the vehicle body coordinate system; after it is obtained, the conversion matrix may be used to convert it from the vehicle body coordinate system to the world coordinate system, yielding the second observation information. In a specific example, the first observation information includes four three-dimensional coordinates; each of them is multiplied by the conversion matrix to obtain four corresponding three-dimensional coordinates, which may constitute the second observation information.
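A minimal sketch of this conversion, assuming the standard 4x4 homogeneous-transform form (the matrix values below are placeholders):

```python
import numpy as np

# Assumed vehicle-to-world transform T_wv: identity rotation and an
# illustrative translation, standing in for the predetermined matrix.
T_wv = np.eye(4)
T_wv[:3, 3] = [100.0, 50.0, 0.0]

def to_world(vertices_vehicle: np.ndarray) -> np.ndarray:
    """vertices_vehicle: (4, 3) detection-frame vertices in the body frame
    (first observation information). Returns (4, 3) world coordinates
    (second observation information)."""
    homo = np.hstack([vertices_vehicle, np.ones((len(vertices_vehicle), 1))])
    return (T_wv @ homo.T).T[:, :3]
```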
After the second observation information is obtained, the target Jacobian matrix of the second observation information with respect to the state variable may be determined; the target Jacobian matrix characterizes the gradient information of the second observation information with respect to the state variable.
Optionally, determining the target Jacobian matrix of the second observation information with respect to the state variable includes:
determining the observed three-dimensional coordinates of the upper-left, upper-right, lower-left, and lower-right vertices of the detection frame of the first road surface target based on the second observation information, obtaining four observed three-dimensional coordinates;
determining the length, width, and yaw angle of the detection frame of the first road surface target based on the four observed three-dimensional coordinates;
determining a first Jacobian matrix corresponding to an upper left vertex, a second Jacobian matrix corresponding to an upper right vertex, a third Jacobian matrix corresponding to a lower left vertex and a fourth Jacobian matrix corresponding to a lower right vertex based on the length, the width and the yaw angle as well as a rotation matrix between a preset coordinate system and a world coordinate system;
a target jacobian matrix is determined based on the first, second, third, and fourth jacobian matrices.
It should be noted that the detection frames involved in the embodiments of the present disclosure may be rectangular frames.
Suppose that in the second observation information the three-dimensional coordinate of the upper-left vertex is $y_1=(x_{tl},y_{tl},z_{tl})^T$, that of the upper-right vertex is $y_2=(x_{tr},y_{tr},z_{tr})^T$, that of the lower-left vertex is $y_3=(x_{bl},y_{bl},z_{bl})^T$, and that of the lower-right vertex is $y_4=(x_{br},y_{br},z_{br})^T$. The four observed three-dimensional coordinates of the upper-left, upper-right, lower-left, and lower-right vertices are then $y_1$, $y_2$, $y_3$, $y_4$, in that order.
Next, the length, width, and yaw angle of the detection frame of the first road surface target may be determined based on the four observed three-dimensional coordinates; the formulas used may be:

$$l=\sqrt{(x_{tl}-x_{bl})^2+(y_{tl}-y_{bl})^2}$$

$$w=\sqrt{(x_{tl}-x_{tr})^2+(y_{tl}-y_{tr})^2}$$

$$\theta=\arctan\bigl((y_{tl}-y_{tr})/(x_{tl}-x_{tr})\bigr)$$

where $l$, $w$, and $\theta$ denote, in order, the length, width, and yaw angle of the detection frame of the first road surface target.
Optionally, based on the four observed three-dimensional coordinates, the three-dimensional coordinates of the center point of the detection frame of the first road surface target may also be determined, using:

$$x=(x_{tl}+x_{tr}+x_{br}+x_{bl})/4$$

$$y=(y_{tl}+y_{tr}+y_{br}+y_{bl})/4$$

$$z=z_{tl}=z_{tr}=z_{br}=z_{bl}$$

where $x$, $y$, and $z$ denote, in order, the x, y, and z coordinates of the center point of the detection frame of the first road surface target.
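A small sketch of these measurement-model quantities; which horizontal edge carries l versus w is an assumption (the source renders the l and w formulas as images), and arctan2 is used instead of arctan only for quadrant robustness:

```python
import numpy as np

def box_measurements(y1, y2, y3, y4):
    """y1..y4: world-frame 3-D coordinates of the upper-left, upper-right,
    lower-left, and lower-right vertices (each of shape (3,)).
    Returns center (x, y, z), length l, width w, and yaw theta."""
    tl, tr, bl = y1, y2, y3
    center = (y1 + y2 + y3 + y4) / 4.0
    l = np.hypot(*(tl[:2] - bl[:2]))  # edge running top to bottom (assumed)
    w = np.hypot(*(tl[:2] - tr[:2]))  # edge running left to right (assumed)
    theta = np.arctan2(tl[1] - tr[1], tl[0] - tr[0])
    return center, l, w, theta
```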
Next, a first jacobian matrix corresponding to the upper left vertex, a second jacobian matrix corresponding to the upper right vertex, a third jacobian matrix corresponding to the lower left vertex, and a fourth jacobian matrix corresponding to the lower right vertex may be determined based on the determined length, width, yaw angle, and rotation matrix between the body coordinate system and the world coordinate system.
Optionally, the transformation matrix between the vehicle body coordinate system and the world coordinate system may be predetermined and expressed in the form:

$$T_{wv}=\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

where $T_{wv}$ denotes the transformation matrix between the vehicle body coordinate system and the world coordinate system, the nine elements $r_{11},\dots,r_{33}$ in its upper-left corner constitute the rotation matrix between the two coordinate systems, and $t_1$, $t_2$, $t_3$ denote the translation components along the x, y, and z directions, respectively.
Optionally, with the first Jacobian matrix denoted $J_{tl}$:

$$J_{tl}=\begin{bmatrix} J_{x_{tl}\theta} & J_{x_{tl}l} & J_{x_{tl}w} \\ J_{y_{tl}\theta} & J_{y_{tl}l} & J_{y_{tl}w} \\ J_{z_{tl}\theta} & J_{z_{tl}l} & J_{z_{tl}w} \end{bmatrix}$$

$$J_{x_{tl}\theta}=-\sin\theta\,(r_{11}l/2+r_{12}w/2)+\cos\theta\,(r_{12}l/2-r_{11}w/2)$$
$$J_{x_{tl}l}=(r_{11}\cos\theta+r_{12}\sin\theta)/2$$
$$J_{x_{tl}w}=(r_{12}\cos\theta-r_{11}\sin\theta)/2$$
$$J_{y_{tl}\theta}=-\sin\theta\,(r_{21}l/2+r_{22}w/2)+\cos\theta\,(r_{22}l/2-r_{21}w/2)$$
$$J_{y_{tl}l}=(r_{21}\cos\theta+r_{22}\sin\theta)/2$$
$$J_{y_{tl}w}=(r_{22}\cos\theta-r_{21}\sin\theta)/2$$
$$J_{z_{tl}\theta}=-\sin\theta\,(r_{31}l/2+r_{32}w/2)+\cos\theta\,(r_{32}l/2-r_{31}w/2)$$
$$J_{z_{tl}l}=(r_{31}\cos\theta+r_{32}\sin\theta)/2$$
$$J_{z_{tl}w}=(r_{32}\cos\theta-r_{31}\sin\theta)/2$$
optionally, the second Jacobian matrix is represented as J tr And then:
Figure BDA0003837114390000091
J xtr_θ =-sin(θ)·(r 11 ·l/2-r 12 ·w/2)+cos(θ)·(r 12 ·l/2+r 11 ·w/2)
J xtr_l =[(r 11 ·cos(θ)+r 12 ·sin(θ)]/2
J xtr_W =[(-r 12 ·cos(θ)+r 11 ·sin(θ)]/2
J ytr_θ =-sin(θ)·(r 21 ·l/2-r 22 ·w/2)+cos(θ)·(r 22 ·l/2+r 21 ·w/2)
J ytr_l =[(r 21 ·cos(θ)+r 22 ·sin(θ)]/2
J ytr_W =[(-r 22 ·cos(θ)+r 21 ·sin(θ)]/2
J ztr_θ =-sin(θ)·(r 31 ·l/2-r 32 ·w/2)+cos(θ)·(r 32 ·l/2+r 31 ·w/2)
J ztr_l =[(r 31 ·cos(θ)+r 32 ·sin(θ)]/2
J ztr_W =[(-r 32 ·cos(θ)+r 31 ·sin(θ)]/2
optionally, the third Jacobian matrix is represented as J bl Then:
Figure BDA0003837114390000092
J xbl_θ =-sin(θ)·(-r 11 ·l/2+r 12 ·w/2)+cos(θ)·(-r 12 ·l/2-r 11 ·w/2)
J xbl_l =-(r 11 ·cos(θ)+r 12 ·sin(θ)]/2
J xbl_W =[(r 12 ·cos(θ)-r 11 ·sin(θ)]/2
J ybl_θ =-sin(θ)·(-r 21 ·l/2+r 22 ·w/2)+cos(θ)·(-r 22 ·l/2-r 21 ·w/2)J ybl_l =-(r 21 ·cos(θ)+r 22 ·sin(θ)]/2
J ybl_W =[(r 22 ·cos(θ)-r 21 ·sin(θ)]/2
J zbl_θ =-sin(θ)·(-r 31 ·l/2+r 32 ·w/2)+cos(θ)·(-r 32 ·l/2-r 31 ·w/2)
J zbl_l =-(r 31 ·cos(θ)+r 32 ·sin(θ)]/2
J zbl_W =[(r 32 ·cos(θ)-r 31 ·sin(θ)]/2
optionally, the fourth Jacobian matrix is represented as J br And then:
Figure BDA0003837114390000093
J xbr_θ =-sin(θ)·(-r 11 ·l/2-r 12 ·w/2)+cos(θ)·(-r 12 ·l/2+r 11 ·w/2)
J xbr_l =-(r 11 ·cos(θ)+r 12 ·sin(θ)]/2
J xbr_W =[(-r 12 ·cos(θ)+r 11 ·sin(θ)]/2
J ybr_θ =-sin(θ)·(-r 21 ·l/2-r 22 ·w/2)+cos(θ)·(-r 22 ·l/2+r 21 ·w/2)
J ybr_l =-(r 21 ·cos(θ)+r 22 ·sin(θ)]/2
J ybr_W =[(-r 22 ·cos(θ)+r 21 ·sin(θ)]/2
J zbr_θ =-sin(θ)·(-r 31 ·l/2-r 32 ·w/2)+cos(θ)·(-r 32 ·l/2+r 31 ·w/2)
J zbr_l =-(r 31 ·cos(θ)+r 32 ·sin(θ)]/2
J zbr_W =[(-r 32 ·cos(θ)+r 31 ·sin(θ)]/2
In the above formulas, $l$ denotes the length, $w$ the width, and $\theta$ the yaw angle; $J_{x_{tl}\theta},\dots,J_{z_{tl}w}$ are the intermediate variables used to compute $J_{tl}$; $J_{x_{tr}\theta},\dots,J_{z_{tr}w}$ those used to compute $J_{tr}$; $J_{x_{bl}\theta},\dots,J_{z_{bl}w}$ those used to compute $J_{bl}$; and $J_{x_{br}\theta},\dots,J_{z_{br}w}$ those used to compute $J_{br}$.
After $J_{tl}$, $J_{tr}$, $J_{bl}$, and $J_{br}$ have been formed, the target Jacobian matrix $J_{12\times 6}$ can be determined by stacking the four per-vertex blocks; since each vertex equals the center point plus a rotated offset, the derivatives of each vertex with respect to the center coordinates form identity blocks:

$$J_{12\times 6}=\begin{bmatrix} I_3 & J_{tl} \\ I_3 & J_{tr} \\ I_3 & J_{bl} \\ I_3 & J_{br} \end{bmatrix}$$

where $I_3$ denotes the $3\times 3$ identity matrix.
In this embodiment, the four observed three-dimensional coordinates can be obtained efficiently and reliably from the second observation information; the length, width, and yaw angle of the first road surface target can then be determined from those coordinates, and substituting them into the corresponding formulas yields the first to fourth Jacobian matrices, and hence the target Jacobian matrix, for use in the subsequent steps.
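A compact numeric sketch of the per-vertex blocks and the stacked target Jacobian, under the offset-sign and column-ordering assumptions stated above (state columns ordered (x, y, z, θ, l, w); the identity blocks are an interpretation, not verbatim from the source):

```python
import numpy as np

def vertex_block(R, l, w, theta, sl, sw):
    """3x3 block of derivatives of one vertex w.r.t. (theta, l, w).
    sl, sw: offset signs of the vertex, e.g. (+1, +1) for upper-left."""
    a, b = sl * l / 2.0, sw * w / 2.0
    s, c = np.sin(theta), np.cos(theta)
    J = np.zeros((3, 3))
    for i in range(3):
        r1, r2 = R[i, 0], R[i, 1]
        J[i, 0] = -s * (r1 * a + r2 * b) + c * (r2 * a - r1 * b)  # d/dtheta
        J[i, 1] = sl * (r1 * c + r2 * s) / 2.0                    # d/dl
        J[i, 2] = sw * (r2 * c - r1 * s) / 2.0                    # d/dw
    return J

def target_jacobian(R, l, w, theta):
    """Stack the four per-vertex blocks into the 12x6 target Jacobian; the
    identity blocks are the derivatives w.r.t. the center coordinates."""
    signs = [(+1, +1), (+1, -1), (-1, +1), (-1, -1)]  # tl, tr, bl, br
    rows = [np.hstack([np.eye(3), vertex_block(R, l, w, theta, sl, sw)])
            for sl, sw in signs]
    return np.vstack(rows)
```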
Specifically, a Kalman gain may be determined based on the reference uncertainty, the target Jacobian matrix, and the observation noise; a second state variable may be determined based on the reference state variable, the Kalman gain, the second observation information, and the target Jacobian matrix; and a second uncertainty may be determined based on the identity matrix, the Kalman gain, the target Jacobian matrix, and the reference uncertainty.
Optionally, any uncertainty may be characterized by a covariance matrix, and the Kalman gain may be determined by:

$$K=P_{k-1}H^T\left(HP_{k-1}H^T+R_k\right)^{-1}$$

where $K$ denotes the Kalman gain, $P_{k-1}$ the reference uncertainty, $H$ the target Jacobian matrix, and $R_k$ the observation noise.
Optionally, the second state variable may be determined by:

$$x_k=x_{k-1}+K\left(y_k-Hx_{k-1}\right)$$

where $x_k$ denotes the second state variable, $x_{k-1}$ the reference state variable, $K$ the Kalman gain, $y_k$ the second observation information, and $H$ the target Jacobian matrix.
Optionally, with any uncertainty characterized by a covariance matrix, the second uncertainty may be determined by:

$$P_k=(I-KH)P_{k-1}$$

where $P_k$ denotes the second uncertainty, $I$ the identity matrix, $K$ the Kalman gain, $H$ the target Jacobian matrix, and $P_{k-1}$ the reference uncertainty.
Therefore, once the reference uncertainty, target Jacobian matrix, observation noise, reference state variable, second observation information, and related quantities are known, the Kalman gain, second state variable, and second uncertainty can be determined efficiently and reliably simply by substituting them into the corresponding formulas.
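A minimal sketch of this measurement update, directly implementing the three formulas above; note that the predicted observation is written as H·x(k-1) exactly as in the text, whereas a full EKF implementation might evaluate the nonlinear measurement function instead (a detail the excerpt does not spell out):

```python
import numpy as np

def ekf_update(x_prev, P_prev, y_obs, H, R):
    """x_prev: (6,) reference state variable; P_prev: (6, 6) reference
    uncertainty; y_obs: (12,) second observation information (four
    world-frame vertices, flattened); H: (12, 6) target Jacobian;
    R: (12, 12) observation-noise covariance.
    Returns the second state variable and the second uncertainty."""
    S = H @ P_prev @ H.T + R                # innovation covariance
    K = P_prev @ H.T @ np.linalg.inv(S)     # Kalman gain
    x_new = x_prev + K @ (y_obs - H @ x_prev)
    P_new = (np.eye(len(x_prev)) - K @ H) @ P_prev
    return x_new, P_new
```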
Step 2607, in response to the second uncertainty satisfying a preset convergence condition, generating a road surface element reconstruction result based on the second state variable.
In response to the second uncertainty not satisfying the predetermined convergence condition, the reference state variable is updated to a second state variable, and the reference uncertainty is updated to the second uncertainty, step 2609.
After determining the second state variable and the second uncertainty, the second uncertainty may be compared to a preset uncertainty.
If the second uncertainty does not exceed the preset uncertainty, it may be determined that the second uncertainty satisfies the preset convergence condition, and a road surface element reconstruction result may be generated based on the second state variable. Alternatively, information on the position, size, and the like of the first road surface target presented by the road element reconstruction result may be determined based on the second state variable.
If the second uncertainty exceeds the preset uncertainty, it may be determined that the second uncertainty does not satisfy the preset convergence condition; the reference state variable is then updated to the second state variable and the reference uncertainty to the second uncertainty. The current second feature map becomes a historical second feature map and is appended to the first target feature map sequence, a new current second feature map is obtained, and a new second state variable and second uncertainty are computed, after which it is again determined whether the new second uncertainty satisfies the preset convergence condition, and so on.
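A sketch of this iterate-until-convergence loop, reusing the ekf_update sketch above; comparing a covariance matrix against a preset uncertainty is realized here as a trace threshold, which is an assumption (the excerpt does not fix the comparison), and get_observation and jacobian_of are hypothetical helpers:

```python
import numpy as np

def track_until_converged(x, P, R, frames, trace_threshold=0.05):
    """x, P: reference state variable and uncertainty; frames: iterable of
    successive current second feature maps. Returns the converged state."""
    for frame in frames:
        y_obs = get_observation(frame)  # hypothetical: 12-D vertex observation
        H = jacobian_of(x)              # hypothetical: 12x6 target Jacobian
        x, P = ekf_update(x, P, y_obs, H, R)
        if np.trace(P) <= trace_threshold:  # assumed convergence test
            break  # reconstruction is generated from the converged state
    return x, P
```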
In one example, the number of the plurality of history times is 5, T1, T2, T3, T4, and T5, the current time is T6, the environment image sets corresponding to T1, T2, T3, T4, T5, and T6 are Q1, Q2, Q3, Q4, Q5, and Q6, respectively, and the second feature maps corresponding to T1, T2, T3, T4, T5, and T6 are Z1, Z2, Z3, Z4, Z5, and Z6, respectively.
After Z6 is obtained, it may be determined whether the state variable of the first road surface target corresponding to Z5, and the covariance matrix corresponding to that state variable, were previously determined.
If they were not previously determined, initialization of the state variable and covariance matrix may be performed. Optionally, a state variable may be determined for the first road surface target based on its observation information in Z6, using the above formulas to compute x, y, z, l, w, and θ; this state variable serves as the state variable of the first road surface target corresponding to Z6. In addition, the noise level of the initial state variable may be estimated from repeated experiments to initialize a covariance matrix, which serves as the covariance matrix of that state variable.
If the state variable of the first road surface target corresponding to Z5 and its covariance matrix were previously determined, the state variable of the first road surface target corresponding to Z6, and the covariance matrix corresponding to it, may be determined based on the state variable corresponding to Z5, the covariance matrix corresponding to Z5, and the observation information and observation noise of the first road surface target in Z6.
If the covariance matrix corresponding to the state variable for Z6 does not exceed the threshold (i.e., the second uncertainty satisfies the preset convergence condition), the road surface element reconstruction result may be generated based on the state variable of the first road surface target corresponding to Z6.
If the covariance matrix corresponding to the state variable for Z6 exceeds the threshold (i.e., the second uncertainty does not satisfy the preset convergence condition), the environment image set Q7 corresponding to T7, the moment following T6, may be obtained, along with the second feature map Z7 corresponding to Q7; the state variable of the first road surface target corresponding to Z7, and the covariance matrix corresponding to it, may then be determined based on the state variable corresponding to Z6, its covariance matrix, and the observation information and observation noise of the first road surface target in Z7.
If the covariance matrix corresponding to the state variable for Z7 does not exceed the threshold, the road surface element reconstruction result may be generated based on the state variable of the first road surface target corresponding to Z7.
If the covariance matrix corresponding to the state variable for Z7 exceeds the threshold, the environment image set Q8 corresponding to T8, the moment following T7, may be obtained; the subsequent process is similar to the processing after Q7 is obtained and is not repeated here.
In the embodiment of the disclosure, by introducing an Extended Kalman Filter (EKF) algorithm, the state variables of the tracked road surface target can be updated and optimized in real time until convergence, so that a high-precision road surface element reconstruction result can be obtained. In addition, a rectangular frame model (i.e., each detection frame is a rectangular frame) can be selected during filtering, so that rapid convergence of the state variables can be realized.
Based on the embodiment shown in fig. 2, as shown in fig. 5, the method further includes step 242 before step 250.
Step 242, obtaining a second target feature map sequence corresponding to a plurality of historical environment image sets, where the plurality of historical environment image sets are the environment image sets acquired at a plurality of historical moments by the plurality of image acquisition devices arranged at different orientations of the movable device, and the second target feature map sequence includes: multiple frames of historical second feature maps corresponding one-to-one to the plurality of historical environment image sets.
It should be noted that, the specific implementation process of step 242 only needs to refer to the description of step 252, and is not described herein again.
Step 250 may include step 2501, step 2503, step 2505, and step 2507.
Step 2501, determining a tracked second road surface target based on a tracking result corresponding to the second target feature map sequence.
It should be noted that, the specific implementation process of step 2501 refers to the description of step 2601, and is not described herein again.
Step 2503, performing target detection on the current second feature map and determining a third road surface target in the current second feature map.
In step 2503, a target detection algorithm may be used to perform target detection on the current second feature map to determine each road surface target in it; each such target may be taken as a third road surface target. Since the subsequent processing is similar for every third road surface target, the following description focuses on the processing of a single third road surface target.
Step 2505, in response to the third road surface target having the same category as the second road surface target, calculating a cost value between the third road surface target and the second road surface target in a calculation manner corresponding to that category.
In a specific embodiment, calculating a cost value between the third road surface target and the second road surface target in a calculation manner corresponding to the category of the third road surface target includes:
in response to the category of the third road surface target being a road surface arrow category, determining a first predicted detection frame of the second road surface target in the current second feature map based on the tracking result corresponding to the second target feature map sequence;
calculating the detection-frame intersection over union (IoU) and the first detection-frame Euclidean distance between the first predicted detection frame and a first actual detection frame of the third road surface target in the current second feature map;
calculating a first ratio of the detection-frame IoU to a preset IoU;
calculating a second ratio of the first detection-frame Euclidean distance to a preset Euclidean distance;
and determining the cost value between the third road surface target and the second road surface target based on the weighting result of the first ratio and the second ratio.
In the case that the category of the third road surface target is the road surface arrow category, the actual detection frame of the second road surface target in each frame of historical second feature map in the second target feature map sequence may be determined from the corresponding tracking result, and the detection frame of the second road surface target in the current second feature map may be predicted from those actual detection frames, thereby determining the first predicted detection frame.
Then, the detection-frame IoU and the detection-frame Euclidean distance may be calculated between the first actual detection frame, obtained by performing target detection on the third road surface target in the current second feature map, and the first predicted detection frame. The detection-frame IoU is the ratio of the intersection to the union of the first actual detection frame and the first predicted detection frame; the first detection-frame Euclidean distance may be the Euclidean distance between the center point of the first actual detection frame and that of the first predicted detection frame.
Then, the detection-frame IoU may be divided by the preset IoU to obtain the first ratio, and the first detection-frame Euclidean distance by the preset Euclidean distance to obtain the second ratio.
Then, the first ratio and the second ratio may be weighted, and the cost value between the third road surface target and the second road surface target determined based on the weighting result. Optionally, the weights of the first ratio and the second ratio may be set according to the actual situation, as long as they sum to 1; after the weighting result is obtained, it may be used directly as the cost value, or mapped to a specified numerical interval (for example, 0 to 1, or 0 to 5) with the mapped value used as the cost value.
It should be noted that the number of the third road surface targets may be multiple, and the number of the second road surface targets may also be multiple, so that when calculating the cost value for the ith third road surface target and the jth second road surface target, the following formula may be used:
I_ij = k_1 · d_ij / d_max + (1 - k_1) · rect_iou_ij / rect_iou_min
where I_ij represents the cost value between the i-th third road surface target and the j-th second road surface target, k_1 represents a preset weight, d_ij represents the first detection frame Euclidean distance calculated for the i-th third road surface target and the j-th second road surface target, d_max represents the preset Euclidean distance, rect_iou_ij represents the detection frame intersection-over-union calculated for the same pair of targets, and rect_iou_min represents the preset intersection-over-union.
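For illustration, a minimal Python sketch of this cost computation follows; the preset values k1, d_max and rect_iou_min are illustrative placeholders, not values from the patent, and the boxes are assumed to be axis-aligned (x1, y1, x2, y2) rectangles in the preset coordinate system.

    def arrow_cost(pred_box, det_box, k1=0.5, d_max=5.0, rect_iou_min=0.3):
        """Cost value between a tracked arrow target and a new detection.

        pred_box, det_box: axis-aligned boxes (x1, y1, x2, y2).
        k1, d_max (preset Euclidean distance) and rect_iou_min (preset
        intersection-over-union) are illustrative preset values.
        """
        # Detection frame intersection-over-union.
        ix1 = max(pred_box[0], det_box[0]); iy1 = max(pred_box[1], det_box[1])
        ix2 = min(pred_box[2], det_box[2]); iy2 = min(pred_box[3], det_box[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
        union = area(pred_box) + area(det_box) - inter
        rect_iou = inter / union if union > 0 else 0.0

        # First detection frame Euclidean distance (between box centers).
        dx = (pred_box[0] + pred_box[2] - det_box[0] - det_box[2]) / 2.0
        dy = (pred_box[1] + pred_box[3] - det_box[1] - det_box[3]) / 2.0
        d = (dx * dx + dy * dy) ** 0.5

        # I_ij = k1 * d_ij / d_max + (1 - k1) * rect_iou_ij / rect_iou_min
        return k1 * d / d_max + (1 - k1) * rect_iou / rect_iou_min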
In this embodiment, for road surface targets of the road surface arrow category, the corresponding cost values can be calculated efficiently and reliably by computing the detection frame intersection-over-union and the detection frame Euclidean distance and then applying the weighting processing.
In another specific embodiment, calculating a cost value between the third road surface target and the second road surface target in a calculation manner corresponding to the category of the third road surface target includes:
in response to the category of the third road surface target being the zebra crossing category or the stop line category, determining a second prediction detection frame of the second road surface target in the current second feature map based on the tracking result corresponding to the second target feature map sequence;
projecting a second actual detection frame of the third road surface target in the current second feature map onto the long side of the second prediction detection frame to obtain a projection line segment;
calculating a line segment intersection-over-union between the projection line segment and the line segment of the long side;
calculating a second detection frame Euclidean distance between the second prediction detection frame and the second actual detection frame;
calculating a third ratio of the line segment intersection-over-union to a preset intersection-over-union;
calculating a fourth ratio of the second detection frame Euclidean distance to a preset Euclidean distance;
and determining a cost value between the third road surface target and the second road surface target based on a weighting result of the third ratio and the fourth ratio.
When the category of the third road surface target is the zebra crossing category or the stop line category, the actual detection frame of the second road surface target in each historical second feature map in the second target feature map sequence can be determined according to the tracking result corresponding to the second target feature map sequence, and the detection frame of the second road surface target in the current second feature map can be predicted from these actual detection frames, thereby determining the second prediction detection frame.
Then, target detection can be performed on the third road surface target in the current second feature map to obtain the second actual detection frame; the second actual detection frame is projected onto the long side of the second prediction detection frame to determine the projection line segment; the line segment intersection-over-union between the projection line segment and the long side is calculated; and the Euclidean distance between the second prediction detection frame and the second actual detection frame is calculated. The line segment intersection-over-union characterizes the ratio of the intersection to the union of the projection line segment and the long side; the second detection frame Euclidean distance may be the Euclidean distance between the center point of the second actual detection frame and the center point of the second prediction detection frame.
Then, the line segment intersection-over-union can be divided by the preset intersection-over-union to obtain the third ratio, and the second detection frame Euclidean distance can be divided by the preset Euclidean distance to obtain the fourth ratio.
Then, the third ratio and the fourth ratio may be weighted, and the cost value between the third road surface target and the second road surface target may be determined based on the weighting result. Optionally, when weighting the third ratio and the fourth ratio, the weights corresponding to the two ratios may be set according to the actual situation, as long as they sum to 1. After the weighting result is obtained, it may be used directly as the cost value, or it may be mapped to a specified numerical interval (for example, 0 to 1, or 0 to 5), with the resulting mapped value used as the cost value.
It should be noted that, the number of the third road surface targets may be multiple, and the number of the second road surface targets may also be multiple, so that when the cost value is calculated for the ith third road surface target and the jth second road surface target, the following formula may be used:
I_ij = k_1 · d_ij / d_max + (1 - k_1) · line_iou_ij / line_iou_min
where I_ij represents the cost value between the i-th third road surface target and the j-th second road surface target, k_1 represents a preset weight, d_ij represents the second detection frame Euclidean distance calculated for the i-th third road surface target and the j-th second road surface target, d_max represents the preset Euclidean distance, line_iou_ij represents the line segment intersection-over-union calculated for the same pair of targets, and line_iou_min represents the preset intersection-over-union.
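A corresponding sketch for this category, assuming each box is given by its four corner points in order; the names and preset values are again illustrative.

    import numpy as np

    def line_cost(pred_corners, det_corners, k1=0.5, d_max=5.0, line_iou_min=0.3):
        """Cost value for zebra-crossing / stop-line targets.

        pred_corners, det_corners: (4, 2) arrays of box corners in order.
        k1, d_max and line_iou_min are illustrative preset values.
        """
        # Use the long side of the prediction box as the projection axis.
        e0 = pred_corners[1] - pred_corners[0]
        e1 = pred_corners[2] - pred_corners[1]
        axis = e0 if np.linalg.norm(e0) >= np.linalg.norm(e1) else e1
        axis = axis / np.linalg.norm(axis)

        # Project both boxes onto the axis and intersect the 1-D intervals.
        pred_t = pred_corners @ axis
        det_t = det_corners @ axis
        inter = max(0.0, min(pred_t.max(), det_t.max()) - max(pred_t.min(), det_t.min()))
        union = max(pred_t.max(), det_t.max()) - min(pred_t.min(), det_t.min())
        line_iou = inter / union if union > 0 else 0.0

        # Second detection frame Euclidean distance (between box centers).
        d = np.linalg.norm(pred_corners.mean(axis=0) - det_corners.mean(axis=0))

        # I_ij = k1 * d_ij / d_max + (1 - k1) * line_iou_ij / line_iou_min
        return k1 * d / d_max + (1 - k1) * line_iou / line_iou_min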
In this embodiment, for road surface targets of the zebra crossing category or the stop line category, the corresponding cost values can be calculated efficiently and reliably by computing the line segment intersection-over-union and the detection frame Euclidean distance and then applying the weighting processing.
Step 2507, determining a tracking result corresponding to a third target feature map sequence based on the tracking result corresponding to the second target feature map sequence and the cost value, where the third target feature map sequence includes: a second feature map of a multi-frame history in a second target feature map sequence, and a current second feature map.
In step 2507, with the tracking result corresponding to the second target feature map sequence and the cost values between the third road surface targets and the second road surface targets known, a Hungarian matching algorithm may be adopted to perform many-to-many matching between the road surface targets.
Assume that five second road surface targets are determined based on the tracking result corresponding to the second target feature map sequence, namely road surface targets 1, 2, 3, 4 and 5, where road surface targets 1, 2 and 3 are of the road surface arrow category; and that performing target detection on the current second feature map yields two third road surface targets, namely road surface targets 6 and 7, where road surface target 6 is of the road surface arrow category. The cost values between road surface target 6 and road surface targets 1, 2 and 3 can then be calculated respectively.
Assuming that the cost value between road surface target 6 and road surface target 1 is cost value 1, the cost value between road surface target 6 and road surface target 2 is cost value 2, and the cost value between road surface target 6 and road surface target 3 is cost value 3, any cost value exceeding a preset cost value may be filtered out from cost values 1, 2 and 3.
Assuming that cost value 1 and cost value 2 remain after the filtering and that cost value 1 is smaller than cost value 2, road surface target 6 and road surface target 1 can be considered to belong to the same road surface target. Assuming instead that cost values 1, 2 and 3 are all filtered out, road surface target 6 can be considered a road surface target different from road surface targets 1, 2 and 3, that is, a newly appeared road surface target.
In the above manner, the relations between road surface target 6 and road surface targets 1, 2, 3, 4 and 5 are determined; in a similar manner, the relations between road surface target 7 and road surface targets 1, 2, 3, 4 and 5 are determined, so that the tracking result corresponding to the third target feature map sequence can be obtained.
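The association itself can be sketched with a standard Hungarian-algorithm implementation; SciPy's linear_sum_assignment is used here as one such implementation, and the gating threshold max_cost stands in for the preset cost value.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def associate(cost, max_cost=1.0):
        """Associate new detections (rows) with tracked targets (columns).

        cost: (num_detections, num_tracks) matrix of cost values.
        Returns matched (detection, track) index pairs plus the indices of
        unmatched detections, which would become newly appeared targets.
        """
        det_idx, trk_idx = linear_sum_assignment(cost)
        matches, unmatched = [], set(range(cost.shape[0]))
        for d, t in zip(det_idx, trk_idx):
            if cost[d, t] <= max_cost:   # filter out over-threshold pairs
                matches.append((d, t))
                unmatched.discard(d)
        return matches, sorted(unmatched)

In the example above, the cost matrix for road surface target 6 against road surface targets 1, 2 and 3 would be a 1 x 3 row; an assignment surviving the threshold yields the matched target, and otherwise road surface target 6 starts a new track.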
In the embodiment of the present disclosure, the cost value can be calculated in a manner matched to each category of road surface element, which better ensures the accuracy and reliability of the calculation results and hence of the tracking result obtained by multi-target tracking, thereby facilitating a high-precision road surface element reconstruction result.
On the basis of the embodiment shown in fig. 2, as shown in fig. 6, the method further includes step 270 and step 280.
Step 270, compressing the road surface element reconstruction result to obtain a compression result; wherein the road surface element reconstruction result includes: the optimized three-dimensional coordinates of the four vertexes of the detection frame of each road surface target.
It should be noted that, by using the extended Kalman filter algorithm, the state variables of each road surface target can be continuously optimized and updated; based on the state variables at the time the road surface target converges, the three-dimensional coordinates of the four vertexes of the detection frame of the road surface target can be derived in reverse, and these coordinates can be taken as the optimized three-dimensional coordinates, thereby obtaining the road surface element reconstruction result.
Then, the road surface element reconstruction result can be compressed according to the Protobuf protocol to obtain the compression result. Protobuf (Protocol Buffers) is a language-neutral, platform-neutral and extensible mechanism for serializing structured data, and can be used for communication and data storage.
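The patent transmits the result via Protobuf; the sketch below uses Python's struct module merely as a stand-in to illustrate packing each target's four optimized vertices into a compact, fixed-layout buffer.

    import struct

    def pack_targets(targets):
        """Pack reconstruction results into a compact binary buffer.

        targets: list of (track_id, vertices), where vertices holds four
        (x, y, z) tuples -- the optimized vertex coordinates of one target.
        """
        buf = bytearray(struct.pack("<I", len(targets)))   # target count
        for track_id, vertices in targets:
            buf += struct.pack("<I", track_id)
            for x, y, z in vertices:                       # four 3-D vertices
                buf += struct.pack("<fff", x, y, z)
        return bytes(buf)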
Step 280, transmitting the compression result to the upper computer, where the compression result is used by the upper computer to update the environment model.
Alternatively, the environmental model may be a three-dimensional model.
In step 280, the compression result may be transmitted to the upper computer according to the Protobuf protocol, and the upper computer may update the environment model of the environment where the mobile device is located according to the obtained compression result, for example, the upper computer may add each road surface target at a suitable position in the environment model.
In the embodiment of the present disclosure, the road surface element reconstruction result can be compressed and transmitted so that it occupies less space, and its use in updating the environment model helps ensure the accuracy and reliability of the environment model.
In an alternative example, as shown in fig. 7, the road surface element reconstruction process may include the following steps: BEV neural network output, perception preprocessing, multi-target tracking, filtering optimization and vectorization output.
The BEV neural network output step can be implemented by a BEV neural network output module. The BEV neural network output module can run the neural network on the multiple camera images (corresponding to the environment image set above) and output a BEV spatial segmentation result (corresponding to the first feature map above).
The perception preprocessing step can be implemented by a perception preprocessing module. The perception preprocessing module can extract the pixel areas of various road surface targets (including zebra crossings, stop lines, road surface arrows and the like) in the BEV observation, and find the corresponding area of each road surface target in the vehicle body coordinate system through coordinate transformation, thereby obtaining the outermost rectangular frame corresponding to each road surface target (equivalent to mapping the pixel areas of the road surface targets in the first feature map to the preset coordinate system corresponding to the movable equipment to obtain the second feature map).
The multi-target tracking step can be implemented by a multi-target tracking module. The multi-target tracking module can calculate the cost value between a new observation (equivalent to the third road surface target above) and an existing road surface target (equivalent to the second road surface target above), and associate the new observation with the existing road surface target through the Hungarian matching algorithm to assign a unique tracking id.
The filtering optimization step can be implemented by a filtering optimization module. For road surface targets that are successfully tracked, the filtering optimization module can optimize the existing reconstruction result using the new observation. Considering real-time requirements, an extended Kalman filter algorithm is adopted to optimize the road surface targets, and the state variable of each road surface target is chosen as X = (x, y, z, l, w, θ)^T, where (x, y, z)^T is the three-dimensional coordinate of the center point of the rectangular frame, and l, w and θ are the length, width and yaw angle of the rectangular frame, respectively. For each new observation, the four vertex coordinates y_1, y_2, y_3 and y_4 of the rectangular frame are used to update the state variable and the corresponding covariance matrix; when the covariance matrix is smaller than a certain threshold, the road surface target is considered to have converged and updating stops.
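As an illustration of how the four vertex coordinates relate to the state variable X = (x, y, z, l, w, θ)^T, the sketch below reconstructs the rectangle's vertices from a state; the corner ordering and the assumption that all four vertices share the height z are illustrative choices.

    import numpy as np

    def box_vertices(state):
        """Recover the four detection-frame vertices from a filter state.

        state: (x, y, z, l, w, theta) -- rectangle center, length, width
        and yaw angle, matching the state variable X above.
        Returns a (4, 3) array of vertex coordinates.
        """
        x, y, z, l, w, theta = state
        c, s = np.cos(theta), np.sin(theta)
        rot = np.array([[c, -s], [s, c]])                 # yaw rotation
        half = np.array([[ l / 2,  w / 2],
                         [ l / 2, -w / 2],
                         [-l / 2, -w / 2],
                         [-l / 2,  w / 2]])
        xy = half @ rot.T + np.array([x, y])
        # Road surface targets are planar, so all vertices share the height z.
        return np.column_stack([xy, np.full(4, z)])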
The vectorization output step can be implemented by a vectorization output module. For the road surface elements, the vectorization output module outputs only the four vertex coordinates of the rectangular frame, and supports compressed transmission via the Protobuf protocol (equivalent to compressing the road surface element reconstruction result to obtain the compression result and transmitting the compression result to the upper computer according to the Protobuf protocol, as described above).
In summary, unlike conventional observation with a single forward-looking camera, the road surface element reconstruction method in the embodiments of the present disclosure is based on BEV observation, which enables more stable tracking and reconstruction of static road surface elements without being limited to the forward-looking camera's field of view; for different road surface elements, the cost value is calculated in a matched manner during tracking association, yielding stable and accurate tracking results; and the filter model adopts a rectangular frame model, enabling rapid convergence of the state variables.
Any of the road surface element reconstruction methods provided by the embodiments of the present disclosure may be performed by any suitable device having data processing capabilities, including but not limited to terminal devices, servers and the like. Alternatively, any of the road surface element reconstruction methods provided by the embodiments of the present disclosure may be executed by a processor; for example, the processor may execute any of the road surface element reconstruction methods mentioned in the embodiments of the present disclosure by calling corresponding instructions stored in a memory. Details are not repeated below.
Exemplary devices
Fig. 8 is a schematic structural diagram of a road surface element reconstruction device according to an exemplary embodiment of the present disclosure. The apparatus shown in FIG. 8 includes a first acquisition module 810, a first generation module 820, a determination module 830, a second acquisition module 840, a multi-target tracking module 850, and a second generation module 860.
A first obtaining module 810, configured to obtain an environment image set, the environment image set including: multiple frames of images of the environment around the movable equipment, with different viewing angles, collected at the current moment by a plurality of image collection devices arranged at different orientations of the movable equipment, where each frame of image in the environment image set includes at least one road surface target;
a first generating module 820, configured to generate a first feature map containing semantic segmentation information in the bird's eye space via a neural network based on the environment image set acquired by the first acquiring module 810;
a determining module 830, configured to determine, based on semantic segmentation information included in the first feature map generated by the first generating module 820, respective pixel regions of the road surface objects in the first feature map;
a second obtaining module 840, configured to map the pixel area of each road surface target in the first feature map determined by the determining module 830 to a preset coordinate system corresponding to the mobile device, so as to obtain a second feature map;
the multi-target tracking module 850 is configured to perform multi-target tracking on the second feature map obtained by the second obtaining module 840 to obtain a tracking result;
and the second generating module 860 is configured to generate a road surface element reconstruction result based on the tracking result obtained by the multi-target tracking module 850.
In an alternative example, as shown in fig. 9, the apparatus further comprises:
a third obtaining module 852, configured to obtain, before generating a road surface element reconstruction result based on the tracking result obtained by the multi-target tracking module 850, a first target feature map sequence corresponding to a plurality of historical environment image sets, where the plurality of historical environment image sets represent a plurality of environment image sets acquired by a plurality of image acquisition devices disposed in different orientations of the mobile device at a plurality of historical times, and the first target feature map sequence includes: a second feature map of multi-frame history corresponding to a plurality of historical environmental image sets;
a second generation module 860 comprising:
the first determining submodule 8601 is used for determining a tracked first road surface target based on a tracking result corresponding to the first target feature map sequence obtained by the multi-target tracking module 850;
a second determining submodule 8603, configured to take, as a reference state variable, a first state variable of the first road surface target determined by the first determining submodule 8601 corresponding to the second feature map of the last frame history in the first target feature map sequence, and take a first uncertainty corresponding to the first state variable as a reference uncertainty;
a third determining submodule 8605, configured to determine, based on the reference state variable and the reference uncertainty determined by the second determining submodule 8603 and the first observation information and observation noise of the first road surface target in the current second feature map, a second state variable of the first road surface target corresponding to the current second feature map and a second uncertainty of the second state variable;
a generating submodule 8607, configured to generate a road element reconstruction result based on the second state variable determined by the third determining submodule 8605 in response to that the second uncertainty determined by the third determining submodule 8605 satisfies a preset convergence condition;
an updating submodule 8609, configured to update the reference state variable to the second state variable and update the reference uncertainty to the second uncertainty in response to the second uncertainty determined by the third determining submodule 8605 not meeting the preset convergence condition.
In one optional example, the third determination submodule 8605 includes:
the conversion unit is used for converting the first observation information into a world coordinate system to obtain second observation information;
the first determining unit is used for determining a target Jacobian matrix of the second observation information obtained by the converting unit relative to the state variable;
a second determining unit configured to determine a kalman gain based on the reference uncertainty determined by the second determining submodule 8603, the target jacobian matrix determined by the first determining unit, and the observation noise;
a third determining unit, configured to determine a second state variable based on the reference state variable determined by the second determining submodule 8603, the kalman gain determined by the second determining unit, the second observation information obtained by the converting unit, and the target jacobian matrix determined by the first determining unit;
a fourth determining unit configured to determine the second uncertainty based on the identity matrix, the kalman gain determined by the second determining unit, the target jacobian matrix determined by the first determining unit, and the reference uncertainty determined by the second determining submodule 8603.
In one optional example, the first determining unit includes:
the first determining subunit is configured to determine, based on the second observation information obtained by the converting unit, respective observation three-dimensional coordinates of an upper left vertex, an upper right vertex, a lower left vertex, and a lower right vertex of the detection frame of the first road target, so as to obtain four observation three-dimensional coordinates;
the second determining subunit is used for determining the length, the width and the yaw angle of the detection frame of the first road surface target based on the four observed three-dimensional coordinates obtained by the first determining subunit;
the third determining subunit is used for determining a first Jacobian matrix corresponding to the upper left vertex, a second Jacobian matrix corresponding to the upper right vertex, a third Jacobian matrix corresponding to the lower left vertex and a fourth Jacobian matrix corresponding to the lower right vertex on the basis of the length, the width and the yaw angle determined by the second determining subunit and a rotation matrix between a preset coordinate system and a world coordinate system;
and a fourth determining subunit, configured to determine the target jacobian matrix based on the first jacobian matrix, the second jacobian matrix, the third jacobian matrix, and the fourth jacobian matrix determined by the third determining subunit.
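The patent builds the target Jacobian by stacking four analytically derived per-vertex Jacobians. As a sketch (and a way to sanity-check an analytic implementation), the target Jacobian of any measurement function h, for example h(X) = box_vertices(X).ravel() giving a 12 x 6 matrix for a 6-dimensional state, can be approximated by finite differences:

    import numpy as np

    def numeric_jacobian(h, x, eps=1e-6):
        """Finite-difference Jacobian of measurement function h at state x.

        h: maps a state vector to a flat measurement vector (e.g. the four
        stacked 3-D vertex coordinates).  x: state as a float array.
        """
        y0 = h(x)
        J = np.zeros((y0.size, x.size))
        for i in range(x.size):
            dx = np.zeros_like(x)
            dx[i] = eps
            J[:, i] = (h(x + dx) - y0) / eps   # column i: sensitivity to x[i]
        return J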
In an alternative example, any uncertainty is characterized by a covariance matrix, and the Kalman gain is determined using the formula:
K = P_{k-1} H^T (H P_{k-1} H^T + R_k)^{-1}
where K represents the Kalman gain, P_{k-1} represents the reference uncertainty, H represents the target Jacobian matrix, and R_k represents the observation noise.
In an alternative example, the second state variable is determined using the formula:
x_k = x_{k-1} + K (y_k - H x_{k-1})
where x_k represents the second state variable, x_{k-1} represents the reference state variable, K represents the Kalman gain, y_k represents the second observation information, and H represents the target Jacobian matrix.
In an alternative example, any uncertainty is characterized by a covariance matrix, and the second uncertainty is determined using the formula:
P_k = (I - K H) P_{k-1}
where P_k represents the second uncertainty, I represents an identity matrix, K represents the Kalman gain, H represents the target Jacobian matrix, and P_{k-1} represents the reference uncertainty.
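Taken together, the three formulas above are one extended-Kalman-filter measurement update. A minimal sketch, assuming the observation y has already been converted to the world coordinate system, and noting that the innovation is formed exactly as written in the formulas (a full EKF would typically use the nonlinear measurement function h(x) there):

    import numpy as np

    def ekf_update(x, P, y, H, R):
        """One measurement update following the formulas above.

        x: reference state variable        P: reference uncertainty
        y: second observation information  H: target Jacobian matrix
        R: observation noise covariance
        """
        S = H @ P @ H.T + R                      # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)           # K = P H^T (H P H^T + R)^-1
        x_new = x + K @ (y - H @ x)              # x_k = x + K (y - H x)
        P_new = (np.eye(len(x)) - K @ H) @ P     # P_k = (I - K H) P
        return x_new, P_new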
In an alternative example, as shown in fig. 10, the apparatus further comprises:
a fourth obtaining module 842, configured to obtain a second target feature map sequence corresponding to a plurality of historical environmental image sets before performing multi-target tracking on the second feature map obtained by the second obtaining module 840 to obtain a tracking result, where the plurality of historical environmental image sets represent a plurality of environmental image sets acquired by a plurality of image acquisition devices disposed in different orientations of the mobile device at a plurality of historical times, and the second target feature map sequence includes: a second feature map of multi-frame history corresponding to a plurality of historical environmental image sets;
a multi-target tracking module 850, comprising:
a fourth determining submodule 8501, configured to determine a tracked second road surface target based on a tracking result corresponding to the second target feature map sequence acquired by the fourth acquiring module 842;
a fifth determining submodule 8503, configured to perform target detection on the current second feature map, and determine a third road surface target in the current second feature map;
a calculating submodule 8505, configured to, in response to that the category of the third road surface target determined by the fifth determining submodule 8503 is the same as the category of the second road surface target determined by the fourth determining submodule 8501, calculate a cost value between the third road surface target and the second road surface target in a calculation manner corresponding to the category of the third road surface target;
a sixth determining submodule 8507, configured to determine, based on the tracking result corresponding to the second target feature map sequence acquired by the fourth acquiring module 842 and the cost value calculated by the calculating submodule 8505, a tracking result corresponding to a third target feature map sequence, where the third target feature map sequence includes: a second feature map of a multi-frame history in a second target feature map sequence, and a current second feature map.
In one optional example, the computation submodule 8505 includes:
a fifth determining unit, configured to determine, in response to the category of the third road surface target determined by the fifth determining sub-module 8503 being a road surface arrow category, a first prediction detection frame of the second road surface target in the current second feature map based on the tracking result corresponding to the second target feature map sequence acquired by the fourth acquiring module 842;
a first calculating unit, configured to calculate a detection frame intersection-over-union and a first detection frame Euclidean distance between the first prediction detection frame determined by the fifth determining unit and a first actual detection frame of the third road surface target in the current second feature map;
a second calculating unit, configured to calculate a first ratio of the detection frame intersection-over-union calculated by the first calculating unit to a preset intersection-over-union;
a third calculating unit, configured to calculate a second ratio of the first detection frame Euclidean distance calculated by the first calculating unit to a preset Euclidean distance;
and a sixth determining unit, configured to determine a cost value between the third road surface target and the second road surface target based on a weighting result of the first ratio calculated by the second calculating unit and the second ratio calculated by the third calculating unit.
In one optional example, the computation submodule 8505 includes:
a seventh determining unit, configured to determine, in response to the class of the third road surface target determined by the fifth determining sub-module 8503 being a zebra crossing class or a stop line class, a second prediction detection frame of the second road surface target in the current second feature map based on the tracking result corresponding to the second target feature map sequence acquired by the fourth acquiring module 842;
a projection unit, configured to project a second actual detection frame of the third road surface target in the current second feature map onto the long side of the second prediction detection frame determined by the seventh determining unit, to obtain a projection line segment;
a third calculating unit, configured to calculate a line segment intersection-over-union between the projection line segment obtained by the projection unit and the line segment of the long side;
a fourth calculating unit, configured to calculate a second detection frame Euclidean distance between the second prediction detection frame determined by the seventh determining unit and the second actual detection frame;
a fifth calculating unit, configured to calculate a third ratio of the line segment intersection-over-union calculated by the third calculating unit to a preset intersection-over-union;
a sixth calculating unit, configured to calculate a fourth ratio of the second detection frame Euclidean distance calculated by the fourth calculating unit to a preset Euclidean distance;
and an eighth determining unit, configured to determine a cost value between the third road surface target and the second road surface target based on a weighting result of the third ratio calculated by the fifth calculating unit and the fourth ratio calculated by the sixth calculating unit.
In an alternative example, as shown in fig. 11, the apparatus further comprises:
the compression module 870 is used for compressing the road surface element reconstruction result to obtain a compression result; wherein the road surface element reconstruction result includes: the optimized three-dimensional coordinates of the four vertexes of the detection frame of each road surface target;
and the transmission module 880 is used for transmitting the compression result obtained by the compression module 870 to the upper computer, and the compression result is used for the upper computer to update the environment model.
Exemplary electronic device
Next, an electronic apparatus according to an embodiment of the present disclosure is described with reference to fig. 12. The electronic device may be either or both of the first device and the second device, or a stand-alone device separate from them, which stand-alone device may communicate with the first device and the second device to receive the acquired input signals therefrom.
Fig. 12 illustrates a block diagram of an electronic device in accordance with an embodiment of the disclosure.
As shown in fig. 12, the electronic device 1200 includes one or more processors 1210 and memory 1220.
Processor 1210 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in electronic device 1200 to perform desired functions.
Memory 1220 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by processor 1210 to implement the pavement element reconstruction methods of the various embodiments of the present disclosure described above and/or other desired functions.
In one example, the electronic device 1200 may further include: an input device 1230 and an output device 1240, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
For example, when the electronic device is a first device or a second device, the input means 1230 may be a microphone or a microphone array. When the electronic device is a stand-alone device, the input means 1230 may be a communication network connector for receiving the collected input signals from the first device and the second device.
The input device 1230 may also include, for example, a keyboard, a mouse, and the like. The output device 1240 may output various information to the outside. The output 1240 may include, for example, a display, speakers, a printer, and a communication network and its connected remote output devices, among others.
Of course, for simplicity, only some of the components of the electronic device 1200 relevant to the present disclosure are shown in fig. 12, omitting components such as buses, input/output interfaces, and the like. In addition, electronic device 1200 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer-readable storage Medium
In addition to the above-described methods and apparatus, embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the road surface element reconstruction method according to various embodiments of the present disclosure described in the "exemplary methods" section of this specification above.
The computer program product may write program code for carrying out operations of embodiments of the present disclosure in any combination of one or more programming languages, including object-oriented programming languages such as Java, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform the steps in the road surface element reconstruction method according to various embodiments of the present disclosure described in the "exemplary method" section above in this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present disclosure have been described above in connection with specific embodiments, but it should be noted that advantages, effects, and the like, mentioned in the present disclosure are only examples and not limitations, and should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the disclosure is not intended to be limited to the specific details so described.
In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts in the embodiments are referred to each other. For the system embodiment, since it basically corresponds to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The block diagrams of devices, apparatuses and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that connections, arrangements or configurations must be made in the manner shown in the block diagrams. These devices, apparatuses and systems may be connected, arranged or configured in any manner, as will be appreciated by one skilled in the art. Words such as "including", "comprising" and "having" are open-ended words that mean "including, but not limited to" and are used interchangeably therewith. The word "or" as used herein means, and is used interchangeably with, "and/or", unless the context clearly dictates otherwise. The phrase "such as" as used herein means, and is used interchangeably with, "such as but not limited to".
The methods and apparatus of the present disclosure may be implemented in a number of ways. For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
It is also noted that in the devices, apparatuses, and methods of the present disclosure, each component or step can be decomposed and/or recombined. Such decomposition and/or recombination should be considered as equivalents of the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (14)

1. A pavement element reconstruction method comprising:
acquiring an environment image set, the environment image set including: multiple frames of images of the environment around the movable equipment, with different viewing angles, collected at the current moment by a plurality of image collection devices arranged at different orientations of the movable equipment, wherein each frame of image in the environment image set includes at least one road surface target;
generating a first feature map containing semantic segmentation information in a bird's-eye view space through a neural network based on the environment image set;
determining the pixel area of each road surface target in the first feature map based on the semantic segmentation information;
mapping the pixel area of each road surface target in the first characteristic diagram to a preset coordinate system corresponding to the movable equipment to obtain a second characteristic diagram;
performing multi-target tracking on the second characteristic diagram to obtain a tracking result;
and generating a road surface element reconstruction result based on the tracking result.
2. The method of claim 1, wherein,
before the generating road surface element reconstruction results based on the tracking results, the method further comprises:
acquiring a first target feature map sequence corresponding to a plurality of historical environment image sets, wherein the plurality of historical environment image sets represent a plurality of environment image sets acquired by a plurality of image acquisition devices arranged at different orientations of the movable equipment at a plurality of historical moments, and the first target feature map sequence comprises: the second feature map of a plurality of frames of history corresponding to the plurality of sets of historical environmental images;
generating a road surface element reconstruction result based on the tracking result, comprising:
determining a tracked first road surface target based on a tracking result corresponding to the first target feature map sequence;
taking a first state variable of the first road surface target corresponding to the second feature map of the last frame history in the first target feature map sequence as a reference state variable, and taking a first uncertainty corresponding to the first state variable as a reference uncertainty;
determining a second state variable of the first road surface target corresponding to the current second feature map and a second uncertainty of the second state variable based on the reference state variable, the reference uncertainty, and first observation information and observation noise of the first road surface target in the current second feature map;
generating a road surface element reconstruction result based on the second state variable in response to the second uncertainty satisfying a preset convergence condition;
in response to the second uncertainty not satisfying a preset convergence condition, updating the reference state variable to the second state variable and updating the reference uncertainty to the second uncertainty.
3. The method of claim 2, wherein said determining, based on the reference state variable, the reference uncertainty, and first observation information and observation noise of the first road surface target in the current second feature map, a second state variable of the first road surface target corresponding to the current second feature map and a second uncertainty of the second state variable comprises:
converting the first observation information into a world coordinate system to obtain second observation information;
determining a target Jacobian matrix of the second observation information relative to a state variable;
determining a Kalman gain based on the reference uncertainty, the target Jacobian matrix, and the observation noise;
determining the second state variable based on the reference state variable, the Kalman gain, the second observation information and the target Jacobian matrix;
determining the second uncertainty based on an identity matrix, the Kalman gain, the target Jacobian matrix, and the reference uncertainty.
4. The method of claim 3, wherein the determining a target Jacobian matrix of the second observation information relative to state variables comprises:
determining respective observation three-dimensional coordinates of an upper left vertex, an upper right vertex, a lower left vertex and a lower right vertex of a detection frame of the first road surface target based on the second observation information to obtain four observation three-dimensional coordinates;
determining the length, the width and the yaw angle of a detection frame of the first road surface target based on the four observation three-dimensional coordinates;
determining a first Jacobian matrix corresponding to the upper left vertex, a second Jacobian matrix corresponding to the upper right vertex, a third Jacobian matrix corresponding to the lower left vertex and a fourth Jacobian matrix corresponding to the lower right vertex based on the length, the width, the yaw angle and a rotation matrix between the preset coordinate system and the world coordinate system;
determining the target Jacobian matrix based on the first Jacobian matrix, the second Jacobian matrix, the third Jacobian matrix, and the fourth Jacobian matrix.
5. The method of claim 3, wherein any uncertainty is characterized using a covariance matrix, and the Kalman gain is determined using the formula:
K = P_{k-1} H^T (H P_{k-1} H^T + R_k)^{-1}
wherein K represents the Kalman gain, P_{k-1} represents the reference uncertainty, H represents the target Jacobian matrix, and R_k represents the observation noise.
6. The method of claim 3, wherein the second state variable is determined using the formula:
x_k = x_{k-1} + K (y_k - H x_{k-1})
wherein x_k represents the second state variable, x_{k-1} represents the reference state variable, K represents the Kalman gain, y_k represents the second observation information, and H represents the target Jacobian matrix.
7. The method of claim 3, wherein any uncertainty is characterized using a covariance matrix, and the second uncertainty is determined using the formula:
P_k = (I - K H) P_{k-1}
wherein P_k represents the second uncertainty, I represents the identity matrix, K represents the Kalman gain, H represents the target Jacobian matrix, and P_{k-1} represents the reference uncertainty.
8. The method of claim 1, wherein,
before the performing multi-target tracking on the second feature map to obtain a tracking result, the method further includes:
acquiring a second target feature map sequence corresponding to a plurality of historical environment image sets, wherein the plurality of historical environment image sets represent a plurality of environment image sets acquired by a plurality of image acquisition devices arranged at different orientations of the movable equipment at a plurality of historical moments, and the second target feature map sequence comprises: the second feature map of a plurality of frames of history corresponding to the plurality of sets of historical environmental images;
the performing multi-target tracking on the second feature map to obtain a tracking result includes:
determining a tracked second road surface target based on a tracking result corresponding to the second target feature map sequence;
performing target detection on the current second feature map, and determining a third road surface target in the current second feature map;
in response to the category of the third road surface target being the same as the category of the second road surface target, calculating a cost value between the third road surface target and the second road surface target in a calculation manner corresponding to the category of the third road surface target;
determining a tracking result corresponding to a third target feature map sequence based on the tracking result corresponding to the second target feature map sequence and the cost value, wherein the third target feature map sequence comprises: the second feature map of a multi-frame history in the second target feature map sequence, and the current second feature map.
9. The method of claim 8, wherein said calculating a cost value between the third road surface target and the second road surface target in a manner corresponding to a category of the third road surface target comprises:
in response to the category of the third road surface target being a road surface arrow category, determining a first prediction detection frame of the second road surface target in the current second feature map based on a tracking result corresponding to the second target feature map sequence;
calculating a detection frame intersection-over-union and a first detection frame Euclidean distance between the first prediction detection frame and a first actual detection frame of the third road surface target in the current second feature map;
calculating a first ratio of the detection frame intersection-over-union to a preset intersection-over-union;
calculating a second ratio of the first detection frame Euclidean distance to a preset Euclidean distance;
determining a cost value between the third road surface target and the second road surface target based on a weighted result of the first ratio and the second ratio.
10. The method of claim 8, wherein said calculating a cost value between the third road surface target and the second road surface target in a manner corresponding to a category of the third road surface target comprises:
in response to the category of the third road surface target being a zebra crossing category or a stop line category, determining a second prediction detection frame of the second road surface target in the current second feature map based on a tracking result corresponding to the second target feature map sequence;
projecting a second actual detection frame of the third road surface target in the current second feature map onto the long side of the second prediction detection frame to obtain a projection line segment;
calculating a line segment intersection-over-union between the projection line segment and the line segment of the long side;
calculating a second detection frame Euclidean distance between the second prediction detection frame and the second actual detection frame;
calculating a third ratio of the line segment intersection-over-union to a preset intersection-over-union;
calculating a fourth ratio of the second detection frame Euclidean distance to a preset Euclidean distance;
determining a cost value between the third road surface target and the second road surface target based on a weighting result of the third ratio and the fourth ratio.
11. The method according to any one of claims 1-10, further comprising:
compressing the road surface element reconstruction result to obtain a compression result; wherein the road surface element reconstruction result includes: the optimized three-dimensional coordinates of four vertexes of the detection frame of each road surface target;
and transmitting the compression result to an upper computer, wherein the compression result is used for the upper computer to update the environment model.
12. A road surface element reconstructing device comprising:
a first acquisition module configured to acquire an environment image set, the environment image set including: multiple frames of images of the environment around the movable equipment, with different viewing angles, collected at the current moment by a plurality of image collection devices arranged at different orientations of the movable equipment, wherein each frame of image in the environment image set includes at least one road surface target;
a first generating module, configured to generate, based on the environment image set acquired by the first acquiring module, a first feature map including semantic segmentation information in a bird's-eye view space via a neural network;
a determining module, configured to determine, based on the semantic segmentation information included in the first feature map generated by the first generating module, a pixel region of each of the road surface targets in the first feature map;
the second obtaining module is used for mapping the pixel area of each road surface target in the first characteristic diagram determined by the determining module to a preset coordinate system corresponding to the movable equipment to obtain a second characteristic diagram;
the multi-target tracking module is used for carrying out multi-target tracking on the second characteristic diagram obtained by the second obtaining module to obtain a tracking result;
and the second generation module is used for generating a road surface element reconstruction result based on the tracking result obtained by the multi-target tracking module.
13. A computer-readable storage medium storing a computer program for executing the road surface element reconstruction method according to any one of claims 1 to 11.
14. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor is configured to read the executable instructions from the memory and execute the instructions to implement the road surface element reconstruction method according to any one of claims 1 to 11.
CN202211090620.XA 2022-08-01 2022-09-07 Road surface element reconstruction method, device, electronic device and storage medium Pending CN115620250A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210918480 2022-08-01
CN2022109184804 2022-08-01

Publications (1)

Publication Number Publication Date
CN115620250A true CN115620250A (en) 2023-01-17

Family

ID=84859070

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211090620.XA Pending CN115620250A (en) 2022-08-01 2022-09-07 Road surface element reconstruction method, device, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN115620250A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115965927A (en) * 2023-03-16 2023-04-14 杭州枕石智能科技有限公司 Pavement information extraction method and device, electronic equipment and readable storage medium


Similar Documents

Publication Publication Date Title
Naseer et al. Deep regression for monocular camera-based 6-dof global localization in outdoor environments
US20210110599A1 (en) Depth camera-based three-dimensional reconstruction method and apparatus, device, and storage medium
JP6798183B2 (en) Image analyzer, image analysis method and program
JP2008152530A (en) Face recognition device, face recognition method, gabor filter applied device, and computer program
CN111582054A (en) Point cloud data processing method and device and obstacle detection method and device
CN114757301A (en) Vehicle-mounted visual perception method and device, readable storage medium and electronic equipment
WO2023231435A1 (en) Visual perception method and apparatus, and storage medium and electronic device
CN114898313A (en) Bird's-eye view image generation method, device, equipment and storage medium of driving scene
US20220375134A1 (en) Method, device and system of point cloud compression for intelligent cooperative perception system
CN115620250A (en) Road surface element reconstruction method, device, electronic device and storage medium
CN113793370A (en) Three-dimensional point cloud registration method and device, electronic equipment and readable medium
CN114821506A (en) Multi-view semantic segmentation method and device, electronic equipment and storage medium
CN114926316A (en) Distance measuring method, distance measuring device, electronic device, and storage medium
CN114998610A (en) Target detection method, device, equipment and storage medium
CN113807182A (en) Method, apparatus, medium, and electronic device for processing point cloud
CN114648639B (en) Target vehicle detection method, system and device
CN108335329B (en) Position detection method and device applied to aircraft and aircraft
CN115719476A (en) Image processing method and device, electronic equipment and storage medium
CN115861417A (en) Parking space reconstruction method and device, electronic equipment and storage medium
CN114937251A (en) Training method of target detection model, and vehicle-mounted target detection method and device
CN112116804B (en) Vehicle state quantity information determination method and device
CN113763468A (en) Positioning method, device, system and storage medium
CN117315035B (en) Vehicle orientation processing method and device and processing equipment
CN114359474A (en) Three-dimensional reconstruction method and device, computer equipment and storage medium
Moreira et al. Experimental implementation of loop closure detection using data dimensionality reduction by spectral method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination