CN115439637A - Vehicle-mounted augmented reality rendering method and system, vehicle and storage medium


Info

Publication number
CN115439637A
Authority
CN
China
Prior art keywords: image, vehicle, determining, pixel, rendering
Legal status: Pending
Application number
CN202210967652.7A
Other languages
Chinese (zh)
Inventor
陈翀宇
俞波
刘少山
胡波
Current Assignee
Beijing Binli Information Technology Co Ltd
Original Assignee
Beijing Binli Information Technology Co Ltd
Application filed by Beijing Binli Information Technology Co Ltd filed Critical Beijing Binli Information Technology Co Ltd
Priority to CN202210967652.7A
Publication of CN115439637A

Classifications

    • G06T19/00 Manipulating 3D models or images for computer graphics > G06T19/006 Mixed reality
    • G06F16/50 Information retrieval of still image data > G06F16/51 Indexing; Data structures therefor; Storage structures
    • G06F16/50 Information retrieval of still image data > G06F16/58 Retrieval characterised by using metadata > G06F16/583 using metadata automatically derived from the content
    • G06T19/00 Manipulating 3D models or images for computer graphics > G06T19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06V10/20 Image preprocessing > G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06T2219/20 Indexing scheme for editing of 3D models > G06T2219/2012 Colour editing, changing, or manipulating; Use of colour codes
    • G06T2219/20 Indexing scheme for editing of 3D models > G06T2219/2024 Style variation

Abstract

The application discloses a vehicle-mounted augmented reality rendering method, system, vehicle, and storage medium, in which image rendering is performed by combining intermediate data of the automatic driving algorithm, such as environment information, with superpixel fusion. The method comprises the following steps: acquiring an environment image around a vehicle, wherein the environment image is acquired by at least one camera device mounted on the vehicle; segmenting the environment image to obtain a segmented image, wherein the segmented image comprises a plurality of image sub-regions; determining a pixel index based on the image sub-regions; acquiring a user style input by a user, and determining rendering material based on the pixel index and the user style; and generating a rendered image based on the rendering material. By using the superpixel segmentation index, the method avoids the three-dimensional reconstruction errors of conventional methods, reduces the amount of computation, and achieves high-speed, high-fidelity style-transfer rendering.

Description

Rendering method and system of vehicle-mounted augmented reality, vehicle and storage medium
Technical Field
The application relates to the field of automatic driving, in particular to a rendering method, a rendering system, a vehicle and a storage medium for vehicle-mounted augmented reality.
Background
As automatic driving systems are deployed, a vehicle operating in an autonomous mode can simplify the tasks of its occupants. When the vehicle operates in the autonomous mode, a user who previously had to drive no longer needs to pay close attention to the road conditions outside the vehicle and instead pays more attention to the vehicle interior, which creates new demands on the in-vehicle entertainment system. In-vehicle entertainment is developing toward more and larger screens, for example by enlarging the center control screen, adding a front-passenger screen, or adopting an integrated curved screen. However, the in-vehicle entertainment system still interacts with the driver and passengers through screens in the traditional way, and the content it provides is still ordinary multimedia content played on a screen.
To address the shortcomings and limited functionality of existing forms of in-vehicle entertainment, the prior art enriches in-vehicle entertainment and improves the user experience by providing virtual reality entertainment inside the vehicle. Applications offering stylized, immersive experiences are likewise a focus of attention on the vehicle terminal.
To realize vehicle-mounted augmented reality rendering, the prior art either uses an existing model library and places matched models at positions in the environment coordinate system, or converts the vehicle's surroundings into a three-dimensional geometric model and performs synthetic rendering using the surface normals of that model together with the rendering camera position, the light source orientation, and material maps. These solutions have the following disadvantages. First, rendering from an existing model library can make the displayed result differ greatly from the actual geometry, producing an unrealistic experience and low scene fidelity. Second, converting the vehicle's surroundings into a three-dimensional reconstruction involves unavoidable ranging errors, which produce holes or undulations even on flat surfaces; the uneven mesh surface then causes visible jagged artifacts during downstream rendering, which greatly degrades the appearance.
Disclosure of Invention
In view of the above, the present application provides a vehicle-mounted augmented reality rendering method, system, vehicle, and computer-readable storage medium that build a superpixel segmentation index from intermediate data of the automatic driving algorithm, thereby avoiding the three-dimensional reconstruction errors of conventional methods, reducing the amount of computation, and achieving high-speed, high-fidelity style-transfer rendering.
In order to solve the technical problem, a first aspect of the present application provides a rendering method for vehicle-mounted augmented reality, including: acquiring an environment image around a vehicle, wherein the environment image is acquired by at least one camera device carried on the vehicle; segmenting the environment image to obtain a segmented image, wherein the segmented image comprises a plurality of image subregions; determining a pixel index based on the image sub-region; acquiring a user style input by a user, and determining a rendering material based on the pixel index and the user style; and generating a rendering image based on the rendering material.
According to a preferred embodiment of the present application, the environment image is captured by a plurality of imaging devices mounted on the vehicle; wherein the fields of view of the plurality of imaging devices have an overlap.
According to a preferred embodiment of the present application, the segmenting the environment image to obtain a segmented image includes at least one of: semantic segmentation, superpixel segmentation, instance segmentation, panorama segmentation.
According to a preferred embodiment of the present application, segmenting the environment image to obtain a segmented image includes: performing semantic segmentation and superpixel segmentation on the environment image respectively to obtain a semantic subregion and a superpixel subregion; the image sub-region comprises the semantic sub-region and the superpixel sub-region.
According to a preferred embodiment of the present application, the determining a pixel index based on the image sub-region comprises: generating a pixel index based on the semantic label of the semantic sub-region and the gray-level mean of the superpixel sub-region.
According to a preferred embodiment of the present application, the pixel index further includes an image hash of the superpixel sub-region.
According to a preferred embodiment of the present application, the determining rendering material based on the pixel index and the user style comprises: performing a combined query in a style material library using the pixel index and the user style to determine the rendering material.
According to a preferred embodiment of the present application, a three-dimensional point cloud around the vehicle is acquired, the three-dimensional point cloud being acquired by at least one laser radar mounted on the vehicle; a three-dimensional mesh model is generated based on the three-dimensional point cloud; the determining a pixel index based on the image sub-region further comprises: determining a pixel index based on the image sub-region and the three-dimensional point cloud; and the determining a rendered image based on rendering material further comprises: determining a user field of view; and determining a rendered image based on the user field of view, the three-dimensional mesh model, and the rendering material.
According to a preferred embodiment of the present application, the three-dimensional point cloud has an overlap with a field of view of the environment image, and the overlapping area corresponds to an area of the rendered image.
According to a preferred embodiment of the present application, the three-dimensional point cloud is collected by a plurality of laser radars mounted on the vehicle; wherein the fields of view of the plurality of lidar have an overlap.
According to a preferred embodiment of the present application, segmenting the environment image to obtain a segmented image includes: performing semantic segmentation and superpixel segmentation on the environment image respectively to obtain a semantic sub-region and a superpixel sub-region, the image sub-region comprising the semantic sub-region and the superpixel sub-region; and determining a pixel index based on the image sub-region and the three-dimensional point cloud includes: acquiring lidar extrinsic parameters and camera extrinsic parameters, and projecting the three-dimensional point cloud onto the environment image using the lidar extrinsic parameters and the camera extrinsic parameters; determining, for each point in the three-dimensional point cloud, the semantic label of the corresponding semantic sub-region of the environment image and the gray-level mean of the corresponding superpixel sub-region; and generating a pixel index based on the three-dimensional point cloud, the semantic label, and the gray-level mean.
According to a preferred embodiment of the present application, determining a rendered image based on the user field of view, the three-dimensional mesh model and the rendering material comprises: determining the corresponding relation between the user field of view and the three-dimensional grid model; determining a pixel index corresponding to the user field of view based on the correspondence; and performing combined query from a style material library by using the pixel index and the user style to determine a rendering material.
According to a preferred embodiment of the present application, the user field of view comprises a field of view image, the method further comprising: determining a direction vector corresponding to each pixel point based on the configuration of the user field of view; determining a grid on the three-dimensional grid model corresponding to the direction vector; and acquiring a pixel index corresponding to the grid as a pixel index corresponding to the user field of view.
In order to solve the above technical problem, a second aspect of the present application provides a rendering system for vehicle augmented reality, which includes a processor and a memory, wherein the memory is used for storing a computer program, and when the computer program is executed by the processor, the processor executes the rendering method for vehicle augmented reality as provided in the first aspect of the present application.
In order to solve the above technical problem, a third aspect of the present application provides a vehicle with an in-vehicle augmented reality rendering, the vehicle comprising: the camera shooting device is used for acquiring surrounding environment images; a memory for storing a computer program; a processor for, when the computer program is executed by the processor: acquiring an environment image around a vehicle, wherein the environment image is acquired by at least one camera device carried on the vehicle; segmenting the environment image to obtain a segmented image, wherein the segmented image comprises a plurality of image sub-regions; determining a pixel index based on the image sub-region; acquiring a user style input by a user, and determining a rendering material based on the pixel index and the user style; determining a rendering image based on the rendering material.
According to a preferred embodiment of the present application, the vehicle comprises a plurality of camera devices, the field of view of which have an overlap.
According to a preferred embodiment of the present application, the segmenting the environment image to obtain a segmented image includes at least one of: semantic segmentation, superpixel segmentation, instance segmentation, panorama segmentation.
According to a preferred embodiment of the present application, segmenting the environment image to obtain a segmented image includes: performing semantic segmentation and superpixel segmentation on the environment image respectively to obtain a semantic subregion and a superpixel subregion; the image sub-region comprises the semantic sub-region and the superpixel sub-region.
According to a preferred embodiment of the present application, the determining a pixel index based on the image sub-region comprises: and generating a pixel index based on the semantic label of the semantic subregion and the gray average value of the super pixel subregion.
According to a preferred embodiment of the present application, the pixel index further comprises an image hash of the superpixel subregion.
According to a preferred embodiment of the present application, the determining rendering material based on the pixel index and the user style comprises: and performing merging query from a style material library by using the pixel index and the user style to determine a rendering material.
According to a preferred embodiment of the present application, the vehicle further comprises at least one lidar for collecting a three-dimensional point cloud around the vehicle; the processor is further configured to, when the computer program is executed by the processor: acquire the three-dimensional point cloud around the vehicle, the three-dimensional point cloud being collected by the at least one laser radar mounted on the vehicle; generate a three-dimensional mesh model based on the three-dimensional point cloud; the determining a pixel index based on the image sub-region further comprises: determining a pixel index based on the image sub-region and the three-dimensional point cloud; and the determining a rendered image based on rendering material further comprises: determining a user field of view; and determining a rendered image based on the user field of view, the three-dimensional mesh model, and the rendering material.
According to a preferred embodiment of the present application, the three-dimensional point cloud has an overlap with a field of view of the environment image, and the overlapping area corresponds to an area of the rendered image.
According to a preferred embodiment of the application, the vehicle comprises a plurality of lidar whose fields of view have an overlap.
According to a preferred embodiment of the present application, segmenting the environment image to obtain a segmented image includes: performing semantic segmentation and superpixel segmentation on the environment image respectively to obtain a semantic sub-region and a superpixel sub-region, the image sub-region comprising the semantic sub-region and the superpixel sub-region; and determining a pixel index based on the image sub-region and the three-dimensional point cloud includes: acquiring lidar extrinsic parameters and camera extrinsic parameters, and projecting the three-dimensional point cloud onto the environment image using the lidar extrinsic parameters and the camera extrinsic parameters; determining, for each point in the three-dimensional point cloud, the semantic label of the corresponding semantic sub-region of the environment image and the gray-level mean of the corresponding superpixel sub-region; and generating a pixel index based on the three-dimensional point cloud, the semantic label, and the gray-level mean.
According to a preferred embodiment of the present application, determining a rendered image based on the user field of view, the three-dimensional mesh model and the rendering material comprises: determining the corresponding relation between the user field of view and the three-dimensional grid model; determining a pixel index corresponding to the user field of view based on the correspondence; and performing combined query from a style material library by using the pixel index and the user style to determine a rendering material.
According to a preferred embodiment of the present application, the user field of view comprises a field of view image, the method further comprising: determining a direction vector corresponding to each pixel point based on the configuration of the user field of view; determining a grid on the three-dimensional grid model corresponding to the direction vector; and acquiring a pixel index corresponding to the grid as a pixel index corresponding to the user field of view.
In order to solve the above technical problem, a fourth aspect of the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs that, when executed by a processor, implement the rendering method of the in-vehicle augmented reality as provided in the first aspect of the present application.
Compared with prior-art approaches that directly render a reconstructed three-dimensional model or use an existing model library, the vehicle-mounted augmented reality rendering method, system, vehicle, and computer-readable storage medium of the present application combine an image superpixel segmentation algorithm with a three-dimensional reconstruction of the scene from automatic-driving depth sensor data. This greatly alleviates the display problems caused by surface-normal regression errors while maintaining a high degree of scene fidelity.
Drawings
In order to make the technical problems solved, technical means adopted and technical effects achieved by the embodiments of the present application clearer, specific embodiments of the present application will be described in detail below with reference to the accompanying drawings. It should be noted, however, that the drawings described below are only for exemplary embodiments of the present application and that other embodiments may be obtained by those skilled in the art without inventive faculty.
Fig. 1 is a flowchart of the steps of a rendering method of an in-vehicle augmented reality according to the present application;
FIG. 2 is a conceptual framework diagram of an in-vehicle augmented reality rendering method according to the application;
FIG. 3 is a schematic block diagram of a rendering method for in-vehicle augmented reality according to the present application;
fig. 4 is a schematic diagram of superpixel segmentation in a rendering method of an in-vehicle augmented reality according to the present application.
Fig. 5 is a structural framework diagram of a vehicle augmented reality rendering system according to the present application.
Detailed Description
Exemplary embodiments of the present application will now be described more fully hereinafter with reference to the accompanying drawings. The exemplary embodiments, however, may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these exemplary embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept to those skilled in the art. The same reference numerals denote the same or similar elements, components, or parts in the drawings, and thus their repetitive description will be omitted.
Features, structures, characteristics or other details described in a particular embodiment do not preclude the fact that the features, structures, characteristics or other details may be combined in any suitable manner in one or more other embodiments while remaining within the technical spirit of the embodiments of the present application.
In describing particular embodiments, the features, structures, characteristics or other details of the embodiments of the present application are described in order to provide a full understanding of the embodiments to those skilled in the art. It is not excluded that one of ordinary skill in the art may practice the solution of the embodiments of the present application without one or more of the specific features, structures, characteristics or other details.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
It will be understood that, although the terms first, second, third, etc. may be used herein to describe various elements, components, assemblies or sections, these elements, assemblies or sections should not be limited by these terms. These phrases are used to distinguish one from another. For example, a first device may also be referred to as a second device without departing from the spirit of the embodiments of the present application.
The term "and/or" and/or "includes any and all combinations of one or more of the associated listed items.
To obtain information about the environment around the vehicle, a wearable device with a camera can capture the user's real-time view, mixed-reality spatial anchors can be placed in the three-dimensional space, and the camera picture can then be style-transferred directly through a stylization generation network. However, in a high-speed motion scene the real-time mixed-reality anchors may drift, and it is difficult to generate a stylized scene that is consistent in position and time, so effects jump between frames and the experience is very unstable.
Alternatively, three-dimensional reconstruction and rendering can be performed by means of neural radiance field rendering. Neural radiance fields can achieve high fidelity, but they require collecting a large amount of scene data across different time periods, demand high computing power at run time, and have poor real-time performance, so they are not suitable for use on the vehicle side.
Referring to fig. 1, a rendering method of a vehicle-mounted augmented reality provided by the present application is described below, specifically, the method includes:
s100: the method comprises the steps of obtaining an environment image around a vehicle, wherein the environment image is collected by at least one camera device mounted on the vehicle.
The vehicle-mounted augmented reality rendering method is mainly image-based. In step S100, the vehicle can acquire an image of its surroundings through the camera device. The number of camera devices mounted on the vehicle may be one or more. For example, a camera device may be arranged at the middle of the front windshield or in the middle of the front of the vehicle. Such a camera device provides the vehicle with an image of the environment ahead, which is close to what the driver and passengers actually see. Of course, the vehicle may also carry camera devices at other positions, for example on the roof, on the left and right rear-view mirrors, or at the rear of the vehicle, so as to provide environment images of other directions.
It will be understood that when camera devices are mounted at different positions on the vehicle, the fields of view of the multiple camera devices overlap. For example, the field of view of the front-facing camera overlaps that of the camera on the left rear-view mirror, and likewise that of the camera on the right rear-view mirror. The advantage of overlapping fields of view is that, during image processing, the multiple environment images can be stitched into a single environment image with a larger field of view based on the overlapping portions, so that a rendering with a wider field of view can be obtained. Of course, the fields of view of the camera devices may also be non-overlapping, i.e. the environment images have no common content; in that case, vehicle-mounted augmented reality rendering can be performed on each environment image separately, finally yielding separate rendered images.
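A minimal sketch of such stitching is shown below, using OpenCV's panorama stitcher; the frame file names and the fallback rule are illustrative assumptions, not part of the application.

```python
# Minimal sketch (assumption): stitch overlapping camera frames into one
# wider-field environment image before segmentation and rendering.
import cv2

frame_paths = ("cam_left.png", "cam_front.png", "cam_right.png")  # hypothetical paths
frames = [cv2.imread(p) for p in frame_paths]

stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
status, environment_image = stitcher.stitch(frames)
if status != cv2.Stitcher_OK:
    # Fall back to the single front-facing frame if the overlap is insufficient.
    environment_image = frames[1]
```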
S200: the environment image is segmented to obtain a segmented image, and the segmented image comprises a plurality of image sub-regions.
After the image of the environment surrounding the vehicle is obtained, the image may be segmented to prepare for subsequent rendering of the augmented reality. In particular, segmenting the image may include at least one of semantic segmentation, superpixel segmentation, instance segmentation, or panorama segmentation. After the image is segmented, a plurality of image sub-regions can be obtained; that is, the ambient image will comprise a plurality of segmented image sub-regions. The image sub-regions can be used for better rendering, i.e. they divide the environment image into smaller sub-units, so that different rendering effects can be given for different sub-units, resulting in rich rendering results.
In one embodiment, segmenting the image may include both semantic segmentation and superpixel segmentation. Where semantic segmentation will segment the ambient image into semantic sub-regions and superpixel segmentation will segment the ambient image into superpixel sub-regions. At this time, the image sub-region may include both the semantic sub-region and the super-pixel sub-region.
The superpixel segmentation algorithm may, for example, use DBSCAN (Density-Based Spatial Clustering of Applications with Noise), which defines a distance between pixels, clusters them, and outputs a superpixel label for each pixel of the environment image; or it may use SLIC (Simple Linear Iterative Clustering), which converts the environment image into five-dimensional feature vectors of CIELAB color and XY coordinates, constructs a distance metric on these feature vectors, and outputs a superpixel label for each pixel of the environment image. Of course, other superpixel segmentation algorithms can also be used and are not described in detail here.
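As a concrete illustration of the SLIC option, the sketch below labels superpixels for one camera frame with scikit-image; the parameter values and file name are assumptions, not values from the application.

```python
# Minimal sketch (assumption): SLIC superpixel labels plus per-superpixel
# gray-level means, which are reused later when building the pixel index.
import numpy as np
from skimage import io
from skimage.segmentation import slic
from skimage.color import rgb2gray

image = io.imread("front_camera_frame.png")   # hypothetical input path

# SLIC clusters pixels in (L, a, b, x, y) space; n_segments and compactness
# control the superpixel count and shape regularity (illustrative values).
superpixel_labels = slic(image, n_segments=800, compactness=10, start_label=0)

gray = rgb2gray(image)
gray_means = {
    int(label): float(gray[superpixel_labels == label].mean())
    for label in np.unique(superpixel_labels)
}
```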
Referring to FIG. 4, a diagram of superpixel segmentation and semantic segmentation is provided. Superpixel segmentation divides an image into a plurality of superpixel sub-regions 1-1. A superpixel sub-region 1-1 consists of adjacent pixels with similar characteristics such as color, brightness, and texture, so all of its pixels share similar visual properties. Superpixel sub-regions retain the information needed for further image segmentation, make local structural features more apparent, and, because adjacent superpixel sub-regions are treated as units during segmentation, the object boundaries in the image are not destroyed. Semantic segmentation, in contrast, divides an image into a plurality of semantic sub-regions (shown as differently colored regions in FIG. 4). Semantic segmentation associates each pixel with a category label, and the set of pixels sharing the same category label forms a semantic sub-region. For example, when an image contains cars, trees, and pedestrians, the cars, trees, and pedestrians are identified and their pixels are assigned to the corresponding semantic sub-regions. Illustratively, the differently colored regions in FIG. 4 represent different semantic sub-regions. It should be noted that a semantic sub-region and a superpixel sub-region may have the same pixel boundary or different pixel boundaries.
S300: a pixel index is determined based on the image sub-region.
After the image sub-regions are obtained by image segmentation, a pixel index can be determined based on them. In some embodiments, the image sub-regions include superpixel sub-regions and semantic sub-regions; the superpixel label of a pixel from the superpixel sub-region and the semantic label of the same pixel from the semantic sub-region are then fused to obtain the pixel index. Illustratively, the gray-level mean of a superpixel sub-region can be computed from its pixel values, and the semantic label and the gray-level mean can be compressed into a pixel index with bit operations; alternatively, the gray-level mean and an image hash of the superpixel sub-region can both be computed, and the semantic label, the gray-level mean, and the image hash can be compressed into the pixel index with bit operations.
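The application does not specify an exact bit layout; the sketch below shows one plausible way to compress a semantic label, a quantized gray-level mean, and a short image hash into a single integer pixel index with bit operations.

```python
# Minimal sketch (assumption, not the application's exact encoding):
# 8 bits semantic label | 8 bits quantized gray mean | 16 bits image hash.
def make_pixel_index(semantic_label: int, gray_mean: float, image_hash: int) -> int:
    sem = semantic_label & 0xFF
    gray = int(round(gray_mean * 255)) & 0xFF      # gray_mean assumed in [0, 1]
    h = image_hash & 0xFFFF
    return (sem << 24) | (gray << 16) | h

def unpack_pixel_index(index: int) -> tuple:
    return (index >> 24) & 0xFF, (index >> 16) & 0xFF, index & 0xFFFF

# Example: semantic label 12 (say, "road"), gray mean 0.42, 16-bit hash.
idx = make_pixel_index(12, 0.42, 0xBEEF)
```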
S400: acquiring a user style input by a user, and determining a rendering material based on the pixel index and the user style;
After the pixel indices are obtained, rendering material can be determined together with the user style input by the user. It will be understood that the user style may be obtained before or after the preceding steps. For example, the user can be prompted to select a style when the vehicle-mounted augmented reality function is set up for the first time, and that style is then used directly on every drive; alternatively, the user can be prompted to make a selection, specifically for the current trip, each time the vehicle-mounted augmented reality function is enabled. The present application does not limit this.
The user styles may include different styles such as cartoon styles, enhanced styles, painting styles, and the like. Different user styles correspond to different style material libraries, for example, when the style is cartoon style, different cartoon rendering materials are provided for different objects. After determining the user style, the rendering material may be determined by the pixel index obtained as described above. Illustratively, a B + tree may be utilized to implement a multi-indexed merged query. Because the pixel index includes the super-pixel sub-region information and the semantic information, a certain specific rendering material obtained by query may correspond to a specific image sub-region that needs to be rendered, and may also provide semantics corresponding to the specific image sub-region.
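A combined query keyed on the user style and the fields packed into the pixel index might look like the following sketch; the library layout, asset names, and variant-selection rule are illustrative assumptions (the application itself describes a B+-tree merged query).

```python
# Minimal sketch (assumption): look up rendering material by
# (user style, semantic label), with the gray mean picking a texture variant.
STYLE_LIBRARY = {
    ("cartoon", 12): ["road_cartoon_dark.png", "road_cartoon_light.png"],   # hypothetical assets
    ("cartoon", 21): ["tree_cartoon_dark.png", "tree_cartoon_light.png"],
    ("painting", 12): ["road_oil_dark.png", "road_oil_light.png"],
}

def query_material(user_style: str, pixel_index: int) -> str:
    semantic_label = (pixel_index >> 24) & 0xFF   # fields as packed in the
    gray_mean = (pixel_index >> 16) & 0xFF        # earlier pixel-index sketch
    variants = STYLE_LIBRARY[(user_style, semantic_label)]
    # Darker superpixels pick the darker texture variant (illustrative rule).
    return variants[0] if gray_mean < 128 else variants[1]

material = query_material("cartoon", (12 << 24) | (107 << 16) | 0xBEEF)
```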
S500: and generating a rendering image based on the rendering material.
Once the rendering material is determined, it can be used to generate the rendered image. Specifically, generating the rendered image further includes determining the user field of view from the user's eye position, the viewing device, and the pixel region of the viewed image, and traversing every pixel in the pixel region of the user field of view to obtain a series of direction vectors. Each direction vector is intersected with the environment image to be rendered to obtain the corresponding position on the environment image, and material rendering is then completed superpixel by superpixel based on the pixel index of the pixels at that position and the rendering material obtained from the query.
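The sketch below fills the output image superpixel by superpixel with the queried material texture; it omits the view-ray lookup for brevity, and the resampling rule (stretching each texture over the superpixel's bounding box) is an assumption.

```python
# Minimal sketch (assumption): per-superpixel material fill.
import numpy as np
import cv2

def render_styled_image(image, superpixel_labels, material_for_label):
    """material_for_label: {superpixel label -> texture file path} from the query step."""
    out = image.copy()
    for label, material_path in material_for_label.items():
        mask = superpixel_labels == label
        if not mask.any():
            continue
        ys, xs = np.nonzero(mask)
        y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
        # Resize the material to the superpixel's bounding box, then copy only
        # the pixels belonging to this superpixel.
        tex = cv2.resize(cv2.imread(material_path), (x1 - x0, y1 - y0))
        local = mask[y0:y1, x0:x1]
        out[y0:y1, x0:x1][local] = tex[local]
    return out
```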
The vehicle-mounted augmented reality rendering method described above can provide users with real-time rendering of the environment image, transferring the environment image from reality to another rendering style and thereby greatly enriching the driving experience.
A flow of the rendering method for the vehicle-mounted augmented reality provided by the present application is further described below with reference to fig. 2.
First, the vehicle-mounted imaging device senses the environment around the vehicle and outputs an environment image.
Then, the environment image enters the image segmentation stage, which specifically comprises image semantic segmentation and image superpixel segmentation. For semantic segmentation, a semantic mask is output after the acquired environment image is semantically segmented; illustratively, a neural network can output a semantic class for each pixel of the RGB image. For superpixel segmentation, a superpixel mask is output after the acquired environment image is segmented into superpixels; illustratively, a density-based clustering algorithm such as DBSCAN defines a distance between pixels, clusters them, and outputs a superpixel label for each pixel of the RGB image. The semantic mask and the superpixel mask represent the corresponding image sub-regions.
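The application does not name a specific semantic segmentation network; purely as an illustration, the sketch below produces a per-pixel semantic mask with a pretrained DeepLabV3 model from torchvision.

```python
# Minimal sketch (assumption): per-pixel semantic labels ("semantic mask")
# from a pretrained segmentation network; model choice and input path are
# illustrative, not prescribed by the application.
import torch
from torchvision.models.segmentation import deeplabv3_resnet50
from torchvision import transforms
from PIL import Image

model = deeplabv3_resnet50(weights="DEFAULT").eval()
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

frame = Image.open("front_camera_frame.png").convert("RGB")    # hypothetical path
with torch.no_grad():
    logits = model(preprocess(frame).unsqueeze(0))["out"]       # (1, C, H, W)
semantic_mask = logits.argmax(dim=1).squeeze(0).numpy()         # (H, W) label map
```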
Second, the semantic mask and the superpixel mask are fused to obtain pixels with pixel indexes. Illustratively, the semantics of each pixel in the image, the current superpixel grayscale mean, and the image hash of the superpixel may be compressed into a pixel index using a bit operation. The pixel index may be a superpixel index, i.e., information representing the superpixel sub-region in which the pixel is located.
And then, based on the obtained pixel indexes and the style input by the user, carrying out combined query in a material library to obtain a queried rendering material.
And finally, according to the rendering material and the direction observed by the user, finishing material rendering by taking the super-pixels as units and outputting the image with the style transferred.
Fig. 2 schematically illustrates a flow framework of the rendering method for vehicle-mounted augmented reality provided by the present application, wherein the implementation of the specific steps may refer to the description of the foregoing steps S100 to S500, and details are not repeated here.
The following further describes a flow of another rendering method for vehicle-mounted augmented reality provided by the present application with reference to fig. 3. The differences between the present embodiment and the embodiments described above will be specifically described below, and reference may be made to the above description for the same points.
Further, in addition to the camera device, the vehicle may carry at least one laser radar. For example, a single laser radar can be arranged at the front of the vehicle, or on the roof with a 360° field of view; alternatively, several laser radars can be arranged at the front, the left side, and the right side of the vehicle. When the vehicle carries multiple lidars, their fields of view overlap, just as with multiple camera devices. The lidar and the camera device on the vehicle also have overlapping fields of view, i.e. the three-dimensional point cloud acquired by the lidar overlaps with the field of view of the environment image acquired by the camera device, and the overlapping region corresponds to the region of the final rendered image. It will be understood that, when several lidars or several camera devices are provided, the overlapping point clouds and environment images are available: the point clouds obtained by the lidars are aligned and merged, and the environment images obtained by the camera devices are aligned and stitched.
Superpixel segmentation and semantic segmentation can be performed on the environment image in the same way as before. After the semantic sub-regions and superpixel sub-regions are obtained, they can be further fused with the three-dimensional point cloud. Specifically, using the lidar extrinsic parameters and the camera extrinsic parameters, the lidar point cloud can be projected onto the environment image of the camera device. Each three-dimensional point is projected to a corresponding pixel on the environment image, from which the semantic label of the semantic sub-region and the superpixel information of the superpixel sub-region containing that pixel are obtained. The semantic label, the current superpixel gray-level mean, and the superpixel image hash are then compressed into a pixel index with bit operations.
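A minimal sketch of the projection step follows; the matrix names, frame conventions, and pinhole model are assumptions for illustration (the combined lidar-to-camera transform can be composed from the lidar and camera extrinsics).

```python
# Minimal sketch (assumption): project lidar points into the camera image and
# read the semantic and superpixel labels at the projected pixels.
import numpy as np

def project_points(points_lidar, T_cam_from_lidar, K):
    """points_lidar: (N, 3) in the lidar frame; T: 4x4 extrinsic; K: 3x3 intrinsic."""
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_from_lidar @ pts_h.T).T[:, :3]
    in_front = pts_cam[:, 2] > 0                    # keep points in front of the camera
    uvw = (K @ pts_cam[in_front].T).T
    uv = (uvw[:, :2] / uvw[:, 2:3]).astype(int)     # pixel coordinates (u, v)
    return uv, in_front

# Usage (shapes only, arrays assumed to exist):
# uv, valid = project_points(cloud, T_cam_from_lidar, K)
# semantic_of_point = semantic_mask[uv[:, 1], uv[:, 0]]
# superpixel_of_point = superpixel_labels[uv[:, 1], uv[:, 0]]
```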
Further, three-dimensional reconstruction can be performed by using the three-dimensional point cloud. Illustratively, with the Marching Cubes algorithm, the three-dimensional point cloud can be subjected to three-dimensional reconstruction from point to surface, so as to obtain a three-dimensional mesh model.
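Marching Cubes operates on a volume; one common route (an assumption here, since the application does not detail the step) is to voxelize the point cloud into a coarse occupancy grid first, as in this sketch with scikit-image.

```python
# Minimal sketch (assumption): crude occupancy-grid reconstruction followed by
# Marching Cubes; voxel size and level are illustrative.
import numpy as np
from skimage.measure import marching_cubes

def mesh_from_cloud(points, voxel_size=0.2):
    mins = points.min(axis=0)
    idx = ((points - mins) / voxel_size).astype(int)
    volume = np.zeros(idx.max(axis=0) + 3, dtype=float)
    volume[idx[:, 0] + 1, idx[:, 1] + 1, idx[:, 2] + 1] = 1.0   # mark occupied voxels
    verts, faces, normals, _ = marching_cubes(volume, level=0.5)
    # Convert vertex coordinates from voxel indices back to world units.
    return verts * voxel_size + mins - voxel_size, faces

# verts, faces = mesh_from_cloud(cloud)
```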
After the user's style input is obtained and the rendering material has been queried, the rendered image can be determined based on the user field of view, the three-dimensional mesh model, and the rendering material. Specifically, the user field of view when viewing the rendered environment can be determined from the user's eye position and the viewing device. Each pixel in the pixel region of the user field of view is traversed to obtain a series of direction vectors. Each pixel's direction vector is intersected with the three-dimensional mesh model to obtain a vertex on the mesh model and, from it, the corresponding pixel index. A combined query is then performed in the style material library using the pixel index and the user style to determine the rendering material. Rendering can thus be completed based on the rendering material, and the style-transferred image is output.
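As an illustration only (not the application's implementation), the sketch below casts one ray per view pixel against the reconstructed mesh using trimesh's ray-mesh intersection; the recovered hit faces then map back to pixel indices for the material query described above.

```python
# Minimal sketch (assumption): ray-mesh intersection for the user's view rays.
import numpy as np
import trimesh

def view_ray_hits(mesh_verts, mesh_faces, eye, ray_dirs):
    """eye: (3,) viewpoint; ray_dirs: (N, 3) one direction vector per view pixel."""
    mesh = trimesh.Trimesh(vertices=mesh_verts, faces=mesh_faces)
    origins = np.tile(eye, (len(ray_dirs), 1))
    locations, index_ray, index_tri = mesh.ray.intersects_location(
        ray_origins=origins, ray_directions=ray_dirs)
    return locations, index_ray, index_tri   # hit points, which ray, which face
```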
This second vehicle-mounted augmented reality rendering method further introduces the three-dimensional point cloud information of the lidar, provides three-dimensional information about the vehicle's surroundings, and constructs an environment image backed by a three-dimensional mesh model, thereby providing a more faithful and realistic augmented reality rendering effect.
Referring to fig. 5, the present application further provides a rendering system for vehicle-mounted augmented reality. The rendering system 20 comprises a memory 21 and a processor 22. The memory 21 stores a computer program which, when executed by the processor 22, causes the processor 22 to execute the vehicle-mounted augmented reality rendering method provided by the present application as described above.
The present application further provides a vehicle with an in-vehicle augmented reality rendering, the vehicle comprising at least one camera device for capturing images of an environment around the vehicle, and a memory for storing a computer program, a processor for implementing the in-vehicle augmented reality rendering method as described above when the computer program is executed by the processor. In some embodiments, the vehicle provided by the present application further comprises at least one lidar, and the rendering method of the vehicle-mounted augmented reality with the lidar can be implemented.
Another embodiment of the present application further provides a computer readable storage medium storing one or more programs, where when the one or more programs are executed by a processor, the method for rendering the in-vehicle augmented reality provided by the present application is implemented as described above.
Those skilled in the art will appreciate that all or part of the steps of the above embodiments are implemented as programs, i.e. computer programs, executed by data processing apparatuses (including computers). When the computer program is executed, the above method provided by the application can be realized. The computer program may be stored in a computer-readable storage medium, which may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a magnetic disk, an optical disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing, such as a storage array of multiple storage media, e.g. a magnetic disk or tape storage array. The computer program, when executed by one or more data processing devices, enables the computer-readable storage medium to implement the above-described methods of the present application. Further, the storage medium is not limited to centralized storage, but may also be distributed storage, such as cloud storage based on cloud computing.

It should be appreciated that, in the foregoing description of exemplary embodiments, various features of the present application are sometimes described within a single embodiment or with reference to a single figure in order to streamline the application and assist those skilled in the art in understanding its various aspects. However, the present application should not be construed as requiring that all features included in the exemplary embodiments are essential technical features of the claims.
Further, those skilled in the art will readily appreciate that the exemplary embodiments described herein may be implemented by software or, where necessary, by a combination of software and hardware. Therefore, the technical solution according to the embodiments of the present application may be embodied in the form of a software product, which may be stored in a computer-readable storage medium (which may be a CD-ROM, a USB disk, a removable hard disk, etc.) or on a network, and which includes several instructions that enable a data processing device (which may be a personal computer, a server, or a network device, etc.) to execute the above method of the present application. The computer-readable storage medium may include a propagated data signal with readable program code embodied therein, for example in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including but not limited to electromagnetic or optical forms, or any suitable combination thereof. A readable signal medium may also be any readable medium other than a readable storage medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing. Program code for carrying out the operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ and conventional procedural programming languages such as the C language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
Thus, the present application may be embodied as a method, system, electronic device, or computer-readable storage medium that executes a computer program. Some or all of the functions of the present application may be implemented in practice using a general-purpose data processing device such as a microprocessor or a Digital Signal Processor (DSP).
It should be understood that the modules, units, components, and the like included in the device of one embodiment of the present application may be adaptively changed to be provided in a device different from the embodiment. The different modules, units or components comprised by the apparatus of an embodiment may be combined into one module, unit or component or they may be divided into a plurality of sub-modules, sub-units or sub-components. The modules, units or components in the embodiments of the present application may be implemented in hardware, may be implemented in software running on one or more processors, or may be implemented in a combination thereof.
The above-mentioned embodiments are further described in detail for the purpose of illustrating the invention, and it should be understood that the above-mentioned embodiments are only illustrative of the present invention and are not intended to limit the present invention, and any modifications, equivalents, improvements, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
The foregoing detailed description has further explained the objects, technical solutions, and advantages of the embodiments of the present application. It should be understood that the embodiments of the present application are not inherently tied to any particular computer, virtual device, or electronic system, and various general-purpose devices may implement them. The above description is only exemplary of the embodiments of the present application and should not be construed as limiting them; any modifications, equivalents, improvements, and the like made within the spirit and principles of the embodiments of the present application should be included in their scope of protection.

Claims (28)

1. A rendering method of vehicle-mounted augmented reality is characterized by comprising the following steps:
acquiring an environment image around a vehicle, wherein the environment image is acquired by at least one camera device carried on the vehicle;
segmenting the environment image to obtain a segmented image, wherein the segmented image comprises a plurality of image subregions;
determining a pixel index based on the image sub-region;
acquiring a user style input by a user, and determining a rendering material based on the pixel index and the user style;
and generating a rendering image based on the rendering material.
2. The method of claim 1, further comprising:
the environment image is collected by a plurality of camera devices mounted on the vehicle; wherein the fields of view of the plurality of imaging devices have an overlap.
3. The method of claim 1, wherein segmenting the environmental image to obtain a segmented image comprises at least one of: semantic segmentation, superpixel segmentation, instance segmentation, panorama segmentation.
4. The method of claim 3, wherein segmenting the environmental image to obtain a segmented image comprises:
performing semantic segmentation and superpixel segmentation on the environment image respectively to obtain a semantic subregion and a superpixel subregion; the image sub-region comprises the semantic sub-region and the superpixel sub-region.
5. The method of claim 4, wherein determining a pixel index based on the image sub-region comprises:
and generating a pixel index based on the semantic label of the semantic subregion and the gray average value of the super pixel subregion.
6. The method of claim 5, wherein the pixel index further comprises an image hash of the superpixel subregion.
7. The method of claim 1, wherein the determining rendering material based on the pixel index and the user style comprises:
and performing merging query from a style material library by using the pixel index and the user style to determine a rendering material.
8. The method of claim 1, further comprising:
acquiring a three-dimensional point cloud around the vehicle, wherein the three-dimensional point cloud is acquired by at least one laser radar carried on the vehicle;
generating a three-dimensional mesh model based on the three-dimensional point cloud;
said determining a pixel index based on the image sub-region further comprises: determining a pixel index based on the image sub-region and the three-dimensional point cloud;
the determining a rendered image based on rendering material further comprises:
determining a user field of view;
determining a rendered image based on the user field of view, the three-dimensional mesh model, and the rendering material.
9. The method of claim 8, wherein the three-dimensional point cloud has an overlap with a field of view of the environmental image, and the overlap region corresponds to a region of the rendered image.
10. The method of claim 8, further comprising:
the three-dimensional point cloud is collected by a plurality of laser radars carried on the vehicle; wherein the fields of view of the plurality of lidar have an overlap.
11. The method of claim 8, wherein segmenting the environmental image to obtain a segmented image comprises:
performing semantic segmentation and superpixel segmentation on the environment image respectively to obtain a semantic subregion and a superpixel subregion; the image sub-region comprises the semantic sub-region and the super-pixel sub-region;
determining a pixel index based on the image subregion and the three-dimensional point cloud, comprising: acquiring lidar extrinsic parameters and camera extrinsic parameters, and projecting the three-dimensional point cloud onto the environment image through the lidar extrinsic parameters and the camera extrinsic parameters;
determining semantic labels of semantic subregions of the environment image corresponding to each point in the three-dimensional point cloud and gray level average values of the super-pixel subregions;
generating a pixel index based on the three-dimensional point cloud, the semantic tag, and the gray average.
12. The method of claim 11, wherein determining a rendered image based on the user field of view, the three-dimensional mesh model, and the rendered material comprises:
determining the corresponding relation between the user field of view and the three-dimensional grid model;
determining a pixel index corresponding to the user field of view based on the correspondence;
and performing merging query from a style material library by using the pixel index and the user style to determine a rendering material.
13. The method of claim 12, wherein the user field of view comprises a field of view image, the method further comprising:
determining a direction vector corresponding to each pixel point based on the configuration of the user field of view;
determining a grid on the three-dimensional grid model corresponding to the direction vector;
and acquiring a pixel index corresponding to the grid as a pixel index corresponding to the user field of view.
14. A rendering system of an augmented reality for a vehicle, comprising a processor and a memory for storing a computer program, wherein the processor performs the method of any one of claims 1-13 when the computer program is executed by the processor.
15. A vehicle with vehicle-mounted augmented reality rendering, the vehicle comprising:
a camera device for acquiring images of the surrounding environment;
a memory for storing a computer program; and
a processor configured to, when the computer program is executed by the processor, perform:
acquiring an environment image around the vehicle, wherein the environment image is acquired by at least one camera device carried on the vehicle;
segmenting the environment image to obtain a segmented image, wherein the segmented image comprises a plurality of image subregions;
determining a pixel index based on the image sub-region;
acquiring a user style input by a user, and determining a rendering material based on the pixel index and the user style;
determining a rendered image based on the rendering material.
16. The vehicle of claim 15, characterized in that the vehicle comprises a plurality of cameras having overlapping fields of view.
17. The vehicle of claim 15, wherein segmenting the environment image to obtain a segmented image comprises at least one of: semantic segmentation, superpixel segmentation, instance segmentation, and panoptic segmentation.
18. The vehicle of claim 17, wherein segmenting the environment image to obtain a segmented image comprises:
performing semantic segmentation and superpixel segmentation on the environment image respectively to obtain a semantic sub-region and a superpixel sub-region, the image sub-region comprising the semantic sub-region and the superpixel sub-region.
19. The vehicle of claim 18, wherein the determining a pixel index based on the image sub-region comprises:
generating a pixel index based on the semantic label of the semantic sub-region and the gray-level average of the superpixel sub-region.
20. The vehicle of claim 19, wherein the pixel index further comprises an image hash of the superpixel sub-region.
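The image hash of claim 20 is not tied to any particular algorithm; one simple possibility is an average hash ("aHash") over the superpixel's bounding box, sketched below under the assumptions of an 8 x 8 hash, a grayscale image, and a non-empty boolean superpixel mask:

```python
# Sketch: average hash of one superpixel sub-region (an 8x8 aHash is assumed).
import numpy as np

def superpixel_ahash(gray_image: np.ndarray, mask: np.ndarray, side: int = 8) -> int:
    """gray_image: (H, W) grayscale; mask: non-empty boolean map of one superpixel."""
    ys, xs = np.nonzero(mask)
    patch = gray_image[ys.min():ys.max() + 1, xs.min():xs.max() + 1].astype(float)
    # Nearest-neighbour downsample of the bounding box to side x side.
    yy = np.arange(side) * patch.shape[0] // side
    xx = np.arange(side) * patch.shape[1] // side
    small = patch[np.ix_(yy, xx)]
    # One bit per cell: brighter than the patch mean or not.
    bits = (small > small.mean()).ravel()
    return int("".join("1" if b else "0" for b in bits), 2)
```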
21. The vehicle of claim 15, wherein the determining rendering material based on the pixel index and the user style comprises:
performing a combined query on a style material library using the pixel index and the user style to determine the rendering material.
22. The vehicle of claim 15, further comprising at least one lidar configured to acquire a three-dimensional point cloud around the vehicle;
wherein the processor is further configured to, when the computer program is executed by the processor, perform:
acquiring three-dimensional point cloud around the vehicle, wherein the three-dimensional point cloud is acquired by at least one laser radar carried on the vehicle;
generating a three-dimensional mesh model based on the three-dimensional point cloud;
the determining a pixel index based on the image sub-region further comprises: determining a pixel index based on the image sub-region and the three-dimensional point cloud;
the determining a rendered image based on the rendering material further comprises:
determining a user field of view;
determining a rendered image based on the user field of view, the three-dimensional mesh model, and the rendering material.
23. The vehicle of claim 22, wherein the three-dimensional point cloud overlaps the field of view of the environment image, and the overlapping region corresponds to a region of the rendered image.
24. The vehicle of claim 22, characterized in that the vehicle comprises a plurality of lidars having overlapping fields of view.
25. The vehicle of claim 22, wherein segmenting the environment image to obtain a segmented image comprises:
performing semantic segmentation and superpixel segmentation on the environment image respectively to obtain a semantic sub-region and a superpixel sub-region, the image sub-region comprising the semantic sub-region and the superpixel sub-region;
the determining a pixel index based on the image sub-region and the three-dimensional point cloud comprises: acquiring laser radar extrinsic parameters and camera device extrinsic parameters, and projecting the three-dimensional point cloud onto the environment image using the laser radar extrinsic parameters and the camera device extrinsic parameters;
determining, for each point in the three-dimensional point cloud, the semantic label of the corresponding semantic sub-region of the environment image and the gray-level average of the corresponding superpixel sub-region;
generating a pixel index based on the three-dimensional point cloud, the semantic label, and the gray-level average.
26. The vehicle of claim 25, wherein determining a rendered image based on the user field of view, the three-dimensional mesh model, and the rendered material comprises:
determining a correspondence between the user field of view and the three-dimensional mesh model;
determining a pixel index corresponding to the user field of view based on the correspondence;
performing a combined query on a style material library using the pixel index and the user style to determine the rendering material.
27. The vehicle of claim 26, wherein the user field of view comprises a field of view image, and the processor is further configured to perform:
determining a direction vector corresponding to each pixel point based on the configuration of the user field of view;
determining a mesh cell on the three-dimensional mesh model corresponding to the direction vector;
acquiring the pixel index corresponding to the mesh cell as the pixel index corresponding to the user field of view.
28. A computer readable storage medium, wherein the computer readable storage medium stores one or more programs which, when executed by a processor, implement the method of any of claims 1-13.
CN202210967652.7A 2022-08-12 2022-08-12 Vehicle-mounted augmented reality rendering method and system, vehicle and storage medium Pending CN115439637A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210967652.7A CN115439637A (en) 2022-08-12 2022-08-12 Vehicle-mounted augmented reality rendering method and system, vehicle and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210967652.7A CN115439637A (en) 2022-08-12 2022-08-12 Vehicle-mounted augmented reality rendering method and system, vehicle and storage medium

Publications (1)

Publication Number Publication Date
CN115439637A true CN115439637A (en) 2022-12-06

Family

ID=84242678

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210967652.7A Pending CN115439637A (en) 2022-08-12 2022-08-12 Vehicle-mounted augmented reality rendering method and system, vehicle and storage medium

Country Status (1)

Country Link
CN (1) CN115439637A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115984583A (en) * 2022-12-30 2023-04-18 广州沃芽科技有限公司 Data processing method, apparatus, computer device, storage medium and program product
CN115984583B (en) * 2022-12-30 2024-02-02 广州沃芽科技有限公司 Data processing method, apparatus, computer device, storage medium, and program product
CN116109753A (en) * 2023-04-12 2023-05-12 深圳原世界科技有限公司 Three-dimensional cloud rendering engine platform and data processing method

Similar Documents

Publication Publication Date Title
Sakaridis et al. Semantic foggy scene understanding with synthetic data
US11379987B2 (en) Image object segmentation based on temporal information
CN115439637A (en) Vehicle-mounted augmented reality rendering method and system, vehicle and storage medium
CN114514535A (en) Instance segmentation system and method based on semantic segmentation
Xiao et al. Single image dehazing based on learning of haze layers
EP3973507B1 (en) Segmentation for holographic images
DE102022110657A1 (en) HIGH DYNAMIC RANGE IMAGE PROCESSING WITH FIXED CALIBRATION SETTINGS
US20200211200A1 (en) Method and system of annotation densification for semantic segmentation
US20230099521A1 (en) 3d map and method for generating a 3d map via temporal and unified panoptic segmentation
CN113763231B (en) Model generation method, image perspective determination method, device, equipment and medium
CN112651881A (en) Image synthesis method, apparatus, device, storage medium, and program product
DE102022117298A1 (en) COMBINATION QUALITY ASSESSMENT FOR ALL-ROUND VISION SYSTEMS
CN111382647B (en) Picture processing method, device, equipment and storage medium
DE102021125897A1 (en) HISTORY BLOCKING TO DENOISE DYNAMIC RAY TRACING SCENES USING TEMPORAL ACCUMULATION
Tabata et al. Analyzing CARLA’s performance for 2D object detection and monocular depth estimation based on deep learning approaches
CN117252947A (en) Image processing method, image processing apparatus, computer, storage medium, and program product
CN116844129A (en) Road side target detection method, system and device for multi-mode feature alignment fusion
CN113870405B (en) Texture map selection method for three-dimensional scene reconstruction and related device
DE112022001485T5 (en) METHODS AND DEVICES FOR SYNTHESIZING SIX DEGREES OF FREEDOM VIEWS FROM SPARE RGB DEPTH INPUTS
Kim et al. Real-time human segmentation from RGB-D video sequence based on adaptive geodesic distance computation
CN115249215A (en) Image processing method, image processing device, electronic equipment and readable storage medium
CN114120260A (en) Method and system for identifying travelable area, computer device, and storage medium
Garcia-Dopico et al. Locating moving objects in car-driving sequences
CN117036895B (en) Multi-task environment sensing method based on point cloud fusion of camera and laser radar
CN116991296B (en) Object editing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination