CN113302648A - Panoramic image generation method, vehicle-mounted image processing device and vehicle

Info

Publication number
CN113302648A
CN113302648A (application CN202180001139.6A)
Authority
CN
China
Prior art keywords
image
vehicle
point
coordinate
model
Prior art date
Legal status
Granted
Application number
CN202180001139.6A
Other languages
Chinese (zh)
Other versions
CN113302648B (en)
Inventor
陈晓丽
张峻豪
黄为
王笑悦
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN113302648A
Application granted granted Critical
Publication of CN113302648B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4038 Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images

Abstract

A panoramic image generation method, a vehicle-mounted image processing device and a vehicle are provided. The method in the embodiments of the application comprises the following steps: acquiring image information and depth information of objects around the vehicle, wherein the depth information indicates coordinate point information of each point on the surrounding objects; obtaining an initial model; adjusting a first coordinate point on the initial model to a second coordinate point according to the depth information to generate a first model; and generating a panoramic image from the image information and the first model. In this embodiment, the position of the first coordinate point is adjusted in real time according to the depth information of the objects around the vehicle. Because the position of the first coordinate point on the initial model is adjusted in real time, that is, the first model is obtained according to the actual distance between the objects around the vehicle and the vehicle, stitching ghosting and misalignment in the panoramic image are eliminated, and a panoramic image consistent with the actual environment around the vehicle is obtained.

Description

Panoramic image generation method, vehicle-mounted image processing device and vehicle
Technical Field
The present disclosure relates to the field of vehicle-mounted panoramic viewing technologies, and in particular, to a method for generating a panoramic image, a vehicle-mounted image processing apparatus, and a vehicle.
Background
Generally, a traditional image-based reversing system mounts a camera only at the tail of the vehicle. Such a camera covers only a limited area around the rear of the vehicle, and the blind areas around the sides and the front of the vehicle increase the hidden dangers to safe driving. To enlarge the driver's field of view and improve driving safety, vehicle-mounted surround-view systems have been developed.
The vehicle-mounted surround-view system uses cameras arranged around the vehicle to reconstruct the vehicle and the surrounding scene, and performs view-angle transformation and image stitching on the captured images to generate a 3D panoramic image. In the prior art, a vehicle-mounted surround-view system first calibrates the installed 4-way fisheye cameras; after obtaining the intrinsic and extrinsic parameters of the cameras, it constructs a three-dimensional bowl-shaped fixed model (as shown in FIG. 1A), maps the points on the 3D fixed model through the intrinsic and extrinsic parameters to obtain pixel coordinates, and then maps the pixels at the corresponding positions of the fisheye images onto the bowl-shaped fixed model to obtain the 3D panoramic image.
Current 3D panoramic images are generated based on a 3D fixed model, which is a simulated rendering of the real objects around the vehicle. When the distance between the 3D fixed model and the virtual vehicle is not equal to the distance between the vehicle and the actual objects, stitching ghosting and misalignment occur in the stitching area of two images, which reduces the detection rate of obstacles around the vehicle or creates detection blind areas and reduces driving safety.
Disclosure of Invention
The embodiments of the application provide a panoramic image generation method, a vehicle-mounted image processing device and a vehicle, which are used for eliminating stitching ghosting and misalignment in the panoramic image and obtaining a panoramic image consistent with the actual environment around the vehicle.
In a first aspect, an embodiment of the present application provides a method for generating a panoramic image, which is applied to a vehicle-mounted image processing apparatus and includes: the vehicle-mounted image processing device acquires image information and depth information of objects around the vehicle, wherein the depth information indicates coordinate point information of each point on the surrounding objects; it obtains an initial model, for example a bowl-shaped 3D model or a cylindrical 3D model; it adjusts a first coordinate point on the initial model to a second coordinate point according to the depth information to generate a first model; and it acquires a panoramic image based on the image information and the first model. The vehicle-mounted image processing device adjusts the position of the first coordinate point in real time according to the depth information of the objects around the vehicle. Because the position of the first coordinate point on the initial model is adjusted in real time, that is, the first model is obtained according to the actual distance between the objects around the vehicle and the vehicle, the shape of the first model may be irregular and changes as the distance between the vehicle and the surrounding objects changes. In the embodiment of the application, when the vehicle-mounted image processing device creates the 3D model, depth information is introduced, and the device adjusts the coordinate points on the initial model according to the depth information to obtain the first model. The device generates a virtual first model corresponding to the objects around the vehicle in the real world, and generates a virtual vehicle model based on the vehicle in the real world; that is, when the distance between the objects around the vehicle and the vehicle changes, the distance between the vehicle model and the coordinate points corresponding to the objects on the first model also changes. Stitching ghosting and misalignment in the panoramic image are thereby eliminated, so that a 3D panoramic image consistent with the actual environment around the vehicle is obtained.
In an optional implementation manner, adjusting the first coordinate point on the initial model to the second coordinate point according to the depth information to generate the first model may include: first, the vehicle-mounted image processing device converts the pixel points in the image information into a first point cloud in the camera coordinate system according to the pixel points and the depth information corresponding to the pixel points; then, it converts the first point cloud in the camera coordinate system into a second point cloud in the world coordinate system; and finally, it adjusts the first coordinate point to the second coordinate point using the coordinate points in the second point cloud to generate the first model, wherein the second coordinate point is obtained from the coordinate points in the second point cloud.
In an optional implementation, the image information includes images acquired by a plurality of image sensors, and after converting the first point cloud in the camera coordinate system into the second point cloud in the world coordinate system, the method further includes: the vehicle-mounted image processing device stitches, in the world coordinate system, the plurality of second point clouds of the images acquired by the plurality of image sensors to obtain a target point cloud; the coordinate points on this target point cloud correspond to the actual distances from the objects around the vehicle to the image sensors. Adjusting the first coordinate point to the second coordinate point may include: first, the vehicle-mounted image processing device determines a plurality of third coordinate points within the neighborhood range of the first coordinate point, where the third coordinate points are coordinate points on the target point cloud; then it determines the second coordinate point according to the plurality of third coordinate points, that is, the second coordinate point is obtained from the plurality of third coordinate points in the neighborhood of the first coordinate point; and finally it adjusts the first coordinate point to the second coordinate point. In this embodiment, the target point cloud is obtained based on the actual distance between the objects around the vehicle and the vehicle. The vehicle-mounted image processing device determines, among the large number of scattered points in the target point cloud, a plurality of third coordinate points within the neighborhood range of the first coordinate point on the initial model, and then determines the second coordinate point according to these third coordinate points, so that the point on the initial model can be adjusted to the second coordinate point and the device accurately reconstructs the first model according to the actual distance between the objects around the vehicle and the vehicle.
In an optional implementation manner, stitching the plurality of second point clouds of the images acquired by the plurality of image sensors in the world coordinate system to obtain the target point cloud may include: first, the vehicle-mounted image processing device matches the overlapping areas of a first image and a second image acquired by two adjacent image sensors to obtain a rotation matrix and a translation matrix for the transformation between the point cloud of the first image and the point cloud of the second image; then, it transforms the point cloud of the second image using the rotation matrix and the translation matrix, and stitches the transformed point cloud of the second image with the point cloud of the first image. In this embodiment, because the poses of two adjacent cameras are different, the images acquired by the two image sensors differ by a small angle and direction. Matching the overlapping regions (the same scenery) of the images acquired by the two image sensors finds this difference in angle and direction, which can then be compensated by rotation and translation; the point clouds of the images acquired by the plurality of image sensors are then stitched to obtain a whole target point cloud, and the first coordinate point on the initial model is adjusted by the points on the target point cloud, so that the 3D model can be accurately reconstructed.
In an optional implementation, after generating the first model, the method further includes: the vehicle-mounted image processing device performs interpolation smoothing on the first model to obtain a second model, and then performs texture mapping on the second model according to the image information to generate the panoramic image. In this embodiment, the interpolated second model is a 3D model with a smooth surface, and performing texture mapping on the smooth-surfaced second model improves the rendering effect of the first model.
In an alternative implementation, in the real world the objects around the vehicle include a first object and a second object; the distance between the vehicle and the first object is a first distance, and the distance between the vehicle and the second object is a second distance. The position of the vehicle is mapped to a first position in the panoramic image, the position of the first object is mapped to a second position in the panoramic image, and the position of the second object is mapped to a third position in the panoramic image. When the first distance is greater than the second distance, the method further includes: the vehicle-mounted image processing apparatus displays the panoramic image, in which the distance between the first position and the second position is greater than the distance between the first position and the third position. In this embodiment, when the distance between an object around the vehicle and the vehicle changes, the distance between the vehicle model and the coordinate points corresponding to the object on the first model also changes, so that a panoramic image consistent with the actual environment around the vehicle is obtained, stitching ghosting and misalignment are eliminated, and the detection accuracy in the stitching region and the driver's experience are improved.
In a second aspect, an embodiment of the present application provides a vehicle-mounted surround-view device, including: an acquisition module, configured to acquire image information and depth information of objects around the vehicle, wherein the depth information indicates coordinate point information of each point on the surrounding objects; and a processing module, configured to obtain an initial model, adjust a first coordinate point on the initial model to a second coordinate point according to the depth information to generate a first model, and acquire a panoramic image based on the image information and the first model.
In an optional implementation manner, the processing module is further specifically configured to: converting the pixel points in the image information into a first point cloud under a camera coordinate system according to the pixel points in the image information and depth information corresponding to the pixel points; converting the first point cloud under the camera coordinate system into a second point cloud under the world coordinate system; and adjusting the first coordinate point to a second coordinate point through the coordinate point in the second point cloud to generate a first model, wherein the second coordinate point is obtained according to the coordinate point in the second point cloud.
In an optional implementation manner, the image information includes images acquired by a plurality of image sensors, and the processing module is further specifically configured to: splicing a plurality of second point clouds of a plurality of images acquired by a plurality of image sensors under a world coordinate system to obtain a target point cloud; determining a plurality of third coordinate points in the neighborhood range of the first coordinate point, wherein the third coordinate points are coordinate points on the target point cloud; determining a second coordinate point according to the plurality of third coordinate points; and adjusting the first coordinate point to a second coordinate point.
In an optional implementation manner, the processing module is further specifically configured to: matching overlapping areas of a first image and a second image acquired by two adjacent image sensors to obtain a rotation matrix and a translation matrix for conversion between point clouds of the first image and the second image; and transforming the point cloud of the second image by using the rotation matrix and the translation matrix, and splicing the transformed point cloud of the second image and the point cloud of the first image.
In an optional implementation manner, the processing module is further specifically configured to: carrying out interpolation smoothing treatment on the first model to obtain a second model; and performing texture mapping on the second model according to the image information to generate a panoramic image.
In an optional implementation manner, the objects around the vehicle comprise a first object and a second object; the distance between the vehicle and the first object is a first distance, and the distance between the vehicle and the second object is a second distance; the position of the vehicle is mapped to a first position in the panoramic image, the position of the first object is mapped to a second position in the panoramic image, and the position of the second object is mapped to a third position in the panoramic image. When the first distance is greater than the second distance, the device further comprises a display module, configured to display the panoramic image, in which the distance between the first position and the second position is greater than the distance between the first position and the third position.
In a third aspect, an embodiment of the present application provides an in-vehicle image processing apparatus, including a processor coupled with a memory, the memory being configured to store a program or instructions, which when executed by the processor, cause the in-vehicle image processing apparatus to perform the method according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a vehicle-mounted surround-view system, including a sensor, a vehicle-mounted display, and the vehicle-mounted image processing apparatus according to the third aspect, where the sensor and the vehicle-mounted display are both connected to the vehicle-mounted image processing apparatus; the sensor is configured to collect image information and depth information, and the vehicle-mounted display is configured to display the panoramic image.
In a fifth aspect, an embodiment of the present application provides a vehicle including the vehicle-mounted surround-view system described in the fourth aspect above.
In a sixth aspect, the present application provides a computer program product, which includes computer program code, and when the computer program code is executed by a computer, the computer is enabled to implement the method according to any one of the above first aspects.
In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium for storing a computer program or instructions, where the computer program or instructions, when executed, cause a computer to execute the method of any one of the above first aspects.
In an eighth aspect, an embodiment of the present application provides a chip, including a processor and a communication interface, where the processor is configured to read an instruction to perform the method of any one of the first aspect.
Drawings
FIG. 1A is a schematic perspective view of a 3D model;
FIG. 1B is a schematic side view of a 3D model;
FIG. 2A is a schematic diagram illustrating the occurrence of ghosting in a panoramic image in a conventional method;
FIG. 2B is a schematic diagram illustrating stitching dislocation in a panoramic image in a conventional method;
FIG. 3 is a schematic structural diagram of a vehicle-mounted surround-view system according to an embodiment of the present application;
FIG. 4 is a diagram of a world coordinate system, a camera coordinate system, an image coordinate system, and a pixel coordinate system according to an embodiment of the present application;
fig. 5 is a schematic flowchart illustrating steps of a method for generating a panoramic image according to an embodiment of the present application;
fig. 6 is a schematic diagram illustrating a visualization effect of converting an image with depth information into a point cloud in the embodiment of the present application;
FIG. 7 is a schematic diagram of matching an overlapping region in a first image with an overlapping region in a second image according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of stitching of images acquired by adjacent camera sensors in an embodiment of the present application;
FIG. 9A is a schematic perspective view of a third coordinate point in a neighborhood of the first coordinate point on the initial model according to an embodiment of the present disclosure;
FIG. 9B is a schematic top view of a third coordinate point in the neighborhood of the first coordinate point on the initial model in the embodiment of the present application;
fig. 9C and 9D are schematic top views of the first model obtained by adjusting the first coordinate point on the initial model in the embodiment of the present application;
FIG. 10 is a schematic view of a real-world vehicle and surrounding objects and a scene of the vehicle and surrounding objects in a panoramic image according to an embodiment of the present application;
FIG. 11 is a diagram illustrating a first model interpolation process according to an embodiment of the present disclosure;
FIG. 12 is a schematic structural diagram of an embodiment of an in-vehicle image processing apparatus according to an embodiment of the present application;
fig. 13 is a schematic structural diagram of another embodiment of an in-vehicle image processing apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. The terms "first," "second," and the like in the description and claims of the present application and in the drawings are used for distinguishing between objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances.
The vehicle-mounted surround-view system can acquire images captured by a plurality of cameras mounted on the vehicle body, and performs view-angle transformation and image stitching on the acquired images to generate a panoramic image. Illustratively, the panoramic image is a 360-degree panoramic image of the vehicle's surroundings, or a 720-degree panoramic image. Generally, cameras are arranged around the vehicle body; image information of objects around the vehicle is collected by cameras arranged in the front, rear, left and right directions of the vehicle body, and the images collected by every two adjacent cameras are stitched and then mapped onto a constructed 3D model of fixed size, the 3D model being a simulated representation of the real objects around the vehicle. Referring to FIG. 1B, XOZ represents a plane coordinate system, Z = 0 represents the ground plane, and R represents the radius from the Z axis to the wall of the 3D model. Since the distance between the vehicle and the surrounding objects changes while the vehicle is driving, while the size of the 3D model is fixed, when the actual distance between the vehicle and a surrounding object is smaller than R, the images acquired by two adjacent cameras overlap, which results in ghosting in the image stitching area (as shown in FIG. 2A); when the distance between the vehicle and a surrounding object is greater than R, the images acquired by two adjacent cameras leave a blind area, which results in stitching misalignment in the stitching area (as shown in FIG. 2B). It should be understood that a camera can also be mounted on the vehicle body for capturing images of the area under the vehicle, for example a camera mounted on the chassis, or a camera mounted around the vehicle whose view angle covers the positions the vehicle bottom passes over.
To address the above problem, an embodiment of the application provides a method for generating a panoramic image, applied to a vehicle-mounted surround-view system. Referring to FIG. 3, the vehicle-mounted surround-view system includes a sensor 301, a vehicle-mounted image processing device 302, and a vehicle-mounted display 303; the sensor 301 is connected to the vehicle-mounted image processing device 302, and the vehicle-mounted image processing device 302 is connected to the vehicle-mounted display 303. The sensor 301 is used to collect image information and depth information of objects around the vehicle. The vehicle-mounted image processing device 302 first obtains the depth information of the objects around the vehicle and creates an initial model, then adjusts the position of a first coordinate point on the initial model according to the depth information to generate a first model, and finally generates a panoramic image by texture-mapping the first model based on the image information. The vehicle-mounted image processing device 302 outputs the panoramic image to the vehicle-mounted display 303, which displays it. In the embodiment of the application, depth information is introduced when the 3D model is created: the first coordinate point on the initial model is adjusted to the second coordinate point according to the depth information to obtain the first model, so that the distance between an object around the vehicle (corresponding to the first model) and the vehicle (corresponding to the vehicle model) is essentially equal to the distance between the vehicle model and the first model. Because the first model is accurately reconstructed according to the depth information, stitching ghosting and misalignment in the panoramic image are eliminated, a 3D panoramic image consistent with the actual environment around the vehicle is obtained, and the detection accuracy in the stitching region and the driver's experience are improved.
To better understand the present application, some terms used in the present application are explained below.
Depth information may be used to indicate the three-dimensional coordinate information of various points on the detected object. Depth information is also commonly referred to simply as depth; in the field of machine vision, for ease of computation, it refers to the distance of points in space relative to the camera sensor. In the embodiment of the present application, the depth information of an object around the vehicle refers to the distance from the camera sensor to a three-dimensional coordinate point on the object, that is, the distance from the three-dimensional coordinate point on the object around the vehicle to the position on the vehicle body where the sensor is located.
Referring to FIG. 4, FIG. 4 shows the world coordinate system (O_w-X_wY_wZ_w), the camera coordinate system (O_c-X_cY_cZ_c), the image coordinate system (o-xy) and the pixel coordinate system (uv). The point P is a point in the world coordinate system, i.e. a point in the real environment. The point p is the imaging point of the point P; the coordinates of p in the image coordinate system are (x, y), and its coordinates in the pixel coordinate system are (u, v). The origin o of the image coordinate system lies on the Z_c axis of the camera coordinate system, and the distance between the origin o of the image coordinate system and the origin O_c of the camera coordinate system is f, where f is the camera focal length. The 4 coordinate systems shown in FIG. 4 are explained below.
The world coordinate system, also referred to as the measurement coordinate system, is a three-dimensional coordinate system, which may be, for example, a three-dimensional orthogonal coordinate system, a cylindrical coordinate system, a spherical coordinate system, or the like. In the embodiment of the present application, the world coordinate system may use a three-dimensional orthogonal coordinate system (X_w, Y_w, Z_w). The spatial positions of the camera, the vehicle and the objects around the vehicle can be described in the world coordinate system. The position of the world coordinate system is determined according to the actual situation. In the present application, for convenience of processing, the world coordinate system is chosen to be centered on the vehicle (the vehicle position is located at the origin O_w of the world coordinate system), the Z_w axis is perpendicular to the ground, the X_w axis represents the vehicle's forward direction, and the coordinate system may move with the movement of the vehicle. The unit of the world coordinate system may be meters (m).
The camera coordinate system is a three-dimensional rectangular coordinate system (X_c, Y_c, Z_c). The origin of the camera coordinate system is the optical center of the lens, the X_c and Y_c axes are parallel to the sides of the image plane, and the Z_c axis is the optical axis of the lens, perpendicular to the image plane. The unit of the camera coordinate system may be meters (m).
In the pixel coordinate system, the coordinates of the camera coordinate system are transformed once (normalized so that z = 1) to obtain the commonly used planar pixel coordinates. The pixel coordinate system is in units of pixels, with its coordinate origin at the upper left corner of the image.
The relationship of the image coordinate system and the pixel coordinate system may be: the origin of the image coordinate system is the midpoint of the pixel coordinate system. The unit of the image coordinate system may be millimeters (mm).
The transformation between the world coordinate system and the camera coordinate system is a rigid transformation, i.e. only the spatial position (translation) and orientation (rotation) of the object are changed, without changing the shape of the object. The transformation can be represented by a rotation and a translation. There is no rotation between the image coordinate system and the pixel coordinate system; only their coordinate origins differ.
Homogeneous coordinates represent an n-dimensional vector with an (n+1)-dimensional vector and refer to a coordinate system used in projective geometry, in the way Cartesian coordinates are used in Euclidean geometry. For example, the homogeneous coordinates of a two-dimensional point (x, y) are represented as (hx, hy, h). The homogeneous representation of a vector is not unique: the same point is represented by different values of h in homogeneous coordinates; for example, the homogeneous coordinates (8,4,2) and (4,2,1) both represent the two-dimensional point (4,2). The purpose of introducing homogeneous coordinates is mainly to combine multiplication and addition in matrix operations: they provide a method for transforming a set of points in a two-dimensional, three-dimensional or even higher-dimensional space from one coordinate system to another by matrix operations, and facilitate affine geometric transformations in computer graphics. It will be appreciated that with homogeneous coordinates both rotation and translation can be described using a single matrix, with matrix multiplication expressing the rotation and translation of the object.
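As an illustrative aside (not part of the application text), the following short Python/NumPy sketch shows how a rigid transform (rotation plus translation) is expressed as a single 4x4 matrix acting on homogeneous coordinates; the angle and translation values are arbitrary examples.

```python
import numpy as np

# A rigid transform (rotation + translation) written as a single 4x4 matrix
# acting on homogeneous coordinates. The angle and translation are arbitrary.
theta = np.deg2rad(30.0)                      # rotate 30 degrees about the Z axis
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([1.0, 2.0, 0.0])                 # translate by (1, 2, 0)

T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = t

P_h = np.array([4.0, 2.0, 0.0, 1.0])          # homogeneous coordinates of the point (4, 2, 0)
print((T @ P_h)[:3])                          # rotation and translation in one multiplication
```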
The intrinsic parameters of the camera may be parameters related to the characteristics of the camera itself, such as the focal length, pixel size, etc. of the camera.
The extrinsic parameters of the camera may be parameters in a world coordinate system, such as the position, rotation direction, etc. of the camera.
The point cloud may be a massive point set that expresses target spatial distribution and target surface characteristics in the same spatial reference system, and after acquiring the spatial coordinates of each sampling point on the surface of the object, a point set is obtained, which is called a "point cloud".
Texture can be the pixel feature information on a two-dimensional image; when the texture is mapped onto the surface of an object in a specific way, it is also called a texture map.
Referring to FIG. 5, the execution subject of the method is the vehicle-mounted surround-view system in FIG. 3, or the vehicle-mounted image processing apparatus in FIG. 3, or a processor or chip in the vehicle-mounted image processing apparatus.
Step 501, the vehicle-mounted surround-view system acquires image information and depth information of objects around the vehicle.
A plurality of cameras are arranged around the vehicle; for example, the cameras are wide-angle cameras (such as fisheye cameras), and at least one fisheye camera is arranged in each of the front, rear, left and right directions of the vehicle. The fisheye cameras collect image information around the vehicle in real time. A fisheye camera is a special ultra-wide-angle lens constructed to image in a way that imitates a fish's eye; it can capture a large angle on its own, with a view angle that can even reach 180 degrees, so that objects in a large scene can be monitored. By arranging one fisheye camera in each direction around the vehicle, a panoramic picture of the vehicle's surroundings can be collected with only 4 fisheye cameras, which saves cameras and reduces cost. Of course, in the embodiment of the present application the camera is not limited to a wide-angle camera; ordinary cameras may also be used, with the overall viewing angle increased by increasing their number, for example 2 or 3 cameras in each direction around the vehicle. Although the number of cameras increases, the image information collected by ordinary cameras is not distorted, and the collected images look better. In the embodiment of the present application, the number of cameras is 4, that is, 1 fisheye camera is arranged in each of the front, rear, left and right directions of the vehicle.
There are the following two implementations for the vehicle-mounted surround-view system to acquire the depth information of the objects around the vehicle. In a first implementation, the sensor 301 is the image sensor in a camera. The image sensor collects image information of the objects around the vehicle and transmits it to the vehicle-mounted image processing device, which obtains the depth information of the objects around the vehicle from the image information. For example, the depth information is obtained by the vehicle-mounted image processing device performing depth estimation on the image information acquired by the fisheye cameras; methods for obtaining the depth information include, but are not limited to, monocular depth estimation, binocular depth estimation, and the like. For example, the monocular depth estimation method may use single-view image data as the input of a trained depth model and use the depth model to output the depth corresponding to each pixel in the image; that is, the deep-learning-based monocular estimation method infers the depth relationship from the relationship among pixel values and maps the image information into a depth map. The binocular depth estimation method captures left and right viewpoint images of the same scene with a binocular camera, obtains a disparity map using a stereo matching algorithm, and then obtains the depth map. A depth map, also called a distance map, is an image whose pixel values are the distances (depths) from the image sensor (or camera) to points in the scene; for example, a warmer hue in a depth map indicates a smaller depth value, and a cooler hue indicates a larger depth value. In the first implementation, images of the scene around the vehicle and depth information of the objects around the vehicle can be acquired using only the image sensor, without adding other components for acquiring depth information, which saves cost.
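As a hedged illustration of the two estimation routes above, the sketch below assumes a trained monocular depth network is available as a generic callable `depth_model` (a hypothetical stand-in; the application does not name a specific network), and uses the standard disparity-to-depth relation for the binocular route.

```python
import numpy as np

def depth_from_monocular(image_rgb, depth_model):
    """Monocular route: a trained depth network maps each pixel to a depth value.
    `depth_model` is a hypothetical callable (HxWx3 image -> HxW depth array);
    the application does not prescribe a specific network."""
    return np.asarray(depth_model(image_rgb), dtype=np.float32)

def depth_from_disparity(disparity, fx, baseline_m):
    """Binocular route: depth = focal_length * baseline / disparity, where the
    disparity map comes from a stereo matching algorithm."""
    disparity = np.maximum(np.asarray(disparity, dtype=np.float32), 1e-6)  # avoid divide-by-zero
    return fx * baseline_m / disparity
```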
In a second implementation, the sensors 301 further include a depth sensor for collecting depth information of the objects around the vehicle. Depth sensors include, but are not limited to, millimeter-wave radar, laser radar, and the like. For example, a laser radar obtains the depth information of the objects around the vehicle by emitting laser pulses into the space at certain time intervals, recording for each scanning point the time interval from emission until the signal is reflected by an object in the detected scene back to the radar, and calculating the distance between the object surface and the radar from that time interval. For another example, the depth information is obtained by a millimeter-wave radar that emits a high-frequency continuous signal; the signal is reflected after encountering an object around the vehicle, and the receiver receives the reflected signal. There is a certain time interval t from the transmission of the signal to the reception of the reflected signal, and t = 2d/c, where d is the distance from the radar to the object and c is the speed of light. In the embodiment of the present application, the above two methods for acquiring depth information are only exemplary descriptions and do not limit the specific method for acquiring depth information. In the second implementation, the depth sensor collects the depth information of the objects around the vehicle and transmits it to the vehicle-mounted image processing device, which saves the step of depth estimation from the image information and saves computing power of the vehicle-mounted image processing device. In the embodiment of the present application, the method for acquiring depth information in the first implementation is taken as the example for description.
Step 502, the vehicle-mounted surround-view system obtains an initial model.
Illustratively, the vehicle-mounted image processing apparatus generates an initial model, which is a bowl-shaped 3D model, a cylindrical 3D model, or the like. Alternatively, the initial model may be a smooth solid model, or it may be a scatter model.
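The following sketch shows one possible way to generate a bowl-shaped scatter model: a flat disc around the vehicle with walls that rise beyond a chosen radius. The shape parameters (`r_flat`, `r_max`, point counts, wall curvature) are illustrative assumptions, not values specified in the application.

```python
import numpy as np

def build_bowl_model(r_flat=5.0, r_max=10.0, n_radial=50, n_angular=120):
    """Return an (N, 3) array of scatter points forming a bowl-shaped initial model:
    flat ground (Z = 0) out to r_flat, then a wall that rises beyond it."""
    points = []
    for r in np.linspace(0.1, r_max, n_radial):
        z = 0.0 if r <= r_flat else 0.4 * (r - r_flat) ** 2  # wall height beyond the flat disc
        for phi in np.linspace(0.0, 2.0 * np.pi, n_angular, endpoint=False):
            points.append((r * np.cos(phi), r * np.sin(phi), z))
    return np.array(points)

initial_model = build_bowl_model()  # scatter model centered on the vehicle (world origin)
```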
It should be noted that there is no time-sequential limitation between the above steps 501 and 502, and the step 502 may be executed before the step 501, or the step 502 may be executed after the step 501, and the step 502 and the step 501 may also be executed synchronously.
Step 503, the vehicle-mounted surround-view system adjusts the first coordinate point on the initial model to the second coordinate point according to the depth information to obtain the first model.
First, please refer to FIG. 6, which is a schematic diagram of a point cloud. The vehicle-mounted image processing device converts the pixel point coordinates in the image information into first point cloud coordinates in the camera coordinate system according to the acquired depth information, that is, according to the pixel point coordinates in the image information and the depth information corresponding to those pixel points. The image information of the surrounding scene obtained in step 501 includes a plurality of pixels, for example a first pixel (u_i, v_i), and the depth information of the surrounding scene is also obtained in step 501. The vehicle-mounted image processing device converts the point in the pixel coordinate system into the point cloud coordinates (x_i, y_i, z_i) in the camera coordinate system according to the first pixel point and the depth information corresponding to it, as shown in the following formula (1).
x_i = (u_i - u_0) * z_i / f_x,  y_i = (v_i - v_0) * z_i / f_y,  z_i = d_i    (1)
where d_i is the depth value corresponding to the pixel (u_i, v_i), u_0 and v_0 represent the coordinates of the optical center in the image coordinate system, f_x denotes the focal length in the horizontal direction, and f_y denotes the focal length in the vertical direction.
Then, the vehicle-mounted image processing apparatus converts the first point cloud coordinates (x_i, y_i, z_i) in the camera coordinate system into second point cloud coordinates (X_i, Y_i, Z_i) in the world coordinate system. In this embodiment, to distinguish the point cloud coordinates in the camera coordinate system from those in the world coordinate system, the former are referred to as "first point cloud coordinates" and the latter as "second point cloud coordinates". Specifically, the conversion from the first point cloud coordinates to the second point cloud coordinates is given by the following formulas (2) and (3).
(X_i, Y_i, Z_i)^T = r_j^(-1) * ((x_i, y_i, z_i)^T - t_j)    (2)
The formula (2) can be represented by the following formula (3) in homogeneous coordinates.
(X_i, Y_i, Z_i, 1)^T = M_j^(-1) * (x_i, y_i, z_i, 1)^T,  where M_j = [r_j, t_j; 0, 1]    (3)
In the above formulas (2) and (3), r_j represents the rotation matrix of the coordinate-system transformation corresponding to the j-th camera, where j takes the values 1, 2, 3 and 4; for example, the 1st camera is the front image sensor, the 2nd camera is the left image sensor, the 3rd camera is the rear image sensor, and the 4th camera is the right image sensor. t_j represents the translation matrix of the coordinate-system transformation corresponding to the j-th camera; (*)^(-1) represents matrix inversion; the matrix M_j = [r_j, t_j; 0, 1] represents the extrinsic parameters of the j-th camera; and (X_i, Y_i, Z_i) are the coordinates, in the world coordinate system, of the three-dimensional coordinate point corresponding to the pixel (u_i, v_i).
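Putting formulas (1) to (3) together, a minimal sketch of the pixel-to-camera and camera-to-world conversions might look as follows; the intrinsics (f_x, f_y, u_0, v_0) and extrinsics (r_j, t_j) are placeholders to be supplied from calibration, and the depth value is treated as the Z coordinate in the camera frame.

```python
import numpy as np

def pixel_to_camera(u, v, depth, fx, fy, u0, v0):
    """Formula (1): back-project pixel (u, v) with its depth into camera coordinates."""
    x = (u - u0) * depth / fx
    y = (v - v0) * depth / fy
    return np.array([x, y, depth])

def camera_to_world(p_cam, r_j, t_j):
    """Formulas (2)/(3): invert the extrinsics of camera j to map a point from the
    camera coordinate system into the world coordinate system."""
    return np.linalg.inv(r_j) @ (p_cam - t_j)
```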
Next, the vehicle-mounted image processing device stitches (or connects), in the world coordinate system, the 4 point clouds of the image information acquired by the 4 image sensors, to obtain a complete panoramic point cloud (i.e. the target point cloud). For example, see steps (a) and (b) below.
(a) The vehicle-mounted image processing device matches the point clouds in the overlapping area of the images (a first image and a second image) collected by two adjacent image sensors, and thereby obtains a rotation matrix (denoted "R") and a translation matrix (denoted "T") for transforming between the first image and the second image. It should be understood that, referring to FIG. 7, the images captured by two adjacent image sensors (e.g. the front camera and the left camera, the left camera and the rear camera, the rear camera and the right camera, and the right camera and the front camera) have an overlapping area. The first image sensor acquires the first image, and the second image sensor acquires the second image. The point cloud data of the overlapping area in the first image is denoted "P", and the point cloud data of the overlapping area in the second image is denoted "Q". A set of rotation matrix R and translation matrix T is found using the objective function of formula (4) below. It should be understood that, due to the different poses of the two cameras, the images acquired by the two image sensors have slight angle and orientation differences; these differences can be found by using the overlapping areas (the same scenery) of the images acquired by the two image sensors, so that the computation error is eliminated.
(R_h, T_h) = argmin Σ_i || f(q_i) - (R_h * f(p_i) + T_h) ||^2    (4)
In formula (4), f(q_i) represents the three-dimensional coordinates of the i-th point in the point cloud data Q, and f(p_i) represents the three-dimensional coordinates of the i-th point in the point cloud data P; R_h represents the rotation matrix by which the point cloud data P is transformed, and T_h the translation matrix by which the point cloud data P is transformed; h ranges from 1 to j-1, where j is the number of image sensors. For example, with 4 image sensors, h takes the values 1, 2 and 3: the four image sensors are pairwise adjacent, and 3 groups of R and T need to be found, namely "R_1 and T_1", "R_2 and T_2" and "R_3 and T_3". As can be seen from formula (4), R_h * f(p_i) + T_h is the three-dimensional coordinate of the point f(p_i) after rotation and translation; formula (4) finds a group of R_h and T_h that minimizes the objective function, i.e. the difference between the point cloud data P after rotation and translation and the point cloud data Q is minimal, so that the point cloud data Q and the point cloud data P of the two overlapping areas are matched.
In this embodiment, the point cloud of the original image is referred to as "point cloud a" and the point cloud after the conversion is referred to as "point cloud B" in order to distinguish the point cloud of the original image from the point cloud after the rotation and translation conversion. In order to distinguish between images captured by different image sensors, an image captured by a front image sensor is referred to as "image a", an image captured by a left image sensor is referred to as "image B", an image captured by a rear image sensor is referred to as "image C", and an image captured by a right image sensor is referred to as "image D".
For example, referring to FIG. 8, the front image sensor captures image A, the left image sensor captures image B, and the overlapping area of images A and B is the image of the objects to the front-left of the vehicle. In formula (4), f(q_i) represents the point cloud data of the overlapping area in point cloud A of image A, and f(p_i) represents the point cloud data of the overlapping area in point cloud A of image B; R_1 and T_1 are obtained according to formula (4).
The vehicle-mounted image processing apparatus needs to match the overlapping area in image A with the overlapping area in image B; when formula (4) reaches its minimum value, the overlapping area in image A and the overlapping area in image B are matched. The purpose of matching the two overlapping areas is to find a set of R_1 and T_1 with which the point cloud of image A or the point cloud of image B can be transformed, so that the point clouds of the images acquired by the two adjacent image sensors can be stitched.
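For reference, the objective of formula (4) has the standard closed-form solution below when the overlapping points are already paired one-to-one (the Kabsch/Procrustes solution); an ICP-style variant would re-estimate correspondences and repeat this in a loop. This is a sketch under those assumptions, not the application's prescribed solver.

```python
import numpy as np

def align_overlap(P, Q):
    """Return R, T minimizing sum_i ||Q[i] - (R @ P[i] + T)||^2 for paired
    (N, 3) overlap point sets P and Q (Kabsch / Procrustes solution)."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                                    # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # guard against reflections
    R = Vt.T @ D @ U.T
    T = cQ - R @ cP
    return R, T
```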
(b) After obtaining R_h and T_h using formula (4), the vehicle-mounted image processing device stitches the point clouds of the images acquired by every two adjacent image sensors to obtain a complete panoramic point cloud. For example, the vehicle-mounted image processing apparatus matches the overlapping area of point cloud A of image A with the overlapping area of point cloud A of image B using formula (4) and finds a set of R_1 and T_1. After obtaining R_1 and T_1, the device may keep point cloud A of image A, acquired by the front image sensor, fixed. It then applies the rotation and translation R_1 and T_1 to point cloud A of image B, so that the whole point cloud of image B, acquired by the left image sensor, is rotated and translated in the world coordinate system to obtain point cloud B of image B, and stitches point cloud B of image B with point cloud A of image A.
Similarly, the rear image sensor captures image C. According to the point cloud of the overlapping area in point cloud B of image B and the point cloud of the overlapping area in point cloud A of image C, the vehicle-mounted image processing device finds a group of R_2 and T_2 through formula (4), where f(q_i) represents the point cloud data of the overlapping area in point cloud B of image B and f(p_i) represents the point cloud data of the overlapping area in point cloud A of image C. The device matches the overlapping area of point cloud B of image B with the overlapping area of point cloud A of image C using formula (4) and finds R_2 and T_2. Point cloud B of image B is then kept fixed, and R_2 and T_2 are used to rotate and translate point cloud A of image C, obtaining point cloud B of image C, i.e. the whole point cloud of image C after rotation and translation in the world coordinate system. The device then stitches point cloud B of image C with point cloud B of image B. Similarly, the right image sensor captures image D; according to the overlapping area in point cloud B of image C and the overlapping area in point cloud A of image D, the device finds a group of R_3 and T_3 through formula (4), where f(q_i) represents the point cloud data of the overlapping area in point cloud B of image C and f(p_i) represents the point cloud data of the overlapping area in point cloud A of image D. Then the device keeps point cloud B of image C fixed and uses R_3 and T_3 to rotate and translate point cloud A of image D, obtaining point cloud B of image D, i.e. the whole point cloud of image D, acquired by the right camera, after rotation and translation in the world coordinate system. The device stitches point cloud B of image D with point cloud B of image C, and stitches point cloud B of image D with point cloud A of image A to obtain the target point cloud. The target point cloud is the whole point cloud formed by stitching point cloud A of image A, point cloud B of image B, point cloud B of image C and point cloud B of image D. In this embodiment, by matching the overlapping regions (the same scenery) of the images acquired by two adjacent image sensors, the difference in angle and orientation between those images can be found, i.e. the difference can be compensated by rotation and translation; the point clouds of the images acquired by the image sensors can then be stitched into one complete point cloud, and the first coordinate point on the initial model is adjusted by the points on the target point cloud, so that the 3D model can be accurately reconstructed.
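A compact sketch of the chaining just described: point cloud A stays fixed, each subsequent cloud is rotated and translated with the (R, T) pair estimated against the already-stitched cloud, and everything is concatenated into the target point cloud. The helper assumes the (R, T) pairs have already been computed as in step (a).

```python
import numpy as np

def stitch_clouds(clouds, transforms):
    """Chain-stitch point clouds into one target point cloud.

    clouds:     [cloud_A, cloud_B, cloud_C, cloud_D], each an (N_k, 3) array
    transforms: [(R1, T1), (R2, T2), (R3, T3)], the pair estimated for each
                cloud's overlap with the cloud already stitched before it
    """
    stitched = [clouds[0]]                        # cloud A stays fixed
    for cloud, (R, T) in zip(clouds[1:], transforms):
        stitched.append(cloud @ R.T + T)          # apply R*p + T to every point
    return np.vstack(stitched)                    # the target point cloud
```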
Finally, the vehicle-mounted surround-view system adjusts the first coordinate point on the initial model to the second coordinate point and generates the first model. The second coordinate point is obtained from a plurality of third coordinate points in the neighborhood range of the first coordinate point, and the third coordinate points are coordinate points on the target point cloud. Referring to FIG. 9A, in the world coordinate system the Z axis is perpendicular to the ground. For a first coordinate point (X_a, Y_a, Z_a) on the initial model in the world coordinate system, Z_a represents the height from the ground. The vehicle-mounted image processing device determines, among the large number of scattered points in the target point cloud, a plurality of third coordinate points within the neighborhood range of the first coordinate point; for example, the third coordinate points are the three-dimensional coordinate points whose height is Z_a, and the set of three-dimensional coordinate points whose height is Z_a is denoted M. For example, M contains the three-dimensional coordinate points (X_1, Y_1, Z_1), (X_2, Y_2, Z_2), (X_3, Y_3, Z_3), and so on, where Z_1, Z_2 and Z_3 are all equal to Z_a. The vehicle-mounted image processing device adjusts the values of X_a and Y_a according to the X values (e.g. X_1, X_2 and X_3) and the Y values (e.g. Y_1, Y_2 and Y_3) of the three-dimensional points in the set M. It should be understood that the X and Y values of each coordinate point in M represent the actual distance between the vehicle and the scene around the vehicle, and that the three-dimensional coordinate points in the set M are the coordinate points within the neighborhood range of the first coordinate point (X_a, Y_a, Z_a). Referring to FIG. 9B, the "neighborhood range" of the first coordinate point is the intersection of a "first range" and a "second range", which are explained as follows. The "first range" can be understood as a lateral (angular) range, i.e. the range between two rays through the center point of the initial model. For example, let ray a go from the center point through the first coordinate point; rotating ray a counterclockwise around the center point by α degrees (°) gives ray b, and rotating ray a clockwise around the center point by α° gives ray c; the range between ray b and ray c is the first range. For example, α is an angle smaller than or equal to 10°, and its magnitude can be set according to actual needs. The "second range" can be understood as a longitudinal (radial) range. For example, let the distance between the center point of the initial model and the first coordinate point be R1; the second range is then the area covered by the ring formed by two concentric circles of radius R2 and radius R3, where R2 is smaller than R1 and R3 is larger than R1, R1 - R2 = g1 and R3 - R1 = g2; g1 and g2 may be equal or different and may be set according to empirical or experimental values. Further, the vehicle-mounted image processing apparatus determines the second coordinate point from the plurality of third coordinate points: illustratively, it adjusts X_a according to the X values of the three-dimensional coordinate points in the set M and adjusts Y_a according to their Y values.
The point to be adjusted is the first coordinate point (X_a, Y_a, Z_a), and the adjusted second coordinate point is (X_a', Y_a', Z_a). The adjusted X_a' and Y_a' are given by formulas (5) and (6) below.

X_a' = (1/n) * Σ_{X_b ∈ δ(X_a)} X_b    (5)

where X_a' is the adjusted X value, X_b is a point within the neighborhood of X_a, n is the number of points within the neighborhood, δ(*) denotes the neighborhood, and δ(X_a) denotes the neighborhood of X_a, i.e. the set of X values of the three-dimensional coordinate points in the set M; for example, δ(X_a) contains X_1, X_2 and X_3.

Y_a' = (1/n) * Σ_{Y_b ∈ δ(Y_a)} Y_b    (6)

where Y_a' is the adjusted Y value, Y_b is a point within the neighborhood of Y_a, n is the number of points within the neighborhood, δ(*) denotes the neighborhood, and δ(Y_a) denotes the neighborhood of Y_a, i.e. the set of Y values of the three-dimensional coordinate points in the set M; for example, δ(Y_a) contains Y_1, Y_2 and Y_3.
Referring to FIGS. 9C and 9D, the three-dimensional coordinate point on the initial model is (Xa, Ya, Za). The position of this three-dimensional coordinate point is adjusted: (Xa, Ya, Za) is adjusted to (Xa′, Ya′, Za) by the above formulas (5) and (6), so as to obtain the first model. It should be understood that this embodiment only takes the adjustment of one three-dimensional point (Xa, Ya, Za) on the initial model as an example; the other first coordinate points may be adjusted by the same method, which is not repeated herein. In practical applications, the number of first coordinate points, the positions of the first coordinate points and the positions of the adjusted second coordinate points are not limited. For example, referring to fig. 9C, the adjusted second coordinate point (Xa′, Ya′, Za) may be located "outside" the initial model; referring to fig. 9D, the adjusted second coordinate point (Xa′, Ya′, Za) may be located "inside" the initial model. The vehicle-mounted image processing device adjusts the position of the first coordinate point in real time according to the depth information of the objects around the vehicle, and the shape of the initial model changes accordingly, that is, the first model is obtained according to the actual distance between the objects around the vehicle and the vehicle. The shape of the first model may be irregular and changes as the distance between the vehicle and the surrounding objects changes. In this embodiment, the target point cloud is obtained based on the actual distance between the objects around the vehicle and the vehicle. The vehicle-mounted image processing device determines, from the large number of scattered points in the target point cloud, a plurality of third coordinate points within the neighborhood range of a first coordinate point on the initial model, determines a second coordinate point from the plurality of third coordinate points, and then adjusts the point on the initial model to the second coordinate point; in this way, the vehicle-mounted image processing device accurately reconstructs the first model according to the actual distance between the vehicle and the objects around the vehicle.
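As an illustration of this adjustment, the following is a minimal sketch in Python/NumPy of selecting the set M inside the neighborhood range (the intersection of the first range and the second range, at height Za) and averaging its X and Y values according to formulas (5) and (6). It assumes the target point cloud is an N×3 array in the world coordinate system with the model center at the origin; the function name, the default parameter values and the height tolerance are illustrative and are not taken from the patent.

```python
import numpy as np

def adjust_model_point(p, cloud, alpha_deg=10.0, g1=0.5, g2=0.5):
    """Adjust one initial-model point (Xa, Ya, Za) toward the target point cloud.

    p     : (3,) array, the first coordinate point on the initial model
    cloud : (N, 3) array, the target point cloud in world coordinates
    alpha_deg, g1, g2 : lateral half-angle and radial margins defining the
                        neighborhood range (illustrative defaults)
    """
    xa, ya, za = p
    r1 = np.hypot(xa, ya)                 # distance R1 from the model center in the XY plane
    theta = np.arctan2(ya, xa)            # bearing of the point as seen from the center

    cx, cy, cz = cloud[:, 0], cloud[:, 1], cloud[:, 2]
    radius = np.hypot(cx, cy)
    bearing = np.arctan2(cy, cx)
    dtheta = np.abs((bearing - theta + np.pi) % (2 * np.pi) - np.pi)

    # "First range": within +/- alpha degrees of the ray through (Xa, Ya).
    # "Second range": inside the ring with radii R2 = R1 - g1 and R3 = R1 + g2.
    # Points whose height is close to Za form the set M of third coordinate points.
    mask = (dtheta <= np.deg2rad(alpha_deg)) \
         & (radius >= r1 - g1) & (radius <= r1 + g2) \
         & (np.isclose(cz, za, atol=0.1))
    m = cloud[mask]
    if len(m) == 0:
        return np.asarray(p)              # no neighborhood points: leave the point unchanged

    # Formulas (5) and (6): average the X and Y values of the neighborhood points.
    xa_new = m[:, 0].mean()
    ya_new = m[:, 1].mean()
    return np.array([xa_new, ya_new, za])  # the height Za is kept
```

Applying this function to every first coordinate point on the initial model yields the deformed first model described above.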
In step 504, the vehicle-mounted surround view system acquires a panoramic image based on the image information and the first model.
The vehicle-mounted surround view system performs texture mapping on the first model according to the image information to generate a 3D panoramic image (also called a panoramic image).
As described in step 501, the 4 image sensors capture images in 4 directions around the vehicle and transmit them to the vehicle-mounted image processing device, which may perform the mapping in a texture mapping manner. Illustratively, the vehicle-mounted image processing device obtains the intrinsic and extrinsic parameters of each camera through calibration in advance, maps the three-dimensional coordinates of the optimized first model through the extrinsic and intrinsic parameters to obtain two-dimensional pixel coordinates, and acquires the corresponding texture pixels from the image (also called a "texture image") captured by the fisheye camera, thereby mapping the coordinate points in the image onto the surface of the first model. In other words, each pixel on the texture image is mapped to a point on the first model for rendering and color filling, so that the whole texture image is overlaid on the first model to obtain the 3D panoramic image.
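The following sketch illustrates the extrinsic/intrinsic mapping step described above. For brevity it assumes a simple pinhole projection (the patent's fisheye cameras would additionally require the corresponding distortion model); the function name and array shapes are illustrative.

```python
import numpy as np

def project_model_vertices(vertices_w, R, t, K):
    """Map 3D model vertices (world frame) to 2D texture-image pixel coordinates.

    vertices_w : (N, 3) model vertices in world coordinates
    R, t       : camera extrinsics (world -> camera rotation and translation)
    K          : 3x3 camera intrinsic matrix
    """
    pts_cam = vertices_w @ R.T + t        # extrinsic mapping: world -> camera frame
    pts_cam = pts_cam[pts_cam[:, 2] > 0]  # keep only points in front of the camera
    uv = pts_cam @ K.T                    # intrinsic mapping
    uv = uv[:, :2] / uv[:, 2:3]           # perspective division -> pixel coordinates
    return uv

# For each vertex, the resulting (u, v) pixel indexes the texture image, and the
# sampled color is used to render and fill that point of the first model.
```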
In step 505, the vehicle-mounted surround view system outputs the panoramic image.
The in-vehicle image processing device outputs the 3D panoramic image to an in-vehicle display, and the in-vehicle display displays the 3D panoramic image.
In the embodiment of the application, when the vehicle-mounted image processing device creates the 3D model, depth information is introduced, and the vehicle-mounted image processing device adjusts the coordinate points on the initial model according to the depth information to obtain the first model. The vehicle-mounted image processing device generates the virtual first model corresponding to the real-world objects around the vehicle, and generates a virtual vehicle model corresponding to the real-world vehicle. For example, referring to fig. 10, in the real world the objects around the vehicle include a first object and a second object; the distance between the vehicle and the first object is a first distance, and the distance between the vehicle and the second object is a second distance. The position of the vehicle is mapped to a first position in the panoramic image, the position of the first object is mapped to a second position in the panoramic image, and the position of the second object is mapped to a third position in the panoramic image. When the first distance is greater than the second distance, the distance between the first position and the second position in the panoramic image is greater than the distance between the first position and the third position. When the distance between an object around the vehicle and the vehicle changes, the distance between the corresponding coordinate points on the vehicle model and on the first model also changes, so that a 3D panoramic image consistent with the actual environment around the vehicle is obtained, stitching ghosting and misalignment are eliminated, and the detection precision in the stitching regions and the driver's experience are improved.
Optionally, after step 503 and before step 504, the following steps are further included: the vehicle-mounted image processing device performs interpolation smoothing processing on scattered points on the first model to obtain a second model, wherein the second model is a smoothed model; in step 504, the in-vehicle image processing apparatus performs texture mapping on the second model based on the image information to generate a 3D panoramic image.
The first model is composed of a large number of scattered points. If the vehicle-mounted image processing device directly connects these scattered points, the resulting model surface is uneven, and texture mapping onto an uneven 3D model gives the generated panoramic image a poor visual effect. Therefore, the vehicle-mounted image processing device performs interpolation processing on the first model. As shown in fig. 11, which is a schematic diagram of the interpolation processing, a continuous surface is fitted through the scattered points of the 3D scatter model by interpolation, so that the whole surface of the 3D model passes through the scattered points on the 3D model. The second model obtained after interpolation is a 3D model with a smooth surface, and the vehicle-mounted image processing device performs texture mapping on the smooth second model, which improves the rendering effect of the 3D model. The interpolation method in the embodiment of the application may be spline interpolation, bicubic interpolation, discrete smooth interpolation, or another interpolation method. In the embodiment of the present application, the surface of the second model may be smoothed by interpolation, and the specific interpolation method is not limited.
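A minimal sketch of the interpolation smoothing idea follows, assuming SciPy is available. The patent's model is a closed bowl around the vehicle, so in practice the interpolation would run in a parametric (angle, height) domain; the height-field example below only illustrates the principle, and the function name and grid resolution are illustrative.

```python
import numpy as np
from scipy.interpolate import griddata

def smooth_scatter_surface(points, grid_res=200):
    """Interpolate a scattered 3D surface onto a regular grid (simplified sketch).

    points : (N, 3) scattered points of the first model
    Returns the grid X, Y coordinates and the interpolated smooth Z surface.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    xi = np.linspace(x.min(), x.max(), grid_res)
    yi = np.linspace(y.min(), y.max(), grid_res)
    gx, gy = np.meshgrid(xi, yi)

    # Cubic interpolation through the scattered points; spline, bicubic or
    # discrete smooth interpolation would serve the same purpose.
    gz = griddata((x, y), z, (gx, gy), method='cubic')
    return gx, gy, gz
```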
Referring to fig. 12, an embodiment of the present application provides an in-vehicle image processing apparatus, and the in-vehicle image processing apparatus is configured to execute the method executed by the in-vehicle image processing apparatus in the above method embodiment. The vehicle-mounted image processing device 1200 comprises an acquisition module 1201 and a processing module 1202, and optionally further comprises a display module 1203.
An obtaining module 1201, configured to obtain image information and depth information of a surrounding object, where the depth information is used to indicate coordinate point information of each point on the surrounding object;
a processing module 1202 for obtaining an initial model; adjusting a first coordinate point on the initial model to a second coordinate point according to the depth information to generate a first model; acquiring a panoramic image based on the image information and the first model.
Optionally, the processing module 1202 is a processor, which may be a general-purpose processor, a special-purpose processor, or the like. Optionally, the processor comprises a transceiver unit for implementing receiving and transmitting functions. For example, the transceiver unit is a transceiver circuit or an interface circuit. The transceiver circuits, interfaces or interface circuits for implementing the receiving and transmitting functions may be deployed separately or integrated together. The transceiver circuit, interface or interface circuit may be used for reading and writing code or data, or for transmitting or transferring signals.
Optionally, the obtaining module 1201 may be replaced by a transceiver module, and optionally, the transceiver module is a communication interface. Optionally, the communication interface is an input-output interface or a transceiving circuit. The input and output interface comprises an input interface and an output interface. The transceiver circuit includes an input interface circuit and an output interface circuit.
Optionally, the transceiver module is configured to receive image information and depth information of the surrounding object from the sensor.
Alternatively, the obtaining module 1201 may be replaced by the processing module 1202.
Further, the obtaining module 1201 is configured to execute step 501 in the embodiment corresponding to fig. 5. The processing module 1202 is configured to perform step 502, step 503, step 504, and step 505 in the embodiment corresponding to fig. 5. Optionally, the display module 1203 is configured to perform step 505 in the embodiment corresponding to fig. 5, and the display module 1203 is configured to display the panoramic image.
Specifically, in an optional implementation manner, the processing module 1202 is further specifically configured to: converting the pixel points in the image information into a first point cloud under the camera coordinate system according to the pixel points in the image information and depth information corresponding to the pixel points; converting the first point cloud under the camera coordinate system into a second point cloud under a world coordinate system; and adjusting the first coordinate point to a second coordinate point through the coordinate point in the second point cloud to generate the first model, wherein the second coordinate point is obtained according to the coordinate point in the second point cloud.
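The conversion described in this implementation can be sketched as follows, assuming a pinhole back-projection and a known camera-to-world pose; the function and variable names are illustrative and a fisheye camera would need its own unprojection model.

```python
import numpy as np

def depth_to_world_cloud(depth, K, R_cw, t_cw):
    """Back-project a depth map to a point cloud and move it to world coordinates.

    depth : (H, W) depth map aligned with the image (meters)
    K     : 3x3 camera intrinsic matrix
    R_cw, t_cw : camera-to-world rotation (3x3) and translation (3,)
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    z = depth.reshape(-1)
    valid = z > 0                          # keep pixels with a measured depth
    u, v, z = u.reshape(-1)[valid], v.reshape(-1)[valid], z[valid]

    # First point cloud: pixel + depth -> camera coordinate system
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    cloud_cam = np.stack([x, y, z], axis=1)

    # Second point cloud: camera coordinate system -> world coordinate system
    cloud_world = cloud_cam @ R_cw.T + t_cw
    return cloud_world
```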
In an alternative implementation, the image information includes images acquired by a plurality of image sensors, and the processing module 1202 is further specifically configured to: splicing a plurality of second point clouds of a plurality of images acquired by a plurality of image sensors under a world coordinate system to obtain a target point cloud; determining a plurality of third coordinate points within the neighborhood of the first coordinate point, the third coordinate points being coordinate points on the target point cloud; determining a second coordinate point according to the plurality of third coordinate points; and adjusting the first coordinate point to the second coordinate point.
In an optional implementation manner, the processing module 1202 is further specifically configured to: matching overlapping areas of a first image and a second image acquired by two adjacent image sensors to obtain a rotation matrix and a translation matrix for conversion between point clouds of the first image and the second image; and transforming the point cloud of the second image by using the rotation matrix and the translation matrix, and splicing the transformed point cloud of the second image and the point cloud of the first image.
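A minimal sketch of this stitching step is shown below, assuming the rotation matrix R and translation vector t have already been estimated from the overlapping area of the two images (for example by an ICP-style registration of the overlapping points, which is not shown here); the function name is illustrative.

```python
import numpy as np

def stitch_point_clouds(cloud_first, cloud_second, R, t):
    """Splice two per-camera point clouds into one target point cloud."""
    # Transform the second image's point cloud into the first cloud's frame.
    cloud_second_aligned = cloud_second @ R.T + t
    # Splice (concatenate) the aligned second cloud with the first cloud.
    return np.vstack([cloud_first, cloud_second_aligned])
```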
In an optional implementation manner, the processing module 1202 is further specifically configured to: carrying out interpolation smoothing treatment on the first model to obtain a second model; and performing texture mapping on the second model according to the image information to generate a panoramic image.
In an optional implementation manner, the objects around the vehicle include a first object and a second object, when a distance between the vehicle and the first object is a first distance, and a distance between the vehicle and the second object is a second distance, the position of the vehicle is mapped to a first position in the panoramic image, the position of the first object is mapped to a second position in the panoramic image, and the position of the second object is mapped to a third position in the panoramic image, and when the first distance is greater than the second distance, the display module 1203 is further configured to display the panoramic image, in which the distance between the first position and the second position is greater than the distance between the first position and the third position.
Referring to fig. 13, in an embodiment of the present application, an in-vehicle image processing apparatus is configured to execute steps 501 to 505 in the method embodiment corresponding to fig. 5. Reference may be made in particular to the description of the above-mentioned method embodiments. The vehicle-mounted image processing device comprises one or more processors 1301, where the processor 1301 may also be called a processing unit and may implement certain control functions. The processor 1301 may be a general-purpose processor, a special-purpose processor, or the like; for example, the processor 1301 may be a graphics processing unit (GPU) or a central processing unit, and the processor may be configured to control the in-vehicle image processing apparatus, execute a software program, and process data of the software program.
In an alternative design, the processor 1301 may also have instructions 1303 stored therein, and the instructions 1303 may be executed by the processor, so that the vehicle-mounted image processing apparatus 1300 performs the method described in the foregoing method embodiment.
In an alternative design, processor 1301 may include a transceiver unit for performing receive and transmit functions. The transceiver unit may be, for example, a transceiver circuit or an interface circuit. The transceiver circuits, interfaces or interface circuits used to implement the receive and transmit functions may be separate or integrated. The transceiver circuit, the interface or the interface circuit may be used for reading and writing code/data, or the transceiver circuit, the interface or the interface circuit may be used for transmitting or transferring signals.
One or more memories 1302 may be included in the in-vehicle image processing apparatus 1300, on which instructions 1304 may be stored, the instructions being executable on the processor to cause the in-vehicle image processing apparatus 1300 to perform the methods described in the above method embodiments. Optionally, the memory may further store data therein. Optionally, instructions and/or data may also be stored in the processor. The processor and the memory may be provided separately or may be integrated together.
Optionally, the in-vehicle image processing apparatus 1300 may further include a transceiver 1305. The processor 1301 may be referred to as a processing unit, and controls the in-vehicle image processing apparatus 1300. The transceiver 1305 may be referred to as a transceiving unit, a transceiver, a transceiving circuit, a transceiving device, a transceiving module, or the like, and is configured to implement a transceiving function.
In the embodiment of the present application, a vehicle includes the vehicle-mounted looking-around system shown in fig. 3. The in-vehicle looking around system includes the in-vehicle image processing apparatus shown in fig. 12, or the in-vehicle looking around system includes the in-vehicle image processing apparatus shown in fig. 13, which is used to execute steps 501 to 505 in the embodiment corresponding to fig. 5 described above.
The embodiment of the present application further provides a computer program product, where the computer program product includes computer program code, and when the computer program code is executed by a computer, the computer is enabled to implement the method executed by the vehicle-mounted image processing apparatus (or the vehicle-mounted surround view system) in the above method embodiment.
The embodiment of the application also provides a computer readable storage medium for storing a computer program or instructions, and the computer program or instructions can be executed to enable a computer to execute the method executed by the vehicle-mounted image processing device (or the vehicle-mounted looking-around system) in the method embodiment.
The embodiment of the application provides a chip which comprises a processor and a communication interface, wherein the processor is used for reading instructions to execute the method executed by the vehicle-mounted image processing device (or the vehicle-mounted looking-around system) in the method embodiment.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (18)

1. A method for generating a panoramic image, comprising:
acquiring image information and depth information of a surrounding object, wherein the depth information is used for indicating coordinate point information of each point on the surrounding object;
obtaining an initial model;
adjusting a first coordinate point on the initial model to a second coordinate point according to the depth information to generate a first model;
acquiring a panoramic image based on the image information and the first model.
2. The method of claim 1, wherein the adjusting the first coordinate point on the initial model to the second coordinate point according to the depth information generates a first model comprising:
converting the pixel points in the image information into a first point cloud under a camera coordinate system according to the pixel points in the image information and depth information corresponding to the pixel points;
converting the first point cloud under the camera coordinate system into a second point cloud under a world coordinate system;
and adjusting the first coordinate point to a second coordinate point through the coordinate point in the second point cloud to generate the first model, wherein the second coordinate point is obtained according to the coordinate point in the second point cloud.
3. The method of claim 2, wherein the image information comprises images captured by a plurality of image sensors, and wherein after converting the first point cloud in the camera coordinate system to the second point cloud in world coordinates, the method further comprises:
splicing a plurality of second point clouds of a plurality of images acquired by a plurality of image sensors under a world coordinate system to obtain a target point cloud;
the adjusting the first coordinate point to a second coordinate point through the coordinate points in the second point cloud includes:
determining a plurality of third coordinate points within the neighborhood of the first coordinate point, the third coordinate points being coordinate points on the target point cloud;
determining a second coordinate point according to the plurality of third coordinate points;
and adjusting the first coordinate point to the second coordinate point.
4. The method of claim 3, wherein the stitching the plurality of second point clouds of the plurality of images acquired by the plurality of image sensors under the world coordinate system to obtain the target point cloud comprises:
matching overlapping areas of a first image and a second image acquired by two adjacent image sensors to obtain a rotation matrix and a translation matrix for conversion between point clouds of the first image and the second image;
and transforming the point cloud of the second image by using the rotation matrix and the translation matrix, and splicing the transformed point cloud of the second image and the point cloud of the first image.
5. The method of any of claims 1-4, wherein after the generating the first model, the method further comprises:
carrying out interpolation smoothing treatment on the first model to obtain a second model;
the acquiring of the panoramic image based on the image information and the first model includes:
and performing texture mapping on the second model according to the image information to generate a panoramic image.
6. The method of claim 1, wherein the perimeter object comprises a first object and a second object, the position of the vehicle is mapped to a first position in the panoramic image when the distance between the vehicle and the first object is a first distance and the distance between the vehicle and the second object is a second distance, the position of the first object is mapped to a second position in the panoramic image, the position of the second object is mapped to a third position in the panoramic image, and when the first distance is greater than the second distance, the method further comprises:
displaying the panoramic image in which a distance between the first position and the second position is greater than a distance between the first position and the third position.
7. An on-vehicle looking around device, comprising:
the system comprises an acquisition module, a display module and a control module, wherein the acquisition module is used for acquiring image information and depth information of a surrounding object, and the depth information is used for indicating coordinate point information of each point on the surrounding object;
the processing module is used for acquiring an initial model; adjusting a first coordinate point on the initial model to a second coordinate point according to the depth information to generate a first model; acquiring a panoramic image based on the image information and the first model.
8. The apparatus of claim 7, wherein the processing module is further specifically configured to:
converting the pixel points in the image information into a first point cloud under a camera coordinate system according to the pixel points in the image information and depth information corresponding to the pixel points;
converting the first point cloud under the camera coordinate system into a second point cloud under a world coordinate system;
and adjusting the first coordinate point to a second coordinate point through the coordinate point in the second point cloud to generate the first model, wherein the second coordinate point is obtained according to the coordinate point in the second point cloud.
9. The apparatus of claim 8, wherein the image information comprises images captured by a plurality of image sensors, and wherein the processing module is further specifically configured to:
splicing a plurality of second point clouds of a plurality of images acquired by a plurality of image sensors under a world coordinate system to obtain a target point cloud;
determining a plurality of third coordinate points within the neighborhood of the first coordinate point, the third coordinate points being coordinate points on the target point cloud;
determining a second coordinate point according to the plurality of third coordinate points;
and adjusting the first coordinate point to the second coordinate point.
10. The apparatus of claim 9, wherein the processing module is further specifically configured to:
matching overlapping areas of a first image and a second image acquired by two adjacent image sensors to obtain a rotation matrix and a translation matrix for conversion between point clouds of the first image and the second image;
and transforming the point cloud of the second image by using the rotation matrix and the translation matrix, and splicing the transformed point cloud of the second image and the point cloud of the first image.
11. The apparatus according to any one of claims 7-10, wherein the processing module is further specifically configured to:
carrying out interpolation smoothing treatment on the first model to obtain a second model;
and performing texture mapping on the second model according to the image information to generate a panoramic image.
12. The apparatus of claim 7, wherein the perimeter object comprises a first object and a second object, the position of the vehicle is mapped to a first position in the panoramic image when the distance between the vehicle and the first object is a first distance and the distance between the vehicle and the second object is a second distance, the position of the first object is mapped to a second position in the panoramic image, the position of the second object is mapped to a third position in the panoramic image, and the apparatus further comprises a display module when the first distance is greater than the second distance; the display module is further configured to display the panoramic image, and in the panoramic image, a distance between the first position and the second position is greater than a distance between the first position and the third position.
13. An in-vehicle image processing apparatus, comprising a processor coupled with a memory for storing a program or instructions which, when executed by the processor, cause the in-vehicle image processing apparatus to perform the method of any of claims 1 to 6.
14. A vehicle-mounted surround view system, comprising a sensor, the in-vehicle image processing apparatus of claim 13 and a vehicle-mounted display, the sensor and the vehicle-mounted display both being connected to the in-vehicle image processing apparatus, wherein the sensor is configured to collect image information and depth information, and the vehicle-mounted display is configured to display a panoramic image.
15. A vehicle comprising the on-board look-around system of claim 14.
16. A computer program product comprising computer program code, which when executed by a computer causes the computer to implement the method of any one of claims 1 to 6.
17. A computer-readable storage medium for storing a computer program or instructions which, when executed, cause a computer to perform the method of any one of claims 1 to 6.
18. A chip comprising a processor and a communication interface, the processor being configured to read instructions to perform the method of any of claims 1 to 6.
CN202180001139.6A 2021-04-23 2021-04-23 Panoramic image generation method, vehicle-mounted image processing device and vehicle Active CN113302648B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/089132 WO2022222121A1 (en) 2021-04-23 2021-04-23 Panoramic image generation method, vehicle-mounted image processing apparatus, and vehicle

Publications (2)

Publication Number Publication Date
CN113302648A true CN113302648A (en) 2021-08-24
CN113302648B CN113302648B (en) 2022-09-16

Family

ID=77331312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180001139.6A Active CN113302648B (en) 2021-04-23 2021-04-23 Panoramic image generation method, vehicle-mounted image processing device and vehicle

Country Status (2)

Country Link
CN (1) CN113302648B (en)
WO (1) WO2022222121A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113793255A (en) * 2021-09-09 2021-12-14 百度在线网络技术(北京)有限公司 Method, apparatus, device, storage medium and program product for image processing
CN115578502A (en) * 2022-11-18 2023-01-06 杭州枕石智能科技有限公司 Image generation method and device, electronic equipment and storage medium
EP4235584A1 (en) * 2022-02-25 2023-08-30 Xiaomi EV Technology Co., Ltd. Image processing method and vehicle, and readable storage medium

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115830161B (en) * 2022-11-21 2023-10-31 北京城市网邻信息技术有限公司 House type diagram generation method, device, equipment and storage medium
CN115904294B (en) * 2023-01-09 2023-06-09 山东矩阵软件工程股份有限公司 Environment visualization method, system, storage medium and electronic equipment
CN115994952B (en) * 2023-02-01 2024-01-30 镁佳(北京)科技有限公司 Calibration method and device for panoramic image system, computer equipment and storage medium
CN116596741B (en) * 2023-04-10 2024-05-07 北京城市网邻信息技术有限公司 Point cloud display diagram generation method and device, electronic equipment and storage medium
CN116704129B (en) * 2023-06-14 2024-01-30 维坤智能科技(上海)有限公司 Panoramic view-based three-dimensional image generation method, device, equipment and storage medium
CN116962649B (en) * 2023-09-19 2024-01-09 安徽送变电工程有限公司 Image monitoring and adjusting system and line construction model
CN117406185B (en) * 2023-12-14 2024-02-23 深圳市其域创新科技有限公司 External parameter calibration method, device and equipment between radar and camera and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103863192A (en) * 2014-04-03 2014-06-18 深圳市德赛微电子技术有限公司 Method and system for vehicle-mounted panoramic imaging assistance
US20190050959A1 (en) * 2017-08-11 2019-02-14 Caterpillar Inc. Machine surround view system and method for generating 3-dimensional composite surround view using same
CN112347825A (en) * 2019-08-09 2021-02-09 杭州海康威视数字技术股份有限公司 Method and system for adjusting vehicle body all-round model

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170061689A1 (en) * 2015-08-24 2017-03-02 Caterpillar Inc. System for improving operator visibility of machine surroundings
CN105243637B (en) * 2015-09-21 2019-01-25 武汉海达数云技术有限公司 One kind carrying out full-view image joining method based on three-dimensional laser point cloud
CN111559314B (en) * 2020-04-27 2021-08-24 长沙立中汽车设计开发股份有限公司 Depth and image information fused 3D enhanced panoramic looking-around system and implementation method
CN112101092A (en) * 2020-07-31 2020-12-18 北京智行者科技有限公司 Automatic driving environment sensing method and system
CN111968184B (en) * 2020-08-24 2024-04-02 北京茵沃汽车科技有限公司 Method, device and medium for realizing view follow-up in panoramic looking-around system

Also Published As

Publication number Publication date
WO2022222121A1 (en) 2022-10-27
CN113302648B (en) 2022-09-16

Similar Documents

Publication Publication Date Title
CN113302648B (en) Panoramic image generation method, vehicle-mounted image processing device and vehicle
KR101265667B1 (en) Device for 3d image composition for visualizing image of vehicle around and method therefor
CN111223038B (en) Automatic splicing method of vehicle-mounted looking-around images and display device
WO2019127445A1 (en) Three-dimensional mapping method, apparatus and system, cloud platform, electronic device, and computer program product
CN109146947B (en) Marine fish three-dimensional image acquisition and processing method, device, equipment and medium
JP6192853B2 (en) Optical flow imaging system and method using ultrasonic depth detection
CN108140235A (en) For generating the system and method that image vision is shown
US11303807B2 (en) Using real time ray tracing for lens remapping
CN110648274B (en) Method and device for generating fisheye image
CN110377148A (en) Computer-readable medium, the method for training object detection algorithm and training equipment
CN109769110B (en) Method and device for generating 3D asteroid dynamic graph and portable terminal
CN110619660A (en) Object positioning method and device, computer readable storage medium and robot
JP2023505891A (en) Methods for measuring environmental topography
CN112837207A (en) Panoramic depth measuring method, four-eye fisheye camera and binocular fisheye camera
KR20170019793A (en) Apparatus and method for providing around view
CN113848931A (en) Agricultural machinery automatic driving obstacle recognition method, system, equipment and storage medium
WO2020131591A1 (en) Calibrating a machine vision camera
CN112489189B (en) Neural network training method and system
CN210986289U (en) Four-eye fisheye camera and binocular fisheye camera
US10681329B2 (en) Panoramic camera having multiple sensor systems and methods of collecting data of an environment using the same
CN112487893B (en) Three-dimensional target identification method and system
CN113269857A (en) Coordinate system relation obtaining method and device
Song et al. Virtually throwing benchmarks into the ocean for deep sea photogrammetry and image processing evaluation
Groom et al. On Depth Error from Spherical Camera Calibration within Omnidirectional Stereo Vision
WO2022019128A1 (en) Information processing device, information processing method, and computer-readable recording medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant