CN114511447A - Image processing method, device, equipment and computer storage medium - Google Patents

Image processing method, device, equipment and computer storage medium Download PDF

Info

Publication number
CN114511447A
Authority
CN
China
Prior art keywords
image
coordinates
unfolded
pixel
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202210110821.5A
Other languages
Chinese (zh)
Inventor
黄寅涛
韩殿飞
赵汉玥
陈东生
杨振伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Sensetime Technology Co Ltd
Original Assignee
Shenzhen Sensetime Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Sensetime Technology Co Ltd filed Critical Shenzhen Sensetime Technology Co Ltd
Priority to CN202210110821.5A priority Critical patent/CN114511447A/en
Publication of CN114511447A publication Critical patent/CN114511447A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038Image mosaicing, e.g. composing plane images from plane sub-images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/005General purpose rendering architectures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/10Geometric effects
    • G06T15/20Perspective computation
    • G06T15/205Image-based rendering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20Finite element generation, e.g. wire-frame surface description, tesselation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00Indexing scheme for image data processing or generation, in general
    • G06T2200/32Indexing scheme for image data processing or generation, in general involving image mosaicing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The embodiment of the application discloses an image processing method, an image processing device, image processing equipment and a computer storage medium, wherein the method comprises the following steps: acquiring a first image shot by a first camera and a second image shot by a second camera; an included angle between the optical axis of the first camera and the optical axis of the second camera is a preset angle; performing conversion processing on the first image and the second image to obtain a first expanded image; determining vertex coordinates and pixel information of the vertex on a first spherical coordinate system based on the pixel information of the first unfolded image; determining a panoramic image based on the vertex coordinates and the pixel information of the vertex.

Description

Image processing method, device, equipment and computer storage medium
Technical Field
The embodiments of the present application relate to, but are not limited to, the field of computer vision technology, and in particular, to an image processing method, apparatus, device, and computer storage medium.
Background
In the related art, environmental information around a camera is generally acquired by shooting. However, since the imaging angle of view of a single camera is limited, a plurality of cameras are required to image the surrounding environment in order to acquire environmental information over a large angle of view around the camera. As a result, the user has to check a plurality of images, which makes it inconvenient to acquire the environmental information of a large viewing angle around the camera.
Disclosure of Invention
The embodiment of the application provides an image processing method, an image processing device, image processing equipment and a computer storage medium.
In a first aspect, an embodiment of the present application provides an image processing method, including: acquiring a first image shot by a first camera and a second image shot by a second camera; an included angle between the optical axis of the first camera and the optical axis of the second camera is a preset angle; performing conversion processing on the first image and the second image to obtain a first expanded image; determining vertex coordinates and pixel information of the vertex on a first spherical coordinate system based on the pixel information of the first unfolded image; determining a panoramic image based on the vertex coordinates and the pixel information of the vertex.
In this way, the first image shot by the first camera and the second image shot by the second camera are converted to obtain the first expanded image, and the panoramic image is determined based on the pixel information of the first expanded image, so that a user can acquire the environmental information around the camera by looking up the panoramic image, and the user can conveniently acquire the environmental information with a large viewing angle around the camera; further, the vertex coordinates and the vertex pixel information in the first spherical coordinate system are determined based on the pixel information of the first developed image, and the panoramic image is determined based on the vertex coordinates and the vertex pixel information, so that the panoramic image is an image rendered by the vertex coordinates and the vertex pixel information, and the environmental information around the camera can be clearly displayed.
In some embodiments, the converting the first image and the second image to obtain a first expanded image includes: determining a first transformation relationship between the coordinates of the first image and the coordinates of the first unfolded image, and a second transformation relationship between the coordinates of the second image and the coordinates of the first unfolded image; and converting the first image and the second image based on the first conversion relation and the second conversion relation to obtain a first expanded image.
In this way, the first unfolded image is obtained based on the first conversion relationship between the coordinates of the first image and the coordinates of the first unfolded image and the second conversion relationship between the coordinates of the second image and the coordinates of the first unfolded image, so that the first unfolded image not only has the characteristics of the first image, but also has the characteristics of the second image, and the characteristics of the first image and the characteristics of the second image can be comprehensively reflected by the spliced first unfolded image.
In some embodiments, the determining a first transformation relationship between the coordinates of the first image and the coordinates of the first unfolded image and a second transformation relationship between the coordinates of the second image and the coordinates of the first unfolded image comprises: determining a third conversion relation between the coordinates of a plurality of pixel points in the first unfolded image and the coordinates of a plurality of pixel points on the mapped second spherical coordinate system; determining a fourth conversion relation between the coordinates of the plurality of pixel points on the second spherical coordinate system and the coordinates of the plurality of pixel points in the mapped first image; determining a fifth conversion relation between the coordinates of the plurality of pixel points on the second spherical coordinate system and the coordinates of the plurality of pixel points in the mapped second image; determining the first conversion relationship based on the third conversion relationship and the fourth conversion relationship, and determining the second conversion relationship based on the third conversion relationship and the fifth conversion relationship.
In this way, the first conversion relation is determined based on the third conversion relation and the fourth conversion relation, the second conversion relation is determined based on the third conversion relation and the fifth conversion relation, so that the shot first image and the shot second image can be mapped onto the second spherical coordinate system, the influence of external parameters, internal parameters and distortion parameters of the first camera and the second camera on the shooting of the first image and the second image can be eliminated, the first unfolded image can be determined based on the image mapped on the second spherical coordinate system, and the first unfolded image can accurately reflect the real scene around the camera.
In some embodiments, the converting the first image and the second image based on the first conversion relationship and the second conversion relationship to obtain a first expanded image includes: determining a first sub-expansion image corresponding to the first image based on the first image and the first conversion relation; determining a second sub-expansion image corresponding to the second image based on the second image and the second conversion relation; fusing a first pixel point of the first sub-unfolded image and a second pixel point of the second sub-unfolded image to obtain a fused area; and splicing a third pixel point except the first pixel point in the first sub-unfolded image, the fusion area and a fourth pixel point except the second pixel point in the second sub-unfolded image to obtain the first unfolded image.
Therefore, the first sub-unfolded image and the second sub-unfolded image are determined firstly, the first pixel point of the first sub-unfolded image and the second pixel point of the second sub-unfolded image are fused to obtain a fusion area, the third pixel point except the first pixel point in the first sub-unfolded image, the fourth pixel point except the second pixel point in the fusion area and the second sub-unfolded image are spliced to obtain the first unfolded image, so that the overlapped part of the first image and the second image can be fused, the first sub-unfolded image and the second sub-unfolded image can be in smooth transition, and the sense of reality of the first unfolded image is improved.
In some embodiments, the first pixel point corresponds to a pixel point in the second spherical coordinate system, a first connecting line connects that pixel point and the center of the second spherical coordinate system, and the included angle between the first connecting line and a second connecting line is greater than or equal to a first angle and less than or equal to a second angle; the second connecting line is the connecting line between the circle center and the position, in the second spherical coordinate system, corresponding to the center position of the first sub-expansion image. The second pixel point corresponds to a pixel point in the second spherical coordinate system, a third connecting line connects that pixel point and the circle center, and the included angle between the third connecting line and a fourth connecting line is greater than or equal to the first angle and less than or equal to the second angle; the fourth connecting line is the connecting line between the circle center and the position, in the second spherical coordinate system, corresponding to the center position of the second sub-expansion image. The first angle is 90 degrees minus a third angle, and the second angle is 90 degrees plus the third angle.
Therefore, the pixel points of the first sub-expanded image and the second sub-expanded image that are fused are those whose corresponding pixel points in the second spherical coordinate system form, with the circle center of the second spherical coordinate system, a connecting line whose included angle with the connecting line between the circle center and the position corresponding to the center of the respective sub-expanded image is greater than or equal to the first angle and less than or equal to the second angle. In this way, fusion processing is performed only on the pixel points whose included angle lies between the first angle and the second angle, which reduces the phenomenon that the transition between the first sub-expanded image and the second sub-expanded image becomes unnatural due to too many pixel points being subjected to the fusion processing.
In some embodiments, the fusing the first pixel point of the first sub-expansion image and the second pixel point of the second sub-expansion image to obtain a fusion region includes: determining a weighting coefficient based on an included angle between the first connecting line and the second connecting line, an included angle between the third connecting line and the fourth connecting line, and one of the second angle and the first angle; and fusing the first pixel point of the first sub-expansion image and the second pixel point of the second sub-expansion image based on the weighting coefficient to obtain the fusion region.
Therefore, the weighting coefficient is determined based on the included angle between the first connecting line and the second connecting line and the included angle between the third connecting line and the fourth connecting line, and the first pixel point and the second pixel point are fused based on the weighting coefficient, so that the fusion area is a result of weighting processing on the first pixel point and the second pixel point, and the fusion area can accurately reflect a real scene.
In some embodiments, the stitching a third pixel point in the first sub-unfolded image except the first pixel point, the fusion region, and a fourth pixel point in the second sub-unfolded image except the second pixel point to obtain the first unfolded image includes: splicing the third pixel point of the first sub-unfolded image, the fusion area and the fourth pixel point of the second sub-unfolded image to obtain a second unfolded image; and smoothing the second unfolded image to obtain the first unfolded image.
Therefore, the first unfolded image is obtained by performing smoothing processing on the spliced second unfolded image, so that the saw teeth in the second unfolded image can be eliminated, and the obtained first unfolded image is more real.
In some embodiments, the determining the vertex coordinates and the pixel information of the vertex on the first spherical coordinate system based on the pixel information of the first expanded image comprises: mapping a plurality of pixel points in the first unfolded image to coordinates of a plurality of pixel points on the first spherical coordinate system, and determining the coordinates as the vertex coordinates; determining a sixth conversion relation between the coordinates of a plurality of pixel points in the first unfolded image and the coordinates of a plurality of pixel points on the mapped first spherical coordinate system; and determining the pixel information of the vertex based on the pixel information of a plurality of pixel points in the first expanded image and the sixth conversion relation.
In this way, the first unfolded image is mapped onto the first spherical coordinate system, so that the vertex coordinates and the vertex pixel information on the corresponding first spherical coordinate system can be obtained, the characteristics of the first unfolded image can be comprehensively reflected by the vertex coordinates and the vertex pixel information, the obtained panoramic image is rendered by the vertex coordinates and the vertex pixel information, and the environmental information around the camera can be truly and comprehensively represented.
In some embodiments, the determining a panoramic image based on the vertex coordinates and the pixel information of the vertex comprises: in response to a sliding instruction generated by a sliding operation on a display device, determining a Model View Projection (MVP) matrix matched with the sliding operation; and determining the panoramic image based on the vertex coordinates, the pixel information of the vertex, and the MVP matrix.
Therefore, the MVP matrix can be determined based on the sliding operation of the user on the display device, and the panoramic image is determined based on the MVP matrix, so that the display view angle of the panoramic image can be switched based on the sliding operation of the user, and convenience of switching the display view angle of the panoramic image by the user is improved.
In some embodiments, after determining the panoramic image based on the vertex coordinates and the pixel information of the vertex, the method further comprises: determining a three-dimensional mesh model corresponding to a specific object in the panoramic image; rendering the three-dimensional grid model to obtain a rendered three-dimensional grid model; overlaying the rendered three-dimensional mesh model on a particular object in the panoramic image.
In this way, the three-dimensional mesh model corresponding to the specific object is rendered, and the rendered three-dimensional mesh model is covered on the specific object in the panoramic image, so that the user can conveniently know the three-dimensional characteristics of the specific object.
In a second aspect, an embodiment of the present application provides an image processing apparatus, including: the device comprises an acquisition unit, a processing unit and a display unit, wherein the acquisition unit is used for acquiring a first image shot by a first camera and a second image shot by a second camera; an included angle between the optical axis of the first camera and the optical axis of the second camera is a preset angle; the conversion unit is used for carrying out conversion processing on the first image and the second image to obtain a first expanded image; a determination unit configured to determine a vertex coordinate on a first spherical coordinate system and pixel information of the vertex based on pixel information of the first expanded image; the determining unit is further configured to determine a panoramic image based on the vertex coordinates and the pixel information of the vertex.
In a third aspect, an embodiment of the present application provides an image processing apparatus, including: a memory storing a computer program operable on the processor, and a processor implementing the image processing method described above when executing the computer program.
In a fourth aspect, embodiments of the present application provide a computer storage medium storing one or more programs, which are executable by one or more processors to implement the image processing method described above.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application.
Fig. 1 is a schematic flowchart of an image processing method according to an embodiment of the present disclosure;
fig. 2a is a schematic flowchart of another image processing method according to an embodiment of the present disclosure;
fig. 2b is a schematic diagram of a camera model provided in an embodiment of the present application;
fig. 3 is a schematic flowchart of another image processing method according to an embodiment of the present application;
fig. 4a is a schematic flowchart of another image processing method according to an embodiment of the present application;
FIG. 4b is a schematic diagram illustrating a location of a vertex according to an embodiment of the present application;
fig. 5 is a schematic flowchart of an image processing method according to another embodiment of the present application;
fig. 6 is a schematic flowchart of an image processing method according to another embodiment of the present application;
fig. 7 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure;
fig. 8 is a hardware entity diagram of an image processing apparatus according to an embodiment of the present disclosure.
Detailed Description
The technical solution of the present application will be specifically described below by way of examples with reference to the accompanying drawings. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
It should be noted that: in the present examples, "first", "second", etc. are used for distinguishing similar objects and are not necessarily used for describing a particular order or sequence.
The technical means described in the embodiments of the present application may be arbitrarily combined without conflict. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
The image processing method provided in the following embodiments may be applied to an image processing apparatus or a processor, and the processor may be applied to the image processing apparatus.
The image processing device may comprise one, or a combination of at least two, of the following: an Internet of Things (IoT) device, a satellite terminal, a Wireless Local Loop (WLL) station, a Personal Digital Assistant (PDA), a handheld device with wireless communication capabilities, a computing device or other processing device connected to a wireless modem, a server, a cell phone (mobile phone), a tablet computer (Pad), a computer with wireless transceiving capabilities, a palm computer, a desktop computer, a portable media player, a smart speaker, a navigation device, a smart watch, smart glasses, a wearable device such as a smart necklace, a pedometer, a digital TV, a Virtual Reality (VR) terminal device, an Augmented Reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in unmanned (self) driving, a wireless terminal in remote surgery (remote medical), a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, and, in a vehicle networking (Internet of Vehicles) system, a vehicle, a vehicle-mounted device, a vehicle-mounted module, a wireless modem, a handheld device, a Customer Premises Equipment (CPE), a smart appliance, and the like.
Fig. 1 is a schematic flowchart of an image processing method provided in an embodiment of the present application, and as shown in fig. 1, the method is applied to an image processing apparatus or a processor, and the method includes:
s101, acquiring a first image shot by a first camera and a second image shot by a second camera; the included angle between the optical axis of the first camera and the optical axis of the second camera is a preset angle.
In some embodiments, the first camera and/or the second camera may be included in the image processing apparatus, so that the first camera and the second camera may transmit the first image and the second image to the processor in a case where the first image and the second image are captured. In other embodiments, the first camera and/or the second camera are not included in the image processing apparatus, for example, the first camera and the second camera may be included in another apparatus, and in a case where the first camera and the second camera capture the first image and the second image, the image processing apparatus may receive the first image and the second image sent by the other apparatus, so that the image processing apparatus acquires the first image and the second image.
In some embodiments, the first camera and the second camera may both be fisheye cameras, such that the first image and the second image may both be fisheye images. In other embodiments, the first camera and the second camera may both be a normal camera or a wide-angle camera, etc.
In some embodiments, the shooting parameters of the first camera and the second camera may be the same. In other embodiments, the shooting parameters of the first camera and the second camera may be different. The photographing parameters may include at least one of: external reference, internal reference, and distortion parameters. In some embodiments, the first camera and the second camera may be the same type of camera, for example, the first camera and the second camera may both be fisheye cameras, wide-angle cameras, or normal cameras, among others. In other embodiments, the first camera and the second camera may be different types of cameras. For example, the first camera may be a fisheye camera and the second camera may be a normal camera.
In some embodiments, the angle of view of the first camera may have an overlap with the angle of view of the second camera, or the capture area of the first camera and the capture area of the second camera may have an overlap. For example, both the first camera and the second camera can capture a certain object. In other embodiments, the angle of view of the first camera may not overlap with the angle of view of the second camera, or the capture area of the first camera and the capture area of the second camera may not overlap.
The preset angle may be less than or equal to: the sum of one half of the viewing angle of the first camera and one half of the viewing angle of the second camera. For example, in the case where the angle of view of the first camera and the angle of view of the second camera are the same, the preset angle may be less than or equal to the angle of view of the first camera or the angle of view of the second camera.
In some embodiments, the viewing angle of the first fisheye camera may be greater than 180 degrees, the viewing angle of the second fisheye camera may be greater than 180 degrees, and the first fisheye camera and the second fisheye camera are disposed back-to-back. In other embodiments, the viewing angle of the first fisheye camera and/or the viewing angle of the second fisheye camera may be less than or equal to 180 degrees. In other embodiments, an angle between the optical axis of the first fisheye camera and the optical axis of the second fisheye camera may be less than or equal to 180 degrees. The optical axis of the camera may be the center line of the light beam (light pillar), or the axis of symmetry of the optical system.
In some embodiments, the first image and the second image may be taken simultaneously. In other embodiments, the first image and the second image may be taken at different times. In some embodiments, the first camera and the second camera may capture one image every set time length, and the time for capturing the image by the first camera and the time for capturing the image by the second camera are the same, and the first image and the second image may be images captured by the first camera and the second camera at a certain time. In other embodiments, the first camera and the second camera may capture videos, and the first image and the second image may be images captured at the same time from the videos captured by the first camera and the second camera, respectively.
In some embodiments, the first image and/or the second image may be an original image. For example, the original image may be an image directly obtained by image capturing. As another example, the original image may be an image frame in a video. As another example, the original image may be an image read from the local, a downloaded image, or an image read from another device (e.g., a hard disk, a usb disk, or another terminal, etc.). As another example, the raw image may be an image frame in a video read locally, a downloaded video, or a video read from another device.
In other embodiments, the first image and/or the second image may be images obtained by processing the original image by at least one of: scaling, clipping, denoising, noise adding, gray level processing, rotation processing and normalization processing. For example, the original image may be scaled and then rotated to obtain the first image or the second image.
The shape of the first image and/or the second image may be circular, elliptical or rectangular. For example, in the case where the first camera and the second camera are fisheye cameras, the shape of the first image and/or the second image is a circle.
S102, converting the first image and the second image to obtain a first expanded image.
The first expanded image may be an image in a planar coordinate system, and the first expanded image may be a two-dimensional image. The first unfolded image may be a rectangle, for example the first unfolded image may be an M × N image. M and N may be the number of pixels in the height and width directions, respectively. In some embodiments, M may be N/2, or N may be M/2. For example, M × N may be 1440 × 720, 360 × 720, or 180 × 360, etc.
The first unfolded image may be an image corresponding to the first image and the second image. The first unfolded image may be a combination of the first image and the second image. For example, the pixel information in the first developed image is determined based on the pixel information of the first image and the pixel information of the second image.
In some embodiments, S102 may be implemented by: and performing coordinate conversion on the first image to obtain a first sub-expansion image, performing coordinate conversion on the second image to obtain a second sub-expansion image, and determining the first expansion image based on the first sub-expansion image and the second sub-expansion image. For example, the first expanded sub-image and the second expanded sub-image may be subjected to conversion processing to obtain a first expanded image. The first sub-expansion image and the second sub-expansion image may be both two-dimensional images, and the coordinate systems of the first sub-expansion image, the second sub-expansion image, and the first expansion image may be the same. In some embodiments, the first and second sub-unfolded images may both be rectangular. The image stitching technology is a technology for stitching at least two images with or without overlapping portions (which may be obtained at different times, different viewing angles or different sensors) into a seamless panoramic image or high-resolution image.
In some embodiments, the pixel size of the first unfolded image may be less than or equal to the pixel size of the first image and/or the second image. For example, the pixel points on the first expanded image correspond to partial pixel points of the first image and partial pixel points of the second image.
S103, determining vertex coordinates and pixel information of the vertex on a first spherical coordinate system based on the pixel information of the first expansion image.
The number of vertices on the first spherical coordinate system may be predetermined. For example, the number of vertices may be P × Q, where P may have a value less than or equal to M and Q may have a value less than or equal to N. In some embodiments, the value of P may be the same as the value of Q. In other embodiments, the value of P may be different from the value of Q. For example, P may be Q/2, or Q may be P/2.
The spherical surface in the first spherical coordinate system (with a radius of R1, or a normalized spherical surface) may be divided by a "longitude and latitude division method", that is, the spherical surface is divided into P parts along the latitude direction and Q parts along the longitude direction; all the intersections on the resulting longitude-latitude grid are the vertices of the spherical surface, and the coordinates of these vertices are the vertex coordinates in S103.
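For illustration only, the following Python sketch shows one possible realization of the longitude and latitude division described above; the function name, the default unit radius, and the example grid resolution are assumptions rather than values taken from the original filing.

```python
import numpy as np

def sphere_vertices(P, Q, R1=1.0):
    """Divide a sphere of radius R1 into P parts along the latitude direction
    and Q parts along the longitude direction; the grid intersections are the
    vertices (a hypothetical helper, for illustration only)."""
    theta = np.linspace(0.0, np.pi, P + 1)        # P latitude parts -> P + 1 grid lines
    phi = np.linspace(0.0, 2.0 * np.pi, Q + 1)    # Q longitude parts -> Q + 1 grid lines
    theta, phi = np.meshgrid(theta, phi, indexing="ij")
    x = R1 * np.sin(theta) * np.cos(phi)
    y = R1 * np.sin(theta) * np.sin(phi)
    z = R1 * np.cos(theta)
    return np.stack([x, y, z], axis=-1)           # shape (P + 1, Q + 1, 3)

vertices = sphere_vertices(P=90, Q=180)           # example resolution (assumption)
```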
The coordinates of some or all of the pixel points in the first expanded image may be changed to the first spherical coordinate system, so that the pixel information of the vertex in the first spherical coordinate system may be determined.
And S104, determining the panoramic image based on the vertex coordinates and the pixel information of the vertex.
S104 may include: determining a panoramic image based on the vertex coordinates, the pixel information of the vertex, and a Model View Projection (MVP) matrix.
In some embodiments, the MVP matrix may be predefined or preset. In other embodiments, the MVP matrix may be generated based on user adjustments to the view direction of the panoramic image.
The manner in which the panoramic image is determined based on the vertex coordinates, the pixel information of the vertices, and the MVP matrix is described below:
in the rendering pipeline, a vertex needs to be transformed through a plurality of coordinate spaces before it is finally drawn on the screen. The vertices are initially defined in model space, and the final vertices are transformed into screen space to obtain the true screen pixel coordinates; the panoramic image may be an image in screen space. The first step of the vertex transformation is to transform the vertices from model space into world space. This transformation is commonly called the model transformation, and the matrix used for the model-space-to-world-space transformation may be a Model matrix. The second step of the vertex transformation is to transform the vertex coordinates from world space into viewing space; this is commonly called the viewing transformation, and the matrix used for the world-space-to-viewing-space transformation may be a View matrix. The vertices are then transformed from viewing space into clipping space (also called homogeneous clipping space); the matrix used for the viewing-space-to-clipping-space transformation is called the clipping matrix or Projection matrix. The screen space is a two-dimensional space, so vertices can be projected from clipping space into screen space to generate the corresponding two-dimensional coordinates.
In this way, the panoramic image is determined by performing image rendering using the two-dimensional coordinates of the vertex coordinates mapped in the screen space and the pixel information of the vertex.
The MVP matrix includes a Model matrix, a View matrix, and a Projection matrix.
The Model matrix may be determined based on a target matrix, and the target matrix may be obtained by transforming an initial matrix, which may be a fourth-order identity matrix. In some embodiments, the transformation of the target matrix compared to the initial matrix may correspond to a rotational transformation of the vertex coordinates. For example, in the case where the vertex coordinates in the model space correspond to the line-of-sight direction directly in front of the rear lenses of the two fisheye cameras facing away from each other, the transformation of the target matrix compared to the initial matrix may correspond to a transformation that rotates the vertex coordinates by 90 degrees about the Y axis and then by -90 degrees about the X axis.
The image processing device can receive a sliding operation performed by a user on the screen, and obtain a displacement matrix (Trans matrix) based on the sliding operation. The rotation amount corresponding to the user's sliding operation on the screen may include: a rotation by a first preset angle around the X axis and a rotation by a second preset angle around the Y axis. The transformation of the displacement matrix compared to the initial displacement matrix may correspond to rotating the vertex coordinates by the first preset angle about the X axis and by the second preset angle about the Y axis. In this way, the displacement matrix can be continuously updated according to the user's sliding operation on the screen.
In some embodiments, the MVP matrix may include: a Projection matrix, a Trans matrix, a View matrix, and a Model matrix. For example, MVP matrix = Projection matrix × Trans matrix × View matrix × Model matrix.
In this way, the MVP matrix can be continuously updated according to the sliding operation of the user on the screen, so that the image processing device can display panoramic images at different viewing angles according to the sliding operation of the user on the screen.
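A minimal Python/NumPy sketch of how such an MVP matrix might be composed and updated from a sliding operation is given below; the rotation helpers, the identity placeholders for the View and Projection matrices, and the mapping from the sliding operation to rotation angles are assumptions for illustration, not the claimed implementation.

```python
import numpy as np

def rot_x(deg):
    """4x4 rotation about the X axis by deg degrees."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    m = np.eye(4)
    m[1, 1], m[1, 2], m[2, 1], m[2, 2] = c, -s, s, c
    return m

def rot_y(deg):
    """4x4 rotation about the Y axis by deg degrees."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    m = np.eye(4)
    m[0, 0], m[0, 2], m[2, 0], m[2, 2] = c, s, -s, c
    return m

# Model matrix: rotate the vertex coordinates 90 degrees about Y, then
# -90 degrees about X (the example target-matrix transformation above).
model = rot_x(-90.0) @ rot_y(90.0)

view = np.eye(4)        # placeholder View matrix (assumption)
projection = np.eye(4)  # placeholder Projection matrix (assumption)
trans = np.eye(4)       # displacement (Trans) matrix, updated on sliding

def on_slide(angle_x_deg, angle_y_deg):
    """Update the Trans matrix with the first/second preset angles derived
    from a sliding operation (the angle derivation itself is an assumption)."""
    global trans
    trans = rot_x(angle_x_deg) @ rot_y(angle_y_deg) @ trans

def mvp_matrix():
    # MVP matrix = Projection matrix x Trans matrix x View matrix x Model matrix
    return projection @ trans @ view @ model
```

In such a sketch, the product would be recomputed each time the Trans matrix is updated by a new sliding operation, so that the displayed viewing angle follows the user's slides.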
In the embodiment of the application, a first expanded image is obtained by converting a first image shot by a first camera and a second image shot by a second camera, and a panoramic image is determined based on pixel information of the first expanded image, so that a user can acquire environmental information around the camera by looking up the panoramic image, and further the user can conveniently acquire the environmental information with a large viewing angle around the camera; further, the vertex coordinates and the vertex pixel information in the first spherical coordinate system are determined based on the pixel information of the first developed image, and the panoramic image is determined based on the vertex coordinates and the vertex pixel information, so that the panoramic image is an image rendered by the vertex coordinates and the vertex pixel information, and the environmental information around the camera can be clearly displayed.
Fig. 2a is a schematic flowchart of another image processing method provided in an embodiment of the present application, and as shown in fig. 2a, the method is applied to an image processing apparatus or a processor, and includes:
s201, acquiring a first image shot by a first camera and a second image shot by a second camera; the included angle between the optical axis of the first camera and the optical axis of the second camera is a preset angle.
S202, determining a first conversion relation between the coordinates of the first image and the coordinates of the first unfolded image, and a second conversion relation between the coordinates of the second image and the coordinates of the first unfolded image.
In some embodiments, S202 may be implemented by the following steps a1 to a 4:
step A1, determining a third conversion relation between the coordinates of the plurality of pixel points in the first unfolded image and the coordinates of the plurality of pixel points on the mapped second spherical coordinate system.
The first developed image is a two-dimensional rectangular image. The coordinates of the pixel points in the second spherical coordinate system may be: and coordinates of a point on a spherical surface with the origin of the second spherical coordinate system as the spherical center and the radius of R2. The plurality of pixel points in the first unfolded image can correspond to the plurality of pixel points on the second spherical coordinate system one by one.
In some embodiments, the size (pixel size) of the first unfolded image may be preset, and the radius R2 may be preset. Dividing the first unfolded image into M equal parts in the height direction, dividing the first unfolded image into N equal parts in the length direction, and taking intersection points obtained by division as a plurality of pixel points in the first unfolded image; and dividing the latitude of the spherical surface with the radius of R2 into M equal parts, dividing the longitude of the spherical surface into N equal parts, and taking the intersection points obtained by division as a plurality of pixel points on a second spherical coordinate system.
The numerical values of M and N are not limited. M and N may be integers greater than or equal to 2, N may be equal to 2M, or N may be equal to M.
In some embodiments, the unfolded map is equally divided in both the vertical and horizontal directions, the vertical direction (height direction) being divided into 180 parts and the horizontal direction (length direction) into 360 parts, and the coordinates of the unfolded map are mapped to the sphere world coordinates Pw = (xw, yw, zw) (i.e., the coordinates of the pixel points in the second spherical coordinate system).
The sphere world coordinate Pw corresponding to the pixel point (i, j) of the first expanded image is calculated as shown in formulas (1) to (5), where i ∈ (0, width-1) and j ∈ (0, height-1):
φ = 2 * PI * i / width (1);
θ = PI * j / height (2);
xw = R2 * sinθ * cosφ (3);
yw = R2 * sinθ * sinφ (4);
zw = R2 * cosθ (5);
wherein width is the width of the first expanded image, height is the height of the first expanded image, PI is the circular constant π, φ and θ are the longitude and latitude (polar) angles corresponding to the pixel point (i, j), and R2 is the radius of the second spherical coordinate system.
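As a plain illustration of formulas (1) to (5), the following Python snippet maps an unfolded-image pixel (i, j) to sphere world coordinates; it assumes the equirectangular form reconstructed above and a default radius of 1, so it is a sketch rather than the exact patented computation.

```python
import numpy as np

def unfolded_to_sphere(i, j, width, height, R2=1.0):
    """Map pixel (i, j) of the unfolded image to sphere world coordinates
    Pw = (xw, yw, zw), following formulas (1)-(5) as given above
    (an illustrative assumption, not verbatim from the original filing)."""
    phi = 2.0 * np.pi * i / width    # longitude, formula (1)
    theta = np.pi * j / height       # latitude (polar) angle, formula (2)
    xw = R2 * np.sin(theta) * np.cos(phi)   # formula (3)
    yw = R2 * np.sin(theta) * np.sin(phi)   # formula (4)
    zw = R2 * np.cos(theta)                 # formula (5)
    return xw, yw, zw
```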
Step A2, determining a fourth conversion relationship between the coordinates of the plurality of pixel points in the second spherical coordinate system and the mapped coordinates of the plurality of pixel points in the first image.
The fourth transformation relationship may be determined based on at least one of external parameters, internal parameters, and distortion parameters of the first camera.
Step A3, determining a fifth conversion relationship between the coordinates of the plurality of pixel points in the second spherical coordinate system and the coordinates of the plurality of pixel points in the mapped second image.
The fifth transformation relationship may be determined based on at least one of external parameters, internal parameters, and distortion parameters of the second camera. The first and second images may be at different locations in the same coordinate system, or the first and second images may be at the same or different locations in different coordinate systems.
The plurality of pixel points in the first image and the plurality of pixel points in the second image may have a correspondence. For example, the positions of the plurality of pixel points in the first image are the same as the positions of the plurality of pixel points in the second image. For example, the plurality of pixel points may be evenly distributed in the first image or the second image. The number and/or position of the plurality of pixel points in the first image and the number and/or position of the plurality of pixel points in the second image may be pre-configured, or may be flexibly determined based on the computational performance of the image processing device and/or the display parameters of the panoramic image; the display parameters may include at least one of: frame rate, resolution. For example, the number of the plurality of pixel points in the first image and the number of the plurality of pixel points in the second image may be increased in a case where the calculation performance of the image processing apparatus is better and/or the display parameter of the panoramic image is higher, so that the more accurate fourth conversion relationship and the fifth conversion relationship can be determined.
The following describes a determination method of a conversion relationship between coordinates of a plurality of pixel points on the second spherical coordinate system and coordinates of a plurality of pixel points of the mapped first image and/or second image (i.e., the fourth conversion relationship and/or the fifth conversion relationship described above):
The initial external parameters of the camera (corresponding to the external parameters of the first camera or of the second camera) are used to convert the world coordinates (i.e., the sphere world coordinates Pw, that is, the coordinates of the pixel points in the second spherical coordinate system) to the camera coordinate system, obtaining the coordinates Pc.
The point in the camera coordinate system is projected onto the normalized sphere to obtain a point Ps = (xs, ys, zs), where:
xs = xc / R2, ys = yc / R2, zs = zc / R2;
and R2 is the radius corresponding to the second spherical coordinate system.
The coordinate system is then changed so that the origin of the new coordinate system is located at xi, and the point in the new coordinate system is projected onto the normalization plane to obtain Pn = (xn, yn, zn), where:
xn = xs / (zs + xi);
yn = ys / (zs + xi);
zn = 1.
Based on Pn, the distorted coordinates Pd = (xd, yd) are calculated, where k1, k2, p1 and p2 are distortion parameters and the coordinates Pd = (xd, yd) are determined based on the following formulas (6) to (9):
r² = xn² + yn² (6);
kr = 1 + k1*r² + k2*r⁴ (7);
xd = kr*xn + 2*p1*xn*yn + p2*(r² + 2*xn²) (8);
yd = kr*yn + p1*(r² + 2*yn²) + 2*p2*xn*yn (9);
The camera internal parameters (fx, fy, cx, cy) are then used to obtain the fisheye image coordinates Pfisheye (i.e., the coordinates of the plurality of pixel points in the first image and/or the second image) based on formula (10); that is, the points on the image plane are converted to the pixel coordinate system using the camera intrinsic parameters to obtain the points on the final image:
ufisheye = fx * xd + cx, vfisheye = fy * yd + cy (10).
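Putting the chain Pw → Pc → Ps → Pn → Pd → Pfisheye together, a rough Python sketch follows; it assumes the formulas as reconstructed above (including the division by R2 for the normalized sphere) and hypothetical parameter names, so it illustrates the mapping rather than reproducing the original implementation.

```python
import numpy as np

def world_to_fisheye(Pw, R, t, xi, k1, k2, p1, p2, fx, fy, cx, cy, R2=1.0):
    """Project a sphere world point Pw onto fisheye pixel coordinates,
    following the chain Pw -> Pc -> Ps -> Pn -> Pd -> Pfisheye described
    above; the equations are an illustrative assumption, not the verbatim
    patent formulas. R is a 3x3 rotation and t a translation (extrinsics)."""
    Pc = R @ np.asarray(Pw, dtype=float) + t      # world -> camera coordinates
    xs, ys, zs = Pc / R2                          # project onto the normalized sphere
    xn = xs / (zs + xi)                           # shift origin by xi and project
    yn = ys / (zs + xi)                           # onto the normalization plane (zn = 1)
    r2 = xn * xn + yn * yn                        # formula (6)
    kr = 1.0 + k1 * r2 + k2 * r2 * r2             # formula (7)
    xd = kr * xn + 2.0 * p1 * xn * yn + p2 * (r2 + 2.0 * xn * xn)   # formula (8)
    yd = kr * yn + p1 * (r2 + 2.0 * yn * yn) + 2.0 * p2 * xn * yn   # formula (9)
    u = fx * xd + cx                              # formula (10): pixel coordinates
    v = fy * yd + cy
    return u, v
```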
Thus, a third conversion relation between the coordinates of the plurality of pixel points in the first unfolded image and the coordinates of the plurality of pixel points on the mapped second spherical coordinate system can be determined; then, a fourth conversion relationship between the coordinates of the plurality of pixel points in the second spherical coordinate system and the coordinates of the plurality of pixel points in the mapped first image is determined, a fifth conversion relationship between the coordinates of the plurality of pixel points in the second spherical coordinate system and the coordinates of the plurality of pixel points in the mapped second image is determined, so that the first conversion relationship can be determined based on the third conversion relationship and the fourth conversion relationship, the second conversion relationship can be determined based on the third conversion relationship and the fifth conversion relationship, and the first image and the second image are converted based on the first conversion relationship and the second conversion relationship to obtain the first unfolded image.
Step a4, determining the first conversion relationship based on the third conversion relationship and the fourth conversion relationship, and determining the second conversion relationship based on the third conversion relationship and the fifth conversion relationship.
In this way, the first conversion relation is determined based on the third conversion relation and the fourth conversion relation, the second conversion relation is determined based on the third conversion relation and the fifth conversion relation, so that the shot first image and the shot second image can be mapped onto the second spherical coordinate system to eliminate the influence of external parameters, internal parameters and distortion parameters of the first camera and the second camera on the shooting of the first image and the second image, and the first unfolded image can be determined based on the image mapped on the second spherical coordinate system, so that the first unfolded image can accurately reflect the real scene around the camera.
S203, based on the first conversion relation and the second conversion relation, the first image and the second image are converted to obtain a first expanded image.
In the case where the first conversion relationship and the second conversion relationship are obtained, the first image and the second image may be converted into the first expanded image. The feature of the first unfolded image is a combination of the features of the first image and the second image. Through the step of S203, the two images of the first image and the second image may be merged into one image (i.e., the first unfolded image is shown), so that the user can determine the environmental information around the camera only through the first unfolded image.
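One way (an assumption, not necessarily the claimed one) to apply the first and second conversion relations is to precompute them as per-pixel lookup maps and remap each source image into the unfolded-image grid, as sketched below in Python.

```python
import numpy as np

def remap_nearest(src, map_x, map_y):
    """Sample src at the coordinates given by the lookup maps using
    nearest-neighbour interpolation. map_x/map_y hold, for every pixel of the
    unfolded image, the source-image coordinates given by a conversion
    relation (an assumed representation, for illustration only)."""
    xs = np.clip(np.rint(map_x).astype(int), 0, src.shape[1] - 1)
    ys = np.clip(np.rint(map_y).astype(int), 0, src.shape[0] - 1)
    return src[ys, xs]

# Hypothetical usage: map1_x/map1_y encode the first conversion relation and
# map2_x/map2_y the second one.
# first_sub  = remap_nearest(first_image,  map1_x, map1_y)
# second_sub = remap_nearest(second_image, map2_x, map2_y)
```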
S204, determining vertex coordinates and pixel information of the vertex on a first spherical coordinate system based on the pixel information of the first expanded image.
S205, determining the panoramic image based on the vertex coordinates and the pixel information of the vertex.
In the embodiment of the application, the first unfolded image is obtained based on the first conversion relationship between the coordinates of the first image and the coordinates of the first unfolded image and the second conversion relationship between the coordinates of the second image and the coordinates of the first unfolded image, so that the first unfolded image not only has the characteristics of the first image, but also has the characteristics of the second image, and the characteristics of the first image and the characteristics of the second image can be comprehensively reflected by the spliced first unfolded image.
Fig. 2b is a schematic diagram of a camera model provided in the embodiment of the present application. Fig. 2b schematically shows the relationship between the coordinates Pc, Ps, and Pn: as can be seen from Fig. 2b, the coordinate Pc is first converted into the coordinate Ps, and the coordinate Ps is then converted into the coordinate Pn.
Fig. 3 is a schematic flowchart of another image processing method provided in an embodiment of the present application, and as shown in fig. 3, the method is applied to an image processing apparatus or a processor, and includes:
s301, acquiring a first image shot by a first camera and a second image shot by a second camera; the included angle between the optical axis of the first camera and the optical axis of the second camera is a preset angle.
S302, determining a first conversion relation between the coordinates of the first image and the coordinates of the first unfolded image, and a second conversion relation between the coordinates of the second image and the coordinates of the first unfolded image.
S303, determining a first sub-expansion image corresponding to the first image based on the first image and the first conversion relation.
S304, determining a second sub-expansion image corresponding to the second image based on the second image and the second conversion relation.
The first sub-development image may be a two-dimensional rectangular image, and the second sub-development image may be a two-dimensional rectangular image.
In some embodiments, the photographing region corresponding to the first sub-spread image and the photographing region corresponding to the second sub-spread image may not have an overlapping portion, in which case the angle of view corresponding to the first camera when photographing the first image does not overlap the angle of view corresponding to the second camera when photographing the second image. For example, when the first camera and the second camera are disposed back to back, and the angles of view of the first camera and the second camera are both less than or equal to 180 degrees, or when the included angle between the optical axis of the first camera and the optical axis of the second camera is 150 degrees, and the angles of view of the first camera and the second camera are both less than or equal to 150 degrees, the shooting area corresponding to the first sub-unfolded image and the shooting area corresponding to the second sub-unfolded image do not have an overlapping portion, so that the first sub-unfolded image and the second sub-unfolded image can be spliced to obtain the first unfolded image.
In other embodiments, the capturing area corresponding to the first sub-spread image and the capturing area corresponding to the second sub-spread image may have an overlapping portion, in which case the first camera partially overlaps with the second camera at the angle of view corresponding to the capturing of the second image at the angle of view corresponding to the capturing of the first image. For example, when the first camera and the second camera are disposed back to back, and the field angles of the first camera and the second camera are both greater than 180 degrees (e.g., 190 degrees, 200 degrees, 220 degrees, etc.), the shooting area corresponding to the first sub-unfolded image and the shooting area corresponding to the second sub-unfolded image have an overlapping portion, so that the first sub-unfolded image and the second sub-unfolded image can be fused and spliced to obtain the first unfolded image. A specific method for fusing and splicing the first sub-expansion image and the second sub-expansion image to obtain the first expansion image may refer to S305 to S306 described below.
S305, fusing a first pixel point of the first sub-expansion image and a second pixel point of the second sub-expansion image to obtain a fusion area.
In some embodiments, the size of the pixels of the fused region may be the same as the size of the pixels of the first and second sub-unfolded images.
In some embodiments, the first pixel point corresponds to a pixel point in the second spherical coordinate system; a first connecting line connects that pixel point and the center of the second spherical coordinate system, and the included angle between the first connecting line and a second connecting line is greater than or equal to a first angle and less than or equal to a second angle; the second connecting line is the connecting line between the circle center and the position, in the second spherical coordinate system, corresponding to the center position of the first sub-expansion image;
the second pixel point corresponds to a pixel point in the second spherical coordinate system; a third connecting line connects that pixel point and the circle center, and the included angle between the third connecting line and a fourth connecting line is greater than or equal to the first angle and less than or equal to the second angle; the fourth connecting line is the connecting line between the circle center and the position, in the second spherical coordinate system, corresponding to the center position of the second sub-expansion image;
the first angle is 90 degrees minus a third angle, and the second angle is 90 degrees plus the third angle.
Thus, the included angle between the first connecting line and the second connecting line, or the included angle between the third connecting line and the fourth connecting line is greater than or equal to the value obtained by subtracting the third angle from 90 degrees and is less than or equal to the value obtained by adding the third angle to 90 degrees.
In some embodiments, the pixel value of the ith pixel point in the fusion region may be determined based on the pixel value of the ith pixel point in the first spread subimage and the pixel value of the ith pixel point in the second spread subimage. The position of the ith pixel point in the fusion region, the position of the ith pixel point in the first sub-expansion image and the position of the ith pixel point in the second sub-expansion image are the same or corresponding. i is an integer greater than or equal to 1.
In some embodiments, a weighting coefficient may be determined, and based on the weighting coefficient, a first pixel point of the first sub-expansion image and a second pixel point of the second sub-expansion image are fused to obtain the fusion region. Illustratively, the weighting factor may be a fixed value, or the weighting factor may be different based on the position of the pixel point.
By the above method, the pixel points of the first sub-expanded image and the second sub-expanded image that participate in the fusion are those whose corresponding pixel points in the second spherical coordinate system form, with the center of the second spherical coordinate system, a connecting line whose included angle with the connecting line between the circle center and the position corresponding to the center of the respective sub-expanded image is greater than or equal to the first angle and less than or equal to the second angle. In this way, fusion processing is performed only on the pixel points whose included angle lies between the first angle and the second angle, which both reduces the excessive amount of calculation that would be caused by fusing too many pixel points and avoids the unnatural transition between the first sub-expanded image and the second sub-expanded image that would be caused by fusing too few pixel points.
For example, S305 may include: determining a weighting coefficient based on an included angle between the first connection line and the second connection line, an included angle between the third connection line and the fourth connection line, and one of the second angle and the first angle; and fusing a first pixel point of the first sub-expansion image and a second pixel point of the second sub-expansion image based on the weighting coefficient to obtain the fusion region.
In some embodiments, the weighting coefficient k may be calculated as: k = (second angle - gama)/(third angle × 2). For example, when the third angle is 4 degrees, k = (94 - gama)/8. Here, gama is the included angle between the first connecting line and the second connecting line, or the included angle between the third connecting line and the fourth connecting line. In other embodiments, the weighting coefficient k may be calculated as: k = (gama - first angle)/(third angle × 2). For example, when the third angle is 4 degrees, k = (gama - 86)/8.
In some embodiments, the pixel information color of the ith pixel point in the fusion region may be determined as: color = k × color1 + (1.0 - k) × color2, where color1 is the pixel information of the first pixel point (the ith pixel point) of the first sub-unfolded image, and color2 is the pixel information of the second pixel point (the ith pixel point) of the second sub-unfolded image.
In some embodiments, since the two fisheye images (i.e., the first image and the second image) respectively captured by the two fisheye cameras each have a viewing angle greater than 180 degrees, there is a certain overlapping region between them. Because the two fisheye images are captured by two different lenses, their brightness may differ, and an obvious 'seam' may therefore appear at the stitching position. For this reason, in the embodiment of the present application, pixel fusion processing is performed within a range of 8 degrees (an empirically selected value) around the stitching position. The fusion may be implemented as follows:
Firstly, calculate the included angle gama between the vector OP from the sphere center O to the spherical point P and the negative X axis; the negative X axis corresponds to the line connecting the circle center with the position corresponding to the center of the first sub-unfolded image in the second spherical coordinate system.
Secondly, determine whether the included angle gama is within the range (86, 94); if not, no fusion processing is needed, otherwise, continue to the third step.
Thirdly, calculate the weighting coefficient k: k = (94 - gama)/8.
Fourthly, calculate the pixel information of the spherical point on each of the two fisheye images respectively by using the foregoing method, then perform weighting, and obtain the fused pixel information color of the ith pixel point in the region: color = k × color1 + (1.0 - k) × color2.
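As a rough illustration of the above fusion steps, the following Python/NumPy sketch blends the pixel information of one spherical point, assuming the included angle gama (in degrees) and the two sampled pixel values color1 and color2 are already available; the function name and its parameters are hypothetical and chosen only for this example.

import numpy as np

def fuse_pixel(color1, color2, gama, third_angle=4.0):
    # color1/color2: pixel values sampled from the first and second fisheye images
    # gama: included angle (degrees) between the OP vector and the negative X axis
    # third_angle: half-width of the fusion band around 90 degrees (8/2 = 4 here)
    color1 = np.asarray(color1, dtype=np.float32)
    color2 = np.asarray(color2, dtype=np.float32)
    first_angle = 90.0 - third_angle    # 86 degrees
    second_angle = 90.0 + third_angle   # 94 degrees
    if gama <= first_angle:             # outside the fusion band: keep the first image
        return color1
    if gama >= second_angle:            # outside the fusion band: keep the second image
        return color2
    k = (second_angle - gama) / (third_angle * 2.0)   # k = (94 - gama) / 8
    return k * color1 + (1.0 - k) * color2

For gama = 86 the weight k equals 1 (only the first image contributes), and for gama = 94 it equals 0, so the two sub-unfolded images transition linearly across the 8-degree band.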
In this way, since the weighting coefficient is determined based on the included angle between the first connection line and the second connection line and the included angle between the third connection line and the fourth connection line, and the first pixel points and the second pixel points are fused based on the weighting coefficient, the fusion area is a result of weighting the first pixel points and the second pixel points, so that the fusion area can accurately reflect a real scene.
S306, splicing a third pixel point except the first pixel point in the first sub-unfolded image, the fusion area and a fourth pixel point except the second pixel point in the second sub-unfolded image to obtain the first unfolded image.
In some embodiments, S306 may be implemented by: splicing the third pixel point of the first sub-unfolded image, the fusion area and the fourth pixel point of the second sub-unfolded image to obtain a second unfolded image; and smoothing the second unfolded image to obtain the first unfolded image.
In some embodiments, smoothing the second unfolded image to obtain the first unfolded image may include: and smoothing the non-fusion area (namely the area outside the fusion area) in the second unfolded image to obtain the first unfolded image.
In other embodiments, smoothing the second unfolded image to obtain the first unfolded image may include: and smoothing the whole second unfolded image to obtain the first unfolded image.
In some embodiments, the image may be smoothed by linear interpolation (including single linear interpolation or bilinear interpolation), neighborhood averaging, neighborhood weighted averaging, median filtering, and the like.
Illustratively, to make the image smoother, interpolation processing is performed for the non-fusion region. The interpolation may be implemented as follows: after the coordinates on the fisheye image corresponding to each pixel point of the second unfolded image are calculated, the 4 pixel points surrounding those fisheye-image coordinates are taken to carry out bilinear interpolation, so as to obtain the final pixel value.
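A minimal sketch of the bilinear interpolation step is given below, assuming the fractional fisheye-image coordinates (u, v) corresponding to a pixel of the second unfolded image have already been computed; the function name and coordinate conventions are assumptions made for this illustration.

import numpy as np

def bilinear_sample(fisheye_img, u, v):
    # fisheye_img: H x W x C image array; u is the column coordinate, v the row coordinate
    h, w = fisheye_img.shape[:2]
    u = float(np.clip(u, 0, w - 1))
    v = float(np.clip(v, 0, h - 1))
    x0, y0 = int(np.floor(u)), int(np.floor(v))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    du, dv = u - x0, v - y0
    img = fisheye_img.astype(np.float32)
    top = (1.0 - du) * img[y0, x0] + du * img[y0, x1]      # weight the two upper neighbours
    bottom = (1.0 - du) * img[y1, x0] + du * img[y1, x1]   # weight the two lower neighbours
    return (1.0 - dv) * top + dv * bottom                  # blend along the vertical direction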
In this way, the first unfolded image is obtained by smoothing the stitched second unfolded image, so that jagged artifacts in the second unfolded image can be eliminated and the obtained first unfolded image is more realistic.
And S307, determining vertex coordinates and the pixel information of the vertex on a first spherical coordinate system based on the pixel information of the first expanded image.
And S308, determining the panoramic image based on the vertex coordinates and the pixel information of the vertex.
In the embodiment of the application, the first sub-expansion image and the second sub-expansion image are determined, the first pixel point of the first sub-expansion image and the second pixel point of the second sub-expansion image are fused to obtain the fusion area, the third pixel point except the first pixel point in the first sub-expansion image, the fusion area and the fourth pixel point except the second pixel point in the second sub-expansion image are spliced to obtain the first expansion image, so that the overlapped part of the first image and the second image can be fused, the first sub-expansion image and the second sub-expansion image can be in smooth transition, and the sense of reality of the first expansion image is improved.
Fig. 4a is a schematic flowchart of another image processing method provided in an embodiment of the present application, and as shown in fig. 4a, the method is applied to an image processing apparatus or a processor, and includes:
s401, acquiring a first image shot by a first camera and a second image shot by a second camera; the included angle between the optical axis of the first camera and the optical axis of the second camera is a preset angle.
S402, converting the first image and the second image to obtain a first expanded image.
And S403, mapping a plurality of pixel points in the first unfolded image to the coordinates of the plurality of pixel points on the first spherical coordinate system, and determining the coordinates as the vertex coordinates.
The plurality of pixel points in the first expanded image may be pixel points evenly distributed on the first expanded image. In some embodiments, the first unfolded image may be divided into S1 sections along the width direction and S2 sections along the height direction, and S1 and S2 may be the same or different, so that pixel information and coordinates of S1 × S2 pixels may be obtained. For example, S1 and S2 may both be 80, so that pixel information and coordinates of 80 × 80 pixels may be obtained.
In S403, under the condition that a plurality of pixel points in the first expanded image are obtained, the plurality of pixel points in the first expanded image may be mapped onto the first spherical coordinate system, so as to obtain vertex coordinates. For example, the coordinates of 80 × 80 pixels on the first expanded image are mapped on the first spherical coordinate system, so as to obtain the coordinates (i.e., vertex coordinates) of 80 × 80 pixels mapped on the first spherical coordinate system.
The following describes calculation formulas for mapping a plurality of pixel points in the first expanded image to coordinates (x, y, z) of the plurality of pixel points (i.e., vertex coordinates) on the first spherical coordinate system, as shown in formulas (11) to (15):
θ = π/80 × i; i ∈ [0,80) or (0,80] (11);
φ = 2 × π/80 × j; j ∈ [0,80) or (0,80] (12);
x = R1 × sinθ × cosφ (13);
y = R1 × cosθ (14);
z = R1 × sinθ × sinφ (15).
wherein i and j are integers. R1 is the radius of the first spherical coordinate system. In some embodiments, 80 in equation (11) may be replaced with S1 and 80 in equation (12) may be replaced with S2.
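The following sketch maps an S1 × S2 sampling grid of the first unfolded image onto sphere vertices according to formulas (11) to (15); the concrete forms of (13) and (15) follow the standard latitude-longitude sphere mapping and, together with the function name, are assumptions for illustration. Inclusive endpoints are used for simplicity.

import numpy as np

def build_sphere_vertices(s1=80, s2=80, r1=1.0):
    # returns vertex coordinates on the first spherical coordinate system and the
    # corresponding normalized texture coordinates on the first unfolded image
    vertices, tex_coords = [], []
    for i in range(s1 + 1):                       # samples along the latitude direction
        theta = np.pi / s1 * i                    # formula (11)
        for j in range(s2 + 1):                   # samples along the longitude direction
            phi = 2.0 * np.pi / s2 * j            # formula (12)
            x = r1 * np.sin(theta) * np.cos(phi)  # formula (13)
            y = r1 * np.cos(theta)                # formula (14)
            z = r1 * np.sin(theta) * np.sin(phi)  # formula (15)
            vertices.append((x, y, z))
            tex_coords.append((j / s2, i / s1))   # position of the sample in the unfolded image
    return np.asarray(vertices, np.float32), np.asarray(tex_coords, np.float32)

The texture coordinates give, for each vertex, the pixel of the first unfolded image whose pixel information is used as the pixel information of that vertex.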
S404, determining a sixth conversion relation between the coordinates of the plurality of pixel points in the first unfolded image and the mapped coordinates of the plurality of pixel points on the first spherical coordinate system.
In some embodiments, the sixth conversion relationship may be determined based on the coordinates (width × j/80, width × i/80) in the first unfolded image and equations (6) to (10). Thus, the sixth conversion relationship may be a conversion relationship between the coordinates in the first unfolded image and the coordinates (x, y, z) of the plurality of pixel points on the first spherical coordinate system.
In other embodiments, the sixth conversion relationship may be determined based on the coordinates (width × (1 - j/80), width × i/80) of the rotated first unfolded image and equations (6) to (10). Thus, the sixth conversion relationship may be a conversion relationship between the coordinates of the rotated first unfolded image and the coordinates (x, y, z) of the plurality of pixel points on the first spherical coordinate system.
S405, determining the pixel information of the vertex based on the pixel information of the plurality of pixel points in the first unfolded image and the sixth conversion relation.
After the sixth conversion relationship is determined, each pixel point of the first unfolded image can find the corresponding pixel point of the first spherical coordinate system, so that the vertex pixel information can be the pixel information of a plurality of pixel points on the first spherical coordinate system, and the pixel information of the plurality of pixel points on the first spherical coordinate system can be the same as the pixel information of the plurality of pixel points in the first unfolded image.
And S406, determining a panoramic image based on the vertex coordinates and the pixel information of the vertex.
In the embodiment of the application, the first unfolded image is mapped to the first spherical coordinate system, so that vertex coordinates and vertex pixel information on the corresponding first spherical coordinate system can be obtained, the vertex coordinates and the vertex pixel information can comprehensively reflect the characteristics of the first unfolded image, and further, the panoramic image obtained by rendering the vertex coordinates and the vertex pixel information can truly and comprehensively represent the environmental information around the camera.
Fig. 4b is a schematic diagram of vertex positions provided in this embodiment. As shown in fig. 4b, the sphere corresponding to the first spherical coordinate system may be divided by a "longitude and latitude division method": for example, the sphere may be divided into 80 parts along the latitude direction and, at the same time, into 80 parts along the longitude direction, and all intersection points on the longitude-latitude grid are the vertices of the spherical surface (the coordinates of these vertices on the spherical surface are the vertex coordinates). Every two adjacent parts of the division in the latitude direction are split into triangles in the manner shown in fig. 4b to obtain all triangle vertex indexes. After the triangle vertex indexes are obtained, the coordinates of the triangle vertices and the pixel information of the triangle vertices can be determined based on the triangle vertex indexes, so that the panoramic image is obtained by rendering based on the coordinates of the triangle vertices, the pixel information of the triangle vertices and the MVP matrix.
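A short sketch of how the triangle vertex indexes of such a longitude-latitude grid can be generated is given below; it assumes the vertices are stored row by row as in the sampling sketch above, and the particular diagonal used to split each grid cell is an assumption (the splitting shown in fig. 4b may differ).

def build_triangle_indices(s1=80, s2=80):
    # s1: number of parts along the latitude direction, s2: along the longitude direction
    # vertices are assumed to be laid out as (s1 + 1) rows of (s2 + 1) entries
    indices = []
    cols = s2 + 1
    for i in range(s1):
        for j in range(s2):
            top_left = i * cols + j
            top_right = top_left + 1
            bottom_left = (i + 1) * cols + j
            bottom_right = bottom_left + 1
            # each cell between two adjacent latitude rows is split into two triangles
            indices.append((top_left, bottom_left, top_right))
            indices.append((top_right, bottom_left, bottom_right))
    return indices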
Fig. 5 is a flowchart illustrating an image processing method according to another embodiment of the present application, as shown in fig. 5, the method is applied to an image processing apparatus or a processor, and the method includes:
s501, acquiring a first image shot by a first camera and a second image shot by a second camera; the included angle between the optical axis of the first camera and the optical axis of the second camera is a preset angle.
S502, converting the first image and the second image to obtain a first expanded image.
S503, determining vertex coordinates and pixel information of the vertex on a first spherical coordinate system based on the pixel information of the first expansion image.
S504, in response to a sliding instruction generated by a sliding operation on a display device, determining a model-view-projection (MVP) matrix matched with the sliding operation.
In some embodiments, S504 may include: determining a displacement matrix corresponding to a slide instruction in response to the slide instruction generated for the slide operation of the display device; and determining an MVP matrix matched with the sliding operation based on the displacement matrix corresponding to the sliding instruction.
In some embodiments, determining a displacement matrix corresponding to the slide instruction comprises: and determining a first preset angle of rotation around the X axis and a second preset angle of rotation around the Y axis corresponding to the sliding instruction, and determining a displacement matrix corresponding to the sliding instruction based on the first preset angle and the second preset angle. Wherein the X-axis and the Y-axis may be on the surface of the screen or parallel to the surface of the screen.
And S505, determining the panoramic image based on the vertex coordinates, the pixel information of the vertex and the MVP matrix.
In the embodiment of the application, the MVP matrix can be determined based on the sliding operation of the user on the display device, and the panoramic image is determined based on the MVP matrix, so that the display view angle of the panoramic image can be switched based on the sliding operation of the user, and the convenience of switching the display view angle of the panoramic image by the user is improved.
Fig. 6 is a schematic flowchart of an image processing method according to another embodiment of the present application, as shown in fig. 6, the method is applied to an image processing apparatus or a processor, and the method includes:
s601, acquiring a first image shot by a first camera and a second image shot by a second camera; the included angle between the optical axis of the first camera and the optical axis of the second camera is a preset angle.
S602, converting the first image and the second image to obtain a first expanded image.
S603, determining vertex coordinates and pixel information of the vertex on a first spherical coordinate system based on the pixel information of the first expanded image.
And S604, determining the panoramic image based on the vertex coordinates and the pixel information of the vertex.
S605, determining a three-dimensional grid model corresponding to a specific object in the panoramic image.
In some embodiments, at least one setting object may be defined in advance, and in the case of determining the panoramic image, an object in the panoramic image whose attribute information is the same as that of the at least one setting object may be determined as the specific object. The attribute information of the setting object may include at least one of: name, type, model, and the like. For example, when the at least one setting object includes an apple, a pear and a mobile phone, and a pear appears in the panoramic image, the pear in the panoramic image can be determined as the specific object.
In some embodiments, the triggered one or more objects may be determined to be a particular object in response to a triggering instruction triggering the one or more objects in the panoramic image.
In some embodiments, the three-dimensional mesh model corresponding to a particular object may be generated in real-time by an image processing device or processor. For example, the image processing apparatus may generate a three-dimensional mesh model corresponding to pixel information of a specific object in the panoramic image based on the pixel information and/or contour information of the specific object.
In other embodiments, at least one three-dimensional mesh model library may be preset, and the three-dimensional mesh model corresponding to the specific object may be determined from the at least one three-dimensional mesh model library. For example, attribute information of a specific object is determined, a three-dimensional mesh model corresponding to the specific object is determined from at least one three-dimensional mesh model library based on the attribute information of the specific object, and in some embodiments, a relationship between the at least one three-dimensional mesh model library and the attribute information of the corresponding object may be stored in advance.
The three-dimensional mesh model corresponding to a particular object may be the same size as the particular object or may be different.
S606, rendering the three-dimensional grid model to obtain a rendered three-dimensional grid model.
In some embodiments, the three-dimensional mesh model may be rendered with pixel information for a particular object in the panoramic image.
S607, the rendered three-dimensional grid model is covered on a specific object in the panoramic image.
The rendered three-dimensional mesh model may be overlaid on a particular object in the panoramic image at a first perspective. In some embodiments, the first viewing angle may be a predetermined viewing angle (e.g., a front viewing angle, a bottom viewing angle, or a top viewing angle). In other embodiments, the first perspective may be the same perspective as the perspective of the particular object in the panoramic image.
In some embodiments, the user may perform a translation operation and/or a rotation operation on the rendered three-dimensional mesh model. In response to the translation instruction and/or the rotation instruction generated by such an operation, the image processing device may translate and/or rotate the rendered three-dimensional mesh model accordingly.
In the embodiment of the application, the three-dimensional grid model corresponding to the specific object is rendered, and the rendered three-dimensional grid model is covered on the specific object in the panoramic image, so that a user can conveniently know the three-dimensional characteristics of the specific object.
In the embodiment of the present application, a first image and a second image are obtained, vertex coordinates and pixel information of a vertex are determined based on the first image and the second image, and then the vertex coordinates and the pixel information of the vertex are loaded into a Graphics Processing Unit (GPU).
The amount of rotation is captured from the user's sliding operation on the mobile phone screen: the model is first rotated by a certain angle around the X axis, the object coordinate system is updated, and the model is then rotated by a certain angle around the Y axis, so as to obtain a Trans matrix. The final MVP matrix is then calculated as: MVP matrix = Projection matrix × Trans matrix × View matrix × Model matrix.
And finally, loading the MVP matrix into the GPU through an OpenGL interface.
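A minimal sketch of composing and uploading the MVP matrix is shown below, assuming NumPy and PyOpenGL are available; the uniform name uMVP, the pixel-to-degree scale factor and the exact order in which the two rotations are composed are assumptions made for this illustration.

import numpy as np
from OpenGL.GL import glGetUniformLocation, glUniformMatrix4fv, GL_TRUE

def rotation_x(deg):
    a = np.radians(deg)
    return np.array([[1, 0, 0, 0],
                     [0, np.cos(a), -np.sin(a), 0],
                     [0, np.sin(a),  np.cos(a), 0],
                     [0, 0, 0, 1]], dtype=np.float32)

def rotation_y(deg):
    a = np.radians(deg)
    return np.array([[ np.cos(a), 0, np.sin(a), 0],
                     [0, 1, 0, 0],
                     [-np.sin(a), 0, np.cos(a), 0],
                     [0, 0, 0, 1]], dtype=np.float32)

def upload_mvp(program, projection, view, model, dx_pixels, dy_pixels, scale=0.2):
    # convert the swipe to a rotation around X followed by a rotation around Y (Trans matrix)
    trans = rotation_y(dx_pixels * scale) @ rotation_x(dy_pixels * scale)
    # MVP matrix = Projection matrix x Trans matrix x View matrix x Model matrix
    mvp = projection @ trans @ view @ model
    loc = glGetUniformLocation(program, "uMVP")
    # NumPy arrays are row-major, so transpose is set to GL_TRUE when uploading
    glUniformMatrix4fv(loc, 1, GL_TRUE, mvp.astype(np.float32))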
The GPU performs rendering by calling shader programs. For example, the shader program includes an image stitching expansion code implementation, a vertex transformation implementation and a shading code implementation. The GPU can perform image rendering according to the vertex coordinates, the vertex pixel information and the MVP matrix, and accordingly a panoramic image is obtained.
In the embodiment of the application, a binocular fisheye camera is adopted to shoot in real time to obtain a panoramic image, and OpenGL is used for real-time splicing and rendering the image at a mobile terminal.
In the embodiment of the application, the scheme can be used for performing real-time preview and panoramic roaming on the images shot by the double-fisheye camera; the scheme can be used for assisting a three-dimensional reconstruction algorithm to verify the correctness of the MESH generation position; and the scheme can be used for MESH superposition rendering to improve the visual effect. Here, the MESH may be a model mesh; for example, a 3D model of a table may be placed at a corresponding position in the panoramic view.
The embodiment of the application also provides a calibration method for the cameras, which includes: preparing two calibration boards, each of which may be an 8 × 8 AprilTag board, arranging the two boards in parallel, and placing the fisheye camera device between them, so that the two boards are respectively located in the shooting pictures of the fisheye cameras cam0 and cam1. The handheld fisheye camera device is then moved with 6 degrees of freedom (6-DoF) while preview image streams are collected and Inertial Measurement Unit (IMU) data are obtained; camera calibration is then completed by using a calibration tool, so as to obtain the camera intrinsic parameters, the distortion parameters and the camera extrinsic parameters.
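The calibration is normally performed with a dedicated tool chain (AprilTag detection plus an IMU-aware calibrator); as a substitute, simplified illustration of estimating the intrinsic and distortion parameters of a single fisheye camera from detected board corners, a sketch based on OpenCV's fisheye model is given below. The input formats and the choice of flags are assumptions for this example, and extrinsic calibration between the two cameras and the IMU is not covered.

import numpy as np
import cv2

def calibrate_single_fisheye(object_points, image_points, image_size):
    # object_points: list of (N, 1, 3) float32 arrays of board corner positions (board frame)
    # image_points:  list of (N, 1, 2) float32 arrays of detected corner positions (pixels)
    # image_size:    (width, height) of the fisheye images
    K = np.zeros((3, 3))
    D = np.zeros((4, 1))
    flags = cv2.fisheye.CALIB_RECOMPUTE_EXTRINSIC + cv2.fisheye.CALIB_FIX_SKEW
    rms, K, D, rvecs, tvecs = cv2.fisheye.calibrate(
        object_points, image_points, image_size, K, D, flags=flags)
    return rms, K, D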
Based on the foregoing embodiments, the present application provides an image processing apparatus, which includes units included and modules included in the units, and can be implemented by a processor in an image processing device; of course, it may be implemented by a specific logic circuit.
Fig. 7 is a schematic diagram of a composition structure of an image processing apparatus according to an embodiment of the present application, and as shown in fig. 7, an image processing apparatus 700 includes:
an acquiring unit 701, configured to acquire a first image captured by a first camera and a second image captured by a second camera; an included angle between the optical axis of the first camera and the optical axis of the second camera is a preset angle;
a conversion unit 702, configured to perform conversion processing on the first image and the second image to obtain a first expanded image;
a determining unit 703 configured to determine, based on pixel information of the first expanded image, a vertex coordinate on a first spherical coordinate system and pixel information of the vertex;
the determining unit 703 is further configured to determine a panoramic image based on the vertex coordinates and the pixel information of the vertex.
In some embodiments, the conversion unit 702 is further configured to:
determining a first transformation relationship between the coordinates of the first image and the coordinates of the first unfolded image, and a second transformation relationship between the coordinates of the second image and the coordinates of the first unfolded image;
and performing conversion processing on the first image and the second image based on the first conversion relation and the second conversion relation to obtain the first expanded image.
In some embodiments, the conversion unit 702 is further configured to:
determining a third conversion relation between the coordinates of a plurality of pixel points in the first unfolded image and the coordinates of a plurality of pixel points on the mapped second spherical coordinate system;
determining a fourth conversion relation between the coordinates of the plurality of pixel points on the second spherical coordinate system and the coordinates of the plurality of pixel points in the mapped first image;
determining a fifth conversion relation between the coordinates of the plurality of pixel points on the second spherical coordinate system and the coordinates of the plurality of pixel points in the mapped second image;
determining the first conversion relationship based on the third conversion relationship and the fourth conversion relationship, and determining the second conversion relationship based on the third conversion relationship and the fifth conversion relationship.
In some embodiments, the conversion unit 702 is further configured to:
determining a first sub-expansion image corresponding to the first image based on the first image and the first conversion relation;
determining a second sub-expansion image corresponding to the second image based on the second image and the second conversion relation;
fusing a first pixel point of the first sub-unfolded image and a second pixel point of the second sub-unfolded image to obtain a fused area;
and splicing a third pixel point except the first pixel point in the first sub-unfolded image, the fusion area and a fourth pixel point except the second pixel point in the second sub-unfolded image to obtain the first unfolded image.
In some embodiments, the first pixel point corresponds to a first connecting line between a pixel point in the second spherical coordinate system and the center of the second spherical coordinate system, and an included angle between the first connecting line and the second connecting line is greater than or equal to a first angle and less than or equal to a second angle; the second connecting line is a connecting line between the position of the center position of the first sub-expansion image in the second spherical coordinate system and the circle center;
the second pixel point corresponds to a third connecting line between a pixel point in the second spherical coordinate system and the circle center, and an included angle between the third connecting line and a fourth connecting line is greater than or equal to the first angle and less than or equal to the second angle; the fourth connecting line is a connecting line between the position corresponding to the center of the second sub-expansion image in the second spherical coordinate system and the circle center;
the first angle is 90 degrees minus a third angle, and the second angle is 90 degrees plus the third angle.
In some embodiments, the conversion unit 702 is further configured to:
determining a weighting coefficient based on an included angle between the first connection line and the second connection line, an included angle between the third connection line and the fourth connection line, and one of the second angle and the first angle;
and fusing a first pixel point of the first sub-expansion image and a second pixel point of the second sub-expansion image based on the weighting coefficient to obtain the fusion region.
In some embodiments, the conversion unit 702 is further configured to:
splicing the third pixel point of the first sub-unfolded image, the fusion area and the fourth pixel point of the second sub-unfolded image to obtain a second unfolded image;
and smoothing the second unfolded image to obtain the first unfolded image.
In some embodiments, the determining unit 703 is further configured to:
mapping a plurality of pixel points in the first unfolded image to coordinates of a plurality of pixel points on the first spherical coordinate system, and determining the coordinates as the vertex coordinates;
determining a sixth conversion relation between the coordinates of the plurality of pixel points in the first unfolded image and the coordinates of the plurality of pixel points on the mapped first spherical coordinate system;
and determining the pixel information of the vertex based on the pixel information of a plurality of pixel points in the first expanded image and the sixth conversion relation.
In some embodiments, the determining unit 703 is further configured to:
in response to a sliding instruction generated for a sliding operation of a display device, determining a model observation projection MVP matrix matched with the sliding operation;
and determining the panoramic image based on the vertex coordinates, the pixel information of the vertex and the MVP matrix.
In some embodiments, the image processing apparatus 700 further comprises: a covering unit 704; the covering unit is used for:
determining a three-dimensional mesh model corresponding to a specific object in the panoramic image;
rendering the three-dimensional grid model to obtain a rendered three-dimensional grid model;
overlaying the rendered three-dimensional mesh model on a particular object in the panoramic image.
The above description of the apparatus embodiments, similar to the above description of the method embodiments, has similar beneficial effects as the method embodiments. For technical details not disclosed in the embodiments of the apparatus of the present application, reference is made to the description of the embodiments of the method of the present application for understanding.
It should be noted that, in the embodiment of the present application, if the image processing method is implemented in the form of a software functional module and sold or used as a standalone product, the image processing method may also be stored in a computer storage medium. Based on such understanding, the technical solutions of the embodiments of the present application may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing an image processing apparatus to execute all or part of the methods described in the embodiments of the present application.
Fig. 8 is a schematic diagram of a hardware entity of an image processing apparatus according to an embodiment of the present application, and as shown in fig. 8, the hardware entity of the image processing apparatus 800 includes: a processor 801 and a memory 802, wherein the memory 802 stores a computer program operable on the processor 801, and the processor 801 executes the computer program to implement the image processing method according to any of the above embodiments.
The Memory 802 stores a computer program operable on the processor, and the Memory 802 is configured to store instructions and applications executable by the processor 801, and may also buffer data (e.g., image data, audio data, voice communication data, and video communication data) to be processed or already processed by the processor 801 and modules in the image processing apparatus 800, and may be implemented by a FLASH Memory (FLASH) or a Random Access Memory (RAM).
The processor 801, when executing the program, implements the steps of the image processing method of any of the above. The processor 801 generally controls the overall operation of the image processing apparatus 800.
The present embodiments provide a computer storage medium storing one or more programs, which are executable by one or more processors to implement the steps of the image processing method according to any one of the above embodiments.
Here, it should be noted that: the above description of the storage medium and device embodiments is similar to the description of the method embodiments above, with similar advantageous effects as the method embodiments. For technical details not disclosed in the embodiments of the storage medium and apparatus of the present application, reference is made to the description of the embodiments of the method of the present application for understanding.
The image processing apparatus or processor may comprise an integration of any one or more of: an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a graphics Processor, an embedded neural Network Processor (NPU), a controller, a microcontroller, a microprocessor, a Programmable Logic Device, a discrete Gate or transistor Logic Device, and discrete hardware components. It is understood that the electronic device implementing the above-mentioned processor function may be other electronic devices, and the embodiments of the present application are not particularly limited.
The computer storage medium/Memory may be a Read Only Memory (ROM), a Programmable Read Only Memory (PROM), an Erasable Programmable Read Only Memory (EPROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a magnetic Random Access Memory (FRAM), a Flash Memory (Flash Memory), a magnetic surface Memory, an optical Disc, or a Compact Disc Read-Only Memory (CD-ROM), etc.; but may also be various terminals such as mobile phones, computers, tablet devices, personal digital assistants, etc., that include one or any combination of the above-mentioned memories.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment of the present application" or "a previous embodiment" or "some implementations" or "some embodiments" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" or "an embodiment of the present application" or "the preceding embodiments" or "some embodiments" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application. The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
Without being specifically described, the image processing apparatus executes any step in the embodiments of the present application, and the processor of the image processing apparatus may execute the step. Unless otherwise specified, the present embodiment does not limit the order in which the image processing apparatus performs the following steps. In addition, the data may be processed in the same way or in different ways in different embodiments. It should be further noted that any step in the embodiments of the present application may be executed by the image processing apparatus independently, that is, when the image processing apparatus executes any step in the above embodiments, the image processing apparatus may not depend on the execution of other steps.
In the description of the present application, it is to be understood that the terms "center," "longitudinal," "lateral," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," "counterclockwise," and the like are used in the orientations and positional relationships indicated in the drawings for convenience in describing the present application and for simplicity in description, and are not intended to indicate or imply that the referenced devices or elements must have a particular orientation, be constructed in a particular orientation, and be operated in a particular manner, and are not to be construed as limiting the present application. Furthermore, the terms "first", "second" and "first" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as "first", "second", may explicitly or implicitly include one or more of the described features. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
In the description of the present application, it is to be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; may be mechanically connected, may be electrically connected or may be in communication with each other; either directly or indirectly through intervening media, either internally or in any other relationship. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as appropriate.
In this application, unless expressly stated or limited otherwise, the first feature "on" or "under" the second feature may comprise direct contact of the first and second features, or may comprise contact of the first and second features not directly but through another feature in between. Also, the first feature being "on," "above" and "over" the second feature includes the first feature being directly on and obliquely above the second feature, or merely indicating that the first feature is at a higher level than the second feature. A first feature being "under," "below," and "beneath" a second feature includes the first feature being directly under and obliquely below the second feature, or simply meaning that the first feature is at a lesser elevation than the second feature.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; can be located in one place or distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The methods disclosed in the several method embodiments provided in the present application may be combined arbitrarily without conflict to obtain new method embodiments.
Features disclosed in several of the product embodiments provided in the present application may be combined in any combination to yield new product embodiments without conflict.
The features disclosed in the several method or apparatus embodiments provided in the present application may be combined arbitrarily, without conflict, to arrive at new method embodiments or apparatus embodiments.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: various media that can store program codes, such as a removable Memory device, a Read Only Memory (ROM), a magnetic disk, or an optical disk.
Alternatively, the integrated units described above in this application may be stored in a computer storage medium if they are implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially implemented or portions thereof contributing to the related art may be embodied in the form of a software product stored in a storage medium, and including several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a removable storage device, a ROM, a magnetic or optical disk, or other various media that can store program code.
In the embodiments of the present application, the descriptions of the same steps and the same contents in different embodiments may refer to one another. In the embodiments of the present application, the term "and" does not imply an order of the steps; for example, the image processing apparatus may execute A and then execute B, execute B and then execute A, or execute A and B simultaneously.
It is to be noted that the drawings in the embodiments of the present application are only for illustrating schematic positions of the respective devices on the image processing apparatus and do not represent actual positions in the image processing apparatus, the actual positions of the respective devices or the respective areas may be changed or shifted accordingly depending on actual conditions (for example, the structure of the image processing apparatus), and the scale of different parts in the image processing apparatus in the drawings does not represent the actual scale.
As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be understood that the term "and/or" as used herein merely describes an association between associated objects, meaning that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
In the embodiments of the present application, all the steps or some of the steps may be performed as long as a complete technical solution can be formed.
The above description is only for the embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (13)

1. An image processing method, comprising:
acquiring a first image shot by a first camera and a second image shot by a second camera; an included angle between the optical axis of the first camera and the optical axis of the second camera is a preset angle;
performing conversion processing on the first image and the second image to obtain a first expanded image;
determining vertex coordinates and pixel information of the vertex on a first spherical coordinate system based on the pixel information of the first unfolded image;
determining a panoramic image based on the vertex coordinates and the pixel information of the vertex.
2. The method of claim 1, wherein the transforming the first image and the second image to obtain a first unfolded image comprises:
determining a first transformation relationship between the coordinates of the first image and the coordinates of the first unfolded image, and a second transformation relationship between the coordinates of the second image and the coordinates of the first unfolded image;
and performing conversion processing on the first image and the second image based on the first conversion relation and the second conversion relation to obtain the first expanded image.
3. The method of claim 2, wherein the determining a first transformation relationship between the coordinates of the first image and the coordinates of the first unfolded image and a second transformation relationship between the coordinates of the second image and the coordinates of the first unfolded image comprises:
determining a third conversion relation between the coordinates of a plurality of pixel points in the first unfolded image and the coordinates of a plurality of pixel points on the mapped second spherical coordinate system;
determining a fourth conversion relation between the coordinates of the plurality of pixel points on the second spherical coordinate system and the coordinates of the plurality of pixel points in the mapped first image;
determining a fifth conversion relation between the coordinates of the plurality of pixel points on the second spherical coordinate system and the coordinates of the plurality of pixel points in the mapped second image;
determining the first conversion relationship based on the third conversion relationship and the fourth conversion relationship, and determining the second conversion relationship based on the third conversion relationship and the fifth conversion relationship.
4. The method according to claim 2 or 3, wherein the converting the first image and the second image based on the first conversion relation and the second conversion relation to obtain the first unfolded image comprises:
determining a first sub-expansion image corresponding to the first image based on the first image and the first conversion relation;
determining a second sub-expansion image corresponding to the second image based on the second image and the second conversion relation;
fusing a first pixel point of the first sub-unfolded image and a second pixel point of the second sub-unfolded image to obtain a fused area;
and splicing a third pixel point except the first pixel point in the first sub-unfolded image, the fusion area and a fourth pixel point except the second pixel point in the second sub-unfolded image to obtain the first unfolded image.
5. The method according to claim 4, wherein the first pixel point corresponds to a first connecting line between a pixel point in the second spherical coordinate system and a center of the second spherical coordinate system, and an included angle between the first connecting line and the second connecting line is greater than or equal to a first angle and less than or equal to a second angle; the second connecting line is a connecting line between the position of the center position of the first sub-expansion image in the second spherical coordinate system and the circle center;
the second pixel point corresponds to a third connecting line between a pixel point in the second spherical coordinate system and the circle center, and an included angle between the third connecting line and a fourth connecting line is greater than or equal to the first angle and less than or equal to the second angle; the fourth connecting line is a connecting line between the position corresponding to the center of the second sub-expansion image in the second spherical coordinate system and the circle center;
the first angle is 90 degrees minus a third angle, and the second angle is 90 degrees plus the third angle.
6. The method of claim 5, wherein the fusing the first pixel point of the first sub-unfolded image and the second pixel point of the second sub-unfolded image to obtain a fused region comprises:
determining a weighting coefficient based on an included angle between the first connection line and the second connection line, an included angle between the third connection line and the fourth connection line, and one of the second angle and the first angle;
and fusing a first pixel point of the first sub-expansion image and a second pixel point of the second sub-expansion image based on the weighting coefficient to obtain the fusion region.
7. The method according to any one of claims 4 to 6, wherein the stitching a third pixel point outside the first pixel point in the first sub-unfolded image, the fusion region, and a fourth pixel point outside the second pixel point in the second sub-unfolded image to obtain the first unfolded image includes:
splicing the third pixel point of the first sub-unfolded image, the fusion area and the fourth pixel point of the second sub-unfolded image to obtain a second unfolded image;
and smoothing the second unfolded image to obtain the first unfolded image.
8. The method of any of claims 1 to 7, wherein the determining vertex coordinates and the pixel information for the vertex in a first spherical coordinate system based on the pixel information for the first unfolded image comprises:
mapping a plurality of pixel points in the first unfolded image to coordinates of a plurality of pixel points on the first spherical coordinate system, and determining the coordinates as the vertex coordinates;
determining a sixth conversion relation between the coordinates of the plurality of pixel points in the first unfolded image and the coordinates of the plurality of pixel points on the mapped first spherical coordinate system;
and determining the pixel information of the vertex based on the pixel information of a plurality of pixel points in the first expanded image and the sixth conversion relation.
9. The method of any of claims 1 to 8, wherein the determining a panoramic image based on the vertex coordinates and the pixel information of the vertex comprises:
in response to a sliding instruction generated for a sliding operation of a display device, determining a model observation projection MVP matrix matched with the sliding operation;
and determining the panoramic image based on the vertex coordinates, the pixel information of the vertex and the MVP matrix.
10. The method of any of claims 1 to 9, wherein after determining the panoramic image based on the vertex coordinates and the pixel information of the vertex, the method further comprises:
determining a three-dimensional mesh model corresponding to a specific object in the panoramic image;
rendering the three-dimensional grid model to obtain a rendered three-dimensional grid model;
overlaying the rendered three-dimensional mesh model on a particular object in the panoramic image.
11. An image processing apparatus comprising:
the device comprises an acquisition unit, a processing unit and a display unit, wherein the acquisition unit is used for acquiring a first image shot by a first camera and a second image shot by a second camera; an included angle between the optical axis of the first camera and the optical axis of the second camera is a preset angle;
the conversion unit is used for carrying out conversion processing on the first image and the second image to obtain a first expanded image;
a determination unit configured to determine a vertex coordinate on a first spherical coordinate system and pixel information of the vertex based on pixel information of the first expanded image;
the determining unit is further configured to determine a panoramic image based on the vertex coordinates and the pixel information of the vertex.
12. An image processing apparatus comprising: a memory and a processor, wherein the processor is capable of,
the memory stores a computer program operable on the processor,
the processor, when executing the computer program, implements the image processing method of any of claims 1 to 10.
13. A computer storage medium storing one or more programs executable by one or more processors to implement the image processing method of any one of claims 1 to 10.
CN202210110821.5A 2022-01-29 2022-01-29 Image processing method, device, equipment and computer storage medium Withdrawn CN114511447A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210110821.5A CN114511447A (en) 2022-01-29 2022-01-29 Image processing method, device, equipment and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210110821.5A CN114511447A (en) 2022-01-29 2022-01-29 Image processing method, device, equipment and computer storage medium

Publications (1)

Publication Number Publication Date
CN114511447A true CN114511447A (en) 2022-05-17

Family

ID=81552530

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210110821.5A Withdrawn CN114511447A (en) 2022-01-29 2022-01-29 Image processing method, device, equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN114511447A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115100026A (en) * 2022-06-15 2022-09-23 佳都科技集团股份有限公司 Label coordinate conversion method, device and equipment based on target object and storage medium
CN116823936A (en) * 2023-08-28 2023-09-29 智广海联(天津)大数据技术有限公司 Method and system for acquiring longitude and latitude by using camera screen punctuation

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115100026A (en) * 2022-06-15 2022-09-23 佳都科技集团股份有限公司 Label coordinate conversion method, device and equipment based on target object and storage medium
CN115100026B (en) * 2022-06-15 2023-07-14 佳都科技集团股份有限公司 Label coordinate conversion method, device, equipment and storage medium based on target object
CN116823936A (en) * 2023-08-28 2023-09-29 智广海联(天津)大数据技术有限公司 Method and system for acquiring longitude and latitude by using camera screen punctuation
CN116823936B (en) * 2023-08-28 2023-11-24 智广海联(天津)大数据技术有限公司 Method and system for acquiring longitude and latitude by using camera screen punctuation

Similar Documents

Publication Publication Date Title
CN113382168B (en) Apparatus and method for storing overlapping regions of imaging data to produce an optimized stitched image
CN111857329B (en) Method, device and equipment for calculating fixation point
US7570280B2 (en) Image providing method and device
TWI387936B (en) A video conversion device, a recorded recording medium, a semiconductor integrated circuit, a fish-eye monitoring system, and an image conversion method
JP6201476B2 (en) Free viewpoint image capturing apparatus and method
EP2328125B1 (en) Image splicing method and device
JP5676092B2 (en) Panorama image generation method and panorama image generation program
US20180018807A1 (en) Method and apparatus for generating panoramic image with texture mapping
JP6672315B2 (en) Image generation device and image display control device
JPWO2018235163A1 (en) Calibration apparatus, calibration chart, chart pattern generation apparatus, and calibration method
CN110868541B (en) Visual field fusion method and device, storage medium and terminal
CN114511447A (en) Image processing method, device, equipment and computer storage medium
CN113643414B (en) Three-dimensional image generation method and device, electronic equipment and storage medium
US20030117675A1 (en) Curved image conversion method and record medium where this method for converting curved image is recorded
CN114549289A (en) Image processing method, image processing device, electronic equipment and computer storage medium
GB2561368A (en) Methods and apparatuses for determining positions of multi-directional image capture apparatuses
CN114004890B (en) Attitude determination method and apparatus, electronic device, and storage medium
US20090059018A1 (en) Navigation assisted mosaic photography
JP5245963B2 (en) Image converter
JP6719596B2 (en) Image generation device and image display control device
JP3387900B2 (en) Image processing method and apparatus
JP2005275789A (en) Three-dimensional structure extraction method
WO2018150086A2 (en) Methods and apparatuses for determining positions of multi-directional image capture apparatuses
TW201911239A (en) Method and apparatus for generating three-dimensional panoramic video
CN114143528B (en) Multi-video stream fusion method, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication
WW01 Invention patent application withdrawn after publication

Application publication date: 20220517