CN116012227A - Image processing method, device, storage medium and processor - Google Patents


Info

Publication number
CN116012227A
CN116012227A
Authority
CN
China
Prior art keywords
image
pixel
point
points
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211741170.6A
Other languages
Chinese (zh)
Inventor
杨涛
范卿
付玲
徐柏科
许培培
吴帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zoomlion Heavy Industry Science and Technology Co Ltd
Original Assignee
Zoomlion Heavy Industry Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zoomlion Heavy Industry Science and Technology Co Ltd filed Critical Zoomlion Heavy Industry Science and Technology Co Ltd
Priority to CN202211741170.6A priority Critical patent/CN116012227A/en
Publication of CN116012227A publication Critical patent/CN116012227A/en
Pending legal-status Critical Current

Landscapes

  • Image Processing (AREA)

Abstract

The embodiment of the application provides an image processing method, an image processing device, a storage medium and a processor. The method comprises the following steps: acquiring a plurality of area images and a plurality of point cloud data of the target scene, and processing the plurality of area images to obtain a plurality of splicing matrices; converting the point cloud data according to the device parameters of the camera and the motion parameters of the cradle head to obtain a depth-like image corresponding to each area image; generating a spliced RGB image according to the first pixel point set determined by the plurality of splicing matrices, and generating a panoramic depth image according to the second pixel point set; and determining the point cloud data of each pixel point included in the spliced RGB image according to the corresponding relation between the panoramic depth image and the spliced RGB image. By utilizing the corresponding relation between the RGB image and the depth-like image, point cloud splicing is converted into image splicing, so that the correspondence between the two-dimensional image and the three-dimensional coordinates is realized, any position in the image can be accurately located, and the visibility of the image after scene reconstruction is improved.

Description

Image processing method, device, storage medium and processor
Technical Field
The present invention relates to the field of image processing, and in particular, to an image processing method, an image processing device, a storage medium, and a processor.
Background
In intelligent operation, locating the operation target is the primary problem for operation equipment, for example in automatic lifting by a crane, automatic pumping by a pump truck, or automatic fire extinguishing by a fire truck. In the prior art, because the hardware detection distance is limited, the error of the acquired image increases nonlinearly and substantially as the detection distance increases, so that the visibility of the reconstructed image is insufficient.
Disclosure of Invention
An object of an embodiment of the application is to provide an image processing method, an image processing device, a storage medium and a processor.
In order to achieve the above object, a first aspect of the present application provides an image processing method, including:
acquiring a plurality of area images of a target scene and point cloud data of each area image;
aiming at each area image, constructing a class depth image corresponding to each area image according to the point cloud data, wherein a corresponding relation exists between pixel points in the class depth image and pixel points in the area image on pixel positions;
determining a splicing matrix corresponding to each area image according to the information of the pixel points of the area image;
and splicing the plurality of class depth images according to a plurality of splicing matrixes corresponding to the plurality of region images to obtain a panoramic class depth image after splicing.
A second aspect of the present application provides a processor configured to perform the above-described image processing method.
A third aspect of the present application provides an image processing apparatus, comprising:
the camera is fixed on the cradle head and is used for acquiring a plurality of area images of the target scene and point cloud data of each area image;
the radar is fixed on the cradle head, connected with the camera and used for acquiring point cloud data of a target scene;
the cradle head is used for adjusting shooting angles of the camera and the radar; and
a processor configured to perform the image processing method described above.
A fourth aspect of the present application provides a machine-readable storage medium having stored thereon instructions which, when executed by a processor, cause the processor to be configured to perform the above-described image processing method.
According to the technical scheme, a plurality of area images of the target scene and point cloud data of each area image are acquired; for each area image, a class depth image corresponding to the area image is constructed according to the point cloud data, wherein a corresponding relation exists between the pixel points in the class depth image and the pixel points in the area image on pixel positions; a splicing matrix corresponding to each area image is determined according to the information of the pixel points of the area image; and the plurality of class depth images are spliced according to the plurality of splicing matrices corresponding to the plurality of region images to obtain a panoramic class depth image after splicing. Point cloud stitching is converted into image stitching by utilizing the corresponding relation between the RGB image and the class depth image, so that the correspondence between the two-dimensional image and the three-dimensional coordinates is realized, the error after image stitching is reduced, and the visibility of the image after scene reconstruction is improved.
Additional features and advantages of embodiments of the present application will be set forth in the detailed description that follows.
Drawings
The accompanying drawings are included to provide a further understanding of embodiments of the present application and are incorporated in and constitute a part of this specification, illustrate embodiments of the present application and together with the description serve to explain, without limitation, the embodiments of the present application. In the drawings:
fig. 1 schematically shows a flow diagram of an image processing method according to an embodiment of the present application;
FIG. 2 schematically illustrates a schematic diagram of stitched RGB images according to an embodiment of the present application;
FIG. 3 schematically illustrates a schematic diagram of correspondence of stitched RGB images with panoramic-like depth images according to an embodiment of the present application;
FIG. 4 schematically illustrates a step schematic of a first joint processing operation according to an embodiment of the present application;
FIG. 5 schematically illustrates a schematic view of a stitched RGB image after stitching according to an embodiment of the present application;
FIG. 6 schematically illustrates a step schematic of a second joint processing operation according to an embodiment of the present application;
fig. 7 schematically shows a block diagram of the structure of an image processing apparatus according to an embodiment of the present application;
fig. 8 schematically shows an internal structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it should be understood that the specific implementations described herein are only for illustrating and explaining the embodiments of the present application, and are not intended to limit the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present application based on the embodiments herein.
Fig. 1 schematically shows a flow diagram of an image processing method according to an embodiment of the present application. As shown in fig. 1, in an embodiment of the present application, an image processing method is provided, including steps S102 to S108. The image processing method may be performed by a processor.
S102, acquiring a plurality of area images of the target scene and point cloud data of each area image.
The target scene refers to a space region for performing scene reconstruction or three-dimensional reconstruction as required in various fields such as engineering construction, digital city construction, three-dimensional topographic map drawing, city land planning and management, virtual tourism, street space analysis and the like. The region image refers to an RGB image of a partial spatial region of the target scene. The processor may acquire a plurality of region images of the target scene and point cloud data for each region image. The point cloud data refers to a set of point cloud points in a three-dimensional coordinate system, and each point contains three-dimensional coordinates, or may also contain color information or reflection intensity information.
S104, constructing a class depth image corresponding to each region image according to the point cloud data, wherein the pixel points in the class depth image and the pixel points in the region image have a corresponding relation in pixel positions.
And the pixel point of each area image has corresponding point cloud data. The processor may construct a class depth image corresponding to each region image from the point cloud data. Specifically, the point cloud coordinates of each pixel point in the area image may be converted into pixel coordinates and pixel values having depth information, thereby obtaining a depth-like image corresponding to each area image. A depth-like image refers to an image in which the distance (depth) of the camera to points in the scene is taken as pixel values, which directly reflects the geometry of the visible surface of the scene.
S106, determining a splicing matrix corresponding to the region image according to the information of the pixel points of the region image for each region image.
Based on the pixel coordinates and pixel values of the pixels of the area images, the processor may process each area image separately to obtain a stitching matrix for each area image. The stitching matrix refers to the matrix parameters used to implement image stitching with a matrix: building on software-based stitching, the stitching function is embedded as a matrix operation, so that the stitching effect is finally achieved by applying the matrix. Each region image corresponds to one stitching matrix.
And S108, splicing the plurality of class depth images according to a plurality of splicing matrixes corresponding to the plurality of region images to obtain a panoramic class depth image after splicing.
Based on the stitching matrix corresponding to the region images, the processor may perform matrix stitching on the plurality of class depth images. The pixel points in each class depth image are converted through the splicing matrix, the original pixel coordinates of each pixel point in the class depth image are converted into new pixel coordinates, and the pixel value of each pixel point of the class depth image is still retained. Then, for the class depth images, the processor may use the converted pixel coordinates and the pixel values of the plurality of pixel points as the set of pixel points corresponding to the image stitching of the plurality of class depth images. The processor may generate a panoramic class depth image from this set of pixel points. The panoramic class depth image refers to an image obtained by stitching a plurality of class depth images of the target scene. In this way, point cloud stitching is converted into image stitching by utilizing the corresponding relation between the RGB image and the class depth image, so that the correspondence between the two-dimensional image and the three-dimensional coordinates is realized, the error after image stitching is reduced, and the visibility of the image after scene reconstruction is improved.
In one embodiment, the method further comprises: and splicing the plurality of area images according to a plurality of splicing matrixes corresponding to the plurality of area images to obtain spliced RGB images.
Based on a plurality of stitching matrices corresponding to the plurality of region images, the processor may stitch the plurality of region images to generate a stitched RGB image. As shown in fig. 2, the stitched RGB image refers to a panoramic RGB image stitched by a plurality of area images of a target scene. For example, the processor may determine a first set of pixel points corresponding to a plurality of region images from a plurality of stitching matrices. Specifically, the processor may describe individual pixel points in each region image in terms of pixel coordinates and pixel values. The processor may process each area image to obtain a stitching matrix corresponding to each area image. Based on these stitching matrices, the processor may matrix stitch the plurality of region images. The matrix stitching process comprises the steps of converting pixel points in each area image through a stitching matrix, and converting original pixel coordinates of each pixel point in the area image into new pixel coordinates. The new pixel coordinate is the first pixel coordinate. The pixel value of each pixel point is unchanged in the matrix splicing process, and the first pixel value is still the original pixel value. The processor may convert the first pixel coordinates and the first pixel values of the plurality of pixel points into a first pixel point set corresponding to the plurality of region images after image stitching. The first pixel point set comprises a plurality of pixel points after the regional image conversion.
Because each pixel of the region image is in one-to-one correspondence with each pixel of the class depth image, and the pixels of the spliced RGB image and of the panoramic class depth image are determined by the same splicing matrix, each pixel of the spliced RGB image is in one-to-one correspondence with each pixel of the panoramic class depth image. With this scheme, splicing errors are eliminated and positioning accuracy is improved. Fig. 3 schematically shows the correspondence between a stitched RGB image and a panoramic class depth image according to an embodiment of the application. The processor may determine the point cloud data of each pixel included in the stitched RGB image according to the correspondence between the panoramic class depth image and the stitched RGB image. A technician can thus obtain a stitched RGB image of the target scene, composed of a plurality of region images, with the camera as the center point. Moreover, based on the pixel coordinates (u, v) of any pixel point in the spliced RGB image, the depth information (x, y, z) of that point can be acquired at the same time, which meets the requirements of large-range accurate positioning and operation of engineering machinery.
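To make the use of this correspondence concrete, the following sketch shows how point cloud data could be looked up for a pixel of the spliced RGB image; it assumes the panoramic class depth information has been stored alongside a per-pixel point cloud map of the same resolution, and all array names and shapes are illustrative rather than taken from the patent.

```python
import numpy as np

def lookup_point_cloud(stitched_rgb: np.ndarray,
                       panoramic_cloud: np.ndarray,
                       u: int, v: int):
    """Return the color and the 3D point (x, y, z) for pixel (u, v).

    stitched_rgb    : (H, W, 3) spliced RGB image.
    panoramic_cloud : (H, W, 3) per-pixel point cloud map aligned one-to-one
                      with the panoramic class depth image / spliced RGB image.
    Because both images were built with the same splicing matrices, the pixel
    at (u, v) in one image corresponds to the pixel at (u, v) in the other.
    """
    rgb = stitched_rgb[v, u]      # row index is the ordinate v, column index the abscissa u
    xyz = panoramic_cloud[v, u]   # depth information (x, y, z) of the same point
    return rgb, xyz
```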
In one embodiment, the device parameters of the camera include an internal reference matrix and an external reference matrix, the point cloud data of each area image includes a plurality of point cloud coordinates, for each area image, constructing a depth-like image corresponding to each area image according to the point cloud data, and the correspondence between the pixel points in the depth-like image and the pixel points in the area image in pixel positions includes: converting each point cloud coordinate according to the internal reference matrix and the external reference matrix to obtain a second homogeneous coordinate corresponding to each point cloud coordinate after conversion, wherein the second homogeneous coordinate refers to a coordinate under a coordinate system established by taking a camera as an origin; determining a second pixel coordinate and a second pixel value of a pixel corresponding to each point cloud coordinate according to each second homogeneous coordinate and the motion parameter corresponding to each area image; and generating a depth-like image corresponding to each region image according to the second pixel coordinates and the second pixel values of each pixel.
In order to reconstruct a target scene, the camera may be mounted on a pan-tilt, which may be a multi-axis pan-tilt. Generally, the camera and pan-tilt may be mounted on a carrier device of the work machine, such as a crane boom tip, a tower crane jib, or the top joint of a pump truck boom. The cradle head can rotate to adjust the shooting angle of the camera, so that the camera can acquire area images of all view angles of the target scene. The camera may be an RGB camera, which can capture color images. The radar may be used to acquire point cloud data and the depth information of objects. The device parameters of the camera refer to focal length, shutter, aperture, sensitivity, internal parameters, external parameters, distortion parameters, and the like. The motion parameters of the cradle head refer to its rotation angle, motion matrix and the like. The processor may obtain the device parameters of the camera and the motion parameters of the pan-tilt.
The internal reference matrix is a 3×3 matrix for transforming 3D camera coordinates to 2D homogeneous image coordinates, and includes parameters such as focal length, principal point offset and axis tilt. The external reference matrix is a matrix describing the position of the camera and its pointing direction in the world coordinate system, and comprises a rotation matrix R and a translation vector t. The vector t describes the position of the world coordinate system origin in the camera coordinate system, and R represents the direction of the world coordinate system axes in the camera coordinate system. The processor may convert the point cloud data of each region image according to the device parameters and the motion parameters to obtain a depth-like image corresponding to each region image. Specifically, the processor may convert each point cloud coordinate according to the internal reference matrix and the external reference matrix to obtain the second homogeneous coordinate corresponding to each converted point cloud coordinate. The conversion is an affine transformation: an additional coordinate is appended to the three-dimensional point cloud coordinates so that coordinate translation can be expressed, while the scaling and rotation operations of multiplying the three-dimensional vector by a matrix are retained. The second homogeneous coordinate is the coordinate of the transformed point cloud coordinate in a coordinate system established with the camera as the origin, namely the camera coordinate system.
Further, the processor may determine a second pixel coordinate and a second pixel value of the pixel point corresponding to each point cloud coordinate according to each second homogeneous coordinate and the motion parameter corresponding to each area image. For each region image, each pixel point in the image corresponds to a point cloud coordinate. After converting the point cloud coordinates into second homogeneous coordinates, the processor can convert the second homogeneous coordinates into second pixel coordinates according to the motion parameters of the cradle head when the camera collects the region image, and save the scale factor representing the depth as the second pixel value of the depth image. In this way, conversion of three-dimensional point cloud data into two-dimensional image data can be achieved. The processor may generate a depth-like image corresponding to each region image from the second pixel coordinates and the second pixel values of each pixel point.
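The following is a minimal sketch of this point-cloud-to-depth-like-image conversion under a simple pinhole model; for brevity the cradle head motion matrices discussed below are assumed to be folded into the extrinsic parameters, and all names are illustrative.

```python
import numpy as np

def build_depth_like_image(points_xyz: np.ndarray,
                           K: np.ndarray,
                           R: np.ndarray,
                           t: np.ndarray,
                           height: int,
                           width: int) -> np.ndarray:
    """Project point cloud coordinates into a depth-like image.

    points_xyz : (N, 3) point cloud coordinates in the world frame (camera at origin).
    K          : (3, 3) internal reference (intrinsic) matrix.
    R, t       : rotation (3, 3) and translation (3,) of the external reference matrix.
    Returns a (height, width) image whose pixel value is the depth of the
    projected point (0 where no point projects; last write wins on collisions).
    """
    depth_like = np.zeros((height, width), dtype=np.float32)

    # Homogeneous transform into the camera coordinate system (second homogeneous coordinates).
    cam = R @ points_xyz.T + t.reshape(3, 1)              # (3, N)
    proj = K @ cam                                         # (3, N)

    z = proj[2]                                            # scale factor representing depth
    valid = z > 0
    u = np.round(proj[0, valid] / z[valid]).astype(int)    # second pixel coordinates
    v = np.round(proj[1, valid] / z[valid]).astype(int)
    d = z[valid]

    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    depth_like[v[inside], u[inside]] = d[inside]           # keep depth as the pixel value
    return depth_like
```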
In one embodiment, the camera is fixed on the pan-tilt, and the motion parameters of the pan-tilt include a first motion parameter and a second motion parameter, wherein the first motion parameter refers to the motion parameter of the roll motion of the pan-tilt, and the second motion parameter refers to the motion parameter of the rotation motion of the pan-tilt; determining the second pixel coordinates and the second pixel values of the pixels corresponding to each point cloud coordinate according to each second homogeneous coordinate and the motion parameters corresponding to each area image comprises: determining the space pose of the cradle head when the camera shoots each area image; determining a first motion parameter and a second motion parameter corresponding to each region image according to the space pose; determining an image factor of each region image; and for each area image, determining a second pixel coordinate and a second pixel value of a pixel corresponding to each point cloud coordinate according to the image factor, the second homogeneous coordinate, the first motion parameter and the second motion parameter of the area image.
The first motion parameter refers to a motion parameter of the roll motion of the pan-tilt, and may be a roll motion matrix. The second motion parameter refers to a motion parameter of the rotation motion of the pan-tilt, and may be a rotation motion matrix. The roll motion matrix and the rotation motion matrix are determined based on the pose of the camera relative to the initial pose of the cradle head when the camera collects each area image. The processor can determine the spatial pose of the cradle head when the camera shoots each area image, and determine the first motion parameter and the second motion parameter corresponding to each area image according to the spatial pose. At the same time, the processor may also determine the image factor of each region image. The image factor is data that affects the imaging quality of an image. For each area image, the processor may determine the second pixel coordinate and the second pixel value of the pixel corresponding to each point cloud coordinate according to the image factor, the second homogeneous coordinate, the first motion parameter, and the second motion parameter of the area image. Wherein the second homogeneous coordinates may be calculated according to the following formula (1):
$$\begin{bmatrix} m_1 \\ n_1 \\ p_1 \end{bmatrix} = K \, [R \mid T] \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \tag{1}$$

wherein $[m_1\ n_1\ p_1]^T$ refers to the second homogeneous coordinate of each pixel point in the area image in the camera coordinate system with the camera as the origin, $K$ refers to the internal reference matrix of the camera, $[R \mid T]$ refers to the external reference matrix of the camera, $[x\ y\ z\ 1]^T$ refers to the homogeneous point cloud coordinate of each pixel point in the area image in the world coordinate system with the camera as the origin, $R$ is shorthand for the rotation part of the device external reference matrix, and $T$ is shorthand for the translation part of the device external reference matrix.

$$\begin{bmatrix} u_2 \\ v_2 \\ \mathrm{pixel} \end{bmatrix} = \mathrm{Factor} \cdot M_{\mathrm{roll}} \cdot M_{\mathrm{rotate}} \begin{bmatrix} m_1 \\ n_1 \\ p_1 \end{bmatrix} \tag{2}$$

wherein $u_2$ refers to the abscissa and $v_2$ to the ordinate of the second pixel coordinate of each pixel point in the depth-like image in the pixel coordinate system, $\mathrm{pixel} = [r\ g\ b]^T$ refers to the pixel value of each pixel point in the depth-like image, $M_{\mathrm{roll}}$ refers to the roll motion matrix of the cradle head at the current spatial position, $M_{\mathrm{rotate}}$ refers to the rotation motion matrix of the cradle head at the current spatial position, and $\mathrm{Factor}$ refers to the image factor of the depth-like image.
In one embodiment, stitching the plurality of class depth images according to a plurality of stitching matrices corresponding to the plurality of area images, and obtaining the stitched panoramic class depth image includes: for each area image, converting second pixel coordinates of each pixel point contained in the similar depth image according to a splicing matrix corresponding to the area image to obtain third homogeneous coordinates of each second pixel coordinate; determining third pixel coordinates corresponding to the point cloud coordinates of each pixel contained in the depth-like image according to each third homogeneous coordinate; generating a second pixel point set corresponding to the plurality of class depth images according to the third pixel coordinates and the second pixel values of all the pixels, wherein the second pixel point set comprises the third pixel coordinates of each pixel and the second pixel value of each pixel contained in the plurality of converted class depth images; and generating a panoramic depth image according to the second pixel point set.
For each area image, each point cloud coordinate is converted by the internal reference matrix and the external reference matrix to obtain the second homogeneous coordinate corresponding to each point cloud coordinate. The processor may then determine the second pixel coordinate and the second pixel value of the pixel point corresponding to each point cloud coordinate according to the image factor, the second homogeneous coordinate, the first motion parameter and the second motion parameter of the region image. The processor may determine a second set of pixel points corresponding to the plurality of class depth images according to the plurality of stitching matrices, and generate a panoramic class depth image from the second set of pixel points. Specifically, for each area image, the processor may convert the second pixel coordinates of each pixel point in the depth-like image according to the stitching matrix corresponding to the area image, so as to obtain the third homogeneous coordinate of each second pixel coordinate. The third homogeneous coordinate refers to the coordinate, in a coordinate system established with the camera as the origin, of the second pixel coordinate after the matrix stitching transformation. Wherein the third homogeneous coordinates may be calculated according to the following formula (3):
$$\begin{bmatrix} m_2 \\ n_2 \\ p_2 \end{bmatrix} = \begin{bmatrix} X_1 & X_2 & X_3 \\ X_4 & X_5 & X_6 \\ X_7 & X_8 & 1 \end{bmatrix} \begin{bmatrix} u_2 \\ v_2 \\ 1 \end{bmatrix} \tag{3}$$

wherein $u_2$ refers to the abscissa and $v_2$ to the ordinate of the second pixel coordinate of each pixel point in the depth-like image in the pixel coordinate system, $[m_2\ n_2\ p_2]^T$ refers to the third homogeneous coordinate of each pixel point in the depth-like image in the camera coordinate system with the camera as the origin, the 3×3 matrix is the splicing matrix corresponding to each area image, $X_1$, $X_2$, $X_4$ and $X_5$ refer to rotation transformation parameters in the splicing matrix, $X_3$ and $X_6$ refer to translation transformation parameters, and $X_7$ and $X_8$ refer to perspective transformation parameters.
Further, the processor may determine a third pixel coordinate corresponding to the point cloud coordinate of each pixel in the depth-like image from each third homogeneous coordinate. Wherein the third pixel coordinates may be calculated according to the following formula (4):
$$u_3 = \frac{m_2}{p_2}, \qquad v_3 = \frac{n_2}{p_2} \tag{4}$$

wherein $u_3$ refers to the abscissa and $v_3$ to the ordinate of the third pixel coordinate of each pixel point in the depth-like image in the pixel coordinate system, and $m_2$, $n_2$ and $p_2$ refer to the first, second and third coordinate values of the third homogeneous coordinate of each pixel point in the depth-like image in the camera coordinate system with the camera as the origin.
The processor may generate a second set of pixel points corresponding to the plurality of depth-like images based on the third pixel coordinates and the second pixel values of all pixel points. That is, the above procedure converts only the second pixel coordinates of the depth-like image, the second pixel values remaining unchanged. The second set of pixel points includes a third pixel coordinate for each pixel point and a second pixel value for each pixel included in the plurality of converted depth-like images. The processor may generate a panorama class depth image from the second set of pixel points.
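A minimal sketch of this per-pixel conversion, assuming the splicing matrix is a 3×3 homography as in formulas (3) and (4) and that a pixel value of 0 marks an unfilled position; the canvas size and names are illustrative.

```python
import numpy as np

def warp_depth_like(depth_like: np.ndarray,
                    H_splice: np.ndarray,
                    out_height: int,
                    out_width: int) -> np.ndarray:
    """Convert second pixel coordinates with the splicing matrix (formulas (3)-(4)).

    depth_like : (h, w) depth-like image; its pixel values are kept unchanged.
    H_splice   : (3, 3) splicing matrix of the corresponding area image.
    Returns the pixels placed at their third pixel coordinates on the panorama canvas.
    """
    panorama = np.zeros((out_height, out_width), dtype=depth_like.dtype)

    v2, u2 = np.nonzero(depth_like)                  # second pixel coordinates with data
    ones = np.ones_like(u2)
    homog = H_splice @ np.stack([u2, v2, ones])      # third homogeneous coordinates [m2 n2 p2]

    u3 = np.round(homog[0] / homog[2]).astype(int)   # formula (4): u3 = m2 / p2
    v3 = np.round(homog[1] / homog[2]).astype(int)   #              v3 = n2 / p2

    inside = (u3 >= 0) & (u3 < out_width) & (v3 >= 0) & (v3 < out_height)
    panorama[v3[inside], u3[inside]] = depth_like[v2[inside], u2[inside]]
    return panorama
```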
In one embodiment, stitching the plurality of area images according to a plurality of stitching matrices corresponding to the plurality of area images, and obtaining the stitched RGB image after stitching includes: determining initial pixel coordinates and initial pixel values of each pixel point in each area image; converting each initial pixel coordinate according to the splicing matrix corresponding to each area image to obtain a first homogeneous coordinate of each pixel point; determining first pixel coordinates of each pixel point according to the first homogeneous coordinates of each pixel point of the area image, and determining an initial pixel value of each pixel point of the area image as a first pixel value corresponding to each first pixel coordinate; determining a first pixel point set corresponding to the plurality of area images according to the first pixel coordinates and the first pixel values of all the pixel points; and generating a spliced RGB image according to the first pixel point set.
Each pixel point of the area image acquired by the camera corresponds to an initial pixel coordinate and an initial pixel value. The processor may convert each initial pixel coordinate according to the stitching matrix corresponding to each area image, so as to obtain the first homogeneous coordinate of each pixel point. The first homogeneous coordinate refers to the coordinate, in a coordinate system established with the camera as the origin, of the initial pixel coordinate after the matrix stitching transformation. Wherein the first homogeneous coordinates may be calculated according to the following formula (5):
$$\begin{bmatrix} m_3 \\ n_3 \\ p_3 \end{bmatrix} = \begin{bmatrix} X_1 & X_2 & X_3 \\ X_4 & X_5 & X_6 \\ X_7 & X_8 & 1 \end{bmatrix} \begin{bmatrix} u_0 \\ v_0 \\ 1 \end{bmatrix} \tag{5}$$

wherein $u_0$ refers to the abscissa and $v_0$ to the ordinate of the initial pixel coordinate of each pixel point in the area image in the pixel coordinate system, $[m_3\ n_3\ p_3]^T$ refers to the first homogeneous coordinate of each pixel point in the area image in the camera coordinate system with the camera as the origin, the 3×3 matrix is the splicing matrix corresponding to each area image, $X_1$, $X_2$, $X_4$ and $X_5$ refer to rotation transformation parameters in the splicing matrix, $X_3$ and $X_6$ refer to translation transformation parameters, and $X_7$ and $X_8$ refer to perspective transformation parameters. The splicing matrix corresponding to the depth-like image is consistent with the splicing matrix corresponding to the spliced RGB image.
Further, the processor may determine the first pixel coordinate of each pixel point of the area image from the first homogeneous coordinate of each pixel point. Wherein the first pixel coordinates may be calculated according to the following formula (6):
$$u_1 = \frac{m_3}{p_3}, \qquad v_1 = \frac{n_3}{p_3} \tag{6}$$

wherein $u_1$ refers to the abscissa and $v_1$ to the ordinate of the first pixel coordinate of each pixel point in the area image in the pixel coordinate system, and $m_3$, $n_3$ and $p_3$ refer to the first, second and third coordinate values of the first homogeneous coordinate that is converted from the camera coordinate system into the pixel coordinate system.
The processor may determine the initial pixel value of each pixel point of the area image as the first pixel value corresponding to each first pixel coordinate, and determine a first pixel point set corresponding to the plurality of area images according to the first pixel coordinates and the first pixel values of all the pixel points. All pixel coordinates of the first pixel point set are then translated by a certain distance into the first quadrant. In this manner, a stitched RGB image may be generated from the first pixel point set. As shown in fig. 2, the stitched RGB image refers to a panoramic RGB image stitched from a plurality of area images of the target scene.
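As one possible realization, the per-pixel conversion of formulas (5) and (6) can also be carried out with OpenCV's perspective warp; the translation offset that moves the result into the first quadrant is a chosen constant here, not a value given in the patent.

```python
import cv2
import numpy as np

def splice_region_image(region_bgr: np.ndarray,
                        H_splice: np.ndarray,
                        canvas_size: tuple,
                        offset: tuple = (0, 0)) -> np.ndarray:
    """Apply the splicing matrix to one area image (formulas (5)-(6)).

    region_bgr  : area image acquired by the camera (OpenCV uses BGR channel order).
    H_splice    : (3, 3) splicing matrix of this area image.
    canvas_size : (width, height) of the spliced RGB canvas.
    offset      : (dx, dy) translation moving all first pixel coordinates into
                  the first quadrant (an assumed constant).
    """
    dx, dy = offset
    T = np.array([[1, 0, dx],
                  [0, 1, dy],
                  [0, 0, 1]], dtype=np.float64)
    # warpPerspective performs the homogeneous conversion and the division by p3,
    # i.e. formulas (5) and (6), for every pixel at once.
    return cv2.warpPerspective(region_bgr, T @ H_splice, canvas_size)
```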
Fig. 4 schematically illustrates a step schematic diagram of a first joint processing operation according to an embodiment of the present application. As shown in fig. 4, in one embodiment, the steps of the first seam processing operation include:
S401, after the spliced RGB image is generated, each first spliced pixel point contained in the spliced RGB image is extracted.
S402, determining whether the pixel value of the first spliced pixel point is empty, if so, executing S403; if not, S401 is performed.
S403, determining the first spliced pixel point as a first edge empty point of the spliced RGB image.
S404, determining the number of first non-empty points in the neighborhood of the first edge empty point according to the pixel coordinates of the first edge empty point, wherein the first non-empty points are first spliced pixel points with first spliced pixel values not being empty in the spliced RGB image.
S405, determining whether the number of the first non-empty points in the neighborhood is larger than a first preset value, if so, executing S406; if not, S401 is performed.
S406, determining a first target pixel value of the first edge null point according to the first spliced pixel values of all the first non-null points in the neighborhood, so that the pixel value of the first edge null point is not null.
S407, determining whether all pixel values of the first edge null points are not null, if so, executing S408; if not, S401 is performed.
S408, determining that the first joint processing operation for the stitched RGB image is completed, and obtaining the stitched RGB image after joint processing, so as to determine the point cloud data of each pixel point included in the stitched RGB image according to the corresponding relation between the panoramic depth image and the stitched RGB image.
As shown in fig. 2, the spliced RGB image generated by splicing the plurality of area images contains blank areas. The processor may extract each first spliced pixel point contained in the spliced RGB image. A first spliced pixel point is a pixel point in the spliced RGB image and corresponds to a pixel coordinate and a pixel value. Because of the blank areas, the pixel values of some first spliced pixel points are empty. If the processor determines that the pixel value of a first spliced pixel point is empty, it may determine that this first spliced pixel point is a first edge empty point of the spliced RGB image. A first edge empty point is a pixel point with an empty pixel value located at the edge of an area image within the spliced RGB image. The processor may search the neighborhood of the first edge empty point, taking its pixel coordinate as the center coordinate, to find the number of first non-empty points in the neighborhood. A first non-empty point is a first spliced pixel point in the spliced RGB image whose first spliced pixel value is not empty. If the number of first non-empty points in the neighborhood is greater than or equal to a first preset value, the first target pixel value of the first edge empty point is determined from the first spliced pixel values of all first non-empty points in the neighborhood, so that the pixel value of the first edge empty point is no longer empty. The first preset value may be set to 1. If the number of first non-empty points in the neighborhood is greater than or equal to 1, the average of the first spliced pixel values of all first non-empty points in the neighborhood may be calculated and determined as the first target pixel value of the first edge empty point, so that its pixel value is no longer empty. The first target pixel value is the pixel value of the first edge empty point. When the pixel values of all first edge empty points are not empty, the first joint processing operation for the spliced RGB image is determined to be complete, and the spliced RGB image after joint processing is obtained. Fig. 5 schematically shows a spliced RGB image after joint processing according to an embodiment of the present application. Further, the point cloud data of each pixel point included in the spliced RGB image is determined according to the corresponding relation between the panoramic class depth image and the spliced RGB image after joint processing. This neighborhood-based image joint processing technique repairs seam points with neighboring pixels and can improve image visibility.
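A minimal sketch of this neighborhood-based hole filling, assuming an empty pixel is encoded as all zeros, a 3×3 neighborhood, and a first preset value of 1; repeated passes stand in for the loop over steps S401 to S408.

```python
import numpy as np

def fill_seam_holes(stitched_rgb: np.ndarray, max_passes: int = 10) -> np.ndarray:
    """Fill edge empty points of the spliced RGB image from their neighborhood.

    A pixel whose value is empty (all zeros here, an assumption) is replaced by
    the mean of the non-empty pixels in its 3x3 neighborhood, as long as at
    least one such neighbor exists (first preset value = 1). Passes repeat
    until no empty point remains or max_passes is reached.
    """
    out = stitched_rgb.astype(np.float32)
    h, w = out.shape[:2]
    for _ in range(max_passes):
        empty = np.all(out == 0, axis=-1)
        if not empty.any():
            break                                      # all first edge empty points filled
        ys, xs = np.nonzero(empty)
        for y, x in zip(ys, xs):
            y0, y1 = max(0, y - 1), min(h, y + 2)
            x0, x1 = max(0, x - 1), min(w, x + 2)
            patch = out[y0:y1, x0:x1].reshape(-1, out.shape[-1])
            non_empty = patch[np.any(patch != 0, axis=-1)]
            if len(non_empty) >= 1:                    # number of first non-empty points >= 1
                out[y, x] = non_empty.mean(axis=0)     # first target pixel value
    return out.astype(stitched_rgb.dtype)
```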
Fig. 6 schematically illustrates a step schematic diagram of a second joint processing operation according to an embodiment of the present application. In one embodiment, the step of the second seam processing operation includes:
S601, after the panoramic depth image is generated, second spliced pixel points in the panoramic depth image are extracted.
S602, determining whether the pixel value of the second spliced pixel point is empty, if so, executing S603; if not, S601 is performed.
And S603, determining the second spliced pixel point as a second edge empty point of the panoramic depth image.
S604, determining the number of second non-empty points in the neighborhood of the second edge empty point according to the pixel coordinates of the second edge empty point, wherein the second non-empty points are second spliced pixel points with second spliced pixel values not being empty in the panoramic depth image.
S605, determining whether the number of the second non-empty points in the neighborhood is larger than or equal to a second preset value, if yes, executing S606; if not, S601 is performed.
S606, determining second target pixel values and target point cloud data of the second edge null points according to the point cloud data and the second spliced pixel values of all the second non-null points in the neighborhood, so that the pixel values and the point cloud data of the second edge null points are not null.
S607, determining whether the pixel values of all second edge null points are not null, if so, executing S608; if not, S601 is performed.
S608, determining that the second joint processing operation for the panoramic depth image is finished, so as to obtain the panoramic depth image after joint processing, and determining the point cloud data of each pixel point included in the stitched RGB image according to the corresponding relation between the panoramic depth image after joint processing and the stitched RGB image.
After the panoramic depth image generated by splicing the plurality of area images, the images also have blank areas.
The processor may extract each second spliced pixel point contained in the panoramic class depth image. A second spliced pixel point is a pixel point in the panoramic class depth image and corresponds to a pixel coordinate and a pixel value. If the processor determines that the pixel value of a second spliced pixel point is empty, it may determine that this second spliced pixel point is a second edge empty point of the panoramic class depth image. A second edge empty point is a pixel point with an empty pixel value located at the edge of a class depth image within the panoramic class depth image. The processor may search the neighborhood of the second edge empty point, taking its pixel coordinate as the center coordinate, to find the number of second non-empty points in the neighborhood. A second non-empty point is a second spliced pixel point in the panoramic class depth image whose second spliced pixel value is not empty. If the number of second non-empty points in the neighborhood is greater than or equal to a second preset value, the target pixel value and the target point cloud data of the second edge empty point are determined from the second spliced pixel values and the point cloud data of all second non-empty points in the neighborhood, so that the pixel value and the point cloud data of the second edge empty point are no longer empty. The second preset value may be set to 1. If the number of second non-empty points in the neighborhood is greater than or equal to 1, the average of the second spliced pixel values of all second non-empty points in the neighborhood may be calculated and determined as the second target pixel value of the second edge empty point. The averages of the three coordinate values of the point cloud coordinates of all second non-empty points in the neighborhood are likewise calculated and determined as the target point cloud data of the second edge empty point, so that its pixel value and point cloud data are no longer empty. The second target pixel value is the pixel value of the second edge empty point, and the target point cloud data is the point cloud data of the second edge empty point. When the pixel values of all second edge empty points are not empty, the second joint processing operation for the panoramic class depth image is determined to be complete, the panoramic class depth image after joint processing is obtained, and the point cloud data of each pixel point included in the spliced RGB image is determined according to the corresponding relation between the panoramic class depth image after joint processing and the spliced RGB image.
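A minimal sketch of one pass of the second joint processing operation, under the same assumptions as above (empty encoded as zero, 3×3 neighborhood, second preset value of 1), except that the point cloud coordinates of the empty point are filled as well.

```python
import numpy as np

def fill_depth_and_cloud_holes(pano_depth: np.ndarray,
                               pano_cloud: np.ndarray) -> None:
    """One pass of the second joint processing operation (in place).

    pano_depth : (H, W) panoramic class depth image, 0 meaning empty (assumption).
    pano_cloud : (H, W, 3) per-pixel point cloud coordinates aligned with it.
    Each second edge empty point takes the mean pixel value and the mean of the
    three point cloud coordinate values of its non-empty 3x3 neighbors.
    """
    h, w = pano_depth.shape
    ys, xs = np.nonzero(pano_depth == 0)
    for y, x in zip(ys, xs):
        y0, y1 = max(0, y - 1), min(h, y + 2)
        x0, x1 = max(0, x - 1), min(w, x + 2)
        mask = pano_depth[y0:y1, x0:x1] > 0
        if mask.sum() >= 1:                                       # second preset value = 1
            pano_depth[y, x] = pano_depth[y0:y1, x0:x1][mask].mean()
            pano_cloud[y, x] = pano_cloud[y0:y1, x0:x1][mask].mean(axis=0)
    # repeating the pass until no empty point remains mirrors steps S601-S608
```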
In one embodiment, processing each region image separately to obtain a stitching matrix for each region image includes: extracting a feature point set of each region image through an image stitching algorithm, wherein each feature point set comprises a plurality of feature points; matching a plurality of characteristic points of each characteristic point set through an image stitching algorithm to obtain a target matching point set corresponding to each characteristic point set; and determining a splicing matrix corresponding to each area image according to the target matching point set of each area image.
The processor may process each of the region images separately to obtain a stitching matrix for each of the region images. Specifically, the processor may pre-process each region image, filter and denoise the region image.
The processor may extract a feature point set for each region image by an image stitching algorithm. The image stitching algorithm refers to an algorithm for stitching a plurality of region images into one image. Each region image corresponds to a feature point set, and each feature point set comprises a plurality of feature points of each region image. The feature points are points which are very prominent in the regional image and cannot disappear due to factors such as illumination, scale, rotation and the like. Such as corner points, edge points, bright spots of dark areas and dark spots of bright areas. The processor can match a plurality of characteristic points of each characteristic point set through an image stitching algorithm to obtain a target matching point set corresponding to each characteristic point set, reject wrong matching points, and determine a stitching matrix corresponding to each region image according to the target matching point set of each region image. For example, a sift algorithm may be used as the main feature point extraction algorithm, and a surf algorithm may be used as the supplemental algorithm to reject false matching points. And the excellent matching points of the surf algorithm can be added into the target matching point set, so that the accuracy of image matching is improved.
In one embodiment, the feature point set includes a first type feature point set and a second type feature point set, and matching the plurality of feature points of each feature point set by using an image stitching algorithm to obtain a target matching point set corresponding to each feature point set includes: matching a plurality of characteristic points of a first type of characteristic point set through a first image stitching algorithm, and determining characteristic points with the first characteristic point scale smaller than a first preset distance in the first type of characteristic point set as first target characteristic points; matching a plurality of characteristic points of a second type of characteristic point set through a second image stitching algorithm, and determining characteristic points with second characteristic point scales smaller than a second preset distance in the second type of characteristic point set as second target characteristic points; and generating a target matching point set according to the first target characteristic point and the second target characteristic point.
For each region image, the first type of feature point set contains a plurality of feature points extracted from the region image by a first image stitching algorithm, and the second type of feature point set contains a plurality of feature points extracted from the region image by a second image stitching algorithm. For example, the first image stitching algorithm may be a sift algorithm and the second image stitching algorithm may be a surf algorithm. The processor may use any one of the plurality of area images as a base image, and other area images than the base image as images to be stitched. The processor may then extract a plurality of first type feature points of the base image and a plurality of first type feature points of each image to be stitched by a first image stitching algorithm. The processor can match a plurality of characteristic points of a first type characteristic point set of the basic image and each image to be spliced through a first image splicing algorithm, and determine characteristic points with a first characteristic point scale smaller than a first preset distance in the first type characteristic point set as first target characteristic points. The first feature point scale refers to a scale space factor, which is the standard deviation of Gaussian normal distribution and reflects the degree of blurring of an image, and the larger the value, the more blurring of the image is, and the larger the corresponding scale is. The first preset distance is an optimal threshold for the scale-space factor set by the technician. Then, the processor may remove the feature points with the first feature point scale greater than or equal to the first preset distance in the first type feature point set, and determine the feature points with the feature point scale less than the preset distance in the first type feature point set as the first target feature points. The feature points that are removed are typically low contrast feature points or unstable edge response points. The processor can also match a plurality of feature points of the second type of feature point set through a second image stitching algorithm, and determine feature points, of which the second feature point scale is smaller than a second preset distance, in the second type of feature point set as second target feature points. The second feature point scale refers to the euclidean distance, and the shorter the euclidean distance is, the better the matching degree of the two feature points is represented. The second preset distance is set by the technician to an optimal threshold value of euclidean distance. The first target feature point and the second target feature point both comprise scale, gray scale and direction information. The processor may generate a set of target matching points from the first target feature point and the second target feature point. The target matching point set refers to a set of best matching points between the image to be stitched and the base image. By the method, the corresponding positions of the characteristic points in the images to be spliced in the basic image can be quickly and accurately found, and then the transformation relation between the two images is determined.
In one embodiment, determining a stitching matrix corresponding to each region image from the set of target matching points for each region image includes: determining a splicing matrix corresponding to each region image according to the target matching point set under the condition that the number of target feature points in the target matching point set of the region image is larger than a preset value for any region image; and under the condition that the number of the target feature points in the target matching point set of the area image is smaller than or equal to a preset value, matching the plurality of feature points in the first type of feature point set through a first image stitching algorithm again until the number of the target feature points in the target matching point set of the area image is larger than the preset value.
For any one area image, the area image is determined as an image to be spliced. Then, in a case where the processor determines that the number of target feature points in the target matching point set of the area image is greater than the preset value, a stitching matrix corresponding to each area image may be determined according to the target matching point set. Or after all the feature points in the feature point sets in the area images are matched, determining a splicing matrix corresponding to each area image according to the target matching point set. If the number of the target feature points in the target matching point set of the area image is smaller than or equal to the preset value, the processor can match the plurality of feature points in the first type feature point set again through the first image stitching algorithm until the number of the target feature points in the target matching point set of the area image is larger than the preset value. For example, the preset value may be 50.
In one implementation, the processor may extract a sift feature point set for each region image through a sift algorithm, and match the sift feature point set for the image a to be stitched with the sift feature point set for the base image. And in the process of matching the image A to be spliced with the basic image, determining a sift characteristic point with a first characteristic point scale smaller than a first preset distance as a first target characteristic point, and removing the sift characteristic point with the first characteristic point scale larger than or equal to the first preset distance. The processor can extract a surf feature point set of each region image through a surf algorithm, match the surf feature point set of the image to be spliced with the surf feature point set of the basic image, and select surf feature points in the surf feature point set, which are the same as the sift feature points with the first feature point scale smaller than the first preset distance. Among the selected surf feature points, the processor may determine a feature point having a second feature point size smaller than a second preset distance as a target feature point, and then remove feature points having a second feature point size greater than or equal to the second preset distance. And the sift characteristic points with the first characteristic point scale smaller than the first preset distance and different from the surf characteristic points are also determined as target characteristic points. The processor may then generate a set of target matching points from the selected target feature points. If the number of the target feature points in the target matching point set is less than or equal to 50, matching a plurality of sift feature points in the sift feature point set again through a sift algorithm until the number of the target feature points in the target matching point set of the region image is greater than 50. If the number of target feature points of the target matching point set is greater than 50, the processing may generate a stitching matrix corresponding to each region image according to the target matching point set. The surf algorithm is used for supplementing the sift algorithm, and excellent characteristic points selected by the surf algorithm can be added while error characteristic points are removed, so that the accuracy of image matching is improved.
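A minimal sketch of estimating a splicing matrix with OpenCV; SIFT is used as the main extractor, the ratio test and RANSAC below stand in for the scale- and distance-threshold filtering described above, and SURF (which requires the opencv-contrib build) could be added as the supplementary extractor in the same way.

```python
import cv2
import numpy as np

def estimate_splicing_matrix(base_img: np.ndarray,
                             img_to_splice: np.ndarray,
                             min_matches: int = 50):
    """Estimate the splicing matrix of an image to be spliced against the base image."""
    gray_base = cv2.cvtColor(base_img, cv2.COLOR_BGR2GRAY)
    gray_new = cv2.cvtColor(img_to_splice, cv2.COLOR_BGR2GRAY)

    sift = cv2.SIFT_create()
    kp_base, des_base = sift.detectAndCompute(gray_base, None)
    kp_new, des_new = sift.detectAndCompute(gray_new, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    raw = matcher.knnMatch(des_new, des_base, k=2)
    good = [m[0] for m in raw
            if len(m) == 2 and m[0].distance < 0.7 * m[1].distance]  # reject wrong matches

    if len(good) <= min_matches:
        return None            # too few target matching points: re-match, as described above

    src = np.float32([kp_new[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_base[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H_splice, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H_splice
```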
Through the above technical scheme, the shooting angles of the RGB camera and the radar are adjusted by the cradle head, so that a plurality of area images and a plurality of point cloud data of the target scene are obtained. The plurality of area images are processed by the sift-surf algorithm to obtain a plurality of splicing matrices, which improves the accuracy of image matching. The pixel points of each area image are converted according to the splicing matrix corresponding to that area image: after the pixel coordinates of a single area image are converted into homogeneous coordinates, the homogeneous coordinates are converted into pixel coordinates of the spliced RGB image to generate the first pixel point set, so that the plurality of area images are spliced into one complete spliced RGB image. The point cloud data are converted based on the internal reference matrix and external reference matrix of the RGB camera and the motion matrix of the cradle head: the point cloud coordinates are converted into pixel coordinates of the depth-like image, and the scale factors are stored as the pixel values of the depth-like image, thereby obtaining the depth-like image corresponding to each area image. This avoids the problem that splicing depth maps changes the pixel point information, so that the three-dimensional point cloud coordinates can no longer be recovered from the pixel points of the depth image. Further, following the same splicing mode as the area images, the pixel coordinates of the depth-like images are converted into homogeneous coordinates according to the plurality of splicing matrices and then into pixel coordinates of the panoramic depth-like image, generating the second pixel point set, so that the plurality of depth-like images are spliced into one complete panoramic depth-like image. Compared with the traditional method of directly splicing point clouds, splicing by means of images is more reliable, and each pixel point of the panoramic depth image can be placed in one-to-one correspondence with each pixel point of the spliced RGB image. Further, the averages of the pixel values and of the point cloud coordinates of the non-empty points in the neighborhood of each edge empty point are taken to fill the empty pixel points in the spliced RGB image and the panoramic depth image, which improves the visibility of the image after scene reconstruction and solves the problem that empty pixels in the image cannot be located. The point cloud data of each pixel point included in the spliced RGB image is determined according to the corresponding relation between the panoramic depth image and the spliced RGB image. By utilizing the corresponding relation between the RGB image and the depth-like image, point cloud splicing is converted into image splicing, so that the correspondence between the two-dimensional image and the three-dimensional coordinates is realized, the error after image splicing is reduced, and the visibility of the image after scene reconstruction is improved.
Fig. 1 is a flow chart of an image processing method in an embodiment. It should be understood that, although the steps in the flowchart of fig. 1 are shown in sequence as indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the order of execution of these steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages which are not necessarily performed at the same moment but may be performed at different moments, and whose order of execution is not necessarily sequential; they may be performed in turn or alternately with at least a portion of other steps, or with sub-steps or stages of other steps.
In one embodiment, as shown in fig. 7, there is provided an image processing apparatus including:
a camera 710, fixed to the pan-tilt 730, for acquiring a plurality of area images of a target scene.
The radar 720, fixed to the pan-tilt 730, is connected to the camera 710, and is used for acquiring point cloud data of the target scene.
The pan-tilt 730 is used for adjusting the shooting angles of the camera 710 and the radar 720.
The processor 740 is configured to perform the image processing method described above.
The processor 740 may control the pan-tilt 730 to assume a roll angle α and rotate 180° with the roll angle α fixed, and then control the pan-tilt 730 to assume a roll angle β and rotate 180° with the roll angle β fixed. The pan-tilt 730 fixes the radar 720 and the camera 710 so that the radar and the camera keep a relatively fixed spatial position. Meanwhile, the processor controls the camera to collect the corresponding area image in each pose, and controls the radar to collect the point cloud data corresponding to each area image. The area images may be acquired by an RGB camera, and the point cloud data by a lidar. The larger the range of the target scene, the more roll angles the pan-tilt rotates under, and therefore the larger the image range covered by the reconstructed spliced RGB image and panoramic class depth image, and the more detailed the image data.
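As a further illustration, a hedged sketch of this acquisition loop is given below; the pan_tilt, camera and lidar objects and their methods (set_roll, rotate_to, capture, scan) are hypothetical interfaces assumed for this sketch, since the publication does not define a device API.

```python
def acquire_region_data(pan_tilt, camera, lidar, roll_angles=(0.0, 30.0), yaw_step=30.0):
    """Collect one area image and one point cloud per pan-tilt pose."""
    region_images, point_clouds = [], []
    for roll in roll_angles:                          # e.g. roll angle alpha, then roll angle beta
        pan_tilt.set_roll(roll)                       # fix the roll angle
        yaw = 0.0
        while yaw <= 180.0:                           # rotate through 180 degrees at this roll
            pan_tilt.rotate_to(yaw)
            region_images.append(camera.capture())    # area image for this pose
            point_clouds.append(lidar.scan())         # point cloud for the same pose
            yaw += yaw_step
    return region_images, point_clouds
```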
The processor includes a kernel, and the kernel fetches the corresponding program unit from the memory. One or more kernels may be provided, and the above image processing method is implemented by adjusting kernel parameters.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
The embodiment of the application provides a storage medium having a program stored thereon, wherein the program, when executed by a processor, implements the above-described image processing method.
The embodiment of the application provides a processor for running a program, wherein the image processing method is executed when the program runs.
In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 8. The computer device includes a processor A01, a network interface A02, a memory (not shown) and a database (not shown) connected by a system bus. The processor A01 of the computer device is adapted to provide computing and control capabilities. The memory of the computer device includes an internal memory A03 and a nonvolatile storage medium A04. The nonvolatile storage medium A04 stores an operating system B01, a computer program B02, and a database (not shown in the figure). The internal memory A03 provides an environment for the operation of the operating system B01 and the computer program B02 in the nonvolatile storage medium A04. The database of the computer device is used to store image processing data. The network interface A02 of the computer device is used for communication with an external terminal through a network connection. The computer program B02 is executed by the processor A01 to implement an image processing method.
It will be appreciated by those skilled in the art that the structure shown in fig. 8 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
The embodiment of the application provides a device comprising a processor, a memory, and a program stored in the memory and executable on the processor, wherein the processor implements the steps of the image processing method when executing the program.
The present application also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the steps of the image processing method.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (14)

1. An image processing method, the method comprising:
acquiring a plurality of area images of a target scene and point cloud data of each area image;
for each area image, constructing a class depth image corresponding to each area image according to the point cloud data, wherein a corresponding relation exists between pixel points in the class depth image and pixel points in the area image on pixel positions;
determining a splicing matrix corresponding to each area image according to the information of the pixel points of the area image;
and splicing the plurality of class depth images according to a plurality of splicing matrixes corresponding to the plurality of area images to obtain a panoramic class depth image after splicing.
2. The image processing method according to claim 1, characterized in that the method further comprises:
and splicing the plurality of area images according to a plurality of splicing matrixes corresponding to the plurality of area images to obtain a spliced RGB image.
3. The image processing method according to claim 2, wherein the splicing the plurality of area images according to the plurality of splicing matrixes corresponding to the plurality of area images to obtain the spliced RGB image includes:
determining initial pixel coordinates and initial pixel values of each pixel point in each area image;
converting each initial pixel coordinate according to the splicing matrix corresponding to each area image to obtain a first homogeneous coordinate of each pixel point;
determining a first pixel coordinate of each pixel point according to the first homogeneous coordinates of each pixel point of the area image;
determining an initial pixel value of each pixel point of the area image as a first pixel value corresponding to each first pixel coordinate;
determining a first pixel point set corresponding to the plurality of area images according to the first pixel coordinates and the first pixel values of all the pixel points;
and generating the spliced RGB image according to the first pixel point set.
4. The image processing method according to claim 2, characterized in that the method further comprises:
after the spliced RGB image is generated, extracting each first spliced pixel point contained in the spliced RGB image;
under the condition that the pixel value of the first spliced pixel point is null, determining the first spliced pixel point as a first edge null point of the spliced RGB image;
determining the number of first non-null points in the neighborhood of the first edge null point according to the pixel coordinates of the first edge null point, wherein the first non-null points are first spliced pixel points whose first spliced pixel values are not null in the spliced RGB image;
determining a first target pixel value of the first edge null point according to first spliced pixel values of all the first non-null points in the neighborhood when the number of the first non-null points in the neighborhood is larger than or equal to a first preset value, so that the pixel value of the first edge null point is not null;
and under the condition that all pixel values of the first edge null points are not null, determining that a first filling processing operation for the spliced RGB image is completed to obtain a filled spliced RGB image, and determining the point cloud data of each pixel point included in the spliced RGB image according to the corresponding relation between the panoramic class depth image and the filled spliced RGB image.
5. The image processing method according to claim 1, wherein the device parameters of the camera include an internal reference matrix and an external reference matrix, the point cloud data of each area image includes a plurality of point cloud coordinates, the constructing, for each area image, a class depth image corresponding to each area image according to the point cloud data, and the pixel points in the class depth image and the pixel points in the area image have a corresponding relationship in pixel positions, including:
converting each point cloud coordinate according to the internal reference matrix and the external reference matrix to obtain a second homogeneous coordinate corresponding to each point cloud coordinate after conversion, wherein the second homogeneous coordinate refers to a coordinate under a coordinate system established by taking the camera as an origin;
determining a second pixel coordinate and a second pixel value of a pixel point corresponding to each point cloud coordinate according to each second homogeneous coordinate and the motion parameter corresponding to each area image;
and generating a class depth image corresponding to each area image according to the second pixel coordinates and the second pixel values of each pixel point.
6. The image processing method according to claim 5, wherein the camera is fixed to a pan-tilt, and the motion parameters of the pan-tilt include a first motion parameter and a second motion parameter, the first motion parameter being a motion parameter of a roll motion of the pan-tilt, and the second motion parameter being a motion parameter of a rotation motion of the pan-tilt;
The determining the second pixel coordinates and the second pixel values of the pixel points corresponding to each point cloud coordinate according to each second homogeneous coordinate and the motion parameters corresponding to each area image comprises:
determining the space pose of the cradle head when the camera shoots each area image;
determining a first motion parameter and a second motion parameter corresponding to each region image according to the space pose;
determining an image factor of each region image;
and for each area image, determining a second pixel coordinate and a second pixel value of a pixel point corresponding to each point cloud coordinate according to the image factor, the second homogeneous coordinate, the first motion parameter and the second motion parameter of the area image.
7. The image processing method according to claim 5, wherein the splicing the plurality of class depth images according to the plurality of splicing matrixes corresponding to the plurality of area images to obtain the panoramic class depth image after splicing includes:
for each region image, converting second pixel coordinates of each pixel point contained in the class depth image according to the splicing matrix corresponding to the region image so as to obtain a third homogeneous coordinate of each second pixel coordinate;
determining a third pixel coordinate corresponding to the point cloud coordinate of each pixel point contained in the class depth image according to each third homogeneous coordinate;
generating a second pixel point set corresponding to the plurality of class depth images according to the third pixel coordinates and the second pixel values of all the pixel points, wherein the second pixel point set comprises the third pixel coordinate and the second pixel value of each pixel point contained in the plurality of converted class depth images;
and generating the panoramic class depth image according to the second pixel point set.
8. The image processing method according to claim 1, characterized in that the method further comprises:
after the panoramic class depth image is generated, extracting second spliced pixel points in the panoramic class depth image;
under the condition that the pixel value of a second spliced pixel point is null, determining the second spliced pixel point as a second edge null point of the panoramic class depth image;
determining the number of second non-null points in the neighborhood of the second edge null point according to the pixel coordinates of the second edge null point, wherein the second non-null points are second spliced pixel points whose second spliced pixel values are not null in the panoramic class depth image;
under the condition that the number of the second non-null points in the neighborhood is greater than or equal to a second preset value, determining a second target pixel value and target point cloud data of the second edge null point according to the point cloud data and the second spliced pixel values of all the second non-null points in the neighborhood, so that the pixel value and the point cloud data of the second edge null point are not null;
and under the condition that all pixel values of the second edge null points are not null, determining that a second filling processing operation for the panoramic class depth image is completed to obtain a filled panoramic class depth image, and determining the point cloud data of each pixel point included in the spliced RGB image according to the corresponding relation between the filled panoramic class depth image and the spliced RGB image.
9. The image processing method according to claim 1, wherein the determining a splicing matrix corresponding to each area image according to the information of the pixel points of the area image includes:
extracting a feature point set of each region image through an image stitching algorithm, wherein each feature point set comprises a plurality of feature points;
matching a plurality of characteristic points of each characteristic point set through the image stitching algorithm to obtain a target matching point set corresponding to each characteristic point set;
and determining a splicing matrix corresponding to each area image according to the target matching point set of each area image.
10. The image processing method according to claim 9, wherein the feature point sets include a first type feature point set and a second type feature point set, and the matching the plurality of feature points of each feature point set by the image stitching algorithm to obtain a target matching point set corresponding to each feature point set includes:
matching a plurality of characteristic points of the first type of characteristic point set through a first image stitching algorithm, and determining characteristic points with first characteristic point scales smaller than a first preset distance in the first type of characteristic point set as first target characteristic points;
matching a plurality of feature points of the second type of feature point set through a second image stitching algorithm, and determining feature points with second feature point scales smaller than a second preset distance in the second type of feature point set as second target feature points;
and generating a target matching point set according to the first target characteristic point and the second target characteristic point.
11. The image processing method according to claim 10, wherein the determining a splicing matrix corresponding to each region image according to the target matching point set of each region image includes:
for any region image, under the condition that the number of target feature points in the target matching point set of the region image is greater than a preset value, determining a splicing matrix corresponding to the region image according to the target matching point set;
and under the condition that the number of the target feature points in the target matching point set of the area image is smaller than or equal to the preset value, matching the plurality of feature points in the first type feature point set through the first image stitching algorithm again until the number of the target feature points in the target matching point set of the area image is larger than the preset value.
12. A processor configured to perform the image processing method according to any one of claims 1 to 11.
13. An image processing apparatus, characterized in that the apparatus comprises:
the camera is fixed on the cradle head and used for acquiring a plurality of area images of the target scene;
the radar is fixed on the cradle head, connected with the camera and used for acquiring point cloud data of the target scene;
the cradle head is used for adjusting shooting angles of the camera and the radar; and
the processor of claim 12.
14. A machine-readable storage medium having instructions stored thereon, which when executed by a processor cause the processor to be configured to perform the image processing method according to any of claims 1 to 11.
CN202211741170.6A 2022-12-30 2022-12-30 Image processing method, device, storage medium and processor Pending CN116012227A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211741170.6A CN116012227A (en) 2022-12-30 2022-12-30 Image processing method, device, storage medium and processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211741170.6A CN116012227A (en) 2022-12-30 2022-12-30 Image processing method, device, storage medium and processor

Publications (1)

Publication Number Publication Date
CN116012227A true CN116012227A (en) 2023-04-25

Family

ID=86020729

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211741170.6A Pending CN116012227A (en) 2022-12-30 2022-12-30 Image processing method, device, storage medium and processor

Country Status (1)

Country Link
CN (1) CN116012227A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117455767A (en) * 2023-12-26 2024-01-26 深圳金三立视频科技股份有限公司 Panoramic image stitching method, device, equipment and storage medium
CN117455767B (en) * 2023-12-26 2024-05-24 深圳金三立视频科技股份有限公司 Panoramic image stitching method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111145238B (en) Three-dimensional reconstruction method and device for monocular endoscopic image and terminal equipment
CN110568447B (en) Visual positioning method, device and computer readable medium
CN107705333B (en) Space positioning method and device based on binocular camera
CN109961401B (en) Image correction method and storage medium for binocular camera
CN112434709B (en) Aerial survey method and system based on unmanned aerial vehicle real-time dense three-dimensional point cloud and DSM
KR101666959B1 (en) Image processing apparatus having a function for automatically correcting image acquired from the camera and method therefor
CN110782394A (en) Panoramic video rapid splicing method and system
JP6201476B2 (en) Free viewpoint image capturing apparatus and method
CN111107337B (en) Depth information complementing method and device, monitoring system and storage medium
WO2021184302A1 (en) Image processing method and apparatus, imaging device, movable carrier, and storage medium
US20210120194A1 (en) Temperature measurement processing method and apparatus, and thermal imaging device
CN110634138A (en) Bridge deformation monitoring method, device and equipment based on visual perception
CN110599586A (en) Semi-dense scene reconstruction method and device, electronic equipment and storage medium
CN106204554A (en) Depth of view information acquisition methods based on multiple focussing image, system and camera terminal
CN115345942A (en) Space calibration method and device, computer equipment and storage medium
DK3189493T3 (en) PERSPECTIVE CORRECTION OF DIGITAL PHOTOS USING DEPTH MAP
CN116012227A (en) Image processing method, device, storage medium and processor
CN111563961A (en) Three-dimensional modeling method and related device for transformer substation
CN112419424B (en) Gun-ball linkage calibration method and device and related equipment
CN112529498B (en) Warehouse logistics management method and system
CN116524022A (en) Offset data calculation method, image fusion device and electronic equipment
CN114387532A (en) Boundary identification method and device, terminal, electronic equipment and unmanned equipment
CN111630569B (en) Binocular matching method, visual imaging device and device with storage function
JP2002092597A (en) Method and device for processing image
CN109598677B (en) Three-dimensional image splicing method, device and equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination