WO2023184740A1 - Image processing system and method, intelligent terminal, and computer readable storage medium - Google Patents

Image processing system and method, intelligent terminal, and computer readable storage medium

Info

Publication number
WO2023184740A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
depth
image
dimensional
row
Prior art date
Application number
PCT/CN2022/100631
Other languages
French (fr)
Chinese (zh)
Inventor
王刚
余洪涛
谷涛
Original Assignee
奥比中光科技集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 奥比中光科技集团股份有限公司 filed Critical 奥比中光科技集团股份有限公司
Publication of WO2023184740A1 publication Critical patent/WO2023184740A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00 - General purpose image data processing
    • G06T 1/60 - Memory management
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/20 - Image signal generators
    • H04N 13/204 - Image signal generators using stereoscopic image cameras
    • H04N 13/207 - Image signal generators using stereoscopic image cameras using a single 2D image sensor
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/20 - Image signal generators
    • H04N 13/271 - Image signal generators wherein the generated image signals comprise depth maps or disparity maps

Definitions

  • the present invention relates to the technical field of image information computing and processing, and in particular to an image processing system, an image processing method, an intelligent terminal and a computer-readable storage medium.
  • when aligning two (or more) images, the existing image processing system needs to process all the data in the two images at one time and cannot calculate and process the data in batches.
  • the problem with the existing technology is that a large amount of data needs to be processed simultaneously when performing image alignment processing, which is not conducive to improving processing efficiency and accuracy of image alignment.
  • the huge amount of calculation requires high-performance hardware devices for processing, which is not conducive to image alignment processing in scenarios where the hardware device performance is not high, that is, it affects the applicability of image alignment processing.
  • the main purpose of the present invention is to provide an image processing system, method, intelligent terminal and computer-readable storage medium, aiming to solve the problem in the prior art that a large amount of data needs to be processed simultaneously during image alignment processing, which is not conducive to improving processing efficiency, is not conducive to image alignment processing in scenarios with low hardware device performance, and affects the applicability of image alignment processing.
  • a first aspect of the present invention provides an image processing system, including:
  • a depth camera configured to collect depth images of the target scene
  • a two-dimensional camera configured to collect two-dimensional images of the target scene
  • a memory configured to store the depth image, the two-dimensional image, the calibration information between the depth camera and the two-dimensional camera, and the depth measurement range of the depth camera;
  • a processor configured to determine the row index range corresponding to each row of the depth image in the two-dimensional image based on the calibration information and the depth measurement range, and to extract, according to the row index range, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image for alignment processing to determine the two-dimensional data corresponding to the depth data.
  • the above-mentioned extraction, according to the row index range, of at least one row of depth data in the depth image and the corresponding at least one row of two-dimensional data in the two-dimensional image for alignment processing to determine the two-dimensional data corresponding to the depth data includes:
  • dividing the depth image and the two-dimensional image according to the memory size of the processor and the row index range and extracting multiple groups of data to be processed, wherein each group of the data to be processed includes at least one row of depth data and the corresponding multiple rows of two-dimensional data, and adjacent rows of depth data share part of the two-dimensional data.
  • the above-mentioned alignment processing of the data to be processed to determine the two-dimensional data corresponding to the depth data includes:
  • transferring one group of the data to be processed to the memory for alignment processing at a time, until every group of the data to be processed has completed alignment processing;
  • wherein the total data amount of a group of memory storage data is not greater than the memory size of the processor, and a group of memory storage data includes a group of the data to be processed and the processed data obtained by aligning that group of data to be processed.
  • the above image processing system is further configured to: when transferring the i-th group of data to be processed, retain the overlapping data already in the memory and transfer only the data to be transferred;
  • wherein the overlapping data is the data corresponding to the overlapping row index range in the (i-1)-th group of data to be processed,
  • and the data to be transferred is the data outside the overlapping row index range in the i-th group of data to be processed.
  • the calibration information stored in the memory includes: the internal parameters of the depth camera, the internal parameters of the two-dimensional camera, the rotation matrix between the depth camera and the two-dimensional camera, and the translation matrix between the depth camera and the two-dimensional camera.
  • the depth measurement range stored in the memory includes: the maximum detection value and the minimum detection value of the depth camera.
  • the step of determining the row index range corresponding to each row of the depth image in the two-dimensional image based on the calibration information and the depth measurement range includes:
  • obtaining the maximum row index and the minimum row index corresponding to each row of the depth image based on the internal parameters of the depth camera, the internal parameters of the two-dimensional camera, the rotation matrix, the translation matrix, the maximum detection value and the minimum detection value;
  • the row index range corresponding to a row to be aligned in the depth image is used to indicate the range of the target alignment rows in the two-dimensional image;
  • the target alignment pixels in the two-dimensional image belong to the target alignment rows, and each target alignment pixel corresponds to a pixel to be aligned in the row to be aligned.
  • a second aspect of the present invention provides an image processing method, wherein the above-mentioned image processing method is applied to any of the above-mentioned image processing systems, and the above-mentioned method includes:
  • the processor determines the row index range corresponding to each row of the depth image in the two-dimensional image based on the calibration information and the depth measurement range, and extracts, according to the row index range, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image for alignment processing to determine the two-dimensional data corresponding to the depth data.
  • a third aspect of the present invention provides an intelligent terminal.
  • the intelligent terminal includes a memory, a processor, and an image processing program stored in the memory and executable on the processor;
  • when the image processing program is executed by the processor, the steps of the above-mentioned image processing method are implemented.
  • a fourth aspect of the present invention provides a computer-readable storage medium.
  • An image processing program is stored on the computer-readable storage medium.
  • the image processing program is executed by a processor, the steps of the image processing method are implemented.
  • the image processing system includes: a depth camera configured to collect a depth image of the target scene; a two-dimensional camera configured to collect a two-dimensional image of the target scene; a memory configured to store the depth image, the two-dimensional image, the calibration information between the depth camera and the two-dimensional camera, and the depth measurement range of the depth camera; and a processor configured to determine, based on the calibration information and the depth measurement range, the row index range corresponding to each row of the depth image in the two-dimensional image, and to extract, according to the row index range, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image for alignment processing to determine the two-dimensional data corresponding to the depth data.
  • the image processing system in the present invention can obtain, based on the calibration information and the depth measurement range, the row index range in the two-dimensional image corresponding to each row of the depth image, so that the data to be processed is grouped according to the corresponding row index ranges and one group of data to be processed is aligned each time.
  • the processor does not need to process all data at once, and the amount of data calculation each time is small, which is beneficial to improving processing efficiency.
  • data can be grouped reasonably according to the memory size of the processor, thereby rationally utilizing the computing power and storage capacity of the processor, which is beneficial to adapting to processors with different performance and improving the applicability of the image alignment process. This facilitates the operation process after image alignment (such as face recognition) and improves user experience.
  • Figure 1 is a schematic structural diagram of an image processing system provided by an embodiment of the present invention.
  • Figure 2 is a schematic diagram of the mapping relationship between a depth image and an RGB image provided by an embodiment of the present invention
  • Figure 3 is a schematic diagram of the mapping relationship between a depth image and an RGB image provided by an embodiment of the present invention
  • FIG. 4 is a schematic flowchart of an image processing method provided by an embodiment of the present invention.
  • Figure 5 is a functional block diagram of the internal structure of an intelligent terminal provided by an embodiment of the present invention.
  • the term “if” may be interpreted as “when” or “once” or “in response to determining” or “in response to detecting” depending on the context.
  • the phrase “if determined” or “if [the described condition or event] is detected” may be interpreted, depending on the context, to mean “once determined” or “in response to a determination” or “once [the described condition or event] is detected” or “in response to detection of [the described condition or event]”.
  • two-dimensional images include, for example, RGB images, grayscale images, IR images, etc.
  • aligning the depth image with the two-dimensional image means mapping the depth information in the depth image to the pixels of the corresponding two-dimensional image, which makes it convenient to use the depth information to assist in operations such as 3D recognition.
  • for example, the depth image is aligned with an RGB image obtained from a two-dimensional camera module (for example, an RGB camera module), and the obtained depth information is mapped to the pixels of the color image (i.e., D2C, Depth to Color).
  • image alignment calculations in currently common 3D devices need to be performed on an additional host or on the CPU of a mobile terminal, and cannot be directly performed on the image processor (such as a digital signal processor) corresponding to the 3D device. If a host is used, the usage scenarios of the 3D device are limited; if the CPU of the mobile terminal is used, the burden on the CPU is increased and the performance of other tasks is affected.
  • the D2C algorithm in the existing technology can usually only operate on the data collected by a specific 3D camera module; different 3D camera modules have individual differences and installation differences, making the existing D2C algorithm unable to adapt to and compute the data collected by different 3D camera modules.
  • to this end, the present invention provides an image processing system including: a depth camera configured to collect a depth image of a target scene; a two-dimensional camera configured to collect a two-dimensional image of the target scene;
  • a memory configured to store the depth image, the two-dimensional image, the calibration information between the depth camera and the two-dimensional camera, and the depth measurement range of the depth camera;
  • and a processor configured to determine, based on the calibration information and the depth measurement range, the row index range corresponding to each row of the depth image in the two-dimensional image, and to extract, according to the row index range, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image for alignment processing to determine the two-dimensional data corresponding to the depth data.
  • the image processing system in the present invention can obtain, based on the calibration information and the depth measurement range, the row index range in the two-dimensional image corresponding to each row of the depth image, so that the data to be processed is grouped according to the corresponding row index ranges and one group of data to be processed is aligned each time.
  • the processor does not need to process all data at once, and the amount of data calculation each time is small, which is beneficial to improving processing efficiency.
  • data can be grouped reasonably according to the memory size of the processor, thereby rationally utilizing the computing power and storage capacity of the processor, which is beneficial to adapting to processors with different performance and improving the applicability of the image alignment process. This facilitates the operation process after image alignment (such as face recognition) and improves user experience.
  • an embodiment of the present invention provides an image processing system, including:
  • Depth camera 10 is configured to collect depth images of the target scene
  • the two-dimensional camera 20 is configured to collect two-dimensional images of the target scene
  • the memory 30 is configured to store the above-mentioned depth image, the two-dimensional image, the calibration information between the above-mentioned depth camera and the above-mentioned two-dimensional camera, and the depth measurement range of the above-mentioned depth camera;
  • the processor 40 is configured to determine the row index range corresponding to each row of the depth image in the two-dimensional image based on the calibration information and the depth measurement range, and to extract, according to the row index range, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image for alignment processing to determine the two-dimensional data corresponding to the depth data.
  • the depth image and the two-dimensional image are images that need to be aligned, and the target scene is the scene where the image is captured.
  • alignment processing of two images is taken as an example for explanation.
  • multiple images can also be aligned based on the above image processing system, that is, there may be more than one two-dimensional image.
  • for example, the depth image can be aligned with a first two-dimensional image to obtain a first aligned image, and a second two-dimensional image can then be aligned with the first aligned image to obtain a second aligned image, thereby realizing alignment processing among the three images.
  • after alignment, the depth information in the depth image can be mapped to the corresponding pixels of the two-dimensional image, which is beneficial to assisting three-dimensional information recognition on the two-dimensional image (such as 3D face recognition) by combining the depth information.
  • the depth image and the two-dimensional image are captured by different cameras.
  • the depth image is captured by the depth camera 10 and the two-dimensional image is captured by the two-dimensional camera 20 .
  • the two-dimensional image is an RGB image to be aligned, which is captured by an RGB camera.
  • image alignment processing is performed on the depth image to be aligned and the RGB image to be aligned, and the depth information is mapped to the pixels of the color image (the RGB image to be aligned), so that image recognition can be further performed based on the color image and information about the corresponding environment or the photographed object can be obtained, which is conducive to better utilization of the depth information.
  • the calibration information includes the internal parameters of the depth camera 10, the internal parameters of the two-dimensional camera 20, the rotation matrix between the depth camera 10 and the two-dimensional camera 20, and the translation matrix between the depth camera 10 and the two-dimensional camera 20; the depth measurement range includes the maximum detection value and the minimum detection value of the depth camera 10.
  • the depth image to be aligned and the RGB image to be aligned can be aligned based on the calibration information.
  • further, the row index range in the two-dimensional image possibly corresponding to each row in the depth image can be determined, thereby facilitating row-by-row alignment of the depth image and the two-dimensional image, that is, realizing batch processing of the data.
  • the row index range corresponding to a row in the depth image is the index range of rows in which each pixel point corresponding to the pixel point in the row may exist in the two-dimensional image.
  • for example, the pixels corresponding to the pixels in the first row of the depth image may be distributed between the first and third rows of the two-dimensional image; the row index range corresponding to the first row of the depth image is then rows 1 to 3.
  • the depth measurement range can be determined based on the maximum working distance and the minimum working distance of the depth camera 10 .
  • the above-mentioned determination of the row index range corresponding to each row of the depth image in the two-dimensional image based on the calibration information and the depth measurement range includes: obtaining the maximum row index and the minimum row index corresponding to each row of the depth image based on the internal parameters of the depth camera 10, the internal parameters of the two-dimensional camera 20, the rotation matrix, the translation matrix, the maximum detection value and the minimum detection value, and obtaining the row index range corresponding to each row in the depth image based on the maximum row index and the minimum row index.
  • the row index range corresponding to a row to be aligned in the depth image is used to indicate the range of the target alignment row (i.e., the row corresponding to the row to be aligned) in the two-dimensional image.
  • the target alignment pixel belongs to the target alignment row, and the target alignment pixel corresponds to any to-be-aligned pixel in the row to be aligned.
  • all pixels in the row to be aligned are pixels to be aligned, each pixel to be aligned has a target alignment pixel in the two-dimensional image, and all target alignment pixels corresponding to a row to be aligned fall within the above row index range in the two-dimensional image.
  • row 1 in the depth image has 640 pixels, and these 640 pixels have corresponding 640 target alignment pixels in the 2D image.
  • the row index range corresponding to the 640 target alignment pixels in the two-dimensional image (for example, rows 1 to 3) is used as the row index range of the row to be aligned.
  • the maximum row index corresponding to the row to be aligned is 3, and the minimum row index is 1.
  • since the optical centers of the depth camera (such as a 3D camera module) and the two-dimensional camera module (such as an RGB camera module) do not coincide, in order to perform D2C alignment between the depth image and the two-dimensional image, the coordinate transformation relationship between the depth image pixel coordinate system and the RGB image pixel coordinate system is constructed, as shown in the following formula (1):
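  • (formula (1) itself is not reproduced in this text; the following is a reconstruction of the standard depth-to-color reprojection from the variable definitions below, where Z_ir, a symbol introduced only by this reconstruction, denotes the depth value recorded at (u_ir, v_ir) in the depth image)

$$
Z_{rgb}\begin{bmatrix}u_{rgb}\\ v_{rgb}\\ 1\end{bmatrix}
= K_{rgb}\left(R_{ir2rgb}\, Z_{ir}\, K_{ir}^{-1}\begin{bmatrix}u_{ir}\\ v_{ir}\\ 1\end{bmatrix} + T_{ir2rgb}\right)
\qquad (1)
$$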
  • where u_rgb and v_rgb respectively represent the abscissa and ordinate in the RGB image pixel coordinate system;
  • u_ir and v_ir respectively represent the abscissa and ordinate in the depth image pixel coordinate system;
  • R_ir2rgb represents the rotation matrix from the depth (IR) camera to the RGB camera;
  • T_ir2rgb represents the translation matrix from the depth camera to the RGB camera;
  • K_ir and K_rgb represent the internal parameters of the depth camera and the RGB camera respectively;
  • Z_rgb is the depth value, at the coordinate position (u_rgb, v_rgb) in the RGB image, corresponding to the (u_ir, v_ir) coordinate point in the depth image.
  • that is, the coordinates of a pixel to be aligned in the depth image are (u_ir, v_ir), and the coordinates of the target alignment pixel corresponding to that pixel in the RGB image are (u_rgb, v_rgb).
  • the D2C algorithm maps the depth values of the depth image to the corresponding pixels (target alignment pixels) on the RGB image, so that the result becomes RGBD data (that is, data that combines RGB color and depth information); the values of the RGB pixels do not need to be considered in the algorithm itself. In one application scenario, the RGB camera and the depth camera capture images separately, and the pixel coordinates of the same object are different in the two images.
  • Z_rgb is the depth of the (u_ir, v_ir) coordinate point mapped to the (u_rgb, v_rgb) point, that is, the depth value of that pixel in the aligned RGB image, in other words part of the processed data corresponding to the (u_ir, v_ir) coordinate point after alignment (the processed data includes the RGB color value and the depth value of each point in the aligned image).
  • during alignment, each pixel point in the depth image is aligned with its corresponding pixel point in the two-dimensional image and the corresponding depth information is mapped. Since (u_ir, v_ir) is determined (according to the coordinates of the pixel to be aligned), and the calibration information between the depth image and the two-dimensional image as well as the depth measurement range of the depth image are obtained in advance, only u_rgb, v_rgb and Z_rgb in formula (1) are unknown quantities that need to be calculated. In this embodiment the pixels in the depth image are aligned row by row, so the column index range corresponding to v_rgb does not need to be considered; a minimal numerical sketch of the per-pixel mapping is given below.
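  • as an illustration only, the following Python sketch implements the per-pixel mapping of formula (1) with numpy; the function name and argument layout are illustrative rather than taken from the patent, and the intrinsics and extrinsics are assumed to be given as 3x3 matrices and a 3-vector.

```python
import numpy as np

def d2c_map_pixel(u_ir, v_ir, depth_ir, K_ir, K_rgb, R_ir2rgb, T_ir2rgb):
    """Map one depth pixel (u_ir, v_ir) carrying depth value depth_ir to the RGB
    image, returning (u_rgb, v_rgb, Z_rgb) as in formula (1)."""
    # Back-project the depth pixel into the depth camera's 3D coordinate system.
    p_ir = depth_ir * (np.linalg.inv(K_ir) @ np.array([u_ir, v_ir, 1.0]))
    # Transform the 3D point into the RGB camera's coordinate system.
    p_rgb = R_ir2rgb @ p_ir + T_ir2rgb
    # Project onto the RGB image plane; the third camera-frame component is Z_rgb.
    uvw = K_rgb @ p_rgb
    return uvw[0] / uvw[2], uvw[1] / uvw[2], p_rgb[2]
```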
  • alternatively, the pixels in the depth image can also be aligned column by column, in which case only the column index range corresponding to v_rgb needs to be considered, without considering the row index range corresponding to u_rgb.
  • the plan will not be described in detail here.
  • line-by-line processing is taken as an example.
  • the unknown quantity Z_rgb in formula (1) can be moved to the left side of the equation, so that for a certain row to be aligned (row u_ir) in the depth image, the maximum and minimum values of u_rgb are respectively obtained according to the maximum and minimum depth values and serve as the maximum row index and the minimum row index of that row to be aligned.
  • for example, the resolution of a common depth image is 640*480, in which case the value range of u_ir is 0 to 639 (counting from line 0) and the value range of v_ir is 0 to 479 (counting from column 0).
  • the maximum and minimum values of the depth value can be determined based on the working distance of the depth camera.
  • Figure 2 is a schematic diagram of the mapping relationship between a depth image and an RGB image provided by an embodiment of the present invention.
  • by substituting one extreme of the depth measurement range into the above formula (1), the corresponding minimum row index can be obtained, and by substituting the other extreme, the corresponding maximum row index can be obtained; in this way the row index range in the two-dimensional image corresponding to a row to be aligned in the depth image is obtained and the data that needs to be transferred each time can be determined.
  • the corresponding row index ranges can be calculated in advance based on the working distance of the depth camera (that is, the detection distance range of the depth camera, determined according to the specific depth camera and the application scene settings) and the camera calibration information, and a corresponding index table can be established from the minimum row index (color_y_min) and the maximum row index (color_y_max) corresponding to each row, so as to determine the row index range corresponding to each row in the depth image.
  • Figure 2 shows the corresponding mapping relationship between the depth image and the RGB image.
  • since the aligned image can be considered to coincide with the RGB image, Figure 2 can also be regarded as the corresponding mapping relationship between the depth image and the aligned image.
  • specifically, an algorithm (or program, etc.) for calculating the row index range can be preset; the inter-camera rotation matrix R_ir2rgb, the translation matrix T_ir2rgb, the depth camera internal parameters K_ir and the RGB camera internal parameters K_rgb are input to the corresponding algorithm, and at the same time the maximum detection value depth_max and the minimum detection value depth_min are obtained based on the working distance of the depth camera and input into the above algorithm.
  • the maximum detection value and the minimum detection value can also be confirmed in advance by setting the depth camera and its application scenario.
  • in scenarios where the subject to be photographed is predetermined, depth value analysis can also be performed on the captured depth image, and the actual maximum detection value and minimum detection value in that depth image can be obtained and used in the calculation, which improves calculation accuracy, allows better data handling and improves image alignment efficiency; a small sketch of this per-frame analysis is given below.
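  • a minimal sketch of such per-frame depth analysis, assuming a numpy depth frame in which the value 0 marks invalid pixels (a common but device-specific convention, not stated in the patent):

```python
import numpy as np

def frame_depth_range(depth_frame, invalid_value=0):
    """Return (depth_min, depth_max) over the valid pixels of one captured depth frame.

    Using measured extremes instead of the camera's full working range yields
    tighter row index ranges and therefore fewer rows to transfer per group.
    """
    valid = depth_frame[depth_frame != invalid_value]
    if valid.size == 0:
        raise ValueError("depth frame contains no valid pixels")
    return float(valid.min()), float(valid.max())
```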
  • the maximum row index and the minimum row index corresponding to each row in the depth image are calculated sequentially, thereby obtaining the row index range corresponding to each row.
  • specifically, u_ir is fixed for a row to be aligned in the depth image; each value of v_ir (for example, 0 to 639) is traversed, and depth_max and depth_min are substituted into the above formula (1), yielding for that row 640 candidate minimum indexes and 640 candidate maximum indexes; the minimum value among all candidate minimum indexes is used as the minimum row index corresponding to the row to be aligned, and the maximum value among all candidate maximum indexes is used as the maximum row index corresponding to the row to be aligned.
  • each row and its corresponding row index range can further be made into a lookup table to facilitate storage and search; based on the lookup table, it can be quickly determined which rows in the two-dimensional image a row in the depth image may correspond to, which is helpful for data grouping and handling.
  • the same depth max and depth min are used for calculation of each row to be aligned.
  • in other embodiments, the above depth measurement range may include a maximum detection value and a minimum detection value for each row in the depth image, so that the maximum row index, minimum row index and row index range corresponding to each row are calculated based on that row's own maximum and minimum detection values; a sketch of the table construction (using a single global depth range) is given below.
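  • as an illustrative sketch of the lookup-table construction, the code below projects every pixel of a depth row at both depth extremes and keeps the smallest and largest projected row; treating v as the vertical (row) coordinate is an assumption here, since the text's u/v naming varies, and all names are illustrative.

```python
import numpy as np

def build_row_index_table(depth_rows, depth_cols, K_ir, K_rgb, R_ir2rgb, T_ir2rgb,
                          depth_min, depth_max, rgb_rows):
    """For each depth-image row, return (color_y_min, color_y_max): the RGB rows
    its pixels can map to anywhere inside [depth_min, depth_max]."""
    K_ir_inv = np.linalg.inv(K_ir)
    table = []
    for row in range(depth_rows):                      # the row to be aligned
        y_min, y_max = np.inf, -np.inf
        for col in range(depth_cols):                  # traverse every pixel of the row
            for depth in (depth_min, depth_max):       # substitute both depth extremes
                p_rgb = R_ir2rgb @ (depth * (K_ir_inv @ np.array([col, row, 1.0]))) + T_ir2rgb
                uvw = K_rgb @ p_rgb
                y = uvw[1] / uvw[2]                    # projected row in the RGB image
                y_min, y_max = min(y_min, y), max(y_max, y)
        # clamp to valid RGB rows and store as integer indices
        table.append((int(max(0, np.floor(y_min))),
                      int(min(rgb_rows - 1, np.ceil(y_max)))))
    return table
```

  • the table only needs to be computed once per camera module (or once per frame if per-frame depth extremes are used) and is then merely looked up during alignment.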
  • the processor 40 is an image processor, and specifically is an image processor used for image alignment processing.
  • the image processor may be a digital signal processor.
  • since all data that require image alignment processing can be processed in groups (batches), the corresponding image processor is not required to have very high hardware performance, so the processing can be performed directly on the digital signal processor in the depth camera (or 3D camera module) without using an additional host or mobile terminal; this not only improves the computing efficiency of the D2C algorithm but also reduces system resource consumption, reduces hardware power consumption, and improves applicability.
  • in this embodiment, the calculation process of the image alignment is performed on a digital signal processor; in this way, the burden on the mobile terminal CPU can be reduced and the power consumption of the entire system can be reduced.
  • extracting at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image according to the row index range and performing alignment processing to determine the two-dimensional data corresponding to the depth data includes: extracting, according to the memory size of the processor 40 and the row index range, multiple rows of depth data in the depth image and the corresponding multiple rows of two-dimensional data in the two-dimensional image to form multiple groups of data to be processed; and performing alignment processing on each group of data to be processed separately to determine the two-dimensional data corresponding to each row of depth data.
  • specifically, the above D2C calculation process is performed on the DSP, while neither the depth image nor the two-dimensional image is stored in the on-chip memory of the DSP; any data that needs to be accessed but is not in the DSP's on-chip memory must first be moved into the on-chip memory. At the same time, the on-chip memory of the DSP is limited and usually cannot accommodate all the data of the depth image and the two-dimensional image, so the data needs to be transported and calculated in batches, while ensuring that each transported batch contains, for every row to be aligned, the data of its corresponding target alignment rows. In this embodiment, data is transferred from Synchronous Dynamic Random Access Memory (SDRAM) to the on-chip memory of the DSP.
  • in some existing solutions, the user is required to decide how many rows of data to move at a time and to manually enter corresponding instructions to move data back and forth between the on-chip memory and the SDRAM; in different situations, different instructions need to be input, so the operation is inconvenient and not conducive to improving image processing efficiency.
  • such solutions also require the user to manually calculate the amount of data moved each time, which makes it difficult to fully utilize the on-chip memory and easily increases the number of data moves, consuming computing time.
  • in this embodiment, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image are extracted according to the row index range to form a group of data to be processed, and the group of data to be processed is transported to the memory of the processor 40;
  • the processor 40 performs alignment processing on the group of data to be processed and determines the two-dimensional data corresponding to the depth data in that group, and then the next group of data to be processed is transferred to the memory for alignment processing, until every group of data to be processed has been aligned.
  • the sum of the data amounts of a set of memory storage data is not greater than the memory size of the processor 40 , and a set of memory storage data includes a set of data to be processed and processed data obtained by aligning the set of data to be processed.
  • multiple sets of data to be processed can also be transported to the processor 40 at the same time for alignment processing. It should be noted that due to the limitation of the memory size of the processor 40, only a few lines to dozens of lines of data can be transferred to the on-chip memory for processing at a time. Therefore, the depth image and the two-dimensional image can be divided in advance according to actual needs. Determine the data to be processed that needs to be moved each time.
  • the number of rows in the two-dimensional image corresponding to one row in the depth image can be determined according to the predetermined row index range, thereby determining the amount of data to be processed corresponding to one row in the depth image, so that the data to be processed can be grouped (for example, determining how many rows of the depth image are grouped together).
  • the above image processing system can also be applied to image data with different characteristics (such as different resolutions) collected by different camera modules, and performs appropriate grouping and handling for the different image data, so that the D2C algorithm can adapt to the differences between different modules and make full use of the hardware performance of the DSP without too much manual intervention.
  • each group of the data to be processed includes at least one row of depth data and corresponding multiple rows of two-dimensional data, and adjacent rows of data to be processed share part of the two-dimensional data.
  • the shared part of two-dimensional data means that the two-dimensional data corresponding to the data to be processed in adjacent rows may have overlapping parts. The overlapping parts can be retained and reused during the alignment process, and there is no need to repeatedly transport them.
  • for example, a group of data to be processed includes rows 1 and 2 of the depth image, where the row index range corresponding to row 1 of the depth image is rows 1 to 3 of the two-dimensional image, and the row index range corresponding to row 2 of the depth image is rows 2 to 5 of the two-dimensional image;
  • the above group of data to be processed then includes all the data of rows 1 to 5 of the two-dimensional image, and the depth data of row 1 and the depth data of row 2 share rows 2 to 3 of the two-dimensional data. In this way, it can be ensured that the data required for each image alignment pass is transferred to the on-chip memory while avoiding the wasted data volume caused by repeated transfers.
  • a group of memory storage data includes a group of data to be processed and the processed data corresponding to that group of data to be processed.
  • it should be noted that on the DSP, not only the input data (a group of data to be processed) needs to be transported, but also the output data (the aligned image data, Aligned_Depth), that is, the processed data corresponding to that group of data to be processed.
  • both pieces of data used in the calculation need to be stored in the on-chip memory at the same time, and the on-chip memory size is fixed, so the sum of the data amounts of a group of memory storage data must be limited to be no greater than the on-chip memory size.
  • each group of the above-mentioned data to be processed includes first data to be processed and second data to be processed.
  • the above-mentioned first data to be processed is the data corresponding to the row to be processed in the above-mentioned depth image
  • the above-mentioned second data to be processed is all the data in the row index range, in the two-dimensional image, corresponding to the row to be processed;
  • the row to be processed includes at least one row in the depth image.
  • extracting at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image according to the row index range to form at least one group of data to be processed specifically includes: dividing the depth image and the two-dimensional image respectively according to the memory size of the processor 40 and the row index range and extracting multiple groups of data to be processed, wherein each group of data to be processed includes first data to be processed and second data to be processed, the first data to be processed is the data corresponding to the row to be processed in the depth image, the second data to be processed is all the data in the row index range in the two-dimensional image corresponding to the row to be processed, and the row to be processed includes at least one row in the depth image.
  • the data volume of the corresponding output data can be obtained according to the data volume of the input data, thereby limiting the data transferred each time.
  • the amount of data corresponding to one line of the depth image and the amount of data corresponding to one line of the RGB image can be preset. Since the output data obtained after alignment is data containing RGB image information and depth information, that is, the data type is certain, and one pixel in the depth image can correspond to one output pixel data, and the number of rows and data volume of the output data can be determined based on the input data to be processed. In this embodiment, it is only necessary to ensure that the sum of the size of the input data and the size of the output data does not exceed the size of the on-chip memory.
  • different camera modules have different internal parameters, so the row index range corresponding to each row in the acquired depth image also differs, and therefore the number of rows of data that different camera modules can handle in one pass also differs. For example, if the data generated by one camera module allows 8 rows to be transferred at a time, 480 rows of data require 60 passes; if the data generated by another camera module allows 10 rows at a time, 480 rows of data require only 48 passes. The smaller the number of transfers, the smaller the system overhead.
  • in this way, the performance of the DSP can be fully utilized, the data can be grouped reasonably into multiple groups of data to be processed, and each group can be processed in turn, thereby rationally utilizing the memory and processing capability of the DSP while improving the efficiency of image alignment processing; a rough grouping sketch follows.
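  • purely as an illustration of sizing groups from the on-chip memory budget, the sketch below packs consecutive depth rows greedily; the byte counts, names and the greedy strategy are assumptions rather than the patent's prescribed procedure.

```python
def plan_groups(row_table, depth_row_bytes, rgb_row_bytes, out_row_bytes, onchip_bytes):
    """Greedily pack consecutive depth rows into groups.

    row_table[i] = (color_y_min, color_y_max) for depth row i. A group holds its
    depth rows, every RGB row covered by their combined index range, and the
    aligned output rows; the total must not exceed the on-chip memory size.
    """
    groups, start, n = [], 0, len(row_table)
    while start < n:
        fitted = None
        end = start
        while end < n:
            y_min = min(r[0] for r in row_table[start:end + 1])
            y_max = max(r[1] for r in row_table[start:end + 1])
            total = ((end - start + 1) * depth_row_bytes      # input depth rows
                     + (y_max - y_min + 1) * rgb_row_bytes    # input RGB rows, overlap counted once
                     + (end - start + 1) * out_row_bytes)     # aligned output rows
            if total > onchip_bytes:
                break
            fitted = (start, end, y_min, y_max)
            end += 1
        if fitted is None:
            raise MemoryError("a single depth row does not fit in on-chip memory")
        groups.append(fitted)
        start = fitted[1] + 1
    return groups
```

  • with 8 depth rows per group this reproduces the 60 passes for 480 rows mentioned above, and with 10 rows per group the 48 passes; keeping the groups contiguous is what makes the overlap reuse described below possible.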
  • the row index range calculated in this embodiment is a rough range calculated based on the maximum detection value and the minimum detection value, rather than an exact range calculated based on the specific depth value of each pixel in the depth image. Therefore, overlapping row index ranges may appear in several consecutive rows to be aligned (or there may not be overlapping row index ranges), that is, corresponding adjacent row depth data may share two-dimensional data within overlapping row index ranges. If the corresponding data in the overlapping row index range is moved repeatedly, time will be wasted. Therefore, in this embodiment, the above system is also used for:
  • when transporting the i-th group of data to be processed, retaining the overlapping data already in the on-chip memory and transporting only the data to be transferred, wherein the overlapping data is the data corresponding to the overlapping row index range in the (i-1)-th group of data to be processed,
  • and the data to be transferred is the data outside the overlapping row index range in the i-th group of data to be processed.
  • Figure 3 is a schematic diagram of the mapping relationship between a depth image and an RGB image provided by an embodiment of the present invention.
  • a set of data to be processed in the depth image includes one line of data in the depth image and multiple lines of data in the RGB image.
  • the row index ranges corresponding to the i-1th row of data and the i-th row of data in the depth image overlap, and the corresponding overlapping data is the data between the dotted lines in the RGB image in Figure 3.
  • when the (i-1)-th group of data to be processed is processed, the overlapping data is transferred to the on-chip memory, and when the i-th group of data to be processed is processed, the same overlapping data is needed again.
  • therefore, the overlapping data in the on-chip memory can be retained, or the overlapping data can be moved within the on-chip memory according to actual needs (the speed of moving data inside the on-chip memory is much faster than the transfer speed between the SDRAM and the on-chip memory).
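  • the per-group transfer loop with overlap reuse might then look like the following sketch; fetch_rgb_rows, fetch_depth_rows, align_group and write_back are placeholders for the platform's SDRAM-to-on-chip transfer primitives and the D2C computation, not real APIs.

```python
def process_groups(groups, fetch_rgb_rows, fetch_depth_rows, align_group, write_back):
    """Run alignment group by group, transferring only RGB rows not already on chip."""
    cached = None                                   # RGB row range currently in on-chip memory
    for start, end, y_min, y_max in groups:
        if cached is None or y_min > cached[1] or y_max < cached[0]:
            fetch_rgb_rows(y_min, y_max)            # no overlap with the previous group
        else:
            if y_min < cached[0]:
                fetch_rgb_rows(y_min, cached[0] - 1)     # only the rows before the cached range
            if y_max > cached[1]:
                fetch_rgb_rows(cached[1] + 1, y_max)     # only the rows after the cached range
        cached = (y_min, y_max)
        depth_block = fetch_depth_rows(start, end)  # depth rows of this group
        aligned = align_group(depth_block, y_min, y_max)   # per-pixel mapping, formula (1)
        write_back(start, end, aligned)             # move the aligned output off chip
```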
  • the above image processing process can also be performed in combination with Single Instruction Multiple Data (SIMD) technology to further improve the speed of the image data alignment processing.
  • the image alignment processing is performed in batches on the digital signal processor.
  • in other embodiments, the image alignment processing can also be performed in batches on a CPU, or on an FPGA, a dedicated digital chip, etc., so as to make reasonable and full use of the image processor.
  • in the above solution, the prior information of the 3D camera module (including the row index range corresponding to each row in the depth image) is obtained, the data are grouped according to the prior information to obtain multiple groups of data to be processed, and the groups are transported and processed in sequence;
  • processing each group of data to be processed in this way makes full use of the performance of the digital signal processor and reduces the number of data transfers, while ensuring that the data that needs to be stored at any one time does not exceed the on-chip memory of the digital signal processor, thereby reducing the power consumption of the hardware system.
  • This enables the 3D camera module to be suitable for various scenarios.
  • the above system can also adapt to different 3D camera modules: it takes the differences in calibration parameters of different modules into account, is configured according to the specific calibration parameters, makes full use of the hardware performance of the DSP, and performs parallel processing of data on the DSP.
  • overlapping data among the data to be transferred is also taken into account, avoiding repeated transfer of overlapping data, further reducing processing time and improving the efficiency of image alignment processing, which facilitates operations after image alignment (such as face recognition) and is conducive to improving the user experience.
  • an embodiment of the present invention also provides an image processing method.
  • the above method is applied to any of the above image processing systems.
  • the above method includes:
  • Step S100: collect the depth image of the target scene through the depth camera and store it in the memory;
  • Step S200: collect the two-dimensional image of the target scene through the two-dimensional camera and store it in the memory;
  • Step S300: obtain the calibration information between the depth camera and the two-dimensional camera and the depth measurement range of the depth camera, both pre-stored in the memory;
  • Step S400: the processor determines the row index range corresponding to each row of the depth image in the two-dimensional image based on the calibration information and the depth measurement range, and extracts, according to the row index range, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image for alignment processing to determine the two-dimensional data corresponding to the depth data.
  • the depth image and the two-dimensional image are images that need to be aligned, and the target scene is the scene where the image is captured.
  • the alignment process of two images is taken as an example for explanation.
  • multiple images can be aligned based on the image processing method, that is, there is more than one two-dimensional image.
  • for example, a first aligned image is obtained after the depth image is aligned with a first two-dimensional image, and a second aligned image is obtained by aligning a second two-dimensional image with the first aligned image, thereby realizing alignment processing among the three images.
  • after alignment, the depth information in the depth image can be mapped to the corresponding pixels of the two-dimensional image, which is beneficial to assisting three-dimensional information recognition on the two-dimensional image (such as 3D face recognition) by combining the depth information.
  • the depth image and the two-dimensional image are captured by different cameras.
  • the depth image is captured by the depth camera, and the two-dimensional image is captured by the two-dimensional camera.
  • the two-dimensional image is an RGB image to be aligned, which is captured by an RGB camera.
  • image alignment processing is performed on the depth image to be aligned and the RGB image to be aligned, and the depth information is mapped to the pixels of the color image (the RGB image to be aligned), so that image recognition can be further performed based on the color image and information about the corresponding environment or the photographed object can be obtained, which is conducive to better utilization of the depth information.
  • the calibration information includes the internal parameters of the depth camera, the internal parameters of the two-dimensional camera, the rotation matrix between the depth camera and the two-dimensional camera, and the translation matrix between the depth camera and the two-dimensional camera;
  • the depth measurement range includes the maximum detection value and the minimum detection value of the depth camera.
  • the depth image to be aligned and the RGB image to be aligned can be aligned based on the calibration information.
  • further, the row index range in the two-dimensional image possibly corresponding to each row in the depth image can be determined, thereby facilitating row-by-row alignment of the depth image and the two-dimensional image, that is, realizing batch processing of the data.
  • the row index range corresponding to a row in the depth image is the index range of rows in which each pixel point corresponding to the pixel point in the row may exist in the two-dimensional image.
  • the processor is an image processor, and is specifically an image processor used for image alignment processing.
  • the image processor may be a digital signal processor.
  • since all data that require image alignment processing can be processed in groups (batches), the corresponding image processor is not required to have very high hardware performance, so the processing can be performed directly on the digital signal processor in the depth camera (or 3D camera module) without using an additional host or mobile terminal; this not only improves the computing efficiency of the D2C algorithm but also reduces system resource consumption, reduces hardware power consumption, and improves applicability.
  • the present invention also provides an intelligent terminal, the functional block diagram of which can be shown in Figure 5 .
  • Smart terminals include processors and memories.
  • the memory of the smart terminal includes an image processing program, and the memory provides an environment for the execution of the image processing program. When the image processing program is executed by the processor, the steps of any of the above image processing methods are implemented.
  • the smart terminal may also include other functional modules or units, which are not specifically limited here.
  • the image processing program performs the following operation instructions when executed by the processor:
  • the calibration information is the pose calibration information between the depth camera and the two-dimensional camera;
  • Embodiments of the present invention also provide a computer-readable storage medium.
  • An image processing program is stored on the computer-readable storage medium.
  • the image processing program is executed by a processor, the steps of any image processing method provided by the embodiments of the present invention are implemented.
  • sequence number of each step in the above embodiment does not mean the order of execution.
  • the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiment of the present invention.
  • division into functional units or modules means dividing the internal structure of the above device into different functional units or modules to complete all or part of the functions described above.
  • Each functional unit and module in the embodiment can be integrated into one processing unit, or each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or in the form of software functional units.
  • the specific names of each functional unit and module are only for the convenience of distinguishing each other and are not used to limit the scope of the present invention.
  • For the specific working processes of the units and modules in the above system please refer to the corresponding processes in the foregoing method embodiments, and will not be described again here.
  • the disclosed apparatus/terminal equipment and methods can be implemented in other ways.
  • the apparatus/terminal equipment embodiments described above are only illustrative.
  • the division of the above modules or units is only a logical functional division; in actual implementation there may be other ways of division, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • if the above-mentioned integrated modules/units are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the present invention can implement all or part of the processes in the above embodiment methods, which can also be completed by instructing the relevant hardware through a computer program;
  • the computer program can be stored in a computer-readable storage medium, and when executed by the processor, the steps of each of the above method embodiments can be implemented.
  • the above-mentioned computer program includes computer program code, and the above-mentioned computer program code may be in the form of source code, object code, executable file or some intermediate form, etc.
  • the above-mentioned computer-readable medium may include: any entity or device capable of carrying the above-mentioned computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, etc. It should be noted that the content contained in the above computer-readable storage medium can be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction.

Abstract

An image processing system and method, an intelligent terminal, and a computer readable storage medium. The image processing system comprises: a depth camera (10) configured to acquire a depth image of a target scene; a two-dimensional camera (20) configured to acquire a two-dimensional image of the target scene; a memory (30) configured to store the depth image, the two-dimensional image, calibration information between the depth camera (10) and the two-dimensional camera (20), and a depth measurement range of the depth camera (10); and a processor (40) configured to determine a corresponding row index range of each row of the depth image in the two-dimensional image according to the calibration information and the depth measurement range, and extract at least one row of depth data in the depth image and at least one row of corresponding two-dimensional data in the two-dimensional image according to the row index range for alignment to determine two-dimensional data corresponding to depth data. In this way, compared with the prior art, the image alignment efficiency is improved.

Description

Image processing system, method, intelligent terminal and computer-readable storage medium
This application claims priority to the Chinese patent application filed with the China Patent Office on April 1, 2022, with application number 202210338866.8 and entitled "Image processing system, system, method, intelligent terminal and computer-readable storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the technical field of image information computing and processing, and in particular to an image processing system, an image processing method, an intelligent terminal and a computer-readable storage medium.
Background
With the development of science and technology, the acquisition and application of image information are becoming more and more extensive. In some application scenarios, multiple acquired images need to be aligned in order to obtain the corresponding data information in the multiple images. For example, in 3D vision application scenarios, in order to make better use of the depth data in a depth image, corresponding image processing needs to be performed on the depth image of a scene and another two-dimensional image of the same scene (such as an RGB image or a grayscale image) so that the depth image is aligned with the two-dimensional image, that is, the depth information in the depth image is mapped to the pixels of the corresponding two-dimensional image, which facilitates using the depth information to assist in operations such as 3D recognition.
When aligning two (or more) images, the existing image processing system needs to process all the data in the two images at one time and cannot calculate and process the data in batches. The problem with the existing technology is that a large amount of data needs to be processed simultaneously when performing image alignment processing, which is not conducive to improving the processing efficiency and accuracy of image alignment. At the same time, the huge amount of calculation requires high-performance hardware devices for processing, which is not conducive to image alignment processing in scenarios where the hardware device performance is not high, that is, it affects the applicability of image alignment processing.
Summary of the invention
The main purpose of the present invention is to provide an image processing system, method, intelligent terminal and computer-readable storage medium, aiming to solve the problems in the prior art that a large amount of data needs to be processed simultaneously during image alignment processing, which is not conducive to improving processing efficiency, is not conducive to performing image alignment processing in scenarios where hardware performance is limited, and affects the applicability of image alignment processing.
In order to achieve the above objects, a first aspect of the present invention provides an image processing system, including:
a depth camera configured to collect a depth image of a target scene;
a two-dimensional camera configured to collect a two-dimensional image of the target scene;
a memory configured to store the depth image, the two-dimensional image, calibration information between the depth camera and the two-dimensional camera, and a depth measurement range of the depth camera;
a processor configured to determine, according to the calibration information and the depth measurement range, a row index range corresponding to each row of the depth image in the two-dimensional image, and to extract, according to the row index range, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image for alignment processing, so as to determine the two-dimensional data corresponding to the depth data.
Optionally, extracting, according to the row index range, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image for alignment processing to determine the two-dimensional data corresponding to the depth data includes:
dividing the depth image and the two-dimensional image according to the memory size of the processor and the row index range, and extracting multiple groups of data to be processed, wherein each group of data to be processed includes at least one row of depth data and the corresponding multiple rows of two-dimensional data, and adjacent rows of depth data share part of the two-dimensional data.
Optionally, performing alignment processing on the data to be processed to determine the two-dimensional data corresponding to the depth data includes:
transferring one group of the data to be processed into the memory of the processor, performing alignment processing on this group of data to be processed by the processor to determine the two-dimensional data corresponding to the depth data in this group, and then transferring the next group of data to be processed into the memory for alignment processing, until every group of data to be processed has completed alignment processing;
wherein the total data amount of one group of memory-stored data is not greater than the memory size of the processor, and one group of memory-stored data includes one group of the data to be processed and the processed data obtained by performing alignment processing on this group of data to be processed.
Optionally, the image processing system is further configured to:
when transferring the i-th group of data to be processed, obtain the row index overlapping range between the i-th group of data to be processed and the (i-1)-th group of data to be processed, retain the overlapping data in the memory, and transfer the data to be transferred into the memory;
wherein the overlapping data is the data in the (i-1)-th group of data to be processed that corresponds to the row index overlapping range, and the data to be transferred is the data in the i-th group of data to be processed that lies outside the row index overlapping range.
Optionally, the calibration information stored in the memory includes: intrinsic parameters of the depth camera, intrinsic parameters of the two-dimensional camera, a rotation matrix between the depth camera and the two-dimensional camera, and a translation matrix between the depth camera and the two-dimensional camera.
Optionally, the depth measurement range stored in the memory includes: a maximum detection value and a minimum detection value of the depth camera.
Optionally, determining, according to the calibration information and the depth measurement range, the row index range corresponding to each row of the depth image in the two-dimensional image includes:
obtaining a maximum row index and a minimum row index corresponding to each row of the depth image according to the intrinsic parameters of the depth camera, the intrinsic parameters of the two-dimensional camera, the rotation matrix, the translation matrix, the maximum detection value and the minimum detection value;
obtaining the row index range corresponding to each row in the depth image according to the maximum row index and the minimum row index;
wherein the row index range corresponding to a row to be aligned in the depth image is used to indicate the range of target alignment rows in the two-dimensional image, a target alignment pixel in the two-dimensional image belongs to a target alignment row, and the target alignment pixel corresponds to any pixel to be aligned in the row to be aligned.
A second aspect of the present invention provides an image processing method, wherein the image processing method is applied to any one of the above image processing systems, and the method includes:
collecting a depth image of a target scene through a depth camera and storing it in a memory;
collecting a two-dimensional image of the target scene through a two-dimensional camera and storing it in the memory;
obtaining the calibration information between the depth camera and the two-dimensional camera and the depth measurement range of the depth camera that are pre-stored in the memory;
determining, by a processor according to the calibration information and the depth measurement range, the row index range corresponding to each row of the depth image in the two-dimensional image, and extracting, according to the row index range, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image for alignment processing, so as to determine the two-dimensional data corresponding to the depth data.
A third aspect of the present invention provides an intelligent terminal. The intelligent terminal includes a memory, a processor, and an image processing program stored in the memory and executable on the processor, wherein the steps of the above image processing method are implemented when the image processing program is executed by the processor.
A fourth aspect of the present invention provides a computer-readable storage medium on which an image processing program is stored, wherein the steps of the above image processing method are implemented when the image processing program is executed by a processor.
As can be seen from the above, the image processing system provided by the present invention includes: a depth camera configured to collect a depth image of a target scene; a two-dimensional camera configured to collect a two-dimensional image of the target scene; a memory configured to store the depth image, the two-dimensional image, the calibration information between the depth camera and the two-dimensional camera, and the depth measurement range of the depth camera; and a processor configured to determine, according to the calibration information and the depth measurement range, the row index range corresponding to each row of the depth image in the two-dimensional image, and to extract, according to the row index range, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image for alignment processing, so as to determine the two-dimensional data corresponding to the depth data. Compared with the prior-art solution that needs to process all the data in two images at one time and cannot calculate and process the data in batches, the image processing system of the present invention can obtain, from the calibration information and the depth measurement range, the row index range corresponding to each row of the depth image in the two-dimensional image, group the data to be processed according to the corresponding row index ranges, and align one group of data to be processed at a time. In this way, the processor does not need to process all the data at once, the amount of data computed each time is small, and processing efficiency is improved. Meanwhile, the data can be grouped reasonably according to the memory size of the processor, so that the computing power and storage capacity of the processor are used rationally, which helps the scheme suit processors of different performance levels and improves the applicability of the image alignment process. This in turn facilitates subsequent operations after image alignment (such as face recognition) and improves user experience.
Description of the drawings
In order to explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description only illustrate some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings based on these drawings without creative effort.
Figure 1 is a schematic structural diagram of an image processing system provided by an embodiment of the present invention;
Figure 2 is a schematic diagram of the mapping relationship between a depth image and an RGB image provided by an embodiment of the present invention;
Figure 3 is a schematic diagram of the mapping relationship between a depth image and an RGB image provided by an embodiment of the present invention;
Figure 4 is a schematic flowchart of an image processing method provided by an embodiment of the present invention;
Figure 5 is a functional block diagram of the internal structure of an intelligent terminal provided by an embodiment of the present invention.
Detailed description of the embodiments
In the following description, specific details such as particular system structures and techniques are set forth for the purpose of illustration rather than limitation, so as to provide a thorough understanding of the embodiments of the present invention. However, it will be apparent to those skilled in the art that the present invention can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits and methods are omitted so that unnecessary detail does not obscure the description of the present invention.
It should be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or collections thereof.
It should also be understood that the terminology used in the specification of the present invention is for the purpose of describing particular embodiments only and is not intended to limit the present invention. As used in the specification of the present invention and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the specification of the present invention and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes these combinations.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when" or "once" or "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as "once it is determined" or "in response to determining" or "once [the described condition or event] is detected" or "in response to detecting [the described condition or event]".
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings of the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention rather than all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of the present invention.
Many specific details are set forth in the following description to facilitate a full understanding of the present invention, but the present invention can also be implemented in other ways different from those described here, and those skilled in the art can make similar generalizations without departing from the essence of the present invention. Therefore, the present invention is not limited to the specific embodiments disclosed below.
With the development of science and technology, the acquisition and application of image information are becoming more and more widespread. In some application scenarios, multiple acquired images need to be aligned in order to obtain the corresponding data information in the multiple images. For example, in 3D vision application scenarios, after a depth image is obtained from a 3D camera module, in order to make better use of the depth data in the depth image, the depth image corresponding to a scene and another two-dimensional image corresponding to the same scene (such as an RGB image, a grayscale image, an IR image, etc.) need to undergo corresponding image processing so that the depth image is aligned with the two-dimensional image, that is, the depth information in the depth image is mapped onto the pixels of the corresponding two-dimensional image, which makes it convenient to use the depth information to assist operations such as 3D recognition. For example, the depth image is aligned with an RGB image obtained from a two-dimensional camera module (for example, an RGB camera module), and the obtained depth information is mapped onto the pixels of the color image (i.e., D2C, Depth to Color), so as to achieve better use of the information.
In the existing technology, when two (or more) images are aligned, all the data in the two images needs to be processed at one time, and the data cannot be calculated and processed in batches. The problem with the existing technology is that a large amount of data needs to be processed simultaneously during image alignment, which is not conducive to improving the processing efficiency and the accuracy of image alignment. Moreover, the huge amount of computation requires high-performance hardware for the processing, which is not conducive to performing image alignment in scenarios where hardware performance is limited, that is, it affects the applicability of image alignment processing.
Specifically, limited by the amount of computation, the image alignment computations in common 3D devices currently need to be performed on an additionally provided host or on the CPU of a mobile terminal, and cannot be performed directly on the image processor corresponding to the 3D device (such as a digital signal processor). If a host is used, the usage scenarios of the 3D device are limited; if the CPU of a mobile terminal is used, the burden on the CPU is increased and other tasks are affected. In addition, the D2C algorithm in the existing technology can usually only perform calculations on data collected by a specific 3D camera module, while different 3D camera modules have individual differences and installation differences, so the existing D2C algorithm cannot adaptively handle data collected by different 3D camera modules.
In order to solve at least one of the problems existing in the prior art, the present invention provides an image processing system including: a depth camera configured to collect a depth image of a target scene; a two-dimensional camera configured to collect a two-dimensional image of the target scene; a memory configured to store the depth image, the two-dimensional image, the calibration information between the depth camera and the two-dimensional camera, and the depth measurement range of the depth camera; and a processor configured to determine, according to the calibration information and the depth measurement range, the row index range corresponding to each row of the depth image in the two-dimensional image, and to extract, according to the row index range, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image for alignment processing, so as to determine the two-dimensional data corresponding to the depth data. Compared with the prior-art solution that needs to process all the data in two images at one time and cannot calculate and process the data in batches, the image processing system of the present invention can obtain, from the calibration information and the depth measurement range, the row index range corresponding to each row of the depth image in the two-dimensional image, group the data to be processed according to the corresponding row index ranges, and align one group of data to be processed at a time. In this way, the processor does not need to process all the data at once, the amount of data computed each time is small, and processing efficiency is improved. Meanwhile, the data can be grouped reasonably according to the memory size of the processor, so that the computing power and storage capacity of the processor are used rationally, which helps the scheme suit processors of different performance levels and improves the applicability of the image alignment process. This in turn facilitates subsequent operations after image alignment (such as face recognition) and improves user experience.
As shown in Figure 1, an embodiment of the present invention provides an image processing system, including:
a depth camera 10 configured to collect a depth image of a target scene;
a two-dimensional camera 20 configured to collect a two-dimensional image of the target scene;
a memory 30 configured to store the depth image, the two-dimensional image, calibration information between the depth camera and the two-dimensional camera, and a depth measurement range of the depth camera;
a processor 40 configured to determine, according to the calibration information and the depth measurement range, a row index range corresponding to each row of the depth image in the two-dimensional image, and to extract, according to the row index range, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image for alignment processing, so as to determine the two-dimensional data corresponding to the depth data.
The depth image and the two-dimensional image are the images that need to be aligned, and the target scene is the scene being photographed. In this embodiment, the alignment of two images is taken as an example. In actual use, multiple images can be aligned based on the above image processing system, that is, there may be more than one two-dimensional image. For example, a first aligned image is obtained after the depth image is aligned with a first two-dimensional image, and a second two-dimensional image is then aligned with the first aligned image to obtain a second aligned image, thereby achieving alignment among the three images.
In this embodiment, aligning the depth image with the two-dimensional image (such as an RGB image) makes it possible to map the depth information in the depth image onto the corresponding pixels of the two-dimensional image, which is beneficial for combining the depth information with the two-dimensional image to assist three-dimensional information recognition (such as 3D face recognition).
It should be noted that, in this embodiment, the depth image and the two-dimensional image are captured by different cameras: the depth image is captured by the depth camera 10, and the two-dimensional image is captured by the two-dimensional camera 20. Specifically, the two-dimensional image is an RGB image to be aligned, which is captured by an RGB camera. Based on the image processing system in this embodiment, image alignment processing is performed on the depth image to be aligned and the RGB image to be aligned, and the depth information is mapped onto the pixels of the color image (the RGB image to be aligned), so that image recognition can further be performed based on the color image to obtain information about the corresponding environment or the photographed object, which is beneficial for better use of the depth information.
Specifically, the calibration information includes the intrinsic parameters of the depth camera 10, the intrinsic parameters of the two-dimensional camera 20, the rotation matrix between the depth camera 10 and the two-dimensional camera 20, and the translation matrix between the depth camera 10 and the two-dimensional camera 20; the depth measurement range includes the maximum detection value and the minimum detection value of the depth camera 10.
In this embodiment, the depth image to be aligned and the RGB image to be aligned can be aligned based on the calibration information. Meanwhile, based on the maximum detection value and the minimum detection value corresponding to the depth camera 10, the row index range in the two-dimensional image that each row of the depth image may correspond to can be determined, which facilitates row-by-row alignment of the depth image and the two-dimensional image, that is, batch processing of the data.
The row index range corresponding to a row in the depth image is the index range of the rows in the two-dimensional image in which the pixels corresponding to the pixels of that row may lie. For example, if the pixels corresponding to the pixels in row 1 of the depth image may be distributed between rows 1 and 3 of the two-dimensional image, then the row index range corresponding to row 1 of the depth image is rows 1 to 3.
It should be noted that the depth measurement range can be determined according to the maximum working distance and the minimum working distance of the depth camera 10.
In this embodiment, determining, according to the calibration information and the depth measurement range, the row index range corresponding to each row of the depth image in the two-dimensional image includes: obtaining the maximum row index and the minimum row index corresponding to each row of the depth image according to the intrinsic parameters of the depth camera 10, the intrinsic parameters of the two-dimensional camera 20, the rotation matrix, the translation matrix, the maximum detection value and the minimum detection value; and obtaining the row index range corresponding to each row in the depth image based on the maximum row index and the minimum row index.
The row index range corresponding to a row to be aligned in the depth image (that is, any row in the depth image) is used to indicate the range of target alignment rows (that is, the rows corresponding to the row to be aligned) in the two-dimensional image; a target alignment pixel in the two-dimensional image belongs to a target alignment row, and the target alignment pixel corresponds to a pixel to be aligned in the row to be aligned.
Specifically, for a row to be aligned, all the pixels in that row are pixels to be aligned, and each pixel to be aligned has one target alignment pixel in the two-dimensional image; all the target alignment pixels corresponding to one row to be aligned lie within the above row index range in the two-dimensional image. For example, row 1 of the depth image has 640 pixels, and these 640 pixels have 640 corresponding target alignment pixels in the two-dimensional image. The row index range covering these 640 target alignment pixels in the two-dimensional image (for example, rows 1 to 3) is taken as the row index range of the row to be aligned; in this case, the maximum row index corresponding to the row to be aligned is 3 and the minimum row index is 1.
Specifically, the optical centers of the depth camera (for example, a 3D camera module) and the two-dimensional (for example, RGB) camera (module) deviate from each other in distance and rotation angle. Therefore, when the depth information is mapped onto the RGB image, the depth information first needs to be converted into the world coordinate system and then from the world coordinate system into the RGB image pixel coordinate system, which realizes D2C.
In this embodiment, the coordinate transformation relationship between the depth image pixel coordinate system and the RGB image pixel coordinate system is constructed as shown in the following formula (1):
Z_rgb · [u_rgb, v_rgb, 1]^T = K_rgb · (R_ir2rgb · depth · K_ir^(-1) · [u_ir, v_ir, 1]^T + T_ir2rgb)    (1)
where u_rgb and v_rgb respectively represent the abscissa and ordinate of the image in the RGB image pixel coordinate system, u_ir and v_ir respectively represent the abscissa and ordinate of the image in the depth image pixel coordinate system, R_ir2rgb represents the rotation matrix from the depth (IR) camera to the RGB camera, T_ir2rgb represents the translation matrix from the depth camera to the RGB camera, K_ir and K_rgb respectively represent the intrinsic parameters of the depth camera and the RGB camera, K_ir^(-1) is the inverse matrix of K_ir, depth represents the depth value at the position with coordinates (u_ir, v_ir) in the depth image, and Z_rgb is the depth value at the (u_rgb, v_rgb) coordinate position in the RGB image corresponding to the (u_ir, v_ir) coordinate point of the depth image.
It should be noted that, in formula (1), the coordinates of a pixel to be aligned in the depth image are (u_ir, v_ir), and the coordinates of the corresponding target alignment pixel in the RGB image are (u_rgb, v_rgb). The D2C algorithm maps the depth values of the depth image onto the corresponding pixels (target alignment pixels) of the RGB image, so that the result becomes RGBD data (that is, data combining RGB color and depth information); the values of the RGB pixels themselves do not need to be considered anywhere in the algorithm. In one application scenario, the RGB camera and the depth camera capture images separately, and the pixel coordinates of the same object differ between the two images, that is, there would be an offset if the two images were simply overlapped. What D2C needs to do is overlap the depth image with the RGB image so that the same object in the two images coincides. The above Z_rgb is the depth of the (u_ir, v_ir) coordinate point mapped onto the (u_rgb, v_rgb) point, that is, the depth value of that pixel in the aligned RGB image; in other words, it is part of the processed data corresponding to the (u_ir, v_ir) coordinate point after alignment (the processed data includes the RGB color value and the depth value of each point in the aligned image).
In this embodiment, each pixel in the depth image is aligned with a pixel in the two-dimensional image and the corresponding depth information is mapped. Therefore, (u_ir, v_ir) is known (determined by the coordinates of the corresponding pixel to be aligned), and the calibration information between the depth image and the two-dimensional image as well as the depth measurement range of the depth image are obtained in advance; hence, in formula (1), only u_rgb, v_rgb and Z_rgb are unknown quantities that need to be calculated. In this embodiment, the pixels of the depth image are aligned row by row, so the column index range corresponding to v_rgb does not need to be considered. In actual use, the pixels of the depth image may instead be aligned column by column, in which case only the column index range corresponding to v_rgb needs to be considered and the row index range corresponding to u_rgb does not; the details can refer to the scheme of this embodiment and are not repeated here.
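To make the projection in formula (1) concrete, the following sketch maps a single depth pixel (u_ir, v_ir) with depth value depth into the RGB pixel coordinate system. It is a minimal illustration rather than the patented implementation; the function name, the use of NumPy, and the assumption that K_ir and K_rgb are 3x3 matrices and T_ir2rgb is a 3-element translation vector are added here for illustration only.

```python
import numpy as np

def project_depth_pixel(u_ir, v_ir, depth, K_ir, K_rgb, R_ir2rgb, T_ir2rgb):
    """Map one depth-image pixel to the RGB pixel coordinate system, following formula (1)."""
    # Back-project the depth pixel to a 3D point in the depth-camera coordinate system:
    # depth * K_ir^(-1) * [u_ir, v_ir, 1]^T
    p_ir = depth * (np.linalg.inv(K_ir) @ np.array([u_ir, v_ir, 1.0]))
    # Transform the point into the RGB-camera coordinate system with the extrinsics.
    p_rgb = R_ir2rgb @ p_ir + np.asarray(T_ir2rgb).reshape(3)
    # Project with the RGB intrinsics; the third homogeneous component is Z_rgb.
    uvw = K_rgb @ p_rgb
    Z_rgb = uvw[2]
    u_rgb, v_rgb = uvw[0] / Z_rgb, uvw[1] / Z_rgb
    return u_rgb, v_rgb, Z_rgb
```

With the calibration information and a depth value fixed, only u_rgb, v_rgb and Z_rgb remain to be computed, matching the discussion above.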
In this embodiment, row-by-row processing is taken as an example. The unknown quantity Z_rgb in formula (1) can be moved to the left side of the equation, so that for a certain row to be aligned in the depth image (row u_ir), the maximum and minimum values of u_rgb are obtained from the maximum and minimum values of depth respectively, and are taken as the maximum row index and the minimum row index of that row to be aligned.
In one application scenario, taking a common depth image as an example, the resolution of a common depth image is 640*480, in which case the value of u_ir ranges from 0 to 479 (counting from row 0) and the value of v_ir ranges from 0 to 639 (counting from column 0). The maximum and minimum values of the depth value depth (that is, the maximum detection value and the minimum detection value for the depth image) can be determined according to the working distance of the depth camera.
Specifically, based on the D2C principle shown in formula (1), both the corresponding row index u_rgb and column index v_rgb in the RGB image are related to depth. Figure 2 is a schematic diagram of the mapping relationship between a depth image and an RGB image provided by an embodiment of the present invention. As shown in Figure 2, when depth is the minimum detection value, the corresponding minimum row index can be obtained from formula (1); conversely, when depth is the maximum detection value, the corresponding maximum row index can be obtained from formula (1). In this way, the row index range in the two-dimensional image corresponding to each row to be aligned in the depth image can be obtained, and the data that needs to be transferred each time can be determined.
In this embodiment, for each row in the depth image, the corresponding row index range can be calculated in advance based on the working distance of the depth camera (that is, the detection distance range of the depth camera, determined by the specific depth camera and the application scene settings) and the camera calibration information. A corresponding index table can also be built from the minimum row index (color_y_min) and the maximum row index (color_y_max) of each row, so that the row index range corresponding to each row in the depth image can be looked up. It should be noted that Figure 2 shows the mapping relationship between the depth image and the RGB image; in actual use, the aligned image can be regarded as overlapping the RGB image, so Figure 2 can also be read as the mapping relationship between the depth image and the aligned image.
In this embodiment, an algorithm (or program, etc.) for calculating the row index ranges can be preset, and the inter-camera rotation matrix R_ir2rgb, the translation matrix T_ir2rgb, the depth camera intrinsic parameters K_ir and the RGB camera intrinsic parameters K_rgb are input into this algorithm; at the same time, the maximum detection value depth_max and the minimum detection value depth_min are obtained based on the working distance of the depth camera and are also input into the algorithm.
It should be noted that, in one application scenario, the maximum detection value and the minimum detection value can also be determined in advance by configuring the depth camera and its application scene; for example, within the corresponding shooting area, the nearest and farthest distances between the photographed object (such as a user whose face needs to be detected) and the depth camera are specified in advance. In another application scenario, a depth value analysis can also be performed on the depth image, and the actual maximum and minimum detection values in the captured depth image can be used for the calculation, so as to improve calculation accuracy, transfer the data more efficiently and improve image alignment efficiency.
Specifically, the maximum row index and the minimum row index corresponding to each row in the depth image are calculated in turn, thereby obtaining the row index range corresponding to each row. For a row to be aligned in the depth image, u_ir is fixed; each value of v_ir (for example, 0 to 639) is traversed, and depth_max and depth_min are substituted into formula (1) to obtain 640 candidate minimum indices and 640 candidate maximum indices for the row to be aligned; the minimum value among all candidate minimum indices is taken as the minimum row index of the row to be aligned, and the maximum value among all candidate maximum indices is taken as its maximum row index. In this way, the maximum and minimum row indices corresponding to each row in the depth image, and hence the row index range of each row, can be obtained. It should be noted that the rows and their corresponding row index ranges can further be organized into a lookup table for convenient storage and retrieval; based on this lookup table, the rows of the two-dimensional image that a row of the depth image may correspond to can be obtained quickly, which facilitates data grouping and transfer.
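One possible way to pre-compute the per-row lookup table described above is sketched below. It reuses the hypothetical project_depth_pixel helper from the earlier sketch; the image dimensions, the clamping to the RGB image height and the list-of-tuples layout of the table are assumptions for illustration, and u_rgb is treated as the row index so as to follow the patent's own notation.

```python
import numpy as np

def build_row_index_table(depth_height, depth_width, rgb_height,
                          depth_min, depth_max,
                          K_ir, K_rgb, R_ir2rgb, T_ir2rgb):
    """For every depth-image row, return (color_y_min, color_y_max) in the RGB image."""
    table = []
    for u_ir in range(depth_height):              # each row to be aligned
        cand_min, cand_max = np.inf, -np.inf
        for v_ir in range(depth_width):           # traverse every pixel of that row
            for d in (depth_min, depth_max):      # the two depth extremes
                u_rgb, _, _ = project_depth_pixel(u_ir, v_ir, d,
                                                  K_ir, K_rgb, R_ir2rgb, T_ir2rgb)
                cand_min = min(cand_min, u_rgb)
                cand_max = max(cand_max, u_rgb)
        # Clamp to the RGB image and record color_y_min / color_y_max for this row.
        table.append((int(max(np.floor(cand_min), 0)),
                      int(min(np.ceil(cand_max), rgb_height - 1))))
    return table
```

For a 640*480 depth image the inner loop yields the 640 candidate minimum and maximum indices per row mentioned above, and the resulting table can be stored once and consulted whenever data is grouped and transferred.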
It should be noted that, to simplify the calculation, the same depth_max and depth_min are used for every row to be aligned in this embodiment. In one embodiment, to reflect the mapping relationship more accurately, the depth measurement range may include a maximum detection value and a minimum detection value for each row of the depth image, so that the maximum row index, the minimum row index and the row index range of each row are calculated from that row's own maximum and minimum detection values.
In this embodiment, the processor 40 is an image processor, specifically an image processor used for image alignment processing, and the image processor may be a digital signal processor. With the system of this embodiment, all the data that needs image alignment processing (including all the data in the depth image and the two-dimensional image) can be processed in groups (batches), so the corresponding image processor is not required to have very high hardware performance; the processing can therefore be performed directly on the digital signal processor in the depth camera (or 3D camera module) without an additional host or mobile terminal, which not only improves the computing efficiency of the D2C algorithm but also reduces the resources consumed by the system, lowers the power consumption of the hardware and improves applicability. It can be seen that, unlike the prior-art solution that uses the CPU of a host or a mobile terminal, the calculation of the image alignment process in this embodiment is performed on a digital signal processor, which reduces the burden on the CPU of the mobile terminal and lowers the power consumption of the whole system.
In this embodiment, extracting, according to the row index range, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image for alignment processing to determine the two-dimensional data corresponding to the depth data includes: simultaneously extracting, according to the memory size of the processor 40 and the row index range, multiple rows of depth data in the depth image and the corresponding multiple rows of two-dimensional data in the two-dimensional image to form multiple groups of data to be processed; and performing alignment processing on each group of data to be processed to determine the two-dimensional data corresponding to each row of depth data.
Specifically, the image alignment process involves a large number of memory access operations. The above D2C calculation is performed on the DSP, but neither the depth image nor the two-dimensional image is stored in the on-chip memory of the DSP; when the data to be accessed is not in the on-chip memory of the DSP, the corresponding data needs to be transferred into the on-chip memory. Moreover, the on-chip memory of the DSP is limited in size and usually cannot hold all the data of the depth image and the two-dimensional image, so the data needs to be transferred and calculated in batches, and it must be ensured that, in each transferred batch, the data of the target alignment rows corresponding to the rows to be aligned is also transferred successfully. In this embodiment, data needs to be transferred from a synchronous dynamic random access memory (SDRAM, Synchronous Dynamic Random Access Memory) into the on-chip memory of the DSP.
In one application scenario, the user has to decide how many rows of data to transfer each time and has to manually input corresponding instructions to move data back and forth between the on-chip memory and the SDRAM, and different instructions are needed to obtain data of different sizes and types; this is inconvenient to operate and does not help improve image processing efficiency. The user also has to manually calculate the amount of data transferred each time, which makes it difficult to fully use the on-chip memory and easily increases the number of data transfers, wasting computation time.
Therefore, in this embodiment, preferably, one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image are extracted according to the row index range to form one group of data to be processed; one group of data to be processed is transferred into the memory of the processor 40, the processor 40 performs alignment processing on this group to determine the two-dimensional data corresponding to the depth data in the group, and then the next group of data to be processed is transferred into the memory for alignment processing, until every group of data to be processed has completed alignment processing. The total data amount of one group of memory-stored data is not greater than the memory size of the processor 40, and one group of memory-stored data includes one group of data to be processed and the processed data obtained by performing alignment processing on this group of data to be processed.
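The group-by-group loop described above might be pictured with the following sketch. It is illustrative only: plain Python slicing stands in for the transfer from SDRAM into on-chip memory, and align_group stands in for the per-group alignment computation, neither of which is spelled out in detail by this embodiment; all names are assumptions.

```python
def align_in_batches(depth_rows, rgb_rows, index_table, align_group):
    """Align one group at a time; here a group is one depth row plus the RGB rows
    in its row index range (a group could equally hold several depth rows)."""
    aligned = []
    for u_ir, depth_row in enumerate(depth_rows):
        color_y_min, color_y_max = index_table[u_ir]
        # "Transfer" the group into on-chip memory; on a DSP this would be a copy
        # from SDRAM, and input plus output must together fit in the on-chip memory.
        group_rgb = rgb_rows[color_y_min:color_y_max + 1]
        aligned.append(align_group(depth_row, group_rgb, color_y_min))
    return aligned
```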
Furthermore, multiple groups of data to be processed may also be transferred into the processor 40 at the same time and aligned together. It should be noted that, limited by the memory size of the processor 40, only a few to a few dozen rows of data can be transferred into the on-chip memory for processing at a time; therefore, the depth image and the two-dimensional image can be divided in advance according to the actual requirement to determine the data to be transferred in each round. Specifically, in this embodiment, the number of rows in the two-dimensional image corresponding to one row of the depth image can be determined from the pre-computed row index ranges, so that the amount of data that needs to be processed for one row of the depth image is known and the data can be grouped (for example, deciding how many rows of the depth image form one group). In this way, based on the pre-computed row index ranges, it can be determined how many rows need to be calculated and how many rows of data can be transferred into the on-chip memory each time, which reduces the amount of computation and the number of data transfers. At the same time, the above image processing system is also applicable to image data collected by different camera modules (for example, with different resolutions), grouping and transferring the different image data appropriately, so that the D2C algorithm adapts to the differences between modules and the hardware performance of the DSP can be fully used without much manual intervention.
In this embodiment, the depth image and the two-dimensional image are divided according to the memory size and the row index range to obtain multiple groups of data to be processed. It should be noted that each group of data to be processed includes at least one row of depth data and the corresponding multiple rows of two-dimensional data, and adjacent rows of data to be processed share part of the two-dimensional data. Sharing part of the two-dimensional data means that the two-dimensional data corresponding to adjacent rows of data to be processed may overlap; the overlapping part can be retained and reused during alignment and does not need to be transferred repeatedly.
For example, suppose a group of data to be processed includes rows 1 and 2 of the depth image, where row 1 of the depth image corresponds to the row index range of rows 1 to 3 of the two-dimensional image and row 2 corresponds to rows 2 to 5; then this group of data to be processed includes all the data of rows 1 to 5 of the two-dimensional image, and during alignment the depth data of row 1 and the depth data of row 2 can share the two-dimensional data of rows 2 to 3. In this way, it is ensured that all the data needed for each round of image alignment processing is transferred into the on-chip memory, while the waste caused by transferring the same data repeatedly is avoided.
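Following this example, one simple way to form a group of several depth rows together with the merged range of RGB rows they need, so that RGB rows shared by adjacent depth rows are transferred only once, could look like the sketch below; the grouping policy, the names and the lookup-table format are assumptions for illustration.

```python
def form_group(start_row, rows_in_group, index_table):
    """Return the depth rows of one group and the merged RGB row range they require."""
    depth_row_ids = list(range(start_row, start_row + rows_in_group))
    color_y_min = min(index_table[r][0] for r in depth_row_ids)  # e.g. row 1 -> RGB rows 1..3
    color_y_max = max(index_table[r][1] for r in depth_row_ids)  # e.g. row 2 -> RGB rows 2..5
    # RGB rows color_y_min..color_y_max are transferred once for the whole group, so rows
    # shared by adjacent depth rows (rows 2 to 3 in the example) are not duplicated.
    return depth_row_ids, (color_y_min, color_y_max)
```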
同时,还需考虑片上内存的存储空间,即一组内存存储数据的数据量之和不大于上述片上内存大小,一组内存存储数据包括一组待处理数据以及与该组待处理数据对应的已处理数据。需要说明的是,在DSP上不仅需要搬运输入的 数据(一组待处理数据),也需要搬运输出的数据(对齐后的图像数据,Aligned_Depth),即该组待处理数据对应的已处理数据,计算时使用到的这两块数据需要同时在片上内存中存储,而片上的内存大小是固定的,因此需要限定一组内存存储数据的数据量之和不大于上述片上内存大小。At the same time, it is also necessary to consider the storage space of the on-chip memory, that is, the sum of the data amount of a set of memory storage data is not greater than the above-mentioned on-chip memory size. A set of memory storage data includes a set of data to be processed and the data corresponding to the set of data to be processed. Data processing. It should be noted that not only the input data (a set of data to be processed) needs to be transported on the DSP, but also the output data (aligned image data, Aligned_Depth), that is, the processed data corresponding to the group of data to be processed, needs to be transported. The two pieces of data used in the calculation need to be stored in the on-chip memory at the same time, and the on-chip memory size is fixed. Therefore, it is necessary to limit the sum of the data amount of a set of memory storage data to not be greater than the above-mentioned on-chip memory size.
在一种应用场景中,每一组上述待处理数据中包括第一待处理数据和第二待处理数据,上述第一待处理数据是上述深度图像中待处理行对应的数据,上述第二待处理数据是上述二维图像中与上述待处理行对应的行索引范围内的所有数据,上述待处理行包括所述深度图像中的至少一行。对应的,可以根据行索引范围提取深度图像中的至少一行深度数据及二维图像中对应的至少一行二维数据组成至少一组待处理数据,具体包括:根据处理器40的内存大小和行索引范围分别对深度图像和二维图像进行划分并提取多组待处理数据,其中,每一组待处理数据包括第一待处理数据和第二待处理数据,第一待处理数据是深度图像中待处理行对应的数据,第二待处理数据是二维图像中与待处理行对应的行索引范围内的所有数据,待处理行包括深度图像中的至少一行。In one application scenario, each group of the above-mentioned data to be processed includes first data to be processed and second data to be processed. The above-mentioned first data to be processed is the data corresponding to the row to be processed in the above-mentioned depth image, and the above-mentioned second data to be processed is The processed data is all data in the row index range corresponding to the row to be processed in the two-dimensional image, and the row to be processed includes at least one row in the depth image. Correspondingly, at least one row of depth data in the depth image and at least one corresponding row of two-dimensional data in the two-dimensional image can be extracted according to the row index range to form at least one set of data to be processed, which specifically includes: according to the memory size and row index of the processor 40 The scope divides the depth image and the two-dimensional image respectively and extracts multiple groups of data to be processed, wherein each group of data to be processed includes first data to be processed and second data to be processed, and the first data to be processed is the data to be processed in the depth image. The data corresponding to the row is processed, and the second data to be processed is all data in the row index range in the two-dimensional image corresponding to the row to be processed, and the row to be processed includes at least one row in the depth image.
It should be noted that the data volume of the corresponding output data can be derived from the data volume of the input data, which bounds the amount of data moved in each transfer. For example, in one application scenario, the data volume of one row of the depth image and the data volume of one row of the RGB image can be preset. Since the output data obtained after alignment contains both RGB image information and depth information, its data type is fixed, and one pixel of the depth image can correspond to one output pixel, so the number of rows and the data volume of the output data can be determined from the input data to be processed. In this embodiment, it is only necessary to ensure that the combined size of the input data and the output data does not exceed the size of the on-chip memory.
Different camera modules have different intrinsic parameters, so the row index range corresponding to each row of the acquired depth image also differs, and therefore the number of rows of data to be processed that can be moved at a time differs between modules. For example, if the data produced by one camera module can be moved 8 rows at a time, 480 rows of data require 60 transfers to complete the computation; if the data produced by another module can be moved 10 rows at a time, the same 480 rows only require 48 transfers. The fewer the transfers, the lower the system overhead. Based on the solution of this embodiment, the performance of the DSP can be fully utilized: the data is grouped reasonably, multiple groups of data to be processed are obtained, and each group is processed in turn, thereby improving the efficiency of image alignment processing while making reasonable use of the DSP's memory and processing capability.
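The relationship between rows per transfer and transfer count can be expressed with a short sketch (illustrative only; the function name is an assumption):

    def transfer_count(total_depth_rows, rows_per_transfer):
        # ceiling division: the last transfer may carry fewer rows
        return -(-total_depth_rows // rows_per_transfer)

    # transfer_count(480, 8)  -> 60 transfers
    # transfer_count(480, 10) -> 48 transfers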
It should be noted that different image resolutions also lead to different numbers of transfers: the more pixels per row, the fewer rows that can be moved at a time; for example, a 1280*960 image may only allow 4 rows of data to be moved at a time. The image processing system of this embodiment can adapt to images of different resolutions captured by different modules, which improves the reusability of the corresponding algorithm code and removes the need for users to modify that code when switching modules or resolutions. This improves the applicability of the image processing procedure and facilitates image alignment processing in different scenarios.
At the same time, the row index range calculated in this embodiment is an approximate range obtained from the maximum detection value and the minimum detection value, rather than an exact range computed from the specific depth value of every pixel of the depth image. Consequently, overlapping row index ranges may (or may not) occur across several consecutive rows to be aligned; that is, the depth data of adjacent rows may share the two-dimensional data within the overlapping row index range. Repeatedly moving the data corresponding to the overlapping row index range would waste time. Therefore, in this embodiment, the system is further configured to:
when moving the i-th group of data to be processed, obtain the overlapping row index range between the i-th group of data to be processed and the (i-1)-th group of data to be processed, retain the overlapping data already in on-chip memory, and move the data to be moved into on-chip memory;
where the overlapping data is the data in the (i-1)-th group of data to be processed that corresponds to the overlapping row index range, and the data to be moved is the data in the i-th group of data to be processed that lies outside the overlapping row index range.
Figure 3 is a schematic diagram of the mapping relationship between a depth image and an RGB image provided by an embodiment of the present invention. In Figure 3, it is assumed that a group of data to be processed includes one row of data of the depth image and multiple rows of data of the RGB image. As shown in Figure 3, the row index ranges corresponding to the (i-1)-th row and the i-th row of the depth image overlap, and the corresponding overlapping data is the data between the dashed lines in the RGB image of Figure 3. When the (i-1)-th group of data to be processed is processed, the overlapping data is moved into on-chip memory, and the overlapping data is needed again when the i-th group is processed. Therefore, based on the solution of this embodiment, the overlapping data can be retained in on-chip memory, or moved within on-chip memory as actual needs dictate (moving data inside on-chip memory is far faster than moving it between SDRAM and on-chip memory). In this way, redundant transfer operations in the data movement process are reduced, which shortens data movement time and increases the running speed of the whole system, thereby improving the efficiency of image alignment processing.
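The reuse of overlapping rows can be sketched as follows (a hedged illustration; the DMA helper and the dictionary-based on-chip buffer are hypothetical stand-ins for the actual DSP memory management):

    # Row ranges are half-open [start, end) in RGB row indices.
    def load_group(i, ranges, onchip_rows, dma_copy_row_from_sdram):
        start, end = ranges[i]
        if i > 0:
            prev_start, prev_end = ranges[i - 1]
            keep_lo, keep_hi = max(start, prev_start), min(end, prev_end)
        else:
            keep_lo = keep_hi = start  # first group: nothing to reuse
        for row in range(start, end):
            if not (keep_lo <= row < keep_hi):
                # only rows outside the overlap are fetched from external memory
                onchip_rows[row] = dma_copy_row_from_sdram(row)
        return [onchip_rows[r] for r in range(start, end)]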
In one application scenario, the above image processing procedure can also be combined with Single Instruction Multiple Data (SIMD) technology to further increase the speed of image data alignment processing.
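As a loose analogue only (the embodiment's actual SIMD code would be written with the DSP vendor's intrinsics, and the symbol names below are assumptions), the per-row arithmetic can be expressed in vectorized form so that one operation acts on many pixels of a row at once:

    import numpy as np

    def back_project_row(depth_row, v, fx, fy, cx, cy):
        # depth_row: 1D array of depth values for depth-image row v
        u = np.arange(depth_row.size)
        x = (u - cx) / fx * depth_row   # whole-row operations map naturally to SIMD lanes
        y = (v - cy) / fy * depth_row
        z = depth_row.astype(np.float64)
        return np.stack([x, y, z], axis=0)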
It should be noted that in this embodiment the image alignment processing is performed in batches on a digital signal processor. In actual use, the batched image alignment processing may also be carried out by a CPU, or by an FPGA, a digital chip, or the like, so that the image processor is used reasonably and fully.
As can be seen from the above, in this embodiment the prior information of the 3D camera module (including the row index range corresponding to each row of the depth image) is obtained, the data is grouped according to the prior information to obtain multiple groups of data to be processed, and each group is moved and processed in turn. Under the condition that the data to be stored at any one time does not exceed the on-chip memory of the digital signal processor, the performance of the digital signal processor is fully utilized and the number of data transfers is reduced, which lowers the power consumption of the hardware system and makes the 3D camera module suitable for various scenarios. At the same time, the above system can adapt to different 3D camera modules: it can take the differences in calibration parameters between modules into account, be configured according to the specific calibration parameters, make full use of the hardware performance of the DSP, and process the data in parallel on the DSP.
In addition, this embodiment also considers the overlapping data among the data to be moved and avoids moving it repeatedly, which further reduces processing time, improves the efficiency of image alignment processing, facilitates the operations that follow image alignment (for example face recognition), and helps improve the user experience.
As shown in Figure 4, corresponding to the above image processing system, an embodiment of the present invention further provides an image processing method. The method is applied to any of the above image processing systems and includes:
Step S100: collecting a depth image of a target scene with a depth camera and storing it in a memory;
Step S200: collecting a two-dimensional image of the target scene with a two-dimensional camera and storing it in the memory;
Step S300: obtaining the calibration information between the depth camera and the two-dimensional camera and the depth measurement range of the depth camera, both pre-stored in the memory;
Step S400: determining, by a processor according to the calibration information and the depth measurement range, the row index range of the two-dimensional image corresponding to each row of the depth image, and extracting, according to the row index range, at least one row of depth data from the depth image and the corresponding at least one row of two-dimensional data from the two-dimensional image for alignment processing to determine the two-dimensional data corresponding to the depth data.
The depth image and the two-dimensional image are the images to be aligned, and the target scene is the scene being photographed. This embodiment is described using the alignment of two images as an example; in actual use, more than one two-dimensional image may be aligned based on the image processing method. For example, a first aligned image is obtained after aligning the depth image with a first two-dimensional image, and a second aligned image is obtained by aligning a second two-dimensional image with the first aligned image, thereby achieving alignment among the three images.
In this embodiment, aligning the depth image with a two-dimensional image (for example an RGB image) maps the depth information of the depth image onto the corresponding pixels of the two-dimensional image, which is beneficial for using the depth information to assist the two-dimensional image in three-dimensional information recognition (for example 3D face recognition).
It should be noted that in this embodiment the depth image and the two-dimensional image are captured by different cameras: the depth image is captured by the depth camera, and the two-dimensional image is captured by the two-dimensional camera. Specifically, the two-dimensional image is an RGB image to be aligned, captured by an RGB camera. Based on the image processing method of this embodiment, the depth image to be aligned and the RGB image to be aligned undergo image alignment processing, and the depth information is mapped onto the pixels of the color image (the RGB image to be aligned). Image recognition can then be further performed on the color image to obtain information about the corresponding environment or the photographed object, which allows the depth information to be used more effectively.
Specifically, the calibration information includes the intrinsic parameters of the depth camera, the intrinsic parameters of the two-dimensional camera, the rotation matrix between the depth camera and the two-dimensional camera, and the translation matrix between the depth camera and the two-dimensional camera; the depth measurement range includes the maximum detection value and the minimum detection value of the depth camera.
In this embodiment, the depth image to be aligned and the RGB image to be aligned can be aligned based on the calibration information. At the same time, the maximum detection value and the minimum detection value determine the row index range of the two-dimensional image that each row of the depth image may correspond to, which makes it convenient to align the depth image and the two-dimensional image row by row, i.e., to process the data in batches.
The row index range corresponding to a row of the depth image is the index range of the rows of the two-dimensional image in which the pixels corresponding to the pixels of that row may be located.
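One way to picture this computation is the following pinhole-projection sketch (a hedged illustration only; the actual bound used by this embodiment needs only the extreme detection values, and the symbol and function names here are assumptions):

    import numpy as np

    def row_index_range(v, depth_width, K_depth, K_rgb, R, T,
                        z_min, z_max, rgb_height):
        # Approximate [min_row, max_row] in the 2D image for depth-image row v.
        K_depth_inv = np.linalg.inv(K_depth)
        candidates = []
        for u in (0, depth_width - 1):      # column extremes of the depth row
            for z in (z_min, z_max):        # minimum and maximum detection values
                p_depth = z * (K_depth_inv @ np.array([u, v, 1.0]))  # back-project
                p_rgb = R @ p_depth + T                              # into the 2D camera frame
                uvw = K_rgb @ p_rgb                                  # project into the 2D image
                candidates.append(uvw[1] / uvw[2])
        lo = int(np.clip(np.floor(min(candidates)), 0, rgb_height - 1))
        hi = int(np.clip(np.ceil(max(candidates)), 0, rgb_height - 1))
        return lo, hi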
In this embodiment, the processor is an image processor, and specifically an image processor used for image alignment processing; the image processor may be a digital signal processor. Based on the method of this embodiment, all the data that require image alignment processing (including all the data of the depth image and the two-dimensional image) can be processed in groups (batches), so the corresponding image processor is not required to have very high hardware performance. The processing can therefore be performed directly on the digital signal processor of the depth camera (or 3D camera module) without an additional host or mobile terminal, which not only improves the computational efficiency of the D2C algorithm but also reduces system resource consumption, lowers hardware power consumption, and improves applicability. It can be seen that, unlike the prior-art scheme that uses the CPU of a host or mobile terminal, this embodiment performs the computation of the image alignment processing on a digital signal processor, which reduces the burden on the CPU of the mobile terminal and lowers the power consumption of the whole system. Based on the above embodiments, the present invention further provides an intelligent terminal, whose functional block diagram may be as shown in Figure 5. The intelligent terminal includes a processor and a memory. The memory of the intelligent terminal stores an image processing program and provides the environment in which the image processing program runs. When the image processing program is executed by the processor, the steps of any of the above image processing methods are implemented. It should be noted that the intelligent terminal may also include other functional modules or units, which are not specifically limited here.
Those skilled in the art can understand that the block diagram shown in Figure 5 is only a block diagram of the partial structure related to the solution of the present invention and does not limit the intelligent terminal to which the solution of the present invention is applied. A specific intelligent terminal may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, when the image processing program is executed by the processor, the following operations are performed:
acquiring a depth image and a two-dimensional image of a target scene with a depth camera and a two-dimensional camera, respectively;
obtaining the calibration information between the depth camera and the two-dimensional camera and the depth measurement range measured by the depth camera, the calibration information being the pose calibration information between the depth camera and the two-dimensional camera;
obtaining, according to the calibration information and the depth measurement range, the row index range of the two-dimensional image corresponding to each row of the depth image;
extracting, according to the row index range, at least one row of depth data from the depth image and the corresponding at least one row of two-dimensional data from the two-dimensional image to form at least one group of data to be processed, and performing alignment processing on the data to be processed to determine the two-dimensional data corresponding to the depth data.
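A minimal top-level sketch of the steps above (illustrative only; every helper name here is an assumption, and the per-group alignment kernel is passed in as a placeholder) could look like this:

    def align_depth_to_2d(depth_img, rgb_img, groups, process_group, dma_in, dma_out):
        # groups: list of (depth_row_slice, rgb_row_slice) built from the row index ranges
        for depth_rows, rgb_rows in groups:
            depth_block = dma_in(depth_img[depth_rows])      # input rows of the depth image
            rgb_block = dma_in(rgb_img[rgb_rows])            # 2D-image rows in the index range
            aligned = process_group(depth_block, rgb_block)  # per-group D2C alignment
            dma_out(aligned)                                 # write aligned rows back to SDRAM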
An embodiment of the present invention further provides a computer-readable storage medium on which an image processing program is stored. When the image processing program is executed by a processor, the steps of any image processing method provided by the embodiments of the present invention are implemented.
It should be understood that the numbering of the steps in the above embodiments does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
Those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional units and modules is used as an example. In practical applications, the above functions may be assigned to different functional units and modules as needed, i.e., the internal structure of the above apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing them from one another and are not intended to limit the protection scope of the present invention. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the above embodiments, each embodiment is described with its own emphasis. For parts that are not detailed or described in one embodiment, reference may be made to the relevant descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the apparatus/terminal device embodiments described above are merely illustrative; the division of the above modules or units is only a logical functional division, and other divisions may be used in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented.
If the above integrated modules/units are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the present invention may implement all or part of the processes of the methods of the above embodiments by instructing the relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, the steps of each of the above method embodiments can be implemented. The computer program includes computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable storage medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction.
The above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of the technical features may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be included within the protection scope of the present invention.

Claims (10)

  1. An image processing system, comprising:
    a depth camera configured to collect a depth image of a target scene;
    a two-dimensional camera configured to collect a two-dimensional image of the target scene;
    a memory configured to store the depth image, the two-dimensional image, calibration information between the depth camera and the two-dimensional camera, and a depth measurement range of the depth camera; and
    a processor configured to determine, according to the calibration information and the depth measurement range, a row index range of the two-dimensional image corresponding to each row of the depth image, and to extract, according to the row index range, at least one row of depth data from the depth image and corresponding at least one row of two-dimensional data from the two-dimensional image for alignment processing to determine the two-dimensional data corresponding to the depth data.
  2. The image processing system according to claim 1, wherein extracting, according to the row index range, at least one row of depth data from the depth image and corresponding at least one row of two-dimensional data from the two-dimensional image for alignment processing to determine the two-dimensional data corresponding to the depth data comprises:
    dividing the depth image and the two-dimensional image according to a memory size of the processor and the row index range and extracting multiple groups of data to be processed, wherein each group of the data to be processed comprises at least one row of depth data and corresponding multiple rows of two-dimensional data, and adjacent rows of depth data share part of the two-dimensional data.
  3. The image processing system according to claim 2, wherein performing alignment processing on the data to be processed to determine the two-dimensional data corresponding to the depth data comprises:
    moving one group of the data to be processed into the memory of the processor, performing alignment processing on the group of data to be processed by the processor to determine the two-dimensional data corresponding to the depth data in the group, and then moving the next group of the data to be processed into the memory for alignment processing, until every group of the data to be processed has completed alignment processing;
    wherein a total data volume of a group of memory-stored data is not greater than the memory size of the processor, and a group of the memory-stored data comprises a group of the data to be processed and the processed data obtained by performing alignment processing on that group of data to be processed.
  4. The image processing system according to claim 3, wherein the image processing system is further configured to:
    when moving an i-th group of the data to be processed, obtain an overlapping row index range between the i-th group of the data to be processed and an (i-1)-th group of the data to be processed, retain the overlapping data in the memory, and move the data to be moved into the memory;
    wherein the overlapping data is data in the (i-1)-th group of the data to be processed that corresponds to the overlapping row index range, and the data to be moved is data in the i-th group of the data to be processed that lies outside the overlapping row index range.
  5. The image processing system according to claim 1, wherein the calibration information stored in the memory comprises: intrinsic parameters of the depth camera, intrinsic parameters of the two-dimensional camera, a rotation matrix between the depth camera and the two-dimensional camera, and a translation matrix between the depth camera and the two-dimensional camera.
  6. The image processing system according to claim 5, wherein the depth measurement range stored in the memory comprises: a maximum detection value and a minimum detection value of the depth camera.
  7. The image processing system according to claim 6, wherein determining, according to the calibration information and the depth measurement range, the row index range of the two-dimensional image corresponding to each row of the depth image comprises:
    obtaining a maximum row index and a minimum row index corresponding to each row of the depth image according to the intrinsic parameters of the depth camera, the intrinsic parameters of the two-dimensional camera, the rotation matrix, the translation matrix, the maximum detection value, and the minimum detection value;
    obtaining the row index range corresponding to each row of the depth image according to the maximum row index and the minimum row index;
    wherein the row index range corresponding to one row to be aligned of the depth image is used to indicate a range of target alignment rows of the two-dimensional image, a target alignment pixel of the two-dimensional image belongs to the target alignment rows, and the target alignment pixel corresponds to any one pixel to be aligned of the row to be aligned.
  8. An image processing method, applied to the image processing system according to any one of claims 1 to 7, the method comprising:
    collecting a depth image of a target scene with a depth camera and storing it in a memory;
    collecting a two-dimensional image of the target scene with a two-dimensional camera and storing it in the memory;
    obtaining calibration information between the depth camera and the two-dimensional camera and a depth measurement range of the depth camera, both pre-stored in the memory; and
    determining, by a processor according to the calibration information and the depth measurement range, a row index range of the two-dimensional image corresponding to each row of the depth image, and extracting, according to the row index range, at least one row of depth data from the depth image and corresponding at least one row of two-dimensional data from the two-dimensional image for alignment processing to determine the two-dimensional data corresponding to the depth data.
  9. An intelligent terminal, comprising a memory, a processor, and an image processing program stored in the memory and executable on the processor, wherein the image processing program, when executed by the processor, implements the steps of the image processing method according to claim 8.
  10. A computer-readable storage medium, wherein an image processing program is stored on the computer-readable storage medium, and when the image processing program is executed by a processor, the steps of the image processing method according to claim 8 are implemented.
PCT/CN2022/100631 2022-04-01 2022-06-23 Image processing system and method, intelligent terminal, and computer readable storage medium WO2023184740A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210338866.8A CN114862658B (en) 2022-04-01 2022-04-01 Image processing system, method, intelligent terminal and computer readable storage medium
CN202210338866.8 2022-04-01

Publications (1)

Publication Number Publication Date
WO2023184740A1 true WO2023184740A1 (en) 2023-10-05

Family

ID=82628758

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/100631 WO2023184740A1 (en) 2022-04-01 2022-06-23 Image processing system and method, intelligent terminal, and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN114862658B (en)
WO (1) WO2023184740A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103796001A (en) * 2014-01-10 2014-05-14 深圳奥比中光科技有限公司 Method and device for synchronously acquiring depth information and color information
CN104134188A (en) * 2014-07-29 2014-11-05 湖南大学 Three-dimensional visual information acquisition method based on two-dimensional and three-dimensional video camera fusion
US20170142392A1 (en) * 2015-11-13 2017-05-18 Craig Peterson 3d system including additional 2d to 3d conversion
CN109166077A (en) * 2018-08-17 2019-01-08 广州视源电子科技股份有限公司 Image alignment method, apparatus, readable storage medium storing program for executing and computer equipment
CN109741405A (en) * 2019-01-21 2019-05-10 同济大学 A kind of depth information acquisition system based on dual structure light RGB-D camera
CN112365530A (en) * 2020-11-04 2021-02-12 Oppo广东移动通信有限公司 Augmented reality processing method and device, storage medium and electronic equipment
CN112634152A (en) * 2020-12-16 2021-04-09 中科海微(北京)科技有限公司 Face sample data enhancement method and system based on image depth information
CN113781536A (en) * 2021-09-06 2021-12-10 广州极飞科技股份有限公司 Image alignment method and apparatus, electronic device, and computer-readable storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110738703B (en) * 2019-09-27 2022-08-26 Oppo广东移动通信有限公司 Positioning method and device, terminal and storage medium
CN112116703A (en) * 2020-09-08 2020-12-22 苏州小优智能科技有限公司 3D camera and infrared light scanning algorithm for aligning point cloud and color texture

Also Published As

Publication number Publication date
CN114862658B (en) 2023-05-05
CN114862658A (en) 2022-08-05

Similar Documents

Publication Publication Date Title
US10726580B2 (en) Method and device for calibration
WO2021057742A1 (en) Positioning method and apparatus, device, and storage medium
CN101908231B (en) Reconstruction method and system for processing three-dimensional point cloud containing main plane scene
CN111673735A (en) Mechanical arm control method and device based on monocular vision positioning
WO2019184657A1 (en) Image recognition method, apparatus, electronic device and storage medium
CN108010123B (en) Three-dimensional point cloud obtaining method capable of retaining topology information
WO2022012085A1 (en) Face image processing method and apparatus, storage medium, and electronic device
WO2020237492A1 (en) Three-dimensional reconstruction method, device, apparatus, and storage medium
WO2021143127A1 (en) Parallax correction method and device, and storage medium
WO2021136386A1 (en) Data processing method, terminal, and server
WO2022183638A1 (en) Image feature matching method and related apparatus, device, and storage medium
WO2019127923A1 (en) Integrated circuit, image feature extraction method, and terminal
WO2020216249A1 (en) Target tracking and processing method based on multiple processors
EP3026629A1 (en) Method and apparatus for estimating depth of focused plenoptic data
WO2021007859A1 (en) Method and apparatus for estimating pose of human body
CN107633497A (en) A kind of image depth rendering intent, system and terminal
WO2021142843A1 (en) Image scanning method and device, apparatus, and storage medium
CN111739071A (en) Rapid iterative registration method, medium, terminal and device based on initial value
TW202242716A (en) Methods, apparatuses, devices and storage media for object matching
WO2023184740A1 (en) Image processing system and method, intelligent terminal, and computer readable storage medium
CN110136205A (en) The disparity adjustment method, apparatus and system of more mesh cameras
WO2023142732A1 (en) Image processing method and apparatus, and electronic device
CN116881886A (en) Identity recognition method, identity recognition device, computer equipment and storage medium
CN113538538B (en) Binocular image alignment method, electronic device, and computer-readable storage medium
CN212515897U (en) Active and passive three-dimensional imaging real-time processing system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22934563

Country of ref document: EP

Kind code of ref document: A1