WO2021077427A1

WO2021077427A1 - Image processing method and device, and movable platform

Info

Publication number: WO2021077427A1
Application number: PCT/CN2019/113433
Authority: WO
Inventors: 徐功林; 仇晓颖; 韩彬
Original assignee: 深圳市大疆创新科技有限公司
Priority date: 2019-10-25
Filing date: 2019-10-25
Publication date: 2021-04-29
Also published as: CN112204606A

Abstract

An image processing method and device, and a movable platform. Padding pixels are provided at the edge of an image to be processed. The method comprises: sequentially inputting the pixel values of the pixels in an n-th row of the image to be processed into a first register group, n being any integer greater than or equal to 1, and the pixel values comprising image pixel values and padding pixel values (S201); and when a padding pixel value adjacent to the last image pixel value in the n-th row is input into the first register group, inputting the first image pixel value in the (n+1)-th row of the image into a second register group (S202). The time during which the pixel values of the (n+1)-th row are input into the second register group therefore overlaps part of the time during which the pixel values of the n-th row are input into the first register group, which improves image processing efficiency.

Description

Image processing method, equipment and movable platform

Technical field

The embodiments of the present application relate to the field of image processing technology, and in particular, to an image processing method, device, and movable platform.

Background

With the rapid development of the artificial intelligence industry, processors based on convolutional neural networks have come into wide use. Taking drones as an example, a drone collects images during flight, and a processor based on a convolutional neural network can recognize and process those images to identify targets in them and so ensure the flight safety of the drone. A convolutional neural network generally includes a convolution layer, an activation layer, a normalization layer, a down-sampling layer, and a fully connected layer. The pooling operation is located in the down-sampling layer, which reduces the size of the feature map. According to its function, pooling can be divided into maximum pooling and average pooling: maximum pooling finds the maximum value in the pooling window, and average pooling finds the average value in the pooling window. To preserve the information at the edge of the image as far as possible, some pixels are usually padded around the image. However, in current pooling operations, the image pixel values and the padding pixel values must be input into the same register in turn, and the pixel values of the next row can be input only after all the pixel values of the current row have been input into this register, which results in low pooling efficiency.
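As an illustration of the two pooling operations on a padded image, the following is a minimal Python sketch. The helper names `pad`, `pool`, and `mean`, the zero padding value, and the 2x2 sample image are assumptions made for this example and do not come from the application.

```python
def pad(image, pad_rows, pad_cols, fill=0):
    """Surround a 2-D list with `fill` values (hypothetical helper)."""
    width = len(image[0]) + 2 * pad_cols
    body = [[fill] * pad_cols + list(row) + [fill] * pad_cols for row in image]
    top = [[fill] * width for _ in range(pad_rows)]
    bottom = [[fill] * width for _ in range(pad_rows)]
    return top + body + bottom


def pool(image, w, h, stride, reduce_fn):
    """Slide a w-by-h window over the image and reduce each window to one value."""
    out = []
    for top in range(0, len(image) - h + 1, stride):
        out_row = []
        for left in range(0, len(image[0]) - w + 1, stride):
            window = [image[r][c]
                      for r in range(top, top + h)
                      for c in range(left, left + w)]
            out_row.append(reduce_fn(window))
        out.append(out_row)
    return out


def mean(values):
    return sum(values) / len(values)


image = [[1, 2], [3, 4]]
padded = pad(image, 1, 1)                  # 4x4 image with a border of zeros
max_pooled = pool(padded, 2, 2, 2, max)    # maximum pooling
avg_pooled = pool(padded, 2, 2, 2, mean)   # average pooling
print(max_pooled, avg_pooled)
```

Each 2x2 window here contains exactly one image pixel and three padding pixels, so the padding visibly lowers the averages while leaving the maxima unchanged.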

Summary of the invention

The embodiments of the present application provide an image processing method, equipment, and a movable platform to save processing time and improve processing efficiency.

In a first aspect, an embodiment of the present application provides an image processing method, where the edges of the image to be processed are provided with padding pixels, and the method includes:

Sequentially input the pixel values of the pixels in the nth row in the image to be processed into the first register group, where n is any integer greater than or equal to 1, and the pixel values include image pixel values and padding pixel values;

When the adjacent filling pixel value of the last image pixel value in the nth row is input into the first register group, the first image pixel value in the n+1th row in the image is input into the second register group.

In a second aspect, an embodiment of the present application provides an image processing device, where the edges of the image to be processed are provided with padding pixels, and the image processing device includes: a first register group, a second register group, and a processor;

The processor is configured to sequentially input the pixel values of the pixels in the nth row of the image to be processed into the first register group, where n is any integer greater than or equal to 1, and the pixel values include image pixel values and padding pixel values; and, when the padding pixel value adjacent to the last image pixel value in the nth row is input into the first register group, to input the first image pixel value in the n+1th row of the image into the second register group.

In a third aspect, an embodiment of the present application provides a movable platform, including: a movable platform body and the image processing device according to the embodiment of the present application in the second aspect, wherein the image processing device is installed on the movable platform body.

In a fourth aspect, an embodiment of the present application provides a readable storage medium with a computer program stored on the readable storage medium; when the computer program is executed, it implements the image processing method described in the embodiment of the present application in the first aspect.

In a fifth aspect, an embodiment of the present application provides a program product. The program product includes a computer program stored in a readable storage medium; at least one processor of a movable platform can read the computer program from the readable storage medium, and the at least one processor executes the computer program to cause the movable platform to implement the image processing method described in the embodiment of the present application in the first aspect.

In the image processing method, device, and movable platform provided by the embodiments of the present application, the pixel values of the pixels in the nth row of the image to be processed are sequentially input into the first register group, and when the padding pixel value adjacent to the last image pixel value in the nth row is input into the first register group, the first image pixel value in the n+1th row of the image is input into the second register group. Because a first register group and a second register group are both provided, the pixel values of pixels in adjacent rows can be input into different register groups, and input of the first image pixel value of the n+1th row into the second register group can begin before the pixel values of the nth row have all been input into the first register group. The time during which the pixel values of the n+1th row are input into the second register group therefore overlaps part of the time during which the pixel values of the nth row are input into the first register group, which improves image processing efficiency.

Description of the drawings

In order to describe the technical solutions in the embodiments of the present application or the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative work.

Fig. 1 is a schematic architecture diagram of an unmanned aerial system according to an embodiment of the present application;

FIG. 2 is a flowchart of an image processing method provided by an embodiment of the application;

FIG. 3 is a schematic diagram of each pixel in an image to be processed according to an embodiment of the application;

FIG. 4 is a schematic diagram of the first register set or the second register set provided by an embodiment of the application;

FIG. 5 is a schematic diagram of a cache provided by an embodiment of this application;

Fig. 6 is a schematic structural diagram of an image processing device provided by an embodiment of the application;

FIG. 7 is a schematic structural diagram of a movable platform provided by an embodiment of this application;

FIG. 8 is a schematic structural diagram of a movable platform provided by another embodiment of the application.

Detailed description

In order to make the purpose, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present application. Based on the embodiments in this application, all other embodiments obtained by a person of ordinary skill in the art without creative work shall fall within the protection scope of this application.

The embodiments of the present application provide an image processing method, equipment, and a movable platform, where the movable platform may be a handheld phone, a handheld PTZ, a drone, an unmanned vehicle, an unmanned boat, a robot, an autonomous vehicle, or the like. The following description of the movable platform of this application uses drones as an example. It will be obvious to those skilled in the art that other types of drones can be used without restriction, and the embodiments of the present application can be applied to various types of drones. For example, the drone can be a small or large drone. In some embodiments, the drone may be a rotorcraft, for example, a multi-rotor drone that is propelled through the air by multiple propulsion devices. The embodiments of the present application are not limited to this, and the drone may also be another type of drone.

Fig. 1 is a schematic architecture diagram of an unmanned aerial system according to an embodiment of the present application. In this embodiment, a rotary wing drone is taken as an example for description.

The unmanned aerial system 100 may include a drone 110, a display device 130, and a remote control device 140. Among them, the UAV 110 may include a power system 150, a flight control system 160, a frame, and a pan/tilt 120 carried on the frame. The drone 110 can wirelessly communicate with the remote control device 140 and the display device 130.

The frame may include a fuselage and a tripod (also called a landing gear). The fuselage may include a center frame and one or more arms connected to the center frame, and the one or more arms extend radially from the center frame. The tripod is connected with the fuselage, and is used for supporting the UAV 110 when it is landed.

The power system 150 may include one or more electronic governors (referred to as ESCs) 151, one or more propellers 153, and one or more motors 152 corresponding to the one or more propellers 153, wherein the motor 152 is connected between the electronic governor 151 and the propeller 153, and the motor 152 and the propeller 153 are arranged on the arm of the UAV 110. The electronic governor 151 is used to receive the driving signal generated by the flight control system 160 and to supply a driving current to the motor 152 according to the driving signal, so as to control the speed of the motor 152. The motor 152 is used to drive the propeller 153 to rotate, thereby providing power for the flight of the drone 110, and the power enables the drone 110 to achieve one or more degrees of freedom of movement. In some embodiments, the drone 110 may rotate about one or more rotation axes. For example, the aforementioned rotation axes may include a roll axis (Roll), a yaw axis (Yaw), and a pitch axis (Pitch). It should be understood that the motor 152 may be a DC motor or an AC motor. In addition, the motor 152 may be a brushless motor or a brushed motor.

The flight control system 160 may include a flight controller 161 and a sensing system 162. The sensing system 162 is used to measure the attitude information of the drone, that is, the position information and state information of the drone 110 in space, such as three-dimensional position, three-dimensional angle, three-dimensional velocity, three-dimensional acceleration, and three-dimensional angular velocity. The sensing system 162 may include, for example, at least one of sensors such as a gyroscope, an ultrasonic sensor, an electronic compass, an inertial measurement unit (IMU), a vision sensor, a global navigation satellite system, and a barometer. For example, the global navigation satellite system may be the Global Positioning System (GPS). The flight controller 161 is used to control the flight of the drone 110; for example, it can control the flight of the drone 110 according to the attitude information measured by the sensing system 162. It should be understood that the flight controller 161 can control the drone 110 according to pre-programmed instructions, and can also control the drone 110 by responding to one or more remote control signals from the remote control device 140.

The pan/tilt 120 may include a motor 122. The pan/tilt 120 is used to carry the photographing device 123. The flight controller 161 can control the movement of the pan/tilt 120 through the motor 122. Optionally, as another embodiment, the pan/tilt 120 may further include a controller for controlling the movement of the pan/tilt 120 by controlling the motor 122. It should be understood that the pan/tilt 120 may be independent of the drone 110 or a part of the drone 110. It should be understood that the motor 122 may be a DC motor or an AC motor. In addition, the motor 122 may be a brushless motor or a brushed motor. It should also be understood that the pan/tilt 120 can be located on the top of the drone or on the bottom of the drone.

The photographing device 123 may be, for example, a device for capturing images, such as a camera or a video camera, and the photographing device 123 may communicate with the flight controller and take pictures under the control of the flight controller. The photographing device 123 of this embodiment includes at least a photosensitive element, and the photosensitive element is, for example, a Complementary Metal Oxide Semiconductor (CMOS) sensor or a Charge-coupled Device (CCD) sensor. It can be understood that the photographing device 123 can also be fixed directly to the drone 110, in which case the pan/tilt 120 can be omitted.

The display device 130 is located on the ground end of the unmanned aerial system 100, can communicate with the drone 110 in a wireless manner, and can be used to display the attitude information of the drone 110. In addition, the image photographed by the photographing device 123 may also be displayed on the display device 130. It should be understood that the display device 130 may be an independent device or integrated in the remote control device 140.

The remote control device 140 is located on the ground end of the unmanned aerial system 100, and can communicate with the drone 110 in a wireless manner for remote control of the drone 110.

It should be understood that the aforementioned naming of the components of the unmanned aerial system is only for identification purposes, and should not be understood as a limitation to the embodiments of the present application.

Among them, the image captured by the above-mentioned photographing device 123 may be processed using the solutions of the embodiments of the present application.

FIG. 2 is a flowchart of an image processing method provided by an embodiment of this application. As shown in FIG. 2, the method of this embodiment may include:

S201. Input the pixel values of the pixels in the nth row in the image to be processed into the first register group in sequence, where n is any integer greater than or equal to 1, and the pixel values include image pixel values and padding pixel values.

S202. When the padding pixel value adjacent to the last image pixel value in the nth row is input into the first register group, input the first image pixel value of the n+1th row in the image into the second register group.

In this embodiment, padding pixels are provided at the edge of the image to be processed. As shown in Figure 3, the pixels of the image to be processed include image pixels and padding pixels. The pixels represented by numbers in Figure 3 are image pixels; the padding pixels are located around the image pixels and are represented by blanks in Figure 3. The pixel value of an image pixel is called an image pixel value, and the pixel value of a padding pixel is called a padding pixel value. For example, padding pixels fill the first row and the last row of the image to be processed, as well as the first 1-2 columns and the last 1-2 columns. The number of rows of padding pixels filled in the row direction and the number of columns of padding pixels filled in the column direction are not limited to this and can be determined according to the actual application scenario. The numbers 1 to 8 mark the image pixels in the first row of image pixels: for example, the number 1 represents image pixel 1 and the number 2 represents image pixel 2. The other rows are similar.

In this embodiment, the pixel values of the pixels in the nth row of the image to be processed are sequentially input into the first register group, where n is any integer greater than or equal to 1. When the padding pixel value adjacent to the last image pixel value in the nth row is input into the first register group, the first image pixel value in the n+1th row of the image is input into the second register group. The pixel values of the pixels in the n+1th row are then sequentially input into the second register group. When the padding pixel value adjacent to the last image pixel value in the n+1th row is input into the second register group, the first image pixel value in the n+2th row is input into the first register group, and so on for the subsequent rows, which are not described again. The roles of the two register groups are therefore interchangeable: when the pixel values of the n+1th row are being input, the second register group serves as the first register group, and when the pixel values of the n+2th row are being input, the first register group serves as the second register group. In other implementations, the first image pixel value in the n+1th row is input into the second register group when the pixel value of any padding pixel after the last image pixel in the nth row is input into the first register group.

In the related art, the pixel values of the pixels in the n+1th row are input into the register group only after all the pixel values of the pixels in the nth row have been input into it. In the embodiment of this application, there are two register groups (the first register group and the second register group), and the pixel values of the pixels in two adjacent rows are input into different register groups. Therefore, in this embodiment, the pixel values of the pixels in the n+1th row start to be input into the second register group before all the pixel values of the pixels in the nth row have been input into the first register group. The time during which the pixel values of the n+1th row are input into the second register group thus overlaps part of the time during which the pixel values of the nth row are input into the first register group, which improves image processing efficiency.

Take n equal to 2, that is, the second row in Figure 3, as an example. The pixel value of image pixel 1 is input into the first register group, then the pixel value of image pixel 2, and so on up to the pixel value of the last image pixel of the row; then the pixel value of the penultimate padding pixel is input into the first register group, and finally the pixel value of the last padding pixel. In one case, when the pixel value of the penultimate padding pixel is input into the first register group, the pixel value of image pixel 9 in the third row is input into the second register group, and when the pixel value of the last padding pixel is input into the first register group, the pixel value of image pixel 10 in the third row is input into the second register group; for two adjacent rows, this saves the time of inputting two pixel values. Alternatively, when the pixel value of the last padding pixel is input into the first register group, the pixel value of image pixel 9 in the third row is input into the second register group; for two adjacent rows, this saves the time of inputting one pixel value.
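The timing of this example can be sketched in Python as follows. This is a simplified model, not the hardware of the application: it assumes one pixel value enters a register group per clock cycle, that each row ends with two padding values (written `"P"`), and that the third row holds image pixels 9 to 16. The function name `schedule` and the parameter `overlap` are hypothetical.

```python
def schedule(rows, overlap):
    """Map each clock cycle to the (register_group, value) pairs input in that cycle.

    Row i is streamed into group i % 2 (0 = first register group,
    1 = second register group). Each row starts `overlap` cycles before
    the previous row finishes; overlap=0 models the related art.
    """
    timeline = {}
    start = 0
    for i, row in enumerate(rows):
        group = i % 2
        for offset, value in enumerate(row):
            timeline.setdefault(start + offset, []).append((group, value))
        start += len(row) - overlap
    return timeline


PAD = "P"
row2 = [1, 2, 3, 4, 5, 6, 7, 8, PAD, PAD]          # second row of the padded image
row3 = [9, 10, 11, 12, 13, 14, 15, 16, PAD, PAD]   # third row (assumed contents)

serial = schedule([row2, row3], overlap=0)
overlapped = schedule([row2, row3], overlap=2)

print(len(serial), len(overlapped))     # total cycles used in each case
print(overlapped[8], overlapped[9])     # cycles in which both groups receive a value
```

With an overlap of two cycles the two rows take 18 cycles instead of 20, and in cycles 8 and 9 the first register group receives a padding value while the second register group receives image pixels 9 and 10.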

In this embodiment, the pixel values of the pixels in the nth row of the image to be processed are sequentially input into the first register group, and when the padding pixel value adjacent to the last image pixel value in the nth row is input into the first register group, the first image pixel value in the n+1th row of the image is input into the second register group. Because a first register group and a second register group are both provided, the pixel values of pixels in adjacent rows can be input into different register groups, and input of the first image pixel value of the n+1th row into the second register group can begin before the pixel values of the nth row have all been input into the first register group. The time during which the pixel values of the n+1th row are input into the second register group therefore overlaps part of the time during which the pixel values of the nth row are input into the first register group, which improves image processing efficiency.

In some embodiments, the pooling window is w pixels wide and h pixels high, and it slides along the row direction by a preset row step and along the column direction by a preset column step. The pooling window covers the pixels in h consecutive rows and w consecutive columns of the image to be processed, w*h pixels in total. Each time the pooling window slides, it covers the pixels at the corresponding position after the slide. When w pixel values of the nth row that belong to the same pooling window have been input into the first register group, that is, when the first register group holds w pixel values that belong to the same row of the image to be processed and to the same pooling window, this embodiment obtains the calculated pixel value of those w pixel values: the w pixel values registered in the first register group are read, the corresponding operation is performed on them, and the result is called the calculated pixel value of the w pixel values. In this way, the calculated pixel values of the w pixel values belonging to each of the different pooling windows in the nth row can be obtained.

When w pixel values of the n+1th row that belong to the same pooling window have been input into the second register group, that is, when the second register group holds w pixel values that belong to the same row of the image to be processed and to the same pooling window, this embodiment likewise obtains the calculated pixel value of those w pixel values: the w pixel values registered in the second register group are read, the corresponding operation is performed on them, and the result is called the calculated pixel value of the w pixel values. In this way, the calculated pixel value of the w pixel values belonging to the same pooling window in the n+1th row can be obtained.

By similar processing, the calculated pixel value of the w pixel values belonging to the same pooling window in the n+h-1th row can be obtained.

Once the calculated pixel value of the w pixel values of every row in the same pooling window has been obtained, the calculated pixel value of the w*h pixel values in the pooling window is determined from the per-row calculated pixel values. For example, once the calculated pixel values of the w pixel values of the nth row, the n+1th row, ..., and the n+h-1th row in the same pooling window have been obtained, the corresponding operation is performed on these h calculated pixel values, and the result is called the calculated pixel value of the w*h pixel values in the pooling window.

Each time the pooling window slides, a pooling result is output; the pooling result is the calculated pixel value of the w*h pixel values obtained by the corresponding operation.
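The two-stage calculation described above (first one result per row, then one result per window) can be sketched as follows. The function name `pool_two_stage` and the sample 3x3 window are assumptions for illustration. For average pooling the sketch accumulates sums in both stages and divides by w*h at the end, which is one possible choice and is not stated in the application.

```python
def pool_two_stage(window_rows, op):
    """Reduce each row of the window with `op`, then reduce the row results with `op`."""
    row_results = [op(row) for row in window_rows]   # one calculated value per row
    return op(row_results), row_results


window = [
    [1, 5, 2],   # row n
    [7, 3, 4],   # row n+1
    [6, 9, 8],   # row n+2
]
w, h = len(window[0]), len(window)

max_value, row_maxima = pool_two_stage(window, max)
total, row_sums = pool_two_stage(window, sum)
average = total / (w * h)

# The two-stage result matches a direct reduction over all w*h values.
flat = [value for row in window for value in row]
print(max_value == max(flat), average == sum(flat) / len(flat))
```

Because the maximum of the row maxima equals the maximum of all values, and the sum of the row sums equals the sum of all values, each row can be reduced as soon as its w values are available without waiting for the other rows.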

In some embodiments, the first register group or the second register group includes at least w registers, and each register is used to register a single pixel value.

In this embodiment, the first register group includes at least w registers, and the number of pixel values that the first register group can register equals the number of registers it includes, so the first register group can register at least w pixel values. Likewise, the second register group includes at least w registers, and the number of pixel values that it can register equals the number of registers it includes, so the second register group can register at least w pixel values.

Take the case where the first register group or the second register group includes w registers and w is 3. As shown in Fig. 4, the registers from left to right are register 1, register 2, and register 3. When the pixel values of the pixels of the image to be processed are input into the first register group or the second register group, they are input into the register group in consecutive clock cycles.

Take the pixel values of the pixels in the second row in Figure 3 as an example. Suppose register 1 holds the pixel value of image pixel 3, register 2 holds the pixel value of image pixel 2, and register 3 holds the pixel value of image pixel 1. When the pixel value of image pixel 4 needs to be input, then in the next clock cycle the pixel value of image pixel 4 is input into register 1, the pixel value of image pixel 3 held in register 1 is input into register 2, and the pixel value of image pixel 2 held in register 2 is input into register 3. That is, the value of register 1 is updated to the pixel value of image pixel 4, the value of register 2 to the pixel value of image pixel 3, and the value of register 3 to the pixel value of image pixel 2. In this way, the first register group always holds the pixel values of 3 pixels, namely the 3 most recently input pixel values.
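The shifting behaviour of such a register group can be modelled in a few lines of Python. The class name `ShiftRegisterGroup` and its method `clock_in` are hypothetical names chosen for this sketch; index 0 of the list stands for register 1.

```python
class ShiftRegisterGroup:
    """Software model of a group of registers that shifts on every clock cycle."""

    def __init__(self, size):
        self.regs = [None] * size   # regs[0] is register 1, which holds the newest value

    def clock_in(self, value):
        """Shift every held value one register along and put `value` in register 1."""
        self.regs = [value] + self.regs[:-1]
        return list(self.regs)


group = ShiftRegisterGroup(3)
for pixel_value in (1, 2, 3):
    group.clock_in(pixel_value)
before = list(group.regs)     # registers 1..3 hold image pixels 3, 2, 1
after = group.clock_in(4)     # registers 1..3 hold image pixels 4, 3, 2
print(before, after)
```

The oldest value simply falls off the end, so the group always holds the most recent values in arrival order.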

In some embodiments, after the calculated pixel value of w pixel values is obtained, it is stored in a buffer. Then, when the calculated pixel values of the w pixel values of every row in the same pooling window are needed, the previously obtained per-row calculated pixel values can be read from the buffer in time.

In some embodiments, the buffer includes at least h sub-buffers, to ensure that the calculated pixel values of the w pixel values of different rows in the same pooling window can be stored in the buffer at the same time. The calculated pixel values of the w pixel values of different rows in the same pooling window are stored in different sub-buffers, so that they can be read from different sub-buffers, which improves the efficiency of obtaining them. Take as an example the case where h is equal to 3 and the buffer includes 3 sub-buffers. As shown in Figure 5, the calculated pixel value of the w pixel values of the nth row in the same pooling window is stored in sub-buffer 1, that of the n+1th row is stored in sub-buffer 2, and that of the n+2th row is stored in sub-buffer 3.

Optionally, the calculated pixel values of w pixel values that belong to the same row but to different pooling windows may be stored in the same sub-buffer. When the pooling window slides in the column direction, multiple pooling windows are formed. As shown in Figure 5, the calculated pixel value of the w pixel values of the nth row belonging to the first pooling window is stored at address 0 of the sub-buffer, the calculated pixel value of the w pixel values belonging to the second pooling window is stored at address 2 of the sub-buffer, the calculated pixel value of the w pixel values belonging to the third pooling window is stored at address 3 of the sub-buffer, and so on.

Optionally, each sub-buffer has width addresses in total, where width is the number of image pixels in the row direction of the image to be processed. Taking FIG. 3 as an example, width is 8. In this way, each sub-buffer can support the storage operation even in the extreme case where the pooling window is only 1 pixel wide.

Optionally, when the maximum pooling operation is performed, the storage space corresponding to each address can be divided into two parts: the first part holds the calculated pixel value of the above w pixel values, that is, the maximum pixel value, and the second part holds the address information corresponding to the maximum pixel value. In this way, when the maximum pixel value of each pooling window is subsequently output, the address information corresponding to the maximum pixel value can be output at the same time to facilitate subsequent processing. Taking FIG. 3 as an example, the address information corresponding to the maximum pixel value of the first pooling window is the address information of image pixel 9. For example, the image pixel at the upper left corner of the image can be used as the origin for determining the address information of each image pixel.
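A minimal sketch of keeping the address information alongside the maximum is given below. It assumes that the address information is a (row, column) pair counted from the upper-left image pixel and that each buffer entry is modelled as a `(value, address)` tuple; the function names and the sample values are hypothetical.

```python
def row_max_with_address(values, row, first_col):
    """Return (max value, (row, col)) for the w values of one row of a pooling window."""
    offset = max(range(len(values)), key=lambda i: values[i])
    return values[offset], (row, first_col + offset)


def window_max_with_address(row_entries):
    """Pick the entry with the largest value among the per-row entries."""
    return max(row_entries, key=lambda entry: entry[0])


# One buffer entry per row of a 3x3 pooling window whose left column is column 0.
entries = [
    row_max_with_address([1, 5, 2], row=0, first_col=0),
    row_max_with_address([7, 3, 4], row=1, first_col=0),
    row_max_with_address([6, 9, 8], row=2, first_col=0),
]
best_value, best_address = window_max_with_address(entries)
print(best_value, best_address)
```

Carrying the address through both stages means no second pass over the window is needed to find where the maximum came from.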

In some embodiments, the calculated pixel values of w pixel values of different rows in the same pooling window are stored in the same position in the corresponding sub-buffer. For example: as shown in Figure 5, if the calculated pixel values of the w pixel values in the first row in the same pooling window are stored in the corresponding position of address 0 of the sub-buffer 1, then the w in the second row in the same pooling window The calculated pixel value of each pixel value is stored in the address 0 corresponding position of the sub-buffer 2, and the calculated pixel value of the w pixel values in the third row of the same pooling window is stored in the address 0 corresponding position of the sub-buffer 3. If the calculated pixel values of the w pixel values of the first row in the same pooling window are stored in the corresponding location of address 2 of the sub-buffer 1, then the calculated pixel values of the w pixel values of the second row in the same pooling window exist In the location corresponding to the address 2 of the sub-buffer 2, the calculated pixel values of the w pixel values in the third row in the same pooling window are stored in the location corresponding to the address 2 of the sub-buffer 3.

Therefore, to obtain from the buffer the calculated pixel values of the w pixel values of each row in the same pooling window, the same address information is used to read the calculated pixel value from the same position in each sub-buffer.
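As a minimal sketch of this addressing scheme, the sub-buffers can be modelled as Python lists indexed by address. The names `read_window_rows` and `sub_buffers` and the sample values are hypothetical.

```python
def read_window_rows(sub_buffers, address):
    """Read the per-row results of one pooling window: same address, every sub-buffer."""
    return [sub_buffer[address] for sub_buffer in sub_buffers]


# Hypothetical contents for h = 3: one list per sub-buffer, index = address.
sub_buffers = [
    [5, 9, 4],  # sub-buffer 1: results for the window's first row
    [7, 2, 8],  # sub-buffer 2: results for the window's second row
    [1, 6, 3],  # sub-buffer 3: results for the window's third row
]

rows_of_window_0 = read_window_rows(sub_buffers, 0)
```

Because one address selects the matching entry in every sub-buffer, no per-window lookup table is needed.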

In some embodiments, a possible implementation of determining the calculated pixel value of the w*h pixel values in the pooling window according to the calculated pixel values of the w pixel values in each row is as follows: when the calculated pixel value of the w pixel values of the last row in the pooling window is obtained, it is input into the third register group at the same time as it is stored in the buffer; the calculated pixel values of the w pixel values of each of the other h-1 rows in the pooling window are read from the buffer and input into the third register group; and an operation is performed on the h calculated pixel values registered in the third register group to determine the calculated pixel value of the w*h pixel values in the pooling window. Optionally, the calculated pixel value of the w pixel values in the last row of the same pooling window and the calculated pixel values of the w pixel values in each of the other h-1 rows can be input into the third register group simultaneously.

In this embodiment, the third register group includes at least h registers, and each register is used to register a single pixel value. Therefore, the third register group can simultaneously register the calculated pixel values of the w pixel values of each row in the same pooling window, that is, h calculated pixel values.

In this embodiment, when the calculated pixel value of the w pixel values of the last row in the pooling window is obtained, it is not first written into the buffer and then read back out of the buffer into the third register group. Instead, it is input into the third register group at the same time as it is written into the buffer. This reuses the time spent writing a calculated pixel value into the buffer, saving processing time and improving processing efficiency.

In some embodiments, after the calculated pixel values of the w pixel values in the (n+h)th row are obtained, the calculated pixel values of the w pixel values in the corresponding column direction in the nth row stored in the buffer are replaced with the calculated pixel values of the w pixel values in the (n+h)th row. As shown in Figure 5, taking h=3 as an example, if sub-buffer 1 has stored the calculated pixel values of the w pixel values in the nth row, sub-buffer 2 has stored those of the (n+1)th row, sub-buffer 3 has stored those of the (n+2)th row, and the calculated pixel values of the w pixel values in the (n+3)th row are currently being obtained, then the calculated pixel values of the nth row stored in the buffer are replaced with those of the (n+3)th row. For example, the calculated pixel values obtained in sequence for the (n+3)th row are stored at the locations corresponding to the respective addresses of sub-buffer 1. In this way, a buffer including h sub-buffers always holds the calculated pixel values of the h most recently obtained rows, thereby saving buffer space.
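The replacement scheme behaves like a ring of h sub-buffers. The sketch below assumes, purely for illustration, that the sub-buffer for a row is chosen as `row % h`; the class name `RowResultBuffer` and its methods are hypothetical.

```python
class RowResultBuffer:
    """Buffer of h sub-buffers in which row n+h overwrites row n."""

    def __init__(self, h, addresses_per_row):
        self.h = h
        self.sub_buffers = [[None] * addresses_per_row for _ in range(h)]

    def store(self, row, address, value):
        # Assumed mapping: rows n and n+h share one sub-buffer.
        self.sub_buffers[row % self.h][address] = value

    def load(self, row, address):
        return self.sub_buffers[row % self.h][address]


buf = RowResultBuffer(h=3, addresses_per_row=2)
for row in range(3):
    buf.store(row, 0, 10 * row)  # rows n, n+1, n+2
buf.store(3, 0, 99)              # row n+3 replaces row n in sub-buffer 1
```

Only h rows of intermediate results are held at any time, regardless of image height.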

In some embodiments, a possible implementation manner for obtaining the calculated pixel value of the w pixel values is: performing an operation on the w pixel values according to the pooling mode to obtain the calculated pixel value of the w pixel values. In this embodiment, when w pixel values belonging to the same pooling window in the nth row are input into the first register group, the w pixel values in the first register group are operated on according to the pooling mode to obtain the calculated pixel value of the w pixel values belonging to the same pooling window in the nth row. When the w pixel values belonging to the same pooling window in the (n+1)th row are input into the second register group, the w pixel values in the second register group are operated on according to the pooling mode to obtain the calculated pixel value of the w pixel values belonging to the same pooling window in the (n+1)th row.

When the calculated pixel values of the w pixel values of each row in the same pooling window have been input into the third register group, that is, when the third register group simultaneously holds the calculated pixel values of the w pixel values of each of the h rows in the same pooling window, the h calculated pixel values in the third register group are operated on according to the pooling mode to obtain the calculated pixel value of the w*h pixel values in that pooling window.

Optionally, if the pooling mode is maximum pooling, the calculated pixel value is the maximum pixel value. That is, the pixel values registered in the first register group, the second register group, or the third register group are compared, the maximum pixel value among them is determined, and this maximum pixel value is called the calculated pixel value of these pixel values.

Optionally, if the pooling mode is average pooling, the calculated pixel value is an accumulated pixel value. That is, the pixel values registered in the first register group or the second register group or the third register group are accumulated to obtain the accumulated pixel value of these pixel values, and the accumulated pixel value is called the calculated pixel value of these pixel values.

Optionally, if the pooling mode is average pooling, the calculated pixel value is an average pixel value. That is, the pixel values registered in the first register group or the second register group are accumulated to obtain the accumulated pixel value of these pixel values, and the accumulated pixel value is divided by w to obtain the average pixel value, which is called the calculated pixel value of the w pixel values. The pixel values registered in the third register group are accumulated to obtain the accumulated pixel value of these pixel values, and the accumulated pixel value is divided by h to obtain the average pixel value, which is called the calculated pixel value of the w*h pixel values.
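The three variants above (maximum, accumulated, average) all reduce each row first and then reduce the h row results. A minimal sketch, assuming the window is given as a list of h rows of w values; the function name `pool_window_two_stage` and the mode strings are illustrative only.

```python
def pool_window_two_stage(window, mode):
    """Reduce each row of the window, then reduce the h row results."""
    h, w = len(window), len(window[0])
    if mode == "max":
        row_results = [max(row) for row in window]  # first/second register group
        return max(row_results)                     # third register group
    if mode == "sum":
        row_results = [sum(row) for row in window]
        return sum(row_results)
    if mode == "avg":
        row_results = [sum(row) / w for row in window]  # divide by w per row
        return sum(row_results) / h                     # divide by h per window
    raise ValueError(f"unknown pooling mode: {mode}")


window = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

For this window the two-stage result equals the result of reducing all w*h values at once, which is why the row results can be buffered and combined later.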

In some embodiments, a possible implementation manner of performing an operation on the w pixel values according to the pooling mode to obtain the calculated pixel value is: outputting the pixel values registered in the first register group or the second register group to the arithmetic unit, so that the arithmetic unit outputs the calculated pixel value; and acquiring the calculated pixel value output by the arithmetic unit.

When the w pixel values in the same pooling window are registered in the first register group or the second register group at the same time, the w pixel values registered in the first register group or the second register group are output to the arithmetic unit, so that the arithmetic unit performs an operation on the w pixel values according to the pooling mode to obtain the calculated pixel value of the w pixel values; the calculated pixel value output by the arithmetic unit is then acquired.

When the calculated pixel values of the w pixel values of each of the h rows in the same pooling window are registered in the third register group at the same time, the h calculated pixel values registered in the third register group are output to the arithmetic unit, so that the arithmetic unit performs an operation on the h calculated pixel values according to the pooling mode; the calculated pixel value of the w*h pixel values output by the arithmetic unit is then acquired.

In some embodiments, if the pooling mode is maximum pooling, the arithmetic unit is configured as a comparator; if the pooling mode is average pooling, the arithmetic unit is configured as an adder. In some embodiments, the arithmetic unit includes an adder, and when the pooling mode is average pooling, the adder outputs the accumulated pixel value. Optionally, if the calculated pixel value is an average pixel value, the arithmetic unit may further include a multiplier, which multiplies the accumulated pixel value output by the adder by 1/w or 1/h to obtain the average pixel value.

Optionally, when the arithmetic unit is configured as a comparator, the adder is reused. When two pixel values are compared, one pixel value A is input to the adder, and the other pixel value B is multiplied by -1 to obtain -B, which is also input to the adder. The adder adds A and -B to obtain the sum, namely A-B. If A-B is greater than 0, A is greater than B; if A-B is less than 0, A is less than B; if A-B is equal to 0, A is equal to B. In some embodiments, because a multiplier occupies more resources, when B is a signed number, B can be bitwise inverted and then incremented by 1 to obtain -B.

Therefore, the arithmetic unit of this embodiment can reuse the adder to realize the function of the comparator, saving hardware cost.
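The adder-as-comparator idea can be sketched in software as below. The 8-bit signed pixel width, the extra guard bit that keeps A-B from overflowing, and the names `to_bits` and `compare_with_adder` are assumptions made for this illustration; the application does not specify them.

```python
BITS = 8                 # assumed signed pixel width
WIDTH = BITS + 1         # assumed guard bit so that A - B cannot overflow
MASK = (1 << WIDTH) - 1


def to_bits(x):
    """Two's-complement encoding of a signed integer in WIDTH bits."""
    return x & MASK


def compare_with_adder(a, b):
    """Return 1 if a > b, -1 if a < b, 0 if equal, using only invert, +1 and add."""
    neg_b = (~to_bits(b) + 1) & MASK      # invert B and add 1 to obtain -B
    diff = (to_bits(a) + neg_b) & MASK    # the adder computes A + (-B)
    if diff == 0:
        return 0
    sign_bit = diff >> (WIDTH - 1)        # sign of A - B
    return -1 if sign_bit else 1
```

For example, `compare_with_adder(127, -128)` yields 1 and `compare_with_adder(-128, 127)` yields -1, covering the extremes of the assumed 8-bit range.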

In some embodiments, the input of the arithmetic unit can also be configured according to the size of the pooling window. For example, if the width or height of the pooling window is 3 pixels, then when the pooling mode is maximum pooling, the inputs of the first of the two adders in the arithmetic unit are configured as two of the pixel values to be calculated, and the inputs of the second adder are the output of the first adder and the remaining pixel value. It should be noted that if the pooling mode is maximum pooling, the inputs of the adders other than the above two adders in the arithmetic unit are configured to the minimum value; if the pooling mode is average pooling, the outputs of the adders other than the above two adders in the arithmetic unit are all configured as 0.
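One way to picture this configuration is a fixed-width reduction chain whose unused inputs are tied to a neutral value. The sketch below simplifies the text by tying unused inputs (rather than outputs) to 0 for average pooling; the function name `arithmetic_unit`, the default of four inputs, and the 8-bit minimum of -128 are assumptions.

```python
def arithmetic_unit(values, mode, num_inputs=4, min_value=-128):
    """Chain of two-input stages; unused inputs are tied to a neutral value."""
    if len(values) > num_inputs:
        raise ValueError("more values than inputs")
    neutral = min_value if mode == "max" else 0
    inputs = list(values) + [neutral] * (num_inputs - len(values))
    acc = inputs[0]
    for x in inputs[1:]:
        # Each stage takes the previous stage's output and one more input.
        acc = max(acc, x) if mode == "max" else acc + x
    return acc
```

With a 3-pixel window and four inputs, the fourth input contributes nothing: `arithmetic_unit([-7, -9, -8], "max")` is -7 and `arithmetic_unit([3, 4, 5], "avg")` is the accumulated value 12.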

In some embodiments, the arithmetic units that output the calculated pixel values of w pixel values in adjacent rows are a first arithmetic unit and a second arithmetic unit, respectively, and the arithmetic unit that outputs the calculated pixel value of the w*h pixel values in the pooling window is a third arithmetic unit.

For example, the first arithmetic unit is used to obtain the calculated pixel value of the w pixel values stored in the first register group, and the second arithmetic unit is used to obtain the calculated pixel value of the w pixel values stored in the second register group. That is, the arithmetic units that respectively output the calculated pixel values of w pixel values in adjacent rows are not the same arithmetic unit, which ensures that the operations of two adjacent rows do not affect each other, saving pooling time and improving pooling efficiency. In addition, the third arithmetic unit is used to obtain the calculated pixel value of the h calculated pixel values stored in the third register group, and the third arithmetic unit is a different arithmetic unit from the first arithmetic unit and the second arithmetic unit, which ensures that the calculation process is continuous and is not interrupted, improving pooling efficiency.

In some embodiments, if the pooling mode is maximum pooling, the filled pixel value is the minimum pixel value, which can ensure that the filled pixel value does not affect the actual maximum pixel value calculation result. If the pooling mode is average pooling, the filled pixel value is 0, which can ensure that the filled pixel value does not affect the actual average pixel value or the calculation result of the accumulated pixel value.

In some embodiments, it is also possible to perform layered processing on the original image to obtain multi-layer sub-images, where the pixel value of each pixel in each layer of sub-image consists of the same bits of the pixel value of the corresponding pixel in the original image, and the image to be processed is any sub-image in the multi-layer sub-images. The solutions of the foregoing embodiments are then executed for each layer of sub-image.

If the pixel value in the original image is 16 bits, the original image can be layered, for example, divided into two layers to obtain a first-layer sub-image and a second-layer sub-image. The pixel value of each pixel in the first-layer sub-image can be the 1st to 8th bits of the pixel value of the corresponding pixel in the original image, and the pixel value of each pixel in the second-layer sub-image can be the 9th to 16th bits of the pixel value of the corresponding pixel in the original image. The solutions of the foregoing embodiments are then executed on the first-layer sub-image to obtain the pooling result of the first-layer sub-image, and on the second-layer sub-image to obtain the pooling result of the second-layer sub-image. In this embodiment, the pooling result of the original image can also be obtained according to the pooling result of the first-layer sub-image and the pooling result of the second-layer sub-image. For example, the pooling results of the corresponding pooling windows of the first-layer sub-image and the second-layer sub-image are added.
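A minimal sketch of the layering for the accumulated-value case is given below. It assumes that the first layer holds the low 8 bits and the second layer the high 8 bits, and that the second layer's result is shifted left by 8 bits before the two results are added; neither detail is stated in the text, and the function names are hypothetical. The sketch covers accumulation only.

```python
def split_layers(pixels):
    """Split unsigned 16-bit pixel values into low-byte and high-byte layers."""
    low = [p & 0xFF for p in pixels]           # assumed first-layer sub-image
    high = [(p >> 8) & 0xFF for p in pixels]   # assumed second-layer sub-image
    return low, high


def accumulate_16bit(pixels):
    """Accumulate each 8-bit layer separately, then add the weighted results."""
    low, high = split_layers(pixels)
    return sum(low) + (sum(high) << 8)


window_pixels = [0x1234, 0x00FF, 0xFF00, 0x0001]
```

Here `accumulate_16bit(window_pixels)` equals `sum(window_pixels)`, so each layer can be processed by its own 8-bit hardware path.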

Among them, the hardware required for processing the first-layer sub-image and the hardware required for processing the second-layer sub-image may not be the same hardware, so that the parallel processing of the first-layer sub-image and the second-layer sub-image is realized, and the pooling efficiency is improved.

In some embodiments, the original image may also be divided into blocks. For example, the original image is divided into multiple image blocks along the row direction and/or the column direction, and the solutions of the foregoing embodiments are then executed for each image block. In this way, when the data volume of the original image is relatively large, multiple image blocks can be processed separately, reducing the storage space required for image processing.

In some embodiments, the foregoing embodiments can be used to process multiple original images at the same time, so as to further increase the degree of parallelism and improve the efficiency of the pooling operation. For example, if 32 original images are processed in parallel, and each image pixel in each image is 16 bits, then 512 bits of data can be input in one clock cycle. Optionally, each image pixel can also be divided into high 8 bits and low 8 bits to be processed separately.
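The 512-bit figure follows from 32 images times 16 bits per pixel. The sketch below packs one pixel from each image into a single wide word; the lane order (image 0 in the lowest bits) and the function names are assumptions for illustration.

```python
def pack_pixels(pixels, bits=16):
    """Pack one pixel from each image into a single wide word."""
    lane_mask = (1 << bits) - 1
    word = 0
    for lane, pixel in enumerate(pixels):
        word |= (pixel & lane_mask) << (lane * bits)  # assumed: image 0 in the lowest bits
    return word


def unpack_pixel(word, lane, bits=16):
    """Extract the pixel belonging to image `lane` from the wide word."""
    return (word >> (lane * bits)) & ((1 << bits) - 1)


word = pack_pixels([0xFFFF] * 32)  # 32 images x 16 bits = 512 bits per clock cycle
```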

Hereinafter, an implementation manner of the present application will be described with reference to FIGS. 3 to 5.

Set the initial value in each register group according to the pooling mode. If the pooling mode is average pooling, the initial value is 0; if the pooling mode is maximum pooling, the initial value is the minimum pixel value. For example, if the pixel value is 8 bits, the minimum pixel value is -128.

The pixel value of image pixel 1 in row 2 in Figure 3 is input into register 1 of the first register group. Because the pixel value of image pixel 1 at this time represents the accumulated pixel value (or average pixel value or maximum pixel value) of row 2 in the first pooling window, the pixel value of image pixel 1 is also input into address 0 of sub-buffer 1 in the buffer.

The pixel value of image pixel 2 is input to register 1 of the first register group, and the pixel value of image pixel 1 registered by register 1 is input to register 2.

The pixel value of image pixel 3 is input to register 1 of the first register group, while the pixel value of image pixel 2 registered in register 1 is input to register 2 and the pixel value of image pixel 1 registered in register 2 is input to register 3. The accumulated pixel value (or average pixel value or maximum pixel value) of the pixel values of image pixel 1, image pixel 2, and image pixel 3 registered in the first register group is also obtained and input into address 1 of sub-buffer 1 in the buffer.

The pixel value of the image pixel 4 is input to the register 1 of the first register group, the pixel value of the image pixel 3 registered in the register 1 is input to the register 2 and the pixel value of the image pixel 2 registered in the register 2 is input to the register 3.

The pixel value of the image pixel 5 is input into the first register group, and the process of inputting the image pixel 5 will not be repeated here. The accumulated pixel value (or average pixel value or maximum pixel value) of the pixel values of image pixel 3, image pixel 4, and image pixel 5 registered in the first register group is also obtained and input into address 2 of sub-buffer 1 in the buffer.

The pixel value of the image pixel 6 is input into the first register group, and the process of inputting the image pixel 6 is not repeated here.

The pixel value of the image pixel 7 is input into the first register group, and the process of inputting the image pixel 7 will not be repeated here. The accumulated pixel value (or average pixel value or maximum pixel value) of the pixel values of image pixel 5, image pixel 6, and image pixel 7 registered in the first register group is also obtained and input into address 3 of sub-buffer 1 in the buffer.

The pixel value of the image pixel 8 is input into the first register group, and the process of inputting the image pixel 8 will not be repeated here.

The pixel value of the filling pixel adjacent to the image pixel 8 is input into the first register group, and the process of inputting the filling pixel will not be repeated here. The accumulated pixel value (or average pixel value or maximum pixel value) of the pixel values of image pixel 7, image pixel 8, and the filling pixel registered in the first register group is also obtained and input into address 4 of sub-buffer 1 in the buffer.
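The row pass walked through above can be simulated as a small shift register. The sketch assumes a window width of 3, a row step of 2, registers preset to the initial value, and one trailing filling pixel, all inferred from the walkthrough rather than stated as parameters; `row_pass` is a hypothetical name.

```python
from collections import deque


def row_pass(row, w=3, stride=2, mode="max", min_value=-128):
    """Shift one row through a w-register group and emit one result per window."""
    pad = min_value if mode == "max" else 0
    reduce_fn = max if mode == "max" else sum
    registers = deque([pad] * w, maxlen=w)  # register group preset to the initial value
    sub_buffer = []                         # list index = sub-buffer address
    for step, value in enumerate(list(row) + [pad], start=1):
        registers.appendleft(value)         # new value enters register 1, others shift on
        if step % stride == 1:              # after pixels 1, 3, 5, 7 and the filling pixel
            sub_buffer.append(reduce_fn(registers))
    return sub_buffer
```

For a row holding the values 1 to 8, maximum pooling yields `[1, 3, 5, 7, 8]` at addresses 0 to 4, and accumulation yields `[1, 6, 12, 18, 15]`, matching the order in which the walkthrough fills sub-buffer 1.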

While the pixel value of the filling pixel adjacent to the image pixel 8 is input into the first register group, the pixel value of the image pixel 9 in row 3 is input into register 1 of the second register group. Because the pixel value of the image pixel 9 at this time represents the accumulated pixel value (or average pixel value or maximum pixel value) of row 3 in the first pooling window, it is also input into address 0 of sub-buffer 2 in the buffer. The accumulated pixel value (or average pixel value or maximum pixel value) of row 2 in the first pooling window, that is, the pixel value of image pixel 1, is read from address 0 of sub-buffer 1 and input into the third register group. The accumulated pixel value (or average pixel value or maximum pixel value) of row 3 in the first pooling window is likewise read from address 0 of sub-buffer 2 and input into the third register group; alternatively, the pixel value of the image pixel 9 is input into the third register group at the same time as it is written into address 0 of sub-buffer 2. The accumulated pixel value or average pixel value of image pixel 1 and image pixel 9 is then obtained through an adder, or their maximum pixel value is obtained through a comparator, and the result is output as the pooling result of the first pooling window. Meanwhile, the second register group continues to shift in pixel values. Since different register groups and adders/comparators are used, the comparison or accumulation in the column dimension does not affect the comparison or accumulation of row 3.

The pixel value of the image pixel 10 is input into the second register group, and the process of inputting the image pixel 10 will not be repeated here.

The pixel value of the image pixel 11 is input into the second register group, and the process of inputting the image pixel 11 will not be repeated here. The accumulated pixel value (or average pixel value or maximum pixel value) of the pixel values of image pixel 9, image pixel 10, and image pixel 11 registered in the second register group is also obtained and input into address 1 of sub-buffer 2 in the buffer.

At this time, the accumulated pixel value (or average pixel value or maximum pixel value) of row 2 in the second pooling window, that is, the accumulated pixel value (or average pixel value or maximum pixel value) of image pixel 1, image pixel 2, and image pixel 3, is read from address 1 of sub-buffer 1 and input into the third register group. The accumulated pixel value (or average pixel value or maximum pixel value) of row 3 in the second pooling window is likewise read from address 1 of sub-buffer 2 and input into the third register group; alternatively, the accumulated pixel value (or average pixel value or maximum pixel value) of image pixel 9, image pixel 10, and image pixel 11 is input into the third register group at the same time as it is written into address 1 of sub-buffer 2. The accumulated pixel value or average pixel value of all pixels in the second pooling window is then obtained through the adder, or the maximum pixel value among all pixels in the second pooling window is obtained through the comparator, and the result is output as the pooling result of the second pooling window. Meanwhile, the second register group continues to shift in pixel values. Since different register groups and adders/comparators are used, the comparison or accumulation in the column dimension does not affect the comparison or accumulation of row 3.
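The time saved by starting the next row while the trailing filling pixel of the current row is being input can be estimated with a simple cycle count. The model below is a rough sketch: it assumes one pixel value is input per clock cycle, one trailing filling pixel per row, and that leading padding is covered by the register preset, none of which is quantified in the text. `total_cycles` is a hypothetical name.

```python
def total_cycles(rows, image_width, overlapped):
    """Clock cycles needed to input `rows` rows of `image_width` image pixels."""
    per_row = image_width + 1      # image pixels plus one trailing filling pixel
    if not overlapped:
        return rows * per_row      # next row waits for the whole current row
    # Each next row starts during the previous row's filling-pixel cycle,
    # so only the last row pays for its filling pixel on its own.
    return rows * image_width + 1


sequential = total_cycles(rows=8, image_width=8, overlapped=False)
pipelined = total_cycles(rows=8, image_width=8, overlapped=True)
```

Under these assumptions an 8x8 image takes 72 cycles without overlap and 65 with it; the saving grows by one cycle per additional row.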

In some implementations, the pixel value of the image shown in Figure 3 is 8 bits. In practical applications, the image pixel value of the original image is 16 bits, so the original image is split into two images to be processed, each with a single pixel value of 8 bits. The parallelism of the char type is twice that of the short type, so the 16-bit type requires two hardware structures similar to the one described above. For example, processing an image to be processed with 8-bit pixel values requires three register groups and three adders/comparators, so a 16-bit pixel value requires a total of 6 register groups and 6 adders/comparators. Optionally, when the bit width of the registers is 16 bits, 3 register groups and 6 adders/comparators can be used. It can be understood that the amount of hardware required is adapted to the bit width of the hardware, and those skilled in the art can determine the amount of hardware according to actual application scenarios.

Optionally, the data written into a sub-buffer is the maximum pixel value output by the comparator or the accumulated pixel value output by the adder. Since the sub-buffers are separate, each sub-buffer is controlled independently and is assigned its own addresses. Each comparator/adder outputs an address and a write enable so that the calculated pixel value is written into the corresponding sub-buffer.

FIG. 6 is a schematic structural diagram of an image processing device provided by an embodiment of the application. As shown in FIG. 6, the image processing device 600 of this embodiment includes a first register group 601, a second register group 602, and a processor 603, which are connected through a bus. Optionally, the image processing device 600 of this embodiment may further include a buffer 604, and the buffer 604 is connected to the foregoing components through a bus. Optionally, the image processing device 600 of this embodiment may further include a third register group 605, and the third register group 605 is connected to the foregoing components through a bus. Optionally, the image processing device 600 of this embodiment may further include an arithmetic unit 606, which is connected to the foregoing components through a bus. In this embodiment, three arithmetic units are shown, and the three arithmetic units correspond to the foregoing three register groups.

The edge of the image to be processed is provided with filling pixels. The processor 603 is configured to sequentially input the pixel values of the pixels in the nth row of the image to be processed into the first register group 601, where n is any integer greater than or equal to 1, and the pixel values include image pixel values and filling pixel values; and, when the filling pixel value adjacent to the last image pixel value in the nth row is input into the first register group 601, to input the first image pixel value in the (n+1)th row of the image into the second register group 602.

In some embodiments, the processor 603 is further configured to:

When w pixel values belonging to the same pooling window in the nth row are input into the first register group 601, the calculated pixel value of the w pixel values is obtained, where the width of the pooling window is w pixels, the height of the pooling window is h pixels, and the pooling window slides in the row direction according to a preset row step length and slides in the column direction according to a preset column step length;

When the calculated pixel values of w pixel values in each row in the same pooling window are obtained, the calculated pixel values of w*h pixel values in the pooling window are determined according to the calculated pixel values of w pixel values in each row .

In some embodiments, the first register group 601 or the second register group 602 includes at least w registers, and each register is used to register a single pixel value.

In some embodiments, the processor 603 is further configured to store the calculated pixel values of the w pixel values in the buffer 604 after obtaining the calculated pixel values of the w pixel values.

In some embodiments, the buffer 604 includes at least h sub-buffers, and the calculated pixel values of w pixel values of different rows in the same pooling window are stored in different sub-buffers.

In some embodiments, the calculated pixel values of w pixel values of different rows in the same pooling window are stored in the same location in the corresponding sub-buffer 604.

In some embodiments, when the processor 603 determines the calculated pixel values of w*h pixel values in the pooling window according to the calculated pixel values of w pixel values in each row, it is specifically configured to:

When the calculated pixel values of the w pixel values of the last row in the pooling window are obtained, while the calculated pixel values of the w pixel values of the last row are stored in the buffer 604, the last The calculated pixel values of w pixel values of a row are input into the third register group 605;

Read the calculated pixel values of the w pixel values of each of the other h-1 rows in the pooling window from the buffer 604, and input them into the third register group 605;

Perform calculations on the h arithmetic pixel values registered in the third register group 605, and determine the arithmetic pixel values of w*h pixel values in the pooling window.

In some embodiments, the processor 603 is further configured to:

After obtaining the calculated pixel values of the w pixel values in the n+h row, replace the calculated pixel values of the w pixel values in the corresponding column direction in the nth row stored in the buffer 604 with the n+h The calculated pixel value of w pixel values in the row.

In some embodiments, when the processor 603 obtains the calculated pixel values of the w pixel values, it is specifically configured to: perform operations on the w pixel values according to a pooling mode to obtain the calculated pixel values.

In some embodiments, if the pooling mode is maximum pooling, the calculated pixel value is the maximum pixel value; if the pooling mode is average pooling, the calculated pixel value is the average pixel value or the accumulated pixel value.

In some embodiments, the processor 603, when performing operations on the w pixel values according to the pooling mode to obtain the calculated pixel values, is specifically configured to: output the pixel values registered in the first register group 601 or the second register group 602 to the arithmetic unit 606, so that the arithmetic unit 606 outputs the calculated pixel value; and obtain the calculated pixel value output by the arithmetic unit 606.

In some embodiments, the processor 603 is further configured to:

If the pooling mode is maximum pooling, configure the arithmetic unit 606 as a comparator;

If the pooling mode is average pooling, the computing unit 606 is configured as an adder.

In some embodiments, the processor 603 is further configured to configure the input of the computing unit 606 according to the size of the pooling window.

In some embodiments, the arithmetic units 606 that output the calculated pixel values of w pixel values in adjacent rows are a first arithmetic unit and a second arithmetic unit, respectively, and the arithmetic unit 606 that outputs the calculated pixel value of the w*h pixel values in the pooling window is a third arithmetic unit.

In some embodiments, if the pooling mode is maximum pooling, the filled pixel value is the smallest pixel value;

If the pooling mode is average pooling, the filled pixel value is 0.

In some embodiments, the processor 603 is further configured to perform layered processing on the original image to obtain multiple sub-images;

The pixel value of each pixel in the sub-image of each layer is the pixel value of the same bit of each pixel in the original image;

The image to be processed is any sub-image in the multi-layer sub-image.

Optionally, the image processing device 600 of this embodiment may further include a memory (not shown in the figure) for storing program codes. When the program codes are executed, the image processing device 600 can implement the above-mentioned technical solutions.

The image processing device of this embodiment can be used to implement the technical solutions of FIG. 2 and the corresponding method embodiment, and its implementation principles and technical effects are similar, and will not be repeated here.

Another embodiment of the present application further provides an image processing device including a memory and a processor; the memory is used for storing program instructions, and the processor is used for calling the program instructions in the memory to execute the solutions of the foregoing embodiments.

FIG. 7 is a schematic structural diagram of a movable platform provided by an embodiment of the application. As shown in FIG. 7, the movable platform 700 of this embodiment may include: a first register set 701, a second register set 702, and a processor 703 Bus connection. Optionally, the movable platform 700 of this embodiment may further include a cache 704, and the cache 704 is connected to the foregoing components through a bus. Optionally, the movable platform 700 of this embodiment may further include a third register set 705, and the third register set 705 is connected to the foregoing components through a bus. Optionally, the movable platform 700 of this embodiment may further include an arithmetic unit 706, which is connected to the above-mentioned components through a bus. In this embodiment, three arithmetic units are shown, and the three arithmetic units are connected to the above-mentioned three registers. Group correspondence.

The edge of the image to be processed is provided with padding pixels. The processor 703 is configured to: sequentially input the pixel values of the pixels in the nth row of the image to be processed into the first register group 701, where n is any integer greater than or equal to 1 and the pixel values include image pixel values and padding pixel values; and, when the padding pixel value adjacent to the last image pixel value in the nth row is input into the first register group 701, input the first image pixel value in the (n+1)th row of the image into the second register group 702.
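As a non-limiting illustration, the timing benefit of this overlap can be estimated with a small model. The model is hypothetical: it assumes one pixel value is input per cycle, the same number of padding values `pad` on both sides of every row, and rows counted from 0. It tracks only the stated trigger condition and does not model how the leading padding of later rows is handled.

```python
def first_image_pixel_cycle(row_index, width, pad, overlapped):
    """Return the 0-based cycle in which a row's first image pixel is input.

    Hypothetical timing model: one pixel value per cycle, `pad` padding values
    on each side of every row, rows counted from 0.
    """
    if overlapped:
        # The first image pixel of the next row enters the second register
        # group in the same cycle in which the first trailing padding value
        # of the current row enters the first register group.
        return pad + row_index * width
    # Serial baseline: a row starts only after the whole previous row,
    # including both paddings, has been input.
    return pad + row_index * (width + 2 * pad)


width, pad = 4, 1
for row in range(3):
    serial = first_image_pixel_cycle(row, width, pad, overlapped=False)
    overlap = first_image_pixel_cycle(row, width, pad, overlapped=True)
    print(row, serial, overlap)  # the gap grows by 2 * pad cycles per row
```

Under these assumptions, each additional row saves `2 * pad` cycles compared with inputting all rows into a single register group.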

In some embodiments, the processor 703 is further configured to:

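When w pixel values belonging to the same pooling window in the nth row have been input into the first register group 701, obtain the calculated pixel values of the w pixel values, where the width of the pooling window is w pixels, the height of the pooling window is h pixels, and the pooling window slides in the row direction according to a preset row step length and in the column direction according to a preset column step length;

When the calculated pixel values of the w pixel values in each row of the same pooling window have been obtained, determine the calculated pixel values of the w*h pixel values in the pooling window according to the calculated pixel values of the w pixel values in each row.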

In some embodiments, the first register group 701 or the second register group 702 includes at least w registers, and each register is used to register a single pixel value.
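As a non-limiting illustration, the row-wise step can be sketched as a reduction over each run of w consecutive values of one padded row. The function name `row_partials` and the mode strings `"max"` and `"avg"` are hypothetical. For average pooling the sketch keeps an accumulated sum per row and leaves the division for later, which is one of the two options named in the claims (average or accumulated pixel value).

```python
def row_partials(row, w, stride, mode):
    """Reduce each run of w consecutive values in one padded row."""
    if mode == "max":
        reduce_fn = max
    elif mode == "avg":
        reduce_fn = sum  # accumulate now, divide once the whole window is known
    else:
        raise ValueError(f"unknown pooling mode: {mode}")
    return [reduce_fn(row[i:i + w]) for i in range(0, len(row) - w + 1, stride)]


padded_row = [0, 3, 1, 4, 1, 0]  # one row with a padding value of 0 at each end
print(row_partials(padded_row, w=2, stride=2, mode="max"))  # [3, 4, 1]
print(row_partials(padded_row, w=2, stride=2, mode="avg"))  # [3, 5, 1]
```

Each slice of length w stands in for the contents of a register group holding w single pixel values.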

In some embodiments, the processor 703 is further configured to store the calculated pixel values of the w pixel values in the buffer 704 after obtaining the calculated pixel values of the w pixel values.

In some embodiments, the buffer 704 includes at least h sub-buffers, and the calculated pixel values of w pixel values of different rows in the same pooling window are stored in different sub-buffers.

In some embodiments, the calculated pixel values of w pixel values of different rows in the same pooling window are stored in the same position in the corresponding sub-buffer 704.

In some embodiments, when the processor 703 determines the calculated pixel values of w*h pixel values in the pooling window according to the calculated pixel values of w pixel values in each row, it is specifically configured to:

When the calculated pixel values of the w pixel values of the last row in the pooling window are obtained, input the calculated pixel values of the w pixel values of the last row into the third register group 705 while storing them in the buffer 704;

Read from the buffer 704 the calculated pixel values of the w pixel values of each of the other h-1 rows in the pooling window, and input them into the third register group 705;

Perform calculations on the h calculated pixel values registered in the third register group 705, and determine the calculated pixel values of the w*h pixel values in the pooling window.
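As a non-limiting illustration, the three steps above can be sketched as a single combining function. The name `combine_window` and its parameters are hypothetical. The sketch assumes that, for average pooling, each row contributes an accumulated sum and the division by w*h happens only in this final step.

```python
def combine_window(last_row_partial, cached_partials, mode, w):
    """Combine h per-row results into the result for one w*h pooling window.

    `last_row_partial` is forwarded directly (it does not wait for a buffer
    read); `cached_partials` are the other h-1 values read from the buffer.
    """
    third_register_group = list(cached_partials) + [last_row_partial]
    h = len(third_register_group)
    if mode == "max":
        return max(third_register_group)
    if mode == "avg":
        # Row results are accumulated sums, so divide by the window area once.
        return sum(third_register_group) / (w * h)
    raise ValueError(f"unknown pooling mode: {mode}")


print(combine_window(5, [3], mode="max", w=2))  # 5
print(combine_window(5, [3], mode="avg", w=2))  # 2.0
```

Forwarding the last row's result directly is what lets the final calculation start without first reading that value back from the buffer.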

In some embodiments, the processor 703 is further configured to:

After obtaining the calculated pixel values of the w pixel values in the (n+h)th row, replace the calculated pixel values of the w pixel values at the corresponding column position in the nth row stored in the buffer 704 with the calculated pixel values of the w pixel values in the (n+h)th row.
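As a non-limiting illustration, a buffer with h sub-buffers and this replacement rule can be sketched as below. The class name `RowPartialCache` and the modulo addressing are assumptions chosen for the sketch; the text only states that results of different rows go to different sub-buffers at the same position, and that row n+h replaces row n.

```python
class RowPartialCache:
    """Hypothetical buffer made of h sub-buffers, one per row of a pooling window."""

    def __init__(self, h, positions):
        self.h = h
        self.sub_buffers = [[None] * positions for _ in range(h)]

    def store(self, row_index, position, value):
        # Rows n and n+h map to the same sub-buffer, so storing the result of
        # row n+h overwrites the result of row n at the same position.
        self.sub_buffers[row_index % self.h][position] = value

    def read_column(self, position):
        # Results of the different rows of one window share the same position.
        return [sub_buffer[position] for sub_buffer in self.sub_buffers]


cache = RowPartialCache(h=2, positions=3)
cache.store(row_index=1, position=0, value=7)
cache.store(row_index=2, position=0, value=9)
print(cache.read_column(0))  # [9, 7]
cache.store(row_index=3, position=0, value=4)  # row 3 replaces row 1
print(cache.read_column(0))  # [9, 4]
```

With this layout the buffer never needs to hold more than h rows of results, regardless of the image height.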

In some embodiments, when the processor 703 obtains the calculated pixel values of the w pixel values, it is specifically configured to: perform operations on the w pixel values according to a pooling mode to obtain the calculated pixel values.

In some embodiments, when performing operations on the w pixel values according to the pooling mode to obtain the calculated pixel values, the processor 703 is specifically configured to: output the pixel values registered in the first register group 701 or the second register group 702 to the arithmetic unit 706, so that the arithmetic unit 706 outputs the calculated pixel values; and obtain the calculated pixel values output by the arithmetic unit 706.

In some embodiments, the processor 703 is further configured to:

If the pooling mode is maximum pooling, configure the arithmetic unit 706 as a comparator;

If the pooling mode is average pooling, configure the arithmetic unit 706 as an adder.

In some embodiments, the processor 703 is further configured to configure the input of the arithmetic unit 706 according to the size of the pooling window.
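As a non-limiting illustration, configuring the arithmetic unit by pooling mode and window size can be sketched in software as selecting a reduction and fixing its number of inputs. The function name `configure_arithmetic_unit` is hypothetical, and treating "configuring the input" as fixing the input count is an assumption.

```python
def configure_arithmetic_unit(mode, num_inputs):
    """Return a callable standing in for a configured arithmetic unit."""
    if mode == "max":
        operation = max  # comparator behaviour
    elif mode == "avg":
        operation = sum  # adder behaviour; division is left to a later step
    else:
        raise ValueError(f"unknown pooling mode: {mode}")

    def unit(values):
        values = list(values)
        if len(values) != num_inputs:
            raise ValueError(f"expected {num_inputs} inputs, got {len(values)}")
        return operation(values)

    return unit


row_unit = configure_arithmetic_unit("max", num_inputs=3)  # window width w = 3
print(row_unit([1, 5, 2]))  # 5
```

The same helper could be configured with h inputs for the unit that combines the per-row results.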

In some embodiments, the arithmetic units 706 that output the calculated pixel values of the w pixel values in adjacent rows are a first arithmetic unit and a second arithmetic unit, respectively, and the arithmetic unit 706 that outputs the calculated pixel values of the w*h pixel values in the pooling window is a third arithmetic unit.

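In some embodiments, if the pooling mode is maximum pooling, the padding pixel value is the minimum pixel value; if the pooling mode is average pooling, the padding pixel value is 0.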
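As a non-limiting illustration, the choice of padding value can be sketched as follows. The function name `pad_image` and the parameter `min_pixel_value` are hypothetical. For maximum pooling the sketch uses the minimum pixel value, as stated in the claims, so that padding can never win a comparison; for average pooling it uses 0, so that padding adds nothing to an accumulated sum.

```python
def pad_image(image, pad, mode, min_pixel_value=0):
    """Surround an image with `pad` rows and columns of padding values."""
    if mode == "max":
        fill = min_pixel_value  # padding must never exceed a real pixel
    elif mode == "avg":
        fill = 0  # padding must not change the accumulated sum
    else:
        raise ValueError(f"unknown pooling mode: {mode}")
    width = len(image[0]) + 2 * pad
    top = [[fill] * width for _ in range(pad)]
    body = [[fill] * pad + list(row) + [fill] * pad for row in image]
    bottom = [[fill] * width for _ in range(pad)]
    return top + body + bottom


# Signed pixel values are assumed here only to make the effect visible.
padded = pad_image([[-3, -7], [-5, -2]], pad=1, mode="max", min_pixel_value=-128)
top_left_window = [padded[r][c] for r in range(2) for c in range(2)]
print(max(top_left_window))  # -3, the only real pixel in the window
```

Had the padding been 0 in this example, the top-left window would have returned 0, a value that is not present in the image.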

In some embodiments, the processor 703 is further configured to perform layered processing on the original image to obtain multiple layers of sub-images;

The image to be processed is any sub-image in the multi-layer sub-image.

Optionally, the movable platform 700 of this embodiment may further include a memory (not shown in the figure) for storing program code. When the program code is executed, the movable platform 700 can implement the above-mentioned technical solutions.

The movable platform of this embodiment can be used to implement the technical solutions of FIG. 2 and the corresponding method embodiments, and its implementation principles and technical effects are similar, and will not be repeated here.

FIG. 8 is a schematic structural diagram of a movable platform provided by another embodiment of this application. As shown in FIG. 8, the movable platform 800 of this embodiment may include: a movable platform body 801 and an image processing device 802.

Wherein, the image processing device 802 is installed on the movable platform body 801. The image processing device 802 may be a device independent of the movable platform body 801.

The image processing device 802 may adopt the structure of the apparatus embodiment shown in FIG. 6, and correspondingly, may execute the technical solutions of FIG. 2 and its corresponding method embodiments. The implementation principles and technical effects are similar, and will not be repeated here.

A person of ordinary skill in the art can understand that all or part of the steps in the above method embodiments can be implemented by a program instructing relevant hardware. The program can be stored in a computer-readable storage medium. When the program is executed, it performs the steps of the foregoing method embodiments. The storage medium includes various media that can store program code, such as read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the application, not to limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be equivalently replaced, and that these modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims

An image processing method, characterized in that the edge of the image to be processed is provided with padding pixels, and the method includes:

Sequentially input the pixel values of the pixels in the nth row in the image to be processed into the first register group, where n is any integer greater than or equal to 1, and the pixel values include image pixel values and padding pixel values;

When the padding pixel value adjacent to the last image pixel value in the nth row is input into the first register group, input the first image pixel value in the (n+1)th row of the image into the second register group.
The method according to claim 1, further comprising:

When the w pixel values belonging to the same pooling window in the nth row have been input into the first register group, obtain the calculated pixel values of the w pixel values, where the width of the pooling window is w pixels, the height of the pooling window is h pixels, and the pooling window slides in the row direction according to a preset row step length and in the column direction according to a preset column step length;

When the calculated pixel values of the w pixel values in each row of the same pooling window have been obtained, determine the calculated pixel values of the w*h pixel values in the pooling window according to the calculated pixel values of the w pixel values in each row.
The method according to claim 2, wherein the first register group or the second register group includes at least w registers, and each register is used to register a single pixel value.
The method according to claim 2, wherein after said obtaining the calculated pixel values of the w pixel values, the method further comprises:

The calculated pixel values of the w pixel values are stored in the buffer.
The method according to claim 4, wherein the buffer includes at least h sub-buffers, and the calculated pixel values of w pixel values of different rows in the same pooling window are stored in different sub-buffers.
The method according to claim 5, wherein the calculated pixel values of the w pixel values of different rows in the same pooling window are stored in the same position in the corresponding sub-buffer.
The method according to any one of claims 4-6, wherein determining the calculated pixel values of the w*h pixel values in the pooling window according to the calculated pixel values of the w pixel values in each row includes:

When the calculated pixel values of the w pixel values of the last row in the pooling window are obtained, input the calculated pixel values of the w pixel values of the last row into the third register group while storing them in the buffer;

Read the calculated pixel values of the w pixel values of each of the other h-1 rows in the pooling window from the buffer, and input them into the third register group;

Perform an operation on the h calculated pixel values registered in the third register group, and determine the calculated pixel values of the w*h pixel values in the pooling window.
The method according to any one of claims 4-7, further comprising:

After obtaining the calculated pixel values of the w pixel values in the (n+h)th row, replace the calculated pixel values of the w pixel values at the corresponding column position in the nth row in the buffer with the calculated pixel values of the w pixel values in the (n+h)th row.
The method according to any one of claims 2-8, wherein the obtaining the calculated pixel values of the w pixel values comprises:

Perform operations on the w pixel values according to the pooling mode to obtain the calculated pixel values.
The method according to claim 9, wherein if the pooling mode is maximum pooling, the calculated pixel value is the maximum pixel value; if the pooling mode is average pooling, the calculated pixel value is an average pixel value or an accumulated pixel value.
The method according to claim 9, wherein the calculating the w pixel values according to the pooling mode to obtain the calculated pixel value comprises:

Output the pixel value registered in the first register group or the second register group to an arithmetic unit, so that the arithmetic unit outputs the calculated pixel value;

Obtaining the calculated pixel value output by the arithmetic unit.
The method according to claim 11, wherein if the pooling mode is maximum pooling, the arithmetic unit is configured as a comparator;

If the pooling mode is average pooling, the arithmetic unit is configured as an adder.
The method according to claim 11 or 12, further comprising:

According to the size of the pooling window, the input of the arithmetic unit is configured.
The method according to any one of claims 11-13, wherein the arithmetic units that output the calculated pixel values of the w pixel values in adjacent rows are a first arithmetic unit and a second arithmetic unit, respectively, and the arithmetic unit that outputs the calculated pixel values of the w*h pixel values in the pooling window is a third arithmetic unit.
The method according to any one of claims 9-14, wherein if the pooling mode is maximum pooling, the padding pixel value is the minimum pixel value;

If the pooling mode is average pooling, the padding pixel value is 0.
The method according to any one of claims 1-15, further comprising:

Perform layered processing on the original image to obtain multiple sub-images;

The pixel value of each pixel in each layer of sub-image is the same bit pixel value of each pixel in the original image;

The image to be processed is any sub-image in the multi-layer sub-image.
An image processing device, characterized in that the edge of the image to be processed is provided with padding pixels, and the image processing device includes: a first register group, a second register group, and a processor;

The processor is configured to: sequentially input the pixel values of the pixels in the nth row of the image to be processed into the first register group, where n is any integer greater than or equal to 1 and the pixel values include image pixel values and padding pixel values; and, when the padding pixel value adjacent to the last image pixel value in the nth row is input into the first register group, input the first image pixel value in the (n+1)th row of the image into the second register group.
The device according to claim 17, wherein the processor is further configured to:

When the w pixel values belonging to the same pooling window in the nth row have been input into the first register group, obtain the calculated pixel values of the w pixel values, where the width of the pooling window is w pixels, the height of the pooling window is h pixels, and the pooling window slides in the row direction according to a preset row step length and in the column direction according to a preset column step length;

When the calculated pixel values of the w pixel values in each row of the same pooling window have been obtained, determine the calculated pixel values of the w*h pixel values in the pooling window according to the calculated pixel values of the w pixel values in each row.
The device according to claim 18, wherein the first register group or the second register group includes at least w registers, and each register is used to register a single pixel value.
The device according to claim 18, further comprising a buffer;

The processor is further configured to store the calculated pixel values of the w pixel values in the buffer after obtaining the calculated pixel values of the w pixel values.
The device according to claim 20, wherein the buffer includes at least h sub-buffers, and the calculated pixel values of w pixel values of different rows in the same pooling window are stored in different sub-buffers.
The device according to claim 21, wherein the calculated pixel values of the w pixel values of different rows in the same pooling window are stored in the same position in the corresponding sub-buffer.
The device according to any one of claims 20-22, further comprising a third register set;

When the processor determines the calculated pixel values of w*h pixel values in the pooling window according to the calculated pixel values of w pixel values in each row, it is specifically used for:

When the calculated pixel values of the w pixel values of the last row in the pooling window are obtained, input the calculated pixel values of the w pixel values of the last row into the third register group while storing them in the buffer;

Read the calculated pixel values of the w pixel values of each of the other h-1 rows in the pooling window from the buffer, and input them into the third register group;

Perform an operation on the h calculated pixel values registered in the third register group, and determine the calculated pixel values of the w*h pixel values in the pooling window.
The device according to any one of claims 20-23, wherein the processor is further configured to:

After obtaining the calculated pixel values of the w pixel values in the (n+h)th row, replace the calculated pixel values of the w pixel values at the corresponding column position in the nth row in the buffer with the calculated pixel values of the w pixel values in the (n+h)th row.
The device according to any one of claims 18-24, wherein when the processor obtains the calculated pixel values of the w pixel values, it is specifically configured to: perform operations on the w pixel values according to the pooling mode to obtain the calculated pixel values.
The device according to claim 25, wherein if the pooling mode is maximum pooling, the calculated pixel value is the maximum pixel value; if the pooling mode is average pooling, the calculated pixel value is an average pixel value or an accumulated pixel value.
The device according to claim 26, further comprising an arithmetic unit;

When performing operations on the w pixel values according to the pooling mode to obtain the calculated pixel values, the processor is specifically configured to: output the pixel values registered in the first register group or the second register group to the arithmetic unit, so that the arithmetic unit outputs the calculated pixel values; and obtain the calculated pixel values output by the arithmetic unit.
The device according to claim 27, wherein the processor is further configured to:

If the pooling mode is maximum pooling, configure the arithmetic unit as a comparator;

If the pooling mode is average pooling, configure the arithmetic unit as an adder.
The device according to claim 27 or 28, wherein the processor is further configured to configure the input of the arithmetic unit according to the size of the pooling window.
The device according to any one of claims 27-29, wherein the arithmetic units that output the calculated pixel values of the w pixel values in adjacent rows are a first arithmetic unit and a second arithmetic unit, respectively, and the arithmetic unit that outputs the calculated pixel values of the w*h pixel values in the pooling window is a third arithmetic unit.
The device according to any one of claims 25-30, wherein if the pooling mode is maximum pooling, the padding pixel value is the minimum pixel value;

If the pooling mode is average pooling, the padding pixel value is 0.
The device according to any one of claims 17-31, wherein the processor is further configured to perform layered processing on the original image to obtain multiple layers of sub-images;

The pixel value of each pixel in each layer of sub-image is the same bit pixel value of each pixel in the original image;

The image to be processed is any sub-image in the multi-layer sub-image.
A movable platform, characterized by comprising: a movable platform body and the image processing device according to any one of claims 17-32, wherein the image processing device is installed on the movable platform body.
The movable platform according to claim 33, wherein the movable platform comprises a handheld phone, a handheld gimbal, a drone, an unmanned vehicle, an unmanned boat, a robot, or an autonomous vehicle.
A readable storage medium, characterized in that a computer program is stored on the readable storage medium; when the computer program is executed, the image processing method according to any one of claims 1-16 is realized.