CN113160026B - Image processing method, device, medium and electronic equipment - Google Patents

Image processing method, device, medium and electronic equipment

Info

Publication number
CN113160026B
CN113160026B
Authority
CN
China
Prior art keywords
image
neural network
current frame
image block
processing
Prior art date
Legal status
Active
Application number
CN202010014430.4A
Other languages
Chinese (zh)
Other versions
CN113160026A (en)
Inventor
张祎男
王振江
凌坤
Current Assignee
Beijing Horizon Robotics Technology Research and Development Co Ltd
Original Assignee
Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority to CN202010014430.4A
Publication of CN113160026A
Application granted
Publication of CN113160026B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00 General purpose image data processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/80 Camera processing pipelines; Components thereof

Abstract

The invention discloses an image processing method, an image processing apparatus, a medium, and an electronic device. The image processing method comprises: performing image signal processing on a sensing signal output by an image sensor to generate pixel values of a current frame image; monitoring the number of pixels for which pixel values have been generated so far; when that number reaches a predetermined number, performing neural-network-based processing on the image block formed by the generated pixel values; and obtaining the processing result of the current frame image from the processing results of all image blocks in the current frame image. The technical solution provided by the disclosure helps reduce the latency of processing the current frame image and thus improves the real-time performance of image processing. For intelligent driving applications, it helps improve driving safety.

Description

Image processing method, device, medium and electronic equipment
Technical Field
The present disclosure relates to computer vision technology, and more particularly, to an image processing method, an image processing apparatus, a storage medium, and an electronic device.
Background
In applications such as intelligent driving, images captured by a camera often need to undergo processing such as target recognition and target tracking. Such applications typically impose strict real-time requirements on this processing, and any latency in image processing can affect the safety of intelligent driving.
How to reduce the latency of processing images captured by the camera is therefore a technical problem deserving attention.
Disclosure of Invention
The present disclosure has been made to solve the above technical problem. Embodiments of the disclosure provide an image processing method, an image processing apparatus, a storage medium, and an electronic device.
According to an aspect of the embodiments of the present disclosure, there is provided an image processing method, including: performing image signal processing on a sensing signal output by an image sensor to generate pixel values of a current frame image; monitoring the number of pixels for which pixel values have been generated so far; when that number reaches a predetermined number, performing neural-network-based processing on the image block formed by the generated pixel values; and obtaining the processing result of the current frame image from the processing results of all image blocks in the current frame image.
According to another aspect of the embodiments of the present disclosure, there is provided an image processing apparatus, including: a signal processing module, configured to perform image signal processing on a sensing signal output by an image sensor to generate pixel values of a current frame image; a monitoring module, configured to monitor the number of pixels for which the signal processing module has generated pixel values; an operation processing module, configured to perform neural-network-based processing on the image block formed by the generated pixel values when the monitoring module detects that this number reaches a predetermined number; and a processing result acquisition module, configured to obtain the processing result of the current frame image from the operation processing module's results for all image blocks in the current frame image.
According to still another aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium storing a computer program for implementing the above method.
According to still another aspect of the embodiments of the present disclosure, there is provided an electronic device including: a processor; a memory for storing the processor-executable instructions; the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method described above.
According to the image processing method and apparatus provided by the embodiments of the present disclosure, whenever the generation of a predetermined number of pixel values is detected during image signal processing, neural-network-based processing is performed on the image block formed by the pixel values generated so far. This makes full use of the computing resources laid out for the neural network and shortens the time spent waiting to start neural-network-based processing. The technical solution provided by the disclosure therefore helps reduce the latency of processing the current frame image and improves the real-time performance of image processing. For intelligent driving applications, the processing result of each frame can be obtained in time, avoiding lag in generating and issuing control instructions and thereby improving driving safety.
The technical scheme of the present disclosure is described in further detail below through the accompanying drawings and examples.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing embodiments thereof in more detail with reference to the accompanying drawings. The accompanying drawings are included to provide a further understanding of embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, not to limit the disclosure. In the drawings, like reference numerals generally refer to like parts or steps.
FIG. 1 is a schematic illustration of a scenario in which the present disclosure is applicable;
FIG. 2 is a flow chart of one embodiment of an image processing method of the present disclosure;
FIG. 3 is a schematic diagram of one embodiment of the present disclosure performing an arithmetic process on an image block;
FIG. 4 is a flow chart of another embodiment of an image processing method of the present disclosure;
FIG. 5 is a schematic view of an embodiment of an image processing apparatus of the present disclosure;
FIG. 6 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments according to the present disclosure will be described in detail below with reference to the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present disclosure and not all of the embodiments of the present disclosure, and that the present disclosure is not limited by the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless it is specifically stated otherwise.
It will be appreciated by those of skill in the art that the terms "first," "second," etc. in embodiments of the present disclosure are used merely to distinguish between different steps, devices or modules, etc., and do not represent any particular technical meaning nor necessarily logical order between them.
It should also be understood that in embodiments of the present disclosure, "plurality" may refer to two or more, and "at least one" may refer to one, two or more.
It should also be appreciated that any component, data, or structure referred to in the presently disclosed embodiments may be generally understood as one or more without explicit limitation or the contrary in the context.
In addition, the term "and/or" in this disclosure is merely an association relationship describing an association object, and indicates that three relationships may exist, such as a and/or B, and may indicate: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" in the present disclosure generally indicates that the front and rear association objects are an or relationship.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and that the same or similar features may be referred to each other, and for brevity, will not be described in detail.
Meanwhile, it should be understood that, for convenience of description, the sizes of the respective parts shown in the drawings are not drawn to scale.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but, where appropriate, they should be considered part of the specification.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
Embodiments of the present disclosure are applicable to electronic devices such as terminal devices, computer systems, servers, etc., which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with the terminal device, computer system, or server, include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above systems, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc., that perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment. In a distributed cloud computing environment, tasks may be performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computing system storage media including memory storage devices.
Summary of the disclosure
In implementing the present disclosure, the inventors found that current image processing flows typically begin processing a video frame or photo only after the camera has finished generating the complete image. That is, while a video frame or photo is being generated, the image processing operation sits idle, so the latency of image processing is large. The higher the resolution of the image, the longer the image processing operation waits and the greater the latency. This is detrimental to the real-time performance of image processing. For intelligent driving, image processing latency affects the real-time performance of vehicle control, which is unfavorable for safe driving.
Exemplary overview
The vehicle 100 in fig. 1 may implement intelligent driving. For example, an ADAS (Advanced Driving Assistance System) in the vehicle 100 may provide assisted driving; as another example, the vehicle 100 may implement autonomous driving.
At least one camera 101 (only one camera 101 is shown in fig. 1) and an in-vehicle processing apparatus (not shown in fig. 1) are provided in the vehicle 100. When the vehicle 100 is in autonomous or assisted driving mode, the camera 101 captures in real time a video stream of the road on which the vehicle 100 travels. While the camera 101 is still forming a video frame, the in-vehicle processing apparatus can already begin image processing on the frame currently being generated.
For example, assume the in-vehicle processing apparatus needs a time period T to perform image processing on one complete video frame, that the camera 101 starts forming a video frame at time t1, and that it finishes forming the frame at time t2. Under these assumptions, the in-vehicle processing apparatus may start image processing on the frame being generated at time t3 and complete it at time t4, where t3 is later than t1 and earlier than t2, t4 is later than t2, and the difference between t4 and t3 is essentially the duration T. The image processing delay is thus shortened from the original T to T - (t2 - t3).
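This saving can be made concrete with a small arithmetic sketch; all time values below are illustrative assumptions, not figures from the disclosure:

```python
# Illustrative latency arithmetic for the scheme above (all values assumed).
t1 = 0.0    # camera starts forming the video frame (ms)
t2 = 33.0   # camera finishes forming the frame (roughly 30 fps)
t3 = 15.0   # in-vehicle processing starts on the partially formed frame
T  = 20.0   # time to run image processing on one complete frame

t4 = t3 + T                      # processing finishes (35.0 > t2, as required)
delay_conventional = T           # processing starts only at t2, ends at t2 + T
delay_pipelined = t4 - t2        # T - (t2 - t3) = 2.0 ms after frame completion

print(delay_conventional, delay_pipelined)
```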
The in-vehicle processing apparatus may perform image processing in the above manner while the camera 101 forms each video frame, obtain the target objects (for example, vehicles, pedestrians, traffic lights) detected in each frame from the processing results, and then generate and issue corresponding control instructions in real time to control the travel of the vehicle 100.
Control instructions generated and issued by the vehicle-mounted processing device include, but are not limited to: a speed maintaining control command, a speed adjusting control command, a direction maintaining control command, a direction adjusting control command, an early warning prompt control command and the like.
Exemplary method
Fig. 2 is a flowchart of one embodiment of an image processing method of the present disclosure. The method as shown in fig. 2 includes: s200, S201, S202, and S203. The steps are described separately below.
S200, performing image signal processing on the sensing signal output by the image sensor to generate a pixel value in the current frame image.
An image sensor in the present disclosure may refer to an optoelectronic device that converts an optical signal into an electrical signal. For example, an image sensor may convert the optical image on its light-receiving surface into an electrical signal proportional to that image. The image sensor may also be called a photosensitive element and may be the image sensor of a camera. The sensing signal in the present disclosure refers to this converted electrical signal. The camera in the present disclosure may be a camera equipped with an artificial intelligence chip, a camera in a smartphone, a camera in a vehicle, or the like.
The current frame image in the present disclosure may refer to a video frame or a photo or the like that is currently being formed. The spatial resolution of the current frame image (i.e., the width and height of the current frame image) may determine the number of pixels that the current frame image contains.
The form of the pixel values in this disclosure depends on the format of the current frame image. For example, if the current frame image is an RGB (Red Green Blue) image, the pixel values may be the R, G, and B values. If the current frame image is a YUV image (luminance component, hue component, and color saturation component), the pixel values may be the Y, U, and V values. If the current frame image is a grayscale image, the pixel value may be the gray value.
S201, monitoring the number of pixels of the current generated pixel value.
When the generated pixel values are stored in at least one predetermined storage area, monitoring the number of pixels for which values have been generated may amount to monitoring the fill state of each predetermined storage area. The storage space of each predetermined storage area is generally sized to the number of pixels contained in one image block. If all pixel values of one image block exactly fill one predetermined storage area, the monitoring may specifically be: monitoring whether the first predetermined storage area is full, whether the second predetermined storage area is full, and so on up to the last predetermined storage area.
S202, when the number of pixels of the current generated pixel value reaches a preset number, executing operation processing based on a neural network on an image block formed by the current generated pixel value.
The predetermined number in the present disclosure may be related to the spatial resolution of the current frame image and to the number of image blocks the current frame image contains.
For example, assume the spatial resolution of the current frame image is w (width) × h (height), and assume the neural-network-based processing of the current frame image is complete once each of its n image blocks has been processed; the predetermined number may then be (w×h)/n, where w, h, and n are positive integers greater than zero. n may be a factor of h and is typically equal to or greater than 2.
When the current frame image is stored across n predetermined storage areas, the present disclosure may, upon detecting that the first predetermined storage area is full, treat its contents as the first image block and perform neural-network-based processing on it; upon detecting that the second predetermined storage area is full, treat its contents as the second image block and process it likewise; and so on, until the nth predetermined storage area is full and its contents are processed as the nth image block, completing the neural-network-based processing of all image blocks of the current frame image.
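A minimal sketch of this per-area dispatch follows; the helper names, values of w, h, and n, and the callback structure are assumptions, and the real trigger is the interrupt-style signal described below rather than an explicit check after each write:

```python
# Sketch of block-wise dispatch over n predetermined storage areas.
w, h, n = 1280, 720, 8                 # frame resolution and number of blocks
predetermined_number = (w * h) // n    # pixel count that fills one storage area

storage_areas = [[] for _ in range(n)]

def run_nn_on_block(block_index, pixels):
    """Placeholder for the neural-network-based processing of one image block."""

def on_pixel_value(pixel_count_so_far, value):
    # Route each newly generated pixel value to its predetermined storage area
    # and dispatch the image block as soon as that area is full.
    area_index = pixel_count_so_far // predetermined_number
    area = storage_areas[area_index]
    area.append(value)
    if len(area) == predetermined_number:
        run_nn_on_block(area_index, area)
```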
In the present disclosure, when any predetermined storage area is detected to be full, the neural-network-based processing of the image block formed by the generated pixel values may be triggered, for example, by an interrupt signal.
The neural-network-based processing in the present disclosure may be determined by the actual requirements of the image processing task; for example, it may include, but is not limited to, processing based on an FPN (Feature Pyramid Network) or a CNN (Convolutional Neural Network).
S203, obtaining the processing result of the current frame image according to the operation processing results of all the image blocks in the current frame image.
After the processing results of all image blocks are obtained, they can be merged and the merged result further processed, for example by convolution or classification, to obtain the processing result of the current frame image.
By performing neural-network-based processing on the image block formed by the pixel values generated so far whenever a predetermined number of pixel values has been generated during image signal processing, the computing resources laid out for the neural network are fully utilized and the time spent waiting to start neural-network-based processing is shortened. For example, if the camera starts forming the current frame image at time t1 and finishes at time t2, the present disclosure can start neural-network-based processing on the already-generated part of the frame at a time t3 that is later than t1 but earlier than t2, reducing the wait before processing of the current frame begins from (t2 - t1) to (t3 - t1). The technical solution provided by the disclosure therefore helps reduce the latency of processing the current frame image and improves the real-time performance of image processing. For intelligent driving applications, the processing result of each frame can be obtained in time, avoiding lag in generating and issuing control instructions and thereby improving driving safety.
In an alternative example, the image processing method of the present disclosure may further include, before S200, determining the number of pixels included in each image block. All image blocks of the current frame image generally have the same size (e.g., width × height). The number of pixels per block may be determined according to at least one of: the spatial resolution of the current frame image, the computation array of the neural network, the input and output spatial resolutions of the neural network, the computing resources of the data processor performing the processing, and the buffer space of the first buffer in that data processor. That is, the image block size may be set with reference to these factors.
The spatial resolution of the current frame image refers to its width and height. The computation array of the neural network may refer to the amount of data each layer of the network must compute, for example the number of convolution kernels in a layer (such as n1 kernels) and the size of each kernel's weight matrix (such as n2×n2×c). The input spatial resolution of the neural network refers to the width and height of its input (such as the input image); the output spatial resolution refers to the width and height of its output (such as the output feature map). The computing resources of the data processor performing the processing may include the number of multiply-accumulate units it contains, the size of its memory (for example, its SRAM (Static Random-Access Memory)), and so on. The first buffer in the data processor may include a buffer for storing the outputs of the hidden layers of the neural network, a hidden layer being a layer between the input layer and the output layer.
Optionally, any two adjacent image blocks of the current frame image generally contain an overlapping image area, to ensure accuracy when performing neural-network-based processing on the pixels in the bottom rows of each block. For example, if one image block covers rows i through i+j of the current frame image, the next image block may cover rows i+j-x through i+2j-x. The value of x can be determined from the specifics of the neural-network-based processing, for example from the convolution kernel size and stride: for a 3×3×c kernel with stride 1, x may be 2.
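A sketch of how the overlapping row ranges could be derived follows; the rule x = kernel size - stride is an assumption generalizing the 3×3, stride-1 example above:

```python
# Sketch: derive the overlapping row ranges of consecutive image blocks.
# Rows are 1-based, as in the description above.
def block_row_ranges(total_rows, rows_per_block, kernel_size=3, stride=1):
    x = kernel_size - stride   # assumed overlap rule (3x3, stride 1 -> x = 2)
    ranges, start = [], 1
    while True:
        end = start + rows_per_block - 1
        ranges.append((start, min(end, total_rows)))
        if end >= total_rows:
            return ranges
        start = end - x        # next block re-reads the last x + 1 rows

# With N = 10 rows per block this yields (1, 10), (8, 17), (15, 24), ...
# matching rows 1..N, N-2..2N-3, 2N-5..3N-6 for x = 2.
print(block_row_ranges(total_rows=800, rows_per_block=10))
```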
Determining the number of pixels per image block with reference to these factors sets the block size reasonably, makes full use of the resources available for image processing, and reduces image processing latency as much as possible.
In one alternative example, the predetermined number may be expressed as a number of pixel rows. The present disclosure may divide an image into at least two image blocks in advance, each containing N rows, where N is an integer greater than zero and at most half the total number of pixel rows in the current frame image. For example, if the current frame image contains 800 rows of pixels, N may be 10, 20, 40, or the like. The value of N is likewise determined by the spatial resolution of the current frame image, the computation array of the neural network, the input and output spatial resolutions of the neural network, the computing resources of the data processor performing the processing, and the buffer space of the first buffer in that data processor.
Alternatively, the data processor that performs the neural-network-based processing on the image blocks may be a CPU (Central Processing Unit), FPGA (Field Programmable Gate Array), GPU (Graphics Processing Unit), ASIC (Application Specific Integrated Circuit), ASIP (Application Specific Instruction Set Processor), DSP (Digital Signal Processor), a dedicated neural network processor, or the like.
Optionally, S202 may specifically be: based on the preset number N of pixel rows per image block, generate a notification signal each time the pixel values buffered in the second buffer accumulate another block of N rows (for example, rows 1 through N, rows N-2 through 2N-3, rows 2N-5 through 3N-6, and so on, with adjacent blocks overlapping); the pixel values of all pixels of the corresponding image block can then be read from the second buffer in response to the notification signal, and neural-network-based processing performed on the block.
Alternatively, where a BPU performs the neural-network-based processing for each image block according to instructions in its configuration file, the notification signal may be generated by the ISP (Image Signal Processor). For example, each time the ISP generates and buffers the pixel values of N rows of the current frame image in the second buffer, it sends a notification signal (such as a data-ready signal) to the BPU according to the row count N in its configuration file; on receiving the signal, the BPU reads the N rows of pixel values from the second buffer and performs the neural-network-based processing on the resulting image block according to the corresponding instruction sequence in its configuration file.
Alternatively, in the same setting, the notification signal may be generated by a data processor other than the BPU (for example, a CPU). For example, each time the CPU detects that the ISP has buffered the pixel values of N rows of the current frame image in the second buffer, it sends a notification signal (such as a data-ready signal) to the BPU according to the row count N in its configuration file; the BPU then reads the N rows of pixel values from the second buffer and performs the neural-network-based processing on the resulting image block according to the corresponding instruction sequence in its configuration file.
Alternatively, the data-ready notification signal may be an interrupt signal of a data processor (e.g., the BPU). The buffer space of the second buffer is generally no smaller than the space occupied by the pixel values of one image block. In one example, it is no smaller than the space occupied by the whole current frame image, so the pixel values of all its image blocks can be stored in the second buffer. In another example, the buffer space is smaller than the current frame image but no smaller than one image block; in that case, the present disclosure may preset multiple second buffers, one per image block of the current frame image, and the data processor determines which second buffer corresponds to the current interrupt signal and reads the image block's pixel values from it.
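A minimal sketch of this hand-off follows; the queue stands in for the data-ready interrupt, and the buffer layout and the process_block callable are assumptions rather than the actual hardware interface:

```python
# Sketch of the ISP -> BPU hand-off via per-block second buffers.
from queue import Queue

notifications = Queue()   # carries "image block k is ready" signals
second_buffers = {}       # block index -> buffered N rows of pixel values

def isp_finishes_rows(block_index, rows):
    # The ISP has generated and buffered another N rows of the current frame.
    second_buffers[block_index] = rows
    notifications.put(block_index)

def bpu_loop(num_blocks, process_block):
    # The BPU wakes on each notification, reads the matching second buffer,
    # and runs the neural-network-based processing for that image block.
    for _ in range(num_blocks):
        k = notifications.get()
        process_block(k, second_buffers.pop(k))
```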
By dividing the current frame image into image blocks of N rows each, the blocks all have the same size, avoiding the inconvenience that unequal block sizes would bring to subsequent processing. Monitoring the buffer also makes it easy to track the number of pixels for which values have been generated.
In an optional example, performing the neural-network-based processing on the image block formed by the generated pixel values in S202 may specifically be: determining the instruction sequence corresponding to the image block, and performing the neural-network-based processing on the block according to that instruction sequence.
Optionally, an instruction sequence generally includes multiple instructions, and the instruction sequences corresponding to different image blocks generally contain the same number of instructions, though they need not. The instructions in different sequences are typically not identical; for example, two read instructions may share the same name but carry different parameters, such as different second-buffer addresses to read from.
Optionally, the instruction sequence corresponding to an image block may be determined from the notification signal. For example, when the notification signal indicates that all pixel values of the first image block of the current frame image are stored in the second buffer, neural-network-based processing is performed on the first block according to its instruction sequence; when the signal indicates that all pixel values of the second image block are stored, the second block is processed according to its sequence; and so on, until the signal indicates that all pixel values of the last image block are stored and the last block is processed according to its sequence.
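A sketch of this notification-to-sequence lookup follows; the instruction contents and table layout are illustrative placeholders, in practice populated from the configuration file described next:

```python
# Sketch: select the instruction sequence matching the notified image block.
# The sequences differ mainly in carried parameters (e.g. second-buffer
# read addresses); the strings below are illustrative placeholders.
instruction_sequences = {
    0: ("read@0x1000", "preprocess", "conv", "store"),   # first image block
    1: ("read@0x2000", "preprocess", "conv", "store"),   # second image block
    # ... one entry per image block, taken from the configuration file
}

def on_block_ready(block_index, execute):
    for instruction in instruction_sequences[block_index]:
        execute(instruction)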
The correspondence between notification signals and instruction sequences may be set in the configuration file of the data processor. By setting an instruction sequence for each image block, each block receives its corresponding processing based on that sequence. The processing of all image blocks may take the form of interleaved parallel processing; an example is shown in fig. 3.
In fig. 3, the abscissa is the time axis T.
Assume the ISP finishes generating the first image block of the current frame image at time T1; the data processor may then perform neural-network-based processing on it using the instruction sequence corresponding to the first block (for example, instruction 0, instruction 1, ..., instruction n).
Assume the ISP finishes generating the second image block of the current frame image at time T2. Although the data processor may not yet have finished processing the first block, it can start processing the second block using its instruction sequence (for example, instructions n+1 through 2n). That is, from time T1 to time T2, the data processor's processing of the first image block runs in parallel with the ISP's generation of the second.
Assume the ISP finishes generating the third image block at time T3. Although the data processor may not yet have finished processing the second block, it can start processing the third block using its instruction sequence (for example, instructions 2n+1 through 3n). That is, from time T2 to time T3, the processing of the first image block, the processing of the second image block, and the ISP's generation of the third image block proceed in parallel.
And so on, until the ISP generates the last image block of the current frame image; although the data processor may not yet have finished processing the second-to-last block, it can start processing the last block using its instruction sequence. That is, over the final interval, the processing of the third-to-last block, the processing of the second-to-last block, and the ISP's generation of the last block proceed in parallel.
As the description of fig. 3 shows, the time from when the ISP starts generating the first image block to when the data processor completes the neural-network-based processing of all image blocks is generally less than the sum of the time the ISP needs to generate the whole current frame image and the time the data processor needs to process all of its image blocks.
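The timing relationship can be summarized with a short calculation; the block count and per-block times below are assumed values for illustration:

```python
# Sketch: total time of the interleaved pipeline of fig. 3 (values assumed).
n_blocks = 8
gen = 4.0    # time for the ISP to generate one image block
proc = 5.0   # time for the data processor to process one image block

sequential = n_blocks * gen + n_blocks * proc   # generate whole frame, then process
# Pipelined: block k can be processed as soon as it is generated, so the total
# is one block generation plus whichever stream dominates thereafter.
pipelined = gen + max(n_blocks * proc, (n_blocks - 1) * gen + proc)

print(sequential, pipelined)   # 72.0 vs 44.0 with these values
```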
By using per-block instruction sequences to process the image blocks, the processing of all blocks forms an interleaved parallel pipeline, reducing the latency of image processing.
In an alternative example, the instruction sequence corresponding to an image block includes a preprocessing instruction and at least one convolutional-neural-network-based operation instruction. The latter may include, but is not limited to: instructions for padding the cached data in a buffer, instructions for reading cached data from a buffer, instructions for performing multiply-accumulate operations on the read data, and so on. That is, performing the neural-network-based processing on an image block according to its instruction sequence in S202 may specifically be: first performing the preprocessing operation on the image block according to the preprocessing instruction and buffering the result in the third buffer; then performing convolutional-neural-network-based processing on that buffered result according to the convolutional-neural-network-based operation instructions. Such operation instructions refer to operations related to the convolutional neural network, such as multiply-accumulate and padding operations.
Alternatively, the result of the preprocessing operation may itself still be an image block; that is, the image block is preprocessed according to the preprocessing instruction, the preprocessed block is buffered in the third buffer, and the convolutional-neural-network-based processing is then applied to the buffered block according to the operation instructions. The result of the preprocessing operation could also be a feature map of the image block.
Optionally, the preprocessing operations may include, but are not limited to, adjustments of the image block driven by the input requirements of the convolutional neural network. For example, resizing the block so that the resized block meets the network's input size requirement; or resizing the block and then padding it so that the padded block meets that requirement. The preprocessing operations may also include other content, such as extracting feature maps of image blocks.
Optionally, the preprocessing instruction may be an instruction to resize the image block to several different sizes, forming a pyramid image block. That is, when the input of the convolutional neural network is a pyramid image, pyramid image blocks may be generated according to a preset preprocessing instruction to form that input. The preprocessing instruction may be set according to the size of each level of the preset pyramid image, the position of the image block within the current frame image, and the height and width of the current frame image.
One example: assume the input of the convolutional neural network is a three-level pyramid image whose bottom, middle, and top levels measure 1200×800, 600×400, and 300×200. Assume an image block is 1200 wide (the width of the current frame image) and 100 high (for simplicity, the overlap between adjacent blocks is ignored here), and the current frame image is 800 high. Under these assumptions, the preprocessing instruction may specify: first scale the image block, then pad the scaled block according to the width and height of each pyramid level, filling with a preset value (such as 0 or 255), to form a pyramid image block for input to the network. The scaling factors for width may be 1, one-half, and one-quarter, and likewise for height. The pyramid image block levels provided to the network then measure 1200×800, 600×400, and 300×200.
Another example: with the same three-level pyramid input (1200×800, 600×400, 300×200) and the same 1200×100 image block of an 800-row frame, the preprocessing instruction may instead specify: scale the image block and use the three scaled blocks directly as the pyramid image block, without value-based padding. With width and height scaling factors of 1, one-half, and one-quarter, the pyramid image block levels provided to the network measure 1200×100, 600×50, and 300×25.
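A sketch of the two variants follows, with the scale factors and sizes taken from the examples above:

```python
# Sketch of the two pyramid-block variants in the examples above.
# A 1200 x 100 block of a 1200 x 800 frame, per-level scales 1, 1/2, 1/4.
block_w, block_h = 1200, 100
level_sizes = [(1200, 800), (600, 400), (300, 200)]   # preset pyramid levels
scales = [1.0, 0.5, 0.25]

# Variant 1: scale, then pad with a preset value (e.g. 0 or 255) up to the
# full level size -> levels of 1200x800, 600x400, 300x200.
padded_levels = list(level_sizes)

# Variant 2: use the scaled blocks directly, with no value-based padding
# -> levels of 1200x100, 600x50, 300x25.
scaled_levels = [(int(block_w * s), int(block_h * s)) for s in scales]

print(padded_levels)   # [(1200, 800), (600, 400), (300, 200)]
print(scaled_levels)   # [(1200, 100), (600, 50), (300, 25)]
```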
Optionally, the third buffer in the present disclosure may be located inside the data processor performing the preprocessing operation. For example, the third buffer may be a buffer in the BPU. The third buffer may be part of the first buffer described above.
Preprocessing the image blocks and performing convolutional-neural-network-based processing on them provides the data basis for obtaining the convolutional-neural-network-based processing result of the current frame image. Adjusting each block into a pyramid image block provides the data basis for obtaining pyramid features of the current frame image.
In an optional example, performing the convolutional-neural-network-based processing on the image blocks buffered in the third buffer may specifically be: performing multi-layer deep feature extraction on them according to the convolutional-neural-network-based operation instructions and buffering each block's feature block in the fourth buffer corresponding to that block. The feature map of the current frame image can then be obtained by merging the feature blocks buffered in the fourth buffers of all image blocks of the current frame image; this may be called the intermediate feature map of the current frame image. The final feature map of the current frame image can be obtained by continuing feature extraction and other processing on the merged intermediate feature map.
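A sketch of the merge step follows; the disclosure does not fix the merge rule, so the plain row-wise concatenation here, which assumes each feature block covers a disjoint band of output rows (the input-side overlap serving only to make edge rows valid), is an assumption:

```python
import numpy as np

# Sketch: merge per-block feature blocks (row-wise) into the intermediate
# feature map of the current frame.
def merge_feature_blocks(feature_blocks):
    # feature_blocks: arrays of shape (rows_k, width, channels)
    return np.concatenate(feature_blocks, axis=0)

blocks = [np.zeros((25, 300, 64), dtype=np.float32) for _ in range(8)]
intermediate_feature_map = merge_feature_blocks(blocks)   # (200, 300, 64)
```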
Optionally, when the input of the convolutional neural network is a pyramid image block, each level of the pyramid image block buffered in the third buffer may be processed, so that the content stored in the block's fourth buffer is a pyramid feature block. When merging, the same-level feature blocks of all image blocks' pyramid feature blocks are merged separately, yielding an intermediate pyramid feature map of the current frame image. Continuing feature extraction and other processing on each level of this intermediate pyramid feature map yields the final feature map of the current frame image.
Optionally, when the convolutional-neural-network-based processing includes padding the input features, and the scaled image blocks are used directly as the pyramid image block, the instruction sequence of the present disclosure should include an operation instruction for adding invalid feature values to each level of the pyramid image block.
Alternatively, when a value-based padding process is used to form the pyramid image block, the convolutional-neural-network-based processing may be applied only to the valid data in the block, avoiding unnecessary operations on the filled values and saving computing resources. For example, the valid data of each level can be read from the pyramid image block according to the block's position in the current frame image and then processed.
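A short sketch of skipping the filled values follows; how validity would be tracked (here, a row count derived from the block's position in the frame) is an assumption:

```python
# Sketch: restrict the convolution to the valid rows of a padded pyramid level.
def process_valid_region(level, valid_rows, conv_layer):
    # level: padded array of shape (H, W, C); rows >= valid_rows hold only
    # the preset fill value, so skipping them saves multiply-accumulates.
    return conv_layer(level[:valid_rows])
```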
Optionally, the fourth buffer in the present disclosure may be located inside the data processor performing the preprocessing operation. For example, the fourth buffer may be a buffer in the BPU. The fourth buffer may be part of the first buffer described above.
Performing multi-layer deep feature extraction on the image blocks yields the feature blocks of all blocks of the current frame image and provides the data basis for its feature map. Continuing feature extraction on the feature map formed by merging all blocks' feature blocks provides a feasible way to obtain the final feature map of the current frame image quickly.
The image processing method of the present disclosure may be implemented by an image pickup apparatus having an artificial intelligence chip. An example is shown in fig. 4.
In fig. 4, the leftmost image is the current frame image 400 ultimately generated by the camera. The image sensor in the camera collects optical signals and outputs corresponding electrical signals; the ISP in the camera applies post-processing such as linear correction, denoising, dead-pixel removal, and white balance to these electrical signals, sequentially generating the pixel values of all pixels of the current frame image 400 in a fixed order. Each time the ISP generates the pixel values of a predetermined number of rows of the current frame image 400, an image block 401 (an image strip in fig. 4) is formed and the ISP raises an interrupt signal. The interrupt triggers the artificial intelligence chip in the camera to execute the instruction sequence corresponding to the image block 401: for example, the chip forms the block into a pyramid image block and performs convolutional-neural-network-based processing on it, producing a corresponding feature map 402 for each image block 401. The chip then merges the feature maps 402 of the image blocks 401 into an intermediate feature map 403 of the current frame image 400 and continues feature extraction and other processing on it, finally outputting the pyramid feature map 404 of the current frame image 400.
When the camera is recording video, it can sequentially output the pyramid feature map 404 of each recorded video frame obtained in the above manner.
The camera described above can generally be connected to other devices. For example, it may be installed in a smartphone and connected to the phone's data processor, or installed in a vehicle and connected to the data processor of the vehicle's in-vehicle processing apparatus. The connected device can perform operations such as target detection and recognition on the pyramid feature maps of the video frames output by the camera, enabling functions such as face recognition, intelligent driving, or target tracking. The present disclosure does not limit the functions implemented with the information output by the camera.
Exemplary apparatus
Fig. 5 is a schematic structural view of an embodiment of an image processing apparatus of the present disclosure. The apparatus of this embodiment may be used to implement the corresponding method embodiments of the present disclosure.
The image processing apparatus shown in fig. 5 includes: a signal processing module 500, a monitoring module 501, an operation processing module 502, and a processing result acquisition module 503. Optionally, the apparatus may further include a setting module 504.
The signal processing module 500 is configured to perform image signal processing on a sensing signal output by the image sensor to generate a pixel value in the current frame image.
The monitoring module 501 is configured to monitor the number of pixels for which the signal processing module 500 has currently generated pixel values.
Optionally, based on the number N of pixel rows contained in a preset image block, the monitoring module 501 may generate a notification signal (such as an interrupt signal) each time it monitors that the pixel values generated and buffered in the second buffer reach N pixel rows. Here, N is an integer greater than zero and less than or equal to one half of the total number of pixel rows contained in the current frame image.
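A minimal sketch of this monitoring behaviour is given below; the class name `RowMonitor` and the use of a Python callback in place of a hardware interrupt are illustrative assumptions.

```python
class RowMonitor:
    """Counts pixel rows written to the second buffer and emits a
    notification (here a callback standing in for an interrupt)
    every time N rows have accumulated."""

    def __init__(self, n_rows: int, on_block_ready):
        self.n_rows = n_rows
        self.on_block_ready = on_block_ready
        self._buffered_rows = []

    def push_row(self, row) -> None:
        self._buffered_rows.append(row)
        if len(self._buffered_rows) == self.n_rows:
            block, self._buffered_rows = self._buffered_rows, []
            self.on_block_ready(block)   # notification signal

monitor = RowMonitor(4, lambda b: print(f"image block of {len(b)} rows ready"))
for _ in range(12):                      # pretend ISP output, 12 rows
    monitor.push_row([0] * 8)            # fires the callback 3 times
```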
The operation processing module 502 is configured to, when the monitoring module 501 detects that the number of pixels with generated pixel values reaches a predetermined number, perform neural-network-based operation processing on the image block formed by the pixel values currently generated by the signal processing module 500.
Optionally, the operation processing module 502 reads the pixel values from the second buffer upon receiving the notification signal (such as the interrupt signal) output by the monitoring module 501, and performs neural-network operation processing on the read pixel values. For example, the operation processing module 502 determines the instruction sequence corresponding to the image block formed by the currently generated pixel values, and then performs neural-network-based operation processing on the image block according to that instruction sequence.
Optionally, the instruction sequence corresponding to the image block may include: a preprocessing instruction and at least one convolutional-neural-network-based operation instruction. In this case, the operation processing module 502 may perform a preprocessing operation on the image block according to the preprocessing instruction and buffer the preprocessed image block in the third buffer; then, the operation processing module 502 performs convolutional-neural-network-based operation processing on the image block buffered in the third buffer according to the convolutional-neural-network-based operation instruction.
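A toy dispatcher for such a two-stage instruction sequence might look as follows. The tagged-tuple instruction encoding and the buffer dictionary are invented for this sketch; the normalization and pooling lambdas merely stand in for real preprocessing and convolution steps.

```python
import numpy as np

def run_instruction_sequence(block, instructions, buffers):
    """Execute a per-block sequence: a preprocessing step writes the
    third buffer, then conv-style steps read and update it."""
    for op, fn in instructions:
        if op == "preprocess":
            buffers["third"] = fn(block)          # cache preprocessed block
        elif op == "conv":
            buffers["third"] = fn(buffers["third"])
    return buffers["third"]

sequence = [("preprocess", lambda b: b.astype(np.float32) / 255.0),
            ("conv", lambda b: b[::2, ::2])]      # pooling stand-in
out = run_instruction_sequence(np.full((16, 32), 128, dtype=np.uint8),
                               sequence, buffers={})
print(out.shape)   # (8, 16)
```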
Optionally, one example of the preprocessing operation performed by the operation processing module 502 is: the operation processing module 502 resizes the image block according to the preprocessing instruction to form a pyramid image block. The preprocessing instruction is set according to the size of each layer of the preset pyramid image, the position of the image block in the current frame image, and the height and width of the current frame image.
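As a rough illustration of pyramid-block formation, the sketch below resizes one image block to several assumed layer scales; nearest-neighbour striding stands in for whatever resampling the preprocessing instruction actually encodes, and the scale factors are invented for the example.

```python
import numpy as np

PYRAMID_SCALES = (1, 2, 4)   # assumed per-layer down-scale factors

def make_pyramid_block(block: np.ndarray) -> list:
    """Resize one image block to each pyramid layer's size using
    nearest-neighbour striding (illustrative resampling only)."""
    return [block[::s, ::s] for s in PYRAMID_SCALES]

block = np.arange(16 * 32).reshape(16, 32)
for layer in make_pyramid_block(block):
    print(layer.shape)       # (16, 32), (8, 16), (4, 8)
```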
Optionally, one example of the convolutional-neural-network-based operation processing performed by the operation processing module 502 is: the operation processing module 502 performs a multi-layer depth feature extraction operation on the image blocks buffered in the third buffer according to the convolutional-neural-network-based operation instruction, and caches the feature block of each image block in the fourth buffer corresponding to that image block.
The processing result acquisition module 503 is configured to obtain the processing result of the current frame image according to the operation processing results produced by the operation processing module 502 for all image blocks in the current frame image. For example, the processing result acquisition module 503 may perform a feature extraction operation on the feature map of the current frame image formed by merging the feature blocks of the image blocks buffered in the fourth buffer, to obtain the final feature map of the current frame image.
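The merge-then-finalize step can be pictured as below; the concatenation axis and the strided-slice final operation are assumptions standing in for the actual feature-map layout and final feature extraction.

```python
import numpy as np

def merge_and_finalize(feature_blocks, final_op):
    """Concatenate per-block feature maps along the row axis (the
    fourth-buffer merge) and apply a final extraction pass."""
    merged = np.concatenate(feature_blocks, axis=0)
    return final_op(merged)

cached = [np.random.rand(8, 64) for _ in range(4)]   # 4 cached feature blocks
final_map = merge_and_finalize(cached, lambda m: m[::2, ::2])
print(final_map.shape)   # (16, 32)
```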
The setting module 504 is configured to determine the number of pixels contained in each image block according to at least one of: the spatial resolution of the current frame image, the computing array of the neural network, the input and output spatial resolutions of the neural network, the computing resources of the data processor performing the operation processing, and the buffer space of the first buffer in the data processor. The first buffer in the data processor includes a buffer for storing hidden-layer output information of the neural network.
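Under strong simplifying assumptions, the setting module's sizing decision might reduce to choosing the largest row count whose hidden-layer footprint fits the first buffer, capped at half the total number of pixel rows as the disclosure requires; the formula below is an illustration, not the disclosed rule, and the byte figures are invented.

```python
def rows_per_block(frame_rows: int, hidden_bytes_per_row: int,
                   buffer_bytes: int) -> int:
    """Pick N: the largest row count whose hidden-layer output fits the
    first buffer, with 0 < N <= frame_rows // 2. Purely illustrative."""
    n = min(buffer_bytes // hidden_bytes_per_row, frame_rows // 2)
    return max(n, 1)

# e.g. a 1080-row frame, 64 KiB buffer, 4 KiB of hidden output per row
print(rows_per_block(1080, 4096, 65536))   # -> 16
```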
For the operations performed by the above modules, reference may be made to the related descriptions in the above method embodiments; details are not repeated here.
Exemplary electronic device
An electronic device according to an embodiment of the present disclosure is described below with reference to fig. 6. Fig. 6 shows a block diagram of an electronic device according to an embodiment of the disclosure. As shown in fig. 6, the electronic device 61 includes one or more processors 611 and memory 612.
The processor 611 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device 61 to perform the desired functions.
Memory 612 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example: random Access Memory (RAM) and/or cache, etc. The nonvolatile memory may include, for example: read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer readable storage medium that can be executed by the processor 611 to implement the image processing methods and/or other desired functions of the various embodiments of the present disclosure described above. Various contents such as an input signal, a signal component, a noise component, and the like may also be stored in the computer-readable storage medium.
In one example, the electronic device 61 may further include an input device 613 and an output device 614, interconnected by a bus system and/or other forms of connection mechanisms (not shown). The input device 613 may include, for example, a keyboard, a mouse, and the like. The output device 614 can output various information to the outside and may include, for example, a display, speakers, a printer, as well as a communication network and the remote output devices connected thereto.
Of course, for simplicity, fig. 6 shows only some of the components of the electronic device 61 that are relevant to the present disclosure; components such as buses and input/output interfaces are omitted. In addition, the electronic device 61 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer readable storage Medium
In addition to the methods and apparatus described above, embodiments of the present disclosure may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform steps in an image processing method according to various embodiments of the present disclosure described in the "exemplary methods" section of the present description.
Program code for performing the operations of embodiments of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ and conventional procedural programming languages such as the "C" language. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium, having stored thereon computer program instructions, which when executed by a processor, cause the processor to perform steps in an image processing method according to various embodiments of the present disclosure described in the above "exemplary method" section of the present disclosure.
The computer readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present disclosure have been described above in connection with specific embodiments; however, it should be noted that the advantages, benefits, effects, and the like mentioned in the present disclosure are merely examples and not limitations, and these advantages, benefits, and effects should not be considered necessary to the various embodiments of the present disclosure. Furthermore, the specific details disclosed herein are provided for purposes of illustration and ease of understanding only and are not limiting, since the disclosure is not necessarily limited to practice with these specific details.
In this specification, the embodiments are described in a progressive manner, with each embodiment focusing on its differences from the others; for the same or similar parts among the embodiments, reference may be made to one another. Since the apparatus embodiments essentially correspond to the method embodiments, their description is relatively brief; for relevant details, refer to the description of the method embodiments.
The block diagrams of the devices, apparatuses, equipment, and systems referred to in this disclosure are merely illustrative examples and are not intended to require or imply that connections, arrangements, or configurations must be made in the manner shown in the block diagrams. As will be appreciated by those skilled in the art, these devices, apparatuses, equipment, and systems may be connected, arranged, or configured in any manner. Words such as "including", "comprising", and "having" are open-ended words meaning "including but not limited to" and may be used interchangeably therewith. The term "or" as used herein refers to, and is used interchangeably with, the term "and/or", unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as, but not limited to".
The methods and apparatus of the present disclosure may be implemented in a number of ways. For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, firmware. The above-described sequence of steps for the method is for illustration only, and the steps of the method of the present disclosure are not limited to the sequence specifically described above unless specifically stated otherwise. Furthermore, in some embodiments, the present disclosure may also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
It is also noted that in the apparatuses, devices, and methods of the present disclosure, components or steps may be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalent solutions of the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the disclosure to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, changes, additions, and sub-combinations thereof.

Claims (12)

1. An image processing method, comprising:
performing image signal processing on a sensing signal output by an image sensor to generate pixel values in a current frame image;
monitoring the number of pixels for which pixel values have currently been generated;
when it is monitored that the number of pixels with currently generated pixel values reaches a predetermined number, performing operation processing based on a neural network on an image block formed by the currently generated pixel values;
obtaining a processing result of the current frame image according to the operation processing results of all image blocks in the current frame image;
wherein the operation processing based on a neural network performed on all image blocks in the current frame image takes an interleaved, parallel processing form.
2. The method of claim 1, wherein the method further comprises:
determining the number of pixels contained in each image block according to at least one of the spatial resolution of the current frame image, the computing array of the neural network, the input and output spatial resolutions of the neural network, the computing resources of a data processor executing the operation processing, and the buffer space of a first buffer zone in the data processor;
wherein the first buffer in the data processor comprises: a buffer for storing hidden layer output information in the neural network.
3. The method according to claim 1 or 2, wherein the performing operation processing based on a neural network on an image block formed by the currently generated pixel values when it is monitored that the number of pixels with currently generated pixel values reaches a predetermined number comprises:
generating a notification signal each time the pixel values generated and buffered in a second buffer are monitored to reach N pixel rows, according to the number N of pixel rows contained in a preset image block;
reading the pixel values from the second buffer based on the notification signal, and performing operation processing of a neural network on the read pixel values;
wherein N is an integer greater than zero and less than or equal to one half of the total number of rows of pixels included in the current frame image.
4. The method according to any one of claims 1 to 2, wherein the performing operation processing based on a neural network on the image block formed by the currently generated pixel values comprises:
determining an instruction sequence corresponding to the image block formed by the currently generated pixel values, and performing operation processing based on a neural network on the image block according to the instruction sequence corresponding to the image block.
5. The method of claim 4, wherein the instruction sequence corresponding to the image block comprises: a preprocessing instruction and at least one operation instruction based on a convolutional neural network;
the performing operation processing based on a neural network on the image block according to the instruction sequence corresponding to the image block comprises:
performing preprocessing operation on the image blocks according to the preprocessing instruction, and caching the preprocessed image blocks in a third buffer area;
and executing operation processing based on the convolutional neural network on the image blocks cached in the third buffer area according to the operation instruction based on the convolutional neural network.
6. The method of claim 5, wherein the performing a preprocessing operation on the image block according to the preprocessing instruction comprises:
adjusting the size of the image block according to the preprocessing instruction to form a pyramid image block;
the preprocessing instruction is set according to the size of each layer of image in a preset pyramid image, the position of the image block in the current frame image and the height and width of the current frame image.
7. The method of claim 5, wherein the performing operation processing based on the convolutional neural network on the image blocks buffered in the third buffer according to the operation instruction based on the convolutional neural network comprises:
performing a multi-layer depth feature extraction operation on the image blocks buffered in the third buffer according to the operation instruction based on the convolutional neural network, and buffering the feature block of each image block in a fourth buffer corresponding to that image block.
8. The method of claim 7, wherein the performing operation processing based on the convolutional neural network on the image blocks buffered in the third buffer according to the operation instruction based on the convolutional neural network further comprises:
performing a feature extraction operation on the feature map of the current frame image formed by merging the feature blocks of the image blocks buffered in the fourth buffer, to obtain a final feature map of the current frame image.
9. An image processing apparatus comprising:
the signal processing module is used for performing image signal processing on the sensing signal output by the image sensor so as to generate a pixel value in the current frame image;
the monitoring module is used for monitoring the number of pixels for which the signal processing module has currently generated pixel values;
the operation processing module is used for performing operation processing based on a neural network on the image block formed by the currently generated pixel values when the monitoring module monitors that the number of pixels with currently generated pixel values reaches a predetermined number;
the processing result acquisition module is used for obtaining the processing result of the current frame image according to the operation processing results of the operation processing module for all image blocks in the current frame image;
wherein the operation processing based on a neural network performed on all image blocks in the current frame image takes an interleaved, parallel processing form.
10. The apparatus of claim 9, wherein the apparatus further comprises:
the setting module is used for determining the number of pixels contained in each image block according to at least one of the spatial resolution of the current frame image, the computing array of the neural network, the input and output spatial resolutions of the neural network, the computing resources of a data processor executing the operation processing and the buffer space of a first buffer zone in the data processor;
wherein the first buffer in the data processor comprises: a buffer for storing hidden layer output information in the neural network.
11. A computer readable storage medium storing a computer program which, when executed, implements the method of any one of claims 1-8.
12. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor being configured to read the executable instructions from the memory and execute the instructions to implement the method of any of the preceding claims 1-8.
CN202010014430.4A 2020-01-07 2020-01-07 Image processing method, device, medium and electronic equipment Active CN113160026B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010014430.4A CN113160026B (en) 2020-01-07 2020-01-07 Image processing method, device, medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN113160026A CN113160026A (en) 2021-07-23
CN113160026B true CN113160026B (en) 2024-03-05

Family

ID=76881721

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010014430.4A Active CN113160026B (en) 2020-01-07 2020-01-07 Image processing method, device, medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN113160026B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106717000A (en) * 2016-12-12 2017-05-24 深圳市大疆创新科技有限公司 An image signal processing method and device
CN106792270A (en) * 2016-12-08 2017-05-31 广东威创视讯科技股份有限公司 A kind of method for processing video frequency and system
CN109426473A (en) * 2017-08-25 2019-03-05 微软技术许可有限责任公司 Wireless programmable medium processing system


Similar Documents

Publication Publication Date Title
US11882357B2 (en) Image display method and device
JP7039265B2 (en) Multi-lens board imaging device and method
US11409986B2 (en) Trainable vision scaler
WO2021179820A1 (en) Image processing method and apparatus, storage medium and electronic device
US11157764B2 (en) Semantic image segmentation using gated dense pyramid blocks
CN110717851A (en) Image processing method and device, neural network training method and storage medium
US20200342291A1 (en) Neural network processing
US10839222B2 (en) Video data processing
US20180352134A1 (en) Reducing Or Eliminating Artifacts In High Dynamic Range (HDR) Imaging
US11699064B2 (en) Data processing using a neural network system
CN108513069B (en) Image processing method, image processing device, storage medium and electronic equipment
KR20220027233A (en) Feedback Decoder for Parametric-Efficient Semantic Image Segmentation
CN112602088A (en) Method, system and computer readable medium for improving quality of low light image
CN113962246A (en) Target detection method, system, equipment and storage medium fusing bimodal features
KR20210059576A (en) Method of processing image based on artificial intelligence and image processing device performing the same
US11074716B2 (en) Image processing for object detection
CN114170826B (en) Automatic driving control method and device, electronic device and storage medium
CN115294328A (en) Target detection frame generation method and device, storage medium and electronic equipment
US10789481B2 (en) Video data processing
CN113160026B (en) Image processing method, device, medium and electronic equipment
CN115908618B (en) Method, device, equipment and medium for generating reconstructed image based on pulse data
CN112132753A (en) Infrared image super-resolution method and system for multi-scale structure guide image
CN107454328A (en) Image processing method, device, computer-readable recording medium and computer equipment
US20200111001A1 (en) Computer vision optimisation for small value changes
US20240020798A1 (en) Method and device for generating all-in-focus image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant