WO2022166309A1 - Method and apparatus for processing image data of image sensor

Info

Publication number: WO2022166309A1
Application number: PCT/CN2021/131698
Authority: WO (WIPO/PCT)
Prior art keywords: image data, image, processing, sub, image sensor
Other languages: French (fr), Chinese (zh)
Inventors: Li Wenbin (李文斌), Duan Xiaoxiang (段小祥)
Original assignee: Huawei Technologies Co., Ltd. (华为技术有限公司)
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2022166309A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 Control of cameras or camera modules
    • H04N 23/70 Circuitry for compensating brightness variation in the scene
    • H04N 23/73 Circuitry for compensating brightness variation in the scene by influencing the exposure time

Description

  • The present application relates to the field of intelligent connected vehicles, and in particular to a method and apparatus for processing image sensor image data.
  • Perceptual recognition plays an important role in the advanced driver assistance systems (ADAS) and autonomous driving systems of intelligent connected vehicles.
  • To realize perceptual recognition, an intelligent connected vehicle is equipped with a variety of image sensors. Common examples of in-vehicle image sensors used for perceptual recognition are cameras, lidar (light detection and ranging, LiDAR) and millimeter-wave radar. The "image data" they produce is rich in information, so an intelligent connected vehicle can realize recognition functions through this image data.
  • The image data travels from the image sensor to the advanced driver assistance system or the autonomous driving system and, after image processing, algorithm recognition and driving decision-making, is finally transformed into operation control instructions for the various actuators of the vehicle. In this process, from a safety perspective, reducing the end-to-end delay from the generation of image data by the image sensor to the execution of operations by the vehicle's actuators is a goal of advanced driver assistance systems and autonomous driving systems.
  • Many image sensors, such as lidars and cameras, perceive by "scanning". Such a scanning image sensor does not generate all the image data of one frame of image simultaneously across its acquisition area, but generates it sequentially over a period of time, for example by scanning row by row or column by column.
  • In the prior art, the vehicle-mounted image signal processor (ISP) and the subsequent algorithm platform only process a frame of image after all the image data of that frame has been received. This processing approach has a great impact on the end-to-end delay.
  • the present application provides a method and apparatus for processing image sensor image data, which can reduce the end-to-end delay from the image sensor generating the image data to the vehicle actuator performing the operation.
  • According to a first aspect, a method for processing image data of an image sensor is provided, comprising the steps of: receiving first image data from an image sensor, the first image data being one of a plurality of image data that the image sensor can generate by scanning, within one scan period, the physical area corresponding to its acquisition area, where the acquisition area represents the acquisition range of the image sensor; performing image processing on the first image data to obtain second image data; and outputting the second image data.
  • In this way, the image processing flow of the first image data generated by the image sensor can be started in advance, thereby reducing the end-to-end delay from the generation of image data by the image sensor to the execution of operations by the vehicle's actuators. In other words, image processing can be performed on part of the first image data without waiting for all the first image data of the entire frame of image to be received; the image processing flow is started as soon as image data is generated.
  • the image sensor is a camera, and the image is a two-dimensional plane image.
  • the acquisition area of the camera is often referred to as the target surface.
  • The frame rate of a camera is generally 30 Hz (hertz); that is, the exposure of the camera from the first row of the target surface to the last row of the target surface to form one frame of image takes about 33 ms (milliseconds).
  • With the method for processing image data of the present application, the processing of the first image data can begin immediately upon receipt, without waiting for all the first image data of the whole frame of image to be received. This advances the image processing flow of the first image data of the planar image and saves processing time for each frame of planar image, thereby reducing the end-to-end delay from the generation of the first image data by the camera to the operation of the vehicle actuator.
  • The image sensor is a lidar, and the image is a three-dimensional point cloud.
  • the acquisition area of the lidar is often referred to as the scan area.
  • The frame rate of a lidar is generally 10 Hz or 20 Hz; that is, the lidar scans from the first column of the scanning area to the last column of the scanning area, so the duration to form the first image data of one frame of point cloud is typically 100 ms or 50 ms. As with the image data of a planar image, it is usually necessary to wait until the first image data of the whole frame of point cloud has been generated and transmitted before the received first image data of the whole frame of point cloud can be processed. With the method for processing image data of the present application, however, after receiving, for example, the first first image data of a frame of point cloud, that first image data can be processed immediately, without waiting to receive all the first image data of the whole frame of point cloud.
  • The collection area includes a plurality of sub-collection areas, and performing image processing on the first image data includes: after all the first image data contained in a first image data group A has been received, performing image processing in units of all the first image data contained in the first image data group A, where the first image data group A is the set of first image data generated by the image sensor scanning the physical area corresponding to one of the sub-collection areas.
  • By defining a first image data group for each sub-collection area of the collection area, the plurality of first image data of one frame of image are organized in advance. Performing image processing in units of a first image data group, as soon as a group has been received in full, advances the image processing flow of the first image data while avoiding excessive complexity in subsequent processing and the resulting increase in subsequent processing time; a rough sketch of such group-triggered processing follows below.
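As an illustration of this group-triggered flow, the sketch below buffers incoming first image data per sub-collection area and hands a group to image processing as soon as its last datum arrives. The function names, the row-range grouping and the generator-based plumbing are assumptions for illustration, not the patent's implementation:

```python
# Sketch only: `isp_process`, `row_of` and the row-range grouping are
# hypothetical stand-ins, not names from the patent.
from collections import defaultdict

def isp_process(group):
    """Placeholder for the image processing (e.g. an ISP pipeline) of one group."""
    return [("processed", datum) for datum in group]

def run_pipeline(incoming, row_of, groups):
    """incoming: iterable of first image data (e.g. rows as they are scanned);
    row_of: maps a datum to its row index in the acquisition area;
    groups: list of (first_row, last_row) spans, one per sub-collection area."""
    buffers = defaultdict(list)
    for datum in incoming:
        gid = next(i for i, (lo, hi) in enumerate(groups)
                   if lo <= row_of(datum) <= hi)
        buffers[gid].append(datum)
        lo, hi = groups[gid]
        if len(buffers[gid]) == hi - lo + 1:          # group complete
            # Process now, without waiting for the rest of the frame.
            yield gid, isp_process(buffers.pop(gid))
```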
  • each first image data set is defined by the sub-acquisition regions, which can be applied to image sensors of different resolutions.
  • the size and quantity of the plurality of sub-collection regions are preset.
  • the number of sub-collection regions is selected from 2 to 4.
  • each sub-collection area is equal.
  • the method further includes the following steps: receiving a division strategy, and presetting the size and quantity of the plurality of sub-collection areas according to the division strategy.
  • In this way, the image processing flow of the plurality of first image data can be flexibly adjusted according to the actual needs of the application scenario, so that the method for processing image sensor image data of the present application adapts to various application scenarios.
  • the sub-collection area is a rectangle, and the size of the sub-collection area is defined by coordinates of four corners of the rectangle.
  • each sub-collection area can be flexibly divided in a particularly simple and intuitive manner.
  • According to a second aspect, a method for processing image data of an image sensor is provided, comprising the steps of: sequentially receiving second image data, the second image data being obtained by performing image processing on first image data, where the first image data is one of a plurality of image data that the image sensor can generate by scanning, within one scan period, the physical area corresponding to its acquisition area, and the acquisition area represents the acquisition range of the image sensor; sequentially extracting feature data from the second image data; performing fusion recognition processing on the feature data; and outputting an image recognition result.
  • In this way, the feature data of the second image data originating from different parts of the acquisition area are extracted separately, and all the feature data extracted from all the second image data are fused in the fusion recognition step, which ensures the stability and accuracy of the image recognition result.
  • the image sensor is a camera, and the image is a two-dimensional plane image.
  • the image sensor is a lidar, and the image is a three-dimensional point cloud.
  • the second image data is grouped; the respective feature data is extracted from each group of the second image data.
  • By grouping the second image data, the plurality of second image data of one frame of image are organized in advance. In this way, the image recognition flow of the image data can be advanced while avoiding excessive complexity in subsequent processing and the resulting increase in subsequent processing time.
  • the quantity of the second image data of each group of the second image data is preset.
  • the collection area includes a plurality of sub-collection areas
  • the second image data is grouped according to the plurality of sub-collection areas
  • a group of second image data is obtained by performing image processing on a group of first image data
  • a group of first image data is the set of the first image data generated by the image sensor scanning the physical area corresponding to one of the sub-acquisition areas.
  • the method further includes the following steps: receiving a division strategy, and presetting the size and number of the plurality of sub-collection areas according to the received division strategy.
  • In this way, the size and number of the plurality of sub-collection areas are preset according to the received division strategy so as to adjust the grouping of the image data; the flow of extracting feature data from the plurality of image data can thus be flexibly adjusted to the actual needs of the application scenario, making the method for processing image sensor image data of the present application suitable for various application scenarios.
  • the sub-collection area is a rectangle, and the size of the sub-collection area is defined by coordinates of four corners of the rectangle.
  • each sub-collection area can be flexibly divided in a particularly simple and intuitive manner.
  • The extracting of feature data includes convolution processing and pooling processing.
  • The convolution processing and the pooling processing are performed alternately one or more times.
  • the convolution process includes one or more sub-convolution processes.
  • the number of sub-convolution processes in the convolution process is a natural number selected from 1 to 3.
  • The fusion recognition includes feature fusion processing.
  • In this way, the feature data extracted from the second image data originating from each sub-collection area can be effectively fused into the form of feature data of the entire frame of image.
  • The fusion recognition further includes fully connected processing.
  • The feature fusion processing includes concatenation (concat) feature fusion processing.
  • The fully connected processing includes one or more sub-fully-connected processes.
  • According to a third aspect, an image data processing apparatus is provided, comprising: a receiving module configured to receive first image data from an image sensor, where the first image data is one of a plurality of image data that the image sensor can generate by scanning, within one scanning period, the physical area corresponding to its acquisition area, and the acquisition area represents the acquisition range of the image sensor; and an image processing module configured to perform image processing on the first image data to obtain second image data and to output the second image data.
  • The collection area includes a plurality of sub-collection areas, and the image processing module is further configured to, after the receiving module receives all the first image data contained in a first image data group A, perform image processing in units of all the first image data contained in the first image data group A, where the first image data group A is the set of first image data generated by the image sensor scanning the physical area corresponding to one of the sub-collection areas.
  • the size and quantity of the plurality of sub-collection regions are preset.
  • the sub-collection area is a rectangle, and the size of the sub-collection area is defined by coordinates of four corners of the rectangle.
  • The receiving module is further configured to receive a division strategy, and the image processing module is further configured to preset the size and quantity of the plurality of sub-collection areas according to the division strategy.
  • Since the apparatus of the third aspect of the present application can execute the method of the first aspect, its advantages and benefits are similar to those of the first aspect; refer to the relevant descriptions of the first aspect, which are not repeated here.
  • According to a fourth aspect, an image recognition apparatus is provided, comprising: a receiving module configured to sequentially receive second image data, where the second image data is obtained by performing image processing on first image data, and the first image data is one of a plurality of image data that the image sensor can generate by scanning, within one scan period, the physical area corresponding to its acquisition area, the acquisition area representing the acquisition range of the image sensor; a feature extraction module configured to sequentially extract feature data from the second image data; and a fusion recognition module configured to perform fusion recognition processing on the feature data and output an image recognition result.
  • the quantity of the second image data of each group of the second image data is preset.
  • the collection area includes a plurality of sub-collection areas
  • the second image data is grouped according to the plurality of sub-collection areas
  • a group of second image data is obtained by performing image processing on a group of first image data
  • a group of first image data is the set of the first image data generated by the image sensor scanning the physical area corresponding to one of the sub-acquisition areas.
  • the size and quantity of the plurality of sub-collection regions are preset.
  • The receiving module is further configured to receive a division strategy, and the feature extraction module is further configured to preset the size and quantity of the plurality of sub-collection areas according to the division strategy.
  • The sub-collection area is a rectangle, and its size is defined by the coordinates of the four corners of the rectangle.
  • the feature extraction module includes a convolution layer and a pooling layer.
  • the convolutional layer includes one or more sub-convolutional layers.
  • the number of sub-convolutional layers in the convolutional layer is a natural number selected from 1 to 3.
  • the fusion identification module includes a feature fusion layer.
  • the fusion identification module further includes a fully connected layer.
  • The feature fusion layer includes a concatenation (concat) feature fusion layer.
  • the fully-connected layer includes one or more sub-fully-connected layers.
  • The receiving module is further configured to receive a division strategy, and the feature extraction module is further configured to preset the size and number of the plurality of second sub-collection areas according to the division strategy.
  • Since the apparatus of the fourth aspect of the present application can perform the method of the second aspect, its advantages and benefits are similar to those of the second aspect; refer to the relevant descriptions of the second aspect, which are not repeated here.
  • an image sensor image data processing system including any image data processing apparatus as in the third aspect and any image recognition apparatus as in the fourth aspect.
  • the image sensor image data processing system further includes a division management module, configured to provide a division strategy to the image data processing apparatus and the image recognition apparatus, the division The strategy is used to preset the size and quantity of the plurality of sub-collection areas of the collection area.
  • Since the system of the fifth aspect includes the apparatus of the third aspect and the fourth aspect, the advantages and benefits of the system of the fifth aspect include those of the third and fourth aspects; refer to the descriptions of the third and fourth aspects, which are not repeated here.
  • According to a sixth aspect, a driving system is provided, which includes any image sensor image data processing system of the fifth aspect and a driving decision-making unit, wherein the driving decision-making unit is connected to the image sensor image data processing system and is configured to execute behavioral decision-making and motion planning according to the image recognition result output by the image sensor image data processing system and to output operation instructions.
  • Since it includes the image sensor image data processing system, the driving system of the present application can advance the image data processing flow and save processing time, thereby reducing the end-to-end delay from the image sensor generating the image data to the vehicle actuator performing the operation.
  • the driving system is an advanced driving assistance system. In another possible implementation, the driving system is an autonomous driving system.
  • According to another aspect, a vehicle is provided, which includes an image sensor, any one of the driving systems of the sixth aspect, an electronic control unit, and an actuator connected in sequence, wherein the image sensor is configured to perceive the vehicle environment in a scanning manner and output first image data, and the electronic control unit is configured to control the actuator to perform operations according to the operation instructions of the driving system.
  • Since it includes the driving system, the vehicle of the present application can advance the image data processing flow and save processing time, thereby reducing the end-to-end delay from the image sensor generating the image data to the vehicle actuator performing the operation.
  • According to another aspect, a computing device is provided, comprising: at least one processor; and at least one memory connected to the processor and storing program instructions which, when executed by the at least one processor, cause the at least one processor to perform the method of any one of the first and second aspects.
  • Since the processor in the computing device of the present application can execute any one of the above methods for processing image sensor image data of the first and second aspects, the advantages and benefits of the computing device are similar to those of the first and second aspects; refer to the relevant descriptions of the first and second aspects, which are not repeated here.
  • According to another aspect, a computer-readable storage medium is provided, having program instructions stored thereon which, when executed by a computer, cause the computer to perform any one of the above methods for processing image sensor image data of the first and second aspects.
  • Since the computer-readable storage medium of the present application can enable a computer to perform any one of the above methods for processing image sensor image data of the first and second aspects, its advantages and benefits are similar to those of the first and second aspects; refer to the relevant descriptions of the first and second aspects, which are not repeated here.
  • FIG. 1 is a schematic diagram of an image data processing solution in the prior art;
  • FIG. 2 is a schematic diagram of an image data processing solution of the prior art and an embodiment of the present application, wherein the solution of the prior art is located at the upper part, and the solution of the embodiment of the present application is located at the lower part;
  • FIG. 3 is a schematic structural diagram of an image sensor image data processing system according to an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of an image data processing apparatus according to an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of an image recognition apparatus according to an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a method for processing image data according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of dividing a collection area by a dividing strategy in an embodiment of the present application.
  • FIG. 8 is a schematic diagram of the division of each sub-collection area of the image sensor collection area in FIG. 7;
  • FIG. 9 is a schematic flowchart of a feature extraction and fusion identification process in a method for processing image sensor image data according to an embodiment of the present application.
  • FIG. 10 is a schematic diagram of an image data processing solution of the prior art and another embodiment of the present application, wherein the solution of the prior art is located at the upper part, and the solution of the embodiment of the present application is located at the lower part;
  • FIG. 11 is a schematic diagram of an image data processing solution of the prior art and another embodiment of the present application, wherein the solution of the prior art is located at the upper part, and the solution of the embodiment of the present application is located at the lower part;
  • FIG. 12 is a schematic flowchart of a method for processing image sensor image data according to another embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a driving system according to an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of a vehicle according to an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of a computing device according to an embodiment of the present application.
  • Terms such as "first", "second", "third" or "region A, region B, region C" in the description and claims are only used to distinguish similar objects and do not indicate a specific ordering of objects; it should be understood that, where permitted, the specific order or sequence may be interchanged so that the embodiments of the application described herein can be practiced in sequences other than those illustrated or described herein.
  • The terms "first image data" and "second image data" used in this application both refer to image data, but the "second image data" is obtained by performing image processing on the "first image data"; the terms "first" and "second" distinguish the two.
  • The term "image sensor" as used in the specification and claims includes cameras and lidars.
  • The image sensor can generate a plurality of image data by scanning, within one scan cycle, the physical area corresponding to its acquisition area.
  • the acquisition area represents the acquisition range of the image sensor, and the acquisition area of the camera is also called the target surface.
  • the collection area of the lidar is also called the scanning area.
  • The term "image processing" refers to techniques for analyzing images with computing devices to achieve desired results.
  • the image processing applicable to the image data of different types of images is different.
  • Image processing of image data of two-dimensional planar images may include, but is not limited to, black level compensation, lens shading correction, bad pixel correction, demosaicing, Bayer-domain denoising, automatic white balance correction, color correction, gamma correction and color space conversion (RGB to YUV).
  • Image processing of image data of three-dimensional point clouds may include, but is not limited to, filtering, downsampling and outlier removal. Therefore, the "image processing" involved in the method embodiments of the present application may include one or more sub-image processings, and the "image processing module" involved in the apparatus embodiments of the present application may include one or more sub-image processing modules.
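As an illustration of the point-cloud branch of this definition, the sketch below applies downsampling and outlier removal with the Open3D library. The library choice and the parameter values are assumptions, since the patent does not prescribe any particular toolchain:

```python
# Illustrative sketch of point-cloud "image processing" steps
# (downsampling, outlier removal); Open3D is an assumed toolchain.
import numpy as np
import open3d as o3d

def process_point_cloud(points_xyz: np.ndarray) -> o3d.geometry.PointCloud:
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points_xyz)
    # Downsampling: keep one representative point per 10 cm voxel.
    pcd = pcd.voxel_down_sample(voxel_size=0.1)
    # Outlier removal: drop points far from their 20 nearest neighbors.
    pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
    return pcd
```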
  • The terms "feature extraction" or "extraction of feature data" used in the description and claims refer to techniques for deriving from the image data feature data that are informative and not redundant.
  • the feature extraction applicable to the image data of different types of images is different.
  • feature extraction for image data of two-dimensional planar images may include, but is not limited to, convolution and pooling.
  • Feature extraction for image data of three-dimensional point clouds may include, but is not limited to, feature vector extraction.
  • The term "fusion recognition" used in the description and claims refers to a technique in which, after all the feature data extracted from the image data are fused into a whole, recognition analysis is performed on the whole and an image recognition result is output.
  • the fusion recognition applicable to the image data of different types of images is different.
  • fusion recognition of image data for two-dimensional planar images may include, but is not limited to, feature fusion, fully connected, and output.
  • Fusion recognition of image data of three-dimensional point clouds may include, but is not limited to, feature point matching.
  • The term "feature fusion" used in the specification and claims is a process of fusing all the feature data extracted from the image data of the entire acquisition area into one.
  • The feature fusion involved in this application is an early fusion performed before the image recognition result is obtained, which may include, but is not limited to, concatenation (concat) feature fusion and parallel (add) fusion; a small illustration follows below.
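The two early-fusion variants named above can be sketched in a few lines; PyTorch is an assumed framework here, chosen only to make the tensor shapes concrete:

```python
# Sketch of the two early-fusion variants; PyTorch is an assumed framework.
import torch

feat_a = torch.randn(1, 64, 16, 16)  # feature map from one sub-area
feat_b = torch.randn(1, 64, 16, 16)  # feature map from another sub-area

fused_concat = torch.cat([feat_a, feat_b], dim=1)  # concat: (1, 128, 16, 16)
fused_add = feat_a + feat_b                        # add:    (1, 64, 16, 16)
```

Concat preserves all channels of every branch at the cost of a wider downstream layer, while add keeps the width fixed but requires the branches to have identical shapes.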
  • In the existing solution, recognition algorithm processing is then performed on the second image data (that is, the first image data after image processing) until time t3, when the recognition algorithm processing of the whole frame of image is completed.
  • the inventors of the present application found that such a method of processing image sensor image data has the following drawbacks.
  • First, the image processing flow cannot start until the image sensor has scanned the entire acquisition area, generated all the first image data and transmitted them, and the image recognition flow cannot start until all the first image data has been image-processed into second image data.
  • Each of these waiting periods delays the processing flow of the image sensor image data, resulting in a large end-to-end delay from the generation of the first image data by the image sensor to the execution of the operation by the vehicle actuator.
  • Second, since the image processing and the recognition algorithm processing are performed in units of the first or second image data of the whole frame of image, the first or second image data already received cannot be processed while the remaining first or second image data of the frame is still being received, which reduces the speed of processing the image sensor image data and increases the end-to-end delay.
  • In view of this, an embodiment of the present application provides a method for processing image data of an image sensor, including: receiving first image data from an image sensor, where the first image data is one of a plurality of image data that the image sensor can generate by scanning, within one scan period, the physical area corresponding to its acquisition area, and the acquisition area represents the acquisition range of the image sensor; performing image processing on the first image data to obtain second image data; and outputting the second image data.
  • FIG. 2 shows the existing solution (located in the upper part of FIG. 2 ) and the solution of the present application (located in the lower part of FIG. 2 ) in the same time axis.
  • the image sensor scans its acquisition area and sequentially generates a plurality of image data.
  • In the solution of the present application, the scanning area of the image sensor is pre-divided into three sub-acquisition areas A, B and C; when all the image data generated by scanning sub-acquisition area A has been received (i.e., at time t11), image processing on it begins. In other words, the processing method of the present application starts image processing before all the image data generated by scanning the entire acquisition area has been received (i.e., before time t1).
  • the method for processing image sensor image data of the present application can be applied to processing image data generated by various "scanning" image sensors, and can be applied to vehicles, robots and other devices having such image sensors.
  • the robot uses the camera to perceive the environment and plan and execute the corresponding movement according to the image recognition results of the camera's image data.
  • With the method for processing image sensor image data of the present application, an image recognition result can be obtained more quickly, so that the corresponding movement can be made in response to the result more quickly. Therefore, using the method for processing image sensor image data of the present application can make the motion of the robot more agile.
  • the vehicle uses lidar to perceive road conditions and plan and execute corresponding autonomous driving operations based on the image recognition results of lidar's point cloud image data.
  • the image recognition result can be obtained more quickly, so that the corresponding driving operation can be performed in response to the result more quickly. Therefore, using the method for processing image sensor image data of the present application can make the automatic driving of the intelligent networked vehicle safer.
  • The method for processing image sensor image data of the present application improves the solution for processing one frame of image, so that the time for processing one frame of image is shortened, thereby shortening the end-to-end delay. It should be understood that the method is also applicable to processing video comprising multiple frames of images: by applying the method for processing image sensor image data of the present application to each frame of image in the video, the processing time of the video can be shortened.
  • FIG. 3 optional modules of the device are represented by dashed boxes, that is, the optional modules may be omitted in other embodiments.
  • FIG. 3 exemplarily shows an image sensor image data processing system 1001 according to an embodiment of the present application, including an image data processing apparatus 1100 and an image recognition apparatus 1200 that are connected to each other.
  • FIG. 3 shows an optional division management module 1300 in a dashed box.
  • the division management module 1300 is respectively connected with the image data processing apparatus 1100 and the image recognition apparatus 1200, and is used for providing division policies to both.
  • the division strategy may be used to preset the size and quantity of the multiple sub-capturing regions of the image sensor's capturing region.
  • the division strategy provided by the division management module 1300 enables the collection area to be divided into three sub-collection areas A, B, and C equally.
  • FIG. 3 additionally shows image sensor 2000 .
  • The image sensor 2000 scans the acquisition area to sequentially generate a plurality of first image data constituting one frame of image, and sequentially sends the plurality of first image data to the image data processing apparatus 1100 in the image sensor image data processing system 1001 of the present application.
  • the image data processing apparatus 1100 includes a receiving module 1110 and an image processing module 1120 which are connected to each other.
  • the receiving module 1110 of the image data processing apparatus 1100 is configured to receive a plurality of first image data sent by the image sensor 2000 , and may also be configured to receive a division strategy provided by the division management module 1300 .
  • Since the image processing module 1120 pre-divides the acquisition area into three sub-acquisition areas A, B and C of equal size according to the division strategy received by the receiving module 1110, the first image data received by the receiving module 1110 can be divided into 3 groups of first image data, one group being generated when the image sensor scans each of the 3 different sub-acquisition areas.
  • the image processing module 1120 can thus be further configured to perform image processing on a set of first image data when the set of first image data has been received.
  • the image processing module 1120 is configured to start performing image processing on the image data when the receiving module 1110 receives all the first image data generated from the sub-collection area A at time t11. Then, at time t12, when the receiving module 1110 receives all the first image data generated from the sub-collection area B, the image processing module 1120 starts to perform image processing on these image data. Finally, at time t13, when the receiving module 1110 receives all the first image data generated from the sub-collection area C, the image processing module 1120 starts to perform image processing on these image data.
  • the image processing module 1120 may include a plurality of image processing sub-modules (not shown). Similarly, image processing may include multiple image sub-processing.
  • the image data processing apparatus 1100 may be, for example, an image signal processor.
  • the image recognition apparatus 1200 includes a receiving module 1210 , a feature extraction module 1220 and a fusion recognition module 1230 which are connected in sequence.
  • the receiving module 1210 of the image recognition apparatus 1200 is configured to receive a plurality of second image data.
  • The plurality of second image data are obtained by the image data processing apparatus 1100 performing image processing on the plurality of first image data, which are sequentially generated by the image sensor 2000 scanning the acquisition area.
  • the receiving module 1210 is further configured to receive the partition policy provided by the partition management module 1300 .
  • The feature extraction module 1220 is configured to sequentially extract feature data from the received second image data during the period in which the receiving module receives the plurality of second image data. Since the feature extraction module 1220 groups the second image data according to the division strategy received by the receiving module 1210 (that is, the acquisition area is pre-divided into three sub-acquisition areas A, B and C of equal size), the second image data received by the receiving module 1210 can likewise be divided into 3 different groups, so that the second image data originating from sub-acquisition areas A, B and C belong to different groups of second image data, respectively.
  • the feature extraction module 1220 is configured to extract feature data from a set of second image data when the set of second image data has been received.
  • For example, when all the second image data originating from sub-acquisition area A have been received, the feature extraction module 1220 starts to extract feature data from this set of second image data together.
  • The feature extraction module 1220 performs feature extraction on the second image data originating from sub-acquisition area B and sub-acquisition area C, respectively, in a similar manner.
  • The fusion recognition module 1230 is configured to, after the feature extraction module 1220 has extracted feature data from all the second image data originating from sub-collection areas A, B and C, first fuse all the feature data into a whole, then recognize the whole, and finally output the image recognition result at time t4.
  • Since both the feature extraction module 1220 and the fusion recognition module 1230 are involved in image recognition processing, in practice they are usually provided in the same entity, for example in the algorithm platform of the vehicle.
  • The image processing module 1120 and the feature extraction module 1220 respectively process the first image data generated from a sub-acquisition area or the second image data originating from a sub-acquisition area; that is, their processing objects are the first or second image data of one sub-acquisition area.
  • In contrast, the fusion recognition module 1230 of this embodiment is configured to perform fusion recognition on all the feature data of the second image data originating from the entire collection area as a whole; that is, its processing object is the feature data of all the second image data originating from the whole collection area.
  • the feature extraction module 1220 may include one or more convolutional layers (not shown) and one or more pooling layers (not shown).
  • the fusion recognition module 1230 may include one or more feature fusion layers (not shown), one or more fully connected layers (not shown), and an output layer (not shown). These layers may also each include one or more sublayers (not shown).
  • FIG. 6 exemplarily shows a schematic flowchart of a method for processing image data according to an embodiment of the present application, which includes the following steps S101 to S107:
  • In step S101, the division management module 1300 provides a division strategy. Specifically, the division strategy is used to preset the number of sub-collection areas of the collection area to 3 and to preset the sizes of the sub-collection areas to be equal to each other.
  • In step S102, the image data processing apparatus 1100 and the image recognition apparatus 1200 respectively receive the division strategy and preset each sub-collection area of the collection area according to the division strategy, thereby grouping the first image data and the second image data respectively.
  • Specifically, the receiving module 1110 of the image data processing apparatus 1100 and the receiving module 1210 of the image recognition apparatus 1200 each receive the division strategy, and the corresponding image processing module 1120 and feature extraction module 1220 divide the collection area according to the division strategy into first sub-collection areas and second sub-collection areas, respectively, which are both preset as sub-collection areas A, B and C of equal size. That is, the first sub-acquisition areas of the image data processing apparatus 1100 and the second sub-acquisition areas of the image recognition apparatus 1200 are all the sub-acquisition areas A, B and C.
  • In step S103, the image data processing apparatus 1100 receives a plurality of first image data sequentially generated by the image sensor scanning the acquisition area. Specifically, the receiving module 1110 of the image data processing apparatus 1100 receives the plurality of first image data sequentially generated and sent by the image sensor 2000 scanning the acquisition area.
  • In step S104, when the image data processing apparatus 1100 has received a set of first image data generated from one first sub-collection area, it performs image processing on the set of first image data and outputs the corresponding second image data.
  • Specifically, when the receiving module 1110 has received a set of first image data generated from one first sub-collection area, the image processing module 1120 performs image processing on the set of first image data and sequentially outputs the corresponding second image data obtained by the image processing.
  • In step S105, the image recognition apparatus 1200 receives a plurality of second image data sequentially output by the image data processing apparatus 1100.
  • the receiving module 1210 of the image recognition apparatus 1200 receives a plurality of second image data sequentially output by the image data processing apparatus 1100 .
  • In step S106, when the image recognition apparatus 1200 has received a set of second image data originating from one second sub-collection area, it extracts feature data from the set of second image data. Specifically, when the receiving module 1210 has received all the second image data originating from one second sub-collection area, the feature extraction module 1220 of the image recognition apparatus 1200 extracts feature data from these image data together.
  • In step S107, the image recognition apparatus 1200 performs fusion recognition processing on the feature data extracted from the plurality of second image data and outputs an image recognition result. Specifically, after the feature extraction module 1220 of the image recognition apparatus 1200 completes the extraction of feature data from the second image data originating from sub-collection areas A, B and C, respectively, the fusion recognition module 1230 of the image recognition apparatus 1200 fuses all the feature data into a whole, then recognizes the whole, and finally outputs the image recognition result.
  • Steps S101 to S107 are not necessarily performed strictly in chronological order.
  • For example, while the receiving module 1110 is still receiving first image data in step S103, the image processing module 1120 may in step S104 already perform image processing on the first image data generated from sub-collection area A. Therefore, step S103 and step S104 may overlap in time. Similarly, step S105 and step S106 may overlap in time.
  • In addition, since the collection area is divided into three sub-collection areas, steps S104 and S106 are each correspondingly repeated 3 times, to process the first image data generated from each sub-acquisition area and the second image data originating from each sub-acquisition area, respectively, as shown in FIG. 2.
  • the execution object of the image processing step S104 and the feature extraction step S106 in this embodiment is the first or second image data generated from or originating from a sub-acquisition area of the acquisition area.
  • the execution object of the fusion identification step S107 is all feature data extracted from the second image data originating from the entire collection area.
  • Therefore, equation (1) can be converted into: t3 − t4 = (2/3)·(t2 − t1) + (2/3)·T0.
  • Here, t1 is the start time of image processing on the first image data generated from the entire collection area in the existing solution, and is also the start time of image processing on the first image data generated from sub-collection area C in this embodiment of the present application;
  • t2 is the moment when the image processing ends and the algorithm processing starts in the existing solution;
  • t3 is the moment when the algorithm processing ends in the existing solution;
  • t4 is the moment when the fusion recognition ends in this embodiment of the present application;
  • T0 is the duration of the feature extraction that forms part of the algorithm processing in the existing solution, and is also the duration of feature extraction in this embodiment of the present application;
  • T1 is the duration of the fusion recognition that forms part of the algorithm processing in the existing solution, and is also the duration of fusion recognition in this embodiment of the present application.
  • One part of the delay benefit in this embodiment of the present application comes from advancing the image processing flow: in this embodiment, image processing has already been performed on 2/3 of the first image data of one frame of image before time t1, thereby reducing the end-to-end delay.
  • The other part of the delay benefit comes from advancing the image recognition flow: in this embodiment, feature extraction has already been performed on 2/3 of the second image data of the frame of image before time t1, thereby further reducing the end-to-end delay; see the derivation sketched below.
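The 2/3 factors can be reproduced from the variable definitions above. The following derivation is a reconstruction under the stated assumptions (three equal sub-collection areas, so image processing and feature extraction of one sub-area each take one third of the corresponding whole-frame duration); it is not quoted verbatim from the patent:

```latex
% Existing scheme: whole-frame image processing runs from t_1 to t_2,
% followed by feature extraction (T_0) and fusion recognition (T_1):
%   t_3 = t_2 + T_0 + T_1
% Proposed scheme: after t_1 only sub-area C remains to be processed:
%   t_4 = t_1 + (t_2 - t_1)/3 + T_0/3 + T_1
% Delay benefit:
\begin{align}
t_3 - t_4 &= \bigl(t_2 + T_0 + T_1\bigr)
           - \Bigl(t_1 + \tfrac{1}{3}(t_2 - t_1) + \tfrac{1}{3}T_0 + T_1\Bigr)\\
          &= \tfrac{2}{3}\,(t_2 - t_1) + \tfrac{2}{3}\,T_0
\end{align}
```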
  • FIG. 7 exemplarily shows a schematic diagram of dividing the acquisition area of the image sensor 2000 using the division strategy provided by the division management module 1300.
  • As shown in FIG. 7, a frame of image is generated by the camera on its target surface (also referred to in the present application as the acquisition area) by row exposure from top to bottom.
  • the collection area is divided into three rectangular sub-collection areas A, B, and C in the direction from top to bottom, and the sizes of each sub-collection area are set to be equal.
  • The image sensor scans sub-acquisition area A first, so it generates the first image data of sub-acquisition area A first and transmits the first image data generated from sub-acquisition area A to the image sensor image data processing system 1001 first.
  • The image sensor image data processing system 1001 therefore first performs image processing and feature extraction on the first image data generated from sub-acquisition area A, then processes the first image data generated from sub-acquisition area B, and finally processes the first image data generated from sub-acquisition area C.
  • the acquisition area of the camera can also be divided by the same division strategy for processing the second frame of images generated by subsequent scans, up to the Z-th frame image, where Z is any integer greater than 2. That is to say, the present application can process images of multiple frames, and thus can process objects composed of images of multiple frames, such as videos.
  • FIG. 8 is a schematic diagram of division of each sub-collection area of the image sensor collection area in FIG. 7 . It is assumed that the resolution of the acquisition area shown by FIG. 7 is 1920*1080.
  • If the upper-left corner of the collection area is taken as the imaginary coordinate origin, each sub-collection area A, B and C can be easily defined by the coordinates of the upper-left, lower-left, upper-right and lower-right corners of its rectangle; refer to Table 1 and FIG. 8 for the specific coordinates.
  • In this way, each sub-collection area of the collection area can be defined in a simple and flexible manner, and it is convenient to adjust the number of sub-collection areas and the size of each sub-collection area according to the actual situation; the sketch below reconstructs the coordinates.
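Table 1 itself is not reproduced in this text, but the corner coordinates follow directly from the stated setup (a 1920*1080 target surface, origin at the upper-left corner, three equal horizontal strips A, B and C). A small sketch that reconstructs them:

```python
# Reconstruction of the corner coordinates (Table 1 is not reproduced here):
# 1920*1080 target surface, origin at the upper-left corner, three equal
# horizontal strips A, B and C.
WIDTH, HEIGHT, N = 1920, 1080, 3

for i, name in enumerate("ABC"):
    top = i * HEIGHT // N                # first row of this sub-collection area
    bottom = (i + 1) * HEIGHT // N - 1   # last row of this sub-collection area
    print(name, {
        "upper_left": (0, top), "upper_right": (WIDTH - 1, top),
        "lower_left": (0, bottom), "lower_right": (WIDTH - 1, bottom),
    })
# A spans rows 0-359, B rows 360-719, C rows 720-1079.
```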
  • FIG. 9 exemplarily shows a schematic diagram of a flow chart of feature extraction and fusion recognition in a method for processing image data according to an embodiment of the present application.
  • In FIG. 9, the parts involved in the feature extraction step S106 and the parts involved in the fusion recognition step S107 are respectively marked with different dashed boxes.
  • conv1 indicates the first convolution process that can be performed by the first convolution layer of the feature extraction module 1220
  • conv2 indicates the second convolution process that can be performed by the second convolution layer of the feature extraction module
  • conv3 indicates the third convolution process that can be performed by the third convolution layer of the feature extraction module
  • pool1 to pool3 each indicate the first to third pooling processes that can be performed by the first to third pooling layers of the feature extraction module 1220, respectively
  • fc1 and fc2 each indicate the first and second fully connected processes that may be performed by the first and second fully connected layers of the fusion recognition module 1230, respectively
  • concat indicates a series of feature fusion processes that may be performed by the series feature fusion layer of the fusion recognition module 1230.
  • conv2_1 indicates the first sub-convolution process of the second convolution process, which can be performed by the first sub-convolution layer of the second convolution layer
  • conv2_2 indicates the second sub-convolution process of the second convolution process, which can be performed by the second sub-convolution layer of the second convolution layer.
  • conv3_1 to conv3_3 each indicate the first to third sub-convolution processes of the third convolution process that can be performed by the first to third sub-convolution layers of the third convolution layer, respectively.
  • For convenience of display, the processes involved in step S106 are not arranged in time sequence in FIG. 9.
  • When the feature extraction module 1220 performs the feature extraction step S106, it first performs, from top to bottom, each convolution and pooling process on the image data of sub-collection area A in the left column, then performs, from top to bottom, each convolution and pooling process on the image data of sub-collection area B in the middle column, and finally performs, from top to bottom, each convolution and pooling process on the image data of sub-collection area C in the right column.
  • In other words, the convolution and pooling of the second image data of sub-acquisition areas A, B and C are not performed simultaneously, but sequentially from left to right in units of the second image data of one sub-collection area.
  • When the fusion recognition module 1230 performs the fusion recognition step S107, it sequentially performs, from top to bottom, the concatenation (concat) feature fusion processing, the first fully connected processing, the second fully connected processing, and the output of the image recognition result.
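The topology of FIG. 9 can be sketched as follows. PyTorch is an assumed framework, and the channel widths, kernel sizes, input size and class count are illustrative values, since the patent specifies only the layer sequence (conv1/pool1, conv2_1 to conv2_2/pool2, conv3_1 to conv3_3/pool3 per sub-area, then concat, fc1, fc2):

```python
# Sketch of the FIG. 9 topology; PyTorch, channel widths, kernel sizes,
# input size and class count are illustrative assumptions.
import torch
import torch.nn as nn

class Branch(nn.Module):
    """conv1/pool1, conv2_1-conv2_2/pool2, conv3_1-conv3_3/pool3 for one sub-area."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),      # conv1
            nn.MaxPool2d(2),                                # pool1
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),    # conv2_1
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(),   # conv2_2
            nn.MaxPool2d(2),                                # pool2
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(),   # conv3_1
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(),   # conv3_2
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(),   # conv3_3
            nn.MaxPool2d(2),                                # pool3
        )

    def forward(self, x):
        return torch.flatten(self.net(x), 1)

class FusionRecognition(nn.Module):
    """concat of the three branch features, then fc1 and fc2."""
    def __init__(self, feat_dim, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(3 * feat_dim, 512)
        self.fc2 = nn.Linear(512, num_classes)

    def forward(self, feat_a, feat_b, feat_c):
        fused = torch.cat([feat_a, feat_b, feat_c], dim=1)  # concat fusion
        return self.fc2(torch.relu(self.fc1(fused)))

branch = Branch()  # run on each sub-area's second image data as it arrives
feat_dim = branch(torch.zeros(1, 3, 96, 96)).shape[1]
head = FusionRecognition(feat_dim)
logits = head(*(branch(torch.zeros(1, 3, 96, 96)) for _ in range(3)))
```

In use, one Branch pass would be run per sub-area as its second image data arrives, and FusionRecognition would run once all three feature vectors are available, matching the left-to-right then top-to-bottom order described above.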
  • In this embodiment, the image data processing apparatus 1100 takes a set of first image data generated from one sub-collection area as its processing object, while the image recognition apparatus 1200 takes all the second image data originating from the entire collection area as its processing object.
  • The method for image data processing according to this embodiment of the present application therefore includes: while receiving the plurality of first image data generated by scanning the entire acquisition area, sequentially performing image processing on each group of first image data generated from a sub-acquisition area and outputting the corresponding second image data; and, when image processing has been completed on all the first image data generated from the three sub-collection areas and all the second image data have been output, performing feature extraction and fusion recognition on all the second image data as a whole. Therefore, the delay benefit of this solution is only 2/3 of the image processing time; that is, this solution only benefits from advancing the image processing of the first image data generated from sub-collection areas A and B.
  • the sizes of the various sub-collection regions may be defined to be unequal to each other.
  • the size of the sub-collection area D is set to be 1/2 of the size of the sub-collection area E.
  • the image data processing apparatus or the image recognition apparatus may perform image processing or feature extraction on the received first or second image data at a start time that is not the time at which the image data is received.
  • As shown in the lower part of FIG. 11, the image data processing apparatus starts performing image processing on the first image data generated from sub-collection area D at time t16, which lies between time t14, when all the first image data of sub-collection area D has been received, and time t15, when all the first image data of sub-collection area E has been received.
  • In addition, the start time at which the image data processing apparatus performs image processing on all the first image data generated from sub-collection area E is the completion time of the image processing of all the first image data generated from sub-collection area D. Therefore, the delay benefit of this solution is only 1/3 of the image processing time; that is, this solution only benefits from advancing the image processing of all the first image data generated from sub-collection area D.
  • FIG. 2 , FIG. 10 and FIG. 11 it can be understood that the setting of the start time for image processing performed by the image data processing apparatus and feature extraction performed by the image recognition apparatus both have an impact on the delay benefit of the present application.
  • the setting of the number and size of the sub-collection areas of the collection area will also have an impact on the delay benefit of the present application.
  • the size and number of sub-collection areas of the collection area may be directly pre-configured according to preset values. Therefore, the division management module 1300 may be omitted, or steps S101 and S102 may be omitted. In this case, in combination with the solution shown in the lower part of FIG. 10 , the image data processing apparatus 1100 and the image recognition apparatus 1200 execute the following steps S201 - S205 correspondingly according to preset values, as shown in FIG. 12 .
  • In step S201, the image data processing apparatus 1101 receives a plurality of first image data sequentially generated by the image sensor scanning the acquisition area; specifically, they are received by the receiving module 1111 of the image data processing apparatus 1101.
  • In step S202, whenever the image data processing apparatus 1101 has received a set of first image data, it performs image processing on that set, a set of first image data being the first image data generated from one preset sub-acquisition area of the acquisition area. Specifically, in accordance with the preset values, the image processing module 1121 of the image data processing apparatus 1101 presets the sub-acquisition areas A, B and C of the acquisition area; whenever all the first image data generated from one of these sub-acquisition areas have been received, the image processing module 1121 performs image processing on that set of first image data and sequentially outputs the corresponding second image data obtained by the image processing.
  • In step S203, the image recognition apparatus 1201 receives the plurality of second image data sequentially output by the image data processing apparatus 1101; specifically, these second image data are received by the receiving module 1211 of the image recognition apparatus 1201.
  • In step S204, after receiving the plurality of second image data, the image recognition apparatus 1201 extracts feature data from them. Specifically, the feature extraction module 1221 of the image recognition apparatus 1201 extracts feature data from the plurality of second image data together once the receiving module 1211 has received all the second image data originating from the entire acquisition area.
  • In step S205, the image recognition apparatus 1201 performs fusion recognition processing on the feature data extracted from the plurality of second image data and outputs an image recognition result. Specifically, after the feature extraction module 1221 of the image recognition apparatus 1201 has extracted the feature data from the plurality of second image data, the fusion recognition module 1231 fuses the resulting feature data, performs recognition, and finally outputs the image recognition result; a minimal sketch of this flow follows below.
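A minimal, illustrative sketch of steps S201-S205 follows. The stage functions, the 480x640 frame, and the equal three-way split are assumptions for demonstration, not the patent's implementation:

```python
import numpy as np

def image_process(group: np.ndarray) -> np.ndarray:
    """Stand-in for step S202's image processing of one group."""
    return group.astype(np.float32) / 255.0

def extract_features(frame: np.ndarray) -> np.ndarray:
    """Stand-in for step S204's feature extraction on the whole frame."""
    return np.array([frame.mean(), frame.std()])

def fuse_and_recognize(features: np.ndarray) -> str:
    """Stand-in for step S205's fusion recognition."""
    return "obstacle" if features[0] > 0.5 else "clear"

# S201/S202: each preset group (sub-acquisition areas A, B, C) is
# image-processed as soon as all of its first image data has arrived.
groups = [np.random.randint(0, 256, (160, 640)) for _ in range(3)]
second_data = [image_process(g) for g in groups]

# S203-S205: once all second image data of the frame has been received,
# features are extracted together and fused into one recognition result.
whole_frame = np.vstack(second_data)
print(fuse_and_recognize(extract_features(whole_frame)))
```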
  • The first image data and the second image data of different frames can be grouped in different ways; for example, different division strategies or preset values may be used to pre-set the number and size of the sub-acquisition areas of the camera's acquisition area when scanning different frames. In one such embodiment, when processing the image data of the first frame, the division shown in FIG. 7 can be used, so that the acquisition area is divided into three sub-acquisition areas of equal size, while for another frame the division strategy of FIG. 11 can be adopted, i.e., the acquisition area is divided into two sub-acquisition areas of unequal size, one twice the size of the other.
  • the first image data and the second image data of the same frame can also be grouped in different ways.
  • In this case, the sub-acquisition areas of the acquisition area are set differently in the two flows: in the image data processing flow, the acquisition area is divided into 4 equal sub-acquisition areas F, G, H and I, while in the image recognition flow it is divided into 2 equal sub-acquisition areas J and K, where sub-acquisition area J coincides with the union of sub-acquisition areas F and G, and sub-acquisition area K coincides with the union of sub-acquisition areas H and I.
  • Each sub-acquisition area need not be defined by the coordinates of four corners; it may instead be defined by a rotation angle. For example, an angular extent of rotation from 0° to 90° may define one sub-acquisition area of the point cloud, as sketched below.
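For the angular definition, a sketch along the following lines is conceivable (the function name and the coordinate convention are assumptions):

```python
import numpy as np

def points_in_angular_sub_area(points: np.ndarray, start_deg: float,
                               end_deg: float) -> np.ndarray:
    """Return the points of an (N, 3) cloud whose azimuth angle around
    the sensor origin lies within [start_deg, end_deg)."""
    azimuth = np.degrees(np.arctan2(points[:, 1], points[:, 0])) % 360.0
    return points[(azimuth >= start_deg) & (azimuth < end_deg)]

cloud = np.random.randn(1000, 3)
first_group = points_in_angular_sub_area(cloud, 0.0, 90.0)  # the 0°-90° sub-area
```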
  • the number of layers and sub-layers of the feature extraction module and the fusion recognition module is adjustable.
  • The number of times each process and each sub-process of feature extraction and fusion recognition is performed can also be adjusted.
  • The image data processing apparatus receives first image data from an image sensor, where the first image data is one of a plurality of image data that can be generated by the image sensor scanning, in one scan period, the physical area corresponding to the acquisition area, the acquisition area representing the acquisition range of the image sensor.
  • the image data processing device performs image processing on the first image data to obtain second image data.
  • the image data processing device outputs the second image data. That is, in this embodiment, in the image processing flow, the first image data is not processed in a grouped manner.
  • Similarly, the image recognition device sequentially receives second image data, where the second image data is obtained by performing image processing on first image data, and the first image data is one of a plurality of image data that can be generated by the image sensor scanning, in one scan period, the physical area corresponding to the acquisition area, the acquisition area representing the acquisition range of the image sensor.
  • the image recognition device sequentially extracts feature data from the second image data.
  • the image recognition device performs fusion recognition processing on each feature data.
  • The image recognition device outputs an image recognition result. That is, in this embodiment, the second image data are not processed in groups in the image recognition flow either; a sketch of this ungrouped variant follows below.
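A hedged sketch of this ungrouped variant, in which each datum flows through both stages individually (all names and the row-wise data format are assumptions):

```python
from typing import Iterable, Iterator, List
import numpy as np

def image_processing_flow(first_data: Iterable[np.ndarray]) -> Iterator[np.ndarray]:
    """Each first image datum is processed and emitted as soon as it arrives."""
    for datum in first_data:
        yield datum.astype(np.float32) / 255.0  # stand-in image processing

def image_recognition_flow(second_data: Iterable[np.ndarray]) -> str:
    """Feature data is extracted per datum; fusion recognition runs at the end."""
    features: List[float] = [float(d.mean()) for d in second_data]
    return "obstacle" if float(np.mean(features)) > 0.5 else "clear"

rows = (np.random.randint(0, 256, (1, 640)) for _ in range(480))  # one row at a time
print(image_recognition_flow(image_processing_flow(rows)))
```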
  • FIG. 13 exemplarily shows a schematic structural diagram of a driving system 3001 according to an embodiment of the present application.
  • the driving system 3001 is an advanced driving assistance system (ADAS), which includes an image sensor image data processing system 1001 and a driving decision unit 3100 .
  • the image sensor image data processing system 1001 can be connected in communication with the camera 2001 outside the driving system 3001, process and recognize a plurality of first image data sequentially generated by the camera 2001 scanning the acquisition area, and output the image recognition result.
  • the driving decision unit 3100 is connected in communication with the image sensor image data processing system 1001, and is used for executing behavior decision and motion planning and outputting operation instructions according to the image recognition result output by the image sensor image data processing system 1001.
  • FIG. 14 exemplarily shows a schematic structural diagram of an intelligent networked vehicle V according to an embodiment of the present application.
  • The intelligent networked vehicle V includes a camera 2001, typically mounted at the front of the vehicle, a driving system 3001 installed in the vehicle, an electronic control unit 4001, and an actuator 5001 such as a braking mechanism.
  • the camera 2001 perceives the vehicle environment in a manner of scanning its acquisition area in rows, and sequentially outputs a plurality of first image data.
  • the driving system 3001 is connected in communication with the camera 2001 for outputting operation instructions according to a plurality of first image data from the camera 2001 .
  • The electronic control unit (ECU) 4001 is communicatively connected to the driving system 3001 and is used to control the actuator 5001 to perform operations according to the operation commands from the driving system 3001, for example, controlling the braking mechanism to perform a braking operation according to a braking command from the driving system.
  • FIG. 15 is an exemplary structural diagram of a computing device 1500 provided by an embodiment of the present application.
  • the computing device 1500 includes: a processor 1510 , a memory 1520 , a communication interface 1530 , and a bus 1540 .
  • the communication interface 1530 in the computing device 1500 shown in FIG. 15 may be used to perform communication with other devices.
  • the processor 1510 can be connected with the memory 1520.
  • The memory 1520 may be used to store program code and data. The memory 1520 may be a storage unit inside the processor 1510, an external storage unit independent of the processor 1510, or a combination of the two.
  • computing device 1500 may also include bus 1540 .
  • the memory 1520 and the communication interface 1530 may be connected to the processor 1510 through the bus 1540 .
  • The bus 1540 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • the bus 1540 can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one line is shown in FIG. 15, but it does not mean that there is only one bus or one type of bus.
  • The processor 1510 may adopt a central processing unit (CPU).
  • The processor may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the processor 1510 uses one or more integrated circuits to execute related programs to implement the technical solutions provided by the embodiments of the present application.
  • The memory 1520 may include read-only memory and random access memory, and provides instructions and data to the processor 1510.
  • The processor 1510 may also include non-volatile random access memory.
  • the processor 1510 may also store device type information.
  • The processor 1510 executes the computer-executable instructions in the memory 1520 to perform the operation steps of any of the above methods for processing image sensor image data.
  • In some implementations, the communication interface 1530 and the bus 1540 may be omitted.
  • The computing device 1500 may correspond to the corresponding execution subjects of the methods according to the various embodiments of the present application, and the above-mentioned and other operations and/or functions of the units in the computing device 1500 are respectively intended to realize the corresponding procedures of those methods; for brevity, they are not repeated here.
  • the disclosed systems, devices and methods may be implemented in other manners.
  • the apparatus implementations described above are only exemplary.
  • The division of the units is only a logical function division; in actual implementations there may be other division manners. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented.
  • The mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces as indirect coupling or communication connections between devices or units, and may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this implementation manner.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium.
  • The technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • Embodiments of the present application further provide a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program performs a method for processing image data of an image sensor, the method including at least one of the methods described in the foregoing embodiments.
  • the computer storage medium of the embodiments of the present application may adopt any combination of one or more computer-readable mediums.
  • the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium.
  • the computer readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer readable storage media include: electrical connections having one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), Erasable Programmable Read Only Memory (EPROM or Flash), fiber optics, portable compact disk read only memory (CD-ROM), optical storage, magnetic storage, cloud, or any suitable combination of the above.
  • a computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a propagated data signal in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport a program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Computer program code for performing the operations of the present application may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, or C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • The remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).


Abstract

The present application relates to a method and apparatus for processing image data of an image sensor. The method comprises: receiving first image data from an image sensor, the first image data being one of a plurality of pieces of image data generated by the image sensor scanning a physical region corresponding to an acquisition region in one scanning period, and the acquisition region representing an acquisition range of the image sensor; performing image processing on the first image data to obtain second image data; and outputting the second image data. The method and apparatus for processing image data of an image sensor of the present application can advance the processing flow for image data generated by the image sensor, thereby reducing the end-to-end delay from the image sensor generating image data to a vehicle actuator executing an operation. The method and apparatus for processing image data of an image sensor of the present application are particularly suitable for advanced driving assistance systems (ADASs), robot systems, and the like.

Description

Method and apparatus for processing image sensor image data
Technical Field
The present application relates to the field of intelligent networked vehicles, and in particular, to a method and device for processing image sensor image data.
Background
Perception recognition plays an important role in the advanced driving assistance systems (ADAS) and autonomous driving systems of intelligent networked vehicles. To realize the perception and recognition function, an intelligent networked vehicle is equipped with multiple image sensors. Common examples of in-vehicle image sensors used for intelligent perception recognition are cameras, lidar (Light Detection and Ranging, LiDAR), and millimeter-wave radar; the "image data" they obtain is rich in information, so intelligent networked vehicles can realize recognition functions based on these image data.
In an intelligent networked vehicle, image data travels from the image sensor to the advanced driving assistance system or autonomous driving system and, after image processing, algorithmic recognition, and driving decision-making, is finally transformed into operation control instructions for the vehicle's actuators, thereby realizing driving control. In this process, from a safety perspective, reducing the end-to-end delay from the generation of image data by the image sensor to the execution of operations by the vehicle's actuators is a constant goal of advanced driving assistance systems and autonomous driving systems.
Many in-vehicle image sensors perceive in a "scanning" manner, for example lidars and cameras. Such a scanning image sensor does not generate all the image data of a frame simultaneously across the acquisition area; rather, it sequentially generates all the image data of a frame over a period of time, for example by scanning row by row or column by column. At present, the vehicle-mounted image signal processor (ISP) and the subsequent algorithm platform wait until all the image data of a frame have been received before processing that frame. This processing approach has a large impact on the end-to-end delay.
Summary
In view of this, the present application provides a method and apparatus for processing image sensor image data, which can reduce the end-to-end delay from the image sensor generating image data to the vehicle actuator performing an operation.
In a first aspect, a method for processing image sensor image data is provided, including the following steps: receiving first image data from an image sensor, the first image data being one of a plurality of image data that can be generated by the image sensor scanning, in one scan period, the physical area corresponding to an acquisition area, the acquisition area representing the acquisition range of the image sensor; performing image processing on the first image data to obtain second image data; and outputting the second image data.
By performing image processing on already-received first image data while still receiving the plurality of image data that the image sensor can generate by scanning, in one scan period, the physical area corresponding to the acquisition area, the image processing flow for the first image data generated by the image sensor can be brought forward, thereby reducing the end-to-end delay from the generation of image data by the image sensor to the execution of operations by the vehicle actuators. In other words, when part of the first image data of a frame that can be generated by scanning the acquisition area has been received, image processing can be performed on that part of the first image data, without waiting for all the first image data of the entire frame before starting the image processing flow.
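As an illustration only, receiving and processing can overlap, for example with a producer-consumer pattern; the row format and the stand-in processing step here are hypothetical:

```python
import queue
import threading

data_q: queue.Queue = queue.Queue()

def receive(sensor_rows) -> None:
    """Producer: pushes each first image datum as the sensor emits it."""
    for row in sensor_rows:
        data_q.put(row)
    data_q.put(None)  # end-of-frame marker

def process() -> None:
    """Consumer: image-processes data while the scan is still in progress."""
    while (row := data_q.get()) is not None:
        _ = [v / 255.0 for v in row]  # stand-in image processing

rows = [[128] * 640 for _ in range(480)]  # simulated row-wise first image data
producer = threading.Thread(target=receive, args=(rows,))
producer.start()
process()
producer.join()
```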
In a possible implementation, the image sensor is a camera, and the image is a two-dimensional planar image.
The acquisition area of a camera is usually called the target surface. The frame rate of a camera is generally 30 Hz; in other words, it takes about 33 ms for the camera to expose from the first row of the target surface to the last row and thus form one frame of image. Conventionally, all the first image data of an entire frame must be generated and transmitted before the received image data of that frame can be processed; that is, after the camera starts scanning the target surface, at least 33 ms elapse before the first image data thus generated can be processed. With the method for processing image data of the present application, by contrast, processing can be performed on the first first image data of a frame immediately after it is received, without waiting for all the first image data of the entire frame. This brings forward the image processing flow for the first image data of the planar image and saves processing time for each frame, thereby reducing the end-to-end delay from the camera generating the first image data to the vehicle actuator performing an operation.
In a possible implementation, the image sensor is a lidar, and the image is a three-dimensional point cloud. The acquisition area of a lidar is often referred to as the scanning area.
The frame rate of a lidar is generally 10 Hz or 20 Hz; that is, it typically takes 100 ms or 50 ms for the lidar to scan from the first column of the scanning area to the last column and thus form the first image data of one frame of point cloud. As with planar images, it is conventionally necessary to wait until the first image data of the entire frame of point cloud have been generated and transmitted before processing them. With the method for processing image data of the present application, by contrast, the first first image data of a frame of point cloud can be processed immediately after being received, without waiting for all the first image data of the entire frame. This brings forward the image processing flow for the point cloud's first image data and saves processing time for each frame, thereby reducing the end-to-end delay from the lidar generating the first image data to the vehicle actuator performing an operation.
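A purely illustrative back-of-the-envelope calculation (the split into four groups is an assumption):

```python
scan_period_ms = 100.0          # one frame of a 10 Hz lidar
num_groups = 4                  # frame split into four equal groups

wait_conventional = scan_period_ms              # wait for the whole frame
wait_pipelined = scan_period_ms / num_groups    # wait only for the first group
print(wait_conventional - wait_pipelined)       # processing starts 75 ms earlier
```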
With reference to the first aspect, in a possible implementation, the acquisition area includes a plurality of sub-acquisition areas, and performing image processing on the first image data includes: after the first image data contained in a first image data group A have been received, performing image processing in units of all the first image data contained in the first image data group A, the first image data group A being the set of first image data generated by the image sensor scanning the physical area corresponding to one of the sub-acquisition areas.
By defining each first image data group through a sub-acquisition area of the acquisition area, the plurality of first image data of one frame are organized in advance. After the first image data contained in first image data group A have been received, image processing is performed in units of all the first image data contained in group A; that is, by performing image processing with a first image data group as the unit, the image processing flow for the first image data can be brought forward while avoiding an excessive increase in the complexity, and hence duration, of subsequent processing. In addition, defining each first image data group via sub-acquisition areas makes the approach applicable to image sensors of different resolutions.
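One conceivable way to assign incoming scan rows to first image data groups is sketched below; the equal-split mapping, frame size, and names are assumptions for illustration:

```python
from typing import Dict, List

def group_of_row(row: int, rows_per_frame: int, num_groups: int) -> int:
    """Map a scan row index to its sub-acquisition-area group (equal split)."""
    return min(row * num_groups // rows_per_frame, num_groups - 1)

buffers: Dict[int, List[list]] = {}
rows_per_group = 480 // 3
for row_idx in range(480):                       # rows arrive sequentially
    g = group_of_row(row_idx, 480, 3)
    buffers.setdefault(g, []).append([0] * 640)  # stand-in row datum
    if len(buffers[g]) == rows_per_group:
        pass  # group complete: hand buffers[g] to image processing as one unit
```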
With reference to the first aspect, in a possible implementation, the size and number of the plurality of sub-acquisition areas are preset.
In a possible implementation, the number of sub-acquisition areas is from 2 to 4.
In a possible implementation, the sub-acquisition areas are equal in size.
With reference to the first aspect, in a possible implementation, the method further includes: receiving a division strategy, and presetting the size and number of the plurality of sub-acquisition areas according to the division strategy.
By receiving a division strategy to preset the size and number of the sub-acquisition areas, the image processing of the plurality of first image data can be flexibly adjusted according to the actual needs of the application scenario, so that the method for processing image sensor image data of the present application adapts to various application scenarios.
With reference to the first aspect, in a possible implementation, the sub-acquisition area is a rectangle, and the size of the sub-acquisition area is defined by the coordinates of the four corners of the rectangle.
Defining the size of a sub-acquisition area by the coordinates of its four corners allows the sub-acquisition areas to be divided flexibly in a particularly simple and intuitive way.
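An illustrative data structure for such a corner-defined rectangular sub-acquisition area (the names and the 640x480 target surface are assumptions):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RectSubArea:
    """Rectangular sub-acquisition area; the four corners follow from the
    top-left (x0, y0) and bottom-right (x1, y1) coordinates."""
    x0: int
    y0: int
    x1: int
    y1: int

    def contains(self, x: int, y: int) -> bool:
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1

# An assumed 640x480 target surface divided into three equal row bands.
sub_areas = [RectSubArea(0, i * 160, 640, (i + 1) * 160) for i in range(3)]
```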
In a second aspect, a method for processing image sensor image data is provided, including the following steps: sequentially receiving second image data, the second image data being obtained by performing image processing on first image data, the first image data being one of a plurality of image data that can be generated by the image sensor scanning, in one scan period, the physical area corresponding to an acquisition area, the acquisition area representing the acquisition range of the image sensor; sequentially extracting feature data from the second image data; performing fusion recognition processing on the feature data; and outputting an image recognition result.
By sequentially extracting feature data from already-received second image data while second image data are still being received, the image recognition flow for the second image data can be brought forward, thereby reducing the end-to-end delay from the image sensor generating the first image data to the vehicle actuator performing an operation. In other words, when part of the second image data of a frame has been received, feature data extraction, as one part of the recognition algorithm's processing, can be performed on that part, without waiting for all the second image data of the entire frame before starting feature extraction.
Furthermore, by dividing the traditional recognition-algorithm processing into a feature extraction process and a fusion recognition process, for example extracting, during feature extraction, the feature data of second image data originating from different parts of the acquisition area separately, and then, during fusion recognition, fusing and recognizing the feature data extracted from all the second image data, the stability and accuracy of the fused image recognition result can be ensured.
With reference to the second aspect, in a possible implementation, the image sensor is a camera and the image is a two-dimensional planar image. With reference to the second aspect, in a possible implementation, the image sensor is a lidar and the image is a three-dimensional point cloud.
With reference to the second aspect, in a possible implementation, the second image data are grouped, and each feature data is extracted from one group of the second image data.
By grouping the second image data, the plurality of second image data of one frame are organized in advance. By extracting the feature data from each group of second image data, i.e., with one group of second image data as the extraction unit, the image recognition flow for the image data can be brought forward while avoiding an excessive increase in the complexity, and hence duration, of subsequent processing.
With reference to the second aspect, in a possible implementation, the number of second image data in each group of second image data is preset.
With reference to the second aspect, in a possible implementation, the acquisition area includes a plurality of sub-acquisition areas, and the second image data are grouped according to the sub-acquisition areas: a group of second image data is a group of first image data that has undergone image processing, the group of first image data being the set of first image data generated by the image sensor scanning the physical area corresponding to one of the sub-acquisition areas.
With reference to the second aspect, in a possible implementation, the method further includes: receiving a division strategy, and presetting the size and number of the plurality of sub-acquisition areas according to the received division strategy.
By receiving a division strategy to preset the size and number of the sub-acquisition areas, and thereby adjusting the grouping of the image data, the process of extracting feature data from the plurality of image data can be flexibly adjusted according to the actual needs of the application scenario, so that the method for processing image sensor image data of the present application adapts to various application scenarios.
With reference to the second aspect, in a possible implementation, the sub-acquisition area is a rectangle, and the size of the sub-acquisition area is defined by the coordinates of the four corners of the rectangle.
Defining the size of a sub-acquisition area by the coordinates of its four corners allows the sub-acquisition areas to be divided flexibly in a particularly simple and intuitive way.
With reference to the second aspect, in a possible implementation, the feature data extraction includes convolution processing and pooling processing.
In a possible implementation, the convolution processing and the pooling processing are performed alternately one or more times. In a possible implementation, a convolution processing includes one or more sub-convolution processes. In a possible implementation, the number of sub-convolution processes in a convolution processing is a natural number from 1 to 3.
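A minimal, dependency-light sketch of alternating convolution and pooling (single channel; the kernel and the sizes are arbitrary illustrations, not the patent's network):

```python
import numpy as np

def conv2d(img: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid-mode single-channel 2D convolution with stride 1."""
    kh, kw = kernel.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(img: np.ndarray, size: int = 2) -> np.ndarray:
    """Non-overlapping max pooling."""
    h = img.shape[0] // size * size
    w = img.shape[1] // size * size
    return img[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

features = np.random.rand(32, 32)
for _ in range(2):  # convolution and pooling performed alternately twice
    features = max_pool(conv2d(features, np.ones((3, 3)) / 9.0))
```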
With reference to the second aspect, in a possible implementation, the fusion recognition includes feature fusion processing.
Through feature fusion processing, the feature data extracted from the second image data originating from the individual sub-acquisition areas can be effectively fused into the form of feature data of the entire frame.
With reference to the second aspect, in a possible implementation, the fusion recognition further includes full connection processing.
Through full connection processing, global analysis and recognition can be performed on the features of the image data of the entire frame.
In a possible implementation, the feature fusion processing includes concat (series feature fusion) processing. In a possible implementation, the full connection processing includes one or more sub-full-connection processes.
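An illustrative sketch of concat fusion followed by two sub-full-connection steps (the weights are random placeholders, not a trained model):

```python
import numpy as np

def concat_fuse(group_features: list) -> np.ndarray:
    """Concat-style feature fusion: join the groups' feature vectors end to end."""
    return np.concatenate(group_features)

def fully_connected(x: np.ndarray, w: np.ndarray, b: np.ndarray) -> np.ndarray:
    return w @ x + b

group_features = [np.random.rand(8) for _ in range(3)]  # features of 3 groups
fused = concat_fuse(group_features)                     # shape (24,)

w1, b1 = np.random.rand(16, 24), np.random.rand(16)     # first full connection
w2, b2 = np.random.rand(4, 16), np.random.rand(4)       # second full connection
hidden = np.maximum(fully_connected(fused, w1, b1), 0.0)
scores = fully_connected(hidden, w2, b2)
print(int(scores.argmax()))                             # recognition output
```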
In a third aspect, an image data processing apparatus is provided, including: a receiving module configured to receive first image data from an image sensor, the first image data being one of a plurality of image data that can be generated by the image sensor scanning, in one scan period, the physical area corresponding to an acquisition area, the acquisition area representing the acquisition range of the image sensor; and an image processing module configured to perform image processing on the first image data to obtain second image data, and to output the second image data.
With reference to the third aspect, in a possible implementation, the acquisition area includes a plurality of sub-acquisition areas, and performing image processing on the first image data includes: the image processing module being further configured to, after the receiving module has received the first image data contained in a first image data group A, perform image processing in units of all the first image data contained in the first image data group A, the first image data group A being the set of first image data generated by the image sensor scanning the physical area corresponding to one of the sub-acquisition areas.
With reference to the third aspect, in a possible implementation, the size and number of the plurality of sub-acquisition areas are preset.
With reference to the third aspect, in a possible implementation, the sub-acquisition area is a rectangle, and the size of the sub-acquisition area is defined by the coordinates of the four corners of the rectangle.
With reference to the third aspect, in a possible implementation, the receiving module is further configured to receive a division strategy, and the image processing module is further configured to preset the size and number of the plurality of sub-acquisition areas according to the division strategy.
Since the apparatus of the third aspect of the present application can perform the method of the first aspect, its advantages and benefits are similar to those of the first aspect; reference is made to the relevant description of the first aspect, which is not repeated here.
In a fourth aspect, an image recognition apparatus is provided, including: a receiving module configured to sequentially receive second image data, the second image data being obtained by performing image processing on first image data, the first image data being one of a plurality of image data that can be generated by the image sensor scanning, in one scan period, the physical area corresponding to an acquisition area, the acquisition area representing the acquisition range of the image sensor; a feature extraction module configured to sequentially extract feature data from the second image data; and a fusion recognition module configured to perform fusion recognition processing on the feature data and to output an image recognition result.
With reference to the fourth aspect, in a possible implementation, the number of second image data in each group of second image data is preset.
With reference to the fourth aspect, in a possible implementation, the acquisition area includes a plurality of sub-acquisition areas, and the second image data are grouped according to the sub-acquisition areas: a group of second image data is a group of first image data that has undergone image processing, the group of first image data being the set of first image data generated by the image sensor scanning the physical area corresponding to one of the sub-acquisition areas.
With reference to the fourth aspect, in a possible implementation, the size and number of the plurality of sub-acquisition areas are preset.
With reference to the fourth aspect, in a possible implementation, the receiving module is further configured to receive a division strategy, and the feature extraction module is further configured to preset the size and number of the plurality of sub-acquisition areas according to the division strategy.
With reference to the fourth aspect, in a possible implementation, the sub-acquisition area is a rectangle, and the size of the sub-acquisition area is defined by the coordinates of the four corners of the rectangle.
With reference to the fourth aspect, in a possible implementation, the feature extraction module includes a convolution layer and a pooling layer.
In a possible implementation, more than one alternating pair of convolution and pooling layers is configured. In a possible implementation, a convolution layer includes one or more sub-convolution layers. In a possible implementation, the number of sub-convolution layers in a convolution layer is a natural number from 1 to 3.
With reference to the fourth aspect, in a possible implementation, the fusion recognition module includes a feature fusion layer.
With reference to the fourth aspect, in a possible implementation, the fusion recognition module further includes a fully connected layer.
In a possible implementation, the feature fusion layer includes a series feature fusion (concat) layer. In a possible implementation, the fully connected layer includes one or more sub-fully-connected layers.
With reference to the fourth aspect, in a possible implementation, the receiving module is further configured to receive a division strategy, and the feature extraction module is further configured to preset the size and number of the plurality of sub-acquisition areas according to the division strategy.
Since the apparatus of the fourth aspect of the present application can perform the method of the second aspect, its advantages and benefits are similar to those of the second aspect; reference is made to the relevant description of the second aspect, which is not repeated here.
In a fifth aspect, an image sensor image data processing system is provided, including any image data processing apparatus of the third aspect and any image recognition apparatus of the fourth aspect.
With reference to the fifth aspect, in a possible implementation, the image sensor image data processing system further includes a division management module configured to provide a division strategy to the image data processing apparatus and the image recognition apparatus, the division strategy being used to preset the size and number of the plurality of sub-acquisition areas of the acquisition area.
Since the system of the fifth aspect includes the apparatuses of the third and fourth aspects, its advantages and benefits include those of the third and fourth aspects; reference is made to the relevant descriptions of the third and fourth aspects, which are not repeated here.
In a sixth aspect, a driving system is provided, including any image sensor image data processing system of the fifth aspect and a driving decision unit, where the driving decision unit is connected to the image sensor image data processing system and is configured to perform behavior decision-making and motion planning according to the image recognition result output by the image sensor image data processing system and to output operation instructions.
By adopting any image sensor image data processing system of the fifth aspect, the driving system of the present application can bring forward the image data processing flow and save processing time, thereby reducing the end-to-end delay from the image sensor generating image data to the vehicle actuator performing an operation.
In a possible implementation, the driving system is an advanced driving assistance system. In another possible implementation, the driving system is an autonomous driving system.
In a seventh aspect, a vehicle is provided, including an image sensor, any driving system of the sixth aspect, an electronic control unit, and an actuator, connected in sequence, where the image sensor is configured to perceive the vehicle environment in a scanning manner and to output first image data, and the electronic control unit is configured to control the actuator to perform operations according to the operation instructions of the driving system.
By adopting any driving system of the sixth aspect, the vehicle of the present application can bring forward the processing flow and save processing time, thereby reducing the end-to-end delay from the image sensor generating image data to the vehicle actuator performing an operation.
In an eighth aspect, a computing device is provided, including: at least one processor; and at least one memory connected to the processor and storing program instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any one of the first and second aspects.
Since the processor in the computing device of the present application can perform either of the methods for processing image sensor image data of the first and second aspects, the advantages and benefits of the computing device are similar to those of the first and second aspects; reference is made to the relevant descriptions of the first and second aspects, which are not repeated here.
In a ninth aspect, a computer-readable storage medium is provided, on which program instructions are stored; when executed by a computer, the program instructions cause the computer to perform either of the methods for processing image sensor image data of the first and second aspects.
Since the computer-readable storage medium of the present application can cause a computer to perform either of the methods for processing image sensor image data of the first and second aspects, its advantages and benefits are likewise similar to those of the first and second aspects; reference is made to the relevant descriptions of the first and second aspects, which are not repeated here.
Description of Drawings
The features of the present application and the connections between them are further described below with reference to the accompanying drawings. The drawings are exemplary; some features are not shown to scale, and some drawings may omit features that are customary in the field to which the present application relates and are not essential to the application, or may additionally show features that are not essential to the application. The combinations of features shown in the drawings are not intended to limit the application. Throughout this specification, the same reference numerals denote the same content. The drawings are described as follows:
FIG. 1 is a schematic diagram of an image data processing scheme in the prior art;
FIG. 2 is a schematic diagram of image data processing schemes of the prior art and of an embodiment of the present application, with the prior-art scheme in the upper part and the scheme of the embodiment of the present application in the lower part;
FIG. 3 is a schematic structural diagram of an image sensor image data processing system according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an image data processing apparatus according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an image recognition apparatus according to an embodiment of the present application;
FIG. 6 is a schematic flowchart of a method for processing image data according to an embodiment of the present application;
FIG. 7 is a schematic diagram of dividing an acquisition area using the division strategy in an embodiment of the present application;
FIG. 8 is a schematic diagram of the division of the sub-acquisition areas of the image sensor acquisition area in FIG. 7;
FIG. 9 is a schematic flowchart of the feature extraction and fusion recognition processes in a method for processing image sensor image data according to an embodiment of the present application;
FIG. 10 is a schematic diagram of image data processing schemes of the prior art and of another embodiment of the present application, with the prior-art scheme in the upper part and the scheme of the embodiment of the present application in the lower part;
FIG. 11 is a schematic diagram of image data processing schemes of the prior art and of another embodiment of the present application, with the prior-art scheme in the upper part and the scheme of the embodiment of the present application in the lower part;
FIG. 12 is a schematic flowchart of a method for processing image sensor image data according to another embodiment of the present application;
FIG. 13 is a schematic structural diagram of a driving system according to an embodiment of the present application;
FIG. 14 is a schematic structural diagram of a vehicle according to an embodiment of the present application; and
FIG. 15 is a schematic structural diagram of a computing device according to an embodiment of the present application.
Detailed Description
<Definitions>
In the following description, the reference numerals denoting steps, such as S101, S102, and so on, do not mean that the steps must be performed in that order; where permitted, the order of preceding and succeeding steps may be interchanged, or steps may be performed simultaneously.
Terms such as "first, second, third" or "area A, area B, area C" in the specification and claims are used only to distinguish similar objects and do not indicate a specific ordering of the objects; it is understood that, where permitted, a specific order or sequence may be interchanged so that the embodiments of the application described here can be implemented in an order other than that illustrated or described. For example, the terms "first image data" and "second image data" used in this application both denote image data, but the "second image data" is obtained by performing image processing on the "first image data", so "first" and "second" distinguish the two.
The term "comprising" used in the specification and claims should not be interpreted as being limited to what is listed thereafter; it does not exclude other elements or steps. It should accordingly be interpreted as specifying the presence of the mentioned features, integers, steps, or components, without excluding the presence or addition of one or more other features, integers, steps, or components, or groups thereof. Therefore, the expression "a device comprising means A and B" should not be limited to a device consisting only of components A and B.
Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with that embodiment is included in at least one embodiment of the present application. Thus, appearances of the phrases "in one embodiment" or "in an embodiment" in various places in this specification do not necessarily all refer to the same embodiment, though they may. Furthermore, as will be apparent to those of ordinary skill in the art from this disclosure, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
The words "optional" and "optionally" used in the specification mean that the feature they modify may be omitted in some embodiments but is present in some alternative embodiments.
说明书和权利要求书中使用的术语“图像传感器(imaging sensor)”包括摄像机和激光雷达。图像传感器用于在一个扫描周期内扫描采集区域对应的物理区域可生成的多个图像数据。采集区域表示图像传感器的采集范围,摄像头的采集区域又称为靶面。激光雷达的采集区域又称为扫描区域。The term "imaging sensor" as used in the specification and claims includes cameras and lidars. The image sensor is used to scan a plurality of image data that can be generated from the physical area corresponding to the acquisition area in one scan cycle. The acquisition area represents the acquisition range of the image sensor, and the acquisition area of the camera is also called the target surface. The collection area of the lidar is also called the scanning area.
说明书和权利要求书中使用的术语“图像处理(image processing)”是指用计算设备对图像进行分析处理,以达到所需结果的技术。不同种类型的图像的图像数据适用的图像处理是不同的。例如,用于二维平面图像的图像数据的图像处理可以包括但不限于黑电平补偿(black level compensation)、镜头矫正(lens shading correction)、坏像素矫正(bad pixel correction)、颜色插值(demosaic)、贝叶斯域去噪、白平衡矫正(automatic white balance)、色彩矫正(color correction)、Gamma校正和色彩空间转换(RGB转换为YUV)。用于三维点云的图像数据的图像处理可以包括但不限于滤波(filter)、下采样(down sample)和去离群噪声(outlier removal)。因此,本申请的方法实施方式中涉及的“图像处理”可以包括一个或多个子图像处理,本申请的装置实施方式中涉及的“图像处理模块”可以包括一个或多个子图像处理模块。The term "image processing" as used in the description and claims refers to techniques of analyzing images with computing devices to achieve desired results. The image processing applicable to the image data of different types of images is different. For example, image processing of image data for two-dimensional planar images may include, but is not limited to, black level compensation, lens shading correction, bad pixel correction, demosaic ), Bayesian domain denoising, white balance correction (automatic white balance), color correction (color correction), Gamma correction and color space conversion (RGB to YUV). Image processing for the image data of the three-dimensional point cloud may include, but is not limited to, filtering, down sampling, and outlier removal. Therefore, the "image processing" involved in the method embodiments of the present application may include one or more sub-image processing, and the "image processing module" involved in the apparatus embodiments of the present application may include one or more sub-image processing modules.
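By way of illustration only, the following minimal Python sketch shows how such "image processing" can be composed from sub-processes, here black level compensation and gamma correction; the sub-process selection, parameter values, and function names are assumptions made for the sketch, not limitations of the present application.

```python
import numpy as np

# Illustrative sketch: "image processing" composed of chained sub-processes.
# The sub-process set and all parameter values are assumptions, not the
# configuration of any particular embodiment.

def black_level_compensation(raw, black_level=64):
    # Subtract the sensor's black level and clip negative values.
    return np.clip(raw.astype(np.int32) - black_level, 0, None).astype(np.uint16)

def gamma_correction(img, gamma=2.2, white=1023):
    # Apply a simple power-law gamma on normalized pixel values.
    return ((img / white) ** (1.0 / gamma) * white).astype(np.uint16)

SUB_PROCESSES = [black_level_compensation, gamma_correction]

def image_processing(first_image_data):
    """Turn 'first image data' into 'second image data' by chaining sub-processes."""
    data = first_image_data
    for sub in SUB_PROCESSES:
        data = sub(data)
    return data

raw_rows = np.random.randint(0, 1024, size=(360, 1920), dtype=np.uint16)  # one sub-region
second_image_data = image_processing(raw_rows)
```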
The term "feature extraction" or "extraction of feature data" used in the specification and claims refers to techniques for constructing various informative, non-redundant feature data from image data that has undergone image processing. The feature extraction applicable to image data differs by image type. For example, feature extraction for the image data of a two-dimensional planar image may include, but is not limited to, convolution and pooling. Feature extraction for the image data of a three-dimensional point cloud may include, but is not limited to, feature vector extraction.
The term "fusion and recognition" used in the specification and claims refers to techniques in which all the feature data extracted from the image data are fused into a whole, after which recognition analysis is performed on that whole and an image recognition result is output. The fusion and recognition applicable to image data differs by image type. For example, fusion and recognition for the image data of a two-dimensional planar image may include, but is not limited to, feature fusion, fully connected processing, and output. Fusion and recognition for the image data of a three-dimensional point cloud may include, but is not limited to, feature point matching.
The term "feature fusion" used in the specification and claims refers to the process of fusing into one whole all the feature data extracted from the image data of the entire acquisition region. The feature fusion involved in this application is early fusion performed before the image recognition result is obtained, and may include, but is not limited to, serial feature fusion (concat) and parallel fusion (add).
<Technical Background>
At present, many in-vehicle image sensors sense in a "scanning" manner. Such a scanning image sensor does not generate all the first image data of a frame of image simultaneously across the acquisition region; instead, it generates all the first image data of a frame sequentially over a period of time, for example by scanning the acquisition region row by row or column by column. As shown in FIG. 1, in the existing method for processing image sensor image data, the image sensor starts generating and transmitting the first piece of first image data of a frame at time t0, and image processing on all the first image data of the whole frame is performed only after the last piece of first image data of that frame has been transmitted at time t1. Then, only when the image processing of the first image data of the whole frame is completed at time t2 does recognition algorithm processing start on the second image data (i.e., the image-processed first image data), and the recognition algorithm processing of the whole frame is completed at time t3.
The inventors of the present application found that such a method of processing image sensor image data has the following drawbacks. The image processing flow cannot start until the image sensor has scanned the entire acquisition region, generated all the first image data, and finished transmitting them; correspondingly, the image recognition flow cannot start until all the first image data have been image-processed into second image data. Each of these waiting periods delays the processing flow of the image sensor image data, resulting in a large end-to-end latency from the generation of the first image data by the image sensor to the execution of an operation by a vehicle actuator. In other words, since image processing and recognition algorithm processing are performed in units of the first or second image data of a whole frame, the already-received first or second image data of a frame cannot be processed while the remaining data of that frame are still being received, which slows down the processing of image sensor image data and increases the end-to-end latency.
<Technical Concept>
In view of this, an embodiment of the present application provides a method for processing image sensor image data, including: receiving first image data from an image sensor, the first image data being one of a plurality of image data that the image sensor can generate by scanning, in one scan period, the physical region corresponding to an acquisition region, the acquisition region representing the acquisition range of the image sensor; performing image processing on the first image data to obtain second image data; and outputting the second image data.
The concept of the method for processing image sensor image data of the present application is further described below with reference to FIG. 2. For clarity, FIG. 2 shows the existing solution (upper part of FIG. 2) and the solution of the present application (lower part of FIG. 2) on the same time axis.
In the embodiment of the method for processing image sensor image data of the present application shown in the lower part of FIG. 2, the image sensor scans its acquisition region and generates a plurality of image data in sequence. With the scanning region of the image sensor pre-divided into three sub-acquisition regions A, B, and C, the processing method of the present application starts to perform image processing as soon as all the image data generated by scanning sub-acquisition region A have been received (i.e., at time t11), that is, before all the image data generated by scanning the entire acquisition region have been received (i.e., at time t1). This advances the processing flow from time t1 in the prior art to time t11 in the present application, so that the end time of image processing in the present application is earlier than in the existing solution; the end time of the final algorithm processing is therefore necessarily earlier as well, the image recognition result is obtained sooner, and the end-to-end latency is ultimately shortened.
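The timing advantage described above can be illustrated with a small, self-contained Python sketch. All durations below (30 ms transmission, 15 ms processing, three equal sub-regions) are assumed for illustration only; the sketch merely computes when processing would finish under the prior-art whole-frame scheme versus the proposed per-sub-region scheme.

```python
# Toy timeline comparison, assuming transmission of one frame takes 30 ms,
# image processing of one frame takes 15 ms, and the acquisition region is
# split into 3 equal sub-regions. All numbers are illustrative assumptions.

TRANSMIT_MS, PROCESS_MS, REGIONS = 30.0, 15.0, 3

def whole_frame_end():
    # Prior art: processing starts only after the whole frame has arrived (t1).
    return TRANSMIT_MS + PROCESS_MS

def pipelined_end():
    # Proposed scheme: sub-region k finishes arriving at (k+1)*TRANSMIT_MS/REGIONS;
    # its processing starts then (or when the previous sub-region's processing ends).
    done = 0.0
    for k in range(REGIONS):
        arrived = (k + 1) * TRANSMIT_MS / REGIONS
        done = max(arrived, done) + PROCESS_MS / REGIONS
    return done

print(whole_frame_end(), pipelined_end())  # 45.0 vs 35.0 under these assumptions
```

Under these assumed numbers, the 10 ms saving equals 2/3 of the 15 ms image processing duration, consistent with Equation (3) derived later in this description (with T0 = 0, since this sketch models image processing only).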
<Implementation Scenarios>
The method for processing image sensor image data of the present application is applicable to processing the image data generated by various "scanning" image sensors, and can be applied to vehicles, robots, and other devices equipped with such image sensors.
For example, in a mobile robot scenario with a camera, the robot uses the camera to perceive its environment and plans and executes corresponding movements according to the image recognition results obtained from the camera's image data. When such a robot applies the method for processing image sensor image data of the present application, it obtains image recognition results more quickly and can therefore react to those results with corresponding movements sooner. The method of the present application thus makes the movements of a mobile robot more agile.
For example, in an intelligent connected vehicle scenario with a lidar, the vehicle uses the lidar to perceive road conditions and plans and executes corresponding autonomous driving operations according to the image recognition results obtained from the lidar's point cloud image data. When such a vehicle applies the method for processing image sensor image data of the present application, it obtains image recognition results more quickly and can therefore react to those results with corresponding driving operations sooner. The method of the present application thus makes the autonomous driving of an intelligent connected vehicle safer.
In addition, although the method for processing image sensor image data of the present application improves the processing of a single frame of image, shortening the time for processing one frame and thereby shortening the end-to-end latency, it should be understood that the method is also applicable to processing video comprising multiple frames of images. By applying the method of the present application to each frame of the video separately, the processing duration of the video can be shortened.
<Embodiments>
An embodiment of the image sensor image data processing system of the present application and an embodiment of the method for processing image data are described in detail below with reference to FIGS. 2 to 9. In FIG. 3, optional modules of the apparatus are shown with dashed boxes, i.e., such optional modules may be omitted in some other embodiments.
FIG. 3 exemplarily shows an image sensor image data processing system 1001 according to an embodiment of the present application, which includes an image data processing apparatus 1100 and an image recognition apparatus 1200 connected to each other.
FIG. 3 shows an optional division management module 1300 in a dashed box. The division management module 1300 is connected to the image data processing apparatus 1100 and the image recognition apparatus 1200, respectively, and provides a division strategy to both. The division strategy may be used to preset the size and number of the sub-acquisition regions of the image sensor's acquisition region. In this embodiment, the division strategy provided by the division management module 1300 divides the acquisition region into three equal sub-acquisition regions A, B, and C.
FIG. 3 additionally shows an image sensor 2000. The image sensor 2000 scans the acquisition region to sequentially generate a plurality of first image data constituting one frame of image, and sequentially sends the plurality of first image data to the image data processing apparatus 1100 in the image sensor image data processing system 1001 of the present application.
As shown in FIG. 4, the image data processing apparatus 1100 includes a receiving module 1110 and an image processing module 1120 connected to each other.
In this embodiment, the receiving module 1110 of the image data processing apparatus 1100 is configured to receive the plurality of first image data sent by the image sensor 2000, and may further be configured to receive the division strategy provided by the division management module 1300.
In this embodiment, since the image processing module 1120 pre-divides the acquisition region into three sub-acquisition regions A, B, and C of equal size according to the division strategy received by the receiving module 1110, the first image data received by the receiving module 1110 can be divided into three groups, such that the image sensor generates one group of first image data when scanning each of the three different sub-acquisition regions. The image processing module 1120 can thus further be configured to perform image processing on a group of first image data once that group has been received.
Specifically, referring to FIG. 2, the image processing module 1120 is configured to start performing image processing on the first image data generated from sub-acquisition region A at time t11, when the receiving module 1110 has received all of them. Then, at time t12, when the receiving module 1110 has received all the first image data generated from sub-acquisition region B, the image processing module 1120 starts performing image processing on those data. Finally, at time t13, when the receiving module 1110 has received all the first image data generated from sub-acquisition region C, the image processing module 1120 starts performing image processing on those data.
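A minimal sketch of this group-triggered behavior follows, assuming a 1080-row frame split into three equal sub-regions and row-by-row arrival; the helper names (on_line_received, process_group) are illustrative and not part of the present application.

```python
# Minimal sketch of group-triggered processing, assuming a 1080-row frame
# split into three equal sub-regions (rows 0-359, 360-719, 720-1079).

ROWS_PER_REGION = 360
REGIONS = 3

pending = {r: [] for r in range(REGIONS)}

def process_group(region, lines):
    print(f"image processing starts for sub-region {region} ({len(lines)} lines)")

def on_line_received(row_index, line_data):
    # Assign the scan line to its sub-region by row index.
    region = row_index // ROWS_PER_REGION
    pending[region].append(line_data)
    # Trigger image processing as soon as the group is complete,
    # without waiting for the remaining sub-regions of the frame.
    if len(pending[region]) == ROWS_PER_REGION:
        process_group(region, pending[region])

for row in range(ROWS_PER_REGION * REGIONS):  # simulated row-by-row scan
    on_line_received(row, line_data=bytes(8))
```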
In the present application, the image processing module 1120 may include a plurality of image processing sub-modules (not shown). Similarly, the image processing may include a plurality of image sub-processes. In practice, the image data processing apparatus 1100 may be, for example, an image signal processor.
As shown in FIG. 5, the image recognition apparatus 1200 includes a receiving module 1210, a feature extraction module 1220, and a fusion recognition module 1230 connected in sequence.
In this embodiment, the receiving module 1210 of the image recognition apparatus 1200 is configured to receive a plurality of second image data. The plurality of second image data are obtained by the image data processing apparatus 1100 performing image processing on the plurality of first image data, and the plurality of first image data are generated in sequence by the image sensor 2000 scanning the acquisition region. In addition, the receiving module 1210 is further configured to receive the division strategy provided by the division management module 1300.
In this embodiment, the feature extraction module 1220 is configured to sequentially extract feature data from the received second image data while the receiving module is receiving the plurality of second image data. Since the feature extraction module 1220 groups the second image data according to the division strategy received by the receiving module 1210, i.e., the acquisition region is pre-divided into three sub-acquisition regions A, B, and C of equal size, the second image data received by the receiving module 1210 can likewise be divided into three different groups, such that the second image data originating from sub-acquisition regions A, B, and C belong to different groups, respectively. The feature extraction module 1220 is configured to extract feature data from a group of second image data once that group has been received.
Specifically, referring to FIG. 2, the feature extraction module 1220 is configured to start extracting feature data from the group of second image data originating from sub-acquisition region A as soon as the receiving module 1210 has received that group. In other words, once all the first image data generated from sub-acquisition region A have been transformed by image processing into the corresponding second image data and received by the receiving module 1210, the feature extraction module 1220 starts extracting feature data from those second image data together. The feature extraction module 1220 then performs feature extraction in a similar manner on the second image data originating from sub-acquisition region B and sub-acquisition region C, respectively.
In this embodiment, the fusion recognition module 1230 is configured to, after the feature extraction module 1220 has extracted feature data from all the second image data originating from sub-acquisition regions A, B, and C, respectively, first fuse all the feature data into a whole, then perform recognition on that whole, and finally output the image recognition result at time t4.
Since both the feature extraction module 1220 and the fusion recognition module 1230 are involved in image recognition processing, in practice they are usually provided in the same entity, for example in the algorithm platform of a vehicle.
In this embodiment, the image processing module 1120 and the feature extraction module 1220 respectively process the first image data generated from one sub-acquisition region or the second image data originating from one sub-acquisition region, i.e., their processing objects are the first or second image data of a single sub-acquisition region. The fusion recognition module 1230 of this embodiment, by contrast, is configured to perform fusion recognition on all the feature data of the second image data originating from the entire acquisition region as a whole, i.e., its processing object is the feature data of all the second image data originating from the entire acquisition region.
In addition, the feature extraction module 1220 may include one or more convolutional layers (not shown) and one or more pooling layers (not shown). The fusion recognition module 1230 may include one or more feature fusion layers (not shown), one or more fully connected layers (not shown), and an output layer (not shown). Each of these layers may in turn include one or more sub-layers (not shown). The steps performed by the various layers of the feature extraction module 1220 and the fusion recognition module 1230 are described further below, so the structure of the layers within these modules will become clearer in the related description.
An embodiment of the method for processing image sensor image data of the present application is described in detail below with reference to FIG. 6, in conjunction with the above description of FIGS. 3 to 5.
FIG. 6 exemplarily shows a schematic flowchart of a method for processing image data according to an embodiment of the present application, which includes the following steps S101 to S107:
In step S101, the division management module 1300 provides a division strategy. Specifically, the division strategy presets the number of sub-acquisition regions of the acquisition region to 3 and presets the sizes of the sub-acquisition regions to be equal to one another.
In step S102, the image data processing apparatus 1100 and the image recognition apparatus 1200 each receive the division strategy and each preset the sub-acquisition regions of the acquisition region according to it, thereby grouping the first image data and the second image data, respectively. Specifically, the receiving module 1110 of the image data processing apparatus 1100 and the receiving module 1210 of the image recognition apparatus 1200 each receive the division strategy, and the corresponding image processing module 1120 and feature extraction module 1220 respectively divide the acquisition region into first sub-acquisition regions and second sub-acquisition regions according to the division strategy, presetting the first and second sub-acquisition regions as sub-acquisition regions A, B, and C of equal size. In other words, since the same division strategy is adopted, the first sub-acquisition regions of the image data processing apparatus 1100 and the second sub-acquisition regions of the image recognition apparatus 1200 are both sub-acquisition regions A, B, and C.
In step S103, the image data processing apparatus 1100 receives the plurality of first image data sequentially generated by the image sensor scanning the acquisition region. Specifically, the receiving module 1110 of the image data processing apparatus 1100 receives the plurality of first image data sequentially generated and sent by the image sensor 2000 scanning the acquisition region.
In step S104, when the image data processing apparatus 1100 has received a group of first image data generated from one first sub-acquisition region, it performs image processing on that group of first image data and outputs the corresponding second image data. Specifically, when the receiving module 1110 has received a group of first image data generated from one first sub-acquisition region, the image processing module 1120 performs image processing on that group and sequentially outputs the corresponding second image data obtained by performing the image processing on that group of first image data.
In step S105, the image recognition apparatus 1200 receives the plurality of second image data sequentially output by the image data processing apparatus 1100. Specifically, the receiving module 1210 of the image recognition apparatus 1200 receives the plurality of second image data sequentially output by the image data processing apparatus 1100.
In step S106, when the image recognition apparatus 1200 has received a group of second image data originating from one second sub-acquisition region, it extracts feature data from that group of second image data. Specifically, when the receiving module 1210 has received all the second image data originating from one second sub-acquisition region, the feature extraction module 1220 of the image recognition apparatus 1200 extracts feature data from those image data together.
In step S107, the image recognition apparatus 1200 performs fusion recognition processing on the feature data extracted from the plurality of second image data and outputs the image recognition result. Specifically, after the feature extraction module 1220 of the image recognition apparatus 1200 has finished extracting feature data from the second image data originating from sub-acquisition regions A, B, and C, respectively, the fusion recognition module 1230 of the image recognition apparatus 1200 fuses all the feature data into a whole, performs recognition on that whole, and finally outputs the image recognition result.
It should be understood that the above steps S101 to S107 are not arranged in chronological order. For example, in this embodiment, while the receiving module 1110 is receiving the first image data generated from sub-acquisition region B in step S103, the image processing module 1120 may be performing image processing on the first image data generated from sub-acquisition region A in step S104. Steps S103 and S104 may therefore overlap in time. Similarly, steps S105 and S106 may overlap in time. Furthermore, in this embodiment, since the acquisition region is divided into three sub-acquisition regions, steps S104 and S106 are each repeated three times as the image sensor scans sub-acquisition regions A, B, and C in sequence, so as to respectively process the first image data generated from each sub-acquisition region and the second image data originating from each sub-acquisition region, as shown in FIG. 2.
It should further be understood that, in this embodiment, the objects of the image processing step S104 and the feature extraction step S106 are the first or second image data generated from, or originating from, one sub-acquisition region of the acquisition region, whereas the object of the fusion recognition step S107 in this embodiment is all the feature data extracted from the second image data originating from the entire acquisition region.
Referring again to FIG. 2, the latency benefit ΔT achieved by the above system and method embodiments relative to the existing solution can be calculated as follows:
ΔT = t3 - t4    Equation (1);
Taking time t1 as the reference, and noting that in this embodiment t4 = t1 + (t2 - t1)/3 + T0/3 + T1 (after t1, only the image processing and feature extraction of the last sub-acquisition region, each one third of the whole-frame duration, and the fusion recognition remain), Equation (1) can be converted into:
ΔT = (t3 - t1) - (t2 - t1)/3 - T0/3 - T1    Equation (2);
Since t3 = t2 + T0 + T1, Equation (2) can be converted to obtain
ΔT = 2/3 * (t2 - t1 + T0)    Equation (3);
where t1 is the time at which image processing of the first image data generated from the entire acquisition region starts in the existing solution, and is also the time at which image processing of the first image data generated from sub-acquisition region C starts in this embodiment of the present application; t2 is the time at which image processing ends and algorithm processing starts in the existing solution; t3 is the time at which algorithm processing ends in the existing solution; t4 is the time at which fusion recognition ends in this embodiment of the present application; T0 is the duration of feature extraction, which is part of the algorithm processing in the existing solution and is also the duration of feature extraction in this embodiment of the present application; and T1 is the duration of fusion recognition, which is part of the algorithm processing in the existing solution and is also the duration of fusion recognition in this embodiment of the present application.
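The following short numeric check of Equations (1) to (3) uses assumed, illustrative timestamps; it is intended only to make the arithmetic concrete.

```python
# Worked example of Equations (1)-(3) with illustrative timestamps (ms);
# these values are assumptions for the sake of the arithmetic, not measured data.
t1, t2 = 30.0, 45.0      # whole-frame image processing runs from t1 to t2
T0, T1 = 9.0, 6.0        # feature extraction and fusion recognition durations
t3 = t2 + T0 + T1        # end of algorithm processing in the existing solution

# Proposed scheme: after t1, only the last third of image processing and of
# feature extraction remains, followed by fusion recognition.
t4 = t1 + (t2 - t1) / 3 + T0 / 3 + T1

delta_T = t3 - t4
assert abs(delta_T - 2.0 / 3.0 * (t2 - t1 + T0)) < 1e-9  # Equation (3) holds
print(delta_T)  # 16.0 ms under these assumptions
```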
From FIG. 2 and Equation (3), it is clear that the latency benefit in this embodiment of the present application derives, on the one hand, from advancing the image processing flow: in this embodiment, image processing of 2/3 of the first image data of a frame has already started before time t1, thereby reducing the end-to-end latency. On the other hand, the latency benefit derives from advancing the image recognition flow: in this embodiment, feature extraction of 2/3 of the second image data of the frame has already started before time t1, further reducing the end-to-end latency.
Some modules and steps in the system and method embodiments of the present application are described in more detail below with reference to FIGS. 7 to 9.
FIG. 7 exemplarily shows a schematic diagram of dividing the acquisition region of the image sensor 2000 using the division strategy provided by the division management module 1300. Taking a camera as an example, as shown in FIG. 7, a frame of image is generated by the camera within its target surface (also referred to in the present application as the acquisition region) by row-by-row exposure from top to bottom. According to the division strategy provided by the division management module 1300, the acquisition region is divided, from top to bottom, into three rectangular sub-acquisition regions A, B, and C, and the sizes of the sub-acquisition regions are set to be equal.
In FIG. 7, the image sensor scans sub-acquisition region A first, therefore generates the first image data of sub-acquisition region A first, and accordingly transmits the first image data generated from that sub-acquisition region to the image sensor image data processing system 1001 first. The image sensor image data processing system 1001 therefore first performs image processing and feature extraction on the image data generated from sub-acquisition region A, then processes the first image data generated from sub-acquisition region B, and finally processes the first image data generated from sub-acquisition region C. Thereafter, the acquisition region of the camera may be divided with the same division strategy for processing the second frame of image generated by subsequent scanning, and so on up to the Z-th frame of image, where Z is any integer greater than 2. That is, the present application can process multiple frames of images, and can therefore process objects composed of multiple frames of images, such as video.
FIG. 8 is a schematic diagram of the division of the sub-acquisition regions of the image sensor acquisition region in FIG. 7. Assume the resolution of the acquisition region shown in FIG. 7 is 1920*1080. With the upper-left corner of the acquisition region set as the imaginary coordinate origin, each sub-acquisition region A, B, and C can easily be defined by the coordinates of the upper-left, upper-right, lower-left, and lower-right corners of its rectangle; see Table 1 and FIG. 8 for the specific coordinates.
Table 1
                          Upper-left    Upper-right    Lower-left    Lower-right
Sub-acquisition region A  (0, 0)        (1919, 0)      (0, 359)      (1919, 359)
Sub-acquisition region B  (0, 360)      (1919, 360)    (0, 719)      (1919, 719)
Sub-acquisition region C  (0, 720)      (1919, 720)    (0, 1079)     (1919, 1079)
In this way, each sub-acquisition region of the acquisition region can be defined in a simple and flexible manner, making it convenient to adjust the number of sub-acquisition regions and the size of each sub-acquisition region according to the actual situation.
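For illustration, the sub-acquisition regions of Table 1 could be represented in code roughly as follows; the Region type and helper function are assumptions introduced for this sketch.

```python
from typing import NamedTuple

# Sketch of the sub-acquisition regions of Table 1 and of resolving which
# region a scan row belongs to. Bounds follow Table 1; names are illustrative.

class Region(NamedTuple):
    name: str
    top: int      # first row (inclusive)
    bottom: int   # last row (inclusive)
    left: int = 0
    right: int = 1919

REGIONS = [Region("A", 0, 359), Region("B", 360, 719), Region("C", 720, 1079)]

def region_of_row(row: int) -> Region:
    for r in REGIONS:
        if r.top <= row <= r.bottom:
            return r
    raise ValueError(f"row {row} outside the 1920*1080 acquisition region")

assert region_of_row(500).name == "B"
```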
The feature extraction step S106 and the fusion recognition step S107, together with the structures of the feature extraction module 1220 and the fusion recognition module 1230 that execute them, are described further below with reference to FIG. 9.
FIG. 9 exemplarily shows a schematic diagram of the feature extraction and fusion recognition flow in the method for processing image data according to an embodiment of the present application. The parts involved in the feature extraction step S106 and in the fusion recognition step S107 are marked with different dashed boxes. Here, conv1 indicates the first convolution process that may be performed by the first convolutional layer of the feature extraction module 1220, conv2 indicates the second convolution process that may be performed by the second convolutional layer of the feature extraction module 1220, and conv3 indicates the third convolution process that may be performed by the third convolutional layer of the feature extraction module 1220; similarly, pool1 to pool3 each indicate the first to third pooling processes that may be performed by the first to third pooling layers of the feature extraction module 1220, respectively, and fc1 and fc2 indicate the first and second fully connected processes that may be performed by the first and second fully connected layers of the fusion recognition module 1230, respectively. concat indicates the serial feature fusion process that may be performed by the serial feature fusion layer of the fusion recognition module 1230. output indicates the output of the image recognition result that may be performed by the output layer of the fusion recognition module 1230. Furthermore, in FIG. 9, conv2_1 indicates the first sub-convolution process of the second convolution process that may be performed by the first sub-convolutional layer of the second convolutional layer, and conv2_2 indicates the second sub-convolution process of the second convolution process that may be performed by the second sub-convolutional layer of the second convolutional layer. Similarly, conv3_1 to conv3_3 each indicate the first to third sub-convolution processes of the third convolution process that may be performed by the first to third sub-convolutional layers of the third convolutional layer, respectively.
For ease of presentation, the flow involved in step S106 is not arranged in chronological order in FIG. 9. Specifically, when the feature extraction module 1220 performs the feature extraction step S106, as shown in the dashed box for S106, the convolution and pooling processes for the image data of sub-acquisition region A in the left column are first performed from top to bottom, then the convolution and pooling processes for the image data of sub-acquisition region B in the middle column are performed from top to bottom, and finally the convolution and pooling processes for the image data of sub-acquisition region C in the right column are performed from top to bottom. In other words, the convolution and pooling of the second image data of sub-acquisition regions A, B, and C are not performed simultaneously, but sequentially from left to right in units of the second image data of one sub-acquisition region.
As shown in the dashed box for S107 in FIG. 9, when the fusion recognition module 1230 performs the fusion recognition step S107, the serial feature fusion process, the first fully connected process, the second fully connected process, and the output of the image recognition result are performed in order from top to bottom.
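A minimal sketch of this per-region feature extraction followed by concat fusion, written against the PyTorch library, might look as follows. The layer counts loosely mirror FIG. 9, but the channel sizes, the down-scaled input size, and the class count are illustrative assumptions, not the configuration of the present application.

```python
import torch
import torch.nn as nn

# Illustrative sketch of per-region feature extraction and concat fusion.
# All layer sizes are assumptions; the input is a down-scaled stand-in for
# the second image data of one sub-acquisition region.

class RegionFeatureExtractor(nn.Module):  # plays the role of module 1220
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # conv1/pool1
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # conv2/pool2
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # conv3/pool3
        )

    def forward(self, region):
        return torch.flatten(self.net(region), start_dim=1)

class FusionRecognizer(nn.Module):  # plays the role of module 1230
    def __init__(self, per_region_dim, num_regions=3, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(per_region_dim * num_regions, 128)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, features):
        fused = torch.cat(features, dim=1)            # serial feature fusion (concat)
        return self.fc2(torch.relu(self.fc1(fused)))  # fc1, fc2, output

extractor, features = RegionFeatureExtractor(), []
for _ in range(3):                        # regions A, B, C arrive one after another
    region = torch.randn(1, 3, 96, 96)    # down-scaled stand-in for one sub-region
    features.append(extractor(region))    # extract as soon as the group arrives

per_region_dim = features[0].shape[1]
logits = FusionRecognizer(per_region_dim)(features)  # fuse and recognize the whole
```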
For ease of understanding, one embodiment of the image sensor image data processing system of the present application and one embodiment of the method for image data processing have been described in detail above. However, it should not be understood that the system and method of the present application are limited to the feature combinations of the above embodiment.
In some other embodiments of the present application, the image data processing apparatus 1100 takes a group of first image data generated from one sub-acquisition region as its processing object, while the image recognition apparatus 1200 takes all the second image data originating from the entire acquisition region as its processing object. As shown in the lower part of FIG. 10, the method for image data processing of this embodiment of the present application includes: while receiving the plurality of first image data generated from scanning the entire acquisition region, sequentially performing image processing on each group of first image data generated from one sub-acquisition region so as to output the corresponding second image data; and, upon completing image processing on all the first image data generated from the three sub-acquisition regions and outputting all the second image data, performing feature extraction and fusion recognition on all the second image data as a whole. The latency benefit of this solution is therefore only 2/3 of the image processing duration, i.e., this solution benefits only from advancing the image processing of the first image data generated from sub-acquisition regions A and B.
In some other embodiments, the sizes of the sub-acquisition regions may be defined to be unequal to one another. For example, in another embodiment of the present application shown in FIG. 11, the size of sub-acquisition region D is set to 1/2 of the size of sub-acquisition region E.
In some other embodiments, the time at which the image data processing apparatus or the image recognition apparatus starts image processing or feature extraction on the received first or second image data may differ from the time at which those image data are received. For example, in another embodiment of the present application shown in the lower part of FIG. 11, the image data processing apparatus performs image processing on the first image data generated from sub-acquisition region D at time t16, which lies between time t14, when all the first image data of sub-acquisition region D have been received, and time t15, when all the first image data of sub-acquisition region E have been received. As a result, the time at which the image data processing apparatus starts image processing on all the first image data generated from sub-acquisition region E is the time at which the image processing on all the first image data generated from sub-acquisition region D is completed. The latency benefit of this solution is therefore only 1/3 of the image processing duration, i.e., this solution benefits only from advancing the image processing of all the first image data generated from sub-acquisition region D.
Therefore, referring to FIGS. 2, 10, and 11, it can be understood that the settings of the start times at which the image data processing apparatus performs image processing and the image recognition apparatus performs feature extraction both affect the latency benefit of the present application. In addition, the settings of the number and size of the sub-acquisition regions of the acquisition region also affect the latency benefit of the present application.
In some other embodiments, the size and number of the sub-acquisition regions of the acquisition region may be directly pre-configured according to preset values; the division management module 1300, or steps S101 and S102, may then be omitted. In this case, in combination with the solution shown in the lower part of FIG. 10, the image data processing apparatus and the image recognition apparatus correspondingly perform the following steps S201 to S205 according to the preset values, as shown in FIG. 12.
In step S201, the image data processing apparatus 1101 receives the plurality of first image data sequentially generated by the image sensor scanning the acquisition region. Specifically, the receiving module 1111 of the image data processing apparatus 1101 receives the plurality of first image data sequentially generated by the image sensor scanning the acquisition region.
In step S202, when the image data processing apparatus 1101 has received a group of first image data, it performs image processing on that group of first image data, the group of first image data being generated from one preset first sub-acquisition region of the acquisition region. Specifically, according to the preset values, the image processing module 1121 of the image data processing apparatus 1101 presets the first sub-acquisition regions A, B, and C of the acquisition region. When the receiving module 1111 has received a group of first image data generated from one first sub-acquisition region, the image processing module 1121 of the image data processing apparatus 1101 performs image processing on that group of first image data and sequentially outputs the corresponding second image data obtained by the image processing.
In step S203, the image recognition apparatus 1201 receives the plurality of second image data sequentially output by the image data processing apparatus 1101. Specifically, these second image data are received by the receiving module 1211 of the image recognition apparatus 1201.
In step S204, upon receiving the plurality of second image data, the image recognition apparatus 1201 extracts feature data from the plurality of second image data. Specifically, when the receiving module 1211 has received the plurality of second image data originating from the entire acquisition region, the feature extraction module 1221 of the image recognition apparatus 1201 extracts feature data from the plurality of second image data together.
In step S205, the image recognition apparatus 1201 performs fusion recognition processing on the feature data extracted from the plurality of second image data and outputs the image recognition result. Specifically, after the feature extraction module 1221 of the image recognition apparatus 1201 has extracted feature data from the plurality of second image data, the fusion recognition module 1231 of the image recognition apparatus 1201 fuses the resulting feature data into a whole, performs recognition, and finally outputs the image recognition result.
In some other embodiments, the first image data and the second image data of different frames may be grouped in different ways, for example by presetting, with different division strategies or preset values, the number and size of the sub-acquisition regions of the acquisition region used when the camera scans different frames of images. In one such embodiment, in processing the image data of the first frame of image, the division shown in FIG. 7 may be used, so that the acquisition region is divided into three sub-acquisition regions of equal size, whereas in processing the image data of the second frame of image, the division strategy adopted in FIG. 11 may be used, i.e., the acquisition region is divided into two sub-acquisition regions of unequal size, one of which is twice the size of the other.
In some other embodiments, the first image data and the second image data of the same frame may also be grouped in different ways; for example, the sub-acquisition regions of the acquisition region may be set in one way in the image data processing flow and in another way in the image recognition flow. In one such embodiment, in the image data processing flow the acquisition region is divided into four equal sub-acquisition regions F, G, H, and I, while in the image recognition flow the acquisition region is divided into two equal sub-acquisition regions J and K, where sub-acquisition region J coincides with the combination of sub-acquisition regions F and G, and sub-acquisition region K coincides with the combination of sub-acquisition regions H and I. This both allows the image data processing flow to be advanced and keeps the image recognition flow from becoming overly complex, avoiding an increase in the time taken by the image recognition flow.
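One possible, purely illustrative realization of such two-level grouping is sketched below; the callback structure and names are assumptions made for the sketch.

```python
# Sketch of grouping one frame differently for the two flows: the image
# processing flow uses four equal regions F, G, H, I; the recognition flow
# uses J = F+G and K = H+I. Names and structure are illustrative.

ISP_REGIONS = ["F", "G", "H", "I"]
RECOGNITION_GROUPS = {"J": ("F", "G"), "K": ("H", "I")}

processed = {}       # second image data per image-processing region
extracted = set()    # recognition groups already fed to feature extraction

def extract_features(group, data):
    print(f"feature extraction starts for group {group} ({len(data)} regions)")

def on_image_processing_done(region, second_image_data):
    processed[region] = second_image_data
    # Feature extraction runs on J or K as soon as both member regions are ready.
    for group, members in RECOGNITION_GROUPS.items():
        if group not in extracted and all(m in processed for m in members):
            extracted.add(group)
            extract_features(group, [processed[m] for m in members])

for r in ISP_REGIONS:  # regions finish image processing in scan order
    on_image_processing_done(r, second_image_data=object())
```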
In some other embodiments, in particular embodiments in which the image sensor is, for example, a lidar, the sub-acquisition regions of an image may be defined not by the coordinates of four corners but by rotation angles; for example, a rotation angle range of 0° to 90° may define one sub-acquisition region of the point cloud.
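For illustration, assigning lidar points to angle-defined sub-acquisition regions might be sketched as follows, assuming four 90° sectors measured from the +x axis; the sector layout is an assumption of the sketch.

```python
import math

# Sketch of assigning lidar points to angle-defined sub-acquisition regions,
# assuming four 90-degree sectors; the sector layout is illustrative.
SECTOR_DEG = 90

def sector_of_point(x, y):
    # Azimuth in [0, 360) degrees, measured counterclockwise from the +x axis.
    azimuth = math.degrees(math.atan2(y, x)) % 360.0
    return int(azimuth // SECTOR_DEG)  # 0 -> [0, 90), 1 -> [90, 180), ...

assert sector_of_point(1.0, 1.0) == 0   # 45 degrees falls in the first sector
assert sector_of_point(-1.0, 0.5) == 1  # ~153 degrees falls in the second
```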
In some other embodiments, the number of layers and sub-layers of the feature extraction module and the fusion recognition module is adjustable. Correspondingly, in some other embodiments, the number of times each process and each sub-process of feature extraction and fusion recognition is performed is also adjustable.
In some embodiments, the image data processing apparatus receives first image data from the image sensor, the first image data being one of a plurality of image data that the image sensor can generate by scanning, in one scan period, the physical region corresponding to the acquisition region, the acquisition region representing the acquisition range of the image sensor. The image data processing apparatus performs image processing on the first image data to obtain second image data, and outputs the second image data. That is, in this embodiment, the first image data are not processed in groups in the image processing flow.
In some embodiments, the image recognition apparatus sequentially receives second image data, the second image data being obtained by performing image processing on first image data, the first image data being one of a plurality of image data that the image sensor can generate by scanning, in one scan period, the physical region corresponding to the acquisition region, the acquisition region representing the acquisition range of the image sensor. The image recognition apparatus sequentially extracts feature data from the second image data, performs fusion recognition processing on the extracted feature data, and outputs the image recognition result. That is, in this embodiment, the second image data are not processed in groups in the image recognition flow.
图13示例性地示出了根据本申请一个实施方式的驾驶系统3001的结构示意图。驾驶系统3001是高级驾驶辅助系统(ADAS),其包括图像传感器图像数据处理系统1001和驾驶决策单元3100。图像传感器图像数据处理系统1001可以与驾驶系统3001外的摄像机2001通信连接,处理和识别由摄像机2001扫描采集区域所依次生成的多个第一图像数据并且输出图像识别结果。驾驶决策单元3100与图像传感器图像数据处理系统1001通信连接,用于依据图像传感器图像数据处理系统1001输出的图像识别结果执行行为决策和运动规划并输出操作指令。FIG. 13 exemplarily shows a schematic structural diagram of a driving system 3001 according to an embodiment of the present application. The driving system 3001 is an advanced driving assistance system (ADAS), which includes an image sensor image data processing system 1001 and a driving decision unit 3100 . The image sensor image data processing system 1001 can be connected in communication with the camera 2001 outside the driving system 3001, process and recognize a plurality of first image data sequentially generated by the camera 2001 scanning the acquisition area, and output the image recognition result. The driving decision unit 3100 is connected in communication with the image sensor image data processing system 1001, and is used for executing behavior decision and motion planning and outputting operation instructions according to the image recognition result output by the image sensor image data processing system 1001.
FIG. 14 schematically shows the structure of an intelligent connected vehicle V according to an implementation of the present application. The intelligent connected vehicle V comprises a camera 2001 typically arranged at the front of the vehicle, a driving system 3001 arranged inside the vehicle, an electronic control unit 4001, and an actuator 5001 such as a braking mechanism. The camera 2001 perceives the vehicle environment by scanning its acquisition area row by row and sequentially outputs a plurality of first image data. The driving system 3001 is communicatively connected to the camera 2001 and is configured to output operation instructions according to the plurality of first image data from the camera 2001. The electronic control unit (ECU) 4001 is communicatively connected to the driving system 3001 and is configured to control the actuator 5001 to perform operations according to operation instructions from the driving system 3001, for example to control the braking mechanism to perform a braking operation according to a braking instruction from the driving system.
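To make the data flow of FIGS. 13 and 14 concrete, here is a minimal, hypothetical sketch of the end-to-end chain; all class and method names (scan_rows, process_and_recognize, plan, execute) are assumptions of the example, and the single-threaded loop stands in for what would in practice be a pipelined, low-latency implementation.

```python
class DrivingSystem:
    """ADAS of FIG. 13: image data processing/recognition plus a decision unit."""

    def __init__(self, processing_system, decision_unit):
        self.processing_system = processing_system
        self.decision_unit = decision_unit

    def handle(self, first_image_data):
        result = self.processing_system.process_and_recognize(first_image_data)
        if result is not None:                       # a recognition result is ready
            return self.decision_unit.plan(result)   # behavior decision and motion planning
        return None

def vehicle_loop(camera, driving_system, ecu):
    # Camera -> driving system -> ECU -> actuator, as in FIG. 14.
    for first_image_data in camera.scan_rows():      # rows output one by one
        instruction = driving_system.handle(first_image_data)
        if instruction is not None:
            ecu.execute(instruction)                 # e.g., command the braking mechanism
```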
FIG. 15 is a schematic structural diagram of a computing device 1500 provided by an implementation of the present application. The computing device 1500 includes a processor 1510, a memory 1520, a communication interface 1530, and a bus 1540.
It should be understood that the communication interface 1530 in the computing device 1500 shown in FIG. 15 can be used to communicate with other devices.
The processor 1510 can be connected to the memory 1520, and the memory 1520 can be used to store program code and data. Accordingly, the memory 1520 may be a storage unit inside the processor 1510, an external storage unit independent of the processor 1510, or a component that includes both a storage unit inside the processor 1510 and an external storage unit independent of the processor 1510.
Optionally, the computing device 1500 may further include the bus 1540, through which the memory 1520 and the communication interface 1530 can be connected to the processor 1510. The bus 1540 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like, and can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one line is drawn in FIG. 15, but this does not mean that there is only one bus or only one type of bus.
It should be understood that in the implementations of the present application the processor 1510 may be a central processing unit (CPU). The processor may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor. Alternatively, the processor 1510 may use one or more integrated circuits to execute related programs so as to implement the technical solutions provided by the implementations of the present application.
The memory 1520 may include read-only memory and random access memory and provides instructions and data to the processor 1510. A part of the processor 1510 may also include non-volatile random access memory; for example, the processor 1510 may also store device type information.
When the computing device 1500 is running, the processor 1510 executes the computer-executable instructions in the memory 1520 to perform the operation steps of any of the above methods for processing image sensor image data.
In some other implementations, the communication interface 1530 and the bus 1540 are omitted.
It should be understood that the computing device 1500 according to the implementations of the present application may correspond to the entity that executes the methods according to the various implementations of the present application, and that the above and other operations and/or functions of the units in the computing device 1500 serve to realize the corresponding flows of those methods; for brevity, they are not repeated here.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the implementations disclosed herein can be realized in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are executed in hardware or in software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method implementations and are not repeated here.
In the several implementations provided in the present application, it should be understood that the disclosed systems, apparatuses, and methods may be realized in other ways. For example, the apparatus implementations described above are merely exemplary: the division into units is only a division by logical function, and other division strategies are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or of other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of these implementations.
In addition, the functional units in the implementations of the present application may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are realized in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part contributing to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the implementations of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The implementations of the present application further provide a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program performs a method of processing image sensor image data that includes at least one of the methods described in the foregoing implementations.
The computer storage medium of the implementations of the present application may use any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, a cloud, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program usable by, or in combination with, an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device.
The program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
Computer program code for carrying out the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred implementations of the present application and the technical principles applied. Those skilled in the art will understand that the present application is not limited to the specific implementations described herein, and that various obvious changes, readjustments, and substitutions can be made without departing from the protection scope of the present application. Therefore, although the present application has been described in some detail through the above implementations, it is not limited to them and may include further equivalent implementations without departing from the concept of the present application, all of which fall within the protection scope of the present application.

Claims (17)

  1. A method for processing image sensor image data, characterized by comprising:
    receiving first image data from an image sensor, the first image data being one of a plurality of image data that the image sensor can generate by scanning, within one scan period, the physical area corresponding to an acquisition area, the acquisition area representing the acquisition range of the image sensor;
    performing image processing on the first image data to obtain second image data; and
    outputting the second image data.
  2. The method according to claim 1, characterized in that the acquisition area comprises a plurality of sub-acquisition areas;
    the performing image processing on the first image data comprises:
    after the first image data contained in a first image data group A is received, performing image processing in units of all the first image data contained in the first image data group A, the first image data group A being the set of first image data generated by the image sensor scanning the physical area corresponding to one of the sub-acquisition areas.
  3. The method according to claim 2, characterized in that the size and number of the plurality of sub-acquisition areas are preset.
  4. A method for processing image sensor image data, characterized by comprising:
    sequentially receiving second image data, the second image data being obtained by performing image processing on first image data, the first image data being one of a plurality of image data that the image sensor can generate by scanning, within one scan period, the physical area corresponding to an acquisition area, the acquisition area representing the acquisition range of the image sensor;
    sequentially extracting feature data from the second image data;
    performing fusion recognition processing on the individual feature data; and
    outputting an image recognition result.
  5. The method according to claim 4, characterized in that the second image data is grouped; and
    the individual feature data is extracted from the respective groups of the second image data.
  6. The method according to claim 5, characterized in that the number of second image data in each group of the second image data is preset.
  7. An image data processing apparatus, characterized by comprising:
    a receiving module configured to receive first image data from an image sensor, the first image data being one of a plurality of image data that the image sensor can generate by scanning, within one scan period, the physical area corresponding to an acquisition area, the acquisition area representing the acquisition range of the image sensor; and
    an image processing module configured to perform image processing on the first image data to obtain second image data, and to output the second image data.
  8. The apparatus according to claim 7, characterized in that the acquisition area comprises a plurality of sub-acquisition areas;
    wherein, for the image processing of the first image data, the image processing module is further configured to, after the receiving module receives the first image data contained in a first image data group A, perform image processing in units of all the first image data contained in the first image data group A, the first image data group A being the set of first image data generated by the image sensor scanning the physical area corresponding to one of the sub-acquisition areas.
  9. The apparatus according to claim 8, characterized in that the size and number of the plurality of sub-acquisition areas are preset.
  10. An image recognition apparatus, characterized by comprising:
    a receiving module configured to sequentially receive second image data, the second image data being obtained by performing image processing on first image data, the first image data being one of a plurality of image data that the image sensor can generate by scanning, within one scan period, the physical area corresponding to an acquisition area, the acquisition area representing the acquisition range of the image sensor;
    a feature extraction module configured to sequentially extract feature data from the second image data; and
    a fusion recognition module configured to perform fusion recognition processing on the individual feature data, and to output an image recognition result.
  11. The apparatus according to claim 10, characterized in that the feature extraction module is further configured to group the second image data; and
    the individual feature data is extracted from the respective groups of the second image data.
  12. The apparatus according to claim 11, characterized in that the number of second image data in each group of the second image data is preset.
  13. An image sensor image data processing system, characterized by comprising:
    the image data processing apparatus according to any one of claims 7 to 9; and
    the image recognition apparatus according to any one of claims 10 to 12.
  14. A driving system, characterized by comprising the image sensor image data processing system according to claim 13 and a driving decision unit;
    wherein the driving decision unit is connected to the image sensor image data processing system and is configured to perform behavior decision-making and motion planning according to the image recognition result output by the image sensor image data processing system, and to output operation instructions.
  15. A vehicle, characterized by comprising an image sensor, the driving system according to claim 14, an electronic control unit, and an actuator, connected in sequence; wherein
    the image sensor is configured to perceive the vehicle environment in a scanning manner and to output first image data; and
    the electronic control unit is configured to control the actuator to perform an operation according to an operation instruction of the driving system.
  16. A computing device, characterized by comprising:
    at least one processor; and
    at least one memory connected to the at least one processor and storing program instructions that, when executed by the at least one processor, cause the at least one processor to perform the method according to any one of claims 1 to 3 or claims 4 to 6.
  17. A computer-readable storage medium having program instructions stored thereon, characterized in that the program instructions, when executed by a computer, cause the computer to perform the method according to any one of claims 1 to 3 or claims 4 to 6.
PCT/CN2021/131698 2021-02-07 2021-11-19 Method and apparatus for processing image data of image sensor WO2022166309A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110169585.XA CN114915731A (en) 2021-02-07 2021-02-07 Method and apparatus for processing image data of image sensor
CN202110169585.X 2021-02-07

Publications (1)

Publication Number Publication Date
WO2022166309A1 true WO2022166309A1 (en) 2022-08-11

Family

ID=82741838

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/131698 WO2022166309A1 (en) 2021-02-07 2021-11-19 Method and apparatus for processing image data of image sensor

Country Status (2)

Country Link
CN (1) CN114915731A (en)
WO (1) WO2022166309A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005092007A (en) * 2003-09-19 2005-04-07 Ricoh Co Ltd Image processing system, image processing method, program and information recording medium
CN101719985A (en) * 2009-11-17 2010-06-02 北京中星微电子有限公司 Method and device for collecting and processing images
CN102956027A (en) * 2011-08-30 2013-03-06 安凯(广州)微电子技术有限公司 Image enhancement processing system and method based on camera image processing chip
CN104902193A (en) * 2015-05-19 2015-09-09 上海集成电路研发中心有限公司 Method for performing segmentation processing and display for image data based on FPGA

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101819677B (en) * 2010-04-12 2012-06-13 中国科学院长春光学精密机械与物理研究所 Fusion processing system of multi-sensor images
CN103021069B (en) * 2012-11-21 2015-02-18 深圳市兆图电子有限公司 High-speed note image acquisition processing system and acquisition processing method thereof
DE102016202948A1 (en) * 2016-02-25 2017-08-31 Robert Bosch Gmbh Method and device for determining an image of an environment of a vehicle
CN108470166A (en) * 2018-04-04 2018-08-31 北京天目智联科技有限公司 A kind of biological characteristic 3D 4 D datas recognition methods and system based on laser scanning
CN112753212A (en) * 2018-09-26 2021-05-04 祖克斯有限公司 Image scan line time stamping
CN110833429B (en) * 2019-12-03 2024-03-26 上海联影医疗科技股份有限公司 Computed tomography imaging method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN114915731A (en) 2022-08-16


Legal Events

121: Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21924305; Country of ref document: EP; Kind code of ref document: A1)
NENP: Non-entry into the national phase (Ref country code: DE)
122: Ep: pct application non-entry in european phase (Ref document number: 21924305; Country of ref document: EP; Kind code of ref document: A1)