CN109034136B - Image processing method, image processing apparatus, image capturing device, and storage medium


Publication number
CN109034136B
Authority: CN (China)
Prior art keywords: interest, region, sub, current frame, resolution
Legal status: Active
Application number: CN201811039245.XA
Other languages: Chinese (zh)
Other versions: CN109034136A (en)
Inventors: 杨文龙, P·尼古拉斯
Current Assignee: Ecarx Hubei Tech Co Ltd
Original Assignee: Hubei Ecarx Technology Co Ltd
Application filed by Hubei Ecarx Technology Co Ltd
Priority to CN201811039245.XA
Publication of CN109034136A
Application granted
Publication of CN109034136B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/50: Context or environment of the image
    • G06V20/56: Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588: Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/25: Determination of region of interest [ROI] or a volume of interest [VOI]


Abstract

Embodiments of the present invention provide an image processing method, an image processing apparatus, an image capturing device, and a storage medium, relating to the field of image processing. The method comprises: determining a road surface region in the current frame picture; extracting a region of interest in the current frame picture, the region of interest containing the road surface region; judging whether the proportion of the current frame picture occupied by the region of interest is greater than a first preset value; when the proportion is greater than the first preset value, adjusting the resolution of the region of interest, the adjusted resolution being lower than the resolution before adjustment; and processing the resolution-adjusted region of interest with a preset deep learning model. The image processing method and apparatus, the image capturing device, and the storage medium of the embodiments reduce the amount of computation needed to process high-resolution pictures, increase computation speed, and save computing resources.

Description

Image processing method, image processing apparatus, image capturing device, and storage medium
Technical Field
The present invention relates to the field of image processing, and in particular, to an image processing method, an image processing apparatus, an image capturing device, and a storage medium.
Background
In fields such as driving assistance and automatic driving, visual detection tasks such as object detection, traffic-light recognition, and drivable-area detection need to be performed. However, because of limits on processor computation and power consumption, the original image cannot be fed directly to a model for processing; the resolution of the image generally has to be reduced first (for example, a 1080p image is downscaled to 416x416 so that it can be processed in real time).
Moreover, camera resolution keeps rising, and cameras of 8 megapixels or more are becoming common. Such cameras greatly extend the distance over which, and the accuracy with which, a visual algorithm can detect, but they also put pressure on the processing performance of the processor.
Disclosure of Invention
The aim of the present invention is to provide an image processing method, an image processing apparatus, an image capturing device, and a storage medium that reduce the amount of computation required to process a high-resolution picture, increase computation speed, and save computing resources.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
In a first aspect, an embodiment of the present invention provides an image processing method comprising: determining a road surface region in the current frame picture; extracting a region of interest in the current frame picture, the region of interest containing the road surface region; judging whether the proportion of the current frame picture occupied by the region of interest is greater than a first preset value; when the proportion is greater than the first preset value, adjusting the resolution of the region of interest, the adjusted resolution being lower than the resolution before adjustment; and processing the resolution-adjusted region of interest with a preset deep learning model.
In a second aspect, an embodiment of the present invention provides an image processing apparatus comprising: a road surface area identification module for determining a road surface region in the current frame picture; a region-of-interest extraction module for extracting a region of interest, containing the road surface region, in the current frame picture; a judgment module for judging whether the proportion of the current frame picture occupied by the region of interest is greater than a first preset value; a resolution adjustment module for adjusting the resolution of the region of interest when that proportion is greater than the first preset value, the adjusted resolution being lower than the resolution before adjustment; and an image processing module for processing the resolution-adjusted region of interest with a preset deep learning model.
In a third aspect, an embodiment of the present invention provides an image capturing device comprising a memory for storing one or more programs and a processor; when executed by the processor, the one or more programs implement the image processing method described above.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the image processing method described above.
Compared with the prior art, the image processing method, image processing apparatus, image capturing device, and storage medium of the embodiments of the present invention extract the region of interest in the current frame picture and, when the proportion of the picture occupied by the region of interest is judged to be greater than the first preset value, adjust its resolution before sending it to the preset deep learning model for processing. This reduces the amount of computation needed to process a high-resolution picture, increases computation speed, and saves computing resources.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be considered as limiting its scope; those skilled in the art can obtain other related drawings from them without inventive effort.
Fig. 1 is a schematic configuration diagram of an image pickup apparatus according to an embodiment of the present invention;
FIG. 2 shows a schematic flow chart of an image processing method provided by an embodiment of the invention;
FIG. 3 is a schematic diagram of a picture taken by the camera device during driving assistance;
FIG. 4 is a schematic flow chart of the substeps of step S100 in FIG. 2;
FIG. 5 is a diagram of semantic segmentation;
FIG. 6 is a schematic flow chart of the substeps of step S200 in FIG. 2;
FIG. 7 is a schematic diagram of a region of interest extraction step;
FIG. 8 is a schematic diagram of the generation of a plurality of sub-regions of interest;
fig. 9 is a schematic configuration diagram showing an image processing apparatus according to an embodiment of the present invention;
fig. 10 is a schematic configuration diagram showing a road surface area recognition module of an image processing apparatus according to an embodiment of the present invention;
fig. 11 shows a schematic structural diagram of a region-of-interest extraction module of an image processing apparatus according to an embodiment of the present invention.
In the figure: 10-an image pickup apparatus; 110-a memory; 120-a processor; 130-a memory controller; 140-peripheral interfaces; 150-a radio frequency unit; 160-communication bus/signal line; 170-a camera unit; 200-an image processing apparatus; 210-a road surface area identification module; 211-semantic segmentation processing unit; 212-road surface area determination unit; 220-region of interest extraction module; 221-region of interest selection unit; 222-region of interest update unit; 230-a judgment module; 240-resolution adjustment module; 250-an image processing module; 260-a sub-region of interest generating module; 270-a target region of interest determination module; 280-image fusion module.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another without necessarily requiring or implying any actual such relationship or order between them. The terms "comprises", "comprising", and any variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to it. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises it.
Some embodiments of the invention are described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
In view of the above defects in the prior art, the improvement provided by the embodiments of the present invention is as follows: extract the region of interest in the current frame picture and, when the proportion of the picture occupied by the region of interest is judged to be greater than a first preset value, adjust the resolution of the region of interest before sending it to a preset deep learning model for processing.
Referring to fig. 1, fig. 1 shows a schematic structural diagram of an image capturing device 10 according to an embodiment of the present invention. In this embodiment, the image capturing device 10 includes a memory 110, a memory controller 130, one or more processors 120 (only one is shown in the figure), a peripheral interface 140, a radio frequency unit 150, a camera unit 170, and the like. These components communicate with one another via one or more communication buses/signal lines 160.
The memory 110 can be used for storing software programs and modules, such as program instructions/modules corresponding to the image processing apparatus 200 provided by the embodiment of the present invention, and the processor 120 executes various functional applications and image processing, such as the image processing method provided by the embodiment of the present invention, by running the software programs and modules stored in the memory 110.
The Memory 110 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 120 may be an integrated circuit chip having signal processing capabilities. The Processor 120 may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), a voice Processor, a video Processor, and the like; but may also be a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components. The various methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. A general purpose processor may be a microprocessor or the processor 120 may be any conventional processor or the like.
The peripheral interface 140 couples various input/output devices to the processor 120 as well as to the memory 110. In some embodiments, peripheral interface 140, processor 120, and memory controller 130 may be implemented in a single chip. In other embodiments of the present invention, they may be implemented by separate chips.
The rf unit 150 is used for receiving and transmitting electromagnetic waves, and implementing interconversion between the electromagnetic waves and electrical signals, so as to communicate with a communication network or other devices.
The camera unit 170 is used to take a picture so that the processor 120 processes the taken picture.
It is to be understood that the configuration shown in fig. 1 is merely illustrative, and that the image pickup apparatus 10 may include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof.
Referring to fig. 2, fig. 2 is a schematic flowchart illustrating an image processing method according to an embodiment of the present invention, in which the image processing method includes the following steps:
Step S100: determine the road surface region in the current frame picture.
At present, in the field of video detection, the whole high-pixel picture is sent, after its resolution has been reduced, to a preset deep learning model for image processing. In the driving assistance case, however, the picture processed by the deep learning model actually contains a large amount of useless information.
For example, referring to fig. 3, a schematic diagram of a picture taken by the image capturing device 10 during driving assistance: only the information in the road area in the lower half of fig. 3 is actually useful for driving assistance of the vehicle, while the information in the sky area in the upper half is useless for it. During driving assistance there is therefore no need to process the upper-half sky in a picture such as fig. 3.
Therefore, in the embodiment of the present invention, the road surface region in the current frame picture is determined first, which in turn determines the area of the current frame picture that contains information useful for driving assistance.
Optionally, as one implementation, the road surface region in the current frame picture may be determined by distinguishing the sky region and the road surface region according to the coordinates and the pixel value of each pixel in the current frame picture; alternatively, a cluster-analysis algorithm such as k-means can be used to identify the road surface region.
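As a concrete sketch of the coordinate-and-pixel-value idea above, the following minimal two-cluster k-means (pure NumPy, not taken from the patent) separates a frame into a brighter upper "sky" cluster and a darker lower "road" cluster; the feature design and the deterministic initialisation are illustrative assumptions, not the patented method.

```python
import numpy as np

def estimate_road_mask(gray, iters=10):
    """Minimal 2-means over (normalised row coordinate, intensity) that
    splits a grayscale frame into a sky cluster and a road cluster.
    A rough sketch of the coordinate-plus-pixel-value idea in the text;
    a production system would use a proper clustering or segmentation step."""
    h, w = gray.shape
    rows = np.repeat(np.arange(h)[:, None], w, axis=1) / h   # y coordinate in 0..1
    feats = np.stack([rows.ravel(), gray.ravel() / 255.0], axis=1)
    # Deterministic init (an assumption): top-left pixel vs bottom-right pixel.
    centers = np.stack([feats[0], feats[-1]])
    for _ in range(iters):
        dist = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for k in range(2):
            if (labels == k).any():
                centers[k] = feats[labels == k].mean(0)
    road = centers[:, 0].argmax()   # the cluster centred lower in the image
    return (labels == road).reshape(h, w)
```

On a synthetic frame with a bright top half and dark bottom half, the returned mask covers the lower (road) rows.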
Optionally, as another implementation, determining the road surface area in the current frame picture may also be implemented by using a semantic segmentation technique. Referring to fig. 4, fig. 4 is a schematic flow chart of the sub-steps of step S100 in fig. 2, in the embodiment of the present invention, step S100 includes the following sub-steps:
Sub-step S110: obtain a semantic segmentation result of the current frame picture.
At present, semantic segmentation techniques (e.g. full-resolution residual networks, FRRN) are widely applied in the fields of automatic driving and assisted driving. For example, referring to fig. 5, a semantic segmentation schematic: fig. 5(1) is the original image and fig. 5(2) the semantic segmentation result, in which the sky, road, vehicles, construction, and so on of the original image are rendered in different colors.
Similarly, when the road surface region of the current frame picture is determined, a semantic segmentation result of the current frame picture is first obtained with a semantic segmentation algorithm preset in the image capturing device 10.
Sub-step S120: determine the road surface region in the semantic segmentation result of the current frame picture according to the semantic segmentation result of the previous frame picture.
It can be understood that the image capturing device 10 also stores the semantic segmentation result of the previous frame picture, obtained when the device processed that frame with the preset semantic segmentation algorithm.
From the semantic segmentation result of the previous frame picture, the image capturing device 10 learns which colors represent the sky, road, vehicles, construction, and so on. Accordingly, using those colors, the road surface region in the current frame picture can be determined from the semantic segmentation result of the current frame picture.
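Reading the road surface out of a colour-coded segmentation result can be sketched as a per-pixel colour match. The palette values below are hypothetical (Cityscapes-like); the patent does not prescribe particular colours.

```python
import numpy as np

# Hypothetical class colours; the actual palette depends on the segmentation model.
ROAD_COLOR = (128, 64, 128)
SKY_COLOR = (70, 130, 180)

def road_mask_from_segmentation(seg_rgb, road_color=ROAD_COLOR):
    """Boolean H x W mask of the pixels labelled with the road colour in a
    colour-coded semantic-segmentation image of shape (H, W, 3)."""
    return np.all(seg_rgb == np.asarray(road_color, dtype=seg_rgb.dtype), axis=-1)
```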
Step S200: extract the region of interest in the current frame picture.
As described above, in the current frame picture taken by the image pickup apparatus 10 at the time of driving assistance, the road surface region is a region that contains information useful for driving assistance, and therefore, the image pickup apparatus 10 extracts a region of interest (ROI) in the current frame picture according to the road surface region.
Optionally, as an implementation manner, please refer to fig. 6, fig. 6 is a schematic flowchart of the sub-steps of step S200 in fig. 2, in an embodiment of the present invention, step S200 includes the following sub-steps:
Sub-step S210: select, as the region of interest, the whole of the road surface region in the current frame picture according to a preset generation mode.
When the region of interest in the current frame picture is extracted, the whole road surface region in the current frame picture is selected as the region of interest according to a preset generation mode.
For example, referring to fig. 7, a schematic diagram of the region-of-interest extraction step: in fig. 7(1), a rectangular frame is drawn around the entire road surface region of the current frame picture, that is, the whole road surface region must fall within the rectangular frame, and everything enclosed by the rectangular frame is taken as the region of interest.
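The rectangular-frame selection of sub-step S210 amounts to taking the bounding rectangle of the road mask; the sketch below uses a half-open (x0, y0, x1, y1) convention, which is an implementation choice rather than something stated in the patent.

```python
import numpy as np

def roi_from_mask(mask):
    """Axis-aligned rectangle (x0, y0, x1, y1), half-open, that encloses
    every True pixel of the road mask - the initial region of interest."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None                      # no road surface found in this frame
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1
```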
Optionally, as an implementation, the step S200 further includes the following sub-steps:
Sub-step S220: adjust the range of the region of interest according to a preset scaling to update the region of interest.
To guard against the road surface region being reduced because vehicles, buildings, or other obstacles occlude it, the range of the extracted region of interest is adjusted with a preset scaling to add redundancy, and the region of interest is then updated. This ensures that the road surface region is completely enclosed by the region of interest and that the height of vehicles appearing on the road surface is covered.
For example, fig. 7(2) shows the result of enlarging the rectangular frame of fig. 7(1) by a preset scaling (for example, by 10%). The region of interest in fig. 7(2) is thus the redundancy-processed version of the one in fig. 7(1); it is then cropped to obtain the image data, shown in fig. 7(3), that the deep learning model processes.
With this design, the image processing method of the embodiment of the present invention performs redundancy processing by scaling and updating the region of interest, ensuring that the road surface region is completely enclosed by it.
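The redundancy processing of sub-step S220 might look as follows. The 10% default mirrors the example in the text; clamping the enlarged rectangle to the frame border is an added assumption.

```python
def expand_roi(roi, frame_w, frame_h, scale=0.10):
    """Enlarge an (x0, y0, x1, y1) rectangle by `scale` about its centre
    and clamp it to the frame, so an occluded road region is still fully
    enclosed. The 10% default mirrors the example in the text."""
    x0, y0, x1, y1 = roi
    dw = (x1 - x0) * scale / 2.0
    dh = (y1 - y0) * scale / 2.0
    return (max(0, int(x0 - dw)), max(0, int(y0 - dh)),
            min(frame_w, int(x1 + dw + 0.5)), min(frame_h, int(y1 + dh + 0.5)))
```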
Step S300: judge whether the proportion of the current frame picture occupied by the region of interest is greater than a first preset value. If yes, execute step S413; if no, execute step S421.
When a large occlusion exists in the current frame picture, for example when a large truck or other obstacle is directly in front of the image capturing device 10, the captured road surface region may be very small or even completely occluded. The resulting region of interest is then small, which impairs the detection accuracy of the deep learning model on it.
Therefore, after the region of interest in the current frame picture is obtained, whether its proportion of the current frame picture is greater than the first preset value is judged from the respective sizes of the region of interest and the picture. When the proportion is greater than the first preset value, the region of interest occupies a large enough part of the picture to meet the deep learning model's detection-accuracy requirement, and step S413 is executed; conversely, when the proportion is less than or equal to the first preset value, the region of interest is relatively small and cannot meet that requirement, and step S421 is executed.
In step S413, the resolution of the region of interest is adjusted.
When the proportion of the current frame picture occupied by the region of interest is judged to be greater than the first preset value, the image capturing device 10 concludes that the region of interest is sufficiently large and adjusts its resolution, so that the resolution-adjusted region of interest can be input to the preset deep learning model for detection. The adjusted resolution is lower than the resolution before adjustment.
Step S414: process the resolution-adjusted region of interest with the preset deep learning model.
With this design, the image processing method of the embodiment of the present invention extracts the region of interest in the current frame picture and, when the proportion of the picture it occupies is judged to be greater than the first preset value, adjusts its resolution before sending it to the preset deep learning model for processing.
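Steps S300, S413, and S421 can be condensed into one sketch. The 0.5 threshold is only a placeholder for the patent's unspecified "first preset value", and the nearest-neighbour resize is a dependency-free stand-in for a real resampler.

```python
import numpy as np

def nn_resize(img, out_h, out_w):
    """Nearest-neighbour resize, used only to keep the sketch free of
    external dependencies; a real pipeline would use a proper resampler."""
    ys = np.arange(out_h) * img.shape[0] // out_h
    xs = np.arange(out_w) * img.shape[1] // out_w
    return img[ys[:, None], xs]

def prepare_model_input(frame, roi, first_preset=0.5, out_size=(416, 416)):
    """If the ROI occupies more of the frame than `first_preset`, downscale
    the ROI crop for the model (step S413); otherwise downscale the whole
    frame instead (step S421). Returns the model input and the branch taken."""
    h, w = frame.shape[:2]
    x0, y0, x1, y1 = roi
    big_enough = (x1 - x0) * (y1 - y0) / float(w * h) > first_preset
    crop = frame[y0:y1, x0:x1] if big_enough else frame
    return nn_resize(crop, *out_size), big_enough
```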
As an implementation manner, referring to fig. 2 again, in an embodiment of the present invention, before performing step S413, the image processing method further includes the following steps:
step S411, according to the region of interest, generating a plurality of sub regions of interest according to a preset scaling manner.
When the region of interest is selected directly from the range of the road surface region, a target detection object on the road (for example, another car) may be far from the assisted vehicle. The object then occupies only a small part of the region of interest and becomes a "small object", which lowers its recognition accuracy, so that the image capturing device 10 may fail to recognize it.
Therefore, a plurality of sub-regions of interest are generated from the extracted region of interest according to a preset scaling mode, all sharing the same zoom point.
For example, referring to fig. 8, a schematic diagram of generating multiple sub-regions of interest: the area enclosed by the outermost rectangular frame is the selected region of interest, and the intersection obtained by extending the edge lines of the two sides of the road surface is used as the zoom point. The outermost rectangular frame is then repeatedly scaled according to the preset scaling mode, for example with different scaling ratios, yielding multiple rectangular frames; the area enclosed by each is one sub-region of interest.
Of course, in some other embodiments the zoom point may be selected differently, for example as the midpoint or an end point of an edge of the current region of interest, as long as a zoom point can be determined; it may also be obtained by shifting a preset fixed point or midpoint upward by some distance.
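Scaling the ROI towards a common zoom point is a similarity transform of its corners; the sketch below does exactly that. The particular scale list is an illustrative assumption, not a value from the patent.

```python
def sub_rois(roi, zoom_point, scales=(1.0, 0.75, 0.5, 0.25)):
    """Shrink an (x0, y0, x1, y1) ROI towards a shared zoom point, e.g. the
    intersection of the extended road-edge lines, producing nested
    sub-regions of interest. The scale list is illustrative only."""
    zx, zy = zoom_point
    x0, y0, x1, y1 = roi
    return [(zx + (x0 - zx) * s, zy + (y0 - zy) * s,
             zx + (x1 - zx) * s, zy + (y1 - zy) * s) for s in scales]
```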
In step S412, a target sub-region of interest among the plurality of sub-regions of interest is determined.
After multiple sub-regions of interest are obtained from the region of interest, a target sub-region of interest is determined among them according to the target detection object that the image capturing device 10 needs to detect, and that target sub-region is used as the region of interest whose resolution is adjusted in step S413. The target sub-region of interest is the smallest of the sub-regions of interest that contains the target detection object.
Optionally, as one embodiment, the target sub-region of interest may be determined as follows: traverse all sub-regions of interest, take all those in which the target detection object is recognized as candidate sub-regions of interest, and then select the target sub-region according to the fraction of the target recognition frame (the box around the recognized target detection object) that overlaps each sub-region. Specifically, for two adjacent candidate sub-regions of interest: when the target recognition frame is completely enclosed by the smaller sub-region, the smaller sub-region is taken as the target sub-region of interest; when the target recognition frame is not completely enclosed by the smaller sub-region but the overlapping area occupies more than a second preset value of the target recognition frame, the smaller of the two adjacent sub-regions is still selected as the target sub-region of interest; and when the overlapping area occupies less than or equal to the second preset value of the target recognition frame, the larger of the two adjacent sub-regions is selected as the target sub-region of interest.
With this design, the image processing method of the embodiment of the present invention uses, as the region of interest for image processing, the target sub-region determined among the multiple sub-regions obtained by scaling the region of interest in the preset scaling mode. This enlarges the pixel proportion of the target detection object within the region of interest and improves its detection and recognition rate.
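The adjacent-sub-region selection rule can be condensed into an overlap-fraction walk from the largest to the smallest sub-region. The 0.8 threshold below is a placeholder for the unspecified "second preset value"; a fully enclosed recognition frame gives fraction 1.0 and always steps down to the smaller sub-region.

```python
def overlap_fraction(box, roi):
    """Fraction of the target recognition frame's area lying inside a sub-ROI."""
    bx0, by0, bx1, by1 = box
    rx0, ry0, rx1, ry1 = roi
    iw = max(0.0, min(bx1, rx1) - max(bx0, rx0))
    ih = max(0.0, min(by1, ry1) - max(by0, ry0))
    area = (bx1 - bx0) * (by1 - by0)
    return iw * ih / area if area else 0.0

def pick_target_sub_roi(box, nested_rois, second_preset=0.8):
    """Walk nested sub-ROIs from largest to smallest, keeping the smaller
    one while it still holds more than `second_preset` of the recognition
    frame; otherwise stop and keep the larger one."""
    chosen = nested_rois[0]
    for roi in nested_rois[1:]:
        if overlap_fraction(box, roi) > second_preset:
            chosen = roi
        else:
            break
    return chosen
```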
Referring to fig. 2, in the embodiment of the present invention, when it is determined according to step S300 that the ratio of the region of interest to the current frame picture is smaller than or equal to the first preset value, the image processing method further includes the following steps:
in step S421, the resolution of the current frame picture is adjusted.
That is, when step S300 determines that the proportion of the region of interest in the current frame picture is smaller than or equal to the first preset value, the image capturing apparatus 10 concludes that the region of interest occupies only a small part of the current frame picture. In this case, the whole current frame picture is treated as the region of interest and its resolution is adjusted, so that the resolution-adjusted current frame picture can be input to the preset deep learning model for detection processing; the adjusted resolution is smaller than the resolution before adjustment.
Step S422, the current frame picture with the resolution adjusted is processed according to a preset deep learning model.
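The two branches of steps S421/S422 versus S411/S414 can be summarized in one decision function. A minimal sketch under stated assumptions: images are described only by their dimensions, the `0.5` values for the first preset threshold and the downscale factor are illustrative, and `choose_processing_input` is a name invented for this example.

```python
def choose_processing_input(frame_size, roi, first_preset=0.5, scale=0.5):
    """Decide what to feed the detector and at what resolution.

    frame_size: (W, H) of the current frame picture.
    roi: (x1, y1, x2, y2) region of interest within the frame.
    Returns ("roi" | "frame", (target_w, target_h)).
    """
    W, H = frame_size
    roi_area = (roi[2] - roi[0]) * (roi[3] - roi[1])
    ratio = roi_area / (W * H)
    if ratio > first_preset:
        # Large ROI: crop it out and downscale only the ROI (steps S411/S414).
        w, h = roi[2] - roi[0], roi[3] - roi[1]
        return "roi", (int(w * scale), int(h * scale))
    # Small ROI: treat the whole frame as the ROI and downscale it
    # (steps S421/S422).
    return "frame", (int(W * scale), int(H * scale))
```

Either way the deep learning model always receives an input smaller than the original high-resolution frame, which is the source of the computational savings claimed in the summary.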
Although the region of interest processed in step S414 according to the preset deep learning model is only part of the current frame picture, the picture that the image capturing apparatus 10 displays to the user is generally the complete current frame picture, and continuous tracking of the target object likewise requires a sequence of complete pictures. Therefore, as an implementation, and with continued reference to fig. 2, the image processing method in the embodiment of the present invention further includes the following steps:
Step S500: restore the resolution of the processed region of interest to the resolution before adjustment.

Step S600: fuse the region of interest whose resolution has been restored with the current frame picture.
After restoration, the resolution of the region of interest is the same as that of the current frame picture; fusing the restored region of interest with the current frame picture then enables continuous identification of the target detection object across consecutive frames. In other words, from the first frame picture to the Nth frame picture, the image capturing apparatus 10 fuses the region of interest extracted from the first frame picture back into the first frame picture when processing the first frame, and fuses the region of interest extracted from the Nth frame picture back into the Nth frame picture. That is, the region of interest extracted from each frame picture is fused back into that same frame picture.
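The fusion in step S600 amounts to pasting the processed patch back at its original coordinates once the resolutions match. A minimal sketch, assuming images are row-major nested lists of pixel values and that the patch has already been resized back to the ROI's original resolution; the function name is an assumption for this example.

```python
def fuse_roi(frame, roi_patch, x1, y1):
    """Paste a processed ROI patch back into the frame at offset (x1, y1).

    Returns a new frame; the input frame is left untouched.
    """
    fused = [row[:] for row in frame]          # shallow copy of each pixel row
    for dy, row in enumerate(roi_patch):
        for dx, px in enumerate(row):
            fused[y1 + dy][x1 + dx] = px       # overwrite with processed pixels
    return fused
```

In a real pipeline the same operation is typically a single array slice assignment (e.g. `frame[y1:y2, x1:x2] = patch` with NumPy), but the explicit loop makes the coordinate bookkeeping visible.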
Referring to fig. 9, fig. 9 is a schematic structural diagram of an image processing apparatus 200 according to an embodiment of the present invention, in which the image processing apparatus 200 includes a road surface area identification module 210, a region of interest extraction module 220, a determination module 230, a resolution adjustment module 240, and an image processing module 250.
The road surface area identification module 210 is configured to determine a road surface area in the current frame picture.
Optionally, as an implementation, referring to fig. 10, which shows a schematic structural diagram of the road surface area identification module 210 of an image processing apparatus 200 according to an embodiment of the present invention, the road surface area identification module 210 includes a semantic segmentation processing unit 211 and a road surface area determination unit 212.
The semantic segmentation processing unit 211 is configured to obtain a semantic segmentation result of the current frame picture.
The road surface region determining unit 212 is configured to determine a road surface region in the semantic segmentation result of the current frame picture according to the semantic segmentation result of the previous frame picture.
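One way to realize units 211/212 is sketched below. The patent states only that the previous frame's segmentation result guides the road-region determination in the current frame; the per-pixel label map representation, the `road_id` class index, the IoU consistency check, and the `0.3` threshold are all assumptions introduced for this illustration.

```python
def road_mask_from_segmentation(labels, prev_mask, road_id=1, min_iou=0.3):
    """Extract the road-surface mask from a per-pixel label map, using the
    previous frame's mask as a temporal consistency check."""
    # Binary mask of pixels classified as road in the current frame.
    mask = [[1 if v == road_id else 0 for v in row] for row in labels]
    inter = sum(a & b for mr, pr in zip(mask, prev_mask) for a, b in zip(mr, pr))
    union = sum(a | b for mr, pr in zip(mask, prev_mask) for a, b in zip(mr, pr))
    iou = inter / union if union else 0.0
    # If the new mask disagrees strongly with the previous frame, fall back
    # to the previous frame's mask (a simple temporal-smoothing choice).
    return mask if iou >= min_iou else prev_mask
```

Consecutive road-scene frames change little, so a large frame-to-frame drop in mask overlap is more likely a segmentation failure than a real scene change, which motivates the fallback.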
Referring to fig. 9, the region-of-interest extracting module 220 is configured to extract a region of interest in the current frame picture, where the region of interest includes the road surface region.
Optionally, as an implementation manner, referring to fig. 11, fig. 11 shows a schematic structural diagram of a region of interest extracting module 220 of an image processing apparatus 200 according to an embodiment of the present invention, in which the region of interest extracting module 220 includes a region of interest selecting unit 221 and a region of interest updating unit 222.
The region-of-interest selecting unit 221 is configured to select, as the region of interest, the entire area occupied by the road surface region in the current frame picture according to a preset generation manner.
The region of interest updating unit 222 is configured to adjust the range of the region of interest according to a preset scaling ratio to update the region of interest.
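The updating performed by unit 222 can be sketched as enlarging the ROI box by a preset scaling ratio about its centre, clipped to the frame, so the road surface stays fully enclosed even if the segmentation under-shoots. The box representation, the centre as scaling anchor, and the function name are assumptions for this example; the patent fixes only that a preset scaling ratio is applied.

```python
def expand_roi(roi, scale, frame_size):
    """Enlarge an ROI box about its centre by `scale` (e.g. 1.5 adds a 50%
    margin on each dimension) and clip the result to the frame bounds."""
    x1, y1, x2, y2 = roi
    W, H = frame_size
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2        # centre of the ROI
    hw, hh = (x2 - x1) * scale / 2, (y2 - y1) * scale / 2
    return (max(0, int(cx - hw)), max(0, int(cy - hh)),
            min(W, int(cx + hw)), min(H, int(cy + hh)))
```

This is the redundancy processing mentioned in the summary: a modest margin costs a few extra pixels but guarantees the road surface region is completely surrounded by the region of interest.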
Referring to fig. 9, the determining module 230 is configured to determine whether a ratio of the range of the region of interest in the current frame picture is greater than a first preset value.
The resolution adjusting module 240 is configured to adjust the resolution of the region of interest when the ratio of the range of the region of interest in the current frame picture is greater than the first preset value, where the adjusted resolution is smaller than the resolution before adjustment.
The image processing module 250 is configured to process the region of interest after the resolution is adjusted according to a preset deep learning model.
As an embodiment, please continue to refer to fig. 9, the resolution adjustment module 240 is further configured to adjust the resolution of the current frame picture when the ratio of the region of interest to the current frame picture is smaller than or equal to the first preset value.
The image processing module 250 is further configured to process the current frame picture with the adjusted resolution according to a preset deep learning model.
In one embodiment, with continued reference to fig. 9, the image processing apparatus 200 further includes a sub-region of interest generating module 260 and a target region of interest determining module 270.
The sub-region of interest generating module 260 is configured to generate a plurality of sub-regions of interest according to the region of interest in a preset scaling manner, where the plurality of sub-regions of interest have the same scaling point.
The target interest region determining module 270 is configured to determine a target sub-region of interest among the plurality of sub-regions of interest, where the target sub-region of interest is the smallest region among the plurality of sub-regions of interest that contains the target detection object, and it serves as the region of interest in the step of adjusting the resolution of the region of interest.
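The generation performed by module 260 can be sketched as producing nested boxes that all shrink toward one shared scaling point. The patent fixes only that the sub-regions share a scaling point; taking that point to be the ROI centre, and the particular scale list, are assumptions for this illustration.

```python
def generate_sub_rois(roi, scales=(1.0, 0.75, 0.5)):
    """Generate nested sub-ROIs that all shrink toward the same scaling
    point, here taken to be the centre of the ROI."""
    x1, y1, x2, y2 = roi
    px, py = (x1 + x2) / 2, (y1 + y2) / 2        # shared scaling point
    subs = []
    for s in scales:
        # Move each corner toward (px, py) by the scale factor s.
        subs.append((int(px + (x1 - px) * s), int(py + (y1 - py) * s),
                     int(px + (x2 - px) * s), int(py + (y2 - py) * s)))
    return subs
```

Because every sub-region is scaled about the same point, the boxes are strictly nested, which is what lets module 270 pick the smallest one that still contains the target detection object.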
As an embodiment, with continued reference to fig. 9, in the embodiment of the present invention, the image processing apparatus 200 further includes an image fusion module 280.
The resolution adjustment module 240 is further configured to restore the resolution of the processed region of interest to the resolution before the adjustment.
The image fusion module 280 is configured to fuse the region of interest with restored resolution with the current frame picture.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative and, for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, each functional module in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiment of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In summary, in the image processing method, image processing apparatus, image capturing device, and storage medium provided by the embodiments of the present invention, the region of interest in the current frame picture is obtained by extraction, and when the proportion of the region of interest in the current frame picture is determined to be greater than the first preset value, the resolution of the region of interest is adjusted before the region is sent to the preset deep learning model for processing. Compared with the prior art, this reduces the amount of calculation required to process a high-resolution picture, increases the calculation speed, and saves calculation resources. Using, as the region of interest for image processing, the target sub-region of interest determined from the plurality of sub-regions of interest obtained by scaling the region of interest in the preset scaling manner enlarges the pixel proportion of the target detection object within the region of interest and improves its detection and recognition rate. In addition, the region of interest is expanded and updated according to the preset scaling ratio as a redundancy measure, so that the road surface region is completely enclosed by the region of interest.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (7)

1. An image processing method, characterized in that the method comprises:
determining a road surface area in the current frame picture;
extracting an interested area in the current frame picture, wherein the interested area contains the road surface area;
judging whether the proportion of the range of the region of interest in the current frame picture is larger than a first preset value;
when the proportion of the range of the region of interest in the current frame picture is larger than the first preset value, adjusting the resolution of the region of interest, wherein the adjusted resolution is smaller than the resolution before adjustment;
detecting the region of interest with the adjusted resolution according to a preset deep learning model;
the step of extracting the region of interest in the current frame picture includes:
selecting, as the region of interest, the entire area occupied by the road surface area in the current frame picture according to a preset generation manner;
expanding the range of the region of interest according to a preset scaling so as to update the region of interest;
prior to the step of adjusting the resolution of the region of interest, the method further comprises:
generating a plurality of sub interest areas according to the interest areas and a preset scaling mode, wherein the sub interest areas have the same scaling point;
traversing all the sub-regions of interest, taking every sub-region of interest in which the target detection object is identified as a candidate sub-region of interest, and selecting the target sub-region of interest according to the proportion, in the target identification frame, of the overlapping region between the target identification frame of the identified target detection object and each sub-region of interest; wherein, for two adjacent sub-regions of interest among the candidate sub-regions of interest: when the target identification frame is completely enclosed by the smaller sub-region of interest, the smaller sub-region of interest is taken as the target sub-region of interest; when the target identification frame is not completely enclosed by the smaller sub-region of interest, an overlapping region exists between the target identification frame and the smaller sub-region of interest, and the proportion of the overlapping region in the target identification frame is greater than a second preset value, the smaller of the two adjacent sub-regions of interest is selected as the target sub-region of interest; and when the target identification frame is not completely enclosed by the smaller sub-region of interest, an overlapping region exists between the target identification frame and the smaller sub-region of interest, and the proportion of the overlapping region in the target identification frame is smaller than or equal to the second preset value, the larger of the two adjacent sub-regions of interest is selected as the target sub-region of interest;
wherein the target sub-region of interest is a minimum region containing a target detection object among the plurality of sub-regions of interest, and the target sub-region of interest is a sub-region of interest in the step of adjusting the resolution of the region of interest.
2. The method of claim 1, wherein the step of determining the road surface area in the current frame picture comprises:
obtaining a semantic segmentation result of the current frame picture;
and determining the road surface area in the semantic segmentation result of the current frame picture according to the semantic segmentation result of the previous frame picture.
3. The method of claim 1, wherein the method further comprises:
when the proportion of the range of the region of interest in the current frame picture is smaller than or equal to the first preset value, adjusting the resolution of the current frame picture;
and processing the current frame picture after the resolution ratio is adjusted according to a preset deep learning model.
4. The method of claim 1, wherein the method further comprises:
restoring the resolution of the processed region of interest to the resolution before adjustment;
and fusing the region of interest with the current frame picture after the resolution is restored.
5. An image processing apparatus, characterized in that the apparatus comprises:
the road surface area identification module is used for determining a road surface area in the current frame picture;
the interested region extracting module is used for extracting an interested region in the current frame picture, wherein the interested region comprises the road surface region;
the judging module is used for judging whether the proportion of the range of the region of interest in the current frame picture is larger than a first preset value;
the resolution adjusting module is used for adjusting the resolution of the region of interest when the proportion of the range of the region of interest in the current frame picture is larger than the first preset value, wherein the adjusted resolution is smaller than the resolution before adjustment;
the image processing module is used for detecting the region of interest after the resolution is adjusted according to a preset deep learning model;
the region of interest extraction module comprises a region of interest selection unit and a region of interest updating unit;
the region-of-interest selecting unit is used for selecting, as the region of interest, the entire area occupied by the road surface region in the current frame picture according to a preset generation manner;
the interesting region updating unit is used for expanding the range of the interesting region according to a preset scaling so as to update the interesting region;
the image processing device also comprises a sub interest region generation module and a target interest region determination module;
the sub interest region generation module is used for generating a plurality of sub interest regions according to the interest regions and a preset scaling mode, wherein the sub interest regions have the same scaling point;
the target interest region determining module is used for traversing all the sub-regions of interest, taking every sub-region of interest in which the target detection object is identified as a candidate sub-region of interest, and selecting the target sub-region of interest according to the proportion, in the target identification frame, of the overlapping region between the target identification frame of the identified target detection object and each sub-region of interest; wherein, for two adjacent sub-regions of interest among the candidate sub-regions of interest: when the target identification frame is completely enclosed by the smaller sub-region of interest, the smaller sub-region of interest is taken as the target sub-region of interest; when the target identification frame is not completely enclosed by the smaller sub-region of interest, an overlapping region exists between the target identification frame and the smaller sub-region of interest, and the proportion of the overlapping region in the target identification frame is greater than a second preset value, the smaller of the two adjacent sub-regions of interest is selected as the target sub-region of interest; and when the target identification frame is not completely enclosed by the smaller sub-region of interest, an overlapping region exists between the target identification frame and the smaller sub-region of interest, and the proportion of the overlapping region in the target identification frame is smaller than or equal to the second preset value, the larger of the two adjacent sub-regions of interest is selected as the target sub-region of interest;
wherein the target sub-region of interest is a minimum region containing a target detection object among the plurality of sub-regions of interest, and the target sub-region of interest is a sub-region of interest in the step of adjusting the resolution of the region of interest.
6. An image pickup apparatus characterized by comprising:
a memory for storing one or more programs;
a processor;
the one or more programs, when executed by the processor, implement the method of any of claims 1-4.
7. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-4.
CN201811039245.XA 2018-09-06 2018-09-06 Image processing method, image processing apparatus, image capturing device, and storage medium Active CN109034136B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811039245.XA CN109034136B (en) 2018-09-06 2018-09-06 Image processing method, image processing apparatus, image capturing device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811039245.XA CN109034136B (en) 2018-09-06 2018-09-06 Image processing method, image processing apparatus, image capturing device, and storage medium

Publications (2)

Publication Number Publication Date
CN109034136A CN109034136A (en) 2018-12-18
CN109034136B true CN109034136B (en) 2021-07-20

Family

ID=64623889

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811039245.XA Active CN109034136B (en) 2018-09-06 2018-09-06 Image processing method, image processing apparatus, image capturing device, and storage medium

Country Status (1)

Country Link
CN (1) CN109034136B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111461104B (en) * 2019-01-22 2024-04-09 北京京东乾石科技有限公司 Visual recognition method, device, equipment and storage medium
CN109871814B (en) * 2019-02-22 2022-06-21 成都旷视金智科技有限公司 Age estimation method and device, electronic equipment and computer storage medium
CN111612726A (en) * 2019-02-26 2020-09-01 广州慧睿思通信息科技有限公司 Image data screening method and device, computer equipment and storage medium
CN110490910A (en) * 2019-08-13 2019-11-22 顺丰科技有限公司 Object detection method, device, electronic equipment and storage medium
CN110572579B (en) * 2019-09-30 2021-09-14 联想(北京)有限公司 Image processing method and device and electronic equipment
CN111311547A (en) * 2020-01-20 2020-06-19 北京航空航天大学 Ultrasonic image segmentation device and ultrasonic image segmentation method
CN112702568B (en) * 2020-12-14 2023-11-24 广东荣文科技集团有限公司 Abnormality detection method and related device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897993A (en) * 2017-01-12 2017-06-27 华东师范大学 The construction method of probability collection of illustrative plates is rolled into a ball based on quantitative susceptibility imaging human brain gray matter core
CN107066990A (en) * 2017-05-04 2017-08-18 厦门美图之家科技有限公司 A kind of method for tracking target and mobile device
CN107292304A (en) * 2017-06-02 2017-10-24 国芯科技(北京)有限公司 The method and device that a kind of area-of-interest is determined
CN107944351A (en) * 2017-11-07 2018-04-20 深圳市易成自动驾驶技术有限公司 Image-recognizing method, device and computer-readable recording medium
CN108230355A (en) * 2017-06-14 2018-06-29 北京市商汤科技开发有限公司 Target following and neural network training method, device, storage medium and electronic equipment
CN108460787A (en) * 2018-03-06 2018-08-28 北京市商汤科技开发有限公司 Method for tracking target and device, electronic equipment, program, storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5565041B2 (en) * 2010-03-30 2014-08-06 ソニー株式会社 Image processing apparatus and method, and program
CN102831617A (en) * 2012-07-17 2012-12-19 聊城大学 Method and system for detecting and tracking moving object
US9743824B2 (en) * 2013-09-04 2017-08-29 Siemens Medical Solutions Usa, Inc. Accurate and efficient polyp detection in wireless capsule endoscopy images
CN105512649A (en) * 2016-01-22 2016-04-20 大连楼兰科技股份有限公司 Method for positioning high-definition video real-time number plate based on color space
CN106682409B (en) * 2016-12-20 2020-03-31 上海联影医疗科技有限公司 Sampling method, radiotherapy plan optimization method and dose calculation method
CN108307113B (en) * 2018-01-26 2020-10-09 北京图森智途科技有限公司 Image acquisition method, image acquisition control method and related device
CN108427951B (en) * 2018-02-08 2023-08-04 腾讯科技(深圳)有限公司 Image processing method, device, storage medium and computer equipment

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897993A (en) * 2017-01-12 2017-06-27 华东师范大学 The construction method of probability collection of illustrative plates is rolled into a ball based on quantitative susceptibility imaging human brain gray matter core
CN107066990A (en) * 2017-05-04 2017-08-18 厦门美图之家科技有限公司 A kind of method for tracking target and mobile device
CN107292304A (en) * 2017-06-02 2017-10-24 国芯科技(北京)有限公司 The method and device that a kind of area-of-interest is determined
CN108230355A (en) * 2017-06-14 2018-06-29 北京市商汤科技开发有限公司 Target following and neural network training method, device, storage medium and electronic equipment
CN107944351A (en) * 2017-11-07 2018-04-20 深圳市易成自动驾驶技术有限公司 Image-recognizing method, device and computer-readable recording medium
CN108460787A (en) * 2018-03-06 2018-08-28 北京市商汤科技开发有限公司 Method for tracking target and device, electronic equipment, program, storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Content based image retrieval based on relative locations of multiple regions of interest using selective regions matching;Nishant Shrivastava et al;《Information Sciences》;20140220;第259卷;212-224 *
基于Curvelet变换和SPIHT算法的医学图像感兴趣区压缩;陈秀梅等;《中国医学影像学杂志》;20141231;第000卷(第010期);786-792 *
复杂交通视频场景中的车辆轨迹提取及行为分析;卢胜男;《中国博士学位论文全文数据库 信息科技辑》;20170515(第5期);I138-16 *

Also Published As

Publication number Publication date
CN109034136A (en) 2018-12-18

Similar Documents

Publication Publication Date Title
CN109034136B (en) Image processing method, image processing apparatus, image capturing device, and storage medium
US11062167B2 (en) Object detection using recurrent neural network and concatenated feature map
EP1329850B1 (en) Apparatus, program and method for detecting both stationary objects and moving objects in an image
CN111797657A (en) Vehicle peripheral obstacle detection method, device, storage medium, and electronic apparatus
CN112132156A (en) Multi-depth feature fusion image saliency target detection method and system
WO2022217876A1 (en) Instance segmentation method and apparatus, and electronic device and storage medium
CN112784750B (en) Fast video object segmentation method and device based on pixel and region feature matching
CN110675407A (en) Image instance segmentation method and device, electronic equipment and storage medium
CN114267041B (en) Method and device for identifying object in scene
CN111814593A (en) Traffic scene analysis method and device, and storage medium
KR20140038436A (en) Method and device for retargeting a 3d content
CN111598088B (en) Target detection method, device, computer equipment and readable storage medium
CN117218622A (en) Road condition detection method, electronic equipment and storage medium
CN112446241A (en) Method and device for obtaining characteristic information of target object and electronic equipment
Yasmin et al. Small obstacles detection on roads scenes using semantic segmentation for the safe navigation of autonomous vehicles
CN112241963A (en) Lane line identification method and system based on vehicle-mounted video and electronic equipment
CN113312949A (en) Video data processing method, video data processing device and electronic equipment
CN115147809B (en) Obstacle detection method, device, equipment and storage medium
CN116434156A (en) Target detection method, storage medium, road side equipment and automatic driving system
CN115965531A (en) Model training method, image generation method, device, equipment and storage medium
WO2022033088A1 (en) Image processing method, apparatus, electronic device, and computer-readable medium
CN113408456A (en) Environment perception algorithm, system, device, electronic equipment and storage medium
Choi et al. Lane detection using labeling based RANSAC algorithm
Singhal et al. Real-time lane detection, fitting and navigation for unstructured environments
Thayalan et al. Multifocus object detector for vehicle tracking in smart cities using spatiotemporal attention map

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220323

Address after: 430090 No. b1336, chuanggu startup area, taizihu cultural Digital Creative Industry Park, No. 18, Shenlong Avenue, Wuhan Economic and Technological Development Zone, Hubei Province

Patentee after: Yikatong (Hubei) Technology Co.,Ltd.

Address before: 430000 no.c101, chuanggu start up area, taizihu cultural Digital Industrial Park, No.18 Shenlong Avenue, Wuhan Economic and Technological Development Zone, Hubei Province

Patentee before: HUBEI ECARX TECHNOLOGY Co.,Ltd.