CN116403148A - Image processing method, device, camera and readable medium - Google Patents

Image processing method, device, camera and readable medium

Info

Publication number
CN116403148A
CN116403148A
Authority
CN
China
Prior art keywords
image frame
image
roi
frame
camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111622551.8A
Other languages
Chinese (zh)
Inventor
黄进新 (Huang Jinxin)
季军 (Ji Jun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN202111622551.8A
Publication of CN116403148A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/017 Detecting movement of traffic to be counted or controlled identifying vehicles
    • G08G1/0175 Detecting movement of traffic to be counted or controlled identifying vehicles by photographing vehicles, e.g. when violating traffic rules
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Studio Devices (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the present application provides an image processing method, comprising the following steps: acquiring a first image frame and a second image frame captured by a camera, wherein the brightness of the first image frame is higher than that of the second image frame, and both frames contain a first moving object; and, when the image quality of a second region of interest (ROI) in the second image frame is better than that of a first ROI in the first image frame, fusing the second ROI into the first image frame to obtain a target image. The method registers and fuses a local region across multiple frames, merging a well-exposed region of interest (such as a license plate) from one frame into another frame whose overall brightness is better, which solves the license plate overexposure problem in traffic scenes while keeping the face clear. In addition, because registration and fusion are local rather than full-frame, the method reduces the heavy computation and image trailing that ultra-wide dynamic range processing suffers in high-speed motion scenes.

Description

Image processing method, device, camera and readable medium
Technical Field
The present disclosure relates to the field of machine vision, and in particular, to an image processing method, an image processing device, a camera, and a readable medium.
Background
The main functions of ITS checkpoint capture equipment are video surveillance and vehicle snapshot: video surveillance only needs to show the license plate clearly, whereas a vehicle snapshot must also show the face inside the vehicle clearly. Current license plate reflective films have a high reflection coefficient: their bright color provides an obvious warning in the daytime, and at night or in weak light they effectively enhance recognition, helping people see objects clearly and stay alert, thereby avoiding accidents, reducing casualties, and limiting economic losses. Window glass, by contrast, transmits only about 70% of light because of its coating; since light reflected from the face must pass through the glass twice, roughly 50% remains (0.7 × 0.7 ≈ 0.49), and because a face reflects far less light than a license plate, the effective face signal is only about 6%. Therefore, video surveillance, which targets the license plate, needs relatively weak fill light, while vehicle snapshots, which target the face in the vehicle, need relatively strong fill light.
In the daytime, ambient light is good, so video surveillance needs no fill light; however, window glass reflects strongly, so capturing a clear in-vehicle face that meets face recognition requirements demands strong white fill light, and the industry currently reaches fill-light intensities of 20,000 lux using xenon strobe flash. At night, ambient light is very weak or absent: video surveillance fill light is generally 5 to 10 lux, while the snapshot fill light for viewing the face in the vehicle is generally 50 to 100 lux. Because the snapshot fill light is strong, the license plate is easily overexposed when the vehicle is captured.
The prior art uses either single-frame image enhancement or multi-frame image fusion enhancement, and both have drawbacks. (1) Single-frame image enhancement: contrast and brightness enhancement is applied to the license plate background and characters to recover as much of the non-overexposed license plate information in the original signal as possible and to improve plate contrast; however, this cannot fix a scene that is already overexposed in the RAW domain. (2) Multi-frame image fusion enhancement: license plate images captured with multiple exposures keep the plate information from overexposing in the RAW domain, and the whole frames are then fused into a wide dynamic range image; however, this approach has a high computational cost and the fusion result risks trailing artifacts.
Disclosure of Invention
In a first aspect, an embodiment of the present application provides a method for image processing, including: acquiring a first image frame and a second image frame shot by a camera, wherein the brightness of the first image frame is higher than that of the second image frame, and the first image frame and the second image frame both comprise a first moving object; and when the image quality of a second region of interest (ROI) in the second image frame is better than that of a first ROI in the first image frame, fusing the second ROI into the first image frame to obtain a target image.
The method registers and fuses a local region across multiple frames, merging a well-exposed region of interest (such as a license plate) from one frame into another frame whose overall brightness is better, thereby solving the license plate overexposure problem in traffic scenes.
In a possible implementation manner, the first moving object is a first vehicle, and the first ROI and the second ROI are image areas where license plates of the first vehicle are located.
In a possible implementation, the first image frame is an image frame captured by the camera with fill light, the second image frame is an image frame captured by the camera with no fill light or weak fill light, and the first image frame and the second image frame are adjacent image frames.
In a possible implementation, the exposure time of the first image frame is longer than the exposure time of the second image frame, and the first image frame and the second image frame are adjacent image frames.
In a possible implementation manner, license plates in the first image frame and the second image frame are detected respectively, so as to obtain the first ROI and the second ROI.
In a possible implementation, fusing the second ROI into the first image frame comprises: performing local image registration between the second image frame and the first image frame to obtain registration data of the ROI region, and fusing the second ROI into the first image frame according to the registration data.
In a possible implementation, the image quality of the second region of interest (ROI) in the second image frame being better than that of the first ROI of the first image frame comprises: the brightness of the license plate region of the first image frame is above the normal range, while the brightness of the license plate region of the second image frame is within the normal range.
In a possible implementation, the first image frame is output as the target image when the image quality of the second region of interest (ROI) in the second image frame is not better than that of the first ROI of the first image frame.
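For illustration only, the following minimal Python sketch shows the decision flow of the first aspect. The saturation heuristic, thresholds, and naive paste are assumptions of this sketch, not requirements of the method (OpenCV and NumPy are likewise assumed); the registration and fusion steps detailed later replace the naive paste.

```python
import cv2
import numpy as np

def plate_overexposed(gray_roi, sat_level=250, max_frac=0.05):
    # Assumed heuristic: ROI is overexposed if >5% of pixels are saturated.
    return float((gray_roi >= sat_level).mean()) > max_frac

def process_pair(bright_frame, dark_frame, box1, box2):
    """box1/box2: (x, y, w, h) plate boxes from a detector (not shown here)."""
    x1, y1, w1, h1 = box1
    x2, y2, w2, h2 = box2
    roi1 = cv2.cvtColor(bright_frame[y1:y1+h1, x1:x1+w1], cv2.COLOR_BGR2GRAY)
    roi2 = cv2.cvtColor(dark_frame[y2:y2+h2, x2:x2+w2], cv2.COLOR_BGR2GRAY)
    if plate_overexposed(roi1) and not plate_overexposed(roi2):
        # Naive paste standing in for the registration + fusion detailed below.
        patch = cv2.resize(dark_frame[y2:y2+h2, x2:x2+w2], (w1, h1))
        out = bright_frame.copy()
        out[y1:y1+h1, x1:x1+w1] = patch
        return out
    return bright_frame  # fallback: output the bright frame as the target image
```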
In a second aspect, embodiments of the present application provide an apparatus for image processing, the apparatus comprising a plurality of modules for performing the functions implemented by the method in the first aspect and any one of its possible designs.
In a third aspect, an embodiment of the present application provides an apparatus for image processing, which may include: an acquisition module, configured to acquire a first image frame and a second image frame captured by a camera, wherein the brightness of the first image frame is higher than that of the second image frame, and both frames comprise a first moving object; and a fusion module, configured to: when the image quality of a second region of interest (ROI) in the second image frame is better than that of a first ROI in the first image frame, fuse the second ROI into the first image frame to obtain a target image.
In a possible implementation manner, the first moving object is a first vehicle, and the first ROI and the second ROI are image areas where license plates of the first vehicle are located.
In a possible implementation, the first image frame is an image frame captured by the camera with fill light, the second image frame is an image frame captured by the camera with no fill light or weak fill light, and the first image frame and the second image frame are adjacent image frames.
In a possible implementation, the exposure time of the first image frame is longer than the exposure time of the second image frame, and the first image frame and the second image frame are adjacent image frames.
In a possible embodiment, the apparatus further comprises a detection module, configured to detect license plates in the first image frame and the second image frame to obtain the first ROI and the second ROI.
In a possible embodiment, the fusion module is configured to: perform local image registration between the second image frame and the first image frame to obtain registration data of the ROI region, and fuse the second ROI into the first image frame according to the registration data.
In a possible implementation manner, the brightness of the license plate area of the first image frame is higher than the normal range, and the brightness of the license plate area of the second image frame is in the normal range.
In a possible embodiment, the fusion module is further configured to: output the first image frame as the target image when the image quality of the second region of interest (ROI) in the second image frame is not better than that of the first ROI of the first image frame.
In a fourth aspect, embodiments of the present application also provide a camera, the camera comprising a processor and a memory; the memory is configured to store computer program instructions; the processor executes the computer program instructions in the memory to perform the method of the first aspect and any of its possible designs.
In a fifth aspect, embodiments of the present application also provide a computer-readable storage medium storing instructions that, when executed by a computing device, cause the computing device to perform the method of the first aspect and any of its possible designs.
In a sixth aspect, embodiments of the present application also provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method provided in the first aspect of the present application and any one of its possible implementations.
Drawings
Fig. 1 is a diagram showing an example of a hardware configuration of a camera 10 according to an embodiment of the present application;
FIG. 2 is another hardware architecture provided by embodiments of the present application;
fig. 3 is a schematic view of a traffic scenario provided in an embodiment of the present application;
FIG. 4 is a schematic flow chart of a method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an image processing method provided herein;
fig. 6 is a diagram showing an example of the structure of an image processing apparatus 50 provided in the present application;
Detailed Description
In order to facilitate understanding of the embodiments of the present application, first, some terms related to the present application will be explained.
Checkpoint (bayonet): a type of road traffic monitoring site at which all passing motor vehicles are photographed, recorded, and processed, such as a toll station or a traffic or security inspection station. For example, a checkpoint camera may take a snapshot of a passing vehicle to identify whether the driver is wearing a seat belt.
ITS (Intelligent Transportation System): a comprehensive transportation management system, typical of intelligent traffic systems and new automotive information electronics, that effectively integrates advanced technologies such as navigation and positioning, information technology, and data communication into urban traffic to provide all-round, real-time, accurate, and efficient traffic management.
Image registration: the process of matching and overlaying two or more images acquired at different times, with different sensors (imaging devices), or under different conditions (weather, illumination, imaging position and angle, etc.); it is widely used in remote sensing data analysis, computer vision, and image processing.
Image fusion: an image processing technique that combines two or more images into a new image using a specific algorithm, so that the combined image inherits the favorable characteristics of the originals, such as brightness, sharpness, and color.
Optical flow method (Optical Flow Method): the optical flow is the instantaneous speed of the pixel motion of a space moving object on an observation imaging plane, and the optical flow method is a method for finding the corresponding relation between the previous frame and the current frame by utilizing the change of the pixels in an image sequence on a time domain and the correlation between the adjacent frames, so as to calculate the motion information of the object between the adjacent frames.
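As a hedged illustration only (the patent does not name a library; OpenCV and the hypothetical file names below are assumptions of this sketch), dense optical flow between two adjacent frames can be estimated like this:

```python
import cv2
import numpy as np

# Hypothetical input files; in the checkpoint scene these would be two
# adjacent frames, such as a video frame and a snapshot frame.
prev = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

# flow[y, x] = (dx, dy): per-pixel displacement from prev to curr.
flow = cv2.calcOpticalFlowFarneback(
    prev, curr, None,
    pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

magnitude, _angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])
print("mean displacement (px):", float(np.mean(magnitude)))
```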
ISP (Image Signal Processing): technology for post-processing the signal output by a front-end image sensor. Its main functions include linear correction, noise removal, dead pixel correction, interpolation, white balance, and automatic exposure control. Only with an ISP can scene detail be restored well under different optical conditions, so imaging quality is determined to a great extent by the ISP.
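For intuition, here is a toy sketch of two ISP stages (per-channel white balance followed by gamma correction); the gain and gamma values are illustrative assumptions, and a real ISP pipeline contains many more stages:

```python
import numpy as np

def toy_isp(raw, wb_gains=(1.8, 1.0, 1.6), gamma=2.2):
    """Toy two-stage ISP on a demosaiced, normalized RAW-like image.

    raw: float32 HxWx3 array in [0, 1]. wb_gains and gamma are assumed,
    uncalibrated values used purely for illustration.
    """
    img = raw * np.asarray(wb_gains, dtype=np.float32)  # white balance
    img = np.clip(img, 0.0, 1.0)
    return np.power(img, 1.0 / gamma)                   # gamma correction

frame = toy_isp(np.random.rand(720, 1280, 3).astype(np.float32))
print(frame.shape, frame.dtype)
```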
In view of the problems described in the background, an embodiment of the present application provides an image processing method that registers and fuses a local region across multiple frames: a well-exposed license plate region from one frame is fused into another frame whose non-license-plate regions have better quality. This solves license plate overexposure in traffic scenes, and because registration and fusion are local, it also reduces the heavy computation and image trailing that ultra-wide dynamic range processing suffers in high-speed motion scenes.
The technical solutions provided in the present application are described below from several angles: the hardware systems (Figs. 1 and 2), the service scenario (Fig. 3), the method implementation (Figs. 4 and 5), and the software apparatus (Fig. 6).
It should be noted that terms in the specification, claims, and drawings of the present application such as "step 301", "step 302", "step 303" and "first", "second", "third", which include numerical concepts, are used to distinguish similar objects and do not describe a specific order or sequence. It is to be understood that such order may be interchanged where appropriate, so that the embodiments described herein can be practiced in sequences other than those illustrated or described. Furthermore, the terms "comprise", "comprising", and "having", and any variations thereof, are intended to cover a non-exclusive inclusion: a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to it. In the embodiments of the present application, words such as "exemplary" or "for example" mean serving as an example, illustration, or description; any embodiment or design described as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs, but rather such words present related concepts in a concrete fashion.
First, a hardware architecture of the embodiment of the present application will be described.
Fig. 1 shows an example hardware structure of a camera 10 according to an embodiment of the present application, which can be used for snapshot capture and video surveillance of moving objects in a traffic scene. As shown in Fig. 1, the camera 10 includes a processor 110, an imaging assembly 120, a fill light 130, and a communication interface 140, described in detail below:
The processor 110 may be configured to perform ISP processing on the RAW data captured by the imaging assembly 120, and to control the fill light and the exposure of the lens and image sensor, so as to obtain a high-quality color image. ISP processing includes 3A (auto exposure, auto focus, and auto white balance), dead pixel correction, denoising, strong-light suppression, backlight compensation, color enhancement, lens shading correction, and the like; detection and analysis are then performed on the processed image. The processor 110 may be a central processing unit (CPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), an artificial intelligence (AI) chip, a system on chip (SoC), or a complex programmable logic device (CPLD), etc. The embodiment of the present application does not limit the specific type of the processor 110. Only one processor 110 is shown in Fig. 1; in practice there may be several, possibly of different types, for example the camera 10 may include both a CPU and a GPU. A CPU may in turn have one or more processor cores; this embodiment limits neither the number of processors nor the number of cores.
The imaging assembly 120 is used for capturing images and may include a lens, an image sensor, and the fill light 130. The lens includes a lens group composed of one or more optical elements (lenses, filters, polarizers, etc.) that image the light reflected by a vehicle on the road onto the image sensor. The image sensor is a semiconductor device that converts the optical image into a digital signal and may be a complementary metal oxide semiconductor (CMOS) sensor or, alternatively, a charge-coupled device (CCD) sensor. It should be noted that Fig. 1 shows only one imaging assembly 120, but the embodiment of the present application is not limited to one; the number of imaging assemblies 120 may be one or more.
The fill light 130 provides supplementary illumination for the camera system. Common types are gas discharge fill lights, LED fill lights, and multifunction fill lights, which allow a traffic surveillance camera to capture fill-lit snapshots both day and night. An LED fill light uses light-emitting diodes as its light source and is commonly used with cameras in the security industry; LEDs can work in strobe mode or always-on mode, their power can be flexibly controlled to provide different fill intensities, and they can deliver instantaneous high-current, high-intensity bursts, known as LED strobe exposure. The xenon lamp is a headlight containing xenon gas, also called a high-intensity discharge gas lamp, and is widely used in automobile headlights, photographic fill light, and surveillance snapshot fill light.
The communication interface 140 may support network ports, optical ports or wireless communication, and may support communication with other devices such as cameras, edge servers, cloud servers, etc.
The memory 150 may include one or more of random access memory (RAM), read-only memory (ROM), flash EPROM, mechanical hard disk (HDD), or solid state drive (SSD), or any other memory, storage device, or storage medium that can store and hold information. The memory 150 may store executable program code that the processor executes to implement the steps of the methods in the embodiments of the present application.
Fig. 2 shows another hardware architecture provided in an embodiment of the present application, which includes the camera 10 and a server 40, where the server 40 includes at least a processor 110, a communication interface 140, and a memory 150, as described above. In the scenario of Fig. 2, the camera 10 may send the acquired video stream or images to the server 40 over a network, and the server 40 performs the image processing method provided in the embodiments of the present application. Optionally, the processor of the camera 10 performs some basic processing on the image frames, such as ISP, before sending them, and the processor of the server 40 performs further analysis on the ISP-processed images to implement the method of the embodiments. Alternatively, the server 40 may be an edge computing server or a cloud server: the camera sends the acquired image information to the processor of an edge server or cloud server for processing, or the edge server forwards the image information from the camera to the cloud server for processing, which is not limited in the embodiments of the present application.
Next, a service scenario in the embodiment of the present application will be described.
It should be noted that the service scenario described in the embodiments of the present application is intended to explain the technical solution more clearly and does not limit it; those skilled in the art will appreciate that, as system architectures evolve and new service scenarios emerge, the technical solution provided here remains applicable to similar technical problems.
Fig. 3 is a schematic view of a traffic scene provided in an embodiment of the present application. The scene includes a camera 10 and a vehicle 20; the structure of the camera 10 is shown in Fig. 1, the vehicle 20 carries a driver, and a license plate 30 is mounted on the vehicle 20. The embodiments of the present application give examples of two types of scene, described as follows:
(1) Checkpoint scene: the camera 10 performs video surveillance and snapshot capture of vehicles at a traffic checkpoint. For video surveillance, the recorded video only needs to show the license plate; the plate's reflective film has a high reflection coefficient, and its bright reflection enhances recognition day and night, so no fill light is needed in a conventional surveillance scene, i.e., the video frames are not fill-lit. For snapshot capture, the image must contain a clear face; since window glass transmittance is low and a face reflects much less light than a license plate, the snapshot scene usually needs fill light to show the in-vehicle face clearly, especially at night or in poor light. For example, when the driver riding in the vehicle 20 passes through the camera's capture area, the camera fires appropriate fill light and takes a snapshot, and the license plate position in the snapshot frame will typically be overexposed because of the strong fill light. Optionally, the snapshot frame and a video surveillance frame are adjacent.
For these two adjacent frames in the checkpoint scene, the method provided in the embodiments of the present application detects and registers the license plate regions and fuses the non-overexposed plate from the video frame into the snapshot frame, yielding an image in which both the vehicle window and the license plate have suitable brightness.
(2) Mixed-traffic scene: pedestrians (not shown in Fig. 3) and vehicles appear simultaneously in the picture of the camera 10. In one possible implementation, the camera 10 obtains capture frames of different brightness by controlling the exposure time. For a long-exposure frame, the longer exposure gives the pedestrian's face suitable brightness, but the license plate position is overexposed; for a short-exposure frame, the short exposure leaves the plate position unoverexposed with suitable brightness, but the pedestrian's face is darker. Optionally, the long-exposure frame and short-exposure frame are two adjacent frames, for example the long-exposure frame is immediately followed by a short-exposure frame.
For these two adjacent frames with different exposure times in the mixed-traffic scene, the method provided in the embodiments of the present application detects and registers the license plate region and fuses the non-overexposed plate from the short-exposure frame into the long-exposure frame, yielding an image in which both the face and the license plate have suitable brightness.
Next, a method implementation of the embodiment of the present application will be described.
In an embodiment of the method provided in the present application, Fig. 4 is a schematic flow chart comprising steps 310 to 340. In one possible implementation, the method of Fig. 4 is performed by the camera 10, where the camera 10 is a smart camera with advanced image processing capabilities. In another possible embodiment, the method of Fig. 4 is performed by a server 40 in communication with the camera 10.
The method embodiment presented in connection with fig. 4 is presented as follows:
Step 310: acquire a first image frame and a second image frame captured by the camera 10, wherein the brightness of the first image frame is higher than that of the second image frame, and both the first image frame and the second image frame comprise a first moving object.
Alternatively, the first moving object may be a first vehicle. The method of the embodiment of the application can also be applied to moving targets in other traffic scenes, and is not limited herein.
When the camera 10 is deployed at a traffic checkpoint, where it may also be called a checkpoint camera, it monitors the road-surface image information of the area in real time and both video-monitors and photographs passing vehicles. The first image frame and the second image frame are two images captured by the camera 10 that both include the first vehicle; because of their different capture modes, the brightness of the first image frame is higher than that of the second, as explained below:
One example implementation scenario is the aforementioned checkpoint scene, shown in Fig. 5, where the first image frame is the snapshot frame F1 in Fig. 5 and the second image frame is the video frame F2 in Fig. 5. F1 is an image frame captured by the camera with fill light, and F2 is an image frame captured by the camera with no fill light or weak fill light.
Specifically: the camera 10 and its fill light are mounted at a particular angle on a pole, as shown in Fig. 3. Whenever a first vehicle enters the capture area of the camera 10, the camera records surveillance video containing the first vehicle; when the vehicle reaches the capture position preset for the camera 10, capture is triggered: the fill light illuminates the photographed first vehicle, the camera focuses automatically and locks the exposure state, and the snapshot first image frame F1 is obtained, whose fill-lit brightness is higher than that of the video surveillance picture. The second image frame is a video frame F2 from the surveillance video recorded by the camera 10, or another captured image taken without fill light.
Optionally, the first image frame and the second image frame are adjacent image frames. For example, with a frame interval of 40 ms, the aforementioned snapshot frame F1 and video frame F2 are temporally adjacent image frames.
Another example implementation scenario is the aforementioned mixed-traffic scene, where the first image frame is a long-exposure frame F3 and the second image frame is a short-exposure frame F4.
Specifically: the camera 10 is mounted on a pole as shown in Fig. 3 and can obtain capture frames of different brightness by controlling its exposure time: a long-exposure frame F3 and a short-exposure frame F4.
Optionally, the aforementioned long-exposure frame F3 and short-exposure frame F4 are two temporally adjacent image frames, for example a short-exposure frame immediately following the long-exposure frame.
It should be noted that the sources of the first image frame and the second image frame are not limited to the two scenes above; any pair satisfying the following conditions can serve as the first and second image frames of the embodiments of the present application: the brightness of the first image frame is higher than that of the second image frame, and both frames include a first vehicle, which may be any vehicle entering the picture of the camera 10.
In some possible embodiments, the first image frame need not be brighter than the second image frame, as long as the quality of the non-ROI region of the first image frame is higher than that of the non-ROI region of the second image frame, where the non-ROI region means the image area other than the license plate.
Step 320: detect the region of interest in the first image frame and the second image frame, respectively, to obtain a first ROI and a second ROI.
In one possible embodiment, when the first moving object is a vehicle, the region of interest may be the license plate region.
In one possible implementation, the camera performs a series of ISP processes (such as linear correction, noise removal, dead pixel correction, interpolation, white balance, and automatic exposure control) on the RAW data of each frame to obtain a processed image, for example a YUV image; each frame's YUV image is scaled down to a small resolution before vehicle detection is performed. Alternatively, the processed image may be an RGB image.
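A hedged sketch of this pre-detection step is shown below (OpenCV and the working resolution are assumptions, and the detector itself is left as a placeholder, since the text only requires that some AI detector later produce vehicle and plate boxes):

```python
import cv2

def prepare_for_detection(frame, target_width=640):
    """Downscale a full-resolution processed frame before detection.

    frame: HxWx3 image (YUV or RGB, per the description above).
    target_width is an assumed working resolution. Returns the small
    image plus the scale factor for mapping boxes back to full size.
    """
    h, w = frame.shape[:2]
    scale = target_width / float(w)
    small = cv2.resize(frame, (target_width, int(h * scale)))
    return small, scale

def map_box_to_full_res(box, scale):
    # box = (x, y, w, h) detected on the downscaled image.
    x, y, w, h = box
    return (int(x / scale), int(y / scale), int(w / scale), int(h / scale))
```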
The camera 10 may perform vehicle detection and license plate detection on the processed first image frame and second image frame. Both detections are implemented with an artificial intelligence (AI) algorithm, which may run in a functional module built into the camera or be carried out wholly or partly by the server 40. After the license plates are detected, the license plate region in the first image frame is marked as the first ROI and the license plate region in the second image frame as the second ROI.
For example, taking the checkpoint scene, Fig. 5 is a schematic diagram of the image processing method: the license plate regions 30-F1 and 30-F2 detected in F1 and F2 are the first ROI and the second ROI, respectively. In the figure, the diagonal stripes over plate region 30-F1 indicate that the image quality of that region is poor, for example overexposed, while the diagonal stripes over the face inside the car in F2 indicate that the face region's quality is poor, for example the face is dark.
Step 330: when the image quality of the second ROI is better than that of the first ROI, fuse the second ROI into the first image frame to obtain a target image.
The image quality of the second ROI being better than that of the first ROI can be understood as the license plate region in the second image frame having better quality than the license plate region in the first image frame. Quality can be characterized by brightness, noise, sharpness, color, confidence, and so on, which the embodiments of the present application do not limit. Optionally, it may be judged by conventional means, for example using an image quality metric, or by the confidence an AI algorithm assigns to the detected license plate. Local image registration is then performed between the second image frame and the first image frame to obtain registration data of the ROI region, and the second ROI is fused into the first image frame according to the registration data.
In one possible implementation, fill light or long exposure makes the overall brightness of the first image frame higher than that of the second; because the license plate's reflection coefficient is high, the brightness of the first frame's plate region exceeds the normal range, i.e., it is "overexposed", while the brightness of the second frame's plate region stays within the normal range, i.e., not "overexposed". Further, "overexposed" can be understood to mean that the lost information in the RAW data cannot be recovered by image processing.
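As a hedged illustration of one way to score ROI quality (the patent leaves the metric open; the brightness range, saturation threshold, and Laplacian-variance sharpness proxy below are assumptions of this sketch):

```python
import cv2
import numpy as np

def roi_quality(gray_roi, lo=60, hi=200, sat_level=250):
    """Toy quality score for a plate ROI (higher is better).

    Combines (a) whether mean brightness falls in an assumed normal
    range [lo, hi], (b) sharpness via variance of the Laplacian, and
    (c) a penalty for saturated (overexposed) pixels. A real system
    might instead use noise, color, or detector confidence.
    """
    mean = float(gray_roi.mean())
    in_range = 1.0 if lo <= mean <= hi else 0.0
    sharpness = float(cv2.Laplacian(gray_roi, cv2.CV_64F).var())
    saturated_frac = float((gray_roi >= sat_level).mean())
    return in_range * sharpness * (1.0 - saturated_frac)

# Per step 330, fuse when roi_quality(roi2) > roi_quality(roi1).
```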
When the image quality of the second ROI is better than that of the first ROI, the second ROI is fused into the first image frame to obtain the target image, and the fusion comprises two steps: local registration and fusion. In a real scene the vehicle is moving, so between the earlier and later frames it is displaced and deformed and occupies different positions in the two pictures; the license plate of one frame must therefore be registered to the license plate of the other, i.e., the deformed plate in one frame must be aligned at pixel level with the plate in the other frame.
Registration and fusion of two adjacent frames may use a traditional optical method or an AI method. A traditional optical method is the optical flow method, which uses the temporal change of pixels in the image sequence and the correlation between adjacent frames to find the pixel correspondences (i.e., transformation parameters) between the two frames, layers the image content, fuses each layer separately, and recombines the layers. The AI method uses neural-network-based registration and fusion: the strong learning ability of the neural network improves registration accuracy, and a fusion network corrects registration errors and color errors. The denoising, registration, and fusion networks can also be combined into one, reducing the compute requirement so the method can be deployed on an embedded system.
In a possible embodiment, local registration proceeds as follows: first, extract features from the local regions (the first ROI and the second ROI) of the two images to obtain feature points; find matched feature-point pairs through a similarity measure; then derive coordinate transformation parameters, called registration parameters in the embodiments of the present application, from the matched pairs; and finally perform image registration with these parameters. Specifically, based on the obtained transformation parameters, the local image is registered by affine transformation or 3D spatial-domain transformation, i.e., each pixel of the second ROI undergoes a spatial coordinate transform, yielding a registered predicted license plate image that can overlap exactly with the license plate region (the first ROI) of the first image frame.
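A hedged sketch of such feature-based local registration follows (OpenCV is assumed; ORB features with a RANSAC homography are one concrete choice among those the text allows, and an affine estimate would be equally valid):

```python
import cv2
import numpy as np

def register_plate(roi_src, roi_dst):
    """Warp roi_src (plate crop from the second frame) into the
    geometry of roi_dst (plate region of the first frame).

    ORB + RANSAC homography is an illustrative choice; the text only
    requires feature extraction, similarity matching, and a coordinate
    transform (e.g. affine or a 3D spatial transform).
    Returns the registered predicted plate image, or None on failure.
    """
    orb = cv2.ORB_create(500)
    k1, d1 = orb.detectAndCompute(roi_src, None)
    k2, d2 = orb.detectAndCompute(roi_dst, None)
    if d1 is None or d2 is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:50]
    if len(matches) < 4:
        return None
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    if H is None:
        return None
    h, w = roi_dst.shape[:2]
    return cv2.warpPerspective(roi_src, H, (w, h))
```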
In one possible implementation, the fusion can be understood as replacing the license plate region in the first image frame (i.e., the first ROI) with the predicted license plate image obtained after registering the second ROI. Optionally, brightness smoothing is applied at the fused boundary. Taking the two image frames of the checkpoint scene as an example, the fused target image contains a normally exposed license plate region (from the second image frame) and the valid in-vehicle information (e.g., face information and the window region) from the first image frame. For F1 and F2, a fused-frame schematic (the fusion of F1 and F2) is also shown in Fig. 5: the license plate region 30-F2' is the registered predicted license plate image. In the embodiments of the present application, the non-overexposed plate region 30-F2' from video frame F2 is registered and fused into snapshot frame F1, replacing the overexposed plate region in F1, so that a well-lit window and face are obtained and the plate overexposure problem is also solved.
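A hedged sketch of the replacement-plus-smoothing step follows; feathered alpha blending is one illustrative way to realize the brightness smoothing at the fused boundary (the patent does not prescribe a particular smoothing, and the feather radius here is an assumption):

```python
import cv2
import numpy as np

def fuse_plate(frame1, registered_plate, box, feather=7):
    """Paste the registered plate into frame1 with a feathered border.

    frame1: bright snapshot frame (HxWx3, uint8).
    registered_plate: 3-channel plate image already warped to box's geometry.
    box: (x, y, w, h) of the first ROI in frame1.
    feather: blur radius controlling the boundary smoothing (assumed value).
    """
    x, y, w, h = box
    m = feather
    mask = np.zeros((h, w), np.float32)
    mask[m:h-m, m:w-m] = 1.0                      # 1 inside, 0 at the rim
    mask = cv2.GaussianBlur(mask, (2*m+1, 2*m+1), 0)[..., None]
    roi = frame1[y:y+h, x:x+w].astype(np.float32)
    plate = registered_plate.astype(np.float32)
    fused = mask * plate + (1.0 - mask) * roi     # alpha blend at boundary
    out = frame1.copy()
    out[y:y+h, x:x+w] = np.clip(fused, 0, 255).astype(np.uint8)
    return out
```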
The prior art performs weighted fusion of multiple frames at full-image pixel level; because the motion regions of the earlier and later frames are displaced, motion estimation and motion compensation are needed first, the system resource cost is high, and trailing remains obvious in high-speed motion scenes. The embodiments of the present application instead fuse regions of interest: detection, registration, and fusion between the two frames operate on a specific local region, and detection of that region can reach pixel-level precision, unlike the precision of motion estimation and compensation in the prior art. This saves a large amount of computation, effectively improves processing efficiency, and mitigates image trailing when the vehicle moves at high speed.
Step 340: output the first image frame as the target image when the image quality of the second region of interest (ROI) in the second image frame is not better than that of the first ROI of the first image frame.
The image quality of the second ROI not being better than that of the first ROI can be understood as the license plate region in the second image frame being no better than the license plate region in the brighter first image frame. In a possible implementation, if the license plate region in the snapshot frame F1 is not overexposed, the snapshot frame is output directly as the target image.
Next, an implementation of the apparatus according to the embodiment of the present application will be described.
Based on the same inventive concept as the method embodiments, the image processing apparatus 50 provided in the embodiments of the present application includes a number of functional modules that can be used to implement one or more steps of the foregoing method embodiment (Fig. 4).
Fig. 6 shows an example structure of the image processing apparatus 50 provided in the present application, which may be deployed on the camera 10 of Fig. 1 or the server 40 of Fig. 2. As shown in Fig. 6, the image processing apparatus 50 may include:
an acquiring module 501, configured to acquire a first image frame and a second image frame captured by a camera, where a brightness of the first image frame is higher than a brightness of the second image frame, and the first image frame and the second image frame each include a first moving object.
Alternatively, the first moving object may be a first vehicle. The method of the embodiment of the application can also be applied to moving targets in other traffic scenes, and is not limited herein.
In one possible embodiment, when the first moving object is a vehicle, the region of interest may be the license plate region.
Optionally, the first image frame is an image frame captured by the camera 10 with fill light, the second image frame is an image frame captured by the camera 10 with no fill light or weak fill light, and the first image frame and the second image frame are adjacent image frames; for related content see the description of the checkpoint scene above, not repeated here.
Optionally, the exposure time of the first image frame is longer than that of the second image frame, and the first image frame and the second image frame are adjacent image frames; for related content see the description of the mixed-traffic scene above, not repeated here.
In a possible implementation manner, the brightness of the license plate area of the first image frame is higher than the normal range, and the brightness of the license plate area of the second image frame is in the normal range.
It should be noted that, the sources of the first image frame and the second image frame are not limited to the above two scenes, and may be used as the first image frame and the second image frame in the embodiment of the present application as long as the following conditions are satisfied: the first image frame has a higher brightness than the second image frame, and both the first image frame and the second image frame include a first vehicle, which may be any vehicle that enters the frame of the camera 10.
In some possible embodiments, the first image frame need not be brighter than the second image frame, as long as the quality of the non-ROI region of the first image frame is higher than that of the non-ROI region of the second image frame, where the non-ROI region means the image area other than the license plate.
A fusion module 502, configured to: when the image quality of the second region of interest (ROI) in the second image frame is better than that of the first ROI in the first image frame, fuse the second ROI into the first image frame to obtain a target image.
In a possible implementation manner, the first ROI and the second ROI are image areas where license plates of the first vehicle are located.
Specifically: the image quality of the second ROI being better than that of the first ROI can be understood as the license plate region in the second image frame being better than that in the first image frame; for related content see the description in the foregoing method embodiment, not repeated here. When the image quality of the second ROI is better than that of the first ROI, the second ROI is fused into the first image frame to obtain the target image, where the fusion comprises the two steps of local registration and fusion.
The fusion module 502 is further configured to: perform local image registration between the second image frame and the first image frame to obtain registration data of the ROI region, and fuse the second ROI into the first image frame according to the registration data.
Registration and fusion of two adjacent frames may use a traditional optical method or an AI method; for related content see the description in the method embodiment above, not repeated here.
Optionally, the fusion module 502 is further configured to: output the first image frame as the target image when the image quality of the second region of interest (ROI) in the second image frame is not better than that of the first ROI in the first image frame. For related content see the earlier description, not repeated here.
A detection module 503, optionally configured to: detect the region of interest in the first image frame and the second image frame to obtain the first ROI and the second ROI.
Specifically: the camera 10 may perform vehicle detection and license plate detection on the obtained processed first image frame and second image frame, where both detections are implemented with an AI algorithm; after the license plates are detected, the plate region in the first image frame is marked as the first ROI and the plate region in the second image frame as the second ROI. For related content see the description in the foregoing method embodiment, not repeated here.
Finally, the above-described embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may take the form of a computer program product comprising one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium accessible by a computer, or a data storage device such as a server or data center containing one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium; the semiconductor medium may be a solid state drive (SSD).
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (18)

1. A method of image processing, the method comprising:
acquiring a first image frame and a second image frame shot by a camera, wherein the brightness of the first image frame is higher than that of the second image frame, and the first image frame and the second image frame both comprise a first moving object;
and when the image quality of a second region of interest (ROI) in the second image frame is better than that of a first ROI in the first image frame, fusing the second ROI into the first image frame to obtain a target image.
2. The method of claim 1, wherein the first moving object is a first vehicle, and the first ROI and the second ROI are image areas where a license plate of the first vehicle is located.
3. The method of claim 1, wherein the first image frame is an image frame captured by the camera with fill light, the second image frame is an image frame captured by the camera with no fill light or weak fill light, and the first image frame and the second image frame are adjacent image frames.
4. The method of claim 1, wherein the exposure time of the first image frame is longer than the exposure time of the second image frame, the first image frame and the second image frame being adjacent image frames.
5. The method according to any one of claims 1-3, wherein the method further comprises: detecting license plates in the first image frame and the second image frame, respectively, to obtain the first ROI and the second ROI.
6. The method of any of claims 1-4, wherein fusing the second ROI into the first image frame comprises:
performing local image registration on the second image frame and the first image frame to obtain registration data of the ROI region; and fusing the second ROI into the first image frame according to the registration data.
7. The method of any of claims 1-5, wherein the image quality of the second region of interest (ROI) in the second image frame being better than that of the first ROI of the first image frame comprises:
the brightness of the license plate region of the first image frame is above the normal range, and the brightness of the license plate region of the second image frame is within the normal range.
8. The method according to any one of claims 1-5, further comprising:
outputting the first image frame as the target image when the image quality of the second region of interest (ROI) in the second image frame is not better than that of the first ROI of the first image frame.
9. An apparatus for image processing, the apparatus comprising:
an acquisition module, configured to acquire a first image frame and a second image frame captured by a camera, wherein the brightness of the first image frame is higher than that of the second image frame, and both the first image frame and the second image frame comprise a first moving object; and
a fusion module, configured to: when the image quality of a second region of interest (ROI) in the second image frame is better than that of a first ROI in the first image frame, fuse the second ROI into the first image frame to obtain a target image.
10. The apparatus of claim 9, wherein the first moving object is a first vehicle, and the first ROI and the second ROI are image regions in which license plates of the first vehicle are located.
11. The apparatus of claim 9, wherein the first image frame is an image frame captured by the camera with fill light, the second image frame is an image frame captured by the camera with no fill light or weak fill light, and the first image frame and the second image frame are adjacent image frames.
12. The apparatus of claim 9, wherein the exposure time of the first image frame is longer than the exposure time of the second image frame, the first image frame and the second image frame being adjacent image frames.
13. The apparatus according to any one of claims 9-11, wherein the apparatus further comprises:
a detection module, configured to: detect license plates in the first image frame and the second image frame to obtain the first ROI and the second ROI.
14. The apparatus according to any one of claims 9-12, wherein
the fusion module is configured to: perform local image registration on the second image frame and the first image frame to obtain registration data of the ROI region; and fuse the second ROI into the first image frame according to the registration data.
15. The apparatus of any of claims 9-13, wherein the license plate region of the first image frame has a brightness above the normal range and the license plate region of the second image frame has a brightness within the normal range.
16. The apparatus according to any one of claims 9-14, wherein
the fusion module is further configured to: output the first image frame as the target image when the image quality of the second region of interest (ROI) in the second image frame is not better than that of the first ROI of the first image frame.
17. A camera, the camera comprising a processor and a memory;
The memory is used for storing computer program instructions;
the processor executes the computer program instructions in the memory to perform the method of any one of claims 1 to 7.
18. A computer-readable storage medium, wherein the computer-readable storage medium stores instructions that, when executed by a computing device, cause the computing device to perform the method of any one of claims 1 to 7.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202111622551.8A | 2021-12-28 | 2021-12-28 | Image processing method, device, camera and readable medium

Publications (1)

Publication Number | Publication Date
CN116403148A | 2023-07-07

Family

ID=87006243

Family Applications (1)

Application Number | Title | Priority Date | Filing Date | Status
CN202111622551.8A | Image processing method, device, camera and readable medium | 2021-12-28 | 2021-12-28 | Pending

Country Status (1)

Country | Link
CN | CN116403148A

Legal Events

Date Code Title Description
PB01 Publication