WO2019084712A1 - Image processing method and apparatus, and terminal - Google Patents

Image processing method and apparatus, and terminal

Info

Publication number
WO2019084712A1
WO2019084712A1 · PCT/CN2017/108314
Authority
WO
WIPO (PCT)
Prior art keywords
image
video data
target video
foreground
frame
Prior art date
Application number
PCT/CN2017/108314
Other languages
French (fr)
Chinese (zh)
Inventor
何展鹏
张立天
吴博
Original Assignee
深圳市大疆创新科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to PCT/CN2017/108314
Priority to CN201780009967.8A
Priority to CN202011396682.4A
Publication of WO2019084712A1

Links

Images

Classifications

    • G06V 40/20: Movements or behaviour, e.g. gesture recognition
    • G06F 18/2411: Classification techniques based on the proximity to a decision surface, e.g. support vector machines
    • G06F 18/25: Fusion techniques
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06V 10/507: Summing image-intensity values; histogram projection analysis
    • G06V 20/40: Scenes; scene-specific elements in video content
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • H04N 23/76: Circuitry for compensating brightness variation in the scene by influencing the image signals
    • H04N 23/951: Computational photography systems, e.g. light-field imaging, using two or more images to influence resolution, frame rate or aspect ratio
    • H04N 25/58: Control of the dynamic range involving two or more exposures

Definitions

  • the present application relates to the field of image processing technologies, and in particular, to an image processing method, apparatus, and terminal.
  • Multiple exposure is a shooting technique.
  • The principle of multiple exposure is to record the images of a moving object over a time segment onto a single picture through two or more exposures, which can produce a striking, seemingly magical composite.
  • However, this multiple exposure technique requires multi-step fine operations, and shooting is difficult.
  • the embodiment of the invention provides an image processing method, device and terminal, which are convenient to operate, can effectively realize multiple exposures, and reduce shooting difficulty.
  • the first aspect of the embodiment of the present invention discloses an image processing method, including:
  • the foreground sub-image of each frame and the background image of the target video data are image-fused to obtain an exposure image.
  • the second aspect of the embodiment of the present invention discloses an image processing apparatus, including:
  • An image obtaining module configured to select at least two frame key images in the target video data according to an image selection algorithm corresponding to the target video data, where the key images in each frame include a foreground object;
  • a sub-image obtaining module configured to acquire a foreground sub-image including the foreground object in the key image of each frame;
  • an image fusion module configured to image fuse the foreground sub-image of each frame and the background image of the target video data to obtain an exposure image.
  • a third aspect of the embodiments of the present invention discloses a terminal, including: a memory and a processor,
  • the memory is configured to store program instructions
  • the processor is configured to invoke the program instruction, and when the program instruction is executed, perform the following operations:
  • the foreground sub-image of each frame and the background image of the target video data are image-fused to obtain an exposure image.
  • The embodiment selects at least two key frames from the target video data, acquires a foreground sub-image including the foreground object in each key frame, and fuses each foreground sub-image with the background image of the target video data to obtain an exposure image.
  • In contrast with the conventional multiple exposure technique, which requires multi-step fine operations and is difficult to shoot, the embodiment of the present invention is convenient to operate, can effectively realize multiple exposure, and reduces shooting difficulty.
  • FIG. 1 is a schematic flow chart of an image processing method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of an interface image of a background image according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of an interface of a key image disclosed in an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of an interface of an exposure image disclosed in an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
  • The terminal in the embodiment of the present invention may include a personal computer, a smart phone (such as an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), a wearable smart device, an aircraft, an unmanned aerial vehicle ground control station, or the like.
  • The target video data may be acquired by a camera device, or may be obtained from the memory of the terminal or from the Internet; this is not limited by the embodiment of the present invention.
  • the camera device can be integrated in the terminal or can be connected to the terminal.
  • the target video data may include at least two frames of images.
  • the content included in the background image may be a background in at least two frames of images included in the target video data.
  • The foreground object may be an object that stands out relative to the background in the at least two frames of images included in the target video data, such as a pedestrian, an animal, or an item (e.g., a skateboard or a ball).
  • The terminal may select the images that include the foreground object in the target video data, and select at least two designated frames from those images as the key images.
  • The foreground sub-image may be a region of the key image containing the foreground object; for example, the edge of the region coincides with the edge of the foreground object, or the distance between the edge of the region and the edge of the foreground object is less than a preset distance threshold.
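As an illustration of the preset-distance-threshold variant, the foreground region can be taken as the object's bounding box grown by a small margin. The sketch below is hypothetical (the patent does not specify an implementation); the function name and margin semantics are assumptions:

```python
def expand_bbox(bbox, margin, shape):
    """Grow a foreground bounding box (x0, y0, x1, y1) by up to
    `margin` pixels on each side, clamped to the image bounds, so the
    region edge stays within `margin` of the foreground object's edge."""
    x0, y0, x1, y1 = bbox
    h, w = shape
    return (max(0, x0 - margin), max(0, y0 - margin),
            min(w - 1, x1 + margin), min(h - 1, y1 + margin))

# A 2x2-pixel object at (3, 2)..(4, 3) in an 8x8 image, margin 1:
print(expand_bbox((3, 2, 4, 3), 1, (8, 8)))  # (2, 1, 5, 4)
```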
  • the exposure image may be an image obtained by processing the foreground sub-image and the background image by image fusion technology.
  • FIG. 1 is a schematic flowchart diagram of an image processing method according to an embodiment of the present invention. Specifically, as shown in FIG. 1, the image processing method of the embodiment of the present invention may include the following steps:
  • the terminal may pre-establish an image selection algorithm corresponding to different video data.
  • the terminal may acquire an image selection algorithm corresponding to the target video data, and select at least two frame key images in the target video data.
  • the image selection algorithm is used to select a key image, and the key image may include a foreground object.
  • The terminal may acquire the foreground sub-image in each of the at least two frames of images included in the target video data, locate the foreground sub-image at the center point through statistical information of the foreground sub-images, and, according to the spatial information and time information of the foreground sub-image located at the center point, determine the image to which the selected foreground sub-image belongs as a key image; the key image may be as shown in FIG. 3.
  • The terminal may obtain the application scenario of the target video data, obtain the image selection algorithm corresponding to that application scenario according to a preset correspondence between application scenarios and image selection algorithms, and use the target video data as the input of the image selection algorithm to obtain at least two key frames.
  • the terminal may pre-establish an image selection algorithm corresponding to different application scenarios.
  • The terminal may acquire the application scenario of the target video data and obtain the image selection algorithm corresponding to that application scenario; the terminal may use the image selection algorithm corresponding to the application scenario as the image selection algorithm corresponding to the target video data, use the target video data as the input of the image selection algorithm, and use the images output by the image selection algorithm as the at least two key frames.
  • the application scenario may include a motion gesture of the foreground object, such as a jumping gesture, a thousand-hand Guanyin gesture, or a martial arts action gesture.
  • The image selection algorithm may be: acquiring one frame of image from the target video data every preset number of frames, and using the acquired images as key images.
  • the preset number of frames may be preset, for example, three frames per interval or five frames per interval.
  • For example, if the target video data includes 10 frames of images, the terminal may acquire one frame of image from the target video data every two frames; that is, the terminal may use the first, fourth, seventh, and tenth frame images as the key images.
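The interval-sampling rule in this example can be sketched in a few lines; the function name is illustrative, not from the patent:

```python
def select_key_frames(frames, skip):
    """Keep one frame, skip `skip` frames, repeat -- i.e. take every
    (skip + 1)-th frame starting from the first."""
    return frames[::skip + 1]

# 10-frame video, acquiring one frame every two frames:
frame_numbers = list(range(1, 11))
print(select_key_frames(frame_numbers, 2))  # [1, 4, 7, 10]
```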
  • The terminal may determine that the application scenario of the target video data is the first application scenario, acquire the image selection algorithm corresponding to the first application scenario, and use the target video data as the input of that image selection algorithm; the terminal may then acquire one frame of image from the target video data every preset number of frames and use the acquired images as key images.
  • the terminal may acquire the foreground sub-image in each frame image included in the target video data according to the background image, select the target foreground sub-image according to the spatial information and time information of each foreground sub-image, and select the target foreground sub-image.
  • the image to which it belongs is determined as a key image.
  • The terminal may acquire the foreground sub-image in each frame of image included in the target video data according to the background image, select, based on the spatial information and time information of each foreground sub-image, the foreground sub-image in which the foreground object has jumped to the highest point, use the selected foreground sub-image as the target foreground sub-image, and then determine the image to which the target foreground sub-image belongs as a key image.
  • The terminal may determine that the application scenario of the target video data is the second application scenario, acquire the image selection algorithm corresponding to the second application scenario, and use the target video data as its input; the terminal may then acquire the foreground sub-image in each frame of image included in the target video data according to the background image, select the target foreground sub-image according to the spatial information and time information of each foreground sub-image, and determine the image to which the target foreground sub-image belongs as the key image.
  • the terminal selects at least two frame key images in the target video data according to the image selection algorithm corresponding to the target video data, and then processes each frame key image to obtain a foreground sub-image including the foreground object in the key image.
  • For example, the foreground sub-image may include a pedestrian whose motion posture is taking off.
  • the terminal may acquire a foreground sub-image including the foreground object in each frame key image according to the background image.
  • the terminal may compare the background image with at least two frame key images, and acquire a foreground sub-image including the foreground object in each frame key image.
  • The terminal may compare the background image with the key images based on global variation factors.
  • A branch-switching strategy may also be adopted; that is, a suitable sub-algorithm is selected according to the application scenario of the target video data, and the background image and the key images are processed by that sub-algorithm to obtain the foreground sub-image containing the foreground object. At least one foreground sub-image can be acquired from one key image.
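A minimal sketch of comparing the background image with a key frame to locate the foreground region, assuming grayscale images as NumPy arrays and a simple absolute-difference threshold (the patent's global variation factors and branch sub-algorithms are not specified, so this thresholding stands in for them):

```python
import numpy as np

def extract_foreground_bbox(frame, background, thresh=30):
    """Difference a key frame against the background image and return
    the bounding box (x0, y0, x1, y1) of pixels that changed by more
    than `thresh`, or None if nothing changed."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    ys, xs = np.nonzero(diff > thresh)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```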
  • the terminal may perform pedestrian recognition on each frame key image according to a pedestrian recognition algorithm to obtain a foreground sub-image including a pedestrian.
  • the terminal may perform face recognition on the key image.
  • The terminal may determine that the foreground object is a pedestrian, and may perform pedestrian recognition on the key image according to the pedestrian recognition algorithm to obtain the foreground sub-image containing the pedestrian.
  • Feature extraction may be performed on the key image to obtain a Histogram of Oriented Gradients (HOG) feature, and the HOG feature is then used as the input of an SVM classifier to obtain the foreground sub-image containing the pedestrian.
  • Alternatively, the terminal can perform pedestrian recognition on the key image using a region-based convolutional neural network (R-CNN), Fast R-CNN, or Faster R-CNN to obtain the foreground sub-image containing the pedestrian.
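To make the HOG step concrete, here is a toy gradient-orientation histogram for a single cell; real pipelines (e.g. OpenCV's `HOGDescriptor`) add block normalization and slide a detection window whose descriptor is scored by the SVM. The function below is an illustrative simplification, not the patent's implementation:

```python
import numpy as np

def hog_cell_histogram(cell, n_bins=9):
    """Unsigned gradient-orientation histogram (0..180 degrees) for one
    cell, with each pixel voting its gradient magnitude into a bin."""
    gy, gx = np.gradient(cell.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    bins = (ang / (180.0 / n_bins)).astype(int) % n_bins
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), mag.ravel())
    return hist

# A horizontal intensity ramp has purely horizontal gradients,
# so all of the mass lands in the 0-degree bin:
ramp = np.tile(np.arange(8.0), (8, 1))
```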
  • The embodiment of the present invention performs pedestrian recognition on the key images according to a pedestrian recognition algorithm to obtain foreground sub-images containing a pedestrian. Even if the foreground object in the target video data is in a non-moving (i.e., stationary) state, the terminal can obtain the foreground sub-image by the above method, which can improve the recognition efficiency of foreground objects and effectively achieve multiple exposure.
  • The terminal may acquire, according to the time information of a foreground sub-image, the images in the target video data whose time information is greater than that of the foreground sub-image, fuse the foreground sub-image with each acquired image to update the acquired images, and update the target video data according to the updated images; the updated target video data includes the updated images.
  • For example, the terminal selects four key images in the target video data, namely the first, fourth, seventh, and tenth frame images. The motion posture of the foreground object contained in the first foreground sub-image, acquired in the first frame image, is the run-up; the motion posture of the foreground object in the second foreground sub-image, acquired in the fourth frame image, is the take-off; the motion posture of the foreground object in the third foreground sub-image, acquired in the seventh frame image, is the jump to the highest point; and the motion posture of the foreground object in the fourth foreground sub-image, acquired in the tenth frame image, is the landing.
  • The terminal may determine that the time information of the first foreground sub-image is the first frame; the images in the target video data whose time information is greater than the first frame are the 2nd-10th frame images, so the first foreground sub-image is fused with each of the 2nd-10th frame images to obtain the updated 2nd-10th frame images.
  • The terminal may determine that the time information of the second foreground sub-image is the fourth frame; the images whose time information is greater than the fourth frame are the 5th-10th frame images, so the second foreground sub-image is fused with each of the updated 5th-10th frame images to obtain further updated 5th-10th frame images.
  • The terminal may determine that the time information of the third foreground sub-image is the seventh frame; the images whose time information is greater than the seventh frame are the 8th-10th frame images, so the third foreground sub-image is fused with each of the updated 8th-10th frame images to obtain further updated 8th-10th frame images.
  • The terminal may determine that the time information of the fourth foreground sub-image is the tenth frame; since no image in the target video data has time information greater than the tenth frame, the terminal may update the target video data, and the updated target video data includes the updated images. For example, the target video data then includes the original first frame image and the updated 2nd-10th frame images, where the updated second frame image is obtained by fusing the first foreground sub-image with the second frame image; the updated fifth frame image is obtained by fusing the first and second foreground sub-images with the fifth frame image; the updated eighth frame image is obtained by fusing the first, second, and third foreground sub-images with the eighth frame image; and the updated tenth frame image is obtained by fusing the first, second, and third foreground sub-images with the tenth frame image, which already contains the foreground object in the landing posture.
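The frame-update bookkeeping described above can be sketched as follows, representing each foreground sub-image by its bounding box in its source key frame and pasting it into every later frame (a stand-in for real image fusion; all names are illustrative):

```python
import numpy as np

def progressively_fuse(frames, fg_regions):
    """frames: list of 2-D arrays.  fg_regions: {key_frame_index:
    (y0, y1, x0, x1)} -- the foreground region found in that key frame.
    Each region is copied into every frame with a larger index, so the
    motion history accumulates frame by frame."""
    out = [f.copy() for f in frames]
    for src in sorted(fg_regions):
        y0, y1, x0, x1 = fg_regions[src]
        patch = frames[src][y0:y1, x0:x1]
        for later in range(src + 1, len(out)):
            out[later][y0:y1, x0:x1] = patch
    return out
```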
  • the terminal may perform image fusion on all foreground sub-images and background images to obtain an exposure image, and the exposure image may be as shown in FIG. 4 .
  • the background image may be obtained by processing the target video data by the terminal, or may be obtained by the terminal by the camera, acquired in a local memory, or obtained through the Internet.
  • Before the terminal fuses the foreground sub-image of each frame with the background image of the target video data to obtain the exposure image, the target video data may be processed to obtain the background image; the background image may be as shown in FIG. 2.
  • the terminal may obtain a position of each foreground sub-image in the key image to which the foreground sub-image belongs, and combine the foreground sub-image and the background image according to the position to obtain an exposure image.
  • For example, the terminal fuses the first foreground sub-image with the background image according to its position to obtain the exposure image; in the exposure image, the foreground object contained in the first foreground sub-image is located on the right side, and the distance between the foreground object and each edge of the exposure image is the same as the distance between the foreground object and the corresponding edge of the first frame image.
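A minimal sketch of the final fusion step: each foreground sub-image is placed on a copy of the background at the position it occupied in its key frame. `alpha` below is an assumption standing in for whatever blending the fusion technique uses (1.0 means a plain paste):

```python
import numpy as np

def fuse_exposure(background, foregrounds, alpha=1.0):
    """foregrounds: list of (patch, (row, col)) pairs -- foreground
    pixels and their top-left position in the original key frame.
    Each patch is blended onto a copy of the background in place."""
    out = background.astype(float).copy()
    for patch, (r, c) in foregrounds:
        h, w = patch.shape[:2]
        out[r:r + h, c:c + w] = alpha * patch + (1.0 - alpha) * out[r:r + h, c:c + w]
    return out
```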
  • The embodiment selects key images in the target video data, acquires foreground sub-images including the foreground object in the key images, and fuses the foreground sub-images with the background image of the target video data to obtain the exposure image; the operation is convenient, multiple exposure can be effectively realized, and shooting difficulty is reduced.
  • the embodiment of the invention further provides a computer storage medium, wherein the computer storage medium stores program instructions, and the program may include some or all of the steps of the image processing method in the corresponding embodiment of FIG. 1 .
  • FIG. 5 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention.
  • the image processing apparatus described in this embodiment includes:
  • the image obtaining module 501 is configured to select at least two frame key images in the target video data according to an image selection algorithm corresponding to the target video data, where the key images of each frame include a foreground object;
  • a sub-image obtaining module 502 configured to acquire, in the key image of each frame, a foreground sub-image including the foreground object;
  • the image fusion module 503 is configured to image fuse the foreground sub-image of each frame and the background image of the target video data to obtain an exposure image.
  • the sub-image obtaining module 502 is configured to acquire a foreground sub-image including the foreground object in the key image of each frame according to the background image.
  • the foreground object is a pedestrian
  • the sub-image obtaining module 502 is configured to perform pedestrian recognition on the key image of each frame according to a pedestrian recognition algorithm to obtain a foreground sub-image including the pedestrian.
  • The image obtaining module 501 is further configured to process the target video data to obtain the background image before the image fusion module 503 fuses the foreground sub-image of each frame with the background image of the target video data to obtain the exposure image.
  • the image obtaining module 501 is specifically configured to:
  • the at least two frames of key images are obtained.
  • the image obtaining module 501 uses the target video data as an input of the image selection algorithm to obtain the key image, specifically for:
  • the acquired image is taken as the key image.
  • the image obtaining module 501 uses the target video data as an input of the image selection algorithm to obtain the key image, specifically for:
  • the image to which the target foreground sub-image belongs is determined as a key image.
  • the image fusion module 503 is specifically configured to:
  • the foreground sub-image and the background image are image-fused to obtain the exposed image.
  • the image obtaining module 501 is further configured to: after the sub-image obtaining module 502 acquires a foreground sub-image including the foreground object in each key image of each frame, according to time information of the foreground sub-image Obtaining, in the target video data, an image in which time information is greater than time information of the foreground sub-image;
  • the image fusion module 503 is further configured to image fuse the foreground sub-image and each acquired image to update the acquired image;
  • the image processing apparatus further includes:
  • the update module 504 is configured to update the target video data according to the updated image, and the updated target video data includes the updated image.
  • The image obtaining module 501 selects key images in the target video data according to the image selection algorithm corresponding to the target video data, the sub-image acquiring module 502 acquires foreground sub-images including the foreground object in the key images, and the image fusion module 503 fuses the foreground sub-images with the background image of the target video data to obtain the exposure image; this is convenient to operate, can effectively achieve multiple exposure, and reduces shooting difficulty.
  • FIG. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
  • the terminal described in this embodiment includes: a memory 601 and a processor 602.
  • the above processor 602 and memory 601 are connected by a bus.
  • The processor 602 may be a central processing unit (CPU); the processor may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the above memory 601 can include read only memory and random access memory and provides instructions and data to the processor 602.
  • A portion of the memory 601 may also include a non-volatile random access memory, where:
  • the memory 601 is configured to store program instructions
  • the processor 602 is configured to invoke the program instruction, and when the program instruction is executed, perform the following operations:
  • the foreground sub-image of each frame and the background image of the target video data are image-fused to obtain an exposure image.
  • the processor 602 obtains, in the key image of each frame, a foreground sub-image that includes the foreground object, specifically for:
  • a foreground sub-image including the foreground object is acquired in the key image of each frame.
  • the foreground object is a pedestrian
  • the processor 602 obtains a foreground sub-image including the foreground object in the key image of each frame, specifically:
  • Pedestrian recognition is performed on the key image of each frame according to a pedestrian recognition algorithm to obtain a foreground sub-image including the pedestrian.
  • the processor 602 is further configured to perform image fusion on the foreground sub-image of each frame and the background image of the target video data, and process the target video data before obtaining the exposed image to obtain the The background image.
  • the processor 602 is configured according to an image selection algorithm corresponding to the target video data. Select at least two frames of key images from the target video data, specifically for:
  • the at least two frames of key images are obtained.
  • the processor 602 uses the target video data as an input of the image selection algorithm to obtain the at least two frames of key images, specifically for:
  • the acquired image is taken as the key image.
  • the processor 602 uses the target video data as an input of the image selection algorithm to obtain the at least two frames of key images, specifically for:
  • the image to which the target foreground sub-image belongs is determined as a key image.
  • the processor 602 performs image fusion on the foreground sub-image and the background image of each frame to obtain an exposure image, specifically for:
  • the foreground sub-image and the background image are image-fused to obtain the exposed image.
  • the processor 602 is further configured to: after acquiring a foreground sub-image including the foreground object in each key image of each frame, according to time information of the foreground sub-image, in the target video data Obtaining an image in which time information is greater than time information of the foreground sub-image;
  • the processor 602 is further configured to perform image fusion on the foreground sub-image and each acquired image to update the acquired image;
  • the processor 602 is further configured to update the target video data according to the updated image, where the updated target video data includes the updated image.
  • The processor 602 described in the embodiment of the present invention may perform the implementation of the image processing method described in the embodiment of the present invention; details are not described herein again.
  • the program may be stored in a computer-readable storage medium, and the storage medium may include a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Abstract

An image processing method and apparatus, and a terminal. The method comprises: selecting at least two frames of key images from target video data according to an image selection algorithm corresponding to the target video data, wherein each frame of key image comprises a foreground object; acquiring, from each frame of key image, a foreground sub-image containing the foreground object; and performing image fusion on each frame of foreground sub-image and a background image of the target video data to obtain an exposure image. The method is convenient and quick to operate, can effectively realize multiple exposure, and reduces shooting difficulty.

Description

Image Processing Method, Apparatus, and Terminal

Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method, an apparatus, and a terminal.
Background
Multiple exposure is a shooting technique. Its principle is to record the images of a moving object over a time segment onto a single picture through two or more exposures, producing a seemingly magical, out-of-nowhere effect. However, this multiple-exposure technique requires multiple fine-grained operations, making shooting rather difficult.
Summary of the Invention
Embodiments of the present invention provide an image processing method, an apparatus, and a terminal that are convenient to operate, can effectively realize multiple exposure, and reduce shooting difficulty.
A first aspect of the embodiments of the present invention discloses an image processing method, including:
selecting at least two frames of key images from target video data according to an image selection algorithm corresponding to the target video data, where the key image of each frame includes a foreground object;

acquiring, from the key image of each frame, a foreground sub-image containing the foreground object; and

performing image fusion on the foreground sub-image of each frame and a background image of the target video data to obtain an exposure image.
A second aspect of the embodiments of the present invention discloses an image processing apparatus, including:
an image acquisition module, configured to select at least two frames of key images from target video data according to an image selection algorithm corresponding to the target video data, where the key image of each frame includes a foreground object;

a sub-image acquisition module, configured to acquire, from the key image of each frame, a foreground sub-image containing the foreground object; and

an image fusion module, configured to perform image fusion on the foreground sub-image of each frame and a background image of the target video data to obtain an exposure image.
A third aspect of the embodiments of the present invention discloses a terminal, including a memory and a processor, where

the memory is configured to store program instructions; and

the processor is configured to invoke the program instructions and, when the program instructions are executed, perform the following operations:
selecting at least two frames of key images from the target video data according to an image selection algorithm corresponding to the target video data, where the key image of each frame includes a foreground object;

acquiring, from the key image of each frame, a foreground sub-image containing the foreground object; and

performing image fusion on the foreground sub-image of each frame and a background image of the target video data to obtain an exposure image.
In the embodiments of the present invention, at least two frames of key images are selected from target video data according to an image selection algorithm corresponding to the target video data, a foreground sub-image containing a foreground object is acquired from the key image of each frame, and image fusion is performed on the foreground sub-image of each frame and a background image of the target video data to obtain an exposure image. Whereas the conventional multiple-exposure technique requires multiple fine-grained operations and makes shooting difficult, the embodiments of the present invention are convenient to operate, can effectively realize multiple exposure, and reduce shooting difficulty.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required in the embodiments are briefly introduced below. Apparently, the accompanying drawings in the following description show merely some embodiments of the present invention, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative efforts.
FIG. 1 is a schematic flowchart of an image processing method according to an embodiment of the present invention;

FIG. 2 is a schematic interface diagram of a background image according to an embodiment of the present invention;

FIG. 3 is a schematic interface diagram of a key image according to an embodiment of the present invention;

FIG. 4 is a schematic interface diagram of an exposure image according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts shall fall within the protection scope of the present invention.
The terminal in the embodiments of the present invention may include a personal computer, a smartphone (such as an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), a wearable smart device, an aircraft, or an unmanned aerial vehicle ground control station, among others.
The target video data may be captured by a camera device, or obtained from the memory of the terminal or from the Internet; this is not limited by the embodiments of the present invention. The camera device may be integrated in the terminal or externally connected to the terminal. The target video data may include at least two frames of images.
The content of the background image may be the background in the at least two frames of images included in the target video data. The foreground object may be an object set against that background in the at least two frames of images, such as a pedestrian, an animal, or a prop (for example, a skateboard or a ball). The terminal may select, from the target video data, images containing the foreground object, and use at least two designated frames selected from those images as key images. The foreground sub-image may be the region of a key image that contains the foreground object; for example, the edge of the region may coincide with the edge of the foreground object, or the distance between the edge of the region and the edge of the foreground object may be smaller than a preset distance threshold. The exposure image may be an image obtained by processing the foreground sub-images and the background image using an image fusion technique.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of an image processing method according to an embodiment of the present invention. Specifically, as shown in FIG. 1, the image processing method of this embodiment may include the following steps:
101. Select at least two frames of key images from target video data according to an image selection algorithm corresponding to the target video data.
Specifically, the terminal may pre-establish image selection algorithms corresponding to different video data. When the target video data needs to be processed, the terminal may obtain the image selection algorithm corresponding to the target video data and select at least two frames of key images from the target video data. The image selection algorithm is used to select key images, and a key image may include a foreground object. For example, the terminal may acquire the foreground sub-images contained in each of the at least two frames of images included in the target video data, obtain the foreground sub-image located at the center point through statistics over the foreground sub-images, and, according to the spatial information and time information of that foreground sub-image, determine the images to which the selected foreground sub-images belong as key images. A key image may be as shown in FIG. 3.
Optionally, the terminal may obtain the application scenario of the target video data, obtain the image selection algorithm corresponding to the application scenario according to a preset correspondence between application scenarios and image selection algorithms, and use the target video data as the input of that image selection algorithm to obtain the at least two frames of key images.
Specifically, the terminal may pre-establish image selection algorithms corresponding to different application scenarios. When the target video data needs to be processed, the terminal may obtain the application scenario of the target video data and the image selection algorithm corresponding to that scenario, use that algorithm as the image selection algorithm corresponding to the target video data, take the target video data as its input, and use the images it outputs as the at least two frames of key images. The application scenario may include the motion posture of the foreground object, such as a jumping posture, a Thousand-Hand Guanyin posture, or a martial-arts posture.
Optionally, the image selection algorithm may specifically be: acquire one frame of image from the target video data every preset number of frames, and use the acquired images as key images. The preset number of frames may be set in advance, for example, every three frames or every five frames.
For example, if the target video data includes 10 frames of images and the terminal acquires one frame from the target video data every two frames, the terminal may use the first, fourth, seventh, and tenth frame images as key images.
Exemplarily, when the motion posture of the foreground object presents a Thousand-Hand Guanyin posture or a martial-arts posture, the terminal may determine that the application scenario of the target video data is a first application scenario, obtain the image selection algorithm corresponding to the first application scenario, and use the target video data as the input of that algorithm; the terminal may then acquire one frame of image from the target video data every preset number of frames and use the acquired images as key images.
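The every-N-frames rule above can be sketched in a few lines. This is only an illustration, not the patented implementation: `frames` is assumed to be any sequence of frame arrays, and `gap` is the assumed number of frames skipped between consecutive picks (gap = 2 reproduces the 10-frame example, picking the 1st, 4th, 7th, and 10th frames).

```python
def select_key_frames(frames, gap):
    """Pick one frame, then skip `gap` frames, repeatedly.

    With 10 frames and gap=2 this yields indices 0, 3, 6, 9,
    i.e. the 1st, 4th, 7th and 10th frames of the example above.
    """
    return frames[::gap + 1]
```
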
Optionally, the terminal may acquire, according to the background image, a foreground sub-image from each frame of image included in the target video data, select target foreground sub-images according to the spatial information and time information of each foreground sub-image, and determine the images to which the target foreground sub-images belong as key images.
For example, if the motion posture of the foreground object in the at least two frames of images included in the target video data is a jumping posture, then after acquiring a foreground sub-image from each frame of image according to the background image, the terminal may select, according to the spatial information and time information of each foreground sub-image, the foreground sub-images in which the motion posture of the foreground object is taking off, jumping to the highest point, and landing, use the selected foreground sub-images as target foreground sub-images, and determine the images to which they belong as key images.
Exemplarily, when the motion posture of the foreground object is a jumping posture, the terminal may determine that the application scenario of the target video data is a second application scenario, obtain the image selection algorithm corresponding to the second application scenario, and use the target video data as the input of that algorithm; the terminal may then acquire a foreground sub-image from each frame of image included in the target video data according to the background image, select target foreground sub-images according to the spatial information and time information of each foreground sub-image, and determine the images to which the target foreground sub-images belong as key images.
102. Acquire, from the key image of each frame, a foreground sub-image containing the foreground object.
Specifically, after selecting at least two frames of key images from the target video data according to the image selection algorithm corresponding to the target video data, the terminal may process each frame of key image to acquire a foreground sub-image containing the foreground object from the key image. As shown in FIG. 3, a foreground sub-image may include a pedestrian whose motion posture is taking off.
Optionally, the terminal may acquire, according to the background image, the foreground sub-image containing the foreground object from the key image of each frame.
Specifically, the terminal may compare the background image with the at least two frames of key images and acquire the foreground sub-image containing the foreground object from each frame of key image. For example, the terminal may compare the background image with a key image based on global variation factors. Moreover, because scenes vary widely, a branch-switching strategy may also be adopted here: a suitable sub-algorithm is selected according to the application scenario of the target video data, and the background image and the key image are processed according to that sub-algorithm to acquire the foreground sub-image containing the foreground object. At least one foreground sub-image may be acquired from a single key image.
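One minimal way to realize the background-vs-key-image comparison is per-pixel background subtraction. The NumPy sketch below is an illustration under simplifying assumptions (a static background and a fixed difference threshold); the names `foreground_mask` and `foreground_bbox` are ours, not the document's.

```python
import numpy as np

def foreground_mask(frame, background, thresh=30):
    """Label as foreground every pixel that differs from the
    background by more than `thresh` in any colour channel."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff.max(axis=-1) > thresh

def foreground_bbox(mask):
    """Tight bounding box (top, bottom, left, right) of the mask,
    or None when no foreground pixel was found."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(ys.min()), int(ys.max()), int(xs.min()), int(xs.max())
```

The bounding box of the mask is one possible definition of the "region containing the foreground object" described earlier.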
Optionally, when the foreground object is a pedestrian, the terminal may perform pedestrian recognition on the key image of each frame according to a pedestrian recognition algorithm to obtain the foreground sub-image containing the pedestrian.
Specifically, the terminal may perform face recognition on a key image. When a face is recognized in the key image, the terminal may determine that the foreground object is a pedestrian, and then perform pedestrian recognition on the key image according to the pedestrian recognition algorithm to obtain the foreground sub-image containing the pedestrian.
For example, when the terminal determines that the foreground object is a pedestrian, it may perform feature extraction on the key image to obtain a Histogram of Oriented Gradients (HOG) feature, and then use the HOG feature as the input of an SVM classifier to obtain the foreground sub-image containing the pedestrian.
As another example, the terminal may perform pedestrian recognition on the key image using a pedestrian recognition algorithm such as Regions with Convolutional Neural Network features (R-CNN), Fast R-CNN, or Faster R-CNN, to obtain the foreground sub-image containing the pedestrian.
In this embodiment of the present invention, pedestrian recognition is performed on the key image according to a pedestrian recognition algorithm to obtain the foreground sub-image containing the pedestrian. Even if the foreground object in the target video data is in a non-moving (that is, stationary) state, the terminal can still obtain the foreground sub-image by the above method, which improves the recognition efficiency of foreground objects and effectively realizes multiple exposure.
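To make the HOG-plus-SVM step concrete, here is a deliberately simplified NumPy sketch: per-cell histograms of unsigned gradient orientation with one global normalization, scored by a linear SVM decision function. Real detectors add block normalization and a sliding window, and the weight vector `w` and bias `b` are assumed to come from prior training; none of this is the patent's own code.

```python
import numpy as np

def hog_features(gray, cell=8, bins=9):
    """Simplified HOG descriptor for one grayscale detection window."""
    gy, gx = np.gradient(gray.astype(np.float64))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0          # unsigned orientation
    rows, cols = gray.shape[0] // cell, gray.shape[1] // cell
    idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)
    hist = np.zeros((rows, cols, bins))
    for i in range(rows):
        for j in range(cols):
            sl = np.s_[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            hist[i, j] = np.bincount(idx[sl].ravel(),
                                     weights=mag[sl].ravel(),
                                     minlength=bins)
    feat = hist.ravel()
    return feat / (np.linalg.norm(feat) + 1e-9)           # crude global normalisation

def svm_decision(feat, w, b):
    """Linear SVM decision value; a positive value means 'pedestrian'."""
    return float(feat @ w + b)
```

A 128x64 window with 8-pixel cells and 9 bins yields a 1152-dimensional descriptor; sliding this window over the key image and keeping windows with positive decision values would produce the pedestrian sub-images.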
Optionally, after acquiring the foreground sub-image containing the foreground object from a key image, the terminal may acquire, according to the time information of the foreground sub-image, the images in the target video data whose time information is greater than that of the foreground sub-image, perform image fusion between the foreground sub-image and each acquired image to update the acquired images, and update the target video data according to the updated images, so that the updated target video data includes the updated images.
For example, the terminal selects four frames of key images from the target video data: the first, fourth, seventh, and tenth frame images. The motion posture of the foreground object in the first foreground sub-image, acquired from the first frame image, is a run-up; that in the second foreground sub-image, acquired from the fourth frame image, is taking off; that in the third foreground sub-image, acquired from the seventh frame image, is jumping to the highest point; and that in the fourth foreground sub-image, acquired from the tenth frame image, is landing. The terminal may determine that the time information of the first foreground sub-image is the first frame, so the images in the target video data whose time information is greater than the first frame are frames 2-10; the first foreground sub-image is then fused with each of frames 2-10 to obtain updated frames 2-10. Similarly, the terminal may determine that the time information of the second foreground sub-image is the fourth frame, so the images whose time information is greater than the fourth frame are frames 5-10; the second foreground sub-image is then fused with the updated frames 5-10 to obtain further updated frames 5-10. Similarly, the terminal may determine that the time information of the third foreground sub-image is the seventh frame, so the images whose time information is greater than the seventh frame are frames 8-10; the third foreground sub-image is then fused with the updated frames 8-10 to obtain further updated frames 8-10. The terminal may determine that the time information of the fourth foreground sub-image is the tenth frame; since no image in the target video data has time information greater than the tenth frame, the terminal may update the target video data, so that the updated target video data includes the updated images. For example, the target video data then includes the first frame image and the updated frames 2-10, where the updated second frame image is obtained by fusing the first foreground sub-image with the second frame image; the updated fifth frame image is obtained by fusing the first and second foreground sub-images with the fifth frame image; the updated eighth frame image is obtained by fusing the first, second, and third foreground sub-images with the eighth frame image; and the updated tenth frame image is obtained by fusing the first, second, third, and fourth foreground sub-images with the tenth frame image.
103. Perform image fusion on the foreground sub-image of each frame and the background image of the target video data to obtain an exposure image.
Specifically, the terminal may perform image fusion on all the foreground sub-images and the background image to obtain the exposure image, which may be as shown in FIG. 4. The background image may be obtained by the terminal processing the target video data, or may be captured by the terminal through a camera device, retrieved from local storage, or obtained from the Internet.
Optionally, before performing image fusion on the foreground sub-image of each frame and the background image of the target video data to obtain the exposure image, the terminal may process the target video data to obtain the background image, which may be as shown in FIG. 2.
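The document does not fix how the background image is derived from the video; one common estimate, shown here only as an assumed illustration, is the per-pixel temporal median: a moving foreground object covers any given pixel in only a minority of frames, so the median over time recovers the static background.

```python
import numpy as np

def estimate_background(frames):
    """Per-pixel temporal median over a stack of equally sized frames."""
    stack = np.stack([np.asarray(f) for f in frames], axis=0)
    return np.median(stack, axis=0).astype(stack.dtype)
```
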
Optionally, the terminal may obtain the position of each foreground sub-image within the key image to which it belongs, and perform image fusion on the foreground sub-image and the background image according to that position to obtain the exposure image.
For example, if the foreground object contained in the first foreground sub-image is located on the right side of the first frame image, the terminal fuses the first foreground sub-image with the background image according to that position to obtain the exposure image. In the exposure image, the foreground object contained in the first foreground sub-image is located on the right side, and the distance between the foreground object and each edge of the exposure image is the same as the distance between the foreground object and the corresponding edge of the first frame image.
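The position-preserving fusion described above can be sketched as a masked paste: foreground pixels replace background pixels at the very coordinates the sub-image occupied in its key image, so the distances to the image edges are preserved. This is an illustrative sketch (hard mask, no blending); a production fusion might feather the mask edges instead.

```python
import numpy as np

def fuse_at_position(background, patch, mask, top, left):
    """Paste the masked pixels of `patch` into a copy of `background`
    at (top, left) -- the patch's position in its key image."""
    out = background.copy()
    h, w = mask.shape
    region = out[top:top + h, left:left + w]   # view into the copy
    region[mask] = patch[mask]                 # foreground pixels only
    return out
```

Repeating this paste once per foreground sub-image over the single background image yields the multiple-exposure composite.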
In this embodiment of the present invention, key images are selected from target video data according to an image selection algorithm corresponding to the target video data, foreground sub-images containing a foreground object are acquired from the key images, and image fusion is performed on the foreground sub-images and the background image of the target video data to obtain an exposure image. The operation is convenient, multiple exposure can be effectively realized, and shooting difficulty is reduced.
An embodiment of the present invention further provides a computer storage medium storing program instructions, and execution of the program may include some or all of the steps of the image processing method in the embodiment corresponding to FIG. 1.
Referring to FIG. 5, FIG. 5 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention. The image processing apparatus described in this embodiment includes:
an image acquisition module 501, configured to select at least two frames of key images from target video data according to an image selection algorithm corresponding to the target video data, where the key image of each frame includes a foreground object;

a sub-image acquisition module 502, configured to acquire, from the key image of each frame, a foreground sub-image containing the foreground object; and

an image fusion module 503, configured to perform image fusion on the foreground sub-image of each frame and a background image of the target video data to obtain an exposure image.
Optionally, the sub-image acquisition module 502 is specifically configured to acquire, according to the background image, the foreground sub-image containing the foreground object from the key image of each frame.
Optionally, when the foreground object is a pedestrian, the sub-image acquisition module 502 is specifically configured to perform pedestrian recognition on the key image of each frame according to a pedestrian recognition algorithm to obtain the foreground sub-image containing the pedestrian.
Optionally, the image acquisition module 501 is further configured to process the target video data to obtain the background image before the image fusion module 503 performs image fusion on the foreground sub-image of each frame and the background image of the target video data to obtain the exposure image.
Optionally, the image acquisition module 501 is specifically configured to:

obtain the application scenario of the target video data;

obtain the image selection algorithm corresponding to the application scenario according to a preset correspondence between application scenarios and image selection algorithms; and

use the target video data as the input of the image selection algorithm to obtain the at least two frames of key images.
Optionally, when using the target video data as the input of the image selection algorithm to obtain the key images, the image acquisition module 501 is specifically configured to:

acquire one frame of image from the target video data every preset number of frames; and

use the acquired images as the key images.
Optionally, when using the target video data as the input of the image selection algorithm to obtain the key images, the image acquisition module 501 is specifically configured to:

acquire, according to the background image, a foreground sub-image from each frame of image included in the target video data;

select target foreground sub-images according to the spatial information and time information of each foreground sub-image; and

determine the images to which the target foreground sub-images belong as the key images.
可选的,所述图像融合模块503,具体用于:Optionally, the image fusion module 503 is specifically configured to:
获取各帧所述前景子图像在所述前景子图像所属关键图像中的位置;Obtaining, in each frame, a position of the foreground sub-image in a key image to which the foreground sub-image belongs;
根据所述位置,将所述前景子图像和所述背景图像进行图像融合,得到所述曝光图像。According to the position, the foreground sub-image and the background image are image-fused to obtain the exposed image.
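The position-based fusion described above can be illustrated as pasting each foreground patch back into the background at the location it occupied in its key image. The mask-based compositing here is an assumption about how "image fusion" is realized, not the patent's mandated method:

```python
import numpy as np

def fuse_at_position(background, patch, mask, top_left):
    # Paste the foreground patch into the background at the position it
    # occupied in its own key image; `mask` marks the foreground pixels
    # inside the patch.
    y, x = top_left
    h, w = patch.shape[:2]
    out = background.copy()
    region = out[y:y + h, x:x + w]  # a view into `out`
    region[mask] = patch[mask]
    return out
```

Repeating this for every key frame's foreground over one shared background yields the multiple-exposure image.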
可选的，所述图像获取模块501，还用于所述子图像获取模块502在各帧所述关键图像中获取包含所述前景物体的前景子图像之后，根据所述前景子图像的时间信息，在所述目标视频数据中获取时间信息大于所述前景子图像的时间信息的图像；Optionally, the image obtaining module 501 is further configured to: after the sub-image obtaining module 502 acquires the foreground sub-image including the foreground object in the key image of each frame, obtain, from the target video data, images whose time information is greater than the time information of the foreground sub-image according to the time information of the foreground sub-image;
所述图像融合模块503,还用于将所述前景子图像和各个获取到的图像进行图像融合,以对所述获取到的图像进行更新;The image fusion module 503 is further configured to image fuse the foreground sub-image and each acquired image to update the acquired image;
所述图像处理装置还包括:The image processing apparatus further includes:
更新模块504,用于根据更新后的图像,对所述目标视频数据进行更新,更新后的目标视频数据包括所述更新后的图像。The update module 504 is configured to update the target video data according to the updated image, and the updated target video data includes the updated image.
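Propagating a key frame's foreground into all frames with later time information, as the update module 504 describes, can be sketched as follows (assuming the same mask-paste fusion as the foreground/background step; the function and parameter names are illustrative):

```python
import numpy as np

def propagate_foreground(frames, patch, mask, top_left, key_index):
    # Fuse one key frame's foreground into every later frame so that the
    # updated video shows the exposure trail accumulating over time.
    y, x = top_left
    h, w = patch.shape[:2]
    updated = [f.copy() for f in frames]
    for i in range(key_index + 1, len(updated)):
        region = updated[i][y:y + h, x:x + w]
        region[mask] = patch[mask]
    return updated
```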
可以理解的是，本发明实施例的图像处理装置的各功能模块的功能可根据上述方法实施例中的方法具体实现，其具体实现过程可以参照上述方法实施例的相关描述，此处不再赘述。It can be understood that the functions of the functional modules of the image processing apparatus in this embodiment of the present invention may be specifically implemented according to the methods in the foregoing method embodiments; for the specific implementation process, reference may be made to the related descriptions of the foregoing method embodiments, and details are not described herein again.
本发明实施例中图像获取模块501根据目标视频数据对应的图像选取算法，在目标视频数据中选取关键图像，子图像获取模块502在关键图像中获取包含前景物体的前景子图像，图像融合模块503将前景子图像和目标视频数据的背景图像进行图像融合，得到曝光图像，操作便捷，可有效实现多重曝光，降低拍摄难度。In this embodiment of the present invention, the image obtaining module 501 selects key images from the target video data according to the image selection algorithm corresponding to the target video data, the sub-image obtaining module 502 acquires foreground sub-images including the foreground object from the key images, and the image fusion module 503 fuses the foreground sub-images with the background image of the target video data to obtain an exposure image. The operation is convenient, multiple exposure can be effectively achieved, and the shooting difficulty is reduced.
请参阅图6,为本发明实施例提供的一种终端的结构示意图。本实施例中所描述的终端,包括:存储器601和处理器602。上述处理器602和存储器601通过总线连接。 FIG. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention. The terminal described in this embodiment includes: a memory 601 and a processor 602. The above processor 602 and memory 601 are connected by a bus.
上述处理器602可以是中央处理单元（Central Processing Unit，CPU），该处理器还可以是其他通用处理器、数字信号处理器（Digital Signal Processor，DSP）、专用集成电路（Application Specific Integrated Circuit，ASIC）、现成可编程门阵列（Field-Programmable Gate Array，FPGA）或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。The processor 602 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or any conventional processor.
上述存储器601可以包括只读存储器和随机存取存储器，并向处理器602提供指令和数据。存储器601的一部分还可以包括非易失性随机存取存储器。其中：The memory 601 may include a read-only memory and a random access memory, and provides instructions and data to the processor 602. A portion of the memory 601 may further include a non-volatile random access memory. Specifically:
所述存储器601,用于存储程序指令;The memory 601 is configured to store program instructions;
所述处理器602,用于调用所述程序指令,当所述程序指令被执行时,执行以下操作:The processor 602 is configured to invoke the program instruction, and when the program instruction is executed, perform the following operations:
根据目标视频数据对应的图像选取算法,在所述目标视频数据中选取至少两帧关键图像,各帧所述关键图像均包括前景物体;Selecting at least two frames of key images in the target video data according to an image selection algorithm corresponding to the target video data, wherein the key images of each frame include a foreground object;
在各帧所述关键图像中获取包含所述前景物体的前景子图像;Obtaining a foreground sub-image including the foreground object in the key image of each frame;
将各帧所述前景子图像和所述目标视频数据的背景图像进行图像融合,得到曝光图像。The foreground sub-image of each frame and the background image of the target video data are image-fused to obtain an exposure image.
可选的,所述处理器602在各帧所述关键图像中获取包含所述前景物体的前景子图像,具体用于:Optionally, the processor 602 obtains, in the key image of each frame, a foreground sub-image that includes the foreground object, specifically for:
根据所述背景图像,在各帧所述关键图像中获取包含所述前景物体的前景子图像。According to the background image, a foreground sub-image including the foreground object is acquired in the key image of each frame.
可选的,所述前景物体为行人,则所述处理器602在各帧所述关键图像中获取包含所述前景物体的前景子图像,具体用于:Optionally, the foreground object is a pedestrian, and the processor 602 obtains a foreground sub-image including the foreground object in the key image of each frame, specifically:
根据行人识别算法对各帧所述关键图像进行行人识别,得到包含所述行人的前景子图像。Pedestrian recognition is performed on the key image of each frame according to a pedestrian recognition algorithm to obtain a foreground sub-image including the pedestrian.
可选的，所述处理器602，还用于将各帧所述前景子图像和所述目标视频数据的背景图像进行图像融合，得到曝光图像之前，对所述目标视频数据进行处理，得到所述背景图像。Optionally, the processor 602 is further configured to process the target video data to obtain the background image before fusing the foreground sub-image of each frame with the background image of the target video data to obtain the exposure image.
可选的，所述处理器602根据目标视频数据对应的图像选取算法，在所述目标视频数据中选取至少两帧关键图像，具体用于：Optionally, when selecting at least two frames of key images from the target video data according to the image selection algorithm corresponding to the target video data, the processor 602 is specifically configured to:
获取所述目标视频数据的应用场景;Obtaining an application scenario of the target video data;
根据预先设定的应用场景和图像选取算法的对应关系,获取所述应用场景对应的图像选取算法;Acquiring an image selection algorithm corresponding to the application scenario according to a corresponding relationship between the preset application scenario and the image selection algorithm;
将所述目标视频数据作为所述图像选取算法的输入,得到所述至少两帧关键图像。Using the target video data as an input of the image selection algorithm, the at least two frames of key images are obtained.
可选的,所述处理器602将所述目标视频数据作为所述图像选取算法的输入,得到所述至少两帧关键图像,具体用于:Optionally, the processor 602 uses the target video data as an input of the image selection algorithm to obtain the at least two frames of key images, specifically for:
每间隔预设数量帧在所述目标视频数据中获取一帧图像;Obtaining one frame of image in the target video data every predetermined number of frames;
将获取到的图像作为所述关键图像。The acquired image is taken as the key image.
可选的,所述处理器602将所述目标视频数据作为所述图像选取算法的输入,得到所述至少两帧关键图像,具体用于:Optionally, the processor 602 uses the target video data as an input of the image selection algorithm to obtain the at least two frames of key images, specifically for:
根据所述背景图像,在所述目标视频数据所包含的每一帧图像中获取前景子图像;Obtaining a foreground sub-image in each frame image included in the target video data according to the background image;
根据各个所述前景子图像的空间信息和时间信息,选取目标前景子图像;Selecting a target foreground sub-image according to spatial information and time information of each of the foreground sub-images;
将所述目标前景子图像所属图像确定为关键图像。The image to which the target foreground sub-image belongs is determined as a key image.
可选的,所述处理器602将各帧所述前景子图像和所述背景图像进行图像融合,得到曝光图像,具体用于:Optionally, the processor 602 performs image fusion on the foreground sub-image and the background image of each frame to obtain an exposure image, specifically for:
获取各帧所述前景子图像在所述前景子图像所属关键图像中的位置;Obtaining, in each frame, a position of the foreground sub-image in a key image to which the foreground sub-image belongs;
根据所述位置,将所述前景子图像和所述背景图像进行图像融合,得到所述曝光图像。According to the position, the foreground sub-image and the background image are image-fused to obtain the exposed image.
可选的，所述处理器602，还用于在各帧所述关键图像中获取包含所述前景物体的前景子图像之后，根据所述前景子图像的时间信息，在所述目标视频数据中获取时间信息大于所述前景子图像的时间信息的图像；Optionally, the processor 602 is further configured to: after acquiring the foreground sub-image including the foreground object in the key image of each frame, obtain, from the target video data, images whose time information is greater than the time information of the foreground sub-image according to the time information of the foreground sub-image;
所述处理器602,还用于将所述前景子图像和各个获取到的图像进行图像融合,以对所述获取到的图像进行更新;The processor 602 is further configured to perform image fusion on the foreground sub-image and each acquired image to update the acquired image;
所述处理器602,还用于根据更新后的图像,对所述目标视频数据进行更新,更新后的目标视频数据包括所述更新后的图像。The processor 602 is further configured to update the target video data according to the updated image, where the updated target video data includes the updated image.
具体实现中，本发明实施例中所描述处理器602可执行本发明实施例图1提供的图像处理方法中所描述的实现方式，也可执行本发明实施例图5所描述的图像处理装置的实现方式，在此不再赘述。In a specific implementation, the processor 602 described in this embodiment of the present invention may perform the implementations described in the image processing method provided in FIG. 1 of the embodiments of the present invention, and may also perform the implementation of the image processing apparatus described in FIG. 5 of the embodiments of the present invention; details are not described herein again.
需要说明的是，对于前述的各个方法实施例，为了简单描述，故将其都表述为一系列的动作组合，但是本领域技术人员应该知悉，本发明并不受所描述的动作顺序的限制，因为依据本申请，某一些步骤可以采用其他顺序或者同时进行。其次，本领域技术人员也应该知悉，说明书中所描述的实施例均属于优选实施例，所涉及的动作和模块并不一定是本申请所必须的。It should be noted that, for ease of description, each of the foregoing method embodiments is expressed as a series of action combinations. However, a person skilled in the art should understand that the present invention is not limited by the described action sequence, because according to the present application, some steps may be performed in another order or simultaneously. In addition, a person skilled in the art should also understand that the embodiments described in this specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
本领域普通技术人员可以理解上述实施例的各种方法中的全部或部分步骤是可以通过程序来指令相关的硬件来完成，该程序可以存储于一计算机可读存储介质中，存储介质可以包括：闪存盘、只读存储器（Read-Only Memory，ROM）、随机存取器（Random Access Memory，RAM）、磁盘或光盘等。A person of ordinary skill in the art may understand that all or some of the steps in the methods of the foregoing embodiments may be completed by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
以上对本发明实施例所提供的一种控制终端的控制方法、装置、设备及飞行器进行了详细介绍，本文中应用了具体个例对本发明的原理及实施方式进行了阐述，以上实施例的说明只是用于帮助理解本发明的方法及其核心思想；同时，对于本领域的一般技术人员，依据本发明的思想，在具体实施方式及应用范围上均会有改变之处，综上所述，本说明书内容不应理解为对本发明的限制。The control method, apparatus, and device of a control terminal and the aircraft provided in the embodiments of the present invention are described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention; the description of the foregoing embodiments is merely intended to help understand the method and core idea of the present invention. Meanwhile, a person of ordinary skill in the art may make modifications to the specific implementations and application scope according to the idea of the present invention. In conclusion, the content of this specification shall not be construed as a limitation on the present invention.

Claims (27)

  1. 一种图像处理方法,其特征在于,所述方法包括:An image processing method, the method comprising:
    根据目标视频数据对应的图像选取算法,在所述目标视频数据中选取至少两帧关键图像,各帧所述关键图像均包括前景物体;Selecting at least two frames of key images in the target video data according to an image selection algorithm corresponding to the target video data, wherein the key images of each frame include a foreground object;
    在各帧所述关键图像中获取包含所述前景物体的前景子图像;Obtaining a foreground sub-image including the foreground object in the key image of each frame;
    将各帧所述前景子图像和所述目标视频数据的背景图像进行图像融合,得到曝光图像。The foreground sub-image of each frame and the background image of the target video data are image-fused to obtain an exposure image.
  2. 如权利要求1所述的方法,其特征在于,所述在各帧所述关键图像中获取包含所述前景物体的前景子图像,包括:The method of claim 1, wherein the acquiring a foreground sub-image comprising the foreground object in the key image of each frame comprises:
    根据所述背景图像,在各帧所述关键图像中获取包含所述前景物体的前景子图像。According to the background image, a foreground sub-image including the foreground object is acquired in the key image of each frame.
  3. 如权利要求1所述的方法,其特征在于,所述前景物体为行人;The method of claim 1 wherein said foreground object is a pedestrian;
    所述在各帧所述关键图像中获取包含所述前景物体的前景子图像,包括:Obtaining, in the key image of each frame, a foreground sub-image including the foreground object, including:
    根据行人识别算法对各帧所述关键图像进行行人识别,得到包含所述行人的前景子图像。Pedestrian recognition is performed on the key image of each frame according to a pedestrian recognition algorithm to obtain a foreground sub-image including the pedestrian.
  4. 如权利要求1所述的方法，其特征在于，所述将各帧所述前景子图像和所述目标视频数据的背景图像进行图像融合，得到曝光图像之前，还包括：The method according to claim 1, wherein before the fusing of the foreground sub-image of each frame with the background image of the target video data to obtain the exposure image, the method further comprises:
    对所述目标视频数据进行处理,得到所述背景图像。The target video data is processed to obtain the background image.
  5. 如权利要求1所述的方法,其特征在于,所述根据目标视频数据对应的图像选取算法,在所述目标视频数据中选取至少两帧关键图像,包括:The method according to claim 1, wherein the image selection algorithm corresponding to the target video data selects at least two frame key images in the target video data, including:
    获取所述目标视频数据的应用场景;Obtaining an application scenario of the target video data;
    根据预先设定的应用场景和图像选取算法的对应关系,获取所述应用场景对应的图像选取算法;Acquiring an image selection algorithm corresponding to the application scenario according to a corresponding relationship between the preset application scenario and the image selection algorithm;
    将所述目标视频数据作为所述图像选取算法的输入,得到所述至少两帧关键图像。 Using the target video data as an input of the image selection algorithm, the at least two frames of key images are obtained.
  6. 如权利要求5所述的方法,其特征在于,所述将所述目标视频数据作为所述图像选取算法的输入,得到所述至少两帧关键图像,包括:The method of claim 5, wherein the using the target video data as an input of the image selection algorithm to obtain the at least two frames of key images comprises:
    每间隔预设数量帧在所述目标视频数据中获取一帧图像;Obtaining one frame of image in the target video data every predetermined number of frames;
    将获取到的图像作为所述关键图像。The acquired image is taken as the key image.
  7. 如权利要求5所述的方法,其特征在于,所述将所述目标视频数据作为所述图像选取算法的输入,得到所述至少两帧关键图像,包括:The method of claim 5, wherein the using the target video data as an input of the image selection algorithm to obtain the at least two frames of key images comprises:
    根据所述背景图像,在所述目标视频数据所包含的每一帧图像中获取前景子图像;Obtaining a foreground sub-image in each frame image included in the target video data according to the background image;
    根据各个所述前景子图像的空间信息和时间信息,选取目标前景子图像;Selecting a target foreground sub-image according to spatial information and time information of each of the foreground sub-images;
    将所述目标前景子图像所属图像确定为所述关键图像。An image to which the target foreground sub-image belongs is determined as the key image.
  8. 如权利要求1所述的方法,其特征在于,所述将各帧所述前景子图像和所述目标视频数据的背景图像进行图像融合,得到曝光图像,包括:The method according to claim 1, wherein the image fusion of the foreground sub-image of each frame and the background image of the target video data to obtain an exposure image comprises:
    获取各帧所述前景子图像在所述前景子图像所属关键图像中的位置;Obtaining, in each frame, a position of the foreground sub-image in a key image to which the foreground sub-image belongs;
    根据所述位置,将所述前景子图像和所述背景图像进行图像融合,得到所述曝光图像。According to the position, the foreground sub-image and the background image are image-fused to obtain the exposed image.
  9. 如权利要求1所述的方法,其特征在于,所述在各帧所述关键图像中获取包含所述前景物体的前景子图像之后,还包括:The method according to claim 1, wherein after acquiring the foreground sub-image including the foreground object in the key image of each frame, the method further comprises:
    根据所述前景子图像的时间信息,在所述目标视频数据中获取时间信息大于所述前景子图像的时间信息的图像;Obtaining, in the target video data, an image in which time information is greater than time information of the foreground sub-image according to time information of the foreground sub-image;
    将所述前景子图像和各个获取到的图像进行图像融合,以对所述获取到的图像进行更新;Performing image fusion on the foreground sub-image and each acquired image to update the acquired image;
    根据更新后的图像,对所述目标视频数据进行更新,更新后的目标视频数据包括所述更新后的图像。The target video data is updated according to the updated image, and the updated target video data includes the updated image.
  10. 一种图像处理装置,其特征在于,所述装置包括: An image processing apparatus, characterized in that the apparatus comprises:
    图像获取模块,用于根据目标视频数据对应的图像选取算法,在所述目标视频数据中选取至少两帧关键图像,各帧所述关键图像均包括前景物体;An image obtaining module, configured to select at least two frame key images in the target video data according to an image selection algorithm corresponding to the target video data, where the key images in each frame include a foreground object;
    子图像获取模块,用于在各帧所述关键图像中获取包含所述前景物体的前景子图像;a sub-image obtaining module, configured to acquire a foreground sub-image including the foreground object in the key image of each frame;
    图像融合模块,用于将各帧所述前景子图像和所述目标视频数据的背景图像进行图像融合,得到曝光图像。And an image fusion module, configured to image fuse the foreground sub-image of each frame and the background image of the target video data to obtain an exposure image.
  11. 如权利要求10所述的装置,其特征在于,The device of claim 10 wherein:
    所述子图像获取模块,具体用于根据所述背景图像,在各帧所述关键图像中获取包含所述前景物体的前景子图像。And the sub-image obtaining module is configured to acquire a foreground sub-image including the foreground object in the key image of each frame according to the background image.
  12. 如权利要求10所述的装置,其特征在于,所述前景物体为行人;The device of claim 10 wherein said foreground object is a pedestrian;
    所述子图像获取模块,具体用于根据行人识别算法对各帧所述关键图像进行行人识别,得到包含所述行人的前景子图像。The sub-image obtaining module is specifically configured to perform pedestrian recognition on the key image of each frame according to a pedestrian recognition algorithm to obtain a foreground sub-image including the pedestrian.
  13. 如权利要求10所述的装置,其特征在于,The device of claim 10 wherein:
    所述图像获取模块，还用于所述图像融合模块将各帧所述前景子图像和所述目标视频数据的背景图像进行图像融合，得到曝光图像之前，对所述目标视频数据进行处理，得到所述背景图像。The image acquisition module is further configured to process the target video data to obtain the background image before the image fusion module fuses the foreground sub-image of each frame with the background image of the target video data to obtain the exposure image.
  14. 如权利要求10所述的装置,其特征在于,所述图像获取模块,具体用于:The device according to claim 10, wherein the image acquisition module is specifically configured to:
    获取所述目标视频数据的应用场景;Obtaining an application scenario of the target video data;
    根据预先设定的应用场景和图像选取算法的对应关系,获取所述应用场景对应的图像选取算法;Acquiring an image selection algorithm corresponding to the application scenario according to a corresponding relationship between the preset application scenario and the image selection algorithm;
    将所述目标视频数据作为所述图像选取算法的输入,得到所述至少两帧关键图像。Using the target video data as an input of the image selection algorithm, the at least two frames of key images are obtained.
  15. 如权利要求14所述的装置，其特征在于，所述图像获取模块将所述目标视频数据作为所述图像选取算法的输入，得到所述关键图像，具体用于：The apparatus of claim 14, wherein when the image acquisition module uses the target video data as an input of the image selection algorithm to obtain the key image, the image acquisition module is specifically configured to:
    每间隔预设数量帧在所述目标视频数据中获取一帧图像;Obtaining one frame of image in the target video data every predetermined number of frames;
    将获取到的图像作为所述关键图像。The acquired image is taken as the key image.
  16. 如权利要求14所述的装置,其特征在于,所述图像获取模块将所述目标视频数据作为所述图像选取算法的输入,得到所述关键图像,具体用于:The device according to claim 14, wherein the image acquisition module uses the target video data as an input of the image selection algorithm to obtain the key image, specifically for:
    根据所述背景图像,在所述目标视频数据所包含的每一帧图像中获取前景子图像;Obtaining a foreground sub-image in each frame image included in the target video data according to the background image;
    根据各个所述前景子图像的空间信息和时间信息,选取目标前景子图像;Selecting a target foreground sub-image according to spatial information and time information of each of the foreground sub-images;
    将所述目标前景子图像所属图像确定为所述关键图像。An image to which the target foreground sub-image belongs is determined as the key image.
  17. 如权利要求10所述的装置,其特征在于,所述图像融合模块,具体用于:The device according to claim 10, wherein the image fusion module is specifically configured to:
    获取各帧所述前景子图像在所述前景子图像所属关键图像中的位置;Obtaining, in each frame, a position of the foreground sub-image in a key image to which the foreground sub-image belongs;
    根据所述位置,将所述前景子图像和所述背景图像进行图像融合,得到所述曝光图像。According to the position, the foreground sub-image and the background image are image-fused to obtain the exposed image.
  18. 如权利要求10所述的装置,其特征在于,The device of claim 10 wherein:
    所述图像获取模块，还用于所述子图像获取模块在各帧所述关键图像中获取包含所述前景物体的前景子图像之后，根据所述前景子图像的时间信息，在所述目标视频数据中获取时间信息大于所述前景子图像的时间信息的图像；The image acquisition module is further configured to: after the sub-image acquisition module acquires the foreground sub-image including the foreground object in the key image of each frame, obtain, from the target video data, images whose time information is greater than the time information of the foreground sub-image according to the time information of the foreground sub-image;
    所述图像融合模块,还用于将所述前景子图像和各个获取到的图像进行图像融合,以对所述获取到的图像进行更新;The image fusion module is further configured to image fuse the foreground sub-image and each acquired image to update the acquired image;
    所述图像处理装置还包括:The image processing apparatus further includes:
    更新模块,用于根据更新后的图像,对所述目标视频数据进行更新,更新后的目标视频数据包括所述更新后的图像。And an update module, configured to update the target video data according to the updated image, where the updated target video data includes the updated image.
  19. 一种终端,其特征在于,包括:存储器和处理器;A terminal, comprising: a memory and a processor;
    所述存储器,用于存储程序指令; The memory is configured to store program instructions;
    所述处理器,用于调用所述程序指令,当所述程序指令被执行时,执行以下操作:The processor is configured to invoke the program instruction, and when the program instruction is executed, perform the following operations:
    根据目标视频数据对应的图像选取算法,在所述目标视频数据中选取至少两帧关键图像,各帧所述关键图像均包括前景物体;Selecting at least two frames of key images in the target video data according to an image selection algorithm corresponding to the target video data, wherein the key images of each frame include a foreground object;
    在各帧所述关键图像中获取包含所述前景物体的前景子图像;Obtaining a foreground sub-image including the foreground object in the key image of each frame;
    将各帧所述前景子图像和所述目标视频数据的背景图像进行图像融合,得到曝光图像。The foreground sub-image of each frame and the background image of the target video data are image-fused to obtain an exposure image.
  20. 如权利要求19所述的终端,其特征在于,所述处理器在各帧所述关键图像中获取包含所述前景物体的前景子图像,具体用于:The terminal according to claim 19, wherein the processor acquires a foreground sub-image including the foreground object in the key image of each frame, specifically for:
    根据所述背景图像,在各帧所述关键图像中获取包含所述前景物体的前景子图像。According to the background image, a foreground sub-image including the foreground object is acquired in the key image of each frame.
  21. 如权利要求19所述的终端,其特征在于,所述前景物体为行人;The terminal according to claim 19, wherein said foreground object is a pedestrian;
    所述处理器在各帧所述关键图像中获取包含所述前景物体的前景子图像,具体用于:The processor acquires a foreground sub-image including the foreground object in the key image of each frame, specifically for:
    根据行人识别算法对各帧所述关键图像进行行人识别,得到包含所述行人的前景子图像。Pedestrian recognition is performed on the key image of each frame according to a pedestrian recognition algorithm to obtain a foreground sub-image including the pedestrian.
  22. 如权利要求19所述的终端,其特征在于,The terminal of claim 19, wherein:
    所述处理器，还用于将各帧所述前景子图像和所述目标视频数据的背景图像进行图像融合，得到曝光图像之前，对所述目标视频数据进行处理，得到所述背景图像。The processor is further configured to process the target video data to obtain the background image before fusing the foreground sub-image of each frame with the background image of the target video data to obtain the exposure image.
  23. 如权利要求19所述的终端,其特征在于,所述处理器根据目标视频数据对应的图像选取算法,在所述目标视频数据中选取至少两帧关键图像,具体用于:The terminal according to claim 19, wherein the processor selects at least two frame key images in the target video data according to an image selection algorithm corresponding to the target video data, specifically for:
    获取所述目标视频数据的应用场景;Obtaining an application scenario of the target video data;
    根据预先设定的应用场景和图像选取算法的对应关系，获取所述应用场景对应的图像选取算法；Acquiring the image selection algorithm corresponding to the application scenario according to a preset correspondence between application scenarios and image selection algorithms;
    将所述目标视频数据作为所述图像选取算法的输入,得到所述至少两帧关键图像。Using the target video data as an input of the image selection algorithm, the at least two frames of key images are obtained.
  24. 如权利要求23所述的终端,其特征在于,所述处理器将所述目标视频数据作为所述图像选取算法的输入,得到所述至少两帧关键图像,具体用于:The terminal according to claim 23, wherein the processor uses the target video data as an input of the image selection algorithm to obtain the at least two frame key images, specifically for:
    每间隔预设数量帧在所述目标视频数据中获取一帧图像;Obtaining one frame of image in the target video data every predetermined number of frames;
    将获取到的图像作为所述关键图像。The acquired image is taken as the key image.
  25. 如权利要求23所述的终端,其特征在于,所述处理器将所述目标视频数据作为所述图像选取算法的输入,得到所述至少两帧关键图像,具体用于:The terminal according to claim 23, wherein the processor uses the target video data as an input of the image selection algorithm to obtain the at least two frame key images, specifically for:
    根据所述背景图像,在所述目标视频数据所包含的每一帧图像中获取前景子图像;Obtaining a foreground sub-image in each frame image included in the target video data according to the background image;
    根据各个所述前景子图像的空间信息和时间信息,选取目标前景子图像;Selecting a target foreground sub-image according to spatial information and time information of each of the foreground sub-images;
    将所述目标前景子图像所属图像确定为关键图像。The image to which the target foreground sub-image belongs is determined as a key image.
  26. 如权利要求19所述的终端,其特征在于,所述处理器将各帧所述前景子图像和所述背景图像进行图像融合,得到曝光图像,具体用于:The terminal according to claim 19, wherein the processor combines the foreground sub-image and the background image of each frame to obtain an exposure image, specifically for:
    获取各帧所述前景子图像在所述前景子图像所属关键图像中的位置;Obtaining, in each frame, a position of the foreground sub-image in a key image to which the foreground sub-image belongs;
    根据所述位置,将所述前景子图像和所述背景图像进行图像融合,得到所述曝光图像。According to the position, the foreground sub-image and the background image are image-fused to obtain the exposed image.
  27. 如权利要求19所述的终端,其特征在于,The terminal of claim 19, wherein:
    所述处理器，还用于在各帧所述关键图像中获取包含所述前景物体的前景子图像之后，根据所述前景子图像的时间信息，在所述目标视频数据中获取时间信息大于所述前景子图像的时间信息的图像；The processor is further configured to: after acquiring the foreground sub-image including the foreground object in the key image of each frame, obtain, from the target video data, images whose time information is greater than the time information of the foreground sub-image according to the time information of the foreground sub-image;
    所述处理器,还用于将所述前景子图像和各个获取到的图像进行图像融合,以对所述获取到的图像进行更新;The processor is further configured to image fuse the foreground sub-image and each acquired image to update the acquired image;
    所述处理器,还用于根据更新后的图像,对所述目标视频数据进行更新,更新后的目标视频数据包括所述更新后的图像。 The processor is further configured to update the target video data according to the updated image, where the updated target video data includes the updated image.
PCT/CN2017/108314 2017-10-30 2017-10-30 Image processing method and apparatus, and terminal WO2019084712A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
PCT/CN2017/108314 WO2019084712A1 (en) 2017-10-30 2017-10-30 Image processing method and apparatus, and terminal
CN201780009967.8A CN108702463B (en) 2017-10-30 2017-10-30 Image processing method and device and terminal
CN202011396682.4A CN112541414A (en) 2017-10-30 2017-10-30 Image processing method and device and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/108314 WO2019084712A1 (en) 2017-10-30 2017-10-30 Image processing method and apparatus, and terminal

Publications (1)

Publication Number Publication Date
WO2019084712A1 true WO2019084712A1 (en) 2019-05-09

Family

ID=63844127

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/108314 WO2019084712A1 (en) 2017-10-30 2017-10-30 Image processing method and apparatus, and terminal

Country Status (2)

Country Link
CN (2) CN112541414A (en)
WO (1) WO2019084712A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113313788A (en) * 2020-02-26 2021-08-27 北京小米移动软件有限公司 Image processing method and apparatus, electronic device, and computer-readable storage medium
CN113808066A (en) * 2020-05-29 2021-12-17 Oppo广东移动通信有限公司 Image selection method and device, storage medium and electronic equipment
EP4068794A4 (en) * 2019-12-30 2022-12-28 Beijing Bytedance Network Technology Co., Ltd. Image processing method and apparatus

Families Citing this family (2)

Publication number Priority date Publication date Assignee Title
CN110070551B (en) * 2019-04-29 2020-06-30 北京字节跳动网络技术有限公司 Video image rendering method and device and electronic equipment
CN114795072B (en) * 2022-07-01 2022-10-04 广东欧谱曼迪科技有限公司 Endoscope light source control method and device, electronic equipment and storage medium

Citations (4)

Publication number Priority date Publication date Assignee Title
CN103035013A (en) * 2013-01-08 2013-04-10 东北师范大学 Accurate moving shadow detection method based on multi-feature fusion
CN105847694A (en) * 2016-04-27 2016-08-10 乐视控股(北京)有限公司 Multiple exposure shooting method and system based on picture synthesis
CN105959535A (en) * 2016-04-27 2016-09-21 乐视控股(北京)有限公司 Multiple exposure method and system based on picture synthesis
US20170134754A1 (en) * 2015-11-06 2017-05-11 Raytheon Company Efficient video data representation and content based video retrieval framework

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
KR100189053B1 (en) * 1996-05-06 1999-06-01 이대원 Digital still camera with the multi-exposure function
EP2606637B1 (en) * 2010-08-23 2016-09-21 Red.Com, Inc. High dynamic range video
CN103327253B (en) * 2013-06-26 2015-05-13 努比亚技术有限公司 Multiple exposure method and camera shooting device
CN106851125B (en) * 2017-03-31 2020-10-16 努比亚技术有限公司 Mobile terminal and multiple exposure shooting method


Cited By (4)

Publication number Priority date Publication date Assignee Title
EP4068794A4 (en) * 2019-12-30 2022-12-28 Beijing Bytedance Network Technology Co., Ltd. Image processing method and apparatus
US11798596B2 (en) 2019-12-30 2023-10-24 Beijing Bytedance Network Technology Co., Ltd. Image processing method and apparatus
CN113313788A (en) * 2020-02-26 2021-08-27 北京小米移动软件有限公司 Image processing method and apparatus, electronic device, and computer-readable storage medium
CN113808066A (en) * 2020-05-29 2021-12-17 Oppo广东移动通信有限公司 Image selection method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN108702463B (en) 2020-12-29
CN112541414A (en) 2021-03-23
CN108702463A (en) 2018-10-23

Similar Documents

Publication Publication Date Title
WO2019084712A1 (en) Image processing method and apparatus, and terminal
US11003893B2 (en) Face location tracking method, apparatus, and electronic device
US10284789B2 (en) Dynamic generation of image of a scene based on removal of undesired object present in the scene
US10580140B2 (en) Method and system of real-time image segmentation for image processing
US10917571B2 (en) Image capture device control based on determination of blur value of objects in images
US10134165B2 (en) Image distractor detection and processing
US10121256B2 (en) Temporal saliency map
KR20230013243A (en) Maintain a fixed size for the target object in the frame
US10467498B2 (en) Method and device for capturing images using image templates
US10620826B2 (en) Object selection based on region of interest fusion
US20180268207A1 (en) Method for automatic facial impression transformation, recording medium and device for performing the method
US11113507B2 (en) System and method for fast object detection
WO2018102880A1 (en) Systems and methods for replacing faces in videos
US11367196B2 (en) Image processing method, apparatus, and storage medium
CN113496208B (en) Video scene classification method and device, storage medium and terminal
KR102572986B1 (en) Object Tracking Based on Custom Initialization Points
US10432853B2 (en) Image processing for automatic detection of focus area
US10089721B2 (en) Image processing system and method for object boundary smoothening for image segmentation
EP3234865A1 (en) Techniques for providing user image capture feedback for improved machine language translation
US20180053294A1 (en) Video processing system and method for deformation insensitive tracking of objects in a sequence of image frames
CN108229281B (en) Neural network generation method, face detection device and electronic equipment
CN112219218A (en) Method and electronic device for recommending image capture mode
WO2022206679A1 (en) Image processing method and apparatus, computer device and storage medium
KR102372711B1 (en) Image photographing apparatus and control method thereof
CN111476063B (en) Target tracking method, device, storage medium and electronic equipment

Legal Events

Date Code Title Description
121 Ep: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 17930638

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: PCT application non-entry in European phase

Ref document number: 17930638

Country of ref document: EP

Kind code of ref document: A1