CN114615443A - Image processing method and device

Image processing method and device

Info

Publication number
CN114615443A
Authority
CN
China
Prior art keywords
image
sub
transparency channel
information
target
Prior art date
Legal status
Pending
Application number
CN202210253477.5A
Other languages
Chinese (zh)
Inventor
Liang Yu (梁宇)
Current Assignee
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd filed Critical Vivo Mobile Communication Co Ltd
Priority to CN202210253477.5A
Publication of CN114615443A
Legal status: Pending


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N 5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N 5/265 Mixing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformation in the plane of the image
    • G06T 3/40 Scaling the whole image or part thereof
    • G06T 3/4038 Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20212 Image combination
    • G06T 2207/20221 Image fusion; Image merging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20212 Image combination
    • G06T 2207/20224 Image subtraction

Abstract

The application discloses an image processing method and an image processing device, belonging to the field of electronic technology. The method includes: acquiring a first image based on a target scene; acquiring a second image based on the target scene and a first object entering the target scene; determining, according to difference information between the first image and the second image, transparency channel information of the first object and a sub-image corresponding to the first object in the second image; and outputting a third image according to the transparency channel information, the sub-image, and a target background image.

Description

Image processing method and device
Technical Field
The present application belongs to the field of electronic technology, and in particular, relates to an image processing method and apparatus.
Background
In some image processing scenarios, it is necessary to matte out a subject (for example, a person) from one image and then composite it with another image to synthesize a new image.
In the prior art, image synthesis proceeds as follows: in software, the user manually draws the outline of a target region in an image, sets a transparency channel for that region, and mattes out the sub-image of the region according to the set transparency channel, repeating until the complete subject has been extracted; the extracted image is then composited with a new background image. Generally, different transparency channels need to be set for different regions of the subject so that the result looks natural. For example, for a subject such as a person, different transparency channel values need to be set from the root of the hair to its tip, to achieve the effect of the hair fading gradually from root to tip. For another example, for video compositing, every frame must be processed, so the region outline must be drawn and the transparency channel set many times over.
It can be seen that the image synthesis method in the prior art makes user operation complicated.
Disclosure of Invention
An object of the embodiments of the present application is to provide an image processing method that can solve the problem of complicated user operations caused by the image synthesis method in the prior art.
In a first aspect, an embodiment of the present application provides an image processing method, including: acquiring a first image based on a target scene; acquiring a second image based on the target scene and a first object entering the target scene; determining transparency channel information of the first object and a sub-image corresponding to the first object in the second image according to difference information between the first image and the second image; and outputting a third image according to the transparency channel information, the sub-image and the target background image.
In a second aspect, an embodiment of the present application provides an image processing apparatus, including: the first acquisition module is used for acquiring a first image based on a target scene; the second acquisition module is used for acquiring a second image based on the target scene and the first object entering the target scene; a determining module, configured to determine transparency channel information of the first object and a sub-image corresponding to the first object in the second image according to difference information between the first image and the second image; and the output module is used for outputting a third image according to the transparency channel information, the sub-image and the target background image.
In a third aspect, embodiments of the present application provide an electronic device, which includes a processor and a memory, where the memory stores a program or instructions executable on the processor, and the program or instructions, when executed by the processor, implement the steps of the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium, on which a program or instructions are stored, which when executed by a processor implement the steps of the method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the method according to the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program product, stored on a storage medium, for execution by at least one processor to implement the method according to the first aspect.
In this way, in the embodiment of the present application, in an image processing scenario, a first image and a second image are first captured based on the user's operations; the difference between them is that the second image additionally contains the first object to be extracted. The two images are then input as a group into a pre-trained model, which, based on the difference between the two images, outputs the transparency channel information of the first object and the sub-image corresponding to the first object. Finally, based on the transparency channel information and the sub-image, a third image is synthesized in combination with the target background image. Therefore, in this embodiment, on the one hand, the transparency channel information does not need to be set manually by the user, which simplifies user operations; on the other hand, the output transparency channel information is highly accurate, which helps improve the image synthesis effect.
Drawings
FIG. 1 is a flowchart of an image processing method according to an embodiment of the present application;
fig. 2 and 3 are schematic diagrams of an image processing method according to an embodiment of the present application;
FIG. 4 is a second flowchart of an image processing method according to an embodiment of the present application;
fig. 5 is a block diagram of an image processing apparatus according to an embodiment of the present application;
fig. 6 is a first schematic diagram of a hardware structure of the electronic device according to an embodiment of the present application;
fig. 7 is a second schematic diagram of a hardware structure of the electronic device according to the embodiment of the present application.
Detailed Description
The technical solutions of the embodiments of the present application will be clearly described below with reference to the drawings of the embodiments of the present application. It is obvious that the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments derived from the embodiments of the present application by a person of ordinary skill in the art shall fall within the protection scope of the present application.
The terms "first", "second", and the like in the description and claims of the present application are used to distinguish between similar objects and not necessarily to describe a particular order or sequence. It should be appreciated that objects so distinguished are interchangeable under appropriate circumstances, so that the embodiments of the application can be practiced in sequences other than those illustrated or described herein; moreover, "first", "second", and the like are generally used in a generic sense and do not limit the number of objects, e.g., a first object can be one or more than one. In addition, "and/or" in the specification and claims means at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the preceding and following objects.
The image processing method provided by the embodiment of the present application is described in detail below with reference to the accompanying drawings through specific embodiments and application scenarios thereof.
Fig. 1 shows a flowchart of an image processing method according to an embodiment of the present application, which is applied to an electronic device, and includes:
step 110: based on the target scene, a first image is acquired.
In the present embodiment, a matting method is first provided. For ease of distinction, the object to be extracted is defined as the foreground, and everything other than the object to be extracted is collectively defined as the background.
In this step, the first image is used to represent a background image.
Referring to fig. 2, for example, a camera function of the electronic device 1 is started, and an image (a first image) is acquired, wherein the image includes a certain spatial environment (such as a house in fig. 2), and the spatial environment corresponds to the target scene 2.
For reference, the simpler the target scene, the better the matting effect this embodiment can achieve.
Step 120: based on the target scene, and the first object entering the target scene, a second image is acquired.
In this step, the second image is used to represent an image of a combination of the background and the foreground.
Referring to fig. 3, for example, with the target scene 2 kept unchanged, after a person or object appears in the target scene 2, an image (the second image) containing the target scene 2 and that person or object is captured; the newly appearing person or object corresponds to the first object 3.
Optionally, the order in which the first image and the second image are acquired is not limited.
In this embodiment, the number of the first objects is not limited, and may be one or more. For example, a plurality of second images may be acquired based on the target scene, each second image including a first object, and the first object in each second image being different. For another example, multiple second images may be acquired based on the target scene, each second image including a first object, the first object in each second image being the same, but the position, posture, etc. of the first object in different second images being different. For another example, a second image may be acquired based on the target scene, the second image including a plurality of first objects.
In the acquisition stage, multiple second images can be captured for use in the subsequent matting stage, thereby improving acquisition efficiency.
Step 130: and determining transparency channel information of the first object and a sub-image corresponding to the first object in the second image according to difference information between the first image and the second image.
Referring to fig. 4, in the present embodiment, a first image 4 and the corresponding second image 5 are taken as a group of images and input into a specific pre-trained model; the model splices the first image 4 and the second image 5 together, feeds them into a network, and finally outputs a transparency channel map 6 and an additionally processed foreground map 7. The matting is then complete.
The transparency channel information in this step is the transparency channel map, and the sub-image corresponding to the first object is the foreground map. The transparency channel map contains the transparency channel sub-information at each position of the sub-image corresponding to the first object.
It should be noted that each item of transparency channel sub-information is a numerical value indicating, for the pixel at a certain position in the image, the mixing ratio between the foreground F and the background B; its value ranges from 0 to 1. The transparency channel information of the first object comprises a plurality of such transparency channel sub-information values.
In this embodiment, the pre-trained model computes the difference between the two images mainly through convolution; the specific model structure is not described here. Because the contrast information between the foreground and the background is fully utilized, the model can output a transparency channel map of high fineness and restore the sub-image of the first object to the maximum extent.
Referring to fig. 4, on the one hand, the transparency channel map 6 in the present embodiment is output automatically by the pre-trained model, so the obtained transparency channel map 6 is not only accurate but also requires no manual setting by the user; on the other hand, the sub-image corresponding to the first object can be obtained based on the difference between the two images, and this sub-image is the original appearance of the first object, which ensures a relatively real and reasonable image synthesis effect in subsequent compositing.
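As an illustration of this step, the following is a minimal sketch (PyTorch) of feeding a group of images into such a model. The network `matting_net`, its six-channel input convention, and its two outputs are assumptions made for illustration; the patent does not specify the model structure.

```python
import torch

def run_matting(matting_net, first_image: torch.Tensor,
                second_image: torch.Tensor):
    # first_image, second_image: (3, H, W) float tensors in [0, 1].
    # Splice the pair along the channel axis, as the description above
    # says the model does, giving a (6, H, W) input.
    pair = torch.cat([first_image, second_image], dim=0)
    with torch.no_grad():
        # Assumed outputs: an alpha map (1, 1, H, W) in [0, 1] and a
        # restored foreground (1, 3, H, W).
        alpha, foreground = matting_net(pair.unsqueeze(0))
    return alpha.squeeze(0), foreground.squeeze(0)
```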
Step 140: and outputting a third image according to the transparency channel information, the sub-image and the target background image.
In this embodiment, the obtained transparency channel map can then be used for image synthesis.
Based on the transparency channel information and the sub-image obtained by the model, a new image (third image) can be synthesized in combination with the new background image (i.e. the target background image).
Thus, in the embodiment of the present application, in an image processing scenario, a first image and a second image are first captured based on the user's operations; the difference between them is that the second image additionally contains the first object to be extracted. The two images are then input as a group into a pre-trained model, which, based on the difference between the two images, outputs the transparency channel information of the first object and the sub-image corresponding to the first object. Finally, based on the transparency channel information and the sub-image, a third image is synthesized in combination with the target background image. Therefore, in this embodiment, on the one hand, the transparency channel information does not need to be set manually by the user, which simplifies user operations; on the other hand, the output transparency channel information is highly accurate, which helps improve the image synthesis effect.
In the flow of the image processing method according to another embodiment of the present application, step 140 includes:
substep A1: and determining a third pixel point on the target position in the third image according to the first pixel point on the target position in the subimage and the corresponding first transparency channel subinformation thereof, and the second pixel point on the target position in the target background image and the corresponding second transparency channel subinformation thereof.
And the first transparency channel sub-information and the second transparency channel sub-information have an association relationship.
In this embodiment, the third image is output according to formula one: I = alpha*F + (1 - alpha)*B.
In formula one, I is used to represent the third image, alpha is used to represent transparency channel information of the first object, F is used to represent the sub-image corresponding to the first object, and B is used to represent the target background image.
More specifically, in formula one, I is used to represent a pixel point (e.g., a third pixel point) at a certain position of the third image, alpha is used to represent transparency channel sub-information of a pixel point (e.g., a first pixel point) at a corresponding position of the first object, F is used to represent a pixel point (e.g., a first pixel point) at a corresponding position of the first object, and B is used to represent a pixel point (e.g., a second pixel point) at a corresponding position of the target background image.
For the pixel point at the same position, the sum of the first transparency channel sub-information corresponding to the pixel point in the sub-image of the first object and the second transparency channel sub-information corresponding to the pixel point in the target background image is "1".
In the third image, when alpha is 0, the pixel at the corresponding position fully retains the background and none of the foreground; when alpha is 1, the pixel fully retains the foreground and none of the background.
Finally, when every pixel of the sub-image of the first object has been composited with the corresponding pixel of the target background image, the synthesis of the third image is complete.
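The following is a minimal sketch of formula one applied over a whole image, assuming NumPy arrays; the function name and the broadcasting convention are illustrative, not part of the patent.

```python
import numpy as np

def composite(foreground: np.ndarray, background: np.ndarray,
              alpha: np.ndarray) -> np.ndarray:
    # foreground, background: (H, W, 3) float arrays in [0, 1];
    # alpha: (H, W) map in [0, 1], one value per pixel position.
    a = alpha[..., None]  # (H, W, 1), broadcasts over the RGB channels
    # Formula one, I = alpha*F + (1 - alpha)*B, evaluated per pixel:
    # alpha == 0 keeps only the background, alpha == 1 only the
    # foreground, and fractional values mix the two (e.g., along hair edges).
    return a * foreground + (1.0 - a) * background
```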
An application scenario of this embodiment is, for example, using the output transparency channel map and sub-image for background-replacement video creation.
In this embodiment, a synthesis method is provided, and based on the transparency channel map and the corresponding sub-image obtained in the present application, the effect of the synthesized image can be more real and reasonable.
In further embodiments of the present application, the output transparency channel map and the second image obtained in the foregoing steps can be used as high-precision matting training samples for a network, avoiding the consumption of manpower and time.
In the flow of the image processing method according to another embodiment of the present application, step 130 includes:
substep B1: according to the difference information between the first image and the second image, a sub-image of the first object which is not rendered by the target scene is determined in the second image.
In this embodiment, based on the difference between the two images, the model can also estimate the original foreground before output mixing.
Generally, when the first object enters the target scene, the first object is necessarily affected by the target scene in the imaging, for example, the target scene is a green curtain, and the edge of the first object appears green.
Therefore, in the second image, the first object seen by naked eyes is not the original first object but the first object rendered by the target scene, and the model in this embodiment can restore the original first object to the maximum extent, that is, the extra-processed foreground is obtained.
In the formula one, the composition of an image, especially the edge part, is determined by mixing the foreground and the background together, and the foreground obtained based on the embodiment is obtained by restoring, so that the effect of image composition can be improved.
In the prior art, the first object obtained by matting is not the original first object but the first object rendered by the background, so that the reality and the rationality of the image are poor after the image is synthesized under the condition that the obtained first object is not accurate.
In an image processing method according to another embodiment of the present application, the acquired data corresponding to the first image is the same as the acquired data corresponding to the second image.
The collected data comprises a collection angle, a collection position, a focus area and exposure parameters.
In the present application, in order to ensure accurate calculation of difference information between the first image and the second image, it is necessary to ensure that the first image and the second image are consistent in all parts except the first object.
Thus, in the present application, it is necessary to ensure that the relevant data at the time of acquiring the first image and the second image are identical.
This embodiment specifies the collection angle, the collection position, the focus area, and the exposure parameters; by locking these parameters during image acquisition, the first image and the second image are guaranteed to be extremely similar, which facilitates the subsequent computation.
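As a concrete illustration, the sketch below checks that the two captures share identical acquisition data before being used as a group. The class and field names are hypothetical; the patent only requires that these four kinds of parameters be locked.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CaptureParams:
    angle_deg: float      # collection angle
    position: tuple       # collection position, e.g. (x, y, z)
    focus_area: tuple     # focus area, e.g. (x, y, w, h)
    exposure: tuple       # exposure parameters, e.g. (iso, shutter_s)

def acquisition_locked(first: CaptureParams, second: CaptureParams) -> bool:
    # The two captures must match exactly, so that the only difference
    # between the first and second images is the first object itself.
    return first == second
```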
In the image processing method according to another embodiment of the present application, in the case of acquiring the first image and the second image, in order to lock the acquired data, an auxiliary instrument such as a tripod may be used to fix the electronic device.
In the flow of the image processing method according to another embodiment of the present application, before step 130, the method further includes:
step C1: and aligning the first image and the second image along the first direction according to the matched target characteristic points in the first image and the second image.
In this embodiment, in consideration of a hand shake or the like occurring during photographing, a deviation occurs between two images, and it is necessary to perform an alignment process on the two images between a set of images input to the model.
Illustratively, one alignment is: and calculating the characteristic points in the two images by using a special operator, matching in the horizontal direction and the vertical direction, cutting the maximum overlapped area of the characteristic points after successful matching, correcting to a certain degree, and finally obtaining two approximately aligned images.
The first direction comprises a vertical direction and a horizontal mode, and the overlapped characteristic points are matched characteristic points.
Illustratively, another alignment method is as follows: a multi-convolution accumulation model is constructed by utilizing a convolution neural network, the matching and alignment of the characteristic points of the two images are realized by utilizing the complex parameter calculation quantity of the model, and then the two approximately aligned images are obtained by certain degree of correction.
Due to the generalization capability of the model, the two images only need to be approximately aligned, and the expected result can be output.
In this embodiment, when no auxiliary instrument (such as a tripod) is used to fix the electronic device during acquisition, the backgrounds within the same group of images may be misaligned, so an additional image alignment method is needed to complete the matching and alignment of the images; moreover, the alignment method of this embodiment reduces the difficulty of image acquisition and further simplifies user operations. The image alignment in this embodiment may be implemented by an alignment module, which may itself be a pre-trained model.
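The first alignment method can be sketched as follows, assuming OpenCV's ORB detector as the "dedicated operator" and a pure-translation correction; the patent names neither, so both are illustrative choices.

```python
import cv2
import numpy as np

def align_pair(first_gray: np.ndarray, second_gray: np.ndarray):
    # Detect and describe feature points in both grayscale images.
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(first_gray, None)
    kp2, des2 = orb.detectAndCompute(second_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    # Median displacement of the matched points, split into the
    # horizontal (dx) and vertical (dy) directions.
    shifts = np.array([np.subtract(kp2[m.trainIdx].pt, kp1[m.queryIdx].pt)
                       for m in matches])
    dx, dy = np.round(np.median(shifts, axis=0)).astype(int)

    # Crop both images to their maximum overlapping region.
    h, w = first_gray.shape
    x0, x1 = max(0, -dx), min(w, w - dx)
    y0, y1 = max(0, -dy), min(h, h - dy)
    first_crop = first_gray[y0:y1, x0:x1]
    second_crop = second_gray[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
    return first_crop, second_crop  # two approximately aligned images
```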
In summary, the present application aims to provide a matting and image synthesis method based on a fixed background, which solves the prior-art problems of complicated and inaccurate matting operations while ensuring the authenticity and effectiveness of image synthesis. First, the method places no excessive requirements on image acquisition: only a group of images captured with a fixed, specified background is needed. Second, manual intervention is required only when pre-training the model in the early stage; afterwards, the model directly and automatically outputs the transparency channel map and the additionally processed foreground. Third, because the application makes use of the contrast information between the two images, the output precision is extremely high, and the results can be used directly for image synthesis and other applications in specific scenarios.
In further scenarios, the idea of this application can be applied to other types of image annotation, particularly annotation that involves background contrast, where the two techniques of image alignment and image differencing can be fully exploited to improve the precision and efficiency of the annotation.
In further scenarios, the present application can also be applied to model training to improve the accuracy of the model's output.
For example, in the prior art, training a matting model requires data sets published by academia or shared online; such data is scarce, and most of it must be collected in front of a single-color screen (such as a green screen) to facilitate subsequent manual annotation. In addition, differences between annotators lead to differences in annotation quality, which indirectly limits the generalization ability of the deep model and causes problems such as color spill in edge regions (for example, the green of the screen remaining visible along the hair edges of an extracted portrait), false matting, and missed matting, lowering the output precision of the matting model. By contrast, using the second image obtained in this application together with its corresponding transparency channel map as training samples for the matting model avoids manual annotation, effectively solving the time and labor cost and the annotation inconsistency brought by manual labeling, thereby avoiding phenomena such as edge color spill, false matting, and missed matting, and achieving accurate matting.
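The following sketch shows how captured pairs could be turned into matting training samples in this way. The helper `predict_alpha` stands in for the pre-trained model of fig. 4 and, like the file layout and the OpenCV I/O, is an illustrative assumption.

```python
import cv2
import numpy as np

def build_training_samples(pairs, matting_net, out_dir: str):
    # pairs: iterable of (first_image_path, second_image_path), where the
    # first image is the background-only frame and the second adds the object.
    for i, (first_path, second_path) in enumerate(pairs):
        first = cv2.imread(first_path)
        second = cv2.imread(second_path)
        # Hypothetical helper: returns an (H, W) alpha map in [0, 1].
        alpha = predict_alpha(matting_net, first, second)
        # The (second image, alpha map) pair replaces a manually
        # annotated ground-truth matte as one training sample.
        cv2.imwrite(f"{out_dir}/image_{i:05d}.png", second)
        cv2.imwrite(f"{out_dir}/alpha_{i:05d}.png",
                    np.clip(alpha * 255.0, 0, 255).astype(np.uint8))
```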
In the image processing method provided by the embodiments of the present application, the execution subject may be an image processing apparatus. The image processing apparatus provided by the embodiments of the present application is described here taking, as an example, an image processing apparatus executing the image processing method.
Fig. 5 shows a block diagram of an image processing apparatus of another embodiment of the present application, the apparatus including:
a first acquisition module 10, configured to acquire a first image based on a target scene;
a second acquisition module 20, configured to acquire a second image based on the target scene and the first object entering the target scene;
a determining module 30, configured to determine transparency channel information of the first object and a sub-image corresponding to the first object in the second image according to difference information between the first image and the second image;
an output module 40, configured to output a third image according to the transparency channel information, the sub-image, and the target background image.
In this way, in the embodiment of the present application, in an image processing scenario, a first image and a second image are first captured based on the user's operations; the difference between them is that the second image additionally contains the first object to be extracted. The two images are then input as a group into a pre-trained model, which, based on the difference between the two images, outputs the transparency channel information of the first object and the sub-image corresponding to the first object. Finally, based on the transparency channel information and the sub-image, a third image is synthesized in combination with the target background image. Therefore, in this embodiment, on the one hand, the transparency channel information does not need to be set manually by the user, which simplifies user operations; on the other hand, the output transparency channel information is highly accurate, which helps improve the image synthesis effect.
Optionally, the output module 40 includes:
a first determining unit, configured to determine a third pixel point at a target position in the third image according to a first pixel point at the target position in the sub-image and its corresponding first transparency channel sub-information, and a second pixel point at the target position in the target background image and its corresponding second transparency channel sub-information;
wherein the first transparency channel sub-information and the second transparency channel sub-information are related to each other.
Optionally, the determining module 30 includes:
a second determining unit, configured to determine, in the second image, a sub-image of the first object that is not rendered by the target scene according to difference information between the first image and the second image.
Optionally, the collected data corresponding to the first image is the same as the collected data corresponding to the second image;
the collected data comprises a collection angle, a collection position, a focus area and exposure parameters.
Optionally, the apparatus further comprises:
and the alignment module is used for aligning the first image and the second image along the first direction according to the matched target characteristic points in the first image and the second image.
The image processing apparatus in the embodiment of the present application may be an electronic device, or may be a component in an electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal, or may be a device other than a terminal. The electronic Device may be, for example, a Mobile phone, a tablet computer, a notebook computer, a palm top computer, a vehicle-mounted electronic Device, a Mobile Internet Device (MID), an Augmented Reality (AR)/Virtual Reality (VR) Device, a robot, a wearable Device, an ultra-Mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), and the like, and may also be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine, a self-service machine, and the like, and the embodiments of the present application are not particularly limited.
The image processing apparatus according to the embodiments of the present application may be an apparatus having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
The image processing apparatus provided in the embodiment of the present application can implement each process implemented by the foregoing method embodiment, and is not described here again to avoid repetition.
Optionally, as shown in fig. 6, an electronic device 100 is further provided in this embodiment of the present application, and includes a processor 101, a memory 102, and a program or an instruction stored in the memory 102 and executable on the processor 101, where the program or the instruction is executed by the processor 101 to implement each step of any one of the above embodiments of the image processing method, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.
It should be noted that the electronic device according to the embodiment of the present application includes the mobile electronic device and the non-mobile electronic device described above.
Fig. 7 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 1000 includes, but is not limited to: a radio frequency unit 1001, a network module 1002, an audio output unit 1003, an input unit 1004, a sensor 1005, a display unit 1006, a user input unit 1007, an interface unit 1008, a memory 1009, and a processor 1010.
Those skilled in the art will appreciate that the electronic device 1000 may further comprise a power source (e.g., a battery) for supplying power to various components, and the power source may be logically connected to the processor 1010 through a power management system, so as to implement functions of managing charging, discharging, and power consumption through the power management system. The electronic device structure shown in fig. 7 does not constitute a limitation of the electronic device, and the electronic device may include more or less components than those shown, or combine some components, or arrange different components, and thus, the description is omitted here.
Wherein, the processor 1010 is configured to acquire a first image based on the target scene; acquiring a second image based on the target scene and a first object entering the target scene; determining transparency channel information of the first object and a sub-image corresponding to the first object in the second image according to difference information between the first image and the second image; and outputting a third image according to the transparency channel information, the sub-image and the target background image.
In this way, in the embodiment of the present application, in an image processing scenario, a first image and a second image are first captured based on the user's operations; the difference between them is that the second image additionally contains the first object to be extracted. The two images are then input as a group into a pre-trained model, which, based on the difference between the two images, outputs the transparency channel information of the first object and the sub-image corresponding to the first object. Finally, based on the transparency channel information and the sub-image, a third image is synthesized in combination with the target background image. Therefore, in this embodiment, on the one hand, the transparency channel information does not need to be set manually by the user, which simplifies user operations; on the other hand, the output transparency channel information is highly accurate, which helps improve the image synthesis effect.
Optionally, the processor 1010 is further configured to determine a third pixel point in the target position in the third image according to a first pixel point in the target position in the sub-image and first transparency channel sub-information corresponding to the first pixel point, and a second pixel point in the target position in the target background image and second transparency channel sub-information corresponding to the second pixel point; wherein the first transparency channel sub-information and the second transparency channel sub-information have an association relationship therebetween.
Optionally, the processor 1010 is further configured to determine, according to difference information between the first image and the second image, a sub-image of the first object that is not rendered by the target scene in the second image.
Optionally, the acquired data corresponding to the first image is the same as the acquired data corresponding to the second image; wherein the collected data comprises a collection angle, a collection position, a focal region and exposure parameters.
Optionally, the processor 1010 is further configured to align the first image and the second image along a first direction according to the matched target feature point in the first image and the second image.
In summary, the present application aims to provide a matting and image synthesis method based on a fixed background, which solves the prior-art problems of complicated and inaccurate matting operations while ensuring the authenticity and effectiveness of image synthesis. First, the method places no excessive requirements on image acquisition: only a group of images captured with a fixed, specified background is needed. Second, manual intervention is required only when pre-training the model in the early stage; afterwards, the model directly and automatically outputs the transparency channel map and the additionally processed foreground. Third, because the application makes use of the contrast information between the two images, the output precision is extremely high, and the results can be used directly for image synthesis and other applications in specific scenarios.
It should be understood that, in the embodiment of the present application, the input unit 1004 may include a Graphics Processing Unit (GPU) 10041 and a microphone 10042, where the Graphics Processing Unit 10041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in video capture mode or image capture mode. The display unit 1006 may include a display panel 10061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 1007 includes at least one of a touch panel 10071 and other input devices 10072. The touch panel 10071 is also referred to as a touch screen and may include two parts: a touch detection device and a touch controller. Other input devices 10072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick, which are not described in detail here. The memory 1009 may be used to store software programs and various data, including but not limited to applications and an operating system. The processor 1010 may integrate an application processor, which primarily handles the operating system, user interfaces, applications, and the like, and a modem processor, which primarily handles wireless communication. It will be appreciated that the modem processor may alternatively not be integrated into the processor 1010.
The memory 1009 may be used to store software programs as well as various data. The memory 1009 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, where the first storage area may store an operating system and an application program or instructions required for at least one function (such as a sound playing function or an image playing function), and the like. Further, the memory 1009 may include volatile memory or nonvolatile memory, or the memory 1009 may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be a Random Access Memory (RAM), a Static RAM (SRAM), a Dynamic RAM (DRAM), a Synchronous DRAM (SDRAM), a Double Data Rate SDRAM (DDR SDRAM), an Enhanced SDRAM (ESDRAM), a SynchLink DRAM (SLDRAM), or a Direct Rambus RAM (DRRAM). The memory 1009 in the embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.
Processor 1010 may include one or more processing units; optionally, the processor 1010 integrates an application processor, which primarily handles operations related to the operating system, user interface, and applications, and a modem processor, which primarily handles wireless communication signals, such as a baseband processor. It will be appreciated that the modem processor described above may not be integrated into processor 1010.
The embodiment of the present application further provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or the instruction is executed by a processor, the program or the instruction implements each process of the embodiment of the image processing method, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.
The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer readable storage medium, such as a computer read only memory ROM, a random access memory RAM, a magnetic or optical disk, and the like.
The embodiment of the present application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to execute a program or an instruction to implement each process of the embodiment of the image processing method, and can achieve the same technical effect, and the details are not repeated here to avoid repetition.
It should be understood that the chips mentioned in the embodiments of the present application may also be referred to as system-on-chip, system-on-chip or system-on-chip, etc.
Embodiments of the present application provide a computer program product, where the program product is stored in a storage medium, and the program product is executed by at least one processor to implement the processes of the foregoing embodiments of the image processing method, and achieve the same technical effects, and in order to avoid repetition, details are not repeated here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Further, it should be noted that the scope of the methods and apparatus of the embodiments of the present application is not limited to performing the functions in the order illustrated or discussed, but may include performing the functions in a substantially simultaneous manner or in a reverse order based on the functions involved; e.g., the methods described may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a computer software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present application.
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. An image processing method, characterized in that the method comprises:
acquiring a first image based on a target scene;
acquiring a second image based on the target scene and a first object entering the target scene;
determining transparency channel information of the first object and a sub-image corresponding to the first object in the second image according to difference information between the first image and the second image;
and outputting a third image according to the transparency channel information, the sub-image and the target background image.
2. The method of claim 1, wherein outputting a third image based on the transparency channel information and the sub-image, and a target background image comprises:
determining a third pixel point at the target position in the third image according to a first pixel point at the target position in the sub-image and its corresponding first transparency channel sub-information, and a second pixel point at the target position in the target background image and its corresponding second transparency channel sub-information;
wherein the first transparency channel sub-information and the second transparency channel sub-information have an association relationship therebetween.
3. The method according to claim 1, wherein the determining transparency channel information of the first object and the sub-image corresponding to the first object in the second image according to the difference information between the first image and the second image comprises:
determining, in the second image, a sub-image of the first object that is not rendered by the target scene according to difference information between the first image and the second image.
4. The method of claim 1, wherein the first image corresponds to the same acquisition data as the second image;
wherein the collected data comprises a collection angle, a collection position, a focal region and exposure parameters.
5. The method according to claim 1, wherein before determining transparency channel information of the first object and the sub-image corresponding to the first object in the second image according to difference information between the first image and the second image, the method further comprises:
and aligning the first image and the second image along a first direction according to the matched target feature points in the first image and the second image.
6. An image processing apparatus, characterized in that the apparatus comprises:
the first acquisition module is used for acquiring a first image based on a target scene;
the second acquisition module is used for acquiring a second image based on the target scene and the first object entering the target scene;
a determining module, configured to determine transparency channel information of the first object and a sub-image corresponding to the first object in the second image according to difference information between the first image and the second image;
and the output module is used for outputting a third image according to the transparency channel information, the sub-image and the target background image.
7. The apparatus of claim 6, wherein the output module comprises:
a first determining unit, configured to determine a third pixel point in the target position in the third image according to a first pixel point in the target position in the sub-image and first transparency channel sub-information corresponding to the first pixel point, and a second pixel point in the target position in the target background image and second transparency channel sub-information corresponding to the second pixel point;
wherein the first transparency channel sub-information and the second transparency channel sub-information have an association relationship therebetween.
8. The apparatus of claim 6, wherein the determining module comprises:
a second determining unit, configured to determine, in the second image, a sub-image of the first object that is not rendered by the target scene according to difference information between the first image and the second image.
9. The apparatus of claim 6, wherein the first image corresponds to the same acquisition data as the second image;
wherein the collected data comprises a collection angle, a collection position, a focal region and exposure parameters.
10. The apparatus of claim 6, further comprising:
and the alignment module is used for aligning the first image and the second image along a first direction according to the matched target feature points in the first image and the second image.
CN202210253477.5A 2022-03-15 2022-03-15 Image processing method and device Pending CN114615443A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210253477.5A CN114615443A (en) 2022-03-15 2022-03-15 Image processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210253477.5A CN114615443A (en) 2022-03-15 2022-03-15 Image processing method and device

Publications (1)

Publication Number Publication Date
CN114615443A (en) 2022-06-10

Family

ID=81862416

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210253477.5A Pending CN114615443A (en) 2022-03-15 2022-03-15 Image processing method and device

Country Status (1)

Country Link
CN (1) CN114615443A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111724407A (en) * 2020-05-25 2020-09-29 北京市商汤科技开发有限公司 Image processing method and related product
CN113724282A (en) * 2020-05-25 2021-11-30 深圳市商汤科技有限公司 Image processing method and related product
CN112767312A (en) * 2020-12-31 2021-05-07 湖南快乐阳光互动娱乐传媒有限公司 Image processing method and device, storage medium and processor
CN113920032A (en) * 2021-10-29 2022-01-11 上海商汤智能科技有限公司 Image processing method, image processing device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN109040596B (en) Method for adjusting camera, mobile terminal and storage medium
CN107231524A (en) Image pickup method and device, computer installation and computer-readable recording medium
CN103402058B (en) A kind of processing method and processing device shooting image
CN113794829B (en) Shooting method and device and electronic equipment
CN113329172B (en) Shooting method and device and electronic equipment
CN114390201A (en) Focusing method and device thereof
US20160292842A1 (en) Method and Apparatus for Enhanced Digital Imaging
CN112017137A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN114598819A (en) Video recording method and device and electronic equipment
CN114125179A (en) Shooting method and device
CN111654624B (en) Shooting prompting method and device and electronic equipment
CN115908120B (en) Image processing method and electronic device
CN112437232A (en) Shooting method, shooting device, electronic equipment and readable storage medium
CN114785957A (en) Shooting method and device thereof
CN114025100B (en) Shooting method, shooting device, electronic equipment and readable storage medium
CN114615443A (en) Image processing method and device
CN111953907B (en) Composition method and device
CN114390206A (en) Shooting method and device and electronic equipment
CN111654623B (en) Photographing method and device and electronic equipment
CN113989387A (en) Camera shooting parameter adjusting method and device and electronic equipment
CN112887619A (en) Shooting method and device and electronic equipment
CN114119701A (en) Image processing method and device
CN109191396B (en) Portrait processing method and device, electronic equipment and computer readable storage medium
CN112488973A (en) Intelligent image synthesis method and device, computer equipment and storage medium
CN114339029B (en) Shooting method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination