CN112396610A - Image processing method, computer equipment and storage medium
- Publication number: CN112396610A (application number CN201910741275.3A)
- Authority: CN (China)
- Prior art keywords: image, pixel, content element, determining, foreground
- Legal status: Pending (assumed; not a legal conclusion)
Classifications
(All under G—PHYSICS; G06—COMPUTING, CALCULATING OR COUNTING; G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL)
- G06T7/11—Region-based segmentation (G06T7/00 Image analysis; G06T7/10 Segmentation; Edge detection)
- G06T3/04—Context-preserving transformations, e.g. by using an importance map (G06T3/00 Geometric image transformations in the plane of the image)
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
- G06T2207/10004—Still image; Photographic image (G06T2207/10 Image acquisition modality)
- G06T2207/20081—Training; Learning (G06T2207/20 Special algorithmic details)
- G06T2207/20084—Artificial neural networks [ANN] (G06T2207/20 Special algorithmic details)
Abstract
The application discloses an image processing method, a computer device and a storage medium. The method comprises: acquiring an image that comprises a foreground part and a background part; acquiring feature data corresponding to the foreground part and the background part of the image respectively; inputting the feature data corresponding to the foreground part and the background part into a mixer of a neural network model, the mixer being used to predict the pixel transparency of the image by comparing the foreground part with the background part; and determining, by the mixer, the pixel transparency of the image. Embodiments of the application can be applied to the field of computer matting: they provide a scheme that automatically computes the pixel transparency of an image and performs matting according to that transparency, overcoming the large labor overhead and low efficiency of manual matting and meeting the demand for batch matting.
Description
Technical Field
The present application relates to the field of data processing technologies, and in particular, to an image processing method, a computer device, and a computer-readable storage medium.
Background
In graphic design and image editing, there is a frequent need for image segmentation, i.e. matting, which means separating a certain part of a picture or image from the original into a separate layer. For example, a foreground subject such as a model or a product can be matted out of an image and placed onto another background image, thereby synthesizing a new image.
Matting generally demands very high accuracy; high-precision matting must be completed down to the pixel level, with a professional estimating transparency pixel by pixel and then performing the matting according to that pixel transparency.
High-precision matting therefore carries a very large labor overhead: one professional typically needs nearly five hours to matte a single large image. Such long processing times and low matting efficiency make the approach inapplicable to scenarios that require matting in large batches.
Disclosure of Invention
In view of the above, the present application is made to provide an image processing method, a computer apparatus, and a computer-readable storage medium that overcome or at least partially solve the above problems.
According to an aspect of the present application, there is provided an image processing method including:
acquiring an image, wherein the image comprises a foreground part and a background part;
acquiring feature data corresponding to a foreground part and a background part of the image respectively;
inputting feature data corresponding to a foreground part and a background part of the image into a mixer of a neural network model, wherein the mixer is used for predicting pixel transparency of the image by comparing the foreground part and the background part;
determining, by the mixer, a pixel transparency of the image.
Optionally, the determining, by the mixer, the pixel transparency of the image includes:
obtaining a pixel transparency relative value of the foreground part and the background part according to the feature data respectively corresponding to the foreground part and the background part;
and determining the pixel transparency of the image according to the pixel transparency relative value and the pixel transparencies respectively corresponding to the foreground part and the background part.
Optionally, the method further includes:
and performing image segmentation processing based on the pixel transparency of the image.
Optionally, the neural network model further includes at least two decoders, and the method further includes:
and respectively inputting the feature data of the image into the decoders to obtain the feature data corresponding to the foreground part and the background part of the image and the pixel transparencies corresponding to the foreground part and the background part.
Optionally, the neural network model further includes an encoder, and the method further includes:
and inputting the image into the encoder to obtain the feature data of the image.
According to another aspect of the present application, there is provided an image processing method including:
determining a first content element and a second content element of an image;
acquiring pixel attribute relative values of the first content element and the second content element;
determining the pixel attribute of the image according to the pixel attribute relative value and the pixel attributes corresponding to the first content element and the second content element respectively;
and processing the image based on the pixel attribute.
Optionally, the first content element includes an image foreground portion, the second content element includes an image background portion, and the determining the first content element and the second content element of the image includes:
identifying at least one target object in the image;
and determining the image foreground part and the image background part according to the corresponding relation between the target object and the image foreground part or the image background part.
Optionally, the determining the first content element and the second content element of the image includes:
and inputting the image into a first content decoder and a second content decoder of a neural network model to respectively obtain a first content element and a second content element of the image.
Optionally, the obtaining the relative value of the pixel attribute of the first content element and the second content element includes:
acquiring feature data corresponding to a first content element and a second content element of the image respectively;
determining a relative value of pixel attributes of a foreground portion and a background portion of the image from the acquired feature data.
Optionally, the determining a relative value of pixel attributes of a foreground portion and a background portion of the image according to the acquired feature data includes:
inputting the feature data respectively corresponding to the first content element and the second content element and the feature data of the image into a mixer of a neural network to obtain a pixel attribute relative value of a foreground part and a background part of the image.
Optionally, the obtaining of the feature data corresponding to the first content element and the second content element of the image respectively includes:
inputting the image into an encoder of a neural network model to obtain feature data of the image in multiple dimensions;
and inputting the feature data of the image into a first content decoder and a second content decoder of a neural network model to respectively obtain the feature data corresponding to the first content element and the second content element of the image.
Optionally, the pixel attribute of the image is in a value interval formed by the pixel attribute of the first content element and the pixel attribute of the second content element.
Optionally, the determining the pixel attribute of the image according to the pixel attribute relative value and the pixel attributes respectively corresponding to the first content element and the second content element includes:
determining attribute weights of pixel attributes corresponding to the first content element and the second content element respectively according to the pixel attribute relative value and a preset numerical value relationship;
and determining the pixel attribute of the image according to the pixel attribute corresponding to the first content element and the second content element respectively and the attribute weight corresponding to the pixel attribute.
Optionally, the pixel attribute includes a pixel transparency, and the image processing based on the pixel attribute includes:
and carrying out image segmentation processing according to the pixel transparency of the image.
Optionally, before the image processing based on the pixel attribute, the method further includes:
and determining a target object to be processed in the image.
Optionally, the determining a target object to be processed in the image includes:
and determining a target object to be processed in the image according to whether at least one target object in the image belongs to a preset category.
According to another aspect of the present application, there is provided an image processing method including:
determining a plurality of content elements of an image;
acquiring pixel attribute relative values among the content elements;
determining the pixel attribute of the image according to the pixel attribute relative values and the pixel attributes of the content elements;
and processing the image based on the pixel attribute.
According to another aspect of the present application, there is provided an image processing method comprising:
acquiring an input first image;
determining characteristic data corresponding to a foreground part and a background part of the first image respectively by adopting a first neural network and a second neural network;
determining the pixel attribute of the first image according to the characteristic data respectively corresponding to the foreground part and the background part of the first image;
extracting the first content element based on pixel attributes of the first image;
obtaining a second image according to the first content element and the updated second content element;
providing the second image.
Optionally, the method further includes:
determining feature data of an intermediate scene portion of the first image using a third neural network;
the determining the pixel attribute of the first image according to the feature data corresponding to the foreground part and the background part of the first image respectively comprises:
and determining the pixel attribute of the first image according to the characteristic data corresponding to the foreground part, the background part and the intermediate scenery part of the first image respectively.
According to another aspect of the application, there is provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to one or more of the above when executing the computer program.
According to another aspect of the application, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, implements the method according to one or more of the above.
According to the embodiments of the application, after a first content element and a second content element of an image are determined, the pixel attribute relative value of the two content elements is obtained; the pixel attribute of the image can then be derived from that relative value together with the pixel attributes corresponding to the first content element and the second content element, and further image processing can be performed according to the resulting pixel attribute. In a typical matting scenario, the first content element and the second content element correspond to the foreground and background of the image respectively, and the pixel attribute is pixel transparency.
In addition, because the pixel attribute of the image is determined from the relative value together with the pixel attributes of the first content element and the second content element, the image's pixel attribute lies within the range spanned by those two attributes, which ensures the accuracy of the matting.
The foregoing is only an overview of the technical solutions of the present application. In order that the technical means of the present application may be understood more clearly and implemented according to this description, and to make the above and other objects, features, and advantages more readily apparent, detailed embodiments of the present application are set forth below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the alternative embodiments. The drawings are only for purposes of illustrating alternative embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flow chart of an embodiment of an image processing method based on a neural network model according to a first embodiment of the present application;
FIG. 2 is a flow chart of an embodiment of an image processing method according to the second embodiment of the present application;
FIG. 3 is a flow chart of an embodiment of an image processing method according to the third embodiment of the present application;
FIG. 4 shows an architectural diagram of image processing in one example of the present application;
FIG. 5 is a block diagram illustrating an embodiment of an image processing apparatus based on a neural network model according to a fourth embodiment of the present disclosure;
FIG. 6 is a block diagram of an embodiment of an image processing apparatus according to the fifth embodiment of the present application;
FIG. 7 is a block diagram of an embodiment of an image processing apparatus according to the sixth embodiment of the present application;
FIG. 8 is a flowchart of an embodiment of an image processing method according to the seventh embodiment of the present application;
FIG. 9 shows a schematic diagram of image processing in another example according to the application;
fig. 10 illustrates an exemplary system that can be used to implement various embodiments described in this disclosure.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
To enable those skilled in the art to better understand the present application, the following description is made of the concepts related to the present application:
the image is divided into a plurality of content elements, which may include a first content element and a second content element, and the specific division manner may be determined according to actual service requirements. For example, a foreground portion of the image is taken as a first content element and a background portion is taken as a second content element. The foreground portion is the subject of the image presentation or bracketing, and is typically an object that is located close to the lens and in motion, and the background portion is typically the environment in which the foreground portion is located. The content elements can be divided according to the image areas, the image content of a certain partial area is taken as a first content element, and the image content of another partial area is taken as a second content element; the content elements may also be divided according to the classification of the image content, which includes text, graphics, video, templates, etc., e.g., a text portion in an image as a first content element and a video portion as a second content element. The specific contents of the first object and the second object may also be set according to actual business requirements, which is not limited in the present application.
The image and content elements each have corresponding pixel attributes, which may be, for example, one or more of pixel RGB values (color values of pixels in red, green, and blue channels), pixel transparency, pixel position coordinates, content classification corresponding to pixels, pixel definition, and the like. In one embodiment of the present application, the pixel attribute may include a pixel transparency.
The pixel attribute relative value represents a relative relationship between the content elements on the pixel attribute, and may be represented as a pixel attribute of one or more pixels, or may be represented as a ratio or a weighted average of the pixel attributes, or the like. Taking the example that the pixel attribute includes a pixel transparency, the pixel attribute relative value may be a pixel transparency of one or more of the content elements, or a pixel transparency relative value (e.g., a ratio) of two content elements.
The pixel attribute of the image is determined according to the pixel attributes of the content elements and the relative values of the pixel attributes, and the image attributes are further used as the basis for image processing.
The image processing related to the embodiment of the present application may include image segmentation, noise removal, enhancement or restoration, and the like. How to perform image processing according to the image attribute can be set according to actual service requirements, which is not limited in the present application.
The image processing method of the present application is described below by taking an application scenario in which image processing includes image segmentation as an example, where an image content element includes a foreground portion and a background portion, a pixel attribute corresponding to the image and the content element includes a pixel transparency, and a pixel relative parameter of the content element or the image includes a pixel transparency relative value.
Referring to fig. 1, a flowchart of an embodiment of an image processing method based on a neural network model according to the first embodiment of the present application is shown. This embodiment performs the image processing procedure with a neural network model that includes a mixer, and the method may specifically include the following steps:
Step 101, acquiring an image, wherein the image comprises a foreground part and a background part.
Step 102, acquiring feature data corresponding to the foreground part and the background part of the image respectively.
Step 103, inputting the feature data corresponding to the foreground part and the background part of the image into the mixer to obtain a pixel transparency relative value of the foreground part and the background part.
The content elements into which the image is divided comprise a foreground part and a background part; after the image is acquired, the feature data of the foreground part and of the background part are acquired.
In Computer Vision (CV), for a machine to recognize an image, the image must be abstractly represented in a form the machine can understand; that is, feature extraction is performed on the image, and the image is represented by feature data. The feature data may include features of the content elements in one or more dimensions and may specifically take the form of a vector, i.e. the image is vectorized. Examples include the number of content elements corresponding to a certain data type, the number of pixels, and position coordinates. The feature data are used to characterize the content elements; constructing high-dimensional feature data allows the content elements to be characterized more accurately.
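As a toy illustration only, the following sketch vectorizes an image region using the example dimensions just mentioned (pixel count, mean color, position coordinates); the chosen features and names are assumptions for illustration, not the patent's feature design.

```python
import numpy as np

def toy_features(region: np.ndarray) -> np.ndarray:
    """Toy feature vector for an image region: pixel count, mean RGB color,
    and bounding coordinates. The dimensions are arbitrary examples."""
    h, w, _ = region.shape
    return np.concatenate([
        [float(h * w)],                   # number of pixels
        region.mean(axis=(0, 1)),         # mean color over the region
        [0.0, 0.0, float(w), float(h)],   # position coordinates (x0, y0, x1, y1)
    ]).astype(np.float32)

vec = toy_features(np.random.rand(32, 48, 3))
print(vec.shape)  # (8,)
```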
The neural network model can adopt supervised learning: the mixer is obtained by the training method of the neural network model, using feature data of image samples annotated with the corresponding pixel transparency relative values. In the embodiments of the application, the mixer is used to predict the transparency of the image by comparing the foreground part with the background part.
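As an illustration only, the following minimal sketch shows what such supervised training of the mixer could look like. The L1 loss, tensor shapes, layer choices, and the stand-in data are all assumptions; the patent does not specify them.

```python
import torch
import torch.nn as nn

# Hypothetical mixer: maps concatenated foreground/background feature maps
# to a per-pixel transparency relative value in [0, 1].
feat = 64
mixer = nn.Sequential(
    nn.Conv2d(2 * feat, feat, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(feat, 1, kernel_size=1),
    nn.Sigmoid(),  # constrain the relative value to [0, 1]
)
optimizer = torch.optim.Adam(mixer.parameters(), lr=1e-4)
criterion = nn.L1Loss()  # assumed loss against annotated relative values

# Stand-in for an annotated dataset: (fg features, bg features, label).
loader = [
    (torch.randn(2, feat, 32, 32), torch.randn(2, feat, 32, 32),
     torch.rand(2, 1, 32, 32))
    for _ in range(4)
]

for fg_feat, bg_feat, target_relative in loader:
    pred = mixer(torch.cat([fg_feat, bg_feat], dim=1))
    loss = criterion(pred, target_relative)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```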
In an optional embodiment, when the transparency of the image is determined by the mixer, a pixel transparency relative value of the foreground portion and the background portion may be obtained according to the feature data corresponding to the foreground portion and the background portion, and further, the pixel transparency of the image is determined according to the pixel transparency relative value and the pixel transparencies corresponding to the foreground portion and the background portion.
Because the pixel transparency of the image is determined from the content elements' pixel transparencies together with their relative value, the image's pixel transparency range can be bounded by the pixel transparency of each content element: the image's pixel transparency lies within the range spanned by the transparencies of the foreground part and the background part, which ensures the accuracy of the matting.
The specific scheme for determining the pixel transparency of the image from the content elements' pixel transparencies and the transparency relative value can be set according to actual requirements. For example, the pixel transparency of the image may be a weighted combination of the pixel transparency of the foreground part and that of the background part, with the weights determined by the transparency relative value, which may be expressed as:
pixel transparency of image = (foreground pixel transparency) × (1 − transparency relative value) + (background pixel transparency) × (transparency relative value)
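As a minimal sketch of this weighting over per-pixel arrays (the function and variable names are hypothetical; the weighting is the one given above):

```python
import numpy as np

def blend_transparency(alpha_fg: np.ndarray,
                       alpha_bg: np.ndarray,
                       relative: np.ndarray) -> np.ndarray:
    """Pixel transparency of the image as a weighted combination of the
    foreground and background transparencies, weighted by the relative value."""
    return alpha_fg * (1.0 - relative) + alpha_bg * relative

# Toy 2x2 example: relative value 0 keeps the foreground transparency,
# relative value 1 keeps the background transparency.
alpha_fg = np.array([[1.0, 1.0], [0.8, 0.2]])
alpha_bg = np.array([[0.0, 0.0], [0.0, 0.0]])
relative = np.array([[0.0, 1.0], [0.5, 0.5]])
print(blend_transparency(alpha_fg, alpha_bg, relative))
# [[1.  0. ]
#  [0.4 0.1]]
```

Note that the result stays inside the interval spanned by the two inputs at every pixel, which is exactly the accuracy property discussed above.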
In an alternative embodiment, the image segmentation process may also be performed based on the pixel transparency of the image.
According to this embodiment of the application, the feature data corresponding to the foreground part and the background part of the image are input into the mixer to obtain the corresponding pixel transparency relative value; the pixel transparency of the image is determined according to that relative value and the pixel transparencies corresponding to the foreground part and the background part; and image segmentation is then performed based on the pixel transparency of the image.
In the embodiments of the application, the neural network model further comprises at least two decoders. Taking the feature data of the image as input, the trained decoders decode the image feature data to obtain, respectively, the feature data corresponding to the foreground part and the background part of the image and the pixel transparencies corresponding to the foreground part and the background part.
In the embodiments of the application, the neural network model further comprises an encoder, from which the feature data of the image are obtained: the image is input into the encoder, and the encoder recognizes image features along preset dimensions to obtain the feature data of the image.
Referring to fig. 2, a flowchart of an embodiment of an image processing method according to the second embodiment of the present application is shown, where the method specifically includes the following steps:
Step 201, determining a first content element and a second content element of an image.
Step 202, acquiring a pixel attribute relative value of the first content element and the second content element.
Step 203, determining the pixel attribute of the image according to the pixel attribute relative value and the pixel attributes corresponding to the first content element and the second content element respectively.
Step 204, performing image processing based on the pixel attribute.
According to this embodiment of the application, after a first content element and a second content element of an image are determined, the pixel attribute relative value of the two content elements is obtained; the pixel attribute of the image can then be derived from that relative value together with the pixel attributes corresponding to the first content element and the second content element, and further image processing can be performed according to the resulting pixel attribute. In a typical matting scenario, the first content element and the second content element correspond to the foreground and background of the image respectively, and the pixel attribute is pixel transparency.
In addition, because the pixel attribute of the image is determined from the relative value together with the pixel attributes of the first content element and the second content element, the image's pixel attribute lies within the range spanned by those two attributes, which ensures the accuracy of the matting.
In this embodiment of the application, optionally, the first content element includes an image foreground portion and the second content element includes an image background portion. When the first content element and the second content element of the image are determined, at least one target object in the image may be identified, and the image foreground portion and the image background portion are determined according to the correspondence between the target object and the image foreground portion or the image background portion. If the target object corresponds to the foreground portion, the region or layer where it is located may be identified as the image foreground portion; if the target object corresponds to the background portion, the region or layer where it is located may be identified as the image background portion. For example, if a pedestrian is recognized in the image, the image region where the pedestrian is located may be taken as the foreground portion, and if a frame pattern is recognized in the image, the image region where the frame pattern is located may be taken as the background portion.
In this embodiment of the application, optionally, a decoder included in the neural network model may be used to obtain content elements of the image, and when determining the first content element and the second content element of the image, the image may be input into the first content decoder and the second content decoder of the neural network model to obtain the first content element and the second content element of the image, respectively.
In the embodiment of the application, optionally, when the pixel relative parameters of the first content element and the second content element are obtained, feature data corresponding to the first content element and the second content element of the image respectively may be obtained; and determining the relative value of the pixel attributes of the foreground part and the background part of the image according to the acquired feature data.
In the embodiments of the present application, optionally, a mixer included in the neural network model may be used to obtain the pixel attribute relative value between the content elements of the image. When the pixel attribute relative value of the foreground part and the background part of the image is determined according to the acquired feature data, the feature data corresponding to the first content element and the second content element respectively, together with the feature data of the image, can be input into the mixer of the neural network to obtain the pixel attribute relative value of the foreground part and the background part of the image.
In the embodiment of the application, optionally, when the feature data corresponding to the first content element and the second content element of the image respectively is obtained, the image may be input to an encoder of a neural network model to obtain feature data corresponding to the image in multiple dimensions; and inputting the feature data of the image into a first content decoder and a second content decoder of the neural network model to respectively obtain the feature data corresponding to the first content element and the second content element of the image.
According to the scheme for determining the pixel attribute of the image in the embodiment of the application, the pixel attribute of the image is in a value interval formed by the pixel attribute of the first content element and the pixel attribute of the second content element.
In this embodiment of the present application, the pixel attribute may include a pixel transparency, and when performing image processing based on the pixel attribute, image segmentation processing may be performed according to the pixel transparency of an image.
In the embodiment of the application, optionally, attribute weights of pixel attributes corresponding to the first content element and the second content element are determined according to the relative values of the pixel attributes and the preset numerical value relationship, and the pixel attributes of the image are determined according to the pixel attributes corresponding to the first content element and the second content element and the attribute weights corresponding to the pixel attributes.
The preset numerical relationship satisfied by the pixel attribute relative value and the attribute weights can be set according to actual requirements, so that once the pixel attribute relative value is obtained, the attribute weights of the pixel transparencies corresponding to the first content element and the second content element are derived, and the pixel attribute of the image is then calculated according to those attribute weights.
For example, if the attribute weight of one content element's pixel transparency is the pixel attribute relative value, and the attribute weights of the two content elements' pixel transparencies sum to 1, the attribute weight of the other content element's pixel transparency follows directly. Taking content elements comprising a foreground part and a background part as an example, the pixel transparency of the image can be characterized as:
pixel transparency of image = (foreground pixel transparency) × (foreground weight) + (background pixel transparency) × (background weight)
and further, taking the relative value as the foreground weight:
pixel transparency of image = (foreground pixel transparency) × (transparency relative value) + (background pixel transparency) × (1 − transparency relative value)
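A worked instance of the second form, with illustrative numbers only: if at some pixel the foreground transparency is 1.0, the background transparency is 0.0, and the relative value is 0.7, then the image's pixel transparency is 1.0 × 0.7 + 0.0 × 0.3 = 0.7, which indeed lies in the interval formed by the two content elements' transparencies.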
In this embodiment of the application, optionally, before performing image processing based on the pixel attributes, the method further includes: and determining a target object to be processed in the image. Specifically, various image recognition techniques suitable in the art may be used to identify the target object.
In the embodiment of the present application, optionally, when the target object to be processed in the image is determined, the target object to be processed in the image may be determined according to whether at least one target object in the image belongs to a preset category. For example, it is determined whether the target object needs to be subjected to the matting processing according to whether the target object is an animal or not. The specific preset category can be set according to the actual application requirements.
Referring to fig. 3, a flowchart of an embodiment of an image processing method according to a third embodiment of the present application is shown, where the method specifically includes the following steps:
Step 301, determining a plurality of content elements of an image.
Step 302, acquiring pixel attribute relative values between the content elements.
Step 303, determining the pixel attribute of the image according to the pixel attribute relative values and the pixel attributes of the content elements.
Step 304, processing the image based on the pixel attribute.
In this embodiment, the content elements may include a plurality of content elements, for example, two or more content elements, and specific implementation details may refer to the description of the foregoing embodiments, which are not described herein again.
According to this embodiment of the application, after a plurality of content elements of an image are determined, the pixel attribute relative values between the content elements are obtained; the pixel attribute of the image is derived from those relative values and the content elements' pixel attributes, and image processing is then performed according to that pixel attribute. The content elements may comprise the foreground and background of the image, with pixel transparency as the pixel attribute.
In addition, because the pixel attribute of the image is determined according to the relative values and the pixel attributes corresponding to the content elements, the image's pixel attribute lies within the numerical interval formed by the content elements' pixel attributes, which guarantees the accuracy of the matting.
In order to make the present application better understood by those skilled in the art, the following description of the present solution is given by way of specific examples.
Fig. 4 shows an architectural diagram of image processing in one example of the present application. The neural network model comprises at least an encoder, a foreground decoder, a background decoder and a mixer. The corresponding processing flow comprises the following steps (a code sketch of this pipeline follows the list):
1. The image is input.
2. The encoder encodes the image to obtain its feature data.
3. The foreground decoder and the background decoder decode the feature data to obtain foreground features and background features, together with the pixel transparencies corresponding to the foreground and the background respectively; these are stored in video memory.
4. The mixer mixes the features to obtain the pixel transparency relative value of the foreground and the background, which is stored in video memory.
5. The pixel transparency of the image is determined according to the transparency relative value and stored in memory for the subsequent segmentation step.
6. Image segmentation is performed according to the pixel transparency of the image.
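As an illustration only, here is a minimal sketch of what such an encoder / two-decoder / mixer pipeline could look like in PyTorch. All layer choices, channel counts, and names (MattingNet, feat, etc.) are assumptions; the patent specifies the components and the data flow, not their internals.

```python
import torch
import torch.nn as nn

class MattingNet(nn.Module):
    """Hypothetical encoder + foreground/background decoders + mixer pipeline."""

    def __init__(self, feat: int = 64):
        super().__init__()
        # Encoder: image -> multi-dimensional feature data (step 2).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Each decoder yields per-pixel features plus a transparency estimate.
        self.fg_decoder = self._decoder(feat)
        self.bg_decoder = self._decoder(feat)
        # Mixer: compares foreground and background features to predict the
        # per-pixel transparency relative value in [0, 1].
        self.mixer = nn.Sequential(
            nn.Conv2d(2 * feat, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, 1, 1), nn.Sigmoid(),
        )

    @staticmethod
    def _decoder(feat: int) -> nn.Module:
        return nn.Sequential(
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat + 1, 1),  # last channel: transparency logit
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        feats = self.encoder(image)                                   # step 2
        fg = self.fg_decoder(feats)                                   # step 3
        bg = self.bg_decoder(feats)
        fg_feat, fg_alpha = fg[:, :-1], torch.sigmoid(fg[:, -1:])
        bg_feat, bg_alpha = bg[:, :-1], torch.sigmoid(bg[:, -1:])
        relative = self.mixer(torch.cat([fg_feat, bg_feat], dim=1))   # step 4
        # Step 5: weighted combination of the decoders' transparencies.
        return fg_alpha * (1.0 - relative) + bg_alpha * relative

net = MattingNet()
alpha = net(torch.randn(1, 3, 64, 64))  # per-pixel transparency, shape (1, 1, 64, 64)
```

The forward pass mirrors steps 2 through 5 above; step 6 would threshold or otherwise consume the returned transparency map.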
Referring to fig. 5, a block diagram illustrating a structure of an embodiment of an image processing apparatus based on a neural network model according to a fourth embodiment of the present application is shown, which may specifically include:
a relative value obtaining module 401, configured to input feature data corresponding to a foreground portion and a background portion of the image into the mixer, so as to obtain a pixel transparency relative value of the foreground portion and the background portion;
a transparency determining module 402, configured to determine a pixel transparency of the image according to the pixel transparency relative value and pixel transparencies corresponding to the foreground portion and the background portion, respectively;
an image segmentation module 403, configured to perform image segmentation processing based on pixel transparency of the image.
In an optional embodiment of the present application, the neural network model further comprises at least two decoders, the apparatus further comprises:
and the feature data processing module is used for respectively inputting the feature data of the image into the decoders to obtain the feature data corresponding to the foreground part and the background part of the image and the pixel transparencies corresponding to the foreground part and the background part.
In an optional embodiment of the present application, the neural network model further comprises an encoder, and the apparatus further comprises:
and the feature data acquisition module is used for inputting the image into the encoder to obtain the feature data of the image.
According to this embodiment of the application, the feature data corresponding to the foreground part and the background part of the image are input into the mixer to obtain the corresponding pixel transparency relative value; the pixel transparency of the image is determined according to that relative value and the pixel transparencies corresponding to the foreground part and the background part; and image segmentation is then performed based on the pixel transparency of the image.
In addition, because the pixel transparency of the image is determined according to the relative value and the pixel transparencies corresponding to the foreground part and the background part, the image's pixel transparency lies within the range spanned by the foreground and background transparencies, which ensures the accuracy of the matting.
Referring to fig. 6, a block diagram illustrating a structure of an embodiment of an image processing apparatus according to the fifth embodiment of the present application may specifically include:
an element determination module 501 for determining a first content element and a second content element of an image;
a parameter obtaining module 502, configured to obtain a pixel attribute relative value of the first content element and the second content element;
an attribute determining module 503, configured to determine a pixel attribute of the image according to the pixel attribute relative value and pixel attributes corresponding to the first content element and the second content element, respectively;
an image processing module 504, configured to perform image processing based on the pixel attributes.
In an optional embodiment of the application, the first content element comprises an image foreground portion and the second content element comprises an image background portion, the element determination module comprising:
a target object identification submodule for identifying at least one target object in the image;
and the content determining submodule is used for determining the image foreground part and the image background part according to the corresponding relation between the target object and the image foreground part or the image background part.
In an optional embodiment of the present application, the element determination module is specifically configured to input the image into a first content decoder and a second content decoder of a neural network model, so as to obtain the first content element and the second content element of the image, respectively.
In an optional embodiment of the present application, the parameter obtaining module includes:
the feature data acquisition submodule is used for acquiring feature data corresponding to a first content element and a second content element of the image respectively;
and the relative value determining submodule is used for determining the pixel attribute relative value of the foreground part and the background part of the image according to the acquired feature data.
In an optional embodiment of the present application, the relative value determining sub-module is specifically configured to input feature data corresponding to the first content element and the second content element, and the feature data of the image into a mixer of a neural network, so as to obtain a pixel attribute relative value of a foreground portion and a background portion of the image.
In an optional embodiment of the present application, the feature data obtaining sub-module is specifically configured to input the image into an encoder of a neural network model, so as to obtain feature data of multiple dimensions corresponding to the image; and inputting the feature data of the image into a first content decoder and a second content decoder of a neural network model to respectively obtain the feature data corresponding to the first content element and the second content element of the image.
In an optional embodiment of the present application, the pixel attribute of the image is within a value range formed by the pixel attribute of the first content element and the pixel attribute of the second content element.
In an optional embodiment of the present application, the attribute determining module includes:
the parameter determining submodule is used for determining attribute weights of the pixel attributes corresponding to the first content element and the second content element respectively according to the pixel attribute relative value and a preset numerical value relation;
and the operation submodule is used for determining the pixel attribute of the image according to the pixel attribute corresponding to the first content element and the second content element respectively and the attribute weight corresponding to the pixel attribute.
In an optional embodiment of the present application, the image processing module is specifically configured to perform image segmentation processing according to a pixel transparency of the image.
In an optional embodiment of the present application, the apparatus further comprises:
and the object determining module is used for determining a target object to be processed in the image before the image processing based on the pixel attributes.
In an optional embodiment of the present application, the object determining module is specifically configured to determine a target object to be processed in the image according to whether at least one target object in the image belongs to a preset category.
According to this embodiment of the application, after a first content element and a second content element of an image are determined, the pixel attribute relative value of the two content elements is obtained; the pixel attribute of the image can then be derived from that relative value together with the pixel attributes corresponding to the first content element and the second content element, and further image processing can be performed according to the resulting pixel attribute. In a typical matting scenario, the first content element and the second content element correspond to the foreground and background of the image respectively, and the pixel attribute is pixel transparency.
In addition, because the pixel attribute of the image is determined from the relative value together with the pixel attributes of the first content element and the second content element, the image's pixel attribute lies within the range spanned by those two attributes, which ensures the accuracy of the matting.
Referring to fig. 7, a block diagram of an embodiment of an image processing apparatus according to a sixth embodiment of the present application is shown, which may specifically include:
an element determination module 601 for determining a plurality of content elements of an image;
a relative value obtaining module 602, configured to obtain pixel attribute relative values between content elements;
an attribute determining module 603, configured to determine a pixel attribute of the image according to the pixel attribute relative value and the pixel attribute of the content element;
an image processing module 604, configured to perform image processing based on the pixel attribute.
According to this embodiment of the application, after a plurality of content elements of an image are determined, the pixel attribute relative values between the content elements are obtained; the pixel attribute of the image is derived from those relative values and the content elements' pixel attributes, and image processing is then performed according to that pixel attribute. The content elements may comprise the foreground and background of the image, with pixel transparency as the pixel attribute.
In addition, because the pixel attribute of the image is determined according to the relative values and the pixel attributes corresponding to the content elements, the image's pixel attribute lies within the numerical interval formed by the content elements' pixel attributes, which guarantees the accuracy of the matting.
Referring to fig. 8, a flowchart of an embodiment of an image processing method according to the seventh embodiment of the present application is shown, and the method may specifically include: acquiring an input first image; determining feature data corresponding to the foreground part and the background part of the first image respectively by adopting a first neural network and a second neural network; determining the pixel attribute of the first image according to that feature data; extracting the first content element based on the pixel attribute of the first image; obtaining a second image according to the first content element and an updated second content element; and providing the second image.
The first image is the image to be processed and comprises a first content element, a second content element and possibly at least one other content element. For example, the first content element is a foreground portion and the second content element is a background portion; other content elements may also be included, such as an intermediate scene portion, i.e. content of the image that belongs to neither the foreground portion nor the background portion. Its content and the manner of identifying it can be set according to actual needs.
After the first image is obtained, the pixel attribute relative values of the first content element and the second content element are further determined, and the pixel attribute of the image is determined according to the pixel attribute relative values. For specific details, reference may be made to the above embodiments 1 to 6, which are not described herein again.
The first content element may be extracted based on the pixel attribute of the image. Taking pixel transparency as the pixel attribute of the image and the foreground content as the first content element, for example, the foreground content of the image can be extracted according to the image's pixel transparency.
The extracted first content element may be re-composited with an updated second content element into a second image, so that the first image and the second image differ in, among other things, the second content element. When this scheme is applied to a matting scenario, after the foreground content is extracted it can be composited with new background content to synthesize a new image.
Further, a second image may be provided for presentation or use.
In an alternative embodiment, the first image, excluding the foreground portion and the background portion, further comprises an intermediate scene portion, and a third neural network may be employed to determine feature data of the intermediate scene portion of the first image.
Correspondingly, when the pixel attribute of the first image is determined according to the feature data corresponding to the foreground part and the background part of the first image, the pixel attribute of the first image may be determined according to the feature data corresponding to the foreground part, the background part and the intermediate scene part of the first image.
It can be understood that, in a specific implementation, the number and the type of content elements included in the first image may be set according to actual needs, and then corresponding feature extraction and processing are performed by using a plurality of corresponding neural networks, and combinations of corresponding encoders, decoders, and mixers are also different, which is not limited in this application. To improve the efficiency of image processing, a plurality of neural network models (including an encoder, a decoder, and a mixer) may also be used for image processing.
Processing with a plurality of neural networks can improve both processing efficiency and image processing quality. Based on considerations such as specific requirements and cost, different levels of service can be provided to different clients: for clients who need higher-precision image processing and can bear a higher resource cost, a scheme employing multiple neural networks can be selected; for clients with lower requirements who wish to save cost, fewer neural networks can be used, for example a single neural network for feature extraction.
It should also be noted that the order of the above steps may be adjusted according to actual needs; for example, different neural network models may perform the feature extraction step simultaneously or sequentially. The positions and order of the various components within the corresponding neural network models may likewise differ.
Referring to fig. 9, a schematic diagram according to another example of the present application is shown.
In this example, an image is input through the terminal interface; after receiving the image, the terminal performs image preprocessing and then predicts the transparency of the image with the neural network model. The neural network model comprises an encoder, a foreground decoder, a background decoder and a mixer.
First, the encoder obtains the feature data of the image, and the feature data are input into a foreground neural network and a background neural network, i.e. the foreground decoder and the background decoder of the neural network model, which obtain the feature data corresponding to the foreground content and the background content respectively.
These feature data are input into the mixer of the neural network model to obtain the pixel transparency relative value of the foreground part and the background part, and the pixel transparency of the image is determined according to that relative value and the pixel transparencies corresponding to the foreground part and the background part.
The foreground part of the image can be matted out according to the image's transparency; the foreground part and a new background part are then composited into a new image, which can be output to the terminal for display.
When synthesizing the new image, the foreground portion may be pre-processed, for example to add image effects (e.g. cartoon effects), adjust its orientation (e.g. the direction a person faces), or set the position of the foreground portion on the background portion. A minimal compositing sketch follows.
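The sketch assumes the foreground and the per-pixel transparency have already been obtained as described above; the function and variable names are hypothetical.

```python
import numpy as np

def composite(foreground: np.ndarray,
              new_background: np.ndarray,
              alpha: np.ndarray) -> np.ndarray:
    """Alpha-composite the extracted foreground onto a new background.

    foreground, new_background: float arrays of shape (H, W, 3) in [0, 1].
    alpha: per-pixel transparency of shape (H, W, 1) in [0, 1].
    """
    return alpha * foreground + (1.0 - alpha) * new_background

h, w = 64, 64
fg = np.random.rand(h, w, 3)       # stand-in for the matted foreground
bg = np.ones((h, w, 3)) * 0.5      # stand-in for the new background
alpha = np.random.rand(h, w, 1)    # stand-in for the predicted transparency
new_image = composite(fg, bg, alpha)
```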
The above image processing may be executed on the terminal, or the image may be transmitted to a server over the network for processing there.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
Embodiments of the disclosure may be implemented as a system using any suitable hardware, firmware, software, or any combination thereof, in a desired configuration. Fig. 10 schematically illustrates an exemplary system (or apparatus) 800 that can be used to implement various embodiments described in this disclosure.
For one embodiment, fig. 10 illustrates an exemplary system 800 having one or more processors 802, a system control module (chipset) 804 coupled to at least one of the processor(s) 802, a system memory 806 coupled to the system control module 804, a non-volatile memory (NVM)/storage 808 coupled to the system control module 804, one or more input/output devices 810 coupled to the system control module 804, and a network interface 812 coupled to the system control module 804.
The processor 802 may include one or more single-core or multi-core processors, and the processor 802 may include any combination of general-purpose or special-purpose processors (e.g., graphics processors, application processors, baseband processors, etc.). In some embodiments, the system 800 can function as a browser as described in embodiments herein.
In some embodiments, system 800 may include one or more computer-readable media (e.g., system memory 806 or NVM/storage 808) having instructions and one or more processors 802 that, in conjunction with the one or more computer-readable media, are configured to execute the instructions to implement modules to perform the actions described in this disclosure.
For one embodiment, the system control module 804 may include any suitable interface controller to provide any suitable interface to at least one of the processor(s) 802 and/or any suitable device or component in communication with the system control module 804.
The system control module 804 may include a memory controller module to provide an interface to the system memory 806. The memory controller module may be a hardware module, a software module, and/or a firmware module.
For one embodiment, the system control module 804 may include one or more input/output controllers to provide an interface to the NVM/storage 808 and input/output device(s) 810.
For example, NVM/storage 808 may be used to store data and/or instructions. NVM/storage 808 may include any suitable non-volatile memory (e.g., flash memory) and/or may include any suitable non-volatile storage device(s) (e.g., one or more Hard Disk Drives (HDDs), one or more Compact Disc (CD) drives, and/or one or more Digital Versatile Disc (DVD) drives).
NVM/storage 808 may include storage resources that are physically part of the device on which system 800 is installed, or storage resources that the device can access without their necessarily being part of the device. For example, NVM/storage 808 may be accessed over a network via the input/output device(s) 810.
Input/output device(s) 810 may provide an interface for system 800 to communicate with any other suitable device; input/output device(s) 810 may include communication components, audio components, sensor components, and so forth. Network interface 812 may provide an interface for system 800 to communicate over one or more networks, and system 800 may communicate wirelessly with one or more components of a wireless network according to any of one or more wireless network standards and/or protocols, for example accessing a wireless network based on a communication standard such as WiFi, 2G, or 3G, or a combination thereof.
For one embodiment, at least one of the processor(s) 802 may be packaged together with logic for one or more controller(s) (e.g., memory controller module) of the system control module 804. For one embodiment, at least one of the processor(s) 802 may be packaged together with logic for one or more controller(s) of the system control module 804 to form a System In Package (SiP). For one embodiment, at least one of the processor(s) 802 may be integrated on the same die with logic for one or more controller(s) of the system control module 804. For one embodiment, at least one of the processor(s) 802 may be integrated on the same die with logic of one or more controllers of the system control module 804 to form a system on a chip (SoC).
In various embodiments, system 800 may be, but is not limited to being: a browser, a workstation, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet, a netbook, etc.). In various embodiments, system 800 may have more or fewer components and/or different architectures. For example, in some embodiments, system 800 includes one or more cameras, a keyboard, a Liquid Crystal Display (LCD) screen (including a touch screen display), a non-volatile memory port, multiple antennas, a graphics chip, an Application Specific Integrated Circuit (ASIC), and speakers.
If the display includes a touch panel, the display screen may be implemented as a touch screen display to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action, but also the duration and pressure associated with the touch or swipe operation.
The present application further provides a non-volatile readable storage medium storing one or more modules (programs); when the one or more modules are executed on a terminal device, they may cause the terminal device to execute the instructions of the method steps described in the present application.
In one example, a computer device is provided, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method according to the embodiments of the present application when executing the computer program.
In one example there is also provided a computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements a method according to one or more of the embodiments of the present application.
An embodiment of the application discloses an image processing method and an image processing device. Example 1 includes an image processing method, comprising:
acquiring an image, wherein the image comprises a foreground part and a background part;
acquiring characteristic data corresponding to a foreground part and a background part of the image respectively;
inputting feature data corresponding to a foreground part and a background part of the image into a mixer of a neural network model, wherein the mixer is used for predicting pixel transparency of the image by comparing the foreground part and the background part;
determining, by the mixer, a pixel transparency of the image.
Example 2 may include the method of example 1, wherein the determining, by the mixer, the pixel transparency of the image includes:
obtaining a pixel transparency relative value of the foreground part and the background part according to the characteristic data respectively corresponding to the foreground part and the background part;
and determining the pixel transparency of the image according to the pixel transparency relative value and the pixel transparencies respectively corresponding to the foreground part and the background part.
Example 3 may include the method of example 1, wherein the method further comprises:
performing image segmentation processing based on the pixel transparency of the image.
Example 4 may include the method of example 1, wherein the neural network model further includes at least two decoders, the method further comprising:
inputting the feature data of the image into each of the decoders to obtain, respectively, the feature data corresponding to the foreground part and the background part of the image and the pixel transparencies corresponding to the foreground part and the background part.
Example 5 may include the method of example 1, wherein the neural network model further includes an encoder, the method further comprising:
inputting the image into the encoder to obtain the feature data of the image.
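Taken together, examples 1-5 outline an encoder, at least two decoders, and a mixer. The sketch below arranges those pieces in PyTorch; every layer size, the single-convolution depth, and the sigmoid activations are placeholder assumptions, since the disclosure does not fix an architecture:

```python
import torch
import torch.nn as nn

class MattingModel(nn.Module):
    """Encoder / dual-decoder / mixer layout per examples 1-5 (shapes assumed)."""

    def __init__(self, feat=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, feat, 3, padding=1), nn.ReLU())
        # Each decoder yields per-part features plus a 1-channel transparency.
        self.fg_decoder = nn.Conv2d(feat, feat + 1, 3, padding=1)
        self.bg_decoder = nn.Conv2d(feat, feat + 1, 3, padding=1)
        # The mixer compares the two parts to predict a relative value.
        self.mixer = nn.Conv2d(2 * feat, 1, 3, padding=1)

    def forward(self, image):                   # image: (N, 3, H, W)
        feats = self.encoder(image)             # example 5: encoder -> feature data
        fg = self.fg_decoder(feats)             # example 4: per-part features and
        bg = self.bg_decoder(feats)             #            per-part transparencies
        fg_feat, alpha_fg = fg[:, :-1], torch.sigmoid(fg[:, -1:])
        bg_feat, alpha_bg = bg[:, :-1], torch.sigmoid(bg[:, -1:])
        rel = self.mixer(torch.cat([fg_feat, bg_feat], dim=1))   # example 1
        w = torch.sigmoid(rel)
        return w * alpha_fg + (1.0 - w) * alpha_bg               # example 2
```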
Example 6 includes an image processing method, comprising:
determining a first content element and a second content element of an image;
acquiring pixel attribute relative values of the first content element and the second content element;
determining the pixel attribute of the image according to the pixel attribute relative value and the pixel attributes corresponding to the first content element and the second content element respectively;
and processing the image based on the pixel attribute.
Example 7 may include the method of example 6, wherein the first content element includes an image foreground portion and the second content element includes an image background portion, and the determining the first content element and the second content element of the image includes:
identifying at least one target object in the image;
and determining the image foreground part and the image background part according to the corresponding relation between the target object and the image foreground part or the image background part.
Example 8 may include the method of example 6, wherein the determining the first and second content elements of the image comprises:
inputting the image into a first content decoder and a second content decoder of a neural network model to obtain the first content element and the second content element of the image, respectively.
Example 9 may include the method of example 6, wherein the obtaining the pixel attribute relative values of the first content element and the second content element comprises:
acquiring characteristic data corresponding to a first content element and a second content element of the image respectively;
determining a relative value of pixel attributes of a foreground portion and a background portion of the image from the acquired feature data.
Example 10 may include the method of example 9, wherein the determining a relative value of pixel attributes for foreground and background portions of the image from the acquired feature data comprises:
inputting the feature data respectively corresponding to the first content element and the second content element and the feature data of the image into a mixer of a neural network to obtain a pixel attribute relative value of a foreground part and a background part of the image.
Example 11 may include the method of example 9, wherein the obtaining feature data corresponding to the first content element and the second content element of the image respectively comprises:
inputting the image into an encoder of a neural network model to obtain characteristic data of the image corresponding to multiple dimensions;
and inputting the feature data of the image into a first content decoder and a second content decoder of a neural network model to respectively obtain the feature data corresponding to the first content element and the second content element of the image.
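Example 10 differs from the model sketch above in that the mixer also receives the image's own feature data alongside the two content elements' features; under the same assumptions, the concatenation widens accordingly, so the mixer would need 3 * feat input channels:

```python
import torch

def mix_with_image_features(mixer, fg_feat, bg_feat, img_feat):
    """Example 10's variant: feed the encoder's feature data into the
    mixer together with both content elements' features. `mixer` is
    assumed to be a module sized for the wider concatenation."""
    return mixer(torch.cat([fg_feat, bg_feat, img_feat], dim=1))
```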
Example 12 may include the method of example 6, wherein the pixel attribute of the image is within a numerical range of the pixel attribute of the first content element and the pixel attribute of the second content element.
Example 13 may include the method of example 6, wherein the determining the pixel attribute of the image according to the pixel attribute relative value and the pixel attributes corresponding to the first content element and the second content element, respectively, comprises:
determining attribute weights of pixel attributes corresponding to the first content element and the second content element respectively according to the pixel attribute relative value and a preset numerical value relationship;
and determining the pixel attribute of the image according to the pixel attribute corresponding to the first content element and the second content element respectively and the attribute weight corresponding to the pixel attribute.
Example 14 may include the method of example 6, wherein the pixel attribute includes a pixel transparency, the image processing based on the pixel attribute including:
performing image segmentation processing according to the pixel transparency of the image.
Example 15 may include the method of example 6, wherein prior to the image processing based on the pixel attributes, the method further comprises:
determining a target object to be processed in the image.
Example 16 may include the method of example 6, wherein the determining a target object to be processed in the image comprises:
determining a target object to be processed in the image according to whether at least one target object in the image belongs to a preset category.
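A minimal reading of example 16 is a category filter over detected objects; the category set and the detection format below are purely hypothetical:

```python
PRESET_CATEGORIES = {"person", "pet"}           # hypothetical preset categories

def select_targets(detections):
    """Keep only detections whose label falls in the preset categories.
    `detections` is assumed to be a list of (label, bbox) pairs from
    any upstream object detector."""
    return [(label, bbox) for label, bbox in detections
            if label in PRESET_CATEGORIES]
```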
Example 17 includes an image processing method comprising:
determining a plurality of content elements of an image;
acquiring pixel attribute relative values among the content elements;
determining the pixel attribute of the image according to the pixel attribute relative values and the pixel attributes of the content elements;
and processing the image based on the pixel attribute.
Example 18 includes a method of image processing, comprising:
acquiring an input first image;
determining characteristic data corresponding to a foreground part and a background part of the first image respectively by adopting a first neural network and a second neural network;
determining the pixel attribute of the first image according to the characteristic data respectively corresponding to the foreground part and the background part of the first image;
extracting a first content element based on the pixel attribute of the first image;
obtaining a second image according to the first content element and an updated second content element;
providing the second image.
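As a usage illustration of example 18's flow, the fragment below chains the earlier `MattingModel` and `composite` sketches on random stand-in data; all shapes, the model itself, and the helper are assumptions carried over from those sketches:

```python
import numpy as np
import torch

first_image = torch.rand(1, 3, 64, 64)           # the input first image
updated_bg = np.random.rand(64, 64, 3)           # the updated second content element

model = MattingModel()
alpha = model(first_image)[0, 0].detach().numpy()        # pixel attribute (matte)
first_rgb = first_image[0].permute(1, 2, 0).numpy()      # (H, W, 3) for blending
second_image = composite(first_rgb, alpha, updated_bg)   # the provided second image
```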
Example 19 may include the method of example 18, further comprising:
determining feature data of an intermediate scene portion of the first image using a third neural network;
the determining the pixel attribute of the first image according to the feature data corresponding to the foreground part and the background part of the first image respectively comprises:
determining the pixel attribute of the first image according to the feature data corresponding to the foreground part, the background part, and the intermediate scene portion of the first image, respectively.
Example 20 includes a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing a method as in one or more of examples 1-19 when executing the computer program.
Although certain examples have been illustrated and described for purposes of description, a wide variety of alternate and/or equivalent implementations may be substituted to achieve the same objectives without departing from the scope of the present application. This application is intended to cover any adaptations or variations of the embodiments discussed herein. It is therefore manifestly intended that the embodiments described herein be limited only by the claims and their equivalents.
Claims (21)
1. An image processing method, comprising:
acquiring an image, wherein the image comprises a foreground part and a background part;
acquiring characteristic data corresponding to a foreground part and a background part of the image respectively;
inputting feature data corresponding to a foreground part and a background part of the image into a mixer of a neural network model, wherein the mixer is used for predicting pixel transparency of the image by comparing the foreground part and the background part;
determining, by the mixer, a pixel transparency of the image.
2. The method of claim 1, wherein the determining, by the mixer, the pixel transparency of the image comprises:
obtaining a pixel transparency relative value of the foreground part and the background part according to the characteristic data respectively corresponding to the foreground part and the background part;
and determining the pixel transparency of the image according to the pixel transparency relative value and the pixel transparencies respectively corresponding to the foreground part and the background part.
3. The method of claim 1, further comprising:
performing image segmentation processing based on the pixel transparency of the image.
4. The method of claim 1, wherein the neural network model further comprises at least two decoders, the method further comprising:
inputting the feature data of the image into each of the decoders to obtain, respectively, the feature data corresponding to the foreground part and the background part of the image and the pixel transparencies corresponding to the foreground part and the background part.
5. The method of claim 1, wherein the neural network model further comprises an encoder, the method further comprising:
inputting the image into the encoder to obtain the feature data of the image.
6. An image processing method, comprising:
determining a first content element and a second content element of an image;
acquiring pixel attribute relative values of the first content element and the second content element;
determining the pixel attribute of the image according to the pixel attribute relative value and the pixel attributes corresponding to the first content element and the second content element respectively;
and processing the image based on the pixel attribute.
7. The method of claim 6, wherein the first content element comprises an image foreground portion and the second content element comprises an image background portion, and wherein determining the first content element and the second content element of the image comprises:
identifying at least one target object in the image;
and determining the image foreground part and the image background part according to the corresponding relation between the target object and the image foreground part or the image background part.
8. The method of claim 6, wherein determining the first content element and the second content element of the image comprises:
inputting the image into a first content decoder and a second content decoder of a neural network model to obtain the first content element and the second content element of the image, respectively.
9. The method of claim 6, wherein obtaining the relative value of the pixel attributes of the first content element and the second content element comprises:
acquiring characteristic data corresponding to a first content element and a second content element of the image respectively;
determining a relative value of pixel attributes of a foreground portion and a background portion of the image from the acquired feature data.
10. The method of claim 9, wherein determining relative values of pixel attributes of foreground and background portions of the image from the acquired feature data comprises:
inputting the feature data respectively corresponding to the first content element and the second content element and the feature data of the image into a mixer of a neural network to obtain a pixel attribute relative value of a foreground part and a background part of the image.
11. The method of claim 9, wherein obtaining feature data corresponding to the first content element and the second content element of the image respectively comprises:
inputting the image into an encoder of a neural network model to obtain characteristic data of the image corresponding to multiple dimensions;
and inputting the feature data of the image into a first content decoder and a second content decoder of a neural network model to respectively obtain the feature data corresponding to the first content element and the second content element of the image.
12. The method of claim 6, wherein the pixel attribute of the image is within a value range formed by the pixel attribute of the first content element and the pixel attribute of the second content element.
13. The method of claim 6, wherein determining the pixel attribute of the image according to the relative value of the pixel attribute and the pixel attributes corresponding to the first content element and the second content element comprises:
determining attribute weights of pixel attributes corresponding to the first content element and the second content element respectively according to the pixel attribute relative value and a preset numerical value relationship;
and determining the pixel attribute of the image according to the pixel attribute corresponding to the first content element and the second content element respectively and the attribute weight corresponding to the pixel attribute.
14. The method of claim 6, wherein the pixel attribute comprises a pixel transparency, and wherein the image processing based on the pixel attribute comprises:
performing image segmentation processing according to the pixel transparency of the image.
15. The method of claim 6, wherein prior to said image processing based on said pixel attributes, said method further comprises:
determining a target object to be processed in the image.
16. The method of claim 6, wherein the determining a target object to be processed in the image comprises:
determining a target object to be processed in the image according to whether at least one target object in the image belongs to a preset category.
17. An image processing method, comprising:
determining a plurality of content elements of an image;
acquiring pixel attribute relative values among the content elements;
determining the pixel attribute of the image according to the pixel attribute relative values and the pixel attributes of the content elements;
and processing the image based on the pixel attribute.
18. An image processing method, comprising:
acquiring an input first image;
determining characteristic data corresponding to a foreground part and a background part of the first image respectively by adopting a first neural network and a second neural network;
determining the pixel attribute of the first image according to the characteristic data respectively corresponding to the foreground part and the background part of the first image;
extracting a first content element based on the pixel attribute of the first image;
obtaining a second image according to the first content element and an updated second content element;
providing the second image.
19. The method of claim 18, further comprising:
determining feature data of an intermediate scene portion of the first image using a third neural network;
the determining the pixel attribute of the first image according to the feature data corresponding to the foreground part and the background part of the first image respectively comprises:
determining the pixel attribute of the first image according to the feature data corresponding to the foreground part, the background part, and the intermediate scene portion of the first image, respectively.
20. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method according to one or more of claims 1-19 when executing the computer program.
21. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to one or more of claims 1-19.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201910741275.3A | 2019-08-12 | 2019-08-12 | Image processing method, computer equipment and storage medium
Publications (1)
Publication Number | Publication Date |
---|---|
CN112396610A true CN112396610A (en) | 2021-02-23 |
Family
ID=74602391
Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN201910741275.3A (CN112396610A, pending) | Image processing method, computer equipment and storage medium | 2019-08-12 | 2019-08-12
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112396610A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108460770A (en) * | 2016-12-13 | 2018-08-28 | 华为技术有限公司 | Scratch drawing method and device |
CN107452010A (en) * | 2017-07-31 | 2017-12-08 | 中国科学院长春光学精密机械与物理研究所 | A kind of automatically stingy nomography and device |
US20190122027A1 (en) * | 2017-10-20 | 2019-04-25 | Ptc Inc. | Processing uncertain content in a computer graphics system |
CN108305256A (en) * | 2017-11-28 | 2018-07-20 | 腾讯科技(深圳)有限公司 | Video keying processing method, processing unit and computer readable storage medium |
Non-Patent Citations (2)
Title |
---|
LONG ANG LIM et al.: "Foreground Segmentation Using a Triplet Convolutional Neural Network for Multiscale Feature Encoding", ARXIV, 7 January 2018 (2018-01-07) *
CHEN XING; LI JUN: "A Survey of Segmentation Algorithms Based on Graph Theory", COMPUTER AND DIGITAL ENGINEERING, no. 10, 20 October 2016 (2016-10-20) *
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113012026A (en) * | 2021-03-23 | 2021-06-22 | 格兰菲智能科技有限公司 | Graphics processor and method of operating the same |
CN113012026B (en) * | 2021-03-23 | 2023-09-05 | 格兰菲智能科技有限公司 | Graphics processor and method of operation thereof |
CN113034648A (en) * | 2021-04-30 | 2021-06-25 | 北京字节跳动网络技术有限公司 | Image processing method, device, equipment and storage medium |
CN114786040A (en) * | 2022-06-15 | 2022-07-22 | 阿里巴巴(中国)有限公司 | Data communication method, system, electronic device and storage medium |
CN114786040B (en) * | 2022-06-15 | 2022-09-23 | 阿里巴巴(中国)有限公司 | Data communication method, system, electronic device and storage medium |
WO2023241459A1 (en) * | 2022-06-15 | 2023-12-21 | 阿里巴巴(中国)有限公司 | Data communication method and system, and electronic device and storage medium |
CN118521602A (en) * | 2024-07-22 | 2024-08-20 | 电子科技大学中山学院 | Matting processing method, program product, electronic equipment and storage medium |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |