CN117788486A - Image segmentation method, device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN117788486A
Authority
CN
China
Prior art keywords
image
segmentation
segmented
sampling
position information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311816313.XA
Other languages
Chinese (zh)
Inventor
马熠东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shuhang Technology Beijing Co ltd
Original Assignee
Shuhang Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shuhang Technology Beijing Co ltd filed Critical Shuhang Technology Beijing Co ltd
Priority to CN202311816313.XA priority Critical patent/CN117788486A/en
Publication of CN117788486A publication Critical patent/CN117788486A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

The embodiments of the application disclose an image segmentation method, an image segmentation device, electronic equipment and a storage medium. The method comprises the following steps: acquiring an image to be segmented and a reference image corresponding to the image to be segmented, wherein the image to be segmented is generated by a diffusion model according to the reference image; performing subject object key point identification on the reference image to acquire position information of key points of the subject object in the reference image; determining, according to the position information of the key points, first segmentation prompt information for indicating the position of the subject object in the image to be segmented; inputting the image to be segmented and the first segmentation prompt information into an image segmentation model; and performing subject object segmentation on the image to be segmented according to the first segmentation prompt information through the image segmentation model, to obtain a first subject mask corresponding to the subject object in the image to be segmented. Because the segmentation prompt information does not need to be input manually, the efficiency and accuracy of image segmentation can be improved.

Description

Image segmentation method, device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image segmentation method, an image segmentation device, an electronic device, and a storage medium.
Background
At present, image segmentation technology is applied ever more widely, and the requirements placed on it are ever higher. In the related art, image segmentation may be performed using a model, but the user must manually input segmentation prompt information in order to segment the subject object in an image.
The problem with the related art is that the user must manually input segmentation prompt information for every image, and the segmentation result depends on the accuracy of that input. In other words, segmentation accuracy is affected by the user's manual operation, which hinders improvement of both the efficiency and the accuracy of image segmentation.
Disclosure of Invention
The embodiments of the application provide an image segmentation method, an image segmentation device, electronic equipment and a storage medium. When an image is segmented, key points of the subject object are automatically identified in the reference image corresponding to the image to be segmented, and first segmentation prompt information is automatically determined from the position information of those key points. No manual operation is needed, which improves the efficiency and accuracy of image segmentation.
A first aspect of an embodiment of the present application provides an image segmentation method, where the method includes:
acquiring an image to be segmented and a reference image corresponding to the image to be segmented, wherein the image to be segmented is generated by a diffusion model according to the reference image;
performing subject object key point identification on the reference image to acquire position information of key points of the subject object in the reference image;
determining first segmentation prompt information for indicating the position of the subject object in the image to be segmented according to the position information of the key points;
inputting the image to be segmented and the first segmentation prompt information into an image segmentation model;
and carrying out subject object segmentation on the image to be segmented according to the first segmentation prompt information through the image segmentation model, and obtaining a first subject mask corresponding to the subject object in the image to be segmented.
A second aspect of an embodiment of the present application provides an image segmentation apparatus, including:
the image acquisition module is used for acquiring an image to be segmented and a reference image corresponding to the image to be segmented, wherein the image to be segmented is generated by a diffusion model according to the reference image;
the key point identification module is used for performing subject object key point identification on the reference image and acquiring the position information of the key points of the subject object in the reference image;
the prompt information generation module is used for determining first segmentation prompt information for indicating the position of the subject object in the image to be segmented according to the position information of the key points;
the data transmission module is used for inputting the image to be segmented and the first segmentation prompt information into an image segmentation model;
and the image segmentation module is used for performing subject object segmentation on the image to be segmented according to the first segmentation prompt information through the image segmentation model, and obtaining a first subject mask corresponding to the subject object in the image to be segmented.
In some optional embodiments, the prompt information generating module includes:
the first information acquisition unit is used for taking the position information of the key points as second segmentation prompt information;
the first segmentation unit is used for performing subject object segmentation on the image to be segmented according to the image segmentation model and the second segmentation prompt information, and obtaining a second subject mask corresponding to the subject object in the image to be segmented;
the sampling unit is used for sampling the pixel points within the second subject mask range to obtain the position information of the sampling points;
and the second information acquisition unit is used for determining the first segmentation prompt information according to the position information of the sampling point and the position information of the key point.
In some optional embodiments, the prompt information generating module further includes an image erosion unit, configured to:
acquiring an image erosion control parameter;
and performing image erosion processing on the second subject mask according to the image erosion control parameter, so as to shrink the range covered by the second subject mask, and obtaining the processed second subject mask.
In some alternative embodiments, the sampling unit is specifically configured to:
determining a sampling step length according to the image size of the image to be segmented;
uniformly sampling the pixel points of the image to be segmented according to the sampling step length to obtain a plurality of sampling pixel points and position information of the sampling pixel points;
and taking the sampling pixel points that fall within the second subject mask range as sampling points according to the position information of the sampling pixel points, and obtaining the position information corresponding to the sampling points.
In some alternative embodiments, the sampling unit is specifically configured to:
acquiring a minimum bounding box corresponding to the second subject mask;
determining a sampling step length according to the size of the minimum bounding box;
and uniformly sampling the pixel points within the second subject mask range according to the sampling step length to obtain a plurality of sampling points and position information of the sampling points.
In some optional embodiments, the second information obtaining unit is specifically configured to:
determining a target distance threshold according to the sampling step length and the position information of the key point;
determining target key points corresponding to the key points in the image to be segmented according to the position information of the key points, and acquiring the position information of the target key points;
determining a target sampling point from the sampling points according to the target distance threshold, the position information of the sampling points in the image to be segmented and the position information of the target key points, wherein the distance between the target sampling point and each target key point is larger than the target distance threshold;
and constructing the first segmentation prompt information according to the position information of the target sampling point and the position information of the key point.
In some optional embodiments, the second information obtaining unit is further specifically configured to:
calculating and obtaining a minimum distance value between the key points according to the position information of the key points;
and taking the smaller one between the sampling step length and the minimum distance value as the target distance threshold value.
A third aspect of embodiments of the present application provides an electronic device, including a memory and a processor, where the memory stores a plurality of instructions; the processor loads instructions from the memory to execute steps in the image segmentation method provided in the first aspect of the embodiment of the present application.
A fourth aspect of the embodiments of the present application provides a computer readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps in the image segmentation method provided in the first aspect of the embodiments of the present application.
By adopting the scheme of the embodiment of the application, the image to be segmented and the reference image corresponding to the image to be segmented can be obtained, wherein the image to be segmented is generated by a diffusion model according to the reference image; subject object key point identification is performed on the reference image to acquire position information of key points of the subject object in the reference image; first segmentation prompt information for indicating the position of the subject object in the image to be segmented is determined according to the position information of the key points; the image to be segmented and the first segmentation prompt information are input into an image segmentation model; and subject object segmentation is performed on the image to be segmented according to the first segmentation prompt information through the image segmentation model, to obtain a first subject mask corresponding to the subject object in the image to be segmented.
In this way, when an image is segmented, key points of the subject object are automatically identified in the reference image corresponding to the image to be segmented, and the first segmentation prompt information is automatically determined from the position information of those key points. After the first segmentation prompt information and the image to be segmented are input into the image segmentation model, the model automatically segments the subject object in the image to be segmented according to the first segmentation prompt information. No manual operation is needed and the segmentation accuracy is not affected by manual operation of a user, which improves the efficiency and accuracy of image segmentation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a data processing sequence of image segmentation performed by an image segmentation apparatus according to an embodiment of the present application;
fig. 2 is a schematic flow chart of an image segmentation method according to an embodiment of the present application;
fig. 3 is an interaction schematic diagram during image segmentation processing according to an embodiment of the present application;
fig. 4 is a block diagram of an image segmentation apparatus according to an embodiment of the present application;
fig. 5 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The embodiments of the application provide an image segmentation method, an image segmentation device, electronic equipment and a storage medium. Specifically, the image segmentation method of the embodiment of the application may be performed by a computer device, where the computer device may be a terminal or a server. The terminal can be a terminal device such as a smart phone, a tablet computer, a notebook computer, a touch screen, a personal computer (PC, Personal Computer), a personal digital assistant (PDA, Personal Digital Assistant) and the like. The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms.
A detailed description follows. Note that the order in which the embodiments are described below is not intended to limit their order of preference.
Fig. 1 is a schematic diagram of the data processing sequence when an image segmentation apparatus according to an embodiment of the present application performs image segmentation. As shown in Fig. 1, image segmentation proceeds through steps S0 to S5, where step S0 is the process of generating the image to be segmented in advance.
In the embodiment of the present application, the image to be segmented is an image generated by a diffusion model, and in an actual use process, the image to be segmented may also be an image obtained by other modes according to a reference image, which is not limited herein specifically.
Specifically, in the field of AI-generated content (AIGC, Artificial Intelligence Generated Content), diffusion models are commonly used for image generation. The diffusion model used for generating the image to be segmented in the embodiment of the present application may be a Stable Diffusion model; other models may also be adopted in actual use, which is not limited herein. When the diffusion model generates an image, a reference image A and guide prompts are input. The prompts are text-encoded, the reference image A is encoded by an autoencoder and Gaussian noise is added, and the Stable Diffusion denoising process then produces a stylized virtual character image B that is based on the character reference image A and matches the text description of the prompts. In the embodiment of the present application, the image B is used as the image to be segmented, that is, the region where the person subject in the image B is located needs to be segmented.
Meanwhile, image segmentation may be performed using an image segmentation model such as the Segment Anything Model (SAM), a general-purpose segmentation model. However, when segmenting the character subject from an image generated by the diffusion model, the generated character may be a real-style character, a virtual-style character, or a character produced by superimposing different styles, so the segmentation effect is poor and it is difficult to accurately segment the region where the character subject is located. In addition, the SAM model requires the user to manually input the segmentation prompt information (prompt); the prompt cannot be acquired automatically, which reduces image segmentation efficiency.
In the image segmentation method provided by the embodiment of the application, the first segmentation prompt information can be automatically generated, so that the image segmentation efficiency is improved.
An embodiment of the present application provides an image segmentation method, please refer to fig. 2, and fig. 2 is a flow chart of the image segmentation method provided in the embodiment of the present application. The specific flow of the image segmentation method can be as follows:
201. Acquiring an image to be segmented and a reference image corresponding to the image to be segmented, wherein the image to be segmented is generated by a diffusion model according to the reference image.
In the embodiment of the application, image segmentation is performed on an image to be segmented, which is generated in advance by a diffusion model. It should be noted that, the image to be segmented generated from the reference image has the same image size as the reference image.
202. Performing subject object key point identification on the reference image, and acquiring the position information of the key points of the subject object in the reference image.
It should be noted that the image style of the image to be segmented may be complicated, and it is difficult to determine the position where the subject object (i.e., the person subject) is located. However, the image style of the reference image used by the diffusion model is generally simpler, and the image to be segmented is generated by referring to the reference image, so in the embodiment of the present application, the location information of the key points of the subject object in the reference image may be obtained by performing subject object key point identification on the reference image. The position information of the keypoints in the reference image may be used to generally indicate the position of the subject object in the image to be segmented.
In some embodiments of the present application, the reference image (i.e., image A) is input into a preset two-dimensional human body key point identification model to identify the key points of the character. It should be noted that the two-dimensional human body key point identification model may be selected and adjusted according to actual requirements; for example, an OpenPose model or an RTMPose model may be used, which is not limited herein.
Further, the two-dimensional human body key point identification model can identify a plurality of key points, and in this embodiment a subset of them is selected. For example, the coordinates of the nose, left and right eyes, left and right shoulders, left and right elbows, left and right wrists, left and right hips, left and right knees, and left and right ankles can be selected to construct the position information of the key points of the subject object in the reference image, Point2D = [[x1, y1], [x2, y2], …, [x15, y15]], which contains 15 two-dimensional coordinates in total. Other key points may be used in actual use, which is not specifically limited herein.
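As an illustration of how the key point position information might be assembled, the following minimal sketch builds the 15-entry coordinate list from the output of a key point detector. The key point names, the function name build_point2d, and the choice to skip key points the detector did not return are assumptions made for this example and are not taken from the application or from any particular pose model.

```python
# Hypothetical labels for the 15 selected key points; real pose models
# use their own naming and index conventions.
SELECTED = [
    "nose", "left_eye", "right_eye",
    "left_shoulder", "right_shoulder",
    "left_elbow", "right_elbow",
    "left_wrist", "right_wrist",
    "left_hip", "right_hip",
    "left_knee", "right_knee",
    "left_ankle", "right_ankle",
]


def build_point2d(detected):
    """Return [[x, y], ...] for the selected key points, in SELECTED order.

    `detected` maps a key point name to an (x, y) pixel coordinate.
    Key points missing from `detected` are skipped (an assumption).
    """
    return [list(detected[name]) for name in SELECTED if name in detected]
```

Any extra key points returned by the detector (for example ears) are simply ignored by this sketch.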
203. Determining first segmentation prompt information for indicating the position of the subject object in the image to be segmented according to the position information of the key points.
Note that, in the embodiment of the present application, the segmentation prompt information (including the first segmentation prompt information and the second segmentation prompt information) may be the position of a point in the image, or a detection box, a segmentation mask, or the like that indicates a certain region of the image. In the present application, the position (i.e., the coordinates) of a point is used as an example of segmentation prompt information, which is not a specific limitation.
Because the image sizes of the reference image and the image to be segmented are the same, the pixel points in the reference image and the image to be segmented are also in one-to-one correspondence. In general, the region where the subject object in the image to be segmented is located is relatively similar to the region where the subject object in the reference image is located.
In one application scenario, the position information of the key points may be directly used as the first segmentation prompt information. In this case, the position information Point2D of the key points of the subject object in the reference image is used as the first segmentation prompt information, and the first segmentation prompt information and the image to be segmented (i.e. the image B) are input into the image segmentation model to obtain the first subject mask, which indicates the region where the subject object in the image to be segmented is located.
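To make the idea of point-type prompt information concrete, here is a minimal sketch that pairs each key point coordinate with a foreground label. The function name to_point_prompt is hypothetical. The label convention (1 for foreground, 0 for background) follows the one commonly used for SAM point prompts; with the segment_anything package, such lists would typically be converted to NumPy arrays and passed as the point_coords and point_labels arguments of SamPredictor.predict, which is not shown here.

```python
def to_point_prompt(points, label=1):
    """Pair each [x, y] point with a label (1 = foreground by convention).

    Returns (coords, labels) as plain lists so the sketch has no
    third-party dependencies.
    """
    coords = [[float(x), float(y)] for x, y in points]
    labels = [label] * len(coords)
    return coords, labels
```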
In another application scenario, the position information of the key points can be further processed to obtain more accurate first segmentation prompt information, so that the accuracy of image segmentation is improved.
Specifically, the determining, according to the position information of the key point, first segmentation hint information for indicating a position of the subject object in the image to be segmented includes:
the position information of the key points is used as second segmentation prompt information;
performing subject object segmentation on the image to be segmented according to the image segmentation model and the second segmentation prompt information to obtain a second subject mask corresponding to the subject object in the image to be segmented;
sampling the pixel points within the second subject mask range to obtain the position information of the sampling points;
and determining the first segmentation prompt information according to the position information of the sampling point and the position information of the key point.
In this case, the position information Point2D of the key points of the subject object in the reference image is used as the second segmentation prompt information, and the second segmentation prompt information and the image to be segmented (i.e. the image B) are input into the image segmentation model to obtain a second subject mask, which indicates the region where the subject object in the image to be segmented is located. The pixel points within the second subject mask are then sampled to generate more accurate first segmentation prompt information.
Specifically, the image segmentation model may be selected according to actual requirements. For example, the embodiment of the present application uses the SAM ViT-L model as the image segmentation model, which is a general segmentation model that uses the large Vision Transformer (ViT-Large) as its backbone.
In the embodiment of the application, the second segmentation prompt information and the image to be segmented (i.e. the image B) are input into the image segmentation model to obtain the second subject mask Mask_B. Pixel points are then sampled according to the second subject mask so as to generate more accurate first segmentation prompt information.
It should be noted that a sampling point that lies outside the second subject mask, or on its boundary, would degrade the image segmentation result. To avoid this, in this embodiment of the present application, image erosion is first performed on the second subject mask, and pixel point sampling is performed afterwards.
Specifically, before the sampling is performed on the pixel points within the second subject mask range to obtain the position information of the sampling points, the method further includes:
acquiring an image erosion control parameter;
and performing image erosion processing on the second subject mask according to the image erosion control parameter, so as to shrink the range covered by the second subject mask, and obtaining the processed second subject mask.
The image erosion control parameter is a parameter for controlling the extent of the image erosion, and may be preset or adjusted according to actual requirements, which is not particularly limited herein.
In the embodiment of the present application, the image erosion control parameters are set to kernel=5 and iterations=6, that is, an erosion kernel of size 5 is applied and the processing is iterated 6 times. Specifically, image erosion processing is performed on Mask_B according to the image erosion control parameters kernel=5 and iterations=6 to obtain a second subject mask Mask_B_erode with a reduced range, and pixel point sampling is performed based on the processed second subject mask.
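The erosion step can be sketched in pure Python as follows. This is a minimal illustration, not the implementation used in the application: it assumes a binary mask stored as a list of rows of 0/1 values, a square kernel with an odd side length, and treats pixels outside the image as background (image-processing libraries may handle borders differently). The function name erode is hypothetical. With a 5 × 5 kernel, each iteration strips a layer about 2 pixels thick from the mask boundary, so 6 iterations shrink the mask by roughly 12 pixels on each side.

```python
def erode(mask, kernel=5, iterations=6):
    """Binary erosion of a 2D 0/1 mask with a square kernel.

    A pixel stays 1 only if every pixel under the kernel is 1.
    Pixels outside the image count as 0 (an assumption), so the
    mask also shrinks where it touches the image border.
    """
    height, width = len(mask), len(mask[0])
    radius = kernel // 2
    current = [row[:] for row in mask]
    for _ in range(iterations):
        eroded = [[0] * width for _ in range(height)]
        for y in range(radius, height - radius):
            for x in range(radius, width - radius):
                if all(
                    current[yy][xx]
                    for yy in range(y - radius, y + radius + 1)
                    for xx in range(x - radius, x + radius + 1)
                ):
                    eroded[y][x] = 1
        current = eroded
    return current
```

In practice a vectorised library routine would be far faster; the nested loops here are only meant to show what the kernel size and iteration count control.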
In one application scenario, the sampling of the pixel points within the second subject mask range to obtain the position information of the sampling points includes:
determining a sampling step length according to the image size of the image to be segmented;
uniformly sampling the pixel points of the image to be segmented according to the sampling step length to obtain a plurality of sampling pixel points and position information of the sampling pixel points;
and taking the sampling pixel points that fall within the second subject mask range as sampling points according to the position information of the sampling pixel points, and obtaining the position information corresponding to the sampling points.
Specifically, the image size of the image to be segmented is Width_B by Height_B, and the sampling step lengths in the width and height directions are determined according to a preset sampling step number control parameter. This parameter can be set and adjusted according to actual requirements, for example, it can be set to 12, which is not particularly limited herein.
Further, the corresponding sampling steps are int(Width_B/12) in the width direction and int(Height_B/12) in the height direction. Uniform sampling along the width and height directions yields 169 (169 = 13 × 13) sampling pixel points, whose position information is Point_sample = [[0, 0], [0, int(Height_B/12)], [int(Width_B/12), 0], …, [int(Width_B/12) × 12, int(Height_B/12) × 12]].
Because the whole image to be segmented is sampled in this case, the sampling pixel points do not necessarily fall within the second subject mask range. In the embodiment of the application, only the sampling pixel points within the second subject mask range are used as sampling points.
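The whole-image sampling followed by mask filtering can be sketched as below. The function names grid_sample and keep_inside_mask are hypothetical, and the mask is again assumed to be a list of rows of 0/1 values indexed as mask[y][x]. One detail that the sketch handles explicitly is an assumption of this example: when the width or height is an exact multiple of the step count, the last grid position equals the image size and lies just outside the image, so it is discarded by the bounds check.

```python
def grid_sample(width, height, steps=12):
    """Uniform grid over the image: (steps + 1) positions per axis."""
    step_x, step_y = int(width / steps), int(height / steps)
    return [
        [i * step_x, j * step_y]
        for i in range(steps + 1)
        for j in range(steps + 1)
    ]


def keep_inside_mask(points, mask):
    """Keep points that lie inside the image and on a mask pixel of 1."""
    height, width = len(mask), len(mask[0])
    return [
        [x, y] for x, y in points
        if x < width and y < height and mask[y][x]
    ]
```

For a 120 × 240 image, grid_sample returns 169 candidate positions, of which only those landing on the mask survive the filter.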
In another application scenario, the sampling of the pixel points within the second subject mask range to obtain the position information of the sampling points includes:
acquiring a minimum bounding box corresponding to the second subject mask;
determining a sampling step length according to the size of the minimum bounding box;
and uniformly sampling the pixel points within the second subject mask range according to the sampling step length to obtain a plurality of sampling points and position information of the sampling points.
Specifically, when the sampling step length is determined according to the size of the minimum bounding box, the width and the height of the minimum bounding box are each divided by the sampling step number control parameter to obtain the sampling step lengths in the width and height directions. The sampling process itself is similar to that described above and is not repeated here.
In this way, only points within the minimum bounding box of the second subject mask are sampled, which reduces the amount of data to be processed and further improves processing efficiency.
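A minimal sketch of the bounding-box variant is given below. It assumes the minimum bounding box is axis-aligned, that the mask is a list of rows of 0/1 values, and that the step is clamped to at least 1 pixel so very small masks can still be sampled; none of these choices is specified in the application, and the function names are hypothetical.

```python
def bounding_box(mask):
    """Smallest axis-aligned box (x0, y0, x1, y1), inclusive, around
    the pixels equal to 1; returns None for an empty mask."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not ys:
        return None
    xs = [x for row in mask for x, value in enumerate(row) if value]
    return min(xs), ys[0], max(xs), ys[-1]


def sample_in_box(mask, steps=12):
    """Uniformly sample mask pixels inside the mask's bounding box.

    Returns (points, (step_x, step_y)).
    """
    box = bounding_box(mask)
    if box is None:
        return [], (0, 0)
    x0, y0, x1, y1 = box
    # Step = box size divided by the step-count parameter, at least 1.
    step_x = max(1, int((x1 - x0 + 1) / steps))
    step_y = max(1, int((y1 - y0 + 1) / steps))
    points = [
        [x, y]
        for x in range(x0, x1 + 1, step_x)
        for y in range(y0, y1 + 1, step_y)
        if mask[y][x]
    ]
    return points, (step_x, step_y)
```

Compared with sampling the whole image, the grid here is denser over the subject for the same step-count parameter, because the step is derived from the box size rather than the image size.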
The determining the first segmentation hint information according to the position information of the sampling point and the position information of the key point includes:
determining a target distance threshold according to the sampling step length and the position information of the key point;
determining target key points corresponding to the key points in the image to be segmented according to the position information of the key points, and acquiring the position information of the target key points;
determining a target sampling point from the sampling points according to the target distance threshold, the position information of the sampling points in the image to be segmented and the position information of the target key points, wherein the distance between the target sampling point and each target key point is larger than the target distance threshold; the distance between the target sampling point and the target key point is determined according to the position information of the target sampling point and the position information of the target key point;
and constructing the first segmentation prompt information according to the position information of the target sampling point and the position information of the key point.
It should be noted that if the key points and sampling points used to construct the first segmentation prompt information are too close to one another, the accuracy of image segmentation is affected, and too many points reduce segmentation efficiency. Therefore, in the embodiment of the application, the sampling points are further screened to obtain target sampling points, ensuring that each target sampling point is farther than the target distance threshold from every target key point.
The image to be segmented and the reference image have the same size, and their pixel points are in one-to-one correspondence. Therefore, by overlaying the image to be segmented on the reference image, or by establishing the same coordinate system for both images, the position of the target key point in the image to be segmented can be determined.
In the embodiment of the present application, the case of establishing the same coordinate system for the image to be segmented and the reference image is taken as an example. In this case, each key point and its corresponding target key point have the same coordinates (i.e., the same position information), so the position information of the key points may be used directly as the position information of the target key points.
Specifically, the determining the target distance threshold according to the sampling step length and the position information of the key point includes:
calculating and obtaining a minimum distance value between the key points according to the position information of the key points;
and taking the smaller one between the sampling step length and the minimum distance value as the target distance threshold value.
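As a minimal sketch of the threshold rule above, the following Python code computes the minimum pairwise distance between key points and takes the smaller of that value and the sampling step length. The function names, the use of Euclidean distance, and the fallback for fewer than two key points are assumptions made for illustration and are not specified by the embodiment.

```python
import math
from itertools import combinations


def min_keypoint_distance(keypoints):
    """Return the smallest Euclidean distance between any two key points."""
    return min(math.dist(p, q) for p, q in combinations(keypoints, 2))


def target_distance_threshold(keypoints, sampling_step):
    """Return the smaller of the sampling step and the minimum key point distance."""
    if len(keypoints) < 2:
        # Assumption: with fewer than two key points there is no pairwise
        # distance, so the sampling step alone is used as the threshold.
        return float(sampling_step)
    return min(float(sampling_step), min_keypoint_distance(keypoints))
```

For example, with key points (0, 0), (3, 4) and (10, 0), the minimum pairwise distance is 5, so a sampling step of 4 gives a threshold of 4 and a sampling step of 8 gives a threshold of 5.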
It should be noted that, the target distance threshold may be preset and adjusted according to actual requirements, which is not limited herein.
In one application scenario, a sampling pixel point is taken as a target sampling point if it satisfies the following two conditions. The first condition is that the coordinates of the sampling pixel point lie within mask_B_error, the segmented region of the second main mask after erosion processing. The second condition is that its distance from every point in Point2D exceeds threshold, where threshold = min(minimum distance between any two points in Point2D, int(width_b/12), int(height_b/12)). The target sampling points satisfying both conditions form the point set point_mask.
The two point sets Point2D and point_mask are combined to obtain a new point set Point2d_new, and the position information stored in Point2d_new is used as the first segmentation prompt information.
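The two screening conditions and the merging of the point sets can be sketched as follows. This is an illustrative sketch only: the mask is assumed to be a list of rows of 0/1 values indexed as mask[y][x], points are assumed to be (x, y) integer pixel coordinates, and the function names are hypothetical.

```python
import math


def select_target_sampling_points(sampling_points, keypoints, threshold):
    """Keep sampling points whose distance to every key point exceeds threshold."""
    return [
        s for s in sampling_points
        if all(math.dist(s, k) > threshold for k in keypoints)
    ]


def build_prompt_points(keypoints, sampling_points, eroded_mask, threshold):
    """Merge the key points with the screened sampling points.

    Condition 1: the sampling point lies inside the eroded mask.
    Condition 2: the sampling point is farther than threshold from every key point.
    """
    in_mask = [(x, y) for (x, y) in sampling_points if eroded_mask[y][x]]
    point_mask = select_target_sampling_points(in_mask, keypoints, threshold)
    # Key points come first, followed by the retained sampling points.
    return list(keypoints) + point_mask
```

Placing the key points first is a design choice of this sketch; the embodiment only states that the two point sets are combined.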
204. Inputting the image to be segmented and the first segmentation prompt information into an image segmentation model.
205. Performing subject object segmentation on the image to be segmented according to the first segmentation prompt information through the image segmentation model, and obtaining a first subject mask corresponding to the subject object in the image to be segmented.
In this embodiment of the present application, the position information stored in Point2d_new is used as the first segmentation prompt information (i.e., the guiding information) and is input into the decoder stage (i.e., the decoding stage) of the image segmentation model SAM vit-l, the same model that performed the preliminary image segmentation according to the second segmentation prompt information. The final segmentation result Mask is obtained as the first subject mask. The first subject mask indicates the area where the subject object is located in the image to be segmented. In one application scenario, the image to be segmented may further be processed based on the first subject mask to obtain an image containing only the subject object, which is not limited herein.
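The following sketch illustrates how the merged point set might be arranged as a point prompt and how the first subject mask might be applied to keep only the subject object. It rests on several assumptions: every prompt point is treated as a foreground point (label 1), following the common point-prompt convention of SAM implementations; the image is a list of rows of pixel tuples; and the helper names are hypothetical. The call into the segmentation model itself is omitted, since the embodiment does not specify the programming interface used.

```python
def to_point_prompt(points):
    """Arrange (x, y) points as coordinate and label lists for a point prompt.

    Assumption: all points lie on the subject object, so every label is 1
    (foreground).
    """
    coords = [[float(x), float(y)] for x, y in points]
    labels = [1] * len(coords)
    return coords, labels


def keep_subject_only(image, subject_mask, background=(0, 0, 0)):
    """Replace every pixel outside the subject mask with a background value."""
    return [
        [pixel if inside else background
         for pixel, inside in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, subject_mask)
    ]
```

In practice the coordinate and label lists would typically be converted to arrays before being passed to the model, and the background value could be replaced by a transparent channel.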
Fig. 3 is an interaction diagram of the image segmentation processing according to an embodiment of the present application. As shown in fig. 3, a user inputs an image to be segmented, generated in advance by a diffusion model, together with its reference image. Subject object key point recognition is then performed on the reference image to obtain the position information of the key points of the subject object, the first segmentation prompt information is determined automatically from this position information, and image segmentation processing is performed by the image segmentation model to obtain the first subject mask. An automatic prompt (segmentation prompt information) extraction method based on SAM is thus provided. The prompt does not need to be set manually, and the user can complete image segmentation automatically simply by inputting the corresponding images, which improves processing efficiency and allows character images generated by the diffusion model to be segmented better.
By adopting the scheme of the embodiment of the application, the image to be segmented and the reference image corresponding to the image to be segmented can be obtained, wherein the image to be segmented is generated by a diffusion model according to the reference image; performing main object key point identification on the reference image to acquire position information of key points of a main object in the reference image; determining first segmentation prompt information for indicating the position of a main object in the image to be segmented according to the position information of the key points; inputting the image to be segmented and the first segmentation prompt information into an image segmentation model; and carrying out subject object segmentation on the image to be segmented according to the first segmentation prompt information through the image segmentation model, and obtaining a first subject mask corresponding to the subject object in the image to be segmented.
In this way, when an image is segmented, key points of the subject object are automatically identified in the reference image corresponding to the image to be segmented, and the first segmentation prompt information is automatically determined from the position information of the key points. After the first segmentation prompt information and the image to be segmented are input into the image segmentation model, the model automatically segments the subject object in the image to be segmented according to the first segmentation prompt information. No manual operation is required, the segmentation accuracy is not affected by manual operation of the user, and the efficiency and accuracy of image segmentation are improved.
Fig. 4 is a block diagram of an image segmentation apparatus according to an embodiment of the present application. As shown in fig. 4, the image segmentation apparatus includes:
an image obtaining module 401, configured to obtain an image to be segmented, and a reference image corresponding to the image to be segmented, where the image to be segmented is generated by a diffusion model according to the reference image;
a key point identification module 402, configured to identify key points of a subject with respect to the reference image, and obtain location information of the key points of the subject in the reference image;
a prompt information generating module 403, configured to determine, according to the position information of the key point, first segmentation prompt information for indicating a position of the subject object in the image to be segmented;
a data transmission module 404, configured to input the image to be segmented and the first segmentation prompt information into an image segmentation model;
the image segmentation module 405 is configured to segment the subject object of the image to be segmented according to the first segmentation hint information by using the image segmentation model, and obtain a first subject mask corresponding to the subject object in the image to be segmented.
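The module structure above can be sketched as a simple pipeline in which each module is supplied as a callable. This is only an illustrative arrangement: the class name, field names, and callable signatures are hypothetical, and the data transmission module 404 and the image segmentation module 405 are folded into a single `segment` callable for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Point = Tuple[int, int]
Mask = List[List[int]]


@dataclass
class ImageSegmentationApparatus:
    # Module 402: key point identification on the reference image.
    detect_keypoints: Callable[[object], Sequence[Point]]
    # Module 403: builds the first segmentation prompt from the key points.
    build_prompt: Callable[[object, Sequence[Point]], Sequence[Point]]
    # Modules 404 and 405: feed image and prompt to the model, return the mask.
    segment: Callable[[object, Sequence[Point]], Mask]

    def run(self, image_to_segment, reference_image) -> Mask:
        """Module 401 supplies the two images; the rest runs automatically."""
        keypoints = self.detect_keypoints(reference_image)
        prompt = self.build_prompt(image_to_segment, keypoints)
        return self.segment(image_to_segment, prompt)
```

Passing the modules in as callables keeps the sketch independent of any particular key point detector or segmentation model.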
In some alternative embodiments, the prompt information generating module 403 includes:
the first information acquisition unit is used for taking the position information of the key points as second segmentation prompt information;
the first segmentation unit is used for carrying out main object segmentation on the image to be segmented according to the image segmentation model and the second segmentation prompt information, and obtaining a second main mask corresponding to the main object in the image to be segmented;
the sampling unit is used for sampling the pixel points in the second main body mask range to obtain the position information of the sampling points;
and the second information acquisition unit is used for determining the first segmentation prompt information according to the position information of the sampling point and the position information of the key point.
In some optional embodiments, the prompt information generating module 403 further includes an image erosion unit, configured to:
acquiring an image erosion control parameter;
and performing image erosion processing on the second main mask according to the image erosion control parameter so as to reduce the range covered by the second main mask, and obtaining the processed second main mask.
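As a minimal sketch of the erosion step, the following code shrinks a binary mask using a square neighbourhood. It assumes that the image erosion control parameter is the neighbourhood radius and that pixels outside the image count as background; neither assumption is stated in the embodiment, and an image processing library would normally be used instead of this pure-Python loop.

```python
def erode_mask(mask, radius=1):
    """Binary erosion with a (2 * radius + 1) square neighbourhood.

    A pixel stays in the mask only if every pixel in its neighbourhood is
    inside the image and inside the mask.
    """
    height, width = len(mask), len(mask[0])

    def inside(y, x):
        return 0 <= y < height and 0 <= x < width and bool(mask[y][x])

    offsets = range(-radius, radius + 1)
    return [
        [int(all(inside(y + dy, x + dx) for dy in offsets for dx in offsets))
         for x in range(width)]
        for y in range(height)
    ]
```

A larger radius removes a wider band along the mask boundary, so the points sampled afterwards lie further from the boundary of the mask.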
In some alternative embodiments, the sampling unit is specifically configured to:
determining a sampling step length according to the image size of the image to be segmented;
uniformly sampling the pixel points of the image to be segmented according to the sampling step length to obtain a plurality of sampling pixel points and position information of the sampling pixel points;
and taking the sampling pixel points located within the second main mask range as sampling points according to the position information of the sampling pixel points, and obtaining the position information corresponding to the sampling points.
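The image-size-based sampling can be sketched as follows. The sketch assumes a sampling step of one twelfth of the image width and height, by analogy with the int(width_b/12) and int(height_b/12) terms in the application scenario described above; the `divisions` parameter, the function name, and the mask layout (rows of 0/1 values indexed as mask[y][x]) are illustrative assumptions.

```python
def grid_sample_in_mask(mask, divisions=12):
    """Uniformly sample the whole image, then keep points inside the mask.

    Returns the retained (x, y) sampling points and the (step_x, step_y) used.
    """
    height, width = len(mask), len(mask[0])
    # Assumption: the step is the image size divided by `divisions`, at least 1.
    step_x = max(1, width // divisions)
    step_y = max(1, height // divisions)
    grid = [(x, y)
            for y in range(0, height, step_y)
            for x in range(0, width, step_x)]
    return [(x, y) for (x, y) in grid if mask[y][x]], (step_x, step_y)
```

For a 24 x 24 mask the step is 2 in each direction, so a mask covering only the top-left 4 x 4 block keeps four of the grid points.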
In some alternative embodiments, the sampling unit is specifically configured to:
acquiring a minimum bounding box corresponding to the second main mask;
determining a sampling step length according to the size of the minimum bounding box;
and uniformly sampling the pixel points in the second main mask range according to the sampling step length to obtain a plurality of sampling points and position information of the sampling points.
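The bounding-box variant can be sketched in the same way. Here the step is derived from the size of the minimum bounding box rather than from the whole image; the divisor, the function names, and the mask layout (rows of 0/1 values indexed as mask[y][x]) are again assumptions made for illustration.

```python
def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the non-zero region of the mask."""
    xs = [x for row in mask for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        raise ValueError("mask is empty")
    return min(xs), min(ys), max(xs), max(ys)


def sample_in_bounding_box(mask, divisions=12):
    """Uniformly sample inside the minimum bounding box, keeping mask pixels."""
    x0, y0, x1, y1 = bounding_box(mask)
    # Assumption: the step is the box size divided by `divisions`, at least 1.
    step_x = max(1, (x1 - x0 + 1) // divisions)
    step_y = max(1, (y1 - y0 + 1) // divisions)
    return [(x, y)
            for y in range(y0, y1 + 1, step_y)
            for x in range(x0, x1 + 1, step_x)
            if mask[y][x]]
```

Compared with sampling the whole image, this variant adapts the sampling density to the size of the subject object, so a small subject still receives a reasonable number of sampling points.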
In some optional embodiments, the second information obtaining unit is specifically configured to:
determining a target distance threshold according to the sampling step length and the position information of the key point;
determining target key points corresponding to the key points in the image to be segmented according to the position information of the key points, and acquiring the position information of the target key points;
determining a target sampling point from the sampling points according to the target distance threshold, the position information of the sampling points in the image to be segmented and the position information of the target key points, wherein the distance between the target sampling point and each target key point is larger than the target distance threshold;
and constructing the first segmentation prompt information according to the position information of the target sampling point and the position information of the key point.
In some optional embodiments, the second information obtaining unit is further specifically configured to:
calculating and obtaining a minimum distance value between the key points according to the position information of the key points;
and taking the smaller one between the sampling step length and the minimum distance value as the target distance threshold value.
The embodiment of the application discloses an image segmentation device, which is used for acquiring an image to be segmented and a reference image corresponding to the image to be segmented through an image acquisition module 401, wherein the image to be segmented is generated by a diffusion model according to the reference image; performing main object key point identification on the reference image through a key point identification module 402, and acquiring position information of key points of a main object in the reference image; determining, by a prompt information generating module 403, first segmentation prompt information for indicating a position of a subject object in the image to be segmented according to the position information of the key point; inputting the image to be segmented and the first segmentation prompt information into an image segmentation model through a data transmission module 404; and performing subject object segmentation on the image to be segmented according to the first segmentation prompt information through the image segmentation model by an image segmentation module 405 to obtain a first subject mask corresponding to the subject object in the image to be segmented.
In this way, when an image is segmented, key points of the subject object are automatically identified in the reference image corresponding to the image to be segmented, and the first segmentation prompt information is automatically determined from the position information of the key points. After the first segmentation prompt information and the image to be segmented are input into the image segmentation model, the model automatically segments the subject object in the image to be segmented according to the first segmentation prompt information. No manual operation is required, the segmentation accuracy is not affected by manual operation of the user, and the efficiency and accuracy of image segmentation are improved.
It should be noted that the division of each module in the above device may be determined according to actual requirements, which is not specifically limited herein.
In the foregoing embodiments, the description of each embodiment has its own emphasis. For parts that are not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
Correspondingly, the embodiment of the application also provides an electronic device, which may be a terminal, such as a smart phone, a tablet computer, a notebook computer, a touch screen, a game console, a personal computer (PC), or a personal digital assistant (PDA). Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 5, the electronic device 500 includes a processor 501 having one or more processing cores, a memory 502 having one or more computer readable storage media, and a computer program stored on the memory 502 and executable on the processor 501. The processor 501 is electrically connected to the memory 502. Those skilled in the art will appreciate that the electronic device structure shown in fig. 5 does not limit the electronic device, which may include more or fewer components than shown in fig. 5, may combine certain components, or may use a different arrangement of components.
The processor 501 is a control center of the electronic device 500, connects various portions of the entire electronic device 500 using various interfaces and lines, and performs various functions of the electronic device 500 and processes data by running or loading software programs and/or modules stored in the memory 502, and invoking data stored in the memory 502, thereby performing overall monitoring of the electronic device 500. The processor 501 may be a central processing unit CPU, a graphics processor GPU, a network processor (NP, network Processor), etc., and may implement or execute the methods, steps and logic blocks disclosed in the embodiments of the present application.
In the embodiment of the present application, the processor 501 in the electronic device 500 loads the instructions corresponding to the processes of one or more application programs into the memory 502 according to the following steps, and the processor 501 executes the application programs stored in the memory 502, so as to implement various functions, for example:
acquiring an image to be segmented and a reference image corresponding to the image to be segmented, wherein the image to be segmented is generated by a diffusion model according to the reference image;
performing main object key point identification on the reference image to acquire position information of key points of a main object in the reference image;
determining first segmentation prompt information for indicating the position of a main object in the image to be segmented according to the position information of the key points;
inputting the image to be segmented and the first segmentation prompt information into an image segmentation model;
and carrying out subject object segmentation on the image to be segmented according to the first segmentation prompt information through the image segmentation model, and obtaining a first subject mask corresponding to the subject object in the image to be segmented.
The specific implementation of each operation above may be referred to the previous embodiments, and will not be described herein.
Optionally, as shown in fig. 5, the electronic device 500 further includes: a touch display screen 503, a radio frequency circuit 504, an audio circuit 505, an input unit 506, and a power supply 507. The processor 501 is electrically connected to the touch display 503, the radio frequency circuit 504, the audio circuit 505, the input unit 506, and the power supply 507, respectively. Those skilled in the art will appreciate that the electronic device structure shown in fig. 5 is not limiting of the electronic device and may include more or fewer components than shown in fig. 5, or may combine certain components, or a different arrangement of components.
The touch display screen 503 may be used to display a graphical user interface and to receive operation instructions generated by a user acting on the graphical user interface. The touch display screen 503 may include a display panel and a touch panel. The display panel may be used to display information entered by the user or provided to the user, as well as various graphical user interfaces of the electronic device, which may be composed of graphics, text, icons, video, and any combination thereof. Optionally, the display panel may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. The touch panel may be used to collect touch operations performed by the user on or near it (such as operations performed on or near the touch panel by the user using a finger, a stylus, or any other suitable object or accessory) and to generate corresponding operation instructions, which trigger the corresponding programs. Optionally, the touch panel may include two parts, a touch detection device and a touch controller. The touch detection device detects the touch position of the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller. The touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, and sends the coordinates to the processor 501; it can also receive commands from the processor 501 and execute them. The touch panel may cover the display panel. When the touch panel detects a touch operation on or near it, the touch operation is passed to the processor 501 to determine the type of touch event, and the processor 501 then provides a corresponding visual output on the display panel according to the type of touch event. In the embodiment of the present application, the touch panel and the display panel may be integrated into the touch display screen 503 to implement the input and output functions.
In some embodiments, however, the touch panel and the display panel may be implemented as two separate components to implement the input and output functions. That is, the touch display screen 503 may also implement an input function as part of the input unit 506.
The radio frequency circuit 504 may be used to transmit and receive radio frequency signals so as to establish wireless communication with a network device or another electronic device.
The audio circuit 505 may be used to provide an audio interface between a user and the electronic device through a speaker and a microphone. On the one hand, the audio circuit 505 may convert received audio data into an electrical signal and transmit it to the speaker, which converts it into a sound signal for output. On the other hand, the microphone converts collected sound signals into electrical signals, which are received by the audio circuit 505 and converted into audio data. The audio data is then output to the processor 501 for processing and sent via the radio frequency circuit 504 to, for example, another electronic device, or is output to the memory 502 for further processing. The audio circuit 505 may also include an earphone jack to provide communication between a peripheral earphone and the electronic device.
The input unit 506 may be used to receive input numbers, character information, or user characteristic information (e.g., fingerprint, iris, facial information, etc.), and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
The power supply 507 is used to power the various components of the electronic device 500. Optionally, the power supply 507 may be logically connected to the processor 501 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system. The power supply 507 may also include any one or more components such as a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.
Although not shown in fig. 5, the electronic device 500 may further include a camera, a sensor, a wireless fidelity module, a bluetooth module, etc., which are not described herein.
In the foregoing embodiments, the description of each embodiment has its own emphasis. For parts that are not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that all or a portion of the steps of the various methods of the above embodiments may be performed by instructions, or by instructions controlling associated hardware, which may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, embodiments of the present application provide a computer readable storage medium having stored therein a plurality of computer programs that can be loaded by a processor to perform steps in any of the image segmentation methods provided by embodiments of the present application. For example, the computer program may perform the steps of:
acquiring an image to be segmented and a reference image corresponding to the image to be segmented, wherein the image to be segmented is generated by a diffusion model according to the reference image;
performing main object key point identification on the reference image to acquire position information of key points of a main object in the reference image;
determining first segmentation prompt information for indicating the position of a main object in the image to be segmented according to the position information of the key points;
inputting the image to be segmented and the first segmentation prompt information into an image segmentation model;
and carrying out subject object segmentation on the image to be segmented according to the first segmentation prompt information through the image segmentation model, and obtaining a first subject mask corresponding to the subject object in the image to be segmented.
The specific implementation of each operation above may be referred to the previous embodiments, and will not be described herein.
The storage medium may include: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
The computer program stored in the storage medium can execute the steps in any image segmentation method provided in the embodiments of the present application, and can therefore achieve the beneficial effects of any such method. These effects are detailed in the previous embodiments and are not repeated herein.
According to one aspect of the present application, there is also provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the methods provided in the various alternative implementations of the above embodiments.
The image segmentation method, device, electronic equipment and storage medium provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is only intended to help understand the method and the core idea of the present application. Meanwhile, those skilled in the art may make changes to the specific embodiments and the scope of application in accordance with the ideas of the present application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims (10)

1. An image segmentation method, comprising:
acquiring an image to be segmented and a reference image corresponding to the image to be segmented, wherein the image to be segmented is generated by a diffusion model according to the reference image;
performing main object key point identification on the reference image to acquire position information of key points of a main object in the reference image;
determining first segmentation prompt information for indicating the position of a main object in the image to be segmented according to the position information of the key points;
inputting the image to be segmented and the first segmentation prompt information into an image segmentation model;
and carrying out subject object segmentation on the image to be segmented according to the first segmentation prompt information through the image segmentation model, and obtaining a first subject mask corresponding to the subject object in the image to be segmented.
2. The image segmentation method according to claim 1, wherein the determining the first segmentation hint information for indicating the position of the subject object in the image to be segmented according to the position information of the key point includes:
the position information of the key points is used as second segmentation prompt information;
performing subject object segmentation on the image to be segmented according to the image segmentation model and the second segmentation prompt information, and obtaining a second subject mask corresponding to the subject object in the image to be segmented;
sampling pixel points in the second main mask range to obtain position information of sampling points;
and determining the first segmentation prompt information according to the position information of the sampling point and the position information of the key point.
3. The image segmentation method according to claim 2, wherein before the sampling the pixel points within the second main mask range to obtain the position information of the sampling points, the method further includes:
acquiring an image erosion control parameter;
and performing image erosion processing on the second main mask according to the image erosion control parameter so as to reduce the range covered by the second main mask, and obtaining the processed second main mask.
4. The image segmentation method according to claim 2, wherein the step of sampling the pixel points within the second main mask range to obtain the position information of the sampling points includes:
determining a sampling step length according to the image size of the image to be segmented;
uniformly sampling the pixel points of the image to be segmented according to the sampling step length to obtain a plurality of sampling pixel points and position information of the sampling pixel points;
and taking the sampling pixel points located within the second main mask range as sampling points according to the position information of the sampling pixel points, and obtaining the position information corresponding to the sampling points.
5. The image segmentation method according to claim 2, wherein the step of sampling the pixel points within the second main mask range to obtain the position information of the sampling points includes:
acquiring a minimum bounding box corresponding to the second main mask;
determining a sampling step length according to the size of the minimum bounding box;
and uniformly sampling the pixel points in the second main mask range according to the sampling step length to obtain a plurality of sampling points and position information of the sampling points.
6. The image segmentation method according to claim 4 or 5, wherein the determining the first segmentation hint information according to the position information of the sampling point and the position information of the key point includes:
determining a target distance threshold according to the sampling step length and the position information of the key point;
determining target key points corresponding to the key points in the image to be segmented according to the position information of the key points, and acquiring the position information of the target key points;
determining a target sampling point from the sampling points according to the target distance threshold, the position information of the sampling points in the image to be segmented and the position information of the target key points, wherein the distance between the target sampling point and each target key point is larger than the target distance threshold;
and constructing the first segmentation prompt information according to the position information of the target sampling point and the position information of the key point.
7. The image segmentation method as set forth in claim 6, wherein the determining a target distance threshold based on the sampling step and the location information of the keypoint comprises:
calculating and obtaining a minimum distance value between the key points according to the position information of the key points;
the smaller of the sampling step size and the minimum distance value is taken as the target distance threshold.
8. An image segmentation apparatus, comprising:
the image acquisition module is used for acquiring an image to be segmented and a reference image corresponding to the image to be segmented, wherein the image to be segmented is generated by a diffusion model according to the reference image;
the key point identification module is used for carrying out key point identification of the main object aiming at the reference image and acquiring the position information of the key point of the main object in the reference image;
the prompt information generation module is used for determining first segmentation prompt information for indicating the position of the main object in the image to be segmented according to the position information of the key points;
the data transmission module is used for inputting the image to be segmented and the first segmentation prompt information into an image segmentation model;
and the image segmentation module is used for carrying out main object segmentation on the image to be segmented according to the first segmentation prompt information through the image segmentation model, and obtaining a first main mask corresponding to the main object in the image to be segmented.
9. An electronic device comprising a memory and a processor; the memory stores an application program, and the processor is configured to execute the application program in the memory to perform the steps in the image segmentation method according to any one of claims 1 to 7.
10. A computer readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps in the image segmentation method according to any one of claims 1 to 7.
CN202311816313.XA 2023-12-26 2023-12-26 Image segmentation method, device, electronic equipment and storage medium Pending CN117788486A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311816313.XA CN117788486A (en) 2023-12-26 2023-12-26 Image segmentation method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117788486A (en) 2024-03-29

Family

ID=90381153

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202311816313.XA Pending CN117788486A (en) 2023-12-26 2023-12-26 Image segmentation method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117788486A (en)

Similar Documents

Publication Title
CN112836064A (en) Knowledge graph complementing method and device, storage medium and electronic equipment
CN114186632B (en) Method, device, equipment and storage medium for training key point detection model
US20220139061A1 (en) Model training method and apparatus, keypoint positioning method and apparatus, device and medium
CN113570052B (en) Image processing method, device, electronic equipment and storage medium
CN114723888B (en) Three-dimensional hair model generation method, device, equipment, storage medium and product
CN112001331B (en) Image recognition method, device, equipment and storage medium
CN113344184A (en) User portrait prediction method, device, terminal and computer readable storage medium
CN114783070A (en) Training method and device for in-vivo detection model, electronic equipment and storage medium
CN117455753B (en) Special effect template generation method, special effect generation device and storage medium
CN117593493A (en) Three-dimensional face fitting method, three-dimensional face fitting device, electronic equipment and storage medium
CN112206541B (en) Game plug-in identification method and device, storage medium and computer equipment
CN115393251A (en) Defect detection method and device for printed circuit board, storage medium and electronic equipment
CN116385615A (en) Virtual face generation method, device, computer equipment and storage medium
CN113807430B (en) Model training method, device, computer equipment and storage medium
CN114415997B (en) Display parameter setting method and device, electronic equipment and storage medium
CN117788486A (en) Image segmentation method, device, electronic equipment and storage medium
CN113361490B (en) Image generation method, network training method, image generation device, network training device, computer equipment and storage medium
CN113903071A (en) Face recognition method and device, electronic equipment and storage medium
CN116469156A (en) Method, apparatus, computer device and computer readable storage medium for identifying body state
CN116029912A (en) Training of image processing model, image processing method, device, equipment and medium
CN117523136B (en) Face point position corresponding relation processing method, face reconstruction method, device and medium
CN113807403B (en) Model training method, device, computer equipment and storage medium
CN115578797B (en) Model training method, image recognition device and electronic equipment
CN117726808A (en) Model generation method, image processing method and related equipment
CN117726700A (en) Image generation method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination