CN115439846B - Image segmentation method and device, electronic equipment and medium - Google Patents

Image segmentation method and device, electronic equipment and medium

Info

Publication number
CN115439846B
CN115439846B
Authority
CN
China
Prior art keywords
image
target
class activation
characteristic
segmented
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210952173.8A
Other languages
Chinese (zh)
Other versions
CN115439846A (en)
Inventor
梁孔明
郭宸宇
马占宇
吴铭
张闯
肖波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN202210952173.8A
Publication of CN115439846A
Application granted
Publication of CN115439846B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/70Labelling scene content, e.g. deriving syntactic or semantic representations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/54Extraction of image or video features relating to texture
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G06V10/763Non-hierarchical techniques, e.g. based on statistics of modelling distributions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/765Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image segmentation method, an image segmentation apparatus, an electronic device and a medium. With this technical scheme, the class activation images of an original image are used to determine the category of each feature region image, and the feature region images of a specific category are mapped back into the original image to obtain a segmented image that preserves the complete boundary of the segmented object. The method uses the super-pixel technique to preserve the boundary of the target region and, at the same time, computes a prototype of the target region so as to obtain pseudo-pixel-level labels that are as complete as possible. This avoids the problem in the related art that the segmented image is inaccurate because the pseudo-pixel-level label target region generated from the class activation map is incomplete or its boundary is indistinct.

Description

Image segmentation method and device, electronic equipment and medium
Technical Field
The present application relates to data processing technologies, and in particular, to a method and apparatus for image segmentation, an electronic device, and a medium.
Background
With the development of communications and networks, image segmentation technology is attracting more and more attention in a variety of application scenarios.
In the related art, most weakly supervised semantic segmentation methods that use image-level labels rely on class activation maps (CAMs) to locate the segmentation target, but a CAM can only identify partial high-response discriminative regions in an image and lacks an accurate description of the boundary of the segmentation target. How to design a method that accurately generates a segmentation target image has therefore become a problem to be solved.
Disclosure of Invention
The embodiments of the present application provide an image segmentation method, an image segmentation apparatus, an electronic device and a medium, which are used to solve the problem in the related art that a segmentation target image cannot be obtained accurately with the class activation map technique.
According to one aspect of the embodiments of the present application, there is provided an image segmentation method, including:
acquiring at least one characteristic region image which corresponds to an original image and is obtained based on a super pixel technology; acquiring at least one class activation image corresponding to the original image, wherein each class activation image is marked with an image class to which the class activation image belongs;
determining an image category characterized by each characteristic area image based on the class activation image;
mapping a target characteristic region prototype characterized by a preset image category with the original image to obtain a segmented image corresponding to the target characteristic region image.
Optionally, in another embodiment of the method according to the present application, the acquiring at least one feature area image corresponding to the original image includes:
inputting the original image into an image classification network to obtain a super-pixel image corresponding to the original image;
performing regional pooling operation on the super-pixel images, and determining a plurality of regional features corresponding to each super-pixel image;
and carrying out feature clustering on the plurality of region features to obtain the corresponding at least one feature region image.
Optionally, in another embodiment of the method according to the present application, the acquiring at least one class activation image corresponding to the original image includes:
and generating the class activation image corresponding to each object to be segmented in the original image by using a class activation algorithm.
Optionally, in another embodiment of the method according to the present application, the determining, based on the class activation image, an image class characterized by each of the feature area images includes:
respectively calculating the degree of overlap between the characteristic region image and all the class activation images, and determining the target class activation image with the largest degree of overlap for the characteristic region image;
and taking the image category of the target class activation image corresponding to the characteristic area image as the image category characterized by the characteristic area image.
Optionally, in another embodiment of the method according to the present application, the calculating the overlapping degree of the feature area image and all class activation images respectively, and determining the target class activation image with the largest overlapping degree corresponding to the feature area image includes:
calculating the intersection-over-union ratio of each characteristic area image with all the class activation images respectively;
and taking the class activation image with the largest of all the intersection-over-union ratios computed for the characteristic area image as the target class activation image corresponding to the characteristic area image.
Optionally, in another embodiment of the method according to the present application, mapping the target feature region prototype, which is characterized by the preset image category, with the original image to obtain a segmented image corresponding to the target feature region image includes:
carrying out average pooling on each target characteristic region image, and calculating to obtain a prototype of each target characteristic region image;
performing a matrix multiplication and summation operation on the original image and the prototypes of the target characteristic region images to obtain a characteristic sum value;
and obtaining a segmented image corresponding to the target characteristic region image based on the characteristic sum value.
Optionally, in another embodiment of the method according to the present application, the obtaining, based on the characteristic sum value, a segmented image corresponding to the target feature area image includes:
removing negative values from the characteristic sum value by using a preset activation function to obtain the pseudo-pixel-level labels of the original image;
and obtaining a segmented image corresponding to the target characteristic region image by using the pseudo pixel level label.
According to still another aspect of the embodiments of the present application, there is provided an image segmentation apparatus, including:
the acquisition module is configured to acquire at least one characteristic region image which corresponds to the original image and is obtained based on the super pixel technology; acquiring at least one class activation image corresponding to the original image, wherein each class activation image is marked with an image class to which the class activation image belongs;
a determining module configured to determine an image category characterized by each of the feature area images based on the class activation images;
the generation module is configured to map a target feature area prototype characterized by a preset image category with the original image to obtain a segmented image corresponding to the target feature area image.
According to still another aspect of the embodiments of the present application, there is provided an electronic device including:
a memory for storing executable instructions; and
a processor for executing the executable instructions with the memory to complete the operations of any one of the image segmentation methods described above.
According to still another aspect of the embodiments of the present application, there is provided a computer-readable storage medium storing computer-readable instructions that, when executed, perform the operations of any one of the above-described image segmentation methods.
In this application, at least one feature region image corresponding to an original image and obtained with the super-pixel technique can be acquired; at least one class activation image corresponding to the original image is acquired, each class activation image being labeled with the image class to which it belongs; the image category characterized by each feature region image is determined based on the class activation images; and the target feature region prototypes characterized by a preset image category are mapped with the original image to obtain the segmented image corresponding to the target feature region image. With this technical scheme, the class activation images of the original image are used to determine the category of each feature region image, and the feature region images of a specific category are mapped back into the original image to obtain a segmented image that preserves the complete boundary of the segmented object. The method uses the super-pixel technique to preserve the boundary of the target region and, at the same time, computes a prototype of the target region so as to obtain pseudo-pixel-level labels that are as complete as possible. This avoids the problem in the related art that the segmented image is inaccurate because the pseudo-pixel-level label target region generated from the class activation map is incomplete or its boundary is indistinct.
The technical scheme of the present application is described in further detail below through the accompanying drawings and examples.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the application and, together with the description, serve to explain the principles of the application.
The present application will be more clearly understood from the following detailed description with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of an image segmentation method according to an embodiment of the present application;
fig. 2 is a schematic system architecture diagram of an image segmentation method according to an embodiment of the present application;
FIG. 3 is a flow chart illustrating a method for segmenting an image according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure;
fig. 6 shows a schematic diagram of a storage medium according to an embodiment of the present application.
Detailed Description
Various exemplary embodiments of the present application will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present application unless it is specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective parts shown in the drawings are not drawn in actual scale for convenience of description.
The following description of at least one exemplary embodiment is merely exemplary in nature and is in no way intended to limit the application, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where appropriate.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
In addition, the technical solutions of the embodiments of the present application may be combined with each other, but it is necessary to be based on the fact that those skilled in the art can implement the technical solutions, and when the technical solutions are contradictory or cannot be implemented, the combination of the technical solutions should be considered to be absent, and is not within the scope of protection claimed in the present application.
It should be noted that all directional indicators (such as up, down, left, right, front, and rear … …) in the embodiments of the present application are merely used to explain the relative positional relationship, movement conditions, and the like between the components in a specific posture (as shown in the drawings), and if the specific posture is changed, the directional indicator is correspondingly changed accordingly.
An image segmentation method according to an exemplary embodiment of the present application is described below with reference to fig. 1 to 3. It should be noted that the following application scenario is shown only for the convenience of understanding the spirit and principles of the present application, and the embodiments of the present application are not limited in this respect. Rather, the embodiments of the present application may be applied to any applicable scenario.
The application also provides an image segmentation method, an image segmentation device, electronic equipment and a medium.
Fig. 1 schematically shows a flow diagram of an image segmentation method according to an embodiment of the present application. As shown in fig. 1, the method includes:
s101, acquiring at least one characteristic region image which corresponds to an original image and is obtained based on a super pixel technology; and obtaining at least one class activation image corresponding to the original image, wherein each class activation image is marked with the belonging image class.
S101, determining the image category characterized by each characteristic area image based on the class activated image.
And S101, mapping the target feature region prototype characterized by the preset image category with the original image to obtain a segmented image corresponding to the target feature region image.
With the development of networks and communications, image segmentation technology is attracting more and more attention in a variety of application scenarios.
In the related art, most weakly supervised semantic segmentation methods that use image-level labels rely on class activation maps (CAMs) to locate segmentation targets, but a CAM can only identify partial high-response discriminative regions in the image and lacks an accurate description of the target boundary. The prior art therefore uses saliency detection, inter-pixel relations, attention mechanisms and the like to mine and expand seed regions on the basis of these accurate discriminative regions, yet it still cannot obtain more complete and accurate pseudo-pixel-level labels for the subsequent segmentation. How to design a method that accurately generates a segmentation target image has thus become a problem to be solved.
In order to solve the above problems, the embodiment of the present application proposes an image segmentation method based on the idea of using the super-pixel technique to preserve the complete boundary of the segmented object while, innovatively, combining a method that uses the center of gravity of the feature region image to obtain pseudo-pixel-level labels of the segmented object that are as complete as possible, resulting in a segmented image that retains the complete boundary of the segmented object.
In one mode, as shown in fig. 2, a system architecture diagram of an image segmentation method according to an embodiment of the present application is shown. With reference to fig. 2, a step of the image segmentation method according to the embodiment of the present application is described:
step 1, obtaining a seed region corresponding to an object to be segmented, wherein the seed region is obtained by three substeps:
and a step a, generating a super-pixel image and carrying out feature clustering, so that the purposes of imaging a feature area and reserving the outline boundary of an object in the image are achieved.
Specifically, the original image may first be input into an image classification network to obtain super-pixel images of at least one region, and a region pooling operation is performed on the super-pixel images of the at least one region to determine the super-pixel feature corresponding to each super-pixel image.
A super-pixel is a set of pixels that share specific features such as color and texture; the super-pixel technique extracts such sets from an image while preserving object contours.
In one manner, embodiments of the present application can employ an image classification network (e.g., a ResNet50) pre-trained on ImageNet to perform feature extraction on the original image. As an example, the fully connected layer may be a classifier with 20 output channels (20 classes, as the PASCAL VOC 2012 dataset is used), and the image is divided into many regions (super-pixels) by the super-pixel technique.
Further, a pooling operation is performed on the super-pixel images of the plurality of regions to obtain the features of those regions, and the plurality of super-pixel features are clustered with the k-means method to obtain the corresponding at least one feature region image. It will be appreciated that each clustered feature region image is also more salient than before.
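As an illustration, the following is a minimal Python sketch of this region pooling and clustering step. It assumes a backbone feature map `feature_map` of shape (C, H, W) and a super-pixel label map `sp_labels` (for example from SLIC); these names and the cluster count are illustrative assumptions rather than part of the original disclosure.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_superpixels(feature_map, sp_labels, n_clusters=8):
    """Average-pool features over each super-pixel, then merge super-pixels
    into feature region images with k-means."""
    C, H, W = feature_map.shape
    flat = feature_map.reshape(C, -1)                  # (C, H*W)
    sp_ids = np.unique(sp_labels)
    # Region pooling: one C-dimensional feature per super-pixel.
    pooled = np.stack([flat[:, sp_labels.ravel() == s].mean(axis=1) for s in sp_ids])
    # Feature clustering: k-means groups super-pixels with similar features.
    cluster_of_sp = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(pooled)
    region_map = np.zeros((H, W), dtype=np.int64)
    for s, c in zip(sp_ids, cluster_of_sp):
        region_map[sp_labels == s] = c                 # feature region id per pixel
    return region_map
```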
Step b: generate at least one class activation image corresponding to the original image, which is used to locate the segmentation object and determine the seed region.
The class activation map shows where the weight, or center of gravity, of the model lies and how it shifts during training, i.e., which features the classification model bases its decision on; in other words, it locates the parts of the original image that are critical to the task at hand.
In one mode, in the embodiment of the present application, a class activation map of each segmentation target in the original image is obtained by a class activation method, where each class activation map has been labeled with its corresponding class.
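As an illustration, the following is a minimal sketch of the classic CAM computation (weighting the last convolutional feature maps by the classifier weights of the class of interest); the names `features` of shape (C, H, W) and `fc_weight` of shape (num_classes, C) are assumptions for illustration, not from the original disclosure.

```python
import numpy as np

def class_activation_map(features, fc_weight, class_k):
    """Classic CAM: weighted sum of the last conv feature maps using the
    fully connected classifier weights of class_k, normalized to [0, 1]."""
    cam = np.tensordot(fc_weight[class_k], features, axes=([0], [0]))  # (H, W)
    cam = np.maximum(cam, 0.0)          # keep positive evidence only
    return cam / max(cam.max(), 1e-8)   # normalize for thresholding/visualization
```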
Step c: compute the IoU between each clustered region feature map and the class activation images, thereby determining the class of each region feature map and obtaining the seed region of the segmentation target.
IoU (Intersection over Union) measures the positional accuracy of a prediction in target segmentation tasks. As an example, in the embodiment of the present application, the IoU between each region feature map and a class activation image is used to measure their degree of overlap.
In one mode, in the embodiment of the present application, the overlapping degree of each feature area image and all class activation images may be calculated, so as to determine the target class activation image corresponding to each feature area image and having the largest overlapping degree.
That is, in the above manner, the target class activation image A with the largest overlap can be obtained for feature region image A, and likewise the target class activation image B for feature region image B.
As an example, the degree of overlap may be computed as the intersection-over-union of the feature region image with each class activation image, and the category to which each feature region belongs is determined accordingly, with the following formula:
$$ I_k = \frac{|C_k \cap R|}{|C_k \cup R|} $$
where k denotes an image category, C_k the class activation map of category k, and R the feature region image. It will be appreciated that embodiments of the present application compute I_k for every image category k, and the category with the largest I_k is the category of the corresponding feature region image.
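A short sketch of this assignment step follows, assuming each class activation map has already been thresholded into a boolean mask; `region_map` is the output of the clustering sketch above, and `cam_masks` maps each class id to its (H, W) mask (both names are illustrative assumptions).

```python
import numpy as np

def assign_region_classes(region_map, cam_masks):
    """For each feature region R, pick the class k that maximizes
    I_k = |C_k ∩ R| / |C_k ∪ R| (intersection over union)."""
    assignment = {}
    for r in np.unique(region_map):
        R = region_map == r
        best_k, best_iou = None, 0.0
        for k, C_k in cam_masks.items():
            union = np.logical_or(R, C_k).sum()
            iou = np.logical_and(R, C_k).sum() / union if union else 0.0
            if iou > best_iou:
                best_k, best_iou = k, iou
        assignment[r] = best_k   # None if the region overlaps no class activation mask
    return assignment
```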
Step 2: compute the prototypes of the target feature region images.
In one mode, the embodiment of the present application obtains the corresponding prototype by computing the center-of-gravity value of the feature region image.
Further, the embodiment of the present application computes the center-of-gravity value of each feature region image R_n from the feature map F of the original image: each feature region image is average-pooled to obtain the prototype (i.e., center-of-gravity value) corresponding to it, with the following formula:
$$ p_n = \frac{\sum_i \mathbb{1}(i \in R_n)\, F_i}{\sum_i \mathbb{1}(i \in R_n)} $$
where 1(·) equals 1 if pixel i of the original image belongs to the n-th feature region image R_n, and 0 otherwise, and F_i denotes the feature vector of the original image at pixel i.
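This masked average pooling can be written in a few lines; the sketch below assumes `feature_map` is the (C, H, W) feature map F and `region_mask` the boolean indicator of one region R_n (illustrative names).

```python
import numpy as np

def region_prototype(feature_map, region_mask):
    """Prototype (center-of-gravity value) of one feature region:
    the mean feature vector over the pixels belonging to the region."""
    mask = region_mask.astype(feature_map.dtype)      # indicator 1(.) as 0/1 weights
    total = (feature_map * mask).sum(axis=(1, 2))     # sum of F_i inside R_n
    return total / max(mask.sum(), 1.0)               # divide by the region's pixel count
```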
Step 3: compute the pseudo-pixel-level labels of the original image.
Further, in the embodiment of the present application, all the feature region images R_n belonging to the k-th class are each matrix-multiplied with the initially obtained feature map F of the original image and the results are summed (i.e., the matrix multiplication and summation), and negative values are removed with a preset activation function (for example, the ReLU activation function) to obtain the final pseudo-pixel-level labels.
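The following sketch combines the prototypes with the feature map F as described above, assuming positive integer class ids (with 0 reserved for background) and a dict `prototypes_by_class` from class id to a list of (C,) prototypes; all names are illustrative assumptions.

```python
import numpy as np

def pseudo_label_map(feature_map, prototypes_by_class):
    """Score every pixel against each class's prototypes, remove negative
    responses with ReLU, and label each pixel with the best class (0 if none)."""
    C, H, W = feature_map.shape
    flat = feature_map.reshape(C, -1)                  # (C, H*W)
    classes = sorted(prototypes_by_class)
    scores = np.zeros((len(classes), H * W))
    for i, k in enumerate(classes):
        for p in prototypes_by_class[k]:
            scores[i] += p @ flat                      # prototype-feature products, summed
    scores = np.maximum(scores, 0.0)                   # ReLU removes negative values
    labels = np.zeros(H * W, dtype=np.int64)
    fg = scores.max(axis=0) > 0
    labels[fg] = np.asarray(classes)[scores.argmax(axis=0)][fg]
    return labels.reshape(H, W)                        # pseudo-pixel-level label map
```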
Step 4: segment the original image using the pseudo-pixel-level labels to obtain the segmented image corresponding to the target feature region image.
In this application, at least one feature region image corresponding to an original image and obtained with the super-pixel technique can be acquired; at least one class activation image corresponding to the original image is acquired, each class activation image being labeled with the image class to which it belongs; the image category characterized by each feature region image is determined based on the class activation images; and the target feature region prototypes characterized by a preset image category are mapped with the original image to obtain the segmented image corresponding to the target feature region image. With this technical scheme, the class activation images of the original image are used to determine the category of each feature region image, and the feature region images of a specific category are mapped back into the original image to obtain a segmented image that preserves the complete boundary of the segmented object. This furthermore avoids the problem in the related art that the segmented image is inaccurate because the pseudo-pixel-level label target region generated from the class activation map is incomplete or its boundary is indistinct.
Optionally, in another embodiment of the method according to the present application, the acquiring at least one feature area image corresponding to the original image includes:
inputting the original image into an image classification network to obtain a super-pixel image corresponding to the original image;
performing regional pooling operation on the super-pixel images, and determining a plurality of regional features corresponding to each super-pixel image;
and carrying out feature clustering on the plurality of region features to obtain the corresponding at least one feature region image.
Optionally, in another embodiment of the method according to the present application, the acquiring at least one class activation image corresponding to the original image includes:
and generating the class activation image corresponding to each object to be segmented in the original image by using a class activation algorithm.
Optionally, in another embodiment of the method according to the present application, the determining, based on the class activation image, an image class characterized by each of the feature area images includes:
respectively calculating the degree of overlap between the characteristic region image and all the class activation images, and determining the target class activation image with the largest degree of overlap for the characteristic region image;
and taking the image category of the target class activation image corresponding to the characteristic area image as the image category characterized by the characteristic area image.
Optionally, in another embodiment of the method according to the present application, the calculating the overlapping degree of the feature area image and all class activation images respectively, and determining the target class activation image with the largest overlapping degree corresponding to the feature area image includes:
calculating the intersection-over-union ratio of each characteristic area image with all the class activation images respectively;
and taking the class activation image with the largest of all the intersection-over-union ratios computed for the characteristic area image as the target class activation image corresponding to the characteristic area image.
Optionally, in another embodiment of the method according to the present application, mapping the target feature region prototype, which is characterized by the preset image category, with the original image to obtain a segmented image corresponding to the target feature region image includes:
carrying out average pooling on each target characteristic region image, and calculating to obtain a prototype of each target characteristic region image;
performing a matrix multiplication and summation operation on the original image and the prototypes of the target characteristic region images to obtain a characteristic sum value;
and obtaining a segmented image corresponding to the target characteristic region image based on the characteristic sum value.
Optionally, in another embodiment of the method according to the present application, the obtaining, based on the characteristic sum value, a segmented image corresponding to the target feature area image includes:
removing negative values from the characteristic sum value by using a preset activation function to obtain the pseudo-pixel-level labels of the original image;
and obtaining a segmented image corresponding to the target characteristic region image by using the pseudo pixel level label.
Fig. 3 is a schematic flow chart of an image segmentation method according to the present application, where the method includes:
In this application, at least one feature region image corresponding to an original image and obtained with the super-pixel technique can be acquired; at least one class activation image corresponding to the original image is acquired, each class activation image being labeled with the image class to which it belongs; the image category characterized by each feature region image is determined based on the class activation images; and the target feature region prototypes characterized by a preset image category are mapped with the original image to obtain the segmented image corresponding to the target feature region image.
With this technical scheme, the class activation images of the original image are used to determine the category of each feature region image, and the feature region images of a specific category are mapped back into the original image to obtain a segmented image that preserves the complete boundary of the segmented object. The method uses the super-pixel technique to preserve the boundary of the target region and, at the same time, computes a prototype of the target region so as to obtain pseudo-pixel-level labels that are as complete as possible. This avoids the problem in the related art that the segmented image is inaccurate because the pseudo-pixel-level label target region generated from the class activation map is incomplete or its boundary is indistinct.
Optionally, in another embodiment of the present application, as shown in fig. 4, the present application further provides an image segmentation apparatus, which comprises:
an acquisition module 201 configured to acquire at least one feature area image corresponding to an original image; acquiring at least one class activation image corresponding to the original image, wherein each class activation image is marked with an image class to which the class activation image belongs;
a determining module 202 configured to determine an image category characterized by each of the feature area images based on the class activation images;
the generating module 203 is configured to map the target feature area prototype featuring in the preset image category with the original image, so as to obtain a segmented image corresponding to the target feature area image.
In this application, at least one feature region image corresponding to an original image and obtained with the super-pixel technique can be acquired; at least one class activation image corresponding to the original image is acquired, each class activation image being labeled with the image class to which it belongs; the image category characterized by each feature region image is determined based on the class activation images; and the target feature region prototypes characterized by a preset image category are mapped with the original image to obtain the segmented image corresponding to the target feature region image. With this technical scheme, the class activation images of the original image are used to determine the category of each feature region image, and the feature region images of a specific category are mapped back into the original image to obtain a segmented image that preserves the complete boundary of the segmented object. The method uses the super-pixel technique to preserve the boundary of the target region and, at the same time, computes a prototype of the target region so as to obtain pseudo-pixel-level labels that are as complete as possible. This avoids the problem in the related art that the segmented image is inaccurate because the pseudo-pixel-level label target region generated from the class activation map is incomplete or its boundary is indistinct.
In another embodiment of the present application, the obtaining module 201 is configured to perform the steps comprising:
inputting the original image into an image classification network to obtain a super-pixel image corresponding to the original image;
performing regional pooling operation on the super-pixel images, and determining a plurality of regional features corresponding to each super-pixel image;
and carrying out feature clustering on the plurality of region features to obtain the corresponding at least one feature region image.
In another embodiment of the present application, the obtaining module 201 is configured to perform the steps comprising:
and generating the class activation image corresponding to each object to be segmented in the original image by using a class activation algorithm.
In another embodiment of the present application, the obtaining module 201 is configured to perform the steps comprising:
respectively calculating the degree of overlap between the characteristic region image and all the class activation images, and determining the target class activation image with the largest degree of overlap for the characteristic region image;
and taking the image category of the target class activation image corresponding to the characteristic area image as the image category characterized by the characteristic area image.
In another embodiment of the present application, the obtaining module 201 is configured to perform the steps comprising:
calculating the intersection-over-union ratio of each characteristic area image with all the class activation images respectively;
and taking the class activation image with the largest of all the intersection-over-union ratios computed for the characteristic area image as the target class activation image corresponding to the characteristic area image.
In another embodiment of the present application, the obtaining module 201 is configured to perform the steps comprising:
carrying out average pooling on each target characteristic region image, and calculating to obtain a prototype of each target characteristic region image;
performing a matrix multiplication and summation operation on the original image and the prototypes of the target characteristic region images to obtain a characteristic sum value;
and obtaining a segmented image corresponding to the target characteristic region image based on the characteristic sum value.
In another embodiment of the present application, the obtaining module 201 is configured to perform the steps comprising:
removing negative values from the characteristic sum value by using a preset activation function to obtain the pseudo-pixel-level labels of the original image;
and segmenting the original image by using the pseudo-pixel-level labels to obtain a segmented image corresponding to the target characteristic region image.
The embodiment of the application also provides an electronic device for executing the image segmentation method. Referring to fig. 5, a schematic diagram of an electronic device according to some embodiments of the present application is shown. As shown in fig. 5, the electronic apparatus 3 includes: a processor 300, a memory 301, a bus 302 and a communication interface 303, the processor 300, the communication interface 303 and the memory 301 being connected by the bus 302; the memory 301 stores a computer program executable on the processor 300, and the processor 300 executes the image segmentation method provided in any of the foregoing embodiments of the present application when the computer program is executed.
The memory 301 may include a high-speed random access memory (RAM: Random Access Memory), and may further include a non-volatile memory, such as at least one magnetic disk memory. The communication connection between this device's network element and at least one other network element is achieved through at least one communication interface 303 (wired or wireless); the Internet, a wide area network, a local area network, a metropolitan area network, etc. may be used.
Bus 302 may be an ISA bus, a PCI bus, an EISA bus, or the like. Buses may be classified as address buses, data buses, control buses, etc. The memory 301 is configured to store a program, and the processor 300 executes the program after receiving an execution instruction; the image segmentation method disclosed in any of the foregoing embodiments of the present application may be applied to the processor 300 or implemented by the processor 300.
The processor 300 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 300 or by instructions in the form of software. The processor 300 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The methods, steps, and logic blocks disclosed in the embodiments of the present application may thereby be implemented or performed. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present application may be embodied directly in hardware, in a decoding processor, or in a combination of hardware and software modules in a decoding processor. The software modules may be located in a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other storage media well known in the art. The storage medium is located in the memory 301; the processor 300 reads the information in the memory 301 and, in combination with its hardware, performs the steps of the above method.
The electronic device provided by the embodiment of the present application and the image segmentation method provided by the embodiment of the present application share the same inventive concept, and have the same beneficial effects as the method it adopts, runs or implements.
The present embodiment also provides a computer-readable storage medium corresponding to the image segmentation method provided in the foregoing embodiments. Referring to fig. 6, the computer-readable storage medium is shown as an optical disc 40, on which a computer program (i.e., a program product) is stored; when executed by a processor, the computer program performs the image segmentation method provided in any of the foregoing embodiments.
It should be noted that examples of the computer readable storage medium may also include, but are not limited to, a phase change memory (PRAM), a Static Random Access Memory (SRAM), a Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a flash memory, or other optical or magnetic storage medium, which will not be described in detail herein.
The computer-readable storage medium provided by the above embodiments of the present application and the image segmentation method provided by the embodiments of the present application share the same inventive concept, and have the same beneficial effects as the method adopted, run or implemented by the application program stored therein.
It should be noted that:
in the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the present application may be practiced without these specific details. In some instances, well-known structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the application, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the application and aiding the understanding of one or more of the various inventive aspects. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the present application and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The foregoing is merely a preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the technical scope of the present application should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (8)

1. A method of segmenting an image, comprising:
acquiring at least one characteristic region image which corresponds to an original image and is obtained based on a super pixel technology; acquiring at least one class activation image corresponding to the original image, wherein each class activation image is marked with an image class to which the class activation image belongs;
determining an image category characterized by each characteristic area image based on the class activation image;
mapping a target characteristic region prototype characterized by a preset image category with the original image to obtain a segmented image corresponding to the target characteristic region image, wherein the target characteristic region prototype is obtained by calculating the gravity center value of the characteristic region image;
the mapping the target feature region prototype characterized in the preset image category with the original image to obtain a segmented image corresponding to the target feature region image includes:
carrying out average pooling on each target characteristic region image, and calculating to obtain a prototype of each target characteristic region image;
performing a matrix multiplication and summation operation on the original image and the prototypes of the target characteristic region images to obtain a characteristic sum value;
based on the characteristic sum value, obtaining a segmented image corresponding to the target feature area image;
the obtaining the segmented image corresponding to the target feature area image based on the characteristic sum value includes:
removing negative values from the characteristic sum value by using a preset activation function to obtain the pseudo-pixel-level labels of the original image;
and segmenting the original image by using the pseudo-pixel-level labels to obtain a segmented image corresponding to the target characteristic region image.
2. The method according to claim 1, wherein the acquiring at least one feature area image obtained based on the super pixel technique corresponding to the original image includes:
inputting the original image into an image classification network to obtain a super-pixel image corresponding to the original image;
performing regional pooling operation on the super-pixel images, and determining a plurality of regional features corresponding to each super-pixel image;
and carrying out feature clustering on the plurality of region features to obtain the corresponding at least one feature region image.
3. The method according to claim 1 or 2, wherein said obtaining at least one class activation image corresponding to said original image comprises:
and generating the class activation image corresponding to each object to be segmented in the original image by using a class activation algorithm.
4. The method of claim 1, wherein the determining the image class characterized by each of the feature area images based on the class activation images comprises:
respectively calculating the degree of overlap between the characteristic region image and all the class activation images, and determining the target class activation image with the largest degree of overlap for the characteristic region image;
and taking the image category of the target class activation image corresponding to the characteristic area image as the image category characterized by the characteristic area image.
5. The method of claim 4, wherein the calculating the overlapping degree of the feature area image and all class activation images respectively, and determining the target class activation image with the largest overlapping degree corresponding to the feature area image, comprises:
calculating the intersection-over-union ratio of each characteristic area image with all the class activation images respectively;
and taking the class activation image with the largest of all the intersection-over-union ratios computed for the characteristic area image as the target class activation image corresponding to the characteristic area image.
6. An image segmentation apparatus, comprising:
the acquisition module is configured to acquire at least one characteristic region image which corresponds to the original image and is obtained based on the super pixel technology; acquiring at least one class activation image corresponding to the original image, wherein each class activation image is marked with an image class to which the class activation image belongs;
a determining module configured to determine an image category characterized by each of the feature area images based on the class activation images;
the generation module is configured to map a target feature area prototype characterized by a preset image category with the original image to obtain a segmented image corresponding to the target feature area image, wherein the target feature area prototype is obtained by calculating the gravity center value of the feature area image;
the mapping the target feature region prototype characterized in the preset image category with the original image to obtain a segmented image corresponding to the target feature region image includes:
carrying out average pooling on each target characteristic region image, and calculating to obtain a prototype of each target characteristic region image;
performing a matrix multiplication and summation operation on the original image and the prototypes of the target characteristic region images to obtain a characteristic sum value;
based on the characteristic sum value, obtaining a segmented image corresponding to the target feature area image;
the obtaining the segmented image corresponding to the target feature area image based on the characteristic sum value includes:
removing negative values from the characteristic sum value by using a preset activation function to obtain the pseudo-pixel-level labels of the original image;
and segmenting the original image by using the pseudo-pixel-level labels to obtain a segmented image corresponding to the target characteristic region image.
7. An electronic device, comprising:
a memory for storing executable instructions; the method comprises the steps of,
a processor for executing the executable instructions with the memory to perform the operations of the method of segmenting an image as claimed in any one of claims 1 to 5.
8. A computer readable storage medium storing computer readable instructions, wherein the instructions when executed perform the operations of the method of segmenting an image according to any one of claims 1-5.
CN202210952173.8A 2022-08-09 2022-08-09 Image segmentation method and device, electronic equipment and medium Active CN115439846B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210952173.8A CN115439846B (en) 2022-08-09 2022-08-09 Image segmentation method and device, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210952173.8A CN115439846B (en) 2022-08-09 2022-08-09 Image segmentation method and device, electronic equipment and medium

Publications (2)

Publication Number Publication Date
CN115439846A CN115439846A (en) 2022-12-06
CN115439846B (en) 2023-04-25

Family

ID=84242779

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210952173.8A Active CN115439846B (en) 2022-08-09 2022-08-09 Image segmentation method and device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN115439846B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111598900A (en) * 2020-05-18 2020-08-28 腾讯科技(深圳)有限公司 Image region segmentation model training method, segmentation method and device
WO2021017372A1 (en) * 2019-08-01 2021-02-04 中国科学院深圳先进技术研究院 Medical image segmentation method and system based on generative adversarial network, and electronic equipment
CN112991238A (en) * 2021-02-22 2021-06-18 上海市第四人民医院 Texture and color mixing type food image segmentation method, system, medium and terminal
CN114596435A (en) * 2022-01-06 2022-06-07 腾讯科技(深圳)有限公司 Semantic segmentation label generation method, device, equipment and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106340023B (en) * 2016-08-22 2019-03-05 腾讯科技(深圳)有限公司 The method and apparatus of image segmentation
CN113505670B (en) * 2021-06-29 2023-06-23 西南交通大学 Remote sensing image weak supervision building extraction method based on multi-scale CAM and super-pixels
CN113658132B (en) * 2021-08-16 2022-08-19 沭阳九鼎钢铁有限公司 Computer vision-based structural part weld joint detection method
CN114494693A (en) * 2021-12-31 2022-05-13 清华大学 Method and device for performing semantic segmentation on image
CN114549961B (en) * 2022-03-02 2023-04-07 北京百度网讯科技有限公司 Target object detection method, device, equipment and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021017372A1 (en) * 2019-08-01 2021-02-04 中国科学院深圳先进技术研究院 Medical image segmentation method and system based on generative adversarial network, and electronic equipment
CN111598900A (en) * 2020-05-18 2020-08-28 腾讯科技(深圳)有限公司 Image region segmentation model training method, segmentation method and device
CN112991238A (en) * 2021-02-22 2021-06-18 上海市第四人民医院 Texture and color mixing type food image segmentation method, system, medium and terminal
CN114596435A (en) * 2022-01-06 2022-06-07 腾讯科技(深圳)有限公司 Semantic segmentation label generation method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN115439846A (en) 2022-12-06

Similar Documents

Publication Publication Date Title
US11954152B2 (en) Video matching methods and apparatuses, and blockchain-based infringement evidence storage methods and apparatuses
CN111091123A (en) Text region detection method and equipment
CN111353580B (en) Training method of target detection network, electronic equipment and storage medium
CN112149677A (en) Point cloud semantic segmentation method, device and equipment
CN112465909A (en) Class activation mapping target positioning method and system based on convolutional neural network
CN113095418A (en) Target detection method and system
CN110879972A (en) Face detection method and device
CN112541394A (en) Black eye and rhinitis identification method, system and computer medium
CN111488930A (en) Training method of classification network, target detection method and device and electronic equipment
CN112052907A (en) Target detection method and device based on image edge information and storage medium
CN109190662A (en) A kind of three-dimensional vehicle detection method, system, terminal and storage medium returned based on key point
CN111401421A (en) Image category determination method based on deep learning, electronic device, and medium
CN115439846B (en) Image segmentation method and device, electronic equipment and medium
CN116843983A (en) Pavement disease recognition method, model training method, electronic equipment and medium
CN114897987B (en) Method, device, equipment and medium for determining vehicle ground projection
CN113033578B (en) Image calibration method, system, terminal and medium based on multi-scale feature matching
CN110663046A (en) Hardware accelerator for histogram of oriented gradients calculation
CN114997264A (en) Training data generation method, model training method, model detection method, device and electronic equipment
CN114140765A (en) Obstacle sensing method and device and storage medium
CN114118127A (en) Visual scene mark detection and identification method and device
CN113344042A (en) Road condition image model training method and system based on driving assistance and intelligent terminal
CN114550207B (en) Method and device for detecting key points of neck and method and device for training detection model
CN113435339B (en) Vehicle attribute detection method, device and storage medium
CN116977791A (en) Multi-task model training method, task prediction method, device, computer equipment and medium
CN113111709B (en) Vehicle matching model generation method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant