CN110889816A - Image segmentation method and device

Image segmentation method and device

Info

Publication number
CN110889816A
CN110889816A (application CN201911083589.5A)
Authority
CN
China
Prior art keywords
image
labels
label
fusing
image segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911083589.5A
Other languages
Chinese (zh)
Other versions
CN110889816B (en)
Inventor
王立新
余威
张晓璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bayer AG
Original Assignee
Beijing Liangjian Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Liangjian Intelligent Technology Co Ltd
Priority to CN201911083589.5A
Publication of CN110889816A
Application granted
Publication of CN110889816B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning

Abstract

The invention provides an image segmentation method and device. For a target object, a plurality of corresponding label fusion strategies are determined from the labels in a sample image, where the number m of label classes is greater than or equal to 3 (the background is not counted as a label) and the number n of label fusion strategies is less than or equal to m; an image segmentation model is trained for each label fusion strategy to obtain the segmentation result output by each image segmentation model; and the segmentation results output by the image segmentation models are fused to obtain the target object. The method takes into account the structural correlations among different labels and identifies the target object through label-fusion-based image segmentation. The segmentation of the target object is significantly better than having each model identify one of the labels separately, and also significantly better than having a single model identify all of the labels.

Description

Image segmentation method and device
Technical Field
The invention relates to the technical field of image processing, in particular to a multi-class image segmentation technology.
Background
When segmenting an image that contains multi-class labels (the background is not counted as a label, and the number m of label classes is greater than or equal to 3), the labels are generally handled in one of the following two modes:
Mode 1: train one image segmentation model whose output covers all m target objects.
Mode 2: train m image segmentation models, one per target object, and fuse the results of the m models to obtain the m target objects.
In Mode 1, all classes are handled by a single model, so parameter tuning is difficult and the segmentation quality cannot be balanced across the classes. Mode 2 clearly multiplies the training workload, is inefficient, and is hard to manage across many models. Moreover, neither mode takes into account the structural correlations that exist among the objects to be segmented.
Disclosure of Invention
The invention aims to provide a multi-class image segmentation method, a multi-class image segmentation device, a computing device, a computer readable storage medium and a computer program product.
According to an aspect of the present invention, there is provided an image segmentation method, wherein the method comprises:
aiming at a target object, determining a plurality of corresponding label fusion strategies according to labels in a sample image, wherein the number m of label classes is greater than or equal to 3, the labels do not include the background, and the number n of label fusion strategies is less than or equal to m,
wherein labels of different classes are fused based on at least one of the following rules:
-fusing adjacent labels;
-fusing labels whose corresponding regions have a surrounding relationship;
-fusing labels whose boundaries cross each other;
-fusing at least two of the labels based on image feature similarity;
-fusing at least one of the labels with the background;
training an image segmentation model for each label fusion strategy to obtain segmentation results output by each image segmentation model;
and fusing the segmentation results output by the image segmentation models to obtain the target object.
According to an aspect of the present invention, there is also provided an image segmentation method, wherein the method includes:
acquiring an image to be detected;
calling each trained image segmentation model to segment the image to be detected so as to respectively obtain corresponding segmentation results;
fusing the obtained segmentation results to obtain a target object in the image to be detected;
wherein each of the image segmentation models is trained in correspondence with one of a plurality of label fusion strategies,
the label fusion strategies being determined during training, for a target object, according to the labels in a sample image, wherein the number m of label classes is greater than or equal to 3, the labels do not include the background, the number n of label fusion strategies is less than or equal to m, and labels of different classes are fused based on at least one of the following rules:
-fusing adjacent labels;
-fusing labels whose corresponding regions have a surrounding relationship;
-fusing labels whose boundaries cross each other;
-fusing at least two of the labels based on image feature similarity;
-fusing at least one of the labels with the background.
According to an aspect of the present invention, there is also provided an image segmentation apparatus, wherein the apparatus comprises:
the strategy determining device is used for determining, for a target object, a plurality of corresponding label fusion strategies according to the labels in a sample image, wherein the number m of label classes is greater than or equal to 3, the labels do not include the background, and the number n of label fusion strategies is less than or equal to m,
wherein labels of different classes are fused based on at least one of the following rules:
-fusing adjacent labels;
-fusing labels whose corresponding regions have a surrounding relationship;
-fusing labels whose boundaries cross each other;
-fusing at least two of the labels based on image feature similarity;
-fusing at least one of the labels with the background;
the model training device is used for training an image segmentation model for each label fusion strategy so as to obtain segmentation results output by each image segmentation model;
and the result fusion device is used for fusing the segmentation results output by the image segmentation models to obtain the target object.
According to an aspect of the present invention, there is also provided an image segmentation apparatus, wherein the apparatus comprises:
the image acquisition device is used for acquiring an image to be detected;
the model calling device is used for calling each trained image segmentation model to segment the image to be detected so as to respectively obtain corresponding segmentation results;
the target recognition device is used for fusing the obtained segmentation results to obtain a target object in the image to be detected;
wherein each of the image segmentation models is trained in correspondence with one of a plurality of label fusion strategies,
the label fusion strategies being determined during training, for a target object, according to the labels in a sample image, wherein the number m of label classes is greater than or equal to 3, the labels do not include the background, the number n of label fusion strategies is less than or equal to m, and labels of different classes are fused based on at least one of the following rules:
-fusing adjacent labels;
-fusing labels whose corresponding regions have a surrounding relationship;
-fusing labels whose boundaries cross each other;
-fusing at least two of the labels based on image feature similarity;
-fusing at least one of the labels with the background.
According to an aspect of the present invention, there is also provided a computing device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements an image segmentation method according to an aspect of the present invention when executing the computer program.
According to an aspect of the present invention, there is also provided a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements an image segmentation method according to an aspect of the present invention.
According to an aspect of the present invention, there is also provided a computer program product which, when executed by a computing device, implements an image segmentation method according to an aspect of the present invention.
Compared with the prior art, the method takes into account the structural correlations among different labels and identifies the target object through label-fusion-based image segmentation. One image segmentation model is configured for each label fusion strategy, so the number of image segmentation models is at most the number of labels, and the resulting segmentation results are fused to obtain the final target object. The segmentation of the target object is significantly better than having each model identify one of the labels separately, and also significantly better than having a single model identify all of the labels.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments made with reference to the following drawings:
FIG. 1 shows a flow diagram of a method of image segmentation according to an embodiment of the invention;
fig. 2 shows a schematic apparatus diagram of an image segmentation device according to another embodiment of the present invention.
The same or similar reference numbers in the drawings identify the same or similar elements.
Detailed Description
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments of the present invention are described as an apparatus represented by a block diagram and a process or method represented by a flow diagram. Although a flowchart depicts a sequence of process steps in the present invention, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations may be re-arranged. The process of the present invention may be terminated when its operations are performed, but may include additional steps not shown in the flowchart. The processes of the present invention may correspond to methods, functions, procedures, subroutines, and the like.
The methods illustrated by the flow diagrams and apparatus illustrated by the block diagrams discussed below may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine or computer readable medium such as storage medium. The processor(s) may perform the necessary tasks.
Similarly, it will be further appreciated that any flow charts, flow diagrams, state transition diagrams, and the like represent various processes which may be substantially described as program code stored in computer readable media and so executed by a computing device or processor, whether or not such computing device or processor is explicitly shown.
As used herein, the term "storage medium" may refer to one or more devices for storing data, including Read Only Memory (ROM), Random Access Memory (RAM), magnetic RAM, kernel memory, magnetic disk storage media, optical storage media, flash memory devices, and/or other machine-readable media for storing information. The term "computer-readable medium" can include, but is not limited to portable or fixed storage devices, optical storage devices, and various other mediums capable of storing and/or containing instructions and/or data.
A code segment may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program descriptions. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable means including memory sharing, information passing, token passing, network transmission, etc.
The term "computing device" in this context refers to an electronic device that can perform predetermined processes such as numerical calculations and/or logical calculations by executing predetermined programs or instructions, and may include at least a processor and a memory, wherein the predetermined processes are performed by the processor executing program instructions prestored in the memory, or by hardware such as ASIC, FPGA, DSP, or by a combination of the above two.
The "computing device" described above is typically embodied in the form of a general purpose computing device, whose components may include, but are not limited to: one or more processors or processing units, system memory. The system memory may include computer readable media in the form of volatile memory, such as Random Access Memory (RAM) and/or cache memory. "computing device" may further include other removable/non-removable, volatile/nonvolatile computer-readable storage media. The memory may include at least one computer program product having a set (e.g., at least one) of program modules that are configured to perform the functions and/or methods of embodiments of the present invention. The processor executes various functional applications and data processing by executing programs stored in the memory.
For example, a computer program for executing the functions and processes of the present invention is stored in the memory, and when the processor executes the corresponding computer program, a multi-class image segmentation scheme in the present invention is implemented.
Typically, the computing devices include, for example, user devices and network devices. Wherein the user equipment includes but is not limited to a Personal Computer (PC), a notebook computer, a mobile terminal, etc., and the mobile terminal includes but is not limited to a smart phone, a tablet computer, etc.; the network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a Cloud Computing (Cloud Computing) based Cloud consisting of a large number of computers or network servers, wherein Cloud Computing is one of distributed Computing, a super virtual computer consisting of a collection of loosely coupled computers. Wherein the computing device is capable of operating alone to implement the invention, or of accessing a network and performing the invention by interoperating with other computing devices in the network. The network in which the computing device is located includes, but is not limited to, the internet, a wide area network, a metropolitan area network, a local area network, a VPN network, and the like.
It should be noted that the user devices, network devices, networks, etc. are merely examples, and other existing or future computing devices or networks may be suitable for the present invention, and are included in the scope of the present invention and are incorporated by reference herein.
Specific structural and functional details disclosed herein are merely representative and are provided for purposes of describing example embodiments of the present invention. The present invention may, however, be embodied in many alternate forms and should not be construed as limited to only the embodiments set forth herein.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element may be termed a second element, and, similarly, a second element may be termed a first element, without departing from the scope of example embodiments. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be noted that, in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may, in fact, be executed substantially concurrently, or the figures may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
The structural correlations existing between the segmentation targets in an image refer to the following aspects:
1. The labeling of each object is complex: the boundary of a label is difficult to judge, and the label is connected with other labels.
2. There is a surrounding relationship between labels, i.e., one label together with another label can form a closed region.
3. Some labels have high image-feature similarity; for example, their color histogram distributions are close to each other.
Examples of structural correlations existing between objects:
1. Segmenting streets and cars. A car typically occludes part of the street, so the street label tends to surround the car label; this matches correlation 2.
2. Segmenting the necrotic and enhancing regions of a brain glioma. The boundaries of the enhancing region and the necrotic region are very complicated, but they are connected with each other; this matches correlation 1.
Where such structural correlations exist between objects (with the number of classes m ≥ 3), neither of the two modes mentioned above is sufficient. Mode 1 cannot balance multiple labels, handles target imbalance poorly, and yields low precision at segmentation boundaries. In Mode 2, because the targets are correlated, the prior information provided by the other targets is lost, the segmentation precision at boundaries is poor, and overly small targets are easily affected by other background features.
Accordingly, the invention provides a multi-class image segmentation scheme. For a target object to be segmented, the method determines a plurality of corresponding label fusion strategies according to the multi-class labels in the sample image and trains one image segmentation model per label fusion strategy, thereby obtaining the segmentation results output by the image segmentation models, which are then fused to obtain the target object in the image.
The present invention is described in further detail below with reference to the attached drawing figures.
FIG. 1 illustrates a method flow diagram, which particularly shows a process for multi-class image segmentation, according to one embodiment of the present invention.
Typically, the invention is implemented by a computing device. When a general-purpose computing device is configured with program modules embodying the invention, it becomes a specialized multi-class image segmentation computing device rather than a generic computer or processor. Those skilled in the art will appreciate, however, that the invention may be applied to any general-purpose computing device, which, once so configured, becomes the specific multi-class image segmentation device for practicing the invention, hereinafter referred to as the "image segmentation apparatus".
As shown in fig. 1, in step S101, for a target object, the image segmentation apparatus determines a plurality of corresponding label fusion strategies according to the labels in a sample image; in step S102, the image segmentation apparatus trains an image segmentation model for each label fusion strategy to obtain the segmentation result output by each image segmentation model; in step S103, the image segmentation apparatus fuses the segmentation results output by the image segmentation models to obtain the target object.
Specifically, in step S101, for the target object, the image segmentation apparatus determines the plurality of corresponding label fusion strategies according to the labels in the sample image.
The number m of label classes in the sample image is greater than or equal to 3, the labels do not include the background, and the number n of label fusion strategies is less than or equal to m.
According to an example of the invention, the number of label classes is 3 (not counting the background); after the structural correlations of the label classes are analyzed, suitable labels are selected for fusion, so that the number of determined label fusion strategies is 2.
Label fusion merges labels of different classes that have structural correlations, so that the fused labels are easier to identify; once the segmentation results produced by the image segmentation model of each label fusion strategy are fused again, the target object in the image is obtained.
Therefore, to ensure that the target object in the image can ultimately be identified, a plurality of label fusion strategies are designed to fuse labels of different classes, such that fusing the segmentation results obtained from the fused labels again yields the target object.
For a multi-label sample image, the structural correlations among the labels are analyzed and suitable labels are selected for label fusion: for example, two or more labels are merged into one label, or one or more labels are re-marked as background (i.e., fused with the background label). One label fusion strategy may include multiple label fusions, but the same label cannot be fused repeatedly; a sketch of this constraint follows. For example, if label A and label B are fused into label A', and label B and label C are fused into label B', then labels A' and B' cannot be fused with each other.
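By way of illustration only, this constraint can be captured by representing each label fusion strategy as a remapping table, as in the following Python sketch; the integer label encoding (0 as background) and all names are assumptions of the sketch, not part of the invention:

    # Illustrative sketch: a fusion strategy as an {original id: fused id} map.
    # Assumes integer label ids, with 0 reserved for the background.
    def make_fusion_strategy(groups=(), to_background=()):
        """groups: tuples of label ids to merge into one label;
        to_background: label ids to re-mark as background.
        Raises ValueError if any label would be fused more than once."""
        mapping, seen = {}, set()
        for group in groups:
            if seen & set(group):
                raise ValueError("the same label cannot be fused repeatedly")
            seen |= set(group)
            fused_id = min(group)            # fused label keeps the smallest id
            for label in group:
                mapping[label] = fused_id
        for label in to_background:
            if label in seen:
                raise ValueError("the same label cannot be fused repeatedly")
            seen.add(label)
            mapping[label] = 0               # re-marked as background
        return mapping

    # With labels A=1, B=2, C=3:
    strategy_1 = make_fusion_strategy(groups=[(1, 2)])     # {1: 1, 2: 1}: A and B -> A'
    strategy_2 = make_fusion_strategy(to_background=(3,))  # {3: 0}: C -> background
    # make_fusion_strategy(groups=[(1, 2), (2, 3)])        # would raise: B fused twice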
Labels of different classes are fused based on at least one of the following rules:
1) Fusing adjacent labels.
For example, label A and label B mark adjacent regions in the image; being adjacent labels, they may be fused into one label.
2) Fusing labels whose corresponding regions have a surrounding relationship.
For example, label A marks a street area and label B marks a vehicle on that street; the regions of label B and label A have a surrounding relationship.
3) Fusing labels whose boundaries cross each other.
For example, if the region of label A consists of an area plus some discrete points outside it, and the region of label B covers part of the space between that area and those discrete points, then the boundaries of label A and label B cross each other.
4) Fusing at least two labels based on image feature similarity.
For example, a threshold is set, and when the image-feature similarity of label A and label B is high, i.e., exceeds the threshold, labels A and B may be fused. Any image feature may be used for the similarity comparison of the two label regions; the invention is not limited in this respect. According to an example of the invention, the similarity of image features between two labels may be judged with color histograms: for a given pixel value, if the pixel counts within the two labels' respective regions are close, e.g., their difference is below a corresponding threshold, the two labels are considered close at that pixel value. For instance, counting pixels with value 30, the region of label A contains 800 such pixels and the region of label B contains 810, so the counts for pixel value 30 in labels A and B are close. When the counts are close for all pixel values, the color histogram curves of the two labels coincide. When the color histogram distributions of the two labels' regions are largely consistent, their image features can be considered similar, and the two labels may be fused; a sketch follows.
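As an illustration of rule 4, the following sketch compares the color histograms of two label regions with NumPy; the 8-bit grayscale range, the histogram-intersection measure, and the 0.9 threshold are assumptions of the sketch rather than requirements of the invention:

    # Illustrative sketch of rule 4: fuse two labels when the color histograms
    # of their regions are sufficiently similar. Assumes an 8-bit grayscale
    # image and boolean masks marking each label's region.
    import numpy as np

    def histogram_similarity(image, mask_a, mask_b, bins=256):
        # Per-region pixel-value histograms, normalized so region size drops out.
        hist_a, _ = np.histogram(image[mask_a], bins=bins, range=(0, 256))
        hist_b, _ = np.histogram(image[mask_b], bins=bins, range=(0, 256))
        hist_a = hist_a / max(hist_a.sum(), 1)
        hist_b = hist_b / max(hist_b.sum(), 1)
        # Histogram intersection: 1.0 means identical distributions.
        return float(np.minimum(hist_a, hist_b).sum())

    def may_fuse(image, mask_a, mask_b, threshold=0.9):
        return histogram_similarity(image, mask_a, mask_b) >= threshold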
5) Fusing at least one of the labels with the background.
For example, the sample image contains 3 labels; after label C is fused with the background, labels A and B can subsequently be segmented directly from the sample image by an image segmentation model.
In step S102, the image segmentation apparatus trains an image segmentation model for each label fusion strategy to obtain the segmentation result output by each image segmentation model.
Each label fusion strategy corresponds to one image segmentation model. According to an example of the invention, for labels A, B, and C, labels A and B are fused in label fusion strategy 1, and label C is fused with the background in label fusion strategy 2. Thus, in label fusion strategy 1, the label obtained by fusing labels A and B is denoted A', and the corresponding image segmentation model 1 is trained to recognize labels A' and C. The image segmentation model 2 corresponding to label fusion strategy 2 is trained to recognize labels A and B. After image segmentation models 1 and 2 are trained, the segmentation results output by image segmentation model 1 are labels A' and C, and those output by image segmentation model 2 are labels A and B; a sketch of preparing these per-strategy training targets follows.
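A minimal sketch of this step, under the same illustrative encoding (A=1, B=2, C=3): the ground-truth label map is remapped once per strategy, and each model is trained against its own remapped targets.

    # Illustrative sketch: remap a ground-truth label map for one fusion strategy
    # before training that strategy's segmentation model.
    import numpy as np

    def apply_strategy(label_map, strategy):
        remapped = label_map.copy()
        for old_id, new_id in strategy.items():
            remapped[label_map == old_id] = new_id
        return remapped

    gt = np.array([[0, 1, 1],
                   [2, 2, 3],
                   [0, 3, 3]])               # toy annotation with A=1, B=2, C=3
    targets_1 = apply_strategy(gt, {2: 1})   # strategy 1: B merged into A'(=1)
    targets_2 = apply_strategy(gt, {3: 0})   # strategy 2: C re-marked as background
    # Model 1 is trained on targets_1 (learns A' and C);
    # model 2 is trained on targets_2 (learns A and B).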
Here, the image segmentation model adopts a classification neural network, for example a convolutional neural network such as a ResNet (Residual Neural Network) variant or a VGG (Visual Geometry Group) variant. It is trained, for example, by sample training: images annotated with the target labels to be learned are input to the classification neural network, which, by learning the classification, is trained into an image segmentation model that recognizes those target labels. Those skilled in the art will understand that this image segmentation model and its training method are merely examples for illustrating the invention; any existing or future image segmentation model or training method, if applicable, is included within the scope of the invention by reference.
According to an example of the invention, the image segmentation models corresponding to the label fusion strategies use the same classification neural network. Further, the number of input labels of each image segmentation model may be the same, for example, 2 labels.
During training, the multiple image segmentation models all serve to identify a common target object, so after one image segmentation model has been trained, its model parameters can be used to initialize another; this speeds up the convergence of the next image segmentation model and lets training finish sooner, as sketched below.
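This warm start might look as follows (PyTorch, with a tiny placeholder network standing in for the shared architecture; all names are illustrative):

    # Illustrative sketch: reuse the trained parameters of one strategy's model
    # to initialize the next, speeding up its convergence.
    import torch.nn as nn

    def build_model(num_classes):
        # Placeholder architecture; the models in the text would be ResNet/VGG-style
        # segmentation networks sharing one architecture across strategies.
        return nn.Sequential(nn.Conv2d(1, 16, 3, padding=1),
                             nn.ReLU(),
                             nn.Conv2d(16, num_classes, 1))

    model_1 = build_model(num_classes=3)   # background, A', C
    model_2 = build_model(num_classes=3)   # background, A, B
    # ... train model_1 on the targets of fusion strategy 1 ...
    model_2.load_state_dict(model_1.state_dict())  # warm start for strategy 2
    # ... then fine-tune model_2 on the targets of fusion strategy 2 ...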
In step S103, the image segmentation apparatus fuses the segmentation results output by the respective image segmentation models to obtain the target object.
According to an example of the invention, if in step S102 image segmentation model 1 outputs segmentation results labeled A' and C, and image segmentation model 2 outputs segmentation results labeled A and B, then in step S103, after fusing these labels, the image segmentation apparatus can obtain labels A, B, and C. According to another example of the invention, the target object includes label A', and the image segmentation apparatus may likewise obtain the target object from the segmentation results of image segmentation models 1 and 2.
According to another example of the invention, if in step S102 the segmentation results output by several image segmentation models each include a label D, the region corresponding to label D is a repeatedly optimized region, and in step S103 the image segmentation apparatus may select the region of any one label D as the final target object. Alternatively, for the portions where the output label-D regions differ, the image segmentation apparatus may decide pixel by pixel whether to mark the corresponding pixel as label D, for example through probability fusion or a voting operation. Specifically, if a pixel is marked as label D in the majority of the output results (the voting operation), the pixel is finally marked as label D. Alternatively, the probabilities that the pixel belongs to label D in the corresponding image segmentation models are averaged, and when the average exceeds the classification decision threshold for label D, the pixel is finally marked as label D. Both operations are sketched below.
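Both decision rules can be written compactly; the following NumPy sketch assumes each model supplies either a hard label-D mask or a per-pixel probability map for label D, and the 0.5 threshold is illustrative:

    # Illustrative sketch of fusing overlapping label-D outputs pixel by pixel.
    import numpy as np

    def fuse_by_voting(masks):
        # masks: list of boolean arrays, True where a model marked the pixel as D.
        votes = np.sum(masks, axis=0)
        return votes > len(masks) / 2          # majority of models agree on D

    def fuse_by_probability(prob_maps, threshold=0.5):
        # prob_maps: list of per-pixel probabilities of label D, one per model.
        return np.mean(prob_maps, axis=0) >= threshold   # average exceeds threshold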
Steps S101-S103 above describe how the image segmentation apparatus is trained, from input sample images, to identify the target object.
The process by which the trained image segmentation apparatus identifies the target object in an input image to be detected is further described below through steps S111-S113 (not shown in fig. 1).
Specifically, in step S111, the image segmentation apparatus obtains an unmarked image to be detected; in step S112, the image segmentation apparatus calls each trained image segmentation model to obtain corresponding output results respectively; in step S113, the image segmentation apparatus fuses these output results to obtain a target object in the image to be detected.
According to an example of the invention, the target object comprises labels A, B, and C; trained image segmentation model 1 outputs labels A' (fusing labels A and B) and C, and trained image segmentation model 2 outputs labels A and B. Thus, in step S111, an image to be detected containing the target object is input to the image segmentation apparatus. In step S112, the image segmentation apparatus calls image segmentation model 1 to identify labels A' and C in the image to be detected, and calls image segmentation model 2 to identify labels A and B. In step S113, the image segmentation apparatus fuses the outputs of image segmentation models 1 and 2 to obtain the target objects in the image to be detected, i.e., labels A, B, and C; the flow is sketched below.
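Under the same illustrative encoding, the inference flow of steps S111-S113 might be sketched as follows; the merge logic (C from model 1, A and B resolved inside the A' region by model 2) is an assumption consistent with the example above:

    # Illustrative sketch of steps S111-S113 for the running example:
    # model 1 outputs labels A'(1) and C(3); model 2 outputs labels A(1) and B(2).
    import numpy as np

    def segment_image(image, model_1, model_2):
        out_1 = model_1(image)             # label map over {0, 1 (A'), 3 (C)}
        out_2 = model_2(image)             # label map over {0, 1 (A), 2 (B)}
        fused = np.zeros_like(out_1)
        fused[out_1 == 3] = 3              # C is provided by model 1 alone
        a_prime = out_1 == 1               # region model 1 marked as A'
        fused[a_prime & (out_2 == 1)] = 1  # A: model 2 refines A' into A...
        fused[a_prime & (out_2 == 2)] = 2  # ...and B
        return fused                       # target object: labels A, B and C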
Fig. 2 shows a schematic diagram of an apparatus according to an embodiment of the invention, which particularly shows a multi-label based image segmentation apparatus.
Typically, the image segmentation apparatus of the invention may be implemented on any general-purpose computing device. Those skilled in the art will appreciate that a general-purpose computing device so configured becomes the specific image segmentation apparatus for practicing the invention, and that it may be realized as a computer program, as hardware, or as a combination of the two.
As shown in fig. 2, the image segmentation apparatus 20 comprises a strategy determination device 21, a model training device 22 and a result fusion device 23.
For the target object, the strategy determination device 21 determines a plurality of corresponding label fusion strategies according to the labels in the sample image; the model training device 22 trains an image segmentation model for each label fusion strategy to obtain the segmentation results output by each image segmentation model; and the result fusion device 23 fuses the segmentation results output by the image segmentation models to obtain the target object.
Specifically, for the target object, the strategy determination device 21 determines the plurality of corresponding label fusion strategies according to the labels in the sample image.
The number m of label classes in the sample image is greater than or equal to 3, the labels do not include the background, and the number n of label fusion strategies is less than or equal to m.
According to an example of the invention, the number of label classes is 3 (not counting the background); after analyzing the structural correlations of the label classes, the strategy determination device 21 selects suitable labels for fusion, so that the number of determined label fusion strategies is 2.
Label fusion merges labels of different classes that have structural correlations, so that the fused labels are easier to identify; once the segmentation results produced by the image segmentation model of each label fusion strategy are fused again, the target object in the image is obtained.
Therefore, to ensure that the target object in the image can ultimately be identified, a plurality of label fusion strategies are designed to fuse labels of different classes, such that fusing the segmentation results obtained from the fused labels again yields the target object.
For a multi-label sample image, the strategy determination device 21 analyzes the structural correlations among the labels and selects suitable labels for label fusion: for example, two or more labels are merged into one label, or one or more labels are re-marked as background (i.e., fused with the background label). One label fusion strategy may include multiple label fusions, but the same label cannot be fused repeatedly. For example, if label A and label B are fused into label A', and label B and label C are fused into label B', then labels A' and B' cannot be fused with each other.
Labels of different classes are fused based on at least one of the following rules:
1) Fusing adjacent labels.
For example, label A and label B mark adjacent regions in the image; being adjacent labels, they may be fused into one label.
2) Fusing labels whose corresponding regions have a surrounding relationship.
For example, label A marks a street area and label B marks a vehicle on that street; the regions of label B and label A have a surrounding relationship.
3) Fusing labels whose boundaries cross each other.
For example, if the region of label A consists of an area plus some discrete points outside it, and the region of label B covers part of the space between that area and those discrete points, then the boundaries of label A and label B cross each other.
4) Fusing at least two labels based on image feature similarity.
For example, a threshold is set, and when the image-feature similarity of label A and label B is high, i.e., exceeds the threshold, labels A and B may be fused. Any image feature may be used for the similarity comparison of the two label regions; the invention is not limited in this respect. According to an example of the invention, the similarity of image features between two labels may be judged with color histograms: for a given pixel value, if the pixel counts within the two labels' respective regions are close, e.g., their difference is below a corresponding threshold, the two labels are considered close at that pixel value. For instance, counting pixels with value 30, the region of label A contains 800 such pixels and the region of label B contains 810, so the counts for pixel value 30 in labels A and B are close. When the counts are close for all pixel values, the color histogram curves of the two labels coincide. When the color histogram distributions of the two labels' regions are largely consistent, their image features can be considered similar, and the two labels may be fused.
5) Fusing at least one of the labels with the background.
For example, the sample image contains 3 labels; after label C is fused with the background, labels A and B can subsequently be segmented directly from the sample image by an image segmentation model.
The model training device 22 trains an image segmentation model for each label fusion strategy to obtain the segmentation result output by each image segmentation model.
Each label fusion strategy corresponds to one image segmentation model. According to an example of the invention, for labels A, B, and C, labels A and B are fused in label fusion strategy 1, and label C is fused with the background in label fusion strategy 2. Thus, in label fusion strategy 1, the label obtained by fusing labels A and B is denoted A', and the corresponding image segmentation model 1 is trained to recognize labels A' and C. The image segmentation model 2 corresponding to label fusion strategy 2 is trained to recognize labels A and B. After image segmentation models 1 and 2 are trained, the segmentation results output by image segmentation model 1 are labels A' and C, and those output by image segmentation model 2 are labels A and B.
Here, the image segmentation model adopts a classification neural network, for example a convolutional neural network such as a ResNet (Residual Neural Network) variant or a VGG (Visual Geometry Group) variant. It is trained, for example, by sample training: images annotated with the target labels to be learned are input to the classification neural network, which, by learning the classification, is trained into an image segmentation model that recognizes those target labels. Those skilled in the art will understand that this image segmentation model and its training method are merely examples for illustrating the invention; any existing or future image segmentation model or training method, if applicable, is included within the scope of the invention by reference.
According to an example of the invention, the image segmentation models corresponding to the label fusion strategies use the same classification neural network. Further, the number of input labels of each image segmentation model may be the same, for example, 2 labels.
During training, the multiple image segmentation models all serve to identify a common target object, so after one image segmentation model has been trained, its model parameters can be used to initialize another; this speeds up the convergence of the next image segmentation model and lets training finish sooner.
Next, the result fusion device 23 fuses the segmentation results output by the image segmentation models to obtain the target object.
According to an example of the invention, when image segmentation model 1 outputs segmentation results labeled A' and C, and image segmentation model 2 outputs segmentation results labeled A and B, the result fusion device 23 can obtain labels A, B, and C after fusing these labels. According to another example of the invention, the target object includes label A', and the result fusion device 23 may likewise obtain the target object from the segmentation results of image segmentation models 1 and 2.
According to another example of the invention, if the segmentation results output by several image segmentation models each include a label D, i.e., the region corresponding to label D is a repeatedly optimized region, the result fusion device 23 may select the region of any one label D as the final target object. Alternatively, for the portions where the output label-D regions differ, the result fusion device 23 may decide pixel by pixel whether to mark the corresponding pixel as label D, for example through probability fusion or a voting operation. Specifically, if a pixel is marked as label D in the majority of the output results (the voting operation), the pixel is finally marked as label D. Alternatively, the probabilities that the pixel belongs to label D in the corresponding image segmentation models are averaged, and when the average exceeds the classification decision threshold for label D, the pixel is finally marked as label D.
The above describes how, through the operations performed by the strategy determination device 21, the model training device 22, and the result fusion device 23, the image segmentation apparatus is trained on input sample images to identify the target object.
The process by which the trained image segmentation apparatus identifies the target object in an input image to be detected is further described below. According to an example of the invention, the image segmentation apparatus 20 further comprises an image acquisition device, a model calling device, and a target recognition device (not shown in fig. 2).
Specifically, the image acquisition device acquires an unmarked image to be detected; the model calling device calls each trained image segmentation model to respectively obtain corresponding output results; and the target recognition device fuses the output results to obtain the target object in the image to be detected.
According to an example of the invention, the target object comprises labels A, B, and C; trained image segmentation model 1 outputs labels A' (fusing labels A and B) and C, and trained image segmentation model 2 outputs labels A and B. An image to be detected containing the target object is thus input to the image acquisition device. The model calling device calls image segmentation model 1 to identify labels A' and C in the image to be detected, and calls image segmentation model 2 to identify labels A and B. The target recognition device fuses the outputs of image segmentation models 1 and 2 to obtain the target objects in the image to be detected, i.e., labels A, B, and C.
It should be noted that the present invention may be implemented in software and/or in a combination of software and hardware, for example, as an Application Specific Integrated Circuit (ASIC), a general purpose computer or any other similar hardware device. In one embodiment, the software program of the present invention may be executed by a processor to implement the steps or functions described above. Also, the software programs (including associated data structures) of the present invention can be stored in a computer readable recording medium, such as RAM memory, magnetic or optical drive or diskette and the like. Further, some of the steps or functions of the present invention may be implemented in hardware, for example, as circuitry that cooperates with the processor to perform various steps or functions.
In addition, at least a portion of the present invention may be implemented as a computer program product, such as computer program instructions, which, when executed by a computing device, may invoke or provide methods and/or aspects in accordance with the present invention through operation of the computing device. Program instructions which invoke/provide the methods of the present invention may be stored on fixed or removable recording media and/or transmitted via a data stream over a broadcast or other signal-bearing medium, and/or stored in a working memory of a computing device operating in accordance with the program instructions.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.

Claims (17)

1. An image segmentation method, wherein the method comprises:
aiming at a target object, determining a plurality of corresponding label fusion strategies according to labels in a sample image, wherein the number m of label classes is greater than or equal to 3, the labels do not include the background, and the number n of label fusion strategies is less than or equal to m,
wherein labels of different classes are fused based on at least one of the following rules:
-fusing adjacent labels;
-fusing labels whose corresponding regions have a surrounding relationship;
-fusing labels whose boundaries cross each other;
-fusing at least two of the labels based on image feature similarity;
-fusing at least one of the labels with the background;
training an image segmentation model for each label fusion strategy to obtain segmentation results output by each image segmentation model;
and fusing the segmentation results output by the image segmentation models to obtain the target object.
2. The method of claim 1, wherein the image segmentation models employ the same classification neural network.
3. The method of claim 1 or 2, wherein one label fusion strategy comprises one or more label fusions.
4. The method according to any one of claims 1 to 3, wherein, among the image segmentation models, the segmentation results output by a plurality of the image segmentation models comprise the same label, and any one of the same labels is selected as the target object.
5. The method of any one of claims 1 to 3, wherein the segmentation results output by a plurality of the image segmentation models comprise the same label, and probability fusion or voting is performed on the same label to determine the target object.
6. The method of any of claims 1 to 5, wherein the method further comprises:
acquiring an image to be detected;
calling each image segmentation model to segment the image to be detected so as to respectively obtain corresponding segmentation results;
fusing the obtained segmentation results to obtain the target object in the image to be detected.
7. An image segmentation method, wherein the method comprises:
acquiring an image to be detected;
calling each trained image segmentation model to segment the image to be detected so as to respectively obtain corresponding segmentation results;
fusing the obtained segmentation results to obtain a target object in the image to be detected;
wherein each of the image segmentation models is trained in correspondence with one of a plurality of label fusion strategies,
the label fusion strategies being determined during training, for a target object, according to the labels in a sample image, wherein the number m of label classes is greater than or equal to 3, the labels do not include the background, the number n of label fusion strategies is less than or equal to m, and labels of different classes are fused based on at least one of the following rules:
-fusing adjacent labels;
-fusing labels whose corresponding regions have a surrounding relationship;
-fusing labels whose boundaries cross each other;
-fusing at least two of the labels based on image feature similarity;
-fusing at least one of the labels with the background.
8. An image segmentation apparatus, wherein the apparatus comprises:
the strategy determining device is used for determining, for a target object, a plurality of corresponding label fusion strategies according to the labels in a sample image, wherein the number m of label classes is greater than or equal to 3, the labels do not include the background, and the number n of label fusion strategies is less than or equal to m,
wherein labels of different classes are fused based on at least one of the following rules:
-fusing adjacent labels;
-fusing labels whose corresponding regions have a surrounding relationship;
-fusing labels whose boundaries cross each other;
-fusing at least two of the labels based on image feature similarity;
-fusing at least one of the labels with the background;
the model training device is used for training an image segmentation model for each label fusion strategy so as to obtain segmentation results output by each image segmentation model;
and the result fusion device is used for fusing the segmentation results output by the image segmentation models to obtain the target object.
9. The apparatus of claim 8, wherein the image segmentation models employ the same classification neural network.
10. The apparatus of claim 8 or 9, wherein one label fusion strategy comprises one or more label fusions.
11. The apparatus according to any one of claims 8 to 10, wherein, among the image segmentation models, the segmentation results output by a plurality of the image segmentation models comprise the same label, and any one of the same labels is selected as the target object.
12. The apparatus according to any one of claims 8 to 10, wherein the segmentation results output by a plurality of the image segmentation models comprise the same label, and probability fusion or voting is performed on the same label to determine the target object.
13. The apparatus of any of claims 8 to 12, wherein the apparatus further comprises:
the image acquisition device is used for acquiring an image to be detected;
the model calling device is used for calling each image segmentation model to segment the image to be detected so as to respectively obtain corresponding segmentation results;
and the target recognition device is used for fusing the obtained segmentation results to obtain the target object in the image to be detected.
14. An image segmentation apparatus, wherein the apparatus comprises:
the image acquisition device is used for acquiring an image to be detected;
the model calling device is used for calling each trained image segmentation model to segment the image to be detected so as to respectively obtain corresponding segmentation results;
the target recognition device is used for fusing the obtained segmentation results to obtain a target object in the image to be detected;
wherein each of the image segmentation models respectively corresponds to one of a plurality of label fusion strategies to obtain training,
the label fusion strategies are determined according to labels in a sample image for a target object in a training process, wherein the number m of the types of the labels is larger than or equal to 3, the labels do not include a background, the number n of the label fusion strategies is smaller than or equal to m, and the labels of different types are fused based on at least one of the following rules:
-fusing adjacent tags;
-fusing tags whose corresponding ranges have a surrounding relationship;
-fusing labels with boundaries that cross each other;
-fusing at least two of the labels based on image feature similarity;
-fusing at least one of the tags with the background.
15. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 7 when executing the computer program.
16. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method of any of claims 1 to 7.
17. A computer program product implementing the method of any one of claims 1 to 7 when executed by a computer device.
CN201911083589.5A 2019-11-07 2019-11-07 Image segmentation method and device Active CN110889816B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911083589.5A CN110889816B (en) 2019-11-07 2019-11-07 Image segmentation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911083589.5A CN110889816B (en) 2019-11-07 2019-11-07 Image segmentation method and device

Publications (2)

Publication Number Publication Date
CN110889816A (en) 2020-03-17
CN110889816B (en) 2022-12-16

Family

ID=69747059

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911083589.5A Active CN110889816B (en) 2019-11-07 2019-11-07 Image segmentation method and device

Country Status (1)

Country Link
CN (1) CN110889816B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140226889A1 (en) * 2013-02-11 2014-08-14 General Electric Company Systems and methods for image segmentation using target image intensity
CN105913431A (en) * 2016-04-12 2016-08-31 Shaoxing University Multi-atlas segmentation method for low-resolution medical images
CN106504245A (en) * 2016-10-28 2017-03-15 Northeastern University Image segmentation method for damaged pathological tissue in multi-modal brain images
CN108711161A (en) * 2018-06-08 2018-10-26 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Image segmentation method, image segmentation device and electronic equipment
CN109344833A (en) * 2018-09-04 2019-02-15 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences Medical image segmentation method, segmentation system and computer-readable storage medium
CN110148140A (en) * 2019-05-17 2019-08-20 Northeastern University Brain MR image segmentation method based on linearized kernel label fusion
CN110322446A (en) * 2019-07-01 2019-10-11 Huazhong University of Science and Technology Domain-adaptive semantic segmentation method based on similarity space alignment
CN110378438A (en) * 2019-08-07 2019-10-25 Tsinghua University Training method and device for an image segmentation model under label fault tolerance, and related equipment

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
HONGZHI WANG et al.: "Multi-Atlas Segmentation with Joint Label Fusion", IEEE Transactions on Pattern Analysis and Machine Intelligence *
XU YUNLONG et al.: "Aorta CT image segmentation combining multi-atlas and joint label fusion strategies", Journal of Data Acquisition and Processing *
LIANG YAN: "Brain MRI image segmentation based on multi-atlas label fusion", Journal of Zhejiang University of Science and Technology *
GUO QIMIAO: "Research and application of multi-atlas medical image segmentation methods", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111859001A (en) * 2020-07-06 2020-10-30 OPPO (Chongqing) Intelligent Technology Co., Ltd. Image similarity detection method and device, storage medium and electronic equipment
CN111859001B (en) * 2020-07-06 2022-05-31 OPPO (Chongqing) Intelligent Technology Co., Ltd. Image similarity detection method and device, storage medium and electronic equipment
CN113344097A (en) * 2021-06-21 2021-09-03 Tezign (Shanghai) Information Technology Co., Ltd. Image processing method and device based on multiple models
CN113344097B (en) * 2021-06-21 2024-03-19 Tezign (Shanghai) Information Technology Co., Ltd. Image processing method and device based on multiple models

Also Published As

Publication number Publication date
CN110889816B (en) 2022-12-16

Similar Documents

Publication Publication Date Title
CN112560999B (en) Target detection model training method and device, electronic equipment and storage medium
Nandi et al. Traffic sign detection based on color segmentation of obscure image candidates: a comprehensive study
CN108960412B (en) Image recognition method, device and computer readable storage medium
US20210319340A1 (en) Machine learning model confidence score validation
CN112580734B (en) Target detection model training method, system, terminal equipment and storage medium
CN110889816B (en) Image segmentation method and device
WO2021201774A1 (en) Method and system for determining a trajectory of a target object
CN113841161A (en) Extensible architecture for automatically generating content distribution images
Sharma et al. Vehicle identification using modified region based convolution network for intelligent transportation system
CN112613387A (en) Traffic sign detection method based on YOLOv3
CN114638969A (en) Vehicle body multi-attribute detection method, electronic equipment and storage medium
CN112200193B (en) Distributed license plate recognition method, system and device based on multi-attribute fusion
Zhang et al. Interactive spatio-temporal feature learning network for video foreground detection
Chowdhury et al. A new augmentation-based method for text detection in night and day license plate images
Choodowicz et al. Hybrid algorithm for the detection and recognition of railway signs
Saravanarajan et al. Improving semantic segmentation under hazy weather for autonomous vehicles using explainable artificial intelligence and adaptive dehazing approach
CN116189130A (en) Lane line segmentation method and device based on image annotation model
Eldesokey et al. Ellipse detection for visual cyclists analysis “In the wild”
CN114445807A (en) Text region detection method and device
CN111914850B (en) Picture feature extraction method, device, server and medium
Ghimire et al. Online sequential extreme learning machine-based co-training for dynamic moving cast shadow detection
CN112926585A (en) Cross-domain semantic segmentation method based on regenerative kernel Hilbert space
CN112614134A (en) Image segmentation method and device, electronic equipment and storage medium
Satti et al. Recognizing the Indian Cautionary Traffic Signs using GAN, Improved Mask R-CNN, and Grab Cut
Kusetogullari Unsupervised text binarization in handwritten historical documents using k-means clustering

Legal Events

Date Code Title Description
PB01 Publication
TA01 Transfer of patent application right

Effective date of registration: 20200312

Address after: 100080 room 1001-003, building 1, No. 3 Haidian Street, Haidian District, Beijing

Applicant after: SINOVATION VENTURES (BEIJING) ENTERPRISE MANAGEMENT CO.,LTD.

Address before: Room 1001-086, building 1, No. 3, Haidian Street, Haidian District, Beijing 100080

Applicant before: Beijing LiangJian Intelligent Technology Co.,Ltd.

SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220809

Address after: Leverkusen, Germany

Applicant after: BAYER AG

Address before: 100080 room 1001-003, building 1, No.3 Haidian Street, Haidian District, Beijing

Applicant before: SINOVATION VENTURES (BEIJING) ENTERPRISE MANAGEMENT CO.,LTD.

GR01 Patent grant