CN115205301A - Image segmentation method and device based on characteristic space multi-view analysis - Google Patents

Image segmentation method and device based on characteristic space multi-view analysis

Info

Publication number
CN115205301A
Authority
CN
China
Prior art keywords
target
image
segmentation
neural network
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210634465.7A
Other languages
Chinese (zh)
Inventor
刘江
李衡
区明阳
刘浩锋
胡衍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest University of Science and Technology
Original Assignee
Southwest University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest University of Science and Technology filed Critical Southwest University of Science and Technology
Priority to CN202210634465.7A
Publication of CN115205301A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the disclosure provides an image segmentation method and device based on characteristic space multi-view analysis, relating to the technical field of artificial intelligence. The image segmentation method based on characteristic space multi-view analysis comprises the following steps: acquiring an image target data set; acquiring an image training data set; acquiring an initial neural network model; performing model training processing on the initial neural network model according to the image training data set to obtain a target neural network model; inputting the image target data set into the trained target neural network model, the target neural network model being used for carrying out image segmentation with multi-view analysis in a feature space; and performing image segmentation processing on the image target data set through the target neural network model to obtain a target segmentation result. The technical scheme provided by the embodiment of the disclosure can improve the accuracy of image segmentation.

Description

Image segmentation method and device based on characteristic space multi-view analysis
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an image segmentation method and device based on characteristic space multi-view analysis.
Background
Image segmentation algorithms based on deep neural networks can be used to segment medical images. However, current segmentation tasks are highly complex and easily affected by factors such as the partial volume effect, gray-level non-uniformity, artifacts, and the closeness of gray levels between different soft tissues, so the accuracy of image segmentation is low.
Disclosure of Invention
The present disclosure provides an image segmentation method and an image segmentation device based on feature space multi-view analysis, which can improve the accuracy of image segmentation.
In order to achieve the above object, a first aspect of the embodiments of the present disclosure provides an image segmentation method based on feature space multi-view analysis, including:
acquiring an image target data set;
acquiring an image training data set; the image training data set comprises image data and segmentation labels corresponding to the image data;
acquiring an initial neural network model; the initial neural network model at least comprises a shared encoder and a target decoder; the target decoder comprises at least a first decoder and a second decoder;
performing model training processing on the initial neural network model according to the image training data set to obtain a target neural network model;
inputting the image target data set into the trained target neural network model; the target neural network model is used for carrying out image segmentation of multi-view analysis in a characteristic space;
and carrying out image segmentation processing on the image target data set through the target neural network model to obtain a target segmentation result.
In some embodiments, the performing model training processing on the initial neural network model according to the image training data set to obtain a target neural network model includes:
performing feature extraction processing on the image data through the shared encoder to obtain initial features of the image; the image initial characteristics comprise instrument position characteristics, contour characteristics and image saturation characteristics;
performing feature space multi-view analysis processing on the initial features of the image through the target decoder to obtain a target image prediction result;
performing loss calculation according to the target decoder, the target image prediction result and the segmentation label to obtain a target loss function;
and performing the model training processing on the initial neural network model according to the target loss function to obtain the target neural network model.
In some embodiments, the performing a loss calculation according to the target decoder, the target image prediction result, and the segmentation label to obtain a target loss function includes:
obtaining a difference loss function according to the target decoder;
obtaining a segmentation loss function according to the target image prediction result and the segmentation label;
and obtaining the target loss function according to the difference loss function and the segmentation loss function.
In some embodiments, said deriving a difference loss function from said target decoder comprises:
acquiring a first parameter set of the first decoder, and obtaining a first parameter vector according to the first parameter set;
acquiring a second parameter set of the second decoder, and obtaining a second parameter vector according to the second parameter set;
and performing cosine similarity quantization processing on the first parameter vector and the second parameter vector to obtain the difference loss function.
In some embodiments, the target image prediction result comprises a first image prediction result output by the first decoder and a second image prediction result output by the second decoder, and the segmentation loss function comprises first segmentation loss information and second segmentation loss information; the obtaining of the segmentation loss function according to the target image prediction result and the segmentation label includes:
comparing the first image prediction result with the segmentation label to obtain the first segmentation loss information;
and comparing the second image prediction result with the segmentation label to obtain the second segmentation loss information.
In some embodiments, said deriving said target loss function from said difference loss function and said segmentation loss function comprises:
acquiring a target weight parameter; the target weight parameters comprise difference weight parameters and segmentation weight parameters;
and weighting the difference loss function and the segmentation loss function according to the target weight parameters to obtain the target loss function.
In some embodiments, the image segmentation processing on the image target data set by the target neural network model to obtain a target segmentation result includes:
segmenting the image target data set through the first decoder to obtain a first sub-segmentation result;
segmenting the image target data set through the second decoder to obtain a second sub-segmentation result;
performing probability calculation on the first sub-segmentation result and the second sub-segmentation result to obtain a class probability map;
and performing logistic regression calculation according to the class probability graph to obtain the target segmentation result.
In order to achieve the above object, a second aspect of the present disclosure provides an image segmentation apparatus based on feature space multi-view analysis, including:
the target data set acquisition module is used for acquiring an image target data set;
the training data set acquisition module is used for acquiring an image training data set; the image training data set comprises image data and segmentation labels corresponding to the image data;
the initial neural network model acquisition module is used for acquiring an initial neural network model; the initial neural network model at least comprises a shared encoder and a target decoder; the target decoder at least comprises a first decoder and a second decoder;
the model training module is used for carrying out model training processing on the initial neural network model according to the image training data set to obtain a target neural network model;
the data input module is used for inputting the image target data set into the trained target neural network model; the target neural network model is used for carrying out image segmentation of multi-view analysis in a characteristic space;
and the image segmentation module is used for carrying out image segmentation processing on the image target data set through the target neural network model to obtain a target segmentation result.
To achieve the above object, a third aspect of the present disclosure provides an electronic device, including:
at least one memory;
at least one processor;
at least one program;
the program is stored in a memory and a processor executes the at least one program to implement the method of the present disclosure as described in the above first aspect.
To achieve the above object, a fourth aspect of the present disclosure proposes a storage medium that is a computer-readable storage medium storing computer-executable instructions for causing a computer to perform:
a method as described in the first aspect above.
The image segmentation method and device based on characteristic space multi-view analysis provided by the embodiment of the disclosure firstly acquire an image target data set, an image training data set and an initial neural network model respectively, then perform model training processing on the initial neural network model according to the image training data set to obtain a target neural network model, further input the image target data set into the trained target neural network model, and finally perform image segmentation of multi-view analysis on the image target data set in the characteristic space through the target neural network model to obtain a target segmentation result.
Drawings
Fig. 1 is a flowchart of an image segmentation method based on feature space multi-view analysis according to an embodiment of the present disclosure.
Fig. 2 is a flowchart of step S150 in fig. 1.
Fig. 3 is a flowchart of step S230 in fig. 2.
Fig. 4 is a flowchart of step S310 in fig. 3.
Fig. 5 is a flowchart of step S320 in fig. 3.
Fig. 6 is a flowchart of step S160 in fig. 1.
Fig. 7 is a schematic diagram illustrating an image segmentation method based on feature space multi-view analysis according to an embodiment of the present disclosure.
Fig. 8 is a block diagram of an image segmentation apparatus based on feature space multi-view analysis according to an embodiment of the present disclosure.
Fig. 9 is a schematic hardware structure diagram of an electronic device according to an embodiment of the present disclosure.
Reference numerals: a target data set acquisition module 810, a training data set acquisition module 820, an initial neural network model acquisition module 830, a model training module 840, a data input module 850, an image segmentation module 860, a processor 901, a memory 902, an input/output interface 903, a communication interface 904, and a bus 905.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It should be noted that although functional blocks are partitioned in a schematic diagram of an apparatus and a logical order is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the partitioning of blocks in the apparatus or the order in the flowchart. The terms first, second and the like in the description and in the claims, and the drawings described above, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing embodiments of the invention only and is not intended to be limiting of the invention.
At present, in the field of image segmentation, there are many algorithms capable of having good segmentation effects on natural images, city streetscapes and medical images, and the segmentation algorithm based on the deep neural network is widely used due to its excellent performance.
However, the complexity of the segmentation task limits current image segmentation methods: the segmentation algorithm is not only affected by factors such as the partial volume effect, gray-level non-uniformity, artifacts, and the proximity of gray levels between different soft tissues, but is also interfered with by noise, resulting in low segmentation accuracy and a poor segmentation effect.
Based on this, the embodiments of the present disclosure provide an image segmentation method and apparatus based on feature space multi-view analysis, where an image target data set, an image training data set, and an initial neural network model are obtained first, then a model training process is performed on the initial neural network model according to the image training data set to obtain a target neural network model, and then the image target data set is input to the trained target neural network model, and finally an image segmentation of the image target data set in a feature space is performed through the target neural network model to obtain a target segmentation result.
The image segmentation method based on characteristic space multi-view analysis is proposed on the principle that observing an object from multiple angles yields more accurate conclusions. A neural network is applied to extract features from images, and two classifiers (i.e., decoders) with different parameters learn the features separately, so that a segmentation network capable of multi-view analysis in the feature space is trained, which improves the accuracy of image segmentation and the segmentation effect.
Images from the same data source can often be described by a set of attributes, and a neural network gains the ability to segment images by screening and learning these attributes; by selecting different attribute sets to learn, a comprehensive consideration of the input image is formed and the performance of the neural network is improved. Specifically, features are extracted by an encoder, then two decoders with different parameters learn different features, so that the image is analyzed from the viewpoints of different attribute sets, and finally the abilities of the two are integrated to segment the image.
It should be noted that, in the image segmentation method based on feature space multi-view analysis provided in the embodiment of the present application, "feature space multi-view" is not a multi-view of a general physical space, but is a multi-view in a feature space, and therefore, the image segmentation method based on feature space multi-view analysis of the present application is not limited by the dimension of image data, and can be applied to analysis of data such as two-dimensional data, three-dimensional data, and time series data.
The embodiments of the present disclosure provide an image segmentation method and apparatus based on feature space multi-view analysis, and are specifically described with reference to the following embodiments, which first describe an image segmentation method based on feature space multi-view analysis in the embodiments of the present disclosure.
The embodiment of the application can acquire and process related data based on artificial intelligence technology. Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, sense the environment, acquire knowledge, and use that knowledge to obtain the best results.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
The embodiment of the disclosure provides an image segmentation method based on characteristic space multi-view analysis, and relates to the technical field of artificial intelligence, in particular to the technical field of image segmentation. The image segmentation method based on feature space multi-view analysis provided by the embodiment of the disclosure can be applied to a terminal, a server side and software running in the terminal or the server side. In some embodiments, the terminal may be a smartphone, tablet, laptop, desktop computer, smart watch, or the like; the server can be an independent server, and can also be a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, network service, cloud communication, middleware service, domain name service, security service, Content Delivery Network (CDN), big data and an artificial intelligence platform; the software may be an application or the like for implementing an image segmentation method based on feature space multi-view analysis, but is not limited to the above forms.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiment of the disclosure provides an image segmentation method based on feature space multi-view analysis, which includes: acquiring an image target data set; acquiring an image training data set; the image training data set comprises image data and segmentation labels corresponding to the image data; acquiring an initial neural network model; the initial neural network model at least comprises a shared encoder and a target decoder; the target decoder at least comprises a first decoder and a second decoder; performing model training processing on the initial neural network model according to the image training data set to obtain a target neural network model; inputting the image target data set into a trained target neural network model; the target neural network model is used for carrying out image segmentation of multi-view analysis in a characteristic space; and performing image segmentation processing on the image target data set through the target neural network model to obtain a target segmentation result.
Fig. 1 is an optional flowchart of an image segmentation method based on feature space multi-view analysis according to an embodiment of the present disclosure, where the method in fig. 1 may include, but is not limited to, steps S110 to S160, and specifically includes:
s110, acquiring an image target data set;
s120, acquiring an image training data set;
s130, obtaining an initial neural network model;
s140, performing model training processing on the initial neural network model according to the image training data set to obtain a target neural network model;
s150, inputting the image target data set into the trained target neural network model;
and S160, carrying out image segmentation processing on the image target data set through the target neural network model to obtain a target segmentation result.
In step S110, the image target data set is a data set to be subjected to image segmentation.
In step S120, the image training data set is the data set used to train the model and comprises image data and the segmentation labels corresponding to the image data, for example a data set (X, Y), where X is the image data and Y is the segmentation label.
In step S130, the initial neural network model at least includes a shared encoder and a target decoder; the target decoder at least comprises a first decoder and a second decoder, wherein the shared encoder is marked as E and used for extracting the features in the image data, and the target decoder is marked as D and used for carrying out multi-view learning on the features in the image data.
The sharing of the shared encoder is embodied in its connection to each decoder, and the structure of the shared encoder is not limited: any full convolution network model, such as DeepLabv3, can be used.
In a specific embodiment, the structure of each decoder in the target decoder is kept consistent, but the parameters of each decoder are different, so that the image characteristics learned by different decoders are different, thereby realizing multi-view analysis. For example, the parameters of the first decoder are a first parameter set, and the parameters of the second decoder are a second parameter set.
In a specific embodiment, the output of the shared encoder is connected to the input of each of the target decoders, the connection is a main path connection, and the shared encoder is further connected to each of the decoders through a branch path for improving the performance of the network model.
In a specific embodiment, each decoder in the target decoder is provided with a fully-connected layer for outputting the final segmentation result, and the other parts of each decoder include, but are not limited to, a deconvolution layer.
It should be noted that the at least two decoders in the target decoder learn from different features, so that multi-view analysis is performed along the dimensions of different feature spaces and the accuracy of image segmentation is improved; the number of decoders in the target decoder can grow as the number of views increases, still achieving the purpose of multi-view analysis.
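To make this structure concrete, the following is a minimal PyTorch sketch of a shared encoder feeding two structurally identical but independently parameterized decoders; the toy backbone layers, channel widths and class count are illustrative assumptions, not the structure specified by this application.

```python
import torch
import torch.nn as nn

class MultiViewSegNet(nn.Module):
    """Shared encoder E feeding two decoders D_alpha and D_beta of identical structure."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Shared encoder E: stands in for any full convolution backbone.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Same structure, independently initialized parameters, so the two
        # decoders can diverge and learn different attribute sets.
        def make_decoder() -> nn.Sequential:
            return nn.Sequential(
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(64, num_classes, 4, stride=2, padding=1),
            )
        self.decoder_alpha = make_decoder()
        self.decoder_beta = make_decoder()

    def forward(self, x):
        feats = self.encoder(x)  # shared image initial features
        return self.decoder_alpha(feats), self.decoder_beta(feats)
```

In practice the toy encoder above would be replaced by a full convolution backbone such as DeepLabv3, consistent with the description above.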
In step S140, the training process of the initial neural network model specifically includes: designing a loss function, inputting the image data in the training data set into the model to obtain output, comparing the output with the segmentation labels, and adjusting the parameters of the model through the loss function to finally obtain the trained target neural network model.
In steps S150 to S160, the image target data set is input into the target neural network model, so as to perform image segmentation processing on the image target data set through the target neural network model, obtain a target segmentation result, and implement image segmentation for performing multi-view analysis in the feature space.
The embodiment of the disclosure provides an image segmentation method based on characteristic space multi-view analysis, which includes the steps of firstly, respectively obtaining an image target data set, an image training data set and an initial neural network model, then, conducting model training processing on the initial neural network model according to the image training data set to obtain a target neural network model, further inputting the image target data set to the trained target neural network model, and finally, conducting image segmentation of multi-view analysis on the image target data set in a characteristic space through the target neural network model to obtain a target segmentation result.
In some embodiments, performing model training processing on the initial neural network model according to the image training data set to obtain a target neural network model includes: carrying out feature extraction processing on the image data through a shared encoder to obtain initial features of the image, wherein the image initial features comprise instrument position characteristics, contour characteristics and image saturation characteristics; performing characteristic space multi-view analysis processing on the initial features of the image through a target decoder to obtain a target image prediction result; performing loss calculation according to the target decoder, the target image prediction result and the segmentation label to obtain a target loss function; and performing model training processing on the initial neural network model according to the target loss function to obtain a target neural network model.
Fig. 2 is a flow chart of step S150 in some embodiments, and step S150 illustrated in fig. 2 includes, but is not limited to, steps S210 to S240:
s210, performing feature extraction processing on image data through a shared encoder to obtain initial features of an image;
s220, performing characteristic space multi-view analysis processing on the initial characteristics of the image through a target decoder to obtain a target image prediction result;
s230, performing loss calculation according to the target decoder, the target image prediction result and the segmentation label to obtain a target loss function;
and S240, performing model training processing on the initial neural network model according to the target loss function to obtain a target neural network model.
In step S210, the image data is input into the shared encoder, and the attributes in the image are abstracted to obtain initial image features, where the initial image features include instrument position features, contour features, and image saturation features.
In step S220, let the first decoder be D_α and the second decoder be D_β. Each decoder is responsible for learning the obtained feature vector; because the first decoder and the second decoder adopt the same structure, their parameters can form two corresponding vectors, and each decoder outputs a corresponding target image prediction result, i.e., a semantic segmentation prediction of the image.
In steps S230 to S240, the loss function of the entire target neural network model is the target loss function, denoted L_total, and the target loss function is used to train the model.
In some embodiments, performing a loss calculation according to the target decoder, the target image prediction result, and the segmentation label to obtain a target loss function, includes: obtaining a difference loss function according to a target decoder; obtaining a segmentation loss function according to the target image prediction result and the segmentation label; and obtaining a target loss function according to the difference loss function and the segmentation loss function.
Fig. 3 is a flowchart of step S230 in some embodiments, and step S230 illustrated in fig. 3 includes, but is not limited to, step S310 to step S330:
s310, obtaining a difference loss function according to a target decoder;
s320, obtaining a segmentation loss function according to the target image prediction result and the segmentation label;
and S330, obtaining a target loss function according to the difference loss function and the segmentation loss function.
In step S310, in order to prompt different decoders to learn features of different attribute sets and perform multi-view analysis in the feature space, a difference loss function is introduced to apply a difference constraint to the decoders, ensuring that different decoders have different parameters. Specifically, the difference loss function is denoted L_weight.
In step S320, the segmentation loss function is the loss function of the image segmentation performed by the decoders, and comprises first segmentation loss information, denoted L_segα, and second segmentation loss information, denoted L_segβ.
In step S330, the target loss function is the loss function of the entire target neural network model and is denoted L_total.
In some embodiments, deriving a difference loss function from the target decoder comprises: acquiring a first parameter set of a first decoder, and obtaining a first parameter vector according to the first parameter set; acquiring a second parameter set of a second decoder, and obtaining a second parameter vector according to the second parameter set; and performing cosine similarity quantization processing on the first parameter vector and the second parameter vector to obtain a difference loss function.
Fig. 4 is a flow chart of step S310 in some embodiments, and step S310 illustrated in fig. 4 includes, but is not limited to, step S410 to step S430:
s410, acquiring a first parameter set of a first decoder, and obtaining a first parameter vector according to the first parameter set;
s420, acquiring a second parameter set of a second decoder, and obtaining a second parameter vector according to the second parameter set;
s430, cosine similarity quantization processing is carried out on the first parameter vector and the second parameter vector to obtain a difference loss function.
In step S410, the first parameter set consists of the parameters of the first decoder, and the first parameter vector is the vector formed from the first parameter set and used to calculate the difference loss function.
In step S420, the second parameter set consists of the parameters of the second decoder, and the second parameter vector is the vector formed from the second parameter set and used to calculate the difference loss function.
In step S430, in order to enable the two decoders to learn features of different attribute sets and perform multi-view analysis in the feature space, a difference loss function is introduced to apply a difference constraint to the decoders, ensuring that the two decoders have different parameters. Because the two decoders adopt the same structure, their parameters can form two corresponding vectors, and the difference between these vectors can be quantified as a loss function by calculating their cosine similarity: the smaller the cosine similarity, the larger the difference between the decoders. The difference between the decoders can therefore be controlled through their parameters, which improves the flexibility of the model and the efficiency of training.
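As a concrete illustration, here is a minimal sketch of this cosine-similarity difference constraint, assuming the PyTorch model sketched earlier; flattening every decoder parameter into a single vector is one plausible reading of the description, not the application's mandated implementation.

```python
import torch
import torch.nn.functional as F

def difference_loss(decoder_a: torch.nn.Module, decoder_b: torch.nn.Module) -> torch.Tensor:
    # Form one parameter vector per decoder (possible because both share a structure).
    vec_a = torch.cat([p.flatten() for p in decoder_a.parameters()])
    vec_b = torch.cat([p.flatten() for p in decoder_b.parameters()])
    # Cosine similarity quantifies how alike the two parameter sets are; minimizing
    # it as a loss pushes the decoders apart, toward different attribute sets.
    return F.cosine_similarity(vec_a, vec_b, dim=0)
```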
In some embodiments, the target image prediction comprises a first image prediction output by a first decoder, a second image prediction output by a second decoder; the segmentation loss function comprises first segmentation loss information and second segmentation loss information, and is obtained according to the target image prediction result and the segmentation label, and the segmentation loss function comprises the following steps: comparing the first image prediction result with the segmentation label to obtain first segmentation loss information; and comparing the second image prediction result with the segmentation label to obtain second segmentation loss information.
Fig. 5 is a flowchart of step S320 in some embodiments, and step S320 illustrated in fig. 5 includes, but is not limited to, step S510 to step S520:
s510, comparing the first image prediction result with the segmentation labels to obtain first segmentation loss information;
and S520, comparing the second image prediction result with the segmentation label to obtain second segmentation loss information.
In steps S510 to S520, the target image prediction results include the first image prediction result output by the first decoder, recorded as Ŷ_α, and the second image prediction result output by the second decoder, recorded as Ŷ_β.
After the target image prediction results are obtained, the first image prediction result and the second image prediction result are each compared with the segmentation label, and the segmentation loss function is calculated, comprising the first segmentation loss information and the second segmentation loss information, thereby guiding the neural network to learn in the correct semantic segmentation direction. Specifically, the first segmentation loss information is denoted L_segα and the second segmentation loss information is denoted L_segβ.
In some embodiments, obtaining the target loss function according to the difference loss function and the segmentation loss function includes: acquiring a target weight parameter; the target weight parameters comprise difference weight parameters and segmentation weight parameters; and weighting the difference loss function and the segmentation loss function according to the target weight parameters to obtain a target loss function.
Step S330 further includes: and acquiring a target weight parameter, and performing weighting processing on the difference loss function and the segmentation loss function according to the target weight parameter to obtain a target loss function.
Wherein the target weight parameters comprise a difference weight parameter, denoted λ1, and a segmentation weight parameter, denoted λ2. The weighting process is L_total = λ1·L_weight + λ2·(L_segα + L_segβ), and the overall optimization of the model is achieved by minimizing this overall loss function during training.
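Putting the pieces together, the following is a hedged sketch of one training step over this weighted objective, continuing the PyTorch assumptions above; cross-entropy as the segmentation loss and the λ values shown are illustrative assumptions, as the application does not fix them here.

```python
import torch
import torch.nn.functional as F

lambda1, lambda2 = 0.01, 1.0  # illustrative weight parameters, not values from this application

def training_step(model, x, y, optimizer) -> float:
    pred_alpha, pred_beta = model(x)              # target image prediction results
    l_seg_alpha = F.cross_entropy(pred_alpha, y)  # compare first prediction with label Y
    l_seg_beta = F.cross_entropy(pred_beta, y)    # compare second prediction with label Y
    l_weight = difference_loss(model.decoder_alpha, model.decoder_beta)
    # L_total = λ1·L_weight + λ2·(L_segα + L_segβ)
    l_total = lambda1 * l_weight + lambda2 * (l_seg_alpha + l_seg_beta)
    optimizer.zero_grad()
    l_total.backward()
    optimizer.step()
    return l_total.item()
```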
In some embodiments, the image segmentation processing is performed on the image target data set through the target neural network model to obtain a target segmentation result, including: segmenting the image target data set through a first decoder to obtain a first sub-segmentation result; segmenting the image target data set through a second decoder to obtain a second sub-segmentation result; performing probability calculation on the first sub-segmentation result and the second sub-segmentation result to obtain a class probability map; and performing logistic regression calculation according to the class probability graph to obtain a target segmentation result.
Fig. 6 is a flowchart of step S160 in some embodiments, and step S160 illustrated in fig. 6 includes, but is not limited to, step S610 to step S640:
s610, segmenting the image target data set through a first decoder to obtain a first sub-segmentation result;
s620, segmenting the image target data set through a second decoder to obtain a second sub-segmentation result;
s630, carrying out probability calculation on the first sub-segmentation result and the second sub-segmentation result to obtain a class probability map;
and S640, performing logistic regression calculation according to the class probability graph to obtain a target segmentation result.
In steps S610 to S620, the first sub-segmentation result is recorded as Ŷ_α and the second sub-segmentation result is recorded as Ŷ_β.
It should be noted that the notation of the first image prediction result is the same as that of the first sub-segmentation result, and the notation of the second image prediction result is the same as that of the second sub-segmentation result. The target image prediction result is the output of the decoder during training, while the target segmentation result is the output of the decoder when segmenting the image target data set; both are outputs of the decoder, and the difference is only that they are used in the training and segmentation processes of the model, respectively. The unified notation should therefore be understood as different states of the same data in different applications, and not as a limitation on the present application or as an ambiguity.
In steps S630 to S640, the obtained first sub-segmentation result and second sub-segmentation result need to be merged, integrating the feature learning of both, to obtain the target segmentation result. The target segmentation result is the final semantic segmentation prediction result and is recorded as Ŷ.
Specifically, in the merging process, instead of directly adding the final segmentation results, the class probability maps output by the segmentation are added, and then the logistic regression calculation is performed on the class probability maps to obtain the target segmentation result. Wherein the logistic regression calculation comprises a softmax calculation.
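As a sketch of this merging step under the same PyTorch assumptions: the decoder outputs are added before a single softmax, which is one implementation consistent with adding the class probability maps and then performing the logistic regression calculation.

```python
import torch

@torch.no_grad()
def segment(model, x) -> torch.Tensor:
    logits_alpha, logits_beta = model(x)  # first and second sub-segmentation results
    fused = logits_alpha + logits_beta    # add the class probability maps
    probs = torch.softmax(fused, dim=1)   # "logistic regression" (softmax) calculation
    return probs.argmax(dim=1)            # per-pixel class = target segmentation result Ŷ
```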
Fig. 7 is a schematic diagram illustrating an image segmentation method based on feature space multi-view analysis according to an embodiment of the present disclosure, and the image segmentation method based on feature space multi-view analysis is described in detail with reference to fig. 7 according to a specific embodiment. It is to be understood that the following description is illustrative only and is not intended to be in any way limiting.
First, the image data set (X, Y) required for training is obtained, where X is the image data and Y is the segmentation label. In fig. 7, "Input X" represents the input image data.
The image data is input into the shared encoder E, which abstracts and extracts the attributes in the image data, including instrument position characteristics, contour characteristics, image saturation characteristics and the like, to obtain a group of image initial features. The shared encoder E, also called the feature extractor, may use any full convolutional network. "Shared encoder E" in fig. 7 denotes the shared encoder.
Then the target decoder is defined: the first decoder is D_α and the second decoder is D_β; each decoder is responsible for learning the obtained feature vectors and outputs a corresponding target image prediction result, Ŷ_α and Ŷ_β respectively. In fig. 7, "Decoder D_α" denotes the first decoder, "Decoder D_β" denotes the second decoder, "Prediction Ŷ_α" denotes the first image prediction result, and "Prediction Ŷ_β" denotes the second image prediction result. In addition, fig. 7 explicitly shows the main-path and branch-path connections between the shared encoder and each decoder.
In order to enable the two decoders to learn features of different attribute sets and perform multi-view analysis in the feature space, a difference loss function L_weight is introduced to apply a difference constraint to the decoders, ensuring that the two decoders have different parameters. In fig. 7, "Disparity loss L_weight" denotes the difference loss function.
After the target image prediction results are obtained, they are compared with the segmentation labels respectively to calculate the segmentation loss function, thereby guiding the neural network to learn in the correct semantic segmentation direction. In fig. 7, "Segmentation loss L_segα, L_segβ" denotes the segmentation loss functions.
Finally, the segmentation results are combined and the feature learning is synthesized to obtain the target segmentation result. "Final prediction Ŷ" in fig. 7 denotes the target segmentation result.
The embodiment of the present disclosure provides an image segmentation apparatus based on characteristic space multi-view analysis, including: the target data set acquisition module is used for acquiring an image target data set; the training data set acquisition module is used for acquiring an image training data set; the image training data set comprises image data and segmentation labels corresponding to the image data; the initial neural network model acquisition module is used for acquiring an initial neural network model; the initial neural network model at least comprises a shared encoder and a target decoder; the target decoder at least comprises a first decoder and a second decoder; the model training module is used for carrying out model training processing on the initial neural network model according to the image training data set to obtain a target neural network model; the data input module is used for inputting the image target data set into the trained target neural network model; the target neural network model is used for carrying out image segmentation of multi-view analysis in a characteristic space; and the image segmentation module is used for carrying out image segmentation processing on the image target data set through the target neural network model to obtain a target segmentation result.
Referring to fig. 8, fig. 8 illustrates an image segmentation apparatus based on eigenspace multi-view analysis according to an embodiment, where the image segmentation apparatus based on eigenspace multi-view analysis includes: the device comprises a target data set acquisition module 810, a training data set acquisition module 820, an initial neural network model acquisition module 830, a model training module 840, a data input module 850 and an image segmentation module 860, wherein the target data set acquisition module 810 is connected with the training data set acquisition module 820, the training data set acquisition module 820 is connected with the initial neural network model acquisition module 830, the initial neural network model acquisition module 830 is connected with the model training module 840, the model training module 840 is connected with the data input module 850, and the data input module 850 is connected with the image segmentation module 860.
The specific implementation of the image segmentation apparatus based on the feature space multi-view analysis in this embodiment is substantially the same as the specific implementation of the image segmentation method based on the feature space multi-view analysis, and belongs to the same inventive concept, which is not described herein again.
An embodiment of the present disclosure further provides an electronic device, including:
at least one memory;
at least one processor;
at least one program;
the programs are stored in the memory, and the processor executes the at least one program to implement the image segmentation method based on feature space multi-view analysis. The electronic device can be any intelligent terminal including a mobile phone, a tablet computer, a Personal Digital Assistant (PDA for short), a vehicle-mounted computer and the like.
Referring to fig. 9, fig. 9 illustrates a hardware structure of an electronic device according to another embodiment, where the electronic device includes:
the processor 901 may be implemented by a general-purpose CPU (central processing unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute a relevant program to implement the technical solution provided by the embodiment of the present disclosure;
the memory 902 may be implemented in a ROM (read only memory), a static memory device, a dynamic memory device, or a RAM (random access memory). The memory 902 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present disclosure is implemented by software or firmware, the relevant program codes are stored in the memory 902 and the processor 901 calls the image segmentation method based on the feature space multi-view analysis to execute the embodiments of the present disclosure;
an input/output interface 903 for implementing information input and output;
a communication interface 904, configured to implement communication interaction between the device and another device, where communication may be implemented in a wired manner (e.g., USB, network cable, etc.), and communication may also be implemented in a wireless manner (e.g., mobile network, WIFI, bluetooth, etc.); and
a bus 905 that transfers information between various components of the device (e.g., the processor 901, the memory 902, the input/output interface 903, and the communication interface 904);
wherein the processor 901, the memory 902, the input/output interface 903 and the communication interface 904 enable a communication connection within the device with each other through a bus 905.
The embodiment of the present disclosure further provides a storage medium, which is a computer-readable storage medium, where computer-executable instructions are stored, and the computer-executable instructions are configured to enable a computer to execute the image segmentation method based on feature space multi-view analysis.
The image segmentation method and device based on characteristic space multi-view analysis provided by the embodiment of the disclosure firstly acquire an image target data set, an image training data set and an initial neural network model respectively, then perform model training processing on the initial neural network model according to the image training data set to obtain a target neural network model, further input the image target data set into the trained target neural network model, and finally perform image segmentation of multi-view analysis on the image target data set in the characteristic space through the target neural network model to obtain a target segmentation result.
The memory, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The embodiments described in the embodiments of the present disclosure are for more clearly illustrating the technical solutions of the embodiments of the present disclosure, and do not constitute a limitation to the technical solutions provided in the embodiments of the present disclosure, and it is obvious to those skilled in the art that the technical solutions provided in the embodiments of the present disclosure are also applicable to similar technical problems with the evolution of technology and the emergence of new application scenarios.
It will be appreciated by those skilled in the art that the solutions shown in fig. 1-6 are not limiting of the embodiments of the present disclosure, and may include more or fewer steps than those shown, or some of the steps may be combined, or different steps.
The above-described embodiments of the apparatus are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may also be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
One of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like (if any) in the description of the present application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" for describing an association relationship of associated objects, indicating that there may be three relationships, e.g., "a and/or B" may indicate: only A, only B and both A and B are present, wherein A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "at least one of the following" or similar expressions refer to any combination of these items, including any combination of single item(s) or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b and c may be single or plural.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the present application, in essence or part of the technical solutions contributing to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product, which is stored in a storage medium and includes multiple instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing programs, such as a usb disk, a portable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The preferred embodiments of the present disclosure have been described above with reference to the accompanying drawings, and therefore do not limit the scope of the claims of the embodiments of the present disclosure. Any modifications, equivalents and improvements within the scope and spirit of the embodiments of the present disclosure should be considered within the scope of the claims of the embodiments of the present disclosure by those skilled in the art.

Claims (10)

1. An image segmentation method based on multi-view analysis of feature space is characterized by comprising the following steps:
acquiring an image target data set;
acquiring an image training data set; the image training data set comprises image data and segmentation labels corresponding to the image data;
acquiring an initial neural network model; the initial neural network model at least comprises a shared encoder and a target decoder; the target decoder at least comprises a first decoder and a second decoder;
performing model training processing on the initial neural network model according to the image training data set to obtain a target neural network model;
inputting the image target data set into the trained target neural network model; wherein the target neural network model is used for performing image segmentation with multi-view analysis in a feature space;
and carrying out image segmentation processing on the image target data set through the target neural network model to obtain a target segmentation result.
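For illustration only, the following is a minimal PyTorch-style sketch of the model structure recited in claim 1: one shared encoder feeding a target decoder made of two parallel decoders. All class names, channel sizes, and layer choices here are hypothetical assumptions, not taken from the specification.

```python
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    # Hypothetical shared encoder: two strided conv stages that downsample the image.
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    # Hypothetical decoder: upsamples the shared features to per-class score maps.
    def __init__(self, feat_ch=64, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(feat_ch, feat_ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(feat_ch, num_classes, 4, stride=2, padding=1),
        )

    def forward(self, f):
        return self.net(f)

class MultiViewSegNet(nn.Module):
    # One shared encoder, two decoders: each decoder gives one "view" of the feature space.
    def __init__(self, num_classes=2):
        super().__init__()
        self.encoder = SharedEncoder()
        self.decoder1 = Decoder(num_classes=num_classes)
        self.decoder2 = Decoder(num_classes=num_classes)

    def forward(self, x):
        f = self.encoder(x)
        return self.decoder1(f), self.decoder2(f)
```

A forward pass on a batch of images then returns one prediction per decoder; the dependent claims below train and fuse these two views.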
2. The method of claim 1, wherein the performing model training processing on the initial neural network model according to the image training data set to obtain a target neural network model comprises:
performing feature extraction processing on the image data through the shared encoder to obtain initial image features; wherein the initial image features comprise instrument position features, contour features, and image saturation features;
performing feature space multi-view analysis processing on the initial image features through the target decoder to obtain a target image prediction result;
performing loss calculation according to the target decoder, the target image prediction result and the segmentation label to obtain a target loss function;
and performing the model training processing on the initial neural network model according to the target loss function to obtain the target neural network model.
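A hedged sketch of the training process of claim 2, reusing the hypothetical MultiViewSegNet above; target_loss stands in for the loss assembly of claims 3 to 6 and is sketched after claim 6 below, and the optimizer and hyperparameters are assumptions.

```python
import torch

def train(model, loader, epochs=50, lr=1e-3):
    # model: the hypothetical MultiViewSegNet sketched after claim 1.
    # target_loss: the loss assembly of claims 3 to 6, sketched after claim 6.
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            pred1, pred2 = model(images)  # two decoder views of the shared features
            loss = target_loss(pred1, pred2, labels, model.decoder1, model.decoder2)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```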
3. The method of claim 2, wherein performing a loss calculation based on the target decoder, the target image prediction result, and the segmentation label to obtain a target loss function comprises:
obtaining a difference loss function according to the target decoder;
obtaining a segmentation loss function according to the target image prediction result and the segmentation label;
and obtaining the target loss function according to the difference loss function and the segmentation loss function.
4. The method of claim 3, wherein obtaining a difference loss function from the target decoder comprises:
acquiring a first parameter set of the first decoder, and obtaining a first parameter vector according to the first parameter set;
acquiring a second parameter set of the second decoder, and obtaining a second parameter vector according to the second parameter set;
and performing cosine similarity quantization processing on the first parameter vector and the second parameter vector to obtain the difference loss function.
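One plausible reading of claim 4, as a sketch: flatten each decoder's parameter set into a single vector and take the cosine similarity of the two vectors as the difference loss, so that minimizing the target loss pushes the two decoders toward distinct views. Treating the raw similarity itself as the loss value is an assumption; the claim only specifies cosine similarity quantization processing.

```python
import torch
import torch.nn.functional as F

def difference_loss(decoder1, decoder2):
    # Claim 4: flatten each decoder's parameter set into one parameter vector.
    v1 = torch.cat([p.flatten() for p in decoder1.parameters()])
    v2 = torch.cat([p.flatten() for p in decoder2.parameters()])
    # Cosine similarity quantization: the more alike the two decoders are,
    # the larger this term, so minimizing it drives the views apart.
    return F.cosine_similarity(v1, v2, dim=0)
```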
5. The method of claim 3, wherein the target image predictor comprises a first image predictor output by the first decoder, a second image predictor output by the second decoder; the obtaining of the segmentation loss function according to the target image prediction result and the segmentation label includes:
comparing the first image prediction result with the segmentation label to obtain first segmentation loss information;
comparing the second image prediction result with the segmentation label to obtain second segmentation loss information;
and obtaining the segmentation loss function according to the first segmentation loss information and the second segmentation loss information.
6. The method of claim 3, wherein obtaining the target loss function from the difference loss function and the segmentation loss function comprises:
acquiring a target weight parameter; the target weight parameters comprise difference weight parameters and segmentation weight parameters;
and weighting the difference loss function and the segmentation loss function according to the target weight parameters to obtain the target loss function.
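Combining claims 5 and 6, a minimal sketch of the target loss: a segmentation loss per decoder against the shared segmentation label, plus the weighted difference term. Cross-entropy as the comparison and the default weight values are assumptions for illustration; difference_loss is the sketch given after claim 4.

```python
import torch.nn.functional as F

def target_loss(pred1, pred2, labels, decoder1, decoder2,
                seg_weight=1.0, diff_weight=0.1):
    # Claim 5: compare each decoder's prediction with the segmentation label.
    seg1 = F.cross_entropy(pred1, labels)  # first segmentation loss information
    seg2 = F.cross_entropy(pred2, labels)  # second segmentation loss information
    # Claim 6: weight the segmentation and difference terms into the target loss.
    return seg_weight * (seg1 + seg2) + diff_weight * difference_loss(decoder1, decoder2)
```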
7. The method according to any one of claims 1 to 6, wherein the performing of the image segmentation processing on the image target data set through the target neural network model to obtain a target segmentation result comprises:
segmenting the image target data set through the first decoder to obtain a first sub-segmentation result;
segmenting the image target data set through the second decoder to obtain a second sub-segmentation result;
performing probability calculation on the first sub-segmentation result and the second sub-segmentation result to obtain a class probability map;
and performing logistic regression calculation according to the class probability map to obtain the target segmentation result.
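A possible implementation of the fusion in claim 7, as a sketch: softmax each decoder's sub-segmentation result, average the two into a class probability map, and take the per-pixel argmax. Reading the probability calculation as averaging and the logistic regression calculation as a softmax/argmax decision are both interpretations, not statements of the claim.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def segment(model, images):
    model.eval()
    pred1, pred2 = model(images)  # first and second sub-segmentation results
    # Probability calculation: average the two softmax maps into one class
    # probability map (averaging is an assumed fusion rule).
    prob = 0.5 * (F.softmax(pred1, dim=1) + F.softmax(pred2, dim=1))
    # Per-pixel decision over the class probability map.
    return prob.argmax(dim=1)  # target segmentation result
```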
8. An image segmentation device based on feature space multi-view analysis, comprising:
the target data set acquisition module is used for acquiring an image target data set;
the training data set acquisition module is used for acquiring an image training data set; the image training data set comprises image data and segmentation labels corresponding to the image data;
the initial neural network model acquisition module is used for acquiring an initial neural network model; the initial neural network model at least comprises a shared encoder and a target decoder; the target decoder at least comprises a first decoder and a second decoder;
the model training module is used for carrying out model training processing on the initial neural network model according to the image training data set to obtain a target neural network model;
the data input module is used for inputting the image target data set into the trained target neural network model; wherein the target neural network model is used for performing image segmentation with multi-view analysis in a feature space;
and the image segmentation module is used for carrying out image segmentation processing on the image target data set through the target neural network model to obtain a target segmentation result.
9. An electronic device, comprising:
at least one memory;
at least one processor;
at least one program;
wherein the at least one program is stored in the at least one memory, and the at least one processor executes the at least one program to implement:
the method of any one of claims 1 to 7.
10. A storage medium that is a computer-readable storage medium having stored thereon computer-executable instructions for causing a computer to perform:
the method of any one of claims 1 to 7.
CN202210634465.7A 2022-06-07 2022-06-07 Image segmentation method and device based on characteristic space multi-view analysis Pending CN115205301A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210634465.7A CN115205301A (en) 2022-06-07 2022-06-07 Image segmentation method and device based on characteristic space multi-view analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210634465.7A CN115205301A (en) 2022-06-07 2022-06-07 Image segmentation method and device based on characteristic space multi-view analysis

Publications (1)

Publication Number Publication Date
CN115205301A true CN115205301A (en) 2022-10-18

Family

ID=83576798

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210634465.7A Pending CN115205301A (en) 2022-06-07 2022-06-07 Image segmentation method and device based on characteristic space multi-view analysis

Country Status (1)

Country Link
CN (1) CN115205301A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117351289A * 2023-11-02 2024-01-05 北京联影智能影像技术研究院 Training method of image classification model and image classification method

Similar Documents

Publication Publication Date Title
CN111898696B (en) Pseudo tag and tag prediction model generation method, device, medium and equipment
CN107861938B (en) POI (Point of interest) file generation method and device and electronic equipment
CN109993102B (en) Similar face retrieval method, device and storage medium
CN112016315B (en) Model training method, text recognition method, model training device, text recognition device, electronic equipment and storage medium
CN112164391A (en) Statement processing method and device, electronic equipment and storage medium
CN113657087B (en) Information matching method and device
CN109241299B (en) Multimedia resource searching method, device, storage medium and equipment
CN115640394A (en) Text classification method, text classification device, computer equipment and storage medium
CN114298997B (en) Fake picture detection method, fake picture detection device and storage medium
CN115222061A (en) Federal learning method based on continuous learning and related equipment
CN115205301A (en) Image segmentation method and device based on characteristic space multi-view analysis
CN114972016A (en) Image processing method, image processing apparatus, computer device, storage medium, and program product
CN114004364A (en) Sampling optimization method and device, electronic equipment and storage medium
CN111914809A (en) Target object positioning method, image processing method, device and computer equipment
CN116977714A (en) Image classification method, apparatus, device, storage medium, and program product
CN115619903A (en) Training and synthesizing method, device, equipment and medium for text image synthesis model
CN115439713A (en) Model training method and device, image segmentation method, equipment and storage medium
CN114820891A (en) Lip shape generating method, device, equipment and medium
CN114595357A (en) Video searching method and device, electronic equipment and storage medium
CN113822291A (en) Image processing method, device, equipment and storage medium
CN113723515A (en) Moire pattern recognition method, device, equipment and medium based on image recognition
CN117253061B (en) Data recommendation method, device and computer readable medium
CN115937338B (en) Image processing method, device, equipment and medium
CN117540306B (en) Label classification method, device, equipment and medium for multimedia data
CN114283350B (en) Visual model training and video processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination