CN114627297A - Image semantic segmentation method for irregular material of transfer learning - Google Patents
- Publication number
- CN114627297A (application CN202210331353.4A)
- Authority
- CN
- China
- Prior art keywords
- image
- irregular material
- irregular
- semantic segmentation
- pixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
Abstract
The invention discloses a transfer-learning-based image semantic segmentation method for irregular materials, which carries out the semantic segmentation task by inferring the random vector of an image with a rendering model and rendering the corresponding quasi-annotation mask label. The method uses widely available unlabeled image data of irregular materials to supplement a small-scale labeled semantic segmentation data set, and trains the model network on this unlabeled data together with the limited labeled data. The resulting model generalizes well, can be migrated effectively to a variety of industrial production scenes, significantly reduces the manual labeling workload, and improves the efficiency of the method.
Description
Technical Field
The invention relates to an image semantic segmentation method for irregular materials in an intelligent factory production scene, and relates to the field of intelligent manufacturing and machine vision.
Background
Industrial production scenes generate a large demand for inspection. Typical inspection applications focus on attributes of the target object such as geometric shape, size, spatial position, surface defects and edges. Across different production applications, the geometric appearance of materials must be inspected in real time. Doing this manually involves a huge workload and high labor cost, yet the results are unsatisfactory: subjective bias is large, efficiency is low, inspection quality fluctuates widely, and accuracy cannot be guaranteed. As industrial production becomes more intelligent, more and more inspection processes are implemented with artificial intelligence algorithms, which markedly raises the level of intelligent manufacturing and strongly promotes smart factory construction.
One of the main ways to detect attributes such as the geometric shape of industrial materials is to apply an image semantic segmentation algorithm. Mature intelligent detection algorithms exist for targets with regular geometric appearance, such as workpieces, but there is no unified, high-performance intelligent detection method for irregular materials in production scenes. Market demand for all kinds of irregular production workpieces is huge, and so is the corresponding demand for image semantic segmentation. Irregular materials have no fixed, regular geometric appearance, which poses many technical challenges for detection.
Existing semantic segmentation methods include the U-shaped network algorithms (UNet, UNet++), the V-shaped network algorithm (VNet) and the DeepLab algorithms (V1, V2 and V3). Their common problem is heavy reliance on large amounts of labeled data. In a real industrial production environment, a high-quality, effective training set can only be obtained through manual labeling by highly skilled experts with extensive field experience, so labeled data is often very expensive. These deep-model-based semantic segmentation networks need data support and usually must be trained on a large-scale data set to reach high accuracy. Even when a large data set is available, it remains difficult to generalize UNet, UNet++, VNet and DeepLab (V1, V2, V3) networks to out-of-distribution data, such as images captured in other production scenarios.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: from the perspective of image semantic segmentation, a semi-supervised method is used to perform transfer learning on a large number of unlabeled irregular-material sample images together with a small number of labeled samples, so as to solve geometric shape and edge detection for irregular materials under small-sample learning (a small-scale labeled data set).
The invention adopts the following technical scheme for solving the technical problems:
the invention provides an image semantic segmentation method for irregular materials, which comprises the following steps:
step S1, constructing a training data set, which comprises two subdata sets, one is an unlabeled irregular material image set, and the other is a labeled irregular material image set;
step S2, constructing and training a pixel-level rendering module, which during training takes a random variable in vector form as input and outputs a semantic segmentation quasi-annotation mask and an image; the input random variable is obtained by passing the probability distribution of a random variable in the regular-material image semantic space through a deep network to convert it into a new spatial distribution, the conversion criterion being that the new distribution better matches the image semantic distribution of irregular materials in the production scene;
step S3, constructing an irregular-material rendered-image discriminator, which is applied to real images of irregular materials from the production scene and to rendered images of irregular materials; and constructing an irregular-material rendered image-label pair discriminator, which discriminates rendered irregular-material image-label pairs from real production-scene irregular-material image-label pairs;
step S4, setting up an image semantic manifold converter, which maps an image of an irregular material in the production scene to an embedded random variable in an enhanced embedding space, so that the irregular-material image is labeled by rendering its corresponding semantic segmentation quasi-annotation mask label; during inference, the embedding of a new image is computed first, where embedding means a vector representation;
step S5, given a target image of an irregular material, embedding the target image into the enhanced embedding space of the pixel-level rendering module: the image semantic manifold converter maps the target image to the enhanced embedding space, an inverted estimate of the enhanced-space vector is computed, and the optimal pixel-level label is inferred from it;
step S6, combining the learned perceptual image patch similarity (LPIPS) with a per-pixel Mahalanobis distance term on the irregular-material target image, and passing the inverted estimate of the enhanced-space vector back to the pixel-level rendering module to obtain the image semantic segmentation result for the irregular material, namely the target image and its semantic segmentation quasi-annotation mask label.
Further, in the image semantic segmentation method for irregular materials provided by the invention, the training data set constructed in step S1 comprises two sub-data sets $T_a$ and $T_b$. Data set $T_a$ is a set of unlabeled irregular-material images, $T_a = \{u_1, \dots, u_J\}$, where $u_j$ ($j = 1, 2, \dots, J$) denotes each irregular-material image. Data set $T_b$ is a set of irregular-material images that has undergone data cleaning and in which each image carries a semantic segmentation label, $T_b = \{(u_1, v_1), \dots, (u_L, v_L)\}$, where $u_l$ ($l = 1, 2, \dots, L$) denotes each irregular-material image and $v_l$ denotes the semantic segmentation annotation mask label corresponding to $u_l$; $J \gg L$.
Furthermore, in the image semantic segmentation method for irregular materials provided by the invention, the pixel-level rendering module in step S2 uses a residual skip-connection design with 5 feature layers connected in series from the input end, and an additional branch is attached to each feature layer so that the module outputs a semantic segmentation quasi-annotation mask v and an image u.
Furthermore, in the image semantic segmentation method for irregular materials provided by the invention, the pixel-level rendering module takes as input a random variable h from a random noise space, where h obeys a standard normal distribution, $h \sim \mathcal{N}(0, I)$. The distribution p(h) is first converted into a new spatial distribution p(r) through a fully connected network, a fully convolutional network and a 1 × 1 convolutional layer. The conversion criterion is that p(r) better conforms to, or approximates, the image semantic distribution of irregular materials in the smart factory production scene, with r the random variable of the new space. After an affine transformation, the newly distributed noise variables r are fed to the feature layers of the pixel-level rendering module, which outputs an image and a semantic segmentation quasi-annotation mask label; these lie respectively in the output image distribution space and in the quasi-annotation mask label distribution space of irregular-material images. The form is [formula missing in source].
Furthermore, in the image semantic segmentation method for irregular materials provided by the invention, the objective function used to train the pixel-level rendering module has the form [formula missing in source],
where the sigmoid function is $S(t) = 1/(1 + e^{-t})$. The training objective is to minimize this objective function by the back-propagation gradient descent method.
Further, in the image semantic segmentation method for irregular materials provided by the invention, step S3 constructs an irregular-material rendered-image discriminator with output in the real number domain. It is applied to real images of irregular materials from the production scene and to rendered images of irregular materials, and pushes the pixel-level rendering module to produce realistic images. It is implemented with a residual structure. The objective function used to train this discriminator has the form [formula missing in source].
The training objective is to maximize this objective function by the back-propagation gradient descent method.
Further, in the image semantic segmentation method for irregular materials provided by the invention, step S3 also constructs an irregular-material rendered image-label pair discriminator. It takes the input image concatenated with the semantic segmentation quasi-annotation mask label and discriminates rendered irregular-material image-label pairs from real production-scene irregular-material image-label pairs, using a multi-scale discriminator architecture. The objective function used to train this discriminator has the form [formula missing in source].
The training objective is to maximize this objective function by the back-propagation gradient descent method.
Further, in the image semantic segmentation method for irregular materials provided by the invention, the image semantic manifold converter in step S4 uses a feature pyramid network as its backbone to extract multi-level features, and maps these features to the enhanced embedding space with a fully convolutional network. While the image semantic manifold converter is being trained, the model parameters of the pixel-level rendering module must be fixed; that is, the rendering module does not take part in training at this stage. The training objective function of the image semantic manifold converter has the form [formula missing in source],
where $D_{KL}$ denotes the KL divergence. The pixel-wise KL divergence loss is summed over all pixels of the irregular-material image, and this first term is the supervised loss. For discrete events the KL divergence is defined as $D_{KL}(A \,\|\, B) = \sum_i P_A(x_i) \log \frac{P_A(x_i)}{P_B(x_i)}$, where A and B are two random variables with probability distributions $P_A(x_i)$ and $P_B(x_i)$, and i indexes the discrete values. The sum of the last two terms of the objective function is the unsupervised loss, in which α is a hyperparameter with value range [0, 1] conditioned on the irregular-material image semantic distribution. Both output paths of the pixel-level rendering module appear in the objective: its irregular-material image rendering backbone and its semantic segmentation quasi-annotation mask label rendering branch. The learned perceptual image patch similarity (LPIPS) distance measures distance in the feature space of a deep residual network pre-trained on ImageNet. The Mahalanobis distance between data points x and y is $d_M(x, y) = \sqrt{(x - y)^T \Sigma^{-1} (x - y)}$, where Σ is the covariance matrix of the multidimensional random variable and T denotes transposition.
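The supervised term described above can be illustrated with a minimal sketch. It assumes the label and the rendered mask are both given as per-pixel class probability lists and that the divergence is taken from label to prediction; the function names and the `eps` guard are illustrative and do not come from the source.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) = sum_i P(x_i) * log(P(x_i) / Q(x_i)) for discrete distributions."""
    # Terms with P(x_i) = 0 contribute nothing; eps guards against log of zero in Q.
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0.0)

def pixelwise_kl(label_probs, pred_probs):
    """Sum the per-pixel KL divergence over every pixel of an H x W x K probability map."""
    return sum(kl_divergence(p, q)
               for label_row, pred_row in zip(label_probs, pred_probs)
               for p, q in zip(label_row, pred_row))

label = [[[1.0, 0.0], [0.0, 1.0]]]   # 1 x 2 image, 2 classes, one-hot label
pred = [[[0.9, 0.1], [0.2, 0.8]]]    # predicted class probabilities
loss = pixelwise_kl(label, pred)
```

With one-hot labels each pixel contributes the negative log of the probability assigned to the true class, so the sum behaves like a cross-entropy loss.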
Furthermore, in the image semantic segmentation method for irregular materials provided by the invention, step S5 is given a target image of an irregular material in the smart factory production scene and infers the optimal pixel-level label. The target image is first embedded into the embedding space of the pixel-level rendering module: the image semantic manifold converter maps the target image to the enhanced space, and the inverted estimate is then obtained as follows: [formula missing in source].
The first term optimizes the reconstruction quality of the given irregular-material target image; the second term regularizes the embedding vector of the target image so that its trajectory stays within the training domain; and the image semantic manifold converter serves as an approximate inverse of the pixel-level rendering module.
Furthermore, in the image semantic segmentation method for irregular materials provided by the invention, the inferred irregular-material target image and its semantic segmentation quasi-annotation mask label are based on the reconstruction term, which combines the learned perceptual image patch similarity (LPIPS) with a per-pixel Mahalanobis distance term on the irregular-material target image. Its specific form is as follows: [formula missing in source],
where δ is an independent hyperparameter with value range (0, 1). The vector obtained by inversion is passed back to the pixel-level rendering module, which outputs the inferred irregular-material target image and the inferred semantic segmentation quasi-annotation mask label; these two outputs are the inference result. The distance used in the reconstruction term is the LPIPS distance.
Compared with the prior art, the invention adopting the technical scheme has the following technical effects:
The invention provides a semi-supervised transfer-learning method for semantic segmentation of irregular-material images, which carries out the segmentation task by inferring the random vector of an image with a rendering model and rendering the corresponding quasi-annotation mask label. Widely available unlabeled irregular-material image data supplements a small-scale labeled semantic segmentation data set, and the model network is trained on it together with the limited labeled data. The resulting model generalizes well, can be migrated effectively to a variety of industrial production scenes, significantly reduces the manual labeling workload, and improves algorithm efficiency.
The proposed transfer-learning semantic segmentation method captures the joint distribution of irregular-material images and their corresponding quasi-annotation mask labels, and adds a branch that synthesizes quasi-annotation mask labels for irregular-material images on top of a deep fully convolutional network. The method performs well on semantic segmentation of irregular-material images in production scenes. Because training is assisted by semi-supervised learning on a large amount of easily obtained raw unlabeled irregular-material images, the labeled data set needed for training is markedly smaller, the labor cost of building large labeled data sets in production practice is reduced, and production efficiency and economic benefit improve.
Drawings
Fig. 1 is an overall flowchart of the image semantic segmentation method provided by the present invention.
FIG. 2 is a schematic diagram of an image semantic distribution space of regular materials in a smart factory production scenario.
FIG. 3 is a schematic diagram of an image semantic distribution space of irregular materials in a smart factory production scenario.
FIG. 4 is a feature layer schematic of a pixel level rendering module.
Detailed Description
The technical scheme of the invention is further explained in detail by combining the attached drawings:
it will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The invention covers the construction of the algorithm for transfer-learning-based image semantic segmentation of irregular materials in a smart factory production scene, model training, and deployment and inference in production. The overall structure is shown in Fig. 1, and the specific steps are as follows:
Step 1: Construct a training data set comprising two sub-data sets $T_a$ and $T_b$. Data set $T_a$ is a large-scale set of unlabeled images collected from irregular materials in the smart factory production scene, denoted $T_a = \{u_1, \dots, u_J\}$, where $u_j$ ($j = 1, 2, \dots, J$) denotes each irregular-material image. Data set $T_b$ is a small-scale image set whose images also come from irregular materials in the smart factory production scene; the images have undergone data cleaning and each carries a semantic segmentation label, denoted $T_b = \{(u_1, v_1), \dots, (u_L, v_L)\}$, where $u_l$ ($l = 1, 2, \dots, L$) denotes each irregular-material image and $v_l$ denotes the semantic segmentation annotation mask label corresponding to $u_l$. Sub-data set $T_a$ is much larger than sub-data set $T_b$, i.e. $J \gg L$. Using a large amount of unlabeled data with a small amount of labeled data for semi-supervised learning significantly reduces the cost of manual labeling.
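The two sub-data sets of Step 1 can be held in a simple container, sketched below. The class name `TrainingData`, the helper `blank` and the nested-list image representation are illustrative assumptions; the source does not prescribe any storage format.

```python
from dataclasses import dataclass
from typing import List, Tuple

Image = List[List[float]]   # H x W grayscale image (illustrative representation)
Mask = List[List[int]]      # H x W class index per pixel

@dataclass
class TrainingData:
    unlabeled: List[Image]                # T_a = {u_1, ..., u_J}
    labeled: List[Tuple[Image, Mask]]     # T_b = {(u_1, v_1), ..., (u_L, v_L)}

    def validate(self) -> None:
        # The source asks for J much larger than L; this check only enforces J > L.
        if len(self.unlabeled) <= len(self.labeled):
            raise ValueError("expected more unlabeled than labeled images (J > L)")
        for image, mask in self.labeled:
            same_rows = len(image) == len(mask)
            if not same_rows or any(len(r) != len(m) for r, m in zip(image, mask)):
                raise ValueError("mask shape must match image shape")

def blank(h: int, w: int, value=0.0):
    """Build an h x w grid filled with a constant, used here as placeholder data."""
    return [[value] * w for _ in range(h)]

data = TrainingData(
    unlabeled=[blank(4, 4) for _ in range(100)],
    labeled=[(blank(4, 4), blank(4, 4, 0)) for _ in range(5)],
)
data.validate()
```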
Step 2: Construct a pixel-level rendering module and train it on the data set. The pixel-level rendering module uses a residual skip-connection design and contains 5 feature layers connected in series (see Fig. 4), numbered feature layer No. 1, No. 2, ..., No. 5 from the input end. An additional branch is attached to each feature layer so that the module outputs a semantic segmentation quasi-annotation mask v and an image u. The module takes as input a random variable h from a random noise space, where h obeys a standard normal distribution, $h \sim \mathcal{N}(0, I)$. The objective function used to train the pixel-level rendering module has the form [formula missing in source],
where the sigmoid function is $S(t) = 1/(1 + e^{-t})$. The training objective is to minimize this objective function by the back-propagation gradient descent method.
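The sigmoid function used in the objective can be evaluated as in the following sketch. The two-branch form is a common numerical precaution added here, not something the source specifies; it returns the same value as 1/(1 + e^{-t}) while avoiding overflow for large negative t.

```python
import math

def sigmoid(t: float) -> float:
    """S(t) = 1 / (1 + e^{-t}), evaluated without overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    z = math.exp(t)          # t < 0, so z lies in (0, 1) and cannot overflow
    return z / (1.0 + z)
```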
Step 3: Taking h as input, p(h) is first converted into a new spatial distribution p(r) through a fully connected network, a fully convolutional network and a 1 × 1 convolutional layer. The conversion criterion is that p(r) better conforms to, or approximates, the image semantic distribution of irregular materials in the smart factory production scene, with r the random variable of the new space. After an affine transformation, the newly distributed noise variables r are fed to the feature layers of the pixel-level rendering module, which outputs an image and a semantic segmentation quasi-annotation mask label; these lie respectively in the output image distribution space and in the quasi-annotation mask label distribution space of irregular-material images. The form is [formula missing in source].
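The conversion of p(h) into p(r) followed by an affine transformation can be sketched with a small stack of dense layers. This is a simplified stand-in under stated assumptions: it uses only fully connected layers (the source also names a fully convolutional network and a 1 × 1 convolutional layer), the weights are random rather than trained, and all names and dimensions are illustrative.

```python
import random

def linear(x, weight, bias):
    """Dense layer: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]

def leaky_relu(x, slope=0.2):
    return [xi if xi > 0 else slope * xi for xi in x]

def init_layer(rng, n_in, n_out):
    """Random (untrained) weights with zero bias, for illustration only."""
    scale = (1.0 / n_in) ** 0.5
    weight = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weight, [0.0] * n_out

def map_noise(h, layers):
    """Map h ~ N(0, I) to r through a stack of dense layers."""
    r = h
    for weight, bias in layers:
        r = leaky_relu(linear(r, weight, bias))
    return r

def affine(r, weight, bias):
    """Affine transform applied to r before it enters a feature layer."""
    return linear(r, weight, bias)

rng = random.Random(0)
dim = 8
layers = [init_layer(rng, dim, dim) for _ in range(3)]
h = [rng.gauss(0.0, 1.0) for _ in range(dim)]   # sample from the standard normal
r = map_noise(h, layers)
layer_input = affine(r, *init_layer(rng, dim, 4))
```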
Step 4: Construct an irregular-material rendered-image discriminator with output in the real number domain. It is applied to real images of irregular materials from the production scene and to rendered images of irregular materials, and pushes the pixel-level rendering module to produce realistic images. It is implemented with a residual structure. The objective function used to train this discriminator has the form [formula missing in source].
The training objective is to maximize this objective function by the back-propagation gradient descent method.
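Because the discriminator's objective is not reproduced in this text, the sketch below shows only a commonly used adversarial objective that is consistent with the description (real-valued scores, a sigmoid, maximized by the discriminator). It is an assumption, not the patent's formula, and the function names are illustrative.

```python
import math

def log_sigmoid(t: float) -> float:
    """log S(t) computed stably, using log S(t) = -log(1 + e^{-t})."""
    if t >= 0:
        return -math.log1p(math.exp(-t))
    return t - math.log1p(math.exp(t))

def discriminator_objective(real_scores, fake_scores):
    """Mean of log S(D(real)) plus mean of log(1 - S(D(fake))); larger is better."""
    real_term = sum(log_sigmoid(s) for s in real_scores) / len(real_scores)
    # 1 - S(t) = S(-t), so log(1 - S(t)) = log S(-t)
    fake_term = sum(log_sigmoid(-s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term

# Scores are the raw real-valued outputs of a hypothetical discriminator.
good = discriminator_objective([4.0, 5.0], [-4.0, -5.0])   # separates real from rendered
bad = discriminator_objective([-4.0, -5.0], [4.0, 5.0])    # gets both wrong
```

Under this assumed form, a discriminator that scores real images high and rendered images low obtains a value close to zero, while a confused one obtains a strongly negative value.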
Step 5: Construct an irregular-material rendered image-label pair discriminator. It takes the input image concatenated with the semantic segmentation quasi-annotation mask label and discriminates rendered irregular-material image-label pairs from real production-scene irregular-material image-label pairs. In this way it judges the alignment between a synthesized irregular-material image and its label, and misaligned image-label pairs are detected as fake examples. To enforce strong consistency between the rendered image and the label, the invention uses a multi-scale discriminator architecture. The objective function used to train this discriminator has the form [formula missing in source].
The training objective is to maximize this objective function by the back-propagation gradient descent method.
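The input to the image-label pair discriminator can be sketched as a channel-wise concatenation, with a downsampled copy standing in for a second scale. The channels-first nested-list layout, the one-hot mask encoding and the 2 × 2 average pooling are illustrative assumptions; the source states only that image and label are concatenated and that the architecture is multi-scale.

```python
def concat_channels(image, mask_one_hot):
    """Stack image channels and one-hot mask channels: (C_u + C_v) x H x W."""
    h, w = len(image[0]), len(image[0][0])
    for channel in mask_one_hot:
        if len(channel) != h or any(len(row) != w for row in channel):
            raise ValueError("image and mask must share height and width")
    return image + mask_one_hot

def downsample2(channels):
    """Halve the resolution by 2x2 average pooling (one step of a multi-scale pyramid)."""
    out = []
    for ch in channels:
        h, w = len(ch) // 2, len(ch[0]) // 2
        out.append([[(ch[2 * i][2 * j] + ch[2 * i][2 * j + 1]
                      + ch[2 * i + 1][2 * j] + ch[2 * i + 1][2 * j + 1]) / 4.0
                     for j in range(w)] for i in range(h)])
    return out

image = [[[0.5] * 4 for _ in range(4)] for _ in range(3)]               # 3 x 4 x 4
mask = [[[1.0] * 4 for _ in range(4)], [[0.0] * 4 for _ in range(4)]]   # 2 x 4 x 4
pair = concat_channels(image, mask)
pyramid = [pair, downsample2(pair)]   # each scale would feed its own sub-discriminator
```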
Step 6: Set up an image semantic manifold converter that maps an image u of an irregular material in the production scene directly to the enhanced space (the enhanced version of the r space). The converter uses a feature pyramid network as its backbone to extract multi-level features and maps these features to the enhanced space with a fully convolutional network. While the image semantic manifold converter is being trained, the model parameters of the pixel-level rendering module must be fixed; that is, the rendering module does not take part in training at this stage. The training objective function of the image semantic manifold converter has the form [formula missing in source],
where $D_{KL}$ denotes the KL divergence. The pixel-wise KL divergence loss is summed over all pixels of the irregular-material image, and this first term is the supervised loss. For discrete events the KL divergence is defined as $D_{KL}(A \,\|\, B) = \sum_i P_A(x_i) \log \frac{P_A(x_i)}{P_B(x_i)}$, where A and B are two random variables with probability distributions $P_A(x_i)$ and $P_B(x_i)$, and i indexes the discrete values. The sum of the last two terms of the objective function is the unsupervised loss, in which α is a hyperparameter with value range [0, 1] conditioned on the irregular-material image semantic distribution. Both output paths of the pixel-level rendering module appear in the objective: its irregular-material image rendering backbone and its semantic segmentation quasi-annotation mask label rendering branch. The learned perceptual image patch similarity (LPIPS) distance measures distance in the feature space of a deep residual network (ResNet) pre-trained on ImageNet. The Mahalanobis distance between data points x and y is $d_M(x, y) = \sqrt{(x - y)^T \Sigma^{-1} (x - y)}$, where Σ is the covariance matrix of the multidimensional random variable and T denotes transposition. Guided by training on this objective function, the image semantic manifold converter maps an image u to an embedded random variable from which the input image can be rendered again; the irregular-material image is thereby labeled, because its semantic segmentation quasi-annotation mask label is rendered alongside it.
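The Mahalanobis distance named above can be computed without forming the inverse covariance matrix explicitly, as in this sketch. Solving the linear system by Gaussian elimination is an implementation choice made here for a dependency-free example; the function names are illustrative.

```python
import math

def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    z = [0.0] * n
    for row in range(n - 1, -1, -1):               # back substitution
        tail = sum(m[row][k] * z[k] for k in range(row + 1, n))
        z[row] = (m[row][n] - tail) / m[row][row]
    return z

def mahalanobis(x, y, cov):
    """d(x, y) = sqrt((x - y)^T cov^{-1} (x - y))."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    z = solve(cov, diff)                           # z = cov^{-1} (x - y)
    return math.sqrt(sum(d * zi for d, zi in zip(diff, z)))
```

With an identity covariance matrix the result reduces to the Euclidean distance; a larger variance along one axis shrinks the contribution of differences along that axis.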
Step 7: The image semantic manifold converter realizes an enhanced version of the r space. During algorithmic inference, the embedding of a new image is inferred, where embedding means a vector representation. Unlike inference in the original r space, inference here operates in a space in which the random noise vector r is modeled independently for each of feature layers No. 1 to No. 5; the resulting space is named the enhanced space. In the subsequent inference process, embedding inference for the pixel-level rendering module is performed in this enhanced space, as described in the next step.
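The difference between the original space and the enhanced space can be sketched as follows, assuming that in the original space one vector r is shared by all five feature layers while in the enhanced space each layer receives its own vector. The function names and the vector dimension are illustrative.

```python
import random

NUM_FEATURE_LAYERS = 5   # feature layers No. 1 to No. 5 in the source

def broadcast_code(r):
    """Shared code: the same r is fed to every feature layer."""
    return [list(r) for _ in range(NUM_FEATURE_LAYERS)]

def enhanced_code(per_layer):
    """Enhanced code: one independently modeled r per feature layer."""
    if len(per_layer) != NUM_FEATURE_LAYERS:
        raise ValueError("need exactly one vector per feature layer")
    return [list(r) for r in per_layer]

rng = random.Random(0)
r = [rng.gauss(0.0, 1.0) for _ in range(8)]
shared = broadcast_code(r)
r_plus = enhanced_code([[rng.gauss(0.0, 1.0) for _ in range(8)]
                        for _ in range(NUM_FEATURE_LAYERS)])
```

The enhanced code has five times as many free parameters, which gives the inversion in the next step more room to reproduce a given target image.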
Step 8: The rendering-model-based image semantic segmentation algorithm is deployed and used for inference in production. Given a target image of an irregular material in the smart factory production scene, the optimal pixel-level label is inferred. The target image is first embedded into the embedding space of the pixel-level rendering module: the image semantic manifold converter maps the target image to the enhanced space, and the inverted estimate is then obtained as follows: [formula missing in source],
where the first term optimizes the reconstruction quality of the given irregular-material target image and the second term regularizes the embedding vector of the target image so that its trajectory stays within the training domain. The image semantic manifold converter serves as an approximate inverse of the pixel-level rendering module. The specific meaning of the reconstruction term is described in the next step.
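The two-term inversion described in this step can be illustrated on a toy problem. The sketch assumes a linear stand-in for the frozen rendering module, a squared-error reconstruction term in place of the patent's reconstruction term, and a squared-distance regularizer weighted by β that keeps the code near the converter's initial estimate; none of these specific forms is confirmed by the source.

```python
def render(weight, r):
    """Toy stand-in for the frozen pixel-level rendering module: u = W r."""
    return [sum(w * ri for w, ri in zip(row, r)) for row in weight]

def objective(weight, r, target, anchor, beta):
    """Reconstruction error plus beta times the distance from the converter's estimate."""
    recon = sum((o - t) ** 2 for o, t in zip(render(weight, r), target))
    reg = sum((ri - ai) ** 2 for ri, ai in zip(r, anchor))
    return recon + beta * reg

def invert(weight, target, anchor, beta=0.1, lr=0.05, steps=200):
    """Gradient descent on the objective, starting from the converter's estimate."""
    r = list(anchor)
    for _ in range(steps):
        residual = [o - t for o, t in zip(render(weight, r), target)]
        grad = [2.0 * sum(weight[i][j] * residual[i] for i in range(len(weight)))
                + 2.0 * beta * (r[j] - anchor[j]) for j in range(len(r))]
        r = [ri - lr * gi for ri, gi in zip(r, grad)]
    return r

W = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
target = render(W, [0.5, -1.0])     # image produced by a known code
encoder_guess = [0.3, -0.8]         # hypothetical output of the manifold converter
r_hat = invert(W, target, encoder_guess)
```

The optimized code lands close to the code that generated the target while staying near the converter's estimate, which mirrors the roles the source assigns to the two terms.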
Step 9: The inferred irregular-material target image and its semantic segmentation quasi-annotation mask label are based on the reconstruction term in the formula of Step 8, which combines the learned perceptual image patch similarity (LPIPS) with a per-pixel Mahalanobis distance term on the irregular-material target image. Its specific form is as follows: [formula missing in source],
where δ is an independent hyperparameter with value range (0, 1). The vector obtained by inversion is passed back to the pixel-level rendering module, which outputs the inferred irregular-material target image and the inferred semantic segmentation quasi-annotation mask label; these two outputs are the inference result. The distance used in the reconstruction term is the LPIPS distance.
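Since the specific form of the reconstruction term is not preserved in this text, the sketch below assumes one plausible reading: a convex combination, weighted by δ, of a perceptual distance and a per-pixel Mahalanobis distance. The patch-mean feature function is a hypothetical stand-in for the pre-trained deep features that LPIPS relies on, and the scalar variance replaces a full covariance matrix.

```python
import math

def patch_features(image, size=2):
    """Hypothetical stand-in for deep features: mean intensity of each size x size patch."""
    return [sum(image[i + a][j + b] for a in range(size) for b in range(size)) / size ** 2
            for i in range(0, len(image) - size + 1, size)
            for j in range(0, len(image[0]) - size + 1, size)]

def perceptual_distance(x, y):
    """Mean squared difference between patch features of the two images."""
    fx, fy = patch_features(x), patch_features(y)
    return sum((a - b) ** 2 for a, b in zip(fx, fy)) / len(fx)

def per_pixel_mahalanobis(x, y, variance):
    """Mean over pixels of sqrt((x - y)^2 / variance), a one-dimensional Mahalanobis distance."""
    total = sum(math.sqrt((a - b) ** 2 / variance)
                for row_x, row_y in zip(x, y) for a, b in zip(row_x, row_y))
    return total / (len(x) * len(x[0]))

def reconstruction_term(x, y, delta=0.5, variance=1.0):
    """Assumed form: delta * perceptual + (1 - delta) * per-pixel Mahalanobis."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in the open interval (0, 1)")
    return (delta * perceptual_distance(x, y)
            + (1.0 - delta) * per_pixel_mahalanobis(x, y, variance))

zeros = [[0.0] * 4 for _ in range(4)]
ones = [[1.0] * 4 for _ in range(4)]
value = reconstruction_term(zeros, ones)
```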
As shown in Fig. 2, in the image semantic distribution space of regular materials in the factory production scene, the targets to be detected have a regular geometric appearance (such as circles, ellipses, squares and rectangles). From the perspective of semantic segmentation, these appearance features are abstracted into a finite set of semantic features, so the geometric appearance and geometric size have a definite statistical distribution corresponding to the statistical distribution space of those features; this space is therefore named the image semantic distribution space of regular materials. Each type of geometric semantic feature in the image is embedded in this space and represented by a random variable vector h. Taking h as input, p(h) is converted into a new spatial distribution p(r) through a fully connected network, a fully convolutional network and a 1 × 1 convolutional layer, where r is the random variable vector embedding each type of geometric semantic feature in the irregular-material image. Fig. 3 shows the image semantic distribution space of irregular materials in the smart factory production scene, in which r is the random variable. The conversion criterion is that p(r) better fits, or approximates, the statistical distribution of the semantic features of irregular-material images. After the affine transformation, the newly distributed noise variables r are fed to the feature layers of the pixel-level rendering module (see Fig. 4).
The module outputs an image and a semantic segmentation quasi-annotation mask label, which lie respectively in the output image distribution space and in the quasi-annotation mask label distribution space of irregular-material images. The form is [formula missing in source].
In the inversion objective, the first term further optimizes the reconstruction quality of the given irregular-material input image, and the second term constrains rstr so that the embedding stays within the training domain space; the image semantic manifold converter is trained to approximate the inverse of the pixel-level rendering module. Regularization weighted by the hyperparameter β helps preserve the quality of the rendered quasi-annotation mask labels when inference is run on irregular-material images outside the training domain space (the unlabeled data used in semi-supervised learning).
During deployment and inference in production, the semantic segmentation quasi-annotation mask label of an irregular-material image is obtained through the image semantic manifold converter network and test-time optimization: the target irregular-material image is embedded into the joint latent semantic space, and the quasi-annotation mask label is then rendered from the inferred embedding vector.
The inferred semantic segmentation quasi-annotation mask label of the irregular material target image is based on a reconstruction term that combines the learned perceptual image patch similarity (LPIPS) with a per-pixel Mahalanobis distance term of the irregular material target image, where δ is an independent hyper-parameter in the range (0, 1). The embedding obtained by inversion is passed back to the pixel-level rendering module to obtain the reconstructed image and its label. Because the embedding is optimized to minimize the reconstruction error between the target image and the rendered image, the rendered image is approximately equal to the target image. In addition, because the pixel-level rendering module is trained to align the synthesized segmentation labels with the images, the rendered label is the best label for the reconstructed image. The image semantic segmentation quasi-annotation mask label finally output by the model is therefore the iteratively optimized image semantic segmentation of the irregular material target image. The invention thus realizes semi-supervised transfer learning by combining a large number of unlabeled irregular material sample images with a small number of labeled samples, and solves the problem of semantic segmentation edge detection for irregular materials when labeled data are scarce (the small-sample property of labeled data).
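The combination of a perceptual patch term and a per-pixel Mahalanobis term can be sketched as below. Since the exact formula is not reproduced in this text, the convex weighting δ · perceptual + (1 − δ) · Mahalanobis is only one plausible reading. The patch-mean features are a crude stand-in for the deep features used by LPIPS, and every function name here is hypothetical.

```python
import numpy as np

def patch_features(img, patch=4):
    """Stand-in for deep features: mean colour of each non-overlapping patch."""
    h, w, c = img.shape
    img = img[: h - h % patch, : w - w % patch]
    return img.reshape(h // patch, patch, w // patch, patch, c).mean(axis=(1, 3))

def perceptual_distance(a, b, patch=4):
    """Mean squared difference between patch features (not a real LPIPS model)."""
    return float(np.mean((patch_features(a, patch) - patch_features(b, patch)) ** 2))

def per_pixel_mahalanobis(a, b, cov):
    """Average over pixels of sqrt(d^T cov^{-1} d), with d the per-pixel colour difference."""
    d = (a - b).reshape(-1, a.shape[-1])
    sq = np.einsum("nc,nc->n", d @ np.linalg.inv(cov), d)
    return float(np.mean(np.sqrt(np.maximum(sq, 0.0))))

def reconstruction_loss(a, b, cov, delta=0.5):
    """Assumed form: delta * perceptual + (1 - delta) * Mahalanobis, with delta in (0, 1)."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return delta * perceptual_distance(a, b) + (1.0 - delta) * per_pixel_mahalanobis(a, b, cov)

target = np.zeros((8, 8, 3))
rendered = np.ones((8, 8, 3))
loss = reconstruction_loss(target, rendered, np.eye(3), delta=0.5)
```

With an identity covariance the Mahalanobis term reduces to the per-pixel Euclidean distance, which makes the toy value easy to check by hand.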
The foregoing describes only some embodiments of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and such modifications and refinements should also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. An image semantic segmentation method for irregular materials is characterized by comprising the following steps:
step S1, constructing a training data set comprising two sub data sets: an unlabeled irregular material image set and a labeled irregular material image set;
step S2, constructing and training a pixel-level rendering module, whose input during training is a random variable in vector form and whose outputs are a semantic segmentation quasi-annotation mask and an image; the input random variable is obtained by converting the probability distribution of random variables in the regular material image semantic space into a new spatial distribution through a deep network, the conversion criterion being that the new distribution better conforms to the image semantic distribution of irregular materials in the production scene;
step S3, constructing an irregular material rendered image discriminator, applied to real irregular material images from the production scene and to rendered irregular material images; and constructing an irregular material rendered image-label pair discriminator for distinguishing rendered irregular material image-label pairs from real production scene irregular material image-label pairs;
step S4, setting an image semantic manifold converter that maps images of irregular materials in the production scene to embedded random variables, giving an embedded enhanced space, and, in combination with the labeled irregular material images, rendering the corresponding semantic segmentation quasi-annotation mask labels of the irregular material images; in the inference process, the embedding of a new image is computed first, where an embedding is a vector representation;
step S5, given a target image of an irregular material, embedding the target image into the embedded enhanced space of the pixel-level rendering module: the image semantic manifold converter maps the target image to the embedded enhanced space, an inversion estimate of the enhanced space vector is computed, and the optimal pixel-level label is obtained by inference;
step S6, combining the learned perceptual image patch similarity with the per-pixel Mahalanobis distance term of the irregular material target image, and passing the inversion estimate of the enhanced space vector back to the pixel-level rendering module to obtain the image semantic segmentation result of the irregular material, namely the semantic segmentation quasi-annotation mask label of the target image.
2. The method for semantically segmenting the image of the irregular material according to claim 1, wherein in step S1 a training data set comprising two sub data sets $T_a$ and $T_b$ is constructed; the data set $T_a$ is the unlabeled irregular material image set, $T_a = \{u_1, \ldots, u_J\}$, where $u_j$ denotes each irregular material image, $j = 1, 2, \ldots, J$; the data set $T_b$ is the irregular material image set after data cleaning, in which each image carries a semantic segmentation label, $T_b = \{(u_1, v_1), \ldots, (u_L, v_L)\}$, where $u_l$ denotes each irregular material image and $v_l$ denotes the semantic segmentation annotation mask label corresponding to image $u_l$, $l = 1, 2, \ldots, L$.
3. The method according to claim 1, wherein the pixel-level rendering module in step S2 has a residual skip-connection design with 5 cascaded feature layers in sequence from the input end, and an additional branch is added at each feature layer to output the semantic segmentation quasi-annotation mask v and the image u.
4. The method according to claim 3, wherein training takes a random noise space as input, whose random variable is denoted h and obeys a standard normal distribution, $h \sim \mathcal{N}(0, I)$; p(h) is first converted into a new spatial distribution p(r) through the fully connected network, the fully convolutional network and the 1 × 1 convolutional layer, the conversion criterion being that the spatial distribution p(r) better conforms to, or approximates, the image semantic distribution of irregular materials in the production scene of the intelligent factory; after affine transformation, the newly adjusted noise variables r are fed to the feature layers of the pixel-level rendering module; the module outputs an image and a semantic segmentation quasi-annotation mask label, drawn respectively from the output image distribution space and from the image semantic segmentation quasi-annotation mask label distribution space of irregular materials.
5. The method for semantically segmenting the image of the irregular material according to claim 1, wherein the objective function employed in training the pixel-level rendering module has the form [formula not reproduced].
6. The method for image semantic segmentation of irregular material according to claim 1, wherein in step S3 an irregular material rendered image discriminator is constructed as a real-valued function applied to the real irregular material images of the production scene and to the rendered irregular material images; the discriminator drives the pixel-level rendering module to construct realistic images and is implemented with a residual structure; the objective function employed in training the discriminator has the form [formula not reproduced].
7. The method for image semantic segmentation of irregular material according to claim 1, wherein in step S3 an irregular material rendered image-label pair discriminator is constructed; it takes the image concatenated with the semantic segmentation quasi-annotation mask label as input, distinguishes rendered irregular material image-label pairs from real production scene irregular material image-label pairs, and adopts a multi-scale discriminator architecture; the objective function employed in training this discriminator has the form [formula not reproduced].
8. The method for image semantic segmentation of irregular materials according to claim 1, wherein in step S4 the image semantic manifold converter uses a feature pyramid network as a backbone to extract multi-level features and maps these features to the embedding space with a fully convolutional network; when training the image semantic manifold converter, the model parameters of the pixel-level rendering module must be fixed, i.e. the pixel-level rendering module does not take part in training at this stage; the training objective function of the image semantic manifold converter has the form [formula not reproduced],
where $D_{KL}$ denotes the KL divergence; the pixel-wise KL divergence losses are summed over all pixels of the irregular material image, and the first term represents the supervised loss. For discrete events, the KL divergence is defined as $D_{KL}(P_A \,\|\, P_B) = \sum_i P_A(x_i) \log \frac{P_A(x_i)}{P_B(x_i)}$, where A and B are two random variables with probability distributions $P_A(x_i)$ and $P_B(x_i)$, and i indexes the values of the discrete random variable. The sum of the last two terms of the objective function represents the unsupervised loss, where α is a parameter in the range [0, 1] determined by the image semantic distribution of irregular materials; the objective involves the irregular material image rendering backbone of the pixel-level rendering module and its irregular material image semantic segmentation quasi-annotation mask label rendering branch. The learned perceptual image patch similarity (LPIPS) distance measures distance in the feature space of a deep residual network pre-trained on ImageNet, and the Mahalanobis distance between data points x and y is $d_M(x, y) = \sqrt{(x - y)^T \Sigma^{-1} (x - y)}$, where Σ is the covariance matrix of the multidimensional random variable and T denotes transposition.
9. The method for semantic image segmentation of irregular materials as claimed in claim 1, wherein in step S5, given a target image of an irregular material in a production scenario of a smart factory, the optimal pixel-level label is derived as follows: the target image is first embedded into the embedding space of the pixel-level rendering module, the image semantic manifold converter mapping the image to that space; after inversion, the embedding estimate is obtained in the form [formula not reproduced].
The first term optimizes the reconstruction quality of the given irregular material target image; the second term keeps the trajectory of the irregular material target image embedding vector within the training domain, the image semantic manifold converter being used to approximate the inverse of the pixel-level rendering module.
10. The method for image semantic segmentation of irregular materials according to claim 9, characterized in that the inferred value of the image semantic segmentation quasi-annotation mask label of the irregular material target image is based on a reconstruction term that combines the learned perceptual image patch similarity with the per-pixel Mahalanobis distance term of the irregular material target image; the specific form is as follows: [formula not reproduced],
where δ is an independent hyper-parameter in the range (0, 1); the embedding obtained by inversion is passed back to the pixel-level rendering module, whose two outputs are the inferred irregular material target image and the inferred image semantic segmentation quasi-annotation mask label, and the remaining term is the learned perceptual image patch similarity distance.
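The two distance definitions given in claim 8, the discrete KL divergence and the Mahalanobis distance, can be checked numerically with a short sketch. The function names, the small `eps` added inside the logarithms for numerical safety, and the choice of KL(target || prediction) as the direction of the pixel-wise supervised loss are assumptions; the claims do not fix them.

```python
import numpy as np

def kl_divergence(p_a, p_b, eps=1e-12):
    """Discrete KL divergence: sum_i P_A(x_i) * log(P_A(x_i) / P_B(x_i))."""
    p_a = np.asarray(p_a, dtype=float)
    p_b = np.asarray(p_b, dtype=float)
    return float(np.sum(p_a * (np.log(p_a + eps) - np.log(p_b + eps))))

def pixelwise_kl(target, pred, eps=1e-12):
    """Sum over all pixels of KL(target || pred) for (H, W, C) class-probability maps."""
    target = np.asarray(target, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return float(np.sum(target * (np.log(target + eps) - np.log(pred + eps))))

def mahalanobis(x, y, cov):
    """sqrt((x - y)^T cov^{-1} (x - y)) for a covariance matrix cov."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(d @ np.linalg.solve(cov, d)))

kl_same = kl_divergence([0.5, 0.5], [0.5, 0.5])
kl_diff = kl_divergence([1.0, 0.0], [0.5, 0.5])
dist_euclid = mahalanobis([0.0, 0.0], [3.0, 4.0], np.eye(2))
dist_scaled = mahalanobis([2.0, 0.0], [0.0, 0.0], np.diag([4.0, 1.0]))
```

With an identity covariance the Mahalanobis distance equals the Euclidean distance; a larger variance along one axis shrinks distances measured along that axis.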
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210331353.4A CN114627297A (en) | 2022-03-30 | 2022-03-30 | Image semantic segmentation method for irregular material of transfer learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114627297A (en) | 2022-06-14 |
Family
ID=81903719
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210331353.4A Withdrawn CN114627297A (en) | 2022-03-30 | 2022-03-30 | Image semantic segmentation method for irregular material of transfer learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114627297A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111738310A (en) * | 2020-06-04 | 2020-10-02 | 科大讯飞股份有限公司 | Material classification method and device, electronic equipment and storage medium |
CN114067168A (en) * | 2021-10-14 | 2022-02-18 | 河南大学 | Cloth defect image generation system and method based on improved variational self-encoder network |
Non-Patent Citations (1)
Title |
---|
DAIQING LI ET AL.: "Semantic Segmentation with Generative Models:Semi-Supervised Learning and Strong Out-of-Domain Generalization", 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), pages 8296 - 8307 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 20220614 |