CN114723707A - Complex texture and pattern color difference detection method based on self-supervision contrast learning - Google Patents

Complex texture and pattern color difference detection method based on self-supervision contrast learning

Info

Publication number
CN114723707A
Authority
CN
China
Prior art keywords
btk
color difference
data set
image
enhanced data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210362124.9A
Other languages
Chinese (zh)
Inventor
程良伦 (Cheng Lianglun)
曾炜峰 (Zeng Weifeng)
黄国恒 (Huang Guoheng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN202210362124.9A
Publication of CN114723707A
Legal status: Pending

Classifications

    • G06T 7/0004 Industrial image inspection
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N 3/045 Combinations of networks
    • G06N 3/048 Activation functions
    • G06N 3/08 Learning methods
    • G06T 7/40 Analysis of texture
    • G06T 7/90 Determination of colour characteristics
    • G06T 2207/10024 Color image
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30144 Printing quality
    • Y02P 90/30 Computing systems specially adapted for manufacturing


Abstract

The invention relates to the technical field of industrial detection and discloses a complex texture and pattern color difference detection method based on self-supervision contrast learning, which comprises the following steps: S1, constructing a labeled original data set; S2, performing data enhancement on the original data set to obtain an enhanced data set; S3, extracting image features of the enhanced data set through an encoder; S4, projecting the image features of the enhanced data through a projection network to obtain embedded vectors; S5, calculating the similarity between different images in the enhanced data set and the pre-training contrast loss according to the obtained embedded vectors, and improving the encoder according to the contrast loss; and S6, connecting a classification network after the improved encoder in place of the projection network, re-extracting the image features through the improved encoder, and performing classification detection on the images in the enhanced data set. The invention solves the problems of high subjectivity and frequent false detection of adjacent color difference levels in the prior art.

Description

Complex texture and pattern color difference detection method based on self-supervision contrast learning
Technical Field
The invention relates to the technical field of industrial detection, in particular to a complex texture and pattern color difference detection method based on self-supervision contrast learning.
Background
Methods for evaluating the color difference quality of color printed products mainly include the visual method, the density method and the colorimetric method. The subjective visual inspection method is a print evaluation method that makes judgments based on human visual perception. It relies mainly on direct comparison between the set standard printed sheets and the production samples, analyses the visually perceived color difference between them, and observes the dot shape changes and dot overprinting of the various colors with a magnifying tool to make a comprehensive evaluation. The density detection method is based on the thickness of the ink layer of the printed matter; the density value directly reflects the reflectivity of the print, so the depth of the printed color and the thickness of the ink layer can be judged directly, which in turn guides the adjustment and control of printing production. For a long time the density detection method has been widely used by printing enterprises for quality inspection, but readings from different instruments differ noticeably, so the method is not well suited for wide use. The colorimetric detection method measures the chromaticity information of a printed matter and is a basic research tool for printing enterprises to use, measure and describe object colors. Chromaticity detection is not affected by subjective factors and can display its measurements objectively, but it cannot be directly associated with ink layer thickness, dot variation and the like, so the measured data cannot be used directly to guide production. The density and colorimetric detection methods overcome many problems of the subjective method and establish a similarity measurement reference. However, most industrial prints carry complex textures or patterns; density and chromaticity detection rely on fixed model libraries for their calculations and cannot meet the required universality and accuracy, and both methods can only take readings over roughly a 10-square-millimetre area of a printed sample, so a color difference detection model for complex texture and pattern images based on a saliency algorithm needs to be constructed. At present, corrugated-paper digital printers can print images at resolutions up to 1200 × 600; the pattern textures are fine and complex, and if tasks such as color difference detection and graded color separation are carried out with traditional template matching, the processing speed is low and the precision is low.
To address this problem, an existing self-supervised image classification method based on contrast learning comprises the following steps: S1, acquiring unlabeled data and randomly enhancing it to generate different views; S2, extracting features of the views and computing an unsupervised contrast loss to obtain an unsupervised classification model C1; S3, manually labeling part of the unlabeled data to serve as a training and verification set; S4, taking C1 as a pre-training model and fine-tuning it on the training and verification set; S5, extracting features of the training and verification set and computing a supervised contrast loss to obtain C2; S6, predicting labels of the unlabeled data with C2 and screening the data whose confidence exceeds a preset value as training samples; S7, based on these training samples and with C2 as the pre-training model, selecting a small network for training and fine-tuning, and taking the model with the highest verification accuracy as the optimal classification model C3.
However, existing color difference detection methods suffer from high subjectivity and frequent false detection between adjacent color difference levels, so how to devise an objective and highly accurate color difference detection method is a problem that urgently needs to be solved in this technical field.
Disclosure of Invention
The invention provides a complex texture and pattern color difference detection method based on self-supervision contrast learning, aiming at solving the problems of high subjectivity and frequent false detection of adjacent color difference levels in the prior art, and the method has the characteristics of objectivity and high accuracy.
In order to achieve the purpose of the invention, the technical scheme is as follows:
a complex texture and pattern color difference detection method based on self-supervision contrast learning comprises the following steps:
s1, obtaining an original image, and constructing an original data set with a label according to a color difference grading standard;
s2, performing data enhancement on the original data set to obtain an enhanced data set;
s3, pre-training is started, and image features of the enhanced data set are extracted through an encoder;
s4, projecting the image characteristics of the enhanced data through a projection network to obtain an embedded vector;
s5, calculating the similarity between different images in the enhanced data set and the contrast loss of pre-training according to the obtained embedded vector, improving an encoder according to the contrast loss, and ending the pre-training;
and S6, connecting a classification network after the improved encoder in place of the projection network, re-extracting the image features through the improved encoder, and performing classification detection on the images in the enhanced data set.
Preferably, in step S1, the specific steps are:
s101, collecting sample image data sets with the number of N and the resolution ratio of w multiplied by h from an object to be analyzed;
S102, marking each sample image according to the color difference grade division standard, and recording the original data set as I = {I1, I2, ..., IN}.
Further, in step S2, the specific steps are:
S201, dividing the original data set into a plurality of batches according to the batch size n, and performing data enhancement on the original sample images batch by batch through the transformation function RT:
RT = random(flip, rotate, crop, zoom)
where random denotes a random selection function, flip denotes a flip transformation, rotate denotes a rotation transformation, crop denotes a cropping transformation and zoom denotes a scaling transformation;
S202, for each sample image Ii in each batch, randomly selecting two enhancement transformation types through the random function for enhancement;
and S203, outputting an enhanced data set containing 2N images with resolution w × h (a minimal code sketch of this augmentation step is given below).
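The following is a minimal sketch of one way steps S201-S203 could be implemented; it is an illustration only, assuming torchvision-style transforms on square images, and the transform parameters (rotation range, crop scale, zoom size) are arbitrary choices rather than values taken from the invention.

import random
from torchvision import transforms

# the four candidate enhancement transformations of RT = random(flip, rotate, crop, zoom)
FLIP   = transforms.RandomHorizontalFlip(p=1.0)
ROTATE = transforms.RandomRotation(degrees=15)
CROP   = transforms.RandomResizedCrop(size=576, scale=(0.8, 1.0))
ZOOM   = transforms.Compose([transforms.Resize(640), transforms.CenterCrop(576)])
CANDIDATES = [FLIP, ROTATE, CROP, ZOOM]

def augment_pair(image):
    # step S202: randomly select two enhancement transformation types for one sample image Ii
    return [random.choice(CANDIDATES)(image) for _ in range(2)]

# applying augment_pair to every image of every batch yields the 2N enhanced images of step S203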
Further, the encoder extracts image features of the enhanced data set, and the specific steps are as follows:
A01. Pass the input image with resolution w × h and Cinput channels through a convolution layer to obtain the tensor C1 with resolution (w/2) × (h/2) and 64 channels:
C1 = Conv7_2(Cinput)
where Conv7_2 denotes a convolution layer with convolution kernel size 7 and convolution kernel stride 2;
A02. Pass the tensor C1 through a batch normalization layer and an activation function layer in turn to obtain the feature map C2:
C2 = ActReLU(BN(C1))
where BN denotes the batch normalization layer and ActReLU denotes the activation function;
A03. Pass the feature map C2 through a max pooling layer to obtain the feature map C3 of shape (64, w/4, h/4):
C3 = MaxP3_2(C2)
where MaxP3_2 denotes a max pooling layer with pooling kernel size 3 and pooling kernel stride 2;
A04. Pass the feature map C3 through the four residual learning stages Stage1, Stage2, Stage3 and Stage4 in turn to obtain the image feature vector H:
Stage1: C4 = BTK2(BTK2(BTK1(C3)))
Stage2: C5 = BTK2(BTK2(BTK2(BTK1(C4))))
Stage3: C6 = BTK2(BTK2(BTK2(BTK2(BTK2(BTK1(C5))))))
Stage4: H = BTK2(BTK2(BTK1(C6)))
where BTK1() denotes the type I bottleneck block of the residual structure, BTK2() denotes the type II bottleneck block of the residual structure, C4, C5 and C6 are the feature maps obtained in residual learning stages Stage1, Stage2 and Stage3 respectively, and H is the m-dimensional feature vector obtained in residual learning stage Stage4 (a simplified code sketch of this encoder follows).
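The sketch below is one possible realisation of steps A01-A04, offered only as an illustration: the stem (7 × 7 convolution with stride 2, batch normalization, ReLU and 3 × 3 max pooling with stride 2) and the 3/4/6/3 block layout of the four stages coincide with a standard ResNet-34 backbone, so a torchvision ResNet-34 with its classification head removed is used here as a stand-in for the encoder; the exact internals of the BTK1/BTK2 blocks are not reproduced.

import torch
import torch.nn as nn
from torchvision.models import resnet34

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet34(weights=None)   # untrained, as at the start of pre-training
        backbone.fc = nn.Identity()         # drop the ImageNet classifier, keep the features
        self.backbone = backbone

    def forward(self, x):                   # x: (batch, Cinput, w, h)
        return self.backbone(x)             # H: (batch, m), with m = 512 for this stand-in

# example: a batch of 576 x 576 RGB images yields 512-dimensional feature vectors H
H = Encoder()(torch.randn(4, 3, 576, 576))
print(H.shape)  # torch.Size([4, 512])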
Further, the image features of the enhanced data are projected through a projection network to obtain the embedded vector, specifically: H is input into a projection network that realizes a nonlinear transformation, and the process is expressed as:
z = FC(ActReLU(Dense(H)))
where FC is a fully connected layer and z is the embedded vector obtained from the projection.
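A minimal sketch of the projection network z = FC(ActReLU(Dense(H))) follows; the hidden and output widths (512 and 128) are illustrative assumptions, not values specified by the invention.

import torch.nn as nn

class ProjectionHead(nn.Module):
    def __init__(self, in_dim=512, hidden_dim=512, out_dim=128):
        super().__init__()
        self.dense = nn.Linear(in_dim, hidden_dim)    # Dense(H)
        self.act   = nn.ReLU(inplace=True)            # ActReLU
        self.fc    = nn.Linear(hidden_dim, out_dim)   # FC, producing the embedded vector z

    def forward(self, h):
        return self.fc(self.act(self.dense(h)))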
Furthermore, the similarity between different images in the enhanced data set and the pre-training contrast loss are calculated from the obtained embedded vectors, specifically:
B01. For the embedded vectors zi and zj of any two enhanced images Ai and Aj in the enhanced data set, calculate the cosine similarity between Ai and Aj:
s(i, j) = (zi · zj) / (σ‖zi‖‖zj‖)
where σ is an adjustable parameter used to scale the input and expand the range of the cosine similarity beyond [-1, 1], and ‖zi‖ and ‖zj‖ denote the moduli of the embedded vectors;
B02. Pair the cosine similarities of matching views and calculate the noise contrastive estimation loss l(i, j):
l(i, j) = -log[ exp(s(i, j)) / Σk=1..2n 1[k≠i] exp(s(i, k)) ]
where k is a summation index and 1[k≠i] equals 1 when k ≠ i and 0 otherwise;
B03. With the image positions interchanged, calculate the contrastive estimation loss l(j, i) for the same pair of images again:
l(j, i) = -log[ exp(s(j, i)) / Σk=1..2n 1[k≠j] exp(s(j, k)) ]
B04. Calculate the losses of all pairs in a batch of size n and average them to obtain the contrast loss LN:
LN = (1/2n) Σk=1..n [ l(2k-1, 2k) + l(2k, 2k-1) ]
B05. Update the encoder according to LN, fix the encoder weights, and finish the pre-training (a code sketch of this loss calculation is given below).
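The following is a minimal sketch of steps B01-B04 (not the authors' own code): it assumes the 2n embedded vectors are arranged so that rows 2k-1 and 2k come from the two views of the same original sample, and uses σ = 0.1 purely as an example value.

import torch
import torch.nn.functional as F

def contrast_loss(z, sigma=0.1):
    # z: (2n, d) embedded vectors; returns the averaged contrast loss LN
    z = F.normalize(z, dim=1)                     # divide each zi by its modulus ||zi||
    s = (z @ z.t()) / sigma                       # s(i, j): scaled cosine similarity, step B01
    two_n = z.shape[0]
    s = s.masked_fill(torch.eye(two_n, dtype=torch.bool), float('-inf'))  # drop the k = i terms
    partner = torch.arange(two_n) ^ 1             # 0<->1, 2<->3, ...: index of the paired view
    log_prob = F.log_softmax(s, dim=1)            # log[ exp(s(i,j)) / sum_{k != i} exp(s(i,k)) ]
    return -log_prob[torch.arange(two_n), partner].mean()   # average of l(i, j) and l(j, i), B04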
Furthermore, the classification network comprises three Dense blocks, where each Dense block is a module containing several layers whose feature maps all have the same size; in the classification network, the image features of the enhanced data first pass through a convolution layer, then through the three Dense blocks in turn, and finally through a pooling layer and a linear function layer to output the result, with two adjacent Dense blocks connected by a convolution layer and a pooling layer.
Furthermore, the process of passing the image feature x of the enhanced data through a Dense block is:
f1 = Conv1_1(ActReLU(BN(x)))
f2 = Conv3_1(ActReLU(BN(f1)))
where Conv1_1 and Conv3_1 denote a convolution layer with convolution kernel size 1 and stride 1 and a convolution layer with convolution kernel size 3 and stride 1 respectively, f1 is the feature map after the 1 × 1 convolution, and f2 is the feature map after the 3 × 3 convolution (a code sketch of one such Dense block layer follows).
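A minimal sketch of one Dense block built from the layer above is given next; it is an interpretation under DenseNet-style assumptions (dense concatenation between layers, an assumed growth rate of 32 and four layers per block), none of which are fixed by the invention.

import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    def __init__(self, in_ch, growth=32):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.conv1 = nn.Conv2d(in_ch, 4 * growth, kernel_size=1, stride=1, bias=False)
        self.bn2 = nn.BatchNorm2d(4 * growth)
        self.conv2 = nn.Conv2d(4 * growth, growth, kernel_size=3, stride=1, padding=1, bias=False)

    def forward(self, x):
        f1 = self.conv1(torch.relu(self.bn1(x)))    # f1 = Conv1_1(ActReLU(BN(x)))
        f2 = self.conv2(torch.relu(self.bn2(f1)))   # f2 = Conv3_1(ActReLU(BN(f1)))
        return torch.cat([x, f2], dim=1)            # dense connection: pass x and f2 onwards

class DenseBlock(nn.Module):
    def __init__(self, in_ch, num_layers=4, growth=32):
        super().__init__()
        self.layers = nn.ModuleList(
            DenseLayer(in_ch + i * growth, growth) for i in range(num_layers))

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)          # feature map sizes stay the same inside the block
        return x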
Further, the images in the enhanced data set are classified and detected, the loss of the classification network is calculated, and the classification network is trained according to this loss; the loss function Ld of the classification network is:
Ld = -Σn=1..T yn log(pn)
where T is the number of color difference levels, yn is the label, which equals 1 if the color difference level of the current sample is n and 0 otherwise, and pn is the probability that the current sample has color difference level n.
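With one-hot labels yn, the loss Ld above is the standard cross-entropy over the T color difference levels; a minimal sketch follows, in which the batch size of 8 and T = 5 levels are arbitrary example values.

import torch
import torch.nn.functional as F

logits = torch.randn(8, 5)                 # classification network outputs: 8 samples, T = 5 levels
labels = torch.randint(0, 5, (8,))         # ground-truth color difference level of each sample
loss_d = F.cross_entropy(logits, labels)   # equals -log pn of the true level, averaged over the batch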
A complex texture and pattern color difference detection system based on self-supervision contrast learning comprises a data acquisition module, a data enhancement module, an encoder, a projection network module and a classification network module; the data acquisition module is used for acquiring original images and constructing a labeled original data set according to the color difference grading standard, the data enhancement module is used for performing data enhancement on the original data set to obtain an enhanced data set, the encoder is used for acquiring the image features of the enhanced data, the projection network module is used for performing the projection operation on the image features through the projection network, and the classification network module is used for classifying the color differences to obtain a color difference detection grading result.
The invention has the following beneficial effects:
According to the invention, after the original data set is collected and organized, it is enhanced; the encoder is pre-trained on the enhanced data set in combination with the projection operation, which improves the feature extraction capability of the encoder, and a classification network is then connected in place of the projection network, so that the detection of complex texture and pattern color differences is realized. This solves the problems of high subjectivity and frequent false detection of adjacent color difference levels in the prior art, and the method has the characteristics of objectivity and high accuracy.
Drawings
FIG. 1 is a schematic flow chart of the complex texture and pattern color difference detection method based on the self-supervision contrast learning.
Fig. 2 is a schematic flow chart of the encoder extracting image features of the enhanced data set in the complex texture and pattern color difference detection method based on the self-supervised contrast learning in embodiment 2.
Fig. 3 is a schematic diagram of a type I bottleneck block and a type II bottleneck block in the complex texture and pattern color difference detection method based on the self-supervision contrast learning in embodiment 2.
FIG. 4 is a schematic flow chart of a classification network in the complex texture and pattern color difference detection method based on the self-supervision contrast learning.
FIG. 5 is a schematic diagram of a Dense block of a classification network in the complex texture and pattern color difference detection method based on the self-supervised contrast learning.
Detailed Description
The invention is described in detail below with reference to the drawings and the detailed description.
Example 1
As shown in fig. 1, a method for detecting color difference of complex texture and pattern based on self-supervision contrast learning includes the following steps:
s1, obtaining an original image, and constructing an original data set with a label according to a color difference grading standard;
s2, performing data enhancement on the original data set to obtain an enhanced data set;
s3, pre-training is started, and image features of the enhanced data set are extracted through an encoder;
s4, projecting the image characteristics of the enhanced data through a projection network to obtain an embedded vector;
s5, calculating the similarity between different images in the enhanced data set and the contrast loss of pre-training according to the obtained embedded vector, improving an encoder according to the contrast loss, and ending the pre-training;
and S6, connecting a classification network after the improved encoder in place of the projection network, re-extracting the image features through the improved encoder, and performing classification detection on the images in the enhanced data set.
Example 2
As shown in fig. 1, a method for detecting color difference of complex texture and pattern based on self-supervision contrast learning includes the following steps:
s1, obtaining an original image, and constructing an original data set with a label according to a color difference grading standard;
s2, performing data enhancement on the original data set to obtain an enhanced data set;
s3, pre-training is started, and image features of the enhanced data set are extracted through an encoder;
s4, projecting the image characteristics of the enhanced data through a projection network to obtain an embedded vector;
s5, calculating the similarity between different images in the enhanced data set and the contrast loss of pre-training according to the obtained embedded vector, improving an encoder according to the contrast loss, and ending the pre-training;
and S6, connecting a classification network after the improved encoder in place of the projection network, re-extracting the image features through the improved encoder, and performing classification detection on the images in the enhanced data set.
In one embodiment, step S1 includes the following steps:
S101, collecting a sample image data set with N images of resolution w × h from the object to be analyzed; in this embodiment, 600 sample images with a resolution of 576 × 576 were collected from a printing production line.
S102, marking each sample image according to the color difference grade division standard, and recording the original data set as I = {I1, I2, ..., IN}.
Step S2, the specific steps are:
S201, dividing the original data set into a plurality of batches according to the batch size n, and performing data enhancement on the original sample images batch by batch through the transformation function RT:
RT = random(flip, rotate, crop, zoom)
where random denotes a random selection function, flip denotes a flip transformation, rotate denotes a rotation transformation, crop denotes a cropping transformation and zoom denotes a scaling transformation; in this embodiment, the original data set is divided into batches of size 6;
S202, for each sample image Ii in each batch, randomly selecting two enhancement transformation types through the random function for enhancement;
and S203, outputting an enhanced data set containing 2N images with resolution w × h.
In this embodiment, after each batch of the data set is input into the data enhancement module, two enhancement transformation types are selected for each sample image Ii in the batch through the random function. The transformation functions do not change the color difference characteristics of a sample, so the enhancement results of the same sample have the same color difference characteristics and are ultimately assigned the same color difference grade by the classification network. After all batches have undergone the first round of data enhancement, an enhanced data set of 1200 images with resolution 576 × 576 is output.
In this embodiment, the enhanced data set with resolution 576 × 576 and C channels is input into the untrained initial encoder, and the process can be expressed as:
H = Extract(A)
where Extract() obtains the image features of the sample data through the encoder, converting each enhanced image Aij into an m-dimensional vector Hij for output. The encoder used in the pre-training process comprises two deep learning network parts: a feature extraction module and a residual module.
As shown in fig. 2 and 3, the encoder extracts image features of the enhanced data set, and the specific steps are as follows:
A01. Pass the input image with resolution w × h and Cinput channels through a convolution layer to obtain the tensor C1 with resolution (w/2) × (h/2) and 64 channels:
C1 = Conv7_2(Cinput)
where Conv7_2 denotes a convolution layer with convolution kernel size 7 and convolution kernel stride 2;
A02. Pass the tensor C1 through a batch normalization layer and an activation function layer in turn to obtain the feature map C2:
C2 = ActReLU(BN(C1))
where BN denotes the batch normalization layer and ActReLU denotes the activation function;
A03. Pass the feature map C2 through a max pooling layer to obtain the feature map C3 of shape (64, w/4, h/4):
C3 = MaxP3_2(C2)
where MaxP3_2 denotes a max pooling layer with pooling kernel size 3 and pooling kernel stride 2;
A04. The residual module contains two different residual-structure bottleneck blocks. The type I bottleneck block has two important adjustable parameters λ and μ: λ controls whether down-sampling is performed and μ controls whether the number of channels is reduced, which is achieved by placing a convolution layer on the residual branch so that the numbers of input and output channels may differ; the residual branch of the type II bottleneck block has no convolution layer, so its output channel number stays the same as its input channel number. Taking the type I bottleneck block as the boundary, the whole residual module can be divided into four successive residual learning stages, and the feature map C3 passes through the four stages Stage1, Stage2, Stage3 and Stage4 in turn to obtain the image feature vector H:
Stage1: C4 = BTK2(BTK2(BTK1(C3)))
Stage2: C5 = BTK2(BTK2(BTK2(BTK1(C4))))
Stage3: C6 = BTK2(BTK2(BTK2(BTK2(BTK2(BTK1(C5))))))
Stage4: H = BTK2(BTK2(BTK1(C6)))
where BTK1() denotes the type I bottleneck block of the residual structure, BTK2() denotes the type II bottleneck block of the residual structure, C4, C5 and C6 are the feature maps obtained in residual learning stages Stage1, Stage2 and Stage3 respectively, and H is the m-dimensional feature vector obtained in residual learning stage Stage4. In this embodiment, Stage1 loops through 1 type I and 2 type II bottleneck blocks, Stage2 through 1 type I and 3 type II bottleneck blocks, Stage3 through 1 type I and 5 type II bottleneck blocks, and Stage4 through 1 type I and 2 type II bottleneck blocks.
In this embodiment, the first stage takes as input the feature map C3 of shape (64, 144, 144) output by the feature extraction module. Since the sample image has at this point only undergone the convolution and max pooling operations of the feature extraction module and no residual learning has yet taken place, direct down-sampling would lose a large amount of information; moreover, the number of input channels is only 64 and does not need to be reduced to improve learning efficiency, so λ and μ default to performing no operation. After the first-stage residual learning, the down-sampling and channel-reduction operations can be enabled directly in the following three stages, further improving the model learning efficiency (a code sketch of the two bottleneck blocks is given below).
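The sketch below is one possible reading of the two bottleneck blocks, offered as an assumption rather than the exact patented layers: in BTK1 the flag lam stands for λ (stride-2 down-sampling on or off) and mu stands for μ (channel reduction on or off), with a convolution on the residual branch, while BTK2 keeps an identity shortcut so its channel count is unchanged.

import torch
import torch.nn as nn

class BTK1(nn.Module):                                   # type I bottleneck block
    def __init__(self, in_ch, out_ch, lam=True, mu=True):
        super().__init__()
        stride = 2 if lam else 1                         # lambda: perform down-sampling or not
        mid = out_ch // 2 if mu else out_ch              # mu: reduce the channel number or not
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid, 3, stride, padding=1, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, out_ch, 3, 1, padding=1, bias=False),
            nn.BatchNorm2d(out_ch))
        self.shortcut = nn.Sequential(                   # convolution on the residual branch, so
            nn.Conv2d(in_ch, out_ch, 1, stride, bias=False),   # input/output channels may differ
            nn.BatchNorm2d(out_ch))

    def forward(self, x):
        return torch.relu(self.body(x) + self.shortcut(x))

class BTK2(nn.Module):                                   # type II bottleneck block
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, 1, padding=1, bias=False),
            nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, 1, padding=1, bias=False),
            nn.BatchNorm2d(ch))

    def forward(self, x):
        return torch.relu(self.body(x) + x)              # identity residual branch, channels unchanged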
The image features of the enhanced data are projected through a projection network to obtain the embedded vector, specifically: H is input into a projection network that realizes a nonlinear transformation, and the process is expressed as:
z=FC(ActReLU(Dense(H)))
where FC is a fully connected layer that extracts the correlation among the features and finally maps it into the output space, and z is the embedded vector obtained from the projection, used for the loss calculation of the pre-training.
In a specific embodiment, the similarity between different images in the enhanced data set and the pre-training contrast loss are calculated from the obtained embedded vectors, specifically:
B01. For the embedded vectors zi and zj of any two enhanced images Ai and Aj in the enhanced data set, calculate the cosine similarity between Ai and Aj:
s(i, j) = (zi · zj) / (σ‖zi‖‖zj‖)
where σ is an adjustable parameter used to scale the input and expand the range of the cosine similarity beyond [-1, 1], and ‖zi‖ and ‖zj‖ denote the moduli of the embedded vectors. The above formula is used to calculate the cosine similarity between every two enhanced images in a batch; ideally the similarity between enhanced images of the same color difference level is very high, that is, when the embedded vectors zi and zj correspond to the same sample, the calculated cosine similarity is higher.
B02. Pair the cosine similarities of matching views and calculate the noise contrastive estimation loss l(i, j):
l(i, j) = -log[ exp(s(i, j)) / Σk=1..2n 1[k≠i] exp(s(i, k)) ]
where k is a summation index and 1[k≠i] equals 1 when k ≠ i and 0 otherwise;
B03. With the image positions interchanged, calculate the contrastive estimation loss l(j, i) for the same pair of images again:
l(j, i) = -log[ exp(s(j, i)) / Σk=1..2n 1[k≠j] exp(s(j, k)) ]
B04. Calculate the losses of all pairs in a batch of size n and average them to obtain the contrast loss LN:
LN = (1/2n) Σk=1..n [ l(2k-1, 2k) + l(2k, 2k-1) ]
B05. Update the encoder according to LN, fix the encoder weights, and finish the pre-training. Driven by LN, the encoder and projection head representations improve over time, and the resulting representations place similar images at more similar locations in the embedding space (a sketch of this pre-training loop is given below).
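The following is a minimal sketch of the pre-training loop that ties the previous sketches together; the Encoder, ProjectionHead and contrast_loss definitions are the illustrative ones given earlier in this description, and the optimizer, learning rate, σ value and dummy data are assumptions for illustration only.

import torch

encoder, proj = Encoder(), ProjectionHead()
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(proj.parameters()), lr=1e-3)

dummy_batches = [torch.randn(12, 3, 576, 576) for _ in range(3)]   # each batch: 2n = 12 augmented views
for views in dummy_batches:
    z = proj(encoder(views))                  # steps S3-S4: features H, then embedded vectors z
    loss = contrast_loss(z, sigma=0.1)        # step S5: contrast loss LN
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

for p in encoder.parameters():                # step B05: fix the encoder weights after pre-training
    p.requires_grad = False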
Example 3
As shown in fig. 1, a method for detecting color difference of complex texture and pattern based on self-supervision contrast learning includes the following steps:
s1, obtaining an original image, and constructing an original data set with a label according to a color difference grading standard;
s2, performing data enhancement on the original data set to obtain an enhanced data set;
s3, pre-training is started, and image features of the enhanced data set are extracted through an encoder;
s4, projecting the image characteristics of the enhanced data through a projection network to obtain an embedded vector;
s5, calculating the similarity between different images in the enhanced data set and the contrast loss of pre-training according to the obtained embedded vector, improving an encoder according to the contrast loss, and ending the pre-training;
and S6, connecting a classification network after the improved encoder in place of the projection network, re-extracting the image features through the improved encoder, and performing classification detection on the images in the enhanced data set.
In one embodiment, step S1 includes the following steps:
s101, collecting sample image data sets with the number of N and the resolution ratio of w multiplied by h from an object to be analyzed;
S102, marking each sample image according to the color difference grade division standard, and recording the original data set as I = {I1, I2, ..., IN}.
Step S2, the specific steps are:
S201, dividing the original data set into a plurality of batches according to the batch size n, and performing data enhancement on the original sample images batch by batch through the transformation function RT:
RT = random(flip, rotate, crop, zoom)
where random denotes a random selection function, flip denotes a flip transformation, rotate denotes a rotation transformation, crop denotes a cropping transformation and zoom denotes a scaling transformation;
S202, for each sample image Ii in each batch, randomly selecting two enhancement transformation types through the random function for enhancement;
and S203, outputting an enhanced data set containing 2N images with resolution w × h.
In one embodiment, the encoder extracts image features of the enhanced data set by the following steps:
A01. Pass the input image with resolution w × h and Cinput channels through a convolution layer to obtain the tensor C1 with resolution (w/2) × (h/2) and 64 channels:
C1 = Conv7_2(Cinput)
where Conv7_2 denotes a convolution layer with convolution kernel size 7 and convolution kernel stride 2;
A02. Pass the tensor C1 through a batch normalization layer and an activation function layer in turn to obtain the feature map C2:
C2 = ActReLU(BN(C1))
where BN denotes the batch normalization layer and ActReLU denotes the activation function;
A03. Pass the feature map C2 through a max pooling layer to obtain the feature map C3 of shape (64, w/4, h/4):
C3 = MaxP3_2(C2)
where MaxP3_2 denotes a max pooling layer with pooling kernel size 3 and pooling kernel stride 2;
A04. Pass the feature map C3 through the four residual learning stages Stage1, Stage2, Stage3 and Stage4 in turn to obtain the image feature vector H:
Stage1: C4 = BTK2(BTK2(BTK1(C3)))
Stage2: C5 = BTK2(BTK2(BTK2(BTK1(C4))))
Stage3: C6 = BTK2(BTK2(BTK2(BTK2(BTK2(BTK1(C5))))))
Stage4: H = BTK2(BTK2(BTK1(C6)))
where BTK1() denotes the type I bottleneck block of the residual structure, BTK2() denotes the type II bottleneck block of the residual structure, C4, C5 and C6 are the feature maps obtained in residual learning stages Stage1, Stage2 and Stage3 respectively, and H is the m-dimensional feature vector obtained in residual learning stage Stage4.
The image features of the enhanced data are projected through a projection network to obtain the embedded vector, specifically: H is input into a projection network that realizes a nonlinear transformation, and the process is expressed as:
z = FC(ActReLU(Dense(H)))
where FC is a fully connected layer and z is the embedded vector obtained from the projection.
In a specific embodiment, the similarity between different images in the enhanced data set and the pre-training contrast loss are calculated from the obtained embedded vectors, specifically:
B01. For the embedded vectors zi and zj of any two enhanced images Ai and Aj in the enhanced data set, calculate the cosine similarity between Ai and Aj:
s(i, j) = (zi · zj) / (σ‖zi‖‖zj‖)
where σ is an adjustable parameter used to scale the input and expand the range of the cosine similarity beyond [-1, 1], and ‖zi‖ and ‖zj‖ denote the moduli of the embedded vectors;
B02. Pair the cosine similarities of matching views and calculate the noise contrastive estimation loss l(i, j):
l(i, j) = -log[ exp(s(i, j)) / Σk=1..2n 1[k≠i] exp(s(i, k)) ]
where k is a summation index and 1[k≠i] equals 1 when k ≠ i and 0 otherwise;
B03. With the image positions interchanged, calculate the contrastive estimation loss l(j, i) for the same pair of images again:
l(j, i) = -log[ exp(s(j, i)) / Σk=1..2n 1[k≠j] exp(s(j, k)) ]
B04. Calculate the losses of all pairs in a batch of size n and average them to obtain the contrast loss LN:
LN = (1/2n) Σk=1..n [ l(2k-1, 2k) + l(2k, 2k-1) ]
B05. Update the encoder according to LN, fix the encoder weights, and finish the pre-training.
As shown in fig. 4, in one specific implementation, the classification network comprises three Dense blocks, where each Dense block is a module containing several layers whose feature maps all have the same size; in the classification network, the image features of the enhanced data first pass through a convolution layer, then through the three Dense blocks in turn, and finally through a pooling layer and a linear function layer to output the result, with two adjacent Dense blocks connected by a convolution layer and a pooling layer, which reduces the feature map size and has the effect of compressing the model.
In this embodiment, a Dense block is a module comprising several layers; the feature maps of each layer have the same size, and a dense connection manner is adopted between the layers.
As shown in fig. 5, in one implementation, the process of passing the image feature x of the enhanced data through a Dense block is:
f1 = Conv1_1(ActReLU(BN(x)))
f2 = Conv3_1(ActReLU(BN(f1)))
where Conv1_1 and Conv3_1 denote a convolution layer with convolution kernel size 1 and stride 1 and a convolution layer with convolution kernel size 3 and stride 1 respectively, f1 is the feature map after the 1 × 1 convolution, and f2 is the feature map after the 3 × 3 convolution.
In one embodiment, the images in the enhanced data set are classified and detected, the loss of the classification network is calculated, and the classification network is trained according to this loss; the loss function Ld of the classification network is:
Ld = -Σn=1..T yn log(pn)
where T is the number of color difference levels, yn is the label, which equals 1 if the color difference level of the current sample is n and 0 otherwise, and pn is the probability that the current sample has color difference level n.
According to the method, after the original data set is collected and organized, it is enhanced; the encoder is pre-trained on the enhanced data set in combination with the projection operation, which improves its feature extraction capability, and a classification network is then connected in place of the projection network, so that the detection of complex texture and pattern color differences is realized. Following the self-supervised contrast learning idea, the method solves the problems of low speed and low detection precision for adjacent color difference levels in traditional color difference detection. Compared with supervised learning, it is better suited to scenarios with few samples, since the advantages of contrast learning remove the need for a large amount of labeled data; the generalization performance of the model is greatly improved, and the method has the characteristics of high detection precision, strong detection robustness and wide applicability.
Example 4
A complex texture and pattern color difference detection system based on self-supervision contrast learning comprises a data acquisition module, a data enhancement module, an encoder, a projection network module and a classification network module; the data acquisition module is used for acquiring original images and constructing a labeled original data set according to the color difference grading standard, the data enhancement module is used for performing data enhancement on the original data set to obtain an enhanced data set, the encoder is used for acquiring the image features of the enhanced data, the projection network module is used for performing the projection operation on the image features through the projection network, and the classification network module is used for classifying the color differences to obtain a color difference detection grading result.
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims (10)

1. A complex texture and pattern color difference detection method based on self-supervision contrast learning, characterized by comprising the following steps:
s1, obtaining an original image, and constructing an original data set with a label according to a color difference grading standard;
s2, performing data enhancement on the original data set to obtain an enhanced data set;
s3, pre-training is started, and image features of the enhanced data set are extracted through an encoder;
s4, projecting the image characteristics of the enhanced data through a projection network to obtain an embedded vector;
s5, calculating the similarity between different images in the enhanced data set and the contrast loss of pre-training according to the obtained embedded vector, improving an encoder according to the contrast loss, and ending the pre-training;
and S6, connecting a classification network after the improved encoder in place of the projection network, re-extracting the image features through the improved encoder, and performing classification detection on the images in the enhanced data set.
2. The method for detecting color difference of complex texture and pattern based on self-supervision contrast learning according to claim 1, characterized in that: step S1, the specific steps are:
s101, collecting sample image data sets with the number of N and the resolution ratio of w multiplied by h from an object to be analyzed;
s102, marking each sample image according to a color difference grade division standard, and recording an original data set as I ═ I1,I2,……,IN}。
3. The method for detecting color difference of complex texture and pattern based on self-supervision contrast learning according to claim 2, characterized in that: step S2, the specific steps are:
S201, dividing the original data set into a plurality of batches according to the batch size n, and performing data enhancement on the original sample images batch by batch through the transformation function RT:
RT = random(flip, rotate, crop, zoom)
where random denotes a random selection function, flip denotes a flip transformation, rotate denotes a rotation transformation, crop denotes a cropping transformation and zoom denotes a scaling transformation;
S202, for each sample image Ii in each batch, randomly selecting two enhancement transformation types through the random function for enhancement;
and S203, outputting an enhanced data set containing 2N images with resolution w × h.
4. The method for detecting color difference of complex texture and pattern based on self-supervision contrast learning according to claim 3, characterized in that: the encoder extracts the image features of the enhanced data set, and the specific steps are as follows:
A01. Pass the input image with resolution w × h and Cinput channels through a convolution layer to obtain the tensor C1 with resolution (w/2) × (h/2) and 64 channels:
C1 = Conv7_2(Cinput)
where Conv7_2 denotes a convolution layer with convolution kernel size 7 and convolution kernel stride 2;
A02. Pass the tensor C1 through a batch normalization layer and an activation function layer in turn to obtain the feature map C2:
C2 = ActReLU(BN(C1))
where BN denotes the batch normalization layer and ActReLU denotes the activation function;
A03. Pass the feature map C2 through a max pooling layer to obtain the feature map C3 of shape (64, w/4, h/4):
C3 = MaxP3_2(C2)
where MaxP3_2 denotes a max pooling layer with pooling kernel size 3 and pooling kernel stride 2;
A04. Pass the feature map C3 through the four residual learning stages Stage1, Stage2, Stage3 and Stage4 in turn to obtain the image feature vector H:
Stage1: C4 = BTK2(BTK2(BTK1(C3)))
Stage2: C5 = BTK2(BTK2(BTK2(BTK1(C4))))
Stage3: C6 = BTK2(BTK2(BTK2(BTK2(BTK2(BTK1(C5))))))
Stage4: H = BTK2(BTK2(BTK1(C6)))
where BTK1() denotes the type I bottleneck block of the residual structure, BTK2() denotes the type II bottleneck block of the residual structure, C4, C5 and C6 are the feature maps obtained in residual learning stages Stage1, Stage2 and Stage3 respectively, and H is the m-dimensional feature vector obtained in residual learning stage Stage4.
5. The method of claim 4, wherein the method comprises: projecting the image features of the enhanced data through a projection network to obtain an embedded vector, specifically: inputting H into a projection network realizing nonlinear transformation, wherein the process expression is as follows:
z=FC(ActReLU(Dense(H)))
where FC is a fully connected layer and z is the embedded vector obtained from the projection.
6. The method for detecting color difference of complex texture and pattern based on self-supervision contrast learning according to claim 5, characterized in that: according to the obtained embedded vectors, the similarity between different images in the enhanced data set and the pre-training contrast loss are calculated, specifically:
B01. For the embedded vectors zi and zj of any two enhanced images Ai and Aj in the enhanced data set, calculate the cosine similarity between Ai and Aj:
s(i, j) = (zi · zj) / (σ‖zi‖‖zj‖)
where σ is an adjustable parameter used to scale the input and expand the range of the cosine similarity beyond [-1, 1], and ‖zi‖ and ‖zj‖ denote the moduli of the embedded vectors;
B02. Pair the cosine similarities of matching views and calculate the noise contrastive estimation loss l(i, j):
l(i, j) = -log[ exp(s(i, j)) / Σk=1..2n 1[k≠i] exp(s(i, k)) ]
where k is a summation index and 1[k≠i] equals 1 when k ≠ i and 0 otherwise;
B03. With the image positions interchanged, calculate the contrastive estimation loss l(j, i) for the same pair of images again:
l(j, i) = -log[ exp(s(j, i)) / Σk=1..2n 1[k≠j] exp(s(j, k)) ]
B04. Calculate the losses of all pairs in a batch of size n and average them to obtain the contrast loss LN:
LN = (1/2n) Σk=1..n [ l(2k-1, 2k) + l(2k, 2k-1) ]
B05. Update the encoder according to LN, fix the encoder weights, and finish the pre-training.
7. The method for detecting color difference of complex texture and pattern based on self-supervision contrast learning according to claim 1, characterized in that: the classification network comprises three Dense blocks, wherein each Dense block is a module comprising a plurality of layers, and the feature maps of all the layers have the same size; in the classification network, the image characteristics of the enhanced data pass through a convolution layer, sequentially pass through three Dense blocks, pass through a pooling layer and a linear function layer, and output a result, and two adjacent Dense blocks are connected through a convolution layer and a pooling layer.
8. The method for detecting color difference of complex texture and pattern based on self-supervision contrast learning according to claim 7, characterized in that: the process of the image characteristic x of the enhanced data through the Dense block is as follows:
f1=Conv1_1(ActReLU(BN(x)))
f2=Conv3_1(ActReLU(BN(f1)))
where Conv1_1 and Conv3_1 denote a convolution layer with convolution kernel size 1 and stride 1 and a convolution layer with convolution kernel size 3 and stride 1 respectively, f1 is the feature map after the 1 × 1 convolution, and f2 is the feature map after the 3 × 3 convolution.
9. The method for detecting color difference of complex texture and pattern based on self-supervision contrast learning according to claim 8, characterized in that: the images in the enhanced data set are classified and detected, the loss of the classification network is calculated, and the classification network is trained according to this loss; the loss function Ld of the classification network is:
Ld = -Σn=1..T yn log(pn)
where T is the number of color difference levels, yn is the label, which equals 1 if the color difference level of the current sample is n and 0 otherwise, and pn is the probability that the current sample has color difference level n.
10. A complex texture and pattern color difference detection system based on self-supervision contrast learning, characterized in that: the system comprises a data acquisition module, a data enhancement module, an encoder, a projection network module and a classification network module; the data acquisition module is used for acquiring original images and constructing a labeled original data set according to the color difference grading standard, the encoder is used for acquiring the image features of the enhanced data, the projection network module is used for performing the projection operation on the image features through the projection network, and the classification network module is used for classifying the color differences to obtain a color difference detection grading result.
CN202210362124.9A 2022-04-07 2022-04-07 Complex texture and pattern color difference detection method based on self-supervision contrast learning Pending CN114723707A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210362124.9A CN114723707A (en) 2022-04-07 2022-04-07 Complex texture and pattern color difference detection method based on self-supervision contrast learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210362124.9A CN114723707A (en) 2022-04-07 2022-04-07 Complex texture and pattern color difference detection method based on self-supervision contrast learning

Publications (1)

Publication Number Publication Date
CN114723707A true CN114723707A (en) 2022-07-08

Family

ID=82242162

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210362124.9A Pending CN114723707A (en) 2022-04-07 2022-04-07 Complex texture and pattern color difference detection method based on self-supervision contrast learning

Country Status (1)

Country Link
CN (1) CN114723707A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115578589B (en) * 2022-10-12 2023-08-18 江苏瑞康成医疗科技有限公司 Unsupervised echocardiography section identification method



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination