CN112802009A - Similarity calculation method and device for product detection data set - Google Patents

Similarity calculation method and device for product detection data set

Info

Publication number
CN112802009A
CN112802009A CN202110210120.4A CN202110210120A CN112802009A CN 112802009 A CN112802009 A CN 112802009A CN 202110210120 A CN202110210120 A CN 202110210120A CN 112802009 A CN112802009 A CN 112802009A
Authority
CN
China
Prior art keywords
similarity
defect
data sets
calculating
defects
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110210120.4A
Other languages
Chinese (zh)
Inventor
林大
旷黎明
师文庆
韩锦
潘正颐
侯大为
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou Weiyizhi Technology Co Ltd
Original Assignee
Changzhou Weiyizhi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changzhou Weiyizhi Technology Co Ltd
Priority to CN202110210120.4A
Publication of CN112802009A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Investigating Materials By The Use Of Optical Means Adapted For Particular Applications (AREA)

Abstract

The application discloses a similarity calculation method and a similarity calculation device for product detection data sets. The method comprises: calculating, for each defect, the similarity K' between two data sets according to the weights of the defect on different feature dimensions; calculating the similarity K1 between the two data sets according to the similarity K' of each defect and the number ratio of each defect; calculating the cosine similarity between the two vectors formed by the number of each defect in the two data sets, and multiplying the cosine similarity by the similarity K1 to obtain the similarity K2 between the two data sets; and normalizing the similarity K2 to obtain the final similarity K between the two data sets. Because the similarity of the data sets is calculated by weighting each feature dimension of the defects, the similarity of two data sets can be obtained quickly, so that the historical parameter configuration of a data set with higher similarity can be used to set the initial parameter configuration for training on a new data set, which improves the training efficiency of the model.

Description

Similarity calculation method and device for product detection data set
Technical Field
The application belongs to the technical field of product detection, and relates to a similarity calculation method and device for a product detection data set in an industrial internet.
Background
In artificial-intelligence-based solutions for product surface defect detection, after an annotation team has labeled the defects in the captured pictures, related pictures usually need to be grouped into a data set for training an intelligent detection model.
The current model training approach is usually to input a data set into a model, train the model, and obtain the configuration information of the relevant parameters. Because every model is trained from scratch, each training run takes a long time, so training efficiency is low when a large number of models need to be trained.
Disclosure of Invention
To address the low model training efficiency in the related art, the application provides a similarity calculation method and a similarity calculation device for product detection data sets. The technical scheme is as follows:
In a first aspect, the present application provides a method for calculating the similarity of product inspection data sets, the method comprising:
calculating the similarity K' of each defect between the two data sets according to the weight of each defect on different feature dimensions;
calculating the similarity K1 between the two data sets according to the similarity K' of each defect and the number ratio of each defect, wherein the similarity K1 characterizes the contribution of the number ratio of each defect to the similarity of the two data sets;
calculating the cosine similarity between the two vectors formed by the number of each defect in the two data sets, and multiplying the cosine similarity by the similarity K1 to obtain the similarity K2 between the two data sets, wherein the cosine similarity characterizes the structural proportion of the defect numbers, and the similarity K2 characterizes the contribution of that structural proportion to the similarity of the two data sets;
and normalizing the similarity K2 so that it is mapped to [0,1], to obtain the final similarity K between the two data sets.
Optionally, the calculating a similarity K' between the two data sets of each defect according to the weight of each defect in different feature dimensions includes:
for each defect, extracting the feature values $[f_{i1}, f_{i2}, f_{i3}, f_{i4}, f_{i5}, \dots, f_{in}]$ of the defect in each of the two data sets, wherein $f_{ij}$ represents the feature value of the defect in the $i$-th data set on the $j$-th dimension, and $n$ is the total number of feature dimensions of the defect;
calculating the similarity K' of the defect between the two data sets according to the feature values of the defect in the two data sets and the weight of each feature dimension, wherein the weights of the feature dimensions are $[w_1, w_2, w_3, w_4, w_5, \dots, w_n]$, $K' = v_1 w_1 + v_2 w_2 + v_3 w_3 + v_4 w_4 + v_5 w_5 + \dots + v_n w_n$, $v_j$ represents the similarity between the two data sets of the feature value on the $j$-th dimension of the current defect, $v_j$ takes values in the interval $[0,1]$, and $v_j = f_{1j}/f_{2j}$.
Optionally, the calculating the similarity K1 between the two data sets according to the similarity K' of each defect and the ratio of the number of each defect includes:
acquiring the number of each defect in the two data sets;
adding the number of each defect to obtain the total number of the defects;
dividing the number of each defect by the sum of the number of the defects to obtain the number ratio of each defect;
multiplying the number ratio of each defect by the similarity K' of each defect to obtain a product value of each defect;
the product values of the respective defects are added to obtain the similarity K1.
Optionally, the calculating a cosine similarity between two vectors of the number of each defect in the two data sets, and multiplying the cosine similarity by the similarity K1 to obtain a similarity K2 between the two data sets includes:
obtaining a first vector $(P_{11}, P_{12}, \dots, P_{1m})$ and a second vector $(P_{21}, P_{22}, \dots, P_{2m})$ according to the number of each defect in the two data sets, wherein $P_{ij}$ is the number of defect $j$ in the $i$-th data set, and $m$ is the number of defect types;
calculating cosine similarity between the first vector and the second vector;
and multiplying the similarity K1 by the cosine similarity to obtain the similarity K2.
Optionally, the normalizing the similarity K2, and mapping the similarity K2 to [0,1], to obtain a final similarity K between two data sets, includes:
acquiring the weight sum W of each characteristic dimension of the defect;
dividing the similarity K2 by the weight sum W to obtain the similarity K.
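Taken together, the optional steps above can be restated compactly as follows. This is only a summary of the formulas already given; the index $d$ over defect types and the symbol $r_d$ for the number ratio are notation introduced here for brevity, and $K'_d$ denotes the similarity K' of defect $d$. $K_1$, $K_2$ and $K$ correspond to the similarities K1, K2 and K.

```latex
\begin{aligned}
v_j &= \frac{f_{1j}}{f_{2j}}, &\quad K' &= \sum_{j=1}^{n} w_j\, v_j \\
r_d &= \frac{P_{1d} + P_{2d}}{\sum_{e=1}^{m} \left(P_{1e} + P_{2e}\right)}, &\quad K_1 &= \sum_{d=1}^{m} r_d\, K'_d \\
\cos &= \frac{\sum_{d=1}^{m} P_{1d} P_{2d}}{\sqrt{\sum_{d=1}^{m} P_{1d}^2}\,\sqrt{\sum_{d=1}^{m} P_{2d}^2}}, &\quad K_2 &= K_1 \cdot \cos \\
W &= \sum_{j=1}^{n} w_j, &\quad K &= \frac{K_2}{W}
\end{aligned}
```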
In a second aspect, the present application also provides an apparatus for calculating similarity of product inspection data sets, the apparatus comprising:
the first calculation module is used for calculating the similarity K' of each defect between the two data sets according to the weight of each defect on different feature dimensions;
the second calculation module is used for calculating the similarity K1 between the two data sets according to the similarity K' of each defect calculated by the first calculation module and the quantity ratio of each defect, and the similarity K1 is used for depicting the contribution of the quantity ratio of each defect to the similarity of the two data sets;
the third calculation module is used for calculating cosine similarity between two vectors of the number of each defect in the two data sets, multiplying the cosine similarity with the similarity K1 calculated by the second calculation module to obtain the similarity K2 between the two data sets, wherein the cosine similarity is used for describing the structure proportion of the number of the defects, and the similarity K2 is used for describing the contribution of the structure proportion of the number of each defect to the similarity of the two data sets;
and the processing module is used for carrying out normalization processing on the similarity K2 calculated by the third calculation module, and mapping the similarity K2 to [0,1] to obtain the final similarity K between the two data sets.
Optionally, the first computing module includes:
an extraction unit, configured to extract, for each defect, the feature values $[f_{i1}, f_{i2}, f_{i3}, f_{i4}, f_{i5}, \dots, f_{in}]$ of the defect in each of the two data sets, wherein $f_{ij}$ represents the feature value of the defect in the $i$-th data set on the $j$-th dimension, and $n$ is the total number of feature dimensions of the defect;
a first calculating unit, configured to calculate the similarity K' of the defect between the two data sets according to the feature values of the defect extracted by the extracting unit in the two data sets and the weight of each feature dimension, wherein the weights of the feature dimensions are $[w_1, w_2, w_3, w_4, w_5, \dots, w_n]$, $K' = v_1 w_1 + v_2 w_2 + v_3 w_3 + v_4 w_4 + v_5 w_5 + \dots + v_n w_n$, $v_j$ represents the similarity between the two data sets of the feature value on the $j$-th dimension of the current defect, $v_j$ takes values in the interval $[0,1]$, and $v_j = f_{1j}/f_{2j}$.
Optionally, the second computing module includes:
a first acquiring unit for acquiring the number of each defect in the two data sets;
a second calculating unit, configured to add the number of each defect acquired by the first acquiring unit to obtain a defect number sum;
a third calculating unit, configured to divide the number of each defect by the sum of the numbers of defects calculated by the second calculating unit to obtain a number ratio of each defect;
a fourth calculating unit, configured to multiply the number ratio of each defect calculated by the third calculating unit by the similarity K' of each defect to obtain a product value of each defect;
a fifth calculating unit, configured to add the product values of the defects calculated by the fourth calculating unit to obtain the similarity K1.
Optionally, the third computing module includes:
a vector acquisition module, configured to obtain a first vector $(P_{11}, P_{12}, \dots, P_{1m})$ and a second vector $(P_{21}, P_{22}, \dots, P_{2m})$ according to the number of each defect in the two data sets, wherein $P_{ij}$ is the number of defect $j$ in the $i$-th data set, and $m$ is the number of defect types;
a sixth calculating unit, configured to calculate a cosine similarity between the first vector and the second vector acquired by the vector acquisition module;
and the seventh calculating unit is used for multiplying the similarity K1 by the cosine similarity calculated by the sixth calculating unit to obtain the similarity K2.
Optionally, the processing module includes:
the second acquisition unit is used for acquiring the weight sum W of each characteristic dimension of the defect;
an eighth calculating unit, configured to divide the similarity K2 by the weight sum W obtained by the second obtaining unit to obtain the similarity K.
Based on the technical scheme, the application can at least realize the following beneficial effects:
the similarity of the two data sets is calculated by adopting a method of weighting based on each feature dimension of the defect, so that the similarity of the two data sets can be rapidly acquired, the historical parameter configuration of the data set with higher similarity is further used for setting the initial parameter configuration of the data set for training, and the training efficiency of the model can be improved to a certain extent.
In addition, the weights of the different dimensions can be adjusted according to actual business requirements, so that the similarity between two defect objects is calculated dynamically and the similarity between data sets changes accordingly. When the similarity of the data sets is calculated, both the local contribution of the number of each defect to the overall similarity and the contribution of the overall proportional structure of the defect numbers are taken into account. The normalization maps the result to the [0,1] interval, so that similarity values are comparable, with a larger value indicating that the two data sets are more similar.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a flow chart of a method of similarity calculation for product inspection data sets provided in one embodiment of the present application;
FIG. 2 is a schematic diagram of a similarity calculation apparatus for product inspection data sets provided in an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a similarity calculation apparatus for a product inspection data set provided in another embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
FIG. 1 is a flowchart of a similarity calculation method for product inspection data sets provided in an embodiment of the present application. The method can be applied to a computer, such as a computer used by a client or a server, and the computer stores a program for executing the following steps. The similarity calculation method for product detection data sets provided by the application may include the following steps:
Step 101, calculating the similarity K' of each defect between two data sets according to the weight of each defect on different feature dimensions;
For each defect, the feature values $[f_{i1}, f_{i2}, f_{i3}, f_{i4}, f_{i5}, \dots, f_{in}]$ of the defect are extracted in each of the two data sets, wherein $f_{ij}$ represents the feature value of the defect in the $i$-th data set on the $j$-th dimension, and $n$ is the total number of feature dimensions of the defect.
The similarity K' of the defect between the two data sets is calculated according to the feature values of the defect in the two data sets and the weight of each feature dimension, wherein the weights of the feature dimensions are $[w_1, w_2, w_3, w_4, w_5, \dots, w_n]$, $K' = v_1 w_1 + v_2 w_2 + v_3 w_3 + v_4 w_4 + v_5 w_5 + \dots + v_n w_n$, $v_j$ represents the similarity between the two data sets of the feature value on the $j$-th dimension of the current defect, $v_j$ takes values in the interval $[0,1]$, and $v_j = f_{1j}/f_{2j}$.
For example, the two data sets include a defect A. The feature values of defect A extracted in the first data set are $[f_{11}^a, f_{12}^a, f_{13}^a, f_{14}^a, f_{15}^a, \dots, f_{1n}^a]$, wherein $f_{1j}^a$ represents the feature value of defect A in the 1st data set on the $j$-th dimension; the feature values of defect A extracted in the second data set are $[f_{21}^a, f_{22}^a, f_{23}^a, f_{24}^a, f_{25}^a, \dots, f_{2n}^a]$, wherein $f_{2j}^a$ represents the feature value of defect A in the 2nd data set on the $j$-th dimension.
The similarity Ka' of defect A between the two data sets takes the following value:
$K_a' = v_1^a w_1 + v_2^a w_2 + v_3^a w_3 + v_4^a w_4 + v_5^a w_5 + \dots + v_n^a w_n$, wherein $v_j^a = f_{1j}^a / f_{2j}^a$.
As another example, the two data sets include a defect B. The feature values of defect B extracted in the first data set are $[f_{11}^b, f_{12}^b, f_{13}^b, f_{14}^b, f_{15}^b, \dots, f_{1n}^b]$, wherein $f_{1j}^b$ represents the feature value of defect B in the 1st data set on the $j$-th dimension; the feature values of defect B extracted in the second data set are $[f_{21}^b, f_{22}^b, f_{23}^b, f_{24}^b, f_{25}^b, \dots, f_{2n}^b]$, wherein $f_{2j}^b$ represents the feature value of defect B in the 2nd data set on the $j$-th dimension.
The similarity Kb' of defect B between the two data sets takes the following value:
$K_b' = v_1^b w_1 + v_2^b w_2 + v_3^b w_3 + v_4^b w_4 + v_5^b w_5 + \dots + v_n^b w_n$, wherein $v_j^b = f_{1j}^b / f_{2j}^b$.
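The per-defect calculation of Step 101 can be sketched in a few lines. This is a minimal illustration, not the applicant's implementation: the function name `defect_similarity` and the sample feature values are hypothetical, and rejecting ratios outside [0, 1] is an assumption, since the text requires $v_j$ to lie in [0,1] but does not say how that is enforced when $f_{1j} > f_{2j}$.

```python
from typing import Sequence


def defect_similarity(
    f1: Sequence[float], f2: Sequence[float], weights: Sequence[float]
) -> float:
    """Return K' = sum_j w_j * v_j, with v_j = f1[j] / f2[j] as in Step 101."""
    if not (len(f1) == len(f2) == len(weights)):
        raise ValueError("f1, f2 and weights must all have length n")
    k_prime = 0.0
    for a, b, w in zip(f1, f2, weights):
        if b == 0:
            raise ValueError("feature value in the second data set must be non-zero")
        v = a / b
        # Assumption: reject ratios outside [0, 1] rather than clip them,
        # because the text does not specify how the interval is enforced.
        if not 0.0 <= v <= 1.0:
            raise ValueError(f"v_j = {v} is outside [0, 1]")
        k_prime += w * v
    return k_prime


# Hypothetical feature values of one defect type in the two data sets.
k_a = defect_similarity([1.0, 2.0], [2.0, 4.0], [0.5, 0.5])
```

With these sample values both ratios are 0.5, so the weighted sum is 0.5 × 0.5 + 0.5 × 0.5.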
Step 102, calculating the similarity K1 between the two data sets according to the similarity K' of each defect and the number ratio of each defect;
The similarity K1 here characterizes the contribution of the number ratio of each defect to the similarity of the two data sets.
In one possible implementation, when step 102 is implemented, the following steps may be included:
S21, acquiring the number of each defect in the two data sets;
For example: initialize i to 1, obtain the number of the i-th defect in the two data sets, set i = i + 1, and repeat the step of obtaining the number of the i-th defect in the two data sets until the numbers of all defects in the two data sets have been obtained.
For example, if the numbers of defect A in the first data set and the second data set are A1 and A2, respectively, the total number of defect A in the two data sets is A1 + A2; if the numbers of defect B in the first data set and the second data set are B1 and B2, respectively, the total number of defect B in the two data sets is B1 + B2; and if the numbers of defect C in the first data set and the second data set are C1 and C2, respectively, the total number of defect C in the two data sets is C1 + C2.
S22, adding the number of each defect to obtain the total number of the defects;
Continuing the above example, the total number of defects over defect A, defect B and defect C in the two data sets is total = A1 + A2 + B1 + B2 + C1 + C2.
S23, dividing the number of each defect by the sum of the number of the defects to obtain the number ratio of each defect;
After the total number of defects in the two data sets is obtained in step S22, for each defect, the number of the current defect in the two data sets is divided by the total number of defects to obtain the number ratio of the current defect.
Continuing the above example, the number ratio of defect A is (A1 + A2)/total, the number ratio of defect B is (B1 + B2)/total, and the number ratio of defect C is (C1 + C2)/total.
S24, multiplying the number ratio of each defect by the similarity K' of each defect to obtain a product value of each defect;
S25, adding the product values of the defects to obtain the similarity K1.
Continuing the above example, taking into account the contribution of the number of each defect to the similarity of the data sets, the similarity of the data sets is K1 = Ka' × (A1 + A2)/total + Kb' × (B1 + B2)/total + Kc' × (C1 + C2)/total, where Kc' is the similarity of defect C obtained in the same way as Ka' and Kb'.
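Steps S21 to S25 amount to a weighted average of the per-defect similarities, with the number ratios as weights. The sketch below illustrates this under assumed inputs: the function name `count_weighted_similarity`, the dictionary layout keyed by defect name, and the sample counts are all hypothetical.

```python
from typing import Mapping


def count_weighted_similarity(
    k_prime: Mapping[str, float],
    counts1: Mapping[str, int],
    counts2: Mapping[str, int],
) -> float:
    """Return K1: each defect's K' weighted by its number ratio (S21 to S25)."""
    # S21: number of each defect, summed over both data sets.
    per_defect = {d: counts1[d] + counts2[d] for d in k_prime}
    # S22: total number of defects.
    total = sum(per_defect.values())
    if total == 0:
        raise ValueError("the two data sets contain no defects")
    # S23 to S25: number ratio times K' for each defect, then the sum.
    return sum(k_prime[d] * per_defect[d] / total for d in k_prime)


# Hypothetical values: defect A has 60 of the 100 defects, defect B has 40.
k1 = count_weighted_similarity(
    {"A": 0.8, "B": 0.4}, {"A": 30, "B": 10}, {"A": 30, "B": 30}
)
```

Here the number ratios are 0.6 and 0.4, so K1 = 0.8 × 0.6 + 0.4 × 0.4 = 0.64.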
Step 103, calculating the cosine similarity between the two vectors formed by the number of each defect in the two data sets, and multiplying the cosine similarity by the similarity K1 to obtain the similarity K2 between the two data sets;
The cosine similarity here characterizes the structural proportion of the defect numbers, and the similarity K2 characterizes the contribution of that structural proportion to the similarity of the two data sets.
In one possible implementation manner, when step 103 is implemented, the following steps may be included:
S31, obtaining a first vector $(P_{11}, P_{12}, \dots, P_{1m})$ and a second vector $(P_{21}, P_{22}, \dots, P_{2m})$ according to the number of each defect in the two data sets, wherein $P_{ij}$ is the number of defect $j$ in the $i$-th data set, and $m$ is the number of defect types;
S32, calculating the cosine similarity between the first vector and the second vector;
S33, multiplying the similarity K1 by the cosine similarity to obtain the similarity K2.
Assume that defect A, defect B and defect C are selected, that the numbers of the three defects in the first data set are A1, B1 and C1, and that their numbers in the second data set are A2, B2 and C2. Taking the defect numbers as vectors, two vectors (A1, B1, C1) and (A2, B2, C2) are obtained from the two data sets, and their cosine similarity is calculated and denoted cos. With the similarity K1 of the data sets obtained in steps 101 and 102, the contribution of the structure of the defect numbers to the similarity of the data sets is then taken into account, giving the similarity K2 = K1 × cos.
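Steps S31 to S33 can be illustrated as follows. This is a sketch using the standard definition of cosine similarity; the function names and sample counts are hypothetical, and raising an error for an all-zero count vector is an assumption, since the text does not discuss that case.

```python
import math
from typing import Sequence


def cosine_similarity(p1: Sequence[float], p2: Sequence[float]) -> float:
    """Standard cosine similarity between two defect-count vectors (S32)."""
    if len(p1) != len(p2):
        raise ValueError("both vectors must have one entry per defect type")
    dot = sum(a * b for a, b in zip(p1, p2))
    norm1 = math.sqrt(sum(a * a for a in p1))
    norm2 = math.sqrt(sum(b * b for b in p2))
    if norm1 == 0 or norm2 == 0:
        # Assumption: the text does not cover a data set with no defects.
        raise ValueError("cosine similarity is undefined for an all-zero vector")
    return dot / (norm1 * norm2)


def structure_adjusted_similarity(
    k1: float, p1: Sequence[float], p2: Sequence[float]
) -> float:
    """Return K2 = K1 * cos (S33)."""
    return k1 * cosine_similarity(p1, p2)


# Hypothetical counts: the second data set has twice as many defects of each
# type, so the proportional structure is identical and cos equals 1.
k2 = structure_adjusted_similarity(0.64, [3, 4], [6, 8])
```

Because cosine similarity depends only on the direction of the count vectors, two data sets of very different sizes still score 1 when their defect types occur in the same proportions.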
Step 104, normalizing the similarity K2 and mapping it to [0,1] to obtain the final similarity K between the two data sets.
Firstly, acquiring the weight sum W of each characteristic dimension of a defect; then, the similarity K2 is divided by the weight sum W to obtain the similarity K.
For example, the weights of the feature dimensions of the defect are $[w_1, w_2, w_3, w_4, w_5, \dots, w_n]$, and their sum is $W = w_1 + w_2 + \dots + w_n$. For the K2 obtained in steps 101 to 103, it is easy to see that its maximum value is W, so K2 is normalized and mapped to $[0,1]$, giving the final similarity of the data sets K = K2/W.
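The bound used in Step 104 follows from the earlier steps, assuming non-negative weights: each $v_j \le 1$ gives K' ≤ W for every defect, the number ratios sum to 1 so K1 ≤ W, and the cosine similarity of non-negative count vectors lies in [0,1], so 0 ≤ K2 ≤ W. A minimal sketch of the normalization, with a hypothetical function name and sample values:

```python
from typing import Sequence


def normalize_similarity(k2: float, weights: Sequence[float]) -> float:
    """Return K = K2 / W, where W is the sum of the feature-dimension weights."""
    w_sum = sum(weights)
    if w_sum <= 0:
        # Assumption: weights are non-negative with a positive sum.
        raise ValueError("the weight sum W must be positive")
    return k2 / w_sum


# Hypothetical values: W = 1.0 + 2.0 = 3.0 and K2 = 1.5.
k = normalize_similarity(1.5, [1.0, 2.0])
```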
In summary, according to the similarity calculation method for the product detection data sets provided by the application, the similarity of the two data sets is calculated by adopting a weight mode based on each feature dimension of the defect, so that the similarity of the two data sets can be rapidly obtained, the historical parameter configuration of the data set with higher similarity is further used for setting the initial parameter configuration of the data set for training, and the training efficiency of the model can be improved to a certain extent.
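One way the resulting similarity might be used to seed training is sketched below. The text only states that the historical parameter configuration of a more similar data set is used as the initial configuration; the selection rule (take the single most similar data set), the optional threshold `min_similarity`, and all names and values are assumptions made for illustration.

```python
from typing import Any, Dict, Mapping, Optional


def pick_initial_config(
    similarities: Mapping[str, float],
    historical_configs: Mapping[str, Dict[str, Any]],
    min_similarity: float = 0.0,
) -> Optional[Dict[str, Any]]:
    """Return a copy of the configuration of the most similar historical data set."""
    if not similarities:
        return None
    best = max(similarities, key=lambda name: similarities[name])
    if similarities[best] < min_similarity:
        # Nothing is similar enough: the caller falls back to default settings.
        return None
    return dict(historical_configs[best])


# Hypothetical similarity scores K and stored configurations.
configs = {"set_a": {"lr": 0.01}, "set_b": {"lr": 0.001}}
chosen = pick_initial_config({"set_a": 0.4, "set_b": 0.9}, configs)
```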
In addition, the weights of the different dimensions can be adjusted according to actual business requirements, so that the similarity between two defect objects is calculated dynamically and the similarity between data sets changes accordingly. When the similarity of the data sets is calculated, both the local contribution of the number of each defect to the overall similarity and the contribution of the overall proportional structure of the defect numbers are taken into account. The normalization maps the result to the [0,1] interval, so that similarity values are comparable, with a larger value indicating that the two data sets are more similar.
The following is an embodiment of a similarity calculation apparatus for a product detection data set, and since the apparatus embodiment corresponds to the method embodiment, for the following explanation of technical features in the similarity calculation apparatus for a product detection data set, reference may be made to the above explanation of corresponding technical features in the method embodiment, and details are not repeated here.
Fig. 2 is a schematic structural diagram of a similarity calculation apparatus for a product inspection dataset provided in an embodiment of the present application, which may be implemented by software, hardware, or a combination of software and hardware, and may include: a first calculation module 210, a second calculation module 220, a third calculation module 230, and a processing module 240.
The first calculation module 210 may be configured to calculate a similarity K' of each defect between the two data sets according to the weight of each defect in different feature dimensions;
the second calculating module 220 may be configured to calculate a similarity K1 between the two data sets according to the similarity K' of each defect calculated by the first calculating module 210 and the quantity ratio of each defect, where the similarity K1 is used to characterize the contribution of the quantity ratio of each defect to the similarity of the two data sets;
the third calculating module 230 may be configured to calculate cosine similarity between two vectors of the number of each defect in the two data sets, and multiply the cosine similarity with the similarity K1 calculated by the second calculating module 220 to obtain a similarity K2 between the two data sets, where the cosine similarity is used to characterize the structural proportion of the number of defects, and the similarity K2 is used to characterize the contribution of the structural proportion of the number of each defect to the similarity of the two data sets;
the processing module 240 may be configured to perform normalization processing on the similarity K2 calculated by the third calculating module 230, and map the similarity K2 to [0,1], so as to obtain a final similarity K between two data sets.
In a possible implementation manner, please refer to fig. 3, which is a schematic structural diagram of a similarity calculation apparatus for a product inspection data set provided in another embodiment of the present application, wherein the first calculation module 210 may include: an extraction unit 211 and a first calculation unit 212.
The extraction unit 211 may be configured to extract, for each defect, the feature values $[f_{i1}, f_{i2}, f_{i3}, f_{i4}, f_{i5}, \dots, f_{in}]$ of the defect in each of the two data sets, wherein $f_{ij}$ represents the feature value of the defect in the $i$-th data set on the $j$-th dimension, and $n$ is the total number of feature dimensions of the defect;
the first calculating unit 212 may be configured to calculate the similarity K' of the defect between the two data sets according to the feature values of the defect extracted by the extracting unit 211 in the two data sets and the weight of each feature dimension, wherein the weights of the feature dimensions are $[w_1, w_2, w_3, w_4, w_5, \dots, w_n]$, $K' = v_1 w_1 + v_2 w_2 + v_3 w_3 + v_4 w_4 + v_5 w_5 + \dots + v_n w_n$, $v_j$ represents the similarity between the two data sets of the feature value on the $j$-th dimension of the current defect, $v_j$ takes values in the interval $[0,1]$, and $v_j = f_{1j}/f_{2j}$.
Still referring to fig. 3, the second calculation module 220 may include: a first acquisition unit 221, a second calculation unit 222, a third calculation unit 223, a fourth calculation unit 224, and a fifth calculation unit 225.
A first acquiring unit 221 for acquiring the number of each defect in the two data sets;
a second calculating unit 222, configured to add the numbers of the defects acquired by the first acquiring unit 221 to obtain a defect number sum;
a third calculating unit 223 for dividing the number of each defect by the sum of the numbers of defects calculated by the second calculating unit 222 to obtain the number ratio of each defect;
a fourth calculating unit 224, configured to multiply the ratio of the number of each defect calculated by the third calculating unit 223 by the similarity K' of each defect to obtain a product value of each defect;
a fifth calculating unit 225, configured to add the product values of the defects calculated by the fourth calculating unit 224 to obtain the similarity K1.
Still referring to fig. 3, the third calculation module 230 may include: a vector acquisition module 231, a sixth calculation unit 232 and a seventh calculation unit 233.
The vector acquisition module 231 may be configured to obtain a first vector $(P_{11}, P_{12}, \dots, P_{1m})$ and a second vector $(P_{21}, P_{22}, \dots, P_{2m})$ according to the number of each defect in the two data sets, wherein $P_{ij}$ is the number of defect $j$ in the $i$-th data set, and $m$ is the number of defect types;
the sixth calculating unit 232 may be configured to calculate a cosine similarity between the first vector and the second vector acquired by the vector acquiring module 231;
the seventh calculating unit 233 may be configured to multiply the similarity K1 with the cosine similarity calculated by the sixth calculating unit 232 to obtain the similarity K2.
Still referring to fig. 3, the processing module 240 may include: a second acquisition unit 241 and an eighth calculation unit 242.
The second obtaining unit 241 may be configured to obtain a total weight W of each feature dimension of the defect;
the eighth calculating unit 242 may be configured to divide the similarity K2 by the weight sum W obtained by the second obtaining unit 241 to obtain the similarity K.
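Since the apparatus may be implemented in software, the four modules can be pictured as methods of one object. The class below is a hypothetical sketch of such a software form, not the structure shown in the drawings: the class and method names, the dictionary inputs keyed by defect name, and the sample data are assumptions, and input validation is omitted for brevity.

```python
import math
from typing import Dict, Mapping, Sequence


class SimilarityCalculator:
    """Hypothetical software counterpart of modules 210, 220, 230 and 240."""

    def __init__(self, weights: Sequence[float]) -> None:
        self.weights = list(weights)

    def first_module(self, f1: Sequence[float], f2: Sequence[float]) -> float:
        # Module 210: K' = sum_j w_j * (f1j / f2j) for one defect.
        return sum(w * a / b for a, b, w in zip(f1, f2, self.weights))

    def second_module(
        self,
        k_prime: Mapping[str, float],
        counts1: Mapping[str, int],
        counts2: Mapping[str, int],
    ) -> float:
        # Module 220: K1 = sum over defects of (number ratio * K').
        total = sum(counts1[d] + counts2[d] for d in k_prime)
        return sum(k_prime[d] * (counts1[d] + counts2[d]) / total for d in k_prime)

    def third_module(
        self, k1: float, counts1: Mapping[str, int], counts2: Mapping[str, int]
    ) -> float:
        # Module 230: K2 = K1 * cosine similarity of the two count vectors.
        defects = sorted(counts1)
        p1 = [counts1[d] for d in defects]
        p2 = [counts2[d] for d in defects]
        dot = sum(a * b for a, b in zip(p1, p2))
        norms = math.sqrt(sum(a * a for a in p1)) * math.sqrt(sum(b * b for b in p2))
        return k1 * dot / norms

    def processing_module(self, k2: float) -> float:
        # Module 240: K = K2 / W.
        return k2 / sum(self.weights)

    def similarity(
        self,
        features1: Mapping[str, Sequence[float]],
        features2: Mapping[str, Sequence[float]],
        counts1: Mapping[str, int],
        counts2: Mapping[str, int],
    ) -> float:
        k_prime: Dict[str, float] = {
            d: self.first_module(features1[d], features2[d]) for d in features1
        }
        k1 = self.second_module(k_prime, counts1, counts2)
        k2 = self.third_module(k1, counts1, counts2)
        return self.processing_module(k2)


# Hypothetical inputs: two defect types, two feature dimensions, equal weights.
calc = SimilarityCalculator([0.5, 0.5])
k = calc.similarity(
    {"A": [1.0, 2.0], "B": [1.0, 1.0]},
    {"A": [2.0, 4.0], "B": [1.0, 1.0]},
    {"A": 3, "B": 4},
    {"A": 3, "B": 4},
)
```

In this example K' is 0.5 for defect A and 1.0 for defect B, the number ratios are 6/14 and 8/14, the count vectors are identical so the cosine similarity is 1, and W is 1, which gives K = 11/14.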
In summary, the similarity calculation device for the product detection data sets provided by the application calculates the similarity of the two data sets by adopting the weighting mode based on each feature dimension of the defect, can quickly acquire the similarity of the two data sets, so that the historical parameter configuration of the data set with higher similarity is further used to set the initial parameter configuration of the data set for training, and the training efficiency of the model can be improved to a certain extent.
In addition, the weights of the different dimensions can be adjusted according to actual business requirements, so that the similarity between two defect objects is calculated dynamically and the similarity between data sets changes accordingly. When the similarity of the data sets is calculated, both the local contribution of the number of each defect to the overall similarity and the contribution of the overall proportional structure of the defect numbers are taken into account. The normalization maps the result to the [0,1] interval, so that similarity values are comparable, with a larger value indicating that the two data sets are more similar.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (10)

1. A method of similarity calculation for a product inspection data set, the method comprising:
calculating the similarity K' of each defect between the two data sets according to the weight of each defect on different feature dimensions;
calculating to obtain the similarity K1 between the two data sets according to the similarity K' of each defect and the quantity ratio of each defect, wherein the similarity K1 is used for depicting the contribution of the quantity ratio of each defect to the similarity of the two data sets;
calculating cosine similarity between two vectors of each defect in the two data sets, and multiplying the cosine similarity with the similarity K1 to obtain similarity K2 between the two data sets, wherein the cosine similarity is used for describing the structural proportion of the defect number, and the similarity K2 is used for describing the contribution of the structural proportion of each defect number to the similarity of the two data sets;
and carrying out normalization processing on the similarity K2, and mapping the similarity K2 to [0,1] to obtain the final similarity K between the two data sets.
2. The method of claim 1, wherein calculating the similarity K' of each defect between two data sets according to the weight of each defect in different feature dimensions comprises:
for each defect, the characteristic values $[f_{i1}, f_{i2}, f_{i3}, f_{i4}, f_{i5}, \ldots, f_{in}]$ of the defect are extracted separately in the two data sets, wherein $f_{ij}$ represents the characteristic value of the defect in the $i$-th data set in the $j$-th dimension, and $n$ is the total number of characteristic dimensions of the defect;
calculating the similarity K' of the defect between the two data sets according to the characteristic values of the defect in the two data sets and the weight of each characteristic dimension, wherein the weights of the characteristic dimensions are $[w_1, w_2, w_3, w_4, w_5, \ldots, w_n]$, $K' = v_1 w_1 + v_2 w_2 + v_3 w_3 + v_4 w_4 + v_5 w_5 + \ldots + v_n w_n$, $v_j$ represents the similarity between the two data sets of the characteristic value of the $j$-th dimension of the current defect, $v_j$ takes values in the interval $[0,1]$, and $v_j = f_{1j}/f_{2j}$.
3. The method according to claim 1, wherein calculating the similarity K1 between the two data sets according to the similarity K' of each defect and the ratio of the number of each defect comprises:
acquiring the number of each defect in the two data sets;
adding the number of each defect to obtain the total number of the defects;
dividing the number of each defect by the sum of the number of the defects to obtain the number ratio of each defect;
multiplying the number ratio of each defect by the similarity K' of each defect to obtain a product value of each defect;
the product values of the respective defects are added to obtain the similarity K1.
4. The method of claim 1, wherein the calculating the cosine similarity between two vectors of the number of each defect in the two data sets, and multiplying the cosine similarity by the similarity K1 to obtain the similarity K2 between the two data sets comprises:
a first vector $(P_{11}, P_{12}, \ldots, P_{1m})$ and a second vector $(P_{21}, P_{22}, \ldots, P_{2m})$ are derived based on the number of each defect in the two data sets, wherein $P_{ij}$ is the number of defects of type $j$ in the $i$-th data set, and $m$ is the number of defect types;
calculating cosine similarity between the first vector and the second vector;
and multiplying the similarity K1 by the cosine similarity to obtain the similarity K2.
5. The method according to claim 1, wherein the normalizing of the similarity K2, mapping the similarity K2 to [0,1] to obtain the final similarity K between the two data sets, comprises:
acquiring the weight sum W of each characteristic dimension of the defect;
dividing the similarity K2 by the weight sum W to obtain the similarity K.
6. An apparatus for calculating similarity of product inspection data sets, the apparatus comprising:
the first calculation module is used for calculating the similarity K' of each defect between the two data sets according to the weight of each defect on different feature dimensions;
the second calculation module is used for calculating the similarity K1 between the two data sets according to the similarity K' of each defect calculated by the first calculation module and the quantity ratio of each defect, and the similarity K1 is used for depicting the contribution of the quantity ratio of each defect to the similarity of the two data sets;
the third calculation module is used for calculating cosine similarity between two vectors of the number of each defect in the two data sets, multiplying the cosine similarity with the similarity K1 calculated by the second calculation module to obtain the similarity K2 between the two data sets, wherein the cosine similarity is used for describing the structure proportion of the number of the defects, and the similarity K2 is used for describing the contribution of the structure proportion of the number of each defect to the similarity of the two data sets;
and the processing module is used for carrying out normalization processing on the similarity K2 calculated by the third calculation module, and mapping the similarity K2 to [0,1] to obtain the final similarity K between the two data sets.
7. The apparatus of claim 6, wherein the first computing module comprises:
an extraction unit for extracting, for each defect, the feature values $[f_{i1}, f_{i2}, f_{i3}, f_{i4}, f_{i5}, \ldots, f_{in}]$ of the defect in each of the two data sets, wherein $f_{ij}$ represents the feature value of the defect in the $i$-th data set in the $j$-th dimension, and $n$ is the total number of feature dimensions of the defect;
a first calculating unit, configured to calculate the similarity K' of the defect between the two data sets according to the feature values of the defect extracted by the extraction unit in the two data sets and the weight of each feature dimension, where the weights of the feature dimensions are $[w_1, w_2, w_3, w_4, w_5, \ldots, w_n]$, $K' = v_1 w_1 + v_2 w_2 + v_3 w_3 + v_4 w_4 + v_5 w_5 + \ldots + v_n w_n$, $v_j$ represents the similarity between the two data sets of the feature value of the $j$-th dimension of the current defect, $v_j$ takes values in the interval $[0,1]$, and $v_j = f_{1j}/f_{2j}$.
8. The apparatus of claim 6, wherein the second computing module comprises:
a first acquiring unit for acquiring the number of each defect in the two data sets;
a second calculating unit, configured to add the number of each defect acquired by the first acquiring unit to obtain a defect number sum;
a third calculating unit, configured to divide the number of each defect by the sum of the numbers of defects calculated by the second calculating unit to obtain a number ratio of each defect;
a fourth calculating unit, configured to multiply the number ratio of each defect calculated by the third calculating unit by the similarity K' of each defect to obtain a product value of each defect;
a fifth calculating unit, configured to add the product values of the defects calculated by the fourth calculating unit to obtain the similarity K1.
9. The apparatus of claim 6, wherein the third computing module comprises:
a vector acquisition module for obtaining a first vector $(P_{11}, P_{12}, \ldots, P_{1m})$ and a second vector $(P_{21}, P_{22}, \ldots, P_{2m})$ based on the number of each defect in the two data sets, wherein $P_{ij}$ is the number of defects of type $j$ in the $i$-th data set, and $m$ is the number of defect types;
a sixth calculating unit, configured to calculate a cosine similarity between the first vector and the second vector acquired by the vector acquisition module;
and the seventh calculating unit is used for multiplying the similarity K1 by the cosine similarity calculated by the sixth calculating unit to obtain the similarity K2.
10. The apparatus of claim 6, wherein the processing module comprises:
the second acquisition unit is used for acquiring the weight sum W of each characteristic dimension of the defect;
an eighth calculating unit, configured to divide the similarity K2 by the weight sum W obtained by the second obtaining unit to obtain the similarity K.
CN202110210120.4A 2021-02-25 2021-02-25 Similarity calculation method and device for product detection data set Pending CN112802009A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110210120.4A CN112802009A (en) 2021-02-25 2021-02-25 Similarity calculation method and device for product detection data set

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110210120.4A CN112802009A (en) 2021-02-25 2021-02-25 Similarity calculation method and device for product detection data set

Publications (1)

Publication Number Publication Date
CN112802009A true CN112802009A (en) 2021-05-14

Family

ID=75815827

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110210120.4A Pending CN112802009A (en) 2021-02-25 2021-02-25 Similarity calculation method and device for product detection data set

Country Status (1)

Country Link
CN (1) CN112802009A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114742791A (en) * 2022-04-02 2022-07-12 深圳市国电科技通信有限公司 Auxiliary defect detection method and device for printed circuit board assembly and computer equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110516210A (en) * 2019-08-22 2019-11-29 北京影谱科技股份有限公司 The calculation method and device of text similarity
US20200097771A1 (en) * 2018-09-25 2020-03-26 Nec Laboratories America, Inc. Deep group disentangled embedding and network weight generation for visual inspection
CN111291698A (en) * 2020-02-19 2020-06-16 深圳英飞拓科技股份有限公司 High-speed recognition method and device for face image of dense crowd scene
CN112215270A (en) * 2020-09-27 2021-01-12 苏州浪潮智能科技有限公司 Similarity comparison method, system, equipment and medium of model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIU, NA et al.: "Multi-view Deep Representations with Cross-Dataset Transfer for Remote Sensing Image Retrieval and Classification", MULTIMEDIA TOOLS AND APPLICATIONS, 3 March 2020 (2020-03-03), pages 22891, XP037497854, DOI: 10.1007/s11042-020-08712-0 *
QIAO, FEI et al.: "Fast similarity calculation based on cotangent similarity and BP neural network", Journal of Tongji University (Natural Science), vol. 49, no. 1, 15 January 2021 (2021-01-15), pages 153 - 162 *

Similar Documents

Publication Publication Date Title
CN109872305B (en) No-reference stereo image quality evaluation method based on quality map generation network
CN110222700A (en) SAR image recognition methods and device based on Analysis On Multi-scale Features and width study
CN110243590B (en) Rotor system fault diagnosis method based on principal component analysis and width learning
CN113822982B (en) Human body three-dimensional model construction method and device, electronic equipment and storage medium
CN103440471B (en) The Human bodys' response method represented based on low-rank
CN106778714B (en) LDA face identification method based on nonlinear characteristic and model combination
CN110827312A (en) Learning method based on cooperative visual attention neural network
CN111160229A (en) Video target detection method and device based on SSD (solid State disk) network
CN105895089A (en) Speech recognition method and device
CN112802009A (en) Similarity calculation method and device for product detection data set
CN107169520A (en) A kind of big data lacks attribute complementing method
CN116776208B (en) Training method of seismic wave classification model, seismic wave selecting method, equipment and medium
CN105491371A (en) Tone mapping image quality evaluation method based on gradient magnitude similarity
CN111191027B (en) Generalized zero sample identification method based on Gaussian mixture distribution (VAE)
CN109785376B (en) Training method of depth estimation device, depth estimation device and storage medium
CN110443277A (en) A small amount of sample classification method based on attention model
CN110543845A (en) Face cascade regression model training method and reconstruction method for three-dimensional face
CN115880111A (en) Virtual simulation training classroom teaching management method and system based on images
CN111967276B (en) Translation quality evaluation method and device, electronic equipment and storage medium
CN112529772B (en) Unsupervised image conversion method under zero sample setting
CN108596068A (en) A kind of method and apparatus of action recognition
Huang et al. A harmonic means pooling strategy for structural similarity index measurement in image quality assessment
CN111126617B (en) Method, device and equipment for selecting fusion model weight parameters
CN111027589B (en) Multi-division target detection algorithm evaluation system and method
CN109685757A (en) A kind of non-reference picture quality appraisement method and system based on grey scale difference statistics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination