CN113538423B - Industrial part defect detection interval clustering method based on combined optimization algorithm - Google Patents

Publication number
CN113538423B
CN113538423B CN202111078182.0A CN202111078182A CN113538423B CN 113538423 B CN113538423 B CN 113538423B CN 202111078182 A CN202111078182 A CN 202111078182A CN 113538423 B CN113538423 B CN 113538423B
Authority
CN
China
Prior art keywords
data
positive
samples
interval
rule
Legal status
Active
Application number
CN202111078182.0A
Other languages
Chinese (zh)
Other versions
CN113538423A (en
Inventor
邱增帅
王罡
侯大为
潘正颐
Current Assignee
Changzhou Weiyizhi Technology Co Ltd
Original Assignee
Changzhou Weiyizhi Technology Co Ltd
Application filed by Changzhou Weiyizhi Technology Co Ltd
Priority to CN202111078182.0A
Publication of CN113538423A
Application granted
Publication of CN113538423B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/0002: Inspection of images, e.g. flaw detection
    • G06T7/0004: Industrial image inspection

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an interval clustering method for industrial part defect detection based on a combined optimization algorithm, comprising the following steps: step 1, data acquisition; step 2, data cleaning; step 3, balancing the data distribution; step 4, feature selection; step 5, selecting a positive sample data point, setting an interval combination, gradually contracting the intervals for optimization, and generating a rule; and step 6, removing the data covered by the rule from the data set and repeating step 5 on the remaining data until every positive sample has been selected by a rule, which yields a series of rule descriptions and ends the combinatorial optimization approximation algorithm. The method performs combinatorially optimized clustering to separate positive and negative samples on each optical surface for each defect type of an industrial part, and it is robust enough to ensure accurate defect detection and division across multiple items.

Description

Industrial part defect detection interval clustering method based on combined optimization algorithm
Technical Field
The invention relates to the technical field of image data processing, in particular to an industrial part defect detection interval clustering method based on a combined optimization algorithm.
Background
At present, most methods based on image data processing choose the physical quantity intervals used for clustering by experience. Because the accuracy of the positive and negative sample division depends on the weights of the physical quantities, the optical surface, and the defect type, this approach has many limitations. Most obviously, for a linear defect the length and width carry the most weight and the area is not considered, whereas for a block defect the area carries the most weight and the length and width are not considered. As a result, some interval combinations are not the preferred result. In addition, the same defect appears on different optical surfaces, which complicates the setting of interval combinations. Accurate industrial data analysis nevertheless requires an accurate division of the workpiece data into positive and negative samples.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: to address the problems described in the background, an interval clustering method for industrial part defect detection based on a combined optimization algorithm is provided. It performs combinatorially optimized clustering to separate positive and negative samples on each optical surface for each defect type of an industrial part, and it is robust enough to ensure accurate defect detection and division across multiple items.
The technical scheme adopted by the invention for solving the technical problems is as follows: a method for clustering industrial part defect detection intervals based on a combined optimization algorithm comprises the following specific steps:
step 1, data acquisition: shooting a workpiece picture by an equipment machine, reading contour points in an original picture, and finishing data acquisition;
step 2, data cleaning: carrying out data consistency check, data missing value processing and data abnormal value processing;
step 3, balancing the data distribution: because the classes are unevenly distributed, with very few positive samples and very many negative samples, the data are balanced by oversampling in view of their particular nature: positive sample data are randomly copied until the number of positive samples equals the number of negative samples;
step 4, feature selection: a filtering method is used to perform feature selection on the expanded data, with the variance as the feature scoring standard; the $k$ features with the largest contribution are selected from the pre-expansion data to form a new data set, and combinatorial optimization is performed on the new data set, where $k$ is the number of physical quantities after feature selection, $1 \le k \le$ the total number of physical quantities in the data, and $k$ is a positive integer;
step 5, selecting a positive sample data point, setting an interval combination, gradually contracting the intervals for optimization, and generating a rule, specifically:
step 5.1, selecting a positive sample data point and setting an interval combination: first, a positive sample data point $M$ is randomly selected from the data set after feature selection; then an interval combination is formed by taking the minimum and maximum values of each physical quantity in the data set as the interval boundaries;
step 5.2, gradually contracting the intervals for optimization and generating a rule: while point $M$ remains inside the interval combination, the interval combination is contracted and negative samples are filtered out until the ratio of negative samples to positive samples in the interval combination is less than or equal to 1:3 and the number of positive samples is the largest; the interval combination is then set as a rule;
step 6, removing the data covered by the rule from the data set and repeating step 5 on the remaining data until every positive sample has been selected by a rule, that is, until no positive sample remains in the data; this yields a set of rule descriptions that optimally divides the positive and negative samples and ends the combinatorial optimization approximation algorithm.
More specifically, in step 5.2 of the above technical solution, if the ratio of negative samples to positive samples in the interval combination is less than or equal to 1:3 and the number of positive samples is the largest, the interval combination is a locally optimal rule, and the data selected by the rule are removed from the data set; if the ratio is greater than 1:3 and the number of positive samples is not the largest, step 5.2 is repeated: with the selected positive sample point $M$ kept inside the interval combination, the interval combination is contracted and negative samples are filtered out.
More specifically, in step 5.2 of the above technical solution, if the ratio of negative samples to positive samples in the interval combination is greater than 1:3 and the number of positive samples is the largest, step 5.2 is repeated; if the ratio is less than or equal to 1:3 and the number of positive samples is not the largest, step 5.2 is also repeated.
More specifically, in step 6 of the above technical solution, after the complete algorithm flow has finished, the generated series of rule descriptions is put into use; if a new data set contains positive samples that do not conform to the existing rules, the new data set is put into the algorithm and step 5 is repeated; if the new data set has no positive samples, the series of rule descriptions is final and the combinatorial optimization approximation algorithm ends.
More specifically, in step 4 of the above technical solution, the variance of a characteristic physical quantity is calculated as:
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \mu\right)^2 \qquad (1)$$
where $\sigma^2$ is the variance of the characteristic physical quantity; $\mu$ is the mean of the physical quantity $X$ over the data points; $X_i$ is the value of the physical quantity in the $i$-th piece of data; and $N$ is the total number of samples, positive and negative, in the data set.
The beneficial effects of the invention are as follows: the interval clustering method for industrial part defect detection based on a combinatorial optimization algorithm reduces the number of rules by screening the features of the defect physical quantities, and divides the samples by combinatorial optimization approximation so that each rule contains many positive samples while the negative samples stay within the set relative proportion. This yields a series of rule descriptions that approximate the optimal division of positive and negative samples for a given defect on a given optical surface. The clustering method can separate positive and negative samples on each optical surface for different defects of industrial parts, and the interval rules are robust enough to overcome inconsistent descriptions of defect physical quantities caused by illumination conditions, workpiece material, workpiece shape, and similar factors, so that defects are accurately detected and divided across multiple items.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is an original image captured by an industrial camera;
FIG. 2 is a defect distribution diagram;
FIG. 3 is a distribution diagram of defect area and minimum average brightness;
FIG. 4 is a flow chart of the combinatorial optimization approximation algorithm;
FIG. 5 is a rule division diagram of defect area and minimum average brightness;
FIG. 6 is a rule division diagram of defect area and minimum average brightness during approximation;
FIG. 7 is a rule division diagram of defect area and minimum average brightness at the local optimum;
FIG. 8 is an algorithm flow diagram of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantageous effects solved by the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 4 and 8, the industrial part defect detection interval clustering method based on the combinatorial optimization algorithm of the invention specifically comprises the following steps:
Step 1, data acquisition: the equipment takes a picture of the workpiece and reads the contour points (pixel coordinates) in the original picture, which completes data acquisition. The equipment may be appearance inspection equipment for surface defects of 3C electronics. The workpiece is a 3C electronics workpiece, such as a mobile phone shell, a notebook computer shell, or a mobile phone accessory.
Step 2, data cleaning: data consistency checking, missing value processing, and abnormal value processing are carried out. Consistency checking examines whether the data contain values, such as a maximum or minimum, that differ from most values of the physical quantity. Missing value processing deletes any piece of data that has a missing value. Abnormal value processing deletes any piece of data whose value for one or more physical quantities falls outside the value range of that physical quantity.
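The missing-value and abnormal-value rules of step 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the quantity names, the value ranges in `VALID_RANGES`, and the function name `clean` are all assumptions made for the example.

```python
import math

# Hypothetical value ranges for two physical quantities (assumed for illustration).
VALID_RANGES = {"area": (0.0, 10000.0), "brightness": (0.0, 255.0)}


def clean(records, valid_ranges):
    """Keep only records with no missing value and no out-of-range value."""
    kept = []
    for rec in records:
        ok = True
        for name, (low, high) in valid_ranges.items():
            value = rec.get(name)
            # Missing value: key absent, None, or NaN -> delete the record.
            if value is None or (isinstance(value, float) and math.isnan(value)):
                ok = False
                break
            # Abnormal value: outside the value range of the physical quantity.
            if not low <= value <= high:
                ok = False
                break
        if ok:
            kept.append(rec)
    return kept


records = [
    {"area": 120.0, "brightness": 80.0},
    {"area": None, "brightness": 60.0},          # missing value
    {"area": 300.0, "brightness": 999.0},        # out of range
    {"area": 45.5, "brightness": float("nan")},  # NaN counts as missing
]
cleaned = clean(records, VALID_RANGES)
print(cleaned)
```

Deleting the whole record, rather than imputing a value, follows the text above, which removes any piece of data with a missing or out-of-range value.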
Step 3, balancing the data distribution: the classes are unevenly distributed, with very few positive samples and very many negative samples. Because the data are fixed, real data with practical significance, they are balanced by oversampling: positive sample data are randomly copied until the number of positive samples equals the number of negative samples. The invention is mainly used for defect detection, so by default defect data are positive samples and all non-defect data are negative samples. The data are real industrial data, all positive samples are defect data, and the distribution is balanced without discarding any positive sample. To balance by oversampling, the positive and negative samples are counted separately, positive samples are drawn at random and their copies are added to the positive samples, and the process stops when the number of positive samples equals the number of negative samples, which completes data balancing.
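Random oversampling as described in step 3 can be sketched in a few lines. The function name `oversample_positives`, the toy samples, and the fixed seed are assumptions for illustration only; the seed is there just to make the sketch repeatable.

```python
import random


def oversample_positives(positives, negatives, seed=0):
    """Randomly copy positive samples until both classes have the same size."""
    if not positives:
        raise ValueError("at least one positive sample is required")
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    balanced = list(positives)
    while len(balanced) < len(negatives):
        balanced.append(rng.choice(positives))  # duplicate an original positive
    return balanced


positives = [(20.0, 3.1), (35.0, 4.0)]
negatives = [(1.0, 0.6), (2.0, 0.9), (58.0, 11.0), (40.0, 9.5), (5.0, 1.2)]
balanced_pos = oversample_positives(positives, negatives)
print(len(balanced_pos), len(negatives))
```

Every original positive sample is kept and only copies are added, which matches the requirement that no positive sample be lost during balancing.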
Step 4, feature selection: a filtering method is used to perform feature selection on the expanded data, with the variance as the feature scoring standard (the larger the variance of a feature, the more it contributes to distinguishing samples). The $k$ features with the largest contribution are selected from the pre-expansion data and used for combinatorial optimization, where $k$ is the number of physical quantities after feature selection, $1 \le k \le$ the total number of physical quantities in the data, and $k$ is a positive integer. The filtering method proceeds as follows: using the balanced data, group the values by physical quantity and calculate the variance of each group (for example, with 100 pieces of data, 50 positive and 50 negative, and 12 physical quantities, there are 12 groups of 100 values each, giving 12 variances). The variance serves as the score of the feature weight: a physical quantity with a large variance has a high feature weight, that is, a large contribution, and a physical quantity with a small variance has a low feature weight. The $k$ physical quantities with the highest feature weights are taken as the physical quantities after feature selection, and the subsequent steps of the combinatorial optimization approximation algorithm use these physical quantities.
The variance of a characteristic physical quantity is calculated as:
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \mu\right)^2 \qquad (1)$$
where $\sigma^2$ is the variance of the characteristic physical quantity; $\mu$ is the mean of the physical quantity $X$ over the data points; $X_i$ is the value of the physical quantity in the $i$-th piece of data; and $N$ is the total number of samples, positive and negative, in the data set. Here each physical quantity is a feature. Taking 12 physical quantities denoted by the letters A, B, C, and so on, and 100 pieces of data (50 positive and 50 negative samples), $N = 100$; the physical quantity A group contains the data $A_1$ to $A_{100}$, and the variance of these 100 values is the variance of physical quantity A. The same applies to the other physical quantities.
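The variance-based filter can be illustrated with a small sketch. It assumes the population form of the variance (division by the total number of samples, as the definitions above suggest); the helper names `population_variance` and `select_top_k` and the toy values are hypothetical.

```python
def population_variance(values):
    """Mean squared deviation from the mean, dividing by the sample count N."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def select_top_k(data, k):
    """Return the k physical quantities with the largest variance, plus all scores.

    `data` maps each physical quantity name to its list of N values.
    """
    scores = {name: population_variance(vals) for name, vals in data.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Toy data: three physical quantities with four values each (illustrative only).
data = {
    "A": [0.0, 20.0, 40.0, 60.0],
    "B": [1.0, 1.0, 1.0, 1.0],
    "C": [2.0, 4.0, 6.0, 8.0],
}
selected, scores = select_top_k(data, k=2)
print(selected, scores)
```

Variance depends on the unit of each quantity, so in practice quantities measured on very different scales may need comparable scaling before they are ranked; the source does not discuss this point.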
Step 5, selecting a positive sample data point, setting an interval combination, gradually contracting the intervals for optimization, and generating a rule, specifically:
Step 5.1, selecting a positive sample data point and setting an interval combination: first, a positive sample data point $M$ is randomly selected from the data set after feature selection; then an interval combination is formed by taking the minimum and maximum values of each physical quantity in the data set as the interval boundaries. For example, suppose that after feature selection the data set retains three physical quantities A, B, and C, whose minimum and maximum values give the ranges [0, 60], [0.5, 12.2], and [802, 7034]. The randomly selected positive sample data point $M$ = (20, 3.1, 5000) lies within these ranges, interval combination 1 is as shown in Table 1, and the numbers of positive and negative samples in the interval combination are the total numbers of positive and negative samples in the data set. This preliminary interval combination spans the minimum and maximum of each physical quantity over the whole data set: for example, the minimum of physical quantity A in the data set is 0 and the maximum is 60, so the value of A at any point in the data set lies in [0, 60]; the same holds for the other physical quantities.
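The initial interval combination of step 5.1 is simply the per-quantity minimum and maximum. The sketch below uses made-up points whose extremes match the example ranges in the text; `initial_intervals` and `contains` are hypothetical helper names.

```python
def initial_intervals(points):
    """Build one (min, max) interval per physical quantity (column)."""
    columns = list(zip(*points))
    return [(min(col), max(col)) for col in columns]


def contains(intervals, point):
    """True if every coordinate of `point` lies inside its interval."""
    return all(low <= x <= high for (low, high), x in zip(intervals, point))


# Toy data set with quantities (A, B, C); values are illustrative only.
points = [
    (0.0, 0.5, 802.0),
    (20.0, 3.1, 5000.0),   # the randomly selected positive sample point
    (60.0, 12.2, 7034.0),
    (33.0, 7.7, 1500.0),
]
selected_point = (20.0, 3.1, 5000.0)
intervals = initial_intervals(points)
print(intervals, contains(intervals, selected_point))
```

Because the boundaries are the extremes of the data set, every sample starts inside the interval combination, which is why its initial positive and negative counts equal those of the whole data set.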
Step 5.2, gradually contracting the intervals for optimization and generating a rule: while the selected positive sample point $M$ remains inside the interval combination, the interval combination is gradually contracted and negative samples are filtered out until the ratio of negative samples to positive samples in the interval combination is less than or equal to a set ratio and the number of positive samples is the largest; the interval combination is then set as a rule. (While the interval combination is contracted and negative samples are filtered out, the positive sample point $M$ = (20, 3.1, 5000) always stays inside the interval combination. The contracted interval combination 2 is shown in Table 1; its ratio of positive to negative samples meets the requirement, so it is a locally optimal rule.) The contraction step of each physical quantity interval is calculated as:
$$\Delta = \frac{X_{\max} - X_{\min}}{N} \qquad (2)$$
where $\Delta$ is the contraction step of the physical quantity interval; $X_{\max}$ is the maximum value of the physical quantity; $X_{\min}$ is the minimum value of the physical quantity; and $N$ is the total number of samples, positive and negative, in the data set. For example, if the data set contains 1000 samples in total, the contraction step of physical quantity A is $(60 - 0)/1000 = 0.06$. The contracted boundaries are $(0 + 0.06 \times U)$ and $(60 - 0.06 \times V)$, where the contraction step coefficients U and V are positive integers greater than or equal to 1 and $U + V \le 1000$. $(0 + 0.06 \times U)$ means that the lower boundary of physical quantity A contracts gradually inward from the minimum value of A: $U = 2$ means the lower boundary has contracted inward by two units, moving from the minimum 0 to $0 + 0.06 \times 2 = 0.12$, a contraction of 0.12. $(60 - 0.06 \times V)$ means that the upper boundary contracts gradually inward from the maximum value of A: $V = 1$ means the upper boundary has contracted inward by one unit, moving from the maximum 60 to $60 - 0.06 \times 1 = 59.94$, a contraction of 0.06. Point $M$ = (20, 3.1, 5000) must always stay inside the interval combination. The intervals of the other physical quantities are contracted in the same way, by gradually increasing U and V, until the ratio of positive to negative samples in the interval is greater than or equal to 3:1, at which point contraction stops and a locally optimal interval combination is generated.
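The worked numbers for physical quantity A can be reproduced with a short sketch. The function names are hypothetical; the arithmetic follows the step definition (maximum minus minimum, divided by the total number of samples) and the U and V coefficients described above.

```python
def contraction_step(x_max, x_min, n_samples):
    """Contraction step: (max - min) / total number of samples."""
    return (x_max - x_min) / n_samples


def contracted_bounds(x_min, x_max, step, u, v):
    """Move the lower boundary up by u steps and the upper boundary down by v steps."""
    return x_min + step * u, x_max - step * v


# Physical quantity A ranges over [0, 60] and the data set holds 1000 samples.
step_a = contraction_step(60.0, 0.0, 1000)
lower, upper = contracted_bounds(0.0, 60.0, step_a, u=2, v=1)
print(step_a, lower, upper)
```

Tying the step to the sample count means each quantity is contracted in increments proportional to its own range, so quantities with very different units shrink at comparable relative rates.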
If the ratio of negative samples to positive samples in the interval combination is less than or equal to the set ratio and the number of positive samples is the largest, the interval combination is a locally optimal rule, and the data selected by the rule are removed from the data set. If the ratio is greater than the set ratio and the number of positive samples is not the largest, step 5.2 is repeated: with the selected positive sample point $M$ kept inside the interval combination, the interval combination is contracted and negative samples are filtered out. If the ratio is greater than the set ratio and the number of positive samples is the largest, step 5.2 is repeated; if the ratio is less than or equal to the set ratio and the number of positive samples is not the largest, step 5.2 is also repeated. The set ratio is applied as follows: negative samples are filtered out d times in succession, where d is a positive integer greater than or equal to 1, until the (d+1)-th filtering would remove positive samples and the ratio of positive samples to negative samples is greater than or equal to 3:1 (3:1 is the usual choice, but the ratio depends on the setting requirement). Contraction then stops and a rule is generated, and the rule takes the interval combination values of the d-th contraction.
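The text does not state the order in which the boundaries are contracted, so the following is only one possible reading of step 5.2, sketched under loud assumptions: at each iteration it tries moving every boundary inward by one step, keeps the selected point inside, and greedily picks the move that loses the fewest positive samples and then leaves the fewest negative samples. The greedy choice, the names `shrink_to_rule`, `count_inside`, and `pos_per_neg`, and the toy data are all illustrative and not taken from the source.

```python
def count_inside(intervals, points):
    """Count the points whose every coordinate lies inside its interval."""
    return sum(
        all(lo <= x <= hi for (lo, hi), x in zip(intervals, p)) for p in points
    )


def shrink_to_rule(positives, negatives, anchor, pos_per_neg=3):
    """Greedily contract the intervals around `anchor` (assumed strategy)."""
    everything = positives + negatives
    dims = range(len(anchor))
    intervals = [
        [min(p[d] for p in everything), max(p[d] for p in everything)] for d in dims
    ]
    # One contraction step per quantity: (max - min) / total number of samples.
    steps = [(hi - lo) / len(everything) for lo, hi in intervals]

    while True:
        pos = count_inside(intervals, positives)
        neg = count_inside(intervals, negatives)
        if neg * pos_per_neg <= pos:      # negative:positive <= 1:3
            return intervals, pos, neg, True
        best = None
        for d in dims:
            if steps[d] == 0:
                continue                  # constant quantity, nothing to contract
            for side in (0, 1):           # 0 = lower boundary, 1 = upper boundary
                trial = [iv[:] for iv in intervals]
                trial[d][side] += steps[d] if side == 0 else -steps[d]
                if not trial[d][0] <= anchor[d] <= trial[d][1]:
                    continue              # the anchor point must stay inside
                # Prefer losing few positives, then keeping few negatives.
                score = (pos - count_inside(trial, positives),
                         count_inside(trial, negatives))
                if best is None or score < best[0]:
                    best = (score, trial)
        if best is None:
            return intervals, pos, neg, False   # no admissible contraction left
        intervals = best[1]


positives = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)]
negatives = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0), (10.0, 0.0)]
rule, n_pos, n_neg, ok = shrink_to_rule(positives, negatives, anchor=(5.0, 5.0))
print(rule, n_pos, n_neg, ok)
```

The final flag reports whether the ratio was reached; when negative samples sit too close to the selected point, the sketch returns the tightest admissible intervals with the flag set to `False` instead of looping forever.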
TABLE 1
Interval combination 1: $0 \le A_M \le 60$ and $0.5 \le B_M \le 12.2$ and $802 \le C_M \le 7034$
Interval combination 2: $14.7 \le A_M \le 55$ and $0.5 \le B_M \le 6.3$ and $4000 \le C_M \le 7034$
In Table 1, $A_M$ is the value of point $M$ on physical quantity A; $B_M$ is the value of point $M$ on physical quantity B; and $C_M$ is the value of point $M$ on physical quantity C.
Step 6, removing the data covered by the rule from the data set and repeating step 5 on the remaining data until every positive sample has been selected by a rule, that is, until no positive sample remains in the data. This yields a set of rule descriptions that optimally divides the positive and negative samples and ends the combinatorial optimization approximation algorithm (which is essentially a logical traversal that checks the positive samples one by one against the requirement). After the complete algorithm flow has finished, the generated series of rule descriptions is put into use. If a new data set contains positive samples that do not conform to the existing rules, the new data set is put into the algorithm and step 5 is repeated; if the new data set has no positive samples, the series of rule descriptions is final and the combinatorial optimization approximation algorithm ends. After interval combination 1 (shown in Table 1) is obtained, the data corresponding to interval combination 1 are removed from the data set, the remaining data generate interval combination 2 (shown in Table 1), and so on; this prevents data covered by a previously generated interval combination from influencing later interval combinations. A series of combination rules can be described as: interval combination 1 ∪ interval combination 2 ∪ interval combination 3 ∪ … ∪ interval combination $g$, where $g \in [1, \infty)$ and $g$ is a positive integer. The symbol ∪ indicates that the interval combinations are related by "or", that is, the positive and negative samples are optimally divided whenever the rule of any one interval combination is satisfied. A series of combination rules can also be written as: $(A_1 \cap B_1 \cap C_1) \cup (A_2 \cap B_2 \cap C_2) \cup (A_3 \cap B_3 \cap C_3) \cup \dots \cup (A_g \cap B_g \cap C_g)$, where $g \in [1, \infty)$ and $g$ is a positive integer. The symbol ∪ indicates that the interval combinations are related by "or", and the symbol ∩ indicates that the physical quantities within each interval combination are related by "and".
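The "or of ands" structure of the rule set, and the removal of covered data in step 6, can be sketched as follows. The first rule reuses the interval combination 2 values from Table 1; the second rule, the sample points, and the function names are hypothetical and serve only to exercise the logic.

```python
def matches(rule, point):
    """AND within a rule: every quantity must lie inside its interval."""
    return all(low <= x <= high for (low, high), x in zip(rule, point))


def matches_any(rules, point):
    """OR across rules: satisfying any one interval combination is enough."""
    return any(matches(rule, point) for rule in rules)


def remove_covered(rule, points):
    """Drop the data selected by a rule before the next rule is generated."""
    return [p for p in points if not matches(rule, p)]


rules = [
    [(14.7, 55.0), (0.5, 6.3), (4000.0, 7034.0)],  # interval combination 2 in Table 1
    [(0.0, 10.0), (8.0, 12.2), (802.0, 2000.0)],   # hypothetical second rule
]
points = [(20.0, 3.1, 5000.0), (5.0, 9.0, 1000.0), (58.0, 11.0, 3000.0)]
flags = [matches_any(rules, p) for p in points]
remaining = remove_covered(rules[0], points)
print(flags, remaining)
```

Removing covered data after each rule keeps later rules from being shaped by samples that an earlier rule already explains, which is the effect described in the text above.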
If the data in a new data set do not conform to the existing series of combination rules (for example, 5 rules), the new data set is put into the algorithm and steps 5 and 6 are repeated until no positive sample remains in the new data set; the newly generated rules (for example, 2 rules) are then merged with the existing 5 rules to form a new combination rule containing 7 rules.
See FIG. 1, which is an original image captured by an industrial camera, in which black dots indicate positive samples and gray dots indicate negative samples. The figure shows that the shape of the workpiece and the defect data have to be extracted by reading information such as contour pixels.
See FIG. 2, which is a defect distribution diagram, from which the approximate location of the defects can be seen.
See FIG. 3, which is a distribution diagram of defect area and minimum average brightness, from which their approximate distribution can be seen.
See FIG. 5, which is a rule division diagram of defect area and minimum average brightness. The black frame is the initial interval combination range, with the selected positive sample point $M$ inside the interval combination. The interval contains 430 positive samples, and negative samples make up more than 25% of the total number of samples in the interval, where the total is the sum of the positive and negative samples.
See FIG. 6, which is a rule division diagram of defect area and minimum average brightness during approximation. The interval contains 62 positive samples and 165 negative samples. The black frame is the interval combination range during the approximation, with point $M$ inside the interval combination.
See FIG. 7, which is a rule division diagram of defect area and minimum average brightness at the local optimum. The black frame is the interval combination rule that has approached the local optimum, with point $M$ inside the interval combination. The optimized interval contains 3 positive samples and 1 negative sample, so negative samples account for 25% of the total number of samples in the interval.
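The 1:3 stopping ratio and the 25% share quoted for the figures are two forms of the same condition. The short derivation below is a worked sketch; the symbols $n_+$ and $n_-$ (the positive and negative sample counts inside the interval combination) are introduced here for illustration and do not appear in the source.

```latex
\[
\frac{n_-}{n_+} \le \frac{1}{3}
\iff 3n_- \le n_+
\iff 4n_- \le n_+ + n_-
\iff \frac{n_-}{n_+ + n_-} \le \frac{1}{4} = 25\%.
\]
\[
\text{FIG. 7: } n_+ = 3,\ n_- = 1:\quad \frac{1}{3 + 1} = 25\% \ \text{(criterion met)};
\qquad
\text{FIG. 6: } n_+ = 62,\ n_- = 165:\quad \frac{165}{62 + 165} \approx 72.7\% \ \text{(not yet met)}.
\]
```

This is why the interval in FIG. 6 is still being contracted while the interval in FIG. 7 is accepted as a rule.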
The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or change made by a person skilled in the art within the technical scope disclosed by the present invention, according to its technical solutions and inventive concept, shall fall within the scope of protection of the present invention.

Claims (5)

1. An industrial part defect detection interval clustering method based on a combinatorial optimization algorithm, characterized by comprising the following specific steps:
step 1, data acquisition: capturing pictures of the workpiece with the equipment, reading the contour points in the original pictures, and completing the data acquisition;
step 2, data cleaning: carrying out data consistency checking, missing value processing and outlier processing;
step 3, balancing the data distribution: because the variable data are distributed in an unbalanced manner, with extremely few positive samples and extremely many negative samples, the data are balanced by an oversampling method in view of this particularity of the data: the positive sample data are randomly copied until the number of positive samples is expanded to the number of negative samples;
step 4, feature selection: carrying out feature selection on the expanded data by a filter method, taking the variance as the feature scoring criterion; selecting from the data before expansion the top $k$ features with the largest contribution to form a new data set, and performing combinatorial optimization on the new data set, where $k$ is the number of physical quantities after feature selection, $1 \le k \le$ the total number of physical quantities of the data, and $k$ is a positive integer;
step 5, selecting a positive sample data point, setting an interval combination, and gradually contracting the intervals for optimization to generate a rule, with the following specific steps:
step 5.1, selecting a positive sample data point and setting an interval combination: first, randomly selecting a positive sample data point from the data set after feature selection; then forming an interval combination by taking the maximum value and the minimum value of each physical quantity in the data set as the interval boundaries;
step 5.2, gradually contracting the intervals for optimization to generate a rule: with the selected positive sample data point kept within the interval combination, contracting the interval combination and filtering out negative samples until the ratio of negative samples to positive samples in the interval combination is less than or equal to 1:3 and the number of positive samples is the largest, and then setting the interval combination as a rule;
step 6, removing the data covered by the rule from the data set and repeating step 5 on the remaining data until all positive samples have been selected by a rule, that is, until no positive sample remains in the data, thereby obtaining a group of rule descriptions that optimally divide the positive and negative samples, at which point the combinatorial optimization approximation algorithm ends.
2. The industrial part defect detection interval clustering method based on the combinatorial optimization algorithm according to claim 1, characterized in that: in step 5.2 of step 5, if the ratio of negative samples to positive samples in the interval combination is less than or equal to 1:3 and the number of positive samples is the largest, the interval combination is a locally optimal rule, and the data selected by the rule are removed from the data set; if the ratio of negative samples to positive samples in the interval combination is greater than 1:3 and the number of positive samples is not the largest, step 5.2 is repeated: with the selected positive sample data point kept within the interval combination, the interval combination is contracted and negative samples are filtered out.
3. The industrial part defect detection interval clustering method based on the combinatorial optimization algorithm according to claim 1, characterized in that: in step 5.2 of step 5, if the ratio of negative samples to positive samples in the interval combination is greater than 1:3 and the number of positive samples is the largest, step 5.2 is repeated; if the ratio of negative samples to positive samples in the interval combination is less than or equal to 1:3 and the number of positive samples is not the largest, step 5.2 is repeated.
4. The industrial part defect detection interval clustering method based on the combinatorial optimization algorithm according to claim 1, characterized in that: in step 6, after the complete algorithm flow has finished, a series of rule descriptions is generated and put into use; if there is a new data set that contains positive samples and does not conform to the existing generated rules, the new data set is fed into the algorithm and step 5 is repeated; if the new data set contains no positive samples, the series of rule descriptions is obtained and the combinatorial optimization approximation algorithm ends.
5. The industrial part defect detection interval clustering method based on the combinatorial optimization algorithm according to claim 1, characterized in that: in step 4, the variance of a characteristic physical quantity is calculated as:

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mu\right)^2 \tag{1}$$

where $\sigma^2$ represents the variance of the characteristic physical quantity; $\mu$ represents the mean of the physical quantity $X$; $x_i$ represents the value of the physical quantity on each piece of data; and $N$ represents the total number of samples in the data set, containing positive and negative samples.
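The oversampling of step 3 and the variance-based feature scoring of step 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `oversample_positives`, `variance` and `top_k_by_variance` are hypothetical, the variance divides by the total number of samples (one reading of formula (1)), and the toy rows are invented. Following step 4, the scores are computed on the expanded data and the selected columns are then taken from the data before expansion.

```python
import random
from typing import List, Sequence, Tuple


def oversample_positives(rows, labels, rng: random.Random):
    """Randomly copy positive rows until positives match the number of negatives."""
    pos = [r for r, y in zip(rows, labels) if y]
    neg = [r for r, y in zip(rows, labels) if not y]
    if not pos:
        raise ValueError("at least one positive sample is required")
    extra = [rng.choice(pos) for _ in range(max(0, len(neg) - len(pos)))]
    balanced = pos + extra + neg
    balanced_labels = [True] * (len(pos) + len(extra)) + [False] * len(neg)
    return balanced, balanced_labels


def variance(values: Sequence[float]) -> float:
    """Mean squared deviation from the mean, divided by the sample count."""
    n = len(values)
    mu = sum(values) / n
    return sum((v - mu) ** 2 for v in values) / n


def top_k_by_variance(rows, k: int) -> Tuple[List[int], List[float]]:
    """Score every column by its variance and return the k best column indices."""
    n_features = len(rows[0])
    if not 1 <= k <= n_features:
        raise ValueError("k must satisfy 1 <= k <= number of physical quantities")
    scores = [variance([r[j] for r in rows]) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:k], scores


# Toy data: three physical quantities per sample, one positive and three negatives.
rows = [(1.0, 10.0, 0.5), (2.0, 10.0, 0.5), (3.0, 50.0, 0.5), (4.0, 90.0, 0.5)]
labels = [True, False, False, False]

balanced, balanced_labels = oversample_positives(rows, labels, random.Random(0))
selected, scores = top_k_by_variance(balanced, k=2)

# Apply the selected columns to the data before expansion.
reduced = [tuple(r[j] for j in selected) for r in rows]
print(selected, reduced)
```

Raw variance depends on the scale of each physical quantity, so quantities measured in different units may need to be normalized before scoring; the source does not discuss this.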
CN202111078182.0A 2021-09-15 2021-09-15 Industrial part defect detection interval clustering method based on combined optimization algorithm Active CN113538423B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111078182.0A CN113538423B (en) 2021-09-15 2021-09-15 Industrial part defect detection interval clustering method based on combined optimization algorithm

Publications (2)

Publication Number Publication Date
CN113538423A (en) 2021-10-22
CN113538423B (en) 2022-01-07

Family

ID=78092579

Country Status (1)

Country Link
CN (1) CN113538423B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant