CN110533071B - SMT production tracing method based on self-encoder and ensemble learning - Google Patents


Info

Publication number
CN110533071B
Authority
CN
China
Prior art keywords
encoder
value
self
data
tracing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910688024.3A
Other languages
Chinese (zh)
Other versions
CN110533071A (en)
Inventor
常建涛
张凯磊
孔宪光
王佩
Current Assignee
Xidian University
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University
Priority to CN201910688024.3A
Publication of CN110533071A
Application granted
Publication of CN110533071B
Legal status: Active


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/243: Classification techniques relating to the number of classes
    • G06F18/24323: Tree-organised classifiers
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/20: Ensemble learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/018: Certifying business or products
    • G06Q30/0185: Product, service or business identity fraud
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/04: Manufacturing
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Abstract

The invention discloses an SMT production tracing method based on a self-encoder and ensemble learning, which comprises the following steps: (1) constructing the self-encoders; (2) acquiring an SPI defect tracing data set; (3) normalizing the SPI defect tracing data set; (4) training the self-encoders; (5) obtaining a classification tree set by an ensemble learning method; (6) obtaining the SMT production tracing sequence. The normalized SPI defect tracing data set is input into the trained self-encoders to generate a classification data set, classification trees are trained by the ensemble learning method, and the trained classification trees are traversed to obtain the SMT production tracing sequence. The method locates the key factors causing product defects and improves the accuracy of SMT production tracing.

Description

SMT production tracing method based on self-encoder and ensemble learning
Technical Field
The invention belongs to the technical field of electronics, and more particularly relates to a Surface Mount Technology (SMT) production tracing method based on a self-encoder and ensemble learning, within the field of informatization of the electronics manufacturing industry. The invention can be applied to tracing the Printed Circuit Board (PCB) of an electronic product through the surface-mount production process, in order to quickly locate the key factors causing product defects.
Background
SMT is an electronic assembly technique that mounts surface-mount components onto a printed board. ISO 9000 defines "traceability" as the ability to trace the history, application or location of the object under consideration. Tracing makes it possible to control and adjust the unstable technical, human or management factors that cause defect points, and thereby to continuously improve product quality. Among the various tracing methods, those based on machine learning and deep learning make effective use of the many kinds of data generated in the SMT production process, alleviate the problem of insufficient data utilization, and realize SMT production tracing.
Shanghai 'an Science and Technology Co., Ltd. discloses an SMT production tracing method in the patent document "An SMT production intelligent error-proof tracing method and technology" (patent application No. 201810719538.6, application publication No. CN 109911365A). In that method, an intelligent warehouse is established and its operation flow is standardized; the intelligent warehouse is used jointly with an intelligent production module to achieve real-time feedback and optimization of the SMT production line, thereby realizing SMT production tracing. The drawback of this method is that it can only collect data related to the product's production process: it cannot deeply mine the relationship between SMT production information and production defects, and therefore cannot locate the key factors causing product defects in a timely and accurate manner.
Ban X et al., in the paper "Quality tracking of converter steel based on adaptive feature selection and multiple linear regression" (2018 IEEE International Conference on Big Data and Smart Computing (BigComp), IEEE, 2018: 462-468), disclose a method for tracing abnormal production data in the converter steel-making process. The method performs feature selection with an adaptive feature selection method based on correlation and deviation matching, analyzes the causal relationships between the parameters with multiple linear regression, and takes the feature with the largest coefficient in the regression equation as the key factor causing the production abnormality. Its drawback is that the adaptive feature selection can only find features linearly related to the dependent variable, so too many features are discarded; furthermore, multiple linear regression cannot describe the nonlinear relationship between the independent and dependent variables, so its ability to explain that relationship is weak.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an SMT production tracing method based on a self-encoder and ensemble learning, so as to locate the key factors causing product defects.
The idea for realizing this purpose is as follows: construct 18 self-encoders with the same structure but different parameters; process the normalized SPI defect tracing data set with them to obtain a classification data set; obtain a classification tree set using an ensemble learning method; and finally traverse the classification tree set to obtain the SMT production tracing sequence, locating the production information that has an important influence on SMT production defects.
The method comprises the following specific steps:
(1) constructing an auto encoder:
(1a) build 18 self-encoders with the same structure but different parameters, where each self-encoder has three layers arranged as: input layer → fully connected layer → output layer;
(1b) set the number of nodes of the input layer and of the output layer to 76;
(1c) the number of fully-connected-layer nodes of each self-encoder is set according to a formula (rendered as an image in the original filing; not reproduced here) in which n_i represents the number of fully-connected-layer nodes of the i-th self-encoder, i ∈ {1, 2, …, 18}, ⌊·⌋ represents the rounding-down operation, and % represents the remainder operation;
(1d) calculating the activation value of each node of the fully connected layer in the 1st to 9th self-encoders according to the following formula:

T_mj = 1 / (1 + e^(−x_mj))

where T_mj represents the activation value of the j-th node in the fully connected layer of the m-th self-encoder, m ∈ {1, 2, …, 9}, j ∈ {1, 2, …, N_m}, N_m represents the total number of fully-connected-layer nodes of the m-th self-encoder, e^(·) represents the exponential operation with the natural constant e as base, x_mj represents the input value of the j-th node in the fully connected layer of the m-th self-encoder, x_mj = W_mj^T · X_m, W_mj represents the weight matrix of the network between the input layer of the m-th self-encoder and the j-th node in the fully connected layer, whose elements are initialized from a standard normal distribution, T represents the transposition operation, and X_m represents the vector consisting of the input values of the 76 nodes in the input layer of the m-th self-encoder;
(1e) calculating the activation value of each node of the fully connected layer in the 10th to 18th self-encoders according to the following formula:

R_ln = max(0, x_ln)

where R_ln represents the activation value of the n-th node in the fully connected layer of the l-th self-encoder, l ∈ {10, …, 18}, n ∈ {1, 2, …, N_l}, N_l represents the total number of fully-connected-layer nodes of the l-th self-encoder, max(·) represents the maximum operation, x_ln represents the input value of the n-th node in the fully connected layer of the l-th self-encoder, x_ln = W_ln^T · X_l, W_ln represents the weight matrix of the network between the input layer of the l-th self-encoder and the n-th node of the fully connected layer, whose elements are initialized from a standard normal distribution, and X_l represents the vector consisting of the input values of the 76 nodes in the input layer of the l-th self-encoder;
(1f) calculating the loss error value between each self-encoder's output-layer output values and its input-layer input values according to:

L_i = (1 / N_i) · Σ_{k=1}^{N_i} (y_ik − ŷ_ik)²

where L_i represents the loss error value between the output values of the i-th self-encoder's output layer and the input values of its input layer, i ∈ {1, 2, …, 18}, N_i represents the number of input-layer nodes (equal to the number of output-layer nodes) of the i-th self-encoder, Σ represents the summation operation, y_ik represents the input value of the k-th node of the i-th self-encoder's input layer, ŷ_ik represents the output value of the k-th node of the i-th self-encoder's output layer, and k ∈ {1, 2, …, N_i};
(2) acquiring an SPI defect tracing data set:
randomly extracting at least 5,320,000 SPI tracing data values from a database of a Manufacturing Execution System (MES) to form an M×N-dimensional SPI defect tracing data set, where M is at least 70,000 and N is at least 76; each row of data represents one SPI defect tracing record containing production information, each column represents the sequence formed by all values of one attribute in the SPI tracing data set, and at least 20,000 rows of the SPI defect tracing data set are defective detection data;
(3) according to the following formula, normalizing the data of each attribute in the SPI defect tracing data set to obtain a normalized SPI defect tracing data set:
x'_qp = (x_qp − min(x_q)) / (max(x_q) − min(x_q))

where x'_qp represents the normalized value of the p-th data item of the q-th attribute in the SPI defect tracing data set, x_qp represents the p-th data item of the q-th attribute of the SPI defect tracing data set, min(·) represents the minimum-value operation, x_q represents all data of the q-th attribute of the SPI defect tracing data set, and max(·) represents the maximum-value operation;
(4) training the self-encoder:
respectively inputting the normalized SPI defect tracing data set into the input layer of each of the 18 self-encoders, and respectively training each self-encoder with the stochastic gradient descent method, obtaining 18 trained self-encoders in total;
(5) obtaining a set of classification trees using ensemble learning:
(5a) inputting all data of the normalized M×N-dimensional SPI defect tracing data set row by row into the fully connected layer of each trained self-encoder, and forming the output data of all fully-connected-layer nodes into an M×N′-dimensional classification data set, where the value of N′ equals the number of fully-connected-layer nodes;
(5b) selecting A rows of data from the classification data set to form a training set, where A is given by a formula (rendered as an image in the original filing; not reproduced here) involving ⌊·⌋, the rounding-down operation, and M, the number of rows of the classification data set; the remaining data in the classification data set form the test set;
(5c) training the training set by using a classification regression tree CART training method to obtain a trained classification tree;
(5d) classifying the test set by using the trained classification tree to obtain the classification accuracy of the classification tree;
(6) obtaining an SMT production tracing sequence:
(6a) for each trained classification tree, taking a root node of the classification tree as a starting node of each traversal, sequentially taking all leaf nodes of the classification tree as target nodes of each traversal, and taking all attribute names passed by each traversal as a tracing sequence of the classification tree;
(6b) taking the classification accuracy of each classification tree as the credibility of all tracing sequences of the classification tree;
(6c) searching a self-encoder corresponding to each tracing sequence, then searching a node corresponding to each attribute name in the tracing sequence corresponding to the self-encoder in a full connection layer of the self-encoder, and forming a network weight vector of the attribute name by using network weight values from all nodes of an input layer of the self-encoder to the corresponding node in the full connection layer, wherein the total number of elements of the network weight vector corresponding to each attribute name is the same as the number of nodes of the input layer of the corresponding self-encoder;
(6d) arranging the network weight vectors of each attribute name of each tracing sequence according to rows to form a C x D-dimensional risk matrix of the tracing sequence, wherein C represents the total number of the attribute names in the tracing sequence, and D represents the total number of the attributes in the SPI defect tracing data set;
(6e) all the data obtained by summing the risk matrix of each tracing sequence according to columns form a tracing vector of the tracing sequence, wherein each data in the tracing vector represents the importance of a corresponding attribute in the SPI defect tracing data set;
(6f) sorting all data of the tracing vector of each tracing sequence in descending order to form the SMT production tracing sequence; SMT production information with higher importance has a stronger influence on SMT production defects.
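Steps (6d)-(6f) above can be sketched as follows. The dimensions (a tracing sequence with C = 3 attribute names over a data set with D = 5 attributes) and the random stand-in weight vectors are assumptions for illustration only:

```python
import numpy as np

# Hypothetical toy dimensions: C = 3 attribute names, D = 5 data-set attributes.
rng = np.random.default_rng(0)
risk_matrix = rng.standard_normal((3, 5))   # (6d) one network weight vector per row

# (6e) Sum the risk matrix column-wise: the tracing vector, whose entries are
# the importance scores of the corresponding attributes of the data set.
trace_vector = risk_matrix.sum(axis=0)

# (6f) Attribute indices sorted by importance in descending order give the
# SMT production tracing sequence.
tracing_sequence = np.argsort(trace_vector)[::-1]
print(tracing_sequence)
```

The real method would build one risk matrix per tracing sequence from the trained self-encoder weights; this sketch only shows the column-sum-and-sort step.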
Compared with the prior art, the invention has the following advantages:
First, the invention constructs and trains self-encoders that simultaneously retain the independent variables linearly related and those nonlinearly related to the dependent variable. This overcomes the defect that the prior art can select only independent variables linearly related to the dependent variable, so the SMT production information contained in the SMT production tracing sequence obtained by the invention is more comprehensive.
Second, the invention obtains a classification tree set by ensemble learning and derives the SMT production tracing sequence from it, describing both the linear and the nonlinear relationships between the independent and dependent variables. This overcomes the defect that the prior art can describe only linear relationships, so the invention identifies the key factors causing product defects more accurately.
Third, because the invention uses ensemble learning to obtain the classification tree set, it deeply mines the relationship between SMT production information and SMT product defects. This overcomes the defect that the prior art can only obtain data related to the product's production process, so the invention can uncover hidden factors causing product defects.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic diagram of the self-encoder structure of the present invention;
FIG. 3 is a schematic diagram of production information for the SPI production traceability dataset of the present invention;
FIG. 4 is a classification tree of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The specific steps of the present invention will be described in further detail with reference to fig. 1.
Step 1: construct the self-encoders.
Build 18 self-encoders with the same structure but different parameters, where each self-encoder has three layers arranged as: input layer → fully connected layer → output layer.
The number of nodes of the input layer and the output layer is set to 76.
The number of fully-connected-layer nodes of each self-encoder is set according to a formula (rendered as an image in the original filing; not reproduced here) in which n_i represents the number of fully-connected-layer nodes of the i-th self-encoder, i ∈ {1, 2, …, 18}, ⌊·⌋ represents the rounding-down operation, and % represents the remainder operation.
The structure of the constructed self-encoder will be further described with reference to fig. 2.
In fig. 2, the circles marked X in the leftmost column represent the nodes of the input layer, the circles marked h in the middle column represent the nodes of the fully connected layer, and the circles marked y in the right column represent the nodes of the output layer. Each arrowed line represents the value of the node at its left end being multiplied by the corresponding weight to give an input value of the node at its right end; the arrows indicate the data flow direction in the prediction process.
Calculating an activation value of each node of the fully-connected layers in the 1 st to 9 th self-encoders according to the following formula:
T_mj = 1 / (1 + e^(−x_mj))

where T_mj represents the activation value of the j-th node in the fully connected layer of the m-th self-encoder, m ∈ {1, 2, …, 9}, j ∈ {1, 2, …, N_m}, N_m represents the total number of fully-connected-layer nodes of the m-th self-encoder, e^(·) represents the exponential operation with the natural constant e as base, x_mj represents the input value of the j-th node in the fully connected layer of the m-th self-encoder, x_mj = W_mj^T · X_m, W_mj represents the weight matrix of the network between the input layer of the m-th self-encoder and the j-th node of the fully connected layer, whose elements are initialized from a standard normal distribution, T represents the transposition operation, and X_m represents the vector consisting of the input values of the 76 nodes in the input layer of the m-th self-encoder.
Calculating an activation value of each node of the fully-connected layers in the 10 th to 18 th self-encoders according to the following formula:
R_ln = max(0, x_ln)

where R_ln represents the activation value of the n-th node in the fully connected layer of the l-th self-encoder, l ∈ {10, …, 18}, n ∈ {1, 2, …, N_l}, N_l represents the total number of fully-connected-layer nodes of the l-th self-encoder, max(·) represents the maximum operation, x_ln represents the input value of the n-th node in the fully connected layer of the l-th self-encoder, x_ln = W_ln^T · X_l, W_ln represents the weight matrix of the network between the input layer of the l-th self-encoder and the n-th node of the fully connected layer, whose elements are initialized from a standard normal distribution, T represents the transposition operation, and X_l represents the vector consisting of the input values of the 76 nodes in the input layer of the l-th self-encoder.
Table 1. Parameter table of the 18 self-encoders
(table rendered as an image in the original filing; not reproduced here)
In an embodiment of the present invention, the parameters of the 18 self-encoders are shown in Table 1.
Calculating a loss error value between each output value from the encoder output layer and the input layer input value according to:
L_i = (1 / N_i) · Σ_{k=1}^{N_i} (y_ik − ŷ_ik)²

where L_i represents the loss error value between the output values of the i-th self-encoder's output layer and the input values of its input layer, i ∈ {1, 2, …, 18}, N_i represents the number of input-layer nodes (equal to the number of output-layer nodes) of the i-th self-encoder, Σ represents the summation operation, y_ik represents the input value of the k-th node of the i-th self-encoder's input layer, ŷ_ik represents the output value of the k-th node of the i-th self-encoder's output layer, and k ∈ {1, 2, …, N_i}.
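The activation and loss calculations above can be sketched in NumPy as one forward pass plus the reconstruction error. The linear output layer, the hidden-layer size, and the random inputs are assumptions for illustration; the text specifies only the hidden-layer activations and the 76 input/output nodes:

```python
import numpy as np

def forward_and_loss(X, W_hidden, W_out, use_sigmoid):
    z = W_hidden.T @ X                   # x_mj = W_mj^T X_m for each hidden node
    if use_sigmoid:                      # self-encoders 1-9: T = 1 / (1 + e^(-x))
        h = 1.0 / (1.0 + np.exp(-z))
    else:                                # self-encoders 10-18: R = max(0, x)
        h = np.maximum(0.0, z)
    y_hat = W_out.T @ h                  # reconstruction at the output layer (assumed linear)
    loss = np.mean((X - y_hat) ** 2)     # L_i: mean squared reconstruction error
    return y_hat, loss

rng = np.random.default_rng(1)
n_in, n_hidden = 76, 40                  # 76 input/output nodes as specified; 40 is illustrative
W_hidden = rng.standard_normal((n_in, n_hidden))  # elements initialized from N(0, 1)
W_out = rng.standard_normal((n_hidden, n_in))
X = rng.standard_normal(n_in)
y_hat, loss = forward_and_loss(X, W_hidden, W_out, use_sigmoid=True)
```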
Step 2: acquire the SPI defect tracing data set.
At least 5,320,000 SPI tracing data values are randomly extracted from a database of a Manufacturing Execution System (MES) to form an M×N-dimensional SPI defect tracing data set, where M is at least 70,000 and N is at least 76; each row of data represents one SPI defect tracing record containing production information, each column represents the sequence formed by all values of one attribute in the SPI tracing data set, and at least 20,000 rows of the SPI defect tracing data set are defective detection data.
The five types of production information included in the SPI defect tracing data are further described with reference to fig. 3. The box labeled "process parameters" in fig. 3 represents production information in terms of process parameters, including blade classification speed, blade classification distance, platen print height compensation, platen separation speed, platen separation distance, blade pressure, and cleaning speed. The box labeled "printing process status parameters" represents production information in terms of printing process status parameters, including print time, work file, production count, squeegee count, MASK count, squeegee mean pressure, squeegee minimum pressure, squeegee maximum pressure, auto clean count, manual clean count, print direction, and platen separation delay. The box labeled "intermediate product inspection parameters" represents production information in terms of intermediate product inspection result parameters, including pad volume, pad area, pad height, and inspection result. The box labeled "environmental parameters" represents production information in terms of environmental parameters, including humidity and temperature. The box labeled "raw material property parameters" represents production information in terms of raw material property parameters, including PCB bar code, PCB length, PCB width, PCB thickness, pad number, package type, doctor blade ID, and steel mesh ID. The box labeled "MES system" in the middle represents the MES system, and the lines in the figure indicate that the five types of production information all come from the MES system.
Step 3: normalize the data of each attribute in the SPI defect tracing data set according to the following formula, obtaining the normalized SPI defect tracing data set:

x'_qp = (x_qp − min(x_q)) / (max(x_q) − min(x_q))

where x'_qp represents the normalized value of the p-th data item of the q-th attribute in the SPI defect tracing data set, x_qp represents the p-th data item of the q-th attribute of the SPI defect tracing data set, min(·) represents the minimum-value operation, x_q represents all data of the q-th attribute of the SPI defect tracing data set, and max(·) represents the maximum-value operation.
In the embodiment of the present invention, the SPI defect tracing data set before normalization is shown in Table 2. Each row contains 7 types of production information: blade pressure, blade speed, separation speed, pad volume, pad area, pad height, and SPI detection result. The serial number is the number of the row in which the data are located. Blade pressure is given in newtons per square centimeter, blade speed in millimeters per second, and separation speed in centimeters per second. Pad volume, pad area, and pad height are the relative values automatically calculated by the SPI detection device. The SPI detection result is the detection result of the SPI detection device, where 0 represents no defect and 1 represents a continuous-tin (solder bridging) defect.
Table 2. Partial data of the SPI defect tracing data set
(table rendered as an image in the original filing; not reproduced here)
The normalization process in the embodiment of the present invention is illustrated with the blade pressure value in the first row of Table 2. The maximum of all blade pressure values listed in Table 2 is 13 and the minimum is 8. The blade pressure value in the first row is 11, which is normalized as follows:
x' = (11 − 8) / (13 − 8) = 0.6
so the normalized blade pressure value for the first row of the data table is 0.6.
The results obtained after normalizing all the data of table 2 are shown in table 3.
Table 3. Partial normalized SPI defect tracing data
(table rendered as an image in the original filing; not reproduced here)
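The min-max normalization of Step 3 can be sketched as below, reproducing the worked blade-pressure example (11, with minimum 8 and maximum 13, maps to 0.6); the other values in the list are made up for illustration:

```python
import numpy as np

def min_max_normalize(column):
    """Column-wise min-max normalization from Step 3:
    x'_qp = (x_qp - min(x_q)) / (max(x_q) - min(x_q))."""
    col = np.asarray(column, dtype=float)
    return (col - col.min()) / (col.max() - col.min())

# Illustrative blade-pressure column whose minimum is 8 and maximum is 13.
blade_pressure = [11, 8, 13, 10]
normalized = min_max_normalize(blade_pressure)
print(normalized[0])  # the value 11 maps to (11 - 8) / (13 - 8) = 0.6
```

In the real method this normalization is applied to every attribute column of the M×N SPI defect tracing data set.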
Step 4: train the self-encoders.
Input the normalized SPI defect tracing data set into the input layer of each of the 18 self-encoders, and train each self-encoder with the stochastic gradient descent method, obtaining 18 trained self-encoders in total.
The stochastic gradient descent method proceeds as follows:
Step 1. Randomly select a not-yet-selected data row from the normalized SPI defect tracing data set.
Step 2. After inputting the selected data into the input layer of the self-encoder, calculate the loss error value between the output data of the self-encoder's output layer and the selected data according to the following formula:
L_i = (1 / N_i) · Σ_{k=1}^{N_i} (y_ik − ŷ_ik)²

where L_i represents the loss error value between the output values of the i-th self-encoder's output layer and the input values of its input layer, i ∈ {1, 2, …, 18}, N_i represents the number of input-layer nodes (equal to the number of output-layer nodes) of the i-th self-encoder, Σ represents the summation operation, y_ik represents the input value of the k-th node of the i-th self-encoder's input layer, ŷ_ik represents the output value of the k-th node of the i-th self-encoder's output layer, and k ∈ {1, 2, …, N_i};
Step 3. Update each parameter of the self-encoder network according to the following formula:

ω′_t = ω_t − l · (∂L_i / ∂θ_t)

where ω′_t denotes the t-th parameter of the self-encoder after updating, t ∈ {1, 2, …, 2×N×(num+1)}, num denotes the total number of fully-connected-layer nodes of the self-encoder, ω_t represents the t-th parameter of the self-encoder before updating, l represents the learning rate with value range [0, 1], ∂/∂θ_t denotes the derivation operation, and θ_t represents the t-th parameter of the self-encoder before the parameter update;
Step 4. Input the data selected in Step 1 into the input layer of the self-encoder after the parameter update, and calculate the loss error value between the output-layer output data of each updated self-encoder and the selected data according to the following formula:

L_i = (1 / N_i) · Σ_{k=1}^{N_i} (y_ik − ŷ_ik)²

where L_i represents the loss error value between the output values of the i-th self-encoder's output layer and the input values of its input layer, i ∈ {1, 2, …, 18}, N_i represents the number of input-layer nodes and output-layer nodes of the i-th self-encoder, Σ represents the summation operation, y_ik represents the input value of the k-th node of the i-th self-encoder's input layer, ŷ_ik represents the output value of the k-th node of the i-th self-encoder's output layer, and k ∈ {1, 2, …, N_i};
in the embodiment of the present invention, the training error values of 18 autoencoders are shown in table 4:
table 418 loss error values table from encoder
Self encoder sequence number Error value of training Self encoder sequence number Error value of training
1 0.0093 10 0.0158
2 0.0159 11 0.0100
3 0.0095 12 0.0034
4 0.0061 13 0.0081
5 0.0036 14 0.0075
6 0.0067 15 0.0030
7 0.0119 16 0.0017
8 0.0195 17 0.0075
9 0.0194 18 0.0151
Step 5. Judge whether the loss error value between the updated output-layer output values of the self-encoder and the selected data is smaller than the loss error threshold; if so, a trained self-encoder is obtained; otherwise, return to Step 1. The threshold is a value selected from the range [0, 300] according to the required training precision of the self-encoder network: the larger the selected value, the lower the training precision; the smaller the selected value, the higher the training precision.
In the embodiment of the present invention, the threshold value is set to 0.02.
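Steps 1-5 above can be sketched as the following training loop for one self-encoder. The analytic gradients assume a sigmoid hidden layer and a linear output layer (the text does not fully specify the output layer), sampling is done with replacement for simplicity, and the stand-in data set, layer sizes, and learning rate are all illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgd_train(data, n_hidden, lr=0.1, threshold=0.02, max_steps=20000, seed=0):
    """Stochastic gradient descent per Steps 1-5: pick one random sample,
    compute the reconstruction loss, update every parameter along its
    gradient, and stop once the loss falls below the chosen threshold
    (0.02 in the embodiment)."""
    rng = np.random.default_rng(seed)
    n_in = data.shape[1]
    W1 = 0.1 * rng.standard_normal((n_in, n_hidden))  # small init for stability
    W2 = 0.1 * rng.standard_normal((n_hidden, n_in))
    loss = np.inf
    for _ in range(max_steps):
        x = data[rng.integers(len(data))]   # Step 1: pick one sample
        h = sigmoid(W1.T @ x)               # hidden activations
        y = W2.T @ h                        # reconstruction
        loss = np.mean((x - y) ** 2)        # Steps 2/4: loss error value
        if loss < threshold:                # Step 5: stopping criterion
            break
        g_y = 2.0 * (y - x) / n_in          # dL/dy
        g_h = W2 @ g_y                      # backpropagate to the hidden layer
        g_z = g_h * h * (1.0 - h)           # through the sigmoid derivative
        W2 -= lr * np.outer(h, g_y)         # Step 3: parameter updates
        W1 -= lr * np.outer(x, g_z)
    return W1, W2, loss

# Stand-in for the normalized data set: 64 samples of 8 correlated features.
rng = np.random.default_rng(42)
data = rng.uniform(size=(64, 3)) @ rng.uniform(size=(3, 8))
W1, W2, final_loss = sgd_train(data, n_hidden=6)
```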
Step 5: obtain the classification tree set using the ensemble learning method.
Input all data of the normalized M×N-dimensional SPI defect tracing data set row by row into the fully connected layer of each trained self-encoder, and form the output data of all fully-connected-layer nodes into an M×N′-dimensional classification data set, where the value of N′ equals the number of fully-connected-layer nodes.
Select A rows of data from the classification data set to form a training set, where A is given by a formula (rendered as an image in the original filing; not reproduced here) involving ⌊·⌋, the rounding-down operation, and M, the number of rows of the classification data set; the remaining data in the classification data set form the test set.
Train on the training set with the classification and regression tree (CART) training method to obtain a trained classification tree.
The CART training method comprises the following steps:
step 1, taking the sequence number of each column in the training set as an attribute of the training set, and forming a value sequence of the attribute corresponding to the training set by all elements of each column in the training set;
step 2, deleting repeated numerical values in the value sequence of each attribute to obtain a numerical value set of each attribute;
step 3, counting the frequency of each value in the value set of each attribute, wherein the frequency of each value appearing in the value sequence of the attribute corresponding to the training set is used as the frequency of the value;
and 4, calculating the value sequence and the value set of each attribute according to the following formula to obtain the Gini index value of each attribute:

g_b = 1 − Σ_{s=1}^{N_b} (n_bs / n_b)^2

wherein g_b represents the Gini index value of the b-th attribute, N_b represents the total number of values in the value set of the b-th attribute, Σ represents the summation operation, s represents the sequence number of a value in the value set, s ∈ [1, N_b], n_bs represents the frequency of the s-th value in the value set of the b-th attribute, and n_b represents the total number of values in the value sequence of the b-th attribute;
step 5, taking the attribute with the maximum Gini index value as the optimal attribute;
step 6, adding the attribute name of the optimal attribute into a base classifier;
step 7, arranging all numerical values of the value sequence of the optimal attribute from small to large as the optimal attribute sequence;
step 8, sequentially taking the average value of each pair of adjacent numerical values in the optimal attribute sequence from left to right as a segmentation point of the optimal attribute sequence, forming all the numerical values smaller than the segmentation point in the sequence into a left sequence of the segmentation point, and forming all the numerical values larger than the segmentation point in the sequence into a right sequence of the segmentation point;
and 9, respectively calculating the Gini index value of each segmentation point according to a formula reproduced only as an image in the original, wherein g represents the Gini index value of the segmentation point, c represents the number of values in the left sequence of the segmentation point, and d represents the number of values in the right sequence of the segmentation point;
step 10, selecting the value of the segmentation point with the maximum Gini index value as the segmentation threshold value of the optimal attribute;
step 11, taking the row elements of each row in the training set as a piece of classification data;
step 12, forming a left sub-training set by the classification data of which the values of all the optimal attributes are less than or equal to the segmentation threshold, and forming a right sub-training set by the classification data of which the values of all the optimal attributes are greater than the segmentation threshold;
and (13) respectively training the left sub-training set and the right sub-training set by using the CART training method which is the same as that in the step (5c) until the SPI detection results of all data in the left sub-training set and the right sub-training set are the same, and forming a classification tree by all attribute names of the base classifier.
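Steps 1–13 above revolve around two helpers: the Gini index of a value sequence and the choice of a segmentation threshold from midpoints of adjacent sorted values. The patent's segmentation-point formula is given only as an image, and the patent selects the point with the maximum Gini index value; the sketch below substitutes the standard CART convention of minimizing the weighted Gini impurity of the two sides, so the scoring is an assumption rather than the patent's exact formula.

```python
from collections import Counter

def gini(values):
    # 1 - sum over distinct values of (frequency / total)^2
    n = len(values)
    return 1.0 - sum((c / n) ** 2 for c in Counter(values).values())

def best_split(column, labels):
    # Steps 7-10: sort distinct values, take midpoints of adjacent values as
    # candidate segmentation points, score each by the weighted Gini of the
    # resulting left/right label sequences, and keep the best threshold.
    order = sorted(set(column))
    best_t = best_g = None
    for a, b in zip(order, order[1:]):
        t = (a + b) / 2.0
        left = [y for x, y in zip(column, labels) if x <= t]
        right = [y for x, y in zip(column, labels) if x > t]
        g = (len(left) * gini(left) + len(right) * gini(right)) / len(column)
        if best_g is None or g < best_g:
            best_t, best_g = t, g
    return best_t, best_g

t, g = best_split([1.0, 2.0, 3.0, 4.0], ["ok", "ok", "tin", "tin"])
print(t, g)   # the midpoint 2.5 separates the two classes -> weighted Gini 0.0
```

Recursing `best_split` on the left and right subsets until each subset carries a single SPI detection result yields the classification tree of step 13.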
The trained classification tree is further described with reference to fig. 4, where each box in fig. 4 represents a node of the classification tree. The box labeled "X[2]" indicates that the value of that node is the attribute name of the output data sequence of the 2nd node of the fully connected layer of the self-encoder corresponding to the classification tree; likewise, the boxes labeled "X[16]", "X[4]", "X[7]" and "X[5]" indicate that the values of those nodes are the attribute names of the output data sequences of the 16th, 4th, 7th and 5th fully connected layer nodes, respectively. The box labeled "tin-through" indicates that the SPI detection result of the node is tin-through, and the box labeled "non-defective" indicates that the SPI detection result of the node is non-defective. In fig. 4, the node at the start of an arrow line is a parent node and the node at its end is a child node.
And classifying the test set by using the trained classification tree to obtain the classification accuracy of the classification tree.
And 6, obtaining an SMT production tracing sequence.
And for each trained classification tree, taking a root node of the classification tree as a starting node of each traversal, sequentially taking all leaf nodes of the classification tree as destination nodes of each traversal, and taking all attribute names passed by each traversal as a tracing sequence of the classification tree.
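Collecting one tracing sequence per leaf is a plain root-to-leaf traversal. A small sketch over a hypothetical nested-dict tree — the node layout is an illustrative assumption, not the patent's data structure:

```python
def trace_sequences(node, path=()):
    # Walk from the root; leaves carry the SPI detection result (a string),
    # inner nodes carry an attribute name and a list of children.
    if isinstance(node, str):
        return [list(path)]
    sequences = []
    for child in node["children"]:
        sequences.extend(trace_sequences(child, path + (node["attr"],)))
    return sequences

tree = {"attr": "X[2]", "children": [
    "non-defective",
    {"attr": "X[16]", "children": ["tin-through", "non-defective"]},
]}
print(trace_sequences(tree))
# -> [['X[2]'], ['X[2]', 'X[16]'], ['X[2]', 'X[16]']]
```

Each returned list is the ordered set of attribute names passed on the way from the root to one leaf, i.e. one tracing sequence of the tree.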
And taking the classification accuracy of each classification tree as the credibility of all the tracing sequences of the classification tree.
Searching a self-encoder corresponding to each tracing sequence, then searching a node corresponding to each attribute name in the tracing sequence corresponding to the self-encoder in a full connection layer of the self-encoder, and forming a network weight vector of the attribute name by using network weight values from all nodes of an input layer of the self-encoder to the corresponding node in the full connection layer, wherein the total number of elements of the network weight vector corresponding to each attribute name is the same as the number of nodes of the input layer of the corresponding self-encoder.
Arrange the network weight vectors of each attribute name of each tracing sequence by rows to form a C×D-dimensional risk matrix of the tracing sequence, where C represents the total number of attribute names in the tracing sequence and D represents the total number of attributes in the SPI defect tracing data set.
And (3) forming a tracing vector of each tracing sequence by all the data obtained by summing the risk matrix of each tracing sequence according to columns, wherein each data in the tracing vector represents the importance of a corresponding attribute in the SPI defect tracing data set.
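The risk-matrix step reduces to a column sum followed by a descending sort — a minimal NumPy sketch with illustrative numbers (C = 2 attribute names, D = 3 data-set attributes):

```python
import numpy as np

risk = np.array([[1.0, 3.0, 0.5],    # weight vector of the 1st attribute name
                 [2.0, 1.0, 1.5]])   # weight vector of the 2nd attribute name
trace_vec = risk.sum(axis=0)         # importance of each SPI data-set attribute
order = np.argsort(-trace_vec)       # step 6 ordering, largest first
print(trace_vec.tolist())            # -> [3.0, 4.0, 2.0]
print(order.tolist())                # -> [1, 0, 2]
```

The sorted attribute indices, read largest to smallest, give the order in which the corresponding SPI attributes appear in the production tracing sequence.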
Sort all data of the tracing vector of each tracing sequence from large to small to form the SMT production tracing sequence.
In the embodiment of the invention, the finally obtained SMT production trace sequence is shown in Table 5.
Table 5: SMT production tracing sequence list
No.   SMT production tracing sequence
1   Blade separation distance > board width > board thickness > solder bridging
2   Blade separation distance > board width > automatic cleaning count > solder bridging
3   Blade separation speed > automatic cleaning count > solder bridging
4   Blade separation speed > blade separation speed > solder bridging
5   Board length > automatic cleaning > worktable separation speed > solder bridging
6   Blade separation speed > worktable separation speed > squeegee pressure > solder bridging
7   Worktable separation speed > solder bridging

Claims (4)

1. An SMT production tracing method based on a self-encoder and ensemble learning is characterized in that the self-encoder is constructed, a classification tree set is obtained by using the ensemble learning method, and an SMT production tracing sequence is obtained, wherein the method specifically comprises the following steps:
(1) constructing an auto encoder:
(1a) building 18 autoencoders with the same structure and different parameters, wherein each autoencoder has three layers with the structure: input layer → fully connected layer → output layer;
(1b) setting the number of nodes of the input layer and the output layer to 76;
(1c) the number of fully connected layer nodes of each autoencoder is set according to a formula reproduced only as an image in the original, wherein n_i represents the number of fully connected layer nodes of the i-th self-encoder, i ∈ {1,2,…,18}, ⌊·⌋ represents a rounding-down operation, and % represents a remainder operation;
(1d) calculating an activation value of each node of the fully connected layers in the 1st to 9th self-encoders according to the following formula:

T_mj = 1 / (1 + e^(−x_mj))

wherein T_mj represents the activation value of the j-th node in the fully connected layer of the m-th self-encoder, m ∈ {1,2,…,9}, j ∈ {1,2,…,N_m}, N_m represents the total number of fully connected layer nodes of the m-th self-encoder, e^(·) denotes an exponential operation with the natural constant e as base, x_mj represents the input value of the j-th node in the fully connected layer of the m-th self-encoder, x_mj = W_mj^T X_m, W_mj represents the network weight matrix between the input layer of the m-th self-encoder and the j-th node in the fully connected layer, the initialized value of each element of the matrix obeying a standard normal distribution, T represents a transposition operation, and X_m represents the vector consisting of the input values of the 76 nodes in the input layer of the m-th self-encoder;
(1e) calculating an activation value of each node of the fully-connected layers in the 10 th to 18 th self-encoders according to the following formula:
Rln=max(0,xln)
wherein R_ln represents the activation value of the n-th node in the fully connected layer of the l-th self-encoder, l ∈ {10,…,18}, n ∈ {1,2,…,N_l}, N_l represents the total number of fully connected layer nodes of the l-th self-encoder, max(·) denotes a maximum operation, x_ln represents the input value of the n-th node in the fully connected layer of the l-th self-encoder, x_ln = W_ln^T X_l, W_ln represents the network weight matrix between the input layer of the l-th self-encoder and the n-th node of the fully connected layer, the initialized value of each element of the matrix obeying a standard normal distribution, and X_l represents the vector consisting of the input values of the 76 nodes in the input layer of the l-th self-encoder;
(1f) calculating a loss error value between the output values of each self-encoder's output layer and the input values of its input layer according to the following formula:

L_i = (1/N_i) · Σ_{k=1}^{N_i} (y_ik − ŷ_ik)^2

wherein L_i represents the loss error value between the output values of the i-th self-encoder's output layer and the input values of its input layer, i ∈ {1,2,…,18}, N_i represents the number of input layer nodes (equal to the number of output layer nodes) of the i-th self-encoder, Σ represents the summation operation, y_ik represents the input value of the k-th node of the i-th self-encoder's input layer, ŷ_ik represents the output value of the k-th node of the i-th self-encoder's output layer, and k ∈ {1,2,…,N_i};
(2) acquiring an SPI defect tracing data set:
randomly extracting at least 5,320,000 pieces of SPI tracing data from a database of a manufacturing execution system (MES) to form an M×N-dimensional SPI defect tracing data set, wherein M is at least 70,000 and N is at least 76; each row of data represents one piece of SPI defect tracing data containing production information, each column represents the sequence formed by all values of one attribute in the SPI tracing data set, and at least 20,000 rows of the SPI defect tracing data set are defective detection data;
(3) normalizing the data of each attribute in the SPI defect tracing data set according to the following formula to obtain a normalized SPI defect tracing data set:

x′_qp = (x_qp − min(x_q)) / (max(x_q) − min(x_q))

wherein x′_qp represents the normalized value of the p-th data of the q-th attribute in the SPI defect tracing data set, x_qp represents the p-th data of the q-th attribute of the SPI defect tracing data set, min(·) represents the minimum value operation, x_q represents all data of the q-th attribute of the SPI defect tracing data set, and max(·) represents the maximum value operation;
(4) training the self-encoder:
respectively inputting the normalized SPI defect tracing data set into an input layer of each self-encoder of 18 self-encoders, and respectively training each self-encoder by using a random gradient descent method to obtain 18 trained self-encoders in total;
(5) obtaining a set of classification trees using ensemble learning:
(5a) inputting all data of the normalized M×N-dimensional SPI defect tracing data set row by row into the fully connected layer of each trained self-encoder, and forming the output data of all nodes of the fully connected layer into an M×N′-dimensional classification data set, wherein the value of N′ is equal to the number of fully connected layer nodes;
(5b) selecting A rows of data from the classification data set to form a training set, wherein A is given by a formula reproduced only as an image in the original (a rounding-down operation ⌊·⌋ applied to a fraction of M, the number of rows of the classification data set), and forming a test set from the remaining data in the classification data set;
(5c) training the training set by using a classification regression tree CART training method to obtain a trained classification tree;
(5d) classifying the test set by using the trained classification tree to obtain the classification accuracy of the classification tree;
(6) obtaining an SMT production tracing sequence:
(6a) for each trained classification tree, taking a root node of the classification tree as a starting node of each traversal, sequentially taking all leaf nodes of the classification tree as target nodes of each traversal, and taking all attribute names passed by each traversal as a tracing sequence of the classification tree;
(6b) taking the classification accuracy of each classification tree as the credibility of all tracing sequences of the classification tree;
(6c) searching a self-encoder corresponding to each tracing sequence, then searching a node corresponding to each attribute name in the tracing sequence corresponding to the self-encoder in a full connection layer of the self-encoder, and forming a network weight vector of the attribute name by using network weight values from all nodes of an input layer of the self-encoder to the corresponding node in the full connection layer, wherein the total number of elements of the network weight vector corresponding to each attribute name is the same as the number of nodes of the input layer of the corresponding self-encoder;
(6d) arranging the network weight vectors of each attribute name of each tracing sequence according to rows to form a C x D-dimensional risk matrix of the tracing sequence, wherein C represents the total number of the attribute names in the tracing sequence, and D represents the total number of the attributes in the SPI defect tracing data set;
(6e) all the data obtained by summing the risk matrix of each tracing sequence according to columns form a tracing vector of the tracing sequence, wherein each data in the tracing vector represents the importance of a corresponding attribute in the SPI defect tracing data set;
(6f) and sequencing all data of the tracing vectors of each tracing sequence from large to small to form the SMT production tracing sequence.
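Two of the closed-form steps in claim 1 can be sketched directly: a sigmoid activation consistent with step (1d)'s exponential form (an assumption, since the formula itself is reproduced only as an image) and the min–max normalization of step (3):

```python
import math

def sigmoid(x):
    # T = 1 / (1 + e^{-x}); assumed form of the step (1d) activation
    return 1.0 / (1.0 + math.exp(-x))

def min_max_normalize(column):
    # x'_qp = (x_qp - min(x_q)) / (max(x_q) - min(x_q)), per step (3)
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]

print(sigmoid(0.0))                        # -> 0.5
print(min_max_normalize([2.0, 4.0, 6.0]))  # -> [0.0, 0.5, 1.0]
```

Normalization is applied per attribute column, so every attribute of the SPI defect tracing data set lands in [0, 1] before training.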
2. An SMT production tracing method based on self-encoder and ensemble learning according to claim 1, wherein the step of the stochastic gradient descent method in step (4) is as follows:
step one, randomly selecting an unselected data from the normalized SPI defect tracing data set;
and secondly, after the selected data is input into the input layer of the self-encoder, calculating a loss error value between the output data of the self-encoder's output layer and the selected data according to the following formula:

L = (1/N) · Σ_{k=1}^{N} (y_k − ŷ_k)^2

wherein L represents the loss error value between the output values of the output layer after the selected data is input into the self-encoder and the selected data, N represents the total number of input layer nodes of the self-encoder, the total number of output layer nodes being equal to the total number of input layer nodes with input and output nodes in one-to-one correspondence by node order, Σ represents the summation operation, y_k represents the input value of the k-th node in the input layer of the self-encoder, ŷ_k represents the output value of the k-th node in the output layer of the self-encoder, k ∈ {1,2,…,N}, and ∈ denotes set membership;
thirdly, updating each parameter in the self-encoder network according to the following formula:

ω′_t = ω_t − l · (∂L/∂θ_t)

wherein ω′_t represents the t-th parameter of the self-encoder after updating, t ∈ {1,2,…,2×N×(num+1)}, num represents the total number of fully connected layer nodes of the self-encoder, ω_t represents the t-th parameter of the self-encoder before updating, l represents the learning rate with value range [0,1], ∂/∂θ_t denotes the derivation operation, and θ_t represents the t-th parameter of the self-encoder before the parameter update;
and fourthly, inputting the data selected in the first step into the input layer of the self-encoder after the parameters are updated, and calculating a loss error value between the output data of the self-encoder after the parameter update and the selected data according to the following formula:

L′ = (1/N) · Σ_{k=1}^{N} (y_k − ŷ′_k)^2

wherein L′ represents the loss error value between the output data of the encoder output layer after the parameter update and the selected data, and ŷ′_k represents the output value of the k-th node in the output layer of the encoder after the parameters are updated;
fifthly, judging whether the loss error value between the updated output of the self-encoder's output layer and the selected data is smaller than the current loss-error threshold; if so, a trained self-encoder is obtained, otherwise returning to the first step; the threshold is a value selected from the range [0,300] according to the required training precision of the self-encoder network: the larger the selected value, the lower the training precision of the network, and the smaller the selected value, the higher the training precision.
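The parameter update of the third step, assuming the standard stochastic-gradient rule, is a one-line operation over all network parameters (the parameter and gradient values below are illustrative):

```python
def sgd_step(params, grads, lr=0.5):
    # omega'_t = omega_t - l * dL/domega_t for every network parameter t
    return [w - lr * g for w, g in zip(params, grads)]

print(sgd_step([1.0, -2.0], [0.5, -0.5]))   # -> [0.75, -1.75]
```

In the full method this step is repeated per selected sample until the recomputed loss of the fourth step falls below the fifth step's threshold.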
3. An SMT production tracing method based on self-encoder and ensemble learning according to claim 1, wherein the production information in step (2) includes 5 aspects of raw materials, process, printing process, environment, and test results.
4. An SMT production traceability method based on self-encoder and ensemble learning as claimed in claim 1, wherein the step of CART training in step (5c) is as follows:
step one, taking the sequence number of each column in the training set as an attribute of the training set, and forming a value sequence of the attribute corresponding to the training set by all elements of each column in the training set;
deleting repeated numerical values in the value sequence of each attribute to obtain a numerical value set of each attribute;
thirdly, counting the frequency of each value in the value set of each attribute, wherein the frequency of each value appearing in the value sequence of the attribute corresponding to the training set is used as the frequency of the value;
fourthly, calculating the value sequence and the value set of each attribute according to the following formula to obtain the Gini index value of each attribute:

g_b = 1 − Σ_{s=1}^{N_b} (n_bs / n_b)^2

wherein g_b represents the Gini index value of the b-th attribute, N_b represents the total number of values in the value set of the b-th attribute, Σ represents the summation operation, s represents the sequence number of a value in the value set, s ∈ [1, N_b], n_bs represents the frequency of the s-th value in the value set of the b-th attribute, and n_b represents the total number of values in the value sequence of the b-th attribute;
the fifth step: taking the attribute with the maximum Gini index value as the optimal attribute;
and a sixth step: adding the attribute name of the optimal attribute into the base classifier;
the seventh step: arranging all values of the value sequence of the optimal attribute from small to large as an optimal attribute sequence;
eighth step: sequentially taking the average value of each pair of adjacent numerical values in the optimal attribute sequence from left to right as a segmentation point of the optimal attribute sequence, forming all the numerical values smaller than the segmentation point in the sequence into a left sequence of the segmentation point, and forming all the numerical values larger than the segmentation point in the sequence into a right sequence of the segmentation point;
the ninth step: the Gini index value of each segmentation point is calculated according to a formula reproduced only as an image in the original, wherein g represents the Gini index value of the segmentation point, c represents the number of values in the left sequence of the segmentation point, and d represents the number of values in the right sequence of the segmentation point;
the tenth step: selecting the value of the segmentation point with the maximum Gini index value as the segmentation threshold value of the optimal attribute;
the eleventh step: taking the row elements of each row in the training set as a piece of classification data;
the twelfth step: forming a left sub-training set by the classified data of which the values of all the optimal attributes are less than or equal to the segmentation threshold, and forming a right sub-training set by the classified data of which the values of all the optimal attributes are greater than the segmentation threshold;
the thirteenth step: and (5) respectively training the left sub-training set and the right sub-training set by using the CART training method which is the same as that in the step (5c) until the SPI detection results of all data in the left sub-training set and the right sub-training set are the same, and forming a classification tree by using all attribute names of the base classifier.
CN201910688024.3A 2019-07-29 2019-07-29 SMT production tracing method based on self-encoder and ensemble learning Active CN110533071B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910688024.3A CN110533071B (en) 2019-07-29 2019-07-29 SMT production tracing method based on self-encoder and ensemble learning

Publications (2)

Publication Number Publication Date
CN110533071A CN110533071A (en) 2019-12-03
CN110533071B true CN110533071B (en) 2022-03-22

Family

ID=68660567

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910688024.3A Active CN110533071B (en) 2019-07-29 2019-07-29 SMT production tracing method based on self-encoder and ensemble learning

Country Status (1)

Country Link
CN (1) CN110533071B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20140057753A (en) * 2012-11-04 2014-05-14 박호열 Production management system of surface mounted tfchnology
CN109597968A (en) * 2018-12-29 2019-04-09 西安电子科技大学 Paste solder printing Performance Influence Factor analysis method based on SMT big data
CN109657718A (en) * 2018-12-19 2019-04-19 广东省智能机器人研究院 SPI defect classification intelligent identification Method on a kind of SMT production line of data-driven
CN110021341A (en) * 2019-02-21 2019-07-16 华东师范大学 A kind of prediction technique of GPCR drug based on heterogeneous network and targeting access

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7366728B2 (en) * 2004-04-27 2008-04-29 International Business Machines Corporation System for compressing a search tree structure used in rule classification

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Design and Implementation of a Quality Traceability System for Electronic Component Products; Qu Zhenglong; China Masters' Theses Full-text Database, Information Science and Technology; 2016-03-15 (No. 03); I138-2892 *


Similar Documents

Publication Publication Date Title
Jain et al. Modeling and analysis of FMS performance variables by ISM, SEM and GTMA approach
CN109597968B (en) SMT big data-based solder paste printing performance influence factor analysis method
CN113469241B (en) Product quality control method based on process network model and machine learning algorithm
CN111242363A (en) PCB order splicing and typesetting prediction method and system based on machine learning
CN107392424A (en) A kind of method for establishing quality fluctuation source ISM in manufacture course of products
Tsai Development of a soldering quality classifier system using a hybrid data mining approach
CN115526093A (en) Training method, equipment and storage medium for SMT printing parameter optimization model
CN108491991B (en) Constraint condition analysis system and method based on industrial big data product construction period
CN111105082A (en) Workpiece quality prediction model construction method and prediction method based on machine learning
CN114375107A (en) Method, device and equipment for reconstructing unstructured influence factors of solder paste printing of SMT (surface mount technology) production line
US10318931B2 (en) Method and system for determining maintenance policy of complex forming device
CN110533071B (en) SMT production tracing method based on self-encoder and ensemble learning
CN108846128B (en) Cross-domain text classification method based on adaptive noise reduction encoder
CN110175631A (en) A kind of multiple view clustering method based on common Learning Subspaces structure and cluster oriental matrix
CN103714251A (en) Method, device and system for matching semiconductor product with machining device
DE102023202593A1 (en) Method and system for recommending modules for an engineering project
CN116485032A (en) Aviation product processing quality prediction method considering multidimensional influence factors
CN112454493B (en) Circuit board identification system, method and storage medium
CN115098703A (en) Knowledge graph construction method based on SMT quality big data analysis
CN113573506B (en) PCB production process and PCB circuit board
CN111243013B (en) Visual printer deviation correcting pose prediction method based on integrated multi-target regression chain
Tsai et al. Measuring machine-group flexibility: a case study for surface mount assembly line with different configurations
CN111539569A (en) Supply chain tracing system-based paddy product production optimization method and device
CN111090624A (en) MES and CR plate-based customized furniture plate classification combination algorithm
Feofanov et al. Database Organization for More Effective Information Processing in Automatic Optical Inspection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant