CN117557827A - Plate shape anomaly detection method based on self-coding cascade forests - Google Patents

Info

Publication number
CN117557827A
Authority
CN
China
Prior art keywords
data
steel plate
model
training
forest
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311305000.8A
Other languages
Chinese (zh)
Inventor
刘强
赵丰年
丁进良
柴天佑
Original Assignee
东北大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 东北大学
Priority to CN202311305000.8A
Publication of CN117557827A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/765 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30108 Industrial image inspection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a plate shape anomaly detection method based on a self-coding cascade forest. Because the extracted features still lie in a high-dimensional feature space and the task is a multi-class classification problem, the features are processed by a multi-grained cascade forest (gcForest) model, which meets the need for multi-class classification of rolled steel plate shape and ultimately realizes anomaly detection of the steel plate shape. Compared with general deep network models (such as deep neural network models and deep random forest models), the method needs fewer hyperparameters and less data, trains faster and achieves higher accuracy, and it gives on-site workers an objective basis for identifying plate shape faults of the steel plate.

Description

Plate shape anomaly detection method based on self-coding cascade forests
Technical Field
The invention relates to the technical field of product quality anomaly identification, in particular to a steel plate shape anomaly identification method based on a self-coding cascade forest.
Background
The steel industry, as a national basic industry, is accelerating its digitization and intelligentization. A competitive advantage for iron and steel enterprises lies in products with good mechanical properties and precise shape and dimensions. However, steel plate production involves complicated procedures, and plate shape quality is easily affected by many factors, so the plate shape needs to be inspected in time and the process adjusted quickly if an abnormality occurs.
At present, most manufacturers rely on manual work to inspect steel plate quality, calibrate the plate shape or judge the fault type, and then carry out subsequent processing and analysis based on the manually obtained result. This approach is time-consuming, labor-intensive and costly, and because decisions depend on subjective human judgment, the results lack an objective, unified evaluation standard. A few manufacturers use non-contact, machine-vision-based methods to inspect the steel plate surface, but these are easily affected by the environment: illumination, object shape, surface color, the camera and changes in spatial relations all influence the captured image. The collected grayscale and color images also carry a very large amount of information, which requires huge storage space and is difficult to process quickly. Detecting abnormal steel plates rapidly and accurately would therefore effectively reduce the burden on factory technicians, improve production efficiency, save costs and improve competitiveness.
To address these problems, the invention provides a method for detecting abnormal steel plate shape quality based on a self-coding cascade forest model.
Disclosure of Invention
To address the technical problems described in the background, a plate shape anomaly detection method based on a self-coding cascade forest is provided.
The invention adopts the following technical means:
A method for detecting abnormal plate shape quality of a steel plate based on a self-coding cascade forest model comprises the following steps:
Step 1: data collection; measure the thickness of the i-th steel plate in the shearing line process of thick plate production to obtain the thickness data set X_i of the i-th steel plate, whose dimension is (m, n), and the plate shape quality label Y_i of the i-th steel plate;
where N is the total number of steel plates collected, m is the total number of sampling points along the length of each steel plate, and n is the total number of sampling points across the width of each steel plate; the quality label Y_i ∈ {0, 1, 2}: Y_i = 0 indicates that the i-th steel plate has a fault other than middle wave, Y_i = 1 indicates that the i-th steel plate has no abnormality, and Y_i = 2 indicates that the i-th steel plate has a middle-wave fault;
Step 2: preprocess the data acquired in step 1 to obtain a data set X consisting of N steel plates and a set Y consisting of the quality label of each steel plate;
Step 3: build an eForest network model, train and save it, and extract features from the data processed in step 2;
Step 4: build a gcForest network model, train and save it, and classify the plate shape quality of the steel plate;
Step 5: sample the thickness of the steel plate to be inspected after the shearing line process of thick plate production to obtain its thickness data set X′, preprocess X′ as in step 2, and input the preprocessed data into the saved eForest and gcForest models to obtain the plate shape quality label of the steel plate to be inspected.
Compared with the prior art, the invention has the following advantages:
By collecting thickness data and quality labels of steel plates and preprocessing them into a data set that represents the relative thickness of each plate, the invention takes this data set as input and the plate shape quality labels as output, and uses training samples to build and train a plate shape quality anomaly identification model based on a self-coding cascade forest. The model detects plate shape quality quickly and effectively, improving the objectivity, accuracy and timeliness of plate shape quality anomaly detection. The invention mines data features with a self-coding cascade forest composed of multiple decision trees; compared with a common neural network model it needs fewer hyperparameters and less data, trains faster and is more accurate, giving on-site workers a scientific, objective basis for judging whether the plate shape quality is abnormal and for quickly adjusting subsequent processing steps.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to the drawings without inventive effort to a person skilled in the art.
FIG. 1 is a flow chart of a steel plate shape quality anomaly identification method based on a self-coding cascade forest model;
fig. 2 is a schematic diagram of a model structure of a method for identifying abnormal plate shape quality of a steel plate based on a self-coding cascade forest model according to an embodiment of the present invention;
fig. 3 is an anomaly identification and classification effect diagram of a steel plate shape quality anomaly identification method based on a self-coding cascade forest model according to an embodiment of the invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The method for detecting the plate shape quality abnormality of the steel plate based on the self-coding cascade forest model is characterized by comprising the following steps of:
Step 1: data collection; measure the thickness of the i-th steel plate in the shearing line process of thick plate production to obtain the thickness data set X_i of the i-th steel plate, whose dimension is (m, n), and the plate shape quality label Y_i of the i-th steel plate;
where N is the total number of steel plates collected, m is the total number of sampling points along the length of each steel plate, and n is the total number of sampling points across the width of each steel plate; the quality label Y_i ∈ {0, 1, 2}: Y_i = 0 indicates that the i-th steel plate has a fault other than middle wave, Y_i = 1 indicates that the i-th steel plate has no abnormality, and Y_i = 2 indicates that the i-th steel plate has a middle-wave fault;
step 2: preprocessing the data acquired in the step 1 to obtain a data set X consisting of N steel plates and a set Y consisting of quality labels corresponding to each steel plate;
Step 2.1: normalize each acquired thickness data set X_i;
Step 2.2: assemble the normalized data into a steel plate shape quality sample set X_Data = {X_1, X_2, …, X_i, …, X_N} with dimension (N, 1, m, n), and construct the corresponding plate shape quality label data set Y_Label = {Y_1, Y_2, …, Y_i, …, Y_N}, Y_i ∈ {0, 1, 2}; N is the total number of steel plates collected, m is the total number of sampling points along the length of each steel plate, and n is the total number of sampling points across the width of each steel plate;
Step 2.3: flatten the original four-dimensional data set into a two-dimensional data set X_Data with dimension (N, m × n);
Step 2.4: divide the data set; split the two-dimensional data set X_Data and the plate shape quality label data set Y_Label in the ratio a : b into a training set X_train, Y_train and a test set X_test, Y_test; the training set is used to build the model, and the test set is used to check the model, evaluate its accuracy and test its generalization ability.
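Steps 2.1 to 2.4 can be sketched as follows. This is a minimal illustration, not the patented implementation: the text does not say which normalization is used, so per-plate min-max scaling is an assumption, as are the shuffled split, the function names and the toy sizes.

```python
import random

def min_max_normalize(plate):
    """Scale one (m, n) thickness grid to [0, 1]; min-max scaling is an assumption."""
    values = [v for row in plate for v in row]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # guard against a perfectly flat plate
    return [[(v - lo) / span for v in row] for row in plate]

def flatten(plate):
    """Unroll an (m, n) grid into a vector of length m * n (step 2.3)."""
    return [v for row in plate for v in row]

def split(X, Y, a, b, seed=0):
    """Shuffle and split into train/test parts in the ratio a : b (step 2.4)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = round(len(X) * a / (a + b))
    train, test = idx[:cut], idx[cut:]
    return ([X[i] for i in train], [Y[i] for i in train],
            [X[i] for i in test], [Y[i] for i in test])

# Toy data: N = 10 plates, m = 4 length samples, n = 3 width samples.
rng = random.Random(1)
plates = [[[20.0 + rng.random() for _ in range(3)] for _ in range(4)]
          for _ in range(10)]
labels = [i % 3 for i in range(10)]

X_data = [flatten(min_max_normalize(p)) for p in plates]  # shape (N, m * n)
X_train, Y_train, X_test, Y_test = split(X_data, labels, a=6, b=4)
```

With a = 6 and b = 4 the ten toy plates are divided into six training and four test samples; a stratified split would be a reasonable alternative when classes are unbalanced.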
Step 3: build an eForest network model, train and save it, and extract features from the data processed in step 2.
Step 3.1: forward encoding; the encoding process takes the input data set X_i and sends it to the root nodes of the K decision trees; once the data has traversed to a leaf node in every tree, the process returns a K-dimensional vector whose i-th element is the integer index of the leaf node reached in tree i, where i ∈ {1, 2, …, K};
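Forward encoding can be illustrated with two hand-built trees. The nested-dictionary tree format, the thresholds and the function names below are hypothetical and serve only to show how a sample becomes a vector of leaf indices.

```python
def leaf_index(tree, x):
    """Walk one tree from its root and return the integer index of the leaf reached."""
    node = tree
    while "leaf" not in node:
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["leaf"]

def forward_encode(forest, x):
    """Encode sample x as a K-dimensional vector of leaf indices, one per tree."""
    return [leaf_index(tree, x) for tree in forest]

# Two illustrative trees (K = 2) over a two-feature sample.
tree_1 = {"feature": 0, "threshold": 1.0,
          "left": {"leaf": 0},
          "right": {"feature": 1, "threshold": 1.5,
                    "left": {"leaf": 1}, "right": {"leaf": 2}}}
tree_2 = {"feature": 1, "threshold": 2.0,
          "left": {"leaf": 0}, "right": {"leaf": 1}}

forest = [tree_1, tree_2]
code = forward_encode(forest, x=[1.6, 1.75])  # one leaf index per tree
```

In a library such as scikit-learn, a fitted forest's `apply` method plays a comparable role by returning the leaf index reached in each tree.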
Step 3.2: compute the Maximal-Compatible Rule (MCR); each leaf node of a tree corresponds to a path from the root node, and each path defines a range of possible values for the data; the MCR is computed from these ranges;
The MCR is the range of possible data values that cannot be expanded any further; expanding it would make it incompatible with at least one path. For example, if the paths of data Z through the decision trees are rule1: (2.7 > z_1 > 0) ∩ (z_2 > 1.5) and rule2: (3.0 > z_1 > 0.5) ∩ (z_2 < 2.0), the MCR is (2.7 > z_1 > 0.5) ∩ (2.0 > z_2 > 1.5);
Step 3.3: decode and reconstruct the data; recover the original data from the MCR obtained from the decision tree paths; for categorical attributes, the reconstructed value must be one allowed by the MCR;
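The MCR example above amounts to intersecting the per-feature intervals of the two rules. The sketch below reproduces it; the dictionary representation and function names are assumptions, and the midpoint decoding follows the interval averaging described in the embodiment and applies only to bounded numerical intervals.

```python
import math

def maximal_compatible_rule(rules):
    """Intersect the per-feature open intervals (low, high) of several tree paths."""
    mcr = {}
    for rule in rules:
        for feature, (low, high) in rule.items():
            cur_low, cur_high = mcr.get(feature, (-math.inf, math.inf))
            mcr[feature] = (max(cur_low, low), min(cur_high, high))
    return mcr

def decode_midpoint(mcr):
    """Reconstruct each feature as the midpoint of its bounded MCR interval."""
    return {f: (low + high) / 2 for f, (low, high) in mcr.items()}

# rule1: (2.7 > z1 > 0) and (z2 > 1.5); rule2: (3.0 > z1 > 0.5) and (z2 < 2.0)
rule1 = {"z1": (0.0, 2.7), "z2": (1.5, math.inf)}
rule2 = {"z1": (0.5, 3.0), "z2": (-math.inf, 2.0)}

mcr = maximal_compatible_rule([rule1, rule2])  # z1 in (0.5, 2.7), z2 in (1.5, 2.0)
z = decode_midpoint(mcr)                       # z1 ≈ 1.6, z2 = 1.75
```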
Step 3.4: determine the number K of decision trees, a parameter of the eForest model; K determines the size of the output data; after the eForest model is built, the data sets X_Data and Y_Label are input, the result is evaluated with the mean squared error (MSE), and the optimal K is selected;
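One plausible reading of step 3.4 is that K is chosen to minimize the MSE between the input data and its reconstruction; the text does not state what the MSE is computed on, so this is an assumption. In the sketch below, `toy_reconstruct` is a hypothetical stand-in for encoding with K trees and decoding again.

```python
def mse(original, reconstructed):
    """Mean squared error between two equally sized lists of samples."""
    total, count = 0.0, 0
    for x, x_hat in zip(original, reconstructed):
        for a, b in zip(x, x_hat):
            total += (a - b) ** 2
            count += 1
    return total / count

def select_k(candidates, X, reconstruct):
    """Return the candidate K whose reconstruction of X has the lowest MSE."""
    return min(candidates, key=lambda k: mse(X, reconstruct(X, k)))

def toy_reconstruct(X, k):
    """Hypothetical stand-in: larger k keeps more digits, so the error shrinks."""
    return [[round(v, k) for v in x] for x in X]

X = [[0.123, 0.456], [0.789, 0.012]]
best_k = select_k([0, 1, 2, 3], X, toy_reconstruct)
```

In practice the candidate list and the reconstruction routine would come from the trained eForest models rather than from a rounding function.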
Step 3.5: train the eForest model and save it;
Step 3.6: extract data features; input the data set X_Data into the trained eForest model and obtain, through forward encoding, the data set X_enc with dimension (N, K).
Step 4: building a gcForest network model on the basis of the step 3, training and storing the gcForest network model, and classifying the plate shape quality of the steel plate.
Step 4.1: determine whether the data obtained from step 3 is high-dimensional and whether constraint relations exist between its dimensions; if either holds, a multi-grained scanning structure must be placed before the cascade forest structure, and the data passes through multi-grained scanning before entering the cascade forest for plate shape quality detection; otherwise, go directly to step 4.3;
Step 4.2: multi-grained scanning structure; the original input may be multi-dimensional data, e.g. image data with dimension (n_s, n_c, n_h, n_w), where n_s is the number of samples, n_c the number of channels, n_h the image height and n_w the image width, or sequence data with dimension (n_s, n_f, s_l, 1), where n_s is the number of samples, n_f the number of data features and s_l the sequence length; the original data may also be one-dimensional or two-dimensional; in any case, if the data is high-dimensional or constraint relations exist between its dimensions, multi-grained scanning is required;
Taking the steel plate shape quality data set as an example, the original X_Data has dimension (N, 1, m, n), similar to image data, where N is the number of steel plates collected, m is the total number of sampling points along the length of each plate and n is the total number of sampling points across the width of each plate; with a scanning window of size L, sliding-window scanning yields S = N × (m - L + 1) × (n - L + 1) feature matrices of size L × L;
Select s_1 different types of random forest with l forests of each type, i.e. n_1 = s_1 × l forests in total; the scanned data is input into each forest for training; each forest outputs a probability vector of length c for each feature matrix (c is the number of classes, which is 3 for plate shape quality classification), so each forest generates S × c probabilities; finally, the results of all forests are concatenated into a probability vector of size S × c × s_1, which is output as the sample;
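The window count S can be checked with a small sliding-window sketch. A stride of 1 is assumed because it is what the formula S = N × (m - L + 1) × (n - L + 1) implies; the function names and toy sizes are illustrative.

```python
def sliding_windows(plate, L):
    """Return every L x L window of an (m, n) grid, scanning with stride 1."""
    m, n = len(plate), len(plate[0])
    return [[row[j:j + L] for row in plate[i:i + L]]
            for i in range(m - L + 1)
            for j in range(n - L + 1)]

def scan_dataset(plates, L):
    """Apply the sliding window to all N plates and pool the windows."""
    return [w for plate in plates for w in sliding_windows(plate, L)]

# Toy sizes: N = 2 plates, m = 5, n = 4, window L = 3.
N, m, n, L = 2, 5, 4, 3
plates = [[[float(i * n + j) for j in range(n)] for i in range(m)]
          for _ in range(N)]

windows = scan_dataset(plates, L)
S = N * (m - L + 1) * (n - L + 1)  # number of windows predicted by the formula
```

For the plate sizes of the embodiment (m = 500, n = 50) the same formula shows how quickly S grows, which is why scanning is skipped when the eForest output is already compact.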
Step 4.3: if the original input data is not high-dimensional and no constraint relations exist between its dimensions (high-dimensional data can be converted directly into low-dimensional data), input it directly into the cascade forest structure;
Step 4.4: design the cascade forest structure, drawing on the ideas of ensemble learning and deep learning; step 4.4 further comprises the following steps:
Step 4.4.1: design the levels of the cascade forest; each level consists of several different ensemble learning classifiers (e.g. completely random forest, random forest, XGBoost); the n-th level of the cascade forest is A_n = {f_1, f_2, …, f_t}, where f_t denotes the t-th classifier, and each classifier outputs the probabilities of the c classes;
Step 4.4.2: following the layer-by-layer structure of deep learning, the output of each layer is used as part of the input of the next layer;
The data, either after multi-grained scanning or input directly, enters the cascade forest and is transformed by the first-level forests into a vector of dimension E_1 = n × l × c + e, which serves as the input of the next level, and so on up to the last level A_n; the last level outputs the probabilities of the c classes, the results of all classifiers are averaged, and the class with the maximum probability is selected as the label;
where n is the number of classifier types in the first level, l is the number of classifiers of each type, c is the number of classes, and e is the dimension of the data from the preceding stage.
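The dimension bookkeeping E_1 = n × l × c + e and the final averaging step can be sketched as below. The probabilities are made-up numbers and the function names are hypothetical; only the concatenate, average and arg-max logic is taken from the description.

```python
def augment(features, class_vectors):
    """Concatenate the input features with every classifier's class-probability vector."""
    out = list(features)
    for vec in class_vectors:
        out.extend(vec)
    return out

def final_prediction(class_vectors):
    """Average the last level's class vectors and return (label, probability)."""
    c = len(class_vectors[0])
    mean = [sum(vec[j] for vec in class_vectors) / len(class_vectors)
            for j in range(c)]
    label = max(range(c), key=lambda j: mean[j])
    return label, mean[label]

# n = 4 classifier types, l = 1 of each, c = 3 classes, e = 5 input features.
features = [0.1, 0.2, 0.3, 0.4, 0.5]
class_vectors = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1],
                 [0.5, 0.4, 0.1], [0.2, 0.7, 0.1]]

layer_input = augment(features, class_vectors)  # length n * l * c + e
label, prob = final_prediction(class_vectors)
```

Here the next level would receive 4 × 1 × 3 + 5 = 17 values, and the averaged class vector selects label 0.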
Because both machine learning and deep learning inevitably run the risk of overfitting and underfitting, each classifier is trained with k-fold cross-validation: each training sample is used k - 1 times as training data, producing k - 1 class vectors, whose average forms part of the input to the next level; after the cascade is extended by a new level, the performance of the whole cascade is evaluated on a validation set, and training ends if the result shows no change or no significant improvement, so the number of levels of the cascade structure is determined automatically by the training process.
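The fold bookkeeping and the automatic stopping rule can be sketched as follows. `evaluate_level` is a hypothetical callback standing in for "train one more level and measure validation accuracy of the whole cascade", and the tolerance `min_gain` and the toy accuracy curve are assumptions made for illustration.

```python
def k_fold_indices(n_samples, k):
    """Split sample indices into k folds; each sample is in the training part k - 1 times."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    return [([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
            for i in range(k)]

def grow_cascade(evaluate_level, min_gain=0.0, max_levels=20):
    """Add levels until validation accuracy stops improving by more than min_gain."""
    best_acc, n_levels = float("-inf"), 0
    for level in range(1, max_levels + 1):
        acc = evaluate_level(level)
        if acc - best_acc <= min_gain:
            break  # no significant gain: discard this level and stop
        best_acc, n_levels = acc, level
    return n_levels, best_acc

# Toy validation curve standing in for real training runs.
curve = {1: 0.80, 2: 0.88, 3: 0.90, 4: 0.90, 5: 0.91}
n_levels, best_acc = grow_cascade(lambda level: curve[level], min_gain=0.005)
splits = k_fold_indices(10, k=5)
```

With this toy curve the cascade stops after three levels, because the fourth level brings no improvement over the third.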
Step 4.5: train the model on the training set X_train and tune the hyperparameters during training to improve model accuracy;
Step 4.6: save the trained model with the highest accuracy achieved so far; feed the test set X_test to the model and compare the labels it predicts with the original manual labels to obtain the test accuracy, which tests the model's generalization ability and its ability to classify new data.
Step 5: sample the thickness of the steel plate to be inspected after the shearing line process of thick plate production to obtain its thickness data set X′, preprocess X′ as in step 2, and input the preprocessed data into the saved eForest and gcForest models to obtain the plate shape quality label of the steel plate to be inspected.
Embodiment one:
a steel plate shape anomaly identification method based on a self-coding cascade forest, as shown in figure 1, comprises the following steps:
Step 1: data acquisition; during production the thickness of each steel plate is measured at different positions, 25000 points in total, giving the thickness data set X_i of the i-th steel plate with dimension (500, 50) and its plate shape quality label Y_i ∈ {0, 1, 2};
In this embodiment, thickness data sets and quality labels are collected for 75 steel plates: 25 with no abnormality, 25 with faults other than middle wave, and 25 with middle-wave faults; the fault type Y_i ∈ {0, 1, 2}, where Y_i = 0 indicates that the i-th steel plate has a fault other than middle wave, Y_i = 1 indicates that the i-th steel plate has no abnormality, and Y_i = 2 indicates that the i-th steel plate has a middle-wave fault;
step 2: preprocessing the acquired data:
Step 2.1: normalize each acquired steel plate data set X_i;
Step 2.2: assemble the normalized data into a steel plate shape quality sample set X_Data = {X_1, X_2, …, X_i, …, X_75} with dimension (75, 1, 500, 50), and construct the corresponding plate shape quality label data set Y_Label = {Y_1, Y_2, …, Y_i, …, Y_75}, where Y_i is the plate shape quality label of the i-th steel plate, Y_i ∈ {0, 1, 2};
Step 2.3: flatten the data set, keeping the number of steel plates as the first dimension and merging the other three dimensions, so that the dimension of the data set becomes (75, 25000);
Step 2.4: split into training and test sets; divide the data sets X_Data and Y_Label in a ratio of 6:4 into a training set X_train, Y_train and a test set X_test, Y_test;
Step 3: build an eForest network model and extract features from the data;
Step 3.1: forward encoding; the data is input into the eForest model, and the forward encoding process over the K decision trees yields a K-dimensional vector representing K paths from root node to leaf node, giving a path set;
Step 3.2: compute the Maximal-Compatible Rule (MCR); the K paths obtained by forward encoding are combined to obtain the MCR;
Step 3.3: decode and reconstruct the data, recovering the original data from the MCR obtained from the decision tree paths; for example, if the MCR is (2.7 > z_1 > 0.5) ∩ (2.0 > z_2 > 1.5), taking the midpoint of each interval reconstructs the data as z_1 = 1.6, z_2 = 1.75;
Step 3.4: determine the number K of decision trees, a parameter of the eForest model; after the eForest model is built, the data set is input, the optimal K is determined using the MSE metric, and K = 2000 is selected;
Step 3.5: train the eForest model;
Step 3.6: extract data features; input the data set X_Data into the trained eForest model and obtain, through forward encoding, the dimension-reduced data set X_enc with dimension (75, 2000);
step 4: constructing a gcForest network model, and classifying the plate shape quality of the steel plate:
Step 4.1: because the data processed by the eForest model is not high-dimensional and has no constraint relations between its dimensions, it can enter the cascade forest structure directly;
Step 4.2: design the cascade forest architecture for this embodiment:
Step 4.2.1: design each level of the cascade forest; to improve the classification effect, four classification learners are selected: an XGBoost learner, a random forest learner, a completely random forest learner and a logistic regression learner; each type of learner comprises 10 decision trees, and the random forest and completely random forest learners each comprise 1000 decision trees;
Step 4.2.2: following the layer-by-layer structure of deep learning, the output of each layer is used as part of the input of the next layer;
After feature extraction by the eForest model, the data is input directly into the first level of the cascade forest and converted to dimension E_1 = 4 × 1 × 3 + E (4 is the number of classifier types, 1 is the number of classifiers of each type, 3 is the number of classes, and E is the data dimension of the preceding layer); it then serves as the input of the next level, and so on up to the last level A_n; the last level outputs the probabilities of the 3 classes, the results of all classifiers are averaged, and the class with the maximum probability is selected as the final label;
Step 4.5: train the model on the training set X_train and tune the hyperparameters during training to improve model accuracy;
To reduce the risk of overfitting and underfitting, five-fold cross-validation is adopted, so that each training sample is used 4 times as training data and produces 4 class vectors, whose average is taken as input to the next level of the cascade forest; each time a new level is added, the performance of the whole cascade is evaluated on a validation set, and if the evaluation shows no significant improvement, training ends automatically;
Step 4.6: save the trained model with the highest accuracy; feed the test set X_test to the model and compare the labels it predicts with the original manual label set Y_test to obtain the test accuracy, which tests the model's generalization ability and its ability to classify new data; the optimal structure obtained in this embodiment is shown in fig. 2;
Step 5: sample the thickness of the steel plate to be inspected after the shearing line process of thick plate production to obtain its thickness data set, preprocess the data as in step 2, input the preprocessed data into the saved eForest and gcForest models, and output the plate shape quality label of the steel plate to be inspected.
In this embodiment, taking the three plate shape classes as an example, a self-coding cascade forest built from decision trees performs feature extraction and classification on the training samples, and training yields an optimal model with an accuracy of 93.33%. The test set is then input into the optimal model for feature extraction and classification, and the classification result is shown in fig. 3, where on the ordinate "good" denotes fault-free steel plates, "bad" denotes steel plates with faults other than middle wave, and "wave" denotes steel plates with middle-wave faults; the test accuracy is 90%. With the method provided by the invention the model can be updated continuously at any time, and the classification result is obtained once the plate shape data of the steel plate to be inspected is input into the model. This gives on-site staff an objective basis for judging steel plate quality, allows the production process to be adjusted in time and improves the product qualification rate. It also assists technicians in fault analysis, helps avoid quality abnormalities across large batches of steel plate products, and improves production efficiency.
The serial numbers of the foregoing embodiments are for description only and do not indicate the relative merits of the embodiments. Each embodiment is described with its own emphasis; for any part not described in detail in a given embodiment, reference is made to the related descriptions of the other embodiments. It should be understood that the technical content disclosed in the embodiments provided in the present application may also be implemented in other manners.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (7)

1. A method for detecting steel plate shape quality anomalies based on a self-coding cascade forest model, characterized by comprising the following steps:
step 1: collecting data; measuring the thickness of the i-th steel plate in the shearing line process of the thick plate production process to obtain a thickness data set X_i of the i-th steel plate, the dimension of the thickness data set being (m, n), together with the plate shape quality label Y_i of the i-th steel plate;
wherein N represents the total number of collected steel plates, m represents the total number of sampling points along the length of each steel plate, and n represents the total number of sampling points along the width of each steel plate; the quality label of the steel plate Y_i ∈ {0, 1, 2}, where Y_i = 0 indicates that the i-th steel plate has a fault other than medium wave, Y_i = 1 indicates that the i-th steel plate has no anomaly, and Y_i = 2 indicates that the i-th steel plate has a medium wave fault;
step 2: preprocessing the data acquired in the step 1 to obtain a data set X consisting of N steel plates and a set Y consisting of quality labels corresponding to each steel plate;
step 3: building an eForest network model, training and storing the eForest network model, and extracting features from the data processed in step 2;
step 4: constructing a gcForest network model, training and storing the gcForest network model, and classifying the plate shape quality of the steel plate;
step 5: sampling the thickness of the steel plate to be detected after the shearing line process of the thick plate production process to obtain a thickness data set X' of the steel plate to be detected, preprocessing the data set X' through step 2 to obtain the processed thickness data set, and inputting the preprocessed data into the stored eForest and gcForest models to obtain the plate shape quality label of the steel plate to be detected.
2. The method for detecting abnormal plate shape quality of steel plate based on self-coding cascading forest model according to claim 1, wherein the step 2 further comprises the following steps:
step 2.1: normalizing the acquired thickness data set X;
step 2.2: constructing the normalized data into a steel plate shape quality sample set X_Data = {X_1, X_2, …, X_i, …, X_N}, the dimension of the sample set being (N, 1, m, n), and constructing the plate shape quality label data set corresponding to the steel plates, Y_Label = {Y_1, Y_2, …, Y_i, …, Y_N}, Y_i ∈ {0, 1, 2}; N represents the total number of collected steel plates, m represents the total number of sampling points along the length of each steel plate, and n represents the total number of sampling points along the width of each steel plate;
step 2.3: expanding the original four-dimensional data set into a two-dimensional data set X_Data, the dimension of the two-dimensional data set being (N, m × n);
step 2.4: dividing the data set; dividing the two-dimensional data set X_Data and the plate shape quality label data set Y_Label in the ratio a : b into training sets X_train, Y_train and test sets X_test, Y_test; the training set is used to construct the model, and the test set is used to check the constructed model, evaluate its accuracy, and test its generalization capability.
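The preprocessing of steps 2.1 to 2.4 can be illustrated with a short NumPy sketch. Several choices here are assumptions not stated in the claim: min-max scaling is used as the normalization, the split is a random shuffle, and the function name, the default ratio a : b = 8 : 2, and the demo data are hypothetical.

```python
import numpy as np

def preprocess(plates, labels, a=8, b=2, seed=0):
    """plates: array of shape (N, m, n); labels: array of shape (N,)."""
    X = np.asarray(plates, dtype=float)
    Y = np.asarray(labels)
    # Step 2.1: normalization (min-max over the whole data set is an assumption).
    x_min, span = X.min(), X.max() - X.min()
    X = (X - x_min) / span if span > 0 else np.zeros_like(X)
    # Step 2.2: sample set of shape (N, 1, m, n).
    N, m, n = X.shape
    X4 = X.reshape(N, 1, m, n)
    # Step 2.3: expand to a two-dimensional data set of shape (N, m * n).
    X2 = X4.reshape(N, m * n)
    # Step 2.4: split in the ratio a : b (random shuffling is an assumption).
    order = np.random.default_rng(seed).permutation(N)
    cut = int(round(N * a / (a + b)))
    train, test = order[:cut], order[cut:]
    return X2[train], Y[train], X2[test], Y[test]

# Hypothetical data: 10 plates, 4 points along the length, 3 along the width.
demo_plates = np.random.default_rng(1).random((10, 4, 3))
demo_labels = np.arange(10) % 3
X_train, Y_train, X_test, Y_test = preprocess(demo_plates, demo_labels)
print(X_train.shape, X_test.shape)
```

Keeping the intermediate (N, 1, m, n) shape mirrors the claim; in practice the reshape to (N, m * n) could be done in one step.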
3. The method for detecting abnormal plate shape quality of steel plate based on self-coding cascading forest model according to claim 1, wherein the step 3 further comprises the following steps:
step 3.1: forward encoding; the encoding process takes the input data set X_i and sends it to the root node of each of the K decision trees; once the data has traversed to a leaf node in every decision tree, the process returns a K-dimensional vector in which each element k_i is the integer index of the leaf node reached in tree i, where i ∈ {1, 2, …, K};
step 3.2: calculating the Maximal-Compatible Rule (MCR); first, the leaf node of each tree corresponds to a path from the root node; second, the possible range of the data can be obtained from each path; then the MCR is calculated from these ranges;
step 3.3: decoding to reconstruct the data; recovering the original data according to the MCR obtained from the decision tree paths; for categorical attributes, the original data must take its value within the MCR;
step 3.4: determining the value of the eForest model parameter K, the number of decision trees, which determines the size of the output data; after the eForest model is built, the data sets X_Data and Y_Label are input, the result is measured by the MSE index, and the optimal parameter K is then determined;
step 3.5: training the eForest model and storing the model;
step 3.6: extracting data features; inputting the training set X_Data into the trained eForest model and obtaining the data set X_enc by forward encoding, the dimension of the data set X_enc being (N, K).
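The forward encoding, MCR calculation and decoding of steps 3.1 to 3.3, together with the MSE check of step 3.4, can be sketched with NumPy alone. This is an illustrative sketch under loud assumptions, not the patented model: the trees are fixed-depth, completely random trees rather than trained ones, the data is assumed to be scaled to [0, 1], numeric values are decoded as the midpoint of the MCR interval, and all class and function names are hypothetical.

```python
import numpy as np

class RandomTree:
    """Complete binary tree of fixed depth with random axis-aligned splits."""

    def __init__(self, n_features, depth, rng):
        self.depth = depth
        n_internal = 2 ** depth - 1
        self.feature = rng.integers(0, n_features, size=n_internal)
        self.threshold = rng.random(n_internal)  # assumes data scaled to [0, 1]

    def leaf_index(self, x):
        """Step 3.1: route one sample from the root to a leaf."""
        node = 0
        for _ in range(self.depth):
            go_left = x[self.feature[node]] <= self.threshold[node]
            node = 2 * node + (1 if go_left else 2)
        return node - (2 ** self.depth - 1)

    def tighten(self, leaf, lo, hi):
        """Step 3.2: intersect the box [lo, hi] with this tree's path rule."""
        node = leaf + 2 ** self.depth - 1
        while node > 0:
            parent = (node - 1) // 2
            f, t = self.feature[parent], self.threshold[parent]
            if node == 2 * parent + 1:      # left branch means x[f] <= t
                hi[f] = min(hi[f], t)
            else:                           # right branch means x[f] > t
                lo[f] = max(lo[f], t)
            node = parent

def encode(trees, X):
    """Return leaf indices of shape (N, K), one column per tree."""
    return np.array([[t.leaf_index(x) for t in trees] for x in X])

def mcr(trees, code, n_features):
    """Intersect the path rules of all K trees into one box per feature."""
    lo, hi = np.zeros(n_features), np.ones(n_features)
    for tree, leaf in zip(trees, code):
        tree.tighten(int(leaf), lo, hi)
    return lo, hi

def decode(trees, code, n_features):
    """Step 3.3: pick the midpoint of the MCR (an assumed choice)."""
    lo, hi = mcr(trees, code, n_features)
    return (lo + hi) / 2.0

rng = np.random.default_rng(0)
X_demo = rng.random((20, 6))                    # hypothetical normalized samples
K = 50
trees = [RandomTree(6, 4, rng) for _ in range(K)]
X_enc = encode(trees, X_demo)                   # shape (N, K)
X_rec = np.array([decode(trees, c, 6) for c in X_enc])
mse = float(np.mean((X_demo - X_rec) ** 2))     # index used to compare values of K
print(X_enc.shape, mse)
```

Because every sample satisfies the path rule of the leaf it reached, the MCR always contains the original sample; repeating the run for several values of K and comparing the MSE is one way to carry out the selection described in step 3.4.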
4. The method for detecting abnormal plate shape quality of steel plate based on self-coding cascading forest model according to claim 1, wherein the step 4 further comprises the following steps:
step 4.1: judging whether the data obtained after the processing in step 3 is high-dimensional and whether a constraint relation exists between the data dimensions; if the input data is high-dimensional or a constraint relation exists between the data dimensions, a multi-granularity scanning structure must be placed before the cascade forest structure, and the original data is input first into the multi-granularity scanning structure and then into the cascade forest structure for quality detection of the steel plate shape; otherwise, proceeding directly to step 4.3 for quality detection of the steel plate shape;
step 4.2: a multi-granularity scanning structure;
step 4.3: if the original input data is not high-dimensional and no constraint relation exists between the data dimensions, inputting the original input data directly into the cascade forest structure;
step 4.4: designing the cascade forest structure, drawing on the ideas of ensemble learning and deep learning;
step 4.5: training the model with the training set X_train and tuning the hyperparameters during training to improve model accuracy;
step 4.6: storing the trained model with the highest accuracy achieved so far, taking the test data set X_test as the model input, and comparing the test set labels output by the model with the original manual labels to obtain the test accuracy of the model, thereby testing the generalization capability of the model and checking its classification capability on new data.
5. The method for detecting abnormal plate shape quality of steel plate based on self-coding cascading forest model according to claim 1, wherein the step 4.4 further comprises the following steps:
step 4.4.1: designing the levels of the cascade forest, each level consisting of several different ensemble learning classifiers, the n-th level of the cascade forest being A_n = {f_1, f_2, …, f_t}, where f_t denotes the t-th classifier, and each classifier yields the probabilities of c categories;
step 4.4.2: following the layer-by-layer structure of deep learning, the output of each layer is used as part of the input of the next layer;
the data subjected to multi-granularity scanning, or the directly input data, is fed into the cascade forest and converted by the first-layer forest into data of dimension E_1 = n × l × c + e, which then serves as the input data of the next layer, and so on until the last cascade level A_n; the last layer outputs the probabilities of the c categories, the results of all classifiers are summed and averaged, and the maximum probability and its corresponding label are then selected;
wherein n represents the number of classifier types in the first layer, l represents the number of classifiers of each type, and c represents the number of classification categories;
since both machine learning and deep learning inevitably carry the risk of overfitting and underfitting, each classifier is trained with k-fold cross-validation, i.e. each training sample is used k-1 times in the classifier, yielding k-1 class vectors, which are averaged to form part of the input of the next cascade level; after the cascade has been extended to a new level, the performance of the whole cascade structure built so far is evaluated on the validation set, and the training process ends if the evaluation result shows no change or no significant improvement, so that the number of levels of the cascade structure is determined automatically by the training process.
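The bookkeeping of one cascade level described in this claim can be sketched as follows: concatenating class vectors with the input features to reach n × l × c + e dimensions, averaging the last level's probabilities, and stopping when a new level brings no real gain. The classifiers themselves are omitted; the class vectors are assumed to come from already trained classifiers (for example via k-fold cross-validation), and the function names, the `min_gain` threshold and the demo arrays are hypothetical.

```python
import numpy as np

def augment(features, class_vectors):
    """Build the input of the next level.

    features: array of shape (N, e); class_vectors: list of n * l arrays,
    each of shape (N, c). Returns an array of shape (N, n * l * c + e).
    """
    return np.hstack(list(class_vectors) + [features])

def final_vote(class_vectors):
    """Average the last level's class probabilities and pick the largest."""
    mean_prob = np.mean(np.stack(class_vectors, axis=0), axis=0)  # (N, c)
    return mean_prob.argmax(axis=1), mean_prob.max(axis=1)

def should_stop(val_scores, min_gain=1e-3):
    """True when the newest level improves validation accuracy by < min_gain."""
    return len(val_scores) >= 2 and val_scores[-1] - max(val_scores[:-1]) < min_gain

# Hypothetical sizes: n = 2 classifier types, l = 2 of each, c = 3, e = 5, N = 4.
rng = np.random.default_rng(0)
demo_features = rng.random((4, 5))
demo_vectors = [rng.dirichlet(np.ones(3), size=4) for _ in range(2 * 2)]
next_input = augment(demo_features, demo_vectors)
print(next_input.shape)

# Two classifiers voting on two plates.
p1 = np.array([[0.2, 0.1, 0.7], [0.6, 0.3, 0.1]])
p2 = np.array([[0.1, 0.3, 0.6], [0.2, 0.7, 0.1]])
vote_labels, vote_probs = final_vote([p1, p2])
print(vote_labels)
```

Tracking the validation score per level and calling `should_stop` after each extension is one simple way to let the training process fix the number of levels.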
6. The method for detecting abnormal plate shape quality of steel plate based on self-coding cascade forest model according to claim 1, wherein in the step 4.2, the original input data may be multidimensional data, or one-dimensional or two-dimensional data; if the data is high-dimensional or a constraint relation exists between the data dimensions, the data needs to be subjected to multi-granularity scanning.
7. The method for detecting abnormal plate shape quality of steel plate based on self-coding cascading forest model according to claim 6, wherein the multi-granularity scanning selects s_1 different types of random forest, with l forests of each type, i.e. s_1 × l forests in total; the data subjected to multi-granularity scanning is input into each forest for training, and each forest yields a probability vector of length c, where c represents the number of classification categories and S represents the number of feature matrices obtained through the sliding window, so that each forest generates a probability vector of length S·c; finally, the results of all the forests are concatenated to obtain a probability vector of length S·c·s_1, which is output as the sample.
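The sliding-window step and the resulting output length can be illustrated with a short sketch. It assumes a one-dimensional feature vector, a stride of 1, and one forest per type; the forests are replaced by a hypothetical stand-in callable that returns uniform class probabilities, since only the shapes are of interest here.

```python
import numpy as np

def sliding_windows(x, window, stride=1):
    """Return all windows of a 1-D feature vector, shape (S, window)."""
    x = np.asarray(x)
    starts = range(0, len(x) - window + 1, stride)
    return np.stack([x[s:s + window] for s in starts])

def scan(x, window, forests):
    """Concatenate the per-window class probabilities of every forest.

    forests: callables mapping windows of shape (S, window) to
    probabilities of shape (S, c). The result has length S * c * len(forests).
    """
    W = sliding_windows(x, window)
    parts = [np.asarray(f(W)).reshape(-1) for f in forests]  # each of length S * c
    return np.concatenate(parts)

def uniform_forest(W, c=3):
    """Stand-in for a trained forest (hypothetical): uniform probabilities."""
    return np.full((len(W), c), 1.0 / c)

x_demo = np.arange(10, dtype=float)       # hypothetical feature vector, length 10
demo_windows = sliding_windows(x_demo, 4) # S = 10 - 4 + 1 = 7 windows
scanned = scan(x_demo, 4, [uniform_forest, uniform_forest])
print(demo_windows.shape, scanned.shape)
```

With S = 7 windows, c = 3 categories and two forests, the scanned sample has 7 × 3 × 2 = 42 entries, which matches the S·c·s_1 length stated in the claim when s_1 = 2.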
CN202311305000.8A 2023-10-10 2023-10-10 Plate shape anomaly detection method based on self-coding cascade forests Pending CN117557827A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311305000.8A CN117557827A (en) 2023-10-10 2023-10-10 Plate shape anomaly detection method based on self-coding cascade forests

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311305000.8A CN117557827A (en) 2023-10-10 2023-10-10 Plate shape anomaly detection method based on self-coding cascade forests

Publications (1)

Publication Number Publication Date
CN117557827A true CN117557827A (en) 2024-02-13

Family

ID=89822283

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311305000.8A Pending CN117557827A (en) 2023-10-10 2023-10-10 Plate shape anomaly detection method based on self-coding cascade forests

Country Status (1)

Country Link
CN (1) CN117557827A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117831659A (en) * 2024-03-04 2024-04-05 山东钢铁股份有限公司 Method and device for online detection of quality of wide and thick plates, electronic equipment and storage medium
CN117831659B (en) * 2024-03-04 2024-05-03 山东钢铁股份有限公司 Method and device for online detection of quality of wide and thick plates, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110533631B (en) SAR image change detection method based on pyramid pooling twin network
CN110213788B (en) WSN (Wireless sensor network) anomaly detection and type identification method based on data flow space-time characteristics
CN111046961B (en) Fault classification method based on bidirectional long-time and short-time memory unit and capsule network
CN112756759B (en) Spot welding robot workstation fault judgment method
CN117557827A (en) Plate shape anomaly detection method based on self-coding cascade forests
CN109460471B (en) Method for establishing fiber category map library based on self-learning mode
KR20220059120A (en) System for modeling automatically of machine learning with hyper-parameter optimization and method thereof
CN111985825A (en) Crystal face quality evaluation method for roller mill orientation instrument
CN111695611B (en) Bee colony optimization kernel extreme learning and sparse representation mechanical fault identification method
CN112686749A (en) Credit risk assessment method and device based on logistic regression technology
CN114580934A (en) Early warning method for food detection data risk based on unsupervised anomaly detection
CN111079348B (en) Method and device for detecting slowly-varying signal
Chou et al. SHM data anomaly classification using machine learning strategies: A comparative study
CN117152119A (en) Profile flaw visual detection method based on image processing
CN110717602A (en) Machine learning model robustness assessment method based on noise data
CN111126490B (en) Steel plate shape anomaly identification method based on depth random forest
CN110675382A (en) Aluminum electrolysis superheat degree identification method based on CNN-LapseLM
CN115496291A (en) Clustering type data augmented meteorological temperature prediction method based on high-precision residual defect value
CN113591897A (en) Method, device and equipment for detecting monitoring data abnormity and readable medium
Boman A deep learning approach to defect detection with limited data availability
Binghay et al. Object Detection Approach for Batch Detection of Cacao Bean Defects
Sutacha et al. Cucumber Disease Identification Using Multiple Machine Learning Classifiers with a Pre-Trained VGG16 Model
CN116563257A (en) Hyperspectral anomaly detection method based on random histogram forest
Packianather et al. Feature selection method for neural network for the classification of wood veneer defects
CN117172125A (en) Construction engineering stress assessment method and system based on convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination