CN117150282A - Secondhand equipment recycling evaluation method and system based on prediction model - Google Patents

Secondhand equipment recycling evaluation method and system based on prediction model

Info

Publication number
CN117150282A
CN117150282A (application No. CN202311195128.3A)
Authority
CN
China
Prior art keywords
data
sample
classification
representing
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311195128.3A
Other languages
Chinese (zh)
Other versions
CN117150282B (en)
Inventor
宇文立朋
李铁亮
宿馨元
刘雪枫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shijiazhuang Zhenghe Network Co ltd
Original Assignee
Shijiazhuang Zhenghe Network Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shijiazhuang Zhenghe Network Co ltd filed Critical Shijiazhuang Zhenghe Network Co ltd
Priority to CN202311195128.3A priority Critical patent/CN117150282B/en
Publication of CN117150282A publication Critical patent/CN117150282A/en
Application granted granted Critical
Publication of CN117150282B publication Critical patent/CN117150282B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G06F 18/2135: Feature extraction by transforming the feature space (subspace methods) based on approximation criteria, e.g. principal component analysis
    • G06F 17/16: Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G06F 18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/24: Classification techniques
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Abstract

The invention discloses a second-hand equipment recycling evaluation method and system based on a prediction model. The method comprises the following steps: data acquisition, feature extraction, second-hand equipment type classification, and equipment recycling evaluation grade classification. The invention relates to the technical field of equipment recycling evaluation, and in particular to a second-hand equipment recycling evaluation method and system based on a prediction model.

Description

Second-hand equipment recycling evaluation method and system based on a prediction model
Technical Field
The invention relates to the technical field of equipment recycling evaluation, in particular to a second-hand equipment recycling evaluation method and system based on a prediction model.
Background
Second-hand equipment recycling protects the environment and supports economic development: through the identification and evaluation of recycled equipment, the equipment can be resold or turned into second-hand market resources, creating economic value. Existing prediction-model-based second-hand equipment evaluation methods and systems have several problems: the acquired image features are redundant and highly complex, which is unfavorable for feature extraction; when a decision tree is used for classification, the sample proportions are imbalanced, the feature variables and thresholds used for splitting are highly complex, and the classification overfits; and when the equipment is classified and evaluated, the imbalanced classification of samples of different classes and the untimely updating of the sample counts lead to low classification accuracy and may cause hard classification.
Disclosure of Invention
In view of this situation and to overcome the shortcomings of the prior art, the invention provides a second-hand equipment recycling evaluation method and system based on a prediction model. To address the redundancy and high complexity of the acquired image features, which hinder feature extraction, the image data are reduced in dimension before classification. To address the imbalanced sample proportions, the complexity of the splitting feature variables and thresholds, and the overfitting that arise when a decision tree is used for classification, the invention classifies with a weighted decision tree and pre-prunes during construction, which reduces the complexity of the decision tree, avoids overfitting and improves the generalization ability of the model; selecting the feature variable and threshold with the minimum Gini index for each split further reduces the complexity of the decision tree. To address the imbalanced classification of samples of different classes and the untimely updating of the sample set, which lower classification accuracy and may cause hard classification, the invention adopts a weight-based scalable classification algorithm: weights are calculated over the data set and the sample weights are adjusted automatically, improving the classification accuracy for minority classes; during training, the samples are updated effectively using the gradient of the loss function with respect to the samples, which strengthens the learning ability of the model and ultimately reflects the contribution of each feature in a sample, thereby improving the accuracy of the classification evaluation.
The technical solution adopted by the invention is as follows: a second-hand equipment recycling evaluation method and system based on a prediction model, wherein the method comprises the following steps:
step S1: data acquisition: acquire images of the second-hand equipment together with data on its service time, wear degree and maintenance history;
step S2: feature extraction: reduce the dimension of the image data with principal component analysis by computing the eigenvalues and eigenvectors of the image covariance matrix and mapping the high-dimensional data to a low-dimensional space to obtain classifiable feature data, and extract key points and feature descriptors from the image with the scale-invariant feature transform to obtain the key points' feature values and labels;
step S3: second-hand equipment type classification: divide a training data set and a test data set, calculate the sample weights w, calculate the Gini index of each subset, pre-prune, calculate the total Gini index, select the feature variable and threshold with the smallest Gini index for splitting, and run a classification test on the test data set to obtain the classified equipment type;
step S4: equipment recycling evaluation grade classification: divide a training data set and a test data set, calculate and automatically adjust the sample weights, use the gradient of the loss function to update the training samples automatically, calculate the similarity between the training and test data sets, calculate the weights with which the test samples belong to the different classes, compare the test-set classification results with the actual labels to obtain the accuracy, precision, recall and F1 value, and evaluate the model (a pipeline sketch follows this list).
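To make the relationship between the four steps concrete, the following Python sketch wires them together. It is only an illustrative skeleton: the function names (acquire_data, extract_features, classify_type, grade_equipment), the array shapes and the placeholder random data are assumptions for this sketch, not the patent's reference implementation; the two classifiers are filled in by the sketches given after steps S3 and S4.

```python
# Illustrative pipeline skeleton for steps S1-S4; names, shapes and placeholder
# data are assumptions made for this sketch, not the patent's reference code.
import numpy as np

def acquire_data(n=100, size=64):
    """S1: stand-in for loading device images plus usage/wear/maintenance records."""
    rng = np.random.default_rng(0)
    images = rng.random((n, size, size))
    type_labels = rng.integers(0, 3, size=n)
    return images, type_labels

def extract_features(images, k=16):
    """S2: flatten the images and reduce their dimension with PCA (detailed later)."""
    X = images.reshape(len(images), -1)
    X = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:k].T            # low-dimensional feature set Y

def classify_type(features, labels):
    """S3: weighted decision tree with pre-pruning (sketched after step S3)."""
    raise NotImplementedError

def grade_equipment(features, labels):
    """S4: weight-based scalable classification and model evaluation (sketched after step S4)."""
    raise NotImplementedError

if __name__ == "__main__":
    images, labels = acquire_data()
    Y = extract_features(images)
    print("low-dimensional feature matrix:", Y.shape)
```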
Further, in step S1, the data acquisition acquires images of the second-hand equipment and their corresponding labels, wherein the labels are the type and degree of newness of the second-hand equipment, together with data on the equipment's service time, wear degree and maintenance history.
Further, in step S2, the feature extraction specifically includes the steps of:
step S21: the acquired original image is converted to grayscale using the following formula:
I = 0.299·R1 + 0.587·G1 + 0.114·B1
wherein I represents the gray value of a pixel, R1, G1 and B1 represent the red, green and blue color components of the pixel, and 0.299, 0.587 and 0.114 are the three coefficients applied to each pixel point;
step S22: principal component analysis is performed on the acquired second-hand equipment image data, specifically as follows:
step S221: construct the second-hand equipment recycling historical image data set X = [x_1, x_2, …, x_n]^T and define the covariance matrix of the recycled historical image data using the following formula:
R = (1/n)·Σ_{i=1}^{n} x_i·x_i^T
wherein R represents the covariance matrix of the recycling history data, n represents the number of samples in the data set, x_i^T represents the transpose of the ith image sample, and x_i represents the ith image sample;
step S222: the eigenvalues and corresponding eigenvectors of the covariance matrix are computed via singular value decomposition, using the following formula:
R = U·S·V^T
wherein U is the left singular vector matrix of R, S is a diagonal matrix, and V^T is the right singular vector matrix of R; the eigenvalue λ_i of the covariance matrix is the ith diagonal entry of S, and w_i, the eigenvector corresponding to λ_i, is the ith column of U;
step S223: set the target dimension k by introducing a parameter α and judging the size of the target dimension with the following formula:
(Σ_{i=1}^{k} λ_i) / (Σ_{i=1}^{d} λ_i) ≥ α
wherein k represents the target dimension, λ_i represents an eigenvalue of the covariance matrix, d represents the number of eigenvalues, and α represents a parameter greater than 0 and less than 1;
step S224: map the recycling historical data to the low-dimensional space to obtain the low-dimensional data set Y = [y_1, y_2, …, y_n]^T;
step S23: pre-compute and store the values of the Gaussian kernel function, and build a variable scale space at coordinates (x, y) through the Gaussian kernel, using the following formula:
R(x, y, σ) = (1/(2πσ²))·exp(-(x² + y²)/(2σ²))
wherein R(x, y, σ) represents the Gaussian function value at coordinate position (x, y) in the image, x and y represent the coordinates of the position, and σ represents the standard deviation of the Gaussian kernel;
step S24: construct the scale space of the image and acquire the key points, as follows:
step S241: filter the original image with a Gaussian filter to obtain the first-layer Gaussian image;
step S242: downsample the first-layer Gaussian image to obtain the second-layer Gaussian image;
step S243: repeat steps S241 and S242, filtering and downsampling the Gaussian image of the previous layer to obtain the Gaussian image of the next layer;
step S244: repeat steps S241, S242 and S243 until the preset number of pyramid layers is reached;
step S25: the orientation and gradient magnitude of the key points are computed with the following formulas:
O(x, y) = arctan[(L(x, y+1, σ) - L(x, y-1, σ)) / (L(x+1, y, σ) - L(x-1, y, σ))]
θ(x, y) = sqrt[(L(x+1, y, σ) - L(x-1, y, σ))² + (L(x, y+1, σ) - L(x, y-1, σ))²]
wherein L(x, y, σ) represents the scale space of the image, O(x, y) represents the orientation of the key point, θ(x, y) represents the gradient magnitude at the key point, and x and y represent the abscissa and ordinate of the point, respectively;
step S26: divide a 4×4 grid of sub-regions centered on the key point, perform linear interpolation, and take the statistical gradient information as the feature vector;
step S27: compare the feature vectors obtained by the above feature extraction with the standard feature vectors of the existing data to obtain deviation values for the service time, wear degree and maintenance history of the second-hand equipment (a feature-extraction sketch follows these steps).
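Steps S221 to S224 can be condensed into the following numpy sketch. The cumulative-eigenvalue criterion shown for choosing k is the usual reading of step S223; the value α = 0.95, the mean-centring and the toy data are assumptions made for this sketch.

```python
import numpy as np

def pca_reduce(X, alpha=0.95):
    """X: (n, d) matrix with one flattened image sample per row."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)              # mean-centring, a common convention assumed here
    R = (Xc.T @ Xc) / n                  # covariance matrix of the samples (step S221)
    U, S, Vt = np.linalg.svd(R)          # R = U S V^T (step S222)
    lam = S                              # eigenvalues: R is symmetric positive semi-definite
    ratio = np.cumsum(lam) / lam.sum()
    k = int(np.searchsorted(ratio, alpha) + 1)   # smallest k with cumulative ratio >= alpha (S223)
    W = U[:, :k]                         # eigenvectors w_1 .. w_k
    return Xc @ W, k                     # low-dimensional data set Y (step S224)

rng = np.random.default_rng(1)
Y, k = pca_reduce(rng.random((50, 64)))  # 50 samples of 8x8 "images"
print(k, Y.shape)
```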
Further, in step S3, the second-hand device data classification specifically includes the following steps:
step S31: data division: take the low-dimensional eigenvalues obtained in step S2 and their corresponding labels as the sample data, randomly take 70% of the sample data as the training data set and the remaining 30% as the test data set, and divide the training data set D into a subset D1 and a subset D2;
step S32: count the number of samples of each class in the data set, calculate a sample weight value and assign a different weight to each sample using the following formula:
wherein w_i represents the sample weight of the ith sample, and n represents the total number of samples of the ith class;
step S33: the weight-based Gini index of subset D1 is calculated using the following formula:
wherein G(D1) represents the Gini index of subset D1, k represents the total number of classes, w_i represents the weight of the ith sample, R_i represents the number of samples of the ith class, R represents the total number of samples, T_i represents the sum of the sample numbers outside the ith class, and T represents the total number minus the number of the ith class;
step S34: calculate the weight-based Gini index of subset D2 in the same way;
step S35: pre-prune the decision tree: set a threshold T, and when the calculated Gini index of a node is smaller than the threshold T, stop splitting that node and set it as a leaf node;
step S36: taking a split of the training data set D on feature C as an example, the total Gini index of the training data set is calculated with the following formula:
G(D, C) = (|D1|/|D|)·G(D1) + (|D2|/|D|)·G(D2)
wherein G(D, C) represents the total Gini index of the training data set D when it is split on feature C, |D1| represents the size of subset D1, |D2| represents the size of subset D2, and |D| represents the size of the training data set D;
step S37: select the feature variable and threshold with the smallest Gini index for the split, and keep splitting the subsets until each subset belongs to a single class and cannot be split further;
step S38: preset a test threshold, classify the test data set with the constructed decision tree while ignoring the label dimension, and compare the classification results with the labels to judge whether the classification is correct; if the classification accuracy on the test data set is lower than that on the training data set, re-divide the training and test data sets and return to step S32; otherwise the classification is complete (a Gini-index sketch follows these steps).
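The following sketch illustrates the kind of weighted Gini computation and split selection described in steps S32 to S37. Because the exact weight and Gini formulas are not reproduced above, it substitutes a standard inverse-class-frequency weighting and a weighted Gini impurity; treat it as an assumed stand-in rather than the patent's precise definition.

```python
import numpy as np

def sample_weights(y):
    """Step S32: weight each sample inversely to its class frequency (assumed scheme)."""
    classes, counts = np.unique(y, return_counts=True)
    w_class = {c: len(y) / (len(classes) * cnt) for c, cnt in zip(classes, counts)}
    return np.array([w_class[c] for c in y])

def weighted_gini(y, w):
    """Steps S33/S34: weighted Gini impurity of one node."""
    total = w.sum()
    if total == 0:
        return 0.0
    p = np.array([w[y == c].sum() / total for c in np.unique(y)])
    return 1.0 - np.sum(p ** 2)

def split_gini(feature, y, w, threshold):
    """Step S36: total Gini of splitting on one feature at one threshold."""
    left = feature <= threshold
    g_left, g_right = weighted_gini(y[left], w[left]), weighted_gini(y[~left], w[~left])
    n_left, n_right = w[left].sum(), w[~left].sum()
    return (n_left * g_left + n_right * g_right) / (n_left + n_right)

# Step S37: pick the (feature, threshold) pair with the smallest total Gini index.
rng = np.random.default_rng(2)
X, y = rng.random((200, 4)), rng.integers(0, 3, 200)
w = sample_weights(y)
candidates = [(split_gini(X[:, j], y, w, t), j, t)
              for j in range(X.shape[1])
              for t in np.quantile(X[:, j], [0.25, 0.5, 0.75])]
print("best (gini, feature, threshold):", min(candidates))
```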
Further, in step S4, the classification of the equipment's degree-of-newness data specifically includes the following steps:
step S41: construct a training data set and a test data set, wherein the data comprise the feature vectors of the second-hand equipment and their corresponding labels, the labels being the evaluation grades of the degree of newness of the second-hand equipment; randomly select 70% of the data as the training data set and the remaining 30% as the test data set;
step S42: the weight-based scalable classification algorithm is specifically as follows:
step S421: divide the given sample data set D, presetting the training set as P, the test set as S, the training sample set as X and the corresponding labels as C;
step S422: new samples are trained on the basis of the original sample data set using the following formula:
wherein X_new represents a newly trained sample, μ represents the loss value, and ∇L represents the gradient of the loss function with respect to the sample;
step S423: the samples in the test set that are similar to the training set samples are computed using the following formula:
wherein Q represents the similarity between the test set and the training set, x_i represents a sample in the training set, x_j represents the corresponding similar sample in the test set, n represents the total number of samples, t represents a feature term, ω_it represents the weight of the tth feature term in sample x_i, and ω_jt represents the weight of the tth feature term in sample x_j;
step S424: the weight with which test-set sample x_j belongs to class c_t is calculated using the following formula:
wherein W(x_j, c_t) represents the weight with which sample x_j belongs to class c_t, x_j represents a sample of the test set, c_t represents the class of feature term t, v(x_i, x_j) represents the vote weight function, and θ(x_i, c_t) takes the value 1 when x_i belongs to c_t and 0 otherwise;
step S425: assign x_j to the class with the largest weight, and repeat steps S422 and S423 until all the test samples have been classified;
step S43: compare the predicted and actual classification results with the corresponding labels to obtain the accuracy, precision, recall and F1 value; preset thresholds for the accuracy, precision, recall and F1 value, compare them and judge whether the classification model meets the standard; if it does, output the evaluation grade, otherwise return to step S41 (a classification sketch follows these steps).
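A hedged sketch of steps S421 to S425 follows. Since the exact update, similarity and vote-weight formulas are not reproduced above, the code substitutes a squared-error gradient step toward class centres for the sample update and cosine-similarity weighted voting over the k most similar training samples for the class weights; all function names and parameter values are assumptions.

```python
import numpy as np

def gradient_update(X, centers_per_sample, mu=0.1):
    """Step S422 (assumed form): move each sample along the gradient of a
    squared-error loss toward its class centre; mu plays the role of the step size."""
    grad = X - centers_per_sample            # gradient of 0.5 * ||X - centre||^2
    return X - mu * grad

def cosine_similarity(a, b):
    """Step S423 (assumed form): similarity between one test and one training sample."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def classify(x_test, X_train, y_train, k=5):
    """Steps S424-S425: accumulate per-class vote weights and return the heaviest class."""
    sims = np.array([cosine_similarity(x_test, x) for x in X_train])
    nearest = np.argsort(sims)[-k:]          # the k most similar training samples
    votes = {}
    for i in nearest:
        votes[int(y_train[i])] = votes.get(int(y_train[i]), 0.0) + sims[i]
    return max(votes, key=votes.get)

rng = np.random.default_rng(3)
X_train = rng.random((60, 8))
y_train = np.repeat(np.arange(4), 15)        # four evaluation grades, 15 samples each
centers = np.vstack([X_train[y_train == c].mean(axis=0) for c in range(4)])
X_train = gradient_update(X_train, centers[y_train])
print("predicted grade:", classify(rng.random(8), X_train, y_train))
```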
The invention also provides a second-hand equipment recycling evaluation system based on a prediction model, which comprises a data acquisition module, a feature extraction module, an equipment type classification module and an equipment recycling evaluation grade classification module;
the data acquisition module acquires the second-hand equipment images and the data on service time, wear degree and maintenance history, and sends them to the feature extraction module;
the feature extraction module receives the second-hand equipment images and the service-time, wear-degree and maintenance-history data sent by the data acquisition module, performs principal component analysis and feature extraction on the images, sends the data after principal component analysis to the equipment type classification module, and sends the feature-extracted data to the equipment recycling evaluation grade classification module;
the equipment type classification module receives the data after principal component analysis from the feature extraction module, performs weighted decision tree classification, and sends the classified data to the equipment recycling evaluation grade classification module;
the equipment recycling evaluation grade classification module receives the feature-extracted data from the feature extraction module, performs classification evaluation with the weight-based scalable classification algorithm, compares the predicted and actual classification results with the corresponding labels, obtains the accuracy, precision, recall and F1 value, and outputs the evaluation grade.
By adopting the scheme, the beneficial effects obtained by the invention are as follows:
(1) To address the redundancy and high complexity of the acquired image features, which hinder feature extraction, the invention uses principal component analysis to compute the eigenvalues and corresponding eigenvectors of the covariance matrix, find the most important features in the data set, and map the high-dimensional data to a low-dimensional space, reducing redundancy while retaining the key, discriminative features.
(2) To address the imbalanced sample proportions, the complexity of the splitting feature variables and thresholds, and the overfitting that arise when a decision tree is used for classification, the invention classifies with a weighted decision tree and pre-prunes during construction, which reduces the complexity of the decision tree, avoids overfitting and improves the generalization ability of the model; selecting the feature variable and threshold with the minimum Gini index for each split further reduces the complexity of the decision tree.
(3) To address the imbalanced classification of samples of different classes and the untimely updating of the sample counts, which lower classification accuracy and may cause hard classification, the invention adopts a weight-based scalable classification algorithm: weights are calculated over the data set and the sample weights are adjusted automatically, improving the classification accuracy for minority classes; during training, the samples are updated effectively using the gradient of the loss function with respect to the samples, which strengthens the learning ability of the model and ultimately reflects the contribution of each feature in a sample, thereby improving the accuracy of the classification evaluation.
Drawings
FIG. 1 is a schematic flow chart of a second-hand equipment recycling evaluation method based on a prediction model;
FIG. 2 is a block diagram of a second-hand equipment recycling and evaluating system based on a prediction model;
FIG. 3 is a flow chart of step S2;
FIG. 4 is a flow chart of step S3;
fig. 5 is a flow chart of step S4.
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention; all other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the description of the present invention, it should be understood that the terms "upper," "lower," "front," "rear," "left," "right," "top," "bottom," "inner," "outer," and the like indicate orientation or positional relationships based on those shown in the drawings, merely to facilitate description of the invention and simplify the description, and do not indicate or imply that the devices or elements referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus should not be construed as limiting the invention.
First embodiment, referring to fig. 1, the method for recycling and evaluating second-hand equipment based on a prediction model provided by the invention includes the following steps:
step S1: data acquisition: acquire images of the second-hand equipment together with data on its service time, wear degree and maintenance history;
step S2: feature extraction: reduce the dimension of the image data with principal component analysis by computing the eigenvalues and eigenvectors of the image covariance matrix and mapping the high-dimensional data to a low-dimensional space to obtain classifiable feature data, and extract key points and feature descriptors from the image with the scale-invariant feature transform to obtain the key points' feature values and labels;
step S3: second-hand equipment type classification: divide a training data set and a test data set, calculate the sample weights w, calculate the Gini index of each subset, pre-prune, calculate the total Gini index, select the feature variable and threshold with the smallest Gini index for splitting, and run a classification test on the test data set to obtain the classified equipment type;
step S4: equipment recycling evaluation grade classification: divide a training data set and a test data set, calculate and automatically adjust the sample weights, use the gradient of the loss function to update the training samples automatically, calculate the similarity between the training and test data sets, calculate the weights with which the test samples belong to the different classes, compare the test-set classification results with the actual labels to obtain the accuracy, precision, recall and F1 value, and evaluate the model.
In the second embodiment, referring to FIG. 1 and based on the foregoing embodiment, the data acquisition in step S1 acquires images of the second-hand equipment and their corresponding labels, wherein the labels are the type and degree of newness of the second-hand equipment, together with data on the equipment's service time, wear degree and maintenance history.
Embodiment three, referring to fig. 1 and 3, based on the above embodiment, in step S2, the feature extraction specifically includes the following steps:
step S21: the acquired original image is converted to grayscale using the following formula:
I = 0.299·R1 + 0.587·G1 + 0.114·B1
wherein I represents the gray value of a pixel, R1, G1 and B1 represent the red, green and blue color components of the pixel, and 0.299, 0.587 and 0.114 are the three coefficients applied to each pixel point;
step S22: principal component analysis is performed on the acquired second-hand equipment image data, specifically as follows:
step S221: construct the second-hand equipment recycling historical image data set X = [x_1, x_2, …, x_n]^T and define the covariance matrix of the recycled historical image data using the following formula:
R = (1/n)·Σ_{i=1}^{n} x_i·x_i^T
wherein R represents the covariance matrix of the recycling history data, n represents the number of samples in the data set, x_i^T represents the transpose of the ith image sample, and x_i represents the ith image sample;
step S222: the eigenvalues and corresponding eigenvectors of the covariance matrix are computed via singular value decomposition, using the following formula:
R = U·S·V^T
wherein U is the left singular vector matrix of R, S is a diagonal matrix, and V^T is the right singular vector matrix of R; the eigenvalue λ_i of the covariance matrix is the ith diagonal entry of S, and w_i, the eigenvector corresponding to λ_i, is the ith column of U;
step S223: set the target dimension k by introducing a parameter α and judging the size of the target dimension with the following formula:
(Σ_{i=1}^{k} λ_i) / (Σ_{i=1}^{d} λ_i) ≥ α
wherein k represents the target dimension, λ_i represents an eigenvalue of the covariance matrix, d represents the number of eigenvalues, and α represents a parameter greater than 0 and less than 1;
step S224: map the recycling historical data to the low-dimensional space to obtain the low-dimensional data set Y = [y_1, y_2, …, y_n]^T;
step S23: pre-compute and store the values of the Gaussian kernel function, and build a variable scale space at coordinates (x, y) through the Gaussian kernel, using the following formula:
R(x, y, σ) = (1/(2πσ²))·exp(-(x² + y²)/(2σ²))
wherein R(x, y, σ) represents the Gaussian function value at coordinate position (x, y) in the image, x and y represent the coordinates of the position, and σ represents the standard deviation of the Gaussian kernel;
step S24: construct the scale space of the image and acquire the key points, as follows:
step S241: filter the original image with a Gaussian filter to obtain the first-layer Gaussian image;
step S242: downsample the first-layer Gaussian image to obtain the second-layer Gaussian image;
step S243: repeat steps S241 and S242, filtering and downsampling the Gaussian image of the previous layer to obtain the Gaussian image of the next layer;
step S244: repeat steps S241, S242 and S243 until the preset number of pyramid layers is reached;
step S25: the orientation and gradient magnitude of the key points are computed with the following formulas:
O(x, y) = arctan[(L(x, y+1, σ) - L(x, y-1, σ)) / (L(x+1, y, σ) - L(x-1, y, σ))]
θ(x, y) = sqrt[(L(x+1, y, σ) - L(x-1, y, σ))² + (L(x, y+1, σ) - L(x, y-1, σ))²]
wherein L(x, y, σ) represents the scale space of the image, O(x, y) represents the orientation of the key point, θ(x, y) represents the gradient magnitude at the key point, and x and y represent the abscissa and ordinate of the point, respectively;
step S26: divide a 4×4 grid of sub-regions centered on the key point, perform linear interpolation, and take the statistical gradient information as the feature vector;
step S27: compare the feature vectors obtained by the above feature extraction with the standard feature vectors of the existing data to obtain deviation values for the service time, wear degree and maintenance history of the second-hand equipment.
By performing the above operations, the method addresses the redundancy and high complexity of the acquired image features that hinder feature extraction: principal component analysis computes the eigenvalues and corresponding eigenvectors of the covariance matrix, finds the most important features in the data set, and maps the high-dimensional data to a low-dimensional space, reducing redundancy while retaining the key, discriminative features.
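The Gaussian scale space of steps S23 to S25 can be sketched as follows, assuming scipy.ndimage.gaussian_filter, a factor-2 downsampling per layer and σ = 1.6; these concrete choices are illustrative and not specified by the patent.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(image, levels=4, sigma=1.6):
    """Steps S241-S244: filter, then repeatedly downsample, until the preset layer count."""
    pyramid = [gaussian_filter(image, sigma)]           # first-layer Gaussian image (S241)
    for _ in range(levels - 1):
        blurred = gaussian_filter(pyramid[-1], sigma)   # filter the previous layer (S243)
        pyramid.append(blurred[::2, ::2])               # downsample by a factor of 2 (S242)
    return pyramid

def keypoint_orientation_magnitude(L, x, y):
    """Step S25: gradient orientation and magnitude at pixel (x, y) of scale-space image L."""
    dx = L[y, x + 1] - L[y, x - 1]
    dy = L[y + 1, x] - L[y - 1, x]
    return np.arctan2(dy, dx), np.hypot(dx, dy)

img = np.random.default_rng(4).random((128, 128))
pyr = gaussian_pyramid(img)
print([layer.shape for layer in pyr])                   # (128,128), (64,64), (32,32), (16,16)
print(keypoint_orientation_magnitude(pyr[0], 10, 10))
```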
In a fourth embodiment, referring to FIG. 1 and FIG. 4 and based on the foregoing embodiment, the second-hand equipment data classification in step S3 specifically includes the following steps:
step S31: data division: take the low-dimensional eigenvalues obtained in step S2 and their corresponding labels as the sample data, randomly take 70% of the sample data as the training data set and the remaining 30% as the test data set, and divide the training data set D into a subset D1 and a subset D2;
step S32: count the number of samples of each class in the data set, calculate a sample weight value and assign a different weight to each sample using the following formula:
wherein w_i represents the sample weight of the ith sample, and n represents the total number of samples of the ith class;
step S33: the weight-based Gini index of subset D1 is calculated using the following formula:
wherein G(D1) represents the Gini index of subset D1, k represents the total number of classes, w_i represents the weight of the ith sample, R_i represents the number of samples of the ith class, R represents the total number of samples, T_i represents the sum of the sample numbers outside the ith class, and T represents the total number minus the number of the ith class;
step S34: calculate the weight-based Gini index of subset D2 in the same way;
step S35: pre-prune the decision tree: set a threshold T, and when the calculated Gini index of a node is smaller than the threshold T, stop splitting that node and set it as a leaf node;
step S36: taking a split of the training data set D on feature C as an example, the total Gini index of the training data set is calculated with the following formula:
G(D, C) = (|D1|/|D|)·G(D1) + (|D2|/|D|)·G(D2)
wherein G(D, C) represents the total Gini index of the training data set D when it is split on feature C, |D1| represents the size of subset D1, |D2| represents the size of subset D2, and |D| represents the size of the training data set D;
step S37: select the feature variable and threshold with the smallest Gini index for the split, and keep splitting the subsets until each subset belongs to a single class and cannot be split further;
step S38: preset a test threshold, classify the test data set with the constructed decision tree while ignoring the label dimension, and compare the classification results with the labels to judge whether the classification is correct; if the classification accuracy on the test data set is lower than that on the training data set, re-divide the training and test data sets and return to step S32; otherwise the classification is complete.
By performing the above operations, the method addresses the imbalanced sample proportions, the complexity of the splitting feature variables and thresholds, and the overfitting that arise when a decision tree is used for classification: classification with a weighted decision tree and pre-pruning during construction reduce the complexity of the decision tree, avoid overfitting and improve the generalization ability of the model, and selecting the feature variable and threshold with the minimum Gini index for each split further reduces the complexity of the decision tree.
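The pre-pruning rule of step S35 can be illustrated with the self-contained sketch below: a node becomes a leaf as soon as its impurity is already below a preset threshold T (or a depth limit is reached). The unweighted Gini impurity, the median split and the threshold value are simplifying assumptions for this sketch.

```python
import numpy as np

def gini(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def build_tree(X, y, T=0.1, depth=0, max_depth=5):
    """Grow a small CART-style tree; step S35's pre-pruning stops a node early
    once its Gini impurity is already below the preset threshold T."""
    if gini(y) < T or depth >= max_depth or len(np.unique(y)) == 1:
        values, counts = np.unique(y, return_counts=True)
        return {"leaf": values[np.argmax(counts)]}      # leaf = majority class
    best = None
    for j in range(X.shape[1]):                         # try a median split on each feature
        t = np.median(X[:, j])
        left = X[:, j] <= t
        if left.all() or not left.any():
            continue
        score = (left.sum() * gini(y[left]) + (~left).sum() * gini(y[~left])) / len(y)
        if best is None or score < best[0]:
            best = (score, j, t, left)
    if best is None:                                    # no usable split found
        values, counts = np.unique(y, return_counts=True)
        return {"leaf": values[np.argmax(counts)]}
    _, j, t, left = best
    return {"feature": j, "threshold": t,
            "left": build_tree(X[left], y[left], T, depth + 1, max_depth),
            "right": build_tree(X[~left], y[~left], T, depth + 1, max_depth)}

rng = np.random.default_rng(5)
X = rng.random((150, 3))
y = (X[:, 0] > 0.5).astype(int)                         # toy labels driven by feature 0
tree = build_tree(X, y)
print("root split:", tree["feature"], round(float(tree["threshold"]), 2))
```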
In a fifth embodiment, referring to FIG. 1 and FIG. 5 and based on the foregoing embodiment, the classification of the equipment's degree-of-newness data in step S4 specifically includes the following steps:
step S41: construct a training data set and a test data set, wherein the data comprise the feature vectors of the second-hand equipment and their corresponding labels, the labels being the evaluation grades of the degree of newness of the second-hand equipment; randomly select 70% of the data as the training data set and the remaining 30% as the test data set;
step S42: the weight-based scalable classification algorithm is specifically as follows:
step S421: divide the given sample data set D, presetting the training set as P, the test set as S, the training sample set as X and the corresponding labels as C;
step S422: new samples are trained on the basis of the original sample data set using the following formula:
wherein X_new represents a newly trained sample, μ represents the loss value, and ∇L represents the gradient of the loss function with respect to the sample;
step S423: the samples in the test set that are similar to the training set samples are computed using the following formula:
wherein Q represents the similarity between the test set and the training set, x_i represents a sample in the training set, x_j represents the corresponding similar sample in the test set, n represents the total number of samples, t represents a feature term, ω_it represents the weight of the tth feature term in sample x_i, and ω_jt represents the weight of the tth feature term in sample x_j;
step S424: the weight with which test-set sample x_j belongs to class c_t is calculated using the following formula:
wherein W(x_j, c_t) represents the weight with which sample x_j belongs to class c_t, x_j represents a sample of the test set, c_t represents the class of feature term t, v(x_i, x_j) represents the vote weight function, and θ(x_i, c_t) takes the value 1 when x_i belongs to c_t and 0 otherwise;
step S425: assign x_j to the class with the largest weight, and repeat steps S422 and S423 until all the test samples have been classified;
step S43: compare the predicted and actual classification results with the corresponding labels to obtain the accuracy, precision, recall and F1 value; preset thresholds for the accuracy, precision, recall and F1 value, compare them and judge whether the classification model meets the standard; if it does, output the evaluation grade, otherwise return to step S41.
By performing the above operations, the method addresses the imbalanced classification of samples of different classes and the untimely updating of the sample counts, which can lower classification accuracy: the weight-based scalable classification algorithm calculates weights over the data set and adjusts the sample weights automatically, improving the classification accuracy for minority classes; during training, the samples are updated effectively using the gradient of the loss function with respect to the samples, which strengthens the learning ability of the model and ultimately reflects the contribution of each feature in a sample, thereby improving the accuracy of the classification evaluation.
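The model check of step S43 amounts to comparing predictions with labels and computing accuracy, precision, recall and F1 before testing them against preset thresholds. A small sketch, with macro averaging assumed for the multi-class case:

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Step S43: accuracy plus macro-averaged precision, recall and F1."""
    classes = np.unique(y_true)
    accuracy = float(np.mean(y_true == y_pred))
    precision, recall = [], []
    for c in classes:
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision.append(tp / (tp + fp) if tp + fp else 0.0)
        recall.append(tp / (tp + fn) if tp + fn else 0.0)
    p, r = float(np.mean(precision)), float(np.mean(recall))
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return accuracy, p, r, f1

y_true = np.array([0, 1, 2, 2, 1, 0, 2, 1])
y_pred = np.array([0, 1, 2, 1, 1, 0, 2, 2])
acc, p, r, f1 = evaluate(y_true, y_pred)
print(acc, round(p, 3), round(r, 3), round(f1, 3))
# The evaluation grade would be output only if all four values clear their preset
# thresholds; otherwise the data sets are re-divided and the classification repeats.
```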
In a sixth embodiment, referring to FIG. 2, the invention provides a second-hand equipment recycling evaluation system based on a prediction model, which comprises a data acquisition module, a feature extraction module, an equipment type classification module and an equipment recycling evaluation grade classification module;
the data acquisition module acquires the second-hand equipment images and the data on service time, wear degree and maintenance history, and sends them to the feature extraction module;
the feature extraction module receives the second-hand equipment images and the service-time, wear-degree and maintenance-history data sent by the data acquisition module, performs principal component analysis and feature extraction on the images, sends the data after principal component analysis to the equipment type classification module, and sends the feature-extracted data to the equipment recycling evaluation grade classification module;
the equipment type classification module receives the data after principal component analysis from the feature extraction module, performs weighted decision tree classification, and sends the classified data to the equipment recycling evaluation grade classification module;
the equipment recycling evaluation grade classification module receives the feature-extracted data from the feature extraction module, performs classification evaluation with the weight-based scalable classification algorithm, compares the predicted and actual classification results with the corresponding labels, obtains the accuracy, precision, recall and F1 value, and outputs the evaluation grade (a module-level sketch follows).
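The data flow between the four modules can be summarised with the toy sketch below; the class and method names are illustrative placeholders, not taken from the patent, and each placeholder body stands in for the corresponding algorithm sketched earlier.

```python
class DataAcquisitionModule:
    """Collects device images plus service-time / wear / maintenance-history records."""
    def collect(self):
        return {"images": [], "records": []}

class FeatureExtractionModule:
    """Runs PCA for the type classifier and key-point extraction for the grading module."""
    def run(self, raw):
        pca_features = raw["images"]        # placeholder for the PCA output
        keypoint_features = raw["images"]   # placeholder for the key-point descriptors
        return pca_features, keypoint_features

class DeviceTypeClassificationModule:
    """Weighted decision tree classification stands behind this placeholder."""
    def classify(self, pca_features):
        return ["type_A" for _ in pca_features]

class RecyclingGradeClassificationModule:
    """Weight-based scalable classification and metric checks stand behind this placeholder."""
    def grade(self, keypoint_features, device_types):
        return ["grade_1" for _ in keypoint_features]

raw = DataAcquisitionModule().collect()
pca_f, kp_f = FeatureExtractionModule().run(raw)
types = DeviceTypeClassificationModule().classify(pca_f)
grades = RecyclingGradeClassificationModule().grade(kp_f, types)
print(types, grades)
```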
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
The invention and its embodiments have been described above without limitation, and the actual construction is not limited to what is shown in the drawings. In summary, if a person of ordinary skill in the art, enlightened by this disclosure, adopts a structural arrangement or embodiment similar to this technical solution without creative effort, it shall fall within the protection scope of the invention.

Claims (6)

1. A second-hand equipment recycling evaluation method based on a prediction model, characterized by comprising the following steps:
step S1: data acquisition: acquire images of the second-hand equipment together with data on its service time, wear degree and maintenance history;
step S2: feature extraction: reduce the dimension of the image data with principal component analysis by computing the eigenvalues and eigenvectors of the image covariance matrix and mapping the high-dimensional data to a low-dimensional space to obtain classifiable feature data, and extract key points and feature descriptors from the image with the scale-invariant feature transform to obtain the key points' feature values and labels;
step S3: second-hand equipment type classification: divide a training data set and a test data set, calculate the sample weights w, calculate the Gini index of each subset, pre-prune, calculate the total Gini index, select the feature variable and threshold with the smallest Gini index for splitting, and run a classification test on the test data set to obtain the classified equipment type;
step S4: equipment recycling evaluation grade classification: divide a training data set and a test data set, calculate and automatically adjust the sample weights, use the gradient of the loss function to update the training samples automatically, calculate the similarity between the training and test data sets, calculate the weights with which the test samples belong to the different classes, compare the test-set classification results with the actual labels to obtain the accuracy, precision, recall and F1 value, and evaluate the model.
2. The prediction-model-based second-hand equipment recycling evaluation method according to claim 1, characterized in that: in step S4, the classification of the equipment's degree-of-newness data specifically comprises the following steps:
step S41: construct a training data set and a test data set, wherein the data comprise the feature vectors of the second-hand equipment and their corresponding labels, the labels being the evaluation grades of the degree of newness of the second-hand equipment; randomly select 70% of the data as the training data set and the remaining 30% as the test data set;
step S42: the weight-based scalable classification algorithm is specifically as follows:
step S421: divide the given sample data set D, presetting the training set as P, the test set as S, the training sample set as X and the corresponding labels as C;
step S422: new samples are trained on the basis of the original sample data set using the following formula:
wherein X_new represents a newly trained sample, μ represents the loss value, and ∇L represents the gradient of the loss function with respect to the sample;
step S423: the samples in the test set that are similar to the training set samples are computed using the following formula:
wherein Q represents the similarity between the test set and the training set, x_i represents a sample in the training set, x_j represents the corresponding similar sample in the test set, n represents the total number of samples, t represents a feature term, ω_it represents the weight of the tth feature term in sample x_i, and ω_jt represents the weight of the tth feature term in sample x_j;
step S424: the weight with which test-set sample x_j belongs to class c_t is calculated using the following formula:
wherein W(x_j, c_t) represents the weight with which sample x_j belongs to class c_t, x_j represents a sample of the test set, c_t represents the class of feature term t, v(x_i, x_j) represents the vote weight function, and θ(x_i, c_t) takes the value 1 when x_i belongs to c_t and 0 otherwise;
step S425: assign x_j to the class with the largest weight, and repeat steps S422 and S423 until all the test samples have been classified;
step S43: compare the predicted and actual classification results with the corresponding labels to obtain the accuracy, precision, recall and F1 value; preset thresholds for the accuracy, precision, recall and F1 value, compare them and judge whether the classification model meets the standard; if it does, output the evaluation grade, otherwise return to step S41.
3. The prediction-model-based second-hand equipment recycling evaluation method according to claim 1, characterized in that: in step S2, the feature extraction specifically comprises the following steps:
step S21: the acquired original image is converted to grayscale using the following formula:
I = 0.299·R1 + 0.587·G1 + 0.114·B1
wherein I represents the gray value of a pixel, R1, G1 and B1 represent the red, green and blue color components of the pixel, and 0.299, 0.587 and 0.114 are the three coefficients applied to each pixel point;
step S22: principal component analysis is performed on the acquired second-hand equipment image data, specifically as follows:
step S221: construct the second-hand equipment recycling historical image data set X = [x_1, x_2, …, x_n]^T and define the covariance matrix of the recycled historical image data using the following formula:
R = (1/n)·Σ_{i=1}^{n} x_i·x_i^T
wherein R represents the covariance matrix of the recycling history data, n represents the number of samples in the data set, x_i^T represents the transpose of the ith image sample, and x_i represents the ith image sample;
step S222: the eigenvalues and corresponding eigenvectors of the covariance matrix are computed via singular value decomposition, using the following formula:
R = U·S·V^T
wherein U is the left singular vector matrix of R, S is a diagonal matrix, and V^T is the right singular vector matrix of R; the eigenvalue λ_i of the covariance matrix is the ith diagonal entry of S, and w_i, the eigenvector corresponding to λ_i, is the ith column of U;
step S223: set the target dimension k by introducing a parameter α and judging the size of the target dimension with the following formula:
(Σ_{i=1}^{k} λ_i) / (Σ_{i=1}^{d} λ_i) ≥ α
wherein k represents the target dimension, λ_i represents an eigenvalue of the covariance matrix, d represents the number of eigenvalues, and α represents a parameter greater than 0 and less than 1;
step S224: map the recycling historical data to the low-dimensional space to obtain the low-dimensional data set Y = [y_1, y_2, …, y_n]^T;
step S23: pre-compute and store the values of the Gaussian kernel function, and build a variable scale space at coordinates (x, y) through the Gaussian kernel, using the following formula:
R(x, y, σ) = (1/(2πσ²))·exp(-(x² + y²)/(2σ²))
wherein R(x, y, σ) represents the Gaussian function value at coordinate position (x, y) in the image, x and y represent the coordinates of the position, and σ represents the standard deviation of the Gaussian kernel;
step S24: construct the scale space of the image and acquire the key points, as follows:
step S241: filter the original image with a Gaussian filter to obtain the first-layer Gaussian image;
step S242: downsample the first-layer Gaussian image to obtain the second-layer Gaussian image;
step S243: repeat steps S241 and S242, filtering and downsampling the Gaussian image of the previous layer to obtain the Gaussian image of the next layer;
step S244: repeat steps S241, S242 and S243 until the preset number of pyramid layers is reached;
step S25: the orientation and gradient magnitude of the key points are computed with the following formulas:
O(x, y) = arctan[(L(x, y+1, σ) - L(x, y-1, σ)) / (L(x+1, y, σ) - L(x-1, y, σ))]
θ(x, y) = sqrt[(L(x+1, y, σ) - L(x-1, y, σ))² + (L(x, y+1, σ) - L(x, y-1, σ))²]
wherein L(x, y, σ) represents the scale space of the image, O(x, y) represents the orientation of the key point, θ(x, y) represents the gradient magnitude at the key point, and x and y represent the abscissa and ordinate of the point, respectively;
step S26: divide a 4×4 grid of sub-regions centered on the key point, perform linear interpolation, and take the statistical gradient information as the feature vector;
step S27: compare the feature vectors obtained by the above feature extraction with the standard feature vectors of the existing data to obtain deviation values for the service time, wear degree and maintenance history of the second-hand equipment.
4. The prediction-model-based second-hand equipment recycling evaluation method according to claim 1, characterized in that: in step S3, the second-hand equipment data classification specifically comprises the following steps:
step S31: data division: take the low-dimensional eigenvalues obtained in step S2 and their corresponding labels as the sample data, randomly take 70% of the sample data as the training data set and the remaining 30% as the test data set, and divide the training data set D into a subset D1 and a subset D2;
step S32: count the number of samples of each class in the data set, calculate a sample weight value and assign a different weight to each sample using the following formula:
wherein w_i represents the sample weight of the ith sample, and n represents the total number of samples of the ith class;
step S33: the weight-based Gini index of subset D1 is calculated using the following formula:
wherein G(D1) represents the Gini index of subset D1, k represents the total number of classes, w_i represents the weight of the ith sample, R_i represents the number of samples of the ith class, R represents the total number of samples, T_i represents the sum of the sample numbers outside the ith class, and T represents the total number minus the number of the ith class;
step S34: calculate the weight-based Gini index of subset D2 in the same way;
step S35: pre-prune the decision tree: set a threshold T, and when the calculated Gini index of a node is smaller than the threshold T, stop splitting that node and set it as a leaf node;
step S36: the training data set D is split on feature C, and the total Gini index of the training data set is calculated with the following formula:
G(D, C) = (|D1|/|D|)·G(D1) + (|D2|/|D|)·G(D2)
wherein G(D, C) represents the total Gini index of the training data set D when it is split on feature C, |D1| represents the size of subset D1, |D2| represents the size of subset D2, and |D| represents the size of the training data set D;
step S37: select the feature variable and threshold with the smallest Gini index for the split, and keep splitting the subsets until each subset belongs to a single class and cannot be split further;
step S38: preset a test threshold, classify the test data set with the constructed decision tree while ignoring the label dimension, and compare the classification results with the labels to judge whether the classification is correct; if the classification accuracy on the test data set is lower than that on the training data set, re-divide the training and test data sets and return to step S32; otherwise the classification is complete.
5. The prediction-model-based second-hand equipment recycling evaluation method according to claim 1, characterized in that: in step S1, the data acquisition acquires images of the second-hand equipment and their corresponding labels, wherein the labels are the type and degree of newness of the second-hand equipment, together with data on the equipment's service time, wear degree and maintenance history.
6. A prediction-model-based second-hand equipment recycling evaluation system for implementing the prediction-model-based second-hand equipment recycling evaluation method according to any one of claims 1 to 5, characterized in that: the system comprises a data acquisition module, a feature extraction module, an equipment type classification module and an equipment recycling evaluation grade classification module;
the data acquisition module acquires the second-hand equipment images and the data on service time, wear degree and maintenance history, and sends them to the feature extraction module;
the feature extraction module receives the second-hand equipment images and the service-time, wear-degree and maintenance-history data sent by the data acquisition module, performs principal component analysis and feature extraction on the images, sends the data after principal component analysis to the equipment type classification module, and sends the feature-extracted data to the equipment recycling evaluation grade classification module;
the equipment type classification module receives the data after principal component analysis from the feature extraction module, performs weighted decision tree classification, and sends the classified data to the equipment recycling evaluation grade classification module;
the equipment recycling evaluation grade classification module receives the feature-extracted data from the feature extraction module, performs classification evaluation with the weight-based scalable classification algorithm, compares the predicted and actual classification results with the corresponding labels, obtains the accuracy, precision, recall and F1 value, and outputs the evaluation grade.
CN202311195128.3A 2023-09-16 2023-09-16 Secondhand equipment recycling evaluation method and system based on prediction model Active CN117150282B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311195128.3A CN117150282B (en) 2023-09-16 2023-09-16 Secondhand equipment recycling evaluation method and system based on prediction model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311195128.3A CN117150282B (en) 2023-09-16 2023-09-16 Secondhand equipment recycling evaluation method and system based on prediction model

Publications (2)

Publication Number Publication Date
CN117150282A true CN117150282A (en) 2023-12-01
CN117150282B CN117150282B (en) 2024-01-30

Family

ID=88911950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311195128.3A Active CN117150282B (en) 2023-09-16 2023-09-16 Secondhand equipment recycling evaluation method and system based on prediction model

Country Status (1)

Country Link
CN (1) CN117150282B (en)



Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108734355A (en) * 2018-05-24 2018-11-02 国网福建省电力有限公司 A kind of short-term electric load method of parallel prediction and system applied to power quality harnessed synthetically scene
CN110929956A (en) * 2019-12-06 2020-03-27 中国水利水电科学研究院 Flood forecasting scheme real-time optimization method based on machine learning
CN112950231A (en) * 2021-03-19 2021-06-11 广州瀚信通信科技股份有限公司 XGboost algorithm-based abnormal user identification method, device and computer-readable storage medium
CN113569960A (en) * 2021-07-29 2021-10-29 北京邮电大学 Small sample image classification method and system based on domain adaptation
CN113837803A (en) * 2021-09-24 2021-12-24 深圳闪回科技有限公司 Second-hand mobile phone recycling price prediction algorithm based on multi-model fusion
CN114187587A (en) * 2021-11-16 2022-03-15 上海斐鸿网络科技有限公司 Quality detection and evaluation method for luxury of second-hand suitcase
CN116432123A (en) * 2023-03-16 2023-07-14 浙江工业大学 Electric energy meter fault early warning method based on CART decision tree algorithm
CN116503118A (en) * 2023-04-03 2023-07-28 同济大学 Waste household appliance value evaluation system based on classification selection reinforcement prediction model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Xie Jiaping; Zhao Zhong: "Research on a waste recycling prediction model based on GERT stochastic networks" (基于GERT随机网络的废弃回收预测模型研究), Chinese Journal of Management (管理学报), no. 02, pages 294-300 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117670382A (en) * 2023-12-05 2024-03-08 广州穗圣信息科技有限公司 Method and system for carrying out secondary handcart estimation by utilizing big data
CN117725488A (en) * 2024-02-06 2024-03-19 河北元泰建中项目管理有限公司 Building engineering project safety performance prediction method and system based on machine learning
CN117725488B (en) * 2024-02-06 2024-04-30 河北元泰建中项目管理有限公司 Building engineering project safety performance prediction method and system based on machine learning

Also Published As

Publication number Publication date
CN117150282B (en) 2024-01-30

Similar Documents

Publication Publication Date Title
CN117150282B (en) Secondhand equipment recycling evaluation method and system based on prediction model
CN110619369B (en) Fine-grained image classification method based on feature pyramid and global average pooling
CN107609601B (en) Ship target identification method based on multilayer convolutional neural network
He et al. Learning and incorporating top-down cues in image segmentation
CN107527068B (en) Vehicle type identification method based on CNN and domain adaptive learning
WO2021164625A1 (en) Method of training an image classification model
CN108960055B (en) Lane line detection method based on local line segment mode characteristics
CN105160355B (en) A kind of method for detecting change of remote sensing image based on region correlation and vision word
CN112200121B (en) Hyperspectral unknown target detection method based on EVM and deep learning
JP2001509619A (en) Incremental simultaneous learning method and apparatus in automatic defect classification of semiconductor wafer and liquid crystal display
CN104850865A (en) Real-time compression tracking method of multi-characteristic transfer learning
CN105184298A (en) Image classification method through fast and locality-constrained low-rank coding process
KR20190123372A (en) Apparatus and method for robust face recognition via hierarchical collaborative representation
CN111339478B (en) Meteorological data quality assessment method based on improved fuzzy analytic hierarchy process
CN115082483B (en) Glass fiber board surface defect identification method based on optical camera
CN110781882A (en) License plate positioning and identifying method based on YOLO model
CN113762151A (en) Fault data processing method and system and fault prediction method
CN111639212B (en) Image retrieval method in mining intelligent video analysis
CN107480718A (en) A kind of high-resolution remote sensing image sorting technique of view-based access control model bag of words
Voelsen et al. Investigations on feature similarity and the impact of training data for land cover classification
CN115565009A (en) Electronic component classification method based on deep denoising sparse self-encoder and ISSVM
CN108898052A (en) The detection method and equipment of man-made features in remote sensing images
CN108710915B (en) Multi-feature fusion gastroscope image processing method based on multi-kernel learning
CN115063374A (en) Model training method, face image quality scoring method, electronic device and storage medium
Weizman et al. Detection of urban zones in satellite images using visual words

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant