CN117669810A - Multi-strategy fusion material sampling inspection method - Google Patents

Multi-strategy fusion material sampling inspection method

Publication number
CN117669810A
Authority
CN
China
Prior art keywords
model
strategy
electric power
fusion
material data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311597412.3A
Other languages
Chinese (zh)
Inventor
马锐
李潇逸
吴恒
陈威浩
董兆林
朱芷萱
廖玉琳
敖翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Chongqing Electric Power Co Ltd
Materials Branch of State Grid Chongqing Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Chongqing Electric Power Co Ltd
Materials Branch of State Grid Chongqing Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04: INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00: Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to an intelligent decision-making technology, and discloses a multi-strategy fusion material sampling inspection method, which comprises the following steps: acquiring standardized historical electric power material data, and converting the standardized historical electric power material data into historical electric power material data characteristics; training a pre-constructed multi-strategy fusion model by utilizing historical electric power material data characteristics, and obtaining a standard multi-strategy fusion model after training is completed; and acquiring the electric power material data to be spot checked, respectively carrying out spot check prediction on the electric power material data to be spot checked by the first strategy model, the second strategy model and the third strategy model of the standard multi-strategy fusion model, and carrying out weighted average processing on spot check results obtained by the first strategy model, the second strategy model and the third strategy model prediction to obtain spot check prediction results. The invention further provides a multi-strategy fusion material sampling inspection device, electronic equipment and a storage medium. The invention can improve the spot check working efficiency of the electric power materials and the accuracy of spot check results.

Description

Multi-strategy fusion material sampling inspection method
Technical Field
The invention relates to the technical field of intelligent decision making, in particular to a multi-strategy fusion material sampling inspection method.
Background
With the development of intelligent control of materials, the detection of electric power materials has become an important working step in the electric power material supply chain.
The patent document with publication number CN 110555596A discloses a method and a system for formulating a sampling inspection strategy based on the quality evaluation of distribution materials. CN 110555596A uses a statistical method and samples every electric power material type at the same proportion, so it lacks pertinence and accuracy: it neither allocates inspection resources according to the differences between material suppliers and the inspection qualification rates of the material types, nor formulates differentiated sampling inspection measures.
The patent document with publication number CN 116187836A discloses a power material quality evaluation and spot check method. Three power material quality evaluation indexes are established from the historical data needed to formulate a power material spot check strategy, and the historical spot check data are evaluated for quality; the spot check data are then classified using a fuzzy C-means clustering algorithm together with the quality evaluation indexes, and spot check rate indexes are established from the classification results; finally, a neural network is trained on the spot check rate indexes and the historical data to form an intelligent spot check strategy for electric power materials. CN 116187836A trains a neural network but relies on a single algorithm, and is therefore prone to overfitting.
Traditional sampling inspection techniques do not adopt reasonable sampling measures; even when a neural network model is used for prediction, the approach relies on a single algorithm and is prone to overfitting, which tends to cause low sampling inspection efficiency and low accuracy of the sampling results. Formulating a reasonable, effective and scientific electric power material sampling inspection strategy is therefore of great significance for building an intelligent electric power material industry chain.
Disclosure of Invention
The invention provides a multi-strategy fusion material sampling inspection method which can improve the working efficiency of electric power material sampling inspection and the accuracy of the sampling inspection results.
In order to achieve the above purpose, the invention provides a multi-strategy fusion material sampling inspection method, which comprises the following steps:
obtaining standardized historical electric power material data, and converting the standardized historical electric power material data into historical electric power material data characteristics, wherein the historical electric power material data characteristics comprise: material class number, material provider number, experimental class number, and material provider status;
training a pre-constructed multi-strategy fusion model by utilizing the historical electric power material data characteristics, and obtaining a standard multi-strategy fusion model after training, wherein the pre-constructed multi-strategy fusion model comprises a first strategy model, a second strategy model and a third strategy model;
And acquiring the electric power material data to be spot checked, respectively carrying out spot check prediction on the electric power material data to be spot checked by the first strategy model, the second strategy model and the third strategy model of the standard multi-strategy fusion model, and carrying out weighted average processing on spot check results obtained by the first strategy model, the second strategy model and the third strategy model prediction to obtain spot check prediction results.
Optionally, training the pre-constructed multi-strategy fusion model by using the historical electric power material data features, and obtaining a standard multi-strategy fusion model after training is completed, including:
inputting the historical electric power material data characteristics into the pre-built multi-strategy fusion model, respectively training a first strategy model, a second strategy model and a third strategy model in the pre-built multi-strategy fusion model, and obtaining the standard multi-strategy fusion model after training is completed.
Optionally, the training the first policy model in the pre-constructed multi-policy fusion model includes:
constructing a tree main model by utilizing a gradient boosting technique, dividing each node of the tree main model according to the characteristics of the historical electric power material data to obtain a tree division model, and integrating the tree division model and the tree main model to obtain an initial tree model when the division reaches a preset stop condition;
Calculating a loss value of the initial tree model by using a preset first loss function, and iteratively training the initial tree model by using the loss value to obtain a classification tree model;
adding a basic predictor on each node of the classification tree model, and optimizing branch selection of each node through a basic grading value of the basic predictor;
and summarizing branch selection of each node in the classification tree model to obtain the first strategy model.
Optionally, the calculating the loss value of the initial tree model by using a preset first loss function includes:
calculating a loss value of the initial tree model by adopting the following formula:
wherein N is the number of historical electric power material data characteristics, $y_i$ is the actual classification label, and $\hat{y}_i$ is the label predicted by the model.
Optionally, the training the second policy model in the pre-constructed multi-policy fusion model includes:
extracting any one of the characteristics of the historical electric power material data as a root node of a decision tree, and taking the characteristics of the same category as any selected characteristic as parallel root nodes of the decision tree to obtain a plurality of root nodes;
splitting each root node on the characteristics other than those of the root node's own category to obtain a plurality of child nodes;
Summarizing all root nodes and all child nodes to obtain a decision tree cluster, and determining a classification result of the decision tree cluster in a voting mode to obtain a second strategy model.
Optionally, the training the third policy model in the pre-constructed multi-policy fusion model includes:
assigning a qualification rate to each category in the historical electric power material data characteristics according to a preset qualification rate rule;
and calculating the weight of each category according to the qualification rate, and calculating the qualification rate of the material provider according to the weight of each category to obtain a third strategy model.
Optionally, the acquiring the standardized historical power material data includes:
acquiring historical electric power material data, preprocessing the historical electric power material data, and labeling the historical electric power material data according to a preset label;
and sampling the labeled data by using a preset sample sampling technology to obtain standardized historical electric power material data.
In order to solve the above problems, the present invention further provides a multi-strategy fusion material sampling inspection device, which includes:
the data normalization module is used for acquiring normalized historical electric power material data and converting the normalized historical electric power material data into historical electric power material data characteristics, wherein the historical electric power material data characteristics comprise: material class number, material provider number, experimental class number, and material provider status;
The model training module is used for training a pre-constructed multi-strategy fusion model by utilizing the historical electric power material data characteristics, and obtaining a standard multi-strategy fusion model after training, wherein the pre-constructed multi-strategy fusion model comprises a first strategy model, a second strategy model and a third strategy model;
the sampling inspection prediction module is used for acquiring the electrical material data to be sampled inspected, the first strategy model, the second strategy model and the third strategy model of the standard multi-strategy fusion model are used for respectively carrying out sampling inspection prediction on the electrical material data to be sampled inspected, and the sampling inspection results obtained through the first strategy model, the second strategy model and the third strategy model are subjected to weighted average processing to obtain sampling inspection prediction results.
In order to solve the above-mentioned problems, the present invention also provides an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the multi-policy fusion material spot check method described above.
In order to solve the above-mentioned problems, the present invention further provides a computer readable storage medium, where at least one computer program is stored, where the at least one computer program is executed by a processor in an electronic device to implement the multi-policy fusion material sampling inspection method described above.
According to the embodiment of the scheme, the standardized historical electric power material data are converted into historical electric power material data characteristics, and the first strategy model, the second strategy model and the third strategy model in the pre-constructed multi-strategy fusion model are each trained on these characteristics, which helps ensure the generalization capability of the multi-strategy fusion model when processing electric power material data. In addition, performing the sampling inspection prediction with several strategy models can improve the accuracy of the sampling inspection, and predicting directly on the electric power material data to be spot checked with the multi-strategy model can improve the working efficiency of electric power material sampling inspection.
Drawings
FIG. 1 is a schematic flow chart of a multi-strategy fusion material sampling method according to an embodiment of the present invention;
FIG. 2 is a functional block diagram of a multi-strategy fusion material sampling device according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device for implementing the multi-strategy fusion material sampling inspection method according to an embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The embodiment of the application provides a multi-strategy fusion material sampling inspection method. The method may be executed by at least one electronic device, such as a server or a terminal, that can be configured to execute the method provided by the embodiment of the application. In other words, the method can be executed by software or hardware installed on a terminal device or a server device, and the software can be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster. The server may be an independent server, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and big data and artificial intelligence platforms.
Referring to fig. 1, a flow chart of a multi-strategy fusion material sampling method according to an embodiment of the invention is shown. In this embodiment, the multi-strategy fusion material sampling inspection method includes:
s1, acquiring standardized historical electric power material data, and converting the standardized historical electric power material data into historical electric power material data characteristics, wherein the historical electric power material data characteristics comprise: material class number, material provider number, experimental class number, and material provider status.
In the embodiment of the invention, the standardized historical electric power material data refers to electric power material data that has already undergone sampling inspection, together with blacklisted-manufacturer information, blacklisted-manufacturer penalty information and the like published by the power grid platform.
As an embodiment of the present invention, the obtaining standardized historical power material data includes:
acquiring historical electric power material data, preprocessing the historical electric power material data, and labeling the historical electric power material data according to a preset label;
and sampling the labeled data by using a preset sample sampling technology to obtain standardized data.
In the embodiment of the invention, the preset label refers to two states of sampling inspection and non-sampling inspection of each data in the sampling inspection task.
In the embodiment of the invention, the sample sampling technology is a data processing technique that adjusts the balance of the acquired data samples. Class imbalance may bias the model toward majority-class samples so that minority-class samples are rarely taken into account; the sample sampling technology prevents this by increasing the number of minority-class samples to balance the class proportions in the data set, usually by oversampling.
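The oversampling idea described above can be sketched as follows. This is a minimal illustration that assumes plain random oversampling with replacement; the source does not name a specific oversampling algorithm, and the function name `random_oversample`, the `label` field and the label values `check` / `no_check` are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, label_key="label", seed=0):
    """Duplicate minority-class rows at random until every class is as large as the largest one."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Draw extra copies (with replacement) for under-represented classes.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


# Hypothetical labeled data: 8 "no_check" rows and 2 "check" rows.
rows = [{"supplier": 1, "label": "no_check"}] * 8 + [{"supplier": 2, "label": "check"}] * 2
balanced = random_oversample(rows)
counts = Counter(r["label"] for r in balanced)
```

After the call, both labels appear equally often, so a model trained on `balanced` is less likely to ignore the minority class.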
Further, the preprocessing the historical power material data includes:
screening abnormal data in the historical electric power material data, and processing the abnormal data to obtain standard electric power material data;
and identifying keywords in the standard electric power material data, extracting data according to preset data item keywords, and finishing the preprocessing process of the historical electric power material data.
In the embodiment of the invention, the abnormal data refers to repeated data, missing values, error data and the like in the data.
In the embodiment of the invention, the preset data item keywords refer to the material categories and the like required in the task demands. For example, the preset data keyword may be a material class number, a material provider label, an experiment class number, and a material provider status.
Illustratively:
1. Data processing:
1.1 Removing duplicate data: all data are checked, and fully repeated rows are removed.
1.2 Processing missing values: depending on their content, missing values can be handled by interpolation, deletion or retention.
1.3 Correcting erroneous data: the correctness of the data is checked, e.g. dates should fall within a reasonable range, and electric power material codes should conform to the applicable coding specification.
2. Data cleaning:
2.1 Deleting useless information: columns that have no effect on the final analysis and results, such as record IDs or individual notes, may be deleted.
2.2 Format conversion: data formats are unified, e.g. all dates are converted to the format "yyyy-mm-dd".
2.3 Standardization: numerical data, such as prices, are normalized to a common scale.
2.4 Removing illegal characters: all columns are checked, and any illegal characters are deleted.
3. Data extraction:
3.1 Data grabbing: information related to blacklisted vendors published by the State Grid is matched and extracted.
3.2 Deleting irrelevant columns: e.g. if specific address information is not needed in the analysis, that column may be deleted.
3.3 Extracting keywords: if some columns contain keywords, such as those related to bid winning, warranty or reputation, these keywords may be extracted.
4. Data screening:
4.1 Time screening: only data within a specified time range are extracted.
4.2 Screening by manufacturer: all material data under a given manufacturer are selected.
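A few of the preprocessing steps listed above (removing duplicates, handling missing values by deletion, and unifying the date format) might look like the sketch below. The field names `material_class`, `supplier_id` and `date`, the accepted input date formats, and the choice to delete rows with missing values are all assumptions made for illustration; the source does not specify a data schema.

```python
from datetime import datetime

# Input date formats accepted by this sketch (an assumption, not taken from the source).
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y")


def clean_records(records, required=("material_class", "supplier_id"), date_key="date"):
    """Drop exact duplicates and rows missing required fields; unify dates to yyyy-mm-dd."""
    seen, cleaned = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue  # fully repeated row
        seen.add(key)
        if any(rec.get(field) in (None, "") for field in required):
            continue  # missing value, handled here by deletion
        rec = dict(rec)
        raw = str(rec.get(date_key, ""))
        for fmt in DATE_FORMATS:
            try:
                rec[date_key] = datetime.strptime(raw, fmt).strftime("%Y-%m-%d")
                break
            except ValueError:
                pass  # try the next format; unparseable dates are left unchanged
        cleaned.append(rec)
    return cleaned


records = [
    {"material_class": "C01", "supplier_id": "S7", "date": "2023/05/01"},
    {"material_class": "C01", "supplier_id": "S7", "date": "2023/05/01"},
    {"material_class": "C02", "supplier_id": "", "date": "2023-05-02"},
]
cleaned = clean_records(records)
```

Interpolation or retention of missing values, which the text also allows, would replace the deletion branch.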
S2, training a pre-constructed multi-strategy fusion model by utilizing the standardized historical electric power material data characteristics, and obtaining a standard multi-strategy fusion model after training, wherein the pre-constructed multi-strategy fusion model comprises a first strategy model, a second strategy model and a third strategy model.
In the embodiment of the invention, the pre-constructed multi-strategy fusion model refers to a strategy model which adopts a plurality of models for fusion.
As an embodiment of the present invention, training the pre-constructed multi-strategy fusion model by using the historical electric power material data features, and obtaining a standard multi-strategy fusion model after training is completed, including:
inputting the historical electric power material data characteristics into the pre-built multi-strategy fusion model, respectively training a first strategy model, a second strategy model and a third strategy model in the pre-built multi-strategy fusion model, and obtaining the standard multi-strategy fusion model after training is completed.
Further, the training the first strategy model in the pre-constructed multi-strategy fusion model includes:
constructing a tree main model by utilizing a gradient boosting technique, dividing each node of the tree main model according to the characteristics of the historical electric power material data to obtain a tree division model, and integrating the tree division model and the tree main model to obtain an initial tree model when the division reaches a preset stop condition;
calculating a loss value of the initial tree model by using a preset first loss function, and iteratively training the initial tree model by using the loss value to obtain a classification tree model;
adding a basic predictor on each node of the classification tree model, and optimizing branch selection of each node through a basic grading value of the basic predictor;
and summarizing branch selection of each node in the classification tree model to obtain the first strategy model.
In the embodiment of the invention, the gradient boosting technique is a machine learning algorithm with which a tree model can be constructed by adding decision trees iteratively until the number of trees reaches a specified quantity or another specified condition is met.
In the embodiment of the present invention, the stop condition is reached when all the data characteristics are included in the tree model.
In the embodiment of the invention, the basic predictor refers to a prediction function used for predicting by utilizing data characteristics.
Further, the loss value of the initial tree model is calculated with the preset first loss function by adopting the following formula:
wherein N is the number of historical electric power material data characteristics, $y_i$ is the actual classification label, and $\hat{y}_i$ is the label predicted by the model.
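The formula itself is not reproduced in this text. As a hedged illustration only, the sketch below assumes a binary cross-entropy (log loss) averaged over N samples, which is a common choice for gradient-boosted classifiers and matches the symbols defined above; the actual first loss function of the invention may differ. The function name `log_loss` and the clipping constant `eps` are hypothetical.

```python
import math


def log_loss(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


good = log_loss([1, 0, 1], [0.9, 0.1, 0.8])  # confident and correct predictions
poor = log_loss([1, 0, 1], [0.4, 0.6, 0.5])  # uncertain or wrong predictions
```

A lower value indicates that the predicted labels agree more closely with the actual labels, which is the quantity the iterative training drives down.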
Further, the branch selection of each node is optimized through the basic scoring value of the basic predictor by using the following formula:
wherein $y_i$ is the actual classification label, and baseScore is the model prediction score.
Illustratively, let there be T iteration steps, t = 1, 2, …, T, each of which adds a new decision tree. In the first iteration step, the initial prediction result of the model is the basic scoring value of the basic predictor, i.e. the log-odds. For each iteration step t, the predicted value of the first strategy model is calculated, and the residual $r_{it}$ of each node i is then computed by the following formula:
$$r_{it} = y_i - \hat{y}_i^{(t-1)}$$
wherein $y_i$ is the actual classification label, and $\hat{y}_i^{(t-1)}$ is the model prediction result of round t-1.
Then, a decision tree for the current node is constructed from the data characteristics $X_i$ of the current node, and the weight of each node is updated through the following formula to improve the generalization capability of the model:
wherein the formula uses the gradient of the current node, the second derivative of the current node, and the regularization parameter λ.
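The base score, residual and weight update described above can be sketched as below. The weight formula is not reproduced in this text, so the sketch assumes the regularized leaf weight commonly used in gradient-boosted trees, w = -G / (H + λ), with logistic-loss gradients g = p - y and second derivatives h = p(1 - p). These formulas and the function names are assumptions, not the patent's confirmed equations.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def base_score(labels):
    """Log-odds of the positive rate (requires both classes to be present)."""
    p = sum(labels) / len(labels)
    return math.log(p / (1.0 - p))


def leaf_weight(labels, raw_preds, lam=1.0):
    """Assumed regularized leaf weight -G / (H + lambda) for logistic loss."""
    probs = [sigmoid(z) for z in raw_preds]
    grad = sum(p - y for p, y in zip(probs, labels))  # sum of first derivatives
    hess = sum(p * (1.0 - p) for p in probs)          # sum of second derivatives
    return -grad / (hess + lam)


labels = [1, 1, 1, 0]
start = base_score(labels)                         # log(0.75 / 0.25)
residuals = [y - sigmoid(start) for y in labels]   # actual label minus previous prediction
w = leaf_weight(labels[:3], [start] * 3, lam=1.0)  # weight of a leaf holding the three positives
```

A larger `lam` shrinks the leaf weight toward zero, which is how the regularization parameter limits overfitting in this kind of update.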
As an embodiment of the present invention, the training the second policy model in the pre-constructed multi-policy fusion model includes:
extracting any one of the characteristics of the historical electric power material data as a root node of a decision tree, and taking the characteristics of the same category as any selected characteristic as parallel root nodes of the decision tree to obtain a plurality of root nodes;
splitting each root node on the characteristics other than those of the root node's own category to obtain a plurality of child nodes;
summarizing all root nodes and all child nodes to obtain a decision tree cluster, and determining a classification result of the decision tree cluster in a voting mode to obtain a second strategy model.
In the embodiment of the invention, the decision tree refers to a tree structure used for classification and prediction; for example, the decision trees can be combined with the random forest ensemble learning method.
In the embodiment of the present invention, the voting mode refers to taking the result with the most votes as the final classification result.
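The voting step can be illustrated with a short sketch. It assumes each tree in the decision tree cluster has already produced one label; the function name `majority_vote` and the label values are hypothetical, and tie handling is not specified in the source.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the label predicted by the most trees (ties go to the label seen first)."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical outputs of three trees for one material record.
result = majority_vote(["check", "no_check", "check"])
```

Using an odd number of trees avoids ties for a two-class spot check decision.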
As an embodiment of the present invention, the training the third policy model in the pre-constructed multi-policy fusion model includes:
Assigning a qualification rate to each category in the historical electric power material data characteristics according to a preset qualification rate rule;
and calculating the weight of each category according to the qualification rate, and calculating the qualification rate of the material provider according to the weight of each category to obtain a third strategy model.
In the embodiment of the invention, the preset qualification rate rule is a rule that specifies the qualification rate of each characteristic in the historical electric power material data characteristics; it can be formulated according to the importance of each characteristic or according to historically collected qualification rate statistics.
The weight of each category can be calculated according to the qualification rate by adopting the following formula:
wherein $\omega_i$ is the weight of the i-th category, obtained from the qualification rate of the i-th category.
The embodiment of the invention calculates the qualification rate of the material provider through the following formula:
$$FPY_i = \omega_1\alpha_1 + \omega_2\alpha_2 + \dots + \omega_t\alpha_t$$
wherein $\omega_1$ is the weight of the first category, $\alpha_1$ is the actual qualification rate of the material provider for the first category, $\omega_2$ is the weight of the second category, $\alpha_2$ is the actual qualification rate of the material provider for the second category, $\omega_t$ is the weight of the t-th category, and $\alpha_t$ is the actual qualification rate of the material provider for the t-th category.
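The provider qualification rate is a weighted sum of per-category rates, which can be sketched as follows. The category weights are taken as given inputs here because the weight formula is not reproduced in this text; the function name `supplier_fpy` and the numbers are hypothetical.

```python
def supplier_fpy(weights, actual_rates):
    """Weighted sum of a provider's actual qualification rates over all categories."""
    if len(weights) != len(actual_rates):
        raise ValueError("weights and actual_rates must have the same length")
    return sum(w * a for w, a in zip(weights, actual_rates))


# Hypothetical example with three categories.
fpy = supplier_fpy([0.5, 0.3, 0.2], [0.9, 0.8, 1.0])
```

With weights that sum to 1, the result stays within the range of the individual rates, so providers can be compared on a common scale.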
S3, acquiring the electric power material data to be subjected to spot inspection, wherein the first strategy model, the second strategy model and the third strategy model of the standard multi-strategy fusion model respectively carry out spot inspection prediction on the electric power material data to be subjected to spot inspection, and carrying out weighted average processing on spot inspection results obtained through the prediction of the first strategy model, the second strategy model and the third strategy model to obtain spot inspection prediction results.
In the embodiment of the invention, the electric power material data to be spot checked refers to electric power material data to be spot checked and predicted, and the electric power material data to be spot checked can be acquired through a power grid platform, manually input or crawling from a webpage by utilizing a data crawling technology.
As an embodiment of the present invention, the predicting using the standard multi-policy fusion model includes:
converting the electric power material data to be subjected to the spot check into electric power material data characteristics to be subjected to the spot check;
processing the electric power material data characteristics to be spot checked by using a first strategy model in the standard multi-strategy fusion model to obtain a first prediction result;
processing the electric power material data characteristics to be spot checked by utilizing a second strategy model in the standard multi-strategy fusion model to obtain a second prediction result;
Processing the electric power material data characteristics to be spot checked by utilizing a third strategy model in the standard multi-strategy fusion model to obtain a third prediction result;
and summarizing the first prediction result, the second prediction result and the third prediction result to obtain a spot check prediction result.
Further, the weighted average processing of the predicted sampling result includes:
and carrying out weighted average calculation on the first prediction result, the second prediction result and the third prediction result by adopting a weighted average calculation formula to obtain the spot check prediction result.
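The fusion step can be sketched as a normalized weighted average of the three prediction results. The source does not give the model weights or a decision threshold, so the values below and the names `fuse_predictions`, `check` and `no_check` are assumptions for illustration.

```python
def fuse_predictions(scores, weights):
    """Weighted average of per-model scores, normalized by the sum of the weights."""
    total = sum(weights)
    if len(scores) != len(weights) or total <= 0:
        raise ValueError("need one positive-sum weight per score")
    return sum(s * w for s, w in zip(scores, weights)) / total


# Hypothetical scores from the first, second and third strategy models.
fused = fuse_predictions([0.8, 0.6, 0.7], [0.5, 0.3, 0.2])
decision = "check" if fused >= 0.5 else "no_check"  # the 0.5 threshold is an assumption
```

Normalizing by the weight sum keeps the fused score on the same scale as the individual scores even when the weights do not sum to 1.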
According to the embodiment of the invention, the standardized historical electric power material data are converted into historical electric power material data characteristics, and the first strategy model, the second strategy model and the third strategy model in the pre-constructed multi-strategy fusion model are each trained on these characteristics, which helps ensure the generalization capability of the multi-strategy fusion model when processing electric power material data. In addition, performing the sampling inspection prediction with several strategy models can improve the accuracy of the sampling inspection, and predicting directly on the electric power material data to be spot checked with the multi-strategy model can improve the working efficiency of electric power material sampling inspection.
Fig. 2 is a functional block diagram of a multi-strategy fusion material sampling device according to an embodiment of the present invention.
The multi-strategy fusion material sampling inspection device 100 can be installed in electronic equipment. Depending on the functions implemented, the multi-strategy fusion material sampling device 100 may include a data normalization module 101, a model training module 102, and a sampling prediction module 103.
A module in the present invention, which may also be referred to as a unit, is a series of computer program segments that are stored in the memory of the electronic device, can be executed by the processor of the electronic device, and perform a fixed function.
In the present embodiment, the functions concerning the respective modules/units are as follows:
the data normalization module 101 is configured to obtain normalized historical power material data, and convert the normalized historical power material data into historical power material data features, where the historical power material data features include: material class number, material provider number, experimental class number, and material provider status.
In the embodiment of the invention, the standardized historical electric power material data refers to electric power material data that has already undergone spot checks, together with blacklisted manufacturer information, blacklisted manufacturer penalty information and the like published by the power grid platform.
As an embodiment of the present invention, the obtaining standardized historical power material data includes:
acquiring historical electric power material data, preprocessing the historical electric power material data, and labeling the historical electric power material data according to a preset label;
and sampling the labeled data by using a preset sample sampling technology to obtain standardized data.
In the embodiment of the invention, the preset label refers to the two states, spot-checked and not spot-checked, of each data item in the spot check task.
In the embodiment of the invention, the sample sampling technology is a data processing technology for adjusting the balance among the collected data samples. Class imbalance may bias the model toward the majority class so that minority-class samples are rarely taken into account; the sample sampling technology prevents this by increasing the number of minority-class samples to balance the class proportions in the data set, and an oversampling technique is usually adopted.
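A minimal sketch of one such oversampling technique, random oversampling with replacement, is given below. The text does not name a specific oversampling algorithm, so this choice, the function name `random_oversample`, and the toy data are assumptions; label 1 stands for "spot-checked" and label 0 for "not spot-checked".

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        # All original rows that carry this label.
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Toy feature rows with a 4:1 class imbalance.
rows = [[1, 10], [2, 11], [3, 12], [4, 13], [5, 14]]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

After the call, both classes contain four samples; the added rows are copies of the single minority row.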
Further, the preprocessing the historical power material data includes:
screening abnormal data in the historical electric power material data, and processing the abnormal data to obtain standard electric power material data;
and identifying keywords in the standard electric power material data, extracting data according to preset data item keywords, and finishing the preprocessing process of the historical electric power material data.
In the embodiment of the invention, the abnormal data refers to repeated data, missing values, error data and the like in the data.
In the embodiment of the invention, the preset data item keywords refer to the material categories and the like required in the task demands. For example, the preset data keyword may be a material class number, a material provider label, an experiment class number, and a material provider status.
Illustratively, 1. Data processing: 1.1 Removing duplicate data: all data are checked, and fully duplicated rows are removed. 1.2 Handling missing values: depending on the content of the missing value, it can be interpolated, deleted, or retained. 1.3 Correcting erroneous data: the correctness of the data is checked, e.g. dates should fall within a reasonable range, and electric power material codes should conform to the applicable coding specification.
2. Data cleaning: 2.1 Deleting useless columns: columns that have no effect on the final analysis and results, such as record IDs or individual notes, may be deleted. 2.2 Format conversion: data formats are unified, e.g. all dates are converted to "yyyy-mm-dd". 2.3 Standardization: numerical data such as prices are normalized to a common scale. 2.4 Removing illegal characters: all columns are checked and any illegal characters are deleted.
3. Data extraction: 3.1 Data grabbing: information related to the blacklisted vendors published by the State Grid is matched and extracted. 3.2 Deleting irrelevant columns: for example, if specific address information is not needed in the analysis, that column may be deleted. 3.3 Extracting keywords: if some columns contain keywords, e.g. relating to bid winning, warranty, or reputation, these keywords may be extracted.
4. Data screening: 4.1 Screening by time: only data within a specified time range are extracted. 4.2 Screening by manufacturer: all material data under a given manufacturer are selected.
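Part of the preprocessing described above (steps 1.1, 1.2, 1.3, 2.1 and 2.2) can be sketched as follows. The field names in `REQUIRED`, the sample records, and the choice to delete rather than interpolate rows with missing values are assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical field names; the text only names the kinds of data items.
REQUIRED = ("material_class", "provider", "test_class", "provider_status", "date")


def preprocess(records):
    """Drop incomplete or invalid rows, keep only the required columns,
    unify the date format, and remove duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        # 1.2: rows with a missing required value are deleted here.
        if any(rec.get(key) in (None, "") for key in REQUIRED):
            continue
        # 1.3 / 2.2: reject impossible dates and unify the format.
        try:
            parsed = datetime.strptime(str(rec["date"]).replace("/", "-"), "%Y-%m-%d")
        except ValueError:
            continue
        # 2.1: columns outside REQUIRED (IDs, notes) are dropped.
        row = {key: rec[key] for key in REQUIRED}
        row["date"] = parsed.strftime("%Y-%m-%d")
        # 1.1: duplicates are detected after normalization.
        fingerprint = tuple(row[key] for key in REQUIRED)
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(row)
    return cleaned


base = {"material_class": "M01", "provider": "P07",
        "test_class": "T3", "provider_status": "normal"}
raw = [
    {**base, "date": "2023/1/5", "note": "first entry"},
    {**base, "date": "2023-01-05"},                   # duplicate after normalization
    {**base, "provider": "", "date": "2023-02-01"},   # missing value
    {**base, "date": "2023-13-40"},                   # impossible date
]
clean = preprocess(raw)
```

Normalizing before deduplicating means two rows that differ only in date formatting are treated as the same record.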
The model training module 102 is configured to train a pre-built multi-strategy fusion model by using the historical electric power material data features, and obtain a standard multi-strategy fusion model after training is completed, where the pre-built multi-strategy fusion model includes a first strategy model, a second strategy model and a third strategy model.
In the embodiment of the invention, the pre-constructed multi-strategy fusion model refers to a strategy model which adopts a plurality of models for fusion.
As an embodiment of the present invention, training the pre-constructed multi-strategy fusion model by using the historical electric power material data features, and obtaining a standard multi-strategy fusion model after training is completed, including:
Inputting the historical electric power material data characteristics into the pre-built multi-strategy fusion model, respectively training a first strategy model, a second strategy model and a third strategy model in the pre-built multi-strategy fusion model, and obtaining the standard multi-strategy fusion model after training is completed.
Further, the training the first strategy model in the pre-constructed multi-strategy fusion model includes:
constructing a tree main model by utilizing a gradient lifting technology, dividing each node of the tree main model according to the characteristics of the historical electric power material data to obtain a tree division model, and integrating the tree division model and the tree main model to obtain an initial tree model when the division reaches a preset stop condition;
calculating a loss value of the initial tree model by using a preset first loss function, and iteratively training the initial tree model by using the loss value to obtain a classification tree model;
adding a basic predictor on each node of the classification tree model, and optimizing branch selection of each node through a basic grading value of the basic predictor;
and summarizing branch selection of each node in the classification tree model to obtain the first strategy model.
In the embodiment of the invention, the gradient lifting technology (gradient boosting) refers to a machine learning algorithm by which a tree model can be constructed: decision trees are added iteratively until a specified number of trees or a specified condition is reached.
In the embodiment of the present invention, the stop condition is reached when all of the data features are included in the tree model.
In the embodiment of the invention, the basic predictor refers to a prediction function used for predicting by utilizing data characteristics.
Further, the loss value of the initial tree model is calculated with the preset first loss function by the following formula:
wherein N is the number of historical electric power material data features, $y_i$ is the actual classification label, and $\hat{y}_i$ is the label predicted by the model.
Further, the branch selection of each of the nodes is optimized through the base score of the base predictor by the following formula:
wherein $y_i$ is the actual classification label and baseScore is the model prediction score.
Illustratively, let there be T iteration steps, t = 1, 2, …, T, each of which adds a new decision tree. In the first iteration step, the initial prediction result of the model is the base score of the base predictor, i.e. the log-odds. For each iteration step t, the predicted value of the first strategy model is calculated, and then the residual $r_{it}$ of each node i is calculated by the following formula:
wherein $y_i$ is the actual classification label and $\hat{y}_i^{(t-1)}$ is the model prediction result of round t-1.
Then, using the data features $X_i$ of the current node, a decision tree of the current node is constructed, and the weight of each node is updated by the following formula to improve the generalization capability of the model.
wherein the formula uses the gradient of the current node and the second derivative of the current node, and $\lambda$ is the regularization parameter.
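The formulas themselves are not reproduced in this text, so the sketch below shows one common reading of the description, not the patent's exact equations. It assumes a logistic loss, a log-odds base score, the residual $y_i$ minus the previous round's predicted probability, and an XGBoost-style leaf weight $-G/(H+\lambda)$, where G and H are the sums of first and second derivatives over the samples in a node. The names `base_score`, `leaf_weight` and `reg_lambda` are hypothetical.

```python
import math


def sigmoid(z):
    """Map a raw score (log-odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def base_score(labels):
    """Log-odds of the positive class; assumes both classes are present."""
    p = sum(labels) / len(labels)
    return math.log(p / (1.0 - p))


def leaf_weight(labels, raw_scores, reg_lambda=1.0):
    """Regularized leaf weight -G / (H + lambda) under logistic loss."""
    probs = [sigmoid(z) for z in raw_scores]
    grads = [p - y for p, y in zip(probs, labels)]   # first derivatives
    hess = [p * (1.0 - p) for p in probs]            # second derivatives
    return -sum(grads) / (sum(hess) + reg_lambda)


# Toy labels: 1 = spot-checked, 0 = not spot-checked.
labels = [1, 0, 0, 1, 1, 0, 0, 0]
score0 = base_score(labels)
raw = [score0] * len(labels)

# Residuals of the first round: label minus predicted probability.
residuals = [y - sigmoid(z) for y, z in zip(labels, raw)]

# Leaf weights for two hypothetical leaves of the new tree.
w_pos = leaf_weight([1, 1, 1], [score0] * 3)
w_neg = leaf_weight([0, 0], [score0] * 2)
```

A larger `reg_lambda` shrinks every leaf weight toward zero, which is how the regularization parameter limits overfitting in this formulation.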
As an embodiment of the present invention, the training the second policy model in the pre-constructed multi-policy fusion model includes:
extracting any one of the characteristics of the historical electric power material data as a root node of a decision tree, and taking the characteristics of the same category as any selected characteristic as parallel root nodes of the decision tree to obtain a plurality of root nodes;
dividing each root node according to the features that do not belong to the same category as that root node, to obtain a plurality of child nodes;
summarizing all root nodes and all child nodes to obtain a decision tree cluster, and determining a classification result of the decision tree cluster in a voting mode to obtain a second strategy model.
In the embodiment of the invention, the decision tree refers to a tree structure used for classification and prediction; for example, the decision trees may be combined by the random forest ensemble learning method.
In the embodiment of the present invention, the voting mode refers to taking the result that receives the most votes as the final classification result.
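The voting step can be sketched as follows. The three stub trees and the feature names are hypothetical stand-ins for trained decision trees; only the majority vote itself reflects the description above.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label with the most votes (the first such label wins ties)."""
    return Counter(votes).most_common(1)[0][0]


def forest_predict(trees, features):
    """Ask every tree for a label and combine the answers by majority vote."""
    return majority_vote([tree(features) for tree in trees])


# Hypothetical stand-ins for trained decision trees: each maps a feature
# dictionary to 1 (spot check) or 0 (no spot check).
trees = [
    lambda f: 1 if f["provider_status"] == "blacklisted" else 0,
    lambda f: 1 if f["material_class"] == "M01" else 0,
    lambda f: 0,
]

sample = {"provider_status": "blacklisted", "material_class": "M01"}
prediction = forest_predict(trees, sample)
```

Two of the three stub trees vote 1 for this sample, so the cluster's classification result is 1.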
As an embodiment of the present invention, the training the third policy model in the pre-constructed multi-policy fusion model includes:
assigning a qualification rate to each category in the historical electric power material data characteristics according to a preset qualification rate rule;
and calculating the weight of each category according to the qualification rate, and calculating the qualification rate of the material provider according to the weight of each category to obtain a third strategy model.
In the embodiment of the invention, the preset qualification rate rule refers to a rule that specifies the qualification rate of each feature in the historical electric power material data features; it can be formulated according to the importance of each feature or according to historically collected qualification rate statistics.
The weight of each category can be calculated according to the qualification rate by adopting the following formula:
wherein $\omega_i$ is the weight of the i-th category, computed from the qualification rate of the i-th category.
The embodiment of the invention calculates the qualification rate of the material provider through the following formula:
$$\mathrm{FPY}_i = \omega_1 \alpha_1 + \omega_2 \alpha_2 + \dots + \omega_t \alpha_t$$
wherein $\omega_1$ is the weight of the first category, $\alpha_1$ is the actual qualification rate of the material provider for the first category, $\omega_2$ is the weight of the second category, $\alpha_2$ is the actual qualification rate of the material provider for the second category, $\omega_t$ is the weight of the t-th category, and $\alpha_t$ is the actual qualification rate of the material provider for the t-th category.
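The provider qualification rate is a weighted sum over categories, which can be sketched as follows. The category names, weights and rates are made-up illustrative values, and the weights are taken as given because the weight formula is not reproduced in this text.

```python
def provider_fpy(weights, rates):
    """Weighted sum of a provider's per-category actual qualification rates."""
    if weights.keys() != rates.keys():
        raise ValueError("weights and rates must cover the same categories")
    return sum(weights[c] * rates[c] for c in weights)


# Hypothetical category weights (omega) and one provider's actual
# qualification rates per category (alpha).
category_weights = {"transformer": 0.5, "cable": 0.3, "switchgear": 0.2}
provider_rates = {"transformer": 0.98, "cable": 0.90, "switchgear": 0.80}

fpy = provider_fpy(category_weights, provider_rates)
```

With these values the result is 0.5 × 0.98 + 0.3 × 0.90 + 0.2 × 0.80 = 0.92.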
The spot check prediction module 103 is configured to obtain the electric power material data to be spot checked, perform spot check prediction on it with the first strategy model, the second strategy model and the third strategy model of the standard multi-strategy fusion model respectively, and perform weighted average processing on the spot check results predicted by the three strategy models, so as to obtain a spot check prediction result.
In the embodiment of the invention, the electric power material data to be spot checked refers to electric power material data to be spot checked and predicted, and the electric power material data to be spot checked can be acquired through a power grid platform, manually input or crawling from a webpage by utilizing a data crawling technology.
As an embodiment of the present invention, the predicting using the standard multi-policy fusion model includes:
converting the electric power material data to be subjected to the spot check into electric power material data characteristics to be subjected to the spot check;
processing the electric power material data characteristics to be spot checked by using a first strategy model in the standard multi-strategy fusion model to obtain a first prediction result;
processing the electric power material data characteristics to be spot checked by utilizing a second strategy model in the standard multi-strategy fusion model to obtain a second prediction result;
processing the electric power material data characteristics to be spot checked by utilizing a third strategy model in the standard multi-strategy fusion model to obtain a third prediction result;
and summarizing the first prediction result, the second prediction result and the third prediction result to obtain a spot check prediction result.
Further, the weighted average processing of the predicted sampling result includes:
and carrying out weighted average calculation on the first prediction result, the second prediction result and the third prediction result by adopting a weighted average calculation formula to obtain a spot check prediction result.
Fig. 3 is a schematic structural diagram of an electronic device for implementing a multi-policy fusion material sampling inspection method according to an embodiment of the present invention.
The electronic device may comprise a processor 10, a memory 11, a communication bus 12 and a communication interface 13, and may further comprise a computer program stored in the memory 11 and executable on the processor 10, such as a multi-policy fusion material spot check method program.
The processor 10 may be formed by an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be formed by a plurality of integrated circuits packaged with the same function or different functions, including one or more central processing units (Central Processing Unit, CPU), a microprocessor, a digital processing chip, a graphics processor, a combination of various control chips, and so on. The processor 10 is a Control Unit (Control Unit) of the electronic device, and connects various components of the entire electronic device using various interfaces and lines, and executes various functions of the electronic device and processes data by running or executing programs or modules stored in the memory 11 (for example, executing a multi-policy fusion material sampling inspection method program, etc.), and calling data stored in the memory 11.
The memory 11 includes at least one type of readable storage medium including flash memory, a removable hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device, such as a mobile hard disk of the electronic device. The memory 11 may in other embodiments also be an external storage device of the electronic device, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used to store not only application software installed in an electronic device and various data, such as codes of a multi-policy fusion material sampling inspection method program, but also temporarily store data that has been output or is to be output.
The communication bus 12 may be a peripheral component interconnect standard (Peripheral Component Interconnect, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. The bus is arranged to enable a connection communication between the memory 11 and at least one processor 10 etc.
The communication interface 13 is used for communication between the electronic device and other devices, including a network interface and a user interface. Optionally, the network interface may include a wired interface and/or a wireless interface (e.g., WI-FI interface, bluetooth interface, etc.), typically used to establish a communication connection between the electronic device and other electronic devices. The user interface may be a Display (Display), an input unit such as a Keyboard (Keyboard), or alternatively a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch, or the like. The display may also be referred to as a display screen or display unit, as appropriate, for displaying information processed in the electronic device and for displaying a visual user interface.
Fig. 3 shows only an electronic device with components, and it will be understood by those skilled in the art that the structure shown in fig. 3 is not limiting of the electronic device and may include fewer or more components than shown, or may combine certain components, or a different arrangement of components.
For example, although not shown, the electronic device may further include a power source (such as a battery) for supplying power to the respective components, and preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions of charge management, discharge management, power consumption management, and the like are implemented through the power management device. The power supply may also include one or more of any of a direct current or alternating current power supply, recharging device, power failure detection circuit, power converter or inverter, power status indicator, etc. The electronic device may further include various sensors, bluetooth modules, wi-Fi modules, etc., which are not described herein.
It should be understood that the embodiments described are for illustrative purposes only and are not limited to this configuration in the scope of the patent application.
The multi-strategy fusion material spot check method program stored in the memory 11 of the electronic device is a combination of a plurality of instructions which, when executed by the processor 10, can implement:
obtaining standardized historical electric power material data, and converting the standardized historical electric power material data into historical electric power material data characteristics, wherein the historical electric power material data characteristics comprise: material class number, material provider number, experimental class number, and material provider status;
training a pre-constructed multi-strategy fusion model by utilizing the historical electric power material data characteristics, and obtaining a standard multi-strategy fusion model after training, wherein the pre-constructed multi-strategy fusion model comprises a first strategy model, a second strategy model and a third strategy model;
and acquiring the electric power material data to be spot checked, respectively carrying out spot check prediction on the electric power material data to be spot checked by the first strategy model, the second strategy model and the third strategy model of the standard multi-strategy fusion model, and carrying out weighted average processing on spot check results obtained by the first strategy model, the second strategy model and the third strategy model prediction to obtain spot check prediction results.
In particular, the specific implementation method of the above instructions by the processor 10 may refer to the description of the relevant steps in the corresponding embodiment of the drawings, which is not repeated herein.
Further, the modules/units integrated in the electronic device 1 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. The computer readable storage medium may be volatile or nonvolatile. For example, the computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM).
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor of an electronic device, can implement:
obtaining standardized historical electric power material data, and converting the standardized historical electric power material data into historical electric power material data characteristics, wherein the historical electric power material data characteristics comprise: material class number, material provider number, experimental class number, and material provider status;
training a pre-constructed multi-strategy fusion model by utilizing the historical electric power material data characteristics, and obtaining a standard multi-strategy fusion model after training, wherein the pre-constructed multi-strategy fusion model comprises a first strategy model, a second strategy model and a third strategy model;
and acquiring the electric power material data to be spot checked, respectively carrying out spot check prediction on the electric power material data to be spot checked by the first strategy model, the second strategy model and the third strategy model of the standard multi-strategy fusion model, and carrying out weighted average processing on spot check results obtained by the first strategy model, the second strategy model and the third strategy model prediction to obtain spot check prediction results.
In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in the form of hardware, or in the form of hardware plus software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, encryption algorithm and the like. The Blockchain (Blockchain), which is essentially a decentralised database, is a string of data blocks that are generated by cryptographic means in association, each data block containing a batch of information of network transactions for verifying the validity of the information (anti-counterfeiting) and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
The embodiment of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use knowledge to obtain optimal results.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means recited in the system claims can also be implemented by means of software or hardware by means of one unit or means. The terms first, second, etc. are used to denote a name, but not any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (10)

1. A multi-strategy fusion material spot check method, the method comprising:
obtaining standardized historical electric power material data, and converting the standardized historical electric power material data into historical electric power material data characteristics, wherein the historical electric power material data characteristics comprise: material class number, material provider number, experimental class number, and material provider status;
training a pre-constructed multi-strategy fusion model by utilizing the historical electric power material data characteristics, and obtaining a standard multi-strategy fusion model after training, wherein the pre-constructed multi-strategy fusion model comprises a first strategy model, a second strategy model and a third strategy model;
and acquiring the electric power material data to be spot checked, respectively carrying out spot check prediction on the electric power material data to be spot checked by the first strategy model, the second strategy model and the third strategy model of the standard multi-strategy fusion model, and carrying out weighted average processing on spot check results obtained by the first strategy model, the second strategy model and the third strategy model prediction to obtain spot check prediction results.
2. The multi-strategy fusion material sampling inspection method according to claim 1, wherein training the pre-constructed multi-strategy fusion model by using the historical electric power material data features, and obtaining a standard multi-strategy fusion model after training is completed, comprises:
inputting the historical electric power material data characteristics into the pre-built multi-strategy fusion model, respectively training a first strategy model, a second strategy model and a third strategy model in the pre-built multi-strategy fusion model, and obtaining the standard multi-strategy fusion model after training is completed.
3. The multi-strategy fusion material spot check method according to claim 1 or 2, wherein training a first strategy model of the pre-constructed multi-strategy fusion model comprises:
constructing a tree main model by utilizing a gradient lifting technology, dividing each node of the tree main model according to the characteristics of the historical electric power material data to obtain a tree division model, and integrating the tree division model and the tree main model to obtain an initial tree model when the division reaches a preset stop condition;
calculating a loss value of the initial tree model by using a preset first loss function, and iteratively training the initial tree model by using the loss value to obtain a classification tree model;
adding a basic predictor on each node of the classification tree model, and optimizing branch selection of each node through a basic grading value of the basic predictor;
and summarizing branch selection of each node in the classification tree model to obtain the first strategy model.
4. The multi-strategy fused material spot inspection method of claim 3, wherein said calculating the loss value of the initial tree model using a preset first loss function comprises:
calculating a loss value of the initial tree model by adopting the following formula:
wherein N is the number of historical electric power material data features, $y_i$ is the actual classification label, and $\hat{y}_i$ is the label predicted by the model.
5. The multi-strategy fusion material spot check method according to claim 1 or 2, wherein training a second strategy model of the pre-constructed multi-strategy fusion model comprises:
extracting any one of the characteristics of the historical electric power material data as a root node of a decision tree, and taking the characteristics of the same category as any selected characteristic as parallel root nodes of the decision tree to obtain a plurality of root nodes;
dividing the characteristics except the same category of the root node in the root node to obtain a plurality of child nodes;
summarizing all root nodes and all child nodes to obtain a decision tree cluster, and determining a classification result of the decision tree cluster in a voting mode to obtain a second strategy model.
6. The multi-strategy fusion material spot check method according to claim 1 or 2, wherein training a third strategy model of the pre-constructed multi-strategy fusion model comprises:
assigning a qualification rate to each category in the historical electric power material data characteristics according to a preset qualification rate rule;
and calculating the weight of each category according to the qualification rate, and calculating the qualification rate of the material provider according to the weight of each category to obtain a third strategy model.
7. The multi-strategy fusion material spot check method according to claim 1 or 2, wherein the obtaining standardized historical power material data comprises:
acquiring historical electric power material data, preprocessing the historical electric power material data, and labeling the historical electric power material data according to a preset label;
and sampling the labeled data by using a preset sample sampling technology to obtain standardized historical electric power material data.
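The sampling step can be illustrated with a short sketch. The claim does not name the sampling technique, so random oversampling of the smaller classes is assumed here purely as an example; the `label` key and the record contents are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List


def oversample(records: List[Dict], label_key: str = "label", seed: int = 0) -> List[Dict]:
    """Duplicate randomly chosen records of smaller classes until all classes match the largest."""
    if not records:
        return []
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for record in records:
        by_label[record[label_key]].append(record)
    target = max(len(group) for group in by_label.values())
    balanced: List[Dict] = []
    for group in by_label.values():
        balanced.extend(group)
        # Draw the missing records with replacement; k == 0 adds nothing.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Hypothetical labeled records: three qualified, one unqualified.
labeled = [
    {"id": 1, "label": "qualified"},
    {"id": 2, "label": "qualified"},
    {"id": 3, "label": "qualified"},
    {"id": 4, "label": "unqualified"},
]
balanced = oversample(labeled)
```

After oversampling, both classes contain three records, which avoids training on a set dominated by one label.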
8. A multi-strategy fusion material sampling inspection device, characterized in that the device can implement the multi-strategy fusion material sampling inspection method as claimed in any one of claims 1 to 7, and the device comprises:
a data standardization module, used for acquiring standardized historical electric power material data and converting the standardized historical electric power material data into historical electric power material data characteristics, wherein the historical electric power material data characteristics comprise: material class number, material provider number, experimental class number, and material provider status;
a model training module, used for training a pre-constructed multi-strategy fusion model by utilizing the historical electric power material data characteristics, and obtaining a standard multi-strategy fusion model after training, wherein the pre-constructed multi-strategy fusion model comprises a first strategy model, a second strategy model and a third strategy model;
a sampling inspection prediction module, used for acquiring the electric power material data to be sampled and inspected, performing sampling inspection prediction on that data with the first strategy model, the second strategy model and the third strategy model of the standard multi-strategy fusion model respectively, and applying weighted averaging to the sampling inspection results obtained from the first strategy model, the second strategy model and the third strategy model to obtain a sampling inspection prediction result.
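The weighted averaging performed by the sampling inspection prediction module can be sketched as below. The function name `fuse_predictions`, the example scores, and the example weights are assumptions for illustration; the claim does not state how the weights are chosen.

```python
from typing import Sequence


def fuse_predictions(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of the per-strategy prediction scores."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total


# Hypothetical scores from the first, second and third strategy models.
fused = fuse_predictions([0.8, 0.6, 0.4], [0.5, 0.3, 0.2])
```

With these illustrative values the fused result is 0.5 × 0.8 + 0.3 × 0.6 + 0.2 × 0.4 = 0.66. Dividing by the weight sum keeps the result on the same scale as the inputs even when the weights are not normalized.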
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, the computer program, when executed, enabling the at least one processor to perform the multi-strategy fusion material sampling inspection method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the multi-strategy fusion material sampling inspection method according to any one of claims 1 to 7.
CN202311597412.3A 2023-11-27 2023-11-27 Multi-strategy fusion material sampling inspection method Pending CN117669810A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311597412.3A CN117669810A (en) 2023-11-27 2023-11-27 Multi-strategy fusion material sampling inspection method

Publications (1)

Publication Number Publication Date
CN117669810A (en) 2024-03-08

Family

ID=90065391

Country Status (1)

Country Link
CN (1) CN117669810A (en)

Similar Documents

Publication Publication Date Title
CN110995459B (en) Abnormal object identification method, device, medium and electronic equipment
CN111652278B (en) User behavior detection method, device, electronic equipment and medium
CN110688536A (en) Label prediction method, device, equipment and storage medium
CN115081025A (en) Sensitive data management method and device based on digital middlebox and electronic equipment
CN114612194A (en) Product recommendation method and device, electronic equipment and storage medium
CN113268665A (en) Information recommendation method, device and equipment based on random forest and storage medium
CN113706172B (en) Customer behavior-based complaint solving method, device, equipment and storage medium
CN113656690B (en) Product recommendation method and device, electronic equipment and readable storage medium
CN113505273B (en) Data sorting method, device, equipment and medium based on repeated data screening
CN114840684A (en) Map construction method, device and equipment based on medical entity and storage medium
CN117155771B (en) Equipment cluster fault tracing method and device based on industrial Internet of things
CN111652282B (en) Big data-based user preference analysis method and device and electronic equipment
CN116843481A (en) Knowledge graph analysis method, device, equipment and storage medium
CN115796398A (en) Intelligent demand analysis method, system, equipment and medium based on electric power materials
CN114841165B (en) User data analysis and display method and device, electronic equipment and storage medium
CN113657546B (en) Information classification method, device, electronic equipment and readable storage medium
CN113704407B (en) Complaint volume analysis method, device, equipment and storage medium based on category analysis
CN117669810A (en) Multi-strategy fusion material sampling inspection method
CN111651652B (en) Emotion tendency identification method, device, equipment and medium based on artificial intelligence
CN114780688A (en) Text quality inspection method, device and equipment based on rule matching and storage medium
CN113628043B (en) Complaint validity judging method, device, equipment and medium based on data classification
CN117235480B (en) Screening method and system based on big data under data processing
CN116991364B (en) Software development system management method based on big data
CN116468266A (en) Applicant performance risk analysis method, device and equipment based on engineering warranty
CN113362039A (en) Business approval method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination