CN115358348A - Vehicle straight-through rate influence characteristic determination method, device, equipment and medium - Google Patents
Vehicle straight-through rate influence characteristic determination method, device, equipment and medium
- Publication number
- CN115358348A (application number CN202211276843.5A)
- Authority
- CN
- China
- Prior art keywords
- vehicle
- training
- machine learning
- classification model
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06395—Quality analysis or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/04—Manufacturing
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The application discloses a method, a device, equipment and a medium for determining the features that influence the vehicle first-pass rate. It aims to solve the problem that the cause is difficult to analyze when a vehicle fails its first offline detection but passes the restart detection, and relates to the field of vehicle detection data analysis. The vehicle first-pass rate influence characteristic determination method comprises the following steps: preprocessing vehicle detection data that did not pass offline detection; constructing and training a machine learning classification model based on the preprocessed data; calculating SHAP values for the features in the preprocessed data based on the machine learning classification model; and screening the SHAP values in a preset processing mode and optimizing the vehicle detection process based on the screening result.
Description
Technical Field
The application relates to the field of vehicle detection data analysis, in particular to a method, a device, equipment and a medium for determining vehicle first pass rate influence characteristics.
Background
The first-time qualified rate of vehicles coming off the production line (namely the first-pass rate) is the most important evaluation index in intelligent vehicle production. In the actual vehicle detection process, some products judged unqualified in the first offline detection (the end-of-line detection performed when the vehicle leaves the production line) show no problem when detection is restarted and pass the re-detection normally, which seriously lowers the vehicle first-pass rate. The root cause of this situation is difficult to find with traditional statistical methods and expert experience alone.
Disclosure of Invention
The application mainly aims to provide a method, a device, equipment and a medium for determining vehicle first-pass rate influence characteristics, so as to solve the problem that, when analyzing the factors influencing the vehicle first-pass rate, it is difficult to determine why a vehicle fails its first offline detection but passes the restart detection.
In order to solve the above problem, an embodiment of the present application proposes: a vehicle first-pass rate influence characteristic determination method, comprising the steps of:
preprocessing vehicle detection data which do not pass offline detection;
constructing and training a machine learning classification model based on the preprocessed data;
calculating SHAP values for features in the preprocessed data based on the machine learning classification model;
and screening the plurality of SHAP values in a preset processing mode, and optimizing the detection process of the vehicle based on the screening result.
As an alternative embodiment of the present application, the preprocessing process includes tagging the vehicle detection data.
In a particular application, vehicle detection data is labeled for training of a machine learning classification model.
As an optional implementation manner of the present application, the machine learning classification model training process includes: dividing the vehicle detection data set into a training data set and a verification data set, wherein the training data set is used for training the machine learning classification model, and the verification data set is used for verifying whether the machine learning classification model meets the usability standard.
In a specific application, the machine learning classification model must be trained before it is put into use; training adjusts the model parameters so that the trained model better fits the data of the application scenario of the invention.
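The division into a training data set and a verification data set can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `split_dataset`, the record fields and the fixed random seed are hypothetical, and the default 0.8 fraction is taken from the 8:2 ratio given in the detailed description.

```python
import random


def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of `samples` and split it into (training, verification) lists."""
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical labeled records; real records would hold the detection features.
records = [{"vehicle_id": i, "label": i % 2} for i in range(10)]
train_set, validation_set = split_dataset(records)
```

Shuffling before the cut avoids a split that simply follows production order; whether a stratified split is preferable depends on how unbalanced the positive and negative samples are.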
As an optional implementation manner of the present application, the SHAP value screening process includes: sorting the SHAP values and screening, according to the sorting result, the SHAP values that meet the conditions, wherein the corresponding features are the features that can be used for optimizing the vehicle detection process.
In a particular application, the influence of the detection features on the vehicle quality is ranked by screening for SHAP values.
As an optional implementation manner of the present application, the process of marking the vehicle detection data is: defining the vehicle detection data for which the offline detection fails but the restart test passes as a positive sample, and defining the data for which the offline detection fails and the restart test still fails as a negative sample.
In a specific application, to use the vehicle detection data for training a machine learning classification model, the data need to be marked with the corresponding types; in the present method they are marked as positive samples and negative samples.
As an optional implementation manner of the present application, the preprocessing process includes one-hot encoding the vehicle detection data.
In a specific application, the parameters of each feature in the vehicle detection data are discrete and their values carry no numerical magnitude (for example, test area numbers 1, 2, 3, and so on), so one-hot encoding must be performed on the vehicle detection data.
As an optional implementation manner of the present application, the sorting manner of the screened SHAP values is descending order.
In a specific application, the detection features affecting vehicle quality are determined according to the SHAP values. To find the detection features most likely to affect vehicle quality, the features with a large impact must be located from the SHAP values; once the SHAP values are sorted in descending order, the magnitude of the impact can be read from the sorting order.
As an optional implementation manner of the present application, the process of screening qualified SHAP values is as follows: screening, in the sorting result, the features whose SHAP value is positive and whose mean ranks in the Top-K, and the features whose SHAP value is negative and whose mean ranks in the Last-K.
In a specific application, the features whose SHAP value is positive and whose mean ranks in the Top-K are the important features most likely to cause positive samples, and the features whose SHAP value is negative and whose mean ranks in the Last-K are the important features most likely to cause vehicle faults.
As an optional implementation manner of the present application, the features whose SHAP value is positive and whose mean ranks in the Top-K are the features most likely to generate positive sample data, and the features whose SHAP value is negative and whose mean ranks in the Last-K are the features most likely to cause the vehicle to malfunction.
In a specific application, the features whose SHAP value is positive and whose mean ranks in the Top-K are the important features most likely to cause positive samples; selectively ignoring these features can directly improve the vehicle first-pass rate. The features whose SHAP value is negative and whose mean ranks in the Last-K are the important features most likely to cause vehicle faults; optimizing and improving the detection process for these features can also improve the first-pass rate of vehicle detection.
In order to solve the above technical problem, the embodiment of the present application further provides: a training method for a vehicle first-pass rate influence characteristic determination model, comprising the following steps: preprocessing vehicle detection data that did not pass offline detection; constructing a machine learning classification model; and training the machine learning classification model based on the preprocessed data.
In a specific application, the machine learning classification model must be trained before it is put into use; training adjusts the model parameters so that the trained model better fits the data of the application scenario of the invention.
As an optional implementation manner of the present application, the machine learning classification model training process includes: the vehicle detection data set is divided into a training data set and a verification data set, the training data set is used for training the machine learning classification model, and the verification data set is used for verifying whether the machine learning classification model meets available standards.
In a specific application, the machine learning classification model must be trained before it is put into use; training adjusts the model parameters so that the trained model better fits the data of the application scenario of the invention.
In order to solve the above technical problem, the embodiment of the present application further provides: a vehicle first-pass rate influence characteristic determination device comprising:
the data preprocessing module is used for preprocessing vehicle detection data which do not pass offline detection;
the model training module is used for constructing and training a machine learning classification model based on the preprocessed data;
and the detection process optimization module is used for calculating SHAP values of the features in the preprocessed data based on the machine learning classification model, screening a plurality of SHAP values in a preset processing mode, and optimizing the detection process of the vehicle based on the screening result.
In a specific application, the effect is the same as that of the vehicle first-pass rate influence characteristic determination method provided by the application.
As an alternative embodiment of the present application, the preprocessing process includes tagging the vehicle detection data.
In a specific application, the vehicle detection data is marked for training of a machine learning classification model.
As an optional implementation manner of the present application, the machine learning classification model training process includes: the vehicle detection data set is divided into a training data set and a verification data set, the training data set is used for training the machine learning classification model, and the verification data set is used for verifying whether the machine learning classification model meets available standards.
In a specific application, the machine learning classification model must be trained before it is put into use; training adjusts the model parameters so that the trained model better fits the data of the application scenario of the invention.
As an optional implementation manner of the present application, the SHAP value screening process includes: sorting the SHAP values and screening, according to the sorting result, the SHAP values that meet the conditions, wherein the corresponding features are the features that can be used for optimizing the vehicle detection process.
In a particular application, the ranking of the impact of the detection features on vehicle quality is confirmed by screening for SHAP values.
As an optional implementation manner of the present application, the process of marking the vehicle detection data is: defining the vehicle detection data for which the offline detection fails but the restart test passes as a positive sample, and defining the data for which the offline detection fails and the restart test still fails as a negative sample.
In a specific application, to use the vehicle detection data for training a machine learning classification model, the vehicle detection data needs to be marked as corresponding types according to needs, and in the present method, the vehicle detection data is marked as a positive sample and a negative sample.
As an optional implementation manner of the present application, the preprocessing process includes one-hot encoding the vehicle detection data.
In a specific application, the parameters of each feature in the vehicle detection data are discrete and their values carry no numerical magnitude (for example, test area numbers 1, 2, 3, and so on), so one-hot encoding must be performed on the vehicle detection data.
As an optional implementation manner of the present application, the sorting manner of the screened SHAP values is descending order.
In a specific application, the detection features affecting vehicle quality are determined according to the SHAP values. To find the detection features most likely to affect vehicle quality, the features with a large impact must be located from the SHAP values; once the SHAP values are sorted in descending order, the magnitude of the impact can be read from the sorting order.
As an optional implementation manner of the present application, the process of screening qualified SHAP values is as follows: screening, in the sorting result, the features whose SHAP value is positive and whose mean ranks in the Top-K, and the features whose SHAP value is negative and whose mean ranks in the Last-K.
In a specific application, the features whose SHAP value is positive and whose mean ranks in the Top-K are the important features most likely to cause positive samples, and the features whose SHAP value is negative and whose mean ranks in the Last-K are the important features most likely to cause vehicle faults.
As an optional implementation manner of the present application, the features whose SHAP value is positive and whose mean ranks in the Top-K are the features most likely to generate positive sample data, and the features whose SHAP value is negative and whose mean ranks in the Last-K are the features most likely to cause the vehicle to malfunction.
In a specific application, the features whose SHAP value is positive and whose mean ranks in the Top-K are the important features most likely to cause positive samples; selectively ignoring these features can directly improve the vehicle first-pass rate. The features whose SHAP value is negative and whose mean ranks in the Last-K are the important features most likely to cause vehicle faults; optimizing and improving the detection process for these features can also improve the first-pass rate of vehicle detection.
In order to solve the above technical problem, the embodiment of the present application further provides: an electronic device comprising a memory having a computer program stored therein and a processor executing the computer program to implement the method as described above.
In a specific application, the effect is the same as that of the vehicle first-pass rate influence characteristic determination method provided by the application.
In order to solve the above technical problem, the embodiment of the present application further provides: a computer-readable storage medium having stored thereon a computer program, which computer program is executed by a processor to implement a method as described above.
In a specific application, the effect is the same as that of the vehicle first-pass rate influence characteristic determination method provided by the application.
Compared with the prior art, the vehicle first-pass rate influence characteristic determination method implemented by the application comprises the following steps: preprocessing vehicle detection data that did not pass offline detection; constructing and training a machine learning classification model based on the preprocessed data; calculating SHAP values for the features in the preprocessed data based on the machine learning classification model; and screening the SHAP values in a preset processing mode and optimizing the vehicle detection process based on the screening result. The preprocessing includes labeling the vehicle detection data. The machine learning classification model training process comprises dividing the vehicle detection data set into a training data set, used for training the model, and a verification data set, used for verifying whether the model meets the usability standard. The SHAP value screening process comprises sorting the SHAP values and screening, according to the sorting result, the SHAP values that meet the conditions; the corresponding features are the features that can be used for optimizing the vehicle detection process. It can be seen that modeling is performed on the test data collected before the vehicle goes offline: the test data of each process serve as the features, the state parameter indicating whether the test passed serves as the label, a classification model is established through a machine learning classification algorithm, and the model is analyzed through an improved scheme based on the SHAP value to obtain the factors influencing the first-pass detection rate before the vehicle goes offline.
By analyzing vehicle detection with the algorithm model, the main factors that cause a vehicle to fail the first system detection but pass manual detection or re-detection are identified, which assists in optimizing the detection flow and ultimately improves the first-pass rate of vehicle detection.
Drawings
Fig. 1 is a schematic diagram of the steps of a vehicle first-pass rate influence characteristic determination method according to an embodiment of the present application.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The vehicle first-pass rate influence characteristic determination method implemented by the application includes the steps of: preprocessing vehicle detection data that did not pass offline detection; constructing and training a machine learning classification model based on the preprocessed data; calculating SHAP values for the features in the preprocessed data based on the machine learning classification model; and screening the SHAP values in a preset processing mode and optimizing the vehicle detection process based on the screening result. The preprocessing includes labeling the vehicle detection data. The machine learning classification model training process comprises dividing the vehicle detection data set into a training data set, used for training the model, and a verification data set, used for verifying whether the model meets the usability standard. The SHAP value screening process comprises sorting the SHAP values and screening, according to the sorting result, the SHAP values that meet the conditions; the corresponding features are the features that can be used for optimizing the vehicle detection process.
In the prior art, the processing flow depends excessively on the manual experience of business experts, the parameter elimination process is strongly affected by the threshold setting, and the parameter elimination process is cumbersome. For example, the application with publication number CN111159645A diagnoses fault information of an automobile production line in real time and provides a corresponding handling scheme, but it does not analyze the cause of the fault, so it can neither directly improve the automobile production flow nor directly improve the first-pass rate. The application CN 202210148272 is a management method for the test process of a mobile phone production line; it is a mobile phone fault detection method used to find defective products, but it cannot analyze the root cause of mobile phone faults, cannot analyze problems of the test system from historical data, and cannot improve the mobile phone first-pass rate in a targeted manner. The patent application with publication No. CN 201310268793 is a method for handling abnormal results on a production line; it simply replaces equipment whose number of consecutive abnormalities is too high and does not analyze the causes of the abnormal results in depth. Another example is application publication No. 202110961318.6. A DMAIC process improvement method based on Lean Six Sigma mainly analyzes and optimizes the low first-pass rate of product A of company F in the SMT (surface mount technology) section; the analysis depends heavily on the professional knowledge of industry experts, and the root cause of the low first-pass rate in production is difficult to analyze (Xiong Chuanbin, product pass through improvement study of product A based on lean six sigma [D]).
Therefore, the present application performs modeling on the test data collected before the vehicle goes offline: the test data of each process serve as the features, the state parameter indicating whether the test passed serves as the label, a classification model is established through a machine learning classification algorithm, and the model is analyzed with an improved scheme based on the SHAP value to obtain the factors influencing the first-pass detection rate before the vehicle goes offline. By analyzing vehicle detection with the algorithm model, the main factors that cause a vehicle to fail the first system detection but pass manual detection or re-detection are identified, which assists in optimizing the detection flow and ultimately improves the first-pass rate of vehicle detection.
The vehicle first-pass rate influence characteristic determination method implemented by the application includes the steps of:
preprocessing vehicle detection data which do not pass offline detection;
constructing and training a machine learning classification model based on the preprocessed data;
calculating SHAP values for features in the preprocessed data based on the machine-learned classification model;
and screening the plurality of SHAP values in a preset processing mode, and optimizing the detection process of the vehicle based on the screening result.
The steps are described in detail below with reference to Fig. 1:
S01: preprocessing vehicle detection data which do not pass offline detection;
Among the samples of the vehicle detection data, the samples that pass the restart detection are marked as 1 (i.e. samples of vehicles that pass after a restart without repair are positive samples, defined as resetOK), and the other samples are marked as 0 (i.e. samples with a real vehicle quality problem are negative samples, defined as realNOK).
In the vehicle detection data, the parameters of each feature are discrete and their values carry no numerical magnitude (for example, test area numbers 1, 2, 3, and so on), so one-hot encoding is performed on the marked data. The processed data are defined as the vehicle detection data set.
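The marking and one-hot encoding of step S01 can be sketched in plain Python. The field names (`test_area`, `station`, `restart_passed`) and both helper functions are hypothetical and chosen only for illustration; the patent does not specify the data schema or the encoding tool.

```python
def label_sample(restart_passed):
    """Return 1 (resetOK) if the restart detection passed, else 0 (realNOK)."""
    return 1 if restart_passed else 0


def one_hot_encode(rows, columns):
    """One-hot encode the given categorical columns of a list of dict rows."""
    # Collect the distinct values observed in each column, in sorted order.
    categories = {c: sorted({row[c] for row in rows}) for c in columns}
    encoded = []
    for row in rows:
        out = {}
        for c in columns:
            for value in categories[c]:
                # One indicator column per (column, value) pair.
                out[f"{c}={value}"] = 1 if row[c] == value else 0
        encoded.append(out)
    return encoded


# Hypothetical records of vehicles that failed the first offline detection.
raw = [
    {"test_area": 1, "station": "A", "restart_passed": True},
    {"test_area": 2, "station": "B", "restart_passed": False},
    {"test_area": 3, "station": "A", "restart_passed": True},
]
labels = [label_sample(r["restart_passed"]) for r in raw]
features = one_hot_encode(raw, ["test_area", "station"])
```

Each encoded row contains exactly one 1 per original column, so a test area number such as 3 is no longer treated as "larger" than 1 by the model.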
S02: constructing and training a machine learning classification model based on the preprocessed data;
The vehicle detection data set is divided into a training set and a verification set at a ratio of 8:2. A classification model is obtained by training XGBoost (or another machine learning classification model) on the training set. The model is then evaluated on the verification set and the training parameters are adjusted: the F1-score is calculated from indexes such as the recall and precision of the positive samples, and the training parameters are adjusted one at a time by the control variable method until the highest F1-score is obtained, at which point training ends and the model is saved. Using the counts of the confusion matrix in Table 1, F1-score is defined as follows:
$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2PR}{P + R}$$
TABLE 1 Detection result confusion matrix
| | Negative sample | Positive sample |
| Identified as a negative sample | True Negatives (TN) | False Negatives (FN) |
| Identified as a positive sample | False Positives (FP) | True Positives (TP) |
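As a small worked example, the F1-score of the positive class can be computed from the confusion-matrix counts as the harmonic mean of precision and recall. The function name `f1_score` and the counts used below are illustrative assumptions, not values from the application.

```python
def f1_score(tp, fp, fn):
    """Return (precision, recall, F1) for the positive class from confusion-matrix counts."""
    # Guard against empty denominators when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical counts: 40 true positives, 10 false positives, 10 false negatives.
precision, recall, f1 = f1_score(tp=40, fp=10, fn=10)
```

True negatives do not enter the F1-score, which makes it a reasonable tuning target when the resetOK and realNOK classes are unbalanced.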
S03: calculating SHAP values for features in the preprocessed data based on the machine-learned classification model;
for each sample in the vehicle inspection dataset, a SHAP value for each feature is calculated.
In the vehicle detection data set, the SHAP value of the j-th feature of any sample (each sample being the detection result values of all items in the detection process of one vehicle) is calculated according to the following formula:
$$\phi_j = \sum_{S \subseteq \{x_1,\dots,x_p\} \setminus \{x_j\}} \frac{|S|!\,(p-|S|-1)!}{p!}\,\bigl(f(S \cup \{x_j\}) - f(S)\bigr)$$
where $\{x_1,\dots,x_p\}$ is the feature set of all vehicle detection processes, $p$ is the total number of features, $\{x_1,\dots,x_p\} \setminus \{x_j\}$ denotes the feature set that does not include $x_j$, $S$ is any subset of that set, $f(S \cup \{x_j\})$ is the trained XGBoost model's predicted value for $S \cup \{x_j\}$, and $f(S)$ is the trained XGBoost model's predicted value for $S$.
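To make the weighting in the Shapley formula concrete, the following sketch evaluates it by brute force over all subsets. It assumes a toy value function `toy_value` in place of the trained model's prediction on a feature subset; the feature names are hypothetical. Exhaustive enumeration is exponential in the number of features, so for tree models such as XGBoost a dedicated algorithm (for example TreeSHAP as provided by the `shap` package) would normally be used instead.

```python
from itertools import combinations
from math import factorial


def shapley_value(value_fn, features, j):
    """Exact Shapley value of feature j for a set function value_fn(frozenset)."""
    p = len(features)
    others = [f for f in features if f != j]
    phi = 0.0
    for size in range(p):
        # Weight |S|! (p - |S| - 1)! / p! for subsets S of this size.
        weight = factorial(size) * factorial(p - size - 1) / factorial(p)
        for subset in combinations(others, size):
            s = frozenset(subset)
            phi += weight * (value_fn(s | {j}) - value_fn(s))
    return phi


def toy_value(subset):
    """Toy stand-in for the model output given only the features in `subset`."""
    score = 0.0
    if "a" in subset:
        score += 2.0
    if "b" in subset:
        score += 1.0
    if "a" in subset and "b" in subset:
        score += 1.0  # interaction between a and b
    return score


features = ["a", "b", "c"]
phi = {f: shapley_value(toy_value, features, f) for f in features}
```

The interaction term is split evenly between `a` and `b`, the unused feature `c` receives zero, and the values sum to the difference between the full-set and empty-set outputs.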
S04: screening the plurality of SHAP values in a preset processing manner, and optimizing the detection process of the vehicle based on the screening result.
The vehicle detection data set is one-hot encoded: an encoded value of 1 indicates that the feature parameter occurred, and a value of 0 indicates that it did not. Because the invention analyzes which features affect the straight-through rate of the vehicle, only features that actually occurred (feature value 1) are informative. A feature value of 0 means the feature did not occur for that vehicle, so SHAP values calculated for such entries have no practical significance for the factor analysis. These SHAP values are deleted, and only the SHAP values calculated for feature parameters with a feature value of 1 are retained.
Because some features occur too rarely in vehicle detection to be statistically meaningful, the invention sets a threshold N: any feature that occurs fewer than N times is deleted. Based on practical experience, N = 100 is recommended.
The SHAP values of each feature are averaged, and the averages are sorted in descending order. Features whose SHAP value is positive and whose mean ranks in the Top-k are the important features most likely to lead to resetOK; ignoring these features in a targeted manner can directly improve the first-pass rate of the vehicle. Features whose SHAP value is negative and whose mean ranks in the Last-k are the important features most likely to cause a vehicle fault; optimizing and improving the detection process for these features can also improve the first-pass rate of vehicle detection.
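The retention rule for SHAP values can be sketched as a simple mask. This is an illustration under assumed data layouts: `encoded_rows` holds the one-hot values and `shap_rows` holds the SHAP values in the same row and column order; using `None` to mark a deleted value is a choice made here for illustration.

```python
def keep_present_only(encoded_rows, shap_rows):
    """Keep a SHAP value only where the one-hot feature value is 1."""
    return [
        [s if x == 1 else None for x, s in zip(x_row, s_row)]
        for x_row, s_row in zip(encoded_rows, shap_rows)
    ]


# Toy data: 2 samples, 3 one-hot columns (values are made up).
encoded_rows = [[1, 0, 1], [0, 1, 1]]
shap_rows = [[0.2, -0.1, 0.05], [0.3, -0.4, -0.6]]
masked = keep_present_only(encoded_rows, shap_rows)
```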
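The occurrence threshold can be sketched as a column count over the one-hot data. The function name and the toy data are hypothetical; the default of 100 mirrors the recommended N.

```python
def frequent_feature_indices(encoded_rows, min_count=100):
    """Return indices of one-hot columns that occur at least min_count times."""
    counts = [sum(column) for column in zip(*encoded_rows)]
    return [j for j, count in enumerate(counts) if count >= min_count]


# Toy data with a small threshold so the effect is visible.
toy_rows = [[1, 0, 1], [1, 0, 0], [1, 1, 0]]
kept = frequent_feature_indices(toy_rows, min_count=2)
```

Here only column 0 occurs at least twice, so columns 1 and 2 would be dropped from the analysis.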
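The averaging and ranking step can be sketched as follows. The sketch assumes SHAP values that were deleted in the earlier filtering are stored as `None`, that k is at least 1, and that feature names and values are purely illustrative.

```python
def rank_features(names, masked_shap_rows, k):
    """Average retained SHAP values per feature and pick Top-k / Last-k.

    Returns (top_k, last_k) as lists of (name, mean) pairs: top_k holds
    positive means among the k highest, last_k holds negative means among
    the k lowest. Assumes k >= 1.
    """
    means = {}
    for j, name in enumerate(names):
        values = [row[j] for row in masked_shap_rows if row[j] is not None]
        if values:
            means[name] = sum(values) / len(values)
    ranked = sorted(means.items(), key=lambda item: item[1], reverse=True)
    top_k = [(n, m) for n, m in ranked[:k] if m > 0]
    last_k = [(n, m) for n, m in ranked[-k:] if m < 0]
    return top_k, last_k


names = ["f1", "f2", "f3", "f4"]
masked_shap_rows = [
    [0.5, None, -0.2, 0.1],
    [0.3, -0.4, None, 0.1],
]
top_k, last_k = rank_features(names, masked_shap_rows, k=1)
```

In this toy data `f1` is the feature most associated with resetOK and `f2` the one most associated with realNOK.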
Furthermore, in one embodiment, the present application provides a computer program product which, when executed by a processor, implements the foregoing method.
Furthermore, in one embodiment, the present application provides a computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the method in the foregoing embodiments.
In some embodiments, the computer-readable storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories. The computer may be a variety of computing devices including intelligent terminals and servers.
In some embodiments, the executable instructions may be in the form of a program, software module, script, or code written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may, but need not, correspond to files in a file system. They may be stored in a portion of a file that holds other programs or data, for example in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or system that comprises the element.
The serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the preferred implementation. On this understanding, the technical solutions of the present application may be embodied as a software product that is stored in a storage medium (e.g., ROM/RAM, a magnetic disk, or an optical disk) and includes instructions for causing a multimedia terminal (e.g., a mobile phone, a computer, a television receiver, or a network device) to perform the methods of the embodiments of the present application.
The above description is only a preferred embodiment of the present application and is not intended to limit its scope. All equivalent structures or equivalent processes derived from the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, are included in the scope of the present application.
Claims (24)
1. A vehicle straight-through rate influence characteristic determination method, comprising the steps of:
preprocessing vehicle detection data that did not pass offline detection;
constructing and training a machine learning classification model based on the preprocessed data;
calculating SHAP values for features in the preprocessed data based on the machine learning classification model; and
screening the plurality of SHAP values in a preset processing manner, and optimizing the detection process of the vehicle based on the screening result.
2. The vehicle straight-through rate influence characteristic determination method according to claim 1, wherein the preprocessing process includes labeling the vehicle detection data.
3. The vehicle straight-through rate influence characteristic determination method according to claim 1, wherein the machine learning classification model training process includes:
dividing the vehicle detection data set into a training data set and a verification data set, wherein the training data set is used for training the machine learning classification model, and the verification data set is used for verifying whether the machine learning classification model meets the criteria for use.
4. The vehicle straight-through rate influence characteristic determination method according to claim 1, characterized in that:
the screening process of the SHAP values comprises: sorting the screened SHAP values, and selecting the SHAP values that meet the conditions according to the sorting result, wherein the corresponding features are features that can be used for optimizing the vehicle detection process.
5. The vehicle straight-through rate influence characteristic determination method according to claim 2, characterized in that:
the process of labeling the vehicle detection data comprises: defining vehicle detection data for which offline detection failed but the restart test passed as positive samples, and defining data for which offline detection failed and the restart test still failed as negative samples.
6. The vehicle straight-through rate influence characteristic determination method according to claim 2, wherein the preprocessing process includes one-hot encoding the vehicle detection data.
7. The vehicle straight-through rate influence characteristic determination method according to claim 4, wherein the screened SHAP values are sorted in descending order.
8. The vehicle straight-through rate influence characteristic determination method according to claim 4, characterized in that:
the process of selecting the SHAP values that meet the conditions is: selecting, from the sorting result, the features whose SHAP value is positive and whose mean ranks in the Top-K, and the features whose SHAP value is negative and whose mean ranks in the Last-K.
9. The vehicle straight-through rate influence characteristic determination method according to claim 8, wherein the features whose SHAP value is positive and whose mean ranks in the Top-K are the features most likely to produce positive sample data, and the features whose SHAP value is negative and whose mean ranks in the Last-K are the features most likely to cause a vehicle fault.
10. A vehicle straight-through rate influence characteristic determination model training method, characterized by comprising the following steps:
preprocessing vehicle detection data that did not pass offline detection;
constructing a machine learning classification model; and
training the machine learning classification model based on the preprocessed data.
11. The vehicle straight-through rate influence characteristic determination model training method according to claim 10, wherein the training of the machine learning classification model comprises:
dividing the preprocessed data into a training data set and a verification data set, wherein the training data set is used for training the machine learning classification model, and the verification data set is used for verifying whether the machine learning classification model meets the criteria for use.
12. A vehicle straight-through rate influence characteristic determination device, characterized by comprising:
a data preprocessing module, configured to preprocess vehicle detection data that did not pass offline detection;
a model training module, configured to construct and train a machine learning classification model based on the preprocessed data; and
a detection process optimization module, configured to calculate SHAP values of the features in the preprocessed data based on the machine learning classification model, screen the plurality of SHAP values in a preset processing manner, and optimize the detection process of the vehicle based on the screening result.
13. The vehicle straight-through rate influence characteristic determination device according to claim 12, wherein the preprocessing process includes labeling the vehicle detection data.
14. The vehicle straight-through rate influence characteristic determination device according to claim 12, wherein the machine learning classification model training process comprises:
dividing the vehicle detection data set into a training data set and a verification data set, wherein the training data set is used for training the machine learning classification model, and the verification data set is used for verifying whether the machine learning classification model meets the criteria for use.
15. The vehicle straight-through rate influence characteristic determination device according to claim 12, wherein the SHAP value screening process includes: sorting the screened SHAP values, and selecting the SHAP values that meet the conditions according to the sorting result, wherein the corresponding features are features that can be used for optimizing the vehicle detection process.
16. The vehicle straight-through rate influence characteristic determination device according to claim 12, characterized in that:
the process of labeling the vehicle detection data comprises: defining vehicle detection data for which offline detection failed but the restart test passed as positive samples, and defining data for which offline detection failed and the restart test still failed as negative samples.
17. The vehicle straight-through rate influence characteristic determination device according to claim 13, wherein the preprocessing process includes one-hot encoding the vehicle detection data.
18. The vehicle straight-through rate influence characteristic determination device according to claim 15, wherein the screened SHAP values are sorted in descending order.
19. The vehicle straight-through rate influence characteristic determination device according to claim 15, characterized in that:
the process of selecting the SHAP values that meet the conditions is: selecting, from the sorting result, the features whose SHAP value is positive and whose mean ranks in the Top-K, and the features whose SHAP value is negative and whose mean ranks in the Last-K.
20. The vehicle straight-through rate influence characteristic determination device according to claim 19, wherein the features whose SHAP value is positive and whose mean ranks in the Top-K are the features most likely to produce positive sample data, and the features whose SHAP value is negative and whose mean ranks in the Last-K are the features most likely to cause a vehicle fault.
21. A vehicle straight-through rate influence characteristic determination model training device, characterized by comprising:
a data preprocessing module, configured to preprocess vehicle detection data that did not pass offline detection;
a model building module, configured to construct a machine learning classification model; and
a model training module, configured to train the machine learning classification model based on the preprocessed data.
22. The vehicle straight-through rate influence characteristic determination model training device according to claim 21, wherein the training of the machine learning classification model comprises:
dividing the preprocessed data into a training data set and a verification data set, wherein the training data set is used for training the machine learning classification model, and the verification data set is used for verifying whether the machine learning classification model meets the criteria for use.
23. An electronic device, characterized in that the electronic device comprises a memory in which a computer program is stored and a processor, which executes the computer program to implement the method according to any of claims 1-11.
24. A computer-readable storage medium, having stored thereon a computer program, which, when executed by a processor, performs the method of any one of claims 1-11.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211276843.5A CN115358348B (en) | 2022-10-19 | 2022-10-19 | Vehicle straight-through rate influence characteristic determination method, device, equipment and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211276843.5A CN115358348B (en) | 2022-10-19 | 2022-10-19 | Vehicle straight-through rate influence characteristic determination method, device, equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115358348A (en) | 2022-11-18 |
CN115358348B (en) | 2023-03-24 |
Family
ID=84008439
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211276843.5A Active CN115358348B (en) | 2022-10-19 | 2022-10-19 | Vehicle straight-through rate influence characteristic determination method, device, equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115358348B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117745720A (en) * | 2024-02-19 | 2024-03-22 | 成都数之联科技股份有限公司 | Vehicle appearance detection method, device, equipment and storage medium |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111325353A (en) * | 2020-02-28 | 2020-06-23 | 深圳前海微众银行股份有限公司 | Method, device, equipment and storage medium for calculating contribution of training data set |
CN111476296A (en) * | 2020-04-07 | 2020-07-31 | 上海优扬新媒信息技术有限公司 | Sample generation method, classification model training method, identification method and corresponding devices |
CN111523677A (en) * | 2020-04-17 | 2020-08-11 | 第四范式(北京)技术有限公司 | Method and device for explaining prediction result of machine learning model |
CN111737067A (en) * | 2020-05-29 | 2020-10-02 | 苏州浪潮智能科技有限公司 | Hard disk fault prediction model interpretation method and device |
US20210272394A1 (en) * | 2018-09-30 | 2021-09-02 | Strong Force Intellectual Capital, Llc | Intelligent transportation systems including digital twin interface for a passenger vehicle |
CN113360845A (en) * | 2021-05-25 | 2021-09-07 | 浙江大搜车软件技术有限公司 | Vehicle source transaction probability prediction method and device, electronic device and storage medium |
CN113469241A (en) * | 2021-06-29 | 2021-10-01 | 中国航空规划设计研究总院有限公司 | Product quality control method based on process network model and machine learning algorithm |
US20220044133A1 (en) * | 2020-08-07 | 2022-02-10 | Sap Se | Detection of anomalous data using machine learning |
CN114444986A (en) * | 2022-04-11 | 2022-05-06 | 成都数之联科技股份有限公司 | Product analysis method, system, device and medium |
CN114841250A (en) * | 2022-04-11 | 2022-08-02 | 浙江工业大学 | Industrial system production abnormity detection and diagnosis method based on multi-dimensional sensing data |
CN114841468A (en) * | 2022-06-06 | 2022-08-02 | 兰州理工大学 | Gasoline quality index prediction and cause analysis method |
Non-Patent Citations (1)
Title |
---|
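郭嘉 (Guo Jia): "面向SMT产线的质量预测方法研究" [Research on Quality Prediction Methods for SMT Production Lines], 《中国优秀硕士学位论文全文数据库 (信息科技辑)》 [China Master's Theses Full-text Database (Information Science and Technology)] * |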
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117745720A (en) * | 2024-02-19 | 2024-03-22 | 成都数之联科技股份有限公司 | Vehicle appearance detection method, device, equipment and storage medium |
CN117745720B (en) * | 2024-02-19 | 2024-05-07 | 成都数之联科技股份有限公司 | Vehicle appearance detection method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN115358348B (en) | 2023-03-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||