CN116881824A - Anomaly detection method and system for official vehicle audit - Google Patents
- Publication number
- CN116881824A (application CN202310789305.4A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2433—Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24147—Distances to closest patterns, e.g. nearest neighbour classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention relates to an anomaly detection method and system for official vehicle audit. The method comprises: obtaining a standardized official vehicle audit table, in which the refueling data set and the maintenance data set are each oversampled with the SMOTE algorithm; fusing the GBDT ensemble algorithm and the KNN classification algorithm through fusion weights based on a weighted average method, the fusion weights being obtained by iterating over the data set until an iteration stop condition is reached, thereby constructing an official vehicle audit model based on weighted-average GBDT_KNN; obtaining the fusion weights of the refueling data set and inputting them, together with the refueling data set, into the model to obtain refueling suspicious-point data; obtaining the fusion weights of the maintenance data set and inputting them, together with the maintenance data set, into the model to obtain maintenance suspicious-point data; and integrating the refueling suspicious-point data with the maintenance suspicious-point data to obtain integrated suspicious-point data.
Description
Technical Field
The invention relates to the technical field of anomaly detection, in particular to an anomaly detection method and system for official vehicle audit.
Background
Official vehicle auditing currently faces the following pain points and difficulties:
Machine learning methods such as anomaly detection already have a number of applications in fields such as the Internet, but their application in auditing has only just begun. Most domestic research on machine learning algorithms in the auditing field is still at the theoretical stage, and much of it studies the construction of big data audit platforms and frameworks rather than applying the methods in actual audit work. Research results on machine learning algorithms for official vehicle audit are essentially nonexistent.
Because multi-source heterogeneous official vehicle audit data is growing rapidly, the conventional computer audit method of screening suspicious points with SQL statements shows certain limitations, and official vehicle auditing at the current stage faces a great challenge.
Machine learning can learn from the data precisely and in depth, and can find hidden suspicious points that conventional computer audit methods cannot.
Disclosure of Invention
To address the difficulty that existing audit methods have in quickly and accurately identifying suspicious points in labeled official vehicle data, the invention provides an anomaly detection method and system for official vehicle audit, and studies an official vehicle audit model based on weighted-average GBDT_KNN. Fusing the GBDT ensemble algorithm with the KNN classification algorithm combines the high identification capability of GBDT with the fast classification capability of KNN, helps official vehicle audits identify suspicious-point data quickly and accurately, and improves audit efficiency.
To achieve the above object, the invention adopts the following technical scheme:
An anomaly detection method for official vehicle audit, characterized by comprising the following steps:
S1. acquiring a standardized official vehicle audit table, in which the refueling data set and the maintenance data set are each oversampled with the SMOTE algorithm;
S2. fusing the GBDT ensemble algorithm and the KNN classification algorithm through fusion weights based on a weighted average method, the fusion weights being obtained by iterating over the data set until an iteration stop condition is reached; and constructing an official vehicle audit model based on weighted-average GBDT_KNN;
S3. acquiring the fusion weights of the refueling data set, and inputting the fusion weights and the refueling data set into the official vehicle audit model based on weighted-average GBDT_KNN to obtain refueling suspicious-point data;
S4. acquiring the fusion weights of the maintenance data set, and inputting the fusion weights and the maintenance data set into the official vehicle audit model based on weighted-average GBDT_KNN to obtain maintenance suspicious-point data;
S5. integrating the refueling suspicious-point data and the maintenance suspicious-point data to obtain integrated suspicious-point data.
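As an illustration of the integration in step S5, the following minimal sketch merges the two sets of flagged records into one view per vehicle. The source does not say how the integration is keyed or stored, so the function name, the (vehicle_id, record_id) pair layout, and the sample identifiers are all hypothetical.

```python
from collections import defaultdict

def integrate_suspicious_items(refueling_flags, maintenance_flags):
    """Merge flagged records from both data sets into one dict per vehicle.

    Each input is a list of (vehicle_id, record_id) pairs; this layout is an
    assumption made for illustration, not something the source specifies.
    """
    merged = defaultdict(lambda: {"refueling": [], "maintenance": []})
    for vehicle_id, record_id in refueling_flags:
        merged[vehicle_id]["refueling"].append(record_id)
    for vehicle_id, record_id in maintenance_flags:
        merged[vehicle_id]["maintenance"].append(record_id)
    return dict(merged)

# Hypothetical flagged records produced by steps S3 and S4.
refueling_flags = [("A001", "R-17"), ("B002", "R-42")]
maintenance_flags = [("A001", "M-03")]
integrated = integrate_suspicious_items(refueling_flags, maintenance_flags)
```

Grouping by vehicle lets an auditor see whether one vehicle is flagged in both data sets, which is one plausible reading of "integrated suspicious-point data".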
Further, before step S1, the method also comprises:
collecting raw official vehicle data, and performing feature selection and data integration on the raw data to obtain preprocessed data;
constructing official vehicle audit features from the existing official vehicle features and the audit requirements, the official vehicle audit features comprising refueling features and maintenance features;
establishing, based on the preprocessed data, an official vehicle audit table containing the official vehicle audit features, the audit table comprising a refueling data set and a maintenance data set;
cleaning the data of the official vehicle audit table by deleting or filling null values, to obtain a cleaned official vehicle audit table;
one-hot encoding the discrete features in the cleaned official vehicle audit table to convert them into numerical features, and standardizing those numerical features whose attribute values differ widely, to obtain the standardized official vehicle audit table.
Further, the iteration stop condition is that the evaluation index reaches a preset threshold; the preset threshold is 0.96.
Further, step S2 comprises the following sub-steps:
randomly initializing the prediction weights, and computing with the GBDT algorithm and the KNN algorithm to obtain a predicted weight result;
minimizing a loss function by a minimization method, and optimizing the initial weights under constraint conditions to obtain a minimized weight result;
performing a weighted calculation on the minimized weight result and the predicted weight result to obtain a weighted prediction value;
inputting the weighted prediction value into the index function that evaluates the weights, to obtain the evaluation index of each iteration;
comparing the evaluation index of each iteration with the preset threshold;
if the evaluation index of an iteration reaches the preset threshold, stopping the iteration and outputting the evaluation index and the weighted prediction value of that iteration; the weighted prediction value is the fusion weight.
Further, the minimization method is the sequential least squares programming (SLSQP) method.
Further, step S1 also comprises: dividing the oversampled refueling data set into a refueling training set and a refueling test set, and dividing the oversampled maintenance data set into a maintenance training set and a maintenance test set.
The invention also relates to an anomaly detection system for official vehicle audit, characterized by comprising:
a standardized audit table module, used for acquiring a standardized official vehicle audit table, in which the refueling data set and the maintenance data set are each oversampled with the SMOTE algorithm;
a model construction module, used for fusing the GBDT ensemble algorithm and the KNN classification algorithm through fusion weights based on a weighted average method, the fusion weights being obtained by iterating over the data set until an iteration stop condition is reached, and for constructing an official vehicle audit model based on weighted-average GBDT_KNN;
a suspicious-point data calculation module, used for obtaining the fusion weights of the refueling data set and inputting them, together with the refueling data set, into the official vehicle audit model based on weighted-average GBDT_KNN to obtain refueling suspicious-point data; and for obtaining the fusion weights of the maintenance data set and inputting them, together with the maintenance data set, into the same model to obtain maintenance suspicious-point data;
a suspicious-point data integration module, used for integrating the refueling suspicious-point data and the maintenance suspicious-point data to obtain integrated suspicious-point data.
Further, the system also comprises:
a data acquisition module, used for collecting raw official vehicle data and performing feature selection and data integration on the raw data to obtain preprocessed data;
a feature construction module, used for constructing official vehicle audit features by combining the existing official vehicle features with the audit requirements, the official vehicle audit features comprising refueling features and maintenance features;
a data processing module, used for establishing, based on the preprocessed data, an official vehicle audit table containing the official vehicle audit features, the audit table comprising a refueling data set and a maintenance data set; cleaning the data of the audit table by deleting or filling null values, to obtain a cleaned official vehicle audit table; one-hot encoding the discrete features in the cleaned audit table to convert them into numerical features; and standardizing those numerical features whose attribute values differ widely, to obtain the standardized official vehicle audit table.
The invention also relates to a computer-readable storage medium, characterized in that a computer program is stored on the storage medium, and the computer program, when executed by a processor, implements the above anomaly detection method for official vehicle audit.
The invention also relates to an electronic device, characterized by comprising a processor and a memory;
the memory is used for storing the official vehicle audit model based on weighted-average GBDT_KNN;
the processor is used for executing the above anomaly detection method for official vehicle audit by calling the official vehicle audit model based on weighted-average GBDT_KNN.
The invention also relates to a computer program product comprising a computer program and/or instructions, characterized in that the computer program and/or instructions, when executed by a processor, implement the steps of the above anomaly detection method for official vehicle audit.
The beneficial effects of the invention are as follows:
The invention provides an anomaly detection method and system for official vehicle audit and proposes an official vehicle audit model based on weighted-average GBDT_KNN, addressing the difficulty that existing audit methods have in quickly and accurately identifying suspicious points in labeled official vehicle data. Fusing the GBDT ensemble algorithm with the KNN classification algorithm combines the high identification capability of GBDT with the fast classification capability of KNN, helps official vehicle audits identify suspicious-point data quickly and accurately, and improves audit efficiency. The method and system can identify feature information that people cannot, reduce the subjectivity of manually selected features, handle continuous and discrete data flexibly, achieve higher anomaly detection accuracy, and identify suspicious-point data quickly. They compensate for the subjectivity and one-sidedness of traditional auditing, reflect the patterns in audit data more objectively and accurately, and extract valuable data for auditors in a timely manner.
Drawings
FIG. 1 is a schematic flow chart of an anomaly detection method for official vehicle audit.
FIG. 2 is a schematic diagram of an anomaly detection system for official vehicle audit according to the present invention.
Detailed Description
For a clearer understanding of the present invention, reference will be made to the following detailed description taken in conjunction with the accompanying drawings and examples.
The first aspect of the invention relates to an anomaly detection method for official vehicle audit. Its flow is shown in FIG. 1, and the method comprises the following steps:
First, business logic analysis is performed on the raw official vehicle data, and feature selection and data integration are carried out according to the official vehicle management rules.
New official vehicle audit features are then constructed from the existing official vehicle features and the audit requirements, finally forming a new official vehicle audit table, which is cleaned by deleting or filling null values.
The discrete features in the cleaned data are one-hot encoded to convert them into numerical values, so that the algorithm can compute distances between data points. Numerical features whose attribute values differ widely are standardized, which reduces the influence of large differences in attribute values on the anomaly detection result.
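The encoding and standardization described above can be sketched with the Python standard library alone. This is an illustrative sketch, not the patent's implementation: the column names (fuel type, refueling amount) and sample values are assumptions, and z-score scaling is assumed as the standardization, since the source does not name a specific formula.

```python
from statistics import mean, pstdev

def one_hot(values):
    """Map each discrete value to a 0/1 vector over the sorted category list."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]

def standardize(values):
    """Z-score scaling: subtract the mean, divide by the population std dev."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - mu) / sigma for v in values]

# Hypothetical refueling records: a discrete feature and a wide-range numeric one.
fuel_types = ["92", "95", "92", "diesel"]
amounts = [300.0, 320.0, 280.0, 1500.0]

categories, encoded = one_hot(fuel_types)
scaled = standardize(amounts)
```

Scaling matters here because KNN relies on distances: without it, a feature measured in hundreds would dominate the 0/1 one-hot columns.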
Abnormal data samples are labeled 1 and the remaining normal samples are labeled 0. The processed refueling data set and maintenance data set are each oversampled with the SMOTE algorithm to balance the sample classes.
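The core idea of SMOTE is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below shows only that interpolation step in plain Python; it is a simplified illustration rather than a full SMOTE implementation, and the function name, the parameters (n_new, k, seed) and the sample points are assumptions.

```python
import math
import random

def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points between minority samples and neighbours.

    Requires at least two minority samples. Each synthetic point lies on the
    line segment between a randomly chosen sample and one of its k nearest
    minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical standardized feature vectors of samples labeled 1 (abnormal).
anomalies = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
new_points = smote_oversample(anomalies, n_new=4)
```

Because every synthetic point is a convex combination of two real abnormal samples, the added data stays inside the region already occupied by the minority class.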
The samples are divided into a training set and a test set, and the KNN algorithm, which performs well at anomaly detection, is combined with GBDT by a weighted average method; the optimal weights for the data set are obtained through multiple iterative calculations. First, the loss function is defined as log_loss_func(weights), and the index function for evaluating the weights is defined as calculated_weighted_accuracy(prediction_weights). A minimization function minimisoptize(pres, models_models, nb_classes, sample_n, testY, num_tests=20) is defined, whose parameters are computed from the predicted values of GBDT and KNN. Initialize best_f1=0.0 and best_weights=None; then, for each iteration:
(1) Randomly initialize the weights prediction_weights.
(2) Minimize the loss function log_loss_func() with the minimize() method, using SLSQP (sequential least squares programming) as the method, with constraint conditions set and the initial weights as the object of optimization. The result of minimize() gives the weight values, which are combined with the predictors computed by the algorithms to obtain the weighted prediction values weighted_predictors.
(3) Compute the evaluation index from the weighted prediction values and the true classes, and output the evaluation index and weight values of each iteration.
(4) If f1 > best_f1, update best_f1 and best_weights.
After the iterations finish, best_f1 and best_weights are returned, and the evaluation index result at best_weights is output with calculated_weighted_accuracy().
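The weight search above can be illustrated with a dependency-free sketch. Note the substitution: the source minimizes the log loss with SLSQP under constraints, whereas this sketch replaces SLSQP with a simple grid search over a single weight w, using (w, 1 - w) so the two weights sum to one. The function names, the 0.5 decision cutoff, and the sample probabilities are assumptions, not values from the source.

```python
import math

def log_loss(y_true, probs, eps=1e-15):
    """Mean binary cross-entropy, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

def f1_score(y_true, y_pred):
    """F1 of the positive (abnormal) class; 0.0 when there are no true positives."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def fuse(w, p_gbdt, p_knn):
    """Weighted average of the two models' anomaly probabilities."""
    return [w * a + (1 - w) * b for a, b in zip(p_gbdt, p_knn)]

def search_fusion_weight(p_gbdt, p_knn, y_true, steps=100):
    """Grid search (stand-in for SLSQP): pick w in [0, 1] with the lowest log loss."""
    best_w = min(
        (i / steps for i in range(steps + 1)),
        key=lambda w: log_loss(y_true, fuse(w, p_gbdt, p_knn)),
    )
    y_pred = [1 if p >= 0.5 else 0 for p in fuse(best_w, p_gbdt, p_knn)]
    return best_w, f1_score(y_true, y_pred)

# Hypothetical test labels and per-model anomaly probabilities.
y_true = [1, 0, 1, 0, 1, 0]
p_gbdt = [0.9, 0.2, 0.4, 0.1, 0.8, 0.3]
p_knn = [0.6, 0.4, 0.8, 0.2, 0.6, 0.6]
best_w, best_f1 = search_fusion_weight(p_gbdt, p_knn, y_true)
```

With only two models and a sum-to-one constraint the search space is one-dimensional, so a grid is adequate for illustration; a constrained optimizer such as SLSQP becomes more useful as the number of fused models grows.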
The model's index results show that, in the better cases, recall reaches 1 while precision is 0.96; that is, most of the data predicted to be abnormal is correctly identified, and all abnormal data is identified.
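To make these figures concrete, the short sketch below computes precision, recall and F1 from confusion counts. The counts (24 true positives, 1 false positive, 0 false negatives) are hypothetical values chosen only because they reproduce a precision of 0.96 and a recall of 1; the source does not report the underlying counts.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 for the abnormal class from confusion counts."""
    precision = tp / (tp + fp)  # share of flagged records that are truly abnormal
    recall = tp / (tp + fn)     # share of abnormal records that were flagged
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts: 25 records flagged, 24 of them truly abnormal, none missed.
precision, recall, f1 = precision_recall_f1(tp=24, fp=1, fn=0)
```

Under these assumed counts the F1 score is 48/49, roughly 0.98. For audit work, a recall of 1 means no abnormal record is missed, at the cost of a small number of false alarms for the auditor to dismiss.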
The invention also relates to an anomaly detection system for official vehicle audit, whose structure is shown in FIG. 2 and which comprises:
a standardized audit table module, used for acquiring a standardized official vehicle audit table, in which the refueling data set and the maintenance data set are each oversampled with the SMOTE algorithm;
a model construction module, used for fusing the GBDT ensemble algorithm and the KNN classification algorithm through fusion weights based on a weighted average method, the fusion weights being obtained by iterating over the data set until an iteration stop condition is reached, and for constructing an official vehicle audit model based on weighted-average GBDT_KNN;
a suspicious-point data calculation module, used for obtaining the fusion weights of the refueling data set and inputting them, together with the refueling data set, into the official vehicle audit model based on weighted-average GBDT_KNN to obtain refueling suspicious-point data; and for obtaining the fusion weights of the maintenance data set and inputting them, together with the maintenance data set, into the same model to obtain maintenance suspicious-point data;
a suspicious-point data integration module, used for integrating the refueling suspicious-point data and the maintenance suspicious-point data to obtain integrated suspicious-point data.
Further, the system also comprises:
a data acquisition module, used for collecting raw official vehicle data and performing feature selection and data integration on the raw data to obtain preprocessed data;
a feature construction module, used for constructing official vehicle audit features by combining the existing official vehicle features with the audit requirements, the official vehicle audit features comprising refueling features and maintenance features;
a data processing module, used for establishing, based on the preprocessed data, an official vehicle audit table containing the official vehicle audit features, the audit table comprising a refueling data set and a maintenance data set; cleaning the data of the audit table by deleting or filling null values, to obtain a cleaned official vehicle audit table; one-hot encoding the discrete features in the cleaned audit table to convert them into numerical features; and standardizing those numerical features whose attribute values differ widely, to obtain the standardized official vehicle audit table.
By using the system, the above-mentioned operation processing method can be executed and the corresponding technical effects can be achieved.
The embodiments of the present invention also provide a computer-readable storage medium capable of implementing all the steps of the method in the above embodiments, the computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements all the steps of the method in the above embodiments.
The embodiment of the invention also provides electronic equipment for executing the method, which is used as an implementation device of the method, and at least comprises a processor and a memory, wherein the memory is particularly used for storing data and related computer programs required by the execution method, such as a bus audit model based on weighted average GBDT_KNN, and the like, and all the steps of the implementation method are executed by calling the data and the programs in the memory by the processor, so that corresponding technical effects are obtained.
Preferably, the electronic device may comprise a bus architecture, and the bus may comprise any number of interconnected buses and bridges, the buses linking together various circuits, including the one or more processors and memory. The bus may also link together various other circuits such as peripheral devices, voltage regulators, power management circuits, etc., as are well known in the art and, therefore, will not be further described herein. The bus interface provides an interface between the bus and the receiver and transmitter. The receiver and the transmitter may be the same element, i.e. a transceiver, providing a unit for communicating with various other systems over a transmission medium. The processor is responsible for managing the bus and general processing, while the memory may be used to store data used by the processor in performing operations.
Additionally, the electronic device may further include a communication module, an input unit, an audio processor, a display, a power supply, and the like. The processor (or controllers, operational controls) employed may comprise a microprocessor or other processor device and/or logic devices that receives inputs and controls the operation of the various components of the electronic device; the memory may be one or more of a buffer, a flash memory, a hard drive, a removable medium, a volatile memory, a nonvolatile memory, or other suitable means, may store the above-mentioned related data information, may further store a program for executing the related information, and the processor may execute the program stored in the memory to realize information storage or processing, etc.; the input unit is used for providing input to the processor, and can be a key or a touch input device; the power supply is used for providing power for the electronic equipment; the display is used for displaying display objects such as images and characters, and may be, for example, an LCD display. The communication module is a transmitter/receiver that transmits and receives signals via an antenna. The communication module (transmitter/receiver) is coupled to the processor to provide an input signal and to receive an output signal, which may be the same as in the case of a conventional mobile communication terminal. Based on different communication technologies, a plurality of communication modules, such as a cellular network module, a bluetooth module, and/or a wireless local area network module, etc., may be provided in the same electronic device. The communication module (transmitter/receiver) is also coupled to the speaker and microphone via the audio processor to provide audio output via the speaker and to receive audio input from the microphone to implement the usual telecommunications functions. 
The audio processor may include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor is also coupled to the central processor so that sound can be recorded on the host through the microphone and sound stored on the host can be played through the speaker.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create a system for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
The foregoing is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims.
Claims (11)
1. An anomaly detection method for official vehicle auditing, characterized by comprising the following steps:
S1, acquiring a standardized official vehicle audit table, wherein a refueling data set and a maintenance data set in the standardized official vehicle audit table are each oversampled using the SMOTE algorithm;
S2, fusing the GBDT ensemble algorithm and the KNN classification algorithm through fusion weights based on a weighted average method, wherein the fusion weights are obtained by iterating over the data set repeatedly until an iteration stop condition is reached; and constructing an official vehicle audit model based on weighted average GBDT_KNN;
S3, acquiring the fusion weights of the refueling data set, and inputting the fusion weights of the refueling data set and the refueling data set into the official vehicle audit model based on weighted average GBDT_KNN to obtain refueling suspicious point data;
S4, acquiring the fusion weights of the maintenance data set, and inputting the fusion weights of the maintenance data set and the maintenance data set into the official vehicle audit model based on weighted average GBDT_KNN to obtain maintenance suspicious point data;
and S5, integrating the refueling suspicious point data and the maintenance suspicious point data to obtain integrated suspicious point data.
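Steps S2 to S5 can be illustrated with a minimal sketch. It assumes that each base model outputs an anomaly probability per record and that a record is flagged when the fused probability reaches a cutoff of 0.5; the cutoff, the function names, the record identifiers and the numeric values are all hypothetical and are not taken from the claims.

```python
from typing import Dict, List, Sequence


def fuse_probabilities(p_gbdt: Sequence[float], p_knn: Sequence[float],
                       w_gbdt: float, w_knn: float) -> List[float]:
    """Weighted average of the two models' anomaly probabilities."""
    total = w_gbdt + w_knn
    if total <= 0:
        raise ValueError("fusion weights must sum to a positive value")
    return [(w_gbdt * a + w_knn * b) / total for a, b in zip(p_gbdt, p_knn)]


def flag_suspicious(record_ids: Sequence[str], fused: Sequence[float],
                    source: str, cutoff: float = 0.5) -> List[Dict[str, object]]:
    """Keep records whose fused probability reaches the (assumed) cutoff."""
    return [{"id": rid, "source": source, "score": round(p, 4)}
            for rid, p in zip(record_ids, fused) if p >= cutoff]


# Hypothetical model outputs: three refueling records, two maintenance records.
# Each data set uses its own fusion weights, as in steps S3 and S4.
refuel = flag_suspicious(
    ["R1", "R2", "R3"],
    fuse_probabilities([0.9, 0.2, 0.6], [0.7, 0.1, 0.3], w_gbdt=0.6, w_knn=0.4),
    source="refueling")
maint = flag_suspicious(
    ["M1", "M2"],
    fuse_probabilities([0.4, 0.8], [0.5, 0.9], w_gbdt=0.3, w_knn=0.7),
    source="maintenance")

# Step S5: integrate both suspicious point lists, highest score first.
integrated = sorted(refuel + maint, key=lambda r: r["score"], reverse=True)
print(integrated)
```

Keeping a separate weight pair per data set mirrors the claim wording, in which the refueling and maintenance data sets each have their own fusion weights.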
2. The method according to claim 1, wherein before step S1 the method further comprises the following steps:
collecting raw official vehicle data; performing feature selection and data integration on the raw official vehicle data to obtain preprocessed data;
constructing official vehicle audit features from the existing official vehicle features in combination with audit requirements, wherein the official vehicle audit features comprise: a refueling feature and a maintenance feature;
establishing an official vehicle audit table containing the official vehicle audit features based on the preprocessed data; the official vehicle audit table comprises: a refueling data set and a maintenance data set;
cleaning the data of the official vehicle audit table by deleting or filling missing values to obtain a cleaned official vehicle audit table;
performing one-hot encoding on the discrete features in the cleaned official vehicle audit table to convert them into numerical features; and standardizing those numerical features whose attribute values differ greatly in scale to obtain the standardized official vehicle audit table.
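The cleaning, encoding and standardization steps of claim 2 can be sketched as follows. This is only an illustration under stated assumptions: the column names (`fuel_type`, `liters`, `amount`), the rule of deleting rows that lack a discrete value while filling numeric gaps with the column mean, and the use of z-score standardization are all hypothetical choices that the claim does not specify.

```python
from statistics import mean, pstdev
from typing import Dict, List, Optional

Row = Dict[str, Optional[object]]

# Hypothetical rows of an audit table; the column names are assumed.
rows: List[Row] = [
    {"fuel_type": "92", "liters": 40.0, "amount": 320.0},
    {"fuel_type": "95", "liters": None, "amount": 410.0},
    {"fuel_type": "92", "liters": 55.0, "amount": 450.0},
    {"fuel_type": None, "liters": 38.0, "amount": 300.0},
]


def clean(rows: List[Row], required: List[str], fillable: List[str]) -> List[Row]:
    """Delete rows missing a required field; fill other gaps with the column mean."""
    kept = [dict(r) for r in rows if all(r[c] is not None for c in required)]
    for col in fillable:
        fill = mean(r[col] for r in kept if r[col] is not None)
        for r in kept:
            if r[col] is None:
                r[col] = fill
    return kept


def one_hot(rows: List[Row], col: str) -> List[Row]:
    """Replace one discrete column with one 0/1 column per category."""
    categories = sorted({r[col] for r in rows})
    encoded = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != col}
        for c in categories:
            new[f"{col}_{c}"] = 1.0 if r[col] == c else 0.0
        encoded.append(new)
    return encoded


def standardize(rows: List[Row], cols: List[str]) -> List[Row]:
    """Z-score the listed numeric columns (zero mean, unit variance)."""
    out = [dict(r) for r in rows]
    for col in cols:
        values = [r[col] for r in out]
        mu, sigma = mean(values), pstdev(values)
        for r in out:
            r[col] = (r[col] - mu) / sigma if sigma else 0.0
    return out


cleaned = clean(rows, required=["fuel_type"], fillable=["liters"])
encoded = one_hot(cleaned, "fuel_type")
standardized = standardize(encoded, ["liters", "amount"])
print(standardized)
```

Only the wide-ranging numeric columns are standardized here; the one-hot columns are left as 0/1 indicators.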
3. The method of claim 1, wherein the iteration stop condition is that an evaluation index reaches a preset threshold; the preset threshold is 0.96.
4. The method according to claim 3, wherein step S2 comprises the following substeps:
randomly initializing prediction weights; calculating the prediction weights through the GBDT algorithm and the KNN algorithm to obtain a predicted weight result;
minimizing a loss function through a minimization method, and optimizing the initial weights under constraint conditions to obtain a minimized weight result;
performing a weighted calculation on the minimized weight result and the predicted weight result to obtain a weighted predicted value;
inputting the weighted predicted value into an index function for evaluating the weights to obtain an evaluation index for each iteration;
comparing the evaluation index of each iteration with the preset threshold;
if the evaluation index of an iteration reaches the preset threshold, stopping the iteration and outputting the evaluation index and the weighted predicted value of that iteration; the weighted predicted value is the fusion weight.
5. The method of claim 1, wherein the minimization method is a sequential least squares method.
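The iterative weight search of claims 3 to 5 can be sketched as a loop that starts from a random weight, minimizes a loss under constraints, scores the fused prediction and stops once the score reaches the preset threshold of 0.96. Several points in this sketch are assumptions: the loss is taken to be log loss, the evaluation index is taken to be accuracy at a 0.5 cutoff, and projected gradient descent stands in for the sequential least squares minimizer named in claim 5 so that the example needs only the standard library. Where SciPy is available, `scipy.optimize.minimize` with `method="SLSQP"` would be a closer match. All function names and data values are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def fuse(w: float, p_gbdt: Sequence[float], p_knn: Sequence[float]) -> List[float]:
    """Weights are (w, 1 - w): both non-negative and summing to one."""
    return [w * a + (1.0 - w) * b for a, b in zip(p_gbdt, p_knn)]


def loss_gradient(w: float, y: Sequence[int],
                  p_gbdt: Sequence[float], p_knn: Sequence[float]) -> float:
    """Derivative of the mean log loss of the fused prediction with respect to w."""
    eps = 1e-12
    total = 0.0
    for t, a, b in zip(y, p_gbdt, p_knn):
        p = min(max(w * a + (1.0 - w) * b, eps), 1.0 - eps)
        total += (p - t) / (p * (1.0 - p)) * (a - b)
    return total / len(y)


def minimise_weight(w0: float, y, p_gbdt, p_knn,
                    steps: int = 200, lr: float = 0.5) -> float:
    """Projected gradient descent on w in [0, 1] (stand-in for SLSQP)."""
    w = w0
    for _ in range(steps):
        w = min(1.0, max(0.0, w - lr * loss_gradient(w, y, p_gbdt, p_knn)))
    return w


def accuracy(y: Sequence[int], p: Sequence[float], cutoff: float = 0.5) -> float:
    """Assumed evaluation index: share of records classified correctly."""
    return sum(int(q >= cutoff) == t for t, q in zip(y, p)) / len(y)


def search_fusion_weight(y, p_gbdt, p_knn, threshold: float = 0.96,
                         max_iter: int = 20, seed: int = 0) -> Tuple[float, float]:
    """Repeat random start, minimise, evaluate; stop once the index reaches the threshold."""
    rng = random.Random(seed)
    best_w, best_score = 0.0, -1.0
    for _ in range(max_iter):
        w = minimise_weight(rng.random(), y, p_gbdt, p_knn)
        score = accuracy(y, fuse(w, p_gbdt, p_knn))
        if score > best_score:
            best_w, best_score = w, score
        if score >= threshold:
            break
    return best_w, best_score


# Hypothetical labels (1 = anomalous) and base-model probabilities.
y_true = [1, 0, 1, 0, 1, 0]
p_gbdt = [0.9, 0.2, 0.4, 0.1, 0.8, 0.6]
p_knn = [0.6, 0.3, 0.8, 0.6, 0.7, 0.2]

weight, score = search_fusion_weight(y_true, p_gbdt, p_knn)
print(f"GBDT weight = {weight:.3f}, KNN weight = {1 - weight:.3f}, index = {score:.2f}")
```

In this toy data each base model misclassifies at least one record on its own, while the fused prediction classifies all six correctly, which is the situation the weighted average is meant to exploit.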
6. The method of claim 1, wherein the step S1 further comprises: dividing the over-sampled fueling data set into a fueling data training set and a fueling data testing set; the over-sampled maintenance data set is divided into a maintenance data training set and a maintenance data testing set.
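The oversampling in step S1 and the split in claim 6 can be sketched with a simplified SMOTE-style routine: each synthetic sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This is a teaching sketch, not the reference SMOTE implementation (the imbalanced-learn library provides a `SMOTE` class for practical use). The feature vectors, the neighbour count `k`, the 75/25 split ratio and the function names are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def smote_like(minority: Sequence[Vector], n_new: int,
               k: int = 2, seed: int = 0) -> List[Vector]:
    """Create n_new synthetic points, each between a minority sample
    and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((m for m in minority if m is not base),
                            key=lambda m: math.dist(base, m))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def split(samples: Sequence[Vector], labels: Sequence[int],
          test_ratio: float = 0.25, seed: int = 0):
    """Shuffle indices, then cut into a training part and a testing part."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_ratio))
    train, test = idx[:cut], idx[cut:]
    return ([samples[i] for i in train], [labels[i] for i in train],
            [samples[i] for i in test], [labels[i] for i in test])


# Hypothetical two-feature records: six normal, three anomalous.
normal = [(1.0, 2.0), (1.2, 1.8), (0.9, 2.1), (1.1, 2.2), (1.3, 1.9), (1.0, 1.7)]
anomalous = [(5.0, 9.0), (6.0, 8.5), (5.5, 9.5)]

# Oversample the anomalous class until both classes have the same size.
synthetic = smote_like(anomalous, n_new=len(normal) - len(anomalous))
samples = normal + anomalous + synthetic
labels = [0] * len(normal) + [1] * (len(anomalous) + len(synthetic))

train_x, train_y, test_x, test_y = split(samples, labels)
print(len(train_x), len(test_x))
```

The same two functions would be applied once to the refueling data set and once to the maintenance data set, since the claims treat the two data sets separately.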
7. An anomaly detection system for official vehicle auditing, comprising:
a standardized official vehicle audit table module, used for acquiring a standardized official vehicle audit table, wherein a refueling data set and a maintenance data set in the standardized official vehicle audit table are each oversampled using the SMOTE algorithm;
a model construction module, used for fusing the GBDT ensemble algorithm and the KNN classification algorithm through fusion weights based on a weighted average method, wherein the fusion weights are obtained by iterating over the data set repeatedly until an iteration stop condition is reached; and constructing an official vehicle audit model based on weighted average GBDT_KNN;
a suspicious point data calculation module, used for acquiring the fusion weights of the refueling data set, and inputting the fusion weights of the refueling data set and the refueling data set into the official vehicle audit model based on weighted average GBDT_KNN to obtain refueling suspicious point data; and acquiring the fusion weights of the maintenance data set, and inputting the fusion weights of the maintenance data set and the maintenance data set into the official vehicle audit model based on weighted average GBDT_KNN to obtain maintenance suspicious point data;
and a suspicious point data integration module, used for integrating the refueling suspicious point data and the maintenance suspicious point data to obtain integrated suspicious point data.
8. The system according to claim 7, further comprising:
a data acquisition module, used for collecting raw official vehicle data, and performing feature selection and data integration on the raw official vehicle data to obtain preprocessed data;
a feature construction module, used for constructing official vehicle audit features from the existing official vehicle features in combination with audit requirements, wherein the official vehicle audit features comprise: a refueling feature and a maintenance feature;
a data processing module, used for establishing an official vehicle audit table containing the official vehicle audit features based on the preprocessed data, the official vehicle audit table comprising: a refueling data set and a maintenance data set; cleaning the data of the official vehicle audit table by deleting or filling missing values to obtain a cleaned official vehicle audit table; performing one-hot encoding on the discrete features in the cleaned official vehicle audit table to convert them into numerical features; and standardizing those numerical features whose attribute values differ greatly in scale to obtain the standardized official vehicle audit table.
9. A computer-readable storage medium, wherein a computer program is stored on the storage medium, which when executed by a processor, implements the anomaly detection method for official vehicle auditing of any one of claims 1 to 6.
10. An electronic device comprising a processor and a memory;
the memory is used for storing an official vehicle audit model based on weighted average GBDT_KNN;
the processor is configured to execute the anomaly detection method for official vehicle auditing according to any one of claims 1 to 6 by invoking the official vehicle audit model based on weighted average GBDT_KNN.
11. A computer program product comprising computer programs and/or instructions which, when executed by a processor, implement the steps of the anomaly detection method for official vehicle auditing of any of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310789305.4A CN116881824A (en) | 2023-06-30 | 2023-06-30 | Anomaly detection method and system for official vehicle audit |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116881824A true CN116881824A (en) | 2023-10-13 |
Family ID: 88259627
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310789305.4A Pending CN116881824A (en) | 2023-06-30 | 2023-06-30 | Anomaly detection method and system for official vehicle audit |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116881824A (en) |
- 2023-06-30: CN application CN202310789305.4A (publication CN116881824A), status: active, Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113434485B (en) | Data quality health degree analysis method and system based on multidimensional analysis technology | |
CN109344906B (en) | User risk classification method, device, medium and equipment based on machine learning | |
CN105487970B (en) | A kind of method for showing interface and device | |
CN111582341B (en) | User abnormal operation prediction method and device | |
CN109254912A (en) | A kind of method and device of automatic test | |
CN112037007A (en) | Credit approval method for small and micro enterprises and electronic equipment | |
CN111582488A (en) | Event deduction method and device | |
CN115576834A (en) | Software test multiplexing method, system, terminal and medium for supporting fault recovery | |
CN112882934B (en) | Test analysis method and system based on defect growth | |
CN113569988B (en) | Algorithm model evaluation method and system | |
CN116452154B (en) | Project management system suitable for communication operators | |
CN111062827B (en) | Engineering supervision method based on artificial intelligence mode | |
LU505740B1 (en) | Data monitoring method and system | |
CN115423600B (en) | Data screening method, device, medium and electronic equipment | |
CN116775741A (en) | Auditing method and related device for completion resolution of engineering | |
CN114665986B (en) | Bluetooth key testing system and method | |
CN116881824A (en) | Anomaly detection method and system for official vehicle audit | |
CN112783799B (en) | Software daemon testing method and device | |
CN115099934A (en) | High-latency customer identification method, electronic equipment and storage medium | |
CN114595216A (en) | Data verification method and device, storage medium and electronic equipment | |
CN114564405A (en) | Test case checking method and system based on log monitoring | |
CN115758135B (en) | Track traffic signal system function demand tracing method and device and electronic equipment | |
CN113536672B (en) | Target object processing method and device | |
CN116860622A (en) | Intelligent and efficient test coverage risk analysis method and system | |
CN117520132A (en) | Bottleneck analysis method and device for system performance test |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||