CN113988488A - Method for predicting ETC passing probability of vehicle by multiple factors - Google Patents

Method for predicting ETC passing probability of vehicle by multiple factors

Info

Publication number
CN113988488A
Authority
CN
China
Prior art keywords
probability
result
decision tree
data
factor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111610092.1A
Other languages
Chinese (zh)
Other versions
CN113988488B (en)
Inventor
朱广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Yihi Information Technology Service Co ltd
Shanghai Yihi Chengshan Automobile Rental Co ltd
Original Assignee
Shanghai Yihi Information Technology Service Co ltd
Shanghai Yihi Chengshan Automobile Rental Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Yihi Information Technology Service Co ltd, Shanghai Yihi Chengshan Automobile Rental Co ltd filed Critical Shanghai Yihi Information Technology Service Co ltd
Priority to CN202111610092.1A priority Critical patent/CN113988488B/en
Publication of CN113988488A publication Critical patent/CN113988488A/en
Application granted granted Critical
Publication of CN113988488B publication Critical patent/CN113988488B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 - Computing arrangements using knowledge-based models
    • G06N5/01 - Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 - Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 - Services
    • G06Q50/26 - Government or public services
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Physics (AREA)
  • Game Theory and Decision Science (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Operations Research (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Educational Administration (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention provides a method for predicting the probability of a vehicle passing through ETC from multiple factors, comprising the following steps: S1, establishing, according to historical data, a decision tree model for predicting the probability of the vehicle passing through the ETC based on multiple factors; S2, inputting real-time data into the decision tree model of S1 for computation to obtain a prediction result of whether the vehicle corresponding to the real-time data passes through the ETC; and S3, comparing the actual result with the prediction result obtained in S2 and adjusting the decision tree model to improve prediction accuracy. The method can predict real-time ETC data from historical data and the model algorithm, greatly reducing the cost of data matching and checking; by comparing predicted and actual results, the model continuously learns on its own and its computing capability is strengthened.

Description

Method for predicting ETC passing probability of vehicle by multiple factors
Technical Field
The invention relates to the field of vehicle driving, in particular to a method for predicting the probability of a vehicle passing ETC.
Background
At present, data on vehicles passing through an ETC (electronic toll collection) system are generally acquired after the fact and by manual means: information such as road sections and fees is obtained periodically from a road-network operator or a bank, and is then matched and settled.
In this traditional ETC data acquisition and processing mode, data acquisition suffers from a long lag on the one hand, while manual matching and reconciliation require a large amount of manpower on the other, so the cost is high.
The invention provides a method for predicting the probability that a vehicle passes through ETC from multiple factors. Historical ETC data that actually occurred are input into a model for training, which yields, over the full historical data, the probability that a vehicle passes through the ETC under different combinations of factors. After new order data are generated, all known factor data are input into the model to obtain the probability of passing through the ETC under the various factor combinations, and the decision results of all factor groups are finally voted on to obtain the overall probability of passing through the ETC.
Because the factor data are known in real time, the prediction is also available in real time, which solves the lag problem of the traditional scheme. After the actual data are obtained, the predicted value is compared with the actual value to ensure final consistency of the data; the actual value then re-enters the model as historical data for the computation, further strengthening the model's predictive ability.
Disclosure of Invention
At present, the traditional ETC data acquisition and processing mode has a long lag, and manual matching and accounting require a large amount of manpower, so the cost is high. To solve these problems, the invention provides a method for predicting the ETC passing probability of a vehicle from multiple factors: it can predict real-time ETC data from historical data and a model algorithm, greatly reducing the cost of data matching and checking; by comparing predicted and actual results, the model continuously learns on its own, strengthening its computing capability; and because the method works in real time, the ETC passage probability and fee can be predicted immediately and pre-charged, improving customer convenience and saving settlement cost.
In order to achieve the above object, the present invention provides a method for predicting a probability of a vehicle passing an ETC by multiple factors, comprising the following steps:
S1, according to historical data, establishing a decision tree model for predicting the probability of the vehicle passing through the ETC based on M factors;
S2, inputting real-time data into the decision tree model of S1 for computation to obtain a prediction result of whether the vehicle corresponding to the real-time data passes through the ETC;
and S3, comparing the actual result with the prediction result obtained in S2, adjusting the decision tree model, and enhancing the accuracy of prediction.
Wherein, the step of S1 further comprises the following steps:
S11, calculating the probability value of each factor passing through the ETC according to historical data, and accordingly determining the splitting priority of all M factors;
S12, randomly extracting m factors from the M factors, where the value range of m is 2 < m < M, and establishing a decision tree for the m factors according to the splitting priority determined in S11;
S13, repeating the process of S12 until all factor combinations are traversed, generating a large number of decision trees that form a random forest;
and S14, counting, over the full historical data, the probability that each leaf node of all decision trees in S13 passes the ETC, to obtain the decision tree model.
Wherein, the step of S11 further comprises the following steps:
S111, calculating a probability value of each factor passing through the ETC;
and S112, determining the splitting priority of all factors.
Wherein, the step of S12 further comprises the following steps:
S121, according to the ETC passing probability value of each factor calculated in S11, selecting the factor with the highest splitting priority among the m factors as the root node of the decision tree for splitting;
S122, establishing child nodes for the different values of the factor with the highest splitting priority, generating the second-layer nodes;
S123, for the second-layer nodes, selecting the factor with the next-highest splitting priority (i.e., the largest information gain) for splitting;
and S124, repeating this process, selecting splitting factors in order of splitting priority from high to low, until no factor remains to be selected.
Wherein, in the process of establishing the decision tree, splitting stops if the branch probability exceeds the decision probability P1, or once all m factors have been split.
Wherein, the step S2 specifically includes the following steps:
S21, when a new piece of real-time data is generated, substituting all factor data of that record into the decision tree model obtained in S1 to obtain, in each decision tree, the probability of passing the ETC at every leaf node meeting the conditions;
S22, all leaf nodes meeting the conditions participate in voting to obtain the prediction result and the adjusted reference probability P2: if any leaf node shows a probability of passing the ETC that exceeds the decision probability P1, the voting ends and the prediction result is that the vehicle can pass the ETC, and the value of P2 is the equal-weight average of all qualifying leaf nodes whose probabilities are less than the decision probability P1; if no leaf node's probability exceeds the decision probability P1, the prediction result is that the vehicle cannot pass the ETC, and the value of P2 is the equal-weight average of all qualifying leaf nodes.
Specifically, in S3, after the actual result occurs, the actual result is returned to the decision tree model in S1, and compared with the predicted result obtained in S2:
if the comparison result is consistent, the data are stored in the source database, further enhancing the prediction accuracy of the decision tree model in S1;
if the comparison result is not consistent, the actual data and the result are returned to the decision tree model in S1, the newly added data are subjected to probability operation at regular time, the probability value of the related single factor in the decision tree model is updated, the splitting sequence of the decision tree is changed, the result data of each leaf node is changed at the same time, and the decision tree model is refreshed.
If an (M+1)-th factor is found in the comparison result, the S1 process of the model is re-run and the model is expanded.
Wherein, in the process of establishing the model, the value of the decision probability P1 is fixed; after the actual results are obtained, the value of P1 can be adjusted according to the prediction accuracy.
Wherein, if the comparison between the actual result and the predicted result shows that the prediction accuracy is higher than that required by the product, the value of the decision probability P1 can be gradually reduced while ensuring that the adjusted P1 remains greater than the adjusted reference probability P2; if the comparison shows that the prediction accuracy is lower than that required by the product, the value of P1 is gradually increased.
In summary, the present invention inputs the ETC data actually occurred in history into the model for training. And obtaining the probability condition that the vehicle passes through the ETC under the condition of different kinds of factor combinations according to the total amount of historical data. And finally voting decision results of all groups of factors to obtain the probability of finally passing the ETC. Because the factor data is known in real time, the prediction of the outcome is also available in real time. The problem of hysteresis in the traditional scheme is solved. After actual data are obtained, comparison between the predicted value and the actual value is carried out, and final consistency of the data is guaranteed. And after the actual value is obtained, the model is entered again to become historical data of operation, and the prediction capability of the model is further enhanced.
The method can predict the real-time ETC data according to the historical data and the model algorithm, thereby greatly reducing the cost of data matching and checking; the invention can continuously learn by self by comparing the data of the prediction result and the actual result, thereby strengthening the operational capability of the model.
Drawings
FIG. 1 is a schematic representation of the steps of a method of multi-factor prediction of the probability of a vehicle passing an ETC in accordance with the present invention;
fig. 2 is a decision tree obtained according to 5 factors in the embodiment.
Detailed Description
Technical solutions, structural features, achieved objects and effects in the embodiments of the present invention will be described in detail below with reference to fig. 1 to 2 in the embodiments of the present invention.
It should be noted that the drawings are simplified in form and not drawn to precise scale; they are provided only for convenience and clarity in describing the embodiments of the present invention and do not limit the conditions of those embodiments. Any structural modification, change in proportional relationship, or adjustment in size that does not affect the function and purpose achievable by the present invention should fall within the scope of its technical content.
It is to be noted that, in the present invention, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
A method for predicting the probability of a vehicle passing the ETC by multiple factors, as shown in FIG. 1, comprises the following steps.
S1, establishing, according to historical data, a decision tree model for predicting the ETC passing probability of the vehicle from multiple factors;
S11, calculating the probability value of each factor passing through the ETC, and determining the splitting priority of all factors;
S111, calculating a probability value of each factor passing through the ETC;
assuming that there are M factors, M is 10 in the present embodiment, and the 10 factors are "ETC entrance passing speed >5km/h (hereinafter, referred to as entrance vehicle speed >5 km/h)", "ETC exit passing speed >5km/h (hereinafter, referred to as exit vehicle speed >5 km/h)", "entrance congestion level < 3", "exit congestion level < 3", "driver driving age >5 years", "driver gender = male", "rainy while passing", "time of passing daytime", "legal holiday", "road category is national road", wherein a congestion level of 1 indicates heavy congestion, a level of 2 indicates medium congestion, a level of 3 indicates light congestion, and a level of 4 indicates clear traffic.
The probability of passing the ETC is calculated factor by factor from all historical data. For example, all records containing a certain factor A are extracted from the historical data, giving x records; if y of these x records actually passed the ETC in the end, the probability that factor A passes the ETC is y/x. Specifically, suppose 500 records in the historical data correspond to "entrance vehicle speed >5 km/h", and 480 of those 500 records passed the ETC in the final actual situation; then, when the factor "entrance vehicle speed >5 km/h" occurs, the probability of passing the ETC is 480/500, about 96%.
Repeating the above process, the probability value of each factor passing through the ETC can be obtained.
S112, determining the splitting priority of all factors;
and determining the splitting priority according to the probability value of each factor from large to small, wherein the higher the probability value passing through the ETC is, the higher the splitting priority is, namely, the factor with higher splitting priority is selected, and when the probability values of the two factors are equal, one factor is selected to split preferentially at random.
S12, randomly extracting m factors from the M factors, where the value range of m is 2 < m < M, and establishing a decision tree for the m factors according to the splitting priority determined in S11;
in practice, the value of m can be adjusted according to the size of M and the prediction results observed after operation. In this embodiment m is 5, and the m factors are "entrance vehicle speed >5 km/h", "exit vehicle speed >5 km/h", "entrance congestion level <3", "exit congestion level <3", and "driver driving age >5 years".
S121, according to the probability value of each factor passing through the ETC calculated in S11, the factor with the highest probability value, i.e., the largest information gain, is selected from the 5 factors in S12 as the root node of the decision tree for splitting; among these 5 factors the highest splitting priority belongs to "entrance vehicle speed >5 km/h";
S122, child nodes are established for the different values of the factor with the highest splitting priority, generating the second-layer nodes; in this embodiment the two values are "entrance vehicle speed >5 km/h" satisfied and not satisfied;
S123, for the second-layer nodes, the factor with the next-highest splitting priority is selected according to information gain for further splitting; in this embodiment, "exit vehicle speed >5 km/h" is selected as the factor for the next split;
S124, this process is repeated, selecting splitting factors in order of splitting priority from high to low until no factor remains; in this embodiment, "entrance congestion level <3", "exit congestion level <3", and "driver driving age >5 years" are selected in turn as the splitting factors.
In the above S12 process, if no pruning is performed, the m factors form a decision tree with m layers in total; 2^m − 1 splits occur, producing 2^m results. "No pruning" means that child nodes are established in turn for the factors at every level and no branch is omitted; the probability at a child node established for a particular value of a factor is the branch probability.
In the process of establishing the decision tree, splitting stops if the branch probability exceeds the decision probability P1, or once all m factors have been split. In this embodiment the decision probability P1 is 95%; the decision tree obtained from the 5 factors selected in this embodiment is shown in fig. 2.
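As a rough sketch only, one way such a tree could be built in code, under the same assumed record layout as the earlier sketch; the helper names (`subset_pass_probability`, `build_tree`) and the nested-dict tree representation are illustrative assumptions, not the patent's own implementation.

```python
def subset_pass_probability(history, conditions):
    """ETC-pass probability among historical records matching every (factor, value) pair."""
    matching = [r for r in history
                if all(r["factors"].get(f) == v for f, v in conditions)]
    if not matching:
        return None
    return sum(r["passed_etc"] for r in matching) / len(matching)

def build_tree(history, ordered_factors, p1, conditions=()):
    """ordered_factors: the m chosen factors, sorted by splitting priority (S11).
    Returns a nested dict whose leaves carry the pass probability of their condition path."""
    prob = subset_pass_probability(history, conditions)
    # Stop splitting when the branch probability exceeds P1, or when no factor is left.
    if not ordered_factors or (conditions and prob is not None and prob > p1):
        return {"leaf": True, "conditions": conditions, "probability": prob}
    factor, rest = ordered_factors[0], ordered_factors[1:]
    return {
        "leaf": False,
        "factor": factor,
        # One child per value of the binary factor: condition satisfied / not satisfied.
        "children": {value: build_tree(history, rest, p1, conditions + ((factor, value),))
                     for value in (True, False)},
    }
```

With P1 = 0.95 this reproduces the shape of fig. 2: the "entrance vehicle speed >5 km/h satisfied" branch stops immediately at 96%, while the unsatisfied branch keeps splitting on the remaining factors.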
S13, repeating the process of S12 until all factor combinations are traversed, and generating a large number of decision trees to form a random forest;
"All factor combinations" means every way of randomly drawing m attributes from the M attributes; the number of possible factor combinations can be calculated from the combination formula C(M, m), and it equals the number of decision trees generated in the subsequent steps.
For example, if there are 4 factors in total and 3 of them are randomly drawn to form a combination, there are 4 possible combinations and finally 4 decision trees are formed; in this embodiment, M is 10 and m is 5, so 252 decision trees are finally formed.
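A brief sketch of this forest construction, reusing the assumed helpers from the previous sketches (`splitting_priority`, `build_tree`); Python's `itertools.combinations` and `math.comb` simply enumerate and count the C(M, m) combinations.

```python
from itertools import combinations
from math import comb

def build_forest(history, all_factors, m, p1):
    """One decision tree per m-factor combination drawn from the M factors (S12-S13)."""
    priority = splitting_priority(history)                 # from the earlier sketch
    rank = {f: i for i, f in enumerate(priority)}
    forest = []
    for combo in combinations(all_factors, m):             # C(M, m) combinations in total
        ordered = sorted(combo, key=lambda f: rank.get(f, len(rank)))  # highest priority first
        forest.append(build_tree(history, ordered, p1))
    assert len(forest) == comb(len(all_factors), m)        # with M = 10, m = 5: 252 trees
    return forest
```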
S14, counting the probability that leaf nodes of all decision trees in S13 pass through ETC in the full-scale historical data to obtain a decision tree model;
in the decision tree shown in fig. 2, the end point of each branch is a leaf node, each leaf node represents the probability that a vehicle passes through the ETC under certain factor combinations, and the probability is obtained by statistical calculation through historical data: in the total amount of historical data, the statistics accord with the' entrance vehicle speed>The probability that the vehicle with the condition of 5km/h passes through the ETC is 96 percent and is greater than the set judgment probability P1If yes, stopping splitting to obtain a leaf node; statistics of non-compliance with' entry speed of vehicle>5km/h 'but corresponding to' exit vehicle speed>The probability that the vehicle with the condition of 5km/h passes through the ETC is 96 percent and is greater than the set judgment probability P1If yes, stopping splitting to obtain a leaf node; statistics of non-compliance with' entry speed of vehicle>5km/h 'and not corresponding to' exit vehicle speed>5km/h 'and meets the requirement of' entrance congestion<3' vehicle, the probability of passing ETC is less than the set judgment probability P1Then continue splitting; and repeating the steps to obtain each leaf node.
The leaf nodes in the decision tree of fig. 2 represent the following meanings:
the leaf node a represents "the probability of a vehicle passing through the ETC is 96% when the condition entrance vehicle speed >5km/h is satisfied";
the leaf node b indicates "when the condition entrance vehicle speed >5km/h is not satisfied and the condition exit vehicle speed >5km/h is satisfied, the probability that the vehicle passes the ETC is 96%";
the leaf node c indicates "when the conditions of entrance vehicle speed >5km/h and exit vehicle speed >5km/h are not satisfied, entrance congestion <3 satisfied, exit congestion <3 satisfied, driving age >5 years satisfied, the probability of the vehicle passing the ETC is 90%";
the leaf node d indicates "when the conditions of entrance vehicle speed >5km/h and exit vehicle speed >5km/h are not satisfied, entrance congestion <3 satisfied, exit congestion <3 satisfied, driving age >5 years are not satisfied, the probability of the vehicle passing the ETC is 85%";
a leaf node e indicates "when the conditions of entrance vehicle speed >5km/h and exit vehicle speed >5km/h are not satisfied, entrance congestion <3 satisfied, exit congestion <3 unsatisfied, driving age >5 years satisfied, the probability of the vehicle passing the ETC is 85%";
the leaf node f indicates "when the conditions of entrance vehicle speed >5km/h and exit vehicle speed >5km/h are not satisfied, entrance congestion <3 satisfied, exit congestion <3 unsatisfied, driving age >5 years are not satisfied, the probability of the vehicle passing the ETC is 80%";
the leaf node g indicates "when the conditions of entrance vehicle speed >5km/h and exit vehicle speed >5km/h are not satisfied, entrance congestion <3 is not satisfied, exit congestion <3 is satisfied, and driving age >5 years is satisfied, the probability that the vehicle passes through the ETC is 90%";
a leaf node h represents "when the conditions of entrance vehicle speed >5km/h and exit vehicle speed >5km/h are not satisfied, entrance congestion <3 is not satisfied, exit congestion <3 is satisfied, and driving age >5 years is not satisfied, the probability that a vehicle passes through ETC is 85%";
the leaf node i indicates "when the conditions of entrance vehicle speed >5km/h and exit vehicle speed >5km/h are not satisfied, entrance congestion <3 is not satisfied, exit congestion <3 is not satisfied, and driving age >5 years are satisfied, the probability that the vehicle passes through the ETC is 85%";
the leaf node j indicates "when the conditions of entrance vehicle speed >5km/h and exit vehicle speed >5km/h are not satisfied, entrance congestion <3 is not satisfied, exit congestion <3 is not satisfied, and driving age >5 years is not satisfied, the probability of the vehicle passing the ETC is 80%".
Repeating the above process can obtain all leaf nodes of all decision trees.
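Continuing the same illustrative sketch (not the patent's own code), collecting every leaf with its condition path and probability is essentially what the resulting "decision tree model" stores.

```python
def collect_leaves(tree, acc=None):
    """Flatten a tree from build_tree into a list of (conditions, probability) leaves."""
    if acc is None:
        acc = []
    if tree["leaf"]:
        acc.append((tree["conditions"], tree["probability"]))
    else:
        for child in tree["children"].values():
            collect_leaves(child, acc)
    return acc
```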
S2, inputting the real-time data into the decision tree model in the S1 for operation to obtain a prediction result of the vehicle corresponding to the real-time data passing through the ETC;
and S21, when a new piece of real-time data is generated, substituting all factor data of the piece of data into the decision tree model obtained in the S1 to obtain the probability that each leaf node meeting the conditions passes through the ETC in each decision tree, and thus obtaining the probability that the vehicle passes through the ETC under different factor combinations of the piece of real-time data.
Wherein, a leaf node meeting the condition means that all factors of the leaf node are in the factor data of the piece of data; for example, factors in a new piece of real-time data include: the method is characterized in that the method does not meet the conditions of entrance vehicle speed >5km/h, exit vehicle speed >5km/h, entrance congestion <3, exit congestion <3, driving age >5 years, sex = male driver, raining when passing is not satisfied, passing time in the daytime, legal holidays on the same day and national road type, and in the decision tree shown in fig. 2, the leaf node meeting the conditions is c and the probability is 90%.
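Under the same assumptions as before, selecting the qualifying leaves for a new record could look like the following sketch, where `factors` is the record's factor dictionary and only leaves whose whole condition path matches are kept.

```python
def qualifying_leaves(forest, factors):
    """Probabilities of every leaf whose whole condition path matches the new record (S21)."""
    probs = []
    for tree in forest:
        for conditions, prob in collect_leaves(tree):
            if prob is not None and all(factors.get(f) == v for f, v in conditions):
                probs.append(prob)
    return probs
```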
S22, all leaf nodes meeting the conditions participate in voting to obtain the prediction result and the adjusted reference probability P2: if any leaf node shows a probability of passing the ETC that exceeds the decision probability P1, the voting ends and the prediction result is that the vehicle can pass the ETC, and the value of P2 is the equal-weight average of all qualifying leaf nodes whose probabilities are less than the decision probability P1; if no leaf node's probability exceeds the set decision probability P1, the prediction result is that the vehicle cannot pass the ETC, and the value of P2 is the equal-weight average of all qualifying leaf nodes.
In this embodiment P1 is 95%: if any leaf node shows a probability of passing the ETC above 95%, the prediction result is that the vehicle can pass the ETC; if all qualifying leaf nodes show probabilities below 95%, the prediction result is that the vehicle cannot pass the ETC.
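The voting rule as an illustrative sketch; the handling of the edge case where every qualifying leaf is at or above P1 is not spelled out in the description and is an assumption here.

```python
def vote(leaf_probs, p1):
    """S22 voting rule: returns (predicted_to_pass, reference probability P2)."""
    if any(p > p1 for p in leaf_probs):
        below = [p for p in leaf_probs if p < p1]
        # Edge case not specified in the patent: if every qualifying leaf is >= P1,
        # fall back to P1 itself as the reference probability.
        p2 = sum(below) / len(below) if below else p1
        return True, p2                      # some leaf exceeds P1: predicted to pass the ETC
    p2 = sum(leaf_probs) / len(leaf_probs) if leaf_probs else 0.0
    return False, p2                         # no leaf exceeds P1: predicted not to pass
```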
S3, comparing the actual result with the prediction result obtained in the S2, adjusting the model and enhancing the accuracy of prediction;
after the actual result occurs (the actual result is a fact result whether the vehicle corresponding to the new piece of real-time data can pass the ETC), the actual result is returned to the decision tree model in S1, and the actual result is compared with the predicted result obtained in S2: if the comparison result is consistent, the data is stored in the source database, and the prediction accuracy of the decision tree model in S1 is further enhanced.
If the comparison result is inconsistent, the data are pushed to the business system and an error-correction notice is sent. After the actual result occurs, the actual data and result are returned to the decision tree model in S1; the newly added data periodically take part in the probability calculation, the probability values of the relevant single factors in the decision tree model are updated, the splitting order of the decision tree changes, the result data of each leaf node change accordingly, and the decision tree model is refreshed.
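In the terms of the earlier sketches, this periodic refresh amounts to folding confirmed records back into the history and rebuilding the forest; the scheduling and storage details are not specified by the patent, so the function below is only an assumed outline.

```python
def refresh_model(history, confirmed_records, all_factors, m, p1):
    """Fold newly confirmed actual results back into the history and rebuild the forest."""
    history = history + confirmed_records    # actual results become historical data
    return history, build_forest(history, all_factors, m, p1)
```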
In the above process, if the actual result and the predicted result are consistent, the probability values of the relevant factors are raised, further reinforcing the model's current prediction tendency. Otherwise, the probability values of the relevant single factors are lowered, the splitting priority of those factors and the result data of the leaf nodes change accordingly, the model's current prediction tendency weakens, and the model evolves in other directions.
If an (M+1)-th factor is found, the S1 process of the model is re-run and the model is expanded. Because newly added factors do not affect the decision trees already generated, the overall prediction result remains relatively stable.
In the present invention, the decision probability P1 is fixed at first, but after actual results occur, P1 can be adjusted so that the prediction becomes more accurate: if the comparison of actual and predicted results shows that the prediction accuracy is very high (higher than the accuracy required by the product), the value of P1 can be gradually reduced while ensuring that the adjusted P1 remains greater than the adjusted reference probability P2; similarly, if the comparison shows that the prediction accuracy is lower than that required by the product, the value of P1 is gradually increased.
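A last sketch of this adjustment policy; the step size is an assumption, since the patent only fixes the direction of adjustment and the constraint that P1 stays above P2.

```python
def adjust_p1(p1, p2, accuracy, required_accuracy, step=0.01):
    """Lower P1 when accuracy exceeds the requirement, raise it otherwise,
    keeping the adjusted P1 above the adjusted reference probability P2."""
    if accuracy > required_accuracy:
        return max(p1 - step, p2 + 1e-6)     # never let P1 fall to or below P2
    return min(p1 + step, 1.0)
```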
While the present invention has been described in detail with reference to the preferred embodiments, it should be understood that the above description should not be taken as limiting the invention. Various modifications and alterations to this invention will become apparent to those skilled in the art upon reading the foregoing description. Accordingly, the scope of the invention should be determined from the following claims.

Claims (10)

1. A method for predicting the probability of a vehicle passing an ETC by multiple factors, characterized by comprising the following steps:
S1, establishing a decision tree model for predicting the probability of the vehicle passing through the ETC based on M factors according to historical data;
S2, inputting real-time data into the decision tree model of S1 for computation to obtain a prediction result of whether the vehicle corresponding to the real-time data passes through the ETC;
and S3, comparing the actual result with the prediction result obtained in S2, adjusting the decision tree model, and enhancing the accuracy of prediction.
2. The method according to claim 1, wherein said S1 further comprises the steps of:
S11, calculating the probability value of each factor passing through the ETC according to historical data, and accordingly determining the splitting priority of all M factors;
S12, randomly extracting m factors from the M factors, where the value range of m is 2 < m < M, and establishing a decision tree for the m factors according to the splitting priority determined in S11;
S13, repeating the process of S12 until all factor combinations are traversed, generating a large number of decision trees that form a random forest;
and S14, counting, over the full historical data, the probability that each leaf node of all decision trees in S13 passes the ETC, to obtain the decision tree model.
3. The method according to claim 2, wherein said S11 further comprises the steps of:
S111, calculating a probability value of each factor passing through the ETC;
and S112, determining the splitting priority of all factors.
4. The method according to claim 2, wherein said S12 further comprises the steps of:
S121, according to the ETC passing probability value of each factor calculated in S11, selecting the factor with the highest splitting priority among the m factors as the root node of the decision tree for splitting;
S122, establishing child nodes for the different values of the factor with the highest splitting priority, generating the second-layer nodes;
S123, for the second-layer nodes, selecting the factor with the next-highest splitting priority (i.e., the largest information gain) for splitting;
and S124, repeating this process, selecting splitting factors in order of splitting priority from high to low, until no factor remains to be selected.
5. The method according to claim 4, wherein, in establishing the decision tree, splitting stops if the branch probability exceeds the decision probability P1, or once all m factors have been split.
6. The method according to claim 5, wherein the step S2 specifically comprises the following steps:
S21, when a new piece of real-time data is generated, substituting all factor data of that record into the decision tree model obtained in S1 to obtain, in each decision tree, the probability of passing the ETC at every leaf node meeting the conditions;
S22, all leaf nodes meeting the conditions participate in voting to obtain the prediction result and the adjusted reference probability P2: if any leaf node shows a probability of passing the ETC that exceeds the decision probability P1, the voting ends and the prediction result is that the vehicle can pass the ETC, and the value of P2 is the equal-weight average of all qualifying leaf nodes whose probabilities are less than the decision probability P1; if no leaf node's probability exceeds the decision probability P1, the prediction result is that the vehicle cannot pass the ETC, and the value of P2 is the equal-weight average of all qualifying leaf nodes.
7. The method according to claim 1, wherein the step S3 is implemented by returning the actual result to the decision tree model of S1 after the actual result occurs, and comparing the actual result with the predicted result obtained in S2:
if the comparison result is consistent, the data are stored in the source database, further enhancing the prediction accuracy of the decision tree model in S1;
if the comparison result is not consistent, the actual data and the result are returned to the decision tree model in S1, the newly added data are subjected to probability operation at regular time, the probability value of the related single factor in the decision tree model is updated, the splitting sequence of the decision tree is changed, the result data of each leaf node is changed at the same time, and the decision tree model is refreshed.
8. The method for predicting the ETC passing probability by multiple factors according to claim 7, wherein if an (M+1)-th factor is found in the comparison result, the S1 process of the model is re-run and the model is expanded.
9. The method according to claim 6, wherein the value of the decision probability P1 is fixed during model building; after the actual result is obtained, the value of P1 can be adjusted according to the prediction accuracy.
10. The method according to claim 9, wherein, if the comparison of the actual result with the predicted result shows that the prediction accuracy is higher than the accuracy required by the product, the value of the decision probability P1 is gradually decreased while ensuring that the adjusted P1 remains greater than the adjusted reference probability P2; if the comparison shows that the prediction accuracy is lower than the accuracy required by the product, the value of P1 is gradually increased.
CN202111610092.1A 2021-12-27 2021-12-27 Method for predicting ETC passing probability of vehicle by multiple factors Active CN113988488B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111610092.1A CN113988488B (en) 2021-12-27 2021-12-27 Method for predicting ETC passing probability of vehicle by multiple factors

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111610092.1A CN113988488B (en) 2021-12-27 2021-12-27 Method for predicting ETC passing probability of vehicle by multiple factors

Publications (2)

Publication Number Publication Date
CN113988488A true CN113988488A (en) 2022-01-28
CN113988488B CN113988488B (en) 2022-06-21

Family

ID=79734533

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111610092.1A Active CN113988488B (en) 2021-12-27 2021-12-27 Method for predicting ETC passing probability of vehicle by multiple factors

Country Status (1)

Country Link
CN (1) CN113988488B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114852135A (en) * 2022-07-08 2022-08-05 八维通科技有限公司 Similar rail transit driving prediction method based on big data

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107220734A (en) * 2017-06-26 2017-09-29 江南大学 CNC Lathe Turning process Energy Consumption Prediction System based on decision tree
CN108364467A (en) * 2018-02-12 2018-08-03 北京工业大学 A kind of traffic information prediction technique based on modified decision Tree algorithms
CN109003128A (en) * 2018-07-07 2018-12-14 太原理工大学 Based on improved random forest public bicycles website Demand Forecast method
CN109017799A (en) * 2018-04-03 2018-12-18 张锐明 A kind of new-energy automobile driving behavior prediction technique
CN110837841A (en) * 2018-08-17 2020-02-25 北京亿阳信通科技有限公司 KPI (Key performance indicator) degradation root cause identification method and device based on random forest
CN111210094A (en) * 2020-03-06 2020-05-29 青岛海信网络科技股份有限公司 Airport taxi automatic scheduling method and device based on real-time passenger flow prediction
CN112287603A (en) * 2020-10-29 2021-01-29 上海淇玥信息技术有限公司 Prediction model construction method and device based on machine learning and electronic equipment
CN113096388A (en) * 2021-03-22 2021-07-09 北京工业大学 Short-term traffic flow prediction method based on gradient lifting decision tree

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107220734A (en) * 2017-06-26 2017-09-29 江南大学 CNC Lathe Turning process Energy Consumption Prediction System based on decision tree
CN108364467A (en) * 2018-02-12 2018-08-03 北京工业大学 A kind of traffic information prediction technique based on modified decision Tree algorithms
CN109017799A (en) * 2018-04-03 2018-12-18 张锐明 A kind of new-energy automobile driving behavior prediction technique
CN109003128A (en) * 2018-07-07 2018-12-14 太原理工大学 Based on improved random forest public bicycles website Demand Forecast method
CN110837841A (en) * 2018-08-17 2020-02-25 北京亿阳信通科技有限公司 KPI (Key performance indicator) degradation root cause identification method and device based on random forest
CN111210094A (en) * 2020-03-06 2020-05-29 青岛海信网络科技股份有限公司 Airport taxi automatic scheduling method and device based on real-time passenger flow prediction
CN112287603A (en) * 2020-10-29 2021-01-29 上海淇玥信息技术有限公司 Prediction model construction method and device based on machine learning and electronic equipment
CN113096388A (en) * 2021-03-22 2021-07-09 北京工业大学 Short-term traffic flow prediction method based on gradient lifting decision tree

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114852135A (en) * 2022-07-08 2022-08-05 八维通科技有限公司 Similar rail transit driving prediction method based on big data
CN114852135B (en) * 2022-07-08 2022-10-04 八维通科技有限公司 Similar rail transit driving prediction method based on big data

Also Published As

Publication number Publication date
CN113988488B (en) 2022-06-21

Similar Documents

Publication Publication Date Title
CN111563706A (en) Multivariable logistics freight volume prediction method based on LSTM network
CN113096388B (en) Short-term traffic flow prediction method based on gradient lifting decision tree
CN110782658B (en) Traffic prediction method based on LightGBM algorithm
CN111105104A (en) Short-term power load prediction method based on similar day and RBF neural network
CN110674999A (en) Cell load prediction method based on improved clustering and long-short term memory deep learning
CN111723929A (en) Numerical prediction product correction method, device and system based on neural network
CN104835103A (en) Mobile network health evaluation method based on neural network and fuzzy comprehensive evaluation
CN111582559B (en) Arrival time estimation method and device
CN111860989B (en) LSTM neural network short-time traffic flow prediction method based on ant colony optimization
CN110675029A (en) Dynamic management and control method and device for commercial tenant, server and readable storage medium
CN112529683A (en) Method and system for evaluating credit risk of customer based on CS-PNN
CN111126868B (en) Road traffic accident occurrence risk determination method and system
CN113988488B (en) Method for predicting ETC passing probability of vehicle by multiple factors
CN116596044B (en) Power generation load prediction model training method and device based on multi-source data
CN112529685A (en) Loan user credit rating method and system based on BAS-FNN
CN115410372B (en) Reliable prediction method for highway traffic flow based on Bayesian LSTM
CN116721537A (en) Urban short-time traffic flow prediction method based on GCN-IPSO-LSTM combination model
CN115600729A (en) Grid load prediction method considering multiple attributes
CN114299742B (en) Speed limit information dynamic identification and update recommendation method for expressway
CN114819178A (en) Railway construction progress index prediction and online updating method
CN112528554A (en) Data fusion method and system suitable for multi-launch multi-source rocket test data
CN115293366B (en) Model training method, information prediction method, device, equipment and medium
CN115719194A (en) Big data prediction based material purchasing method and system
CN115691140A (en) Analysis and prediction method for space-time distribution of automobile charging demand
CN113191568B (en) Meteorological-based urban operation management big data analysis and prediction method and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant