CN107153906A - Method and system for determining illegal taxi activities - Google Patents
Method and system for determining illegal taxi activities
- Publication number
- CN107153906A (application number CN201710169987.3A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Human Resources & Organizations (AREA)
- Economics (AREA)
- Strategic Management (AREA)
- General Physics & Mathematics (AREA)
- Tourism & Hospitality (AREA)
- Entrepreneurship & Innovation (AREA)
- Theoretical Computer Science (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Game Theory and Decision Science (AREA)
- Development Economics (AREA)
- Educational Administration (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Traffic Control Systems (AREA)
- Devices For Checking Fares Or Tickets At Control Points (AREA)
Abstract
Embodiments of the present invention provide a method and a system for determining illegal taxi activities. The method includes: obtaining the operation information to be determined that corresponds to a vehicle to be determined within a preset time period; calculating, according to the operation information to be determined, an illegal probability value of the vehicle to be determined using a preset decision model; and, if the illegal probability value is found to be greater than a preset threshold, judging the vehicle to be determined to be an illegal vehicle. The system is used to perform the method. By calculating the illegal probability value of the vehicle to be determined with a preset decision model, and judging the vehicle to be an illegal vehicle when that value is greater than the preset threshold, the embodiments improve the efficiency of investigating illegal taxi activities.
Description
Technical field
Embodiments of the present invention relate to the technical field of intelligent transportation, and in particular to a method and a system for determining illegal taxi activities.
Background
As cities develop, the continuing growth of urban populations and vehicle numbers aggravates traffic congestion. To ease traffic pressure and make travel convenient, taxis have become a main means of transport for sightseeing and commuting, so improving the operational organization and service level of the taxi industry has become an urgent requirement of modern urban development.
However, because of non-standard business operations, the uneven quality of practitioners, the impact of ride-hailing software such as "Didi", rising fuel prices and other factors, and driven by profit, violations and illegal activities across the taxi industry show a rising trend, with very serious consequences. Faced with a huge taxi fleet, stepping up law enforcement and strengthening supervision has therefore become an important task of industry management. Focusing investigation on illegal vehicles and punishing them can reduce violations and illegal activities from the taxi driver's side and promotes the supervision of the whole industry.
In the prior art, taxis are supervised either by manually checking passing vehicles one by one on road sections or according to passenger complaints. Most taxis operate as required and only a minority of vehicles commit violations or illegal activities, but screening for illegal vehicles among a large number of taxis consumes a great deal of manpower and material resources, is very difficult, and is very inefficient.
Therefore, how to improve the efficiency of investigating illegal taxi activities is a problem that urgently needs to be solved.
Summary of the invention
In view of the problems in the prior art, embodiments of the present invention provide a method and a system for determining illegal taxi activities.
In one aspect, an embodiment of the present invention provides a method for determining illegal taxi activities, including:
Obtaining the operation information to be determined that corresponds to a vehicle to be determined within a preset time period;
Calculating, according to the operation information to be determined, the illegal probability value of the vehicle to be determined using a preset decision model;
If the illegal probability value is found to be greater than a preset threshold, judging the vehicle to be determined to be an illegal vehicle.
In another aspect, an embodiment of the present invention provides a system for determining illegal taxi activities, including:
An acquisition module, for obtaining the operation information to be determined that corresponds to the vehicle to be determined within a preset time period;
A computing module, for calculating, according to the operation information to be determined, the illegal probability value of the vehicle to be determined using a preset decision model;
A determination module, for judging the vehicle to be determined to be an illegal vehicle if the illegal probability value is found to be greater than a preset threshold.
The method and system for determining illegal taxi activities provided by the embodiments of the present invention calculate the illegal probability value of the vehicle to be determined using a preset decision model and, if the illegal probability value is greater than the preset threshold, judge the vehicle to be determined to be an illegal vehicle, which improves the efficiency of investigating illegal taxi activities.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are evidently some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a method for determining illegal taxi activities provided by an embodiment of the present invention;
Fig. 2 is a frequency distribution histogram of the normalized passenger-carrying mileage provided by an embodiment of the present invention;
Fig. 3 is a distribution graph of the illegal probability of the test set vehicles provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a system for determining illegal taxi activities provided by an embodiment of the present invention.
Detailed description
To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are evidently some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a schematic flow chart of a method for determining illegal taxi activities provided by an embodiment of the present invention. As shown in Fig. 1, the method includes:
Step 101: Obtain the operation information to be determined that corresponds to a vehicle to be determined within a preset time period;
Specifically, to judge the probability that a certain taxi commits illegal activities, the operation information to be determined of that vehicle within a preset time period can be obtained. For example, the vehicle's operation information for the most recent 15 days can be taken as the operation information to be determined. The preset time period can be set according to actual conditions, and the embodiments of the present invention do not specifically limit it.
Step 102: Calculate, according to the operation information to be determined, the illegal probability value of the vehicle to be determined using a preset decision model;
Specifically, the obtained operation information to be determined is input into the preset decision model, which calculates the illegal probability value corresponding to the vehicle to be determined. The higher the illegal probability value, the more likely the vehicle is to commit illegal activities. It should be noted that the preset decision model can be a decision tree or a random forest, and other models are also applicable; the embodiments of the present invention do not specifically limit this.
Step 103: If the illegal probability value is found to be greater than a preset threshold, judge the vehicle to be determined to be an illegal vehicle.
Specifically, the calculated illegal probability value of the vehicle to be determined is compared with the preset threshold. If the illegal probability value is greater than the preset threshold, the vehicle to be determined is judged to be an illegal vehicle, and it is investigated with priority when taxis are inspected.
The embodiment of the present invention calculates the illegal probability value of the vehicle to be determined using a preset decision model and, if the illegal probability value is greater than the preset threshold, judges the vehicle to be determined to be an illegal vehicle, which improves the efficiency of investigating illegal taxi activities.
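Steps 101 to 103 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the function name `judge_vehicle`, the field names `income` and `carrying_km`, the default threshold of 0.5 and the stand-in `toy_model` are all hypothetical and do not come from the patent, which leaves the model and the threshold open.

```python
from typing import Callable, Mapping


def judge_vehicle(operation_info: Mapping[str, float],
                  model: Callable[[Mapping[str, float]], float],
                  threshold: float = 0.5) -> bool:
    """Return True when the vehicle is judged to be an illegal vehicle.

    operation_info: operation information to be determined (step 101).
    model: any callable returning an illegal probability value in [0, 1].
    threshold: preset threshold; 0.5 is an assumed default.
    """
    probability = model(operation_info)   # step 102: compute the probability
    return probability > threshold        # step 103: strictly greater than


def toy_model(info: Mapping[str, float]) -> float:
    # Hypothetical stand-in for a trained decision model: income recorded
    # with no metered passenger-carrying mileage is treated as suspicious.
    return 0.9 if info["income"] > 0 and info["carrying_km"] == 0 else 0.1
```

For example, `judge_vehicle({"income": 50.0, "carrying_km": 0.0}, toy_model)` flags the vehicle, while a vehicle whose probability merely equals the threshold is not flagged, because the comparison is "greater than".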
On the basis of the above embodiment, the method further includes:
Obtaining the first history operation information of all vehicles for a first preset number of days, the first history operation information including: straight-line distance information from the passenger pick-up point to the destination, passenger-carrying mileage information, GPS track mileage information, empty (deadhead) mileage information and income information;
Training the preset decision model according to the first history operation information to obtain a fare-negotiation decision model.
Specifically, among the many illegal activities, fare negotiation is one of the typical illegal activities of taxi drivers. Because it has a large impact on passengers and attracts much attention, law enforcement officers can identify it during on-site inspections. Fare negotiation means that the driver does not charge the amount shown on the taximeter but agrees a fare directly with the passenger and asks for it; when negotiating a fare, the driver usually does not use the taximeter or uses it less. The embodiment of the present invention therefore describes the determination of fare-negotiation behavior in detail.
All first history operation information of all taxis for the first preset number of days is obtained. The first preset number of days can be 15 days of historical taxi operation; the number of days can also be set according to actual conditions, and the embodiments of the present invention do not specifically limit it. The first history operation information includes straight-line distance information from the passenger pick-up point to the destination, passenger-carrying mileage information, GPS track mileage information, empty mileage information and income information. It should be noted that the first history operation information comes from the taxi data sources that have been obtained. These data sources include the GPS data of the taxi during operation (license plate number, GPS generation time, longitude, latitude, passenger-carrying status), taximeter transaction data (license plate number, income, transaction time, passenger-carrying mileage, empty mileage, pick-up time), approval system data (license plate number, single Straight Run mark), taxi complaint data (license plate number, complaint type, complaint time) and taxi violation data (license plate number, inspection time, violation).
As the above data sources show, the license plate number is an attribute they all share. Therefore, with the license plate number as the index, the data from the different data sources for one trip of one vehicle are associated, abnormal data are rejected (for example, longitude and latitude equal to 0), and complaint records whose type is not fare negotiation are removed. This yields a set of valid data for a single trip of a single vehicle, referred to as the first history operation indicators. The first history operation indicators include: license plate number, GPS generation time, longitude, latitude, passenger-carrying status, passenger-carrying mileage, empty mileage and income. The same operation is performed on all trips of all vehicles to obtain the first history operation indicators of each vehicle. The first history operation information is then obtained from the first history operation indicators; it includes straight-line distance information from the passenger pick-up point to the destination, passenger-carrying mileage information, GPS track mileage information, empty mileage information and income information.
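The association by license plate number and the rejection of abnormal GPS records can be sketched as follows. This is a minimal sketch, assuming records are plain dictionaries with hypothetical keys (`plate`, `longitude`, `latitude`); the patent does not specify a storage format, and treating a point as abnormal when either coordinate is 0 is an interpretation of its "longitude and latitude are 0" example.

```python
from collections import defaultdict


def is_valid_gps(point):
    # Assumption: a point is abnormal when either coordinate is 0.
    return point["longitude"] != 0 and point["latitude"] != 0


def merge_by_plate(gps_points, meter_records):
    """Group GPS points and taximeter records under their license plate.

    Both inputs are lists of dicts carrying a "plate" key (hypothetical
    field name). Abnormal GPS points are dropped before grouping.
    """
    merged = defaultdict(lambda: {"gps": [], "meter": []})
    for point in gps_points:
        if is_valid_gps(point):
            merged[point["plate"]]["gps"].append(point)
    for record in meter_records:
        merged[record["plate"]]["meter"].append(record)
    return dict(merged)
```

Complaint and violation records could be attached in the same way, with the complaint-type filter applied before grouping.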
The first history operation information of one vehicle over 15 days of historical operation constitutes one piece of training data, and the first history operation information of all vehicles constitutes the whole training data. For the illegal vehicles seized by law enforcement officers, the 15 days before the date of the illegal act are used. These 15 days need not be consecutive, but there must be a full 15 days that have first history operation information. Because illegal activities are more likely in the period shortly before a seizure, the first history operation information of the 15 days before the illegal date of the illegal vehicles obtained from law enforcement officers is set as the training set positive samples, and their label value is set to 1. The first history operation information of 15 days of history for all vehicles corresponding to the taxi company of Beijing Taxi Star is used as the training set negative samples, and their label value is set to 0. The preset decision model is trained according to the first history operation information of all vehicles.
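The selection of the 15 days and the labelling can be sketched as follows. This is a minimal sketch under assumptions: per-day information is held in a dictionary keyed by `datetime.date`, the 15 days are the most recent ones strictly before the cutoff date, and the function names are hypothetical. Vehicles with fewer than 15 days of data are skipped, reflecting the requirement of a full 15 days.

```python
from datetime import date, timedelta


def last_n_days_before(daily_info, cutoff, n=15):
    """Return info for the n most recent days with data before cutoff.

    The days need not be consecutive. Returns None when fewer than n
    days with data exist before the cutoff.
    """
    days = sorted(d for d in daily_info if d < cutoff)
    if len(days) < n:
        return None
    return [daily_info[d] for d in days[-n:]]


def build_training_set(illegal_vehicles, reference_vehicles, n=15):
    """Build (sample, label) pairs: label 1 for positives, 0 for negatives.

    Each input is a list of (daily_info, cutoff_date) tuples; for an
    illegal vehicle the cutoff is the date of the illegal act.
    """
    samples = []
    for label, group in ((1, illegal_vehicles), (0, reference_vehicles)):
        for daily_info, cutoff in group:
            window = last_n_days_before(daily_info, cutoff, n)
            if window is not None:
                samples.append((window, label))
    return samples


# Demo: a vehicle with data on every other day, 20 days in total.
demo_info = {date(2017, 1, 1) + timedelta(days=2 * i): {"income": 100.0 + i}
             for i in range(20)}
demo_set = build_training_set([(demo_info, date(2017, 3, 1))], [])
```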
The preset decision model can be a decision tree. Each decision point in the decision tree represents one kind of first history operation information. After a decision has been made at every decision point, each leaf node represents a class, which is either illegal vehicle or non-illegal vehicle. Among all leaf nodes, the leaf node that satisfies all split conditions is the illegal vehicle predicted by the decision tree, and the other leaf nodes represent non-illegal vehicles. When training is complete, the fare-negotiation decision model is obtained. It should be noted that the preset decision model can also be a random forest; the embodiments of the present invention do not specifically limit this.
A certain number of training set positive samples and training set negative samples can be randomly selected as validation set samples. The validation set samples are used to validate the fare-negotiation decision model, and the parameters of the fare-negotiation decision model are adjusted accordingly.
A determination result falls into one of four cases: an illegal vehicle is judged to be an illegal vehicle (true positive, tp), a non-illegal vehicle is judged to be an illegal vehicle (false positive, fp), an illegal vehicle is judged to be a non-illegal vehicle (false negative, fn), and a non-illegal vehicle is judged to be a non-illegal vehicle (true negative, tn). The illegal vehicle determination accuracy is the proportion of truly illegal vehicles among the vehicles judged to be illegal; the higher the proportion, the more accurate the model, where illegal vehicle determination accuracy = tp/(tp+fp). The non-illegal vehicle determination accuracy is the proportion of truly non-illegal vehicles among the vehicles judged to be non-illegal; the higher the proportion, the more accurate the model, where non-illegal vehicle determination accuracy = tn/(tn+fn).
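The two accuracy figures above can be computed directly from the labels and the model's judgments. The sketch below is illustrative only; the function names are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention that the patent does not address.

```python
def confusion_counts(labels, predictions):
    """Count tp, fp, fn, tn for labels/predictions coded 1 (illegal) or 0."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    return tp, fp, fn, tn


def determination_accuracy(tp, fp, fn, tn):
    """Return (illegal accuracy, non-illegal accuracy).

    illegal accuracy     = tp / (tp + fp)
    non-illegal accuracy = tn / (tn + fn)
    A zero denominator yields 0.0 (assumed convention).
    """
    illegal = tp / (tp + fp) if (tp + fp) else 0.0
    non_illegal = tn / (tn + fn) if (tn + fn) else 0.0
    return illegal, non_illegal
```

In common machine-learning terminology these two quantities are the precision of the positive class and of the negative class respectively.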
In addition, the first history operation information of all taxis in the whole of Beijing over 15 days can be selected as test set samples to test the trained fare-negotiation decision model. The test set samples are used to test the performance of the fare-negotiation decision model: the illegal vehicles judged by the model form an illegal vehicle library, and the ratio of the illegal vehicle library to all taxis in Beijing is computed. The lower this ratio, the better.
The embodiment of the present invention obtains the first history operation information of all vehicles for the first preset number of days as training data, trains the preset decision model, and obtains the fare-negotiation decision model. The trained fare-negotiation decision model can predict the illegal probability value of the vehicle to be determined, which improves the accuracy of prediction, so that law enforcement officers can investigate vehicles according to the illegal probability value, reducing the investigation workload while improving its efficiency.
On the basis of the above embodiment, the operation information to be determined includes the straight-line distance information from the passenger pick-up point to the destination, the passenger-carrying mileage information, the GPS track mileage information, the empty mileage information and the income information;
Correspondingly, calculating the illegal probability value of the vehicle to be determined using the preset decision model includes:
Calculating the illegal probability value of the vehicle to be determined using the fare-negotiation decision model.
Specifically, to predict the illegal probability value that a vehicle to be determined commits the illegal activity of fare negotiation, the obtained operation information to be determined includes the straight-line distance information from the passenger pick-up point to the destination, the passenger-carrying mileage information, the GPS track mileage information, the empty mileage information and the income information. This information is input into the trained fare-negotiation decision model. The model can output not only the class of the sample (illegal vehicle or non-illegal vehicle) but also the probability value corresponding to that class; from the class and the probability value, the probability that the vehicle to be determined is an illegal vehicle can be determined. The illegal probability value of the vehicle to be determined is thereby obtained.
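One common way for a decision tree to return both a class and a probability is to store, at each leaf, the numbers of illegal and non-illegal training samples that reached it, and to report the illegal fraction as the probability. The patent does not state how its model derives the probability, so the sketch below is an assumption; the `Node` layout, the feature name `meter_to_gps_ratio` and the leaf counts are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    feature: Optional[str] = None    # None marks a leaf node
    threshold: float = 0.0
    left: Optional["Node"] = None    # branch taken when value <= threshold
    right: Optional["Node"] = None   # branch taken when value > threshold
    illegal: int = 0                 # training samples labelled 1 at this leaf
    legal: int = 0                   # training samples labelled 0 at this leaf


def illegal_probability(node: Node, sample: dict) -> float:
    """Walk the tree to a leaf and return its fraction of illegal samples."""
    while node.feature is not None:
        go_left = sample[node.feature] <= node.threshold
        node = node.left if go_left else node.right
    total = node.illegal + node.legal
    return node.illegal / total if total else 0.0


def classify(node: Node, sample: dict, threshold: float = 0.5):
    """Return (class, probability); class 1 means illegal vehicle."""
    p = illegal_probability(node, sample)
    return (1 if p > threshold else 0), p


# Toy tree with one split on a hypothetical feature: the ratio of metered
# passenger-carrying mileage to GPS track mileage.
toy_tree = Node(feature="meter_to_gps_ratio", threshold=0.5,
                left=Node(illegal=8, legal=2),
                right=Node(illegal=1, legal=9))
```

A random forest would average such per-tree probabilities over many trees.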
The embodiment of the present invention calculates the illegal probability value of the vehicle to be determined with the fare-negotiation decision model, and taxis are investigated in a targeted manner according to this illegal probability value, which improves the efficiency of investigation.
On the basis of the above embodiment, the method further includes:
Obtaining the second history operation information corresponding to all vehicles for a second preset number of days, the second history operation information including: single Straight Run flag information, distance travelled information, empty driving ratio information, service time information, operation count information, average haul distance information, income information and average income information;
Training the preset decision model according to the second history operation information to obtain a substitute-driving decision model.
Specifically, among the many illegal activities, substitute driving is also one of the typical illegal activities of taxi drivers. Because it has a large impact on passengers and attracts much attention, law enforcement officers can identify it during on-site inspections. Substitute driving means handing the taxi to another person to drive on the driver's behalf: each taxi corresponds to one driver, and the case in which the driver does not match the vehicle is called substitute driving. The embodiment of the present invention describes the determination of this illegal activity in detail.
The second history operation information corresponding to all vehicles for the second preset number of days is obtained. For example, the second history operation information of all vehicles for 30 days of historical data can be taken. The second history operation information includes: single Straight Run flag information, distance travelled information, empty driving ratio information, service time information, operation count information, average haul distance information, income information and average income information.
It should be noted that all of the above information comes from the taxi data sources that have been obtained. These data sources include the GPS data of the taxi during operation (license plate number, GPS generation time, longitude, latitude, passenger-carrying status), taximeter transaction data (license plate number, income, transaction time, passenger-carrying mileage, empty mileage, pick-up time), approval system data (license plate number, single Straight Run mark), taxi complaint data (license plate number, complaint type, complaint time) and taxi violation data (license plate number, inspection time, violation).
As the above data sources show, the license plate number is an attribute they all share. Therefore, with the license plate number as the index, the data from the different data sources for one trip of one vehicle are associated, abnormal data are rejected (for example, longitude and latitude equal to 0), and complaint records whose type is not substitute driving are removed. This yields a set of valid data for a single trip of a single vehicle, referred to as the second history operation indicators. The second history operation indicators include: license plate number, single Straight Run mark, pick-up time, transaction time, passenger-carrying mileage, empty mileage and income. From these indicators, the total passenger-carrying mileage over multiple trips, the total income over multiple trips and the operation count (the number of trips made by a single vehicle over a period of time) can be derived. The second history operation information is obtained from the second history operation indicators; it includes single Straight Run flag information, distance travelled information, empty driving ratio information, service time information, operation count information, average haul distance information, income information and average income information.
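The derivation of per-vehicle aggregates from single-trip indicators can be sketched as follows. The patent lists the derived quantities but not their formulas, so the definitions used here are assumptions: distance travelled is passenger-carrying plus empty mileage, the empty driving ratio is empty mileage divided by distance travelled, and the averages are per trip. The key names are hypothetical, and service time and the single Straight Run flag are omitted.

```python
def summarize_trips(trips):
    """Aggregate one vehicle's trips into second history operation information.

    Each trip is a dict with "carrying_km", "deadhead_km" and "income"
    (hypothetical field names). All formulas are assumed definitions.
    """
    count = len(trips)
    carrying = sum(t["carrying_km"] for t in trips)
    deadhead = sum(t["deadhead_km"] for t in trips)
    income = sum(t["income"] for t in trips)
    travelled = carrying + deadhead
    return {
        "operation_count": count,
        "distance_travelled_km": travelled,
        "empty_ratio": deadhead / travelled if travelled else 0.0,
        "average_haul_km": carrying / count if count else 0.0,
        "income": income,
        "average_income": income / count if count else 0.0,
    }
```

For a vehicle with two trips of 6 km and 4 km carrying passengers, 2 km and 8 km empty, and fares of 20 and 30, this gives an operation count of 2, an empty driving ratio of 0.5, an average haul distance of 5 km and an average income of 25.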
Similarly, the second history operation information of the illegal vehicles obtained from law enforcement officers, for the 30 days before the date of the illegal incident, is chosen as the training set positive samples, with label value 1. The second history operation information of Beijing Taxi Star over 30 days is chosen as the training set negative samples, with label value 0.
The preset decision model is trained according to the second history operation information. The preset decision model can be a decision tree. Each decision point in the decision tree represents one kind of second history operation information. After a decision has been made at every decision point, each leaf node represents a class, which is either illegal vehicle or non-illegal vehicle. Among all leaf nodes, the leaf node that satisfies all split conditions is the illegal vehicle predicted by the decision tree, and the other leaf nodes represent non-illegal vehicles. When training is complete, the substitute-driving decision model is obtained. It should be noted that the preset decision model can also be a random forest; the embodiments of the present invention do not specifically limit this.
A certain number of training set positive samples and training set negative samples can be randomly selected as validation set samples. The validation set samples are used to validate the substitute-driving decision model, and the parameters of the substitute-driving decision model are adjusted accordingly.
A determination result falls into one of four cases: an illegal vehicle is judged to be an illegal vehicle (true positive, tp), a non-illegal vehicle is judged to be an illegal vehicle (false positive, fp), an illegal vehicle is judged to be a non-illegal vehicle (false negative, fn), and a non-illegal vehicle is judged to be a non-illegal vehicle (true negative, tn). The illegal vehicle determination accuracy is the proportion of truly illegal vehicles among the vehicles judged to be illegal; the higher the proportion, the more accurate the model, where illegal vehicle determination accuracy = tp/(tp+fp). The non-illegal vehicle determination accuracy is the proportion of truly non-illegal vehicles among the vehicles judged to be non-illegal; the higher the proportion, the more accurate the model, where non-illegal vehicle determination accuracy = tn/(tn+fn).
In addition, the second history operation information of all taxis in the whole of Beijing over 30 days can be selected as test set samples to test the trained substitute-driving decision model. The test set samples are used to test the performance of the substitute-driving decision model: the illegal vehicles judged by the model form an illegal vehicle library, and the ratio of the illegal vehicle library to all taxis in Beijing is computed. The lower this ratio, the better.
The embodiment of the present invention obtains the second history operation information of all vehicles for the second preset number of days as training data, trains the preset decision model, and obtains the substitute-driving decision model. The trained substitute-driving decision model can predict the illegal probability value of the vehicle to be determined, which improves the accuracy of prediction, so that law enforcement officers can investigate vehicles according to the illegal probability value, reducing the investigation workload while improving its efficiency.
On the basis of the above embodiment, the operation information to be determined includes the single Straight Run flag information, the distance travelled information, the empty driving ratio information, the service time information, the operation count information, the average haul distance information, the income information and the average income information;
Correspondingly, calculating the illegal probability value of the vehicle to be determined using the preset decision model includes:
Calculating the illegal probability value of the vehicle to be determined using the substitute-driving decision model.
Specifically, to predict the illegal probability value that a vehicle to be determined commits the illegal activity of substitute driving, the obtained operation information to be determined includes single Straight Run flag information, distance travelled information, empty driving ratio information, service time information, operation count information, average haul distance information, income information and average income information. This information is input into the trained substitute-driving decision model. The model can output not only the class of the sample (illegal vehicle or non-illegal vehicle) but also the probability value corresponding to that class; from the class and the probability value, the probability that the vehicle to be determined is an illegal vehicle can be determined. The illegal probability value of the vehicle to be determined is thereby obtained.
The embodiment of the present invention calculates the illegal probability value of the vehicle to be determined using the substitute-driving decision model, so that taxis can be investigated in a targeted manner according to this illegal probability value, which improves investigation efficiency.
On the basis of the above embodiment, training the preset decision model according to the first historical operation information to obtain the agreed-upon price decision model includes:
Normalizing each item of first historical operation information obtained for each vehicle to obtain normalized first historical operation information; grouping the normalized first historical operation information to obtain the frequency of the first historical operation information in each group;
Forming training data from the frequencies corresponding to the first historical operation information of all vehicles;
Training the preset decision model according to the training data to obtain the agreed-upon price decision model.
Specifically, the multiple items of first historical operation information of one vehicle over the first preset number of days constitute one item of training data. Each item of first historical operation information of the vehicle over the first preset number of days is normalized to obtain normalized first historical operation information, which is then grouped. The number of groups can be set according to actual conditions, but it must be the same for all normalized first historical operation information, so that the frequency of the first historical operation information in each group is obtained.
Taking the passenger-carrying mileage information as an example, Fig. 2 is the frequency distribution histogram of the normalized passenger-carrying mileage provided by an embodiment of the present invention. As shown in Fig. 2, the first historical operation information of all taxis over 15 historical days is obtained first; one taxi is selected from all the taxis together with its first historical operation information for those 15 days; and the passenger-carrying mileage information, that is, this taxi's passenger-carrying mileage over the 15 historical days, is then selected from that information. The passenger-carrying mileage information is normalized to obtain the normalized first historical operation information: normalized first historical operation information = Value / Max Value, where Value is the current mileage information and Max Value is the maximum among the multiple items of passenger-carrying mileage information. Assuming the normalized first historical operation information is divided into 100 groups, counting the frequency in each group yields the frequency distribution histogram shown in Fig. 2. The same operations are applied to the vehicle's other first historical operation information, giving the frequencies corresponding to all of the vehicle's first historical operation information, which together constitute one group of training data. The other vehicles are processed in the same way to obtain multiple groups of training data, which are used to train the preset decision model and obtain the agreed-upon price decision model.
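The normalization and grouping steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the dictionary keys, and the sample values are hypothetical, and the sketch assumes that "frequency" means a raw count per group and that groups are equal-width intervals of the normalized value.

```python
from typing import Dict, List

N_BINS = 100  # group count used in the example above; other values are allowed


def normalized_frequencies(values: List[float], n_bins: int = N_BINS) -> List[int]:
    """Normalize each value as Value / Max Value and count how many fall in each group."""
    max_value = max(values)
    counts = [0] * n_bins
    for value in values:
        ratio = value / max_value if max_value > 0 else 0.0
        # A ratio of exactly 1.0 would index past the last group, so clamp it.
        index = min(int(ratio * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


def vehicle_training_row(indicators: Dict[str, List[float]], n_bins: int = N_BINS) -> List[int]:
    """Concatenate the histograms of all indicators in a fixed (sorted-key) order."""
    row: List[int] = []
    for name in sorted(indicators):
        row.extend(normalized_frequencies(indicators[name], n_bins))
    return row


# Hypothetical records for one taxi: five indicators, a few records each.
example_vehicle = {
    "straight_line_distance": [2.1, 5.0, 7.4],
    "carrying_mileage": [3.0, 6.5, 9.8],
    "gps_track_mileage": [3.2, 6.9, 10.4],
    "empty_mileage": [0.5, 1.5, 4.0],
    "income": [13.0, 24.0, 38.0],
}
row = vehicle_training_row(example_vehicle)  # 5 indicators x 100 groups = 500 values
```

With five indicators and 100 groups each, every vehicle yields a 500-value row, which matches the training data dimension stated in the agreed-upon price embodiment.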
The embodiment of the present invention normalizes the first historical operation information and counts the frequency of each item of first historical operation information in each group to obtain multiple groups of training data, and trains the preset decision model on these groups to obtain the agreed-upon price decision model, which improves the accuracy of the model's output.
On the basis of the above embodiment, performing model training on the preset decision model according to the second historical operation information to obtain the substitute-driving decision model includes:
Grouping the second preset number of days;
Applying Z-Score standardization to each item of second historical operation information obtained for all vehicles and all groups of days to obtain training data;
Training the preset decision model according to the training data to obtain the substitute-driving decision model.
Specifically, the second preset number of days is grouped as follows. The second historical operation information of all vehicles over 30 days is obtained, and one taxi is selected from all the taxis together with its second historical operation information for those 30 days. The second historical operation information of all Mondays in the 30 days forms one group; likewise, that of all Tuesdays, Wednesdays, ..., Sundays each forms one group. The other vehicles are processed in the same way. Z-Score standardization is then applied to the second historical operation information (except the single Straight Run flag information) of all vehicles for all days of the week in the 30 days. Taking the driving mileage information as an example, the standardization formula is (Value - μ) / σ, where Value is the current driving mileage information, μ is the mean of all driving mileage information, and σ is the standard deviation of all driving mileage information; this yields the training data. The preset decision model is trained according to the training data to obtain the substitute-driving decision model. It should be noted that the second preset number of days can be set according to actual conditions, and the embodiment of the present invention does not specifically limit it.
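The weekday grouping and the (Value - μ) / σ standardization can be sketched as below. This is only a sketch under stated assumptions: the text does not say whether σ is a population or sample standard deviation (the population form is assumed here), μ and σ are computed over all values before grouping as one reading of "all driving mileage information", and the function names and sample records are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Record = Tuple[date, float]  # (operating day, driving mileage)


def z_scores(values: List[float]) -> List[float]:
    """Apply (Value - mu) / sigma to every value."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return [0.0] * len(values)  # all values equal, so no spread to scale by
    return [(value - mu) / sigma for value in values]


def standardize_and_group(records: List[Record]) -> Dict[int, List[float]]:
    """Standardize all values, then collect the scores by day of the week."""
    scores = z_scores([value for _, value in records])
    groups: Dict[int, List[float]] = defaultdict(list)
    for (day, _), score in zip(records, scores):
        groups[day.weekday()].append(score)  # 0 = Monday, ..., 6 = Sunday
    return dict(groups)


# Hypothetical daily driving mileage for one taxi.
records = [
    (date(2016, 9, 5), 180.0),   # Monday
    (date(2016, 9, 6), 220.0),   # Tuesday
    (date(2016, 9, 12), 200.0),  # Monday
    (date(2016, 9, 13), 240.0),  # Tuesday
]
grouped = standardize_and_group(records)
```

The single Straight Run flag information is excluded from standardization in the text, so a fuller version would pass that field through unchanged.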
The embodiment of the present invention applies Z-Score standardization to the second historical operation information of all vehicles for all groups of days to obtain training data, and performs model training on the preset decision model according to that training data, which improves the accuracy of the determination made for the vehicle to be determined.
On the basis of the above embodiments, training the preset decision model according to the first historical operation information includes:
Training the preset decision model according to the first historical operation information using cross-validation and/or the bootstrap method.
Specifically, when the preset decision model is trained, it can be trained on the first historical operation information using cross-validation and/or the bootstrap method.
The cross-validation method is as follows: a preset quantity of data is taken from the training set as validation set samples. A common choice is 10-fold cross-validation, in which the training data is divided into 10 parts; in turn, 9 parts serve as training set samples and 1 part as validation set samples, and the mean of the 10 results is taken as the final training result. Sometimes 10-fold cross-validation is itself repeated and averaged, for example 10 runs of 10-fold cross-validation, so as to obtain a more stable and reliable preset decision model.
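The fold rotation described above can be sketched as follows. The helper names are hypothetical, shuffling before splitting is omitted for brevity, and `evaluate` stands for any caller-supplied function that trains on the training indices and returns a score on the validation indices.

```python
from typing import Callable, Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (training indices, validation indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        yield training, validation
        start += size


def cross_validated_score(
    n_samples: int,
    k: int,
    evaluate: Callable[[List[int], List[int]], float],
) -> float:
    """Average the per-fold results, as in the k-fold procedure described above."""
    scores = [evaluate(train, val) for train, val in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(85, 5))  # e.g. 85 vehicles split into five folds
```

Repeated cross-validation (for example 10 runs of 10-fold) would call `cross_validated_score` several times on differently shuffled data and average the results.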
The bootstrap method works as follows: the training set positive samples are first taken as the initial positive samples and the training set negative samples as the initial negative samples, and an initial preset decision model is trained on them. The negative samples that the initial preset decision model misclassifies (negative samples classified as positive; in the embodiment of the present invention, non-illegal vehicles judged to be illegal vehicles) are then collected to form a hard negative example set. Negative samples that have not yet been used in training are added to the hard negative example set to form a new negative sample set, the positive sample set is kept unchanged, and a new preset decision model is trained. This procedure can be repeated several times to obtain the final preset decision model.
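The loop above can be sketched with a deliberately simple stand-in classifier. Everything here is an illustrative assumption: `MidpointModel` is a toy one-feature threshold classifier, not the decision tree or random forest used in the embodiments, and keeping the negative set the same size as the positive set follows the 1:1 ratio mentioned in the agreed-upon price embodiment rather than any rule stated for the method in general.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class MidpointModel:
    """Toy stand-in classifier: positive when the feature reaches the threshold."""
    threshold: float

    def predict(self, x: float) -> bool:
        return x >= self.threshold


def train(positives: List[float], negatives: List[float]) -> MidpointModel:
    """Place the threshold midway between the two class means."""
    return MidpointModel((mean(positives) + mean(negatives)) / 2)


def bootstrap_train(positives: List[float], negatives: List[float], rounds: int = 3) -> MidpointModel:
    target = len(positives)             # assumed 1:1 positive/negative ratio
    current = list(negatives[:target])  # initial negative samples
    unused = list(negatives[target:])   # negatives not yet used in training
    model = train(positives, current)
    for _ in range(rounds):
        # Hard negatives: negative samples the current model classifies as positive.
        hard = [x for x in current if model.predict(x)]
        if not hard or not unused:
            break
        take = target - len(hard)
        fresh, unused = unused[:take], unused[take:]
        current = hard + fresh          # positives stay unchanged
        model = train(positives, current)
    return model


positives = [8.0, 9.0, 10.0]
negatives = [1.0, 2.0, 7.0, 3.0, 6.5, 0.5]
initial = train(positives, negatives[:3])
final = bootstrap_train(positives, negatives)
```

In this toy data the initial model misclassifies the negative sample 7.0; after one round of retraining on the hard negative plus fresh negatives, the threshold moves up and that sample is classified correctly.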
An embodiment for determining the illegal activity of charging an agreed-upon price is as follows:
For illegal vehicles seized between January and September 2016, the first historical operation information of the fifteen days before each vehicle's seizure date is used as the training set positive samples, with the label value set to 1. The first historical operation information of 15 days between August 15 and September 15, 2016, taken from Beijing Taxi Star, is used as the training set negative samples, with the label value set to 0. The first historical operation information of 15 days of all Beijing taxis between August 15 and September 15, 2016 is used as the test set samples. Because each item of first historical operation information of each vehicle is divided into 100 groups, each item has a dimension of 100 after normalization; and because the first historical operation information includes five items (the straight-line distance from passenger pick-up point to drop-off point, the passenger-carrying mileage information, the GPS track mileage information, the empty mileage information, and the income information), the training data of each vehicle has a dimension of 500. The training set contains 42 positive-sample vehicles and 507 negative-sample vehicles, and the test set contains 51874 vehicles.
The following table lists the first historical operation indicators of one illegal vehicle over multiple days:
The frequency distribution histogram of the normalized passenger-carrying mileage information of one illegal vehicle over multiple days is shown in Fig. 2. All of the illegal vehicle's first historical operation information over those days constitutes one group of training data, which is used to train the preset decision model. A random forest can be used for training, with the tree depth set to 2 and the number of trees set to 15. The training set positive samples are kept unchanged, and the data of 43 vehicles is taken from the training set negative samples so that the ratio of positive to negative samples is 1:1. Five-fold cross-validation is used: the training data is divided into five parts; in turn, four parts serve as training set samples and one part as validation set samples, and the five results are averaged. The training result on the training set samples is tp=32 and fp=5, from which the illegal vehicle prediction accuracy is 86% and the non-illegal vehicle prediction accuracy is 77%. The model is then checked on the test set. In order to catch as many illegal vehicles as possible, a vehicle is determined to be illegal only when its predicted probability value is greater than 0.6. Fig. 3 is the illegal probability distribution of the test set vehicles provided by an embodiment of the present invention, from which the proportion of illegal vehicles is found to be 6.7%.
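The thresholding step and the resulting share of flagged vehicles can be illustrated with a short sketch. The vehicle identifiers and probability values are made up for illustration, and the function names are hypothetical; only the 0.6 threshold and the strict "greater than" comparison come from the text.

```python
from typing import Dict, List


def flag_illegal(probabilities: Dict[str, float], threshold: float = 0.6) -> List[str]:
    """Return the vehicles whose illegal probability value exceeds the threshold."""
    return [vehicle_id for vehicle_id, p in probabilities.items() if p > threshold]


def flagged_share(probabilities: Dict[str, float], threshold: float = 0.6) -> float:
    """Proportion of all scored vehicles that are flagged as illegal."""
    if not probabilities:
        return 0.0
    return len(flag_illegal(probabilities, threshold)) / len(probabilities)


# Hypothetical model outputs for four test set vehicles.
predicted = {"taxi-001": 0.91, "taxi-002": 0.60, "taxi-003": 0.12, "taxi-004": 0.73}
flagged = flag_illegal(predicted)
share = flagged_share(predicted)
```

A vehicle scoring exactly 0.6 is not flagged, because the text requires the probability value to be greater than the threshold.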
Another embodiment of the present invention for determining the illegal activity of charging an agreed-upon price is as follows:
To raise tp and lower fp, the bootstrap method can be used. The ratio of positive to negative samples in the training set is 1:1, and a decision tree is used for training, with a tree depth of 4 and the class weight parameter set to balanced. In each round of the bootstrap training process, two classes of fp are retained: the fp obtained by predicting on the training set, and the fp obtained when the preset decision model judges negative samples that did not take part in training. The ratio of positive to negative samples in the training set is kept at 1:1, and the two classes of fp are recombined into the training set negative samples for the next round of training. These steps can be repeated several times. The final training result on the training set is tp=43 and fp=1; the illegal vehicle determination accuracy is 98%, and the non-illegal vehicle determination accuracy is 100%. It should be noted that the training set positive samples and training set negative samples are composed in the same way as in the above embodiment, which is not repeated here.
The embodiment of the present invention trains the preset decision model using cross-validation and/or the bootstrap method, so as to obtain a more stable and reliable preset decision model and improve the accuracy of its output.
On the basis of the above embodiments, performing model training on the preset decision model according to the second historical operation information includes:
Training the preset decision model according to the second historical operation information using cross-validation and/or the bootstrap method.
Specifically, when the preset decision model is trained, it can be trained on the second historical operation information using cross-validation and/or the bootstrap method. Cross-validation and the bootstrap method are carried out in the same way as in the above embodiment, which is not repeated here.
An embodiment of the present invention for determining the illegal activity of substitute driving is as follows:
Training data is built in the manner provided by the above embodiment. For illegal vehicles seized between January 2015 and September 2016, the second historical operation information of the thirty days before each vehicle's seizure date is used to obtain the training set positive samples of all illegal vehicles, with the label value set to 1. The training set negative samples are obtained in the same way from the second historical operation information of 30 days, between January 2015 and September 2016, of the Beijing Taxi Star taxi company, with the label value set to 0. The second historical operation information of 30 days of all Beijing taxis in September 2016 is used as the test set samples. 67 positive samples are selected from the training set positive samples and 300 negative samples from the training set negative samples to form the validation set samples; the remaining 100 positive samples and 400 negative samples form the training data. The test set contains 60186 samples. A decision tree can be used for training, with a tree depth of 3 and a ratio of positive to negative samples in the training set of 1:4. Validating the preset decision model on the validation set samples gives tp=52 and fp=13; the illegal vehicle determination accuracy is 80%, and the non-illegal vehicle determination accuracy is 95%. Testing the preset decision model on the test set samples gives an illegal vehicle proportion of 6.02%.
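The reported illegal vehicle accuracy figures can be cross-checked with a few lines of arithmetic. The sketch assumes that the illegal vehicle determination accuracy is tp / (tp + fp), mirroring the tn / (tn + fn) form given for non-illegal vehicles; that formula is an inference from the reported numbers, and the function name is hypothetical.

```python
def illegal_vehicle_accuracy(tp: int, fp: int) -> float:
    """Assumed definition: share of vehicles judged illegal that truly are illegal."""
    return tp / (tp + fp)


# (embodiment, tp, fp, accuracy reported in the text as a percentage)
reported = [
    ("agreed-upon price, random forest", 32, 5, 86),
    ("agreed-upon price, decision tree with bootstrap", 43, 1, 98),
    ("substitute driving, decision tree", 52, 13, 80),
]

# Recompute each percentage from tp and fp, rounded to the nearest whole percent.
recomputed = [round(100 * illegal_vehicle_accuracy(tp, fp)) for _, tp, fp, _ in reported]
```

Under this assumption the recomputed values (32/37, 43/44, and 52/65) round to the percentages stated in the three embodiments.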
The embodiment of the present invention trains the preset decision model using cross-validation and/or the bootstrap method, so as to obtain a more stable and reliable preset decision model and improve the accuracy of its output.
Fig. 4 is a schematic structural diagram of a taxi illegal activity determination system provided by an embodiment of the present invention. As shown in Fig. 4, the system includes an acquisition module 401, a computing module 402, and a determination module 403, wherein:
The acquisition module 401 is used to obtain the operation information to be determined of a vehicle to be determined within a preset time period; the computing module 402 is used to calculate, according to the operation information to be determined, the illegal probability value of the vehicle to be determined using a preset decision model; and the determination module 403 is used to determine that the vehicle to be determined is an illegal vehicle if the illegal probability value is judged to be greater than a preset threshold.
Specifically, to determine the probability that a certain taxi has committed an illegal activity, the acquisition module 401 obtains the operation information to be determined of the vehicle to be determined within the preset time period. It can be understood that the vehicle's operation information for the most recent day may be taken as the operation information to be determined; the preset time period can be set according to actual conditions, and the embodiment of the present invention does not specifically limit it. The computing module 402 inputs the obtained operation information to be determined into the preset decision model to calculate the illegal probability value of the vehicle to be determined; the higher the illegal probability value, the more likely the vehicle to be determined has committed illegal activities. It should be noted that the preset decision model may be a decision tree or a random forest, and other models are also applicable; the embodiment of the present invention does not specifically limit this. The determination module 403 compares the calculated illegal probability value of the vehicle to be determined with the preset threshold. If the illegal probability value is greater than the preset threshold, the vehicle to be determined is determined to be an illegal vehicle and is given priority when taxis are investigated.
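The three-module structure can be sketched as below. This is an illustration under stated assumptions rather than the system itself: the class and function names are hypothetical, the in-memory dictionary stands in for whatever data source supplies operation information, and the averaging "model" is a placeholder for a trained decision tree or random forest.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

OperationInfo = List[float]  # operation information to be determined, as a feature list


@dataclass
class AcquisitionModule:  # corresponds to module 401
    source: Dict[str, OperationInfo]

    def acquire(self, vehicle_id: str) -> OperationInfo:
        return self.source[vehicle_id]


@dataclass
class ComputingModule:  # corresponds to module 402
    model: Callable[[OperationInfo], float]  # preset decision model (placeholder)

    def illegal_probability(self, info: OperationInfo) -> float:
        return self.model(info)


@dataclass
class DeterminationModule:  # corresponds to module 403
    threshold: float

    def is_illegal(self, probability: float) -> bool:
        return probability > self.threshold


def judge(vehicle_id: str, acquisition: AcquisitionModule,
          computing: ComputingModule, determination: DeterminationModule) -> bool:
    """Run acquisition, probability calculation, and threshold comparison in order."""
    info = acquisition.acquire(vehicle_id)
    return determination.is_illegal(computing.illegal_probability(info))


# Placeholder data and model for illustration only.
acquisition = AcquisitionModule({"taxi-A": [0.9, 0.7], "taxi-B": [0.1, 0.3]})
computing = ComputingModule(lambda info: sum(info) / len(info))
determination = DeterminationModule(threshold=0.6)
```

Keeping the threshold in its own module reflects the text's separation between calculating the illegal probability value and deciding whether it exceeds the preset threshold.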
The system embodiment provided by the present invention can be used to carry out the processing flow of each of the above method embodiments. Its functions are not repeated here; refer to the detailed description of the above method embodiments.
The embodiment of the present invention calculates the illegal probability value of the vehicle to be determined using the preset decision model and, if the illegal probability value is judged to be greater than the preset threshold, determines that the vehicle to be determined is an illegal vehicle, which improves the efficiency of investigating illegal taxi activities.
A person of ordinary skill in the art will understand that all or part of the steps of the above method embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.
The system embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement it without creative effort.
From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware. Based on this understanding, the above technical solution, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in each embodiment or in parts of an embodiment.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. A taxi illegal activity determination method, characterized by comprising:
Obtaining the operation information to be determined of a vehicle to be determined within a preset time period;
Calculating, according to the operation information to be determined, the illegal probability value of the vehicle to be determined using a preset decision model;
If the illegal probability value is judged to be greater than a preset threshold, determining that the vehicle to be determined is an illegal vehicle.
2. The method according to claim 1, characterized in that the method further comprises:
Obtaining the first historical operation information of all vehicles over a first preset number of days, the first historical operation information including: straight-line distance information from passenger pick-up point to drop-off point, passenger-carrying mileage information, GPS track mileage information, empty mileage information, and income information;
Training the preset decision model according to the first historical operation information to obtain an agreed-upon price decision model.
3. The method according to claim 2, characterized in that the operation information to be determined includes the straight-line distance information from passenger pick-up point to drop-off point, the passenger-carrying mileage information, the GPS track mileage information, the empty mileage information, and the income information;
Correspondingly, calculating the illegal probability value of the vehicle to be determined using the preset decision model comprises:
Calculating the illegal probability value of the vehicle to be determined using the agreed-upon price decision model.
4. The method according to claim 1, characterized in that the method further comprises:
Obtaining the second historical operation information of all vehicles over a second preset number of days, the second historical operation information including: single Straight Run flag information, driving mileage information, empty-driving ratio information, service time information, number-of-operations information, average haul distance information, income information, and average income information;
Performing model training on the preset decision model according to the second historical operation information to obtain a substitute-driving decision model.
5. The method according to claim 4, characterized in that the operation information to be determined includes the single Straight Run flag information, the driving mileage information, the empty-driving ratio information, the service time information, the number-of-operations information, the average haul distance information, the income information, and the average income information;
Correspondingly, calculating the illegal probability value of the vehicle to be determined using the preset decision model comprises:
Calculating the illegal probability value of the vehicle to be determined using the substitute-driving decision model.
6. The method according to claim 2, characterized in that training the preset decision model according to the first historical operation information to obtain the agreed-upon price decision model comprises:
Normalizing each item of first historical operation information obtained for each vehicle to obtain normalized first historical operation information; grouping the normalized first historical operation information to obtain the frequency of the first historical operation information in each group;
Forming training data from the frequencies corresponding to the first historical operation information of all vehicles;
Training the preset decision model according to the training data to obtain the agreed-upon price decision model.
7. The method according to claim 4, characterized in that performing model training on the preset decision model according to the second historical operation information to obtain the substitute-driving decision model comprises:
Grouping the second preset number of days;
Applying Z-Score standardization to each item of second historical operation information obtained for all vehicles and all groups of days to obtain training data;
Training the preset decision model according to the training data to obtain the substitute-driving decision model.
8. The method according to claim 2, 3 or 6, characterized in that training the preset decision model according to the first historical operation information comprises:
Training the preset decision model according to the first historical operation information using cross-validation and/or the bootstrap method.
9. The method according to claim 4, 5 or 7, characterized in that performing model training on the preset decision model according to the second historical operation information comprises:
Training the preset decision model according to the second historical operation information using cross-validation and/or the bootstrap method.
10. A taxi illegal activity determination system, characterized by comprising:
An acquisition module, used to obtain the operation information to be determined of a vehicle to be determined within a preset time period;
A computing module, used to calculate, according to the operation information to be determined, the illegal probability value of the vehicle to be determined using a preset decision model;
A determination module, used to determine that the vehicle to be determined is an illegal vehicle if the illegal probability value is judged to be greater than a preset threshold.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710169987.3A CN107153906A (en) | 2017-03-21 | 2017-03-21 | A kind of taxi illegal activities decision method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107153906A true CN107153906A (en) | 2017-09-12 |
Family
ID=59791754
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710169987.3A Pending CN107153906A (en) | 2017-03-21 | 2017-03-21 | A kind of taxi illegal activities decision method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107153906A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170912 |