CN108764375A - Cross-province matching method and device for expressway freight vehicles - Google Patents

Info

Publication number
CN108764375A
Authority
CN
China
Prior art keywords
data
outlet
entrance
transprovincially
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810664921.6A
Other languages
Chinese (zh)
Other versions
CN108764375B (en
Inventor
黄海涛
叶劲松
张平
周雷
陈佳兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Academy of Transportation Sciences
Original Assignee
China Academy of Transportation Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Academy of Transportation Sciences
Priority to CN201810664921.6A
Publication of CN108764375A
Application granted
Publication of CN108764375B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00: Payment architectures, schemes or protocols
    • G06Q20/02: Payment architectures, schemes or protocols involving a neutral party, e.g. certification authority, notary or trusted third party [TTP]
    • G06Q20/023: Payment architectures, schemes or protocols involving a neutral party, e.g. certification authority, notary or trusted third party [TTP], the neutral party being a clearing house

Abstract

The present invention relates to the technical field of transportation statistics, and in particular to a cross-province matching method and device for expressway freight vehicles. The method includes: obtaining pre-stored entry data and exit data of freight vehicles at cross-province toll stations; obtaining a target algorithm model based on the entry data and exit data; matching to-be-processed entry data and to-be-processed exit data of the cross-province toll stations with the target algorithm model to obtain a matching result; computing the similarity between the license plate number in the to-be-processed entry data and the license plate number in the to-be-processed exit data corresponding to the matching result to obtain a similarity result; and optimizing the matching result based on the similarity result. The method helps ensure the accuracy of cross-province matching of expressway freight vehicles. It thereby addresses both the cross-province segmentation of expressway toll data and the inability to match freight vehicles across provinces directly by license plate number when plate recognition is incomplete or wrong.

Description

Cross-province matching method and device for expressway freight vehicles
Technical field
The present invention relates to the field of transportation statistics, and in particular to a cross-province matching method and device for expressway freight vehicles.
Background
As the main trunk channel of road transportation, the expressway plays an important role in supporting national economic development, promoting social progress, and safeguarding national security. In particular, the paths that vehicles travel on expressways are an important reference indicator of the state of regional economic development.
At present, networked expressway toll collection is organized province by province, with unified clearing and settlement at the provincial level. Expressway toll data mainly record the issuing and collection of IC pass cards as vehicles enter and leave toll stations within a province. When a vehicle crosses into an adjacent province, the expressway IC pass cards of the two provinces must be handed in and issued again at the provincial boundary stations of the two provinces, and tolls are settled separately. For a cross-province vehicle, the travel record is therefore split across different provinces.
The inventors found through research that, because license plate recognition equipment is not very accurate and because of expressway toll management requirements, among other reasons, the plate recognition rate in expressway toll data is low, so cross-province vehicle records cannot be linked directly by license plate number. Effectively solving the cross-province matching of freight vehicles on the national expressway network, so as to obtain the complete path a freight vehicle travels on expressways, is therefore an urgent technical problem.
Summary of the invention
In view of this, the purpose of the present invention is to provide a cross-province matching method and device for expressway freight vehicles that effectively alleviate the above technical problem.
To achieve the above object, the embodiments of the present invention adopt the following technical solutions:
A cross-province matching method for expressway freight vehicles, the method including:
Obtaining pre-stored entry data and exit data of freight vehicles at cross-province toll stations, and generating, based on the degree of match between the license plate number in the exit data and the license plate number in the entry data, sample data that includes matched data and unmatched data;
Performing mathematical operations on the entry data and the exit data in the sample data to obtain composite indicators, and compiling statistics on the composite indicators to obtain correlation indicators;
Processing the correlation indicators to obtain a candidate feature set, evaluating the importance of the candidate feature set to obtain a target feature set, and splitting the target feature set into a training feature set and a test feature set;
Training on the training feature set with multiple preset machine learning algorithms to obtain a corresponding set of cross-province matching algorithm models, and selecting one of these models as the first algorithm model;
Evaluating the first algorithm model on the test feature set to obtain an evaluation result, and adjusting the first algorithm model according to the evaluation result to obtain the target algorithm model;
Matching to-be-processed entry data and to-be-processed exit data of the cross-province toll stations with the target algorithm model to obtain a matching result;
Computing the similarity between the license plate number in the to-be-processed entry data and the license plate number in the to-be-processed exit data corresponding to the matching result to obtain a similarity result, and optimizing the matching result based on the similarity result.
Optionally, in the above cross-province matching method for expressway freight vehicles, the step of obtaining the pre-stored entry data and exit data of freight vehicles at cross-province toll stations and generating matched data and unmatched data based on the degree of match between the license plate number in the exit data and the license plate number in the entry data includes:
Obtaining cross-province toll station information from the expressway network topology, the toll station locations, and the provincial administrative boundaries, and obtaining the pre-stored freight vehicle exit data and entry data corresponding to the cross-province toll station information;
Taking the exit data that have a complete license plate number as target exit data; using a preset algorithm to search the entry data for records whose license plate number is consistent with the complete license plate number in the target exit data and whose time differs from the time in the exit data by an amount within a set time range, and taking those records as target entry data; and taking the target entry data and the target exit data matched to them as matched data, and the other data as unmatched data.
Optionally, in the above cross-province matching method for expressway freight vehicles, the step of taking the exit data that have a complete license plate number as target exit data, searching the entry data with a preset algorithm for the target entry data, and dividing the data into matched data and unmatched data includes:
Calculating the similarity S_license between the license plate number L_exit in the exit data and the license plate number L_entry in the entry data with the JaroWinklerDistance algorithm:
S_j = (1/3) × (m/|L_exit| + m/|L_entry| + (m - t)/m)
S_license = S_j + l × p × (1 - S_j)
where m is the number of matching characters between L_exit and L_entry, t is the number of transpositions, |L_exit| and |L_entry| are the lengths of the two license plate numbers, l is the length of their common prefix, and p is the prefix scaling factor;
Given the standard that the minimum speed on an expressway must not be less than V kilometers per hour, and the distance D (in kilometers) between the cross-province toll stations, calculating the travel time T (in minutes) from the exit toll station to the entry toll station:
T = (D/V) × 60
Screening the entry data based on the travel time T: when the entry time in the entry data minus the exit time in the exit data falls within the time interval [-T, T] and the license plate similarity S_license is greater than a set value, the vehicle in the exit data and the vehicle in the entry data are judged to be the same vehicle; the entry data and the exit data matched to it are taken as the target entry data and target exit data respectively and included in the matched data, and the other data are included in the unmatched data.
Optionally, in the above cross-province matching method for expressway freight vehicles, the entry data includes the entry toll station code, entry license plate number, entry time, entry vehicle type, entry total axle count, entry gross vehicle-and-cargo weight, and entry weight limit; the exit data includes the exit toll station code, exit license plate number, exit time, exit vehicle type, exit total axle count, exit gross vehicle-and-cargo weight, and exit weight limit; and the step of performing mathematical operations on the entry data and the exit data in the sample data to obtain composite indicators, and compiling statistics on the composite indicators to obtain correlation indicators, includes:
Subtracting the exit time T_exit, exit vehicle type C_exit, exit total axle count A_exit, exit gross vehicle-and-cargo weight W_exit, and exit weight limit LW_exit in the corresponding exit data from the entry time T_entry, entry vehicle type C_entry, entry total axle count A_entry, entry gross vehicle-and-cargo weight W_entry, and entry weight limit LW_entry in the entry data to obtain the composite indicators:
D_time = T_entry - T_exit
D_car = C_entry - C_exit
D_axis = A_entry - A_exit
D_weight = W_entry - W_exit
D_limitweight = LW_entry - LW_exit
where D_time is the entry-exit time difference, D_car is the entry-exit vehicle type difference, D_axis is the entry-exit total axle count difference, D_weight is the entry-exit gross vehicle-and-cargo weight difference, and D_limitweight is the entry-exit weight limit difference;
Compiling statistics, by category, on the feature distributions of the time difference D_time, vehicle type difference D_car, total axle count difference D_axis, gross weight difference D_weight, and weight limit difference D_limitweight, and selecting the correlation indicators used to judge whether exit data and entry data match.
Optionally, in the above cross-province matching method for expressway freight vehicles, the step of processing the correlation indicators to obtain a candidate feature set and evaluating the importance of the candidate feature set to obtain a target feature set includes:
Processing the time difference D_time, vehicle type difference D_car, total axle count difference D_axis, gross weight difference D_weight, and weight limit difference D_limitweight with nondimensionalization, quantification of qualitative features, binarization of quantitative features, and discrete feature encoding to form the candidate feature set for cross-province matching of expressway freight vehicles;
Evaluating the importance of the candidate feature set with the correlation coefficient method or the variance selection method to obtain the target feature set.
Optionally, in the above cross-province matching method for expressway freight vehicles, the step of processing D_time, D_car, D_axis, D_weight, and D_limitweight with nondimensionalization, quantification of qualitative features, binarization of quantitative features, One-Hot encoding, and/or discrete feature encoding to form the candidate feature set includes:
Processing the time difference D_time, vehicle type difference D_car, and total axle count difference D_axis each with One-Hot encoding, and scaling the gross weight difference D_weight and weight limit difference D_limitweight each into the interval [-1, 1], to obtain the candidate feature set for cross-province matching of expressway freight vehicles.
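The two transformations named above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, and min-max mapping is only one plausible reading of "scaling into the interval [-1, 1]".

```python
def one_hot(values):
    """One-Hot encode a list of discrete values (e.g. D_car or D_axis).

    Returns the sorted category list and one 0/1 row per input value.
    """
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows


def scale_to_minus_one_one(values):
    """Min-max scale continuous values (e.g. D_weight) into [-1, 1].

    Assumption: the smallest value maps to -1 and the largest to 1.
    A constant column is mapped to 0.0 to avoid dividing by zero.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [2 * (v - lo) / (hi - lo) - 1 for v in values]
```

A continuous quantity such as D_time would presumably need to be binned into discrete ranges before One-Hot encoding; the source does not say how, so binning is left out here.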
Optionally, in the above cross-province matching method for expressway freight vehicles, the step of training on the training feature set with multiple preset machine learning algorithms to obtain a corresponding set of cross-province matching algorithm models, and selecting one of these models as the first algorithm model, includes:
Training on the training feature set with the logistic regression, K-nearest neighbors, support vector machine, naive Bayes, decision tree, random forest, and gradient boosting machine learning algorithms respectively to obtain the corresponding models, and computing an accuracy score for each model;
Taking the model with the highest accuracy score as the first algorithm model.
Optionally, in the above cross-province matching method for expressway freight vehicles, the step of evaluating the first algorithm model on the test feature set and adjusting the first algorithm model according to the evaluation result to obtain the target algorithm model includes:
Testing the first algorithm model on the test feature set, drawing the learning curve and the ROC curve, and computing the AUC value;
Judging the fitting state of the first algorithm model from the learning curve, ROC curve, and AUC value;
Adjusting the parameters and feature variables of the first algorithm model according to the fitting state to obtain the target algorithm model.
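As an illustration of the ROC and AUC evaluation, the sketch below computes ROC points and the area under them from test labels and model scores using only the standard library. The function names are hypothetical, labels are assumed to be 1 for matched and 0 for unmatched pairs, and both classes are assumed to be present in the test set.

```python
def roc_points(labels, scores):
    """Return ROC points (false positive rate, true positive rate).

    Thresholds sweep from the highest score down; records with equal
    scores are processed together so ties produce a single point.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

An AUC near 1 on the test set alongside a similar training score would suggest a reasonable fit, while a large gap between the two would point to overfitting; how the patent turns these readings into parameter adjustments is not specified in this section.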
Optionally, in the above cross-province matching method for expressway freight vehicles, the step of computing the similarity between the license plate number in the to-be-processed entry data and the license plate number in the to-be-processed exit data to obtain a similarity result, and optimizing the matching result based on the similarity result, includes:
For each pair of to-be-processed entry data and to-be-processed exit data whose matching result is "matched", computing the similarity between the two license plate numbers with the JaroWinklerDistance algorithm;
If the similarity is greater than a preset value, keeping the matching result; otherwise, modifying the matching result.
The present invention also provides a cross-province matching device for expressway freight vehicles, including:
An acquisition module, for obtaining pre-stored entry data and exit data of freight vehicles at cross-province toll stations, and generating, based on the degree of match between the license plate number in the exit data and the license plate number in the entry data, sample data that includes matched data and unmatched data;
A computing module, for performing mathematical operations on the entry data and the exit data in the sample data to obtain composite indicators, and compiling statistics on the composite indicators to obtain correlation indicators;
A processing module, for processing the correlation indicators to obtain a candidate feature set, evaluating the importance of the candidate feature set to obtain a target feature set, and splitting the target feature set into a training feature set and a test feature set;
A training module, for training on the training feature set with multiple preset machine learning algorithms to obtain a corresponding set of cross-province matching algorithm models, and selecting one of these models as the first algorithm model;
An evaluation module, for evaluating the first algorithm model on the test feature set to obtain an evaluation result, and adjusting the first algorithm model according to the evaluation result to obtain the target algorithm model;
A matching module, for matching to-be-processed entry data and to-be-processed exit data of the cross-province toll stations with the target algorithm model to obtain a matching result;
An optimization module, for computing the similarity between the license plate number in the to-be-processed entry data and the license plate number in the to-be-processed exit data corresponding to the matching result to obtain a similarity result, and optimizing the matching result based on the similarity result.
The cross-province matching method and device for expressway freight vehicles provided by the present invention obtain pre-stored entry data and exit data of freight vehicles at cross-province toll stations, obtain a target algorithm model based on the entry data and exit data, match to-be-processed entry data and to-be-processed exit data of the cross-province toll stations with the target algorithm model to obtain a matching result, compute the similarity between the license plate number in the to-be-processed entry data and the license plate number in the to-be-processed exit data to obtain a similarity result, and optimize the matching result based on the similarity result. This arrangement helps ensure the accuracy of cross-province matching of expressway freight vehicles. It addresses both the cross-province segmentation of expressway toll data and the inability to match freight vehicles across provinces directly by license plate number when plate recognition is incomplete or wrong, so that the complete path a freight vehicle travels on expressways can be obtained.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Description of the drawings
Fig. 1 is a structural block diagram of the terminal device provided by an embodiment of the present invention.
Fig. 2 is a flow diagram of the cross-province matching method for expressway freight vehicles provided by an embodiment of the present invention.
Fig. 3 is a flow diagram of step S110 in Fig. 2.
Fig. 4 is the ROC curve for cross-province freight vehicle matching provided by an embodiment of the present invention.
Fig. 5 is a connection block diagram of the cross-province matching device for expressway freight vehicles provided by an embodiment of the present invention.
Reference numerals: 10 - terminal device; 12 - memory; 14 - processor; 100 - cross-province matching device for expressway freight vehicles; 110 - acquisition module; 120 - computing module; 130 - processing module; 140 - training module; 150 - evaluation module; 160 - matching module; 170 - optimization module.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments generally described and illustrated in the drawings herein can be arranged and designed in a variety of different configurations.
As shown in Fig. 1, an embodiment of the present invention provides a terminal device 10, including a memory 12, a processor 14, and a cross-province matching device 100 for expressway freight vehicles. The terminal device 10 can be, but is not limited to, a server, smartphone, personal computer (PC), tablet computer, or other electronic device with data processing capability; no specific limitation is made here.
In this embodiment, the memory 12 and the processor 14 are electrically connected, directly or indirectly, to transmit or exchange data. For example, these elements can be electrically connected to each other through one or more communication buses or signal lines. The cross-province matching device 100 includes at least one software function module that can be stored in the memory 12 in the form of software or firmware. The processor 14 executes the executable modules stored in the memory 12, such as the software function modules and computer programs included in the cross-province matching device 100, to implement the cross-province matching method for expressway freight vehicles.
The memory 12 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and so on. The memory 12 stores a program, and the processor 14 executes the program after receiving an execution instruction.
The processor 14 may be an integrated circuit chip with signal processing capability. It can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and so on; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor can be a microprocessor or any conventional processor.
It can be understood that the structure shown in Fig. 1 is only illustrative; the terminal device 10 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1. Each component shown in Fig. 1 can be implemented in hardware, software, or a combination thereof.
With reference to Fig. 2, an embodiment of the present invention also provides a cross-province matching method for expressway freight vehicles that can be applied to the above terminal device 10. The method includes seven steps, step S110 to step S170.
Step S110: Obtain pre-stored entry data and exit data of freight vehicles at cross-province toll stations, and generate, based on the degree of match between the license plate number in the exit data and the license plate number in the entry data, sample data that includes matched data and unmatched data.
The pre-stored entry data and exit data of freight vehicles at cross-province toll stations can be obtained in either of two ways. In the first, the toll station information determined from user-supplied locations of expressway toll stations nationwide and provincial administrative boundaries is taken as the cross-province toll station information, and the exit data and entry data corresponding to that information are obtained. In the second, the cross-province toll station information is derived from the expressway network topology in a map service such as Amap or Tencent Maps, and the entry data and exit data of freight vehicles corresponding to that information are obtained.
Optionally, with reference to Fig. 2, in this embodiment the step of obtaining the pre-stored entry data and exit data of freight vehicles at cross-province toll stations and generating, based on the degree of match between the license plate number in the exit data and the license plate number in the entry data, sample data that includes matched data and unmatched data includes:
Step S112: Obtain cross-province toll station information from the expressway network topology, the toll station locations, and the provincial administrative boundaries, and obtain the pre-stored freight vehicle exit data and entry data corresponding to the cross-province toll station information.
Step S114: Take the exit data that have a complete license plate number as target exit data; use a preset algorithm to search the entry data for records whose license plate number is consistent with the complete license plate number in the target exit data and whose time differs from the time in the exit data by an amount within a set time range, and take those records as target entry data; take the target entry data and the target exit data matched to them as matched data, and the other data as unmatched data.
It should be noted that, in this embodiment, the exit station and the entry station of a cross-province toll station may be separate or combined. When they are separate, the distance between the exit station and the entry station should be within a relatively short range, for example within 20, 30, or 40 kilometers.
The preset algorithm can be the JaroWinkler Distance algorithm or the Levenshtein Distance algorithm; it is chosen according to actual needs and is not specifically limited here.
Optionally, in this embodiment, the step of using a preset algorithm to search the entry data for target entry data whose license plate number is consistent with the complete license plate number in the target exit data and whose time difference is within a set time range, and dividing the data into matched data and unmatched data, includes:
Calculating the similarity S_license between the license plate number L_exit in the exit data and the license plate number L_entry in the entry data with the JaroWinklerDistance algorithm:
S_j = (1/3) × (m/|L_exit| + m/|L_entry| + (m - t)/m)
S_license = S_j + l × p × (1 - S_j)
where m is the number of matching characters between L_exit and L_entry, t is the number of transpositions, |L_exit| and |L_entry| are the lengths of the two license plate numbers, l is the length of their common prefix, and p is the prefix scaling factor;
Given the standard that the minimum speed on an expressway must not be less than V kilometers per hour, and the distance D (in kilometers) between the cross-province toll stations, calculating the travel time T (in minutes) from the exit toll station to the entry toll station:
T = (D/V) × 60
Screening the entry data based on the travel time T: when the entry time in the entry data minus the exit time in the exit data falls within the time interval [-T, T] and the license plate similarity S_license is greater than a set value, the vehicle in the exit data and the vehicle in the entry data are judged to be the same vehicle; the entry data and the exit data matched to it are taken as the target entry data and target exit data respectively and included in the matched data, and the other data are included in the unmatched data.
Here, the number of transpositions t is the number of unmatched characters; the set value can be, but is not limited to, 0.8, 0.9, or 0.95; and when the similarity is greater than the set value, the license plate number in the exit data and the license plate number in the entry data are judged to be consistent.
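The similarity calculation can be sketched in plain Python as below. This is a hedged illustration that follows the standard Jaro-Winkler definition, which may differ in detail from the patent's own counting: here t is half the number of matched characters that appear in a different order, the common prefix length l is capped at 4, and the scaling factor p defaults to 0.1. The function names, the cap, and the default for p are assumptions, not values given in the source.

```python
def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity S_j in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Two characters match only if equal and at most `window` positions apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    m = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                m += 1
                break
    if m == 0:
        return 0.0
    # t: half the number of matched characters that are out of order.
    seq1 = [c for c, ok in zip(s1, matched1) if ok]
    seq2 = [c for c, ok in zip(s2, matched2) if ok]
    t = sum(a != b for a, b in zip(seq1, seq2)) / 2
    return (m / len1 + m / len2 + (m - t) / m) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """S_license = S_j + l * p * (1 - S_j), with l the common prefix length."""
    sj = jaro(s1, s2)
    l = 0
    for a, b in zip(s1, s2):
        if a != b or l == max_prefix:
            break
        l += 1
    return sj + l * p * (1 - sj)
```

With this definition, two plates that differ in one misread character still score close to 1, which is why a set value such as 0.8, 0.9, or 0.95 can tolerate partial recognition errors.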
It should be noted that the spatial data for expressway toll stations nationwide and for provincial administrative boundaries must use a unified spatial coordinate system, to avoid position offsets caused by inconsistent coordinate systems. When calculating the distance D between cross-province toll stations, the toll stations and the provincial boundaries can be overlaid in the same map window, the cross-province toll stations extracted manually, and the distance between them calculated with the tools of a GIS platform.
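The screening rule of step S114 (time window plus plate similarity) can be sketched as follows. The record type and function names are hypothetical, the plate similarity is passed in as a precomputed number, and the 0.9 threshold is just one of the example set values named in the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TollRecord:
    plate: str       # recognized license plate number
    time: datetime   # exit time or entry time at the toll station


def travel_time_minutes(distance_km: float, min_speed_kmh: float) -> float:
    """T = (D / V) * 60: longest plausible travel time in minutes."""
    return distance_km / min_speed_kmh * 60


def is_sample_match(exit_rec: TollRecord, entry_rec: TollRecord,
                    t_minutes: float, similarity: float,
                    threshold: float = 0.9) -> bool:
    """True if entry time minus exit time lies in [-T, T] and the
    plate similarity exceeds the set value."""
    delta = entry_rec.time - exit_rec.time
    in_window = abs(delta) <= timedelta(minutes=t_minutes)
    return in_window and similarity > threshold
```

For example, with stations 30 km apart and an illustrative minimum speed of 60 km/h, `travel_time_minutes(30, 60)` gives a 30-minute window on either side of the exit time.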
Step S120:By in the sample data entry data and the outlet data perform mathematical calculations it is comprehensive to obtain Index is closed, and the overall target is counted to obtain correlation metric.
Wherein, the outlet data can include but is not limited to:Export charge station coding, outlet license plate number, Outlet time, Export vehicle, the total number of axle of outlet vehicle, outlet vehicle goods gross weight and outlet vehicle weight limitation, the entry data may include but not Be limited to entrance charge station coding, entrance license plate number, entry time, entrance vehicle, the total number of axle of entrance vehicle, entrance vehicle goods gross weight with And entrance vehicle weight limitation.
The step of performing mathematical operations on the entry data and the outlet data in the sample data to obtain composite indicators, and computing statistics on the composite indicators to obtain correlation indicators, includes:
From the entry time $T_{Entrance}$, entrance vehicle type $C_{Entrance}$, entrance total axle count $A_{Entrance}$, entrance total vehicle-and-cargo weight $W_{Entrance}$ and entrance vehicle weight limit $LW_{Entrance}$ in the entry data, subtract the outlet time $T_{Outlet}$, outlet vehicle type $C_{Outlet}$, outlet total axle count $A_{Outlet}$, outlet total vehicle-and-cargo weight $W_{Outlet}$ and outlet vehicle weight limit $LW_{Outlet}$ in the corresponding outlet data to obtain the composite indicators:
$D_{time} = T_{Entrance} - T_{Outlet}$
$D_{car} = C_{Entrance} - C_{Outlet}$
$D_{axis} = A_{Entrance} - A_{Outlet}$
$D_{weight} = W_{Entrance} - W_{Outlet}$
$D_{limitweight} = LW_{Entrance} - LW_{Outlet}$
where $D_{time}$ is the entrance-outlet time difference, $D_{car}$ is the entrance-outlet vehicle type difference, $D_{axis}$ is the entrance-outlet total axle count difference, $D_{weight}$ is the entrance-outlet total vehicle-and-cargo weight difference, and $D_{limitweight}$ is the entrance-outlet vehicle weight limit difference;
Compute, by category, the feature distributions of the time difference $D_{time}$, vehicle type difference $D_{car}$, total axle count difference $D_{axis}$, total weight difference $D_{weight}$ and weight limit difference $D_{limitweight}$, and select the correlation indicators for judging whether outlet data and entry data match.
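The five differences above can be sketched in Python as follows. The `TollRecord` class and its field names are hypothetical, and expressing the time difference in minutes is an assumption made for illustration; the embodiment does not fix a record layout.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TollRecord:
    """Hypothetical record holding the fields used by the composite indicators."""
    time: datetime        # entry time or outlet time
    vehicle_type: int     # vehicle type code
    axle_count: int       # total axle count
    gross_weight: float   # total vehicle-and-cargo weight
    weight_limit: float   # vehicle weight limit


def composite_indicators(entry: TollRecord, outlet: TollRecord) -> dict:
    """Return entry-minus-outlet differences for one candidate record pair."""
    return {
        # Time difference in minutes (unit is an assumption of this sketch).
        "D_time": (entry.time - outlet.time).total_seconds() / 60.0,
        "D_car": entry.vehicle_type - outlet.vehicle_type,
        "D_axis": entry.axle_count - outlet.axle_count,
        "D_weight": entry.gross_weight - outlet.gross_weight,
        "D_limitweight": entry.weight_limit - outlet.weight_limit,
    }
```

Keeping the sign (entry minus outlet) rather than the absolute value preserves the asymmetry of the distributions that the later statistics rely on.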
Step S130: Process the correlation indicators to obtain a candidate feature set, evaluate the importance of the candidate feature set to obtain a target feature set, and split the target feature set into a training feature set and a test feature set.
The target feature set may be split into the training feature set and the test feature set according to a set ratio, for example 8:2 or 7:3.
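A minimal sketch of such a ratio-based split, using only the Python standard library, is given below. The function name, the shuffling step and the fixed seed are assumptions of this sketch; the embodiment only specifies the ratio.

```python
import random


def split_dataset(rows, train_ratio=0.8, seed=0):
    """Shuffle the rows and split them into (training, test) parts by ratio."""
    rows = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(rows)
    cut = int(round(len(rows) * train_ratio))
    return rows[:cut], rows[cut:]
```

Calling `split_dataset(samples, 0.7)` would give the 7:3 variant mentioned above.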
The step of processing the correlation indicators to obtain a candidate feature set, and evaluating the importance of the candidate feature set to obtain a target feature set, includes:
Process the time difference $D_{time}$, vehicle type difference $D_{car}$, total axle count difference $D_{axis}$, total weight difference $D_{weight}$ and weight limit difference $D_{limitweight}$ using nondimensionalization, quantification of qualitative features, binarization of quantitative features and discrete feature encoding, to form the candidate feature set for cross-provincial matching of expressway freight vehicles;
Evaluate the importance of the candidate feature set using the correlation coefficient method or the variance selection method to obtain the target feature set.
Specifically, in the present embodiment, the time difference $D_{time}$, vehicle type difference $D_{car}$ and total axle count difference $D_{axis}$ are each processed with One-Hot encoding, and the total weight difference $D_{weight}$ and weight limit difference $D_{limitweight}$ are each scaled into the interval [-1, 1], to obtain the candidate feature set for cross-provincial matching of expressway freight vehicles.
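The two transformations can be sketched in plain Python as follows. The helper names are hypothetical, and the category ranges passed in are illustrative; a library encoder and scaler could be used instead.

```python
def one_hot(value, categories, prefix):
    """Encode one discrete value as a dict of 0/1 indicator features."""
    return {f"{prefix}_{c}": int(value == c) for c in categories}


def scale_to_unit_interval(values):
    """Linearly rescale values so the minimum maps to -1 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to the midpoint.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]
```

For example, `one_hot(0, range(-4, 5), "AD")` yields nine indicator features named `AD_-4` through `AD_4`, with only `AD_0` set to 1.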
Step S140: Train on the training feature set with a plurality of preset machine learning algorithms to obtain a corresponding plurality of cross-provincial matching algorithm models, and select one of them as the first algorithm model.
The plurality of preset machine learning algorithms may include, but is not limited to, logistic regression, K-nearest neighbors, support vector machine, naive Bayes, decision tree, random forest and gradient boosting.
The step of training on the training feature set with a plurality of preset machine learning algorithms to obtain a corresponding plurality of cross-provincial matching algorithm models, and selecting one of them as the first algorithm model, includes:
Train on the feature set with the logistic regression, K-nearest neighbors, support vector machine, naive Bayes, decision tree, random forest and gradient boosting algorithms respectively, and calculate the accuracy score of each model.
Take the model with the highest accuracy score as the first algorithm model.
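The selection step can be sketched generically: fit every candidate, score it by accuracy, and keep the best. The two toy classifiers below are stand-ins invented for this sketch so that it runs without third-party libraries; they are not the algorithms listed above. The evaluation set is passed explicitly because the embodiment does not say which data the accuracy score is computed on.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true labels."""
    return sum(int(a == b) for a, b in zip(y_true, y_pred)) / len(y_true)


class MajorityClassifier:
    """Toy stand-in: always predicts the most frequent training label."""
    def fit(self, X, y):
        self.label_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class ZeroDiffClassifier:
    """Toy stand-in: predicts a match (1) when the first feature equals 0."""
    def fit(self, X, y):
        return self

    def predict(self, X):
        return [1 if row[0] == 0 else 0 for row in X]


def select_first_model(candidates, X_train, y_train, X_eval, y_eval):
    """Fit each candidate and return (best_name, scores) ranked by accuracy."""
    scores = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy(y_eval, model.predict(X_eval))
    best = max(scores, key=scores.get)
    return best, scores
```

Any estimator that exposes `fit` and `predict` in this way could be placed in the `candidates` dictionary.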
Step S150: Evaluate the first algorithm model with the test feature set to obtain an evaluation result, and adjust the first algorithm model according to the evaluation result to obtain the target algorithm model.
Specifically, the step of evaluating the first algorithm model with the test feature set to obtain an evaluation result, and adjusting the first algorithm model according to the evaluation result to obtain the target algorithm model, includes:
Test the first algorithm model with the test feature set, plot the learning curve and the ROC curve, and calculate the AUC value.
Judge the fitting state of the first algorithm model from the learning curve, the ROC curve and the AUC value.
Adjust the parameters and feature variables of the first algorithm model according to the fitting state to obtain the target algorithm model.
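For reference, the AUC value can be computed directly from labels and predicted scores as the probability that a randomly chosen matched pair is scored above a randomly chosen non-matched pair. The sketch below uses this pairwise definition; the function name is hypothetical and the quadratic loop is chosen for clarity, not efficiency.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))
```

A value near 1 indicates that the model separates matched from non-matched pairs well, while 0.5 corresponds to random ranking.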
Step S160: Match the pending entry data and pending outlet data of the cross-provincial toll stations using the target algorithm model to obtain a matching result.
Specifically, the pending entry data and pending outlet data are input to the target algorithm model, so that the target algorithm model matches the outlet data with the entry data.
Step S170: Calculate the similarity between the license plate number in the pending entry data and the license plate number in the pending outlet data corresponding to the matching result to obtain a similarity result, and optimize the matching result based on the similarity result.
The similarity between the license plate number in the pending entry data and the license plate number in the pending outlet data corresponding to the matching result may be calculated using the JaroWinklerDistance algorithm or the Levenshtein Distance algorithm. The choice is made according to actual requirements and is not specifically limited here.
Optionally, in the present embodiment, the step of calculating the similarity between the license plate number in the pending entry data and the license plate number in the pending outlet data to obtain a similarity result, and optimizing the matching result based on the similarity result, includes:
For each matching result that indicates a match, calculate the similarity between the license plate number in the corresponding pending entry data and the license plate number in the corresponding pending outlet data using the JaroWinklerDistance algorithm.
When the similarity is greater than a preset value, the matching result is maintained; otherwise, the matching result is modified.
The preset value may be 0.7, 0.75, 0.8, 0.85 or 0.9. It is not specifically limited here and is set according to actual requirements.
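The post-processing rule can be sketched as follows. To keep the example self-contained, `difflib.SequenceMatcher` from the Python standard library is used as a stand-in similarity; it is not the JaroWinklerDistance algorithm named above, and any similarity function returning a value in [0, 1] could be passed instead. The function names, the 0.8 default and the plate numbers in the usage note are assumptions for illustration.

```python
from difflib import SequenceMatcher


def plate_similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; NOT Jaro-Winkler, used only for illustration."""
    return SequenceMatcher(None, a, b).ratio()


def refine_matches(plate_pairs, predictions, similarity=plate_similarity, threshold=0.8):
    """Keep a predicted match only if the two plates are similar enough.

    plate_pairs: list of (entry_plate, outlet_plate) tuples
    predictions: list of model outputs, 1 = match, 0 = no match
    """
    refined = []
    for (entry_plate, outlet_plate), predicted in zip(plate_pairs, predictions):
        if predicted == 1 and similarity(entry_plate, outlet_plate) <= threshold:
            refined.append(0)          # modify: plates too dissimilar
        else:
            refined.append(predicted)  # maintain the model's judgment
    return refined
```

For instance, `refine_matches([("贵A12345", "桂B99999")], [1])` turns the predicted match into a non-match, because the two plates share no characters.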
With the above arrangement, cross-provincial matching of expressway freight vehicles is achieved effectively. This solves the segmentation of expressway toll data at provincial boundaries, as well as the problem that freight vehicles cannot be matched across provinces directly by license plate number when plate recognition is incomplete or wrong. The complete driving path of a freight vehicle on the national expressway network can thus be restored, providing basic support for analysis and decision-making such as expressway transport statistics and economic operation analysis.
In the present embodiment, the cross-provincial toll stations are illustrated with the Guizhou Xinzhai toll station as the cross-provincial outlet station and the Guangxi Qiangui Liuzhai toll station as the cross-provincial entrance station. The pre-stored freight vehicle outlet data of the Guizhou Xinzhai toll station and entry data of the Guangxi Qiangui Liuzhai toll station for June 2017 are obtained, and records with an empty license plate number or empty data indicators are rejected. The entry data includes the entrance toll station number, entrance toll station name, entrance license plate number, entry time, entrance vehicle type, entrance total axle count, entrance total vehicle-and-cargo weight and entrance vehicle weight limit. The outlet data includes the outlet toll station number, outlet toll station name, outlet license plate number, outlet time, outlet vehicle type, outlet total axle count, outlet total vehicle-and-cargo weight and outlet vehicle weight limit.
For the June 2017 freight vehicle outlet data of the Guizhou Xinzhai toll station, see Table 1:
Table 1
For the June 2017 freight vehicle entry data of the Guangxi Qiangui Liuzhai toll station, see Table 2:
Table 2
A license plate number generally consists of a Chinese character, letters and digits: the first character is a Chinese character abbreviating the province, the second is a letter that generally denotes the city, and the remaining characters are letters or digits, for a total length of 7. According to this coding rule, the records with a complete license plate number that conforms to the rule are filtered out of the outlet data.
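The coding rule just described can be checked with a short regular expression. The sketch below is a simplified reading of that rule (one Chinese character, one uppercase letter, then five uppercase letters or digits); the pattern, the function name and the plate numbers in the usage note are assumptions, and special plate formats are not covered.

```python
import re

# One CJK character (province), one uppercase letter (city), five letters/digits.
PLATE_PATTERN = re.compile(r"[\u4e00-\u9fff][A-Z][A-Z0-9]{5}")


def is_complete_plate(plate: str) -> bool:
    """Return True if the whole string follows the 7-character plate rule."""
    return PLATE_PATTERN.fullmatch(plate) is not None
```

With made-up plates, `is_complete_plate("贵A12345")` is true, while a truncated recognition result such as `"贵A1234"` is rejected.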
Calculate the transit time between the Guizhou Xinzhai toll station and the Guangxi Qiangui Liuzhai toll station. From the latitude and longitude coordinates of the two toll stations and the structure of the expressway network, the distance between them is calculated to be 8.4 kilometers. Ignoring abnormal conditions in the expressway network topology, and given that the minimum speed on an expressway must not be less than 60 kilometers per hour, the transit time between the two toll stations is at most T = (8.4 / 60) × 60 = 8.4 minutes.
Taking the outlet time in the outlet data as the reference, the set of entry records whose entry time differs from the outlet time by a value within [-9, 9] minutes is filtered out of the entry data.
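The travel-time bound and the time-window screening can be sketched as follows. The function names are hypothetical, and rounding 8.4 minutes up to a 9-minute window is an assumption inferred from the [-9, 9] interval above.

```python
import math
from datetime import datetime, timedelta


def travel_time_minutes(distance_km: float, min_speed_kmh: float) -> float:
    """Upper bound on travel time: T = (D / V) * 60, in minutes."""
    return distance_km / min_speed_kmh * 60.0


def candidate_entries(outlet_time, entry_times, window_minutes):
    """Return entry times lying within [-window, +window] of the outlet time."""
    window = timedelta(minutes=window_minutes)
    return [t for t in entry_times if abs(t - outlet_time) <= window]


# Window used for screening, rounded up to whole minutes (assumed rounding rule).
T = travel_time_minutes(8.4, 60)
WINDOW = math.ceil(T)
```

Only the entry records returned by `candidate_entries` need to be compared by plate similarity, which keeps the number of candidate pairs per outlet record small.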
The similarity between the outlet license plate number and the entrance license plate number is calculated using the JaroWinklerDistance algorithm. For the results, see Table 3:
Table 3
If the license plate similarity is 0.9 or above, the outlet vehicle and the entry vehicle are judged to be the same vehicle and the record pair is included in the matched data; otherwise, they are judged to be different vehicles and the record pair is included in the non-matched data. The matched data and the non-matched data together constitute the sample data for cross-provincial matching of expressway freight vehicles; see Table 4:
Table 4
By performing mathematical operations on the outlet data and entry data in the sample data, the discrete entry and outlet indicators are converted into comparable, analyzable composite indicators. The distribution of the composite indicators is analyzed with statistical charts such as histograms, in order to identify the correlation indicators for judging whether outlet data and entry data match. Specifically, the differences between the entry and outlet indicators are calculated, namely the time difference $D_{time}$, vehicle type difference $D_{car}$, total axle count difference $D_{axis}$, total weight difference $D_{weight}$ and weight limit difference $D_{limitweight}$. For the results, see Table 5:
Table 5
Grouped by the match label, the data distributions of the time difference $D_{time}$, vehicle type difference $D_{car}$, total axle count difference $D_{axis}$, total weight difference $D_{weight}$ and weight limit difference $D_{limitweight}$ are counted.
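Counting a distribution by match label amounts to a grouped frequency count, as in the following sketch. The dictionary keys and the sample values are hypothetical and do not reproduce the embodiment's data.

```python
from collections import Counter


def distribution_by_match(samples, indicator):
    """Count samples for each (indicator value, matched flag) combination."""
    return Counter((s[indicator], s["matched"]) for s in samples)


# Illustrative sample pairs only; not data from the embodiment.
samples = [
    {"D_car": 0, "matched": 1},
    {"D_car": 0, "matched": 1},
    {"D_car": 1, "matched": 1},
    {"D_car": -3, "matched": 0},
    {"D_car": 0, "matched": 0},
]
counts = distribution_by_match(samples, "D_car")
```

Each key of `counts` corresponds to one row of a distribution table: an indicator value, the match label, and the number of samples.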
For the data distribution of the entrance-outlet time difference $D_{time}$, see Table 6:
Table 6
For the data distribution of the entrance-outlet vehicle type difference $D_{car}$, see Table 7:
Table 7
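| Entrance-outlet vehicle type difference | Matched | Sample count |
|---|---|---|
| -4 | 0 | 84389 |
| -3 | 0 | 193407 |
| -2 | 0 | 101191 |
| -1 | 0 | 129205 |
| 0 | 0 | 1121537 |
| 1 | 0 | 143759 |
| 2 | 0 | 101843 |
| 3 | 0 | 140178 |
| 4 | 0 | 145883 |
| -4 | 1 | 6 |
| -3 | 1 | 1 |
| -2 | 1 | 5 |
| -1 | 1 | 109 |
| 0 | 1 | 42986 |
| 1 | 1 | 2061 |
| 2 | 1 | 30 |
| 3 | 1 | 11 |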
For the data distribution of the entrance-outlet total axle count difference $D_{axis}$, see Table 8:
Table 8
For the data distribution of the entrance-outlet total vehicle-and-cargo weight difference $D_{weight}$, see Table 9:
Table 9
For the data distribution of the entrance-outlet vehicle weight limit difference $D_{limitweight}$, see Table 10:
Table 10
Tables 6 to 10 show that the time difference $D_{time}$, vehicle type difference $D_{car}$, total axle count difference $D_{axis}$, total weight difference $D_{weight}$ and weight limit difference $D_{limitweight}$ of the sample data follow clear distribution patterns. For example, the time difference $D_{time}$ is concentrated within [-5, 7] minutes, the vehicle type difference $D_{car}$ within [-1, 1], and the total axle count difference $D_{axis}$ within [-1, 1]; the total weight difference $D_{weight}$ and weight limit difference $D_{limitweight}$ are likewise concentrated within certain ranges. Therefore, $D_{time}$, $D_{car}$, $D_{axis}$, $D_{weight}$ and $D_{limitweight}$ can be selected as the correlation indicators for judging cross-provincial matching of expressway freight vehicles.
The three indicators $D_{time}$, $D_{car}$ and $D_{axis}$ are One-Hot encoded, which converts each discrete value into a binary feature with multiple status bits. The two indicators $D_{weight}$ and $D_{limitweight}$ are range-scaled with MinMaxScaler into the interval [-1, 1]. Together these form the candidate feature set for cross-provincial matching of expressway freight vehicles, specifically:
{WeightDifference, LimitWeightDifference, TD_-14, TD_-13, TD_-12, TD_-11, TD_-10, TD_-9, TD_-8, TD_-7, TD_-6, TD_-5, TD_-4, TD_-3, TD_-2, TD_-1, TD_0, TD_1, TD_2, TD_3, TD_4, TD_5, TD_6, TD_7, TD_8, TD_9, TD_10, TD_11, TD_12, TD_13, TD_14, AD_-4, AD_-3, AD_-2, AD_-1, AD_0, AD_1, AD_2, AD_3, AD_4, MD_-4, MD_-3, MD_-2, MD_-1, MD_0, MD_1, MD_2, MD_3, MD_4}, where features beginning with TD_ are generated by One-Hot encoding the entrance-outlet time difference, features beginning with MD_ are generated by One-Hot encoding the entrance-outlet vehicle type difference, and features beginning with AD_ are generated by One-Hot encoding the entrance-outlet total axle count difference.
The importance of the candidate feature set is evaluated using the Random Forest method. For the evaluation results, see Table 11:
Table 11
According to the importance ranking of the features, the top 10 features are chosen as the target feature set, which is split in a ratio of 0.8 to 0.2: 80% of the data is used for training as the training feature set, 20% is used for testing as the test feature set, and the models are built on the training feature set.
That is, the target feature set comprising TD_0, WeightDifference, LimitWeightDifference, MD_0, AD_0, MD_1, AD_-2, AD_-1, AD_1 and MD_-3 is used to train machine learning algorithms such as logistic regression, K-nearest neighbors, support vector machine, naive Bayes, decision tree, random forest and gradient boosting, and the prediction accuracy of each is calculated. For the results, see Table 12:
Table 12
| No. | Algorithm | Prediction accuracy |
|---|---|---|
| 1 | RandomForest | 0.985375 |
| 2 | LogisticRegression | 0.975025 |
| 3 | KNN | 0.075936 |
| 4 | GradientBoosting | 0.986650 |
| 5 | AdaBoosting | 0.986975 |
| 6 | DecisionTree | 0.985550 |
| 7 | GaussianNaiveBayes | 0.900350 |
| 8 | SVC | 0.974725 |
As can be seen from Table 12, the prediction accuracy of the AdaBoosting algorithm is higher than that of the other algorithms. The AdaBoosting algorithm is therefore selected as the first algorithm model for the cross-provincial matching of expressway freight vehicles.
The first algorithm model is tested with the test feature set. For the test results, see Table 13:
Table 13
Specifically, the first algorithm model is evaluated by plotting the learning curve and the ROC curve and calculating the AUC value, and is then optimized and improved according to the evaluation result to obtain the target algorithm model.
The AUC value calculated for the AdaBoosting algorithm is 0.9798609318093167; for the ROC curve, refer to Fig. 4. The ROC curve and the AUC value show that the AdaBoosting algorithm is well suited to the cross-provincial matching of expressway freight vehicles.
The target algorithm model trained on the June 2017 data from the Guizhou Xinzhai toll station to the Guangxi Qiangui Liuzhai toll station is applied to judge whether expressway freight vehicles in July and August 2017 match across provinces, in order to test the generalization ability of the model. For the statistics, see Table 14:
Table 14
The statistics in Table 14 show that the model has good generalization ability overall, but that its prediction accuracy on matched data is relatively low.
Therefore, a license plate similarity algorithm is used to address the model's weak generalization on matched data and to further improve the precision of cross-provincial matching of expressway freight vehicles. For the results, see Table 15:
Table 15
If the similarity between the entry and outlet license plate numbers is greater than 0.8, the model's judgment is maintained; otherwise, the model's judgment is modified. Specifically, for each result that the model judges to be a match, the similarity between the outlet license plate number and the entrance license plate number is calculated with the license plate similarity method. If the similarity is greater than a set value, the model's judgment is maintained; otherwise, the model's judgment is modified.
Referring to Fig. 5, on the basis of the above, the present invention also provides a cross-provincial matching device 100 for expressway freight vehicles, including an acquisition module 110, a computing module 120, a processing module 130, a training module 140, an evaluation module 150, a matching module 160 and an optimization module 170.
The acquisition module 110 is used to obtain the pre-stored entry data and outlet data of freight vehicles corresponding to the cross-provincial toll stations, and to generate sample data comprising matched data and non-matched data based on the degree of matching between the license plate number in the outlet data and the license plate number in the entry data. In the present embodiment, the acquisition module 110 can be used to execute step S110 shown in Fig. 2; for a specific description of the acquisition module 110, refer to the description of step S110 above.
The computing module 120 is used to perform mathematical operations on the entry data and the outlet data in the sample data to obtain composite indicators, and to compute statistics on the composite indicators to obtain correlation indicators. In the present embodiment, the computing module 120 can be used to execute step S120 shown in Fig. 2; for a specific description of the computing module 120, refer to the description of step S120 above.
The processing module 130 is used to process the correlation indicators to obtain a candidate feature set, to evaluate the importance of the candidate feature set to obtain a target feature set, and to split the target feature set into a training feature set and a test feature set. In the present embodiment, the processing module 130 can be used to execute step S130 shown in Fig. 2; for a specific description of the processing module 130, refer to the description of step S130 above.
The training module 140 is used to train on the training feature set with a plurality of preset machine learning algorithms to obtain a corresponding plurality of cross-provincial matching algorithm models, and to select one of them as the first algorithm model. In the present embodiment, the training module 140 can be used to execute step S140 shown in Fig. 2; for a specific description of the training module 140, refer to the description of step S140 above.
The evaluation module 150 is used to evaluate the first algorithm model with the test feature set to obtain an evaluation result, and to adjust the first algorithm model according to the evaluation result to obtain the target algorithm model. In the present embodiment, the evaluation module 150 can be used to execute step S150 shown in Fig. 2; for a specific description of the evaluation module 150, refer to the description of step S150 above.
The matching module 160 is used to match the pending entry data and pending outlet data of the cross-provincial toll stations using the target algorithm model to obtain a matching result. In the present embodiment, the matching module 160 can be used to execute step S160 shown in Fig. 2; for a specific description of the matching module 160, refer to the description of step S160 above.
The optimization module 170 is used to calculate the similarity between the license plate number in the pending entry data and the license plate number in the pending outlet data corresponding to the matching result to obtain a similarity result, and to optimize the matching result based on the similarity result. In the present embodiment, the optimization module 170 can be used to execute step S170 shown in Fig. 2; for a specific description of the optimization module 170, refer to the description of step S170 above.
In summary, the cross-provincial matching method and device for expressway freight vehicles provided by the present invention obtain the pre-stored entry data and outlet data of freight vehicles corresponding to the cross-provincial toll stations, obtain a target algorithm model based on the entry data and outlet data, match the pending entry data and pending outlet data of the cross-provincial toll stations using the target algorithm model to obtain a matching result, calculate the similarity between the license plate number in the pending entry data and the license plate number in the pending outlet data to obtain a similarity result, and optimize the matching result based on the similarity result. This arrangement safeguards the accuracy of cross-provincial matching of expressway freight vehicles, and solves both the segmentation of expressway toll data at provincial boundaries and the problem that freight vehicles cannot be matched across provinces directly by license plate number when plate recognition is incomplete or wrong.
If the functions are implemented in the form of software function modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, an electronic device, a network device or the like) to execute all or part of the steps of the methods of the embodiments of the present invention.

Claims (10)

1. A cross-provincial matching method for expressway freight vehicles, characterized in that the method comprises:
obtaining pre-stored entry data and outlet data of freight vehicles corresponding to cross-provincial toll stations, and generating sample data comprising matched data and non-matched data based on the degree of matching between the license plate number in the outlet data and the license plate number in the entry data;
performing mathematical operations on the entry data and the outlet data in the sample data to obtain composite indicators, and computing statistics on the composite indicators to obtain correlation indicators;
processing the correlation indicators to obtain a candidate feature set, evaluating the importance of the candidate feature set to obtain a target feature set, and splitting the target feature set into a training feature set and a test feature set;
training on the training feature set with a plurality of preset machine learning algorithms to obtain a corresponding plurality of cross-provincial matching algorithm models, and selecting one of the plurality of cross-provincial matching algorithm models as a first algorithm model;
evaluating the first algorithm model with the test feature set to obtain an evaluation result, and adjusting the first algorithm model according to the evaluation result to obtain a target algorithm model;
matching pending entry data and pending outlet data of the cross-provincial toll stations using the target algorithm model to obtain a matching result;
calculating the similarity between the license plate number in the pending entry data and the license plate number in the pending outlet data corresponding to the matching result to obtain a similarity result, and optimizing the matching result based on the similarity result.
2. The cross-provincial matching method for expressway freight vehicles according to claim 1, characterized in that the step of obtaining pre-stored entry data and outlet data of freight vehicles corresponding to cross-provincial toll stations, and generating matched data and non-matched data based on the degree of matching between the license plate number in the outlet data and the license plate number in the entry data, comprises:
obtaining cross-provincial toll station information according to the expressway network topology, the toll station locations and the provincial administrative boundaries, and obtaining the pre-stored freight vehicle outlet data and entry data corresponding to the cross-provincial toll station information;
taking the outlet data having a complete license plate number as target outlet data, using a preset algorithm to search the entry data for entry data whose license plate number is consistent with the complete license plate number in the target outlet data and for which the difference between the time in the outlet data and the time in the entry data lies within a set time range, taking such entry data as target entry data, taking the target entry data and the target outlet data matched with the target entry data as matched data, and taking the other data as non-matched data.
3. The cross-provincial matching method for expressway freight vehicles according to claim 2, characterized in that the step of taking the outlet data having a complete license plate number as target outlet data, using a preset algorithm to search the entry data for entry data whose license plate number is consistent with the complete license plate number in the target outlet data and for which the difference between the time in the outlet data and the time in the entry data lies within a set time range, taking such entry data as target entry data, taking the target entry data and the target outlet data matched with the target entry data as matched data, and taking the other data as non-matched data, comprises:
calculating the similarity $S_{license}$ between the license plate number $L_{Outlet}$ in the outlet data and the license plate number $L_{Entrance}$ in the entry data using the JaroWinklerDistance algorithm:
$S_{license} = S_j + l \cdot p \cdot (1 - S_j)$
where m is the number of characters matched between $L_{Outlet}$ and $L_{Entrance}$, and t is the number of transpositions;
given that the minimum speed on an expressway must not be less than V kilometers per hour, calculating, from the distance D between the cross-provincial toll stations, the travel time T from the outlet toll station to the entrance toll station:
$T = (D / V) \times 60$
screening the entry data based on the travel time T, and when the entry time in the entry data minus the outlet time in the outlet data falls within the interval [-T, T] and the license plate similarity $S_{license}$ exceeds a set value, judging that the vehicle corresponding to the outlet data and the vehicle corresponding to the entry data are the same vehicle, including the entry data and the outlet data matched with the entry data in the matched data as the target entry data and the target outlet data respectively, and including the other data in the non-matched data.
4. The highway freight vehicle cross-provincial matching method according to claim 1, characterized in that the entry data includes the entrance toll station code, entrance license plate number, entry time, entrance vehicle type, entrance total number of axles, entrance total vehicle-and-cargo weight, and entrance vehicle weight limit; the exit data includes the exit toll station code, exit license plate number, exit time, exit vehicle type, exit total number of axles, exit total vehicle-and-cargo weight, and exit vehicle weight limit; and the step of performing mathematical operations on the entry data and the exit data in the sample data to obtain composite indicators, and performing statistics on the composite indicators to obtain correlation indicators, includes:
From the entry time T_entry, entrance vehicle type C_entry, entrance total number of axles A_entry, entrance total vehicle-and-cargo weight W_entry, and entrance vehicle weight limit LW_entry in the entry data of the sample data, subtract the exit time T_exit, exit vehicle type C_exit, exit total number of axles A_exit, exit total vehicle-and-cargo weight W_exit, and exit vehicle weight limit LW_exit in the corresponding exit data to obtain the composite indicators:
D_time = T_entry - T_exit
D_car = C_entry - C_exit
D_axis = A_entry - A_exit
D_weight = W_entry - W_exit
D_limitweight = LW_entry - LW_exit
Wherein D_time is the entry-exit time difference, D_car is the entry-exit vehicle type difference, D_axis is the entry-exit total axle count difference, D_weight is the entry-exit total vehicle-and-cargo weight difference, and D_limitweight is the entry-exit vehicle weight limit difference;
Compile classified statistics on the feature distributions of the entry-exit time difference D_time, vehicle type difference D_car, total axle count difference D_axis, total vehicle-and-cargo weight difference D_weight, and vehicle weight limit difference D_limitweight, and select the correlation indicators used to judge whether the exit data and the entry data match.
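As a rough illustration of the five difference indicators, the sketch below computes them for one entry/exit pair. The `Record` class, its field names, the integer coding of vehicle type, and the choice of minutes as the unit of D_time are assumptions made for the example; the claim does not specify them.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Record:
    """Hypothetical toll record holding only the fields used by the indicators."""
    time: datetime
    vehicle_type: int      # vehicle type code (assumed to be numeric)
    axles: int             # total number of axles
    gross_weight: float    # total vehicle-and-cargo weight
    weight_limit: float    # vehicle weight limit


def difference_indicators(entry, exit_):
    """Return entry-minus-exit differences, one per indicator in the claim."""
    return {
        "D_time": (entry.time - exit_.time).total_seconds() / 60,  # minutes
        "D_car": entry.vehicle_type - exit_.vehicle_type,
        "D_axis": entry.axles - exit_.axles,
        "D_weight": entry.gross_weight - exit_.gross_weight,
        "D_limitweight": entry.weight_limit - exit_.weight_limit,
    }


exit_rec = Record(datetime(2018, 6, 25, 10, 0), 15, 6, 49.0, 49.0)
entry_rec = Record(datetime(2018, 6, 25, 10, 20), 15, 6, 49.5, 49.0)
indicators = difference_indicators(entry_rec, exit_rec)
```

Computing these dictionaries for every matched and non-matched pair in the sample data gives the distributions from which the correlation indicators are chosen.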
5. The highway freight vehicle cross-provincial matching method according to claim 4, characterized in that the step of processing the correlation indicators to obtain a candidate feature set, and evaluating the importance of the candidate feature set to obtain a target feature set, includes:
Processing the entry-exit time difference D_time, vehicle type difference D_car, total axle count difference D_axis, total vehicle-and-cargo weight difference D_weight, and vehicle weight limit difference D_limitweight using nondimensionalization, qualitative feature quantification, quantitative feature binarization, and discrete feature encoding methods to form the highway freight vehicle cross-provincial matching candidate feature set;
Evaluating the importance of the candidate feature set using the correlation coefficient method or the variance selection method to obtain the target feature set.
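The two importance measures named above can be sketched in plain Python as follows. The function names, the variance threshold, and the toy data are illustrative assumptions. The claim names the methods but not their parameters.

```python
from math import sqrt
from statistics import mean, pvariance


def variance_select(columns, threshold=0.0):
    """Variance selection: keep features whose variance exceeds the threshold."""
    return [name for name, values in columns.items() if pvariance(values) > threshold]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_rank(columns, labels):
    """Correlation coefficient method: rank features by |r| against the label."""
    scores = {name: abs(pearson(values, labels)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy candidate features for four pairs; label 1 means "same vehicle".
columns = {"D_axis": [0, 0, 2, 3], "constant": [1, 1, 1, 1]}
labels = [1, 1, 0, 0]
kept = variance_select(columns)
ranking = correlation_rank(columns, labels)
```

A constant feature carries no information for matching, so it is dropped by the variance filter and ranked last by the correlation score.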
6. The highway freight vehicle cross-provincial matching method according to claim 5, characterized in that the step of processing the entry-exit time difference D_time, vehicle type difference D_car, total axle count difference D_axis, total vehicle-and-cargo weight difference D_weight, and vehicle weight limit difference D_limitweight using nondimensionalization, qualitative feature quantification, quantitative feature binarization, One-Hot encoding and/or discrete feature encoding methods to form the highway freight vehicle cross-provincial matching candidate feature set includes:
Processing the entry-exit time difference D_time, vehicle type difference D_car, and total axle count difference D_axis each with the One-Hot encoding, and scaling the total vehicle-and-cargo weight difference D_weight and the vehicle weight limit difference D_limitweight each into the interval [-1, 1], to obtain the highway freight vehicle cross-provincial matching candidate feature set.
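A minimal sketch of the two transformations follows. It assumes that the values passed to `one_hot` are already discrete (for D_time this would mean binning the time difference first, which the claim does not describe) and that interval scaling is a linear min-max mapping; both function names are hypothetical.

```python
def one_hot(values):
    """One-Hot encode discrete values; returns (categories, encoded rows)."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = [
        [1 if index[value] == i else 0 for i in range(len(categories))]
        for value in values
    ]
    return categories, rows


def scale_to_interval(values):
    """Linearly map values into [-1, 1]; a constant column maps to 0.0."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [2 * (value - low) / (high - low) - 1 for value in values]


axis_categories, axis_rows = one_hot([0, 1, 0, -1])   # D_axis values
weight_scaled = scale_to_interval([-10, 0, 10])       # D_weight values
```

Concatenating the encoded columns and the scaled columns row by row would give one candidate feature vector per entry/exit pair.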
7. The highway freight vehicle cross-provincial matching method according to claim 1, characterized in that the step of training on the training feature set with multiple preset machine learning algorithms to obtain multiple corresponding cross-provincial matching algorithm models, and selecting one of the cross-provincial matching algorithm models as the first algorithm model, includes:
Training on the training feature set with the logistic regression, K-nearest neighbors, support vector machine, naive Bayes, decision tree, random forest, and gradient boosting machine learning algorithms respectively to obtain the corresponding models, and calculating the accuracy score of each model;
Taking the model with the highest accuracy score among the models as the first algorithm model.
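The selection step can be sketched independently of any particular library. In the sketch below the two toy classifiers are stand-ins for the seven algorithms listed in the claim, and scoring on a separate evaluation split is an assumption, since the claim does not say which data the accuracy score is computed on. Any object with `fit` and `predict` methods could be passed in instead.

```python
from collections import Counter


class MajorityClassifier:
    """Toy baseline: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class ThresholdClassifier:
    """Toy rule: predicts a match (1) when |feature| is at most `limit`."""

    def __init__(self, feature, limit):
        self.feature = feature
        self.limit = limit

    def fit(self, X, y):
        return self  # nothing to learn in this fixed rule

    def predict(self, X):
        return [1 if abs(row[self.feature]) <= self.limit else 0 for row in X]


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def select_first_model(candidates, X_train, y_train, X_eval, y_eval):
    """Fit every candidate, score it, and return the best name plus all scores."""
    scores = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy(y_eval, model.predict(X_eval))
    return max(scores, key=scores.get), scores


X_train, y_train = [[0], [1], [5], [6], [7]], [1, 1, 0, 0, 0]
X_eval, y_eval = [[0], [4], [1], [8]], [1, 0, 1, 0]
candidates = {
    "majority": MajorityClassifier(),
    "threshold": ThresholdClassifier(feature=0, limit=2),
}
best_name, scores = select_first_model(candidates, X_train, y_train, X_eval, y_eval)
```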
8. The highway freight vehicle cross-provincial matching method according to claim 1, characterized in that the step of evaluating the first algorithm model using the test feature set and processing the first algorithm model according to the evaluation result to obtain the target algorithm model includes:
Testing the first algorithm model using the test feature set, plotting the learning curve and the ROC curve, and calculating the AUC value;
Judging the fitting state of the first algorithm model according to the learning curve, the ROC curve, and the AUC value;
Adjusting the parameters and feature variables of the first algorithm model according to the fitting state to obtain the target algorithm model.
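For reference, the ROC curve and AUC value used in this evaluation can be computed from true labels and model scores as sketched below. This is a generic textbook construction, not code from the patent; it assumes binary labels (1 for a match, 0 otherwise) and at least one sample of each class.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Samples with equal scores are admitted together at one threshold.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / negatives, tp / positives))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


perfect = auc(roc_points([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))
partial = auc(roc_points([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

An AUC close to 1 on the test feature set together with a large gap between training and test performance on the learning curve would point to overfitting, while low values on both would point to underfitting.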
9. The highway freight vehicle cross-provincial matching method according to claim 1, characterized in that the step of performing similarity calculation on the license plate number in the pending entry data and the license plate number in the pending exit data to obtain a similarity result, and optimizing the matching result based on the similarity result, includes:
For each matching result that indicates a match, calculating the similarity between the license plate number in the corresponding pending entry data and the license plate number in the pending exit data using the JaroWinklerDistance algorithm;
When the similarity is greater than a preset value, keeping the matching result; otherwise, modifying the matching result.
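The post-check can be sketched with a plain Python implementation of the textbook Jaro-Winkler similarity. Several details here are assumptions: the prefix bonus is applied unconditionally with a scale of 0.1 and a maximum prefix of four characters (some library variants apply it only above a Jaro score of 0.7), the `preset` value is illustrative, and "modifying" a result is interpreted as changing a match to a non-match.

```python
def jaro(s1, s2):
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted by the length of the common prefix (max 4)."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def refine_match(is_match, entry_plate, exit_plate, preset=0.9):
    """Keep a predicted match only if the plates are similar enough."""
    return is_match and jaro_winkler(entry_plate, exit_plate) > preset
```

For example, `refine_match(True, "A12345", "A12345")` keeps the match, while a pair of plates with no characters in common is changed to a non-match.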
10. A highway freight vehicle cross-provincial matching device, characterized by including:
An acquisition module, configured to obtain pre-stored entry data and exit data of freight vehicles at the cross-provincial toll stations, and to generate sample data including matched data and non-matched data based on the degree of matching between the license plate number in the exit data and the license plate number in the entry data;
A computing module, configured to perform mathematical operations on the entry data and the exit data in the sample data to obtain composite indicators, and to perform statistics on the composite indicators to obtain correlation indicators;
A processing module, configured to process the correlation indicators to obtain a candidate feature set, to evaluate the importance of the candidate feature set to obtain a target feature set, and to split the target feature set into a training feature set and a test feature set;
A training module, configured to train on the training feature set with multiple preset machine learning algorithms to obtain multiple corresponding cross-provincial matching algorithm models, and to select one of the cross-provincial matching algorithm models as the first algorithm model;
An evaluation module, configured to evaluate the first algorithm model using the test feature set to obtain an evaluation result, and to adjust the first algorithm model according to the evaluation result to obtain the target algorithm model;
A matching module, configured to match the pending entry data and the pending exit data at the cross-provincial toll stations using the target algorithm model to obtain a matching result;
An optimization module, configured to perform similarity calculation on the license plate number in the pending entry data and the license plate number in the pending exit data corresponding to the matching result to obtain a similarity result, and to optimize the matching result based on the similarity result.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810664921.6A CN108764375B (en) 2018-06-25 2018-06-25 Highway goods stock transprovincially matching process and device

Publications (2)

Publication Number Publication Date
CN108764375A 2018-11-06
CN108764375B 2019-05-03

Family

ID=63977506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810664921.6A Active CN108764375B (en) 2018-06-25 2018-06-25 Highway goods stock transprovincially matching process and device

Country Status (1)

Country Link
CN (1) CN108764375B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103593879A (en) * 2013-10-08 2014-02-19 南京爱沓信息技术有限公司 Vehicle positioning integrated intelligent charge management system and method applied to same
CN105844721A (en) * 2016-01-27 2016-08-10 吴加强 Non-stop charging method for driving in and out of expressway
US20170254660A1 (en) * 2016-03-04 2017-09-07 Volvo Car Corporation Method and system for utilizing a trip history
CN107146292A (en) * 2017-04-28 2017-09-08 成都通甲优博科技有限责任公司 A kind of freeway toll station vehicle management server, system and method
CN107564294A (en) * 2017-08-25 2018-01-09 深圳前海华夏智信数据科技有限公司 Unlicensed car recognition methods and device based on virtual car plate
CN107730640A (en) * 2016-08-10 2018-02-23 西安艾润物联网技术服务有限责任公司 Bill settlement method and device

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871513A (en) * 2019-01-28 2019-06-11 重庆首讯科技股份有限公司 A kind of current behavior similarity calculating method of vehicle on highway and system
CN109871513B (en) * 2019-01-28 2023-03-31 重庆首讯科技股份有限公司 Method and system for calculating similarity of vehicle passing behaviors on highway
CN110335473A (en) * 2019-08-14 2019-10-15 中国联合网络通信集团有限公司 Block the personal identification method and system of license plate vehicle
CN110335473B (en) * 2019-08-14 2021-05-11 中国联合网络通信集团有限公司 Identity recognition method and system for vehicle with license plate covered
CN112419721A (en) * 2020-11-18 2021-02-26 交通运输部科学研究院 Method and device for calculating key indexes of transportation of vehicles on highway and electronic equipment
CN112419721B (en) * 2020-11-18 2021-06-08 交通运输部科学研究院 Method and device for calculating key indexes of transportation of vehicles on highway and electronic equipment
CN112560074A (en) * 2021-02-20 2021-03-26 支付宝(杭州)信息技术有限公司 Vehicle passing data processing method, device, equipment and system
CN112991134A (en) * 2021-05-11 2021-06-18 交通运输部科学研究院 Driving path reduction measuring and calculating method and device and electronic equipment
CN114446045A (en) * 2021-09-14 2022-05-06 武汉长江通信智联技术有限公司 Method for studying and judging illegal transportation behaviors of vehicles on highway in epidemic situation period
CN116821721A (en) * 2023-07-03 2023-09-29 上海金润联汇数字科技有限公司 Method, device, equipment and medium for identifying cross-city network about car
CN116821721B (en) * 2023-07-03 2024-04-02 上海金润联汇数字科技有限公司 Method, device, equipment and medium for identifying cross-city network about car

Also Published As

Publication number Publication date
CN108764375B (en) 2019-05-03

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant