CN107329977B - A secondary screening method for fake-plate vehicles based on probability distribution - Google Patents

Publication number
CN107329977B
Authority
CN
China
Prior art keywords
vehicle
checkpoint
license plate
probability
record
Prior art date
Legal status
Active
Application number
CN201710391814.6A
Other languages
Chinese (zh)
Other versions
CN107329977A (en)
Inventor
王辉
蒋伶华
陈涛
李建元
温晓岳
Current Assignee
Yinjiang Technology Co.,Ltd.
Original Assignee
Enjoyor Co Ltd
Priority date
Filing date
Publication date
Application filed by Enjoyor Co Ltd
Priority to CN201710391814.6A
Publication of CN107329977A
Application granted
Publication of CN107329977B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases

Abstract

A secondary screening method for fake-plate vehicles based on probability distribution, comprising the following steps: S1. obtain checkpoint vehicle passage records and clean the data; S2. sort the passage records and extract the checkpoint pair vectors of vehicle travel; S3. calculate the spatial probability distribution of vehicle flow between checkpoints; S4. compare the checkpoint records with the database to obtain a preliminary set of fake plates; S5. based on the spatial probability distribution of vehicle flow from S3, obtain from the preliminary set the plates that conform to the spatial probability distribution; S6. determine the recognition error probability of each character according to the spatial probability distribution of vehicle flow; S7. judge the fake-plate probability of each plate comprehensively, from how well the plate conforms to the spatial distribution and from the character recognition error probability. The invention can, to a certain extent, compensate for insufficient checkpoint recognition accuracy, effectively narrow the scope of fake-plate investigation, and improve the hit rate.

Description

A secondary screening method for fake-plate vehicles based on probability distribution
Technical field
The invention belongs to the field of intelligent transportation, and in particular relates to a secondary screening method for fake-plate vehicles based on probability distribution.
Background
In recent years, with the continuous development of China's national economy, the number of vehicles in use has kept growing, and traffic offences and violations of all kinds have grown with it. Among them, "fake plates" and "cloned plates" are seriously harmful illegal acts. The "fake plate" phenomenon refers to forging or altering a motor vehicle number plate so that the vehicle illegally uses a plate number that does not exist in the registered vehicle information of the motor vehicle administration.
Fake plates cause serious harm. Vehicles using fake plates often speed and run red lights at will, seriously disrupting traffic order. Once a traffic accident occurs, these drivers, counting on not being caught, often choose to flee, which makes it difficult for the investigating officers to identify the vehicle. Fake-plate vehicles are also often used as tools of crime, which makes cases harder to solve. Investigating fake-plate vehicles has become an important task for public security and traffic management departments everywhere.
At present, fake-plate vehicles are mainly found by comparing the information collected by checkpoints with a database; a plate that does not exist in the database is defined as a fake plate. Because checkpoint plate recognition accuracy is limited, the preliminary screening often yields hundreds of thousands of suspected fake-plate vehicles, so a secondary screening is needed. Judging from the existing literature and published patents, current methods for screening and identifying fake plates fall into two main classes:
(1) Methods based on auxiliary equipment. For example, patent application CN201210187968.0 uses a reserved safety monitoring password: a vehicle safety detection code is reserved in the traffic police internal management system platform, and traffic police officers on site use a handheld terminal to compare the vehicle information and safety monitoring password with the reserved information to judge whether the vehicle carries a fake plate. Patent application CN201320577360.9 uses a fake-plate recognition device based on RFID technology: an electronic tag composed of a radio-frequency chip and a microelectronic chip is mounted on the vehicle body, and radio-frequency identification is used to judge whether the vehicle carries a fake or cloned plate.
(2) Detection and recognition methods based on comparing vehicle information. For example, patent application 201510744990.4 uses picture similarity recognition: it first extracts the SIFT features of the vehicle region in a picture, converts them into neighborhood features after discretization by a clustering algorithm to serve as the descriptive features of the vehicle, and then learns similarity with a random forest to obtain a similarity prediction model, which is used to judge whether two vehicles in pictures are similar vehicles.
Both classes have drawbacks in practice. The first, based on auxiliary equipment, requires extra equipment to be installed on motor vehicles and is difficult to roll out in reality. The second, based on comparing vehicle appearance information, is strongly affected by lighting and environment, and its accuracy is not high. To overcome these drawbacks, analyze large-scale traffic data quickly and effectively, and accurately pin down the real fake-plate vehicles among the large number of suspected ones from the preliminary screening, a new technical solution is needed to meet the needs of traffic management departments.
Summary of the invention
The invention proposes a secondary screening method for fake-plate vehicles based on probability distribution that effectively distinguishes recognition errors from real fake-plate vehicles, greatly narrows the investigation range, needs no extra equipment, is easy to deploy, is widely applicable, has higher recognition accuracy, and greatly improves the efficiency of follow-up checks and surveillance deployment.
The technical solution adopted by the invention is:
A secondary screening method for fake-plate vehicles based on probability distribution, comprising the following steps:
S1. obtain checkpoint vehicle passage records, and clean the data to obtain cleaned passage records;
S2. sort the original checkpoint passage records and extract the checkpoint pair vector (Ki, Kj) of vehicle travel, where Ki and Kj denote checkpoint numbers, and put it into set K together with HPHM, where HPHM denotes the vehicle plate number;
S3. calculate the spatial probability Pij of vehicle flow between checkpoints, and store all probabilities (Ki, Kj, Pij) in set P;
S4. obtain the plate set H from the passage records of S1, and compare it with the vehicle and driver administration database to preliminarily screen fake plates, obtaining the preliminary fake-plate set F1;
S5. based on the spatial probability distribution of vehicle flow from S3, calculate the number of normal jumps Jnor and the number of abnormal jumps Jp of each vehicle in set F1, put the plates that conform to the spatial probability distribution into set H1, and put the plates that do not conform into set H2;
S6. calculate the plate character recognition error probability Lx from the character shares in set H1 and set H2;
S7. based on the number of normal jumps Jnor and abnormal jumps Jp of each vehicle in set F1 and on the character recognition error probability Lx, perform the secondary screening of plates and judge the fake-plate probability comprehensively.
The invention exploits the spatial characteristics of vehicle travel and introduces the concept of probability distribution: by calculating the probability of each jump a vehicle makes, it judges the spatial continuity of the vehicle. If the spatial continuity of a track is high, the track more likely belongs to a single vehicle; if it is low, the track more likely belongs to several vehicles, that is, the recognition accuracy of that plate is low, and plates that do not conform to the spatial probability distribution are excluded by calculation. Meanwhile, because checkpoint equipment recognizes different characters with different accuracy, the plates that conform well to the spatial probability distribution and those that do not are divided into two sets, and the character shares in the two sets are counted separately. If the share of a character differs markedly between the two sets, the recognition accuracy of that character may be low, and the character recognition probability can be used to further exclude some plates with a higher recognition error rate.
Further, the checkpoint passage records of step S1 are obtained as follows: obtain the original checkpoint passage records within one period, clean the data according to set rules by deleting non-conforming records, and retain the needed dimensions, including checkpoint number, plate number, and passage time.
Further, step S2 obtains set K as follows:
(1) Group the records by plate number, sort each group by passage time, and then perform the following operations on each group:
Step 1: take out the first record, denoted record 1;
Step 2: take out the next record, denoted record 2;
Step 3: calculate the time difference ΔT between record 1 and record 2; if ΔT is less than threshold T, go to step 4; if ΔT is greater than threshold T, assign record 2 to record 1 and go to step 2;
Step 4: form the checkpoint pair vector (HPHM, Ki, Kj) from the plate number and the checkpoint numbers of the two records and put it into set K; assign record 2 to record 1 and go to step 2;
(2) Traverse all groups to obtain set K.
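Further, calculating the spatial probability Pij of vehicle flow in step S3 comprises: counting the number of each vector (Ki, Kj) in set K, denoted count(Ki, Kj); the total number of vehicles flowing out of checkpoint Ki is then $\mathrm{count}(K_i) = \sum_j \mathrm{count}(K_i, K_j)$, and the probability of a vehicle flowing from checkpoint Ki to checkpoint Kj is $P_{ij} = \mathrm{count}(K_i, K_j) / \mathrm{count}(K_i)$.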
Further, the plate set H in step S4 is the set of distinct plates in the checkpoint passage records of S1.
Further, the preliminary screening in step S4 forms the preliminary fake-plate set F1 from the plates that do not exist in the vehicle and driver administration database.
Further, calculating the number of normal jumps Jnor and abnormal jumps Jp of each vehicle in set F1 in step S5 comprises:
(i) for each plate in set F1, obtain all of its corresponding records in set K;
(ii) if the plate has no corresponding record in set K, record both its number of normal jumps Jnor and its number of abnormal jumps Jp as 0;
(iii) if the plate has corresponding records in set K, look up the flow probability Pij in set P for the (Ki, Kj) of each record; if Pij is greater than or equal to threshold Pi, the jump is considered normal, and if Pij is less than threshold Pi, the jump is considered abnormal;
(iv) count the number of normal jumps Jnor of each plate, that is, the number of jumps with Pij >= Pi, and the number of abnormal jumps Jp, that is, the number of jumps with Pij < Pi.
Further, calculating the plate character recognition error probability Lx in step S6 comprises: counting the share of each character in set H1 and in set H2, denoted Lx1 and Lx2 respectively, where x stands for a possible character, and calculating the error of each character's share in H2 relative to its share in H1: Lx = ABS((Lx2-Lx1)/Lx1).
Further, the secondary plate screening formula in step S7 is as follows:
[formula for FB not reproduced in this text]
The larger the value of FB, the more likely the plate is a fake plate; otherwise, a recognition error is more likely. ε is an empirical value, generally taken as the number of days in the period.
The invention aims to overcome a practical limitation: because of factors such as lighting, angle, and stained plates, checkpoint plate recognition cannot reach 100% (it is generally around 96%-98%). In practice a checkpoint may well recognize some character as another character, so that a normal plate is recognized as a plate that is not in the vehicle and driver administration database. The preliminary fake-plate list then becomes too long, and manual verification is a heavy workload.
The idea of the invention is as follows. The next checkpoint a vehicle passes should follow an exponential probability distribution in space. If a plate number does not conform well to the spatial probability distribution, it is quite likely that two different plates have been recognized as the same plate, that is, a recognition error. Meanwhile, a plate is composed of different characters, and each character has a different probability of being recognized correctly. Checking first the plates composed of characters with a higher recognition probability minimizes the influence of recognition errors, which greatly narrows the manual investigation range and improves the fake-plate hit rate.
The beneficial effects of the invention are mainly as follows: it better overcomes the overlong preliminary fake-plate list caused by checkpoint recognition errors, greatly narrows the investigation range, improves the fake-plate hit rate, and is practical; it does not rely on the road network structure, so it is widely applicable.
Brief description of the drawings
Fig. 1 is a flow chart of the invention.
Fig. 2 shows the spatial probability distribution of vehicle flow in the invention.
Detailed description of the embodiments
The invention is further described below with reference to specific embodiments, but the invention is not limited to these embodiments. Those skilled in the art will recognize that the invention covers all alternatives, modifications, and equivalents that may be included within the scope of the claims.
Referring to Fig. 1, a secondary screening method for fake-plate vehicles based on probability distribution comprises the following steps:
S1. obtain checkpoint vehicle passage records, and clean the data to obtain cleaned passage records;
A checkpoint uses technologies such as advanced optoelectronics, computing, image processing, pattern recognition, and remote data access to monitor the motor vehicle lanes and non-motorized lanes of a monitored road section around the clock in real time and to record the related image data, automatically obtaining data such as the passage time, location, driving direction, plate number, plate color, and body color of each vehicle. Vehicle passage records are stored in a database as formatted data.
Obtain the checkpoint passage records within one period. To reduce chance effects caused by too small a sample, a longer period may be chosen, generally 1 to 6 months and preferably 3 months.
The original checkpoint data contain some dirty data, including records with no plate information, plates that cannot be recognized, and plates with some characters unrecognized. These dirty data are washed out, and the needed dimensions are retained: checkpoint number, plate number, and passage time.
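The cleaning rule above can be sketched in Python as follows. This is only an illustration: the record layout (checkpoint ID, plate, passage time), the function names, the placeholder strings in `UNREADABLE_MARKERS`, and the use of `?` for an unread character are all assumptions, not details fixed by the method.

```python
# Hypothetical markers for plates the recognizer could not read (assumed values).
UNREADABLE_MARKERS = {"", "NULL", "UNRECOGNIZED"}


def is_clean_plate(plate):
    """Return True if the plate is present and fully recognized."""
    if plate is None:
        return False
    text = plate.strip()
    if text.upper() in UNREADABLE_MARKERS:
        return False
    # Assumption: a '?' stands for a character the recognizer could not read.
    return "?" not in text


def clean_records(records):
    """Keep (checkpoint_id, plate, passage_time) tuples that have a usable plate."""
    return [(k, p.strip(), t) for k, p, t in records if is_clean_plate(p)]
```

Only the three retained dimensions are carried forward, so later steps can treat each record as a plain tuple.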
S2. sort the checkpoint passage records and extract the checkpoint pair vector (Ki, Kj) of vehicle travel, where Ki and Kj denote checkpoint numbers, and put it into set K together with HPHM, where HPHM denotes the vehicle plate number;
During normal driving a vehicle is continually captured by checkpoints. In theory a vehicle has a higher probability of being captured by a nearby checkpoint and a lower probability of being captured by a distant one. If a vehicle is often captured by low-probability checkpoints, the vehicle does not conform well to the spatial probability distribution. Because checkpoint recognition accuracy cannot reach 100%, different vehicles on the road may be recognized as the same plate, so that the plate does not conform to the spatial probability distribution; conversely, a plate that conforms to the spatial probability distribution is more likely to have been recognized correctly.
In reality, because of checkpoint failures, network failures, and similar factors, the capture rate of checkpoints cannot reach 100%, and a vehicle may not be recorded when it passes some checkpoints. It is generally assumed that if a vehicle leaves one checkpoint and is not captured by any checkpoint within a certain time, data may have been lost (or the vehicle may have stopped). Data loss may cause the next checkpoint that captures the vehicle to conform poorly to the spatial probability distribution. This time is called threshold T: if the interval between two checkpoint records of a vehicle exceeds threshold T, that checkpoint pair is excluded from the calculation.
The checkpoint pair vectors of vehicle travel are extracted as follows:
(1) Group the cleaned data from S1 by plate number, sort each group by passage time, and then perform the following operations on each group:
Step 1: take out the first record, denoted record 1;
Step 2: take out the next record, denoted record 2;
Step 3: calculate the time difference ΔT between record 1 and record 2; if ΔT is less than threshold T, go to step 4; if ΔT is greater than threshold T, assign record 2 to record 1 and go to step 2;
Step 4: form the checkpoint pair vector (HPHM, Ki, Kj) from the plate number and the checkpoint numbers of the two records and put it into set K; assign record 2 to record 1 and go to step 2;
(2) Traverse all groups to obtain set K.
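The grouping and pairing steps above amount to pairing each record with the next record of the same plate whenever the gap is below T. A minimal Python sketch, assuming records are (plate, checkpoint ID, passage time) tuples; the function name and the demo data are hypothetical. The text does not say what happens when ΔT equals T exactly, so the sketch arbitrarily treats that case as "no pair".

```python
from collections import defaultdict
from datetime import datetime, timedelta


def extract_checkpoint_pairs(records, threshold):
    """Build set K as a list of (plate, Ki, Kj) from consecutive records of each plate."""
    by_plate = defaultdict(list)
    for plate, checkpoint, passed_at in records:
        by_plate[plate].append((passed_at, checkpoint))

    pairs = []
    for plate, group in by_plate.items():
        group.sort()  # order each plate's records by passage time
        for (t1, k1), (t2, k2) in zip(group, group[1:]):
            # Assumption: a gap exactly equal to the threshold does not form a pair.
            if t2 - t1 < threshold:
                pairs.append((plate, k1, k2))
    return pairs


# Hypothetical demo data: plate "A" passes three checkpoints, plate "B" only one.
start = datetime(2016, 1, 4, 8, 0, 0)
demo_records = [
    ("A", "K1", start),
    ("A", "K2", start + timedelta(minutes=20)),  # 20 min after K1: gap too long
    ("A", "K3", start + timedelta(minutes=25)),  # 5 min after K2: forms a pair
    ("B", "K1", start),
]
demo_pairs = extract_checkpoint_pairs(demo_records, timedelta(minutes=15))
```

Sliding over consecutive records with `zip(group, group[1:])` is equivalent to the "assign record 2 to record 1" loop, since every record becomes record 1 exactly once regardless of whether a pair was emitted.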
S3. calculate the spatial probability Pij of vehicle flow between checkpoints, and store all probabilities (Ki, Kj, Pij) in set P;
From set K, calculate the probability that a vehicle leaving one checkpoint arrives at each of the other checkpoints; this probability is called the flow probability between checkpoints. The flow probability reflects the spatial probability distribution of the next checkpoint a vehicle reaches. Flow probability (Ki, Kj) = (number of vehicles arriving at checkpoint Kj from checkpoint Ki) / (total number of vehicles leaving checkpoint Ki). Count the number of each vector (Ki, Kj) in set K, denoted count(Ki, Kj); the total number of vehicles flowing out of checkpoint Ki is then $\mathrm{count}(K_i) = \sum_j \mathrm{count}(K_i, K_j)$, and the flow probability from checkpoint Ki to checkpoint Kj is $P_{ij} = \mathrm{count}(K_i, K_j) / \mathrm{count}(K_i)$. Calculate the flow probability for all checkpoint pairs; if the number of passage records between two checkpoints is zero, the flow probability is recorded as 0%.
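As an illustration of this calculation, the sketch below derives the flow probabilities from a list of (plate, Ki, Kj) entries. The function name and the sample data are hypothetical; pairs that never occur are simply absent from the result and can be read as probability 0 with `dict.get`.

```python
from collections import Counter


def flow_probabilities(pairs):
    """Return {(Ki, Kj): Pij} computed from (plate, Ki, Kj) entries of set K."""
    pair_counts = Counter((ki, kj) for _, ki, kj in pairs)

    # count(Ki): total number of vehicles flowing out of Ki.
    outflow = Counter()
    for (ki, _), n in pair_counts.items():
        outflow[ki] += n

    return {(ki, kj): n / outflow[ki] for (ki, kj), n in pair_counts.items()}


# Hypothetical sample: three vehicles leave A (two to B, one to C), one leaves B.
sample_pairs = [("p1", "A", "B"), ("p2", "A", "B"), ("p3", "A", "C"), ("p4", "B", "C")]
sample_probs = flow_probabilities(sample_pairs)
unseen = sample_probs.get(("C", "A"), 0.0)  # no records between the pair: 0%
```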
S4. obtain the plate set H from the passage records of S1, and compare it with the vehicle and driver administration database to preliminarily screen fake plates, obtaining the preliminary fake-plate set F1;
Specifically, take the distinct plates from the passage records of S1 to obtain set H, the set of all plates seen in the period. Compare the plates in set H with the vehicle and driver administration database; a plate that is not in the database is put into set F1, which is the preliminary fake-plate set.
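The comparison is a set difference. A minimal sketch, assuming the registered plates are available as an in-memory collection (in practice this would be a database lookup); the function and variable names are hypothetical.

```python
def preliminary_screen(records, registered_plates):
    """Return set F1: plates observed at checkpoints but absent from the registry."""
    observed = {plate for plate, _, _ in records}  # set H: distinct observed plates
    return observed - set(registered_plates)


# Hypothetical data: "X9" was observed but is not registered.
observed_records = [("A1", "K1", "t1"), ("A1", "K2", "t2"), ("X9", "K1", "t3")]
f1 = preliminary_screen(observed_records, ["A1", "B2"])
```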
S5. based on the spatial probability distribution of vehicle flow from S3, calculate the number of normal jumps Jnor and the number of abnormal jumps Jp of each vehicle in set F1, put the plates that conform to the spatial probability distribution into set H1, and put the plates that do not conform into set H2; the specific steps include:
(i) for each plate in set F1, obtain all of its corresponding records in set K;
(ii) if the plate has no corresponding record in set K, record both its number of normal jumps Jnor and its number of abnormal jumps Jp as 0;
(iii) if the plate has corresponding records in set K, look up the flow probability Pij in set P for the (Ki, Kj) of each record; if Pij is greater than or equal to threshold Pi, the jump is considered normal, and if Pij is less than threshold Pi, the jump is considered abnormal; threshold Pi is set to 0.2%.
(iv) count the number of normal jumps Jnor of each plate, that is, the number of jumps with Pij >= Pi, and the number of abnormal jumps Jp, that is, the number of jumps with Pij < Pi.
If a vehicle's jumps do not conform to the spatial probability distribution, the plate is more likely a recognition error; conversely, if they conform to the probability distribution, the plate is more likely to have been recognized correctly.
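Steps (i) to (iv) can be sketched as a single pass over set K. The function name, argument layout, and sample values are assumptions for illustration; the default threshold of 0.002 corresponds to the 0.2% given above.

```python
def count_jumps(f1_plates, pairs, probs, p_min=0.002):
    """Return {plate: (Jnor, Jp)} for every plate in F1.

    pairs: (plate, Ki, Kj) entries of set K; probs: {(Ki, Kj): Pij} from set P.
    """
    # Plates with no record in K keep Jnor = Jp = 0, as in step (ii).
    counts = {plate: [0, 0] for plate in f1_plates}
    for plate, ki, kj in pairs:
        if plate not in counts:
            continue
        if probs.get((ki, kj), 0.0) >= p_min:
            counts[plate][0] += 1  # normal jump
        else:
            counts[plate][1] += 1  # abnormal jump
    return {plate: (jnor, jp) for plate, (jnor, jp) in counts.items()}


# Hypothetical sample: plate X makes one likely and one unlikely jump, Y has no records.
jumps = count_jumps(
    {"X", "Y"},
    [("X", "A", "B"), ("X", "B", "C"), ("Z", "A", "B")],
    {("A", "B"): 0.5, ("B", "C"): 0.001},
)
```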
S6. calculate the plate character recognition error probability Lx from the character shares in set H1 and set H2;
When the sample is large enough, the frequency of each plate character should tend to a stable value. If a character appears with relatively high frequency, other characters are more likely being misrecognized as that character; conversely, if a character appears with relatively low frequency, that character is more likely being recognized as other characters.
The elements of set F1 are divided into two sets H1 and H2 according to jump probability: set H1 contains the elements whose jump probability is greater than or equal to 0.2%, and set H2 contains the elements whose jump probability is less than 0.2%. Because the plates in set H1 conform better to the spatial probability distribution, the character recognition accuracy in H1 is higher; conversely, the character recognition accuracy in H2 is lower. Count the share of each character in set H1 and in set H2, denoted Lx1 and Lx2 respectively, where x stands for a possible character, and calculate the error of each character's share in H2 relative to its share in H1: Lx = ABS((Lx2-Lx1)/Lx1). Lx can be used as an approximate estimate of the probability that each character is misrecognized.
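A small sketch of the Lx calculation follows. It assumes that a character's share is its number of occurrences divided by the total number of characters across all plates in the set, which the text does not spell out; characters absent from H1 are skipped because Lx1 would be zero. All names and sample plates are hypothetical.

```python
from collections import Counter


def char_shares(plates):
    """Share of each character among all characters of the given plates."""
    counts = Counter(ch for plate in plates for ch in plate)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def char_error(h1_plates, h2_plates):
    """Lx = |Lx2 - Lx1| / Lx1 for every character x that occurs in H1."""
    shares_h1 = char_shares(h1_plates)
    shares_h2 = char_shares(h2_plates)
    return {
        ch: abs(shares_h2.get(ch, 0.0) - lx1) / lx1
        for ch, lx1 in shares_h1.items()
    }


# Hypothetical sample: 'A' is over-represented in H2 relative to H1.
lx = char_error(["AB", "AB"], ["AA", "AB"])
```

Here the shares in H1 are 0.5 for both characters and the shares in H2 are 0.75 and 0.25, so both characters get Lx = 0.5.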
S7. based on the number of normal jumps Jnor and abnormal jumps Jp of each vehicle in set F1 and on the character recognition error probability Lx, perform the secondary screening of plates and judge the fake-plate probability comprehensively. Whether a vehicle's flow conforms to the spatial probability distribution indicates, to a certain extent, whether two different plates have been recognized as the same plate; removing the plates that do not conform to the spatial probability distribution removes this part of the recognition errors. Among the remaining plates, different plates are composed of different characters, and each character has a different probability of being recognized correctly. A plate composed of characters with a higher recognition probability that is nevertheless absent from the vehicle and driver administration data is very likely a fake plate and can be checked first.
Finally, FB is calculated according to the formula [formula for FB not reproduced in this text]. The larger the value of FB, the more likely the plate is a fake plate; otherwise, a recognition error is more likely. ε is an empirical value, generally taken as the number of days in the period.
The invention exploits the spatial characteristics of vehicle travel and introduces the concept of probability distribution: by calculating the probability of each jump a vehicle makes, it judges the spatial continuity of the vehicle. If the spatial continuity of a track is high, the track more likely belongs to a single vehicle; if it is low, the track more likely belongs to several vehicles, that is, the recognition accuracy of that plate is low, and plates that do not conform to the spatial probability distribution are excluded by calculation. Meanwhile, because checkpoint equipment recognizes different characters with different accuracy, the plates that conform well to the spatial probability distribution and those that do not are divided into two sets, and the character shares in the two sets are counted separately. If the share of a character differs markedly between the two sets, the recognition accuracy of that character may be low, and the character recognition probability can be used to further exclude some plates with a higher recognition error rate.
A concrete application example is as follows:
S1. Extraction of checkpoint passage data:
Obtain the checkpoint passage records within one period and retain the needed dimensions, including checkpoint number, plate number, and passage time.
This embodiment extracted the data for Hangzhou from January 1 to January 30, 2016, 30 days in total, covering 489 checkpoints and 129,534,497 records. The checkpoint data format is shown in Table 1:
Table 1
Field | Data type | Meaning
KKID | VARCHAR(20) | Checkpoint ID
HPHM | VARCHAR(10) | Plate number
HPLX | VARCHAR(2) | Plate type
JGSJ | VARCHAR(20) | Passage time
Each KKID corresponds to one road section, and HPHM+HPZL uniquely determines a vehicle. JGSJ is accurate to the second. (In the following steps the plate number includes the plate type; this is not repeated.)
Cleaning of checkpoint data:
Because the plate number is recognized from pictures by the checkpoint system and the plate recognition rate cannot reach 100%, the original checkpoint data contain some dirty data, including plates that are empty, plates that cannot be recognized, and plates with some characters unrecognized. This part of the data is cleaned out; some examples are shown in Table 2:
Table 2
Serial number | Plate number | Passage time
1 | (empty) | 2016-01-15 14:52:51
2 | NULL | 2016-01-20 19:32:30
3 | Peaceful B? 711T | 2016-01-25 11:31:34
4 | Zhejiang A00? NT | 2016-01-25 20:54:04
5 | Zhejiang A025X? | 2016-01-21 14:18:13
6 | Cannot be recognized | 2016-01-10 22:49:28
S2. Sort the passage records and extract the checkpoint vectors
Sorting of passage records: sort the data by plate number and passage time. Part of the data is shown in Table 3 (ellipses indicate rows not shown).
Table 3
Serial number | Plate number | Checkpoint ID | Passage time
1 | Zhejiang A2M1** | 31000300007402 | 2016-01-04 07:51:09
2 | Zhejiang A2M1** | 31000300010702 | 2016-01-04 08:48:26
3 | Zhejiang A2M1** | 31000300010904 | 2016-01-04 08:50:13
4 | Zhejiang A2M1** | 31000300004504 | 2016-01-04 08:50:38
5 | Zhejiang A2M1** | 31000300004502 | 2016-01-04 08:50:58
6 | Zhejiang A2M1** | 31000300019902 | 2016-01-04 08:53:36
7 | Zhejiang A2M1** | 31000300005402 | 2016-01-04 08:59:18
··· | ··· | ··· | ···
From the sorted records, take out the checkpoint pair vectors that meet the requirement. In this embodiment, threshold T is set to 15 minutes. Taking Table 3 as an example, checkpoint pairs are taken out as follows:
1. Take out record 1 and record 2;
2. Calculate the time difference between record 1 and record 2: 57 min 17 s > 15 min, so discard record 1;
3. Take out record 3 and calculate the time difference between record 2 and record 3: 1 min 47 s < 15 min, so put (Zhejiang A2M1**, 31000300010702, 31000300010904) into set K;
4. Take out the next record and repeat the above operations.
From the 7 passage records above, 5 checkpoint pairs can be taken out.
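The count of five pairs can be checked directly from the seven passage times in Table 3. The short Python check below computes the gap between consecutive records and keeps those under the 15-minute threshold; the variable names are illustrative only.

```python
from datetime import datetime, timedelta

# Passage times of records 1 to 7 in Table 3 (all on 2016-01-04).
times = ["07:51:09", "08:48:26", "08:50:13", "08:50:38",
         "08:50:58", "08:53:36", "08:59:18"]
stamps = [datetime.strptime("2016-01-04 " + t, "%Y-%m-%d %H:%M:%S") for t in times]

# Gap between each record and the next one.
gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]

# Consecutive records closer than T = 15 minutes form a checkpoint pair.
kept = [gap for gap in gaps if gap < timedelta(minutes=15)]
print(len(kept))  # 5
```

Only the first gap (57 min 17 s) exceeds the threshold, so records 2 through 7 yield five consecutive pairs.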
S3. Calculate the flow probability between bayonets
Counting all (Ki, Kj) in set K gives count(Ki, Kj), the number of vehicles that flow out of bayonet Ki and into bayonet Kj. Summing count(Ki, Kj) over all Kj gives count(Ki), the total number of vehicles flowing out of Ki. The results are shown in Table 4 below (ellipses mark rows not displayed).
Table 4
Bayonet Ki Bayonet Kj count(Ki,Kj) count(Ki) Probability
31000300000102 31000300001804 35433 156351 22.7%
31000300000102 31000300012619 35384 156351 22.6%
31000300000102 31000300001802 26530 156351 17.0%
31000300000102 31000300027001 18117 156351 11.6%
31000300000102 31000300009719 10139 156351 6.5%
31000300000102 31000300000502 5298 156351 3.4%
31000300000102 31000300000904 4236 156351 2.7%
31000300000102 31000300000503 3885 156351 2.5%
31000300000102 31000300002504 2150 156351 1.4%
31000300000102 31000300000504 1190 156351 0.8%
31000300000102 31000300000902 1180 156351 0.8%
31000300000102 31000300025819 962 156351 0.6%
31000300000102 31000300005002 820 156351 0.5%
31000300000102 31000300012120 810 156351 0.5%
······ ······ ······ ······ ···
······ ······ ······ ······ ···
The bayonet flow probability reflects, from another perspective, the distribution of bayonets and the structure of the road network.
The flow probabilities from bayonet 31000300004304 to the other bayonets are calculated, sorted in descending order and plotted as a line chart; the probabilities follow an apparent exponential distribution. The flow probabilities of bayonets 31000300003801 and 31000300006604 are calculated and plotted in the same way and also follow an apparent exponential distribution. The flow probability distribution curves of the three bayonets are shown in Fig. 2, where the Y-axis is the probability and the X-axis is the other bayonets (in descending order of probability).
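The counting behind Table 4 can be sketched as follows, assuming set K is held as a list of (plate, Ki, Kj) tuples. The function name `flow_probabilities` and the toy bayonet IDs "A", "B" and "C" are hypothetical.

```python
from collections import Counter


def flow_probabilities(set_k):
    """Return {(Ki, Kj): Pij} with Pij = count(Ki, Kj) / count(Ki)."""
    pair_counts = Counter((ki, kj) for _plate, ki, kj in set_k)
    # count(Ki): total number of vehicles flowing out of bayonet Ki.
    out_counts = Counter()
    for (ki, _kj), n in pair_counts.items():
        out_counts[ki] += n
    return {(ki, kj): n / out_counts[ki] for (ki, kj), n in pair_counts.items()}


# Toy data: three vehicles leave bayonet "A", one leaves bayonet "B".
toy_k = [("p1", "A", "B"), ("p2", "A", "B"), ("p3", "A", "C"), ("p1", "B", "C")]
set_p = flow_probabilities(toy_k)
print(set_p[("A", "B")])  # 2/3 of the vehicles leaving "A" go to "B"
```

The same ratio reproduces the first row of Table 4: 35433 / 156351 is about 22.7%.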
S4. Compare the bayonet records with the vehicle and driver management database to preliminarily determine the range of false-trademark vehicles:
In the present embodiment, the vehicle and driver management data contain only records for plates beginning with "Zhejiang A", so whether a non-"Zhejiang A" plate is a false-trademark (fake) plate cannot be judged; the false-trademark range is therefore limited to "Zhejiang A" plates. MapReduce is used to obtain the distinct license plates in the passing-vehicle records of S1, and only plates beginning with "Zhejiang A" are retained. These plates are compared with the vehicle and driver management data; a plate not found in the database is put into set F1, which is the preliminary list of suspected false-trademark vehicles.
In the present embodiment, a total of 235,642 plates are flagged as suspected false-trademark plates by the preliminary screening.
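The preliminary screening is essentially a set difference. The embodiment uses MapReduce for the deduplication step; the sketch below swaps in plain Python sets purely for illustration. The function name `preliminary_screen`, the parameter names and the example plates are all hypothetical.

```python
def preliminary_screen(passing_plates, registered_plates, prefix="Zhejiang A"):
    """Return set F1: distinct plates with the given prefix that are not registered."""
    # Deduplicate and keep only plates whose registration data are available.
    candidates = {p for p in passing_plates if p.startswith(prefix)}
    return candidates - set(registered_plates)


# Made-up plates for illustration only.
seen = ["Zhejiang A11111", "Zhejiang A11111", "Zhejiang A22222", "Zhejiang B33333"]
registered = ["Zhejiang A11111"]
f1 = preliminary_screen(seen, registered)
print(f1)  # {'Zhejiang A22222'}
```

Plates outside the prefix are dropped rather than flagged, matching the statement that non-"Zhejiang A" plates cannot be judged.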
S5. Calculate the number of normal jumps and abnormal jumps of each vehicle in set F1.
For each license plate in set F1, all of its corresponding records in set K are obtained. For each record, the corresponding flow probability Pij is looked up in set P by its (Ki, Kj). Partial results are as follows:
Table 5
For each plate, count the number of normal jumps Jnor, i.e. the number of records with Pij ≥ Pi, and the number of abnormal jumps Jp, i.e. the number of records with Pij < Pi, where Pi is the threshold. If a plate has no corresponding record in set K, both Jnor and Jp are recorded as 0. Partial results are shown in Table 6 below:
Table 6
No. Plate number Normal jumps (Jnor) Abnormal jumps (Jp)
1 Zhejiang AA59** 293 7
2 Zhejiang A925** 187 0
3 Zhejiang A2EM** 371 6
4 Zhejiang A2KA** 270 2
5 Zhejiang A255** 167 4
6 Zhejiang AK5X** 66 0
7 Zhejiang AC29** 164 4
8 Zhejiang A9EN** 259 4
9 Zhejiang A295** 458 0
10 Zhejiang AH52** 258 3
······ ······ ······ ······
······ ······ ······ ······
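The jump counting of S5 can be sketched as below, assuming set K is a list of (plate, Ki, Kj) tuples and set P is a dictionary keyed by (Ki, Kj). The function name `count_jumps`, the threshold value 0.05 and the toy data are hypothetical; the source does not state the value of Pi. Treating a pair missing from P as probability 0 is also an assumption.

```python
def count_jumps(f1, set_k, set_p, pi):
    """Return {plate: (Jnor, Jp)} for every plate in F1."""
    # Plates without any record in K keep the default (0, 0).
    counts = {plate: [0, 0] for plate in f1}
    for plate, ki, kj in set_k:
        if plate not in counts:
            continue
        pij = set_p.get((ki, kj), 0.0)  # assumption: unknown pair counts as 0
        if pij >= pi:
            counts[plate][0] += 1  # normal jump
        else:
            counts[plate][1] += 1  # abnormal jump
    return {plate: tuple(c) for plate, c in counts.items()}


toy_p = {("A", "B"): 0.60, ("A", "C"): 0.02, ("B", "C"): 0.30}
toy_k = [("p1", "A", "B"), ("p1", "B", "C"), ("p1", "A", "C"), ("p2", "A", "B")]
jumps = count_jumps({"p1", "p3"}, toy_k, toy_p, pi=0.05)
print(jumps["p1"], jumps["p3"])  # (2, 1) (0, 0)
```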
S6. Calculate the license plate character recognition error probability.
The elements of set F1 are divided into two sets, H1 and H2, according to their jump probabilities; H1 contains 66460616 elements and H2 contains 23970273 elements. A license plate consists of 7 characters: the first two indicate the region and the last five are the plate serial. In the present embodiment the first two characters are mainly "Zhejiang A", so only the last 5 characters of the plate are considered.
The share of each character among the last 5 plate characters is counted separately for sets H1 and H2, giving the following table:
Table 7
It can be seen that the characters 3, 5, Q and U have similar shares in set H1 and set H2, so their recognition error probability is low, whereas the shares of T, X and N differ considerably, so their recognition error probability is higher.
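The character-share comparison can be sketched as follows, using the relative error Lx = ABS((Lx2 - Lx1) / Lx1) given in claim 8. The function names and the toy plates are hypothetical. Characters that never occur in H1 are skipped because the ratio is undefined for them, which is a choice of this sketch rather than something the source specifies.

```python
from collections import Counter


def char_shares(plates, tail=5):
    """Share of each character among the last `tail` characters of the plates."""
    counts = Counter(ch for plate in plates for ch in plate[-tail:])
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def recognition_error(h1_plates, h2_plates, tail=5):
    """Lx = |Lx2 - Lx1| / Lx1 for every character x that occurs in H1."""
    s1 = char_shares(h1_plates, tail)
    s2 = char_shares(h2_plates, tail)
    return {ch: abs((s2.get(ch, 0.0) - s1[ch]) / s1[ch]) for ch in s1}


# Toy plates: "1" has share 0.6 in H1 and 0.4 in H2; "2" has 0.4 and 0.6.
lx = recognition_error(["XX11122"], ["XX11222"])
print(lx)
```

A character with nearly equal shares in both sets gets an error close to 0, which corresponds to the low-error characters named above.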
S7. Secondary screening and ranking by false-trademark possibility:
The false-trademark possibility FB is calculated by the following formula.
In the present embodiment, ε is set to 15.
Partial results are as follows.
Table 8
In the present embodiment, 1895 plates with a higher false-trademark possibility (FB > 0) are filtered out of more than 200,000 suspected plates, narrowing the screening range by more than 100 times. Actual verification shows that if the plates are sorted only by the number of times they appear as "suspected false-trademark", only 4 of the top 50 suspected plates are confirmed as false-trademark plates and the rest are recognition errors; sorted by this method, 24 of the top 50 suspected plates are confirmed as false-trademark plates, a 6-fold improvement in accuracy.

Claims (9)

1. A false-trademark vehicle secondary screening method based on probability distribution, comprising the following steps:
S1. Obtain the original bayonet passing-vehicle record data, and perform data cleaning to obtain the bayonet passing-vehicle record data;
S2. Sort the bayonet passing-vehicle record data and extract the bayonet pair vectors (Ki, Kj) traversed by each vehicle, where Ki and Kj denote bayonet numbers, and put them into set K together with HPHM, where HPHM denotes the vehicle's plate number;
S3. Calculate the spatial probability Pij of vehicle flow between bayonets, and store all probabilities (Ki, Kj, Pij) in set P;
S4. Obtain the license plate set H from the bayonet passing-vehicle record data of S1, and compare it with the vehicle and driver management database to preliminarily screen false license plates, obtaining the preliminarily screened false license plate set F1;
S5. Based on the spatial probability distribution of vehicle flow from S3, calculate the number of normal jumps Jnor and the number of abnormal jumps Jp of each vehicle in set F1, and put the license plates that conform to the spatial probability distribution into set H1 and those that do not into set H2;
S6. Calculate the license plate character recognition error probability Lx based on the character shares in set H1 and set H2;
S7. Perform secondary screening of the license plates based on the normal jump count Jnor and abnormal jump count Jp of each vehicle in set F1 and the license plate character recognition error probability Lx, and comprehensively determine the false-trademark probability of each license plate.
2. The false-trademark vehicle secondary screening method based on probability distribution according to claim 1, characterized in that the bayonet passing-vehicle record data of step S1 are obtained as follows: obtain the original bayonet passing-vehicle record data for one period, delete the data that do not conform to the set data cleaning rules, and retain the required dimensions, including bayonet number, plate number and passing time.
3. The false-trademark vehicle secondary screening method based on probability distribution according to claim 1, characterized in that the steps for obtaining set K in step S2 are as follows:
(1) Group the records by plate number and sort each group by passing time, then perform the following operations for each group:
Step 1: Take out the first record, denoted record 1;
Step 2: Take out the next record, denoted record 2;
Step 3: Calculate the time difference ΔT between record 1 and record 2; if ΔT is less than the threshold T, go to step 4; if ΔT is greater than the threshold T, assign record 2 to record 1 and go to step 2;
Step 4: Form a bayonet vector from the plate number and the bayonet numbers of the two records and put it into set K as (HPHM, Ki, Kj); assign record 2 to record 1 and go to step 2;
(2) Traverse all groups to obtain set K.
4. The false-trademark vehicle secondary screening method based on probability distribution according to claim 1, characterized in that calculating the spatial probability Pij of vehicle flow in step S3 comprises: counting the number of occurrences of each vector (Ki, Kj) in set K, denoted count(Ki, Kj); the total number of vehicles flowing out of bayonet Ki is then $\mathrm{count}(K_i) = \sum_{j} \mathrm{count}(K_i, K_j)$, and the flow probability of a vehicle from bayonet Ki to bayonet Kj is $P_{ij} = \mathrm{count}(K_i, K_j) / \mathrm{count}(K_i)$.
5. The false-trademark vehicle secondary screening method based on probability distribution according to claim 1, characterized in that the license plate set H in step S4 consists of the distinct license plates in the bayonet passing-vehicle record data of S1.
6. The false-trademark vehicle secondary screening method based on probability distribution according to claim 5, characterized in that the preliminary screening in step S4 forms the preliminarily screened false license plate set F1 from the license plates that do not exist in the vehicle and driver management database.
7. The false-trademark vehicle secondary screening method based on probability distribution according to any one of claims 1 to 6, characterized in that calculating the number of normal jumps Jnor and the number of abnormal jumps Jp of each vehicle in set F1 in step S5 comprises:
(i) For each license plate in set F1, obtain all of its corresponding records in set K;
(ii) If the license plate has no corresponding record in set K, record both its normal jump count Jnor and its abnormal jump count Jp as 0;
(iii) If the license plate has corresponding records in set K, look up the corresponding flow probability Pij in set P by the (Ki, Kj) of each record; if Pij is greater than or equal to the threshold Pi, this jump of the vehicle is considered normal; if Pij is less than the threshold Pi, this jump is considered abnormal;
(iv) Count the number of normal jumps Jnor of each license plate, i.e. the number of jumps with Pij ≥ Pi, and the number of abnormal jumps Jp, i.e. the number of jumps with Pij < Pi.
8. The false-trademark vehicle secondary screening method based on probability distribution according to claim 7, characterized in that calculating the license plate character recognition error probability Lx in step S6 comprises: counting the share of each character in set H1 and in set H2 separately, denoted Lx1 and Lx2, where x represents a possible character, and calculating the error of each character's share in H2 relative to its share in H1: Lx = ABS((Lx2 - Lx1) / Lx1).
9. The false-trademark vehicle secondary screening method based on probability distribution according to claim 8, characterized in that the license plate secondary screening formula in step S7 is as follows:
The larger the value of FB, the higher the possibility of a false-trademark plate; otherwise, the higher the possibility of a recognition error. ε is an empirical value, taken as the number of days in the period.
CN201710391814.6A 2017-05-27 2017-05-27 A kind of false-trademark vehicle postsearch screening method based on probability distribution Active CN107329977B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710391814.6A CN107329977B (en) 2017-05-27 2017-05-27 A kind of false-trademark vehicle postsearch screening method based on probability distribution

Publications (2)

Publication Number Publication Date
CN107329977A CN107329977A (en) 2017-11-07
CN107329977B true CN107329977B (en) 2019-08-16

Family

ID=60193242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710391814.6A Active CN107329977B (en) 2017-05-27 2017-05-27 A kind of false-trademark vehicle postsearch screening method based on probability distribution

Country Status (1)

Country Link
CN (1) CN107329977B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109087508B (en) * 2018-08-30 2021-09-21 广州市市政工程设计研究总院有限公司 High-definition bayonet data-based adjacent area traffic volume analysis method and system
CN109448363B (en) * 2018-09-30 2021-06-08 佳都科技集团股份有限公司 Intelligent suspected vehicle sealing and controlling method and system based on track prediction and processing terminal
CN110164138B (en) * 2019-05-17 2021-02-09 湖南科创信息技术股份有限公司 Identification method and system of fake-licensed vehicle based on bayonet convection direction probability and medium
KR102643324B1 (en) * 2020-10-29 2024-03-07 닛폰세이테츠 가부시키가이샤 Identification devices, identification methods and programs
CN112614347B (en) * 2020-12-22 2022-03-15 杭州海康威视系统技术有限公司 Fake plate detection method and device, computer equipment and storage medium
CN113011895B (en) * 2021-03-31 2023-07-18 腾讯科技(深圳)有限公司 Associated account sample screening method, device and equipment and computer storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013214143A (en) * 2012-03-30 2013-10-17 Fujitsu Ltd Vehicle abnormality management device, vehicle abnormality management system, vehicle abnormality management method, and program
CN105702047A (en) * 2016-03-04 2016-06-22 浙江宇视科技有限公司 License plate identification error filtering method and apparatus in fake-license plate analysis
CN105719489A (en) * 2016-03-24 2016-06-29 银江股份有限公司 Fake-licensed vehicle detection method based on bayonet vehicle flow direction probability
CN106022296A (en) * 2016-06-01 2016-10-12 银江股份有限公司 Fake plate vehicle detection method based on vehicle hot spot area probability aggregation

Also Published As

Publication number Publication date
CN107329977A (en) 2017-11-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder
Address after: 310012 1st floor, building 1, 223 Yile Road, Hangzhou City, Zhejiang Province
Patentee after: Yinjiang Technology Co.,Ltd.
Address before: 310012 1st floor, building 1, 223 Yile Road, Hangzhou City, Zhejiang Province
Patentee before: ENJOYOR Co.,Ltd.