Disclosure of Invention
In view of the shortcomings of the prior art, a first object of the invention is to provide a method for establishing a vehicle track prediction model, which addresses the problems of low-quality vehicle track prediction data and low vehicle track prediction accuracy.
A second object of the invention is to provide a vehicle track prediction method, which addresses the same problems of low-quality vehicle track prediction data and low vehicle track prediction accuracy.
The technical scheme for realizing the first object of the invention is as follows: a vehicle track prediction model establishing method comprising the following steps:
step S1: obtaining historical driving track data of a vehicle, and constructing single vehicle track data and road network vehicle track data of the vehicle;
step S2: screening the single vehicle track data and the road network vehicle track data respectively to obtain screened single vehicle track data and screened road network vehicle track data;
step S3: respectively carrying out bayonet completion on the screened single vehicle track data and the screened road network vehicle track data to obtain effective single vehicle track data and effective road network vehicle track data;
step S4: establishing a prediction model of the effective single vehicle track data according to the effective single vehicle track data, and establishing a prediction model of the effective road network vehicle track data according to the effective road network vehicle track data;
step S5: constructing a fusion training set according to the prediction model of the effective single vehicle track data and the prediction model of the effective road network vehicle track data;
step S6: training a preset prediction model on the fusion training set to generate a vehicle track prediction model.
Further, the historical driving track data at least comprises a bayonet track sequence formed by the vehicles passing through the bayonets sequentially according to the time sequence.
Further, the single vehicle track data and the road network vehicle track data are constructed according to a bayonet track sequence, the data structure of the single vehicle track data is { current bayonet number, time period type, date period type, specific time and next bayonet number }, and the data structure of the road network vehicle track data is { license plate number, current bayonet number, vehicle type, vehicle attribution, time period type, date period type, specific time and next bayonet number }.
Further, the time period types include: early morning, morning peak, morning, noon, afternoon, evening peak, and late night; the date segment types include the first day of a holiday, the middle of a holiday, the last day of a holiday, the first day of the working days, the middle of the working days, and the last day of the working days.
Further, before executing step S4, the method further includes deleting a fourth feature "specific time" of the valid single vehicle trajectory data to obtain new valid single vehicle trajectory data, and in step S4, a prediction model is established for the new valid single vehicle trajectory data;
and deleting the first characteristic 'license plate number' and the seventh characteristic 'specific time' of the effective road network vehicle track data to obtain new effective road network vehicle track data, and establishing a prediction model for the new effective road network vehicle track data in the step S4.
Further, the screening the single vehicle trajectory data and the road network vehicle trajectory data includes:
deleting the single vehicle track data and the road network vehicle track data which meet the first condition, reserving the residual single vehicle track data and road network vehicle track data, and respectively recording the residual single vehicle track data and the residual road network vehicle track data as screened single vehicle track data and screened road network vehicle track data;
the first condition is as follows: the time consumed for the vehicle to pass through the current bayonet number and the next bayonet number is less than the preset time, and the current bayonet number is the same as the next bayonet number.
Further, the preset time is obtained by the following steps:
the average value of the time spent by vehicles passing between all pairs of adjacent bayonets is recorded as $\mu$ and the standard deviation as $\sigma$, and the preset time is $\mu-2\sigma$.
Further, the performing bayonet completion on the screened single vehicle trajectory data and the screened road network vehicle trajectory data includes:
bayonet completion is carried out on the screened single vehicle track data according to step S3-1:
step S3-1: the bayonet track sequence corresponding to the screened single vehicle track data is recorded as a first bayonet track sequence; starting from the first bayonet of the first bayonet track sequence, every two adjacent bayonets, namely a current bayonet and a next bayonet, form a two-bayonet unit; the bayonet omitted from each two-bayonet unit is determined and supplemented into the first bayonet track sequence, the omitted bayonet being located after the current bayonet and before the next bayonet of the corresponding two-bayonet unit, so that effective single vehicle track data are obtained;
bayonet completion of the screened road network vehicle track data: the screened road network vehicle track data are grouped by license plate number, and the data under each license plate number are processed in turn according to step S3-1 to obtain effective road network vehicle track data.
Further, the specific implementation process of judging the omitted bayonets in the two bayonet units comprises the following steps:
sequentially extracting three consecutive adjacent bayonets, starting from the first bayonet of the first bayonet track sequence, each group of three consecutive adjacent bayonets being a three-bayonet unit, to obtain a three-bayonet sequence set;
calculating the number of occurrences of each three-bayonet unit in the three-bayonet sequence set, and calculating the average number of occurrences of the three-bayonet units, namely the total number of three-bayonet units in the set divided by the number of distinct three-bayonet units; if the number of occurrences of a three-bayonet unit is greater than this average, the three-bayonet unit is taken as a completion sequence unit, and all completion sequence units form a completion sequence set;
sequentially extracting two consecutive adjacent bayonets, starting from the first bayonet of the first bayonet track sequence, each pair of consecutive adjacent bayonets being a two-bayonet unit, to obtain a two-bayonet sequence set;
comparing each two-bayonet unit with the completion sequence units: when the two bayonets of a two-bayonet unit are respectively the same as the first bayonet and the third bayonet of a completion sequence unit, the two-bayonet unit is taken as a to-be-completed bayonet unit, and all to-be-completed bayonet units form a to-be-completed bayonet unit set;
calculating the time $d_1$ taken by the vehicle to pass between the two bayonets of each to-be-completed bayonet unit in the to-be-completed bayonet unit set, and the time $d_2$ taken by the vehicle to pass between the first bayonet and the last bayonet of the corresponding completion sequence unit;
if $|d_1-d_2|<0.2\,d_2$, it is determined that the to-be-completed bayonet unit has a missed bayonet, the missed bayonet being the bayonet omitted from the two-bayonet unit and being the same as the middle bayonet of the completion sequence unit.
Further, the prediction model of the effective single vehicle trajectory data is a naive Bayes model, constructed through formula ①:
$f(x)=\arg\max_{b_k} P(y=b_k)\prod_{i} P(X^{(i)}=x^{(i)} \mid y=b_k)$ ------ ①
wherein $X^{(i)}$ represents the i-th feature of a sample, each piece of effective single vehicle track data being one sample; $x^{(i)}$ is the feature value of the i-th feature; $y$ represents the next bayonet; $b_k$ denotes the next bayonet number; $P(y=b_k)$ is the probability that the next bayonet is the bayonet numbered $b_k$; $P(X^{(i)}=x^{(i)} \mid y=b_k)$ is the probability that the feature value of the i-th feature is $x^{(i)}$ given that the next bayonet is the bayonet numbered $b_k$; $k$ denotes the k-th bayonet of the bayonet sequence formed, in order, by the distinct bayonets in the bayonet track sequence corresponding to the effective single vehicle track data; and $f(x)$ returns the value $b_k$;
$P(X^{(i)}=x^{(i)} \mid y=b_k)$ is calculated according to formula ②, and $P(y=b_k)$ is calculated according to formula ③:
wherein $N$ represents the total number of samples; $\sum_{j=1}^{N} I(y_j=b_k)$ is the number of samples, among the $N$ samples, that satisfy $y=b_k$, i.e. the total number of samples whose next bayonet number is $b_k$; $x_j^{(i)}$ represents the i-th feature of the j-th sample; $x_j^{(i)}=a$ means that the feature value of the i-th feature of the j-th sample is $a$; $\sum_{j=1}^{N} I(x_j^{(i)}=a,\ y_j=b_k)$ is the total number of samples in which the feature value of the i-th feature is $a$ and the next bayonet number is $b_k$; and $\mathrm{Count}(b_k)$ is a function that takes the following values: if the number of occurrences of the bayonet numbered $b_k$ is greater than the average number of occurrences of all bayonets, $\mathrm{Count}(b_k)$ equals the number of occurrences of the bayonet numbered $b_k$; otherwise $\mathrm{Count}(b_k)=0$.
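The thresholding behaviour of Count(b_k) can be illustrated with a minimal Python sketch. The function name `thresholded_counts` is hypothetical, and counting occurrences over the next-bayonet labels of the samples is an assumption made for illustration; the text above does not state over which records the occurrences are counted.

```python
from collections import Counter

def thresholded_counts(next_bayonets):
    """Return Count(b) for every label b: the raw occurrence count if it
    exceeds the mean count over all distinct labels, otherwise 0."""
    raw = Counter(next_bayonets)
    if not raw:                      # no samples, nothing to count
        return {}
    mean = sum(raw.values()) / len(raw)
    return {b: (n if n > mean else 0) for b, n in raw.items()}

# Hypothetical next-bayonet labels: B occurs 4 times, C twice, D once.
labels = ["B", "B", "B", "C", "D", "B", "C"]
print(thresholded_counts(labels))
```

With these labels the mean count is 7/3, so only B keeps its count; rarely visited bayonets are zeroed out, which biases the single-vehicle model toward the bayonets a vehicle passes frequently.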
Further, the prediction model of the effective road network vehicle trajectory data is a naive bayes model, and the prediction model of the effective road network vehicle trajectory data is constructed by a formula ④:
wherein, ω (X)(i)) As a function, it takes the value:
P′(X(i)=x(i)|y=bk′) Indicated at the next bayonet by the number bk′Under the condition of the bayonet of (1), the characteristic value of the ith characteristic is x(i)P' (y ═ b)k′) The next bayonet is shown as number bk′Bayonet ofK 'represents the kth bayonet of a bayonet sequence formed by different bayonets in sequence in the bayonet track sequence corresponding to the effective road network vehicle track data, and f' (x) returns to bk′A value;
P′(y=bk′) And P' (X)(i)=x(i)|y=bk′) Calculated according to equations ⑤ and ⑥, respectively:
and K represents the total number of the next bayonet sign types in all samples, the effective road network vehicle track data is regarded as one sample, lambda is greater than 0, and lambda is a constant.
Further, the fusion training set specific implementation process includes the following steps:
each piece of effective single vehicle track data is evaluated by the prediction model of the effective single vehicle track data to obtain a corresponding coefficient value, the coefficient value of the j-th piece of effective single vehicle track data being $a_j$; each piece of effective road network vehicle track data is evaluated by the prediction model of the effective road network vehicle track data to obtain a corresponding coefficient value, the coefficient value of the j-th piece of effective road network vehicle track data being $a'_j$;
constructing training subsets, wherein the training subset of the j-th sample is $\{a_j, a'_j, y_j\}$ and $y_j$ takes the value 0 or 1; the current bayonet of the j-th sample is denoted $k_j$ and its next bayonet $k'_j$; the next bayonet predicted by the prediction model of the effective single vehicle track data is denoted $S$, and the next bayonet predicted by the prediction model of the effective road network vehicle track data is denoted $S'$; if $S=k'_j$, then $y_j=0$; if $S'=k'_j$, then $y_j=1$; if $S=k'_j$ and $S'=k'_j$, then $y_j=0$; if $S\neq k'_j$ and $S'\neq k'_j$, the coefficient values calculated for the j-th sample are not used as a training subset; the training subset of each sample is thereby obtained;
all the training subsets form the fusion training set $\{\{a_1,a'_1,y_1\},\{a_2,a'_2,y_2\},\ldots,\{a_j,a'_j,y_j\},\ldots,\{a_N,a'_N,y_N\}\}$.
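The labelling rule for the fusion training set can be sketched in Python as follows. The function name `build_fusion_training_set` and the tuple layout of the input are hypothetical; only the rule for assigning y_j is taken from the text.

```python
def build_fusion_training_set(samples):
    """samples: iterable of (a, a_prime, s, s_prime, true_next) tuples, where
    a / a_prime are the coefficient values from the two models, s / s_prime
    are their predicted next bayonets, and true_next is the observed one."""
    fused = []
    for a, a_prime, s, s_prime, true_next in samples:
        if s == true_next:
            # Single-vehicle model is right (this also covers "both right").
            fused.append((a, a_prime, 0))
        elif s_prime == true_next:
            # Only the road network model is right.
            fused.append((a, a_prime, 1))
        # Neither model is right: the sample is left out of the training set.
    return fused

# Hypothetical coefficient values and predictions for four samples.
samples = [
    (0.9, 0.4, "B", "C", "B"),   # only S correct     -> y = 0
    (0.3, 0.8, "B", "C", "C"),   # only S' correct    -> y = 1
    (0.7, 0.7, "D", "D", "D"),   # both correct       -> y = 0
    (0.2, 0.1, "B", "C", "E"),   # neither correct    -> dropped
]
print(build_fusion_training_set(samples))
```

Checking the single-vehicle prediction first is what makes the "both correct" case resolve to 0, as the rule above requires.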
Further, the preset prediction model is formula ⑦:
$f(a,b)=\mathrm{sigmoid}(\lambda_1 a+\lambda_2 b+\lambda_3)$ ------ ⑦
wherein $\lambda_1$, $\lambda_2$ and $\lambda_3$ are constants that are preset to arbitrary initial values, $a$ denotes $a_j$ and $b$ denotes $a'_j$, i.e. $a=[a_1,a_2,\ldots,a_j,\ldots,a_N]$ and $b=[a'_1,a'_2,\ldots,a'_j,\ldots,a'_N]$;
the final values of $\lambda_1$, $\lambda_2$ and $\lambda_3$ are obtained through a loss function, which is formula ⑧:
the loss function is minimized by a gradient descent algorithm to obtain the final values of $\lambda_1$, $\lambda_2$ and $\lambda_3$, and these final values are substituted into formula ⑦ to obtain the vehicle track prediction model.
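A minimal Python sketch of fitting formula ⑦ by gradient descent is given below. Formula ⑧ is not reproduced in this text, so the sketch assumes a binary cross-entropy loss, which is the usual choice for a sigmoid output; the function names, learning rate, number of epochs and zero initial values are likewise assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_fusion_model(data, lr=0.1, epochs=2000):
    """data: list of (a, b, y) with y in {0, 1}. Returns (l1, l2, l3)."""
    l1 = l2 = l3 = 0.0               # arbitrary initial values
    n = len(data)
    for _ in range(epochs):
        g1 = g2 = g3 = 0.0
        for a, b, y in data:
            # Gradient of the assumed cross-entropy loss w.r.t. the sigmoid input.
            err = sigmoid(l1 * a + l2 * b + l3) - y
            g1 += err * a
            g2 += err * b
            g3 += err
        l1 -= lr * g1 / n
        l2 -= lr * g2 / n
        l3 -= lr * g3 / n
    return l1, l2, l3

def fusion_output(params, a, b):
    """Evaluate f(a, b) = sigmoid(l1*a + l2*b + l3)."""
    l1, l2, l3 = params
    return sigmoid(l1 * a + l2 * b + l3)

# Hypothetical fusion training set {a_j, a'_j, y_j}.
data = [(0.9, 0.1, 0), (0.8, 0.2, 0), (0.2, 0.9, 1), (0.1, 0.7, 1)]
params = train_fusion_model(data)
print(fusion_output(params, 0.9, 0.1), fusion_output(params, 0.1, 0.9))
```

An output near 0 favours the single-vehicle prediction and an output near 1 favours the road network prediction, matching the 0/1 labels of the fusion training set.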
The technical scheme for realizing the second object of the invention is as follows: a vehicle trajectory prediction method comprising the following steps:
step S1: obtaining historical driving track data of a vehicle, and constructing single vehicle track data and road network vehicle track data of the vehicle;
step S2: screening the single vehicle track data and the road network vehicle track data respectively to obtain screened single vehicle track data and screened road network vehicle track data;
step S3: respectively carrying out bayonet completion on the screened single vehicle track data and the screened road network vehicle track data to obtain effective single vehicle track data and effective road network vehicle track data;
step S4: establishing a prediction model of the effective single vehicle track data according to the effective single vehicle track data, and establishing a prediction model of the effective road network vehicle track data according to the effective road network vehicle track data;
step S5: constructing a fusion training set according to the prediction model of the effective single vehicle track data and the prediction model of the effective road network vehicle track data;
step S6: training a preset prediction model on the fusion training set to generate a vehicle track prediction model;
step S7: constructing single vehicle track data and road network vehicle track data from the driving track data of the vehicle to be predicted, screening and completing the single vehicle track data and the road network vehicle track data respectively, and inputting the resulting data into the vehicle track prediction model to obtain a prediction result.
Further, the prediction result comprises: the probability that the next bayonet of the vehicle is bayonet S is 1-f(a, b), and the probability that the next bayonet of the vehicle is bayonet S' is f(a, b); bayonet S is the next bayonet predicted by the prediction model of the effective single vehicle track data, bayonet S' is the next bayonet predicted by the prediction model of the effective road network vehicle track data, and f(a, b) is the value calculated by the vehicle track prediction model.
The invention has the following beneficial effects. The data cleaning process expands the data features; the dedicated handling of time and date lets the prediction model use time information more effectively, while the secondary features of vehicle type and vehicle attribution are retained, so that the data reflect real conditions as closely as possible, the prediction model can take more possible associations into account, and the prediction result better matches the actual situation. Single vehicle track prediction uses inclination estimation and takes the habitual driving behaviour of the vehicle user into account, so that the places a vehicle frequently visits are predicted effectively. Road network vehicle track prediction uses an improved Bayesian formula; it supplements single vehicle track prediction, covers cases where data are incomplete because of missed captures, and can predict bayonets that the vehicle has not passed before. The fusion of the two models balances prediction accuracy against the possibility of a new bayonet, so that the prediction result is more accurate and closer to the real situation.
Example one
As shown in fig. 1, a vehicle trajectory prediction model building method includes the following steps:
step S1: obtaining historical driving track data of the vehicle, wherein the historical driving track data at least comprises a bayonet track sequence formed by the vehicle passing through the bayonet in sequence according to the time sequence, and constructing single vehicle track data and road network vehicle track data according to the bayonet track sequence.
In this embodiment, the historical driving track data of a vehicle refers to the records captured by the cameras at each road bayonet as the vehicle passes during a certain time period (for example, 12 hours). From these records, the bayonets that each vehicle passes in chronological order, from the first bayonet to the last, are obtained, i.e. the bayonet track sequence. The bayonet track sequence at least includes the current bayonet the vehicle passes, the next bayonet it passes, the date and specific time of passing, the time period type of that time, the date segment type of that date, the license plate number of the vehicle, the vehicle type, and the vehicle attribution. Each bayonet has a unique bayonet number.
And constructing corresponding single vehicle track data and road network vehicle track data for each vehicle according to the gate track sequence of the vehicle. The data structure of the track data of the single vehicle is { the current bayonet number, the time period type, the date period type, the specific time and the next bayonet number }; the data structure of the road network vehicle track data is { license plate number, current bayonet number, vehicle type, vehicle attribution, time period type, date period type, specific time, next bayonet number }.
In the structure of the single vehicle track data or the road network vehicle track data, the last attribute, "next bayonet number", is defined as the label, and every other attribute is a feature. For example, in the single vehicle track data the first attribute, "current bayonet number", is the first feature and its value is the feature value of that feature; "time period type" is the second feature and its value is the corresponding feature value; and so on.
When the last gate that the vehicle passes through is taken as the current gate number, the next gate number is set to 0 or null.
The time period types comprise early morning (0-7 o'clock), morning peak (7-9 o'clock), morning (9-12 o'clock), noon (12-14 o'clock), afternoon (14-17 o'clock), evening peak (17-20 o'clock) and late night (20-24 o'clock); that is, the day is divided into several time periods, and these seven periods are the possible feature values of the "time period type" feature. The date segment types comprise the first day of a holiday, the middle of a holiday, the last day of a holiday, the first day of the working days, the middle of the working days and the last day of the working days; holidays include weekends and national statutory holidays, and working days are the nationally specified working days, usually Monday to Friday of each week. The vehicle types include cars, passenger cars, trucks and the like; the vehicle types may also be classified in other ways, for example into taxis, private cars, motorcycles and the like.
The specific time refers to the exact time at which the vehicle passes the bayonet, for example 2018-05-04 22:31:03.
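Deriving the "time period type" feature from the specific time can be sketched in Python as below. The function name `time_period_type` is hypothetical, and treating each period as a half-open interval (a boundary hour such as 7:00 belongs to the later period) is an assumption, since the text does not say which period a boundary hour falls into.

```python
from datetime import datetime

# (start_hour, end_hour, label); each interval is treated as [start, end).
TIME_PERIODS = [
    (0, 7, "early morning"),
    (7, 9, "morning peak"),
    (9, 12, "morning"),
    (12, 14, "noon"),
    (14, 17, "afternoon"),
    (17, 20, "evening peak"),
    (20, 24, "late night"),
]

def time_period_type(ts):
    """Map a datetime to one of the seven time period types."""
    for start, end, label in TIME_PERIODS:
        if start <= ts.hour < end:
            return label
    raise ValueError("hour out of range")

print(time_period_type(datetime(2018, 5, 4, 22, 31, 3)))  # late night
```

The date segment type could be derived the same way from the calendar date, but it needs a holiday calendar, which is outside the scope of this sketch.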
The single vehicle track data is the set of driving records of one license plate number (i.e. one vehicle) passing through all bayonets, and the road network vehicle track data is the sum of the driving records of every license plate number passing through all bayonets in the historical driving track data, i.e. the driving records of all vehicles. Classifying the road network vehicle track data by license plate number yields the single vehicle track data corresponding to each license plate number. For example, suppose that in the historical driving track data vehicle M passes through 10 bayonets $A_1A_2A_3A_4A_5A_6A_7A_8A_9A_{10}$ in sequence and vehicle N passes through 10 bayonets $B_1B_2B_3B_4B_5B_6B_7B_8B_9B_{10}$ in sequence. Then 10 pieces of single vehicle track data are obtained for vehicle M, namely {$A_1$, time period type, date period type, specific time, $A_2$}, {$A_2$, time period type, date period type, specific time, $A_3$}, {$A_3$, time period type, date period type, specific time, $A_4$}, {$A_4$, time period type, date period type, specific time, $A_5$}, {$A_5$, time period type, date period type, specific time, $A_6$}, {$A_6$, time period type, date period type, specific time, $A_7$}, {$A_7$, time period type, date period type, specific time, $A_8$}, {$A_8$, time period type, date period type, specific time, $A_9$}, {$A_9$, time period type, date period type, specific time, $A_{10}$} and {$A_{10}$, time period type, date period type, specific time, 0}.
Likewise, vehicle N has 10 pieces of single vehicle track data. The road network vehicle track data comprises the 10 pieces of single vehicle track data of vehicle M and the 10 pieces of single vehicle track data of vehicle N, with the license plate number, vehicle type and vehicle attribution features added to each corresponding piece of single vehicle track data.
Each single vehicle trajectory data or road network vehicle trajectory data is considered a sample.
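Constructing the single vehicle track data from a bayonet track sequence can be sketched in Python as follows. The function name and the tuple layout are hypothetical; the rule that the last record gets 0 as its next bayonet number follows the text.

```python
def build_single_vehicle_records(passes):
    """passes: time-ordered list of
    (bayonet_number, time_period_type, date_segment_type, specific_time).
    Returns one record per pass in the layout
    (current bayonet, time period type, date segment type, specific time, next bayonet);
    the last record has 0 as its next bayonet number."""
    records = []
    for idx, (bayonet, period, date_type, ts) in enumerate(passes):
        nxt = passes[idx + 1][0] if idx + 1 < len(passes) else 0
        records.append((bayonet, period, date_type, ts, nxt))
    return records

# Hypothetical passes of one vehicle through three bayonets.
passes = [
    ("A1", "morning peak", "weekday middle", "2018-05-04 08:01:10"),
    ("A2", "morning peak", "weekday middle", "2018-05-04 08:06:42"),
    ("A3", "morning peak", "weekday middle", "2018-05-04 08:15:05"),
]
for record in build_single_vehicle_records(passes):
    print(record)
```

Road network records would carry the same fields plus license plate number, vehicle type and vehicle attribution.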
Step S2: screening the single vehicle track data and the road network vehicle track data.
A camera may capture the same vehicle at the same bayonet twice within a very short interval (for example, 0.1 s apart). In that case the current bayonet number is the same as the next bayonet number in the single vehicle track data and the road network vehicle track data, and the corresponding single vehicle track data or road network vehicle track data is duplicate data that needs to be deleted. This is usually caused by a problem with the camera itself. When a vehicle appears two or more times at the same bayonet within a short time, the redundant single vehicle track data and road network vehicle track data must be removed, so that the vehicle has only one corresponding piece of single vehicle track data and road network vehicle track data within that short time. The specific implementation process comprises the following steps:
The time taken by any vehicle to pass from a bayonet A to an adjacent bayonet B is normally distributed; over all pairs of adjacent bayonets, the average value of this time is recorded as $\mu$ and the standard deviation as $\sigma$.
For each piece of single vehicle track data, the time $T_{AB}$ taken by the vehicle to pass from bayonet A to bayonet B is read. If $T_{AB}\ge\mu-2\sigma$, where $\mu-2\sigma$ is called the preset time, the single vehicle track data or road network vehicle track data whose current bayonet is bayonet A and whose next bayonet is bayonet B is regarded as correctly recorded data and is retained; otherwise, the single vehicle track data or road network vehicle track data that satisfies the first condition is deleted.
The first condition is as follows: $T_{AB}<\mu-2\sigma$ and the bayonet numbers of bayonet A and bayonet B are the same, i.e. the current bayonet number is the same as the next bayonet number. If $T_{AB}<\mu-2\sigma$ but the two bayonet numbers differ, the corresponding single vehicle track data or road network vehicle track data is still retained.
The single vehicle track data and the road network vehicle track data after data screening are recorded as screened single vehicle track data and screened road network vehicle track data, respectively. After data screening, the original bayonet track sequence corresponding to the single vehicle track data (for example, $A_1A_2A_3A_4A_5A_6A_7A_8A_9A_{10}$) is updated. Taking single vehicle track data as an example, if the time taken by the vehicle to pass between two adjacent bayonets $A_5A_6$ satisfies the first condition, the next bayonet $A_6$ and its corresponding data record must be deleted, i.e. bayonet $A_6$ and the record {$A_6$, time period type, date period type, specific time, $A_7$} are deleted, and the next bayonet of $A_5$ is changed to $A_7$, the bayonet that followed the deleted bayonet $A_6$. For example, after the bayonet $A_6$ at the corresponding position is deleted from the original bayonet track sequence $A_1A_2A_3A_4A_5A_6A_7A_8A_9A_{10}$, the sequence becomes the new bayonet track sequence $A_1A_2A_3A_4A_5A_7A_8A_9A_{10}$, which is recorded as the first bayonet track sequence.
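The screening rule can be sketched in Python as below. The sketch assumes that the preset time is the mean travel time minus two standard deviations, uses the population standard deviation, and only filters records; relinking the next bayonet of the preceding record, as described above, is omitted. The function name and dictionary keys are hypothetical.

```python
from statistics import mean, pstdev

def screen_records(records, travel_times):
    """records: list of dicts with keys 'current', 'next' and 'elapsed'
    (seconds between passing the current and the next bayonet).
    travel_times: elapsed times observed between all adjacent bayonet pairs.
    Drops records that are both too fast and stay at the same bayonet."""
    preset_time = mean(travel_times) - 2 * pstdev(travel_times)  # assumed threshold
    return [
        r for r in records
        if not (r["elapsed"] < preset_time and r["current"] == r["next"])
    ]

# Hypothetical data: twenty normal passes of 60 s and one 0.1 s duplicate capture.
travel_times = [60.0] * 20 + [0.1]
records = [
    {"current": "A1", "next": "A2", "elapsed": 60.0},
    {"current": "A5", "next": "A5", "elapsed": 0.1},   # duplicate capture, removed
    {"current": "A5", "next": "A6", "elapsed": 0.1},   # fast but different bayonets, kept
]
print(screen_records(records, travel_times))
```

Both parts of the condition must hold for a record to be removed, so an unusually fast pass between two different bayonets is kept.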
Step S3: carrying out bayonet completion on the screened single vehicle track data and the screened road network vehicle track data respectively.
For the screened single vehicle track data, the method comprises the following steps:
The bayonet track sequence, in order of passage, corresponding to the single vehicle track data screened in step S2 is obtained. Starting from the first bayonet of the bayonet track sequence, three consecutive adjacent bayonets are extracted in turn, each group of three being a three-bayonet unit, so that a three-bayonet sequence set is obtained. For example, assuming that the bayonet track sequence is ABCDEFACDEFABCE, extraction starts from the first bayonet A and first yields the three-bayonet unit ABC; the three-bayonet units together form the three-bayonet sequence set { ABC, BCD, CDE, DEF, EFA, FAC, ACD, CDE, DEF, EFA, FAB, ABC, BCE }. The number of occurrences of each three-bayonet unit in the three-bayonet sequence set is then counted; for example, the three-bayonet units ABC, CDE, DEF and EFA each appear 2 times, and each of the remaining three-bayonet units appears once. The average number of occurrences of the three-bayonet units is then calculated as the total number of three-bayonet units in the set divided by the number of distinct three-bayonet units.
For example, the total number of three-bayonet units in the three-bayonet sequence set { ABC, BCD, CDE, DEF, EFA, FAC, ACD, CDE, DEF, EFA, FAB, ABC, BCE } is 13 and the number of distinct three-bayonet units is 9, so the average is 13/9 ≈ 1.44.
After the average is obtained, the number of occurrences of each three-bayonet unit is compared with it. If the number of occurrences of a three-bayonet unit is greater than the average, that three-bayonet unit is taken as a completion sequence unit of the completion sequence set; otherwise it is not taken as a completion sequence unit. For example, the three-bayonet units ABC, CDE, DEF and EFA each occur more often than the average, so they are all completion sequence units, and the completion sequence set { ABC, CDE, DEF, EFA } is obtained.
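The construction of the completion sequence set can be sketched in Python as follows. Representing each bayonet by a single character, so that a track is a string, is an assumption made for brevity, and the function name is hypothetical. The example track is the one whose three-bayonet units are listed in the set above.

```python
from collections import Counter

def completion_sequence_set(track):
    """Return the three-bayonet units that occur more often than the
    average occurrence count (total units / distinct units)."""
    triples = [track[i:i + 3] for i in range(len(track) - 2)]
    counts = Counter(triples)
    if not counts:                   # fewer than three bayonets in the track
        return set()
    average = len(triples) / len(counts)
    return {unit for unit, n in counts.items() if n > average}

print(sorted(completion_sequence_set("ABCDEFACDEFABCE")))
# ['ABC', 'CDE', 'DEF', 'EFA']
```

For this track there are 13 units, 9 of them distinct, so the threshold is 13/9 and only the four units that occur twice pass it.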
Next, starting from the first bayonet of the bayonet track sequence, two consecutive adjacent bayonets are extracted in turn, each pair being a two-bayonet unit, to obtain a two-bayonet sequence set. For example, the two-bayonet sequence set obtained from the bayonet track sequence ABCDEFACDEFABCE is { AB, BC, CD, DE, EF, FA, AC, CD, DE, EF, FA, AB, BC, CE }. Each two-bayonet unit is compared with the completion sequence units: when the two bayonets of a two-bayonet unit are respectively the same as the first bayonet and the third (i.e. last) bayonet of a completion sequence unit, that two-bayonet unit may need to be completed and is taken as a to-be-completed bayonet unit; all to-be-completed bayonet units form the to-be-completed bayonet unit set. For example, the two bayonets of the two-bayonet unit AC are the same as the first bayonet A and the third bayonet C of the completion sequence unit ABC, so AC is a to-be-completed bayonet unit. Similarly, the two bayonets of the two-bayonet unit CE are the same as the first bayonet C and the third bayonet E of the completion sequence unit CDE, so CE is a to-be-completed bayonet unit. The to-be-completed bayonet unit set { AC, CE } is thus obtained.
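Selecting the to-be-completed bayonet units can be sketched in Python as below, again assuming single-character bayonet identifiers; the function name is hypothetical.

```python
def pairs_to_complete(track, completion_units):
    """Return the two-bayonet units whose two bayonets match the first and
    third bayonet of some completion sequence unit."""
    ends = {(unit[0], unit[2]) for unit in completion_units}
    pairs = [track[i:i + 2] for i in range(len(track) - 1)]
    return sorted({p for p in pairs if (p[0], p[1]) in ends})

print(pairs_to_complete("ABCDEFACDEFABCE", {"ABC", "CDE", "DEF", "EFA"}))
# ['AC', 'CE']
```

The completion units DEF and EFA contribute the end pairs DF and EA, which never occur as adjacent bayonets in this track, so only AC and CE are selected.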
After the to-be-completed bayonet unit set is obtained, the time taken by the vehicle to pass between the two bayonets of each to-be-completed bayonet unit is calculated and recorded as $d_1$, and the time taken between the same two bayonets in the corresponding completion sequence unit is recorded as $d_2$. For example, the time taken by the vehicle to pass between bayonet A and bayonet C of the to-be-completed bayonet unit AC is $d_1$, and the time taken between bayonet A and bayonet C in the corresponding completion sequence unit ABC is $d_2$. If $|d_1-d_2|<0.2\,d_2$, it is determined that the to-be-completed bayonet unit has a missed bayonet, i.e. one bayonet was missed between the two bayonets of the to-be-completed bayonet unit, and the missed bayonet is the same as the middle bayonet of the completion sequence unit.
For example, the time $d_1$ taken by the vehicle to pass between bayonet A and bayonet C of the to-be-completed bayonet unit AC and the time $d_2$ taken between bayonet A and bayonet C in the corresponding completion sequence unit ABC are calculated. If $|d_1-d_2|<0.2\,d_2$ is satisfied, it is determined that bayonet B, the middle bayonet of the completion sequence unit ABC, was missed between the bayonets of the to-be-completed bayonet unit AC, and bayonet B is supplemented into the unit. After bayonet B is supplemented, the to-be-completed bayonet unit AC becomes the bayonet sequence ABC, which replaces AC in the original bayonet track sequence, thereby updating the bayonet track sequence: the original bayonet track sequence ABCDEFACDEFABCE becomes the new bayonet track sequence ABCDEFABCDEFABCE. If the inequality is not satisfied, no bayonet is supplemented. This is repeated until every to-be-completed bayonet unit in the to-be-completed bayonet unit set has been processed, yielding a new bayonet track sequence. Track completion of the single vehicle track data is thereby finished, and the single vehicle track data after track completion is recorded as effective single vehicle track data.
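The time check and the insertion of the missed bayonet can be sketched in Python as follows. How $d_2$ is measured for a completion sequence unit is not detailed in the text, so the sketch simply takes it as a given number per unit; the function name, the input layout and the simplification that each pair of end bayonets maps to one completion unit are assumptions.

```python
def complete_track(passes, completion_times, tolerance=0.2):
    """passes: non-empty, time-ordered list of (bayonet, time_in_seconds).
    completion_times: {(first, middle, last): d2}, where d2 is the time taken
    between the first and last bayonet of that completion sequence unit.
    Returns the bayonet sequence with missed bayonets inserted."""
    by_ends = {(f, l): (m, d2) for (f, m, l), d2 in completion_times.items()}
    result = [passes[0][0]]
    for (cur, t_cur), (nxt, t_nxt) in zip(passes, passes[1:]):
        match = by_ends.get((cur, nxt))
        if match is not None:
            middle, d2 = match
            d1 = t_nxt - t_cur
            if abs(d1 - d2) < tolerance * d2:   # |d1 - d2| < 0.2 * d2
                result.append(middle)           # supplement the missed bayonet
        result.append(nxt)
    return result

# Hypothetical timings: A->C took 100 s (close to the 110 s of ABC),
# C->E took 300 s (far from the 120 s of CDE).
passes = [("A", 0), ("C", 100), ("E", 400)]
completion_times = {("A", "B", "C"): 110, ("C", "D", "E"): 120}
print(complete_track(passes, completion_times))
```

In this example B is inserted between A and C, while nothing is inserted between C and E because the elapsed time differs too much from that of the unit CDE.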
For the road network vehicle track data, the data is first split by license plate to obtain track data similar to that of a single vehicle; that is, the data obtained for each license plate has the structure {license plate number, current bayonet number, vehicle type, vehicle attribution, time period type, date period type, specific time, next bayonet number}. The subsequent processing steps are the same as those for the single vehicle track data, which finishes the track completion of the road network vehicle track data. The road network vehicle track data after track completion is recorded as the effective road network vehicle track data.
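Splitting the road network records by license plate can be sketched as below. The dictionary keys (`plate`, `current`, `time`, `next`) are hypothetical field names chosen for the example; the text only specifies which attributes each record carries.

```python
from collections import defaultdict


def split_by_plate(records):
    """Group road network records by license plate, each group in time order."""
    per_vehicle = defaultdict(list)
    for record in records:
        per_vehicle[record["plate"]].append(record)
    for plate_records in per_vehicle.values():
        plate_records.sort(key=lambda r: r["time"])
    return dict(per_vehicle)


# Made-up records for two vehicles, deliberately out of time order.
records = [
    {"plate": "P1", "current": "B", "time": 20, "next": "C"},
    {"plate": "P2", "current": "A", "time": 5, "next": "D"},
    {"plate": "P1", "current": "A", "time": 10, "next": "B"},
]
by_plate = split_by_plate(records)
```

Each per-plate list can then be passed through the same completion steps as single vehicle track data.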
Step S4: establishing a prediction model of the effective single vehicle track data.
A prediction model of the effective single vehicle track data is established; the prediction model is formula ①:
wherein X^(i) represents the i-th feature of a sample, and one effective single vehicle track data is one sample; for example, {A_1, time period type, date period type, specific time, A_2} is a sample, and X^(1) represents the first feature A_1 of the sample; x^(i) represents the feature value of the i-th feature in the sample; y represents the next bayonet; b_k represents the next bayonet number; P(y = b_k) represents the probability that the next bayonet is the bayonet numbered b_k; P(X^(i) = x^(i) | y = b_k) represents the probability that the feature value of the i-th feature of the sample is x^(i), given that the next bayonet is the bayonet numbered b_k; k denotes the k-th bayonet of the bayonet sequence formed, in order, by the distinct bayonets in the bayonet track sequence corresponding to the effective single vehicle track data. For example, if the bayonet track sequence corresponding to the effective single vehicle track data is ABCDEABC, the bayonet sequence formed by the distinct bayonets in order is ABCDE, and k takes the values 1 to 5, that is, k = 1, 2, 3, 4, 5. f(x) returns a b_k value, which is the bayonet number of the k-th bayonet; for example, if f(x) returns b_3, the returned bayonet is bayonet C. The bayonet corresponding to the returned b_k value is denoted S.
To reduce the amount of computation, the "specific time" feature of the effective single vehicle track data is deleted before the prediction model of the effective single vehicle track data is established, so that each sample becomes {current bayonet number, time period type, date period type, next bayonet number}, where x^(1) represents the current bayonet number, x^(2) represents the time period type, and x^(3) represents the date period type. It should be noted that the last attribute, "next bayonet number", is defined as a label and is not a feature, so there is no x^(4); that is, the maximum value of i is 3.
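Formula ① is not reproduced in this text. Assuming it has the standard naive Bayes form implied by the definitions above (a prior P(y = b_k), per-feature conditionals P(X^(i) = x^(i) | y = b_k), and a function f(x) that returns the most probable b_k), it could be written as the following sketch, with the product running over the three remaining features:

```latex
f(x) = \arg\max_{b_k} \; P(y = b_k) \prod_{i=1}^{3} P\left(X^{(i)} = x^{(i)} \mid y = b_k\right)
```

The exact layout of the original formula may differ; only the symbols defined in the text are used here.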
As can be seen from formula ①, only P(X^(i) = x^(i) | y = b_k) and P(y = b_k) need to be calculated. Here, maximum likelihood estimation is used to calculate P(X^(i) = x^(i) | y = b_k), and tilt estimation is used to calculate P(y = b_k). P(X^(i) = x^(i) | y = b_k) is calculated according to formula ②, and P(y = b_k) is calculated according to formula ③:
wherein N represents the total number of samples, that is, all the single vehicle track data of a given vehicle; Σ_j I(y_j = b_k) represents the number of samples among the N samples that satisfy y = b_k, that is, the number of single vehicle track data records with y = b_k; x_j^(i) represents the i-th feature in the j-th sample, for example x_100^(1) represents the 1st feature of the 100th sample; x_j^(i) = a represents that the feature value of the i-th feature in the j-th sample is a, for example x_100^(1) = 0023 represents that the feature value of the 1st feature of the 100th sample is 0023; I(x_j^(i) = a, y_j = b_k) indicates a sample in which the feature value of the i-th feature in the j-th sample is a and the next bayonet number is b_k, and Σ_j I(x_j^(i) = a, y_j = b_k) is the total number of such samples; similarly, I(y_j = b_k) indicates a sample whose next bayonet number is b_k, and Σ_j I(y_j = b_k) is the total number of such samples; Count(b_k) is a function whose value is as follows: if the number of occurrences of the bayonet numbered b_k is greater than the average number of occurrences of all bayonets, Count(b_k) is the number of occurrences of the bayonet numbered b_k; otherwise Count(b_k) = 0.
Through Count(b_k), frequently occurring bayonets are taken into account: the probability of frequently occurring bayonets is increased, and the probability of infrequently occurring bayonets is weakened.
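The two quantities that the text defines fully, Count(b_k) and the maximum likelihood estimate of the conditional probability, can be sketched as below. The sketch does not reproduce formula ③, whose exact form is not given in this text. The function names, the tuple layout of a sample, and the choice to count occurrences over the "next bayonet number" labels are assumptions made for illustration.

```python
from collections import Counter


def count_weights(next_bayonets):
    """Count(b_k): occurrences if above the mean over all bayonets, else 0.

    Assumption: occurrences are counted over the label column.
    """
    occurrences = Counter(next_bayonets)
    mean = sum(occurrences.values()) / len(occurrences)
    return {b: (n if n > mean else 0) for b, n in occurrences.items()}


def conditional_mle(samples, feature_index, value, label):
    """Maximum likelihood estimate of P(X^(i) = value | y = label).

    samples: tuples whose last element is the label (hypothetical layout).
    """
    with_label = [s for s in samples if s[-1] == label]
    if not with_label:
        return 0.0
    hits = sum(1 for s in with_label if s[feature_index] == value)
    return hits / len(with_label)


# Made-up samples: (current bayonet, time period type, date period type, next bayonet)
samples = [
    ("A", "peak", "weekday", "B"),
    ("A", "offpeak", "weekday", "B"),
    ("C", "peak", "weekend", "B"),
    ("B", "peak", "weekday", "C"),
    ("C", "peak", "weekday", "D"),
]
weights = count_weights([s[-1] for s in samples])
p_a_given_b = conditional_mle(samples, 0, "A", "B")
```

In this data only bayonet B occurs more often than the mean, so it is the only label with a non-zero Count value.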
Step S5: establishing a prediction model of the effective road network vehicle track data.
A prediction model of the effective road network vehicle track data is established; the prediction model is formula ④:
wherein ω(X^(i)) is a function whose value is:
The role of ω(X^(i)) is to change the degree to which each feature of the sample contributes to the result of the prediction model: the larger the value of ω(X^(i)), the larger the contribution, and vice versa. In an ordinary Bayesian model, every feature is considered equally important, that is, every feature has the same weight. In vehicle track prediction, however, not every feature is equally important. According to actual conditions and research, the bayonet number is more important than the other features, so the current bayonet number feature is given the largest weight, which makes the prediction result of the prediction model depend more on the bayonet number feature. In general, the features rank in importance as follows:
current bayonet number > time period type > date period type > vehicle type = vehicle attribution.
Correspondingly, the values 1, 2 and 3 of ω(X^(i)) only indicate the relative importance of the corresponding features and do not limit ω(X^(i)) to the values 1, 2 and 3, so the values of ω(X^(i)) may be:
wherein q_1 > q_2 > q_3 > 0.
X^(i) represents the i-th feature of a sample, and one effective road network vehicle track data is one sample; x^(i) represents the feature value of the i-th feature in the sample; y represents the next bayonet; b_k' represents the next bayonet number; P'(y = b_k') represents the probability that the next bayonet is the bayonet numbered b_k'; P'(X^(i) = x^(i) | y = b_k') represents the probability that the feature value of the i-th feature of the sample is x^(i), given that the next bayonet is the bayonet numbered b_k'; k' denotes the k'-th bayonet of the bayonet sequence formed, in order, by the distinct bayonets in the bayonet track sequence corresponding to the effective road network vehicle track data. For example, if a bayonet track sequence corresponding to effective single vehicle track data is ABCDEABC, the bayonet sequence formed by the distinct bayonets in order is ABCDE, and k' takes the values 1 to 5, that is, k' = 1, 2, 3, 4, 5. f(x) returns a b_k' value, which represents the corresponding bayonet; for example, if f(x) returns b_3, the returned bayonet is bayonet C. The bayonet corresponding to the returned b_k' value is denoted S'.
To reduce the amount of computation, the "license plate number" and "specific time" features of the effective road network vehicle track data are deleted before the prediction model of the effective road network vehicle track data is established, so that each sample becomes {current bayonet number, vehicle type, vehicle attribution, time period type, date period type, next bayonet number}, where x^(1) represents the current bayonet number, x^(2) represents the vehicle type, x^(3) represents the vehicle attribution, x^(4) represents the time period type, and x^(5) represents the date period type. Note that the attribute "next bayonet number" is defined as a label and is not a feature, so there is no x^(6); the maximum value of i is therefore 5.
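Formula ④ is not reproduced in this text, so the way ω(X^(i)) enters the model is not confirmed. One common choice in weighted naive Bayes is to use the weight as an exponent on each conditional probability, which becomes a multiplier in log space. The sketch below assumes that choice; the function names, weights, and probabilities are made up for illustration.

```python
import math


def weighted_score(prior, conditionals, weights):
    """log P'(y) + sum_i w_i * log P'(x^(i) | y).

    Assumption: weights act as exponents on the conditionals.
    All probabilities must be > 0 (smoothing is expected to ensure this).
    """
    return math.log(prior) + sum(
        w * math.log(p) for p, w in zip(conditionals, weights)
    )


def predict_next(candidates, weights):
    """candidates: dict label -> (prior, [conditional probabilities])."""
    return max(
        candidates,
        key=lambda b: weighted_score(candidates[b][0], candidates[b][1], weights),
    )


# Two candidate next bayonets with equal priors and two features each.
candidates = {
    "B": (0.5, [0.2, 0.9]),
    "C": (0.5, [0.6, 0.3]),
}
first_feature_dominant = predict_next(candidates, [3, 1])
second_feature_dominant = predict_next(candidates, [1, 3])
```

Raising the weight of the first feature favours bayonet C, while raising the weight of the second favours bayonet B, which shows how the weights shift the prediction toward the more important feature.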
P'(y = b_k') and P'(X^(i) = x^(i) | y = b_k') are calculated according to formulas ⑤ and ⑥, respectively:
wherein K represents the total number of distinct values of the label "next bayonet number" in all samples; for example, if the bayonet track sequence corresponding to the effective road network vehicle track data is ABCDEABC, then N is 8 and K is 5. λ > 0, and λ typically takes a value from 1 to 3.
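Formulas ⑤ and ⑥ are not reproduced in this text. The parameters K and λ > 0 suggest additive (Laplace) smoothing, so the sketch below assumes the usual smoothed prior (count + λ) / (N + Kλ). The function name is hypothetical, and the labels are taken directly from the example sequence ABCDEABC so that N = 8 and K = 5 as in the text.

```python
from collections import Counter


def smoothed_prior(labels, lam=1.0):
    """Additively smoothed prior (assumed form): (count + lam) / (N + K * lam)."""
    counts = Counter(labels)
    n = len(labels)   # N: total number of samples
    k = len(counts)   # K: number of distinct label values
    return {b: (c + lam) / (n + k * lam) for b, c in counts.items()}


priors = smoothed_prior(list("ABCDEABC"), lam=1.0)
```

With λ = 1, bayonets A, B and C each get (2 + 1) / (8 + 5) and bayonets D and E each get (1 + 1) / (8 + 5), so no label ends up with zero probability.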
Step S6: establishing a fusion training set for the vehicle track prediction fusion model.
From formula ① above, each effective single vehicle track data (i.e. each sample) yields a coefficient value a_j, where a_j denotes the coefficient value calculated for the j-th sample. a_j is calculated as follows:
Similarly, from formula ④, each effective road network vehicle track data (i.e. each sample) yields a coefficient value a'_j, where a'_j denotes the coefficient value calculated for the j-th sample. a'_j is calculated as follows:
A training subset of the fusion model is constructed; the training subset of the j-th sample is {a_j, a'_j, y_j}, where y_j takes the value 0 or 1. The value of y_j is obtained as follows:
Assume that the data whose current bayonet is k_j and whose next bayonet is k'_j correspond to the j-th effective single vehicle track data and the j-th effective road network vehicle track data, respectively. The effective single vehicle track data corresponding to the j-th sample is substituted into the prediction model of the effective single vehicle track data, and the effective road network vehicle track data corresponding to the j-th sample is substituted into the prediction model of the effective road network vehicle track data, for prediction. That is, the effective single vehicle track data whose current bayonet is k_j and whose next bayonet is k'_j is predicted by formula ①, and the effective road network vehicle track data whose current bayonet is k_j and whose next bayonet is k'_j is predicted by formula ④; the prediction results are denoted S and S', respectively.
If S = k'_j, then y_j = 0; if S' = k'_j, then y_j = 1; if S = k'_j and S' = k'_j, then y_j = 0; if S ≠ k'_j and S' ≠ k'_j, then the j-th sample is not used as a training subset, that is, the j-th sample has no corresponding training subset.
The corresponding y_j is calculated for each sample, and all the resulting training subsets form the fusion training set of the fusion model.
That is, y_j = 0 indicates that the prediction result is taken from the prediction model of the effective single vehicle track data, i.e. the predicted next bayonet is S; y_j = 1 indicates that the prediction result is taken from the prediction model of the effective road network vehicle track data, i.e. the predicted next bayonet is S'.
All the training subsets form the fusion training set of the fusion model, that is, the fusion training set is {{a_1, a'_1, y_1}, {a_2, a'_2, y_2}, ..., {a_j, a'_j, y_j}, ..., {a_N, a'_N, y_N}}, j = 1, 2, 3, ..., N.
In each training subset, a_j and a'_j are for the same vehicle: a_j is calculated from the effective single vehicle track data of that vehicle, and a'_j is calculated from the effective road network vehicle track data of the same vehicle.
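The labelling rule for y_j and the construction of the fusion training set can be sketched as follows. The function names and the row layout `(a_j, a'_j, S, S', actual next bayonet)` are assumptions for illustration; the coefficient values in the example are made up.

```python
def fusion_label(s, s_prime, actual_next):
    """Return y_j per the rule in the text, or None if the sample is dropped."""
    if s == actual_next:
        return 0  # covers "only S correct" and "both correct"
    if s_prime == actual_next:
        return 1  # only the road network model is correct
    return None   # neither model is correct: no training subset


def build_fusion_training_set(rows):
    """rows: iterable of (a_j, a_prime_j, S, S_prime, actual_next)."""
    training_set = []
    for a, a_prime, s, s_prime, actual in rows:
        y = fusion_label(s, s_prime, actual)
        if y is not None:
            training_set.append((a, a_prime, y))
    return training_set


rows = [
    (0.8, 0.3, "B", "C", "B"),  # single vehicle model correct -> 0
    (0.2, 0.7, "B", "C", "C"),  # road network model correct   -> 1
    (0.6, 0.6, "D", "D", "D"),  # both correct                 -> 0
    (0.4, 0.5, "B", "C", "E"),  # neither correct              -> dropped
]
fusion_set = build_fusion_training_set(rows)
```

Checking S first reproduces the tie-breaking rule in the text, where y_j = 0 when both models predict the true next bayonet.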
Step S7: establishing the prediction model for vehicle track prediction.
The prediction model for vehicle track prediction is formula ⑦:
f(a, b) = sigmoid(λ_1 a + λ_2 b + λ_3) ------ ⑦
This is a logistic regression model. λ_1, λ_2 and λ_3 are constants, each preset to an arbitrary initial value. Here a represents a_j and b represents a'_j, that is, a = [a_1, a_2, ..., a_j, ..., a_N] and b = [a'_1, a'_2, ..., a'_j, ..., a'_N]. For example, f(a_1, b_1) = sigmoid(λ_1 a_1 + λ_2 b_1 + λ_3).
A loss function is established, which is formula ⑧:
The loss function is minimized by a gradient descent algorithm to obtain the corresponding λ_1, λ_2 and λ_3. These values are taken as the optimal parameters and substituted into formula ⑦ to obtain the prediction model.
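Training formula ⑦ by gradient descent can be sketched as below. Formula ⑧ is not reproduced in this text, so the sketch assumes the usual binary cross-entropy loss for logistic regression. The function names, the learning rate, the number of epochs, the zero initial values, and the training data are all illustrative choices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(training_set, lr=0.5, epochs=2000):
    """Fit lambda_1..lambda_3 of f(a, b) = sigmoid(l1*a + l2*b + l3).

    Assumption: the loss is mean binary cross-entropy, whose gradient with
    respect to the linear term is (prediction - y).
    """
    l1 = l2 = l3 = 0.0  # arbitrary initial values
    n = len(training_set)
    for _ in range(epochs):
        g1 = g2 = g3 = 0.0
        for a, b, y in training_set:
            err = sigmoid(l1 * a + l2 * b + l3) - y
            g1 += err * a
            g2 += err * b
            g3 += err
        l1 -= lr * g1 / n
        l2 -= lr * g2 / n
        l3 -= lr * g3 / n
    return l1, l2, l3


def fused_output(params, a, b):
    l1, l2, l3 = params
    return sigmoid(l1 * a + l2 * b + l3)


# Made-up fusion training set of (a_j, a'_j, y_j) triples.
fusion_set = [(0.9, 0.1, 0), (0.8, 0.3, 0), (0.2, 0.7, 1), (0.1, 0.9, 1)]
params = train(fusion_set)
prefers_single = fused_output(params, 0.9, 0.1)
prefers_network = fused_output(params, 0.1, 0.9)
```

An output below 0.5 corresponds to y = 0 (use the single vehicle prediction S), and an output above 0.5 corresponds to y = 1 (use the road network prediction S').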