CN115470872B - Driver portrait construction method based on vehicle track data - Google Patents

Info

Publication number
CN115470872B
Authority
CN
China
Prior art keywords
track
driver
point
entropy
mean value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211417112.8A
Other languages
Chinese (zh)
Other versions
CN115470872A (en
Inventor
桂志鹏
刘宇航
吴华意
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202211417112.8A priority Critical patent/CN115470872B/en
Publication of CN115470872A publication Critical patent/CN115470872A/en
Application granted granted Critical
Publication of CN115470872B publication Critical patent/CN115470872B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9537Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a driver portrait construction method based on vehicle track data. The method defines track traits that describe individual movement patterns; designs 32 track features from the perspectives of time, space, geographic semantics and driving behavior as scale items for measuring the track traits; then specifies the correspondence between each track trait and the track features, together with scoring rules, to construct a track trait scale; and finally evaluates the validity of the scale's measurement results with statistical methods, to assess whether the scale design is reasonable or whether the scale is applicable to the current user group. The invention provides a technical framework for extracting track features from multiple perspectives, establishes, from the portrait perspective, a mapping and conversion mechanism between low-level track statistical features and high-level track trait portraits, mines drivers' travel tendencies and driving preferences by integrating fragmented track features, and provides a technical solution for high-level semantic modeling of travel activity characteristics.

Description

Driver portrait construction method based on vehicle track data
Technical Field
The invention relates to the field of track data mining, in particular to a driver portrait construction method based on vehicle track data.
Background
Mining individual movement patterns from vehicle trajectory data is of great significance but also faces many challenges, particularly in constructing trajectory portraits. As an important component of urban traffic, tens of thousands of vehicles spend a great deal of time on the road every day. Understanding drivers' travel tendencies and driving preferences helps relieve urban traffic congestion and pollution, design urban infrastructure, recommend personalized services, and promote safe driving.
Existing techniques for mining individual movement patterns focus either on extracting track features or on exploring the relationship between track features and a single demographic attribute. Track feature extraction defines features that reflect individual movement patterns from some perspective among time, space, semantics and driving behavior, such as travel distance, travel entropy, sharp speed change and overspeed; research on the relationship between track features and a single demographic attribute mainly addresses the correlation between track features and an individual's age or emotion.
The problem with the prior art is that the extracted track features are one-sided and fragmented and cannot provide a macroscopic, comprehensive description that reveals different individual movement patterns. Specifically: 1) track feature extraction from a single perspective is incomplete; for example, time features are not extracted at multiple time granularities, so the varied, multi-level travel rhythms of individuals cannot be described; 2) track features are not extracted from multiple perspectives at the same time, so the description of individual movement patterns is one-sided; 3) no label system that macroscopically describes individual travel tendencies and driving preferences, that is, no track portrait, has been established, so the low-level track features are disconnected from the track portraits that upper-layer applications care about, which restricts the application scenarios of track analysis.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a driver portrait construction method based on vehicle track data, which comprises the following steps:
S1, defining the description dimensions of the track portrait;
S2, designing related track features from multiple perspectives according to the definitions of the description dimensions, to form a candidate track feature set;
S3, constructing a scale to measure the description dimensions, which comprises defining the correspondence between the description dimensions and the track features and specifying the scoring rules of the track features; specifically, for each description dimension, track features that fit the definition of that dimension are selected from the candidate track feature set and used as the items measuring the dimension; meanwhile, whether each track feature is conceptually positively or negatively correlated with the description dimension is judged from their definitions, and the scoring rule of the track feature is specified accordingly; if no suitable track feature can be selected from the candidate set for a description dimension, that dimension is not included in the final track portrait;
S4, inputting track data of one or more users, extracting track features based on the scale of step S3, and fusing the track features under each description dimension to obtain the measurement result of that dimension, namely the track portrait.
Further, the specific implementation manner of step S1 is as follows;
defining the description dimensions of the track portrait as four description dimensions A, B, C and D, namely the track traits; wherein description dimension A describes the range of the driver's activity space, the frequency of travel, and the number of visits to shopping or entertainment venues; description dimension B measures the irregularity and unpredictability of the driver's travel locations; description dimension C captures the driver's tendency to drive impulsively; description dimension D describes the degree to which the driver complies with traffic regulations and is punctual.
Further, step S2 specifically includes:
designing track features from the four perspectives of time, space, geographic semantics and driving behavior based on vehicle track data;
step S21, designing time features to reveal travel rhythms;
step S22, designing spatial features to describe the spatial range and distribution of travel locations;
step S23, designing geographic semantic features to depict travel activity information;
step S24, designing driving behavior features to reflect driving ability and risk.
Further, the time characteristic designed in step S21 includes:
the designed time features are based on travel time entropy, which comprises a daily travel entropy and a combined week-and-day travel entropy, calculated as:
$$E = -\sum_{i} p(t_i)\,\log p(t_i)$$
where $t_i$ denotes the $i$-th travel time slot and $p(t_i)$ denotes the driver's travel frequency in $t_i$. When a day is segmented by the hour, each segment is one time slot and the formula gives the daily travel entropy; when a week is first divided into days and each day is further segmented by the hour, the formula gives the combined week-and-day travel entropy.
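The two time features described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, each trip is assumed to be represented by its departure `datetime`, and the natural logarithm is an assumption because the text does not state the log base.

```python
import math
from collections import Counter
from datetime import datetime

def shannon_entropy(labels):
    """Shannon entropy (natural log, an assumption) of the empirical label distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def daily_travel_entropy(departures):
    # One time slot per hour of the day (up to 24 slots).
    return shannon_entropy(d.hour for d in departures)

def week_day_travel_entropy(departures):
    # One time slot per (weekday, hour) pair (up to 7 * 24 slots).
    return shannon_entropy((d.weekday(), d.hour) for d in departures)

# Hypothetical departure times of one driver.
trips = [
    datetime(2022, 11, 7, 8, 5),
    datetime(2022, 11, 8, 8, 40),
    datetime(2022, 11, 9, 18, 10),
    datetime(2022, 11, 10, 18, 30),
]
daily = daily_travel_entropy(trips)        # two hours, used equally often
combined = week_day_travel_entropy(trips)  # four distinct (weekday, hour) slots
print(daily, combined)
```

A driver who always departs in the same slot gets an entropy of zero; the more evenly trips spread over slots, the higher the value.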
Further, the spatial characteristics designed in step S22 include:
the designed spatial features consist of three subclasses: radius of rotation (radius of gyration), spatial entropy, and others. The radius of rotation subclass contains 4 features: the dwell-time-weighted radius of rotation, the dwell-count-weighted radius of rotation, the k-radius of rotation ratio, and the minimum number of stop points determining the spatial range. The spatial entropy subclass contains 4 features: random entropy, place entropy, sequence entropy, and origin-destination entropy. The other subclass contains 3 features: monthly travel count, average travel distance, and non-interest travel ratio. The calculation formulas are as follows:
dwell-time-weighted radius of rotation / dwell-count-weighted radius of rotation:
$$r_g = \sqrt{\frac{1}{N}\sum_{i \in L} n_i\,\lVert r_i - r_{cm}\rVert^2}$$
where $L$ is the set of the driver's stop points; $r_i$ is a two-dimensional vector giving the longitude and latitude of stop point $i$; $n_i$ is the number of stays or the stay duration of the driver at stop point $i$; $N = \sum_{i \in L} n_i$ is the total number of stays or the total stay duration; and $r_{cm}$ is the center of all the driver's stop points, i.e. the mean of their coordinates;
k-radius of rotation ratio:
$$s_k = \frac{r_g^{(k)}}{r_g}$$
in which
$$r_g^{(k)} = \sqrt{\frac{1}{N_k}\sum_{i=1}^{k} n_i\,\lVert r_i - r_{cm}^{(k)}\rVert^2}$$
where $r_{cm}^{(k)}$ is the center of the $k$ stop points most frequently visited by the driver, i.e. the mean of their coordinates, and $N_k$ is the sum of the weights of these $k$ stop points, which can be the total number of stays or the total stay duration;
minimum number of stop points determining the spatial range:
[formula image omitted]
random entropy:
$$S^{rand} = \log N$$
where $N$ denotes the number of the driver's stop points;
place entropy:
$$S^{place} = -\sum_{i=1}^{N} p(l_i)\,\log p(l_i)$$
where $l_i$ denotes the $i$-th stop point and $p(l_i)$ denotes the frequency with which the driver visits $l_i$;
sequence entropy:
$$S^{seq} = -\sum_{T'} p(T')\,\log p(T')$$
where $T'$ is a time-ordered sequence of stop points visited by the driver and $p(T')$ is the frequency of occurrence of that sequence;
origin-destination entropy:
$$S^{od} = -\sum_{j=1}^{m} p(o_j, d_j)\,\log p(o_j, d_j)$$
where $m$ is the number of distinct "origin-destination" stop point pairs and $p(o_j, d_j)$ is the frequency of trips from origin $o_j$ to destination $d_j$;
monthly travel count:
$$\frac{n}{month}$$
where $n$ is the driver's total number of trips and $month$ is the total number of months travelled;
average travel distance:
$$\frac{1}{n}\sum_{i=1}^{n} d_i$$
where $d_i$ is the straight-line distance between the origin and the destination of the $i$-th trip and $n$ is the total number of trips;
non-interest travel ratio:
$$\frac{n_{uni}}{n}$$
where $n_{uni}$ is the number of trips whose destination is an infrequently visited stop point.
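A few of the spatial features above can be sketched in Python as follows. This is a sketch under stated assumptions rather than the patented implementation: stop points are assumed to be already projected to planar coordinates, the center is taken as the weight-averaged coordinate (the text only says "coordinate mean"), the natural logarithm is used, and all function names are hypothetical.

```python
import math

def center(points, weights):
    """Weighted center of the stop points (weighting is an assumption)."""
    total = sum(weights)
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    return cx, cy

def radius_of_rotation(points, weights):
    """Weights are stay counts or stay durations, one per stop point."""
    cx, cy = center(points, weights)
    squared = sum(w * ((x - cx) ** 2 + (y - cy) ** 2)
                  for (x, y), w in zip(points, weights))
    return math.sqrt(squared / sum(weights))

def k_radius_ratio(points, weights, k):
    """Radius over the k most heavily weighted stop points divided by the full radius."""
    top = sorted(zip(points, weights), key=lambda pw: pw[1], reverse=True)[:k]
    top_points = [p for p, _ in top]
    top_weights = [w for _, w in top]
    return radius_of_rotation(top_points, top_weights) / radius_of_rotation(points, weights)

def random_entropy(n_stop_points):
    return math.log(n_stop_points)

def place_entropy(visit_counts):
    total = sum(visit_counts)
    return -sum((c / total) * math.log(c / total) for c in visit_counts if c > 0)

# Hypothetical driver: two frequent stop points close together, one rare and far away.
stops = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
visits = [5, 5, 1]
ratio = k_radius_ratio(stops, visits, k=2)
print(radius_of_rotation(stops, visits), ratio, place_entropy(visits))
```

A ratio close to 1 means the k most visited stop points already span the driver's whole activity space; a small ratio means rarely visited places extend it considerably. The ratio is undefined when all stop points coincide, since the full radius is then zero.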
Further, the geographic semantic features designed in step S23 include:
the designed geographic semantic features consist of two subclasses: home-related features and point-of-interest (POI) related features. The home-related subclass contains 2 features: stay-at-home time ratio and distance-from-home entropy. The POI-related subclass contains 3 features: average importance of shopping POIs, average importance of entertainment POIs, and average importance of catering POIs;
stay-at-home time ratio:
$$\frac{T_{home}}{T_{all}}$$
where $T_{home}$ is the total duration the driver stays at home and $T_{all}$ is the total duration the driver stays at all stop points; "home" may be defined as the stop point at which the driver stays longest during the night;
distance-from-home entropy:
$$-\sum_{i} p(d_i)\,\log p(d_i)$$
where $d_i$ is the distance between the $i$-th stop point and home, and $p(d_i)$ is the frequency with which that distance occurs;
average importance of shopping/entertainment/catering POIs: the average importance of a POI category is the mean of that category's importance over all of the driver's stop points. The POI importance at each stop point comes from the semantic vector of the area containing the stop point, where an area is one cell of a semantic map. The semantic map is a spatial grid in which every cell carries a semantic vector that reflects the importance of each POI category and is obtained with a weighted TF-IDF algorithm. The semantic vector of the $i$-th cell is denoted
$$V_i = (v_{i1}, v_{i2}, \ldots, v_{io})$$
where $o$ is the number of POI categories, and
$$v_{ij} = w_j \cdot \frac{n_j}{N} \cdot \log\frac{C}{c_j}$$
where $n_j$ is the number of POIs of category $j$ in the cell, $N$ is the number of all POIs in the cell, $C$ and $c_j$ are respectively the total number of cells in the semantic map and the number of cells containing POIs of category $j$, and $w_j$ is the number of POIs of category $j$ in the 3 × 3 neighborhood of the cell.
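The POI importance computation can be illustrated with the sketch below. The way the neighborhood weight, term frequency and inverse document frequency are combined (a plain product) is an assumption, since the original formula is only available as an image; the function and parameter names are hypothetical.

```python
import math

def semantic_vector(cell_counts, neighborhood_counts, n_cells, cells_with_category):
    """Weighted TF-IDF vector of one grid cell (product form is an assumption).

    cell_counts[j]         -- POIs of category j inside the cell
    neighborhood_counts[j] -- POIs of category j in the cell's 3 x 3 neighborhood
    n_cells                -- total number of cells in the semantic map
    cells_with_category[j] -- number of cells containing category j
    """
    total = sum(cell_counts)
    vector = []
    for n_j, w_j, c_j in zip(cell_counts, neighborhood_counts, cells_with_category):
        if total == 0 or c_j == 0:
            vector.append(0.0)  # empty cell or category absent from the map
            continue
        tf = n_j / total
        idf = math.log(n_cells / c_j)
        vector.append(w_j * tf * idf)
    return vector

def average_poi_importance(stop_point_vectors, category_index):
    """Mean importance of one POI category over a driver's stop points."""
    values = [v[category_index] for v in stop_point_vectors]
    return sum(values) / len(values)

# Hypothetical cell: 2 shopping POIs, 0 catering POIs, in a map of 100 cells.
vec = semantic_vector([2, 0], [4, 1], 100, [10, 50])
print(vec)
```

Each stop point is looked up in the grid to obtain its cell's vector, and the per-category means over all stop points give the three POI importance features.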
Further, the driving behavior characteristics designed in step S24 include:
the designed driving behavior features consist of three subclasses: general behavior, abnormal behavior, and residential-area intersection behavior. The general behavior subclass contains 6 features: the speed mean, the standard deviation of speed means, the mean of speed standard deviations, the maximum of speed standard deviations, the mean of maximum speeds, and the maximum of acceleration standard deviations. The abnormal behavior subclass contains 6 features: the mean of the sharp speed change point ratio, the mean of the sharp turn point ratio, the standard deviation of the sharp speed change point ratio, the standard deviation of the sharp turn point ratio, the mean of the overspeed point ratio, and the mean of the overspeed point count. The residential-area intersection behavior subclass contains 2 features: the mean of the intersection overspeed point ratio and the intersection speed mean;
speed mean / standard deviation of speed means / mean of speed standard deviations / maximum of speed standard deviations / mean of maximum speeds: the speeds of all track points in one track of a driver can be expressed as a vector
$$v = (v_1, v_2, \ldots, v_n)$$
from which the speed mean $\bar{v}$, the speed standard deviation $\sigma_v$ and the maximum speed $v_{max}$ of that track can be calculated. The speed means of all $M$ tracks of the driver can be expressed as a vector $(\bar{v}^{(1)}, \ldots, \bar{v}^{(M)})$; the mean of this vector is the speed mean, and its standard deviation is the standard deviation of speed means. The speed standard deviations of all tracks can be expressed as a vector $(\sigma_v^{(1)}, \ldots, \sigma_v^{(M)})$; the mean of this vector is the mean of speed standard deviations, and its maximum is the maximum of speed standard deviations. The maximum speeds of all tracks can be expressed as a vector $(v_{max}^{(1)}, \ldots, v_{max}^{(M)})$; the mean of this vector is the mean of maximum speeds;
maximum of acceleration standard deviations: this feature is calculated in the same way as the maximum of speed standard deviations, using acceleration in place of speed;
mean of sharp speed change point ratio / standard deviation of sharp speed change point ratio: the sharp speed change point ratio of one track of a driver is
$$R_a = \frac{\left|\{\,p_i : |a_i| > AT\,\}\right|}{n}$$
where $a_i$ is the acceleration at track point $p_i$, $n$ is the number of track points of the track, the numerator is the number of track points whose absolute acceleration exceeds the threshold, and $AT$ is the threshold for judging a sharp speed change. The mean of the sharp speed change point ratios of all tracks is the mean of the sharp speed change point ratio, and their standard deviation is the standard deviation of the sharp speed change point ratio;
mean of sharp turn point ratio / standard deviation of sharp turn point ratio: the sharp turn point ratio of one track of a driver is
$$R_\theta = \frac{\left|\{\,p_i : \theta_i > TT\,\}\right|}{n}$$
where $\theta_i$ is the turning angle at track point $p_i$ and $TT$ is the threshold for judging a sharp turn. The mean of the sharp turn point ratios of all tracks is the mean of the sharp turn point ratio, and their standard deviation is the standard deviation of the sharp turn point ratio;
mean of overspeed point ratio / mean of overspeed point count: the overspeed point ratio of one track of a driver is
$$R_s = \frac{n_s}{n}, \qquad n_s = \left|\{\,p_i : v_i > ST\,\}\right|$$
where $v_i$ is the speed at track point $p_i$, $ST$ is the threshold for judging overspeed, and $n_s$ is the number of overspeed points. The mean of the overspeed point ratios of all tracks is the mean of the overspeed point ratio, and the mean of the overspeed point counts of all tracks is the mean of the overspeed point count;
mean of residential-area intersection overspeed point ratio / residential-area intersection speed mean: the track points of each track that fall inside the intersection buffer area are selected by spatial position, and the two features are then obtained with the calculation methods of the mean of the overspeed point ratio and the speed mean.
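The per-track statistics and their aggregation over all tracks of a driver can be sketched as follows. The data layout (a list of dicts with per-point `speed` and `accel` lists), the function names, and the use of the population standard deviation are assumptions made for illustration; the sharp turn features would follow the same pattern with turning angles and the threshold TT.

```python
import statistics

def exceed_ratio(values, threshold, absolute=False):
    """Fraction of track points whose (absolute) value exceeds the threshold."""
    if absolute:
        values = [abs(v) for v in values]
    return sum(1 for v in values if v > threshold) / len(values)

def driving_features(tracks, accel_threshold, speed_threshold):
    """Aggregate per-track statistics into driver-level features."""
    means = [statistics.mean(t["speed"]) for t in tracks]
    stds = [statistics.pstdev(t["speed"]) for t in tracks]  # population std: an assumption
    maxes = [max(t["speed"]) for t in tracks]
    shift = [exceed_ratio(t["accel"], accel_threshold, absolute=True) for t in tracks]
    over = [exceed_ratio(t["speed"], speed_threshold) for t in tracks]
    over_counts = [sum(1 for v in t["speed"] if v > speed_threshold) for t in tracks]
    return {
        "speed_mean": statistics.mean(means),
        "std_speed_mean": statistics.pstdev(means),
        "mean_speed_std": statistics.mean(stds),
        "max_speed_std": max(stds),
        "mean_max_speed": statistics.mean(maxes),
        "mean_sharp_shift_ratio": statistics.mean(shift),
        "std_sharp_shift_ratio": statistics.pstdev(shift),
        "mean_overspeed_ratio": statistics.mean(over),
        "mean_overspeed_count": statistics.mean(over_counts),
    }

# Two hypothetical tracks of one driver; thresholds AT = 2 and ST = 25 are illustrative only.
tracks = [
    {"speed": [10, 20, 30], "accel": [0.5, -3.0, 1.0]},
    {"speed": [20, 20, 20], "accel": [0.0, 0.0, 0.0]},
]
features = driving_features(tracks, accel_threshold=2, speed_threshold=25)
print(features)
```

The intersection features can reuse the same function after filtering each track to the points that lie inside an intersection buffer.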
Further, step S3 specifically includes:
step S31: defining the correspondence between the track traits and the track features. According to the definition of description dimension A, track features measuring the individual's travel range and travel activity information are selected as the items measuring this trait; according to the definition of description dimension B, travel spatial entropy features are selected as the items measuring this trait; according to the definition of description dimension C, track features representing the driver's driving stability are selected as the items measuring this trait; according to the definition of description dimension D, track features related to driving violations and the travel time entropy are selected as the items measuring this trait;
the correspondence between the description dimensions and the track features defined in step S31 is:
selecting the monthly travel count, the stay-at-home time ratio, the average travel distance, the dwell-time-weighted radius of rotation, the average importance of shopping POIs, the average importance of entertainment POIs and the average importance of catering POIs as the items measuring description dimension A; selecting the non-interest travel ratio, the k-radius of rotation ratio, the minimum number of stop points determining the spatial range, the random entropy, the place entropy, the sequence entropy, the origin-destination entropy and the distance-from-home entropy as the items measuring description dimension B; selecting the mean of speed standard deviations, the standard deviation of speed means, the maximum of speed standard deviations, the maximum of acceleration standard deviations, the mean of the sharp speed change point ratio, the mean of the sharp turn point ratio, the standard deviation of the sharp speed change point ratio and the standard deviation of the sharp turn point ratio as the items measuring description dimension C; selecting the speed mean, the mean of maximum speeds, the mean of the overspeed point ratio, the mean of the overspeed point count, the mean of the residential-area intersection overspeed point ratio, the residential-area intersection speed mean, the daily travel entropy and the combined week-and-day travel entropy as the items measuring description dimension D;
step S32: determining the scoring rule of each track feature. Scoring is either forward or reverse: forward scoring means the track feature is conceptually positively related to the track trait, and reverse scoring means it is conceptually negatively related. For example, the more trips a driver makes, the higher the description dimension A score, so the monthly travel count is forward-scored; the higher the stay-at-home time ratio, the lower the description dimension A score, so the stay-at-home time ratio is reverse-scored;
the track feature scoring rules determined in step S32 are:
the stay-at-home time ratio, the k-radius of rotation ratio, the speed mean, the mean of maximum speeds, the mean of the overspeed point ratio, the mean of the overspeed point count, the mean of the residential-area intersection overspeed point ratio, the residential-area intersection speed mean, the daily travel entropy and the combined week-and-day travel entropy are reverse-scored items; all other features are forward-scored items.
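The scale defined in steps S31 and S32 is essentially a lookup table, which might be represented as plain data. The identifiers below are hypothetical names chosen for this sketch; only the grouping into dimensions and the forward/reverse split come from the text.

```python
# Hypothetical identifiers for the track features assigned to each description dimension.
SCALE = {
    "A": [
        "monthly_travel_count", "stay_at_home_time_ratio", "average_travel_distance",
        "dwell_time_weighted_radius", "shopping_poi_importance",
        "entertainment_poi_importance", "catering_poi_importance",
    ],
    "B": [
        "non_interest_travel_ratio", "k_radius_ratio", "min_stop_points_for_range",
        "random_entropy", "place_entropy", "sequence_entropy",
        "origin_destination_entropy", "distance_from_home_entropy",
    ],
    "C": [
        "mean_speed_std", "std_speed_mean", "max_speed_std", "max_accel_std",
        "mean_sharp_shift_ratio", "mean_sharp_turn_ratio",
        "std_sharp_shift_ratio", "std_sharp_turn_ratio",
    ],
    "D": [
        "speed_mean", "mean_max_speed", "mean_overspeed_ratio", "mean_overspeed_count",
        "mean_intersection_overspeed_ratio", "intersection_speed_mean",
        "daily_travel_entropy", "week_day_travel_entropy",
    ],
}

# Items listed as reverse-scored in step S32; everything else is forward-scored.
REVERSE_SCORED = {
    "stay_at_home_time_ratio", "k_radius_ratio", "speed_mean", "mean_max_speed",
    "mean_overspeed_ratio", "mean_overspeed_count",
    "mean_intersection_overspeed_ratio", "intersection_speed_mean",
    "daily_travel_entropy", "week_day_travel_entropy",
}

def scoring_rule(feature: str) -> str:
    return "reverse" if feature in REVERSE_SCORED else "forward"

def dimension_of(feature: str) -> str:
    for dimension, items in SCALE.items():
        if feature in items:
            return dimension
    raise KeyError(feature)

print(dimension_of("k_radius_ratio"), scoring_rule("k_radius_ratio"))
```

Keeping the mapping as data makes it straightforward to drop a dimension or swap an item when the validity checks of a later step suggest the scale does not suit a given user group.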
Further, in step S4, vehicle track data of one or more drivers is input to obtain the track portrait of each driver; for the vehicle track data of one driver, the processing steps are as follows:
step S41: inputting and preprocessing the driver's vehicle track data, in which each track point contains at least three items of information: longitude, latitude and timestamp; before the time, space and geographic semantic features are extracted, all stop points of all tracks of the driver are obtained with a clustering algorithm, and before the driving behavior features are extracted, the speed, acceleration and steering angle of each track point are calculated;
step S42: extracting the driver's track features according to the calculation methods designed in step S2;
step S43: normalizing each track feature, using a different formula depending on whether step S32 specifies forward or reverse scoring for it;
the normalization method of step S43 is:
forward scoring:
[formula image omitted]
where $k$ and $x_0$ are the normalization parameters; they can be set from the interquartile range (IQR) and the median of the feature values [parameter formula images omitted], or from the standard deviation $S$ and the mean of the feature values [parameter formula images omitted];
reverse scoring:
[formula image omitted]
where the symbols have the same meaning as in forward scoring;
step S44: according to the correspondence between track traits and track features defined in step S31, calculating, for each track trait, the sum or the average of its normalized track features as the measurement result of that trait.
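Steps S43 and S44 can be sketched as below. The normalization formulas are not recoverable from this text, so the logistic form used here is purely an assumption suggested by the parameter names k and x_0; setting k to 1/IQR and x_0 to the median is likewise an illustrative choice, and all function names are hypothetical.

```python
import math
import statistics

def fit_params(population_values):
    """Illustrative choice: x0 = median, k = 1 / IQR over all drivers' raw values."""
    q1, _, q3 = statistics.quantiles(population_values, n=4)
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("feature has zero interquartile range")
    return 1.0 / iqr, statistics.median(population_values)

def forward_score(x, k, x0):
    # Assumed logistic form: larger raw value gives a score closer to 1.
    z = max(-60.0, min(60.0, k * (x - x0)))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def reverse_score(x, k, x0):
    # Mirror image of the forward score: larger raw value gives a score closer to 0.
    return 1.0 - forward_score(x, k, x0)

def trait_score(feature_values, params, reverse_scored):
    """Average of the normalized features that belong to one trait (step S44)."""
    scores = []
    for name, x in feature_values.items():
        k, x0 = params[name]
        rule = reverse_score if name in reverse_scored else forward_score
        scores.append(rule(x, k, x0))
    return sum(scores) / len(scores)

# Hypothetical raw values of two features of one trait for a single driver.
params = {"monthly_travel_count": (0.1, 40.0), "stay_at_home_time_ratio": (5.0, 0.6)}
driver = {"monthly_travel_count": 55.0, "stay_at_home_time_ratio": 0.4}
score = trait_score(driver, params, reverse_scored={"stay_at_home_time_ratio"})
print(score)
```

With this form every normalized feature lies between 0 and 1, so averaging keeps the trait score on the same range regardless of how many items the trait has.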
Further, the method also comprises a step S5 of evaluating the validity of the scale's measurement results, to assess whether the scale design is reasonable or whether the scale is applicable to the current user group;
specifically, the validity of the scale's measurement results is evaluated through item discrimination, internal consistency reliability and data split-half reliability;
the evaluation methods for item discrimination, internal consistency reliability and data split-half reliability in step S5 are as follows:
item discrimination: for the items of each trait dimension, the drivers are divided into three groups (low, middle and high) by the 27% and 73% percentiles of all drivers' trait measurements on that dimension; a t-test is then used to compare the score difference between the high group and the low group on each item. If the difference is significant, the item has good discrimination; otherwise its discrimination is poor;
internal consistency reliability: for each trait dimension, Cronbach's $\alpha$ coefficient is used to estimate the internal consistency reliability of the dimension:
$$\alpha = \frac{K}{K-1}\left(1 - \frac{\sum_{i=1}^{K}\sigma_i^2}{\sigma_X^2}\right)$$
where $K$ is the number of items in the scale, $\sigma_X^2$ is the variance of all drivers' trait measurements, and $\sigma_i^2$ is the variance of all drivers' scores on the $i$-th item;
data split-half reliability: for each trait dimension, each driver's tracks are divided equally into two halves, track by track; the trait scores of the two halves are calculated separately, and the data split-half reliability is estimated by comparing the normalized mean absolute difference between the two sets of trait scores over all drivers. The normalized mean absolute difference is
[formula image omitted]
which is computed from the number of drivers $N$, the two trait scores of each driver $i$, and the standard deviations of the two sets of trait scores over all drivers.
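The first two checks of step S5 can be sketched as follows. This illustration computes Cronbach's alpha from its standard definition, splits drivers at the 27% and 73% percentiles, and returns a Welch t statistic; the choice of the Welch variant, the use of sample variances, and the function names are assumptions, and obtaining a p-value would additionally require a t-distribution (for example from a statistics library).

```python
import math
import statistics

def cronbach_alpha(item_scores):
    """item_scores: K lists, each holding one item's scores for all drivers."""
    k = len(item_scores)
    totals = [sum(per_driver) for per_driver in zip(*item_scores)]
    item_variance_sum = sum(statistics.variance(scores) for scores in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / statistics.variance(totals))

def split_high_low(trait_scores):
    """Indices of drivers in the low (<= 27th pct) and high (>= 73rd pct) groups."""
    cuts = statistics.quantiles(trait_scores, n=100)
    low_cut, high_cut = cuts[26], cuts[72]
    low = [i for i, s in enumerate(trait_scores) if s <= low_cut]
    high = [i for i, s in enumerate(trait_scores) if s >= high_cut]
    return low, high

def welch_t(a, b):
    """Welch t statistic for the difference in means of two groups."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))

# Hypothetical data: two perfectly consistent items scored by three drivers.
alpha = cronbach_alpha([[1, 2, 3], [1, 2, 3]])
low, high = split_high_low(list(range(1, 101)))
print(alpha, len(low), len(high))
```

For item discrimination, the item scores of the drivers in `high` and `low` would be passed to `welch_t`; a large absolute statistic indicates that the item separates the two groups well.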
One or more technical solutions in the embodiments of the present application have at least one or more of the following technical effects: the invention discloses a driver portrait construction method based on vehicle track data. The method defines 4 dimensions (A, B, C and D) as the description dimensions of the track portrait (namely the track traits) and measures each description dimension by developing a track trait scale. Specifically, the invention defines 4 track traits describing individual movement patterns; designs 32 track features from the 4 perspectives of time, space, geographic semantics and driving behavior as scale items for measuring the track traits; then specifies the correspondence between each track trait and the track features, together with scoring rules, to construct the track trait scale; and then uses statistical methods such as the t-test and Cronbach's $\alpha$ to evaluate the validity of the scale's measurement results, to assess whether the scale design is reasonable or whether the scale is applicable to the current user group. The invention provides a technical framework for extracting track features from multiple perspectives; meanwhile, from the portrait perspective, it establishes a mapping and conversion mechanism between low-level track statistical features and high-level track trait portraits, mines drivers' travel tendencies and driving preferences by integrating fragmented track features, and provides a technical solution for high-level semantic modeling of travel activity characteristics.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a simplified flow diagram of the process of the present invention;
FIG. 2 is a schematic flow chart showing the details of the method of the present invention;
FIG. 3 is a schematic flow chart of an embodiment of the present invention;
FIG. 4 is an example of data and results for an embodiment of the present invention;
FIG. 5 is the track trait scale according to an embodiment of the present invention;
FIG. 6 is a result of the item discrimination experiment according to an embodiment of the present invention;
FIG. 7 is a result of the internal consistency reliability experiment according to an embodiment of the present invention;
FIG. 8 is a result of the data split-half reliability experiment according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
This embodiment provides a driver portrait construction method based on vehicle track data. Referring to fig. 1 and fig. 2, the method includes:
Step S1: define the description dimensions of the track portrait.
In this embodiment, step S1 specifically includes:
defining the description dimensions (namely the track traits) of the track portrait as A, B, C and D, wherein description dimension A describes the range of the driver's activity space, the frequency of travel, and the number of visits to shopping or entertainment venues; description dimension B measures the irregularity and unpredictability of the driver's travel places; description dimension C captures the driver's tendency to drive impulsively; and description dimension D describes the degree to which the driver complies with traffic regulations and is punctual.
Step S2: according to the definitions of the description dimensions, extract track features related to the description dimensions from multiple angles to form a candidate track feature set.
In this embodiment, step S2 specifically includes:
designing track features from the four angles of time, space, geographic semantics and driving behavior based on the vehicle track data.
Step S21: design time features to reveal the travel rhythm, specifically comprising:
the extracted time features are based on travel time entropy, which comprises the within-day travel entropy and the combined week-and-day travel entropy. The general calculation formula is:
$$E = -\sum_{i} p(t_i)\,\log p(t_i)$$
wherein $t_i$ denotes the $i$-th travel time slot and $p(t_i)$ denotes the driver's travel frequency in $t_i$. When a day is divided into hourly time slots, the within-day travel entropy can be calculated according to the formula; when a week is divided into days and each day is further divided into hourly time slots, the combined week-and-day travel entropy can be calculated according to the formula.
Specifically, this embodiment divides the 24 hours of a day into equal hourly time slots and calculates the within-day travel entropy according to the above formula. It also divides a week into 7 days from Monday to Sunday and further divides each day into 24 equal hourly time slots, giving 24 × 7 time slots in total, and calculates the combined week-and-day travel entropy according to the same formula.
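The two time features above can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: the function names, the use of departure timestamps as the trip events, and the base-2 logarithm are all assumptions, since the text does not fix them.

```python
import math
from collections import Counter
from datetime import datetime


def shannon_entropy(labels, base=2.0):
    """Shannon entropy of the empirical distribution of `labels`.

    The logarithm base is an assumption; the source text does not state it.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log(p, base)
    return entropy


def daily_travel_entropy(departures):
    """Entropy over the 24 hourly slots of a day."""
    return shannon_entropy(d.hour for d in departures)


def weekly_daily_travel_entropy(departures):
    """Entropy over the 24 x 7 (weekday, hour) slots of a week."""
    return shannon_entropy((d.weekday(), d.hour) for d in departures)


# Hypothetical departure times: two days, two departure hours each.
departures = [
    datetime(2019, 7, 1, 8), datetime(2019, 7, 1, 18),
    datetime(2019, 7, 2, 8), datetime(2019, 7, 2, 18),
]
print(daily_travel_entropy(departures))         # two equally used hourly slots
print(weekly_daily_travel_entropy(departures))  # four equally used weekly slots
```

A driver who always leaves in the same hourly slot gets an entropy of 0, while one whose departures are spread evenly over many slots gets a high value.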
Step S22: design spatial features to describe the spatial range and distribution of travel places, specifically comprising:
the designed spatial features consist of three subclasses: radius of gyration, spatial entropy, and other. The radius of gyration subclass contains 4 features: the stay-time-weighted radius of gyration, the stay-count-weighted radius of gyration, the k-radius of gyration ratio, and the minimum number of stay points that determine the spatial range; the spatial entropy subclass contains 4 features: random entropy, location entropy, sequence entropy, and origin-destination entropy; the other subclass contains 3 features: monthly travel times, average travel distance, and non-interest travel rate. The calculation formulas are as follows:
stay-time-weighted radius of gyration / stay-count-weighted radius of gyration:
$$r_g = \sqrt{\frac{1}{N}\sum_{i \in L} n_i\,(r_i - r_{cm})^2}$$
wherein $L$ is the set of stay points of the driver; $r_i$ is a two-dimensional vector representing the longitude and latitude of stay point $i$; $n_i$ is the number of stays or the stay time of the driver at stay point $i$; $N = \sum_{i \in L} n_i$ is the total number of stays or the total stay time; and $r_{cm}$ is the center of all stay points of the driver, namely the mean of their coordinates;
k-radius of gyration ratio:
$$s_k = \frac{r_g^{(k)}}{r_g}$$
wherein
$$r_g^{(k)} = \sqrt{\frac{1}{N_k}\sum_{i=1}^{k} n_i\,\bigl(r_i - r_{cm}^{(k)}\bigr)^2}$$
wherein $r_{cm}^{(k)}$ is the center of the $k$ stay points most frequently visited by the driver, namely the mean of their coordinates; $N_k$ is the sum of the weights of the $k$ stay points, which may be the total number of stays or the total stay time;
minimum number of stay points that determine the spatial range:
[formula image not reproduced in this text]
random entropy:
$$E_{rand} = \log N$$
wherein $N$ denotes the number of stay points of the driver;
location entropy:
$$E_{loc} = -\sum_{i=1}^{N} p(l_i)\,\log p(l_i)$$
wherein $l_i$ denotes the $i$-th stay point and $p(l_i)$ denotes the frequency with which the driver visits $l_i$;
sequence entropy:
$$E_{seq} = -\sum_{s} p(s)\,\log p(s)$$
wherein $s$ is a sequence of stay points visited by the driver in chronological order and $p(s)$ is the frequency of occurrence of that sequence;
origin-destination entropy:
$$E_{od} = -\sum_{i=1}^{m} p(o_i, d_i)\,\log p(o_i, d_i)$$
wherein $m$ denotes the number of unique origin-destination stay point pairs, and $p(o_i, d_i)$ denotes the frequency of trips from origin $o_i$ to destination $d_i$;
monthly travel times:
$$\frac{n}{month}$$
wherein $n$ is the total number of trips of the driver and $month$ is the total number of months in which the driver travels;
average travel distance:
$$\frac{1}{n}\sum_{i=1}^{n} d_i$$
wherein $d_i$ is the straight-line distance between the origin and the destination of the $i$-th trip and $n$ is the total number of trips;
non-interest travel rate:
$$\frac{n_{uni}}{n}$$
wherein $n_{uni}$ is the number of trips whose destination is an infrequent stay point.
Specifically, in this embodiment, k = 4 is used when calculating the k-radius of gyration ratio, and the weighting method is stay-count weighting. When calculating the average travel distance, the distance between the origin and the destination of a trip may be a Euclidean distance, a geographic distance, a path distance, or the like; this embodiment uses the great-circle distance, which is a geographic distance. This embodiment defines the infrequent stay points used in the non-interest travel rate as the noise points in the DBSCAN clustering result.
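As an illustration of the radius of gyration features, the sketch below computes the weighted radius of gyration and the k-radius of gyration ratio. It is a simplified sketch under stated assumptions: coordinates are treated as planar (for example, already projected to metres), the center is taken as the unweighted coordinate mean as the text describes it, and the function names are hypothetical.

```python
import math


def radius_of_gyration(points, weights):
    """Weighted radius of gyration of stay points.

    points  : list of (x, y) coordinates, assumed planar (projected metres)
    weights : stay counts or stay times, one per stay point
    The center is the plain coordinate mean, following the text's wording.
    """
    total = sum(weights)
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    squared = sum(
        w * ((x - cx) ** 2 + (y - cy) ** 2)
        for (x, y), w in zip(points, weights)
    )
    return math.sqrt(squared / total)


def k_radius_ratio(points, weights, k=4):
    """Ratio of the radius of gyration of the k heaviest stay points
    to the radius of gyration of all stay points."""
    order = sorted(range(len(points)), key=lambda i: weights[i], reverse=True)[:k]
    top_points = [points[i] for i in order]
    top_weights = [weights[i] for i in order]
    full = radius_of_gyration(points, weights)
    return radius_of_gyration(top_points, top_weights) / full if full > 0 else 0.0


# Hypothetical driver: two frequent nearby places and one rare distant place.
points = [(0.0, 0.0), (2.0, 0.0), (100.0, 0.0)]
stay_counts = [5, 5, 1]
print(radius_of_gyration(points, stay_counts))
print(k_radius_ratio(points, stay_counts, k=2))
```

A ratio close to 1 suggests that the k most visited places already span the driver's whole activity range, while a small ratio suggests that rarely visited places dominate it.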
Step S23: design geographic semantic features to depict travel activity information, specifically comprising:
the designed geographic semantic features consist of two subclasses: home-related features and point of interest (POI) related features. The home-related subclass contains 2 features: at-home stay time ratio and distance-from-home entropy; the POI-related subclass contains 3 features: shopping POI average importance, entertainment POI average importance, and catering POI average importance.
At-home stay time ratio:
$$\frac{t_{home}}{t_{all}}$$
wherein $t_{home}$ is the total length of time the driver stays at home and $t_{all}$ is the total length of time the driver stays at all stay points; "home" may be defined as the stay point at which the driver stays longest during the night;
distance-from-home entropy:
$$-\sum_{i} p(d_i)\,\log p(d_i)$$
wherein $d_i$ is the distance between the $i$-th stay point and home, and $p(d_i)$ is the frequency of occurrence of that distance;
shopping/entertainment/catering POI average importance: the average importance of a POI class is the mean of the importance of that class over all stay points of the driver. The POI importance at each stay point comes from the semantic vector of the area in which the stay point is located, the area being a cell of the semantic map. The semantic map is a spatial grid in which each cell carries a semantic vector that reflects the importance of the various POI classes and is obtained by a weighted TF-IDF algorithm. The semantic vector of the $i$-th cell has $o$ components, where $o$ is the number of POI classes, and its $j$-th component is given by [formula image not reproduced in this text], wherein $n_j$ is the number of class-$j$ POIs in the cell, $N$ is the number of all POIs in the cell, $C$ and $c_j$ are respectively the total number of cells in the semantic map and the number of cells containing class-$j$ POIs, and $w_j$ is the number of class-$j$ POIs in the 3 × 3 neighborhood of the cell.
Specifically, when calculating the at-home stay time ratio and the distance-from-home entropy, this embodiment sets the night period used to determine the location of "home" to 1 a.m. to 6 a.m.
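The home-related definition above can be sketched as follows. This is only an illustrative sketch: the stay record layout `(stay_point_id, start, end)`, the function names, and the way a stay interval is intersected with the nightly 1 a.m. to 6 a.m. window are assumptions, not details given in the text.

```python
from collections import defaultdict
from datetime import datetime, time, timedelta

NIGHT_START, NIGHT_END = time(1, 0), time(6, 0)  # night window of this embodiment


def night_overlap_seconds(start, end):
    """Seconds of the interval [start, end] that fall inside 01:00-06:00."""
    total = 0.0
    day = start.date()
    while day <= end.date():
        window_start = datetime.combine(day, NIGHT_START)
        window_end = datetime.combine(day, NIGHT_END)
        overlap = (min(end, window_end) - max(start, window_start)).total_seconds()
        total += max(0.0, overlap)
        day += timedelta(days=1)
    return total


def home_stay_ratio(stays):
    """Return (home_id, at-home stay time ratio).

    stays: non-empty list of (stay_point_id, start, end) tuples.
    "Home" is the stay point with the longest accumulated night-time stay.
    """
    night = defaultdict(float)
    total = defaultdict(float)
    for point_id, start, end in stays:
        night[point_id] += night_overlap_seconds(start, end)
        total[point_id] += (end - start).total_seconds()
    home = max(night, key=night.get)
    return home, total[home] / sum(total.values())


# Hypothetical stays: an overnight stay at A and a daytime stay at B.
stays = [
    ("A", datetime(2019, 7, 1, 20, 0), datetime(2019, 7, 2, 8, 0)),
    ("B", datetime(2019, 7, 2, 9, 0), datetime(2019, 7, 2, 13, 0)),
]
home, ratio = home_stay_ratio(stays)
print(home, ratio)
```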
Step S24: design driving behavior features to reflect driving ability and risk, specifically comprising:
the designed driving behavior features consist of three subclasses: general behavior, abnormal behavior, and residential-area intersection behavior. The general behavior subclass contains 6 features: the mean of the speed standard deviation, the maximum of the speed standard deviation, the standard deviation of the speed mean, the speed mean, the mean of the maximum speed, and the maximum of the acceleration standard deviation; the abnormal behavior subclass contains 6 features: the mean of the sharp speed-change point ratio, the mean of the sharp turn point ratio, the standard deviation of the sharp speed-change point ratio, the standard deviation of the sharp turn point ratio, the mean of the overspeed point ratio, and the mean of the number of overspeed points; the residential-area intersection behavior subclass contains 2 features: the mean of the intersection overspeed point ratio and the intersection speed mean.
Mean of speed standard deviation / maximum of speed standard deviation / standard deviation of speed mean / speed mean / mean of maximum speed: the speeds of all track points in one track of the driver can be expressed as a vector, from which the mean speed, the speed standard deviation and the maximum speed of that track can be calculated. The mean speeds of all tracks can be expressed as a vector; the mean of this vector is the speed mean, and its standard deviation is the standard deviation of the speed mean. The speed standard deviations of all tracks can be expressed as a vector; the mean of this vector is the mean of the speed standard deviation, and its maximum is the maximum of the speed standard deviation. The maximum speeds of all tracks can be expressed as a vector; the mean of this vector is the mean of the maximum speed;
maximum of acceleration standard deviation: this feature is calculated in the same way as the maximum of the speed standard deviation;
mean of sharp speed-change point ratio / standard deviation of sharp speed-change point ratio: the sharp speed-change point ratio of one track of the driver is
$$\frac{\bigl|\{p_i : |a_i| > AT\}\bigr|}{n}$$
wherein $a_i$ is the acceleration at track point $p_i$, $n$ is the number of track points of the track, the numerator is the number of track points whose absolute acceleration exceeds the threshold, and $AT$ is the threshold for judging a sharp speed change; the mean of the sharp speed-change point ratios of all tracks is the mean of the sharp speed-change point ratio, and their standard deviation is the standard deviation of the sharp speed-change point ratio;
mean of sharp turn point ratio / standard deviation of sharp turn point ratio: the sharp turn point ratio of one track of the driver is
$$\frac{\bigl|\{p_i : \theta_i > TT\}\bigr|}{n}$$
wherein $\theta_i$ is the steering angle at track point $p_i$ and $TT$ is the threshold for judging a sharp turn; the mean of the sharp turn point ratios of all tracks is the mean of the sharp turn point ratio, and their standard deviation is the standard deviation of the sharp turn point ratio;
mean of overspeed point ratio / mean of number of overspeed points: the overspeed point ratio of one track of the driver is
$$\frac{n_{os}}{n}, \qquad n_{os} = \bigl|\{p_i : v_i > ST\}\bigr|$$
wherein $v_i$ is the speed at track point $p_i$, $ST$ is the threshold for judging overspeed, and $n_{os}$ is the number of overspeed points; the mean of the overspeed point ratios of all tracks is the mean of the overspeed point ratio, and the mean of the numbers of overspeed points of all tracks is the mean of the number of overspeed points;
mean of intersection overspeed point ratio / intersection speed mean: the track points of each track that fall within an intersection buffer are selected by spatial position, and the two features are obtained using the calculation methods for the mean of the overspeed point ratio and the speed mean.
Specifically, in this embodiment, the threshold AT for judging a sharp speed change is set to 1.67 m/s², the threshold TT for judging a sharp turn is 0.5 rad/s, the threshold ST for judging overspeed is 80 km/h, and the radius of the intersection buffer is 10 m.
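The abnormal behavior features can be illustrated with the thresholds of this embodiment. The sketch below is not the patented implementation: the input layout (per-track lists of speeds in m/s, accelerations in m/s² and turn rates in rad/s), the function names, the use of the absolute value for turn rates, and the population standard deviation are all assumptions.

```python
from statistics import mean, pstdev

AT = 1.67      # sharp speed-change threshold, m/s^2 (from the embodiment)
TT = 0.5       # sharp turn threshold, rad/s (from the embodiment)
ST = 80 / 3.6  # overspeed threshold, 80 km/h converted to m/s


def point_ratio(values, threshold, use_abs=False):
    """Share of track points whose value exceeds the threshold."""
    if not values:
        return 0.0
    hits = sum(1 for v in values if (abs(v) if use_abs else v) > threshold)
    return hits / len(values)


def track_ratios(speeds, accels, turn_rates):
    """Per-track abnormal behavior ratios."""
    return {
        "sharp_speed_change_ratio": point_ratio(accels, AT, use_abs=True),
        # Assumption: left and right turns are treated symmetrically.
        "sharp_turn_ratio": point_ratio(turn_rates, TT, use_abs=True),
        "overspeed_ratio": point_ratio(speeds, ST),
        "overspeed_count": sum(1 for v in speeds if v > ST),
    }


def driver_abnormal_features(tracks):
    """Aggregate per-track ratios over all tracks of one driver."""
    per_track = [track_ratios(*t) for t in tracks]
    shift = [r["sharp_speed_change_ratio"] for r in per_track]
    turn = [r["sharp_turn_ratio"] for r in per_track]
    return {
        "mean_sharp_speed_change_ratio": mean(shift),
        "std_sharp_speed_change_ratio": pstdev(shift),
        "mean_sharp_turn_ratio": mean(turn),
        "std_sharp_turn_ratio": pstdev(turn),
        "mean_overspeed_ratio": mean(r["overspeed_ratio"] for r in per_track),
        "mean_overspeed_count": mean(r["overspeed_count"] for r in per_track),
    }


# Two hypothetical tracks: (speeds, accelerations, turn rates).
tracks = [
    ([10, 25, 30, 20], [0.5, -2.0, 1.0, 1.7], [0.1, 0.6, -0.7, 0.2]),
    ([10, 10], [0.0, 0.0], [0.0, 0.0]),
]
features = driver_abnormal_features(tracks)
print(features)
```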
Step S3: construct a scale for measuring the description dimensions. The scale defines the correspondence between the description dimensions and the track features and specifies the scoring rule of each track feature. Specifically, for each description dimension, track features that match the definition of the dimension are selected from the candidate track feature set to serve as the items for measuring that dimension; meanwhile, the conceptual positive or negative correlation between each track feature and the description dimension is judged from their definitions, and the scoring rule of the track feature is specified accordingly. If no suitable track feature can be selected from the candidate track feature set for a description dimension, that dimension is not included in the final track portrait.
In this embodiment, step S3 specifically includes:
Step S31: define the correspondence between the track traits and the track features. According to the definition of description dimension A, track features that measure the individual travel range and travel activity information are selected as the items for measuring the trait; according to the definition of description dimension B, travel spatial entropies are selected as the items for measuring the trait; according to the definition of description dimension C, track features that represent the driving stability of the driver are selected as the items for measuring the trait; according to the definition of description dimension D, track features related to driving violations and travel time entropy are selected as the items for measuring the trait.
The monthly travel times, at-home stay time ratio, average travel distance, stay-time-weighted radius of gyration, stay-count-weighted radius of gyration, shopping POI average importance, entertainment POI average importance and catering POI average importance are selected as the items for measuring description dimension A; the non-interest travel rate, k-radius of gyration ratio, minimum number of stay points that determine the spatial range, random entropy, location entropy, sequence entropy, origin-destination entropy and distance-from-home entropy are selected as the items for measuring description dimension B; the mean of the speed standard deviation, standard deviation of the speed mean, maximum of the speed standard deviation, maximum of the acceleration standard deviation, mean of the sharp speed-change point ratio, mean of the sharp turn point ratio, standard deviation of the sharp speed-change point ratio and standard deviation of the sharp turn point ratio are selected as the items for measuring description dimension C; the speed mean, mean of the maximum speed, mean of the overspeed point ratio, mean of the number of overspeed points, mean of the residential-area intersection overspeed point ratio, residential-area intersection speed mean, within-day travel entropy and combined week-and-day travel entropy are selected as the items for measuring description dimension D.
Step S32: determine the scoring rules of the track features. The scoring rules are divided into forward scoring and reverse scoring. Forward scoring means that the track feature and the track trait are conceptually positively related; reverse scoring means that the track feature and the track trait are conceptually negatively related. For example, conceptually, the more trips a driver makes, the higher the score on description dimension A, so the monthly travel times is a forward-scored item; the higher the at-home stay time ratio, the lower the score on description dimension A, so the at-home stay time ratio is a reverse-scored item. The at-home stay time ratio, k-radius of gyration ratio, speed mean, mean of the maximum speed, mean of the overspeed point ratio, mean of the number of overspeed points, mean of the residential-area intersection overspeed point ratio, residential-area intersection speed mean, within-day travel entropy and combined week-and-day travel entropy are reverse-scored items, and the rest are forward-scored items.
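The scale defined in steps S31 and S32 can be written down as a small lookup table, sketched here as a Python dictionary. The identifier names are hypothetical. The dimension C entry is read as including both the mean and the standard deviation of the sharp speed-change point ratio, which is an assumption based on the stated total of 32 features.

```python
# +1 marks a forward-scored item, -1 a reverse-scored item.
TRACK_TRAIT_SCALE = {
    "A": {
        "monthly_travel_times": +1,
        "home_stay_time_ratio": -1,
        "average_travel_distance": +1,
        "radius_of_gyration_stay_time": +1,
        "radius_of_gyration_stay_count": +1,
        "shopping_poi_importance": +1,
        "entertainment_poi_importance": +1,
        "catering_poi_importance": +1,
    },
    "B": {
        "non_interest_travel_rate": +1,
        "k_radius_of_gyration_ratio": -1,
        "min_stay_points_for_range": +1,
        "random_entropy": +1,
        "location_entropy": +1,
        "sequence_entropy": +1,
        "origin_destination_entropy": +1,
        "home_distance_entropy": +1,
    },
    "C": {
        "mean_speed_std": +1,
        "std_speed_mean": +1,
        "max_speed_std": +1,
        "max_acceleration_std": +1,
        "mean_sharp_speed_change_ratio": +1,
        "mean_sharp_turn_ratio": +1,
        "std_sharp_speed_change_ratio": +1,
        "std_sharp_turn_ratio": +1,
    },
    "D": {
        "speed_mean": -1,
        "mean_max_speed": -1,
        "mean_overspeed_ratio": -1,
        "mean_overspeed_count": -1,
        "mean_intersection_overspeed_ratio": -1,
        "intersection_speed_mean": -1,
        "daily_travel_entropy": -1,
        "weekly_daily_travel_entropy": -1,
    },
}


def reverse_scored_items(scale):
    """List every item that is reverse scored, across all traits."""
    return [
        item
        for items in scale.values()
        for item, direction in items.items()
        if direction < 0
    ]


print(len(reverse_scored_items(TRACK_TRAIT_SCALE)))
```

Keeping the scale as data rather than code makes it easy to drop an item that fails the validity checks of step S5 without touching the feature extraction.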
Step S4: input the track data of one or more users, extract the track features based on the scale in step S3, and fuse the track features under each description dimension to obtain the measurement result of the description dimensions, namely the track portrait.
Specifically, methods for fusing the track features under each description dimension include normalized summation (averaging), graded summation (averaging), standardized summation (averaging), graded counting, and the like. What these methods have in common is that track features with different scales and ranges are converted to the same scale or interval by feature scaling, so that they can be compared, summed, averaged, and so on.
In this embodiment, step S4 specifically includes:
Step S41: input and preprocess the vehicle track data of the driver. Each track point in the vehicle track data contains at least three pieces of information: longitude, latitude and timestamp. Before the time, space and geographic semantic features are extracted, a clustering algorithm is needed to obtain the stay points corresponding to all tracks of each driver. Before the driving behavior features are extracted, the speed, acceleration and steering angle of each track point need to be calculated.
Specifically, the DBSCAN algorithm is used to cluster the vehicle flameout (engine-off) points at which the driver stays longer than 20 minutes (clustering parameters: eps = 200 m, minPts = 3). In the clustering result, one cluster corresponds to one stay point, and the coordinates of the stay point are taken as the coordinates of the center of all vehicle flameout points belonging to that cluster. The speed of track point $p_i$ is the average speed between its previous track point $p_{i-1}$ and its next track point $p_{i+1}$; the acceleration is the rate of change of speed between track point $p_i$ and the next track point $p_{i+1}$; the steering angle is determined by the three track points $p_{i-1}$, $p_i$ and $p_{i+1}$.
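The per-point speed and acceleration described above can be sketched as follows. The great-circle (haversine) distance, the `(lon, lat, t_seconds)` point layout, the choice to measure the path through $p_i$, and the function names are assumptions made for illustration; the steering angle and the DBSCAN clustering are not covered here.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an assumed constant


def haversine_m(p, q):
    """Great-circle distance in metres between two (lon, lat, ...) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def point_speeds(track):
    """Speed at p_i as the average speed from p_(i-1) to p_(i+1).

    track: list of (lon, lat, t_seconds). End points get None.
    """
    speeds = [None] * len(track)
    for i in range(1, len(track) - 1):
        prev, cur, nxt = track[i - 1], track[i], track[i + 1]
        dt = nxt[2] - prev[2]
        if dt > 0:
            speeds[i] = (haversine_m(prev, cur) + haversine_m(cur, nxt)) / dt
    return speeds


def point_accelerations(track, speeds):
    """Acceleration at p_i as the speed change rate between p_i and p_(i+1)."""
    accels = [None] * len(track)
    for i in range(len(track) - 1):
        dt = track[i + 1][2] - track[i][2]
        if speeds[i] is not None and speeds[i + 1] is not None and dt > 0:
            accels[i] = (speeds[i + 1] - speeds[i]) / dt
    return accels


# Hypothetical track: constant motion along the equator, one point every 10 s.
track = [(0.001 * i, 0.0, 10.0 * i) for i in range(5)]
speeds = point_speeds(track)
accels = point_accelerations(track, speeds)
print(speeds)
print(accels)
```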
Step S42: extract the track features of the driver according to the calculation methods of the track features designed in step S2.
Step S43: according to the forward and reverse scoring rules of the track features specified in step S32, normalize each track feature using the corresponding formula.
Specifically, forward scoring: [formula image not reproduced in this text], wherein $k$ and $x_0$ are normalization parameters. They can be set from the interquartile range IQR and the median, or from the standard deviation $S$ and the mean [parameter formula images not reproduced in this text];
reverse scoring: [formula image not reproduced in this text], wherein the symbols have the same meanings as in forward scoring.
Step S44: according to the correspondence between the track traits and the track features defined in step S31, calculate, for each track trait, the sum or the mean of its corresponding normalized track features as the measurement result of that track trait.
Specifically, for each track trait, the sum of its corresponding normalized track features is calculated as the measurement result of the track trait.
Optionally, the method further comprises:
Step S5: by evaluating the validity of the scale measurement results, assess the rationality of the scale design or the applicability of the scale to the current user group.
Specifically, the validity of the measurement results of a newly developed scale needs to be evaluated before use, in order to assess the rationality and feasibility of the scale design. In addition, when a scale whose rationality has already been demonstrated is used, whether it needs to be evaluated again depends on the current user group. If the drivers whose track portraits are to be constructed come from the population for which the scale measurement results have been shown to be valid, the current sample does not need to be evaluated again. For example, suppose that, from the vehicle track data of 500 drivers, 200 drivers are randomly drawn to construct track portraits, and the validity of the scale measurement results is verified on these data, so that the scale is shown to be applicable to the current user group; when 20 more drivers are later drawn to construct track portraits, verification is not required again. Conversely, if the drivers whose track portraits are to be constructed come from a population for which the scale measurement results have not been shown to be valid, the evaluation should be performed based on the data of the current user group.
In this embodiment, step S5 specifically includes:
evaluating the validity of the scale measurement results through item discrimination, internal consistency reliability and data split-half reliability, so as to assess the rationality of the scale design or the applicability of the scale to the current user group.
Item discrimination: for the items under each trait dimension, the driver population is divided into three parts according to the 27% and 73% percentiles of the trait measurement results of all drivers on that dimension, defined respectively as the low group, the middle group and the high group; a t-test is then used to compare the score difference between the high group and the low group on each item. If the difference is significant, the item discrimination is good; otherwise, it is poor.
Internal consistency reliability: for each trait dimension, Cronbach's α coefficient is used to estimate the internal consistency reliability of the trait dimension, according to the formula
$$\alpha = \frac{K}{K-1}\left(1 - \frac{\sum_{i=1}^{K}\sigma_{Y_i}^2}{\sigma_X^2}\right)$$
wherein $K$ is the number of items in the scale for the trait dimension, $\sigma_X^2$ is the variance of the trait measurement results of all drivers, and $\sigma_{Y_i}^2$ is the variance of the scores of all drivers on the $i$-th item;
data split-half reliability: for each trait dimension, the tracks of each driver are divided into two equal halves, with a whole track as the unit of division; the trait scores of the two halves are calculated separately, and the data split-half reliability is estimated by comparing the normalized mean absolute difference between the two sets of trait scores over all drivers. The normalized mean absolute difference is given by [formula image not reproduced in this text], which is computed from the number of drivers $N$, the two trait scores of each driver $i$, and the standard deviations of the two sets of trait scores over all drivers.
In order to more clearly illustrate the implementation and the advantages of the method provided by the present invention, a detailed description is given below by way of a specific example. The flow of this embodiment is shown in fig. 3.
The vehicle track data of 662 drivers in a certain city in 2019, 7 classes of POI data of the city, the boundary data of the city and the residential-area intersection data of the city are obtained, and a track portrait needs to be constructed for each driver. Recording of a track starts when the driver starts the vehicle and ends at flameout, and the track of each start-to-flameout cycle is stored in one file. Each track record contains the longitude, latitude and timestamp at the time of recording; an example is shown in fig. 4 (a). The method of the present invention is explained in detail below with reference to the accompanying drawings. The specific steps are as follows:
1) As described in step S1, the description dimensions of the track portrait are defined.
2) As described in step S2, track features are designed from the 4 angles of time, space, geographic semantics and driving behavior.
3) As described in step S3, the correspondence between the four track traits A, B, C and D and the 32 track features is specified, together with the scoring rule of each track feature; the result is shown in fig. 5.
4) As described in step S4, the track data of the 662 drivers are input, and the track portraits, namely the 4 track trait scores of each driver, are obtained based on the track trait scale, as follows:
(1) Construct the semantic map and generate the intersection buffers. A fishnet grid with cells of 500 m × 500 m is created from the city boundary data, the number of POIs of each class in each cell is counted, and the semantic vector of each cell is calculated according to the semantic vector formula (see step S23 for details) to obtain the semantic map; a circular buffer with a radius of 10 m is generated for each intersection point from the residential-area intersection data of the city.
(2) For the tracks of one driver, the last record of each file, namely the vehicle flameout point, is taken out; the vehicle flameout points with a stay time longer than 20 minutes are then selected; finally, the qualifying vehicle flameout points are clustered using the DBSCAN algorithm to obtain the stay points of the driver, as shown in fig. 4 (b).
(3) According to the formula in step S21, the within-day travel entropy and the combined week-and-day travel entropy are calculated based on the driver stay point data.
(4) According to the formulas in step S22, the stay-time-weighted radius of gyration, stay-count-weighted radius of gyration, k-radius of gyration ratio, minimum number of stay points that determine the spatial range, random entropy, location entropy, sequence entropy, origin-destination entropy, monthly travel times, average travel distance and non-interest travel rate are calculated based on the driver stay point data.
(5) According to the formulas in step S23, based on the driver stay point data, the total stay time of the driver at each stay point at night (1 a.m. to 6 a.m.) is calculated to determine the location of "home", and the at-home stay time ratio and the distance-from-home entropy are then calculated. The semantic vector of each cell in the semantic map is assigned to the stay points inside that cell, and the values corresponding to shopping POIs in the semantic vectors of all stay points of the driver are averaged to obtain the shopping POI average importance; the entertainment POI average importance and the catering POI average importance are calculated in the same way.
(6) For each track of one driver, the speed, acceleration and steering angle of each track point are calculated in turn to obtain the processed track data; an example is shown in fig. 4 (b).
(7) According to the formulas in step S24, the mean of the speed standard deviation, maximum of the speed standard deviation, standard deviation of the speed mean, speed mean, mean of the maximum speed, maximum of the acceleration standard deviation, mean of the sharp speed-change point ratio, standard deviation of the sharp speed-change point ratio, mean of the sharp turn point ratio, standard deviation of the sharp turn point ratio, mean of the overspeed point ratio and mean of the number of overspeed points are calculated based on the processed track data of the driver; the track points of the driver inside the residential-area intersection buffers are then selected, and the mean of the intersection overspeed point ratio and the intersection speed mean are calculated.
(8) And (3) repeating the steps (2) to (7) to obtain the track characteristics of 622 drivers, and the result is shown in fig. 4 (c) for an example.
(9) The normalization parameters of one track feature are calculated, namely the median and the interquartile range of that track feature over the 622 drivers.
(10) Whether the track feature is reverse-scored is judged; if so, it is normalized with the reverse scoring formula; if not, it is normalized with the forward scoring formula.
(11) The above normalization is repeated for all 32 track features; the result is shown in fig. 4 (d).
(12) For one driver, the normalized track features corresponding to each track trait are summed to obtain the track traits of the driver.
(13) Step (12) is repeated to obtain the track traits of the 622 drivers; the result is shown in fig. 4 (e).
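Step (6) above, which derives the speed, acceleration and steering angle of each track point, can be sketched as follows. This is a minimal illustration and not the patented implementation: the function names, the haversine distance, the constant Earth radius and the definition of the steering angle as the absolute change in bearing between consecutive segments are all assumptions made here.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; an assumed constant


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lon1, lat1, lon2, lat2):
    """Initial bearing from point 1 to point 2, in degrees within [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlam)
    return math.degrees(math.atan2(y, x)) % 360.0


def enrich_track(points):
    """points: list of (lon, lat, t_seconds) sorted by time.

    Returns one record per segment end point with speed (m/s),
    acceleration (m/s^2) and steering angle (degrees, 0 to 180).
    """
    records = []
    prev_speed = prev_bearing = None
    for (lon0, lat0, t0), (lon1, lat1, t1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt <= 0:  # skip duplicated or out-of-order timestamps
            continue
        speed = haversine_m(lon0, lat0, lon1, lat1) / dt
        bearing = bearing_deg(lon0, lat0, lon1, lat1)
        accel = 0.0 if prev_speed is None else (speed - prev_speed) / dt
        # Smallest absolute difference between consecutive bearings.
        turn = 0.0 if prev_bearing is None else abs((bearing - prev_bearing + 180.0) % 360.0 - 180.0)
        records.append({"t": t1, "speed": speed, "accel": accel, "turn": turn})
        prev_speed, prev_bearing = speed, bearing
    return records
```

The first segment of a track has no predecessor, so its acceleration and steering angle are set to zero here; a real pipeline might drop that record instead.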
5) Optionally, as described in step S5, the validity of the scale measurement is evaluated by:
Evaluating item discrimination:
(1) For one track trait, the driver population is divided into three groups, defined as low, medium and high, according to the 27% and 73% percentiles of the 622 drivers' scores on that trait.
(2) For one track feature under the track trait, a t-test is used to compare the scores of the high and low groups on that track feature.
(3) Step (2) is repeated to evaluate the item discrimination of every track feature under the track trait.
(4) Steps (1) to (3) are repeated to evaluate the item discrimination of the 32 track features under the 4 track traits; the result is shown in fig. 6.
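The item discrimination steps above can be sketched as follows. The source only says that a t-test is used; this sketch assumes Welch's unequal-variance form and returns the t statistic and degrees of freedom only, since the standard library has no t-distribution CDF (a p-value could be obtained with, for example, `scipy.stats.ttest_ind(high, low, equal_var=False)`). The function names and the rule for handling scores exactly on a cut point are hypothetical.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two samples (assumed test form)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    sa, sb = va / len(a), vb / len(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(sa + sb)
    df = (sa + sb) ** 2 / (sa ** 2 / (len(a) - 1) + sb ** 2 / (len(b) - 1))
    return t, df


def item_discrimination(trait_scores, item_scores):
    """Compare one item between the high and low groups of one trait.

    trait_scores[i] and item_scores[i] belong to the same driver.
    """
    cuts = statistics.quantiles(trait_scores, n=100)  # 99 percentile cut points
    low_cut, high_cut = cuts[26], cuts[72]            # 27% and 73% percentiles
    low = [x for s, x in zip(trait_scores, item_scores) if s <= low_cut]
    high = [x for s, x in zip(trait_scores, item_scores) if s >= high_cut]
    return welch_t(high, low)
```

A large positive t value indicates that the high group scores clearly higher on the item than the low group.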
Evaluating internal consistency reliability:
(1) For one track trait, Cronbach's α coefficient is calculated from the trait scores of the 622 drivers.
(2) Step (1) is repeated to evaluate the internal consistency reliability of the 4 track traits; the result is shown in FIG. 7.
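Cronbach's α for one track trait can be sketched as follows, using the standard definition α = K/(K-1) · (1 - Σ item variances / variance of the total score). The function name and the input layout (one list of normalized scores per item) are assumptions, and the total score is taken as the per-driver sum of the items.

```python
import statistics


def cronbach_alpha(items):
    """items: list of K lists; items[j][i] is driver i's normalized score on item j."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-driver sum over the K items
    item_var = sum(statistics.pvariance(col) for col in items)
    total_var = statistics.pvariance(totals)
    return k / (k - 1) * (1 - item_var / total_var)
```

Population variances are used on both sides of the ratio, so using sample variances throughout would give the same value.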
Evaluating data split-half reliability:
(1) For the trajectory data of the 622 drivers, the trajectories of each driver are randomly divided, trajectory by trajectory, into two equal parts.
(2) To reduce the sampling error caused by the random division, step (1) is repeated 100 times, that is, the trajectories of each driver are randomly and equally divided 100 times, each time forming two trajectory data sets.
(3) Step 4) is repeated for each data set to obtain two trait score sets, using the normalization parameters obtained from the unsplit data.
(4) For one track trait, the normalized mean absolute difference of that track trait between the two trait score sets is calculated.
(5) Step (4) is repeated to evaluate the data split-half reliability of the 4 track traits; the result is shown in FIG. 8.
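The random halving in steps (1) and (2) can be sketched as follows. Only the splitting is shown; scoring each half and computing the normalized mean absolute difference are left out. The function names, the fixed seed and the choice to drop one trajectory when a driver has an odd number of them are assumptions.

```python
import random


def split_half(tracks, rng):
    """Randomly split one driver's list of trajectories into two equal-sized halves."""
    idx = list(range(len(tracks)))
    rng.shuffle(idx)
    half = len(idx) // 2
    first = [tracks[i] for i in idx[:half]]
    # With an odd count, the last shuffled trajectory is left out (assumption).
    second = [tracks[i] for i in idx[half:2 * half]]
    return first, second


def repeated_splits(tracks_by_driver, repeats=100, seed=0):
    """Yield `repeats` pairs of data sets, each mapping driver id to half of the tracks."""
    rng = random.Random(seed)
    for _ in range(repeats):
        set_a, set_b = {}, {}
        for driver, tracks in tracks_by_driver.items():
            set_a[driver], set_b[driver] = split_half(tracks, rng)
        yield set_a, set_b
```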
The invention has the following beneficial effects: through the track trait scale, the track features of a driver can be effectively abstracted into four track traits A, B, C and D that describe the driver's travel patterns and driving behavior, which in turn provides a basis for applications such as safe driving, customized vehicle insurance and personalized recommendation in location-based services. The reasonableness of the design of the track trait scale is evaluated through item discrimination, internal consistency reliability and data split-half reliability. As shown in fig. 6, every item (i.e., track feature) in the scale significantly distinguishes the high and low groups (p < 0.001), which shows that each item correctly distinguishes drivers who exhibit different traits in the different dimensions; as shown in FIG. 7, the internal consistency reliability of each track trait is acceptable by rule of thumb (α > 0.6), which shows that the items under each track trait are related and consistent and that the current correspondence between track traits and track features is reasonable; as shown in fig. 8, when the track traits are measured twice independently, the difference between the measurement results is small (β_SD < 0.5), which shows that the measurement results are stable and reproducible.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all changes and modifications that fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made in the embodiments of the present invention without departing from the spirit or scope of the embodiments of the invention. Thus, if such modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to encompass such modifications and variations.

Claims (10)

1. A driver portrait construction method based on vehicle track data is characterized by comprising the following steps:
S1, defining the description dimensions of a track portrait;
S2, designing related track features from multiple angles according to the definitions of the description dimensions to form a candidate track feature set;
S3, constructing a scale for measuring the description dimensions, which comprises defining the correspondence between the description dimensions and the track features and specifying the scoring rules of the track features; specifically, for each description dimension, track features that meet the definition of that description dimension are selected from the candidate track feature set and taken as the items for measuring that description dimension; meanwhile, whether each track feature is conceptually positively or negatively correlated with the description dimension is judged from their definitions, and the scoring rule of the track feature is specified accordingly; if no suitable track feature can be selected from the candidate track feature set for a description dimension, that description dimension is not included in the final track portrait;
S4, inputting track data of one or more users, extracting track features based on the scale in step S3, and fusing the track features under each description dimension to obtain the measurement result of that description dimension, namely the track portrait.
2. The driver representation construction method based on vehicle trajectory data as set forth in claim 1, wherein: the specific implementation manner of the step S1 is as follows;
defining the description dimensions of the track portrait as four description dimensions A, B, C and D, namely track traits; wherein description dimension A describes the range of the driver's activity space, the frequency of travel, and the number of visits to shopping or entertainment venues; description dimension B measures the irregularity and unpredictability of the driver's travel places; description dimension C captures the driver's tendency to drive impulsively; description dimension D describes the degree to which the driver complies with traffic rules and is punctual.
3. The driver representation construction method based on vehicle trajectory data as claimed in claim 2, wherein: the step S2 specifically includes:
designing track characteristics from four angles of time, space, geographic semantics and driving behaviors based on vehicle track data;
step S21, designing time characteristics to reveal travel rhythms;
step S22, designing spatial features to describe the spatial range and distribution of travel locations;
step S23, designing geographic semantic features to depict travel activity information;
and step S24, designing driving behavior features to reflect driving ability and risk.
4. A driver representation construction method based on vehicle trajectory data as set forth in claim 3, wherein: the time characteristic designed in step S21 includes:
the designed time features are based on the travel time entropy, which comprises a daily travel entropy and a combined week-and-day travel entropy; the travel time entropy is calculated as
$-\sum_{i} p(t_i)\,\log p(t_i)$
wherein $t_i$ denotes the $i$-th travel time slot and $p(t_i)$ denotes the driver's travel frequency in $t_i$; when a day is divided into time slots by hour, the daily travel entropy is calculated according to the formula; when a week is divided by day and each day is further divided into time slots by hour, the combined week-and-day travel entropy is calculated according to the formula.
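The two travel time entropies of claim 4 can be sketched as follows. The sketch assumes that each trip is represented by its departure time, that time slots are whole hours, and that the logarithm is taken to base 2; none of these choices is stated in the text, and the function names are hypothetical.

```python
import math
from collections import Counter
from datetime import datetime


def entropy(counts):
    """Shannon entropy (base 2, an assumption) of a collection of slot counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def daily_travel_entropy(departures):
    """Time slots are the 24 hours of the day."""
    return entropy(Counter(d.hour for d in departures).values())


def week_day_travel_entropy(departures):
    """Time slots are (weekday, hour) pairs, i.e. up to 7 x 24 slots."""
    return entropy(Counter((d.weekday(), d.hour) for d in departures).values())
```

A driver who always departs in the same hour has a daily travel entropy of zero, while trips spread evenly over many slots give a high entropy.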
5. A driver representation construction method based on vehicle trajectory data as set forth in claim 3, wherein: the spatial features designed in step S22 include:
the designed spatial features consist of three subclasses: radius of rotation, spatial entropy and others; the radius of rotation subclass contains 4 features: the dwell time weighted radius of rotation, the dwell count weighted radius of rotation, the k-radius of rotation ratio, and the minimum number of stay points determining the spatial range; the spatial entropy subclass contains 4 features: random entropy, location entropy, sequence entropy and origin-destination entropy; the others subclass contains 3 features: monthly trip count, average trip distance and non-interest trip rate; the calculation formulas are as follows:
dwell time weighted radius of rotation/dwell count weighted radius of rotation:
$r_g = \sqrt{\frac{1}{N}\sum_{i \in L} n_i\,(r_i - r_{cm})^2}$
wherein $L$ is the set of the driver's stay points; $r_i$ is a two-dimensional vector representing the longitude and latitude of stay point $i$; $n_i$ is the number of stays or the stay time of the driver at stay point $i$; $N = \sum_{i \in L} n_i$ is the total number of stays or the total stay time; $r_{cm}$ is the center of all of the driver's stay points, i.e. the coordinate mean;
k-radius of rotation ratio:
$s_k = r_g^{(k)} / r_g$
in which
$r_g^{(k)} = \sqrt{\frac{1}{N_k}\sum_{i=1}^{k} n_i\,(r_i - r_{cm}^{(k)})^2}$
wherein $r_{cm}^{(k)}$ is the center of the $k$ stay points most frequently visited by the driver, i.e. the coordinate mean; $N_k$ is the sum of the weights of the $k$ places, which can be the total number of stays or the total stay time;
determining the minimum number of stopover points for the spatial range:
Figure 88567DEST_PATH_IMAGE010
random entropy:
$\log N$
wherein $N$ denotes the number of the driver's stay points;
location entropy:
$-\sum_{i=1}^{N} p(l_i)\,\log p(l_i)$
wherein $l_i$ denotes the $i$-th stay point and $p(l_i)$ denotes the frequency with which the driver visits $l_i$;
sequence entropy:
Figure 426510DEST_PATH_IMAGE016
wherein
Figure 709724DEST_PATH_IMAGE017
is a sequence of stay points visited by the driver in chronological order, and
Figure 125662DEST_PATH_IMAGE018
is the frequency with which that sequence occurs;
origin-destination entropy:
Figure 861537DEST_PATH_IMAGE019
wherein $m$ denotes the number of unique "origin-destination" stay point pairs, and
Figure 919491DEST_PATH_IMAGE020
denotes the frequency with which trips from origin
Figure 057212DEST_PATH_IMAGE021
to destination
Figure 641121DEST_PATH_IMAGE022
occur;
monthly trip count:
$n / month$
wherein $n$ denotes the driver's total number of trips and $month$ is the total number of months of travel;
average trip distance:
$\frac{1}{n}\sum_{i=1}^{n} d_i$
wherein $d_i$ is the straight-line distance between the origin and the destination of the $i$-th trip, and $n$ is the total number of trips;
non-interest trip rate:
$n_{uni} / n$
wherein $n_{uni}$ is the number of trips whose destination is an infrequently visited stay point.
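The weighted radius of rotation and the k-radius of rotation ratio of claim 5 can be sketched as follows. The sketch treats coordinates as planar (for example, already projected to metres) and takes the center as the weight-averaged coordinate; both are assumptions, as are the function names. The weight of a stay point is either its number of stays or its total stay time, which yields the two weighted variants.

```python
import math


def radius_of_rotation(stops):
    """stops: list of (x, y, weight); weight is the stay count or the stay time."""
    total = sum(w for _, _, w in stops)
    cx = sum(x * w for x, _, w in stops) / total  # weighted center (assumption)
    cy = sum(y * w for _, y, w in stops) / total
    spread = sum(w * ((x - cx) ** 2 + (y - cy) ** 2) for x, y, w in stops)
    return math.sqrt(spread / total)


def k_radius_ratio(stops, k):
    """Radius over the k highest-weight stay points divided by the overall radius."""
    top_k = sorted(stops, key=lambda s: s[2], reverse=True)[:k]
    return radius_of_rotation(top_k) / radius_of_rotation(stops)
```

A ratio close to 1 means that the k most visited places already span most of the driver's activity space; a small ratio means that rarely visited, distant places dominate the overall radius.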
6. A driver representation construction method based on vehicle trajectory data as set forth in claim 3, wherein: the geographic semantic features designed in step S23 include:
the designed geographic semantic features consist of two subclasses: home-related features and point of interest (POI) related features; the home-related subclass contains 2 features: the at-home stay time ratio and the distance-from-home entropy; the POI-related subclass contains 3 features: the average importance of shopping POIs, the average importance of entertainment POIs and the average importance of catering POIs;
at-home stay time ratio:
$t_{home} / t_{all}$
wherein $t_{home}$ is the sum of the time the driver stays at "home" and $t_{all}$ is the sum of the time the driver stays at all stay points; "home" may be defined as the stay point at which the driver stays for the longest time at night;
distance from home entropy:
Figure 971291DEST_PATH_IMAGE030
wherein
Figure 293688DEST_PATH_IMAGE031
is the distance between the $i$-th stay point and home, and
Figure 932797DEST_PATH_IMAGE032
is the frequency with which that distance occurs;
shopping/entertainment/catering POI average importance: the average importance of a POI class is the mean of the importance of that POI class over all of the driver's stay points; the POI importance of each stay point comes from the semantic vector of the area where the stay point is located, the area being a grid cell of a semantic map; the semantic map is a spatial grid in which each grid cell has a semantic vector that reflects the importance of the various POI classes and is obtained by a weighted TF-IDF algorithm; the semantic vector of the $i$-th grid cell is denoted as
Figure 717399DEST_PATH_IMAGE033
wherein $o$ is the number of POI classes,
Figure 068746DEST_PATH_IMAGE034
wherein $n_j$ is the number of POIs of class $j$ in the grid cell, $N$ is the number of all POIs in the grid cell, $C$ and $c_j$ are respectively the total number of grid cells in the semantic map and the number of grid cells containing POIs of class $j$, and $w_j$ is the number of POIs of class $j$ in the 3 × 3 neighborhood of the grid cell.
7. A driver representation construction method based on vehicle trajectory data as set forth in claim 3, wherein: the driving behavior characteristics designed in step S24 include:
the designed driving behavior features comprise three subclasses: general behavior, abnormal behavior and residential area intersection behavior; the general behavior subclass contains 6 features: the mean of the speed standard deviations, the maximum of the speed standard deviations, the standard deviation of the speed means, the speed mean, the mean of the maximum speeds and the maximum of the acceleration standard deviations; the abnormal behavior subclass contains 6 features: the mean of the sharp shift point ratios, the mean of the sharp turn point ratios, the standard deviation of the sharp shift point ratios, the standard deviation of the sharp turn point ratios, the mean of the overspeed point ratios and the mean of the number of overspeed points; the residential area intersection behavior subclass contains 2 features: the mean of the intersection overspeed point ratios and the intersection speed mean;
mean of speed standard deviations/maximum of speed standard deviations/standard deviation of speed means/speed mean/mean of maximum speeds: the speeds of all track points in one trajectory of the driver can be expressed as a vector $v = (v_1, v_2, \dots, v_n)$, from which the speed mean $\bar{v}$, the speed standard deviation $\sigma_v$ and the maximum speed $v_{max}$ of the trajectory can be calculated; the speed means of all trajectories can be expressed as a vector $(\bar{v}^{(1)}, \dots, \bar{v}^{(M)})$, whose mean is the speed mean and whose standard deviation is the standard deviation of the speed means; the speed standard deviations of all trajectories can be expressed as a vector $(\sigma_v^{(1)}, \dots, \sigma_v^{(M)})$, whose mean is the mean of the speed standard deviations and whose maximum is the maximum of the speed standard deviations; the maximum speeds of all trajectories can be expressed as a vector $(v_{max}^{(1)}, \dots, v_{max}^{(M)})$, whose mean is the mean of the maximum speeds; here $M$ is the number of trajectories of the driver;
maximum value of acceleration standard deviation: the calculation of the characteristic is the same as the calculation of the maximum value of the standard deviation of the speed;
mean of sharp shift point ratio/standard deviation of sharp shift point ratio: the sharp shift point ratio of one trajectory of the driver is recorded as
$\frac{\left|\{\, j : |a_j| > AT \,\}\right|}{n}$
wherein $a_j$ is the acceleration of track point $j$, $n$ denotes the number of track points of the trajectory, the numerator is the number of track points whose absolute acceleration exceeds the threshold, and $AT$ is the threshold for judging a sharp speed change; the mean of the sharp shift point ratios of all trajectories is the mean of the sharp shift point ratio, and their standard deviation is the standard deviation of the sharp shift point ratio;
mean of sharp turn point ratio/standard deviation of sharp turn point ratio: the sharp turn point ratio of one trajectory of the driver is recorded as
$\frac{\left|\{\, j : |\theta_j| > TT \,\}\right|}{n}$
wherein $\theta_j$ is the steering angle of track point $j$, $n$ denotes the number of track points of the trajectory, and $TT$ is the threshold for judging a sharp turn; the mean of the sharp turn point ratios of all trajectories is the mean of the sharp turn point ratio, and their standard deviation is the standard deviation of the sharp turn point ratio;
mean of overspeed point ratio/mean of number of overspeed points: the overspeed point ratio of one trajectory of the driver is recorded as
$\frac{\left|\{\, j : v_j > ST \,\}\right|}{n}$
wherein $v_j$ is the speed of track point $j$, $ST$ is the threshold for judging overspeed, $n$ denotes the number of track points of the trajectory, and the numerator is the number of overspeed points; the mean of the overspeed point ratios of all trajectories is the mean of the overspeed point ratio, and the mean of the numbers of overspeed points of all trajectories is the mean of the number of overspeed points;
mean of intersection overspeed point ratio/intersection speed mean: the track points of each trajectory that fall within the intersection buffers are selected based on spatial position, and the two features are obtained with the calculation methods for the mean of the overspeed point ratio and the speed mean.
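The per-trajectory statistics and their aggregation across trajectories in claim 7 can be sketched as follows. The function names and dictionary keys are hypothetical, population standard deviations are assumed because the text does not say which estimator is used, and the thresholds AT, TT and ST are left as parameters because no values are given here.

```python
import statistics


def speed_features(tracks):
    """tracks: list of per-trajectory speed lists for one driver."""
    means = [statistics.fmean(t) for t in tracks]
    stds = [statistics.pstdev(t) for t in tracks]  # population std (assumption)
    maxes = [max(t) for t in tracks]
    return {
        "mean_of_speed_std": statistics.fmean(stds),
        "max_of_speed_std": max(stds),
        "std_of_speed_mean": statistics.pstdev(means),
        "speed_mean": statistics.fmean(means),
        "mean_of_max_speed": statistics.fmean(maxes),
    }


def ratio_over_threshold(values, threshold):
    """Share of track points whose absolute value exceeds the threshold."""
    return sum(1 for v in values if abs(v) > threshold) / len(values)


def ratio_features(per_track_values, threshold):
    """Mean and standard deviation of the per-trajectory exceedance ratios."""
    ratios = [ratio_over_threshold(v, threshold) for v in per_track_values]
    return statistics.fmean(ratios), statistics.pstdev(ratios)
```

Passing per-trajectory accelerations with AT gives the sharp shift point features; steering angles with TT and speeds with ST give the sharp turn and overspeed ratios in the same way.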
8. A driver representation construction method based on vehicle trajectory data as set forth in claim 3, wherein: step S3 specifically includes:
step S31: defining the correspondence between the track traits and the track features; according to the definition of description dimension A, track features measuring the individual travel range and travel activity information are selected as the items for measuring this trait; according to the definition of description dimension B, the travel spatial entropies are selected as the items for measuring this trait; according to the definition of description dimension C, track features representing the driver's driving stability are selected as the items for measuring this trait; according to the definition of description dimension D, track features related to driving violations and the travel time entropies are selected as the items for measuring this trait;
the correspondence between the description dimensions and the track features defined in step S31 comprises:
the monthly trip count, the at-home stay time ratio, the average trip distance, the dwell time weighted radius of rotation, the shopping POI average importance, the entertainment POI average importance and the catering POI average importance are selected as the items for measuring description dimension A; the non-interest trip rate, the k-radius of rotation ratio, the minimum number of stay points determining the spatial range, the random entropy, the location entropy, the sequence entropy, the origin-destination entropy and the distance-from-home entropy are selected as the items for measuring description dimension B; the mean of the speed standard deviations, the standard deviation of the speed means, the maximum of the speed standard deviations, the maximum of the acceleration standard deviations, the mean of the sharp turn point ratios, the standard deviation of the sharp shift point ratios and the standard deviation of the sharp turn point ratios are selected as the items for measuring description dimension C; the speed mean, the mean of the maximum speeds, the mean of the overspeed point ratios, the mean of the number of overspeed points, the mean of the residential area intersection overspeed point ratios, the residential area intersection speed mean, the daily travel entropy and the combined week-and-day travel entropy are selected as the items for measuring description dimension D;
step S32: determining the scoring rule of each track feature, the scoring rules being divided into positive scoring and negative scoring; positive scoring means that the track feature is conceptually in direct proportion to the track trait; negative scoring means that the track feature is conceptually in inverse proportion to the track trait; for example, conceptually, the more trips, the higher the description dimension A score, so the number of trips is positively scored; the higher the at-home stay time ratio, the lower the description dimension A score, so the at-home stay time ratio is negatively scored;
the track feature scoring rules determined in step S32 specifically comprise:
the at-home stay time ratio, the k-radius of rotation ratio, the speed mean, the mean of the maximum speeds, the mean of the overspeed point ratios, the mean of the number of overspeed points, the mean of the residential area intersection overspeed point ratios, the residential area intersection speed mean, the daily travel entropy and the combined week-and-day travel entropy are recorded as reverse-scored items, and the others are forward-scored items.
9. The driver representation construction method based on vehicle trajectory data as set forth in claim 8, wherein: in step S4, vehicle track data of one or more drivers are input to obtain a track portrait of each driver; for the vehicle track data of one driver, the processing steps are as follows:
step S41: inputting vehicle track data of a driver and preprocessing the vehicle track data, wherein each track point in the vehicle track data at least comprises three information of longitude, latitude and timestamp; before extracting time, space and geographical semantic features, all stay points corresponding to all tracks of each driver are obtained by using a clustering algorithm, and the speed, the acceleration and the steering angle of each track point are calculated before extracting driving behavior features;
step S42: extracting the track characteristic of the driver according to the calculation method of the track characteristic designed in the step S2;
step S43: according to the positive and negative scoring rules of the track characteristics specified in the step S32, different formulas are adopted for normalization processing of the track characteristics;
the step S43 normalization method includes:
forward scoring:
Figure 154940DEST_PATH_IMAGE052
wherein $k$ and $x_0$ are the parameters for normalization, which can be set to
Figure 514378DEST_PATH_IMAGE053
Figure 904908DEST_PATH_IMAGE054
where IQR is the interquartile range and
Figure 248164DEST_PATH_IMAGE055
is the median, or set to
Figure 015132DEST_PATH_IMAGE056
Figure 494655DEST_PATH_IMAGE057
where $S$ is the standard deviation and
Figure 056086DEST_PATH_IMAGE058
is the mean;
reverse scoring:
Figure 621060DEST_PATH_IMAGE059
the symbols have the same meaning as in forward scoring;
step S44: according to the correspondence between the track traits and the track features defined in step S31, calculating, for each track trait, the sum or the average of its corresponding normalized track features as the measurement result of that track trait.
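Steps S43 and S44 can be illustrated with the sketch below. The normalization formulas themselves are not legible in this text, so the sketch substitutes a logistic function with slope $k$ and midpoint $x_0$, which is one common form that takes exactly those two parameters; this substitution, the choice $k = 1/\mathrm{IQR}$ and all function names are assumptions, not the claimed formulas.

```python
import math
import statistics


def fit_params(values):
    """Return (k, x0) from the IQR and median of one feature over all drivers.

    k = 1 / IQR is a hypothetical scaling; the source gives k only as an image.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    return 1.0 / (q3 - q1), statistics.median(values)


def normalize(x, k, x0, reverse=False):
    """Assumed logistic normalization to (0, 1); reverse scoring flips the sign."""
    z = max(-700.0, min(700.0, k * (x - x0)))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(z if reverse else -z))


def trait_score(features, params, reverse_items):
    """Average of the normalized features belonging to one trait (step S44)."""
    scores = [
        normalize(value, *params[name], reverse=name in reverse_items)
        for name, value in features.items()
    ]
    return statistics.fmean(scores)
```

With this form, a feature equal to the median maps to 0.5, and the forward and reverse scores of the same value always sum to 1.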
10. The driver representation construction method based on vehicle trajectory data as set forth in claim 1, wherein: the method further comprises a step S5 of evaluating the validity of the measurement results of the scale so as to measure the reasonableness of the scale design or the applicability of the scale to the current user group;
specifically: the validity of the measurement results of the scale is evaluated through item discrimination, internal consistency reliability and data split-half reliability, so as to measure the reasonableness of the scale design or the applicability of the scale to the current user group;
the evaluation methods for item discrimination, internal consistency reliability and data split-half reliability in step S5 are as follows:
item discrimination: for the items in each trait dimension, the driver population is divided into three groups, defined as the low, medium and high groups, according to the 27% and 73% percentiles of all drivers' trait measurement results in that dimension, and a t-test is then used to compare the score differences between the high and low groups on each item; if the difference is significant, the discrimination is good; otherwise, the discrimination is poor;
internal consistency reliability: for each trait dimension, Cronbach's $\alpha$ coefficient is used to estimate the internal consistency reliability of the trait dimension, with the formula
$\alpha = \frac{K}{K-1}\left(1 - \frac{\sum_{i=1}^{K}\sigma_{Y_i}^2}{\sigma_X^2}\right)$
wherein $K$ is the number of items, $\sigma_X^2$ is the variance of the trait measurement results of all drivers, and $\sigma_{Y_i}^2$ is the variance of the scores of all drivers on the $i$-th item;
data split-half reliability: for each trait dimension, the trajectories of each driver are equally divided into two parts, trajectory by trajectory, the trait scores of the two parts are calculated respectively, and the data split-half reliability is estimated by the normalized mean absolute difference between the two trait score sets of all drivers; the normalized mean absolute difference is formulated as
Figure 651836DEST_PATH_IMAGE064
wherein $N$ is the number of drivers,
Figure 316035DEST_PATH_IMAGE065
and
Figure 838283DEST_PATH_IMAGE066
are the two trait scores of driver $i$, and
Figure 489549DEST_PATH_IMAGE067
and
Figure 250832DEST_PATH_IMAGE068
are the standard deviations of the two sets of trait scores of all drivers.
CN202211417112.8A 2022-11-14 2022-11-14 Driver portrait construction method based on vehicle track data Active CN115470872B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211417112.8A CN115470872B (en) 2022-11-14 2022-11-14 Driver portrait construction method based on vehicle track data


Publications (2)

Publication Number Publication Date
CN115470872A CN115470872A (en) 2022-12-13
CN115470872B true CN115470872B (en) 2023-04-18


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant