CN109948711B

CN109948711B - Stroke similarity acquisition method and equipment, and method and system for searching similar strokes

Info

Publication number: CN109948711B
Application number: CN201910217774.2A
Authority: CN
Inventors: 杨维嘉; 徐孙杰; 杨治
Original assignee: Shanghai Yingke Information Technology Co ltd
Current assignee: Shanghai Yingke Information Technology Co ltd
Priority date: 2019-03-21
Filing date: 2019-03-21
Publication date: 2023-07-18
Anticipated expiration: 2039-03-21
Also published as: CN109948711A

Abstract

The invention discloses a stroke similarity acquisition method and device, and a method and system for searching similar strokes. The stroke similarity acquisition method comprises the following steps: extracting a plurality of feature points of the first stroke and counting them; extracting a plurality of feature points of the second stroke and counting them, wherein the feature points comprise a starting point, an end point and turning points; obtaining matched feature point pairs and counting them, wherein each matched feature point pair consists of a feature point of the first stroke and a feature point of the second stroke whose position information matches; and calculating the similarity. By extracting the feature points of the strokes and calculating the similarity of two strokes from the matched feature points, the invention greatly reduces the amount of calculation and improves the efficiency of both calculating stroke similarity and searching for similar strokes.

Description

Stroke similarity acquisition method and equipment, and method and system for searching similar strokes

Technical Field

The invention belongs to the field of similarity calculation of vehicle strokes, and particularly relates to a stroke similarity acquisition method and equipment, and a method and system for searching similar strokes.

Background

In vehicle stroke analysis, it is often necessary to analyze the similarity of two strokes. Usually, the curve of each stroke on the map is treated as an image, and the similarity of the two strokes is obtained by calculating the similarity of the two images. However, image comparison is computationally intensive, and its computational complexity is proportional to the image area. When similar strokes are searched for in massive stroke data, the amount of calculation increases further, and the calculation efficiency is low.

Disclosure of Invention

The invention aims to overcome the defect of low efficiency of calculating the similarity of vehicle strokes in the prior art, and provides a stroke similarity obtaining method and equipment, and a method and system for searching similar strokes.

The invention solves the technical problems by the following technical scheme:

The invention provides a stroke similarity acquisition method, wherein the strokes comprise a first stroke and a second stroke, and the acquisition method comprises the following steps:

S1, extracting a plurality of feature points of the first stroke, acquiring position information of each feature point of the first stroke, and counting the number n1 of feature points of the first stroke; extracting a plurality of feature points of the second stroke, acquiring position information of each feature point of the second stroke, and counting the number n2 of feature points of the second stroke; the feature points comprise a starting point, an end point and turning points;

S2, acquiring matched feature point pairs and counting the number n3 of matched feature point pairs, wherein each matched feature point pair consists of a feature point of the first stroke and a feature point of the second stroke whose position information matches;

S3, calculating the similarity m of the first stroke and the second stroke according to the following formula:

m = n3/max(n1, n2), where max(n1, n2) denotes the larger of n1 and n2.
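Steps S1 to S3 can be illustrated with a minimal Python sketch. It assumes that the feature points have already been extracted, that "matched" means identical position tuples, and that each feature point of the second stroke can take part in at most one pair; the function name `stroke_similarity` and the handling of two empty strokes are illustrative assumptions, not part of the method as stated.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # position information of one feature point


def stroke_similarity(first: Sequence[Point], second: Sequence[Point]) -> float:
    """Return m = n3 / max(n1, n2) for two lists of feature-point positions."""
    n1, n2 = len(first), len(second)
    if max(n1, n2) == 0:
        return 0.0  # assumption: two empty strokes are given similarity 0

    remaining: List[Point] = list(second)
    n3 = 0
    for point in first:
        if point in remaining:       # assumption: a match is an identical position
            remaining.remove(point)  # a second-stroke point pairs at most once
            n3 += 1
    return n3 / max(n1, n2)


# Five feature points against six, four of which share a position.
first = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
second = [(1, 1), (7, 7), (3, 3), (4, 4), (6, 6), (5, 5)]
print(stroke_similarity(first, second))
```

Dividing by the larger of n1 and n2 keeps m in the range 0 to 1 and penalizes a stroke that has many feature points the other stroke lacks.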

Preferably, before step S1, the acquisition method further comprises the steps of:

S0, setting a plurality of sampling moments in a time interval of the vehicle passing through the stroke, and acquiring position information of the sampling point corresponding to each sampling moment in the stroke; the sampling points comprise a starting point and an end point;

the step of extracting turning points comprises the following steps:

calculating an included angle between a first straight line connecting an ith sampling point and an (i-a) th sampling point and a second straight line connecting the ith sampling point and an (i+b) th sampling point according to the position information of the sampling points, wherein a and b are positive integers;

judging whether the included angle belongs to a preset angle interval, if so, extracting an ith sampling point as a turning point.

Preferably, the distance between the ith sampling point and the (i-a) th sampling point is not greater than a preset distance, and the distance between the ith sampling point and the (i-a-1) th sampling point is greater than the preset distance; the distance between the ith sampling point and the (i+b) th sampling point is not greater than a preset distance, and the distance between the ith sampling point and the (i+b+1) th sampling point is greater than the preset distance.
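The turning-point test for a single sampling point can be sketched as follows. This is a hedged illustration, not the claimed implementation: it assumes the position information has been projected to planar (x, y) coordinates in metres, it borrows the 50 meter preset distance and the [60, 120] degree interval from the embodiment as default values, and the helper names `reach`, `included_angle` and `is_turning_point` are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]  # assumption: planar (x, y) coordinates in metres


def dist(p: Point, q: Point) -> float:
    return math.hypot(p[0] - q[0], p[1] - q[1])


def reach(points: Sequence[Point], i: int, step: int, max_dist: float) -> Optional[int]:
    """Count consecutive neighbours of P_i in direction `step` (-1 or +1) that
    lie within max_dist, provided the next neighbour exists and lies beyond
    max_dist. Returns None when no such positive count exists."""
    k = 0
    while True:
        j = i + step * (k + 1)
        if j < 0 or j >= len(points):
            return None  # ran off the stroke before exceeding max_dist
        if dist(points[i], points[j]) > max_dist:
            return k if k >= 1 else None
        k += 1


def included_angle(p_prev: Point, p: Point, p_next: Point) -> float:
    """Angle in degrees (0 to 180) between the lines p-p_prev and p-p_next."""
    v1 = (p_prev[0] - p[0], p_prev[1] - p[1])
    v2 = (p_next[0] - p[0], p_next[1] - p[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.degrees(math.atan2(abs(cross), dot))


def is_turning_point(points: Sequence[Point], i: int,
                     max_dist: float = 50.0,
                     angle_range: Tuple[float, float] = (60.0, 120.0)) -> bool:
    a = reach(points, i, -1, max_dist)  # gives the (i-a)-th sampling point
    b = reach(points, i, +1, max_dist)  # gives the (i+b)-th sampling point
    if a is None or b is None:
        return False
    theta = included_angle(points[i - a], points[i], points[i + b])
    return angle_range[0] <= theta <= angle_range[1]


# An L-shaped path sampled every 10 m: 200 m east, then 200 m north.
path = [(x, 0) for x in range(0, 201, 10)] + [(200, y) for y in range(10, 201, 10)]
print(is_turning_point(path, 20), is_turning_point(path, 10))
```

Under this convention a vehicle driving straight gives an included angle near 180 degrees, while a right-angle turn gives an angle near 90 degrees, which falls inside the interval.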

Preferably, step S1 further comprises: performing GeoHash coding on the position information of the feature points to generate a coding value;

each matched feature point pair consists of a feature point of the first stroke and a feature point of the second stroke whose coded values match.
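For illustration, the standard GeoHash encoding (alternating longitude and latitude bisection, five bits per base-32 character) can be sketched in Python as below. The function name `geohash_encode` and the `precision` parameter are assumptions for this sketch; the text does not state which code length is used.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # GeoHash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lon: float, precision: int = 8) -> str:
    """Encode a latitude/longitude pair as a GeoHash string of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    value, nbits = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            value <<= 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        nbits += 1
        if nbits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            value, nbits = 0, 0
    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, 5))  # prints u4pru
```

Two feature points then match when their coded values are equal, which means they fall in the same GeoHash cell; a longer code gives a smaller cell and therefore a stricter match.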

Preferably, step S1 further comprises:

performing de-duplication operation on the feature points according to the coding values to obtain de-duplicated feature points, and counting the number of the de-duplicated feature points;

step S2 comprises:

obtaining de-duplicated matched feature point pairs and counting their number n6, wherein each de-duplicated matched feature point pair consists of a de-duplicated feature point of the first stroke and a de-duplicated feature point of the second stroke whose coded values match;

step S3 comprises: calculating the similarity m of the first stroke and the second stroke according to the following formula:

m = n6/max(n4, n5), where n4 is the number of de-duplicated feature points of the first stroke and n5 is the number of de-duplicated feature points of the second stroke.

Preferably, the step of performing a deduplication operation on the encoded value includes:

if the highest p bits of the coded value of the first turning point are the same as the highest p bits of the coded value of the starting point, deleting the first turning point, wherein the first turning point is the turning point whose sampling time is closest to that of the starting point; if the highest p bits of the coded value of the final turning point are the same as the highest p bits of the coded value of the end point, deleting the final turning point, wherein the final turning point is the turning point whose sampling time is closest to that of the end point; p is a positive integer;

if a plurality of adjacent turning points have the same coded value, retaining only one of them.
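The de-duplication rules can be sketched in Python as follows. The sketch assumes the coded values are GeoHash strings listed in sampling order as starting point, turning points, end point, and it models "the highest p bits" as the leading p characters of the string; that modelling choice and the function name `dedup_codes` are assumptions made for illustration.

```python
from typing import List


def dedup_codes(codes: List[str], p: int) -> List[str]:
    """De-duplicate [start, turning points..., end] coded values.

    Assumption: the leading p characters stand in for the highest p bits."""
    if len(codes) < 3:
        return list(codes)  # no turning points to remove
    start, turns, end = codes[0], list(codes[1:-1]), codes[-1]

    # Drop the first turning point if it shares its prefix with the starting point.
    if turns and turns[0][:p] == start[:p]:
        turns.pop(0)
    # Drop the final turning point if it shares its prefix with the end point.
    if turns and turns[-1][:p] == end[:p]:
        turns.pop()

    # Keep only one of each run of adjacent turning points with the same code.
    kept: List[str] = []
    for code in turns:
        if not kept or kept[-1] != code:
            kept.append(code)
    return [start] + kept + [end]


codes = ["wtw3sj", "wtw3sk", "wtw3u1", "wtw3u1", "wtw3u9", "wtw3v2", "wtw3v5"]
print(dedup_codes(codes, p=5))
```

In this example the first turning point shares the prefix `wtw3s` with the starting point and the final turning point shares `wtw3v` with the end point, so both are deleted, and the repeated `wtw3u1` is kept once.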

Preferably, after the step of performing the de-duplication operation on the feature points according to the encoded values to obtain the de-duplicated feature points, step S1 further includes:

sorting the de-duplicated feature points by the size of their coded values according to a preset arrangement rule, so as to obtain reordered de-duplicated feature points;

each de-duplicated matched feature point pair consists of a reordered de-duplicated feature point of the first stroke and a reordered de-duplicated feature point of the second stroke whose coded values match.

Preferably, the step of sorting the feature points after de-duplication according to a preset arrangement rule according to the size of the encoded values of the feature points after de-duplication includes:

S101, setting the starting point as a first reference point and setting the end point as a second reference point;

S102, judging whether the first reference point is adjacent to or coincides with the second reference point; if so, executing step S107, and if not, executing step S103;

S103, judging whether the coded value of the first reference point is equal to the coded value of the second reference point; if so, executing step S104, and if not, executing step S105;

S104, setting the next feature point adjacent to the first reference point as a new first reference point, setting the previous feature point adjacent to the second reference point as a new second reference point, and executing step S102;

S105, judging whether the coded value of the first reference point is larger than the coded value of the second reference point; if so, executing step S106, and if not, executing step S107;

S106, arranging the de-duplicated feature points in reverse order, and executing step S107;

S107, finishing the sorting.
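Steps S101 to S107 can be sketched in Python as below. The sketch assumes the coded values are equal-length GeoHash strings compared lexicographically, which is one possible reading of "larger"; the function name `normalize_direction` is hypothetical. The effect of the rule is that a stroke and the same stroke driven in the opposite direction end up in the same order.

```python
from typing import List, Sequence


def normalize_direction(codes: Sequence[str]) -> List[str]:
    """Return the coded values in the order given by steps S101 to S107."""
    first, second = 0, len(codes) - 1      # S101: start and end as reference points
    while second - first > 1:              # S102: stop when adjacent or coincident
        if codes[first] == codes[second]:  # S103
            first += 1                     # S104: move both reference points inward
            second -= 1
            continue
        if codes[first] > codes[second]:   # S105
            return list(reversed(codes))   # S106: arrange in reverse order
        break
    return list(codes)                     # S107: sorting finished


print(normalize_direction(["c", "a", "b"]))  # prints ['b', 'a', 'c']
```

A sequence whose outer coded values are equal is decided by the first differing pair found while moving inward; if the reference points meet without finding one, the order is left unchanged.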

The invention also provides a method for searching similar strokes, wherein the number of the strokes is k1, and k1 is an integer greater than 2, and the method comprises the following steps:

the stroke similarity obtaining method is adopted to respectively calculate the similarity between every two strokes in k1 strokes; the method further comprises the steps of:

if the similarity between the two strokes is greater than a preset similarity threshold, the two strokes are set as a pair of similar strokes.

Preferably, the method further comprises the steps of:

if the plurality of similar stroke pairs all comprise the same stroke, setting the strokes comprised by the plurality of similar stroke pairs as a similar stroke group, and counting the number of strokes comprised in the similar stroke group.

Preferably, let the number of strokes included in the similar stroke group be k2, the method further comprises the steps of:

the similar stroke ratio q is calculated according to the following formula:

q = k2/k1.
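The search for similar strokes can be sketched in Python as follows. This is an illustration under stated assumptions: each stroke is represented by its list of coded feature points, the stand-in `code_similarity` compares sets of coded values rather than reproducing the full acquisition method, and a similar stroke group is built around one chosen stroke (`hub`) from all similar stroke pairs that contain it. All function names are hypothetical.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Set, Tuple


def code_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Stand-in similarity: shared coded values over the larger set size."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / max(len(sa), len(sb))


def similar_pairs(strokes: Sequence[Sequence[str]],
                  similarity: Callable[[Sequence[str], Sequence[str]], float],
                  threshold: float) -> List[Tuple[int, int]]:
    """Index pairs of strokes whose similarity exceeds the preset threshold."""
    return [(i, j) for i, j in combinations(range(len(strokes)), 2)
            if similarity(strokes[i], strokes[j]) > threshold]


def group_around(pairs: Sequence[Tuple[int, int]], hub: int) -> Set[int]:
    """Strokes contained in the similar stroke pairs that include stroke `hub`."""
    group: Set[int] = set()
    for i, j in pairs:
        if hub in (i, j):
            group.update((i, j))
    return group


strokes = [["a", "b", "c", "d"], ["a", "b", "c", "e"], ["a", "b", "c", "d"], ["x", "y", "z"]]
pairs = similar_pairs(strokes, code_similarity, threshold=0.7)
group = group_around(pairs, hub=0)
k1, k2 = len(strokes), len(group)
print(pairs, group, k2 / k1)  # q = k2/k1
```

Here three of the four strokes form a similar stroke group, so q = 3/4. Computing every pairwise similarity takes k1(k1-1)/2 comparisons, which is why a cheap per-pair similarity matters for massive stroke data.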

the invention also provides a travel similarity acquisition device, wherein the travel comprises a first travel and a second travel, and the acquisition device comprises a feature point extraction unit, a matching feature point acquisition unit and a similarity calculation unit;

the characteristic point extraction unit is used for extracting a plurality of characteristic points of the first stroke, acquiring the position information of each characteristic point of the first stroke and counting the number n1 of the characteristic points of the first stroke; the feature point extraction unit is also used for extracting a plurality of feature points of the second stroke, acquiring the position information of each feature point of the second stroke and counting the number n2 of the feature points of the second stroke; the characteristic points comprise a starting point, a finishing point and a turning point;

the matching feature point acquisition unit is used for acquiring matched feature point pairs and counting the number n3 of matched feature point pairs, wherein each matched feature point pair consists of a feature point of the first stroke and a feature point of the second stroke whose position information matches;

the similarity calculation unit is used for calculating the similarity m of the first stroke and the second stroke according to the following formula:

m = n3/max(n1, n2), where max(n1, n2) denotes the larger of n1 and n2.

Preferably, the stroke similarity obtaining device further comprises a sampling unit;

the sampling unit is used for setting a plurality of sampling moments in a time interval of the vehicle passing through the journey and acquiring position information of sampling points corresponding to each sampling moment in the journey; the sampling points comprise a starting point and an end point;

the characteristic point extraction unit is used for calculating an included angle between a first straight line connecting an ith sampling point and an (i-a) th sampling point and a second straight line connecting the ith sampling point and an (i+b) th sampling point according to the position information of the sampling points, wherein a and b are positive integers;

the feature point extraction unit is further configured to determine whether the included angle belongs to a preset angle interval, and if yes, extract the ith sampling point as a turning point.

Preferably, the obtaining device further comprises a coding unit, wherein the coding unit is used for performing GeoHash coding on the position information of the feature points to generate a coding value;

Preferably, the acquisition device further comprises a deduplication unit,

the de-duplication unit is used for performing de-duplication operation on the feature points according to the coding values so as to obtain de-duplicated feature points;

the matching feature point acquisition unit is further used for acquiring de-duplicated matched feature point pairs and counting their number n6, wherein each de-duplicated matched feature point pair consists of a de-duplicated feature point of the first stroke and a de-duplicated feature point of the second stroke whose coded values match;

the similarity calculation unit is used for calculating the similarity m of the first stroke and the second stroke according to the following formula:

m = n6/max(n4, n5), where n4 is the number of de-duplicated feature points of the first stroke and n5 is the number of de-duplicated feature points of the second stroke.

Preferably, the deduplication unit is further configured to perform deduplication according to the following steps:

Preferably, the acquisition device further comprises a sorting unit,

the sorting unit is used for sorting the characteristic points after the duplication removal according to the size of the coded values of the characteristic points after the duplication removal according to a preset arrangement rule so as to obtain reordered duplication removal characteristic points;

Preferably, the sorting unit is further configured to sort the feature points after the deduplication according to the following steps:

S107, finishing the sorting.

The invention also provides a system for searching similar strokes, wherein the number of the strokes is k1, k1 is an integer greater than 2, and the system comprises the stroke similarity acquisition equipment;

the acquisition equipment is used for calculating the similarity between every two strokes of the k1 strokes respectively; the system further comprises a stroke pair setting device;

the stroke pair setting device is used for setting two strokes as a similar stroke pair when the similarity between the two strokes is greater than a preset similarity threshold.

Preferably, the system further comprises a stroke group setting device;

the stroke group setting device is used for setting the strokes included in the plurality of similar stroke pairs as a similar stroke group when the plurality of similar stroke pairs include the same stroke, and is used for counting the number of the strokes included in the similar stroke group.

Preferably, the system further comprises a ratio calculation device, assuming that the number of strokes included in the similar stroke group is k2;

the ratio calculation device is configured to calculate the similar stroke ratio q according to the following formula:

q = k2/k1.

The positive effects of the invention are as follows: by extracting the feature points of the strokes and calculating the similarity of two strokes from the matched feature points, the invention greatly reduces the amount of calculation and improves the efficiency of both calculating stroke similarity and searching for similar strokes.

Drawings

Fig. 1 is a schematic diagram of the construction of a stroke similarity obtaining apparatus of embodiment 1 of the present invention.

Fig. 2 is a partial schematic view of a vehicle stroke for the stroke similarity acquisition apparatus of embodiment 1 of the present invention.

Fig. 3 is a flowchart of a travel similarity obtaining method according to embodiment 1 of the present invention.

Fig. 4 is a schematic diagram of the structure of the stroke similarity obtaining apparatus of embodiment 2 of the present invention.

Fig. 5 is a schematic diagram of sampling points of the first stroke and the second stroke of the stroke similarity obtaining device of embodiment 2 of the present invention.

Fig. 6 is a schematic diagram of feature points of the first stroke and the second stroke of the stroke similarity obtaining device of embodiment 2 of the present invention.

Fig. 7 is a partial schematic view of the first stroke and the second stroke of the stroke similarity obtaining device of embodiment 2 of the present invention.

Fig. 8 is a flowchart of the sorting step of the travel similarity obtaining apparatus of embodiment 2 of the present invention.

Fig. 9 is a schematic diagram of the system for finding similar strokes according to embodiment 3 of the present invention.

Fig. 10 is a flowchart of a method of finding similar strokes according to embodiment 3 of the present invention.

Detailed Description

The invention is further illustrated by means of the following examples, which are not intended to limit the scope of the invention. The following detailed description presents embodiments disclosed herein in detail. It is to be understood that this description is not limited to the particular embodiments disclosed herein, but is capable of modification. Those skilled in the art will appreciate that there are numerous modifications and variations to the disclosure herein, which fall within the scope and principles of the disclosure. Each embodiment may be combined with any other embodiment, unless otherwise specified.

Example 1

This embodiment provides a stroke similarity acquisition apparatus. The strokes of the vehicle include a first stroke and a second stroke. Referring to Fig. 1, the acquisition apparatus includes a feature point extraction unit 21, a matching feature point acquisition unit 22, and a similarity calculation unit 23.

The characteristic point extraction unit is used for extracting a plurality of characteristic points of the first stroke, acquiring the position information of each characteristic point of the first stroke and counting the number n1 of the characteristic points of the first stroke; the feature point extraction unit is further configured to extract a plurality of feature points of the second stroke, and to obtain position information of each feature point of the second stroke, and to count the number n2 of feature points of the second stroke.

The matching feature point acquisition unit is used for acquiring matched feature point pairs and counting the number n3 of matched feature point pairs; each matched feature point pair consists of a feature point of the first stroke and a feature point of the second stroke whose position information matches.

The similarity calculation unit is used for calculating the similarity m of the first stroke and the second stroke according to the formula m = n3/max(n1, n2), where max(n1, n2) denotes the larger of n1 and n2.

In the present embodiment, the stroke similarity acquisition apparatus further includes a sampling unit 24. The sampling unit is used for setting a plurality of sampling moments in a time interval of the vehicle passing through the journey and acquiring position information of sampling points corresponding to each sampling moment in the journey; the sampling points include a start point and an end point. The feature point extraction unit is used for calculating an included angle between a first straight line connecting an ith sampling point and an (i-a) th sampling point and a second straight line connecting the ith sampling point and an (i+b) th sampling point according to the position information of the sampling points, wherein the (i-a) th sampling point is a sampling point with a sampling moment earlier than a sampling moment corresponding to the ith sampling point, and the (i+b) th sampling point is a sampling point with a sampling moment later than a sampling moment corresponding to the ith sampling point. The feature point extraction unit is used for judging whether the included angle belongs to a preset angle interval, and if so, the ith sampling point is extracted as a turning point.

In this embodiment, in order to improve the accuracy of the turning position recognition, the distance between the i-th sampling point and the (i-a) -th sampling point is not greater than a preset distance, the distance between the i-th sampling point and the (i-a-1) -th sampling point is greater than a preset distance, and the (i-a-1) -th sampling point is a sampling point whose sampling time is adjacent to and earlier than the sampling time corresponding to the (i-a) -th sampling point; the distance between the ith sampling point and the (i+b) th sampling point is not greater than the preset distance, the distance between the ith sampling point and the (i+b+1) th sampling point is greater than the preset distance, and the (i+b+1) th sampling point is a sampling point with a sampling moment adjacent to and later than the sampling moment corresponding to the (i+b) th sampling point. In this embodiment, the preset distance is 50 meters. In other alternative embodiments, the preferred range of the predetermined distance is 20 to 100 meters.

In order to improve the accuracy of turning position identification, the time interval between two adjacent sampling moments ranges from 0.5 to 1.5 seconds. This reduces the data volume and avoids overly dense sampling points, while keeping the distance between adjacent sampling points moderate, which improves identification accuracy and avoids missing turning positions because the sampling points are too far apart. In this embodiment, the time interval between two adjacent sampling moments is 1 second.

As a preferred implementation, in this embodiment an engine sensor is provided on the vehicle. When the engine sensor senses that the engine is ignited, which indicates that the stroke starts, the sampling unit sets that time as the start sampling time; the position of the vehicle at that time is the start sampling point (that is, the first sampling point of the stroke) and is taken as the starting point of the stroke. The sampling unit acquires the position information of the start sampling point through a GPS (Global Positioning System) device provided on the vehicle and adds a corresponding timestamp to it. From the first sampling point onward, the sampling unit sets a sampling time every 1 second, acquires the position information of the sampling point corresponding to each sampling time through the GPS device, and adds a corresponding timestamp to the position information of each sampling point. The position information contains the geographic coordinate data of the sampling point.

In this embodiment, when the engine sensor senses that the engine of the vehicle is turned off, the sampling unit sets that time as the end sampling time; the position of the vehicle at that time is the end sampling point (that is, the last sampling point). The sampling unit acquires the position information of the end sampling point through the GPS device provided on the vehicle and adds a corresponding timestamp to it. In other alternative embodiments, when the engine sensor senses that the engine of the vehicle has remained off for a preset time threshold (for example, 5 minutes), the sampling unit sets the time at which the engine was initially turned off as the end sampling time. Alternatively, a speed sensor is provided on the vehicle, and when the speed sensor detects that the speed of the vehicle has been 0 for the preset time threshold, the sampling unit sets the time at which the engine was initially turned off as the end sampling time.
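The alternative embodiment with a preset time threshold can be sketched as follows. The sketch assumes the engine sensor readings arrive as time-ordered `(timestamp_in_seconds, engine_on)` tuples; that record layout and the function name `find_end_sampling_time` are illustrative assumptions.

```python
from typing import Iterable, Optional, Tuple


def find_end_sampling_time(samples: Iterable[Tuple[float, bool]],
                           threshold_s: float = 300.0) -> Optional[float]:
    """Return the time at which the engine was initially turned off, once it
    has stayed off for at least threshold_s seconds; None if that never happens."""
    off_since: Optional[float] = None
    for timestamp, engine_on in samples:
        if engine_on:
            off_since = None          # engine restarted: the stroke continues
            continue
        if off_since is None:
            off_since = timestamp     # engine just turned off
        if timestamp - off_since >= threshold_s:
            return off_since
    return None


readings = [(0, True), (10, False), (20, True), (100, False), (250, False), (400, False)]
print(find_end_sampling_time(readings))  # prints 100
```

The brief stop at 10 seconds does not end the stroke because the engine restarts before the threshold is reached; the stop that begins at 100 seconds does.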

For each stroke, the feature point extraction unit extracts the position information of the first sampling point of the stroke acquired by the sampling unit as the position information of the start point of the stroke, and extracts the position information of the last sampling point of the stroke acquired by the sampling unit as the position information of the end point of the stroke.

Fig. 2 shows part of a vehicle stroke, in which the position of each sampling point is represented by a small circle. When extracting the turning points of a stroke, the feature point extraction unit processes the sampling points one by one in order of sampling time, starting from the first sampling point. The feature point extraction unit first makes a judgment on the current sampling point (the i-th sampling point P_i) and the sampling points that follow it, P_(i+1), ..., P_(i+b), P_(i+b+1) (b is a positive integer and can be set as required); if the result is affirmative, the possible turning points of the stroke have all been traversed, the identification process ends, and the turning positions identified so far are all the turning positions of the stroke. Otherwise, the following calculation is performed. If the distance D1 between the current sampling point P_i and the (i-a)-th sampling point P_(i-a), which precedes it by a time intervals (a is a positive integer), is less than or equal to the preset distance (50 meters), and the distance D2 between P_i and the (i-a-1)-th sampling point P_(i-a-1), which precedes it by (a+1) time intervals, is greater than the preset distance (50 meters); and the distance D3 between P_i and the (i+b)-th sampling point P_(i+b), which follows it by b time intervals, is less than or equal to the preset distance (50 meters), and the distance D4 between P_i and the (i+b+1)-th sampling point P_(i+b+1), which follows it by (b+1) time intervals, is greater than the preset distance (50 meters), then the i-th sampling point P_i is taken as the target sampling point, and the included angle θ between a first straight line L1 connecting the i-th and (i-a)-th sampling points and a second straight line L2 connecting the i-th and (i+b)-th sampling points is calculated from the geographic position data of the i-th, (i-a)-th and (i+b)-th sampling points.

Next, the feature point extraction unit judges whether the included angle θ belongs to the preset angle interval [60 degrees, 120 degrees]; if so, the feature point extraction unit identifies the i-th sampling point as a turning point. After this identification, the unit judges whether the (i+b)-th sampling point is the last sampling point of the vehicle stroke. If it is, the possible turning points of the stroke have all been traversed and the identification process ends; the turning positions identified are all the turning positions of the stroke. If it is not, then, in order to reduce the amount of calculation and speed up identification, the feature point extraction unit, having identified the position of the i-th sampling point as a turning position, sets the (i+b)-th sampling point P_(i+b) as the new target sampling point (that is, assigns (i+b) to i) and performs the calculation for it. Because the distance between P_i and P_(i+b) lies within the preset reasonable range, assigning (i+b) to i reasonably skips the (i+1)-th to (i+b-1)-th sampling points. If the amount of calculation needs to be reduced and identification sped up, the sampling interval, the value of b and the preset distance can reasonably be set to larger values, so that frequent steering of the vehicle within a small range is not identified. If every turning operation of the vehicle needs to be identified accurately, higher identification accuracy can be obtained by reasonably setting the sampling interval, the value of b and the preset distance to smaller values.

If the feature point extraction unit judges that the included angle θ does not belong to the preset angle interval [60 degrees, 120 degrees], that is, if the included angle θ between the first straight line L1 connecting the i-th and (i-a)-th sampling points and the second straight line L2 connecting the i-th and (i+b)-th sampling points is smaller than 60 degrees or larger than 120 degrees, the feature point extraction unit sets the next sampling point P_(i+1) as the new target sampling point (that is, assigns (i+1) to i) and performs the calculation for the new target sampling point (the (i+1)-th sampling point).

This process repeats until every sampling point in the stroke has been calculated and identified, at which point the extraction of the turning points of the whole stroke is complete.
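The control flow of this traversal, in particular the jump from i to (i+b) after a turning point is found, can be sketched in Python as below. The per-point test is passed in as a hypothetical `probe` function that returns b when the i-th sampling point is a turning point and None otherwise; advancing by one sampling point when the distance conditions are not met is an assumption, since the text only states that case for an angle outside the interval.

```python
from typing import Callable, List, Optional


def scan_turning_points(n: int, probe: Callable[[int], Optional[int]]) -> List[int]:
    """Traverse n sampling points (indices 0 to n-1) and collect turning points.

    probe(i) returns the positive integer b if P_i is a turning point, else None."""
    turning: List[int] = []
    i = 0
    while i < n:
        b = probe(i)
        if b is None:
            i += 1                # not a turning point: move to P_(i+1)
            continue
        if b < 1:
            raise ValueError("b must be a positive integer")
        turning.append(i)
        if i + b >= n - 1:        # P_(i+b) is the last sampling point: done
            break
        i += b                    # skip P_(i+1) to P_(i+b-1)
    return turning


# Hypothetical probe results: index -> b for points that pass the angle test.
windows = {3: 2, 4: 1, 7: 2}
print(scan_turning_points(10, windows.get))  # prints [3, 7]
```

In the example, index 4 would pass the test but is never examined, because the traversal jumps from index 3 to index 5 after identifying the turning point at index 3.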

The starting point, turning points and end point of a stroke constitute its feature points. After extracting the starting point, turning points and end point of a vehicle stroke, the feature point extraction unit arranges those of the first stroke in order of sampling time. Suppose the first stroke has 5 feature points in total, A1, A2, A3, A4 and A5, whose geographic coordinates are (x1, y1), (x2, y2), (x3, y3), (x4, y4) and (x5, y5) respectively. The feature point extraction unit likewise arranges the starting point, turning points and end point of the second stroke in order of sampling time. Suppose the second stroke has 6 feature points in total, B1, B2, B3, B4, B5 and B6, whose geographic coordinates are (x1, y1), (x7, y7), (x3, y3), (x4, y4), (x6, y6) and (x5, y5) respectively. The matching feature point acquisition unit finds that A1 and B1 have the same geographic coordinates and form a matched feature point pair; likewise A3 and B3, A4 and B4, and A5 and B6. The unit therefore counts 4 matched feature point pairs between the first stroke and the second stroke.

Then the similarity calculation unit takes the number of feature points of the first trip, 5 (n1), the number of feature points of the second trip, 6 (n2), and the number of matched feature point pairs, 4 (n3), and applies the formula:

m = n3/max(n1, n2),

which gives a similarity between the first trip and the second trip of m = 4/6 = 2/3, or approximately 66.67%.
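The worked example above can be reproduced with a minimal Python sketch. The function name `trip_similarity` and the numeric coordinates are illustrative assumptions (the text gives only symbolic coordinates such as (x1, y1)), and exact coordinate equality stands in for the position match.

```python
from collections import Counter

def trip_similarity(points_a, points_b):
    """Return (n3, m), where m = n3 / max(n1, n2)."""
    n1, n2 = len(points_a), len(points_b)
    if max(n1, n2) == 0:
        return 0, 0.0
    # Multiset intersection: each feature point joins at most one matched pair.
    n3 = sum((Counter(points_a) & Counter(points_b)).values())
    return n3, n3 / max(n1, n2)

# Hypothetical coordinates standing in for (x1, y1) ... (x7, y7).
trip_a = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]          # A1..A5
trip_b = [(1, 1), (7, 7), (3, 3), (4, 4), (6, 6), (5, 5)]  # B1..B6

n3, m = trip_similarity(trip_a, trip_b)
print(n3, round(m, 4))
```

With these inputs the sketch counts 4 matched pairs out of max(5, 6) = 6 feature points, which is the 2/3 similarity stated above.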

The feature points of a vehicle trip, namely the start point, the turning points and the end point, accurately reflect the shape of the trip. By extracting the feature points of each vehicle trip and calculating the trip similarity from those feature points, the trip similarity obtaining apparatus of this embodiment avoids a large amount of calculation on the portions of the trip where the vehicle travels in a straight line, which greatly reduces the amount of calculation and improves the efficiency of calculating trip similarity.

This embodiment also provides a trip similarity obtaining method, which is implemented using the trip similarity obtaining apparatus of this embodiment. The trips include a first trip and a second trip. Referring to fig. 4, the obtaining method includes the following steps:

Step S300, setting a plurality of sampling moments within the time interval in which the vehicle travels the trip, and acquiring the position information of the sampling point corresponding to each sampling moment in the trip. The sampling points include the start point and the end point of the trip.

Step S301, extracting a plurality of feature points of the first trip and a plurality of feature points of the second trip. The position information of each feature point of the first trip is acquired, and the number n1 of feature points of the first trip is counted; the position information of each feature point of the second trip is acquired, and the number n2 of feature points of the second trip is counted. The feature points include a start point, an end point and turning points.

Step S302, counting the number n3 of matched feature point pairs. A matched feature point pair consists of a feature point of the first trip and a feature point of the second trip whose position information matches.

Step S303, calculating the similarity between the first trip and the second trip. Specifically, the similarity m between the first trip and the second trip is calculated according to the following formula:

m = n3/max(n1, n2).

As a preferred embodiment, an engine sensor is provided on the vehicle. In step S300, when the engine sensor senses that the engine has been ignited, which indicates that the trip has started, the sampling unit sets the moment at which the trip starts as the initial sampling moment (that is, the first sampling point of the trip) and takes the position of the vehicle at that moment as the start point of the trip. The sampling unit acquires the position information of the vehicle at the initial sampling point through a GPS device provided on the vehicle and adds the corresponding timestamp to the position information. Starting from the first sampling point, the sampling unit sets a sampling moment every 1 second, acquires the position information of the sampling point corresponding to each sampling moment through the GPS device, and adds the corresponding timestamp to the position information of each sampling point. The position information contains the geographic coordinate data of the sampling point.

In step S301, for each trip, the feature point extraction unit takes the position information of the first sampling point of the trip acquired by the sampling unit as the position information of the start point of the trip, and takes the position information of the last sampling point of the trip acquired by the sampling unit as the position information of the end point of the trip.

FIG. 2 illustrates a portion of a vehicle trip, in which the positions of the sampling points are represented by small circles. To extract the turning points of the trip, the feature point extraction unit evaluates each sampling point in turn, starting from the first sampling point, in order of sampling moment. The feature point extraction unit first determines whether the last sampling point of the trip is among the current sampling point (the i-th sampling point P_i) and the sampling points that follow it, P_i, P_(i+1), ..., P_(i+b), P_(i+(b+1)) (b is a positive integer and can be set as required). If so, all possible turning points in the trip have been traversed, the identification process ends, and the turning positions identified so far are all the turning positions in the trip. Otherwise, the following calculation is performed. Suppose that the distance D1 between the current sampling point P_i and the (i-a)-th sampling point P_(i-a), which precedes it by a time intervals (a is a positive integer), is less than or equal to a preset distance (50 meters), and that the distance D2 between P_i and the (i-a-1)-th sampling point P_(i-(a+1)), which precedes it by (a+1) time intervals, is greater than the preset distance (50 meters). Suppose further that the distance D3 between P_i and the (i+b)-th sampling point P_(i+b), which follows it by b time intervals, is less than or equal to the preset distance (50 meters), and that the distance D4 between P_i and the (i+b+1)-th sampling point P_(i+(b+1)), which follows it by (b+1) time intervals, is greater than the preset distance (50 meters). In that case the i-th sampling point is taken as the target sampling point, and the included angle θ between the first straight line L1 connecting the i-th sampling point and the (i-a)-th sampling point and the second straight line L2 connecting the i-th sampling point and the (i+b)-th sampling point is calculated from the geographic position data of the i-th, (i-a)-th and (i+b)-th sampling points.

Next, the feature point extraction unit determines whether the included angle θ belongs to the preset angle interval [60 degrees, 120 degrees]. If so, the feature point extraction unit identifies the i-th sampling point as a turning point. After this identification, the feature point extraction unit further determines whether the (i+b)-th sampling point is the last sampling point of the vehicle trip. If it is, all possible turning points in the trip have been traversed, the identification process ends, and the turning positions identified so far are all the turning positions in the trip. If it is not, then, to reduce the amount of calculation and speed up identification, after identifying the position corresponding to the i-th sampling point as a turning position the feature point extraction unit sets the (i+b)-th sampling point P_(i+b) as the new target sampling point (i.e., assigns (i+b) to i) and performs the calculation for the new target sampling point. Because the distance between the i-th sampling point P_i and the (i+b)-th sampling point P_(i+b) falls within the preset reasonable range, the (i+1)-th to (i+b-1)-th sampling points can reasonably be skipped by assigning (i+b) to i. If the amount of calculation needs to be reduced and the identification speed increased, the sampling interval, the value of b and the preset distance can be set to suitably larger values, so that frequent steering of the vehicle within a small area is not identified. If every turning operation of the vehicle needs to be identified accurately, the sampling interval, the value of b and the preset distance can be set to suitably smaller values to obtain higher identification accuracy.

This process is repeated until every sampling point in the trip has been calculated and identified, which completes the extraction of the turning points of the whole trip. The start point, turning points and end point of the trip constitute the feature points of the trip.
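The turning-point extraction described above can be sketched roughly as follows. This is an illustrative reading rather than the patented implementation. It assumes the sampling points have already been projected to planar (x, y) coordinates in metres, and it assumes that a and b are found per sampling point as the farthest consecutive neighbours still within the preset distance. The names `find_turning_points`, `_reach` and `_angle_at` are hypothetical, and the sketch simply scans to the end of the trip instead of stopping early.

```python
import math

def _dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def _reach(points, i, step, max_dist):
    """Index of the farthest consecutive neighbour of i (direction `step`)
    within max_dist, or None if no sample beyond the radius exists."""
    j = i + step
    while 0 <= j < len(points) and _dist(points[i], points[j]) <= max_dist:
        j += step
    if not 0 <= j < len(points):
        return None              # ran off the trip: D2 or D4 is undefined
    k = j - step                 # last sample still within the radius
    return k if k != i else None

def _angle_at(p, q, r):
    """Angle in degrees at p between the rays p->q and p->r (0..180)."""
    a1 = math.atan2(q[1] - p[1], q[0] - p[0])
    a2 = math.atan2(r[1] - p[1], r[0] - p[0])
    d = abs(math.degrees(a1 - a2)) % 360.0
    return 360.0 - d if d > 180.0 else d

def find_turning_points(points, max_dist=50.0, angle_range=(60.0, 120.0)):
    """Return indices of sampling points identified as turning points."""
    turns, i = [], 1
    while i < len(points) - 1:
        back = _reach(points, i, -1, max_dist)   # plays the role of P_(i-a)
        fwd = _reach(points, i, +1, max_dist)    # plays the role of P_(i+b)
        if back is not None and fwd is not None:
            theta = _angle_at(points[i], points[back], points[fwd])
            if angle_range[0] <= theta <= angle_range[1]:
                turns.append(i)
                i = fwd          # assign (i+b) to i, skipping nearby samples
                continue
        i += 1                   # otherwise assign (i+1) to i
    return turns

# Hypothetical L-shaped trip sampled every 10 m: east for 200 m, then north.
path = [(float(x), 0.0) for x in range(0, 201, 10)]
path += [(200.0, float(y)) for y in range(10, 201, 10)]
print(find_turning_points(path))
```

On a straight line the angle at every sampling point is 180 degrees, so no turning point is reported; on the L-shaped path a single turning point is reported near the corner.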

After extracting the start point, turning points and end point of each vehicle trip, the feature point extraction unit arranges the start point, turning points and end point of the first trip in order of sampling time. Suppose the first trip has 5 feature points, A1, A2, A3, A4 and A5, whose geographic coordinates are (x1, y1), (x2, y2), (x3, y3), (x4, y4) and (x5, y5) respectively. The feature point extraction unit likewise arranges the start point, turning points and end point of the second trip in order of sampling time. Suppose the second trip has 6 feature points, B1, B2, B3, B4, B5 and B6, whose geographic coordinates are (x1, y1), (x7, y7), (x3, y3), (x4, y4), (x6, y6) and (x5, y5) respectively. In step S302, the matching feature point acquisition unit finds that A1 and B1 have the same geographic coordinates and form a matched feature point pair; A3 and B3 have the same geographic coordinates and form a matched feature point pair; A4 and B4 have the same geographic coordinates and form a matched feature point pair; and A5 and B6 have the same geographic coordinates and form a matched feature point pair. The matching feature point acquisition unit therefore counts 4 matched feature point pairs between the first trip and the second trip.

Then, in step S303, the similarity calculation unit takes the number of feature points of the first trip, 5 (n1), the number of feature points of the second trip, 6 (n2), and the number of matched feature point pairs, 4 (n3), and applies the formula:

m = n3/max(n1, n2) = 4/6 = 2/3, or approximately 66.67%.

Example 2

On the basis of embodiment 1, this embodiment provides a trip similarity obtaining apparatus; fig. 4 shows a schematic structural diagram of the trip similarity obtaining apparatus.

In this embodiment, the sampling points of the first trip and the second trip acquired by the sampling unit are shown in fig. 5. For ease of illustration, the first trip is drawn as a solid line, the second trip as a dashed line, and all sampling points as small circles. Sampling point J1S is the start point of the first trip, sampling points J1T1, J1T2, J1T3 and J1T4 are the 4 turning points of the first trip, and sampling point J1E is the end point of the first trip; sampling point J2S is the start point of the second trip, sampling points J2T1, J2T2, J2T3, J2T4 and J2T5 are the 5 turning points of the second trip, and sampling point J2E is the end point of the second trip.

The feature points extracted by the feature point extraction unit are shown in fig. 6, which retains only the extracted feature points. For the first trip, only the start point J1S, the 4 turning points J1T1, J1T2, J1T3 and J1T4, and the end point J1E are retained; for the second trip, only the start point J2S, the 5 turning points J2T1, J2T2, J2T3, J2T4 and J2T5, and the end point J2E are retained.

The trip similarity obtaining apparatus of this embodiment further includes an encoding unit 25, which performs GeoHash encoding on the position information of the feature points to generate code values. Fig. 7 shows part of the first trip and part of the second trip, where J1S, J1T1 and J1T2 are 3 successively adjacent feature points of the first trip, and J2E, J2T1 and J2T2 are 3 successively adjacent feature points of the second trip. The exact geographic coordinates of J1S and J2E, of J1T1 and J2T1, and of J1T2 and J2T2 differ from each other. However, because the road on which the vehicle travels has a certain width, the routes do not coincide exactly when the vehicle travels the same road several times, yet they should be evaluated as the same trip. Under GeoHash encoding, all points within the same rectangular area of the map have the same GeoHash code value. Therefore, after the position information of the feature points is GeoHash encoded, J1T1 and J2T1 have the same code value, J1T2 and J2T2 have the same code value, and J1S and J2E have the same code value. The matching feature point acquisition unit then obtains matched feature point pairs based on the code values of the feature points, that is, a matched feature point pair consists of a feature point of the first trip and a feature point of the second trip whose code values match. Accordingly, J1T1 and J2T1 form a matched feature point pair, J1T2 and J2T2 form a matched feature point pair, and J1S and J2E form a matched feature point pair. In this way, the natural deviation of the vehicle when it travels the same road does not affect the calculation of trip similarity, and a trip similarity of practical reference value is obtained.
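To make the encoding step concrete, the following is a minimal sketch of the standard GeoHash algorithm (interleaved longitude and latitude bisection, base-32 output). The function name `geohash_encode` and the sample coordinates are illustrative assumptions; the text does not specify how the encoding unit is implemented.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, precision):
    """Encode a latitude/longitude pair as a GeoHash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, bit_count = [], 0, 0
    use_lon = True                      # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits, lon_lo = (bits << 1) | 1, mid
            else:
                bits, lon_hi = bits << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits, lat_lo = (bits << 1) | 1, mid
            else:
                bits, lat_hi = bits << 1, mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:              # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

# Hypothetical sample position, encoded at two different lengths.
print(geohash_encode(57.64911, 10.40744, 6))
print(geohash_encode(57.64911, 10.40744, 7))
```

A longer code is always an extension of the shorter code for the same position, which is why comparing the leading characters of two codes tells whether the points fall in the same larger cell.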

In this embodiment, the start point and the end point of a trip are each represented by a 6-character GeoHash code value, and each turning point is represented by a 7-character GeoHash code value. In other alternative embodiments, the number of characters of the code value may be set as required; the more characters, the smaller the rectangular area represented by a single GeoHash code value. After the encoding unit performs GeoHash encoding on the position information of the feature points of the first trip and the second trip, the code values corresponding to the feature points are as follows:

In the first trip: start point J1S: wx2g0d; sampling point J1T1: wx2g0d1; sampling point J1T2: wx2ed1d; sampling point J1T3: wx2ed1q; sampling point J1T4: wx2ed1q; end point J1E: wx2ecf.

In the second trip: start point J2S: wx2ecf; sampling point J2T5: wx2ecd1; sampling point J2T4: wx2ed1q; sampling point J2T3: wx2ed1q; sampling point J2T2: wx2ed1d; sampling point J2T1: wx2g0d1; end point J2E: wx2g0d.

Further, in this embodiment, the obtaining apparatus also includes a de-duplication unit 26. The de-duplication unit de-duplicates the feature points according to their code values to obtain de-duplicated feature points, and also counts the number of de-duplicated feature points.

As a preferred embodiment, the deduplication unit performs deduplication according to the following steps:

If the leading p characters of the code value of the first turning point are the same as the leading p characters of the code value of the start point, the first turning point is deleted, where the first turning point is the turning point whose sampling moment is closest to that of the start point. If the leading p characters of the code value of the final turning point are the same as the leading p characters of the code value of the end point, the final turning point is deleted, where the final turning point is the turning point whose sampling moment is closest to that of the end point. Here p is a positive integer. If several adjacent turning points have the same code value, only one of them is retained.

In this embodiment, p is 6. For the first trip, sampling point J1T1 is the turning point adjacent to the start point J1S, and the leading 6 characters of their code values are the same, so the de-duplication unit deletes J1T1 and retains the start point J1S. Sampling points J1T3 and J1T4 are adjacent turning points with the same code value, so the de-duplication unit retains only one of them, for example J1T3. For the second trip, sampling point J2T1 is the turning point adjacent to the end point J2E, and the leading 6 characters of their code values are the same, so the de-duplication unit deletes J2T1 and retains the end point J2E. Sampling points J2T3 and J2T4 are adjacent turning points with the same code value, so the de-duplication unit retains only one of them, for example J2T3.

After the de-duplication operation, the de-duplicated feature points of each trip are as follows:

The first trip has 4 (n4) de-duplicated feature points: start point J1S: wx2g0d; sampling point J1T2: wx2ed1d; sampling point J1T3: wx2ed1q; end point J1E: wx2ecf.

The second trip has 5 (n5) de-duplicated feature points: start point J2S: wx2ecf; sampling point J2T5: wx2ecd1; sampling point J2T3: wx2ed1q; sampling point J2T2: wx2ed1d; end point J2E: wx2g0d.
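The de-duplication rules above can be sketched in a few lines of Python. The function name `deduplicate` is hypothetical, and the order in which the three rules are applied is an assumption; the code lists are the ones given in the text, with p = 6.

```python
def deduplicate(codes, p=6):
    """codes: GeoHash values ordered by sampling time, start point first,
    end point last. Returns the de-duplicated list."""
    start, *turns, end = codes
    # Drop the first turning point if it falls in the same cell as the start.
    if turns and turns[0][:p] == start[:p]:
        turns = turns[1:]
    # Drop the final turning point if it falls in the same cell as the end.
    if turns and turns[-1][:p] == end[:p]:
        turns = turns[:-1]
    # Keep only one of any run of adjacent turning points with equal codes.
    kept = []
    for code in turns:
        if not kept or kept[-1] != code:
            kept.append(code)
    return [start, *kept, end]

first_trip = ["wx2g0d", "wx2g0d1", "wx2ed1d", "wx2ed1q", "wx2ed1q", "wx2ecf"]
second_trip = ["wx2ecf", "wx2ecd1", "wx2ed1q", "wx2ed1q", "wx2ed1d",
               "wx2g0d1", "wx2g0d"]

print(deduplicate(first_trip))   # 4 feature points remain (n4)
print(deduplicate(second_trip))  # 5 feature points remain (n5)
```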

It can be seen that the first trip and the second trip run in opposite directions. When trips are used to evaluate the driving habits of a driver, trips along the same route but in opposite directions are regarded as the same trip. Therefore, for ease of calculation, the trip similarity obtaining apparatus of this embodiment further includes a sorting unit 27. The sorting unit sorts the de-duplicated feature points according to the magnitude of their code values under a preset arrangement rule, so as to obtain reordered de-duplicated feature points. In this embodiment, referring to fig. 8, the sorting unit sorts the de-duplicated feature points according to the following steps:

Step S11, setting the start point as a first reference point and setting the end point as a second reference point;

Step S12, judging whether the first reference point is adjacent to or coincides with the second reference point; if so, executing step S17, and if not, executing step S13;

Step S13, judging whether the code value of the first reference point is equal to the code value of the second reference point; if so, executing step S14, and if not, executing step S15;

Step S14, setting the next de-duplicated feature point adjacent to the first reference point as a new first reference point, setting the previous de-duplicated feature point adjacent to the second reference point as a new second reference point, and executing step S12;

Step S15, judging whether the code value of the first reference point is larger than the code value of the second reference point; if so, executing step S16, and if not, executing step S17;

Step S16, arranging the de-duplicated feature points in reverse order, and executing step S17;

Step S17, ending the sorting.

For the first trip, the sorting unit first sets the start point J1S as the first reference point and the end point J1E as the second reference point, and compares the code value of J1S with the code value of J1E. Because the former is larger than the latter, the sorting ends, that is, the original order of the 4 de-duplicated feature points of the first trip is kept.

For the second trip, the sorting unit sets the start point J2S as the first reference point and the end point J2E as the second reference point, and compares the code value of J2S with the code value of J2E. Because the former is smaller than the latter, the sorting unit arranges the 5 de-duplicated feature points of the second trip in reverse order, which gives the reordered de-duplicated feature points: end point J2E: wx2g0d; sampling point J2T2: wx2ed1d; sampling point J2T3: wx2ed1q; sampling point J2T5: wx2ecd1; start point J2S: wx2ecf.
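The reordering of the two trips can be sketched as below. The function name `normalize_direction` is hypothetical, code values are assumed to be compared as plain strings, and the comparison direction follows the two worked examples (the order is reversed when the first reference code is smaller than the second), which is an assumption about how step S15 is meant to be read.

```python
def normalize_direction(codes):
    """Return the de-duplicated codes in a direction-independent order."""
    lo, hi = 0, len(codes) - 1            # S11: start and end as reference points
    while hi - lo > 1:                    # S12: stop when adjacent or coincident
        if codes[lo] == codes[hi]:        # S13: equal codes decide nothing
            lo, hi = lo + 1, hi - 1       # S14: move both reference points inward
            continue
        if codes[lo] < codes[hi]:         # S15: direction assumed from the examples
            return list(reversed(codes))  # S16: reverse the order
        break
    return list(codes)                    # S17: sorting ends

first_trip = ["wx2g0d", "wx2ed1d", "wx2ed1q", "wx2ecf"]
second_trip = ["wx2ecf", "wx2ecd1", "wx2ed1q", "wx2ed1d", "wx2g0d"]

print(normalize_direction(first_trip))   # order kept
print(normalize_direction(second_trip))  # order reversed
```

After this step both trips begin with wx2g0d and end with wx2ecf, so trips along the same route in opposite directions can be compared element by element.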

Through the de-duplication operation, the influence of multiple steering operations of the vehicle within a limited area can be eliminated. For example, after a vehicle is started it may make several turns inside a residential area, and relative to the whole trip of the vehicle that residential area can be regarded as the start point of the trip. The de-duplication operation therefore removes the turning points generated when the vehicle turns several times inside the residential area, which reduces the subsequent amount of calculation, improves the accuracy of the trip similarity calculation, and makes the calculated trip similarity more useful in practice.

Next, the matching feature point acquisition unit obtains the de-duplicated matched feature point pairs. A de-duplicated matched feature point pair consists of a reordered de-duplicated feature point of the first trip and a reordered de-duplicated feature point of the second trip whose code values match. In this embodiment, the start point J1S and the end point J2E form a de-duplicated matched feature point pair; sampling points J1T2 and J2T2 form a de-duplicated matched feature point pair; sampling points J1T3 and J2T3 form a de-duplicated matched feature point pair; and the end point J1E and the start point J2S form a de-duplicated matched feature point pair. The number of de-duplicated matched feature point pairs of the first trip and the second trip is therefore 4 (n3).

Then the similarity calculation unit calculates the similarity m of the first trip and the second trip according to the following formula: m = n3/max(n4, n5) = 4/5 = 80%.

In other optional embodiments, after the sorting unit performs the sorting operation, the code values of the reordered de-duplicated feature points of the first trip are concatenated in order to form a first string a: {wx2g0d, wx2ed1d, wx2ed1q, wx2ecf}, and the code values of the reordered de-duplicated feature points of the second trip are concatenated in order to form a second string b: {wx2g0d, wx2ed1d, wx2ed1q, wx2ecd1, wx2ecf}. The matching feature point acquisition unit obtains the number of "non-matching feature points" by calculating the edit distance (i.e., the Levenshtein distance) L(a, b) between the first string a and the second string b. In this alternative embodiment, each code value is treated as a single element when the edit distance L(a, b) is calculated. Here, one element (i.e., one code value), "wx2ecd1", has to be inserted at the corresponding position in the first string a to obtain the second string b, so the edit distance between the first string a and the second string b is 1. Since each code value corresponds to one feature point, the edit distance L(a, b) is in effect the number of "non-matching feature points" of the first trip and the second trip, that is, the feature points of the two trips other than those forming matched feature point pairs. Let the number of matched feature point pairs be n3, the number of reordered de-duplicated feature points of the first trip be n4, and the number of reordered de-duplicated feature points of the second trip be n5; the matching feature point acquisition unit obtains the number n3 of matched feature point pairs according to the following formula:

n3 = max(n4, n5) - L(a, b).

In this alternative embodiment, the similarity calculation unit calculates the similarity m of the first trip and the second trip according to the following formula:

m = n3/max(n4, n5),

i.e.

m = (max(n4, n5) - L(a, b))/max(n4, n5).
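The edit-distance variant can be checked with a short sketch that treats each code value as one element. The names `edit_distance` and `loose_similarity` are hypothetical; the distance is the standard Levenshtein recurrence applied to lists instead of characters, and the similarity is n3 divided by max(n4, n5) as in the preceding embodiment.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, one code value per element."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # delete x
                           cur[j - 1] + 1,            # insert y
                           prev[j - 1] + (x != y)))   # substitute or keep
        prev = cur
    return prev[-1]

def loose_similarity(a, b):
    """Return (n3, m) with n3 = max(n4, n5) - L(a, b) and m = n3 / max(n4, n5)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0, 0.0
    n3 = longest - edit_distance(a, b)
    return n3, n3 / longest

a = ["wx2g0d", "wx2ed1d", "wx2ed1q", "wx2ecf"]
b = ["wx2g0d", "wx2ed1d", "wx2ed1q", "wx2ecd1", "wx2ecf"]
print(edit_distance(a, b), loose_similarity(a, b))
```

For the two strings in the text the distance is 1, so n3 = 5 - 1 = 4 and the similarity is 4/5, in agreement with the 80% obtained by direct matching.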

On the basis of this alternative embodiment, the trip similarity obtaining apparatus of the present invention has a further embodiment, in which the matching feature point acquisition unit first determines whether the first code value in the first string a is the same as the first code value in the second string b, and whether the last code value in the first string a is the same as the last code value in the second string b. If either pair differs, at least one of the start points or the end points of the first trip and the second trip does not match. In that case the matching feature point acquisition unit outputs 0 as the number of matched feature point pairs, and the similarity calculation unit calculates the similarity of the first trip and the second trip as 0.

In this embodiment, if the matching feature point acquisition unit determines that the first code value in the first string a is the same as the first code value in the second string b, and that the last code value in the first string a is the same as the last code value in the second string b, it obtains the number of "non-matching feature points" by calculating the edit distance L(a, b) between the first string a and the second string b. Each code value is treated as a single element when the edit distance L(a, b) is calculated. Let the number of matched feature point pairs be n3, the number of reordered de-duplicated feature points of the first trip be n4, and the number of reordered de-duplicated feature points of the second trip be n5; the matching feature point acquisition unit obtains the number n3 of matched feature point pairs according to the following formula:

n3 = max(n4, n5) - L(a, b).

In this embodiment, the similarity calculation unit calculates the similarity m of the first trip and the second trip according to the following formula:

m = n3/max(n4, n5),

i.e.

m = (max(n4, n5) - L(a, b))/max(n4, n5).
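The stricter variant, which returns 0 as soon as the start or end cells differ, can be sketched as a thin wrapper. The name `strict_similarity` is hypothetical, and the element-wise edit distance is passed in as a function so that any Levenshtein implementation over lists can be used; the stand-in lambda in the usage lines is only a placeholder for such an implementation.

```python
def strict_similarity(a, b, edit_distance):
    """a, b: reordered, de-duplicated code sequences of the two trips.
    edit_distance: callable returning the element-wise Levenshtein distance."""
    if not a or not b:
        return 0.0
    # Different start cell or different end cell: n3 = 0, so similarity is 0.
    if a[0] != b[0] or a[-1] != b[-1]:
        return 0.0
    longest = max(len(a), len(b))
    n3 = longest - edit_distance(a, b)    # n3 = max(n4, n5) - L(a, b)
    return n3 / longest

a = ["wx2g0d", "wx2ed1d", "wx2ed1q", "wx2ecf"]
b = ["wx2g0d", "wx2ed1d", "wx2ed1q", "wx2ecd1", "wx2ecf"]
other_end = b[:-1] + ["wx2ecg"]           # hypothetical trip ending in another cell

# Placeholder distance: the text states that L(a, b) is 1 for these strings.
print(strict_similarity(a, b, lambda x, y: 1))
print(strict_similarity(a, other_end, lambda x, y: 1))
```

Checking the endpoints first avoids computing the edit distance for trips that cannot count as the same route under the strict criterion.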

After the feature points of the trips are extracted, the trip similarity can be calculated from them under either a strict or a loose criterion; the specific criterion can be set according to actual requirements.

On the basis of the trip similarity obtaining method of embodiment 1, this embodiment also provides a trip similarity obtaining method, which is implemented using the trip similarity obtaining apparatus of this embodiment.

In step S300, the sampling points of the first trip and the second trip acquired by the sampling unit are shown in fig. 5. For ease of illustration, the first trip is drawn as a solid line, the second trip as a dashed line, and all sampling points as small circles. Sampling point J1S is the start point of the first trip, sampling points J1T1, J1T2, J1T3 and J1T4 are the 4 turning points of the first trip, and sampling point J1E is the end point of the first trip; sampling point J2S is the start point of the second trip, sampling points J2T1, J2T2, J2T3, J2T4 and J2T5 are the 5 turning points of the second trip, and sampling point J2E is the end point of the second trip.

In step S301, the feature points extracted by the feature point extraction unit are shown in fig. 6, which retains only the extracted feature points. For the first trip, only the start point J1S, the 4 turning points J1T1, J1T2, J1T3 and J1T4, and the end point J1E are retained; for the second trip, only the start point J2S, the 5 turning points J2T1, J2T2, J2T3, J2T4 and J2T5, and the end point J2E are retained.

In this embodiment, step S301 further includes: performing GeoHash encoding on the position information of the feature points to generate code values.

Further, in this embodiment, step S301 further includes: de-duplicating the feature points according to the code values to obtain de-duplicated feature points, and counting the number of de-duplicated feature points.

As a preferred embodiment, the deduplication operation comprises the steps of:

It can be seen that the first trip and the second trip run in opposite directions. When trips are used to evaluate the driving habits of a driver, trips along the same route but in opposite directions are regarded as the same trip. Therefore, for ease of calculation, after the de-duplication operation, step S301 further includes: sorting the de-duplicated feature points according to the magnitude of their code values under a preset arrangement rule, so as to obtain reordered de-duplicated feature points. In this embodiment, referring to fig. 8, the step of sorting the de-duplicated feature points includes:

Step S17, ending the sorting.

Next, in step S302, the matching feature point acquisition unit obtains the de-duplicated matched feature point pairs. A de-duplicated matched feature point pair consists of a reordered de-duplicated feature point of the first trip and a reordered de-duplicated feature point of the second trip whose code values match. In this embodiment, the start point J1S and the end point J2E form a de-duplicated matched feature point pair; sampling points J1T2 and J2T2 form a de-duplicated matched feature point pair; sampling points J1T3 and J2T3 form a de-duplicated matched feature point pair; and the end point J1E and the start point J2S form a de-duplicated matched feature point pair. The number of de-duplicated matched feature point pairs of the first trip and the second trip is therefore 4 (n3).

Then, in step S303, the similarity calculation unit calculates the similarity m of the first trip and the second trip according to the following formula: m = n3/max(n4, n5) = 4/5 = 80%.

In other optional embodiments, after the sorting unit performs the sorting operation, the encoded values corresponding to the reordered deduplication feature points of the first stroke are connected in sequence to form a first character string a: { wx2g0d, wx2ed1d, wx2ed1q, wx2ecf }, and the encoded values corresponding to the reordered deduplication feature points of the second stroke are connected in sequence to form a second character string b: { wx2g0d, wx2ed1d, wx2ed1q, wx2ecd1, wx2ecf }. Then, in step S302, the matching feature point acquisition unit obtains the number of "non-matching feature points" by calculating the edit distance (i.e., the Levenshtein distance) L(a, b) between the first character string a and the second character string b. In this alternative embodiment, each encoded value is treated as a single element when calculating L(a, b). Here, one element (the encoded value "wx2ecd1") must be inserted at the corresponding position in a to obtain b, so the edit distance between a and b is 1. Since each encoded value corresponds to one feature point, L(a, b) is in effect the number of "non-matching feature points" of the first and second strokes, that is, the feature points other than those forming matching feature point pairs. Let the number of matching feature point pairs be n3, the number of reordered deduplication feature points of the first stroke be n4, and the number of reordered deduplication feature points of the second stroke be n5; the matching feature point acquisition unit then obtains n3 according to the following formula:

n3 = max(n4, n5) - L(a, b).

In this alternative embodiment, in step S303, the similarity calculation unit calculates the similarity m of the first stroke and the second stroke according to the following formula:

m = (max(n4, n5) - L(a, b))/max(n4, n5),

i.e.

m = 1 - L(a, b)/max(n4, n5).
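The edit-distance variant can be sketched as follows. This is a minimal illustration, assuming each stroke is given as a list of encoded values and that each encoded value counts as one element of the Levenshtein distance; the function names are hypothetical.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance in which each encoded value is one element."""
    previous = list(range(len(b) + 1))
    for i, code_a in enumerate(a, start=1):
        current = [i]
        for j, code_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (code_a != code_b)
            current.append(min(previous[j] + 1,      # delete from a
                               current[j - 1] + 1,   # insert into a
                               substitution))        # replace or keep
        previous = current
    return previous[-1]


def similarity_from_edit_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """m = n3 / max(n4, n5) with n3 = max(n4, n5) - L(a, b)."""
    largest = max(len(a), len(b))
    if largest == 0:
        return 0.0
    n3 = largest - edit_distance(a, b)
    return n3 / largest


a = ["wx2g0d", "wx2ed1d", "wx2ed1q", "wx2ecf"]
b = ["wx2g0d", "wx2ed1d", "wx2ed1q", "wx2ecd1", "wx2ecf"]
print(edit_distance(a, b))                  # 1
print(similarity_from_edit_distance(a, b))  # 0.8
```

Unlike a plain overlap count, the edit distance is sensitive to the order of the encoded values, which is why the sequences are brought into a common direction before this step.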

Example 3

Referring to fig. 9, the present embodiment provides a system for finding similar strokes, which includes the stroke similarity obtaining device 501 of embodiment 1 or embodiment 2. Let the number of strokes be k1, where k1 is an integer greater than 2. The stroke similarity obtaining device 501 is configured to calculate the similarity between every two strokes among the k1 strokes (i.e., the pairwise similarity).

The system for finding similar strokes of this embodiment also includes a stroke pair setting device 502. The stroke pair setting device is used for setting two strokes as a similar stroke pair when the similarity between the two strokes is greater than a preset similarity threshold.

The system for finding similar strokes further comprises a stroke group setting device 503. The stroke group setting device is used for setting the strokes included in a plurality of similar stroke pairs as a similar stroke group when the plurality of similar stroke pairs include the same stroke, and for counting the number of strokes included in the similar stroke group.

In practice, it is assumed that k1 is 4 and that the 4 strokes are J1, J2, J3, and J4, respectively. The stroke similarity obtaining device calculates the similarity between every two strokes of the 4 strokes, respectively. The process of calculating the similarity of the two strokes is not described in detail herein.

Denote the calculated similarity between stroke J1 and stroke J2 as s(J1, J2), between J1 and J3 as s(J1, J3), between J1 and J4 as s(J1, J4), between J2 and J3 as s(J2, J3), between J2 and J4 as s(J2, J4), and between J3 and J4 as s(J3, J4). Assume that s(J1, J2), s(J1, J3) and s(J2, J3) are all greater than a preset similarity threshold (which can be set as required), and that s(J1, J4), s(J2, J4) and s(J3, J4) are all less than the threshold. The stroke pair setting device sets J1 and J2, J1 and J3, and J2 and J3 as similar stroke pairs, respectively. Since the two similar stroke pairs (J1, J2) and (J1, J3) both include stroke J1, the stroke group setting device sets the 3 strokes J1, J2 and J3 as a similar stroke group, and counts 3 strokes in that group.
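The J1 to J4 example can be reproduced with a short Python sketch. The similarity values and the threshold below are hypothetical numbers chosen only to satisfy the assumptions of the example. Merging pairs that share a stroke transitively (here with a small union-find) and requiring at least three strokes per group are assumptions of this sketch, not rules stated in the text.

```python
from typing import Dict, Iterable, List, Set, Tuple

Pair = Tuple[str, str]


def similar_pairs(similarities: Dict[Pair, float], threshold: float) -> List[Pair]:
    """Keep the pairs whose similarity exceeds the threshold."""
    return [pair for pair, s in similarities.items() if s > threshold]


def similar_groups(pairs: Iterable[Pair]) -> List[Set[str]]:
    """Merge similar stroke pairs that share a stroke into groups."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for stroke in list(parent):
        groups.setdefault(find(stroke), set()).add(stroke)
    # At least two pairs must share a stroke, i.e. a group has >= 3 strokes.
    return [g for g in groups.values() if len(g) >= 3]


# Hypothetical similarity values for the four strokes of the example.
scores = {
    ("J1", "J2"): 0.90, ("J1", "J3"): 0.85, ("J1", "J4"): 0.20,
    ("J2", "J3"): 0.80, ("J2", "J4"): 0.10, ("J3", "J4"): 0.30,
}
pairs = similar_pairs(scores, threshold=0.7)
groups = similar_groups(pairs)
print(pairs)           # [('J1', 'J2'), ('J1', 'J3'), ('J2', 'J3')]
print(len(groups[0]))  # 3
```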

For convenience of explanation, k1 takes a small value in the present embodiment. When a large amount of stroke data is analyzed, the number of strokes k1 is very large. The system for finding similar strokes of this embodiment improves the efficiency of calculating stroke similarity, so that similar strokes can be found quickly in massive stroke data.

In other alternative embodiments, when the k1 strokes include JA, JB, JC, JD, ..., JN, JO, JP and so on, if JA and JB, JA and JC, JA and JN, and JA and JP are each similar stroke pairs, the stroke group setting device sets JA, JB, JC, JN and JP as a similar stroke group containing 5 strokes.

Further, in the present embodiment, the system for finding similar strokes also includes a ratio calculation device 504. Let the number of strokes included in the similar stroke group be k2; the ratio calculation device is configured to calculate a similar stroke ratio q according to the following formula:

q = k2/k1.

In the present embodiment, the similar stroke ratio is q = 3/4 = 75%.

When a large amount of stroke data of a given driver is analyzed, the calculated similar stroke ratio can be used to evaluate the driver's daily travel habits. For example, a larger similar stroke ratio over a given statistical period indicates that the driver is more likely to be driving on a frequently traveled route. Because a driver on a frequently traveled route is very likely familiar with it, the driver is also less likely to have an accident, which is of reference value for vehicle insurance assessment. That is, the similar stroke ratio may be used as a parameter in vehicle insurance assessment.

This embodiment also provides a method for finding similar strokes, which is implemented using the system for finding similar strokes of this embodiment. Let the number of strokes be k1, where k1 is an integer greater than 2. Referring to fig. 10, the method for finding similar strokes includes the following steps:

Step S401, calculating the similarity between every two strokes among the k1 strokes. The similarity between two strokes is calculated using the stroke similarity obtaining method of embodiment 1 or embodiment 2; the specific process is not repeated here.

In step S402, if the similarity between the two strokes is greater than a preset similarity threshold, the two strokes are set as a pair of similar strokes.

In step S403, if the plurality of similar stroke pairs include the same stroke, the strokes included in the plurality of similar stroke pairs are set as a similar stroke group, and the number of strokes included in the similar stroke group is counted.

Step S404, calculating the similar stroke ratio. Let the number of strokes included in the similar stroke group be k2; the similar stroke ratio q is calculated according to the following formula:

q = k2/k1.
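Steps S401 to S404 can be chained as in the following sketch, which is one possible arrangement rather than the implementation of the embodiment. The similarity function is passed in as a parameter; `overlap_similarity` is a simplified stand-in, the stroke data are hypothetical, and using the largest group for k2 when several groups exist is an assumption of this sketch.

```python
from itertools import combinations
from typing import Callable, Dict, List, Set


def overlap_similarity(a: List[str], b: List[str]) -> float:
    """Stand-in similarity: shared encoded values / larger point count."""
    largest = max(len(a), len(b))
    return len(set(a) & set(b)) / largest if largest else 0.0


def similar_stroke_ratio(strokes: Dict[str, List[str]],
                         similarity: Callable[[List[str], List[str]], float],
                         threshold: float) -> float:
    names = list(strokes)
    k1 = len(names)
    if k1 == 0:
        return 0.0
    # S401 and S402: compare every two strokes and record similar pairs.
    linked: Dict[str, Set[str]] = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if similarity(strokes[a], strokes[b]) > threshold:
            linked[a].add(b)
            linked[b].add(a)
    # S403: collect strokes connected through shared similar pairs.
    largest_group: Set[str] = set()
    visited: Set[str] = set()
    for name in names:
        if name in visited:
            continue
        group: Set[str] = set()
        stack = [name]
        while stack:
            current = stack.pop()
            if current not in group:
                group.add(current)
                stack.extend(linked[current] - group)
        visited |= group
        if len(group) > len(largest_group):
            largest_group = group
    # S404: q = k2 / k1; a group needs at least two pairs sharing a stroke.
    k2 = len(largest_group) if len(largest_group) >= 3 else 0
    return k2 / k1


# Hypothetical encoded values for four strokes.
example = {
    "J1": ["a", "b", "c", "d"],
    "J2": ["a", "b", "c", "e"],
    "J3": ["a", "b", "c", "d", "f"],
    "J4": ["x", "y"],
}
print(similar_stroke_ratio(example, overlap_similarity, threshold=0.5))  # 0.75
```

Step S401 performs k1(k1 - 1)/2 comparisons, so the cost of each individual similarity calculation dominates when k1 is large.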

While specific embodiments of the invention have been described above, those skilled in the art will appreciate that these are examples only, and that the scope of the invention is defined by the appended claims. Those skilled in the art may make various changes and modifications to these embodiments without departing from the principles and spirit of the invention, and such changes and modifications fall within the scope of the invention.

Claims

1. A stroke similarity obtaining method, wherein the strokes include a first stroke and a second stroke of a vehicle, the obtaining method comprising the following steps:

S1, extracting a plurality of feature points of the first stroke, acquiring position information of each feature point of the first stroke, and counting the number n1 of the feature points of the first stroke; extracting a plurality of feature points of the second stroke, acquiring position information of each feature point of the second stroke, and counting the number n2 of the feature points of the second stroke; the feature points comprise a starting point, an end point and a turning point;

S2, acquiring matching feature point pairs, and counting the number n3 of the matching feature point pairs, wherein the matching feature point pairs consist of those feature points, among the plurality of feature points of the first stroke, whose position information matches the position information of a feature point of the second stroke;

S3, calculating the similarity m between the first stroke and the second stroke according to the following formula:

m = n3/max(n1, n2), where max(n1, n2) denotes the larger of n1 and n2;

the step S1 further includes: performing GeoHash coding on the position information of the feature points to generate coding values; wherein all the feature points in the same rectangular region have the same coding value;

the matching feature point pairs consist of those feature points, among the plurality of feature points of the first stroke, whose code values match the code value of a feature point of the second stroke.

2. The stroke similarity obtaining method according to claim 1, wherein before the step S1, the obtaining method further comprises the following step:

S0, setting a plurality of sampling moments within the time interval in which the vehicle travels the stroke, and acquiring position information of the sampling point corresponding to each sampling moment in the stroke; the sampling points comprise the starting point and the end point;

the step of extracting the turning point includes:

judging whether the included angle belongs to a preset angle interval, if so, extracting the ith sampling point as a turning point.

3. The stroke similarity obtaining method according to claim 2, wherein the distance between the i-th sampling point and the (i-a)-th sampling point is not greater than a preset distance, and the distance between the i-th sampling point and the (i-a-1)-th sampling point is greater than the preset distance; the distance between the i-th sampling point and the (i+b)-th sampling point is not greater than the preset distance, and the distance between the i-th sampling point and the (i+b+1)-th sampling point is greater than the preset distance.

4. The stroke similarity obtaining method according to claim 1, wherein the step S1 further includes:

step S2 comprises:

obtaining de-duplicated matching feature point pairs, and counting the number n6 of the de-duplicated matching feature point pairs, wherein the de-duplicated matching feature point pairs consist of those de-duplicated feature points of the first stroke whose code values match the code value of a de-duplicated feature point of the second stroke;

step S3 comprises: calculating the similarity m of the first stroke and the second stroke according to the following formula:

5. The stroke similarity obtaining method according to claim 4, wherein the step of performing a de-duplication operation on the feature points according to the code values comprises:

deleting a first turning point if the highest p bits of the code value of the first turning point are the same as the highest p bits of the code value of the starting point, wherein the first turning point is the turning point whose sampling moment is closest to the sampling moment of the starting point; deleting a final turning point if the highest p bits of the code value of the final turning point are the same as the highest p bits of the code value of the end point, wherein the final turning point is the turning point whose sampling moment is closest to the sampling moment of the end point; p is a positive integer;

if the code values of adjacent turning points are the same, retaining only one of those turning points.

6. The stroke similarity obtaining method according to claim 5, wherein after the step of performing a de-duplication operation on the feature points according to the code values to obtain de-duplicated feature points, the step S1 further includes:

sorting the de-duplicated feature points by the magnitude of their code values according to a preset arrangement rule to obtain reordered de-duplication feature points;

the de-duplicated matching feature point pairs consist of those reordered de-duplication feature points of the first stroke whose code values match the code value of a reordered de-duplication feature point of the second stroke.

7. The stroke similarity obtaining method according to claim 6, wherein the step of sorting the de-duplicated feature points by the magnitude of their code values according to a preset arrangement rule comprises:

S101, setting the starting point as a first reference point and setting the end point as a second reference point;

S102, judging whether the first reference point is adjacent to or coincides with the second reference point; if so, executing step S107, and if not, executing step S103;

S104, setting the next feature point adjacent to the first reference point as a new first reference point, setting the previous feature point adjacent to the second reference point as a new second reference point, and executing step S102;

S106, arranging the de-duplicated feature points in inverted order, and executing step S107;

S107, finishing the sorting.

8. A method of finding similar strokes, wherein the number of strokes is k1, k1 being an integer greater than 2, the method comprising the following steps:

calculating the similarity between every two strokes among the k1 strokes by using the stroke similarity obtaining method according to any one of claims 1 to 7; the method further comprises the following step:

if the similarity between two strokes is greater than a preset similarity threshold, setting the two strokes as a similar stroke pair.

9. The method of finding similar strokes according to claim 8, wherein the method further comprises the following step:

if a plurality of the similar stroke pairs include the same stroke, setting the strokes included in the plurality of similar stroke pairs as a similar stroke group, and counting the number of strokes included in the similar stroke group.

10. The method of finding similar strokes according to claim 9, wherein the number of strokes included in the similar stroke group is denoted k2, the method further comprising the following step:

calculating the similar stroke ratio q according to the following formula:

q = k2/k1.

11. A stroke similarity obtaining device, wherein the strokes include a first stroke and a second stroke of a vehicle, and the obtaining device comprises a feature point extraction unit, a matching feature point obtaining unit and a similarity calculation unit;

the feature point extraction unit is used for extracting a plurality of feature points of the first stroke, acquiring position information of each feature point of the first stroke and counting the number n1 of the feature points of the first stroke; the feature point extraction unit is further used for extracting a plurality of feature points of the second stroke, acquiring position information of each feature point of the second stroke and counting the number n2 of the feature points of the second stroke; the feature points comprise a starting point, an end point and a turning point;

the matching feature point obtaining unit is used for obtaining matching feature point pairs and counting the number n3 of the matching feature point pairs, wherein the matching feature point pairs consist of those feature points, among the plurality of feature points of the first stroke, whose position information matches the position information of a feature point of the second stroke;

m = n3/max(n1, n2), where max(n1, n2) denotes the larger of n1 and n2;

the acquisition equipment further comprises a coding unit, wherein the coding unit is used for performing GeoHash coding on the position information of the feature points to generate a coding value; wherein all the feature points in the same rectangular region have the same coding value;

12. The stroke similarity obtaining device according to claim 11, wherein the stroke similarity obtaining device further comprises a sampling unit;

The sampling unit is used for setting a plurality of sampling moments in a time interval of the vehicle passing through the journey and acquiring position information of sampling points corresponding to each sampling moment in the journey; the sampling point comprises the starting point and the ending point;

13. The stroke similarity obtaining device according to claim 12, wherein the distance between the i-th sampling point and the (i-a)-th sampling point is not greater than a preset distance, and the distance between the i-th sampling point and the (i-a-1)-th sampling point is greater than the preset distance; the distance between the i-th sampling point and the (i+b)-th sampling point is not greater than the preset distance, and the distance between the i-th sampling point and the (i+b+1)-th sampling point is greater than the preset distance.

14. The stroke similarity obtaining device according to claim 11, further comprising a de-duplication unit,

the matching feature point obtaining unit is further configured to obtain de-duplicated matching feature point pairs and count the number n6 of the de-duplicated matching feature point pairs, wherein the de-duplicated matching feature point pairs consist of those de-duplicated feature points of the first stroke whose code values match the code value of a de-duplicated feature point of the second stroke;

the similarity calculation unit is configured to calculate a similarity m of the first stroke and the second stroke according to the following formula:

15. The stroke similarity obtaining device according to claim 14, wherein the de-duplication unit is further configured to perform the de-duplication operation according to:

16. The stroke similarity obtaining device according to claim 15, further comprising a sorting unit,

the sorting unit is used for sorting the de-duplicated feature points by the magnitude of their code values according to a preset arrangement rule so as to obtain reordered de-duplication feature points;

17. The stroke similarity obtaining device according to claim 16, wherein the sorting unit is further configured to sort the de-duplicated feature points according to:

S107, finishing the sorting.

18. A system for finding similar strokes, wherein the number of strokes is k1, k1 is an integer greater than 2, the system comprising the stroke similarity acquisition device according to any one of claims 11 to 17;

the obtaining device is used for calculating the similarity between every two strokes among the k1 strokes; the system further comprises a stroke pair setting device;

the stroke pair setting device is used for setting two strokes as similar stroke pairs when the similarity between the two strokes is larger than a preset similarity threshold value.

19. The system for finding similar strokes according to claim 18, wherein the system further comprises a stroke group setting device;

the stroke group setting device is used for setting the strokes included in the similar stroke pairs into similar stroke groups when the similar stroke pairs include the same stroke, and is used for counting the number of the strokes included in the similar stroke groups.

20. The system for finding similar strokes according to claim 19, wherein a number of strokes included in the similar stroke group is set to k2, the system further comprising a ratio calculation device;

the ratio calculation device is configured to calculate a similar stroke ratio q according to the following formula:

q = k2/k1.