CN110021166B

CN110021166B - Method and device for processing user travel data and computing equipment

Info

Publication number: CN110021166B
Application number: CN201910250564.3A
Authority: CN
Inventors: 赵星
Original assignee: Advanced New Technologies Co Ltd
Current assignee: Advanced New Technologies Co Ltd; Advantageous New Technologies Co Ltd
Priority date: 2019-03-29
Filing date: 2019-03-29
Publication date: 2021-09-07
Anticipated expiration: 2039-03-29
Also published as: CN110021166A

Abstract

Embodiments of the present specification provide a method, apparatus and computing device for processing user travel data. The method comprises: acquiring k positions representing a user travel path and stay probabilities respectively corresponding to the k positions; determining m origin-destination (OD) results based on the k positions, wherein each of the m OD results comprises at least two of the k positions; and determining the probability corresponding to each OD result based on the stay probabilities respectively corresponding to the at least two positions included in that OD result.

Description

Method and device for processing user travel data and computing equipment

Technical Field

Embodiments of the present specification relate to the field of data processing, and more particularly, to a method, apparatus and computing device for processing user travel data.

Background

In order to reasonably evaluate and plan urban traffic (e.g., public transportation networks), it is necessary to analyze the travel data of a population. The main purpose of travel data analysis is to accurately determine the travel demand of the population, such as origin-destination (OD) pairs.

However, there may be many active locations on each person's daily travel path, of which only some are places the person actually intends to go, while the others are merely intermediate locations along the way. How to obtain the OD from such travel data is therefore one of the problems to be solved.

Disclosure of Invention

In view of the above-mentioned problems of the prior art, embodiments of the present specification provide a method, apparatus and computing device for processing user travel data.

In one aspect, an embodiment of the present specification provides a method for processing user travel data, including: acquiring k positions representing a user travel path and stay probabilities respectively corresponding to the k positions, wherein k is a positive integer greater than 1; determining m origin-destination (OD) results based on the k positions, wherein each of the m OD results comprises at least two of the k positions, m being a positive integer; and determining the probability corresponding to each OD result based on the stay probabilities respectively corresponding to the at least two positions included in that OD result.

In another aspect, an embodiment of the present specification provides an apparatus for processing user travel data, including: an acquisition unit, configured to acquire k positions representing a user travel path and stay probabilities respectively corresponding to the k positions, where k is a positive integer greater than 1; a first determining unit, configured to determine m origin-destination (OD) results based on the k positions, where each of the m OD results includes at least two of the k positions, and m is a positive integer; and a second determining unit, configured to determine the probability corresponding to each OD result based on the stay probabilities respectively corresponding to the at least two positions included in that OD result.

In another aspect, embodiments of the present specification provide a computing device comprising: at least one processor; a memory in communication with the at least one processor having stored thereon executable instructions that, when executed by the at least one processor, cause the at least one processor to implement the above-described method.

Therefore, in the technical scheme, different position combinations are considered for different OD results, and the probability corresponding to each OD result is obtained based on the stay probability of each position included in each OD result, so that the probability of each position is respected to the maximum extent, and subsequent traffic evaluation and planning can be accurately and reasonably performed by using the final OD result and the corresponding probability.

Drawings

The foregoing and other objects, features and advantages of the embodiments of the present specification will become more apparent from the following more particular description of the embodiments of the present specification, as illustrated in the accompanying drawings in which like reference characters generally represent like elements throughout.

Fig. 1 is a schematic flow chart of a method for processing user travel data according to one embodiment.

FIG. 2A is a schematic diagram of an example of an application scenario, according to one embodiment.

FIG. 2B is a schematic diagram of an example of an application scenario, according to one embodiment.

Fig. 3 is a schematic block diagram of an apparatus for processing user travel data according to one embodiment.

FIG. 4 is a hardware block diagram of a computing device for processing user travel data, according to one embodiment.

Detailed Description

The subject matter described herein will now be discussed with reference to various embodiments. It should be understood that these examples are discussed only to enable those skilled in the art to better understand and implement the subject matter described herein, and are not intended to limit the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the claims. Various embodiments may omit, replace, or add various procedures or components as desired.

In order to reasonably evaluate and plan urban traffic, it is often necessary to analyze the travel data of a population. For example, in evaluating and planning public transportation networks, it is often necessary to accurately analyze the OD results of a population. However, each person may pass through many positions on a daily travel path, of which only some are real destinations, while the others are merely intermediate positions along the way.

To this end, a probabilistic model may be introduced to evaluate the probability that each position is an origin (O) or a destination (D). This probability is referred to herein as the stay probability of the position. For example, expert rules, supervised models, or semi-supervised models may be employed to evaluate the stay probability of each position.

In the technical solution of this document, k positions representing a user travel path and stay probabilities respectively corresponding to the k positions may be obtained first, where k may be a positive integer greater than 1. Based on the k positions, m OD results may be determined, where each OD result may include at least two of the k positions, and m is a positive integer. Then, the probability of each OD result may be determined based on the stay probabilities respectively corresponding to the at least two positions included in that OD result.

Therefore, in the technical scheme, the corresponding OD result can be obtained by combining at least two positions on the user travel path, and the probability of the OD result is obtained based on the stay probability of the at least two positions included in each OD result, so that the stay probabilities of different positions can be taken into consideration, and the final OD result and the corresponding probability can be utilized to accurately and reasonably perform subsequent traffic evaluation and planning.

The above technical solutions will be described below with reference to specific embodiments.

As shown in fig. 1, in step 102, k positions representing the user travel route and corresponding stay probabilities of the k positions may be obtained, where k is a positive integer greater than 1.

For example, the user travel path may be represented by a line connecting a plurality of locations. For example, the user travel route may include at least two positions, namely, a departure position and an arrival position. Furthermore, the travel path may further comprise at least one intermediate position. The probability of stay for each location can be calculated by any suitable probabilistic model now available.

In step 104, m OD results may be determined based on the k locations. Each OD result may include at least two of the k positions, m being a positive integer.

For example, one OD result may be formed by combining at least two of the k positions. Thus, different combinations of positions may result in different OD results.

In step 106, a probability corresponding to each OD result may be determined based on the stay probabilities corresponding to the at least two positions included in each OD result.

For example, the probability corresponding to a combination of the locations (i.e., the OD results) can be determined based on the stay probabilities of the respective locations in each OD result using existing probability calculation methods.

In one approach, a fixed probability threshold may be introduced. If the stay probability of a position is less than or equal to the probability threshold, the position can be regarded as an intermediate position along the way and removed. Positions whose stay probability is greater than the probability threshold may then be connected in sequence to form the OD result. This is simple to implement, but because it applies a hard, one-size-fits-all cut-off, the stay probabilities of the individual positions may not be reasonably taken into account.
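
The fixed-threshold approach can be sketched as follows. This is only an illustrative sketch: the `(label, stay_probability)` tuple representation, the function name `threshold_od`, and the threshold values are assumptions made for the example and do not come from the specification.

```python
from typing import List, Tuple

# Hypothetical representation: each position is a (label, stay_probability) pair.
Location = Tuple[str, float]


def threshold_od(path: List[Location], threshold: float) -> List[str]:
    """Keep the labels of positions whose stay probability exceeds the threshold."""
    return [label for label, prob in path if prob > threshold]


path = [("A", 1.0), ("B", 0.6), ("C", 1.0)]
print(threshold_od(path, 0.5))  # B is kept as a stop
print(threshold_od(path, 0.7))  # B is dropped entirely
```

The sketch makes the drawback visible: position B is either fully kept or fully discarded, so its stay probability of 0.6 has no graded effect on the result.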

In one embodiment, m combinations of the k positions may be obtained by enumeration to form m OD results. Here, it can be understood that, in general, the departure position and the arrival position of the travel path (i.e., the positions of its two end points) both have a relatively large stay probability, for example, 1. Then, if the k positions include intermediate positions in addition to the two end points, the combinations of the k positions are mainly the combinations of the intermediate positions. In this case, m may be equal to 2^(k-2). That is, there may be 2^(k-2) combinations serving as OD results. It follows that each of the m combinations, and hence each OD result, includes at least the positions of the two end points of the travel path.
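
A minimal sketch of this enumeration is given below, assuming positions are represented by string labels in travel order. The function name `enumerate_od` is hypothetical; the sketch keeps both end points in every combination and takes every subset of the intermediate positions, which yields 2^(k-2) combinations.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def enumerate_od(labels: Sequence[str]) -> List[Tuple[str, ...]]:
    """List every combination that keeps both end points and any subset of
    the intermediate positions, preserving travel order."""
    if len(labels) < 2:
        raise ValueError("a travel path needs at least two positions")
    first, middle, last = labels[0], labels[1:-1], labels[-1]
    results = []
    for size in range(len(middle) + 1):
        # combinations() yields subsets in the original (travel) order.
        for kept in combinations(middle, size):
            results.append((first, *kept, last))
    return results


results = enumerate_od(["A", "B", "C", "D"])
print(len(results))  # 2 ** (4 - 2) combinations
for od in results:
    print("-".join(od))
```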

For example, FIG. 2A is a schematic diagram of an example of an application scenario, according to one embodiment. It should be understood that fig. 2A is only for helping those skilled in the art to better understand the technical solution of the present specification, and does not limit the scope thereof.

As shown in fig. 2A, assume that the user travel path includes positions A, B and C, where position A and position C may be the two end points of the travel path, and position B may be an intermediate position. Then two combinations, i.e., two OD results, can be formed. The first OD result may include positions A and C, which may be denoted as A-C herein for ease of description. The second OD result may include positions A, B and C, which may be denoted as A-B-C.

Further, assume that positions A, B and C correspond to stay probabilities of 1, 0.6, and 1, respectively. Then, the first OD result may correspond to a probability of 1 × (1-0.6) × 1 = 0.4, and the second OD result may correspond to a probability of 1 × 0.6 × 1 = 0.6.

It can be seen that this approach can take into account each location sufficiently to accurately process each combination.
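
The calculation for the fig. 2A example can be reproduced with a short sketch. The function name `od_probability` and the dictionary representation are assumptions for illustration; the sketch treats the stay events at different positions as independent, so that the probability of an OD result is the product of the stay probabilities of the included positions and the non-stay probabilities (1 minus the stay probability) of the skipped positions.

```python
from typing import Dict, Sequence


def od_probability(order: Sequence[str], stay: Dict[str, float], od: Sequence[str]) -> float:
    """Multiply the stay probabilities of included positions by the non-stay
    probabilities (1 - p) of positions skipped between the first and last."""
    included = set(od)
    start, end = order.index(od[0]), order.index(od[-1])
    prob = 1.0
    for label in order[start:end + 1]:
        prob *= stay[label] if label in included else 1.0 - stay[label]
    return prob


order = ["A", "B", "C"]
stay = {"A": 1.0, "B": 0.6, "C": 1.0}
p_ac = od_probability(order, stay, ["A", "C"])
p_abc = od_probability(order, stay, ["A", "B", "C"])
print(round(p_ac, 6), round(p_abc, 6))  # 0.4 0.6
```

Because every subset of the intermediate positions is enumerated, the probabilities of the OD results in this example sum to 1.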

This way of enumerating all position combinations may be applied to scenarios with a small number of positions on the travel path. For example, in the case where k is less than or equal to the first number threshold, the above-described manner is adopted. The first quantity threshold may be predetermined, for example, based on actual demand, computing resources, and the like.

In another embodiment, n positions of the k positions may be determined, where the n positions correspond to a dwell probability greater than a first probability threshold, and n is a positive integer greater than 1. Then, the user travel path can be divided into n-1 segments by taking the n positions as endpoints respectively. Then, for each of the n-1 segments, a corresponding OD result is determined, thereby obtaining the m OD results.

For example, the first probability threshold may be predetermined, for example, according to various factors such as actual demand. The first probability threshold may be a relatively large threshold, such as 0.9. Locations with a probability of stay greater than a first probability threshold may be considered approximately as the origin or the arrival. Then, using these n positions as end points, the travel path is divided into n-1 segments. That is, the long chain formed by k positions is divided into a plurality of segments. Then, for each segment, a corresponding OD result may be obtained. The processing between the segments is independent of each other. It can be seen that the two end points of each segment may be locations where the probability of stay is high, while the middle location of each segment (if there is a middle location in the segment) may be a location where the probability of stay is relatively low.

Therefore, in the scheme, the travel path is further subdivided into different segments, so that the processing complexity can be reduced, and the engineering implementation is relatively simple.

This approach is suitable for scenarios with a large number of positions on the travel path. For example, for k positions, there may be 2^(k-2) combinations. If these combinations are enumerated directly, an exponential explosion problem arises, the processing complexity is high, and engineering implementation is hindered. Thus, in the case where k is greater than the first number threshold, the OD results may be determined in a segmented manner. Since each segment can be processed independently, the number of position combinations can be effectively reduced, and the exponential explosion problem during enumeration can be effectively controlled and mitigated.

For example, FIG. 2B is a schematic diagram of an example of an application scenario, according to one embodiment. It should be understood that fig. 2B is only for helping those skilled in the art to better understand the technical solution of the present specification, and does not limit the scope thereof.

As shown in fig. 2B, it is assumed that the user travel path includes positions A to F, and the stay probabilities corresponding to positions A to F may be 1, 0.6, 1, 0.4, 0.1, and 1, respectively. Assume that the first probability threshold is 0.9. Thus, the stay probabilities of positions A, C and F are above the first probability threshold. Then, the user travel path is divided into two segments with positions A, C and F as end points. The first segment may include positions A, B and C. The second segment may include positions C, D, E and F.
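
The segmentation in the fig. 2B example can be sketched as follows. The `(label, stay_probability)` representation and the function name `split_segments` are assumptions made for illustration. In this sketch, any positions lying before the first or after the last high-probability position would simply fall outside all segments; how such positions are handled is not specified here.

```python
from typing import List, Tuple

Location = Tuple[str, float]  # hypothetical (label, stay_probability) pair


def split_segments(path: List[Location], first_threshold: float) -> List[List[Location]]:
    """Cut the path at every position whose stay probability exceeds the
    threshold; adjacent segments share their boundary position."""
    anchors = [i for i, (_, prob) in enumerate(path) if prob > first_threshold]
    # n anchor positions produce n - 1 segments.
    return [path[a:b + 1] for a, b in zip(anchors, anchors[1:])]


path = [("A", 1.0), ("B", 0.6), ("C", 1.0), ("D", 0.4), ("E", 0.1), ("F", 1.0)]
for segment in split_segments(path, 0.9):
    print([label for label, _ in segment])
```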

For the first segment, a corresponding OD result may be determined. For the second segment, a corresponding OD result may be determined. The processing of the two segments may be independent of each other. For example, in a simple implementation, two OD results, A-C and A-B-C, may be formed for the first segment. For the second segment, four OD results may be formed, namely C-F, C-D-F, C-E-F and C-D-E-F.

Therefore, the exponential explosion problem during enumeration can be effectively controlled and mitigated by segmentation, which facilitates engineering implementation.
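
The reduction can be checked with simple arithmetic. The sketch below uses hypothetical helper names and compares the number of combinations for direct enumeration, 2^(k-2), with the total over independently processed segments, where a segment with p positions contributes 2^(p-2) combinations.

```python
from typing import Iterable


def full_enumeration_count(k: int) -> int:
    """Combinations when all k positions are enumerated together."""
    return 2 ** (k - 2)


def segmented_count(segment_sizes: Iterable[int]) -> int:
    """Total combinations when each segment is enumerated independently."""
    return sum(2 ** (p - 2) for p in segment_sizes)


# Fig. 2B: k = 6 positions, split into segments of 3 and 4 positions.
print(full_enumeration_count(6))   # 16
print(segmented_count([3, 4]))     # 2 + 4 = 6

# A longer, purely hypothetical path: k = 22 split into three segments of 8.
print(full_enumeration_count(22))  # 1048576
print(segmented_count([8, 8, 8]))  # 192
```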

There may be different ways of handling for each segment. For convenience of description, any one of the n-1 segments will be referred to as a first segment.

For example, in one embodiment, if the first segment includes p locations and p-2 is less than or equal to the second number threshold, the OD result corresponding to the first segment may be obtained directly by enumeration. The second number threshold may be predetermined, for example, based on various factors such as actual demand, computing resources, and the like. For example, the second number threshold may be 3. In this embodiment, p-2 may be understood as the number of intermediate positions in the first segment except for two positions as end points. When the number of intermediate positions is less than or equal to the second number threshold, the OD result can be obtained directly by enumeration, since the number of position combinations is relatively small. The manner of enumeration is similar to that described above with reference to fig. 2A.

Specifically, the p positions may be combined to form q combinations as the OD results corresponding to the first segment. Each of the q combinations includes at least the positions, among the p positions, that are the two end points of the first segment, where p is a positive integer greater than 1 and q is a positive integer. That is, the intermediate positions and the two end points of the first segment may be combined in various ways to form the corresponding OD results.

In another embodiment, if p-2 is greater than the second number threshold, some of the p positions may be removed and then enumerated. p-2 being greater than the second quantity threshold may indicate a relatively high number of intermediate positions in the first segment, where an exponential explosion problem may exist if the OD result is obtained directly by enumeration. Thus, x positions can be selected from the p positions. It should be understood that the x positions should include at least the positions of the two end points of the first segment. Then, the x positions are combined to form y combinations as the OD result corresponding to the first segment. x may be a positive integer greater than 1 and y is a positive integer. Similarly, each of the y combinations includes at least the positions of the two end points of the first segment.

It can be seen that by determining the OD result in different ways based on the number of positions in each segment, the implementation is simple and an accurate solution can be provided to the maximum extent.
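
The choice between the two cases can be sketched as a small dispatch function. All names below are hypothetical, and the `endpoints_only` selector is merely a placeholder assumption standing in for whichever selection rule is used; it satisfies the requirement that the selected positions include at least the two end points.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

Location = Tuple[str, float]  # hypothetical (label, stay_probability) pair
Selector = Callable[[Sequence[Location]], List[Location]]


def enumerate_keeping_endpoints(labels: Sequence[str]) -> List[Tuple[str, ...]]:
    """Every in-order combination that retains the first and last label."""
    middle = labels[1:-1]
    return [
        (labels[0], *kept, labels[-1])
        for size in range(len(middle) + 1)
        for kept in combinations(middle, size)
    ]


def segment_od_results(
    segment: Sequence[Location], second_number_threshold: int, select: Selector
) -> List[Tuple[str, ...]]:
    """Enumerate directly when there are few intermediate positions,
    otherwise reduce the segment with `select` before enumerating."""
    if len(segment) - 2 > second_number_threshold:
        segment = select(segment)
    return enumerate_keeping_endpoints([label for label, _ in segment])


def endpoints_only(segment: Sequence[Location]) -> List[Location]:
    """Placeholder selector (an assumption): keep just the two end points."""
    return [segment[0], segment[-1]]


segment = [("C", 1.0), ("D", 0.4), ("E", 0.1), ("F", 1.0)]
print(segment_od_results(segment, 3, endpoints_only))  # 2 intermediates <= 3: enumerate all
print(segment_od_results(segment, 1, endpoints_only))  # 2 intermediates > 1: select first
```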

In one embodiment, for the case where p-2 is greater than the second number threshold, the x positions may include a first set of positions in addition to the positions that are the end points of the first segment. The first set of positions may be intermediate positions of the first segment whose stay probabilities are greater than a second probability threshold. In this embodiment, the positions with a stay probability greater than the second probability threshold are retained, and the positions with a stay probability less than or equal to the second probability threshold are removed, so that the number of position combinations is reduced and the processing complexity is effectively lowered. The second probability threshold may be preset according to various factors such as actual conditions. The second probability threshold may be less than the first probability threshold.
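
A sketch of this selection rule is shown below. The function name `select_by_probability`, the position labels, and all probability values are hypothetical and chosen only to illustrate the filtering step.

```python
from typing import List, Sequence, Tuple

Location = Tuple[str, float]  # hypothetical (label, stay_probability) pair


def select_by_probability(segment: Sequence[Location], second_threshold: float) -> List[Location]:
    """Keep both end points plus the intermediate positions whose stay
    probability is greater than the second probability threshold."""
    first, middle, last = segment[0], segment[1:-1], segment[-1]
    kept = [loc for loc in middle if loc[1] > second_threshold]
    return [first, *kept, last]


# Hypothetical segment with four intermediate positions.
segment = [("A", 1.0), ("B1", 0.8), ("B2", 0.3), ("B3", 0.55), ("B4", 0.1), ("C", 1.0)]
selected = select_by_probability(segment, 0.5)
print([label for label, _ in selected])  # ['A', 'B1', 'B3', 'C']
```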

In another embodiment, the x positions may include a second set of positions in addition to the positions that are endpoints of the first segment. The second set of positions may be intermediate positions of the first segment, and the second set of positions may be the first z positions of the intermediate positions of the first segment where the probability of dwell is greatest, and z may be a positive integer.

For example, in selecting the intermediate positions of the first segment, the several positions with the largest stay probabilities, such as the top 3 positions, may be selected. For ease of understanding, a simple example is given below. Assume that the first segment includes positions A, B2, B1, B4, B3, B5, and C, and assume that the second number threshold is 3. At this point, the first segment has five intermediate positions, B1 to B5, a number that exceeds the second number threshold. Then, some positions may be selected from the intermediate positions B2, B1, B4, B3, B5. Assume that, ranked by stay probability from highest to lowest, these positions are B1, B2, B3, B4, and B5. The three positions B1, B2, B3 with the highest stay probabilities may be selected, while positions B4 and B5 may be eliminated. Thus, positions A, B2, B1, B3, and C may be combined to yield the OD results for the first segment. In this case, there may be 8 combinations, i.e., 8 OD results.

Further, the probability corresponding to each OD result may be derived from the stay probabilities of the positions included in that OD result. For example, assume that the stay probabilities corresponding to positions A and C are 1 and the stay probabilities corresponding to positions B2, B1, and B3 are P1, P2, and P3, respectively. For the case where the OD result is A-B2-B1-B3-C, the probability corresponding to the OD result may be P1 × P2 × P3. For the case where the OD result is A-B2-C, the probability corresponding to the OD result may be P1 × (1-P2) × (1-P3).
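
The top-z selection in this example can be sketched as follows. The function name `select_top_z` and the concrete probability values are assumptions; only the ranking B1 > B2 > B3 > B4 > B5 is taken from the example. The sketch restores travel order after ranking, so the selected positions appear as A, B2, B1, B3, C.

```python
from typing import List, Sequence, Tuple

Location = Tuple[str, float]  # hypothetical (label, stay_probability) pair


def select_top_z(segment: Sequence[Location], z: int) -> List[Location]:
    """Keep both end points plus the z intermediate positions with the
    highest stay probability, preserving travel order."""
    middle = segment[1:-1]
    ranked = sorted(range(len(middle)), key=lambda i: middle[i][1], reverse=True)
    kept = [middle[i] for i in sorted(ranked[:z])]
    return [segment[0], *kept, segment[-1]]


# Probability values are made up; they only respect the stated ranking.
segment = [("A", 1.0), ("B2", 0.7), ("B1", 0.8), ("B4", 0.3),
           ("B3", 0.6), ("B5", 0.2), ("C", 1.0)]
selected = select_top_z(segment, 3)
print([label for label, _ in selected])  # ['A', 'B2', 'B1', 'B3', 'C']
print(2 ** (len(selected) - 2))          # 8 combinations to enumerate
```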
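
The two cases above follow one general pattern, which can be written compactly. In the notation below, which is introduced here only for illustration, S is the ordered set of positions used for enumeration, R is the subset of S forming one OD result, and p_i is the stay probability of position i. Treating the stay events at different positions as independent is an assumption of this sketch.

```latex
P(R) \;=\; \prod_{i \in R} p_i \;\times
\prod_{\substack{j \in S \setminus R \\ \mathrm{first}(R) < j < \mathrm{last}(R)}} (1 - p_j)

% With p_A = p_C = 1, p_{B2} = P1, p_{B1} = P2, p_{B3} = P3:
P(\text{A-B2-B1-B3-C}) = 1 \cdot P1 \cdot P2 \cdot P3 \cdot 1 = P1 \, P2 \, P3
P(\text{A-B2-C}) = 1 \cdot P1 \cdot (1 - P2)(1 - P3) \cdot 1 = P1 \, (1 - P2)(1 - P3)
```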

Therefore, the exponential explosion problem during enumeration can be effectively mitigated by segmentation, which facilitates engineering implementation.

As shown in fig. 3, the apparatus 300 may include an obtaining unit 302, a first determining unit 304, and a second determining unit 306.

The obtaining unit 302 may obtain k positions representing the user travel path and stay probabilities respectively corresponding to the k positions, where k is a positive integer greater than 1. The first determining unit 304 may determine m OD results based on the k positions, wherein each of the m OD results includes at least two of the k positions, and m is a positive integer. The second determining unit 306 may determine the probability corresponding to each OD result based on the stay probabilities corresponding to the at least two positions included in that OD result.

In one embodiment, the first determining unit 304 may combine the k positions to form m combinations as the m OD results. Each of the m combinations may include at least the positions, among the k positions, that are the two end points of the user travel path, with m being equal to 2^(k-2). In another embodiment, k may be less than or equal to the first number threshold.

In another embodiment, the first determining unit 304 may determine n positions of the k positions, where the n positions correspond to a dwell probability greater than a first probability threshold, and n is a positive integer greater than 1. The first determining unit 304 may divide the user travel path into n-1 segments with n positions as end points, respectively. The first determining unit 304 may determine a respective OD result for each of the n-1 segments, respectively, to obtain m OD results. In another embodiment, k may be greater than the first number threshold.

In another embodiment, for a first segment of the n-1 segments, the first segment is any one of the n-1 segments:

If the first segment includes p positions and p-2 is less than or equal to the second number threshold, the first determining unit 304 may combine the p positions to form q combinations as the OD results corresponding to the first segment, wherein each of the q combinations includes at least the positions, among the p positions, that are the two end points of the first segment, p is a positive integer greater than 1, and q is a positive integer.

If p-2 is greater than the second number threshold, the first determining unit 304 may select x positions from the p positions, and combine the x positions to form y combinations as the OD results corresponding to the first segment, wherein the x positions include at least the positions, among the p positions, that are the two end points of the first segment, each of the y combinations includes at least those two end-point positions, x is a positive integer greater than 1, and y is a positive integer.

In another embodiment, the x positions may further include a first set of positions, the first set of positions being intermediate positions of the first segment, and the first set of positions corresponding to a dwell probability greater than a second probability threshold.

In another embodiment, the x positions may further include a second set of positions, the second set of positions being middle positions of the first segment, and the second set of positions being the first z positions of the middle positions of the first segment where the probability of stay is greatest, z being a positive integer.

The units of the apparatus 300 may perform corresponding steps in the method embodiments of fig. 1 to 2B, and therefore, for brevity of description, specific operations and functions of the units of the apparatus 300 are not described herein again.

The apparatus 300 may be implemented by hardware, software, or a combination of hardware and software. For example, when implemented in software, the apparatus 300 may be formed by a processor of a device reading the corresponding executable instructions from storage (e.g., non-volatile storage) into memory for execution.

FIG. 4 is a hardware block diagram of a computing device for processing user travel data, according to one embodiment. As shown in fig. 4, computing device 400 may include at least one processor 402, storage 404, memory 406, and communication interface 408, and the at least one processor 402, storage 404, memory 406, and communication interface 408 are coupled together via a bus 410. The at least one processor 402 executes at least one executable instruction (i.e., the elements described above as being implemented in software) stored or encoded in the storage 404.

In one embodiment, the executable instructions stored in the storage 404, when executed by the at least one processor 402, cause the computing device to implement the various processes described above in connection with fig. 1-2B.

Computing device 400 may be implemented in any suitable form known in the art, including, for example, but not limited to, a desktop computer, a laptop computer, a smartphone, a tablet computer, a consumer electronics device, a wearable smart device, and so forth.

Embodiments of the present specification also provide a machine-readable storage medium. The machine-readable storage medium may store executable instructions that, when executed by a machine, cause the machine to perform particular processes of the method embodiments described above with reference to fig. 1-2B.

For example, a machine-readable storage medium may include, but is not limited to, Random Access Memory (RAM), Read-Only Memory (ROM), Electrically-Erasable Programmable Read-Only Memory (EEPROM), Static Random Access Memory (SRAM), a hard disk, flash Memory, and so forth.

It should be understood that the embodiments in this specification are described in a progressive manner: the same or similar parts of the various embodiments may be referred to one another, and each embodiment focuses on its differences from the others. For example, since the embodiments of the apparatus, the computing device and the machine-readable storage medium are substantially similar to the method embodiments, their description is relatively brief, and the relevant points can be found in the corresponding parts of the description of the method embodiments.

Specific embodiments of this specification have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

It will be understood that various modifications to the embodiments described herein will be readily apparent to those skilled in the art, and that the generic principles defined herein may be applied to other variations without departing from the scope of the claims.

Claims

1. A method for processing user travel data, comprising:

acquiring k positions representing a user travel path and stay probabilities respectively corresponding to the k positions, wherein k is a positive integer greater than 1;

determining m origin-destination (OD) results based on the k positions, wherein a plurality of the k positions are used to determine the m OD results, the plurality of positions including at least the positions that are the two end points of the user travel path, each of the m OD results including at least two of the plurality of positions, m being a positive integer;

determining the probability corresponding to each OD result based on the stay probabilities corresponding to the at least two positions included in each OD result, wherein: if, among the plurality of positions, other positions that are not included in the OD result exist between the first position and the last position included in the OD result, the probability corresponding to the OD result is the product of the stay probabilities respectively corresponding to the positions included in the OD result and the non-stay probabilities corresponding to the other positions, the non-stay probabilities corresponding to the other positions being calculated based on the stay probabilities of the other positions;

determining a final OD result based on the m OD results, wherein the final OD result includes at least the positions that are the two end points of the user travel path.

2. The method of claim 1, wherein the determining m OD results based on the k locations comprises:

combining the k positions to form m combinations as the m OD results, wherein each of the m combinations includes at least the positions, among the k positions, that are the two end points of the user travel path, and m is equal to 2^(k-2).

3. The method of claim 2, wherein k is less than or equal to a first number threshold.

4. The method of claim 1, wherein the determining m OD results based on the k locations comprises:

determining n positions of the k positions, wherein the stay probabilities corresponding to the n positions are greater than a first probability threshold, and n is a positive integer greater than 1;

dividing the user travel path into n-1 segments by taking the n positions as end points respectively;

determining a corresponding OD result for each of the n-1 segments, respectively, to obtain the m OD results.

5. The method of claim 4, wherein k is greater than a first number threshold.

6. The method of claim 4 or 5, wherein said determining, for each of said n-1 segments, a respective OD result comprises:

for a first segment of the n-1 segments, the first segment being any one of the n-1 segments:

if the first segment includes p positions and p-2 is less than or equal to a second number threshold, combining the p positions to form q combinations as OD results corresponding to the first segment, wherein each of the q combinations includes at least the positions, among the p positions, that are the two end points of the first segment, where p is a positive integer greater than 1 and q is a positive integer;

if p-2 is greater than the second quantity threshold, selecting x positions from the p positions and combining the x positions to form y combinations as OD results corresponding to the first segment, wherein the x positions include at least the positions of the p positions that are two endpoints of the first segment, each of the y combinations includes at least the positions of the p positions that are two endpoints of the first segment, x is a positive integer greater than 1, and y is a positive integer.

7. The method of claim 6, wherein the x locations further comprise a set of locations, the set of locations being intermediate locations of the first segment, and the set of locations corresponding to a probability of dwell greater than a second probability threshold.

8. The method of claim 6, wherein the x positions further comprise a set of positions, the set of positions being intermediate positions of the first segment and the set of positions being the first z positions of the intermediate positions of the first segment where the probability of dwell is greatest, z being a positive integer.

9. An apparatus for processing user travel data, comprising:

an acquisition unit configured to acquire k positions representing a user travel path and stay probabilities respectively corresponding to the k positions, wherein k is a positive integer greater than 1;

a first determining unit configured to determine m origin-destination (OD) results based on the k positions, wherein a plurality of the k positions are used to determine the m OD results, the plurality of positions include at least the positions that are the two end points of the user travel path, each of the m OD results includes at least two of the plurality of positions, and m is a positive integer;

a second determination unit configured to:

10. The apparatus of claim 9, wherein the first determining unit, when determining m OD results based on the k positions, is specifically configured to:

11. The apparatus of claim 10, wherein k is less than or equal to a first number threshold.

12. The apparatus of claim 9, wherein the first determining unit, when determining m OD results based on the k positions, is specifically configured to:

13. The apparatus of claim 12, wherein k is greater than a first number threshold.

14. The apparatus according to claim 12 or 13, wherein the first determining unit, when determining the respective OD result for each of the n-1 segments, is specifically configured to:

15. The apparatus of claim 14, wherein the x locations further comprise a set of locations that are intermediate locations of the first segment and that correspond to a probability of dwell greater than a second probability threshold.

16. The apparatus of claim 14, wherein the x positions further comprise a set of positions, the set of positions being middle positions of the first segment and the set of positions being the first z positions of the middle positions of the first segment where the probability of dwell is greatest, z being a positive integer.

17. A computing device, comprising:

at least one processor;

a memory in communication with the at least one processor having stored thereon executable instructions that, when executed by the at least one processor, cause the at least one processor to implement the method of any one of claims 1-8.