CN106682051A - Method for finding out crowd movement behaviors - Google Patents
Method for finding out crowd movement behaviors
- Publication number
- CN106682051A (CN201510982408.8A)
- Authority
- CN
- China
- Prior art keywords
- line segment
- distance
- sequence
- those
- representative series
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
Abstract
The invention discloses a method for finding crowd movement behaviors, comprising the following steps: collecting a plurality of position data regarding a plurality of user devices; detecting a plurality of frequent patterns in the position data to generate a plurality of representative sequences, wherein each representative sequence comprises at least one line segment between a starting position point and an ending position point; and classifying the representative sequences into a plurality of clusters according to a plurality of sequence distances among the representative sequences, so as to find the crowd movement behaviors.
Description
Technical field
The invention relates to a method for finding crowd movement behaviors from position data collected about user devices.
Background
For many companies and organizations, such as convenience store chains, public transportation companies, and local governments, knowing how crowds move within a city or between cities can be important information. Many major decisions of these organizations depend on information about where crowds are, where they come from, and where they go. Examples include setting up new bus routes, building new transfer stations, opening new stores, and constructing urban public facilities. How to effectively find information about crowd movement behaviors is therefore one of the problems the industry is currently working on.
Summary of the invention
The invention relates to a method for finding crowd movement behaviors and to a non-transitory computer-readable medium for executing the method.
According to an embodiment of the invention, a method for finding crowd movement behaviors is provided. The method includes: collecting a plurality of position data regarding a plurality of user devices; detecting a plurality of frequent patterns in the position data to generate a plurality of representative sequences, wherein each representative sequence includes at least one line segment between a starting position point and an ending position point; and classifying the representative sequences into a plurality of clusters according to a plurality of sequence distances between the representative sequences, so as to find the crowd movement behaviors.
For a better understanding of the above and other aspects of the invention, embodiments are described in detail below with reference to the accompanying drawings.
Description of the drawings
Fig. 1 illustrates an example of performing a payment activity with a smart card.
Fig. 2 is a flowchart of a method for finding crowd movement behaviors according to an embodiment of the invention.
Fig. 3 is a schematic diagram of an example payment record, which relates to position data and times captured from a plurality of user devices.
Fig. 4 is a flowchart of collecting position data regarding user devices according to an embodiment of the invention.
Figs. 5A and 5B are example schematic diagrams of labeling payment positions with their nearest reference position points according to an embodiment of the invention.
Fig. 6 is a schematic diagram of the payment record after sorting and simplification according to an embodiment of the invention.
Fig. 7 is a flowchart of detecting frequent patterns in the position data to generate representative sequences according to an embodiment of the invention.
Fig. 8 is a schematic diagram of aggregating frequent patterns according to an embodiment of the invention.
Fig. 9 is a flowchart of calculating the sequence distances between representative sequences according to an embodiment of the invention.
Fig. 10 is a flowchart of calculating the line segment distance between a first line segment and a second line segment according to an embodiment of the invention.
Fig. 11 is a schematic diagram of the line segment distance between two line segments according to an embodiment of the invention.
Fig. 12 is a flowchart of calculating the parallel distance between the first line segment and the second line segment according to an embodiment of the invention.
Fig. 13 is a schematic diagram of the line segment distance between two line segments according to an embodiment of the invention.
Fig. 14 is a flowchart of calculating the normalized angular distance, the normalized perpendicular distance, and the normalized parallel distance according to an embodiment of the invention.
Figs. 15A to 15D are schematic diagrams of the cases considered for the maximum of the perpendicular distance domain according to an embodiment of the invention.
Fig. 16 is a flowchart of determining the sequence distance between a first sequence and a second sequence according to an embodiment of the invention.
Figs. 17A to 17C are schematic diagrams of multiple mapping combinations between two representative sequences according to an embodiment of the invention.
Fig. 18 is an example schematic diagram of an invalid mapping between two representative sequences.
Fig. 19 is a flowchart of calculating the mapping distance of a mapping combination according to an embodiment of the invention.
Fig. 20 is a flowchart of classifying the representative sequences into clusters according to an embodiment of the invention.
Fig. 21 is a flowchart of finding crowd movement behaviors and finding typical sequences according to an embodiment of the invention.
Fig. 22 is a flowchart of finding the typical sequences of a target day type according to an embodiment of the invention.
Detailed description
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
In modern life, many people use smart cards to take public transportation such as buses, trains, and rapid transit. A smart card can also serve as an electronic wallet for buying goods or paying fees. For example, value can be stored on the card and used at vending machines, when entering or leaving parking lots, or when entering or leaving railway stations. Fig. 1 illustrates an example of performing a payment activity with a smart card; the smart card in this example is a contactless smart card. Each time a smart card is used, a payment record (payment log) or transaction record is produced. Because the vending machine or station gate has static geographic information, position data about where multiple users used their smart cards can be collected. By collecting these payment records, the service provider that issues the smart cards can obtain information about where crowds are and how they move.
Fig. 2 is a flowchart of a method for finding crowd movement behaviors according to an embodiment of the invention. The method includes the following steps. Step S100: collect position data regarding a plurality of user devices. Step S200: detect (mine) frequent patterns in the position data to generate a plurality of representative sequences. Step S300: classify the representative sequences into clusters according to the sequence distances between them, so as to find the crowd movement behaviors. Each representative sequence includes at least one line segment between a starting position point and an ending position point. The method can be implemented, for example, as a software program stored on an optical disc. The program includes a plurality of instructions that a computer processor can load and execute to perform the method. Each step is described in detail below.
Step S100: collect position data regarding a plurality of user devices. The user devices may include smart cards, electronic payment cards, or mobile devices with payment capability. When these user devices are used for payment activities, the related position data can be collected. For example, when a smart card is used for a payment activity at a payment terminal, a report can be uploaded to a central server; the report may include the identification (ID) of the smart card, the payment amount, the date and time, and the position of the payment terminal. The method is not limited to collecting position data during payment activities. Position data may also be collected when a user device enters a station, when value is stored on a user device, or when a user device is authenticated to enter a building. For ease of understanding, the following description uses position data collected during payment activities as the example, and the collected position data is represented as a payment record.
Fig. 3 is a schematic diagram of an example payment record, which relates to position data and times captured from a plurality of user devices. The payment record can be stored on the central server of the payment service provider. In this example, the payment record includes the fields uid, departure position, arrival position, date and time, payment amount, and transaction type. The uid is the ID of the user device: the same uid corresponds to the same user device and possibly to the same user, so information about where a person has been can be obtained. By collecting position data about many user devices, the service provider can learn the movement trajectories that most people have in common. The transaction type can be purchase, traffic, stored value, or another type. For the traffic type, the payment activity may take place at the destination station, and the departure position and arrival position fields record the related traffic information, for example the geographic coordinates of the departure station and the arrival station. For transaction types other than traffic, the departure position field records the position of the payment activity, and the arrival position field is recorded as "-".
In the above example, the recorded coordinates can be exact position data. Because a large number of shops use the payment service, the number of distinct exact coordinates in the payment record can be very large. Finding crowd movement behaviors of interest may not require exact positions, so nearby positions can be treated as one semantic region, and a reference position point can be selected to represent each semantic region. Step S100 can include steps S110 and S120, as shown in Fig. 4, which is a flowchart of collecting position data regarding user devices according to an embodiment of the invention.
Step S110: select a plurality of reference position points. Examples of reference position points include schools, chain convenience stores, and landmark positions determined according to resident population. Step S120: replace each position point in the position data with the reference position point geographically nearest to it. For each payment entry in the payment record, the original exact position can be replaced with the nearest reference position point. Figs. 5A and 5B are example schematic diagrams of labeling payment positions with their nearest reference position points according to an embodiment of the invention. In Fig. 5A, three triangles filled with different shades represent three preselected reference position points Ref_a, Ref_b, and Ref_c, and hollow circles represent the original payment positions. Each payment position is then replaced with the geographically nearest reference position point, as shown in Fig. 5B, where the fill of each payment position is changed to match its corresponding reference position point. After the payment positions are labeled with their nearest reference position points, the geographic coordinates in the original payment record can be replaced with these reference position points. Steps S110 and S120 are optional: even if they are not performed and the original payment positions are kept in the payment record, the subsequent steps S200 and S300 can still be performed on the original payment positions.
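Step S120 can be sketched as a nearest-neighbor lookup. The following Python sketch is illustrative only: the great-circle (haversine) distance, the function names, and all coordinates are assumptions, since the text does not specify how geographic proximity is measured.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nearest_reference(point, references):
    """Return the name of the reference position point closest to `point`."""
    return min(references, key=lambda name: haversine_km(point, references[name]))

# Hypothetical reference points: names follow Fig. 5A, coordinates are made up.
references = {
    "Ref_a": (25.0330, 121.5654),
    "Ref_b": (25.0478, 121.5170),
    "Ref_c": (25.0855, 121.5606),
}
# Hypothetical exact payment positions.
payments = [(25.0340, 121.5640), (25.0480, 121.5200), (25.0800, 121.5600)]

# Step S120: replace every exact position with its nearest reference point.
labels = [nearest_reference(p, references) for p in payments]
```

For a large number of reference points, a spatial index would likely replace the linear scan used here.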
Step S200: detect frequent patterns in the position data to generate a plurality of representative sequences. After data collection and the preprocessing stage described above, the payment record can be converted into payment sequences. Each payment sequence includes an ordered series of items and represents the whole payment trajectory of a specific user in a specific time period. The items in a payment sequence can be reference position points as shown in Figs. 5A and 5B; an example payment sequence is {id_677: Ref_h, Ref_c, Ref_c}. A sequential pattern mining algorithm, such as PrefixSpan or Generalized Sequential Pattern (GSP), can be applied to the payment sequences to find the frequent patterns in them. Sequential pattern mining yields the representative sequences for a specific time period together with their support counts; the support count represents the number of occurrences and is calculated by the mining algorithm. Each representative sequence includes at least one line segment between a starting position point and an ending position point, each of which can be an exact position or a reference position point. For example, a payment entry of the traffic type in the payment record can be regarded as a line segment between the departure position and the arrival position. An example of a representative sequence is <Ref_a, Ref_d, Ref_e>, which includes two line segments: one from Ref_a to Ref_d and another from Ref_d to Ref_e.
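To make the notion of a support count concrete, the sketch below counts how many payment sequences contain a candidate pattern as an ordered subsequence. It is not PrefixSpan or GSP, which enumerate candidate patterns efficiently; it only illustrates the quantity those algorithms compute. Allowing gaps between matched items is an assumption, and the third sample sequence is invented for illustration.

```python
def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so later items must appear further on.
    return all(item in it for item in pattern)

def support(pattern, sequences):
    """Number of sequences that contain `pattern` as an ordered subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences)

sequences = [
    ["Ref_a", "Ref_d", "Ref_e"],           # uid 604 in the text
    ["Ref_h", "Ref_c", "Ref_c"],           # uid 677 in the text
    ["Ref_a", "Ref_b", "Ref_d", "Ref_e"],  # invented third user
]

count_ade = support(["Ref_a", "Ref_d", "Ref_e"], sequences)
count_cc = support(["Ref_c", "Ref_c"], sequences)
count_ea = support(["Ref_e", "Ref_a"], sequences)
```

A pattern would be called frequent when its support count reaches a chosen minimum, a threshold the text leaves to the mining algorithm.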
In step S200, the payment record can first be sorted by uid and by date and time; for example, the transactions of the same user device are grouped together and ordered chronologically. In addition, each position point can be labeled with its nearest reference position point, as shown in Fig. 5B. Fig. 6 is a schematic diagram of the payment record after sorting and simplification according to an embodiment of the invention. In this example, uid 604 forms the transaction sequence <Ref_a, Ref_d, Ref_e> and uid 677 forms the transaction sequence <Ref_h, Ref_c, Ref_c>. A sequential pattern mining algorithm can then be applied to the transaction sequences to find frequent patterns.
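The sorting and grouping described above can be sketched as follows. This is a simplified illustration: each record is assumed to carry a single, already labeled location, and the field names `uid`, `time`, and `location` as well as the timestamps are hypothetical. The real payment record has separate departure and arrival fields.

```python
from collections import defaultdict
from datetime import datetime

def build_sequences(records):
    """Group records by uid and order each user's locations chronologically."""
    by_uid = defaultdict(list)
    for rec in sorted(records, key=lambda r: (r["uid"], r["time"])):
        by_uid[rec["uid"]].append(rec["location"])
    return dict(by_uid)

# Unsorted sample records; the timestamps are invented.
records = [
    {"uid": 677, "time": datetime(2015, 5, 4, 8, 10), "location": "Ref_c"},
    {"uid": 604, "time": datetime(2015, 5, 4, 7, 30), "location": "Ref_d"},
    {"uid": 604, "time": datetime(2015, 5, 4, 7, 5), "location": "Ref_a"},
    {"uid": 677, "time": datetime(2015, 5, 4, 7, 45), "location": "Ref_h"},
    {"uid": 604, "time": datetime(2015, 5, 4, 8, 20), "location": "Ref_e"},
    {"uid": 677, "time": datetime(2015, 5, 4, 8, 40), "location": "Ref_c"},
]

sequences = build_sequences(records)
```

With this sample, the two resulting sequences match the transaction sequences given for uid 604 and uid 677.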
Fig. 7 is a flowchart of detecting frequent patterns in the position data to generate representative sequences according to an embodiment of the invention. Step S200 may include steps S210, S220, and S230, which are performed after sequential pattern mining. Step S210: if a frequent pattern has only a single position point, remove it. Step S220: in each frequent pattern, remove identical adjacent position points. Because the method looks for crowd movement behaviors, frequent patterns that represent staying in place, such as <Ref_a> and <Ref_a, Ref_a>, can be excluded. Likewise, in a frequent pattern that includes at least two identical adjacent position points, such as <Ref_h, Ref_c, Ref_c>, the repeated adjacent position points can be removed. The resulting representative sequences contain no identical adjacent position points, so every line segment in every representative sequence represents a direction of crowd movement.
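A minimal sketch of steps S210 and S220 follows. The function names are hypothetical. The sketch collapses adjacent duplicates first and drops single-point patterns afterwards, an ordering chosen here so that a pattern such as <Ref_a, Ref_a> is also excluded, as the text requires. How the support counts of patterns that become identical after cleaning should be merged is not addressed.

```python
from itertools import groupby

def collapse_adjacent(pattern):
    """Step S220: keep one point from each run of identical adjacent points."""
    return [point for point, _ in groupby(pattern)]

def to_representative(patterns):
    """Collapse adjacent duplicates, then drop patterns with fewer than two
    points (step S210), since those contain no line segment."""
    cleaned = (collapse_adjacent(p) for p in patterns)
    return [p for p in cleaned if len(p) >= 2]

frequent_patterns = [
    ["Ref_a"],
    ["Ref_a", "Ref_a"],
    ["Ref_h", "Ref_c", "Ref_c"],
    ["Ref_a", "Ref_d", "Ref_e"],
]

representative = to_representative(frequent_patterns)
```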
Step S230: aggregate the frequent patterns of multiple days to generate the representative sequences. The collected payment records can be classified by time period and by date, because crowd movement trajectories may differ between time periods. For example, the three periods 6:30am to 9:30am, 10:30am to 1:30pm, and 4:30pm to 7:30pm may each correspond to different crowd activities. If cumulative statistics are needed for a specific time period over many days, the frequent patterns of that period on those days can be aggregated. Fig. 8 is a schematic diagram of aggregating frequent patterns according to an embodiment of the invention. In this example, the representative sequences of the period 6:30am to 9:30am on all working days in May are aggregated, and the numbers in the table are the support counts of the representative sequences. As shown in Fig. 8, after the sequence <Ref_d, Ref_e> is accumulated over each working day, its support count for the whole of May is 5750. The figure 23/23 is the occurrence rate, meaning that the sequence <Ref_d, Ref_e> occurred on all 23 of the 23 working days in May. Aggregating statistics over many days makes it possible to find the movement trend of most people more accurately.
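The multi-day aggregation of step S230 can be sketched as a pair of counters: one accumulating support counts, one counting the days on which each sequence appears. The function name and the three days of numbers below are invented for illustration and are not taken from Fig. 8.

```python
from collections import Counter

def aggregate(daily_supports):
    """daily_supports: one {sequence_tuple: support_count} dict per day.
    Returns {sequence: (total_support, days_present, days_total)}."""
    total = Counter()
    days_present = Counter()
    for day in daily_supports:
        for seq, count in day.items():
            total[seq] += count
            days_present[seq] += 1
    n_days = len(daily_supports)
    return {seq: (total[seq], days_present[seq], n_days) for seq in total}

# Three invented working days for the same morning period.
daily_supports = [
    {("Ref_d", "Ref_e"): 250, ("Ref_a", "Ref_d"): 100},
    {("Ref_d", "Ref_e"): 240},
    {("Ref_d", "Ref_e"): 260, ("Ref_a", "Ref_d"): 120},
]

summary = aggregate(daily_supports)
```

The last two elements of each tuple correspond to the occurrence rate, written as 23/23 in the example of Fig. 8.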
Step S300: classify the representative sequences into clusters according to the sequence distances between them, so as to find the crowd movement behaviors. After steps S100 and S200, representative sequences of crowd movement behaviors have been found. Similar representative sequences can then be classified into clusters to find crowd movement behaviors between larger regions. This is useful because most crowd movement behaviors of interest concern movement between two broad regions rather than between two specific buildings. In step S300, the sequence distance between two representative sequences can be calculated to determine how similar they are. For example, a shorter sequence distance means the two representative sequences are more similar (for example, geographically closer). The representative sequences can be classified into clusters according to the calculated sequence distances.
An example of calculating the sequence distance follows. A representative sequence can be regarded as a series of line segments. In this example, the representative sequences include a first sequence Seq_a and a second sequence Seq_b. The first sequence Seq_a includes a first line segment L1 between a first starting position point L1_s and a first ending position point L1_e. The second sequence Seq_b includes a second line segment L2 between a second starting position point L2_s and a second ending position point L2_e. The sequence distance between Seq_a and Seq_b is determined according to the line segment distance between L1 and L2. The first line segment L1 is directed (from L1_s to L1_e), and the second line segment L2 is likewise directed (from L2_s to L2_e). In the following description, the vectors $\vec{L_1}$ and $\vec{L_2}$ therefore represent L1 and L2, respectively.
Fig. 9 is a flowchart of calculating the sequence distances between representative sequences according to an embodiment of the invention. The first sequence Seq_a and the second sequence Seq_b form a sequence pair among the representative sequences. For each sequence pair, step S300 (classifying the representative sequences into clusters) may further include steps S310 and S320. Step S310: calculate the line segment distance between the first line segment L1 and the second line segment L2; the line segment distance represents how close or how similar the two line segments are. Step S320: determine the sequence distance between Seq_a and Seq_b according to the line segment distance between L1 and L2. In other words, the similarity between two representative sequences is determined by the similarity between the line segments they contain.
Fig. 10 is a flowchart of calculating the line segment distance between the first line segment and the second line segment according to an embodiment of the invention. Step S310 can include the following steps. Step S311: calculate the angular distance dθ (angle distance), the perpendicular distance d⊥, and the parallel distance d|| between L1 and L2. Step S312: from dθ, d⊥, and d||, calculate the normalized angular distance Ndθ, the normalized perpendicular distance Nd⊥, and the normalized parallel distance Nd||, whose values lie in the same range. Step S313: determine the line segment distance between L1 and L2 as a weighted sum of Ndθ, Nd⊥, and Nd||. An example of calculating the line segment distance between two line segments follows.
Fig. 11 is a schematic diagram of the line segment distance between two line segments according to an embodiment of the invention. The line segment distance is determined by three components: the angular distance dθ, the perpendicular distance d⊥, and the parallel distance d||. The angular distance dθ relates to the angle θ between $\vec{L_1}$ and $\vec{L_2}$ (0° ≤ θ ≤ 180°). For example, θ can be calculated from
$$\cos\theta = \frac{\vec{L_1} \cdot \vec{L_2}}{\|\vec{L_1}\| \, \|\vec{L_2}\|}$$
where $\vec{L_1} \cdot \vec{L_2}$ is the inner product (dot product) of the two vectors, and $\|\vec{L_1}\|$ and $\|\vec{L_2}\|$ are the lengths of the two vectors. The angular distance dθ can be calculated according to formula (1). The formula is not legible in this text; the form below is reconstructed from the surrounding description and should be read as a likely form only:
$$d_\theta = \begin{cases} \min(\|\vec{L_1}\|, \|\vec{L_2}\|) \times \sin\theta, & 0^\circ \le \theta \le 90^\circ \\ \min(\|\vec{L_1}\|, \|\vec{L_2}\|), & 90^\circ < \theta \le 180^\circ \end{cases} \qquad (1)$$
The angular distance dθ represents how similar the two vectors are in pointing direction: the smaller the angle θ, the smaller dθ. When θ is greater than 90°, the two vectors point in substantially opposite directions, and dθ is set to the maximum possible value of the angular distance domain to indicate that the two vectors are dissimilar in direction.
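The angular distance can be sketched in code under an explicit assumption: formula (1) is taken to be the shorter segment length multiplied by sin θ for θ below 90°, and the shorter segment length itself otherwise. That form is consistent with the description (smaller θ gives smaller dθ; above 90° the domain maximum is used) but is not confirmed by this text. Segments are represented as pairs of planar (x, y) points, which is also an assumption.

```python
import math

def angular_distance(l1, l2):
    """l1, l2: directed segments ((x_s, y_s), (x_e, y_e)) with nonzero length.
    Assumed form of formula (1): min(|L1|, |L2|) * sin(theta) below 90 degrees,
    and min(|L1|, |L2|) (the domain maximum) from 90 degrees upward."""
    v1 = (l1[1][0] - l1[0][0], l1[1][1] - l1[0][1])
    v2 = (l2[1][0] - l2[0][0], l2[1][1] - l2[0][1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    shorter = min(n1, n2)
    if cos_theta <= 0:  # theta >= 90 degrees
        return shorter
    return shorter * math.sqrt(1.0 - cos_theta ** 2)  # shorter * sin(theta)

base = ((0.0, 0.0), (4.0, 0.0))
same_direction = angular_distance(base, ((1.0, 1.0), (3.0, 1.0)))
opposite = angular_distance(base, ((3.0, 1.0), (1.0, 1.0)))
thirty_degrees = angular_distance(base, ((0.0, 0.0), (math.sqrt(3.0), 1.0)))
```

Computing sin θ from cos θ avoids an explicit `acos` call; at exactly 90° both branches give the same value.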
Fig. 12 is a flowchart of calculating the parallel distance between the first line segment and the second line segment according to an embodiment of the invention. Step S311 includes the following steps. Step S331: project the second starting position point L2_s onto the extension line of the first line segment L1 to obtain a third starting projection point L3_s. Step S332: project the second ending position point L2_e onto the extension line of L1 to obtain a third ending projection point L3_e. Step S333: connect L3_s and L3_e to produce a third line segment L3. The resulting L3_s, L3_e, and L3 are shown in Fig. 11. Step S334: determine the parallel distance d|| by subtracting the intersection of L1 and L3 from the union of L1 and L3. The parallel distance d|| can be calculated according to formula (2):
$$d_{\parallel} = L_1 \cup L_3 - L_1 \cap L_3 \qquad (2)$$
Here the union and the intersection denote lengths along the common line. Because L3 is formed by projecting L2 onto L1, L3 is collinear with L1. In the example of Fig. 11, the union of L1 and L3 is the length from L1_s to L1_e, and the intersection of L1 and L3 is the length from L3_s to L3_e. In this example the second line segment L2 is projected onto the first line segment L1; in other embodiments, L1 can instead be projected onto (the extension of) L2 to obtain d|| (the calculated value may differ). The parallel distance d|| represents how similar the two line segments are in their equivalent parallel lengths.
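Steps S331 to S334 can be sketched with scalar projections onto the line through L1. The sketch treats segments as pairs of planar (x, y) points and measures the union as the total length covered by L1 and L3; both choices are assumptions, and the text does not say how the union is measured when L1 and L3 do not overlap.

```python
import math

def parallel_distance(l1, l2):
    """Project L2 onto the line through L1 to get L3, then return
    length(L1 union L3) - length(L1 intersect L3), as in formula (2)."""
    (x1, y1), (x2, y2) = l1
    ux, uy = x2 - x1, y2 - y1
    len1 = math.hypot(ux, uy)
    ux, uy = ux / len1, uy / len1  # unit vector along L1
    # Positions of the projection points L3_s and L3_e, measured from L1_s.
    t_s = (l2[0][0] - x1) * ux + (l2[0][1] - y1) * uy
    t_e = (l2[1][0] - x1) * ux + (l2[1][1] - y1) * uy
    lo, hi = min(t_s, t_e), max(t_s, t_e)
    len3 = hi - lo
    intersection = max(0.0, min(len1, hi) - max(0.0, lo))
    union = len1 + len3 - intersection  # covered length (assumption)
    return union - intersection

l1 = ((0.0, 0.0), (10.0, 0.0))
inside = parallel_distance(l1, ((2.0, 3.0), (6.0, 5.0)))      # L3 inside L1
overlap = parallel_distance(l1, ((8.0, 1.0), (13.0, 2.0)))    # partial overlap
disjoint = parallel_distance(l1, ((12.0, 1.0), (15.0, 1.0)))  # no overlap
```

In the first case L3 lies inside L1, as in the Fig. 11 example, so the result is the length of L1 minus the length of L3.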
The perpendicular distance d⊥ can be calculated according to formula (3):
$$d_{\perp} = \frac{l_{\perp s}^{2} + l_{\perp e}^{2}}{l_{\perp s} + l_{\perp e}} \qquad (3)$$
where l⊥s is the Euclidean distance between the second starting position point L2_s and the third starting projection point L3_s, and l⊥e is the Euclidean distance between the second ending position point L2_e and the third ending projection point L3_e. Formula (3) is the contraharmonic mean of l⊥s and l⊥e.
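The contraharmonic mean of the two endpoint offsets can be sketched as follows. Segments are pairs of planar (x, y) points and the function names are hypothetical. Returning 0 when both offsets are zero is a convention added here to avoid dividing by zero; the text does not discuss that case.

```python
import math

def point_to_line_distance(p, a, b):
    """Distance from point p to the infinite line through a and b."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return abs(cross) / math.hypot(b[0] - a[0], b[1] - a[1])

def perpendicular_distance(l1, l2):
    """Contraharmonic mean of the distances from L2's endpoints to their
    projections on the line through L1."""
    l_s = point_to_line_distance(l2[0], l1[0], l1[1])
    l_e = point_to_line_distance(l2[1], l1[0], l1[1])
    if l_s + l_e == 0:  # L2 lies on the line through L1 (added convention)
        return 0.0
    return (l_s ** 2 + l_e ** 2) / (l_s + l_e)

l1 = ((0.0, 0.0), (10.0, 0.0))
d_perp = perpendicular_distance(l1, ((2.0, 3.0), (6.0, 5.0)))    # offsets 3 and 5
d_on_line = perpendicular_distance(l1, ((1.0, 0.0), (5.0, 0.0)))
```

Compared with the arithmetic mean, the contraharmonic mean gives more weight to the larger of the two offsets.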
Fig. 13 is a schematic diagram of the line segment distance between two line segments according to an embodiment of the invention. The angular distance dθ, the parallel distance d||, and the perpendicular distance d⊥ can likewise be calculated according to formulas (1), (2), and (3), respectively. In this example the angle θ is greater than 90°, so dθ equals the maximum of the angular distance domain, that is, the length of the shorter of L1 and L2. For the parallel distance d||, the union of L1 and L3 is the length from the first starting position point L1_s to the third ending projection point L3_e, and the intersection of L1 and L3 is the length from the third starting projection point L3_s to the first ending position point L1_e.
As described above, three components are considered together when calculating the line segment distance. However, the ranges of the three components may differ greatly, which makes it difficult to combine them meaningfully. In the method of the invention, step S312 calculates the normalized angular distance Ndθ, the normalized parallel distance Nd||, and the normalized perpendicular distance Nd⊥, whose values lie in the same range, for example [0, 1], the interval of values greater than or equal to 0 and less than or equal to 1. Because the three normalized components lie in the same range, a linear combination of them is meaningful for calculating the line segment distance between two line segments. In one embodiment, the line segment distance is a weighted sum of Ndθ, Nd||, and Nd⊥ and can be calculated according to formula (4):
$$\text{line segment distance} = w_1 \times Nd_\theta + w_2 \times Nd_{\parallel} + w_3 \times Nd_{\perp} \qquad (4)$$
The condition on the weights is not legible in this text; it is presumably $w_1 + w_2 + w_3 = 1$, which is consistent with the following example. For example, w1, w2, and w3 can each equal 1/3, which gives the mean of Ndθ, Nd||, and Nd⊥.
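The weighted sum of formula (4) is short enough to state directly in code. The check that the weights sum to 1 is an assumption inferred from the equal-weight example; the function name and default weights are illustrative.

```python
def segment_distance(nd_theta, nd_par, nd_perp, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the three normalized components (formula (4)).
    The weights are assumed to sum to 1, so the result stays in [0, 1]
    whenever each component is in [0, 1]."""
    w1, w2, w3 = weights
    if abs(w1 + w2 + w3 - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    return w1 * nd_theta + w2 * nd_par + w3 * nd_perp

equal_weights = segment_distance(0.3, 0.6, 0.9)
angle_heavy = segment_distance(0.2, 0.4, 1.0, weights=(0.5, 0.25, 0.25))
```

Raising the first weight, as in the second call, makes direction matter more than position when comparing two segments.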
Figure 14 is a flow chart of calculating the normalized angular distance, the normalized perpendicular distance, and the normalized parallel distance according to an embodiment of the invention. Step S312 may include the following steps. Step S341: divide the angular distance dθ by the maximum of the angular distance range to obtain the normalized angular distance Ndθ. Step S342: divide the perpendicular distance d⊥ by the maximum of the perpendicular distance range to obtain the normalized perpendicular distance Nd⊥. Step S343: divide the parallel distance d|| by the maximum of the parallel distance range to obtain the normalized parallel distance Nd||. Because each of the three normalized distances is produced by dividing by the maximum of its own range, all three normalized distance components lie in [0,1].
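Steps S341 to S343 amount to three divisions, which can be sketched as below. The function names are hypothetical, the range maxima are passed in by the caller rather than computed, and the handling of a zero-width range and the clamping against floating-point overshoot are assumptions of this sketch.

```python
def normalize(distance, range_max):
    """Divide a raw distance by the maximum of its range."""
    if range_max <= 0:
        # Assumption: a degenerate range (e.g. zero-length segments)
        # is treated as distance 0.
        return 0.0
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return min(max(distance / range_max, 0.0), 1.0)


def normalized_components(d_theta, d_perp, d_par,
                          max_theta, max_perp, max_par):
    """Return (Nd_theta, Nd_perp, Nd_par) for steps S341, S342, S343."""
    return (
        normalize(d_theta, max_theta),  # step S341
        normalize(d_perp, max_perp),    # step S342
        normalize(d_par, max_par),      # step S343
    )
```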
As shown in formula (1), the maximum of the angular distance range is the length of the shorter of the first line segment L1 and the second line segment L2. As shown in formula (2), the maximum of the parallel distance range is the union of the first line segment L1 and the third line segment L3. The maximum of the perpendicular distance range is difficult to read directly from formula (3); its calculation is described below.
Figures 15A to 15D are schematic diagrams of the cases considered for the maximum of the perpendicular distance range according to an embodiment of the invention. According to formula (3) and the geometric relationship between the first line segment L1 and the second line segment L2, the maximum of the perpendicular distance range occurs when the second line segment L2 is perpendicular to the first line segment L1. Therefore, in one embodiment, the second line segment L2 can be rotated around the second start position point L2_s or around the second end position point L2_e until it is perpendicular to the first line segment L1. The maximum of the perpendicular distance range is then the perpendicular distance between the first line segment L1 and the rotated second line segment. Figures 15A to 15D show the four possible rotations. The maximum of the perpendicular distance range is the largest perpendicular distance among these four cases, and can be calculated according to formula (5), in which l2 denotes the length of the second line segment L2.
The line segment distance between two line segments can be obtained by the calculation steps above, and the sequence distance between two representative sequences can then be determined from the line segment distances between the line segments that the two representative sequences respectively contain. Figure 16 is a flow chart of determining the sequence distance between a first sequence and a second sequence according to an embodiment of the invention. Step S320 includes the following steps. Step S321: generate a plurality of mapping combinations (Mapping) between the first sequence Seq_a and the second sequence Seq_b according to at least one line segment of Seq_a and at least one line segment of Seq_b. Step S322: calculate the mapping distance of each mapping combination. Step S323: take the minimum mapping distance among the mapping combinations as the sequence distance between Seq_a and Seq_b.
The first sequence Seq_a can be a sequence of line segments arranged in time order. For example, Seq_a may include two line segments LineSega1 and LineSega2, where the movement track represented by LineSega1 is earlier than the movement track represented by LineSega2. Figures 17A to 17C are schematic diagrams of several mapping combinations between two representative sequences according to an embodiment of the invention. In this example, the second sequence Seq_b also includes two line segments arranged in time order, LineSegb1 and LineSegb2.
In Figure 17A, line segment LineSegb1 maps to the empty line segment φ, line segment LineSegb2 maps to line segment LineSega1, and line segment LineSega2 maps to the empty line segment φ. Note that the original time order within each representative sequence is preserved. Figures 17B and 17C show two other mapping combinations, in which the original time order of each representative sequence is likewise preserved. Figure 18 is a schematic diagram of an invalid mapping between two representative sequences. This mapping violates the time order: line segment LineSega2 (mapped to LineSegb1) occurs later than line segment LineSega1 (mapped to LineSegb2), yet LineSegb1 occurs earlier than LineSegb2. A mapping distance can be calculated for each valid mapping combination (as shown in Figures 17A to 17C). The sequence distance between the first sequence Seq_a and the second sequence Seq_b can be the minimum mapping distance among the mapping combinations.
Figure 19 is a flow chart of calculating the mapping distance of a mapping combination according to an embodiment of the invention. Step S322 includes the following steps. Step S351: according to time order, form a plurality of mapping pairs (Mapping Pair) between at least one line segment of the first sequence Seq_a and at least one line segment of the second sequence Seq_b. Step S352: calculate the line segment distance of each mapping pair. Step S353: calculate the mean of the line segment distances of the mapping pairs to obtain the mapping distance.
Referring to Figure 17A, this mapping combination includes three mapping pairs: {φ, LineSegb1}, {LineSega1, LineSegb2}, and {LineSega2, φ}. The line segment distance of each mapping pair can be calculated as described above (including the three normalized distance components, steps S311 to S313, and formulas (1) to (5)). The line segment distance between a real line segment and an empty line segment φ can be defined as 1 (the maximum possible value of the line segment distance range). The mapping distance of the mapping combination shown in Figure 17A can be the mean of the line segment distances of these three mapping pairs. For example, the mapping distance of this mapping combination is equal to
$$\frac{Nd(\varphi, \mathrm{LineSegb1}) + Nd(\mathrm{LineSega1}, \mathrm{LineSegb2}) + Nd(\mathrm{LineSega2}, \varphi)}{3} = \frac{1 + Nd(\mathrm{LineSega1}, \mathrm{LineSegb2}) + 1}{3},$$
where Nd denotes the line segment distance between two line segments. Similarly, there are two mapping pairs in Figure 17B, and the mapping distance can be the mean of the line segment distances of these two mapping pairs.
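The procedure for generating time-ordered mappings, averaging their pair distances, and taking the minimum can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: all names are hypothetical, None stands in for the empty line segment φ, segment_distance is a caller-supplied function returning a value in [0,1], and every order-preserving alignment is enumerated (including ones in which all segments map to φ), which is only practical for short sequences.

```python
EMPTY = None  # stands in for the empty line segment


def order_preserving_mappings(seq_a, seq_b):
    """Yield every mapping that keeps the time order of both sequences."""
    if not seq_a and not seq_b:
        yield []
        return
    if seq_a and seq_b:  # pair the two earliest remaining segments
        for rest in order_preserving_mappings(seq_a[1:], seq_b[1:]):
            yield [(seq_a[0], seq_b[0])] + rest
    if seq_a:  # map the earliest segment of seq_a to the empty segment
        for rest in order_preserving_mappings(seq_a[1:], seq_b):
            yield [(seq_a[0], EMPTY)] + rest
    if seq_b:  # map the earliest segment of seq_b to the empty segment
        for rest in order_preserving_mappings(seq_a, seq_b[1:]):
            yield [(EMPTY, seq_b[0])] + rest


def mapping_distance(mapping, segment_distance):
    """Mean line segment distance over the mapping pairs."""
    if not mapping:
        return 0.0  # assumption: two empty sequences are at distance 0
    total = 0.0
    for seg_a, seg_b in mapping:
        if seg_a is EMPTY or seg_b is EMPTY:
            total += 1.0  # distance to the empty segment is defined as 1
        else:
            total += segment_distance(seg_a, seg_b)
    return total / len(mapping)


def sequence_distance(seq_a, seq_b, segment_distance):
    """Minimum mapping distance over all order-preserving mappings."""
    return min(mapping_distance(m, segment_distance)
               for m in order_preserving_mappings(seq_a, seq_b))
```

Because the earliest remaining segments are always consumed first, a crossed pairing such as the invalid one in Figure 18 is never generated.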
With the calculation described above, the sequence distance between two representative sequences can be calculated. Figure 20 is a flow chart of classifying the representative sequences into sets according to an embodiment of the invention. Step S300 includes the following steps. Step S360: take each representative sequence as a set. Step S370: calculate the set distance of each set pair, where a set pair is formed by two sets. Step S380: find the first set and the second set that have the minimum set distance. Step S390: if the minimum set distance is less than a distance threshold, merge the first set and the second set.
This embodiment can use agglomerative hierarchical clustering (Agglomerative Hierarchical Clustering). In the initial state, each representative sequence is regarded as a set. The set distance of each set pair (formed by two sets, which in the initial state are two representative sequences) can then be calculated. In the initial state, the set distance can be calculated by the method of steps S351 to S353, because it is equivalent to calculating the sequence distance between two representative sequences. The two sets with the minimum set distance are found, and if the minimum set distance is less than the distance threshold, for example 0.3, the two sets are merged into a larger set. The flow then returns to step S370 so that sets are merged repeatedly. After merging, some sets contain multiple representative sequences. For a set pair formed by such sets, the mean of the sequence distances of all representative-sequence pairings in the set pair (all-pair linkage, All-Pair Linkage) can be calculated and used as the set distance of the set pair, where each pairing consists of one representative sequence from each of the two sets in the set pair. For example, if set G1 has two representative sequences and set G2 has three representative sequences, the set distance between G1 and G2 can be the mean of the sequence distances of the 2 × 3 = 6 representative-sequence pairings.
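The merging loop of steps S360 to S390 with all-pair linkage can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, sequence_distance is a caller-supplied function, and the loop recomputes every set distance in each round, which is simple but not efficient for large inputs.

```python
from itertools import combinations


def set_distance(set_a, set_b, sequence_distance):
    """All-pair linkage: mean sequence distance over every cross-set pairing."""
    total = sum(sequence_distance(a, b) for a in set_a for b in set_b)
    return total / (len(set_a) * len(set_b))


def cluster_sequences(sequences, sequence_distance, threshold=0.3):
    """Merge the closest pair of sets until no pair is closer than threshold."""
    sets = [[seq] for seq in sequences]  # step S360: one set per sequence
    while len(sets) > 1:
        # Steps S370 and S380: compute every set distance, keep the minimum.
        (i, j), best = min(
            (((i, j), set_distance(sets[i], sets[j], sequence_distance))
             for i, j in combinations(range(len(sets)), 2)),
            key=lambda item: item[1],
        )
        if best >= threshold:  # step S390: merge only below the threshold
            break
        sets[i] = sets[i] + sets[j]
        del sets[j]  # safe because i < j
    return sets
```

For a set with two sequences and a set with three, set_distance averages 2 × 3 = 6 sequence distances, matching the G1 and G2 example.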
In one embodiment, a method for finding typical sequences (Typical Sequence) is provided. Figure 21 is a flow chart of finding crowd movement behavior and finding typical sequences according to an embodiment of the invention. Compared with the flow chart of Figure 2, Figure 21 further includes step S410 and step S420. Step S410: divide dates into a plurality of date types. For example, dates can be classified as working days and holidays, and working days can be further classified as the last working day before a single-day holiday, the last working day before a holiday of at least two days, the first working day after a single-day holiday, and so on. Similarly, holidays can be further classified as single-day holidays, the first day of a holiday of at least two days, the last day of a holiday of at least two days, and so on. Step S420: find the typical sequences of a target date type according to the occurrence rates of the representative sequences in the target date type. As shown in Figure 8, after the data of multiple days are aggregated, the occurrence rate of a representative sequence in a specific date type can be obtained, and the typical sequences of that date type can be found from the occurrence rates. For example, for the last day of a holiday of at least two days, a typical sequence that may be found is a movement track between two railway stations.
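One way to assign a date type in step S410 is sketched below. The label strings, the function names, and the rule that a working day adjacent to holidays on both sides is labeled by the holiday that follows it are all assumptions of this sketch; the text only gives examples of date types. The holidays argument is a caller-supplied set of non-working dates.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def holiday_run(day, holidays):
    """Return (first, last) of the run of consecutive holidays containing day."""
    first = last = day
    while first - ONE_DAY in holidays:
        first -= ONE_DAY
    while last + ONE_DAY in holidays:
        last += ONE_DAY
    return first, last


def run_kind(day, holidays):
    """Classify the holiday run containing day as single-day or multi-day."""
    first, last = holiday_run(day, holidays)
    return "single_day" if first == last else "multi_day"


def date_type(day, holidays):
    """Assign a hypothetical date-type label to day."""
    if day in holidays:
        first, last = holiday_run(day, holidays)
        if first == last:
            return "single_day_holiday"
        if day == first:
            return "first_day_of_multi_day_holiday"
        if day == last:
            return "last_day_of_multi_day_holiday"
        return "middle_day_of_multi_day_holiday"
    if day + ONE_DAY in holidays:
        return f"last_workday_before_{run_kind(day + ONE_DAY, holidays)}_holiday"
    if day - ONE_DAY in holidays:
        return f"first_workday_after_{run_kind(day - ONE_DAY, holidays)}_holiday"
    return "workday"
```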
Figure 22 is a flow chart of finding the typical sequences of a target date type according to an embodiment of the invention. Step S420 includes the following steps. Step S421: calculate a first occurrence rate of a test representative sequence on dates that belong to the target date type. Step S422: calculate a second occurrence rate of the test representative sequence on dates that do not belong to the target date type. Step S423: calculate a statistical entropy (Entropy) from the first occurrence rate and the second occurrence rate. Step S424: if the first occurrence rate is greater than a probability threshold and the statistical entropy is less than an entropy threshold, determine that the test representative sequence is a typical sequence.
The occurrence rates in step S421 can be obtained after step S230 is performed (as shown in Figure 8, aggregating the data of multiple days). The calculations of steps S421 to S424 are illustrated below with an example. The target date type here is the first day of a holiday of at least two days, denoted class H. Dates that do not belong to class H are denoted class (all-H). Table 1 below lists two representative sequences and their corresponding occurrence rates.
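Table 1
Sequence | Occurrence rate in class H | Occurrence rate in class (all-H) |
---|---|---|
R1 | 41/56 | 2/128 |
R2 | 28/56 | 50/128 |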
The occurrence rate represents the number of dates on which a sequence occurs among the dates of a class. For example, sequence R1 occurs on 41 of the 56 dates belonging to class H, and on 2 of the 128 dates belonging to class (all-H). In this example, the probability threshold Pth of step S424 is equal to 0.2 and the entropy threshold Sth is equal to 0.6. The statistical entropy of step S423 can be calculated according to formula (6):
$$S = -\sum_{i} p_i \ln p_i, \quad \text{where } p_i \text{ is the probability of the sequence in class } i \tag{6}$$
According to formula (6), the statistical entropy S1 of sequence R1 is equal to 0.1. A larger entropy (greater disorder) means that the probability distribution is closer to a uniform distribution (Uniform Distribution), while a smaller entropy means that the probability distribution is skewed toward one end. In the above example, if the probability distribution is skewed toward class H, the sequence can be regarded as a typical sequence of class H dates. In step S424, because the first occurrence rate (41/56) is greater than the probability threshold Pth and the statistical entropy S1 = 0.1 is less than the entropy threshold Sth, sequence R1 can be determined to be a typical sequence of class H dates. Similarly, the statistical entropy S2 of sequence R2 can be calculated according to formula (6), giving S2 = 0.69. Because S2 is greater than the entropy threshold Sth, sequence R2 is not a typical sequence of class H dates. As shown in Table 1, the first occurrence rate of sequence R2 on class H dates (28/56) is close to its second occurrence rate on class (all-H) dates (50/128), which means that sequence R2 does not appear predominantly in any particular date type, and therefore sequence R2 is not a typical sequence. After multiple typical sequences of a date type have been found, they can be further classified into sets, using the classification method shown in steps S360, S370, S380, and S390.
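The worked example for sequences R1 and R2 can be checked with a short Python sketch. Two points are assumptions of this sketch rather than statements from the text: the two occurrence rates are first normalized so that they sum to 1 before being used as probabilities, and the natural logarithm is used. Under these assumptions the computed entropies round to the stated values 0.1 and 0.69. The function names are hypothetical.

```python
import math


def statistical_entropy(rates):
    """Entropy of occurrence rates after normalizing them to probabilities."""
    total = sum(rates)
    if total == 0:
        return 0.0  # assumption: a sequence that never occurs has entropy 0
    probs = [r / total for r in rates if r > 0]
    return -sum(p * math.log(p) for p in probs)


def is_typical(first_rate, second_rate, p_th=0.2, s_th=0.6):
    """Step S424: high occurrence in the target type and low entropy."""
    entropy = statistical_entropy([first_rate, second_rate])
    return first_rate > p_th and entropy < s_th


s1 = statistical_entropy([41 / 56, 2 / 128])   # sequence R1
s2 = statistical_entropy([28 / 56, 50 / 128])  # sequence R2
print(round(s1, 2), is_typical(41 / 56, 2 / 128))
print(round(s2, 2), is_typical(28 / 56, 50 / 128))
```

With the thresholds Pth = 0.2 and Sth = 0.6 from the example, R1 passes both tests and R2 fails the entropy test.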
According to the method for finding crowd movement behavior of the embodiments of the invention, crowd movement tracks can be found from location data collected from user devices, for example smart cards. A payment service provider that issues smart cards can use the crowd movement tracks obtained to estimate the number of people in a specific geographic area, design marketing and advertising plans accordingly, decide where to open a new storefront, and so on. The method for finding crowd movement behavior according to the embodiments of the invention therefore has applications in many areas and helps in making important decisions. Furthermore, because the embodiments of the invention can also find the typical sequences of a specific date type, the payment service provider can plan and organize activities suited to different date types according to the typical sequences of those date types.
In summary, although the invention has been disclosed above by way of preferred embodiments, they are not intended to limit the invention. Those of ordinary skill in the art to which the invention pertains can make various changes and modifications without departing from the spirit and scope of the invention. Therefore, the scope of protection of the invention is defined by the claims.
Claims (14)
1. A method for finding crowd movement behavior, comprising:
collecting a plurality of location data regarding a plurality of user devices;
detecting a plurality of usual patterns in the location data to generate a plurality of representative sequences, wherein each of the representative sequences includes at least one line segment, the at least one line segment lying between a start position point and an end position point; and
classifying the representative sequences into a plurality of sets according to a plurality of sequence distances between the representative sequences, so as to find the crowd movement behavior.
2. The method of claim 1, wherein the representative sequences further include:
a first sequence including a first line segment, the first line segment lying between a first start position point and a first end position point; and
a second sequence including a second line segment, the second line segment lying between a second start position point and a second end position point;
wherein the first sequence and the second sequence form a sequence pair among the representative sequences, and, for each sequence pair, the step of classifying the representative sequences into the sets further includes:
calculating a line segment distance between the first line segment and the second line segment; and
determining the sequence distance between the first sequence and the second sequence according to the line segment distance between the first line segment and the second line segment.
3. The method of claim 2, wherein the step of calculating the line segment distance between the first line segment and the second line segment further includes:
calculating an angular distance, a perpendicular distance, and a parallel distance between the first line segment and the second line segment;
calculating a normalized angular distance, a normalized perpendicular distance, and a normalized parallel distance according to the angular distance, the perpendicular distance, and the parallel distance, wherein the values of the normalized angular distance, the normalized perpendicular distance, and the normalized parallel distance lie in the same range; and
determining the line segment distance between the first line segment and the second line segment according to a weighted sum of the normalized angular distance, the normalized perpendicular distance, and the normalized parallel distance.
4. The method of claim 3, wherein the step of calculating the parallel distance between the first line segment and the second line segment further includes:
projecting the second start position point onto the extension line of the first line segment to obtain a third start projection point;
projecting the second end position point onto the extension line of the first line segment to obtain a third end projection point;
connecting the third start projection point and the third end projection point to produce a third line segment; and
subtracting the intersection of the first line segment and the third line segment from the union of the first line segment and the third line segment to determine the parallel distance.
5. The method of claim 4, wherein the step of calculating the normalized angular distance, the normalized perpendicular distance, and the normalized parallel distance further includes:
dividing the angular distance by the maximum of an angular distance range to obtain the normalized angular distance;
dividing the perpendicular distance by the maximum of a perpendicular distance range to obtain the normalized perpendicular distance; and
dividing the parallel distance by the maximum of a parallel distance range to obtain the normalized parallel distance.
6. The method of claim 5, wherein the maximum of the angular distance range is the length of the shorter of the first line segment and the second line segment, the maximum of the parallel distance range is the union of the first line segment and the third line segment, and the maximum of the perpendicular distance range is the perpendicular distance between the first line segment and a rotated line segment, the rotated line segment being obtained by rotating the second line segment around the second start position point or around the second end position point until it is perpendicular to the first line segment.
7. The method of claim 2, wherein the step of determining the sequence distance between the first sequence and the second sequence further includes:
generating a plurality of mapping combinations between the first sequence and the second sequence according to at least one line segment of the first sequence and at least one line segment of the second sequence;
calculating a mapping distance of each of the mapping combinations; and
taking the minimum mapping distance among the mapping combinations as the sequence distance between the first sequence and the second sequence.
8. The method of claim 7, wherein the step of calculating the mapping distance of each of the mapping combinations further includes:
forming, according to time order, a plurality of mapping pairs between at least one line segment of the first sequence and at least one line segment of the second sequence;
calculating a line segment distance of each of the mapping pairs; and
calculating the mean of the line segment distances of the mapping pairs to obtain the mapping distance.
9. The method of claim 1, wherein the location data are collected when the user devices are used for payment activities.
10. The method of claim 1, wherein the step of collecting the location data regarding the user devices further includes:
selecting a plurality of reference position points; and
replacing each position point in the location data with the one of the reference position points that is geographically closest to that position point.
11. The method of claim 1, wherein the step of detecting the usual patterns in the location data to generate the representative sequences further includes:
removing, from the usual patterns, any usual pattern that has only a single position point;
removing identical adjacent position points within each of the usual patterns; and
aggregating the usual patterns of multiple days to generate the representative sequences.
12. The method of claim 1, wherein the step of classifying the representative sequences into the sets further includes:
taking each of the representative sequences as a set;
selecting two sets from the sets to form a set pair, and calculating a set distance of each set pair;
finding a first set and a second set that have the minimum set distance; and
merging the first set and the second set if the minimum set distance is less than a distance threshold.
13. The method of claim 1, further comprising:
dividing dates into a plurality of date types; and
finding a typical sequence of a target date type according to the occurrence rates of the representative sequences in the target date type.
14. The method of claim 13, wherein the step of finding the typical sequence of the target date type further includes:
calculating a first occurrence rate of a test representative sequence on dates that belong to the target date type;
calculating a second occurrence rate of the test representative sequence on dates that do not belong to the target date type;
calculating a statistical entropy according to the first occurrence rate and the second occurrence rate; and
determining that the test representative sequence is the typical sequence if the first occurrence rate is greater than a probability threshold and the statistical entropy is less than an entropy threshold.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/936,674 US10417648B2 (en) | 2015-11-09 | 2015-11-09 | System and computer readable medium for finding crowd movements |
TW104142157A TWI622888B (en) | 2015-11-09 | 2015-12-15 | Method for finding crowd movements and non-transitory computer readable medium execute the same |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106682051A true CN106682051A (en) | 2017-05-17 |
CN106682051B CN106682051B (en) | 2020-05-29 |
Family
ID=58865128
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510982408.8A Active CN106682051B (en) | 2015-11-09 | 2015-12-24 | Method for finding out crowd movement behaviors |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106682051B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI797916B (en) * | 2021-12-27 | 2023-04-01 | 博晶醫電股份有限公司 | Human body detection method, human body detection device, and computer readable storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110071881A1 (en) * | 2009-09-18 | 2011-03-24 | Microsoft Corporation | Mining life pattern based on location history |
CN102067631A (en) * | 2008-06-27 | 2011-05-18 | 雅虎公司 | System and method for determination and display of personalized distance |
US20110208425A1 (en) * | 2010-02-23 | 2011-08-25 | Microsoft Corporation | Mining Correlation Between Locations Using Location History |
TW201336474A (en) * | 2011-12-07 | 2013-09-16 | 通路實業集團國際公司 | Behavior tracking and modification system |
US20150227934A1 (en) * | 2014-02-11 | 2015-08-13 | Mastercard International Incorporated | Method and system for determining and assessing geolocation proximity |
Non-Patent Citations (1)
Title |
---|
JAE-GIL LEE等,: ""Trajectory Clustering: A Partition-and-Group Framework"", 《SIGMOD’07》 * |
Also Published As
Publication number | Publication date |
---|---|
CN106682051B (en) | 2020-05-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Huang et al. | Transport mode detection based on mobile phone network data: A systematic review | |
Chen et al. | Dynamic cluster-based over-demand prediction in bike sharing systems | |
Biagioni et al. | Easytracker: automatic transit tracking, mapping, and arrival time prediction using smartphones | |
Versichele et al. | Pattern mining in tourist attraction visits through association rule learning on Bluetooth tracking data: A case study of Ghent, Belgium | |
Michau et al. | Bluetooth data in an urban context: Retrieving vehicle trajectories | |
CN108650632A (en) | It is a kind of based on duty live correspondence and when space kernel clustering stationary point judgment method | |
US10417648B2 (en) | System and computer readable medium for finding crowd movements | |
Chang et al. | Understanding user’s travel behavior and city region functions from station-free shared bike usage data | |
Yu et al. | iVizTRANS: Interactive visual learning for home and work place detection from massive public transportation data | |
Burkhard et al. | On the requirements on spatial accuracy and sampling rate for transport mode detection in view of a shift to passive signalling data | |
Yang et al. | Pedestrian network generation based on crowdsourced tracking data | |
Li et al. | A two-phase clustering approach for urban hotspot detection with spatiotemporal and network constraints | |
Fu et al. | Spatial heterogeneity and migration characteristics of traffic congestion—A quantitative identification method based on taxi trajectory data | |
Jang et al. | Pedestrian mode identification, classification and characterization by tracking mobile data | |
Gao et al. | A spatiotemporal analysis of the impact of lockdown and coronavirus on London’s bicycle hire scheme: from response to recovery to a new normal | |
Jonker et al. | Modeling trip-length distribution of shopping center trips from GPS data | |
CN106682051A (en) | Method for finding out crowd movement behaviors | |
Rieser-Schüssler | Capitalising modern data sources for observing and modelling transport behaviour | |
CN110046209B (en) | Trajectory stopping point extraction method based on Gaussian model | |
Smith et al. | From buildings to cities: techniques for the multi-scale analysis of urban form and function | |
Currans et al. | Exploring ITE’s Trip Generation Manual: Assessing age of data and land-use taxonomy in vehicle trip generation for transportation impact analyses | |
Tang et al. | Integrating GIS and spatial data mining technique for target marketing of university courses | |
Eftelioglu et al. | RING-Net: Road inference from gps trajectories using a deep segmentation network | |
Liu et al. | Effects of buffer size on associations between the built environment and metro ridership: A machine learning-based sensitive analysis | |
CN113313307A (en) | Tour route mining method based on signaling big data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||