CN106682051A - Method for finding out crowd movement behaviors - Google Patents

Publication number
CN106682051A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510982408.8A
Other languages
Chinese (zh)
Other versions
CN106682051B (en
Inventor
王恩慈
吴泰廷
高崎钧
王昭智
郭奕宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial Technology Research Institute ITRI
Original Assignee
Industrial Technology Research Institute ITRI
Priority claimed from US14/936,674 (US10417648B2)
Application filed by Industrial Technology Research Institute ITRI
Publication of CN106682051A
Application granted
Publication of CN106682051B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Abstract

The invention discloses a method for finding crowd movement behaviors, comprising the following steps: collecting a plurality of location data regarding a plurality of user devices; mining a plurality of frequent patterns in the location data to generate a plurality of representative sequences, wherein each representative sequence comprises at least one line segment between a start location point and an end location point; and grouping the representative sequences into a plurality of clusters according to a plurality of sequence distances among the representative sequences, so as to find the crowd movement behaviors.

Description

Method for finding crowd movement behaviors
Technical field
The invention relates to a method for finding crowd movement behaviors from location data collected regarding user devices.
Background
For many companies and organizations, such as convenience store chains, public transportation companies, and local governments, knowing how crowds move within a city or between cities can be important information. Many of their major decisions depend on information about where crowds are, where they come from, and where they go. Examples include setting up a new bus route, building a new transfer station, opening a new store, and constructing urban public facilities. How to find information about crowd movement behaviors effectively is therefore one of the problems the industry is currently working on.
Summary of the invention
The invention relates to a method for finding crowd movement behaviors and to a non-transitory computer-readable medium for performing the method.
According to one embodiment of the invention, a method for finding crowd movement behaviors is provided. The method includes: collecting a plurality of location data regarding a plurality of user devices; mining a plurality of frequent patterns in the location data to generate a plurality of representative sequences, wherein each representative sequence includes at least one line segment between a start location point and an end location point; and grouping the representative sequences into a plurality of clusters according to a plurality of sequence distances between the representative sequences, so as to find the crowd movement behaviors.
For a better understanding of the above and other aspects of the invention, embodiments are described in detail below with reference to the accompanying drawings:
Brief description of the drawings
Fig. 1 is a schematic diagram of an example of payment activity with a smart card.
Fig. 2 is a flow chart of a method for finding crowd movement behaviors according to one embodiment of the invention.
Fig. 3 is a schematic diagram of an example payment log, which relates to location data and times captured from multiple user devices.
Fig. 4 is a flow chart of collecting location data regarding user devices according to one embodiment of the invention.
Fig. 5A and Fig. 5B are schematic diagrams of an example of labeling payment locations with the nearest reference location point according to one embodiment of the invention.
Fig. 6 is a schematic diagram of a payment log after sorting and simplification according to one embodiment of the invention.
Fig. 7 is a flow chart of mining frequent patterns in the location data to generate representative sequences according to one embodiment of the invention.
Fig. 8 is a schematic diagram of aggregating frequent patterns according to one embodiment of the invention.
Fig. 9 is a flow chart of calculating the sequence distance between representative sequences according to one embodiment of the invention.
Fig. 10 is a flow chart of calculating the line segment distance between a first line segment and a second line segment according to one embodiment of the invention.
Fig. 11 is a schematic diagram of the line segment distance between two line segments according to one embodiment of the invention.
Fig. 12 is a flow chart of calculating the parallel distance between the first line segment and the second line segment according to one embodiment of the invention.
Fig. 13 is a schematic diagram of the line segment distance between two line segments according to one embodiment of the invention.
Fig. 14 is a flow chart of calculating the normalized angle distance, the normalized perpendicular distance, and the normalized parallel distance according to one embodiment of the invention.
Fig. 15A~Fig. 15D are schematic diagrams of the situations considered for the maximum of the perpendicular distance domain according to one embodiment of the invention.
Fig. 16 is a flow chart of determining the sequence distance between a first sequence and a second sequence according to one embodiment of the invention.
Fig. 17A~Fig. 17C are schematic diagrams of multiple mapping combinations between two representative sequences according to one embodiment of the invention.
Fig. 18 is a schematic diagram of an example of an invalid mapping between two representative sequences.
Fig. 19 is a flow chart of calculating the mapping distance of a mapping combination according to one embodiment of the invention.
Fig. 20 is a flow chart of grouping representative sequences into clusters according to one embodiment of the invention.
Fig. 21 is a flow chart of finding crowd movement behaviors and finding typical sequences according to one embodiment of the invention.
Fig. 22 is a flow chart of finding the typical sequences of a target date type according to one embodiment of the invention.
Detailed description
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
Many people today use smart cards to take public transportation such as buses, trains, and rapid transit. A smart card can also serve as an electronic wallet for buying goods or paying fees. For example, a smart card with stored value can be used at vending machines or for entering and leaving parking lots or railway stations. Fig. 1 is a schematic diagram of an example of payment activity with a smart card; the smart card in this example is a contactless smart card. Each time the smart card is used, a payment log (Payment Log) or transaction record is generated. Because the vending machine or station gate has static geographic information, location data about where multiple users used their smart cards can be collected. By collecting these payment logs, the service provider that issues the smart cards can obtain information about where crowds are and how they move.
Fig. 2 is a flow chart of a method for finding crowd movement behaviors according to one embodiment of the invention. The method includes the following steps. Step S100: collect location data regarding multiple user devices. Step S200: mine (Mining) frequent patterns (Frequent Patterns) in the location data to generate multiple representative sequences (Representative Sequences). Step S300: group the representative sequences into clusters (Cluster) according to the sequence distances between them, so as to find crowd movement behaviors. Each representative sequence includes at least one line segment between a start location point and an end location point. The method can be implemented, for example, as a software program, which can be stored on an optical disc. The program can include multiple instructions for a computer processor, and the processor can load these instructions to perform the method described above. Each step is described in detail below.
Step S100: collect location data regarding multiple user devices. A user device can be a smart card, an electronic payment card, or a mobile device with payment capability. When such a device is used for payment activity, the related location data can be collected. For example, when a smart card is used for payment at a payment terminal, a report can be uploaded to a central server. The report can include the identification (ID) of the smart card, the payment amount, the date and time, and the location of the payment terminal. The method is not limited to collecting location data during payment activity. Location data can also be collected when a user device enters a station, when money is deposited into a user device, or when a user device is authenticated to enter a building. For ease of understanding, the following description uses location data collected during payment activity as the example, and the collected location data is represented by a payment log.
Fig. 3 is a schematic diagram of an example payment log, which relates to location data and times captured from multiple user devices. The payment log can be stored in the central server of the payment service provider. In this example, the payment log has the following fields: uid, departure location, arrival location, date-time, payment, and transaction type. The uid is the ID of the user device. Records with the same uid correspond to the same user device, and likely to the same user, so the log shows where a person has been. By collecting location data for many user devices, the service provider can learn which movement trajectories most people have in common. The transaction type can be purchase, transport, deposit, or another type. For the transport type, the payment may take place at the destination station, and the departure location and arrival location fields record the related trip, for example the geographic coordinates of the departure station and the arrival station. For transaction types other than transport, the departure location field can record the location of the payment activity, and the arrival location field can be recorded as "-".
In the example above, the recorded coordinates can be exact location data. Because a large number of stores use the payment service, the payment log may contain a great many distinct exact coordinates. Finding crowd movement behaviors of interest may not require exact locations, so nearby locations can be treated as one semantic region (Semantic Region), and a reference location point (Reference Location Point) can be selected to represent each semantic region. Step S100 can include step S110 and step S120, as shown in Fig. 4, which is a flow chart of collecting location data regarding user devices according to one embodiment of the invention.
Step S110: select multiple reference location points. Examples of reference location points include schools, chain convenience stores, and landmarks determined according to resident population. Step S120: replace each location point in the location data with the reference location point geographically nearest to it. For each payment record in the payment log, the original exact coordinates can be replaced by the nearest reference location point. Fig. 5A and Fig. 5B are schematic diagrams of an example of labeling payment locations with the nearest reference location point according to one embodiment of the invention. In Fig. 5A, three triangles filled with different shading represent three preselected reference location points Ref_a, Ref_b, and Ref_c, and hollow circles represent the original payment locations. Each payment location is then replaced with the geographically nearest reference location point. As shown in Fig. 5B, each payment location takes on the shading of its corresponding reference location point. After the payment locations are labeled with the nearest reference location points, the geographic coordinates in the original payment log can be replaced by these reference location points. Steps S110 and S120 are optional: even if they are not performed and the original payment locations are kept in the payment log, the subsequent steps S200 and S300 can still be performed on the original payment locations.
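Step S120 can be illustrated with a minimal sketch. It assumes planar (x, y) coordinates and plain Euclidean distance; real geographic coordinates would likely need a great-circle distance instead. The function name `nearest_reference` and the coordinates are hypothetical and chosen only for illustration.

```python
import math

def nearest_reference(point, references):
    """Return the name of the reference location point closest to `point`.

    Assumption: coordinates are planar, so Euclidean distance is adequate.
    """
    return min(references, key=lambda name: math.dist(point, references[name]))

# Hypothetical reference location points and payment locations.
references = {"Ref_a": (0.0, 0.0), "Ref_b": (5.0, 0.0), "Ref_c": (0.0, 5.0)}
payments = [(0.4, 0.3), (4.6, 0.9), (1.0, 4.2)]

# Step S120: label every payment location with its nearest reference point.
labels = [nearest_reference(p, references) for p in payments]
print(labels)
```

With many reference points, a spatial index would probably be preferable to this linear scan.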
Step S200: mine frequent patterns in the location data to generate multiple representative sequences. After data collection and the preprocessing described above, the payment log can be converted into payment sequences (Sequence). Each payment sequence includes a series of items (Item) and represents the whole payment trajectory of a specific user in a specific time period. The items in a payment sequence can be reference location points as shown in Fig. 5A and Fig. 5B; an example payment sequence is {id_677: Ref_h, Ref_c, Ref_c}. A sequential pattern mining (Sequential Pattern Mining) algorithm, such as PrefixSpan or Generalized Sequential Pattern (GSP), can be applied to the payment sequences to find the frequent patterns in them. Sequential pattern mining yields the representative sequences for a specific time period together with their support counts (Support Count). A support count is the number of occurrences and can be computed by the mining algorithm. Each representative sequence includes at least one line segment between a start location point and an end location point, each of which can be an exact location or a reference location point. For example, a transport-type record in the payment log can be regarded as a line segment between the departure location and the arrival location. An example of a representative sequence is <Ref_a, Ref_d, Ref_e>, which includes two line segments: one from Ref_a to Ref_d and another from Ref_d to Ref_e.
In step S200, the payment log can first be sorted by uid and date-time. For example, the transactions of the same user device are grouped together and sorted in chronological order. In addition, each location point can be labeled with the nearest reference location point, as shown in Fig. 5B. Fig. 6 is a schematic diagram of a payment log after sorting and simplification according to one embodiment of the invention. In this example, uid 604 forms one transaction sequence, <Ref_a, Ref_d, Ref_e>, and uid 677 forms another, <Ref_h, Ref_c, Ref_c>. A sequential pattern mining algorithm can then be applied to the transaction sequences to find the frequent patterns.
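The sorting and grouping described above can be sketched as follows. This is an illustration under simplifying assumptions: each record carries a single, already labeled location, and timestamps are ISO 8601 strings so that string order equals chronological order. The timestamps and the helper name `build_sequences` are hypothetical; only the uids and reference points come from the example.

```python
from collections import defaultdict

# Hypothetical simplified payment log: (uid, timestamp, labeled location).
log = [
    (677, "2015-05-04T08:10", "Ref_c"),
    (604, "2015-05-04T07:05", "Ref_a"),
    (677, "2015-05-04T07:30", "Ref_h"),
    (604, "2015-05-04T08:40", "Ref_e"),
    (604, "2015-05-04T07:50", "Ref_d"),
    (677, "2015-05-04T08:55", "Ref_c"),
]

def build_sequences(log):
    """Sort by uid and time, then collect one location sequence per uid."""
    sequences = defaultdict(list)
    for uid, _timestamp, location in sorted(log, key=lambda r: (r[0], r[1])):
        sequences[uid].append(location)
    return dict(sequences)

sequences = build_sequences(log)
print(sequences)
```

The resulting per-user sequences are what a sequential pattern mining algorithm would take as input.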
Fig. 7 is a flow chart of mining frequent patterns in the location data to generate representative sequences according to one embodiment of the invention. Step S200 can include steps S210, S220, and S230, which are performed after pattern mining. Step S210: if a frequent pattern has only a single location point, remove that frequent pattern. Step S220: in each frequent pattern, remove identical adjacent location points. Because the method looks for crowd movement, frequent patterns that represent staying in place, such as <Ref_a> and <Ref_a, Ref_a>, can be excluded. Likewise, in a frequent pattern that includes at least two identical adjacent location points, such as <Ref_h, Ref_c, Ref_c>, the repeated adjacent location points can be removed. The resulting representative sequences contain no identical adjacent location points, so every line segment in every representative sequence represents a direction of crowd movement.
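A minimal sketch of steps S210 and S220 follows. It collapses adjacent duplicates first and then drops any pattern left with a single location point, which also removes stay-in-place patterns such as <Ref_a, Ref_a>. The function name `clean_pattern` is hypothetical, and the ordering of the two steps is a choice made for this sketch.

```python
from itertools import groupby

def clean_pattern(pattern):
    """Collapse identical adjacent points; return None if no movement remains."""
    deduped = [location for location, _run in groupby(pattern)]
    return deduped if len(deduped) >= 2 else None

patterns = [
    ["Ref_a"],
    ["Ref_a", "Ref_a"],
    ["Ref_h", "Ref_c", "Ref_c"],
    ["Ref_a", "Ref_d", "Ref_e"],
]
cleaned = [c for c in map(clean_pattern, patterns) if c is not None]
print(cleaned)
```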
Step S230: aggregate the frequent patterns of multiple days to generate the representative sequences. The collected payment logs can be classified by time period and by date, because crowd movement trajectories may differ between time periods. For example, the three periods 6:30am~9:30am, 10:30am~1:30pm, and 4:30pm~7:30pm may each correspond to different crowd activities. If cumulative statistics are needed for a specific time period over many days, the frequent patterns of that period on those days can be aggregated. Fig. 8 is a schematic diagram of aggregating frequent patterns according to one embodiment of the invention. In this example, the representative sequences of the period 6:30am~9:30am on all working days in May are aggregated, and the numbers in the table are the support counts of the representative sequences. As shown in Fig. 8, after the sequence <Ref_d, Ref_e> is accumulated over every working day, its support count for the whole of May is 5750. The number 23/23 is the occurrence rate (Occurrence Rate): the sequence <Ref_d, Ref_e> occurred on 23 of the 23 working days in May. Aggregating statistics over many days makes it possible to find the movement trends of most people more accurately.
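The aggregation in step S230 can be sketched as below: the support counts of each sequence are summed over the days, and the occurrence rate records on how many of those days the sequence appeared. The daily numbers are made up for illustration and are not the values of Fig. 8; `aggregate` is a hypothetical helper name.

```python
from collections import Counter

def aggregate(daily_supports):
    """Sum per-day support counts and compute occurrence rates.

    `daily_supports` is a list with one dict per day, mapping a sequence
    (tuple of location names) to its support count on that day.
    """
    total, days_seen = Counter(), Counter()
    for day in daily_supports:
        for sequence, support in day.items():
            total[sequence] += support
            days_seen[sequence] += 1
    n_days = len(daily_supports)
    return {seq: (total[seq], f"{days_seen[seq]}/{n_days}") for seq in total}

# Hypothetical support counts for the same time period on three days.
daily = [
    {("Ref_d", "Ref_e"): 250, ("Ref_a", "Ref_d"): 40},
    {("Ref_d", "Ref_e"): 260},
    {("Ref_d", "Ref_e"): 240, ("Ref_a", "Ref_d"): 35},
]
result = aggregate(daily)
print(result)
```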
Step S300: group the representative sequences into clusters according to the sequence distances between them, so as to find crowd movement behaviors. After steps S100 and S200, the representative sequences describing crowd movement have been found. Similar representative sequences can then be grouped into clusters to find crowd movement between larger regions. This is useful because most crowd movement behaviors of interest concern movement between two broad areas rather than between two specific buildings. In step S300, the sequence distance between two representative sequences can be calculated to determine how similar they are. For example, a shorter sequence distance means the two representative sequences are more similar (for example, geographically closer). The representative sequences can be grouped into clusters according to the sequence distances calculated in this way.
An example of calculating the sequence distance follows. A representative sequence can be regarded as a sequence of line segments. In this example, the representative sequences include a first sequence Seq_a and a second sequence Seq_b. The first sequence Seq_a includes a first line segment L1 between a first start location point L1_s and a first end location point L1_e. The second sequence Seq_b includes a second line segment L2 between a second start location point L2_s and a second end location point L2_e. The sequence distance between Seq_a and Seq_b is determined according to the line segment distance between L1 and L2. The first line segment L1 is directed (from the first start point L1_s to the first end point L1_e), and the second line segment L2 is likewise directed (from the second start point L2_s to the second end point L2_e). In the following description, the vectors $\vec{L_1}$ and $\vec{L_2}$ therefore represent the first line segment L1 and the second line segment L2 respectively.
Fig. 9 is a flow chart of calculating the sequence distance between representative sequences according to one embodiment of the invention. The first sequence Seq_a and the second sequence Seq_b form one sequence pair among the representative sequences. For each sequence pair, step S300 (grouping the representative sequences into clusters) can further include steps S310 and S320. Step S310: calculate the line segment distance between the first line segment L1 and the second line segment L2. The line segment distance represents how close or similar the two line segments are. Step S320: determine the sequence distance between Seq_a and Seq_b according to the line segment distance between L1 and L2. In other words, the similarity between two representative sequences is determined by the similarity between their respective line segments.
Fig. 10 is a flow chart of calculating the line segment distance between the first line segment and the second line segment according to one embodiment of the invention. Step S310 can include the following steps. Step S311: calculate the angle distance dθ (Angle Distance), the perpendicular distance d⊥ (Perpendicular Distance), and the parallel distance d|| (Parallel Distance) between L1 and L2. Step S312: from dθ, d⊥, and d||, calculate the normalized (Normalized) angle distance Ndθ, the normalized perpendicular distance Nd⊥, and the normalized parallel distance Nd||, whose values lie in the same range. Step S313: determine the line segment distance between L1 and L2 according to the weighted sum of Ndθ, Nd⊥, and Nd||. An example of calculating the line segment distance between two line segments follows.
Fig. 11 is a schematic diagram of the line segment distance between two line segments according to one embodiment of the invention. The line segment distance is determined by three components: the angle distance dθ, the perpendicular distance d⊥, and the parallel distance d||. The angle distance dθ is related to the angle θ between $\vec{L_1}$ and $\vec{L_2}$ (0° ≤ θ ≤ 180°). For example, θ can be calculated as
$$\theta = \cos^{-1}\left(\frac{\vec{L_1}\cdot\vec{L_2}}{\|\vec{L_1}\|\,\|\vec{L_2}\|}\right)$$
where $\vec{L_1}\cdot\vec{L_2}$ is the inner product (dot product) of the two vectors, and $\|\vec{L_1}\|$ and $\|\vec{L_2}\|$ are the lengths of the two vectors. The angle distance dθ can be calculated according to formula (1).
The angle distance dθ represents how similar the two vectors are in direction: the smaller the angle θ, the smaller dθ. When θ is greater than 90°, the two vectors point in roughly opposite directions, and dθ can then be set to the maximum possible value of the angle distance domain (Domain), indicating that the two vectors are not similar in direction.
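The angle computation can be sketched as follows. The angle θ follows the inner-product relation described above. The piecewise angle distance is an assumption: this text does not reproduce formula (1), so the sketch scales the length of the shorter vector by sin θ for θ up to 90°, which agrees with the stated properties (a smaller θ gives a smaller distance, and the value for θ above 90° is the domain maximum) but is not confirmed by the source. Function names are hypothetical.

```python
import math

def angle_between(v1, v2):
    """Angle in radians between two non-zero 2D vectors, in [0, pi]."""
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norms = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norms)))

def angle_distance(v1, v2):
    """Assumed form: shorter length times sin(theta), capped for theta > 90 degrees."""
    shorter = min(math.hypot(*v1), math.hypot(*v2))
    theta = angle_between(v1, v2)
    if theta > math.pi / 2:
        return shorter  # roughly opposite directions: domain maximum
    return shorter * math.sin(theta)

print(angle_distance((4.0, 0.0), (1.0, 1.0)))   # 45 degrees apart
print(angle_distance((1.0, 0.0), (-3.0, 0.0)))  # opposite directions
```

Zero-length vectors are not handled; representative sequences without repeated adjacent points should not produce them.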
Fig. 12 is a flow chart of calculating the parallel distance between the first line segment and the second line segment according to one embodiment of the invention. Step S311 includes the following steps. Step S331: project the second start location point L2_s onto the extension line of the first line segment L1 to obtain a third start projection point L3_s. Step S332: project the second end location point L2_e onto the extension line of the first line segment L1 to obtain a third end projection point L3_e. Step S333: connect the third start projection point L3_s and the third end projection point L3_e to produce a third line segment L3. The resulting L3_s, L3_e, and L3 are shown in Fig. 11. Step S334: subtract the intersection (Intersection) of the first line segment L1 and the third line segment L3 from their union (Union) to determine the parallel distance d||. The parallel distance d|| can be calculated according to formula (2):
$$d_{\parallel} = (L_1 \cup L_3) - (L_1 \cap L_3) \quad (2)$$
Because the third line segment L3 is formed by projecting the second line segment L2 onto the first line segment L1, L3 is collinear (Collinear) with L1. In the example shown in Fig. 11, the union of L1 and L3 is the length from the first start location point L1_s to the first end location point L1_e, and the intersection of L1 and L3 is the length from the third start projection point L3_s to the third end projection point L3_e. In this example the second line segment L2 is projected onto the first line segment L1. In other embodiments, the first line segment L1 can instead be projected onto (the extension of) the second line segment L2 to obtain the parallel distance d|| (the calculated value may differ). The parallel distance d|| represents how similar the two line segments are in their equivalent parallel lengths.
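Steps S331 to S334 can be sketched by measuring positions along the direction of L1, so that L1 becomes the interval [0, |L1|] and L3 becomes the interval spanned by the two projections. The sketch treats the union as the total covered length (sum of both lengths minus the overlap). That choice is an assumption for the case where L1 and L3 do not overlap, which the text does not discuss. Function names and coordinates are hypothetical.

```python
import math

def project_param(p, a, b):
    """Signed position of the projection of p on line a->b, measured from a."""
    ux, uy = b[0] - a[0], b[1] - a[1]
    return ((p[0] - a[0]) * ux + (p[1] - a[1]) * uy) / math.hypot(ux, uy)

def parallel_distance(l1, l2):
    """Union length minus intersection length of L1 and the projection L3 of L2."""
    (a, b), (c, d) = l1, l2
    len1 = math.dist(a, b)
    t_s, t_e = project_param(c, a, b), project_param(d, a, b)  # S331, S332
    lo, hi = min(t_s, t_e), max(t_s, t_e)                      # S333: L3 = [lo, hi]
    overlap = max(0.0, min(len1, hi) - max(0.0, lo))           # intersection length
    union = len1 + (hi - lo) - overlap                         # covered length
    return union - overlap                                     # S334

l1 = ((0.0, 0.0), (10.0, 0.0))
print(parallel_distance(l1, ((2.0, 1.0), (6.0, 3.0))))   # L3 inside L1
print(parallel_distance(l1, ((6.0, 2.0), (14.0, 1.0))))  # L3 partly beyond L1
```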
The perpendicular distance d⊥ can be calculated according to formula (3):
$$d_{\perp} = \frac{l_{\perp s}^{2} + l_{\perp e}^{2}}{l_{\perp s} + l_{\perp e}} \quad (3)$$
where l⊥s is the Euclidean distance (Euclidean Distance) between the second start location point L2_s and the third start projection point L3_s, and l⊥e is the Euclidean distance between the second end location point L2_e and the third end projection point L3_e. Formula (3) is the contraharmonic mean (Contraharmonic Mean) of l⊥s and l⊥e.
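The contraharmonic mean described here can be sketched as follows. The distance from each endpoint of L2 to its projection on the line through L1 is computed with a 2D cross product. The zero guard for two collinear segments is an added assumption, as are the function names and coordinates.

```python
import math

def foot_distance(p, a, b):
    """Distance from point p to its projection on the line through a and b."""
    ux, uy = b[0] - a[0], b[1] - a[1]
    cross = ux * (p[1] - a[1]) - uy * (p[0] - a[0])
    return abs(cross) / math.hypot(ux, uy)

def perpendicular_distance(l1, l2):
    """Contraharmonic mean of the two endpoint-to-projection distances."""
    (a, b), (c, d) = l1, l2
    l_s, l_e = foot_distance(c, a, b), foot_distance(d, a, b)
    if l_s + l_e == 0:
        return 0.0  # both endpoints lie on the line through L1
    return (l_s ** 2 + l_e ** 2) / (l_s + l_e)

l1 = ((0.0, 0.0), (10.0, 0.0))
l2 = ((2.0, 1.0), (6.0, 3.0))
print(perpendicular_distance(l1, l2))
```

Compared with the arithmetic mean, the contraharmonic mean gives more weight to the larger of the two distances.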
Fig. 13 is a schematic diagram of the line segment distance between two line segments according to one embodiment of the invention. The angle distance dθ, the parallel distance d||, and the perpendicular distance d⊥ can likewise be calculated according to formulas (1), (2), and (3). In this example the angle θ is greater than 90°, so the angle distance dθ equals the maximum of the angle distance domain. For the parallel distance d||, the union of the first line segment L1 and the third line segment L3 is the length from the first start location point L1_s to the third end projection point L3_e, and the intersection of L1 and L3 is the length from the third start projection point L3_s to the first end location point L1_e.
As described above, three components are considered together when calculating the line segment distance. However, the ranges of the three components may differ greatly, which makes it hard to obtain a meaningful combination of them. In the method of the invention, step S312 calculates the normalized angle distance Ndθ, the normalized parallel distance Nd||, and the normalized perpendicular distance Nd⊥, whose values lie in the same range, for example [0, 1], the interval of values greater than or equal to 0 and less than or equal to 1. Because the three normalized components lie in the same range, a linear combination of them is meaningful for calculating the line segment distance between two line segments. In one embodiment, the line segment distance is the weighted sum of Ndθ, Nd||, and Nd⊥, and can be calculated according to formula (4):
$$\text{line segment distance} = w_1 \times Nd_{\theta} + w_2 \times Nd_{\parallel} + w_3 \times Nd_{\perp} \quad (4)$$
where w1, w2, and w3 are the weights. For example, w1, w2, and w3 can each equal 1/3 to obtain the mean of the normalized angle distance Ndθ, the normalized parallel distance Nd||, and the normalized perpendicular distance Nd⊥.
Fig. 14 is a flow chart of calculating the normalized angle distance, the normalized perpendicular distance, and the normalized parallel distance according to one embodiment of the invention. Step S312 can include the following steps. Step S341: divide the angle distance dθ by the maximum of the angle distance domain to obtain the normalized angle distance Ndθ. Step S342: divide the perpendicular distance d⊥ by the maximum of the perpendicular distance domain to obtain the normalized perpendicular distance Nd⊥. Step S343: divide the parallel distance d|| by the maximum of the parallel distance domain to obtain the normalized parallel distance Nd||. Because each normalized distance is produced by dividing by the maximum of its own domain, the values of all three normalized distance components lie in the range [0, 1].
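The normalization and weighted sum can be sketched as below. The raw distances and their domain maxima are passed in as plain numbers, so the sketch does not depend on how the maxima are obtained. The equal default weights match the 1/3 example. The zero guard and the function names are assumptions made for this illustration.

```python
def normalize(value, domain_max):
    """Divide a raw distance by the maximum of its domain (steps S341 to S343)."""
    return 0.0 if domain_max == 0 else value / domain_max

def line_segment_distance(raw, maxima, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the normalized angle, parallel and perpendicular distances.

    `raw` and `maxima` are (angle, parallel, perpendicular) triples.
    """
    normalized = [normalize(v, m) for v, m in zip(raw, maxima)]
    return sum(w * n for w, n in zip(weights, normalized))

# Hypothetical raw distances and domain maxima for one pair of line segments.
distance = line_segment_distance(raw=(1.0, 6.0, 2.5), maxima=(2.0, 10.0, 5.0))
print(distance)
```

Other weightings can emphasize direction over position, or the reverse.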
As shown in formula (1), the maximum of angular distance delocalization is shorter one in the middle of the first line segment L1 and second line segment L2 Length.As shown in formula (2), the maximum in parallel distance domain is the union of the first line segment L1 and the 3rd line segment L3.It is vertical away from The maximum of delocalization is difficult directly to find out that its correlation computations is described as follows from formula (3).
Figures 15A to 15D are schematic diagrams of the cases considered when determining the maximum of the vertical distance domain according to an embodiment of the invention. According to formula (3) and the geometric relationship between the first line segment L1 and the second line segment L2, the maximum of the vertical distance domain occurs when the second line segment L2 is perpendicular to the first line segment L1. Therefore, in one embodiment, the second line segment L2 may be rotated around the second start location point L2s or around the second end location point L2e until it is perpendicular to the first line segment L1. The maximum of the vertical distance domain is then the vertical distance between the first line segment L1 and the rotated second line segment. Figures 15A to 15D show the four possible rotations. The maximum of the vertical distance domain is the largest vertical distance among these four cases, and can be calculated according to the following formula (5):
where l2 denotes the length of the second line segment L2.
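The rotation described above can be sketched as follows. This is a minimal sketch under stated assumptions: segments are pairs of 2D points, L1 has nonzero length, and the four cases of Figures 15A to 15D are taken to be a rotation about L2s or about L2e toward either side of L1. Because formulas (3) and (5) are not reproduced in this passage, the vertical distance is passed in as a function; the stand-in used in the example is hypothetical and is not formula (3).

```python
import math


def perpendicular_rotations(l1, l2):
    """Return the four segments obtained by rotating L2 until it is perpendicular to L1."""
    (x1s, y1s), (x1e, y1e) = l1
    (sx, sy), (ex, ey) = l2
    len1 = math.hypot(x1e - x1s, y1e - y1s)  # assumed to be nonzero
    len2 = math.hypot(ex - sx, ey - sy)
    # Unit normal of L1: the direction every rotated segment must follow.
    nx, ny = -(y1e - y1s) / len1, (x1e - x1s) / len1
    candidates = []
    for sign in (1, -1):
        # Rotation about L2s: the start point stays, the end point moves.
        candidates.append(((sx, sy), (sx + sign * len2 * nx, sy + sign * len2 * ny)))
        # Rotation about L2e: the end point stays, the start point moves.
        candidates.append(((ex + sign * len2 * nx, ey + sign * len2 * ny), (ex, ey)))
    return candidates


def vertical_domain_max(l1, l2, vertical_distance):
    """Largest vertical distance over the four rotated candidates."""
    return max(vertical_distance(l1, c) for c in perpendicular_rotations(l1, l2))


def farthest_endpoint_offset(l1, seg):
    """Hypothetical stand-in for formula (3); valid only when L1 lies on the x-axis."""
    return max(abs(seg[0][1]), abs(seg[1][1]))


L1 = ((0.0, 0.0), (10.0, 0.0))
L2 = ((2.0, 1.0), (5.0, 5.0))  # length 5
rotated = perpendicular_rotations(L1, L2)
domain_max = vertical_domain_max(L1, L2, farthest_endpoint_offset)
```

Passing the distance function as a parameter keeps the rotation logic independent of the exact form of formula (3).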
Following the calculation steps above, the line segment distance between two line segments can be obtained. The sequence distance between two representative sequences can then be determined from the line segment distances between the line segments that the two representative sequences contain. Figure 16 is a flowchart of determining the sequence distance between the first sequence and the second sequence according to an embodiment of the invention. Step S320 includes the following steps. Step S321: generate multiple mapping combinations between the first sequence Seq_a and the second sequence Seq_b according to the at least one line segment in the first sequence Seq_a and the at least one line segment in the second sequence Seq_b. Step S322: calculate the mapping distance of each mapping combination. Step S323: take the minimum mapping distance among the mapping combinations as the sequence distance between the first sequence Seq_a and the second sequence Seq_b.
The first sequence Seq_a may be a sequence of multiple line segments arranged in time order. For example, the first sequence Seq_a may include two line segments LineSega1 and LineSega2, where the movement track represented by LineSega1 occurs earlier than the movement track represented by LineSega2. Figures 17A to 17C are schematic diagrams of multiple mapping combinations between two representative sequences according to an embodiment of the invention. In this example, the second sequence Seq_b also includes two line segments, LineSegb1 and LineSegb2, arranged in time order.
In Figure 17A, line segment LineSegb1 maps to the empty line segment φ, line segment LineSegb2 maps to line segment LineSega1, and line segment LineSega2 maps to the empty line segment φ. Note that the original time order within each representative sequence is preserved. Figures 17B and 17C show different mapping combinations, in which the time order within each representative sequence is likewise preserved. Figure 18 is a schematic diagram of an invalid mapping between two representative sequences. This mapping violates the time order: line segment LineSega2 (mapped to line segment LineSegb1) occurs later than line segment LineSega1 (mapped to line segment LineSegb2), whereas line segment LineSegb1 occurs earlier than line segment LineSegb2. For each valid mapping combination (as shown in Figures 17A to 17C), a mapping distance can be calculated. The sequence distance between the first sequence Seq_a and the second sequence Seq_b may be the minimum mapping distance among the mapping combinations.
Figure 19 is a flowchart of calculating the mapping distance of a mapping combination according to an embodiment of the invention. Step S322 includes the following steps. Step S351: according to the time order, form multiple mapping pairs between the at least one line segment in the first sequence Seq_a and the at least one line segment in the second sequence Seq_b. Step S352: calculate the line segment distance of each mapping pair. Step S353: calculate the mean of the line segment distances of the mapping pairs to obtain the mapping distance.
Referring to Figure 17A, the mapping combination here includes three mapping pairs: {φ, LineSegb1}, {LineSega1, LineSegb2}, and {LineSega2, φ}. The line segment distance of each mapping pair can be calculated in the manner described above (including the three normalized distance components, steps S311 to S313, and formulas (1) to (5)). The line segment distance between a real line segment and an empty line segment φ may be defined as 1 (the maximum possible value of the line segment distance domain). The mapping distance of the mapping combination shown in Figure 17A may be the mean of the line segment distances of these three mapping pairs. For example, the mapping distance of this mapping combination equals $\frac{Nd(\varphi, \mathrm{LineSegb1}) + Nd(\mathrm{LineSega1}, \mathrm{LineSegb2}) + Nd(\mathrm{LineSega2}, \varphi)}{3}$, where Nd denotes the line segment distance between two line segments. Similarly, Figure 17B has two mapping pairs, and the mapping distance may be the mean of the line segment distances of these two mapping pairs.
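Steps S321 to S323 and S351 to S353 can be sketched together as follows. This is an illustrative sketch, not the embodiment's implementation: it assumes that a mapping combination is any time-order-preserving pairing in which each line segment maps either to one line segment of the other sequence or to the empty line segment φ (represented here by None), and it enumerates all such combinations exhaustively, which is only practical for short sequences. The function names and the example distances are hypothetical.

```python
EMPTY = None          # stands for the empty line segment φ
GAP_DISTANCE = 1.0    # distance between a real segment and φ


def mapping_combinations(seq_a, seq_b):
    """Yield every order-preserving mapping as a list of (a, b) mapping pairs."""
    if not seq_a and not seq_b:
        yield []
        return
    if seq_a and seq_b:
        for rest in mapping_combinations(seq_a[1:], seq_b[1:]):
            yield [(seq_a[0], seq_b[0])] + rest
    if seq_a:
        for rest in mapping_combinations(seq_a[1:], seq_b):
            yield [(seq_a[0], EMPTY)] + rest
    if seq_b:
        for rest in mapping_combinations(seq_a, seq_b[1:]):
            yield [(EMPTY, seq_b[0])] + rest


def mapping_distance(mapping, segment_distance):
    """Mean of the line segment distances of the mapping pairs (step S353)."""
    total = 0.0
    for a, b in mapping:
        if a is EMPTY or b is EMPTY:
            total += GAP_DISTANCE
        else:
            total += segment_distance(a, b)
    return total / len(mapping)


def sequence_distance(seq_a, seq_b, segment_distance):
    """Minimum mapping distance over all mapping combinations (step S323)."""
    return min(mapping_distance(m, segment_distance)
               for m in mapping_combinations(seq_a, seq_b))


# Made-up line segment distances for the two-segment example of Figures 17A to 17C.
dist = {("a1", "b1"): 0.2, ("a2", "b2"): 0.4, ("a1", "b2"): 0.1, ("a2", "b1"): 0.9}
seq_dist = sequence_distance(["a1", "a2"], ["b1", "b2"], lambda a, b: dist[(a, b)])
```

With these made-up distances the minimum is reached by the mapping with the two pairs {a1, b1} and {a2, b2}, whose mean is (0.2 + 0.4) / 2.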
With the calculation described above, the sequence distance between two representative sequences can be calculated. Figure 20 is a flowchart of classifying the representative sequences into sets according to an embodiment of the invention. Step S300 includes the following steps. Step S360: treat each representative sequence as a set. Step S370: calculate the set distance of each set pair, where a set pair is formed by two sets. Step S380: find the first set and the second set that have the minimum set distance. Step S390: if the minimum set distance is less than a distance threshold, merge the first set and the second set.
This embodiment may use agglomerative hierarchical clustering. In the initial state, each representative sequence is regarded as a set. The set distance of each set pair (formed by two sets, which in the initial state are two representative sequences) can then be calculated. The set distance can be calculated with the method of steps S351 to S353, because in the initial state this is equivalent to calculating the sequence distance between two representative sequences. The two sets with the minimum set distance are found, and if the minimum set distance is less than the distance threshold, for example 0.3, the two sets are merged into a larger set. The flow can then return to step S370 to merge sets repeatedly. After merging, some sets contain multiple representative sequences. For a set pair formed by such sets, the set distance can be the mean of the sequence distances of all representative sequence pairings in the set pair (all-pair linkage), where each representative sequence pairing consists of one representative sequence from each of the two sets. For example, if set G1 has two representative sequences and set G2 has three representative sequences, the set distance between set G1 and set G2 may be the mean of the sequence distances of the 2 × 3 = 6 representative sequence pairings.
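The merging loop of steps S360 to S390 can be sketched as follows. This is a minimal sketch under stated assumptions: the sequence distance is supplied as a function, the loop stops as soon as the minimum set distance is not below the threshold, and the one-dimensional numbers in the example merely stand in for representative sequences. The function names are hypothetical.

```python
from itertools import combinations


def set_distance(g1, g2, seq_dist):
    """All-pair linkage: mean sequence distance over every pairing across two sets."""
    return sum(seq_dist(a, b) for a in g1 for b in g2) / (len(g1) * len(g2))


def cluster_sequences(sequences, seq_dist, threshold=0.3):
    """Agglomerative clustering that merges the closest set pair while below threshold."""
    sets = [[s] for s in sequences]                      # step S360
    while len(sets) > 1:
        # Steps S370 and S380: evaluate every set pair and keep the closest one.
        (i, j), d = min(
            (((i, j), set_distance(sets[i], sets[j], seq_dist))
             for i, j in combinations(range(len(sets)), 2)),
            key=lambda item: item[1],
        )
        if d >= threshold:                               # step S390 not satisfied
            break
        sets[i] = sets[i] + sets[j]                      # merge (i < j, so j is still valid)
        del sets[j]
    return sets


# Stand-in data: numbers play the role of sequences, absolute difference the distance.
points = [0.0, 0.1, 0.9, 1.0, 0.5]
clusters = cluster_sequences(points, lambda a, b: abs(a - b), threshold=0.3)
```

In this example the two close pairs are merged first, and the remaining set distances (0.45 and 0.9) are not below the threshold of 0.3, so three sets remain.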
In one embodiment, a method of finding typical sequences is provided. Figure 21 is a flowchart of finding crowd movement behaviors and typical sequences according to an embodiment of the invention. Compared with the flowchart of Figure 2, Figure 21 further includes steps S410 and S420. Step S410: classify dates into multiple date types. For example, dates can be classified into working days and holidays, and working days can be further classified into the last working day before a single-day holiday, the last working day before a holiday of at least two days, the first working day after a single-day holiday, and so on. Similarly, holidays can be further classified into single-day holidays, the first day of a holiday of at least two days, the last day of a holiday of at least two days, and so on. Step S420: find the typical sequences of a target date type according to the occurrence rates of the representative sequences in the target date type. As shown in Figure 8, after the data of multiple days is aggregated, the occurrence rate of a representative sequence in a specific date type can be obtained, and the typical sequences of that date type can be found according to the occurrence rates. For example, for the last day of a holiday of at least two days, a typical sequence that may be found is a movement track between two railway stations.
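The classification of step S410 can be sketched as follows, assuming the calendar is supplied as a set of holiday dates. The type names, the function name date_type, and the rule that a working day adjacent to holidays on both sides is labeled by the holiday that follows it are hypothetical choices made for this sketch; the embodiment only lists the date types as examples.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def date_type(day, holidays):
    """Label a date by its position relative to single-day and multi-day holidays."""
    if day in holidays:
        prev_is_holiday = (day - ONE_DAY) in holidays
        next_is_holiday = (day + ONE_DAY) in holidays
        if not prev_is_holiday and not next_is_holiday:
            return "single_day_holiday"
        if not prev_is_holiday:
            return "first_day_of_multi_day_holiday"
        if not next_is_holiday:
            return "last_day_of_multi_day_holiday"
        return "middle_of_multi_day_holiday"
    # Working day: look at the holiday that follows it, then the one that precedes it.
    if (day + ONE_DAY) in holidays:
        if (day + 2 * ONE_DAY) in holidays:
            return "last_workday_before_multi_day_holiday"
        return "last_workday_before_single_day_holiday"
    if (day - ONE_DAY) in holidays:
        if (day - 2 * ONE_DAY) in holidays:
            return "first_workday_after_multi_day_holiday"
        return "first_workday_after_single_day_holiday"
    return "ordinary_workday"


# Made-up calendar: a three-day holiday followed by an isolated single-day holiday.
holidays = {date(2016, 1, 1), date(2016, 1, 2), date(2016, 1, 3), date(2016, 1, 6)}
labels = {d: date_type(d, holidays)
          for d in (date(2015, 12, 31) + n * ONE_DAY for n in range(9))}
```

Grouping the collected days by these labels gives the date types over which the occurrence rates of step S420 are counted.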
Figure 22 is a flowchart of finding the typical sequences of a target date type according to an embodiment of the invention. Step S420 includes the following steps. Step S421: calculate a first occurrence rate of a test representative sequence on dates that belong to the target date type. Step S422: calculate a second occurrence rate of the test representative sequence on dates that do not belong to the target date type. Step S423: calculate a statistical entropy according to the first occurrence rate and the second occurrence rate. Step S424: if the first occurrence rate is greater than a probability threshold and the statistical entropy is less than an entropy threshold, determine that the test representative sequence is a typical sequence.
The occurrence rates in step S421 can be obtained after step S230 is performed (as shown in Figure 8, aggregating the data of multiple days). The calculations of steps S421 to S424 are illustrated with an example below. The target date type here is the first day of a holiday of at least two days, denoted class H. Dates that do not belong to class H are denoted class (all-H). Table 1 below lists two representative sequences and their corresponding occurrence rates.
Table 1
Sequence  Occurrences in class H (56 dates)  Occurrences in class (all-H) (128 dates)
R1  41  2
R2  28  50
The occurrence rate represents the number of dates on which the sequence occurs. For example, sequence R1 occurs on 41 of the 56 dates that belong to class H, and on 2 of the 128 dates that belong to class (all-H). In this example, the probability threshold Pth of step S424 equals 0.2 and the entropy threshold Sth equals 0.6. The statistical entropy of step S423 can be calculated according to the following formula (6):
$S = -\sum_i p_i \ln p_i$ (6)
where $p_i$ is the probability of the sequence in class i.
According to formula (6), the statistical entropy S1 of sequence R1 equals 0.1. A larger entropy (greater disorder) indicates that the probability distribution is closer to a uniform distribution, whereas a smaller entropy indicates that the probability distribution is skewed toward one side. In the example above, if the probability distribution is skewed toward class H, the sequence can be regarded as a typical sequence of class H dates. In step S424, because the first occurrence rate (41/56) is greater than the probability threshold Pth and the statistical entropy S1 = 0.1 is less than the entropy threshold Sth, sequence R1 is determined to be a typical sequence of class H dates. Similarly, the statistical entropy S2 of sequence R2 can be calculated according to formula (6), giving S2 = 0.69. Because the statistical entropy S2 is greater than the entropy threshold Sth, sequence R2 is not a typical sequence of class H dates. As shown in Table 1, the first occurrence rate of sequence R2 on class H dates (28/56) is close to its second occurrence rate on class (all-H) dates (50/128), which implies that sequence R2 does not appear particularly in any one date type, so sequence R2 is not a typical sequence. After multiple typical sequences of a date type are found, they can be further classified into sets with the method shown in steps S360, S370, S380, and S390.
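The worked example of steps S421 to S424 can be checked with the following sketch. It rests on two assumptions that are not stated in this passage: the probabilities p_i are obtained by normalizing the first and second occurrence rates so that they sum to 1, and the entropy uses the natural logarithm. Under these assumptions the sketch reproduces the quoted values S1 ≈ 0.1 and S2 ≈ 0.69. The function names are hypothetical.

```python
import math


def statistical_entropy(rates):
    """Entropy of the occurrence rates after normalizing them to probabilities."""
    total = sum(rates)
    if total == 0:
        return 0.0  # assumption: a sequence that never occurs carries no entropy
    probs = [r / total for r in rates]
    return -sum(p * math.log(p) for p in probs if p > 0)


def is_typical(days_in, total_in, days_out, total_out, p_th=0.2, s_th=0.6):
    """Step S424: high first occurrence rate and low statistical entropy."""
    first_rate = days_in / total_in        # step S421
    second_rate = days_out / total_out     # step S422
    entropy = statistical_entropy([first_rate, second_rate])  # step S423
    return first_rate > p_th and entropy < s_th


s1 = statistical_entropy([41 / 56, 2 / 128])    # sequence R1
s2 = statistical_entropy([28 / 56, 50 / 128])   # sequence R2
r1_typical = is_typical(41, 56, 2, 128)
r2_typical = is_typical(28, 56, 50, 128)
```

Sequence R1 passes both tests, while sequence R2 passes the probability threshold (0.5 > 0.2) but fails the entropy threshold, which matches the conclusion of the example.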
According to the method for finding crowd movement behaviors of the embodiments of the invention, crowd movement tracks can be found from location data collected from user devices, such as smart cards. A payment service provider that issues smart cards can use the crowd movement tracks obtained to estimate the number of cardholders in a specific geographic area, and accordingly design marketing and advertising plans, decide where to open new stores, and so on. The method for finding crowd movement behaviors according to the embodiments of the invention thus has many applications and helps in making important decisions. Furthermore, because the embodiments of the invention can also find the typical sequences of specific date types, the payment service provider can plan and organize activities for different date types according to their typical sequences.
In summary, although the invention has been disclosed above by way of preferred embodiments, these embodiments are not intended to limit the invention. Those of ordinary skill in the art to which the invention pertains may make various changes and modifications without departing from the spirit and scope of the invention. Therefore, the scope of protection of the invention is defined by the claims.

Claims (14)

1. A method for finding crowd movement behaviors, comprising:
collecting multiple pieces of location data regarding multiple user devices;
detecting multiple frequent patterns in the location data to generate multiple representative sequences, wherein each representative sequence includes at least one line segment, the at least one line segment being between a start location point and an end location point; and
classifying the representative sequences into multiple sets according to multiple sequence distances between the representative sequences, to find the crowd movement behaviors.
2. The method of claim 1, wherein the representative sequences further include:
a first sequence, including a first line segment, the first line segment being between a first start location point and a first end location point; and
a second sequence, including a second line segment, the second line segment being between a second start location point and a second end location point;
wherein the first sequence and the second sequence form a sequence pair among the representative sequences, and for each sequence pair, the step of classifying the representative sequences into the sets further includes:
calculating a line segment distance between the first line segment and the second line segment; and
determining the sequence distance between the first sequence and the second sequence according to the line segment distance between the first line segment and the second line segment.
3. The method of claim 2, wherein the step of calculating the line segment distance between the first line segment and the second line segment further includes:
calculating an angular distance, a vertical distance, and a parallel distance between the first line segment and the second line segment;
calculating a normalized angular distance, a normalized vertical distance, and a normalized parallel distance according to the angular distance, the vertical distance, and the parallel distance, wherein the values of the normalized angular distance, the normalized vertical distance, and the normalized parallel distance lie in the same value range; and
determining the line segment distance between the first line segment and the second line segment according to a weighted sum of the normalized angular distance, the normalized vertical distance, and the normalized parallel distance.
4. The method of claim 3, wherein the step of calculating the parallel distance between the first line segment and the second line segment further includes:
projecting the second start location point onto the extension line of the first line segment to obtain a third start projection point;
projecting the second end location point onto the extension line of the first line segment to obtain a third end projection point;
connecting the third start projection point and the third end projection point to produce a third line segment; and
subtracting the intersection of the first line segment and the third line segment from the union of the first line segment and the third line segment to determine the parallel distance.
5. The method of claim 4, wherein the step of calculating the normalized angular distance, the normalized vertical distance, and the normalized parallel distance further includes:
dividing the angular distance by the maximum of the angular distance domain to obtain the normalized angular distance;
dividing the vertical distance by the maximum of the vertical distance domain to obtain the normalized vertical distance; and
dividing the parallel distance by the maximum of the parallel distance domain to obtain the normalized parallel distance.
6. The method of claim 5, wherein the maximum of the angular distance domain is the length of the shorter of the first line segment and the second line segment, the maximum of the parallel distance domain is the union of the first line segment and the third line segment, and the maximum of the vertical distance domain is the vertical distance between the first line segment and a rotated line segment, wherein the rotated line segment is obtained by rotating the second line segment around the second start location point or around the second end location point until it is perpendicular to the first line segment.
7. The method of claim 2, wherein the step of determining the sequence distance between the first sequence and the second sequence further includes:
generating multiple mapping combinations between the first sequence and the second sequence according to the at least one line segment in the first sequence and the at least one line segment in the second sequence;
calculating a mapping distance of each mapping combination; and
taking the minimum mapping distance among the mapping combinations as the sequence distance between the first sequence and the second sequence.
8. The method of claim 7, wherein the step of calculating the mapping distance of each mapping combination further includes:
forming multiple mapping pairs, according to the time order, between the at least one line segment in the first sequence and the at least one line segment in the second sequence;
calculating a line segment distance of each mapping pair; and
calculating the mean of the line segment distances of the mapping pairs to obtain the mapping distance.
9. The method of claim 1, wherein the location data is collected when the user devices are used for payment activities.
10. The method of claim 1, wherein the step of collecting the location data regarding the user devices further includes:
selecting multiple reference location points; and
replacing each location point in the location data with the one of the reference location points that is geographically closest to that location point.
11. The method of claim 1, wherein the step of detecting the frequent patterns in the location data to generate the representative sequences further includes:
removing, from the frequent patterns, those frequent patterns that have only a single location point;
removing identical adjacent location points in each frequent pattern; and
aggregating the frequent patterns of multiple days to generate the representative sequences.
12. The method of claim 1, wherein the step of classifying the representative sequences into the sets further includes:
treating each representative sequence as a set;
selecting two sets from the sets to form a set pair, and calculating a set distance of each set pair;
finding a first set and a second set that have the minimum set distance; and
merging the first set and the second set if the minimum set distance is less than a distance threshold.
13. The method of claim 1, further comprising:
classifying dates into multiple date types; and
finding a typical sequence of a target date type according to the occurrence rates of the representative sequences in the target date type.
14. The method of claim 13, wherein the step of finding the typical sequence of the target date type further includes:
calculating a first occurrence rate of a test representative sequence on dates that belong to the target date type;
calculating a second occurrence rate of the test representative sequence on dates that do not belong to the target date type;
calculating a statistical entropy according to the first occurrence rate and the second occurrence rate; and
determining that the test representative sequence is the typical sequence if the first occurrence rate is greater than a probability threshold and the statistical entropy is less than an entropy threshold.
CN201510982408.8A 2015-11-09 2015-12-24 Method for finding out crowd movement behaviors Active CN106682051B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US14/936,674 US10417648B2 (en) 2015-11-09 2015-11-09 System and computer readable medium for finding crowd movements
US14/936,674 2015-11-09
TW104142157A TWI622888B (en) 2015-11-09 2015-12-15 Method for finding crowd movements and non-transitory computer readable medium execute the same
TW104142157 2015-12-15

Publications (2)

Publication Number Publication Date
CN106682051A true CN106682051A (en) 2017-05-17
CN106682051B CN106682051B (en) 2020-05-29

Family

ID=58865128

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510982408.8A Active CN106682051B (en) 2015-11-09 2015-12-24 Method for finding out crowd movement behaviors

Country Status (1)

Country Link
CN (1) CN106682051B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI797916B (en) * 2021-12-27 2023-04-01 博晶醫電股份有限公司 Human body detection method, human body detection device, and computer readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110071881A1 (en) * 2009-09-18 2011-03-24 Microsoft Corporation Mining life pattern based on location history
CN102067631A (en) * 2008-06-27 2011-05-18 雅虎公司 System and method for determination and display of personalized distance
US20110208425A1 (en) * 2010-02-23 2011-08-25 Microsoft Corporation Mining Correlation Between Locations Using Location History
TW201336474A (en) * 2011-12-07 2013-09-16 通路實業集團國際公司 Behavior tracking and modification system
US20150227934A1 (en) * 2014-02-11 2015-08-13 Mastercard International Incorporated Method and system for determining and assessing geolocation proximity

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JAE-GIL LEE等,: ""Trajectory Clustering: A Partition-and-Group Framework"", 《SIGMOD’07》 *

Also Published As

Publication number Publication date
CN106682051B (en) 2020-05-29

Similar Documents

Publication Publication Date Title
Huang et al. Transport mode detection based on mobile phone network data: A systematic review
Chen et al. Dynamic cluster-based over-demand prediction in bike sharing systems
Biagioni et al. Easytracker: automatic transit tracking, mapping, and arrival time prediction using smartphones
Versichele et al. Pattern mining in tourist attraction visits through association rule learning on Bluetooth tracking data: A case study of Ghent, Belgium
Michau et al. Bluetooth data in an urban context: Retrieving vehicle trajectories
CN108650632A (en) It is a kind of based on duty live correspondence and when space kernel clustering stationary point judgment method
US10417648B2 (en) System and computer readable medium for finding crowd movements
Chang et al. Understanding user’s travel behavior and city region functions from station-free shared bike usage data
Yu et al. iVizTRANS: Interactive visual learning for home and work place detection from massive public transportation data
Burkhard et al. On the requirements on spatial accuracy and sampling rate for transport mode detection in view of a shift to passive signalling data
Yang et al. Pedestrian network generation based on crowdsourced tracking data
Li et al. A two-phase clustering approach for urban hotspot detection with spatiotemporal and network constraints
Fu et al. Spatial heterogeneity and migration characteristics of traffic congestion—A quantitative identification method based on taxi trajectory data
Jang et al. Pedestrian mode identification, classification and characterization by tracking mobile data
Gao et al. A spatiotemporal analysis of the impact of lockdown and coronavirus on London’s bicycle hire scheme: from response to recovery to a new normal
Jonker et al. Modeling trip-length distribution of shopping center trips from GPS data
CN106682051A (en) Method for finding out crowd movement behaviors
Rieser-Schüssler Capitalising modern data sources for observing and modelling transport behaviour
CN110046209B (en) Trajectory stopping point extraction method based on Gaussian model
Smith et al. From buildings to cities: techniques for the multi-scale analysis of urban form and function
Currans et al. Exploring ITE’s Trip Generation Manual: Assessing age of data and land-use taxonomy in vehicle trip generation for transportation impact analyses
Tang et al. Integrating GIS and spatial data mining technique for target marketing of university courses
Eftelioglu et al. RING-Net: Road inference from gps trajectories using a deep segmentation network
Liu et al. Effects of buffer size on associations between the built environment and metro ridership: A machine learning-based sensitive analysis
CN113313307A (en) Tour route mining method based on signaling big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant