CN107908800A - Passenger traffic channel query pattern sorting technique based on user's inquiry log - Google Patents

Info

Publication number
CN107908800A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711405012.2A
Other languages
Chinese (zh)
Inventor
林友芳
万怀宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jiaotong University
Original Assignee
Beijing Jiaotong University

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/953 - Querying, e.g. by the use of web search engines
    • G06F16/9535 - Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Traffic Control Systems (AREA)

Abstract

The present invention relates to the processing and analysis of user query log data in the transportation field, and in particular to a method for classifying passenger transport channel query patterns based on user query logs. The proposed method classifies the query patterns of different channels accurately and effectively, and identifies the false query behavior that automated programs (crawlers) introduce into internet query channels, so that this false behavior can be filtered out and data support can be provided to transportation managers and market practitioners.

Description

Passenger traffic channel query pattern sorting technique based on user's inquiry log
Technical field
The present invention relates to the processing and analysis of user query log data in the transportation field, and in particular to a method for classifying passenger transport channel query patterns based on user query logs.
Background
In recent years, with the rapid development of transportation sectors such as aviation, railways and highways, overall passenger numbers have risen steadily, and passenger ticket queries tend to originate from many different channels.
With the development of internet technology, ticket queries are increasingly concentrated on internet channels. Taking air ticket queries as an example, domestic air ticket booking channels currently fall into two main types: traditional agent booking (MCSS, message center switch system) and internet booking (IBE, Internet booking engine). With the development of the internet and mobile smart terminals, the share of air ticket queries and bookings made through IBE channels keeps growing. This makes it easier to collect and analyze user data, but it also creates a problem: internet query channels are flooded with false query behavior generated by large numbers of automated programs (crawlers). A method for classifying passenger transport channel query patterns based on user query logs is therefore needed.
Summary of the invention
Embodiments of the present invention provide a method for classifying passenger transport channel query patterns based on user query logs, so as to classify the query patterns of different channels in users' online query data.
The present invention provides the following solution: a method for classifying passenger transport channel query patterns based on user query logs, comprising the following steps:
S1. Parse the historical database and extract user query log data: parse the raw user query log data in the historical database and extract the fields that are meaningful for classifying channel query patterns. The raw user query data include the date of the query, the hour of the query, the minute of the query, the query channel, the origin city, the destination city, the departure date, and so on.
S2. Analyze the user query log data extracted in S1 along multiple dimensions and construct query pattern features for each channel on each travel route, including:
A. Query volume index. Statistics show that query volume across channels follows a typical long-tail distribution. Taking air ticket query channels as an example, fewer than 10% of the channels account for more than 90% of air ticket queries. Using query volume as a query pattern feature makes it possible to separate out the inactive channels.
B. Comprehensive dispersion index. Normal query behavior tends to show high query volume for departure dates that are imminent or that coincide with social events, and for routes that are popular or affected by events, whereas bots tend to spread their queries evenly over unrelated routes and departure dates. The comprehensive dispersion index is calculated as:
This index expresses how uniformly a channel's query behavior is distributed over the space of origin and destination (O&D) and departure date. The closer the index is to 1, the more uniform the distribution of the channel's query behavior and the more closely it resembles data-scraping behavior.
C. Outlier degree index. The query behavior of normal users tends to be fairly stable, so abnormal query behavior can be analyzed from the perspective of outliers. Specifically, a channel's outlier degree can be analyzed along three dimensions: the route dimension, the history dimension and the channel dimension. Taking the route dimension as an example, if a channel's query volume for one route on a given day is clearly abnormal compared with its average query volume for the other routes, the query behavior on that route is highly suspicious.
Index object: the query behavior of a given channel toward a given O&D in a given hour.
Define $C_{i,j,k}$ as the number of queries made by the i-th channel on the j-th day for the k-th route. The route-dimension outlier degree is calculated as:
$$\mathrm{outlier\_OD} = \frac{C_{i,j,k} - \frac{1}{N}\sum_{k=1}^{N} C_{i,j,k}}{\sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(C_{i,j,k} - \frac{1}{N}\sum_{k=1}^{N} C_{i,j,k}\right)^{2}}}$$
where N is the total number of routes. The index expresses how far the channel's query volume for a route on a given day deviates from the average query volume of the whole sample. An index above 0 with a large absolute value indicates that the sample's query volume is far above the normal level; an index below 0 with a large absolute value indicates that it is far below the normal level.
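As an illustration only, the route-dimension outlier degree can be read as a z-score of one route's query count against the same channel's counts over all N routes on that day. The sketch below assumes the counts are already available as a plain list; the function name `route_outlier_degree` and the choice to return 0.0 when all counts are equal are assumptions, not part of the source.

```python
import math


def route_outlier_degree(counts, k):
    """Z-score of route k's query count among one channel's counts for one day.

    counts: query counts C[i][j][*] of a single channel i on a single day j,
            one entry per route (hypothetical input layout).
    k:      index of the route being scored.
    """
    n = len(counts)
    mean = sum(counts) / n
    # Population standard deviation, matching the 1/N factor in the formula.
    std = math.sqrt(sum((c - mean) ** 2 for c in counts) / n)
    if std == 0:
        # Assumption: identical counts on every route carry no outlier signal.
        return 0.0
    return (counts[k] - mean) / std


# Toy data: four routes with ordinary volume and one queried far more often.
daily_counts = [10, 12, 11, 9, 58]
scores = [route_outlier_degree(daily_counts, k) for k in range(len(daily_counts))]
```

A large positive score flags a route that the channel queried far more than its other routes that day; a threshold for "suspicious" is not given in the source and would have to be chosen separately.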
D. Behavior pattern index. The query waveform of normal passengers follows human daily routines, whereas the query waveform of scraping bots is chaotic and irregular.
Index object: the query behavior of a given channel toward a given O&D over the 24 hours of one day.
Define $behaviorCurve_{c,od,b}$ as the query volume of channel c for route od in hour b, and $standardCurve_{c,od,b}$ as the standard query volume of channel c for route od in hour b.
The behavior pattern index is defined as follows:
$$\mathrm{CosineSimilarity}\left\langle \left(\frac{standardCurve_{i} - standardCurve_{i-1}}{\frac{\sum_{i=1}^{24} standardCurve_{i}}{24}}\right), \left(\frac{behaviorCurve_{i} - behaviorCurve_{i-1}}{\frac{\sum_{i=1}^{24} behaviorCurve_{i}}{24}}\right) \right\rangle$$
The index expresses how similar the sample's query behavior pattern over the past 24 hours is to the standard query behavior pattern of normal users. Its range is [-1, 1]; the closer the value is to 1, the closer the query behavior is to normal human behavior.
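A minimal sketch of the behavior pattern index follows, assuming each curve is a list of 24 hourly query volumes. It takes the hour-to-hour differences of each curve, divides them by that curve's mean, and compares the two resulting vectors by cosine similarity. Using only the 23 differences between consecutive hours, and returning 0.0 for a flat or empty curve, are assumptions made here because the source does not say how the first hour or a zero vector is handled.

```python
import math


def normalized_differences(curve):
    """Hour-to-hour differences of a curve, scaled by the curve's mean volume."""
    mean = sum(curve) / len(curve)
    if mean == 0:
        return [0.0] * (len(curve) - 1)
    return [(curve[i] - curve[i - 1]) / mean for i in range(1, len(curve))]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: no shape information, treat as dissimilar
    return dot / (norm_u * norm_v)


def behavior_pattern_index(behavior_curve, standard_curve):
    """Similarity in [-1, 1] between a channel's hourly curve and the standard one."""
    return cosine_similarity(
        normalized_differences(standard_curve),
        normalized_differences(behavior_curve),
    )


# Hypothetical standard curve: quiet at night, busy during working hours.
standard = [2, 1, 1, 1, 1, 2, 4, 8, 15, 20, 22, 21,
            18, 19, 21, 22, 20, 16, 12, 10, 8, 6, 4, 3]
same_shape = [3 * x for x in standard]   # same rhythm, three times the volume
flat = [5] * 24                          # constant volume around the clock
```

Because the differences are scaled by the mean, a channel with the same daily rhythm but a different overall volume scores close to 1, while a constant-rate curve carries no rhythm at all.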
E. Date dispersion index. Queries from scraping bots tend to be evenly distributed over departure dates, whereas normal users concentrate on a few key departure dates.
Index object: the query behavior of a given channel toward a given O&D in a given hour.
Define μ as the average query volume per departure date and $H_h$ as the total query volume for the h-th departure date (leaveDate):
$$\mu = \frac{\sum_{h=Min(leaveDate)}^{Max(leaveDate)} H_h}{Max(leaveDate) - Min(leaveDate)}$$
The date dispersion index is calculated as:
$$\mathrm{dispersion\_his} = \frac{1}{\mu} \cdot \sqrt{\frac{1}{Max(leaveDate) - Min(leaveDate)} \cdot \sum_{h=Min(leaveDate)}^{Max(leaveDate)} \left(H_h - \mu\right)^{2}}$$
The index expresses how evenly a channel's query volume for a given route is distributed over departure dates. The smaller the index, the more uniform the distribution and the more closely the channel's queries on that route resemble data-scraping behavior.
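The date dispersion index behaves like a coefficient of variation of query volume over departure dates. The sketch below is one possible reading, assuming the per-date totals are held in a dictionary keyed by `datetime.date`; the function name, the zero-filling of dates with no queries, and the error raised for a single date are assumptions. It keeps the source's denominator, Max(leaveDate) - Min(leaveDate), measured in days; dividing by the number of dates instead would give the textbook coefficient of variation.

```python
import math
from datetime import date, timedelta


def date_dispersion(queries_by_departure):
    """Dispersion of one channel's queries for one route over departure dates.

    queries_by_departure: dict mapping departure date -> total queries H_h.
    """
    first, last = min(queries_by_departure), max(queries_by_departure)
    span = (last - first).days          # Max(leaveDate) - Min(leaveDate)
    if span == 0:
        raise ValueError("need at least two distinct departure dates")
    # Assumption: dates inside the range that were never queried count as 0.
    volumes = [queries_by_departure.get(first + timedelta(days=d), 0)
               for d in range(span + 1)]
    mu = sum(volumes) / span
    if mu == 0:
        return 0.0
    variance = sum((h - mu) ** 2 for h in volumes) / span
    return math.sqrt(variance) / mu


# Toy data: the same 200 queries, spread evenly versus piled on one date.
uniform = {date(2018, 1, d): 40 for d in range(1, 6)}
peaked = {date(2018, 1, 1): 196, date(2018, 1, 2): 1, date(2018, 1, 3): 1,
          date(2018, 1, 4): 1, date(2018, 1, 5): 1}
```

On this toy data the evenly spread channel yields the smaller value, which is the direction the text associates with scraping behavior.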
F. Product dispersion index.
Define a query product (O&D&leaveDate) as the combination of a given O&D and a given leaveDate; M is the total number of query products; ν is the average query volume per query product (O&D&leaveDate); $P_p$ is the total query volume of the p-th query product.
Let:
The channel-dimension formula is:
The index expresses how evenly a channel's query volume is distributed over query products (O&D&startDate).
S3. Based on the query pattern features constructed in S2 for different channels on different routes, cluster the query behavior of the channels on the routes using the k-means clustering method (see: MacQueen J. Some Methods for Classification and Analysis of Multivariate Observations [C]. Proc. of Berkeley Symposium on Mathematical Statistics and Probability. 1967: 281-297) to obtain the channel query pattern classification results.
The present invention has the following technical effect: the proposed method for classifying passenger transport channel query patterns based on user query logs classifies the query patterns of different channels accurately and effectively, and identifies the false query behavior that automated programs (crawlers) introduce into internet query channels, so that this false behavior can be filtered out and data support can be provided to transportation managers and market practitioners.
Brief description of the drawings
Fig. 1 shows the channel query volume statistics;
Fig. 2 is a scatter distribution plot of normal query behavior;
Fig. 3 is a scatter distribution plot of bot query behavior;
Fig. 4 is a query volume curve that matches normal human daily routines; the horizontal axis is query time (hourly granularity) and the vertical axis is the query volume for the route in that hour;
Fig. 5 is a query volume curve that does not match normal human daily routines; the horizontal axis is query time (hourly granularity) and the vertical axis is the query volume for the route in that hour;
Fig. 6 shows the passenger transport channel query pattern classification results.
Detailed description of the embodiments
The processing flow of the method for classifying passenger transport channel query patterns based on user query logs proposed by an embodiment of the present invention comprises the following steps:
S1. Parse the raw user query log data in the database for a given time period. The time period here is measured in units of ten minutes; in practice, an hour, a day or another interval may also be chosen. After the unstructured raw user query log data have been denoised, serialized, converted and decompressed, extract the fields that are meaningful for classifying channel query patterns. The user query log data include the date of the query, the hour of the query, the minute of the query, the query channel, the origin city, the destination city, the departure date, and so on.
The user query log data include the fields shown in Table 1 below.
Table 1
No.   Name              Description
1     record_date       Date of the user's query
2     record_hour       Hour of the user's query
3     record_minute     Minute of the user's query
4     channel           Query channel used by the user
5     origin            Origin city
6     dest              Destination city
7     departure_date    Departure date
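To make the extracted record concrete, the fields of Table 1 can be carried in a small typed structure. The sketch below is illustrative only: the source does not describe the raw log layout, so the comma-separated line format, ISO date strings, the airport-style city codes in the example, and the names `QueryRecord` and `parse_record` are all assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class QueryRecord:
    """One extracted query log entry with the seven fields of Table 1."""
    record_date: date      # date of the user's query
    record_hour: int       # hour of the user's query
    record_minute: int     # minute of the user's query
    channel: str           # query channel
    origin: str            # origin city
    dest: str              # destination city
    departure_date: date   # departure date being queried


def parse_record(line):
    """Parse one line of a hypothetical comma-separated export of the log."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != 7:
        raise ValueError(f"expected 7 fields, got {len(parts)}")
    rec_date, hour, minute, channel, origin, dest, dep_date = parts
    return QueryRecord(
        record_date=date.fromisoformat(rec_date),
        record_hour=int(hour),
        record_minute=int(minute),
        channel=channel,
        origin=origin,
        dest=dest,
        departure_date=date.fromisoformat(dep_date),
    )


record = parse_record("2017-12-01, 9, 30, IBE, PEK, SHA, 2017-12-24")
```

Records in this form can then be grouped by channel, route (origin and dest) and departure date to build the features described in S2.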
S2. Analyze the user query log data extracted in S1 along multiple dimensions and construct the query pattern features of each channel on each travel route, including:
A. Query volume index: statistics show that query volume across channels follows a typical long-tail distribution. Taking air ticket query channels as an example, as shown in Fig. 1, the horizontal axis is channel query volume and the vertical axis is proportion; the two lines represent, respectively, the proportion of channels by number and the proportion of total query volume accounted for by those channels. It can be seen that fewer than 10% of the channels account for more than 90% of air ticket queries. Using query volume as a query pattern feature makes it possible to separate out the inactive channels.
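The long-tail claim can be checked on any per-channel tally by computing the share of all queries held by the busiest channels. The sketch below is a rough illustration with made-up volumes; `top_channel_share` and the channel names are hypothetical.

```python
import math


def top_channel_share(queries_by_channel, fraction=0.1):
    """Share of total query volume held by the top `fraction` of channels."""
    volumes = sorted(queries_by_channel.values(), reverse=True)
    total = sum(volumes)
    if total == 0:
        return 0.0
    # Always look at at least one channel, rounding the head size up.
    top_n = max(1, math.ceil(len(volumes) * fraction))
    return sum(volumes[:top_n]) / total


# Toy tally: one very active channel and nine nearly inactive ones.
tally = {"channel_0": 9500}
tally.update({f"channel_{i}": 55 for i in range(1, 10)})
share = top_channel_share(tally)
```

A share above 0.9 for the top tenth of channels matches the distribution described for Fig. 1; channels in the tail are the inactive ones this feature is meant to separate out.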
B. Comprehensive dispersion index: Figs. 2 and 3 show the query volume of different query patterns along two dimensions, departure date and origin. Fig. 2 shows normal query behavior and Fig. 3 shows bot query behavior. Normal query behavior tends to show high query volume for departure dates that are close to the query date or that coincide with social events, and for routes that are popular or affected by events, whereas bots tend to spread their queries evenly over unrelated routes and departure dates.
C. Outlier degree index: the query behavior of normal users tends to be fairly stable, so abnormal query behavior can be analyzed from the perspective of outliers. Specifically, a channel's outlier degree can be analyzed along three dimensions: the route dimension, the history dimension and the channel dimension. Taking the route dimension as an example, if a channel's query volume for one route on a given day is clearly abnormal compared with its average query volume for the other routes, the query behavior on that route is highly suspicious.
D. Behavior pattern index: the query waveform of normal passengers follows human daily routines. As shown in Fig. 4, query volume is higher between 9 a.m. and 5 p.m. and lower during the rest hours from 2 a.m. to 6 a.m. The query waveform of scraping bots, by contrast, is chaotic and irregular, as shown in Fig. 5. Using the curve in this dimension as a channel feature makes it possible to identify query behavior such as bot scraping effectively.
E. Date dispersion index: queries from scraping bots tend to be evenly distributed over departure dates, whereas queries from normal users concentrate on a few key departure dates. The index expresses how evenly a channel's query volume for a given route is distributed over departure dates. The smaller the index, the more uniform the distribution and the more closely the channel's queries on that route resemble data-scraping behavior.
F. Product dispersion index: the index expresses how evenly a channel's query volume is distributed over query products (O&D&startDate). The more uniform the distribution, the more closely the queries on those query products resemble data-scraping behavior.
S3. Cluster the query behavior of different channels on different routes using the k-means clustering method, as follows:
S3.1 Randomly choose 10 initial cluster centers and assign each query pattern feature to be clustered to a cluster according to the nearest-neighbor rule;
S3.2 Recompute the centroid of each cluster by averaging to determine the new cluster centers. Iterate until the distance moved by the cluster centers is below a specified threshold or the maximum number of iterations is reached. Inspecting the clustered channel query features shows that some categories have very similar curve shapes, so the number of cluster centers is adjusted to 8;
S3.3 Repeat S3.2. The resulting curve categories are reasonable, and the 8 channel classes are taken as the channel query pattern classification results (as shown in Fig. 6).
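Steps S3.1 and S3.2 describe the standard k-means loop. A compact standard-library sketch is given below under stated assumptions: each feature vector is a tuple of floats, initial centers are drawn at random from the data points, and a cluster that ends up empty keeps its previous center. The function name, the default parameters and the toy feature vectors are hypothetical; the source uses 10 and then 8 clusters on real channel features.

```python
import math
import random


def kmeans(points, k, max_iter=100, tol=1e-6, seed=0):
    """Cluster points into k groups; returns (labels, centers)."""
    rng = random.Random(seed)
    # S3.1: random initial centers, here sampled from the data itself.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Nearest-neighbor rule: each point joins its closest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # S3.2: recompute each center as the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members)
                                    for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # assumption: keep empty cluster's center
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # stop once no center moved more than tol
            break
    return labels, centers


# Toy feature vectors, e.g. (date dispersion, behavior pattern index).
features = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1),
            (5.0, 5.1), (5.1, 5.0), (5.2, 5.1)]
labels, centers = kmeans(features, k=2)
```

In practice the features from S2 sit on different scales, so some normalization before clustering is usually worthwhile; the source does not say whether this was done. The choice of k by inspecting the resulting curves, as in S3.2 and S3.3, remains a manual step.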
In conclusion the passenger traffic channel query pattern sorting technique proposed by the present invention based on user's inquiry log, The false User behavior that the auto-programming (reptile) being effectively found that in internet checking channel is brought.Can be accurately and effectively right Falseness inquiry is filtered, and is provided data for field of traffic manager and market practitioner and is supported.
One of ordinary skill in the art will appreciate that:Attached drawing is the schematic diagram of one embodiment, module in attached drawing or Flow is not necessarily implemented necessary to the present invention.
Each embodiment in this specification is described by the way of progressive, identical similar portion between each embodiment Divide mutually referring to what each embodiment stressed is the difference with other embodiment.Can be according to reality Need to select some or all of module therein to realize the purpose of this embodiment scheme.Those of ordinary skill in the art are not In the case of making the creative labor, you can to understand and implement.
The foregoing is only a preferred embodiment of the present invention, but protection scope of the present invention be not limited thereto, Any one skilled in the art the invention discloses technical scope in, the change or replacement that can readily occur in, It should be covered by the protection scope of the present invention.Therefore, protection scope of the present invention should be with scope of the claims Subject to.

Claims (1)

  1. A method for classifying passenger transport channel query patterns based on user query logs, characterized in that the method comprises the following steps:
    S1. Parse the historical database and extract user query log data: parse the raw user query log data in the historical database and extract the fields that are meaningful for classifying channel query patterns, the raw user query data including the date of the query, the hour of the query, the minute of the query, the query channel, the origin city, the destination city and the departure date;
    S2. Analyze the user query log data extracted in S1 along multiple dimensions and construct the query pattern features of different channels on different travel routes, including:
    A. Query volume index: statistics show that query volume across channels follows a typical long-tail distribution; taking air ticket query channels as an example, fewer than 10% of the channels account for more than 90% of air ticket queries; using the query volume index as a query pattern feature makes it possible to separate out the inactive channels;
    B. Comprehensive dispersion index: normal query behavior tends to show high query volume for departure dates that are imminent or that coincide with social events, and for routes that are popular or affected by events, whereas bots tend to spread their queries evenly over unrelated routes and departure dates; the comprehensive dispersion index is calculated as:
    This index expresses how uniformly a channel's query behavior is distributed over the space of origin and destination and departure date; the closer the index is to 1, the more uniform the distribution of the channel's query behavior and the more closely it resembles data-scraping behavior;
    C. Outlier degree index: the query behavior of normal users tends to be fairly stable, so abnormal query behavior can be analyzed from the perspective of outliers; specifically, a channel's outlier degree can be analyzed along three dimensions: the route dimension, the history dimension and the channel dimension; taking the route dimension as an example, if a channel's query volume for one route on a given day is clearly abnormal compared with its average query volume for the other routes, the query behavior on that route is highly suspicious;
    Index object: the query behavior of a given channel toward a given origin and destination in a given hour;
    Define $C_{i,j,k}$ as the number of queries made by the i-th channel on the j-th day for the k-th route; the route-dimension outlier degree is calculated as:
    $$\mathrm{outlier\_OD} = \frac{C_{i,j,k} - \frac{1}{N}\sum_{k=1}^{N} C_{i,j,k}}{\sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(C_{i,j,k} - \frac{1}{N}\sum_{k=1}^{N} C_{i,j,k}\right)^{2}}}$$
    where N is the total number of routes; the index expresses how far the channel's query volume for a route on a given day deviates from the average query volume of the whole sample; an index above 0 with a large absolute value indicates that the sample's query volume is far above the normal level; an index below 0 with a large absolute value indicates that it is far below the normal level;
    D. Behavior pattern index: the query waveform of normal passengers follows human daily routines, whereas the query waveform of scraping bots is chaotic and irregular;
    Index object: the query behavior of a given channel toward a given O&D over the 24 hours of one day;
    Define $behaviorCurve_{c,od,b}$ as the query volume of channel c for route od in hour b, and $standardCurve_{c,od,b}$ as the standard query volume of channel c for route od in hour b;
    The behavior pattern index is defined as follows:
    $$\mathrm{CosineSimilarity}\left\langle \left(\frac{standardCurve_{i} - standardCurve_{i-1}}{\frac{\sum_{i=1}^{24} standardCurve_{i}}{24}}\right), \left(\frac{behaviorCurve_{i} - behaviorCurve_{i-1}}{\frac{\sum_{i=1}^{24} behaviorCurve_{i}}{24}}\right) \right\rangle$$
    The index expresses how similar the sample's query behavior pattern over the past 24 hours is to the standard query behavior pattern of normal users; its range is [-1, 1], and the closer the value is to 1, the closer the query behavior is to normal human behavior;
    E. Date dispersion index: queries from scraping bots tend to be evenly distributed over departure dates, whereas normal users concentrate on a few key departure dates;
    Index object: the query behavior of a given channel toward a given O&D in a given hour;
    Define μ as the average query volume per departure date and $H_h$ as the total query volume for the h-th departure date (leaveDate);
    $$\mu = \frac{\sum_{h=Min(leaveDate)}^{Max(leaveDate)} H_h}{Max(leaveDate) - Min(leaveDate)}$$
    The date dispersion index is calculated as:
    $$\mathrm{dispersion\_his} = \frac{1}{\mu} \cdot \sqrt{\frac{1}{Max(leaveDate) - Min(leaveDate)} \cdot \sum_{h=Min(leaveDate)}^{Max(leaveDate)} \left(H_h - \mu\right)^{2}}$$
    The index expresses how evenly a channel's query volume for a given route is distributed over departure dates; the smaller the index, the more uniform the distribution and the more closely the channel's queries on that route resemble data-scraping behavior;
    F. Product dispersion index:
    Define a query product (O&D&leaveDate) as the combination of a given O&D and a given leaveDate; M is the total number of query products; ν is the average query volume per query product (O&D&leaveDate); $P_p$ is the total query volume of the p-th query product;
    Let:
    The channel-dimension formula is:
    The index expresses how evenly a channel's query volume is distributed over query products (O&D&startDate);
    S3. Based on the query pattern features constructed in S2 for different channels on different routes, cluster the query behavior of the channels on the routes using the k-means clustering method to obtain the channel query pattern classification results.
CN201711405012.2A 2017-12-22 2017-12-22 Passenger traffic channel query pattern sorting technique based on user's inquiry log Pending CN107908800A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711405012.2A CN107908800A (en) 2017-12-22 2017-12-22 Passenger traffic channel query pattern sorting technique based on user's inquiry log

Publications (1)

Publication Number Publication Date
CN107908800A true CN107908800A (en) 2018-04-13

Family

ID=61869641

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711405012.2A Pending CN107908800A (en) 2017-12-22 2017-12-22 Passenger traffic channel query pattern sorting technique based on user's inquiry log

Country Status (1)

Country Link
CN (1) CN107908800A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108717427A (en) * 2018-05-05 2018-10-30 北京交通大学 Passenger traffic demand index computational methods based on user's inquiry log

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7933884B2 (en) * 2008-01-30 2011-04-26 Yahoo! Inc. Apparatus and methods for tracking, querying, and visualizing behavior targeting processes
CN106777303A (en) * 2016-12-30 2017-05-31 中国民航信息网络股份有限公司 Passenger flight User behavior sorting technique and system
CN106780273A (en) * 2016-12-30 2017-05-31 中国民航信息网络股份有限公司 Passenger flight requirement analysis method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
靳国旺 等: "《雷达摄影测量》", 30 April 2015, 测绘出版社 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180413