CN107908800A - Method for classifying passenger transport channel query patterns based on user query logs - Google Patents
Method for classifying passenger transport channel query patterns based on user query logs
- Publication number
- CN107908800A (application CN201711405012.2A)
- Authority
- CN
- China
- Prior art keywords
- channel
- user
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Traffic Control Systems (AREA)
Abstract
The present invention relates to the processing and analysis of user query log data in the transportation field, and in particular to a method for classifying passenger transport channel query patterns based on user query logs. The proposed method can classify the query patterns of different channels accurately and effectively, and can identify the false query behavior generated by automated programs (web crawlers) in internet query channels, so that such behavior can be filtered out and data support can be provided to transportation managers and market practitioners.
Description
Technical field
The present invention relates to the processing and analysis of user query log data in the transportation field, and in particular to a method for classifying passenger transport channel query patterns based on user query logs.
Background
In recent years, with the rapid development of transportation sectors such as aviation, railway, and highway, the total number of passengers across the transportation market has risen steadily, and passenger ticket queries often originate from different channels.
With the development of internet technology, queries for travel tickets are increasingly concentrated on internet channels. Taking air ticket queries as an example, domestic air ticket booking channels currently fall into two main types: traditional agent booking (MCSS, message center switch system) and internet booking (IBE, Internet booking engine). With the development of the internet and mobile smart terminals, the proportion of air ticket queries and bookings made through IBE channels keeps growing. Although this makes it easier to analyze and collect user data, it also brings another problem: these internet query channels are full of false query behavior generated by large numbers of automated programs (crawlers). A method for classifying passenger transport channel query patterns based on user query logs is therefore needed.
Summary of the invention
Embodiments of the present invention provide a method for classifying passenger transport channel query patterns based on user query logs, in order to classify the query patterns of the different channels found in users' online query data.
The present invention provides the following scheme: a method for classifying passenger transport channel query patterns based on user query logs, comprising the following steps.
S1. Parse the historical database and extract user query log data: parse the raw user query log data in the historical database and extract from it the fields that are meaningful for classifying channel query patterns. The raw user query data include the date of the query, the hour of the query, the minute of the query, the query channel, the departure city, the destination city, the departure date, and so on.
S2. Analyze the user query log data extracted in S1 along multiple dimensions and build query pattern features of different channels for different travel routes, including:
A. Query volume index. Statistics show that the distribution of query volume across channels is a typical long-tail distribution. Taking air ticket query channels as an example, fewer than 10% of the channels account for more than 90% of the air ticket query volume. Using query volume as a query pattern feature makes it possible to single out some inactive channels.
B. Comprehensive dispersion index. Normal query behavior usually shows high query volume for departure dates that are close to the query date or that coincide with social events, and high query volume for routes that are popular or affected by events. Robots, by contrast, tend to spread their queries evenly over unrelated routes and departure dates. Comprehensive dispersion index formula: [formula not reproduced in the source text]
This index expresses how uniformly the query behavior of a channel is distributed over the space of origin and destination (O&D) and departure date. The closer the index is to 1, the more uniform the distribution of the channel's query behavior and the closer it is to data-scraping behavior.
C. Outlier degree index. The query behavior of normal users usually has a certain stability, so abnormal query behavior can be analyzed from the perspective of outliers. Specifically, the outlier character of a channel can be analyzed along three dimensions: the route dimension, the history dimension, and the channel dimension. Taking the route dimension as an example, if the query volume of a channel for a certain route on one day is clearly abnormal compared with its average query volume for other routes, the query behavior for that route is highly suspicious.
Index object: the query behavior of a channel for an O&D in a given hour.
Define $C_{i,j,k}$ as the number of queries made by the i-th channel on the j-th day for the k-th route. Route-dimension outlier degree formula:
$$outlier\_OD = \frac{C_{i,j,k} - \frac{1}{N}\sum_{k'=1}^{N} C_{i,j,k'}}{\sqrt{\frac{1}{N}\sum_{k'=1}^{N}\left(C_{i,j,k'} - \frac{1}{N}\sum_{k''=1}^{N} C_{i,j,k''}\right)^{2}}}$$
where N is the total number of routes. The index expresses how far the query volume of the channel for a certain route on a certain day deviates from the average query volume of the overall sample. If the index is greater than 0, a larger absolute value means the sample query volume is further above the normal level; if the index is less than 0, a larger absolute value means the sample query volume is further below the normal level.
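The route-dimension outlier degree can be read as a z-score of one route's query count against all N routes of the same channel and day. The following is a minimal Python sketch under that reading; the function name `route_outlier_degree` and the fallback for a zero standard deviation are illustrative assumptions, not part of the patent.

```python
import math


def route_outlier_degree(counts, k):
    """Z-score of route k's query count among all routes.

    counts: query counts C[i][j][*] of one channel on one day, one per route.
    k: index of the route being scored.
    """
    n = len(counts)
    mean = sum(counts) / n
    # Population standard deviation (divides by N, as in the formula).
    std = math.sqrt(sum((c - mean) ** 2 for c in counts) / n)
    if std == 0:
        # Assumption: identical counts on every route mean no outlier.
        return 0.0
    return (counts[k] - mean) / std


if __name__ == "__main__":
    daily_counts = [10, 10, 10, 50]  # hypothetical counts for four routes
    for route, _ in enumerate(daily_counts):
        print(route, round(route_outlier_degree(daily_counts, route), 3))
```

A positive score marks a route queried more than the channel's average for that day, a negative score one queried less.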
D. Behavior pattern index. The query waveform of normal travelers follows the human daily routine, whereas the query waveform of data-scraping robots is chaotic and irregular.
Index object: the query behavior of a channel for an O&D over the 24 hours of one day.
Define $behaviorCurve_{c,od,b}$ as the query volume of channel c for route od in hour b, and $standardCurve_{c,od,b}$ as the standard query volume of channel c for route od in hour b.
The behavior pattern index is defined as follows, where i indexes the hour and the channel and route subscripts are omitted:
$$CosineSimilarity\left\langle \frac{standardCurve_{i} - standardCurve_{i-1}}{\frac{\sum_{i=1}^{24} standardCurve_{i}}{24}},\ \frac{behaviorCurve_{i} - behaviorCurve_{i-1}}{\frac{\sum_{i=1}^{24} behaviorCurve_{i}}{24}} \right\rangle$$
This index expresses the similarity between the sample's query pattern over the past 24 hours and the standard query pattern of normal users. Its value range is [-1, 1]; the closer the value is to 1, the closer the query behavior is to the behavior of normal users.
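One way to read the behavior pattern index is as the cosine similarity between two vectors of hour-to-hour changes, each scaled by the mean hourly volume of its curve. The Python sketch below follows that reading. It assumes 24 hourly values per curve and uses only the 23 differences between consecutive hours, because the text does not say how the first hour's difference is formed; the function names are illustrative.

```python
import math


def normalized_diffs(curve):
    """Hour-to-hour differences divided by the curve's mean hourly volume."""
    mean = sum(curve) / len(curve)
    if mean == 0:
        raise ValueError("curve has no queries")
    return [(later - earlier) / mean for earlier, later in zip(curve, curve[1:])]


def behavior_pattern_index(behavior_curve, standard_curve):
    """Cosine similarity of the two normalized difference vectors, in [-1, 1]."""
    u = normalized_diffs(standard_curve)
    v = normalized_diffs(behavior_curve)
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    # Assumption: a perfectly flat curve has no shape to compare, so return 0.
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Hypothetical standard curve: quiet at night, busy in working hours.
    standard = [2, 1, 1, 1, 1, 2, 4, 7, 10, 12, 12, 11,
                10, 11, 12, 12, 11, 10, 9, 8, 7, 6, 4, 3]
    same_shape = [3 * s for s in standard]
    flat = [5] * 24
    print(behavior_pattern_index(same_shape, standard))
    print(behavior_pattern_index(flat, standard))
```

Dividing by the mean makes the index insensitive to a channel's overall volume, so a small channel and a large channel with the same daily rhythm score the same.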
E. Date dispersion index. The query volume produced by robot data scraping is usually distributed evenly over departure dates, whereas normal users concentrate on a few key departure dates.
Index object: the query behavior of a channel for an O&D in a given hour.
Define μ as the average query volume per departure date, and $H_h$ as the total query volume for the h-th departure date (leaveDate):
$$\mu = \frac{\sum_{h=Min(leaveDate)}^{Max(leaveDate)} H_h}{Max(leaveDate) - Min(leaveDate)}$$
Date dispersion index formula:
$$dispersion\_his = \frac{1}{\mu} \cdot \sqrt{\frac{1}{Max(leaveDate) - Min(leaveDate)} \cdot \sum_{h=Min(leaveDate)}^{Max(leaveDate)} \left(H_h - \mu\right)^{2}}$$
This index expresses how evenly the query volume of a channel for a route is distributed over departure dates. The smaller the index, the more uniform the distribution, and the more the channel's queries for that route resemble data-scraping behavior.
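The date dispersion index in claim 1 has the form of a coefficient of variation over departure dates, with Max(leaveDate) - Min(leaveDate) as the divisor. The Python sketch below follows that formula literally. Treating departure dates with no queries as zero counts, and rejecting inputs with a single departure date, are assumptions made here for illustration.

```python
import math
from collections import Counter
from datetime import date, timedelta


def date_dispersion(departure_dates):
    """Dispersion of query volume over departure dates (smaller = more uniform).

    departure_dates: one datetime.date per query of a channel for one route.
    """
    hist = Counter(departure_dates)          # H_h: queries per departure date
    first, last = min(hist), max(hist)
    span = (last - first).days               # Max(leaveDate) - Min(leaveDate)
    if span == 0:
        raise ValueError("need at least two distinct departure dates")
    days = [first + timedelta(days=d) for d in range(span + 1)]
    mu = sum(hist[d] for d in days) / span
    variance = sum((hist[d] - mu) ** 2 for d in days) / span
    return math.sqrt(variance) / mu


if __name__ == "__main__":
    start = date(2017, 12, 20)
    days = [start + timedelta(days=d) for d in range(5)]
    uniform = [d for d in days for _ in range(4)]                       # 4 queries per date
    peaked = [days[0]] * 16 + [d for d in days[1:]]                     # 16 on one date, 1 on each other
    print(date_dispersion(uniform), date_dispersion(peaked))
```

In this example the evenly spread queries give the smaller value, which is the pattern the text associates with data scraping.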
F. Product dispersion index.
Define a query product (O&D&leaveDate) as the combination of an O&D and a leaveDate. M is the total number of query products; ν is the average query volume per query product (O&D&leaveDate); $P_p$ is the total query volume of the p-th query product.
Let: [formula not reproduced in the source text]
Channel-dimension formula: [formula not reproduced in the source text]
This index expresses how evenly the query volume of a channel is distributed over the query products (O&D&startDate).
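The quantities defined for the product dispersion index (the per-product volume P_p, the product count M, and the mean ν) can be computed by grouping queries on the (origin, destination, departure date) triple. The Python sketch below covers only these definitions, since the index formula itself is not reproduced in the text. Counting only products that actually appear in the log is an assumption, and the function name is illustrative.

```python
from collections import Counter


def product_stats(queries):
    """Group queries by query product and return (P, M, nu).

    queries: iterable of (origin, dest, departure_date) tuples for one channel.
    P: Counter mapping each product to its total query volume P_p.
    M: number of distinct products seen (assumption: unseen products are ignored).
    nu: average query volume per product.
    """
    volumes = Counter(queries)
    m = len(volumes)
    nu = sum(volumes.values()) / m if m else 0.0
    return volumes, m, nu


if __name__ == "__main__":
    log = [("PEK", "SHA", "2017-12-24")] * 3 + [("PEK", "CAN", "2017-12-24")]
    volumes, m, nu = product_stats(log)
    print(m, nu, volumes[("PEK", "SHA", "2017-12-24")])
```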
S3. Based on the query pattern features of different channels for different routes built in S2, use the k-means clustering method (see: MacQueen J. Some Methods for Classification and Analysis of Multivariate Observations [C]. Proc. of Berkeley Symposium on Mathematical Statistics and Probability. 1967: 281-297.) to cluster the query behavior of different channels for different routes and obtain the channel query pattern classification results.
The present invention has the following technical effect: the proposed method for classifying passenger transport channel query patterns based on user query logs can classify the query patterns of different channels accurately and effectively, and can identify the false query behavior generated by automated programs (crawlers) in internet query channels, so that false query behavior can be filtered out and data support can be provided to transportation managers and market practitioners.
Brief description of the drawings
Fig. 1 shows the statistics of channel query volume;
Fig. 2 shows the dispersion distribution of normal query behavior;
Fig. 3 shows the dispersion distribution of robot query behavior;
Fig. 4 shows a query volume curve that matches the daily routine of normal users; the horizontal axis is the query time (hourly granularity) and the vertical axis is the query volume for the route in that hour;
Fig. 5 shows a query volume curve that does not match the daily routine of normal users; the horizontal axis is the query time (hourly granularity) and the vertical axis is the query volume for the route in that hour;
Fig. 6 shows the classification results of passenger transport channel query patterns.
Embodiment
The processing flow of the method for classifying passenger transport channel query patterns based on user query logs proposed in an embodiment of the present invention includes the following steps:
S1. Parse the raw user query log data in the database for a given time period. The time period here is measured in units of ten minutes; in practice an hour, a day, or another interval may also be chosen. After the unstructured raw user query log data have undergone denoising, serialization, conversion, decompression, and similar processing, the fields that are meaningful for classifying channel query patterns are extracted. The user query log data include the date of the query, the hour of the query, the minute of the query, the query channel, the departure city, the destination city, the departure date, and so on.
The user query log data include the fields shown in Table 1 below.
Table 1
No. | Field name | Description |
1 | record_date | Date of the user's query |
2 | record_hour | Hour of the user's query |
3 | record_minute | Minute of the user's query |
4 | channel | Query channel used by the user |
5 | origin | Departure city |
6 | dest | Destination city |
7 | departure_date | Departure date |
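The seven fields of Table 1 map naturally onto a small record type. The Python sketch below parses one already-cleaned log line into such a record. The comma-separated layout, the ISO date format, and the names `QueryRecord` and `parse_record` are assumptions for illustration; the patent does not specify the raw log format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class QueryRecord:
    """One user query, holding the seven fields listed in Table 1."""
    record_date: date
    record_hour: int
    record_minute: int
    channel: str
    origin: str
    dest: str
    departure_date: date


def parse_record(line, sep=","):
    """Parse one delimited line (assumed field order follows Table 1)."""
    parts = [p.strip() for p in line.split(sep)]
    if len(parts) != 7:
        raise ValueError(f"expected 7 fields, got {len(parts)}")
    rec_date, hour, minute, channel, origin, dest, dep_date = parts
    hour_i, minute_i = int(hour), int(minute)
    if not (0 <= hour_i <= 23 and 0 <= minute_i <= 59):
        raise ValueError("hour or minute out of range")
    return QueryRecord(date.fromisoformat(rec_date), hour_i, minute_i,
                       channel, origin, dest, date.fromisoformat(dep_date))


if __name__ == "__main__":
    # Hypothetical log line; channel and city codes are made up.
    print(parse_record("2017-12-01, 9, 30, IBE-01, PEK, SHA, 2017-12-24"))
```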
S2. Analyze the user query log data extracted in S1 along multiple dimensions and build the query pattern features of channels for different travel routes, including:
A. Query volume index: statistics show that the distribution of query volume across channels is a typical long-tail distribution. Taking air ticket query channels as an example, as shown in Fig. 1, the horizontal axis is the channel query volume, the vertical axis is the proportion, and the two lines represent the proportion of channels by count and the proportion of total query volume they account for. It can be seen that fewer than 10% of the air ticket query channels account for more than 90% of the air ticket query volume. Using query volume as a query pattern feature makes it possible to single out some inactive channels.
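The long-tail observation above can be checked on a log by counting queries per channel and measuring what share of all queries the busiest channels contribute. The Python sketch below is one hedged way to do that; the function name and the 10% default are illustrative choices that mirror the figures quoted in the text.

```python
from collections import Counter


def head_share(channels, top_fraction=0.10):
    """Share of all queries contributed by the busiest `top_fraction` of channels.

    channels: iterable with one channel name per query.
    """
    counts = sorted(Counter(channels).values(), reverse=True)
    if not counts:
        return 0.0
    top_n = max(1, int(len(counts) * top_fraction))  # always keep at least one channel
    return sum(counts[:top_n]) / sum(counts)


if __name__ == "__main__":
    # Hypothetical log: one dominant channel and nine nearly inactive ones.
    log = ["big"] * 91 + [f"small-{i}" for i in range(9)]
    print(head_share(log))
```

Channels that fall outside the head and contribute almost nothing are candidates for the inactive group that the query volume feature is meant to separate out.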
B. Comprehensive dispersion index: Figs. 2 and 3 show the query volume of different query patterns along two dimensions, departure date and departure place. Fig. 2 shows normal query behavior and Fig. 3 shows robot query behavior. Normal query behavior usually shows high query volume for departure dates that are close to the query date or that coincide with social events, and high query volume for routes that are popular or affected by events, whereas robots tend to spread their queries evenly over unrelated routes and departure dates.
C. Outlier degree index: the query behavior of normal users usually has a certain stability, so abnormal query behavior can be analyzed from the perspective of outliers. Specifically, the outlier character of a channel can be analyzed along three dimensions: the route dimension, the history dimension, and the channel dimension. Taking the route dimension as an example, if the query volume of a channel for a certain route on one day is clearly abnormal compared with its average query volume for other routes, the query behavior for that route is highly suspicious.
D. Behavior pattern index: the query waveform of normal travelers follows the human daily routine. As shown in Fig. 4, query volume is higher from 9 a.m. to 5 p.m. and lower during the rest period from 2 a.m. to 6 a.m. The query waveform of data-scraping robots is chaotic and irregular, as shown in Fig. 5. Using the curve along this dimension as a channel feature makes it possible to distinguish query behavior such as robot data scraping effectively.
E. Date dispersion index: the query volume produced by robot data scraping is usually distributed evenly over departure dates, whereas normal queries concentrate on a few key departure dates. This index expresses how evenly the query volume of a channel for a route is distributed over departure dates. The smaller the index, the more uniform the distribution, and the more the channel's queries for that route resemble data-scraping behavior.
F. Product dispersion index: this index expresses how evenly the query volume of a channel is distributed over the query products (O&D&startDate). The more uniform the distribution, the more the queries for that query product resemble data-scraping behavior.
S3. Use the k-means clustering method to cluster the query behavior of different channels for different routes, as follows:
S3.1 Randomly initialize 10 cluster centers and assign each query pattern feature to be clustered to a cluster according to the nearest-neighbor rule;
S3.2 Recompute the centroid of each cluster by averaging to determine the new cluster centers. Iterate until the distance moved by the cluster centers is less than a specified value or the maximum number of iterations is reached. Inspection of the clustered channel query features shows that the curve shapes of some categories are quite similar, so the number of cluster centers is adjusted to 8;
S3.3 Repeat S3.2. The resulting curve categories are reasonable, and the 8 classes of channels obtained are the channel query pattern classification results (as shown in Fig. 6).
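Steps S3.1 to S3.3 describe the standard k-means loop: random initial centers, nearest-center assignment, centroid update by averaging, and a stop rule based on center movement or an iteration cap. The Python sketch below is a plain, dependency-free version of that loop. It is an illustration rather than the patent's implementation; the parameter names, the seeded random generator, and the handling of empty clusters are assumptions.

```python
import math
import random


def kmeans(points, k, max_iter=100, tol=1e-6, seed=0):
    """Cluster feature vectors with basic k-means.

    points: list of equal-length numeric tuples (query pattern features).
    Returns (labels, centers).
    """
    rng = random.Random(seed)
    # S3.1: pick k distinct points as the initial cluster centers.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # S3.2: recompute each center as the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # assumption: keep an empty cluster's center
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # stop once the centers have effectively stopped moving
            break
    return labels, centers


if __name__ == "__main__":
    # Two hypothetical, well-separated groups of 2-D feature vectors.
    features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centers = kmeans(features, k=2)
    print(labels, centers)
```

Changing `k` from 10 to 8 and rerunning, as described in S3.2 and S3.3, only requires calling the function again with the new value.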
In conclusion the passenger traffic channel query pattern sorting technique proposed by the present invention based on user's inquiry log,
The false User behavior that the auto-programming (reptile) being effectively found that in internet checking channel is brought.Can be accurately and effectively right
Falseness inquiry is filtered, and is provided data for field of traffic manager and market practitioner and is supported.
One of ordinary skill in the art will appreciate that:Attached drawing is the schematic diagram of one embodiment, module in attached drawing or
Flow is not necessarily implemented necessary to the present invention.
Each embodiment in this specification is described by the way of progressive, identical similar portion between each embodiment
Divide mutually referring to what each embodiment stressed is the difference with other embodiment.Can be according to reality
Need to select some or all of module therein to realize the purpose of this embodiment scheme.Those of ordinary skill in the art are not
In the case of making the creative labor, you can to understand and implement.
The foregoing is only a preferred embodiment of the present invention, but protection scope of the present invention be not limited thereto,
Any one skilled in the art the invention discloses technical scope in, the change or replacement that can readily occur in,
It should be covered by the protection scope of the present invention.Therefore, protection scope of the present invention should be with scope of the claims
Subject to.
Claims (1)
- 1. A method for classifying passenger transport channel query patterns based on user query logs, characterized in that the method comprises the following steps:
S1. Parse the historical database and extract user query log data: parse the raw user query log data in the historical database and extract from it the fields that are meaningful for classifying channel query patterns; the raw user query data include the date of the query, the hour of the query, the minute of the query, the query channel, the departure city, the destination city, and the departure date;
S2. Analyze the user query log data extracted in S1 along multiple dimensions and build query pattern features of different channels for different travel routes, including:
A. Query volume index: statistics show that the distribution of query volume across channels is a typical long-tail distribution; taking air ticket query channels as an example, fewer than 10% of the channels account for more than 90% of the air ticket query volume; using the query volume index as a query pattern feature makes it possible to single out some inactive channels;
B. Comprehensive dispersion index: normal query behavior usually shows high query volume for departure dates that are close to the query date or that coincide with social events, and high query volume for routes that are popular or affected by events, whereas robots tend to spread their queries evenly over unrelated routes and departure dates;
Comprehensive dispersion index formula: [formula not reproduced in the source text]
This index expresses how uniformly the query behavior of a channel is distributed over the space of origin and destination and departure date; the closer the index is to 1, the more uniform the distribution of the channel's query behavior and the closer it is to data-scraping behavior;
C. Outlier degree index: the query behavior of normal users usually has a certain stability, so abnormal query behavior can be analyzed from the perspective of outliers; specifically, the outlier character of a channel can be analyzed along three dimensions: the route dimension, the history dimension, and the channel dimension; taking the route dimension as an example, if the query volume of a channel for a certain route on one day is clearly abnormal compared with its average query volume for other routes, the query behavior for that route is highly suspicious;
Index object: the query behavior of a channel for an origin and destination in a given hour;
Define $C_{i,j,k}$ as the number of queries made by the i-th channel on the j-th day for the k-th route;
Route-dimension outlier degree formula:
$$outlier\_OD = \frac{C_{i,j,k} - \frac{1}{N}\sum_{k'=1}^{N} C_{i,j,k'}}{\sqrt{\frac{1}{N}\sum_{k'=1}^{N}\left(C_{i,j,k'} - \frac{1}{N}\sum_{k''=1}^{N} C_{i,j,k''}\right)^{2}}}$$
where N is the total number of routes; the index expresses how far the query volume of the channel for a certain route on a certain day deviates from the average query volume of the overall sample; if the index is greater than 0, a larger absolute value means the sample query volume is further above the normal level; if the index is less than 0, a larger absolute value means the sample query volume is further below the normal level;
D. Behavior pattern index: the query waveform of normal travelers follows the human daily routine, whereas the query waveform of data-scraping robots is chaotic and irregular;
Index object: the query behavior of a channel for an O&D over the 24 hours of one day;
Define $behaviorCurve_{c,od,b}$ as the query volume of channel c for route od in hour b, and $standardCurve_{c,od,b}$ as the standard query volume of channel c for route od in hour b;
The behavior pattern index is defined as follows:
$$CosineSimilarity\left\langle \frac{standardCurve_{i} - standardCurve_{i-1}}{\frac{\sum_{i=1}^{24} standardCurve_{i}}{24}},\ \frac{behaviorCurve_{i} - behaviorCurve_{i-1}}{\frac{\sum_{i=1}^{24} behaviorCurve_{i}}{24}} \right\rangle$$
This index expresses the similarity between the sample's query pattern over the past 24 hours and the standard query pattern of normal users; its value range is [-1, 1], and the closer the value is to 1, the closer the query behavior is to the behavior of normal users;
E. Date dispersion index: the query volume produced by robot data scraping is usually distributed evenly over departure dates, whereas normal users concentrate on a few key departure dates;
Index object: the query behavior of a channel for an O&D in a given hour;
Define μ as the average query volume per departure date, and $H_h$ as the total query volume for the h-th departure date (leaveDate);
$$\mu = \frac{\sum_{h=Min(leaveDate)}^{Max(leaveDate)} H_h}{Max(leaveDate) - Min(leaveDate)}$$
Date dispersion index formula:
$$dispersion\_his = \frac{1}{\mu} \cdot \sqrt{\frac{1}{Max(leaveDate) - Min(leaveDate)} \cdot \sum_{h=Min(leaveDate)}^{Max(leaveDate)} \left(H_h - \mu\right)^{2}}$$
This index expresses how evenly the query volume of a channel for a route is distributed over departure dates; the smaller the index, the more uniform the distribution, and the more the channel's queries for that route resemble data-scraping behavior;
F. Product dispersion index:
Define a query product (O&D&leaveDate) as the combination of an O&D and a leaveDate; M is the total number of query products; ν is the average query volume per query product (O&D&leaveDate); $P_p$ is the total query volume of the p-th query product;
Let: [formula not reproduced in the source text]
Channel-dimension formula: [formula not reproduced in the source text]
This index expresses how evenly the query volume of a channel is distributed over the query products (O&D&startDate);
S3. Based on the query pattern features of different channels for different routes built in S2, use the k-means clustering method to cluster the query behavior of different channels for different routes and obtain the channel query pattern classification results.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711405012.2A CN107908800A (en) | 2017-12-22 | 2017-12-22 | Passenger traffic channel query pattern sorting technique based on user's inquiry log |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711405012.2A CN107908800A (en) | 2017-12-22 | 2017-12-22 | Passenger traffic channel query pattern sorting technique based on user's inquiry log |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107908800A true CN107908800A (en) | 2018-04-13 |
Family
ID=61869641
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711405012.2A Pending CN107908800A (en) | 2017-12-22 | 2017-12-22 | Passenger traffic channel query pattern sorting technique based on user's inquiry log |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107908800A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108717427A (en) * | 2018-05-05 | 2018-10-30 | 北京交通大学 | Passenger traffic demand index computational methods based on user's inquiry log |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7933884B2 (en) * | 2008-01-30 | 2011-04-26 | Yahoo! Inc. | Apparatus and methods for tracking, querying, and visualizing behavior targeting processes |
CN106777303A (en) * | 2016-12-30 | 2017-05-31 | 中国民航信息网络股份有限公司 | Passenger flight User behavior sorting technique and system |
CN106780273A (en) * | 2016-12-30 | 2017-05-31 | 中国民航信息网络股份有限公司 | Passenger flight requirement analysis method and system |
-
2017
- 2017-12-22 CN CN201711405012.2A patent/CN107908800A/en active Pending
Non-Patent Citations (1)
Title |
---|
靳国旺 等: "《雷达摄影测量》", 30 April 2015, 测绘出版社 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106210044B (en) | A kind of any active ues recognition methods based on access behavior | |
Smith | Method and purpose in functional town classification | |
CN100595780C (en) | Handwriting digital automatic identification method based on module neural network SN9701 rectangular array | |
EP2392120B1 (en) | Method and sensor network for attribute selection for an event recognition | |
KR20040101477A (en) | Viewing multi-dimensional data through hierarchical visualization | |
Pal et al. | An insight of World Health Organization (WHO) accident database by cluster analysis with self-organizing map (SOM) | |
CN109408557B (en) | Traffic accident cause analysis method based on multiple correspondences and K-means clustering | |
CN107180088A (en) | News based on Fuzzy C-Means Cluster Algorithm recommends method | |
US20230215272A1 (en) | Information processing method and apparatus, computer device and storage medium | |
CN102122353A (en) | Method for segmenting images by using increment dictionary learning and sparse representation | |
CN107180093A (en) | Information search method and device and ageing inquiry word recognition method and device | |
Thota et al. | Cluster based zoning of crime info | |
CN106933906A (en) | The querying method and device of data multidimensional degree | |
DE112021001926T5 (en) | SYSTEM AND METHOD FOR FILTERLESS THrottling OF VEHICLE EVENT DATA PROCESSING TO IDENTIFY PARKING AREAS | |
Fusco et al. | Hierarchical clustering through spatial interaction data. The case of commuting flows in South-Eastern France | |
Chang et al. | Classification and visualization of the social science network by the minimum span clustering method | |
CN107908800A (en) | Passenger traffic channel query pattern sorting technique based on user's inquiry log | |
CN101673305A (en) | Industry sorting method, industry sorting device and industry sorting server | |
CN109657123A (en) | A kind of food safety affair clustering method based on comentropy | |
Scholl et al. | Testing for clustering of industries-evidence from micro geographic data | |
CN107730717B (en) | A kind of suspicious card identification method of public transport based on feature extraction | |
CN110717089A (en) | User behavior analysis system and method based on weblog | |
Qiong et al. | Application of clustering algorithm in intelligent transportation data analysis | |
CN109446394A (en) | For network public-opinion event based on modular public sentiment monitoring method and system | |
CN110097126B (en) | Method for checking important personnel and house missing registration based on DBSCAN clustering algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20180413 |