CN110086874A - A kind of Expressway Service user classification method, system, equipment and medium - Google Patents

A kind of Expressway Service user classification method, system, equipment and medium


Publication number
CN110086874A
Authority
CN
China
Prior art keywords
user
mobile terminal
customer mobile
feature
access point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910361868.7A
Other languages
Chinese (zh)
Inventor
李勇
刘伟
赵凯
吴伟令
金德鹏
毕玉峰
韩国华
周鹏飞
许孝滨
冯美军
田冬军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Shandong Provincial Communications Planning and Design Institute Co Ltd
Original Assignee
Tsinghua University
Shandong Provincial Communications Planning and Design Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University and Shandong Provincial Communications Planning and Design Institute Co Ltd
Priority to CN201910361868.7A
Publication of CN110086874A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques
    • G06F18/232: Non-hierarchical techniques
    • G06F18/2321: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201: Market modelling; Market analysis; Collecting market data
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/50: Network services
    • H04L67/51: Discovery or management thereof, e.g. service location protocol [SLP] or web services
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/50: Network services
    • H04L67/55: Push-based network services
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W4/00: Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02: Services making use of location information
    • H04W4/021: Services related to particular areas, e.g. point of interest [POI] services, venue services or geofences
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W4/00: Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/50: Service provisioning or reconfiguring
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W4/00: Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/80: Services using short range communication, e.g. near-field communication [NFC], radio-frequency identification [RFID] or low energy communication

Abstract

The present disclosure provides an expressway service area user classification method, system, device and medium. The method comprises: acquiring WiFi connection data of users' mobile terminals in expressway service areas; extracting user features from the WiFi connection data; screening the user features based on the Pearson correlation coefficient; normalizing the features retained by the screening; and clustering the normalized features to obtain the user category corresponding to each user mobile terminal. The disclosure achieves accurate classification of the user types of expressway service area users and lays the groundwork for upgrading expressway services.

Description

A kind of Expressway Service user classification method, system, equipment and medium
Technical field
The present disclosure relates to the field of WiFi data mining and user profiling, and in particular to an expressway service area user classification method, system, device and medium.
Background art
The statements in this section merely provide background information related to the present disclosure and do not necessarily constitute prior art.
The inventors have found that the following technical problems in the prior art need to be solved:
An expressway service area is a place where passengers and drivers (hereinafter collectively referred to as users) stop to rest. It generally has multiple functional areas such as a parking lot, public toilets, a gas station, catering and a supermarket. To provide internet access while users rest, service areas today generally have full WiFi coverage. However, in the prior art the connection data of the WiFi hotspots is neither collected nor processed. How to classify expressway service area users from the WiFi hotspot connection data, and how to use the resulting user types to upgrade the services of expressway service areas, are technical problems that expressway service areas urgently need to solve.
Summary of the invention
To overcome the deficiencies of the prior art, the present disclosure provides an expressway service area user classification method, system, device and medium.
In a first aspect, the present disclosure provides an expressway service area user classification method.
An expressway service area user classification method, comprising:
acquiring WiFi connection data of users' mobile terminals in expressway service areas;
extracting user features from the WiFi connection data;
screening the user features based on the Pearson correlation coefficient;
normalizing the features retained by the feature screening;
clustering the normalized features to obtain the user category corresponding to each user mobile terminal.
In a second aspect, the present disclosure provides an expressway service area user classification system.
An expressway service area user classification system, comprising:
an acquisition module, configured to acquire WiFi connection data of users' mobile terminals in expressway service areas;
a feature extraction module, configured to extract user features from the WiFi connection data;
a feature screening module, configured to screen the user features based on the Pearson correlation coefficient;
a normalization module, configured to normalize the features retained by the feature screening;
a clustering module, configured to cluster the normalized features to obtain the user category corresponding to each user mobile terminal.
In a third aspect, the present disclosure further provides an electronic device, comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor; when the computer instructions are executed by the processor, the steps of the method of the first aspect are carried out.
In a fourth aspect, the present disclosure further provides a computer-readable storage medium for storing computer instructions; when the computer instructions are executed by a processor, the steps of the method of the first aspect are carried out.
Compared with the prior art, the beneficial effects of the present disclosure are as follows:
Because the WiFi connection data of users' mobile terminals in expressway service areas is acquired, the user types of expressway service area users can be classified from the WiFi connection data, which provides important guidance for the reconstruction, expansion and service upgrading of service areas.
Because user features are extracted from the WiFi connection data, the user types of an expressway service area can be accurately classified on the basis of those features.
Because the user features are screened based on the Pearson correlation coefficient, noise data can be removed, which improves the accuracy of the classification.
Because the normalized features are clustered to obtain the user category corresponding to each user mobile terminal, expressway users are classified by type through clustering.
While providing internet access to users, the WiFi equipment collects networking information about users that does not involve privacy, and this data reflects the operation of the service area to some extent. Users are the main service objects of a service area, and mining user-related information lets service area managers better understand users' characteristics and needs. By mining the user-related content in the WiFi connection data, users can be divided into different categories, for example staff who work in the service area all the time, long-distance drivers who stay overnight in the service area, and ordinary users who stay only briefly. The functional attributes of the service area can then be re-planned according to the proportion of each type of user. For example, a service area with a large proportion of long-distance drivers can provide more lodging rooms and better accommodation services, while a service area with more short-stay ordinary users can provide better shopping, catering and toilet services. Analyzing the types of people in a service area therefore helps clarify the functional attributes of the service area and provides important guidance for its reconstruction, expansion and service upgrading.
Brief description of the drawings
The accompanying drawings, which form a part of the present application, are provided for further understanding of the application. The illustrative embodiments of the application and their descriptions are used to explain the application and do not unduly limit it.
Fig. 1 is a flowchart of the method of Embodiment 1 of the present disclosure;
Fig. 2 is a module diagram of the system of Embodiment 2 of the present disclosure.
Detailed description of embodiments
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the application. Unless otherwise indicated, all technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art to which the application belongs.
It should be noted that the terminology used herein is for describing specific embodiments only and is not intended to limit the exemplary embodiments of the application. As used herein, unless the context clearly indicates otherwise, the singular form is intended to include the plural form as well. It should further be understood that the terms "comprising" and/or "including", when used in this specification, indicate the presence of features, steps, operations, devices, components and/or combinations thereof.
Embodiment 1: this embodiment provides an expressway service area user classification method.
As shown in Fig. 1, an expressway service area user classification method comprises:
acquiring WiFi connection data of users' mobile terminals in expressway service areas;
extracting user features from the WiFi connection data;
screening the user features based on the Pearson correlation coefficient;
normalizing the features retained by the feature screening;
clustering the normalized features to obtain the user category corresponding to each user mobile terminal.
In one or more embodiments, the acquired data includes: the user mobile terminal ID; the expressway service area ID; the ID of each WiFi access point (AP) that each user mobile terminal connects to in an expressway service area; the duration for which each user mobile terminal is connected to each WiFi AP; the number of times each user mobile terminal connects to each WiFi AP; the upload and download traffic of each user mobile terminal after it connects to a WiFi AP; and the names of the applications (APPs) each user mobile terminal uses after it connects to a WiFi AP.
It should be understood that a service area generally has multiple WiFi access points (APs) installed at multiple locations, such as the toilets, the restaurant and the reception room, for use by nearby users. The following data is recorded when a user connects to WiFi:
Table 1: Data recorded when a user connects to WiFi
A user moving around a service area may connect to different APs and therefore has multiple connection records; a specific sample is shown in Table 1.
The people who enter an expressway service area can be divided into staff, long-distance drivers who often stop in service areas to rest, users who occasionally enter a service area, and so on. These categories of users differ clearly in their activity habits in the service area, for example:
service area staff stay in the service area for a long time, have many records and have a large range of activity;
drivers of coaches and large trucks on the expressway stop frequently and regularly, but each stop is short;
users who are travelling or on a business trip may stop only a few times.
Their WiFi connection behavior reflects, to some extent, their activity in the service area. The following features differ clearly between user groups of different types and can be used to classify users.
In one or more embodiments, extracting user features from the WiFi connection data specifically includes:
extracting, for each user mobile terminal, the number of different expressway service areas in which it has made a WiFi connection.
It should be understood that on the same expressway, or on the expressways of the same region, there is generally a service area at least every 50 kilometers where users can stop and rest. Over a longer period of time, a user who is active in the region may stop in multiple service areas, whereas a service area employee, or a user who only occasionally passes through the region, generally stops in only one service area. The number of service areas a user has visited within a period of time therefore reflects, to some extent, whether the user is a passenger or driver who is frequently active in the region.
The service areas that a user visited during the period are represented by the set $S = \{s_1, s_2, s_3, \ldots\}$; the number of different service areas the user visited is $|S|$, the size of the set.
extracting the movement information entropy of a user mobile terminal within a given expressway service area.
It should be understood that a user's connection to a particular WiFi access point in a service area indicates that the user is probably at the location of that access point. For example, a connection to the WiFi access point in the toilets indicates that the user is probably in the toilets. The location of the WiFi access point can therefore be used to represent the user's location.
The movement information entropy $H(X)$ of a user mobile terminal is calculated as:
$$H(X) = -\sum_{i} P(x_i) \log P(x_i)$$
where $x_i$ denotes the i-th functional area in the service area and $P(x_i)$ denotes the frequency with which the user appears in functional area $x_i$, that is, the number of times the user appears in functional area $x_i$ divided by the total number of times the user appears in all functional areas of the current service area. The larger the entropy, the more varied the user's places of activity in the service area; the smaller the entropy, the more uniform they are. Users who generally stay in the service area for a long time, such as service area staff, will have a larger entropy.
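The entropy feature described above can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the disclosure: the function name `movement_entropy`, the input format (one functional-area label per connection record) and the use of the natural logarithm are assumptions, since the text does not state a logarithm base.

```python
import math
from collections import Counter

def movement_entropy(visited_areas):
    """Entropy of the functional areas in which one user appeared.

    visited_areas: list of functional-area labels, one per connection
    record in a single service area (hypothetical input format).
    """
    counts = Counter(visited_areas)
    total = sum(counts.values())
    # P(x_i) = appearances in area x_i / appearances in all areas
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# A user seen in only one area has zero entropy; a user split evenly
# between two areas has entropy log(2).
single = movement_entropy(["restaurant", "restaurant"])
spread = movement_entropy(["restaurant", "toilet"])
```

A user whose records are concentrated in one functional area scores near zero, while a user who moves between many areas scores higher, which matches the interpretation given in the text.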
extracting the total number of days on which each user mobile terminal connected to a WiFi AP across all expressway service areas.
It should be understood that the number of connection days is the number of days within a period of time on which the user connected to WiFi. The more connection days, the more likely the user is a service area employee or a driver who regularly drives on this section of expressway. For each user, the dates of the online times $t_1$ are collected into a set; the size of the set after de-duplication is the number of connection days.
extracting the total number of times each user mobile terminal connected to a WiFi AP across all expressway service areas.
It should be understood that the number of connections is the number of times a user connected to WiFi in any service area within a period of time. It reflects, to some extent, how long the user stays, and the length of stay differs significantly between user categories. It is obtained by directly counting the connection records of each user $u_i$ within the study period.
extracting the average duration of each user mobile terminal's connections to the different WiFi APs across all expressway service areas; the average duration equals the total duration of all the user mobile terminal's WiFi AP connections divided by the number of times the user connected to a WiFi AP.
It should be understood that the average connection duration is the average length of time of each of the user's WiFi connections. A user may stop in a service area only briefly, may stay for a long time, or may even stay overnight, which reflects whether the user is driving a long or a short distance, or is a member of the service area staff. For each record, the offline time $t_2$ minus the online time $t_1$ gives the duration of that connection, $\Delta t = t_2 - t_1$; averaging $\Delta t$ over the user's records gives the user's average connection duration.
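The three counting features above (connection days, connection count and average connection duration) can be sketched together. This is an illustrative sketch only: the function name `connection_stats` and the record format, a list of (online time, offline time) pairs for one user, are assumptions and do not come from the disclosure.

```python
from datetime import datetime

def connection_stats(records):
    """records: list of (online_time, offline_time) datetime pairs for one user
    (hypothetical format). Returns (connection days, connection count,
    average connection duration in seconds)."""
    days = {t1.date() for t1, _ in records}       # de-duplicated dates of t1
    count = len(records)                          # one record per connection
    durations = [(t2 - t1).total_seconds() for t1, t2 in records]
    avg_duration = sum(durations) / count if count else 0.0
    return len(days), count, avg_duration

records = [
    (datetime(2019, 4, 1, 8, 0), datetime(2019, 4, 1, 8, 30)),
    (datetime(2019, 4, 1, 12, 0), datetime(2019, 4, 1, 12, 10)),
    (datetime(2019, 4, 2, 9, 0), datetime(2019, 4, 2, 9, 20)),
]
n_days, n_connections, avg_seconds = connection_stats(records)
```

For the sample records, the user connected on 2 distinct days, made 3 connections, and the average connection lasted 1200 seconds.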
extracting the average daily number of connections of each user mobile terminal to the different WiFi APs within a given expressway service area; the average daily number of connections equals the total number of connections to all WiFi APs within a set period divided by the number of days in the set period.
It should be understood that service area staff, or users who stay in the service area for a long time, may experience network interruptions and handovers because of their movement within the service area or because of the network, whereas short-stay users often connect to the WiFi of only one area.
For the same user, the number of connections in the service area on each day is counted according to the date of the online time $t_1$ and the service area identifier $s$, and then averaged over all dates with records to obtain the user's average daily number of connections in that service area.
extracting the average number of connections of each user mobile terminal per service area; the average number of connections per service area equals the total number of connections of the user mobile terminal in all service areas divided by the number of service areas visited.
It should be understood that a driver who often drives on the expressway may stop in different service areas, whereas an ordinary passenger or a service area employee generally enters only one service area. For the same user, the number of connections in each service area $s_i$ is counted according to the service area identifier $s$ and then averaged over all service areas with records to obtain the user's average number of connections per service area.
extracting the mean stay time of each user mobile terminal; the mean stay time equals the sum of the user mobile terminal's daily stay times divided by the number of days on which it stayed; the daily stay time equals the latest offline time of the user mobile terminal in a service area on that day minus the earliest online time on that day.
It should be understood that a single connection record may end at an interruption during the user's connection and is therefore generally shorter than the actual time the user spends in the service area, whereas the span between the first online time and the last offline time within a day reflects how long the user stays in the service area that day. For each user, the earliest online time $t_s$ and the latest offline time $t_e$ of each day are obtained from the dates of the online time $t_1$ and the offline time $t_2$; the time span is $\Delta t = t_e - t_s$, and averaging $\Delta t$ over the days gives the mean stay time.
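The mean stay time can be sketched as below. The function name `mean_stay_seconds` and the record format are hypothetical. One further assumption is made for simplicity: each record is attributed to the calendar day of its online time, so a connection that crosses midnight counts toward the day on which it started.

```python
from datetime import datetime

def mean_stay_seconds(records):
    """records: list of (online_time, offline_time) datetime pairs for one user
    in one service area (hypothetical format)."""
    earliest, latest = {}, {}
    for t1, t2 in records:
        day = t1.date()  # assumption: group each record by the date of t1
        earliest[day] = min(earliest.get(day, t1), t1)   # t_s for that day
        latest[day] = max(latest.get(day, t2), t2)       # t_e for that day
    spans = [(latest[d] - earliest[d]).total_seconds() for d in earliest]
    return sum(spans) / len(spans) if spans else 0.0

records = [
    (datetime(2019, 4, 1, 8, 0), datetime(2019, 4, 1, 8, 30)),
    (datetime(2019, 4, 1, 12, 0), datetime(2019, 4, 1, 12, 10)),
    (datetime(2019, 4, 2, 9, 0), datetime(2019, 4, 2, 9, 20)),
]
stay = mean_stay_seconds(records)
```

In the sample, the first day spans 08:00 to 12:10 (15000 seconds) and the second day spans 09:00 to 09:20 (1200 seconds), so the mean stay time is 8100 seconds, noticeably longer than the average single-connection duration.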
In one or more embodiments, the specific steps of screening the user features based on the Pearson correlation coefficient are:
calculating the Pearson correlation coefficient between every pair of features obtained by the feature extraction;
if the Pearson correlation coefficient is greater than a set threshold, randomly deleting one of the two features and retaining the other;
if the Pearson correlation coefficient is less than or equal to the set threshold, retaining both features.
It should be understood that the Pearson correlation coefficient is:
$$r = \frac{\sum_{i}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i}(X_i - \bar{X})^2}\,\sqrt{\sum_{i}(Y_i - \bar{Y})^2}}$$
where $X_i$ and $Y_i$ denote the i-th sample values of the two features, and $\bar{X}$ and $\bar{Y}$ denote their means. Of two features with a high correlation coefficient, one is deleted.
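The screening rule above can be sketched as follows. The helper names `pearson` and `screen_features`, the dictionary input format, the threshold value of 0.9 and the fixed random seed are all assumptions made for illustration; the text only specifies a "set threshold" and a random choice between the two correlated features. As in the text, the coefficient itself (not its absolute value) is compared with the threshold.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def screen_features(features, threshold=0.9, seed=0):
    """features: dict mapping feature name -> list of values, one per user.
    Returns the names of the features that are kept."""
    rng = random.Random(seed)
    kept = list(features)
    i = 0
    while i < len(kept):
        j, removed_i = i + 1, False
        while j < len(kept):
            if pearson(features[kept[i]], features[kept[j]]) > threshold:
                drop = rng.choice([i, j])     # randomly delete one of the pair
                del kept[drop]
                if drop == i:
                    removed_i = True
                    break
                # kept[j] was removed; the next candidate is now at index j
            else:
                j += 1
        if not removed_i:
            i += 1
    return kept

features = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [4, 1, 3, 2]}
kept = screen_features(features)
```

In the sample, features `a` and `b` are perfectly correlated, so exactly one of them is dropped, while `c` is only weakly (negatively) correlated with both and is retained.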
In one or more embodiments, the features retained by the feature screening are normalized using min-max normalization.
It should be understood that the formula of min-max normalization is:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the feature over all samples. After normalization, the values of all features lie between 0 and 1.
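Min-max normalization of a single feature can be sketched in a few lines. The function name `min_max_normalize` is hypothetical, and mapping a constant feature to 0.0 is an assumed convention to avoid division by zero; the text does not cover that case.

```python
def min_max_normalize(values):
    """Scale one feature's values (one per user) into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: map every value to 0.0 (assumed convention).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_normalize([2, 4, 10])
```

Here the values 2, 4 and 10 map to 0.0, 0.25 and 1.0. Normalizing each feature separately keeps features with large raw ranges, such as durations in seconds, from dominating the distance computations used in clustering.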
In one or more embodiments, the specific steps of clustering the normalized features to obtain the user category corresponding to each user mobile terminal include: setting the number of clusters; inputting the normalized features of all user mobile terminals into the clustering algorithm, which outputs the clusters; and taking the user category corresponding to the mean point of each cluster as the user category of all user mobile terminals in that cluster.
In one or more embodiments, the clustering algorithm alternates between an assignment step and an update step:
Assignment step: each sample point is assigned to the mean point nearest to it;
Update step: for each cluster obtained in the assignment step, the center of all sample points in the cluster is taken as the new mean point.
According to the extracted user features, the n sample points are divided into k clusters such that the points within each cluster are close to one another and the clusters are far apart from one another, so that the samples in the same cluster have similar features.
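The alternating assignment and update steps correspond to the K-means procedure, which can be sketched as below. The function name `kmeans`, the random choice of k samples as initial mean points, the squared Euclidean distance, the iteration cap and the stopping rule (stop when assignments no longer change) are assumptions; the text does not specify them.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """points: list of equal-length tuples of normalized feature values."""
    rng = random.Random(seed)
    means = rng.sample(points, k)   # assumed initialization: k random samples
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample point goes to its nearest mean point.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, means[c])))
            for p in points
        ]
        if new_labels == labels:    # assignments unchanged: converged
            break
        labels = new_labels
        # Update step: each mean point becomes the center of its cluster.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                means[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, means

points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0)]
labels, means = kmeans(points, k=2)
```

With the four sample points, the two points near the origin end up in one cluster and the two points near (1, 1) in the other. The mean point of each cluster can then be inspected to decide which user category the cluster represents.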
In one or more embodiments, the number of clusters is determined using the elbow method or the silhouette coefficient method.
The elbow method: the sum of squared errors (SSE) of the sample point assignment is calculated with the following formula:
$$SSE = \sum_{i=1}^{k} \sum_{p \in C_i} \lVert p - m_i \rVert^2$$
where $C_i$ is the i-th cluster, $p$ is a sample point belonging to the i-th cluster, and $m_i$ is the centroid of the i-th cluster. SSE is thus the clustering error over all sample points and represents the quality of the clustering.
As the number of clusters k increases, the samples are divided more finely and the cohesion of each cluster increases, so SSE decreases. When k is less than the true number of clusters, increasing k greatly increases the cohesion of each cluster and SSE drops sharply; when k is greater than the true number of clusters, the drop in SSE becomes smaller.
The true number of clusters can therefore be judged from the drop in SSE: the point at which the drop changes most clearly from fast to slow is selected.
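The SSE used by the elbow method can be computed as in the sketch below. The function name `sse` and the hand-picked one-dimensional sample clusterings are illustrative assumptions; in practice the labels and centroids would come from running the clustering algorithm for each candidate k.

```python
def sse(points, labels, means):
    """Sum of squared distances from each point to its cluster's centroid."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, means[l]))
        for p, l in zip(points, labels)
    )

points = [(0.0,), (2.0,), (10.0,), (12.0,)]
sse_k1 = sse(points, [0, 0, 0, 0], [(6.0,)])                      # k = 1
sse_k2 = sse(points, [0, 0, 1, 1], [(1.0,), (11.0,)])             # k = 2
sse_k4 = sse(points, [0, 1, 2, 3], [(0.0,), (2.0,), (10.0,), (12.0,)])  # k = 4
```

For this sample the SSE falls from 104 at k = 1 to 4 at k = 2 and can only fall by a further 4 after that, so the elbow is at k = 2, the number of visibly separate groups.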
The silhouette coefficient method: the following two indices are calculated first:
Intra-cluster dissimilarity $a_i$: the average distance from sample i to the other samples in the same cluster; the smaller $a_i$ is, the more sample i belongs in this cluster.
Inter-cluster dissimilarity $b_i$: the average distance from sample i to all samples of another cluster $C_j$ is $b_{ij}$, and $b_i = \min\{b_{i1}, b_{i2}, \ldots\}$; the larger $b_i$ is, the less sample i belongs in any other cluster.
The silhouette coefficient of sample i is then:
$$s_i = \frac{b_i - a_i}{\max(a_i, b_i)}$$
The mean of $s_i$ over all samples is the silhouette coefficient of the clustering; the larger the value, the more reasonable the clustering.
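A sketch of the silhouette coefficient follows, using the standard per-sample definition $(b_i - a_i)/\max(a_i, b_i)$ averaged over all samples. The function name `silhouette`, the Euclidean distance and the score of 0 for a sample alone in its cluster are assumptions (the last is a common convention); the function expects at least two clusters.

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient; requires at least two clusters."""
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)

    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)      # assumed convention for singleton clusters
            continue
        # a_i: mean distance to the other samples of the same cluster
        # (the sum includes dist(p, p) = 0, hence the divisor len(own) - 1)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b_i: smallest mean distance to the samples of any other cluster
        b = min(sum(dist(p, q) for q in other) / len(other)
                for c, other in clusters.items() if c != l)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

points = [(0.0,), (2.0,), (10.0,), (12.0,)]
good = silhouette(points, [0, 0, 1, 1])   # natural grouping
bad = silhouette(points, [0, 1, 0, 1])    # interleaved grouping
```

The natural grouping scores higher than the interleaved one, so comparing the score across candidate values of k (or candidate labelings) indicates which clustering is more reasonable.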
Clustering yields k typical user clusters, and the WiFi connection behavior of the users in the different clusters is then analyzed. The mean point of each cluster represents the whole cluster. Using the features described above, clusters with the corresponding characteristics are matched manually to the corresponding groups of people. For example, users with many connection days, long daytime connection times and connection data in essentially only one service area are classified as service area staff.
Based on the categories obtained by clustering, the differences between the user groups in the features extracted in step 1 are compared. If a feature differs greatly between the user groups, the feature contributes clearly to the clustering; otherwise, the feature plays no role in this clustering. The features that have no effect are deleted and the clustering is repeated. This step is repeated until a satisfactory clustering and the corresponding feature distributions are obtained.
In one or more embodiments, the method further includes online behavior pattern analysis of the user groups. Different types of users also have different internet usage characteristics. The analysis uses the user categories obtained, together with the following items from the WiFi connection data: the number of times each user mobile terminal connects to each WiFi AP, the upload and download traffic of each user mobile terminal after it connects to a WiFi AP, and the names of the APPs each user mobile terminal uses after it connects to a WiFi AP. The online behavior of a given user group is analyzed along two dimensions, activity and diversity:
Activity considers how frequently, and with how much traffic, the different types of users use WiFi. If the frequency with which a user group uses WiFi is higher than a set threshold and its upload or download traffic is greater than a set threshold, then, when the service area network is configured, the corresponding network capacity is deployed according to the user type.
Diversity means that if the quantity of the APP names used by the mobile terminals of a certain type of user group in a service area after connecting to a WiFi AP exceeds a set threshold, this indicates that this type of user group in the service area prefers a certain kind of APP; advertisements related to that kind of APP are then placed in service areas where the proportion of that type of user group is greater than a set threshold.
The input of the online behavior pattern analysis is therefore the user-type labels obtained by clustering and the corresponding online behavior of the users (upload/download traffic and APP usage records); the output is whether the activity and diversity of the online behavior of each user category differ significantly.
Activity mainly considers how frequently users go online and how much traffic they use.
Frequency: the number of times a user uses APPs per unit time. Each record in the user internet records represents one use of an APP by the user. The number of APP uses per unit time is calculated for each user and then averaged separately over the users of each group to obtain the group's online frequency.
Traffic: based on the user categories obtained, hypothesis testing is used to analyze whether the different categories of users differ in the amount of internet traffic they use in the service area.
Hypotheses:
$$H_0: \mu_i = \mu, \qquad H_1: \mu_i \neq \mu$$
where $\mu_i$ denotes the mean upload (download) traffic of the i-th category of users and $\mu$ denotes the mean upload (download) traffic of all users. The null hypothesis states that the upload (download) traffic of the i-th category does not differ from that of the whole population; the alternative hypothesis states the opposite.
The hypotheses are tested with a t-test, and the significance of the difference is analyzed to obtain the test result.
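The text specifies only that a t-test is used. As one possible reading, the sketch below computes a one-sample t statistic that compares the traffic of one user category with the overall mean, treated as a fixed reference value. The function name `one_sample_t`, the one-sample form and the sample numbers are assumptions; converting the statistic to a p-value (for example with a t-distribution table) is left out.

```python
import math
import statistics

def one_sample_t(group_values, population_mean):
    """t statistic and degrees of freedom for H0: group mean == population mean."""
    n = len(group_values)
    mean = statistics.mean(group_values)
    sd = statistics.stdev(group_values)        # sample standard deviation
    t = (mean - population_mean) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical download traffic (in MB) of three users of one category,
# compared with a hypothetical overall mean of 10 MB.
t_stat, dof = one_sample_t([10, 12, 14], population_mean=10)
```

A large absolute value of the statistic relative to the critical value for the given degrees of freedom would lead to rejecting the null hypothesis that the category's traffic does not differ from the overall mean.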
Diversity considers the richness of the APP categories users use and the preferences of the different user groups for APPs.
APP category richness:
The richness of the APP categories users use is expressed by entropy. For any type of user, it is calculated as follows:
$$H(X) = -\sum_{i} P(x_i) \log P(x_i)$$
where $P(x_i)$ denotes the frequency with which all users of that type use APPs of category $x_i$, that is, the number of APPs of category $x_i$ used by all users of that type divided by the number of different APPs used by users of that type. The larger the entropy, the richer the APP categories used by that type of user.
APP category preference:
The APPs that users use online in the service area can be divided into categories such as video, games, office and finance. Whether the different categories of users differ significantly in their use of APPs is then analyzed; for example, drivers may use navigation APPs more, and service area staff may use office APPs more.
Hypotheses:
$$H_0: \mu_i = \mu, \qquad H_1: \mu_i \neq \mu$$
where $\mu_i$ denotes the mean frequency with which the i-th category of users uses a given category of APP and $\mu$ denotes the mean frequency with which all users use that category of APP. The null hypothesis states that the frequency with which the i-th category of users uses that category of APP does not differ from that of the whole population; the alternative hypothesis states the opposite.
The hypotheses are tested with a t-test, and the significance of the difference is analyzed to obtain the test result.
Analyzing users' online behavior provides data to support the construction of the service area network, and part of it can also be converted into offline demand. This step mainly analyzes the online characteristics of the different types of users.
The technical effect of this embodiment is that expressway service area users can be classified from the WiFi connection data of the service area, for example into service area staff, truck drivers on the expressway, bus drivers, tourists and people on business trips. User types can thus be identified, and the usable floor area of the corresponding functional areas of the service area (toilets, shops, dining, living, reception and fueling areas) can be planned accordingly, helping the service area improve its service efficiency.
Embodiment two: this embodiment provides a system for classifying users of an expressway service area.
As shown in Fig. 2, a system for classifying users of an expressway service area comprises:
An acquisition module, configured to acquire WiFi connection data of users' mobile terminals in an expressway service area;
A feature extraction module, configured to extract user features from the WiFi connection data;
A feature screening module, configured to screen the user features based on the Pearson correlation coefficient;
A normalization module, configured to normalize the features obtained by feature screening;
A clustering module, configured to cluster the normalized features to obtain the user category corresponding to each user mobile terminal.
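The screening, normalization, and clustering modules listed above can be sketched as one small pipeline. This is a hedged illustration, not the patent's implementation: min-max scaling and k-means are assumptions (the text here says only "normalization" and "clustering algorithm"), and all function names, the `columns` layout (feature name mapped to one value per user), and the default threshold of 0.9 are hypothetical. Following the wording of the claims, a pair of features is screened when its Pearson coefficient exceeds the threshold, and one of the two is deleted at random.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # constant feature: treated as uncorrelated (assumption)
    return cov / (sx * sy)


def find_correlated_pair(columns, names, threshold):
    """Return the first pair of features whose correlation exceeds threshold."""
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(columns[a], columns[b]) > threshold:
                return a, b
    return None


def screen_features(columns, threshold, rng):
    """Randomly delete one feature of each highly correlated pair."""
    kept = list(columns)
    pair = find_correlated_pair(columns, kept, threshold)
    while pair is not None:
        kept.remove(rng.choice(pair))
        pair = find_correlated_pair(columns, kept, threshold)
    return kept


def min_max(values):
    """Scale values to [0, 1]; min-max scaling is an assumed choice."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def kmeans(points, k, rng, iterations=100):
    """Plain k-means; returns (cluster label per point, cluster centers)."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def classify(columns, k, threshold=0.9, seed=0):
    """Screen features, normalize them, and cluster the users."""
    rng = random.Random(seed)
    kept = screen_features(columns, threshold, rng)
    scaled = [min_max(columns[name]) for name in kept]
    points = list(zip(*scaled))  # one feature vector per user
    labels, _ = kmeans(points, k, rng)
    return kept, labels
```

The cluster labels returned by `classify` are arbitrary integers; mapping each cluster to a named user category, such as staff or truck drivers, still requires inspecting the cluster centers, as the description indicates.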
Embodiment three:
The present disclosure further provides an electronic device comprising a memory, a processor, and computer instructions stored in the memory and run on the processor. When the computer instructions are run by the processor, each operation of the method is completed; for brevity, details are not repeated here.
The electronic device may be a mobile terminal or a non-mobile terminal. Non-mobile terminals include desktop computers; mobile terminals include mobile Internet devices capable of wireless communication, such as smartphones (for example, Android phones and iOS phones), smart glasses, smart watches, smart bracelets, tablet computers, laptops, and personal digital assistants.
It should be understood that, in the present disclosure, the processor may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory may include read-only memory and random access memory, and provides instructions and data to the processor. A part of the memory may also include non-volatile random access memory. For example, the memory may also store information on the device type.
In implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The steps of the method disclosed in connection with the present disclosure may be embodied directly as being executed by a hardware processor, or executed by a combination of hardware and software modules in the processor. The software module may be located in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the above method in combination with its hardware. To avoid repetition, this is not described in detail here. Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present application.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing describes only preferred embodiments of the present application and is not intended to limit the present application. For those skilled in the art, the present application may have various modifications and changes. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall fall within the scope of protection of the present application.

Claims (10)

1. A method for classifying users of an expressway service area, characterized by comprising:
Acquiring WiFi connection data of users' mobile terminals in an expressway service area;
Extracting user features from the WiFi connection data;
Screening the user features based on the Pearson correlation coefficient;
Normalizing the features obtained by feature screening;
Clustering the normalized features to obtain the user category corresponding to each user mobile terminal.
2. The method according to claim 1, characterized in that the following are acquired: the user mobile terminal ID; the expressway service area ID; the ID of each WiFi access point (AP) to which each user mobile terminal connects in the expressway service area; the duration for which each user mobile terminal is connected to each WiFi AP; the number of times each user mobile terminal connects to each WiFi AP; the upload and download traffic of each user mobile terminal after connecting to a WiFi AP; and the names of the applications (APPs) used by each user mobile terminal after connecting to a WiFi AP.
3. The method according to claim 1, characterized in that extracting user features from the WiFi connection data specifically comprises:
Extracting the number of distinct expressway service areas in which each user mobile terminal has made a WiFi connection;
Extracting the mobility information entropy of each user mobile terminal within a given expressway service area;
Extracting, over all expressway service areas, the total number of days on which each user mobile terminal connected to a WiFi AP;
Extracting, over all expressway service areas, the total number of times each user mobile terminal connected to a WiFi AP.
4. The method according to claim 3, characterized in that extracting user features from the WiFi connection data further comprises:
Extracting, over all expressway service areas, the average duration for which each user mobile terminal is connected to different WiFi APs; the average duration equals the total duration of the user mobile terminal's connections to all WiFi APs divided by the number of WiFi APs to which the user connected;
Extracting, for a given expressway service area, the average daily number of connections of each user mobile terminal to different WiFi APs; the average daily number of connections equals the total number of connections to all WiFi APs within a set time period divided by the number of days in that period;
Extracting the average number of connections of each user mobile terminal per service area; the average number of connections per service area equals the total number of connections made by the user mobile terminal in all service areas divided by the number of service areas visited;
Extracting the average residence time of each user mobile terminal; the average residence time equals the sum of the daily residence times of the user mobile terminal divided by the number of days of residence; the daily residence time equals the latest offline time of the user mobile terminal in a given service area on that day minus its earliest online time on that day.
5. The method according to claim 1, characterized in that the specific steps of screening the user features based on the Pearson correlation coefficient are:
Calculating the Pearson correlation coefficient between every two features among all the features obtained by feature extraction;
If the Pearson correlation coefficient is greater than a set threshold, deleting one of the two features at random and retaining the other;
If the Pearson correlation coefficient is less than or equal to the set threshold, retaining both features.
6. The method according to claim 1, characterized in that the specific steps of clustering the normalized features to obtain the user category corresponding to each user mobile terminal comprise: setting the number of clusters, inputting the normalized features of all user mobile terminals into a clustering algorithm, and outputting the clusters; the user category corresponding to the mean point of each cluster is taken as the user category of all user mobile terminals in that cluster.
7. The method according to claim 1, characterized in that the method further comprises online behavior pattern analysis of user groups: different types of users have different online characteristics; based on the obtained user categories and, from the WiFi connection data, the number of times each user mobile terminal connected to each WiFi AP, the upload and download traffic of each user mobile terminal after connecting to a WiFi AP, and the names of the applications (APPs) used by each user mobile terminal after connecting to a WiFi AP, the online behavior of a designated user group is analyzed along two dimensions, activity and diversity:
Activity considers the frequency and traffic volume with which different types of users use WiFi: if the frequency with which a user group uses WiFi is higher than a set threshold and its upload or download traffic is greater than a set threshold, the corresponding network capacity is deployed according to user type when the service area network is configured;
Diversity: if the number of APP names used by the mobile terminals of a certain type of user group after connecting to WiFi APs in a given service area exceeds a set threshold, this indicates that the user group of that type in the service area prefers a certain class of APP, and advertisements related to that class of APP are placed in service areas where the proportion of that type of user group is greater than a set threshold.
8. A system for classifying users of an expressway service area, characterized by comprising:
An acquisition module, configured to acquire WiFi connection data of users' mobile terminals in an expressway service area;
A feature extraction module, configured to extract user features from the WiFi connection data;
A feature screening module, configured to screen the user features based on the Pearson correlation coefficient;
A normalization module, configured to normalize the features obtained by feature screening;
A clustering module, configured to cluster the normalized features: setting the number of clusters, inputting the normalized features of all user mobile terminals into a clustering algorithm, and outputting the clusters; the user category corresponding to the mean point of each cluster is taken as the user category of all user mobile terminals in that cluster.
9. An electronic device, characterized by comprising a memory, a processor, and computer instructions stored in the memory and run on the processor, wherein when the computer instructions are run by the processor, the steps of the method according to any one of claims 1-7 are completed.
10. A computer-readable storage medium, characterized by being used to store computer instructions, wherein when the computer instructions are executed by a processor, the steps of the method according to any one of claims 1-7 are completed.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910361868.7A CN110086874A (en) 2019-04-30 2019-04-30 A kind of Expressway Service user classification method, system, equipment and medium

Publications (1)

Publication Number Publication Date
CN110086874A true CN110086874A (en) 2019-08-02

Family

ID=67418271

Country Status (1)

Country Link
CN (1) CN110086874A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110766208A (en) * 2019-10-09 2020-02-07 中电科新型智慧城市研究院有限公司 Government affair service demand prediction method based on social group behaviors
CN110809234A (en) * 2019-11-08 2020-02-18 浙江每日互动网络科技股份有限公司 Figure category identification method and terminal equipment
CN110807052A (en) * 2019-11-05 2020-02-18 佳都新太科技股份有限公司 User group classification method, device, equipment and storage medium
CN113255724A (en) * 2021-04-15 2021-08-13 国家计算机网络与信息安全管理中心 Method and device for identifying node type, computer storage medium and terminal

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090193017A1 (en) * 2008-01-29 2009-07-30 Sponsel Jr Kenneth Herrick Methods and Systems for Corporate Discovery, Investigation, and Implementation of Emerging Technology
CN103402177A (en) * 2013-08-02 2013-11-20 南京市海聚信息科技有限公司 Information pushing system of WiFi (wireless fidelity) terminal and implementation method thereof
CN104899303A (en) * 2015-06-10 2015-09-09 杭州祥声通讯股份有限公司 Cloud big data analysis system applied to rail transportation means
CN105868243A (en) * 2015-12-14 2016-08-17 乐视网信息技术(北京)股份有限公司 Information processing method and apparatus
CN106910092A (en) * 2017-02-28 2017-06-30 成都瑞小博科技有限公司 A kind of active marketing method and system based on business WIFI industry attributes
CN109508753A (en) * 2018-12-25 2019-03-22 中南大学 A kind of on-line prediction method of Mineral Floating Process index
CN109561391A (en) * 2019-01-23 2019-04-02 山东省交通规划设计院 Expressway Service stream of people's analysis method based on Cellular Networks and Wi-Fi data
CN109583777A (en) * 2018-12-05 2019-04-05 广东工业大学 A kind of financial product recommender system, method, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100084 Beijing City, Haidian District Tsinghua Yuan

Applicant after: TSINGHUA University

Applicant after: Shandong transportation planning and Design Institute Co.,Ltd.

Address before: 100084 Beijing City, Haidian District Tsinghua Yuan

Applicant before: TSINGHUA University

Applicant before: SHANDONG PROVINCIAL COMMUNICATIONS PLANNING AND DESIGN INSTITUTE

RJ01 Rejection of invention patent application after publication

Application publication date: 20190802
