CN110442662A - Method for determining user attribute information and information push method - Google Patents
- Publication number
- CN110442662A (application CN201910611619.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- region
- user
- attribute information
- node
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0251—Targeted advertisements
- G06Q30/0255—Targeted advertisements based on user history
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0277—Online advertisement
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/52—Network services specially adapted for the location of the user terminal
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/55—Push-based network services
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/02—Services making use of location information
- H04W4/021—Services related to particular areas, e.g. point of interest [POI] services, venue services or geofences
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/02—Services making use of location information
- H04W4/029—Location-based management or tracking services
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/12—Messaging; Mailboxes; Announcements
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Data Mining & Analysis (AREA)
- Signal Processing (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Finance (AREA)
- Strategic Management (AREA)
- Entrepreneurship & Innovation (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Economics (AREA)
- Game Theory and Decision Science (AREA)
- Remote Sensing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides a method for determining user attribute information based on regional POI data and personal spatio-temporal data. The method comprises: obtaining POI data from a map open platform by preset region, and preprocessing it; counting, based on the personal spatio-temporal data of mobile terminals, the number of people entering and leaving each region at different times, and clustering the counts; parsing the clustering results and the preprocessed POI data with the DMR topic model algorithm to determine regional functional attribute information; generating each user's daily trajectory data from the personal spatio-temporal data of mobile terminals, the trajectory data comprising a trip chain and the nodes on the trip chain; and determining user attribute information from the trip chain, each node, and the regional functional attribute information corresponding to each node. The invention makes it possible to determine user attributes accurately and to push targeted information according to those attributes.
Description
Technical field
The present invention relates to the field of information technology, and in particular to a method for determining user attribute information and an information push method.
Background art
Traditionally, POI data was collected by surveyors who measured the longitude and latitude of a point of interest with precision surveying instruments and then annotated it. To enrich the information, collected POI data also includes the name, category, classification, and a brief introduction of the point of interest, but this information is static. At present, besides map navigation, this information is also used to push corresponding POI information based on keywords entered by the user.
Most smartphones today carry a GPS receiver that can acquire the user's geographic location in real time, and many applications have been developed that recommend information based on the user's current location or a location the user enters.
Current Internet advertising, in China and abroad, relies mainly on matching keywords that the audience enters or browses, and the order in which advertisements are pushed is decided by bid ranking according to the price the advertiser pays.
The above push and recommendation methods share a defect: they cannot accurately judge the relevance and fit between the advertising information and the audience.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a technical solution that overcomes the above problems or at least partially solves them. One aspect of the present invention provides a method for determining user attribute information based on regional POI data and personal spatio-temporal data, the method comprising:
obtaining POI data from a map open platform by preset region, and preprocessing it;
counting, based on the personal spatio-temporal data of mobile terminals, the number of people entering and leaving each region at different times, and clustering the counts;
parsing the clustering results and the preprocessed POI data with the DMR topic model algorithm to determine regional functional attribute information;
generating each user's daily trajectory data from the personal spatio-temporal data of mobile terminals, the trajectory data comprising a trip chain and the nodes on the trip chain;
determining user attribute information from the trip chain, each node, and the regional functional attribute information corresponding to each node.
Optionally, the regional functional attribute information includes the division of the region into functional categories, the key POI types corresponding to each functional category, and/or the probability distribution over the functional categories.
Optionally, obtaining POI data from the map open platform by region comprises:
reading the region list;
traversing the region list and executing the following process for each region:
obtaining the longitude and latitude of the region;
creating a thread pool;
in separate threads, generating the request URL and crawling all types of POI data in each region.
Optionally, obtaining POI data from the map open platform by region further comprises:
creating a queue for transferring the crawled POI data of all types to the cloud for data preprocessing.
Optionally, preprocessing the acquired data on a cloud computing platform comprises: performing coordinate conversion on the crawled POI data of all types.
Optionally, clustering the counts comprises: the cloud computing platform creates an ODPS class for processing each user's trip trajectory data;
within that class, an index whose rows are a time series is generated;
data whose time lies after the start time and before the end time is selected, and a statistics array is created;
the serial number of each node and the total of each group are counted;
the region list file is read, and the passenger flow of each region, as an entered region and as a departed region respectively, is counted by multiple threads in task order.
Optionally, preprocessing the acquired data on the cloud computing platform further comprises:
creating an array for saving the clustering results;
clustering the user inflow and outflow of each node separately: weekday inflow clustering, weekend inflow clustering, weekday outflow clustering, and weekend outflow clustering.
Optionally, parsing the clustering results and the preprocessed POI data with the DMR topic model algorithm comprises:
organizing the POI data and the clustering file of each node into a file of a predetermined format;
loading the predetermined-format file;
running the DMR algorithm to obtain the division of each region into functional categories and the POI distribution corresponding to each functional category.
Optionally, determining user attribute information from the trip chain, each node, and the regional functional attribute information corresponding to each node comprises:
estimating the user's activity at each node with an HMM model, based on the spatio-temporal information of each user's daily trajectory data and the regional functional attribute information;
determining the user attribute information from the user's activity.
The present invention also provides a method for pushing information to a user using the user attribute information determined above, the method comprising:
according to the user attribute information, matching information and POI data within a certain range centered on each node of the trajectory, and pushing the matching result to the user; and/or
according to the user attribute information, matching relevant news, articles, or books, and pushing the matching result to the user.
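The first push option can be illustrated with a minimal sketch. It assumes that user attributes have already been reduced to a set of interesting POI categories and that "a certain range" is a fixed radius; the function names, the dictionary layout, and the 500 m default are all hypothetical and are not taken from the patent.

```python
import math

EARTH_RADIUS_M = 6371000.0

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def match_pois(nodes, pois, interests, radius_m=500.0):
    """Return names of POIs that are in an interesting category and lie
    within radius_m of at least one trajectory node."""
    matched = []
    for poi in pois:
        if poi["category"] not in interests:
            continue
        if any(haversine_m(lat, lon, poi["lat"], poi["lon"]) <= radius_m
               for lat, lon in nodes):
            matched.append(poi["name"])
    return matched

# Toy data: two trajectory nodes and three candidate POIs (all hypothetical).
nodes = [(39.9000, 116.4000), (39.9500, 116.4500)]
pois = [
    {"name": "cafe_a", "category": "dining", "lat": 39.9010, "lon": 116.4005},
    {"name": "gym_b", "category": "sports", "lat": 39.9005, "lon": 116.4002},
    {"name": "cafe_c", "category": "dining", "lat": 40.1000, "lon": 116.4000},
]
result = match_pois(nodes, pois, interests={"dining"})
```

Here only `cafe_a` is both in an interesting category and close to a node; a production system would likely use a spatial index rather than a linear scan.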
The technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages: with the present invention, regional functional attribute information can be determined within a small area from the clustering results of user trip trajectories, and implicit user behavior information can then be mined from the user's trip trajectory and the regional functional attribute information, so that the user's attribute information is determined automatically and targeted information push is achieved.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features, and advantages of the invention are more readily apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flow chart of the method proposed by the present invention for determining user attribute information based on regional POI data and personal spatio-temporal data;
Fig. 2 shows the process of computing regional passenger flow statistics from the acquired user spatio-temporal data;
Fig. 3 shows the basic data logic diagram of the DMR (Dirichlet-multinomial regression) algorithm;
Fig. 4 shows the results of analyzing map open platform POI data and regions with the Dirichlet-multinomial regression model;
Fig. 5 shows the HMM model diagram based on the trip chain.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.
As a city develops, different functional areas gradually form. Each functional area contains multiple POIs, and a region may have one functional attribute or several, as determined by the categories of its POIs. The functional attributes of a region in turn have an important influence on crowd mobility.
Because the functional attributes of a region are closely related to the attributes of the POIs around it and to crowd flow patterns, determining the functional attributes of a region mainly requires considering the POI data around the region and the crowd flow. The reasons are twofold:
(1) POI data: on the one hand, POI data represents the features around a bus station. For example, a region consisting mostly of schools probably belongs to the education sector. On the other hand, a region generally contains many kinds of POI and therefore has several functions rather than only one. For example, a region may be both a shopping district and an entertainment district.
(2) Crowd mobility: the function of a region is often closely related to the crowd that visits it. Crowd mobility helps reveal a region's function at two levels: when people arrive and when they leave, and where people come from and where they go. On weekdays, people usually leave residential areas in the morning and return in the evening, whereas on holidays or weekday evenings they mainly go to entertainment areas. The function of a region is also related to the reason the crowd moves. For example, people arriving at an entertainment area may have come from a work area after work on a weekday, or from a residential area on a holiday. Therefore, if people travel to two regions from functionally similar regions, or set out from similar places to reach them, the two regions are more likely to have similar functions.
The present invention attempts, on the basis of regional functional attributes and in combination with user trajectory data, to make personalized recommendations to users.
Recommendation using user trajectory data takes into account the spatio-temporal attributes of the trajectory, its time series, the sequential order of positions, and changes in movement state. Because trajectory data itself contains the position, speed, and direction of the moving object at successive times, this kind of recommendation differs from earlier recommendation based on user location. Earlier approaches mostly considered only the spatial attribute of a position; even when GPS data was available, they ignored the influence of the sequential order of spatial items on the recommendation. Their results were mostly presented as discrete points or point sets, for example a set of POIs representing the POIs the user might be interested in next, with no ordering among the POIs in the set that matches the individual user's sequence.
We define a user trajectory as a trip chain: the series of trips that a user makes within one day, using different means of transport, setting out from home in the morning and returning home in the evening. It describes the series of travel behaviors with different purposes that the user carries out in the city at different times. If the geographic locations are connected in chronological order, they form a complete polyline on the city map. A trip chain can take different forms, such as home → workplace → dinner restaurant → home, or home → workplace.
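As an illustration of this definition, the following sketch builds a trip chain from one user's location fixes for a single day by sorting them chronologically and collapsing consecutive fixes in the same place into one node. The record layout and the function name `build_trip_chain` are assumptions made for the example, not part of the patent.

```python
from datetime import datetime

def build_trip_chain(fixes):
    """Collapse one user's fixes for one day into an ordered list of nodes.

    fixes: iterable of (timestamp, region_id) pairs, in any order.
    Returns a list of (region_id, arrival_time) tuples.
    """
    chain = []
    for ts, region in sorted(fixes):
        # A new node starts only when the region differs from the previous one.
        if not chain or chain[-1][0] != region:
            chain.append((region, ts))
    return chain

fixes = [
    (datetime(2019, 7, 1, 18, 30), "home"),
    (datetime(2019, 7, 1, 7, 40), "home"),
    (datetime(2019, 7, 1, 8, 30), "workplace"),
    (datetime(2019, 7, 1, 12, 0), "workplace"),
    (datetime(2019, 7, 1, 17, 45), "restaurant"),
]
chain = build_trip_chain(fixes)
labels = [region for region, _ in chain]
```

For this toy input, `labels` is the chain home → workplace → restaurant → home, matching the first form mentioned in the text.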
Correspondingly, the user's activity at each node is defined as the underlying attribute of the trip chain.
We focus on the user's coordinates as they change over time, obtained from device GPS or mobile phone GPS positioning, where each device ID corresponds to one anonymous user. We mine each user's daily trip chain on public transport and treat travel behavior and the POI categories of the regional attributes as hidden state information, which is reorganized as model learning parameters for the regional dynamic transfer attributes in the next stage.
As shown in Fig. 1, the present invention proposes a method for determining user attribute information based on regional POI data and personal spatio-temporal data, the method comprising:
S1. obtaining POI data from a map open platform by preset region, and preprocessing it;
S2. counting, based on the personal spatio-temporal data of mobile terminals, the number of people entering and leaving each region at different times, and clustering the counts;
S3. parsing the clustering results and the preprocessed POI data with the DMR topic model algorithm to determine regional functional attribute information;
S4. generating each user's daily trajectory data from the personal spatio-temporal data of mobile terminals, the trajectory data comprising a trip chain and the nodes on the trip chain;
S5. determining user attribute information from the trip chain, each node, and the regional functional attribute information corresponding to each node.
In step S1, the map open platform is open to the public, and the present invention obtains POI data by web crawling.
The above method is carried out by a cloud platform (a server or server cluster) executing a computer program; this platform may be called a hyperlocal platform.
As a preferred embodiment, a preset region (for example 500 m × 500 m) is taken as one data unit, and all types of POI data within a 500 m radius centered on a given region are crawled. POI data is itself classified into first-level and second-level classes, and each class has a corresponding industry code and name, which makes the collected information easy to record and distinguish. The crawl level must be first-level or second-level, and the information to be crawled forms the crawl request statement.
The POI data is crawled as follows: read the region list; traverse the region list and, for each region, obtain the region's longitude and latitude; create a thread pool; in separate threads, generate the request URL, crawl the first-level POI data within a certain area centered on the region, and crawl the second-level POI data within a certain area centered on the region.
Taking the Baidu map open platform as an example, all coordinates in Baidu Maps are Baidu coordinates and need to be converted by calling the Baidu coordinate conversion API. The converted data is then parsed, and the first-level and second-level POI data of each region is crawled region by region according to the region list.
The key code for the specific implementation is as follows:
During crawling, a queue is created to transfer the crawled POI data of all types to the cloud for data parsing.
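The crawl procedure described above (region list, thread pool, per-thread URL generation, result queue) can be sketched as follows. This is an illustrative outline only, not the patent's own listing: the endpoint `BASE_URL`, the query parameter names, the level labels, and the injected `fetch` function are all hypothetical, and a single shared thread pool is used for simplicity. The stand-in `fake_fetch` lets the sketch run without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue
from urllib.parse import urlencode

BASE_URL = "https://example.invalid/place/search"  # hypothetical endpoint, not a real API
POI_LEVELS = ("level1", "level2")                  # hypothetical labels for the two POI levels

def build_url(lon, lat, level, radius_m=500):
    """Build the request URL for one region centre and one POI level."""
    query = urlencode({"location": f"{lat},{lon}", "radius": radius_m, "level": level})
    return f"{BASE_URL}?{query}"

def crawl_regions(regions, fetch, max_workers=4):
    """Crawl every (region, level) pair in a thread pool and queue the results."""
    out = Queue()

    def task(region, level):
        url = build_url(region["lon"], region["lat"], level)
        out.put((region["id"], level, fetch(url)))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task, r, lv) for r in regions for lv in POI_LEVELS]
        for f in futures:
            f.result()  # re-raise any exception from a worker thread
    return out

def fake_fetch(url):
    """Stand-in for an HTTP GET so the sketch runs offline."""
    return {"url": url, "pois": []}

regions = [
    {"id": "r1", "lon": 116.40, "lat": 39.90},
    {"id": "r2", "lon": 116.41, "lat": 39.91},
]
results = crawl_regions(regions, fake_fetch)
```

The returned queue holds one entry per region and level; in the described system a consumer would drain it and hand the data to the cloud platform for parsing.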
In the present invention, the data is preferably preprocessed on the Alibaba Cloud MaxCompute cloud computing platform into a form that the DMR (Dirichlet-multinomial regression) algorithm can use directly. A regional functional attribute analysis model (Discovers Stations of different Function, DSoF) can then be built to complete the analysis of the functional attributes of each region. The functional attribute information includes the division of the region into functional categories, the key POI types corresponding to each functional category, and/or the probability distribution over the functions.
In step S2, the number of people entering and leaving each region at different times is counted from the personal spatio-temporal data of mobile terminals, and the counts are clustered. User trip trajectories are counted and group trip information is determined by region. Because each region may have about 5.5 million personal trajectory records per day and the system needs to process one month of data, as a preferred embodiment the present invention uses Alibaba's big data computing service MaxCompute, also known as ODPS (Open Data Processing Service), a fast, fully managed TB/PB-scale data warehouse solution. MaxCompute provides users with complete data import solutions and a variety of distributed computing models, so that massive data computing problems can be solved faster, computing costs are reduced, and data security is ensured.
As shown in Fig. 2, step S2 includes the following specific steps:
S21. the cloud computing platform creates an ODPS class;
S22. within that class, an index whose rows are a time series is generated;
S23. data whose time lies after the start time and before the end time is selected, and a statistics array is created;
S24. the region serial numbers and the total of each group are counted;
S25. the region list file is read, and the passenger flow of each region, as an entered region and as a departed region respectively, is counted by multiple threads in task order.
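The counting logic of steps S23 to S25 can be illustrated on a single machine with the standard library. The sketch below is an assumption-laden simplification: it runs in one thread rather than on MaxCompute, buckets flows by hour of day, and attributes each transition to the timestamp of the arrival record. The record layout and the name `count_flows` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def count_flows(records, start, end):
    """Count hourly inflow and outflow per region from user location records.

    records: iterable of (user_id, timestamp, region_id).
    Only records with start < timestamp < end are used.
    Returns (inflow, outflow), each mapping (region_id, hour) -> count.
    """
    inflow = defaultdict(int)
    outflow = defaultdict(int)
    last_region = {}
    selected = [r for r in records if start < r[1] < end]
    # Process each user's records in chronological order.
    for user, ts, region in sorted(selected, key=lambda r: (r[0], r[1])):
        prev = last_region.get(user)
        if prev is not None and prev != region:
            outflow[(prev, ts.hour)] += 1   # the user left the previous region
            inflow[(region, ts.hour)] += 1  # and entered the current one
        last_region[user] = region
    return inflow, outflow

records = [
    ("u1", datetime(2019, 7, 1, 7, 50), "A"),
    ("u1", datetime(2019, 7, 1, 8, 20), "B"),
    ("u2", datetime(2019, 7, 1, 8, 5), "A"),
    ("u2", datetime(2019, 7, 1, 8, 40), "B"),
    ("u1", datetime(2019, 7, 1, 18, 10), "A"),
]
inflow, outflow = count_flows(records, datetime(2019, 7, 1), datetime(2019, 7, 2))
```

In this toy data two users move from region A to region B during the 08:00 hour, and one returns to A during the 18:00 hour.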
The specific implementation of the passenger flow statistics for each region is as follows:
The above completes the calculation of user inflow and outflow for each region. Using the MaxCompute platform, computing one month of data takes about five days.
With the regional passenger flow statistics, that is, the statistics of crowd transfer patterns, complete, these patterns must next be clustered, and the resulting classes are used as feature data in the DMR model.
Clustering is divided into weekday outflow, weekday inflow, weekend outflow, and weekend inflow: the user inflow and outflow of each node are clustered separately for each of these four cases.
In the clustering process, step S2 further includes:
creating an array for saving the clustering results;
clustering the user inflow and outflow of each node separately: weekday inflow clustering, weekend inflow clustering, weekday outflow clustering, and weekend outflow clustering.
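The text does not name the clustering algorithm, so the following sketch assumes plain k-means purely for illustration. Each region is represented by a short flow profile (here three time-of-day buckets, a made-up simplification), and the four data sets are clustered independently, with the results stored per data set.

```python
def kmeans(vectors, k, iterations=20):
    """Plain k-means using the first k vectors as initial centroids.

    Returns one cluster label per input vector.
    """
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centroids]
            labels[i] = dists.index(min(dists))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy flow profiles per region: counts for (morning, midday, evening).
flows = {
    "weekday_inflow": [[90, 20, 10], [10, 15, 80], [85, 25, 5], [5, 10, 95]],
    "weekday_outflow": [[10, 20, 90], [80, 15, 10], [5, 25, 85], [95, 10, 5]],
    "weekend_inflow": [[20, 60, 30], [25, 55, 35], [5, 10, 70], [10, 5, 75]],
    "weekend_outflow": [[30, 60, 20], [35, 55, 25], [70, 10, 5], [75, 5, 10]],
}
# One list of cluster labels per data set, saved for later use as region features.
cluster_result = {name: kmeans(vecs, k=2) for name, vecs in flows.items()}
```

The cluster label of each region in each of the four data sets can then serve as the categorical feature attached to that region in the topic model.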
The key code for clustering the trip pattern data is as follows:
The statistics fall into four parts: the flow of people into a region on weekdays, out of a region on weekdays, out of a region at weekends, and into a region at weekends. These four kinds of data are used to cluster weekdays and weekends separately.
The above process prepares the data for the DMR model analysis in the next step.
The DMR model treats a region as a document, POI data (such as restaurants and shopping malls) as the words in the document, and crowd flow patterns as metadata (such as author and keyword information); the result is the distribution of functional intensity in a region.
In step S3, the clustering results and the preprocessed POI data are parsed with the DMR topic model algorithm to determine the regional functional attribute information, which specifically includes:
organizing the POI data and the clustering file of each region into a file of a predetermined format;
loading the predetermined-format file;
running the DMR algorithm to obtain the division of each region into functional categories and the POI distribution corresponding to each functional category.
The specific parsing process is as follows:
The feature data in the key program above refers to the different classes obtained after clustering.
In the DMR model, a region is regarded as an article and the functional attributes of the region as the different topics of the article; a region with several functional attributes is like an article covering several topics. Specifically, the POI data around a region can be regarded as the words that make up the article, the clustering result of the passenger flow vectors is analogous to the article's characteristic attributes such as keywords and author information, and the full set of POI categories can be regarded as the vocabulary of the corpus.
The DMR-based topic model can flexibly add to the model features that influence the topics of a text. Compared with other models that incorporate specific features, such as the Author-Topic model and the Topic-Over-Time model (a member of the supervised-LDA model family), DMR offers more flexible feature selection and simpler, more efficient computation.
The logical organization of the data in the DMR model is shown in Fig. 3, where N is a Gaussian distribution with hyperparameter $\sigma$, $\lambda_k$ is a vector of the same length as the passenger flow clustering result, and the category of the $n$-th POI observed in region $r$ is denoted $m_{r,n}$. The other symbols have the same meaning as in the LDA model mentioned above, and the DMR model can likewise be computed with the EM algorithm and Gibbs sampling. Unlike in the LDA model, here the Dirichlet parameter $\alpha_r$ differs from region to region according to the passenger flow clustering result [33]. The regional topic distribution computed by the DMR model is therefore determined jointly by the POI attributes and the crowd movement patterns, and the model ultimately yields the division of each region into functional categories and the POI category distribution corresponding to each functional category.
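For reference, the generative process of the standard Dirichlet-multinomial regression topic model can be written out with the symbols used above. This is the textbook formulation and is only assumed to match the variant used here; $x_r$ (the feature vector encoding the passenger flow clustering result of region $r$), $z_{r,n}$ (the topic assignment of the $n$-th POI), $\theta_r$, and $\phi_k$ are symbols introduced for this illustration.

```latex
\begin{aligned}
\lambda_k &\sim \mathcal{N}(0, \sigma^2 I), \\
\alpha_{r,k} &= \exp\!\left(x_r^{\top} \lambda_k\right), \\
\theta_r &\sim \mathrm{Dirichlet}(\alpha_r), \\
z_{r,n} &\sim \mathrm{Multinomial}(\theta_r), \\
m_{r,n} &\sim \mathrm{Multinomial}(\phi_{z_{r,n}}).
\end{aligned}
```

The second line is what makes $\alpha_r$ region-specific: two regions with different clustering features $x_r$ receive different Dirichlet priors over functions, even if their POI mixes are similar.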
The results of the DMR analysis are shown in Fig. 4. The first row of data gives, for each class of functional attribute, the proportion of the functional attribute with the largest share in that class. The different functional attributes and their corresponding regions can be distinguished on the map by different colors. Below that, the keywords and word frequencies of the specific functions in each class are listed in descending order. Combined with the traffic characteristics of residents' average travel times on weekdays and rest days, the algorithm's results can be interpreted as follows:
Topic1: enterprises and industrial parks; Topic2: urban neighborhoods; Topic3: scientific research and education; Topic4: downtown commercial district; Topic5: scenic spots; Topic6: urban office district; Topic7: transport hubs and public institutions.
With the present invention, topic analysis can be carried out region by region across a city to determine the different functional attributes of each region and the main POIs under each attribute, providing the data basis for the analysis of user behavior in the next step.
Hidden Markov models (HMM) are commonly used in research on time-series prediction, especially in speech recognition, where the observed states are phoneme signals and the hidden states are the corresponding vowels; a statistical model learned from training data through probabilistic state transitions is used to recognize continuous time-series signals. The present invention proposes that the trip pattern of a mobile user is approximately a hidden Markov model and is well suited to HMM modeling. Comparing against the model's definitions of observation state and hidden state, what we observe is the user's trip chain within one day and the observation state sequence (location states) of its connecting nodes; the user behavior that causes those states, that is, what the user actually did at each connecting node of the trip chain, is unknown and forms the model's hidden state sequence. We have seen similar HMM models built on mobile phone signaling data and taxi data to predict the correspondence between user behavior patterns and trip patterns, with good prediction results for different classes of user groups. What we care about more is the correlation, in the time series of the HMM's random state transitions, between the transitions of the observable states at connecting nodes and the transitions of the hidden states, that is, the influence of the time series and behavior sequence of urban residents' trip chains on the region.
Based on the above findings, we propose an HMM model based on trip chains (Trip-Chains), as shown in Figure 5:
Hidden states: S = {r1, r2, r3, r4}, N = 4; observation states: O = {c1, c2, c3, c4, c5, c6}, M = 6; A is the state transition probability and B is the observation probability.
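As an illustration only, the parameter set described above (N = 4 hidden states, M = 6 observation states, transition probabilities A, observation probabilities B) could be held in a small container such as the following sketch. The class name `TripChainHMM`, the initial distribution `pi`, and the uniform starting values are assumptions made for the example and are not taken from the patent.

```python
from dataclasses import dataclass

HIDDEN_STATES = ["r1", "r2", "r3", "r4"]              # S, N = 4
OBSERVATIONS = ["c1", "c2", "c3", "c4", "c5", "c6"]   # O, M = 6


@dataclass
class TripChainHMM:
    """Hypothetical container for the HMM parameters (A, B, pi)."""
    A: list   # N x N state transition probabilities
    B: list   # N x M observation probabilities
    pi: list  # length-N initial state distribution (assumed, not named in the text)

    def validate(self, tol=1e-9):
        """Check the shapes and that every probability row sums to 1."""
        n, m = len(HIDDEN_STATES), len(OBSERVATIONS)
        if len(self.A) != n or any(len(row) != n for row in self.A):
            raise ValueError("A must be N x N")
        if len(self.B) != n or any(len(row) != m for row in self.B):
            raise ValueError("B must be N x M")
        if len(self.pi) != n:
            raise ValueError("pi must have length N")
        for row in self.A + self.B + [self.pi]:
            if abs(sum(row) - 1.0) > tol:
                raise ValueError("each probability row must sum to 1")
        return True


def uniform_model():
    """Uninformative starting point: every transition and observation equally likely."""
    n, m = len(HIDDEN_STATES), len(OBSERVATIONS)
    return TripChainHMM(
        A=[[1.0 / n] * n for _ in range(n)],
        B=[[1.0 / m] * m for _ in range(n)],
        pi=[1.0 / n] * n,
    )
```

In practice the uniform values would be replaced by probabilities estimated from trajectory data; the validation step only guards against malformed inputs.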
We grid the city, dividing it into regions of 500 m × 500 m, which simplifies the modeling to a small-area problem.
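The 500 m × 500 m gridding can be sketched as a mapping from a GPS coordinate to a cell index. The sketch below assumes a flat-earth (equirectangular) approximation around a chosen grid origin and a rough figure of 111,320 m per degree of latitude; the function name `to_grid_cell`, the origin parameters, and the (row, col) convention are hypothetical and may differ from the node numbering used in Figure 5.

```python
import math

CELL_METERS = 500.0
METERS_PER_DEG_LAT = 111_320.0  # rough average value; an assumption of this sketch


def to_grid_cell(lat, lon, origin_lat, origin_lon, cell=CELL_METERS):
    """Return the (row, col) index of the grid cell that contains (lat, lon).

    Rows count cells northward and columns count cells eastward from the origin.
    """
    meters_per_deg_lon = METERS_PER_DEG_LAT * math.cos(math.radians(origin_lat))
    north = (lat - origin_lat) * METERS_PER_DEG_LAT
    east = (lon - origin_lon) * meters_per_deg_lon
    return (math.floor(north / cell), math.floor(east / cell))
```

The approximation is adequate for a single city; a production system would more likely use a projected coordinate system.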
Problem modeling
To describe travel trajectories, HMM grid modeling is performed. As shown in Figure 5, a city resident moves between regions (along grid edges), and one or more edge segments form a trip chain (Trip-Chain). For example, at time t a resident is located at the diamond node (1, 2); at the next time t+1 the resident can be at any other node on the polyline, namely (1, 0), (1, 1) or (1, 3), or at node (2, 3) or (2, 4). The resident moves between the nodes of different polylines according to the transition probabilities. If at time t the resident is located at a node through which several different polylines pass, the resident can move to a node on any of those polylines at time t+1.
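The movement rule in the example can be sketched as follows. The two polylines are an assumed reading of Figure 5 (one through (1, 0), (1, 1), (1, 2), (1, 3) and one through (1, 2), (2, 3), (2, 4)); the helper names and the dictionary form of the transition probabilities are hypothetical.

```python
import random

# Assumed layout: node (1, 2) lies on both polylines.
POLYLINES = [
    [(1, 0), (1, 1), (1, 2), (1, 3)],
    [(1, 2), (2, 3), (2, 4)],
]


def reachable_nodes(node, polylines):
    """All other nodes on any polyline that passes through `node`."""
    out = []
    for line in polylines:
        if node in line:
            for other in line:
                if other != node and other not in out:
                    out.append(other)
    return out


def next_node(node, polylines, transition, rng=random):
    """Sample the node at time t+1 from the transition probabilities.

    `transition` maps (from_node, to_node) pairs to probabilities.
    """
    candidates = reachable_nodes(node, polylines)
    weights = [transition.get((node, c), 0.0) for c in candidates]
    if not candidates or sum(weights) == 0:
        return node  # no admissible move: stay in place (an assumption)
    return rng.choices(candidates, weights=weights, k=1)[0]
```

For the resident at (1, 2), `reachable_nodes` returns the five nodes listed in the example above.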
Determining the user attribute information according to the trip chain, each node, and the regional function attribute information corresponding to each node comprises:
using the HMM model to estimate the user's activity at each node from the spatio-temporal information of each user's daily trajectory data and the regional function attribute information;
determining the user attribute information according to the user's activities.
As shown in Figure 5, in this model $n_t$ is defined as the observation state variable, i.e. the node state variable in the grid, and $a_t$ is defined as the hidden state variable, i.e. the activity state variable in the grid.
Transition probability: $a = \{a_{ij}\} = \{P(n_{t+1} = (1,2) \mid n_t = (0,2))\}$
Initial probability: $\pi = \{\pi_i\} = \{P(n_1 = (2,2))\}$
Emission probabilities:
$\gamma_t(i) = P(a_t = \mathrm{tag1} \mid n_t = (0,2))$
$\gamma_t(i) = P(a_t = \mathrm{tag2} \mid n_t = (0,2))$
$\gamma_t(i) = P(a_t = \mathrm{tag3} \mid n_t = (0,2))$
According to the definition of the forward-backward algorithm, these probabilities can readily be expressed in terms of the forward and backward variables.
Given the observation sequence $n$ and the hidden Markov model $\lambda$, the probability of being in hidden state $a_i$ at time $t$ and in hidden state $a_j$ at time $t+1$ is defined as:
$\xi_t(i, j) = P(q_t = a_i, q_{t+1} = a_j \mid n, \lambda)$
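For reference, the textbook forward-backward recursion computes both quantities from the forward variable α and the backward variable β: the state posterior is γ_t(i) = α_t(i)β_t(i) / P(n | λ), and the pairwise posterior is ξ_t(i, j) = α_t(i) A_ij B_j(n_{t+1}) β_{t+1}(j) / P(n | λ). The sketch below is the standard algorithm, not necessarily the exact variant used in the patent. It assumes that observations are integer indices of grid nodes and hidden states are integer indices of activities, and it omits the scaling that long sequences would need to avoid numerical underflow.

```python
def forward_backward(obs, A, B, pi):
    """Return (gamma, xi) for an observation index sequence `obs`.

    A[i][j]: hidden-state transition probability, B[i][k]: probability of
    observing symbol k in hidden state i, pi[i]: initial distribution.
    """
    T, N = len(obs), len(pi)
    alpha = [[0.0] * N for _ in range(T)]
    beta = [[1.0] * N for _ in range(T)]

    # Forward pass: alpha[t][i] = P(obs[0..t], state_t = i)
    for i in range(N):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(N):
            alpha[t][j] = B[j][obs[t]] * sum(alpha[t - 1][i] * A[i][j] for i in range(N))

    # Backward pass: beta[t][i] = P(obs[t+1..T-1] | state_t = i)
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N))

    likelihood = sum(alpha[T - 1])
    if likelihood == 0.0:
        raise ValueError("observation sequence has zero probability under the model")

    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(N)] for t in range(T)]
    xi = [
        [
            [alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
             for j in range(N)]
            for i in range(N)
        ]
        for t in range(T - 1)
    ]
    return gamma, xi
```

A useful consistency check is that summing ξ_t(i, j) over j recovers γ_t(i) for every t before the last time step.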
As the user's position shifts between different locations at different times, the user's likely future positions and behaviors can be predicted from the current position, and the pushed content can be adjusted accordingly.
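A one-step-ahead prediction of this kind can be sketched by propagating the current belief over hidden activities through the transition probabilities and then through the observation probabilities. The function name `predict_next` and the list-based representation are assumptions; how the current belief is obtained (for example from the last γ of a forward-backward pass) is outside this sketch.

```python
def predict_next(current_belief, A, B):
    """Predict the distributions at time t+1 from the belief at time t.

    current_belief[i]: probability that the user is in hidden state i now.
    Returns (next_hidden, next_obs): distributions over hidden states and
    over observable nodes at the next time step.
    """
    n_states, n_obs = len(A), len(B[0])
    next_hidden = [
        sum(current_belief[i] * A[i][j] for i in range(n_states))
        for j in range(n_states)
    ]
    next_obs = [
        sum(next_hidden[j] * B[j][k] for j in range(n_states))
        for k in range(n_obs)
    ]
    return next_hidden, next_obs
```

The most probable next node is then the index of the largest entry of `next_obs`, which could drive the adjustment of pushed content.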
In step S5, determining the user attribute information according to the trip chain, each node, and the regional function attribute information corresponding to each node comprises:
a) using the HMM model to estimate the user's activity at each node from the spatio-temporal information of each user's daily trajectory data and the regional function attribute information;
b) determining the user attribute information according to the user's activities.
User attributes, which may also be called audience attributes, are determined from the spatio-temporal information and behavior of the user's trajectory. Specifically, users are classified and labeled according to the places they visit, their personal consumption behavior, and the living habits characterized by their time and location records, so that a user's audience attributes can be labeled with categories along multiple dimensions. This specifically includes: querying, from the GPS coordinates of the user's location, the attributes of that location, including the POI name, the region it belongs to, and the functional category of the location; and determining the user attributes (audience attributes) from the functional attributes of the locations the user occupies over a period of time.
In one specific embodiment, certain users frequently appear at an airport transfer terminal: some appear once a week, while business people or frequent flyers appear twice a week. Based on the device location when they use their mobile phones and on the time and frequency of their appearances at the transfer terminal, the hyperlocal platform defines their user attributes as, for example, business-travel white-collar worker or organization staff member. In this way, even after the trip ends and the user leaves the airport, the hyperlocal platform can continue to follow the user and push matching merchant marketing content and business-travel marketing content according to these attributes. User attributes can therefore be determined from the attribute information of the regions in which the acquired user positions lie and from the frequency of user activity.
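The frequency-based labeling in the airport example can be sketched as a simple threshold rule. The label strings, the threshold of one visit per week, and the function names are hypothetical; the patent does not specify how the time and frequency of appearances are turned into a label.

```python
from datetime import datetime


def weekly_visit_rate(visits, function_label):
    """Average number of visits per observed ISO week to regions with a given function.

    `visits` is a list of (timestamp, regional_function_label) pairs.
    """
    weeks_seen = set()
    hits = 0
    for ts, label in visits:
        iso = ts.isocalendar()
        weeks_seen.add((iso[0], iso[1]))  # (ISO year, ISO week)
        if label == function_label:
            hits += 1
    return hits / len(weeks_seen) if weeks_seen else 0.0


def tag_user(visits, function_label="transport hub", min_rate=1.0,
             tag="business traveler"):
    """Attach `tag` when the user visits the given kind of region often enough."""
    if weekly_visit_rate(visits, function_label) >= min_rate:
        return {tag}
    return set()
```

A user seen at a transport hub on two consecutive Mondays would receive the tag under these assumed settings, while a user seen only in office regions would not.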
According to the user attribute information, information and POI data within a certain range centered on each node of the trajectory are matched, and the matching results are pushed to the user; and/or
according to the user attribute information, relevant news, articles or books are matched, and the matching results are pushed to the user.
First, requests are sent periodically to the user's smartphone or other mobile terminal to obtain the user's GPS coordinates and the exact time. From the user's GPS coordinates, the region to which the coordinates belong and the functional attribute information of that region are obtained. Local dealers and retailers suited to the user are then matched according to the user attributes, and these dealers and retailers provide the user with different recommendations and discount information according to the audience attributes.
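The matching step described above can be sketched as a filter over merchant records. The record fields (`name`, `region_function`, `audience`) and the ranking by number of shared attributes are assumptions made for illustration; the patent does not define a merchant data format or a ranking rule.

```python
def match_offers(user_attrs, region_function, merchants):
    """Rank merchants in the user's current functional region by audience overlap.

    user_attrs: set of audience labels for the user.
    merchants: list of dicts with hypothetical keys
        'name', 'region_function', and 'audience' (a set of labels).
    """
    scored = []
    for merchant in merchants:
        overlap = len(user_attrs & merchant["audience"])
        if merchant["region_function"] == region_function and overlap > 0:
            scored.append((overlap, merchant["name"]))
    # Largest overlap first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]
```

Merchants outside the user's current functional region, or with no attribute in common with the user, are left out of the result.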
Based on the spatio-temporal distribution of users, the hyperlocal platform can determine which products or merchants sell well in which nearby functional regions, and can thereby analyze the influence of crowds and regional functions on sales, providing data support for third-party merchants. For users of the hyperlocal platform this process is hidden and requires no action on their part. What users do see is that finding merchants they like becomes more convenient, that products matching their tastes readily appear in recommendations, and that first-hand discount information is easy to obtain.
The technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages: with the present invention, user data within a local area can be associated with places that have particular functional attributes, thereby achieving precise recommendation.
In the description provided here, numerous specific details are set forth. It should be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the invention the features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim.
Claims (10)
1. A method for determining user attribute information based on regional POI data and personal spatio-temporal data, characterized in that the method comprises:
obtaining POI data from a map open platform for preset regions, and preprocessing it;
based on positioning data acquired by mobile terminals, counting the number of entries into and exits from each region at different times, and clustering the counts;
parsing the clustering results and the preprocessed POI data using the DMR topic model algorithm to determine regional function attribute information;
generating the daily trajectory data of each user based on the personal spatio-temporal data of the mobile terminal, the trajectory data including trip chains and the nodes in the trip chains;
determining user attribute information according to the trip chains, each node, and the regional function attribute information corresponding to each node.
2. The method according to claim 1, further characterized in that the regional function attribute information includes the division of functional categories within a region, the key POI types corresponding to the different functional categories, and/or probability distribution information over the functional categories.
3. The method according to claim 1, further characterized in that obtaining the map open platform POI data by region comprises:
reading a region list;
traversing the region list and performing the following process:
obtaining the longitude and latitude of the region;
creating a thread pool;
in separate threads, generating the request URLs and crawling the various types of POI data in each region.
4. The method according to claim 1, further characterized in that obtaining the map open platform POI data by region further comprises:
creating a queue for transferring the crawled POI data of all types to the cloud for data preprocessing.
5. The method according to claim 1, further characterized in that preprocessing the acquired data using a cloud computing platform comprises: performing coordinate conversion on the crawled POI data of all types.
6. The method according to claim 1, further characterized in that clustering the counts comprises: the cloud computing platform creating an ODPS class for processing each user's trip trajectory data;
within this class, generating an index whose rows are the time series;
selecting the data whose time is later than the start time and earlier than the end time, and creating a statistics array;
counting each node ID and the total for each group;
reading the region list file and, in task order, using multiple threads to count the passenger flow of each region both as an entered region and as a departed region.
7. The method according to any one of claims 1-6, further characterized in that preprocessing the acquired data using a cloud computing platform further comprises:
creating an array for storing the clustering results;
clustering the user inflow and outflow of each node separately, as weekday inflow clustering, weekend inflow clustering, weekday outflow clustering, and weekend outflow clustering.
8. The method according to claim 7, further characterized in that parsing the clustering results and the preprocessed POI data using the DMR topic model algorithm comprises:
organizing the POI data and the cluster file of each node into a file of a predetermined format;
loading the predetermined-format file;
running the DMR algorithm to obtain the functional category division of each region and the POI distribution information corresponding to the different functional categories.
9. A method based on the method according to claim 1, characterized in that determining the user attribute information according to the trip chains, each node, and the regional function attribute information corresponding to each node comprises:
using the HMM model to estimate the user's activity at each node from the spatio-temporal information of each user's daily trajectory data and the regional function attribute information;
determining the user attribute information according to the user's activities.
10. A method for pushing information to a user based on the user attribute information determined according to claims 1-9, the method comprising:
according to the user attribute information, matching information and POI data within a certain range centered on each node of the trajectory, and pushing the matching results to the user; and/or
according to the user attribute information, matching relevant news, articles or books, and pushing the matching results to the user.
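As an illustrative sketch of the counting recited in claims 6 and 7, the entries into and exits from each region can be accumulated into 24-hour profiles, kept separately for weekdays and weekends; such profiles are the kind of input a subsequent clustering step could use. The trip record layout `(origin_region, destination_region, timestamp)` and the function name are assumptions, and the ODPS class, multithreading and the clustering itself are not reproduced here.

```python
from collections import defaultdict
from datetime import datetime


def flow_profiles(trips):
    """Count hourly inflow and outflow per region, split by weekday and weekend.

    `trips` is an iterable of (origin_region, destination_region, timestamp).
    Returns a dict keyed by (region, 'weekday' | 'weekend', 'in' | 'out')
    whose values are 24-element lists of counts indexed by hour of day.
    """
    profiles = defaultdict(lambda: [0] * 24)
    for origin, destination, ts in trips:
        day_type = "weekend" if ts.weekday() >= 5 else "weekday"
        profiles[(origin, day_type, "out")][ts.hour] += 1
        profiles[(destination, day_type, "in")][ts.hour] += 1
    return dict(profiles)
```

Each profile vector can then be passed to a clustering routine so that regions with similar daily rhythms fall into the same group.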
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910611619.9A CN110442662B (en) | 2019-07-08 | 2019-07-08 | Method for determining user attribute information and information push method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110442662A (en) | 2019-11-12 |
CN110442662B (en) | 2022-05-20 |
Family
ID=68429852
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910611619.9A Active CN110442662B (en) | 2019-07-08 | 2019-07-08 | Method for determining user attribute information and information push method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110442662B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111159583A (en) * | 2019-12-31 | 2020-05-15 | 中国联合网络通信集团有限公司 | User behavior analysis method, device, equipment and storage medium |
CN111460301A (en) * | 2020-03-31 | 2020-07-28 | 拉扎斯网络科技(上海)有限公司 | Object pushing method and device, electronic equipment and storage medium |
CN112632370A (en) * | 2020-12-08 | 2021-04-09 | 青岛海尔科技有限公司 | Method, device and equipment for article pushing |
CN113127594A (en) * | 2021-06-17 | 2021-07-16 | 脉策(上海)智能科技有限公司 | Method, computing device and storage medium for determining grouping data of geographic area |
CN113177058A (en) * | 2021-05-11 | 2021-07-27 | 北京邮电大学 | Geographic position information retrieval method and system based on composite condition |
WO2021164131A1 (en) * | 2020-02-20 | 2021-08-26 | 深圳壹账通智能科技有限公司 | Map display method and system, computer device and storage medium |
CN113569978A (en) * | 2021-08-05 | 2021-10-29 | 北京红山信息科技研究院有限公司 | Travel track identification method and device, computer equipment and storage medium |
CN114626340A (en) * | 2022-03-17 | 2022-06-14 | 智慧足迹数据科技有限公司 | Behavior feature extraction method based on mobile phone signaling and related device |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105183870A (en) * | 2015-09-17 | 2015-12-23 | 武汉大学 | Urban functional domain detection method and system by means of microblog position information |
CN106951828A (en) * | 2017-02-22 | 2017-07-14 | 清华大学 | A kind of recognition methods of the urban area functional attributes based on satellite image and network |
CN108108844A (en) * | 2017-12-25 | 2018-06-01 | 儒安科技有限公司 | A kind of urban human method for predicting and system |
CN108876475A (en) * | 2018-07-12 | 2018-11-23 | 青岛理工大学 | A kind of urban function region recognition methods, server and storage medium based on point of interest acquisition |
CN108875032A (en) * | 2018-06-25 | 2018-11-23 | 讯飞智元信息科技有限公司 | Area type determines method and device |
CN109029446A (en) * | 2018-06-22 | 2018-12-18 | 北京邮电大学 | A kind of pedestrian position prediction technique, device and equipment |
Non-Patent Citations (2)
Title |
---|
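Yu Lu et al.: "Research on urban functional area identification based on spatio-temporal semantic mining", Journal of Sichuan University (Natural Science Edition) * |
Liu Lixian et al.: "Mobile user travel trajectory prediction based on data mining", Mobile Communications * |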
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111159583B (en) * | 2019-12-31 | 2023-08-04 | 中国联合网络通信集团有限公司 | User behavior analysis method, device, equipment and storage medium |
CN111159583A (en) * | 2019-12-31 | 2020-05-15 | 中国联合网络通信集团有限公司 | User behavior analysis method, device, equipment and storage medium |
WO2021164131A1 (en) * | 2020-02-20 | 2021-08-26 | 深圳壹账通智能科技有限公司 | Map display method and system, computer device and storage medium |
CN111460301A (en) * | 2020-03-31 | 2020-07-28 | 拉扎斯网络科技(上海)有限公司 | Object pushing method and device, electronic equipment and storage medium |
CN111460301B (en) * | 2020-03-31 | 2024-01-26 | 拉扎斯网络科技(上海)有限公司 | Object pushing method and device, electronic equipment and storage medium |
CN112632370A (en) * | 2020-12-08 | 2021-04-09 | 青岛海尔科技有限公司 | Method, device and equipment for article pushing |
CN113177058A (en) * | 2021-05-11 | 2021-07-27 | 北京邮电大学 | Geographic position information retrieval method and system based on composite condition |
CN113177058B (en) * | 2021-05-11 | 2023-10-13 | 北京邮电大学 | Geographic position information retrieval method and system based on composite condition |
CN113127594B (en) * | 2021-06-17 | 2021-09-03 | 脉策(上海)智能科技有限公司 | Method, computing device and storage medium for determining grouping data of geographic area |
CN113127594A (en) * | 2021-06-17 | 2021-07-16 | 脉策(上海)智能科技有限公司 | Method, computing device and storage medium for determining grouping data of geographic area |
CN113569978A (en) * | 2021-08-05 | 2021-10-29 | 北京红山信息科技研究院有限公司 | Travel track identification method and device, computer equipment and storage medium |
CN114626340B (en) * | 2022-03-17 | 2023-02-03 | 智慧足迹数据科技有限公司 | Behavior feature extraction method based on mobile phone signaling and related device |
CN114626340A (en) * | 2022-03-17 | 2022-06-14 | 智慧足迹数据科技有限公司 | Behavior feature extraction method based on mobile phone signaling and related device |
Also Published As
Publication number | Publication date |
---|---|
CN110442662B (en) | 2022-05-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||