CN108446944B - Resident city determination method and device and electronic equipment - Google Patents

Resident city determination method and device and electronic equipment

Info

Publication number
CN108446944B
Authority
CN
China
Prior art keywords
city
user
candidate
probability
resident
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810112757.8A
Other languages
Chinese (zh)
Other versions
CN108446944A (en)
Inventor
吕兵
付晴川
朱日兵
左元
吴金蔚
文诗琪
霍盼
姚杏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd
Priority to CN201810112757.8A
Publication of CN108446944A
Application granted
Publication of CN108446944B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Landscapes

  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a resident city determining method, a resident city determining device and electronic equipment, wherein the resident city determining method comprises the following steps: according to the feature information of the user, the feature information of each candidate city and the behavior information of the user in each candidate city, the fitting probability of the user and each candidate city is obtained through city probability model prediction; determining a first probability threshold according to the fitting probability of the user and each candidate city; determining a city of residence of the user from the candidate cities according to the first probability threshold. The method and the device solve the problems that in the prior art, according to city classification statistics, the consumed time is long, and the accuracy of determining a resident city for a user who is active among a plurality of cities is poor, can determine the resident city of the user through the user, the feature information of the city and the behavior information of the user in the city, simplify the calculation process and improve the accuracy.

Description

Resident city determination method and device and electronic equipment
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a resident city determining method and device and electronic equipment.
Background
The resident city is the city where a user lives or works year-round. Recommending information and products to the user according to the resident city can effectively improve the recommendation success rate. For example, for a user who lives or works in city A year-round, news and tickets for city A are recommended to the user, while city A's specialty products and tourist attractions are not.
In the prior art, the algorithm for determining a user's resident city includes the following steps: first, counting, city by city, the user's stay time within a specified historical period, where the stay time may be expressed in days; then, sorting the user's stay times across the cities; and finally, taking the city with the most stay days as the user's resident city.
It can be seen that this process requires per-city statistics, so it is time-consuming when the number of cities is large; and for users who are frequently active in multiple cities, the resident city it determines is less accurate.
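The prior-art procedure described above can be sketched as follows. This is a minimal illustration only: the input format (one city name per day) and the function name are assumptions made for the sketch, not details given in the source.

```python
from collections import Counter

def resident_city_by_stay_days(daily_cities):
    """Return the city with the most stay days, or None for an empty history.

    daily_cities: one city name per day of the specified historical period
    (a hypothetical input format chosen for this sketch).
    """
    if not daily_cities:
        return None
    stay_days = Counter(daily_cities)  # count stay days per city
    # Sort the cities by stay days, largest first.
    ranked = sorted(stay_days.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[0][0]  # the city with the most stay days

history = ["Beijing"] * 40 + ["Shanghai"] * 35 + ["Guangzhou"] * 15
print(resident_city_by_stay_days(history))
```

In this example the user spends nearly as many days in Shanghai as in Beijing, yet only one city is returned, which illustrates the weakness for users active in several cities.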
Disclosure of Invention
The invention provides a resident city determining method, a resident city determining device and electronic equipment, and aims to solve the problem of determining a resident city in the prior art.
According to a first aspect of the present invention, there is provided a resident city determination method, the method comprising:
predicting, according to the feature information of the user, the feature information of each candidate city and the behavior information of the user in each candidate city, the fitting probability of the user and each candidate city through a city probability model;
determining a first probability threshold according to the fitting probability of the user and each candidate city;
determining a city of residence of the user from the candidate cities according to the first probability threshold.
According to a second aspect of the present invention, there is provided an apparatus for determining a resident city, the apparatus comprising:
the probability prediction module is used for predicting and obtaining the fitting probability of the user and each candidate city through a city probability model according to the characteristic information of the user, the characteristic information of each candidate city and the behavior information of the user in each candidate city;
a first probability threshold determination module, configured to determine a first probability threshold according to the fitting probability of the user and each candidate city;
a resident city determination module, configured to determine a resident city of the user from the candidate cities according to the first probability threshold.
According to a third aspect of the present invention, there is provided an electronic apparatus comprising:
a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method for determining a resident city described above when executing the program.
According to a fourth aspect of the present invention, there is provided a readable storage medium characterized in that instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the aforementioned resident city determination method.
The embodiment of the invention provides a resident city determining method, a resident city determining device and electronic equipment, wherein the resident city determining method comprises the following steps: according to the feature information of the user, the feature information of each candidate city and the behavior information of the user in each candidate city, the fitting probability of the user and each candidate city is obtained through city probability model prediction; determining a first probability threshold according to the fitting probability of the user and each candidate city; determining a city of residence of the user from the candidate cities according to the first probability threshold. The method and the device solve the problems that in the prior art, according to city classification statistics, the calculation process is complex, and the accuracy of determining a resident city for a user who is active among a plurality of cities is poor, can determine the resident city of the user through the user, the characteristic information of the city and the behavior information of the user in the city, simplify the calculation process and improve the accuracy.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive labor.
Fig. 1 is a flowchart illustrating specific steps of a resident city determination method under a system architecture according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating specific steps of another resident city determination method under the system architecture according to an embodiment of the present invention;
fig. 3 is a block diagram of a determination apparatus of a resident city according to an embodiment of the present invention;
fig. 4 is a block diagram of another resident city determination device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
Referring to fig. 1, there is shown a flow chart of steps of a resident city determination method, including:
step 101, according to the feature information of the user, the feature information of each candidate city, and the behavior information of the user in each candidate city, the fitting probability of the user and each candidate city is obtained through city probability model prediction.
The feature information of the user includes, but is not limited to: gender, age, occupation, consumption level, income level, and whether the user is a travel enthusiast.
The feature information of a candidate city includes, but is not limited to: city grade, whether the city is a tourist city, the number of local users and the number of non-local users in the city, and the average daily order volume of services such as hotels, travel, and transportation.
The behavior information of the user in each candidate city includes, but is not limited to: the number and proportion of times the user browsed and was located in the candidate city within a specified historical time period, the maximum time window in which the user appeared in the candidate city, and whether the candidate city is the user's home city. It is to be understood that the specified historical time period may be the past half year, month, week, and so on; the embodiments of the present invention do not limit it.
The above information is of two kinds, continuous and discrete. The continuous information includes: age, the number of local users and the number of non-local users in a city, the average daily order volume of services such as hotels, travel, and transportation, the number and proportion of times the user browsed and was located in the candidate city within a specified historical time period, and the maximum time window in which the user appeared in the candidate city. The discrete information includes: gender, occupation, consumption level, income level, whether the user is a travel enthusiast, city grade, whether the city is a tourist city, and whether the candidate city is the user's home city.
In practical applications, discrete features can be represented by discrete numerical values. For example, for the user's gender, male is represented by 1 and female by 2; for the user's income level, low income is represented by 1, medium income by 2, and high income by 3.
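As a small illustration of this encoding, the sketch below maps discrete user fields to numeric codes. The gender and income codes follow the example in the text; the field names and the 0/1 encoding of the yes/no feature are assumptions made for the sketch.

```python
# Gender and income codes follow the text; the dictionary and field names are hypothetical.
GENDER_CODES = {"male": 1, "female": 2}
INCOME_CODES = {"low": 1, "medium": 2, "high": 3}

def encode_discrete(record):
    """Map the discrete fields of a user record to numeric codes."""
    return {
        "gender": GENDER_CODES[record["gender"]],
        "income_level": INCOME_CODES[record["income_level"]],
        # Yes/no features are encoded as 1/0 (an assumption of this sketch).
        "is_travel_enthusiast": 1 if record["is_travel_enthusiast"] else 0,
    }

user = {"gender": "female", "income_level": "high", "is_travel_enthusiast": False}
encoded = encode_discrete(user)
print(encoded)
```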
It can be understood that, for one user and one candidate city, each item of the user's feature information, of the candidate city's feature information, and of the user's behavior information in the candidate city is input to the city probability model as one variable, yielding the fitting probability of the user and the candidate city. For multiple users and multiple candidate cities, the fitting probability of each user and each candidate city is obtained in the same way.
Step 102, determining a first probability threshold according to the fitting probability of the user and each candidate city.
In the embodiment of the invention, the fitting probabilities are sorted, and the two adjacent probability values with the largest difference are used as reference values for determining the first probability threshold.
Specifically, the average of the two adjacent probability values may be used as the first probability threshold, or another weighted average of the two adjacent probability values may be used.
Step 103, determining the resident city of the user from the candidate cities according to the first probability threshold.
Specifically, a candidate city whose fitting probability is greater than the first probability threshold is taken as a resident city, and a candidate city whose fitting probability is less than the first probability threshold is taken as a non-resident city. A user may therefore have more than one resident city.
Allowing multiple resident cities better matches practical use, because a small fraction of users are active in several cities. For example, a user may travel on business among large cities such as Guangzhou, Shanghai, and Beijing over a long period, or a user may live in a suburban town of Langfang, Hebei, but work in Beijing.
After the user's resident city is determined, personalized recommendations can be made to improve the recommendation success rate. For example, when a user accesses the Meituan platform in a resident city, best-selling goods purchased by local users can be recommended, while local attractions, specialty products, and the like are not recommended; when a user accesses the Meituan platform in a non-resident city, best-selling goods purchased by non-local users, travel products for that city, train tickets between the resident city and that city, and the like can be recommended.
In summary, an embodiment of the present invention provides a method for determining a resident city, where the method includes: according to the feature information of the user, the feature information of each candidate city and the behavior information of the user in each candidate city, the fitting probability of the user and each candidate city is obtained through city probability model prediction; determining a first probability threshold according to the fitting probability of the user and each candidate city; determining a city of residence of the user from the candidate cities according to the first probability threshold. The method and the device solve the problems that in the prior art, according to city classification statistics, the consumed time is long, and the accuracy of determining a resident city for a user who is active among a plurality of cities is poor, can determine the resident city of the user through the user, the feature information of the city and the behavior information of the user in the city, simplify the calculation process and improve the accuracy.
Example two
This embodiment of the application describes an optional resident city determination method at the system architecture level.
Referring to fig. 2, a flowchart illustrating specific steps of another resident city determination method is shown.
Step 201, training based on an annotated data sample set to obtain a city probability model, where each annotated data sample in the annotated data sample set at least includes: the characteristic information of the user, the characteristic information of the candidate city and the behavior information of the user in the candidate city.
Each data sample in the labeled data sample set is labeled with whether the corresponding city is a resident city of the corresponding user. In practical applications, a field may be added to indicate whether the data sample is a resident city sample or a non-resident city sample. For example, a sample may be labeled with a field ResidentCity: when ResidentCity is 1, the sample is a resident city sample; when ResidentCity is 0, the sample is a non-resident city sample.
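One possible in-memory shape for such a labeled sample is sketched below. The class name, the attribute names, and the numeric values are all hypothetical; only the three feature groups and the 1/0 label come from the text.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class LabeledSample:
    user_features: List[float]      # feature information of the user
    city_features: List[float]      # feature information of the candidate city
    behavior_features: List[float]  # behavior information of the user in the city
    resident_city: int              # label field: 1 = resident, 0 = non-resident

    def feature_vector(self) -> List[float]:
        """Concatenate the three feature groups in a fixed order."""
        return self.user_features + self.city_features + self.behavior_features

# Illustrative values only: six user items, five city items, three behavior items.
sample = LabeledSample(
    user_features=[1, 30, 2, 2, 3, 0],
    city_features=[1, 0, 5e6, 2e5, 8e4],
    behavior_features=[0.7, 90, 1],
    resident_city=1,
)
print(len(sample.feature_vector()), sample.resident_city)
```

Keeping the concatenation order fixed matters because each position in the vector is paired with one model parameter.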
The labeled data sample set can be obtained through follow-up phone calls and questionnaires conducted by customer service personnel with users of platforms such as hotel clients and the Meituan platform.
The feature information of the user can be obtained by analyzing the user's application usage log. For example, the personal information entered when the user registers an account on the Meituan platform, including gender, age, occupation, consumption level, and income level, can be used as the user's feature information, and whether the user is a travel enthusiast can be derived from the user's usage log: when most of the information the user follows concerns tourist attractions, hotels, and the like, the user can be determined to be a travel enthusiast; otherwise, the user is not. It can be understood that, in practical applications, an option asking whether the user likes to travel may also be provided at registration, so that users who select it are treated as travel enthusiasts and the others are not.
The feature information of the candidate city may be obtained from a database. For example, the Meituan platform can maintain basic data for each city.
The behavior information of the user in the candidate city can be obtained by analyzing the access log of the user to the application, for example, the number and the proportion of browsing and locating the candidate city by the user in a specified historical time period, the maximum time window of the user appearing in the candidate city, and whether the candidate city is the home city of the user can be determined through the log.
In the embodiment of the present invention, a logistic regression model or a decision tree model may be used for training, so that the parameter set in the model is the optimal parameter set.
Optionally, in another embodiment of the present invention, step 201 includes sub-steps 2011 to 2014:
sub-step 2011 initializes a set of parameters for the city probability model.
After the probabilistic model is selected, a set of parameters for the probabilistic model is initialized. For example, for a logistic regression model, the model formula is as follows:
f(φ(u,c)) = 1 / (1 + exp(-(Σ_{i=1}^{N} w_i·x_i + b))) (1)
where u is a user and c is a candidate city;
N is the number of non-constant parameters, determined by the number of items in the feature information of the user, the feature information of the candidate city, and the behavior information of the user in the candidate city. For example, if the user's feature information includes six items (gender, age, occupation, consumption level, income level, and whether the user is a travel enthusiast), the city's feature information includes five items (city grade, whether the city is a tourist city, the number of local users, the number of non-local users, and the average daily order volume of services such as hotels, travel, and transportation), and the user's behavior information in the candidate city includes three items (the number and proportion of times the user browsed and was located in the candidate city within a specified historical time period, the maximum time window in which the user appeared in the candidate city, and whether the candidate city is the user's home city), then N = 6 + 5 + 3 = 14.
x_i is the value of the i-th item in the feature vector composed of the feature information of user u, the feature information of candidate city c, and the behavior information of user u in candidate city c;
w_i is the weight of x_i, and b is a constant parameter.
It will be appreciated that training amounts to determining the values of w_i and b.
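Assuming formula (1) takes the standard logistic regression form (the formula itself is an image in the source, so this is an assumption), the fitting probability can be computed as in the sketch below. The function name and the example values are illustrative only.

```python
import math

def fitting_probability(x, w, b):
    """Standard logistic regression: 1 / (1 + exp(-(sum_i w_i * x_i + b)))."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length N")
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Evaluate the sigmoid in a form that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

x = [1.0, 0.5, 2.0]   # feature values x_i (illustrative)
w = [0.0, 0.0, 0.0]   # weights w_i, all zero before any training
print(fitting_probability(x, w, 0.0))
```

With all parameters at zero the model is uninformative and returns 0.5 for every user and city pair; training moves w_i and b away from this starting point.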
Specifically, the parameter set may be initialized based on empirical values, or may be initialized to other values through analysis. It will be appreciated that when the values of the initialization parameter set are inappropriate, training time will be increased; training time will be reduced when the value of the initialization parameter set is close to the optimal parameter set.
Sub-step 2012, for each labeled data sample in the labeled data sample set, inputting the feature information of the user, the feature information of the candidate city, and the behavior information of the user in the candidate city into the preset city probability model, so as to obtain the fitting probability of the user and the corresponding candidate city.
Specifically, for the user u and the candidate city c, the feature information of the user u, the feature information of the candidate city c, and the behavior information of the user u in the candidate city c are input into the probability model according to the specified sequence, and the fitting probability of the user u and the candidate city c is obtained through calculation.
It can be understood that the order of the feature information of the user u, the feature information of the candidate city c, and the behavior information of the user u in the candidate city c may be set according to an actual application scenario, which is not limited in the embodiment of the present invention.
In practical applications, the labeled data sample set covers a large number of users, each with multiple candidate cities, and each labeled data sample pairs one user with one of that user's candidate cities. Fitting probabilities for a large number of users and their candidate cities can therefore be obtained from the labeled data sample set.
Sub-step 2013, determining a loss value according to the fitting probability of the user and the corresponding candidate city determined from each labeled data sample.
Compared with the loss functions of traditional models, such as the cross-entropy loss of a logistic regression model, the loss function of the embodiment of the invention not only ranks the fitting probabilities of users and cities but also maximizes the probability margin between resident and non-resident cities, so that a user's fitting probability for resident cities is clearly separated from that for non-resident cities.
Optionally, in another embodiment of the present invention, sub-step 2013 includes sub-steps 20131 to 20133:
Sub-step 20131, for each user in the labeled data sample set, subtracting the fitting probability of the user and each resident city from the fitting probability of the user and each non-resident city, and adding a preset protection value, to obtain each first difference value of the user.
In the embodiment of the present invention, for all users in the labeled data sample set, based on all labeled data samples of each user, each first difference value of the user is calculated according to the fitting probability of the user to each non-resident city and the fitting probability of each resident city.
Specifically, for user u, resident city c', non-resident city c, the calculation formula of the first difference value is as follows:
M1=f(φ(u,c))-f(φ(u,c'))+ε (2)
where f(φ(u,c)) is the fitting probability of user u and the non-resident city c, and f(φ(u,c')) is the fitting probability of user u and the resident city c'. In practical applications, the logistic regression model of formula (1) may be used, or another model may be used.
Epsilon is a preset protection value, which affects the training result, and can be adjusted according to the prediction result in the training process.
Sub-step 20132, for each user, taking the maximum of each first difference value of the user and zero to obtain the second difference values, and summing the second difference values to obtain the third difference value of the user.
Specifically, for user u, resident city c', non-resident city c, the calculation formula of the second difference value is as follows:
M2=max(0,M1)=max(0,f(φ(u,c))-f(φ(u,c'))+ε) (3)
then, the calculation formula of the third difference value is as follows:
M3(u) = Σ_{c'∈C_u, c' resident} Σ_{c∈C_u, c non-resident} max(0, f(φ(u,c)) - f(φ(u,c')) + ε) (4)
where C_u is the candidate city set of user u, which is divided into resident cities and non-resident cities; the sum runs over every pair of a resident city c' and a non-resident city c. It is understood that there may be multiple resident cities and multiple non-resident cities.
Sub-step 20133, summing the third difference values of all users to obtain a loss value.
Specifically, the calculation formula of the loss value is as follows:
Loss = Σ_{u∈U} M3(u) (5)
where U is the set of all users and M3(u) is the third difference value of user u.
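Sub-steps 20131 to 20133 can be sketched as a single function. The nested-dictionary input format, the function name, and the protection value of 0.1 are assumptions chosen for illustration; the arithmetic follows formulas (2) and (3) and the summations described in the text.

```python
def pairwise_loss(probs, labels, epsilon=0.1):
    """Sum over users of the hinge-style differences between city pairs.

    probs:   {user: {city: fitting probability}}
    labels:  {user: {city: 1 for resident, 0 for non-resident}}
    epsilon: the preset protection value (0.1 is an arbitrary choice here).
    """
    loss = 0.0
    for user, city_probs in probs.items():
        resident = [c for c in city_probs if labels[user][c] == 1]
        non_resident = [c for c in city_probs if labels[user][c] == 0]
        third = 0.0
        for c_res in resident:
            for c_non in non_resident:
                first = city_probs[c_non] - city_probs[c_res] + epsilon  # first difference, formula (2)
                third += max(0.0, first)                                  # second difference, formula (3), summed
        loss += third                                                     # sum of third differences over users
    return loss

probs = {"u1": {"A": 0.9, "B": 0.2, "C": 0.85}}
labels = {"u1": {"A": 1, "B": 0, "C": 0}}
print(pairwise_loss(probs, labels))
```

In this example the pair (A, B) is separated by more than the protection value and contributes nothing, while the pair (A, C) is separated by only 0.05 and contributes about 0.05, so the loss penalizes only pairs that are ranked wrongly or too closely.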
Sub-step 2014, if the loss value does not satisfy the preset condition, adjusting the parameter set until the loss value satisfies the preset condition.
Specifically, when the loss value is less than or equal to a preset value, the loss value satisfies the preset condition, training ends, and the corresponding parameter set is the target parameter set; when the loss value is greater than the preset value, the loss value does not satisfy the preset condition, and the parameter set is adjusted and training continues until the loss value satisfies the preset condition.
It can be understood that the preset value may be set according to an actual application scenario, and the embodiment of the present invention does not limit the preset value. The smaller the preset value is, the more accurate the training result is, and the longer the training time is; the larger the preset value is, the coarser the training result is, and the shorter the training time is.
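The training loop of sub-steps 2011 to 2014 might look like the sketch below. The source does not say how the parameter set is adjusted, so the finite-difference gradient step, the step-halving rule, the round limit, and all names and data here are assumptions made purely for illustration.

```python
import math

def predict(params, x):
    """Logistic model; params holds the N weights followed by the constant b."""
    z = sum(w * xi for w, xi in zip(params, x)) + params[-1]
    return 1.0 / (1.0 + math.exp(-z))

def loss_value(params, samples, epsilon=0.1):
    """Pairwise loss summed over all users (sub-steps 20131 to 20133)."""
    total = 0.0
    for user_samples in samples.values():
        res = [predict(params, x) for x, y in user_samples if y == 1]
        non = [predict(params, x) for x, y in user_samples if y == 0]
        total += sum(max(0.0, pn - pr + epsilon) for pr in res for pn in non)
    return total

def train(samples, n_features, preset_value=1e-3, max_rounds=200, step=0.5, h=1e-5):
    params = [0.0] * (n_features + 1)      # sub-step 2011: initialize the parameter set
    loss = loss_value(params, samples)     # sub-steps 2012 and 2013
    for _ in range(max_rounds):
        if loss <= preset_value:           # preset condition satisfied: stop training
            break
        # Finite-difference gradient (an illustrative choice, not from the source).
        grad = []
        for i in range(len(params)):
            shifted = params[:i] + [params[i] + h] + params[i + 1:]
            grad.append((loss_value(shifted, samples) - loss) / h)
        candidate = [p - step * g for p, g in zip(params, grad)]
        candidate_loss = loss_value(candidate, samples)
        if candidate_loss < loss:          # sub-step 2014: keep only improving adjustments
            params, loss = candidate, candidate_loss
        else:
            step *= 0.5                    # otherwise shrink the step and try again
    return params, loss

# Toy data: {user: [(feature vector, ResidentCity label), ...]}
samples = {
    "u1": [([1.0, 0.0], 1), ([0.0, 1.0], 0)],
    "u2": [([0.9, 0.1], 1), ([0.2, 0.8], 0)],
}
initial_loss = loss_value([0.0, 0.0, 0.0], samples)
params, final_loss = train(samples, n_features=2)
print(initial_loss, final_loss)
```

Because an adjustment is accepted only when it lowers the loss, the loss never increases between rounds; a smaller preset value simply requires more rounds before the loop stops.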
Step 202, according to the feature information of the user, the feature information of each candidate city, and the behavior information of the user in each candidate city, the fitting probability of the user and each candidate city is obtained through city probability model prediction.
This step can refer to the detailed description of step 101, and is not described herein again.
Step 203, sorting the fitting probabilities of the user and the candidate cities.
Specifically, they may be arranged in descending order or in ascending order.
Step 204, calculating the difference between each two adjacent probabilities.
Specifically, for descending order, the latter probability is subtracted from the former probability; for ascending order, the former probability is subtracted from the latter probability.
Taking descending order as an example, the difference M_i between the i-th probability and the (i+1)-th probability can be calculated according to the following formula:
M_i = P_i - P_{i+1} (6)
where P_i is the i-th probability and P_{i+1} is the (i+1)-th probability.
It will be appreciated that in practical applications, the difference may also be taken as an absolute value to ensure that the obtained difference is a positive value.
Step 205, calculating a weighted average of the two adjacent probabilities with the largest difference to obtain a first probability threshold.
Based on formula (6), the two adjacent probabilities with the largest difference, P_I and P_{I+1}, are identified; the first probability threshold P_s can then be calculated according to the following formula:
P_s = C_1·P_I + C_2·P_{I+1} (7)
where C_1 is the weighting parameter of probability P_I, C_2 is the weighting parameter of probability P_{I+1}, C_1 + C_2 = 1, C_1 ≠ 0, and C_2 ≠ 0. In particular, when C_1 = C_2 = 0.5, the first probability threshold is the average of P_I and P_{I+1}.
Step 206, for each candidate city, if the fitting probability between the user and the candidate city is greater than or equal to the first probability threshold, the candidate city is the resident city of the user.
It will be appreciated that, for the first probability threshold calculated by formula (7), the candidate cities corresponding to P_I and to the probabilities ranked before P_I are resident cities of the user, and the candidate cities corresponding to P_{I+1} and to the probabilities ranked after P_{I+1} are non-resident cities of the user.
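Steps 203 to 206 can be sketched as follows, assuming descending order. The function name, the dictionary input, and the example probabilities are illustrative assumptions; the weights correspond to the two weighting parameters of formula (7).

```python
def resident_cities(fit_probs, c1=0.5, c2=0.5):
    """Sort, find the largest adjacent gap, compute the threshold, select cities.

    fit_probs: {city: fitting probability}; c1 and c2 must be nonzero and sum to 1.
    """
    if abs(c1 + c2 - 1.0) > 1e-9 or c1 == 0 or c2 == 0:
        raise ValueError("weights must be nonzero and sum to 1")
    if len(fit_probs) < 2:
        raise ValueError("need at least two candidate cities")
    # Step 203: sort in descending order of fitting probability.
    ranked = sorted(fit_probs.items(), key=lambda kv: kv[1], reverse=True)
    probs = [p for _, p in ranked]
    # Step 204: difference between each two adjacent probabilities.
    gaps = [probs[i] - probs[i + 1] for i in range(len(probs) - 1)]
    big = max(range(len(gaps)), key=gaps.__getitem__)  # index of the largest gap
    # Step 205: weighted average of the two probabilities around the largest gap.
    threshold = c1 * probs[big] + c2 * probs[big + 1]
    # Step 206: cities at or above the threshold are resident cities.
    cities = [city for city, p in ranked if p >= threshold]
    return threshold, cities

threshold, cities = resident_cities({"A": 0.92, "B": 0.88, "C": 0.31, "D": 0.12})
print(threshold, cities)
```

Here the largest gap lies between 0.88 and 0.31, so the threshold falls between them and both A and B are returned, which is how a user can end up with more than one resident city.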
In summary, an embodiment of the present invention provides a method for determining a resident city, where the method includes: according to the feature information of the user, the feature information of each candidate city and the behavior information of the user in each candidate city, the fitting probability of the user and each candidate city is obtained through city probability model prediction; determining a first probability threshold according to the fitting probability of the user and each candidate city; determining a city of residence of the user from the candidate cities according to the first probability threshold. The method and the device solve the problems that in the prior art, according to city classification statistics, the consumed time is long, and the accuracy of determining a resident city for a user who is active among a plurality of cities is poor, can determine the resident city of the user through the user, the feature information of the city and the behavior information of the user in the city, simplify the calculation process and improve the accuracy.
Example three
Referring to fig. 3, there is shown a block diagram of a resident city determination device, as follows.
The probability prediction module 301 is configured to obtain a fitting probability of the user and each candidate city through city probability model prediction according to the feature information of the user, the feature information of each candidate city, and the behavior information of the user in each candidate city.
A first probability threshold determination module 302, configured to determine a first probability threshold according to the fitting probabilities of the user and the candidate cities.
A resident city determination module 303, configured to determine a resident city of the user from the candidate cities according to the first probability threshold.
In summary, an embodiment of the present invention provides a resident city determination device, the device including: a probability prediction module, configured to obtain, through prediction by a city probability model, the fitting probability between the user and each candidate city according to the feature information of the user, the feature information of each candidate city, and the behavior information of the user in each candidate city; a first probability threshold determination module, configured to determine a first probability threshold according to the fitting probability between the user and each candidate city; and a resident city determination module, configured to determine the resident city of the user from the candidate cities according to the first probability threshold. This addresses two problems of the prior art, which relies on statistics classified by city: the computation is time-consuming, and the resident city is determined with poor accuracy for a user who is active in several cities. The resident city of the user can instead be determined from the feature information of the user and of the city together with the behavior information of the user in the city, which simplifies the calculation process and improves accuracy.
Example four
Referring to fig. 4, there is shown a block diagram of another resident city determination device, as follows.
A probabilistic model training module 401, configured to obtain a city probability model by training on a labeled data sample set, where each labeled data sample in the labeled data sample set includes at least: the feature information of the user, the feature information of the candidate city, and the behavior information of the user in the candidate city.
A probability prediction module 402, configured to obtain, through prediction by the city probability model, the fitting probability between the user and each candidate city according to the feature information of the user, the feature information of each candidate city, and the behavior information of the user in each candidate city.
A first probability threshold determining module 403, configured to determine a first probability threshold according to the fitting probability between the user and each candidate city. Optionally, in another embodiment of the present invention, the first probability threshold determining module 403 includes:
A ranking submodule 4031, configured to sort the fitting probabilities between the user and the candidate cities.
A probability difference calculation submodule 4032, configured to calculate the difference between each pair of adjacent probabilities.
A first probability threshold determining submodule 4033, configured to calculate a weighted average of the two adjacent probabilities with the largest difference to obtain the first probability threshold.
A resident city determination module 404, configured to determine a resident city of the user from the candidate cities according to the first probability threshold. Optionally, in another embodiment of the present invention, the resident city determination module 404 includes:
A resident city determination submodule 4041, configured to, for each candidate city, determine that the candidate city is a resident city of the user if the fitting probability between the user and the candidate city is greater than or equal to the first probability threshold.
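The three submodules of the first probability threshold determining module 403 (sorting, adjacent differences, weighted average of the pair with the largest gap) can be sketched as below. This is an illustrative sketch only: the function name, the descending sort order, and the `weight` parameter with its default of 0.5 are assumptions, since the weights of equation (7) are not restated in this part of the description.

```python
def first_probability_threshold(fit_probs, weight=0.5):
    """Sketch of submodules 4031-4033.

    fit_probs: iterable of fitting probabilities for one user.
    weight: share given to the higher probability of the selected pair
            (an assumed default, not a value stated in the patent).
    """
    # 4031: sort the fitting probabilities (descending order assumed).
    ranked = sorted(fit_probs, reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two candidate cities")
    # 4032: difference between each pair of adjacent probabilities.
    gaps = [ranked[i] - ranked[i + 1] for i in range(len(ranked) - 1)]
    # 4033: weighted average of the adjacent pair with the largest gap.
    i = max(range(len(gaps)), key=gaps.__getitem__)
    return weight * ranked[i] + (1.0 - weight) * ranked[i + 1]


# Hypothetical probabilities: the largest gap lies between 0.74 and 0.21.
threshold = first_probability_threshold([0.21, 0.82, 0.05, 0.74])
```

With equal weights the threshold falls midway across the largest gap, so the cities above the gap satisfy the "greater than or equal to" test of submodule 4041 and the cities below it do not.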
Optionally, in another embodiment of the present invention, the probabilistic model training module 401 includes:
A parameter set initialization submodule, configured to initialize the parameter set of the city probability model.
A probability calculation submodule, configured to, for each labeled data sample in the labeled data sample set, input the feature information of the user, the feature information of the candidate city, and the behavior information of the user in the candidate city into a preset city probability model to obtain the fitting probability between the user and the corresponding candidate city.
A loss value determining submodule, configured to determine a loss value according to the fitting probabilities between the user and the corresponding candidate city obtained for the labeled data samples.
A continuous training submodule, configured to, if the loss value does not meet a preset condition, adjust the parameter set until the loss value meets the preset condition.
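The training flow of module 401 (initialize parameters, compute fitting probabilities, determine a loss value, adjust parameters until a preset condition holds) can be illustrated with a small loop. Everything concrete here is an assumption made for the sketch: the logistic model form, the squared-error placeholder loss against a 0/1 resident label, the gradient-descent update, and the stopping tolerance are not specified in this passage.

```python
import math
import random


def predict(params, features):
    """Assumed model form: logistic function of a weighted feature sum."""
    z = sum(w * x for w, x in zip(params, features))
    return 1.0 / (1.0 + math.exp(-z))


def loss_value(params, samples):
    """Placeholder loss: squared error against a 0/1 resident label."""
    return sum((predict(params, f) - y) ** 2 for f, y in samples)


def train(samples, n_features, lr=0.5, tol=0.1, max_iters=5000, seed=0):
    rng = random.Random(seed)
    # Initialize the parameter set.
    params = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
    for _ in range(max_iters):
        # Compute fitting probabilities and the resulting loss value.
        if loss_value(params, samples) <= tol:
            break  # preset condition met (assumed: loss below a tolerance)
        # Otherwise adjust the parameter set (assumed: gradient descent).
        grads = [0.0] * n_features
        for features, label in samples:
            p = predict(params, features)
            for j, x in enumerate(features):
                grads[j] += 2.0 * (p - label) * p * (1.0 - p) * x
        params = [w - lr * g for w, g in zip(params, grads)]
    return params, loss_value(params, samples)


# Hypothetical samples: [bias term, share of the user's activity in the city].
samples = [([1.0, 0.9], 1), ([1.0, 0.8], 1), ([1.0, 0.1], 0), ([1.0, 0.2], 0)]
params, final_loss = train(samples, n_features=2)
```

The `max_iters` cap is a practical safeguard added for the sketch so that the loop terminates even if the preset condition is never reached.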
Optionally, in another embodiment of the present invention, the loss value determining submodule includes:
A first difference calculation unit, configured to, for each user in the labeled data sample set, subtract the fitting probability between the user and each resident city from the fitting probability between the user and each non-resident city, and add a preset protection value, to obtain the first differences of the user.
A second difference calculation unit, configured to, for each user, take the maximum of each first difference and zero to obtain a second difference, and sum the second differences to obtain a third difference of the user.
A loss value determining unit, configured to sum the third differences of all users to obtain the loss value.
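The loss described by these three units reads most naturally as a margin-based ranking (hinge) loss, in which each first difference is the fitting probability for a non-resident city minus the fitting probability for a resident city, plus the protection value. The sketch below follows that reading, which is an assumption; pairing every resident city with every non-resident city, the function name `ranking_loss`, and the value 0.1 for the protection value are likewise illustrative choices.

```python
def ranking_loss(samples, margin=0.1):
    """samples maps each user to (resident_probs, nonresident_probs).

    margin plays the role of the preset protection value (0.1 is assumed).
    """
    total = 0.0
    for resident_probs, nonresident_probs in samples.values():
        third = 0.0  # third difference: per-user sum of second differences
        for p_res in resident_probs:
            for p_non in nonresident_probs:
                first = p_non - p_res + margin  # first difference
                third += max(first, 0.0)        # second difference
        total += third
    return total


# Hypothetical fitting probabilities for two labeled users.
samples = {
    "u1": ([0.9], [0.2, 0.85]),
    "u2": ([0.6, 0.7], [0.65]),
}
loss = ranking_loss(samples)
```

Under this reading a pair contributes nothing once the resident city outscores the non-resident city by at least the protection value, so training pushes resident and non-resident probabilities apart, which is what the largest-gap threshold relies on.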
In summary, an embodiment of the present invention provides a resident city determination device, the device including: a probability prediction module, configured to obtain, through prediction by a city probability model, the fitting probability between the user and each candidate city according to the feature information of the user, the feature information of each candidate city, and the behavior information of the user in each candidate city; a first probability threshold determination module, configured to determine a first probability threshold according to the fitting probability between the user and each candidate city; and a resident city determination module, configured to determine the resident city of the user from the candidate cities according to the first probability threshold. This addresses two problems of the prior art, which relies on statistics classified by city: the computation is time-consuming, and the resident city is determined with poor accuracy for a user who is active in several cities. The resident city of the user can instead be determined from the feature information of the user and of the city together with the behavior information of the user in the city, which simplifies the calculation process and improves accuracy.
An embodiment of the present invention further provides an electronic device, including: a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the processor implements the resident city determination method of the foregoing embodiments when executing the program.
An embodiment of the present invention also provides a readable storage medium; when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the resident city determination method of the foregoing embodiments.
Because the device embodiments are substantially similar to the method embodiments, they are described only briefly; for relevant details, refer to the corresponding parts of the description of the method embodiments.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components of a resident city determination device in accordance with embodiments of the present invention. The present invention may also be embodied as an apparatus or device program for carrying out a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering. These words may be interpreted as names.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A resident city determination method applied to an electronic device, the method comprising:
obtaining, through prediction by a city probability model, a fitting probability between a target user and each candidate city according to feature information of the target user, feature information of each candidate city, and behavior information of the target user in each candidate city;
determining a first probability threshold according to the fitting probability between the target user and each candidate city;
determining a resident city of the target user from the candidate cities according to the first probability threshold.
2. The method of claim 1, wherein the step of determining a resident city of the target user from the candidate cities according to the first probability threshold comprises:
for each candidate city, if the fitting probability between the target user and the candidate city is greater than or equal to the first probability threshold, determining that the candidate city is a resident city of the target user.
3. The method of claim 1, wherein the step of determining a first probability threshold according to the fitting probability between the target user and each candidate city comprises:
sorting the fitting probabilities between the target user and the candidate cities;
calculating the difference between each pair of adjacent probabilities;
and calculating a weighted average of the two adjacent probabilities with the largest difference to obtain the first probability threshold.
4. The method of claim 1, further comprising:
training on a labeled data sample set to obtain the city probability model, wherein each labeled data sample in the labeled data sample set at least comprises: the feature information of the user, the feature information of the candidate city, and the behavior information of the user in the candidate city.
5. The method of claim 4, wherein the step of training a city probability model based on the labeled data sample set comprises:
initializing a parameter set of a city probability model;
for each labeled data sample in the labeled data sample set, inputting the characteristic information of the user, the characteristic information of the candidate city and the behavior information of the user in the candidate city into a preset city probability model to obtain the fitting probability of the user and the corresponding candidate city;
determining a loss value according to the fitting probability of the user and the corresponding candidate city determined by each labeled data sample;
and if the loss value does not meet the preset condition, adjusting the parameter set until the loss value meets the preset condition.
6. The method of claim 5, wherein the step of determining the loss value according to the fitting probability of the user and the corresponding candidate city determined by each labeled data sample comprises:
for each user in the labeled data sample set, subtracting the fitting probability between the user and each resident city from the fitting probability between the user and each non-resident city, and adding a preset protection value, to obtain the first differences of the user;
for each user, taking the maximum of each first difference of the user and zero to obtain a second difference, and summing the second differences to obtain a third difference of the user;
and summing the third differences of all users to obtain the loss value.
7. An apparatus for determining a resident city, applied to an electronic device, the apparatus comprising:
a probability prediction module, configured to obtain, through prediction by a city probability model, the fitting probability between the user and each candidate city according to the feature information of the user, the feature information of each candidate city, and the behavior information of the user in each candidate city;
a first probability threshold determination module, configured to determine a first probability threshold according to the fitting probability between the user and each candidate city;
a resident city determination module, configured to determine a resident city of the user from the candidate cities according to the first probability threshold.
8. The apparatus of claim 7, wherein the resident city determination module comprises:
a resident city determination submodule, configured to, for each candidate city, determine that the candidate city is a resident city of the user if the fitting probability between the user and the candidate city is greater than or equal to the first probability threshold.
9. An electronic device, comprising:
a processor, a memory, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the resident city determination method according to any one of claims 1 to 6.
10. A readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the resident city determination method according to any one of claims 1 to 6.
CN201810112757.8A 2018-02-05 2018-02-05 Resident city determination method and device and electronic equipment Active CN108446944B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810112757.8A CN108446944B (en) 2018-02-05 2018-02-05 Resident city determination method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN108446944A CN108446944A (en) 2018-08-24
CN108446944B true CN108446944B (en) 2020-03-17

Family

ID=63191751

Country Status (1)

Country Link
CN (1) CN108446944B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant