CN108446944B

CN108446944B - Resident city determination method and device and electronic equipment

Info

Publication number: CN108446944B
Application number: CN201810112757.8A
Authority: CN
Inventors: 吕兵; 付晴川; 朱日兵; 左元; 吴金蔚; 文诗琪; 霍盼; 姚杏
Original assignee: Beijing Sankuai Online Technology Co Ltd
Current assignee: Beijing Sankuai Online Technology Co Ltd
Priority date: 2018-02-05
Filing date: 2018-02-05
Publication date: 2020-03-17
Anticipated expiration: 2038-02-05
Also published as: CN108446944A

Abstract

The invention provides a resident city determining method, a resident city determining device and electronic equipment, wherein the resident city determining method comprises the following steps: according to the feature information of the user, the feature information of each candidate city and the behavior information of the user in each candidate city, the fitting probability of the user and each candidate city is obtained through city probability model prediction; determining a first probability threshold according to the fitting probability of the user and each candidate city; determining a city of residence of the user from the candidate cities according to the first probability threshold. The method and the device solve the problems that in the prior art, according to city classification statistics, the consumed time is long, and the accuracy of determining a resident city for a user who is active among a plurality of cities is poor, can determine the resident city of the user through the user, the feature information of the city and the behavior information of the user in the city, simplify the calculation process and improve the accuracy.

Description

Resident city determination method and device and electronic equipment

Technical Field

The embodiment of the invention relates to the technical field of computers, in particular to a resident city determining method and device and electronic equipment.

Background

A resident city is a city where a user lives or works year-round. Recommending information and products to the user according to the resident city can effectively improve the recommendation success rate. For example, a user who lives or works in city A throughout the year is recommended news and ticket information for city A, but not city A's local specialties or tourist attractions.

In the prior art, the algorithm for determining a user's resident city includes the following steps: first, counting, city by city, the user's stay time within a specified historical period, where the stay time may be expressed in days; then, sorting the user's stay times across the cities; and finally, taking the city with the most stay days as the user's resident city.

It can be seen that the above process requires statistics to be compiled separately for each city, which is time-consuming when the number of cities is large; and for users who are frequently active in multiple cities, the resident city is determined with poor accuracy.
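The prior-art procedure described above can be sketched as follows. This is only an illustration: the function name `baseline_resident_city` and the input format (one city name per day) are assumptions, not part of the original disclosure.

```python
from collections import Counter

def baseline_resident_city(daily_cities):
    """Prior-art baseline: the city with the most stay days wins.

    daily_cities: one city name per day of the historical period
    (hypothetical input format chosen for this sketch).
    """
    if not daily_cities:
        return None
    stay_days = Counter(daily_cities)   # count stay days per city
    ranked = stay_days.most_common()    # sort cities by stay days, descending
    return ranked[0][0]                 # city with the most stay days

# A user who splits time almost evenly between two cities still gets
# exactly one resident city, which is the accuracy problem noted above.
days = ["A"] * 16 + ["B"] * 14
result = baseline_resident_city(days)
```

Because the result is always a single city, a user who is active in several cities cannot be represented correctly by this baseline.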

Disclosure of Invention

The invention provides a resident city determination method, a resident city determination device and electronic equipment, to solve the prior-art problems that determining a resident city is time-consuming and is inaccurate for users who are active in multiple cities.

According to a first aspect of the present invention, there is provided a method for determining a resident city, the method comprising:

obtaining, through city probability model prediction, the fitting probability of the user and each candidate city according to the feature information of the user, the feature information of each candidate city and the behavior information of the user in each candidate city;

determining a first probability threshold according to the fitting probability of the user and each candidate city;

determining a city of residence of the user from the candidate cities according to the first probability threshold.

According to a second aspect of the present invention, there is provided an apparatus for determining a resident city, the apparatus comprising:

the probability prediction module is used for predicting and obtaining the fitting probability of the user and each candidate city through a city probability model according to the characteristic information of the user, the characteristic information of each candidate city and the behavior information of the user in each candidate city;

a first probability threshold determination module, configured to determine a first probability threshold according to the fitting probability of the user and each candidate city;

a resident city determination module, configured to determine a resident city of the user from the candidate cities according to the first probability threshold.

According to a third aspect of the present invention, there is provided an electronic apparatus comprising:

a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method for determining a resident city described above when executing the program.

According to a fourth aspect of the present invention, there is provided a readable storage medium characterized in that instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the aforementioned resident city determination method.

The embodiment of the invention provides a resident city determining method, a resident city determining device and electronic equipment, wherein the resident city determining method comprises the following steps: according to the feature information of the user, the feature information of each candidate city and the behavior information of the user in each candidate city, the fitting probability of the user and each candidate city is obtained through city probability model prediction; determining a first probability threshold according to the fitting probability of the user and each candidate city; determining a city of residence of the user from the candidate cities according to the first probability threshold. The method and the device solve the problems that in the prior art, according to city classification statistics, the calculation process is complex, and the accuracy of determining a resident city for a user who is active among a plurality of cities is poor, can determine the resident city of the user through the user, the characteristic information of the city and the behavior information of the user in the city, simplify the calculation process and improve the accuracy.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.

Fig. 1 is a flowchart illustrating specific steps of a resident city determination method under a system architecture according to an embodiment of the present invention;

Fig. 2 is a flowchart illustrating specific steps of another resident city determination method under the system architecture according to an embodiment of the present invention;

Fig. 3 is a block diagram of a resident city determination device according to an embodiment of the present invention;

Fig. 4 is a block diagram of another resident city determination device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Example one

Referring to fig. 1, there is shown a flow chart of steps of a resident city determination method, including:

step 101, according to the feature information of the user, the feature information of each candidate city, and the behavior information of the user in each candidate city, the fitting probability of the user and each candidate city is obtained through city probability model prediction.

The feature information of the user includes but is not limited to: gender, age, occupation, consumption level, income level, and whether the user is a travel enthusiast.

The feature information of a candidate city includes but is not limited to: city tier, whether the city is a tourist city, the number of local users and the number of non-local users in the city, and the average daily order volume of services such as hotels, travel, and transportation.

The behavior information of the user in each candidate city includes but is not limited to: the number of times, and the proportion of times, that the user browsed or was located in the candidate city within a specified historical time period; the maximum time window in which the user appeared in the candidate city; and whether the candidate city is the user's home city. It is to be understood that the specified historical time period may be the past half year, month, week, etc., and is not limited by the embodiments of the present invention.

It can be seen that the above information includes both continuous and discrete information. The continuous information includes: age; the number of local users and the number of non-local users in a city; the average daily order volume of services such as hotels, travel, and transportation; the number of times, and the proportion of times, that the user browsed or was located in the candidate city within a specified historical time period; and the maximum time window in which the user appeared in the candidate city. The discrete information includes: gender, occupation, consumption level, income level, whether the user is a travel enthusiast, city tier, whether the city is a tourist city, and whether the candidate city is the user's home city.

In practical applications, discrete features in the above information can be represented by discrete numerical values. For example, the user's gender may be encoded as 1 for male and 2 for female, and the user's income level as 1 for low income, 2 for medium income, 3 for high income, and so on.
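A minimal sketch of this numeric encoding is given below. The code tables follow the two examples in the text (gender and income level); the function name `encode_user` and the choice of three features are assumptions made for illustration.

```python
# Code tables taken from the examples above.
GENDER_CODES = {"male": 1, "female": 2}
INCOME_CODES = {"low": 1, "medium": 2, "high": 3}

def encode_user(gender, income_level, age):
    """Turn raw user attributes into numeric model inputs (hypothetical helper).

    Discrete attributes are mapped to discrete codes; the continuous
    attribute (age) is passed through as a float.
    """
    return [GENDER_CODES[gender], INCOME_CODES[income_level], float(age)]

encoded = encode_user("female", "high", 30)
```

An unknown category raises a `KeyError` in this sketch; a real system would need an explicit policy for missing or unseen values.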

It can be understood that, for one user and one candidate city, each item of the user's feature information, each item of the candidate city's feature information, and each item of the user's behavior information in the candidate city is input to the city probability model as its own variable, yielding the fitting probability of that user and that candidate city. Repeating this for a plurality of users and a plurality of candidate cities yields the fitting probability of each user and each candidate city.

Step 102, determining a first probability threshold according to the fitting probability of the user and each candidate city.

In the embodiment of the invention, the fitting probabilities are sorted, and the two adjacent probability values with the largest difference between them are used as reference values for determining the first probability threshold.

Specifically, the average of these two adjacent probability values may be used as the first probability threshold, or another weighted average of the two adjacent probability values may be used.

Step 103, determining the resident city of the user from the candidate cities according to the first probability threshold.

Specifically, a candidate city whose fitting probability is greater than the first probability threshold is taken as a resident city, and a candidate city whose fitting probability is less than the first probability threshold is taken as a non-resident city. A user may therefore have more than one resident city.

Allowing multiple resident cities better matches practical applications, in which a small portion of users are active in several cities. For example, a user may travel frequently on business among large cities such as Guangzhou, Shanghai, and Beijing over a long period, or a user may live in a suburban county of Langfang, Hebei, but work in Beijing.

After the user's resident city is determined, personalized recommendation can be performed to improve the recommendation success rate. For example, when a user accesses the Meituan platform in a resident city, best-selling goods purchased by local users can be recommended, while popular local attractions, local specialties and the like are not recommended; when a user accesses the Meituan platform in a non-resident city, best-selling goods purchased by non-local users, local travel products, train tickets between the resident city and that city, and the like can be recommended.

In summary, an embodiment of the present invention provides a method for determining a resident city, where the method includes: according to the feature information of the user, the feature information of each candidate city and the behavior information of the user in each candidate city, the fitting probability of the user and each candidate city is obtained through city probability model prediction; determining a first probability threshold according to the fitting probability of the user and each candidate city; determining a city of residence of the user from the candidate cities according to the first probability threshold. The method and the device solve the problems that in the prior art, according to city classification statistics, the consumed time is long, and the accuracy of determining a resident city for a user who is active among a plurality of cities is poor, can determine the resident city of the user through the user, the feature information of the city and the behavior information of the user in the city, simplify the calculation process and improve the accuracy.

Example two

This embodiment describes an optional resident city determination method at the level of the system architecture.

Referring to fig. 2, a flowchart illustrating specific steps of another resident city determination method is shown.

Step 201, training based on an annotated data sample set to obtain a city probability model, where each annotated data sample in the annotated data sample set at least includes: the characteristic information of the user, the characteristic information of the candidate city and the behavior information of the user in the candidate city.

Each data sample in the labeled data sample set is labeled with whether the corresponding city is a resident city of the corresponding user. In practical applications, a field may be added to indicate whether the data sample is a resident city sample or a non-resident city sample. For example, a sample may be labeled with a field ResidentCity: when ResidentCity is 1, the sample is a resident city sample; when ResidentCity is 0, the sample is a non-resident city sample.
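One possible in-memory layout for such a labeled sample is sketched below. The class name `LabeledSample`, the field names, and the helper `split_by_label` are hypothetical; only the 1/0 meaning of the label field comes from the text.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class LabeledSample:
    user_id: str
    city: str
    features: List[float]  # user features + city features + behavior features
    resident_city: int     # label field: 1 = resident city sample, 0 = non-resident

def split_by_label(samples):
    """Separate resident city samples from non-resident city samples."""
    resident = [s for s in samples if s.resident_city == 1]
    non_resident = [s for s in samples if s.resident_city == 0]
    return resident, non_resident

samples = [
    LabeledSample("u1", "Beijing", [1.0, 0.8], 1),
    LabeledSample("u1", "Shanghai", [1.0, 0.1], 0),
]
resident, non_resident = split_by_label(samples)
```

Each sample pairs one user with one candidate city, so a user with several candidate cities contributes several samples.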

The labeled data sample set can be obtained through telephone follow-ups and questionnaires conducted by customer service personnel with users of hotel clients, the Meituan platform, and similar platforms.

The feature information of the user can be obtained by analyzing the user's usage log of the application. For example, the personal information entered when the user registers an account on the Meituan platform, including gender, age, occupation, consumption level and income level, can be used as feature information of the user, and whether the user is a travel enthusiast can be inferred from the user's usage log: when most of the information the user pays attention to concerns tourist attractions, hotels and the like, the user can be determined to be a travel enthusiast; otherwise, the user is not. It can be understood that, in practical applications, an option asking whether the user likes to travel may also be offered at registration, so that users who select it are treated as travel enthusiasts and users who do not are not.
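The log-based rule can be sketched as below. The category names in `TRAVEL_CATEGORIES` and the reading of "mostly" as "more than half of the viewed items" are assumptions of this sketch; the text does not fix either.

```python
# Hypothetical category labels for log entries that count as travel-related.
TRAVEL_CATEGORIES = {"tourist_attraction", "hotel"}

def is_travel_enthusiast(viewed_categories):
    """Return True when most viewed items are travel-related.

    viewed_categories: one category label per item the user paid attention to.
    "Most" is interpreted here as strictly more than half (an assumption).
    """
    if not viewed_categories:
        return False
    travel = sum(1 for c in viewed_categories if c in TRAVEL_CATEGORIES)
    return travel > len(viewed_categories) / 2

flag = is_travel_enthusiast(["hotel", "tourist_attraction", "food"])
```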

The feature information of the candidate city may be obtained from a database. For example, the Meituan platform may maintain basic data for each city.

The behavior information of the user in the candidate city can be obtained by analyzing the access log of the user to the application, for example, the number and the proportion of browsing and locating the candidate city by the user in a specified historical time period, the maximum time window of the user appearing in the candidate city, and whether the candidate city is the home city of the user can be determined through the log.
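A sketch of deriving these behavior features from an access log follows. The log format (a list of `(date, city)` location records), the function name `behavior_features`, and the reading of "maximum time window" as the longest run of consecutive days in the city are all assumptions made for illustration.

```python
from datetime import date, timedelta

def behavior_features(log, city, home_city):
    """Compute behavior features of one user for one candidate city.

    log: list of (date, city) location records for the historical period.
    """
    total = len(log)
    count = sum(1 for _, c in log if c == city)
    proportion = count / total if total else 0.0

    # Longest run of consecutive calendar days on which the user appeared
    # in the city (assumed meaning of "maximum time window").
    longest = run = 0
    prev = None
    for d in sorted({d for d, c in log if c == city}):
        run = run + 1 if prev is not None and d - prev == timedelta(days=1) else 1
        longest = max(longest, run)
        prev = d

    return {
        "count": count,
        "proportion": proportion,
        "max_window_days": longest,
        "is_home_city": int(city == home_city),
    }

log = [
    (date(2018, 1, 1), "Beijing"),
    (date(2018, 1, 2), "Beijing"),
    (date(2018, 1, 3), "Beijing"),
    (date(2018, 1, 4), "Shanghai"),
    (date(2018, 1, 6), "Beijing"),
]
feats = behavior_features(log, "Beijing", "Beijing")
```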

In the embodiment of the present invention, a logistic regression model or a decision tree model may be used for training, so that the parameter set in the model is the optimal parameter set.

Optionally, in another embodiment of the present invention, step 201 includes sub-steps 2011 to 2014:

Sub-step 2011, initializing a parameter set of the city probability model.

After the probability model is selected, the parameter set of the probability model is initialized. For example, for a logistic regression model, the model formula is as follows:

f(φ(u,c))＝1/(1+exp(-(w_1·x_1+w_2·x_2+…+w_N·x_N+b))) (1)

where u is a user and c is a candidate city;

N is the number of non-constant parameters and is determined by the number of items in the feature information of the user, the feature information of the candidate city, and the behavior information of the user in the candidate city. For example, if the feature information of the user includes six items (gender, age, occupation, consumption level, income level, and whether the user is a travel enthusiast), the feature information of the city includes five items (city tier, whether the city is a tourist city, the number of local users in the city, the number of non-local users in the city, and the average daily order volume of hotel, travel, and transportation services), and the behavior information of the user in the candidate city includes three items (the number and proportion of times the user browsed or was located in the candidate city within a specified historical time period, the maximum time window in which the user appeared in the candidate city, and whether the candidate city is the user's home city), then there are 14 non-constant parameters;

x_i is the value of the i-th item in the combined feature information consisting of the feature information of user u, the feature information of candidate city c, and the behavior information of user u in candidate city c;

w_i is the parameter of x_i, and b is a constant parameter.

It will be appreciated that training amounts to determining the values of w_i and b.
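The logistic regression model described here can be sketched as below. The formula image is missing from the extracted text, so the standard logistic form (a sigmoid applied to the weighted sum of the x_i plus b) is assumed; the function name `fitting_probability` is hypothetical.

```python
import math

def fitting_probability(x, w, b):
    """Fitting probability of one user and one candidate city.

    x: feature values x_1..x_N (user, city, and behavior features combined)
    w: parameters w_1..w_N, one per feature
    b: constant parameter
    Assumes the standard logistic regression form.
    """
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Numerically stable sigmoid: never exponentiates a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# With all-zero parameters every city gets probability 0.5.
p = fitting_probability([0.3] * 14, [0.0] * 14, 0.0)
```

With N = 14 as in the example above, `w` holds 14 values and `b` is the additional constant parameter.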

Specifically, the parameter set may be initialized based on empirical values, or may be initialized to other values through analysis. It will be appreciated that when the values of the initialization parameter set are inappropriate, training time will be increased; training time will be reduced when the value of the initialization parameter set is close to the optimal parameter set.

Sub-step 2012, for each labeled data sample in the labeled data sample set, inputting the feature information of the user, the feature information of the candidate city, and the behavior information of the user in the candidate city into the preset city probability model to obtain the fitting probability of the user and the corresponding candidate city.

Specifically, for the user u and the candidate city c, the feature information of the user u, the feature information of the candidate city c, and the behavior information of the user u in the candidate city c are input into the probability model according to the specified sequence, and the fitting probability of the user u and the candidate city c is obtained through calculation.

It can be understood that the order of the feature information of the user u, the feature information of the candidate city c, and the behavior information of the user u in the candidate city c may be set according to an actual application scenario, which is not limited in the embodiment of the present invention.

In practical applications, the labeled data sample set covers a large number of users, each with a plurality of candidate cities, and each labeled data sample pairs one user with one of that user's candidate cities. Therefore, the fitting probabilities of a large number of users with their respective candidate cities can be obtained from the labeled data sample set.

Sub-step 2013, determining a loss value according to the fitting probability of the user and the corresponding candidate city obtained for each labeled data sample.

Compared with the loss function of a conventional model, such as the cross-entropy loss function of a logistic regression model, the loss function of the embodiment of the invention not only ranks the fitting probabilities of a user and the cities, but also maximizes the probability margin between resident cities and non-resident cities, so that a user's fitting probabilities for resident cities are more clearly separated from those for non-resident cities.

Optionally, in another embodiment of the present invention, sub-step 2013 includes sub-steps 20131 to 20133:

Sub-step 20131, for each user in the labeled data sample set, subtracting the fitting probability of the user and each resident city from the fitting probability of the user and each non-resident city, and adding a preset protection value, to obtain the first difference values of the user.

In the embodiment of the present invention, for all users in the labeled data sample set, based on all labeled data samples of each user, each first difference value of the user is calculated according to the fitting probability of the user to each non-resident city and the fitting probability of each resident city.

Specifically, for user u, resident city c', non-resident city c, the calculation formula of the first difference value is as follows:

M₁＝f(φ(u,c))-f(φ(u,c'))+ε (2)

where f(φ(u,c)) is the fitting probability of user u and the non-resident city c, and f(φ(u,c')) is the fitting probability of user u and the resident city c'. In practical applications, the logistic regression model formula shown in formula (1) may be used, and other model formulas may also be used.

ε is a preset protection value. It affects the training result and can be adjusted according to the prediction results during training.

Sub-step 20132, for each user, taking the maximum of each first difference value of the user and zero to obtain the second difference values, and summing the second difference values to obtain the third difference value of the user.

Specifically, for user u, resident city c', non-resident city c, the calculation formula of the second difference value is as follows:

M₂＝max(0,M₁)＝max(0,f(φ(u,c))-f(φ(u,c'))+ε) (3)

Then, the calculation formula of the third difference value is as follows:

M₃＝Σ_{c'} Σ_{c} max(0,f(φ(u,c))-f(φ(u,c'))+ε) (4)

where C_u is the candidate city set of user u, which is divided into resident cities and non-resident cities; the outer sum runs over the resident cities c' in C_u and the inner sum over the non-resident cities c in C_u. It is understood that there may be a plurality of resident cities and a plurality of non-resident cities.

Sub-step 20133, summing the third difference values of all users to obtain the loss value.

Specifically, the calculation formula of the loss value is as follows:

Loss＝Σ_{u∈U} M₃ (5)

where M₃ is the third difference value of user u and U is the set of all users.
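The loss computation of sub-steps 20131 to 20133 can be sketched as follows. The nested-dictionary input format and the function name `ranking_loss` are assumptions; the arithmetic follows formulas (2) and (3) and the summations described in the text.

```python
def ranking_loss(fit_probs, labels, eps=0.1):
    """Sum over users of the hinge terms max(0, p_non - p_res + eps).

    fit_probs: {user: {city: fitting probability}}
    labels:    {user: {city: 1 for resident city, 0 for non-resident city}}
    eps:       preset protection value
    """
    loss = 0.0
    for user, probs in fit_probs.items():
        resident = [c for c in probs if labels[user][c] == 1]
        non_resident = [c for c in probs if labels[user][c] == 0]
        third = 0.0                                   # third difference value
        for c_res in resident:
            for c_non in non_resident:
                first = probs[c_non] - probs[c_res] + eps  # formula (2)
                third += max(0.0, first)                   # formula (3)
        loss += third
    return loss

fit_probs = {"u1": {"A": 0.9, "B": 0.2, "C": 0.85}}
labels = {"u1": {"A": 1, "B": 0, "C": 0}}
loss = ranking_loss(fit_probs, labels, eps=0.1)
```

In this example only the pair (A, C) contributes, because city C's probability is within the protection value of the resident city A; the pair (A, B) is already separated by more than ε and contributes zero.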

Sub-step 2014, if the loss value does not satisfy the preset condition, adjusting the parameter set until the loss value satisfies the preset condition.

Specifically, when the loss value is less than or equal to a preset value, the loss value satisfies the preset condition, training ends, and the corresponding parameter set is the target parameter set; when the loss value is greater than the preset value, the loss value does not satisfy the preset condition, and the parameter set is adjusted and training continues until the loss value satisfies the preset condition.

It can be understood that the preset value may be set according to an actual application scenario, and the embodiment of the present invention does not limit the preset value. The smaller the preset value is, the more accurate the training result is, and the longer the training time is; the larger the preset value is, the coarser the training result is, and the shorter the training time is.
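The stopping rule of sub-step 2014 can be sketched with a small training loop. The text does not say how the parameter set is adjusted; plain (sub)gradient descent is assumed here, along with a logistic model and an iteration cap. All names (`train`, `loss_and_grad`, `lr`, `max_iters`) and the toy data are hypothetical.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(x, w, b):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss_and_grad(pairs, w, b, eps):
    """pairs: (resident features, non-resident features) of the same user."""
    loss, grad_w, grad_b = 0.0, [0.0] * len(w), 0.0
    for x_res, x_non in pairs:
        p_res, p_non = predict(x_res, w, b), predict(x_non, w, b)
        first = p_non - p_res + eps
        if first > 0:                      # only violated pairs contribute
            loss += first
            d_res, d_non = p_res * (1 - p_res), p_non * (1 - p_non)
            for i in range(len(w)):
                grad_w[i] += d_non * x_non[i] - d_res * x_res[i]
            grad_b += d_non - d_res
    return loss, grad_w, grad_b

def train(pairs, n_features, eps=0.1, preset_value=1e-3, lr=0.5, max_iters=10000):
    w, b = [0.0] * n_features, 0.0         # initial parameter set
    loss, gw, gb = loss_and_grad(pairs, w, b, eps)
    iters = 0
    while loss > preset_value and iters < max_iters:  # preset condition not met
        w = [wi - lr * gi for wi, gi in zip(w, gw)]   # adjust the parameter set
        b -= lr * gb
        loss, gw, gb = loss_and_grad(pairs, w, b, eps)
        iters += 1
    return w, b, loss

# Toy data: one feature, the share of days the user was located in the city.
pairs = [([0.8], [0.1]), ([0.7], [0.2]), ([0.9], [0.05])]
w, b, final_loss = train(pairs, n_features=1)
```

A smaller `preset_value` makes the loop run longer before stopping, which matches the trade-off between accuracy and training time described above.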

Step 202, according to the feature information of the user, the feature information of each candidate city, and the behavior information of the user in each candidate city, the fitting probability of the user and each candidate city is obtained through city probability model prediction.

This step can refer to the detailed description of step 101, and is not described herein again.

Step 203, sorting the fitting probabilities of the user and the candidate cities.

Specifically, they may be sorted in descending order or in ascending order.

Step 204, calculating the difference between each pair of adjacent probabilities.

Specifically, for descending order, the latter probability is subtracted from the former probability; for ascending order, the former probability is subtracted from the latter probability.

Taking descending order as an example, the difference M_i between the i-th probability and the (i+1)-th probability can be calculated according to the following formula:

M_i＝P_i-P_{i+1} (6)

where P_i is the i-th probability and P_{i+1} is the (i+1)-th probability.

It will be appreciated that in practical applications, the difference may also be taken as an absolute value to ensure that the obtained difference is a positive value.

Step 205, calculating a weighted average of the two adjacent probabilities with the largest difference to obtain a first probability threshold.

Based on formula (6), the two adjacent probabilities with the largest difference, P_I and P_{I+1}, are identified; the first probability threshold P_s can then be calculated according to the following formula:

P_s＝C₁·P_I+C₂·P_{I+1} (7)

where C₁ is the weighting parameter of probability P_I, C₂ is the weighting parameter of probability P_{I+1}, C₁+C₂＝1, C₁≠0, and C₂≠0. In particular, when C₁＝C₂＝0.5, the first probability threshold is the average of P_I and P_{I+1}.

Step 206, for each candidate city, if the fitting probability between the user and the candidate city is greater than or equal to the first probability threshold, the candidate city is the resident city of the user.

It will be appreciated that, for the first probability threshold calculated by formula (7), the candidate cities corresponding to probability P_I and to the probabilities ranked before P_I are the user's resident cities, and the candidate cities corresponding to probability P_{I+1} and to the probabilities ranked after P_{I+1} are the user's non-resident cities.
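Steps 203 to 206 can be sketched together as follows, using descending order. The function names and the restriction of C₁ to the open interval (0, 1), which keeps the threshold strictly between the two adjacent probabilities, are assumptions of this sketch; the text only requires C₁+C₂＝1 with both nonzero.

```python
def first_probability_threshold(probs, c1=0.5):
    """Weighted average of the two adjacent probabilities with the largest gap."""
    if len(probs) < 2:
        raise ValueError("need at least two candidate cities")
    if not 0.0 < c1 < 1.0:
        raise ValueError("c1 must lie strictly between 0 and 1")
    c2 = 1.0 - c1
    ranked = sorted(probs, reverse=True)                  # step 203
    diffs = [ranked[i] - ranked[i + 1]                    # step 204, formula (6)
             for i in range(len(ranked) - 1)]
    big = diffs.index(max(diffs))                         # position I of the largest gap
    return c1 * ranked[big] + c2 * ranked[big + 1]        # step 205, formula (7)

def resident_cities(fit_probs, c1=0.5):
    """fit_probs: {city: fitting probability} for one user."""
    threshold = first_probability_threshold(list(fit_probs.values()), c1)
    return [c for c, p in fit_probs.items() if p >= threshold]  # step 206

fit_probs = {"Beijing": 0.9, "Langfang": 0.8, "Shanghai": 0.3, "Guangzhou": 0.1}
cities = resident_cities(fit_probs)
```

In this example the largest gap lies between 0.8 and 0.3, so both Beijing and Langfang are returned, which is the multi-city outcome a pure "most stay days" rule cannot produce.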

Example three

Referring to fig. 3, there is shown a block diagram of a resident city determination device, as follows.

The probability prediction module 301 is configured to obtain a fitting probability of the user and each candidate city through city probability model prediction according to the feature information of the user, the feature information of each candidate city, and the behavior information of the user in each candidate city.

A first probability threshold determination module 302, configured to determine a first probability threshold according to the fitting probabilities of the user and the candidate cities.

A resident city determination module 303, configured to determine a resident city of the user from the candidate cities according to the first probability threshold.

In summary, an embodiment of the present invention provides a resident city determination device, including: a probability prediction module, configured to obtain, through city probability model prediction, the fitting probability of the user and each candidate city according to the feature information of the user, the feature information of each candidate city and the behavior information of the user in each candidate city; a first probability threshold determination module, configured to determine a first probability threshold according to the fitting probability of the user and each candidate city; and a resident city determination module, configured to determine a resident city of the user from the candidate cities according to the first probability threshold. This solves the prior-art problems that compiling statistics city by city is time-consuming and that the resident city is determined with poor accuracy for users who are active in multiple cities: the resident city of the user can be determined from the feature information of the user and the cities and the behavior information of the user in the cities, which simplifies the calculation process and improves accuracy.

Example four

Referring to fig. 4, there is shown a block diagram of another resident city determination device, as follows.

A probabilistic model training module 401, configured to obtain a city probabilistic model based on labeled data sample set training, where each labeled data sample in the labeled data sample set at least includes: the characteristic information of the user, the characteristic information of the candidate city and the behavior information of the user in the candidate city.

A probability prediction module 402, configured to obtain, through city probability model prediction, the fitting probability of the user and each candidate city according to the feature information of the user, the feature information of each candidate city, and the behavior information of the user in each candidate city.

A first probability threshold determining module 403, configured to determine a first probability threshold according to the fitting probability of the user and each candidate city. Optionally, in another embodiment of the present invention, the first probability threshold determining module 403 includes:

a ranking submodule 4031, configured to rank the fitting probabilities of the user and the candidate cities.

A probability difference calculation submodule 4032, configured to calculate the difference between each pair of adjacent probabilities.

A first probability threshold determination submodule 4033, configured to calculate a weighted average of the two adjacent probabilities with the largest difference to obtain the first probability threshold.

A resident city determination module 404, configured to determine a resident city of the user from the candidate cities according to the first probability threshold. Optionally, in another embodiment of the present invention, the resident city determination module 404 includes:

a resident city determination submodule 4041, configured to, for each candidate city, determine that the candidate city is a resident city of the user if the fitting probability of the user and the candidate city is greater than or equal to the first probability threshold.

Optionally, in another embodiment of the present invention, the probabilistic model training module 401 includes:

A parameter set initialization submodule, configured to initialize the parameter set of the city probability model.

A probability calculation submodule, configured to, for each labeled data sample in the labeled data sample set, input the feature information of the user, the feature information of the candidate city, and the behavior information of the user in the candidate city into a preset city probability model to obtain the fitting probability of the user and the corresponding candidate city.

A loss value determining submodule, configured to determine a loss value according to the fitting probability of the user and the corresponding candidate city determined from each labeled data sample.

A continued training submodule, configured to, if the loss value does not meet a preset condition, adjust the parameter set until the loss value meets the preset condition.
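The training flow of these four submodules can be sketched as a generic loop. Everything specific in the sketch is an assumption made for illustration: the logistic model form in `predict`, the preset condition (loss at or below `max_loss`), the finite-difference gradient step used to adjust the parameter set, and all names. The loss is passed in as a callable, so any loss definition can be used.

```python
import math
import random


def predict(params, features):
    """Assumed model form: logistic function of a weighted feature sum."""
    z = sum(w * x for w, x in zip(params, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, loss_fn, n_features, lr=0.1, max_loss=0.05,
          max_iters=1000, eps=1e-4, seed=0):
    """Adjust the parameter set until loss_fn(params, samples) <= max_loss."""
    rng = random.Random(seed)
    # Parameter set initialization: small random values (an assumed choice).
    params = [rng.uniform(-0.01, 0.01) for _ in range(n_features)]
    for _ in range(max_iters):
        # Probability calculation and loss determination happen inside loss_fn.
        loss = loss_fn(params, samples)
        if loss <= max_loss:  # assumed form of the preset condition
            break
        # Continued training: forward-difference gradient, then a descent step.
        grad = []
        for i in range(n_features):
            bumped = list(params)
            bumped[i] += eps
            grad.append((loss_fn(bumped, samples) - loss) / eps)
        params = [w - lr * g for w, g in zip(params, grad)]
    return params
```

A finite-difference gradient keeps the sketch independent of the loss definition; a practical implementation would more likely use analytic gradients or an automatic differentiation library.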

Optionally, in another embodiment of the present invention, the loss value determining sub-module includes:

A first difference calculation unit, configured to, for each user in the labeled data sample set, subtract the fitting probability of the user and each resident city from the fitting probability of the user and each non-resident city and add a preset protection value, to obtain each first difference of the user.

A second difference calculation unit, configured to, for each user, take the maximum of each first difference of the user and zero to obtain a second difference, and sum the second differences to obtain a third difference of the user.

A loss value determination unit, configured to sum the third differences of all users to obtain the loss value.
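The three units can be sketched as a margin-based (hinge-style) loss. The sketch assumes that each first difference pairs a non-resident candidate city with a resident city of the same user, which is the reading under which the protection value acts as a margin. The function name `margin_loss`, the data layout, and the default `margin` value are illustrative assumptions.

```python
def margin_loss(fit_probs, resident, margin=0.1):
    """Sum of max(p_nonresident - p_resident + margin, 0) over all users.

    fit_probs: {user: {city: fitting probability}} (assumed layout)
    resident:  {user: set of labeled resident cities} (assumed layout)
    margin:    the preset protection value (0.1 is an arbitrary example)
    """
    total = 0.0
    for user, probs in fit_probs.items():
        res = [p for c, p in probs.items() if c in resident[user]]
        non = [p for c, p in probs.items() if c not in resident[user]]
        # First difference: non-resident minus resident plus the margin.
        # Second difference: max(first difference, 0).
        # Third difference: sum of second differences for this user.
        third = sum(max(pn - pr + margin, 0.0) for pn in non for pr in res)
        total += third
    return total
```

Under this reading the loss is zero only when every resident city scores at least `margin` higher than every non-resident city of the same user.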

An embodiment of the present invention further provides an electronic device, including: a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the processor implements the resident city determination method of the foregoing embodiments when executing the program.

Embodiments of the present invention also provide a readable storage medium, and when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the resident city determination method of the foregoing embodiments.

Because the device embodiment is basically similar to the method embodiment, it is described only briefly; for relevant details, refer to the corresponding description of the method embodiment.

The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.

In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.

Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.

The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components of a resident city determination device in accordance with embodiments of the present invention. The present invention may also be embodied as an apparatus or device program for carrying out a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering. These words may be interpreted as names.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A resident city determination method applied to an electronic device, the method comprising:

obtaining, through prediction by a city probability model, the fitting probability of a target user and each candidate city according to the feature information of the target user, the feature information of each candidate city and the behavior information of the target user in each candidate city;

determining a first probability threshold according to the fitting probability of the target user and each candidate city;

determining a resident city of the target user from the candidate cities according to the first probability threshold.

2. The method of claim 1, wherein the step of determining a resident city of the target user from the candidate cities according to the first probability threshold comprises:

for each candidate city, if the fitting probability between the target user and the candidate city is greater than or equal to the first probability threshold, the candidate city is a resident city of the target user.

3. The method of claim 1, wherein the step of determining a first probability threshold according to the fitting probability of the target user and each candidate city comprises:

sorting the fitting probabilities of the target user and each candidate city;

respectively calculating the difference between two adjacent probabilities;

and calculating the weighted average value of the two adjacent probabilities with the maximum difference value to obtain a first probability threshold value.

4. The method of claim 1, further comprising:

training based on a labeled data sample set to obtain a city probability model, wherein each labeled data sample in the labeled data sample set at least comprises: the feature information of the user, the feature information of the candidate city and the behavior information of the user in the candidate city.

5. The method of claim 4, wherein the step of training a city probability model based on the labeled data sample set comprises:

initializing a parameter set of a city probability model;

for each labeled data sample in the labeled data sample set, inputting the feature information of the user, the feature information of the candidate city and the behavior information of the user in the candidate city into a preset city probability model to obtain the fitting probability of the user and the corresponding candidate city;

determining a loss value according to the fitting probability of the user and the corresponding candidate city determined by each labeled data sample;

and if the loss value does not meet the preset condition, adjusting the parameter set until the loss value meets the preset condition.

6. The method of claim 5, wherein the step of determining the loss value according to the fitting probability of the user and the corresponding candidate city determined by each labeled data sample comprises:

for each user in the labeled data sample set, subtracting the fitting probability of the user and each resident city from the fitting probability of the user and each non-resident city, and adding a preset protection value to obtain each first difference value of the user;

for each user, taking the maximum value of each first difference value of the user and zero to obtain a second difference value, and counting the sum of each second difference value to obtain a third difference value of the user;

and counting the sum of the third difference values of all the users to obtain a loss value.

7. An apparatus for determining a resident city, applied to an electronic device, the apparatus comprising:

8. The apparatus of claim 7, wherein the resident city determination module comprises:

a resident city determination submodule, configured to determine, for each candidate city, that the candidate city is a resident city of the user if the fitting probability of the user and the candidate city is greater than or equal to the first probability threshold.

9. An electronic device, comprising:

a processor, a memory, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the resident city determination method according to any one of claims 1 to 6.

10. A readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the resident city determination method according to any one of claims 1 to 6.