CN108305099B - Method and device for determining purchasing user - Google Patents

Method and device for determining purchasing user

Info

Publication number
CN108305099B
Authority
CN
China
Prior art keywords
users
user
determining
purchasing
sample set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810050530.5A
Other languages
Chinese (zh)
Other versions
CN108305099A (en)
Inventor
董泽伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced Nova Technology Singapore Holdings Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd
Priority to CN201810050530.5A
Publication of CN108305099A
Application granted
Publication of CN108305099B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0203 Market surveys; Market polls

Landscapes

  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments disclosed herein provide a method of determining a purchasing user. The method comprises: determining a plurality of users located abroad at a predetermined time, and determining marked users and unmarked users according to the plurality of users and a pre-stored sample set of purchasing users; then determining, for each of the marked users and the unmarked users, feature information related to outbound travel; and determining, according to the feature information, the purchasing users included in the plurality of users through an algorithm based on semi-supervised learning.

Description

Method and device for determining purchasing user
Technical Field
Embodiments disclosed in the present specification relate to the field of internet technologies, and in particular, to a method and an apparatus for determining a purchasing user.
Background
A purchasing service is one in which a purchasing merchant, or a person who frequently travels in and out of the country, buys commodities on behalf of consumers. The purchasing industry exists for several reasons: the consumer cannot buy the desired product locally; the product costs more locally than elsewhere; or the consumer wants to buy the product from its place of origin to be sure of its quality. Purchasing agents travel abroad (for example, to Australia) periodically to buy commodities that consumers have requested (for example, Australian milk powder), then ship the commodities home by express delivery or carry them home in person, earning the price difference while meeting the consumers' needs.
Purchasing agents usually spend large amounts when buying goods, but no method currently exists for identifying them, so a service platform (such as a payment platform) cannot provide personalized services (such as pushing coupons for foreign merchants) to the purchasing agents among its users. A reliable method is therefore needed to identify purchasing users so that personalized services can be provided to them.
Disclosure of Invention
This specification describes a method and a device for determining purchasing users, which determine the purchasing users located abroad at a predetermined time so that personalized services can be provided to them.
In a first aspect, a method of determining a purchasing user is provided. The method comprises the following steps:
determining a plurality of users who are located abroad at a predetermined time;
determining marked users and unmarked users according to the plurality of users and a pre-stored sample set of purchasing users;
determining feature information related to outbound travel corresponding to each of the marked users and the unmarked users;
determining, according to the feature information, a purchasing user included in the plurality of users through an algorithm based on semi-supervised learning.
In one possible embodiment, the determining a plurality of users who are located abroad at a predetermined time includes:
acquiring the location information of a user at the predetermined time, and determining whether the user is located abroad according to the location information.
In one possible embodiment, the sample set of purchasing users is determined based on at least the location information and transaction records of the sample users within a preset time period.
In one possible implementation, the sample set of purchasing users is also determined based on the class of service provided by the sample user.
In one possible embodiment, the determining the marked users and the unmarked users includes:
taking the sample users in the purchasing user sample set as marked users and taking the plurality of users as unmarked users; or,
taking the sample users in the purchasing user sample set as marked users, and taking those of the plurality of users that are not in the purchasing user sample set as unmarked users; or,
taking those of the plurality of users that are in the purchasing user sample set as marked users, and taking the remaining users of the plurality of users as unmarked users.
In one possible embodiment, the characteristic information includes at least one of an average length of a single outbound, an average interval length of an outbound, a number of outbound payments, an average payment amount per transaction, and a frequency of transactions with a merchant.
In one possible embodiment, the determining, by a semi-supervised learning based algorithm, a purchasing user included in the plurality of users includes:
determining the similarity between any two users of the marked user and the unmarked user;
determining the propagation probability between two corresponding users according to the similarity;
and determining the probability that each user in the unmarked users belongs to the purchasing user according to the propagation probability.
In a possible implementation manner, the feature information is a plurality of feature information, and the determining a similarity between any two users of the labeled user and the unlabeled user includes:
determining the characteristic value of the user by adopting a weighted summation mode according to the characteristic score and the weight corresponding to each characteristic information in the plurality of characteristic information;
and determining the similarity between the corresponding two users according to the characteristic values of any two users.
In one possible embodiment, the determining, by the algorithm of semi-supervised learning, a purchasing user included in the plurality of users further includes:
and according to the probability, taking the unmarked user corresponding to the probability not less than the preset threshold value as the purchasing user.
In a second aspect, an apparatus for determining a purchasing user is provided. The device includes:
a first determining unit, configured to determine a plurality of users who are located abroad at a predetermined time;
a second determining unit, configured to determine marked users and unmarked users according to the plurality of users and a pre-stored sample set of purchasing users;
a third determining unit, configured to determine feature information related to outbound travel corresponding to each of the marked users and the unmarked users;
a fourth determining unit, configured to determine, according to the feature information, a purchasing user included in the plurality of users through an algorithm based on semi-supervised learning.
In a possible implementation, the first determining unit is specifically configured to:
acquiring the location information of a user at the predetermined time, and determining whether the user is located abroad according to the location information.
In one possible embodiment, the sample set of purchasing users in the second determination unit is determined based on at least the position information and the transaction record of the sample user within a preset time period.
In a possible embodiment, the sample set of purchasing users in the second determination unit is further determined based on the service class provided by the sample user.
In a possible implementation manner, the second determining unit is specifically configured to:
taking the sample users in the purchasing user sample set as marked users and taking the plurality of users as unmarked users; or,
taking the sample users in the purchasing user sample set as marked users, and taking those of the plurality of users that are not in the purchasing user sample set as unmarked users; or,
taking those of the plurality of users that are in the purchasing user sample set as marked users, and taking the remaining users of the plurality of users as unmarked users.
In a possible embodiment, the characteristic information determined by the third determining unit includes at least one of an average length of a single outbound, an average interval length of an outbound, a number of outbound payments, an average payment amount per transaction, and a frequency of transactions with a merchant.
In a possible implementation manner, the fourth determining unit specifically includes:
the first determining subunit is configured to determine a similarity between any two users of the labeled user and the unlabeled user;
a second determining subunit, configured to determine, according to the similarity, a propagation probability between two corresponding users;
and the third determining subunit is used for determining the probability that each user in the unlabeled users belongs to the purchasing user according to the propagation probability.
In a possible implementation manner, the feature information determined by the third determining unit is a plurality of feature information, and the first determining subunit is specifically configured to:
determining the characteristic value of the user by adopting a weighted summation mode according to the characteristic score and the weight corresponding to each characteristic information in the plurality of characteristic information;
and determining the similarity between the corresponding two users according to the characteristic values of any two users.
In a possible implementation manner, the fourth determining unit specifically further includes:
and the processing subunit is used for taking the unmarked user corresponding to the probability not less than the preset threshold value as the purchasing user according to the probability.
In a third aspect, a computer-readable storage medium having a computer program stored thereon is provided. When executed in a computer, the computer program causes the computer to perform the method provided in any of the embodiments of the first aspect.
In a fourth aspect, a computing device is provided that includes a memory and a processor. The memory stores executable code, and the processor, when executing the executable code, implements the method provided by any of the embodiments of the first aspect.
In the method and apparatus for determining purchasing users provided in this specification, a plurality of users located abroad at a predetermined time are first determined, and marked users and unmarked users are determined from the plurality of users and a pre-stored sample set of purchasing users. Then, feature information related to outbound travel is determined for each of the marked users and the unmarked users, and the purchasing users among the plurality of users located abroad are determined through an algorithm based on semi-supervised learning according to the feature information, so that personalized services can be provided to the purchasing users.
Drawings
To illustrate the technical solutions of the embodiments disclosed in this specification more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments disclosed in this specification, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic application scenario diagram of a method for determining a purchasing user according to an embodiment disclosed in the present specification;
FIG. 2 is a flow chart of a method for determining purchasing users according to one embodiment disclosed herein;
FIG. 3 is a flowchart of a method for determining purchasing users among unlabeled users according to one embodiment disclosed herein;
FIG. 4 is a flow diagram of another method for determining purchasing users according to one embodiment disclosed herein;
FIG. 5 is a block diagram of an apparatus for identifying purchasing users according to an embodiment disclosed in the present specification.
Detailed Description
Embodiments disclosed in the present specification are described below with reference to the accompanying drawings.
Fig. 1 is a schematic application scenario diagram of a method for determining a purchasing user according to an embodiment disclosed in the present specification. The method for determining purchasing users disclosed in the embodiments of the present specification may be used when a service platform (e.g., a paypal application platform) and a merchant (e.g., a foreign merchant) want to run a marketing campaign (e.g., pushing coupons of a specific merchant) aimed at the purchasing users located in the merchant's region during a predetermined period (e.g., the Christmas period).
First, location information (for example, latitude and longitude information) may be acquired from users' terminals at a predetermined time (for example, a time chosen for Christmas, such as 00:00 on December 25, 2017), and a plurality of users located abroad at that time may be determined from the location information. Then, feature information related to outbound travel (for example, the interval between trips abroad, the average duration of each trip, and the number of overseas payments) is determined for each of these users and for each user in a pre-stored sample set of purchasing users (for example, the sample set may include users who have opened a purchasing shop on a purchasing platform, trade actively, and travel abroad regularly). Finally, based on the feature information corresponding to each user, an algorithm based on semi-supervised learning (e.g., a label propagation algorithm) is used to determine, from the sample users in the purchasing user sample set, the purchasing users included in the plurality of users located abroad.
In the method for determining purchasing users provided by the embodiments disclosed in the present specification, a plurality of users located abroad at a predetermined time are determined, feature information related to outbound travel is determined for each of these users and for each user in a pre-stored sample set of purchasing users, and the purchasing users among the plurality of users located abroad are determined from the feature information through an algorithm based on semi-supervised learning, so that personalized services can be provided to the purchasing users.
Fig. 2 is a flowchart of a method for determining purchasing users according to an embodiment disclosed in the present specification. The method may be executed by any device, system, or apparatus with processing capabilities, such as the server shown in Fig. 1. As shown in Fig. 2, the method specifically includes:
In step S210, a plurality of users who are located abroad at a predetermined time are determined.
Specifically, location information of a user at the predetermined time is acquired, and whether the user is located abroad is determined based on the location information.
In one embodiment, the predetermined time may be set manually according to actual needs. In one example, when purchasing users who are abroad during a holiday (e.g., Christmas or New Year's Day) are to be determined, the predetermined time may be set accordingly. For example, the predetermined time is set to 17:00 on December 24, 2017, or to 00:00 on January 1, 2018.
In one embodiment, the obtained location information may include: information collected from a user's terminal through a Location Based Service (LBS). The LBS includes various Positioning modes, such as Global Positioning System (GPS) Positioning, base station Positioning, Wireless Fidelity (WiFi) Positioning, and the like. Accordingly, the location information may include latitude and longitude data obtained through GPS positioning or base station positioning, or may include WiFi fingerprint data obtained through WiFi positioning.
In one embodiment, determining whether the user is located abroad based on the location information may include: acquiring the user's place-of-registration information (such as the name or location information of the country of registration), and determining whether the user is located abroad according to the place-of-registration information and the location information (such as longitude and latitude data or WiFi fingerprint data) of the user. In one embodiment, the user's place-of-registration information and location information may be input into a pre-established model, and whether the user is abroad is determined based on the output. Furthermore, for a user located abroad, the output may also include the country or city in which the user is located.
In this way, a plurality of users who are located abroad at the predetermined time can be determined in step S210. Further, the country or city in which each of the plurality of users is located may also be determined.
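A minimal sketch of step S210, assuming each location fix has already been resolved to a country code (for example by a reverse-geocoding service, which is not shown). The record layout and the names `LocationFix` and `users_abroad_at` are illustrative and are not taken from the specification.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LocationFix:
    user_id: str
    registered_country: str  # country of registration, e.g. "CN"
    located_country: str     # country resolved from the fix (assumed to be given)
    timestamp: datetime


def users_abroad_at(fixes, moment):
    """Return {user_id: country} for users whose fix at `moment` lies
    outside their country of registration."""
    abroad = {}
    for fix in fixes:
        if fix.timestamp == moment and fix.located_country != fix.registered_country:
            abroad[fix.user_id] = fix.located_country
    return abroad


moment = datetime(2017, 12, 25, 0, 0)
fixes = [
    LocationFix("u1", "CN", "AU", moment),
    LocationFix("u2", "CN", "CN", moment),
    LocationFix("u3", "CN", "KR", moment),
]
print(users_abroad_at(fixes, moment))  # {'u1': 'AU', 'u3': 'KR'}
```

The returned mapping also carries the country in which each user is located, mirroring the optional output described above.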
In step S220, marked users and unmarked users are determined according to the determined plurality of users and a pre-stored sample set of purchasing users.
Specifically, the sample users in the purchasing user sample set may be taken as marked users, and the plurality of users as unmarked users. Alternatively, the sample users in the purchasing user sample set may be taken as marked users, and those of the plurality of users that are not in the purchasing user sample set as unmarked users. As a further alternative, those of the plurality of users that are in the purchasing user sample set may be taken as marked users, and the remaining users of the plurality of users (that is, those not in the purchasing user sample set) as unmarked users.
It should be noted that the purchasing user sample set is determined in advance and stored by the system. The sample set may be determined based on the location information and transaction records of the sample users within a preset time period (e.g., the last year); alternatively, it may be determined based on the location information and transaction records of the sample users within the preset time period together with the service category provided by the users (e.g., a purchasing service, which may be provided by opening a purchasing shop).
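The three alternatives for splitting users into marked and unmarked groups reduce to set operations. The sketch below is illustrative only; the function name and the `scheme` numbering (1 to 3, in the order the alternatives are listed) are assumptions.

```python
def split_marked_unmarked(candidates, sample_set, scheme):
    """candidates: users abroad at the predetermined time;
    sample_set: pre-stored purchasing users.
    Returns (marked, unmarked) as two sets."""
    candidates, sample_set = set(candidates), set(sample_set)
    if scheme == 1:
        # All sample users are marked; every candidate is unmarked.
        return sample_set, candidates
    if scheme == 2:
        # All sample users are marked; candidates outside the sample set are unmarked.
        return sample_set, candidates - sample_set
    if scheme == 3:
        # Only candidates found in the sample set are marked; the rest are unmarked.
        marked = candidates & sample_set
        return marked, candidates - marked
    raise ValueError("scheme must be 1, 2 or 3")


candidates = {"a", "b", "c"}
sample_set = {"a", "x"}
for scheme in (1, 2, 3):
    print(scheme, split_marked_unmarked(candidates, sample_set, scheme))
```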
In one embodiment, the sample set of purchasing users may include individual purchasing users, and the individual purchasing users may be determined based on the users' location information and transaction records (e.g., payment records) over a preset time period (e.g., 2 years). In one example, first users whose collection records carry remarks explicitly marked with the word "purchasing" are first determined from the transaction records acquired within the preset time period. Then, in combination with the location information acquired within the preset time period, those first users whose frequency of such collection records (collection records whose remarks contain the word "purchasing") is higher than a first threshold (e.g., 45 per month against a threshold of 30 per month) and whose frequency of trips abroad is higher than a second threshold (e.g., two trips to Korea per month against a threshold of one trip abroad per month) are taken as individual purchasing users.
That is, a user who periodically travels to a certain overseas region or country and who, as the payee, receives a large number of transfers (e.g., measured as a monthly average) whose remarks are explicitly marked with the word "purchasing" may be defined as an individual purchasing user.
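The rule for individual purchasing users can be sketched as a two-threshold filter. This is an illustration under assumptions: the function name, the argument layout, and the keyword match on the remark text are hypothetical, and the default thresholds simply reuse the example values of 30 records per month and one trip abroad per month.

```python
def is_individual_purchasing_user(receipt_remarks, trips_abroad, months,
                                  keyword="purchasing",
                                  receipts_per_month_threshold=30,
                                  trips_per_month_threshold=1):
    """receipt_remarks: remark text of each collection record in the period;
    trips_abroad: number of trips abroad in the period;
    months: length of the period in months."""
    tagged = sum(1 for remark in receipt_remarks if keyword in remark)
    return (tagged / months > receipts_per_month_threshold
            and trips_abroad / months > trips_per_month_threshold)


# 540 tagged records over 12 months is 45 per month; 24 trips is 2 per month.
remarks = ["purchasing: milk powder"] * 540 + ["rent"] * 10
print(is_individual_purchasing_user(remarks, trips_abroad=24, months=12))  # True
print(is_individual_purchasing_user(remarks, trips_abroad=6, months=12))   # False
```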
In another embodiment, the purchasing user sample set may include online-store purchasing users, and the online-store purchasing users may be determined based on the users' location information and transaction records within a preset time period (e.g., 1 year) and the service category provided by the users (e.g., information about the online stores opened by the users).
In one example, the online-store purchasing users may be determined as follows. First, a list of users who have an online store (i.e., a list of store sellers) and information about the online stores (e.g., transaction volume and the place of origin of the products sold) are acquired. The user list may include users who have opened a store on a purchasing platform (e.g., global shopping) and/or users who have opened an online purchasing store (e.g., a store whose name includes the word "purchasing") on a general e-commerce platform (e.g., Taobao). Then, the places to which each user in the user list periodically (e.g., once every two weeks or once a month) travels abroad are determined from the user's location information, and the number of times (e.g., 40 times) that an overseas payment amount exceeds a first threshold (e.g., 5000 dollars) is determined from the user's transaction records. Finally, the users in the user list who satisfy both of the following conditions are taken as online-store purchasing users: the places to which the user periodically travels (e.g., Australia and Korea) include the place of origin (e.g., Australia) of the goods (e.g., milk powder) sold in the user's online store; and the number of times (e.g., 40 times) that the user makes a large transaction abroad (i.e., a transaction whose payment amount exceeds the first threshold) exceeds a second threshold (e.g., 30 times).
That is, an online-store purchasing user needs to satisfy the following conditions: the user has opened an online store that trades actively; the overseas regions or countries to which the user travels periodically include the place of origin of the brands sold in the store; and the user has made large transactions abroad many times.
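The two conditions for online-store purchasing users can be sketched as follows. The function name and argument layout are hypothetical; the default thresholds reuse the example values (a payment above 5000 counts as large, and more than 30 large payments are required).

```python
def is_online_store_purchasing_user(trip_destinations, product_origins,
                                    overseas_payment_amounts,
                                    amount_threshold=5000, count_threshold=30):
    """trip_destinations: places the store owner periodically travels to;
    product_origins: places of origin of the goods sold in the store;
    overseas_payment_amounts: amounts of the owner's payments made abroad."""
    # Condition 1: the periodic destinations include an origin of the goods sold.
    visits_origin = bool(set(trip_destinations) & set(product_origins))
    # Condition 2: the number of large overseas payments exceeds the threshold.
    large_payments = sum(1 for amount in overseas_payment_amounts
                         if amount > amount_threshold)
    return visits_origin and large_payments > count_threshold


print(is_online_store_purchasing_user({"Australia", "Korea"}, {"Australia"},
                                      [6000] * 40))  # True
print(is_online_store_purchasing_user({"Korea"}, {"Australia"},
                                      [6000] * 40))  # False
```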
In this way, a sample set of purchasing users may be determined in advance, and in step S220, a marked user and an unmarked user may be determined according to the sample set of purchasing users and the plurality of users determined in step S210.
Next, in step S230, feature information related to outbound travel is determined for each of the marked users and the unmarked users.
Specifically, the feature information may include periodicity features of the user's trips abroad and consumption features of the user's spending abroad. The periodicity features may include the average duration of a single trip abroad (e.g., 2 days), the average interval between trips abroad (e.g., 1 month), and the like, and the consumption features may include at least one of the number of overseas payments (e.g., 48), the average amount per payment (e.g., 20,000 yuan), and the frequency of transactions with a merchant (e.g., one transaction with a particular milk powder store on every trip).
In one embodiment, the feature information may be determined based on location information (e.g., latitude and longitude information) and payment information (e.g., payment amount, merchant name, product name) obtained within a predetermined period (e.g., the last year) in connection with the user's trips abroad. In one example, the periodicity features (e.g., the number of trips abroad, the average duration of a single trip, and the average interval between trips) may be determined from the location information over the predetermined period, and the consumption features (e.g., the number of overseas payments, the average amount per payment, and the frequency of transactions with merchants) may be determined from the location information and the payment information.
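As a rough sketch of step S230, the periodicity and consumption features can be derived from a list of trips and a list of overseas payments. The data layout, the function name, and the exact feature definitions are assumptions: the interval is measured from one return to the next departure, and the merchant frequency is the number of payments to the most frequent merchant per trip. The sketch assumes at least one trip.

```python
from collections import Counter
from datetime import date


def outbound_features(trips, payments):
    """trips: list of (departure, return) dates sorted by departure;
    payments: list of (merchant, amount) pairs for payments made abroad."""
    n = len(trips)
    avg_trip_days = sum((back - out).days for out, back in trips) / n
    gaps = [(trips[i + 1][0] - trips[i][1]).days for i in range(n - 1)]
    avg_gap_days = sum(gaps) / len(gaps) if gaps else 0.0
    amounts = [amount for _, amount in payments]
    avg_amount = sum(amounts) / len(amounts) if amounts else 0.0
    top_merchant_count = max(Counter(m for m, _ in payments).values(), default=0)
    return {
        "avg_trip_days": avg_trip_days,
        "avg_gap_days": avg_gap_days,
        "payment_count": len(amounts),
        "avg_payment_amount": avg_amount,
        "top_merchant_payments_per_trip": top_merchant_count / n,
    }


trips = [(date(2017, 1, 5), date(2017, 1, 7)), (date(2017, 2, 6), date(2017, 2, 8))]
payments = [("milk powder store", 20000), ("milk powder store", 18000), ("pharmacy", 4000)]
features = outbound_features(trips, payments)
print(features)
```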
After the plurality of users located abroad at the predetermined time are determined in step S210 and the feature information is determined in step S230, in step S240 the purchasing users included in the plurality of users are determined from the feature information through an algorithm based on semi-supervised learning.
Specifically, according to the characteristic information, the purchasing users included in the unmarked users are determined through an algorithm based on semi-supervised learning. And then, determining the purchasing users included in the plurality of users according to the purchasing users included in the unmarked users.
In one embodiment, determining the purchasing users included in the plurality of users according to the purchasing users included in the unmarked users may include the following. If the plurality of users are taken as the unmarked users in step S220, the purchasing users included in the plurality of users are the purchasing users included in the unmarked users. If those of the plurality of users that are not in the purchasing user sample set are taken as the unmarked users in step S220, the purchasing users included in the plurality of users comprise those of the plurality of users that are in the purchasing user sample set together with the purchasing users included in the unmarked users.
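The two cases for assembling the final result can be written as a small set computation. The function name and the boolean flag `all_candidates_unmarked` are hypothetical and serve only to illustrate the case distinction.

```python
def final_purchasing_users(candidates, sample_set, predicted_among_unmarked,
                           all_candidates_unmarked):
    """candidates: users abroad at the predetermined time;
    sample_set: pre-stored purchasing users;
    predicted_among_unmarked: unmarked users predicted to be purchasing users;
    all_candidates_unmarked: True if every candidate was treated as unmarked."""
    predicted = set(predicted_among_unmarked)
    if all_candidates_unmarked:
        # Every candidate went through prediction, so the prediction is the answer.
        return predicted
    # Candidates already in the sample set are known purchasing users.
    return (set(candidates) & set(sample_set)) | predicted


print(final_purchasing_users({"a", "b", "c"}, {"a", "x"}, {"b"}, False))
```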
It should be noted that the basic idea of semi-supervised learning is to use the label information of labeled samples to predict the label information of unlabeled samples. Accordingly, in the embodiments provided in this specification, the feature information of the marked users is used to predict, through an algorithm based on semi-supervised learning, whether each unmarked user is a purchasing user. Algorithms based on semi-supervised learning may include the Label Propagation Algorithm (LPA) and learning from positive and unlabeled samples (PU learning).
In the LPA algorithm, a complete graph can be built from the relationships between samples. The nodes of the graph comprise the labeled and unlabeled data, and each edge represents the similarity of the two nodes it connects. Node labels are propagated to other nodes according to similarity: the greater the similarity between two nodes, the more easily the label propagates. In the end, every unlabeled sample is assigned a label.
In one embodiment, determining the purchasing users included in the unmarked users through an algorithm based on semi-supervised learning may include determining them through the LPA algorithm, which specifically comprises the following steps:
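The propagation idea can be sketched in a few lines of plain Python. This is a minimal illustration rather than the specification's implementation: it assumes that similarities are row-normalised into propagation probabilities, that marked nodes are clamped to their known labels after every pass, and that the marked nodes include both a positive (1.0) and a negative (0.0) example. The function name `propagate_labels` and the toy weights are hypothetical.

```python
def propagate_labels(weights, labels, iterations=100):
    """weights: symmetric n x n similarity matrix (list of lists, positive entries).
    labels: 1.0 or 0.0 for marked nodes, None for unmarked nodes.
    Returns one score in [0, 1] per node."""
    n = len(weights)
    # Row-normalise similarities into propagation probabilities.
    probs = [[w / sum(row) for w in row] for row in weights]
    scores = [0.5 if y is None else y for y in labels]
    for _ in range(iterations):
        # Each node takes the probability-weighted average of all node scores.
        new_scores = [sum(probs[i][j] * scores[j] for j in range(n)) for i in range(n)]
        # Clamp marked nodes back to their known labels.
        scores = [new if y is None else y for new, y in zip(new_scores, labels)]
    return scores


weights = [
    [1.0, 0.9, 0.1, 0.05],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.05, 0.1, 0.8, 1.0],
]
labels = [1.0, None, None, 0.0]
scores = propagate_labels(weights, labels)
print([round(s, 3) for s in scores])
```

With these toy weights, node 1 (strongly tied to the positive node) ends with a score above 0.5 and node 2 (strongly tied to the negative node) ends below 0.5.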
Step S310, determining the similarity between any two users among the marked users and the unmarked users.
As described above, the feature information related to outbound travel has been determined for each user. In one embodiment, the feature information comprises a plurality of feature elements, which form a feature vector. Determining the similarity between two users may include determining the similarity between the feature vectors corresponding to the two users. The similarity between feature vectors can be calculated in various ways. In one example, the distance between two feature vectors is calculated and used as the measure of similarity between the two vectors, and hence between the corresponding two users. In another example, the cosine similarity between two feature vectors is calculated as the similarity between the corresponding two users.
In another embodiment, a feature value is determined for each user according to the user's feature information, and the similarity between any two users among the marked users and the unmarked users is then determined according to the feature values of the two users.
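Both vector-based measures are easy to state in code. In the sketch below, the mapping from Euclidean distance to a similarity in (0, 1] is an assumption chosen so that closer vectors score higher; the function names are illustrative.

```python
import math


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_similarity(a, b):
    # Assumed mapping: distance 0 gives similarity 1, larger distances approach 0.
    return 1.0 / (1.0 + euclidean_distance(a, b))


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


print(distance_similarity([0, 0], [3, 4]))  # distance is 5, so 1 / 6
print(cosine_similarity([1, 0], [0, 1]))    # orthogonal vectors give 0.0
```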
In one embodiment, the feature information of each user determined in step S230 is a plurality of feature information, so that a feature score and a weight corresponding to each feature information can be determined, and a feature value is determined by means of weighted summation. The feature score and the weight corresponding to each feature information may be determined according to a preset rule (e.g., a rule set after feature analysis is performed on the purchasing user).
In one example, the feature information of a user includes the average duration of a single outbound trip (2 days), the average interval between outbound trips (1 month), the number of outbound payments (48), the average amount per payment (20,000 yuan), and the frequency of transactions with a merchant (e.g., a transaction with a milk powder store on every outbound trip). Accordingly, it can be assumed that the feature scores corresponding to the respective feature information determined according to the preset rule are 5, 2, 4, 1, and the corresponding weights are 0.3, 0.2, 0.1, respectively. The feature value of the user obtained by weighted summation is then 3.4.
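The weighted summation can be sketched as below. The text lists fewer scores and weights than there are features, so the five scores and five weights used here are hypothetical values, chosen only so that the weights sum to 1 and the result reproduces the stated feature value of 3.4; they should not be read as the values of the original example.

```python
def feature_value(scores, weights):
    """Weighted sum of per-feature scores."""
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per feature score")
    return sum(s * w for s, w in zip(scores, weights))


# Hypothetical scores and weights for the five features (assumed, see above).
scores = [5, 2, 4, 4, 1]
weights = [0.3, 0.3, 0.2, 0.1, 0.1]
value = feature_value(scores, weights)  # approximately 3.4
```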
In one embodiment, after the feature values are determined, the similarity between any two users may be determined by constructing a similarity matrix. The similarity matrix corresponds to a graph constructed from the feature values of the labeled users and the unlabeled users: each node in the graph is a data point (a node carrying a feature value), and the edge between any two nodes represents the similarity of the two nodes. In one example, the similarity between any two nodes can be calculated by the following formula:

$$w_{ij} = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{\alpha^2}\right) \qquad (1)$$

where $w_{ij}$ represents the similarity of node $i$ and node $j$; $x_i$ and $x_j$ respectively represent the feature values of node $i$ and node $j$; and $\alpha$ indicates a hyper-parameter (a hyper-parameter is a predefined parameter, which may be set to 0.3, for example).
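A minimal sketch of the similarity matrix, assuming the Gaussian-kernel form commonly used with label propagation, w_ij = exp(-(x_i - x_j)^2 / alpha^2), applied to scalar feature values. The function name and the sample feature values are illustrative assumptions.

```python
import math


def similarity_matrix(values, alpha=0.3):
    """Pairwise similarities w_ij = exp(-(x_i - x_j)^2 / alpha^2).

    `values` holds one scalar feature value per user; `alpha` is the
    hyper-parameter (0.3 is the example setting mentioned in the text).
    """
    n = len(values)
    return [
        [math.exp(-((values[i] - values[j]) ** 2) / alpha ** 2) for j in range(n)]
        for i in range(n)
    ]


# Two users with identical feature values and one distant user (made-up data).
W = similarity_matrix([3.4, 3.4, 1.0])
```

Identical feature values give a similarity of 1, and the similarity decays towards 0 as the feature values move apart; a smaller alpha makes the decay sharper.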
Step S320, determining the propagation probability between the two corresponding users according to the similarity.
Specifically, labels propagate between nodes along the edge connecting them. The greater the weight of an edge (i.e., the similarity between its two nodes), the more similar the two nodes are and the more easily a label propagates across it. In one embodiment, the propagation probability may be calculated by the following formula:
$$P_{ij} = \frac{w_{ij}}{\sum_{k=1}^{n} w_{ik}} \qquad (2)$$

In formula (2), $P_{ij}$ represents the probability of a label propagating from node $i$ to node $j$; $w_{ij}$ represents the similarity of node $i$ and node $j$; $w_{ik}$ represents the similarity of node $i$ and node $k$, with $1 \le k \le n$; and $n$ represents the total number of nodes.
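The normalization described here (each similarity divided by the sum of the similarities in its row) can be sketched as follows. The function name and the small example matrix are assumptions for illustration.

```python
def propagation_matrix(W):
    """Row-normalize a similarity matrix: P[i][j] = W[i][j] / sum_k W[i][k]."""
    P = []
    for row in W:
        total = sum(row)
        if total == 0:
            raise ValueError("a node with no similarity to any node cannot propagate")
        P.append([w / total for w in row])
    return P


# Made-up symmetric similarity matrix for three nodes.
P = propagation_matrix([
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.0],
    [0.5, 0.0, 1.0],
])
```

Each row of the result sums to 1, so row i can be read as the distribution of where the label of node i goes in one propagation step. The result is generally not symmetric even when the similarity matrix is.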
Step S330, determining the probability that each user in the unmarked users belongs to the purchasing user according to the propagation probability.
Specifically, a probability transition matrix $P$ is constructed from the propagation probabilities, and a matrix $F$ is constructed from the labeled users and the unlabeled users. The matrix $F$ is constructed as follows. Suppose there are $C$ classes, $L$ labeled samples and $U$ unlabeled samples. Accordingly, define an $L \times C$ labeled matrix $Y_L$ and a $U \times C$ unlabeled matrix $Y_U$. In these matrices, the $i$-th row is the label indicator vector of the $i$-th sample: if the class of the $i$-th sample is $j$, the $j$-th element of that row is 1 and the other elements are 0. Combining matrix $Y_L$ and matrix $Y_U$ yields an $N \times C$ matrix $F$, where $N = L + U$ and

$$F = \begin{bmatrix} Y_L \\ Y_U \end{bmatrix}$$
In one embodiment, assume that $C = 2$, that is, one class is purchasing users and the other class is non-purchasing users, and that there are $L$ labeled users, who are purchasing users, and $U$ unlabeled users. Accordingly, define an $L \times 2$ labeled matrix $Y_L$ and a $U \times 2$ unlabeled matrix $Y_U$, wherein
Figure 354612DEST_PATH_IMAGE040
Then, matrix $Y_L$ and matrix $Y_U$ are combined to obtain an $N \times 2$ matrix $F$, and the following procedure is performed according to the probability transition matrix $P$ and the matrix $F$: 1) perform propagation: $F = PF$; 2) reset the labels of the labeled users in $F$: $F_L = Y_L$; 3) repeat steps 1) and 2) until $F$ converges.
According to the converged $F$, the probability that each unlabeled user belongs to the purchasing users can be obtained.
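The propagate, reset, and repeat procedure can be sketched as follows, assuming the Gaussian-kernel similarity and row normalization of the earlier steps. The function name, the zero initialization of the unlabeled rows, the convergence tolerance, and the demo data are all assumptions. In particular, the demo includes one labeled non-purchasing user purely so that both classes have a source of labels.

```python
import math


def label_propagation(P, Y_L, num_unlabeled, tol=1e-9, max_iter=1000):
    """Iterate F <- P F with the labeled rows clamped to Y_L.

    P is the N x N propagation matrix whose first len(Y_L) rows and columns
    correspond to the labeled users; Y_L holds one indicator row per labeled user.
    """
    num_labeled, num_classes = len(Y_L), len(Y_L[0])
    n = num_labeled + num_unlabeled
    # F stacks Y_L on top of Y_U; unlabeled rows start at zero (an assumption).
    F = [[float(x) for x in row] for row in Y_L]
    F += [[0.0] * num_classes for _ in range(num_unlabeled)]
    for _ in range(max_iter):
        # 1) Propagate: F <- P F.
        new_F = [
            [sum(P[i][k] * F[k][c] for k in range(n)) for c in range(num_classes)]
            for i in range(n)
        ]
        # 2) Reset the rows of the labeled users: F_L <- Y_L.
        for i in range(num_labeled):
            new_F[i] = [float(x) for x in Y_L[i]]
        # 3) Stop once F no longer changes noticeably.
        delta = max(
            abs(new_F[i][c] - F[i][c]) for i in range(n) for c in range(num_classes)
        )
        F = new_F
        if delta < tol:
            break
    return F


# Hypothetical feature values: two labeled users, then two unlabeled users.
values = [3.4, 1.0, 3.2, 1.1]
alpha = 0.3
W = [[math.exp(-((a - b) ** 2) / alpha ** 2) for b in values] for a in values]
P = [[w / sum(row) for w in row] for row in W]
Y_L = [[1, 0], [0, 1]]  # column 0: purchasing user, column 1: non-purchasing user
F = label_propagation(P, Y_L, num_unlabeled=2)
purchase_prob = [row[0] for row in F[2:]]  # one probability per unlabeled user
```

In this demo the unlabeled user whose feature value is close to the labeled purchasing user ends up with a purchasing probability near 1, and the other unlabeled user ends up near 0.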
Step S340, taking the unmarked users whose probability is not less than a preset threshold as purchasing users.
Specifically, when the probability corresponding to an unmarked user is not less than the preset threshold, that user is taken as a purchasing user. In one example, assuming that the preset threshold is 0.7 and the probability that a certain unlabeled user belongs to the purchasing users is 0.8, the unlabeled user can be regarded as a purchasing user. In another example, assuming that the preset threshold is 0.7 and the probability that a certain unlabeled user belongs to the purchasing users is 0.6, the unlabeled user is not regarded as a purchasing user.
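The threshold rule can be sketched as follows; the function name and the user identifiers are hypothetical.

```python
def select_purchasing_users(probabilities, threshold=0.7):
    """Return the ids of users whose probability is not less than the threshold."""
    return [user for user, p in probabilities.items() if p >= threshold]


# Made-up probabilities for three unlabeled users.
selected = select_purchasing_users({"u1": 0.8, "u2": 0.6, "u3": 0.7})
```

Because the rule is "not less than", a user whose probability equals the threshold exactly is also selected.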
It should be noted that the preset threshold may be adjusted according to the service content. In one embodiment, where the service content includes a limited number of coupons, a corresponding number of purchasing users may need to be determined, and the preset threshold can be adjusted accordingly. For example, suppose 2000 purchasing users need to be determined, and with the preset threshold at its initial value of 0.7 the number of determined purchasing users is 3000. The preset threshold may then be increased; for example, when the preset threshold is raised to 0.78, the number of determined purchasing users is 2000, which is consistent with the service content.
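One way to pick the adjusted threshold is to raise it to the lowest value at which no more than the required number of users qualify. The sketch below is an assumption about how such an adjustment could be done, not a procedure given in the text; the function name and the sample probabilities are hypothetical.

```python
def threshold_for_quota(probabilities, quota, initial_threshold=0.7):
    """Lowest threshold >= initial_threshold that selects at most `quota` users.

    Returns None when even the highest probability is shared by more than
    `quota` users, so that no threshold can satisfy the quota.
    """
    # Candidate thresholds: the initial value, then each observed probability
    # at or above it, in ascending order.
    candidates = sorted({p for p in probabilities if p >= initial_threshold})
    for t in [initial_threshold] + candidates:
        if sum(1 for p in probabilities if p >= t) <= quota:
            return t
    return None


# Made-up probabilities: six users pass the initial threshold of 0.7.
probs = [0.95, 0.9, 0.85, 0.78, 0.75, 0.72, 0.6]
adjusted = threshold_for_quota(probs, quota=4)
```

With these values the threshold moves from 0.7 to 0.78, at which point exactly four users qualify. Scanning thresholds this way is quadratic in the number of users; it is meant only to make the rule explicit.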
In one embodiment, the purchasing users may also be determined from the plurality of users by positive and unlabeled (PU) learning. Specifically, according to the feature information of the labeled users, a reliable negative sample set RN is identified within the unlabeled sample set U formed by the unlabeled users; this can be done by means of a Bayesian algorithm, a spy algorithm, and the like. A binary classifier is then obtained by iterative training on the labeled user set and the reliable negative sample set RN, and this binary classifier is used to classify the unlabeled users into purchasing users and non-purchasing users.
As can be seen from the above, in the method for determining purchasing users provided in the embodiments disclosed in the present specification, a plurality of users who are located abroad at a predetermined time are first determined, and marked users and unmarked users are determined from the plurality of users and a pre-stored sample set of purchasing users. Then, outbound-related feature information corresponding to each of the marked users and unmarked users is determined, and the purchasing users among the plurality of users located abroad are determined through a semi-supervised learning based algorithm according to the feature information, so that personalized services can be provided for these purchasing users.
The method for determining the purchasing user provided by the embodiment of the invention is described below with reference to a specific application scenario. As shown in fig. 4, the method comprises the steps of:
Step S411, acquiring global purchase sellers from Tmall Global (Tianmao International).
Step S412, acquiring Taobao purchasing sellers from the Taobao network.
Step S413, determining purchasing transfer users according to payer transfer records.
Step S420, determining a purchasing user sample set according to the global purchase sellers, the Taobao purchasing sellers, and the purchasing transfer users.
Step S430, acquiring LBS data of users located abroad.
Step S440, constructing a feature library, where the feature library includes feature information of each user in the purchasing user sample set and of each user located abroad, and the feature information includes LBS features (such as the number of outbound trips, the number of countries visited, the average outbound duration, and the interval between outbound trips) and overseas in-person payment features (such as the number of payments and the payment amount).
Step S450, determining, by a semi-supervised learning algorithm and according to the feature information in the feature library, the probability that each user located abroad belongs to the purchasing users.
Step S460, determining, according to a probability threshold, the purchasing users among the users located abroad.
In correspondence with the method for determining a purchasing user, embodiments disclosed in this specification further provide an apparatus for determining a purchasing user, as shown in fig. 5, the apparatus 500 includes:
a first determination unit 510 for determining a plurality of users who are out of the country at predetermined times;
a second determining unit 520, configured to determine tagged users and untagged users according to the multiple users and a pre-stored sample set of purchasing users;
a third determining unit 530, configured to determine feature information related to the outbound corresponding to each of the tagged user and the untagged user;
a fourth determining unit 540, configured to determine a purchasing user included in the plurality of users through a semi-supervised learning based algorithm according to the feature information.
In a possible implementation, the first determining unit is specifically configured to:
acquiring the position information of the user at a preset moment, and determining whether the user is located abroad according to the position information.
In one possible implementation, the sample set of purchasing users in the second determining unit 520 is determined based on at least the position information and the transaction record of the sample user within a preset time period.
In one possible implementation, the sample set of purchasing users in the second determining unit 520 is further determined based on the service categories provided by the sample users.
In a possible implementation, the second determining unit 520 is specifically configured to:
taking sample users in the purchasing user sample set as marked users, and taking the plurality of users as unmarked users; or,
taking the sample users in the purchasing user sample set as labeled users, and taking the users among the plurality of users that do not exist in the purchasing user sample set as unlabeled users; or,
taking users among the plurality of users that exist in the purchasing user sample set as marked users, and taking the remaining users among the plurality of users as unmarked users.
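The three alternatives above differ only in which set operations are applied to the sample set and to the users located abroad. A minimal sketch follows; the function name, the mode names, and the user identifiers are hypothetical.

```python
def split_users(sample_set, outbound_users, mode):
    """Return (labeled, unlabeled) user sets under one of the three alternatives."""
    sample_set, outbound_users = set(sample_set), set(outbound_users)
    if mode == "all_outbound_unlabeled":
        # Alternative 1: every sample user is labeled, every outbound user is unlabeled.
        return sample_set, outbound_users
    if mode == "exclude_samples":
        # Alternative 2: outbound users already in the sample set are not unlabeled.
        return sample_set, outbound_users - sample_set
    if mode == "intersection":
        # Alternative 3: only outbound users found in the sample set are labeled.
        labeled = outbound_users & sample_set
        return labeled, outbound_users - labeled
    raise ValueError(f"unknown mode: {mode}")


sample = {"a", "b"}
outbound = {"b", "c"}
labeled, unlabeled = split_users(sample, outbound, "intersection")
```

Under the first alternative a user can appear in both sets; the second and third alternatives keep the labeled and unlabeled sets disjoint.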
In one possible embodiment, the feature information determined by the third determining unit 530 includes at least one of an average duration of a single outbound trip, an average interval between outbound trips, the number of outbound payments, an average amount per payment, and a frequency of transactions with a merchant.
In a possible implementation manner, the fourth determining unit 540 specifically includes:
a first determining subunit 541, configured to determine a similarity between any two users of the labeled user and the unlabeled user;
a second determining subunit 542, configured to determine, according to the similarity, a propagation probability between the two corresponding users;
the third determining subunit 543 is configured to determine, according to the propagation probability, the probability that each user in the unlabeled users belongs to the purchasing user.
In a possible implementation manner, the feature information determined by the third determining unit 530 is a plurality of feature information, and the first determining subunit 541 is specifically configured to:
determining the characteristic value of the user by adopting a weighted summation mode according to the characteristic score and the weight corresponding to each piece of characteristic information in the plurality of pieces of characteristic information;
and determining the similarity between the corresponding two users according to the characteristic values of any two users.
In a possible implementation manner, the fourth determining unit 540 specifically further includes:
the processing subunit 544 is configured to, according to the probability, use an unlabeled user corresponding to the probability that is not smaller than the preset threshold as a purchasing user.
As can be seen from the above, in the apparatus for determining purchasing users provided in the embodiments disclosed in the present specification, the first determining unit 510 determines a plurality of users who are located abroad at a predetermined time, the second determining unit 520 determines tagged users and untagged users from the plurality of users and a pre-stored sample set of purchasing users, the third determining unit 530 determines outbound-related feature information corresponding to each of the tagged users and untagged users, and the fourth determining unit 540 determines the purchasing users among the plurality of users located abroad through a semi-supervised learning based algorithm according to the feature information, so that personalized services can be provided to these purchasing users.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in the embodiments disclosed herein may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The specific embodiments above further describe in detail the objects, technical solutions, and advantages of the embodiments disclosed in the present specification. It should be understood that they are only specific embodiments and are not intended to limit the scope of the embodiments disclosed in the present specification; any modifications, equivalent substitutions, improvements, and the like made on the basis of the technical solutions of these embodiments should be included in that scope.

Claims (18)

1. A method of determining a purchasing user, comprising:
determining a plurality of users who are out of the country at predetermined times;
determining marked users and unmarked users according to the plurality of users and a pre-stored sample set of the purchasing users;
determining feature information related to the outbound corresponding to each user of the marked user and the unmarked user;
according to the characteristic information, determining a purchasing user included in the plurality of users through an algorithm based on semi-supervised learning.
2. The method of claim 1, wherein determining a plurality of users who are out of range at predetermined times comprises:
acquiring the position information of a user at a preset time, and determining whether the user is located abroad according to the position information.
3. The method of claim 1, wherein the sample set of purchasing users is determined based on at least location information and transaction records of sample users over a preset time period.
4. The method of claim 3, wherein the sample set of purchasing users is further determined based on a class of service provided by the sample user.
5. The method of claim 1, wherein the determining labeled users and unlabeled users comprises:
taking the sample users in the purchasing user sample set as marked users and taking the plurality of users as unmarked users; or,
taking the sample users in the purchasing user sample set as labeled users, and taking users among the plurality of users that do not exist in the purchasing user sample set as unlabeled users; or,
taking users among the plurality of users that exist in the purchasing user sample set as marked users, and taking the remaining users among the plurality of users as unmarked users.
6. The method of claim 1, wherein the feature information includes at least one of an average duration of a single outbound trip, an average interval between outbound trips, a number of outbound payments, an average amount per payment, and a frequency of transactions with a merchant.
7. The method of claim 1, wherein the determining, by a semi-supervised learning based algorithm, a purchasing user included in the plurality of users comprises:
determining the similarity between any two users of the marked user and the unmarked user;
determining the propagation probability between two corresponding users according to the similarity;
and determining the probability that each user in the unmarked users belongs to the purchasing user according to the propagation probability.
8. The method of claim 7, wherein the feature information is a plurality of feature information, and the determining the similarity between any two users of the labeled user and the unlabeled user comprises:
determining the characteristic value of the user by adopting a weighted summation mode according to the characteristic score and the weight corresponding to each characteristic information in the plurality of characteristic information;
and determining the similarity between the corresponding two users according to the characteristic values of any two users.
9. The method of claim 7, wherein determining a purchasing user included in the plurality of users through an algorithm of semi-supervised learning further comprises:
and according to the probability, taking the unmarked user corresponding to the probability not less than the preset threshold value as the purchasing user.
10. An apparatus for determining purchasing users, comprising:
a first determination unit configured to determine a plurality of users who are out of the country at predetermined times;
the second determining unit is used for determining marked users and unmarked users according to the plurality of users and a pre-stored sample set of the purchasing users;
a third determining unit, configured to determine feature information related to the departure corresponding to each of the tagged user and the untagged user;
a fourth determination unit configured to determine a purchasing user included in the plurality of users through a semi-supervised learning based algorithm according to the feature information.
11. The apparatus according to claim 10, wherein the first determining unit is specifically configured to:
acquiring the position information of a user at a preset time, and determining whether the user is located abroad according to the position information.
12. The apparatus according to claim 10, wherein the sample set of purchasing users in the second determination unit is determined based on at least the position information and transaction record of the sample user within a preset time period.
13. The apparatus of claim 12, wherein the sample set of purchasing users in the second determining unit is further determined based on a class of service provided by the sample user.
14. The apparatus according to claim 10, wherein the second determining unit is specifically configured to:
taking the sample users in the purchasing user sample set as marked users and taking the plurality of users as unmarked users; or,
taking the sample users in the purchasing user sample set as labeled users, and taking users among the plurality of users that do not exist in the purchasing user sample set as unlabeled users; or,
taking users among the plurality of users that exist in the purchasing user sample set as marked users, and taking the remaining users among the plurality of users as unmarked users.
15. The apparatus according to claim 10, wherein the feature information determined by the third determination unit includes at least one of an average duration of a single outbound trip, an average interval between outbound trips, the number of outbound payments, an average amount per payment, and a frequency of transactions with a merchant.
16. The apparatus according to claim 10, wherein the fourth determining unit specifically includes:
the first determining subunit is configured to determine a similarity between any two users of the labeled user and the unlabeled user;
a second determining subunit, configured to determine, according to the similarity, a propagation probability between two corresponding users;
and the third determining subunit is used for determining the probability that each user in the unlabeled users belongs to the purchasing user according to the propagation probability.
17. The apparatus according to claim 16, wherein the feature information determined by the third determining unit is a plurality of feature information, and the first determining subunit is specifically configured to:
determining the characteristic value of the user by adopting a weighted summation mode according to the characteristic score and the weight corresponding to each characteristic information in the plurality of characteristic information;
and determining the similarity between the corresponding two users according to the characteristic values of any two users.
18. The apparatus according to claim 16, wherein the fourth determining unit further includes:
and the processing subunit is used for taking the unmarked user corresponding to the probability not less than the preset threshold value as the purchasing user according to the probability.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810050530.5A CN108305099B (en) 2018-01-18 2018-01-18 Method and device for determining purchasing user

Publications (2)

Publication Number Publication Date
CN108305099A CN108305099A (en) 2018-07-20
CN108305099B true CN108305099B (en) 2021-11-19

Family

ID=62865594

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810050530.5A Active CN108305099B (en) 2018-01-18 2018-01-18 Method and device for determining purchasing user

Country Status (1)

Country Link
CN (1) CN108305099B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110009364B (en) * 2019-01-08 2021-08-24 创新先进技术有限公司 Industry identification model determining method and device
CN113554438B (en) * 2020-04-23 2023-12-05 北京京东振世信息技术有限公司 Account identification method and device, electronic equipment and computer readable medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103164428A (en) * 2011-12-13 2013-06-19 富士通株式会社 Method and device for determining correlation between microblog and given entity
CN104239335A (en) * 2013-06-19 2014-12-24 阿里巴巴集团控股有限公司 Method and device for acquiring information of specific users
CN106327227A (en) * 2015-06-19 2017-01-11 北京航天在线网络科技有限公司 Information recommendation system and information recommendation method
CN107273454A (en) * 2017-05-31 2017-10-20 北京京东尚科信息技术有限公司 User data sorting technique, device, server and computer-readable recording medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
标签传播算法理论及其应用研究综述;张俊丽,常艳丽,师文;《计算机应用研究》;20130131;第21-25页 *

Also Published As

Publication number Publication date
CN108305099A (en) 2018-07-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201020

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20201020

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240206

Address after: Guohao Times City # 20-01, 128 Meizhi Road, Singapore

Patentee after: Advanced Nova Technology (Singapore) Holdings Ltd.

Country or region after: Singapore

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee before: Innovative advanced technology Co.,Ltd.

Country or region before: United Kingdom

TR01 Transfer of patent right