WO2016161976A1 - Method and apparatus for selecting data content to be pushed to a terminal - Google Patents

Method and apparatus for selecting data content to be pushed to a terminal

Info

Publication number
WO2016161976A1
WO2016161976A1 PCT/CN2016/078867 CN2016078867W WO2016161976A1 WO 2016161976 A1 WO2016161976 A1 WO 2016161976A1 CN 2016078867 W CN2016078867 W CN 2016078867W WO 2016161976 A1 WO2016161976 A1 WO 2016161976A1
Authority
WO
WIPO (PCT)
Prior art keywords
data content
feature
attribute type
decision tree
user attribute
Prior art date
Application number
PCT/CN2016/078867
Other languages
English (en)
French (fr)
Inventor
姜磊
李勇
肖磊
刘大鹏
张书彬
罗川江
宋亚娟
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to JP2017543954A (patent JP6494777B2, ja)
Publication of WO2016161976A1 (zh)
Priority to US15/664,233 (patent US10789311B2, en)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0283 Price estimation or determination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/2866 Architectures; Arrangements
    • H04L67/30 Profiles
    • H04L67/306 User profiles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/535 Tracking the activity of the user
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/55 Push-based network services

Definitions

  • The present invention relates to the field of computer technologies, and in particular, to a method and apparatus for selecting data content to be pushed to a terminal.
  • In conventional applications such as Internet advertising, news information, and recruitment information websites, the server usually needs to push data content to the terminal.
  • For example, in a traditional online advertising service, when a user opens a web page, the server pushes (delivers) an online advertisement corresponding to the user to the user's terminal, and counts either the Click-Through-Rate (CTR) of the advertisement, that is, the ratio of the number of clicks to the number of pushes after the advertisement is pushed, or the probability that the user purchases the product or service corresponding to the online advertisement.
  • These parameters can reflect whether the advertisement content selected by the server has attracted the interest of the end user and meets the needs of the user.
  • When the server selects advertising content for a particular user, it therefore tries to select an advertisement that the user is likely to click on or to purchase through.
  • The recommendation is usually made according to the attributes of the user in combination with a corresponding matching model.
  • Commonly used matching models include the group heat model (users are grouped by basic attributes such as age and gender, and the top click rates of each group are counted) and the logistic regression model (a logistic regression model is built from user attributes, basic advertisement attributes, advertising space attributes, and the cross attributes of user, advertising space, and advertisement).
  • Such matching models usually rely on machine learning: the historical statistics are fed into the model as sample data at intervals, and the parameters of the model are adjusted through machine learning so that the model adapts to relatively recent user habits.
  • When the server selects the data content to push to the user's terminal, it can select the data content that best matches the user according to the updated matching model.
  • However, because the matching model is updated by machine learning from sample data only at intervals, the model parameters in use when the server pushes data content are not derived from the latest statistical data. As a result, the relevance or degree of matching between the data content selected by the server and the user is low, and the data content is pushed less accurately.
  • The first aspect of the embodiments of the present invention provides a method for selecting data content to be pushed to a terminal.
  • A method for selecting data content to be pushed to a terminal, comprising:
  • the tree nodes of the decision tree object include branch nodes and leaf nodes;
  • the branch nodes are in one-to-one correspondence with the user attribute types;
  • each branch node stores a feature threshold of each feature interval of the corresponding user attribute type;
  • the child nodes of a branch node are in one-to-one correspondence with the feature thresholds;
  • each leaf node stores the number of clicks and the number of pushes corresponding to the feature threshold corresponding to that leaf node.
  • The second aspect of the embodiments of the present invention further provides an apparatus for selecting data content to be pushed to a terminal.
  • An apparatus for selecting data content to be pushed to a terminal, comprising:
  • a user identifier obtaining module, configured to acquire a user identifier and obtain the feature value, under a preset user attribute type, corresponding to the user identifier;
  • a decision tree obtaining module, configured to acquire data content and search for a decision tree object corresponding to the data content, where the tree nodes of the decision tree object include branch nodes and leaf nodes, the branch nodes are in one-to-one correspondence with the user attribute types, each branch node stores a feature threshold of each feature interval of the corresponding user attribute type, the child nodes of a branch node are in one-to-one correspondence with the feature thresholds, and each leaf node stores the number of clicks and the number of pushes corresponding to the feature threshold corresponding to that leaf node;
  • a leaf node locating module, configured to locate the leaf node corresponding to the user identifier in the decision tree object according to the feature values under the preset user attribute types corresponding to the user identifier, where the feature values match the feature thresholds corresponding to each tree node on the path from the root node of the decision tree object to the located leaf node;
  • a data content selection module, configured to acquire the number of clicks and the number of pushes stored in the located leaf node, generate a selection reference value according to the number of clicks and the number of pushes, and select, according to the selection reference value, the data content to be pushed to the terminal corresponding to the user identifier.
  • The selection reference value corresponding to a data content is found by matching the feature values corresponding to the user identifier against the branch nodes of the decision tree object corresponding to the data content.
  • The logical structure of the decision tree object allows it to be updated in real time from users' browsing records, without periodically sampling and then updating the decision tree object offline by machine learning on the sampled data. That is, when the feature values corresponding to the user identifier are matched against the branch nodes of the decision tree object corresponding to the data content, the statistical data in the decision tree object reflects more recent user browsing records, so the matching result better fits the current operating or browsing habits of users, which improves the accuracy of selecting data content for pushing.
  • FIG. 1 is a flow chart of a method for selecting data content to be pushed to a terminal in an embodiment
  • FIG. 2 is a logical relationship diagram between tree nodes in a decision tree object in an embodiment
  • FIG. 3 is a logical relationship diagram between tree nodes in a decision tree object in an embodiment
  • FIG. 4 is a flow chart showing a process of performing a user attribute type extension on a leaf node in a decision tree object in an embodiment
  • FIG. 5 is a schematic diagram of performing user attribute type expansion on leaf nodes in a decision tree object in an embodiment
  • FIG. 6 is a schematic diagram of an apparatus for selecting data content to be pushed to a terminal in an embodiment
  • FIG. 7 is a schematic structural diagram of another apparatus for selecting data content to be pushed to a terminal in an embodiment.
  • The method for selecting data content to be pushed to a terminal may be implemented by a computer program that filters data content and pushes it, such as an online advertisement delivery program, a news information application, a mail advertisement promotion program, or a resume push program.
  • The computer program can run on a computer system based on the von Neumann architecture.
  • The computer system may be a server device running the server program of such an online advertisement delivery program, news information application, mail advertisement promotion program, or resume push program, which filters the data content and pushes it to the corresponding client program.
  • A plurality of data contents are pre-stored in the server device.
  • For example, an advertisement database storing online advertisements is provided, and each online advertisement is one data content. Merchants providing online advertising services can add advertisement data by adding records to the advertisement database.
  • the process of selecting data content is a process of searching for a data content in a server device that best matches a certain user, or a data content that is most likely to be browsed by a user after being pushed.
  • A plurality of user attribute types are pre-configured, and each user attribute type has corresponding feature intervals.
  • For example, the preset user attribute types may include "gender", "age segment", "brand", and so on. The user attribute type "gender" may include the feature intervals "male" and "female", and the user attribute type "age segment" may include feature intervals such as "post-70", "post-80", "post-90", and "00".
  • A feature interval may be defined by a feature threshold. For example, the feature intervals "male" and "female" can be defined using a Boolean variable, and the feature interval "post-70" can be defined using the feature threshold [70,79].
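  • The relationship between attribute types, feature intervals, and feature thresholds can be sketched as a small lookup table. This is only an illustration: the names `FEATURE_INTERVALS` and `feature_interval` are hypothetical, and only the [70,79] range for "post-70" comes from the text; the other ranges are assumed by analogy.

```python
# Hypothetical table of feature intervals. Only the (70, 79) range for
# "post-70" is taken from the text; the other ranges are assumptions.
FEATURE_INTERVALS = {
    "gender": {"male": True, "female": False},   # Boolean thresholds
    "age segment": {                             # inclusive (low, high) thresholds
        "post-70": (70, 79),
        "post-80": (80, 89),
        "post-90": (90, 99),
    },
}


def feature_interval(attribute_type, raw_value):
    """Return the label of the feature interval containing raw_value, or None."""
    for label, threshold in FEATURE_INTERVALS[attribute_type].items():
        if isinstance(threshold, tuple):
            low, high = threshold
            if low <= raw_value <= high:
                return label
        elif raw_value is threshold:  # Boolean threshold: exact match
            return label
    return None
```

  • With this table, `feature_interval("age segment", 75)` falls in the "post-70" interval, and a value outside every configured range yields `None`.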
  • The user attributes of the user account on the terminal being pushed to also have feature values under the above user attribute types.
  • The process of selecting data content is to traverse the data content in the database and find the classified statistical data corresponding to each data content.
  • The statistical data corresponding to the feature values of the user attributes is filtered out, the probability that the traversed data content will be browsed after being pushed is estimated from the filtered statistical data, and the data content with a high browsing probability is then selected for pushing.
  • the method for selecting data content to be pushed to the terminal includes:
  • Step S102 Acquire a user identifier, and obtain a feature value corresponding to the preset user attribute type corresponding to the user identifier.
  • The user identifier is identification information used to distinguish users. It may be a user account registered by the user with the server program, or, for promotion without registration, the user's email address, IP address, mobile phone number, or the like.
  • The feature value, under a preset user attribute type, corresponding to the user identifier may be obtained by extracting attribute values from the user data of the logged-in user account or from the user's operation records.
  • For example, an online resume delivery application includes two types of user accounts: candidate users and recruiter users.
  • A candidate user, usually an individual, can create a resume, and the created resume is the data content stored in the database of the online resume delivery application.
  • A recruiter user, usually a business or institution, is the push target of the online resumes.
  • The server program of the online resume delivery application can find, among the massive number of resumes created by candidate users, the resume that best matches an enterprise, and then push the resume to the terminal of the corresponding recruiter user (it can be pushed to the client program of the online resume delivery application on the terminal, or emailed to the recruiter user's mailbox).
  • The staff of the enterprise need to fill in the information of the enterprise according to the preset user attribute types.
  • For example, the preset user attribute types may include company name, industry type, region, and company nature. If "A" is filled in for "company name", "Internet" for "industry type", "Shenzhen" for "region", and "state-owned enterprise" for "company nature", then "A", "Internet", "Shenzhen", and "state-owned enterprise" are the feature values under the user attribute types company name, industry type, region, and company nature, respectively.
  • As another example, in an online advertisement promotion program, a large amount of advertisement data (video advertisements, picture advertisements, etc.) is stored in a database on the server.
  • If the online advertisement promotion program is based on a web search engine, the user identifier may be the IP address of the terminal, and the feature values under the preset user attribute types corresponding to the user identifier may be extracted from the search records corresponding to that IP address.
  • For example, if the keywords in the search records corresponding to the IP address include "milk powder", "stroller", and "diapers", and the feature intervals under the user attribute type "interest product type" include "infant and child products", then the feature value corresponding to the terminal IP under the user attribute type "interest product type" is "infant and child products". If the geographic location corresponding to the terminal IP is "Dongguan" and the feature intervals under the user attribute type "terminal location" include "Guangdong Province", then the feature value corresponding to the terminal IP under the user attribute type "terminal location" is "Guangdong Province".
  • Step S104 Acquire data content, and search for a decision tree object corresponding to the data content, where the tree node of the decision tree object includes a branch node and a leaf node, and the branch node has a one-to-one correspondence with the user attribute type, and the branch node stores a feature threshold of each feature interval of the corresponding user attribute type, the child nodes of the branch node are in one-to-one correspondence with the feature threshold; and the number of clicks corresponding to the feature threshold corresponding to the leaf node is stored in the leaf node And the number of pushes.
  • Decision tree objects can be stored using data structures that are logically consistent with a tree structure (that is, the Tree types defined in common programming languages). Each data content corresponds to one decision tree object. For example, in an online advertisement delivery program, each time an online advertisement is created it is assigned an online advertisement identifier Aid, and the Aid and its corresponding decision tree object may be stored in a mapping table, where the Aid is the key of the mapping table and the decision tree object is the value.
  • The decision tree object is logically a tree structure. In the example of FIG. 2, the decision tree object includes three levels. The first-level tree node is a branch node and is the root node of the decision tree. It corresponds to the user attribute type "gender" and stores the feature thresholds of the feature intervals "male" and "female" under the user attribute type "gender"; these thresholds may be defined using a Boolean variable, a number, or a string.
  • The second-level tree nodes are all child nodes of the root node. The tree node "male" corresponds to the feature threshold of the feature interval "male" under the user attribute type "gender" of the root node, and the tree node "female" corresponds to the feature threshold of the feature interval "female" under the same user attribute type.
  • The third-level tree nodes are all child nodes of the branch node "male". The branch node "male" corresponds to the user attribute type "education" and stores the feature thresholds of the feature intervals "high school and below", "college", and "master and above", which can be defined using numbers or strings.
  • The leaf node "high school and below" corresponds to the feature threshold of the feature interval "high school and below" under the user attribute type "education"; the leaf node "college" corresponds to the feature threshold of the feature interval "college"; and the leaf node "master and above" corresponds to the feature threshold of the feature interval "master and above".
  • Each leaf node stores the number of clicks and the number of pushes corresponding to its feature threshold. For example, as shown in FIG. 2, the leaf node "college" stores a number of clicks (click) of 200 and a number of pushes (impression) of 1000; that is, in the decision tree object, the number of clicks corresponding to the leaf node "college" is 200 and the number of pushes is 1000.
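  • The structure described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of the embodiment: the class name `TreeNode`, the helper `build_fig2_tree`, and the identifier "Aid-001" are hypothetical, and the counts of leaves other than "college" are not given in the text, so they are left at zero.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TreeNode:
    # Branch node: the user attribute type it tests, plus one child per
    # feature threshold (string thresholds are used here for readability).
    attribute_type: Optional[str] = None
    children: Dict[str, "TreeNode"] = field(default_factory=dict)
    # Leaf node: counters for the user group that reaches this node.
    clicks: int = 0
    pushes: int = 0

    def is_leaf(self) -> bool:
        return not self.children


def build_fig2_tree() -> TreeNode:
    """Build the three-level example: gender -> education -> leaves."""
    male = TreeNode(
        attribute_type="education",
        children={
            "high school and below": TreeNode(),  # counts not given in the text
            "college": TreeNode(clicks=200, pushes=1000),
            "master and above": TreeNode(),       # counts not given in the text
        },
    )
    return TreeNode(attribute_type="gender",
                    children={"male": male, "female": TreeNode()})


# Mapping table: advertisement identifier (key) -> decision tree object (value).
decision_trees: Dict[str, TreeNode] = {"Aid-001": build_fig2_tree()}
```

  • A plain dictionary is enough for the mapping table here because each data content owns exactly one decision tree object.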
  • Step S106 Locate the leaf node corresponding to the user identifier in the decision tree object according to the feature values under the preset user attribute types corresponding to the user identifier, where the feature values match the feature thresholds of the feature intervals corresponding to each branch node on the path from the root node of the decision tree object to the located leaf node.
  • Locating in the decision tree object according to the feature values corresponding to the user identifier means comparing, at a branch node of the decision tree, the feature thresholds of its feature intervals with the corresponding feature value, then moving to the matching child node of that branch node and recursively performing the same operation.
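  • The recursive locating operation of step S106 might look like the sketch below. The `Node` class and the `locate_leaf` function are hypothetical names, and returning `None` when no feature threshold matches is an assumption; the text does not say how that case is handled.

```python
class Node:
    """Tree node: a branch node tests attribute_type, a leaf node holds counters."""

    def __init__(self, attribute_type=None, children=None, clicks=0, pushes=0):
        self.attribute_type = attribute_type
        self.children = children or {}
        self.clicks = clicks
        self.pushes = pushes


def locate_leaf(node, feature_values):
    """Follow matching feature thresholds from node down to a leaf node."""
    if not node.children:            # leaf node reached
        return node
    value = feature_values.get(node.attribute_type)
    child = node.children.get(value)
    if child is None:                # assumption: no matching threshold, no leaf
        return None
    return locate_leaf(child, feature_values)


# Example tree following the three-level description (gender -> education).
tree = Node("gender", {
    "male": Node("education", {
        "high school and below": Node(),
        "college": Node(clicks=200, pushes=1000),
        "master and above": Node(),
    }),
    "female": Node(),
})

user = {"gender": "male", "education": "college",
        "marital status": "divorced", "age segment": "32"}
located = locate_leaf(tree, user)
```

  • Feature values for attribute types that no branch node tests ("marital status" and "age segment" in this example) are simply ignored during locating.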
  • Step S108 Acquire the number of clicks and the number of pushes stored in the located leaf node, generate a selection reference value according to the number of clicks and the number of pushes, and select, according to the selection reference value, the data content to be pushed to the terminal corresponding to the user identifier.
  • For example, on a dating website, suppose that in a user's registration information the content of the "gender" column is "male", the content of the "education" column is "college", the content of the "marital status" column is "divorced", and the content of the "age" column is "32". Then, for the user identifier of this user, the feature value under the preset user attribute type "gender" is "male", the feature value under the user attribute type "education" is "college", the feature value under the user attribute type "marital status" is "divorced", and the feature value under the user attribute type "age segment" is "32".
  • The user attribute type corresponding to the root node is "gender", and the feature thresholds it stores are the feature thresholds "male" and "female" under the user attribute type "gender". Therefore, among the feature values corresponding to the user identifier, the feature value "male" matches the feature threshold "male" stored in the root node, and the matching child node of the root node, that is, the branch node "male", is obtained for further judgment.
  • The user attribute type corresponding to the branch node "male" is "education", and the feature thresholds it stores are the feature thresholds "high school and below", "college", and "master and above" under the user attribute type "education". Therefore, among the feature values corresponding to the user identifier, the feature value "college" matches the feature threshold "college" stored in the branch node "male", and the matching child node of the branch node "male", that is, the leaf node "college", is obtained for further judgment.
  • Since the node "college" is a leaf node, the number of clicks, 200, and the number of pushes, 1000, stored in it can be obtained. That is, in the historical statistics, these are the counts of the data content corresponding to the decision tree object for the user group that is both "male" and "college".
  • The historical click rate of this user group can be used as the selection reference value of the data content relative to the user identifier.
  • The data content in the database may be traversed and a selection reference value of each data content relative to the user identifier generated; the data content with the largest selection reference value, or the data content whose selection reference value is greater than a preset threshold, is then found and pushed to the terminal corresponding to the user identifier. In other embodiments, it may also be pushed to the terminal corresponding to the user identifier by email or through a social network platform.
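  • As a rough sketch of step S108, the selection reference value can be taken as the historical click rate of the located leaf node, and the traversal can keep either the best data content or every data content above a preset threshold. The function names, the content identifiers, and the choice to score a leaf with zero pushes as 0.0 are assumptions made for this illustration; only the counts for "ad-1" come from the text.

```python
def selection_reference_value(clicks, pushes):
    """Historical click rate of the located leaf (0.0 if never pushed: an assumption)."""
    return clicks / pushes if pushes else 0.0


def select_content(located_counts, threshold=None):
    """located_counts maps a content id to the (clicks, pushes) of its located leaf.

    Without a threshold, return the single content with the largest reference
    value; with a threshold, return every content whose value exceeds it.
    """
    scores = {cid: selection_reference_value(clicks, pushes)
              for cid, (clicks, pushes) in located_counts.items()}
    if threshold is not None:
        return [cid for cid, score in scores.items() if score > threshold]
    return [max(scores, key=scores.get)] if scores else []


# Hypothetical counts of the leaf located for one user in three decision trees.
located_counts = {"ad-1": (200, 1000), "ad-2": (30, 400), "ad-3": (0, 0)}
best = select_content(located_counts)
above_threshold = select_content(located_counts, threshold=0.05)
```

  • For the leaf node "college" the reference value is 200/1000 = 0.2, so "ad-1" is selected in the first call.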
  • In this search, the feature values corresponding to the user identifier are matched against the feature thresholds corresponding to the branch nodes of the decision tree object of the data content, and the number of clicks and the number of pushes stored in the matched leaf node are found, thereby giving the selection reference value corresponding to the user identifier.
  • A decision tree object constructed in this way can also be updated in real time from the browsing records returned by user operations: the number of clicks and the number of pushes corresponding to a returned browsing record are added in real time to the corresponding leaf node of the decision tree object, which completes the real-time update of the decision tree object.
  • The process of updating the decision tree object may be as follows:
  • Receive a browsing record uploaded by the terminal, and acquire the user identifier corresponding to the terminal and the data content corresponding to the browsing record;
  • Locate the leaf node corresponding to the user identifier in the decision tree object corresponding to the data content, and increase the number of clicks and the number of pushes stored in the located leaf node according to the browsing record.
  • For example, suppose the above-mentioned dating website pushes the data content with the largest selection reference value (for example, the data of a better-matched user) to the user whose registration information is "male", "college", "divorced", and "32". If the user clicks on the data content to browse it, the returned browsing record has a number of clicks of 1 and a number of pushes of 1.
  • After receiving the browsing record, the server finds that the feature values of the user corresponding to the browsing record are "male", "college", "divorced", and "32". The same locating method is used to locate the leaf node "college" in the decision tree object of the data content corresponding to the browsing record, and the number of clicks stored in the leaf node "college" is then increased to 201 and the number of pushes to 1001. Similarly, if the user does not click on the data content, the number of pushes stored in the leaf node "college" is increased to 1001 and the number of clicks does not change.
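  • The real-time update can be sketched as below. The names `Node`, `locate_leaf`, and `record_browsing`, the identifier "content-1", and the decision to ignore a browsing record whose user cannot be located are all assumptions for illustration.

```python
class Node:
    """Tree node: a branch node tests attribute_type, a leaf node holds counters."""

    def __init__(self, attribute_type=None, children=None, clicks=0, pushes=0):
        self.attribute_type = attribute_type
        self.children = children or {}
        self.clicks = clicks
        self.pushes = pushes


def locate_leaf(node, feature_values):
    """Walk from node to the leaf whose thresholds match feature_values."""
    while node is not None and node.children:
        node = node.children.get(feature_values.get(node.attribute_type))
    return node


def record_browsing(decision_trees, content_id, feature_values, clicked):
    """Apply one browsing record: always one more push, one more click if clicked."""
    leaf = locate_leaf(decision_trees[content_id], feature_values)
    if leaf is None:                 # assumption: unmatched records are ignored
        return None
    leaf.pushes += 1
    if clicked:
        leaf.clicks += 1
    return leaf


decision_trees = {"content-1": Node("gender", {
    "male": Node("education", {"college": Node(clicks=200, pushes=1000)}),
    "female": Node(),
})}
user = {"gender": "male", "education": "college",
        "marital status": "divorced", "age segment": "32"}

leaf = record_browsing(decision_trees, "content-1", user, clicked=True)
counts_after_click = (leaf.clicks, leaf.pushes)        # 201 clicks, 1001 pushes
```

  • No sampling or offline retraining is involved: the counters that the next selection reads are the ones just incremented.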
  • The decision tree object may also be extended in real time according to historical statistical data by adding tree nodes to the decision tree object, that is, by adding user attribute types corresponding to branch nodes in the decision tree object. Data content can then be selected for pushing according to the extended decision tree object, which further improves the accuracy of the pushed data content, makes it match the user's operating habits or user attributes more closely, and makes it more likely to attract the user's interest.
  • The step of increasing the number of clicks and the number of pushes stored in the located leaf node according to the browsing record further includes:
  • Acquire the branch nodes on the path from the root node to the located leaf node in the decision tree object, acquire the candidate user attribute types, that is, the user attribute types other than those corresponding to the branch nodes on the path, and accumulate the number of clicks and the number of pushes obtained from the browsing record for the data content separately for each feature interval under each candidate user attribute type.
  • For example, the leaf node "college" stores not only the total number of clicks, 200, and the total number of pushes, 1000, for users whose gender is "male" and whose education is "college", but also classified counts under the candidate user attribute types.
  • Under the user attribute type "marital status", it stores the numbers of clicks and pushes in three preset feature intervals: the feature interval "unmarried" has 120 clicks and 400 pushes, the feature interval "divorced" has 20 clicks and 400 pushes, and the feature interval "widowed" has 60 clicks and 200 pushes. (The sum over the three may not equal the total number of clicks, 200, for example when a user identifier does not correspond to any feature interval under a certain user attribute type.)
  • Under the user attribute type "age segment", it stores the numbers of clicks and pushes in three preset feature intervals: the feature interval "below 30" has 130 clicks and 500 pushes, the feature interval "30-40" has 30 clicks and 400 pushes, and the feature interval "40 or more" has 40 clicks and 100 pushes.
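  • One possible layout for these classified counts, and for their update from a browsing record, is sketched below. The dictionary layout and the function `record_on_leaf` are hypothetical; the sketch also assumes the user's raw values have already been mapped to feature interval labels (for example, age 32 to "30-40").

```python
# Leaf node "college" with the counts from the text; each [clicks, pushes]
# pair is kept per feature interval of a candidate user attribute type.
leaf = {
    "clicks": 200,
    "pushes": 1000,
    "candidates": {
        "marital status": {"unmarried": [120, 400],
                           "divorced": [20, 400],
                           "widowed": [60, 200]},
        "age segment": {"below 30": [130, 500],
                        "30-40": [30, 400],
                        "40 or more": [40, 100]},
    },
}


def record_on_leaf(leaf, interval_labels, clicked):
    """Add one browsing record to the totals and to every matching interval."""
    leaf["pushes"] += 1
    leaf["clicks"] += int(clicked)
    for attribute_type, intervals in leaf["candidates"].items():
        counts = intervals.get(interval_labels.get(attribute_type))
        if counts is None:
            # The user falls in no interval of this type, which is why the
            # per-interval sums may differ from the totals.
            continue
        counts[0] += int(clicked)
        counts[1] += 1


record_on_leaf(leaf, {"marital status": "divorced", "age segment": "30-40"},
               clicked=True)
```

  • After this call the totals are 201 clicks and 1001 pushes, and the "divorced" and "30-40" intervals each gain one click and one push.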
  • A candidate user attribute type is a user attribute type to which no branch node of the decision tree object corresponds. In the example above, the branch nodes on the path from the root node to the leaf node "college" in the decision tree object correspond only to "gender" and "education", and not to the remaining "marital status" and "age segment", so "marital status" and "age segment" are the corresponding candidate user attribute types.
  • A candidate user attribute type can be selected, according to the correlation between the numbers of clicks corresponding to the feature intervals stored in the leaf node, to extend the decision tree object.
  • The method further includes:
  • Step S202 Generate the information gain corresponding to each candidate user attribute type according to the number of clicks and the number of pushes corresponding to each feature interval under that candidate user attribute type stored in the located leaf node.
  • The value of p1 is the ratio of the total number of clicks, 200, stored in the leaf node "college" to the total number of pushes, 1000, thus p1 = 200/1000 = 0.2.
  • The feature thresholds v of the feature intervals under the user attribute type "age segment" are traversed: "below 30", "30-40", and "40 or more", wherein:
  • Entropy(S_A) can be calculated to obtain the information gain of the user attribute type "age segment".
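  • The formulas for step S202 are not reproduced in this text, so the sketch below assumes the textbook definitions: a two-class entropy over click versus no click, Entropy(S) = -p1*log2(p1) - (1-p1)*log2(1-p1), and an information gain equal to Entropy(S) minus the push-weighted average of the per-interval entropies. The function names and the weighting by pushes are assumptions and may differ from the formulas of the original filing.

```python
import math


def entropy(clicks, pushes):
    """Two-class entropy of click vs. no click, with p1 = clicks / pushes."""
    if pushes == 0:
        return 0.0
    p1 = clicks / pushes
    return -sum(p * math.log2(p) for p in (p1, 1.0 - p1) if p > 0)


def information_gain(total_clicks, total_pushes, interval_counts):
    """Entropy of the leaf minus the push-weighted entropy of its intervals.

    interval_counts maps a feature interval to (clicks, pushes). Weights use
    the pushes counted in the intervals (an assumption), since the
    per-interval sums may differ from the leaf totals.
    """
    counted = sum(pushes for _, pushes in interval_counts.values())
    if counted == 0:
        return 0.0
    weighted = sum(pushes / counted * entropy(clicks, pushes)
                   for clicks, pushes in interval_counts.values())
    return entropy(total_clicks, total_pushes) - weighted


# Counts stored in the leaf node "college" in the example.
age_segment = {"below 30": (130, 500), "30-40": (30, 400), "40 or more": (40, 100)}
marital_status = {"unmarried": (120, 400), "divorced": (20, 400), "widowed": (60, 200)}

gain_age = information_gain(200, 1000, age_segment)
gain_marital = information_gain(200, 1000, marital_status)
```

  • Under these assumed definitions, the example counts give a larger gain for "marital status" than for "age segment".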
  • Step S204 Find the candidate user attribute type whose difference between the information gain and the information gain of the other found user attribute types is greater than or equal to the information gain threshold.
  • Step S206 When the search is found, the located leaf node is set as a branch node, and the leaf node of the branch node is generated according to the feature threshold of the feature interval under the searched candidate user attribute type.
  • It should be noted that there may be many candidate user attribute types. For example, if a leaf node L has four candidate user attribute types A, B, C, and D, then G(A), G(B), G(C), and G(D) are calculated first, and the two candidates with the largest G are identified. If G(A) > G(B) > G(C) > G(D), then G(A) - G(B) is calculated; if G(A) - G(B) is greater than the information gain threshold, candidate user attribute type A is selected to correspond to the tree node.
  • If G(A) - G(B) is smaller than the information gain threshold, the decision tree object is left unchanged and its leaf node is not split.
  • A leaf node generated by the split, as shown in FIG. 5, stores the total number of clicks and the total number of recommendations for that leaf node, recounted from the browsing records, together with the number of clicks and the number of pushes for each feature interval of the candidate user attribute types (such as the user attribute type "age group" in FIG. 5).
  • Extending the decision tree can further improve push accuracy. According to the above formula, if the click and push counts across the feature intervals of a candidate user attribute type are relatively evenly distributed, its information gain is large. In other words, when the decision tree object is extended, the candidate user attribute type whose click and push counts are evenly distributed across its feature intervals is always selected, so that when a leaf is located from the feature values of a user identifier, the probabilities of entering each leaf node under the branch node are similar.
  • Therefore, selecting candidate user attribute types by their information gain balances the probability of reaching each leaf node in the decision tree object. This avoids a leaf node whose matching conditions are so strict that it is only rarely used to match the feature values of a user identifier, and thereby improves the space utilization of the stored decision tree object.
  • For newly added data content, a decision tree object may be created at run time. Specifically, the step of searching for the decision tree object corresponding to the data content further includes: if no decision tree object corresponding to the data content is found, creating a decision tree object corresponding to the data content, the root node of the created decision tree object being a leaf node; and assigning a default selection reference value to the data content.
  • That is, after a decision tree object is created for newly added data content, it can be extended in real time according to the browsing records subsequently returned by terminals. The decision tree object may initially consist of the root node alone (which, having no child nodes, is necessarily also a leaf node); as more browsing records are received, candidate user attribute types can be selected step by step to create branch nodes, gradually completing the decision tree object.
  • Moreover, with the scheme of extending the decision tree object, if a user attribute type is added later, a branch node corresponding to it can be added to the decision tree object according to the statistics of the browsing records for the added user attribute type. The decision tree object can thus take newly added user attribute types into account in real time, which improves the extensibility of a system used for pushing data content.
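  • Optionally, the step of generating a selection reference value according to the number of clicks and the number of pushes further includes: obtaining a pricing weight coefficient corresponding to the data content, and multiplying the ratio of the number of clicks to the number of pushes by the pricing weight coefficient to obtain the selection reference value of the data content.
  • For example, in an online advertising application, the fee charged per click differs between types of advertisement. Introducing the pricing weight coefficient when generating the selection reference value lets it reflect not only the historical click-through rate but also the click revenue of an advertisement, so as to maximize online advertising revenue.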
  • Optionally, the step of acquiring the data content further includes: pre-screening the data content by keyword matching according to the feature values corresponding to the user identifier under the preset user attribute types.
  • In existing data content push systems, the amount of data content stored in the database is usually huge. The data content in the database may therefore be pre-screened according to the feature values corresponding to the user identifier under the preset user attribute types: data content that does not contain a keyword corresponding to the feature values is filtered out.
  • For example, in a dating website scenario, if the target user is male, female user profiles may be pre-screened first, and then, following steps S104 to S108, the female user profile with a larger selection reference value is found among them and pushed to the male user.
  • Pre-screening the data content can greatly reduce the number of decision tree object matches, thereby reducing the amount of calculation and improving the execution efficiency of the computer.
  • The apparatus for selecting data content to push to a terminal includes: a user identifier obtaining module 102, a decision tree obtaining module 104, a leaf node locating module 106, and a data content selection module 108, wherein:
  • the user identifier obtaining module 102 is configured to obtain a user identifier, and obtain the feature values corresponding to the user identifier under preset user attribute types;
  • the decision tree obtaining module 104 is configured to obtain data content and search for the decision tree object corresponding to the data content, where the tree nodes of the decision tree object include branch nodes and leaf nodes, the branch nodes are in one-to-one correspondence with user attribute types, each branch node stores the feature thresholds of the feature intervals of the corresponding user attribute type, and the child nodes of the branch node are in one-to-one correspondence with the feature thresholds; and each leaf node stores the number of clicks and the number of pushes corresponding to the feature threshold that corresponds to that leaf node;
  • the leaf node locating module 106 is configured to locate, in the decision tree object, the leaf node corresponding to the user identifier according to the feature values corresponding to the user identifier under the preset user attribute types, the feature values matching the feature thresholds corresponding to the tree nodes on the path from the root node of the decision tree object to the located leaf node;
  • the data content selection module 108 is configured to obtain the number of clicks and the number of pushes stored in the located leaf node, generate a selection reference value according to the number of clicks and the number of pushes, and select, according to the selection reference value, data content to push to the terminal corresponding to the user identifier.
  • The apparatus for selecting data content to push to a terminal further includes a decision tree update module 110, configured to: receive an uploaded browsing record, and obtain the user identifier and the data content corresponding to the browsing record; obtain the decision tree object corresponding to the data content, obtain the feature values corresponding to the user identifier under the preset user attribute types, and locate the leaf node corresponding to the user identifier in the decision tree object according to the obtained feature values; and increase the number of clicks and the number of pushes stored in the located leaf node according to the browsing record.
  • The decision tree update module 110 is further configured to: obtain the number of clicks and the number of pushes corresponding to the data content in the browsing record; obtain the branch nodes on the path from the root node to the located leaf node in the decision tree object; obtain the preset candidate user attribute types other than the user attribute types corresponding to the branch nodes on that path; and add the number of clicks and the number of pushes obtained from the browsing record under the matching feature interval of each candidate user attribute type.
  • The decision tree update module 110 is further configured to: generate the information gain corresponding to each candidate user attribute type according to the number of clicks and the number of pushes stored in the located leaf node for each feature interval of that candidate user attribute type; find the candidate user attribute type whose information gain exceeds that of the other candidate user attribute types by at least the information gain threshold; and, when one is found, set the located leaf node as a branch node and generate the leaf nodes of that branch node according to the feature thresholds of the feature intervals under the found candidate user attribute type.
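  • The decision tree update module 110 is further configured to calculate, according to the information gain formula (formula image PCTCN2016078867-appb-000002), the information gain of a user attribute type A under a leaf node S.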
  • The apparatus for selecting data content to push to a terminal further includes a decision tree creation module 112, configured to create a decision tree object corresponding to the data content when no decision tree object corresponding to the data content is found, the root node of the created decision tree object being a leaf node.
  • The decision tree obtaining module is further configured to assign a default selection reference value to the data content when no decision tree object corresponding to the data content is found.
  • The data content selection module 108 is further configured to obtain a pricing weight coefficient corresponding to the data content, and multiply the ratio of the number of clicks to the number of pushes by the pricing weight coefficient to obtain the selection reference value of the data content.
  • The apparatus for selecting data content to push to a terminal further includes a data content screening module 114, configured to pre-screen the data content by keyword matching according to the feature values corresponding to the user identifier under the preset user attribute types.
  • The method for selecting data content to push to a terminal shown in FIGS. 1 to 5 may be performed by the units of the apparatus for selecting data content to push to a terminal shown in FIG. 6.
  • For example, steps S102, S104, S106, and S108 shown in FIG. 1 may be performed by the user identifier obtaining module 102, the decision tree obtaining module 104, the leaf node locating module 106, and the data content selection module 108 shown in FIG. 6, respectively;
  • steps S202, S204, and S206 shown in FIG. 4 may be performed by the decision tree update module 110 shown in FIG. 6.
  • The units of the apparatus for selecting data content to push to a terminal shown in FIG. 6 may be combined, separately or entirely, into one or several other units, or one or more of the units may be further divided into multiple functionally smaller units. Either arrangement achieves the same operation without affecting the technical effects of the embodiments of the present invention.
  • The above units are divided on the basis of logical functions. In practical applications, the function of one unit may be implemented by multiple units, or the functions of multiple units may be implemented by one unit. In other embodiments of the invention, the terminal device may also include other modules, and in practical applications these functions may also be implemented with the assistance of other units or jointly by multiple units.
  • According to another embodiment, the apparatus for selecting data content to push to a terminal shown in FIG. 6 may be constructed, and the method of the embodiments of the present invention implemented, by running a computer program (including program code) capable of executing the method shown in FIGS. 1 to 5 on a general-purpose computing device, such as a computer that includes processing and storage elements such as a central processing unit (CPU), a random access memory (RAM), and a read-only memory (ROM).
  • The computer program can be recorded, for example, on a computer-readable recording medium, and loaded into and run on the above computing device via that medium.
  • With the above decision tree object as the matching model, data content with a larger selection reference value can be found and pushed by matching the feature values corresponding to the user identifier against the branch nodes of the decision tree object of each item of data content.
  • The logical structure of the decision tree object allows it to be updated in real time from users' browsing records, with no need to sample periodically and then update the decision tree object offline by machine learning on the samples. That is, when the feature values corresponding to the user identifier are matched against the branch nodes of the decision tree objects, the statistics in those objects already reflect recent browsing records, so the matching result better fits the operating or browsing habits of users at run time, which improves the accuracy of selecting data content for pushing.
  • FIG. 7 is a schematic structural diagram of another apparatus for selecting data content to be pushed to a terminal according to an embodiment of the present invention.
  • The apparatus for selecting data content to push to a terminal may include at least one processor 701, such as a CPU, at least one communication bus 702, a user interface 703, and a memory 704.
  • The communication bus 702 is used to implement connection and communication between these components.
  • The user interface 703 can include a display; optionally, the user interface 703 can also include a standard wired interface and a wireless interface.
  • the memory 704 may be a high speed RAM memory or a non-volatile memory such as at least one disk memory.
  • the memory 704 can also optionally be at least one storage device located away from the processor 701.
  • the memory 704 stores a set of program codes, and the processor 701 calls the program code stored in the memory 704 to perform the following operations:
  • wherein the tree nodes of the decision tree object include branch nodes and leaf nodes; the branch nodes are in one-to-one correspondence with user attribute types; each branch node stores the feature thresholds of the feature intervals of the corresponding user attribute type; the child nodes of a branch node are in one-to-one correspondence with the feature thresholds; and each leaf node stores the number of clicks and the number of pushes corresponding to the feature threshold that corresponds to that leaf node.
  • processor 701 invokes program code stored in memory 704 for performing the following operations:
  • the number of clicks and the number of pushes stored in the located leaf node are increased according to the browsing record.
  • The operation in which the processor 701 calls the program code stored in the memory 704 to increase the number of clicks and the number of pushes stored in the located leaf node according to the browsing record may further include:
  • obtaining the branch nodes on the path from the root node to the located leaf node in the decision tree object, obtaining the candidate user attribute types other than the user attribute types corresponding to the branch nodes on that path, and adding the number of clicks and the number of pushes obtained from the browsing record under the matching feature interval of each candidate user attribute type.
  • After adding the number of clicks and the number of pushes obtained from the browsing record under the feature intervals of each candidate user attribute type, the processor 701 calls the program code stored in the memory 704 to further perform the following operations:
  • the located leaf node is set as a branch node, and the leaf node of the branch node is generated according to the feature threshold of the feature interval under the found candidate user attribute type.
  • the processor 701 calls the program code stored in the memory 704 to search for a decision tree object corresponding to the data content, and may further include:
  • The operation in which the processor 701 calls the program code stored in the memory 704 to generate a selection reference value according to the number of clicks and the number of pushes may further include:
  • the processor 701 calls the program code stored in the memory 704 to obtain the data content, and may further include:
  • the data content is pre-screened by keyword matching according to the feature value corresponding to the preset user attribute type corresponding to the user identifier.
  • a "computer readable medium” can be any apparatus that can contain, store, communicate, propagate, or transport a program for use in an instruction execution system, apparatus, or device, or in conjunction with such an instruction execution system, apparatus, or device.
  • Examples of computer readable media include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a fiber optic device, and a portable compact disc read-only memory (CD-ROM).
  • The computer readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in computer memory.
  • portions of the invention may be implemented in hardware, software, firmware or a combination thereof.
  • multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system.
  • For example, if implemented in hardware, as in another embodiment, the steps or methods can be implemented by any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits with suitable combinational logic gates, programmable gate arrays (PGAs), field programmable gate arrays (FPGAs), and so on.
  • the above mentioned storage medium may be a read only memory, a magnetic disk or an optical disk or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Development Economics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Computational Linguistics (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Algebra (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

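A method and apparatus for selecting data content to push to a terminal. The method includes: obtaining a user identifier, and obtaining the feature values corresponding to the user identifier under preset user attribute types; obtaining data content, and searching for the decision tree object corresponding to the data content; locating, in the decision tree object, the leaf node corresponding to the user identifier according to the feature values corresponding to the user identifier under the preset user attribute types; and obtaining the number of clicks and the number of pushes stored in the located leaf node, generating a selection reference value according to the number of clicks and the number of pushes, and selecting, according to the selection reference value, data content to push to the terminal corresponding to the user identifier. The decision tree object can be updated in real time at run time, so that the selection of data content can draw on more recent statistics, which improves push accuracy.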

Description

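Method and apparatus for selecting data content to push to a terminal
This patent application claims priority to Chinese Patent Application No. 201510164053.1, filed on April 8, 2015 and entitled "Method and apparatus for selecting data content to push to a terminal", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of computer technology, and in particular to a method and apparatus for selecting data content to push to a terminal.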
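Background
In conventional applications such as Internet advertising, news and information services, and recruitment websites, a server usually needs to push data content to terminals. In a conventional online advertising service, when a user opens a web page, the server pushes (serves) online advertisements corresponding to that user to the user's terminal, and collects statistics such as the click-through rate of the advertisement (the ratio of the number of times the advertisement is clicked after being pushed to the number of times it is pushed, also called Click-Through-Rate, or CTR for short) or the probability that the product or service corresponding to the advertisement is purchased. These parameters reflect whether the advertisement content selected by the server aroused the interest of the terminal user and met the user's needs. When selecting advertisement content for a particular user, the server likewise tries to select advertisements that the user is likely to click or to purchase through.
To select advertisements that better match a user's needs, conventional techniques usually make recommendations based on the user's attributes combined with a matching model. Commonly used matching models include the group popularity model (users are divided into groups by basic attributes such as age and gender, and the top click-through rates of each group are counted) and the logistic regression model (a logistic regression model is built from user attributes, advertisement attributes, advertisement slot attributes, and cross attributes of users, slots, and advertisements). Such matching models usually rely on machine learning: at regular intervals, the historical statistics described above are fed into the model as sample data, and the model parameters are adjusted by machine learning so that the model adapts to more recent user habits. After the model is updated, the server can select the data content that best matches the user according to the updated matching model when pushing data content to the user's terminal.
However, the inventors have found through research that this way of selecting data content matching user attributes with a matching model has at least the following problem: the matching model is updated only at intervals, by offline machine learning on sample data. When the server selects data content to push according to the matching model, the model parameters are therefore not derived from the latest statistics, so the data content selected by the server has low relevance to, or a poor match with, the user, and the accuracy of data content pushing is low.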
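Summary
On this basis, to solve the technical problem in conventional techniques that data content is selected for pushing with low accuracy, a first aspect of the embodiments of the present invention provides a method for selecting data content to push to a terminal.
A method for selecting data content to push to a terminal, including:
obtaining a user identifier, and obtaining the feature values corresponding to the user identifier under preset user attribute types;
obtaining data content, and searching for the decision tree object corresponding to the data content, where the tree nodes of the decision tree object include branch nodes and leaf nodes, the branch nodes are in one-to-one correspondence with user attribute types, each branch node stores the feature thresholds of the feature intervals of the corresponding user attribute type, and the child nodes of the branch node are in one-to-one correspondence with the feature thresholds; and each leaf node stores the number of clicks and the number of pushes corresponding to the feature threshold that corresponds to that leaf node;
locating, in the decision tree object, the leaf node corresponding to the user identifier according to the feature values corresponding to the user identifier under the preset user attribute types, where the feature values match the feature thresholds corresponding to the tree nodes on the path from the root node of the decision tree object to the located leaf node;
obtaining the number of clicks and the number of pushes stored in the located leaf node, generating a selection reference value according to the number of clicks and the number of pushes, and selecting, according to the selection reference value, data content to push to the terminal corresponding to the user identifier.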
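In addition, to solve the technical problem in conventional techniques that data content is selected for pushing with low accuracy, a second aspect of the embodiments of the present invention provides an apparatus for selecting data content to push to a terminal.
An apparatus for selecting data content to push to a terminal, including:
a user identifier obtaining module, configured to obtain a user identifier, and obtain the feature values corresponding to the user identifier under preset user attribute types;
a decision tree obtaining module, configured to obtain data content and search for the decision tree object corresponding to the data content, where the tree nodes of the decision tree object include branch nodes and leaf nodes, the branch nodes are in one-to-one correspondence with user attribute types, each branch node stores the feature thresholds of the feature intervals of the corresponding user attribute type, and the child nodes of the branch node are in one-to-one correspondence with the feature thresholds; and each leaf node stores the number of clicks and the number of pushes corresponding to the feature threshold that corresponds to that leaf node;
a leaf node locating module, configured to locate, in the decision tree object, the leaf node corresponding to the user identifier according to the feature values corresponding to the user identifier under the preset user attribute types, where the feature values match the feature thresholds corresponding to the tree nodes on the path from the root node of the decision tree object to the located leaf node;
a data content selection module, configured to obtain the number of clicks and the number of pushes stored in the located leaf node, generate a selection reference value according to the number of clicks and the number of pushes, and select, according to the selection reference value, data content to push to the terminal corresponding to the user identifier.
Implementing the embodiments of the present invention has the following beneficial effects:
With the above decision tree object as the matching model, data content with a larger selection reference value can be found and pushed by matching the feature values corresponding to the user identifier against the branch nodes of the decision tree object of each item of data content. The logical structure of the decision tree object allows it to be updated in real time from users' browsing records, with no need to sample periodically and then update the decision tree object offline by machine learning on the samples. That is, when the feature values corresponding to the user identifier are matched against the branch nodes of the decision tree objects, the statistics in those objects already reflect recent browsing records, so the matching result better fits the operating or browsing habits of users at run time, which improves the accuracy of selecting data content for pushing.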
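Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
In the drawings:
FIG. 1 is a flowchart of a method for selecting data content to push to a terminal in one embodiment;
FIG. 2 is a diagram of the logical relationship between the tree nodes of a decision tree object in one embodiment;
FIG. 3 is a diagram of the logical relationship between the tree nodes of a decision tree object in one embodiment;
FIG. 4 is a flowchart of a process for extending a leaf node of a decision tree object with a user attribute type in one embodiment;
FIG. 5 is a schematic diagram of extending a leaf node of a decision tree object with a user attribute type in one embodiment;
FIG. 6 is a schematic diagram of an apparatus for selecting data content to push to a terminal in one embodiment;
FIG. 7 is a schematic structural diagram of another apparatus for selecting data content to push to a terminal in one embodiment.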
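Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
In conventional techniques, the matching model is poorly designed and cannot be updated in real time, so data content selected with it is pushed with low accuracy. To solve this technical problem, one embodiment proposes a method for selecting data content to push to a terminal. The method may be carried out by a computer program, which may be a server program that screens data content and pushes it to a corresponding client program, such as an online advertisement delivery program, a news application, an email advertising program, or a résumé push program. The computer program may run on a computer system of von Neumann architecture, which may be a server device running such a server program.
In this embodiment, multiple items of data content are stored in advance on the server device. For example, an online advertisement delivery program has an advertisement database that stores online advertisements, each online advertisement being one item of data content, and an online advertising service provider can add online advertisements by adding records to the advertisement database. A résumé delivery program has a résumé database that stores résumés; users create résumés through a recruitment website and upload them to the résumé database.
Selecting data content is the process of searching the database that stores data content on the server device for the data content that best matches a given user, that is, the data content most likely to be viewed by that user after being pushed. In this embodiment, multiple user attribute types are preset, and each attribute type has corresponding feature intervals.
For example, in a clothing advertisement push system, the preset user attribute types may include "gender", "age group", and "brand". The user attribute type "gender" may include the feature intervals "male" and "female", and the user attribute type "age group" may include feature intervals such as "post-70s", "post-80s", "post-90s", and "post-00s". A feature interval can be defined by feature thresholds: for example, the feature intervals "male" and "female" can be defined with a Boolean variable, and the feature interval "post-70s" can be defined with the feature thresholds [70, 79].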
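As an illustration of how a feature interval can be defined by feature thresholds, the following minimal Python sketch maps a raw value to the interval that contains it. The `FeatureInterval` class, the `match_interval` helper, and the bounds chosen for the post-80s and post-90s intervals are assumptions made for this example; only the [70, 79] range for the post-70s interval comes from the text above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FeatureInterval:
    """A feature interval described by inclusive lower and upper thresholds."""
    label: str
    low: int
    high: int

    def contains(self, value: int) -> bool:
        return self.low <= value <= self.high


# Hypothetical encoding of "age group" by two-digit birth year.
# Only the [70, 79] bounds are taken from the description.
AGE_GROUP = [
    FeatureInterval("post-70s", 70, 79),
    FeatureInterval("post-80s", 80, 89),
    FeatureInterval("post-90s", 90, 99),
]


def match_interval(value: int, intervals: Sequence[FeatureInterval]) -> Optional[str]:
    """Return the label of the first interval containing value, or None."""
    for interval in intervals:
        if interval.contains(value):
            return interval.label
    return None
```

A value that falls outside every interval yields `None`, which mirrors the case mentioned later in the description where a user identifier does not correspond to any feature interval under an attribute type.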
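The user attributes of the user account on the target terminal also have multiple feature values under these user attribute types. Selecting data content is thus the process of traversing the data content in the database, finding the categorized statistics corresponding to each item of data content, filtering out the statistics that correspond to the feature values of the user's attributes, estimating from those statistics the probability that the traversed data content will be viewed after being pushed, and then selecting data content with a high probability of being viewed for pushing.
Specifically, as shown in FIG. 1, the method for selecting data content to push to a terminal includes:
Step S102: Obtain a user identifier, and obtain the feature values corresponding to the user identifier under preset user attribute types.
A user identifier is identification information used to distinguish users. It may be a user account registered with the server program, or an email address, IP address, mobile phone number, or the like of a user targeted for promotion without registration. The feature values corresponding to the user identifier under the preset user attribute types can be extracted from attribute values in the user profile or user operation records of the logged-in user account.
For example, in an online résumé delivery application, there are two types of user account: applicant users and recruiter users. An applicant user, usually an individual, can create résumés, and the created résumés are the data content stored in the application's database. A recruiter user, usually an enterprise or institution, is the push target for online résumés. The server program of the application can search the large number of résumés created by applicant users for the résumé that best matches a given enterprise, and then push that résumé to the terminal corresponding to the recruiter user (it may be pushed to the application's client program on that terminal, or sent by email to the mailbox of the applicant user). When registering a recruiter user, a staff member of the enterprise fills in the enterprise's information according to the preset user attribute types.
For example, the preset user attribute types may include company name, industry type, region, and enterprise nature. If "A" is entered under "company name", "Internet" under "industry type", "Shenzhen" under "region", and "state-owned enterprise" under "enterprise nature" at registration, then "A", "Internet", "Shenzhen", and "state-owned enterprise" are the feature values under the user attribute types company name, industry type, region, and enterprise nature respectively.
In an online advertising program, a large amount of advertisement data (video advertisements, image advertisements, and so on) is stored in a database on the server. The program is based on a web search engine, the user identifier may be the IP address of the terminal, and the feature values corresponding to the user identifier under the preset user attribute types can be extracted from the search records associated with that IP address.
For example, if the preset user attribute types include "product type of interest" and "terminal location", the search records corresponding to the IP address can be looked up. If the keywords in the search records include "milk powder", "stroller", and "diapers", and the feature intervals under "product type of interest" include "infant products", then the feature value corresponding to that terminal IP under "product type of interest" is "infant products". If the geographic location looked up for the terminal IP is "Dongguan", and the feature intervals under "terminal location" include "Guangdong Province", then the feature value corresponding to that terminal IP under "terminal location" is "Guangdong Province".
Step S104: Obtain data content, and search for the decision tree object corresponding to the data content, where the tree nodes of the decision tree object include branch nodes and leaf nodes, the branch nodes are in one-to-one correspondence with user attribute types, each branch node stores the feature thresholds of the feature intervals of the corresponding user attribute type, and the child nodes of the branch node are in one-to-one correspondence with the feature thresholds; and each leaf node stores the number of clicks and the number of pushes corresponding to the feature threshold that corresponds to that leaf node.
A decision tree object can be stored in a data structure that is logically a tree (the Tree type defined in common programming languages). Each item of data content corresponds to one decision tree object. For example, in an online advertisement delivery program, each newly created online advertisement is assigned an online advertisement identifier Aid, and the Aid and its corresponding decision tree object can be stored in a mapping table, with the Aid as the key and the decision tree object as the value.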
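The mapping table described above can be sketched with an ordinary dictionary. This is a minimal illustration, not the patent's implementation: the `DecisionTree` placeholder class, the `register_ad` and `find_tree` helpers, and the identifier `"ad-001"` are hypothetical names introduced here.

```python
from typing import Dict, Optional


class DecisionTree:
    """Placeholder for the per-content decision tree object."""

    def __init__(self, content_id: str) -> None:
        self.content_id = content_id


# Mapping table: advertisement identifier (Aid) -> decision tree object.
tree_by_aid: Dict[str, DecisionTree] = {}


def register_ad(aid: str) -> DecisionTree:
    """Create a decision tree object for a new advertisement and store it under its Aid."""
    tree = DecisionTree(aid)
    tree_by_aid[aid] = tree
    return tree


def find_tree(aid: str) -> Optional[DecisionTree]:
    """Look up the decision tree object for an Aid; None if there is none yet."""
    return tree_by_aid.get(aid)
```

Keeping one tree per item of data content means a browsing record for one advertisement only ever touches that advertisement's tree.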
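A decision tree object is logically a tree structure. In one application scenario, as shown in FIG. 2, the decision tree object has three levels. The tree node at the first level is a branch node and is the root node of the decision tree object. Among the tree nodes at the second level, the tree node "male" is a branch node and the tree node "female" is a leaf node. The tree nodes at the third level are all leaf nodes.
In FIG. 2, the root node corresponds to the user attribute type "gender" and stores the feature thresholds of the feature intervals "male" and "female" under that attribute type. These thresholds can be defined with Boolean variables, numbers, or strings.
The tree nodes at the second level are all child nodes of the root node. The tree node "male" corresponds to the feature threshold of the feature interval "male" under the user attribute type "gender" of the root node, and the tree node "female" corresponds to the feature threshold of the feature interval "female" under the same attribute type.
The tree nodes at the third level are all child nodes of the branch node "male". The branch node "male" corresponds to the user attribute type "education" and stores the feature thresholds of the feature intervals "high school or below", "college", and "master's degree or above" under that attribute type. These thresholds can be defined with numbers or strings. The leaf node "high school or below" corresponds to the feature threshold of the feature interval "high school or below" under "education", the leaf node "college" to that of the feature interval "college", and the leaf node "master's degree or above" to that of the feature interval "master's degree or above".
A leaf node stores the number of clicks and the number of pushes corresponding to the feature threshold that corresponds to that leaf node. For example, as shown in FIG. 2, the leaf node "college" stores a number of clicks (click) of 200 and a number of recommendations (impression) of 1000, meaning that the number of clicks logically corresponding to the leaf node "college" in the decision tree object is 200 and the number of recommendations is 1000.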
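The three-level tree of FIG. 2 could be represented roughly as follows. This is a sketch under assumptions: the `TreeNode` class and `build_fig2_tree` function are invented for illustration, feature thresholds are simplified to string labels used as dictionary keys, and the counts of every leaf other than "college" are placeholders of 0 because the description gives figures only for that leaf.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TreeNode:
    """A tree node: a branch node has an attribute and children, a leaf has neither."""
    attribute: Optional[str] = None                 # user attribute type (branch nodes only)
    children: Dict[str, "TreeNode"] = field(default_factory=dict)  # threshold label -> child
    clicks: int = 0                                 # statistics used on leaf nodes
    pushes: int = 0

    @property
    def is_leaf(self) -> bool:
        return not self.children


def build_fig2_tree() -> TreeNode:
    """Build the example tree: gender at the root, education under 'male'."""
    male = TreeNode(
        attribute="education",
        children={
            "high school or below": TreeNode(),      # counts not given in the text
            "college": TreeNode(clicks=200, pushes=1000),
            "master's degree or above": TreeNode(),  # counts not given in the text
        },
    )
    return TreeNode(attribute="gender", children={"male": male, "female": TreeNode()})
```

Because the children of a branch node are keyed by threshold label, the one-to-one correspondence between child nodes and feature thresholds is enforced by the dictionary itself.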
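Step S106: Locate, in the decision tree object, the leaf node corresponding to the user identifier according to the feature values corresponding to the user identifier under the preset user attribute types, where the feature values match the feature thresholds of the feature intervals corresponding to the branch nodes on the path from the root node of the decision tree object to the located leaf node.
Locating a leaf in the decision tree object according to the feature values corresponding to the user identifier is the process of checking, at a branch node of the decision tree, which feature interval's feature threshold matches the feature value, moving to the corresponding child node of that branch node, and recursively repeating the operation there.
Step S108: Obtain the number of clicks and the number of pushes stored in the located leaf node, generate a selection reference value according to the number of clicks and the number of pushes, and select, according to the selection reference value, data content to push to the terminal corresponding to the user identifier.
As shown in FIG. 2, if, when registering on a dating website, a user enters "male" under "gender", "college" under "education", "divorced" under "marital status", and "32" under "age", then the feature value of that user's identifier under the preset user attribute type "gender" is "male" (in other embodiments, the feature value need not be the string "male"; a feature value of another data type, such as a Boolean variable, a number, or an English character, may denote "male"; the same applies below), the feature value under "education" is "college", the feature value under "marital status" is "divorced", and the feature value under "age group" is "32".
Referring to FIG. 2, when a leaf is located in the decision tree object according to the feature values corresponding to the user identifier, the root node corresponds to the user attribute type "gender" and stores the feature thresholds "male" and "female" under that attribute type. Among the feature values corresponding to the user identifier, the feature value "male" therefore matches the feature threshold "male" stored in the root node, so the child node of the root node, the branch node "male", is obtained for further evaluation.
The branch node "male" corresponds to the user attribute type "education" and stores the feature thresholds "high school or below", "college", and "master's degree or above" under that attribute type. Among the feature values corresponding to the user identifier, the feature value "college" therefore matches the feature threshold "college" stored in the branch node "male", so the child node of the branch node "male", the leaf node "college", is obtained for further evaluation.
Because the node "college" is a leaf node, the number of clicks 200 and the number of pushes 1000 stored in it can be obtained. In other words, according to the historical statistics, the data content corresponding to this decision tree object has been pushed 1000 times in total to users who are both "male" and "college" educated, and was clicked only 200 times. This gives the historical click-through rate of the data content for the group of users who are both "male" and "college" educated, and that historical click-through rate can be used as the selection reference value of the data content with respect to the user identifier.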
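The walkthrough above can be sketched as a short traversal. This is a minimal sketch under assumptions: nodes are nested dictionaries rather than a dedicated tree type, thresholds are matched by exact label, the loop is iterative rather than recursive, and returning `None` when no child matches is a choice made here, not something the description specifies. The `locate_leaf` and `selection_reference` names and the zero counts on the "female" leaf are likewise hypothetical.

```python
from typing import Optional

# A branch node has "attribute" and "children"; a leaf has "clicks" and "pushes".
TREE = {
    "attribute": "gender",
    "children": {
        "male": {
            "attribute": "education",
            "children": {"college": {"clicks": 200, "pushes": 1000}},
        },
        "female": {"clicks": 0, "pushes": 0},  # placeholder counts
    },
}


def locate_leaf(node: dict, features: dict) -> Optional[dict]:
    """Follow matching children from the root until a leaf is reached."""
    while "children" in node:
        value = features.get(node["attribute"])
        child = node["children"].get(value)
        if child is None:          # no feature threshold matches this user
            return None
        node = child
    return node


def selection_reference(leaf: dict) -> float:
    """Historical click-through rate stored in the leaf (0.0 if never pushed)."""
    return leaf["clicks"] / leaf["pushes"] if leaf["pushes"] else 0.0
```

For the user in the example, attributes that no branch node refers to ("marital status", "age") are simply ignored during the traversal, and the leaf "college" yields a click-through rate of 200/1000.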
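In this embodiment, the data content in the database can be traversed to generate the selection reference value of each item of data content with respect to the user identifier, and the data content with the largest selection reference value, or with a selection reference value greater than a preset threshold, is then found and pushed to the terminal corresponding to the user identifier. In other embodiments, it may also be pushed to the terminal corresponding to the user identifier by email or through a social network platform.
In summary, once the user identifier to be pushed to is determined, data content with a larger selection reference value with respect to that user identifier can be found and pushed. The search consists of matching the feature values corresponding to the user identifier against the feature thresholds corresponding to the branch nodes of the decision tree object of the data content, and obtaining the number of clicks and the number of pushes stored in the matched leaf node, thereby finding the selection reference value corresponding to the user identifier.
A decision tree object built in this way for an item of data content can also be updated in real time from the browsing records returned by user operations: adding the number of clicks and the number of recommendations in a returned browsing record to the corresponding leaf node of the decision tree object completes the real-time update.
Specifically, the process of updating a decision tree object may be as follows:
receiving a browsing record uploaded by a terminal, and obtaining the user identifier corresponding to the terminal and the data content corresponding to the browsing record;
obtaining the decision tree object corresponding to the data content, obtaining the feature values corresponding to the user identifier under the preset user attribute types, locating the leaf node corresponding to the user identifier in the decision tree object according to the obtained feature values, and increasing the number of clicks and the number of pushes stored in the located leaf node according to the browsing record.
Continuing the example above, after the dating website sends the data content with the largest selection reference value (for example, the profile of a well-matched user) to the user whose registration information is "male", "college", "divorced", and "32 years old", if that user clicks on the data content to view it, the returned browsing record contains a number of clicks of 1 and a number of pushes of 1.
After receiving the browsing record, the server finds that the feature values of the user corresponding to the browsing record are "male", "college", "divorced", and "32 years old". Using the same locating procedure, it locates the leaf node "college" in the decision tree object of the data content corresponding to the browsing record, and then increases the number of clicks stored in the leaf node "college" to 201 and the number of pushes to 1001. Similarly, if the user does not click on the data content, the number of pushes stored in the leaf node "college" is increased to 1001 and the number of clicks is left unchanged.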
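The counter update described above amounts to one increment per push and one more per click. A minimal sketch, assuming the leaf has already been located; the `LeafStats` class and `apply_browsing_record` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class LeafStats:
    """Click and push counters stored in a located leaf node."""
    clicks: int
    pushes: int


def apply_browsing_record(leaf: LeafStats, clicked: bool) -> None:
    """Record one push, and one click if the user opened the content."""
    leaf.pushes += 1
    if clicked:
        leaf.clicks += 1
```

Because the update touches only two integers in a single leaf, it can be applied as each browsing record arrives, which is what allows the tree to stay current without a periodic offline retraining pass.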
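Further, the decision tree object can also be extended in real time according to the historical statistics by adding tree nodes, that is, by increasing the number of user attribute types to which the branch nodes of the decision tree object correspond. When data content is subsequently selected for pushing, the selection can be made according to the updated decision tree object, which further improves the accuracy of the pushed data content, so that it better matches the user's operating system or user attributes and is more likely to arouse the user's interest.
Specifically, the step of increasing the number of clicks and the number of pushes stored in the located leaf node according to the browsing record further includes:
obtaining the number of clicks and the number of pushes corresponding to the data content in the browsing record;
obtaining the branch nodes on the path from the root node to the located leaf node in the decision tree object, obtaining the preset candidate user attribute types other than the user attribute types corresponding to the branch nodes on that path, and adding the number of clicks and the number of pushes obtained from the browsing record under the matching feature interval of each candidate user attribute type.
As shown in FIG. 3, the leaf node "college" stores not only the total number of clicks, 200, and the total number of recommendations, 1000, for users who are both "male" and "college" educated, but also the counts categorized by the three preset feature intervals under the user attribute type "marital status": the feature interval "unmarried" has 120 clicks and 400 pushes, the feature interval "divorced" has 20 clicks and 400 pushes, and the feature interval "widowed" has 60 clicks and 200 pushes (the three click counts need not add up to the total number of clicks, 200; for example, a user identifier may not correspond to any feature interval under a given user attribute type). It also stores the counts categorized by the three preset feature intervals under the user attribute type "age group": the feature interval "under 30" has 130 clicks and 500 pushes, the feature interval "30-40" has 30 clicks and 400 pushes, and the feature interval "over 40" has 40 clicks and 100 pushes.
Continuing the example above, after the browsing record returned by the user whose registration information is "male", "college", "divorced", and "32 years old" is received, the leaf node "college" is located first; the number of clicks corresponding to "divorced" is then increased to 21, the number of clicks corresponding to "30-40" is increased to 31, the total number of clicks is increased to 201, and the total number of pushes is increased to 1001, while the click counts corresponding to the other feature values remain unchanged.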
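The categorized bookkeeping in a leaf can be sketched as nested dictionaries keyed by candidate attribute type and feature interval. The layout and the `record_view` helper are assumptions made for illustration; the sketch also assumes the caller has already mapped the user's raw values (such as age 32) to interval labels (such as "30-40").

```python
def record_view(leaf: dict, candidate_values: dict, clicked: bool) -> None:
    """Update a leaf's totals and its per-interval counts for each candidate attribute.

    candidate_values maps a candidate user attribute type to the feature interval
    that the user falls into, for example {"marital status": "divorced"}.
    """
    leaf["pushes"] += 1
    leaf["clicks"] += int(clicked)
    for attribute, interval in candidate_values.items():
        per_interval = leaf["candidates"].setdefault(attribute, {})
        stats = per_interval.setdefault(interval, {"clicks": 0, "pushes": 0})
        stats["pushes"] += 1
        stats["clicks"] += int(clicked)
```

With the FIG. 3 figures as the starting state, a clicked record from a divorced user aged 30 to 40 raises the "divorced" and "30-40" click counts to 21 and 31 and the totals to 201 and 1001, as in the example above.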
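A candidate user attribute type is a user attribute type to which no branch node of the decision tree object corresponds. As shown in FIG. 4, the branch nodes on the path from the root node to the leaf node "college" in the decision tree object correspond only to "gender" and "education", while the remaining "marital status" and "age group" have no corresponding branch node. For the path from the root node to the leaf node "college", "marital status" and "age group" are therefore the candidate user attribute types. The branch node on the path from the root node to the leaf node "female" corresponds only to "gender", so for that path "education", "marital status", and "age group" are the candidate user attribute types. After the number of clicks and the number of pushes stored in the leaf nodes of the decision tree object have been updated in real time in this way, a candidate user attribute type can be selected, according to the correlation between the click counts stored in the leaf node for each feature value, to extend the decision tree object.
Specifically, as shown in FIG. 4, after the step of adding the number of clicks and the number of pushes obtained from the browsing record under the feature intervals of each candidate user attribute type, the method further includes:
Step S202: Generate the information gain corresponding to each candidate user attribute type according to the number of clicks and the number of pushes stored in the located leaf node for each feature interval of that candidate user attribute type.
In this embodiment, the following formula can be used:
[Formula image: PCTCN2016078867-appb-000001]
to calculate the information gain of a user attribute type $A$ under a leaf node $S$, where $F_A$ is the set of feature intervals of the user attribute type $A$, $v$ is the feature threshold of each feature interval under the user attribute type $A$, $p(v)$ is the distribution probability of the number of pushes across the feature intervals under the user attribute type $A$, $S_v$ is the set of click and recommendation counts corresponding to each feature threshold $v$, $p_1$ is the ratio of the number of clicks to the number of recommendations for the leaf node $S$, and $p_2$ is the ratio of the number of clicks to the number of recommendations for $S_v$.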
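The formula itself survives only as an image reference in this text. Judging from the symbol definitions above and from the worked example that follows, it is presumably the standard information gain over a two-outcome (clicked or not clicked) entropy. The following is a reconstruction under that assumption, not a transcription of the original figure:

```latex
\begin{aligned}
\mathrm{Gain}(S, A) &= \mathrm{Entropy}(S) - \sum_{v \in F_A} p(v)\,\mathrm{Entropy}(S_v) \\
\mathrm{Entropy}(S) &= -p_1 \log_2 p_1 - (1 - p_1)\log_2 (1 - p_1) \\
\mathrm{Entropy}(S_v) &= -p_2 \log_2 p_2 - (1 - p_2)\log_2 (1 - p_2)
\end{aligned}
```

Under this reading, the weighted sum $\sum_{v \in F_A} p(v)\,\mathrm{Entropy}(S_v)$ corresponds to the quantity written as $\mathrm{Entropy}(S_A)$ in the example, and $\mathrm{Gain}(S, A)$ to the value written as $G(A)$ in the later steps.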
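For example, referring to FIG. 4, the value of $p_1$ is the ratio of the total number of clicks 200 stored in the leaf node "college" to the total number of pushes 1000, so:
$\mathrm{Entropy}(S) = -0.2\log_2 0.2 - 0.8\log_2 0.8$
The feature threshold $v$ of each feature interval under the user attribute type "marital status" takes the values "unmarried", "divorced", and "widowed" in turn, where:
when $v$ is "unmarried":
$p(v) = 400/1000 = 0.4$ and $p_2 = 120/400 = 0.3$;
$\mathrm{Entropy}(S_v) = -0.3\log_2 0.3 - 0.7\log_2 0.7$;
when $v$ is "divorced":
$p(v) = 400/1000 = 0.4$ and $p_2 = 20/400 = 0.05$;
$\mathrm{Entropy}(S_v) = -0.05\log_2 0.05 - 0.95\log_2 0.95$;
when $v$ is "widowed":
$p(v) = 200/1000 = 0.2$ and $p_2 = 60/200 = 0.3$;
$\mathrm{Entropy}(S_v) = -0.3\log_2 0.3 - 0.7\log_2 0.7$;
which gives the information gain of the user attribute type "marital status".
Similarly, referring to FIG. 3, the feature threshold $v$ of each feature interval under the user attribute type "age group" takes the values "under 30", "30-40", and "over 40" in turn, where:
when $v$ is "under 30":
$p(v) = 500/1000 = 0.5$ and $p_2 = 130/500 = 0.26$;
$\mathrm{Entropy}(S_v) = -0.26\log_2 0.26 - 0.74\log_2 0.74$;
when $v$ is "30-40":
$p(v) = 400/1000 = 0.4$ and $p_2 = 30/400 = 0.075$;
$\mathrm{Entropy}(S_v) = -0.075\log_2 0.075 - 0.925\log_2 0.925$;
when $v$ is "over 40":
$p(v) = 100/1000 = 0.1$ and $p_2 = 40/100 = 0.4$;
$\mathrm{Entropy}(S_v) = -0.4\log_2 0.4 - 0.6\log_2 0.6$;
From these, $\mathrm{Entropy}(S_A)$ can be calculated, which gives the information gain of the user attribute type "age group".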
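The arithmetic of the worked example can be checked numerically. The sketch below assumes the information gain takes the usual form, the leaf entropy minus the push-weighted entropies of the feature intervals, which is consistent with the symbol definitions but is not confirmed by the formula image. The `binary_entropy` and `information_gain` functions are hypothetical helpers; the counts are those given for the leaf node "college".

```python
import math
from typing import Dict, Tuple


def binary_entropy(p: float) -> float:
    """Entropy in bits of a clicked / not-clicked outcome with click probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def information_gain(total_clicks: int, total_pushes: int,
                     intervals: Dict[str, Tuple[int, int]]) -> float:
    """Leaf entropy minus the push-weighted entropy of each feature interval.

    intervals maps a feature interval label to (clicks, pushes).
    """
    gain = binary_entropy(total_clicks / total_pushes)
    for clicks, pushes in intervals.values():
        gain -= (pushes / total_pushes) * binary_entropy(clicks / pushes)
    return gain


MARITAL_STATUS = {"unmarried": (120, 400), "divorced": (20, 400), "widowed": (60, 200)}
AGE_GROUP = {"under 30": (130, 500), "30-40": (30, 400), "over 40": (40, 100)}

gain_marital = information_gain(200, 1000, MARITAL_STATUS)
gain_age = information_gain(200, 1000, AGE_GROUP)
```

With these counts, the gain for "marital status" comes out at roughly 0.079 bits and the gain for "age group" at roughly 0.058 bits, so "marital status" has the larger gain for this leaf.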
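Step S204: Find the candidate user attribute type whose information gain exceeds the information gain of the other candidate user attribute types by at least the information gain threshold.
Step S206: When such a candidate user attribute type is found, set the located leaf node as a branch node, and generate the leaf nodes of that branch node according to the feature thresholds of the feature intervals under the found candidate user attribute type.
For example, let G(marital status) - G(age group) = a. If a is greater than or equal to the information gain threshold, then, as shown in FIG. 5, the leaf node "college" is updated to the branch node "college", and the leaf nodes "unmarried", "divorced", and "widowed" are added under the branch node "college". That is, the feature thresholds of the feature intervals "unmarried", "divorced", and "widowed" are stored in the branch node "college", while each new leaf node still stores click counts categorized by the feature intervals of "age group".
It should be noted that there may be many candidate user attribute types. For example, if a leaf node L has four candidate user attribute types A, B, C, and D, then G(A), G(B), G(C), and G(D) are calculated first, and the two candidates with the largest G are identified. If G(A) > G(B) > G(C) > G(D), then G(A) - G(B) is calculated; if G(A) - G(B) is greater than the information gain threshold, candidate user attribute type A is selected to correspond to the tree node.
If G(A) - G(B) is smaller than the information gain threshold, the decision tree object is left unchanged and its leaf node is not split. A leaf node generated by a split, as shown in FIG. 5, stores the total number of clicks and the total number of recommendations for that leaf node, recounted from the browsing records, together with the number of clicks and the number of pushes for each feature interval of the candidate user attribute types (such as the user attribute type "age group" in FIG. 5).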
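The split decision of steps S204 and S206 can be sketched as a comparison between the two largest gains. The `choose_split_attribute` function is a hypothetical helper; it uses the "greater than or equal to" comparison of step S204, and treating the runner-up gain as 0.0 when there is only one candidate is an assumption made here, since the description does not cover that case.

```python
from typing import Dict, Optional


def choose_split_attribute(gains: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the candidate attribute to split on, or None to leave the leaf unsplit.

    gains maps each candidate user attribute type to its information gain.
    """
    ranked = sorted(gains.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return None
    best_name, best_gain = ranked[0]
    # Assumption: with a single candidate, compare its gain against 0.0.
    runner_up_gain = ranked[1][1] if len(ranked) > 1 else 0.0
    if best_gain - runner_up_gain >= threshold:
        return best_name
    return None
```

Requiring a margin over the runner-up, rather than simply taking the largest gain, keeps a leaf from being split while the statistics do not yet clearly favor one attribute.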
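Extending the decision tree can further improve push accuracy. According to the above formula, if the click and push counts across the feature intervals of a candidate user attribute type are relatively evenly distributed, its information gain is large. In other words, when the decision tree object is extended, the candidate user attribute type whose click and push counts are evenly distributed across its feature intervals is always selected, so that when a leaf is located from the feature values of a user identifier, the probabilities of entering each leaf node under the branch node are similar.
Therefore, selecting candidate user attribute types by their information gain balances the probability of reaching each leaf node in the decision tree object. This avoids a leaf node whose matching conditions are so strict that it is only rarely used to match the feature values of a user identifier, and thereby improves the space utilization of the stored decision tree object.
For newly added data content, a decision tree object can be created at run time. Specifically, the step of searching for the decision tree object corresponding to the data content further includes: if no decision tree object corresponding to the data content is found, creating a decision tree object corresponding to the data content, the root node of the created decision tree object being a leaf node; and assigning a default selection reference value to the data content.
That is, after a decision tree object is created for newly added data content, it can be extended in real time according to the browsing records subsequently returned by terminals. The decision tree object may initially consist of the root node alone (which, having no child nodes, is necessarily also a leaf node); as more browsing records are received, candidate user attribute types can be selected step by step to create branch nodes, gradually completing the decision tree object.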
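Creating a tree on first sight of new data content might look like the following. This is a sketch under assumptions: the root-only tree is a dictionary of counters, the `DEFAULT_REFERENCE` value of 0.05 is an arbitrary placeholder (the description does not say what the default is), and falling back to the default while a tree has no pushes yet is a choice made for this example.

```python
from typing import Dict

DEFAULT_REFERENCE = 0.05  # hypothetical default selection reference value


def reference_for(trees: Dict[str, dict], content_id: str) -> float:
    """Return the selection reference value, creating a root-only tree if none exists."""
    root = trees.get(content_id)
    if root is None:
        # The new root has no children, so it is also a leaf node.
        trees[content_id] = {"clicks": 0, "pushes": 0}
        return DEFAULT_REFERENCE
    if root["pushes"] == 0:
        return DEFAULT_REFERENCE  # assumption: no statistics yet, keep the default
    return root["clicks"] / root["pushes"]
```

The default value gives new content a chance to be pushed at all; once browsing records arrive, the stored counts take over.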
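Moreover, with the scheme of extending the decision tree object, if a user attribute type is added later, a branch node corresponding to it can be added to the decision tree object according to the statistics of the browsing records for the added user attribute type. The decision tree object can thus take newly added user attribute types into account in real time, which improves the extensibility of a system used for pushing data content.
Optionally, the step of generating a selection reference value according to the number of clicks and the number of pushes further includes:
obtaining a pricing weight coefficient corresponding to the data content, and multiplying the ratio of the number of clicks to the number of pushes by the pricing weight coefficient to obtain the selection reference value of the data content.
For example, in an online advertising application, the fee charged per click differs between types of advertisement. Introducing the pricing weight coefficient when generating the selection reference value lets it reflect not only the historical click-through rate but also the click revenue of an advertisement, so as to maximize online advertising revenue.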
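The weighted selection reference value is a single multiplication. A minimal sketch; the `weighted_reference` function name and the sample weights are hypothetical.

```python
def weighted_reference(clicks: int, pushes: int, pricing_weight: float) -> float:
    """Click-through rate multiplied by the pricing weight coefficient."""
    if pushes <= 0:
        raise ValueError("pushes must be positive")
    return clicks / pushes * pricing_weight


# Two advertisements with different click-through rates and pricing weights.
ad_a = weighted_reference(clicks=200, pushes=1000, pricing_weight=1.0)
ad_b = weighted_reference(clicks=100, pushes=1000, pricing_weight=3.0)
```

In this example the second advertisement has half the click-through rate of the first but three times the pricing weight, so it ends up with the larger selection reference value.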
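Optionally, the step of obtaining data content further includes: pre-filtering the data content by keyword matching according to the feature values of the user identifier under the preset user attribute types.
In existing systems for pushing data content, the database usually stores a very large amount of data content. The data content in the database can therefore be pre-filtered according to the feature values of the user identifier under the preset user attribute types: data content that does not contain a keyword corresponding to the feature values is filtered out.
For example, in a dating website scenario, if the target user is male, female user profiles can be pre-selected first. Then, among the female user profiles, those with larger selection reference values are found by steps S104 to S108 above and pushed to that male user.
Pre-filtering the data content greatly reduces the number of decision tree object matches, which reduces the amount of computation and improves the execution efficiency of the computer.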
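The pre-filter can be sketched as a keyword-set intersection. The mapping from a user's feature value to the keywords that content must carry is hypothetical (the description does not say how it is defined), as are the profile records and field names.

```python
from typing import Dict, List, Set

# Hypothetical mapping from a user feature value to required content keywords.
KEYWORDS_FOR_FEATURE: Dict[str, Set[str]] = {
    "gender=male": {"female"},
    "gender=female": {"male"},
}


def prefilter(contents: List[dict], user_features: List[str]) -> List[dict]:
    """Keep only content carrying at least one keyword required by the user's features."""
    wanted: Set[str] = set()
    for feature in user_features:
        wanted |= KEYWORDS_FOR_FEATURE.get(feature, set())
    if not wanted:
        return list(contents)  # assumption: with nothing to filter on, keep everything
    return [item for item in contents if wanted & set(item["keywords"])]


profiles = [
    {"id": "p1", "keywords": ["female", "teacher"]},
    {"id": "p2", "keywords": ["male", "engineer"]},
    {"id": "p3", "keywords": ["female", "doctor"]},
]
kept = prefilter(profiles, ["gender=male"])
print([item["id"] for item in kept])
```

Only the content that survives this step needs to be matched against its decision tree object, which is where the saving in computation comes from.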
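In one embodiment, to solve the technical problem in the conventional technology that the matching model is unreasonably designed and cannot be updated in real time, so that selecting data content to push based on the matching model has low accuracy, an apparatus for selecting data content to push to a terminal is further provided. As shown in FIG. 6, the apparatus includes a user identifier obtaining module 102, a decision tree obtaining module 104, a leaf node locating module 106 and a data content selection module 108, where:
the user identifier obtaining module 102 is configured to obtain a user identifier and obtain the feature values of the user identifier under preset user attribute types;
the decision tree obtaining module 104 is configured to obtain data content and search for the decision tree object corresponding to the data content, where the tree nodes of the decision tree object include branch nodes and leaf nodes, the branch nodes are in one-to-one correspondence with user attribute types, each branch node stores the feature thresholds of the feature intervals of the corresponding user attribute type, and the child nodes of the branch node are in one-to-one correspondence with the feature thresholds; each leaf node stores the click count and push count corresponding to the feature threshold to which that leaf node corresponds;
the leaf node locating module 106 is configured to locate, in the decision tree object, the leaf node corresponding to the user identifier according to the feature values of the user identifier under the preset user attribute types, where the feature values match the feature thresholds corresponding to the tree nodes on the path from the root node of the decision tree object to the located leaf node;
the data content selection module 108 is configured to obtain the click count and push count stored in the located leaf node, generate a selection reference value according to the click count and the push count, and select, according to the selection reference value, data content to push to the terminal corresponding to the user identifier.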
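The four modules can be read as a lookup pipeline: fetch the user's feature values, walk each content item's tree to a leaf, and rank by the leaf's click-to-push ratio. The sketch below assumes that feature values have already been bucketed into interval labels (so matching a feature threshold is a dictionary lookup) and that the value is the plain ratio; the class, the function names and the sample data are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    clicks: int = 0
    pushes: int = 0
    attribute: Optional[str] = None                          # user attribute type of a branch node
    children: Dict[str, "Node"] = field(default_factory=dict)  # interval label -> child node


def locate_leaf(root: Node, features: Dict[str, str]) -> Node:
    """Follow the user's interval label at each branch node until a leaf is reached."""
    node = root
    while node.attribute is not None:
        node = node.children[features[node.attribute]]
    return node


def pick_content(trees: Dict[str, Node], features: Dict[str, str]) -> Optional[str]:
    """Return the content id whose located leaf has the largest click-to-push ratio."""
    best_id, best_value = None, -1.0
    for content_id, root in trees.items():
        leaf = locate_leaf(root, features)
        value = leaf.clicks / leaf.pushes if leaf.pushes else 0.0
        if value > best_value:
            best_id, best_value = content_id, value
    return best_id


trees = {
    "ad-1": Node(attribute="education", children={
        "junior college": Node(clicks=60, pushes=200),
        "bachelor": Node(clicks=10, pushes=100),
    }),
    "ad-2": Node(clicks=5, pushes=100),  # a tree that is still a single leaf
}
print(pick_content(trees, {"education": "junior college"}))
```

A user whose feature value has no matching child raises `KeyError` in this sketch; a real system would need a policy for that case, which the description does not specify.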
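In this embodiment, as shown in FIG. 6, the apparatus for selecting data content to push to a terminal further includes a decision tree update module 110, configured to: receive an uploaded browsing record, and obtain the user identifier corresponding to the browsing record and the data content corresponding to the browsing record; obtain the decision tree object corresponding to the data content, obtain the feature values of the user identifier under the preset user attribute types, locate the leaf node corresponding to the user identifier in the decision tree object according to the obtained feature values, and increase the click count and push count stored in the located leaf node according to the browsing record.
In this embodiment, the decision tree update module 110 is further configured to: obtain the click count and push count corresponding to the data content from the browsing record; obtain the branch nodes on the path from the root node to the located leaf node in the decision tree object, obtain the preset candidate user attribute types other than the user attribute types corresponding to the branch nodes on that path, and add the click count and push count obtained from the browsing record, classified by the feature intervals under each candidate user attribute type.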
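The update performed by module 110 amounts to incrementing counters in the located leaf: once for the leaf as a whole, and once per candidate user attribute type in the interval the user falls into. The sketch below is one possible layout; the `Leaf` class, the nested dictionary and the English labels are assumptions.

```python
from collections import defaultdict
from typing import Dict, Set


class Leaf:
    def __init__(self) -> None:
        self.clicks = 0
        self.pushes = 0
        # candidate attribute type -> interval label -> [clicks, pushes]
        self.by_candidate = defaultdict(lambda: defaultdict(lambda: [0, 0]))


def record_browse(leaf: Leaf, features: Dict[str, str], path_attributes: Set[str],
                  clicks: int, pushes: int) -> None:
    """Add one browsing record's counts to the leaf and to each candidate attribute bucket."""
    leaf.clicks += clicks
    leaf.pushes += pushes
    for attribute, interval in features.items():
        if attribute in path_attributes:
            continue  # already used by a branch node on the path, so not a candidate
        cell = leaf.by_candidate[attribute][interval]
        cell[0] += clicks
        cell[1] += pushes


leaf = Leaf()
user = {"education": "junior college", "age range": "under 30", "marital status": "unmarried"}
record_browse(leaf, user, path_attributes={"education"}, clicks=1, pushes=1)  # pushed and clicked
record_browse(leaf, user, path_attributes={"education"}, clicks=0, pushes=1)  # pushed, not clicked
print(leaf.clicks, leaf.pushes, dict(leaf.by_candidate["age range"]))
```

Because the per-interval counts are kept in the leaf itself, the information gain of every candidate can be recomputed after each record without rescanning the browsing history.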
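In this embodiment, the decision tree update module 110 is further configured to: generate the information gain of each candidate user attribute type according to the click counts and push counts stored in the located leaf node, classified by the feature intervals under that candidate user attribute type; search for a candidate user attribute type whose information gain exceeds the information gain of every other candidate user attribute type by a difference greater than or equal to the information gain threshold; and, when such a candidate is found, set the located leaf node as a branch node and generate the leaf nodes of that branch node according to the feature thresholds of the feature intervals under the found candidate user attribute type.
In this embodiment, the decision tree update module 110 is further configured to calculate, according to the formula:
Figure PCTCN2016078867-appb-000002
the information gain of user attribute type A under leaf node S, where FA is the set of feature intervals of user attribute type A, v is the feature threshold of each feature interval under user attribute type A, and p(v) is the distribution probability of the push count across the feature intervals under user attribute type A; Sv is the set of click counts and push counts corresponding to each feature threshold v, p1 is the ratio of the click count to the push count of leaf node S, and p2 is the ratio of the click count to the push count of Sv.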
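The formula itself survives here only as an image reference. Read against the symbol definitions above and the worked Entropy(Sv) calculations earlier in the description, it is presumably the standard information gain over a binary click / no-click outcome. The reconstruction below is an assumption, not a transcription of the figure; it uses the G(A) notation from the worked example.

```latex
G(A) = \mathrm{Entropy}(S) - \sum_{v \in F_A} p(v)\,\mathrm{Entropy}(S_v)

\mathrm{Entropy}(S) = -p_1 \log_2 p_1 - (1 - p_1)\log_2(1 - p_1)

\mathrm{Entropy}(S_v) = -p_2 \log_2 p_2 - (1 - p_2)\log_2(1 - p_2)
```

Under this reading, the term written as Entropy(SA) in the worked example corresponds to the weighted sum over v.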
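In this embodiment, as shown in FIG. 6, the apparatus for selecting data content to push to a terminal further includes a decision tree creation module 112, configured to create a decision tree object corresponding to the data content when no decision tree object corresponding to the data content is found, the root node of the created decision tree object being a leaf node;
the decision tree obtaining module is further configured to assign a default selection reference value to the data content when no decision tree object corresponding to the data content is found.
In this embodiment, the data content selection module 108 is further configured to obtain a pricing weight coefficient corresponding to the data content, and multiply the ratio of the click count to the push count by the pricing weight coefficient to obtain the selection reference value of the data content.
In this embodiment, as shown in FIG. 6, the apparatus for selecting data content to push to a terminal further includes a data content filtering module 114, configured to pre-filter the data content by keyword matching according to the feature values of the user identifier under the preset user attribute types.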
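According to one embodiment of the present invention, the method for selecting data content to push to a terminal shown in FIG. 1 to FIG. 5 may be performed by the units of the apparatus for selecting data content to push to a terminal shown in FIG. 6. For example, steps S102, S104, S106 and S108 shown in FIG. 1 may be performed by the user identifier obtaining module 102, the decision tree obtaining module 104, the leaf node locating module 106 and the data content selection module 108 shown in FIG. 6, respectively; steps S202, S204 and S206 shown in FIG. 4 may be performed by the decision tree update module 110 shown in FIG. 6.
According to another embodiment of the present invention, the units of the apparatus shown in FIG. 6 may be separately or wholly combined into one or several other units, or one or more of the units may be further divided into multiple functionally smaller units. This achieves the same operations without affecting the technical effects of the embodiments of the present invention. The above units are divided by logical function; in practical applications, the function of one unit may be implemented by multiple units, or the functions of multiple units may be implemented by one unit. In other embodiments of the present invention, the terminal device may also include other modules, and in practical applications these functions may also be implemented with the assistance of other units or by multiple units working together.
According to another embodiment of the present invention, the apparatus shown in FIG. 6, and the method according to the embodiments of the present invention, may be implemented by running a computer program (including program code) capable of performing the method shown in FIG. 1 to FIG. 5 on a general-purpose computing device, such as a computer, that includes processing and storage elements such as a central processing unit (CPU), a random access memory (RAM) and a read-only memory (ROM). The computer program may be recorded on, for example, a computer-readable recording medium, loaded into the computing device from that medium, and run there.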
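In summary, implementing the embodiments of the present invention has the following beneficial effects:
With the above decision tree object as the matching model, data content with a larger selection reference value can be found for pushing by matching the feature values of the user identifier against the branch nodes of the decision tree object of each data content. The logical structure of the decision tree object allows it to be updated in real time from users' browsing records, rather than being updated offline by machine learning on samples collected periodically. In other words, when the feature values of the user identifier are matched against the branch nodes of the decision tree objects, the statistics in the decision tree objects all reflect recent user browsing records. The matching result therefore better fits users' operating or browsing habits at run time, which improves the accuracy of selecting data content to push.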
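Referring to FIG. 7, FIG. 7 is a schematic structural diagram of another apparatus for selecting data content to push to a terminal according to an embodiment of the present invention. As shown in FIG. 7, the apparatus may include at least one processor 701 (for example, a CPU), at least one communication bus 702, a user interface 703 and a memory 704. The communication bus 702 is used to implement connection and communication between these components. The user interface 703 may include a display and, optionally, a standard wired interface and a wireless interface. The memory 704 may be a high-speed RAM or a non-volatile memory, for example at least one disk memory. Optionally, the memory 704 may also be at least one storage device located remotely from the processor 701. The memory 704 stores a set of program code, and the processor 701 invokes the program code stored in the memory 704 to perform the following operations:
obtaining a user identifier, and obtaining the feature values of the user identifier under preset user attribute types;
obtaining data content, and searching for the decision tree object corresponding to the data content, where the tree nodes of the decision tree object include branch nodes and leaf nodes, the branch nodes are in one-to-one correspondence with user attribute types, each branch node stores the feature thresholds of the feature intervals of the corresponding user attribute type, and the child nodes of the branch node are in one-to-one correspondence with the feature thresholds; each leaf node stores the click count and push count corresponding to the feature threshold to which that leaf node corresponds;
locating, in the decision tree object, the leaf node corresponding to the user identifier according to the feature values of the user identifier under the preset user attribute types, where the feature values match the feature thresholds of the feature intervals corresponding to the branch nodes on the path from the root node of the decision tree object to the located leaf node; and
obtaining the click count and push count stored in the located leaf node, generating a selection reference value according to the click count and the push count, and selecting, according to the selection reference value, data content to push to the terminal corresponding to the user identifier.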
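In an optional embodiment, the processor 701 invokes the program code stored in the memory 704 to perform the following operations:
receiving a browsing record uploaded by a terminal, and obtaining the user identifier corresponding to the terminal and the data content corresponding to the browsing record; and
obtaining the decision tree object corresponding to the data content, obtaining the feature values of the user identifier under the preset user attribute types, locating the leaf node corresponding to the user identifier in the decision tree object according to the obtained feature values, and increasing the click count and push count stored in the located leaf node according to the browsing record.
In an optional embodiment, when the processor 701 invokes the program code stored in the memory 704 to increase the click count and push count stored in the located leaf node according to the browsing record, the operations may further include:
obtaining the click count and push count corresponding to the data content from the browsing record; and
obtaining the branch nodes on the path from the root node to the located leaf node in the decision tree object, obtaining the preset candidate user attribute types other than the user attribute types corresponding to the branch nodes on that path, and adding the click count and push count obtained from the browsing record, classified by the feature intervals under each candidate user attribute type.
In an optional embodiment, after the processor 701 invokes the program code stored in the memory 704 to add the click count and push count obtained from the browsing record, classified by the feature intervals under each candidate user attribute type, the processor 701 further invokes the program code stored in the memory 704 to perform the following operations:
generating the information gain of each candidate user attribute type according to the click counts and push counts stored in the located leaf node, classified by the feature intervals under that candidate user attribute type;
searching for a candidate user attribute type whose information gain exceeds the information gain of every other candidate user attribute type by a difference greater than or equal to the information gain threshold; and
when such a candidate is found, setting the located leaf node as a branch node, and generating the leaf nodes of that branch node according to the feature thresholds of the feature intervals under the found candidate user attribute type.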
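In an optional embodiment, when the processor 701 invokes the program code stored in the memory 704 to search for the decision tree object corresponding to the data content, the operations may further include:
if no decision tree object corresponding to the data content is found, creating a decision tree object corresponding to the data content, the root node of the created decision tree object being a leaf node; and
assigning a default selection reference value to the data content.
In an optional embodiment, when the processor 701 invokes the program code stored in the memory 704 to generate the selection reference value according to the click count and the push count, the operations may further include:
obtaining a pricing weight coefficient corresponding to the data content, and multiplying the ratio of the click count to the push count by the pricing weight coefficient to obtain the selection reference value of the data content.
In an optional embodiment, when the processor 701 invokes the program code stored in the memory 704 to obtain data content, the operations may further include:
pre-filtering the data content by keyword matching according to the feature values of the user identifier under the preset user attribute types.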
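The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch and execute instructions from an instruction execution system, apparatus or device). For the purposes of this specification, a "computer-readable medium" may be any apparatus that can contain, store, communicate, propagate or transport a program for use by, or in connection with, an instruction execution system, apparatus or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or otherwise suitably processing it if necessary, and then stored in a computer memory.
It should be understood that the parts of the present invention may be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc or the like. Although embodiments of the present invention have been shown and described above, it should be understood that these embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.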

Claims (20)

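  1. A method for selecting data content to push to a terminal, comprising:
    obtaining a user identifier, and obtaining the feature values of the user identifier under preset user attribute types;
    obtaining data content, and searching for the decision tree object corresponding to the data content, wherein the tree nodes of the decision tree object comprise branch nodes and leaf nodes, the branch nodes are in one-to-one correspondence with user attribute types, each branch node stores the feature thresholds of the feature intervals of the corresponding user attribute type, and the child nodes of the branch node are in one-to-one correspondence with the feature thresholds; each leaf node stores the click count and push count corresponding to the feature threshold to which that leaf node corresponds;
    locating, in the decision tree object, the leaf node corresponding to the user identifier according to the feature values of the user identifier under the preset user attribute types, wherein the feature values match the feature thresholds corresponding to the tree nodes on the path from the root node of the decision tree object to the located leaf node;
    obtaining the click count and push count stored in the located leaf node, generating a selection reference value according to the click count and the push count, and selecting, according to the selection reference value, data content to push to the terminal corresponding to the user identifier.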
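  2. The method for selecting data content to push to a terminal according to claim 1, wherein the method further comprises:
    receiving an uploaded browsing record, and obtaining the user identifier corresponding to the browsing record and the data content corresponding to the browsing record;
    obtaining the decision tree object corresponding to the data content, obtaining the feature values of the user identifier under the preset user attribute types, locating the leaf node corresponding to the user identifier in the decision tree object according to the obtained feature values, and increasing the click count and push count stored in the located leaf node according to the browsing record.
  3. The method for selecting data content to push to a terminal according to claim 2, wherein the step of increasing the click count and push count stored in the located leaf node according to the browsing record further comprises:
    obtaining the click count and push count corresponding to the data content from the browsing record;
    obtaining the branch nodes on the path from the root node to the located leaf node in the decision tree object, obtaining the preset candidate user attribute types other than the user attribute types corresponding to the branch nodes on that path, and adding the click count and push count obtained from the browsing record, classified by the feature intervals under each candidate user attribute type.
  4. The method for selecting data content to push to a terminal according to claim 3, wherein, after the step of adding the click count and push count obtained from the browsing record, classified by the feature intervals under each candidate user attribute type, the method further comprises:
    generating the information gain of each candidate user attribute type according to the click counts and push counts stored in the located leaf node, classified by the feature intervals under that candidate user attribute type;
    searching for a candidate user attribute type whose information gain exceeds the information gain of every other candidate user attribute type by a difference greater than or equal to the information gain threshold;
    when such a candidate is found, setting the located leaf node as a branch node, and generating the leaf nodes of that branch node according to the feature thresholds of the feature intervals under the found candidate user attribute type.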
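  5. The method for selecting data content to push to a terminal according to claim 4, wherein the step of generating the information gain of each candidate user attribute type according to the click counts and push counts stored in the located leaf node, classified by the feature intervals under that candidate user attribute type, is:
    calculating, according to the formula:
    Figure PCTCN2016078867-appb-100001
    the information gain of user attribute type A under leaf node S, wherein FA is the set of feature intervals of user attribute type A, v is the feature threshold of each feature interval under user attribute type A, and p(v) is the distribution probability of the push count across the feature intervals under user attribute type A; Sv is the set of click counts and push counts corresponding to each feature threshold v, p1 is the ratio of the click count to the push count of leaf node S, and p2 is the ratio of the click count to the push count of Sv.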
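  6. The method for selecting data content to push to a terminal according to any one of claims 1 to 5, wherein the step of searching for the decision tree object corresponding to the data content further comprises:
    if no decision tree object corresponding to the data content is found, creating a decision tree object corresponding to the data content, the root node of the created decision tree object being a leaf node;
    assigning a default selection reference value to the data content.
  7. The method for selecting data content to push to a terminal according to any one of claims 1 to 5, wherein the step of generating the selection reference value according to the click count and the push count further comprises:
    obtaining a pricing weight coefficient corresponding to the data content, and multiplying the ratio of the click count to the push count by the pricing weight coefficient to obtain the selection reference value of the data content.
  8. The method for selecting data content to push to a terminal according to any one of claims 1 to 5, wherein the step of obtaining data content further comprises:
    pre-filtering the data content by keyword matching according to the feature values of the user identifier under the preset user attribute types.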
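  9. An apparatus for selecting data content to push to a terminal, comprising:
    a user identifier obtaining module, configured to obtain a user identifier and obtain the feature values of the user identifier under preset user attribute types;
    a decision tree obtaining module, configured to obtain data content and search for the decision tree object corresponding to the data content, wherein the tree nodes of the decision tree object comprise branch nodes and leaf nodes, the branch nodes are in one-to-one correspondence with user attribute types, each branch node stores the feature thresholds of the feature intervals of the corresponding user attribute type, and the child nodes of the branch node are in one-to-one correspondence with the feature thresholds; each leaf node stores the click count and push count corresponding to the feature threshold to which that leaf node corresponds;
    a leaf node locating module, configured to locate, in the decision tree object, the leaf node corresponding to the user identifier according to the feature values of the user identifier under the preset user attribute types, wherein the feature values match the feature thresholds corresponding to the tree nodes on the path from the root node of the decision tree object to the located leaf node;
    a data content selection module, configured to obtain the click count and push count stored in the located leaf node, generate a selection reference value according to the click count and the push count, and select, according to the selection reference value, data content to push to the terminal corresponding to the user identifier.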
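  10. The apparatus for selecting data content to push to a terminal according to claim 9, wherein the apparatus further comprises a decision tree update module, configured to: receive an uploaded browsing record, and obtain the user identifier corresponding to the browsing record and the data content corresponding to the browsing record; obtain the decision tree object corresponding to the data content, obtain the feature values of the user identifier under the preset user attribute types, locate the leaf node corresponding to the user identifier in the decision tree object according to the obtained feature values, and increase the click count and push count stored in the located leaf node according to the browsing record.
  11. The apparatus for selecting data content to push to a terminal according to claim 10, wherein the decision tree update module is further configured to: obtain the click count and push count corresponding to the data content from the browsing record; obtain the branch nodes on the path from the root node to the located leaf node in the decision tree object, obtain the preset candidate user attribute types other than the user attribute types corresponding to the branch nodes on that path, and add the click count and push count obtained from the browsing record, classified by the feature intervals under each candidate user attribute type.
  12. The apparatus for selecting data content to push to a terminal according to claim 11, wherein the decision tree update module is further configured to: generate the information gain of each candidate user attribute type according to the click counts and push counts stored in the located leaf node, classified by the feature intervals under that candidate user attribute type; search for a candidate user attribute type whose information gain exceeds the information gain of every other candidate user attribute type by a difference greater than or equal to the information gain threshold; and, when such a candidate is found, set the located leaf node as a branch node and generate the leaf nodes of that branch node according to the feature thresholds of the feature intervals under the found candidate user attribute type.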
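  13. The apparatus for selecting data content to push to a terminal according to claim 12, wherein the decision tree update module is further configured to calculate, according to the formula:
    Figure PCTCN2016078867-appb-100002
    the information gain of user attribute type A under leaf node S, wherein FA is the set of feature intervals of user attribute type A, v is the feature threshold of each feature interval under user attribute type A, and p(v) is the distribution probability of the push count across the feature intervals under user attribute type A; Sv is the set of click counts and push counts corresponding to each feature threshold v, p1 is the ratio of the click count to the push count of leaf node S, and p2 is the ratio of the click count to the push count of Sv.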
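  14. The apparatus for selecting data content to push to a terminal according to any one of claims 9 to 13, wherein the apparatus further comprises a decision tree creation module, configured to create a decision tree object corresponding to the data content when no decision tree object corresponding to the data content is found, the root node of the created decision tree object being a leaf node;
    the decision tree obtaining module is further configured to assign a default selection reference value to the data content when no decision tree object corresponding to the data content is found.
  15. The apparatus for selecting data content to push to a terminal according to any one of claims 9 to 13, wherein the data content selection module is further configured to obtain a pricing weight coefficient corresponding to the data content, and multiply the ratio of the click count to the push count by the pricing weight coefficient to obtain the selection reference value of the data content.
  16. The apparatus for selecting data content to push to a terminal according to any one of claims 9 to 13, wherein the apparatus further comprises a data content filtering module, configured to pre-filter the data content by keyword matching according to the feature values of the user identifier under the preset user attribute types.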
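  17. An apparatus for selecting data content to push to a terminal, comprising: at least one processor and a memory connected to the at least one processor, wherein the processor invokes program code stored in the memory to execute instructions for the following operations:
    obtaining a user identifier, and obtaining the feature values of the user identifier under preset user attribute types;
    obtaining data content, and searching for the decision tree object corresponding to the data content, wherein the tree nodes of the decision tree object comprise branch nodes and leaf nodes, the branch nodes are in one-to-one correspondence with user attribute types, each branch node stores the feature thresholds of the feature intervals of the corresponding user attribute type, and the child nodes of the branch node are in one-to-one correspondence with the feature thresholds; each leaf node stores the click count and push count corresponding to the feature threshold to which that leaf node corresponds;
    locating, in the decision tree object, the leaf node corresponding to the user identifier according to the feature values of the user identifier under the preset user attribute types, wherein the feature values match the feature thresholds corresponding to the tree nodes on the path from the root node of the decision tree object to the located leaf node;
    obtaining the click count and push count stored in the located leaf node, generating a selection reference value according to the click count and the push count, and selecting, according to the selection reference value, data content to push to the terminal corresponding to the user identifier.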
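  18. The apparatus for selecting data content to push to a terminal according to claim 17, wherein the processor invokes the program code stored in the memory to execute instructions for the following operations:
    receiving an uploaded browsing record, and obtaining the user identifier corresponding to the browsing record and the data content corresponding to the browsing record;
    obtaining the decision tree object corresponding to the data content, obtaining the feature values of the user identifier under the preset user attribute types, locating the leaf node corresponding to the user identifier in the decision tree object according to the obtained feature values, and increasing the click count and push count stored in the located leaf node according to the browsing record.
  19. The apparatus for selecting data content to push to a terminal according to claim 18, wherein executing the instruction of increasing the click count and push count stored in the located leaf node according to the browsing record comprises:
    obtaining the click count and push count corresponding to the data content from the browsing record;
    obtaining the branch nodes on the path from the root node to the located leaf node in the decision tree object, obtaining the preset candidate user attribute types other than the user attribute types corresponding to the branch nodes on that path, and adding the click count and push count obtained from the browsing record, classified by the feature intervals under each candidate user attribute type.
  20. The apparatus for selecting data content to push to a terminal according to claim 19, wherein, after executing the instruction of adding the click count and push count obtained from the browsing record, classified by the feature intervals under each candidate user attribute type, the processor further invokes the program code stored in the memory to execute instructions for the following operations:
    generating the information gain of each candidate user attribute type according to the click counts and push counts stored in the located leaf node, classified by the feature intervals under that candidate user attribute type;
    searching for a candidate user attribute type whose information gain exceeds the information gain of every other candidate user attribute type by a difference greater than or equal to the information gain threshold;
    when such a candidate is found, setting the located leaf node as a branch node, and generating the leaf nodes of that branch node according to the feature thresholds of the feature intervals under the found candidate user attribute type.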
PCT/CN2016/078867 2015-04-08 2016-04-08 选择数据内容向终端推送的方法和装置 WO2016161976A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2017543954A JP6494777B2 (ja) 2015-04-08 2016-04-08 端末にプッシュされるデータコンテンツを選択するための方法およびデバイス
US15/664,233 US10789311B2 (en) 2015-04-08 2017-07-31 Method and device for selecting data content to be pushed to terminal, and non-transitory computer storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510164053.1 2015-04-08
CN201510164053.1A CN106156127B (zh) 2015-04-08 2015-04-08 选择数据内容向终端推送的方法及装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/664,233 Continuation US10789311B2 (en) 2015-04-08 2017-07-31 Method and device for selecting data content to be pushed to terminal, and non-transitory computer storage medium

Publications (1)

Publication Number Publication Date
WO2016161976A1 true WO2016161976A1 (zh) 2016-10-13

Family

ID=57071725

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/078867 WO2016161976A1 (zh) 2015-04-08 2016-04-08 选择数据内容向终端推送的方法和装置

Country Status (4)

Country Link
US (1) US10789311B2 (zh)
JP (1) JP6494777B2 (zh)
CN (1) CN106156127B (zh)
WO (1) WO2016161976A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108763017A (zh) * 2018-05-18 2018-11-06 平安普惠企业管理有限公司 金融业务的应用软件数据处理方法、服务端及存储介质
CN111814030A (zh) * 2019-04-10 2020-10-23 百度在线网络技术(北京)有限公司 推送方法、装置、设备和介质
CN111966904A (zh) * 2020-08-18 2020-11-20 平安国际智慧城市科技股份有限公司 基于多用户画像模型的信息推荐方法和相关装置
CN113742571A (zh) * 2021-08-03 2021-12-03 大箴(杭州)科技有限公司 一种基于大数据的消息推送方法及装置、存储介质
CN113761886A (zh) * 2020-10-16 2021-12-07 北京沃东天骏信息技术有限公司 确定目标任务的方法、装置、电子设备及存储介质
CN113923674A (zh) * 2020-12-06 2022-01-11 技象科技(浙江)有限公司 根据发送数据进行组网方法、装置、设备和存储介质

Families Citing this family (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740453B (zh) * 2016-02-03 2018-06-19 百度在线网络技术(北京)有限公司 信息推送方法和装置
CN106682102B (zh) * 2016-12-02 2019-07-19 中国通信建设集团设计院有限公司 一种基于关键字集合的信息匹配方法
CN108334522B (zh) * 2017-01-20 2021-12-14 阿里巴巴集团控股有限公司 确定海关编码的方法,以及确定类型信息的方法和系统
CN107038256B (zh) 2017-05-05 2018-06-29 平安科技(深圳)有限公司 基于数据源的业务定制装置、方法及计算机可读存储介质
CN108011936B (zh) * 2017-11-28 2021-06-04 百度在线网络技术(北京)有限公司 用于推送信息的方法和装置
US11556836B1 (en) * 2018-02-12 2023-01-17 Intuit Inc. System and method for matching specialists and potential clients
CN108540831B (zh) * 2018-04-19 2019-10-22 百度在线网络技术(北京)有限公司 用于推送信息的方法和装置
US11790381B1 (en) 2018-08-09 2023-10-17 Intuit Inc. System and method for attribute selection for expert matchmaking
CN109255013A (zh) * 2018-08-14 2019-01-22 平安医疗健康管理股份有限公司 理赔决策方法、装置、计算机设备和存储介质
US11245777B1 (en) * 2018-09-11 2022-02-08 Groupon, Inc. Multi-application interactive support and communication interface
CN109635185A (zh) * 2018-11-12 2019-04-16 平安科技(深圳)有限公司 一种舆情数据推送方法、装置、存储介质和终端设备
CN109559173B (zh) * 2018-11-30 2020-11-13 杭州可靠护理用品股份有限公司 一种基于用途的纸尿裤功能层自适应配置方法与系统
CN109684549A (zh) * 2018-12-24 2019-04-26 拉扎斯网络科技(上海)有限公司 目标数据预测方法、装置、电子设备及计算机存储介质
CN109885597B (zh) * 2019-01-07 2023-05-30 平安科技(深圳)有限公司 基于机器学习的用户分群处理方法、装置及电子终端
EP3709229A1 (en) * 2019-03-13 2020-09-16 Ricoh Company, Ltd. Learning device and learning method
CN110135590B (zh) * 2019-04-15 2022-05-17 平安科技(深圳)有限公司 信息处理方法、装置、介质及电子设备
US20220197953A1 (en) * 2019-04-15 2022-06-23 Zte Corporation Model pushing method and device, model requesting method and device, storage medium and electronic device
CN110222960B (zh) * 2019-05-23 2022-11-25 深圳供电局有限公司 一种自动匹配任务生成的方法及系统
CN111770125A (zh) * 2019-05-23 2020-10-13 北京沃东天骏信息技术有限公司 用于推送信息的方法和装置
CN110351371A (zh) * 2019-07-15 2019-10-18 星联云服科技有限公司 一种在云存储系统中进行数据推送的方法及系统
US11397591B2 (en) * 2019-11-07 2022-07-26 Kyndryl, Inc. Determining disorder in technological system architectures for computer systems
CN110990699B (zh) * 2019-11-29 2021-12-07 广州市百果园信息技术有限公司 一种信息推送系统、方法、装置、设备和存储介质
CN111339418B (zh) * 2020-02-26 2023-07-18 抖音视界有限公司 页面展示方法、装置、电子设备和计算机可读介质
CN111460285B (zh) * 2020-03-17 2023-11-03 阿波罗智联(北京)科技有限公司 信息处理方法、装置、电子设备和存储介质
CN112243021A (zh) * 2020-05-25 2021-01-19 北京沃东天骏信息技术有限公司 信息推送方法、装置、设备及计算机可读存储介质
CN111859156B (zh) * 2020-08-04 2024-02-02 上海秒针网络科技有限公司 发布人群的确定方法、装置、可读存储介质及电子设备
CN112015986B (zh) * 2020-08-26 2024-01-26 北京奇艺世纪科技有限公司 数据推送方法、装置、电子设备及计算机可读存储介质
CN112948608B (zh) * 2021-02-01 2023-08-22 北京百度网讯科技有限公司 图片查找方法、装置、电子设备及计算机可读存储介质
KR102394160B1 (ko) * 2021-05-25 2022-05-06 김민혁 웹 서비스를 이용하는 고객의 행동 조건에 따라 푸시 안내 메시지를 전송할 수 있는 웹 서비스 운영 서버 및 그 동작 방법
CN113343147B (zh) * 2021-06-18 2024-01-19 北京百度网讯科技有限公司 信息处理方法、装置、设备、介质及程序产品
CN114240527A (zh) * 2021-10-12 2022-03-25 北京淘友天下科技发展有限公司 资源推送方法、装置、电子设备、可读介质及计算机程序
CN114301975B (zh) * 2021-11-30 2023-07-28 乐美科技股份私人有限公司 应用内推送信息的处理方法、装置、设备及存储介质
CN115187345A (zh) * 2022-09-13 2022-10-14 深圳装速配科技有限公司 智能家居建材推荐方法、装置、设备及存储介质
CN117938950A (zh) * 2024-03-22 2024-04-26 深圳有为通讯科技有限公司 一种基于手机启屏的学习内容推送系统

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004046488A (ja) * 2002-07-11 2004-02-12 Fujitsu Ltd 情報プッシュ機能を備えた情報整理表示システム
CN101075253A (zh) * 2007-02-15 2007-11-21 腾讯科技(深圳)有限公司 一种广告信息推送系统和方法
CN101505461A (zh) * 2008-12-29 2009-08-12 北京握奇数据系统有限公司 一种信息发布处理的方法和装置
US20120072411A1 (en) * 2010-09-16 2012-03-22 Microsoft Corporation Data representation for push-based queries
CN103618668A (zh) * 2013-12-18 2014-03-05 清华大学 微博推送、接收方法及装置
CN103744968A (zh) * 2014-01-09 2014-04-23 小米科技有限责任公司 一种终端应用中的信息推送方法及装置

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6727914B1 (en) * 1999-12-17 2004-04-27 Koninklijke Philips Electronics N.V. Method and apparatus for recommending television programming using decision trees
WO2005031589A1 (en) * 2003-09-23 2005-04-07 Marchex, Inc. Performance-based online advertising system and method
US7908238B1 (en) * 2007-08-31 2011-03-15 Yahoo! Inc. Prediction engines using probability tree and computing node probabilities for the probability tree
JP5290041B2 (ja) * 2008-05-16 2013-09-18 株式会社エヌ・ティ・ティ・ドコモ 情報検索装置及び情報検索方法
US8738436B2 (en) * 2008-09-30 2014-05-27 Yahoo! Inc. Click through rate prediction system and method
US20130151332A1 (en) * 2011-12-10 2013-06-13 Rong Yan Assisted adjustment of an advertising campaign
CN102436506A (zh) * 2011-12-27 2012-05-02 Tcl集团股份有限公司 一种网络服务器端海量数据的处理方法及装置
CN103902538B (zh) * 2012-12-25 2017-03-15 中国银联股份有限公司 基于决策树的信息推荐装置及方法
JP5693630B2 (ja) * 2013-03-18 2015-04-01 ヤフー株式会社 広告抽出装置、広告抽出方法及び広告抽出プログラム
CN103685502B (zh) 2013-12-09 2017-07-25 腾讯科技(深圳)有限公司 一种消息推送方法、装置及系统

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108763017A (zh) * 2018-05-18 2018-11-06 平安普惠企业管理有限公司 金融业务的应用软件数据处理方法、服务端及存储介质
CN108763017B (zh) * 2018-05-18 2023-04-25 平安普惠企业管理有限公司 金融业务的应用软件数据处理方法、服务端及存储介质
CN111814030A (zh) * 2019-04-10 2020-10-23 百度在线网络技术(北京)有限公司 推送方法、装置、设备和介质
CN111814030B (zh) * 2019-04-10 2023-10-27 百度在线网络技术(北京)有限公司 推送方法、装置、设备和介质
CN111966904A (zh) * 2020-08-18 2020-11-20 平安国际智慧城市科技股份有限公司 基于多用户画像模型的信息推荐方法和相关装置
CN111966904B (zh) * 2020-08-18 2023-09-05 深圳平安智慧医健科技有限公司 基于多用户画像模型的信息推荐方法和相关装置
CN113761886A (zh) * 2020-10-16 2021-12-07 北京沃东天骏信息技术有限公司 确定目标任务的方法、装置、电子设备及存储介质
CN113923674A (zh) * 2020-12-06 2022-01-11 技象科技(浙江)有限公司 根据发送数据进行组网方法、装置、设备和存储介质
CN113923674B (zh) * 2020-12-06 2023-08-29 技象科技(南京)有限公司 根据发送数据进行组网方法、装置、设备和存储介质
CN113742571A (zh) * 2021-08-03 2021-12-03 大箴(杭州)科技有限公司 一种基于大数据的消息推送方法及装置、存储介质
CN113742571B (zh) * 2021-08-03 2024-04-26 大箴(杭州)科技有限公司 一种基于大数据的消息推送方法及装置、存储介质

Also Published As

Publication number Publication date
CN106156127A (zh) 2016-11-23
JP6494777B2 (ja) 2019-04-03
CN106156127B (zh) 2020-06-16
JP2018511116A (ja) 2018-04-19
US20170329856A1 (en) 2017-11-16
US10789311B2 (en) 2020-09-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16776151

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2017543954

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC , EPO FORM 1205A DATED 14.03.18.

122 Ep: pct application non-entry in european phase

Ref document number: 16776151

Country of ref document: EP

Kind code of ref document: A1