WO2019095417A1 - Real-time advertisement recommendation method and apparatus, and terminal device and storage medium - Google Patents

Real-time advertisement recommendation method and apparatus, and terminal device and storage medium

Info

Publication number
WO2019095417A1
Authority
WO
WIPO (PCT)
Prior art keywords
preference information
user
advertisement
user preference
access request
Prior art date
Application number
PCT/CN2017/112569
Other languages
French (fr)
Chinese (zh)
Inventor
黄度新
张川
李双灵
王翼
金鑫
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019095417A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0251Targeted advertisements
    • G06Q30/0257User requested
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0277Online advertisement

Definitions

  • The present application relates to the field of big data, and in particular to a method, device, terminal device and storage medium for real-time advertisement recommendation.
  • At present, when a user visits a website, the website pushes advertisements to the user at random. Because the push is random, advertisements that interest the user cannot be recommended in real time.
  • When visiting a website, a user usually browses only the webpages or advertisements related to his or her own interests or needs. If the advertisement pushed by the website interests the user, the user is likely to click it; conversely, if the pushed advertisement does not interest the user, the user may not click it, so the click-through rate of pushed advertisements is low and the push fails to achieve its purpose.
  • The embodiments of the present application provide a method, device, terminal device and storage medium for real-time advertisement recommendation, so as to solve the problem that advertisements currently pushed at random by websites have a low click-through rate.
  • an embodiment of the present application provides a real-time advertisement recommendation method, including:
  • an advertisement real-time recommendation device including:
  • An access request obtaining module configured to acquire an access request sent by the client in real time, where the access request includes a request source identifier
  • a user preference information obtaining module configured to acquire user preference information corresponding to the request source identifier based on the access request
  • An associated advertisement obtaining module configured to acquire an associated advertisement corresponding to the user preference information based on the user preference information
  • the associated advertisement recommendation module is configured to push the associated advertisement to the client in real time, so that the client displays the associated advertisement in real time.
  • an embodiment of the present application provides a terminal device, including a memory, a processor, and computer readable instructions stored in the memory and executable on the processor, where the processor executes the computer
  • An embodiment of the present application provides a computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement the steps of the real-time advertisement recommendation method.
  • In the embodiments of the present application, the request source identifier in the access request sent by the client is acquired in real time to obtain the corresponding user preference information, the associated advertisement is then obtained based on that user preference information, and the associated advertisement is recommended to the corresponding client in real time, so that the client displays an associated advertisement that interests the user.
  • The real-time advertisement recommendation method, device, terminal device and storage medium can thus recommend, in real time and according to the user preference information, associated advertisements that interest the user, improving the click-through rate of pushed advertisements.
  • FIG. 1 is a flow chart of a method for real-time recommendation of advertisements in Embodiment 1 of the present application.
  • FIG. 2 is a specific schematic diagram of step S20 of FIG. 1.
  • FIG. 3 is another specific schematic diagram of step S20 of FIG. 1.
  • FIG. 4 is a specific schematic diagram of step S30 of FIG. 1.
  • FIG. 5 is a schematic block diagram of an advertisement real-time recommendation device in Embodiment 2 of the present application.
  • FIG. 6 is a schematic diagram of a terminal device in Embodiment 4 of the present application.
  • FIG. 1 is a flow chart showing a method for recommending an advertisement in real time in this embodiment.
  • the real-time recommendation method of the advertisement is applied when the user visits the website, and can recommend the advertisement of interest to the user in real time, thereby improving the click rate of the advertisement.
  • the real-time recommendation method of the advertisement includes the following steps:
  • S10 Obtain an access request sent by the client in real time, and the access request includes a request source identifier.
  • The server connected to the client receives, in real time, the access request sent by the client. The access request generally carries a URL address; the server parses the URL address and returns the corresponding content to the client, so that the client can display the page.
  • the access request received by the server further includes a request source identifier, where the request source identifier is an identifier for uniquely identifying the source of the request.
  • the request source identifier includes a user identifier and/or a terminal identifier; wherein the user identifier is an identifier for uniquely identifying a user that triggers the access request; the terminal identifier is an identifier for uniquely identifying a terminal that sends the access request.
  • The user identifier may be the account information entered when the user logs in to a specific webpage; the account information includes, but is not limited to, the login name, mobile phone number, WeChat ID and email address used at login.
  • The terminal identifier includes, but is not limited to, an identifier such as the MAC address or IP address of the terminal that sends the access request (for example a computer, a mobile phone or an iPad), and uniquely identifies the source of the access request.
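  • The shape of an access request described in step S10 can be sketched as a small data structure. This is a minimal illustration only: the class name `AccessRequest`, its field names and the `source_kind` helper are assumptions introduced here, not names from the application.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical representation of the access request from step S10."""
    url: str                           # URL address carried by the request
    user_id: Optional[str] = None      # e.g. login name, phone number, e-mail
    terminal_id: Optional[str] = None  # e.g. MAC address or IP address

    def source_kind(self) -> str:
        """Classify which request source identifier(s) the request carries."""
        if self.user_id and self.terminal_id:
            return "user+terminal"
        if self.user_id:
            return "user"
        if self.terminal_id:
            return "terminal"
        return "none"

    def path(self) -> str:
        """Path component of the requested URL address."""
        return urlparse(self.url).path
```

  • A request from a visitor who is not logged in would carry only the terminal identifier, so `source_kind()` would report `"terminal"` for it.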
  • S20 Acquire, according to the access request, user preference information corresponding to the request source identifier.
  • The server acquires the corresponding user preference information according to the request source identifier in the access request by obtaining the user portrait data corresponding to the request source identifier.
  • User portrait data is tagged user model data abstracted, through data analysis, from a person's location data, identity data, consumption data, behavior data and lifestyle data.
  • If the request source identifier of the access request is a user identifier, that is, the user has logged in to the specific webpage with account information, the access request sent by the user carries the user identifier (and may also carry the terminal identifier). The server can look up the user portrait data of that user based on the user identifier and obtain the corresponding user preference information from it.
  • If the access request received by the server carries only the terminal identifier and no user identifier, the server can use the terminal identifier to query the historical webpage data formed by the webpages that the terminal frequently accesses, derive the corresponding user portrait data from that historical webpage data, and thereby determine the corresponding user preference information. For example, if a user has been browsing red-wine-related webpages for some time through the terminal corresponding to a given terminal identifier, analysis of that browsing behavior can abstract a label for the user, namely that the user likes red wine or needs to purchase red wine soon, and "likes red wine" becomes part of the user portrait data of the terminal corresponding to that terminal identifier.
  • The user preference information may be the personal preferences associated with the account information when the user logs in to the webpage with account information, or, when the user visits the webpage as a guest without account information, the personal preferences inferred from the historical webpage data frequently accessed through the user's terminal.
  • For example, if the webpages frequently browsed from a given terminal relate to red wine, the user preference information can be inferred to be red wine; likewise, if the webpages frequently browsed from a given terminal relate to outdoor travel, the user preference information can be inferred to be outdoor travel.
  • The user portrait data is acquired by a user portrait data system.
  • The user portrait data system is divided into four subsystems: a data source subsystem, a data transfer subsystem, a big data platform subsystem, and a data application subsystem.
  • The data source subsystem is mainly an application-layer module that is associated with the user and is used for data collection.
  • The data source subsystem can be divided into a data class module, an internet channel class module, and a third-party data module.
  • In this embodiment, the data class module includes, but is not limited to, the core transaction module, the risk association module and the data warehouse module; the internet channel class module includes, but is not limited to, the portal website, mobile banking and WeChat banking; and the third-party data module includes, but is not limited to, data from the extranet. Data is generated when a user logs in to a specific webpage or browses a webpage, and the data source subsystem collects that data.
  • The data source subsystem mainly uses the distributed real-time log collection platform Flume to collect data and sends the collected data to the distributed message middleware Kafka for aggregation. The distributed computing engine Spark then fetches the data from Kafka and processes it.
  • The data transfer subsystem connects the data source subsystem and the big data platform subsystem, that is, it forwards the data acquired by the data source subsystem to the big data platform subsystem.
  • Specifically, the data transfer subsystem gathers the database files, transaction messages, system logs, database logs and other data collected by the data source subsystem and sends them to the big data platform subsystem.
  • For example, the core transaction module turns the records of transactions that the user completes on a specific webpage, and any related risk events that are triggered, into database files, transaction messages, system logs, database logs and other data, and this data is sent to the big data platform subsystem.
  • The data transfer subsystem also provides data storage, storing the data uploaded by the data source subsystem.
  • In this embodiment, the distributed storage platform HBase is used for data storage. HBase stores the data formed by processes such as network communication, message authentication, transaction data format conversion, personal identification number (PIN) conversion, transaction flow recording, transaction preprocessing, transaction monitoring and transaction data statistics.
  • the Big Data Platform subsystem is used to process and calculate massive amounts of data.
  • the most important function of the big data platform subsystem is data calculation, and the data is analyzed by using Spark/Hive.
  • Spark/Hive is used for data analysis of all online browsing information of the user, thereby obtaining user portrait data corresponding to the user identification or the terminal identification.
  • The data application subsystem exposes the analysis results produced by the big data platform subsystem through interfaces that other systems can call, for example feeding the analysis results into application systems for data mining, deep learning and data marts.
  • For example, the data application subsystem may perform data mining on the information of users who like or purchase red wine to obtain the age, gender, geographic location and similar data of those users, and then perform deep learning on the mined data to obtain the user preference information of users with a similar age, gender and geographic location.
  • The data application subsystem can also use data mining to obtain the age distribution of users who like red wine and the geographic locations where they are mainly concentrated, and, combining this data with the data mart application system, focus sales and advertisement recommendation on places with high demand for red wine.
  • In step S20, acquiring the user preference information corresponding to the request source identifier specifically includes the following steps:
  • S211 Determine, according to the access request, whether the request source identifier is a user identifier.
  • The request source identifier carried in the access request received by the server may be a user identifier that uniquely identifies the user, a terminal identifier that uniquely identifies the terminal, or both at the same time.
  • The user identifier is carried by access requests issued after the user logs in to a specific webpage with account information, and corresponds to a specific user, whereas the terminal identifier identifies the terminal that sends the access request and is not tied to a specific user. The user identifier is therefore more closely tied to the user who sends the access request, so when the server receives the access request in step S211 it first determines whether the request source identifier in the access request is a user identifier produced by the user logging in to a specific webpage.
  • the existing user portrait data is user portrait data previously collected and stored in the database connected to the server and associated with the user identification.
  • the existing user portrait data includes, but is not limited to, basic information such as gender, age, region, address, occupation, marital status, consumption habits, and education level, and may also include preference information for embodying user preferences.
  • When a user logs in to a specific webpage with account information, the server corresponding to that webpage generates a user access log. The user access log may include basic information such as gender, age, region, address, occupation, marital status, consumption habits, personal preferences and education level, and may also include access information such as the pages visited, the date of access, the specific access time and the length of the visit.
  • The distributed real-time log collection platform Flume collects user access logs from the different servers in real time and sends them to the distributed message middleware Kafka for aggregation, so that each user access log is associated with a user identifier.
  • The distributed computing engine Spark fetches the user access logs carrying the same user identifier from Kafka, processes all of them, and labels the user to form user portrait data. Finally, the tagged user portrait data is stored in the distributed storage platform HBase in association with the user identifier, so that the corresponding existing user portrait data can be queried by user identifier.
  • the above steps all adopt a distributed framework, which is beneficial for processing massive data and improving data processing efficiency.
  • For example, the age obtained from the user access logs may be used as one label of the user, the occupation as another label, and the personal preference as a further label; the tagged user portrait data is then stored in the distributed storage platform HBase. Any single user access log may carry one or more items of basic information and/or access information corresponding to the user identifier, so the resulting user portrait data carries at least one label. Because the labels cover a wide range of attributes, they describe the user more accurately and in a more targeted way.
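  • The labeling step can be illustrated with a small in-memory stand-in. This sketch does not use Flume, Kafka, Spark or HBase; it only shows the idea of grouping access logs by user identifier and turning selected fields into labels. The field names in `TAG_FIELDS` and the `field:value` label format are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Set

# Hypothetical subset of the basic information that is turned into labels.
TAG_FIELDS = ("age", "occupation", "preference")


def build_portraits(logs: Iterable[Mapping[str, str]]) -> Dict[str, Set[str]]:
    """Group access logs by user identifier and collect labels per user."""
    portraits: Dict[str, Set[str]] = defaultdict(set)
    for log in logs:
        user_id = log.get("user_id")
        if not user_id:
            continue  # a log without a user identifier cannot be attributed
        for field in TAG_FIELDS:
            value = log.get(field)
            if value:
                portraits[user_id].add(f"{field}:{value}")
    return dict(portraits)
```

  • Each log may contribute zero or more labels, so a user's portrait grows as more logs carrying the same identifier are processed.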
  • S213 Determine whether the existing user portrait data contains existing preference information.
  • The user portrait data includes all user labels associated with the user identifier.
  • In the corresponding account information, the user may fill in one or more items of basic information, including but not limited to gender, age, region, occupation, marital status, personal preference and education level. Step S213 therefore determines whether the existing user portrait data contains existing preference information.
  • If it does, the existing preference information in the existing user portrait data is used directly as the user preference information, and the subsequent recommendation is based on it. Because existing preference information is mostly supplied by the user, it matches the user's actual preferences closely; using it as the basis for advertisement recommendation makes the pushed advertisements fit the user's preferences better and improves the click-through rate to some extent.
  • The similar group is the group of users whose user portrait data is most similar to the existing user portrait data. Because the existing user portrait data does not contain existing preference information, a similar group must be found in the user portrait data system based on the existing user portrait data, and the user preference information of the user corresponding to the user identifier is then determined from the common preference information of that similar group.
  • The big data platform subsystem uses Spark/Hive for data analysis and clusters the user portrait data of all users stored in the distributed storage platform HBase, grouping users according to their common preference information.
  • The user portrait data of all users may include, but is not limited to, gender, age, region, address, occupation, marital status, consumption habits, preference information and education level.
  • The K-means clustering algorithm is used to cluster the user portrait data of all users, so that all users are divided into several cluster groups based on common preference information, each cluster group having its own cluster user portrait data.
  • K-means is a clustering algorithm that evaluates similarity by distance: the closer two objects are, the more similar they are considered to be.
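  • As a rough illustration of K-means, the following sketch clusters numeric vectors in plain Python. It assumes the user portrait data has already been encoded as numeric vectors (the text does not say how), and it uses the first `k` points as initial centroids purely to keep the example deterministic; practical implementations usually choose random or k-means++ seeds.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points: List[Vector], k: int, iterations: int = 20) -> List[List[float]]:
    """Return k centroids after a fixed number of Lloyd iterations."""
    # Assumption: deterministic initialisation with the first k points.
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(k), key=lambda c: euclidean(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids
```

  • Each returned centroid plays the role of the cluster user portrait data of one cluster group.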
  • The Euclidean distance between the existing user portrait data and the cluster user portrait data of each cluster group is calculated, and the cluster group with the smallest Euclidean distance is selected as the similar group.
  • Because the user is closest to the similar group, their preferences are also most likely to be the same, so the common preference information of the similar group can be used as the user preference information of the user corresponding to the user identifier. Advertisements pushed on this basis fit the user's preferences better, which improves the click-through rate to some extent.
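  • Selecting the similar group by smallest Euclidean distance can be sketched as below. The mapping from group name to a `(centroid, common preference)` pair and the function name are hypothetical, and the vectors are again assumed to be numeric encodings of portrait data.

```python
import math
from typing import Mapping, Sequence, Tuple


def similar_group_preference(
    user_vector: Sequence[float],
    groups: Mapping[str, Tuple[Sequence[float], str]],
) -> str:
    """Return the common preference of the cluster group nearest to the user."""
    # groups maps a group name to (cluster portrait vector, common preference).
    nearest = min(groups, key=lambda name: math.dist(user_vector, groups[name][0]))
    return groups[nearest][1]
```

  • For a user whose encoded portrait lies next to the centroid of a "red wine" group, the function would return that group's common preference.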
  • The server may search, according to the user identifier, for at least one item of historical webpage data that the corresponding terminal has accessed.
  • The historical webpage data may be historical webpage data uploaded to the distributed storage platform HBase. Because the historical webpage data is associated with the user identifier, it can be understood as the trace left by the corresponding user when accessing webpages, so the favorite tag of each item of historical webpage data reflects the user's preferences to some extent.
  • Each item of historical webpage data corresponds to a favorite tag, which can be obtained with the Jieba word segmentation tool and the TF-IDF algorithm.
  • The Jieba word segmentation tool scans the text in the historical webpage data, splits long phrases into words, and then performs part-of-speech tagging on the segmented text to obtain the word segmentation result.
  • The TF-IDF algorithm is then used to extract keywords from the segmentation result produced by Jieba, and the extracted keywords serve as the favorite tag of the historical webpage data.
  • Extracting keywords with the TF-IDF algorithm from the word segmentation result produced by the Jieba word segmentation tool includes the following steps:
  • First, the word frequency (TF) of each word in the word segmentation result of any historical webpage data is calculated. Word frequency is the frequency with which a given word appears in the file: $\mathrm{TF}_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$, where the numerator $n_{i,j}$ is the number of occurrences of word $t_i$ in file $d_j$ and the denominator $\sum_{k} n_{k,j}$ is the total number of occurrences of all words in file $d_j$.
  • Next, the inverse document frequency (IDF) of each word in the word segmentation result of any historical webpage data is calculated.
  • The inverse document frequency assigns each word an "importance" weight: the most common words (such as "of", "is" and "at") receive the smallest weights, more common words receive smaller weights, and less common words receive larger weights. This weight is called the inverse document frequency, and its size is inversely related to how common the word is.
  • The inverse document frequency can be expressed as $\mathrm{IDF}_{i} = \log \frac{|D|}{|\{j : t_i \in d_j\}|}$, where $|D|$ is the total number of files in the corpus and $|\{j : t_i \in d_j\}|$ is the number of files that contain word $t_i$.
  • Finally, $\mathrm{TF\text{-}IDF}_{i,j} = \mathrm{TF}_{i,j} \times \mathrm{IDF}_{i}$ gives the weight of each word in the historical webpage data, and the word with the highest weight, or the $N$ words with the highest weights, are selected as keywords, that is, as the favorite tag of the historical webpage data.
  • TF-IDF thus tends to filter out common words and retain important ones, which are used as the keywords of the historical webpage data. The keyword with the highest weight, or several keywords with higher weights, determine the favorite tag of the historical webpage data.
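  • The keyword extraction steps can be sketched in plain Python. The sketch assumes each page has already been segmented into a list of words (for example by a word segmentation tool) and uses the unsmoothed form log(N / df) for the inverse document frequency, which is an assumption where the exact formula is not legible. The function names are illustrative.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Compute a TF-IDF weight for every word of every segmented document."""
    n_docs = len(docs)
    doc_freq: Counter = Counter()
    for words in docs:
        doc_freq.update(set(words))  # count each word once per document
    weights = []
    for words in docs:
        counts = Counter(words)
        total = len(words)
        weights.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


def top_keywords(weights: Dict[str, float], n: int = 1) -> List[str]:
    """Return the n highest-weighted words; ties are broken alphabetically."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]
```

  • A word that appears in every document gets weight 0 under this form, which matches the stated goal of filtering out common words.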
  • S218 Perform statistical analysis on the favorite tags corresponding to the historical webpage data, and obtain key preference tags to determine user preference information.
  • In step S217, the TF-IDF algorithm extracts keywords from each item of historical webpage data, so that each item has a corresponding favorite tag. In step S218, the favorite tags of all historical webpage data corresponding to the user identifier are counted, and the tag or tags that occur most frequently are taken as the key preference tags. The key preference tags are used as the final user preference information, and advertisements are recommended on that basis, so that the recommended advertisements match the user's interests better and the click-through rate improves.
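  • The statistical step amounts to counting how often each favorite tag occurs across the historical pages and keeping the most frequent ones. A minimal sketch, with a hypothetical function name and a `top_n` parameter introduced here for illustration:

```python
from collections import Counter
from typing import Iterable, List


def key_preference_tags(page_tags: Iterable[str], top_n: int = 1) -> List[str]:
    """Return the top_n most frequent favorite tags across historical pages."""
    counts = Counter(page_tags)
    return [tag for tag, _ in counts.most_common(top_n)]
```

  • With `top_n=1` the result is the single key preference tag; a larger `top_n` keeps several high-frequency tags.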
  • step S20 the user preference information corresponding to the request source identifier is obtained, which specifically includes the following steps:
  • S221 Determine, according to the access request, whether the request source identifier is a terminal identifier.
  • the request source identifier carried in the access request received by the server may be a user identifier for uniquely identifying the user, or may be a terminal identifier for uniquely identifying the terminal, or carrying both the user identifier and the terminal identifier.
  • the request source identifier in the access request received by the server is the terminal identifier, and the terminal identifier can uniquely determine the terminal that sends the access request.
  • the server searches for at least one historical webpage data that the corresponding terminal has accessed according to the terminal identifier.
  • The historical webpage data may be historical webpage data uploaded to the distributed storage platform HBase, or historical webpage data stored in cookies on the terminal.
  • Cookies are data that some websites store on the user's local terminal in order to identify the user and track sessions.
  • the process of obtaining the user label in step S222 is similar to the process in step S217. To avoid repetition, details are not described herein.
  • S223 Perform statistical analysis on the favorite tags corresponding to the historical webpage data, and obtain key preference tags to determine user preference information.
  • In step S222, the TF-IDF algorithm extracts keywords from each item of historical webpage data, so that each item has a corresponding favorite tag. In step S223, the favorite tags of all historical webpage data corresponding to the terminal identifier are counted, and the tag or tags that occur most frequently are taken as the key preference tags. The key preference tags are used as the final user preference information, and recommendation is based on it, so that the recommended advertisements match the user's interests better and the click-through rate improves.
  • Before the real-time advertisement recommendation method is performed, and in particular before step S20, the method further includes labeling all webpages on the website so that each webpage carries a favorite tag.
  • The favorite tag of a webpage can be set manually by the webpage developer, or the webpage content can be processed in advance with the Jieba word segmentation tool and the TF-IDF algorithm to obtain keywords of the content and determine the corresponding favorite tag.
  • In this case, step S20 specifically searches for the corresponding target webpage based on the URL address of the access request and uses the favorite tag of the target webpage as the user preference information.
  • The target webpage is the webpage corresponding to the URL address of the access request. Because all webpages carry favorite tags, the target webpage also carries one, and this favorite tag is used as the user preference information of the user who triggered the access request. Advertisements are recommended on that basis, so that the recommended advertisements match the user's interests better and the click-through rate improves.
  • User preference information determined from the URL address is tied to each individual access request triggered by the user and is therefore highly contingent. The user preference information determined in this way can nonetheless largely reflect the user's true preferences, and recommending advertisements based on it can still effectively increase the click-through rate to some extent.
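  • The URL-based variant reduces to a lookup from the requested page to its pre-assigned favorite tag. In this sketch the tag table is keyed by URL path, which is an assumption; the function name is also illustrative.

```python
from typing import Mapping, Optional
from urllib.parse import urlparse


def preference_from_url(url: str, page_tags: Mapping[str, str]) -> Optional[str]:
    """Use the favorite tag of the target webpage as user preference information."""
    path = urlparse(url).path or "/"  # ignore query string and fragment
    return page_tags.get(path)
```

  • A page without a tag yields `None`, in which case a caller could fall back to one of the identifier-based approaches.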
  • S30 Acquire an associated advertisement corresponding to the user preference information based on the user preference information.
  • The associated advertisement is an advertisement whose content corresponds to the user preference information. After the user preference information corresponding to the request source identifier in the access request has been determined in step S20, the corresponding associated advertisement can be searched for based on the user preference information, so that it matches the interests of the user who triggered the access request and the user's click-through rate for the associated advertisement improves.
  • step S30 includes the following steps:
  • S31 Perform keyword extraction on the advertisement to determine an advertisement category of the advertisement.
  • the advertisement category refers to determining the category to which the advertisement belongs according to the advertisement content.
  • the advertising category includes, but is not limited to, travel advertisements, shopping advertisements, etc., and the travel advertisements may be subdivided into travel agency advertisements, hotel advertisements, tourist city/scenic area advertisements, travel festival celebration advertisements, and exhibition advertisements.
  • the advertisement category may be determined based on the positioning of the advertisement by the advertiser, that is, the advertiser clearly determines the advertisement category; and the advertisement category may also be determined based on the advertisement content.
  • the Jieba word segmentation tool and the TF-IDF algorithm may be used to process the advertisement content, and the keywords corresponding to the advertisement content are obtained to determine the advertisement category.
  • S32 Calculate the similarity between the advertisement category and the user preference information.
  • The similarity between the advertisement category and the user preference information may be expressed by cosine similarity, calculated with the cosine similarity algorithm.
  • The cosine similarity is calculated as $\cos\theta = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$, where $x_i$ is the weight corresponding to a keyword in the advertisement category and $y_i$ is the weight corresponding to each item of preference information in the user preference information.
  • advertisements can be classified into major categories by industry, such as advertising advertisements, shopping advertisements, and electronic home appliance advertisements, and each major category can be further subdivided. A major-category advertisement defines its advertisement category directly, and a subdivided advertisement defines its advertisement category based on the corresponding major category.
  • each of the subcategories has a corresponding weight, and the advertisement categories and their corresponding weights can be described as $T$: $(T_1, x_1), (T_2, x_2), (T_3, x_3)$ and $S$: $(S_1, x_4), (S_2, x_5), (S_3, x_6)$.
  • the user preference information is defined as $P_1, P_2, P_3, \ldots, P_n$, where $n$ is determined by the number of items of user preference information; similarly, the user preference information and its corresponding weights can be described as $P$: $(P_1, y_1), (P_2, y_2), (P_3, y_3), \ldots, (P_n, y_n)$.
  • the cosine value of the advertisement category and the user preference information is computed with the formula of the cosine similarity algorithm. The closer the computed cosine value is to 1, the closer the advertisement category is to the user preference information and the higher the similarity.
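The cosine computation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the weights are held in dictionaries keyed by label, and the two vectors are aligned on shared labels (the source only says that x and y are the respective weights); the function name and sample weights are hypothetical.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two sparse weight vectors given as dicts."""
    # Labels missing from one vector contribute a weight of 0 (assumption).
    dot = sum(weight * y.get(label, 0.0) for label, weight in x.items())
    norm_x = math.sqrt(sum(w * w for w in x.values()))
    norm_y = math.sqrt(sum(w * w for w in y.values()))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # an all-zero vector is treated as dissimilar to everything
    return dot / (norm_x * norm_y)

# Hypothetical weights for an advertisement category and two users.
ad_category = {"travel": 0.6, "hotel": 0.8}
same_taste = {"travel": 0.6, "hotel": 0.8}
other_taste = {"wine": 1.0}
```

With these sample values, identical weight vectors give a cosine of 1, while vectors with no label in common give 0.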
  • the preset value is a threshold preset by the system; it is the standard for evaluating whether the similarity between an advertisement's category and the user preference information is high enough for the advertisement to qualify as an associated advertisement.
  • when the similarity between the advertisement category and the user preference information is greater than the preset value, the advertisement is determined to be close to the user's preferences and the user is more likely to click it; when the similarity is not greater than the preset value, the advertisement is determined not to be close to the user's preferences, and the user is less likely to click it.
  • the advertisement whose advertisement category has a similarity to the user preference information greater than the preset value is taken as the associated advertisement, so that the associated advertisement is closer to the user's preferences; when it is subsequently pushed to the user, it more easily attracts the user's interest, which increases the click rate of associated advertisements.
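The threshold test can be sketched in a few lines. The similarity scores, the advertisement names, and the preset value of 0.75 are hypothetical; the only behavior taken from the text is that an advertisement qualifies when its similarity is strictly greater than the preset value.

```python
def select_associated_ads(similarities, preset_value):
    """Keep the advertisements whose similarity exceeds the preset value."""
    return [ad for ad, score in similarities.items() if score > preset_value]

# Hypothetical similarity scores between each ad category and the user preferences.
similarities = {"hotel_ad": 0.92, "phone_ad": 0.30, "tour_ad": 0.75}
associated = select_associated_ads(similarities, preset_value=0.75)
```

Because the comparison is strict, an advertisement whose similarity equals the preset value (here `tour_ad`) is not treated as an associated advertisement.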
  • S40 Push the associated advertisement to the client in real time, so that the client displays the associated advertisement in real time.
  • the client sends an access request to the server, and while the server controls the client to display the target webpage corresponding to the URL address in the access request, the client can also display in real time the associated advertisement pushed by the server. Because the associated advertisement is associated with the user's preference information, it is more likely to arouse the user's interest, so that the user clicks to view it, which increases the click rate of the advertisement.
  • the client may display the associated advertisement in the APP in which the user account is logged in, in the webpage the user is visiting, or on the terminal device corresponding to the terminal identifier carried in the access request. The associated advertisement is displayed in a pop-up window so that the advertisement recommendation message does not affect the user's normal browsing of the webpage information.
  • the associated advertisement is pushed to the client in real time, so that the client displays it in real time and it can be viewed by the user who triggered the access request, which improves the click rate of the advertisement. This avoids the situation in which, after the access request has been triggered, the associated advertisement is pushed to other users who have no interest in it and are unlikely to click it.
  • the advertisement real-time recommendation method obtains the access request sent by the client in real time, identifies the corresponding user preference information based on the request source of the access request, obtains the corresponding associated advertisement based on the user preference information, and pushes the associated advertisement in real time to the client that triggered the access request, so that the user can view the associated advertisement in real time through the client. Since the associated advertisement is associated with the user preference information and better matches the user's interests, the click rate of the associated advertisement can be improved and the purpose of pushing advertisements achieved.
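The overall flow of steps S10 to S40 can be sketched end to end as below. Everything concrete here is an assumption made for illustration: the in-memory dictionaries stand in for real profile and advertisement services, the request is a plain dictionary, and `push` is a hypothetical callback representing delivery to the client.

```python
# Hypothetical in-memory stores; a real system would query profile and ad services.
PREFERENCES = {"user-42": ["travel"]}
ADS_BY_PREFERENCE = {"travel": ["hotel_ad", "tour_ad"], "wine": ["red_wine_ad"]}

def get_user_preferences(source_id):
    """S20: look up preference information for the request source identifier."""
    return PREFERENCES.get(source_id, [])

def get_associated_ads(preferences):
    """S30: collect the advertisements that correspond to the preferences."""
    return [ad for pref in preferences for ad in ADS_BY_PREFERENCE.get(pref, [])]

def handle_access_request(request, push):
    """Process one access request and push the associated ads to its sender."""
    source_id = request["source_id"]           # S10: request source identifier
    preferences = get_user_preferences(source_id)
    ads = get_associated_ads(preferences)
    for ad in ads:
        push(source_id, ad)                    # S40: push to the requesting client
    return ads

pushed = []
result = handle_access_request(
    {"url": "https://example.com/", "source_id": "user-42"},
    lambda source_id, ad: pushed.append((source_id, ad)),
)
```

The advertisements are pushed only to the source that triggered the request, which mirrors the point made above about not delivering the associated advertisement to other users.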
  • FIG. 5 is a schematic block diagram showing an advertisement real-time recommendation device corresponding to the advertisement real-time recommendation method in the first embodiment.
  • the advertisement real-time recommendation device includes an access request acquisition module 10, a user preference information acquisition module 20, an associated advertisement acquisition module 30, and an associated advertisement recommendation module 40.
  • the functions implemented by the access request acquisition module 10, the user preference information acquisition module 20, the associated advertisement acquisition module 30, and the associated advertisement recommendation module 40 correspond one-to-one to the steps of the advertisement real-time recommendation method in Embodiment 1. To avoid redundancy, they are not described in detail one by one in this embodiment.
  • the access request obtaining module 10 is configured to obtain an access request sent by the client in real time, and the access request includes a request source identifier.
  • the user preference information obtaining module 20 is configured to acquire user preference information corresponding to the request source identifier based on the access request.
  • the associated advertisement obtaining module 30 is configured to acquire an associated advertisement corresponding to the user preference information based on the user preference information.
  • the associated advertisement recommendation module 40 is configured to push the associated advertisement to the client in real time, so that the client displays the associated advertisement in real time.
  • the user preference information obtaining module 20 includes a user source identifier determining unit 211, an existing user portrait data obtaining unit 212, an existing preference information judging unit 213, a first user preference information acquiring unit 214, a similar crowd searching unit 215, a second user preference information obtaining unit 216, a first webpage preference tag obtaining unit 217, and a third user preference information obtaining unit 218.
  • the user preference information obtaining module 20 further includes a terminal source identifier determining unit 221, a second webpage preference tag obtaining unit 222, and a fourth user preference information obtaining unit 223.
  • the user source identifier determining unit 211 is configured to determine, according to the access request, whether the request source identifier is a user identifier.
  • the existing user portrait data obtaining unit 212 is configured to query the existing user portrait data based on the user identifier if the request source identifier is the user identifier.
  • the existing preference information judging unit 213 is configured to judge whether the existing user portrait data contains existing preference information.
  • the first user preference information acquiring unit 214 is configured to use the existing preference information as the user preference information if the existing user portrait data contains existing preference information.
  • the similar crowd searching unit 215 is configured to search for a similar crowd based on the existing user portrait data if the existing user portrait data does not contain existing preference information.
  • the second user preference information obtaining unit 216 is configured to use the common preference information corresponding to the similar crowd as the user preference information.
  • the first webpage preference tag obtaining unit 217 is configured to search for at least one piece of corresponding historical webpage data based on the user identifier if the existing user portrait data does not contain existing preference information, each piece of historical webpage data corresponding to a preference tag.
  • the third user preference information obtaining unit 218 is configured to perform statistical analysis on the preference tags corresponding to the historical webpage data and obtain key preference tags to determine the user preference information.
  • the terminal source identifier determining unit 221 is configured to determine, according to the access request, whether the request source identifier is a terminal identifier.
  • the second webpage preference tag obtaining unit 222 is configured to search for at least one piece of corresponding historical webpage data based on the terminal identifier if the request source identifier is the terminal identifier, each piece of historical webpage data having a corresponding preference tag.
  • the fourth user preference information obtaining unit 223 is configured to perform statistical analysis on the preference tags corresponding to the historical webpage data and obtain key preference tags to determine the user preference information.
  • the related advertisement acquisition module 30 includes an advertisement category determination unit 31, an advertisement category similarity determination unit 32, a similarity determination unit 33, and an associated advertisement determination unit 34.
  • the advertisement category determining unit 31 is configured to perform keyword extraction on the advertisement to determine an advertisement category of the advertisement.
  • the advertisement category similarity determining unit 32 is configured to calculate the similarity between the advertisement category and the user preference information.
  • the similarity determining unit 33 is configured to determine whether the similarity is greater than a preset value.
  • the associated advertisement determining unit 34 is configured to determine that the advertisement is an associated advertisement if the similarity is greater than a preset value.
  • the user preference information acquisition module 20 is configured to acquire user preference information corresponding to the request source identifier based on the access request.
  • if the request source identifier is a user identifier, the existing user portrait data is queried based on the user identifier, and it is determined whether the existing user portrait data contains existing preference information. If it does, the existing preference information is used as the user preference information. If it does not, a similar crowd is searched for based on the existing user portrait data, and the common preference information corresponding to the similar crowd is used as the user preference information; alternatively, at least one piece of corresponding historical webpage data is searched for based on the user identifier, the preference tag corresponding to each piece of historical webpage data is determined, and all the preference tags are statistically analyzed to obtain key preference tags that determine the user preference information.
  • if the request source identifier is a terminal identifier, at least one piece of historical webpage data is searched for according to the terminal identifier, each piece of historical webpage data having a corresponding preference tag, and all the preference tags are statistically analyzed to obtain key preference tags that determine the user preference information.
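The branching just described can be sketched as a single lookup function. The data layout (a `profiles` dictionary, a `similar_crowd_prefs` callback, a `history` dictionary of preference tags) and all names are hypothetical. The order in which the similar-crowd and historical-webpage fallbacks are tried is also an assumption, since the text presents them as alternatives.

```python
from collections import Counter

def key_preference_tags(history_tags, top_k=1):
    """Statistical analysis: the most frequent preference tags in the history."""
    return [tag for tag, _ in Counter(history_tags).most_common(top_k)]

def resolve_preferences(source, profiles, similar_crowd_prefs, history):
    """Return user preference information for a (kind, identifier) source."""
    kind, ident = source
    if kind == "user":
        profile = profiles.get(ident, {})
        if profile.get("preferences"):           # existing preference information
            return list(profile["preferences"])
        crowd = similar_crowd_prefs(profile)      # common preferences of a similar crowd
        if crowd:
            return list(crowd)
    # Terminal identifier, or a user with nothing else: use historical webpage tags.
    return key_preference_tags(history.get(ident, []))

# Hypothetical sample data.
profiles = {"u1": {"preferences": ["wine"]}, "u2": {"age": 30}}
history = {"t1": ["wine", "travel", "wine"], "u3": ["sports"]}

def crowd_lookup(profile):
    return ["travel"] if profile.get("age") == 30 else []
```

For example, `resolve_preferences(("terminal", "t1"), profiles, crowd_lookup, history)` falls through to the historical webpage tags and picks the most frequent one.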
  • the associated advertisement obtaining module 30 acquires the associated advertisement corresponding to the user preference information: when the similarity between an advertisement's category and the user's preference information is greater than the preset value, the advertisement is determined to be an associated advertisement for that user and is recommended to the user. Associated advertisements determined from the user preference information are closer to the user's needs, so recommending them to the corresponding users increases the click rate of the recommended advertisements.
  • this embodiment provides a computer readable storage medium on which computer readable instructions are stored; when the computer readable instructions are executed by a processor, the advertisement real-time recommendation method in Embodiment 1 is implemented. To avoid repetition, details are not described herein again.
  • alternatively, when the computer readable instructions are executed by the processor, the functions of the modules/units in the advertisement real-time recommendation device of Embodiment 2 are implemented. To avoid repetition, details are not described herein again.
  • FIG. 6 is a schematic diagram of a terminal device according to an embodiment of the present application.
  • the terminal device 60 of this embodiment includes a processor 61, a memory 62, and computer readable instructions 63 stored in the memory 62 and operable on the processor 61, such as an advertisement real-time recommendation program.
  • the processor 61 implements the steps of the advertisement real-time recommendation method in Embodiment 1 when executing the computer readable instructions 63, such as steps S10 to S40 shown in FIG. 1.
  • the processor 61 implements the functions of the modules/units in the advertisement real-time recommendation device in Embodiment 2 when the computer readable instructions 63 are executed.
  • computer readable instructions 63 may be partitioned into one or more modules/units, one or more modules/units being stored in memory 62 and executed by processor 61 to complete the application.
  • the one or more modules/units may be a series of computer readable instruction segments capable of performing a particular function for describing the execution of computer readable instructions 63 in the terminal device 60.
  • the computer readable instructions 63 may be divided into an access request acquisition module 10, a user preference information acquisition module 20, an associated advertisement acquisition module 30, and an associated advertisement recommendation module 40.
  • the specific functions of each module are as follows:
  • the access request obtaining module 10 is configured to obtain an access request sent by the client in real time, and the access request includes a request source identifier.
  • the user preference information obtaining module 20 is configured to acquire user preference information corresponding to the request source identifier based on the access request.
  • the associated advertisement obtaining module 30 is configured to acquire an associated advertisement corresponding to the user preference information based on the user preference information.
  • the associated advertisement recommendation module 40 is configured to push the associated advertisement to the client in real time, so that the client displays the associated advertisement in real time.
  • the terminal device 60 can be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server.
  • the terminal device may include, but is not limited to, a processor 61 and a memory 62. It will be understood by those skilled in the art that FIG. 6 is only an example of the terminal device 60 and does not constitute a limitation on the terminal device 60, which may include more or fewer components than those illustrated, combine some components, or use different components.
  • the terminal device may further include an input/output device, a network access device, a bus, and the like.
  • the processor 61 may be a central processing unit (CPU), or may be other general-purpose processors, a digital signal processor (DSP), an application specific integrated circuit (ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, etc.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the memory 62 may be an internal storage unit of the terminal device 60, such as a hard disk or memory of the terminal device 60.
  • the memory 62 may also be an external storage device of the terminal device 60, such as a plug-in hard disk provided on the terminal device 60, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), and so on.
  • the memory 62 may also include both an internal storage unit of the terminal device 60 and an external storage device.
  • the memory 62 is used to store computer readable instructions as well as other programs and data required by the terminal device.
  • the memory 62 can also be used to temporarily store data that has been or will be output.
  • the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated modules/units if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium.
  • all or part of the processes in the foregoing embodiments of the present application may also be implemented by computer readable instructions, which may be stored in a computer readable storage medium.
  • the computer readable instructions when executed by a processor, may implement the steps of the various method embodiments described above.
  • the computer readable instructions comprise computer readable instruction code, which may be in the form of source code, an object code form, an executable file or some intermediate form or the like.
  • the computer readable medium can include any entity or device capable of carrying the computer readable instruction code, a recording medium, a USB flash drive, a removable hard drive, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), electrical carrier signals, telecommunications signals, and software distribution media.

Abstract

Disclosed are a real-time advertisement recommendation method and apparatus, and a terminal device and a storage medium. The real-time advertisement recommendation method comprises: acquiring an access request sent by a client in real time, wherein the access request comprises a request source identifier; based on the access request, acquiring user preference information corresponding to the request source identifier; based on the user preference information, acquiring an associated advertisement corresponding to the user preference information; and pushing the associated advertisement to the client in real time, so that the client displays the associated advertisement in real time. By means of the real-time advertisement recommendation method, an advertisement of interest to a user is recommended in real time according to user preference information, thereby improving the click-through rate of a pushed advertisement and achieving the purpose of advertisement pushing.

Description

Real-time advertisement recommendation method, apparatus, terminal device, and storage medium
This patent application is based on, and claims priority to, the Chinese invention patent application No. 201711126787.6, filed on November 15, 2017 and entitled "Real-time advertisement recommendation method, apparatus, terminal device, and storage medium".
Technical field
The present application relates to the field of big data, and in particular to a real-time advertisement recommendation method, apparatus, terminal device, and storage medium.
Background
At present, when a user visits a website, the website pushes advertisements to the user at random. Because the push is random, advertisements of interest to the user cannot be recommended to the user in real time. When visiting a website, a user usually browses only the webpages or advertisements that interest them or relate to their needs. If the advertisement pushed by the website interests the user, the user is more likely to click it; conversely, if it does not, the user may not click it, so the click rate of pushed advertisements is low, the push is ineffective, and the purpose of advertisement pushing is not achieved.
Summary of the invention
The embodiments of the present application provide a real-time advertisement recommendation method, apparatus, terminal device, and storage medium, to solve the problem of the low click rate of advertisements that current websites push at random.
In a first aspect, an embodiment of the present application provides a real-time advertisement recommendation method, including:
acquiring, in real time, an access request sent by a client, where the access request includes a request source identifier;
acquiring, based on the access request, user preference information corresponding to the request source identifier;
acquiring, based on the user preference information, an associated advertisement corresponding to the user preference information;
pushing the associated advertisement to the client in real time, so that the client displays the associated advertisement in real time.
In a second aspect, an embodiment of the present application provides a real-time advertisement recommendation apparatus, including:
an access request acquisition module, configured to acquire, in real time, an access request sent by a client, where the access request includes a request source identifier;
a user preference information acquisition module, configured to acquire, based on the access request, user preference information corresponding to the request source identifier;
an associated advertisement acquisition module, configured to acquire, based on the user preference information, an associated advertisement corresponding to the user preference information;
an associated advertisement recommendation module, configured to push the associated advertisement to the client in real time, so that the client displays the associated advertisement in real time.
In a third aspect, an embodiment of the present application provides a terminal device, including a memory, a processor, and computer readable instructions stored in the memory and executable on the processor, where the processor, when executing the computer readable instructions, implements the following steps of the real-time advertisement recommendation method:
acquiring, in real time, an access request sent by a client, where the access request includes a request source identifier;
acquiring, based on the access request, user preference information corresponding to the request source identifier;
acquiring, based on the user preference information, an associated advertisement corresponding to the user preference information;
pushing the associated advertisement to the client in real time, so that the client displays the associated advertisement in real time.
In a fourth aspect, an embodiment of the present application provides a computer readable storage medium storing computer readable instructions, where the computer readable instructions, when executed by a processor, implement the following steps of the real-time advertisement recommendation method:
acquiring, in real time, an access request sent by a client, where the access request includes a request source identifier;
acquiring, based on the access request, user preference information corresponding to the request source identifier;
acquiring, based on the user preference information, an associated advertisement corresponding to the user preference information;
pushing the associated advertisement to the client in real time, so that the client displays the associated advertisement in real time.
In the real-time advertisement recommendation method, apparatus, terminal device, and storage medium provided by the embodiments of the present application, the corresponding user preference information is obtained from the request source identifier in the access request that the client sends in real time, the corresponding associated advertisement is then obtained based on the user preference information, and the associated advertisement is recommended in real time to the corresponding client, so that the client displays an associated advertisement that interests the user. The method, apparatus, terminal device, and storage medium can thus recommend associated advertisements of interest to the user in real time according to the user preference information and improve the click rate of pushed advertisements.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of the real-time advertisement recommendation method in Embodiment 1 of the present application.
FIG. 2 is a detailed schematic diagram of step S20 in FIG. 1.
FIG. 3 is another detailed schematic diagram of step S20 in FIG. 1.
FIG. 4 is a detailed schematic diagram of step S30 in FIG. 1.
FIG. 5 is a schematic block diagram of the real-time advertisement recommendation apparatus in Embodiment 2 of the present application.
FIG. 6 is a schematic diagram of the terminal device in Embodiment 4 of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.
Embodiment 1
FIG. 1 shows a flowchart of the real-time advertisement recommendation method in this embodiment. The method is applied when a user visits a website, so that advertisements of interest can be recommended to the user in real time and the click rate of advertisements improved. As shown in FIG. 1, the real-time advertisement recommendation method includes the following steps:
S10: Acquire, in real time, an access request sent by a client, where the access request includes a request source identifier.
Specifically, a server communicatively connected to the client receives in real time the access request sent by the client. The access request generally carries a URL address; the server parses the URL address and returns the corresponding content to the client so that the client displays the corresponding webpage. In this embodiment, the access request received by the server also includes a request source identifier, which uniquely identifies the source of the request. The request source identifier includes a user identifier and/or a terminal identifier: the user identifier uniquely identifies the user who triggered the access request, and the terminal identifier uniquely identifies the terminal that sent the access request. The user identifier may be the account information entered when the user logs in to a specific webpage, including but not limited to the login name, mobile phone number, WeChat ID, or email address used at login. The terminal identifier includes but is not limited to the MAC address or IP address that uniquely identifies the terminal, such as a computer, mobile phone, or iPad, from which the access request originates.
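An access request carrying a request source identifier could be modeled as below. This is a minimal sketch under stated assumptions: the `AccessRequest` class, its field names, and the `request_source` helper are hypothetical, and preferring the user identifier when both identifiers are present is a design choice made here for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class AccessRequest:
    url: str
    user_id: Optional[str] = None       # e.g. login name, phone number, email address
    terminal_id: Optional[str] = None   # e.g. MAC address or IP address

def request_source(request: AccessRequest) -> Tuple[str, str]:
    """Return (kind, identifier), preferring the user identifier when present."""
    if request.user_id:
        return ("user", request.user_id)
    if request.terminal_id:
        return ("terminal", request.terminal_id)
    raise ValueError("access request carries no request source identifier")

logged_in = AccessRequest("https://example.com/", user_id="alice", terminal_id="192.0.2.10")
guest = AccessRequest("https://example.com/", terminal_id="192.0.2.10")
```

A logged-in request resolves to the user identifier, while a guest request resolves to the terminal identifier only.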
S20: Acquire, based on the access request, user preference information corresponding to the request source identifier.
Specifically, the server acquires the corresponding user preference information according to the request source identifier in the access request, namely by acquiring the user portrait data corresponding to the request source identifier. User portrait data is tagged user-model data abstracted through data analysis from a person's geographic location data, identity data, consumption data, behavior data, lifestyle data, and similar data.
当访问请求的请求来源标识是用户标识时,即用户采用的是帐号信息登录特定网页时,其所发送的访问请求均携带有用户标识(此时也可以同时携带终端标识),服务器可基于该用户标识查找该用户的用户画像数据,以便基于查找到的用户画像数据获取相应的用户喜好信息。When the request source identifier of the access request is the user identifier, that is, when the user uses the account information to log in to the specific webpage, the access request sent by the user carries the user identifier (the terminal identifier can also be carried at the same time), and the server can be based on the The user identification looks up the user's portrait data of the user to obtain corresponding user preference information based on the found user image data.
When the user has not logged in with account information, that is, when the user browses the specific web page as a guest, the access request received by the server carries only a terminal identifier and no user identifier. The server can use the terminal identifier to query the historical web page data formed by the pages frequently visited from that terminal, acquire the corresponding user portrait data from that history, and thereby determine the user preference information. For example, if over a period of time a user frequently browses red wine related pages from the terminal with a given terminal identifier, analysis of that browsing behavior can abstract a tag for the user, namely that the user likes red wine or has recently needed to buy it, and the red wine preference becomes part of the user portrait data of the terminal corresponding to that terminal identifier.
In summary, the user preference information may be personal preference information that the user filled in under the account when logged in with account information, or it may be personal preference information determined from the historical web page data frequently accessed from the terminal when the user visits as a guest without logging in. For example, if most pages a user browses from the same terminal relate to red wine, the user's preference can be inferred to be red wine; if most relate to outdoor travel, the preference can be inferred to be outdoor travel.
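The choice between the two sources of preference information summarized above can be sketched as follows. This is a minimal illustration, not the patented implementation; the function name and the returned labels are hypothetical.

```python
from typing import Optional


def choose_preference_source(user_id: Optional[str], terminal_id: Optional[str]) -> str:
    """Return which data source the server would consult for a request.

    The labels "user_portrait" and "terminal_history" are illustrative
    names only; they do not appear in the original text.
    """
    if user_id:
        # Logged-in user: the user identifier is used even when a
        # terminal identifier is carried in the same request.
        return "user_portrait"
    if terminal_id:
        # Guest visit: fall back to the terminal's browsing history.
        return "terminal_history"
    raise ValueError("access request carries no request source identifier")
```

A request carrying both identifiers resolves to the user portrait, which mirrors the statement that a logged-in request may also carry the terminal identifier.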
In this embodiment, the user portrait data is collected by a user portrait data system. The system is divided into four subsystems: a data source subsystem, a data relay subsystem, a big data platform subsystem, and a data application subsystem.
The data source subsystem consists mainly of application-layer modules that are associated with users and perform data collection. Specifically, it can be divided into data modules, Internet channel modules, and third-party data modules. The data modules include but are not limited to the core transaction module, risk association module, and data warehouse module of this embodiment; the Internet channel modules include but are not limited to the portal website, mobile banking, and WeChat banking of this embodiment; the third-party data modules include but are not limited to the extranet data of this embodiment. Data is generated whenever a user logs in to a specific web page or browses web pages, and the data source subsystem collects it. In this embodiment, the data source subsystem mainly uses the distributed real-time log collection platform Flume to collect data and sends the collected data to the distributed message middleware Kafka for aggregation; finally, the distributed computing engine Spark reads the data from Kafka and processes it.
The data relay subsystem connects the data source subsystem and the big data platform subsystem; that is, it sends the data collected by the data source subsystem to the big data platform subsystem. In this embodiment, the data relay subsystem forwards the database files, transaction messages, system logs, database logs, and other data collected by the data source subsystem to the big data platform subsystem. For example, when a user logs in to a specific web page or browses web pages, the core transaction module converts the collected records of transactions completed while the user was logged in, the associated risk events that were triggered, and similar data into database files, transaction messages, system logs, and database logs, and sends that data to the big data platform subsystem. The data relay subsystem also provides data storage for the data uploaded by the data source subsystem. Specifically, the distributed storage platform HBase is used, which can store the data produced by network communication, message authentication, transaction data format conversion, PIN conversion, transaction journaling, transaction preprocessing, transaction monitoring, and transaction data statistics.
The big data platform subsystem processes and computes over massive amounts of data. In this embodiment, its most important function is data computation, with Spark/Hive used for data analysis. For example, if a user frequently browses red wine related pages over a period of time, the big data platform subsystem analyzes all of the user's browsing information with Spark/Hive to obtain the user portrait data corresponding to the user identifier or terminal identifier.
The data application subsystem exposes the analysis results produced by the big data platform subsystem through interfaces that other systems can call, for example feeding the results into application systems for data mining, deep learning, and data markets. Specifically, the data application subsystem can mine the information of users who clearly like or buy red wine to obtain data such as their age, gender, and geographic location, and apply deep learning to the mined data to obtain the preference information of users of similar age, the same gender, and nearby geographic location. The data application subsystem can also use data mining to find the age distribution of users who like red wine and where they are geographically concentrated, and, combining that information with the data market application system, focus sales and advertisement recommendation on places where demand for red wine is high.
In a specific implementation, as shown in FIG. 2, acquiring the user preference information corresponding to the request source identifier in step S20 specifically includes the following steps:
S211: Based on the access request, determine whether the request source identifier is a user identifier.
Specifically, the request source identifier carried in the access request may be a user identifier that uniquely identifies a user, a terminal identifier that uniquely identifies a terminal, or both. The user identifier is carried by access requests formed after the user logs in to a specific web page with account information and corresponds to a specific user, whereas the terminal identifier identifies the terminal that sent the access request and is not tied to a specific user. The user identifier is therefore more closely linked to the user who sent the access request, so in step S211 the server, on receiving an access request, first determines whether its request source identifier is the user identifier with which the user logged in to the specific web page.
S212: If the request source identifier is a user identifier, query the existing user portrait data based on the user identifier.
Here, the existing user portrait data is user portrait data that was collected in advance, stored in a database connected to the server, and associated with the user identifier. In this embodiment, it includes but is not limited to basic information such as gender, age, region, address, occupation, marital status, consumption habits, and education level, and may also include preference information reflecting the user's preferences.
Specifically, when a user logs in to a specific web page with account information, the server for that page produces a user access log. The log may include basic information such as gender, age, region, address, occupation, marital status, consumption habits, personal preferences, and education level, and may also include access information such as the pages visited, the access date, the specific access time, and the visit duration. The distributed real-time log collection platform Flume collects user access logs from the different servers in real time and sends them to the distributed message middleware Kafka for aggregation, so that each user access log is associated with a user identifier. The distributed computing engine Spark reads from Kafka the user access logs carrying the same user identifier, processes all of them, and tags the user to form user portrait data. Finally, the tagged user portrait data is stored in the distributed storage platform HBase in association with the user identifier, so that the corresponding existing user portrait data can be queried by user identifier. All of these steps use distributed frameworks, which helps process massive data and improves processing efficiency.
For example, if a user access log carries age, occupation, and personal preference information, the age obtained from the log becomes one tag of the user, the occupation another tag, the personal preference yet another, and so on until all basic information and/or access information in the collected user access logs has been tagged. This yields the user portrait data, which is stored in the distributed storage platform HBase. Any single user access log may carry one or more items of basic information and/or access information corresponding to the user identifier, so the resulting user portrait data carries at least one tag; the tags cover a broad range and are comparatively accurate and specific to the user.
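The tagging step described above can be illustrated with a small in-memory sketch. It stands in for the Flume/Kafka/Spark/HBase pipeline only conceptually; the record layout and field names (`user_id`, `age`, `occupation`, `preference`) are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical field names; the original text does not define a log schema.
TAG_FIELDS = ("age", "occupation", "preference")


def build_portraits(access_logs):
    """Group log records by user identifier and turn each field into a tag."""
    portraits = defaultdict(set)
    for record in access_logs:
        user_id = record["user_id"]
        for field in TAG_FIELDS:
            value = record.get(field)
            if value is not None:
                # Each (field, value) pair becomes one tag of the user.
                portraits[user_id].add((field, value))
    return dict(portraits)


logs = [
    {"user_id": "u1", "age": 34, "occupation": "engineer"},
    {"user_id": "u1", "preference": "red wine"},
    {"user_id": "u2", "age": 27},
]
portraits = build_portraits(logs)
```

Here `portraits["u1"]` accumulates tags from two separate log records, matching the point that any single log may contribute one or more tags.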
S213: Determine whether the existing user portrait data contains existing preference information.
Specifically, user portrait data includes all user tags associated with the user identifier. In this embodiment, a user who logs in to a specific web page with account information may have filled in, under that account, one or more items of basic information, including but not limited to gender, age, region, occupation, marital status, personal preferences, and education level. Step S213 therefore determines whether the existing user portrait data contains existing preference information.
S214: If the existing user portrait data contains existing preference information, use the existing preference information as the user preference information.
In this embodiment, if the existing user portrait data is found to contain existing preference information, that information is used directly as the user preference information for subsequent recommendation. Because existing preference information is mostly supplied by the user, it closely matches the user's actual preferences; using it as the basis for advertisement recommendation makes the pushed advertisements better match the user's preferences and raises the advertisement click-through rate to some extent.
S215: If the existing user portrait data does not contain existing preference information, find a similar group based on the existing user portrait data.
Here, the similar group is the group of users whose user portrait data is most similar to the existing user portrait data. Because the existing user portrait data contains no existing preference information, a similar group must be found in the user portrait data system based on the existing user portrait data, so that the user preference information of the user corresponding to the user identifier can be determined from the common preference information of that group.
In this embodiment, the big data platform subsystem uses Spark/Hive for data analysis and performs cluster analysis on the user portrait data of all users stored in the distributed storage platform HBase, so that all users are clustered according to their common preference information. Specifically, the user portrait data of each user may include but is not limited to gender, age, region, address, occupation, marital status, consumption habits, preference information, and education level. The K-means clustering algorithm is applied to the user portrait data of all users to divide them, based on common preference information, into several cluster groups, each with corresponding cluster user portrait data. K-means is a clustering algorithm that evaluates similarity by distance: the closer two objects are, the more similar they are considered to be. To determine the similar group from the existing user portrait data, the Euclidean distance between the existing user portrait data and the cluster user portrait data of each cluster group is calculated, and the cluster group with the smallest Euclidean distance is selected as the similar group. The Euclidean distance between any two n-dimensional vectors $a = (x_{i1}, x_{i2}, \ldots, x_{in})$ and $b = (x_{j1}, x_{j2}, \ldots, x_{jn})$ is
$$d(a, b) = \sqrt{\sum_{k=1}^{n} (x_{ik} - x_{jk})^2}$$
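The selection of the similar group by smallest Euclidean distance can be sketched as below. The sketch assumes the K-means clustering has already produced one numeric centroid per cluster group and that portrait data has been encoded as a numeric vector; the cluster names and vectors are invented for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_cluster(portrait_vector, centroids):
    """Return the name of the cluster whose centroid is closest."""
    return min(centroids, key=lambda name: euclidean(portrait_vector, centroids[name]))


# Hypothetical centroids of two cluster groups.
centroids = {
    "wine_lovers": [0.9, 0.1, 0.3],
    "outdoor_travel": [0.2, 0.8, 0.6],
}
similar_group = nearest_cluster([0.8, 0.2, 0.4], centroids)
```

The common preference information of the cluster returned here would then serve as the user preference information in step S216.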
S216: Use the common preference information of the similar group as the user preference information.
Because the user portrait data of the similar group is the most similar to the existing user portrait data of the user corresponding to the user identifier, their preferences are also the most likely to coincide. The common preference information of the similar group can therefore be used as that user's preference information and as the basis for advertisement recommendation, so that the pushed advertisements better match the user's preferences and the click-through rate improves to some extent.
S217: If the existing user portrait data does not contain existing preference information, look up at least one piece of corresponding historical web page data based on the user identifier, each piece of historical web page data corresponding to a preference tag.
If the request source identifier in the access request is a user identifier but the corresponding existing user portrait data contains no existing preference information, the server can use the user identifier to look up at least one piece of historical web page data accessed from the corresponding terminal. The historical web page data may be data uploaded to the distributed storage platform HBase. Because it is associated with the user identifier, it can be understood as the trace left when the corresponding user visited web pages, so the preference tag of each piece of historical web page data reflects the user's preferences to some extent.
In this embodiment, each piece of historical web page data has a preference tag, which can be obtained with the Jieba word segmentation tool and the TF-IDF algorithm. Specifically, Jieba scans out all the text in the historical web page data, splits the long words in the text, and then tags the segmented text with parts of speech to produce the word segmentation result. The TF-IDF algorithm then extracts keywords from that result, and the extracted keywords serve as the preference tag of the historical web page data. Keyword extraction with TF-IDF on the Jieba segmentation result includes the following steps:
First, calculate the term frequency (TF) of each word in the word segmentation result of a given piece of historical web page data. Term frequency is the frequency with which a given word appears in the file:
$$\mathrm{TF}_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$
The numerator $n_{i,j}$ is the number of occurrences of the word in the file, and the denominator is the sum of the numbers of occurrences of all words in the file.
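As a small worked illustration of the term frequency defined above, the following sketch computes TF for an already segmented token list. The tokens are made up; in the described method they would come from the Jieba segmentation result.

```python
from collections import Counter


def term_frequency(tokens):
    """Occurrences of each term divided by the total number of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


tf = term_frequency(["wine", "red", "wine", "cellar"])
```

With four tokens of which two are `"wine"`, `tf["wine"]` is 0.5 and the values sum to 1.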
Then, calculate the inverse document frequency (IDF) of each word in the word segmentation result of the historical web page data. IDF assigns each word an "importance" weight: the most common words (such as "的", "是", and "在") receive the smallest weight, fairly common words receive a smaller weight, and rarer words receive a larger weight. This weight is called the inverse document frequency, and it decreases as the word becomes more common. The IDF formula can be expressed as:
$$\mathrm{IDF}_{i} = \log \frac{|D|}{|\{j : t_i \in d_j\}|}$$
where $|D|$ is the total number of files in the corpus and $|\{j : t_i \in d_j\}|$ is the number of files that contain the word $t_i$.
Finally, $\mathrm{TF\text{-}IDF}_{i,j} = \mathrm{TF}_{i,j} \times \mathrm{IDF}_{i}$ gives the weight of each word in the historical web page data. TF-IDF tends to filter out common words and retain important ones; the word with the highest weight, or the N words with the highest weights, are selected as keywords and determined to be the preference tag of that historical web page data.
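The three steps (TF, IDF, and their product) can be combined into a short keyword-extraction sketch. It assumes documents are already segmented into token lists, that the document being scored is itself part of the corpus (so every term occurs in at least one file), and that the natural logarithm is used; the original text does not state the logarithm base. The corpus contents are invented.

```python
import math
from collections import Counter


def top_keywords(doc_tokens, corpus, n=1):
    """Rank the terms of one document by TF-IDF and return the top n."""
    counts = Counter(doc_tokens)
    total = sum(counts.values())
    scores = {}
    for term, count in counts.items():
        tf = count / total
        # Number of files in the corpus that contain the term.
        containing = sum(1 for doc in corpus if term in doc)
        idf = math.log(len(corpus) / containing)
        scores[term] = tf * idf
    return sorted(scores, key=scores.get, reverse=True)[:n]


corpus = [
    ["wine", "is", "good", "wine"],
    ["hiking", "is", "good"],
    ["travel", "is", "fun"],
]
tags = top_keywords(corpus[0], corpus, n=2)
```

In this toy corpus the word `"is"` appears in every file, so its IDF is zero and it never becomes a preference tag, which is the filtering behavior the text attributes to TF-IDF.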
S218: Perform statistical analysis on the preference tags of the historical web page data to obtain the key preference tag and thereby determine the user preference information.
Step S217 applies the TF-IDF algorithm to each piece of historical web page data to extract keywords, so that each piece has a corresponding preference tag. Step S218 counts the preference tags of all historical web page data corresponding to the user identifier, takes the tag or tags that occur most frequently as the key preference tag, and uses the key preference tag as the final user preference information. Advertisement recommendation based on this information better matches the user's interests and improves the advertisement click-through rate.
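The statistical step in S218 amounts to a frequency count over per-page preference tags. A minimal sketch, assuming one tag per historical page and invented tag values:

```python
from collections import Counter


def key_preference_tags(page_tags, n=1):
    """Return the n most frequent preference tags across visited pages."""
    return [tag for tag, _ in Counter(page_tags).most_common(n)]


# One hypothetical preference tag per historical web page.
history_tags = [
    "red wine", "outdoor travel", "red wine",
    "red wine", "outdoor travel", "insurance",
]
key_tags = key_preference_tags(history_tags)
```

Passing a larger `n` corresponds to keeping several high-frequency tags rather than only the single most frequent one.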
In a specific implementation, as shown in FIG. 3, acquiring the user preference information corresponding to the request source identifier in step S20 specifically includes the following steps:
S221: Based on the access request, determine whether the request source identifier is a terminal identifier.
Specifically, the request source identifier carried in the access request may be a user identifier that uniquely identifies a user, a terminal identifier that uniquely identifies a terminal, or both. When the user has not logged in to the specific web page with account information, the request source identifier in the access request received by the server is a terminal identifier, which uniquely determines the terminal that sent the access request.
S222: If the request source identifier is a terminal identifier, look up at least one piece of corresponding historical web page data based on the terminal identifier, each piece of historical web page data having a corresponding preference tag.
If the request source identifier in the access request is a terminal identifier, the server uses it to look up at least one piece of historical web page data accessed from the corresponding terminal. The historical web page data may be data uploaded to the distributed storage platform HBase, or data stored in cookies on the terminal. Cookies are data that some websites store on the user's local terminal in order to identify the user and track sessions. The process of obtaining the user tags in step S222 is similar to that of step S217 and is not repeated here.
S223: Perform statistical analysis on the preference tags of the historical web page data to obtain the key preference tag and thereby determine the user preference information.
Step S222 applies the TF-IDF algorithm to each piece of historical web page data to extract keywords, so that each piece has a corresponding preference tag. Step S223 counts the preference tags of all historical web page data corresponding to the terminal identifier, takes the tag or tags that occur most frequently as the key preference tag, and uses the key preference tag as the final user preference information. Advertisement recommendation based on this information better matches the user's interests and improves the advertisement click-through rate.
In a specific implementation, before the real-time advertisement recommendation method is performed, and in particular before step S20, the method further includes tagging all web pages on the website so that each web page carries a preference tag. The preference tag of a web page may be set manually by the web page developer, or the page content may be processed in advance with the Jieba word segmentation tool and the TF-IDF algorithm to obtain keywords that determine the corresponding preference tag.
In this implementation, step S20 specifically looks up the target web page corresponding to the URL of the access request and uses the preference tag of the target web page as the user preference information. The target web page is the web page corresponding to the URL of the access request. Because every web page carries a preference tag, the target web page does too, and that tag is taken as the preference information of the user who triggered the access request, so that advertisement recommendation based on it better matches the user's interests and improves the click-through rate. Preference information determined from the URL in this way is tied to each individual access request and is therefore highly incidental. Nevertheless, when the user identifier cannot be determined and little historical web page data is available for the terminal identifier, it can still reflect the user's real preferences to a large extent, and recommending advertisements on that basis can also raise the click-through rate to some extent.
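The URL-based variant reduces to a lookup from the requested page to its pre-assigned preference tag. The sketch below assumes pages are keyed by URL path in an in-memory table; the table contents, the example host, and the function name are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical pre-tagged pages, keyed by URL path.
PAGE_TAGS = {
    "/wine/bordeaux": "red wine",
    "/travel/hiking": "outdoor travel",
}


def preference_from_url(url):
    """Return the preference tag of the target page, or None if untagged."""
    # Ignore the query string and any trailing slash when matching.
    path = urlsplit(url).path.rstrip("/")
    return PAGE_TAGS.get(path)


tag = preference_from_url("https://example.com/wine/bordeaux/?ref=home")
```

Because the result depends only on the single page being requested, it changes with every request, which is the incidental character noted above.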
S30: Based on the user preference information, acquire an associated advertisement corresponding to the user preference information.
Here, an associated advertisement is an advertisement whose content corresponds to the user preference information. After step S20 determines the user preference information corresponding to the request source identifier in the access request, the corresponding associated advertisement can be looked up based on that information, so that the associated advertisement better matches the interests of the user who triggered the access request and the user's click-through rate on it improves.
In a specific implementation, as shown in FIG. 4, step S30 specifically includes the following steps:
S31: Perform keyword extraction on the advertisement to determine the advertisement category of the advertisement.
Here, the advertisement category is the category to which an advertisement belongs, determined from its content. Advertisement categories include but are not limited to travel advertisements and shopping advertisements; travel advertisements can be further subdivided into travel agency advertisements, hotel advertisements, tourist city/scenic area advertisements, tourism festival advertisements, exhibition advertisements, and so on. In this embodiment, when the server obtains an advertisement to be pushed, it may determine the advertisement category from the advertiser's positioning of the advertisement, that is, the advertiser explicitly specifies the category, or it may determine the category from the advertisement content. In the latter case, the Jieba word segmentation tool and the TF-IDF algorithm can be used to process the advertisement content and obtain the corresponding keywords, which determine the advertisement category.
S32: Calculate the similarity between the advertisement category and the user preference information.
In this embodiment, the similarity between the advertisement category and the user preference information can be expressed as a cosine similarity. Specifically, the cosine similarity algorithm is applied to the advertisement category and the user preference information. The formula appears in the original only as an image (Figure PCTCN2017112569-appb-000004); in its standard form, the cosine similarity is

$$\cos\theta = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}}\;\sqrt{\sum_{i=1}^{n} y_i^{2}}}$$

where x is the weight corresponding to a keyword in the advertisement category and y is the weight corresponding to each item of preference information in the user preference information. The closer the calculated cosine value is to 1, the higher the similarity between the advertisement category and the user preference information, and the closer the advertisement category is considered to be to the user preference information.
In a specific embodiment, advertisements can first be divided into broad classes by industry, for example travel advertisements, shopping advertisements, and electronics and home appliance advertisements, and each broad class can be further subdivided. An advertisement category is defined for each broad class, and the categories of its subdivisions are defined on top of it. For example, if the advertisement category of travel advertisements is defined as T, then under travel advertisements the category of travel agency advertisements is defined as T1, that of hotel advertisements as T2, that of tourist city/scenic area advertisements as T3, and so on. If the advertisement category of shopping advertisements is defined as S, then under shopping advertisements the category of tobacco and alcohol advertisements is defined as S1, that of food advertisements as S2, that of appliance/electronic product advertisements as S3, and so on. Each subdivision has a corresponding weight, so the advertisement categories and their weights can be written as T(T1, x1), (T2, x2), (T3, x3); S(S1, x4), (S2, x5), (S3, x6). The user preference information is defined as P1, P2, P3 ... Pn, where n is the number of items of user preference information; likewise, the user preference information and its weights can be written as P(P1, y1), (P2, y2), (P3, y3) ... (Pn, yn). Based on the x and y values obtained, the cosine value of the advertisement category and the user preference information is calculated with the cosine similarity formula. The closer the calculated cosine value is to 1, the closer the advertisement category is to the user preference information and the higher the similarity.
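The cosine calculation in S32 can be sketched as below. The text does not say how the advertisement weights x and the preference weights y are aligned, so this sketch assumes both are expressed over a shared set of labels and treats a missing label as weight 0; the function name and the sample weights are hypothetical.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two sparse weight vectors given as dicts.

    Assumption: x and y are keyed by the same label vocabulary; a label that
    is missing from one vector contributes a weight of 0.
    """
    keys = set(x) | set(y)
    dot = sum(x.get(k, 0.0) * y.get(k, 0.0) for k in keys)
    norm_x = math.sqrt(sum(v * v for v in x.values()))
    norm_y = math.sqrt(sum(v * v for v in y.values()))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # no information on one side, so treat as dissimilar
    return dot / (norm_x * norm_y)


# Hypothetical weights: an ad leaning towards hotels (T2) and a user whose
# preferences lean the same way.
ad_weights = {"T1": 0.2, "T2": 0.7, "T3": 0.1}
user_weights = {"T2": 0.6, "T3": 0.3, "S2": 0.1}
similarity = cosine_similarity(ad_weights, user_weights)
```

With non-negative weights the result lies between 0 and 1, which matches the text's reading that values closer to 1 indicate a closer match.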
S33: Determine whether the similarity is greater than a preset value.
The preset value is set in advance by the system. It is the threshold used to evaluate whether the similarity between the advertisement category of any advertisement and the user preference information meets the standard for an associated advertisement. When the similarity between the advertisement category and the user preference information is greater than the preset value, the advertisement is considered close to the user's preferences and more likely to attract a click. When the similarity is not greater than the preset value, the advertisement is considered not close to the user's preferences, and pushing it may lower the user's click-through rate on the advertisement.
S34: If the similarity is greater than the preset value, determine that the advertisement is an associated advertisement.
In this embodiment, only advertisements whose advertisement category has a similarity to the user preference information greater than the preset value are used as associated advertisements. This keeps the associated advertisements close to the user's preferences, so that when they are subsequently pushed to the user they are more likely to arouse the user's interest and the click-through rate of the associated advertisements increases.
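Steps S33 and S34 amount to a strict threshold filter, sketched below. The function name, the default preset value of 0.8, and the ordering of the result by descending similarity are illustrative assumptions; the text only requires that the similarity be greater than the preset value.

```python
def select_associated_ads(similarities, preset_value=0.8):
    """Return ids of ads whose similarity is strictly greater than preset_value.

    similarities maps an ad id to its similarity with the user preference
    information. Sorting by descending similarity is an added convenience,
    not something the described method requires.
    """
    selected = [(ad_id, s) for ad_id, s in similarities.items() if s > preset_value]
    return [ad_id for ad_id, _ in sorted(selected, key=lambda pair: -pair[1])]


associated = select_associated_ads({"ad-a": 0.90, "ad-b": 0.80, "ad-c": 0.95})
```

An advertisement whose similarity equals the preset value (here `ad-b`) is excluded, following the "greater than" wording of S34.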
S40: Push the associated advertisement to the client in real time, so that the client displays the associated advertisement in real time.
In this embodiment, the client sends the access request to the server. When the server directs the client to display the target webpage corresponding to the URL address in the access request, the client can also display the associated advertisement pushed by the server. Because the associated advertisement is tied to the user's preference information, it is more likely to arouse the user's interest and prompt the user to click on it, which increases the click-through rate of the advertisement.
Preferably, the client can display the associated advertisement in the APP in which the user account is logged in, in the webpage the user is visiting, or on the terminal device corresponding to the terminal identifier carried in the access request. The associated advertisement is displayed as a pop-up window so that the recommendation does not interfere with the user's normal browsing of the webpage. Pushing the associated advertisement to the client in real time, so that the client displays it in real time, ensures that the advertisement is seen by the user who triggered the access request and thereby improves the click-through rate. It also avoids the situation where the user who triggered the access request has gone offline (that is, has left the client), the associated advertisement reaches a different user, and that user has no interest in clicking on it.
The real-time advertisement recommendation method obtains the access request sent by the client in real time, obtains the user preference information corresponding to the request source identifier of the access request, obtains the corresponding associated advertisement based on the user preference information, and pushes the associated advertisement in real time to the client that triggered the access request, so that the user can view the associated advertisement in real time through that client. Because the associated advertisement is tied to the user preference information and better matches the user's interests, the click-through rate on the associated advertisement can be increased, which achieves the purpose of pushing the advertisement.
It should be understood that the sequence numbers of the steps in the above embodiment do not imply an order of execution. The order in which the processes are executed should be determined by their functions and internal logic, and the numbering should not be construed as limiting the implementation of the embodiments of the present application in any way.
Embodiment 2
FIG. 5 is a schematic block diagram of a real-time advertisement recommendation apparatus corresponding one-to-one to the real-time advertisement recommendation method in Embodiment 1. As shown in FIG. 5, the apparatus includes an access request acquisition module 10, a user preference information acquisition module 20, an associated advertisement acquisition module 30, and an associated advertisement recommendation module 40. The functions implemented by these four modules correspond one-to-one to the steps of the real-time advertisement recommendation method in Embodiment 1; to avoid repetition, they are not described in detail again in this embodiment.
The access request acquisition module 10 is configured to obtain, in real time, an access request sent by the client, the access request including a request source identifier.
The user preference information acquisition module 20 is configured to obtain, based on the access request, the user preference information corresponding to the request source identifier.
The associated advertisement acquisition module 30 is configured to obtain, based on the user preference information, the associated advertisement corresponding to the user preference information.
The associated advertisement recommendation module 40 is configured to push the associated advertisement to the client in real time, so that the client displays the associated advertisement in real time.
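One way to picture how modules 10 to 40 fit together is the small composition sketch below. The class and field names are hypothetical, and each module is reduced to a plain callable; the source describes the modules only functionally and does not prescribe this structure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AccessRequest:
    source_id: str  # request source identifier (user or terminal identifier)
    url: str        # URL address of the requested page


@dataclass
class RealTimeAdRecommender:
    # Each field stands in for one module of the apparatus.
    get_preferences: Callable[[AccessRequest], Dict[str, float]]  # module 20
    get_associated_ads: Callable[[Dict[str, float]], List[str]]   # module 30
    push: Callable[[str, List[str]], None]                        # module 40

    def handle(self, request: AccessRequest) -> List[str]:
        """Entry point playing the role of module 10: process one request."""
        prefs = self.get_preferences(request)
        ads = self.get_associated_ads(prefs)
        self.push(request.source_id, ads)
        return ads


pushed = []
recommender = RealTimeAdRecommender(
    get_preferences=lambda req: {"travel": 1.0},
    get_associated_ads=lambda prefs: ["ad-" + label for label in prefs],
    push=lambda source_id, ads: pushed.append((source_id, ads)),
)
result = recommender.handle(AccessRequest("user-1", "/home"))
```

Passing the modules in as callables keeps each stage replaceable, which mirrors the text's point that the division into modules is functional rather than fixed.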
Preferably, the user preference information acquisition module 20 includes a user source identifier determination unit 211, an existing user profile data acquisition unit 212, an existing preference information determination unit 213, a first user preference information acquisition unit 214, a similar crowd search unit 215, a second user preference information acquisition unit 216, a first webpage preference tag acquisition unit 217, and a third user preference information acquisition unit 218. The user preference information acquisition module 20 further includes a terminal source identifier determination unit 221, a second webpage preference tag acquisition unit 222, and a fourth user preference information acquisition unit 223.
The user source identifier determination unit 211 is configured to determine, based on the access request, whether the request source identifier is a user identifier.
The existing user profile data acquisition unit 212 is configured to query existing user profile data based on the user identifier if the request source identifier is a user identifier.
The existing preference information determination unit 213 is configured to determine whether the existing user profile data contains existing preference information.
The first user preference information acquisition unit 214 is configured to use the existing preference information as the user preference information if the existing user profile data contains existing preference information.
The similar crowd search unit 215 is configured to search for a similar crowd based on the existing user profile data if the existing user profile data does not contain existing preference information.
The second user preference information acquisition unit 216 is configured to use the common preference information of the similar crowd as the user preference information.
The first webpage preference tag acquisition unit 217 is configured to look up, based on the user identifier, at least one corresponding item of historical webpage data if the existing user profile data does not contain existing preference information, each item of historical webpage data corresponding to a preference tag.
The third user preference information acquisition unit 218 is configured to perform statistical analysis on the preference tags corresponding to the historical webpage data and obtain key preference tags, so as to determine the user preference information.
The terminal source identifier determination unit 221 is configured to determine, based on the access request, whether the request source identifier is a terminal identifier.
The second webpage preference tag acquisition unit 222 is configured to look up, based on the terminal identifier, at least one corresponding item of historical webpage data if the request source identifier is a terminal identifier, each item of historical webpage data having a corresponding preference tag.
The fourth user preference information acquisition unit 223 is configured to perform statistical analysis on the preference tags corresponding to the historical webpage data and obtain key preference tags, so as to determine the user preference information.
Preferably, the associated advertisement acquisition module 30 includes an advertisement category determination unit 31, an advertisement category similarity determination unit 32, a similarity judgment unit 33, and an associated advertisement determination unit 34.
The advertisement category determination unit 31 is configured to perform keyword extraction on the advertisement to determine the advertisement category of the advertisement.
The advertisement category similarity determination unit 32 is configured to calculate the similarity between the advertisement category and the user preference information.
The similarity judgment unit 33 is configured to determine whether the similarity is greater than a preset value.
The associated advertisement determination unit 34 is configured to determine that the advertisement is an associated advertisement if the similarity is greater than the preset value.
In the real-time advertisement recommendation apparatus provided by this embodiment, the user preference information acquisition module 20 is configured to obtain, based on the access request, the user preference information corresponding to the request source identifier.
If the source identifier carried by the access request is a user identifier, the existing user profile data is queried based on the user identifier, and it is determined whether the existing user profile data contains existing preference information. If it does, the existing preference information is used as the user preference information. If it does not, a similar crowd is searched for based on the existing user profile data, and the common preference information of the similar crowd is used as the user preference information. Alternatively, at least one corresponding item of historical webpage data is looked up based on the user identifier, the preference tag corresponding to each item of historical webpage data is determined, and statistical analysis is performed on all the preference tags obtained to derive the key preference tags, which determine the user preference information.
If the source identifier carried by the access request is a terminal identifier, at least one corresponding item of historical webpage data is looked up based on the terminal identifier, each item of historical webpage data having a corresponding preference tag, and statistical analysis is performed on all the preference tags obtained to derive the key preference tags, which determine the user preference information.
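The branching just described can be sketched as a single lookup function. Everything concrete here is an assumption made for illustration: the dictionary-based profile and history stores, the `"preferences"` key, the optional `find_similar_crowd` callback, and the reading of "statistical analysis" as a frequency count that keeps the most common tags.

```python
from collections import Counter


def key_preference_tags(history_tags, top_k=2):
    """Return the top_k most frequent preference tags (ties broken by name)."""
    counts = Counter(history_tags)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_k]]


def get_user_preferences(source_id, id_type, profiles, history,
                         find_similar_crowd=None):
    """Resolve user preference information for a request source identifier.

    id_type is "user" or "terminal". profiles maps user ids to profile dicts;
    history maps user or terminal ids to the preference tags of visited pages.
    """
    if id_type == "user":
        profile = profiles.get(source_id, {})
        existing = profile.get("preferences")
        if existing:
            return list(existing)  # existing preference information
        if find_similar_crowd is not None:
            shared = find_similar_crowd(profile)
            if shared:
                return list(shared)  # common preferences of a similar crowd
    # Terminal identifier, or a user identifier without usable profile data:
    # fall back to the tags of the historical webpage data.
    return key_preference_tags(history.get(source_id, []))


profiles = {"u1": {"preferences": ["travel"]}, "u2": {"age": 30}}
history = {
    "u2": ["food", "food", "travel"],
    "t1": ["electronics", "electronics", "food", "travel", "food"],
}
```

The text presents the similar-crowd lookup and the history-based lookup as alternatives for a user without stored preferences; trying the crowd first and falling back to history is a design choice of this sketch only.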
The associated advertisement acquisition module 30 obtains, based on the preference information obtained, the associated advertisement corresponding to the user preference information. When the similarity between the determined advertisement category and the user's preference information is greater than the preset value, the advertisement is determined to be associated with the user and is recommended to the user. An associated advertisement determined from the user preference information is closer to the user's needs, so recommending such advertisements to the corresponding user increases the user's click-through rate on the recommended advertisements.
Embodiment 3
This embodiment provides a computer-readable storage medium storing computer-readable instructions. When executed by a processor, the computer-readable instructions implement the real-time advertisement recommendation method of Embodiment 1; to avoid repetition, the details are not described again here. Alternatively, when executed by a processor, the computer-readable instructions implement the functions of the modules/units of the real-time advertisement recommendation apparatus of Embodiment 2; to avoid repetition, the details are not described again here.
Embodiment 4
FIG. 6 is a schematic diagram of a terminal device provided by an embodiment of the present application. As shown in FIG. 6, the terminal device 60 of this embodiment includes a processor 61, a memory 62, and computer-readable instructions 63 stored in the memory 62 and executable on the processor 61, for example a real-time advertisement recommendation program. When the processor 61 executes the computer-readable instructions 63, it implements the steps of the real-time advertisement recommendation method of Embodiment 1, for example steps S10 to S40 shown in FIG. 1. Alternatively, when the processor 61 executes the computer-readable instructions 63, it implements the functions of the modules/units of the real-time advertisement recommendation apparatus of Embodiment 2.
By way of example, the computer-readable instructions 63 can be divided into one or more modules/units, which are stored in the memory 62 and executed by the processor 61 to carry out the present application. The one or more modules/units can be a series of computer-readable instruction segments capable of performing specific functions, the instruction segments describing the execution of the computer-readable instructions 63 in the terminal device 60. For example, the computer-readable instructions 63 can be divided into an access request acquisition module 10, a user preference information acquisition module 20, an associated advertisement acquisition module 30, and an associated advertisement recommendation module 40, whose specific functions are as follows:
The access request acquisition module 10 is configured to obtain, in real time, an access request sent by the client, the access request including a request source identifier.
The user preference information acquisition module 20 is configured to obtain, based on the access request, the user preference information corresponding to the request source identifier.
The associated advertisement acquisition module 30 is configured to obtain, based on the user preference information, the associated advertisement corresponding to the user preference information.
The associated advertisement recommendation module 40 is configured to push the associated advertisement to the client in real time, so that the client displays the associated advertisement in real time.
The terminal device 60 can be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. The terminal device can include, but is not limited to, the processor 61 and the memory 62. Those skilled in the art will understand that FIG. 6 is only an example of the terminal device 60 and does not limit it; the terminal device can include more or fewer components than shown, combine certain components, or use different components. For example, the terminal device can further include input/output devices, network access devices, a bus, and the like.
The processor 61 can be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor can be a microprocessor or any conventional processor.
The memory 62 can be an internal storage unit of the terminal device 60, such as a hard disk or memory of the terminal device 60. The memory 62 can also be an external storage device of the terminal device 60, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card fitted to the terminal device 60. Further, the memory 62 can include both an internal storage unit of the terminal device 60 and an external storage device. The memory 62 is used to store the computer-readable instructions and the other programs and data required by the terminal device. The memory 62 can also be used to temporarily store data that has been output or is about to be output.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional units and modules above is only an example. In practical applications, the functions described can be allocated to different functional units and modules as needed; that is, the internal structure of the apparatus can be divided into different functional units or modules to perform all or part of the functions described above.
In addition, the functional units in the embodiments of the present application can be integrated in one processing unit, each unit can exist physically on its own, or two or more units can be integrated in one unit. The integrated unit can be implemented in hardware or as a software functional unit.
If the integrated modules/units are implemented as software functional units and sold or used as an independent product, they can be stored in a computer-readable storage medium. On this understanding, all or part of the processes of the methods in the above embodiments can also be completed by computer-readable instructions directing the relevant hardware. The computer-readable instructions can be stored in a computer-readable storage medium and, when executed by a processor, implement the steps of the method embodiments described above. The computer-readable instructions include computer-readable instruction code, which can be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium can include any entity or device capable of carrying the computer-readable instruction code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so on. It should be noted that the content covered by the computer-readable medium can be expanded or reduced as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.
The embodiments described above are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, and some of their technical features can be replaced by equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and they shall all fall within the scope of protection of the present application.

Claims (20)

  1. 一种广告实时推荐方法,其特征在于,包括:An advertisement real-time recommendation method, which is characterized by comprising:
    实时获取客户端发送的访问请求,所述访问请求包括请求来源标识;Acquiring an access request sent by the client in real time, where the access request includes a request source identifier;
    基于所述访问请求,获取与所述请求来源标识相对应的用户喜好信息;Acquiring user preference information corresponding to the request source identifier based on the access request;
    基于所述用户喜好信息,获取与所述用户喜好信息相对应的关联广告;Acquiring an associated advertisement corresponding to the user preference information based on the user preference information;
    将所述关联广告实时推送给所述客户端,以使所述客户端实时显示所述关联广告。Pushing the associated advertisement to the client in real time, so that the client displays the associated advertisement in real time.
  2. 如权利要求1所述的广告实时推荐方法,其特征在于,所述基于所述访问请求,获取与所述请求来源标识相对应的用户喜好信息,包括:The advertisement real-time recommendation method according to claim 1, wherein the obtaining the user preference information corresponding to the request source identifier based on the access request comprises:
    基于所述访问请求,判断所述请求来源标识是否为用户标识;Determining, according to the access request, whether the request source identifier is a user identifier;
    若所述请求来源标识为所述用户标识,则基于所述用户标识查询已有用户画像数据;If the request source identifier is the user identifier, querying existing user portrait data based on the user identifier;
    判断所述已有用户画像数据是否包含已有喜好信息;Determining whether the existing user portrait data contains existing preference information;
    若所述已有用户画像数据包含所述已有喜好信息,则将所述已有喜好信息作为所述用户喜好信息。If the existing user portrait data includes the existing preference information, the existing preference information is used as the user preference information.
  3. 如权利要求2所述的广告实时推荐方法,其特征在于,所述基于所述访问请求,获取与所述请求来源标识相对应的用户喜好信息,还包括:The advertisement real-time recommendation method according to claim 2, wherein the obtaining the user preference information corresponding to the request source identifier based on the access request further includes:
    若所述已有用户画像数据没有包含所述已有喜好信息,则基于所述已有用户画像数据查找相似人群;If the existing user portrait data does not include the existing preference information, searching for a similar crowd based on the existing user portrait data;
    将所述相似人群对应的共同喜好信息作为所述用户喜好信息。The common preference information corresponding to the similar crowd is used as the user preference information.
  4. 如权利要求2所述的广告实时推荐方法,其特征在于,所述基于所述访问请求,获取与所述请求来源标识相对应的用户喜好信息,还包括:The advertisement real-time recommendation method according to claim 2, wherein the obtaining the user preference information corresponding to the request source identifier based on the access request further includes:
    若所述已有用户画像数据没有包含所述已有喜好信息,则基于所述用户标识查找对应的至少一个历史网页数据,每一所述历史网页数据对应一喜好标签;If the existing user profile data does not include the existing favorite information, searching for the corresponding at least one historical webpage data based on the user identifier, each of the historical webpage data corresponding to a favorite tag;
    对所述历史网页数据对应的所述喜好标签进行统计分析,获取关键喜好标签,以确定所述用户喜好信息。Perform statistical analysis on the favorite tag corresponding to the historical webpage data, and obtain a key preference tag to determine the user preference information.
  5. The real-time advertisement recommendation method according to claim 1, wherein the obtaining, based on the access request, of user preference information corresponding to the request source identifier includes:
    Determining, based on the access request, whether the request source identifier is a terminal identifier;
    If the request source identifier is the terminal identifier, searching for at least one corresponding piece of historical webpage data based on the terminal identifier, each piece of historical webpage data corresponding to a preference tag;
    Performing statistical analysis on the preference tags corresponding to the historical webpage data to obtain a key preference tag, so as to determine the user preference information.
  6. The real-time advertisement recommendation method according to claim 1, wherein the access request further includes a URL address;
    Before the obtaining, based on the access request, of user preference information corresponding to the request source identifier, the method further includes: tagging all webpages on a website so that each webpage carries a preference tag;
    The obtaining, based on the access request, of user preference information corresponding to the request source identifier includes: searching for a corresponding target webpage based on the URL address of the access request, and using the preference tag corresponding to the target webpage as the user preference information.
  7. The real-time advertisement recommendation method according to claim 1, wherein the obtaining, based on the user preference information, of an associated advertisement corresponding to the user preference information includes:
    Performing keyword extraction on the advertisement to determine an advertisement category of the advertisement;
    Calculating a similarity between the advertisement category and the user preference information;
    Determining whether the similarity is greater than the preset value;
    If the similarity is greater than the preset value, determining that the advertisement is the associated advertisement.
  8. A real-time advertisement recommendation apparatus, comprising:
    An access request obtaining module, configured to obtain, in real time, an access request sent by a client, the access request including a request source identifier;
    A user preference information obtaining module, configured to obtain, based on the access request, user preference information corresponding to the request source identifier;
    An associated advertisement obtaining module, configured to obtain, based on the user preference information, an associated advertisement corresponding to the user preference information;
    An associated advertisement recommendation module, configured to push the associated advertisement to the client in real time so that the client displays the associated advertisement in real time.
  9. A terminal device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
    Obtaining, in real time, an access request sent by a client, the access request including a request source identifier;
    Obtaining, based on the access request, user preference information corresponding to the request source identifier;
    Obtaining, based on the user preference information, an associated advertisement corresponding to the user preference information;
    Pushing the associated advertisement to the client in real time so that the client displays the associated advertisement in real time.
  10. The terminal device according to claim 9, wherein the obtaining, based on the access request, of user preference information corresponding to the request source identifier includes:
    Determining, based on the access request, whether the request source identifier is a user identifier;
    If the request source identifier is the user identifier, querying existing user portrait data based on the user identifier;
    Determining whether the existing user portrait data contains existing preference information;
    If the existing user portrait data contains the existing preference information, using the existing preference information as the user preference information;
    If the existing user portrait data does not contain the existing preference information, searching for a similar crowd based on the existing user portrait data;
    Using the common preference information corresponding to the similar crowd as the user preference information.
  11. The terminal device according to claim 10, wherein the obtaining, based on the access request, of user preference information corresponding to the request source identifier further includes:
    If the existing user portrait data does not contain the existing preference information, searching for at least one corresponding piece of historical webpage data based on the user identifier, each piece of historical webpage data corresponding to a preference tag;
    Performing statistical analysis on the preference tags corresponding to the historical webpage data to obtain a key preference tag, so as to determine the user preference information.
  12. The terminal device according to claim 9, wherein the obtaining, based on the access request, of user preference information corresponding to the request source identifier includes:
    Determining, based on the access request, whether the request source identifier is a terminal identifier;
    If the request source identifier is the terminal identifier, searching for at least one corresponding piece of historical webpage data based on the terminal identifier, each piece of historical webpage data corresponding to a preference tag;
    Performing statistical analysis on the preference tags corresponding to the historical webpage data to obtain a key preference tag, so as to determine the user preference information.
  13. The terminal device according to claim 9, wherein the access request further includes a URL address;
    Before the obtaining, based on the access request, of user preference information corresponding to the request source identifier, the steps further include: tagging all webpages on a website so that each webpage carries a preference tag;
    The obtaining, based on the access request, of user preference information corresponding to the request source identifier includes: searching for a corresponding target webpage based on the URL address of the access request, and using the preference tag corresponding to the target webpage as the user preference information.
  14. The terminal device according to claim 9, wherein the obtaining, based on the user preference information, of an associated advertisement corresponding to the user preference information includes:
    Performing keyword extraction on the advertisement to determine an advertisement category of the advertisement;
    Calculating a similarity between the advertisement category and the user preference information;
    Determining whether the similarity is greater than the preset value;
    If the similarity is greater than the preset value, determining that the advertisement is the associated advertisement.
  15. A computer-readable storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the following steps:
    Obtaining, in real time, an access request sent by a client, the access request including a request source identifier;
    Obtaining, based on the access request, user preference information corresponding to the request source identifier;
    Obtaining, based on the user preference information, an associated advertisement corresponding to the user preference information;
    Pushing the associated advertisement to the client in real time so that the client displays the associated advertisement in real time.
  16. The computer-readable storage medium according to claim 15, wherein the obtaining, based on the access request, of user preference information corresponding to the request source identifier includes:
    Determining, based on the access request, whether the request source identifier is a user identifier;
    If the request source identifier is the user identifier, querying existing user portrait data based on the user identifier;
    Determining whether the existing user portrait data contains existing preference information;
    If the existing user portrait data contains the existing preference information, using the existing preference information as the user preference information;
    If the existing user portrait data does not contain the existing preference information, searching for a similar crowd based on the existing user portrait data;
    Using the common preference information corresponding to the similar crowd as the user preference information.
  17. The computer-readable storage medium according to claim 16, wherein the obtaining, based on the access request, of user preference information corresponding to the request source identifier further includes:
    If the existing user portrait data does not contain the existing preference information, searching for at least one corresponding piece of historical webpage data based on the user identifier, each piece of historical webpage data corresponding to a preference tag;
    Performing statistical analysis on the preference tags corresponding to the historical webpage data to obtain a key preference tag, so as to determine the user preference information.
  18. The computer-readable storage medium according to claim 15, wherein the obtaining, based on the access request, of user preference information corresponding to the request source identifier includes:
    Determining, based on the access request, whether the request source identifier is a terminal identifier;
    If the request source identifier is the terminal identifier, searching for at least one corresponding piece of historical webpage data based on the terminal identifier, each piece of historical webpage data corresponding to a preference tag;
    Performing statistical analysis on the preference tags corresponding to the historical webpage data to obtain a key preference tag, so as to determine the user preference information.
  19. The computer-readable storage medium according to claim 15, wherein the access request further includes a URL address;
    Before the obtaining, based on the access request, of user preference information corresponding to the request source identifier, the steps further include: tagging all webpages on a website so that each webpage carries a preference tag;
    The obtaining, based on the access request, of user preference information corresponding to the request source identifier includes: searching for a corresponding target webpage based on the URL address of the access request, and using the preference tag corresponding to the target webpage as the user preference information.
  20. The computer-readable storage medium according to claim 15, wherein the obtaining, based on the user preference information, of an associated advertisement corresponding to the user preference information includes:
    Performing keyword extraction on the advertisement to determine an advertisement category of the advertisement;
    Calculating a similarity between the advertisement category and the user preference information;
    Determining whether the similarity is greater than the preset value;
    If the similarity is greater than the preset value, determining that the advertisement is the associated advertisement.
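
The steps recited in the claims above can be illustrated with a short Python sketch: it resolves user preference information from a request source identifier and then keeps the advertisements whose similarity to that information exceeds a preset value. This is only a minimal sketch, not the applicant's implementation. The in-memory dictionaries, the function names, the "user:" and "terminal:" identifier prefixes, the use of the most frequent tag as the "key preference tag", the Jaccard measure over keyword sets, and the 0.3 preset value are all assumptions made for illustration; the claims do not specify a similarity measure or how identifier types are distinguished. The similar-crowd branch is omitted, advertisement keywords are assumed to be already extracted, and the URL lookup is treated here as a last resort, although the claims present it as a separate alternative.

```python
from collections import Counter

# Hypothetical in-memory stores; the claims do not say how this data is held.
USER_PORTRAITS = {
    "user:alice": {"preferences": {"travel", "insurance"}},
    "user:bob": {},  # portrait exists but holds no preference information
}
HISTORY_TAGS = {
    "user:bob": ["cars", "loans", "cars"],
    "terminal:42": ["sports", "sports", "travel"],
}
PAGE_TAGS = {"/funds/index.html": "funds"}  # one preference tag per tagged page


def key_preference_tag(tags):
    """Simplest possible 'statistical analysis': pick the most frequent tag."""
    return Counter(tags).most_common(1)[0][0] if tags else None


def resolve_preferences(source_id, url=None):
    """Return a set of preference keywords for a request source identifier."""
    if source_id.startswith("user:"):
        portrait = USER_PORTRAITS.get(source_id, {})
        if portrait.get("preferences"):
            # Existing preference information is used directly.
            return set(portrait["preferences"])
    # Otherwise derive a key preference tag from historical webpage data.
    tag = key_preference_tag(HISTORY_TAGS.get(source_id, []))
    if tag is None and url is not None:
        # Assumed ordering: fall back to the tag of the requested page.
        tag = PAGE_TAGS.get(url)
    return {tag} if tag else set()


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (an assumed measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def associated_ads(ads, preferences, preset_value=0.3):
    """Keep the ads whose keyword similarity exceeds the preset value."""
    return [name for name, keywords in ads.items()
            if jaccard(keywords, preferences) > preset_value]


# Hypothetical advertisements with pre-extracted category keywords.
ADS = {"car-loan": {"cars", "loans"}, "beach-trip": {"travel", "beach"}}

if __name__ == "__main__":
    for source in ("user:alice", "user:bob", "terminal:42"):
        prefs = resolve_preferences(source)
        print(source, sorted(prefs), associated_ads(ADS, prefs))
```

In this sketch a signed-in user with stored preferences never touches the browsing history, while an anonymous terminal is served purely from the tags of the pages it has visited. Swapping in a different similarity measure only requires replacing `jaccard`.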
PCT/CN2017/112569 2017-11-15 2017-11-23 Real-time advertisement recommendation method and apparatus, and terminal device and storage medium WO2019095417A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711126787.6A CN107862553B (en) 2017-11-15 2017-11-15 Advertisement real-time recommendation method and device, terminal equipment and storage medium
CN201711126787.6 2017-11-15

Publications (1)

Publication Number Publication Date
WO2019095417A1 (en) 2019-05-23

Family

ID=61701778

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/112569 WO2019095417A1 (en) 2017-11-15 2017-11-23 Real-time advertisement recommendation method and apparatus, and terminal device and storage medium

Country Status (2)

Country Link
CN (1) CN107862553B (en)
WO (1) WO2019095417A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110503486A (en) * 2019-08-28 2019-11-26 北京深演智能科技股份有限公司 A kind of screening technique and device of advertising strategy

Families Citing this family (32)

Publication number Priority date Publication date Assignee Title
CN108540831B (en) * 2018-04-19 2019-10-22 百度在线网络技术(北京)有限公司 Method and apparatus for pushed information
CN108876434B (en) * 2018-05-24 2022-08-16 北京五八信息技术有限公司 User portrait construction method and device, computing device and readable storage medium
JP7155698B2 (en) * 2018-07-18 2022-10-19 オムロンヘルスケア株式会社 Information processing device, information processing method and program for information processing
CN109165282A (en) * 2018-08-01 2019-01-08 王冠 A kind of network data grasping means and system
CN109375913B (en) * 2018-09-11 2022-04-08 中铁程科技有限责任公司 Data processing method and device
CN109299981A (en) * 2018-09-17 2019-02-01 北京点网聚科技有限公司 A kind of advertisement recommended method and device
CN109345303B (en) * 2018-09-27 2023-11-03 北京奇虎科技有限公司 Rich media advertisement putting method and device
CN110969469B (en) * 2018-09-30 2024-02-20 北京国双科技有限公司 Data acquisition method and device
CN109523302A (en) * 2018-10-19 2019-03-26 中链科技有限公司 Advertisement sending method, device and calculating equipment based on block chain
CN109598540B (en) * 2018-11-09 2024-03-22 湖南工业大学 Advertisement accurate pushing method and advertisement accurate pushing system
CN112075084B (en) * 2018-12-20 2022-06-14 海信视像科技股份有限公司 Receiving apparatus, receiving method, transmitting apparatus, transmitting method, transmitting/receiving system, and transmitting/receiving method
CN109815381A (en) * 2018-12-21 2019-05-28 平安科技(深圳)有限公司 User's portrait construction method, system, computer equipment and storage medium
CN109714277A (en) * 2018-12-28 2019-05-03 上海掌门科技有限公司 Information flow calling, distribution method, electronic equipment and medium
CN111415183A (en) * 2019-01-08 2020-07-14 北京京东尚科信息技术有限公司 Method and apparatus for processing access requests
CN109934721A (en) * 2019-01-18 2019-06-25 深圳壹账通智能科技有限公司 Finance product recommended method, device, equipment and storage medium
CN110457610B (en) * 2019-06-27 2022-04-19 五八有限公司 Information recommendation method, device, terminal, server and storage medium
CN110288443A (en) * 2019-06-27 2019-09-27 北京金山安全软件有限公司 Information pushing method and device, electronic equipment and computer readable storage medium
CN110400180B (en) * 2019-07-29 2023-11-07 腾讯科技(深圳)有限公司 Recommendation information-based display method and device and storage medium
CN112464076A (en) * 2019-09-06 2021-03-09 百度在线网络技术(北京)有限公司 Service function recommendation method and device
CN110782288A (en) * 2019-10-25 2020-02-11 广州凌鑫达实业有限公司 Cloud computing aggregate advertisement data processing method, device, equipment and medium
CN111241409A (en) * 2020-01-21 2020-06-05 北京三快在线科技有限公司 Information pushing method and device, electronic equipment and readable storage medium
CN111327930A (en) * 2020-02-28 2020-06-23 北京达佳互联信息技术有限公司 Method and device for acquiring target object, electronic equipment and storage medium
CN111581492B (en) * 2020-04-01 2024-02-23 车智互联(北京)科技有限公司 Content recommendation method, computing device and readable storage medium
CN111523948A (en) * 2020-06-16 2020-08-11 网易(杭州)网络有限公司 Advertisement display method and device, computer readable storage medium and electronic equipment
CN111913996B (en) * 2020-07-14 2023-07-18 中国联合网络通信集团有限公司 Data processing method, device, equipment and storage medium
CN112187407A (en) * 2020-09-25 2021-01-05 中国移动通信集团黑龙江有限公司 Real-time signaling message processing method, device, equipment and computer storage medium
CN112163909B (en) * 2020-10-29 2021-05-18 杭州次元岛科技有限公司 Advertisement delivery system based on big data
CN112732892B (en) * 2020-12-30 2022-09-20 平安科技(深圳)有限公司 Course recommendation method, device, equipment and storage medium
CN112884507A (en) * 2021-02-05 2021-06-01 世纪蜗牛通信科技有限公司 Advertisement marketing recommendation system based on user preference
CN113411627B (en) * 2021-06-17 2023-04-18 广州博冠信息科技有限公司 Data pushing method and device, readable storage medium and electronic equipment
CN113626575A (en) * 2021-09-01 2021-11-09 浙江力石科技股份有限公司 Intelligent recommendation method based on user question answering
CN116485474B (en) * 2023-04-29 2024-03-19 广州市安洛网络有限责任公司 Accurate crowd of recreation advertisement is directional puts in system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105512242A (en) * 2015-11-30 2016-04-20 浙江工业大学 Parallel recommend method based on social network structure
CN105827676A (en) * 2015-01-04 2016-08-03 中国移动通信集团上海有限公司 System, method and device for acquiring user portrait information
CN106485553A (en) * 2016-10-18 2017-03-08 安徽天达网络科技有限公司 A kind of advertisement intelligent put-on method for target audience

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104182516B (en) * 2014-08-21 2018-03-06 北京金山安全软件有限公司 Information recommendation method and device and mobile terminal

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105827676A (en) * 2015-01-04 2016-08-03 中国移动通信集团上海有限公司 System, method and device for acquiring user portrait information
CN105512242A (en) * 2015-11-30 2016-04-20 浙江工业大学 Parallel recommend method based on social network structure
CN106485553A (en) * 2016-10-18 2017-03-08 安徽天达网络科技有限公司 A kind of advertisement intelligent put-on method for target audience

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110503486A (en) * 2019-08-28 2019-11-26 北京深演智能科技股份有限公司 A kind of screening technique and device of advertising strategy
CN110503486B (en) * 2019-08-28 2023-01-20 北京深演智能科技股份有限公司 Method and device for screening advertisement strategies

Also Published As

Publication number Publication date
CN107862553B (en) 2020-03-17
CN107862553A (en) 2018-03-30

Similar Documents

Publication Publication Date Title
WO2019095417A1 (en) Real-time advertisement recommendation method and apparatus, and terminal device and storage medium
US11716401B2 (en) Systems and methods for content audience analysis via encoded links
US20200153917A1 (en) Systems and methods for analyzing traffic across multiple media channels via encoded links
US9710555B2 (en) User profile stitching
US9922333B2 (en) Automated multivariate behavioral prediction
US10339198B2 (en) Systems and methods for benchmarking online activity via encoded links
US9916589B2 (en) Advertisement selection using multivariate behavioral model
US10134053B2 (en) User engagement-based contextually-dependent automated pricing for non-guaranteed delivery
US9135344B2 (en) System and method providing search results based on user interaction with content
US20170154356A1 (en) Generating actionable suggestions for improving user engagement with online advertisements
KR101419504B1 (en) System and method providing a suited shopping information by analyzing the propensity of an user
US20170206416A1 (en) Systems and Methods for Associating an Image with a Business Venue by using Visually-Relevant and Business-Aware Semantics
US20120059713A1 (en) Matching Advertisers and Users Based on Their Respective Intents
US11936751B2 (en) Systems and methods for online activity monitoring via cookies
US20090271228A1 (en) Construction of predictive user profiles for advertising
CN104254851A (en) Method and system for recommending content to a user
US20120173338A1 (en) Method and apparatus for data traffic analysis and clustering
US20140074851A1 (en) Dynamic data acquisition method and system
US20160042403A1 (en) Extraction device, extraction method, and non-transitory computer readable storage medium
US10331713B1 (en) User activity analysis using word clouds
US20150339712A1 (en) Inferring Facts from Online User Activity
CN106383857A (en) Information processing method and electronic equipment
US20150142782A1 (en) Method for associating metadata with images
WO2016106571A1 (en) Systems and methods for building keyword searchable audience based on performance ranking
US9092463B2 (en) Keyword generation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application; Ref document number: 17932333; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase; Ref country code: DE
32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established; Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14/08/2020)
122 Ep: pct application non-entry in european phase; Ref document number: 17932333; Country of ref document: EP; Kind code of ref document: A1