WO2014023121A1 - Method and apparatus for delivering personalized content - Google Patents

Method and apparatus for delivering personalized content

Info

Publication number
WO2014023121A1
Authority
WO
WIPO (PCT)
Prior art keywords
web page
user
server
data
terminal
Prior art date
Application number
PCT/CN2013/076308
Other languages
English (en)
Chinese (zh)
Inventor
游源
钟杰萍
尹攀
杜家春
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2014023121A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements

Definitions

  • Embodiments of the present invention relate to the field of communications, and in particular, to a method and apparatus for delivering personalized content.
  • Background
  • In the prior art, when a user accesses a provider of specific content, the provider sends the current user's identification information to an analysis platform in real time. The analysis platform retrieves the current user's features from its own database and returns them to the specific content provider or to a third-party content provider, which then provides the personalized content.
  • In this process the website must communicate with the user behavior analysis system in real time. The system retrieves the relevant user data in real time and returns the current user's type to the current website (or to a third-party advertisement provider). That organization assembles personalized content matching the user and finally returns it to the current user's terminal.
  • These steps involve multiple rounds of real-time communication between servers and many intermediate links. Response speed and delivery reliability depend heavily on the user's network environment, bandwidth consumption is large, and server interaction efficiency is low.
  • The embodiments of the present invention provide a method and an apparatus for delivering personalized content, which solve the problem of large bandwidth consumption and low server interaction efficiency when customizing personalized content in the prior art.
  • An embodiment of the invention provides a method for delivering personalized content, including:
  • a user behavior analysis (BT) server obtains the user information sent by a terminal while the terminal accesses a webpage, analyzes the webpage accessed by the terminal, and obtains the webpage type of the webpage;
  • the BT server obtains user feature data according to the webpage type of the webpage and the user information;
  • the BT server queries a pre-configured correspondence table between user feature data and delivery policy data, and determines the delivery policy data corresponding to the user feature data;
  • a BT server that delivers personalized content including:
  • An obtaining unit configured to acquire user information sent by the terminal during the process of accessing the webpage, and analyze the webpage accessed by the terminal to obtain a webpage type of the webpage;
  • the obtaining unit is further configured to acquire user feature data according to the webpage type of the webpage and the user information;
  • a determining unit configured to query a correspondence table between the user feature data pre-configured by the BT server and the delivery policy data, and determine the delivery policy data corresponding to the user feature data;
  • a writing unit configured to write the user feature data and the corresponding delivery policy data into the cookie of the terminal, so that when the content providing server receives the access request sent by the terminal the next time the webpage is accessed, it delivers personalized content to the terminal according to the delivery policy data in the terminal's cookie.
  • In this embodiment, after receiving the user information sent by the terminal, the BT server analyzes the type of the webpage accessed by the terminal, obtains the user characteristic data and the corresponding delivery policy from the user information and the webpage type, and writes them into the terminal's cookie. The next time the content providing server receives an access request from the terminal, it can deliver personalized content to the terminal directly by reading the terminal's cookie. This reduces bandwidth consumption and improves server interaction efficiency.
  • FIG. 1 is a system architecture diagram of an embodiment of the present invention
  • FIG. 3 is a flowchart of Embodiment 2 of the present invention
  • FIG. 4 is a structural diagram of Embodiment 3 of the present invention.
  • Figure 1 shows the architecture of the system.
  • The system includes a terminal, a content providing server, and a Behavioral Targeting (BT) server.
  • The components of the system communicate with each other via a wired or wireless communication network.
  • These communication networks include, but are not limited to, a mobile telephone network, a wireless local area network (WLAN), a Bluetooth personal area network, an Ethernet LAN, and a token ring.
  • The terminal may include, but is not limited to, a mobile device, a combination PDA and mobile telephone, a PDA, an integrated messaging device (IMD), a personal computer (PC), and a notebook computer.
  • These terminals may be mobile, or may be located in a mode of transportation such as, but not limited to, a car, a truck, a taxi, a bus, a ship, an airplane, a bicycle, or a motorcycle.
  • The above communication devices can communicate using various transmission technologies, including but not limited to Code Division Multiple Access (CDMA), Global System for Mobile Communications (GSM), Universal Mobile Telecommunications System (UMTS), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Transmission Control Protocol/Internet Protocol (TCP/IP), Short Messaging Service (SMS), Multimedia Messaging Service (MMS), e-mail, Instant Messaging Service (IMS), Bluetooth, and IEEE 802.11.
  • Different transmission media may be used between the above communication devices, including but not limited to radio, infrared, laser, and cable connections.
  • the content providing server registers its service with the BT server in advance, and embeds a specific script pointing to the BT server into the web page it operates, thereby forming a monitoring domain of the BT server.
  • When a user accesses such a web page, the page automatically loads and executes the above script file.
  • The script file automatically collects and stores information about what the user is browsing on the page and sends it to the BT server.
  • The BT server analyzes and processes the user's information across the entire monitoring domain according to the relevant model algorithm and forms user characteristic data for the current user. It then forms the corresponding policy according to the personalized content delivery rules that the content providing server configured in advance in the BT server for that user characteristic data.
  • The above information is written to the user's local cookie in the agreed format through the above script. When the user visits the website again, the content providing server can therefore directly acquire and execute the BT analysis result and the personalized delivery policy data stored locally on the user's terminal, and perform accurate delivery. That is, the BT server only needs to analyze the user once. When the user visits the website again, the content providing server delivers the personalized content to the end user directly by reading the terminal's local cookie, and the BT server does not need to analyze the click stream data again. This reduces bandwidth consumption and improves server interaction efficiency.
  • Figure 2 is a flow chart for delivering personalized content. As shown in Figure 2, the method includes:
  • the BT server acquires the user information sent by the terminal during the process of accessing the webpage, and analyzes the webpage accessed by the terminal, and obtains the webpage type of the webpage;
  • User information includes HTTP request information, such as requested URL, jump source URL, etc., information of the requested page, such as page title, keywords, abstracts, etc., and user behavior information, such as click, submit, input, jump, Refresh and so on.
  • The BT server can verify and reconstruct the received user information.
  • User information that passes verification is reconstructed into click stream data. User information that fails verification is deleted, and new user information is received and verified.
  • Besides obtaining the user information, the BT server also needs to analyze the webpage accessed by the terminal to obtain the webpage type of the webpage. Note that obtaining the user information and analyzing the accessed webpage need not occur in a strict order: the two steps may be performed simultaneously, or the user information may be acquired first and the accessed webpage analyzed afterwards.
  • The BT server analyzes the webpage accessed by the terminal in the following steps. First, set the type set of webpages and the set of frequencies corresponding to the webpage types.
  • The BT server divides webpage content into N categories in advance (sports, finance, science and technology, and so on), represented by $\{C_1, C_2, \ldots, C_N\}$, and sets the collection $\{M_1, M_2, \ldots, M_N\}$, where $M_i$ is the frequency of type $C_i$. The frequency is the number of web pages of type $C_i$ among the total number of web pages accessed by the terminal. Note that $\{C_1, C_2, \ldots, C_N\}$ and $\{M_1, M_2, \ldots, M_N\}$ are both set in advance by the BT server and are distinct from the webpage type of the webpage actually accessed by the terminal. The latter is calculated by the BT server from $\{C_1, C_2, \ldots, C_N\}$, $\{M_1, M_2, \ldots, M_N\}$, and the feature data of the web page accessed by the terminal.
  • Next, obtain the feature data of the webpage, where the feature data includes key terms, character spacings, and text lengths corresponding to the webpage types in the type set of the webpage.
  • The BT server can sort P by element value, for example in descending order, to obtain an ordered set Q.
  • The BT server may determine that the text belongs to the category corresponding to the first-ranked element of Q, or may select the top f elements as the probability distribution over their corresponding categories. The selection rule is pre-configured by the BT server.
  • For example, C1 and C2 represent sports and tourism, respectively. The text d of the web page to be analyzed contains feature data such as "basketball", "soccer", and "travel". In the ordered set Q obtained after the above operation, the first element corresponds to C1 and the second element corresponds to C2.
  • If only the first element is selected, the BT server determines that the text d of the webpage is of type C1, that is, the sports class. If the first and second elements are selected, it determines that the text d of the webpage belongs to both the sports class and the tourism class.
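  • The selection rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the per-category scores are made-up numbers standing in for the set P, and the function name `select_page_types` and the parameter `top_f` are hypothetical.

```python
from typing import Dict, List


def select_page_types(scores: Dict[str, float], top_f: int = 1) -> List[str]:
    """Return the top_f categories, highest score first."""
    if top_f < 1:
        raise ValueError("top_f must be at least 1")
    # Sort categories by score in descending order to form the ordered set Q.
    # Python's sort is stable, so tied categories keep their insertion order.
    ordered_q = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [category for category, _ in ordered_q[:top_f]]


# Hypothetical scores for a page mentioning "basketball", "soccer", "travel".
example_scores = {"tourism": 0.30, "sports": 0.55, "finance": 0.15}
print(select_page_types(example_scores))           # best category only
print(select_page_types(example_scores, top_f=2))  # two best categories
```

  • With `top_f=1` the page gets a single type; a larger `top_f` lets one page carry several types, as in the sports and tourism example.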
  • The BT server can calculate the word frequency using a Markov model formula, whose symbols are as follows:
  • $a_i \in \{a_1, \ldots, a_M\}$ is a metadata item in the metadata set;
  • the metadata is data representing key terms of the web page type. If the webpage type is sports, the metadata may be "sports", "soccer", "basketball", and so on; the metadata is preset by the BT server;
  • the webpage type $p$ accessed by the terminal is the target webpage in the Markov process;
  • the target webpage is the webpage accessed by the user in a time period;
  • $f(k)$ is an attenuation factor expressing the decay of earlier steps;
  • $P(x_k = a_i \mid x_0 = p)$ indicates the frequency at which a particular entry appears on web pages that the user has visited from the past to the present.
  • For example, the word frequency of the word "basketball" can be calculated by the above formula.
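  • The role of an attenuation factor can be illustrated with a short Python sketch. The patent's exact formula is not reproduced here; this sketch assumes an exponential factor `decay ** k` and a simple per-page relative frequency, and the function name, parameters, and sample history are all hypothetical.

```python
from typing import Sequence


def decayed_term_frequency(
    history: Sequence[Sequence[str]], term: str, steps: int, decay: float = 0.5
) -> float:
    """Decay-weighted frequency of `term` over the last `steps` pages.

    history[0] is the most recent page; each page is a list of terms.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for k, page_terms in enumerate(history[:steps]):
        weight = decay ** k  # assumed attenuation: older pages weigh less
        share = page_terms.count(term) / len(page_terms) if page_terms else 0.0
        weighted_sum += weight * share
        weight_total += weight
    # Normalize by the total weight so the result stays between 0 and 1.
    return weighted_sum / weight_total if weight_total else 0.0


# Hypothetical browsing history, most recent page first.
pages = [["basketball", "soccer"], ["finance", "stock"], ["basketball", "basketball"]]
print(decayed_term_frequency(pages, "basketball", steps=3))
```

  • The only property this sketch is meant to show is that a term seen on a recent page contributes more than the same term seen on an older page.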
  • TF-IDF (term frequency-inverse document frequency)
  • Term frequency (TF) refers to the frequency at which a given word appears in the file or target webpage.
  • The target webpage is a webpage accessed by the terminal.
  • Inverse document frequency (IDF)
  • The principle is: the fewer documents or target webpages contain a given entry, the larger its IDF and the better the entry distinguishes between classes. The IDF of a particular word is obtained by dividing the total number of files by the number of files containing the word and taking the logarithm of the quotient: $idf_{a_i} = \log\left(|D| / |\{d : a_i \in d\}|\right)$, where $|D|$ is the total number of files in the database or the total number of target webpages.
  • TF-IDF filters out common words and retains important ones. The higher the value, the more important the word is.
  • The BT server calculates the values of $tf_{a_i,p}$ and $idf_{a_i}$ by the above formulas and multiplies the two to find the value of tfidf.
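  • The TF-IDF calculation can be sketched in Python as follows. This is a simplified illustration: TF is taken here as the plain relative frequency of a term on one page rather than the Markov-based value described earlier, a natural logarithm is assumed, and the function names and the sample corpus are hypothetical.

```python
import math
from typing import List


def tf(term: str, page_terms: List[str]) -> float:
    """Relative frequency of `term` among the terms of one page."""
    return page_terms.count(term) / len(page_terms) if page_terms else 0.0


def idf(term: str, corpus: List[List[str]]) -> float:
    """Log of (total number of pages / number of pages containing `term`)."""
    containing = sum(1 for doc in corpus if term in doc)
    if containing == 0:
        return 0.0  # assumption: a term found in no page scores zero
    return math.log(len(corpus) / containing)


def tfidf(term: str, page_terms: List[str], corpus: List[List[str]]) -> float:
    """Product of term frequency and inverse document frequency."""
    return tf(term, page_terms) * idf(term, corpus)


# Hypothetical corpus of four pages, each reduced to its key terms.
corpus = [
    ["basketball", "soccer", "basketball"],
    ["stock", "bond"],
    ["music", "soccer"],
    ["stock", "fund"],
]
print(tfidf("basketball", corpus[0], corpus))
print(tfidf("soccer", corpus[0], corpus))
```

  • In this sample, "basketball" scores higher than "soccer" on the first page because it is both more frequent on that page and rarer across the corpus.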
  • For example, the BT server first obtains all types of web pages accessed by the terminal, say four types: "sports class", "music class", "financial class", and "IT class". It then selects the four terms "sports", "music", "finance", and "IT", finds the tfidf value of each, and sorts them in descending order. The BT server selects the largest one or the first few. Assuming the order is "sports", "music", "finance", "IT", the BT server determines that the user feature is "sports fan", or that the user features are "sports fan" and "music lover".
  • The specific determination can be made by querying a pre-configured correspondence table between metadata and user features.
  • The user feature may be represented by a string or another data format commonly used in computing; this embodiment imposes no limitation.
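  • The ranking and table lookup described above can be sketched in Python. The table contents, the tfidf values, and the names `FEATURE_TABLE` and `user_features` are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, List

# Hypothetical correspondence table between metadata terms and user features.
FEATURE_TABLE: Dict[str, str] = {
    "sports": "sports fan",
    "music": "music lover",
    "finance": "finance follower",
    "IT": "IT enthusiast",
}


def user_features(tfidf_by_term: Dict[str, float], top_n: int = 1) -> List[str]:
    """Map the top_n terms with the largest tfidf values to user features."""
    ranked = sorted(tfidf_by_term, key=lambda term: tfidf_by_term[term], reverse=True)
    # Terms missing from the table are skipped rather than guessed.
    return [FEATURE_TABLE[term] for term in ranked[:top_n] if term in FEATURE_TABLE]


# Made-up tfidf values for one user.
values = {"music": 0.6, "sports": 0.9, "IT": 0.1, "finance": 0.3}
print(user_features(values))
print(user_features(values, top_n=2))
```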
  • S103 Query a correspondence table between user feature data and delivery policy data pre-configured by the BT server, and determine delivery policy data corresponding to the user feature data.
  • the correspondence table is pre-configured by the BT server.
  • For example, a user's features are "ordinary white-collar/fashion follower/gourmet" and the corresponding delivery policy is "consumer electronics/catering group purchase".
  • The user feature data and the delivery policy data can be represented by strings, escaped characters, integers, or other data formats commonly used in computing. Here they are represented by strings, namely "white_collar/fashion_follower/food_lover" and "electronic_consumption/group_purchasing_of_food".
  • Variable 1 and variable 2 represent the user characteristic data and the corresponding policy for that feature, respectively. The remaining parameters, such as the domain, optional flag, expiration time, and creation time, are standard parameters of the cookie file format.
  • The website can then push personalized content according to the relevant information in the cookie.
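  • A cookie carrying the two variables could be assembled as in the Python sketch below, which uses the standard `http.cookies` module. The cookie name `bt_profile`, the `|` separator, the percent-encoding, and the use of `Max-Age` for the validity period are assumptions made for illustration; the patent's agreed format is not reproduced here.

```python
from http.cookies import SimpleCookie
from urllib.parse import quote


def build_set_cookie(features: str, policy: str, domain: str, max_age: int) -> str:
    """Build a Set-Cookie header value holding user features and policy."""
    cookie = SimpleCookie()
    # Variable 1 (features) and variable 2 (policy) are percent-encoded and
    # joined with "|" so that "/" inside either value cannot break parsing.
    cookie["bt_profile"] = f"{quote(features, safe='')}|{quote(policy, safe='')}"
    cookie["bt_profile"]["domain"] = domain
    cookie["bt_profile"]["path"] = "/"
    cookie["bt_profile"]["max-age"] = max_age  # validity period in seconds
    return cookie["bt_profile"].OutputString()


print(build_set_cookie(
    "white_collar/fashion_follower/food_lover",
    "electronic_consumption/group_purchasing_of_food",
    "www.example.com",
    86400,
))
```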
  • After the cookie is written, when the terminal again requests a webpage provided by the content providing server, the content providing server directly reads the terminal's cookie and obtains the current user characteristic information and its specified delivery policy data.
  • The content providing server executes the personalized delivery policy data directly, either generating a personalized content jump link for the current user or requesting a personalized content jump link from the third-party content server.
  • The content providing server then delivers the personalized content to the user, completing the accurate delivery.
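  • On the content providing server, reading the cookie and choosing content might look like the Python sketch below. It assumes a hypothetical cookie named `bt_profile` whose value is two percent-encoded fields joined by `|`; the `POLICY_LINKS` table and all link paths are invented for the example.

```python
from http.cookies import SimpleCookie
from typing import List, Optional, Tuple
from urllib.parse import unquote

# Hypothetical mapping from a delivery policy label to a jump link.
POLICY_LINKS = {
    "electronic_consumption": "/promo/electronics",
    "group_purchasing_of_food": "/promo/food-group-buy",
}


def read_bt_profile(cookie_header: str) -> Optional[Tuple[str, str]]:
    """Return (user features, delivery policy) from a Cookie header, if present."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("bt_profile")
    if morsel is None or "|" not in morsel.value:
        return None
    features, policy = morsel.value.split("|", 1)
    return unquote(features), unquote(policy)


def pick_links(cookie_header: str) -> List[str]:
    """Choose personalized links, falling back to a default page."""
    profile = read_bt_profile(cookie_header)
    if profile is None:
        return ["/default"]
    links = [POLICY_LINKS[p] for p in profile[1].split("/") if p in POLICY_LINKS]
    return links or ["/default"]


header = "sid=abc; bt_profile=white_collar%2Ffood_lover|electronic_consumption%2Fgroup_purchasing_of_food"
print(pick_links(header))
```

  • No call to the BT server is made on this path, which is the point of storing the policy in the cookie.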
  • In this embodiment, after receiving the user information sent by the terminal, the BT server analyzes the type of the webpage accessed by the terminal, obtains the user characteristic data and the corresponding delivery policy from the user information and the webpage type, and writes them into the terminal's cookie. The next time the content providing server receives an access request from the terminal, it can deliver personalized content to the terminal directly by reading the terminal's cookie. This reduces bandwidth consumption and improves server interaction efficiency.
  • Figure 3 is a method flow for delivering personalized content. As shown in Figure 3, the method flow includes:
  • the content providing server requests the BT service, and carries the basic information of the website, such as the domain name, the basic service type, etc., in the request message, for example, the domain name is www.example.com, and the basic service type is the search type.
  • After receiving the request from the content providing server, the BT server completes the registration process with the content providing server. After registration, the BT server specifies, within the specified user feature set, one or more personalized delivery policies for one or more user features according to the correspondence table between user features and personalized delivery policies pre-configured by the content providing server. It also periodically presents statistics on the number of users in the database, the corresponding user features, and the delivery policies to the content providing server in the form of a report.
  • the BT server collects user information.
  • The BT server starts to automatically collect the current user information, including but not limited to: HTTP request information, such as the requested URL and the jump source URL; information about the requested page, such as page title, keywords, and abstract; and user behavior information, such as clicks, submissions, inputs, jumps, and refreshes.
  • the collection process may be:
  • the terminal executes the command statement in the specific script and sends the user information to the remote BT server.
  • Common errors or abnormal records include garbled data, duplicates, abnormal terminations, and so on.
  • The BT server discards such erroneous and abnormal data.
  • The BT server receives the above user information and verifies it.
  • User information is saved in a format in which fields are separated by feature separators, such as "xxxxx ⁇ y yyyyy", where " ⁇ ⁇ " is the feature separator and "x" and "y" are the original data, which may be characters or other forms of data commonly used in the computer field.
  • If the received data does not conform to this format, the BT server determines that the data is abnormal and culls it, that is, it discards the received abnormal user information data and waits to receive the data of the next set of user information.
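  • The verification step can be sketched in Python. The separator character and the record layout are not recoverable from the text, so `SEPARATOR = "|"` and a two-field record are assumptions, as are the function names.

```python
from typing import Iterable, List, Optional

SEPARATOR = "|"      # assumed; the actual feature separator is not known
EXPECTED_FIELDS = 2  # assumed record layout: two separated fields


def verify_record(raw: str) -> Optional[List[str]]:
    """Return the fields of a well-formed record, or None if it is abnormal."""
    fields = [field.strip() for field in raw.split(SEPARATOR)]
    if len(fields) != EXPECTED_FIELDS or any(not field for field in fields):
        return None
    return fields


def filter_records(raw_records: Iterable[str]) -> List[List[str]]:
    """Keep verified records; drop malformed and duplicate ones."""
    seen = set()
    valid = []
    for raw in raw_records:
        fields = verify_record(raw)
        if fields is None or raw in seen:
            continue  # discard abnormal or repeated data, move to the next record
        seen.add(raw)
        valid.append(fields)
    return valid


print(filter_records(["a|b", "a|b", "garbled", "|x", "c|d"]))
```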
  • the BT server reconstructs the user information into click stream data.
  • Refactoring means reprocessing and integrating the data of user information.
  • the BT server converts the verified data into click stream data, and the type of the click stream data includes but is not limited to: a user browsing time period, a browsing time, a frequency of browsing the website, and a type of browsing the website.
  • The browsing time period can be divided into morning (5:01 - 12:00), afternoon (12:01 - 18:00), evening (18:01 - 22:00), and late night (22:01 - 5:00).
  • The browsing duration is the length of time from when the page request is initiated to when the page is closed in the original data, classified as: short (less than 10 seconds), normal (10-30 seconds), longer (30-100 seconds), and long (more than 100 seconds). The division into time ranges is not strictly limited, and other divisions also fall within the protection scope of the present invention.
  • the frequency of browsing the website indicates the number of visits to the website in a unit of time.
  • the types of websites that can be browsed can be classified into sports, finance, and search.
  • For example, if the time at which the page request is initiated in the original data is 23:09:23 and the page close time is 23:11:01, the data can be reconstructed as: access time period: late night (22:01-05:00); visit duration: longer (30-100 seconds); visits: 1 time/day; visited website type: sports.
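  • The reconstruction in this example can be sketched in Python. The boundaries follow the divisions given in the text, with the afternoon assumed to end at 18:00; how exact boundary values such as 30 or 100 seconds are assigned, the midnight handling, and all function names are assumptions.

```python
from datetime import datetime


def time_period(clock: str) -> str:
    """Bucket an HH:MM:SS time into the browsing time periods."""
    t = datetime.strptime(clock, "%H:%M:%S")
    minute = t.hour * 60 + t.minute
    if 5 * 60 + 1 <= minute <= 12 * 60:
        return "morning"
    if 12 * 60 + 1 <= minute <= 18 * 60:
        return "afternoon"
    if 18 * 60 + 1 <= minute <= 22 * 60:
        return "evening"
    return "late night"  # 22:01 through 05:00


def duration_class(seconds: float) -> str:
    """Bucket a visit length; values of exactly 30 or 100 count as 'longer'."""
    if seconds < 10:
        return "short"
    if seconds < 30:
        return "normal"
    if seconds <= 100:
        return "longer"
    return "long"


def reconstruct(request_at: str, closed_at: str, site_type: str) -> dict:
    """Turn raw request and close times into one click stream record."""
    start = datetime.strptime(request_at, "%H:%M:%S")
    end = datetime.strptime(closed_at, "%H:%M:%S")
    seconds = (end - start).total_seconds()
    if seconds < 0:
        seconds += 24 * 3600  # assumption: the visit crossed midnight
    return {
        "period": time_period(request_at),
        "duration": duration_class(seconds),
        "seconds": seconds,
        "site_type": site_type,
    }


print(reconstruct("23:09:23", "23:11:01", "sports"))
```

  • For the times in the example the visit lasts 98 seconds, which falls in the "longer" class during the "late night" period.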
  • the BT server analyzes the content of the webpage in real time or periodically, and stores the webpage content analysis result in the database.
  • the result is the type of the webpage, such as sports, finance, science, and the like.
  • ⁇ ⁇ 1 ⁇ indicates the number of input data that does not belong to the classification G.
  • The BT server can determine that the text belongs to the classification corresponding to the first-ranked element of the set Q, or may select the top f elements as the probability distribution over their corresponding classifications. The selection rules are pre-configured by the BT server.
  • For example, C1 and C2 represent sports and tourism, respectively.
  • The text d of the web page to be analyzed contains feature data such as "basketball", "soccer", and "travel". In the ordered set Q obtained after the above operation, the first element corresponds to C1 and the second element corresponds to C2.
  • If only the first element is selected, the BT server determines that the text d of the webpage is of type C1, that is, the sports class. If the first and second elements are selected, it determines that the text d of the webpage belongs to both the sports class and the tourism class.
  • S205. Acquire feature data of the user.
  • the BT server extracts the click stream data of the user in the database and the type of the visited webpage, and performs modeling analysis. Obtaining the characteristic data of the user, for example, the user 1 is both "ordinary white-collar" and "friend".
  • A Markov model or a probability model can be used to fit and analyze the relationship between successive behaviors in a sequence, with an attenuation factor added to express the decay of earlier steps: the older a record is, the smaller its weight. Alternatively, Bayesian estimation can be used to model the types of the user's click records.
  • The main idea is to estimate the user's current behavior on a web page from the user's past behaviors on web pages, that is, from the relevance of past behavior to current behavior. The symbols in its formula are as follows:
  • $s$ is the configured number of effective history steps;
  • $a_i \in \{a_1, \ldots, a_M\}$ is a metadata item in the metadata set;
  • the metadata is data representing key terms of the web page type;
  • the webpage type $p$ accessed by the terminal is the target webpage in the Markov process;
  • the target webpage is the webpage accessed by the user in a time period;
  • $P(x_k = a_i \mid x_0 = p)$ indicates the frequency at which a particular entry appears on web pages that the user has visited from the past to the present.
  • TF-IDF (term frequency-inverse document frequency)
  • TF (term frequency)
  • IDF (inverse document frequency)
  • TF-IDF filters out common words and retains important ones. The higher the value, the more important the word is.
  • The BT server calculates the values of $tf_{a_i,p}$ and $idf$ by the above formulas and multiplies the two to find the value of tfidf.
  • The BT server models and analyzes a given user in the following steps. First, all types of web pages browsed by the user are obtained, say the four types "sports class", "music class", "financial class", and "IT class".
  • The tfidf values of the corresponding terms are then sorted, and the BT server selects the largest one or the first few. Assuming the order is "sports", "music", "finance", "IT", the BT server determines that the user feature is "sports fan", or that the user features are "sports fan" and "music lover". The specific determination can be made by querying the configured correspondence table between terms and user features.
  • The user feature may be represented by a string or another data format commonly used in computing; this embodiment imposes no limitation.
  • The BT server matches the obtained user feature against the personalized delivery policies set by the content providing server that provides the webpage, obtains the personalized delivery policy data for the user feature, and converts it into the specified cookie format. It then writes the user characteristic data and the personalized delivery policy data into the terminal's cookie and sets the life cycle of the cookie, that is, the validity period of the user characteristic data and the corresponding delivery policy data.
  • This embodiment does not limit how the cookie is written. Writing a single cookie record under the current website's domain name is taken as an example, where the user's features are "ordinary white-collar/fashion follower/gourmet" and
  • the corresponding delivery policy is "consumer electronics/catering group purchase". The written user characteristic data and corresponding delivery policy data can be represented by a string, escaped characters, integers, or another commonly used data format; a string is used here.
  • the terminal initiates an access request to the content providing server.
  • After the BT server writes the user feature data and the corresponding delivery policy data into the cookie, the user accesses the webpage again through the terminal.
  • The content providing server acquires the current user feature data and the corresponding delivery policy data.
  • The content providing server obtains the current user feature information and its specified delivery policy data by reading the cookie information of the terminal.
  • The content providing server delivers the personalized content to the user.
  • The content providing server directly executes the personalized delivery policy data, either generating a personalized content jump link for the current user or requesting a personalized content jump link from the third-party content server.
  • The content providing server delivers the personalized content to the user, completing the precise delivery.
  • In this embodiment, after receiving the user information sent by the terminal, the BT server analyzes the type of the webpage accessed by the terminal, obtains the user characteristic data and the corresponding delivery policy from the user information and the webpage type, and writes them into the terminal's cookie. The next time the content providing server receives an access request from the terminal, it can deliver personalized content to the terminal directly by reading the terminal's cookie. This reduces bandwidth consumption and improves server interaction efficiency.
  • FIG. 4 is a structural diagram of a device of the BT server, as shown in FIG. 4, including:
  • The obtaining unit 301 is configured to acquire the user information sent by the terminal while the terminal accesses the webpage, analyze the webpage accessed by the terminal, and obtain the webpage type of the webpage.
  • After the terminal accesses the webpage provided by the content providing server, the terminal's browser executes the script program on the web page and transmits the user information to the obtaining unit 301.
  • User information includes HTTP request information, such as requested URL, jump source URL, etc., information of the requested page, such as page title, keyword, abstract, etc., and user behavior information, such as click, submit, input, jump, Refresh and so on.
  • The obtaining unit 301 can verify and reconstruct the received user information.
  • User information that passes verification is reconstructed into click stream data. User information that fails verification is deleted, and new user information is received and verified.
  • Besides obtaining the user information, the obtaining unit 301 also needs to analyze the webpage accessed by the terminal to obtain the webpage type of the webpage. Note that obtaining the user information and analyzing the accessed webpage need not occur in a strict order: the two steps may be performed simultaneously, or the user information may be acquired first and the accessed webpage analyzed afterwards.
  • Through the setting unit, the BT server divides webpage content into N categories in advance (sports, finance, science and technology, and so on), represented by $\{C_1, C_2, \ldots, C_N\}$, and sets the set $\{M_1, M_2, \ldots, M_N\}$, where $M_i$ is the frequency of type $C_i$.
  • The frequency is the number of web pages of type $C_i$ among the total number of web pages accessed by the terminal.
  • $\{C_1, C_2, \ldots, C_N\}$ and $\{M_1, M_2, \ldots, M_N\}$ are set in advance by the server and are distinct from the type of the webpage actually accessed by the terminal, which the server calculates from $\{C_1, C_2, \ldots, C_N\}$, $\{M_1, M_2, \ldots, M_N\}$, and the feature data of the web page accessed by the terminal.
  • the obtaining unit 301 acquires feature data of the webpage, where the feature data includes a key term, a character pitch, and a text length corresponding to the webpage type in the type set of the webpage.
  • The calculating unit calculates the probability of the feature data according to the set of frequencies corresponding to the webpage types, and selects one or more of the calculated probability values to obtain the webpage type corresponding to each selected probability value.
  • MM 1 indicates the number of input data that does not belong to the classification G.
  • The ordered set Q is obtained, in which the larger an element's value is, the higher its position.
  • The determining unit can determine that the text belongs to the category corresponding to the first-ranked element of the set Q, or can select the top f elements as the probability distribution over their corresponding classifications. The selection rules are pre-configured by the determining unit.
  • For example, C1 and C2 represent sports and tourism, respectively, and the text d of the web page to be analyzed contains "basketball" and "soccer".
  • If only the first element is selected, the determining unit of the BT server determines that the text d of the webpage is of type C1, that is, the sports class. If the first and second elements are selected, it determines that the text d of the webpage belongs to both the sports class and the tourism class.
  • the obtaining unit is further configured to acquire user feature data according to the webpage type of the webpage and the user information;
  • the obtaining step of the obtaining unit 301 is as follows:
  • the calculation unit of the BT server can calculate the word frequency by the Markov model formula, and the formula is as follows:
  • $a_i \in \{a_1, \ldots, a_M\}$ is a metadata item in the metadata set
  • the metadata is data representing key terms of the webpage type; for example, if the webpage type is sports, the metadata can be "sports", "soccer", or "basketball", etc.; the metadata is preset by the calculation unit of the BT server.
  • the webpage type P accessed by the terminal is the target webpage in the Markov process
  • the target webpage is the webpage accessed by the user in a time period
  • f(k) is an attenuation factor for expressing the previous steps, for example, this embodiment
  • $P(x_k = a_i \mid x_0 = p)$
  • the formula indicates the frequency at which a particular term appears on the web pages that the user has visited from the past to the present.
  • Finally, the word frequency of the term "basketball" can be calculated by the above formula.
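The formula itself did not survive in this text, so the following is not the patent's formula. It is only an illustrative sketch of a history-weighted word frequency in the same spirit, under two loudly stated assumptions: the attenuation factor is taken to be f(k) = decay^k, and the per-page probability term is approximated by the share of the term among the page's key terms. All names and data are hypothetical.

```python
from typing import List, Sequence


def page_term_share(page_terms: Sequence[str], term: str) -> float:
    """Share of `term` among the key terms of one page (assumed stand-in for the probability term)."""
    if not page_terms:
        return 0.0
    return page_terms.count(term) / len(page_terms)


def decayed_term_frequency(history: List[Sequence[str]], term: str, decay: float = 0.5) -> float:
    """Sum the per-page shares, weighting the page k steps back by decay**k (assumed f(k))."""
    return sum((decay ** k) * page_term_share(page, term) for k, page in enumerate(history))


# history[0] is the most recently visited page, history[k] the page k steps earlier.
history = [
    ["basketball", "score", "basketball", "team"],
    ["soccer", "basketball"],
    ["stock", "fund"],
]
basketball_tf = decayed_term_frequency(history, "basketball", decay=0.5)
print(basketball_tf)
```

Older pages contribute less to the result, which is the role the text assigns to the attenuation factor.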
  • TF-IDF (term frequency-inverse document frequency)
  • Term frequency (TF) refers to the frequency at which a given term appears in the file or target webpage.
  • the target webpage represents a webpage accessed by the terminal.
  • IDF (inverse document frequency)
  • the principle is: the fewer documents or target webpages contain a given term, the larger its IDF, and the better the term distinguishes between classes.
  • the IDF of a particular term can be obtained by dividing the total number of documents by the number of documents containing the term and taking the logarithm of the quotient: $\mathrm{idf}_{a_i} = \log \frac{|D|}{|\{d \in D : a_i \in d\}|}$
  • $|D|$ is the total number of files in the database or the total number of target webpages
  • $|\{d \in D : a_i \in d\}|$ is the number of files or target webpages containing the term $a_i$
  • the calculation unit of the BT server calculates the values of $tf_{a_i,p}$ and $idf_{a_i}$ by the above formulas, and multiplies the two to obtain the value of $tfidf$.
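The TF-IDF step can be sketched with the standard textbook definitions: TF as the relative count of a term in one page, and IDF as the logarithm of the total page count over the number of pages containing the term. The function names, the sample pages, and the zero result for a term that appears nowhere are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def term_frequency(term: str, page: Sequence[str]) -> float:
    """Relative frequency of `term` among the terms of one page."""
    if not page:
        return 0.0
    return page.count(term) / len(page)


def inverse_document_frequency(term: str, pages: List[Sequence[str]]) -> float:
    """log(total pages / pages containing the term); 0.0 if no page contains it (assumed convention)."""
    containing = sum(1 for page in pages if term in page)
    if containing == 0:
        return 0.0
    return math.log(len(pages) / containing)


def tf_idf(term: str, page: Sequence[str], pages: List[Sequence[str]]) -> float:
    """Product of the term frequency in `page` and the inverse document frequency over `pages`."""
    return term_frequency(term, page) * inverse_document_frequency(term, pages)


# Hypothetical key terms extracted from four pages accessed by the terminal.
pages = [
    ["basketball", "basketball", "soccer", "match"],
    ["music", "concert"],
    ["finance", "stock"],
    ["basketball", "IT"],
]
basketball_score = tf_idf("basketball", pages[0], pages)
print(basketball_score)
```

Here "basketball" makes up half of the first page and occurs in two of the four pages, so its score is 0.5 × log(4/2).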
  • the obtaining unit 301 selects the one or more largest TF-IDF values calculated by the calculating unit, determines the metadata corresponding to those TF-IDF values, and queries the correspondence table between the metadata and the user feature data to obtain the user feature data corresponding to the metadata.
  • for example, the obtaining unit of the BT server first acquires all types of the web pages accessed by the terminal, which here are the four types "sports class", "music class", "financial class", and "IT class"; it then selects the four terms "sports", "music", "finance", and "IT", determines the tfidf value of each, and sorts the values in descending order; the obtaining unit 301 selects the one with the largest value or the first few.
  • the user feature of the user acquired by the obtaining unit 301 is “sports fan”, or the user feature is “sports fan” and “user”
  • the specific acquisition is performed by querying the pre-configured correspondence table between the metadata and the user feature.
  • the user feature can be represented by a character string or by another data format commonly used by computers; this embodiment imposes no limitation.
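The lookup from the highest-scoring metadata to user features can be sketched as below. Only the pairing of "sports" with "sports fan" is suggested by the text; the other table entries, the tfidf values, and all names are hypothetical placeholders.

```python
from typing import Dict, List

# Hypothetical correspondence table between metadata and user features.
METADATA_TO_USER_FEATURE: Dict[str, str] = {
    "sports": "sports fan",
    "music": "music fan",      # assumed entry
    "finance": "investor",     # assumed entry
    "IT": "IT enthusiast",     # assumed entry
}


def user_features(tfidf: Dict[str, float], table: Dict[str, str], top_n: int = 1) -> List[str]:
    """Sort metadata by descending tfidf, keep the first `top_n`, and map them through the table."""
    ranked = sorted(tfidf, key=lambda term: tfidf[term], reverse=True)
    return [table[term] for term in ranked[:top_n] if term in table]


# Hypothetical tfidf values for the four terms in the example.
tfidf_values = {"sports": 0.35, "music": 0.20, "finance": 0.05, "IT": 0.10}
largest_only = user_features(tfidf_values, METADATA_TO_USER_FEATURE, top_n=1)
first_two = user_features(tfidf_values, METADATA_TO_USER_FEATURE, top_n=2)
print(largest_only)
print(first_two)
```

User features are kept as plain strings here, which is one of the representations the text allows.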
  • the determining unit 302 is configured to query the correspondence table, pre-configured by the BT server, between the user feature data and the delivery policy data, and to determine the delivery policy data corresponding to the user feature data;
  • the correspondence table is pre-configured by the BT server.
  • for example, a user's user feature is "ordinary white-collar/influx/gourmet"
  • the corresponding delivery strategy is "consumer electronic/meal
  • escaped characters, integer numbers, and other data formats used by computers may serve as the representation; here it is represented by strings.
  • the writing unit 303 is configured to write the user feature data and the corresponding delivery policy data into the cookie of the terminal, so that when the content providing server receives the access request sent by the terminal the next time the terminal accesses a webpage, it delivers the personalized content to the terminal according to the delivery policy data in the cookie of the terminal.
  • the webpage accessed by the terminal next time may be the webpage accessed in the previous step, or may be another webpage.
  • the cookie written by the writing unit 303 is as shown in Table 1.
  • After the cookie is written, when the terminal again requests a webpage provided by the content providing server, the content providing server directly reads the cookie of the terminal, obtains the current user feature information and the specified delivery policy data, and executes the personalized delivery policy.
  • a personalized content jump link is generated directly for the current user, or a personalized content jump link is requested from the third-party content server.
  • the content providing server delivers the personalized content to the user, and completes the accurate delivery.
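The cookie round trip can be sketched with Python's standard `http.cookies` module. Table 1 is not reproduced in this text, so the cookie names `user_feature` and `delivery_policy`, the sample values, and the percent-encoding of values are assumptions made for illustration, not the layout used in the patent.

```python
from http.cookies import SimpleCookie
from typing import Dict, List
from urllib.parse import quote, unquote

FEATURE_COOKIE = "user_feature"     # hypothetical cookie name
POLICY_COOKIE = "delivery_policy"   # hypothetical cookie name


def build_set_cookie_values(user_feature: str, delivery_policy: str) -> List[str]:
    """BT server side: build one Set-Cookie header value per cookie."""
    cookie = SimpleCookie()
    # Percent-encode so that spaces and slashes are safe inside a cookie value.
    cookie[FEATURE_COOKIE] = quote(user_feature, safe="")
    cookie[POLICY_COOKIE] = quote(delivery_policy, safe="")
    for morsel in cookie.values():
        morsel["path"] = "/"
    return [morsel.OutputString() for morsel in cookie.values()]


def read_delivery_data(cookie_header: str) -> Dict[str, str]:
    """Content providing server side: parse the Cookie request header sent by the terminal."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return {
        name: unquote(morsel.value)
        for name, morsel in cookie.items()
        if name in (FEATURE_COOKIE, POLICY_COOKIE)
    }


set_cookie_values = build_set_cookie_values("sports fan", "sporting goods")
# A browser sends back only the name=value pairs, without attributes such as Path.
request_cookie_header = "; ".join(value.split(";")[0] for value in set_cookie_values)
delivery_data = read_delivery_data(request_cookie_header)
print(delivery_data)
```

Because the policy travels in the request itself, the content providing server in this sketch needs no extra call to the BT server before choosing what to deliver.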
  • the obtaining unit performs webpage type analysis on the webpage accessed by the terminal and obtains the user feature data and the corresponding delivery policy according to the user information and the webpage type; the writing unit writes them into the cookie of the terminal, so that the next time content is delivered, the content providing server directly performs personalized delivery to the terminal by reading the cookie of the terminal. This reduces bandwidth consumption and improves server interaction efficiency.
  • Computer readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one location to another.
  • a storage medium may be any available media that can be accessed by a computer.
  • by way of example and not limitation, computer readable media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store program code in the form of instructions or data structures.
  • in addition, any connection may properly be termed a computer readable medium.
  • for example, if the software is transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of the medium.
  • disk and disc, as used here, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer readable media.

Abstract

The invention relates to a method for delivering personalized content, in which a behavioral targeting (BT) server: receives user information sent by terminals, analyzes the web pages accessed by the terminals, and obtains the types of those web pages (S101); obtains user feature data according to the web page types and the user information (S102); queries the correspondence table, which it has pre-configured, between user feature data and delivery policy data, and determines the delivery policy data corresponding to the user feature data (S103); and writes the user feature data and the corresponding delivery policy data into the cookies of the terminals so that, upon receiving an access request from the terminals, the content providing server can deliver personalized content to the terminals according to the delivery policy data in the cookies of those terminals (S104). Accordingly, the present invention also relates to a personalized delivery device that reduces bandwidth usage and makes server interaction more efficient.
PCT/CN2013/076308 2012-08-10 2013-05-28 Procédé et dispositif permettant de lancer un contenu individuel WO2014023121A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210284928.8A CN103577504A (zh) 2012-08-10 2012-08-10 一种投放个性化内容的方法和装置
CN201210284928.8 2012-08-10

Publications (1)

Publication Number Publication Date
WO2014023121A1 true WO2014023121A1 (fr) 2014-02-13

Family

ID=50049300

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/076308 WO2014023121A1 (fr) 2012-08-10 2013-05-28 Procédé et dispositif permettant de lancer un contenu individuel

Country Status (2)

Country Link
CN (1) CN103577504A (fr)
WO (1) WO2014023121A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104836781A (zh) * 2014-02-20 2015-08-12 腾讯科技(北京)有限公司 区分访问用户身份的方法及装置

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105488038B (zh) * 2014-09-15 2021-03-05 创新先进技术有限公司 通信应用的个性化信息匹配方法及装置
CN108322355A (zh) * 2017-01-18 2018-07-24 北京京东尚科信息技术有限公司 用户流量数据处理方法、处理装置、电子设备和存储介质
CN108459890B (zh) * 2017-02-20 2021-10-26 百度在线网络技术(北京)有限公司 用于应用的界面显示方法和装置
CN108573750B (zh) * 2017-03-07 2021-01-15 京东方科技集团股份有限公司 用于自动发现医学知识的方法和系统
CN109933389B (zh) * 2017-12-19 2022-08-23 阿里巴巴集团控股有限公司 数据对象信息处理、页面展示方法及装置
CN111274516B (zh) * 2018-12-04 2024-04-05 阿里巴巴新加坡控股有限公司 页面展示方法、页面配置方法和装置
CN111861564B (zh) * 2020-07-20 2021-07-13 深圳我买家网络科技有限公司 数字广告交易系统
CN113726900A (zh) * 2021-09-02 2021-11-30 四川启睿克科技有限公司 一种判断用户儿童年龄段的系统
CN117473200B (zh) * 2023-12-26 2024-03-08 天津戎行集团有限公司 一种用于网站信息数据的综合采集分析方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030014304A1 (en) * 2001-07-10 2003-01-16 Avenue A, Inc. Method of analyzing internet advertising effects
CN101034997A (zh) * 2006-03-09 2007-09-12 新数通兴业科技(北京)有限公司 一种数据信息精确发布的方法和系统
CN101079063A (zh) * 2007-06-25 2007-11-28 腾讯科技(深圳)有限公司 一种基于场景信息推送广告的方法、系统及设备
CN101431524A (zh) * 2007-11-07 2009-05-13 阿里巴巴集团控股有限公司 一种定向网络广告投放的实现方法及装置
CN102301658A (zh) * 2009-09-11 2011-12-28 华为技术有限公司 广告投放方法、广告服务器和广告系统

Also Published As

Publication number Publication date
CN103577504A (zh) 2014-02-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13828658

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13828658

Country of ref document: EP

Kind code of ref document: A1