CN107783987A - Data processing method and device - Google Patents


Info

Publication number
CN107783987A
Authority
CN
China
Prior art keywords
client
target signature
access
label
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610725502.XA
Other languages
Chinese (zh)
Other versions
CN107783987B (en
Inventor
张洋平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201610725502.XA priority Critical patent/CN107783987B/en
Publication of CN107783987A publication Critical patent/CN107783987A/en
Application granted granted Critical
Publication of CN107783987B publication Critical patent/CN107783987B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

An embodiment of the present invention discloses a data processing method and device. The method includes: receiving a current access behavior data packet sent by a client, the packet including access time information, web page feature identification information of a target web page, and a target feature tag corresponding to that web page feature identification information; storing the current access behavior data packet in an access behavior database that corresponds jointly to the target feature tag and the client; and calculating, from the current access behavior data packet and the historical access behavior data packets in the access behavior database, the client's interest degree in the target feature tag within a preset time period. Embodiments of the invention can improve the accuracy of the user portrait, make it easier to recommend appropriate information, and improve resource utilization.

Description

Data processing method and device
Technical field
The present invention relates to the field of Internet technology, and in particular to a data processing method and device.
Background
With the rapid development of Internet technology and the growing popularity of user terminals (such as computers and mobile phones), users can obtain the information they need over the Internet anytime and anywhere. Users generally obtain this information through a browser, and while they do so the browser may also recommend information to them.
A user portrait, also known as a user persona (Persona), is an effective tool for delineating target users and connecting user needs with design direction. A user portrait is a tagged user model abstracted from information such as the user's social attributes, living habits, and consumption behavior. The core task in building a user portrait is to attach "tags" to the user. At present, user portraits are built mainly from personal information submitted by the user, such as hobbies, network usage habits, the purpose of using information, fields and websites of interest, and keywords expressing information needs, and information is then recommended on the basis of the portrait. This approach depends on the personal information the user submits: if that information is not detailed enough, the resulting user portrait is not accurate enough, and the information recommended to the user is not information the user is interested in. Existing information recommendation therefore suffers from inaccurate user portraits and inappropriate recommendations.
Summary of the invention
Embodiments of the present invention provide a data processing method and device that can improve the accuracy of the user portrait, make it easier to recommend appropriate information, and at the same time improve resource utilization.
A first aspect of the embodiments of the present invention provides a data processing method, including:
receiving a current access behavior data packet sent by a client, the current access behavior data packet including access time information, web page feature identification information of a target web page, and a target feature tag corresponding to the web page feature identification information of the target web page;
storing the current access behavior data packet in an access behavior database that corresponds jointly to the target feature tag and the client;
calculating, according to the current access behavior data packet and historical access behavior data packets in the access behavior database, the interest degree of the client in the target feature tag within a preset time period.
A second aspect of the embodiments of the present invention provides a data processing device, including:
a receiving unit, configured to receive a current access behavior data packet sent by a client, the current access behavior data packet including access time information, web page feature identification information of a target web page, and a target feature tag corresponding to the web page feature identification information of the target web page;
a storage unit, configured to store the current access behavior data packet in an access behavior database that corresponds jointly to the target feature tag and the client;
a computing unit, configured to calculate, according to the current access behavior data packet and historical access behavior data packets in the access behavior database, the interest degree of the client in the target feature tag within a preset time period.
In the embodiments of the present invention, the current access behavior data packet sent by the client is received and stored in the access behavior database that corresponds jointly to the target feature tag and the client, and the client's interest degree in the target feature tag within a preset time period is calculated from the current and historical access behavior data packets in that database. Users are thus tagged according to their access behavior, and the resulting user portrait matches reality more closely. This improves the accuracy of the user portrait, makes it easier to recommend appropriate information, and improves resource utilization.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a network architecture to which an embodiment of the present invention is applied;
Fig. 2 is a schematic flowchart of a data processing method provided by an embodiment of the present invention;
Fig. 3 is an organization chart of a feature tag library provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of a client interface provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of interest degrees provided by an embodiment of the present invention;
Fig. 6 is a sequence diagram of a data processing method provided by an embodiment of the present invention;
Fig. 7 is a schematic flowchart of another data processing method provided by an embodiment of the present invention;
Fig. 8 is a schematic flowchart of yet another data processing method provided by an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a client provided by an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of another client provided by an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of a data processing device provided by an embodiment of the present invention;
Fig. 12 is a schematic structural diagram of another data processing device provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
Refer to Fig. 1, a schematic diagram of a network architecture to which an embodiment of the present invention is applied. The network architecture includes a portable terminal 11, a fixed terminal 12, a server 13, and a user. Note that Fig. 1 is only an example; in practice the network architecture is not limited to the elements shown in Fig. 1. The portable terminal 11 and the fixed terminal 12 establish a communication connection with the server 13 over a wired or wireless network, and the user can operate both terminals. A browser can run on the portable terminal 11 and the fixed terminal 12, and the user connects to the Internet through the browser to obtain the required information. The server 13 is the server used in the embodiments of the present invention; it serves the browsers on the portable terminal 11 and the fixed terminal 12, and stores, analyzes, and processes the data or information those browsers generate.
The data processing method provided by the embodiments of the present invention can be applied to scenarios in which a user portrait is built. For example: when an access instruction for a target web page is received, the client looks up, in a preset feature tag library, the target feature tag corresponding to the web page feature identification information of the target web page, and records access time information for the target web page; the feature tag library contains the correspondence between web page feature identification information and feature tags. When a close instruction for the target web page is received, the client updates the access time information, encapsulates the updated access time information, the web page feature identification information of the target web page, and the target feature tag into a current access behavior data packet, and sends the packet to the server. The server stores the packet in the access behavior database that corresponds jointly to the target feature tag and the client, and calculates, from the current and historical access behavior data packets in that database, the client's interest degree in the target feature tag within a preset time period. The user portrait is thus built from the user's access behavior, which improves its accuracy, makes it easier for the server to recommend appropriate information to the client in a targeted way, and makes full use of the access behavior, improving resource utilization.
The client in the embodiments of the present invention is the browser on the portable terminal 11 or the fixed terminal 12, and may also be a browser on a mobile terminal (such as a mobile phone, tablet computer, or wearable device). The operating system of the terminal on which the client runs may include, but is not limited to, Unix, Linux, Mac OS (a computer operating system developed by Apple Inc.), Windows, Android, Symbian, and iOS (a mobile operating system developed by Apple Inc.). The data processing device in the embodiments of the present invention may be a server that serves the client; it can store, analyze, and process data or information generated by the client, and can recommend information or products to the client.
Note that a precondition for implementing the embodiments of the present invention is that the client has an experience identifier. The experience identifier is generated automatically when the user ticks "Join the User Experience Improvement Program" while installing the client. If the user does not tick this option, the client has no experience identifier and the embodiments of the present invention do not apply to it.
The data processing method provided by the embodiments of the present invention is described in detail below with reference to Fig. 2 to Fig. 8.
Refer to Fig. 2, a schematic flowchart of a data processing method provided by an embodiment of the present invention. This embodiment describes the specific flow of the data processing method from the client side and the server side together. The method may include:
101, when an access instruction for a target web page is received, the client looks up, in a preset feature tag library, the target feature tag corresponding to the web page feature identification information of the target web page, and records access time information for the target web page; the feature tag library includes the correspondence between web page feature identification information and feature tags;
Specifically, when a start instruction for the client is received, that is, when a start operation entered by the user for the client is received, the client starts and loads the preset feature tag library. The feature tag library is stored on the terminal to which the client belongs; when the client starts, it obtains the feature tag library from that terminal. The feature tag library includes the correspondence between web page feature identification information and feature tags, where one item of web page feature identification information corresponds to one or more feature tags. In other words, one or more feature tags are set in advance for each item of web page feature identification information; feature tags are used to label different web page feature identification information, and one feature tag can correspond to multiple items of web page feature identification information. A feature tag can be understood as an interest tag of the user. The web page feature identification information identifies a web page and includes the web page title and the uniform resource locator (Uniform Resource Locator, URL). A URL is a concise representation of the location of a resource available on the Internet and of the method for accessing it; it is the address of a standard resource on the Internet. Each file on the Internet has a unique URL, which indicates where the file is located and how the browser should handle it. A basic URL includes a scheme (or protocol), a server name (or IP address), a path, and a file name, for example "protocol://authority/path". The protocol may be the Hypertext Transfer Protocol (HTTP), the File Transfer Protocol (FTP), and so on. In the embodiments of the present invention, the URL can be the address of a website.
Refer to Fig. 3, an organization chart of a feature tag library provided by an embodiment of the present invention. The figure takes the sports feature tag as an example: keywords of web page titles corresponding to the sports feature tag may include the Olympic Games, fitness experience, and so on, and URLs corresponding to the sports feature tag may include addresses such as sport.sina.com and tv.sp.com. The sports feature tag can be further subdivided, for example into basketball and football feature tags. In one possible implementation, the feature tag library can be an associative container Map. A Map provides one-to-one data handling: the first element of each pair is called the key, and each key can appear only once in the Map; the second element is called the value of that key. For example, the title keywords corresponding to the sports feature tag, such as the Olympic Games and fitness experience, can all map to one numerical value, such as 3001, which represents the sports feature tag. That is, the Map stores the correspondence between keywords and numerical values, and each numerical value corresponds to one feature tag.
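The Map-based feature tag library described above can be sketched in Python as two dictionaries. This is a minimal illustration, not the patent's implementation: the names `TITLE_MAP`, `URL_MAP`, `tags_for_title`, and `tag_for_url` are hypothetical, substring matching is an assumption, and only the code 3001 (sports) and the two host names come from the text.

```python
from typing import Optional, Set

# 3001 is the sports tag code used as an example in the text.
SPORTS = 3001

# Title keyword -> tag code, and URL host -> tag code (illustrative entries).
TITLE_MAP = {"Olympic Games": SPORTS, "fitness": SPORTS}
URL_MAP = {"sport.sina.com": SPORTS, "tv.sp.com": SPORTS}


def tags_for_title(title: str) -> Set[int]:
    """Return every tag code whose keyword occurs in the page title."""
    return {code for keyword, code in TITLE_MAP.items() if keyword in title}


def tag_for_url(url: str) -> Optional[int]:
    """Return the tag code of the first known host found in the URL, if any."""
    for host, code in URL_MAP.items():
        if host in url:
            return code
    return None
```

A lookup for a page is then two calls, one per map, for example `tags_for_title("Olympic Games opening")` and `tag_for_url("http://sport.sina.com/nba")`.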
After the client starts, it can receive the user's access instruction for a web page. The access instruction can be a click on a website link in a URL navigation page, or a click on the go button after a URL has been entered in the address input box. Refer to Fig. 4, a schematic diagram of a client interface, here the interface of a browser. Note that this interface is only an example and does not limit the embodiments of the present invention. The interface shown in Fig. 4 is the interface of the target web page corresponding to the target URL http://www.xxx.com/xxx/. If the interface shown in Fig. 4 includes multiple web pages, each of them can be regarded as a target web page; for ease of understanding, the embodiments of the present invention are described using a single target web page.
When the access instruction for the target web page is received, the client looks up, in the preset feature tag library, the target feature tag corresponding to the web page feature identification information of the target web page. The web page feature identification information of the target web page includes the title of the target web page and the URL of the target web page. In one possible implementation, the feature tag library includes a first feature tag library and a second feature tag library; the first feature tag library includes the correspondence between web page titles and feature tags, and the second feature tag library includes the correspondence between URLs and feature tags. In another possible implementation, the associative container Map described above includes a title Map and a URL Map. The title Map contains the correspondence between title keywords and numerical values, and the URL Map contains the correspondence between URLs and numerical values; in both, each numerical value corresponds to one feature tag.
While looking up the target feature tag, the client records access time information for the target web page. The access time information includes an access start time, an access end time, and an access duration. The client obtains the first time, at which the access instruction was received, and records it in the access time information as the access start time. At this point only the access start time has been recorded; the access end time and the access duration default to blank or zero. In one possible implementation, when the client receives the access instruction, it adds the web page feature identification information of the target web page to a linked list and writes the access start time into the entry for the target web page in the linked list. The linked list is simply a tool the client uses to record times.
102, when a close instruction for the target web page is received, the client updates the access time information, encapsulates the updated access time information, the web page feature identification information of the target web page, and the target feature tag into a current access behavior data packet, and sends the current access behavior data packet to the server;
Specifically, the client can receive a close instruction entered by the user for the target web page. The close instruction can be an instruction to close the window containing the target web page, that is, to close that one window among multiple web pages. The close instruction can also be an instruction to close the client, that is, to close the windows of all opened web pages by clicking the "×" mark in the interface shown in Fig. 4.
When the close instruction for the target web page is received, the client updates the access time information. The client obtains the second time, at which the close instruction was received, and updates the access end time in the access time information from its default value to the second time. The client then calculates the time difference between the second time and the first time and updates the access duration in the access time information from its default value to that time difference. In one possible implementation, when the client receives the close instruction, it looks up the entry for the target web page in the linked list and writes the access end time and the access duration into that entry.
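The time bookkeeping for steps 101 and 102 can be sketched as follows. This is an illustrative sketch under stated assumptions: a dictionary keyed by URL stands in for the linked list mentioned in the text, times are epoch seconds, and the names `open_pages`, `on_access`, and `on_close` are hypothetical.

```python
import time

# One entry per open page, keyed by URL (a dict stands in for the linked list).
open_pages = {}


def on_access(url, now=None):
    """Record the access start time; end time and duration stay empty."""
    start = time.time() if now is None else now
    open_pages[url] = {"start": start, "end": None, "duration": None}


def on_close(url, now=None):
    """Fill in the end time and the duration, then return the finished entry."""
    entry = open_pages.pop(url)
    entry["end"] = time.time() if now is None else now
    entry["duration"] = entry["end"] - entry["start"]
    return entry
```

The optional `now` argument exists only so that the sketch can be exercised with fixed timestamps; a real client would read the clock when the instruction arrives.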
The client encapsulates the updated access time information, the web page feature identification information of the target web page, and the target feature tag into the current access behavior data packet and sends the packet to the server. The server classifies and stores the packet and calculates the client's interest degree in the target feature tag within the preset time period. Access behavior can be understood as browsing behavior: users browse web pages with a browser, and different users visit different categories of URLs, click different web page titles, and stay on different pages for different lengths of time. These differences make up a user's browsing behavior. An access behavior data packet is a data packet that records the user's browsing behavior, and the current access behavior data packet records the user's current browsing behavior.
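One way to picture the current access behavior data packet is as a small record that is serialised before being sent. The sketch below is an assumption throughout: the text does not specify field names or a wire format, so the `AccessBehaviorPacket` class, its fields, and the JSON encoding are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class AccessBehaviorPacket:
    title: str       # web page title
    url: str         # web page URL
    tags: list       # target feature tag code(s)
    start: float     # access start time (epoch seconds)
    end: float       # access end time (epoch seconds)
    duration: float  # access duration in seconds


def encode(packet: AccessBehaviorPacket) -> bytes:
    """Serialise the packet as UTF-8 JSON (the wire format is an assumption)."""
    return json.dumps(asdict(packet), ensure_ascii=False).encode("utf-8")
```

Keeping the title, the URL, the tag, and the three time values in one record mirrors the three parts the text says are encapsulated together.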
103, the server stores the current access behavior data packet in the access behavior database that corresponds jointly to the target feature tag and the client;
Specifically, the server receives the current access behavior data packet sent by the client. On receiving the packet, the server obtains the identification information of the client and stores the packet bound to that identification information. The identification information of the client can be a globally unique identifier (Globally Unique Identifier, GUID). A GUID is a unique identifier generated by an algorithm, usually written as a string of 32 hexadecimal digits (0-9, A-F), for example {21EC2020-3AEA-1069-A2DD-08002B30309D}; it is essentially a 128-bit binary integer. Ideally, no computer or computer cluster ever generates two identical GUIDs, so each computer has a unique GUID. Identifying the client by GUID is therefore unique and makes it easy for the server to process different clients separately. The server stores the current access behavior data packet in the access behavior database that corresponds jointly to the target feature tag and the client. The access behavior database includes the access behavior data packets of multiple clients for the target feature tag.
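The binding of packets to a client GUID and a feature tag can be sketched with an in-memory store. This is only an illustration: the text does not describe the database schema, so keying by the pair (tag, GUID) is an assumption, and `access_db`, `new_client_guid`, and `store` are hypothetical names. The GUID is produced with Python's standard `uuid` module.

```python
import uuid
from collections import defaultdict

# (feature tag, client GUID) -> list of packets; an in-memory stand-in
# for the access behavior database.
access_db = defaultdict(list)


def new_client_guid() -> str:
    """Generate a random 128-bit identifier in the usual GUID text form."""
    return str(uuid.uuid4()).upper()


def store(guid: str, packet: dict) -> None:
    """File the packet under every target feature tag it carries."""
    for tag in packet["tags"]:
        access_db[(tag, guid)].append(packet)
```

With this layout, all packets for one tag across clients can still be collected by scanning the keys whose first element is that tag.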
104, the server calculates, according to the current access behavior data packet and the historical access behavior data packets in the access behavior database, the client's interest degree in the target feature tag within a preset time period;
Specifically, the server counts, from the current and historical access behavior data packets in the access behavior database, the number of times the client accessed the target feature tag within the preset time period. It then calculates the client's interest degree in the target feature tag within the preset time period from that access count and from a preset interest parameter corresponding to the target feature tag. The historical access behavior data packets are packets that the client sent to the server, using the method described in this embodiment, before the current execution; their content has the same structure as that of the current access behavior data packet. The preset interest parameter can be determined by the server. The preset time period is an interval measured back from the current system time and is determined by the server, for example the 10 days before the current time; the specific value is not limited here.
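A possible shape for the interest degree calculation is shown below. The text says only that the interest degree is computed from the access count and a preset interest parameter; multiplying the two is an assumption made for this sketch, as are the function name `interest_degree`, the use of the access end time to place a packet in the window, and epoch-second timestamps.

```python
def interest_degree(packets, now, period_seconds, interest_param):
    """Count accesses that ended inside the period and scale the count.

    The product of count and parameter is an assumed formula; the source
    does not state how the two values are combined.
    """
    window_start = now - period_seconds
    visits = sum(1 for p in packets if window_start <= p["end"] <= now)
    return visits * interest_param
```

For a 10-day period, `period_seconds` would be `10 * 86400`; packets older than that are ignored, which matches the idea of a window measured back from the current time.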
After the server has calculated the client's interest degree in the target feature tag within the preset time period, it can send the client recommendation information associated with the target feature tag according to the value of the interest degree, so that recommendation information is sent to the client in a targeted way.
Refer to Fig. 5, a schematic diagram of interest degrees provided by an embodiment of the present invention. The diagram shows the interest degrees of two users in feature tags; user A is the user of client A, and user B is the user of client B. The server obtains the diagram shown in Fig. 5 by calculating client A's interest degree in different feature tags and client B's interest degree in different feature tags. Note that Fig. 5 is only an example and does not limit the embodiments of the present invention. If the server calculates the client's interest degree in each of multiple feature tags, it can send recommendation information to the client in the order of those interest degrees. Taking user A in Fig. 5 as an example, the server sends recommendation information to client A in the order basketball, electronics fan, military fan. Optionally, more recommendation information is associated with basketball than with electronics fan, and more with electronics fan than with military fan.
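Ordering recommendations by interest degree, and optionally giving higher-ranked tags more items, might look like the sketch below. The numeric interest values are not given in the text, so any scores passed in are hypothetical; the proportional split in `allocate` is one assumed way to realise "more recommendation information for higher interest".

```python
def recommendation_order(interest_by_tag):
    """Return tags sorted by interest degree, highest first."""
    return sorted(interest_by_tag, key=interest_by_tag.get, reverse=True)


def allocate(interest_by_tag, total_items):
    """Split a budget of items across tags in proportion to interest."""
    total = sum(interest_by_tag.values())
    if total == 0:
        # No recorded interest: nothing to allocate.
        return {tag: 0 for tag in interest_by_tag}
    return {tag: int(total_items * value / total)
            for tag, value in interest_by_tag.items()}
```

Because `allocate` truncates to whole items, the shares may sum to slightly less than `total_items`; a real system would decide how to hand out the remainder.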
In the embodiments of the present invention, the client encapsulates into a current access behavior data packet the target feature tag found in the preset feature tag library for the web page feature identification information of the target web page, the web page feature identification information of the target web page, and the access time information for the target web page, and sends the packet to the server. The server stores the packet in the access behavior database that corresponds jointly to the target feature tag and the client and calculates the client's interest degree in the target feature tag within the preset time period. Users are thus tagged according to their access behavior, and the resulting user portrait matches reality more closely, which improves the accuracy of the user portrait, makes it easier to recommend appropriate information, and improves resource utilization.
Refer to Fig. 6, a sequence diagram of a data processing method provided by an embodiment of the present invention. This embodiment describes the specific flow of the data processing method from the client side and the server side together. The method may include:
301, when an access instruction for a target web page is received, the client looks up the target feature tag corresponding to the web page feature identification information of the target web page;
Specifically, when the access instruction for the target web page is received, the client looks up, in the preset feature tag library, the target feature tag corresponding to the web page feature identification information of the target web page.
In one possible implementation, the feature tag library includes a first feature tag library and a second feature tag library. The first feature tag library contains the correspondence between web page titles and feature tags, and the second feature tag library contains the correspondence between URLs and feature tags. The client looks up, in the first feature tag library, the first feature tags corresponding to the title of the target web page. If the number of first feature tags equals a preset threshold, the client takes the first feature tag as the target feature tag corresponding to the web page feature identification information of the target web page. If the number of first feature tags exceeds the preset threshold, the client looks up, in the second feature tag library, the second feature tag corresponding to the URL of the target web page; the number of second feature tags equals the preset threshold. If the second feature tag is identical to one of the first feature tags, the client takes that identical feature tag as the target feature tag; if it is identical to none of them, the client takes the first feature tags as the target feature tags. Here the preset threshold is one. For example, suppose a page is titled "Newcomer's first post! A discussion of personal basketball purchasing experience". Two first feature tags, basketball fan and shopaholic, are found from the title. Since that is more than one, the client looks up the second feature tag by the URL of the site. If the second feature tag is basketball fan, the client takes basketball fan as the target feature tag; if it is shopaholic, the client takes shopaholic as the target feature tag; if it is neither, the client takes both first feature tags, basketball fan and shopaholic, as target feature tags.
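The two-stage lookup described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the library contents, the example URL, and the substring-based keyword matching are all assumptions made for the sketch.

```python
from typing import Dict, List

# Hypothetical tag libraries; keywords, URLs, and tags are illustrative only.
TITLE_LIBRARY: Dict[str, str] = {
    "basketball": "basketball fan",
    "buying": "shopaholic",
}
URL_LIBRARY: Dict[str, str] = {
    "bbs.example.com/basketball": "basketball fan",
}

THRESHOLD = 1  # the preset threshold given in the text


def resolve_target_tags(title: str, url: str) -> List[str]:
    """Return the target feature tag(s) for a page title and URL."""
    lowered = title.lower()
    # First feature tags: every tag whose keyword occurs in the title
    # (substring matching is an assumption; duplicates are dropped).
    first = list(dict.fromkeys(
        tag for keyword, tag in TITLE_LIBRARY.items() if keyword in lowered
    ))
    if len(first) <= THRESHOLD:
        # Exactly one tag (or none found): no disambiguation needed.
        return first
    # More tags than the threshold: consult the URL library.
    second = URL_LIBRARY.get(url)
    if second in first:
        return [second]
    # The URL tag matches none of the title tags: keep all title tags.
    return first
```

Calling `resolve_target_tags` with a title that mentions both keywords and a URL registered under "basketball fan" returns only that tag; with an unregistered URL it returns both title tags.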
In another possible implementation, the feature tag library may be an associative container (Map) comprising a title Map and a URL Map. The title Map holds the correspondence between title keywords and numeric values, and the URL Map holds the correspondence between URLs and numeric values; in both, each numeric value corresponds to one kind of feature tag. The client first looks up the first feature tags in the title Map. If there is more than one first feature tag, the client looks up the second feature tag in the URL Map. If the second feature tag is identical to one of the first feature tags, the client takes that identical feature tag as the target feature tag; otherwise, the client takes the first feature tags as the target feature tags. Note that URLs and feature tags in the URL Map are in one-to-one correspondence: each URL corresponds to exactly one feature tag.
For example, when the client finds the target feature tag, it generates a matching format from the target feature tag and the web page feature identification information of the target web page. The matching format is <[url|title]:protocol:key>, where url|title is the web page feature identification information, i.e. the keyword to be matched; protocol is the major feature tag category to which the keyword belongs; and key is a subcategory within that major category. protocol and key jointly determine the feature tag: key is present if a subcategory exists and absent otherwise. For example, the matching format <soprt.sina.com:3001:100> indicates a lookup by URL, and <Bryant retires:3001:110> indicates a lookup by title, where 3001 may denote the sports feature tag category, 100 and 110 are subcategories of the sports category, and 110 may be the basketball feature tag.
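A matching-format record of this shape could be built and parsed roughly as below. The `MatchRecord` type and both helper functions are hypothetical names introduced for illustration, and the parser assumes the keyword itself contains no colon.

```python
from typing import NamedTuple, Optional


class MatchRecord(NamedTuple):
    keyword: str          # the url or title keyword to be matched
    protocol: int         # major feature tag category
    key: Optional[int]    # subcategory, or None when there is none


def format_record(rec: MatchRecord) -> str:
    """Render a record as <keyword:protocol[:key]>."""
    parts = [rec.keyword, str(rec.protocol)]
    if rec.key is not None:
        parts.append(str(rec.key))
    return "<" + ":".join(parts) + ">"


def parse_record(text: str) -> MatchRecord:
    """Parse <keyword:protocol[:key]>; assumes no colon inside the keyword."""
    fields = text.strip().lstrip("<").rstrip(">").split(":")
    if len(fields) not in (2, 3):
        raise ValueError(f"malformed record: {text!r}")
    key = int(fields[2]) if len(fields) == 3 else None
    return MatchRecord(fields[0], int(fields[1]), key)
```

Keeping `key` optional mirrors the rule that key is present only when a subcategory exists.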
302, the client records access time information for the target web page;
Specifically, the client records the access time information for the target web page while it looks up the target feature tag. The access time information includes an access start time, an access end time, and an access duration. The client obtains the first time, at which the access instruction was received, and records it in the access time information as the access start time. At this point only the access start time has been recorded; the access end time and the access duration default to blank or zero. In one possible implementation, on receiving the access instruction the client adds the web page feature identification information of the target web page to a linked list and writes the access start time into the list entry corresponding to the target web page. The linked list thus serves as the client's tool for recording times.
303, when a close instruction for the target web page is received, the client updates the access time information;
Specifically, when a close instruction for the target web page is received, the client updates the access time information. The client obtains the second time, at which the close instruction was received, and updates the access end time in the access time information from its default value to the second time. The client then calculates the difference between the second time and the first time and updates the access duration in the access time information from its default value to that difference. In one possible implementation, on receiving the close instruction the client finds the entry corresponding to the target web page in the linked list and writes the access end time and the access duration into that entry.
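The start/close bookkeeping can be sketched as follows. The `AccessTimer` class is a hypothetical helper, and a dictionary keyed by page identifier stands in for the linked list mentioned in the text.

```python
import time
from typing import Dict, Optional


class AccessTimer:
    """Tracks start, end, and duration per page (a dict replaces the linked list)."""

    def __init__(self) -> None:
        self.entries: Dict[str, Dict[str, Optional[float]]] = {}

    def on_access(self, page_id: str, now: Optional[float] = None) -> None:
        # Record only the start time; end stays blank and duration stays zero.
        start = time.time() if now is None else now
        self.entries[page_id] = {"begin": start, "end": None, "span": 0.0}

    def on_close(self, page_id: str, now: Optional[float] = None) -> Dict[str, Optional[float]]:
        # Fill in the end time and the duration (end minus start).
        end = time.time() if now is None else now
        entry = self.entries[page_id]
        entry["end"] = end
        entry["span"] = end - entry["begin"]
        return entry
```

The optional `now` argument exists only so that the timestamps can be supplied explicitly, for instance when replaying recorded events.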
304, the client encapsulates the updated access time information, the web page feature identification information of the target web page, and the target feature tag into a current access behavior data packet;
Specifically, the client encapsulates the updated access time information, the web page feature identification information of the target web page, and the target feature tag into a current access behavior data packet. Access behavior can be understood as browsing behavior: users browse web pages with a browser, and different users visit different categories of URLs, click on different page titles, and stay on pages for different lengths of time. These differences together make up a user's browsing behavior. An access behavior data packet is a data packet that records a user's browsing behavior, and the current access behavior data packet records the user's current browsing behavior.
305, the client sends the current access behavior data packet to the server;
306, the server stores the current access behavior data packet in the access behavior database jointly corresponding to the target feature tag and the client;
Specifically, the server receives the current access behavior data packet sent by the client. On receiving it, the server obtains the identification information of the client, binds the current access behavior data to that identification information, and stores the current access behavior data packet in the access behavior database jointly corresponding to the target feature tag and the client. It can be understood that the access behavior database includes the access behavior data packets of multiple clients for the target feature tag.
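One way to picture a store keyed jointly by feature tag and client is the in-memory sketch below. The `AccessBehaviorStore` class and the `"tag"` packet field are assumptions made for illustration; the text does not specify the database layout.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class AccessBehaviorStore:
    """In-memory stand-in for the access behavior database (an assumption)."""

    def __init__(self) -> None:
        # (target feature tag, client id) -> packets from that client for that tag
        self._db: Dict[Tuple[str, str], List[dict]] = defaultdict(list)

    def store(self, client_id: str, packet: dict) -> None:
        # Bind the packet to the client id and file it under its tag.
        self._db[(packet["tag"], client_id)].append(packet)

    def packets(self, tag: str, client_id: str) -> List[dict]:
        return list(self._db.get((tag, client_id), []))

    def clients(self, tag: str) -> List[str]:
        # All clients that have sent packets for this tag.
        return sorted({cid for (t, cid) in self._db if t == tag})
```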
307, the server counts the number of times the client accessed the target feature tag within a preset time period;
Specifically, the server counts the number of times the client accessed the target feature tag within the preset time period, based on the current access behavior data packet and the historical access behavior data packets in the access behavior database. A historical access behavior data packet is one that the client sent to the server, following the method of this embodiment, before the current execution of the method; it contains the same kinds of content as the current access behavior data packet. The preset time period denotes an interval measured back from the current system time and is determined by the server, for example the 10 days before the current time; the specific value is not limited here. Taking a preset time period of 10 days before the current system time as an example, the server uses the access time information in the current and historical access behavior data packets in the access behavior database to count the client's accesses to the target feature tag for each day, or for every two days, within those 10 days.
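The per-day counting over a look-back window can be sketched as below. The function name, the use of epoch-second start times, and the convention that index 0 is the most recent day are assumptions of this sketch.

```python
from typing import Iterable, List

SECONDS_PER_DAY = 86400


def daily_access_counts(begin_times: Iterable[float], now: float, days: int = 10) -> List[int]:
    """counts[i] is the number of accesses that began i whole days before `now`."""
    counts = [0] * days
    for begin in begin_times:
        age = int((now - begin) // SECONDS_PER_DAY)
        # Ignore accesses outside the window (too old, or timestamped after `now`).
        if 0 <= age < days:
            counts[age] += 1
    return counts
```

Counting per two days would only require dividing by `2 * SECONDS_PER_DAY` instead.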
308, the server calculates the client's degree of interest in the target feature tag within the preset time period;
Specifically, the server calculates the client's degree of interest in the target feature tag within the preset time period from the number of times the client accessed the target feature tag in that period and from preset interest parameters corresponding to the target feature tag. The preset interest parameters can be determined by the server.
For example, the server calculates the client's degree of interest in the target feature tag within the preset time period according to a preset calculation formula, which may be:
$$W(n) = \sum_{i} k \cdot \ln(pv_i + 1) \cdot \exp(-\lambda i)$$
Here n denotes the preset time period, which may be n days or n hours; this embodiment uses n days as the example. W(n) denotes the client's degree of interest in the target feature tag over the n days, i.e. over the preset time period. pv_i denotes the number of times the client accessed the target feature tag per day, per two days, or per several days within the preset time period, and i ranges from 0 to n. k is a preset variation coefficient whose size can be set according to the feature tag. λ is a preset time decay coefficient, obtained from the half-life formula t = ln2/λ; the half-life t differs between feature tags. For example, a feature tag of a long-term nature has a longer half-life, typically set to 60 days, which gives λ = 0.012; a feature tag of a short-term nature has a shorter half-life, typically set to 10 days. The preset variation coefficient and the preset time decay coefficient can be regarded as the preset interest parameters corresponding to the target feature tag. Under this formula, a long-term feature tag yields a much larger interest value than a short-lived burst. Take λ = 0.012. Over 10 days, user A clicks on feature tag a 10 times per day, i.e. client A accesses feature tag a 10 times per day, and the calculated interest degree is 21.82. User B does not click on interest a during the first 9 days and clicks 100 times on the tenth day, i.e. client B accesses feature tag a 100 times on the tenth day, and the calculated interest value is 4.08. Client A's interest in feature tag a is thus much greater than client B's, which matches reality: a sudden burst of clicks on an interest does not reflect the user's true attitude toward it, whereas clicks sustained over a long time span do indicate genuine interest. The formula therefore effectively suppresses the noise that bursty click volume introduces into the data.
In one possible implementation, if at least two clients have the same degree of interest in the target feature tag, the server calculates each client's average access duration for the target feature tag and assigns each client an interest level according to that average access duration and a preset level table. The preset level table is used to determine the interest level corresponding to an average access duration; it contains the average access duration range corresponding to each of at least one interest level, i.e. the correspondence between each level and its average access duration range. Dividing the at least two clients into different interest levels by average access duration lets the server tell apart clients with the same degree of interest and thus further distinguish between users.
For example, the server calculates each client's average access duration for the target feature tag according to the following formula:
$$wt = \frac{\sum_{i} t_i}{\sum pv}$$
Here wt denotes a client's average access duration for the target feature tag; t_i denotes the access duration of the client's i-th access to the target web page; and ∑pv denotes the client's total number of accesses to the target feature tag.
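A sketch of the average-duration calculation and the level lookup follows. The level table values (thresholds in seconds and level numbers) are invented for illustration; the text only says that such a table maps duration ranges to levels.

```python
from typing import List, Sequence, Tuple

# Hypothetical level table: (lower bound of average duration in seconds, level),
# sorted by ascending lower bound.
LEVEL_TABLE: List[Tuple[float, int]] = [(0.0, 1), (30.0, 2), (120.0, 3)]


def average_access_duration(durations: Sequence[float], total_accesses: int) -> float:
    """wt = (sum of t_i) / (sum of pv); returns 0.0 when there are no accesses."""
    if total_accesses <= 0:
        return 0.0
    return sum(durations) / total_accesses


def interest_level(avg_duration: float, table: Sequence[Tuple[float, int]] = LEVEL_TABLE) -> int:
    """Pick the level whose duration range contains avg_duration."""
    level = table[0][1]
    for lower_bound, candidate in table:
        if avg_duration >= lower_bound:
            level = candidate
    return level
```

Two clients with equal interest degrees but average durations of, say, 20 and 150 seconds would land in different levels under this table.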
309, the server sends recommendation information associated with the target feature tag to the client;
Specifically, in one possible implementation, the server sends recommendation information associated with the target feature tag to the client according to the client's degree of interest in the target feature tag within the preset time period.
In another possible implementation, the server sends recommendation information associated with the target feature tag to each client according to that client's interest level and its degree of interest in the target feature tag within the preset time period. For example, if client A and client B have the same interest value for the target feature tag but client A has a higher interest level than client B, the server sends more recommendation information to client A than to client B, or sends client A recommendation information of higher relevance than that sent to client B, so that the recommendation information sent to client A better fits the user's needs.
In this embodiment of the present invention, the client encapsulates the target feature tag found for the web page feature identification information of the target web page, the web page feature identification information itself, and the access time information for the target web page into a current access behavior data packet and sends the packet to the server. The server stores the current access behavior data packet in the access behavior database jointly corresponding to the target feature tag and the client, and calculates the client's degree of interest in the target feature tag within the preset time period. Users are thereby tagged according to their access behavior and a user profile is built that reflects actual behavior, which improves the accuracy of the user profile, makes it easier to recommend suitable information, and improves resource utilization.
Referring to Fig. 7, which is a schematic flowchart of another data processing method provided by an embodiment of the present invention, the method may include:
401, when an access instruction for a target web page is received, the client looks up, in a preset feature tag library, the target feature tag corresponding to the web page feature identification information of the target web page, and records access time information for the target web page;
Specifically, when an access instruction for a target web page is received, the client looks up, in the preset feature tag library, the target feature tag corresponding to the web page feature identification information of the target web page.
In one possible implementation, the feature tag library includes a first feature tag library and a second feature tag library. The first feature tag library contains the correspondence between web page titles and feature tags, and the second feature tag library contains the correspondence between URLs and feature tags. The client looks up, in the first feature tag library, the first feature tags corresponding to the title of the target web page. If the number of first feature tags equals a preset threshold, the client takes the first feature tag as the target feature tag corresponding to the web page feature identification information of the target web page. If the number of first feature tags exceeds the preset threshold, the client looks up, in the second feature tag library, the second feature tag corresponding to the URL of the target web page; the number of second feature tags equals the preset threshold. If the second feature tag is identical to one of the first feature tags, the client takes that identical feature tag as the target feature tag; if it is identical to none of them, the client takes the first feature tags as the target feature tags. Here the preset threshold is one. For example, suppose a page is titled "Newcomer's first post! A discussion of personal basketball purchasing experience". Two first feature tags, basketball fan and shopaholic, are found from the title. Since that is more than one, the client looks up the second feature tag by the URL of the site. If the second feature tag is basketball fan, the client takes basketball fan as the target feature tag; if it is shopaholic, the client takes shopaholic as the target feature tag; if it is neither, the client takes both first feature tags, basketball fan and shopaholic, as target feature tags.
In another possible implementation, the feature tag library may be an associative container (Map) comprising a title Map and a URL Map. The title Map holds the correspondence between title keywords and numeric values, and the URL Map holds the correspondence between URLs and numeric values; in both, each numeric value corresponds to one kind of feature tag. The client first looks up the first feature tags in the title Map. If there is more than one first feature tag, the client looks up the second feature tag in the URL Map. If the second feature tag is identical to one of the first feature tags, the client takes that identical feature tag as the target feature tag; otherwise, the client takes the first feature tags as the target feature tags. Note that URLs and feature tags in the URL Map are in one-to-one correspondence: each URL corresponds to exactly one feature tag.
For example, when the client finds the target feature tag, it generates a matching format from the target feature tag and the web page feature identification information of the target web page. The matching format is <[url|title]:protocol:key>, where url|title is the web page feature identification information, i.e. the keyword to be matched; protocol is the major feature tag category to which the keyword belongs; and key is a subcategory within that major category. protocol and key jointly determine the feature tag: key is present if a subcategory exists and absent otherwise. For example, the matching format <soprt.sina.com:3001:100> indicates a lookup by URL, and <Bryant retires:3001:110> indicates a lookup by title, where 3001 may denote the sports feature tag category, 100 and 110 are subcategories of the sports category, and 110 may be the basketball feature tag.
The client records the access time information for the target web page while it looks up the target feature tag. The access time information includes an access start time, an access end time, and an access duration. The client obtains the first time, at which the access instruction was received, and records it in the access time information as the access start time. At this point only the access start time has been recorded; the access end time and the access duration default to blank or zero. In one possible implementation, on receiving the access instruction the client adds the web page feature identification information of the target web page to a linked list and writes the access start time into the list entry corresponding to the target web page. The linked list thus serves as the client's tool for recording times.
402, when a close instruction for the target web page is received, the client updates the access time information and encapsulates the updated access time information, the web page feature identification information of the target web page, and the target feature tag into a current access behavior data packet;
Specifically, when a close instruction for the target web page is received, the client updates the access time information. The client obtains the second time, at which the close instruction was received, and updates the access end time in the access time information from its default value to the second time. The client then calculates the difference between the second time and the first time and updates the access duration in the access time information from its default value to that difference. In one possible implementation, on receiving the close instruction the client finds the entry corresponding to the target web page in the linked list and writes the access end time and the access duration into that entry.
The client encapsulates the updated access time information, the web page feature identification information of the target web page, and the target feature tag into a current access behavior data packet. Access behavior can be understood as browsing behavior: users browse web pages with a browser, and different users visit different categories of URLs, click on different page titles, and stay on pages for different lengths of time. These differences together make up a user's browsing behavior. An access behavior data packet is a data packet that records a user's browsing behavior, and the current access behavior data packet records the user's current browsing behavior.
For example, the client stores the current access behavior data in a visit_info structure. The current access behavior data can be represented as {title, url, beginTime, endTime, protocol, key, TimeSpan}, where title and url denote the web page feature identification information of the target web page, beginTime denotes the access start time, endTime denotes the access end time, protocol and key denote the target feature tag, and TimeSpan denotes the access duration.
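The visit_info structure could be modelled as below. The field names follow the text; the `VisitInfo` class name, the field types, and the choice of JSON as the packet encoding are assumptions, since the text does not specify a wire format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class VisitInfo:
    title: str             # web page title
    url: str               # web page URL
    beginTime: float       # access start time (epoch seconds, assumed)
    endTime: float         # access end time
    protocol: int          # major feature tag category
    key: Optional[int]     # subcategory, if any
    TimeSpan: float        # access duration


def encapsulate(info: VisitInfo) -> str:
    """Serialize the record as a JSON packet (the encoding is an assumption)."""
    return json.dumps(asdict(info), ensure_ascii=False)


packet = encapsulate(
    VisitInfo("Bryant retires", "soprt.sina.com", 100.0, 130.0, 3001, 110, 30.0)
)
```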
403, the client sends the current access behavior data packet to the server;
Specifically, the client sends the current access behavior data packet to the server, so that the server stores the packet in the access behavior database and calculates the client's degree of interest in the target feature tag within a preset time period.
In this embodiment of the present invention, the client encapsulates the target feature tag found in the preset feature tag library for the web page feature identification information of the target web page, the web page feature identification information itself, and the access time information for the target web page into a current access behavior data packet and sends the packet to the server, so that the server calculates, from the access behavior data packets, the client's degree of interest in the target feature tag within the preset time period. Users are thereby tagged according to their access behavior and a user profile is built that reflects actual behavior, which improves the accuracy of the user profile, makes it easier to recommend suitable information, and improves resource utilization.
Referring to Fig. 8, which is a schematic flowchart of another data processing method provided by an embodiment of the present invention, the method may include:
501, receiving a current access behavior data packet sent by a client, the current access behavior data packet including access time information, the web page feature identification information of a target web page, and the target feature tag corresponding to that web page feature identification information;
Specifically, the data processing device receives the current access behavior data packet sent by the client. The packet includes access time information, the web page feature identification information of the target web page, and the target feature tag. The target feature tag is the feature tag that the client found in a preset feature tag library for the web page feature identification information of the target web page; the feature tag library contains the correspondence between web page feature identification information and feature tags. The access time information is the information as updated by the client and includes an access start time, an access end time, and an access duration.
502, storing the current access behavior data packet in the access behavior database jointly corresponding to the target feature tag and the client;
Specifically, on receiving the current access behavior data packet, the data processing device obtains the identification information of the client and binds the current access behavior data to that identification information. The data processing device then stores the current access behavior data packet in the access behavior database jointly corresponding to the target feature tag and the client. It can be understood that the access behavior database includes the access behavior data packets of multiple clients for the target feature tag.
503, calculating the client's degree of interest in the target feature tag within a preset time period according to the current access behavior data packet and the historical access behavior data packets in the access behavior database;
Specifically, the data processing device counts the number of times the client accessed the target feature tag within the preset time period, based on the current access behavior data packet and the historical access behavior data packets in the access behavior database. A historical access behavior data packet is one that the client sent to the data processing device, following the method of this embodiment, before the current execution of the method; it contains the same kinds of content as the current access behavior data packet, and the two differ only in the specific values of the access time information. The preset time period denotes an interval measured back from the current system time and is determined by the data processing device, for example the 10 days before the current time; the specific value is not limited here. Taking a preset time period of 10 days before the current system time as an example, the data processing device uses the access time information in the current and historical access behavior data packets in the access behavior database to count the client's accesses to the target feature tag for each day, or for every two days, within those 10 days.
The data processing device calculates the client's degree of interest in the target feature tag within the preset time period from the number of times the client accessed the target feature tag in that period and from preset interest parameters corresponding to the target feature tag. The preset interest parameters can be determined by the data processing device.
For example, the data processing device calculates the client's degree of interest in the target feature tag within the preset time period according to a preset calculation formula, which may be:
$$W(n) = \sum_{i} k \cdot \ln(pv_i + 1) \cdot \exp(-\lambda i)$$
Here n denotes the preset time period, which may be n days or n hours; this embodiment uses n days as the example. W(n) denotes the client's degree of interest in the target feature tag over the n days, i.e. over the preset time period. pv_i denotes the number of times the client accessed the target feature tag per day, per two days, or per several days within the preset time period, and i ranges from 0 to n. k is a preset variation coefficient whose size can be set according to the feature tag. λ is a preset time decay coefficient, obtained from the half-life formula t = ln2/λ; the half-life t differs between feature tags. For example, a feature tag of a long-term nature has a longer half-life, typically set to 60 days, which gives λ = 0.012; a feature tag of a short-term nature has a shorter half-life, typically set to 10 days. The preset variation coefficient and the preset time decay coefficient can be regarded as the preset interest parameters corresponding to the target feature tag. Under this formula, a long-term feature tag yields a much larger interest value than a short-lived burst. Take λ = 0.012. Over 10 days, user A clicks on feature tag a 10 times per day, i.e. client A accesses feature tag a 10 times per day, and the calculated interest degree is 21.82. User B does not click on interest a during the first 9 days and clicks 100 times on the tenth day, i.e. client B accesses feature tag a 100 times on the tenth day, and the calculated interest value is 4.08. Client A's interest in feature tag a is thus much greater than client B's, which matches reality: a sudden burst of clicks on an interest does not reflect the user's true attitude toward it, whereas clicks sustained over a long time span do indicate genuine interest. The formula therefore effectively suppresses the noise that bursty click volume introduces into the data.
In a kind of mode in the cards, if each client is to the target signature label at least two clients Interest-degree it is identical, then the data processing equipment calculates average access of each client to the target signature label Duration, and according to each client to the average access duration and predetermined level table of the target signature label to described every Individual client divides levels of interest, and the predetermined level table is used to determine levels of interest corresponding to average access duration, including extremely Average access duration scope corresponding to each levels of interest in a few levels of interest, i.e., including each grade with it is corresponding Corresponding relation between average access duration scope.At least two client is divided not according to the average access duration Same levels of interest, facilitate the data processing equipment to distinguish the different clients of same interest-degree, further discriminate between user Difference.
For example, the data processing device calculates each client's average access duration for the target feature tag according to the following formula:
$$wt = \frac{\sum_i t_i}{\sum pv}$$
where wt denotes a client's average access duration for the target feature tag, t_i denotes the duration of the client's i-th access to the target web page, and ∑pv denotes the client's total number of accesses to the target feature tag.
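A minimal sketch of the average access duration and the level lookup, assuming one recorded duration per access (so that ∑pv equals the number of durations) and durations in seconds. The level boundaries in `LEVEL_TABLE` are invented purely for illustration; the text does not specify any concrete ranges.

```python
def average_access_duration(durations_seconds) -> float:
    """wt = (sum of t_i) / (total number of accesses); 0.0 when there are no accesses."""
    visits = len(durations_seconds)
    if visits == 0:
        return 0.0
    return sum(durations_seconds) / visits


# Hypothetical preset level table: (lower bound inclusive, upper bound exclusive, level).
LEVEL_TABLE = [
    (0.0, 30.0, 1),
    (30.0, 120.0, 2),
    (120.0, float("inf"), 3),
]


def interest_level(average_duration: float, table=LEVEL_TABLE) -> int:
    """Return the level whose average access duration range contains the value."""
    for low, high, level in table:
        if low <= average_duration < high:
            return level
    raise ValueError("average duration is outside every range in the level table")


if __name__ == "__main__":
    wt = average_access_duration([30.0, 90.0])
    print(wt, interest_level(wt))
```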
In this embodiment of the present invention, the current access behavior data packet sent by the client is received and stored in the access behavior database jointly corresponding to the target feature tag and the client, and the client's interest degree in the target feature tag within the preset time period is calculated. Users are thereby tagged according to their access behavior and a user portrait is built that fits reality, which improves the accuracy of the user portrait, makes it easier to recommend suitable information, and improves resource utilization.
Referring to Fig. 9, which is a schematic structural diagram of a client provided by an embodiment of the present invention, the client 60 may include a search unit 601, a recording unit 602, an updating unit 603, an encapsulation unit 604 and a sending unit 605.
The search unit 601 is configured to, when an access instruction for a target web page is received, search a preset feature tag library for the target feature tag corresponding to the web page feature identification information of the target web page, where the feature tag library includes the correspondence between web page feature identification information and feature tags;
In a specific implementation, the web page feature identification information of the target web page includes the web page title of the target web page and the uniform resource locator (URL) of the target web page. The feature tag library includes a first feature tag library and a second feature tag library; the first feature tag library includes the correspondence between web page titles and feature tags, and the second feature tag library includes the correspondence between URLs and feature tags.
The search unit 601 includes a tag search unit 6011 and a target determination unit 6012.
The tag search unit 6011 is configured to search the first feature tag library for the first feature tag corresponding to the web page title of the target web page;
The target determination unit 6012 is configured to, if the number of first feature tags equals a preset threshold, determine the first feature tag as the target feature tag corresponding to the web page feature identification information of the target web page;
The tag search unit 6011 is further configured to search the second feature tag library for the second feature tag corresponding to the URL of the target web page, where the number of second feature tags is the same as the preset threshold;
The target determination unit 6012 is further configured to, if the second feature tag and the first feature tag have a feature tag in common, determine that common feature tag as the target feature tag corresponding to the web page feature identification information of the target web page;
The target determination unit 6012 is further configured to, if the second feature tag and the first feature tag have no feature tag in common, determine the first feature tag as the target feature tag corresponding to the web page feature identification information of the target web page.
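The two-library lookup performed by the search unit can be sketched as below. The dictionaries, the function name and the default threshold of 1 are assumptions for illustration only. The text covers the cases where the number of first feature tags equals or exceeds the preset threshold; returning the first feature tags when there are fewer than the threshold is a choice made in this sketch, not something the text states.

```python
def find_target_tags(title, url, title_library, url_library, threshold=1):
    """Pick target feature tags from a title-keyed and a URL-keyed tag library."""
    first_tags = title_library.get(title, [])
    # Equal to the threshold: the title-based tags are the target tags.
    # (Fewer than the threshold is handled the same way here, as an assumption.)
    if len(first_tags) <= threshold:
        return first_tags
    # More than the threshold: consult the URL-based library, keeping at most
    # `threshold` second tags, then prefer tags shared by both lookups.
    second_tags = url_library.get(url, [])[:threshold]
    shared = [tag for tag in second_tags if tag in first_tags]
    return shared if shared else first_tags


# Hypothetical libraries used only to exercise the function.
TITLE_LIBRARY = {
    "NBA finals recap": ["sports"],
    "Apple event": ["technology", "food"],
}
URL_LIBRARY = {
    "https://example.com/tech/apple": ["technology"],
    "https://example.com/misc": ["travel"],
}

if __name__ == "__main__":
    print(find_target_tags("Apple event", "https://example.com/tech/apple",
                           TITLE_LIBRARY, URL_LIBRARY))
```

In the "Apple event" case the title alone is ambiguous, and the URL narrows the result to the tag both libraries agree on.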
The recording unit 602 is configured to record access time information for the target web page;
In a specific implementation, the access time information includes an access start time, an access end time and an access duration. The recording unit 602 is specifically configured to obtain the first time at which the access instruction is received and to record the first time in the access time information as the access start time.
The updating unit 603 is configured to update the access time information when a close instruction for the target web page is received;
In a specific implementation, the updating unit 603 is specifically configured to obtain the second time at which the close instruction is received and update the access end time in the access time information to the second time, and to calculate the time difference between the second time and the first time and update the access duration in the access time information to that time difference.
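The recording and updating steps can be illustrated with a small data class. The class and function names are hypothetical, and times are assumed to be plain numbers such as epoch seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessTimeInfo:
    """Access start time, access end time and access duration for one page visit."""
    start: float
    end: Optional[float] = None
    duration: Optional[float] = None


def record_access(first_time: float) -> AccessTimeInfo:
    """Record the time the access instruction was received as the access start time."""
    return AccessTimeInfo(start=first_time)


def update_on_close(info: AccessTimeInfo, second_time: float) -> AccessTimeInfo:
    """Set the access end time to the close time and the duration to the difference."""
    info.end = second_time
    info.duration = second_time - info.start
    return info


if __name__ == "__main__":
    visit = record_access(100.0)
    print(update_on_close(visit, 160.5))
```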
The encapsulation unit 604 is configured to encapsulate the updated access time information, the web page feature identification information of the target web page and the target feature tag into a current access behavior data packet;
The sending unit 605 is configured to send the current access behavior data packet to a server, so that the server stores the current access behavior data packet in an access behavior database and calculates the client's interest degree in the target feature tag within a preset time period.
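One way to picture the encapsulation step is as building a JSON document from the three pieces of information named above. The field names, the `client_id` field and the choice of JSON are assumptions of this sketch; the text does not describe a wire format or how the server identifies the client.

```python
import json


def build_packet(client_id, title, url, tags, start, end):
    """Bundle access time info, page identification info and target tags."""
    return {
        "client_id": client_id,  # hypothetical: lets the server key its storage by client
        "page": {"title": title, "url": url},
        "tags": list(tags),
        "access_time": {"start": start, "end": end, "duration": end - start},
    }


def serialize(packet) -> bytes:
    """Encode the packet as UTF-8 JSON, ready to be sent to the server."""
    return json.dumps(packet, ensure_ascii=False).encode("utf-8")


if __name__ == "__main__":
    packet = build_packet("client-1", "NBA finals recap",
                          "https://example.com/nba", ["sports"], 100.0, 160.5)
    print(serialize(packet))
```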
Referring further to Fig. 10, which is a schematic structural diagram of another client proposed by an embodiment of the present invention. As shown in Fig. 10, the client includes a processor 701 and an interface circuit 702; the figure also shows a memory 703 and a bus 704. The processor 701, the interface circuit 702 and the memory 703 are connected through the bus 704 and communicate with one another over it.
The processor 701 is configured to perform the following operation steps:
when an access instruction for a target web page is received, searching a preset feature tag library for the target feature tag corresponding to the web page feature identification information of the target web page, and recording access time information for the target web page, where the feature tag library includes the correspondence between web page feature identification information and feature tags;
when a close instruction for the target web page is received, updating the access time information, and encapsulating the updated access time information, the web page feature identification information of the target web page and the target feature tag into a current access behavior data packet;
sending the current access behavior data packet to a server, so that the server stores the current access behavior data packet in an access behavior database and calculates the client's interest degree in the target feature tag within a preset time period.
The web page feature identification information of the target web page includes the web page title of the target web page and the uniform resource locator (URL) of the target web page. The feature tag library includes a first feature tag library and a second feature tag library; the first feature tag library includes the correspondence between web page titles and feature tags, and the second feature tag library includes the correspondence between URLs and feature tags.
The processor 701 is specifically configured to perform the following operation steps:
searching the first feature tag library for the first feature tag corresponding to the web page title of the target web page;
if the number of first feature tags equals a preset threshold, determining the first feature tag as the target feature tag corresponding to the web page feature identification information of the target web page;
if the number of first feature tags is greater than the preset threshold, searching the second feature tag library for the target feature tag corresponding to the web page feature identification information of the target web page.
The processor 701 is specifically configured to perform the following operation steps:
searching the second feature tag library for the second feature tag corresponding to the URL of the target web page, where the number of second feature tags is the same as the preset threshold;
if the second feature tag and the first feature tag have a feature tag in common, determining that common feature tag as the target feature tag corresponding to the web page feature identification information of the target web page;
if the second feature tag and the first feature tag have no feature tag in common, determining the first feature tag as the target feature tag corresponding to the web page feature identification information of the target web page.
The access time information includes an access start time, an access end time and an access duration;
The processor 701 is specifically configured to perform the following operation step:
obtaining the first time at which the access instruction is received, and recording the first time in the access time information as the access start time.
The processor 701 is specifically configured to perform the following operation steps:
obtaining the second time at which the close instruction is received, and updating the access end time in the access time information to the second time;
calculating the time difference between the second time and the first time, and updating the access duration in the access time information to that time difference.
In this embodiment of the present invention, the client encapsulates the target feature tag found in the preset feature tag library for the web page feature identification information of the target web page, the web page feature identification information of the target web page, and the access time information for the target web page into a current access behavior data packet, and sends the current access behavior data packet to the server. The server can then calculate, from the access behavior data packets, the client's interest degree in the target feature tag within the preset time period. Users are thereby tagged according to their access behavior and a user portrait is built that fits reality, which improves the accuracy of the user portrait, makes it easier to recommend suitable information, and improves resource utilization.
Referring to Fig. 11, which is a schematic structural diagram of a data processing device provided by an embodiment of the present invention, the data processing device 80 may include a receiving unit 801, a storage unit 802 and a computing unit 803.
The receiving unit 801 is configured to receive a current access behavior data packet sent by a client, where the current access behavior data packet includes access time information, web page feature identification information of a target web page, and the target feature tag corresponding to the web page feature identification information of the target web page;
The storage unit 802 is configured to store the current access behavior data packet in the access behavior database jointly corresponding to the target feature tag and the client;
The computing unit 803 is configured to calculate, according to the current access behavior data packet and historical access behavior data packets in the access behavior database, the client's interest degree in the target feature tag within a preset time period;
The target feature tag is the feature tag that the client finds in a preset feature tag library for the web page feature identification information of the target web page, and the feature tag library includes the correspondence between web page feature identification information and feature tags. The access time information is the access time information as updated by the client and includes an access start time, an access end time and an access duration.
In a specific implementation, the computing unit 803 is specifically configured to count, according to the current access behavior data packet and the historical access behavior data packets in the access behavior database, the number of times the client accessed the target feature tag within the preset time period, and to calculate the client's interest degree in the target feature tag within the preset time period according to that number of accesses and the preset interest parameters corresponding to the target feature tag.
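The storage and counting steps might look like the following in-memory sketch. The class name, the `start` field and the use of one-day intervals measured in epoch seconds are all assumptions; a real access behavior database would be persistent, and the text does not describe its schema.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86400


class AccessBehaviorStore:
    """Keeps access behavior data packets keyed by (client, feature tag)."""

    def __init__(self):
        self._packets = defaultdict(list)

    def save(self, client_id, tag, packet):
        """Store a packet under the client and target feature tag it belongs to."""
        self._packets[(client_id, tag)].append(packet)

    def daily_counts(self, client_id, tag, period_start, days):
        """Count accesses per day within the preset period; index 0 is the first day."""
        counts = [0] * days
        for packet in self._packets.get((client_id, tag), []):
            day = int((packet["start"] - period_start) // SECONDS_PER_DAY)
            if 0 <= day < days:
                counts[day] += 1
        return counts


if __name__ == "__main__":
    store = AccessBehaviorStore()
    store.save("client-1", "sports", {"start": 0.0})
    store.save("client-1", "sports", {"start": 90000.0})
    print(store.daily_counts("client-1", "sports", 0.0, 3))
```

The resulting list of per-day counts is the pv_i sequence that the interest degree formula takes as input.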
The computing unit 803 calculates the client's interest degree in the target feature tag within the preset time period with a preset calculation formula, according to the number of times the client accessed the target feature tag within the preset time period and the preset interest parameters corresponding to the target feature tag;
The preset calculation formula is:
$$w(n) = \sum_{i=0}^{n} k \cdot \ln(pv_i + 1) \cdot \exp(-\lambda i)$$
where n is the preset time period and w(n) is the client's interest degree in the target feature tag within the preset time period; k is the preset variation coefficient and λ is the preset time decay coefficient, the preset interest parameters including the preset variation coefficient and the preset time decay coefficient; i ∈ 0~n, and pv_i is the number of times the client accessed the target feature tag in the i-th interval of the preset time period.
The data processing device 80 further includes a sending unit and a dividing unit, which are not shown in Fig. 11.
The sending unit is configured to send recommendation information associated with the target feature tag to the client according to the client's interest degree in the target feature tag within the preset time period.
The computing unit 803 is further configured to, if at least two clients all have the same interest degree in the target feature tag within the preset time period, calculate each client's average access duration for the target feature tag;
The dividing unit is configured to assign each client an interest level according to that client's average access duration for the target feature tag and a preset level table, where the preset level table includes the average access duration range corresponding to each of at least one level;
The sending unit is further configured to send recommendation information associated with the target feature tag to each client according to that client's interest level and that client's interest degree in the target feature tag within the preset time period.
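How the interest degree and the interest level are combined when sending recommendations is not specified. As one hedged illustration, the sketch below orders clients by interest degree and uses the interest level only to break ties between clients whose interest degrees are identical; the function name and this ordering rule are assumptions.

```python
def rank_clients(interest_by_client, level_by_client):
    """Order clients by interest degree, breaking ties by interest level (higher first)."""
    return sorted(
        interest_by_client,
        key=lambda client: (interest_by_client[client], level_by_client.get(client, 0)),
        reverse=True,
    )


if __name__ == "__main__":
    interest = {"a": 5.0, "b": 5.0, "c": 7.0}
    levels = {"a": 1, "b": 3}  # levels are only needed for the tied clients
    print(rank_clients(interest, levels))
```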
Referring further to Fig. 12, which is a schematic structural diagram of another data processing device proposed by an embodiment of the present invention. As shown in Fig. 12, the data processing device includes a processor 901 and an interface circuit 902; the figure also shows a memory 903 and a bus 904. The processor 901, the interface circuit 902 and the memory 903 are connected through the bus 904 and communicate with one another over it.
The processor 901 is configured to perform the following operation steps:
receiving a current access behavior data packet sent by a client, where the current access behavior data packet includes access time information, web page feature identification information of a target web page, and the target feature tag corresponding to the web page feature identification information of the target web page;
storing the current access behavior data packet in the access behavior database jointly corresponding to the target feature tag and the client;
calculating, according to the current access behavior data packet and historical access behavior data packets in the access behavior database, the client's interest degree in the target feature tag within a preset time period.
The processor 901 is specifically configured to perform the following operation steps:
counting, according to the current access behavior data packet and the historical access behavior data packets in the access behavior database, the number of times the client accessed the target feature tag within the preset time period;
calculating the client's interest degree in the target feature tag within the preset time period according to the number of times the client accessed the target feature tag within the preset time period and the preset interest parameters corresponding to the target feature tag.
The processor 901 is specifically configured to perform the following operation step:
calculating the client's interest degree in the target feature tag within the preset time period with a preset calculation formula, according to the number of times the client accessed the target feature tag within the preset time period and the preset interest parameters corresponding to the target feature tag;
The preset calculation formula is:
$$w(n) = \sum_{i=0}^{n} k \cdot \ln(pv_i + 1) \cdot \exp(-\lambda i)$$
where n is the preset time period and w(n) is the client's interest degree in the target feature tag within the preset time period; k is the preset variation coefficient and λ is the preset time decay coefficient, the preset interest parameters including the preset variation coefficient and the preset time decay coefficient; i ∈ 0~n, and pv_i is the number of times the client accessed the target feature tag in the i-th interval of the preset time period.
The processor 901 is further configured to perform the following operation step:
sending recommendation information associated with the target feature tag to the client according to the client's interest degree in the target feature tag within the preset time period.
The processor 901 is further configured to perform the following operation steps:
if at least two clients all have the same interest degree in the target feature tag within the preset time period, calculating each client's average access duration for the target feature tag;
assigning each client an interest level according to that client's average access duration for the target feature tag and a preset level table, where the preset level table includes the average access duration range corresponding to each of at least one level;
sending recommendation information associated with the target feature tag to each client according to that client's interest level and that client's interest degree in the target feature tag within the preset time period.
In this embodiment of the present invention, the current access behavior data packet sent by the client is received and stored in the access behavior database jointly corresponding to the target feature tag and the client, and the client's interest degree in the target feature tag within the preset time period is calculated. Users are thereby tagged according to their access behavior and a user portrait is built that fits reality, which improves the accuracy of the user portrait, makes it easier to recommend suitable information, and improves resource utilization.
An embodiment of the present invention further provides a data processing system that includes at least one client and a data processing device.
A person of ordinary skill in the art will understand that all or part of the flows in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the flows of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
The above discloses only preferred embodiments of the present invention and certainly cannot be used to limit the scope of the claims of the present invention; equivalent changes made according to the claims of the present invention therefore still fall within the scope covered by the present invention.

Claims (10)

  1. A data processing method, characterized by comprising:
    receiving a current access behavior data packet sent by a client, the current access behavior data packet including access time information, web page feature identification information of a target web page, and a target feature tag corresponding to the web page feature identification information of the target web page;
    storing the current access behavior data packet in an access behavior database jointly corresponding to the target feature tag and the client;
    calculating, according to the current access behavior data packet and historical access behavior data packets in the access behavior database, the interest degree of the client in the target feature tag within a preset time period.
  2. The method according to claim 1, characterized in that calculating, according to the current access behavior data packet and the historical access behavior data packets in the access behavior database, the interest degree of the client in the target feature tag within the preset time period comprises:
    counting, according to the current access behavior data packet and the historical access behavior data packets in the access behavior database, the number of times the client accessed the target feature tag within the preset time period;
    calculating the interest degree of the client in the target feature tag within the preset time period according to the number of times the client accessed the target feature tag within the preset time period and preset interest parameters corresponding to the target feature tag.
  3. The method according to claim 2, characterized in that calculating the interest degree of the client in the target feature tag within the preset time period according to the number of times the client accessed the target feature tag within the preset time period and the preset interest parameters corresponding to the target feature tag comprises:
    calculating the interest degree of the client in the target feature tag within the preset time period with a preset calculation formula, according to the number of times the client accessed the target feature tag within the preset time period and the preset interest parameters corresponding to the target feature tag;
    wherein the preset calculation formula is:
    $$w(n) = \sum_{i=0}^{n} k \cdot \ln(pv_i + 1) \cdot \exp(-\lambda i)$$
    where n is the preset time period and w(n) is the interest degree of the client in the target feature tag within the preset time period; k is the preset variation coefficient and λ is the preset time decay coefficient, the preset interest parameters including the preset variation coefficient and the preset time decay coefficient; i ∈ 0~n, and pv_i is the number of times the client accessed the target feature tag in the i-th interval of the preset time period.
  4. The method according to claim 1, characterized by further comprising:
    sending recommendation information associated with the target feature tag to the client according to the interest degree of the client in the target feature tag within the preset time period.
  5. The method according to claim 1, characterized by further comprising:
    if at least two clients all have the same interest degree in the target feature tag within the preset time period, calculating the average access duration of each client for the target feature tag;
    assigning each client an interest level according to the average access duration of that client for the target feature tag and a preset level table, the preset level table including the average access duration range corresponding to each of at least one level;
    sending recommendation information associated with the target feature tag to each client according to the interest level of that client and the interest degree of that client in the target feature tag within the preset time period.
  6. A data processing device, characterized by comprising:
    a receiving unit, configured to receive a current access behavior data packet sent by a client, the current access behavior data packet including access time information, web page feature identification information of a target web page, and a target feature tag corresponding to the web page feature identification information of the target web page;
    a storage unit, configured to store the current access behavior data packet in an access behavior database jointly corresponding to the target feature tag and the client;
    a computing unit, configured to calculate, according to the current access behavior data packet and historical access behavior data packets in the access behavior database, the interest degree of the client in the target feature tag within a preset time period.
  7. The device according to claim 6, characterized in that the computing unit is specifically configured to count, according to the current access behavior data packet and the historical access behavior data packets in the access behavior database, the number of times the client accessed the target feature tag within the preset time period, and to calculate the interest degree of the client in the target feature tag within the preset time period according to the number of times the client accessed the target feature tag within the preset time period and preset interest parameters corresponding to the target feature tag.
  8. The device according to claim 7, characterized in that the computing unit is specifically configured to calculate the interest degree of the client in the target feature tag within the preset time period with a preset calculation formula, according to the number of times the client accessed the target feature tag within the preset time period and the preset interest parameters corresponding to the target feature tag;
    wherein the preset calculation formula is:
    $$w(n) = \sum_{i=0}^{n} k \cdot \ln(pv_i + 1) \cdot \exp(-\lambda i)$$
    where n is the preset time period and w(n) is the interest degree of the client in the target feature tag within the preset time period; k is the preset variation coefficient and λ is the preset time decay coefficient, the preset interest parameters including the preset variation coefficient and the preset time decay coefficient; i ∈ 0~n, and pv_i is the number of times the client accessed the target feature tag in the i-th interval of the preset time period.
  9. The device according to claim 6, characterized by further comprising:
    a sending unit, configured to send recommendation information associated with the target feature tag to the client according to the interest degree of the client in the target feature tag within the preset time period.
  10. The device according to claim 6, characterized in that
    the computing unit is further configured to, if at least two clients all have the same interest degree in the target feature tag within the preset time period, calculate the average access duration of each client for the target feature tag;
    the device further comprises:
    a dividing unit, configured to assign each client an interest level according to the average access duration of that client for the target feature tag and a preset level table, the preset level table including the average access duration range corresponding to each of at least one level;
    the sending unit is further configured to send recommendation information associated with the target feature tag to each client according to the interest level of that client and the interest degree of that client in the target feature tag within the preset time period.
CN201610725502.XA 2016-08-25 2016-08-25 Data processing method and device Active CN107783987B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610725502.XA CN107783987B (en) 2016-08-25 2016-08-25 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610725502.XA CN107783987B (en) 2016-08-25 2016-08-25 Data processing method and device

Publications (2)

Publication Number Publication Date
CN107783987A true CN107783987A (en) 2018-03-09
CN107783987B CN107783987B (en) 2022-03-04

Family

ID=61438680

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610725502.XA Active CN107783987B (en) 2016-08-25 2016-08-25 Data processing method and device

Country Status (1)

Country Link
CN (1) CN107783987B (en)

Also Published As

Publication number Publication date
CN107783987B (en) 2022-03-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant