CN109561162A - Method and device for mining user access preferences - Google Patents

Info

Publication number
CN109561162A
Authority
CN
China
Prior art keywords
user
domain name
target
classification label
access
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710884958.5A
Other languages
Chinese (zh)
Inventor
严波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd filed Critical Beijing Gridsum Technology Co Ltd
Priority to CN201710884958.5A priority Critical patent/CN109561162A/en
Publication of CN109561162A publication Critical patent/CN109561162A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and device for mining user access preferences, relating to the field of network technology. Its main purpose is to reduce the time consumed in mining user access preferences and to improve mining efficiency. The method comprises: obtaining a target domain name from the access data of a target user; adding, according to a preset correspondence between domain names and classification labels, the classification label corresponding to the target domain name to the target user, where classification labels are used to classify the websites corresponding to different domain names; and determining the access preference of the target user according to the number of each distinct classification label of the target user. The invention is used for mining user access preferences.

Description

Method and device for mining user access preferences
Technical field
The present invention relates to the field of network technology, and in particular to a method and device for mining user access preferences.
Background
With the continuous development of Internet technology, the network has become an indispensable part of people's lives. In today's era of big data, determining users' behavior and preferences on the network more accurately has become a key means for major online media and merchants to gain a competitive advantage. For this reason, determining what users access and mining their access preferences is receiving increasing attention from practitioners in the field.
At present, user access preferences are usually mined from users' behavior data. In general, existing approaches analyze the content a user has browsed, such as articles, videos, and pictures, and determine the user's access preference from the analysis result. In practice, however, mining access preferences from browsing content is operationally too complex. In particular, when a user browses a large amount of content or the number of users to be analyzed is large, the volume of browsing content to be analyzed becomes very large, so the mining process takes too long and the efficiency of mining user access preferences suffers.
Summary of the invention
In view of the above problems, the present invention provides a method and device for mining user access preferences, the main purpose of which is to reduce the time consumed when mining user access preferences and to improve mining efficiency.
To solve the above technical problems, in a first aspect, the present invention provides a method for mining user access preferences, the method comprising:
Obtaining a target domain name from the access data of a target user;
Adding, according to a preset correspondence between domain names and classification labels, the classification label corresponding to the target domain name to the target user, the classification label being used to classify the websites corresponding to different domain names;
Determining the access preference of the target user according to the number of each distinct classification label of the target user.
Optionally, before obtaining the target domain name from the access data of the target user, the method further comprises:
Obtaining user behavior data of different users, and parsing the users' access data from the user behavior data, the access data containing the websites accessed by the users and the corresponding domain names;
Matching, according to the attribute information of the websites accessed by the users in the access data, a classification label corresponding to the attribute information to the domain name of each accessed website, the attribute information of a website including one or more of the field, function, and type of the website;
Storing the domain names and their corresponding classification labels in a database to obtain a domain name label library;
Adding, according to the preset correspondence between domain names and classification labels, the classification label corresponding to the target domain name to the target user is specifically:
Adding, according to the correspondence between domain names and classification labels stored in the domain name label library, the classification label corresponding to the target domain name to the target user, the domain name label library storing domain names and their corresponding classification labels.
Optionally, before obtaining the target domain name from the access data of the target user, the method further comprises:
Obtaining user behavior data of different users, and screening the user behavior data according to a screening rule in instruction information, the instruction information containing the screening rule used to screen the user behavior data;
Storing the screened user behavior data in a database to obtain a user behavior database, the screened user behavior data including at least user access data;
Obtaining the target domain name from the access data of the target user comprises:
Obtaining, according to the user identifier of the target user, the user behavior data corresponding to that user identifier from the user behavior database, to obtain the user behavior data of the target user;
Obtaining the access data of the target user from the user behavior data of the target user.
Optionally, before adding, according to the preset correspondence between domain names and classification labels, the classification label corresponding to the target domain name to the target user, the method further comprises:
Judging whether the target domain name exists in the domain name label library;
If it does not exist, outputting a label addition request, the label addition request being request information for adding a classification label to the target domain name;
Adding, according to the label information fed back in response to the label addition request, the classification label in the label information to the target domain name;
Establishing a correspondence between the target domain name and the classification label in the label information, and storing it in the domain name label library.
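The optional check described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `ensure_label`, the in-memory dictionary standing in for the domain name label library, and the `request_label` callback standing in for whoever answers the label addition request are all assumptions made for the example.

```python
def ensure_label(domain, label_library, request_label):
    """Return the label for `domain`, requesting one if the library lacks it."""
    # Judge whether the target domain name exists in the label library.
    if domain in label_library:
        return label_library[domain]
    # Output a label addition request and take the label from the feedback.
    label = request_label(domain)
    # Establish the correspondence and store it in the library.
    label_library[domain] = label
    return label


# Hypothetical library and responder, for illustration only.
library = {"shop.example": "shopping"}
answer = lambda domain: "entertainment"
ensure_label("video.example", library, answer)
```

After the call, the library also holds the new correspondence, so a later lookup of the same domain name no longer triggers a request.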
Optionally, determining the access preference of the target user according to the number of each distinct classification label of the target user comprises:
Counting the number of occurrences of each distinct classification label of the target user (the group counts) and the total number of classification labels;
Calculating, from the group counts and the total number, the proportion of the total accounted for by each distinct classification label, to obtain preference weight values;
Determining the access preference of the target user according to the preference weight values.
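The weight calculation above amounts to dividing each label's count by the total number of labels. The following is a minimal sketch under that reading; the function names and the choice of the largest weight as the preference are assumptions, since the text does not fix how the weights are turned into a preference.

```python
from collections import Counter


def preference_weights(labels):
    """Return each distinct label's share of the user's total label count."""
    counts = Counter(labels)        # group count of each distinct label
    total = sum(counts.values())    # total number of labels
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def top_preference(labels):
    """Pick the label with the largest weight (an assumed decision rule)."""
    weights = preference_weights(labels)
    return max(weights, key=weights.get) if weights else None
```

For a user whose labels are two "shopping" entries, one "mail" entry, and one "military" entry, "shopping" receives a weight of 0.5 and is returned as the preference.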
In a second aspect, the present invention further provides a device for mining user access preferences, the device comprising:
An acquiring unit, configured to obtain a target domain name from the access data of a target user;
An adding unit, configured to add, according to a preset correspondence between domain names and classification labels, the classification label corresponding to the target domain name obtained by the acquiring unit to the target user, the classification label being used to classify the websites corresponding to different domain names;
A determination unit, configured to determine the access preference of the target user according to the number of each distinct classification label of the target user added by the adding unit.
Optionally, the device further comprises:
A resolution unit, configured to obtain user behavior data of different users and parse the users' access data from the user behavior data, the access data containing the websites accessed by the users and the corresponding domain names;
A matching unit, configured to match, according to the attribute information of the websites accessed by the users in the access data parsed by the resolution unit, a classification label corresponding to the attribute information to the domain name of each accessed website, the attribute information of a website including one or more of the field, function, and type of the website;
A first storage unit, configured to store the domain names matched by the matching unit and their corresponding classification labels in a database to obtain a domain name label library;
The adding unit is specifically configured to add, according to the correspondence between domain names and classification labels stored in the domain name label library obtained by the first storage unit, the classification label corresponding to the target domain name to the target user, the domain name label library storing domain names and their corresponding classification labels.
Optionally, the device further comprises:
A screening unit, configured to obtain user behavior data of different users and screen the user behavior data according to a screening rule in instruction information, the instruction information containing the screening rule used to screen the user behavior data;
A second storage unit, configured to store the user behavior data screened by the screening unit in a database to obtain a user behavior database, the screened user behavior data including at least user access data;
The acquiring unit comprises:
A first obtaining module, configured to obtain, according to the user identifier of the target user, the user behavior data corresponding to that user identifier from the user behavior database, to obtain the user behavior data of the target user;
A second obtaining module, configured to obtain the access data of the target user from the user behavior data of the target user obtained by the first obtaining module.
Optionally, the device further comprises:
A judging unit, configured to judge whether the target domain name exists in the domain name label library;
An output unit, configured to output a label addition request if the judging unit judges that the target domain name does not exist in the domain name label library, the label addition request being request information for adding a classification label to the target domain name;
The adding unit is further configured to add, according to the label information fed back in response to the label addition request, the classification label in the label information to the target domain name;
An establishing unit, configured to establish a correspondence between the target domain name and the classification label in the label information added by the adding unit, and to store it in the domain name label library.
Optionally, the determination unit comprises:
A statistical module, configured to count the number of occurrences of each distinct classification label of the target user (the group counts) and the total number of classification labels;
A computing module, configured to calculate, from the group counts counted by the statistical module and the total number, the proportion of the total accounted for by each distinct classification label, to obtain preference weight values;
A determining module, configured to determine the access preference of the target user according to the preference weight values obtained by the computing module.
To achieve the above object, according to a third aspect of the present invention, a storage medium is provided. The storage medium includes a stored program, wherein, when the program runs, the device on which the storage medium resides is controlled to execute the method for mining user access preferences described above.
To achieve the above object, according to a fourth aspect of the present invention, a processor is provided. The processor is configured to run a program, wherein the program, when running, executes the method for mining user access preferences described above.
Through the above technical solution, the method and device for mining user access preferences provided by the present invention address the following problem in the prior art: when user preferences are mined from browsing content, the amount of data to be analyzed is large, and when a user browses a lot of content or many users need to be mined, even more data must be analyzed, so the mining process takes too long and mining efficiency suffers. In the present invention, after the target domain name in the access data of the target user is obtained, the classification label corresponding to the target domain name is added to the target user according to the preset correspondence between domain names and classification labels, and the access preference of the target user is then determined according to the number of each distinct classification label of the target user, thereby mining the user's access preference. Compared with the prior art, the present invention determines the websites a user prefers to access, and their categories, from the counts of the user's distinct classification labels. This reduces the amount of data that the prior art must analyze when determining preferences from browsing content, which in turn reduces mining time and improves the overall efficiency of mining user access preferences. In addition, the correspondence between domain names and classification labels in the preset domain name label library allows classification labels to be added to the target user more quickly, further reducing the time needed to mine user access preferences and improving mining efficiency.
The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features, and advantages of the present invention are more apparent, specific embodiments of the present invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flow chart of a method for mining user access preferences provided by an embodiment of the present invention;
Fig. 2 shows a flow chart of another method for mining user access preferences provided by an embodiment of the present invention;
Fig. 3 shows a block diagram of a device for mining user access preferences provided by an embodiment of the present invention;
Fig. 4 shows a block diagram of another device for mining user access preferences provided by an embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present invention are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present invention, it should be understood that the present invention may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present invention can be understood thoroughly and its scope conveyed fully to those skilled in the art.
To improve the accuracy of mining user access preferences, an embodiment of the present invention provides a method for mining user access preferences, applied to a client. As shown in Fig. 1, the method comprises:
101. Obtain a target domain name from the access data of a target user.
Normally, before a user's access preference is mined, it must first be determined which user's access preference is to be mined. Therefore, in the embodiment of the present invention, the user to be mined, that is, the target user described in this step, is determined first. Furthermore, because mining a user's access preference is based on the user's behavioral habits, the user's behavior data must be obtained before mining. Specifically, in the embodiment of the present invention, the user behavior data mainly refers to the user's access data. The user access data contains information on the websites the user accessed, and the website information includes the domain name information corresponding to the different websites.
A domain name can be understood as a string of characters separated by dots that names a computer or group of computers on the Internet. It is used during data transmission to identify the electronic location of the computer, and sometimes also its geographic location; a geographic domain name refers to a local area with administrative autonomy. As part of a uniform resource locator, a domain name is itself unique to a certain degree. Its purpose is to give a server, or group of servers, an address that is easy to remember and communicate, so that visitors can more easily exchange data with servers on the network. Thus, when mining a user's access preference, the websites the user accessed can be determined by obtaining the target domain names that the target user has accessed.
It should be noted that the access data of the target user, and the target domain name in the access data, can be obtained by deploying collection code on the target user's client, by crawling the user's behavior data with a crawler, or by any other means available in the prior art. No specific limitation is imposed here, and the means can be chosen as needed.
102. Add, according to a preset correspondence between domain names and classification labels, the classification label corresponding to the target domain name to the target user.
Because domain names are unique, once the domain name in the user access data has been determined, the website the user accessed can be determined from it. Different websites belong to different categories and serve different functions, for example shopping websites, military websites, and entertainment websites. Therefore, in the embodiment of the present invention, a corresponding label, that is, the classification label described in this step, can be added to a website according to its type, and a correspondence between domain names and classification labels can then be established through the correspondence between domain names and websites. The classification label is identification information used to classify the websites corresponding to different domain names. Specifically, after the correspondence between domain names and classification labels has been established, the domain names and classification labels can also be stored together in a database, yielding a database of domain names and their corresponding classification labels, so that classification labels can subsequently be added for domain names according to the correspondences in that database.
Thus, after the target domain name of the target user is obtained in step 101, the classification label corresponding to the target domain name can be looked up according to the correspondence between domain names and classification labels in the domain name label library, and that classification label is then added to the user, thereby determining the category of the user's access. Further, after the classification label corresponding to the target domain name has been added, the other domain names the user accessed can be traversed, and the classification labels corresponding to the different domain names are added to the target user one by one according to the method of this step. Therefore, in the method of the embodiment of the present invention, the number of classification labels of the target user can correspond to the number of domain names the user accessed.
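The lookup-and-add traversal of step 102 can be sketched as below. The data shapes are assumptions made for illustration: the domain name label library is a plain dictionary, each user's labels are kept in a list keyed by a user identifier, and domain names missing from the library are simply skipped (the text handles that case separately with a label addition request).

```python
def add_labels(user_labels, user_id, accessed_domains, label_library):
    """Append the label of every known accessed domain to the user's label list."""
    labels = user_labels.setdefault(user_id, [])
    for domain in accessed_domains:          # traverse the domains the user accessed
        label = label_library.get(domain)    # look up the correspondence
        if label is not None:
            labels.append(label)             # one label per accessed domain
    return labels


# Hypothetical library and access data, for illustration only.
label_library = {"shop.example": "shopping", "video.example": "entertainment"}
user_labels = {}
add_labels(user_labels, "user-a",
           ["video.example", "shop.example", "video.example"], label_library)
```

Repeated visits to the same domain name add the same label again, which is what later lets the label counts reflect how often each category was accessed.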
Specifically, the types and number of classification labels can be determined by different classification rules, such as the function or category of the website. No specific limitation is imposed here; they can be determined according to the specific classification needs of the user access preferences.
103. Determine the access preference of the target user according to the number of each distinct classification label of the target user.
After classification labels have been added to the target user by the method of step 102, the user usually has more than one: a user normally accesses more than one website, so the number of domain names accessed, and hence the number of classification labels added in step 102, may also be more than one. In this step, therefore, the user's classification labels need to be grouped. The user's labels may include identical classification labels, so identical labels are merged and their number is counted, then the next classification label is counted, and so on. Once all of the user's classification labels have been counted, the user's access preference can be determined according to the count of each distinct classification label and the total number of classification labels.
For example, suppose four classification labels, "military", "shopping", "mail", and "entertainment", appear among the classification labels of user A, and counting gives 13 for "military", 120 for "shopping", 3200 for "entertainment", and 22 for "mail". It can then be determined that the user most often accesses entertainment websites, and thus that the user's access preference leans toward entertainment websites.
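The worked example for user A can be reproduced with a few lines of Python. This is only an illustration of the counting argument using the figures from the example; the variable names are assumptions.

```python
from collections import Counter

# Label counts for user A, taken from the example above.
label_counts = Counter(
    {"military": 13, "shopping": 120, "mail": 22, "entertainment": 3200}
)

# Rank the labels from most to least frequent.
ranking = label_counts.most_common()
top_label, top_count = ranking[0]
print(top_label, top_count)
```

The first entry of the ranking is the "entertainment" label, matching the conclusion that user A's access preference leans toward entertainment websites.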
It should be noted that the way the user's access preference is determined in this step can be chosen according to actual needs. For example, the classification labels can be sorted by their counts, and the website type the user most prefers to access can be determined from the sorted order. Alternatively, the access preference can be determined from the ratio of each distinct classification label's count to the total number of classification labels. Other ways of determining the access preference may also be used. No specific limitation is imposed here on how the user's access preference is determined; the method can be chosen as needed, on the basis of the group counts of the distinct classification labels and the total number of classification labels.
The method for mining user access preferences provided by the embodiment of the present invention addresses the following problem in the prior art: when user preferences are mined from browsing content, the amount of data to be analyzed is large, and when a user browses a lot of content or many users need to be mined, even more data must be analyzed, so the mining process takes too long and mining efficiency suffers. Compared with the prior art, the present invention determines the websites a user prefers to access, and their categories, from the counts of the user's distinct classification labels. This reduces the amount of data that the prior art must analyze when determining preferences from browsing content, which in turn reduces mining time and improves the overall efficiency of mining user access preferences. In addition, the correspondence between domain names and classification labels in the preset domain name label library allows classification labels to be added to the target user more quickly, further reducing the time needed to mine user access preferences and improving mining efficiency.
Further, as a refinement and extension of the embodiment shown in Fig. 1, an embodiment of the present invention provides another method for mining user access preferences. As shown in Fig. 2, its specific steps include:
201. Obtain user behavior data of different users, and parse the users' access data from the user behavior data.
The method for mining user preferences described in the embodiment of the present invention is based on the domain names in the user's access data. Therefore, before the user's access preference is mined, the user's behavior data must first be obtained. The user behavior data described in the embodiment of the present invention can be understood as covering all of the user's interactions on the network, including operations such as clicking on web pages, browsing web pages, logging in, and logging out. The user's access data can be parsed from these behaviors. The access data can be understood as information on the user's visits to different websites, including the address information of those websites. Because the uniform resource locator is the most important piece of website location information, the access data in this step may be the uniform resource locators the user accessed, from which the domain names the user accessed are then obtained. Other types of access data may also be used; no specific limitation is imposed here on the type or form of the access data, provided that the domain names of the websites the user accessed can be obtained from it. In addition, the description of the domain name in this embodiment is fully consistent with the description in step 101 of the preceding embodiment and is not repeated here.
Thus, parsing the access data from the user's behavior data, and then obtaining the domain names from the access data, helps ensure the accuracy of the domain names obtained, laying the foundation for accurate mining results for user access preferences.
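Where the access data consists of uniform resource locators, the domain name can be taken from the host part of each URL. The sketch below uses Python's standard `urllib.parse` module; the helper names are assumptions, and the function returns the full host name rather than a registrable domain, which may or may not match the granularity the label library uses.

```python
from urllib.parse import urlsplit


def domain_of(url):
    """Return the host name of a URL (lowercased by urllib), or None if absent."""
    return urlsplit(url).hostname


def accessed_domains(urls):
    """Extract host names from a list of URLs, skipping entries without one."""
    return [d for d in map(domain_of, urls) if d]
```

For instance, `domain_of("https://Shop.Example.com:8080/item?id=1")` yields the host name without the port, path, or query string, and strings that are not URLs are dropped by `accessed_domains`.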
202. Match, according to the attribute information of the websites accessed by the users in the access data, a classification label corresponding to the attribute information to the domain name of each accessed website.
Because different websites correspond to different categories or fields, websites can be classified by classification rules according to their attribute information. For example, video websites such as "iqiyi.com" and "Youku" can be classified as "entertainment". At the same time, because domain names are unique, a corresponding classification label can be matched to each domain name according to the attribute information of the website it corresponds to.
The attribute information of a website described in this step may include one or more of the field, function, and type of the website. The attribute information to use can be selected according to the actual classification needs, and no specific limitation is imposed here. Likewise, when classification labels are matched to domain names, both the classification rules and the matching method can be chosen according to actual needs. For example, matching can be done manually, with technical staff assigning a classification label to each website's domain name according to the website's attributes. Alternatively, a preset program or function that determines the attributes of different websites can be used, so that classification labels are matched to website domain names automatically according to the labels associated with the different attributes. No specific limitation is imposed here on the matching method, which can be chosen as required.
Thus, matching classification labels according to the attribute information of the website corresponding to each domain name improves the accuracy of the matching result. This in turn helps ensure the accuracy of the correspondences between domain names and classification labels later stored in the domain name label library, and so the overall accuracy of the user access preferences.
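One way the automatic matching mentioned above might look is a small rule table keyed by attribute. The rule table, the attribute values, and the order in which field, function, and type are checked are all hypothetical; the text leaves the classification rules open.

```python
# Hypothetical classification rules: (attribute name, attribute value) -> label.
RULES = {
    ("type", "video"): "entertainment",
    ("function", "online store"): "shopping",
    ("field", "defense"): "military",
}


def match_label(attributes, rules=RULES, default=None):
    """Return the label of the first attribute (field, function, type) with a rule."""
    for key in ("field", "function", "type"):
        label = rules.get((key, attributes.get(key)))
        if label is not None:
            return label
    return default


# A video website is matched to the "entertainment" label.
match_label({"type": "video", "function": "streaming"})
```

Websites whose attributes match no rule fall through to `default`, which could be routed to manual labeling.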
203. Store the domain names and their corresponding classification labels in a database to obtain a domain name label library.
After the matching in step 202, the correspondence between domain names and classification labels is obtained. To make this correspondence convenient to use later and to preserve its accuracy, it can be stored in a specified database as described in this step, yielding the domain name label library. Subsequent operations can then query this library as needed.
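A minimal sketch of such a label library, assuming an SQLite table is an acceptable stand-in for the "specified database" (the patent names no database product; the table and function names below are hypothetical):

```python
import sqlite3

def create_label_library(conn):
    """Create the domain-to-label table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS domain_labels ("
        "domain TEXT PRIMARY KEY, label TEXT NOT NULL)"
    )

def store_label(conn, domain, label):
    """Insert or overwrite the label stored for one domain name."""
    conn.execute(
        "INSERT OR REPLACE INTO domain_labels (domain, label) VALUES (?, ?)",
        (domain, label),
    )

def lookup_label(conn, domain):
    """Return the stored label for a domain name, or None if it is absent."""
    row = conn.execute(
        "SELECT label FROM domain_labels WHERE domain = ?", (domain,)
    ).fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")  # in-memory database for illustration only
create_label_library(conn)
store_label(conn, "iqiyi.com", "entertainment")
```

Using the domain name as the primary key reflects the uniqueness of domain names that the description relies on.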
204. Obtain the user behavior data of different users, and screen the user behavior data according to the screening rule in the instruction information.
In general, user behavior analysis does not cover only one or a few users; it covers a large number of users in a user group. The data of the whole user group is obtained first, the behavior data of each user is then analyzed one by one, and the analysis result for the whole group follows. Similarly, in the mining process described in the embodiment of the present invention, the access preference of each user is mined one by one to obtain the access preferences of different users.
Accordingly, in the embodiment of the present invention, all user behavior data that is currently available can first be collected in full. The data can be obtained by any existing means, such as crawlers or scripts. The collected data may contain erroneous or redundant records, and much of it is not needed by the method described in the embodiment of the present invention. Therefore, after the user behavior data is obtained, a filtering request can be sent to the data collector, and the erroneous, redundant, duplicated, and unneeded data is then screened out and cleaned according to the filtering requirements in the instruction information issued by the collector, yielding "clean" user behavior data. Here, clean user behavior data means data that meets the needs of the method described in the embodiment of the present invention; unneeded data is not retained, which avoids occupying excessive system resources.
It should be noted that the instruction information described in this step can be issued before each screening or preset by the collector. This is not specifically limited here and can be chosen according to actual needs, provided that the user behavior data remaining after screening contains the users' access data.
Screening the obtained user behavior data reduces the amount of data subsequently stored in the database. It also shortens query time and improves query efficiency when the target user's behavior data is later retrieved, which reduces the overall time consumed in mining user access preferences and the occupation of system resources.
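The screening in step 204 might look like the sketch below. The record layout (`user_id`, `url`, `timestamp`) and the rule "drop incomplete records, drop exact duplicates, keep only the required fields" are assumptions chosen for illustration; the patent leaves the screening rule to the instruction information.

```python
def screen_records(records, required_fields=("user_id", "url", "timestamp")):
    """Drop incomplete and duplicate records and keep only the required fields."""
    seen = set()
    cleaned = []
    for record in records:
        # Erroneous data: a required field is missing or empty.
        if any(record.get(field) in (None, "") for field in required_fields):
            continue
        key = tuple(record[field] for field in required_fields)
        # Redundant data: an identical record has already been kept.
        if key in seen:
            continue
        seen.add(key)
        # Unneeded data: fields outside required_fields are discarded.
        cleaned.append({field: record[field] for field in required_fields})
    return cleaned

raw = [
    {"user_id": "A", "url": "https://www.iqiyi.com/v_1.html", "timestamp": 1, "agent": "x"},
    {"user_id": "A", "url": "https://www.iqiyi.com/v_1.html", "timestamp": 1, "agent": "x"},
    {"user_id": "A", "url": "", "timestamp": 2},
    {"user_id": "B", "url": "https://news.example.test/", "timestamp": 3},
]
cleaned = screen_records(raw)
```

The timestamp is part of the duplicate key so that repeated visits to the same site at different times are kept, since later steps count labels per visit.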
205. Store the screened user behavior data in a database to obtain a user behavior database.
After step 204, the screened user behavior data can be stored in a preset database to obtain the user behavior database. This database collects the user behavior data of different users in full. When the access preference of any user in the user behavior database needs to be mined, the access data can be retrieved directly from the database, which improves acquisition speed.
It should be noted that steps 201-203 constitute the process of building the domain name label library, while steps 204-205 constitute the process of building the user behavior database. The order of these two processes can be chosen as needed and is not limited here; the order described in the embodiment of the present invention is only one specific implementation.
206. Obtain the target domain name from the access data of the target user.
In the embodiment of the present invention, before a user's access preference is mined, it is first necessary to determine which user's preference is to be mined. Two cases are possible.
In the first case, the target user is a user in the user behavior database built in step 205. This step can then proceed as follows. First, the user behavior data corresponding to the target user's user identifier is obtained from the user behavior database. Then, the target user's access data is obtained from that user behavior data, and the domain names the target user has accessed are parsed from the access data. The type and form of the access data are not limited here and are consistent with the description in the previous embodiment.
In the second case, the target user is not a user in the user behavior database. The target user's behavior data then needs to be obtained by any existing data acquisition means, for example by a crawler or a deployed script, which is not limited here.
In this step, the user identifier is looked up in the user behavior database and, once the target user's identifier is found, the corresponding user behavior data is obtained. This ensures the accuracy of the obtained user behavior information and prevents wrongly obtained user behavior data from affecting the final mining result for the target user's access preference.
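As an illustration of the first case, the sketch below looks up a user by identifier and parses host names out of the stored access URLs. It assumes the access data is a list of URLs per user, which the patent does not specify; `behavior_db` and `target_domains` are hypothetical names.

```python
from urllib.parse import urlsplit

def target_domains(user_id, behavior_db):
    """Return the host names found in one user's access records."""
    domains = []
    for url in behavior_db.get(user_id, []):
        host = urlsplit(url).hostname  # None when the URL has no network location
        if host:
            domains.append(host)
    return domains

# Hypothetical user behavior database: user identifier -> accessed URLs.
behavior_db = {
    "A": [
        "https://www.iqiyi.com/v_1.html",
        "https://news.example.test/today",
        "not a url",
    ]
}
domains_a = target_domains("A", behavior_db)
```

This sketch keeps the full host name (for example "www.iqiyi.com"). Reducing it to a registrable domain such as "iqiyi.com" would need a public-suffix list and is not attempted here.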
207. Judge whether the target domain name exists in the domain name label library.
Although the domain name label library obtained in step 203 stores a large number of domain names and corresponding classification labels, some domain names may still be missing from it given the number of websites on the network. Therefore, in the embodiment of the present invention, after the target domain name of the target user is obtained, the domain name label library is queried for that domain name, and the subsequent steps are carried out according to the result.
If the target domain name exists in the domain name label library, step 208 is executed; if it does not exist, step 210 is executed.
208. According to the preset correspondence between domain names and classification labels, add the classification label corresponding to the target domain name for the target user.
When step 207 finds the target domain name in the domain name label library, this step can add the classification label corresponding to the target domain name for the target user according to the correspondence stored in the library. That is, the classification label corresponding to the domain name is obtained from the domain name label library and then added to the target user.
It should be noted that a user has usually accessed multiple different domain names, so the target user may have multiple target domain names. A label set can therefore be established for the target user, and the classification labels corresponding to the different domain names are added to that set according to the method described in this step.
In this step, adding the classification label of the target domain name for the target user through the preset correspondence between domain names and classification labels allows the categories the user accesses to be determined accurately, which improves the accuracy of mining the user's access preference.
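Steps 207 and 208 can be sketched together as a lookup over the target domain names. The dictionary `domain_labels` stands in for the domain name label library, and all names here are hypothetical.

```python
def add_labels_for_user(target_domain_names, domain_labels):
    """Collect labels for known domains and report domains missing from the library."""
    labels = []   # the user's label set; repeats are kept because step 209 counts them
    missing = []  # domains that would go through steps 210-212
    for domain in target_domain_names:
        label = domain_labels.get(domain)
        if label is None:
            missing.append(domain)
        else:
            labels.append(label)
    return labels, missing

domain_labels = {"www.iqiyi.com": "entertainment", "news.example.test": "news"}
user_labels, missing_domains = add_labels_for_user(
    ["www.iqiyi.com", "news.example.test", "www.iqiyi.com", "unknown.test"],
    domain_labels,
)
```

A list is used rather than a Python `set` because the later weighting depends on how many times each label occurs.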
209. Determine the access preference of the target user according to the quantities of the target user's different classification labels.
Specifically, after the classification labels corresponding to the different target domain names have been added for the target user, the target user's access preference can be determined from those labels. This step may include the following. First, count the number of each distinct classification label of the target user (the grouped count) and the total number of classification labels; identical labels are merged and deduplicated, and the number of occurrences of each label is recorded. Then, from the grouped counts and the total, calculate the proportion of the total that each classification label accounts for, which gives the preference weight value for each category. Finally, determine the target user's access preference according to the user's preference weight values for the different categories.
For example, suppose the classification labels of user A are "entertainment", "news", "entertainment", "game", "entertainment", "entertainment", "entertainment", "online shopping", "online shopping", and "entertainment", 10 labels in total. Merging, deduplicating, and counting these labels gives "entertainment" 6, "online shopping" 2, "news" 1, and "game" 1. The proportion of the total that each of the four labels accounts for is then calculated, giving preference weight values of 0.6 for "entertainment", 0.2 for "online shopping", 0.1 for "news", and 0.1 for "game". Because the preference weight value of "entertainment" is the highest and far higher than the others, the access preference of user A is determined to be entertainment websites.
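The worked example for user A can be reproduced with a short sketch. The function name `preference_weights` is hypothetical; the computation is the grouped count divided by the total count, as described in this step.

```python
from collections import Counter

def preference_weights(labels):
    """Return each label's share of the total number of labels."""
    counts = Counter(labels)      # grouped count per distinct label
    total = sum(counts.values())  # total number of labels
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}

labels_a = [
    "entertainment", "news", "entertainment", "game", "entertainment",
    "entertainment", "entertainment", "online shopping", "online shopping",
    "entertainment",
]
weights_a = preference_weights(labels_a)
# The category with the highest weight is taken as the access preference.
preference_a = max(weights_a, key=weights_a.get)
```

For the ten labels above this yields weights of 0.6, 0.2, 0.1, and 0.1, with "entertainment" as the preference, matching the figures in the example.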
In addition, it should be noted that the method described in this step is only a preferred implementation of the mining method described in the embodiment of the present invention; other implementations can also be chosen according to actual needs. For example, the classification label with the largest count can be taken as the category of websites the user prefers to access. Alternatively, a threshold can be preset, and each classification label whose proportion of the total exceeds the threshold is taken as a category of websites the user prefers to access.
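The threshold variant could be sketched as below. The threshold value 0.15 is an arbitrary assumption for illustration, and the weights are the ones from the user A example.

```python
def preferred_by_threshold(weights, threshold):
    """Return, in sorted order, every label whose weight exceeds the threshold."""
    return sorted(label for label, weight in weights.items() if weight > threshold)

example_weights = {
    "entertainment": 0.6,
    "online shopping": 0.2,
    "news": 0.1,
    "game": 0.1,
}
preferred = preferred_by_threshold(example_weights, 0.15)
```

Unlike picking the single largest label, this variant can return several preferred categories, or none if no label exceeds the threshold.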
According to the method of this step, calculating the quantity of each classification label and its proportion of the total to obtain preference weight values yields a quantified mining result for the user's access preference, which makes the result more intuitive.
210. If the target domain name does not exist in the domain name label library, output a label addition request.
When step 207 determines that the target domain name does not exist in the domain name label library, a classification label needs to be added to the target domain name manually. Specifically, in this step, a label addition request can be output to the relevant personnel, so that after receiving the request they return feedback information containing the classification label that corresponds to the website of the target domain name. The label addition request is request information for adding a classification label to the target domain name.
It should be noted that the label addition request in this step can be implemented by sending a request email to a preset mailbox or in other ways. This is not specifically limited here and can be chosen according to implementation needs.
211. According to the label information fed back in response to the label addition request, add the classification label in the label information for the target domain name.
After the label addition request is output in step 210, feedback information for the request can be received, which may contain the requested classification label. Upon receipt of the feedback information, the classification label is obtained from the label information and added for the target domain name.
212. Establish a correspondence between the target domain name and the classification label in the label information, and store it in the domain name label library.
After the classification label of the target domain name has been determined, a correspondence can be established between the target domain name and the classification label and stored in the domain name label library, extending the library's content. When another user later has the same target domain name, the corresponding classification label can be obtained directly from the library. This reduces the time needed to add the classification label for a user, improves the efficiency of adding labels, and thus improves the overall efficiency of mining user access preferences.
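Steps 207 and 210 to 212 can be sketched as a lookup with a fallback. The callback `request_label` stands in for the label addition request and its feedback (for example an email to a reviewer); it and the other names are hypothetical.

```python
def resolve_label(domain, domain_labels, request_label):
    """Look up a label; on a miss, request one and extend the library with it."""
    label = domain_labels.get(domain)    # step 207: is the domain in the library?
    if label is None:
        label = request_label(domain)    # steps 210-211: request and receive a label
        if label is not None:
            domain_labels[domain] = label  # step 212: store the new correspondence
    return label

requests_sent = []

def fake_reviewer(domain):
    """Stand-in for the manual feedback; records each request it receives."""
    requests_sent.append(domain)
    return "game"

library = {"www.iqiyi.com": "entertainment"}
first = resolve_label("play.example.test", library, fake_reviewer)
second = resolve_label("play.example.test", library, fake_reviewer)
```

The second call is served from the extended library without a new request, which is the time saving the description attributes to step 212.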
Further, as an implementation of the method shown in Fig. 1, the embodiment of the present invention also provides a device for mining user access preferences. This device embodiment corresponds to the foregoing method embodiment. For ease of reading, it does not repeat the details of the method embodiment one by one, but it should be clear that the device in this embodiment can implement all of the content of the foregoing method embodiment. As shown in Fig. 3, the device includes an acquiring unit 31, an adding unit 32, and a determination unit 33, wherein:
The acquiring unit 31 can be used to obtain the target domain name from the access data of the target user.
The adding unit 32 can be used to add, for the target user, the classification label corresponding to the target domain name obtained by the acquiring unit 31, according to the preset correspondence between domain names and classification labels. The classification labels are used to classify the websites corresponding to different domain names.
The determination unit 33 can be used to determine the access preference of the target user according to the quantities of the different classification labels added for the target user by the adding unit 32.
Further, as an implementation of the method shown in Fig. 2, the embodiment of the present invention also provides a device for mining user access preferences. This device embodiment corresponds to the foregoing method embodiment. For ease of reading, it does not repeat the details of the method embodiment one by one, but it should be clear that the device in this embodiment can implement all of the content of the foregoing method embodiment. As shown in Fig. 4, the device includes an acquiring unit 401, an adding unit 402, and a determination unit 403, wherein:
The acquiring unit 401 can be used to obtain the target domain name from the access data of the target user.
The adding unit 402 can be used to add, for the target user, the classification label corresponding to the target domain name obtained by the acquiring unit 401, according to the preset correspondence between domain names and classification labels. The classification labels are used to classify the websites corresponding to different domain names.
The determination unit 403 can be used to determine the access preference of the target user according to the quantities of the different classification labels added for the target user by the adding unit 402.
Further, the device also includes:
A parsing unit 404, which can be used to obtain the user behavior data of different users and parse the users' access data from the user behavior data. The access data contains the websites accessed by the users and the corresponding domain names.
A matching unit 405, which can be used to match, for the domain name of each website accessed by a user, the classification label corresponding to that website's attribute information, according to the attribute information of the websites in the access data parsed by the parsing unit 404. The attribute information of a website includes one or more of the website's field, function, and type.
A first storage unit 406, which can be used to store the domain names and corresponding classification labels matched by the matching unit 405 in a database to obtain the domain name label library.
The adding unit 402 is specifically used to add, for the target user, the classification label corresponding to the target domain name according to the correspondence between domain names and classification labels stored in the domain name label library obtained by the first storage unit 406. The domain name label library stores domain names and their corresponding classification labels.
Further, the device also includes:
A screening unit 407, which can be used to obtain the user behavior data of different users and screen it according to the screening rule in the instruction information. The instruction information contains the screening rule to be applied to the user behavior data.
A second storage unit 408, which can be used to store the user behavior data screened by the screening unit 407 in a database to obtain the user behavior database. The screened user behavior data includes at least user access data.
The acquiring unit 401 includes:
A first obtaining module 4011, which can be used to obtain, from the user behavior database, the user behavior data corresponding to the target user's user identifier, yielding the user behavior data of the target user.
A second obtaining module 4012, which can be used to obtain the access data of the target user from the target user's behavior data obtained by the first obtaining module 4011.
Further, the device also includes:
A judging unit 409, which can be used to judge whether the target domain name exists in the domain name label library.
An output unit 410, which can be used to output a label addition request if the judging unit 409 judges that the target domain name does not exist in the domain name label library. The label addition request is request information for adding a classification label to the target domain name.
The adding unit 402 can also be used to add, for the target domain name, the classification label in the label information fed back in response to the label addition request.
An establishing unit 411, which can be used to establish a correspondence between the target domain name and the classification label in the label information added by the adding unit 402, and to store the correspondence in the domain name label library.
Further, the determination unit 403 includes:
A statistical module 4031, which can be used to count the number of each distinct classification label of the target user (the grouped count) and the total number of classification labels.
A computing module 4032, which can be used to calculate, from the grouped counts and the total number counted by the statistical module 4031, the proportion of the total that each classification label accounts for, yielding the preference weight values.
A determining module 4033, which can be used to determine the access preference of the target user according to the preference weight values obtained by the computing module 4032.
With the above technical solution, the embodiment of the present invention provides a method and device for mining user access preferences. In the prior art, user preferences are mined from browsing content, which requires analyzing a large amount of data; when a user browses a lot of content or many users need to be mined, the volume of data grows further, so the mining process takes too long and mining efficiency suffers. In contrast, the present invention determines the websites a user prefers to access and their categories from the quantities of the user's different classification labels. This reduces the amount of data that must be analyzed compared with determining preferences from browsing content, shortens mining time, and improves the overall efficiency of mining user access preferences. In addition, the correspondence between domain names and classification labels in the preset domain name label library allows classification labels to be added for the target user more quickly, which further shortens mining time and improves mining efficiency.
Moreover, while the domain name label library is being built, classification labels are matched according to the attribute information of the website corresponding to each domain name. This improves the accuracy of the matching result, ensures the accuracy of the correspondence between domain names and classification labels stored in the domain name label library, and thus ensures the overall accuracy of the mined user access preferences.
In addition, while the user behavior database is being built, screening the obtained user behavior data reduces the amount of data subsequently stored in the database. It also shortens query time and improves query efficiency when the target user's behavior data is later retrieved, which reduces the overall time consumed in mining user access preferences and the occupation of system resources. Building the user behavior database also allows the user behavior data of different users to be collected in full. When the access preference of any user in the user behavior database needs to be mined, the access data can be retrieved directly from the database, which improves acquisition speed and the overall speed of mining user access preferences.
Further, the user identifier is looked up in the user behavior database and, once the target user's identifier is found, the corresponding user behavior data is obtained. This ensures the accuracy of the obtained user behavior information and prevents wrongly obtained user behavior data from affecting the final mining result for the target user's access preference.
In addition, calculating the quantity of each classification label and its proportion of the total to obtain preference weight values yields a quantified mining result for the user's access preference, which makes the result more intuitive. Furthermore, when the target domain name is judged not to exist in the domain name label library, a label addition request is sent and feedback information is received, from which the classification label is obtained. This avoids the situation in which no classification label can be added for the user because the target domain name is missing from the library, and improves the accuracy of the mining result. Establishing a correspondence between the target domain name and the classification label and storing it in the domain name label library also supplements the library's content, keeps the library rich, and ensures the accuracy of the classification labels later added for users from the library.
The device for mining user access preferences includes a processor and a memory. The acquiring unit, adding unit, determination unit, and other units described above are stored in the memory as program units, and the processor executes these program units stored in the memory to implement the corresponding functions.
The processor contains a kernel, which retrieves the corresponding program unit from the memory. One or more kernels can be provided, and the time consumed in mining user access preferences can be reduced, and mining efficiency improved, by adjusting kernel parameters.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one memory chip.
The embodiment of the present invention provides a storage medium on which a program is stored; when executed by a processor, the program implements the method for mining user access preferences.
The embodiment of the present invention provides a processor for running a program, wherein the program, when run, executes the method for mining user access preferences.
The embodiment of the present invention provides equipment that includes a processor, a memory, and a program stored in the memory and runnable on the processor. When executing the program, the processor performs the following steps: obtaining the target domain name from the access data of the target user; adding, for the target user, the classification label corresponding to the target domain name according to the preset correspondence between domain names and classification labels, where the classification labels are used to classify the websites corresponding to different domain names; and determining the access preference of the target user according to the quantities of the target user's different classification labels.
Further, before the target domain name is obtained from the access data of the target user, the method also includes:
Obtaining the user behavior data of different users, and parsing the users' access data from the user behavior data, where the access data contains the websites accessed by the users and the corresponding domain names;
Matching, for the domain name of each website accessed by a user, the classification label corresponding to that website's attribute information, according to the attribute information of the websites in the users' access data, where the attribute information of a website includes one or more of the website's field, function, and type;
Storing the domain names and corresponding classification labels in a database to obtain the domain name label library;
Adding the classification label corresponding to the target domain name for the target user according to the preset correspondence between domain names and classification labels is specifically:
Adding the classification label corresponding to the target domain name for the target user according to the correspondence between domain names and classification labels stored in the domain name label library, where the domain name label library stores domain names and their corresponding classification labels.
Further, before the target domain name is obtained from the access data of the target user, the method also includes:
Obtaining the user behavior data of different users, and screening the user behavior data according to the screening rule in the instruction information, where the instruction information contains the screening rule to be applied to the user behavior data;
Storing the screened user behavior data in a database to obtain the user behavior database, where the screened user behavior data includes at least user access data;
Obtaining the target domain name from the access data of the target user includes:
Obtaining, from the user behavior database, the user behavior data corresponding to the target user's user identifier, yielding the user behavior data of the target user;
Obtaining the access data of the target user from the user behavior data of the target user.
Further, before the classification label corresponding to the target domain name is added for the target user according to the preset correspondence between domain names and classification labels, the method also includes:
Judging whether the target domain name exists in the domain name label library;
If it does not exist, outputting a label addition request, where the label addition request is request information for adding a classification label to the target domain name;
Adding, for the target domain name, the classification label in the label information fed back in response to the label addition request;
Establishing a correspondence between the target domain name and the classification label in the label information, and storing it in the domain name label library.
Further, determining the access preference of the target user according to the quantities of the target user's different classification labels includes:
Counting the number of each distinct classification label of the target user (the grouped count) and the total number of classification labels;
Calculating, from the grouped counts and the total number, the proportion of the total that each classification label accounts for, yielding the preference weight values;
Determining the access preference of the target user according to the preference weight values.
The equipment in the embodiment of the present invention can be a server, a PC, a PAD, a mobile phone, or the like.
The embodiment of the invention also provides a kind of computer program products, when executing on data processing equipment, are suitable for It executes the program of initialization there are as below methods step: obtaining target domain name from the access data of target user;According to preset The corresponding relationship of domain name and tag along sort, adds the tag along sort of the corresponding target domain name for the target user, and described point Class label is for classifying to the corresponding website of different domain names;According to the quantity of the different classifications label of the target user, Determine the access hobby of the target user.
Further, before obtaining target domain name in the access data from target user, the method also includes:
The user behavior data of different user is obtained, and parses the access number of user from the user behavior data According to the website comprising user's access and corresponding domain name in the access data;
According to the attribute information for the website that user in the access data of the user accesses, for the website of user access The tag along sort of the corresponding corresponding attribute information of domain name matching, the attribute information of the website includes the field of website, function One of energy and type are a variety of;
By domain name and the storage of corresponding tag along sort into database, domain name tag library is obtained;
The corresponding relationship according to preset domain name and tag along sort adds the corresponding target for the target user The tag along sort of domain name, specifically:
According to the corresponding relationship of the domain name and tag along sort that are stored in domain name tag library, added for the target user The tag along sort of the corresponding target domain name, domain name label stock contain domain name and corresponding tag along sort.
Further, before the target domain name is obtained from the access data of the target user, the method further comprises:
obtaining user behavior data of different users, and screening the user behavior data according to a screening rule in instruction information, the instruction information containing the screening rule used to screen the user behavior data;
storing the screened user behavior data in a database to obtain a user behavior database, the screened user behavior data including at least user access data.
Obtaining the target domain name from the access data of the target user then comprises:
obtaining, from the user behavior database and according to the user identifier of the target user, the user behavior data corresponding to that user identifier, thereby obtaining the user behavior data of the target user;
obtaining the access data of the target user from the user behavior data of the target user.
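The screening and retrieval steps can be sketched as follows. The record fields (`user_id`, `type`, `url`), the idea that a screening rule is a set of record types to keep, and the use of an in-memory dictionary in place of a database are all assumptions made for illustration only.

```python
from collections import defaultdict


def screen_behavior_data(records, keep_types):
    """Apply a screening rule: keep only records whose type the rule names."""
    return [r for r in records if r["type"] in keep_types]


def build_behavior_library(records):
    """Index the screened records by user identifier."""
    library = defaultdict(list)
    for r in records:
        library[r["user_id"]].append(r)
    return library


def access_data_for(library, user_id):
    """Return the URLs of the page visits recorded for one user."""
    return [r["url"] for r in library.get(user_id, []) if r["type"] == "visit"]


raw = [
    {"user_id": "u1", "type": "visit", "url": "https://news.example.com/a"},
    {"user_id": "u1", "type": "heartbeat", "url": ""},
    {"user_id": "u2", "type": "visit", "url": "https://shop.example.com/b"},
    {"user_id": "u1", "type": "click", "url": "https://news.example.com/a#more"},
]
library = build_behavior_library(screen_behavior_data(raw, {"visit", "click"}))
target_access = access_data_for(library, "u1")
```

Here the heartbeat record is discarded by the screening rule, and only the visit records of user `u1` are returned as that user's access data.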
Further, before the classification label corresponding to the target domain name is added for the target user according to the preset correspondence between domain names and classification labels, the method further comprises:
judging whether the target domain name exists in the domain name label library;
if not, outputting a label addition request, the label addition request being request information for adding a classification label to the target domain name;
adding, for the target domain name, the classification label contained in the label information fed back in response to the label addition request;
establishing a correspondence between the target domain name and the classification label in the label information, and storing the correspondence in the domain name label library.
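A minimal sketch of this fallback path is given below. The function `ensure_label`, the callback `request_label` (standing in for whatever person or service answers the label addition request), and the plain dictionary used as the label library are hypothetical names chosen for the example.

```python
def ensure_label(tag_library, domain, request_label):
    """Return the label for a domain, requesting one if the library lacks it.

    tag_library   -- dict mapping domain names to classification labels
    request_label -- callable that receives the domain and returns the label
                     fed back for the request, or None if none is provided
    """
    if domain not in tag_library:
        label = request_label(domain)  # output the label addition request
        if label is not None:
            tag_library[domain] = label  # store the new correspondence
    return tag_library.get(domain)


tag_library = {"news.example.com": "news"}
requests_sent = []


def manual_feedback(domain):
    """Hypothetical responder that records each request and answers some of them."""
    requests_sent.append(domain)
    return {"forum.example.com": "community"}.get(domain)


first = ensure_label(tag_library, "forum.example.com", manual_feedback)
second = ensure_label(tag_library, "forum.example.com", manual_feedback)
known = ensure_label(tag_library, "news.example.com", manual_feedback)
```

Because the fed-back label is stored in the library, the second lookup of the same domain is answered from the library and no further request is issued.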
Further, determining the access preference of the target user according to the number of the target user's different classification labels comprises:
counting, for the target user, the number of labels in each group of identical classification labels and the total number of classification labels;
calculating, from the per-group counts and the total number, the proportion of the total accounted for by each classification label, thereby obtaining preference weight values;
determining the access preference of the target user according to the preference weight values.
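The weight calculation amounts to dividing each label's count by the total number of labels. The sketch below shows this in Python; choosing the label with the largest weight as the preference is an assumption of the example, since the text only says that the preference is determined "according to" the weight values.

```python
from collections import Counter


def preference_weights(labels):
    """Return each label's share of the total number of labels."""
    counts = Counter(labels)  # per-group counts
    total = sum(counts.values())  # total number of classification labels
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Labels accumulated for one hypothetical target user.
user_labels = ["shopping", "shopping", "news", "video"]
weights = preference_weights(user_labels)

# Assumed decision rule: the label with the largest weight is the preference.
top_preference = max(weights, key=weights.get)
```

With these four labels, "shopping" accounts for half of the total and the other two labels for a quarter each.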
Those skilled in the art will understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical memory, etc.) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces and memory.
Memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, commodity or device that includes the element.
The above are merely embodiments of the present application and are not intended to limit it. Various modifications and variations of the present application are possible to those skilled in the art. Any modification, equivalent replacement or improvement made within the spirit and principles of the present application shall fall within the scope of its claims.

Claims (10)

1. A method for mining a user's access preference, characterized by comprising:
obtaining a target domain name from the access data of a target user;
adding, for the target user, the classification label corresponding to the target domain name according to a preset correspondence between domain names and classification labels, the classification label being used to classify the websites corresponding to different domain names;
determining the access preference of the target user according to the number of the target user's different classification labels.
2. The method according to claim 1, characterized in that, before the target domain name is obtained from the access data of the target user, the method further comprises:
obtaining user behavior data of different users, and parsing the users' access data from the user behavior data, the access data containing the websites accessed by the users and the corresponding domain names;
matching, according to the attribute information of the websites accessed by a user in that user's access data, the domain name corresponding to each accessed website with the classification label corresponding to the attribute information, the attribute information of a website including one or more of the field, function and type of the website;
storing the domain names and the corresponding classification labels in a database to obtain a domain name label library;
wherein adding, for the target user, the classification label corresponding to the target domain name according to the preset correspondence between domain names and classification labels is specifically:
adding, for the target user, the classification label corresponding to the target domain name according to the correspondence between domain names and classification labels stored in the domain name label library, the domain name label library storing domain names and their corresponding classification labels.
3. The method according to claim 2, characterized in that, before the target domain name is obtained from the access data of the target user, the method further comprises:
obtaining user behavior data of different users, and screening the user behavior data according to a screening rule in instruction information, the instruction information containing the screening rule used to screen the user behavior data;
storing the screened user behavior data in a database to obtain a user behavior database;
wherein obtaining the target domain name from the access data of the target user comprises:
obtaining, from the user behavior database and according to the user identifier of the target user, the user behavior data corresponding to that user identifier, thereby obtaining the user behavior data of the target user;
obtaining the access data of the target user from the user behavior data of the target user.
4. The method according to claim 3, characterized in that, before the classification label corresponding to the target domain name is added for the target user according to the preset correspondence between domain names and classification labels, the method further comprises:
judging whether the target domain name exists in the domain name label library;
if not, outputting a label addition request, the label addition request being request information for adding a classification label to the target domain name;
adding, for the target domain name, the classification label contained in the label information fed back in response to the label addition request;
establishing a correspondence between the target domain name and the classification label in the label information, and storing the correspondence in the domain name label library.
5. The method according to any one of claims 1-4, characterized in that determining the access preference of the target user according to the number of the target user's different classification labels comprises:
counting, for the target user, the number of labels in each group of identical classification labels and the total number of classification labels;
calculating, from the per-group counts and the total number, the proportion of the total accounted for by each classification label, thereby obtaining preference weight values;
determining the access preference of the target user according to the preference weight values.
6. A device for mining a user's access preference, characterized by comprising:
an acquiring unit, configured to obtain a target domain name from the access data of a target user;
an adding unit, configured to add, for the target user, the classification label corresponding to the target domain name obtained by the acquiring unit, according to a preset correspondence between domain names and classification labels, the classification label being used to classify the websites corresponding to different domain names;
a determination unit, configured to determine the access preference of the target user according to the number of the target user's different classification labels added by the adding unit.
7. The device according to claim 6, characterized in that the device further comprises:
a parsing unit, configured to obtain user behavior data of different users and to parse the users' access data from the user behavior data, the access data containing the websites accessed by the users and the corresponding domain names;
a matching unit, configured to match, according to the attribute information of the websites accessed by a user in the access data parsed by the parsing unit, the domain name corresponding to each accessed website with the classification label corresponding to the attribute information, the attribute information of a website including one or more of the field, function and type of the website;
a first storage unit, configured to store the domain names matched by the matching unit and the corresponding classification labels in a database to obtain a domain name label library;
wherein the adding unit is specifically configured to add, for the target user, the classification label corresponding to the target domain name according to the correspondence between domain names and classification labels stored in the domain name label library obtained by the first storage unit, the domain name label library storing domain names and their corresponding classification labels.
8. The device according to claim 7, characterized in that the device further comprises:
a screening unit, configured to obtain user behavior data of different users and to screen the user behavior data according to a screening rule in instruction information, the instruction information containing the screening rule used to screen the user behavior data;
a second storage unit, configured to store the user behavior data screened by the screening unit in a database to obtain a user behavior database, the screened user behavior data including at least user access data;
wherein the acquiring unit comprises:
a first obtaining module, configured to obtain, from the user behavior database and according to the user identifier of the target user, the user behavior data corresponding to that user identifier, thereby obtaining the user behavior data of the target user;
a second obtaining module, configured to obtain the access data of the target user from the user behavior data of the target user obtained by the first obtaining module.
9. A storage medium, characterized in that the storage medium includes a stored program, wherein, when the program runs, the device on which the storage medium resides is controlled to execute the method for mining a user's access preference according to any one of claims 1 to 5.
10. A processor, characterized in that the processor is configured to run a program, wherein the program, when run, executes the method for mining a user's access preference according to any one of claims 1 to 5.
CN201710884958.5A 2017-09-26 2017-09-26 Excavate the method and device that user accesses hobby Pending CN109561162A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710884958.5A CN109561162A (en) 2017-09-26 2017-09-26 Excavate the method and device that user accesses hobby

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710884958.5A CN109561162A (en) 2017-09-26 2017-09-26 Excavate the method and device that user accesses hobby

Publications (1)

Publication Number Publication Date
CN109561162A true CN109561162A (en) 2019-04-02

Family

ID=65863219

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710884958.5A Pending CN109561162A (en) 2017-09-26 2017-09-26 Excavate the method and device that user accesses hobby

Country Status (1)

Country Link
CN (1) CN109561162A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2575061A1 (en) * 2011-09-29 2013-04-03 Verisign, Inc. Tracing domain name history within a registration via a whowas service
CN102799662A (en) * 2012-07-10 2012-11-28 北京奇虎科技有限公司 Method, device and system for recommending website
CN105868291A (en) * 2012-07-10 2016-08-17 北京奇虎科技有限公司 Website address recommendation method, apparatus and system
CN103870512A (en) * 2012-12-18 2014-06-18 腾讯科技(深圳)有限公司 Method and device for generating user interest label
CN104462336A (en) * 2014-12-03 2015-03-25 北京国双科技有限公司 Information pushing method and device
CN105243144A (en) * 2015-10-15 2016-01-13 桂林电子科技大学 Method and device for recommending interesting labels
CN106649347A (en) * 2015-10-30 2017-05-10 北京国双科技有限公司 Interest information identification method and apparatus
CN106446115A (en) * 2016-09-18 2017-02-22 成都九鼎瑞信科技股份有限公司 Mobile Internet user classification method and device

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112396536A (en) * 2019-08-12 2021-02-23 北京国双科技有限公司 Method and device for realizing intelligent service
CN110795616A (en) * 2019-10-10 2020-02-14 连连银通电子支付有限公司 Data collection method and device
CN110795616B (en) * 2019-10-10 2020-10-23 连连银通电子支付有限公司 Data collection method and device
CN110750724A (en) * 2019-10-24 2020-02-04 北京思维造物信息科技股份有限公司 Data processing method, device, equipment and storage medium
CN110750724B (en) * 2019-10-24 2022-08-19 北京思维造物信息科技股份有限公司 Data processing method, device, equipment and storage medium
CN110995824A (en) * 2019-11-29 2020-04-10 北京工业大学 DNS analysis load balancing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Applicant after: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.

Address before: 100086 Beijing city Haidian District Shuangyushu Area No. 76 Zhichun Road cuigongfandian 8 layer A

Applicant before: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20190402
