CN100477593C - Method and device for selecting correlative discussion zone in network community - Google Patents

Method and device for selecting correlative discussion zone in network community

Info

Publication number
CN100477593C
Authority
CN
China
Prior art keywords
discussion
zone
district
relevant
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CNB2006101411656A
Other languages
Chinese (zh)
Other versions
CN1937518A (en)
Inventor
郭眈
李明远
张猛
俞军
边江
李瀚�
杨用
李幸
俞建林
齐玉杰
刘建国
李彦宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CNB2006101411656A
Publication of CN1937518A
Application granted
Publication of CN100477593C
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The method comprises: collecting the logs of all computers that log in to discussion zones and merging them into a total log; extracting information from the total log to generate a transaction set; selecting the discussion zone frequent itemsets from the transaction set with a frequent itemset mining algorithm; and extracting the related discussion zones of each discussion zone from the frequent itemsets to generate a related discussion zone list. The invention uses data mining to find the discussion zones related to a given discussion zone. The related discussion zones indicate what the users of that zone are likely to do next, which makes the user's next visit more convenient and reduces lookup cost. A device for selecting related discussion zones in a network community is also disclosed.

Description

Method and device for selecting related discussion zones in a network community
Technical field
The present invention relates to a method and a device for selecting discussion zones in a network community, in particular a method and a device for selecting related discussion zones, and belongs to the computer field.
Background art
With the rapid development of networks, more and more people use the Internet to search for information or to find and communicate with people who share similar interests or needs.
Traditional community or BBS systems are built on manually created discussion zones and a directory index. The system administrator first edits a number of directories and their content in the system, then manually opens boards or discussion zones and associates each board or discussion zone with a directory. When browsing, a user must first judge which board or discussion zone, under which directory, a topic is likely to appear in, and then reach it by clicking the directory hyperlink followed by the board or discussion zone hyperlink. However, the users of one discussion zone often also visit certain other discussion zones; for example, users of the "AC Milan" discussion zone often also visit zones such as "Inter Milan" and "European and American sports stars". In a traditional community or BBS system, a user who wants to enter these zones has to look them up again, which increases the user's lookup cost.
A related discussion zone is defined as follows: the other discussion zones that people who log in to a given discussion zone also frequently visit are the related discussion zones of that zone.
Friend discussion zones are other discussion zones with a related theme and similar content, which the management group of a discussion zone sets up as a list of hyperlinks after collecting and screening user applications. Users can switch to these zones conveniently by clicking the hyperlinks.
In some discussion zone systems, the web page of a discussion zone also shows links to other discussion zones that its users have logged in to. For example, the web page of discussion zone A may show that people who log in to zone A have also logged in to zone B, zone C, and so on. To some extent this indicates the next action of people who log in to zone A and makes browsing across the discussion zones more convenient.
The defect of this scheme is that the system simply lists the names of the other discussion zones that a zone's users also log in to; it does not reflect how strongly logging in to the zone is associated with logging in to each of the other zones. For example, suppose that 99 of 100 people who log in to zone A also log in to zone B, but only 1 of those 100 logs in to zone C. Logging in to A is then strongly associated with logging in to B and only weakly associated with logging in to C, yet the system lists both B and C. Listing zone C does not reveal the next action of zone A's users and may even mislead their browsing.
Summary of the invention
The object of the invention is to address the defect of the prior art by providing a method and a device for selecting related discussion zones in a network community, so that users can make their next visit conveniently and lookup cost is reduced.
To achieve this object, the invention provides a method for selecting related discussion zones in a network community, comprising the following steps:
Step 1: collect the logs of all computers that log in to discussion zones and merge all the logs into a total log;
Step 2: extract information from the total log in a predefined format to generate a transaction set;
Step 3: select the discussion zone frequent itemsets from the transaction set with a frequent itemset mining algorithm;
Step 4: extract the related discussion zones of each discussion zone from the discussion zone frequent itemsets to generate a related discussion zone list.
In the above scheme, the discussion zone frequent itemsets are the collection of all discussion zone sets whose support is greater than a minimum threshold. The minimum threshold is a preset parameter; the support of a discussion zone set is the number of times the set occurs divided by the total number of discussion zone sets; a discussion zone set is the sequence of discussion zones that a user visits in one day.
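The support definition above can be illustrated with a short Python sketch. The sample visits and the threshold value 0.4 are assumptions made for illustration, not values from the patent; each transaction stands for the discussion zones one user visited in one day.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every zone in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


# Hypothetical data: one transaction per (user, day).
transactions = [
    frozenset({"AC Milan", "Inter Milan"}),
    frozenset({"AC Milan", "Inter Milan", "Sports Stars"}),
    frozenset({"AC Milan"}),
    frozenset({"Sports Stars"}),
]
MIN_SUPPORT = 0.4  # assumed minimum threshold

pair_support = support({"AC Milan", "Inter Milan"}, transactions)
# A zone set is frequent when its support is greater than the threshold.
is_frequent = pair_support > MIN_SUPPORT
```

Here the pair occurs in 2 of 4 transactions, so its support is 0.5 and it counts as frequent under the assumed threshold.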
The invention also provides a device for selecting related discussion zones, comprising:
a log collection module, which collects the logs of all computers that log in to discussion zones and merges all the logs into a total log;
a log extraction module, connected to the log collection module, which extracts information from the total log in a predefined format to generate a transaction set;
a frequent itemset generation module, connected to the log extraction module, which selects the discussion zone frequent itemsets from the transaction set with a frequent itemset mining algorithm;
a discussion zone list generation module, connected to the frequent itemset generation module, which selects the related discussion zone set of each discussion zone from the discussion zone frequent itemsets.
For any given discussion zone, this scheme uses data mining to find its related discussion zones. The related discussion zones indicate the action that users of this zone are likely to take next, which makes the user's next visit more convenient and reduces lookup cost.
The technical scheme of the invention is described in further detail below with reference to the drawings and specific embodiments.
Description of drawings
Fig. 1 is a flow chart of embodiment 1 of the method for selecting related discussion zones in a network community according to the invention;
Fig. 2 is a flow chart of embodiment 2 of the method for selecting related discussion zones in a network community according to the invention;
Fig. 3 is a flow chart of embodiment 3 of the method for selecting related discussion zones in a network community according to the invention;
Fig. 4 is a flow chart of embodiment 4 of the method for selecting related discussion zones in a network community according to the invention;
Fig. 5 is a flow chart of embodiment 5 of the method for selecting related discussion zones in a network community according to the invention;
Fig. 6 is a flow chart of embodiment 6 of the method for selecting related discussion zones in a network community according to the invention;
Fig. 7 is a flow chart of selecting the discussion zone frequent itemsets in the method for selecting related discussion zones in a network community according to the invention;
Fig. 8 is a structural diagram of embodiment 1 of the device for selecting related discussion zones in a network community according to the invention;
Fig. 9 is a structural diagram of embodiment 2 of the device for selecting related discussion zones in a network community according to the invention;
Fig. 10 is a structural diagram of embodiment 3 of the device for selecting related discussion zones in a network community according to the invention;
Fig. 11 is a structural diagram of embodiment 4 of the device for selecting related discussion zones in a network community according to the invention;
Fig. 12 is a structural diagram of embodiment 5 of the device for selecting related discussion zones in a network community according to the invention;
Fig. 13 is a structural diagram of embodiment 6 of the device for selecting related discussion zones in a network community according to the invention.
Embodiments
Related discussion zones can be selected through the following steps:
Step 1: collect the logs of all computers that log in to discussion zones and merge all the logs into a total log;
Step 2: extract information from the total log in a predefined format to generate a transaction set;
Step 3: select the discussion zone frequent itemsets from the transaction set with a frequent itemset mining algorithm;
Step 4: extract the related discussion zones of each discussion zone from the discussion zone frequent itemsets to generate a related discussion zone list.
In step 2, the total log contains many types of information, such as the user's access date, access time, and user name. The fields to extract from the total log should therefore be chosen according to the format of the user log and the data needed to select the discussion zone frequent itemsets.
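As a rough illustration of step 2, the sketch below turns log lines into a transaction set by grouping visits per user and per day. The tab-separated line layout (date, time, bracketed zone name, `un=` user field), the function name `build_transactions`, and the sample records are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def build_transactions(lines: Iterable[str]) -> List[FrozenSet[str]]:
    """Group visited zones by (user, date); each group is one transaction."""
    visits: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4 or not fields[3].startswith("un="):
            continue  # skip records that do not match the assumed layout
        date, _time, zone, user_field = fields
        zone = zone.strip("[]")
        user = user_field[len("un="):]
        if zone and user:
            visits[(user, date)].add(zone)
    # Sort by (user, date) so the output order is deterministic.
    return [frozenset(zones) for _key, zones in sorted(visits.items())]


# Hypothetical total log.
sample_log = [
    "2006-10-01\t09:15:00\t[AC Milan]\tun=alice\n",
    "2006-10-01\t09:20:00\t[Inter Milan]\tun=alice\n",
    "2006-10-01\t10:00:00\t[AC Milan]\tun=bob\n",
    "2006-10-02\t11:00:00\t[Sports Stars]\tun=alice\n",
    "malformed line\n",
]
sample_transactions = build_transactions(sample_log)
```

Using a set per (user, date) key discards repeat visits to the same zone on the same day, which matches the occurrence-counting view of support rather than a visit-counting one.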
The frequent itemset mining algorithm in step 3 selects the discussion zone frequent itemsets so that, within each discussion zone set of the frequent itemsets, any two discussion zones are strongly associated. Commonly used frequent itemset mining algorithms include the Apriori algorithm and the FP-Growth algorithm.
Fig. 1 shows the flow chart of embodiment 1 of the method for selecting related discussion zones according to the invention, which comprises the following steps:
Step 101: collect the logs of all computers that log in to discussion zones and merge all the logs into a total log;
Step 102: extract information from the total log in the format "access date\taccess time\t[discussion zone name entered by the user]\tun=user name\n" to generate the transaction set;
Step 103: use the Apriori algorithm, one of the frequent itemset mining algorithms, to select the discussion zone frequent itemsets from the transaction set. The specific steps are as follows (see Fig. 7):
Step 31: set k=1, compute the support of each itemset in C(k), and put the itemsets whose support is higher than the minimum threshold into L(k), where C(k) is the set of initial itemsets and L(k) is the set of frequent k-itemsets;
Step 32: judge whether L(k) is empty; if so, finish; otherwise go to step 33;
Step 33: join and prune the itemsets in L(k) to obtain C(k+1);
Step 34: compute the support of each itemset in C(k+1);
Step 35: output the itemsets in C(k+1) whose support is higher than the minimum threshold to the file ref.K;
Step 36: put the itemsets in C(k+1) whose support is higher than the minimum threshold into L(k+1); if L(k+1) is empty, finish; otherwise go to step 33 with k incremented by 1;
When the procedure finishes, the itemsets in the file ref.K are the frequent itemsets. Each discussion zone set in the frequent itemsets has the format: discussion zone A, discussion zone B, discussion zone C, ..., support;
Step 104: extract the related discussion zones of each discussion zone from the discussion zone frequent itemsets to generate the related discussion zone list.
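Steps 31 to 36 can be sketched in Python as follows. This is a minimal, unoptimized reading of the Apriori loop, not the patent's implementation: the function name `apriori` is chosen here, the result is returned as a dictionary instead of being written to the file ref.K, and, as in steps 35 and 36, only itemsets from C(k+1) onward (two or more zones) are output.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def apriori(transactions: List[FrozenSet[str]],
            min_support: float) -> Dict[FrozenSet[str], float]:
    """Return frequent itemsets of size >= 2 mapped to their support."""
    n = len(transactions)
    if n == 0:
        return {}

    def sup(itemset: FrozenSet[str]) -> float:
        return sum(1 for t in transactions if itemset <= t) / n

    # Step 31: C(1) holds every single zone; L(1) keeps the frequent ones.
    c1 = {frozenset([zone]) for t in transactions for zone in t}
    level: Set[FrozenSet[str]] = {c for c in c1 if sup(c) > min_support}

    frequent: Dict[FrozenSet[str], float] = {}
    # Step 32: stop as soon as L(k) is empty.
    while level:
        k = len(next(iter(level)))
        # Step 33 (join): unions of two k-itemsets that have k+1 members.
        joined = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Step 33 (prune): every k-subset of a candidate must be in L(k).
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k))
        }
        # Steps 34 to 36: compute support, output and keep frequent sets.
        level = set()
        for c in candidates:
            s = sup(c)
            if s > min_support:
                frequent[c] = s
                level.add(c)
    return frequent
```

For example, with the hypothetical transactions {A,B,C}, {A,B,C}, {A,B}, {A,C}, {D} and a threshold of 0.3, the sketch returns the three pairs over A, B, C and the triple {A,B,C}, the latter with support 0.4. Production code would count candidates in a single pass per level rather than rescanning all transactions for each itemset.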
Fig. 2 is the flow chart of embodiment 2 of the method for selecting related discussion zones according to the invention. On the basis of embodiment 1, this embodiment divides step 104 into two steps, as shown in Fig. 2:
Step 1041: for each discussion zone in turn, extract the discussion zones other than this zone from the discussion zone sets that contain it, generating the related discussion zone set of this zone;
Step 1042: generate the related discussion zone list from the related discussion zone sets of all discussion zones.
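Steps 1041 and 1042 can be sketched as below. The input is assumed to be a sequence of (zone set, support) pairs, and the output a mapping from each zone to an ordered list of its related zones; both shapes, and the name `related_zone_list`, are choices made for this example.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple


def related_zone_list(
    frequent_sets: Sequence[Tuple[FrozenSet[str], float]],
) -> Dict[str, List[str]]:
    """For each zone, collect the other zones of every set that contains it."""
    related: Dict[str, List[str]] = {}
    for zones, _support in frequent_sets:
        for zone in zones:
            bucket = related.setdefault(zone, [])
            # Sorted only to make the order inside one set deterministic.
            for other in sorted(zones - {zone}):
                if other not in bucket:
                    bucket.append(other)
    return related


# Hypothetical frequent itemsets with their support.
example_sets = [
    (frozenset({"AC Milan", "Inter Milan"}), 0.6),
    (frozenset({"AC Milan", "Sports Stars"}), 0.5),
]
example_related = related_zone_list(example_sets)
```

Because the lists preserve the order in which zone sets are processed, feeding the function frequent itemsets already sorted by descending support leaves each related list roughly in rank order.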
Sometimes the above scheme yields many related discussion zones, while the number of related discussion zones a web page can show is limited. The frequent itemsets therefore need to be sorted by support so that the top-ranked related discussion zones are shown to the user.
Fig. 3 is the flow chart of embodiment 3 of the method for selecting related discussion zones according to the invention. On the basis of embodiment 2, this embodiment adds step 1040 before step 1041: sort the discussion zone sets in the discussion zone frequent itemsets in descending order of support. It also adds step 105 after step 1042: according to a preset number of related discussion zones, delete the discussion zones whose rank is greater than this number and keep the remaining zones as the effective related discussion zones of the discussion zone.
Fig. 4 is the flow chart of embodiment 4 of the method for selecting related discussion zones according to the invention. On the basis of embodiment 2, this embodiment adds step 105' after step 1042: sort the discussion zones in the related discussion zone set of each discussion zone in descending order of support; then, according to a preset number of related discussion zones, delete the discussion zones whose rank is greater than this number and keep the remaining zones as the effective related discussion zones of the discussion zone.
With the schemes of embodiments 3 and 4, each discussion zone obtains the subset of discussion zones most strongly associated with it. This indicates the other discussion zones that users logged in to this zone are most likely to visit and makes their next browsing step more convenient.
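The ranking and truncation of embodiment 4 might look like the following sketch. The text does not say how a single related zone inherits a support value, so the code assumes it takes the highest support of any frequent set containing both zones; that scoring rule, the alphabetical tie-break, and the name `top_related` are all assumptions.

```python
from typing import Dict, FrozenSet, List


def top_related(frequent_sets: Dict[FrozenSet[str], float],
                max_related: int) -> Dict[str, List[str]]:
    """Rank each zone's related zones by support and keep the first N."""
    best: Dict[str, Dict[str, float]] = {}
    for zones, sup in frequent_sets.items():
        for zone in zones:
            scores = best.setdefault(zone, {})
            for other in zones - {zone}:
                # Assumed rule: keep the highest support seen for the pair.
                scores[other] = max(sup, scores.get(other, 0.0))
    ranked: Dict[str, List[str]] = {}
    for zone, scores in best.items():
        # Descending support; ties broken by name for determinism.
        ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        ranked[zone] = [name for name, _ in ordered[:max_related]]
    return ranked


# Hypothetical frequent itemsets.
ranked_example = top_related(
    {
        frozenset({"A", "B"}): 0.6,
        frozenset({"A", "C"}): 0.5,
        frozenset({"A", "D"}): 0.35,
    },
    max_related=2,
)
```

With the preset number set to 2, zone A keeps B and C and drops D, whose support is lowest.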
Sometimes a related discussion zone is identical to a friend discussion zone, so duplicates need to be removed. Fig. 5 is the flow chart of embodiment 5 of the method for selecting related discussion zones according to the invention. On the basis of embodiment 4, this embodiment adds step 106 after step 105': remove from the related discussion zone set of each discussion zone the zones that are identical to the friend discussion zones of that zone. Note that step 106 may also be placed after step 104 of embodiment 1, step 1042 of embodiment 2, or step 105 of embodiment 3.
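The deduplication of step 106 amounts to a filter over each related list. In this sketch the friend zones are assumed to be available as a mapping from zone name to a set of names; that shape and the function name `remove_friend_zones` are illustrative.

```python
from typing import Dict, List, Set


def remove_friend_zones(related: Dict[str, List[str]],
                        friends: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Drop related zones that already appear among a zone's friend zones."""
    return {
        zone: [z for z in zones if z not in friends.get(zone, set())]
        for zone, zones in related.items()
    }


# Hypothetical inputs.
deduped = remove_friend_zones(
    {"AC Milan": ["Inter Milan", "Sports Stars"], "Inter Milan": ["AC Milan"]},
    {"AC Milan": {"Inter Milan"}},
)
```

The order of the surviving zones is preserved, so an earlier ranking by support is not disturbed.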
Fig. 6 is the flow chart of embodiment 6 of the method for selecting related discussion zones according to the invention. On the basis of embodiment 5, this embodiment adds step 107 after step 106: according to the discussion zone identifier entered by the user, query the related discussion zone list, extract the related discussion zones, and display links to them on the web page of that discussion zone. Note that step 107 may also be placed after step 104 of embodiment 1, step 1042 of embodiment 2, step 105 of embodiment 3, or step 105' of embodiment 4.
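Step 107 is a lookup followed by rendering. The sketch below builds HTML anchors for the related zones of a given zone; the URL scheme `/zone/<name>` and the function name `render_related_links` are hypothetical, since the patent does not describe how links are formed.

```python
from html import escape
from typing import Dict, List
from urllib.parse import quote


def render_related_links(zone_id: str,
                         related: Dict[str, List[str]],
                         url_prefix: str = "/zone/") -> List[str]:
    """Return one HTML link per related zone of `zone_id`."""
    links = []
    for name in related.get(zone_id, []):
        href = url_prefix + quote(name)  # percent-encode the zone name
        links.append(f'<a href="{escape(href, quote=True)}">{escape(name)}</a>')
    return links


# Hypothetical related discussion zone list.
links = render_related_links("AC Milan", {"AC Milan": ["Inter Milan"]})
```

A zone identifier with no entry in the list simply yields no links.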
Fig. 8 is a schematic structural diagram of the device for selecting related discussion zones according to the invention. The device comprises a log collection module 11, a log extraction module 12, a frequent itemset generation module 13, and a discussion zone list generation module 14. The log extraction module 12 is connected to the log collection module 11 and to the frequent itemset generation module 13, and the discussion zone list generation module 14 is connected to the frequent itemset generation module 13. The log collection module 11 collects the logs of all computers that log in to discussion zones and merges all the logs into a total log. The log extraction module 12 extracts information from the total log in a predefined format to generate the transaction set. The frequent itemset generation module 13 selects the discussion zone frequent itemsets from the transaction set with a frequent itemset mining algorithm. The discussion zone list generation module 14 extracts the related discussion zones of each discussion zone from the discussion zone frequent itemsets and generates the related discussion zone list.
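One way to picture the four connected modules is as a pipeline of interchangeable stages. The sketch below is an illustration only: the class name `SelectionDevice`, the comma-separated log lines, the pair-only miner, and the 0.5 threshold are all assumptions, and the toy stages stand in for real implementations.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Iterable, List

Transactions = List[FrozenSet[str]]
FrequentSets = Dict[FrozenSet[str], float]
RelatedList = Dict[str, List[str]]


@dataclass
class SelectionDevice:
    collect_logs: Callable[[Iterable[Iterable[str]]], List[str]]  # module 11
    extract: Callable[[List[str]], Transactions]                  # module 12
    mine: Callable[[Transactions], FrequentSets]                  # module 13
    build_list: Callable[[FrequentSets], RelatedList]             # module 14

    def run(self, per_machine_logs: Iterable[Iterable[str]]) -> RelatedList:
        total_log = self.collect_logs(per_machine_logs)
        return self.build_list(self.mine(self.extract(total_log)))


# Toy stages (hypothetical): each log line lists one user's zones for a day.
def merge_logs(logs: Iterable[Iterable[str]]) -> List[str]:
    return [line for log in logs for line in log]


def to_transactions(lines: List[str]) -> Transactions:
    return [frozenset(line.split(",")) for line in lines]


def mine_pairs(transactions: Transactions, min_support: float = 0.5) -> FrequentSets:
    n = len(transactions)
    counts: Dict[FrozenSet[str], int] = {}
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            key = frozenset(pair)
            counts[key] = counts.get(key, 0) + 1
    return {p: c / n for p, c in counts.items() if c / n > min_support}


def to_related(frequent: FrequentSets) -> RelatedList:
    related: RelatedList = {}
    for zones in frequent:
        for zone in zones:
            related.setdefault(zone, []).extend(sorted(zones - {zone}))
    return related


device = SelectionDevice(merge_logs, to_transactions, mine_pairs, to_related)
pipeline_result = device.run([["A,B", "A,B,C"], ["A,B"], ["C"]])
```

Passing the stages in as callables mirrors the way later embodiments insert sorting, truncation, deduplication, and display modules between or after the original four.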
The discussion zone list generation module 14 of the previous embodiment may consist of two submodules, a related discussion zone set selection module 14B and a related discussion zone list generation module 14C, as shown in Fig. 9. The related discussion zone set selection module 14B is connected to the frequent itemset generation module 13 and to the related discussion zone list generation module 14C. For each discussion zone in turn, module 14B extracts the discussion zones other than this zone from the discussion zone sets that contain it, generating the related discussion zone set of this zone. Module 14C generates the related discussion zone list from the related discussion zone sets of all discussion zones.
Fig. 10 is a schematic structural diagram of embodiment 3 of the device for selecting related discussion zones according to the invention. On the basis of the previous embodiment, this embodiment adds a discussion zone set sorting module 14A and an effective related discussion zone selection module 15. The discussion zone set sorting module 14A is connected to the frequent itemset generation module 13 and to the related discussion zone set selection module 14B, and the effective related discussion zone selection module 15 is connected to the related discussion zone list generation module 14C. Module 14A sorts the discussion zone sets in the discussion zone frequent itemsets in descending order of support. Module 15 deletes, according to a preset number of related discussion zones, the discussion zones whose rank is greater than this number.
In this embodiment, the log collection module 11 collects the logs of all computers that log in to discussion zones, merges all the logs into a total log, and sends the total log to the log extraction module 12. The log extraction module 12 extracts information from the total log in a predefined format, generates the transaction set, and sends it to the frequent itemset generation module 13. The frequent itemset generation module 13 selects the discussion zone frequent itemsets from the transaction set with a frequent itemset mining algorithm and sends them to the discussion zone set sorting module 14A. Module 14A sorts the discussion zone sets in the frequent itemsets in descending order of support and sends the sorted frequent itemsets to the related discussion zone set selection module 14B. For each discussion zone in turn, module 14B extracts the discussion zones other than this zone from the discussion zone sets that contain it, generates the related discussion zone set of this zone, and sends the related discussion zone sets of all discussion zones to the related discussion zone list generation module 14C. Module 14C places the related discussion zone sets of all discussion zones in one collection, thereby generating the related discussion zone list, and sends the list to the effective related discussion zone selection module 15. According to the preset number of related discussion zones, module 15 deletes from the related discussion zone set of each discussion zone the zones whose rank is greater than this number.
Fig. 11 is a structural diagram of embodiment 4 of the device for selecting related discussion zones according to the invention. On the basis of embodiment 2 of the device, this embodiment adds a related discussion zone sorting module 14D and an effective related discussion zone selection module 15. The related discussion zone sorting module 14D is connected to the related discussion zone list generation module 14C and to the effective related discussion zone selection module 15. Module 14D sorts the discussion zones in the related discussion zone set of each discussion zone in descending order of support.
In this embodiment, the log collection module 11 collects the logs of all computers that log in to discussion zones, merges all the logs into a total log, and sends the total log to the log extraction module 12. The log extraction module 12 extracts information from the total log in a predefined format, generates the transaction set, and sends it to the frequent itemset generation module 13. The frequent itemset generation module 13 selects the discussion zone frequent itemsets from the transaction set with a frequent itemset mining algorithm and sends them to the related discussion zone set selection module 14B. For each discussion zone in turn, module 14B extracts the discussion zones other than this zone from the discussion zone sets that contain it, generates the related discussion zone set of this zone, and sends the related discussion zone sets of all discussion zones to the related discussion zone list generation module 14C. Module 14C places the related discussion zone sets of all discussion zones in one collection, thereby generating the related discussion zone list, and sends the list to the related discussion zone sorting module 14D. Module 14D sorts the discussion zones in the related discussion zone set of each discussion zone in descending order of support and sends the sorted related discussion zone list to the effective related discussion zone selection module 15. According to the preset number of related discussion zones, module 15 deletes from the related discussion zone set of each discussion zone the zones whose rank is greater than this number.
When some related discussion zones are identical to friend discussion zones, the zones that duplicate friend discussion zones need to be deleted. Fig. 12 is a structural diagram of embodiment 5 of the device for selecting related discussion zones according to the invention. On the basis of embodiment 4 of the device, this embodiment adds a discussion zone deduplication module 16. The deduplication module 16 is connected to the effective related discussion zone selection module 15 and removes from the related discussion zone set of each discussion zone the zones that are identical to the friend discussion zones of that zone. The deduplication module 16 may also be placed in embodiment 1 of the device, connected to the discussion zone list generation module 14; in embodiment 2 of the device, connected to the related discussion zone list generation module 14C; or in embodiment 3 of the device, connected to the effective related discussion zone selection module 15.
Fig. 13 is a structural diagram of embodiment 6 of the device for selecting related discussion zones according to the invention. On the basis of embodiment 5 of the device, this embodiment adds a related discussion zone display module 17. The display module 17 is connected to the discussion zone deduplication module 16. According to the discussion zone identifier entered by the user, it queries the related discussion zone list, extracts the related discussion zones, and displays links to them on the web page of that discussion zone.
In this embodiment, the log collection module 11 collects the logs of all computers that log in to discussion zones, merges all the logs into a total log, and sends the total log to the log extraction module 12. The log extraction module 12 extracts information from the total log according to a predefined format, generates a transaction set, and sends the generated transaction set to the frequent itemset generation module 13. The frequent itemset generation module 13 selects the discussion zone frequent itemset from the transaction set by means of a frequent itemset mining algorithm and sends the discussion zone frequent itemset to the related discussion zone set selection module 14B. For each discussion zone, the related discussion zone set selection module 14B sequentially extracts, from the discussion zone sets that contain this discussion zone, the discussion zones other than this discussion zone, generates the related discussion zone set of this discussion zone, and sends the related discussion zone sets of all discussion zones to the related discussion zone list generation module 14C. The related discussion zone list generation module 14C places the related discussion zone sets of all discussion zones into one collection, thereby generating the related discussion zone list, and sends the list to the related discussion zone sorting module 14D. The related discussion zone sorting module 14D sorts the discussion zones in the related discussion zone set of each discussion zone in the list in descending order of support frequency, and sends the sorted related discussion zone list to the effective related discussion zone selection module 15. According to a preset number of related discussion zones, the effective related discussion zone selection module 15 deletes, from the related discussion zone set of each discussion zone in the list, the discussion zones whose rank is greater than this number, and sends the processed related discussion zone list to the discussion zone deduplication module 16. The discussion zone deduplication module 16 removes, from the related discussion zone set of each discussion zone in the list, the discussion zones that are identical to the friendship discussion zones of that discussion zone, and sends the processed related discussion zone list to the related discussion zone display module 17. The related discussion zone display module 17 queries the related discussion zone list according to the discussion zone identifier input by the user, extracts the related discussion zones, and displays links to the related discussion zones on the Web page of that discussion zone.
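The processing chain from transaction set to filtered related discussion zone list can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not name a specific frequent itemset mining algorithm, so brute-force counting of zone pairs stands in for it here, and the function names, the `max_size` parameter, and the `friend_zones` mapping are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def frequent_zone_sets(transactions, min_support, max_size=2):
    """Count co-visited zone sets and keep those meeting min_support.

    Brute-force counting is an assumed stand-in for the unspecified
    frequent itemset mining algorithm (for example Apriori or FP-growth).
    """
    counts = defaultdict(int)
    for visited in transactions:
        zones = sorted(set(visited))
        for size in range(2, max_size + 1):
            for combo in combinations(zones, size):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def related_zone_list(freq_sets, top_n, friend_zones=None):
    """Build the related zone list: extract, sort, truncate, then filter."""
    friend_zones = friend_zones or {}
    support = defaultdict(dict)  # zone -> {related zone: highest support seen}
    for zone_set, count in freq_sets.items():
        for zone in zone_set:
            # Extract every other zone from each set containing this zone.
            for other in zone_set - {zone}:
                support[zone][other] = max(support[zone].get(other, 0), count)
    result = {}
    for zone, others in support.items():
        # Sort by support frequency, highest first (ties broken by name).
        ranked = sorted(others, key=lambda z: (-others[z], z))
        # Delete zones whose rank exceeds the preset number.
        ranked = ranked[:top_n]
        # Remove zones identical to the friendship zones of this zone.
        excluded = friend_zones.get(zone, set())
        result[zone] = [z for z in ranked if z not in excluded]
    return result
```

The sketch applies sorting, truncation, and friendship-zone removal in the same order as modules 14D, 15, and 16, so a friendship zone that occupies one of the top ranks is removed without another zone being promoted in its place.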
In addition, the related discussion zone display module 17 may also be placed in specific embodiment 1 of the related discussion zone selecting device of the present invention and connected to the discussion zone list generation module 14; or placed in specific embodiment 2 of the selecting device and connected to the related discussion zone list generation module 14C; or placed in specific embodiment 3 of the selecting device and connected to the effective related discussion zone selection module 15; or placed in specific embodiment 4 of the selecting device and connected to the effective related discussion zone selection module 15.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the specific embodiments of the present invention may still be modified, or some of their technical features replaced by equivalents, without departing from the spirit of the technical solution of the present invention, and all such modifications and replacements shall fall within the scope of the technical solution claimed by the present invention.

Claims (28)

1. A method for selecting related discussion zones in a network community, characterized in that the method comprises the following steps:
Step 1: collecting the logs of all computers that log in to discussion zones, and merging all the logs into a total log;
Step 2: extracting information from the total log according to a predefined format to generate a transaction set;
Step 3: selecting a discussion zone frequent itemset from the transaction set by means of a frequent itemset mining algorithm;
Step 4: extracting the related discussion zones of each discussion zone from the discussion zone frequent itemset to generate a related discussion zone list.
2. The method for selecting related discussion zones according to claim 1, characterized in that the discussion zone frequent itemset comprises a plurality of discussion zone sets, each discussion zone set comprising a plurality of discussion zones; and step 4 is specifically:
Step 41: for each discussion zone, sequentially extracting, from the discussion zone sets that contain this discussion zone, the discussion zones other than this discussion zone, to generate the related discussion zone set of this discussion zone;
Step 42: generating the related discussion zone list from the related discussion zone set of each discussion zone.
3. The method for selecting related discussion zones according to claim 2, characterized in that, before step 41, the method further comprises step 40: sorting the discussion zone sets in the discussion zone frequent itemset in descending order of support frequency.
4. The method for selecting related discussion zones according to claim 3, characterized in that, after step 42, the method further comprises step 5: according to a preset number of related discussion zones, deleting the discussion zones whose rank is greater than this number.
5. The method for selecting related discussion zones according to claim 2, characterized in that, after step 42, the method further comprises step 43: sorting the discussion zones in the related discussion zone set of each discussion zone in descending order of support frequency.
6. The method for selecting related discussion zones according to claim 5, characterized in that, after step 43, the method further comprises step 5': according to a preset number of related discussion zones, deleting the discussion zones whose rank is greater than this number.
7. The method for selecting related discussion zones according to any one of claims 1-6, characterized in that the method further comprises step 7: removing, from the related discussion zone set of each discussion zone, the discussion zones that are identical to the friendship discussion zones of that discussion zone.
8. The method for selecting related discussion zones according to any one of claims 1-6, characterized in that the method further comprises step 8: querying the related discussion zone list according to the discussion zone identifier input by the user, extracting the related discussion zones, and displaying links to the related discussion zones on the Web page of that discussion zone.
9. The method for selecting related discussion zones according to claim 7, characterized in that the method further comprises step 8: querying the related discussion zone list according to the discussion zone identifier input by the user, extracting the related discussion zones, and displaying links to the related discussion zones on the Web page of that discussion zone.
10. A device for selecting related discussion zones in a network community, characterized by comprising:
a log collection module, used to collect the logs of all computers that log in to discussion zones and to merge all the logs into one total log;
a log extraction module, connected to the log collection module and used to extract information from the total log according to a predefined format to generate a transaction set;
a frequent itemset generation module, connected to the log extraction module and used to select a discussion zone frequent itemset from the transaction set by means of a frequent itemset mining algorithm;
a discussion zone list generation module, connected to the frequent itemset generation module and used to extract the related discussion zones of each discussion zone from the discussion zone frequent itemset to generate a related discussion zone list.
11. The device according to claim 10, characterized in that the discussion zone frequent itemset comprises a plurality of discussion zone sets, each discussion zone set comprising a plurality of discussion zones; and the discussion zone list generation module comprises:
a related discussion zone set selection module, connected to the frequent itemset generation module and used, for each discussion zone, to sequentially extract, from the discussion zone sets that contain this discussion zone, the discussion zones other than this discussion zone, to generate the related discussion zone set of this discussion zone;
a related discussion zone list generation module, connected to the related discussion zone set selection module and used to generate the related discussion zone list from the related discussion zone set of each discussion zone.
12. The device according to claim 11, characterized in that the discussion zone list generation module further comprises:
a discussion zone set sorting module, through which the related discussion zone set selection module is connected to the frequent itemset generation module, the sorting module being used to sort the discussion zone sets in the discussion zone frequent itemset in descending order of support frequency.
13. The device according to claim 12, characterized by further comprising:
an effective related discussion zone selection module, connected to the discussion zone set sorting module and used to delete, according to a preset number of related discussion zones, the discussion zones whose rank is greater than this number.
14. The device according to claim 11, characterized in that the discussion zone list generation module further comprises:
a related discussion zone sorting module, connected to the related discussion zone list generation module and used to sort the discussion zones in the related discussion zone set of each discussion zone in descending order of support frequency.
15. The device according to claim 14, characterized by further comprising:
an effective related discussion zone selection module, connected to the related discussion zone sorting module and used to delete, according to a preset number of related discussion zones, the discussion zones whose rank is greater than this number.
16. The device according to claim 10, characterized by further comprising:
a discussion zone deduplication module, connected to the discussion zone list generation module and used to remove, from the related discussion zone set of each discussion zone, the discussion zones that are identical to the friendship discussion zones of that discussion zone.
17. The device according to claim 11, characterized by further comprising:
a discussion zone deduplication module, connected to the related discussion zone list generation module and used to remove, from the related discussion zone set of each discussion zone, the discussion zones that are identical to the friendship discussion zones of that discussion zone.
18. The device according to claim 12, characterized by further comprising:
a discussion zone deduplication module, connected to the related discussion zone list generation module and used to remove, from the related discussion zone set of each discussion zone, the discussion zones that are identical to the friendship discussion zones of that discussion zone.
19. The device according to claim 13, characterized by further comprising:
a discussion zone deduplication module, connected to the effective related discussion zone selection module and used to remove, from the related discussion zone set of each discussion zone, the discussion zones that are identical to the friendship discussion zones of that discussion zone.
20. The device according to claim 14, characterized by further comprising:
a discussion zone deduplication module, connected to the related discussion zone sorting module and used to remove, from the related discussion zone set of each discussion zone, the discussion zones that are identical to the friendship discussion zones of that discussion zone.
21. The device according to claim 15, characterized by further comprising:
a discussion zone deduplication module, connected to the effective related discussion zone selection module and used to remove, from the related discussion zone set of each discussion zone, the discussion zones that are identical to the friendship discussion zones of that discussion zone.
22. The device according to claim 10, characterized by further comprising:
a related discussion zone display module, connected to the discussion zone list generation module and used to query the related discussion zone list according to the discussion zone identifier input by the user, extract the related discussion zones, and display links to the related discussion zones on the Web page of that discussion zone.
23. The device according to claim 11, characterized by further comprising:
a related discussion zone display module, connected to the related discussion zone list generation module and used to query the related discussion zone list according to the discussion zone identifier input by the user, extract the related discussion zones, and display links to the related discussion zones on the Web page of that discussion zone.
24. The device according to claim 12, characterized by further comprising:
a related discussion zone display module, connected to the related discussion zone list generation module and used to query the related discussion zone list according to the discussion zone identifier input by the user, extract the related discussion zones, and display links to the related discussion zones on the Web page of that discussion zone.
25. The device according to claim 13, characterized by further comprising:
a related discussion zone display module, connected to the effective related discussion zone selection module and used to query the related discussion zone list according to the discussion zone identifier input by the user, extract the related discussion zones, and display links to the related discussion zones on the Web page of that discussion zone.
26. The device according to claim 14, characterized by further comprising:
a related discussion zone display module, connected to the related discussion zone sorting module and used to query the related discussion zone list according to the discussion zone identifier input by the user, extract the related discussion zones, and display links to the related discussion zones on the Web page of that discussion zone.
27. The device according to claim 15, characterized by further comprising:
a related discussion zone display module, connected to the effective related discussion zone selection module and used to query the related discussion zone list according to the discussion zone identifier input by the user, extract the related discussion zones, and display links to the related discussion zones on the Web page of that discussion zone.
28. The device according to any one of claims 16-21, characterized by further comprising:
a related discussion zone display module, connected to the discussion zone deduplication module and used to query the related discussion zone list according to the discussion zone identifier input by the user, extract the related discussion zones, and display links to the related discussion zones on the Web page of that discussion zone.
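For illustration only, steps 1 and 2 of claim 1 (merging logs and generating the transaction set) might look like the following Python sketch. The claims do not disclose the predefined log format, so the whitespace-separated `timestamp user_id zone_id` layout, the one-transaction-per-user grouping, and the function names are assumptions.

```python
from collections import defaultdict


def merge_logs(per_machine_logs):
    """Step 1: merge the log lines of all machines into one total log."""
    total = []
    for lines in per_machine_logs:
        total.extend(lines)
    return total


def extract_transactions(total_log):
    """Step 2: build one transaction (set of visited zones) per user.

    Assumed line format: 'timestamp user_id zone_id'. Lines that do not
    match this hypothetical format are skipped.
    """
    zones_by_user = defaultdict(set)
    for line in total_log:
        parts = line.split()
        if len(parts) != 3:
            continue  # not in the assumed predefined format
        _, user_id, zone_id = parts
        zones_by_user[user_id].add(zone_id)
    # Return transactions in a deterministic order (sorted by user id).
    return [sorted(zones) for _, zones in sorted(zones_by_user.items())]
```

Grouping by user is one plausible reading of a transaction; grouping by login session or by time window would fit the claim wording equally well.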
CNB2006101411656A 2006-10-13 2006-10-13 Method and device for selecting correlative discussion zone in network community Active CN100477593C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006101411656A CN100477593C (en) 2006-10-13 2006-10-13 Method and device for selecting correlative discussion zone in network community

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2006101411656A CN100477593C (en) 2006-10-13 2006-10-13 Method and device for selecting correlative discussion zone in network community

Publications (2)

Publication Number Publication Date
CN1937518A CN1937518A (en) 2007-03-28
CN100477593C true CN100477593C (en) 2009-04-08

Family

ID=37954805

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006101411656A Active CN100477593C (en) 2006-10-13 2006-10-13 Method and device for selecting correlative discussion zone in network community

Country Status (1)

Country Link
CN (1) CN100477593C (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8572094B2 (en) 2007-08-17 2013-10-29 Google Inc. Ranking social network objects
JP5243783B2 (en) * 2007-12-27 2013-07-24 インターナショナル・ビジネス・マシーンズ・コーポレーション Community system, community system activity recording method, and community system activity recording program
CN101383748B (en) * 2008-10-24 2011-04-13 北京航空航天大学 Community division method in complex network
US8909637B2 (en) * 2011-06-03 2014-12-09 Facebook, Inc. Context-based ranking of search results
US9268857B2 (en) 2011-06-03 2016-02-23 Facebook, Inc. Suggesting search results to users before receiving any search query from the users
US9110992B2 (en) 2011-06-03 2015-08-18 Facebook, Inc. Context-based selection of calls-to-action associated with search results
CN103514267A (en) * 2013-09-04 2014-01-15 快传(上海)广告有限公司 Gateway correlation information obtaining method and system
CN106776792B (en) * 2016-11-23 2020-07-17 北京锐安科技有限公司 Network community mining method and device

Also Published As

Publication number Publication date
CN1937518A (en) 2007-03-28

Similar Documents

Publication Publication Date Title
CN100477593C (en) Method and device for selecting correlative discussion zone in network community
CN102054004B (en) Webpage recommendation method and device adopting same
CN101246499B (en) Network information search method and system
CN101593200B (en) Method for classifying Chinese webpages based on keyword frequency analysis
US20110078140A1 (en) Method and system for user guided search navigation
CN104077415B (en) Searching method and device
CN104615627B (en) A kind of event public feelings information extracting method and system based on microblog
KR101252670B1 (en) Apparatus, method and computer readable recording medium for providing related contents
CN101847161A (en) Method for searching web pages and establishing database
CN103873601A (en) Addressing class query word mining method and system
CN101359332A (en) Design method for visual search interface with semantic categorization function
CN102117321A (en) Automated discovery aggregation and organization of subject area discussions
CN102682082B (en) Network Flash searching system and network Flash searching method based on content structure characteristics
JP4896268B2 (en) Information retrieval method and apparatus reflecting information value
CN102270331A (en) Network shopping navigating method based on visual search
CN104503988B (en) searching method and device
CN102314443A (en) Method for correcting search engine and system
CN103294692A (en) Information recommendation method and system
CN103365904A (en) Advertising information searching method and system
Gupta et al. A review on search engine optimization: Basics
CN103324631A (en) Method and device for providing data search
KR100671077B1 (en) Server, Method and System for Providing Information Search Service by Using Sheaf of Pages
CN103605742A (en) Method and device for recognizing network resource entity content page
Amitay et al. Serial Sharers: Detecting Split Identities of Web Authors.
CN103294715A (en) Hidden web data search method and search engine

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant