CN101470754A - Community server system and activity recording method therefor - Google Patents

Community server system and activity recording method therefor

Publication number
CN101470754A
Authority
CN
China
Prior art keywords
activity
user
active matrix
key word
matrix data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200810178618
Other languages
Chinese (zh)
Other versions
CN101470754B (en)
Inventor
西山莉纱
村上明子
安藤史郎
R·H·鲁迪
水田秀行
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Application filed by International Business Machines Corp
Publication of CN101470754A
Application granted
Publication of CN101470754B
Legal status: Expired - Fee Related

Abstract

The present invention provides a community server system and an activity recording method for the community server system. The online activities of a single user or of a plurality of users are stored together with the text information used in those activities, to build data for preference extraction and matching. Activity/preference analysis, which was formerly carried out by projecting user information onto one-dimensional information, can be carried out from the two viewpoints of activity and preference, or from a viewpoint that integrates both, so that the analysis draws on a larger amount of information. The data format used to store user activity together with text information is called an active matrix. The active matrix data are stored in a hard disk drive of the community server. The active matrix is composed of user activities and keywords extracted from the texts used in those activities. The value of each element is the frequency of occurrence, stored over a certain period, of a keyword extracted from the text used in the corresponding activity.

Description

Community server system and activity recording method for the community server system
Technical field
The present invention relates to an online community system, such as a social networking service (SNS), that allows a plurality of users to log in to a server system over a network from their own client computers and to communicate with one another within the system.
Background art
As Web 2.0 technology has come onto the market, many Internet users have taken part in various activities in CGM (consumer generated media) or social networking services (SNS), such as posting blogs (online diaries), reading news articles and searching for information on new products.
Accordingly, several attempts have been made to extract each user's interests and points of attention from such activity logs so that they can be used for marketing and other business activities.
For example, Japanese Patent Application Publication No. 2007-241753 discloses a system in which a server stores an ontology schema, in which personal interest information is conceptually hierarchized, as a template for personal ontologies. A personal ontology extraction device extracts direct subclasses or instances of the root class of the ontology schema as a personal ontology. An unexpected-information extraction device extracts, as unexpected information, classes or instances that are included in one personal ontology but not in another personal ontology whose similarity is equal to or greater than a predetermined value. An unexpected-information presentation device presents the unexpected information extracted by the unexpected-information extraction device to the user from whom that information was not extracted.
The specification of Japanese Patent Application No. 2007-110559, assigned to the present applicant, discloses a technique for systematically assigning interest labels such as "perpetrator", "victim" and "beneficiary" to the companies, individuals, products and any other entities mentioned in an article, in order to suppress advertisements that would make readers of the article feel uncomfortable when they appear alongside it. The advertisement selection mechanism is then controlled so as to avoid advertisements associated with entities to which an interest label has been assigned. The affinity or degree of correlation between an article and an advertisement can be calculated as a numerical value by the advertisement selection mechanism used for this purpose. Advertisements can therefore be suppressed by lowering the affinity value. The degree of suppression is preferably reduced gradually over time. This disclosed technique can be regarded as one example of an attempt to build a system model of the psychological state of a user accessing a web page.
In particular, the technique of Japanese Patent Application Publication No. 2007-241753 extracts a user's interests from text the user writes, whereas the technique described in the specification of Japanese Patent Application No. 2007-110559 extracts a user's interests from text the user reads. Because users in an online community take part in various types of actions and activities, a unified technique for handling multiple types of online activity is needed.
Japanese Patent Application Publication No. 2005-327105 relates to an online community analysis device for finding suitable communities. This online community analysis device has an information acquisition part for acquiring, from a predetermined mail server, information that includes at least a sender identifier identifying the sender of an e-mail and a recipient identifier identifying the recipient of the e-mail; an interest commonality analysis part for calculating, from the information obtained by the acquisition part, a degree of interest commonality representing the interests shared by the sender and the recipient; and an analysis part for analyzing the communities formed by e-mail users according to the degree of interest commonality determined by the interest commonality analysis part. The analysis part uses an interest-degree matrix to calculate the degree of correlation between communities in order to analyze the communities formed by e-mail users.
Japanese Patent Application Publication No. 2005-327105 thus points to a useful technique that uses a matrix to calculate the degree of correlation between communities. However, this technique is used only to analyze e-mail; it is not a technique for handling multiple activities in an integrated manner.
[Patent Document 1] Japanese Patent Application Publication No. 2007-241753
[Patent Document 2] Specification of Japanese Patent Application No. 2007-110559
[Patent Document 3] Japanese Patent Application Publication No. 2005-327105
Summary of the invention
An object of the present invention is to provide a technique that can handle, in an integrated manner, indices of multiple activities in an online community such as an SNS, for example posting blog entries, reading news articles and reading messages.
Another object of the present invention is to provide a technique for calculating, between individuals, groups or communities, or between an individual and a group or community, indices of the multiple activities of individuals, groups or communities in the online community, so as to provide useful results.
The present invention stores the online activities of a single user or of a plurality of users together with the text information used in those activities, for use in preference extraction and matching. As a result, user activity/preference analysis, which in the past was carried out only per activity or per text, can analyze both activity and preference or their combination, so that the analysis draws on a larger amount of information. The activities here include, but are not limited to, posting/reading blogs, writing/reading messages, reading news articles, posting comments on a bulletin board system, buying products, reading product descriptions and reading another user's profile.
In the present invention, the data format used to store user activity together with text information is called, for convenience, an active matrix. The active matrix data are preferably kept in a hard disk drive or the like of the online community server. The active matrix is made up of user activities and the keywords extracted from the texts generated in those activities. Each element of the matrix takes its value from the frequency of occurrence of a keyword stored over a certain period. Suppose that there is more than one activity targeted for keyword extraction and that the activities are represented as A = {a_1, a_2, a_3, ...}. The active matrix is stored in such a way that the keywords W = {w_1, w_2, w_3, ...} extracted from a text d associated with a user activity correspond to the activity a. If keyword w_i occurs once in a text d associated with activity a_j, element ij of the active matrix takes the value 1 (or a weighted value).
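The element definition above can be illustrated with a minimal sketch. It assumes a sparse representation keyed by (keyword, activity); the function name `record_activity`, the activity names and the sample keywords are illustrative and do not come from the specification.

```python
from collections import Counter

def record_activity(matrix, activity, keywords, weight=1.0):
    """Add `weight` to element (keyword, activity) once per keyword occurrence."""
    for keyword in keywords:
        matrix[(keyword, activity)] += weight
    return matrix

# Sparse active matrix: key = (keyword w_i, activity a_j), value = frequency.
matrix = Counter()
record_activity(matrix, "post_blog", ["camera", "lens", "camera"])
record_activity(matrix, "read_news", ["camera"])
```

A pair that never occurred simply reads as zero, which matches the mostly empty, tall matrix described later in the embodiment.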
The active matrix preferably also reflects data stored in the past. According to one aspect of the present invention, the active matrix at time T_i is updated at the next time point T_{i+1} by weighting it and combining it with the active matrix for the interval ΔT = T_{i+1} - T_i (referred to as the interim active matrix).
One active matrix is defined for each user. Furthermore, if a plurality of users form an explicit community, an active matrix of the community is defined by accumulating the activities of those users.
The active matrices so created make it possible to analyze texts in association with users' activity information. For example, by comparing the active matrices of any two users, the activities, texts and events on which the two overlap can be found.
The present invention can be realized by two types of system. The first system generates an interim active matrix from the texts associated with a user and integrates it with the active matrix stored so far. It comprises a language processing part for extracting keyword/preference information from texts, and an active matrix generating part that generates the interim active matrix from these texts and activities and integrates it with the active matrix held in the active matrix storage part described later. The second system stores the active matrix of each user or each static community so that it can be used for analysis or retrieval. This system provides external applications with an interface (API) for various analyses of the active matrix.
As the arithmetic operations needed to analyze active matrices, matrix addition, subtraction, multiplication and row/column degeneration are defined. Adding two active matrices means creating the active matrix of a pseudo-user (or pseudo-community) that integrates the two users (or two communities) who each have an active matrix. Multiplying active matrices determines to what extent the activities, the keywords and the combinations of both in two active matrices are related. Row degeneration and column degeneration of an active matrix yield a weighted keyword vector and a weighted activity vector.
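The operations named above can be sketched as follows. This is a rough illustration only: the nested-dictionary layout, the function names, and in particular the use of plain sums for the two collapsed vectors and of a keyword-wise product for the activity correlation are assumptions, since this passage names the operations without giving their formulas.

```python
ACTIVITIES = ["post_blog", "read_blog", "write_message"]  # illustrative columns

def add(a, b):
    """Element-wise sum over the union of keywords (pseudo-user matrix)."""
    out = {}
    for kw in sorted(set(a) | set(b)):
        row_a, row_b = a.get(kw, {}), b.get(kw, {})
        out[kw] = {act: row_a.get(act, 0.0) + row_b.get(act, 0.0)
                   for act in ACTIVITIES}
    return out

def activity_correlation(a, b):
    """Transpose(A) times B: relate activities of A and B through shared keywords."""
    shared = set(a) & set(b)
    return {(p, q): sum(a[kw].get(p, 0.0) * b[kw].get(q, 0.0) for kw in shared)
            for p in ACTIVITIES for q in ACTIVITIES}

def keyword_vector(a):
    """Collapse the activity columns into one weight per keyword (assumed: sum)."""
    return {kw: sum(row.values()) for kw, row in a.items()}

def activity_vector(a):
    """Collapse the keyword rows into one weight per activity (assumed: sum)."""
    return {act: sum(row.get(act, 0.0) for row in a.values())
            for act in ACTIVITIES}

user_a = {"camera": {"post_blog": 2.0, "read_blog": 1.0}, "lens": {"post_blog": 1.0}}
user_b = {"camera": {"read_blog": 3.0}, "travel": {"write_message": 1.0}}
combined = add(user_a, user_b)
correlation = activity_correlation(user_a, user_b)
```

In this toy data the only shared keyword is "camera", so every non-zero correlation entry comes from that one row.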
According to the present invention, in an online community such as an SNS, an active matrix is calculated across multiple activities and multiple keywords, the result is stored in a storage device of a computer, and the computer performs various analyses such as addition, subtraction, multiplication and degeneration on the active matrices of users or communities. This makes it possible to find user patterns and community patterns hidden across multiple activities. The results can be used for effective marketing, security or any other purpose, increasing the value of the online community itself.
Description of drawings
Fig. 1 is a schematic diagram showing client computers connected to the online community server via the Internet;
Fig. 2 is a block diagram showing the hardware configuration of a client computer;
Fig. 3 is a block diagram showing the hardware configuration of the online community server;
Fig. 4 is a schematic conceptual diagram showing generation of the active matrix;
Fig. 5 is a block diagram showing the active matrix DB and the API functions that use its data;
Fig. 6 is a functional block diagram of the active matrix generation processing;
Fig. 7 is a schematic diagram showing the active matrix generation processing in detail;
Fig. 8 is a flowchart showing the active matrix generation processing;
Fig. 9 is a schematic diagram showing how documents are accessed using the access log;
Fig. 10 is a diagram showing an example of processing for the evolution of the active matrix over time;
Fig. 11 is a diagram showing a concrete example of active matrix addition;
Fig. 12 is a diagram showing a concrete example of active matrix subtraction;
Fig. 13 is a diagram showing a concrete example of calculating an activity/keyword correlation matrix;
Fig. 14 is a diagram showing a concrete example of calculating a keyword correlation matrix;
Fig. 15 is a diagram showing a concrete example of calculating an activity correlation matrix;
Fig. 16 is a diagram showing a concrete example of calculating a list of keywords of interest; and
Fig. 17 is a diagram showing a concrete example of calculating an activity pattern list.
Embodiment
Embodiments of the present invention will now be described with reference to the accompanying drawings. Unless otherwise noted, the same reference numerals denote the same elements throughout the drawings. It should be noted that the following description covers only a preferred embodiment of the present invention, and the present invention is not limited to what is described in this embodiment.
In Fig. 1, the online community server 102 is connected to a plurality of client computers 106a, ..., 106z via the Internet 104. In the system of Fig. 1, the user of a client computer logs in to the online community server 102 over the Internet 104 using a web browser. Specifically, a predetermined URL (uniform resource locator) is entered in the web browser to display a predetermined page. Note that a dedicated client application program may be used for login instead of a web browser.
To log in, the user of a client computer uses a given user ID (identifier) and a password associated with the user ID. Once logged in, the user takes part in activities in the online community such as keeping a diary, browsing and commenting on the diaries of others to which access is permitted, reading news, creating communities of people with similar tastes and interests, chatting, and searching for communities related to personal preferences.
Next, a hardware block diagram of the client computers denoted by reference numerals 106a, 106b, ..., 106z in Fig. 1 is described with reference to Fig. 2. In Fig. 2, the client computer has a main memory 206, a CPU (central processing unit) 204 and an IDE controller 208, all of which are connected to a bus 202. A display controller 214, a communication interface 218, a USB interface 220, an audio interface 222 and a keyboard/mouse controller 228 are also connected to the bus 202. A hard disk drive (HDD) 210 and a DVD drive 212 are connected to the IDE controller 208. If necessary, the DVD drive is used to install programs from a CD-ROM or DVD. A display device 216, preferably with an LCD (liquid crystal display) screen, is connected to the display controller 214. The screen of the online community is shown on the display device 216 through the web browser.
If necessary, devices such as a dedicated controller or an acceleration sensor device can be connected to the USB interface 220. These devices can be used to improve operability in the online community.
A loudspeaker 224 and a microphone 226 are connected to the audio interface 222. If the client computer is equipped with a speech synthesis function, the content of a chat with a partner in the online community can be output from the loudspeaker 224 as spoken words. If the client computer is equipped with a speech recognition function, what the user says into the microphone 226 can be converted into text so that the text can be sent to the partner as chat content.
A keyboard 230 and a mouse 232 are connected to the keyboard/mouse controller 228. The keyboard 230 is typically used to write chat messages in the online community or to describe a community to be searched for. The mouse 232 is used to click links in the online community in order to read news, to select and execute operations from menus, and to select the diary the user wants to read.
The CPU 204 can be based on, for example, either a 32-bit or a 64-bit architecture. Specifically, it can be an Intel Pentium 4 (trademark of Intel Corporation) or an AMD Athlon (TM).
The hard disk drive 210 stores at least an operating system and a web browser (not shown) that runs on the operating system. When the system starts, the operating system is loaded into the main memory 206. Windows XP (trademark of Microsoft), Windows Vista (trademark of Microsoft) or Linux (trademark of Linus Torvalds) can be used as the operating system.
The communication interface 218 uses the TCP/IP (Transmission Control Protocol/Internet Protocol) communication function provided by the operating system to communicate with the online community server 102 over the Ethernet (TM) protocol.
Fig. 3 is a schematic block diagram of the hardware configuration on the online community provider's side. As shown in Fig. 3, the client computers 106a, 106b, ..., 106z are connected to a communication interface 302 of the online community server 102 via the Internet 104. The communication interface 302 is in turn connected to a bus 304. A CPU 306, a main memory (RAM) 308 and a hard disk drive (HDD) 310 are connected to the bus 304.
Although not shown, a keyboard, a mouse and a display are also connected to the online community server 102 so that the whole server can be managed and maintained.
The hard disk drive 310 of the online community server 102 stores an operating system and a user ID/password correspondence table used for login management of the client computers 106a, 106b, ..., 106z. The hard disk drive 310 also stores software such as Apache that makes the online community server 102 act as a web server; when the online community server 102 starts, this software is loaded into the main memory 308. The client computers 106a, 106b, ..., 106z can therefore access the online community server 102 via the TCP/IP protocol.
The hard disk drive 310 of the online community server 102 preferably further stores information on the online community service, such as messages, diaries or blogs from each user and bulletin board systems, in multimedia formats such as HTML (Hypertext Markup Language) files, graphic image files, motion picture files and music files.
A user can write a diary or blog and post to the bulletin board system; other users can read the blogs and the bulletin board system and post comments according to their own access rights.
As described later, the hard disk drive 310 stores a module for calculating the active matrix according to the present invention and a module for performing operations that extract various information from the calculated active matrix.
The configuration of blogs and bulletin board systems and the associated user access control can be implemented with tools in known programming languages such as Perl, Ruby, PHP, Servlet and JSP. Alternatively, C, C++, C# and Java (trademark of Sun Microsystems) can be used.
Furthermore, by embedding JavaScript (TM) appropriately in the HTML files, the system can be configured to cooperate with Perl, Ruby or PHP.
The contents of blogs, bulletin board systems, news and the like are stored in a content management database (CMDB) so that they are managed centrally in an integrated manner.
As the online community server 102, the IBM (trademark of International Business Machines Corporation) System x, System i and System p server models available from International Business Machines Corporation can be used. In this case, applicable operating systems include AIX (trademark of International Business Machines Corporation), UNIX (trademark of The Open Group), Linux (TM) and Windows (TM) 2003 Server.
Next, the basic concept of the active matrix according to the present invention is described with reference to Fig. 4.
In Fig. 4, a user 402 can take the actions shown as activities 404 (participation activities) in the online community service, namely "write message" 404a, "post comment in bulletin board system" 404b, "post blog" 404c, ..., "read message" 404i, "read comment in bulletin board system" 404j, "read blog" 404k, "read news" 404l and so on. The types of activity in which a user can take part in the online community service are predetermined by the functions offered by the provider of the online community server 102.
Then, for each of the above activities, the texts that the user 402 wrote or read within a predetermined period, namely text 406a read/written as messages, text 406b read/written in the bulletin board system, text 406c read/written as blogs and text 406d read as news, are temporarily stored in the hard disk drive 310 (Fig. 3) as text information 406. It should be noted that although texts read as messages and texts written as messages are collectively referred to as text 406a, the texts read and the texts written are stored so that they can be identified as different types of text. The same holds for texts in the bulletin board system and in blogs. Known parsing techniques, such as those described in Japanese Patent Application Publication Nos. 2001-84250, 2002-251402 and 2004-246440, are then applied by an analysis/calculation module stored in the hard disk drive 310 to count how often keywords and particular expressions occur, so as to obtain the active matrix 408 of the user 402.
More specifically, the active matrix 408 according to the present invention has keywords in its rows and activities in its columns. In general, there are several thousand rows and a few dozen columns, so the matrix is strongly elongated in the vertical direction. The keywords appearing in the rows are all the keywords extracted by parsing all the texts stored as text information 406.
The value of the element in a given keyword row and activity column therefore represents the frequency of occurrence of that keyword in the texts associated with that activity.
The values of the active matrix so created are stored in the hard disk drive 310 as a separate file for each user. The storage format can be any format, for example CSV, HTML or XML, as long as it can be recognized by programming tools such as C, C++, C#, Java (TM), Perl, Ruby or PHP.
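As a rough illustration of the per-user file described above, the following sketch writes an active matrix to CSV and reads it back using Python's standard `csv` module. The column layout (a `keyword` column followed by one column per activity) and the function names are assumptions; the specification only says that any format such as CSV, HTML or XML may be used.

```python
import csv
import io

def dump_matrix(matrix, activities, stream):
    """Write one row per keyword: keyword, then one value per activity."""
    writer = csv.writer(stream)
    writer.writerow(["keyword", *activities])
    for keyword in sorted(matrix):
        writer.writerow([keyword, *(matrix[keyword].get(a, 0.0) for a in activities)])

def load_matrix(stream):
    """Read the layout produced by dump_matrix back into nested dicts."""
    reader = csv.reader(stream)
    activities = next(reader)[1:]
    matrix = {row[0]: dict(zip(activities, map(float, row[1:]))) for row in reader}
    return matrix, activities

matrix = {"camera": {"post_blog": 2.0, "read_blog": 0.5}, "lens": {"post_blog": 1.0}}
activities = ["post_blog", "read_blog"]
buffer = io.StringIO()          # stands in for the per-user file on disk
dump_matrix(matrix, activities, buffer)
buffer.seek(0)
loaded, columns = load_matrix(buffer)
```

Missing elements are written as explicit zeros here for simplicity; with several thousand keyword rows a sparse layout might be preferable.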
Fig. 5 shows the active matrix DB 502, which stores the data of active matrices created in the form described with reference to Fig. 4. In practice, the active matrix DB 502 is preferably stored in the hard disk drive 310 in CMDB format. As described above, the active matrix DB 502 stores active matrix data 504a, 504b, 504c, ... for each user.
The active matrix DB 502 further stores active matrix data 506a, 506b, 506c, ... of user communities. The active matrix data of a user community are calculated by applying the operations defined in this embodiment of the present invention to the active matrix data of the users belonging to the community. This algorithm is described in detail later.
Therefore, given the active matrix data 504a, 504b, 504c, ... of individual users and the active matrix data 506a, 506b, 506c, ... of each user community, the operations of the API 508 defined in the embodiment of the present invention can be used to carry out the following processing: extracting a user's interests, as indicated by reference numeral 510; extracting users with interests similar to those of a given user, as indicated by reference numeral 512; extracting communities with interests similar to those of a given user, as indicated by reference numeral 514; and extracting communities with interests similar to those of another community, as indicated by reference numeral 516. It should be understood that the processing available in the present embodiment is not limited to what is shown in Fig. 5, which is merely illustrative. The above processing and other processing are described in detail later.
Fig. 6 is a diagram that explains the generation of the active matrix from another viewpoint. In Fig. 6, text data 602 are the texts of content written or accessed by the user and are the same as the text information 406 in Fig. 4. The processing module of the language processing part 604 is stored in the hard disk drive 310 and, as needed, is loaded into the main memory 308 and executed by the CPU 306.
Basically, all language-related processing is carried out in the language processing part 604. Specifically, it performs morphological analysis, parsing, keyword extraction, proper noun extraction, reputation expression extraction and the like. If necessary, the language processing part 604 also refers to dictionaries or an external knowledge database. For this purpose, dictionaries (not shown) such as a thesaurus are stored in the hard disk drive 310. The external knowledge database can be stored in the hard disk drive 310 together with the dictionaries, or can be referred to over the network as a web service.
Because such processing is known from Japanese Patent Application Publication Nos. 2001-84250, 2002-251402 and 2004-246440, a repeated description is omitted here.
For each action (activity), such as writing a message, posting a comment in the bulletin board system, posting a blog, reading a message, reading a comment in the bulletin board system, reading a blog or reading news, the active matrix generating part 606 organizes and weights the keywords extracted by the language processing part 604 so as to generate an active matrix.
The active matrix generating part 606 accumulates the active matrix data so generated in the active matrix DB 502. The data are stored in the active matrix DB 502 as shown in Fig. 5.
The API 508 uses the active matrix data stored in the active matrix DB 502 to perform calculations for various applications 612a, 612b, 612c, ....
Fig. 7 is a schematic diagram that explains the processing of the active matrix generating part 606 in detail. In Fig. 7, the text information block 406 is the same as that in Fig. 4. The text information block 406 stores texts 702a and 702b of blog articles written by a specific user, blog articles 704a, 704b, ... read by the specific user, messages 706a, 706b, ... written by the specific user, and a message 708a read by the specific user.
From the texts 702a and 702b of the blog articles written, the active matrix generating part 606 extracts a keyword column 702x and a column 702y indicating the number of times each keyword is used.
From the texts 704a, 704b, ... of the blog articles read, the active matrix generating part 606 extracts a keyword column 704x and a column 704y indicating the number of times each keyword is used.
From the texts 706a, 706b, ... of the messages written, the active matrix generating part 606 extracts a keyword column 706x and a column 706y indicating the number of times each keyword is used.
From the text 708a of the message read, the active matrix generating part 606 extracts a keyword column 708x and a column 708y indicating the number of times each keyword is used.
The active matrix generating part 606 generates the active matrix 710 from the keyword columns 702x, 704x, 706x and 708x of the respective activities and the columns 702y, 704y, 706y and 708y indicating the number of times each keyword is used.
In other words, the columns indicating the number of times the keywords are used are, in principle, arranged as the column vectors of the corresponding activities. Note that in this embodiment the number of times a keyword is used is preferably normalized by dividing it by the number of documents read or written that are associated with the keyword. As a result, the "read blog" and "write message" columns of the active matrix 710 contain decimal fractions.
It should be noted that the active matrix shown in Fig. 7 is a simplified illustration; in practice the number of rows can reach several thousand, depending on the keywords used in the texts. Likewise, because the number of columns depends on the number of defined activities, the numbers of both rows and columns are larger than shown.
Next, the processing for constructing the active matrix is described with reference to the flowchart of Fig. 8. Before this description, suppose that the activity vector is defined as a ≡ {a_1, a_2, ..., a_m}. Here, a_1, a_2, ..., a_m are the activities expected in this community system, which include, for example, posting/reading blogs, writing/reading messages, reading news, writing/reading comments in the bulletin board system, buying products, reading product descriptions and reading other users' profiles.
In the flowchart of Fig. 8, the processing described below is carried out for each activity a_i (i = 1, ..., m) between step 802 and step 816. In step 804, the document set D_i generated for activity a_i in the given period is extracted. A "document set" here refers, for example, to the blog articles 702a and 702b written, in the text information 406 of Fig. 7. This is the document set for the activity "post blog". A document set can contain one or more documents. In the text information 406 of Fig. 7, the documents contained in the document set corresponding to "post blog" are document 702a and document 702b. The following processing is carried out for each document d_ij (j = 1, ..., n) in the document set D_i.
In step 808, morphological analysis and parsing are applied to document d_ij to extract the keywords in document d_ij and their frequencies of occurrence.
In step 810, the keywords so extracted and their frequencies of occurrence are stored in W_i and X_i, respectively. W_i and X_i are in fact vectors. Note that in step 810 the values in X_i can be normalized by dividing them by the number of documents in the document set D_i.
In step 812, the iteration over documents d_ij (j = 1, ..., n) ends, and in step 814 the W_i and X_i corresponding to activity a_i are complete.
Thus, by step 816, W_i and X_i have been obtained for every activity a_i. In step 818, the keyword labels contained in all the W_i are arranged vertically. Then, in step 820, for each activity a_i, the frequency from X_i is entered in the row position corresponding to each keyword in W_i. An interim active matrix is thereby generated. It is called "interim" because it is used to calculate the evolution of the active matrix, as described later.
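The flow of Fig. 8 can be sketched roughly as follows. The regular-expression tokenizer is only a stand-in for the morphological analysis and parsing of step 808 (it does no stop-word filtering or proper-noun detection), and the function names and sample documents are illustrative assumptions.

```python
import re
from collections import Counter

def extract_keywords(text):
    """Stand-in for morphological analysis and parsing (step 808)."""
    return re.findall(r"[a-z]+", text.lower())

def build_interim_matrix(documents_by_activity, normalize=True):
    """Return (keywords, activities, rows) for one time period."""
    frequencies = {}
    for activity, documents in documents_by_activity.items():   # steps 802-816
        counts = Counter()
        for document in documents:                               # loop ends at 812
            counts.update(extract_keywords(document))            # steps 808-810
        if normalize and documents:
            # Optional normalization by the number of documents in D_i.
            counts = {kw: n / len(documents) for kw, n in counts.items()}
        frequencies[activity] = counts                           # step 814
    activities = list(documents_by_activity)
    # Step 818: arrange every keyword label vertically (one row each).
    keywords = sorted({kw for c in frequencies.values() for kw in c})
    # Step 820: enter each activity's frequency at the keyword's row position.
    rows = [[frequencies[a].get(kw, 0) for a in activities] for kw in keywords]
    return keywords, activities, rows

docs = {
    "post_blog": ["New camera lens", "Camera review"],
    "read_news": ["Camera market news"],
}
keywords, activities, rows = build_interim_matrix(docs)
```

With normalization on, "camera" gets 2 / 2 = 1.0 in the "post_blog" column, which is how fractional values such as 0.5 arise for keywords that appear in only one of the two documents.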
Fig. 9 illustrates how documents are actually accessed in steps 804 and 806 of Fig. 8. In the community server system shown in Fig. 3, an access log 902 is recorded on the hard disk drive 310. The access log 902 contains a date 902a, a time 902b, a process 902c, a user ID 902d, and a document ID 902e.
For example, when the user with user ID 014623 writes a message as a blog entry and clicks the save button 906 on screen 904, a record 902f indicating the process called PostBlog is written to the access log 902. The content of the blog entry is then located from the document ID of record 902f, and its text is stored on the hard disk drive 310 as file 908. Similarly, the text associated with message 910 is also stored on the hard disk drive 310.
The texts 908 and 910 created in this way are used as the text information 406 shown in Figs. 4 and 7 to create the temporary activity matrix.
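As a rough sketch of how such a log could be grouped into per-activity document sets, consider the following Python fragment. The log line layout, the date format, the `ReadNews` process name, and the document IDs are assumptions made for illustration; only the five fields and the `PostBlog` process name come from the description above.

```python
from collections import defaultdict

# Hypothetical log lines: date, time, process, user ID, document ID.
LOG = """\
2007-12-01 10:15:00 PostBlog 014623 D0001
2007-12-01 10:20:31 ReadNews 014623 N0042
2007-12-02 09:02:10 PostBlog 020001 D0002
"""

def documents_by_activity(log_text, user_id):
    """Group the document IDs touched by one user by the process (activity) name."""
    groups = defaultdict(list)
    for line in log_text.splitlines():
        date, time, process, uid, doc_id = line.split()
        if uid == user_id:
            groups[process].append(doc_id)
    return dict(groups)

result = documents_by_activity(LOG, "014623")
```

The document IDs collected per activity would then be used to look up the stored text files from which keywords are extracted.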
The following describes how, in this embodiment, the elements of the activity matrix are updated as they evolve over time.
In this embodiment, the time evolution is computed as follows:
$A_{t+\Delta t} = W_k A_t W_a + T_{\Delta t}$
Here, $A_t$ is the activity matrix at time $t$, $W_k$ is the weighting coefficient for each keyword, $W_a$ is the forgetting coefficient for each activity, and $T_{\Delta t}$ is the temporary activity matrix at time $t + \Delta t$. The temporary activity matrix is generated as described with reference to Figs. 7-9.
Fig. 10 shows specific examples of $W_k$ and $W_a$. As illustrated, when the number of keyword labels is $N$, $W_k$ is an $N \times N$ square matrix. In the simplest case, only the diagonal components take nonzero positive values and all other components are zero. For example, components corresponding to proper-noun keywords are set to 1.0 or a value close to 1.0, while components corresponding to common nouns or generic keywords are set to values much smaller than 1.0. As a result, values corresponding to proper-noun keywords are retained without much attenuation, whereas values corresponding to common-noun keywords decay quickly.
When the number of activity types is $M$, $W_a$ is an $M \times M$ square matrix. In the simplest case, only the diagonal components take nonzero positive values and all other components are zero. For example, components corresponding to activities that leave a deep impression on a person, such as writing a blog, take 1.0 or a value as close to 1.0 as possible, while components corresponding to activities that leave a shallower impression, such as reading news, take values much smaller than 1.0.
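In the simplest (diagonal) case, the product $W_k A_t W_a$ reduces to scaling each element by its keyword weight and its activity weight. The following Python sketch illustrates this; the function name and all numeric values are illustrative assumptions, not values from Fig. 10.

```python
def decay(matrix, keyword_weights, activity_weights):
    """Compute W_k A W_a when W_k and W_a are diagonal.

    keyword_weights holds the diagonal of W_k (one value per row),
    activity_weights holds the diagonal of W_a (one value per column).
    """
    return [
        [kw * value * aw for value, aw in zip(row, activity_weights)]
        for row, kw in zip(matrix, keyword_weights)
    ]

# Rows: a proper-noun keyword (weight 1.0) and a common-noun keyword (weight 0.5).
# Columns: writing a blog (weight 1.0) and reading news (weight 0.5).
A = [[4.0, 2.0],
     [4.0, 2.0]]
decayed = decay(A, [1.0, 0.5], [1.0, 0.5])
```

The proper-noun row under the blog-writing column keeps its value, while the common-noun row under the news-reading column is attenuated by both weights.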
The product $W_k A_t W_a$ can be computed by ordinary matrix multiplication, but the addition in $W_k A_t W_a + T_{\Delta t}$ cannot always use ordinary matrix addition. This is because generating $T_{\Delta t}$ may reveal new keywords that have not appeared before, so the keyword row labels of $A_t$ and $T_{\Delta t}$ do not necessarily match.
Therefore, the addition $A_t + T = A_{t+1}$ is generally given by the extended matrix addition defined as follows.
Let $A_t \equiv \{a_{n,m}\}$, where $a_{n,m}$ is the $(n, m)$ component of the matrix,
with keyword labels $w = \{w_1, w_2, \ldots, w_n\}$.
Similarly, let $T \equiv \{t_{n',m}\}$,
with keyword labels $w' = \{w'_1, w'_2, \ldots, w'_{n'}\}$.
Then $A_{t+1} \equiv \{a'_{n'',m}\}$,
with keyword labels $w'' = \{w''_1, w''_2, \ldots, w''_{n''}\}$.
Under these definitions, three cases are distinguished:
For a keyword $w_i$ that appears in $A_t$ but not in $T$ (a word used in the past but not used now), $w''_i$ and $a'_{ij}$ are determined as:
$w''_i = w_i$ ($1 \le i \le n$),
$a'_{ij} = a_{ij}$ ($1 \le j \le m$).
For a keyword $w_i$ that appears in both $A_t$ and $T$ (a word used in the past and also used now), $w''_i$ and $a'_{ij}$ are determined as:
$w''_i = w_i$ ($1 \le i \le n$),
$a'_{ij} = a_{ij} + t_{kj}$ ($w_i = w'_k$, $1 \le j \le m$, $1 \le k \le n'$).
For a keyword $w'_k$ that does not appear in $A_t$ but appears in $T$ (a word not used in the past but used now), $w''_i$ and $a'_{ij}$ are determined as:
$w''_i = w'_k$ ($n+1 \le i \le n''$),
$a'_{ij} = t_{kj}$ ($1 \le j \le m$).
With the extended addition described above, the activity matrix $A_t$ can be updated over time according to the equation $A_{t+\Delta t} = W_k A_t W_a + T_{\Delta t}$. This computation is performed automatically by a computer program prepared in advance. Those of ordinary skill in the art will understand that such a computer program can be written in any programming language, such as C++, C#, or Java. All the arithmetic operations defined below can likewise be written in a programming language, stored in executable form, and invoked as needed.
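The three cases of the extended addition can be sketched as follows. This is a hedged illustration rather than the program used in the embodiment: representing a matrix as a mapping from keyword label to a row of per-activity values is an assumption, as are the function name and the sample data.

```python
def extended_add(a_rows, t_rows, n_activities):
    """Add two keyword-labelled matrices whose row labels may differ.

    Each matrix is a dict mapping a keyword label to a list of
    n_activities values (one per activity column).
    """
    result = {}
    for keyword, row in a_rows.items():
        # Keyword only in A: row is kept as is (adding zeros).
        # Keyword in both A and T: rows are added element by element.
        t_row = t_rows.get(keyword, [0.0] * n_activities)
        result[keyword] = [a + t for a, t in zip(row, t_row)]
    for keyword, t_row in t_rows.items():
        # Keyword only in T: appended as a new row after the existing labels.
        if keyword not in result:
            result[keyword] = list(t_row)
    return result

A = {"java": [1.0, 0.0], "tokyo": [0.5, 1.0]}
T = {"tokyo": [1.0, 0.0], "python": [2.0, 0.0]}
updated = extended_add(A, T, 2)
```

In a full update step, `A` would first be scaled by the keyword and activity weights before the temporary matrix `T` is added.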
Note that although the activity matrix is normally updated once a day, any other update frequency, such as twice a day or once a week, may be chosen according to how active the community is.
An activity matrix created and updated over time in this way enables a variety of important applications, which are described below.
[absolute value of the activity matrix]
The absolute value of an activity matrix $A$ is defined by the following equation, where $a_{ij}$ is the $ij$ component of $A$:
[equation 1]
$|A| = \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|$
The absolute value is seldom used on its own; its main use is to normalize values in other computations.
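As a small worked example with made-up numbers (not taken from the figures), for a matrix with two keyword rows and two activity columns the absolute value is simply the sum of the magnitudes of all components:

```latex
A = \begin{pmatrix} 1 & 0 \\ 0.5 & 1 \end{pmatrix}, \qquad
|A| = |1| + |0| + |0.5| + |1| = 2.5
```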
[addition of two activity matrices]
The sum of two activity matrices $A_k$ and $A_l$ can be defined by the equation shown below. Here, the absolute value of a matrix is as defined above. Before the equation is computed, the keyword labels are aligned. That is, the union of the keyword labels of $A_k$ and $A_l$ is taken as the set of keyword labels. The rows of $A_k$ and $A_l$ are then extended according to the resulting labels, that is, zeros are entered in the rows of keywords that did not previously appear, and the following equation is computed:
[equation 2]
$A_k \oplus A_l = \dfrac{|A_k|}{|A_k| + |A_l|} A_k + \dfrac{|A_l|}{|A_k| + |A_l|} A_l$
This sum is then applied across the activity matrices of all users belonging to a community to obtain the activity matrix of the community.
Fig. 11 shows an example of the sum of two activity matrices A and B. As this example makes clear, the sum of A and B is obtained as suitably normalized values.
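The label alignment and the weighted sum can be sketched in Python as follows. The dictionary representation (keyword label mapped to a row of per-activity values), the function names, and the sample values are assumptions for illustration and are not the values of Fig. 11; the sketch also assumes that at least one of the two matrices is nonzero.

```python
def norm(rows):
    """Absolute value of a matrix: sum of the magnitudes of all components."""
    return sum(abs(v) for row in rows.values() for v in row)

def align(a, b, n_activities):
    """Extend both matrices to the union of their keyword labels, filling zeros."""
    labels = list(a) + [w for w in b if w not in a]
    zero = [0.0] * n_activities
    return ({w: a.get(w, zero) for w in labels},
            {w: b.get(w, zero) for w in labels})

def weighted_sum(a, b, n_activities):
    """Sum of two matrices, each weighted by its share of the total absolute value."""
    na, nb = norm(a), norm(b)
    total = na + nb            # assumed nonzero in this sketch
    a2, b2 = align(a, b, n_activities)
    return {w: [(na * x + nb * y) / total for x, y in zip(a2[w], b2[w])]
            for w in a2}

A = {"java": [3.0, 1.0]}                          # |A| = 4
B = {"java": [1.0, 1.0], "tokyo": [2.0, 0.0]}     # |B| = 4
S = weighted_sum(A, B, 2)
```

Because each matrix is weighted by its own absolute value, a matrix with more recorded activity contributes more to the sum.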
[subtraction of two activity matrices]
The difference of two activity matrices $A_k$ and $A_l$ can be defined by the equation shown below. Here, the absolute value of a matrix is as defined above. Before the equation is computed, the keyword labels are aligned. That is, the union of the keyword labels of $A_k$ and $A_l$ is taken as the set of keyword labels. The rows of $A_k$ and $A_l$ are then extended according to the resulting labels, that is, zeros are entered in the rows of keywords that did not previously appear, and the following equation is computed:
[equation 3]
The subtraction of activity matrices can be used to determine the difference between users, between a user and a community, or between communities. Note that the outermost absolute value on the right-hand side simply converts the matrix elements to positive values; it differs from the absolute value of a matrix defined above.
Fig. 12 shows an example of the subtraction of two activity matrices A and B. As this example makes clear, the difference of A and B is obtained as suitably normalized values.
[multiplication of two activity matrices]
Three types of multiplication of activity matrices are considered here: activity-keyword correlation, activity correlation, and keyword correlation. In all of these computations, the keyword labels are aligned beforehand. That is, the union of the keyword labels of $A_k$ and $A_l$ is taken as the set of keyword labels. The rows of $A_k$ and $A_l$ are then extended according to the resulting labels, that is, zeros are entered in the rows of keywords that did not previously appear, and the equations below are computed.
The activity-keyword correlation is defined by the following equation:
[equation 4]
$C = A_k \otimes_{\mathrm{wordact}} A_l$
Its specific calculation is as follows:
[equation 5]
The output of this computation shows the overlap of both interests and activities in the two activity matrices. This makes it possible to extract items that are related in terms of both interest and activity.
Here, $a^k_{ij}$ and $a^l_{ij}$ are the $ij$ components of the matrices $A_k$ and $A_l$, respectively.
Fig. 13 shows an example of computing the activity-keyword correlation between two activity matrices A and B.
The activity correlation is defined by the equation shown below. Here, $A^T$ is the transpose of $A$, and the operation is ordinary matrix multiplication. If the activity matrix is an $m \times n$ matrix, the result is an $n \times n$ matrix.
[equation 6]
$A \otimes_{\mathrm{act}} B = A^T B$
The output of this computation shows the overlap of activities in the two activity matrices. It reveals which types of activity used common keywords.
Fig. 14 shows an example of computing the keyword correlation of two activity matrices A and B.
The keyword correlation is defined by the equation shown below. Here, $B^T$ is the transpose of $B$, and the operation is ordinary matrix multiplication. If the activity matrix is an $m \times n$ matrix, the result is an $m \times m$ matrix.
[equation 7]
$A \otimes_{\mathrm{word}} B = A B^T$
The output of this computation shows the overlap of keywords in the two activity matrices. It reveals which types of keyword were involved in common activities.
Fig. 15 shows an example of computing the activity correlation between two activity matrices A and B.
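The two products $A^T B$ and $A B^T$ can be illustrated with plain Python lists. This is a minimal sketch under the assumption that both matrices already share the same keyword rows and activity columns; the helper names and the numbers are invented and do not reproduce Figs. 14 or 15.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    """Ordinary matrix product of x (p x q) and y (q x r)."""
    y_cols = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_cols]
            for row in x]

# Rows: keywords ("java", "tokyo"); columns: activities (post blog, read news).
A = [[2.0, 0.0],
     [1.0, 1.0]]
B = [[1.0, 1.0],
     [0.0, 2.0]]

activity_corr = matmul(transpose(A), B)   # A^T B: activities x activities
keyword_corr = matmul(A, transpose(B))    # A B^T: keywords x keywords
```

With keywords as rows, $A^T B$ sums over keywords and therefore compares activities, while $A B^T$ sums over activities and therefore compares keywords.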
[degeneration of the activity matrix]
The degeneration of an activity matrix comprises degeneration in the row direction and degeneration in the column direction. The degeneration in the row direction may be called the list of keywords of interest and is given by the following equation:
$V_{\mathrm{word}} = A\,W_{\mathrm{act}}^{T}$
Here, $A$ is the activity matrix, $W_{\mathrm{act}}$ is the activity weight vector, and $W_{\mathrm{act}}^{T}$ is its transpose. The dimension of $W_{\mathrm{act}}$ equals the number of activity types, and its components in principle take values between 0 and 1. The values are chosen so that larger values (weights) are given to more important activities. For example, posting a blog is given a larger weight than reading a blog. In this way, a weighted keyword list relating to an individual or a community can be obtained.
Fig. 16 shows an example of the degeneration of activity matrix A in the row direction, that is, an example of determining the list of keywords of interest.
The degeneration in the column direction may be called the activity pattern list and is given by the following equation:
$V_{\mathrm{act}} = W_{\mathrm{word}}^{T} A$
Here, $A$ is the activity matrix, $W_{\mathrm{word}}$ is the keyword weight vector, and $W_{\mathrm{word}}^{T}$ is its transpose. The dimension of $W_{\mathrm{word}}$ equals the number of extracted keywords, and its components in principle take values between 0 and 1. The values are chosen so that larger values (weights) are given to keywords that deserve particular attention.
Fig. 17 shows an example of the degeneration of activity matrix A in the column direction, that is, an example of determining the activity pattern list.
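Both degenerations are weighted sums over one axis of the matrix, as the following Python sketch shows. The function names, weights, and matrix values are illustrative assumptions and do not correspond to Figs. 16 or 17.

```python
def keyword_list(matrix, activity_weights):
    """Row-direction degeneration: one weighted score per keyword (row)."""
    return [sum(v * w for v, w in zip(row, activity_weights)) for row in matrix]

def activity_pattern(matrix, keyword_weights):
    """Column-direction degeneration: one weighted score per activity (column)."""
    return [sum(w * v for w, v in zip(keyword_weights, col))
            for col in zip(*matrix)]

# Rows: two keywords; columns: posting a blog, reading a blog.
A = [[2.0, 4.0],
     [1.0, 0.0]]
v_word = keyword_list(A, [1.0, 0.5])       # posting weighted above reading
v_act = activity_pattern(A, [1.0, 0.5])    # first keyword weighted above second
```

The first result has one entry per keyword and the second has one entry per activity.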
Some typical applications are described below to aid understanding of the present invention.
[application 1]
Suppose a user wants to find communities in the community system that are close to his or her interests. To this end, the community system computes the keyword correlation between the user's activity matrix and the activity matrices of the existing communities stored on the hard disk drive 310. From the resulting keyword correlation matrix, the community system presents to the user, through a suitable GUI (graphical user interface), only those keywords whose components are equal to or greater than a predetermined value. As a result, the user can discover communities characterized by keywords that the user has not been interested in so far but may want to engage with in the future.
[application 2]
Suppose the community system is being harmed by a spam attack. With conventional methods, however, it is difficult to identify who in the whole system is a spammer. The system operator therefore uses the functions of the present invention to obtain, for each user, the degeneration of the activity matrix in the column direction and so construct an activity pattern list. A user whose activity pattern list shows nothing but an excessive number of blog comments can then be presumed to be a potential spammer.
[application 3]
One of the main sources of revenue for a community system is advertising. With conventional approaches, however, it is difficult to determine which advertisements are effective in a community. According to this embodiment, individual activity matrices are generated, and a community activity matrix can be generated from them. The degeneration of the community matrix in the row direction, that is, the list of keywords of interest, can then be generated to obtain a weighted keyword list. Existing keyword-targeted advertising schemes can thus be used to present effective advertisements on the community's screens.
It should be noted that the matrix addition, subtraction, and multiplication operations shown in this embodiment are merely illustrative examples of implementing the present invention, and the present invention is not limited to these particular equations. For example, for addition and subtraction, any algorithm may be chosen as long as it comprises an actual matrix addition or subtraction and an operation that normalizes the resulting values into an appropriate range.
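One possible way to screen activity pattern lists for this kind of user is sketched below. The text does not specify a concrete rule, so the thresholds, the `WriteComment` activity name, and the user IDs are purely hypothetical.

```python
def flag_comment_only_users(patterns, activities,
                            comment_activity="WriteComment",
                            share=0.9, minimum=50.0):
    """Return users whose activity pattern is dominated by blog comments.

    patterns maps a user ID to an activity pattern list (one score per
    activity). share and minimum are hypothetical thresholds.
    """
    idx = activities.index(comment_activity)
    flagged = []
    for user, scores in patterns.items():
        total = sum(scores)
        if total > 0 and scores[idx] >= minimum and scores[idx] / total >= share:
            flagged.append(user)
    return flagged

activities = ["PostBlog", "ReadNews", "WriteComment"]
patterns = {
    "u1": [5.0, 20.0, 3.0],     # varied activity
    "u2": [0.0, 1.0, 200.0],    # almost nothing but comments
}
suspects = flag_comment_only_users(patterns, activities)
```

Users flagged this way are only candidates; an operator would still need to review them before treating them as spammers.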

Claims (14)

1. A community server system to which a plurality of users connect through their own client computers so as to be able to communicate by reading or writing documents and the like, the system comprising:
a storage device from which the system can read data and to which the system can write data;
means for storing, in the storage device as a log, each individual user's activities and the documents associated with those activities, together with a user ID and activity content for each; and
means for analyzing, on the basis of the log, the documents read or written in each activity of an individual user, and determining keywords from the documents and their frequencies of occurrence, so as to write them to the storage device for each activity of each individual user.
2. The system according to claim 1, wherein the result of analyzing the documents read or written in each user activity is stored in the storage device as matrix data, with keywords labeled in a first direction and activities labeled in a second direction perpendicular to the first direction, and the frequency of occurrence of each keyword is stored at the intersection indicated by these labels.
3. The system according to claim 1, wherein the activities include posting a blog, reading a blog, writing a message, and reading a message in the community server.
4. The system according to claim 1, further comprising means for dividing the frequency of occurrence of a keyword by the number of documents associated with the activity.
5. The system according to claim 2, further comprising means for calculating the sum of the matrix data of the users in a community so as to obtain matrix data associated with the users in the community.
6. The system according to claim 5, further comprising means for calculating the difference between the matrix data of the community and the matrix data of a specific user.
7. The system according to claim 2, further comprising means for multiplying the matrix data by a degradation parameter, and means for adding newly calculated matrix data to the matrix data multiplied by the degradation parameter.
8. An activity recording method for a community server system to which a plurality of users connect through their own client computers so as to be able to communicate by reading or writing documents, the method comprising:
a step of storing, in a storage device of the community server system as a log, each individual user's activities and the documents associated with those activities, together with a user ID and activity content; and
a step of analyzing the documents read or written in each activity of an individual user to determine the keywords in the documents and their frequencies of occurrence, so as to write them to the storage device.
9. The method according to claim 8, wherein the result of analyzing the documents read or written in each user activity is stored in the storage device as matrix data, with keywords labeled in a first direction and activities labeled in a second direction perpendicular to the first direction, and the frequency of occurrence of each keyword is stored at the intersection indicated by the labels.
10. The method according to claim 8, wherein the activities include posting a blog, reading a blog, writing a message, and reading a message in the community server.
11. The method according to claim 8, further comprising a step of normalizing the frequency of occurrence of a keyword by dividing it by the number of documents associated with the activity.
12. The method according to claim 9, further comprising a step of calculating the sum of the matrix data of the users in a community so as to obtain matrix data associated with the users in the community.
13. The method according to claim 12, further comprising a step of calculating the difference between the matrix data of the community and the matrix data of a specific user.
14. The method according to claim 9, further comprising a step of multiplying the matrix data by a degradation parameter, and a step of adding newly calculated matrix data to the matrix data multiplied by the degradation parameter.
CN 200810178618 2007-12-27 2008-11-21 Community server system and activity recording method therefor Expired - Fee Related CN101470754B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2007336919A JP5243783B2 (en) 2007-12-27 2007-12-27 Community system, community system activity recording method, and community system activity recording program
JP2007-336919 2007-12-27
JP2007336919 2007-12-27

Publications (2)

Publication Number Publication Date
CN101470754A true CN101470754A (en) 2009-07-01
CN101470754B CN101470754B (en) 2012-04-11

Family

ID=40828230

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200810178618 Expired - Fee Related CN101470754B (en) 2007-12-27 2008-11-21 Community server system and activity recording method therefor

Country Status (2)

Country Link
JP (1) JP5243783B2 (en)
CN (1) CN101470754B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5392679B2 (en) * 2009-09-07 2014-01-22 学校法人 中央大学 Decision analysis server, decision analysis method, program, and decision analysis system
JP5525268B2 (en) * 2010-01-19 2014-06-18 Kddi株式会社 Personality estimation device and program
US20110250575A1 (en) * 2010-04-13 2011-10-13 enVie Interactive LLC System And Method For Providing A Visual Representation Of A User Personality Within A Virtual Environment
KR101248193B1 (en) * 2011-05-27 2013-03-27 주식회사 솔트룩스 System for providing expoerts list
JP5821460B2 (en) * 2011-09-20 2015-11-24 大日本印刷株式会社 AC support server apparatus, AC support system, and AC support server program
JP2014130445A (en) 2012-12-28 2014-07-10 Toshiba Corp Information extraction server, information extraction client, information extraction method, and information extraction program
JP6191892B1 (en) * 2016-03-30 2017-09-06 株式会社Personal AI Artificial intelligence device that supports the accumulation and estimation of the values and values of the individual and the organization / group to which the individual belongs and supports the value-based support and analysis

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4547940B2 (en) * 2004-03-02 2010-09-22 富士ゼロックス株式会社 Information visualization system, method and program
JP2007206876A (en) * 2006-01-31 2007-08-16 Nifty Corp Advertisement distribution system in network service
CN100477593C (en) * 2006-10-13 2009-04-08 百度在线网络技术(北京)有限公司 Method and device for selecting correlative discussion zone in network community

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103635898B (en) * 2011-04-08 2016-08-10 环球娱乐株式会社 Hobby visualization system and auditing system
US10055487B2 (en) 2011-04-08 2018-08-21 Universal Entertainment Corporation Preference visualization system and censorship system
CN103635898A (en) * 2011-04-08 2014-03-12 环球娱乐株式会社 Preference visualization system and censorship system
CN104246758B (en) * 2012-02-22 2018-05-18 诺基亚技术有限公司 Adaptable System
US9811585B2 (en) 2012-02-22 2017-11-07 Nokia Technologies Oy Adaptive system
CN104246758A (en) * 2012-02-22 2014-12-24 诺基亚公司 Adaptive system
CN104246757A (en) * 2012-02-22 2014-12-24 诺基亚公司 Predictive service access
CN104246757B (en) * 2012-02-22 2018-11-06 诺基亚技术有限公司 Predictive service access
US10324916B2 (en) 2012-02-22 2019-06-18 Nokia Technologies Oy Predictive service access
CN107943978A (en) * 2017-11-29 2018-04-20 北京金堤科技有限公司 User accesses the storage method and device of record
CN107943978B (en) * 2017-11-29 2020-11-24 北京金堤科技有限公司 Storage method and device for user access records
CN111427967A (en) * 2018-12-24 2020-07-17 顺丰科技有限公司 Entity relationship query method and device
CN111427967B (en) * 2018-12-24 2023-06-09 顺丰科技有限公司 Entity relationship query method and device

Also Published As

Publication number Publication date
JP2009157764A (en) 2009-07-16
CN101470754B (en) 2012-04-11
JP5243783B2 (en) 2013-07-24

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120411

Termination date: 20181121