CN101771957A - User interest point determining method and device


Publication number: CN101771957A
Authority: CN (China)
Prior art keywords: interest, point, content, multimedia, user
Legal status: Granted; currently Expired - Fee Related (the legal status is an assumption and is not a legal conclusion)
Application number: CN200810241181A
Other languages: Chinese (zh)
Other versions: CN101771957B (en)
Inventors: 郑于锷, 孙杰
Current assignee: China Mobile Communications Group Co Ltd (the listed assignees may be inaccurate)
Original assignee: China Mobile Communications Group Co Ltd
Application filed by: China Mobile Communications Group Co Ltd
Priority to: CN200810241181A
Publications: CN101771957A (application), CN101771957B (grant)

Abstract

The invention discloses a method and a device for determining user interest points. In the method, a media interest clustering space is generated for each multimedia type; each media interest clustering space contains a plurality of preset interest points, and the interest feature value of each interest point is determined from the features of the training samples selected for that interest point. When a user operates on multimedia content, a corresponding media feature value is determined from that content; the difference between the media feature value and the interest feature value of each interest point in the media interest clustering space is calculated, and the one or more interest points with the smallest differences are determined as the interest points of the user. The method and device can thus accurately determine a user's interest points from the multimedia content the user operates on.

Description

Method and device for determining user interest points
Technical field
The present invention relates to the field of communications, and in particular to a method and a device for determining user interest points according to the multimedia content operated on by a user.
Background art
Existing communication networks offer Internet services with a wide variety of functions for the convenience of users. One example is mSpaces, a personalized Internet service for mobile phone users that is combined with mobile communication functions and is intended to increase user stickiness and loyalty. An mSpaces user builds a personal page from the tools and components provided by the system, uses personal storage space and publishes original content. Information can be published and pushed to mobile phones through existing communication services such as Fetion, MMS, SMS and mailboxes, so that the user can interact with friends, share information and subscribe to content of interest. The service thus offers mobile wireless services and Internet community services in which content is submitted once and published to many destinations, and subscribed to once and pushed automatically.
A user can visit the mSpaces website from a mobile terminal to upload and download pictures, upload and download music and video, write blog posts, send SMS and MMS messages, provide location features, and so on. Many events related to user behavior therefore occur on the mSpaces platform, and the content involved in these events, such as the pictures a user uploads, the music a user downloads and the media information a user subscribes to, is closely related to the behavioral habits and interest characteristics of the mobile terminal user.
In the prior art, one way of determining user interest points relies on the user's static attribute information, such as the user's gender, age and region. Another way matches the information uploaded or downloaded by the user against preset keywords to classify the user, and then determines the user's interest points from the user category.
These prior-art methods locate a user's interest points only simply and coarsely, so the interest points they determine are not accurate enough.
Summary of the invention
The invention provides a method and a device for determining user interest points, which determine a user's interest points more accurately according to the multimedia content operated on by the user.
The method for determining user interest points provided by the invention comprises:
Determining, according to the multimedia content operated on by a user, a media feature value that characterizes the features of the multimedia content;
Determining the corresponding media interest clustering space according to the multimedia type to which the multimedia content belongs;
Calculating the difference between the media feature value and the interest feature value of each interest point in the media interest clustering space;
Selecting, in ascending order of the differences, one or more corresponding interest points as the interest points of the user;
Wherein a media interest clustering space is generated in advance for each multimedia type; the media interest clustering space contains preset interest points, and the interest feature value of each interest point is determined from the features of the training samples selected for that interest point.
The invention also provides a device for determining user interest points, comprising:
A media feature determination module, configured to determine, according to the multimedia content operated on by a user, a media feature value that characterizes the features of the multimedia content;
A clustering space determination module, configured to determine the corresponding media interest clustering space according to the multimedia type to which the multimedia content belongs;
An interest point determination module, configured to calculate the difference between the media feature value and the interest feature value of each interest point in the media interest clustering space, and to select, in ascending order of the differences, one or more corresponding interest points as the interest points of the user;
A clustering space generation and storage module, configured to generate and store a corresponding media interest clustering space for each multimedia type; the media interest clustering space contains preset interest points, and the interest feature value of each interest point is determined from the features of the training samples selected for that interest point.
The invention generates a media interest clustering space for each multimedia type. Each media interest clustering space contains several preset interest points, and each interest point has a corresponding interest feature value, which is determined in advance by selecting multimedia content related to that interest point as training samples and deriving the value from the features of those samples. When a user operates on multimedia content (such as pictures, text or sound), the corresponding media feature value is determined from that content; the difference between the media feature value and the interest feature value of each interest point in the media interest clustering space is calculated; and the one or more interest points with the smaller differences are determined as the interest points of the user. The media feature value characterizes the multimedia content operated on by the user, and the interest feature value of an interest point is determined from the features of its training samples. A small difference between the two therefore indicates that the content operated on by the user is close to the training samples of that interest point, which allows the user's interest points to be determined more accurately from the multimedia content the user operates on.
Description of drawings
Fig. 1 is a flowchart of the method for determining user interest points provided by an embodiment of the invention;
Fig. 2 is a flowchart of the method for determining user interest points when the user operates on a picture, provided by an embodiment of the invention;
Fig. 3 is a schematic structural diagram of the system for determining user interest points provided by an embodiment of the invention;
Fig. 4 is a schematic structural diagram of the clustering space generation and storage module in the system for determining user interest points provided by an embodiment of the invention;
Fig. 5 is a schematic structural diagram of the interest point determination module in the system for determining user interest points provided by an embodiment of the invention;
Fig. 6 is a schematic structural diagram of the user client provided by an embodiment of the invention.
Detailed description of the embodiments
The invention provides a method and a device for determining user interest points, which determine a user's interest points more accurately according to the multimedia content operated on by the user.
The method and device provided by the invention are described in detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flowchart of the method for determining user interest points provided by an embodiment of the invention. The method comprises:
Step S101: according to the multimedia content operated on by the user, determine the media feature value that characterizes the features of this multimedia content;
Step S102: according to the multimedia type to which the multimedia content operated on by the user belongs, determine the corresponding media interest clustering space;
Step S103: calculate the difference between the interest feature value of each interest point in the media interest clustering space and the media feature value of the multimedia content operated on by the user;
Step S104: sort the corresponding interest points in ascending order of the calculated differences;
Step S105: in ascending order of the differences, select one or more corresponding interest points as the interest points of the user.
A media interest clustering space is generated in advance for each multimedia type. For example, one media interest clustering space is generated for pictures (or images), another for sound, and so on. Each media interest clustering space contains several preset interest points, and each interest point has an interest feature value. This value is determined by selecting, in advance, multimedia content related to the interest point as training samples and deriving the value from the features of those samples.
In one embodiment, the media interest clustering space is a multidimensional vector space, the media feature value is represented by a media feature vector, and the interest feature value is represented by an interest feature vector.
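Steps S101 to S105 amount to a nearest-neighbour search in the clustering space. The following is a minimal sketch under stated assumptions: the feature vectors are plain lists of floats, the difference is taken to be the Euclidean distance (the text does not fix a metric at this point), and the names `difference`, `rank_interest_points`, `space` and the toy vectors are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed difference measure: Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_interest_points(
    media_vec: Sequence[float],
    space: Dict[str, Sequence[float]],
    k: int = 1,
) -> List[Tuple[str, float]]:
    """Return the k interest points with the smallest difference (S103 to S105)."""
    diffs = [(name, difference(media_vec, vec)) for name, vec in space.items()]
    diffs.sort(key=lambda item: item[1])  # ascending order of difference
    return diffs[:k]


# Hypothetical picture clustering space with three interest points.
space = {
    "tennis": [1.0, 0.0],
    "table tennis": [0.8, 0.3],
    "tourism": [0.0, 1.0],
}
top = rank_interest_points([1.0, 0.1], space, k=2)
print([name for name, _ in top])
```

Keeping the difference alongside each interest point makes it possible to store it later, as the embodiment with the correspondence table suggests.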
The following takes one multimedia type, the picture, as an example to describe how the corresponding media interest clustering space is generated.
1) The user presets several interest points of the media interest clustering space; for example, tennis, table tennis and tourism each correspond to one interest point in the space.
2) For each interest point, m training sample pictures and k keywords (optional) are input. The group of training sample pictures represents a series of pictures related to the interest point. For the "tennis" interest point, for example, pictures of tennis balls, rackets and tennis matches can be input, and the keywords can be: tennis, Wimbledon, Roger Federer, Sharapova, and so on.
3) To represent these pictures with discrete values, quantifiable picture features (such as color scale, brightness and contour) are extracted from the training sample pictures to form an n-dimensional vector. The procedure is as follows:
First, picture cropping: every training sample picture is brought to the same size (for example 300*300). Whatever the original size and aspect ratio, the picture is compressed or stretched to the same resolution to facilitate feature extraction.
Picture feature extraction: the content of a picture is defined by parameters such as color scale, block color and block contour. For example, three kinds of data are mainly extracted from an image: the color scale, the average color of each image block and the contour data of each image block. Let α denote an interest point for which m training sample pictures have been selected. For a training sample picture i among them, the color scale data of the picture is obtained first. The picture is then divided into N*N small blocks (for a 300*300 picture, N can be 8 to 10). The color of each block is averaged (blurred) and mapped to one color in an 8-color space, which yields a value representing the color of each block. Contour extraction is then performed on each block, which yields N*N vector (arc) descriptions of the block contour features. This processing yields the feature vector describing the picture content of training sample picture i, namely:
\[ imgf_i = (cs, o_1, o_2, o_3, \ldots, o_{N \times N}, c_1, c_2, c_3, \ldots, c_{N \times N}) \]
Here cs is a set that represents the colors of the whole picture in the quantized HSV (hue, saturation, value) space as their distribution over 162 equal discrete intervals; o_1, o_2, ..., o_{N×N} are the vector descriptions of the contour features of the individual blocks; and c_1, c_2, ..., c_{N×N} are the color values of the individual blocks of the image.
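The block color part of the feature vector (the values c_1 to c_{N×N}) can be sketched as follows. This is only an illustration under assumptions: the image is a nested list of RGB tuples with at least n rows and n columns, the 8-color space is taken to be one bit per channel with a threshold at 128 (the text does not specify the mapping), and the scaling step, the color scale cs and the contour extraction are omitted. The name `block_colors` is hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def block_colors(image: List[List[Pixel]], n: int) -> List[int]:
    """Return n*n color codes (0..7), one per block, in row-major order."""
    height, width = len(image), len(image[0])
    codes = []
    for by in range(n):
        # Integer block bounds; together the blocks cover the whole image.
        y0, y1 = by * height // n, (by + 1) * height // n
        for bx in range(n):
            x0, x1 = bx * width // n, (bx + 1) * width // n
            count = (y1 - y0) * (x1 - x0)
            sums = [0, 0, 0]
            for y in range(y0, y1):
                for x in range(x0, x1):
                    for ch in range(3):
                        sums[ch] += image[y][x][ch]
            mean = [s / count for s in sums]  # average ("blurred") block color
            # Assumed 8-color space: one bit per channel, threshold at 128.
            code = sum(1 << (2 - ch) for ch in range(3) if mean[ch] >= 128)
            codes.append(code)
    return codes


# Toy 4*4 image: left half red, right half blue.
RED, BLUE = (255, 0, 0), (0, 0, 255)
toy = [[RED, RED, BLUE, BLUE] for _ in range(4)]
print(block_colors(toy, 2))
```

With this encoding, code 4 stands for red, 2 for green and 1 for blue, so each block color fits in three bits.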
For the m training sample pictures of interest point α, m picture content feature vectors are obtained in this way, expressed as:
\[ ivect = (imgf_1, imgf_2, imgf_3, \ldots, imgf_m) \]
These m picture content feature vectors roughly describe the picture content features of the interest point.
Besides the picture content feature vector, each image can also carry keywords that describe it. For interest point α, for example, each training sample picture can be given some keywords; collecting the keywords of all training sample pictures yields the feature vocabulary vector of interest point α, namely:
\[ despt = (keyw_1, keyw_2, \ldots, keyw_n) \]
Here keyw denotes a keyword, and the set despt is the set of feature words of the interest point.
A group of training sample images and keywords is input for each interest point set in the media interest clustering space, and the ivect and despt feature vectors are built, which completes the picture-based media interest clustering space.
The feature vectors can be described in Extensible Markup Language (XML) or in other ways and stored in a computer.
Once the picture-based media interest clustering space has been built, the same method can be used to extract the features of a picture operated on by the user and to generate the corresponding feature vector. (For clarity, the feature vector of an interest point is hereinafter called the interest feature vector, and the feature vector generated from the multimedia content operated on by the user is called the media feature vector.) The difference between the media feature vector and each interest feature vector is then calculated, and, in ascending order of the differences, one or more corresponding interest points are selected as the interest points of the user.
With the method described above, the difference between the media feature vector and the interest feature vectors has to be calculated for every piece of multimedia content the user operates on, and the interest points with the smaller differences are then taken as the user's interest points. To reduce the amount of computation as far as possible, in a preferred embodiment a content identifier is also generated for the multimedia content according to a set identifier generation policy, and the correspondence between content identifiers and interest points is stored. The initial records of the correspondence are the content identifiers generated from the content data of the training samples, together with their interest points. After the user operates on multimedia content, a content identifier is generated for that content with the same identifier generation policy and matched against the content identifiers in the stored correspondence. If the stored correspondence already contains the content identifier, the corresponding interest point is taken directly from the stored correspondence as the user interest point. If it does not, the interest point is determined by the method described above, and a record associating the content identifier with the determined interest point is added to the stored correspondence. The number of records thus grows continuously, which raises the success rate of later matches.
In one embodiment, the correspondence is stored as a table. The correspondence table contains at least two fields: the content identifier and the corresponding interest point.
In one embodiment, the correspondence table can also store the calculated difference, that is, the difference between the media feature vector of the multimedia content denoted by the content identifier and the interest feature vector of the corresponding interest point.
The generated content identifier uniquely identifies the corresponding multimedia content. In the embodiments of the invention, the content identifier is not an arbitrarily assigned identifier or number with no inherent meaning; it is calculated from the data of the multimedia content and characterizes the features of that content. When the same identifier generation policy is applied to multimedia content of the same type, comparing content identifiers is enough to determine whether two pieces of multimedia content are identical.
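The correspondence table behaves like a cache in front of the full difference calculation. The sketch below illustrates that idea under assumptions: `InterestLookup`, `stand_in_id` and the `classify` callback are hypothetical names, and SHA-256 is used only as a stand-in identifier generation policy, not the CRC-based scheme the document describes for pictures.

```python
import hashlib
from typing import Callable, Dict, Tuple

Result = Tuple[str, float]  # (interest point, difference)


def stand_in_id(content: bytes) -> str:
    """Stand-in content identifier (SHA-256), not the scheme from the text."""
    return hashlib.sha256(content).hexdigest()


class InterestLookup:
    """Correspondence table: content identifier -> (interest point, difference)."""

    def __init__(self, classify: Callable[[bytes], Result],
                 make_id: Callable[[bytes], str] = stand_in_id) -> None:
        self.classify = classify  # full calculation against the clustering space
        self.make_id = make_id    # identifier generation policy
        self.table: Dict[str, Result] = {}

    def interest_point(self, content: bytes) -> Result:
        cid = self.make_id(content)
        if cid in self.table:            # identifier matched: reuse stored record
            return self.table[cid]
        result = self.classify(content)  # no match: run the full calculation
        self.table[cid] = result         # add the record for later matches
        return result
```

Storing the difference next to the interest point mirrors the embodiment in which the table also records the calculated difference.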
The following takes the generation of a content identifier for a picture as an example to describe how the content identifier of multimedia content is generated.
For picture files of the same size, an algorithm can be used to generate a unique identifier (ID) for each picture such that the probability of the same ID representing different pictures approaches 0. First, some method is used to represent the large amount of data in a picture file by a short string of digits, for example the CRC (cyclic redundancy check) code of the image file. A number of values are then taken at random from this short string of digits and used as offsets; the data at these offsets in the picture file is extracted and merged with the short string of digits to generate the ID.
One way of extracting the ID of a picture is as follows:
ID = ijk
In this formula, i is the 32-bit CRC code of the picture file after it has been resized to 300*300. The CRC generator polynomial is:
\[ G(x) = x^{32} + x^{26} + x^{23} + x^{22} + x^{16} + x^{12} + x^{11} + x^{10} + x^{8} + x^{7} + x^{5} + x^{4} + x^{2} + x + 1 \]
Let i1 be the number formed by the first two decimal digits of i; j is the value of the i1-th byte of the picture file. Let i2 be the most significant decimal digit of i; k is the value of the i2-th byte of the picture file. The ID is the value obtained by merging the three (i, j and k). After resizing, the ID thus essentially identifies a picture uniquely. When two picture files have the same content, their IDs are identical, so comparing IDs is enough to determine whether two pictures are the same.
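The ID scheme above can be sketched with Python's standard `zlib.crc32`, which uses the same CRC-32 generator polynomial. Several details are assumptions made for this illustration: the input is taken to be the bytes of the already resized picture, byte offsets are 0-based, offsets are reduced modulo the file length to guard against very short inputs, and the three parts are joined with a hyphen. The name `picture_id` is hypothetical.

```python
import zlib


def picture_id(data: bytes) -> str:
    """Build an ID from the CRC-32 code i and the two bytes j and k it selects."""
    if not data:
        raise ValueError("picture data must not be empty")
    i = zlib.crc32(data) & 0xFFFFFFFF  # 32-bit CRC code of the picture data
    digits = str(i)
    i1 = int(digits[:2])  # first two decimal digits of i
    i2 = int(digits[0])   # most significant decimal digit of i
    # Assumptions: 0-based offsets, wrapped so that short inputs still work.
    j = data[i1 % len(data)]
    k = data[i2 % len(data)]
    return f"{i}-{j}-{k}"  # assumed way of merging i, j and k


sample = bytes(range(256))
print(picture_id(sample) == picture_id(bytes(range(256))))
```

Identical content always yields the same ID, which is the property the matching step relies on; distinct content can still collide in principle, which is why the text speaks of a probability approaching 0.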
A corresponding ID can also be generated for an audio data file. For example, the first 10 seconds of data of the audio file (counted from the point where the waveform begins) are cut out first, and audio features are extracted from these 10 seconds to generate the ID. The audio features include, for example, short-time average energy, zero-crossing rate, frequency centroid and bandwidth.
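Two of the audio features named above, short-time average energy and zero-crossing rate, can be sketched as follows. The framing scheme is an assumption (consecutive non-overlapping frames of at least two samples, with a trailing partial frame dropped), the function names are hypothetical, and the document does not say how the features are combined into an ID, so that step is left out.

```python
from typing import List, Sequence


def frames(samples: Sequence[float], frame_size: int) -> List[Sequence[float]]:
    """Split samples into consecutive full frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_size]
            for i in range(0, len(samples) - frame_size + 1, frame_size)]


def short_time_energy(samples: Sequence[float], frame_size: int) -> List[float]:
    """Mean squared amplitude of each frame."""
    return [sum(s * s for s in f) / frame_size for f in frames(samples, frame_size)]


def zero_crossing_rate(samples: Sequence[float], frame_size: int) -> List[float]:
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    rates = []
    for f in frames(samples, frame_size):
        crossings = sum(1 for a, b in zip(f, f[1:]) if (a >= 0) != (b >= 0))
        rates.append(crossings / (frame_size - 1))
    return rates


signal = [1.0, -1.0, 1.0, -1.0, 2.0, 2.0, 2.0, 2.0]
print(short_time_energy(signal, 4), zero_crossing_rate(signal, 4))
```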
Because XML is tree-structured, which matches the sub-vectors nested inside the imgf vector, XML can be used to describe media features. The tags are defined in Table 1:
Table 1:
Tag name | Parent node | Attributes | Content description
<media> | Root node | ID: unique identifier of the image; Type: type of the media, here image | Root node of a media feature
<imgf> | <media> | id: sequence number of the imgf | Picture content feature vector
<colorStage> | <imgf> | (none) | Start of the color scale in the imgf; the color scale is represented by several values
<value> | <colorStage> | stage: sequence number of the color scale value | One value of the color scale vector
<outline> | <imgf> | id: block number (1 to N*N) | Contour vector of one of the N*N blocks
<line> | <outline> | startX, startY, endX, endY: start and end coordinates of a line; ID: unique identifier of the line tag | Start and end coordinates of one line in the outline
<colorSet> | <imgf> | (none) | Colors of the N*N blocks
<color> | <colorSet> | id: block number (1 to N*N) | Color of one block in the imgf
<keywordSet> | <media> | (none) | Feature vocabulary vector
<keyword> | <keywordSet> | id: ID of the keyword | One keyword of the feature vocabulary vector
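A document following the tag definitions in the table can be produced with Python's standard `xml.etree.ElementTree`. This is a hedged sketch: the function name `build_media_xml`, the sample values and the choice to write values as element text are assumptions, and the `<outline>` and `<line>` elements are omitted for brevity.

```python
import xml.etree.ElementTree as ET
from typing import Sequence


def build_media_xml(media_id: str, color_scale: Sequence[int],
                    block_colors: Sequence[int], keywords: Sequence[str]) -> str:
    """Serialize one picture feature vector and its keywords as XML text."""
    media = ET.Element("media", {"ID": media_id, "Type": "image"})
    imgf = ET.SubElement(media, "imgf", {"id": "1"})
    stage = ET.SubElement(imgf, "colorStage")
    for idx, value in enumerate(color_scale, start=1):
        ET.SubElement(stage, "value", {"stage": str(idx)}).text = str(value)
    color_set = ET.SubElement(imgf, "colorSet")
    for idx, color in enumerate(block_colors, start=1):
        ET.SubElement(color_set, "color", {"id": str(idx)}).text = str(color)
    keyword_set = ET.SubElement(media, "keywordSet")
    for idx, keyword in enumerate(keywords, start=1):
        ET.SubElement(keyword_set, "keyword", {"id": str(idx)}).text = keyword
    return ET.tostring(media, encoding="unicode")


xml_text = build_media_xml("123-4-5", [3, 7], [4, 1, 4, 1], ["tennis", "Wimbledon"])
print(xml_text)
```

The nesting of `<imgf>` under `<media>` and of `<value>`, `<color>` and `<keyword>` under their set elements follows the parent node column of the table.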
As the above description shows, an interest point corresponds to one or more interest feature vectors, and the number of vectors equals the number of training samples selected for that interest point. As described above, interest point α has m training sample pictures, and its picture content feature vector ivect = (imgf_1, imgf_2, imgf_3, ..., imgf_m) contains the m vectors imgf_1 to imgf_m.
Taking the picture-based media interest clustering space generated in the above embodiment as an example, when the multimedia content operated on by the user is a picture, the user interest points are determined by the flow shown in Fig. 2, which comprises:
Step S201: according to the picture operated on by the user, use the method described above to determine the media feature vector and the content identifier of that picture.
Step S202: match the content identifier of the picture operated on by the user against the stored correspondence between content identifiers and interest points.
Step S203: determine whether an identical content identifier has been matched; if so, perform step S204; otherwise, perform step S205.
Step S204: obtain the interest point corresponding to the content identifier from the stored correspondence, and go to step S213.
Step S205: obtain the interest feature vectors ivect and despt of the next interest point.
Step S206: take the next imgf vector from ivect, calculate the difference between this imgf vector and the media feature vector of the picture operated on by the user, and record the difference.
Suppose imgf_1 denotes the media feature vector of the picture operated on by the user, where:
\[ imgf_1 = (cs_1, o_{1,1}, o_{1,2}, o_{1,3}, \ldots, o_{1,N \times N}, c_{1,1}, c_{1,2}, c_{1,3}, \ldots, c_{1,N \times N}) \]
Suppose imgf_2 is one imgf vector in ivect, where:
\[ imgf_2 = (cs_2, o_{2,1}, o_{2,2}, o_{2,3}, \ldots, o_{2,N \times N}, c_{2,1}, c_{2,2}, c_{2,3}, \ldots, c_{2,N \times N}) \]
The difference between imgf_1 and imgf_2 is:
\[ \mathrm{DBI}(imgf_1, imgf_2) = \gamma_1 \cdot \mathrm{cdist}(imgf_1, imgf_2) + \gamma_2 \cdot \mathrm{odist}(imgf_1, imgf_2) + \gamma_3 \cdot \mathrm{cdist}(imgf_1, imgf_2) \]
Here
\[ \mathrm{cdist}(imgf_1, imgf_2) = (cs_1 - cs_2)^{T} A \, (cs_2 - cs_1) \]
where the matrix A = [a_{i,j}] expresses the relation between color i and color j. odist(imgf_1, imgf_2) expresses the difference in contours between the two images after picture cropping, namely:
\[ \mathrm{odist}(imgf_1, imgf_2) = (o_{1,1} - o_{2,1})^2 + (o_{1,2} - o_{2,2})^2 + \ldots + (o_{1,N \times N} - o_{2,N \times N})^2 \]
γ_1, γ_2 and γ_3 are the weights of the respective parts.
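The weighted combination DBI can be illustrated with a small sketch. It rests on several assumptions that go beyond the text: the color term is computed as the usual symmetric quadratic form d^T A d with d = cs_1 - cs_2 (the sign convention of the formula in the text may differ), each block contour descriptor is reduced to a single number, and the third term, which the text does not define separately, is taken to be the count of blocks whose color codes differ. All function and parameter names are hypothetical.

```python
from typing import Sequence


def quadratic_form(d: Sequence[float], a: Sequence[Sequence[float]]) -> float:
    """Compute d^T A d for a vector d and a square matrix A."""
    return sum(d[i] * a[i][j] * d[j] for i in range(len(d)) for j in range(len(d)))


def dbi(cs1: Sequence[float], cs2: Sequence[float], a: Sequence[Sequence[float]],
        o1: Sequence[float], o2: Sequence[float],
        c1: Sequence[int], c2: Sequence[int],
        gammas: Sequence[float]) -> float:
    """Weighted sum of color scale, contour and block color differences."""
    g1, g2, g3 = gammas
    d = [x - y for x, y in zip(cs1, cs2)]
    color_scale_term = quadratic_form(d, a)                   # assumed d^T A d
    contour_term = sum((x - y) ** 2 for x, y in zip(o1, o2))  # sum of squares
    block_term = sum(1 for x, y in zip(c1, c2) if x != y)     # assumed third term
    return g1 * color_scale_term + g2 * contour_term + g3 * block_term


identity = [[1.0, 0.0], [0.0, 1.0]]
value = dbi([1.0, 0.0], [0.0, 1.0], identity,
            [1.0, 2.0], [1.0, 4.0], [4, 1], [4, 2], (0.5, 0.25, 1.0))
print(value)
```

With the identity matrix for A, the color term reduces to a squared Euclidean distance between the two color scale distributions; a non-diagonal A would let perceptually similar colors count as close.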
Step S207: determine whether the difference between every imgf vector and the media feature vector of the picture operated on by the user has been calculated; if not, go to step S206; if so, continue with step S208.
Step S208: determine whether the picture carries keywords; if so, perform step S209; if not, perform step S210.
Step S209: determine whether the keywords carried by the picture appear in the despt set of the interest point, and count the number of occurrences.
Suppose the picture carries m keywords. The keyword set of the picture can then be expressed as:
\[ Ikw = \bigcup_{i=1}^{m} Ikeyw_i \]
If interest point j has n keywords in total, its keyword set is:
\[ kw_j = \bigcup_{t=1}^{n} keyw_t \]
The keyword relevance KWR between the picture keyword set Ikw and the keyword set kw_j of the interest point is expressed as:
\[ \mathrm{KWR}(Ikw, kw_j) = \sum_{l=1}^{m} \mathrm{isAppeared}(Ikeyw_l, kw_j) \]
Here isAppeared(keyword, keywordset) indicates whether the picture keyword keyword appears in the keyword set keywordset of the interest point; if it does, isAppeared(keyword, keywordset) is 1, otherwise it is 0.
In other words, KWR is the total number of times the keywords carried by the picture appear in the keyword set of the interest point.
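The keyword relevance is a simple membership count, as this short sketch shows. The function name `kwr` and the sample keywords are illustrative assumptions; exact string matching is assumed, since the text does not describe any normalization of keywords.

```python
from typing import Iterable


def kwr(picture_keywords: Iterable[str], interest_keywords: Iterable[str]) -> int:
    """Count how many picture keywords appear in the interest point's keyword set."""
    interest = set(interest_keywords)
    # isAppeared contributes 1 for a keyword found in the set, otherwise 0.
    return sum(1 for keyword in picture_keywords if keyword in interest)


tennis_keywords = ["tennis", "Wimbledon", "Roger Federer", "Sharapova"]
print(kwr(["tennis", "beach", "Wimbledon"], tennis_keywords))
```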
Step S210: calculate the differences between the media feature vector of the picture operated on by the user and the interest feature vectors of the current interest point, and take the minimum as the difference between the media feature vector of the picture and the interest feature vectors of the current interest point. The difference DII is calculated as:
\[ \mathrm{DII} = \beta_1 \min_{1 \le i \le m} \mathrm{DBI}(imgf_1, imgf_i) - \beta_2 \, \mathrm{KWR}(Ikw, kw) \]
In the above formula, imgf_1 is the media feature vector of the picture operated on by the user, imgf_i ranges over the m imgf vectors of the current interest point, and β_1 and β_2 are weight coefficients.
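The DII formula combines the smallest picture-to-sample difference with the keyword relevance, so matching keywords lower the final difference. A minimal sketch, assuming the DBI values for the current interest point have already been calculated; the name `dii` and the sample numbers are hypothetical.

```python
from typing import Sequence


def dii(dbi_values: Sequence[float], kwr_value: int,
        beta1: float, beta2: float) -> float:
    """beta1 * (minimum DBI over the training samples) - beta2 * KWR."""
    if not dbi_values:
        raise ValueError("at least one DBI value is required")
    return beta1 * min(dbi_values) - beta2 * kwr_value


# Three training samples for one interest point, two matching keywords.
print(dii([3.0, 1.5, 2.0], 2, beta1=1.0, beta2=0.25))
```

When the picture carries no keywords, step S209 is skipped and the KWR value is simply 0, which leaves only the weighted minimum DBI.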
Step S211, judge that whether birds of the same feather flock together whole points of interest in the space of medium interest all calculate and finish, and if not, go to step S205; If, execution in step S212.
The difference of step S212, each point of interest correspondence of comparison by from small to large rank order, is determined the point of interest or K the less point of interest of difference of difference minimum with the difference of each point of interest correspondence.And the content identification of the picture of this operation of user and the corresponding point of interest of choosing and the corresponding difference that calculates be increased in the corresponding relation of preservation.
Step S213, with the point of interest determined among step S204 or the step S212 point of interest as the user, be saved in the user interest historical record; In addition, can also preserve the corresponding difference that calculates in the user interest historical record, this difference size can show the otherness size between user interest and the corresponding point of interest.
Adopt the described flow process of Fig. 2, when the media characteristic vector of the content of multimedia of determining user's operation, also generate corresponding content identification, the content identification and the point of interest mapping table that mate storage earlier according to content identification, if can match identical content identification, then directly the corresponding point of interest of output has been avoided and medium interest is birdsed of the same feather flock together in the space between each point of interest the calculating one by one of otherness relatively as user's point of interest.In addition, if this does not match identical content identification, after the method that then adopts the foregoing description to provide is determined corresponding point of interest, also the content identification of the content of multimedia that this user is operated is increased in the corresponding relation with the corresponding point of interest of determining, make the record in the corresponding relation constantly increase, follow-up when mating according to content identification, the possibility that the match is successful also constantly increases.
When the user is online, having logged in to the corresponding network-side server over the Internet, the network-side server can capture the multimedia content operated by the user and carry out the user interest point determining method provided by the above embodiments of the present invention to determine the user's interest point.
When the user operates multimedia content offline, the user client may first determine the corresponding media characteristic vector and content identifier from the multimedia content operated by the user and save them locally. When the user client logs in to the network-side server, it sends the locally stored media characteristic vector and content identifier to the network-side server, which determines the user's interest point by the above method according to the media interest clustering space generated in advance.
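A minimal sketch of the offline case follows, assuming a hypothetical `OfflineClient` class that buffers records locally and a caller-supplied `send` function that stands in for the upload to the network-side server. The transport itself is not described in the text and is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

# (content identifier, media characteristic vector)
Record = Tuple[str, Tuple[float, ...]]


@dataclass
class OfflineClient:
    pending: List[Record] = field(default_factory=list)

    def record_operation(self, cid: str, vector: Sequence[float]) -> None:
        """Save the identifier and vector locally while the user is offline."""
        self.pending.append((cid, tuple(vector)))

    def flush(self, send: Callable[[List[Record]], None]) -> int:
        """On login, send all locally stored records and clear the buffer."""
        if not self.pending:
            return 0
        batch = list(self.pending)
        send(batch)           # hypothetical upload to the network-side server
        self.pending.clear()  # cleared only after send returns without raising
        return len(batch)
```

Clearing the buffer only after `send` returns means a failed upload leaves the records in place for the next login.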
A person of ordinary skill in the art will appreciate that all or part of the steps of the methods in the foregoing embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc.
Based on the same inventive concept, an embodiment of the present invention further provides a user interest point determining device. Its structure is shown schematically in Fig. 3, and it comprises:
A media characteristic determination module 31, configured to determine, according to the multimedia content operated by the user, a media characteristic value characterizing the features of the multimedia content;
A clustering space determination module 32, configured to determine the corresponding media interest clustering space according to the multimedia type to which the multimedia content operated by the user belongs;
An interest point determination module 33, configured to calculate the difference between the media characteristic value and the interest characteristic value corresponding to each interest point in the media interest clustering space, and, in ascending order of difference, to select one or more corresponding interest points as the user's interest points;
A clustering space generation and storage module 34, configured to generate and store a corresponding media interest clustering space for each multimedia type. The media interest clustering space contains preset interest points, and the interest characteristic value corresponding to each interest point is determined by the features of the training samples selected for that interest point.
In one embodiment, the user interest point determining device further comprises:
A correspondence storage and update module 35, configured to store the correspondence between content identifiers of multimedia content and interest points. The content identifier is generated from the multimedia content data according to a preset identifier generation policy. The initial records of the correspondence comprise the correspondence between the content identifiers generated from the content data of the multimedia content of the training samples and the corresponding interest points. The module also stores correspondence records between content identifiers that are generated from multimedia content operated by the user and are not yet stored locally, and the correspondingly selected interest points.
In one embodiment, the specific structure of the clustering space generation and storage module 34 is shown in Fig. 4 and comprises:
A setting submodule 341, configured to set one or more interest points and to select one or more training samples for each interest point;
A feature extraction submodule 342, configured to extract the features of the training samples corresponding to each interest point and generate the interest characteristic vector corresponding to each interest point;
A generation submodule 343, configured to save the interest characteristic vectors corresponding to the one or more interest points that have been set, thereby generating the media interest clustering space.
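The three submodules can be sketched as one function that turns training samples into a clustering space. The names `build_interest_space` and `toy_features` are hypothetical, the feature extractor is a toy stand-in, and averaging the sample vectors into a single interest characteristic vector is an assumption: the text does not say how sample features are combined, and an interest point may also keep several vectors.

```python
from statistics import fmean
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def toy_features(data: bytes) -> Vector:
    # Toy stand-in for a real feature extractor: byte length and count of b"a".
    return [float(len(data)), float(data.count(b"a"))]


def build_interest_space(
    training_samples: Dict[str, Sequence[bytes]],
    extract_features: Callable[[bytes], Vector],
) -> Dict[str, Vector]:
    """Map each preset interest point to one interest characteristic vector."""
    space: Dict[str, Vector] = {}
    for point, samples in training_samples.items():
        if not samples:
            raise ValueError(f"interest point {point!r} has no training samples")
        vectors = [extract_features(sample) for sample in samples]
        # Assumption: combine the samples by taking the per-dimension mean.
        space[point] = [fmean(dimension) for dimension in zip(*vectors)]
    return space


# One clustering space would be built per multimedia type; this is a single one.
space = build_interest_space({"example": [b"aa", b"abcd"]}, toy_features)
```

In a real system the extractor would depend on the multimedia type (picture, text, sound), which is why one space is generated per type.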
In one embodiment, the specific structure of the interest point determination module 33 is shown in Fig. 5 and comprises:
A content identifier generation submodule 331, configured to generate, according to the preset identifier generation policy, the content identifier of the multimedia content operated by the user;
A matching submodule 332, configured to match the generated content identifier of the multimedia content operated by the user against the content identifiers stored in the correspondence storage and update module 35, and to output the matching result;
A determination submodule 333, configured to: when the matching result is that an identical content identifier has been matched, determine, according to the correspondence stored in the correspondence storage and update module 35, the interest point corresponding to that content identifier as the user's interest point; and
when the matching result is that no identical content identifier has been matched, calculate the difference by the method described above and, in ascending order of difference, select one or more corresponding interest points as the user's interest points.
In practical applications, all modules of the user interest point determining device provided by the present invention may be arranged in the network-side server. Alternatively, the media characteristic determination module may be arranged in the user client and the remaining modules in the network-side server, in which case the user client sends the media characteristic value, or the media characteristic value and the content identifier, to the network-side server.
When the media characteristic determination module is arranged in the user client, the specific structure of the user client is shown in Fig. 6 and comprises:
A user operation case generator 61, configured to generate a multimedia content operation and store the operation;
A media characteristic determination module 62, configured to determine, according to the multimedia content operated by the user, a media characteristic value characterizing the features of the multimedia content; or, in addition to determining that media characteristic value, to generate the content identifier of the multimedia content operated by the user according to the preset identifier generation policy;
A media characteristic storage module 63, configured to store the media characteristic value determined by the media characteristic determination module 62 and, when the media characteristic determination module 62 also generates the content identifier, to store the content identifier generated by the media characteristic determination module 62;
A media characteristic sending module 64, configured to send the stored media characteristic value to the network-side server, or to send the stored media characteristic value and content identifier to the network-side server.
With the media characteristic determination module arranged in the user client, when the user operates multimedia content offline, the user client can determine the corresponding media characteristic vector and content identifier from the multimedia content operated by the user and save them locally. When the user client logs in to the network-side server, it sends the locally stored media characteristic vector and content identifier to the network-side server, which determines the user's interest point according to the media interest clustering space generated in advance, using the user interest point determining method disclosed in the above embodiments of the present invention.
In summary, the present invention generates a media interest clustering space for each multimedia type and determines, from training samples, the interest characteristic value corresponding to each interest point in the media interest clustering space. When the user operates multimedia content (such as pictures, text, and sound), the corresponding media characteristic value is determined according to that content; the media interest clustering space corresponding to the multimedia content operated this time is selected; the difference between the media characteristic value and the interest characteristic value corresponding to each interest point in that space is calculated; and the interest points with the smaller differences are determined as the user's interest points. Because the media characteristic value characterizes the features of the multimedia content operated by the user, and the interest characteristic value of an interest point is determined by the features of its training samples, a smaller difference between the two values indicates that the multimedia content operated by the user is closer to the training samples of that interest point. The user's interest points can therefore be determined more accurately from the multimedia content the user operates. By tracking and analyzing the multimedia content operated by the user over the long term with the above method, and by recording and continually updating the user interest point history record, the user's preferences can be determined more accurately.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to cover them as well.

Claims (12)

1. A user interest point determining method, characterized by comprising:
determining, according to multimedia content operated by a user, a media characteristic value characterizing the features of the multimedia content;
determining a corresponding media interest clustering space according to the multimedia type to which the multimedia content belongs;
calculating the difference between the media characteristic value and the interest characteristic value corresponding to each interest point in the media interest clustering space;
selecting, in ascending order of the difference, one or more corresponding interest points as the interest points of the user;
wherein the media interest clustering space is generated in advance for each multimedia type; the media interest clustering space contains preset interest points, and the interest characteristic value corresponding to each interest point is determined by the features of the training samples selected for that interest point.
2. the method for claim 1 is characterized in that, the described medium interest space of birdsing of the same feather flock together is the vector space of a multidimensional;
Described media characteristic value is with the media characteristic vector representation;
Described interest characteristics value is with the interest characteristics vector representation;
The described medium interest of described calculating birds of the same feather flock together the interest characteristics value of each the point of interest correspondence in the space and the difference between the described media characteristic value comprise:
Calculate described medium interest birds of the same feather flock together the interest characteristics vector of each the point of interest correspondence in the space and the vectorial difference between the described media characteristic vector.
3. The method of claim 2, characterized in that the media characteristic vector and the interest characteristic vector are identified using Extensible Markup Language (XML).
4. The method of claim 2, characterized in that the media interest clustering space is generated by:
setting one or more interest points;
selecting one or more corresponding training samples for each interest point;
extracting the features of the training samples corresponding to each interest point and generating the interest characteristic vector corresponding to each interest point;
generating the media interest clustering space from the interest characteristic vectors corresponding to the one or more interest points.
5. The method of claim 4, characterized in that one interest point corresponds to one or more interest characteristic vectors.
6. The method of any one of claims 1 to 5, characterized by further comprising: storing a correspondence between content identifiers of multimedia content and interest points, wherein the content identifier is generated from the content data of the multimedia content according to a preset identifier generation policy, and the initial records of the correspondence comprise the correspondence between the content identifiers generated from the content data of the multimedia content of the training samples and the corresponding interest points;
before calculating the difference, the method further comprises: generating, according to the preset identifier generation policy, the content identifier of the multimedia content operated by the user, and matching it against the content identifiers in the stored correspondence; when an identical content identifier is matched, determining, according to the stored correspondence, the interest point corresponding to that content identifier as the interest point of the user;
when no identical content identifier is matched, calculating the difference and selecting, in ascending order of the difference, one or more corresponding interest points as the interest points of the user; and adding to the correspondence a record of the correspondence between the content identifier generated this time for the multimedia content operated by the user and the interest point or interest points selected this time.
7. A user interest point determining device, characterized by comprising:
a media characteristic determination module, configured to determine, according to multimedia content operated by a user, a media characteristic value characterizing the features of the multimedia content;
a clustering space determination module, configured to determine a corresponding media interest clustering space according to the multimedia type to which the multimedia content belongs;
an interest point determination module, configured to calculate the difference between the media characteristic value and the interest characteristic value corresponding to each interest point in the media interest clustering space, and to select, in ascending order of the difference, one or more corresponding interest points as the interest points of the user;
a clustering space generation and storage module, configured to generate and store a corresponding media interest clustering space for each multimedia type; the media interest clustering space contains preset interest points, and the interest characteristic value corresponding to each interest point is determined by the features of the training samples selected for that interest point.
8. The device of claim 7, characterized in that the clustering space generation and storage module comprises:
a setting submodule, configured to set one or more interest points and to select one or more corresponding training samples for each interest point;
a feature extraction submodule, configured to extract the features of the training samples corresponding to each interest point and generate the interest characteristic vector corresponding to each interest point;
a generation submodule, configured to save the interest characteristic vectors corresponding to the one or more interest points, thereby generating the media interest clustering space.
9. The device of claim 8, characterized by further comprising:
a correspondence storage and update module, configured to store a correspondence between content identifiers of multimedia content and interest points, wherein the content identifier is generated from the content data of the multimedia content according to a preset identifier generation policy, and the initial records of the correspondence comprise the correspondence between the content identifiers generated from the content data of the multimedia content of the training samples and the corresponding interest points; and
to store correspondence records between the content identifiers generated from the multimedia content operated by the user and the correspondingly selected interest points.
10. The device of claim 9, characterized in that the interest point determination module specifically comprises:
a content identifier generation submodule, configured to generate, according to the preset identifier generation policy, the content identifier of the multimedia content operated by the user;
a matching submodule, configured to match the generated content identifier of the multimedia content operated by the user against the content identifiers stored in the correspondence storage and update module, and to output the matching result;
a determination submodule, configured to: when the matching result is that an identical content identifier has been matched, determine, according to the correspondence stored in the correspondence storage and update module, the interest point corresponding to that content identifier as the interest point of the user; and
when the matching result is that no identical content identifier has been matched, calculate the difference and select, in ascending order of the difference, one or more corresponding interest points as the interest points of the user.
11. The device of any one of claims 7 to 9, characterized in that all modules of the device are arranged in a network-side server; or
the media characteristic determination module of the device is arranged in a user client and the remaining modules are arranged in the network-side server; the user client further sends the media characteristic value, or the media characteristic value and the content identifier, to the network-side server.
12. The device of claim 11, characterized in that the user client comprises:
a user operation case generator, configured to generate a multimedia content operation and store the operation;
a media characteristic determination module, configured to determine, according to the multimedia content operated by the user, a media characteristic value characterizing the features of the multimedia content; or configured to, in addition to determining that media characteristic value, generate the content identifier of the multimedia content operated by the user from the multimedia content data according to the preset identifier generation policy;
a media characteristic storage module, configured to store the media characteristic value determined by the media characteristic determination module and, when the media characteristic determination module also generates the content identifier, to store the content identifier;
a media characteristic sending module, configured to send the stored media characteristic value to the network-side server, or to send the stored media characteristic value and content identifier to the network-side server.
CN200810241181A 2008-12-26 2008-12-26 User interest point determining method and device Expired - Fee Related CN101771957B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200810241181A CN101771957B (en) 2008-12-26 2008-12-26 User interest point determining method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200810241181A CN101771957B (en) 2008-12-26 2008-12-26 User interest point determining method and device

Publications (2)

Publication Number Publication Date
CN101771957A (en) 2010-07-07
CN101771957B CN101771957B (en) 2012-10-03

Family

ID=42504485

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200810241181A Expired - Fee Related CN101771957B (en) 2008-12-26 2008-12-26 User interest point determining method and device

Country Status (1)

Country Link
CN (1) CN101771957B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103262125A (en) * 2010-11-04 2013-08-21 诺基亚公司 Method and apparatus for annotating point of interest information
CN104715007A (en) * 2014-12-26 2015-06-17 小米科技有限责任公司 User identification method and device
CN105635210A (en) * 2014-10-30 2016-06-01 腾讯科技(武汉)有限公司 Network information recommending method and device, and reading system
CN109284449A (en) * 2018-10-23 2019-01-29 厦门大学 The recommended method and device of point of interest
CN110781413A (en) * 2019-08-28 2020-02-11 腾讯大地通途(北京)科技有限公司 Interest point determining method and device, storage medium and electronic equipment
CN111222047A (en) * 2020-01-03 2020-06-02 深圳市华宇讯科技有限公司 Picture downloading method, device, server and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1052578A3 (en) * 1999-05-10 2002-04-17 Matsushita Electric Industrial Co., Ltd. Contents extraction system and method
US20050076055A1 (en) * 2001-08-28 2005-04-07 Benoit Mory Automatic question formulation from a user selection in multimedia content
US8115869B2 (en) * 2007-02-28 2012-02-14 Samsung Electronics Co., Ltd. Method and system for extracting relevant information from content metadata

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103262125A (en) * 2010-11-04 2013-08-21 诺基亚公司 Method and apparatus for annotating point of interest information
US9472159B2 (en) 2010-11-04 2016-10-18 Nokia Technologies Oy Method and apparatus for annotating point of interest information
CN103262125B (en) * 2010-11-04 2017-06-20 诺基亚技术有限公司 Method and apparatus for explaining interest point information
CN105635210A (en) * 2014-10-30 2016-06-01 腾讯科技(武汉)有限公司 Network information recommending method and device, and reading system
CN104715007A (en) * 2014-12-26 2015-06-17 小米科技有限责任公司 User identification method and device
CN109284449A (en) * 2018-10-23 2019-01-29 厦门大学 The recommended method and device of point of interest
CN110781413A (en) * 2019-08-28 2020-02-11 腾讯大地通途(北京)科技有限公司 Interest point determining method and device, storage medium and electronic equipment
CN110781413B (en) * 2019-08-28 2024-01-30 腾讯大地通途(北京)科技有限公司 Method and device for determining interest points, storage medium and electronic equipment
CN111222047A (en) * 2020-01-03 2020-06-02 深圳市华宇讯科技有限公司 Picture downloading method, device, server and storage medium

Also Published As

Publication number Publication date
CN101771957B (en) 2012-10-03

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121003

Termination date: 20211226

CF01 Termination of patent right due to non-payment of annual fee