CN104408115B - Semantic link-based heterogeneous resource recommendation method and device on a television platform - Google Patents
Semantic link-based heterogeneous resource recommendation method and device on a television platform
- Publication number
- CN104408115B CN201410687895.0A
- Authority
- CN
- China
- Prior art keywords
- media resource
- word
- media
- weight
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/43—Querying
- G06F16/435—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/43—Querying
- G06F16/438—Presentation of query results
- G06F16/4387—Presentation of query results by the use of playlists
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The invention discloses a semantic link-based resource recommendation method and device on a television platform. The method comprises: extracting the text information of all media resources in a background media resource library; extracting candidate feature words of each media resource from its text information, calculating the weights of the candidate feature words, filtering the candidate feature words by weight to obtain feature words, and generating a feature word weight matrix T of the background media resource library; and, if the current media resource watched by the user is a media resource in the background media resource library, using a clustering method and the feature word weight matrix T to calculate the clustering similarity between each media resource in the library and the current media resource, and selecting the L media resources with the highest clustering similarity to generate a media resource recommendation list.
Description
Technical Field
The invention relates to the technical field of multimedia, in particular to a semantic link-based heterogeneous resource recommendation method and device on a television platform.
Background
When a user watches a television program on a television platform, the user is often interested in some information in the current program and wants to watch other media resources related to it. To meet this need, several methods for recommending among media resources exist. In general they extract keywords from the resource the user is currently watching, use the keywords as a vector representing the user's characteristics, and recommend resources that are highly similar to the current resource.
However, the existing methods for recommending among media resources have many shortcomings. Most of them recommend among resources of the same type, and recommendation among heterogeneous resources is rarely applied. The few heterogeneous recommendation methods are one-way, recommending from one kind of resource to another, such as video recommendation associated with a television program or product recommendation associated with a television program; methods that recommend among multiple kinds of resources are scarce. The words that play an important role in recommendation are only partly recognized automatically, and the rest must be constructed manually, which is cumbersome. The methods are limited to morphological information and lack semantic information. They also rely on manual labeling and make no use of user feedback, so the recommendation results are not ideal for the user.
Disclosure of Invention
In view of the above, the invention provides a method and a device for recommending heterogeneous resources on a television platform based on semantic links, which can automatically and intelligently recommend heterogeneous resources according to the resources currently watched by a user without additional operation of the user.
The technical scheme provided by the invention is as follows:
A semantic link-based heterogeneous resource recommendation method on a television platform comprises the following steps:
extracting text information of all media resources in a background media resource library;
extracting candidate feature words of each media resource according to the text information of the media resource, calculating the weights of the candidate feature words, filtering the candidate feature words according to the weights to obtain feature words, and generating a feature word weight matrix T of the background media resource library;
if the current media resource watched by the user is a media resource in the background media resource library, calculating, with a clustering method and the feature word weight matrix T, the clustering similarity between each media resource in the background media resource library and the current media resource, and selecting the L media resources with the highest clustering similarity to generate a media resource recommendation list, wherein L is an integer greater than 0.
A semantic link-based heterogeneous resource recommendation device on a television platform comprises:
a text information extraction module, used for extracting the text information of all media resources in the background media resource library;
a feature word extraction module, used for extracting candidate feature words of each media resource according to the text information of the media resource, calculating the weights of the candidate feature words, filtering the candidate feature words according to the weights to obtain feature words, and generating a feature word weight matrix T of the background media resource library;
and a media resource recommendation list generation module, used for, if the current media resource watched by the user is a media resource in the background media resource library, calculating, with a clustering method and the feature word weight matrix T, the clustering similarity between each media resource in the background media resource library and the current media resource, and selecting the L media resources with the highest clustering similarity to generate a media resource recommendation list, wherein L is an integer greater than 0.
In summary, the semantic link-based heterogeneous resource recommendation method and device on the television platform provided by the invention map various heterogeneous resources into the same semantic space by relying on mass data resources, automatically construct semantic relationships between heterogeneous resources, and generate semantic link relationships between text to video, video to text and other heterogeneous resources, thereby generating a heterogeneous resource recommendation list.
Drawings
FIG. 1 is a flow chart of a first embodiment of the method of the present invention;
FIG. 2 is a flow chart of a second embodiment of the method of the present invention;
FIG. 3 is a structural diagram of a device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to the accompanying drawings and specific embodiments.
When a user watches a media resource on the television platform, the semantic link-based heterogeneous resource recommendation method computes the clustering similarity between the heterogeneous resources in the background media resource library and the media resource currently being watched. It then offers the user the L background media resources most relevant to the current media resource, so that the user can conveniently watch related background media resources.
Method embodiment one
FIG. 1 is a flowchart of the first method embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
Step 101: Extract the text information of all media resources in the background media resource library.
In this step, text information is first extracted from every media resource in the background media resource library. Each media resource in the library is denoted $D_i$, where $i$ is a positive integer with $1 \le i \le N$, and $N$ is the number of media resources contained in the library.
The media resources in the background media resource library fall into two broad categories: news texts and video resources. For news texts, the text information is extracted directly. For video resources, the text information is located in the video title and the subtitle content. The video title is relatively easy to obtain, and the subtitle content can be obtained in two ways: if the play stream carries its own subtitles, they are extracted from the play stream; otherwise the images are processed, the subtitles are extracted by locating their position in the image, and the result is merged into the corresponding video description text.
Once the text information of all media resources in the background media resource library has been extracted, each media resource is represented in text form.
Step 102: Extract candidate feature words for each media resource in the background media resource library.
Step 101 yields the text information of each media resource in the background media resource library. In this step that text information is processed further to obtain the candidate feature words of each media resource, which represent the content of the media resource to a certain extent.
First, a lexical analysis tool segments the text information of each media resource into word segments according to part of speech, producing a word segmentation sequence for each media resource. Because the lexical analysis tool segments text purely by part of speech, it considers neither how important a segment is for representing the media resource nor the semantic relationship between a segment and its context. The segmentation may therefore produce segments with no practical meaning, such as "in" and "will", and may split a word string that belongs together into two or more segments. For example, "search fox video" may be split into the three segments "search", "fox" and "video", although the whole string "search fox video" should represent the media resource.
Because of this shortcoming, the word segments cannot be used directly as candidate feature words. They are instead matched against a hot word dictionary and corrected with it: segments that together form an entry of the hot word dictionary are merged according to the longest word string, and the merged segments serve as the candidate feature words of the media resource. For example, suppose the word segmentation sequence of a media resource contains the three segments "search", "fox" and "video", and the hot word dictionary contains the four hot words "search", "fox", "video" and "search fox video". The three segments are then merged according to the longest word string in the dictionary, "search fox video", which becomes a candidate feature word of the media resource. In a specific implementation, the word segmentation sequence of each media resource can be matched against the hot word dictionary with a dictionary tree (trie). After this correction the segments better match people's reading habits.
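The longest-word-string merge described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `merge_with_hot_words` and the `max_span` limit are hypothetical, segments are joined with spaces for readability in English, and a plain set lookup stands in for the dictionary tree mentioned in the text.

```python
def merge_with_hot_words(segments, hot_words, max_span=4):
    """Merge adjacent segments into the longest hot word they form."""
    merged, i = [], 0
    while i < len(segments):
        end = i + 1  # default: keep the single segment unchanged
        # Try the longest span first so the longest word string wins.
        for j in range(min(len(segments), i + max_span), i + 1, -1):
            if " ".join(segments[i:j]) in hot_words:
                end = j
                break
        merged.append(" ".join(segments[i:end]))
        i = end
    return merged


segments = ["watch", "search", "fox", "video", "now"]
hot_words = {"search", "fox", "video", "search fox video"}
print(merge_with_hot_words(segments, hot_words))
```

A trie would avoid rebuilding the joined string for every span, which matters for large dictionaries; the set-based version only shows the matching rule.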
The hot word dictionary is a set of hot words that represent the semantic information of the background media resource library. It is constructed as follows:
(1) According to the language of the text information of the media resources in the background media resource library, select language-specific separators and split the text information of all media resources into clauses. Suitable separators are, for example, Chinese punctuation marks such as "。", "！" and "？", or English punctuation marks such as ".", "!" and "?".
(2) Calculate the word frequency of each repeated word string in the background media resource library, where the word frequency of a repeated word string is defined as the number of clauses in the library in which it appears. Every repeated word string whose word frequency exceeds a word frequency threshold is taken as a candidate word string, and together these form the candidate word string set.
(3) Filter the candidate word strings. The candidate word strings that remain after filtering are the hot words that make up the hot word dictionary.
The filtering can be carried out in the following three steps:
a. Collect a stop-word list and filter the candidate word strings with it, that is, delete from the candidate word string set every candidate word string that appears in the stop-word list.
b. Calculate a weight for each candidate word string, expressed as its term frequency-inverse document frequency (TF-IDF) value, and delete from the candidate word string set every candidate word string whose weight is below a weight threshold.
c. Build prior knowledge about the types of noise found in the candidate word strings, for example the strings of time information, numbers and quantifiers that appear frequently in text information, and delete such noise strings from the candidate word string set.
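The construction steps above can be sketched roughly as below. Everything named here is an assumption made for illustration: `build_hot_words`, the separator set, the `max_words` limit on string length, and whitespace tokenization. Filters a and c are collapsed into one simplified rule that drops strings consisting only of stop-word (deactivation list) entries and digits, and the TF-IDF filter of step b is left out.

```python
import re
from collections import Counter

CLAUSE_SEPARATORS = re.compile(r"[。！？.!?]")  # assumed separator set


def build_hot_words(texts, stop_words, freq_threshold=1, max_words=2):
    """Return word strings that recur across clauses and survive filtering."""
    clause_freq = Counter()
    for text in texts:
        for clause in CLAUSE_SEPARATORS.split(text):
            words = clause.split()
            strings = {
                " ".join(words[i:i + n])
                for n in range(1, max_words + 1)
                for i in range(len(words) - n + 1)
            }
            clause_freq.update(strings)  # each string counts once per clause
    candidates = {s for s, f in clause_freq.items() if f > freq_threshold}

    def is_noise(candidate):
        # Simplified stand-in for filters a and c: drop strings made up
        # only of stop words and digits.
        return all(w in stop_words or w.isdigit() for w in candidate.split())

    return {s for s in candidates if not is_noise(s)}


texts = ["smart tv news. smart tv sports!", "2014 is here. 2014 is gone?"]
print(sorted(build_hot_words(texts, stop_words={"is"})))
```

Counting a string once per clause, rather than once per occurrence, follows the definition of word frequency given in step (2).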
Step 103: Extract the feature words of each media resource in the background media resource library.
In this step the feature words of each media resource are extracted, so that each media resource is represented by at least one feature word. The feature words are extracted as follows:
Calculate the weight of each candidate feature word obtained in step 102 for each media resource, again expressed as the TF-IDF value of the candidate feature word, and delete the candidate feature words whose weight is below the weight threshold. The candidate feature words whose weight is not below the threshold are then filtered through the stop-word list, and those that remain are the feature words of the media resource.
The feature words of all media resources in the background media resource library are defined as the feature words of the library, and the feature word vector of the library is written $C = [c_1, \ldots, c_j, \ldots, c_M]$, where $c_j$ is the $j$-th feature word of the library and $M$ is the number of feature words of the library. The feature words of the library include the feature words of every media resource, and any two feature words of the library are different.
An $M \times N$ feature word weight matrix $T$ is defined, where the number of rows $M$ is the number of feature words $c_j$ of the library and the number of columns $N$ is the number of media resources $D_i$ in the library. The element $t_{ji}$ of $T$ is the weight of feature word $c_j$ on media resource $D_i$: when $c_j$ is a feature word of $D_i$, $t_{ji}$ is the TF-IDF value of $c_j$ on $D_i$; otherwise $t_{ji} = 0$.
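A minimal sketch of how such a weight matrix could be assembled is shown below. The TF-IDF variant used here (relative term frequency times the natural log of N over document frequency) is one common choice and an assumption, since the text does not fix the exact formula; the function name `build_weight_matrix` and the default threshold are likewise hypothetical, and stop-word filtering is omitted.

```python
import math
from collections import Counter


def build_weight_matrix(docs, weight_threshold=0.1):
    """docs: one list of candidate feature words per media resource.

    Returns (vocab, T) where T[j][i] is the weight of vocab[j] on resource i.
    """
    n_docs = len(docs)
    doc_freq = Counter(w for doc in docs for w in set(doc))
    per_doc = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid dividing by zero for empty resources
        tfidf = {
            w: (c / total) * math.log(n_docs / doc_freq[w])
            for w, c in counts.items()
        }
        # Keep only candidates whose weight is not below the threshold.
        per_doc.append({w: v for w, v in tfidf.items() if v >= weight_threshold})
    vocab = sorted(set().union(*per_doc))  # distinct feature words of the library
    T = [[d.get(w, 0.0) for d in per_doc] for w in vocab]
    return vocab, T


docs = [["tv", "news", "tv"], ["tv", "sports"], ["news", "film"]]
vocab, T = build_weight_matrix(docs)
print(vocab)
```

Rows follow the feature words and columns follow the media resources, matching the $M \times N$ layout in the text.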
Step 104: Perform singular value decomposition on the feature word weight matrix $T$.
To mine the semantic relations among the feature words of the background media resource library, singular value decomposition is applied to the feature word weight matrix $T$. The decomposition yields three matrices $S$, $V$ and $U^{T}$ that carry the semantic relations, with $T = S V U^{T}$. Here $U^{T}$ is the feature word weight matrix obtained by reducing the dimension of $T$ through singular value decomposition. Singular value decomposition performs topic extraction: the weights of words that share a topic agree within a certain range, so the decomposition uncovers the implicit semantic relations among the feature words in $T$.
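As an illustration of this step, the sketch below decomposes a toy matrix with NumPy and keeps the leading singular values. The toy values and the rank `k` are assumptions; the text does not say how many dimensions are retained. The variable names mirror the text's $S$, $V$ and $U^{T}$.

```python
import numpy as np

# Toy feature word weight matrix: 4 feature words (rows), 3 resources (columns).
T = np.array([
    [1.0, 0.0, 1.0],
    [0.0, 2.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
])

# numpy.linalg.svd returns factors with T = S @ diag(v) @ Ut.
S, v, Ut = np.linalg.svd(T, full_matrices=False)

k = 2  # hypothetical number of latent topics to keep
S_k, V_k, Ut_k = S[:, :k], np.diag(v[:k]), Ut[:k, :]

# Rank-k reconstruction; exact here because the toy matrix has rank 2.
T_k = S_k @ V_k @ Ut_k
print(np.round(Ut_k, 3))
```

Resources 1 and 3 have identical columns in the toy matrix, so they receive identical columns in the reduced matrix, which is the kind of latent agreement the decomposition is meant to expose.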
Step 105: Determine whether the current media resource watched by the user is a media resource in the background media resource library. If not, execute step 106; if so, execute step 107.
Step 106: Calculate the weight vector of the current media resource.
In this step the text information of the current media resource watched by the user is obtained first, in the same way as the text information of each media resource in the background media resource library is obtained in step 101. Candidate feature words of the current media resource are then extracted in the same way as in step 102 and matched against the feature word vector $C$; any candidate feature word that is not an element of $C$ is deleted. The weights of the remaining candidate feature words are calculated, again as TF-IDF values, and the candidate feature words whose weight is below the weight threshold are deleted. The candidate feature words whose weight is not below the threshold are filtered through the stop-word list, and those that remain are the feature words of the current media resource.
A weight vector $Y$ of the current media resource is then constructed. $Y$ is an $M \times 1$ matrix whose element $y_j$ ($1 \le j \le M$) is the weight of feature word $c_j$ in the current media resource: when $c_j$ is a feature word of the current media resource, $y_j$ is the TF-IDF value of $c_j$ in the current media resource; otherwise $y_j = 0$.
The matrix $Y$ is transformed as follows: $Y_1 = Y^{T} S V^{-1}$, where $Y^{T}$ is the transpose of $Y$ and $V^{-1}$ is the inverse of $V$.
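The transformation can be motivated from the decomposition itself. The short derivation below assumes the standard properties of singular value decomposition, namely that the columns of $S$ are orthonormal and that $V$ is diagonal with nonzero retained singular values; the symbol $t_d$ for the $d$-th column of $T$ is introduced only for this illustration.

```latex
\begin{aligned}
T &= S V U^{T}, \qquad S^{T} S = I, \qquad V^{T} = V \\
T^{T} &= U V S^{T} \\
T^{T} S &= U V S^{T} S = U V \\
U &= T^{T} S V^{-1} \\
\text{row } d \text{ of } U &= t_d^{T} S V^{-1}
\quad \text{for the } d\text{-th column } t_d \text{ of } T \\
Y_1 &= Y^{T} S V^{-1}
\quad \text{(the same map applied to a new column } Y\text{)}
\end{aligned}
```

Under these assumptions a resource already in the library is mapped to its own row of $U$, so a new resource lands in the same reduced space and can be compared with the library resources directly.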
Step 107: Generate a media resource recommendation list with a clustering method.
So that the recommendation list captures the user's interests more accurately, the media resource recommendation list is generated with a clustering method, which meets the user's needs for both diversity and relevance.
In this step the feature words of the current media resource are defined as specific feature words. The media resources in the background media resource library whose weights on all the specific feature words are not equal to 0 form the background media resource set.
The background media resource set is clustered with the K-means algorithm, where K is set to the number of specific feature words, so that the set is divided into K classes.
The set is then traversed to compute the clustering similarity between each background media resource and the current media resource. The clustering similarity between background media resource $D_j$ and the current media resource $D'$ is calculated by a formula that is not reproduced in this text,
in which the similarity $\mathrm{Sim}(D_j, D')$ between $D_j$ and $D'$ is the cosine similarity:
$$\mathrm{Sim}(D_j, D') = \frac{\sum_k u_{jk}\, y_k}{\sqrt{\sum_k u_{jk}^{2}}\,\sqrt{\sum_k y_k^{2}}}$$
Here, if the current media resource $D'$ is not a resource in the background media resource library, $u_{jk}$ is the element in row $j$, column $k$ of $U^{T}$ corresponding to $D_j$, and $y_k$ is the $k$-th column element of $Y_1$ corresponding to $D'$. If $D'$ is a resource in the library, that is, $D' = D_d$ with $d \ne j$ and $1 \le d \le N$, then $u_{jk}$ is defined as before and $y_k$ is the element in row $d$, column $k$ of $U^{T}$ corresponding to $D'$.
All background media resources in the set are sorted by clustering similarity, and the first L are selected as the L background media resources most relevant to the current media resource. They form the recommendation list returned to the user, where L is an integer greater than 0.
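The ranking part of this step can be sketched as follows. The sketch covers only the cosine similarity and the top-L selection; the K-means grouping and the clustering similarity formula are not included because that formula is not available in the text. The names `cosine`, `recommend` and `exclude`, and the toy vectors, are assumptions.

```python
import math


def cosine(u, y):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, y))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def recommend(latent_vectors, current, top_l, exclude=None):
    """Return indices of the top_l resources most similar to `current`."""
    scored = [
        (cosine(vec, current), idx)
        for idx, vec in enumerate(latent_vectors)
        if idx != exclude  # skip the current resource when it is in the library
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:top_l]]


latent = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
print(recommend(latent, current=[1.0, 0.0], top_l=2, exclude=0))
```

The `exclude` argument reflects the condition that a resource in the library is not compared with itself.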
Step 108: Update the background media resource library.
In this step, if the current media resource watched by the user is already in the background media resource library, the library does not need to be updated and the feature word weight matrix $T$ is unchanged. If the current media resource is not in the library, the current media resource $D'$ is added to the library as $D_{N+1}$, so the updated library contains N+1 media resources. The feature word weight matrix $T$ is updated accordingly to an $M \times (N+1)$ matrix by appending one column to the original $T$; the appended column is the vector $Y$ constructed in step 106. When a recommendation list is later generated for another current media resource, the library contains N+1 media resources, and steps 101 to 103 need not be executed again; execution starts directly at step 104.
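Since $T$ has one row per feature word and one column per media resource, adding resource $D_{N+1}$ amounts to appending the $M \times 1$ weight vector $Y$ as a new column. A minimal sketch, with the hypothetical helper name `append_resource` and a nested-list matrix:

```python
def append_resource(T, y):
    """Return a copy of T (M rows) with the weight vector y added as a column."""
    if len(T) != len(y):
        raise ValueError("y must have one weight per feature word (row of T)")
    return [row + [weight] for row, weight in zip(T, y)]


T = [[1.0, 0.0], [0.0, 2.0]]   # M = 2 feature words, N = 2 resources
T_updated = append_resource(T, [0.5, 0.0])
print(T_updated)
```

The original matrix is left untouched, so the previous decomposition stays valid until step 104 is run again on the updated matrix.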
This completes the heterogeneous resource recommendation for the current media resource watched by the user on the television platform. The recommendation list obtained in this way meets the user's need for diverse information.
Method embodiment two
To further raise the semantic relevance between the recommended heterogeneous resources and the current media resource, this embodiment adjusts the weights of the feature words of clicked media resources in the recommendation list using implicit user feedback, such as the click-through rate and the click order of different users on the media resources in the list, so that later recommendation lists come closer to the users' interests. FIG. 2 is a flowchart of this embodiment, taking the adjustment of media resource $R_l$ in the media resource recommendation list as an example, where $l$ is a positive integer with $1 \le l \le L$. As shown in FIG. 2, the following steps are executed each time a user clicks a media resource in the recommendation list:
Step 201: Calculate a single user's score for the media resource.
The user can click and watch one or more media resources from the recommendation list according to personal interest. When the user clicks media resources in a recommendation list, a click order is generated for the clicked media resources. The click order of the user on media resource $R_l$ is denoted $\mathrm{rank}(R_l)$. Because $R_l$ belongs to a recommendation list of L media resources, its click order must satisfy $1 \le \mathrm{rank}(R_l) \le L$. A single user's score for $R_l$ is computed from the click order by a formula that is not reproduced in this text, in which Score_max is a constant that bounds the maximum score a single user can give a media resource.
Step 202: a current total score of the media assets is calculated.
Media resource RlThe current total score is defined as the current total score of all users for the media resource RlThe sum of the scores of (a). Suppose there are currently a total of P users clicking the mediaBody resource RlEach user will be assigned to a media resource RlGenerating a score, thenIs the media resource RlThe current total score.
Step 203: and judging whether the current total score of the media resources is larger than a score threshold value, if not, executing the step 204, and if so, executing the step 205.
In this step, P is R of the current click media resourcelNumber of users, if media resource RlThe current total score is not greater than the score thresholdIndicate a click on media asset RlIs less, and/or the user clicks on the media resource RlThe order of the media resources is later, the reflected information is the media resource RlNot very attractive to a wide range of users, then only for that RlFine tuning the feature word weight; if media resource RlThe current total score is greater than the score thresholdIndicate a click on media asset RlAnd/or the user clicks on the media resource RlThe order of the media resources is earlier, and the reflected information is the media resource RlIs more attractive to a wide range of users, then R is the most attractive tolThe weight of the feature word is adjusted to a greater extent.
Step 204: Fine-tune the weight of each feature word of the media resource.
In this step the weight of each feature word of $R_l$ is fine-tuned by a formula that is not reproduced in this text, in which $t_j$ is the weight of the $j$-th feature word of $R_l$, that is, the corresponding element for $R_l$ in the feature word weight matrix $T$, and $\alpha$ is a weight adjustment parameter, an empirical constant. After the weight of each feature word of $R_l$ has been calculated with this formula, the feature word weight matrix $T$ of the background media resource library is updated.
Step 205: Add all feature words of the media resource to the high-frequency feature word set and adjust the weight of each feature word of the media resource.
In this step, because the current total score of $R_l$ is greater than the score threshold, $R_l$ is generally attractive to users. All feature words of $R_l$ are therefore added to the high-frequency feature word set, whose feature words are mutually different, that is, the set contains no repeated feature words. The feature word weights of $R_l$ are then adjusted according to the formula $f(t_j) = t_j \times (1 + \mathrm{Score}(R_l)/(\beta + 1))$, where $t_j$ is the weight of the $j$-th feature word of $R_l$, that is, the corresponding element for $R_l$ in $T$; $f(t_j)$ is the adjusted weight of that feature word; $\beta$ is a weight adjustment parameter, an empirical constant; and $x$ is the number of feature words contained in the high-frequency feature word set. After the weight of each feature word of $R_l$ has been calculated with this formula, the feature word weight matrix $T$ of the background media resource library is updated.
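The branch between steps 204 and 205 and the step 205 formula $f(t_j) = t_j \times (1 + \mathrm{Score}(R_l)/(\beta+1))$ can be sketched as below. Several points are assumptions: the function names are hypothetical, the current total score is passed in as $\mathrm{Score}(R_l)$, and the fine-tuning rule of step 204 is supplied by the caller because its formula is not available in the text.

```python
def total_score(user_scores):
    """Current total score of a media resource: sum of all users' scores."""
    return sum(user_scores)


def boost_weights(weights, score, beta):
    """Apply f(t_j) = t_j * (1 + score / (beta + 1)) to every weight."""
    factor = 1 + score / (beta + 1)
    return [t * factor for t in weights]


def adjust(weights, user_scores, score_threshold, beta, fine_tune):
    """Choose between the step 204 and step 205 adjustments."""
    total = total_score(user_scores)
    if total > score_threshold:
        return boost_weights(weights, total, beta)  # step 205
    return fine_tune(weights)                       # step 204, caller-defined


# Two users scored the resource 1.0 and 2.0; the threshold 2.5 is exceeded.
print(adjust([0.2, 0.4], [1.0, 2.0], 2.5, beta=2.0, fine_tune=lambda w: w))
```

With a larger $\beta$ the boost factor moves closer to 1, so $\beta$ acts as a damping constant on how strongly click feedback changes the weights.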
By adjusting the feature word weight matrix $T$ according to the click counts and click orders of different users, the feature word weights of the background media resources follow the users' click feedback. This gives users a more reasonable ranking of popular media resources and improves recommendation performance.
The invention also discloses a device of the resource recommendation method based on semantic link on the television platform, fig. 3 is a structural diagram of the device, and as shown in fig. 3, the device comprises:
the text information extraction module 310 is configured to extract text information of all media resources in the background media resource library;
the feature word extraction module 320 is configured to extract candidate feature words of each media resource according to text information of the media resource, calculate a weight of the candidate feature words, filter the candidate feature words according to the weight to obtain feature words, and generate a feature word weight matrix T of the background media resource library;
the media resource recommendation list generation module 330 is configured to, if the current media resource watched by the user is a media resource in the background media resource library, calculate the clustering similarity between each media resource in the background media resource library and the current media resource by a clustering method using the feature word weight matrix T, and select the L media resources with the highest clustering similarity to generate a media resource recommendation list.
The feature word extraction module 320 further includes:
the word segmentation sequence sub-module 321 is configured to, for each media resource in the background media resource library, segment the text information of the media resource into a word segmentation sequence according to parts of speech by using a lexical analysis tool;
the candidate feature word extraction sub-module 322 is configured to match the word segmentation sequence of each media resource against the hot word dictionary, merge multiple word segments that have a containment relationship in the hot word dictionary according to the longest word string, and use the merged word segments as candidate feature words of the media resource;
the feature word weight matrix generation sub-module 323 is configured to calculate the weight of each candidate feature word, the weight being the term frequency-inverse document frequency (TF-IDF) value of the candidate feature word, and to filter the candidate feature words whose weight is not less than the weight threshold through a stop-word list, the candidate feature words that remain after filtering being the feature words of the media resource;
to construct the feature words of the background media resource library from the feature words of all media resources of the background media resource library, represented by a vector $C = [c_1, \dots, c_j, \dots, c_M]$, where M is the number of feature words of the background media resource library, the feature words of the background media resource library comprise the feature words of each media resource in the background media resource library, and any two feature words of the background media resource library are different;
and to set an M × N feature word weight matrix T, where the number of rows M is the number of feature words $c_j$ of the background media resource library, the number of columns N is the number of media resources $D_i$ of the background media resource library, and element $t_{ji}$ of the feature word weight matrix T represents the weight of feature word $c_j$ on media resource $D_i$: when feature word $c_j$ is a feature word of media resource $D_i$, $t_{ji}$ is the TF-IDF value of feature word $c_j$ on media resource $D_i$; when feature word $c_j$ is not a feature word of media resource $D_i$, $t_{ji} = 0$.
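The construction of the vector C and the matrix T can be sketched as below. The text does not state which TF-IDF variant is used, so the formula here (relative term frequency times the natural logarithm of N over document frequency) is an assumption, as are the function name `build_weight_matrix` and the list-of-lists layout. The input is assumed to have already passed the weight threshold and stop-word filtering.

```python
import math
from collections import Counter
from typing import List, Tuple


def build_weight_matrix(docs: List[List[str]]) -> Tuple[List[str], List[List[float]]]:
    """docs[i] holds the feature words of media resource D_i (repeats allowed).

    Returns the vocabulary C (distinct feature words) and the M x N matrix T,
    where T[j][i] is the TF-IDF weight of c_j on D_i, or 0 when c_j is absent.
    """
    vocab = sorted({w for d in docs for w in d})      # C = [c_1, ..., c_M]
    n_docs = len(docs)                                # N
    doc_freq = Counter(w for d in docs for w in set(d))
    T = [[0.0] * n_docs for _ in vocab]
    for i, d in enumerate(docs):
        counts = Counter(d)
        for j, w in enumerate(vocab):
            if w in counts:
                tf = counts[w] / len(d)
                idf = math.log(n_docs / doc_freq[w])  # assumed IDF variant
                T[j][i] = tf * idf
    return vocab, T


vocab, T = build_weight_matrix([["news", "sport", "sport"], ["news", "film"]])
```

With this unsmoothed IDF, a word that occurs in every media resource receives weight 0; a smoothed variant would avoid that, but the source does not say which form applies.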
The feature word weight matrix generation submodule 323 is further configured to:
performing singular value decomposition on the feature word weight matrix T to obtain three matrices $S$, $V$ and $U^{T}$ containing semantic relations, with $T = S V U^{T}$, where $U^{T}$ is the feature word weight matrix obtained after singular value decomposition and dimension reduction of the feature word weight matrix T.
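A minimal sketch of this decomposition, assuming NumPy as the numerical library (the text names no library) and a hypothetical reduced dimension `k` chosen by the caller; the text does not say how many singular values are retained.

```python
import numpy as np


def decompose(T: np.ndarray, k: int):
    """Truncated SVD of the M x N matrix T, keeping the k largest singular values.

    Returns S_k (M x k), V_k (k x k diagonal) and Ut_k (k x N), following the
    document's naming T = S V U^T.
    """
    S, sigma, Ut = np.linalg.svd(T, full_matrices=False)
    return S[:, :k], np.diag(sigma[:k]), Ut[:k, :]


# Toy 4 x 3 weight matrix: 4 feature words, 3 media resources.
T = np.array([[1.0, 0.0, 0.5],
              [0.0, 2.0, 0.0],
              [0.5, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
S_k, V_k, Ut_k = decompose(T, k=2)
```

Each column of `Ut_k` is the reduced-dimension representation of one media resource; keeping all singular values reproduces T exactly, while a smaller `k` gives the dimension reduction.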
If the current media resource watched by the user is not a media resource in the background media resource library, the device further comprises a current media resource feature word weight calculation module 340, configured to obtain text information of the current media resource watched by the user, extract feature words of the current media resource according to the text information of the current media resource, calculate the weight of each feature word, and construct a weight vector Y of the current media resource, where Y is an M × 1 matrix and matrix element $y_j$ ($1 \le j \le M$) is the weight of feature word $c_j$ in the current media resource: when feature word $c_j$ is a feature word of the current media resource, $y_j$ is the TF-IDF value of feature word $c_j$ in the current media resource; when feature word $c_j$ is not a feature word of the current media resource, $y_j = 0$.
The current media resource feature word weight calculation module 340 is further configured to:
transforming the matrix Y as follows: $Y1 = Y^{T} S V^{-1}$, where $Y^{T}$ is the transpose of Y and $V^{-1}$ is the inverse of V.
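The transformation can be illustrated with the sketch below, again assuming NumPy and a hypothetical truncation `k = 2`. As a sanity check it folds in a column of T itself, which should land on the matching column of the reduced $U^{T}$; the helper name `fold_in` is introduced only for this example.

```python
import numpy as np


def fold_in(Y: np.ndarray, S_k: np.ndarray, V_k: np.ndarray) -> np.ndarray:
    """Map an M x 1 weight vector Y into the reduced space: Y1 = Y^T S V^{-1}.

    The result has shape 1 x k. V_k must be invertible, i.e. every retained
    singular value must be non-zero.
    """
    return Y.T @ S_k @ np.linalg.inv(V_k)


T = np.array([[1.0, 0.0, 0.5],
              [0.0, 2.0, 0.0],
              [0.5, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
S, sigma, Ut = np.linalg.svd(T, full_matrices=False)
k = 2
S_k, V_k, Ut_k = S[:, :k], np.diag(sigma[:k]), Ut[:k, :]

Y = T[:, [0]]              # stand-in for the weight vector of a new resource
Y1 = fold_in(Y, S_k, V_k)  # 1 x k vector comparable with columns of Ut_k
```

Because the columns of S are orthonormal, projecting a resource that is already in the library gives back its own reduced vector, which is what makes Y1 directly comparable with the library's vectors.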
The media resource recommendation list generation module 330 further comprises:
a background media resource set generating sub-module 331, configured to define the feature words of the current media resource as specific feature words, and to form a background media resource set from the media resources in the background media resource library whose weights on all the specific feature words are not 0;
a similarity calculation sub-module 332, configured to cluster the background media resource set using the K-means algorithm, where K in the K-means algorithm is the number of specific feature words, so that the background media resource set is divided into K classes;
and to traverse the background media resource set and calculate the clustering similarity between each background media resource and the current media resource, where the clustering similarity between a background media resource $D_j$ in the set and the current media resource $D'$ is calculated by the following formula:
wherein the similarity $Sim(D_j, D')$ between the background media resource $D_j$ and the current media resource $D'$ is calculated using cosine similarity:
$$Sim(D_j, D') = \frac{\sum_{k=1} (u_{jk} \times y_k)}{\sqrt{\sum_{k=1} u_{jk}^{2}} \, \sqrt{\sum_{k=1} y_k^{2}}}$$
wherein $u_{jk}$ is the element in the $j$-th row and $k$-th column of $U^{T}$ corresponding to $D_j$, and $y_k$ is the $k$-th column element of Y1 corresponding to $D'$.
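The cosine similarity term can be sketched in plain Python as follows. The clustering similarity formula that combines this term with the K-means classes is not reproduced in this text, so the ranking helper `top_l` below uses the cosine value alone as a stand-in; that substitution, the helper names, and the zero-vector convention are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], y: Sequence[float]) -> float:
    """Cosine of the angle between the reduced vectors of D_j (u) and D' (y)."""
    dot = sum(a * b for a, b in zip(u, y))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_u == 0.0 or norm_y == 0.0:
        return 0.0  # assumption: a zero vector is treated as not similar
    return dot / (norm_u * norm_y)


def top_l(candidates: List[Sequence[float]], current: Sequence[float],
          l: int) -> List[Tuple[int, float]]:
    """Return (index, similarity) for the l candidates most similar to current."""
    scored = [(idx, cosine_similarity(vec, current))
              for idx, vec in enumerate(candidates)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:l]


ranking = top_l([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 0.0], 2)
```

In the described device the candidates would be the reduced vectors of the background media resource set, and the ranking value would be the clustering similarity rather than the bare cosine.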
The device further includes a weight learning module 340, configured to adjust the weights of the feature word weight matrix T of the background media resource library according to the click sequence and click amount of users clicking media resources in the media resource recommendation list, where the weight learning module 340 further includes:
a media resource score calculation module 341, configured to calculate the score of a single user for media resource $R_l$ by a formula in which $R_l$ is the media resource currently clicked and watched by the user in the media resource recommendation list, $rank(R_l)$ is the order in which the user clicked media resource $R_l$, with $1 \le rank(R_l) \le L$, and Score_max is a constant defining the maximum value of a single user's score for a media resource;
a media resource total score calculation module 342, configured to calculate the current total score of media resource $R_l$ by a formula in which P is the number of users who have currently clicked media resource $R_l$;
a weight adjustment module 343, configured to, if the current total score of media resource $R_l$ is not greater than the score threshold, adjust the weight of each feature word of media resource $R_l$ according to the formula $f(t_j) = t_j \times (1 + Score(R_l)/(\alpha + 1))$;
and, if the current total score of media resource $R_l$ is greater than the score threshold, to add all the feature words of media resource $R_l$ to the high-frequency feature word set and adjust the weight of each feature word of media resource $R_l$ according to the formula $f(t_j) = t_j \times (1 + Score(R_l)/(\beta + 1))$;
wherein $t_j$ is the weight of the $j$-th feature word of media resource $R_l$, namely the element of the feature word weight matrix T corresponding to media resource $R_l$; $f(t_j)$ is the adjusted weight of the $j$-th feature word of media resource $R_l$; α is a weight adjustment parameter; the feature words in the high-frequency feature word set are mutually different; β is a weight adjustment parameter; and x is the number of feature words contained in the high-frequency feature word set.
The above description covers only the preferred embodiments of the present invention and is not to be construed as limiting the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the scope of protection of the present invention.
Claims (12)
1. A resource recommendation method based on semantic links on a television platform is characterized by comprising the following steps:
extracting text information of all media resources in a background media resource library;
extracting candidate characteristic words of each media resource according to the text information of the media resource, calculating the weight of the candidate characteristic words, filtering the candidate characteristic words according to the weight to obtain characteristic words, and generating a characteristic word weight matrix T of a background media resource library;
if the current media resource watched by the user is the media resource in the background media resource library, calculating the clustering similarity of each media resource in the background media resource library and the current media resource by using the characteristic word weight matrix T by adopting a clustering method, and selecting L media resources with the highest clustering similarity to generate a media resource recommendation list, wherein L is an integer greater than 0;
wherein the step of extracting candidate feature words of each media resource according to the text information of the media resource, calculating the weights of the candidate feature words, filtering the candidate feature words to obtain feature words, and generating the feature word weight matrix T of the background media resource library further comprises:
for each media resource in the background media resource library, segmenting the text information of the media resource into a word segmentation sequence according to parts of speech by using a lexical analysis tool;
matching the word segmentation sequence of each media resource against a hot word dictionary, merging multiple word segments that have a containment relationship in the hot word dictionary according to the longest word string, and taking the merged word segments as candidate feature words of the media resource;
calculating the weight of each candidate feature word, the weight being the term frequency-inverse document frequency (TF-IDF) value of the candidate feature word, filtering the candidate feature words whose weight is not less than a weight threshold through a stop-word list, and taking the candidate feature words that remain after filtering as the feature words of the media resource;
constructing the feature words of the background media resource library from the feature words of all media resources of the background media resource library, represented by a vector $C = [c_1, \dots, c_j, \dots, c_M]$, where M is the number of feature words of the background media resource library, the feature words of the background media resource library comprise the feature words of each media resource in the background media resource library, and any two feature words of the background media resource library are different;
setting an M × N feature word weight matrix T, where the number of rows M is the number of feature words $c_j$ of the background media resource library, the number of columns N is the number of media resources $D_i$ of the background media resource library, and element $t_{ji}$ of the feature word weight matrix T represents the weight of feature word $c_j$ on media resource $D_i$: when feature word $c_j$ is a feature word of media resource $D_i$, $t_{ji}$ is the TF-IDF value of feature word $c_j$ on media resource $D_i$; when feature word $c_j$ is not a feature word of media resource $D_i$, $t_{ji} = 0$.
2. The method of claim 1, further comprising:
performing singular value decomposition on the feature word weight matrix T to obtain three matrices $S$, $V$ and $U^{T}$ containing semantic relations, with $T = S V U^{T}$, where $U^{T}$ is the feature word weight matrix obtained after singular value decomposition and dimension reduction of the feature word weight matrix T.
3. The method of claim 1, wherein if the current media resource viewed by the user is not a media resource in the background media resource library, before the clustering similarity between each media resource in the background media resource library and the current media resource is calculated by the clustering method, the method further comprises:
acquiring text information of the current media resource watched by the user, extracting feature words of the current media resource according to the text information of the current media resource, calculating the weight of each feature word, and constructing a weight vector Y of the current media resource, where Y is an M × 1 matrix and matrix element $y_j$ ($1 \le j \le M$) is the weight of feature word $c_j$ in the current media resource: when feature word $c_j$ is a feature word of the current media resource, $y_j$ is the TF-IDF value of feature word $c_j$ in the current media resource; when feature word $c_j$ is not a feature word of the current media resource, $y_j = 0$.
4. The method of claim 2, further comprising:
transforming the matrix Y as follows: $Y1 = Y^{T} S V^{-1}$, where $Y^{T}$ is the transpose of Y and $V^{-1}$ is the inverse of V.
5. The method according to claim 4, wherein the clustering method uses the feature word weight matrix T to calculate the clustering similarity between each media resource in the background media resource library and the current media resource, and further comprises:
defining the feature words of the current media resource as specific feature words, and forming a background media resource set from the media resources in the background media resource library whose weights on all the specific feature words are not 0;
clustering the background media resource set using the K-means algorithm, where K in the K-means algorithm is the number of specific feature words, so that the background media resource set is divided into K classes;
traversing the background media resource set and calculating the clustering similarity between each background media resource and the current media resource, where the clustering similarity between a background media resource $D_j$ in the set and the current media resource $D'$ is calculated by the following formula:
wherein the similarity $Sim(D_j, D')$ between the background media resource $D_j$ and the current media resource $D'$ is calculated using cosine similarity:
$$Sim(D_j, D') = \frac{\sum_{k=1} (u_{jk} \times y_k)}{\sqrt{\sum_{k=1} u_{jk}^{2}} \, \sqrt{\sum_{k=1} y_k^{2}}};$$
wherein $u_{jk}$ is the element in the $j$-th row and $k$-th column of $U^{T}$ corresponding to $D_j$, and $y_k$ is the $k$-th column element of Y1 corresponding to $D'$.
6. The method of claim 1, further comprising:
adjusting the weights of the feature word weight matrix T of the background media resource library according to the click sequence and click amount of users clicking media resources in the media resource recommendation list, specifically comprising:
calculating the score of a single user for media resource $R_l$ by a formula in which $R_l$ is the media resource currently clicked and watched by the user in the media resource recommendation list, $rank(R_l)$ is the order in which the user clicked media resource $R_l$, with $1 \le rank(R_l) \le L$, and Score_max is a constant defining the maximum value of a single user's score for a media resource;
calculating the current total score of media resource $R_l$ by a formula in which P is the number of users who have currently clicked media resource $R_l$;
if the current total score of media resource $R_l$ is not greater than the score threshold, adjusting the weight of each feature word of media resource $R_l$ according to the formula $f(t_j) = t_j \times (1 + Score(R_l)/(\alpha + 1))$;
if the current total score of media resource $R_l$ is greater than the score threshold, adding all the feature words of media resource $R_l$ to the high-frequency feature word set and adjusting the weight of each feature word of media resource $R_l$ according to the formula $f(t_j) = t_j \times (1 + Score(R_l)/(\beta + 1))$;
wherein $t_j$ is the weight of the $j$-th feature word of media resource $R_l$, namely the element of the feature word weight matrix T corresponding to media resource $R_l$; $f(t_j)$ is the adjusted weight of the $j$-th feature word of media resource $R_l$; α is a weight adjustment parameter; the feature words in the high-frequency feature word set are mutually different; β is a weight adjustment parameter; and x is the number of feature words contained in the high-frequency feature word set.
7. A resource recommendation device based on semantic links on a television platform, the device comprising:
the text information extraction module is used for extracting the text information of all the media resources in the background media resource library;
the characteristic word extraction module is used for extracting candidate characteristic words of each media resource according to the text information of the media resource, calculating the weight of the candidate characteristic words, filtering the candidate characteristic words according to the weight to obtain characteristic words, and generating a characteristic word weight matrix T of a background media resource library;
a media resource recommendation list generation module, configured to, if the current media resource watched by the user is a media resource in the background media resource library, calculate the clustering similarity between each media resource in the background media resource library and the current media resource by a clustering method using the feature word weight matrix T, and select the L media resources with the highest clustering similarity to generate a media resource recommendation list, wherein L is an integer greater than 0;
wherein the feature word extraction module further comprises:
a word segmentation sequence sub-module, configured to, for each media resource in the background media resource library, segment the text information of the media resource into a word segmentation sequence according to parts of speech by using a lexical analysis tool;
a candidate feature word extraction sub-module, configured to match the word segmentation sequence of each media resource against a hot word dictionary, merge multiple word segments that have a containment relationship in the hot word dictionary according to the longest word string, and take the merged word segments as candidate feature words of the media resource;
a feature word weight matrix generation sub-module, configured to calculate the weight of each candidate feature word, the weight being the term frequency-inverse document frequency (TF-IDF) value of the candidate feature word, and to filter the candidate feature words whose weight is not less than a weight threshold through a stop-word list, the candidate feature words that remain after filtering being the feature words of the media resource;
to construct the feature words of the background media resource library from the feature words of all media resources of the background media resource library, represented by a vector $C = [c_1, \dots, c_j, \dots, c_M]$, where M is the number of feature words of the background media resource library, the feature words of the background media resource library comprise the feature words of each media resource in the background media resource library, and any two feature words of the background media resource library are different;
and to set an M × N feature word weight matrix T, where the number of rows M is the number of feature words $c_j$ of the background media resource library, the number of columns N is the number of media resources $D_i$ of the background media resource library, and element $t_{ji}$ of the feature word weight matrix T represents the weight of feature word $c_j$ on media resource $D_i$: when feature word $c_j$ is a feature word of media resource $D_i$, $t_{ji}$ is the TF-IDF value of feature word $c_j$ on media resource $D_i$; when feature word $c_j$ is not a feature word of media resource $D_i$, $t_{ji} = 0$.
8. The apparatus of claim 7, wherein the feature word weight matrix generation sub-module is further configured to:
performing singular value decomposition on the feature word weight matrix T to obtain three matrices $S$, $V$ and $U^{T}$ containing semantic relations, with $T = S V U^{T}$, where $U^{T}$ is the feature word weight matrix obtained after singular value decomposition and dimension reduction of the feature word weight matrix T.
9. The apparatus as claimed in claim 7, wherein if the current media asset viewed by the user is not a media asset in the background media asset library, the apparatus further comprises:
a current media resource feature word weight calculation module, configured to obtain text information of the current media resource watched by the user, extract feature words of the current media resource according to the text information of the current media resource, calculate the weight of each feature word, and construct a weight vector Y of the current media resource, where Y is an M × 1 matrix and matrix element $y_j$ ($1 \le j \le M$) is the weight of feature word $c_j$ in the current media resource: when feature word $c_j$ is a feature word of the current media resource, $y_j$ is the TF-IDF value of feature word $c_j$ in the current media resource; when feature word $c_j$ is not a feature word of the current media resource, $y_j = 0$.
10. The apparatus according to claim 8, wherein the current media resource feature word weight calculation module is further configured to:
transforming the matrix Y as follows: $Y1 = Y^{T} S V^{-1}$, where $Y^{T}$ is the transpose of Y and $V^{-1}$ is the inverse of V.
11. The apparatus of claim 10, wherein the media resource recommendation list generation module further comprises:
a background media resource set generation sub-module, configured to define the feature words of the current media resource as specific feature words, and to form a background media resource set from the media resources in the background media resource library whose weights on all the specific feature words are not 0;
a similarity calculation sub-module, configured to cluster the background media resource set using the K-means algorithm, where K in the K-means algorithm is the number of specific feature words, so that the background media resource set is divided into K classes;
and to traverse the background media resource set and calculate the clustering similarity between each background media resource and the current media resource, where the clustering similarity between a background media resource $D_j$ in the set and the current media resource $D'$ is calculated by the following formula:
wherein the similarity $Sim(D_j, D')$ between the background media resource $D_j$ and the current media resource $D'$ is calculated using cosine similarity:
$$Sim(D_j, D') = \frac{\sum_{k=1} (u_{jk} \times y_k)}{\sqrt{\sum_{k=1} u_{jk}^{2}} \, \sqrt{\sum_{k=1} y_k^{2}}};$$
wherein $u_{jk}$ is the element in the $j$-th row and $k$-th column of $U^{T}$ corresponding to $D_j$, and $y_k$ is the $k$-th column element of Y1 corresponding to $D'$.
12. The apparatus according to claim 7, further comprising a weight learning module, configured to perform weight adjustment on a feature word weight matrix T of a background media asset library according to a click sequence and a click amount of a user clicking on a media asset in the media asset recommendation list, where the weight learning module further includes:
a media resource score calculation module, configured to calculate the score of a single user for media resource $R_l$ by a formula in which $R_l$ is the media resource currently clicked and watched by the user in the media resource recommendation list, $rank(R_l)$ is the order in which the user clicked media resource $R_l$, with $1 \le rank(R_l) \le L$, and Score_max is a constant defining the maximum value of a single user's score for a media resource;
a media resource total score calculation module, configured to calculate the current total score of media resource $R_l$ by a formula in which P is the number of users who have currently clicked media resource $R_l$;
a weight adjustment module, configured to, if the current total score of media resource $R_l$ is not greater than the score threshold, adjust the weight of each feature word of media resource $R_l$ according to the formula $f(t_j) = t_j \times (1 + Score(R_l)/(\alpha + 1))$;
and, if the current total score of media resource $R_l$ is greater than the score threshold, to add all the feature words of media resource $R_l$ to the high-frequency feature word set and adjust the weight of each feature word of media resource $R_l$ according to the formula $f(t_j) = t_j \times (1 + Score(R_l)/(\beta + 1))$;
wherein $t_j$ is the weight of the $j$-th feature word of media resource $R_l$, namely the element of the feature word weight matrix T corresponding to media resource $R_l$; $f(t_j)$ is the adjusted weight of the $j$-th feature word of media resource $R_l$; α is a weight adjustment parameter; the feature words in the high-frequency feature word set are mutually different; β is a weight adjustment parameter; and x is the number of feature words contained in the high-frequency feature word set.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410687895.0A CN104408115B (en) | 2014-11-25 | 2014-11-25 | The heterogeneous resource based on semantic interlink recommends method and apparatus on a kind of TV platform |
KR1020150099839A KR102314645B1 (en) | 2014-11-25 | 2015-07-14 | A method and device of various-type media resource recommendation |
EP15195889.9A EP3026584A1 (en) | 2014-11-25 | 2015-11-23 | Device and method for providing media resource |
US14/952,402 US10339146B2 (en) | 2014-11-25 | 2015-11-25 | Device and method for providing media resource |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410687895.0A CN104408115B (en) | 2014-11-25 | 2014-11-25 | The heterogeneous resource based on semantic interlink recommends method and apparatus on a kind of TV platform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104408115A CN104408115A (en) | 2015-03-11 |
CN104408115B (en) | 2017-09-22 |
Family
ID=52645746
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410687895.0A Active CN104408115B (en) | 2014-11-25 | 2014-11-25 | The heterogeneous resource based on semantic interlink recommends method and apparatus on a kind of TV platform |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR102314645B1 (en) |
CN (1) | CN104408115B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN104408115A (en) | 2015-03-11 |
KR102314645B1 (en) | 2021-10-19 |
KR20160062667A (en) | 2016-06-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104408115B (en) | The heterogeneous resource based on semantic interlink recommends method and apparatus on a kind of TV platform | |
Hidasi et al. | Parallel recurrent neural network architectures for feature-rich session-based recommendations | |
CN109508414B (en) | Synonym mining method and device | |
CN106354861B (en) | Film label automatic indexing method and automatic indexing system | |
CN106921891B (en) | Method and device for displaying video characteristic information | |
US20210056571A1 (en) | Determining of summary of user-generated content and recommendation of user-generated content | |
CN108280114B (en) | Deep learning-based user literature reading interest analysis method | |
CN105095508B (en) | Multimedia content recommendation method and multimedia content recommendation apparatus | |
CN105975558B (en) | Method for establishing a sentence editing model, automatic sentence editing method, and corresponding apparatus | |
US20140201180A1 (en) | Intelligent Supplemental Search Engine Optimization | |
CN105243143A (en) | Recommendation method and system based on instant voice content detection | |
CN111046225B (en) | Audio resource processing method, device, equipment and storage medium | |
US20080168056A1 (en) | On-line iterative multistage search engine with text categorization and supervised learning | |
CN103052953A (en) | Information processing device, method of processing information, and program | |
CN103069414A (en) | Information processing device, information processing method, and program | |
CN102737029A (en) | Searching method and system | |
Chiny et al. | Netflix recommendation system based on TF-IDF and cosine similarity algorithms | |
CN109063147A (en) | Online course forum content recommendation method and system based on text similarity | |
US20190082236A1 (en) | Determining Representative Content to be Used in Representing a Video | |
CN110717038A (en) | Object classification method and device | |
WO2022183923A1 (en) | Phrase generation method and apparatus, and computer readable storage medium | |
CN105574030A (en) | Information search method and device | |
CN106294358A (en) | Information search method and system | |
JP7395377B2 (en) | Content search methods, devices, equipment, and storage media | |
CN104657376A (en) | Searching method and searching device for video programs based on program relationship |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||