CN108446731A - A method and device for content deduplication - Google Patents

Info

Publication number
CN108446731A
Authority
CN
China
Prior art keywords
micro
type
content
cluster
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810220157.3A
Other languages
Chinese (zh)
Other versions
CN108446731B (en)
Inventor
王洁
徐钊
史小龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Poly Polytron Technologies Inc
Juhaokan Technology Co Ltd
Original Assignee
Poly Polytron Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Poly Polytron Technologies Inc
Priority to CN201810220157.3A
Publication of CN108446731A
Application granted
Publication of CN108446731B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a method and device for content deduplication and belongs to the communications field. The method includes: calculating, according to the content description information of each content item included in a target micro-type, a first probability that each content item belongs to each of M content topics, where M is an integer greater than 1; obtaining a theme vector of the target micro-type according to the first probability that each content item belongs to each content topic, where the target micro-type is one of N micro-types and N is an integer greater than 1; clustering the N micro-types according to the theme vector of each of the N micro-types to obtain K clusters, where K is an integer greater than 1; and selecting, from each of the K clusters, the micro-type corresponding to that cluster, and forming a micro-type set from the selected micro-types. The application can avoid a large amount of duplicate content between micro-types.

Description

A method and device for content deduplication
Technical field
The application relates to the communications field, and in particular to a method and device for content deduplication.
Background
To improve user experience and present more and better content to users, major content sites are successively rolling out waterfall-flow presentation, which allows content to load endlessly within a web page. The content may be video, music and the like. Taking video as an example, current video websites display videos in a waterfall-flow layout.
To suit waterfall-flow presentation, content providers often define thousands of micro-types, each of which includes multiple content items; the micro-types and their corresponding content are then displayed in the waterfall flow. Because the number of defined micro-types grows over time, some of the defined micro-types may be quite similar to one another, with little differentiation between them. Any two of these similar micro-types share a large amount of duplicate content.
As a result, when personalized recommendation is performed, a large amount of duplicate content may appear between two micro-types, or among several micro-types.
Summary of the invention
To avoid a large amount of duplicate content between micro-types, the embodiments of the application provide a method and device for content deduplication. The technical solution is as follows:
In a first aspect, the application provides a method for content deduplication, the method including:
calculating, according to the content description information of each content item included in a target micro-type, a first probability that each content item belongs to each of M content topics, where M is an integer greater than 1;
obtaining a theme vector of the target micro-type according to the first probability that each content item belongs to each content topic, where the target micro-type is one of N micro-types and N is an integer greater than 1;
clustering the N micro-types according to the theme vector of each of the N micro-types to obtain K clusters, where K is an integer greater than 1; and
selecting, from each of the K clusters, the micro-type corresponding to that cluster, and forming a micro-type set to be recommended from the selected micro-types.
Optionally, calculating, according to the content description information of each content item included in the target micro-type, the first probability that each content item belongs to each of the M content topics includes:
segmenting the content description information of a target content item into multiple words and forming the corpus of the target content item from those words, where the target content item is any content item in the target micro-type; and
inputting the corpus into a preset topic model for topic computation to obtain the first probability that the target content item belongs to each of the M content topics.
Optionally, obtaining the theme vector of the target micro-type according to the first probability that each content item belongs to each content topic includes:
obtaining the first probabilities with which the content items belong to a same content topic, and obtaining, from those first probabilities, a second probability that the target micro-type belongs to that content topic; and
forming the theme vector of the target micro-type from the second probabilities that the target micro-type belongs to each of the M content topics.
Optionally, selecting, from each of the K clusters, the micro-type corresponding to that cluster includes:
determining the centroid vector of the centroid of a target cluster according to the theme vector of each micro-type included in the target cluster, where the target cluster is any one of the K clusters;
calculating the distance between each micro-type and the centroid according to the theme vector of each micro-type and the centroid vector of the centroid; and
selecting, from the target cluster, the micro-type corresponding to the target cluster according to the distance between each micro-type and the centroid.
Optionally, selecting, from each of the K clusters, the micro-type corresponding to that cluster includes:
counting the number of content items included in each micro-type in a target cluster, where the target cluster is any one of the K clusters; and
selecting, from the target cluster, the micro-type corresponding to the target cluster according to the number of content items of each micro-type.
Optionally, after selecting, from each of the K clusters, the micro-type corresponding to that cluster, the method further includes:
calculating a recommendation index for each micro-type in the micro-type set according to the viewing time of each content item included in that micro-type, and selecting and recommending micro-types from the micro-type set according to the recommendation index of each micro-type.
In a second aspect, the application provides a device for content deduplication, the device including:
a computing module, configured to calculate, according to the content description information of each content item included in a target micro-type, a first probability that each content item belongs to each of M content topics, where M is an integer greater than 1;
an acquisition module, configured to obtain a theme vector of the target micro-type according to the first probability that each content item belongs to each content topic, where the target micro-type is one of N micro-types and N is an integer greater than 1;
a cluster module, configured to cluster the N micro-types according to the theme vector of each of the N micro-types to obtain K clusters, where K is an integer greater than 1; and
a selecting module, configured to select, from each of the K clusters, the micro-type corresponding to that cluster, and to form a micro-type set to be recommended from the selected micro-types.
Optionally, the computing module includes:
a composing unit, configured to segment the content description information of a target content item into multiple words and to form the corpus of the target content item from those words, where the target content item is any content item in the target micro-type; and
an input unit, configured to input the corpus into a preset topic model for topic computation to obtain the first probability that the target content item belongs to each of the M content topics.
Optionally, the acquisition module includes:
an acquiring unit, configured to obtain the first probabilities with which the content items belong to a same content topic, and to obtain, from those first probabilities, a second probability that the target micro-type belongs to that content topic; and
a composing unit, configured to form the theme vector of the target micro-type from the second probabilities that the target micro-type belongs to each of the M content topics.
Optionally, the selecting module includes:
a determination unit, configured to determine the centroid vector of the centroid of a target cluster according to the theme vector of each micro-type included in the target cluster, where the target cluster is any one of the K clusters;
a second computing unit, configured to calculate the distance between each micro-type and the centroid according to the theme vector of each micro-type and the centroid vector of the centroid; and
a first selecting unit, configured to select, from the target cluster, the micro-type corresponding to the target cluster according to the distance between each micro-type and the centroid.
Optionally, the selecting module includes:
a statistic unit, configured to count the number of content items included in each micro-type in a target cluster, where the target cluster is any one of the K clusters; and
a second selecting unit, configured to select, from the target cluster, the micro-type corresponding to the target cluster according to the number of content items of each micro-type.
Optionally, the device further includes:
a recommending module, configured to calculate a recommendation index for each micro-type in the micro-type set according to the viewing time of each content item included in that micro-type, and to select and recommend micro-types from the micro-type set according to the recommendation index of each micro-type.
In a third aspect, the embodiments of the application provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method steps of the first aspect or of any possible implementation of the first aspect.
The technical solutions provided by the embodiments of the application can include the following beneficial effects:
The theme vector of each of the N micro-types is obtained, and the N micro-types are clustered according to these theme vectors to obtain K clusters, each of which contains micro-types with similar content. The micro-type corresponding to each cluster is then selected from each of the K clusters. In this way a large number of micro-types with similar content can be excluded, so that the selected micro-types do not share a large amount of duplicate content.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the application.
Description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and, together with the specification, serve to explain the principles of the application.
Fig. 1 is a schematic diagram of a system architecture provided by an embodiment of the application;
Fig. 2 is a flowchart of a method for content deduplication provided by an embodiment of the application;
Fig. 3-1 is a flowchart of another method for content deduplication provided by an embodiment of the application;
Fig. 3-2 is a schematic diagram of a cluster provided by an embodiment of the application;
Fig. 4 is a schematic structural diagram of a device for content deduplication provided by an embodiment of the application;
Fig. 5 is a schematic structural diagram of a terminal provided by an embodiment of the application.
The above drawings show specific embodiments of the application, which are described in more detail below. The drawings and the written description are not intended to limit the scope of the concept of the application in any way, but to explain the concept of the application to those skilled in the art by reference to specific embodiments.
Detailed description
Example embodiments are described in detail here, and examples of them are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the application. Rather, they are merely examples of devices and methods consistent with some aspects of the application, as detailed in the appended claims.
Referring to Fig. 1, an embodiment of the application provides a system architecture for content deduplication, including:
A content library, which includes multiple content items, each having content description information. The content description information of a content item may include its synopsis, title and/or content type. For example, the content may be a video, and its content description information may be the video synopsis, video title and video type; the video type may be a video tag, a second-level video type and the like.
A micro-type library, which includes M micro-types, each corresponding to a micro-type rule. According to the micro-type rule of each micro-type, the content belonging to that micro-type can be determined from the content library, and each micro-type and the content belonging to it are stored in a correspondence between micro-types and content. Taking video as an example, suppose a micro-type is named "Revisiting the scheming elements in nostalgic films" and its micro-type rule is {"category": "film", "child_category_name": "nostalgic", "tag": "scheming"}. The rule is matched against each video in the content library, and the matching films are assigned to this micro-type.
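The rule matching described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout, the helper names `matches_rule` and `assign_contents`, the micro-type name and all field values are assumptions made for the example; only the rule keys `category`, `child_category_name` and `tag` come from the text.

```python
def matches_rule(content, rule):
    """Return True when every key/value pair in the rule matches the content."""
    for key, expected in rule.items():
        value = content.get(key)
        if isinstance(value, (list, tuple, set)):
            # Multi-valued fields (for example tags) match if they contain the value.
            if expected not in value:
                return False
        elif value != expected:
            return False
    return True


def assign_contents(content_library, micro_type_rules):
    """Map each micro-type name to the IDs of the contents that satisfy its rule."""
    return {
        name: [c["id"] for c in content_library if matches_rule(c, rule)]
        for name, rule in micro_type_rules.items()
    }


# Hypothetical content library and rule, for illustration only.
content_library = [
    {"id": 1, "category": "film", "child_category_name": "nostalgic", "tag": ["scheming", "drama"]},
    {"id": 2, "category": "film", "child_category_name": "nostalgic", "tag": ["comedy"]},
    {"id": 3, "category": "tv", "child_category_name": "nostalgic", "tag": ["scheming"]},
]
micro_type_rules = {
    "nostalgic-scheming-films": {
        "category": "film",
        "child_category_name": "nostalgic",
        "tag": "scheming",
    },
}
correspondence = assign_contents(content_library, micro_type_rules)
```

Here `correspondence` plays the role of the stored correspondence between micro-types and content; only the first record satisfies all three rule fields.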
For convenience of description, any one micro-type is referred to as the target micro-type. The content description information of each content item belonging to the target micro-type is segmented to obtain the corpus of that content item; the corpus of a content item consists of the words obtained by segmenting its content description information. Optionally, only the synopsis and title are segmented; the content type is not segmented and is used directly as a single word.
A topic model: the corpus of each content item included in each micro-type can be input into the topic model, which performs topic computation on these corpora to obtain the theme vector of each micro-type. Optionally, the topic model may be the document topic generation model Latent Dirichlet Allocation (LDA).
A clustering model: the theme vector of each micro-type can be input into the clustering model, which clusters the N micro-types and outputs K clusters. The micro-type corresponding to each cluster is then selected from each of the K clusters to form a micro-type set.
Optionally, M, N and K above are integers greater than 1.
Referring to Fig. 2, an embodiment of the application provides a method for content deduplication, which can be applied to the system architecture shown in Fig. 1 and includes:
Step 201: Calculate, according to the content description information of each content item included in a target micro-type, a first probability that each content item belongs to each of M content topics, where M is an integer greater than 1.
Step 202: Obtain the theme vector of the target micro-type according to the first probability that each content item belongs to each content topic.
The theme vector includes the second probability that the target micro-type belongs to each of the M content topics. The target micro-type is one of N micro-types, and M and N are integers greater than 1.
Step 203: Cluster the N micro-types according to the theme vector of each of the N micro-types to obtain K clusters, where K is an integer greater than 1.
In implementation, the theme vector of each of the N micro-types can be input into a preset clustering model, which clusters the N micro-types according to these theme vectors and outputs K clusters.
Because the clustering model has low complexity and generates little computation, the K clusters can be obtained quickly, which improves the efficiency of content deduplication and reduces its computational cost.
Step 204: Select, from each of the K clusters, the micro-type corresponding to that cluster, and form a micro-type set from the selected micro-types.
In the embodiments of the application, the theme vector of each of the N micro-types is obtained, and the N micro-types are clustered according to these theme vectors to obtain K clusters, each of which contains micro-types with similar content. The micro-type corresponding to each cluster is then selected from each of the K clusters. In this way a large number of micro-types with similar content can be excluded, so that the selected micro-types do not share a large amount of duplicate content.
Referring to Fig. 3-1, an embodiment of the application provides a method for content deduplication that describes the method of the embodiment shown in Fig. 2 in detail, including:
Step 301: Calculate, according to the content description information of each content item included in a target micro-type, a first probability that each content item belongs to each of M content topics, where the target micro-type is one of N micro-types.
Each of the N micro-types corresponds to a micro-type rule. Before this step is performed, the micro-type rule of each micro-type can be matched against the content in the content library to find the content items belonging to each micro-type.
This step can be implemented through the following steps 3011 and 3012:
3011: Segment the content description information of a target content item into multiple words and form the corpus of the target content item from those words, where the target content item is any content item in the target micro-type.
The content description information of the target content item may include its synopsis, title and/or content type. For example, when the target content item is a video, its content description information may include the video synopsis, video title and/or video type; the video type may be a video tag and/or a second-level video type.
When the content description information of the target content item is segmented, the synopsis and title can be segmented into multiple words, while the content type need not be segmented; the words obtained by segmentation and the content type together form the corpus of the target content item.
Optionally, the corpus of the target content item may also include the content identifier of the target content item.
For example, suppose there is a film "Hollywood * * *" whose content identifier is 1526302. After the video synopsis and video title of the film are segmented, the resulting words are "northeast", "Chinese image", "killer", "film factory", "adventure film", "cinemas", "villain", "optimism", "emotion", "travel", "funny", "Hollywood", "friendship", "performer", "escape", "blockbuster", "modern times", "bravery" and "love in a foreign land"; the types of the film are "romance film", "comedy", "action film" and "feature film".
The resulting words and the types of the film form the corpus of the film, which may be: {1526302: ["romance film", "northeast", "Chinese image", "killer", "film factory", "adventure film", "cinemas", "comedy", "villain", "action film", "optimism", "emotion", "travel", "funny", "Hollywood", "feature film", "friendship", "performer", "escape", "blockbuster", "modern times", "bravery", "love in a foreign land"]}.
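A minimal sketch of step 3011 is shown below. It assumes a stand-in tokenizer that splits on lowercase alphanumeric runs; real Chinese descriptions would need a proper word segmenter, which the text does not name. The function name `build_corpus`, the synopsis and the title are hypothetical; the content identifier is the one from the example above.

```python
import re


def build_corpus(content_id, synopsis, title, content_types):
    """Segment the title and synopsis into words and append the content types as-is."""
    # Stand-in tokenizer (an assumption): lowercase alphanumeric runs.
    words = re.findall(r"[a-z0-9]+", f"{title} {synopsis}".lower())
    # Content types are not segmented; each one is kept as a single word.
    return {content_id: words + list(content_types)}


corpus = build_corpus(
    1526302,
    synopsis="A killer escapes across the northeast",  # hypothetical synopsis
    title="Hollywood Adventure",                       # hypothetical title
    content_types=["romance film", "comedy"],
)
```

The result is keyed by the content identifier, mirroring the optional inclusion of the identifier in the corpus.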
3012: Input the corpus into a preset topic model for topic computation to obtain the first probability that the target content item belongs to each of the M content topics.
Before this step is performed, the number of content topics of the preset topic model can be set to M. In this step, the preset topic model then performs topic computation on the input corpus and obtains M first probabilities, which are the first probabilities that the target content item belongs to each of the M content topics.
Optionally, the topic model may be LDA or the like.
Optionally, when outputting the first probability that the target content item belongs to each of the M content topics, the topic model may also output the words in the target content item that belong to each content topic.
The first probability that the target content item belongs to the i-th content topic is P_i = n_i / n, where n_i is the number of words in the target content item that the topic model outputs as belonging to the i-th content topic, and n is the number of words the target content item includes. Alternatively,
the topic model can output a first word set of the words in the target content item that belong to the i-th topic. The first word set includes word1, word2, ..., wordN, and the probabilities that these words belong to the i-th topic are p(word1), p(word2), ..., p(wordN) respectively. The topic model also outputs all the words in the target content item, that is, a second word set, which includes word1, word2, ..., wordM. Here M and N are integers greater than or equal to 1, and M is greater than or equal to N.
A first vector is built. It consists either of the probabilities p(word1), p(word2), ..., p(wordN) of the words in the first word set, or of the preset value corresponding to each word in the first word set. Optionally, the preset value of each word can be a value such as 1, 2 or 3, in which case the first vector includes N preset values. For example, with a preset value of 1, the first vector is x_ti = [1, ..., 1]; otherwise x_ti = [p(word1), ..., p(wordN)].
A second vector is built. It includes a value for each word in the second word set: if the word is in the first word set, its value is the preset value; otherwise its value is 0. For example, with a preset value of 1, the second vector is y = [1, 0, 1, 1, 0, ..., 1].
Then the first probability that the target content item belongs to the i-th topic is calculated from the two vectors (the formula is not reproduced in this text),
where x_i is an element of the first vector x_ti and y_i is an element of the second vector y.
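The count-based first probability and the two vectors can be sketched as follows. Because the formula that combines the first (primary) and second (secondary) vectors is missing from this text, the sketch stops at building the vectors and does not guess at it. The function names, the sample words and their probabilities are hypothetical.

```python
def first_probability_by_count(topic_words, all_words):
    """P_i = n_i / n: share of the content's words that belong to topic i."""
    return len(topic_words) / len(all_words)


def build_vectors(topic_word_probs, all_words, preset=1, use_probabilities=False):
    """Build the first vector (topic-i words) and the second vector (all words).

    topic_word_probs maps each word of the first word set to p(word).
    """
    if use_probabilities:
        first = list(topic_word_probs.values())
    else:
        first = [preset] * len(topic_word_probs)
    # Second vector: preset value for words in the first word set, 0 otherwise.
    second = [preset if word in topic_word_probs else 0 for word in all_words]
    return first, second


# Hypothetical topic-model output for one content item and one topic.
all_words = ["killer", "escape", "friendship", "comedy", "travel"]
topic_word_probs = {"killer": 0.4, "escape": 0.3}

first, second = build_vectors(topic_word_probs, all_words)
first_p, _ = build_vectors(topic_word_probs, all_words, use_probabilities=True)
p_i = first_probability_by_count(list(topic_word_probs), all_words)
```

With a preset value of 1, the sum of the first vector is n_i and the sum of the second vector is also n_i, so the vectors carry the same word counts that the count-based formula P_i = n_i / n uses.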
Suppose the target micro-type includes X content items. For each of the remaining X-1 content items, steps 3011 and 3012 above are performed to obtain the first probability that the content item belongs to each of the M content topics. Suppose the results are as shown in Table 1 below.
Table 1
Step 302: Obtain the theme vector of the target micro-type according to the first probability that each content item in the target micro-type belongs to each content topic.
This step can be implemented through the following steps 3021 and 3022:
3021: Obtain the first probabilities with which the content items belong to a same content topic, and determine from them the second probability that the target micro-type belongs to that content topic.
Optionally, the obtained first probabilities can be sorted. If the number of content items in the target micro-type is odd, the first probability in the middle position of the sorted list is taken as the second probability that the target micro-type belongs to the content topic. If the number is even, the two first probabilities in the middle positions are obtained, one of them is selected at random, and the selected first probability is taken as the second probability that the target micro-type belongs to the content topic.
Suppose the target micro-type includes X content items and X is odd. For content topic 1, the first probabilities that the X content items belong to content topic 1 are obtained, giving X first probabilities P11, P21, ..., PX1. These are sorted and the first probability in the middle position is obtained; supposing it is P21, P21 is determined as the second probability that the target micro-type belongs to content topic 1. For content topic 2, the first probabilities P12, P22, ..., PX2 are obtained and sorted, and the first probability in the middle position is obtained; supposing it is P22, P22 is determined as the second probability that the target micro-type belongs to content topic 2. This continues up to content topic M, for which the first probabilities P1M, P2M, ..., PXM are obtained and sorted, and the first probability in the middle position is obtained; supposing it is P2M, P2M is determined as the second probability that the target micro-type belongs to content topic M. The results are shown in Table 2 below.
Table 2
3022: Form the theme vector of the target micro-type from the second probabilities that the target micro-type belongs to each of the M content topics.
For example, referring to Table 2, the second probabilities that the target micro-type belongs to each of the M content topics are Q1, Q2, ..., QM, so the theme vector of the target micro-type is [Q1, Q2, ..., QM].
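Steps 3021 and 3022 can be sketched as follows: for each content topic, sort the first probabilities of all content items, take the middle one (or a random one of the two middle values when the count is even), and collect the results into the theme vector. The function names and the sample table are hypothetical.

```python
import random


def second_probability(first_probs, rng=random):
    """Middle value of the sorted first probabilities for one content topic."""
    ordered = sorted(first_probs)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    # Even count: pick one of the two middle values at random.
    return rng.choice([ordered[mid - 1], ordered[mid]])


def theme_vector(first_prob_table, rng=random):
    """first_prob_table[x][m] is the first probability of content x for topic m."""
    num_topics = len(first_prob_table[0])
    return [
        second_probability([row[m] for row in first_prob_table], rng)
        for m in range(num_topics)
    ]


# Hypothetical first probabilities: X = 3 content items, M = 3 content topics.
table = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.6, 0.1, 0.3],
]
vector = theme_vector(table)
```

Because each element is chosen independently per topic, the elements of the resulting vector need not sum to 1.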
There are N micro-types. For each of the remaining N-1 micro-types, the operations of steps 301 and 302 above are performed to obtain the theme vector of each of those N-1 micro-types.
Step 303: According to the theme vector of each of the N micro-types, the N micro-types are clustered to obtain K clusters, where K is an integer greater than 1.
In implementation, the theme vector of each of the N micro-types may be input to a preset clustering model, which clusters the N micro-types according to their theme vectors and outputs K clusters.
Before this step is performed, the number of clusters of the preset clustering model may be set to K. When this step is performed, the clustering model then clusters the N micro-types according to the input theme vector of each micro-type, forms K clusters, and outputs the K clusters.
Each of the K clusters includes at least one micro-type, and the micro-types included in a given cluster are micro-types with similar content.
Optionally, the preset clustering model may use the K-means clustering algorithm or the like.
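As an illustration of step 303, a minimal K-means (Lloyd's algorithm) over theme vectors might look like the sketch below. It is not the preset clustering model of the application: the function names, the random initialization from the input vectors, the fixed seed and the iteration limit are all assumptions made for the example.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def kmeans(vectors, k, iterations=100, seed=0):
    """Return one cluster label (0..k-1) per theme vector."""
    rng = random.Random(seed)
    # Assumed initialization: k distinct input vectors chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable, so stop
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_vector(members)
    return labels

# Hypothetical theme vectors of N = 4 micro-types over M = 2 content topics.
theme_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
print(kmeans(theme_vectors, k=2))
```

With these sample vectors the first two micro-types end up in one cluster and the last two in the other; the numeric labels themselves depend on the initialization.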
Step 304: A micro-type corresponding to each cluster is selected from each of the K clusters, and the selected micro-types form a set of micro-types to be recommended.
There are many ways to select the micro-type corresponding to each cluster from that cluster. For example, one selection manner for realizing this step is given in this embodiment of the application. It can be realized through the following operations 3041 to 3043:
3041: Determine the centroid vector of the barycenter of the target cluster according to the theme vector of each micro-type included in the target cluster, where the target cluster is any one of the K clusters.
In implementation, an average vector may be calculated from the theme vectors of the micro-types included in the target cluster, and the average vector is determined as the centroid vector of the barycenter of the target cluster.
For example, referring to the target cluster shown in Fig. 3-2, the target cluster includes five micro-types A, B, C, D and E, whose positions in the target cluster are as shown in Fig. 3-2. An average vector is calculated from the theme vectors of the five micro-types and used as the centroid vector, which gives the position of barycenter O.
3042: Calculate the distance between each micro-type and the barycenter according to the theme vector of each micro-type and the centroid vector of the barycenter.
For example, the distance between each micro-type and the barycenter may be calculated according to the following formula.
In the above formula, d(x_i, x_center) is the distance between a micro-type and the barycenter, x_i1, x_i2, ..., x_iM are the elements of the theme vector of the micro-type, and x_center1, x_center2, ..., x_centerM are the elements of the centroid vector of the barycenter.
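The symbols defined above are consistent with a Euclidean distance between two M-dimensional vectors. Assuming that this is the intended formula (an assumption, since the formula itself is not reproduced in this text), it would read:

```latex
d(x_i, x_{center}) = \sqrt{\sum_{j=1}^{M} \left( x_{ij} - x_{center\,j} \right)^{2}}
```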
For example, referring to Fig. 3-2, the distance between micro-type A and barycenter O, calculated from the theme vector of micro-type A and the centroid vector of barycenter O, is L1. Likewise, the distances between micro-types B, C, D and E and barycenter O, each calculated from the theme vector of the corresponding micro-type and the centroid vector of barycenter O, are L2, L3, L4 and L5 respectively. As shown in Fig. 3-2, L3 is the smallest of the distances L1, L2, L3, L4 and L5.
3043: Select the micro-type corresponding to the target cluster from the target cluster according to the distance between each micro-type and the barycenter.
Optionally, the micro-type with the smallest distance to the barycenter may be selected from the target cluster as the micro-type corresponding to the target cluster.
For example, referring to Fig. 3-2, micro-type C, which corresponds to the minimum distance L3, may be selected from micro-types A, B, C, D and E as the micro-type corresponding to the target cluster.
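Operations 3041 to 3043 can be sketched as below. The sketch assumes a Euclidean distance, and the micro-type names and theme vectors are made-up values chosen so that micro-type C lies closest to the barycenter, as in the Fig. 3-2 example; the function names are hypothetical.

```python
import math

def centroid(vectors):
    """Average vector of the theme vectors (operation 3041)."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def euclidean(a, b):
    """Assumed distance measure between two vectors (operation 3042)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_nearest_to_centroid(cluster):
    """Pick the micro-type closest to the barycenter (operation 3043).

    cluster maps a micro-type name to its theme vector.
    """
    center = centroid(list(cluster.values()))
    return min(cluster, key=lambda name: euclidean(cluster[name], center))

# Hypothetical target cluster with five micro-types and M = 2 topics.
target_cluster = {
    "A": [0.9, 0.1],
    "B": [0.1, 0.9],
    "C": [0.5, 0.5],
    "D": [0.7, 0.3],
    "E": [0.3, 0.7],
}
print(select_nearest_to_centroid(target_cluster))  # C
```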
Selecting the micro-type with the smallest distance to the barycenter is preferable: in the micro-type nearest to the barycenter, the content is distributed relatively evenly across the content topics, so the content under that micro-type covers a relatively rich range of types.
In this embodiment of the application, other selection manners may be used in addition to the one listed above. As another example, a further selection manner is given, which can be realized through the following operations 3044 to 3045:
3044: Count the number of contents included in each micro-type in the target cluster, where the target cluster is any one of the K clusters.
3045: Select the micro-type corresponding to the target cluster from the target cluster according to the number of contents of each micro-type.
Optionally, the micro-type with the largest number of contents may be selected from the target cluster as the micro-type corresponding to the target cluster. Selecting the micro-type with the largest number of contents allows more content to be recommended to the user.
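The alternative selection of operations 3044 to 3045 reduces to a count and a maximum. In the brief sketch below, the function name, the micro-type names and the content identifiers are illustrative assumptions.

```python
def select_largest(cluster_contents):
    """Pick the micro-type that includes the most contents.

    cluster_contents maps a micro-type name to the list of content
    identifiers it includes. Ties go to the first micro-type found.
    """
    return max(cluster_contents, key=lambda name: len(cluster_contents[name]))

# Hypothetical target cluster with three micro-types.
cluster_contents = {
    "A": ["c1", "c2"],
    "B": ["c3", "c4", "c5"],
    "C": ["c6"],
}
print(select_largest(cluster_contents))  # B
```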
Step 305: The recommendation index of each micro-type in the micro-type set is calculated according to the viewing time of each content included in that micro-type.
When recommending to a user, the user's viewing history may be obtained. The viewing history includes information such as the content identifier and viewing time of each content watched by the user.
In this step, the contents watched by the user may be sorted according to their viewing times, and the interest coefficient of each content is set according to its position in the sorted order.
Optionally, assuming the contents watched by the user are sorted from the most recent viewing time to the earliest, the earlier a content appears in the sorted order, the larger its interest coefficient is set, and the later it appears, the smaller its interest coefficient is set.
For example, the interest coefficient of the first-ranked content may be set to 1, that of the second-ranked content to 0.98, that of the third-ranked content to 0.96, and so on.
After the interest coefficient of each content is set, for each micro-type, the interest coefficients of the contents included in the micro-type are summed to obtain the recommendation index of the micro-type.
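The calculation of the recommendation index can be sketched as follows. The step of 0.02 follows the 1, 0.98, 0.96 example above; the lower bound of 0 for old contents and the coefficient of 0 for contents the user has not watched are assumptions that the text does not state, and the function names and sample data are hypothetical.

```python
def interest_coefficients(history, step=0.02):
    """Map each watched content to an interest coefficient.

    history is a list of (content_id, viewing_time) pairs; a larger
    viewing_time means a more recent viewing.
    """
    ordered = sorted(history, key=lambda item: item[1], reverse=True)
    # Rank 0 gets 1.0, rank 1 gets 0.98, and so on (assumed floor at 0).
    return {
        content_id: max(0.0, 1.0 - step * rank)
        for rank, (content_id, _) in enumerate(ordered)
    }

def recommendation_index(micro_type_contents, coefficients):
    """Sum the interest coefficients of the contents of one micro-type."""
    # Assumption: contents the user never watched contribute 0.
    return sum(coefficients.get(cid, 0.0) for cid in micro_type_contents)

# Hypothetical viewing history; c2 was watched most recently.
history = [("c1", 100), ("c2", 300), ("c3", 200)]
coefficients = interest_coefficients(history)
print(coefficients)
print(recommendation_index(["c2", "c1", "c9"], coefficients))
```

Here c2, c3 and c1 receive coefficients of approximately 1, 0.98 and 0.96, so a micro-type containing c2, c1 and the unwatched c9 has a recommendation index of approximately 1.96.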
Step 306: Micro-types to recommend are selected from the micro-type set according to the recommendation index of each micro-type.
Optionally, the micro-types in the micro-type set may be sorted by recommendation index in descending order, and the top y micro-types may be selected and recommended to the user, where y is an integer greater than 1.
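The selection in step 306 amounts to a descending sort followed by a cut at y. In this short sketch, the function name, the micro-type names and the index values are illustrative assumptions.

```python
def select_top(recommendation_indices, y):
    """Return the y micro-types with the highest recommendation index."""
    ranked = sorted(
        recommendation_indices,
        key=recommendation_indices.get,
        reverse=True,
    )
    return ranked[:y]

# Hypothetical recommendation indices of a micro-type set.
indices = {"sports": 1.96, "news": 0.98, "drama": 2.9}
print(select_top(indices, y=2))  # ['drama', 'sports']
```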
In this embodiment of the application, the first probability that each content belongs to each content topic is obtained through a preset topic model according to the content description information of each content in a micro-type. The second probability that the micro-type belongs to each content topic is obtained from the first probabilities of the contents in the micro-type, which in turn gives the theme vector of the micro-type. The N micro-types can then be clustered by a preset clustering model according to the theme vector of each of the N micro-types to obtain K clusters. Each cluster includes micro-types with similar content, and the content similarity between any two micro-types belonging to different clusters is relatively low. A micro-type corresponding to each cluster is then selected from each of the K clusters, which excludes a large number of micro-types with similar content, so that the selected micro-types do not share a large amount of duplicate content. In addition, because micro-types with similar content can be gathered into one cluster by means of their theme vectors, selecting one micro-type from a cluster and removing the other micro-types of that cluster deduplicates a large number of micro-types with similar content, so the computational cost of deduplication is very low.
The following are device embodiments of the application, which can be used to perform the method embodiments of the application. For details not disclosed in the device embodiments, refer to the method embodiments of the application.
Referring to Fig. 4, the application provides a content deduplication device 400, and the device 400 includes:
a computing module 401, configured to calculate, according to the content description information of each content included in a target micro-type, the first probability that each content belongs to each of M content topics, where M is an integer greater than 1;
an acquisition module 402, configured to obtain the theme vector of the target micro-type according to the first probability that each content belongs to each content topic, where the target micro-type is one of N micro-types and N is an integer greater than 1;
a cluster module 403, configured to cluster the N micro-types according to the theme vector of each of the N micro-types to obtain K clusters, where K is an integer greater than 1;
a selecting module 404, configured to select, from each of the K clusters, the micro-type corresponding to that cluster, and to form a micro-type set from the selected micro-types.
Optionally, the computing module 401 includes:
a constituting unit, configured to segment the content description information of a target content into multiple words and to constitute the corpus of the target content from the multiple words, where the target content is any content in the target micro-type;
an input unit, configured to input the corpus to a preset topic model for topic computation, to obtain the first probability that the target content belongs to each of the M content topics.
Optionally, the acquisition module 402 includes:
an acquiring unit, configured to obtain the first probabilities that the contents belong to the same content topic, and to obtain, according to those first probabilities, the second probability that the target micro-type belongs to that content topic;
a composing unit, configured to combine the second probabilities that the target micro-type belongs to each of the M content topics into the theme vector of the target micro-type.
Optionally, the selecting module 404 includes:
a determination unit, configured to determine the centroid vector of the barycenter of a target cluster according to the theme vector of each micro-type included in the target cluster, where the target cluster is any one of the K clusters;
a computing unit, configured to calculate the distance between each micro-type and the barycenter according to the theme vector of each micro-type and the centroid vector of the barycenter;
a first selecting unit, configured to select the micro-type corresponding to the target cluster from the target cluster according to the distance between each micro-type and the barycenter.
Optionally, the selecting module 404 includes:
a statistic unit, configured to count the number of contents included in each micro-type in a target cluster, where the target cluster is any one of the K clusters;
a second selecting unit, configured to select the micro-type corresponding to the target cluster from the target cluster according to the number of contents of each micro-type.
Optionally, the device 400 further includes:
a recommending module, configured to calculate the recommendation index of each micro-type in the micro-type set according to the viewing time of each content included in that micro-type, and to select micro-types to recommend from the micro-type set according to the recommendation index of each micro-type.
In this embodiment of the application, the theme vector of each of the N micro-types is obtained, and the N micro-types are clustered according to the theme vector of each micro-type to obtain K clusters, each of which includes micro-types with similar content. A micro-type corresponding to each cluster is then selected from each of the K clusters, which excludes a large number of micro-types with similar content, so that the selected micro-types do not share a large amount of duplicate content.
As for the device in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and is not elaborated here.
Fig. 5 shows a structural block diagram of a terminal 500 provided by an exemplary embodiment of the present invention. The terminal 500 may be a portable mobile terminal, such as a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop computer or a desktop computer. The terminal 500 may also be referred to by other names such as user equipment, portable terminal, laptop terminal or desktop terminal.
In general, the terminal 500 includes a processor 501 and a memory 502.
The processor 501 may include one or more processing cores, for example a 4-core processor or an 8-core processor. The processor 501 may be implemented in at least one hardware form among DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array) and PLA (Programmable Logic Array). The processor 501 may also include a main processor and a coprocessor: the main processor, also called a CPU (Central Processing Unit), processes data in the awake state, and the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 501 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be shown on the display screen. In some embodiments, the processor 501 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
The memory 502 may include one or more computer-readable storage media, which may be non-transitory. The memory 502 may also include high-speed random access memory and non-volatile memory, such as one or more disk storage devices or flash storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 502 stores at least one instruction, which is executed by the processor 501 to implement the content deduplication method provided in the method embodiments of the application.
In some embodiments, the terminal 500 optionally further includes a peripheral device interface 503 and at least one peripheral device. The processor 501, the memory 502 and the peripheral device interface 503 may be connected by a bus or signal lines. Each peripheral device may be connected to the peripheral device interface 503 by a bus, a signal line or a circuit board. Specifically, the peripheral devices include at least one of a radio frequency circuit 504, a touch display screen 505, a camera 506, an audio circuit 507, a positioning component 508 and a power supply 509.
The peripheral device interface 503 may be used to connect at least one I/O (Input/Output) related peripheral device to the processor 501 and the memory 502. In some embodiments, the processor 501, the memory 502 and the peripheral device interface 503 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 501, the memory 502 and the peripheral device interface 503 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 504 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 504 communicates with communication networks and other communication devices through electromagnetic signals. It converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 504 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 504 can communicate with other terminals through at least one wireless communication protocol, including but not limited to the World Wide Web, metropolitan area networks, intranets, mobile communication networks of each generation (2G, 3G, 4G and 5G), wireless local area networks and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 504 may also include circuits related to NFC (Near Field Communication), which is not limited in this application.
The display screen 505 is used to display a UI (User Interface). The UI may include graphics, text, icons, video and any combination thereof. When the display screen 505 is a touch display screen, it can also acquire touch signals on or above its surface. The touch signal may be input to the processor 501 as a control signal for processing. In this case, the display screen 505 may also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 505, arranged on the front panel of the terminal 500; in other embodiments, there may be at least two display screens 505, arranged on different surfaces of the terminal 500 or in a folding design; in still other embodiments, the display screen 505 may be a flexible display screen arranged on a curved surface or a folding surface of the terminal 500. The display screen 505 may even be arranged in a non-rectangular irregular shape, that is, as a special-shaped screen. The display screen 505 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
The camera assembly 506 is used to capture images or video. Optionally, the camera assembly 506 includes a front camera and a rear camera. In general, the front camera is arranged on the front panel of the terminal and the rear camera is arranged on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, the main camera and the wide-angle camera can be fused to realize panoramic shooting and VR (Virtual Reality) shooting, or other fused shooting functions can be realized. In some embodiments, the camera assembly 506 may also include a flash. The flash may be a single color temperature flash or a dual color temperature flash. A dual color temperature flash is a combination of a warm-light flash and a cold-light flash and can be used for light compensation at different color temperatures.
The audio circuit 507 may include a microphone and a speaker. The microphone is used to collect sound waves from the user and the environment and to convert them into electrical signals, which are input to the processor 501 for processing or to the radio frequency circuit 504 to realize voice communication. For stereo acquisition or noise reduction, there may be multiple microphones, arranged at different parts of the terminal 500. The microphone may also be an array microphone or an omnidirectional microphone. The speaker is used to convert electrical signals from the processor 501 or the radio frequency circuit 504 into sound waves. The speaker may be a traditional thin-film speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert electrical signals not only into sound waves audible to humans but also into sound waves inaudible to humans, for purposes such as ranging. In some embodiments, the audio circuit 507 may also include a headphone jack.
The positioning component 508 is used to locate the current geographic position of the terminal 500 to realize navigation or LBS (Location Based Service). The positioning component 508 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia or the Galileo system of the European Union.
The power supply 509 is used to supply power to the components of the terminal 500. The power supply 509 may be alternating current, direct current, a disposable battery or a rechargeable battery. When the power supply 509 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. A wired rechargeable battery is charged through a wired line, and a wireless rechargeable battery is charged through a wireless coil. The rechargeable battery may also support fast charging technology.
In some embodiments, the terminal 500 further includes one or more sensors 510. The one or more sensors 510 include but are not limited to an acceleration sensor 511, a gyroscope sensor 512, a pressure sensor 513, a fingerprint sensor 514, an optical sensor 515 and a proximity sensor 516.
The acceleration sensor 511 can detect the magnitude of acceleration along the three axes of a coordinate system established with respect to the terminal 500. For example, the acceleration sensor 511 can be used to detect the components of gravitational acceleration along the three axes. The processor 501 can control the touch display screen 505 to display the user interface in landscape or portrait view according to the gravitational acceleration signal collected by the acceleration sensor 511. The acceleration sensor 511 can also be used to collect motion data for games or for the user.
The gyroscope sensor 512 can detect the body orientation and rotation angle of the terminal 500, and can cooperate with the acceleration sensor 511 to capture the user's 3D actions on the terminal 500. Based on the data collected by the gyroscope sensor 512, the processor 501 can realize functions such as motion sensing (for example, changing the UI according to a tilt operation by the user), image stabilization during shooting, game control and inertial navigation.
The pressure sensor 513 may be arranged on the side frame of the terminal 500 and/or under the touch display screen 505. When the pressure sensor 513 is arranged on the side frame of the terminal 500, it can detect the user's grip signal on the terminal 500, and the processor 501 performs left/right hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 513. When the pressure sensor 513 is arranged under the touch display screen 505, the processor 501 controls the operable controls on the UI according to the user's pressure operation on the touch display screen 505. The operable controls include at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 514 is used to collect the user's fingerprint. The processor 501 identifies the user according to the fingerprint collected by the fingerprint sensor 514, or the fingerprint sensor 514 itself identifies the user according to the collected fingerprint. When the user's identity is recognized as trusted, the processor 501 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments and changing settings. The fingerprint sensor 514 may be arranged on the front, back or side of the terminal 500. When a physical button or a manufacturer logo is provided on the terminal 500, the fingerprint sensor 514 may be integrated with the physical button or the manufacturer logo.
The optical sensor 515 is used to collect the ambient light intensity. In one embodiment, the processor 501 can control the display brightness of the touch display screen 505 according to the ambient light intensity collected by the optical sensor 515. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 505 is turned up; when the ambient light intensity is low, the display brightness of the touch display screen 505 is turned down. In another embodiment, the processor 501 can also dynamically adjust the shooting parameters of the camera assembly 506 according to the ambient light intensity collected by the optical sensor 515.
The proximity sensor 516, also called a distance sensor, is generally arranged on the front panel of the terminal 500. The proximity sensor 516 is used to measure the distance between the user and the front of the terminal 500. In one embodiment, when the proximity sensor 516 detects that the distance between the user and the front of the terminal 500 is gradually decreasing, the processor 501 controls the touch display screen 505 to switch from the screen-on state to the screen-off state; when the proximity sensor 516 detects that the distance between the user and the front of the terminal 500 is gradually increasing, the processor 501 controls the touch display screen 505 to switch from the screen-off state to the screen-on state.
Those skilled in the art will understand that the structure shown in Fig. 5 does not limit the terminal 500, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
Other embodiments of the application will readily occur to those skilled in the art after considering the specification and practicing the application disclosed herein. The application is intended to cover any variations, uses or adaptations of the application that follow its general principles and include common knowledge or conventional techniques in the art that are not disclosed in the application. The specification and examples are to be regarded as illustrative only, and the true scope and spirit of the application are indicated by the following claims.
It should be understood that the application is not limited to the precise structure described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the application is limited only by the appended claims.

Claims (13)

1. A content deduplication method, characterized in that the method comprises:
calculating, according to the content description information of each content included in a target micro-type, the first probability that each content belongs to each of M content topics, where M is an integer greater than 1;
obtaining the theme vector of the target micro-type according to the first probability that each content belongs to each content topic, where the target micro-type is one of N micro-types and N is an integer greater than 1;
clustering the N micro-types according to the theme vector of each of the N micro-types to obtain K clusters, where K is an integer greater than 1;
selecting, from each of the K clusters, the micro-type corresponding to that cluster, and forming a set of micro-types to be recommended from the selected micro-types.
2. The method according to claim 1, characterized in that calculating, according to the content description information of each content included in the target micro-type, the first probability that each content belongs to each of the M content topics comprises:
segmenting the content description information of a target content into multiple words and constituting the corpus of the target content from the multiple words, where the target content is any content in the target micro-type;
inputting the corpus to a preset topic model for topic computation, to obtain the first probability that the target content belongs to each of the M content topics.
3. The method according to claim 1, characterized in that obtaining the theme vector of the target micro-type according to the first probability that each content belongs to each content topic comprises:
obtaining the first probabilities that the contents belong to the same content topic, and obtaining, according to those first probabilities, the second probability that the target micro-type belongs to that content topic;
combining the second probabilities that the target micro-type belongs to each of the M content topics into the theme vector of the target micro-type.
4. The method according to claim 1, characterized in that selecting, from each of the K clusters, the micro-type corresponding to that cluster comprises:
determining the centroid vector of the barycenter of a target cluster according to the theme vector of each micro-type included in the target cluster, where the target cluster is any one of the K clusters;
calculating the distance between each micro-type and the barycenter according to the theme vector of each micro-type and the centroid vector of the barycenter;
selecting the micro-type corresponding to the target cluster from the target cluster according to the distance between each micro-type and the barycenter.
5. The method according to claim 1, characterized in that selecting, from each of the K clusters, the micro-type corresponding to that cluster comprises:
counting the number of contents included in each micro-type in a target cluster, where the target cluster is any one of the K clusters;
selecting the micro-type corresponding to the target cluster from the target cluster according to the number of contents of each micro-type.
6. The method according to claim 1, characterized in that, after selecting, from each of the K clusters, the micro-type corresponding to that cluster, the method further comprises:
calculating the recommendation index of each micro-type in the micro-type set according to the viewing time of each content included in that micro-type, and selecting micro-types to recommend from the micro-type set according to the recommendation index of each micro-type.
7. A content deduplication device, characterized in that the device comprises:
a computing module, configured to calculate, according to the content description information of each content included in a target micro-type, the first probability that each content belongs to each of M content topics, where M is an integer greater than 1;
an acquisition module, configured to obtain the theme vector of the target micro-type according to the first probability that each content belongs to each content topic, where the target micro-type is one of N micro-types and N is an integer greater than 1;
a cluster module, configured to cluster the N micro-types according to the theme vector of each of the N micro-types to obtain K clusters, where K is an integer greater than 1;
a selecting module, configured to select, from each of the K clusters, the micro-type corresponding to that cluster, and to form a set of micro-types to be recommended from the selected micro-types.
8. The device of claim 7, wherein the computing module comprises:
A composing unit, configured to segment the content description information of a target content into a plurality of words, and to form the plurality of words into a corpus of the target content, the target content being any one content in the target micro-type;
An input unit, configured to input the corpus into a preset topic model for topic computation, to obtain the first probability that the target content belongs to each of the M content topics.
9. The device of claim 7, wherein the acquisition module comprises:
An acquiring unit, configured to obtain the first probabilities of the contents belonging to a same content topic, and to obtain, according to those first probabilities, a second probability that the target micro-type belongs to that content topic;
A composing unit, configured to form the second probabilities that the target micro-type belongs to each of the M content topics into the theme vector of the target micro-type.
10. The device of claim 7, wherein the selecting module comprises:
A determining unit, configured to determine the centroid vector of the centroid of a target cluster according to the theme vector of each micro-type included in the target cluster, the target cluster being any one of the K clusters;
A computing unit, configured to calculate the distance between each micro-type and the centroid according to the theme vector of each micro-type and the centroid vector of the centroid;
A first selecting unit, configured to select, from the target cluster, the micro-type corresponding to the target cluster according to the distance between each micro-type and the centroid.
11. The device of claim 7, wherein the selecting module comprises:
A counting unit, configured to count the number of contents included in each micro-type of a target cluster, the target cluster being any one of the K clusters;
A second selecting unit, configured to select, from the target cluster, the micro-type corresponding to the target cluster according to the number of contents of each micro-type.
12. The device of claim 7, wherein the device further comprises:
A recommending module, configured to calculate a recommendation index of each micro-type in the micro-type set according to the viewing time of each content included in that micro-type, and to select and recommend micro-types from the micro-type set according to the recommendation index of each micro-type.
13. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, and the computer program, when executed by a processor, implements the method steps of any one of claims 1 to 6.
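The aggregation and selection steps recited in claims 9 to 11 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names are hypothetical, and several choices are assumptions that the claims leave open: the second probability is taken as the mean of the first probabilities, the distance is Euclidean, the micro-type nearest the centroid is selected, and, under the alternative rule, the micro-type with the most contents is selected.

```python
import math


def theme_vector(first_probs):
    """Build a micro-type theme vector from per-content topic probabilities.

    first_probs: one list of M first probabilities per content.
    Assumption: the second probability for a topic is the mean of the
    first probabilities of all contents for that topic.
    """
    n = len(first_probs)
    m = len(first_probs[0])
    return [sum(p[j] for p in first_probs) / n for j in range(m)]


def centroid(vectors):
    """Component-wise mean of the theme vectors in one cluster."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


def euclidean(a, b):
    """Euclidean distance (an assumed metric; the claims name none)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_by_centroid(cluster):
    """cluster: dict mapping micro-type name to its theme vector.

    Assumption: the representative is the micro-type nearest the centroid.
    """
    c = centroid(list(cluster.values()))
    return min(cluster, key=lambda name: euclidean(cluster[name], c))


def select_by_content_count(counts):
    """counts: dict mapping micro-type name to its number of contents.

    Assumption: the representative is the micro-type with the most contents.
    """
    return max(counts, key=counts.get)
```

For example, calling `select_by_centroid` on a cluster holding the theme vectors `[0.9, 0.1]`, `[0.6, 0.4]` and `[0.5, 0.5]` returns the name of the second micro-type, because its vector lies closest to the cluster mean. Keeping one representative per cluster is what removes near-duplicate micro-types from the set to be recommended.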
CN201810220157.3A 2018-03-16 2018-03-16 Content duplication removing method and device Active CN108446731B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810220157.3A CN108446731B (en) 2018-03-16 2018-03-16 Content duplication removing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810220157.3A CN108446731B (en) 2018-03-16 2018-03-16 Content duplication removing method and device

Publications (2)

Publication Number Publication Date
CN108446731A (en) 2018-08-24
CN108446731B (en) 2021-01-08

Family

ID=63195719

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810220157.3A Active CN108446731B (en) 2018-03-16 2018-03-16 Content duplication removing method and device

Country Status (1)

Country Link
CN (1) CN108446731B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1231790A2 (en) * 2001-02-13 2002-08-14 Hoshen-Eliav Systems Engineering Ltd. System for distributing video and content on demand
US20050276499A1 (en) * 2004-06-15 2005-12-15 Fang Wu Adaptive breakpoint for hybrid variable length coding
US20070106753A1 (en) * 2005-02-01 2007-05-10 Moore James F Dashboard for viewing health care data pools
CN101419614A (en) * 2008-12-03 2009-04-29 深圳市迅雷网络技术有限公司 Video resource clustering method and device
CN102542024A (en) * 2011-12-21 2012-07-04 电子科技大学 Calibrating method of semantic tags of video resource
CN105631033A (en) * 2015-12-31 2016-06-01 北京奇艺世纪科技有限公司 Video data mining method and device

Also Published As

Publication number Publication date
CN108446731B (en) 2021-01-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant