CN106326388A - Method and device for processing information - Google Patents

Publication number
CN106326388A
Authority
CN
China
Prior art keywords
information
word segmentation
similar
said information
hash value
Prior art date
Legal status
Pending
Application number
CN201610681330.0A
Other languages
Chinese (zh)
Inventor
王高波
Current Assignee
LeTV Holding Beijing Co Ltd
LeTV Information Technology Beijing Co Ltd
Original Assignee
LeTV Holding Beijing Co Ltd
LeTV Information Technology Beijing Co Ltd
Priority date
Filing date
Publication date
Application filed by LeTV Holding Beijing Co Ltd and LeTV Information Technology Beijing Co Ltd
Priority to CN201610681330.0A
Publication of CN106326388A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/735 Filtering based on additional data, e.g. user or group profiles

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a method and a device for processing information, applied to a viewing terminal of a network video website. The method comprises: obtaining the description content of information; establishing a word-segmentation vector of the information according to the description content; determining, according to the word-segmentation vector and preset weight coefficients, whether the information is similar information; and, when the information is similar information, determining any one item of the similar information as the information to be recommended. By filtering out similar information, different information is recommended to the user, so that the user does not see too many similar recommendations and is not burdened by them.

Description

Information processing method and device
Technical field
The present invention relates to the technical field of network video, and in particular to an information processing method and device.
Background
When a user uses a network video playback terminal to obtain audio, video, text news and other related information, the network video website recommends, according to various conditions, multiple items of information that may interest the user. To let the user obtain plentiful audio, video, news and other information, the recommended information is generally abundant, and it therefore often contains repeated information that is similar or highly similar. This makes the user feel burdened and overwhelmed, does not help the user make a selection, and results in a poor user experience.
Summary of the invention
In view of this, embodiments of the present invention provide an information processing method and device to avoid recommending repeated information to the user.
To solve the above problem, an embodiment of the present invention discloses an information processing method, characterized by comprising:
obtaining the description content of information;
establishing a word-segmentation vector of the information according to the description content of the information;
determining, according to the word-segmentation vector of the information and preset weight coefficients, whether the information is similar information;
when the information is similar information, determining any one item of the similar information as the information to be recommended.
Optionally, determining whether the information is similar information according to the word-segmentation vector of the information and the preset weight coefficients comprises:
calculating a hash value of the information according to the word-segmentation vector of the information and the preset weight coefficients;
judging, according to the hash value of the information, whether the information is similar information.
Optionally, calculating the hash value of the information according to the word-segmentation vector of the information and the preset weight coefficients comprises:
presetting a weight coefficient for each word-segmentation element in the word-segmentation vector of the information;
calculating the weighted value of each word-segmentation element according to its weight coefficient;
adding the weighted values of the word-segmentation elements to obtain the hash value of the information.
Optionally, judging whether the information is similar information according to the hash value of the information comprises:
calculating the Hamming distance between items of information according to their hash values;
converting the Hamming distance into a similarity, and comparing the similarity with a preset similarity threshold;
if the similarity is less than the similarity threshold, judging that the information is similar information.
Optionally, the word-segmentation vector includes some or all of the following word-segmentation elements: information type, screenwriter, director, actors, award information, box office and review information.
Correspondingly, to ensure that the above method can be implemented, the present invention also provides an information processing device, comprising:
a description content acquisition module, configured to obtain the description content of information;
a word-segmentation vector establishment module, configured to establish a word-segmentation vector of the information according to the description content of the information;
a similar-information determination module, configured to determine, according to the word-segmentation vector of the information and preset weight coefficients, whether the information is similar information;
a recommendation-information determination module, configured to, when the information is similar information, determine any one item of the similar information as the information to be recommended.
Optionally, the similar-information determination module comprises:
a hash value calculation unit, configured to calculate a hash value of the information according to the word-segmentation vector of the information and the preset weight coefficients;
a similar-information judgment unit, configured to judge, according to the hash value of the information, whether the information is similar information.
Optionally, the hash value calculation unit comprises:
a weight-coefficient presetting subunit, configured to preset a weight coefficient for each word-segmentation element in the word-segmentation vector of the information;
a weighted-value calculation subunit, configured to calculate the weighted value of each word-segmentation element according to its weight coefficient;
an addition subunit, configured to add the weighted values of the word-segmentation elements to obtain the hash value of the information.
Optionally, the similar-information judgment unit comprises:
a Hamming-distance calculation subunit, configured to calculate the Hamming distance between items of information according to their hash values;
a similarity judgment subunit, configured to convert the Hamming distance into a similarity and compare the similarity with a preset similarity threshold;
a similar-information judgment subunit, configured to determine that the information is similar information when the similarity is less than the similarity threshold.
Optionally, the word-segmentation vector includes some or all of the following word-segmentation elements: information type, screenwriter, director, actors, award information, box office and review information.
As can be seen from the above technical solution, embodiments of the present invention provide an information processing method and device, specifically: obtaining the description content of information; establishing a word-segmentation vector of the information according to the description content; determining, according to the word-segmentation vector and preset weight coefficients, whether the information is similar information; and, when the information is similar information, determining any one item of the similar information as the information to be recommended. In this way similar information can be filtered out, so that the items recommended to the user all differ from one another, the user does not see too many similar recommendations, and the user is not burdened by them.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the steps of an embodiment of an information processing method of the present invention;
Fig. 2 is a flowchart of the steps of another embodiment of an information processing method of the present invention;
Fig. 3 is a structural block diagram of an embodiment of an information processing device of the present invention;
Fig. 4 is a structural block diagram of another embodiment of an information processing device of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flowchart of the steps of an embodiment of an information processing method of the present invention.
Referring to Fig. 1, the information processing method provided by this embodiment is applied on the server side. When a user receives information, various items that the user may like are generally recommended according to factors such as the user's characteristics, behavior and viewing history, in order to give the user a good experience. The content of these items may, however, be similar or even repeated. In this case the viewing terminal can deduplicate them with the following method steps:
S101: obtain the description content of the information.
Here the information comprises multiple items prepared for pushing to the user, where an item may be audio, video, news text or the like. Before these items are pushed to the user, the description content of each item is obtained. The description content reflects the type of the item, its source, and other relevant information about its content.
For example, for video information, the description content may include the video type, screenwriter, director, actors, award information, box office, review information and so on. Because these contents describe the basic features of a video, they provide the basis for deduplicating the information.
S102: establish the word-segmentation vector of the information according to the description content.
The word-segmentation vector can be regarded as a function of multiple variables, and the word-segmentation elements it contains can be regarded as those variables. A word-segmentation element is a segmented word that describes a basic feature of the description content. For video content, for example, the word-segmentation elements include the video type, screenwriter, director, actors, award information, box office, review information and so on. The established word-segmentation vector may include all of these elements or only some of them.
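A minimal sketch of step S102 follows, assuming the description content arrives as a dictionary. The field names in `FIELDS` and the `(field, token)` pair layout are hypothetical; the text only names the kinds of elements and says that all or part of them may be used.

```python
# Hypothetical field names; the text only lists the kinds of elements.
FIELDS = ("type", "screenwriter", "director", "actors",
          "awards", "box_office", "reviews")


def build_segment_vector(description, fields=FIELDS):
    """Return a list of (field, token) pairs for the fields that are present.

    Fields missing from the description are skipped, which mirrors the
    statement that the vector may contain only part of the elements.
    """
    vector = []
    for field in fields:
        value = description.get(field)
        if value is None:
            continue
        # A field may hold one value or several (e.g. a list of actors).
        tokens = value if isinstance(value, (list, tuple)) else [value]
        vector.extend((field, str(token)) for token in tokens)
    return vector
```

For instance, `build_segment_vector({"type": "drama", "actors": ["A", "B"]})` yields three pairs, one for the type and one per actor.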
S103: determine whether the information is similar information according to the word-segmentation vector and the weight coefficients.
After the word-segmentation vector is obtained, the weight coefficient of each word-segmentation element is determined according to how relevant that element is to the information. For video information, for example, the video type, screenwriter, director and actors are highly relevant to the video; in general these descriptions essentially identify a unique video, so these elements are given higher weight coefficients. Award information, box office and review information are less relevant to the video and are therefore given lower weight coefficients.
With weight coefficients assigned to the elements of the word-segmentation vector, whether two items of information are similar information is judged from the word-segmentation vector and the corresponding weight coefficients.
S104: after similar information has been identified, select any one item of it as the information to be recommended.
After two items have been determined to be similar information from their word-segmentation vectors and the corresponding weight coefficients, either one of them is selected as the information to be recommended, and the viewing terminal presents this recommendation to the user, so that the user obtains useful recommendations.
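Step S104 can be sketched as a single pass that keeps one item from every group of similar items. The function name `dedupe` and the caller-supplied `is_similar` predicate are assumptions made for illustration; the text only requires that any one of the similar items be chosen, and this sketch happens to keep the first one encountered.

```python
def dedupe(items, is_similar):
    """Keep one representative per group of similar items.

    `is_similar(a, b)` is a caller-supplied predicate (hypothetical here).
    The first item seen in each group is the one that is kept.
    """
    kept = []
    for item in items:
        # Drop the item if it is similar to something already kept.
        if not any(is_similar(item, other) for other in kept):
            kept.append(item)
    return kept
```

With a toy predicate such as `lambda a, b: abs(a - b) < 2`, calling `dedupe([1, 2, 5, 6, 9], ...)` keeps `1`, `5` and `9`.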
As can be seen from the above technical solution, this embodiment provides an information processing method applied to a viewing terminal of a network video website, specifically: obtaining the description content of information; establishing a word-segmentation vector of the information according to the description content; determining, according to the word-segmentation vector and preset weight coefficients, whether the information is similar information; and, when the information is similar information, determining any one item of the similar information as the information to be recommended. In this way similar information can be filtered out, so that the items recommended to the user all differ from one another, the user does not see too many similar recommendations, and the user is not burdened by them.
Fig. 2 is a flowchart of the steps of another embodiment of an information processing method of the present invention.
Referring to Fig. 2, the information processing method provided by this embodiment is applied on the server side. When a user receives information, various items that the user may like are generally recommended according to factors such as the user's characteristics, behavior and viewing history, in order to give the user a good experience. The content of these items may, however, be similar or even repeated. In this case the viewing terminal can deduplicate them with the following method steps:
S201: obtain the description content of the information.
Here the information comprises multiple items prepared for pushing to the user, where an item may be audio, video, news text or the like. Before these items are pushed to the user, the description content of each item is obtained. The description content reflects the type of the item, its source, and other relevant information about its content.
For example, for video information, the description content may include the video type, screenwriter, director, actors, award information, box office, review information and so on. Because these contents describe the basic features of a video, they provide the basis for deduplicating the information.
S202: establish the word-segmentation vector of the information according to the description content.
The word-segmentation vector can be regarded as a function of multiple variables, and the word-segmentation elements it contains can be regarded as those variables. A word-segmentation element is a segmented word that describes a basic feature of the description content. For video content, for example, the word-segmentation elements include the video type, screenwriter, director, actors, award information, box office, review information and so on. The established word-segmentation vector may include all of these elements or only some of them.
S203: calculate the hash value of the information according to the word-segmentation vector and the weight coefficients.
A hash value is a unique and compact numerical representation of a piece of data. It can be used to check data integrity and is generally used for fast lookup and in encryption algorithms.
Optionally, based on the above, the hash value of the information is calculated as follows.
First, each word-segmentation element is given its own weight coefficient according to the characteristics of that element. For video information, for example, the video type, screenwriter, director and actors are highly relevant to the video; in general these descriptions essentially identify a unique video, so these elements are given higher weight coefficients. Award information, box office and review information are less relevant to the video and are therefore given lower weight coefficients.
Then the weighted value of each word-segmentation element is calculated. That is, the word-segmentation vector is converted into a weight vector containing a series of weighted values, each corresponding to its word-segmentation element.
Finally, the weighted values of the word-segmentation elements are added to obtain the hash value of the video description content, that is, the total weighted value of the video description content expressed as a hash value.
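The text says only that the weighted values are added to give the hash value; it does not spell out how each weighted value is formed so that the result can later be compared by Hamming distance. One well-known technique that fits this outline is SimHash, sketched below as an assumption rather than as the patent's own procedure. The weights in `WEIGHTS`, the 64-bit width, the use of SHA-256 for per-token bits, and the function names are all illustrative choices.

```python
import hashlib

# Assumed weights: identifying fields high, peripheral fields low.
WEIGHTS = {"type": 3.0, "screenwriter": 3.0, "director": 3.0, "actors": 3.0,
           "awards": 1.0, "box_office": 1.0, "reviews": 1.0}


def token_bits(token, bits=64):
    """Map a token to a fixed-width integer (SHA-256 is an arbitrary choice)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return int.from_bytes(digest[: bits // 8], "big")


def weighted_hash(vector, weights=WEIGHTS, bits=64):
    """SimHash-style fingerprint of a list of (field, token) pairs."""
    totals = [0.0] * bits
    for field, token in vector:
        w = weights.get(field, 1.0)
        h = token_bits(f"{field}:{token}", bits)
        for i in range(bits):
            # Weighted value per bit: +w where the bit is 1, -w where it is 0.
            totals[i] += w if (h >> i) & 1 else -w
    # Add up the weighted values and keep the sign of each position.
    fingerprint = 0
    for i, total in enumerate(totals):
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint
```

Under this scheme two descriptions that share their high-weight elements tend to produce fingerprints that differ in few bits, which is what makes a Hamming-distance comparison meaningful afterwards.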
S204: judge whether the information is similar information according to its hash value.
That is, after the hash value of the description content of each item has been obtained, the hash values of any two description contents are compared, and whether the two items are similar information is judged from the result of the comparison.
Optionally, the judgment proceeds as follows.
First, the hash values of the two description contents are compared to obtain the Hamming distance between them. In information coding, the number of positions at which two valid code words differ is called the code distance, also known as the Hamming distance; in this embodiment it expresses how similar two hash values are.
Then the Hamming distance is normalized, and the normalized value is taken as the similarity, a value between 0 and 1.0. The similarity is compared with a similarity threshold preset for it. The threshold is generally chosen as any value between 0.7 and 1.0; the smaller the chosen value, the greater the probability that video description contents are judged similar, and vice versa.
Finally, when the similarity between any two items is less than the similarity threshold, the two items are determined to be similar information; otherwise they are determined not to be similar information. In this embodiment the preset threshold is preferably 0.7.
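The comparison in S204 can be sketched as below. The text does not give the normalization formula, so this sketch assumes the "similarity" is the Hamming distance divided by the fingerprint width and follows the stated rule literally (a value less than the threshold means similar). The 64-bit width and the function names are assumptions, and a real deployment would need to confirm the intended direction of the comparison.

```python
def hamming_distance(a, b):
    """Number of bit positions at which two integer hash values differ."""
    return bin(a ^ b).count("1")


def normalized_value(a, b, bits=64):
    """Hamming distance scaled into the range 0 to 1.0 (assumed normalization)."""
    return hamming_distance(a, b) / bits


def is_similar(a, b, threshold=0.7, bits=64):
    """Literal reading of the text: a value less than the threshold means similar."""
    return normalized_value(a, b, bits) < threshold
```

For example, `hamming_distance(0b1011, 0b1001)` is 1, because the two values differ only in the second-lowest bit.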
S205: after similar information has been identified, select any one item of it as the information to be recommended.
After two items have been determined to be similar information from their word-segmentation vectors and the corresponding weight coefficients, either one of them is selected as the information to be recommended, and the viewing terminal presents this recommendation to the user, so that the user obtains useful recommendations.
As can be seen from the above technical solution, this embodiment provides an information processing method applied to a viewing terminal, specifically: obtaining the description content of information; establishing a word-segmentation vector of the information according to the description content; determining, according to the word-segmentation vector and preset weight coefficients, whether the information is similar information; and, when the information is similar information, determining any one item of the similar information as the information to be recommended. In this way similar information can be filtered out, so that the items recommended to the user all differ from one another, the user does not see too many similar recommendations, and the user is not burdened by them.
It should be noted that, for brevity, the method embodiments are described as a series of combined actions. Those skilled in the art should understand, however, that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Fig. 3 is a structural block diagram of an embodiment of an information processing device of the present invention.
Referring to Fig. 3, the information processing device provided by this embodiment is applied on the server side. When a user receives information, various items that the user may like are generally recommended according to factors such as the user's characteristics, behavior and viewing history, in order to give the user a good experience. The content of these items may, however, be similar or even repeated. In this case the viewing terminal can deduplicate them with this device, which specifically includes a description content acquisition module 10, a word-segmentation vector establishment module 20, a similar-information determination module 30 and a recommendation-information determination module 40.
The description content acquisition module 10 is configured to obtain the description content of the information.
Here the information comprises multiple items prepared for pushing to the user, where an item may be audio, video, news text or the like. Before these items are pushed to the user, the description content of each item is obtained. The description content reflects the type of the item, its source, and other relevant information about its content.
For example, for video information, the description content may include the video type, screenwriter, director, actors, award information, box office, review information and so on. Because these contents describe the basic features of a video, they provide the basis for deduplicating the information.
The word-segmentation vector establishment module 20 is configured to establish the word-segmentation vector of the information according to the description content.
The word-segmentation vector can be regarded as a function of multiple variables, and the word-segmentation elements it contains can be regarded as those variables. A word-segmentation element is a segmented word that describes a basic feature of the description content. For video content, for example, the word-segmentation elements include the video type, screenwriter, director, actors, award information, box office, review information and so on. The established word-segmentation vector may include all of these elements or only some of them.
The similar-information determination module 30 is configured to determine, according to the word-segmentation vector and the weight coefficients, whether the information is similar information.
After the word-segmentation vector is obtained, the weight coefficient of each word-segmentation element is determined according to how relevant that element is to the information. For video information, for example, the video type, screenwriter, director and actors are highly relevant to the video; in general these descriptions essentially identify a unique video, so these elements are given higher weight coefficients. Award information, box office and review information are less relevant to the video and are therefore given lower weight coefficients.
With weight coefficients assigned to the elements of the word-segmentation vector, whether two items of information are similar information is judged from the word-segmentation vector and the corresponding weight coefficients.
The recommendation-information determination module 40 is configured to, after the similar-information determination module 30 has identified similar information, select any one item of the similar information as the information to be recommended.
After two items have been determined to be similar information from their word-segmentation vectors and the corresponding weight coefficients, either one of them is selected as the information to be recommended, and the viewing terminal presents this recommendation to the user, so that the user obtains useful recommendations.
As can be seen from the above technical solution, this embodiment provides an information processing device applied to a viewing terminal of a network video website, specifically: obtaining the description content of information; establishing a word-segmentation vector of the information according to the description content; determining, according to the word-segmentation vector and preset weight coefficients, whether the information is similar information; and, when the information is similar information, determining any one item of the similar information as the information to be recommended. With this device similar information can be filtered out, so that the items recommended to the user all differ from one another, the user does not see too many similar recommendations, and the user is not burdened by them.
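The four modules of this embodiment can be pictured as one object that wires together four pieces of behavior. The class below is only a structural sketch: the class name, the use of plain callables for modules 10, 20 and 30, and the choice to keep the first item of each similar group are all assumptions made for illustration.

```python
class InformationProcessingDevice:
    """Structural sketch: callables stand in for modules 10, 20 and 30."""

    def __init__(self, get_description, build_vector, are_similar):
        self.get_description = get_description  # module 10 (acquisition)
        self.build_vector = build_vector        # module 20 (vector building)
        self.are_similar = are_similar          # module 30 (similarity test)

    def recommend(self, items):
        """Module 40: keep one item out of every group of similar items."""
        kept = []  # list of (item, vector) pairs already accepted
        for item in items:
            vector = self.build_vector(self.get_description(item))
            if not any(self.are_similar(vector, v) for _, v in kept):
                kept.append((item, vector))
        return [item for item, _ in kept]
```

A toy configuration might treat each item as its own description, split it into a set of words, and call two items similar when they share at least two words.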
Fig. 4 is a structural block diagram of another embodiment of an information processing device of the present invention.
Referring to Fig. 4, the information processing device provided by this embodiment is applied to a viewing terminal of a network video website that provides information to users, and specifically includes a description content acquisition module 10, a word-segmentation vector establishment module 20, a similar-information determination module 30 and a recommendation-information determination module 40.
The description content acquisition module 10 is configured to obtain the description content of the information.
Here the information comprises multiple items prepared for pushing to the user, where an item may be audio, video, news text or the like. Before these items are pushed to the user, the description content of each item is obtained. The description content reflects the type of the item, its source, and other relevant information about its content.
For example, for video information, the description content may include the video type, screenwriter, director, actors, award information, box office, review information and so on. Because these contents describe the basic features of a video, they provide the basis for deduplicating the information.
The word-segmentation vector establishment module 20 is configured to establish the word-segmentation vector of the information according to the description content.
The word-segmentation vector can be regarded as a function of multiple variables, and the word-segmentation elements it contains can be regarded as those variables. A word-segmentation element is a segmented word that describes a basic feature of the description content. For video content, for example, the word-segmentation elements include the video type, screenwriter, director, actors, award information, box office, review information and so on. The established word-segmentation vector may include all of these elements or only some of them.
Analog information determines that module 30, can for determining whether information is analog information according to participle vector sum weight coefficient Selection of land, analog information determines that module 30 specifically includes cryptographic Hash computing unit 31 and analog information identifying unit 32.
Cryptographic Hash computing unit 31 for calculating the cryptographic Hash of information according to participle vector sum weight coefficient.Cryptographic Hash is one The numeric representation form that segment data is unique and compact, it can be checked the integrity of data, be generally used for quickly searching and encrypting Algorithm.Alternatively, based on discussed above, cryptographic Hash computing unit 31 includes the preset subelement of Hash weight coefficient 311, weighted value Computation subunit 312 and additional calculation subelement 313.
The weight coefficient presetting subunit 311 is configured to assign a weight coefficient to each word-segmentation element according to the characteristics of that element. For video information, for example, the video type, screenwriter, director, and actors are highly correlated with the video, and together they can generally identify a unique video, so these elements are given higher weight coefficients. Award information, box office, and review information are less correlated with the video and are therefore given lower weight coefficients.
The weighted value computing subunit 312 is configured to calculate the weighted value of each word-segmentation element, that is, to convert the word-segmentation vector into a weight vector made up of a series of weighted values, each corresponding to one word-segmentation element.
The addition subunit 313 is configured to add the weighted values of the word-segmentation elements to obtain the hash value of the video description content, that is, the total weighted value of the video description content expressed as a hash value.
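The text does not say how a weighted value is turned into a hash value that can later be compared bit by bit. One common reading, assumed here and not confirmed by the source, is a SimHash-style fingerprint: each element is hashed to a fixed number of bits, each bit contributes plus or minus the element's weight, the contributions are added per bit position, and the sign of each sum gives one bit of the result. The names `element_hash` and `weighted_fingerprint`, the 64-bit width, and the use of MD5 as a deterministic mixer are all illustrative choices.

```python
import hashlib
from typing import Dict

BITS = 64  # assumed fingerprint width


def element_hash(text: str) -> int:
    """Stable 64-bit hash of one element (MD5 is used only as a deterministic mixer)."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest()[:8], "big")


def weighted_fingerprint(vector: Dict[str, str], weights: Dict[str, float]) -> int:
    """Add +weight or -weight per bit for every element, then keep the sign of each sum."""
    totals = [0.0] * BITS
    for name, value in vector.items():
        weight = weights.get(name, 1.0)  # elements without a preset weight default to 1.0
        hashed = element_hash(f"{name}={value}")
        for bit in range(BITS):
            totals[bit] += weight if (hashed >> bit) & 1 else -weight
    fingerprint = 0
    for bit, total in enumerate(totals):
        if total > 0:
            fingerprint |= 1 << bit
    return fingerprint
```

With this scheme, elements that carry higher weights (for example director or actors) dominate the resulting bits, which matches the intent of giving them higher weight coefficients.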
The similar-information judging unit 32 is configured to judge, from the hash value of the information, whether the information is similar information.
That is, after the hash value computing unit has obtained the hash values of the description content of multiple pieces of information, the hash values of any two description contents are compared, and whether the two pieces of information are similar is judged from the comparison result. Optionally, the similar-information judging unit 32 includes a Hamming distance computing subunit 321, a similarity judging subunit 322, and a similarity determining subunit 323.
The Hamming distance computing subunit 321 is configured to compare the hash values of two description contents to obtain the Hamming distance between them. In information coding, the number of positions at which two valid codewords differ is called the code distance, also known as the Hamming distance. In this embodiment it expresses the degree of similarity between two hash values.
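For hash values held as integers, the Hamming distance is the number of set bits in their exclusive OR. A minimal sketch (the function name `hamming_distance` is illustrative):

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two non-negative integers differ."""
    return bin(a ^ b).count("1")
```

For example, `0b1011` and `0b1001` differ only in the second-lowest bit, so their distance is 1.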
The similarity judging subunit 322 is configured to normalize the Hamming distance obtained by the Hamming distance computing subunit and to take the normalized value as the similarity, which lies between 0 and 1.0. The similarity is then compared with a preset similarity threshold, which is generally chosen between 0.7 and 1.0. The smaller the chosen threshold, the more likely video description contents are to be judged similar; the larger the threshold, the less likely.
The similarity determining subunit 323 is configured to determine that two pieces of information are similar information when the similarity between them is less than the similarity threshold, and otherwise to determine that they are not similar information. In this embodiment the preset threshold is preferably 0.7.
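The normalization formula is not given in the text. The sketch below assumes the simplest choice, dividing the Hamming distance by the fingerprint width, and then follows the text literally by judging two items similar when the normalized value is below the threshold. The function names and the 64-bit default are assumptions.

```python
def normalized_distance(distance: int, bits: int = 64) -> float:
    """Scale a Hamming distance into the range 0..1 (assumed normalization)."""
    if bits <= 0:
        raise ValueError("bits must be positive")
    return distance / bits


def is_similar(distance: int, bits: int = 64, threshold: float = 0.7) -> bool:
    """Judge similarity as the text states: normalized value below the threshold."""
    return normalized_distance(distance, bits) < threshold
```

With a 64-bit fingerprint and the preferred threshold of 0.7, a distance of 8 gives 0.125 and is judged similar, while a distance of 60 gives 0.9375 and is not.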
The recommendation information determining module 40 is configured to select any one piece of the similar information as the information to be recommended after the similar-information judging unit 32 has identified the similar information.
After two pieces of information have been determined to be similar from their word-segmentation vectors and the corresponding weight coefficients, one of them is selected arbitrarily as the information to be recommended, and the viewing terminal delivers this recommended information to the user, so that the user receives useful recommendations.
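The selection step can be sketched as a single pass that keeps an item only if it is not similar to any item already kept. The text allows any one of the similar items to be chosen; keeping the first one encountered is an assumption of this sketch, as are the name `pick_recommendations` and the `similar` predicate.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def pick_recommendations(items: List[T], similar: Callable[[T, T], bool]) -> List[T]:
    """Return the items with later similar items dropped, preserving input order."""
    kept: List[T] = []
    for item in items:
        if not any(similar(item, earlier) for earlier in kept):
            kept.append(item)
    return kept


# Toy usage: integers stand in for hash values; two values count as similar
# when they differ in at most one bit.
fingerprints = [0b0000, 0b0001, 0b1111]
result = pick_recommendations(
    fingerprints, lambda a, b: bin(a ^ b).count("1") <= 1
)
```

In the toy usage, `0b0001` is dropped because it differs from `0b0000` in one bit, while `0b1111` is kept.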
As can be seen from the above technical solution, this embodiment provides an information processing device applied to a viewing terminal. Specifically, the device obtains the description content of information; builds a word-segmentation vector of the information from the description content; determines, from the word-segmentation vector and preset weight coefficients, whether the information is similar information; and, when the information is similar information, selects any one piece of the similar information as the information to be recommended. The device can thus filter similar information so that the items recommended to the user are all different, which prevents the user from seeing too many similar recommendations and finding them tedious.
Because the device embodiment is substantially similar to the method embodiment, it is described relatively briefly; for related details, refer to the description of the method embodiment.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and for parts that are the same or similar, the embodiments may be referred to one another.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a device, or a computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
Embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, terminal device (system), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, so that a series of operational steps is performed on the computer or other programmable terminal device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present invention.
Finally, it should also be noted that relational terms such as first and second are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or terminal device that includes the element.
The technical solution provided by the present invention has been described in detail above. Specific examples are used herein to explain the principles and embodiments of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. For those of ordinary skill in the art, the specific embodiments and the scope of application may change according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. An information processing method, characterized by comprising:
Obtaining description content of information;
Building a word-segmentation vector of the information according to the description content of the information;
Determining whether the information is similar information according to the word-segmentation vector of the information and a preset weight coefficient;
When the information is similar information, determining any one piece of the similar information as information to be recommended.
2. The method of claim 1, characterized in that determining whether the information is similar information according to the word-segmentation vector of the information and the preset weight coefficient comprises:
Calculating a hash value of the information according to the word-segmentation vector of the information and the preset weight coefficient;
Judging whether the information is similar information according to the hash value of the information.
3. The method of claim 2, characterized in that calculating the hash value of the information according to the word-segmentation vector of the information and the preset weight coefficient comprises:
Presetting a weight coefficient for each word-segmentation element in the word-segmentation vector of the information;
Calculating a weighted value of the word-segmentation element according to the weight coefficient;
Adding the weighted values of the word-segmentation elements to obtain the hash value of the information.
4. The method of claim 2, characterized in that judging whether the information is similar information according to the hash value of the information comprises:
Calculating a Hamming distance between pieces of the information according to the hash values of the information;
Converting the Hamming distance into a similarity, and comparing the similarity with a preset similarity threshold;
If the similarity is less than the similarity threshold, judging that the information is similar information.
5. The method of any one of claims 1-4, characterized in that the word-segmentation vector includes some or all of the following word-segmentation elements: information type, screenwriter, director, actors, award information, box office, and review information.
6. An information processing device, characterized by comprising:
A description content obtaining module, configured to obtain description content of information;
A word-segmentation vector establishing module, configured to build a word-segmentation vector of the information according to the description content of the information;
A similar-information determining module, configured to determine whether the information is similar information according to the word-segmentation vector of the information and a preset weight coefficient;
A recommendation information determining module, configured to determine, when the information is similar information, any one piece of the similar information as information to be recommended.
7. The device of claim 1, characterized in that the similar-information determining module comprises:
A hash value computing unit, configured to calculate a hash value of the information according to the word-segmentation vector of the information and the preset weight coefficient;
A similar-information judging unit, configured to judge whether the information is similar information according to the hash value of the information.
8. The device of claim 7, characterized in that the hash value computing unit comprises:
A weight coefficient presetting subunit, configured to preset a weight coefficient for each word-segmentation element in the word-segmentation vector of the information;
A weighted value computing subunit, configured to calculate a weighted value of the word-segmentation element according to the weight coefficient;
An addition subunit, configured to add the weighted values of the word-segmentation elements to obtain the hash value of the information.
9. The device of claim 7, characterized in that the similar-information judging unit comprises:
A Hamming distance computing subunit, configured to calculate a Hamming distance between pieces of the information according to the hash values of the information;
A similarity judging subunit, configured to convert the Hamming distance into a similarity and to compare the similarity with a preset similarity threshold;
A similarity determining subunit, configured to judge that the information is similar information when the similarity is less than the similarity threshold.
10. The device of any one of claims 6-9, characterized in that the word-segmentation vector includes some or all of the following word-segmentation elements: information type, screenwriter, director, actors, award information, box office, and review information.
CN201610681330.0A 2016-08-17 2016-08-17 Method and device for processing information Pending CN106326388A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610681330.0A CN106326388A (en) 2016-08-17 2016-08-17 Method and device for processing information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610681330.0A CN106326388A (en) 2016-08-17 2016-08-17 Method and device for processing information

Publications (1)

Publication Number Publication Date
CN106326388A true CN106326388A (en) 2017-01-11

Family

ID=57743991

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610681330.0A Pending CN106326388A (en) 2016-08-17 2016-08-17 Method and device for processing information

Country Status (1)

Country Link
CN (1) CN106326388A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066621A (en) * 2017-05-11 2017-08-18 腾讯科技(深圳)有限公司 A kind of search method of similar video, device and storage medium
CN107193893A (en) * 2017-05-03 2017-09-22 聚好看科技股份有限公司 Handle the method and device of video resource
CN107977355A (en) * 2017-11-17 2018-05-01 四川长虹电器股份有限公司 TV programme suggesting method based on term vector training
CN110929002A (en) * 2018-09-03 2020-03-27 广州神马移动信息科技有限公司 Similar article duplicate removal method, device, terminal and computer readable storage medium
CN111128243A (en) * 2019-12-25 2020-05-08 苏州科达科技股份有限公司 Noise data acquisition method, device and storage medium
CN113672913A (en) * 2021-08-20 2021-11-19 绿盟科技集团股份有限公司 Security event processing method and device and electronic equipment
CN113672913B (en) * 2021-08-20 2024-06-28 绿盟科技集团股份有限公司 Security event processing method and device and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831198A (en) * 2012-08-07 2012-12-19 人民搜索网络股份公司 Similar document identifying device and similar document identifying method based on document signature technology
CN103714118A (en) * 2013-11-22 2014-04-09 浙江大学 Book cross-reading method
CN104679835A (en) * 2015-02-09 2015-06-03 浙江大学 Book recommending method based on multi-view hash
CN104951448A (en) * 2014-03-26 2015-09-30 北京雪球信息科技有限公司 Method and server for pushing messages of subscribed categories for users
CN105138647A (en) * 2015-08-26 2015-12-09 陕西师范大学 Travel network cell division method based on Simhash algorithm
CN105426528A (en) * 2015-12-15 2016-03-23 中南大学 Retrieving and ordering method and system for commodity data
CN105786799A (en) * 2016-03-21 2016-07-20 成都寻道科技有限公司 Web article originality judgment method

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107193893A (en) * 2017-05-03 2017-09-22 聚好看科技股份有限公司 Handle the method and device of video resource
CN107066621A (en) * 2017-05-11 2017-08-18 腾讯科技(深圳)有限公司 A kind of search method of similar video, device and storage medium
WO2018205838A1 (en) * 2017-05-11 2018-11-15 腾讯科技(深圳)有限公司 Method and apparatus for retrieving similar video, and storage medium
US10853660B2 (en) 2017-05-11 2020-12-01 Tencent Technology (Shenzhen) Company Limited Method and apparatus for retrieving similar video and storage medium
CN107977355A (en) * 2017-11-17 2018-05-01 四川长虹电器股份有限公司 TV programme suggesting method based on term vector training
CN110929002A (en) * 2018-09-03 2020-03-27 广州神马移动信息科技有限公司 Similar article duplicate removal method, device, terminal and computer readable storage medium
CN111128243A (en) * 2019-12-25 2020-05-08 苏州科达科技股份有限公司 Noise data acquisition method, device and storage medium
CN113672913A (en) * 2021-08-20 2021-11-19 绿盟科技集团股份有限公司 Security event processing method and device and electronic equipment
CN113672913B (en) * 2021-08-20 2024-06-28 绿盟科技集团股份有限公司 Security event processing method and device and electronic equipment

Similar Documents

Publication Publication Date Title
CN106326388A (en) Method and device for processing information
CN108090208A (en) Fused data processing method and processing device
WO2021042826A1 (en) Video playback completeness prediction method and apparatus
CN106326391B (en) Multimedia resource recommendation method and device
CN110909182B (en) Multimedia resource searching method, device, computer equipment and storage medium
CN102646097B (en) A kind of clustering method and device
CN104053023B (en) A kind of method and device of determining video similarity
CN107729578B (en) Music recommendation method and device
CN103377232A (en) Headline keyword recommendation method and system
CN102999588A (en) Method and system for recommending multimedia applications
CN104199896A (en) Video similarity determining method and video recommendation method based on feature classification
CN105005582A (en) Recommendation method and device for multimedia information
CN107592572B (en) Video recommendation method, device and equipment
CN110704677B (en) Program recommendation method and device, readable storage medium and terminal equipment
CN108600836B (en) Video processing method and device
CN108664654A (en) A kind of main broadcaster's recommendation method and device based on user's similarity
CN109242592A (en) A kind of recommended method and device of application
CN105718510A (en) Multimedia data recommendation method and device
CN111241381A (en) Information recommendation method and device, electronic equipment and computer-readable storage medium
CN106227881A (en) A kind of information processing method and server
CN112085058A (en) Object combination recall method and device, electronic equipment and storage medium
CN105022797A (en) Resource topic processing method and apparatus
CN103617221A (en) Software recommendation method and software recommendation system
CN108829699A (en) A kind of polymerization and device of focus incident
CN109461012A (en) A kind of Products Show method, apparatus and terminal

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170111