CN106326388A - Method and device for processing information - Google Patents
- Publication number
- CN106326388A
- Authority
- CN
- China
- Prior art keywords
- information
- word segmentation
- similar
- said information
- hash value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/73—Querying
- G06F16/735—Filtering based on additional data, e.g. user or group profiles
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the invention provide a method and a device for processing information, applied to a video viewing terminal of a network video website. The method comprises: obtaining the description content of information; establishing a word-segmentation vector of the information according to the description content; determining whether the information is similar information according to the word-segmentation vector and a preset weight coefficient; and, when the information is similar information, determining any one item of the similar information as information to be recommended. By filtering out similar information, the method recommends different information to the user, which avoids the user seeing too much similar recommended information and spares the user the resulting annoyance.
Description
Technical field
The present invention relates to the technical field of network video, and in particular to an information processing method and device.
Background
When a user uses a network video playback terminal to obtain information such as audio, video, or text news, the network video website recommends multiple items that may interest the user according to various conditions. To give users access to rich audio, video, news, and other information, the recommended information is generally abundant, so similar or highly similar (effectively repeated) items often appear among it. This makes the recommendations feel tedious and cumbersome, hinders the user's selection, and results in a poor user experience.
Summary of the invention
In view of this, embodiments of the present invention provide an information processing method and device, so as to avoid recommending repeated information to the user.
To solve the above problem, an embodiment of the present invention discloses an information processing method, characterized by comprising:
obtaining the description content of information;
establishing a word-segmentation vector of the information according to the description content of the information;
determining whether the information is similar information according to the word-segmentation vector of the information and a preset weight coefficient;
when the information is similar information, determining any one item of the similar information as information to be recommended.
Optionally, determining whether the information is similar information according to the word-segmentation vector of the information and the preset weight coefficient comprises:
calculating a hash value of the information according to the word-segmentation vector of the information and the preset weight coefficient;
judging whether the information is similar information according to the hash value of the information.
Optionally, calculating the hash value of the information according to the word-segmentation vector of the information and the preset weight coefficient comprises:
presetting a weight coefficient for each word-segmentation element in the word-segmentation vector of the information;
calculating the weight value of each word-segmentation element according to its weight coefficient;
adding the weight values of the word-segmentation elements to obtain the hash value of the information.
Optionally, judging whether the information is similar information according to the hash value of the information comprises:
calculating the Hamming distance between items of the information according to their hash values;
converting the Hamming distance into a similarity, and comparing the similarity with a preset similarity threshold;
if the similarity is less than the similarity threshold, judging that the information is similar information.
Optionally, the word-segmentation vector includes some or all of the following word-segmentation elements: information type, screenwriter, director, cast, award information, box office, and review information.
Correspondingly, to ensure that the above method can be carried out, the present invention further provides an information processing device, comprising:
a description content acquisition module, configured to obtain the description content of information;
a word-segmentation vector establishing module, configured to establish a word-segmentation vector of the information according to the description content of the information;
a similar information determining module, configured to determine whether the information is similar information according to the word-segmentation vector of the information and a preset weight coefficient;
a recommended information determining module, configured to, when the information is similar information, determine any one item of the similar information as information to be recommended.
Optionally, the similar information determining module comprises:
a hash value calculation unit, configured to calculate a hash value of the information according to the word-segmentation vector of the information and the preset weight coefficient;
a similar information judging unit, configured to judge whether the information is similar information according to the hash value of the information.
Optionally, the hash value calculation unit comprises:
a weight coefficient presetting subunit, configured to preset a weight coefficient for each word-segmentation element in the word-segmentation vector of the information;
a weight value calculation subunit, configured to calculate the weight value of each word-segmentation element according to its weight coefficient;
an addition calculation subunit, configured to add the weight values of the word-segmentation elements to obtain the hash value of the information.
Optionally, the similar information judging unit comprises:
a Hamming distance calculation subunit, configured to calculate the Hamming distance between items of the information according to their hash values;
a similarity judgment subunit, configured to convert the Hamming distance into a similarity and compare the similarity with a preset similarity threshold;
a similarity determination subunit, configured to judge that the information is similar information when the similarity is less than the similarity threshold.
Optionally, the word-segmentation vector includes some or all of the following word-segmentation elements: information type, screenwriter, director, cast, award information, box office, and review information.
As can be seen from the above technical solution, embodiments of the present invention provide an information processing method and device, specifically: obtaining the description content of information; establishing a word-segmentation vector of the information according to the description content; determining whether the information is similar information according to the word-segmentation vector and a preset weight coefficient; and, when the information is similar information, determining any one item of the similar information as information to be recommended. With this method, similar information can be filtered out, so that the items recommended to the user are all different. This avoids the user seeing too much similar recommended information and keeps the recommendations from feeling tedious and cumbersome.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the steps of an embodiment of an information processing method of the present invention;
Fig. 2 is a flow chart of the steps of another embodiment of an information processing method of the present invention;
Fig. 3 is a structural block diagram of an embodiment of an information processing device of the present invention;
Fig. 4 is a structural block diagram of another embodiment of an information processing device of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings of the embodiments. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flow chart of the steps of an embodiment of an information processing method of the present invention.
Referring to Fig. 1, the information processing method provided by this embodiment is applied at the server end. When a user receives information, in order to give the user a good experience, various items that the user may like are generally recommended according to factors such as the user's characteristics, behavior, and viewing history. The content of these items, however, may be similar or even repeated. In this case the viewing terminal can use the following method steps to remove duplicates. The specific steps include:
S101: Obtain the description content of the information.
The information here comprises multiple items that are to be pushed to the user, where an item is audio, video, news text, or the like. Before the items are pushed to the user, the description content of each item is obtained. The description content reflects the kind of the item, its source, and other relevant information about its content.
For example, for video information, the description content may include the video type, screenwriter, director, cast, award information, box office, and review information. Because these describe the basic features of a video, they prepare the information for deduplication.
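As an illustration of what such description content might look like in code, the following is a minimal sketch. The class name, the field names, and the `item_id` field are assumptions made for this example; the text only lists the kinds of content involved.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VideoDescription:
    """Description content of one candidate video (all names are illustrative)."""
    item_id: str                      # hypothetical identifier, not from the text
    video_type: str = ""
    screenwriters: List[str] = field(default_factory=list)
    directors: List[str] = field(default_factory=list)
    cast: List[str] = field(default_factory=list)
    awards: List[str] = field(default_factory=list)
    box_office: str = ""
    reviews: str = ""


# One description is collected per item before anything is pushed to the user.
example = VideoDescription(
    item_id="v1",
    video_type="drama",
    directors=["Director A"],
    cast=["Actor A", "Actor B"],
)
```

Fields that are unknown for an item simply stay empty, which matches the text's allowance for using only part of the description.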
S102: Establish the word-segmentation vector of the information according to the description content.
The word-segmentation vector can be regarded as a function that contains multiple variables; the word-segmentation elements included in the vector are those variables. A word-segmentation element is a segmented term that describes a basic feature of the description content. For video content, for example, the elements include the video type, screenwriter, director, cast, award information, box office, review information, and so on. The vector that is established may include all of these elements or only some of them.
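A minimal sketch of building such a vector follows. It assumes the description content is available as a dictionary keyed by field name and represents each element as a `(field, term)` pair; both choices, and all names, are illustrative and not taken from the text.

```python
ALL_FIELDS = ("video_type", "screenwriter", "director", "cast",
              "awards", "box_office", "reviews")


def build_segmentation_vector(description, fields=ALL_FIELDS):
    """Return a list of (field, term) elements for the chosen fields.

    `description` is a dict mapping a field name to a string or a list of
    strings. Missing or empty fields are skipped.
    """
    vector = []
    for name in fields:
        value = description.get(name)
        if value is None:
            continue
        terms = value if isinstance(value, (list, tuple)) else [value]
        for term in terms:
            term = str(term).strip().lower()   # light normalization
            if term:
                vector.append((name, term))
    return vector


description = {
    "video_type": "Drama",
    "director": ["Director A"],
    "cast": ["Actor A", " Actor B "],
    "box_office": "100M",
}
full_vector = build_segmentation_vector(description)
partial_vector = build_segmentation_vector(description, fields=("video_type", "director"))
```

Passing a shorter `fields` tuple corresponds to a vector that includes only some of the elements.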
S103: Determine whether the information is similar information according to the word-segmentation vector and the weight coefficient.
After the word-segmentation vector is obtained, a weight coefficient is determined for each word-segmentation element according to how strongly that element is related to the information. For video information, for example, the video type, screenwriter, director, and cast are strongly related to the video; in general these together essentially identify a unique video, so these elements are given higher weight coefficients. Award information, box office, and review information are less strongly related to the video and are therefore given lower weight coefficients.
Once the elements of the word-segmentation vector have been given their weight coefficients, whether two items are similar information is judged from the vector and the weights.
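This embodiment does not say how the weighted vectors are compared. The sketch below shows one possible way, a weighted overlap score, purely as an assumption: the weight values, the `(field, term)` element format, and the scoring formula are all illustrative and are not the formula of the text.

```python
# Illustrative weights: strongly related fields get higher coefficients.
WEIGHTS = {
    "video_type": 3.0, "screenwriter": 3.0, "director": 3.0, "cast": 3.0,
    "awards": 1.0, "box_office": 1.0, "reviews": 1.0,
}


def weighted_overlap(vec_a, vec_b, weights=WEIGHTS):
    """Weight of the shared elements divided by the weight of all elements.

    Each vector is an iterable of (field, term) pairs. The result lies in
    [0, 1]; 1.0 means the two vectors contain exactly the same elements.
    """
    a, b = set(vec_a), set(vec_b)
    total = sum(weights.get(field, 0.0) for field, _ in a | b)
    if total == 0:
        return 0.0
    shared = sum(weights.get(field, 0.0) for field, _ in a & b)
    return shared / total


a = [("video_type", "drama"), ("director", "x"), ("awards", "prize one")]
b = [("video_type", "drama"), ("director", "x"), ("awards", "prize two")]
score = weighted_overlap(a, b)
```

In this example the two items agree on the heavily weighted type and director and differ only in a lightly weighted award entry, so the score stays high. How a score is turned into a similar or not-similar decision is left to the caller.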
S104: After similar information has been identified, select any one item from it as the information to be recommended.
After two items are determined to be similar information from their word-segmentation vectors and the corresponding weight coefficients, either one is selected as the information to be recommended, and the viewing terminal is used to deliver it to the user, so that the user receives useful recommended information.
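The selection step can be sketched as a simple filter. The function below is an assumption about how "select any one" might be realized: it keeps the first item it meets from each group of similar items, and it takes the similarity test as a caller-supplied function rather than fixing one.

```python
def select_recommendations(items, is_similar):
    """Keep one representative from each group of similar items.

    `is_similar(a, b)` is any pairwise test supplied by the caller. The text
    allows any item of a similar group to be chosen; this sketch keeps the
    first one encountered.
    """
    kept = []
    for item in items:
        if not any(is_similar(item, chosen) for chosen in kept):
            kept.append(item)
    return kept


# Toy usage: items count as similar when they share a first letter.
candidates = ["a1", "a2", "b1", "a3", "b2"]
result = select_recommendations(candidates, lambda x, y: x[0] == y[0])
```

This compares each item against the items already kept, so the cost grows with the number of distinct groups rather than with all pairs.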
As can be seen from the above technical solution, this embodiment provides an information processing method applied to a viewing terminal of a network video website, specifically: obtaining the description content of information; establishing a word-segmentation vector of the information according to the description content; determining whether the information is similar information according to the word-segmentation vector and a preset weight coefficient; and, when the information is similar information, determining any one item of the similar information as information to be recommended. With this method, similar information can be filtered out, so that the items recommended to the user are all different. This avoids the user seeing too much similar recommended information and keeps the recommendations from feeling tedious and cumbersome.
Fig. 2 is a flow chart of the steps of another embodiment of an information processing method of the present invention.
Referring to Fig. 2, the information processing method provided by this embodiment is applied at the server end. When a user receives information, in order to give the user a good experience, various items that the user may like are generally recommended according to factors such as the user's characteristics, behavior, and viewing history. The content of these items, however, may be similar or even repeated. In this case the viewing terminal can use the following method steps to remove duplicates. The specific steps include:
S201: Obtain the description content of the information.
The information here comprises multiple items that are to be pushed to the user, where an item is audio, video, news text, or the like. Before the items are pushed to the user, the description content of each item is obtained. The description content reflects the kind of the item, its source, and other relevant information about its content.
For example, for video information, the description content may include the video type, screenwriter, director, cast, award information, box office, and review information. Because these describe the basic features of a video, they prepare the information for deduplication.
S202: Establish the word-segmentation vector of the information according to the description content.
The word-segmentation vector can be regarded as a function that contains multiple variables; the word-segmentation elements included in the vector are those variables. A word-segmentation element is a segmented term that describes a basic feature of the description content. For video content, for example, the elements include the video type, screenwriter, director, cast, award information, box office, review information, and so on. The vector that is established may include all of these elements or only some of them.
S203: Calculate the hash value of the information according to the word-segmentation vector and the weight coefficient.
A hash value is a compact numerical representation of a piece of data that is intended to be unique to that data. It can be used to check data integrity and is generally used in fast lookup and in encryption algorithms.
Optionally, based on the above, the hash value of the information is calculated as follows.
First, each word-segmentation element is given its own weight coefficient according to the element's characteristics. For video information, for example, the video type, screenwriter, director, and cast are strongly related to the video; in general these together essentially identify a unique video, so these elements are given higher weight coefficients. Award information, box office, and review information are less strongly related to the video and are therefore given lower weight coefficients.
Then the weight value of each word-segmentation element is calculated. That is, the word-segmentation vector is converted into a weight vector containing a series of weight values, each of which corresponds to one word-segmentation element.
Finally, the weight values of the word-segmentation elements are added to obtain the hash value of the video description content, that is, the total weight value of the video description content expressed as a hash value.
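The text does not specify how a word-segmentation element becomes a number, nor how the sum yields a value whose bits can later be compared. The sketch below fills that gap with a SimHash-style scheme, which is a swapped-in technique and an assumption of this example: each element is hashed to 64 bits, every bit contributes plus or minus the element's weight, the contributions are added per bit, and the sign of each total gives one bit of the result. All names and the use of SHA-256 are illustrative.

```python
import hashlib

BITS = 64


def token_hash(token, bits=BITS):
    """Map a string to a `bits`-bit integer (bits must be a multiple of 8)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return int.from_bytes(digest[: bits // 8], "big")


def weighted_hash(vector, weights, bits=BITS):
    """SimHash-style fingerprint of a list of (field, term) elements.

    `weights` maps a field name to its weight coefficient; fields without an
    entry default to 1.0.
    """
    totals = [0.0] * bits
    for field, term in vector:
        w = weights.get(field, 1.0)
        h = token_hash(f"{field}:{term}", bits)
        for i in range(bits):
            # Weighted value of this element for bit i: +w if set, -w if not.
            totals[i] += w if (h >> i) & 1 else -w
    fingerprint = 0
    for i, total in enumerate(totals):
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint


weights = {"video_type": 3.0, "director": 3.0, "awards": 1.0}
vec = [("video_type", "drama"), ("director", "x"), ("awards", "prize")]
fp = weighted_hash(vec, weights)
```

With this construction, elements with higher weights dominate more bits of the fingerprint, so two descriptions that share their heavily weighted elements tend to differ in fewer bit positions.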
S204: Judge whether the information is similar information according to the hash value of the information.
That is, after the hash value of the description content of each of the items has been obtained, the hash values of any two description contents are compared, and whether the two items are similar information is judged from the result of the comparison.
Optionally, the specific judgment process includes:
First, the hash values of two description contents are compared to obtain the Hamming distance between the two description contents. In information coding, the number of bit positions at which two valid codewords differ is called the code distance, also known as the Hamming distance; in this embodiment it expresses the degree of similarity between two hash values.
Then the Hamming distance is normalized, and the normalized value is taken as the similarity, a value between 0 and 1.0. The similarity is compared with a similarity threshold preset for it. The threshold is generally any value between 0.7 and 1.0; the smaller the value chosen, the greater the probability that video descriptions are judged similar, and conversely the smaller the probability.
Finally, when the similarity between any two items is less than the similarity threshold, the two items are judged to be similar information; otherwise they are judged not to be similar information. In this embodiment the preset threshold is preferably 0.7.
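The three steps can be sketched as follows. The text does not give the normalization formula, so this example assumes the Hamming distance is divided by the bit width of the hash; the 64-bit width and the function names are likewise assumptions. The comparison direction follows the text as written.

```python
def hamming_distance(hash_a, hash_b):
    """Number of bit positions at which two non-negative integers differ."""
    return bin(hash_a ^ hash_b).count("1")


def similarity_value(hash_a, hash_b, bits=64):
    """Hamming distance normalized to the range 0..1.0.

    Assumption: normalization divides by the bit width, so identical hashes
    give 0.0 and hashes that differ in every bit give 1.0.
    """
    return hamming_distance(hash_a, hash_b) / bits


def is_similar(hash_a, hash_b, threshold=0.7, bits=64):
    """Follow the text as written: a value below the threshold means similar."""
    return similarity_value(hash_a, hash_b, bits) < threshold


d = hamming_distance(0b1011, 0b0010)   # the two values differ in two positions
```

The default threshold of 0.7 is the preferred value named in this embodiment; it is exposed as a parameter so other values in the stated range can be tried.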
S205: After similar information has been identified, select any one item from it as the information to be recommended.
After two items are determined to be similar information from their word-segmentation vectors and the corresponding weight coefficients, either one is selected as the information to be recommended, and the viewing terminal is used to deliver it to the user, so that the user receives useful recommended information.
As can be seen from the above technical solution, this embodiment provides an information processing method applied to a viewing terminal, specifically: obtaining the description content of information; establishing a word-segmentation vector of the information according to the description content of the information; determining whether the information is similar information according to the word-segmentation vector of the information and a preset weight coefficient; and, when the information is similar information, determining any one item of the similar information as information to be recommended. With this method, similar information can be filtered out, so that the items recommended to the user are all different. This avoids the user seeing too much similar recommended information and keeps the recommendations from feeling tedious and cumbersome.
It should be noted that, for brevity, the method embodiments are described as a series of combined actions. Those skilled in the art should understand, however, that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and that the actions involved are not necessarily required by the embodiments of the present invention.
Fig. 3 is a structural block diagram of an embodiment of an information processing device of the present invention.
Referring to Fig. 3, the information processing device provided by this embodiment is applied at the server end. When a user receives information, in order to give the user a good experience, various items that the user may like are generally recommended according to factors such as the user's characteristics, behavior, and viewing history. The content of these items, however, may be similar or even repeated. In this case the viewing terminal can use the device to remove duplicates. The device specifically includes a description content acquisition module 10, a word-segmentation vector establishing module 20, a similar information determining module 30, and a recommended information determining module 40.
The description content acquisition module 10 is configured to obtain the description content of the information.
The information here comprises multiple items that are to be pushed to the user, where an item is audio, video, news text, or the like. Before the items are pushed to the user, the description content of each item is obtained. The description content reflects the kind of the item, its source, and other relevant information about its content.
For example, for video information, the description content may include the video type, screenwriter, director, cast, award information, box office, and review information. Because these describe the basic features of a video, they prepare the information for deduplication.
The word-segmentation vector establishing module 20 is configured to establish the word-segmentation vector of the information according to the description content.
The word-segmentation vector can be regarded as a function that contains multiple variables; the word-segmentation elements included in the vector are those variables. A word-segmentation element is a segmented term that describes a basic feature of the description content. For video content, for example, the elements include the video type, screenwriter, director, cast, award information, box office, review information, and so on. The vector that is established may include all of these elements or only some of them.
The similar information determining module 30 is configured to determine whether the information is similar information according to the word-segmentation vector and the weight coefficient.
After the word-segmentation vector is obtained, a weight coefficient is determined for each word-segmentation element according to how strongly that element is related to the information. For video information, for example, the video type, screenwriter, director, and cast are strongly related to the video; in general these together essentially identify a unique video, so these elements are given higher weight coefficients. Award information, box office, and review information are less strongly related to the video and are therefore given lower weight coefficients.
Once the elements of the word-segmentation vector have been given their weight coefficients, whether two items are similar information is judged from the vector and the weights.
The recommended information determining module 40 is configured to, after the similar information determining module 30 has identified similar information, select any one item from the similar information as the information to be recommended.
After two items are determined to be similar information from their word-segmentation vectors and the corresponding weight coefficients, either one is selected as the information to be recommended, and the viewing terminal is used to deliver it to the user, so that the user receives useful recommended information.
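How the four modules fit together can be sketched as a small class that wires them in order. The class name, method names, and the choice to pass each module in as a function are assumptions of this example; the reference numerals in the comments map each step to the module described above.

```python
class InformationProcessingDevice:
    """Illustrative composition of modules 10, 20, 30, and 40."""

    def __init__(self, get_description, build_vector, is_similar):
        self.get_description = get_description   # module 10
        self.build_vector = build_vector         # module 20
        self.is_similar = is_similar             # module 30

    def recommend(self, items):
        descriptions = [self.get_description(item) for item in items]
        vectors = [self.build_vector(d) for d in descriptions]
        kept = []                                 # indices of chosen items
        for idx, vec in enumerate(vectors):
            # Module 40: keep one item per group of similar items
            # (here, the first one encountered).
            if not any(self.is_similar(vec, vectors[j]) for j in kept):
                kept.append(idx)
        return [items[i] for i in kept]


# Toy usage with stand-in modules: two items are similar when their
# descriptions contain the same set of words.
device = InformationProcessingDevice(
    get_description=lambda item: item["desc"],
    build_vector=lambda desc: frozenset(desc.split()),
    is_similar=lambda a, b: a == b,
)
items = [
    {"id": 1, "desc": "drama x"},
    {"id": 2, "desc": "x drama"},
    {"id": 3, "desc": "comedy y"},
]
recommended_ids = [item["id"] for item in device.recommend(items)]
```

Passing the modules in as functions keeps the sketch independent of any particular vector format or similarity test.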
As can be seen from the above technical solution, this embodiment provides an information processing device applied to a viewing terminal of a network video website, specifically: obtaining the description content of information; establishing a word-segmentation vector of the information according to the description content; determining whether the information is similar information according to the word-segmentation vector and a preset weight coefficient; and, when the information is similar information, determining any one item of the similar information as information to be recommended. With this device, similar information can be filtered out, so that the items recommended to the user are all different. This avoids the user seeing too much similar recommended information and keeps the recommendations from feeling tedious and cumbersome.
Fig. 4 is a structural block diagram of another embodiment of an information processing device of the present invention.
Referring to Fig. 4, the information processing device provided by this embodiment is applied to a viewing terminal of a network video website that provides information to users. It specifically includes a description content acquisition module 10, a word-segmentation vector establishing module 20, a similar information determining module 30, and a recommended information determining module 40.
The description content acquisition module 10 is configured to obtain the description content of the information.
The information here comprises multiple items that are to be pushed to the user, where an item is audio, video, news text, or the like. Before the items are pushed to the user, the description content of each item is obtained. The description content reflects the kind of the item, its source, and other relevant information about its content.
For example, for video information, the description content may include the video type, screenwriter, director, cast, award information, box office, and review information. Because these describe the basic features of a video, they prepare the information for deduplication.
The word-segmentation vector establishing module 20 is configured to establish the word-segmentation vector of the information according to the description content.
The word-segmentation vector can be regarded as a function that contains multiple variables; the word-segmentation elements included in the vector are those variables. A word-segmentation element is a segmented term that describes a basic feature of the description content. For video content, for example, the elements include the video type, screenwriter, director, cast, award information, box office, review information, and so on. The vector that is established may include all of these elements or only some of them.
The similar information determining module 30 is configured to determine whether the information is similar information according to the word-segmentation vector and the weight coefficient. Optionally, the similar information determining module 30 specifically includes a hash value calculation unit 31 and a similar information judging unit 32.
The hash value calculation unit 31 is configured to calculate the hash value of the information according to the word-segmentation vector and the weight coefficient. A hash value is a compact numerical representation of a piece of data that is intended to be unique to that data. It can be used to check data integrity and is generally used in fast lookup and in encryption algorithms. Optionally, based on the above, the hash value calculation unit 31 includes a weight coefficient presetting subunit 311, a weight value calculation subunit 312, and an addition calculation subunit 313.
The weight coefficient presetting subunit 311 is configured to give each word-segmentation element its own weight coefficient according to the element's characteristics. For video information, for example, the video type, screenwriter, director, and cast are strongly related to the video; in general these together essentially identify a unique video, so these elements are given higher weight coefficients. Award information, box office, and review information are less strongly related to the video and are therefore given lower weight coefficients.
The weight value calculation subunit 312 is configured to calculate the weight value of each word-segmentation element. That is, the word-segmentation vector is converted into a weight vector containing a series of weight values, each of which corresponds to one word-segmentation element.
The addition calculating subunit 313 adds up the weighted values of the word-segment elements to obtain the hash value of the video description content, that is, the total weighted value of the video description content expressed as a hash value.
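The weighting and addition steps above can be sketched in Python as follows. This is only an illustrative sketch, not the implementation of the embodiment: it assumes a SimHash-style scheme in which each word-segment element is first hashed to a fixed-width bit string and the weights are accumulated per bit position, a detail the text does not spell out. The names `FIELD_WEIGHTS`, `token_hash`, and `weighted_hash`, the weight values, the 64-bit width, and the use of MD5 as a deterministic mixer are all assumptions made for illustration.

```python
import hashlib

# Hypothetical weight coefficients: fields that identify a video get
# higher values, loosely related fields get lower values.
FIELD_WEIGHTS = {
    "type": 3.0, "screenwriter": 3.0, "director": 3.0, "performer": 3.0,
    "awards": 1.0, "box_office": 1.0, "reviews": 1.0,
}


def token_hash(token, bits=64):
    """Stable per-element hash; MD5 serves only as a deterministic mixer."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[: bits // 8], "big")


def weighted_hash(description, bits=64):
    """Combine weighted element hashes into one fingerprint (SimHash-style).

    `description` maps a field name to a list of word-segment elements.
    """
    totals = [0.0] * bits
    for field, tokens in description.items():
        weight = FIELD_WEIGHTS.get(field, 1.0)
        for token in tokens:
            h = token_hash(f"{field}:{token}", bits)
            for i in range(bits):
                # Add the weight where the bit is 1, subtract it where it is 0.
                totals[i] += weight if (h >> i) & 1 else -weight
    # Bit positions with a positive total become 1 bits in the fingerprint.
    return sum(1 << i for i in range(bits) if totals[i] > 0)
```

Under these assumptions, a high-weight element such as the video type dominates a low-weight element such as award information when the two disagree on a bit, which mirrors the intent of giving identifying fields higher weight coefficients.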
The similar-information judging unit 32 judges whether pieces of information are similar information according to their hash values. That is, after the hash value calculating unit has obtained the hash values of the description contents of multiple pieces of information, the hash values of any two description contents are compared, and whether the two pieces of information are similar information is judged from the comparison result. Optionally, the similar-information judging unit 32 includes a Hamming distance calculating subunit 321, a similarity judging subunit 322, and a similarity determining subunit 323.
The Hamming distance calculating subunit 321 compares the hash values of two description contents to obtain the Hamming distance between the two description contents. In information coding, the number of corresponding positions at which two valid codewords differ is called the code distance, also known as the Hamming distance; in this embodiment it is used to express the degree of similarity between two hash values.
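For hash values held as non-negative integers, the Hamming distance described above can be computed by XOR-ing the two values and counting the 1 bits. The function name `hamming_distance` is illustrative and does not come from the text.

```python
def hamming_distance(hash_a, hash_b):
    """Count the bit positions at which two non-negative integer hashes differ."""
    # XOR leaves a 1 exactly where the two values disagree.
    return bin(hash_a ^ hash_b).count("1")
```

For example, `0b1011` and `0b1001` differ only in the second-lowest bit, so their Hamming distance is 1.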
The similarity judging subunit 322 normalizes the Hamming distance obtained by the Hamming distance calculating subunit and takes the normalized value as the similarity, which is a normalized value between 0 and 1.0. It then compares this similarity with a preset similarity threshold. The similarity threshold is generally chosen as any value between 0.7 and 1.0; the smaller the chosen value, the more likely video description contents are to be judged similar, and conversely the less likely.
The similarity determining subunit 323 determines that two pieces of information are similar information when the similarity between them is less than the similarity threshold, and otherwise determines that they are not similar information. In this embodiment the preset threshold is preferably 0.7.
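The normalization and threshold comparison can be sketched as follows. The text does not give the normalization formula, so this sketch assumes the Hamming distance is divided by the hash width, with a hypothetical 64-bit hash; the names `HASH_BITS`, `normalized_distance`, and `is_similar` are illustrative. The comparison direction follows the text: two pieces of information are judged similar when the normalized value is less than the threshold.

```python
HASH_BITS = 64              # assumed hash width, not stated in the text
SIMILARITY_THRESHOLD = 0.7  # preferred value in this embodiment


def normalized_distance(hash_a, hash_b, bits=HASH_BITS):
    """Hamming distance scaled to the range 0 to 1.0 (assumed normalization)."""
    return bin(hash_a ^ hash_b).count("1") / bits


def is_similar(hash_a, hash_b, threshold=SIMILARITY_THRESHOLD, bits=HASH_BITS):
    """Judge two hashes similar when the normalized value is below the threshold."""
    return normalized_distance(hash_a, hash_b, bits) < threshold
```

The sketch assumes both hash values fit within `bits` bits; wider values would push the normalized result above 1.0.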
Recommendation information determines that module 40 is therefrom chosen arbitrarily after judging analog information when analog information identifying unit 30
One determines information to be recommended.
After two pieces of information have been determined to be similar information according to their word-segment vectors and the corresponding weight coefficients, either one of them is selected as the information to be recommended, and the viewing terminal is used to deliver this recommendation information to the user, so that the user obtains useful recommendation information.
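Selecting one piece out of each group of similar information can be sketched as a simple greedy filter. The function name `select_recommendations` is hypothetical, and keeping the first item encountered is just one way of choosing "any one"; the text does not prescribe which piece is kept. The similarity test is passed in as a function, so any pairwise judgment can be plugged in.

```python
def select_recommendations(items, is_similar):
    """Keep one representative from each group of mutually similar items."""
    kept = []
    for item in items:
        # Skip an item if something similar to it has already been kept.
        if not any(is_similar(item, other) for other in kept):
            kept.append(item)
    return kept


# Usage with a toy similarity test: integers within 1 of each other are "similar".
candidates = [1, 2, 10, 11, 20]
unique = select_recommendations(candidates, lambda a, b: abs(a - b) <= 1)
```

This compares each candidate against every kept item, which is adequate for a short recommendation list but grows quadratically with the number of candidates.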
As can be seen from the above technical solution, this embodiment provides an information processing device applied to a viewing terminal. Specifically, the device obtains the description content of information; builds the word-segment vector of the information according to the description content of the information; determines whether the information is similar information according to the word-segment vector of the information and preset weight coefficients; and, when the information is similar information, determines any one piece of the similar information as the information to be recommended. The device can thus filter similar information so that every piece of information recommended to the user is different, which avoids showing the user too many similar recommendations and keeps the recommendations from feeling tedious and repetitive.
Since the device embodiment is basically similar to the method embodiment, its description is relatively brief; for related details, refer to the corresponding description of the method embodiment.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for parts that are the same or similar, the embodiments may be referred to one another.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a device, or a computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
Embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, terminal device (system), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, so that a series of operational steps is performed on the computer or other programmable terminal device to produce computer-implemented processing. The instructions executed on the computer or other programmable terminal device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make further changes and modifications to these embodiments once they grasp the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present invention.
Finally, it should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element defined by the phrase "including ..." does not exclude the existence of other identical elements in the process, method, article, or terminal device that includes the element.
The technical solution provided by the present invention has been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. At the same time, a person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementations and to the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (10)
1. An information processing method, characterized by comprising:
obtaining description content of information;
building a word-segment vector of the information according to the description content of the information;
determining whether the information is similar information according to the word-segment vector of the information and preset weight coefficients;
when the information is similar information, determining any one piece of the similar information as information to be recommended.
2. The method of claim 1, characterized in that determining whether the information is similar information according to the word-segment vector of the information and the preset weight coefficients comprises:
calculating a hash value of the information according to the word-segment vector of the information and the preset weight coefficients;
judging whether the information is similar information according to the hash value of the information.
3. The method of claim 2, characterized in that calculating the hash value of the information according to the word-segment vector of the information and the preset weight coefficients comprises:
presetting weight coefficients for the word-segment elements in the word-segment vector of the information;
calculating the weighted values of the word-segment elements according to the weight coefficients;
adding the weighted values of the word-segment elements to obtain the hash value of the information.
4. The method of claim 2, characterized in that judging whether the information is similar information according to the hash value of the information comprises:
calculating the Hamming distance between pieces of the information according to the hash values of the information;
converting the Hamming distance into a similarity, and comparing the similarity with a preset similarity threshold;
if the similarity is less than the similarity threshold, judging that the information is similar information.
5. The method of any one of claims 1-4, characterized in that the word-segment vector includes some or all of the following word-segment elements: information type, screenwriter, director, performers, award information, box office, and evaluation information.
6. An information processing device, characterized by comprising:
a description content obtaining module, configured to obtain description content of information;
a word-segment vector building module, configured to build a word-segment vector of the information according to the description content of the information;
a similar-information determining module, configured to determine whether the information is similar information according to the word-segment vector of the information and preset weight coefficients;
a recommendation information determining module, configured to determine, when the information is similar information, any one piece of the similar information as information to be recommended.
7. The device of claim 1, characterized in that the similar-information determining module comprises:
a hash value calculating unit, configured to calculate a hash value of the information according to the word-segment vector of the information and the preset weight coefficients;
a similar-information judging unit, configured to judge whether the information is similar information according to the hash value of the information.
8. The device of claim 7, characterized in that the hash value calculating unit comprises:
a weight coefficient presetting subunit, configured to preset weight coefficients for the word-segment elements in the word-segment vector of the information;
a weighted value calculating subunit, configured to calculate the weighted values of the word-segment elements according to the weight coefficients;
an addition calculating subunit, configured to add the weighted values of the word-segment elements to obtain the hash value of the information.
9. The device of claim 7, characterized in that the similar-information judging unit comprises:
a Hamming distance calculating subunit, configured to calculate the Hamming distance between pieces of the information according to the hash values of the information;
a similarity judging subunit, configured to convert the Hamming distance into a similarity and compare the similarity with a preset similarity threshold;
a similarity determining subunit, configured to determine that the information is similar information when the similarity is less than the similarity threshold.
10. The device of any one of claims 6-9, characterized in that the word-segment vector includes some or all of the following word-segment elements: information type, screenwriter, director, performers, award information, box office, and evaluation information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610681330.0A CN106326388A (en) | 2016-08-17 | 2016-08-17 | Method and device for processing information |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106326388A (en) | 2017-01-11 |
Family
ID=57743991
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610681330.0A Pending CN106326388A (en) | 2016-08-17 | 2016-08-17 | Method and device for processing information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106326388A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102831198A (en) * | 2012-08-07 | 2012-12-19 | 人民搜索网络股份公司 | Similar document identifying device and similar document identifying method based on document signature technology |
CN103714118A (en) * | 2013-11-22 | 2014-04-09 | 浙江大学 | Book cross-reading method |
CN104679835A (en) * | 2015-02-09 | 2015-06-03 | 浙江大学 | Book recommending method based on multi-view hash |
CN104951448A (en) * | 2014-03-26 | 2015-09-30 | 北京雪球信息科技有限公司 | Method and server for pushing messages of subscribed categories for users |
CN105138647A (en) * | 2015-08-26 | 2015-12-09 | 陕西师范大学 | Travel network cell division method based on Simhash algorithm |
CN105426528A (en) * | 2015-12-15 | 2016-03-23 | 中南大学 | Retrieving and ordering method and system for commodity data |
CN105786799A (en) * | 2016-03-21 | 2016-07-20 | 成都寻道科技有限公司 | Web article originality judgment method |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107193893A (en) * | 2017-05-03 | 2017-09-22 | 聚好看科技股份有限公司 | Handle the method and device of video resource |
CN107066621A (en) * | 2017-05-11 | 2017-08-18 | 腾讯科技(深圳)有限公司 | A kind of search method of similar video, device and storage medium |
WO2018205838A1 (en) * | 2017-05-11 | 2018-11-15 | 腾讯科技(深圳)有限公司 | Method and apparatus for retrieving similar video, and storage medium |
US10853660B2 (en) | 2017-05-11 | 2020-12-01 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for retrieving similar video and storage medium |
CN107977355A (en) * | 2017-11-17 | 2018-05-01 | 四川长虹电器股份有限公司 | TV programme suggesting method based on term vector training |
CN110929002A (en) * | 2018-09-03 | 2020-03-27 | 广州神马移动信息科技有限公司 | Similar article duplicate removal method, device, terminal and computer readable storage medium |
CN111128243A (en) * | 2019-12-25 | 2020-05-08 | 苏州科达科技股份有限公司 | Noise data acquisition method, device and storage medium |
CN113672913A (en) * | 2021-08-20 | 2021-11-19 | 绿盟科技集团股份有限公司 | Security event processing method and device and electronic equipment |
CN113672913B (en) * | 2021-08-20 | 2024-06-28 | 绿盟科技集团股份有限公司 | Security event processing method and device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20170111 |