CN104636488B - Picture-based method and device for determining duplicate video files - Google Patents

Picture-based method and device for determining duplicate video files

Info

Publication number
CN104636488B
Authority
CN
China
Prior art keywords
picture
video file
gray
target photo
fingerprint
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510089040.2A
Other languages
Chinese (zh)
Other versions
CN104636488A (en)
Inventor
宋华
周燕红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201510089040.2A
Publication of CN104636488A
Application granted
Publication of CN104636488B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5862 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using texture
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/432 Query formulation
    • G06F16/434 Query formulation using image data, e.g. images, photos, pictures taken by a user

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the invention discloses a picture-based method and device for determining duplicate video files. The method comprises the following steps: obtaining the picture fingerprint of the representative picture of a first video file and the picture fingerprint of the representative picture of a second video file, where a picture fingerprint is picture feature information calculated from the average gray value information and the color average information of a picture; comparing the two picture fingerprints to obtain the degree of similarity between the representative picture of the first video file and the representative picture of the second video file; and, if the degree of similarity meets a preset condition, determining that the first video file and the second video file are duplicate video files. The technical solution provided by the embodiment of the invention yields a more noticeable deduplication effect and improves the user experience.

Description

Picture-based method and device for determining duplicate video files
Technical Field
The present invention relates to the field of Internet technology, and in particular to a picture-based method and device for determining duplicate video files.
Background
On a video website, different users may upload video files with identical content, and even a single user may upload the same content more than once, so duplication among the video files on a video website is a serious problem. In practice, video websites mostly present video files to users as pictures, so that users can quickly grasp the content of each video file; such a picture is called the representative picture of the video file. Adjacent frames of a video file are generally highly similar, and the representative picture of a video file is one of its frames. If a video website uses one of several adjacent frames as the representative picture of each of a set of video files with identical content, the representative pictures of those video files will be identical or similar. When a user searches the website for a target video file, the search results may then contain many video files whose representative pictures are identical or similar, which makes for a poor user experience.
For this reason, a video website needs to identify duplicate videos, so that video files with identical content can be deduplicated when search results are shown to users. One existing picture-based method for determining duplicate video files uses the MD5 (Message-Digest Algorithm 5) algorithm to obtain a character string for the representative picture of each video file, compares the strings of different representative pictures, and treats video files whose representative pictures have identical strings as duplicate video files.
Although this method can usually identify duplicate video files, it has a shortcoming: MD5 is highly sensitive to the picture data, so any slight difference between the data of two pictures yields different MD5 strings. When video files with identical content use adjacent frames as their respective representative pictures, those pictures are similar but their MD5 strings differ, so the video files are not treated as duplicates during deduplication. The search results may therefore still show many similar representative pictures, the deduplication effect is weak, and the user experience is poor.
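The sensitivity described above is easy to reproduce. The following minimal sketch uses Python's standard `hashlib`; the two byte strings are made-up stand-ins for the pixel data of two adjacent frames and are not taken from the patent.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the raw picture data as a hex string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical pixel data: two 8*8 gray frames that differ in a single pixel.
frame_a = bytes([120] * 64)
frame_b = bytes([120] * 63 + [121])

# Identical data gives identical strings; a one-pixel change gives a different one.
same = md5_hex(frame_a) == md5_hex(bytes(frame_a))
differs = md5_hex(frame_a) != md5_hex(frame_b)
```

Because the digest says nothing about how close two inputs are, string equality is the only comparison available, which is why near-identical frames go undetected.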
Summary of the Invention
To solve the above problem, embodiments of the present invention disclose a picture-based method and device for determining duplicate video files. The technical solution is as follows:
A picture-based method for determining duplicate video files, comprising:
obtaining the picture fingerprint of the representative picture of a first video file and the picture fingerprint of the representative picture of a second video file, where a picture fingerprint is picture feature information calculated from the average gray value information and the color average information of a picture;
comparing the picture fingerprint of the representative picture of the first video file with the picture fingerprint of the representative picture of the second video file to obtain the degree of similarity between the two representative pictures;
if the degree of similarity meets a preset condition, determining that the first video file and the second video file are duplicate video files.
In a specific embodiment of the invention, the picture fingerprint of the representative picture of any video file is calculated by the following steps:
obtaining a target picture, the target picture being the representative picture of a target video file;
obtaining the grayscale picture corresponding to the target picture;
calculating the average gray value of the grayscale picture;
obtaining the gray feature information of the target picture according to how the gray value of each pixel in the grayscale picture compares with the average gray value;
calculating the color average of the target picture;
obtaining the color feature information of the target picture according to the color average of the target picture;
generating the picture fingerprint of the target picture according to the gray feature information and the color feature information.
In a specific embodiment of the invention, obtaining the gray feature information of the target picture according to how the gray value of each pixel in the grayscale picture compares with the average gray value comprises:
updating the gray value of each pixel in the grayscale picture as follows:
if the gray value of a pixel in the grayscale picture is less than or equal to the average gray value, updating the gray value of that pixel to a preset first value; otherwise, updating the gray value of that pixel to a preset second value;
arranging the updated gray values of all pixels in a preset order to obtain gray value sequence information, and using the gray value sequence information as the gray feature information of the target picture.
In a specific embodiment of the invention, obtaining the grayscale picture corresponding to the target picture comprises:
reducing the target picture according to a preset first ratio, and obtaining the grayscale picture corresponding to the target picture from the reduced picture; or
obtaining a grayscale picture of the same size as the target picture, and reducing the obtained grayscale picture according to a preset second ratio to obtain the grayscale picture corresponding to the target picture.
In a specific embodiment of the invention, after the first video file and the second video file are determined to be duplicate video files, the method further comprises:
marking the first video file and the second video file with the same marker, so that when video files need to be shown to a user, one video file is selected for display from the video files bearing the same marker according to a preset selection requirement.
A picture-based device for determining duplicate video files, comprising:
a picture fingerprint obtaining module, configured to obtain the picture fingerprint of the representative picture of a first video file and the picture fingerprint of the representative picture of a second video file, where a picture fingerprint is picture feature information calculated from the average gray value information and the color average information of a picture;
a similarity obtaining module, configured to compare the picture fingerprint of the representative picture of the first video file with the picture fingerprint of the representative picture of the second video file to obtain the degree of similarity between the two representative pictures;
a duplicate video file determining module, configured to determine, when the degree of similarity meets a preset condition, that the first video file and the second video file are duplicate video files.
In a specific embodiment of the invention, the device further comprises:
a picture fingerprint calculating module, configured to calculate the picture fingerprint of the representative picture of any video file.
The picture fingerprint calculating module comprises:
a target picture obtaining submodule, configured to obtain a target picture, the target picture being the representative picture of a target video file;
a grayscale picture obtaining submodule, configured to obtain the grayscale picture corresponding to the target picture;
an average gray value calculating submodule, configured to calculate the average gray value of the grayscale picture;
a gray feature information obtaining submodule, configured to obtain the gray feature information of the target picture according to how the gray value of each pixel in the grayscale picture compares with the average gray value;
a color average calculating submodule, configured to calculate the color average of the target picture;
a color feature information obtaining submodule, configured to obtain the color feature information of the target picture according to the color average of the target picture;
a picture fingerprint generating submodule, configured to generate the picture fingerprint of the target picture according to the gray feature information and the color feature information.
In a specific embodiment of the invention, the gray feature information obtaining submodule comprises:
a gray value updating unit, configured to update the gray value of each pixel in the grayscale picture as follows: if the gray value of a pixel in the grayscale picture is less than or equal to the average gray value, update the gray value of that pixel to a preset first value; otherwise, update the gray value of that pixel to a preset second value;
a gray feature information obtaining unit, configured to arrange the updated gray values of all pixels in a preset order to obtain gray value sequence information, and to use the gray value sequence information as the gray feature information of the target picture.
In a specific embodiment of the invention, the grayscale picture obtaining submodule comprises:
a first grayscale picture obtaining unit, configured to reduce the target picture according to a preset first ratio and to obtain the grayscale picture corresponding to the target picture from the reduced picture; or
a second grayscale picture obtaining unit, configured to obtain a grayscale picture of the same size as the target picture and to reduce the obtained grayscale picture according to a preset second ratio to obtain the grayscale picture corresponding to the target picture.
In a specific embodiment of the invention, the device further comprises:
a marking module, configured to mark the first video file and the second video file with the same marker after they are determined to be duplicate video files, so that when video files need to be shown to a user, one video file is selected for display from the video files bearing the same marker according to a preset selection requirement.
With the technical solution provided by the embodiments of the present invention, whether two video files are duplicates is determined from picture fingerprints. A picture fingerprint is calculated from the average gray value information and the color average information of a picture, and pictures that are identical or highly similar have identical or similar average gray value information and color average information, so their picture fingerprints are also identical or similar. Video files whose representative pictures are identical or similar can therefore be determined to be duplicate video files on the basis of their picture fingerprints and deduplicated accordingly, which yields a more noticeable deduplication effect and improves the user experience.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the picture-based method for determining duplicate video files in an embodiment of the present invention;
Fig. 2 is a flowchart of the picture fingerprint calculation method in an embodiment of the present invention;
Fig. 3 is a schematic diagram of an 8*8 picture in an embodiment of the present invention;
Fig. 4 is a structural diagram of the picture-based device for determining duplicate video files in an embodiment of the present invention;
Fig. 5 is a structural diagram of the picture fingerprint calculating module in an embodiment of the present invention.
Detailed Description
The picture-based method for determining duplicate video files provided by the embodiments of the present invention is described first. The method may comprise the following steps:
obtaining the picture fingerprint of the representative picture of a first video file and the picture fingerprint of the representative picture of a second video file, where a picture fingerprint is picture feature information calculated from the average gray value information and the color average information of a picture;
comparing the picture fingerprint of the representative picture of the first video file with the picture fingerprint of the representative picture of the second video file to obtain the degree of similarity between the two representative pictures;
if the degree of similarity meets a preset condition, determining that the first video file and the second video file are duplicate video files.
It will be understood that, whether the task is to find which of several video files are duplicates or to decide whether two given video files are duplicates, each comparison involves two video files, and what is compared is the picture fingerprints of their representative pictures. Take finding duplicates among several video files as an example, and suppose the video files are A, B, C and D. Two video files are first selected according to some rule, say A and C, and it is determined whether they are duplicates. If A and C are duplicates, video file B is selected next and compared with A. If B and A are not duplicates, video file D is then selected and compared with A and with B in turn. If D and A are duplicates, the final result is that video files A, C and D are duplicate video files.
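The pairwise procedure in the A, B, C, D example can be sketched as a simple grouping loop. This is only an illustration: `group_duplicates` and the `is_duplicate` predicate are hypothetical names, the files are visited in list order rather than by the unspecified selection rule, and each file is compared only with the first member of each existing group.

```python
from typing import Callable, List, Sequence


def group_duplicates(
    files: Sequence[str],
    is_duplicate: Callable[[str, str], bool],
) -> List[List[str]]:
    """Group files by comparing each one with the first member of every group."""
    groups: List[List[str]] = []
    for name in files:
        for group in groups:
            if is_duplicate(group[0], name):
                group.append(name)
                break
        else:
            # No existing group matched, so this file starts a new group.
            groups.append([name])
    return groups


# Hypothetical outcome of the pairwise checks from the example in the text.
known_pairs = {frozenset(("A", "C")), frozenset(("A", "D"))}
groups = group_duplicates(
    ["A", "B", "C", "D"],
    lambda x, y: frozenset((x, y)) in known_pairs,
)
```

With these assumed pairwise results, A, C and D end up in one group and B in a group of its own, matching the outcome described above.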
In practice, the technical solution provided by the embodiments of the present invention can be applied when showing video search results to a user: when search results need to be shown, the duplicates among the video files matching the user's search request are determined, and the search results are shown after the duplicate video files have been deduplicated. The solution is not limited to this scenario; it can be applied wherever videos or pictures need to be deduplicated.
With the technical solution provided by the embodiments of the present invention, whether two video files are duplicates is determined from picture fingerprints. A picture fingerprint is calculated from the average gray value information and the color average information of a picture, and pictures that are identical or highly similar have identical or similar average gray value information and color average information, so their picture fingerprints are also identical or similar. Video files whose representative pictures are identical or similar can therefore be determined to be duplicate video files on the basis of their picture fingerprints and deduplicated accordingly, which yields a more noticeable deduplication effect and improves the user experience.
To help those skilled in the art better understand the technical solutions in the embodiments of the present invention, the technical solutions are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Referring to Fig. 1, a flowchart of a picture-based method for determining duplicate video files provided by an embodiment of the present invention, the method may comprise the following steps:
S110: obtain the picture fingerprint of the representative picture of the first video file and the picture fingerprint of the representative picture of the second video file;
Here, a picture fingerprint is picture feature information calculated from the average gray value information and the color average information of a picture.
For ease of understanding, the scenario of showing video search results to a user is used as an example. In this scenario, there are two possible times at which picture fingerprints can be calculated:
First: the picture fingerprints of the representative pictures of all video files on the video website are calculated in advance, so that when a user's search request is received, the picture fingerprints of the representative pictures of the video files matching the request can be obtained directly.
For example, a crawler module can crawl the representative pictures of the large number of video files on the video website and thereby obtain the URL (Uniform Resource Locator) of each representative picture. If the website holds many video files and a very large number of representative pictures must be crawled, incremental downloading or parallel downloading with multiple machines and multiple threads can be used. Because re-downloading is costly, the representative picture data can be backed up on a schedule to guard against loss. The picture fingerprints of the downloaded representative pictures can be calculated on a single machine or by multiple processes; for example, the representative pictures are divided into groups according to some rule, and each process calculates the picture fingerprints of the representative pictures in its group. In addition, to keep the picture fingerprint information up to date, the picture fingerprints of newly downloaded representative pictures can be calculated periodically.
Second: after a user's search request is received, the video files matching the request are obtained first, and the picture fingerprints of their representative pictures are then calculated.
It will be understood that, whichever time is chosen, the picture fingerprint is calculated from the average gray value information and the color average information of the picture.
In a specific embodiment of the invention, the picture fingerprint of the representative picture of any video file can be calculated by the following steps, as shown in Fig. 2:
S210: obtain a target picture;
The target picture is the representative picture of a target video file.
S220: obtain the grayscale picture corresponding to the target picture;
Applying grayscale processing to any picture yields the grayscale picture of that picture. Those skilled in the art can perform grayscale processing using the common knowledge and common techniques of the field, and the present invention places no limitation on this. For example, for a picture in YUV format, the grayscale picture can be obtained directly from the Y component of the picture.
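As an illustration of grayscale processing, the sketch below represents a picture as rows of 3-tuples. The YUV case follows the text directly; the RGB weighting (0.299, 0.587, 0.114, the ITU-R BT.601 luma weights) is one common choice and an assumption here, since the patent leaves the method open. The function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def rgb_to_gray(picture: List[List[Pixel]]) -> List[List[int]]:
    """Convert rows of (R, G, B) pixels to gray values in [0, 255]."""
    return [
        # Integer form of 0.299 R + 0.587 G + 0.114 B, rounded to nearest.
        [(299 * r + 587 * g + 114 * b + 500) // 1000 for (r, g, b) in row]
        for row in picture
    ]


def yuv_to_gray(picture: List[List[Pixel]]) -> List[List[int]]:
    """For (Y, U, V) pixels the gray value is simply the Y component."""
    return [[y for (y, _u, _v) in row] for row in picture]
```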
In a specific embodiment of the invention, the target picture can be reduced according to a preset first ratio, and the grayscale picture corresponding to the target picture obtained from the reduced picture.
For example, if the target picture is 64*64, it can first be reduced to 8*8; applying grayscale processing to the reduced target picture then gives the grayscale picture corresponding to the target picture.
In another specific embodiment of the invention, a grayscale picture of the same size as the target picture can be obtained first, and the obtained grayscale picture then reduced according to a preset second ratio to give the grayscale picture corresponding to the target picture.
For example, if the target picture is 64*64, grayscale processing gives a grayscale picture of the same size; this grayscale picture is reduced to 8*8, and the reduced grayscale picture is used as the grayscale picture corresponding to the target picture.
Reducing the target picture, or the grayscale picture of the same size, lowers the amount of computation needed for the gray feature information in the following steps and also lowers the cost of comparing picture fingerprints.
Note that the preset first ratio and the preset second ratio can be set according to actual needs.
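The patent does not say how the reduction is performed. A minimal sketch, assuming block averaging and a picture size that is an exact multiple of the target size (both assumptions, as is the name `shrink`), could look like this:

```python
from typing import List


def shrink(gray: List[List[int]], out_h: int, out_w: int) -> List[List[int]]:
    """Reduce a gray picture to out_h * out_w by averaging equal-sized blocks.

    Assumes the input height and width are exact multiples of the output size.
    """
    in_h, in_w = len(gray), len(gray[0])
    block_h, block_w = in_h // out_h, in_w // out_w
    result = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            block = [
                gray[y][x]
                for y in range(i * block_h, (i + 1) * block_h)
                for x in range(j * block_w, (j + 1) * block_w)
            ]
            row.append(sum(block) // len(block))
        result.append(row)
    return result


# A 64*64 picture whose gray value equals its row index, reduced to 8*8.
big = [[y] * 64 for y in range(64)]
small = shrink(big, 8, 8)
```

Going from 64*64 to 8*8 cuts the number of pixels that later steps must threshold and compare from 4096 to 64.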
S230: calculate the average gray value of the grayscale picture;
After step S220 has produced the grayscale picture corresponding to the target picture, the average of the gray values of all pixels in the grayscale picture is calculated from the gray value of each pixel; this average is the average gray value of the grayscale picture.
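Step S230 is a plain arithmetic mean over all pixels. A small sketch, with the grayscale picture held as a list of rows (a representation chosen here for illustration):

```python
from typing import List


def average_gray(gray: List[List[int]]) -> float:
    """Mean of the gray values of all pixels in the grayscale picture."""
    values = [v for row in gray for v in row]
    return sum(values) / len(values)


avg = average_gray([[10, 20], [30, 40]])  # (10 + 20 + 30 + 40) / 4
```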
S240: obtain the gray feature information of the target picture according to how the gray value of each pixel in the grayscale picture compares with the average gray value;
Steps S220 and S230 provide the gray value of each pixel of the grayscale picture and the average gray value. Each pixel's gray value is compared with the average gray value, and the gray feature information of the target picture is obtained from the comparison results.
In a specific embodiment of the invention, step S240 may comprise the following steps:
Step 1: update the gray value of each pixel in the grayscale picture as follows:
if the gray value of a pixel in the grayscale picture is less than or equal to the average gray value, update the gray value of that pixel to a preset first value; otherwise, update it to a preset second value;
Step 2: arrange the updated gray values of all pixels in a preset order to obtain gray value sequence information, and use the gray value sequence information as the gray feature information of the target picture.
It will be understood that the initial gray value of each pixel in the grayscale picture lies in [0, 255]. If the gray value of a pixel is less than or equal to the average gray value, it can be updated to the preset first value, for example 0; if it is greater than the average gray value, it can be updated to the preset second value, for example 1. After this processing, every pixel's gray value is either the preset first value or the preset second value, for example 0 or 1. Arranging the updated gray values of all pixels in a preset order, such as row order, column order or zigzag order, gives the gray value sequence information, which serves as the gray feature information of the target picture.
Fig. 3 is a schematic diagram of an 8*8 grayscale picture, in which the value shown for each pixel is its updated gray value. Arranging the values in row order gives the gray value sequence information:
1110101110111111111100100101011111111010110110110111011111111101
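Steps 1 and 2 can be sketched as follows, taking 0 and 1 as the preset first and second values as in the example above. The function name `gray_feature` and the `order` parameter are illustrative; only row and column order are shown.

```python
from typing import List


def gray_feature(gray: List[List[int]], order: str = "row") -> str:
    """Threshold each pixel against the mean and serialize the bits."""
    values = [v for row in gray for v in row]
    avg = sum(values) / len(values)
    # Preset first value 0 for pixels <= average, preset second value 1 otherwise.
    bits = [["0" if v <= avg else "1" for v in row] for row in gray]
    if order == "column":
        bits = [list(col) for col in zip(*bits)]  # column-by-column order
    return "".join("".join(row) for row in bits)


sample = [[10, 200], [30, 220]]            # mean is 115
row_bits = gray_feature(sample)             # "0101"
col_bits = gray_feature(sample, "column")   # "0011"
```

For an 8*8 grayscale picture the result is a 64-character string of the same form as the sequence above. Whatever order is chosen, it has to be the same for every picture, otherwise fingerprints are not comparable.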
S250: calculate the color average of the target picture;
In practice, the color average of the target picture can be calculated from its color information. For example, in a target picture in RGB format each pixel has three color components, R, G and B; calculating the color average of the target picture then means calculating the average of the R components of all pixels, the average of the G components, and the average of the B components.
The color average may be calculated from the original target picture or from the reduced target picture.
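For an RGB picture, the per-channel averages can be computed as in this sketch. Flooring each mean to an integer is an assumption made here so that each average fits in 8 bits; the patent does not state a rounding rule, and the sample pixel values are made up.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def color_average(picture: List[List[Pixel]]) -> Tuple[int, int, int]:
    """Per-channel mean of an RGB picture, floored to integers in [0, 255]."""
    pixels = [p for row in picture for p in row]
    n = len(pixels)
    return (
        sum(p[0] for p in pixels) // n,
        sum(p[1] for p in pixels) // n,
        sum(p[2] for p in pixels) // n,
    )


avg_rgb = color_average([[(200, 40, 230), (226, 54, 228)]])
```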
S260: obtain the color feature information of the target picture according to the color average of the target picture;
Step S250 gives the color averages of the target picture; arranging these averages in a certain order gives the color feature information of the target picture. Taking a target picture in RGB format again, once the averages of the R, G and B components have been obtained, each average can be converted to a binary sequence, giving three 8-bit binary integers. Arranging these three 8-bit binary integers in a certain order gives the color feature information of the target picture.
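Converting the three averages to 8-bit binary strings and concatenating them can be sketched as below. The R, G, B ordering and the name `color_feature` are assumptions; the text only requires "a certain order".

```python
from typing import Tuple


def color_feature(avg_rgb: Tuple[int, int, int]) -> str:
    """Concatenate the 8-bit binary forms of the R, G and B averages."""
    return "".join(format(channel, "08b") for channel in avg_rgb)


bits = color_feature((213, 47, 229))
```

Zero-padding each value to 8 bits keeps the color feature at a fixed length of 24 bits, so the positions of the three components never shift between pictures.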
It should be noted that executing steps S220, S230, and S240 in sequence yields the gray feature information of the target picture, and executing steps S250 and S260 in sequence yields the color feature information of the target picture. However, no order is imposed between steps S220, S230, and S240 on the one hand and steps S250 and S260 on the other; that is, the gray feature information and the color feature information of the target picture can be obtained in parallel or one after the other.
S270: generate the picture fingerprint of the target picture according to the gray feature information and the color feature information.
After steps S220 to S260 are executed, the gray feature information and the color feature information of the target picture are available. Combining the gray feature information and the color feature information according to a preset rule generates the picture fingerprint of the target picture. For example, the gray feature information may come first and the color feature information second, or the color feature information may come first and the gray feature information second.
Continuing with the 8*8 grayscale picture shown in Fig. 3 as the example, suppose this picture is the grayscale picture corresponding to a certain target picture. The gray feature information of the target picture obtained through steps S220 to S240 is 1110101110111111111100100101011111111010110110110111011111111101, and the color feature information of the target picture obtained through steps S250 and S260 is 110101010010111111100101. Combining them with the gray feature information first and the color feature information second gives the picture fingerprint of the target picture: 1110101110111111111100100101011111111010110110110111011111111101110101010010111111100101.
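The combination in step S270 amounts to concatenating the two bit sequences in a fixed order. The short Python sketch below reproduces the example values given above; the function name `make_fingerprint` and the `gray_first` flag are hypothetical names introduced only for illustration.

```python
def make_fingerprint(gray_bits, color_bits, gray_first=True):
    """Combine gray and color feature bit strings according to a preset order."""
    if gray_first:
        return gray_bits + color_bits
    return color_bits + gray_bits


# Example values taken from the description of Fig. 3.
gray_bits = "1110101110111111111100100101011111111010110110110111011111111101"
color_bits = "110101010010111111100101"

fingerprint = make_fingerprint(gray_bits, color_bits)
print(len(fingerprint), fingerprint)
```

Whichever order is chosen, both fingerprints being compared need to use the same order so that equal positions carry the same kind of information.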
S120: compare the picture fingerprint of the representative picture of the first video file with the picture fingerprint of the representative picture of the second video file, to obtain the similarity degree between the representative picture of the first video file and the representative picture of the second video file;
The picture fingerprint obtained in step S110 can be a sequence containing multiple values. The values at the same positions in the picture fingerprint of the representative picture of the first video file and in the picture fingerprint of the representative picture of the second video file are compared to determine whether they are equal, and the similarity degree between the two representative pictures is obtained from the number of positions holding equal values. For example, the ratio of the number of positions holding equal values in the two picture fingerprints to the total number of values can be used as the similarity degree of the representative pictures of the two video files.
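The position-by-position comparison can be sketched in Python as follows. The function name `similarity` and the string representation of fingerprints are assumptions for the example; the sketch also assumes that both fingerprints have the same nonzero length, which holds when they are produced by the same procedure.

```python
def similarity(fingerprint_a, fingerprint_b):
    """Ratio of positions holding equal values to the total number of values."""
    if not fingerprint_a or len(fingerprint_a) != len(fingerprint_b):
        raise ValueError("fingerprints must be non-empty and of equal length")
    equal = sum(a == b for a, b in zip(fingerprint_a, fingerprint_b))
    return equal / len(fingerprint_a)


# Three of the four positions match.
score = similarity("1100", "1110")
print(score)
```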
S130: if the similarity degree meets a preset condition, determine the first video file and the second video file to be duplicate video files.
If the similarity degree of the two representative pictures obtained in step S120 is the ratio of the number of positions holding identical values to the total number of positions, the first video file and the second video file can be determined to be duplicate video files when the similarity degree is higher than a preset threshold. If the similarity degree of the two representative pictures obtained in step S120 is the ratio of the number of positions holding different values to the total number of positions, the first video file and the second video file can be determined to be duplicate video files when the similarity degree is lower than a preset threshold.
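The two forms of the preset condition can be expressed in a single Python helper, sketched below. The function name `is_duplicate`, the `score_is_difference` flag, and the threshold values used in the example are hypothetical; the text leaves the actual threshold to be preset.

```python
def is_duplicate(score, threshold, score_is_difference=False):
    """Decide whether two video files are duplicates from a similarity score.

    score_is_difference=False: score is the ratio of equal positions,
        and duplicates require score > threshold.
    score_is_difference=True: score is the ratio of differing positions,
        and duplicates require score < threshold.
    """
    if score_is_difference:
        return score < threshold
    return score > threshold


# 7 of 8 positions equal, i.e. 1 of 8 positions different.
by_equal_ratio = is_duplicate(7 / 8, threshold=0.8)
by_difference_ratio = is_duplicate(1 / 8, threshold=0.2, score_is_difference=True)
print(by_equal_ratio, by_difference_ratio)
```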
In another embodiment of the present invention, after the first video file and the second video file are determined to be duplicate video files, the method can further include the following step:
Mark the first video file and the second video file with the same marker, so that when video files need to be shown to a user, one video file is selected for display, according to a preset selection requirement, from the video files that carry the same marker.
In practical applications, there may be multiple video files to be shown. For any two of these video files, whether the two are duplicate video files can be determined by executing steps S110 to S130. In this way, after the multiple video files to be shown have been checked, the duplicate video files can be marked with the same marker. Suppose the video files to be shown are video file 1, video file 2, video file 3, video file 4, and video file 5, and it is determined that video file 1 and video file 4 are duplicate video files and that video file 2 and video file 3 are duplicate video files. Video file 1 and video file 4 can then be marked with a first marker, and video file 2 and video file 3 with a second marker. When video files need to be shown to the user, video file 1 and video file 2, for example, are selected from their respective groups according to the preset selection requirement and are shown to the user together with video file 5 as the final result. It should be noted that the preset selection requirement can be determined according to factors such as the quality of the video files and the number of user clicks.
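The marking and selection described above can be sketched in Python as follows. This is one possible sketch under stated assumptions: the function names `mark_duplicates` and `select_for_display`, the use of a union-find structure (which also treats duplication as transitive), and the `score` values standing in for the preset selection requirement are all hypothetical and not specified by the text.

```python
def mark_duplicates(video_ids, duplicate_pairs):
    """Give the same marker to every video linked by a duplicate pair."""
    marker = {video: video for video in video_ids}  # each video starts alone

    def find(video):
        # Follow parent links to the group's marker, compressing the path.
        while marker[video] != video:
            marker[video] = marker[marker[video]]
            video = marker[video]
        return video

    for a, b in duplicate_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            marker[root_b] = root_a
    return {video: find(video) for video in video_ids}


def select_for_display(markers, score):
    """Keep, for each marker, the video with the highest selection score."""
    best = {}
    for video, mark in markers.items():
        if mark not in best or score[video] > score[best[mark]]:
            best[mark] = video
    return sorted(best.values())


# Video files 1..5; files 1 and 4 are duplicates, as are files 2 and 3.
markers = mark_duplicates([1, 2, 3, 4, 5], [(1, 4), (2, 3)])
# Hypothetical scores, e.g. derived from quality or click counts.
score = {1: 0.9, 2: 0.8, 3: 0.5, 4: 0.4, 5: 0.7}
shown = select_for_display(markers, score)
print(shown)
```

With these hypothetical scores, video file 1 and video file 2 are selected from their groups and shown together with video file 5, matching the example in the text.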
With the technical solution provided by the embodiments of the present invention, whether two video files are duplicate video files is determined through picture fingerprints. Because a picture fingerprint is calculated from the average gray value information and the color average information of a picture, and pictures that are identical or highly similar have identical or similar average gray value information and color average information, the picture fingerprints of identical or highly similar pictures are also identical or similar. Video files whose representative pictures are identical or similar can therefore be determined to be duplicate video files according to their picture fingerprints. Deduplicating video files on this basis produces a more noticeable deduplication effect and improves the user experience.
Corresponding to the above method embodiment, an embodiment of the present invention further provides a picture-based device for determining duplicate video files. Referring to Fig. 4, the device can include the following modules:
A picture fingerprint obtaining module 310, configured to obtain the picture fingerprint of the representative picture of a first video file and the picture fingerprint of the representative picture of a second video file, where a picture fingerprint is picture feature information calculated from the average gray value information and the color average information of a picture;
A similarity degree obtaining module 320, configured to compare the picture fingerprint of the representative picture of the first video file with the picture fingerprint of the representative picture of the second video file, to obtain the similarity degree between the representative picture of the first video file and the representative picture of the second video file;
A duplicate video file determining module 330, configured to determine the first video file and the second video file to be duplicate video files when the similarity degree is higher than a preset threshold.
In one embodiment of the present invention, the device can further include a picture fingerprint calculation module:
The picture fingerprint calculation module is configured to calculate the picture fingerprint of the representative picture of any video file.
Referring to Fig. 5, the picture fingerprint calculation module can include the following submodules:
A target picture obtaining submodule 410, configured to obtain a target picture, the target picture being the representative picture of a target video file;
A grayscale picture obtaining submodule 420, configured to obtain the grayscale picture corresponding to the target picture;
An average gray value calculation submodule 430, configured to calculate the average gray value of the grayscale picture;
A gray feature information obtaining submodule 440, configured to obtain the gray feature information of the target picture according to how the gray value of each pixel in the grayscale picture compares with the average gray value;
A color average calculation submodule 450, configured to calculate the color average of the target picture;
A color feature information obtaining submodule 460, configured to obtain the color feature information of the target picture according to the color average of the target picture;
A picture fingerprint generation submodule 470, configured to generate the picture fingerprint of the target picture according to the gray feature information and the color feature information.
In a specific embodiment of the present invention, the gray feature information obtaining submodule 440 can include the following units:
A gray value updating unit, configured to update the gray value of each pixel in the grayscale picture as follows: if the gray value of a pixel in the grayscale picture is less than or equal to the average gray value, update the gray value of the pixel to a preset first value; otherwise, update the gray value of the pixel to a preset second value;
A gray feature information obtaining unit, configured to arrange the updated gray values of all pixels in a preset order to obtain gray value sequence information, and to use the gray value sequence information as the gray feature information of the target picture.
In a specific embodiment of the present invention, the grayscale picture obtaining submodule 420 can include the following units:
A first grayscale picture obtaining unit, configured to reduce the target picture according to a preset first scale and to obtain the grayscale picture corresponding to the target picture from the reduced picture; or
A second grayscale picture obtaining unit, configured to obtain a grayscale picture of the same size as the target picture and to reduce that grayscale picture according to a preset second scale, to obtain the grayscale picture corresponding to the target picture.
In another embodiment of the present invention, the device can further include:
A marking module, configured to, after the first video file and the second video file are determined to be duplicate video files, mark the first video file and the second video file with the same marker, so that when video files need to be shown to a user, one video file is selected for display, according to a preset selection requirement, from the video files that carry the same marker.
With the device provided by the embodiments of the present invention, whether two video files are duplicate video files is determined through picture fingerprints. Because a picture fingerprint is calculated from the average gray value information and the color average information of a picture, and pictures that are identical or highly similar have identical or similar average gray value information and color average information, the picture fingerprints of identical or highly similar pictures are also identical or similar. Video files whose representative pictures are identical or similar can therefore be determined to be duplicate video files according to their picture fingerprints. Deduplicating video files on this basis produces a more noticeable deduplication effect and improves the user experience.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to that process, method, article, or device. Absent further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a related manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. The device embodiment in particular is described relatively briefly because it is substantially similar to the method embodiment; for the relevant details, refer to the description of the method embodiment.
Those of ordinary skill in the art will appreciate that all or part of the steps in the above method embodiment can be completed by a program instructing the relevant hardware, and that the program can be stored in a computer-readable storage medium. The storage medium referred to here is, for example, a ROM/RAM, a magnetic disk, or an optical disc.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims (10)

1. A picture-based method for determining duplicate video files, characterized by comprising:
obtaining the picture fingerprint of the representative picture of a first video file and the picture fingerprint of the representative picture of a second video file, wherein a picture fingerprint is picture feature information calculated from the average gray value information and the color average information of a picture;
comparing the picture fingerprint of the representative picture of the first video file with the picture fingerprint of the representative picture of the second video file, to obtain the similarity degree between the representative picture of the first video file and the representative picture of the second video file;
if the similarity degree meets a preset condition, determining the first video file and the second video file to be duplicate video files;
wherein the comparing of the picture fingerprint of the representative picture of the first video file with the picture fingerprint of the representative picture of the second video file, to obtain the similarity degree between the representative picture of the first video file and the representative picture of the second video file, comprises:
comparing the values at the same positions in the picture fingerprint of the representative picture of the first video file and in the picture fingerprint of the representative picture of the second video file to determine whether they are equal, and obtaining, according to the number of equal values at the same positions, the similarity degree between the representative picture of the first video file and the representative picture of the second video file;
and wherein the determining of the first video file and the second video file to be duplicate video files if the similarity degree meets the preset condition comprises:
when the similarity degree of the representative pictures of the first video file and the second video file is the ratio of the number of equal values at the same positions to the total number of values, determining the first video file and the second video file to be duplicate video files when the similarity degree is higher than a preset threshold; or, when the similarity degree of the representative pictures of the first video file and the second video file is the ratio of the number of different values at the same positions to the total number of values, determining the first video file and the second video file to be duplicate video files when the similarity degree is lower than a preset threshold.
2. The method according to claim 1, characterized in that the picture fingerprint of the representative picture of any video file is calculated by the following steps:
obtaining a target picture, the target picture being the representative picture of a target video file;
obtaining the grayscale picture corresponding to the target picture;
calculating the average gray value of the grayscale picture;
obtaining the gray feature information of the target picture according to how the gray value of each pixel in the grayscale picture compares with the average gray value;
calculating the color average of the target picture;
obtaining the color feature information of the target picture according to the color average of the target picture;
generating the picture fingerprint of the target picture according to the gray feature information and the color feature information.
3. The method according to claim 2, characterized in that the obtaining of the gray feature information of the target picture according to how the gray value of each pixel in the grayscale picture compares with the average gray value comprises:
updating the gray value of each pixel in the grayscale picture as follows:
if the gray value of a pixel in the grayscale picture is less than or equal to the average gray value, updating the gray value of the pixel to a preset first value, and otherwise updating the gray value of the pixel to a preset second value;
arranging the updated gray values of all pixels in a preset order to obtain gray value sequence information, and using the gray value sequence information as the gray feature information of the target picture.
4. The method according to claim 2, characterized in that the obtaining of the grayscale picture corresponding to the target picture comprises:
reducing the target picture according to a preset first scale, and obtaining the grayscale picture corresponding to the target picture from the reduced picture; or
obtaining a grayscale picture of the same size as the target picture, and reducing the obtained grayscale picture according to a preset second scale, to obtain the grayscale picture corresponding to the target picture.
5. The method according to any one of claims 1 to 4, characterized in that, after the first video file and the second video file are determined to be duplicate video files, the method further comprises:
marking the first video file and the second video file with the same marker, so that when video files need to be shown to a user, one video file is selected for display, according to a preset selection requirement, from the video files that carry the same marker.
6. A picture-based device for determining duplicate video files, characterized by comprising:
a picture fingerprint obtaining module, configured to obtain the picture fingerprint of the representative picture of a first video file and the picture fingerprint of the representative picture of a second video file, wherein a picture fingerprint is picture feature information calculated from the average gray value information and the color average information of a picture;
a similarity degree obtaining module, configured to compare the picture fingerprint of the representative picture of the first video file with the picture fingerprint of the representative picture of the second video file, to obtain the similarity degree between the representative picture of the first video file and the representative picture of the second video file;
a duplicate video file determining module, configured to determine the first video file and the second video file to be duplicate video files when the similarity degree meets a preset condition;
wherein the similarity degree obtaining module is specifically configured to compare the values at the same positions in the picture fingerprint of the representative picture of the first video file and in the picture fingerprint of the representative picture of the second video file to determine whether they are equal, and to obtain, according to the number of equal values at the same positions, the similarity degree between the representative picture of the first video file and the representative picture of the second video file;
and the duplicate video file determining module is specifically configured to: when the similarity degree of the representative pictures of the first video file and the second video file is the ratio of the number of equal values at the same positions to the total number of values, determine the first video file and the second video file to be duplicate video files when the similarity degree is higher than a preset threshold; or, when the similarity degree of the representative pictures of the first video file and the second video file is the ratio of the number of different values at the same positions to the total number of values, determine the first video file and the second video file to be duplicate video files when the similarity degree is lower than a preset threshold.
7. The device according to claim 6, characterized by further comprising:
a picture fingerprint calculation module, configured to calculate the picture fingerprint of the representative picture of any video file;
wherein the picture fingerprint calculation module comprises:
a target picture obtaining submodule, configured to obtain a target picture, the target picture being the representative picture of a target video file;
a grayscale picture obtaining submodule, configured to obtain the grayscale picture corresponding to the target picture;
an average gray value calculation submodule, configured to calculate the average gray value of the grayscale picture;
a gray feature information obtaining submodule, configured to obtain the gray feature information of the target picture according to how the gray value of each pixel in the grayscale picture compares with the average gray value;
a color average calculation submodule, configured to calculate the color average of the target picture;
a color feature information obtaining submodule, configured to obtain the color feature information of the target picture according to the color average of the target picture;
a picture fingerprint generation submodule, configured to generate the picture fingerprint of the target picture according to the gray feature information and the color feature information.
8. The device according to claim 7, characterized in that the gray feature information obtaining submodule comprises:
a gray value updating unit, configured to update the gray value of each pixel in the grayscale picture as follows: if the gray value of a pixel in the grayscale picture is less than or equal to the average gray value, update the gray value of the pixel to a preset first value, and otherwise update the gray value of the pixel to a preset second value;
a gray feature information obtaining unit, configured to arrange the updated gray values of all pixels in a preset order to obtain gray value sequence information, and to use the gray value sequence information as the gray feature information of the target picture.
9. The device according to claim 7, characterized in that the grayscale picture obtaining submodule comprises:
a first grayscale picture obtaining unit, configured to reduce the target picture according to a preset first scale and to obtain the grayscale picture corresponding to the target picture from the reduced picture; or
a second grayscale picture obtaining unit, configured to obtain a grayscale picture of the same size as the target picture and to reduce the obtained grayscale picture according to a preset second scale, to obtain the grayscale picture corresponding to the target picture.
10. The device according to any one of claims 6 to 9, characterized by further comprising:
a marking module, configured to, after the first video file and the second video file are determined to be duplicate video files, mark the first video file and the second video file with the same marker, so that when video files need to be shown to a user, one video file is selected for display, according to a preset selection requirement, from the video files that carry the same marker.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510089040.2A CN104636488B (en) 2015-02-26 2015-02-26 A kind of repetition video file based on picture determines method and device

Publications (2)

Publication Number Publication Date
CN104636488A CN104636488A (en) 2015-05-20
CN104636488B true CN104636488B (en) 2019-03-26

Family

ID=53215234

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510089040.2A Active CN104636488B (en) 2015-02-26 2015-02-26 A kind of repetition video file based on picture determines method and device

Country Status (1)

Country Link
CN (1) CN104636488B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102156751A (en) * 2011-04-26 2011-08-17 深圳市迅雷网络技术有限公司 Method and device for extracting video fingerprint
CN102930537A (en) * 2012-10-23 2013-02-13 深圳市宜搜科技发展有限公司 Image detection method and system
CN103593646A (en) * 2013-10-16 2014-02-19 中国计量学院 Dense crowd abnormal behavior detection method based on micro-behavior analysis

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8428301B2 (en) * 2008-08-22 2013-04-23 Dolby Laboratories Licensing Corporation Content identification and quality monitoring

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant