CN110149529A - Processing method, server and the storage medium of media information - Google Patents

Media information processing method, server, and storage medium

Info

Publication number
CN110149529A
Authority
CN
China
Prior art keywords
fingerprint
image
finger
audio
finger image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811294527.4A
Other languages
Chinese (zh)
Other versions
CN110149529B (en)
Inventor
刘刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201811294527.4A
Publication of CN110149529A
Application granted
Publication of CN110149529B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/233 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/233 Processing of audio elementary streams
    • H04N21/2335 Processing of audio elementary streams involving reformatting operations of audio signals, e.g. by converting from one coding standard to another
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23418 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements

Abstract

This application discloses a media information processing method, a server, and a storage medium. The method comprises: obtaining a source file of media information to be reviewed; extracting a first digital fingerprint of the media information to be reviewed from the source file, the first digital fingerprint comprising a first image fingerprint and a first audio fingerprint; determining, according to the first digital fingerprint and a second digital fingerprint of at least one piece of target media information, whether the media information to be reviewed is similar to a piece of target media information, wherein the second digital fingerprint comprises at least one of a second image fingerprint and a second audio fingerprint; and, when the media information to be reviewed is determined to be similar to a piece of target media information, determining the review result of the media information to be reviewed according to the preset review result of that target media information. These technical solutions can improve the deduplication efficiency of media information and the resource utilization of the server.

Description

Media information processing method, server, and storage medium
Technical field
This application relates to the field of Internet technology, and in particular to a media information processing method, a server, and a storage medium.
Background
With the rapid development of the mobile Internet, numerous video websites have emerged, and users can upload all kinds of videos. As mobile terminals become widespread and networks become faster, short, quickly consumed, high-traffic content has gradually won the favor of major platforms, fans, and capital. For example, a short video is a type of video content, spread on Internet new media, with a duration of less than 5 minutes. Because the threshold for creating short video content keeps falling, and major video platforms have introduced subsidy policies and incentive mechanisms, content producers have a strong incentive to produce duplicate content.
To deduplicate network videos, the prior art judges whether two videos are duplicates according to the titles of the video content. The drawback is that a large amount of manual comparison and filtering is required, so manual review and deduplication are inefficient. Meanwhile, to increase their own income, producers of video content can easily bypass such a deduplication strategy by modifying titles, and thereby upload large amounts of duplicate or similar content, which further increases the cost of manual review. As the volume of video content keeps growing, content that is not reviewed and processed quickly cannot be distributed quickly, especially the high-quality videos that enrich a content service. This greatly affects the experience of both video producers and viewers, and the resource utilization of the server that performs deduplication is also low.
Summary of the invention
In view of this, the present invention provides a media information processing method, a server, and a storage medium that can improve the deduplication efficiency of media information and the resource utilization of the server.
The technical solution of the present invention is realized as follows:
The present invention provides a media information processing method, comprising:
obtaining a source file of media information to be reviewed;
extracting a first digital fingerprint of the media information to be reviewed from the source file, the first digital fingerprint comprising a first image fingerprint and a first audio fingerprint;
determining, according to the first digital fingerprint and a second digital fingerprint of at least one piece of target media information, whether the media information to be reviewed is similar to a piece of target media information, wherein the second digital fingerprint comprises at least one of a second image fingerprint and a second audio fingerprint; and
when the media information to be reviewed is determined to be similar to a piece of target media information, determining the review result of the media information to be reviewed according to the preset review result of that target media information.
An embodiment of the present invention also provides a server, comprising:
a file acquisition module, configured to obtain a source file of media information to be reviewed;
a fingerprint extraction module, configured to extract a first digital fingerprint of the media information to be reviewed from the source file obtained by the file acquisition module, the first digital fingerprint comprising a first image fingerprint and a first audio fingerprint;
a deduplication module, configured to determine, according to the first digital fingerprint obtained by the fingerprint extraction module and a second digital fingerprint of at least one piece of target media information, whether the media information to be reviewed is similar to a piece of target media information, wherein the second digital fingerprint comprises at least one of a second image fingerprint and a second audio fingerprint; and
a review module, configured to determine, when the deduplication module determines that the media information to be reviewed is similar to a piece of target media information, the review result of the media information to be reviewed according to the preset review result of that target media information.
An embodiment of the present invention further provides a computer-readable storage medium storing computer-readable instructions that cause at least one processor to execute the above method.
Compared with the prior art, the method provided by the present invention extracts multi-dimensional media content features for multi-modal deduplication, which greatly improves deduplication accuracy. It reuses the status of media information whose manual review has already been completed, which greatly reduces the amount of duplicate media information entering manual review, thereby saving manpower and speeding up review; while freeing up manpower, it speeds up the distribution of popular, high-quality video files and improves review efficiency. It can also effectively curb cheating that obtains platform subsidies through duplicate or similar media content, and can uncover some piracy and abuse, protecting original account owners and content creators. In addition, it improves the server's ability to batch-process media information and the resource utilization of server equipment.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:
Fig. 1 is a structural schematic diagram of a media information transmission system according to an embodiment of the invention;
Fig. 2 is an exemplary flowchart of a media information processing method according to an embodiment of the invention;
Fig. 3 is an exemplary flowchart of a media information processing method according to another embodiment of the invention;
Fig. 4 is an exemplary flowchart of an image fingerprint processing method according to an embodiment of the invention;
Fig. 5 is an exemplary flowchart of an audio fingerprint processing method according to an embodiment of the invention;
Fig. 6 is an exemplary flowchart of a media information processing method according to another embodiment of the invention;
Fig. 7 is a schematic diagram of searching for an element in an image fingerprint linked list according to an embodiment of the invention;
Fig. 8 is a structural schematic diagram of a server according to an embodiment of the invention;
Fig. 9 is a structural schematic diagram of a server according to another embodiment of the invention.
Detailed description of embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
Fig. 1 is a structural schematic diagram of a media information transmission system according to an embodiment of the invention. As shown in Fig. 1, the media information transmission system 100 includes content production terminals 111~11M for media information, a media information processing system 120, and content consumption terminals 131~13N for media information. The media information processing system 120 in turn includes: a control center 121, an uplink/downlink interface server 122, a deduplication server 123, a manual review server 124, a content distribution egress server 125, a content storage database 126, and a meta-information database 127.
In the embodiments of this application, media information includes audio files and video files. The content production terminals 111~11M are M terminals that generate media information content, including but not limited to smartphones, tablet computers, laptop computers, and video website servers. Each content production terminal uploads the source file of locally stored or newly shot media information to the media information processing system 120.
In a specific implementation, the uploaded media information can be further divided into professionally generated content (PGC, Professional Generated Content), user generated content (UGC, User Generated Content), and mixed professional user generated content (PUGC, Professional User Generated Content). PGC refers to media content characterized by personalized content, diversified perspectives, democratized distribution, and virtualized social relationships, such as content generated on video website servers or Weibo servers. UGC arises to meet the individual needs of users, who can use various smart terminals to upload local or self-made videos. PUGC is professional audio-visual content, close to PGC in quality, produced in the form of UGC. In the field of content distribution, the efficiency of distributing UGC significantly affects the user experience.
The content consumption terminals 131~13N are terminals on which a media information playback client is installed, including but not limited to smartphones, tablet computers, laptop computers, and smart TVs. Examples of such clients are video apps such as Tencent Sports, Tencent Video, Mango TV, and iQIYI.
According to embodiments of this application, as shown in Fig. 1, the complete transmission process of media information includes the following steps:
1. The content production terminals 111~11M upload the source file of media information to the uplink/downlink interface server 122.
In a specific application, a content production terminal may also submit related information together with the source file, such as the title, publisher, abstract, cover image, and publication time.
2. The uplink/downlink interface server 122 stores the source file in the content storage database 126.
In a specific implementation, the content storage database 126 is a group of widely distributed storage servers that users, as content producers, can access from a nearby node. CDN acceleration servers may also be deployed at its periphery for distributed cache acceleration. The video content uploaded by content producers is saved through the uplink/downlink content interface server.
In addition, the content storage database 126 performs a standard transcoding operation on the content of the source file and asynchronously returns the meta-information once transcoding is complete.
3. The uplink/downlink interface server 122 writes the meta-information of the media information into the meta-information database 127.
For example, when the media information is a video file, the meta-information includes the video file size, cover image link, bit rate, file format, title, publication time, author, and similar information. As the core database of media information, the meta-information database 127 stores the meta-information of every piece of media information.
4. The uplink/downlink interface server 122 submits the uploaded source file to the control center 121 for subsequent content processing and flow.
The control center 121 is responsible for scheduling the entire flow of video content and controls the order and priority of scheduling. Specifically, the control center 121 schedules the deduplication server 123 and the manual review server 124 in turn, which perform machine processing and manual review of the media information content respectively.
5. The control center 121 schedules the deduplication server 123 to perform machine deduplication on the received source file and receives the deduplication result from the deduplication server 123.
The deduplication server 123 processes the source file and determines whether it is duplicate media information.
6. The control center 121 synchronizes the machine deduplication result to the manual review server 124.
7. The manual review server 124 reads the meta-information of the source file from the meta-information database 127 and performs a secondary review of the source file's content.
The manual review server 124 can check the deduplication result or, for new video files determined after deduplication not to be duplicates, perform a secondary manual review of the content, mainly classifying the content and annotating or confirming its tags.
8. The manual review server 124 returns the result and status of the manual review to the meta-information database 127.
In this way, besides the meta-information of each piece of media information, the meta-information database 127 can also store the classification information assigned to the content during manual review (such as first-, second-, and third-level categories and tag information).
9. After manual review is complete, the control center 121 enables the content distribution egress server 125.
10. The content distribution egress server 125 distributes the media information that has passed manual review and sends the indexes of the media information to be displayed to the content consumption terminals 131~13N.
Each content consumption terminal then displays the media information it receives. Display channels include recommendation engines, search engines, and pages displayed directly by operations staff.
In addition to steps 1 to 10 above, when the user of a content consumption terminal browses the reviewed media information and wants to watch a particular item, the transmission process further includes the following:
11. The content consumption terminal interacts with the uplink/downlink interface server 122 to obtain the index of the media information to be played;
12. The content consumption terminal interacts with the content storage database 126 according to the index and downloads the media file to be played, so that it can be played and watched locally.
At this point, the content storage database 126 serves as a data source for external services. In addition, according to the method embodiments of the present invention, the deduplication server 123 obtains source files from the content storage database 126 and processes them; in this case the data source serves internal requests. In a specific implementation, access to the internal and external data sources is deployed separately to avoid mutual interference.
Each server included in the media information processing system 120 may be a single server, a server cluster composed of several servers, or a cloud computing service center. Each terminal and the media information processing system 120 may be connected through a wireless or wired network.
Fig. 2 is an exemplary flowchart of a media information processing method according to an embodiment of the invention. The method is applied to a server, such as the deduplication server 123 shown in Fig. 1. As shown in Fig. 2, the method may include the following steps:
Step 201: obtain the source file of the media information to be reviewed.
In this step, the server may obtain media information uploaded by a user from the content production terminals 111~11M shown in Fig. 1 as the media information to be reviewed. Alternatively, the server may obtain stored media information from the content storage database 126 shown in Fig. 1 as the media information to be reviewed.
The acquired source file is a media information file containing both pictures and audio, for example a video file. In practice, a content producer may reprocess an original video file (for example, one previously uploaded to a video client) in various ways to obtain the source file to be reviewed. Such reprocessing includes but is not limited to: changing the original video file to a different bit rate, definition, or image size; adding partial black screens; adding or removing filters; adding or removing logos; and inserting advertisements or opening and closing credits.
Step 202: extract the first digital fingerprint of the media information to be reviewed from the source file.
In this step, the first digital fingerprint includes a first image fingerprint and a first audio fingerprint. For example, when the media information to be reviewed is a video file, the image fingerprints of its key frames and its audio fingerprint can be extracted as the first digital fingerprint.
Step 203: according to the first digital fingerprint and the second digital fingerprint of at least one piece of target media information, determine whether the media information to be reviewed is similar to a piece of target media information.
In this step, target media information refers to historical media information whose review result has already been determined. The second digital fingerprint includes at least one of a second image fingerprint and a second audio fingerprint. By comparing the first digital fingerprint with the second digital fingerprint of each piece of target media information, the server can determine whether the media information to be reviewed is similar to a piece of target media information, that is, perform deduplication.
Step 204: when the media information to be reviewed is determined to be similar to a piece of target media information, determine the review result of the media information to be reviewed according to the preset review result of that target media information.
Because the review result of the target media information has been determined in advance, when the media information to be reviewed is determined to be similar to a piece of target media information, it does not need to be reviewed again; its review result is determined from the review result of that target media information.
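Steps 203 and 204 can be sketched as a lookup over a library of already reviewed targets. This is a minimal illustration rather than the patented implementation: the `Target` record, the single 64-bit integer fingerprint, and the `max_distance` threshold are all assumptions introduced here, and the source's digital fingerprint actually combines image and audio fingerprints.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Target:
    """Hypothetical record for one piece of target media information."""
    media_id: str
    fingerprint: int    # second digital fingerprint, simplified to one binary code
    review_result: str  # review result determined in advance


def review_by_dedup(first_fp: int, targets: List[Target],
                    max_distance: int = 5) -> Optional[str]:
    """Return the review result of the first similar target, or None.

    Similarity is taken here as a Hamming distance of at most max_distance
    between the two binary codes (an assumed criterion).
    """
    for target in targets:
        distance = bin(first_fp ^ target.fingerprint).count("1")
        if distance <= max_distance:
            # Step 204: reuse the target's review result instead of reviewing again.
            return target.review_result
    # No similar target: the item still needs its own review.
    return None
```

A return value of `None` corresponds to the case in which the media information is new and would go on to manual review.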
In this embodiment, the server extracts the first digital fingerprint of the media information to be reviewed from the source file; determines, according to the first digital fingerprint and the second digital fingerprint of at least one piece of target media information, whether the media information to be reviewed is similar to a piece of target media information; and, when it is, determines the review result of the media information to be reviewed according to the preset review result of that target media information. This yields the following technical effects:
1) The extracted digital fingerprint includes an image fingerprint and an audio fingerprint, so similarity to a piece of target media information is judged from features in two dimensions. Constructing effective features of a video file in this multi-modal way can greatly improve the accuracy of machine deduplication.
2) The amount of duplicate media information entering manual review can be greatly reduced, which effectively reduces the manpower and cost of manual review. For example, simple edits that a content producer makes to original video content can be identified, such as modifying the video title, adding a watermark, cropping, adding advertising intros and outros, or modifying the audio.
3) Similarity comparison improves the efficiency of reviewing large volumes of media information, so popular video files are processed faster and released quickly. Time-sensitive, popular media content can therefore be distributed quickly, which meets the needs of business development.
4) Duplicate content can be effectively identified through digital fingerprints, which curbs cheating that obtains platform subsidies through duplicate or similar media content, and can uncover some piracy and abuse, protecting the rights of content authors.
5) The server's ability to batch-process media information is improved, as is the resource utilization of server equipment.
Fig. 3 is an exemplary flowchart of a media information processing method according to another embodiment of the invention. The method is applied to a server; the media information to be reviewed and the target media information are both video files, and the first and second digital fingerprints both include an image fingerprint and an audio fingerprint. As shown in Fig. 3, the method may include the following steps:
Step 301: obtain the source file of the media information to be reviewed.
In a specific implementation, a dedicated download file system may be provided in the server to download the original content from the content storage database 126 shown in Fig. 1. To control the speed and progress of downloading, the download file system may be a group of parallel servers made up of task scheduling and distribution clusters.
Step 302: extract at least one key frame image from the source file.
In this step, the source file is decoded and at least one key frame image is extracted from it. A key frame, equivalent to a key drawing in 2D animation, is the frame that contains a key action in the movement or change of a character or object. The frames between key frames are called transition frames or intermediate frames.
In a specific application, video durations differ. A uniform frame sampling strategy can leave the sampling frequency insufficient, while extracting every frame increases the burden and computation of frame extraction, so that computing cost rises sharply and the fingerprint comparison space expands. In view of this, embodiments of the invention use a variable-rate frame extraction strategy, which includes the following steps:
Step 3021: determine the switch frames in the video content.
A switch frame is a frame at which two adjacent frames in the video differ greatly, for example in scenes with obvious brightness changes or rich plot changes. Switch frames indicate where the video itself is rich in meaning and carry more information characterizing the video content.
Step 3022: extract the switch frames in the video content at a high frequency.
The extraction frequency is set to a high frequency and a low frequency. Switch frames are extracted at the configured high frequency.
Step 3023: for the context of the switch frames, extract frames at equal intervals at the low frequency to obtain low-frequency frames.
The key frame images thus comprise the switch frames extracted at high frequency and a number of low-frequency frames. The low-frequency frames enrich the feature data and increase the video content available for deduplication. In practice, the number of low-frequency frames can be set dynamically according to the required deduplication precision.
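The variable-rate strategy of steps 3021 to 3023 can be sketched as follows. The sketch makes several assumptions not stated in the source: each frame is summarized by a single mean-brightness value, a switch frame is detected when that value jumps by more than `switch_threshold` from the previous frame, every detected switch frame is kept, and low-frequency frames are taken every `low_interval` frames across the whole video.

```python
from typing import List


def select_key_frames(brightness: List[float],
                      switch_threshold: float = 30.0,
                      low_interval: int = 10) -> List[int]:
    """Return sorted indices of frames to fingerprint.

    brightness: hypothetical per-frame mean brightness values.
    """
    # Switch frames: large change relative to the previous frame.
    switch_frames = {
        i for i in range(1, len(brightness))
        if abs(brightness[i] - brightness[i - 1]) > switch_threshold
    }
    # Low-frequency frames: equally spaced samples that add context.
    low_frequency_frames = set(range(0, len(brightness), low_interval))
    return sorted(switch_frames | low_frequency_frames)
```

Lowering `low_interval` adds more low-frequency frames, which mirrors the trade-off described above between deduplication precision and computation.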
Step 303: for each key frame image, perform a discrete transform on the key frame image and determine the first image fingerprint from the transformed result.
In a specific implementation, the discrete cosine transform (DCT, Discrete Cosine Transform) may be used to obtain the low-frequency components of the image. Because typical images contain a great deal of redundancy and correlation, the DCT, which is used as an image compression algorithm, transforms an image from the pixel domain to the frequency domain. After the transform, most coefficients are 0 or close to 0, and only a small fraction of frequency components have non-zero coefficients. The DCT is similar to the discrete Fourier transform (DFT, Discrete Fourier Transform) but uses only real numbers.
In embodiments of the invention, determining the first image fingerprint from the transformed result may specifically be done by binarization: construct an initial fingerprint sequence from the transformed coefficient set; calculate the average of the initial fingerprint sequence; and convert the initial fingerprint sequence into a binary fingerprint sequence according to the average, using the binary fingerprint sequence as the first image fingerprint.
To reduce computation, when constructing the initial fingerprint sequence from the transformed coefficient set, the coefficients that represent the lowest-frequency image information may be extracted from the coefficient set and combined into the initial fingerprint sequence.
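The DCT-and-binarize procedure of step 303 can be sketched in pure Python as below. The details are assumptions chosen for illustration: the input is a square grayscale image given as a list of rows, only the `k × k` lowest-frequency coefficients of an unnormalized 2D DCT-II are computed, and a bit is 1 when a coefficient exceeds the mean of the sequence. The source does not fix the image size, the number of coefficients, or the comparison rule.

```python
import math
from typing import List


def dct_low_freq(pixels: List[List[float]], k: int = 8) -> List[float]:
    """Return the k*k lowest-frequency coefficients of an unnormalized 2D DCT-II."""
    n = len(pixels)
    coeffs = []
    for u in range(k):
        for v in range(k):
            total = 0.0
            for x in range(n):
                cos_x = math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for y in range(n):
                    cos_y = math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    total += pixels[x][y] * cos_x * cos_y
            coeffs.append(total)
    return coeffs


def image_fingerprint(pixels: List[List[float]], k: int = 8) -> int:
    """Binarize the low-frequency DCT coefficients against their mean."""
    sequence = dct_low_freq(pixels, k)      # initial fingerprint sequence
    mean = sum(sequence) / len(sequence)    # average of the sequence
    bits = 0
    for coefficient in sequence:
        bits = (bits << 1) | (1 if coefficient > mean else 0)
    return bits                             # binary fingerprint, k*k bits
```

Because every coefficient and the mean scale together, uniformly brightening an image (multiplying all pixels by the same positive factor) leaves this fingerprint unchanged, which is one reason a mean-thresholded code tolerates simple re-encoding edits.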
Step 304: according to the first image fingerprint and the second image fingerprint of at least one piece of target media information, determine the image fingerprint similarity between the first image fingerprint and the second image fingerprint.
Here, when image fingerprints are represented as binary codes, the image fingerprint similarity can be quantified as the Hamming distance, that is, the number of positions at which the two sequences differ (a 0 in one and a 1 in the other). Alternatively, that number can further be compared with a predetermined threshold to give a qualitative result of "similar" or "dissimilar".
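Both forms of the comparison in step 304 are short to express when a fingerprint is held as a Python integer. The function names and the default threshold of 5 are illustrative assumptions; the source does not specify a threshold value.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two binary fingerprints differ."""
    return bin(a ^ b).count("1")


def is_similar(a: int, b: int, threshold: int = 5) -> bool:
    """Qualitative result: similar when the distance is within the threshold."""
    return hamming_distance(a, b) <= threshold
```

For instance, `hamming_distance(0b1011, 0b1001)` is 1, because the codes differ only in the second-lowest bit.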
Step 305: extract audio data from the source file.
Step 306: divide the audio data into at least one audio segment; for each audio segment, apply binarization filtering to obtain the first audio fingerprint.
Two adjacent audio segments share overlapping audio data of a preset length. In a specific application, the preset length can be set according to the total duration of the audio data. For example, when each audio segment is 5 seconds long, the preset length is 1 second or 2 seconds. The longer the overlap, the greater the amount of computation, but the better the deduplication effect.
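The overlapping segmentation of step 306 can be sketched as follows. This is a minimal illustration, not the patented implementation; the function name `split_with_overlap` and its parameters are hypothetical, and lengths are expressed in samples rather than seconds.

```python
def split_with_overlap(samples, segment_len, overlap_len):
    """Split a sample sequence into segments whose neighbours share overlap_len samples.

    All names here are illustrative assumptions. The last segment may be
    shorter than segment_len when the data does not divide evenly.
    """
    if not 0 <= overlap_len < segment_len:
        raise ValueError("overlap_len must satisfy 0 <= overlap_len < segment_len")
    step = segment_len - overlap_len  # distance between consecutive segment starts
    segments = []
    for start in range(0, len(samples), step):
        segments.append(samples[start:start + segment_len])
        if start + segment_len >= len(samples):
            break  # this segment already reaches the end of the data
    return segments


# Example: 5-sample segments with a 2-sample overlap.
segments = split_with_overlap(list(range(12)), segment_len=5, overlap_len=2)
```

A larger `overlap_len` shrinks the step between segments, so more segments (and more fingerprints) are produced for the same audio, which matches the trade-off described above.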
Step 307: determine the audio fingerprint similarity between the first audio fingerprint and the second audio fingerprint of at least one piece of target media information.
As with the image fingerprint similarity above, when audio fingerprints are represented as binary codes, the audio fingerprint similarity can be quantified as the number of differing bits; alternatively, this number can further be compared with a predetermined threshold to give a qualitative result of similar or dissimilar.
Step 308: determine, from the image fingerprint similarity and the audio fingerprint similarity, whether the pending media information is similar to a piece of target media information.
In the embodiments of the present invention, when the image fingerprint similarity and the audio fingerprint similarity are both expressed as quantified Hamming distances, a smaller Hamming distance indicates greater similarity, so the judgment can be based on the smaller of the two Hamming distances.
When the image fingerprint similarity and the audio fingerprint similarity are each a review result of "similar" or "dissimilar", the pending media information is determined to be similar to a piece of target media information if either review result is "similar".
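The two ways of combining the results in step 308 can be sketched as below. The function names and the `max_distance` parameter are assumptions introduced for illustration; the text does not fix a concrete threshold value here.

```python
def similar_by_min_distance(image_distance, audio_distance, max_distance):
    """Quantitative case: judge on the smaller of the two Hamming distances.

    max_distance is a hypothetical threshold; a smaller distance means
    the two fingerprints are more alike.
    """
    return min(image_distance, audio_distance) <= max_distance


def similar_by_verdicts(image_similar, audio_similar):
    """Qualitative case: one "similar" verdict is enough."""
    return image_similar or audio_similar
```

Both rules favour recall: a match in either modality marks the pair as similar.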
Considering the actual deduplication performance of the deduplication algorithm model, it is also possible to use only one of the image fingerprint and the audio fingerprint to decide similarity, which allows a balance between the precision and the recall of the model. Because every video undergoes secondary manual review anyway, recall can be increased in this way at the cost of lower precision.
Step 309: when the pending media information is determined to be similar to a piece of target media information, determine the review result of the pending media information according to the preset review result of that target media information.
In the above embodiment, steps 302 to 304 implement the extraction of the image fingerprint and deduplication based on it, and steps 305 to 307 implement the extraction of the audio fingerprint and deduplication based on it. Features of the video content are thus obtained from two dimensions, so the similarity judgment can be based on a bimodal processing method, which improves deduplication accuracy compared with prior-art methods that use a single-dimensional feature.
For steps 302 to 304 above, Fig. 4 is an exemplary flowchart of an image fingerprint processing method according to an embodiment of the present invention, in which the image fingerprint is obtained by a DCT-based hash algorithm. As shown in Fig. 4, the method includes the following steps:
Step 401: obtain the source video file of the pending media information.
Step 402: extract at least one key frame image from the source video file.
Step 403: preprocess the key frame images.
Because video files differ in specification and bit rate, preprocessing here means correcting the image, for example adjusting its saturation, resolution, and so on.
Next, DCT hashing is performed on each preprocessed key frame image. Specifically:
Step 404: reduce the size of the image.
To simplify the DCT computation, the image is reduced to a certain size. This is done by merging the values of multiple pixels, for example by taking their average or their maximum. For example, if the original image is 64*64 and every 8 pixels along each dimension are merged into 1, the reduced image is 8*8.
In a specific application, the reduced size is also subject to certain limits, for example larger than 8*8, or at least 32*32. Note that the reduction here does not reduce the frequencies.
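The pixel merging of step 404 can be illustrated with block averaging. This is a sketch under stated assumptions: the image is a 2-D list of gray values, `shrink_by_block_mean` is a hypothetical helper name, and merging is done over square blocks so that a 64*64 image with `block=8` becomes 8*8.

```python
def shrink_by_block_mean(pixels, block):
    """Reduce a 2-D list of pixel values by averaging each block x block tile.

    Illustrative only: the text also allows taking the maximum instead
    of the average.
    """
    height, width = len(pixels), len(pixels[0])
    if height % block or width % block:
        raise ValueError("image dimensions must be multiples of the block size")
    reduced = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            total = sum(
                pixels[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            )
            row.append(total / (block * block))  # mean of the merged pixels
        reduced.append(row)
    return reduced
```

Replacing the mean with `max(...)` over the same tile gives the maximum-based variant mentioned above.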
Step 405: simplify the colors of the image.
For example, convert the image to grayscale to further reduce the amount of computation.
Step 406: apply the DCT to the image to obtain a DCT coefficient matrix.
For example, a 32*32 DCT coefficient matrix is obtained.
Step 407: reduce the DCT coefficient matrix to obtain the initial fingerprint sequence.
To extract the lowest frequencies in the image, the submatrix in the upper-left corner of the DCT coefficient matrix can be retained as the initial fingerprint sequence. For example, for a 32*32 DCT coefficient matrix, the upper-left 8*8 matrix is retained as the initial fingerprint sequence.
Step 408: binarize the initial fingerprint sequence.
Calculate the mean of all elements in the 8*8 DCT matrix and set a 64-bit hash of 0s and 1s: elements greater than or equal to the DCT mean are set to "1", and elements less than the DCT mean are set to "0". Combined, they form a 64-bit binary fingerprint sequence, for example 11001110010...
Step 409: output the first image fingerprint.
In this way, a 64-bit binary fingerprint sequence is obtained for each key frame image in the source video file. These binary fingerprint sequences constitute the first image fingerprint and are used for subsequent deduplication.
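Steps 406 to 408 can be sketched in pure Python as follows. This is a minimal illustration rather than the patented implementation: it assumes the key frame has already been preprocessed, reduced and converted to a 32*32 grayscale matrix, it uses an unnormalised DCT-II, and the names `dct_1d`, `dct_2d` and `dct_hash` are hypothetical.

```python
import math


def dct_1d(values):
    """Unnormalised DCT-II of a 1-D sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i, v in enumerate(values))
        for k in range(n)
    ]


def dct_2d(matrix):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in matrix]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major order


def dct_hash(gray, keep=8):
    """Return a keep*keep-bit binary fingerprint string for a grayscale matrix."""
    coeffs = dct_2d(gray)                                          # step 406
    low = [coeffs[r][c] for r in range(keep) for c in range(keep)]  # step 407: upper-left block
    mean = sum(low) / len(low)
    return "".join("1" if value >= mean else "0" for value in low)  # step 408


# Example: a synthetic 32*32 gradient stands in for a real key frame.
frame = [[(3 * x + 5 * y) % 256 for x in range(32)] for y in range(32)]
fingerprint = dct_hash(frame)  # 64-character string of 0s and 1s
```

Because the mean is taken over all 64 retained coefficients, the large DC term in the top-left corner pulls the mean upward; the sketch follows the text as written and does not exclude it.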
Step 410: input the second image fingerprint of the target video file.
Here, the target media information is specifically a target video file, from which the second image fingerprint can be extracted. To deduplicate the source video file, it is compared for similarity with the target video files already in the database. For each target video file, the second image fingerprint, i.e. at least one 64-bit binary fingerprint sequence such as 010111010111..., is determined through the above steps. These second image fingerprints can be stored in the database as an image fingerprint linked list.
Step 411: judge whether the image fingerprint similarity between the first image fingerprint and the second image fingerprint meets a preset threshold.
Here, the image fingerprint similarity can be expressed by calculating the Hamming distance between binary fingerprint sequences. The Hamming distance is the number of positions at which two sequences of equal length have different values. For example, d(x, y) denotes the Hamming distance between two sequences x and y. It is obtained by XORing the two binary sequences and counting the 1s in the result. The Hamming distance here is therefore the number of differing bits between two 64-bit binary fingerprint sequences.
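The XOR-and-count computation of d(x, y) can be sketched as follows, assuming the fingerprints are held as strings of "0" and "1" characters; the function name is illustrative.

```python
def hamming_distance(x, y):
    """Hamming distance d(x, y) between two equal-length binary strings."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    # XOR leaves a 1 exactly where the two sequences differ; count those 1s.
    return bin(int(x, 2) ^ int(y, 2)).count("1")


# "1100" and "1010" differ in the two middle positions.
distance = hamming_distance("1100", "1010")  # 2
```

Storing each 64-bit fingerprint as an integer instead of a string would let the XOR be applied directly, without the conversion step.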
In addition, a preset threshold is set to judge qualitatively whether the two are similar. When the preset threshold is met, step 412 is executed; otherwise, step 413 is executed. For example, if the preset threshold is [10, 30] and the Hamming distance is 16, the Hamming distance lies within the preset threshold range.
When the first image fingerprint and the second image fingerprint each include multiple 64-bit binary fingerprint sequences, the Hamming distance is calculated for each pair of binary fingerprint sequences, one from each fingerprint. When at least one Hamming distance meets the preset threshold, step 412 is executed; when none of the Hamming distances meets the preset threshold, step 413 is executed.
Step 412: determine that the source video file and the target video file are similar.
Step 413: determine that the source video file and the target video file are dissimilar.
For steps 305 to 307 above, Fig. 5 is an exemplary flowchart of an audio fingerprint processing method according to an embodiment of the present invention, in which the audio fingerprint is obtained by a filter-based extraction algorithm. As shown in Fig. 5, the method includes the following steps:
Step 501: obtain the source file of the pending media information.
Step 502: extract audio data from the source file.
Step 503: divide the audio data into overlapping audio segments.
See the description of step 306; details are not repeated here.
Step 504: convert each audio segment into a spectrogram.
For example, the short-time Fourier transform (STFT) is used to convert an audio segment into a spectrogram. As a windowed Fourier transform, the STFT uses a time window so that the signal is effective only within a small interval, which gives the Fourier transform the ability to localize in time.
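A windowed Fourier transform of this kind can be sketched with the standard library alone. The sketch below uses a Hann window and a naive DFT per frame; the window choice, the function name `stft_magnitudes` and the parameters `frame_len` and `hop` are assumptions, and a practical system would use an FFT routine instead.

```python
import cmath
import math


def stft_magnitudes(samples, frame_len, hop):
    """Return a magnitude spectrogram: one list of frequency bins per frame."""
    if frame_len < 2:
        raise ValueError("frame_len must be at least 2")
    # Hann window: tapers each frame so only a short interval contributes.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1)) for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):  # non-negative frequencies only
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Example: a sine with 4 cycles per 32-sample frame.
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
spectrogram = stft_magnitudes(tone, frame_len=32, hop=16)
```

Each row of the result describes one short time interval, which is what allows later steps to work on a time-frequency image rather than on the raw waveform.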
Step 505: convert the spectrogram into a note map.
Step 506: select filters based on training data.
When determining the filters, several filters can be selected on the basis of training data using an asymmetric pairwise boosting algorithm.
Step 507: apply binarization filtering to the note map using the filters.
In the embodiments of the present invention, binarization filtering methods include chromaprint, echoprint, the landmark algorithm, and so on. chromaprint provides a public client library that computes a 64-bit audio string by a specific algorithm. echoprint samples the audio data and generates a voiceprint sequence based on a music fingerprint algorithm. The landmark algorithm obtains a spectrogram via the fast Fourier transform (FFT), extracts energy peaks from the spectrogram as landmarks, and constructs the fingerprint sequence from the selected landmarks.
For each audio segment, binarization filtering likewise yields a 64-bit binary fingerprint sequence, for example 1001011011...
Step 508: output the first audio fingerprint.
In this way, a 64-bit binary fingerprint sequence is obtained for each audio segment. These binary fingerprint sequences constitute the first audio fingerprint and are used for subsequent deduplication.
Step 509: input the second audio fingerprint of the target media file.
Here, the target media file can be a video file or an audio file, and the second digital fingerprint is specifically the second audio fingerprint. To deduplicate the source file, the second audio fingerprint is determined for each target media file through the above steps and stored in the database as an audio fingerprint linked list.
Step 510: judge whether the audio fingerprint similarity between the first audio fingerprint and the second audio fingerprint meets a preset threshold.
As with the image fingerprint similarity, the audio fingerprint similarity here can also be expressed by the Hamming distance. When the preset threshold is met, step 511 is executed; otherwise, step 512 is executed.
Step 511: determine that the source file and the target media file are similar.
Step 512: determine that the source file and the target media file are dissimilar.
Fig. 6 is an exemplary flowchart of a media information processing method according to another embodiment of the present invention. The method is applied to a server and judges, by constructing fingerprint linked lists, whether the pending media information is similar to a piece of target media information. As shown in Fig. 6, the method includes the following steps:
Step 601: obtain the source file of the pending media information.
Step 602: perform preliminary deduplication of the pending media information according to text features.
Before deduplication based on digital fingerprints, the pending media information can first be preliminarily deduplicated according to text features. Here, the text features include the text and title of the source file, and so on. Pending media information that passes preliminary deduplication is fed into the subsequent fingerprint deduplication process.
Step 603: obtain the first image fingerprint and the first audio fingerprint of the pending media information from the source file.
For how they are obtained, see the embodiments of Fig. 4 and Fig. 5; details are not repeated here.
Step 604: construct an image fingerprint linked list from the second image fingerprint of at least one piece of target media information.
Step 605: construct an audio fingerprint linked list from the second audio fingerprint of at least one piece of target media information.
In the embodiments of the present invention, the constructed fingerprint linked lists store the second digital fingerprints of all target media information, including newly added media information and media information that has already been reviewed. In practice, a fingerprint linked list can be saved in a database and loaded into memory when the similarity judgment is performed. The image fingerprint linked list and the audio fingerprint linked list are constructed in the same way.
According to the embodiments of the present invention, because the digital fingerprints of a large amount of media information need to be stored in order, the fingerprint linked list is constructed using a block index. Taking the image fingerprint linked list as an example, the construction method includes the following steps:
Step 6041: split the second image fingerprint into at least one second image fingerprint subsequence.
To implement the block index in the image fingerprint linked list, the second image fingerprint is split. First, the length L of the second image fingerprint subsequence is set, chosen on the basis of empirical data obtained from tests. The subsequence length affects the number of bits to be compared: the fewer bits compared, the more accurate the result, but the larger the comparison space.
Taking a 64-bit binary second image fingerprint as an example, if the subsequence length is L = 16, the maximum number of indexes in the image fingerprint linked list is 2^16, and the 64-bit second image fingerprint is split into 4 second image fingerprint subsequences.
Step 6042: for each second image fingerprint subsequence, recombine the second image fingerprint according to that subsequence to obtain a third image fingerprint.
This step determines the element (i.e. the third image fingerprint) corresponding to each index (i.e. each second image fingerprint subsequence) in the image fingerprint linked list. Because the second image fingerprint subsequence serves as the index, the corresponding element must carry the information of the other sequences in the second image fingerprint besides that subsequence. "Recombine" here therefore means placing the second image fingerprint subsequence at a fixed position and combining the remaining sequences with it according to a preset rule. For example, the fixed position can be the very front or the very end of the sequence; the remaining sequences are appended in their original order, or combined in a fixed, preset order.
Table 1 gives an example of determining the third image fingerprint. In Table 1, the second image fingerprint consists of 4 second image fingerprint subsequences, denoted [Q1, Q2, Q3, Q4]. When recombining, the second image fingerprint subsequence is placed first (underlined in the third image fingerprint in Table 1) and the remaining sequences follow in order, giving the third image fingerprint corresponding to each second image fingerprint subsequence. For example, for the second image fingerprint subsequence Q3, the recombined third image fingerprint is [Q3, Q1, Q2, Q4].
Table 1: Example of determining the third image fingerprint
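The splitting of step 6041 and the recombination of step 6042 can be sketched as follows, using the "subsequence first, remainder in original order" rule from the example. The function names are hypothetical.

```python
def split_fingerprint(fingerprint, block_len=16):
    """Split a bit string into consecutive blocks of block_len bits."""
    if len(fingerprint) % block_len:
        raise ValueError("fingerprint length must be a multiple of block_len")
    return [fingerprint[i:i + block_len] for i in range(0, len(fingerprint), block_len)]


def recombine(blocks, index):
    """Move blocks[index] to the front and keep the others in their original order."""
    return [blocks[index]] + blocks[:index] + blocks[index + 1:]


# For the subsequence Q3 (index 2), the result is [Q3, Q1, Q2, Q4].
third = recombine(["Q1", "Q2", "Q3", "Q4"], 2)
```

Applying `recombine` once for every index yields one third image fingerprint per subsequence, so a 64-bit fingerprint with 16-bit blocks produces four entries.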
Step 6043: construct the image fingerprint linked list with the second image fingerprint subsequence as the key and the third image fingerprint as the value.
In this step, the fingerprint linked list is stored in the database in key-value form, where the "key" is the index and the "value" is the element. Specifically, with the second image fingerprint subsequence as the key, the third image fingerprints are written one by one into the bucket of the corresponding value.
Table 2 is an example of the constructed image fingerprint linked list. As shown in Table 2, the second image fingerprint subsequence has length 16, so there are 2^16 = 65536 keys in total. For example, for key number 2, the key is "0000000000000010" and the corresponding value is S1…SL2, i.e. it contains L2 elements; for key number 65534, the key is "1111111111111101" and the corresponding value is empty, i.e. there are no corresponding elements.
Key number   Key                 Element (value)
1            0000000000000001    S1…SL1
2            0000000000000010    S1…SL2
3            0000000000000011    S1…SL3
…            …                   …
65534        1111111111111101    (empty)
65535        1111111111111110    S1…SL65535
65536        1111111111111111    S1…SL65536
Table 2: Example of a fingerprint linked list
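The key-value structure of Table 2 can be sketched with an in-memory dictionary. This is an illustration under assumptions: the patent stores the list in a database, whereas the sketch uses a `defaultdict`; the `media_id` stored next to each element is a hypothetical addition that makes it possible to trace an element back to its target media information.

```python
from collections import defaultdict


def build_index(fingerprints, block_len=16):
    """Build key -> [(third_fingerprint, media_id), ...] from target fingerprints.

    fingerprints maps a hypothetical media_id to its bit-string fingerprint.
    """
    index = defaultdict(list)
    for media_id, fp in fingerprints.items():
        blocks = [fp[i:i + block_len] for i in range(0, len(fp), block_len)]
        for pos, key in enumerate(blocks):
            rest = blocks[:pos] + blocks[pos + 1:]
            # Third fingerprint: the key block first, remaining blocks in order.
            index[key].append((key + "".join(rest), media_id))
    return index


targets = {"video_a": "0" * 16 + "1" * 16 + "01" * 8 + "10" * 8}
index = build_index(targets)  # four keys, one bucket entry each
```

Keys that never occur simply have no bucket, which corresponds to the empty value shown for key number 65534.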
In a specific implementation, a dedicated retrieval feature server can be set up to perform the above construction; for incremental video content, the fingerprint linked list is updated using the block index.
Step 606: determine, using the image fingerprint linked list, the image fingerprint similarity between the first image fingerprint and the second image fingerprint.
Step 607: determine, using the audio fingerprint linked list, the audio fingerprint similarity between the first audio fingerprint and the second audio fingerprint.
Steps 606 and 607 are carried out in the same way; step 606 is used as the example below. Specifically, it includes the following steps:
Step 6061: split the first image fingerprint into at least one first image fingerprint subsequence.
Following the method of constructing the image fingerprint linked list in step 604, the first image fingerprint is split into subsequences in the same way as the second image fingerprint is split in step 6041.
Fig. 7 is a schematic diagram of searching for elements in the image fingerprint linked list according to an embodiment of the present invention. As shown in Fig. 7, a 64-bit first image fingerprint 701 is first split into four 16-bit first image fingerprint subsequences 711, 712, 713 and 714.
Step 6062: for each first image fingerprint subsequence, look up the element set corresponding to that subsequence in the image fingerprint linked list.
Specifically, with the first image fingerprint subsequence as the key, the value corresponding to that key is determined in the image fingerprint linked list. From step 6043, the value contains third image fingerprints, so the third image fingerprints contained in the value are taken as the element set. The element set may be empty or non-empty.
As shown in Fig. 7, the first image fingerprint subsequences 711, 712, 713 and 714 are matched in turn, a process also called "amplification": each first image fingerprint subsequence is mapped to one of the 16-bit keys shown at 721 in the image fingerprint linked list, and this match is exact. Once the key is found, its value is also determined, i.e. each element in the corresponding element set is found.
Step 6063: when the element set is not empty, calculate in turn the image fingerprint distance between the first image fingerprint and each element in the element set.
Specifically, when the element set is not empty, it contains at least one third image fingerprint. The first image fingerprint is then recombined according to the first image fingerprint subsequence to obtain a fourth image fingerprint, and the image fingerprint distance between the fourth image fingerprint and each third image fingerprint is calculated in turn.
Here, the fourth image fingerprint is obtained by recombination in the same way as the third image fingerprint in step 6042. It follows that the first block of the fourth image fingerprint is identical to that of the third image fingerprint. When the fourth and third image fingerprints are represented as binary codes, the image fingerprint distance is specifically the Hamming distance, i.e. the number of differing bits between the two binary codes. Taking Table 1 as an example, the 1st block (the first 16 bits) of the third and fourth image fingerprints is identical, so the Hamming distance is the number of differing bits in the 2nd, 3rd and 4th blocks (the following 48 bits in total).
Step 6064: when the image fingerprint distance is determined to be less than a first preset threshold, take the image fingerprint distance as the image fingerprint similarity.
Here, "similar" is in the sense of locality-sensitive hashing, i.e. similar binary codes differ in only a few bits. The image fingerprint distance is the number of such differing bits. The degree of similarity is controlled by setting the first preset threshold.
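Steps 6061 to 6064 can be sketched as a lookup against a dictionary-based index. The sketch assumes the index maps each 16-bit key to a list of `(third_fingerprint, media_id)` pairs; that layout, the `media_id` field and the name `query` are assumptions, and returning only the closest match is a simplification.

```python
def hamming(x, y):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(a != b for a, b in zip(x, y))


def query(index, fingerprint, max_distance, block_len=16):
    """Return (distance, media_id) of the closest element, or None.

    Only elements whose distance is less than max_distance (the first
    preset threshold) are accepted.
    """
    blocks = [fingerprint[i:i + block_len] for i in range(0, len(fingerprint), block_len)]
    best = None
    for pos, key in enumerate(blocks):                 # step 6061: each subsequence
        fourth = key + "".join(blocks[:pos] + blocks[pos + 1:])
        for third, media_id in index.get(key, []):     # step 6062: exact key match
            distance = hamming(fourth, third)          # step 6063
            if distance < max_distance and (best is None or distance < best[0]):
                best = (distance, media_id)            # step 6064
    return best


index = {"0" * 16: [("0" * 64, "target_1")]}
result = query(index, "0" * 48 + "111" + "0" * 13, max_distance=5)  # (3, "target_1")
```

Only buckets that share an exact 16-bit block with the query are scanned, so a fingerprint that differs from a target in a few bits is still found as long as at least one of its four blocks is unchanged.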
In a specific implementation, because a large number of videos may be published at the same time, the deduplication of massive video content can be parallelized: the video content corresponding to multiple row keys (rowkey) can be split into multiple subtasks, which call the above retrieval feature server concurrently. Because these are all stateless services, multiple servers can be deployed at the same time to call multiple feature retrieval services.
Step 608: determine, from the image fingerprint similarity and the audio fingerprint similarity, whether the pending media information is similar to a piece of target media information. If so, execute step 609; otherwise, execute step 610.
When the image fingerprint similarity is characterized by the image fingerprint distance and the audio fingerprint similarity by the audio fingerprint distance, the smaller of the two distances can be determined; the target media information referred to by the element corresponding to that smaller value is determined; and whether the pending media information is similar to that target media information is determined from the smaller value and a second preset threshold.
From step 6063, the image fingerprint distance corresponds to an element; from step 6043, the element is a third image fingerprint; from step 6042, the third image fingerprint is determined from the second image fingerprint. The image fingerprint distance can therefore be traced to the target media information corresponding to the second image fingerprint. The audio fingerprint distance is handled similarly. So whether the smaller value is the image fingerprint distance or the audio fingerprint distance, it can be traced back to the target media information it refers to.
Here, a smaller distance indicates a closer match, so the smaller value is compared with the second preset threshold to finally determine whether the pending media information is similar to the target media information.
Step 609: when the pending media information is determined to be similar to a piece of target media information, calculate the total similarity between the pending media information and that target media information, and determine, from the total similarity and a third preset threshold, whether to send the pending media information and that target media information to the manual review system for secondary review. If so, execute step 610; if not, execute step 611.
Here, the total similarity can be the smaller value determined in step 608, or only the image fingerprint distance, or only the audio fingerprint distance. Step 609 thus judges a second time the result that was determined to be similar: pending media information and target media information whose total similarity meets the third preset threshold are sent to the manual review system for secondary review. The third preset threshold can be set to a part of the second preset threshold range. For example, if the second preset threshold is [0, 50], the third preset threshold can be set to {[0, 10], [30, 50]}.
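The two-stage thresholding can be sketched as below, using the example ranges [0, 50] and {[0, 10], [30, 50]}. The routing follows the two manual-review cases that the text lists for step 611; the function names and the returned labels are hypothetical and introduced only for illustration.

```python
def in_ranges(value, ranges):
    """True if value lies inside any closed (low, high) range."""
    return any(low <= value <= high for low, high in ranges)


def route(distance, second=((0, 50),), third=((0, 10), (30, 50))):
    """Decide what happens to pending media given its total similarity distance.

    The labels are illustrative, not terms from the text.
    """
    if not in_ranges(distance, second):
        # Dissimilar to every target: manual secondary review of the pending item.
        return "manual_review"
    if in_ranges(distance, third):
        # Similar, but a human confirms whether the pair is truly a duplicate.
        return "manual_review_with_target"
    # Similar and outside the third range: reuse the target's review result.
    return "reuse_target_result"
```

With these example ranges, distances from 11 to 29 are resolved automatically, which is how the third threshold limits the number of items sent to human reviewers.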
Step 610: determine the review result of the pending media information according to the preset review result of that target media information.
Step 611: send the pending media information to the manual review system for secondary review.
Here, there are two cases:
1) When it is determined in step 608, according to the second preset threshold, that the pending media information is dissimilar to every piece of target media information, the pending media information is sent to the manual review system for secondary review.
That is, once the possibility that the pending media information is similar to some existing target media information has been ruled out, the pending media information can be handed to the manual review system, where its content undergoes secondary review by human reviewers.
In a specific implementation, the manual review system is a system with complex business logic developed on a web database. It needs to read the meta-information of the video content itself from the content storage database 126 shown in Fig. 1, and human reviewers perform a first round of preliminary filtering on whether the video content involves pornography, gambling or politically sensitive material. On that basis, the video content then undergoes secondary review, which mainly classifies the content and adds or confirms labels.
2) When it is determined in step 609 that the total similarity between the pending media information and the target media information meets the third preset threshold, the target media information is sent to the manual review system along with the pending media information, and human reviewers perform a secondary review of whether the two are truly duplicates.
The technical effects obtained by the above embodiments include:
1) Preliminary deduplication is first performed according to text features, and the similarity judgment is then based on image fingerprints and audio fingerprints. Multi-dimensional feature extraction over text, image and audio is thereby realized and incorporated into the construction of duplicate features, which greatly improves deduplication accuracy.
2) By constructing fingerprint linked lists and looking them up through a block index, parallel lookup can be achieved and deduplication accelerated. As the massive volume of video content keeps growing, global video deduplication can be performed efficiently, which effectively improves deduplication efficiency and saves processing time.
3) When, according to the digital fingerprints, the pending media information is determined to be dissimilar to every piece of target media information, it is sent to the manual review system for secondary review. Because the results of reviewing video content by machine learning are not yet mature enough, secondary manual review is needed on top of machine processing. In this way, human-machine collaboration improves the accuracy and efficiency of video labeling.
4) Using the similarity and the second preset threshold, initial deduplication of the pending media information can be achieved. When the second preset threshold is set loosely, more similar or duplicate media information can be identified; a further selection can then be made with the third preset threshold, and media information that meets the third preset threshold is sent to the manual review system for precise deduplication. In this way, the number of video files requiring manual processing is greatly reduced while the accuracy of the manual review system is fully used, balancing the efficiency and precision of deduplication.
Fig. 8 is a schematic structural diagram of a server according to an embodiment of the present invention. As shown in Fig. 8, the server 800 includes:
a file acquisition module 810, configured to obtain the source file of the pending media information;
a fingerprint extraction module 820, configured to extract the first digital fingerprint of the pending media information from the source file obtained by the file acquisition module 810, the first digital fingerprint including the first image fingerprint and the first audio fingerprint;
a deduplication module 830, configured to determine, from the first digital fingerprint obtained by the fingerprint extraction module 820 and the second digital fingerprint of at least one piece of target media information, whether the pending media information is similar to a piece of target media information, wherein the second digital fingerprint includes at least one of the second image fingerprint and the second audio fingerprint; and
a review module 840, configured to, when the deduplication module 830 determines that the pending media information is similar to a piece of target media information, determine the review result of the pending media information according to the preset review result of that target media information.
In one embodiment, server 800 further comprises:
List construction module 850 constructs image and refers to for the second finger image according at least one target medium information Card chain table;According to the second audio-frequency fingerprint of at least one target medium information, audio-frequency fingerprint chained list is constructed;
Re-scheduling module 830 is used for, when the second digital finger-print includes the second finger image and the second audio-frequency fingerprint, according to chain The finger image chained list that table building module 850 obtains, determines the finger image between the first finger image and the second finger image Similarity;The audio-frequency fingerprint chained list obtained according to list construction module 850, determine the first audio-frequency fingerprint and the second audio-frequency fingerprint it Between audio-frequency fingerprint similarity;According to finger image similarity and audio-frequency fingerprint similarity, whether pending media information is determined It is similar to a target medium information.
In one embodiment, re-scheduling module 830 is used for, and the first finger image is split at least one first finger image Subsequence;For each first finger image subsequence, the first finger image subsequence pair is searched in finger image chained list The element set answered;When element set is not sky, successively calculate in the first finger image and element set between each element Finger image distance;When determining finger image distance less than the first preset threshold, finger image distance is determined as figure As fingerprint similarity.
In one embodiment, server 800 further comprises:
A judgment module 860, configured to, when deduplication module 830 determines that the pending media information is similar to a piece of target media information, calculate a total similarity between the pending media information and that target media information, and determine, according to the total similarity and a third preset threshold, whether to send the pending media information and that target media information to a manual review system for a secondary audit.
In one embodiment, server 800 further comprises:
A sending module 870, configured to, when deduplication module 830 determines that the pending media information is not similar to any of the target media information, send the pending media information to the manual review system for a secondary audit.
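The routing performed by auditing module 840, judgment module 860 and sending module 870 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Match` record, the `route` function, the 0.0 to 1.0 similarity scale, the default threshold value, and the rule that a total similarity below the third preset threshold triggers a secondary audit are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Match:
    """Hypothetical record describing the target that the pending item resembles."""
    target_id: str
    total_similarity: float  # assumed scale: 0.0 (unrelated) to 1.0 (identical)
    audit_result: str        # audit result already stored for the target


def route(match: Optional[Match], third_threshold: float = 0.9) -> dict:
    """Decide how a pending item is handled after deduplication.

    match is None when the pending item is not similar to any target.
    """
    if match is None:
        # Sending module: nothing similar, so a human must audit the item.
        return {"audit_result": None, "manual_review": True}
    # Auditing module: inherit the stored audit result of the similar target.
    # Judgment module (assumed direction): a weak match is double-checked by a human.
    needs_manual = match.total_similarity < third_threshold
    return {"audit_result": match.audit_result, "manual_review": needs_manual}
```

Under these assumptions, a near-identical re-upload inherits the stored result without human involvement, while a borderline match is still forwarded together with the matched target for a secondary audit.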
Fig. 9 is a schematic structural diagram of a server according to another embodiment of the present invention. The server 900 may include a processor 910, a memory 920, a port 930 and a bus 940. Processor 910 and memory 920 are interconnected by bus 940. Processor 910 can send and receive data through port 930. Wherein,
Processor 910 is configured to execute the machine-readable instruction modules stored in memory 920.
Memory 920 stores machine-readable instruction modules executable by processor 910. The instruction modules executable by processor 910 include: a file acquisition module 921, a fingerprint extraction module 922, a deduplication module 923 and an auditing module 924. Wherein,
File acquisition module 921, when executed by processor 910, may: obtain the source file of the pending media information, wherein the first digital fingerprint extracted from that source file includes a first image fingerprint and a first audio fingerprint;
Fingerprint extraction module 922, when executed by processor 910, may: extract the first digital fingerprint of the pending media information from the source file obtained by file acquisition module 921;
Deduplication module 923, when executed by processor 910, may: determine, according to the first digital fingerprint obtained by fingerprint extraction module 922 and a second digital fingerprint of at least one piece of target media information, whether the pending media information is similar to a piece of target media information, wherein the second digital fingerprint includes at least one of a second image fingerprint and a second audio fingerprint;
Auditing module 924, when executed by processor 910, may: when deduplication module 923 determines that the pending media information is similar to a piece of target media information, determine the audit result of the pending media information according to the preset audit result of that target media information.
In one embodiment, the instruction modules executable by processor 910 further include a linked-list construction module 925, wherein
Linked-list construction module 925, when executed by processor 910, may: construct an image fingerprint linked list according to the second image fingerprint of the at least one piece of target media information, and construct an audio fingerprint linked list according to the second audio fingerprint of the at least one piece of target media information;
Deduplication module 923, when executed by processor 910, may further: when the second digital fingerprint includes both the second image fingerprint and the second audio fingerprint, determine, according to the image fingerprint linked list obtained by linked-list construction module 925, an image fingerprint similarity between the first image fingerprint and the second image fingerprint; determine, according to the audio fingerprint linked list obtained by linked-list construction module 925, an audio fingerprint similarity between the first audio fingerprint and the second audio fingerprint; and determine, according to the image fingerprint similarity and the audio fingerprint similarity, whether the pending media information is similar to a piece of target media information.
In one embodiment, the instruction modules executable by processor 910 further include a judgment module 926, wherein
Judgment module 926, when executed by processor 910, may: when deduplication module 923 determines that the pending media information is similar to a piece of target media information, calculate a total similarity between the pending media information and that target media information, and determine, according to the total similarity and a third preset threshold, whether to send the pending media information and that target media information to a manual review system for a secondary audit.
In one embodiment, the instruction modules executable by processor 910 further include a sending module 927, wherein
Sending module 927, when executed by processor 910, may: when deduplication module 923 determines that the pending media information is not similar to any of the target media information, send the pending media information to the manual review system for a secondary audit.
It can thus be seen that the instruction modules stored in memory 920, when executed by processor 910, can implement the functions of the file acquisition module, fingerprint extraction module, deduplication module, auditing module, linked-list construction module, judgment module and sending module of the foregoing embodiments.
In the above apparatus and system embodiments, the specific way in which each module and unit implements its function is described in the method embodiments and is not repeated here.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing unit, each module may exist physically on its own, or two or more modules may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
In addition, each embodiment of the present invention may be implemented by a data processing program executed by a data processing device such as a computer. Obviously, such a data processing program constitutes the present invention. Furthermore, a data processing program is usually stored in a storage medium and is executed either by reading the program directly from the storage medium or by installing or copying the program to a storage device (such as a hard disk and/or memory) of the data processing device. Therefore, such a storage medium also constitutes the present invention. The storage medium may use any type of recording method, for example a paper storage medium (such as paper tape), a magnetic storage medium (such as a floppy disk, hard disk or flash memory), an optical storage medium (such as a CD-ROM) or a magneto-optical storage medium (such as an MO).
Therefore, the present invention also discloses a computer-readable storage medium storing computer-readable instructions that can cause at least one processor to execute any of the above method embodiments.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (15)

1. A method for processing media information, characterized by comprising:
Obtaining a source file of pending media information;
Extracting a first digital fingerprint of the pending media information from the source file, the first digital fingerprint including a first image fingerprint and a first audio fingerprint;
Determining, according to the first digital fingerprint and a second digital fingerprint of at least one piece of target media information, whether the pending media information is similar to a piece of target media information, wherein the second digital fingerprint includes at least one of a second image fingerprint and a second audio fingerprint; and
When it is determined that the pending media information is similar to a piece of target media information, determining the audit result of the pending media information according to the preset audit result of that target media information.
2. The method according to claim 1, wherein the extracting the first digital fingerprint of the pending media information from the source file of the pending media information comprises:
Extracting at least one key frame image from the source file;
For each key frame image, performing a discrete transform on the key frame image, and determining the first image fingerprint according to the transformed result.
3. The method according to claim 2, wherein the determining the first image fingerprint according to the transformed result comprises:
Constructing an initial fingerprint sequence according to the transformed coefficient set;
Calculating the average value of the initial fingerprint sequence;
Converting the initial fingerprint sequence into a binary fingerprint sequence according to the average value, and using the binary fingerprint sequence as the first image fingerprint.
4. The method according to claim 3, wherein the constructing the initial fingerprint sequence according to the transformed coefficient set comprises:
Extracting, from the coefficient set, the coefficients that characterize the lowest-frequency image information, and combining them into the initial fingerprint sequence.
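The image fingerprint steps of claims 2 to 4 can be sketched as follows, under stated assumptions: the discrete transform is taken to be a two-dimensional discrete cosine transform, the key frame is taken to be already decoded into a small square grayscale matrix, and the names `dct_2d`, `image_fingerprint` and the parameter `keep` are hypothetical. The claims do not fix the transform, the frame size or the number of retained coefficients.

```python
import math
from typing import List


def dct_2d(block: List[List[float]]) -> List[List[float]]:
    """Naive unnormalized 2-D DCT-II of a square block (adequate for small inputs)."""
    n = len(block)
    # Precompute the cosine basis: basis[k][i] = cos((2i + 1) * k * pi / (2n)).
    basis = [[math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n)]
             for k in range(n)]
    return [[sum(block[x][y] * basis[u][x] * basis[v][y]
                 for x in range(n) for y in range(n))
             for v in range(n)]
            for u in range(n)]


def image_fingerprint(gray: List[List[float]], keep: int = 8) -> List[int]:
    """Binary fingerprint of one key frame (gray must be square, side >= keep)."""
    coeffs = dct_2d(gray)
    # Initial fingerprint sequence: the keep x keep lowest-frequency coefficients.
    initial = [coeffs[u][v] for u in range(keep) for v in range(keep)]
    mean = sum(initial) / len(initial)
    # Binarize against the average: 1 if the coefficient exceeds it, else 0.
    return [1 if c > mean else 0 for c in initial]
```

With the default `keep=8` the fingerprint has 64 bits per key frame, so two frames can be compared by counting differing bits.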
5. The method according to claim 1, wherein the extracting the first digital fingerprint of the pending media information from the source file of the pending media information comprises:
Extracting audio data from the source file;
Dividing the audio data into at least one audio segment;
For each audio segment, performing binarization filtering on the audio segment to obtain the first audio fingerprint.
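Claim 5 does not specify the binarization filter, so the following is only a stand-in that shows the overall shape of the computation: the audio samples are cut into fixed-length segments, a mean energy is computed per segment, and each bit records whether the energy rose relative to the previous segment. The function name `audio_fingerprint`, the energy feature and the rising-energy rule are assumptions chosen for illustration.

```python
from typing import List, Sequence


def audio_fingerprint(samples: Sequence[float], segment_len: int) -> List[int]:
    """Stand-in binary audio fingerprint built from per-segment energy changes."""
    # Split into non-overlapping segments; an incomplete trailing segment is dropped.
    segments = [samples[i:i + segment_len]
                for i in range(0, len(samples) - segment_len + 1, segment_len)]
    # Mean energy of each segment.
    energies = [sum(s * s for s in seg) / segment_len for seg in segments]
    # One bit per adjacent pair: 1 if the energy increased, otherwise 0.
    return [1 if cur > prev else 0 for prev, cur in zip(energies, energies[1:])]
```

A production filter would more likely operate on frequency-band energies than on raw time-domain energy, but the output has the same form: a binary sequence that can be compared by distance.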
6. The method according to claim 1, further comprising:
Constructing an image fingerprint linked list according to the second image fingerprint of the at least one piece of target media information;
Constructing an audio fingerprint linked list according to the second audio fingerprint of the at least one piece of target media information;
When the second digital fingerprint includes the second image fingerprint and the second audio fingerprint, the determining, according to the first digital fingerprint and the second digital fingerprint of the at least one piece of target media information, whether the pending media information is similar to a piece of target media information comprises:
Determining, according to the image fingerprint linked list, an image fingerprint similarity between the first image fingerprint and the second image fingerprint;
Determining, according to the audio fingerprint linked list, an audio fingerprint similarity between the first audio fingerprint and the second audio fingerprint;
Determining, according to the image fingerprint similarity and the audio fingerprint similarity, whether the pending media information is similar to a piece of target media information.
7. The method according to claim 6, wherein the determining, according to the image fingerprint linked list, the image fingerprint similarity between the first image fingerprint and the second image fingerprint comprises:
Splitting the first image fingerprint into at least one first image fingerprint subsequence;
For each first image fingerprint subsequence, searching the image fingerprint linked list for the element set corresponding to that first image fingerprint subsequence; when the element set is not empty, calculating in turn the image fingerprint distance between the first image fingerprint and each element in the element set;
When the image fingerprint distance is determined to be less than a first preset threshold, determining the image fingerprint distance as the image fingerprint similarity.
8. The method according to claim 7, wherein the constructing the image fingerprint linked list according to the second image fingerprint of the at least one piece of target media information comprises:
Splitting the second image fingerprint into at least one second image fingerprint subsequence;
For each second image fingerprint subsequence, recombining the second image fingerprint according to that second image fingerprint subsequence to obtain a third image fingerprint; and constructing the image fingerprint linked list with the second image fingerprint subsequence as the key and the third image fingerprint as the value;
When the element set is not empty, the searching the image fingerprint linked list for the element set corresponding to the first image fingerprint subsequence comprises:
Using the first image fingerprint subsequence as a key, determining the value corresponding to that key in the image fingerprint linked list;
Determining at least one third image fingerprint included in the value as the element set.
9. The method according to claim 8, wherein the calculating in turn the image fingerprint distance between the first image fingerprint and each element in the element set comprises:
Recombining the first image fingerprint according to the first image fingerprint subsequence to obtain a fourth image fingerprint;
Calculating in turn the image fingerprint distance between the fourth image fingerprint and each third image fingerprint.
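The index construction and lookup of claims 7 to 9 can be sketched as follows. Several points are assumptions made for the example rather than statements of the claimed method: the linked list is modelled as a dictionary from subsequence to a list of entries, "recombining" is read as moving the matched subsequence to the front of the fingerprint, the distance is the Hamming distance, each entry also carries a hypothetical target identifier, and fingerprint lengths are assumed to be divisible by the number of subsequences.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Bits = Tuple[int, ...]


def split(fp: Bits, parts: int) -> List[Bits]:
    """Split a fingerprint into equal-length subsequences."""
    size = len(fp) // parts
    return [fp[i * size:(i + 1) * size] for i in range(parts)]


def recombine(fp: Bits, parts: int, lead: int) -> Bits:
    """Assumed recombination: put subsequence `lead` first, keep the rest in order."""
    chunks = split(fp, parts)
    ordered = [chunks[lead]] + chunks[:lead] + chunks[lead + 1:]
    return tuple(bit for chunk in ordered for bit in chunk)


def hamming(a: Bits, b: Bits) -> int:
    """Number of positions at which two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


def build_index(targets: Dict[str, Bits], parts: int) -> Dict[Bits, List[Tuple[str, Bits]]]:
    """Key: second-fingerprint subsequence. Value: (target id, third fingerprint)."""
    index: Dict[Bits, List[Tuple[str, Bits]]] = defaultdict(list)
    for target_id, fp in targets.items():
        for i, sub in enumerate(split(fp, parts)):
            index[sub].append((target_id, recombine(fp, parts, i)))
    return index


def lookup(index: Dict[Bits, List[Tuple[str, Bits]]], query: Bits,
           parts: int, first_threshold: int) -> List[Tuple[str, int]]:
    """Return (target id, distance) for every candidate below the first threshold."""
    hits = []
    for i, sub in enumerate(split(query, parts)):
        elements = index.get(sub, [])      # element set for this subsequence
        if not elements:
            continue                       # empty set: nothing to compare
        fourth = recombine(query, parts, i)
        for target_id, third in elements:
            distance = hamming(fourth, third)
            if distance < first_threshold:
                hits.append((target_id, distance))
    return hits
```

The point of keying on subsequences is that only targets sharing at least one exact subsequence with the query are ever compared in full, which avoids a linear scan over all stored fingerprints.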
10. The method according to claim 7, wherein, when the audio fingerprint similarity is characterized by an audio fingerprint distance, the determining, according to the image fingerprint similarity and the audio fingerprint similarity, whether the pending media information is similar to a piece of target media information comprises:
Determining the smaller value of the image fingerprint distance and the audio fingerprint distance;
Determining the target media information to which the element corresponding to the smaller value refers;
Determining, according to the smaller value and a second preset threshold, whether the pending media information is similar to that target media information.
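The decision step of claim 10 can be sketched as follows. The `(target_id, distance)` pair format, the function name `decide`, and the rule that "similar" means the smaller distance is strictly below the second preset threshold are assumptions; the sketch also assumes the image and audio distances are on comparable scales, which the claim does not state.

```python
from typing import Optional, Tuple

Hit = Tuple[str, int]  # hypothetical (target id, fingerprint distance) pair


def decide(image_hit: Optional[Hit], audio_hit: Optional[Hit],
           second_threshold: int) -> Optional[str]:
    """Return the id of the similar target, or None if no target is similar."""
    candidates = [hit for hit in (image_hit, audio_hit) if hit is not None]
    if not candidates:
        return None
    # Take the smaller of the two distances and the target it refers to.
    target_id, distance = min(candidates, key=lambda hit: hit[1])
    # Assumed rule: similar only if the smaller distance is below the threshold.
    return target_id if distance < second_threshold else None
```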
11. The method according to any one of claims 1 to 10, further comprising:
When it is determined that the pending media information is similar to a piece of target media information, calculating a total similarity between the pending media information and that target media information, and determining, according to the total similarity and a third preset threshold, whether to send the pending media information and that target media information to a manual review system for a secondary audit.
12. A server, characterized by comprising:
A file acquisition module, configured to obtain a source file of pending media information;
A fingerprint extraction module, configured to extract a first digital fingerprint of the pending media information from the source file obtained by the file acquisition module, the first digital fingerprint including a first image fingerprint and a first audio fingerprint;
A deduplication module, configured to determine, according to the first digital fingerprint obtained by the fingerprint extraction module and a second digital fingerprint of at least one piece of target media information, whether the pending media information is similar to a piece of target media information, wherein the second digital fingerprint includes at least one of a second image fingerprint and a second audio fingerprint; and
An auditing module, configured to, when the deduplication module determines that the pending media information is similar to a piece of target media information, determine the audit result of the pending media information according to the preset audit result of that target media information.
13. The server according to claim 12, further comprising:
A linked-list construction module, configured to construct an image fingerprint linked list according to the second image fingerprint of the at least one piece of target media information, and to construct an audio fingerprint linked list according to the second audio fingerprint of the at least one piece of target media information;
The deduplication module is configured to, when the second digital fingerprint includes the second image fingerprint and the second audio fingerprint: determine, according to the image fingerprint linked list obtained by the linked-list construction module, an image fingerprint similarity between the first image fingerprint and the second image fingerprint; determine, according to the audio fingerprint linked list obtained by the linked-list construction module, an audio fingerprint similarity between the first audio fingerprint and the second audio fingerprint; and determine, according to the image fingerprint similarity and the audio fingerprint similarity, whether the pending media information is similar to a piece of target media information.
14. The server according to claim 13, wherein
The deduplication module is configured to: split the first image fingerprint into at least one first image fingerprint subsequence; for each first image fingerprint subsequence, search the image fingerprint linked list for the element set corresponding to that first image fingerprint subsequence; when the element set is not empty, calculate in turn the image fingerprint distance between the first image fingerprint and each element in the element set; and when the image fingerprint distance is determined to be less than a first preset threshold, determine the image fingerprint distance as the image fingerprint similarity.
15. A computer-readable storage medium, characterized by storing computer-readable instructions that can cause at least one processor to execute the method according to any one of claims 1 to 11.
CN201811294527.4A 2018-11-01 2018-11-01 Media information processing method, server and storage medium Active CN110149529B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811294527.4A CN110149529B (en) 2018-11-01 2018-11-01 Media information processing method, server and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811294527.4A CN110149529B (en) 2018-11-01 2018-11-01 Media information processing method, server and storage medium

Publications (2)

Publication Number Publication Date
CN110149529A true CN110149529A (en) 2019-08-20
CN110149529B CN110149529B (en) 2021-05-28

Family

ID=67588407

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811294527.4A Active CN110149529B (en) 2018-11-01 2018-11-01 Media information processing method, server and storage medium

Country Status (1)

Country Link
CN (1) CN110149529B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110879967A (en) * 2019-10-16 2020-03-13 厦门美柚股份有限公司 Video content repetition judgment method and device
CN111143724A (en) * 2019-12-30 2020-05-12 广州市百果园网络科技有限公司 Data processing method, device, equipment and medium
CN112381408A (en) * 2020-11-16 2021-02-19 支付宝(杭州)信息技术有限公司 Quality inspection method and device and electronic equipment
CN112541390A (en) * 2020-10-30 2021-03-23 四川天翼网络服务有限公司 Frame-extracting dynamic scheduling method and system for violation analysis of examination video
CN112699872A (en) * 2020-12-29 2021-04-23 天津幸福生命科技有限公司 Form auditing processing method and device, electronic equipment and storage medium
CN112749326A (en) * 2019-11-15 2021-05-04 腾讯科技(深圳)有限公司 Information processing method, information processing device, computer equipment and storage medium

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101645888A (en) * 2009-06-02 2010-02-10 中国科学院声学研究所 Data distribution method based on access frequency variable-length logic section
CN101777075A (en) * 2010-02-05 2010-07-14 上海全土豆网络科技有限公司 Method for searching parallel audio fingerprint
CN101807208A (en) * 2010-03-26 2010-08-18 上海全土豆网络科技有限公司 Method for quickly retrieving video fingerprints
CN102012846A (en) * 2010-12-12 2011-04-13 成都东方盛行电子有限责任公司 Integrity check method for large video file
CN102307301A (en) * 2011-05-30 2012-01-04 电子科技大学 Audio-video fingerprint generation method based on key frames
CN102802090A (en) * 2011-05-27 2012-11-28 未序网络科技(上海)有限公司 Video copyright protection method and system
CN103345496A (en) * 2013-06-28 2013-10-09 新浪网技术(中国)有限公司 Multimedia information searching method and system
CN103902702A (en) * 2014-03-31 2014-07-02 北京车商汇软件有限公司 Data storage system and data storage method
US20140192263A1 (en) * 2011-09-02 2014-07-10 Jeffrey A. Bloom Audio video offset detector
CN103929644A (en) * 2014-04-01 2014-07-16 Tcl集团股份有限公司 Video fingerprint database building method and device and video fingerprint recognition method and device
CN105069111A (en) * 2015-08-10 2015-11-18 广东工业大学 Similarity based data-block-grade data duplication removal method for cloud storage
CN105095435A (en) * 2015-07-23 2015-11-25 北京京东尚科信息技术有限公司 Similarity comparison method and device for high-dimensional image features
CN105550257A (en) * 2015-12-10 2016-05-04 杭州当虹科技有限公司 Audio and video fingerprint identification method and tampering prevention system based on audio and video fingerprint streaming media
CN107122370A (en) * 2016-02-25 2017-09-01 阿里巴巴集团控股有限公司 A kind of distributed search method and device

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110879967A (en) * 2019-10-16 2020-03-13 厦门美柚股份有限公司 Video content repetition judgment method and device
CN110879967B (en) * 2019-10-16 2023-02-17 厦门美柚股份有限公司 Video content repetition judgment method and device
CN112749326A (en) * 2019-11-15 2021-05-04 腾讯科技(深圳)有限公司 Information processing method, information processing device, computer equipment and storage medium
CN112749326B (en) * 2019-11-15 2023-10-03 腾讯科技(深圳)有限公司 Information processing method, information processing device, computer equipment and storage medium
CN111143724A (en) * 2019-12-30 2020-05-12 广州市百果园网络科技有限公司 Data processing method, device, equipment and medium
CN111143724B (en) * 2019-12-30 2023-07-04 广州市百果园网络科技有限公司 Data processing method, device, equipment and medium
CN112541390A (en) * 2020-10-30 2021-03-23 四川天翼网络服务有限公司 Frame-extracting dynamic scheduling method and system for violation analysis of examination video
CN112541390B (en) * 2020-10-30 2023-04-25 四川天翼网络股份有限公司 Frame extraction dynamic scheduling method and system for examination video violation analysis
CN112381408A (en) * 2020-11-16 2021-02-19 支付宝(杭州)信息技术有限公司 Quality inspection method and device and electronic equipment
CN112381408B (en) * 2020-11-16 2022-10-14 支付宝(杭州)信息技术有限公司 Quality inspection method and device and electronic equipment
CN112699872A (en) * 2020-12-29 2021-04-23 天津幸福生命科技有限公司 Form auditing processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN110149529B (en) 2021-05-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant