WO2008106465A1 - Method and apparatus for automatic detection and identification of unidentified video signals - Google Patents

Method and apparatus for automatic detection and identification of unidentified video signals

Info

Publication number
WO2008106465A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
frame
fingerprint
stored
fingerprints
Prior art date
Application number
PCT/US2008/055040
Other languages
English (en)
Inventor
Kwan Cheung
Original Assignee
Mediaguide, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mediaguide, Inc.
Publication of WO2008106465A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04H BROADCAST COMMUNICATION
    • H04H60/00 Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
    • H04H60/35 Arrangements for identifying or recognising characteristics with a direct linkage to broadcast information or to broadcast space-time, e.g. for identifying broadcast stations or for identifying users
    • H04H60/37 Arrangements for identifying or recognising characteristics with a direct linkage to broadcast information or to broadcast space-time, e.g. for identifying broadcast stations or for identifying users for identifying segments of broadcast information, e.g. scenes or extracting programme ID
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04H BROADCAST COMMUNICATION
    • H04H60/00 Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
    • H04H60/56 Arrangements characterised by components specially adapted for monitoring, identification or recognition covered by groups H04H60/29-H04H60/54
    • H04H60/59 Arrangements characterised by components specially adapted for monitoring, identification or recognition covered by groups H04H60/29-H04H60/54 of video

Definitions

  • TITLE METHOD AND APPARATUS FOR AUTOMATIC DETECTION AND IDENTIFICATION OF UNIDENTIFIED VIDEO SIGNALS
  • by "broadcast" is meant any readily available source of content, whether now known or hereafter devised, including, for example, streaming, peer to peer delivery of downloads, other delivery of downloads or detection of network traffic comprising such content delivery activity.
  • the system initially registers a known video program, which consists of a sequence of image frames, by digitally sampling the program in segments, typically on a frame by frame basis, and extracting particular feature sets that are characteristic of the frame.
  • the frame here can be the entire image frame, or a defined region within the image frame of the sequence.
  • the invention processes each set of features to produce a numerical code that represents the feature set for a particular segment or frame of the known program. These codes and the registration data identifying the program populate a database as part of the system. Once registration of one or more programs is complete, the system can then detect and identify the presence of the registered programming in a broadcast signal or its presence in and among a set of video signals (whether stored or broadcast) by extracting a feature set from the input signal, producing a numerical code for each segment input into the system and then comparing the sequence of detected numerical codes against the numerical codes stored in the database corresponding to known video content. Various testing criteria are applied during the comparison process in order to reduce the rate of false positives, false negatives and increase correct detections of the registered programming. The invention also encompasses certain improvements and optimizations in the comparison process so that it executes in a relatively short period of time.
  • the present invention relates to a method of detecting and tracking unknown broadcast video content items that are periodically encountered by automatic detection and tracking systems.
  • detection of broadcast content for example, music broadcast over radio
  • These known pattern vectors are stored in a database and while the broadcast signals are received, the same computation is applied to the incoming signal. Then, the detection process entails searching for matches between the incoming computed pattern vectors and the vast database of pre-created pattern vectors associated with the identity of known content.
  • Pattern vectors may also be derived from video frames by means of the application of digital signal processing or other algebraic techniques.
  • the fingerprint of a section of video is one or more numbers that are derived from the numbers making up the images comprising the section of the video.
  • one or more fingerprints may be calculated from a frame of video. Fingerprints may be calculated on a frame by frame basis or one frame out of a predetermined number of frames.
  • the techniques of searching through a database of pattern vectors looking for a series of matches may be used for pattern vectors derived from video.
  • the basic principles are the same as with searching for audio, albeit with some adaptations to accommodate the operating parameters associated with video signals.
  • the pattern vector itself may be derived in the manner set forth herein.
  • the management of distributed databases of pattern vectors for analyzing many broadcast signals in distinct geographic areas can be applied using the video pattern vectors.
  • Practitioners of ordinary skill will recognize that the system can be adapted to visit websites on the Internet and download or otherwise receive video programming data from the website and automatically determine the identity of the programming available at the selected URL whose activation resulted in the download or delivery of the video program.
  • a number of methods have been developed to automate the detection of broadcast programming. These techniques generally fall into one of two categories: cue detection or pattern recognition.
  • the cue detection method is exemplified by U.S. Pat. Nos. 4,225,967 to Miwa et al.; 3,845,391 to Crosby and 4,547,804 to Greenberg. These techniques rely on embedded cues inserted into the program prior to distribution. These approaches have not been favored in the field. In audio, the placement of cue signals in the program has limited the acceptance of this approach because it requires the cooperation of the program owners and/or broadcasters, thus making it impractical.
  • the pattern recognition method generally relies on the spectral or other characteristics of the content itself to produce a unique identifying code or signature.
  • the technique of identifying content consists of two steps: the first being extracting a signature or fingerprint from a known piece of content for insertion into a database, and the second being extracting a signature or fingerprint from a detected piece of content and searching for a signature or fingerprint match in the database in order to identify the detected content.
  • the preferred approach relies on characteristics of the broadcast content itself to create a signature unique to that content.
  • U.S. Patent No. 4,739,398 to Thomas et al. discloses a system that takes a known television program and creates, for each video frame, a signature code out of both the audio and the video signal within that frame.
  • Figure 1: The components of the media broadcast monitoring system.
  • Figure 3: The schematic of the DBS operation flow.
  • Figure 5: Schematic of Image Thresholding Dark Border Removal.
  • the broadcast monitoring and detection system embodying the invention works in two phases: registration and detection.
  • in the registration phase, known programming content is registered with the system by sending the program, as digital data, into the system.
  • a series of signatures (in the case here, a pattern vector, also referred to as a "fingerprint" or "signature") is stored as a sequence of data records in a database, with the identity of the program content cross-referenced to them as a group.
  • unidentified programming is input into the system.
  • Such programming can include video programming, whether terrestrial broadcast, satellite, internet, cable television or any other medium of delivery, whether now known or devised in the future. While such programming is being monitored, the pattern vectors of the programming (or any other signature generating technique) are continually calculated.
  • the calculated pattern vectors are then used to search for a match in the database.
  • the system uses the cross-referenced identity in the database to provide the identity of the content that is currently being played or made available for download.
  • the system is software running on a computer; however, it is envisioned that special purpose hardware components may replace parts or all of each module in order to increase performance and capacity of the system.
  • a computer containing a central processing unit is connected to a video digitizing card or interface device into which video programming is presented.
  • the interface is simply a network card that receives the appropriate digital video format, for example, broadcast HD, HDMI, DVI or even video data delivered as streamed or downloaded MPEG-2 or MPEG-4 data delivered through a computer network, including the Internet, that is attached to the computer.
  • the CPU fetches the video data from the interface card, or from the network card, calculates the pattern vector data, and then, along with timing data and the identity of the program, these results are stored in a database, as further described below.
  • the data may be loaded directly from authentic material, such as DVD disks, HD-DVD disks, Blu-ray discs, or storage devices containing digital data files in MPEG-2, MPEG-4 or any other video data format embodying the video signal.
  • the audio or other program signal is used in the following manner. If the system periodically detects an unknown program but with substantially the same sequence of signatures each time, it assigns an arbitrary identifier to the sequence as an identifier for the unknown program material and enters the data into the database as if the program had been introduced during the registration phase. Once the program identity is determined in the future, the database can be updated to include the appropriate content identity information, as with authentic information, while at the same time providing the owner of the programming with the usage data detected even when the identity of the program was not yet known.
  • the database is typically a data file stored on a hard drive connected to the central processing unit of the computer by means of any kind of computer bus or data transmission interface, including SCSI or Ethernet.
  • the CPU fetches the video program data from the video card or the network card, or loads it from a data file that may be stored on the computer hard drive or external media reader.
  • the CPU calculates the pattern vector data for the detected signal, and then, along with the timing data, submits database queries to the database stored on the hard drive.
  • the database may be the same hard drive as in the computer, or an external hard drive accessed over a digital computer network.
  • the CPU continues to process the data to confirm the identification of the programming, as described further below.
  • the CPU can then communicate over any of a wide variety of computer networking systems well known in the art to deliver the identification result to a remote location to be displayed on a screen using a graphical user interface, or to be logged in another data file stored on the hard drive.
  • the program that executes the method may be stored on any kind of computer readable media, for example, a hard drive, CD-ROM, EEPROM or floppy and loaded into computer memory at run-time.
  • the signal can be acquired using an analog to digital video converter card, or the digital video data can be directly detected from digital video sources, for example, the Internet or digital television broadcast.
  • the system consists of four components.
  • Figure 1 shows the interconnection of the four modules: (1) a signal processing stage at the front end, (2) a pattern generation module in the middle, (3) followed by a database search engine module, and (4) a program recognition module at the end.
  • the results of the pattern generation module, which creates signatures for known audio or video content, are stored in the database, and the search and pattern recognition modules are not used.
  • the SA module receives video data and makes it available to the remaining modules.
  • Practitioners of ordinary skill will recognize that there are a variety of products that receive analog video and convert those signals into digital data or to receive digital video or digital files embodying digital video signals.
  • These devices can be any source of digital video data, including an interface card in a personal computer that converts analog video into digital video data accessible by the computer's CPU, a stand-alone device that outputs digital video data in a standard format, or a digital video receiver with digital video output.
  • a pre-detected signal in digital form, that is, digital video files in a pre-determined format, can be accessed from storage devices connected to the system over typical data networks. Formats like MPEG-2 or MPEG-4 are well known in the art.
  • the SA module regularly or on command reads the data from the digital interface device or data storage and stores the data into a data buffer or memory to be accessed by the Pattern Generation module.
  • Practitioners of ordinary skill will recognize that the typical digital video system will provide a frame's worth of digital video at regular intervals, called the frame rate.
  • the sequence of frames representing the video is stored in sequence.
  • data structures stored in the computer memory (which includes the hard drive if the operating system supports paging and swapping) may be used where the time frames are not physically stored in sequence, but logically may be referenced or indexed in the sequence that they were detected by means of memory addressing.
  • the PG module operating during the detection phase (2), fetches the stored video samples that were detected and stored by the SA Module. Once a frame of the samples is received, the PG module will compute the pattern vector of the frame and, when in detection phase, send the pattern vector to the Database Search Module in the form of a database query. During the registration phase, the PG module calculates the pattern vector in order that it be stored in the database, in correlation with the other relevant information about the known video program. The calculation of the pattern vector is described further below.
  • a video stream can be viewed as a sequence of 2-dimensional image files. So, a video stream by itself has a well-defined frame structure. The same video stream may be processed, or coded, to the same video sequences but with different configurations.
  • a DVD video sequence in NTSC format, with a resolution of 720 x 480 and a frame rate of 29.97 fps, can be coded into a VCD video sequence of 320 x 240 and 29.97 fps.
  • Today's video coders can code this DVD sequence to the same video sequences of arbitrary resolution and arbitrary frame rates.
  • Fingerprint of a video frame is robust to arbitrary resolution (i.e. aspect ratio).
  • Fingerprint of a video frame is robust to the luminance, tint, and hue on every frame.
  • Fingerprint of a video frame is robust to the noise on every frame.
  • Identification of a video sequence is robust to different frame rate of the video sequence.
  • the production system SA Module is required to recognize all the popular formats.
  • an open-source codec (MPLAYER and MENCODER) is used to decode video sequences from many different formats.
  • a digital video sequence is a sequence of two-dimensional digital images. Each image is referred to as a frame of the sequence.
  • a frame is composed of a rectangular array of pixels.
  • the resolution of the video sequence is specified in terms of the horizontal (h) and the vertical (v) count of pixels on a frame.
  • a DVD video sequence in NTSC format has a resolution of 720 (h) x 480 (v).
  • Each pixel of a frame is a color pixel, composed from three primary colors (Red, Green and Blue, or RGB).
  • the magnitude of each color component is coded into a number of bits. The most popular one is 8 bits but high quality video sequence can have a higher bit count.
  • $x_{rgb}(m_v, m_h) = \{x_r(m_v, m_h),\ x_b(m_v, m_h),\ x_g(m_v, m_h)\}$
  • the video fingerprint formulation is based on RGB color space. Practitioners of ordinary skill will recognize that the calculations used for creating the fingerprint themselves can be transformed into the YUV space, or any other color space, and then applied to video signals encoded in that space, with equivalent results.
  • the fingerprints are derived from any monochromatic representation of the video frames.
  • n is the frame index
  • any color image in any color space can be converted by well-known transformations from one color space to another or into a monochromatic image.
  • the color green is the predominant component of brightness, and therefore if the image data is in the Y, U, V color space, the Y values can be used.
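  • The reduction to a monochromatic frame can be sketched as follows. This is only an illustration: it assumes 8-bit RGB pixels held in nested lists and the widely used ITU-R BT.601 luma weights, and the function name `to_luma` is hypothetical. The text itself does not prescribe a particular transformation.

```python
def to_luma(frame_rgb):
    """Convert a frame of (r, g, b) tuples into luma values in [0, 1].

    Assumptions (not from the source text): 8-bit components and the
    BT.601 weights 0.299, 0.587 and 0.114. Green carries the largest weight.
    """
    return [
        [(0.299 * r + 0.587 * g + 0.114 * b) / 255.0 for (r, g, b) in row]
        for row in frame_rgb
    ]


# A 2 x 2 example frame: white, black, pure green, pure red.
example_frame = [[(255, 255, 255), (0, 0, 0)], [(0, 255, 0), (255, 0, 0)]]
example_mono = to_luma(example_frame)
```

  • When the input is already in a Y, U, V color space, this step is unnecessary because the Y plane can be used directly.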
  • Another embodiment deals with the problem where pirated copies of a movie are made with camcorders in a movie theater environment. Oftentimes, those clips contain irregular dark borders due to camera shake and rotation. A rotation element is added to the detection process and a correction is made to compensate for the rotation, as shown in Figure 5.
  • a thresholding algorithm can be used to detect a borderline that has some slope relative to the edge of the image. This slope can be converted into an angle of rotation to be applied to the image frame, using well known techniques.
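  • One way to turn a thresholded borderline into a rotation angle is sketched below. It assumes a monochrome frame with values in [0, 1], a dark border along the top edge, and a slope estimated from only the leftmost and rightmost columns; the function name and the threshold value are hypothetical, and a production system would likely fit a line through many columns.

```python
import math


def border_angle_degrees(frame, threshold=0.1):
    """Estimate frame rotation from the slope of the top dark border.

    For a column, the border ends at the first row whose pixel exceeds
    `threshold`. The slope between the leftmost and rightmost columns is
    converted into an angle with the arctangent.
    """
    height, width = len(frame), len(frame[0])
    if width < 2:
        raise ValueError("need at least two columns to measure a slope")

    def first_bright_row(col):
        for row in range(height):
            if frame[row][col] > threshold:
                return row
        return height  # the whole column is dark

    rise = first_bright_row(width - 1) - first_bright_row(0)
    return math.degrees(math.atan(rise / (width - 1)))
```

  • The returned angle would then be applied as a counter-rotation to the frame before the dark border is cropped.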
  • equalization is used to equalize the distribution of the pixel values, i.e. to maximize the contrast of the image.
  • This processing step is used to reduce the effect of illuminance (brightness), contrast and color shift resulting from the application of different video codecs or color space conversions on the pattern vector or fingerprint values.
  • RMS root-mean-squared
  • the RMS pixel value of every frame (typically after dark borders are removed) is set to equal some predetermined constant C, in the preferred embodiment 0.5.
  • the RMS equalization method is used (it is also called the power equalization used in wireless communication networks; see S. Verdu, "Wireless Bandwidth in the Making," IEEE Communications Magazine, Invited Paper, Special Issue of High-Speed Wireless Access, July, 2000).
  • the ratio $r$ is used to scale every pixel from $x_n(m_h, m_v)$ to $r \cdot x_n(m_h, m_v)$ such that the RMS pixel value of the frame is equal to $C$.
  • the following is the equation to compute the RMS pixel value of a given frame: $x_{rms}(n) = \sqrt{\frac{1}{M_h M_v} \sum_{m_h=1}^{M_h} \sum_{m_v=1}^{M_v} x_n(m_h, m_v)^2}$, where $M_h$ and $M_v$ are the new dimensions after dark border removal.
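  • The RMS equalization step can be sketched as follows, assuming a monochrome frame stored as nested lists of floats with dark borders already removed. The function names are hypothetical; the target constant of 0.5 is the preferred-embodiment value stated above.

```python
import math


def rms(frame):
    """Root-mean-squared pixel value of a frame given as nested lists."""
    count = sum(len(row) for row in frame)
    return math.sqrt(sum(p * p for row in frame for p in row) / count)


def rms_equalize(frame, target=0.5):
    """Scale every pixel by r = target / rms(frame) so the RMS equals target."""
    current = rms(frame)
    if current == 0.0:
        # An all-black frame cannot be scaled; return an unchanged copy.
        return [row[:] for row in frame]
    r = target / current
    return [[r * p for p in row] for row in frame]
```

  • Because every pixel is multiplied by the same ratio, two copies of a frame that differ only by a global brightness gain equalize to the same values, which is the robustness property this step is meant to provide.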
  • step 3 the horizontal and the vertical projection of the image is calculated as follows:
  • Each of the horizontal projection elements is obtained by a horizontal projection of the image, as follows:
  • Every horizontal projection element is an average of pixel values on the corresponding column of the image.
  • every vertical projection element is an average of pixel values on the corresponding row of the image.
  • Each projection is compressed and converted into two fingerprints or pattern vectors.
  • the first fingerprint vector is the one given by the horizontal projection:
  • the second fingerprint vector is the one given by the vertical projection:
  • the four parameters $(O_H, B_H)$ and $(O_V, B_V)$ are determined by $N_H$ and $N_V$, the number of fingerprint elements in the horizontal and vertical projections respectively, as well as the number of pixels in both dimensions.
  • $B$ is the corresponding image dimension (horizontal or vertical) divided by $N$, and $O$ is the value $B$ times the percentage overlap.
  • $N_H$ and $N_V$ are set to be 15 and the percentage overlap is 50%. Practitioners of ordinary skill will recognize that the parameters $N_H$, $N_V$, $O_H$, $B_H$, $O_V$ and $B_V$ and the percentage overlap can be adjusted to vary the size of the databases, the speed of operation and the accuracy of the matching.
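  • The projection and compression steps can be sketched as follows. The projections follow the description above (column means and row means). The compression formulas are not reproduced in this text, so the windowing rule used here is an assumption: element k is the mean of the projection over a block of size B extended by O on each side. The function names are hypothetical.

```python
import math


def projections(frame):
    """Return (horizontal, vertical) projections: column means and row means."""
    height, width = len(frame), len(frame[0])
    horizontal = [sum(frame[r][c] for r in range(height)) / height
                  for c in range(width)]
    vertical = [sum(row) / width for row in frame]
    return horizontal, vertical


def compress(projection, n_elements=15, overlap=0.5):
    """Average a projection over n_elements overlapping blocks.

    B = len(projection) / n_elements is the block size and O = B * overlap.
    Assumed rule: block k covers [k*B - O, (k+1)*B + O), clipped to the frame.
    """
    length = len(projection)
    if length < n_elements:
        raise ValueError("projection is shorter than the fingerprint length")
    block = length / n_elements
    extra = block * overlap
    fingerprint = []
    for k in range(n_elements):
        start = max(0, math.floor(k * block - extra))
        stop = min(length, math.ceil((k + 1) * block + extra))
        window = projection[start:stop]
        fingerprint.append(sum(window) / len(window))
    return fingerprint
```

  • With the preferred-embodiment values this yields two 15-element vectors per frame, one from each projection, which are kept separate rather than concatenated.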
  • the two fingerprint vectors, one obtained with a horizontal projection of the image and the other obtained with a vertical projection of the image, are not aggregated into a single fingerprint vector. There are good reasons for keeping them separate.
  • the first reason is that the recall is more robust to changes in the aspect ratio, for example when a wide-format clip is used as a source to detect the same clip in non-wide formats, and vice versa.
  • the normal format frame is obtained by a clipping of the wide format frame, which is a popular way of mapping from a wide format video to a normal format video. Due to the clipping, there is a very low chance of getting the horizontal projection matched. But the chance of getting the vertical projection matched is still reasonably good.
  • the second reason is to have the fingerprints be invariant to frame rotation, i.e. rotating every frame by 90 degrees to exchange the vertical and the horizontal axes, which is known to be a popular scheme for distributing pirated video on the Internet. If the frame is rotated, then the two fingerprint vectors are interchanged.
  • the detection algorithm can easily be designed to run a parallel search on the $FP_V$ and $FP_H$ vectors on a single database that houses both fingerprint vectors.
  • the architecture also accommodates the effects of flipping the frames horizontally, vertically, or both.
  • a frame is flipped horizontally means that the index m v is mapped to
  • the system reruns the matching search process with flipped pattern vectors.
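  • The set of alternative queries implied by the rotation and flipping cases can be sketched as follows. It assumes that mirroring a frame simply reverses the element order of the affected fingerprint vector, which holds when the averaging blocks are placed symmetrically; the function name and labels are hypothetical.

```python
def query_variants(fp_h, fp_v):
    """Yield (label, horizontal, vertical) fingerprint pairs to search in turn."""
    yield "original", fp_h, fp_v
    yield "rotated_90", fp_v, fp_h                   # the two axes are exchanged
    yield "flipped_horizontally", fp_h[::-1], fp_v   # column order reversed
    yield "flipped_vertically", fp_h, fp_v[::-1]     # row order reversed
    yield "flipped_both", fp_h[::-1], fp_v[::-1]
```

  • A caller would run the database search once per variant and stop at the first variant that yields a match.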
  • Upon reception of a query generated by the PG module, this module (3) will search the database containing the sequence of pattern vectors of known programming. If a match is found, then the module returns a set of registration numbers, otherwise referred to herein as program-id's, and frame-id's, referred to also as frame numbers, corresponding to the identities of a set of video programs and the frame numbers within these programs where the match occurred. If the search of the database fails to find a match, the DBS Module will issue a NO-MATCH flag. It is contemplated that aspects of the invention for the DBS Module are applicable to any kind of data set containing signal signatures, even signatures derived using techniques distinct from those used in the Pattern Vector Generation module.
  • SDI Program Detection and Identification
  • This module constantly monitors the matching results from the DBS on the most recent N contiguous time frames, as further described below.
  • N is set to five, although a larger or smaller number may be used with varying results.
  • Two schemes are used to determine if any video program has been positively detected. The first is a majority voting scheme, which determines if, within each thread of matching pattern vectors among N, the number of frames that possess a valid sequence passes a designated majority of the block of frames. The second is a frame sequencing scheme, which follows each of the potential threads and counts how many frames within that thread constitute a valid sequence. If there exists a thread (or threads) where a majority of the sequentially detected frames satisfy the frame sequencing requirement, then the program is deemed detected in that thread. Either or both schemes are used to suppress false positive detections and to increase the correct detections. In the preferred embodiment, both schemes are used.
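  • The two schemes can be sketched over a window of N = 5 detection frames as follows. This is a simplified illustration: each frame contributes at most one (program-id, frame-id) match or None for NO-MATCH, and the sequencing test assumes the matched frame-id should advance by about 12 registered frames per detection frame, a value inferred from the 1/12 sec and 1 sec interframe distances mentioned elsewhere in the text. All function names, the tolerance and the thresholds are hypothetical.

```python
from collections import Counter


def majority_program(results, majority):
    """Return the program-id holding at least `majority` votes, else None."""
    ids = [r[0] for r in results if r is not None]
    if not ids:
        return None
    program_id, votes = Counter(ids).most_common(1)[0]
    return program_id if votes >= majority else None


def sequenced_pairs(results, program_id, step=12, tolerance=2):
    """Count successive hits whose frame-ids advance as expected over time."""
    hits = [(i, r[1]) for i, r in enumerate(results)
            if r is not None and r[0] == program_id]
    valid = 0
    for (i0, f0), (i1, f1) in zip(hits, hits[1:]):
        if abs((f1 - f0) - step * (i1 - i0)) <= tolerance:
            valid += 1
    return valid


def detect(results, majority=3, min_sequenced=2):
    """Apply both schemes; return the detected program-id or None."""
    program_id = majority_program(results, majority)
    if program_id is None:
        return None
    if sequenced_pairs(results, program_id) < min_sequenced:
        return None
    return program_id
```

  • Requiring both tests to pass trades a few missed detections for fewer false positives, since a random run of matches rarely agrees on both the program and the frame progression.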
  • Given a program (or more than one) that is detected, the SDI module will initiate two modes:
  • Identification mode: in this mode, the module logs all the reference information of the detected program, including title, production company or other copyright owner, or any other information input during the registration phase of the system, along with the time when the program is detected, and the time into the program that the detection was made. This information will be registered on the detection log.
  • Tracking mode: in this mode, the module tracks each detected program by monitoring whether the queried result of every new frame of the detected content obeys the sequencing requirement, described below. The algorithm is locked in this mode until the queried results can no longer be matched with the sequencing requirement. Upon exiting the tracking mode, a number of detection attributes, including the entire duration of the tracking and the tracking score, will be logged.
  • the pattern vector generated by the PG Module is sent to the DBS Module in order to conduct a search of the database for a match.
  • the output is either a NO-MATCH flag, which indicates that the DBS fails to locate a frame within the database that passes the search criteria; or the program-id's and frame-id's of the pattern vectors that pass the search criteria.
  • the SDI Module collects the output from the DBS Module to detect if a new video program is present. If so, the detected program is identified.
  • Figure 1 is an illustration of the flow of the algorithm from a frame of video to its result after detection. It is contemplated that aspects of the invention for the SDI Module are applicable to any kind of data set containing signal signatures, even signatures derived using techniques distinct from those used in the Pattern Vector Generation module.
  • the Database Search Module takes the pattern vector of each frame from the PG Module and assembles a database query in order to match that pattern vector with database records that have the same pattern vector.
  • a soft matching scheme is employed to determine matches between database queries and pattern vectors stored in the database.
  • a hard matching scheme allows at most one matching entry for each query.
  • the soft matching scheme allows more than one matching entry per query, where a match is a pattern vector that is close enough, in the sense of meeting an error threshold, to the query vector.
  • the number of matching entries can either be (i) limited to some maximum amount, or (ii) limited by the maximum permissible error between the query and the database entries. Either approach may be used.
  • the soft matching scheme relies on the fact that the program patterns are being oversampled in the registration phase.
  • the interframe distance used for registration is only 1/12 of that used in the detection.
  • the interframe distance used for registration is 1/12 sec, and for detection/identification is 1 sec.
  • a nearest neighbor search algorithm is used, which consists of two parts.
  • Part 1 exercises an approximate search methodology.
  • a range search (RS) scheme is employed to determine which entries in the database fall within a close vicinity of the query.
  • Part 2 exercises a fine search methodology.
  • Results from Part 1 are sorted according to their distances to the query.
  • the search algorithm can either (i) return the best M results (in terms of having shortest distances to the query), or (ii) return all the results with distance less than some prescribed threshold. Either approach may be used.
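  • The fine search of Part 2 can be sketched as follows, assuming the candidates surviving Part 1 are supplied as (entry-id, vector) pairs and that distance is measured with the L1 norm. The function name and parameter names are hypothetical.

```python
def fine_search(query, candidates, best_m=None, max_distance=None):
    """Sort Part 1 candidates by distance to the query and select results.

    Option (i): keep the best_m closest entries.
    Option (ii): keep every entry with distance below max_distance.
    Returns a list of (entry_id, distance), closest first.
    """
    def l1(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    scored = sorted((l1(query, vec), entry_id) for entry_id, vec in candidates)
    if max_distance is not None:
        scored = [item for item in scored if item[0] < max_distance]
    if best_m is not None:
        scored = scored[:best_m]
    return [(entry_id, dist) for dist, entry_id in scored]
```

  • Hard matching corresponds to calling this with `best_m=1`, while soft matching keeps several entries for the later sequencing tests.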
  • the nearest neighbor algorithm can be replaced with other algorithms that provide better compute time performance when executing the search.
  • Range search requires pattern vectors that match within a tolerance, not necessarily a perfect match in each case. From the geometrical point of view, range search identifies the set of entries that are encompassed within a polygon whose dimensions are determined by the tolerance parameters.
  • the polygon is a 15-dimensional hypercube for each projection, i.e. both $N_V$ and $N_H$ are set to 15.
  • the pattern vector length is set.
  • the pattern corresponding to the horizontal projection has a dimension of $N_H$.
  • the pattern corresponding to the vertical projection has a dimension of $N_V$.
  • the examples below show a length of $R$; however, the principles apply to whatever vector length is used.
  • the pattern vector library is an $M \times R$ matrix, where $M$ is the total number of pattern vectors stored in the database and $R$ represents the number of elements in the pattern vector. $M$ is a potentially huge number, as demonstrated below. Assume that the entire database is represented by the matrix A:
  • each vector z is a pattern vector of R elements calculated during the registration phase with known video content for which detection is sought during the detection phase.
  • the identification exercise is to locate a set of library pattern vectors, $\{z_{opt}\}$, that are enclosed within the hypercube determined by the tolerance parameter.
  • the L1 norm is used, where $\|x\| = |x_1| + |x_2| + \cdots + |x_R|$ is the L1 norm of $x$.
  • $e_{m,n}$ is referred to as the $n$th point error between $c$ and $z_m$.
  • the search for $z^*$ over the entire library with the RS algorithm is based on the satisfaction of point error criteria. That is, each point error must be less than some tolerance and, in the preferred embodiment, the L1 norm must be less than a certain amount. Practitioners of ordinary skill will recognize that the tolerance for each element and the L1 norm may be the same or different, which changes the efficiency of searching. The determination of the tolerance is based on some statistical measure of empirically measured errors. Further, it is recognized that other measures of error besides a first-order L1 norm may be used.
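  • The point error criteria can be sketched as a brute-force range search. It assumes the point error is the absolute difference between corresponding elements of the query and a library vector, which is the natural reading but is not spelled out in this text; the function and parameter names are hypothetical.

```python
def range_search(query, library, point_tolerance, l1_tolerance):
    """Return indices of library vectors inside the tolerance hypercube.

    A vector qualifies when every point error is below point_tolerance and
    the sum of the point errors (the L1 norm) is below l1_tolerance.
    """
    matches = []
    for m, z in enumerate(library):
        errors = [abs(c_n - z_n) for c_n, z_n in zip(query, z)]
        if all(e < point_tolerance for e in errors) and sum(errors) < l1_tolerance:
            matches.append(m)
    return matches
```

  • This linear scan visits all M vectors for each query, which is the cost that pre-sorted index tables are intended to avoid.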
  • the search problem now becomes a range search problem, which is described elsewhere in the art. The following is incorporated by reference to P. K.
  • Criterion 1: select only those $z_m$ with error less than some prescribed threshold $e_{max}$.
  • Criterion 2: select the best M candidates from L, where the M candidates are those ranked from the smallest error to the Mth smallest error.
  • the index set L is:
  • A is the library matrix consisting of M rows of pattern vectors:
  • Each row is a particular pattern vector. There are in total M pattern vectors, and in the preferred embodiment, each has R elements.
  • Table T k -1 is a mapping of m — T -lr — >m and table T ⁇
  • sorting and table creation may occur after the registration phase but prior to the search for any matches during the detection phase.
  • By having pre-sorted the pattern vectors during the registration phase, the system reduces the search time during the detection phase.
  • the method begins with a search through the sorted vectors, as described below.
  • $T = [T_1\ T_2\ \cdots\ T_R]$, a binary search method may be used to extract the indices of
  • the pair R can be converted back to the original indices using
  • each successive $p_k$ would be the prior $p_{k-1}$ minus those indices that failed the tolerance test for the $k$th element.
  • the p # _i are the indices that meet all R tolerance tests.
  • The set S contains all the original indices that remain after the R intersection loops. If S is empty, issue the NO-MATCH flag. Otherwise, for hard matching, we proceed to locate the sole winner, which may be the candidate with the smallest error. For soft matching, we proceed to collect all the qualifying entries.
  • The total number of candidates in each column can be measured.
  • The total number of candidates in each column is equal to the total number of candidates in each p_k.
  • The order of the k's can then be altered so that the first k is the one corresponding to the p_k that has the fewest candidates, the second k is the one corresponding to the p_k that has the next fewest candidates, and so on.
  • The last k is the one corresponding to the p_k having the largest number of candidates of all.
  • The order of intersection starts with the columns that have the fewest candidates. This does not alter the end result, but the search speed is much improved.
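  • The range search described above can be sketched as follows. This is an illustration under stated assumptions, not the specification's implementation: each column of the library is sorted once at registration, binary search (Python's `bisect`) extracts the candidates per column at detection, and the candidate sets are intersected starting with the smallest. The names `build_index` and `range_search` are hypothetical, a single tolerance `tol` is used for all elements, and the bounds are treated as inclusive.

```python
from bisect import bisect_left, bisect_right

def build_index(library):
    """Registration phase: for each element k, sort the column and
    remember which original row each sorted value came from."""
    index = []
    for k in range(len(library[0])):
        pairs = sorted((row[k], m) for m, row in enumerate(library))
        values = [v for v, _ in pairs]      # sorted column values
        originals = [m for _, m in pairs]   # sorted position -> original index
        index.append((values, originals))
    return index

def range_search(index, query, tol):
    """Detection phase: return the original indices whose every
    element lies within tol (inclusive) of the query element."""
    per_column = []
    for k, (values, originals) in enumerate(index):
        lo = bisect_left(values, query[k] - tol)
        hi = bisect_right(values, query[k] + tol)
        per_column.append(set(originals[lo:hi]))
    # Intersect starting with the column that has the fewest candidates.
    per_column.sort(key=len)
    survivors = per_column[0]
    for candidates in per_column[1:]:
        survivors = survivors & candidates
        if not survivors:
            break  # empty set corresponds to the NO-MATCH case
    return survivors
```

  • Starting with the smallest candidate set keeps every later intersection small, which is the speed-up the text attributes to reordering the k's; the final set is the same in any order.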
  • SDI Program Detection and Identification
  • The SDI module takes the results of the DBS module and then provides final confirmation of the program identity.
  • the SDI module contains two routines:
  • Detection filtering on the regularity of the detected program number: Irregular matches, where the DBS module returns different program-id numbers on a consecutive set of frames, are a good indication that no program is being positively detected. In contrast, consistent returns, where the DBS module consistently returns the same song number on a consecutive set of frames, indicate that a program is successfully detected.
  • A majority vote calculation can be made in a number of ways. For example, it may be advantageous in certain applications to apply a stronger test, where the majority threshold is a value greater than K+1 and less than or equal to 2K+1, where a threshold of 2K+1 would constitute a unanimous vote. This reduces false positives, potentially at the cost of more undetected results.
  • majority vote shall be defined to include these alternative thresholds.
  • the preferred embodiment determines the majority vote using a median filter.
  • The detected result is the product of x and y.
  • the major feature of this formula is that it can be implemented in one pass rather than an implementation requiring loops and a counter.
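  • One plausible reading of the median-based majority vote, offered as a sketch only since the formula itself does not survive in this text: over a window of 2K+1 program ids, any value held by a majority is necessarily the median, so x can be taken as the median and y as a 0/1 flag for whether x reaches the threshold. The function name `majority_vote`, the use of 0 to mean "no detection", and the assumption that program ids are positive integers are all illustrative. This sketch counts occurrences explicitly and is not the one-pass formula the text refers to.

```python
from statistics import median_low

def majority_vote(program_ids, threshold=None):
    """program_ids: the most recent 2K+1 program ids (odd length).
    Returns the winning program id, or 0 if no id reaches the threshold."""
    n = len(program_ids)
    if n % 2 == 0:
        raise ValueError("window length must be odd (2K+1)")
    k = n // 2
    if threshold is None:
        threshold = k + 1  # simple majority; up to 2K+1 for a unanimous vote
    x = median_low(program_ids)  # middle value of the sorted window
    y = 1 if program_ids.count(x) >= threshold else 0
    return x * y
```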
  • the next step is to impose an additional verification test to determine if there is frame synchronization of the song being detected.
  • the frame synchronization test checks that the frame-id number output by the DBS module for each p-th frame is a monotonically increasing function over time, that is, as p increases. If it is not, or if the frame indices are random, the detection is declared void.
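  • The frame synchronization test can be sketched as a check that the detected frame ids strictly increase with the broadcast frame index p. The function name `is_frame_synchronized` is hypothetical, and treating "monotonically increasing" as strictly increasing is an assumption of this sketch.

```python
def is_frame_synchronized(frame_ids):
    """frame_ids: frame-id numbers output by the DBS module for
    successive broadcast frames, in order of increasing p.
    Returns True only if each id is greater than the previous one."""
    return all(later > earlier for earlier, later in zip(frame_ids, frame_ids[1:]))
```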
  • The following is the step-by-step method of the entire SDI module:
  • Let s_p be a structure that holds the most recent 2K + 1 program_id's after the p-th broadcast frame has been detected:
  • s_{m,n} is the n-th program_id being detected in the m-th broadcast frame by the DBS module, and P_m is the size of the bin.
  • P_m is different for different m's.
  • f_p is another structure holding the corresponding frame numbers or frame indices:
  • a register is created to hold this result until a new and different song or program is detected.
  • f_t is a frame of s_{p,m} in the t-th bin of f.
  • The quantity d_1 is the offset of frames between the two detected frames in B.
  • This quantity can also be translated to an actual time offset by multiplying the value by the interframe distance in samples and dividing by the samples per second.
  • The quantity d_2 is the frame offset between the two broadcast frames.
  • d is the ratio of the two offsets, representing the advance rate of the detected sequence.
  • the system expects an ideal rate of 12 for video detection as the value for d.
  • An elastic constraint on d is applied: if d_1 ∈ (12[d_2 − 1] + 10, 12[d_2 − 1] + 14), the two frames are in the right sequencing order.
  • the values of 10 and 14 are a range centering around the ideal value 12.
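  • A small sketch of the sequencing check and the time-offset conversion, assuming the constraint reads as: the detected-frame offset must lie strictly between 12(d_2 − 1) + 10 and 12(d_2 − 1) + 14, where d_2 is the broadcast-frame offset. The function and parameter names (`in_sequence`, `frames_to_seconds`, `ideal_rate`, `slack`) are illustrative, and the sample values in the usage below are made up.

```python
def in_sequence(detected_offset, broadcast_offset, ideal_rate=12, slack=2):
    """Elastic constraint: the detected-frame offset must fall in the
    open interval centred on the ideal rate (10 to 14 around 12)."""
    base = ideal_rate * (broadcast_offset - 1)
    low = base + (ideal_rate - slack)    # 12*(d2 - 1) + 10 by default
    high = base + (ideal_rate + slack)   # 12*(d2 - 1) + 14 by default
    return low < detected_offset < high

def frames_to_seconds(frame_offset, interframe_samples, samples_per_second):
    """Translate a frame offset to a time offset in seconds."""
    return frame_offset * interframe_samples / samples_per_second
```

  • For example, with adjacent broadcast frames, a detected offset of 12 passes while an offset of 10 does not, because the interval is open at both ends.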
  • the majority vote test is applied again because even if the majority vote passes in Step 5, the majority vote test may fail after cleaning up the result with the sequencing rule requirement. If the revised majority vote passes, then a new program or song has been positively detected, otherwise, there is no detection.
  • The sequencing requirement here is the same as that used in Step 5c. That is, we expect the id of the detected frame for each new broadcast frame to increase monotonically, and the increase between successive broadcast frames to be between 10 and 12 in the preferred embodiment.
  • The thread that has the highest score is the winner among all threads in the Final list.
  • The score can be calculated based on the error between each frame in the thread and the corresponding frame of the broadcast, on the duration of the thread, or on both. In our preferred embodiment, the duration is taken as the tracking score of each thread. The thread that endures the longest within the period of tracking is the winner.
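  • Selecting the winning thread by duration can be sketched as follows, assuming each thread is represented by the list of frame ids it has tracked and that duration is the number of tracked frames. The name `winning_thread` and the dictionary layout are hypothetical; ties go to the first thread encountered.

```python
def winning_thread(threads):
    """threads maps a program_id to the list of frame ids tracked so far.
    Returns the program_id of the longest-lived thread, or None if empty."""
    if not threads:
        return None
    # Duration-based score: the number of frames the thread has survived.
    return max(threads, key=lambda program_id: len(threads[program_id]))
```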
  • If multiple programs are being posted in Step 2, correct the posting by the program_id of the winning thread. 8. Wait for the new p-th frame from the broadcast. Go back to Step 1.
  • Step 5, which tests the sequentiality of the frame-id's, may be changed to make the test either easier or harder to meet. This controls whether the results admit more false positives or suppress them, while raising or lowering the number of correct identifications as compared to no detections.
  • the detection phase of the process by means of video pattern vector matching process can first check a match using the vertical pattern vector and then attempt a match using a horizontal pattern vector. If a soft match is found with either one, then the sequential testing is applied using horizontal vectors or vertical vectors, depending on which type created the match. The assumption is that the video signal will not be rotated back and forth by 90 degrees each frame.
  • The invention, embodied by a computer program stored on a disk as part of a computer, can be executed by a computer that loads the program.
  • the computer can be a server operatively connected to a database over a computer network, and also connected to the Internet.
  • the server can use well known protocols to test websites for the presence of hyperlinks or other indicia of network addressing that have video data made available, either as download or in streamed form.
  • the invention can receive this video data and process it in accordance with the methodology described herein.
  • Practitioners will recognize that a video program may be registered in one format and then detected in another. For example, a website may host a streamed version at low resolution of the same video registered with the database in the system at a high resolution.
  • the pattern vectors are optimally configured so that pattern vector calculations from the two formats produce sufficiently identical pattern vectors.
  • a server may be a computer comprised of a central processing unit with a mass storage device and a network connection.
  • A server can include multiple such computers connected together with a data network or other data transfer connection, or multiple computers on a network with network-accessed storage, in a manner that provides such functionality as a group.
  • Practitioners of ordinary skill will recognize that functions that are accomplished on one server may be partitioned and accomplished on multiple servers that are operatively connected by a computer network by means of appropriate inter process communication.
  • the access of the website can be by means of an Internet browser accessing a secure or public page or by means of a client program running on a local computer that is connected over a computer network to the server.
  • a data message and data upload or download can be delivered over the Internet using typical protocols, including TCP/IP, HTTP, SMTP, RPC, FTP or other kinds of data communication protocols that permit processes running on two remote computers to exchange information by means of digital network communication.
  • a data message can be a data packet transmitted from or received by a computer containing a destination network address, a destination process or application identifier, and data values that can be parsed at the destination computer located at the destination network address by the destination application in order that the relevant data values are extracted and used by the destination application.
  • The method described herein can be executed on a computer system, generally comprised of a central processing unit (CPU) that is operatively connected to a memory device, data input and output circuitry (I/O) and computer data network communication circuitry.
  • Computer code executed by the CPU can take data received by the data communication circuitry and store it in the memory device.
  • The CPU can take data from the I/O circuitry and store it in the memory device.
  • The CPU can take data from a memory device and output it through the I/O circuitry or the data communication circuitry.
  • the data stored in memory may be further recalled from the memory device, further processed or modified by the CPU in the manner described herein and restored in the same memory device or a different memory device operatively connected to the CPU including by means of the data network circuitry.
  • The memory device can be any kind of data storage circuit or magnetic storage or optical device, including a hard disk, optical disk or solid state memory.
  • Source code may include a series of computer program instructions implemented in any of various programming languages (e.g., an object code, an assembly language, or a high-level language such as FORTRAN, C, C++, JAVA, or HTML) for use with various operating systems or operating environments.
  • the source code may define and use various data structures and communication messages.
  • the source code may be in a computer executable form (e.g., via an interpreter), or the source code may be converted (e.g., via a translator, assembler, or compiler) into a computer executable form.
  • the computer program may be fixed in any form (e.g., source code form, computer executable form, or an intermediate form) either permanently or transitorily in a tangible storage medium, such as a semiconductor memory device (e.g., a RAM, ROM, PROM, EEPROM, or Flash-Programmable RAM), a magnetic memory device (e.g., a diskette or fixed disk), an optical memory device (e.g., a CD-ROM), a PC card (e.g., PCMCIA card), or other memory device.
  • the computer program may be fixed in any form in a signal that is transmittable to a computer using any of various communication technologies, including, but in no way limited to, analog technologies, digital technologies, optical technologies, wireless technologies, networking technologies, and internetworking technologies.
  • The computer program may be distributed in any form as a removable storage medium with accompanying printed or electronic documentation (e.g., shrink wrapped software or a magnetic tape), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over the communication system (e.g., the Internet or World Wide Web).

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention concerns a method for detecting the identity of video programming, whereby known video programming is converted into a set of pattern vectors stored in a database, and incoming detected video programming is converted into a set of pattern vectors that are used to search the database for matching pattern vectors indicating a match with the known video programming.
PCT/US2008/055040 2007-02-26 2008-02-26 Method and apparatus for automatic detection and identification of unidentified video signals WO2008106465A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US89154807P 2007-02-26 2007-02-26
US60/891,548 2007-02-26

Publications (1)

Publication Number Publication Date
WO2008106465A1 true WO2008106465A1 (fr) 2008-09-04

Family

ID=39721593

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/055040 WO2008106465A1 (fr) 2007-02-26 2008-02-26 Method and apparatus for automatic detection and identification of unidentified video signals

Country Status (1)

Country Link
WO (1) WO2008106465A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9430472B2 (en) 2004-02-26 2016-08-30 Mobile Research Labs, Ltd. Method and system for automatic detection of content
EP3113075A1 (fr) * 2015-06-30 2017-01-04 Thomson Licensing Method and apparatus for finding a matching image in a set of given images
CN110709841A (zh) * 2017-12-13 2020-01-17 谷歌有限责任公司 Method, system, and medium for detecting and transforming rotated video content items

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4890319A (en) * 1984-09-21 1989-12-26 Scientific-Atlantic, Inc. Method for controlling copying of protected information transmitted over a communications link
US20030061489A1 (en) * 2001-08-31 2003-03-27 Pelly Jason Charles Embedding data in material
US20040212853A1 (en) * 1998-09-23 2004-10-28 Xerox Corporation Electronic image registration for a scanner
WO2005081829A2 (fr) * 2004-02-26 2005-09-09 Mediaguide, Inc. Method and apparatus for automatic detection and identification of broadcast audio or video programming signal

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4890319A (en) * 1984-09-21 1989-12-26 Scientific-Atlantic, Inc. Method for controlling copying of protected information transmitted over a communications link
US20040212853A1 (en) * 1998-09-23 2004-10-28 Xerox Corporation Electronic image registration for a scanner
US20030061489A1 (en) * 2001-08-31 2003-03-27 Pelly Jason Charles Embedding data in material
WO2005081829A2 (fr) * 2004-02-26 2005-09-09 Mediaguide, Inc. Method and apparatus for automatic detection and identification of broadcast audio or video programming signal

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9430472B2 (en) 2004-02-26 2016-08-30 Mobile Research Labs, Ltd. Method and system for automatic detection of content
EP3113075A1 (fr) * 2015-06-30 2017-01-04 Thomson Licensing Method and apparatus for finding a matching image in a set of given images
CN110709841A (zh) * 2017-12-13 2020-01-17 谷歌有限责任公司 Method, system, and medium for detecting and transforming rotated video content items
CN110709841B (zh) * 2017-12-13 2023-09-12 谷歌有限责任公司 Method, system, and medium for detecting and transforming rotated video content items

Similar Documents

Publication Publication Date Title
US20090006337A1 (en) Method and apparatus for automatic detection and identification of unidentified video signals
US9785757B2 (en) System for identifying content of digital data
US9508011B2 (en) Video visual and audio query
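CN103975605B (zh) Watermark extraction based on tentative watermarks
CN101379513B (zh) A method for automatically generating a mosaic image
US20120143915A1 (en) Content-based video copy detection
US20170155933A1 (en) Security and/or tracing video media-content
WO2009106998A1 (fr) Frame sequence comparison in multimedia streams
EP2559237A1 (fr) Platform-independent interactivity with media broadcasts
AU2005216057A1 (en) Method and apparatus for automatic detection and identification of broadcast audio or video programming signal
CN110853033A (zh) Video detection method and apparatus based on inter-frame similarity
WO2015067964A2 (fr) Identification of media components
CN109120949B (zh) Video message push method, apparatus, device, and storage medium for a video collection
WO2012093407A2 (fr) Logo identification
CN103841438A (zh) Information push method, information push system, and digital television receiving terminal
CN110457974A (zh) Image superposition method, apparatus, electronic device, and readable storage medium
WO2008106465A1 (fr) Method and apparatus for automatic detection and identification of unidentified video signals
CN103455966A (zh) Digital watermark embedding apparatus, digital watermark embedding method, and digital watermark detecting apparatus
CN112560552A (zh) Method and apparatus for video classification
CA2760414C (fr) Content-based video copy detection
KR102308303B1 (ko) Apparatus and method for filtering harmful video files
CN117014649A (zh) Video processing method, apparatus, and electronic device
CN110677692B (zh) Video decoding method and apparatus, video encoding method and apparatus
CN113569719A (zh) Video infringement determination method, apparatus, storage medium, and electronic device
CN117176979B (zh) Content frame extraction method, apparatus, device, and storage medium for multi-source heterogeneous video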

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08730779

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC, EPO FORM 1205A SENT ON 22/12/09

122 Ep: pct application non-entry in european phase

Ref document number: 08730779

Country of ref document: EP

Kind code of ref document: A1