CN1252982C - Method and apparatus for reducing false positives in cut detection - Google Patents

Method and apparatus for reducing false positives in cut detection

Info

Publication number
CN1252982C
CN1252982C CNB008070067A CN00807006A
Authority
CN
China
Prior art keywords
frame
brightness
value
scene
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB008070067A
Other languages
Chinese (zh)
Other versions
CN1349711A (en)
Inventor
T. McGee
N. Dimitrova
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke Philips NV
Original Assignee
Koninklijke Philips Electronics NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips Electronics NV filed Critical Koninklijke Philips Electronics NV
Publication of CN1349711A publication Critical patent/CN1349711A/en
Application granted granted Critical
Publication of CN1252982C publication Critical patent/CN1252982C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/11 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information not detectable on the record carrier
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • G06F16/785 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using colour or luminescence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • G06F16/7864 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using domain-transform features, e.g. DCT or wavelet transform coefficients
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/49 Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/28 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/142 Detection of scene cut or scene change
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/179 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a scene or a shot
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/87 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving scene cut or scene change detection in combination with video compression
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/14 Picture signal circuitry for video frequency region
    • H04N5/147 Scene change detection
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B2220/00 Record carriers by type
    • G11B2220/60 Solid state media
    • G11B2220/65 Solid state media wherein solid state memory is used for storing indexing information or metadata

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Library & Information Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Television Signal Processing For Recording (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Closed-Circuit Television Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A video indexing method and device for selecting keyframes from each detected scene in the video. The method and device determine whether a scene change has occurred between two frames of video or whether the change between the two frames is merely a uniform change in luminance values.

Description

Method and apparatus for reducing false positives in cut detection
Technical field
The present invention relates generally to a device that detects significant scenes in a source video and selects representative keyframes from them. More specifically, the invention relates to determining whether a detected scene change is a true scene change or merely a uniform change in image luminance, such as the uniform luminance change caused by camera flashes in settings such as news broadcasts.
Background art
Users often record home videos, television programs, movies, concerts, sporting events, and the like on tape for later or repeated viewing. However, a user may not label what is on a recorded tape, may not remember what was recorded, or may not remember where on the tape a particular scene, movie, or event was recorded. The user may therefore have to sit down and watch the entire tape to recall what it contains.
Video content analysis devices use automatic and semi-automatic methods to extract information that describes the content of recorded material. Video content indexing and analysis devices extract structure and meaning from visual cues in the video. Typically, a video clip is taken from a television program or a home video by selecting frames that reflect the different scenes in the video.
In the scene change detection system described by Hongjiang Zhang, Chien Yong Low and Stephen W. Smoliar in "Video Parsing and Browsing Using Compressed Data", Multimedia Tools and Applications, pp. 89-111 (1995), corresponding image blocks of two video frames are compared, and differences are accumulated block by block rather than over the whole frame. A scene change is detected if a certain number of blocks have changed between the two frames. However, if the differences between corresponding blocks of the two frames are approximately the same in color or luminance, Zhang's detection system may produce a distorted result: it may detect a scene change when, in fact, only a camera flash has occurred during a news broadcast.
Summary of the invention
A system is needed that can create a visual index for a previously recorded video source or for a video source that is being recorded, that is both convenient and more accurate in selecting significant keyframes, and that provides a useful amount of information to the user. Such a system should detect scene changes and select a keyframe from each scene, but when the change between two frames is in fact only a substantially uniform change in the luminance of the blocks or macroblocks, the detected scene change and the corresponding keyframe selection should be discarded.
It is an object of the present invention to compare two video frames in order to detect a scene change; however, if the difference between the two frames is only a substantially uniform change in luminance, the invention determines that no scene change has been detected.
It is another object of the present invention to compare the corresponding direct current (DC) coefficients of the blocks in the two frames. If substantially all of the corresponding DC coefficients of the blocks in the frames have changed by approximately the same amount, it is determined that no scene change has occurred and another keyframe is not selected.
According to a first aspect of the invention, there is provided a video indexing system for detecting scene changes and selecting a keyframe for each scene, the system comprising:
a) a scene change detector (230) for detecting a scene change between two video frames; and
b) a detection system for detecting a uniform luminance change between the two frames, the detection system comprising:
i) a receiver (210, 202) for receiving a source video in which each frame comprises luminance values; and
ii) a comparator (230, 240) for comparing the luminance values in a first frame with the respective luminance values in a second frame and detecting whether substantially all of the luminance values in the first frame have changed by substantially the same amount relative to all of the luminance values in the second frame;
wherein the detection system receives the two video frames once a scene change has been detected and determines whether the difference between the two frames is in fact merely a uniform change in luminance.
According to a second aspect of the invention, there is provided a method for identifying false positives in scene change detection, comprising: receiving at least two video frames, each frame having luminance values, between which a scene change from the first frame to the second frame has been detected; comparing each luminance value in the first frame with the corresponding luminance value in the second frame; and determining whether substantially all of the luminance values in the first frame have changed by substantially the same amount relative to all of the luminance values in the second frame, and if so, judging that a false scene change has been detected between the two frames.
For a better understanding of the present invention, its operating advantages, and the specific objects attained by its use, reference should be made to the accompanying drawings and the accompanying description, in which preferred embodiments of the invention are illustrated and described.
Description of drawings
For a better understanding, the following figures are described:
Fig. 1 illustrates a video archival process;
Fig. 2A and Fig. 2B are block diagrams of the devices used in creating a visual index according to a preferred embodiment of the present invention;
Fig. 3 illustrates a frame, a macroblock, and several blocks;
Fig. 4 illustrates several DCT coefficients of a block;
Fig. 5 illustrates a macroblock and several blocks with DCT coefficients; and
Fig. 6 illustrates frames of a video stream in which a change in luminance has occurred.
Detailed description of the embodiments
Video content indexing involves two phases: archival and retrieval. During the archival process, the content of the video is analyzed in a video analysis process and a visual index is created. During the video analysis process, automatic significant scene detection, uniform luminance change detection, and keyframe selection are performed. Significant scene detection is a process of identifying scene changes, i.e., "cuts" (video cut detection or segmentation detection), and identifying static scenes (static scene detection). For each detected scene, a particular representative frame called a keyframe is extracted. It is therefore important to correctly identify when a scene change occurs; otherwise too many keyframes may be selected for a single scene, or not enough keyframes may be selected to cover multiple scene changes. Uniform luminance change detection is a process of identifying a change in luminance between two frames, and is described in more detail below. (A source tape is used here as a convenient reference, but the source video could also come from a file, a disk, a DVD, another storage device, or directly from a transmission source, for example while recording a home video.)
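As a concrete illustration of what the archival phase produces, the following sketch shows one possible shape for a visual-index entry (one keyframe per detected scene). The structure and field names are illustrative assumptions only and are not taken from the patent.

    from dataclasses import dataclass

    @dataclass
    class IndexEntry:
        """One entry of the visual index: a detected scene and its keyframe."""
        scene_start_frame: int   # frame number where the scene begins
        scene_end_frame: int     # frame number where the scene ends
        keyframe_number: int     # frame chosen to represent the scene
        is_static_scene: bool    # True if the scene was detected as static

    # The archival phase appends one IndexEntry per accepted scene change;
    # the retrieval phase later presents the stored keyframes to the user.
    visual_index: list[IndexEntry] = []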
Fig. 1 illustrates a video archival process applied to a source tape with previously recorded source video, which may include audio and/or text; a similar process can also be applied to other previously stored visual information, such as an MPEG file. In this process, a visual index is created based on the source video. A second process, for video being recorded by the user, creates a visual index simultaneously with the recording.
Fig. 1 shows an example of the first process (for a source tape on which a program has previously been recorded). In step 101, the source video tape is rewound, if necessary, by a playback/recording device such as a VCR (video cassette recorder). In step 102, the source video tape is played back. The signal from the source video tape is received by a television, a VCR, or another processing device. In step 103, a media processor or an external processor in the processing device receives the video signal and converts the video signal into frames of pixel data (frame grabbing).
In step 104, a host processor separates each frame into blocks and transforms the blocks and their associated data to produce DCT (discrete cosine transform) coefficients; performs significant scene detection, uniform luminance change detection, and keyframe selection; and builds and stores the keyframes as a data structure in a memory, disk, or other storage medium. In step 105, the source tape is rewound to its beginning, and in step 106 the source tape is set to record information. In step 107, the data structure is transferred from the memory onto the source tape, creating the visual index. The tape may then be rewound to view the visual index. (If a tape is not used, any other storage medium may be used, or the index may be stored and/or generated on a server.)
When the user wishes to create a visual index while recording, the above process changes slightly. Instead of steps 101 and 102, the video (movie, etc.) is recorded in step 112 shown in Fig. 1, and frame grabbing is performed as in step 103.
Steps 103 and 104 are shown in more detail in Fig. 2A and Fig. 2B. A video signal exists in analog form (continuous data) or digital form (discrete data). This example operates in the digital domain, so digital processing is applied. A source video or video signal is a series of individual images or video frames displayed at a sufficiently high rate (30 frames per second in this example) so that the displayed sequence appears as a continuous stream of images. The video frames may be uncompressed data (NTSC or raw video) or compressed data in a format such as MPEG, MPEG-2, MPEG-4, or M-JPEG (Motion JPEG).
Uncompressed video is first segmented into frames in the media processor 202 using a frame-grabbing technique 204 such as that found in the Intel Smart Video Recorder III. In this example, shown in Fig. 3, a frame 302 represents one television, video, or other visual image and comprises 352 x 240 pixels, although other frame sizes may be used.
In the host processor 210 (Fig. 2A), each frame 302 is divided into blocks 304 of, in this example, 8 x 8 pixels. From these blocks 304 and the current broadcast standard, CCIR-601, a macroblock creator (Fig. 2A) produces luminance blocks and subsamples the chrominance information to produce chrominance blocks. The luminance and chrominance blocks form a macroblock 308. In this example the 4:2:0 format is used, although one skilled in the art could readily use other formats such as 4:1:1 or 4:2:2. In 4:2:0, a macroblock 308 comprises six blocks: four luminance blocks Y1, Y2, Y3 and Y4, and two chrominance blocks Cr and Cb, each block being 8 x 8 pixels.
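For concreteness, the short sketch below (Python is used here only as illustrative pseudocode) counts the 8 x 8 blocks and 16 x 16 macroblocks in a 352 x 240 frame and records how a 4:2:0 macroblock groups its six blocks, as described above; the variable names are assumptions, not taken from the patent.

    FRAME_W, FRAME_H = 352, 240   # example frame size from the text
    BLOCK = 8                     # blocks are 8 x 8 pixels
    MB = 16                       # a macroblock covers 16 x 16 luminance pixels

    blocks_per_frame = (FRAME_W // BLOCK) * (FRAME_H // BLOCK)   # 44 * 30 = 1320
    macroblocks_per_frame = (FRAME_W // MB) * (FRAME_H // MB)    # 22 * 15 = 330

    # In the 4:2:0 format each macroblock holds six 8 x 8 blocks:
    # four luminance blocks (Y1..Y4) plus one subsampled Cr and one Cb block.
    macroblock_layout = {"Y": 4, "Cr": 1, "Cb": 1}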
The video signal may also represent a compressed image, using a compression standard such as the M-JPEG (Motion JPEG, based on the Joint Photographic Experts Group standard) or MPEG (Moving Picture Experts Group) standard. If the signal is an MPEG or other compressed signal, then as shown in Fig. 2B, the MPEG signal is broken into frames using a frame or bitstream parsing technique 205. The frames are then sent to an entropy decoder 214 and a table specifier 216 in the media processor 203. The entropy decoder 214 decodes the MPEG signal, using the data from the table specifier 216, with, for example, Huffman decoding or another decoding technique.
The decoded signal is next supplied to a dequantizer 218, which dequantizes the decoded signal using data from the table specifier 216. Although these steps (steps 214-218) are shown in Fig. 2B as occurring in the media processor 203, depending on the devices used they may instead be performed in the media processor 203, in the host processor 211, or even in another external device.
Alternatively, if a system has encoding capability (for example, in the media processor) that allows access to different stages of the processing, the DCT coefficients may be delivered directly to the host processor. In all of these approaches, the processing can be performed in real time.
In step 104 of Fig. 1, the host processor 210, which may be, for example, an Intel Pentium™ chip or another processor or multiprocessor, a Philips TriMedia™ chip or another multimedia processor, a computer, an enhanced VCR, a recording/playback device, a television, or any other processor, performs significant scene detection and keyframe selection, and builds and stores a data structure in an index memory such as a hard disk, file, tape, DVD, or other storage medium.
Significant scene detection / uniform luminance change detection: For automatic significant scene detection, the present invention attempts to detect when the scene of a video has changed or when a static scene has occurred. A scene may represent one or more related images. In significant scene detection, two consecutive frames are compared; if the two frames are judged to be significantly different, it is determined that a scene change has occurred between them; if they are judged to be significantly alike, processing is performed to determine whether a static scene has occurred. In uniform luminance change detection, if a scene change is detected, the luminance values of the two frames are compared; if a uniform change in luminance accounts for the main difference between the two frames, it is determined that no scene change has occurred between them.
Fig. 2A shows an example of a host processor 210 with a luminance change detector 240; the DCT blocks are provided by the macroblock creator 206 and a DCT transformer 220. Fig. 2B shows an example of a host processor 211 with a significant scene detector 230 and a luminance change detector 240; the DCT blocks are provided by the dequantizer 218. The significant scene detector 230 detects a scene change between two frames, and the luminance change detector 240 then determines whether a scene change has actually occurred or whether the difference between the two frames is the result of a uniform luminance change. If a scene change has occurred, a keyframe is selected, supplied to the frame memory 234, and then supplied to the index memory 260. If a uniform change in luminance is detected, another keyframe is not selected from the same scene.
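The following is a minimal sketch of the two-stage decision just described: a candidate cut reported by the significant scene detector is turned into a keyframe only if the luminance change detector does not explain it as a uniform brightness shift. The function names scene_change_detected and is_uniform_luminance_change are assumptions used for illustration (the latter is sketched after the threshold discussion below).

    def process_frame_pair(prev_dc, curr_dc, curr_frame_no, visual_index):
        """prev_dc, curr_dc: DC values of corresponding blocks of two
        consecutive frames. Appends a keyframe entry only for true cuts."""
        if not scene_change_detected(prev_dc, curr_dc):
            return                          # same scene; static-scene handling is separate
        if is_uniform_luminance_change(prev_dc, curr_dc):
            return                          # e.g. a camera flash: suppress the false cut
        visual_index.append(curr_frame_no)  # accept the cut; store the frame as a keyframe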
The problem addressed by the present invention is that of comparing two frames and detecting a substantial difference between them. There are many reasons why such a significant difference may arise that are not caused by a scene change. For example, the video signal may be a news broadcast, with the videographer recording a press conference. At the press conference, many camera flashes go off, causing the luminance to change between two frames. After detecting the uniform change in luminance, the present invention treats the frames as images of the same scene, rather than detecting a scene change and selecting another keyframe. Similarly, if the lights in a room are turned on, or lights flash in a discotheque, this should not be detected as a scene change, because the difference between the two frames is merely a uniform change in luminance.
This method and device use a comparison of DCT (discrete cosine transform) coefficients to detect a uniform change in luminance, although other methods could be used. First, each received frame 302 is processed individually in the host processor 210 to produce blocks 440 of 8 x 8 values. The host processor 210 uses the discrete cosine transformer 220 to process each 8 x 8 block 440, which contains spatial information, to extract the DCT coefficients and build the macroblocks 308.
When the received video signal is in a compressed video format such as MPEG, the DCT coefficients can be extracted after dequantization and need not pass through a discrete cosine transformer. Additionally, as discussed previously, the DCT coefficients may already be available automatically, depending on the devices used.
The DCT transformer provides DCT coefficient values for each of the blocks 440 (Fig. 4), i.e. Y1, Y2, Y3, Y4, Cr and Cb. According to this standard, the upper left corner of each coefficient block contains the DC information (the DC value), and the remaining DCT coefficients contain the AC information (AC values). As partly shown in Fig. 4, the AC values increase in spatial frequency following a zig-zag order, from the DCT coefficient to the right of the DC value to the DCT coefficient below the DC value, and so on. The Y values in Fig. 4 are luminance values.
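When the video is not already DCT-coded, the per-block DC values can be obtained as sketched below: the (0, 0) coefficient of a 2-D DCT of each 8 x 8 luminance block is taken (with orthonormal scaling it is proportional to the block's mean brightness). The use of Python and scipy here is purely illustrative and is not part of the patent.

    import numpy as np
    from scipy.fft import dctn

    def dc_values(luma: np.ndarray, block: int = 8) -> np.ndarray:
        """Return the DC coefficient of every 8 x 8 block of a luminance plane."""
        h, w = luma.shape
        dcs = []
        for y in range(0, h - h % block, block):
            for x in range(0, w - w % block, block):
                coeffs = dctn(luma[y:y + block, x:x + block], norm="ortho")
                dcs.append(coeffs[0, 0])   # upper-left coefficient is the DC value
        return np.asarray(dcs)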
In the method for following, the processing of carrying out be limited to in two frames between the relevant block variation of each DC value detected, producing the result quickly, and this limited processing there is no heavy losses on efficient; Yet clearly, the skilled person in the present technique field also can compare the difference in the brightness between corresponding macro block, or adopts any other method to come sensed luminance to change.
According to the method and apparatus of the preferred embodiment of the present invention, whether a substantially uniform change in luminance has taken place is determined by comparing the corresponding DC values of the blocks of the two frames.
Let n be the number of blocks in a frame. Let F1 be the first frame and F2 the second frame, with F1[i] the i-th block in the first frame and F2[i] the i-th block in the second frame. Further, let diffmin initially be set to a high value, such as 1,000,000, and diffmax initially be set to a low value, such as -9,000,000. The following comparison is then made:
For i = 0 to n
    diff = ABS(F1[i] - F2[i])
    If diff < diffmin then diffmin = diff
    If diff > diffmax then diffmax = diff
    i = i + 1
End
If (diffmax - diffmin) < threshold, then no scene change has occurred.
The above computation calculates the absolute value of the difference between two DC coefficients, one being the DC coefficient of a block in the first frame and the other the DC coefficient of the corresponding block in the second frame. The difference is then compared with diffmin and diffmax to find the minimum and maximum differences between corresponding DC coefficients of the two frames. If the difference between the maximum difference (diffmax) and the minimum difference (diffmin) is less than some threshold, then all of the DC values have changed by approximately the same amount, which indicates a change in luminance. In a preferred embodiment of the present invention, the threshold is chosen anywhere between 0 and 10% of the final diffmax value, although the threshold may vary depending on the application.
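The following is a compact sketch of the diffmin/diffmax test above, assuming the per-block DC values of the two frames are already available (for example from the dc_values helper sketched earlier). The 10% default mirrors the preferred 0 to 10% threshold range, but the exact value is an adjustable assumption.

    import numpy as np

    def is_uniform_luminance_change(dc_first, dc_second, threshold_ratio=0.10):
        """True if substantially all blocks changed by about the same amount,
        i.e. the detected cut is likely only a uniform brightness change."""
        diffs = np.abs(np.asarray(dc_second, float) - np.asarray(dc_first, float))
        diffmin, diffmax = diffs.min(), diffs.max()
        if diffmax == 0:                  # identical DC values: nothing changed at all
            return True
        # All DC values shifted by roughly the same amount when the spread
        # (diffmax - diffmin) is below a threshold chosen as 0-10% of diffmax.
        return (diffmax - diffmin) < threshold_ratio * diffmax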
If it is determined that a uniform change in luminance has occurred between the two frames, a keyframe is not selected for the two-frame sequence. It should be noted that other methods of detecting a luminance change can be applied, such as histogram methods or wavelet transforms; the present invention is not limited to the embodiments described above. A comparison between the rate of luminance change and the rate of chrominance change could be used to determine a luminance change, or any other formula could be used for that determination.
Figs. 6A to 6D show two scenarios in which a scene change would be detected although the difference between the two frames is a change in luminance. Fig. 6A is an example image at the moment of a camera flash, and Fig. 6B shows the same image after the camera flash. Similarly, Fig. 6C shows a discotheque scene when the lights are off, and Fig. 6D shows the same scene when the lights are on.
DCT coefficients have been used in the description of the invention, but other representative values could be substituted, such as wavelet coefficients, histograms, or a function applied to a region of the image that provides a representative value for that region. Furthermore, the invention has been described with reference to a video indexing system; however, since it generally detects a uniform change in luminance between two frames, it could also be used as a retrieval device that finds the scenes in which a camera flash occurs, or, in another application, as an archival method that gathers representative frames.
Although the invention has been described in connection with preferred embodiments, it will be appreciated that modifications within the principles outlined above will be apparent to those skilled in the art; therefore, the invention is not limited to the preferred embodiments and is intended to cover such modifications.

Claims (12)

1. A video indexing system for detecting scene changes and selecting a keyframe for each scene, the system comprising:
a) a scene change detector (230) for detecting a scene change between two video frames; and
b) a detection system for detecting a uniform luminance change between the two frames, the detection system comprising:
i) a receiver (210, 202) for receiving a source video in which each frame comprises luminance values; and
ii) a comparator (230, 240) for comparing the luminance values in a first frame with the respective luminance values in a second frame and detecting whether substantially all of the luminance values in the first frame have changed by substantially the same amount relative to all of the luminance values in the second frame;
wherein the detection system receives the two video frames once a scene change has been detected and determines whether the difference between the two frames is in fact merely a uniform change in luminance.
2. The system of claim 1, wherein the luminance values are converted into the form of discrete cosine transform coefficients.
3. The system of claim 1, wherein the luminance values are converted into the form of wavelet coefficients.
4. The system of claim 1, wherein the luminance values are converted into the form of histogram values.
5. The system of claim 1, wherein the comparator (230, 240) further calculates a maximum difference between corresponding luminance values of the first frame and the second frame and a minimum difference between corresponding luminance values of the first frame and the second frame, and then compares the absolute value of the difference between the maximum difference and the minimum difference with a threshold to determine whether a uniform luminance change has occurred.
6. The system of claim 5, wherein the threshold is in the range of 0 to 10% of the maximum difference.
7. A method for identifying false positives in scene change detection, comprising:
receiving at least two video frames, each frame having luminance values, between which a scene change from the first frame to the second frame has been detected;
comparing each luminance value in the first frame with the corresponding luminance value in the second frame; and
determining whether substantially all of the luminance values in the first frame have changed by substantially the same amount relative to all of the luminance values in the second frame, and if so, judging that a false scene change has been detected between the two frames.
8. The method of claim 7, wherein the luminance values are converted into discrete cosine transform coefficient form.
9. The method of claim 7, wherein the luminance values are converted into wavelet coefficient form.
10. The method of claim 7, wherein the luminance values are converted into histogram value form.
11. The method of claim 7, further comprising the steps of:
calculating a maximum difference between corresponding luminance values of the first frame and the second frame, and a minimum difference between corresponding luminance values of the first frame and the second frame; and
comparing the absolute value of the difference between the maximum difference and the minimum difference with a threshold to determine whether a uniform luminance change has occurred.
12. The method of claim 11, wherein the threshold is 0 to 10% of the maximum difference.
CNB008070067A 1999-12-30 2000-12-15 Method and apparatus for reducing false positives in cut detection Expired - Fee Related CN1252982C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US47708599A 1999-12-30 1999-12-30
US09/477085 1999-12-30

Publications (2)

Publication Number Publication Date
CN1349711A CN1349711A (en) 2002-05-15
CN1252982C true CN1252982C (en) 2006-04-19

Family

ID=23894478

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB008070067A Expired - Fee Related CN1252982C (en) 1999-12-30 2000-12-15 Method and apparatus for reducing false positives in cut detection

Country Status (4)

Country Link
EP (1) EP1180307A2 (en)
JP (1) JP2003519971A (en)
CN (1) CN1252982C (en)
WO (1) WO2001050737A2 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6766098B1 (en) 1999-12-30 2004-07-20 Koninklijke Philip Electronics N.V. Method and apparatus for detecting fast motion scenes
US7333712B2 (en) 2002-02-14 2008-02-19 Koninklijke Philips Electronics N.V. Visual summary for scanning forwards and backwards in video content
MXPA06002837A (en) 2003-09-12 2006-06-14 Nielsen Media Res Inc Digital video signature apparatus and methods for use with video program identification systems.
WO2005091050A1 (en) 2004-03-12 2005-09-29 Koninklijke Philips Electronics N.V. Multiview display device
KR100825737B1 (en) * 2005-10-11 2008-04-29 한국전자통신연구원 Method of Scalable Video Coding and the codec using the same
CN100428801C (en) * 2005-11-18 2008-10-22 清华大学 Switching detection method of video scene
CN102724385B (en) * 2012-06-21 2016-05-11 浙江宇视科技有限公司 A kind of Intelligent video analysis method and device
CN108769458A (en) * 2018-05-08 2018-11-06 东北师范大学 A kind of deep video scene analysis method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2231746B (en) * 1989-04-27 1993-07-07 Sony Corp Motion dependent video signal processing
US5969755A (en) * 1996-02-05 1999-10-19 Texas Instruments Incorporated Motion based event detection system and method
US5767922A (en) * 1996-04-05 1998-06-16 Cornell Research Foundation, Inc. Apparatus and process for detecting scene breaks in a sequence of video frames
US5920360A (en) * 1996-06-07 1999-07-06 Electronic Data Systems Corporation Method and system for detecting fade transitions in a video signal
US6137544A (en) * 1997-06-02 2000-10-24 Philips Electronics North America Corporation Significant scene detection and frame filtering for a visual indexing system

Also Published As

Publication number Publication date
JP2003519971A (en) 2003-06-24
WO2001050737A2 (en) 2001-07-12
CN1349711A (en) 2002-05-15
EP1180307A2 (en) 2002-02-20
WO2001050737A3 (en) 2001-11-15

Similar Documents

Publication Publication Date Title
JP4749518B2 (en) Visible indexing system
JP4256940B2 (en) Important scene detection and frame filtering for visual indexing system
US6496228B1 (en) Significant scene detection and frame filtering for a visual indexing system using dynamic thresholds
US6697523B1 (en) Method for summarizing a video using motion and color descriptors
JP4667697B2 (en) Method and apparatus for detecting fast moving scenes
Kobla et al. Detection of slow-motion replay sequences for identifying sports videos
US20080267290A1 (en) Coding Method Applied to Multimedia Data
AU2007231756B2 (en) A method of segmenting videos into a hierachy of segments
EP1319230A1 (en) An apparatus for reproducing an information signal stored on a storage medium
Faernando et al. Scene change detection algorithms for content-based video indexing and retrieval
CN1237793C (en) Method for coding motion image data and its device
Nakajima A video browsing using fast scene cut detection for an efficient networked video database access
CN1252982C (en) Method and apparatus for reducing false positives in cut detection
Lie et al. News video summarization based on spatial and motion feature analysis
JP2002064823A (en) Apparatus and method for detecting scene change of compressed dynamic image as well as recording medium recording its program

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee