KR101706347B1 - Method for shot boundary detection, and image processing apparatus and method implementing the same method - Google Patents
- Publication number
- KR101706347B1 (application KR1020150115910A, publication KR20150115910A)
- Authority
- KR
- South Korea
- Prior art keywords
- extracting
- pixel
- value
- binary value
- bits
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/40—Image enhancement or restoration by the use of histogram techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
Abstract
A method of detecting a shot boundary in an image, the method comprising: extracting a feature value including shape information and color information for each pixel; generating a histogram of each frame using the pixel-specific feature values; and extracting a shot boundary from the frames based on the histogram difference between adjacent frames.
Description
The present invention relates to image processing.
Today, digital video is widely used not only for playback services but also for archiving as recorded heritage and as material for intelligent services. However, video analysis and processing are not easy because users must analyze and process large amounts of video through sequential playback. To solve this problem, scene and shot boundary detection methods are being studied.
The method of detecting shot boundaries in video calculates differences between consecutive frames and treats a frame as a shot boundary if the difference is large enough. The difference between frames can be calculated by pixel-based, block-based, histogram-based, edge-based, and motion-vector-based methods, among others. The pixel-based method detects a shot boundary using the differences between corresponding pixel values in neighboring frames, and has the disadvantage of being very sensitive to noise and object motion. The histogram-based method calculates histograms using the color information of neighboring frames and then computes the difference between the histograms. It is more robust to noise and object motion than the pixel-based method, but has the disadvantage that an abrupt change in the color distribution of frames within the same shot is erroneously detected as a new shot boundary. Therefore, a method is needed to compensate for the histogram-based method's sensitivity to object state changes.
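As a rough illustration of the histogram-based approach (not code from the patent), a per-frame color histogram and a difference between adjacent frames might be sketched as follows; the bin count (4 levels per channel) and the L1 distance are assumptions for the sketch:

```python
import numpy as np

def color_histogram(frame):
    # frame: H x W x 3 uint8 image. Quantize each channel to 4 levels,
    # giving a 64-bin joint color histogram, then normalize it.
    hist, _ = np.histogramdd(frame.reshape(-1, 3).astype(float),
                             bins=(4, 4, 4), range=((0, 256),) * 3)
    return hist.ravel() / hist.sum()

def histogram_difference(frame_a, frame_b):
    # L1 distance between the normalized histograms of two frames.
    return float(np.abs(color_histogram(frame_a) - color_histogram(frame_b)).sum())
```

Identical frames yield a difference of 0, while frames with disjoint color distributions approach the maximum L1 distance of 2.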
SUMMARY OF THE INVENTION It is an object of the present invention to provide a shot boundary detection method, and an image processing apparatus and method implementing the same.
A method for detecting a shot boundary in an image according to an embodiment of the present invention includes extracting feature values including shape information and color information for each pixel, generating histograms of each frame using the feature values for each pixel, and extracting a shot boundary among the frames based on a histogram difference between adjacent frames.
The extracting of the feature value may include extracting a first binary value representing the shape information of the pixel, extracting a second binary value representing the color information of the pixel, and combining the first binary value and the second binary value to generate a feature value of the pixel.
The extracting of the first binary value may extract the first binary value using local binary patterns.
The step of extracting the first binary value may include circularly shifting the binary code generated by the local binary pattern bit by bit, taking the minimum value as a rotation-invariant code, and extracting the rotation-invariant code as the first binary value.
The extracting of the second binary value may include extracting some bits from the color channels representing red, green, and blue of the pixel, and combining the extracted bits to generate the second binary value.
The bits may be the upper two bits of each color channel and the second binary value may be six bits.
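A minimal sketch of this 6-bit color code follows; the function name and bit ordering (R7 R6 G7 G6 B7 B6, matching the layout described later for FIG. 6) are illustrative, not taken from the patent text:

```python
def color_code(r, g, b):
    # Pack the upper two bits (bits 7 and 6) of each 8-bit channel
    # into a 6-bit value ordered R7 R6 G7 G6 B7 B6.
    return ((r >> 6) << 4) | ((g >> 6) << 2) | (b >> 6)
```

For example, a saturated red pixel (255, 0, 0) maps to binary 110000.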
The step of extracting the shot boundary may include calculating an average histogram difference of the frames before and after a reference frame as a threshold value, and extracting the reference frame as a shot boundary if the histogram difference between the reference frame and a previous frame of the reference frame is greater than the threshold value.
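This adaptive-threshold rule can be sketched as follows. The window size is an assumption, since the text does not state how many frames before and after the reference frame enter the average:

```python
def detect_shot_boundaries(diffs, window=2):
    # diffs[i] is the histogram difference between frame i and frame i-1.
    # Frame i is taken as a shot boundary when its difference exceeds the
    # average difference of up to `window` frames before and after it.
    boundaries = []
    for i in range(len(diffs)):
        lo, hi = max(0, i - window), min(len(diffs), i + window + 1)
        neighbors = diffs[lo:i] + diffs[i + 1:hi]
        if neighbors and diffs[i] > sum(neighbors) / len(neighbors):
            boundaries.append(i)
    return boundaries
```

A lone spike in the difference sequence is flagged as a boundary, while a flat sequence yields none.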
According to another embodiment of the present invention, there is provided a method of processing an image, the method including dividing the image into shots or scenes, which are sets of similar frames, based on a histogram difference between adjacent frames, and recognizing objects in a representative frame of each shot or scene. The histogram difference may be calculated based on a feature value of a pixel including shape information.
The method may further include extracting a feature value including shape information and color information for each pixel, generating a histogram of each frame using the pixel-specific feature values, and calculating a histogram difference between adjacent frames.
The shape information may be extracted using local binary patterns.
In the extracting of the feature value, the binary code extracted by the local binary pattern may be shifted bit by bit to obtain a rotation-invariant code, and the rotation-invariant code may be included in the feature value.
The step of dividing into shots or scenes may include calculating an average histogram difference of frames before and after a reference frame as a threshold value, and extracting the reference frame as a boundary of a shot or a scene if the histogram difference between the reference frame and a previous frame of the reference frame is greater than the threshold.
The image may be an image photographed on a table, and the object may be a physical object placed on the table.
According to another aspect of the present invention, there is provided an image processing apparatus including a memory for storing a program and a processor for executing the program in cooperation with the memory. The program includes instructions for extracting a feature value including shape information and color information for each pixel of an input image, generating a histogram of each frame using the pixel-specific feature values, and extracting a shot boundary among the frames based on the difference between histograms of adjacent frames.
The program may include instructions for extracting a representative frame from a set of similar frames classified based on the extracted shot boundary, and performing an operation of recognizing the object in the representative frame.
According to the embodiments of the present invention, shot boundaries can be detected by a method more robust to environmental changes such as noise, illumination changes, and object state changes than methods using only color information. Even if a state change such as the appearance, movement, or disappearance of a physical object on a table occurs, the affected frames can be recognized as similar frames. Therefore, the performance of a remote collaboration system can be improved, and a highly satisfactory remote collaboration service can be provided through table sharing among remote sites.
1 is a diagram for explaining a hierarchical structure of video.
2 is a flowchart of a shot boundary detection method according to an embodiment of the present invention.
3 is a flowchart of a method of generating a feature value of a pixel according to an embodiment of the present invention.
4 is a view for explaining a method of extracting morphological information for a feature value according to an embodiment of the present invention.
5 is a diagram illustrating a color information extraction method for a feature value according to an embodiment of the present invention.
6 is a diagram illustrating a structure of a feature value according to an embodiment of the present invention.
7 is a block diagram of a remote collaboration system according to an embodiment of the present invention.
8 is a flowchart of an image processing method according to an embodiment of the present invention.
9 is an illustration of an image for evaluating the performance of an image processing apparatus according to an embodiment of the present invention.
10 is a graph showing experimental results according to an embodiment of the present invention.
11 is a diagram schematically showing a hardware structure of an image processing apparatus according to an embodiment of the present invention.
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily carry out the present invention. The present invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. In order to clearly illustrate the present invention, parts not related to the description are omitted, and similar parts are denoted by like reference characters throughout the specification.
Throughout the specification, when an element is referred to as "comprising" another element, it can include other elements as well, unless specifically stated otherwise. Also, the terms "part" and "module" in the specification mean a unit for processing at least one function or operation, and may be implemented by hardware, software, or a combination of hardware and software.
1 is a diagram for explaining a hierarchical structure of video.
Referring to FIG. 1, a video has a hierarchical structure in which similar frames are gathered to form shots, shots of a similar meaning group are gathered to form a scene, and a set of scenes forms a cluster.
The shot/scene boundary detection method is used variously: for analyzing the hierarchical structure of video, for indexing for video summarization and searching, for shot-based/scene-based editing, and for saving processing.
Among the methods of detecting shot/scene boundaries in a video is the histogram-based detection method, which calculates histograms using the color information of adjacent frames and then computes the difference between the histograms. Its disadvantage is that if the color distribution of frames changes suddenly within the same shot/scene, the change is erroneously detected as a new shot/scene boundary.
The following describes a shot/scene boundary detection method robust to object state changes. The description below uses "shot" boundary detection, where a shot is a sub-concept of a scene, but the method can also be used for scene boundary detection.
2 is a flowchart of a shot boundary detection method according to an embodiment of the present invention.
Referring to FIG. 2, the image processing apparatus extracts a feature value including shape information and color information for each pixel of an input frame.
The image processing apparatus generates a histogram of each frame using the pixel-specific feature values.
The image processing apparatus extracts a shot boundary among the frames based on the histogram difference between adjacent frames.
FIG. 3 is a flowchart of a method of generating a feature value of a pixel according to an exemplary embodiment of the present invention, FIG. 4 is a diagram illustrating a method of extracting shape information for a feature value, FIG. 5 is a diagram illustrating a method of extracting color information for a feature value, and FIG. 6 is a diagram illustrating the structure of a feature value.
The color histogram-based shot boundary detection method is sensitive to the state changes (movements) of the physical objects included in the image. To solve this problem, the present invention obtains a histogram of a frame based on a feature value combining color information and shape information. The feature value of each pixel is generated as follows.
Referring to FIG. 3, the image processing apparatus extracts a first binary value representing the shape information of a pixel, extracts a second binary value representing the color information of the pixel, and combines the two binary values to generate the feature value of the pixel. The shape information can be extracted using local binary patterns (LBP).
Referring to FIG. 4, in the case of a 3x3 block, a binary code (for example, 11101010) is generated by comparing the value of the center pixel with the values of its neighboring pixels and assigning 0 or 1 to each neighbor. The binary code can be converted to the decimal number 234 for histogram generation.
The LBP code is generated as shown in Equation (3). In Equation (3), i_c is the pixel value of the center pixel (x_c, y_c), and i_p is one of the eight neighboring pixel values around the center pixel. T is a threshold value used to generate a noise-robust LBP code, and can be selected through experiment, for example, T = 2.
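Under the assumption of a clockwise neighbor ordering starting from the top-left pixel and the comparison i_p >= i_c + T (details the translation leaves ambiguous), the 3x3 LBP code of FIG. 4 can be sketched as:

```python
def lbp_code(block, t=2):
    # block: 3x3 list of pixel values; t: noise threshold (for example, 2).
    # Each neighbor at least t brighter than the center contributes a 1 bit.
    c = block[1][1]
    neighbors = [block[0][0], block[0][1], block[0][2], block[1][2],
                 block[2][2], block[2][1], block[2][0], block[1][0]]
    code = 0
    for p in neighbors:
        code = (code << 1) | (1 if p >= c + t else 0)
    return code
```

With a center value of 5 surrounded by alternating bright and dark neighbors, the block [[9, 9, 9], [0, 5, 0], [9, 0, 9]] yields the binary code 11101010, i.e. decimal 234 as in the figure's example.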
The 8-bit LBP code changes when the image rotates. To make the code robust to rotation, the image processing apparatus circularly shifts the LBP code bit by bit and selects the minimum value among the shifted codes as a rotation-invariant LBP (RILBP) code.
The RILBP code can be represented with fewer bits than the 8-bit LBP code (for example, 6 bits).
The image processing apparatus extracts the RILBP code as the first binary value representing the shape information of the pixel.
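This bitwise minimization can be sketched as follows (a hypothetical helper, assuming an 8-bit code and circular right shifts):

```python
def rilbp_code(lbp, bits=8):
    # Rotation-invariant LBP: the minimum value over all circular
    # rotations of the code, so rotated versions of the same local
    # pattern all map to a single code.
    mask = (1 << bits) - 1
    best = lbp
    for _ in range(bits - 1):
        lbp = ((lbp >> 1) | ((lbp & 1) << (bits - 1))) & mask
        best = min(best, lbp)
    return best
```

All eight rotations of a single set bit collapse to the same code, which is what makes the descriptor robust to in-plane rotation.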
Referring to FIG. 5, the image processing apparatus extracts the upper two bits (bits 7 and 6) of each of the red, green, and blue color channels of a pixel.
The image processing apparatus combines the extracted bits to generate a 6-bit second binary value representing the color information of the pixel.
Referring to FIG. 6, when the feature value is represented in 12 bits, some bits of the 12 bits (e.g., 6 bits) may be the LBP code, and the remaining bits (e.g., 6 bits) may be the color bits (R7, R6, G7, G6, B7, B6). The LBP code may be an RILBP code.
The image processing apparatus generates a histogram of each frame by accumulating the 12-bit feature values of the pixels in the frame.
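Packing the two codes into a 12-bit feature value and accumulating the per-frame histogram might look like the following sketch; the bin count follows the 12-bit layout of FIG. 6, and the function names are illustrative:

```python
import numpy as np

def feature_value(shape6, color6):
    # 12-bit feature: upper 6 bits are the shape code (e.g. RILBP),
    # lower 6 bits are the color code.
    return (shape6 << 6) | color6

def feature_histogram(features):
    # features: flat sequence of 12-bit per-pixel feature values of a
    # frame; returns a normalized 4096-bin histogram.
    hist = np.bincount(np.asarray(features).ravel(), minlength=1 << 12)
    return hist / hist.sum()
```

The histogram difference between adjacent frames can then be computed over these 4096 bins instead of color-only bins.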
FIG. 7 is a configuration diagram of a remote collaboration system according to an embodiment of the present invention, and FIG. 8 is a flowchart of an image processing method according to an embodiment of the present invention.
Referring to FIG. 7, the remote collaboration system shares a table image among remote sites.
Each site includes a table, a camera that photographs the table surface, and an image processing apparatus.
The image processing apparatus detects shot boundaries in the captured video and recognizes the physical objects placed on the table.
The table surface is photographed by a camera, and the photographed table image is transmitted to the image processing apparatus.
Depending on the user's behavior, the same physical object can move within a limited range (the table area). If the histogram difference is calculated based only on color information, the difference may be large when a physical object moves, even though the frames contain the same physical object. However, since the histogram difference in the present invention is calculated based on feature values combining shape information and color information, such frames can be recognized as similar frames.
In particular, since the remote collaboration system involves various technologies, the processor of the computing device needs to operate efficiently. As the number of frames subjected to object recognition increases, more processor resources are required, which hinders efficient operation. Accordingly, the image processing apparatus performs object recognition only on the representative frame of each shot or scene rather than on every frame.
Referring to FIG. 8, the image processing apparatus extracts a feature value including shape information and color information for each pixel, and generates a histogram of each frame using the pixel-specific feature values.
The image processing apparatus calculates the histogram difference between adjacent frames.
The image processing apparatus divides the image into shots or scenes, which are sets of similar frames, based on the histogram difference between adjacent frames.
Thus, the image processing apparatus can recognize objects in the representative frame of each shot or scene.
FIG. 9 is an illustration of an image for evaluating the performance of an image processing apparatus according to an embodiment of the present invention, and FIG. 10 is a graph illustrating experimental results according to an embodiment of the present invention.
Referring to FIG. 9, the example images taken on a table represent situations expected in a remote collaboration environment. FIG. 9A is a reference image, FIG. 9B is an image taken with the illumination changed from the reference image, and FIG. 9C is an image in which a new physical object appears.
Referring to FIG. 10, the histogram differences between FIG. 9A and each of FIGS. 9B, 9C, and 9D can be calculated, and the shot/scene boundary detection performance can be evaluated based on these differences. The histogram differences are calculated with the color histogram-based method, a feature-value-based method combining color information and the LBP code (color + LBP), and a feature-value-based method combining color information and the RILBP code (color + RILBP).
The color histogram-based method is relatively sensitive to illumination and rotation changes compared to the other methods, and its histogram difference is very large when a new object appears. In addition, the color histogram-based method responds sensitively to situations where an existing physical object changes momentarily.
The feature-value-based method combining color information and the LBP code (color + LBP) shows better results than the color histogram-based method for illumination changes, rotation changes, and object appearance.
The feature-value-based method combining color information and the RILBP code (color + RILBP) is more robust to rotation changes than color + LBP. This is because the RILBP code, which uses two fewer bits than the 8-bit LBP code, is invariant to bit rotation.
Table 1 shows the Precision (= detected actual shots / detected shots) and Recall (= detected actual shots / actual shots) of each method on an experimental video. The experimental video consists of situations where three physical objects each appear and disappear once.
As described above, the histogram distance calculation method using feature values that combine color information and shape information, and the shot/scene detection method using it, are more robust to noise, illumination changes, and object state changes than methods using only color information.
11 is a diagram schematically showing a hardware structure of an image processing apparatus according to an embodiment of the present invention.
Referring to FIG. 11, the image processing apparatus includes a memory, a processor, and a storage device, and executes a program implementing the method described above.
The memory stores instructions for carrying out the present invention, or temporarily loads the instructions from the storage device. The processor carries out the present invention by executing the instructions stored in or loaded into the memory.
The embodiments of the present invention described above are not implemented only by the apparatus and method, but may be implemented through a program for realizing the function corresponding to the configuration of the embodiment of the present invention or a recording medium on which the program is recorded.
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, the invention is not limited to the disclosed embodiments; various modifications and improvements made without departing from the spirit of the invention also belong to the scope of the claims.
Claims (15)
Extracting characteristic values including shape information and color information for each pixel,
Generating a histogram of each frame using pixel-specific feature values, and
A step of extracting a shot boundary among frames based on a histogram difference between adjacent frames
The step of extracting the feature value comprises:
Extracting a first binary value representing the shape information of the pixel,
Extracting a second binary value representing the color information of the pixel, and
Combining the first binary value and the second binary value to generate a feature value of the pixel,
The step of extracting the second binary value comprises:
Extracting some bits from the color channels representing red, green, and blue of the pixel, and combining the extracted bits to generate the second binary value,
Wherein the bits are the upper two bits of each color channel and the second binary value is six bits.
The step of extracting the first binary value
Extracting the first binary value using local binary patterns.
The step of extracting the first binary value
Circularly shifting the binary code generated by the local binary pattern bit by bit, taking the minimum value as a rotation-invariant code, and extracting the rotation-invariant code as the first binary value.
The step of extracting the shot boundary
Calculating an average histogram difference of frames before and after a reference frame as a threshold value, and extracting the reference frame as a shot boundary when the histogram difference between the reference frame and a previous frame of the reference frame is greater than the threshold value.
Extracting characteristic values including shape information and color information for each pixel,
Generating a histogram of each frame using pixel-by-pixel feature values,
Calculating a histogram difference between adjacent frames,
Dividing the image into a shot or a scene, which is a set of similar frames, based on the histogram difference between adjacent frames, and
Recognizing objects in representative frames of each shot or each scene
Wherein the histogram difference is calculated based on a feature value of a pixel including shape information,
The step of extracting the feature value comprises:
Extracting a first binary value representing the shape information of the pixel,
Extracting a second binary value representing the color information of the pixel, and
Combining the first binary value and the second binary value to generate a feature value of the pixel,
The step of extracting the second binary value comprises:
Extracting some bits from the color channels representing red, green, and blue of the pixel, and combining the extracted bits to generate the second binary value,
Wherein said some bits are the upper two bits of each color channel and said second binary value is six bits.
Wherein the shape information is extracted using local binary patterns.
The step of extracting the feature value
Wherein the binary code extracted by the local binary pattern is shifted bit by bit to obtain a rotation-invariant code, and the rotation-invariant code is included in the feature value.
The step of dividing into shots or scenes
Comprises calculating an average histogram difference of frames before and after a reference frame as a threshold value, and extracting the reference frame as a boundary of a shot or a scene when the histogram difference between the reference frame and a previous frame of the reference frame is larger than the threshold value.
The image is an image taken on a table,
Wherein the object is a physical object placed on the table.
An image processing apparatus comprising: a memory for storing a program, and a processor for executing the program in cooperation with the memory,
The program
Includes instructions for performing operations of extracting feature values including shape information and color information for each pixel of an input image, generating a histogram of each frame using the feature values for each pixel, and extracting a shot boundary among the frames based on the histogram difference between adjacent frames,
The program
Further comprises instructions for extracting a first binary value expressing the shape information of the pixel, combining some bits extracted from the color channels expressing red, green, and blue of the pixel to generate a second binary value expressing the color information of the pixel, and combining the first binary value and the second binary value to generate a feature value of the pixel,
Wherein said some bits are the upper two bits of each color channel and said second binary value is six bits.
The program
Extracting a representative frame from a set of similar frames divided based on the extracted shot boundary, and performing an operation of recognizing an object in the representative frame.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR20140111019 | 2014-08-25 | ||
KR1020140111019 | 2014-08-25 |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20160024767A KR20160024767A (en) | 2016-03-07 |
KR101706347B1 true KR101706347B1 (en) | 2017-02-14 |
Family
ID=55540195
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020150115910A KR101706347B1 (en) | 2014-08-25 | 2015-08-18 | Method for shot boundary detection, and image processing apparatus and method implementing the same method |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101706347B1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20190121547A (en) | 2018-04-18 | 2019-10-28 | 주식회사 더말코리아 | The sheet mask containing the rosepink metal layer and manufacturing method thereof |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009141508A (en) | 2007-12-04 | 2009-06-25 | Nippon Telegr & Teleph Corp <Ntt> | Television conference device, television conference method, program, and recording medium |
2015
- 2015-08-18: KR application KR1020150115910A granted as patent KR101706347B1, active (IP Right Grant)
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009141508A (en) | 2007-12-04 | 2009-06-25 | Nippon Telegr & Teleph Corp <Ntt> | Television conference device, television conference method, program, and recording medium |
Non-Patent Citations (2)
Title |
---|
KAIST (2011)
Paper 1: Korean Institute of Information Scientists and Engineers (2014)
Also Published As
Publication number | Publication date |
---|---|
KR20160024767A (en) | 2016-03-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106254933B (en) | Subtitle extraction method and device | |
Cernekova et al. | Information theory-based shot cut/fade detection and video summarization | |
KR100645300B1 (en) | Method and apparatus for summarizing and indexing the contents of an audio-visual presentation | |
Guimaraes et al. | Video segmentation based on 2D image analysis | |
CN107209931B (en) | Color correction apparatus and method | |
Karaman et al. | Comparison of static background segmentation methods | |
JP6553692B2 (en) | Moving image background removal method and moving image background removal system | |
JP2008527525A (en) | Method and electronic device for detecting graphical objects | |
US20100067863A1 (en) | Video editing methods and systems | |
WO2019225692A1 (en) | Video processing device, video processing method, and video processing program | |
WO2017027212A1 (en) | Machine vision feature-tracking system | |
US20180197577A1 (en) | Thumbnail generation for video | |
US20130148899A1 (en) | Method and apparatus for recognizing a character based on a photographed image | |
JP2010211498A (en) | Image processing program and image processing system | |
Lee et al. | Video scene change detection using neural network: Improved ART2 | |
Fleuret et al. | Re-identification for improved people tracking | |
KR101706347B1 (en) | Method for shot boundary detection, and image processing apparatus and method implementing the same method | |
Setiawan et al. | Gaussian mixture model in improved hls color space for human silhouette extraction | |
JP2003303346A (en) | Method, device and program for tracing target, and recording medium recording the program | |
CN112752110B (en) | Video presentation method and device, computing device and storage medium | |
Ortego et al. | Multi-feature stationary foreground detection for crowded video-surveillance | |
CN108737814B (en) | Video shot detection method based on dynamic mode decomposition | |
Lo et al. | A statistic approach for photo quality assessment | |
CN113192081A (en) | Image recognition method and device, electronic equipment and computer-readable storage medium | |
Jarraya et al. | Accurate background modeling for moving object detection in a dynamic scene |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant |