KR20130119781A - Apparatus and method for detecting video copy - Google Patents

Apparatus and method for detecting video copy

Info

Publication number
KR20130119781A
Authority
KR
South Korea
Prior art keywords
video
feature
frames
coded
value
Prior art date
Application number
KR1020120042845A
Other languages
Korean (ko)
Inventor
이정호
서용석
김정현
박지현
서영호
윤영석
이상광
이승재
김성민
유원영
Original Assignee
한국전자통신연구원
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한국전자통신연구원 filed Critical 한국전자통신연구원
Priority to KR1020120042845A priority Critical patent/KR20130119781A/en
Publication of KR20130119781A publication Critical patent/KR20130119781A/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/458 Scheduling content for creating a personalised stream, e.g. by combining a locally stored advertisement with an incoming stream; Updating operations, e.g. for OS modules; time-related management operations
    • H04N21/4586 Content update operation triggered locally, e.g. by comparing the version of software modules in a DVB carousel to the version stored locally

Abstract

The present invention is a video duplication detection apparatus comprising: a frame configuration unit that decodes an input video into a sequence of frames ordered in time; a feature extractor that extracts a feature of the video from the frames configured by the frame configuration unit and codes it; a database that stores the coded features of one or more videos; a feature matcher that determines whether the input video is a duplicate according to whether a video whose coded feature is identical to the coded feature produced by the feature extractor is detected in the database; and a duplicate video remover that, when the feature matcher determines that the input video is a duplicate, removes all but one of the duplicated videos.

Description

Apparatus and Method for Detecting Video Copy

TECHNICAL FIELD The present invention relates to a duplication detection apparatus and method, and more particularly, to an apparatus and method for rapidly detecting duplicate video in a large data storage environment.

Advances in communication environments and digital devices are creating explosive demand for digital content. In particular, as services for smart mobile terminals grow, so do services that let users enjoy content on devices with limited performance and storage capacity.

Typical services include video sharing sites such as YouTube and cloud services such as iCloud. These services build large databases in order to provide a stable delivery environment and a system that many users can access concurrently.

However, in a system accessible to many users, the same data is frequently uploaded by different users; that is, unnecessary data duplication occurs. As the amount of data stored in a database increases, maintenance and management costs rise sharply, and such duplication directly inflates those costs.

Therefore, if duplicate data can be found and removed, or if it can be determined whether data is a duplicate before a user uploads it, not only database storage but also the network resources required for data transmission can be saved.

The deduplication technology currently used in mass storage servers mostly detects duplicates of exactly identical data. However, multimedia items often have the same content even when they are not exactly the same file. For video, for example, file size varies with the compression codec (DivX, H.264, etc.) and the file format (AVI, MKV, etc.), so two files containing the same content may not be identical at the byte level.

Fingerprinting technology is used to search for similar videos. It encodes content using a descriptor that captures the content's characteristics, and is applied in fields such as video search and advertisement detection; many patents in this area have been filed and registered.

However, existing fingerprinting is designed not only to find videos with identical content but also to locate edited portions of a video, which complicates the search and matching method; unless the feature extraction process is kept simple, the required computation and time increase accordingly.

The present invention provides an apparatus and method for detecting duplication of video content with a small amount of computation.

The present invention provides an apparatus and method for quickly detecting duplication of video content.

The present invention is a video duplication detection apparatus comprising: a frame configuration unit that decodes an input video into a sequence of frames ordered in time; a feature extractor that extracts a feature of the video from the frames configured by the frame configuration unit and codes it; a database that stores the coded features of one or more videos; a feature matcher that determines whether the input video is a duplicate according to whether a video whose coded feature is identical to the coded feature produced by the feature extractor is detected in the database; and a duplicate video remover that, when the feature matcher determines that the input video is a duplicate, removes all but one of the duplicated videos.

The present invention also provides a video duplication detection method for a device having a database storing one or more videos, the method comprising: decoding an input video and configuring it into a sequence of frames ordered in time; extracting and coding a feature of the video from the configured frames; determining whether the input video is a duplicate according to whether a video having a coded feature identical to the coded feature of the input video is detected in the database; and, if the input video is determined to be a duplicate, removing all but one of the duplicated videos.

The present invention extracts a feature that can represent a given video's content in order to detect videos whose content is substantially the same.

In addition, by using a simple feature extraction method and expressing the extracted feature as a code of fixed length, the present invention reduces the time required for feature extraction compared with conventional methods, and by comparing extracted features at high speed it detects videos with duplicate content, making large databases easier to manage and saving resources.

FIG. 1 is a block diagram of an apparatus for detecting video duplication according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the detailed configuration of the feature extractor illustrated in FIG. 1.
FIG. 3 is a graph illustrating an example of the average value of a specific component of selected frames.
FIG. 4 is a graph in which the graph of FIG. 3 has been adaptively quantized.
FIG. 5 is a graph in which the graph of FIG. 4 has been modified by a morphology operation.
FIG. 6 is a flowchart illustrating a video duplication detection method according to an embodiment of the present invention.
FIG. 7 is a flowchart describing operation 620 of FIG. 6 in detail.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout.

In the following description of the present invention, detailed descriptions of known functions and configurations incorporated herein are omitted when they may obscure the subject matter of the present invention.

The terms used throughout the specification are defined in consideration of their functions in the embodiments of the present invention and may vary according to the intention or customs of users or operators; their definitions should therefore be based on the contents of this specification as a whole.

FIG. 1 is a block diagram of an apparatus for detecting video duplication according to an embodiment of the present invention.

Referring to FIG. 1, the apparatus for detecting video duplication includes a frame configuration unit 110, a feature extractor 120, a feature matcher 130, a DB 140, and a redundant video remover 150.

The frame configuration unit 110 receives a newly uploaded video and forms a sequence of frames ordered in time. Since a video may be encoded with any of various file formats and codecs, the frame configuration unit 110 decodes the video to produce the sequence of individual frames.
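As an illustration, the following is a minimal sketch of this decoding step, assuming OpenCV (cv2) is available; the patent does not mandate any particular decoder.

```python
# Minimal sketch of the frame configuration step, assuming OpenCV is
# available (the patent does not mandate a specific decoding library).
import cv2

def decode_frames(video_path):
    """Decode a video file into a list of frames plus its frame rate."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0   # fall back if FPS is unknown
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)                  # BGR image as a numpy array
    cap.release()
    return frames, fps
```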

The feature extractor 120 extracts and codes a feature of the video from the frames produced by the frame configuration unit 110, as described in detail with reference to FIG. 2.

The feature matcher 130 compares the coded feature of the input video produced by the feature extractor 120 with the coded features of the one or more videos stored in the DB 140. That is, the degree of match between the coded feature of the input video and the coded feature of each stored video is quantified, and the input video is judged to be a duplicate if a stored video whose match score is greater than or equal to a preset threshold is found. The determination result is then passed to the redundant video remover 150.
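The patent quantifies a match score against a threshold but does not fix the metric; the sketch below assumes a normalized Hamming similarity over equal-length binary codes, with the length check doubling as the first-stage filter described later.

```python
# Sketch of the matching stage. The patent scores the degree of match
# against a preset threshold but does not fix the metric; a normalized
# Hamming similarity over equal-length binary codes is assumed here.
def match_score(code_a: str, code_b: str) -> float:
    """Fraction of matching bits between two equal-length binary codes."""
    if len(code_a) != len(code_b):    # length mismatch: treat as no match
        return 0.0                    # (code length is the first-stage check)
    same = sum(a == b for a, b in zip(code_a, code_b))
    return same / len(code_a)

def is_duplicate(input_code, stored_codes, threshold=0.95):
    """True if any stored code matches above the (assumed) threshold."""
    return any(match_score(input_code, c) >= threshold for c in stored_codes)
```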

The DB 140 is a database that stores the original videos provided by the sharing site or cloud service provider. The codec and file format used to encode each video may differ from one another.

When a video matching the input video exists in the DB 140, that is, when the feature matcher 130 determines that the input video is a duplicate, the redundant video remover 150 leaves only one of the duplicated videos and removes the rest. Depending on image quality or administrator settings, for example, only the copy with the best image quality may be kept and the others removed.

FIG. 2 is a diagram illustrating the detailed configuration of the feature extractor illustrated in FIG. 1.

Referring to FIG. 2, the feature extractor 120 includes a frame normalizer 121, a component separator 122, an average value calculator 123, a center value calculator 124, an adaptive quantizer 125, a morphology unit 126, and a code generator 127.

The frame normalizer 121 normalizes the frame rate based on time and selects one or more frames from which to extract a feature. Because consecutive frames of a video often repeat the same scene, feature extraction would otherwise perform unnecessary operations on repetitive data. To avoid this, the frame normalizer 121 selects only frames at a predetermined time interval, according to the system setting. Since the frame rate may vary with the video's encoding environment, the frame rate is normalized based on time before the frames from which the feature is extracted are selected.
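A minimal sketch of this time-based selection follows, assuming the sampling interval is a system setting (1 second is an example value, not specified by the patent).

```python
# Sketch of time-based frame selection (the sampling interval is a system
# setting; 1 second is an assumed example). Sampling by wall-clock time
# normalizes away differences in frame rate between encodings.
def select_frames(frames, fps, interval_sec=1.0):
    step = max(1, round(fps * interval_sec))
    return frames[::step]
```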

The component separator 122 separates a specific component for feature extraction from each of the frames selected by the frame normalizer 121. Each frame contains all color components; in general, a digital image is represented in a color space such as RGB or YCbCr. In an embodiment of the present invention, the component separator 122 extracts only the luminance component, corresponding to Y in the YCbCr representation.

To generate a feature that can represent the input video, the average value calculator 123 computes, for each selected frame, the average of the specific component over all pixels in the frame, i.e., the mean brightness of the frame. Since one value is produced per frame, the feature is independent of the frame's size or aspect ratio.
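A sketch combining the component separation and averaging steps, assuming OpenCV for the color-space conversion (OpenCV orders the channels as YCrCb, so Y is channel 0):

```python
# Sketch of component separation plus per-frame averaging: convert each
# selected frame to OpenCV's YCrCb ordering, keep only the Y (luminance)
# plane, and average it over all pixels to get one value per frame.
import cv2
import numpy as np

def mean_luminance(selected_frames):
    means = []
    for frame in selected_frames:
        ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)
        y = ycrcb[:, :, 0]                  # luminance component only
        means.append(float(y.mean()))       # one value per frame, 0..255
    return np.array(means)
```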

FIG. 3 is a graph illustrating an example of the average value of a specific component of the selected frames.

In FIG. 3, the horizontal axis is the frame order (time), and the vertical axis is the average value computed for each frame, which may range from 0 to 255.

The center value calculator 124 finds the maximum and minimum of the average values computed by the average value calculator 123 and derives the center value from them. By the nature of digital images, the maximum is at most 255 and the minimum is at least 0. The center value is computed because, even for videos with the same content, absolute brightness may vary with the encoding environment, whereas the relative change over time remains constant.

The adaptive quantizer 125 quantizes the average values into a predetermined number of levels, preset in the system, based on the maximum, minimum, and center of the average values described above. This removes high-frequency components from the average value graph of FIG. 3, making coding easier, and also allows videos with the same content to be detected even when the overall brightness has been altered by an external factor.

Meanwhile, since the quantized average values would otherwise take only positive values, the adaptive quantizer 125 expresses them as positive and negative values relative to the quantization level corresponding to the center value of the averages.
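A sketch of the adaptive quantization, with the number of levels as an assumed example value; bin edges are spread between the minimum and maximum of the averages, and the result is re-centered around the level containing the center value.

```python
# Sketch of adaptive quantization. The number of levels is a system
# setting (8 is an assumed example). Bin edges adapt to the min/max of
# the averages; levels are then re-centered so values below the center
# become negative and values above it become positive.
import numpy as np

def adaptive_quantize(means, n_levels=8):
    lo, hi = float(means.min()), float(means.max())
    if hi <= lo:                      # constant video: nothing to quantize
        return np.zeros(len(means), dtype=int)
    edges = np.linspace(lo, hi, n_levels + 1)
    levels = np.clip(np.digitize(means, edges) - 1, 0, n_levels - 1)
    center = (lo + hi) / 2.0
    center_level = int(np.clip(np.digitize([center], edges)[0] - 1,
                               0, n_levels - 1))
    return levels - center_level      # signed levels around the center
```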

FIG. 4 is a graph in which the graph of FIG. 3 has been adaptively quantized.

Referring to FIG. 4, not only have the high-frequency components seen in FIG. 3 been removed, but the average values now take positive and negative values relative to their center value.

Referring back to FIG. 2, the morphology unit 126 processes average values that change abruptly over the short term among the values quantized by the adaptive quantizer 125.

That is, referring to the graph of FIG. 4, the average values in the second half of the graph change abruptly over the short term compared with their neighbors. The morphology unit 126 processes these abruptly changing average values by performing one-dimensional erosion and dilation morphology operations, using a morphology filter of length X set by the system.
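A sketch of this step using SciPy's grey-scale morphology; the filter length X is a system setting (3 is an assumed example). An erosion followed by a dilation (an opening) removes short positive spikes, and the reverse order (a closing) removes short dips.

```python
# Sketch of the 1-D morphology step (assumes SciPy; the filter length X
# is a system setting, 3 is an assumed example). Opening removes spikes
# narrower than the filter; closing removes equally narrow dips.
from scipy.ndimage import grey_erosion, grey_dilation

def smooth_spikes(levels, x=3):
    opened = grey_dilation(grey_erosion(levels, size=x), size=x)
    closed = grey_erosion(grey_dilation(opened, size=x), size=x)
    return closed
```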

FIG. 5 is a graph in which the graph of FIG. 4 has been modified by the morphology operation.

Referring to FIG. 5, the average values of FIG. 4 that showed sudden short-term changes have been replaced by their neighboring average values.

Referring back to FIG. 2, the code generator 127 expresses the average values that have undergone the morphology operation, as shown in FIG. 5, as a binary code.

The binary code expresses the feature values consecutively in the chronological order of the frames. For example, the quantized average values may be used as they are, or they may be converted into Gray codes so that difference values are easier to measure during comparison. Since the length of the code grows in proportion to the length of the video, the feature matcher 130 may use the code length as a first-stage duplicate check.
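A sketch of the Gray-code variant of this encoding step (the bit width per value is an assumed parameter, not fixed by the patent):

```python
# Sketch of code generation using Gray codes (bit width per value is an
# assumed parameter). Signed levels are shifted to be non-negative, then
# each value is appended in the chronological order of the frames.
def to_binary_code(levels, bits=4):
    shifted = levels - levels.min()          # make all values non-negative
    chunks = []
    for v in shifted:
        gray = int(v) ^ (int(v) >> 1)        # binary-reflected Gray code
        chunks.append(format(gray, f"0{bits}b"))
    return "".join(chunks)                   # one code for the whole video
```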

Next, a video duplication detection method performed by the apparatus described above will be described with reference to FIGS. 6 and 7.

FIG. 6 is a flowchart illustrating a video duplication detection method according to an embodiment of the present invention.

Referring to FIG. 6, when a newly uploaded video is input, a sequence of frames ordered in time is formed in step 610. In step 620, a feature of the video is extracted from the configured frames and coded, as described in detail with reference to FIG. 7 below.

In step 630, the coded feature of the input video is compared with the coded features of the one or more videos already stored. That is, the degree of match between the coded feature of the input video and the coded feature of each stored video is quantified, and whether the video is a duplicate is determined according to whether the match score is greater than or equal to a preset threshold.

If a video matching the input video is found in step 630, only one of the duplicated videos is kept and the others are removed in step 640. Depending on image quality or administrator settings, for example, all copies except the one with the best image quality may be removed.

FIG. 7 is a flowchart describing operation 620 of FIG. 6 in detail.

Referring to FIG. 7, in step 621, the frame rate is normalized based on time and one or more frames are selected for feature extraction. Only frames at a predetermined time interval are selected, according to the system setting.

In step 622, a specific component is separated from each of the selected frames for feature extraction. Each frame contains all color components; in general, a digital image is represented in a color space such as RGB or YCbCr. In an embodiment of the present invention, only the luminance component, corresponding to Y in the YCbCr representation, is separated.

In step 623, to generate a feature that can represent the input video, the average of the pixel values of the specific component over all pixels in each selected frame is computed, i.e., the mean brightness of each frame. Since one value is produced per frame, the feature is independent of the frame's size or aspect ratio.

In step 624, the maximum and minimum of the average values are determined, and the center value is calculated from them. By the nature of digital images, the maximum is at most 255 and the minimum is at least 0.

In step 625, the average values are quantized into a predetermined number of levels, preset in the system, based on their maximum, minimum, and center values. This removes high-frequency components from the average value graph of FIG. 3, making coding easier. Since the quantized average values would otherwise take only positive values, step 625 also expresses them as positive and negative values relative to the quantization level corresponding to the center value.

In step 626, average values showing sudden short-term changes among the quantized values are processed by one-dimensional erosion and dilation morphology operations, which replace them with neighboring average values.

In step 627, the average values that have undergone the morphology operation are represented as a binary code. The binary code expresses the feature values consecutively in the chronological order of the frames. For example, the quantized average values may be used as they are, or they may be converted into Gray codes so that difference values are easier to measure during comparison. Since the length of the code grows in proportion to the length of the video, the code length may serve as a first-stage duplicate check in the matching step of FIG. 6.
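Putting steps 621 through 627 together, the following sketch chains the helper functions from the earlier sketches into one feature-extraction pipeline; all parameter defaults are assumed examples, not values fixed by the patent.

```python
# End-to-end sketch of steps 621-627, chaining the helpers defined in the
# earlier sketches (decode_frames, select_frames, mean_luminance,
# adaptive_quantize, smooth_spikes, to_binary_code). Parameter defaults
# are assumed examples, not values fixed by the patent.
def extract_video_code(video_path):
    frames, fps = decode_frames(video_path)   # frame configuration
    selected = select_frames(frames, fps)     # 621: time normalization
    means = mean_luminance(selected)          # 622-623: Y mean per frame
    levels = adaptive_quantize(means)         # 624-625: signed levels
    smooth = smooth_spikes(levels)            # 626: 1-D morphology
    return to_binary_code(smooth)             # 627: binary/Gray code
```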

Claims (17)

A frame configuration unit configured to decode an input video and configure successive frames according to time;
A feature extractor configured to extract and code a feature of the video from the frames configured by the frame configuration unit;
A database that stores coded features of one or more videos;
And a feature matcher for determining whether the video is duplicated according to whether a video having the same coded feature as the coded feature of the input video coded by the feature extractor is detected from the database.
The apparatus of claim 1,
And a duplicate video remover configured to remove all but one of the duplicated videos when the input video is determined to be duplicated by the feature matcher.
The apparatus of claim 1, wherein the feature extractor comprises:
A frame normalization unit for normalizing a frame rate based on time and selecting one or more frames from which the feature is to be extracted;
A component separator for separating a specific component for each of the selected frames;
An average value calculator for calculating an average value of pixel values of a specific component included in each of the selected frames;
And a code generator which codes and outputs the average values.
The apparatus of claim 3, wherein the frame normalizer
Selects a predetermined number of frames having a predetermined time interval.
The apparatus of claim 3, wherein the component separator
Separates the luminance component.
The apparatus of claim 3, wherein the feature extractor
Further comprises an adaptive quantizer configured to quantize the average values output from the average value calculator into a predetermined number of values and output them.
The apparatus of claim 6, wherein the feature extractor
Further comprises a center value calculator for detecting the maximum and minimum of the average values output from the average value calculator, and calculating a center value from the detected maximum and minimum,
Wherein the adaptive quantizer
Expresses the quantized average values as positive and negative values on the basis of the center value calculated by the center value calculator.
The apparatus of claim 6, wherein the feature extractor
Further comprises a morphology unit which replaces average values showing sudden short-term changes, among the average values quantized by the adaptive quantizer, with the average value of a neighboring frame.
The apparatus of claim 1, wherein the feature matcher
Determines a first-stage duplication based on the length of the code.
The apparatus of claim 1, wherein the feature matcher
Quantifies the degree of match between the coded feature of the input video and the coded feature of a video stored in the database, and determines duplication according to whether the match score is greater than or equal to a preset threshold.
The apparatus of claim 1, wherein the redundant video remover
Removes a lower-quality video among two or more duplicated videos.
A video duplication detection method in a device having a database storing one or more videos,
Decoding the input video and organizing successive frames over time;
Extracting and coding a feature of the video from the configured frames;
And determining whether video is duplicated according to whether a video having a coded feature identical to a coded feature of the coded input video is detected from the database.
13. The method of claim 12,
If it is determined that the input video is duplicated, removing all but one of the duplicated videos.
The method of claim 13, wherein the extracting step
Normalizing the frame rate based on time, selecting one or more frames from which the feature is to be extracted;
Separating a specific component for each of the selected frames;
Calculating an average value of pixel values for a specific component included in each of the selected frames;
And coding said average values.
The method of claim 14, wherein the extracting step
And quantizing the calculated average values into a predetermined number of values.
The method of claim 15, wherein the extracting step
Detecting the maximum and minimum values of the calculated average values, and calculating a center value from the detected maximum and minimum values,
The quantizing step
And the quantized average values are expressed as positive and negative values based on the calculated center value.
The method of claim 15, wherein the extracting step
And substituting average values showing sudden short-term changes, among the quantized average values, with the average value of neighboring frames.
KR1020120042845A 2012-04-24 2012-04-24 Apparatus and method for detecting video copy KR20130119781A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020120042845A KR20130119781A (en) 2012-04-24 2012-04-24 Apparatus and method for detecting video copy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020120042845A KR20130119781A (en) 2012-04-24 2012-04-24 Apparatus and method for detecting video copy

Publications (1)

Publication Number Publication Date
KR20130119781A true KR20130119781A (en) 2013-11-01

Family

ID=49850694

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020120042845A KR20130119781A (en) 2012-04-24 2012-04-24 Apparatus and method for detecting video copy

Country Status (1)

Country Link
KR (1) KR20130119781A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108881947A (en) * 2017-05-15 2018-11-23 阿里巴巴集团控股有限公司 A kind of infringement detection method and device of live stream
CN110602566A (en) * 2019-09-06 2019-12-20 Oppo广东移动通信有限公司 Matching method, terminal and readable storage medium
CN110602566B (en) * 2019-09-06 2021-10-01 Oppo广东移动通信有限公司 Matching method, terminal and readable storage medium
US11984140B2 (en) 2019-09-06 2024-05-14 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Matching method, terminal and readable storage medium

Similar Documents

Publication Publication Date Title
CN108632625B (en) Video coding method, video decoding method and related equipment
US8515193B1 (en) Image compression using exemplar dictionary based on hierarchical clustering
US11166027B2 (en) Content adaptation for streaming
Wu et al. Lossless compression of JPEG coded photo collections
CN102611823B (en) Method and equipment capable of selecting compression algorithm based on picture content
CN103886623A (en) Image compression method and equipment, and system
CN105052107A (en) Using quality information for adaptive streaming of media content
CN102017634A (en) Multi-level representation of reordered transform coefficients
JP6514419B1 (en) Use of Image Matching System to Improve Service Quality of Video Matching System
US9667982B2 (en) Techniques for transform based transcoding
US9591334B2 (en) Common copy compression
EP2862100A1 (en) Methods and systems for automatically and efficiently categorizing, transmitting, and managing multimedia contents
US11575947B2 (en) Residual entropy compression for cloud-based video applications
WO2017032245A1 (en) Method and device for generating video file index information
CN112584155B (en) Video data processing method and device
US20210360233A1 (en) Artificial intelligence based optimal bit rate prediction for video coding
CN103020138A (en) Method and device for video retrieval
US8406533B2 (en) Image comparing apparatus and method therefor, image retrieving apparatus as well as program and recording medium
US11086843B2 (en) Embedding codebooks for resource optimization
Zhang et al. Detecting multiple H. 264/AVC compressions with the same quantisation parameters
KR20130119781A (en) Apparatus and method for detecting video copy
US20190121883A1 (en) Latency optimization for digital asset compression
Chen et al. Interframe coding of global image signatures for mobile augmented reality
KR102430177B1 (en) System for rapid management of large scale moving pictures and method thereof
WO2018196502A1 (en) Method and device for image transcoding

Legal Events

Date Code Title Description
WITN Withdrawal due to no request for examination