KR20170088188A

KR20170088188A - Apparatus and method for dividing video contents into a plurality of shots using contents adaptive threshold

Info

Publication number: KR20170088188A
Application number: KR1020160008226A
Authority: KR
Inventors: 박소영; 김상권; 김선중; 김승희; 박원주
Original assignee: 한국전자통신연구원
Priority date: 2016-01-22
Filing date: 2016-01-22
Publication date: 2017-08-01

Abstract

Disclosed is a shot division technique for image content that applies a content-adaptive threshold. The technique adaptively derives the threshold that serves as the criterion for shot division from the content itself and then applies it. For example, target image content is first divided into a plurality of shots by applying a preset fixed threshold to the result of analyzing and processing image information of the content. A variable threshold is then obtained by analyzing and processing the image information of each of the resulting shots, and some of those shots are divided again into a plurality of shots by applying the variable threshold. Accordingly, the present invention can improve the accuracy of shot division.

Description

The present invention relates to broadcast communication technology and, more particularly, to an apparatus and a method for dividing image content into a plurality of shots by applying a content-adaptive threshold.

As TV content, movies, and the like are produced and distributed, it is not easy for content users to find the image content they want. Accordingly, services that efficiently retrieve or recommend video content have been attracting attention from content users. For these services to meet the needs of individual users, a technique is needed that analyzes the characteristics of image content as accurately and specifically as possible. However, limitations in content characteristic analysis techniques make it difficult to provide a satisfactory personalized service, because search and recommendation performed only at the level of whole content items are limited in the variety of profitable and useful services they can support.

To overcome these limitations, techniques are being studied that search for and recommend content and scenes based not only on metadata about the entire content, such as genre and characters, which is generally used today for content search and recommendation, but also on metadata for each scene constituting a content episode. Here, a content episode refers to one broadcast installment of video content such as a drama or a movie.

Image content is composed of a plurality of frames. Normally, one second of image content consists of 24 or 30 frames. Here, a 'frame' refers to one of the still images constituting moving image content, and a 'shot' generally refers to the unit of image captured in one continuous take, without stopping, when shooting to produce image content. A 'scene' is a unit of footage that is usually divided or merged at the scene level: an image unit in which a single situation, action, dialogue, or event appears at the same time and in the same place. To provide a scene-based search and recommendation service, the following procedure is required: forming shots each composed of a plurality of frames, composing scenes by clustering related shots, and generating and tagging metadata by analyzing the characteristics of each scene.

More specifically, to divide image content into shots, the color information of each frame is analyzed and the similarity between consecutive frames is compared to determine whether to divide a shot at that point. This determination requires a threshold value, set in advance, that serves as the criterion for shot division. Setting this threshold appropriately is important for improving the accuracy of shot division.

However, uniformly applying a predetermined threshold value without considering the types and characteristics of image contents can limit the accuracy of shot division. If the accuracy of shot segmentation is low, it is difficult to effectively provide scene-based search and recommendation services. Accordingly, when dividing each of various image contents into a plurality of shots, a method for increasing the accuracy is required.

Korean Patent No. 10-1430257
Korean Patent No. 10-0963701

An object of the present invention is to provide an apparatus and a method for dividing an image content into a plurality of shots in consideration of characteristics of image contents to be divided into shots.

Another object of the present invention is to provide an apparatus and a method for dividing image content into shots by applying a threshold that is adaptively determined in consideration of the characteristics of that image content.

According to an aspect of the present invention, there is provided a method of dividing image content into a plurality of shots, the method comprising: dividing the image content into a plurality of first shots by applying a predetermined threshold value to a result of analyzing and processing image information on the image content; obtaining a variation threshold through analysis and processing of image information for each of the plurality of first shots; and dividing at least one of the plurality of first shots into a plurality of second shots by applying the variation threshold.

According to the embodiment of the present invention described above, when image content is divided into shots, the threshold for the inter-frame image feature value that serves as the criterion for dividing frames into different shots is not set to a fixed value but is determined adaptively in consideration of the image characteristics. Therefore, according to the embodiment of the present invention, the accuracy of shot division for the image content can be improved. In addition, because the image content is divided into shots with improved accuracy, scene-based retrieval and recommendation services for image content can be provided more efficiently, accurately, and in a customized manner.

FIG. 1 is a block diagram illustrating a configuration of a shot division device for image content according to an exemplary embodiment of the present invention.
FIG. 2 is a block diagram showing an example of a detailed configuration of the threshold management unit of FIG. 1.
FIG. 3 is a flowchart illustrating an example of a method of dividing image content into shots according to an embodiment of the present invention.
FIG. 4 is a simplified schematic diagram of one aspect of the shot division method according to an embodiment of the present invention, described with reference to FIG. 3.
FIG. 5 is a schematic diagram illustrating another aspect of the shot division method according to an embodiment of the present invention, described with reference to FIG. 3.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings. However, these drawings are only examples intended to make the content and scope of the technical idea easy to describe, and they do not limit or change its technical scope. It will be apparent to those skilled in the art that various changes and modifications can be made within the scope of the technical idea based on these examples. Also, the terms and words used in this specification were selected in consideration of their functions in the embodiments, and their meaning may vary depending on the intention or custom of the user or operator. Therefore, terms used in the following embodiments follow their definitions where they are specifically defined in this specification; where there is no specific definition, they should be construed in the sense generally recognized by those of ordinary skill in the art.

FIG. 1 is a block diagram illustrating a configuration of a shot division device for image content according to an exemplary embodiment of the present invention. Referring to FIG. 1, the shot division device 100 includes an image information extraction unit 110, an image information analysis unit 120, a division performing unit 130, and a shot division policy unit 140. The shot division device 100 divides the image content received from the image content providing device 200 into shots. The image content providing device 200 is a database that stores image content and provides the requested content upon request from the shot division device 100.

The image information extraction unit 110 extracts image information from the image content received from the image content providing device 200, in accordance with the shot division policy received from the shot division policy unit 140. Specifically, the image information extraction unit 110 requests the image content to be subjected to shot division from the image content providing device 200 and receives it. The image information extraction unit 110 also requests and receives from the shot division setting unit 141 the setting information related to shot division that is to be applied to the extraction of image information. Having received the image content and the shot division setting information, the image information extraction unit 110 extracts image information from the image content according to that setting information.

The image information analysis unit 120 analyzes and processes the extracted image information transmitted from the image information extraction unit 110. For example, when the image information extraction unit 110 extracts at least one type of image information, such as RGB information, hue (H) information, saturation (S) information, brightness (V) information, gray level or gray intensity information, and optical flow information, the image information analysis unit 120 can analyze these values, which have different units, and process them into a format that can be used to judge whether to divide a shot. The image information analysis unit 120 transmits the analyzed and processed image information to the division performing unit 130 so that it can be used for shot division, and also transmits it to the threshold management unit 142 when necessary.

Among the image information extracted by the image information extraction unit 110, the RGB information is a value indicating the level of each of red, green, and blue. The hue (H) information is expressed as a predetermined value so that chromatic colors such as red, yellow, green, blue, and violet can be distinguished from one another. For example, the hue (H) information can be regarded as the relative angle on a color wheel on which red, which has the longest wavelength, is placed at 0°. The saturation (S) information represents a relative value when the most vivid (pure) state of a specific color is taken as 100%, and the brightness (V) information represents brightness on a scale where the darkest black is 0% and the brightest white is 100%. In addition, the gray intensity information indicates the brightness of an achromatic color, and the optical flow information indicates the pattern of apparent motion of objects, surfaces, edges, and the like in an image caused by relative movement between an observer (eye or camera) and the scene.
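As a minimal sketch of the color quantities described above, the following Python example converts one RGB pixel into hue (as an angle in degrees), saturation and brightness (as percentages), and a gray intensity. It uses the standard-library `colorsys` module; the function name `describe_pixel` and the choice of the plain channel average as the gray intensity are assumptions made for illustration and are not specified in this document.

```python
import colorsys


def describe_pixel(r, g, b):
    """Describe one 8-bit RGB pixel with the quantities named in the text.

    Assumption: gray intensity is taken as the simple mean of R, G and B.
    """
    # colorsys expects and returns values in the range 0.0 to 1.0.
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return {
        "hue_deg": h * 360,         # angle on the color wheel, red at 0 degrees
        "saturation_pct": s * 100,  # 100% is the pure color
        "value_pct": v * 100,       # 0% is black, 100% is the brightest level
        "gray": (r + g + b) / 3,    # hypothetical gray intensity (0 to 255)
    }


pure_red = describe_pixel(255, 0, 0)
mid_gray = describe_pixel(128, 128, 128)
```

Under these assumptions a pure red pixel sits at hue 0 with full saturation, while a gray pixel has zero saturation, which matches the description of hue, saturation, and brightness given above.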

The division performing unit 130 divides the image content into a plurality of shots, that is, first shots, based on the image information analysis result received from the image information analysis unit 120 and the threshold value received from the threshold management unit 142. For example, the division performing unit 130 may receive from the image information analysis unit 120 the result of analyzing the similarity between consecutive images, based on image features, for all images subject to shot division, and may receive from the threshold management unit 142 the threshold value to be applied to the shot division. The division performing unit 130 may then apply the received threshold value to the received similarity analysis result and divide the shot where the similarity is less than the threshold value. As another example, the division performing unit 130 may receive a numerical value of the feature difference between images and divide them into separate shots when that value is larger than the threshold value. The division performing unit 130 delivers the shot division result obtained in this manner to the image information analysis unit 120.
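The two decision rules described above (divide where similarity falls below the threshold, or where difference exceeds it) can be sketched as follows. This is an illustration only: the document does not say which similarity measure is used, so histogram intersection over per-frame color histograms is swapped in here as an assumption, and the function names are hypothetical.

```python
def histogram_intersection(hist_a, hist_b):
    """Similarity in [0, 1] between two histograms with the same pixel count.

    Assumption: histogram intersection stands in for the unspecified
    similarity measure.
    """
    total = sum(hist_a)
    if total == 0:
        return 1.0  # two empty histograms are treated as identical
    return sum(min(a, b) for a, b in zip(hist_a, hist_b)) / total


def cuts_by_similarity(histograms, threshold):
    """Indices i at which a new shot starts because similarity(i-1, i) < threshold."""
    return [
        i
        for i in range(1, len(histograms))
        if histogram_intersection(histograms[i - 1], histograms[i]) < threshold
    ]


def cuts_by_difference(histograms, threshold):
    """Same decision expressed as a difference: 1 - similarity > threshold."""
    return [
        i
        for i in range(1, len(histograms))
        if 1.0 - histogram_intersection(histograms[i - 1], histograms[i]) > threshold
    ]


# Four frames, each summarized by a 4-bin histogram over 100 pixels.
frames = [
    [70, 20, 10, 0],
    [68, 22, 10, 0],
    [5, 5, 30, 60],   # abrupt change in color distribution
    [6, 4, 32, 58],
]
similarity_cuts = cuts_by_similarity(frames, threshold=0.5)
difference_cuts = cuts_by_difference(frames, threshold=0.5)
```

With this sample data both rules place the boundary at the third frame, since the similarity between the second and third histograms is far below 0.5 while the other pairs are nearly identical.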

According to this embodiment, the threshold value that the division performing unit 130 receives from the threshold management unit 142 is not limited to a single fixed value applied equally to all content, that is, a fixed threshold; it may also be an adaptive threshold. The threshold value that the division performing unit 130 receives from the threshold management unit 142 may vary according to the stage of shot division. For example, when the shot division of image content is performed in a plurality of stages (for example, two stages), a different threshold may be applied in each stage. That is, in the first shot division stage, the division performing unit 130 applies a fixed threshold, whereas in the second shot division stage, in which the shots obtained from the first stage (the first shots) are divided again to obtain second shots, the adaptive threshold can be applied. As described later, the adaptive threshold can be obtained by the threshold management unit 142 of the shot division policy unit 140 based on the image characteristics of the image content or on the extracted image information.

The shot division policy unit 140 manages the settings and thresholds necessary for shot division. For this purpose, the shot division policy unit 140 may include a shot division setting unit 141 and a threshold management unit 142. However, showing the shot division setting unit 141 and the threshold management unit 142 as separate blocks is a logical division according to their functions, and the constituent units may be physically implemented either separately or as one integrated unit.

The shot division setting unit 141 stores and manages the settings that the image information extraction unit 110, the image information analysis unit 120, and the division performing unit 130 need to perform the series of operations required for shot division. More specifically, the image information extraction unit 110 may extract image information from the image content using the extraction algorithm and conditions received from the shot division setting unit 141. For example, it may 1) confirm that the image content is 30 fps (frames per second) and select one frame out of every 10 frames (for example, the first of every 10 frames) as an image analysis target, 2) divide each selected frame into a predetermined number of blocks, for example 4x4, that is, 16 blocks, and 3) extract image information for each block-shaped unit image that is the target of image information extraction. This algorithm and these setting conditions are illustrative, and the present embodiment is not limited to them. According to the present embodiment, there is no particular limitation on the kind, number, range, and so on of the image information to be extracted. As described above, the types of image information that can be extracted from an image generally include RGB information, hue (H) information, saturation (S) information, brightness (V) information, gray level or gray intensity information, and so on.
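The example extraction conditions above (one frame out of every 10, each selected frame divided into 4x4 blocks, one value extracted per block) can be sketched as follows. The function names are hypothetical, a frame is modeled as a plain 2D list of gray values, and the per-block mean is an assumed choice of extracted value; pixels left over when the frame size is not divisible by the block grid are simply ignored in this sketch.

```python
def select_frame_indices(total_frames, step=10):
    """Pick the first frame of every `step` frames as an analysis target."""
    return list(range(0, total_frames, step))


def split_into_blocks(frame, rows=4, cols=4):
    """Split a 2D frame (list of pixel rows) into rows*cols blocks, row-major.

    Assumption: remainder pixels beyond an even split are dropped.
    """
    height, width = len(frame), len(frame[0])
    block_h, block_w = height // rows, width // cols
    blocks = []
    for r in range(rows):
        for c in range(cols):
            block = [
                row[c * block_w:(c + 1) * block_w]
                for row in frame[r * block_h:(r + 1) * block_h]
            ]
            blocks.append(block)
    return blocks


def block_mean(block):
    """Mean pixel value of one block, used here as the extracted image information."""
    pixels = [p for row in block for p in row]
    return sum(pixels) / len(pixels)


# One second of 30 fps content yields three analysis targets.
targets = select_frame_indices(30, step=10)

# An 8x8 test frame whose pixel value encodes its position.
frame = [[row * 8 + col for col in range(8)] for row in range(8)]
features = [block_mean(b) for b in split_into_blocks(frame)]
```

Each selected frame is thus reduced to 16 values, one per block, which keeps some spatial layout while remaining cheap to compare between frames.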

The threshold management unit 142 manages the threshold value that serves as the criterion for shot division. For example, shot division uses an image characteristic of each frame (for example, the RGB average value). Because the image characteristic of each frame is expressed as a numerical value, a similarity or difference between frames can be computed. When the similarity between consecutive frames is equal to or less than a predetermined value, or the difference is equal to or larger than a predetermined value, the two frames are judged to belong to different shots, and the shot can be divided at that point. The threshold value is the value applied in making this judgment.
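A minimal sketch of this judgment, using the RGB average value named above as the per-frame characteristic, might look like the following. The difference measure (sum of absolute differences of the three channel means), the function names, and the sample threshold are assumptions for illustration; the document does not fix any of them.

```python
def mean_rgb(frame):
    """Average (R, G, B) of a frame given as a list of (r, g, b) pixels."""
    n = len(frame)
    return tuple(sum(pixel[ch] for pixel in frame) / n for ch in range(3))


def feature_difference(a, b):
    """Assumed difference measure: sum of absolute channel-mean differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def split_shots(frames, threshold):
    """Return shots as (start, end) frame ranges, end exclusive.

    A boundary is placed between two consecutive frames when their
    difference is equal to or larger than the threshold.
    """
    if not frames:
        return []
    features = [mean_rgb(f) for f in frames]
    shots, start = [], 0
    for i in range(1, len(features)):
        if feature_difference(features[i - 1], features[i]) >= threshold:
            shots.append((start, i))
            start = i
    shots.append((start, len(features)))
    return shots


dark = [(10, 10, 10)] * 4        # a tiny 4-pixel dark frame
bright = [(200, 180, 160)] * 4   # a tiny 4-pixel bright frame
shots = split_shots([dark, dark, bright, bright, dark], threshold=100)
```

Here the two dark-to-bright transitions produce differences well above the threshold, so the five frames fall into three shots, while identical consecutive frames stay together.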

FIG. 2 is a block diagram showing an example of a detailed configuration of the threshold management unit 142 of FIG. 1. Referring to FIG. 2, the threshold management unit 142 includes a fixed threshold management unit 142a and a variation threshold management unit 142b. However, this division of the threshold management unit 142 is a logical one based on function, and the units may be physically implemented either separately or as one integrated unit.

The fixed threshold management unit 142a manages the fixed threshold value, and the variation threshold management unit 142b manages the variation threshold value. As described above, according to the embodiment of the present invention, the shot division device 100 divides shots using either the fixed threshold value or the variation threshold value (that is, the adaptive threshold value), depending on the stage. Here, the fixed threshold value is a fixed value applied regardless of the type of content, whereas the variation threshold value can change according to the type of content, its image characteristics, and so on; it is a content-adaptive threshold.

FIG. 3 is a flowchart illustrating an example of a method of dividing image content into shots according to an embodiment of the present invention. The shot division method shown in FIG. 3 may be a procedure performed in the shot division device 100 shown in FIG. 1 and FIG. 2.

Referring to FIG. 3, first, the image information extraction unit 110 receives the image content to be subjected to shot division from the image content providing device 200 (S11). Then, the image information extraction unit 110 receives the setting information to be applied to the extraction of image information from the shot division setting unit 141 (S12). According to this embodiment, as described above, there is no particular limitation on the scope of the setting information received. In FIG. 3, steps S11 and S12 are shown as being performed sequentially, but this is merely for convenience of illustration. That is, steps S11 and S12 may be performed simultaneously, or S12 may be performed before S11.

Then, the image information extraction unit 110 extracts image information from the image content based on the received shot division setting information (specifically, the setting information for extracting image information related to shot division) (S13), and transmits the extracted image information to the image information analysis unit 120 (S14). Subsequently, the image information analysis unit 120 receives shot division setting information (specifically, the setting information for analyzing image information related to shot division) from the shot division setting unit 141 (S15) and analyzes and processes the received image information (S16). The image information analysis unit 120 transmits the image information analysis result to the division performing unit 130 (S17).

The division performing unit 130 receives the setting information related to shot division from the shot division setting unit 141 (S18) and receives the fixed threshold value from the threshold management unit 142 (S19). Here, steps S18 and S19 are shown as being performed sequentially, but this is only for convenience of illustration. That is, steps S18 and S19 may be performed simultaneously, or S19 may be performed before S18.

The division performing unit 130 performs a shot division operation on the image content based on the image information analysis result, the shot division setting information, and the fixed threshold value (S20). The result of the shot division in step S20 is transmitted to the image information analysis unit 120 (S21). The image information analysis unit 120 then analyzes and processes the image information for each shot obtained from that division (S22), in preparation for a secondary shot division of some of those shots.

Subsequently, the image information analysis unit 120 transmits the results of analyzing and processing the image information to the division performing unit 130 and the threshold management unit 142 (S23 and S24). The threshold management unit 142 calculates a variation threshold value, which is a content-adaptive threshold, based on the received image information analysis result (S25), and delivers it to the division performing unit 130 (S26). The division performing unit 130 completes the shot division procedure by performing the secondary shot division based on the image information analysis result and the variation threshold value (S27).
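The overall flow of FIG. 3 (a first division with the fixed threshold, per-shot analysis, calculation of a variation threshold, and a second division) can be sketched end to end as follows. This is a sketch under stated assumptions, not the method as claimed: each frame is reduced to one hypothetical scalar feature, and the variation threshold is taken to be the mean plus `k` standard deviations of the inter-frame differences inside a shot, a rule that this document does not specify.

```python
from statistics import mean, pstdev


def split_range(values, start, end, threshold):
    """Split frames [start, end) wherever the inter-frame difference >= threshold."""
    shots, shot_start = [], start
    for i in range(start + 1, end):
        if abs(values[i] - values[i - 1]) >= threshold:
            shots.append((shot_start, i))
            shot_start = i
    shots.append((shot_start, end))
    return shots


def variation_threshold(diffs, k=2.0):
    """Assumed content-adaptive rule: mean + k * standard deviation."""
    return mean(diffs) + k * pstdev(diffs)


def two_pass_split(values, fixed_threshold):
    """First pass with the fixed threshold, second pass with a per-shot threshold."""
    first = split_range(values, 0, len(values), fixed_threshold)
    final = []
    for start, end in first:
        diffs = [abs(values[i] - values[i - 1]) for i in range(start + 1, end)]
        # Shots with no internal variation are left as they are.
        if len(diffs) < 2 or max(diffs) == min(diffs):
            final.append((start, end))
            continue
        final.extend(split_range(values, start, end, variation_threshold(diffs)))
    return first, final


# Per-frame scalar features: a steady shot, a hard cut, then a shot that
# contains a smaller jump the fixed threshold is too coarse to catch.
values = [10, 11, 10, 11, 10, 11, 10, 11,
          100, 101, 100, 101, 100, 120, 121, 120, 121, 120]
first_shots, final_shots = two_pass_split(values, fixed_threshold=50)
```

In this sample the fixed threshold finds only the hard cut at frame 8, and the second pass then splits the second shot again at the smaller jump at frame 13, which mirrors the primary and secondary division steps described above.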

FIG. 4 is a simplified schematic diagram of one aspect of the shot division method according to an embodiment of the present invention, described with reference to FIG. 3. Referring to FIG. 4, one unit of image content (for example, one movie or one broadcast content episode) is composed of a plurality of frames. Because image content generally has 24 or 30 frames per second, one unit of image content may consist of thousands or tens of thousands of frames, depending on its length. According to the embodiment of the present invention, the primary shot division is performed by applying a predetermined fixed threshold to the image content, and as a result the image content is divided into a plurality of shots. Subsequently, using the primary shot division result, the secondary shot division is performed on a specific shot (Shot 2 in FIG. 4). In this case, the variation threshold is applied. As described above, the variation threshold can be derived from the image information or image characteristics of the specific shot obtained by the primary shot division. For example, the secondary shot division is performed by applying the variation threshold to those shots (Shot 2 in FIG. 4) that satisfy a specific condition among the plurality of shots derived from the primary shot division. As a result, Shot 2 is divided into two or more shots (Shot 2-1 and Shot 2-2), so that additional shot boundaries are obtained.

FIG. 5 is a schematic diagram illustrating another aspect of the shot division method according to an embodiment of the present invention, described with reference to FIG. 3. Of the shot division method shown in FIG. 3, FIG. 5 schematically illustrates the process in which secondary image information analysis and processing is performed based on the primary shot division result, and the variation threshold and the shots targeted for secondary shot division are selected based on that analysis.

Referring to FIG. 5, when a first shot division is performed on one unit of image content, a plurality of shots is obtained. For this purpose, image information is first analyzed and processed as described above. A second round of image information analysis and processing is then performed on each of the resulting shots. The analysis result for each shot (Shot 1, Shot 2, ..., Shot n) can be derived according to a preset algorithm stored in the shot division setting unit. Based on the analysis result derived for each shot, the variation threshold to be applied to the second shot division is calculated, and the shots targeted for the second shot division can be selected. The variation threshold can be obtained by applying a previously defined function to the image information analysis result of each shot, giving f(a1, b1, c1, ...), ..., f(an, bn, cn, ...). The shots to be subjected to the second shot division can likewise be selected by applying defined functions, written sel(a, b, ..., n) and com(a, b, ..., n), to the values f(a1, b1, c1, ...), ..., f(an, bn, cn, ...) obtained for the shots derived from the first shot division.
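The selection step of FIG. 5 can be illustrated with a short sketch. Everything concrete in it is an assumption: the per-shot analysis result (a, b, c) is taken to be the mean, standard deviation, and maximum of the inter-frame differences within a shot, the function f is taken to be `a + k * b`, and a shot is selected for the second division when its largest internal difference exceeds the value of f. The document itself leaves f and the selection functions unspecified.

```python
from statistics import mean, pstdev


def analyze_shot(diffs):
    """Hypothetical per-shot analysis result (a_i, b_i, c_i)."""
    return mean(diffs), pstdev(diffs), max(diffs)


def f(a, b, k=2.0):
    """Hypothetical predefined function giving a candidate threshold for one shot."""
    return a + k * b


def plan_second_division(per_shot_diffs):
    """Map the index of each selected shot to its variation threshold.

    A shot is selected only if at least one internal difference exceeds
    its threshold, that is, only if a second division would change it.
    """
    plan = {}
    for index, diffs in enumerate(per_shot_diffs):
        a, b, c = analyze_shot(diffs)
        threshold = f(a, b)
        if c > threshold:
            plan[index] = threshold
    return plan


# Inter-frame differences inside three shots from the first division.
per_shot_diffs = [
    [1, 1, 1, 1, 1, 1],              # uniform: nothing to split
    [1, 1, 1, 1, 20, 1, 1, 1, 1],    # one clear internal jump
    [2, 3, 2, 3, 2, 3],              # mild, regular variation
]
plan = plan_second_division(per_shot_diffs)
```

With this sample data only the second shot is selected, because it is the only one whose largest internal difference stands out from its own statistics.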

After the calculation of the variation threshold and the selection of the shot for the secondary shot division are completed, the shot division is performed again on the basis of the variation threshold value for the shot, thereby completing the shot division for the image content.

The above description is only an example and should not be construed as being limited thereto. It is to be understood that the technical spirit of the present invention should be defined only by the invention disclosed in the claims, and all technical ideas within the scope of equivalents thereof should be construed as being included in the scope of the present invention. Therefore, it is apparent to those skilled in the art that the above-described embodiments can be modified and implemented in various forms.

100: shot division device
110: image information extraction unit
120: image information analysis unit
130: division performing unit
140: shot division policy unit

Claims

A method of dividing image content into a plurality of shots, the method comprising:
dividing the image content into a plurality of first shots by applying a preset threshold value to a result of analyzing and processing image information for the image content;
obtaining a variation threshold value through analysis and processing of image information for each of the plurality of first shots; and
dividing at least one first shot of the plurality of first shots into a plurality of second shots by applying the variation threshold value.