WO2010087125A1 - Time interval representative feature vector generation device - Google Patents
Time interval representative feature vector generation device
- Publication number
- WO2010087125A1 (PCT/JP2010/000247)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- feature vector
- time interval
- feature
- time
- representative
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/91—Television signal processing therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2151—Time stamp
Definitions
- The present invention relates to an apparatus and method for generating, for each time interval, a feature vector that represents the interval from a sequence of per-frame feature vectors representing time-series data such as moving image data and sound data.
- More specifically, the present invention relates to a time section representative feature vector generation apparatus, method, and program for generating a time section representative feature vector that can describe the time-series change within a section.
- In Non-Patent Document 1, when searching for a moving image, the color layout descriptor defined in the international standard ISO/IEC 15938-3 is used as the feature vector for each frame, distances are calculated frame by frame, and similar sections are identified.
- Matching the feature vector sequences to be compared frame by frame requires a long search time. To speed up the search, methods have therefore been proposed that do not collate feature vectors in units of frames but instead generate, for each time interval containing a plurality of frames, a feature vector representing that interval (called a time interval representative feature vector) and perform matching using the generated time interval representative feature vectors.
- In one such method, a histogram feature is generated from the feature vectors included in a time interval and used as the time interval representative feature vector.
- As the feature vector for each frame of a moving image, the frame image is divided into a plurality of sub-images, and a vector whose feature amounts are the color component values (R component, G component, B component) of each sub-image is used.
- The time interval representative feature vector is generated by vector-quantizing the feature vectors of the frames included in the time interval and building a histogram of the frequency of appearance of each quantization index.
- In Non-Patent Document 3 and Non-Patent Document 4, a key frame in the time interval is selected, and the feature vector of the selected key frame is used as it is as the time interval representative feature vector.
- For example, a shot of a moving image is set as a time interval, a key frame is selected from the shot, and its feature vector is set as the time interval representative feature vector.
- In Non-Patent Document 5, an average value or median value of the feature amounts is calculated for each dimension of the feature vector from the feature vectors of the plurality of frames included in the time interval, and a feature vector composed of the calculated average or median values is used as the time section representative feature vector.
- The time interval representative feature vectors described in Non-Patent Document 2 to Non-Patent Document 5 cannot describe the time-series change (temporal change) of the feature vector series within the time section. Matching with these vectors therefore cannot distinguish time-series changes within a time section (feature vector series with different time-series changes are more likely to be judged similar), which lowers the accuracy of the feature vector series search.
- With the histogram-based method, the histogram cannot describe time-series order, so the time-series change of the series cannot be described (for example, the same histogram is obtained even if the time-series change is in reverse order).
- In the method of Non-Patent Document 3 and Non-Patent Document 4, which selects a key frame in the time interval and uses its feature vector as it is as the time interval representative feature vector, only the feature vector of a single selected frame is used. Only one point on the time series is described, so the time-series change of the feature vector series in the time interval cannot be described.
- In the method of Non-Patent Document 5, an average value or median value of the feature amounts is calculated for each dimension of the feature vector from the feature vectors of the plurality of frames included in the time interval.
- The calculated value for each dimension of the time segment representative feature vector is therefore no longer tied to a time-series position (time) within the time segment.
- As a result, the time-series change of the feature vector sequence in the time interval cannot be described (for example, even if the time-series change is in reverse order, the same segment representative feature vector is obtained).
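The order-insensitivity of histogram and average representations can be checked with a small sketch. The helper names and the toy 3-frame sequence below are illustrative assumptions, and the histogram counts whole frame vectors as a stand-in for vector-quantization indexes.

```python
from collections import Counter

def mean_vector(frames):
    """Per-dimension average of the frame feature vectors in one time interval."""
    n = len(frames)
    return [sum(f[d] for f in frames) / n for d in range(len(frames[0]))]

def histogram(frames):
    """Frequency of each distinct frame vector (stand-in for quantization indexes)."""
    return Counter(tuple(f) for f in frames)

# Hypothetical 3-frame, 2-dimensional sequence and its time-reversed copy.
forward = [[1, 0], [2, 5], [3, 9]]
backward = list(reversed(forward))

# Both representations are identical for the two sequences, so neither
# can tell the forward series from the reversed one.
same_mean = mean_vector(forward) == mean_vector(backward)
same_hist = histogram(forward) == histogram(backward)
```

Both flags come out true, which is exactly the ambiguity described above.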
- An object of the present invention is to provide a time interval representative feature vector generation apparatus that solves the problem that the time interval representative feature vectors described in Non-Patent Document 2 to Non-Patent Document 5 cannot describe the time-series change (temporal change) of a feature vector sequence in a time interval.
- A time-section representative feature vector generation device according to the present invention includes in-time-section feature vector group selection means for selecting, for each time section, feature vectors of a plurality of frames included in the time section from a sequence of feature vectors for each frame, and dimension selection means for selecting, for each time section, feature quantities of different dimensions of the feature vector from the feature vectors of different frames in the selected time section and generating them as a time section representative feature vector, which is a feature vector representing the time section.
- The present invention thus provides a time segment representative feature vector generation device that generates, from a feature vector sequence for each frame of time-series data, a time segment representative feature vector that can describe the time-series change of the feature vector sequence in a time segment.
- The time segment representative feature vector generation device 100 receives a sequence in which feature vectors for each frame are arranged in time-series order (a feature vector sequence) and outputs a time segment representative feature vector for each time segment.
- the time interval representative feature vector generation device 100 includes a time interval feature vector group selection unit 101 and a dimension selection unit 102.
- The in-time-interval feature vector group selection unit 101 selects, for each time interval, feature vectors of a plurality of frames included in the time interval, and supplies information on the feature vectors of the frames selected for each time interval to the dimension selection means 102.
- the input feature vector series is a series in which feature vectors for each frame of time series data such as moving image data and sound data are arranged in time series order.
- the time series data is not limited to moving image data or sound data.
- a frame is an individual element of time-series data.
- each element of the time-series data is referred to as a frame for convenience.
- the feature vector for each frame is composed of feature quantities of multiple dimensions.
- For moving image data, for example, the feature quantities may be the various visual features commonly known as MPEG-7 Visual, which are stipulated in the international standard ISO/IEC 15938-3 and extracted for each frame of the moving image, such as Dominant Color, Color Layout, Scalable Color, Color Structure, Edge Histogram, Homogeneous Texture, Texture Browsing, Region Shape, and Contour Shape.
- The multi-dimensional feature amount constituting the feature vector for each frame may also be a feature amount improved so as to be effective for more types of moving images.
- FIG. 2 is a diagram showing a method of extracting an example of a feature amount improved so as to be effective for more types of moving images (hereinafter referred to as a multi-shaped region comparison feature amount).
- For the multi-shaped region comparison feature amount, two extraction regions in the image (a first extraction region and a second extraction region) from which the feature amount is extracted are determined in advance for each dimension of the feature vector.
- A characteristic of this feature amount is that the shapes of the extraction regions are diverse.
- For each dimension, the average luminance values of the first extraction region and the second extraction region determined for that dimension are calculated, compared (that is, evaluated based on their difference value), and quantized into three values (+1, 0, −1) to obtain a quantization index.
- If the absolute value of the difference between the average luminance value of the first extraction region and the average luminance value of the second extraction region is equal to or less than a predetermined threshold value, the two average luminance values are regarded as not differing, and the quantization index is 0, indicating no difference. Otherwise, the two average luminance values are compared: if the average luminance value of the first extraction region is larger, the quantization index is +1, and otherwise the quantization index is −1.
- Writing $V_{n1}$ and $V_{n2}$ for the average luminance values of the first and second extraction regions of dimension n, and $th$ for the predetermined threshold, the quantization index $Q_n$ of dimension n can be calculated as
$$Q_n = \begin{cases} +1 & \text{if } |V_{n1} - V_{n2}| > th \text{ and } V_{n1} > V_{n2} \\ 0 & \text{if } |V_{n1} - V_{n2}| \le th \\ -1 & \text{if } |V_{n1} - V_{n2}| > th \text{ and } V_{n1} \le V_{n2} \end{cases}$$
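The three-valued quantization described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the representation of a region as a list of (row, column) pixels, and the toy image are all assumptions.

```python
def mean_luminance(image, region):
    """Average luminance over a region given as a list of (row, col) pixels."""
    return sum(image[r][c] for r, c in region) / len(region)

def quantization_index(mean_first, mean_second, threshold):
    """Quantize the difference of two average luminance values into +1, 0 or -1."""
    diff = mean_first - mean_second
    if abs(diff) <= threshold:
        return 0                      # no meaningful difference
    return 1 if diff > 0 else -1      # first region brighter, or not

# Hypothetical 2x2 luminance image and two extraction regions for one dimension.
image = [[10, 200],
         [12, 90]]
first_region = [(0, 0), (1, 0)]       # average luminance 11
second_region = [(0, 1)]              # average luminance 200

q = quantization_index(mean_luminance(image, first_region),
                       mean_luminance(image, second_region),
                       threshold=5)
```

Here the first region is darker by far more than the threshold, so `q` is -1.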
- For sound data, a feature vector calculated by performing frequency analysis on an acoustic frame may be used. For example, a Fourier transform may be performed on the analysis window, a power spectrum in the frequency domain calculated, the power spectrum divided into a plurality of subbands, and a feature vector whose feature amounts are the average power values of the subbands used.
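A subband average-power feature of this kind might be computed as in the sketch below. It uses a naive discrete Fourier transform from the standard library for self-containment; the function names, the equal-width subband split, and the dropping of leftover bins are assumptions made for illustration.

```python
import cmath
import math

def power_spectrum(window):
    """Power of each non-negative frequency bin of a real-valued analysis window."""
    n = len(window)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(window))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def subband_feature_vector(window, num_subbands):
    """Average power per equal-width subband (leftover high bins are dropped)."""
    spectrum = power_spectrum(window)
    size = len(spectrum) // num_subbands
    if size == 0:
        raise ValueError("more subbands than frequency bins")
    return [sum(spectrum[b * size:(b + 1) * size]) / size
            for b in range(num_subbands)]

# Hypothetical 16-sample window holding a pure tone at frequency bin 2.
window = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
features = subband_feature_vector(window, num_subbands=4)
strongest = max(range(len(features)), key=lambda i: features[i])
```

With four subbands of two bins each, the tone at bin 2 falls in the second subband, so `strongest` is 1.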
- the time section is a continuous section on the time axis.
- The method of determining the time interval is arbitrary as long as it is the same for every input feature vector sequence.
- the time section may be, for example, individual sections divided on the time axis by a certain time length (time width).
- For example, referring to the example of FIG. 3-A, each section obtained by dividing the feature vector series (frame series) into units of 10 frames, a constant time width, is defined as a time section. Alternatively, each section divided in units of one second, a fixed time length, may be defined as a time section.
- the time interval may be determined so as to allow overlapping of the intervals while shifting the interval of a certain time length (time width) at a certain interval. For example, referring to the example of FIG. 3B, on the feature vector sequence (frame sequence), the interval of 10 frames having a constant time width is shifted at intervals of 4 frames, and overlapping of the intervals is allowed. A time interval may be defined. Further, for example, the time interval may be determined so as to allow overlapping of the intervals while shifting the interval of one second unit having a constant time length at intervals of one frame.
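Both the non-overlapping and the overlapping fixed-width schemes reduce to a window width plus a shift. The sketch below is one way to enumerate such intervals; the function name and the half-open (start, end) convention are assumptions.

```python
def fixed_time_intervals(num_frames, width, stride=None):
    """Return (start, end) frame index pairs, end exclusive.

    With stride == width the intervals tile the sequence without overlap;
    with stride < width consecutive intervals overlap.
    """
    if stride is None:
        stride = width
    return [(start, start + width)
            for start in range(0, num_frames - width + 1, stride)]

# 10-frame intervals with no overlap over a 30-frame sequence.
tiled = fixed_time_intervals(30, width=10)

# 10-frame intervals shifted by 4 frames over a 22-frame sequence.
overlapping = fixed_time_intervals(22, width=10, stride=4)
```

`tiled` is `[(0, 10), (10, 20), (20, 30)]` and `overlapping` is `[(0, 10), (4, 14), (8, 18), (12, 22)]`. Trailing frames that do not fill a whole window are ignored in this sketch.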
- The time interval need not always have a fixed time length (time width).
- For example, change points (for example, shot division points of moving image data) may be detected in the feature vector series (frame series), and the individual intervals between the change points may be defined as time intervals.
- The change point may be detected, for example, from the feature vector sequence itself (for example, the distance between the feature vectors of adjacent frames is calculated, and a change point is declared when the distance exceeds a specified threshold), or it may be detected from the original time-series data.
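Change-point based intervals of this kind could be derived as in the following sketch. Euclidean distance, the function names, and the toy sequence are assumptions; any distance between adjacent frame vectors would fit the description.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def change_points(frames, threshold):
    """Indexes i where frame i differs from frame i-1 by more than the threshold."""
    return [i for i in range(1, len(frames))
            if euclidean(frames[i - 1], frames[i]) > threshold]

def intervals_from_change_points(num_frames, points):
    """Split [0, num_frames) at the change points into (start, end) intervals."""
    bounds = [0] + list(points) + [num_frames]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]

# Hypothetical sequence with one abrupt change between frame 1 and frame 2.
frames = [[0, 0], [0, 1], [9, 9], [9, 8], [9, 9]]
points = change_points(frames, threshold=3)
intervals = intervals_from_change_points(len(frames), points)
```

For this sequence `points` is `[2]` and `intervals` is `[(0, 2), (2, 5)]`.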
- The method by which the in-time-interval feature vector group selecting means 101 selects feature vectors of a plurality of frames included in a time interval is arbitrary as long as it is the same for every input feature vector sequence.
- the feature vector group selection unit 101 in the time interval may select feature vectors of all frames included in the time interval, as shown in FIG. Further, for example, as shown in FIG. 4, feature vectors of frames sampled at a constant interval may be selected. Other selection methods will be described later with reference to FIGS.
- The dimension selection unit 102 selects, for each time interval, feature quantities of different dimensions of the feature vector from the feature vectors of different frames in the selected time interval, based on the information on the feature vectors of the plurality of frames selected for each time interval supplied from the in-time-interval feature vector group selection unit 101, and outputs them as a time section representative feature vector.
- Here, “selecting feature quantities of different dimensions from feature vectors of different frames” does not necessarily mean that every selected feature quantity comes from a distinct frame and a distinct dimension. It means that at least two feature quantities of different dimensions are selected from feature vectors of different frames.
- the number of dimensions of the feature quantity selected by the dimension selection unit 102 may be arbitrary.
- If the number of dimensions of the feature vectors in the input feature vector sequence is N, the number of dimensions of the feature quantities selected here may be equal to N, less than N, or more than N.
- The method by which the dimension selection means 102 selects feature quantities of different dimensions from the feature vectors of different frames within the selected time interval is arbitrary as long as it is the same for every input feature vector sequence.
- FIG. 5 shows the feature vectors of 11 frames selected in a time interval, arranged in time-series order.
- In this example, 11 feature quantities of different dimensions are selected from the feature vectors of different frames, and the 11-dimensional feature vector composed of the 11 selected feature quantities is generated as the time interval representative feature vector.
- FIG. 6 and FIG. 7 show examples of another method in which the dimension selection unit 102 selects feature quantities of different dimensions of feature vectors from feature vectors of different frames in the selected time interval.
- In one example, 11 feature quantities of different dimensions, dimension 1 to dimension 11, are selected one at a time in order from frame 1 to frame 11, and then another 11 feature quantities of different dimensions, dimension 12 to dimension 22, are selected one at a time, again in order from frame 1 to frame 11. The 22-dimensional feature vector composed of the total of 22 feature quantities is generated as the time section representative feature vector.
- In the other example, a total of 22 feature quantities of different dimensions, dimension 1 to dimension 22, are selected two dimensions at a time in order from frame 1 to frame 11, and the 22-dimensional feature vector composed of the total of 22 feature quantities is generated as the time section representative feature vector.
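The selection patterns in these examples can be expressed as a mapping from each output dimension to the frame it is taken from. The sketch below is an illustration under assumptions: the function names are invented, and each toy feature value encodes its own frame and dimension (100 × frame number + dimension number) so that provenance is visible.

```python
def select_cyclic(frames, num_dims):
    """Dimension d is taken from frame d mod len(frames), one dimension per frame per pass."""
    return [frames[d % len(frames)][d] for d in range(num_dims)]

def select_blocked(frames, num_dims):
    """Consecutive dimensions are taken from the same frame before moving to the next."""
    per_frame = -(-num_dims // len(frames))   # ceiling division
    return [frames[d // per_frame][d] for d in range(num_dims)]

# Hypothetical interval: 11 frames, 22 dimensions each.
frames = [[100 * (f + 1) + (d + 1) for d in range(22)] for f in range(11)]

diag11 = select_cyclic(frames, 11)      # 11 dimensions, one from each frame
cyclic22 = select_cyclic(frames, 22)    # dimensions 12 to 22 revisit frames 1 to 11
blocked22 = select_blocked(frames, 22)  # two dimensions from each frame

# Reversing the frame order changes the result, unlike an average or histogram.
reversed11 = select_cyclic(frames[::-1], 11)
```

For instance `diag11` starts `[101, 202, 303, ...]`, `cyclic22[11]` is 112 (dimension 12 from frame 1), and `blocked22` starts `[101, 102, 203, 204, ...]`.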
- the dimension selection unit 102 desirably selects feature quantities of different dimensions of feature vectors uniformly from a plurality of frames in the selected time interval. For example, at least one dimension feature amount may be selected from all the frames in the selected time interval.
- the time section representative feature vector for each time section output from the time section representative feature vector generation device 100 describes a time series change of the feature vector series in the time section. The reason is that the characteristics of a plurality of positions (time) on the time series in the time interval are aggregated.
- The generated time interval representative feature vector is also composed of feature quantities with different meanings, because feature quantities of different dimensions are extracted by different procedures and therefore mean different things.
- The time section representative feature vector output from the time section representative feature vector generation device 100 is thus an aggregation of feature quantities having different meanings at different positions in the time section. For this reason, there is little redundancy, and the description capability (identification capability) of the time section representative feature vector is high, so high-precision search is possible.
- Because time-series change in the time section can be identified by using the time section representative feature vector for each time section output from the time section representative feature vector generation apparatus 100, a fast and highly accurate search of feature vector series can be performed for each time section.
- a feature vector series search system configured using the time segment representative feature vector generation device 100 will be described later.
- First, the in-time-interval feature vector group selection unit 101 selects, for each time interval, feature vectors of a plurality of frames included in the time interval (step A1). It then supplies information on the feature vectors of the frames selected for each time interval to the dimension selection means 102.
- Next, from the information supplied by the in-time-interval feature vector group selection unit 101, the dimension selection unit 102 selects, for each time interval, feature quantities of different dimensions of the feature vector from the feature vectors of different frames in the selected time interval (step A2), and outputs them as the time section representative feature vector.
- According to the time-section representative feature vector generation device 100 of the first embodiment, it is possible to generate a time-section representative feature vector that can describe (identify) the time-series change of the feature vector series in the time section.
- The reason is that, by selecting feature quantities of different dimensions from the feature vectors of different frames in the time interval and using them as the time interval representative feature vector, the characteristics of a plurality of positions (times) on the time series in the time interval are aggregated. Because the generated time segment representative feature vector can thus identify the time-series change of the feature vector sequence in the time interval (feature vector sequences with different time-series changes can be distinguished), the accuracy of the feature vector sequence search can be improved.
- In addition, the matching method for time section representative feature vectors may be the same as the matching method for the original per-frame feature vectors. Therefore, in a system that hierarchically performs matching of time section representative feature vectors and frame-by-frame matching, such as the second feature vector sequence search system described later, there is also the advantage that a single matching method can serve both stages.
- The time segment representative feature vector generation device 110 according to the second embodiment differs from the time segment representative feature vector generation device 100 according to the first exemplary embodiment in that the in-time-interval feature vector group selection unit 101 is replaced with the in-time-interval feature vector group selection unit 111.
- Information indicating the frame rate of the feature vector series and information indicating the reference frame rate for generating the time segment representative feature vector are input to the feature vector group selection unit 111 in the time interval.
- the intra-time feature vector group selection unit 111 specifies the sample position at the reference frame rate from the feature vector sequence using the frame rate of the feature vector sequence. Then, feature vectors of a plurality of frames at the specified sample position are selected, and information on the feature vectors of the plurality of frames selected for each time interval is supplied to the dimension selection unit 102.
- Fig. 12 shows a specific example.
- the frame rate of the feature vector series is 30 frames / second, and a 1 second section (that is, 30 frames) is defined as a time section.
- the reference frame rate for generating the time section representative feature vector is 5 frames / second.
- In this case, the sampling interval (in frames) may be calculated as sampling interval = (frame rate of the feature vector series) / (reference frame rate) = 30 / 5 = 6, and the sample positions specified accordingly, so that one frame is sampled every six frames.
- If the sampling interval is not an integer but a decimal value, the frame at the integer sample position obtained by rounding off the decimal sample position may be sampled, for example.
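The sample-position calculation can be sketched as below. The function name is an assumption, as are the choices to start sampling at frame 0 and to round half up with `int(x + 0.5)` (Python's built-in `round` rounds halves to the nearest even integer, which may not be what is intended here).

```python
def sample_positions(frame_rate, reference_rate, num_frames):
    """Frame indexes sampled from one time interval at the reference frame rate."""
    interval = frame_rate / reference_rate       # may be fractional
    positions = []
    k = 0
    while True:
        position = int(k * interval + 0.5)       # round half up to an integer frame
        if position >= num_frames:
            break
        positions.append(position)
        k += 1
    return positions

# One-second intervals at three hypothetical frame rates, reference rate 5 frames/second.
from_30fps = sample_positions(30, 5, 30)   # interval 6
from_15fps = sample_positions(15, 5, 15)   # interval 3
from_24fps = sample_positions(24, 5, 24)   # interval 4.8, rounded per sample
```

`from_30fps` is `[0, 6, 12, 18, 24]` and `from_15fps` is `[0, 3, 6, 9, 12]`, which correspond to the same instants in time; `from_24fps` is `[0, 5, 10, 14, 19]`.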
- According to the time section representative feature vector generation device 110 of the second embodiment, it is possible to generate time section representative feature vectors that can be compared with each other even for feature vector sequences with different frame rates. The reason is that the frame sequence of feature vectors selected for generating the time interval representative feature vector is unified to the reference frame rate.
- a first feature vector series having a frame rate of 30 frames / second and a second feature vector series having a frame rate of 15 frames / second are generated from the same moving image.
- Suppose a 1-second interval is determined as a time interval and the reference frame rate for generating a time interval representative feature vector is 5 frames/second. Then, from the first feature vector series, 5 frames are selected from the 30 frames, one every 6 frames.
- From the second feature vector series, 5 frames are selected from the 15 frames, one every 3 frames. The five frames selected from the second feature vector series are then the same as the five frames selected from the first feature vector series.
- The time interval representative feature vector generation device 120 according to the third embodiment differs from the time interval representative feature vector generation device 100 according to the first exemplary embodiment in that the dimension selection unit 102 is replaced with the dimension selection means 122.
- the dimension selection means 122 receives information (dimension importance information) indicating the importance of each dimension of the feature vector.
- From the information on the feature vectors of the plurality of frames selected for each time interval, supplied from the in-time-interval feature vector group selection means 101, the dimension selection means 122 selects, for each time interval, feature quantities of different dimensions from the feature vectors of different frames in the selected time interval, in order from the dimension with the highest importance according to the importance of each dimension, and outputs them as the time section representative feature vector.
- the information indicating the importance for each dimension may be, for example, information obtained by quantifying the importance for each dimension, or information representing a permutation of the importance for each dimension. Alternatively, it may be information expressing the importance as a binary value of 1 or 0.
- What the importance measures is arbitrary. For example, it may be the degree to which the feature quantity of that dimension contributes to search accuracy, the discrimination ability of the feature quantity of that dimension (the degree to which different data can be distinguished), or the robustness of the feature quantity of that dimension (resistance to various kinds of noise and processing applied to the data).
- Fig. 14 shows a specific example.
- the feature vectors of the feature vector series are composed of 25-dimensional feature quantities, which are dimension 1 to dimension 25, respectively.
- the importance for each dimension is assumed to decrease as the dimension number increases. That is, the dimensions are arranged in descending order of importance, with dimension 1 having the highest importance and dimension 25 having the lowest importance.
- Information indicating that the dimensions are arranged in descending order of importance is input to the dimension selection means 122 as dimension importance information, and the dimension selection means 122 accordingly selects feature quantities in order, starting from the dimension with the smallest dimension number.
- In this example, a total of 11 feature quantities, dimension 1 to dimension 11, are selected in descending order of dimension importance.
- According to the time section representative feature vector generation device 120 of the third embodiment, a time-section representative feature vector can be generated from the dimensions of the feature vector with high importance. This is effective when the number of dimensions of the time interval representative feature vector to be generated is made smaller than the number of dimensions of the original feature vector, because the dimensions with higher importance are the ones retained.
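Importance-ordered selection might look like the sketch below. The function name, the numeric importance scores, and the rule that the i-th most important dimension is read from frame i (cycling through the frames) are assumptions; the text only requires that dimensions be taken in order of importance from different frames.

```python
def select_by_importance(frames, importance, num_dims):
    """Pick the num_dims most important dimensions, each from a different frame in turn."""
    ranked = sorted(range(len(importance)),
                    key=lambda d: importance[d], reverse=True)[:num_dims]
    return [frames[i % len(frames)][d] for i, d in enumerate(ranked)]

# Hypothetical interval: 3 frames of 4 dimensions; each value encodes
# 10 * frame number + dimension number so its origin is visible.
frames = [[10 * (f + 1) + (d + 1) for d in range(4)] for f in range(3)]
importance = [0.1, 0.9, 0.5, 0.7]       # dimension 2 is the most important

representative = select_by_importance(frames, importance, num_dims=3)
```

The ranking is dimension 2, 4, 3, so `representative` is `[12, 24, 33]`: dimension 2 from frame 1, dimension 4 from frame 2, and dimension 3 from frame 3. The least important dimension is dropped.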
- Here the feature vector sequence search system is described as being configured using the time segment representative feature vector generation device 100, but the time segment representative feature vector generation device 110 described in the second embodiment or the time section representative feature vector generation device 120 described in the third embodiment may be used instead.
- the first feature vector sequence search system includes a time interval representative feature vector generation device 100 and a matching device 200.
- The time segment representative feature vector generation device 100 receives the first feature vector sequence and the second feature vector sequence, and outputs a time segment representative feature vector for each time segment of the first feature vector sequence and a time segment representative feature vector for each time segment of the second feature vector sequence.
- the output time section representative feature vector for each time section of the first feature vector series and the time section representative feature vector for each time section of the second feature vector series are supplied to the matching device 200.
- the collation device 200 includes a time section representative feature vector collation means 201.
- The time section representative feature vector matching unit 201 collates the time section representative feature vector for each time section of the first feature vector sequence with the time section representative feature vector for each time section of the second feature vector sequence, both supplied from the time section representative feature vector generation device 100, and determines whether they are similar. When they are determined to be similar, it outputs information on the corresponding time sections as similar time section information.
- For example, the following method can be used to collate a time section representative feature vector corresponding to a time section of the first feature vector series with a time section representative feature vector corresponding to a time section of the second feature vector series.
- First, the degree of similarity between the time section representative feature vectors to be compared is calculated, for example as a distance between vectors (Euclidean distance, Hamming distance, etc.) or as a similarity between vectors (cosine similarity, etc.).
- With a distance between vectors, the smaller the value, the more similar the vectors are; with a similarity between vectors, the larger the value, the more similar they are.
- the numerical value of the degree of similarity calculated in this way is subjected to threshold processing with a certain predetermined threshold (this threshold is given in advance, for example) to determine whether or not they are similar. For example, when the distance between vectors is used, it is determined that the values are similar when the value is smaller than a predetermined threshold. When the similarity between vectors is used, the value is larger than the predetermined threshold. It is determined that the cases are similar. If it is determined that they are similar, information on the time section corresponding to each time section representative feature vector is output as similar time section information.
- For example, if the time segment representative feature vector corresponding to frames 80 to 100 of the first feature vector series and the time segment representative feature vector corresponding to frames 250 to 270 of the second feature vector series are determined to be similar, frames 80 to 100 of the first feature vector series and frames 250 to 270 of the second feature vector series may be output as similar time segments.
- According to the first feature vector series search system, a fast and accurate feature vector series search can be realized that uses time segment representative feature vectors and can discriminate time-series changes within a time segment.
- The second feature vector series search system differs from the first in that the collation device 200 of the first feature vector series search system is replaced with a collation device 210.
- The collation device 210 includes time segment representative feature vector collation means 201 and frame-unit feature vector collation means 212.
- The time segment representative feature vector collation means 201 is the same as in the first feature vector series search system, so its description is omitted here.
- The frame-unit feature vector collation means 212 collates again, frame by frame, the feature vectors of the frames included in each time segment indicated by the similar time segment information output by the time segment representative feature vector collation means 201, between the input first feature vector series and second feature vector series. When it determines that the time segments are similar, it outputs similar time segment information.
- In the second feature vector series search system, the fast and accurate feature vector series search using time segment representative feature vectors, which can discriminate time-series changes within a time segment, serves as the first-stage search.
- The time segment representative feature vector generation device and the collation device of the present invention can be realized by hardware, and also by a computer and a program.
- The program is provided recorded on a computer-readable recording medium such as a magnetic disk or a semiconductor memory, is read by the computer at start-up or a similar time, and causes the computer to function as the time segment representative feature vector generation device and the collation device of the embodiments.
- The present invention can be used for searching moving image data, sound data, and the like.
- For example, desired content can be retrieved at high speed from a database storing movie content and music content. The invention can also be used to detect illegal copies of moving image data and sound data uploaded to the Internet or the like.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Engineering & Computer Science (AREA)
- Library & Information Science (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
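An object of the present invention is to provide a time segment representative feature vector generation device that solves the problem that the time segment representative feature vectors described in Non-Patent Documents 2 to 5 cannot describe the time-series change (temporal change) of a feature vector series within a time segment.
Referring to FIG. 1, the time segment representative feature vector generation device 100 according to the first embodiment of the present invention receives a series in which per-frame feature vectors are arranged in time-series order (a feature vector series) and outputs a time segment representative feature vector, which is a feature vector that represents a time segment. The time segment representative feature vector generation device 100 includes intra-time-segment feature vector group selection means 101 and dimension selection means 102.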
0 (if |Vn1-Vn2| ≤ th)
-1 (if |Vn1-Vn2| > th and Vn1 ≤ Vn2)
… [Equation 1]
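Next, the operation of the time segment representative feature vector generation device 100 according to the first embodiment is described with reference to the flowchart of FIG. 10.
According to the time segment representative feature vector generation device 100 of the first embodiment, it is possible to generate a time segment representative feature vector that can describe (discriminate) the time-series change of the feature vector series within a time segment. This is because feature values of different dimensions are selected from the feature vectors of different frames in the time segment to form the time segment representative feature vector, which aggregates the features at a plurality of positions (times) along the time series within the segment. Because the generated time segment representative feature vector can thus discriminate the time-series change of the feature vector series within the time segment (that is, it can distinguish feature vector series with different time-series changes), the accuracy of feature vector series search can be improved.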
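The dimension selection described for the first embodiment can be sketched as follows. This is a minimal illustration, not the patent's reference implementation: the function name `representative_vector` and the round-robin rule (dimension d is taken from frame d modulo the number of frames) are assumptions chosen only to show how picking different dimensions from different frames preserves temporal order.

```python
from typing import List, Sequence


def representative_vector(frames: Sequence[Sequence[float]]) -> List[float]:
    """Build one representative vector for a time segment.

    Assumed rule (hypothetical): dimension d is copied from frame
    d % len(frames), so every frame contributes different dimensions.
    """
    if not frames:
        raise ValueError("a time segment must contain at least one frame")
    dims = len(frames[0])
    if any(len(f) != dims for f in frames):
        raise ValueError("all feature vectors must have the same dimension")
    return [frames[d % len(frames)][d] for d in range(dims)]


# Three frames of a 6-dimensional feature vector series.
segment = [
    [1, 1, 1, 1, 1, 1],
    [2, 2, 2, 2, 2, 2],
    [3, 3, 3, 3, 3, 3],
]
rep = representative_vector(segment)
# The same frames in reverse temporal order give a different vector,
# whereas a per-dimension average would be identical for both orders.
reversed_rep = representative_vector(segment[::-1])
```

Here `rep` and `reversed_rep` differ even though both segments contain the same set of frames, which is the property the paragraph above attributes to the time segment representative feature vector.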
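Referring to FIG. 11, the time segment representative feature vector generation device 110 according to the second embodiment of the present invention differs from the device 100 of the first embodiment in that the intra-time-segment feature vector group selection means 101 is replaced with intra-time-segment feature vector group selection means 111.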
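Sampling interval (frames)
= frame rate of the feature vector series ÷ reference frame rate
The sampling interval may be calculated as above and the sample positions specified accordingly. In this example, the sampling interval is 30 ÷ 5 = 6, so one frame is sampled every 6 frames. If the sampling interval is a fractional value rather than an integer, the frame at the integer sample position obtained by rounding the fractional sample position to the nearest integer may be sampled, for example.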
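The sampling rule above can be sketched in a few lines. The function name `sample_positions`, the choice to start at frame 0, and the use of `floor(x + 0.5)` for rounding half up are assumptions made for this illustration; the text only states that fractional positions are rounded to the nearest integer.

```python
import math
from typing import List


def sample_positions(frame_rate: float, reference_rate: float,
                     num_frames: int) -> List[int]:
    """Return the frame indices to sample at the reference frame rate.

    sampling interval = frame_rate / reference_rate (in frames).
    Fractional positions are rounded half up; Python's built-in round()
    rounds halves to the nearest even integer, so it is not used here.
    """
    if frame_rate <= 0 or reference_rate <= 0:
        raise ValueError("frame rates must be positive")
    interval = frame_rate / reference_rate
    positions = []
    k = 0
    while True:
        pos = math.floor(k * interval + 0.5)
        if pos >= num_frames:
            break
        positions.append(pos)
        k += 1
    return positions


# 30 fps series, 5 fps reference: interval 6, one frame every 6 frames.
one_second = sample_positions(30, 5, 30)
# 15 fps series, 6 fps reference: interval 2.5, positions are rounded.
fractional = sample_positions(15, 6, 10)
```

With a 30 fps series and a 5 fps reference this reproduces the interval of 6 from the example in the text. If the reference rate is higher than the series frame rate, the same frame index can appear more than once in the result.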
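According to the time segment representative feature vector generation device 110 of the second embodiment, mutually comparable time segment representative feature vectors can be generated even for feature vector series with different frame rates. This is because the reference frame rate for generating time segment representative feature vectors is used to unify, at that reference frame rate, the frame sequence of the feature vectors selected for generating the time segment representative feature vector.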
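Referring to FIG. 13, the time segment representative feature vector generation device 120 according to the third embodiment of the present invention differs from the device 100 of the first embodiment in that the dimension selection means 102 is replaced with dimension selection means 122.
According to the time segment representative feature vector generation device 120 of the third embodiment, the time segment representative feature vector can be generated from the dimensions of the feature vector that have high importance. This is effective when the number of dimensions of the generated time segment representative feature vector is made smaller than the number of dimensions of the original feature vector, because the more important dimensions are selected.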
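A minimal sketch of importance-ordered selection follows, assuming the per-dimension importance is given as a list of numbers. The function name `representative_by_importance`, the `out_dims` parameter, and the round-robin assignment of the chosen dimensions to frames are hypothetical; the text specifies only that dimensions are taken in descending order of importance from different frames.

```python
from typing import List, Sequence


def representative_by_importance(frames: Sequence[Sequence[float]],
                                 importance: Sequence[float],
                                 out_dims: int) -> List[float]:
    """Pick the out_dims most important dimensions, each from a different
    frame in round-robin order (the frame assignment is an assumption)."""
    if not frames:
        raise ValueError("a time segment must contain at least one frame")
    if out_dims > len(importance):
        raise ValueError("out_dims exceeds the number of dimensions")
    # Dimension indices sorted from most to least important.
    order = sorted(range(len(importance)),
                   key=lambda d: importance[d], reverse=True)
    chosen = order[:out_dims]
    return [frames[i % len(frames)][d] for i, d in enumerate(chosen)]


frames = [
    [10, 11, 12, 13],
    [20, 21, 22, 23],
]
importance = [0.1, 0.9, 0.5, 0.7]
# Dimensions 1 and 3 are the two most important ones.
reduced = representative_by_importance(frames, importance, out_dims=2)
```

In this example the 4-dimensional feature vectors are reduced to a 2-dimensional representative vector that keeps dimension 1 (from the first frame) and dimension 3 (from the second frame).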
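Referring to FIG. 8, the first feature vector series search system according to the present invention includes the time segment representative feature vector generation device 100 and a collation device 200.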
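The collation performed in this system computes a distance or a similarity between two time segment representative feature vectors and compares it with a threshold. The sketch below illustrates that decision with the measures the document names (Euclidean distance, Hamming distance, cosine similarity). The function names, the strict inequalities, and the convention of returning 0.0 for a zero-length vector are assumptions for this illustration.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def euclidean_distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming_distance(a: Vector, b: Vector) -> int:
    # Number of dimensions whose values differ.
    return sum(1 for x, y in zip(a, b) if x != y)


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention for zero vectors
    return dot / (norm_a * norm_b)


def similar_by_distance(a: Vector, b: Vector, threshold: float,
                        distance: Callable[[Vector, Vector], float]
                        = euclidean_distance) -> bool:
    # With a distance, smaller means more similar.
    return distance(a, b) < threshold


def similar_by_similarity(a: Vector, b: Vector, threshold: float) -> bool:
    # With a similarity, larger means more similar.
    return cosine_similarity(a, b) > threshold
```

For instance, `similar_by_distance([0, 0], [3, 4], 6.0)` is true because the Euclidean distance is 5, while the same call with a threshold of 5.0 is false because the comparison is strict.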
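Referring to FIG. 9, the second feature vector series search system according to the present invention differs from the first feature vector series search system in that the collation device 200 is replaced with a collation device 210.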
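The two-stage collation of the second system can be sketched as a coarse pass over time segment representative feature vectors followed by a frame-by-frame check of the candidate segments. Everything concrete here is an assumption for illustration: fixed-length segments, the round-robin `representative` rule, Hamming distance at both stages, and the mean per-frame distance as the second-stage score.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[int]


def hamming(a: Vector, b: Vector) -> int:
    return sum(1 for x, y in zip(a, b) if x != y)


def representative(segment: Sequence[Vector]) -> List[int]:
    # Assumed rule: dimension d comes from frame d % len(segment).
    return [segment[d % len(segment)][d] for d in range(len(segment[0]))]


def split(frames: Sequence[Vector], seg_len: int) -> List[Sequence[Vector]]:
    # Fixed-length segments; a trailing partial segment is dropped.
    return [frames[i:i + seg_len]
            for i in range(0, len(frames) - seg_len + 1, seg_len)]


def two_stage_search(frames1: Sequence[Vector], frames2: Sequence[Vector],
                     seg_len: int, rep_threshold: float,
                     frame_threshold: float) -> List[Tuple[int, int]]:
    segs1, segs2 = split(frames1, seg_len), split(frames2, seg_len)
    reps1 = [representative(s) for s in segs1]
    reps2 = [representative(s) for s in segs2]
    confirmed = []
    for i, r1 in enumerate(reps1):
        for j, r2 in enumerate(reps2):
            # Stage 1: cheap comparison of representative vectors.
            if hamming(r1, r2) >= rep_threshold:
                continue
            # Stage 2: frame-by-frame check of the candidate segments.
            mean = sum(hamming(f1, f2)
                       for f1, f2 in zip(segs1[i], segs2[j])) / seg_len
            if mean < frame_threshold:
                confirmed.append((i, j))
    return confirmed


frames_a = [[1, 0], [0, 1], [1, 1], [0, 0]]
frames_b = [[1, 1], [1, 0], [1, 0], [0, 1]]
matches = two_stage_search(frames_a, frames_b, seg_len=2,
                           rep_threshold=1, frame_threshold=0.5)
```

In this toy data the first stage proposes two candidate pairs, (0, 1) and (1, 0), and the frame-level stage rejects (1, 0) because one of its frames differs, leaving only (0, 1).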
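101… intra-time-segment feature vector group selection means
102… dimension selection means
200… collation device
201… time segment representative feature vector collation means
210… collation device
201… time segment representative feature vector collation means
212… frame-unit feature vector collation means
110… time segment representative feature vector generation device
111… intra-time-segment feature vector group selection means
120… time segment representative feature vector generation device
122… dimension selection means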
Claims (19)
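- A time segment representative feature vector generation device comprising: intra-time-segment feature vector group selection means for selecting, for each time segment, the feature vectors of a plurality of frames included in the time segment from a series of per-frame feature vectors; and dimension selection means for selecting, for each time segment, feature values of different dimensions of the feature vector from the feature vectors of different frames in the selected time segment, and generating them as a time segment representative feature vector, which is a feature vector that represents the time segment.
- The time segment representative feature vector generation device according to claim 1, wherein the feature vector series is a series of per-frame feature vectors of moving image data.
- The time segment representative feature vector generation device according to claim 2, wherein the feature vector is calculated based on difference values between the feature values of the two sub-regions forming each of a plurality of sub-region pairs in a frame of the moving image.
- The time segment representative feature vector generation device according to any one of claims 1 to 3, wherein the dimension selection means selects the feature value of at least one dimension from the feature vector of every frame in the selected time segment.
- The time segment representative feature vector generation device according to any one of claims 1 to 4, wherein the intra-time-segment feature vector group selection means specifies, from the feature vector series, sample positions at a reference frame rate based on information indicating the frame rate of the feature vector series and information indicating the reference frame rate for generating time segment representative feature vectors, and selects the feature vectors of the plurality of frames at the specified sample positions.
- The time segment representative feature vector generation device according to claim 5, wherein the intra-time-segment feature vector group selection means specifies the sample positions based on a sampling interval determined by the ratio of the frame rate of the feature vector series to the reference frame rate.
- The time segment representative feature vector generation device according to any one of claims 1 to 6, wherein the dimension selection means selects feature values of different dimensions of the feature vector from the feature vectors of different frames in the selected time segment, in descending order of importance, according to a predetermined importance for each dimension of the feature vector.
- A collation device comprising first collation means for collating a time segment representative feature vector for each time segment of a first feature vector series with a time segment representative feature vector for each time segment of a second feature vector series, both generated by the time segment representative feature vector generation device according to any one of claims 1 to 7, and determining whether the time segment representative feature vectors are similar.
- The collation device according to claim 8, further comprising second collation means for collating, frame by frame, the feature vectors of the frames included in the time segments corresponding to each pair of time segment representative feature vectors determined to be similar by the first collation means.
- A time segment representative feature vector generation method comprising: selecting, for each time segment, the feature vectors of a plurality of frames included in the time segment from a series of per-frame feature vectors; and selecting, for each time segment, feature values of different dimensions of the feature vector from the feature vectors of different frames in the selected time segment, and generating them as a time segment representative feature vector, which is a feature vector that represents the time segment.
- The time segment representative feature vector generation method according to claim 10, wherein the feature vector series is a series of per-frame feature vectors of moving image data.
- The time segment representative feature vector generation method according to claim 11, wherein the feature vector is calculated based on difference values between the feature values of the two sub-regions forming each of a plurality of sub-region pairs in a frame of the moving image.
- The time segment representative feature vector generation method according to any one of claims 10 to 12, wherein, in generating the time segment representative feature vector, the feature value of at least one dimension is selected from the feature vector of every frame in the selected time segment.
- The time segment representative feature vector generation method according to any one of claims 10 to 13, wherein, in selecting the feature vectors of the plurality of frames, sample positions at a reference frame rate are specified from the feature vector series based on information indicating the frame rate of the feature vector series and information indicating the reference frame rate for generating time segment representative feature vectors, and the feature vectors of the plurality of frames at the specified sample positions are selected.
- The time segment representative feature vector generation method according to claim 14, wherein, in selecting the feature vectors of the plurality of frames, the sample positions are specified based on a sampling interval determined by the ratio of the frame rate of the feature vector series to the reference frame rate.
- The time segment representative feature vector generation method according to any one of claims 10 to 15, wherein, in generating the time segment representative feature vector, feature values of different dimensions of the feature vector are selected from the feature vectors of different frames in the selected time segment, in descending order of importance, according to a predetermined importance for each dimension of the feature vector.
- A collation method comprising collating a time segment representative feature vector for each time segment of a first feature vector series with a time segment representative feature vector for each time segment of a second feature vector series, both generated by the time segment representative feature vector generation method according to any one of claims 10 to 16, and determining whether the time segment representative feature vectors are similar.
- The collation method according to claim 17, wherein, for each pair of time segment representative feature vectors determined to be similar, the feature vectors of the frames included in the corresponding time segments are collated frame by frame.
- A program for causing a computer to function as: intra-time-segment feature vector group selection means for selecting, for each time segment, the feature vectors of a plurality of frames included in the time segment from a series of per-frame feature vectors; and dimension selection means for selecting, for each time segment, feature values of different dimensions of the feature vector from the feature vectors of different frames in the selected time segment, and generating them as a time segment representative feature vector, which is a feature vector that represents the time segment.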
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2010548398A JP4894956B2 (ja) | 2009-01-29 | 2010-01-19 | 時間区間代表特徴ベクトル生成装置 |
CN201080005899.6A CN102301698B (zh) | 2009-01-29 | 2010-01-19 | 时间分段表示特征矢量生成设备 |
KR1020117015970A KR101352448B1 (ko) | 2009-01-29 | 2010-01-19 | 시간 구간 대표 특징 벡터 생성 장치 |
EP10735597.6A EP2383990B1 (en) | 2009-01-29 | 2010-01-19 | Time segment representative feature vector generation device |
US13/143,673 US8175392B2 (en) | 2009-01-29 | 2010-01-19 | Time segment representative feature vector generation device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009017807 | 2009-01-29 | ||
JP2009-017807 | 2009-01-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2010087125A1 true WO2010087125A1 (ja) | 2010-08-05 |
Family
ID=42395391
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/000247 WO2010087125A1 (ja) | 2009-01-29 | 2010-01-19 | 時間区間代表特徴ベクトル生成装置 |
Country Status (6)
Country | Link |
---|---|
US (1) | US8175392B2 (ja) |
EP (1) | EP2383990B1 (ja) |
JP (1) | JP4894956B2 (ja) |
KR (1) | KR101352448B1 (ja) |
CN (1) | CN102301698B (ja) |
WO (1) | WO2010087125A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011130413A (ja) * | 2009-10-05 | 2011-06-30 | Mitsubishi Electric R&D Centre Europe Bv | デジタルコンテンツ符号器、復号器、検索装置、符号化方法、検索方法、記録担体、信号、および記憶装置 |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101612212B1 (ko) * | 2011-11-18 | 2016-04-15 | 닛본 덴끼 가부시끼가이샤 | 국소 특징 기술자 추출 장치, 국소 특징 기술자 추출 방법, 및 프로그램을 기록한 컴퓨터 판독가능 기록 매체 |
CN102857778B (zh) * | 2012-09-10 | 2015-01-21 | 海信集团有限公司 | 3d视频转换系统和方法及其选择关键帧的方法和装置 |
KR101957944B1 (ko) * | 2014-11-13 | 2019-03-13 | 삼성전자주식회사 | 영상의 주파수 특성 정보를 포함하는 메타 데이터를 생성하는 방법 및 장치 |
CN105245950B (zh) * | 2015-09-25 | 2018-09-14 | 精硕科技(北京)股份有限公司 | 视频广告监播方法及装置 |
CN106095764A (zh) * | 2016-03-31 | 2016-11-09 | 乐视控股(北京)有限公司 | 一种动态图片处理方法及系统 |
CN107871190B (zh) * | 2016-09-23 | 2021-12-14 | 阿里巴巴集团控股有限公司 | 一种业务指标监控方法及装置 |
CN108874813B (zh) * | 2017-05-10 | 2022-07-29 | 腾讯科技(北京)有限公司 | 一种信息处理方法、装置及存储介质 |
KR102261928B1 (ko) * | 2019-12-20 | 2021-06-04 | 조문옥 | 기계학습이 완료된 사물 인식 모델을 통해 동영상에 대한 상황 정보 판단이 가능한 동영상 정보 판단장치 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002044610A (ja) * | 2000-04-27 | 2002-02-08 | Nippon Telegr & Teleph Corp <Ntt> | 信号検出方法、装置及びそのプログラム、記録媒体 |
JP2007336106A (ja) * | 2006-06-13 | 2007-12-27 | Osaka Univ | 映像編集支援装置 |
JP2009017807A (ja) | 2007-07-11 | 2009-01-29 | Marusan Technos:Kk | 鳥害防止用装置 |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR0171118B1 (ko) * | 1995-03-20 | 1999-03-20 | 배순훈 | 비디오신호 부호화 장치 |
KR0171154B1 (ko) * | 1995-04-29 | 1999-03-20 | 배순훈 | 특징점 기반 움직임 추정을 이용하여 비디오 신호를 부호화하는 방법 및 장치 |
US6404925B1 (en) * | 1999-03-11 | 2002-06-11 | Fuji Xerox Co., Ltd. | Methods and apparatuses for segmenting an audio-visual recording using image similarity searching and audio speaker recognition |
EP1161098B1 (en) * | 2000-04-27 | 2011-06-22 | Nippon Telegraph And Telephone Corporation | Signal detection method and apparatus |
US6859554B2 (en) * | 2001-04-04 | 2005-02-22 | Mitsubishi Electric Research Laboratories, Inc. | Method for segmenting multi-resolution video objects |
JP2004234613A (ja) * | 2002-12-02 | 2004-08-19 | Nec Corp | 映像記述システムおよび方法、映像識別システムおよび方法 |
KR100679124B1 (ko) * | 2005-01-27 | 2007-02-05 | 한양대학교 산학협력단 | 이미지 시퀀스 데이터 검색을 위한 정보 요소 추출 방법및 그 방법을 기록한 기록매체 |
JP2006351001A (ja) * | 2005-05-19 | 2006-12-28 | Nippon Telegr & Teleph Corp <Ntt> | コンテンツ特徴量抽出方法及び装置及びコンテンツ同一性判定方法及び装置 |
-
2010
- 2010-01-19 EP EP10735597.6A patent/EP2383990B1/en active Active
- 2010-01-19 CN CN201080005899.6A patent/CN102301698B/zh active Active
- 2010-01-19 WO PCT/JP2010/000247 patent/WO2010087125A1/ja active Application Filing
- 2010-01-19 JP JP2010548398A patent/JP4894956B2/ja active Active
- 2010-01-19 KR KR1020117015970A patent/KR101352448B1/ko active IP Right Grant
- 2010-01-19 US US13/143,673 patent/US8175392B2/en active Active
Non-Patent Citations (6)
Title |
---|
ANIL JAIN, ADITYA VAILAYA, WEI XIONG: "Query by Video Clip", PROC. ON ICPR (INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION), vol. 1, August 1998 (1998-08-01), pages 16 - 20 |
EIJI KASUTANI, AKIO YAMADA: "Acceleration of Video Identification Process Using Group-of-Frame Feature", PROC. ON FIT (FORUM ON INFORMATION TECHNOLOGY), 2003, pages 85 - 86 |
EIJI KASUTANI, RYOMA OAMI, AKIO YAMADA, TAKAMI SATO, KYOJI HIRATA: "Video Material Archive System for Efficient Video Editing based on Media Identification", PROC. ON ICME (INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO), vol. 1, June 2004 (2004-06-01), pages 727 - 730, XP010770914, DOI: doi:10.1109/ICME.2004.1394295 |
KUNIO KASHINO, TAKAYUKI KUROZUMI, HIROSHI MURASE: "A Quick Search Method for Audio and Video Signals Based on Histogram Pruning", IEEE TRANSACTIONS ON MULTIMEDIA, vol. 5, no. 3, September 2003 (2003-09-01), XP001189754, DOI: doi:10.1109/TMM.2003.813281 |
See also references of EP2383990A4 |
YUSUKE UCHIDA, MASARU SUGANO, AKIO YONEYAMA: "A Study on Content Based Copy Detection Using Color Layout", PROC. ON IMPS (IMAGE MEDIA PROCESSING SYMPOSIUM) 2008, PROCEEDINGS, October 2008 (2008-10-01), pages 69 - 70 |
Also Published As
Publication number | Publication date |
---|---|
EP2383990A1 (en) | 2011-11-02 |
KR101352448B1 (ko) | 2014-01-17 |
EP2383990B1 (en) | 2017-09-20 |
KR20110105793A (ko) | 2011-09-27 |
JPWO2010087125A1 (ja) | 2012-08-02 |
US20110274359A1 (en) | 2011-11-10 |
US8175392B2 (en) | 2012-05-08 |
EP2383990A4 (en) | 2012-08-29 |
CN102301698A (zh) | 2011-12-28 |
JP4894956B2 (ja) | 2012-03-14 |
CN102301698B (zh) | 2014-08-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP4894956B2 (ja) | 時間区間代表特徴ベクトル生成装置 | |
US8467611B2 (en) | Video key-frame extraction using bi-level sparsity | |
Panagiotakis et al. | Equivalent key frames selection based on iso-content principles | |
WO2009129328A1 (en) | Universal lookup of video-related data | |
EP1067786A1 (en) | Data describing method and data processor | |
CN101789082B (zh) | 视频识别 | |
JP5366212B2 (ja) | 多数の参照用映像の中から検索キー用映像を用いて検索する映像検索装置、プログラム及び方法 | |
US7778469B2 (en) | Methods and systems for discriminative keyframe selection | |
JP2010186307A (ja) | 動画コンテンツ識別装置および動画コンテンツ識別方法 | |
JP5644505B2 (ja) | 照合加重情報抽出装置 | |
Tan et al. | Accelerating near-duplicate video matching by combining visual similarity and alignment distortion | |
Jun et al. | Duplicate video detection for large-scale multimedia | |
JP2011248671A (ja) | 多数の参照用映像の中から検索キー用映像を用いて検索する映像検索装置、プログラム及び方法 | |
Ghanem et al. | Context-aware learning for automatic sports highlight recognition | |
Panagiotakis et al. | Video synopsis based on a sequential distortion minimization method | |
Bailer et al. | A distance measure for repeated takes of one scene | |
Qiang et al. | Key frame extraction based on motion vector | |
JP2013070158A (ja) | 映像検索装置およびプログラム | |
Anh et al. | Video retrieval using histogram and sift combined with graph-based image segmentation | |
Bhaumik et al. | Real-time storyboard generation in videos using a probability distribution based threshold | |
Sandeep et al. | Application of Perceptual Video Hashing for Near-duplicate Video Retrieval | |
Gupta et al. | Evaluation of object based video retrieval using SIFT | |
Anju et al. | Video copy detection using F-sift and graph based video sequence matching | |
Lin et al. | Video retrieval for shot cluster and classification based on key feature set | |
Khin et al. | Key frame extraction techniques |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| WWE | Wipo information: entry into national phase | Ref document number: 201080005899.6; Country of ref document: CN |
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 10735597; Country of ref document: EP; Kind code of ref document: A1 |
| DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101) | |
| WWE | Wipo information: entry into national phase | Ref document number: 13143673; Country of ref document: US |
| ENP | Entry into the national phase | Ref document number: 20117015970; Country of ref document: KR; Kind code of ref document: A |
| WWE | Wipo information: entry into national phase | Ref document number: 2010548398; Country of ref document: JP |
| REEP | Request for entry into the european phase | Ref document number: 2010735597; Country of ref document: EP |
| WWE | Wipo information: entry into national phase | Ref document number: 2010735597; Country of ref document: EP |
| NENP | Non-entry into the national phase | Ref country code: DE |