JP4648183B2

JP4648183B2 - Continuous media data shortening reproduction method, composite media data shortening reproduction method and apparatus, program, and computer-readable recording medium

Info

Publication number: JP4648183B2
Application number: JP2005364929A
Authority: JP
Inventors: 宏志小西; 正志森本
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2005-12-19
Filing date: 2005-12-19
Publication date: 2011-03-09
Anticipated expiration: 2025-12-19
Also published as: JP2007171267A

Abstract

PROBLEM TO BE SOLVED: To enable highly compressed shortened reproduction while maintaining intelligibility, so that a user can survey and search contents in a short period of time.
SOLUTION: The invention includes the steps of: dividing input continuous data into frame sections; calculating the change amount of a feature parameter for each frame section; extracting frames in which the change amount of the feature parameter is larger than a predetermined value; clustering the extracted frames; calculating the change amount of each resulting cluster; restructuring the data by linking only clusters whose cluster-unit change amount is larger than a predetermined value; and outputting the linked restructured data.
COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to a continuous media data shortened playback method, a composite media data shortened playback method and apparatus, a program, and a computer-readable recording medium for playing back continuous media data in shortened form, and in particular to such a method, apparatus, program, and recording medium that enable a user to overview and search the contents of continuous media data in a short time.

Methods for playing back continuous media in less time fall broadly into three groups: "high-speed playback", which shortens playback time by playing the whole of the continuous media at high speed while losing as little of the content as possible; "partial playback", which shortens playback time by playing back only parts of the continuous media; and, as a form of partial playback, "summary playback", which semantically analyzes the content of the continuous media, extracts the semantically important sections, and so shortens playback time.

Video is composite media made up of images and sound, and images are classified into moving images and still images. Strictly speaking, sound encompasses speech, music, acoustic effects, and so on, but here the term "audio" is used to cover sound in general. In particular, because the methods described are for continuous media, video is treated as consisting of moving images and audio.

Audio is a typical example of continuous media. With the fast-forward of an ordinary tape recorder or the like, the frequency of the reproduced sound rises, and beyond a limit of about 2x speed it becomes difficult to grasp the content. To improve on this, techniques have been proposed that divide the signal into frames and thin them out at equal intervals or according to a regular rule, or that detect the pitch period and thin out the signal regularly in units of pitch-period waveforms, so that the sound is reproduced at the same frequency as the original while still allowing high-speed playback (see, for example, Patent Document 1 and Patent Document 2).

Also proposed are a method that uses speech-specific information to detect non-silent and silent sections, deletes the silent sections, and plays back only the non-silent sections (see, for example, Patent Document 1), and an apparatus that further classifies the detected non-silent sections into vowel sections, consonant sections, transition sections between vowel and consonant sections, and noise sections, and varies the degree of compression for each class to reduce degradation of the speech (see, for example, Patent Document 3).

However, even with these techniques, playback speed is limited to about 2x to 3x; at higher speeds it becomes difficult to grasp the content. In addition, techniques that rely on speech-specific information cannot be applied to other continuous media, so processing has to be separated for each medium, and the processing becomes complicated when, for example, the techniques are applied to high-speed playback of composite media (such as video, which combines audio and moving images).

Moving images are another example of continuous media. Techniques have been proposed that allow high-speed playback by raising the playback frequency, as in the fast-forward of an ordinary video recorder, or by thinning out or degenerating frames at equal intervals or according to a regular rule. As with audio, however, it becomes difficult to grasp the content above a certain playback speed.

Techniques have also been proposed that extract some physical quantity (such as the amount of luminance change or cut points) from moving images and shorten playback by taking out the portions that satisfy a given condition (see, for example, Patent Document 4 and Patent Document 5). However, because they rely on information specific to moving images, they cannot be applied to other media such as audio.

In addition, various shortened playback techniques (high-speed playback, partial playback, summary playback) that combine moving images and audio have been proposed as combinations or extensions of the techniques above (see, for example, Patent Document 5, Patent Document 6, and Patent Document 7). However, these share the drawbacks described above, or they focus on the moving images and play back only the partial audio that accompanies them, which makes it difficult to follow the overall flow.

In view of these drawbacks, a continuous media data high-speed playback method, a composite media data high-speed playback method, and related techniques have been proposed as high-speed playback technology for continuous media that raises playback speed within a general-purpose, easily integrated framework not specialized for moving images or audio (see, for example, Patent Document 8). Even with this method, however, high-speed playback is limited to about 3x to 5x speed for media that include audio.

Patent Document 1: JP-A-6-202691
Patent Document 2: JP-A-2000-259200
Patent Document 3: JP-A-9-152889
Patent Document 4: JP-A-4-237284
Patent Document 5: JP-A-6-233227
Patent Document 6: JP-A-8-116514
Patent Document 7: JP-A-2003-169298
Patent Document 8: JP-A-2005-204003

In the conventional continuous media data high-speed playback methods described above, once playback speed exceeds a certain level, the minimum constituent units of the continuous media (phonemes, syllables, words, phrases, and the like for audio; the motions of moving objects and the like for moving images) degenerate to the point where they no longer retain their original form, so intelligibility drops sharply and the content can no longer be understood. For audio, for example, beyond 3x to 5x speed too many phonemes are lost, and the gist of the meaning can no longer be picked up as words or sentences.

Furthermore, in high-speed playback methods for composite media data, an integrated change amount is calculated from the change amounts of frames synchronized across the media. As the number of media increases, the local compression rate of each individual medium therefore becomes higher than the compression rate of the composite media as a whole, and the state described above, in which the minimum constituent units degenerate beyond recognition, is reached sooner. For example, with composite media of audio and moving images, even when the overall speed is set to 5x, the local compression rate of the audio exceeds its limit of about 5x and the audio content becomes unintelligible.

The present invention has been made in view of the state of the prior art described above. Its object is to provide a continuous media data shortened playback method, a composite media data shortened playback method and apparatus, a program, and a computer-readable recording medium that enable more highly compressed shortened playback while retaining intelligibility, so that a user can overview and search content in a short time, and that are not specialized for audio or moving images but can also be applied to other continuous media and composite media.

FIG. 1 is a diagram illustrating the principle configuration of the present invention.

The present invention (Claim 1) is a continuous media data shortened playback method for shortening and playing back continuous media data, in which:
media input / feature parameter change amount calculation means performs a media input / feature parameter change amount calculation step (step 1) of dividing the input continuous media data into frame sections and calculating the change amount of a feature parameter for each frame section;
frame extraction means performs a frame extraction step (step 2) of extracting a number of frames equal to the total number of frames of the continuous media data divided by a playback speed that does not cause an extreme drop in intelligibility, restricting the extraction to frames with a large feature parameter change amount;
clustering means performs a clustering step (step 3) of clustering the extracted frames by grouping extracted frames separated by a small time interval into the same cluster;
cluster change amount calculation means performs a cluster change amount calculation step (step 4) of calculating the change amount of each resulting cluster;
media reconstruction means performs a media reconstruction step (step 5) of, for a total compression rate whose reciprocal is equal to or greater than the playback speed, selecting clusters with a large cluster-unit change amount so that the total number of frames contained in the selected clusters is approximately equal to the total number of frames of the continuous media data multiplied by the total compression rate, and concatenating the frames contained in the selected clusters in their original order to reconstruct the data; and
output means performs an output step (step 6) of outputting the concatenated reconstructed data.
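Steps 2 to 5 of Claim 1 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the `max_gap` clustering threshold, the use of the mean frame change as the cluster-unit change amount, and the greedy selection rule are all assumptions, since the text up to this point does not fix them. The caller is assumed to pass a `total_rate` whose reciprocal is at least `speed`.

```python
from typing import List, Sequence


def extract_frames(change: Sequence[float], speed: float) -> List[int]:
    """Step 2: keep about len(change)/speed frames with the largest change, in time order."""
    keep = max(1, round(len(change) / speed))
    ranked = sorted(range(len(change)), key=lambda i: change[i], reverse=True)
    return sorted(ranked[:keep])


def cluster_frames(frames: Sequence[int], max_gap: int = 1) -> List[List[int]]:
    """Step 3: put extracted frames whose spacing is at most max_gap in one cluster."""
    clusters: List[List[int]] = []
    for f in frames:
        if clusters and f - clusters[-1][-1] <= max_gap:
            clusters[-1].append(f)
        else:
            clusters.append([f])
    return clusters


def select_clusters(clusters: Sequence[List[int]], change: Sequence[float],
                    total_frames: int, total_rate: float) -> List[int]:
    """Steps 4-5: pick high-change clusters up to about total_frames * total_rate frames."""
    budget = total_frames * total_rate

    def score(cluster: List[int]) -> float:
        # Assumed cluster-unit change amount: mean change of the member frames.
        return sum(change[i] for i in cluster) / len(cluster)

    chosen: List[List[int]] = []
    used = 0
    for cluster in sorted(clusters, key=score, reverse=True):
        if used + len(cluster) <= budget:  # greedy fill, skipping clusters that do not fit
            chosen.append(cluster)
            used += len(cluster)
    chosen.sort(key=lambda c: c[0])  # restore the original time order
    return [i for c in chosen for i in c]
```

The returned list holds the indices of the frames to concatenate; the output step would then emit those frames in that order.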

The present invention (Claim 2) is a composite media data shortened playback method for shortening and playing back composite media data composed of a plurality of continuous media data, in which:
media input / feature parameter change amount calculation means performs a media input / feature parameter change amount calculation step of dividing each input continuous media data into frame sections and calculating the change amount of a feature parameter for each frame section;
change amount parameter extraction means performs a change amount parameter extraction step of calculating an integrated feature parameter change amount from the feature parameter change amounts of the respective continuous media data;
frame extraction means performs a frame extraction step of extracting a number of frames equal to the total number of frames of the continuous media data divided by a playback speed that does not cause an extreme drop in intelligibility, restricting the extraction to frames with a large integrated feature parameter change amount;
clustering means performs a clustering step of clustering the extracted frames by grouping extracted frames separated by a small time interval into the same cluster;
cluster change amount calculation means performs a cluster change amount calculation step of calculating the change amount of each resulting cluster;
media reconstruction means performs a media reconstruction step of, for a total compression rate whose reciprocal is equal to or greater than the playback speed, selecting clusters with a large cluster-unit change amount so that the total number of frames contained in the selected clusters is approximately equal to the total number of frames of the continuous media data multiplied by the total compression rate, and concatenating the frames contained in the selected clusters in their original order to reconstruct the data; and
output means performs an output step of outputting the reconstructed data.
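The change amount parameter extraction step of Claim 2 merges the per-medium change amounts into a single track. The claim does not say how they are merged, so the sketch below simply assumes a weighted sum of peak-normalised tracks; `integrate_changes`, the `weights` argument, and the normalisation are hypothetical choices made only for illustration.

```python
from typing import Dict, List, Optional, Sequence


def integrate_changes(per_media: Dict[str, Sequence[float]],
                      weights: Optional[Dict[str, float]] = None) -> List[float]:
    """Merge per-medium change tracks (same frame grid) into one integrated track."""
    names = list(per_media)
    n = len(per_media[names[0]])
    if any(len(per_media[m]) != n for m in names):
        raise ValueError("all media must be framed on the same time grid")
    weights = weights or {m: 1.0 for m in names}
    merged = [0.0] * n
    for m in names:
        # Assumed normalisation: divide by the peak so no medium dominates by scale.
        peak = max(per_media[m]) or 1.0
        for i, value in enumerate(per_media[m]):
            merged[i] += weights[m] * value / peak
    return merged
```

The integrated track can then be passed to the same frame extraction and clustering steps that a single medium would use.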

The present invention (Claim 3) is a composite media data shortened playback method for shortening and playing back composite media data composed of a plurality of continuous media data, in which:
media input / feature parameter change amount calculation means performs a media input / feature parameter change amount calculation step of dividing each input continuous media data into frame sections and calculating the change amount of a feature parameter for each frame section;
frame extraction means performs a frame extraction step of extracting a number of frames equal to the total number of frames of the continuous media data divided by a playback speed that does not cause an extreme drop in intelligibility, restricting the extraction to frames with a large feature parameter change amount;
clustering means performs a clustering step of clustering the extracted frames by grouping extracted frames separated by a small time interval into the same cluster;
time compression means performs a time compression step of time-compressing each continuous media data in units of the resulting clusters;
cluster change amount calculation means performs a cluster change amount calculation step of calculating the change amount of each time-compressed cluster;
media reconstruction means performs a media reconstruction step of, for a total compression rate whose reciprocal is equal to or greater than the playback speed, selecting clusters with a large cluster-unit change amount so that the total number of frames contained in the selected clusters is approximately equal to the total number of frames of the continuous media data multiplied by the total compression rate, and concatenating the frames contained in the selected clusters in their original order to reconstruct the data; and
output means performs an output step of outputting the reconstructed data.
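One possible reading of the time compression step of Claim 3 is that, inside each cluster, every medium is shortened independently, so the media stay aligned at cluster boundaries rather than frame by frame. The sketch below follows that reading only as an assumption; the function name `compress_cluster`, the half-open span convention, and the rule of keeping each medium's highest-change frames are not taken from the text.

```python
from typing import Dict, List, Sequence, Tuple


def compress_cluster(span: Tuple[int, int], keep: int,
                     per_media: Dict[str, Sequence[float]]) -> Dict[str, List[int]]:
    """For one cluster span [start, end), let each medium keep its own `keep` frames."""
    start, end = span
    kept: Dict[str, List[int]] = {}
    for name, change in per_media.items():
        ranked = sorted(range(start, end), key=lambda i: change[i], reverse=True)
        kept[name] = sorted(ranked[:keep])  # preserve time order inside the cluster
    return kept
```

With this arrangement the audio and the moving images may keep different frames within the same cluster, while each cluster still starts and ends together across media.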

FIG. 2 is a diagram illustrating the principle configuration of the present invention.

The present invention (Claim 4) is a continuous media data shortened playback apparatus for shortening and playing back continuous media data, comprising:
media input / feature parameter change amount calculation means 110 for dividing the input continuous media data into frame sections and calculating the change amount of a feature parameter for each frame section;
frame extraction means 120 for extracting a number of frames equal to the total number of frames of the continuous media data divided by a playback speed that does not cause an extreme drop in intelligibility, restricting the extraction to frames with a large feature parameter change amount;
clustering means 130 for clustering the extracted frames by grouping extracted frames separated by a small time interval into the same cluster;
cluster change amount calculation means 140 for calculating the change amount of each resulting cluster;
media reconstruction means 150 for, for a total compression rate whose reciprocal is equal to or greater than the playback speed, selecting clusters with a large cluster-unit change amount so that the total number of frames contained in the selected clusters is approximately equal to the total number of frames of the continuous media data multiplied by the total compression rate, and concatenating the frames contained in the selected clusters in their original order to reconstruct the data; and
output means 160 for outputting the concatenated reconstructed data.

The present invention (Claim 5) is a composite media data shortened playback apparatus for shortening and playing back composite media data composed of a plurality of continuous media data, comprising:
media input / feature parameter change amount calculation means for dividing each input continuous media data into frame sections and calculating the change amount of a feature parameter for each frame section;
change amount parameter extraction means for calculating an integrated feature parameter change amount from the feature parameter change amounts of the respective continuous media data;
frame extraction means for extracting a number of frames equal to the total number of frames of the continuous media data divided by a playback speed that does not cause an extreme drop in intelligibility, restricting the extraction to frames with a large integrated feature parameter change amount;
clustering means for clustering the extracted frames by grouping extracted frames separated by a small time interval into the same cluster;
cluster change amount calculation means for calculating the change amount of each resulting cluster;
media reconstruction means for, for a total compression rate whose reciprocal is equal to or greater than the playback speed, selecting clusters with a large cluster-unit change amount so that the total number of frames contained in the selected clusters is approximately equal to the total number of frames of the continuous media data multiplied by the total compression rate, and concatenating the frames contained in the selected clusters in their original order to reconstruct the data; and
output means for outputting the reconstructed data.

The present invention (Claim 6) is a composite media data shortened playback apparatus for shortening and playing back composite media data composed of a plurality of continuous media data, comprising:
media input / feature parameter change amount calculation means for dividing each input continuous media data into frame sections and calculating the change amount of a feature parameter for each frame section;
frame extraction means for extracting a number of frames equal to the total number of frames of the continuous media data divided by a playback speed that does not cause an extreme drop in intelligibility, restricting the extraction to frames with a large feature parameter change amount;
clustering means for clustering the extracted frames of each continuous media data by grouping extracted frames separated by a small time interval into the same cluster;
time compression means for time-compressing each continuous media data in units of the resulting clusters;
cluster change amount calculation means for calculating the change amount of each time-compressed cluster;
media reconstruction means for, for a total compression rate whose reciprocal is equal to or greater than the playback speed, selecting clusters with a large cluster-unit change amount so that the total number of frames contained in the selected clusters is approximately equal to the total number of frames of the continuous media data multiplied by the total compression rate, and concatenating the frames contained in the selected clusters in their original order to reconstruct the data; and
output means for outputting the reconstructed data.

The present invention (Claim 7) is a program that causes a computer to function as the apparatus according to any one of Claims 4 to 6.

The present invention (Claim 8) is a computer-readable recording medium storing a program that causes a computer to function as the apparatus according to any one of Claims 4 to 6.

As described above, the present invention has a two-stage compression process: a first stage that extracts frames with a large feature parameter change amount, and a second stage that reconstructs the data by concatenating only clusters with a large cluster-unit change amount. The first stage performs compression equivalent to high-speed playback at a quality that preserves the minimum constituent units of the continuous media; the second stage performs compression equivalent to partial playback by thinning out in units larger than those minimum constituent units. The method therefore has the advantage of combining the effects of high-speed playback and partial playback. That is, when the degree of compression is low, the result is shortened playback dominated by the high-speed playback effect; as the degree of compression rises, it shifts seamlessly to shortened playback to which the partial playback effect is added, so playback can be shortened without a marked drop in intelligibility even under heavy compression.
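The relationship between the two stages can be written out with a small worked example. The symbols are introduced here only for illustration and do not appear in the source: $N$ is the total number of frames, $r$ the playback speed used in the first stage, $c$ the total compression rate, $N_1$ the number of frames left after the first stage, and $N_2$ the number of frames in the final output.

```latex
\[
N_1 = \frac{N}{r}, \qquad N_2 \approx c\,N, \qquad \frac{1}{c} \ge r
\;\Longrightarrow\; \frac{N_2}{N_1} \approx c\,r \le 1 .
\]
\[
N = 6000,\; r = 3,\; c = \tfrac{1}{10}:
\qquad N_1 = 2000, \qquad N_2 \approx 600, \qquad \frac{N_2}{N_1} \approx 0.3 .
\]
```

In this example the first stage alone gives 3x playback, and the cluster selection of the second stage keeps roughly 30% of the extracted frames to reach an overall 10x shortening. When $1/c = r$, the second stage keeps every cluster and the result reduces to the first stage alone.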

In particular, when applied to audio as the continuous media, shortened playback that retains intelligibility is possible even when the data is shortened to one fifth or less of its length, which corresponds to 5x speed or more.

Moreover, the quasi-synchronous composite media data shortened playback method of the present invention relaxes the synchronization constraint from frame-level synchronization to cluster-level synchronization. This has the advantage that, even as the number of media increases, the rise in the compression rate of each individual medium is moderated and a higher overall compression rate can be achieved.

In addition, the feature parameter change amount used as an index in the present invention, and the clustering of extracted frames, are general-purpose: they can be applied to any continuous media, and different media can be handled in a uniform way. A further advantage is that an arbitrary shortened playback time or an arbitrary total compression rate can be specified.

Because of these advantages, the method can be said to be more effective than conventional techniques for uses such as overviewing and searching media content, where brevity matters more than completeness or semantic consistency of the content.

Embodiments of the present invention are described below with reference to the drawings.

[First Embodiment]
FIG. 3 is a configuration diagram of the continuous media data shortened playback apparatus according to the first embodiment of the present invention.

As shown in the figure, the continuous media data shortened playback apparatus comprises a continuous media input unit 101, a change amount parameter extraction unit 102, a playback speed instruction unit 103, a clustering unit 104, a cluster change amount calculation unit 105, a total compression rate instruction unit 106, a playback cluster selection unit 107, a continuous media reconstruction unit 108, and a continuous media output unit 109.

The part made up of the continuous media input unit 101 and the change amount parameter extraction unit 102 corresponds to the means that divides the continuous media data into frame sections and calculates the change amount of the feature parameter for each frame section.

The part made up of the playback speed instruction unit 103 and the clustering unit 104 corresponds to the frame extraction means, which extracts frames whose feature parameter change amount is larger than a predetermined value, and to the clustering means, which clusters the extracted frames.

The cluster change amount calculation unit 105 corresponds to the cluster change amount calculation means, which calculates the change amount of each resulting cluster.

The part made up of the total compression rate instruction unit 106, the playback cluster selection unit 107, and the continuous media reconstruction unit 108 corresponds to the media reconstruction means, which reconstructs the data by concatenating only clusters whose cluster-unit change amount is larger than a predetermined value.

The continuous media output unit 109 corresponds to the output means, which outputs the concatenated reconstructed data.

Each component is described below.

FIG. 4 is a flowchart of the operation in the first embodiment of the present invention.

The continuous media input unit 101 reads the input continuous media data into a buffer (not shown) and sends it to the change amount parameter extraction unit 102 (step 101). For example, when the input data is analog, the unit may read it while converting it to digital data; it may read continuous media data in file form directly as digital data; or it may read continuous media data held in memory.

連続メディア入力部１０１が読み込むデータ量は、全体を一括で読み込んでもよいし、一定単位の量を周期的に読み込んでもよいし、動的に任意の量を読み込んでもよい。 The data amount to be read by the continuous media input unit 101 may be read all at once, may be read periodically in a fixed unit amount, or may be read in an arbitrary amount dynamically.

If necessary, a step that converts the input digital continuous media data into the format expected by the change amount parameter extraction unit 102 may be added, either before the data is read into the buffer (not shown) or when it is output from the buffer to the change amount parameter extraction unit 102.

The amount of data output from the buffer (not shown) to the change amount parameter extraction unit 102 may be the entire continuous media data passed in a single batch, a fixed-size unit passed periodically, or an arbitrary amount passed successively as determined at run time. The amount of data output may be the same as the amount read or different from it; if it differs, the continuous media input unit 101 performs buffering.

When, for example, the input continuous media data is already in the format required by the change amount parameter extraction unit 102 and all of the data is passed to that unit in a single batch, the function of the continuous media input unit 101 may be incorporated into the change amount parameter extraction unit 102.

The change amount parameter extraction unit 102 divides the continuous media data received from the continuous media input unit 101 into short sections of fixed period (frame sections) and calculates the change amount of the feature parameter that represents each frame section (step 102). For example, it may first calculate the representative feature parameter of each frame section from the continuous media data and then calculate the change amount from the time series of that feature parameter, or it may calculate the change amount of the representative feature parameter directly from the continuous media data.

フレーム区間の代表となる特徴パラメータや特徴パラメータの変化量の計算には、フレーム区間内のデータのみから計算してもよいし、フレーム区間外のデータを含めて計算してもよい。 The calculation of the feature parameter representing the frame section and the amount of change of the feature parameter may be calculated from only the data in the frame section, or may be calculated including data outside the frame section.

The feature parameter may be a scalar or a vector of two or more dimensions. The change amount of the feature parameter may be the distance between two values, namely the feature parameter of the frame being calculated and that of the preceding or following frame. Here a distance is anything that can be defined by a distance function satisfying the mathematical axioms of distance; for example, the Manhattan distance, Euclidean distance, power distance, Chebyshev distance, or Mahalanobis distance can be used. Alternatively, the change amount may be one half of the distance between the feature parameter of the preceding frame and that of the following frame, or a value calculated, using a distance or the like, from n feature parameter values covering several frames before and after the frame being calculated.
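The change-amount options described above can be illustrated with a minimal Python sketch. The function names, the `mode` argument, and the handling of the first and last frames (a missing neighbour is replaced by the frame itself) are assumptions made for illustration; the description does not specify them.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def manhattan(a: Vector, b: Vector) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def chebyshev(a: Vector, b: Vector) -> float:
    return max(abs(x - y) for x, y in zip(a, b))


def change_amounts(
    features: Sequence[Vector],
    distance: Callable[[Vector, Vector], float] = euclidean,
    mode: str = "previous",
) -> List[float]:
    """Return one change amount per frame.

    mode="previous": distance to the preceding frame.
    mode="central":  half the distance between the preceding and following frame.
    Edge frames reuse the frame itself for a missing neighbour (assumption).
    """
    n = len(features)
    result = []
    for i in range(n):
        prev_i = max(i - 1, 0)
        next_i = min(i + 1, n - 1)
        if mode == "previous":
            result.append(distance(features[prev_i], features[i]))
        elif mode == "central":
            result.append(0.5 * distance(features[prev_i], features[next_i]))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return result
```

For example, with the two-dimensional features `[[0, 0], [3, 4], [3, 4], [6, 8]]`, the Euclidean distance to the preceding frame is 5 wherever the feature vector moves and 0 where it stays the same.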

The playback speed instruction unit 103 holds a local playback speed that is set so that the smallest structural units of the continuous media are not compressed to the point of losing their original form, and it passes to the clustering unit 104 the playback speed that serves as the basis for extracting frames with a large feature-parameter change amount (step 103). At this time it obtains the total compression rate from the total compression rate instruction unit 106, and if the reciprocal of the total compression rate is smaller than the set playback speed, it passes the reciprocal of the total compression rate to the clustering unit 104 instead.
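The rule for choosing the playback speed passed to the clustering unit amounts to taking the smaller of the set local speed and the reciprocal of the total compression rate. A minimal sketch, with a hypothetical function name and argument validation added as an assumption:

```python
def effective_playback_speed(local_speed: float, total_compression_rate: float) -> float:
    """Playback speed handed to the clustering step.

    If the reciprocal of the total compression rate is smaller than the
    configured local speed, the reciprocal is used instead.
    """
    if local_speed <= 0 or total_compression_rate <= 0:
        raise ValueError("speed and compression rate must be positive")
    return min(local_speed, 1.0 / total_compression_rate)
```

With a local speed of 4 and a total compression rate of 0.5, the reciprocal (2) is smaller, so 2 is used. This keeps the frame extraction from discarding more than the overall target requires.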

Based on the playback speed from the playback speed instruction unit 103, the clustering unit 104 calculates the number of frames to extract as

number of extracted frames = (total number of frames of the data input to the change amount parameter extraction unit) / (playback speed)   (1)

It then extracts that number of frames in descending order of the feature-parameter change amount calculated by the change amount parameter extraction unit 102, and clusters the extracted frames (step 104). The clusters can be determined, for example, by setting a threshold and grouping into the same cluster those extracted frames whose time interval to the adjacent extracted frame is smaller than the threshold. The method is not limited to this; any method that can group the extracted frames may be used. For example, besides setting the threshold as a constant, the threshold can be determined dynamically from the distribution of the time intervals between adjacent extracted frames.
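The extraction by equation (1) and the threshold-based grouping can be sketched as follows. This is only an illustration: rounding the frame count down, breaking ties by frame index, measuring time in frame indices, and representing a cluster as a list of extracted frame indices are all assumptions, and the function names are hypothetical.

```python
from typing import List, Sequence


def extract_frames(changes: Sequence[float], playback_speed: float) -> List[int]:
    """Indices of the frames with the largest change amounts, in time order."""
    count = int(len(changes) / playback_speed)  # equation (1), rounded down (assumption)
    ranked = sorted(range(len(changes)), key=lambda i: (-changes[i], i))
    return sorted(ranked[:count])


def cluster_by_gap(frame_indices: Sequence[int], threshold: int) -> List[List[int]]:
    """Group extracted frames whose interval to the previous extracted frame
    is smaller than the threshold into the same cluster."""
    clusters: List[List[int]] = []
    for idx in frame_indices:
        if clusters and idx - clusters[-1][-1] < threshold:
            clusters[-1].append(idx)
        else:
            clusters.append([idx])
    return clusters
```

For ten frames and a playback speed of 2.5, four frames are extracted; with a threshold of 3 frames, extracted indices `[1, 2, 6, 7]` fall into two clusters.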

The cluster change amount calculation unit 105 calculates the change amount of each cluster produced by the clustering unit 104 (step 105). The cluster-level change amount can be calculated from the feature-parameter change amounts of the frames in the cluster (for example, their maximum, mean, median, or minimum can be used). The method is not limited to this; the cluster-level change amount may instead be calculated from another feature parameter that represents the cluster section.
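A short sketch of the cluster-level change amount, assuming a cluster is a list of frame indices and that the statistic is chosen by name (both assumptions; the names below are hypothetical):

```python
import statistics
from typing import Sequence

# Statistics named in the description as possible cluster-level summaries.
REDUCERS = {
    "max": max,
    "mean": statistics.mean,
    "median": statistics.median,
    "min": min,
}


def cluster_change(cluster: Sequence[int], changes: Sequence[float], reducer: str = "max") -> float:
    """Summarise the per-frame change amounts of one cluster into a single value."""
    return REDUCERS[reducer]([changes[i] for i in cluster])
```

Using the maximum favours clusters that contain a single sharp change, while the mean or median favours clusters that change strongly throughout; which is preferable depends on the media.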

The total compression rate instruction unit 106 passes the total compression rate, that is, the compression rate applied to the continuous media as a whole for shortened playback, to the playback cluster selection unit 107 (step 106). The unit may also be configured to calculate the total compression rate from a shortened playback time. In that case the total compression rate can be calculated as

total compression rate = (shortened playback time) / (playback time of the entire continuous media data)   (2)

Based on the total compression rate from the total compression rate instruction unit 106, the playback cluster selection unit 107 selects clusters in descending order of the cluster-level change amount calculated by the cluster change amount calculation unit 105, until the total number of frames contained in the selected clusters reaches the number of frames given by

(total number of frames of the continuous media data) * (total compression rate)   (3)

(step 107). The selected clusters may run up to and including the cluster at which the frame total exceeds the value of equation (3), or only up to the cluster just before that value is exceeded. Alternatively, the clusters up to just before the value is exceeded may be supplemented with additional frame data amounting to

(total number of frames of the continuous media data) * (total compression rate) - (total number of frames contained in the clusters up to just before the value is exceeded)   (4)

frames, so that the frame total matches equation (3). The supplementary frame data may be blank data (for example, silence, white noise, or pink noise for audio, or a single-color frame image for video), a repetition of the frame data immediately preceding the supplement, or the original frame data that follows the frame data immediately preceding the supplement, taken for the number of frames given by equation (4). The method is not limited to these; any method that can determine the selected frames on the basis of the total compression rate may be used.
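The selection rule of equations (2) to (4) can be sketched in Python as below. The function names, the `policy` argument, rounding the target down to an integer, and representing clusters as lists of frame indices are assumptions for illustration. The sketch only reports how many supplementary frames equation (4) calls for; it does not generate them.

```python
from typing import List, Sequence, Tuple


def total_compression_rate(shortened_time: float, full_time: float) -> float:
    """Equation (2): shortened playback time over full playback time."""
    return shortened_time / full_time


def select_clusters(
    clusters: Sequence[Sequence[int]],
    cluster_changes: Sequence[float],
    total_frames: int,
    rate: float,
    policy: str = "undershoot",
) -> Tuple[List[int], int]:
    """Pick clusters in descending order of change amount.

    policy="overshoot":  include the cluster that pushes the total past the target.
    policy="undershoot": stop just before the target would be exceeded.
    Returns the selected cluster numbers in time order and the number of
    frames still missing from the target (equation (4)).
    """
    if policy not in ("overshoot", "undershoot"):
        raise ValueError(f"unknown policy: {policy}")
    target = int(total_frames * rate)  # equation (3), rounded down (assumption)
    order = sorted(range(len(clusters)), key=lambda c: (-cluster_changes[c], c))
    selected: List[int] = []
    used = 0
    for c in order:
        if used >= target:
            break
        size = len(clusters[c])
        if used + size > target and policy == "undershoot":
            break
        selected.append(c)
        used += size
    missing = max(target - used, 0)
    return sorted(selected), missing
```

With the undershoot policy, the second return value is the number of frames that would have to be supplied as blank data, repeated frames, or the following original frames to meet the target exactly.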

The continuous media reconstruction unit 108 concatenates the frame section data of the continuous media data corresponding to the frames of the clusters selected by the playback cluster selection unit 107, preserving their order, and thereby reconstructs the data (step 108). For example, when concatenating, a smoothing process may be applied to the data before and after each join to reduce the discontinuity between them. The corresponding continuous media data may be received from the continuous media input unit 101 by way of the change amount parameter extraction unit 102, the clustering unit 104, the cluster change amount calculation unit 105, and the playback cluster selection unit 107, or it may be received directly from the continuous media input unit 101.
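As an illustration of order-preserving concatenation with smoothing at the joins, the sketch below uses a linear crossfade over a few samples. The crossfade is only one possible smoothing process, chosen here as an assumption; the function names and the representation of a frame as a list of numeric samples are likewise hypothetical.

```python
from typing import List, Sequence


def segments_for(frames: Sequence[Sequence[float]], clusters: Sequence[Sequence[int]]) -> List[List[float]]:
    """Collect the samples of each selected cluster, with clusters in time order."""
    ordered = sorted(clusters, key=lambda c: c[0])
    return [[s for i in c for s in frames[i]] for c in ordered]


def concatenate(segments: Sequence[Sequence[float]], fade: int = 0) -> List[float]:
    """Join segments in order; overlap `fade` samples at each join with a linear crossfade."""
    out: List[float] = []
    for seg in segments:
        seg = list(seg)
        n = min(fade, len(out), len(seg))
        start = len(out) - n
        for k in range(n):
            w = (k + 1) / (n + 1)  # weight of the incoming segment
            out[start + k] = (1 - w) * out[start + k] + w * seg[k]
        out.extend(seg[n:])
    return out
```

Note that overlapping the segments shortens the output by `fade` samples per join; a smoothing filter applied in place around each join would keep the length unchanged.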

The continuous media output unit 109 outputs the continuous media data reconstructed by the continuous media reconstruction unit 108 (step 109). For example, the data may be output to an external output device as it becomes available, written to a recording medium as a file for later playback, or written to a storage medium such as memory so that another apparatus or application can use it successively.

[Second Embodiment]
In this embodiment, a synchronous composite media data shortening playback apparatus is described.

図５は、本発明の第２の実施の形態における同期型の複合メディア短縮再生装置の構成図である。 FIG. 5 is a block diagram of a synchronous composite media abbreviated playback apparatus according to the second embodiment of the present invention.

As shown in the figure, the synchronous composite media shortening playback apparatus consists of n continuous media input units 201 (continuous media input units 201_1 to 201_n), n change amount parameter extraction units 202 (parameter extraction units 202_1 to 202_n), a parameter synthesis unit 210, a playback speed instruction unit 203, a clustering unit 204, a cluster change amount calculation unit 205, a total compression rate instruction unit 206, a playback cluster selection unit 207, n continuous media reconstruction units 208 (continuous media reconstruction units 208_1 to 208_n), and n continuous media output units 209 (continuous media output units 209_1 to 209_n).

なお、ｎ個の連続メディアは、ｎ種類の連続メディアでもよいし、同一種類のｎチャンネルの連続メディアでもよいし、ｊ種類の連続メディアがｋチャンネル（ｊ＊ｋ＝ｎ）あってもよいし、これに限定することなく、合計がｎ個の連続メディアであればよい。 The n continuous media may be n types of continuous media, the same type of n channels of continuous media, or j types of continuous media of k channels (j * k = n). Without being limited thereto, the total may be n continuous media.

The portion consisting of the n continuous media input units 201 (201_1 to 201_n) and the n change amount parameter extraction units 202 (202_1 to 202_n) corresponds to the media input / feature parameter change amount calculation means, which divides each continuous medium into frame sections and calculates the feature-parameter change amount of each continuous media data in each frame section.

パラメータ合成部２１０は、複数の連続メディアデータの特徴パラメータの変化量から統合した連続メディアの特徴パラメータの変化量を計算する変化量パラメータ抽出手段に相当する。 The parameter synthesizing unit 210 corresponds to a change parameter extraction unit that calculates the change amount of the feature parameter of the continuous media integrated from the change amount of the feature parameter of the plurality of continuous media data.

再生速度指示部２０３とクラスタリング部２０４で構成される部分は、統合した特徴パラメータの変化量の大きなフレームを抽出するフレーム抽出手段と抽出されたフレームをクラスタリングするクラスタリング手段に相当する。 The portion constituted by the reproduction speed instruction unit 203 and the clustering unit 204 corresponds to a frame extraction unit that extracts a frame with a large change amount of the integrated feature parameter and a clustering unit that clusters the extracted frames.

クラスタ変化量算出部２０５は、クラスタリングしたクラスタ単位の変化量を計算するクラスタ変化量算出手段に相当する。 The cluster change amount calculation unit 205 corresponds to cluster change amount calculation means for calculating the change amount of clustered clusters.

The portion consisting of the total compression rate instruction unit 206, the playback cluster selection unit 207, and the n continuous media reconstruction units 208 (208_1 to 208_n) corresponds to the media reconstruction means, which concatenates only the clusters with a large cluster-level change amount and generates reconstructed data for each continuous media data.

The n continuous media output units 209 (209_1 to 209_n) correspond to the output means, which outputs each set of concatenated reconstructed data.

上記の構成における動作を説明する。 The operation in the above configuration will be described.

図６は、本発明の第２の実施の形態における動作のフローチャートである。 FIG. 6 is a flowchart of the operation in the second embodiment of the present invention.

The continuous media input units 201_1 to 201_n read the corresponding input continuous media data 1 to continuous media data n into their respective buffers (not shown) and send them, in synchronization, to the corresponding parameter extraction units 202_1 to 202_n (step 201).

In addition to the processing of the continuous media input unit 101 of the continuous media data shortening playback apparatus in the first embodiment, each continuous media input unit 201 aligns the amount of data it sends to its parameter extraction unit 202 with that of the other continuous media input units 201 so that it covers the same duration (not necessarily the same size, but an amount that takes the same time when played back), and sends it in synchronization at the same timing (the operating times need not match exactly, but the processing position and order of the data are kept the same).

各連続メディア入力部２０１間で各連続メディアデータを読み込むデータ量、タイミングについては同期してもよいが、同期しなくてもよい。 The amount and timing of reading each continuous media data between each continuous media input unit 201 may be synchronized, but may not be synchronized.

The parameter extraction units 202_1 to 202_n each divide the continuous media data received from the corresponding continuous media input unit 201 into short sections of fixed period (frame sections) and calculate the change amount of the feature parameter that represents each frame section (step 202).

各パラメータ抽出部２０２のフレーム区間長（フレーム周期）が共通である他は、各パラメータ抽出部２０２は、第１の実施の形態における変化量パラメータ抽出部１０２と同様の処理を行う。 Each parameter extraction unit 202 performs the same processing as the change amount parameter extraction unit 102 in the first embodiment, except that each parameter extraction unit 202 has a common frame section length (frame period).

The parameter synthesis unit 210 obtains the feature-parameter change amounts from the n parameter extraction units 202 (202_1 to 202_n). Each of these change amounts is a time series with the same number of frames; the unit converts them into a single change-amount time series by weighted averaging to obtain the integrated feature-parameter change amount (step 203). The weighted averaging here covers normalization for cases where the change amounts have different units as well as weighting of the individual continuous media, and the averaging method may be, for example, an arithmetic mean or a geometric mean. As for the normalization method, each feature-parameter change amount may, for example, be divided by its maximum value to normalize it to a value from 0 to 1, be divided by its mean value, or be divided by its variance.
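One way to read the parameter synthesis step is sketched below: each medium's change-amount series is normalized and the series are then combined frame by frame with a weighted arithmetic mean. The function names, the choice of the arithmetic mean, equal default weights, and mapping an all-zero series to zeros are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def normalize(series: Sequence[float], method: str = "max") -> List[float]:
    """Scale one medium's change amounts by their maximum or by their mean."""
    if method == "max":
        scale = max(series)
    elif method == "mean":
        scale = sum(series) / len(series)
    else:
        raise ValueError(f"unknown method: {method}")
    # An all-zero series has nothing to scale; keep it at zero (assumption).
    return [v / scale if scale else 0.0 for v in series]


def combine(
    series_per_medium: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
    method: str = "max",
) -> List[float]:
    """Weighted arithmetic mean of the normalized series, frame by frame."""
    lengths = {len(s) for s in series_per_medium}
    if len(lengths) != 1:
        raise ValueError("all media must have the same number of frames")
    n_frames = lengths.pop()
    if weights is None:
        weights = [1.0] * len(series_per_medium)
    total_weight = sum(weights)
    normed = [normalize(s, method) for s in series_per_medium]
    return [
        sum(w * s[t] for w, s in zip(weights, normed)) / total_weight
        for t in range(n_frames)
    ]
```

For instance, combining an audio series `[0, 2, 4]` with a video series `[10, 0, 10]` under max-normalization and equal weights gives `[0.5, 0.25, 1.0]`, so the last frame, where both media change strongly, ranks highest.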

再生速度指示部２０３は、第１の実施の形態における連続メディアデータ短縮再生装置の再生速度指示部１０３と同様の処理を行う（ステップ２０４）。 The playback speed instruction unit 203 performs the same processing as the playback speed instruction unit 103 of the continuous media data shortening playback device in the first embodiment (step 204).

The clustering unit 204 performs the same processing as the clustering unit 104 of the continuous media data shortening playback apparatus in the first embodiment, except that it uses the integrated feature-parameter change amount calculated by the parameter synthesis unit 210 in place of the change amount calculated by the change amount parameter extraction unit 102 (step 205).

クラスタ変化量算出部２０５は、第１の実施の形態における連続メディアデータ短縮再生装置のクラスタ変化量算出部１０５と同様の処理を行う（ステップ２０６）。 The cluster change amount calculation unit 205 performs the same processing as the cluster change amount calculation unit 105 of the continuous media data shortening / playback apparatus in the first embodiment (step 206).

総圧縮率指示部２０６は、第１の実施の形態における連続メディアデータ短縮再生装置の総圧縮率指示部１０６と同様の処理を行う（ステップ２０７）。 The total compression rate instruction unit 206 performs the same processing as the total compression rate instruction unit 106 of the continuous media data shortening / reproducing apparatus in the first embodiment (step 207).

再生クラスタ選択部２０７は、第１の実施の形態における連続メディアデータ短縮再生装置の再生クラスタ選択部１０７と同様の処理を行う（ステップ２０８）。 The playback cluster selection unit 207 performs the same processing as the playback cluster selection unit 107 of the continuous media data shortening playback device in the first embodiment (step 208).

The continuous media reconstruction units 208_1 to 208_n each concatenate the frame section data of their own continuous media data corresponding to the frames of the playback clusters selected by the playback cluster selection unit 207, preserving their order, and each reconstruct their data (step 209). For example, when concatenating, a smoothing process may be applied to the data before and after each join to reduce the discontinuity between them. The smoothing process may be applied by all of the continuous media reconstruction units 208 or by only some of them. Each corresponding continuous media data may be received from its continuous media input unit 201 by way of its parameter extraction unit 202, the parameter synthesis unit 210, the clustering unit 204, the cluster change amount calculation unit 205, and the playback cluster selection unit 207, or it may be received directly from its continuous media input unit 201.

The continuous media output units 209_1 to 209_n each output the continuous media data reconstructed by the corresponding continuous media reconstruction unit 208 (step 210).

連続メディア出力部２０９は、同期をとって出力してもよいし、しなくてもよい。例えば、外部出力デバイスに随時出力する場合に同期をとって出力してもよいし、後で再生することを目的として、記録媒体にファイルとして出力したり、メモリ等の記憶媒体に出力し、別の装置、アプリケーションが逐次利用できるようにする場合は、同期をとらなくてもよい。 The continuous media output unit 209 may or may not output in synchronization. For example, when outputting to an external output device as needed, it may be output in synchronization, or output as a file to a recording medium or output to a storage medium such as a memory for later playback. If the devices and applications can be used sequentially, there is no need to synchronize.

[Third Embodiment]
In this embodiment, a quasi-synchronous composite media data shortening playback apparatus is described.

図７は、本発明の第３の実施の形態における準同期型の複合メディアデータ短縮再生装置の構成図である。 FIG. 7 is a block diagram of a quasi-synchronous composite media data shortening / playback apparatus according to the third embodiment of the present invention.

As shown in the figure, the quasi-synchronous composite media data shortening playback apparatus consists of n continuous media input units 301 (continuous media input units 301_1 to 301_n), n parameter extraction units 302 (parameter extraction units 302_1 to 302_n), a playback speed instruction unit 303, a clustering unit 304, an intra-cluster time compression unit 311, a total compression rate instruction unit 306, a playback cluster selection unit 307, n continuous media reconstruction units 308 (continuous media reconstruction units 308_1 to 308_n), and n continuous media output units 309 (continuous media output units 309_1 to 309_n).

The portion consisting of the n continuous media input units 301 (301_1 to 301_n) and the n parameter extraction units 302 (302_1 to 302_n) corresponds to the media input / feature parameter change amount calculation means, which divides each continuous media data into frame sections and calculates the feature-parameter change amount of each continuous medium in each frame section.

再生速度指示部３０３とクラスタリング部３０４で構成される部分は、各連続メディアの特徴パラメータの変化量の大きなフレームを抽出するフレーム抽出手段と、各連続メディアデータの抽出されたフレームを統合して、クラスタリングするクラスタリング手段に相当する。 The portion composed of the playback speed instruction unit 303 and the clustering unit 304 integrates the frame extraction means for extracting a frame with a large amount of change in the feature parameter of each continuous media, and the extracted frame of each continuous media data, This corresponds to clustering means for clustering.

クラスタ内時間圧縮部３１１は、クラスタリングされたクラスタ単位で各連続メディアデータを時間圧縮する時間圧縮手段に相当する。 The intra-cluster time compressing unit 311 corresponds to a time compressing unit that temporally compresses each continuous media data in clustered cluster units.

クラスタ変化量算出部３０５は、時間圧縮されたクラスタ単位の変化量を計算するクラスタ変化量算出手段に相当する。 The cluster change amount calculation unit 305 corresponds to a cluster change amount calculation unit that calculates a change amount of a cluster unit that is time-compressed.

The portion consisting of the total compression rate instruction unit 306, the playback cluster selection unit 307, and the n continuous media reconstruction units 308 (308_1 to 308_n) corresponds to the reconstruction means, which concatenates only the time-compressed clusters with a large cluster-level change amount and generates reconstructed data for each continuous media data.

The n continuous media output units 309 (309_1 to 309_n) correspond to the output means, which outputs each set of concatenated reconstructed data.

図８は、本発明の第３の実施の形態における動作のフローチャートである。 FIG. 8 is a flowchart of the operation in the third embodiment of the present invention.

The continuous media input units 301_1 to 301_n perform the same processing as the continuous media input units 201_1 to 201_n of the synchronous composite media shortening playback apparatus of the second embodiment described above (step 301).

The parameter extraction units 302_1 to 302_n perform the same processing as the parameter extraction units 202_1 to 202_n of the synchronous composite media shortening playback apparatus of the second embodiment described above (step 302).

The playback speed instruction unit 303 holds a local playback speed for each continuous medium, set so that the smallest structural units of that medium are not compressed to the point of losing their original form, and it passes to the clustering unit 304 playback speed 1 to playback speed n, which serve as the basis for extracting frames with a large feature-parameter change amount from continuous media data 1 to continuous media data n respectively (step 303). At this time it obtains the total compression rate from the total compression rate instruction unit 306, and if any playback speed is larger than the reciprocal of the total compression rate, it replaces that playback speed with the reciprocal of the total compression rate before passing it to the clustering unit 304.

Based on playback speed 1 to playback speed n from the playback speed instruction unit 303, the clustering unit 304 calculates the number of extracted frames of equation (1) above for each media data, obtains the feature-parameter change amounts from the n parameter extraction units 302 (302_1 to 302_n), and for each media data extracts its own number of frames in descending order of feature-parameter change amount. It then clusters on the basis of the time intervals of the sections that were not extracted in any of the media data (step 304).
If there are no sections, or only a few, that were not extracted in any of the media data, the unit may calculate how often the extracted sections overlap and cluster on the basis of the time intervals of the sections with a low overlap frequency, or it may integrate the feature-parameter change amounts of the frames and cluster on the basis of the magnitude of the integrated change amount. The method is not limited to these; any method that can cluster on the basis of the playback speeds and the feature-parameter change amounts may be used. The clustering method itself is the same as the clustering in the continuous media data shortening playback apparatus of the first embodiment.
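The main clustering rule of step 304 can be sketched as follows: frames extracted in any medium are merged, and a run of frames that no medium extracted splits two clusters when the run is long enough. Measuring the run in frames, comparing it against a fixed `min_gap`, and returning each cluster as a `range` of frame indices are assumptions made for illustration; the function name is hypothetical.

```python
from typing import List, Sequence


def cluster_across_media(extracted_per_medium: Sequence[Sequence[int]], min_gap: int) -> List[range]:
    """Cluster on the sections that were extracted in none of the media.

    Two neighbouring extracted frames belong to the same cluster when fewer
    than `min_gap` unextracted frames lie between them.
    """
    union = sorted(set().union(*extracted_per_medium))
    spans: List[List[int]] = []
    for idx in union:
        if spans and idx - spans[-1][1] - 1 < min_gap:
            spans[-1][1] = idx  # extend the current cluster
        else:
            spans.append([idx, idx])  # start a new cluster
    return [range(start, end + 1) for start, end in spans]
```

Each returned range covers the whole cluster section, including frames inside it that only some of the media extracted, which is what the later intra-cluster time compression operates on.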

The intra-cluster time compression unit 311 time-compresses each cluster produced by the clustering unit 304 by relaxing frame synchronization between media within the cluster, and generates a time-compressed cluster (step 305). The time compression method calculates the number of extracted frames of each media data within the cluster and takes the largest of these as the longest extracted frame count. For each media data with fewer extracted frames than the longest extracted frame count, frames are extracted again within the cluster section, in descending order of feature-parameter change amount, until the longest extracted frame count is reached. As a result, the longest extracted frame count of frames is extracted from every media data in the cluster, and concatenating the frames extracted from each media data yields a cluster time-compressed to the length of the longest extracted frame count, which is no greater than the number of frames in the cluster. Within a time-compressed cluster, frame synchronization between media is lost, but the compression remains synchronized at the cluster level.
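The intra-cluster time compression can be illustrated with a short sketch. The cluster section is assumed to be a `range` of frame indices, each medium is assumed to supply its per-frame change amounts and its already extracted frame indices, and ties are broken by frame index; the function name and these representations are hypothetical.

```python
from typing import List, Sequence


def compress_cluster(
    cluster: range,
    changes_per_medium: Sequence[Sequence[float]],
    extracted_per_medium: Sequence[Sequence[int]],
) -> List[List[int]]:
    """Return, per medium, the frames kept in the time-compressed cluster.

    Every medium ends up with the same number of frames: the largest number
    of frames any single medium had extracted inside the cluster.
    """
    kept = [[i for i in ext if i in cluster] for ext in extracted_per_medium]
    longest = max((len(k) for k in kept), default=0)
    result: List[List[int]] = []
    for changes, frames in zip(changes_per_medium, kept):
        if len(frames) < longest:
            # Re-extract within the cluster section, largest change amount first.
            ranked = sorted(cluster, key=lambda i: (-changes[i], i))
            frames = ranked[:longest]
        result.append(sorted(frames))
    return result
```

In the example used for the check below, medium A keeps its three extracted frames, while medium B, which had only one, is re-extracted to three frames. The frame positions differ between the media, but both now occupy the same compressed length.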

The cluster change amount calculation unit 305 performs the same processing as the cluster change amount calculation unit 205 of the synchronous composite media data shortening playback apparatus in the second embodiment described above, except that it uses the clusters time-compressed by the intra-cluster time compression unit 311 instead of the clusters formed by the clustering unit 204 (step 306).

The total compression rate instruction unit 306 performs the same processing as the total compression rate instruction unit 206 of the synchronous composite media data shortening playback apparatus in the second embodiment described above (step 307).

The playback cluster selection unit 307 performs the same processing as the playback cluster selection unit 207 of the synchronous composite media data shortening playback apparatus in the second embodiment described above (step 308).

The continuous media reconstruction units 308_1 to 308_n perform the same processing as the continuous media reconstruction units 208_1 to 208_n of the synchronous composite media data shortening playback apparatus in the second embodiment described above (step 309).

The continuous media output units 309_1 to 309_n perform the same processing as the continuous media output units 209_1 to 209_n of the synchronous composite media data shortening playback apparatus in the second embodiment described above (step 310).

The first to third embodiments described above are not limited to the configurations shown in FIGS. 3, 5, and 7, and various applications are possible.

The continuous media data shortening playback apparatus and the composite media data shortening playback apparatus of the present invention can be realized by hardware alone, using logic circuits or the like, or by a computer and software executed on it.

This software can be distributed on a content-readable recording medium or over a communication line.

[First Example]
An example in which the continuous media data shortening playback apparatus is applied to speech is described with reference to FIG. 3 and FIGS. 9 to 13.

FIG. 9 shows an example speech waveform of speech data, input through the continuous media input unit 101, in which "あきあき" (phoneme symbols "akiaki") is uttered. This example assumes 10 kHz sampling and 16-bit linear PCM, although other sampling frequencies and other speech codings may be used. Each phoneme symbol indicates the section in which that phoneme is pronounced.

The change amount parameter extraction unit 102 divides the speech data into frame sections and calculates speech power as the representative feature parameter of each frame. This example uses a frame section length of 10 ms, so that 100 points of speech data make up one frame section; other frame lengths may be used. The speech power can be calculated, for example, from 256 points (25.6 ms of speech data) centered on the frame section and including speech data outside it: a Blackman window of length 256 points is applied, the speech power is computed, and the result is used as the representative value of the frame section. The window length need not be 256 points, a window shape other than the Blackman window may be used, and windowing may be omitted altogether.
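
The framing and windowed power calculation described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `blackman` and `frame_power`, zero-padding at the signal edges, and defining power as the mean of the squared windowed samples are all assumptions made here.

```python
import math


def blackman(n):
    """Standard Blackman window of length n."""
    return [0.42
            - 0.5 * math.cos(2 * math.pi * k / (n - 1))
            + 0.08 * math.cos(4 * math.pi * k / (n - 1))
            for k in range(n)]


def frame_power(samples, frame_len=100, win_len=256):
    """Per-frame power from a window centered on each frame section.

    frame_len=100 corresponds to 10 ms at 10 kHz; win_len=256 to 25.6 ms.
    Samples outside the signal are treated as zero (an assumption).
    """
    window = blackman(win_len)
    powers = []
    for f in range(len(samples) // frame_len):
        center = f * frame_len + frame_len // 2
        begin = center - win_len // 2
        acc = 0.0
        for k in range(win_len):
            i = begin + k
            x = samples[i] if 0 <= i < len(samples) else 0.0
            acc += (x * window[k]) ** 2
        powers.append(acc / win_len)  # mean squared windowed sample
    return powers


# 30 ms of silence followed by 30 ms of a constant-amplitude signal.
samples = [0] * 300 + [1000] * 300
powers = frame_power(samples)
```

Because the 256-point window is wider than the 100-point frame, neighbouring frames share samples, which smooths the resulting power sequence.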

FIG. 10 shows an example of the speech power corresponding to the speech waveform of FIG. 9. Speech power is a scalar time series, but besides speech power, Δ (delta) power, FFT coefficients, LPC coefficients, cepstrum coefficients, Δ (delta) cepstrum coefficients, mel-frequency cepstrum coefficients (MFCC), Δ (delta) mel-frequency cepstrum coefficients (ΔMFCC), similar speech analysis parameters, or combinations of these can also be used. The feature parameter may therefore be a vector as well as a scalar.

The change amount of the feature parameter is calculated, for example, as the absolute value of the difference in speech power between frames. It may be the absolute difference between the current frame and the previous frame, the absolute difference between the current frame and the next frame, the absolute value of half the difference between the previous and next frames, or a value calculated by combining the speech power of several preceding and following frames. The absolute value of the Δ (delta) power may also be calculated directly from the speech data.
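
The three single-step difference variants listed above can be sketched as follows. The function name `change_amounts`, the `mode` strings, and the handling of the first and last frames (the missing neighbour is replaced by the frame itself) are assumptions made for illustration.

```python
def change_amounts(power, mode="backward"):
    """Absolute frame-to-frame change of a scalar feature such as speech power.

    mode "backward": |p[i] - p[i-1]|
    mode "forward":  |p[i+1] - p[i]|
    mode "central":  |(p[i+1] - p[i-1]) / 2|
    """
    n = len(power)
    out = []
    for i in range(n):
        prev = power[max(i - 1, 0)]      # edge: reuse the frame itself
        nxt = power[min(i + 1, n - 1)]   # edge: reuse the frame itself
        if mode == "backward":
            d = power[i] - prev
        elif mode == "forward":
            d = nxt - power[i]
        elif mode == "central":
            d = (nxt - prev) / 2
        else:
            raise ValueError("unknown mode: " + mode)
        out.append(abs(d))
    return out


power = [1.0, 4.0, 2.0, 2.0]
backward = change_amounts(power, "backward")
forward = change_amounts(power, "forward")
central = change_amounts(power, "central")
```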

FIG. 11 shows an example of the speech power change amount corresponding to the speech power of FIG. 10. Because speech power is a scalar, a difference calculation was used in this example; in the general case, including vectors, the difference can be replaced by a distance calculation between two values. As an example of the vector case, the mel-frequency cepstrum distance (MFCD) can be calculated from multidimensional MFCC or Δ (delta) MFCC vectors and used as the change amount.
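
For vector-valued feature parameters, the replacement of the difference by a distance can be sketched as follows. The plain Euclidean distance is used here purely as a stand-in (an assumption); the MFCD named in the text has its own definition and scaling, which this sketch does not attempt to reproduce. The function name `vector_change_amounts` is hypothetical.

```python
import math


def vector_change_amounts(features):
    """Change amount per frame as the distance to the previous frame's vector.

    features: list of equal-length numeric tuples, one per frame.
    The first frame has no predecessor, so its change amount is set to 0.0.
    """
    out = [0.0] if features else []
    for prev, cur in zip(features, features[1:]):
        out.append(math.dist(prev, cur))  # Euclidean distance (assumption)
    return out


features = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0)]
vec_change = vector_change_amounts(features)
```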

In the total compression rate instruction unit 106, the user specifies the desired total compression rate or the desired shortened playback time. The following description uses an example in which the total compression rate is set to 1/6 (equivalent to 6x speed in fast playback).

The playback speed instruction unit 103 sets a local playback speed in order to prevent the minimum constituent units of the continuous media (here, phonemes, words, and the like in speech) from degenerating so far that they no longer retain their original form (here, becoming inaudible). FIG. 12 is an example in which the speed is set to 3x, at which few phonemes are lost. FIG. 13 is a comparative example in which the speed is set to 6x, equal to the total compression rate. The graph in FIG. 13(c) is an example in which the threshold on the speech power change amount is set based on the 6x playback speed; the shaded portions, where the value exceeds the threshold, indicate the extracted frames. The graph in FIG. 13(d) is an example in which the extracted frames are concatenated: of the original speech "あきあき" (phoneme symbols "akiaki"), the phoneme /i/ is lost, and the sections of the phoneme /k/ are so short that they degenerate to the point of being inaudible. As a result, the extracted speech becomes "ああ" ("aa"), and its meaning as a word can no longer be understood. In contrast, FIG. 12 is an example in which the speed is appropriately set to 3x, at which few phonemes are lost. As the graph in FIG. 12(c) shows, when the threshold on the power change amount is set based on the 3x playback speed, all phonemes of the original speech "あきあき" (phoneme symbols "akiaki") remain at the extraction stage.
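
The effect of the local playback speed on frame extraction can be sketched as follows. The sketch assumes that a speed of k keeps the 1/k fraction of frames with the largest change amounts, which is equivalent to placing the threshold just below the smallest kept value; the function name `extract_frames` and the tie-breaking by earlier frame index are assumptions made here.

```python
def extract_frames(change, speed):
    """Keep len(change) / speed frames with the largest change amounts.

    Returns the kept frame indices in time order.
    """
    n_keep = int(len(change) / speed)
    # Rank frame indices by change amount, largest first
    # (ties keep the earlier frame first because the sort is stable).
    ranked = sorted(range(len(change)), key=lambda i: change[i], reverse=True)
    return sorted(ranked[:n_keep])


change = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7]
kept_3x = extract_frames(change, 3)  # local speed that preserves more detail
kept_6x = extract_frames(change, 6)  # extraction directly at the total rate
```

With the toy data, 3x extraction keeps two frames and 6x extraction keeps only one, mirroring the contrast between FIG. 12 and FIG. 13.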

The clustering unit 104 extracts frames, in descending order of speech power change amount, based on the local playback speed, and clusters them. The graph in FIG. 12(c) is an example in which a maximum-interval threshold for cluster division is set, and adjacent extracted frame sections that are closer together than this threshold are merged into the same cluster. Here, two clusters are formed.
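
The gap-based clustering described above can be sketched as follows. The function name `cluster_frames` and the parameter name `max_gap` are hypothetical, and measuring the gap as the difference between frame indices is an assumption.

```python
def cluster_frames(frames, max_gap):
    """Group extracted frame indices into clusters.

    Two consecutive extracted frames belong to the same cluster when the
    distance between them is smaller than max_gap.
    """
    clusters = []
    for f in sorted(frames):
        if clusters and f - clusters[-1][-1] < max_gap:
            clusters[-1].append(f)   # close enough: extend current cluster
        else:
            clusters.append([f])     # too far: start a new cluster
    return clusters


clusters = cluster_frames([2, 3, 5, 20, 21, 23], max_gap=5)
```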

The cluster change amount calculation unit 105 calculates a per-cluster change amount for each formed cluster. For example, the per-cluster change amount can be calculated by averaging the feature-parameter change amounts of the extracted frames contained in the cluster, although various other methods are possible.
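
The averaging example mentioned above amounts to the following. The function name `cluster_change` is hypothetical, and clusters are assumed to be non-empty lists of frame indices.

```python
def cluster_change(cluster, change):
    """Mean change amount of the extracted frames in one cluster."""
    return sum(change[i] for i in cluster) / len(cluster)


change = [0.0, 1.0, 3.0, 0.0, 2.0, 6.0]
scores = [cluster_change(c, change) for c in [[1, 2], [4, 5]]]
```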

The playback cluster selection unit 107 selects clusters with a large per-cluster change amount. Here the total compression rate is 1/6, but because frame extraction is performed based on the local playback speed of 3x, cluster selection reduces the data by a further factor of 2. FIG. 12(d) is an example in which, of the clusters formed in FIG. 12(c), the cluster with the larger per-cluster change amount is selected, halving the data. As a result, the speech "あき" ("aki") is generated. Only one of the two words is played back, but if the cluster with the larger per-cluster change amount is taken to better represent the characteristics of the speech data, this is appropriate as shortened playback at a level where the meaning of the word can be understood. As FIG. 13(d) shows, extracting frames from both words is pointless if the content cannot be grasped.
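
Cluster selection against the total compression rate can be sketched as follows. The source only says that clusters with large change amounts are selected so that the frame total roughly matches the target; the greedy strategy below, which never exceeds the target, is one possible reading and is an assumption, as are the function name `select_clusters` and the parameter `total_speed` (the reciprocal of the total compression rate).

```python
def select_clusters(clusters, scores, total_frames, total_speed):
    """Pick high-scoring clusters until the frame budget is used up.

    clusters:     list of lists of frame indices, in time order
    scores:       per-cluster change amount, same order as clusters
    total_frames: number of frames in the original media
    total_speed:  reciprocal of the total compression rate (e.g. 6 for 1/6)
    """
    target = total_frames / total_speed
    order = sorted(range(len(clusters)), key=lambda k: scores[k], reverse=True)
    chosen, used = [], 0
    for k in order:
        if used + len(clusters[k]) <= target:  # greedy, never over budget
            chosen.append(k)
            used += len(clusters[k])
    return [clusters[k] for k in sorted(chosen)]  # restore time order


# 36 frames, extracted at 3x into two clusters of 6 frames each,
# then reduced to a total compression rate of 1/6.
clusters = [[2, 3, 5, 6, 7, 8], [20, 21, 23, 24, 25, 26]]
selected = select_clusters(clusters, [0.4, 0.7], total_frames=36, total_speed=6)
```

As in the FIG. 12(d) example, only the cluster with the larger change amount survives, which halves the 3x extraction to reach the overall 1/6 rate.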

The continuous media reconstruction unit 108 extracts the speech data corresponding to the frames selected by the playback cluster selection unit 107 and reconstructs it by concatenating the data while maintaining the original order.

When the speech data is concatenated, smoothing may be applied to the joins. If speech fragments are concatenated as they are, the discontinuities are perceived as noise during playback; smoothing removes these discontinuities and reduces the noise.

One example of this smoothing is to take a moving average of the speech data over a fixed section before and after each join. For example, a moving average over a total of 5 points before and after may be used, or a number of points other than 5. Smoothing methods other than the moving average may also be used.
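
Concatenation with moving-average smoothing at the joins can be sketched as follows. The function name `concat_with_smoothing`, the choice to smooth only the samples within half a window of each join, and the shortened averaging window at the signal edges are assumptions made for illustration.

```python
def concat_with_smoothing(segments, width=5):
    """Concatenate sample segments in order and smooth around each join.

    width is the total number of points in the moving average (5 here).
    """
    out, joins = [], []
    for seg in segments:
        if out and seg:
            joins.append(len(out))  # index of the first sample after a join
        out.extend(seg)

    half = width // 2
    smoothed = list(out)
    for j in joins:
        for i in range(max(j - half, 0), min(j + half, len(out))):
            lo, hi = max(i - half, 0), min(i + half + 1, len(out))
            # Average over the unsmoothed samples so results do not cascade.
            smoothed[i] = sum(out[lo:hi]) / (hi - lo)
    return smoothed


joined = concat_with_smoothing([[0, 0, 0, 0], [10, 10, 10, 10]])
```

The abrupt step from 0 to 10 at the join becomes a ramp over four samples, which is the kind of discontinuity removal the text describes.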

The continuous media output unit 109 may, for example, output the data as it is produced to an audio output device such as a speaker or headphones for playback, or may output it to a file for later playback.

[Second Example]
In this example, a quasi-synchronous composite media data shortening playback apparatus is described with reference to FIG. 14, using the terms of FIGS. 5 and 7.

FIG. 14 shows an example of the operation of the quasi-synchronous composite media data shortening playback method of the second example of the present invention, in which continuous media 1 is audio and continuous media 2 is video.

FIG. 14(a) is an example in which the frame sections clustered by the clustering unit 204, after passing through the continuous media input units 201_1 and 201_2 and the parameter extraction units 202_1 and 202_2, are mapped onto the time axis of the original data.

FIG. 14(b) is an example in which the extracted frames of FIG. 14(a) are concatenated while maintaining frame synchronization.

FIG. 14(c) is an example in which the data of FIG. 14(b) is further time-compressed within each cluster by the intra-cluster time compression unit 311 and then concatenated. As a comparison of FIGS. 14(b) and 14(c) shows, although the audio and video frames are no longer synchronized, synchronization is maintained at the cluster level, and the data is shortened further while preserving the intelligibility of the speech phonemes and of the motion of moving objects in the video.

After that, the cluster change amount calculation unit 205 calculates the per-cluster change amounts, and the playback cluster selection unit 207 selects clusters with large per-cluster change amounts in accordance with the total compression rate.

The continuous media reconstruction units 208_1 and 208_2 extract the media data corresponding to the frame sections in the selected clusters and reconstruct each media by concatenating its data while maintaining the original order.

The continuous media output units 209_1 and 209_2 may, for example, output the data as it is produced to an audio output device such as a speaker or headphones and to a display device such as a monitor, or may output it to files for later playback. For file output, the audio and video may be written to separate files, or converted into a combined format and written to a single file.

The present invention is not limited to the embodiments and examples described above, and various modifications and applications are possible within the scope of the claims.

The present invention is applicable to techniques for shortened playback of various kinds of continuous media.

FIG. 1 is a diagram for explaining the principle of the present invention.
FIG. 2 is a diagram of the principle configuration of the present invention.
FIG. 3 is a block diagram of the continuous media data shortening playback apparatus in the first embodiment of the present invention.
FIG. 4 is a flowchart of the operation in the first embodiment of the present invention.
FIG. 5 is a block diagram of the synchronous composite media data shortening playback apparatus in the second embodiment of the present invention.
FIG. 6 is a flowchart of the operation in the second embodiment of the present invention.
FIG. 7 is a block diagram of the quasi-synchronous composite media data shortening playback apparatus in the third embodiment of the present invention.
FIG. 8 is a flowchart of the operation in the third embodiment of the present invention.
FIG. 9 is an example of the speech waveform in the first example of the present invention.
FIG. 10 is an example of the speech power in the first example of the present invention.
FIG. 11 is an example of the speech power change amount in the first example of the present invention.
FIG. 12 is an example of the reconstructed speech data with clustering in the first example of the present invention.
FIG. 13 is an example of the reconstructed speech data without clustering in the first example of the present invention.
FIG. 14 is an example of the operation of the quasi-synchronous composite media data shortening playback method in the second example of the present invention.

Explanation of symbols

101, 201, 301 Continuous media input unit
102, 202, 302 Change amount parameter extraction unit
103, 203 Playback speed designation unit
104, 204, 304 Clustering unit
105, 205, 305 Cluster change amount calculation unit
106, 206, 306 Total compression rate instruction unit
107, 207, 307 Playback cluster selection unit
108, 208, 308 Continuous media reconstruction unit
109, 209, 309 Continuous media output unit
110 Media input / feature parameter change amount calculation means
120 Frame extraction means
130 Clustering means
140 Cluster change amount calculation means
150 Media reconstruction means
160 Output means
210 Parameter synthesis unit
311 Intra-cluster time compression unit

Claims

A continuous media data shortening playback method for shortening and playing back continuous media data, comprising:
a media input / feature parameter change amount calculation step in which media input / feature parameter change amount calculation means divides input continuous media data into frame sections and calculates a change amount of a feature parameter for each frame section;
a frame extraction step in which frame extraction means extracts frames having a large change amount of the feature parameter, limited to the number of frames obtained by dividing the total number of frames of the continuous media data by a playback speed that does not cause an extreme drop in intelligibility;
a clustering step in which clustering means clusters the extracted frames by grouping extracted frames separated by a small time interval into the same cluster;
a cluster change amount calculation step in which cluster change amount calculation means calculates a change amount for each cluster formed by the clustering;
a media reconstruction step in which media reconstruction means, for a total compression rate whose reciprocal is equal to or greater than the playback speed, selects clusters having a large per-cluster change amount so that the total number of frames contained in the selected clusters is approximately equal to the product of the total number of frames of the continuous media data and the total compression rate, and reconstructs the data by concatenating the frames contained in the selected clusters while maintaining their order; and
an output step in which output means outputs the concatenated, reconstructed data.

A synchronous composite media data shortening playback method for shortening and playing back composite media data composed of a plurality of continuous media data, comprising:
a media input / feature parameter change amount calculation step in which media input / feature parameter change amount calculation means divides each input continuous media data into frame sections and calculates a change amount of a feature parameter for each frame section;
a change amount parameter extraction step in which change amount parameter extraction means calculates an integrated feature-parameter change amount from the feature-parameter change amounts of the respective continuous media data;
a frame extraction step in which frame extraction means extracts frames having a large integrated feature-parameter change amount, limited to the number of frames obtained by dividing the total number of frames of the continuous media data by a playback speed that does not cause an extreme drop in intelligibility;
a clustering step in which clustering means clusters the extracted frames by grouping extracted frames separated by a small time interval into the same cluster;
a cluster change amount calculation step in which cluster change amount calculation means calculates a change amount for each cluster formed by the clustering;
a media reconstruction step in which media reconstruction means, for a total compression rate whose reciprocal is equal to or greater than the playback speed, selects clusters having a large per-cluster change amount so that the total number of frames contained in the selected clusters is approximately equal to the product of the total number of frames of the continuous media data and the total compression rate, and reconstructs the data by concatenating the frames contained in the selected clusters while maintaining their order; and
an output step in which output means outputs the reconstructed data.

A quasi-synchronous composite media data shortening playback method for shortening and playing back composite media data composed of a plurality of continuous media data, comprising:
a media input / feature parameter change amount calculation step in which media input / feature parameter change amount calculation means divides each input continuous media data into frame sections and calculates a change amount of a feature parameter for each frame section;
a frame extraction step in which frame extraction means extracts frames having a large change amount of the feature parameter, limited to the number of frames obtained by dividing the total number of frames of the continuous media data by a playback speed that does not cause an extreme drop in intelligibility;
a clustering step in which clustering means clusters the extracted frames by grouping extracted frames separated by a small time interval into the same cluster;
a time compression step in which time compression means time-compresses each continuous media data in units of the clusters formed by the clustering;
a cluster change amount calculation step in which cluster change amount calculation means calculates a change amount for each time-compressed cluster;
a media reconstruction step in which media reconstruction means, for a total compression rate whose reciprocal is equal to or greater than the playback speed, selects clusters having a large per-cluster change amount so that the total number of frames contained in the selected clusters is approximately equal to the product of the total number of frames of the continuous media data and the total compression rate, and reconstructs the data by concatenating the frames contained in the selected clusters while maintaining their order; and
an output step in which output means outputs the reconstructed data.

A continuous media data shortening playback apparatus that shortens and plays back continuous media data, comprising:
media input / feature parameter change amount calculation means for dividing input continuous media data into frame sections and calculating a change amount of a feature parameter for each frame section;
frame extraction means for extracting frames having a large change amount of the feature parameter, limited to the number of frames obtained by dividing the total number of frames of the continuous media data by a playback speed that does not cause an extreme drop in intelligibility;
clustering means for clustering the extracted frames by grouping extracted frames separated by a small time interval into the same cluster;
cluster change amount calculation means for calculating a change amount for each cluster formed by the clustering;
media reconstruction means for selecting, for a total compression rate whose reciprocal is equal to or greater than the playback speed, clusters having a large per-cluster change amount so that the total number of frames contained in the selected clusters is approximately equal to the product of the total number of frames of the continuous media data and the total compression rate, and for reconstructing the data by concatenating the frames contained in the selected clusters while maintaining their order; and
output means for outputting the concatenated, reconstructed data.

A synchronous composite media data shortening playback apparatus for shortening and playing back composite media data composed of a plurality of continuous media data, comprising:
media input / feature parameter change amount calculation means for dividing each input continuous media data into frame sections and calculating a change amount of a feature parameter for each frame section;
change amount parameter extraction means for calculating an integrated feature-parameter change amount from the feature-parameter change amounts of the respective continuous media data;
frame extraction means for extracting frames having a large integrated feature-parameter change amount, limited to the number of frames obtained by dividing the total number of frames of the continuous media data by a playback speed that does not cause an extreme drop in intelligibility;
clustering means for clustering the extracted frames by grouping extracted frames separated by a small time interval into the same cluster;
cluster change amount calculation means for calculating a change amount for each cluster formed by the clustering;
media reconstruction means for selecting, for a total compression rate whose reciprocal is equal to or greater than the playback speed, clusters having a large per-cluster change amount so that the total number of frames contained in the selected clusters is approximately equal to the product of the total number of frames of the continuous media data and the total compression rate, and for reconstructing the data by concatenating the frames contained in the selected clusters while maintaining their order; and
output means for outputting the reconstructed data.

A quasi-synchronous composite media data shortening playback apparatus for shortening and playing back composite media data composed of a plurality of continuous media data, comprising:
media input / feature parameter change amount calculation means for dividing each input continuous media data into frame sections and calculating a change amount of a feature parameter for each frame section;
frame extraction means for extracting frames having a large change amount of the feature parameter, limited to the number of frames obtained by dividing the total number of frames of the continuous media data by a playback speed that does not cause an extreme drop in intelligibility;
clustering means for clustering the extracted frames of each continuous media data by grouping extracted frames separated by a small time interval into the same cluster;
time compression means for time-compressing each continuous media data in units of the clusters formed by the clustering;
cluster change amount calculation means for calculating a change amount for each time-compressed cluster;
media reconstruction means for selecting, for a total compression rate whose reciprocal is equal to or greater than the playback speed, clusters having a large per-cluster change amount so that the total number of frames contained in the selected clusters is approximately equal to the product of the total number of frames of the continuous media data and the total compression rate, and for reconstructing the data by concatenating the frames contained in the selected clusters while maintaining their order; and
output means for outputting the reconstructed data.

A program that causes a computer to function as the apparatus according to claim 4.

A computer-readable recording medium storing a program that causes a computer to function as the apparatus according to claim 4.