WO2007013407A1 - Digest generation device, digest generation method, recording medium containing a digest generation program, and integrated circuit used in digest generation device - Google Patents


Info

Publication number
WO2007013407A1
WO2007013407A1 (PCT/JP2006/314589, JP2006314589W)
Authority
WO
WIPO (PCT)
Prior art keywords
digest
section
time
specific section
candidate
Prior art date
Application number
PCT/JP2006/314589
Other languages
French (fr)
Japanese (ja)
Inventor
Takashi Kawamura
Meiko Maeda
Kazuhiro Kuroyama
Original Assignee
Matsushita Electric Industrial Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co., Ltd.
Priority to JP2007528453A (published as JPWO2007013407A1)
Priority to US11/994,827 (published as US20090226144A1)
Publication of WO2007013407A1

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B 27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B 27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B 27/19 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B 27/28 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information signals recorded by the same method as the main recording
    • G11B 27/102 Programmed access in sequence to addressed parts of tracks of operating record carriers
    • G11B 27/105 Programmed access in sequence to addressed parts of tracks of operating discs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04H BROADCAST COMMUNICATION
    • H04H 60/00 Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
    • H04H 60/35 Arrangements for identifying or recognising characteristics with a direct linkage to broadcast information or to broadcast space-time, e.g. for identifying broadcast stations or for identifying users
    • H04H 60/37 Arrangements for identifying segments of broadcast information, e.g. scenes or extracting programme ID
    • H04H 60/375 Commercial
    • H04H 60/56 Arrangements characterised by components specially adapted for monitoring, identification or recognition covered by groups H04H60/29-H04H60/54
    • H04H 60/58 Monitoring, identification or recognition of audio
    • H04H 60/59 Monitoring, identification or recognition of video
    • H04H 60/61 Arrangements for services using the result of monitoring, identification or recognition covered by groups H04H60/29-H04H60/54
    • H04H 60/65 Arrangements for using the result on users' side
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/76 Television signal recording
    • H04N 5/765 Interface circuits between an apparatus for recording and another apparatus
    • H04N 5/775 Interface circuits between a recording apparatus and a television receiver
    • H04N 5/78 Television signal recording using magnetic recording
    • H04N 5/781 Television signal recording using magnetic recording on disks or drums
    • G11B 2220/00 Record carriers by type
    • G11B 2220/20 Disc-shaped record carriers
    • G11B 2220/25 Disc-shaped record carriers characterised in that the disc is based on a specific recording technology
    • G11B 2220/2508 Magnetic discs
    • G11B 2220/2516 Hard disks

Definitions

  • Digest generation device, digest generation method, recording medium storing a digest generation program, and integrated circuit used in a digest generation device
  • The present invention relates to the generation of digest scenes, and more specifically to generating digest scenes by calculating video and audio feature quantities from television broadcasts and the like and using them to determine important scenes.
  • Digest generation apparatuses calculate feature quantities of video and audio, such as television broadcasts, and determine important scenes using these quantities.
  • Conventionally, the following method is generally used to generate a digest: feature quantities are calculated for one program from the video and audio signals once recorded on the recording medium, CM sections are detected based on those feature quantities, and time information such as a playlist for digest playback is created.
  • FIG. 14 shows an example of the configuration of a digest generation device that generates a digest excluding the CM section.
  • a receiving unit 101 receives a broadcast radio wave and demodulates it into an audio video signal (hereinafter referred to as an AV signal).
  • The mass storage medium 102 is a medium for recording the received AV signal; an HDD or the like corresponds to this.
  • The feature quantity extraction unit 103 calculates, from the AV signal stored in the mass storage medium 102, a feature quantity required for digest generation (hereinafter referred to as a digest feature quantity) and a feature quantity required for CM detection (hereinafter referred to as a CM feature quantity).
  • CM feature quantities may include scene-change detection results based on luminance information and silence information of the audio.
  • the CM detection unit 104 detects a CM section (start time and end time information) based on the calculated CM feature value, and outputs it to the digest detection unit 105.
  • The CM section is detected, for example, by detecting video scene changes from the luminance information of the video.
  • The digest detection unit 105 detects digest scenes excluding the CM sections, based on the digest feature quantity and the CM section information output from the CM detection unit 104.
  • The detected digest scenes (start time and end time information) are output to the playback control unit 106 as digest information.
  • For example, a method that identifies slow-motion (replay) scenes from the motion vectors of the video and detects the few cuts preceding them as exciting scenes (for example, Patent Document 1),
  • a method that detects scenes in which the voice power information takes a locally large value as exciting scenes (for example, Patent Document 2), and
  • a method that detects important scenes by combining text information given to a program with features of the video and audio signals (for example, Patent Document 3) are used.
  • the playback control unit 106 reads an AV signal from the large-capacity storage medium 102 and performs digest playback based on the digest information.
  • In this way, digest information consisting of digest scenes in the sections excluding the CM sections can be created and used for digest playback.
  • FIG. 15 shows a digest generation device that detects digest scene candidates in real time while calculating feature quantities in parallel with the recording process, stores them together with CM feature quantities in a large-capacity storage means, and, at playback time, detects the CM sections and generates correct digest information by excluding candidates included in a CM section.
  • the receiving unit 101 records the received AV signal on the large-capacity storage medium 102 and outputs the AV signal to the feature amount extracting unit 103 as well.
  • the feature quantity extraction unit 103 calculates a CM feature quantity and stores it in the mass storage medium 102.
  • The feature quantity extraction unit 103 outputs the digest feature quantity, such as the voice power level, to the digest detection unit 105.
  • The digest detection unit 105 analyzes the digest feature quantity and detects, for example, a scene whose voice power level is equal to or higher than a predetermined threshold as a digest scene candidate. The digest detection unit 105 then stores the detected scenes in the mass storage medium 102 as digest candidate information. In other words, scenes that are digest candidates are detected in parallel with the program recording, and the digest candidate information (time information) and the CM feature quantity are recorded in the mass storage medium 102.
  • the CM detection unit 104 reads the CM feature amount from the large-capacity storage medium 102 and detects a CM section. CM detecting section 104 then outputs the detection result as CM section information to CM section removing section 107.
  • The CM section removing unit 107 deletes, from the digest candidate information read from the large-capacity storage medium 102, the portions corresponding to the CM sections, and creates the digest information.
  • That is, scenes whose voice power level is equal to or higher than a predetermined value are provisionally detected, including those in CM sections, and recorded as digest candidate information.
  • Then, the entire recorded program is analyzed to detect the CM sections, the CM sections are subtracted from the digest candidates, and the digest sections are extracted.
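  • The subtraction described above, removing digest candidates that fall in a CM section, can be sketched as follows (illustrative Python only, not part of the disclosure; the (start, end)-tuple interval representation and the rule of dropping any overlapping candidate whole are assumptions):

```python
def subtract_cm_sections(candidates, cm_sections):
    """Remove digest-candidate intervals that overlap a CM interval.

    Intervals are (start, end) tuples in seconds. A candidate that
    overlaps a CM section at all is dropped whole (an assumed rule).
    """
    result = []
    for cs, ce in candidates:
        # Two intervals overlap iff each starts before the other ends.
        if any(cs < me and ms < ce for ms, me in cm_sections):
            continue  # falls in a CM section -> exclude
        result.append((cs, ce))
    return result

# Hypothetical times: the candidate at 130-140 s lies inside
# the CM section 120-180 s and is removed.
digest = subtract_cm_sections([(10, 20), (130, 140), (300, 310)],
                              [(120, 180)])
```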
  • Patent Document 1: JP 2004-128550 A
  • Patent Document 2: JP 10-039890 A
  • Patent Document 3: JP 2001-119649 A
  • the digest generation apparatus as described above has the following problems.
  • The second method calculates feature quantities and detects digest-candidate scene information during recording, so compared with the first method it reduces the time needed for feature-quantity calculation at the time of a playback instruction. However, because the start and end of a CM section cannot be determined in real time, CM section detection is still performed after recording ends (for example, when playback is instructed), so a processing wait remains.
  • Therefore, an object of the present invention is to provide a digest generation apparatus that has no processing wait for generating digest information of a program after recording of the program ends.
  • the present invention employs the following configuration.
  • A first aspect is a digest generation device that generates digest scene information on a program while the broadcast signal of the broadcast program is received and recorded on a recording medium, and includes a feature amount calculation unit, a specific section end detection unit, and a digest scene information creation unit.
  • The feature amount calculation unit calculates, every time a broadcast signal of a predetermined unit time is received, at least one type of feature amount indicating a feature related to at least one of video and audio included in the received broadcast signal for that unit time.
  • The specific section end detection unit determines, every time the feature amount is calculated, whether or not a predetermined time point included in the signal portion of the received broadcast signal for which the feature amount has already been calculated is the start or end of a specific section, thereby detecting the start or end of the specific section.
  • The digest scene information creation unit determines, every time the feature amount is calculated, whether or not the broadcast signal in the sections of the program excluding the specific section is a digest scene, based on the feature amount, and generates digest scene information.
  • In a second aspect, the digest scene information creation unit includes a digest section detection unit that detects digest candidate sections in the received broadcast signal by determining, each time the feature amount is calculated, whether or not the content included in the broadcast signal for the unit time is a digest scene, based on the feature amount.
  • Each time the specific section end detection unit detects a pair of the start and end of a specific section, the digest scene information creation unit determines whether or not the specific section from that start to that end overlaps a digest candidate section, and generates, as digest scene information, information indicating the digest candidate sections detected by the digest section detection unit excluding those that overlap the specific section.
  • In a third aspect, the digest scene information creation unit includes a temporary storage unit that stores the calculated feature amounts for a predetermined time back from the latest calculation time point.
  • The digest scene information creation unit determines whether or not the time point of each feature amount stored in the temporary storage unit is included between the start and end of the specific section detected by the specific section end detection unit, and only when it is not included, detects digest-scene content from the content included in the broadcast signal for the unit time and generates digest scene information.
  • In a fourth aspect, the feature amount calculation unit calculates first and second feature amounts.
  • The specific section end detection unit detects the specific section based on the first feature amount.
  • The digest section detection unit detects digest candidate sections based on the second feature amount.
  • In a fifth aspect, the specific section end detection unit includes a specific section candidate detection unit that detects, as a specific section candidate, a section consisting only of feature amounts satisfying a predetermined condition, and a specific section determination unit that determines the start or end of the specific section based on the time differences between specific section candidates in the program.
  • In a sixth aspect, every time a specific section candidate is detected, the specific section determination unit determines whether or not an already detected specific section candidate exists a predetermined time before the last-detected candidate; if it exists, that earlier time point is determined to be the start of the specific section and the last-detected specific section candidate the end of the specific section.
  • In a seventh aspect, the specific section determination unit includes a determination unit that determines, every time a specific section candidate is detected, whether or not an already detected specific section candidate exists a predetermined second time before the last-detected candidate;
  • an addition unit that adds points to both the candidate determined to exist and the last-detected candidate;
  • a start determination unit that, when a target candidate having a score equal to or greater than a predetermined value is detected, determines whether or not a specific section candidate whose score is equal to or greater than the predetermined value exists a predetermined third time before the target candidate, and if not, sets the target candidate as the start of the specific section; and
  • an end determination unit that, every time the predetermined third time elapses after detection of a target candidate having a score equal to or greater than the predetermined value, determines whether or not a specific section candidate whose score is equal to or greater than the predetermined value exists, and if not, sets the target candidate as the end of the specific section.
  • In an eighth aspect, the feature amount calculation unit calculates the power level of the audio signal as the feature amount, and the specific section candidate detection unit detects silent sections whose power level is equal to or less than a predetermined value as specific section candidates.
  • In a ninth aspect, the feature amount calculation unit calculates luminance information based on the video signal as the feature amount, and the specific section candidate detection unit detects scene change points at which the amount of change in the luminance information is equal to or greater than a predetermined value as specific section candidates.
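  • Scene-change detection of this kind can be sketched as follows (illustrative Python only; the per-frame mean-luminance input and the threshold of 40 are assumptions, the patent leaves the value unspecified):

```python
def scene_change_points(luminance, threshold=40.0):
    """Return frame indices where the mean luminance jumps by more
    than `threshold` (an assumed value) relative to the previous frame,
    i.e. the specific-section candidates of the ninth aspect."""
    return [i for i in range(1, len(luminance))
            if abs(luminance[i] - luminance[i - 1]) > threshold]

# Hypothetical mean-luminance series: cuts occur at frames 2 and 4.
cuts = scene_change_points([100, 102, 30, 32, 200])
```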
  • A tenth aspect is a digest generation method for generating digest scene information on a program while the broadcast signal of the broadcast program is received and recorded on a recording medium, and includes a feature amount calculation step, a specific section end detection step, and a digest scene information creation step.
  • In the feature amount calculation step, every time a broadcast signal of a predetermined unit time is received, at least one type of feature amount indicating a feature related to at least one of video and audio included in the broadcast signal is calculated from the received broadcast signal for the unit time.
  • In the specific section end detection step, every time the feature amount is calculated, it is determined whether or not a predetermined time point included in the signal portion of the received broadcast signal for which the feature amount has already been calculated is the start or end of the specific section, whereby the start or end of the specific section is detected.
  • the digest scene information creation step determines whether or not the broadcast signal for a section excluding the specific section of the entire section of the program is a digest scene every time the feature amount is calculated. Digest scene information is generated.
  • The digest scene information creation step includes a digest section detection step of detecting digest candidate sections in the received broadcast signal by determining, each time the feature amount is calculated for the broadcast signal for the unit time, whether or not the content included in the broadcast signal for the unit time is a digest scene, based on the feature amount.
  • In the digest scene information creation step, every time a pair of the start and end of a specific section is detected in the specific section end detection step, it is determined whether or not the specific section from that start to that end overlaps a digest candidate section, and information indicating the digest candidate sections detected in the digest section detection step excluding those that overlap the specific section is generated as digest scene information.
  • The digest scene information creation step includes a temporary storage step of storing the calculated feature amounts for a predetermined time back from the latest calculation time point.
  • In the digest scene information creation step, every time the feature amount is calculated, it is determined whether or not the time point of each feature amount stored in the temporary storage step is included between the start and end of the specific section detected in the specific section end detection step, and only when it is not included, digest scene information is generated by detecting digest-scene content from the content included in the broadcast signal for the unit time.
  • A thirteenth aspect is a recording medium storing a digest generation program to be executed by a computer of a digest generation device that generates digest scene information on a program while the broadcast signal of the broadcast program is received and recorded on a recording medium.
  • The recording medium stores a program comprising a feature amount calculation step, a specific section end detection step, and a digest scene information creation step.
  • The feature amount calculation step is a process of calculating, every time a broadcast signal of a predetermined unit time is received, at least one type of feature amount indicating a feature related to at least one of video and audio included in the broadcast signal, from the received broadcast signal for the unit time.
  • The specific section end detection step is a process of detecting the start or end of a specific section by determining, every time the feature amount is calculated, whether or not a predetermined time point included in the signal portion of the received broadcast signal for which the feature amount has already been calculated is the start or end of the specific section.
  • The digest scene information creation step is a process of generating digest scene information by determining, every time the feature amount is calculated, whether or not the broadcast signal in the sections of the program excluding the specific section is a digest scene, based on the feature amount.
  • The digest scene information creation step includes a digest section detection step of detecting digest candidate sections in the received broadcast signal by determining, each time the feature amount is calculated for the broadcast signal for the unit time, whether or not the content included in the broadcast signal for the unit time is a digest scene, based on the feature amount.
  • In the digest scene information creation step, every time a pair of the start and end of a specific section is detected in the specific section end detection step, it is determined whether or not the specific section from that start to that end overlaps a digest candidate section, and information indicating the digest candidate sections detected in the digest section detection step excluding those that overlap the specific section is generated as digest scene information.
  • The digest scene information creation step includes a temporary storage step of storing the calculated feature amounts for a predetermined time back from the latest calculation time point.
  • In the digest scene information creation step, every time the feature amount is calculated, it is determined whether or not the time point of each feature amount stored in the temporary storage step is included between the start and end of the specific section detected in the specific section end detection step, and only when it is not included, digest scene information is generated by detecting digest-scene content from the content included in the broadcast signal for the unit time.
  • A sixteenth aspect is an integrated circuit used in a digest generation device that generates digest scene information on a program while the broadcast signal of the broadcast program is received and recorded on a recording medium, and includes a feature amount calculation unit, a specific section end detection unit, and a digest scene information creation unit.
  • The feature amount calculation unit calculates, every time a broadcast signal of a predetermined unit time is received, at least one type of feature amount indicating a feature related to at least one of video and audio included in the received broadcast signal for that unit time.
  • The specific section end detection unit detects the start or end of a specific section by determining, every time the feature amount is calculated, whether or not a predetermined time point included in the signal portion of the received broadcast signal for which the feature amount has already been calculated is the start or end of the specific section.
  • The digest scene information creation unit generates digest scene information by determining, every time the feature amount is calculated, whether or not the broadcast signal in the sections of the program excluding the specific section is a digest scene, based on the feature amount.
  • The digest scene information creation unit includes a digest section detection unit that detects digest candidate sections in the received broadcast signal by determining, each time the feature amount is calculated for the broadcast signal for the unit time, whether or not the content included in the broadcast signal for the unit time is a digest scene, based on the feature amount. In addition, every time the specific section end detection unit detects a pair of the start and end of a specific section, the digest scene information creation unit determines whether or not the specific section from that start to that end overlaps a digest candidate section, and generates information indicating the digest candidate sections detected by the digest section detection unit excluding those that overlap the specific section.
  • The digest scene information creation unit includes a temporary storage unit that stores the calculated feature amounts for a predetermined time back from the latest calculation time point.
  • The digest scene information creation unit determines whether or not the time point of each feature amount stored in the temporary storage unit is included between the start and end of the specific section detected by the specific section end detection unit, and only when it is not included, generates digest scene information by detecting digest-scene content from the content included in the broadcast signal for the unit time.
  • According to the present invention, digest scene information excluding the specific section can be generated in parallel with the recording of the program.
  • Here, the specific section is, for example, a CM section.
  • the same effect as the first invention can be obtained.
  • The specific section is determined based on the time interval between specific section candidates, whereby the specific section can be determined more accurately.
  • The specific section candidates are scored based on a predetermined time interval, which makes it possible to evaluate how likely each candidate is to be the start or end of the specific section. Furthermore, because only specific section candidates with high scores are taken as the start or end of the specific section, a specific section candidate that merely happened to occur in the program can be prevented from being erroneously determined to be the start or end. As a result, digest scene information excluding the specific sections can be created more accurately.
  • A silent section is used as the specific section candidate. This makes it possible to detect a specific section by using the property that a specific section, such as a CM section, begins and ends with a silent section.
  • A scene change point at which the luminance information changes greatly is used as a specific section candidate. This allows the transition from the program to a specific section, where the luminance information changes greatly, to be treated as a specific section candidate, and as a result the specific section can be determined more accurately.
  • FIG. 1 is a block diagram showing a configuration of a digest generation apparatus 10 according to the first embodiment.
  • FIG. 2 is a diagram showing an example of data used in the present invention.
  • FIG. 3 is a flowchart showing digest scene list generation processing.
  • FIG. 4 is a flowchart showing details of the silent section detection process shown in step S4 of FIG. 3.
  • FIG. 5 is a flowchart showing details of the point evaluation process shown in step S16 of FIG. 4.
  • FIG. 6 is a flowchart showing details of the candidate section detection process shown in step S5 of FIG.
  • FIG. 7 is a flowchart showing details of the CM section determination processing shown in step S6 of FIG.
  • FIG. 8 is a diagram showing an example of CM section determination in the CM section determination processing.
  • FIG. 9 is a flowchart showing details of the digest scene list output process shown in step S7 of FIG. 3.
  • FIG. 10 is a block diagram showing a configuration of a digest generation apparatus 10 according to the second embodiment.
  • FIG. 11 is a diagram showing an example of data used in the present invention.
  • FIG. 12 is a flowchart showing a digest scene list generation process according to the second embodiment.
  • FIG. 13 is a flowchart showing details of the silent section detection process shown in step S66 of FIG. 12.
  • FIG. 14 is a block diagram showing a configuration of a conventional recording / reproducing apparatus.
  • FIG. 15 is a block diagram showing a configuration of a conventional recording / reproducing apparatus.
  • the present invention creates a digest scene list indicating the position of the digest scene in parallel with the recording of the program.
  • A scene in which the voice power level takes a locally large value, that is, an exciting scene, is employed as the digest scene. For this reason, scenes whose voice power level is equal to or higher than a predetermined value are extracted as digest candidate sections.
  • Sections whose voice power level is equal to or less than a predetermined value are extracted as silent sections, and sections in which silent sections appear at a predetermined interval (for example, every 15 seconds) are extracted as CM sections.
  • Then, by excluding the information corresponding to the CM sections from the information on the digest candidate sections, a digest scene list indicating the digest scenes in the program is created. In the present embodiment, the description assumes that the length of one CM section is 60 seconds at the maximum.
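  • The CM-section rule just described, silent sections recurring at the predetermined interval, bounded by the 60-second maximum, can be sketched as follows (illustrative Python only; the 0.5 s matching tolerance and the requirement of at least two linked silences are assumptions):

```python
def detect_cm_sections(silences, unit=15.0, tol=0.5, max_len=60.0):
    """Group silent time points whose spacing is roughly `unit`
    seconds into CM sections, each capped at `max_len` seconds.
    `silences` is a sorted list of times (s); returns (start, end)."""
    sections, run = [], [silences[0]] if silences else []
    for t in silences[1:]:
        # Extend the run if this silence sits ~unit after the last one
        # and the run would not exceed the maximum CM-section length.
        if abs(t - run[-1] - unit) <= tol and t - run[0] <= max_len:
            run.append(t)
        else:
            if len(run) >= 2:  # at least two linked silences (assumed)
                sections.append((run[0], run[-1]))
            run = [t]
    if len(run) >= 2:
        sections.append((run[0], run[-1]))
    return sections

# Hypothetical silences: those at 100-145 s form a 15 s grid and
# are grouped into one CM section; isolated silences are ignored.
cms = detect_cm_sections([5.0, 100.0, 115.0, 130.0, 145.0, 300.0])
```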
  • FIG. 1 is a block diagram showing a configuration of a digest generation apparatus according to the first embodiment of the present invention.
  • a digest generation device 10 includes a receiving unit 11, a feature quantity calculation unit 12, a silent section detection unit 13, a candidate section detection unit 14, a CM section determination unit 15, a digest list creation unit 16, a large-capacity recording medium 17, and a playback control unit 18.
  • the receiving unit 11 receives the broadcast radio wave and demodulates it into an image signal and an audio signal (hereinafter referred to as AV signal). In addition, the reception unit 11 outputs the demodulated AV signal to the feature amount calculation unit 12, the large-capacity recording medium 17, and the reproduction control unit 18.
  • the feature amount calculation unit 12 analyzes the AV signal to calculate a feature amount, and outputs the feature amount to the silent section detection unit 13 and the candidate section detection unit 14.
  • the feature value is used to determine the CM section and digest scene in the program.
  • since the CM section is determined based on the occurrence interval of silent sections as described above, the feature quantity for determining the CM section is an audio feature quantity such as the power level of the audio signal.
  • as feature quantities for determining a digest scene, for example, video feature quantities such as luminance information and motion vectors of the video signal, and audio feature quantities such as the power level and spectrum of the audio signal are applicable. In the present embodiment, the description assumes that the power level of the audio signal is used as the feature quantity for determining both the CM section and the digest scene.
  • the silent section detector 13 detects a silent section in the program based on the feature amount, and generates silent section information 24. Further, the silent section detection unit 13 outputs the silent section information 24 to the CM section determination unit 15.
  • Candidate section detection unit 14 detects a section (hereinafter referred to as a candidate section) that is a digest scene candidate in the program based on the feature amount, and generates candidate section information 25. Further, the candidate section detection unit 14 outputs the candidate section information 25 to the digest list creation unit 16.
  • the CM section determination unit 15 determines the CM section by looking at the time interval of the silent section based on the silent section information 24. Then, the CM section determination unit 15 outputs the determined CM section as CM section information 27 to the digest list creation unit 16.
  • based on the candidate section information 25 and the CM section information 27, the digest list creation unit 16 creates a digest scene list 28, which is information indicating the positions of the digest scenes. The digest list creation unit 16 then outputs the digest scene list 28 to the large-capacity recording medium 17 and the reproduction control unit 18.
  • the large-capacity recording medium 17 is a medium for recording the AV signal and the digest scene list 28, and is realized by a DVD, an HDD, or the like.
  • the reproduction control unit 18 performs reproduction control such as reproduction of the received AV signal and reproduction of the AV signal recorded on the large-capacity recording medium 17 and output to the monitor.
  • the feature quantity calculation unit 12, the silent section detection unit 13, the candidate section detection unit 14, the CM section determination unit 15, and the digest list creation unit 16 illustrated in FIG. 1 may typically be realized as an LSI, which is an integrated circuit.
  • the feature quantity calculation unit 12, the silent section detection unit 13, the candidate section detection unit 14, the CM section determination unit 15, and the digest list creation unit 16 may each be implemented as an individual chip, or a single chip may include some or all of them. Further, the method of circuit integration is not limited to LSI and may be realized by a dedicated circuit or a general-purpose processor.
  • comparison feature quantity information 21 (FIG. 2 (A)) is used to detect silent sections and the like, and holds the time information 211 of the immediately preceding frame and the immediately preceding feature value 212, in which the audio power level calculated by the feature quantity calculation unit 12 for that frame is stored.
  • Silence start edge information 22 (Fig. 2 (B)) has a silence start edge time, and is used to detect a silence interval.
  • Candidate start edge information 23 (Fig. 2 (C)) has a candidate start edge time, and is used to detect a candidate section.
  • the silent section information 24 (FIG. 2 (D)) stores the detection result of the silent section by the silent section detector 13.
  • the silent section information 24 consists of a set of the section number 241, the score 242, the start time 243, and the end time 244.
  • the section number 241 is a number for identifying each silent section.
  • the score 242 is a value that evaluates how likely the silent section is to be an end of a CM section. The higher the score, the higher the possibility that the silent section is an end of a CM section. Conversely, a low score indicates that the silent section merely happens to appear during the program (that is, it is not an end of a CM section).
  • the start time 243 and end time 244 are time information indicating the start time and end time of the silent section.
  • Candidate section information 25 (Fig. 2 (E)) stores the detection results of candidate sections by candidate section detector 14.
  • Candidate section information 25 consists of a set of candidate number 251, start time 252 and end time 253.
  • Candidate number 251 is a number for identifying each candidate section.
  • the start time 252 and end time 253 are time information indicating the start time and end time of the candidate section.
  • temporary CM start edge information 26 (FIG. 2 (F)) holds a temporary CM start edge time used by the CM section determination unit 15 to detect CM sections; the start time of a silent section that may be the start edge of a CM section is stored here.
  • CM section information 27 (FIG. 2 (G)) stores information on the CM sections detected by the CM section determination unit 15.
  • CM section information 27 consists of a set of the CM number 271, the CM start time 272, and the CM end time 273.
  • CM number 271 is a number for identifying each CM section.
  • CM start time 272 and CM end time 273 are time information indicating the start time and end time of the CM section.
  • the digest scene list 28 (FIG. 2 (H)) is a file indicating the time information of the sections that become digest scenes in the program. It consists of a set of the digest number 281, the digest start time 282, and the digest end time 283.
  • the digest number 281 is a number for identifying each digest section.
  • the digest start time 282 and the digest end time 283 are time information indicating the start time and end time of the digest section.
  • FIG. 3 is a flowchart showing the detailed operation of the digest scene list creation process according to the first embodiment.
  • the process shown in Fig. 3 is started by a recording instruction from the user.
  • the process shown in FIG. 3 is repeated with one frame as the processing unit.
  • the digest generation device 10 determines whether or not the end of recording has been instructed (step S1). As a result, when the end of recording is instructed (YES in step S1), the digest scene list creation process is terminated. On the other hand, when the end of recording is not instructed (NO in step S1), the feature quantity calculation unit 12 acquires a signal for one frame from the receiving unit 11 (step S2). Next, the feature quantity calculation unit 12 analyzes the acquired signal and calculates the audio power level (feature quantity) (step S3).
  • FIG. 4 is a flowchart showing details of the silent section detection process shown in step S4.
  • the silent section detection unit 13 determines whether or not the power level of the audio signal calculated in step S3 is equal to or less than a predetermined threshold (step S11). As a result, if it is equal to or less than the predetermined threshold (YES in step S11), the silent section detection unit 13 refers to the immediately preceding feature value 212, in which the feature value of the previous frame is stored, and determines whether or not that value is also equal to or less than the predetermined threshold (step S12).
  • that is, the silent section detection unit 13 determines the change in the audio power level between the current frame and the previous frame. As a result, if the previous value is not equal to or less than the predetermined threshold (NO in step S12), the silent section detection unit 13 stores the time information of the current frame in the silence start edge information 22 (step S13). It should be noted that immediately after the start of processing, nothing is stored in the immediately preceding feature value 212 yet; in this case, the processing proceeds on the assumption that the value is not equal to or less than the predetermined threshold. On the other hand, if it is equal to or less than the predetermined threshold (YES in step S12), the silent section is continuing, so the silent section detection process is terminated.
  • if the power level calculated in step S3 is not equal to or less than the predetermined threshold (NO in step S11), the silent section detection unit 13 refers to the immediately preceding feature value 212 and determines whether or not the power level stored there is equal to or less than the predetermined threshold (step S14). As a result, if it is equal to or less than the predetermined threshold (YES in step S14), the silent section that had been continuing ended at the previous frame.
  • therefore, the section from the silence start time stored in the silence start edge information 22 to the time indicated by the time information 211 of the previous frame is output to the silent section information 24 as one silent section (step S15).
  • next, the silent section detection unit 13 performs a point evaluation process (step S16), described later, on the silent section output in step S15.
  • in the point evaluation process of step S16, it is determined whether or not the times 15 seconds, 30 seconds, and 60 seconds before the last detected silent section fall within silent sections, and if so, one point is added to the information of each of the silent sections concerned.
  • This makes it possible to increase the score for silent sections that are considered to be the beginning or end of any CM.
  • this uses the property that both ends of a CM section are silent sections and that the length of one CM is 15 seconds, 30 seconds, or 60 seconds. By assigning points, the process evaluates how likely each silent section is to be an end of a CM section; as a result, it is possible to distinguish between silent sections that occur occasionally during a program and silent sections that indicate CM boundaries.
  • first, the silent section detection unit 13 acquires the start time 243 of the silent section stored last in the silent section information 24. Then, the silent section detection unit 13 searches the silent section information 24 to determine whether or not there is a silent section at the time 15 seconds before that time (step S21). As a result, if such a silent section is found (YES in step S21), the silent section detection unit 13 adds 1 to the score 242 of each of the last stored silent section and the silent section found in step S21 (step S22). On the other hand, if no silent section 15 seconds before can be found (NO in step S21), the silent section detection unit 13 proceeds to step S23 without performing step S22.
  • next, the silent section detection unit 13 determines whether or not the time 30 seconds before falls within a silent section, as in step S21 (step S23). As a result, if such a section is found (YES in step S23), the silent section detection unit 13 adds 1 to the score 242 of each of the last stored silent section and the silent section found this time (step S24). On the other hand, if no silent section 30 seconds before can be found (NO in step S23), the silent section detection unit 13 proceeds to step S25 without performing step S24. In step S25, the silent section detection unit 13 determines whether or not there is a silent section 60 seconds before, as in steps S21 and S23.
  • if such a section is found, the silent section detection unit 13 adds 1 to the score 242 as in steps S22 and S24. This completes the point evaluation process of step S16.
  • in the above description, the silent section information 24 is searched based on the start time 243 of each silent section. However, the present invention is not limited to this; the search may instead use the end time 244 of the silent section, or any time point within the silent section, as the reference.
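The point evaluation of steps S21 to S25 can be sketched roughly as follows. This is illustrative only; the matching tolerance and the data layout are assumptions, not from the patent.

```python
# Hypothetical sketch of the point evaluation process (steps S21-S25).
# When a new silent section is stored, the times 15, 30, and 60 seconds
# before its start are checked; each matching earlier silent section and
# the new section both gain one point.

TOL = 0.5  # seconds; assumed matching tolerance, not specified in the text

def evaluate_points(silent_sections):
    """silent_sections: chronological list of dicts with 'start' and 'score'."""
    last = silent_sections[-1]
    for offset in (15, 30, 60):
        target = last["start"] - offset
        for s in silent_sections[:-1]:
            if abs(s["start"] - target) <= TOL:
                s["score"] += 1      # point for the earlier silent section
                last["score"] += 1   # point for the newly detected one
                break

secs = [{"start": 0.0, "score": 0}, {"start": 15.0, "score": 0}]
evaluate_points(secs)
print(secs)  # both sections now carry one point each
```

A silent section that lines up with others at CM-like intervals accumulates points each time, which is why CM boundaries end up with high scores.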
  • next, the candidate section detection process of step S5 is described. This process detects a section whose audio power level is equal to or higher than a predetermined threshold as a digest scene candidate section.
  • FIG. 6 is a flowchart showing details of the candidate section detection process shown in step S5.
  • first, the candidate section detection unit 14 determines whether or not the audio power level calculated in step S3 is equal to or higher than a predetermined threshold (step S31). As a result, if it is equal to or higher than the predetermined threshold (YES in step S31), the candidate section detection unit 14 then determines whether or not the immediately preceding feature value 212 is also equal to or higher than the predetermined threshold (step S32). If it is not (NO in step S32), the candidate section detection unit 14 stores the time information of the frame acquired in step S2 (the frame currently being processed) in the candidate start edge information 23 (step S33). Immediately after the start of processing, nothing is stored yet in the immediately preceding feature value 212; in this case, the processing proceeds on the assumption that the value is not equal to or higher than the predetermined threshold. On the other hand, if it is equal to or higher than the predetermined threshold (YES in step S32), the candidate section is continuing, and the candidate section detection unit 14 advances the process to step S36.
  • if the audio power level calculated in step S3 is not equal to or higher than the predetermined threshold (NO in step S31), the candidate section detection unit 14 refers to the immediately preceding feature value 212 and determines whether or not the power level stored there is equal to or higher than the predetermined threshold (step S34). As a result, if it is equal to or higher than the predetermined threshold (YES in step S34), the candidate section that had been continuing ended at the previous frame. Therefore, the section from the candidate start time stored in the candidate start edge information 23 to the time indicated by the time information 211 of the previous frame is output to the candidate section information 25 as one candidate section (step S35).
  • on the other hand, if the value of the immediately preceding feature value 212 is not equal to or higher than the predetermined threshold (NO in step S34), a non-candidate section is continuing, so the candidate section detection unit 14 advances the process to step S36. It should be noted that immediately after the start of processing, nothing is stored in the immediately preceding feature value 212, so the processing proceeds on the assumption that the value is not equal to or higher than the predetermined threshold.
  • in step S36, the candidate section detection unit 14 stores the audio power level acquired in step S3 in the immediately preceding feature value 212 (step S36). This completes the candidate section detection process.
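The per-frame edge detection of FIG. 6 can be sketched as a batch version over a list of frames. This is illustrative only; the names and sample values are assumptions, not from the patent.

```python
# Minimal sketch of the candidate section detection of FIG. 6: a candidate
# section opens when the audio power level rises to or above the threshold
# and closes when it falls below it again.

def detect_candidate_sections(frames, threshold):
    """frames: chronological list of (time, power_level). Returns (start, end) pairs."""
    sections = []
    start = None          # corresponds to the candidate start edge information 23
    prev_time = None      # corresponds to the time information 211 of the previous frame
    for t, level in frames:
        if level >= threshold and start is None:
            start = t     # rising edge: remember the start time (step S33)
        elif level < threshold and start is not None:
            sections.append((start, prev_time))  # falling edge: close section (step S35)
            start = None
        prev_time = t
    return sections

frames = [(0, 1), (1, 9), (2, 10), (3, 2), (4, 8), (5, 3)]
print(detect_candidate_sections(frames, 8))  # [(1, 2), (4, 4)]
```

The silent section detection of FIG. 4 is the same state machine with the comparison inverted (level at or below the threshold opens a section).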
  • FIG. 7 is a flowchart showing details of the CM section determination process shown in step S6.
  • first, the CM section determination unit 15 searches the silent section information 24 and determines whether or not there is a silent section with a score 242 equal to or greater than a predetermined value (for example, three points) at the time 60 seconds before the current frame (step S41). That is, it is determined whether or not the time 60 seconds before falls within such a silent section.
  • the reason for searching for a silent section 60 seconds before is that in this embodiment the length of one CM section is assumed to be 60 seconds at the maximum. Accordingly, if the length of one CM section were assumed to be 30 seconds at the maximum, the search time would be 30 seconds.
  • if no such silent section exists (NO in step S41), the CM section determination unit 15 advances the process to step S46 described later.
  • if such a silent section is found (YES in step S41), the CM section determination unit 15 determines whether or not there is data in the provisional CM start edge information 26 (step S42). As a result, if there is no data in the provisional CM start edge information 26 (NO in step S42), the CM section determination unit 15 outputs the time information of the found silent section to the provisional CM start edge information 26 (step S49). On the other hand, if data already exists (YES in step S42), the CM section determination unit 15 acquires the provisional start time from the provisional CM start edge information 26 and outputs it to the CM section information 27 as the CM start time 272 associated with a CM number 271.
  • in addition, the end time of the silent section found in step S41 (that is, the silent section 60 seconds before) is output to the CM section information 27 as the CM end time 273 (step S43).
  • the CM section determination unit 15 sets the D list creation flag, which is a flag for creating a digest scene list, which will be described later, to ON (step S44).
  • the CM section determination unit 15 outputs the end time of the silent section information 60 seconds before as the start time of the provisional CM start end information 26 (step S45).
  • next, the CM section determination unit 15 determines whether or not 120 seconds have elapsed from the time stored in the provisional CM start edge information 26 (step S46). In other words, if no silent section with a score 242 equal to or greater than the predetermined value appears within 120 seconds after a silent section that may be the start of a CM is found, that silent section is judged not to be the start of a CM.
  • the reason the criterion is 120 seconds is that in this embodiment one CM section is assumed to be 60 seconds at the maximum. In other words, even after a start candidate for a CM section is found and a silent section is found 60 seconds later, an additional 60 seconds are required to determine whether or not that silent section is the end of the CM section.
  • if 120 seconds or more have elapsed as a result of the determination in step S46 (YES in step S46), the CM section determination unit 15 clears the provisional CM start edge information 26 (step S47). Subsequently, the CM section determination unit 15 sets the D list creation flag to ON (step S48). On the other hand, if 120 seconds or more have not elapsed (NO in step S46), the process is terminated as it is. This is the end of the CM section determination process.
  • points A to G are silent sections appearing at 15-second intervals, each corresponding to an end of a CM section.
  • in FIG. 8, point A is set as the temporary CM start point at the time of point E (60 seconds). At the time of point F (75 seconds), the section from point A to point B is confirmed as a CM section, and the time information of that section is output to the CM section information 27.
  • at the same time, point B becomes the new provisional CM start point. Similarly, the section from point B to point C is confirmed as a CM section and output to the CM section information 27, and point C in turn becomes the new provisional CM start point.
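The chaining of provisional CM start points illustrated by points A to G can be sketched in a simplified batch form. This sketch assumes the silent sections and their scores are already available and omits the 60-second detection delay of the actual per-frame process; all names are illustrative.

```python
# Simplified sketch of CM-section chaining: each high-score silent section
# that follows the provisional start within the 120-second limit confirms a
# CM section, and then becomes the next provisional start itself.

def chain_cm_sections(silent_times, scores, min_score=3, timeout=120):
    """silent_times: chronological times (s) of silent sections; scores: time -> score."""
    cm_sections = []
    provisional = None  # plays the role of the temporary CM start edge information 26
    for t in silent_times:
        if scores.get(t, 0) < min_score:
            continue  # low-score silent sections are ignored (step S41)
        if provisional is not None and t - provisional <= timeout:
            cm_sections.append((provisional, t))  # confirm provisional..t as a CM section
        provisional = t  # the confirmed end becomes the new provisional start
    return cm_sections

# Points A-G at 15-second intervals, all with score 3:
times = [0, 15, 30, 45, 60, 75, 90]
print(chain_cm_sections(times, {t: 3 for t in times}))
```

Run on the A-G example, this yields the successive 15-second CM sections A-B, B-C, and so on, matching the behaviour described above.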
  • FIG. 9 is a flowchart showing details of the digest scene list output process performed in step S7 above.
  • the digest list creation unit 16 determines whether or not the D list creation flag is on (step S51). As a result, if it is not on (NO in step S51), the digest list creation unit 16 ends the process as it is. On the other hand, if it is on (YES in step S51), the digest list creation unit 16 determines whether or not a new candidate section has been added to the candidate section information 25 since the digest scene list output process has been performed previously. (Step S52).
  • as a result, if no candidate section has been newly added (NO in step S52), the digest list creation unit 16 ends the digest scene list output process as it is. On the other hand, if a candidate section has been newly added since the digest scene list output process was last performed (YES in step S52), the digest list creation unit 16 acquires the information of one of the newly added candidate sections (step S53).
  • the digest list creation unit 16 determines whether or not the candidate section is included in the CM section with reference to the CM section information 27 (step S54). As a result, if it is not within the CM section (NO in step S54), the digest list creation unit 16 outputs information on the candidate section to the digest scene list 28 (step S55). On the other hand, if it is within the CM section (YES in step S54), the process proceeds to step S56. In other words, if the candidate section is also a CM section, the candidate section is not used as a digest scene.
  • next, the digest list creation unit 16 determines whether or not the above-described determination process has been performed for all of the newly added candidate sections (step S56). As a result, if an unprocessed added candidate section still remains (NO in step S56), the digest list creation unit 16 returns to step S53 and repeats the process. On the other hand, when all the added candidate sections have been processed, the digest list creation unit 16 sets the D list creation flag to OFF (step S57) and ends the digest scene list output process. This is the end of the digest scene list creation process according to the first embodiment.
  • as described above, in the first embodiment, digest candidate sections whose audio power level is equal to or higher than a predetermined value are extracted, and those corresponding to CM sections are excluded from the digest candidate sections. As a result, a digest scene list in which only the digest scenes within the program sections are extracted can be created in parallel with recording.
  • in the above description, the silent section detection unit 13 performs the silent section detection process. However, the present invention is not limited to this; the CM section determination unit 15 may detect the silent sections prior to the CM section determination process.
  • the digest scene detection is not limited to the above-described method using the audio power level. For example, a method limited to a specific program genre such as sports may be used, in which a replay (slow-motion) scene is identified and the last few cuts before it are detected as an exciting scene; a method that detects important scenes by combining text information attached to the program with video/audio signal features is also possible. Of course, any method may be used as long as it detects digest scenes; the method is not limited to these examples.
  • likewise, the detection of the CM section is not limited to the method using the audio power level as described above. For example, scene change points of the video may be detected from the luminance information of the video, and the CM section may be determined based on their occurrence interval. In this case, the luminance information of the video may be used as the feature quantity.
  • the above-described digest list may be used to catch up and reproduce the program during program recording.
  • the user instructs catch-up reproduction.
  • when catch-up reproduction is instructed, the playback control unit 18 determines whether or not two minutes have elapsed since the start of recording. If two minutes or more have elapsed, only the digest scenes are played back using the digest list generated by the above-described processing. On the other hand, if two minutes have not yet elapsed, the playback control unit 18 performs fast playback (for example, playback at 1.5 times normal speed). After that, if the fast playback catches up with the actual broadcast, the fast playback may be stopped and the output switched to the real-time broadcast.
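The two-minute branching described above amounts to a simple decision. The sketch below is illustrative only; the mode names are made up, and the 2-minute threshold and 1.5x speed are the values given in the text.

```python
# Hypothetical sketch of the catch-up reproduction branch: within the first
# two minutes of recording, fast playback at 1.5x is used; after that,
# digest-scene playback based on the list built so far.

def choose_playback_mode(seconds_since_recording_start):
    if seconds_since_recording_start >= 120:
        return "digest"     # play back only the listed digest scenes
    return "fast_1.5x"      # fast playback until it catches up with live

print(choose_playback_mode(60))   # fast_1.5x
print(choose_playback_mode(300))  # digest
```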
  • subsequent playback may be left to the user's instruction.
  • the digest scenes may be played back normally, or playback may be performed with some scenes thinned out.
  • the playback control unit 18 plays back the digest scenes so as to finish within 10 minutes, based on the digest scene list created at that time. Viewing after the digest scenes have been played back waits for the user's instruction.
  • alternatively, the portion of the program broadcast during the 10-minute digest scene playback may itself be played back in thinned-out form in response to a user instruction.
  • the playback control unit 18 ends the playback process in response to a user instruction.
  • the digest scene list is generated in parallel with the recording, so that digest playback can be performed at any timing during recording.
  • the digest scene information is created by subtracting the CM section from the digest candidate section.
  • the section to be subtracted from the digest candidate section is not limited to the CM section.
  • a section where a still image is displayed may be detected and subtracted.
  • for example, when a program contains material that cannot be broadcast, the program is edited before broadcast so that a still image (for example, one displaying "Cannot be displayed") is shown instead. Therefore, a feature quantity of a still image (for example, a motion vector of 0 in the video) is detected, and a still image section in which the still image is continuously displayed is detected.
  • the digest scene information may be created by subtracting the still image section (that is, the broadcast-prohibited section) from the digest candidate section. If a section having a predetermined feature, such as a CM section or a still image section, is detected as a specific section and the specific section is subtracted from the digest candidate section, a digest list in which only digest scenes are appropriately extracted can be generated.
  • FIG. 10 is a block diagram showing a configuration of the digest generation device 30 according to the second exemplary embodiment of the present invention.
  • the feature quantity calculation unit 12 associates the calculated feature quantity with the time information and stores them in the temporary storage unit 31 as the temporarily accumulated feature quantity 36.
  • the temporary storage unit 31 has a capacity to hold frame feature values and time information for a predetermined time.
  • the digest list creation unit 32 detects a digest scene from a section other than the CM section based on the feature amount stored in the CM section information 27 and the temporary storage unit 31, and creates the digest scene list 28. Except for these, the digest generation device 30 according to the present embodiment basically has the same configuration as that of the first embodiment described above. Therefore, the same portions are denoted by the same reference numerals, and detailed description thereof is omitted.
  • in the second embodiment, the temporarily accumulated feature quantity 36, the information 37 immediately before the digest, and the digest start edge information 38 are used.
  • the temporarily accumulated feature quantity 36 is used for detecting a digest scene, and has time information 361 and a feature quantity 362.
  • the time information 361 stores frame time information.
  • the feature quantity 362 stores the feature quantity (in this embodiment, the audio power level) calculated by the feature quantity calculation unit 12 and used for digest scene detection.
  • Information immediately before digest 37 (FIG. 11 (B)) is also used for detecting a digest scene, and has time information 371 immediately before digest and feature amount 372 immediately before digest.
  • the time information immediately before the digest 371 stores the time information related to the frame immediately before the current frame to be processed.
  • the feature value 372 immediately before the digest stores the feature value (audio power level) of the immediately preceding frame.
  • the digest start end information 38 (FIG. 11C) has a digest start end time and is used to detect a digest scene.
  • FIG. 12 is a flowchart showing the detailed operation of the digest scene list creation process according to the second embodiment.
  • the processing of steps S61 and S62 is the same as the processing of steps S1 and S2 described with reference to FIG. 3 in the first embodiment, and detailed description thereof is therefore omitted here.
  • the feature quantity calculation process of step S63 is the same as the process of step S3 described with reference to FIG. 3 in the first embodiment, except that the calculated feature quantity is also output to the temporary storage unit 31; detailed description is therefore omitted.
  • the silent section detection process of step S64 is the same as the process of step S4 described with reference to FIG. 4 in the first embodiment, except that the feature quantity (audio power level) calculated in step S63 is stored in the immediately preceding feature value 212 at the end of the process; detailed description is therefore omitted.
  • in step S65, the CM section determination unit 15 performs the CM section determination process and creates the CM section information. Since the operation in step S65 is the same as the process of step S6 described with reference to FIG. 7 in the first embodiment, detailed description thereof is omitted.
  • next, in step S66, the digest list creation unit 32 performs the digest list output process.
  • FIG. 13 is a flowchart showing details of the digest list output process shown in step S66.
  • the digest list creation unit 32 determines whether or not 120 seconds of frame feature values have been stored in the temporarily accumulated feature quantity 36 (step S71).
  • this is because the maximum length of a CM section is assumed to be 60 seconds; for example, when a 60-second CM section occurs at the beginning of the program, it takes up to 120 seconds to determine the CM section. Therefore, this process is not performed for at least 120 seconds from the start of the program.
  • if 120 seconds of feature values have not yet been accumulated as a result of the determination in step S71 (NO in step S71), the digest list output process ends. On the other hand, if they have been accumulated (YES in step S71), the digest list creation unit 32 acquires the oldest time information 361 and feature quantity 362 from the temporarily accumulated feature quantity 36 (step S72).
  • next, the digest list creation unit 32 refers to the CM section information and determines whether or not the time indicated by the time information 361 acquired in step S72 falls within a CM section (step S73). If it falls within a CM section (YES in step S73), the digest list creation unit 32 ends the digest list output process. On the other hand, if it does not (NO in step S73), the digest list creation unit 32 determines whether or not the value of the feature quantity 362 is equal to or greater than a predetermined value (step S74).
  • if it is equal to or greater than the predetermined value (YES in step S74), the digest list creation unit 32 determines whether or not the feature value 372 immediately before the digest is also equal to or greater than the predetermined value (step S75). That is, the change in the audio power level between the frame acquired in step S72 and the frame immediately before it is determined. As a result, if the feature value 372 immediately before the digest is not equal to or greater than the predetermined value (NO in step S75), the time information of the frame is saved in the digest start edge information 38 (step S76). At the time of the first pass, nothing is yet stored in the feature value 372 immediately before the digest.
• if the result of the determination in step S75 is that the feature quantity 372 immediately before the digest is greater than or equal to the predetermined value (YES in step S75), the digest list creation unit 32 proceeds to step S77 without performing the process of step S76.
• if the result of the determination in step S74 is that the value of the feature quantity 362 is not greater than or equal to the predetermined value (NO in step S74), the digest list creation unit 32 determines whether or not the feature quantity 372 immediately before the digest is greater than or equal to the predetermined value (step S78). If it is not (NO in step S78), the digest list creation unit 32 ends the digest list generation process.
• if, in step S78, the feature quantity 372 immediately before the digest is greater than or equal to the predetermined value (YES in step S78), the digest scene that had been continuing ended at the previous frame. Therefore, the section from the digest start time indicated by the digest start end information 38 to the time indicated by the time information 371 immediately before the digest is output to the digest scene list 28 as one digest section (step S79).
• finally, the digest list creation unit 32 saves the audio power level of the frame as the feature quantity 372 immediately before the digest (step S77). This completes the digest scene list creation process according to the second embodiment.
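The per-frame flow of steps S71 through S79 above can be sketched as follows. This is an illustrative sketch only: the class name, the one-feature-value-per-second timing, and the numeric threshold are assumptions, while the 120-second buffer and the step structure follow the description.

```python
from collections import deque

class DigestListCreator:
    """Sketch of the per-frame digest list update (steps S71-S79).

    Assumptions (illustrative): one feature value arrives per second,
    CM sections are supplied as (start, end) second pairs, and a frame
    belongs to a digest scene when its audio power level >= THRESHOLD.
    """
    BUFFER_SECONDS = 120   # delay needed before a CM judgment is settled
    THRESHOLD = 0.5        # illustrative audio power threshold

    def __init__(self):
        self.buffer = deque()       # temporarily stored feature quantities
        self.prev_power = None      # feature quantity "immediately before"
        self.digest_start = None    # digest start end information
        self.digest_list = []       # resulting (start, end) digest sections

    @staticmethod
    def in_cm(t, cm_sections):
        return any(s <= t < e for s, e in cm_sections)

    def on_frame(self, t, power, cm_sections):
        self.buffer.append((t, power))
        if len(self.buffer) < self.BUFFER_SECONDS:          # S71: wait 120 s
            return
        ft, fp = self.buffer.popleft()                      # S72: oldest frame
        if self.in_cm(ft, cm_sections):                     # S73: skip CM frames
            return
        prev_high = (self.prev_power is not None
                     and self.prev_power >= self.THRESHOLD)
        if fp >= self.THRESHOLD:                            # S74
            if not prev_high:                               # S75 -> S76
                self.digest_start = ft                      # digest scene begins
        elif prev_high:                                     # S78 -> S79
            self.digest_list.append((self.digest_start, ft))
        self.prev_power = fp                                # S77
```

Feeding 300 one-second frames whose power is high only from 130 s to 139 s yields a single digest section (130, 140) once the 120-second delay has elapsed; if (130, 140) is judged to be a CM section, no digest section is produced.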
• as described above, the CM section can be detected in parallel with the recording of the program, and digest scenes can be detected from the program sections other than the CM sections. This eliminates the need to perform a separate digest scene list generation process after the recording of the program is completed, removing that processing time and providing a comfortable viewing environment for the user.
• each of the above-described embodiments may be provided in the form of a recording medium storing a program to be executed by a computer.
• in that case, the digest generation program stored in the recording medium is read, and the digest generation device (more precisely, a control unit, not shown) may perform the processes shown in FIGS.
• a digest generation device, a digest generation method, a recording medium storing a digest generation program, and an integrated circuit used in the digest generation device according to the present invention can generate digest scene information while recording a program, and are therefore useful for applications such as HDD recorders and DVD recorders.

Abstract

A feature amount calculation unit (12) calculates a feature amount from a received AV signal. A soundless section detection unit (13) detects a section where the audio power level is below a predetermined value as a soundless section. A candidate section detection unit (14) detects a section where the audio power level is above a predetermined value as a digest scene candidate section. A CM section judgment unit (15) judges CM sections according to the time intervals between the soundless sections. A digest list creation unit (16) deletes the sections corresponding to the judged CM sections from the digest candidate sections, thereby generating digest scene information for the program sections excluding the CM sections.
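The pipeline the abstract describes can be sketched as follows, assuming one audio power sample per second. The function names, both thresholds, and the 15/30-second CM spacing rule (taken from the background discussion of standard CM lengths) are illustrative assumptions.

```python
def detect_silent_points(power, threshold=0.1):
    """Soundless section detection: times whose audio power level is
    below a predetermined value."""
    return [t for t, p in enumerate(power) if p < threshold]

def judge_cm_sections(silent_points, spacings=(15, 30)):
    """CM section judgment: a span between two soundless points is
    judged to be a CM when their time interval matches a standard CM
    length (15 or 30 seconds)."""
    sections = []
    for i, s in enumerate(silent_points):
        for e in silent_points[i + 1:]:
            if e - s in spacings:
                sections.append((s, e))
    return sections

def detect_candidates(power, threshold=0.8):
    """Candidate section detection: times whose audio power level is
    above a predetermined value are digest scene candidates."""
    return [t for t, p in enumerate(power) if p >= threshold]

def digest_scene_info(power):
    """Delete candidates inside judged CM sections, leaving digest
    scene information for the program sections excluding the CMs."""
    cms = judge_cm_sections(detect_silent_points(power))
    in_cm = lambda t: any(s <= t <= e for s, e in cms)
    return [t for t in detect_candidates(power) if not in_cm(t)]
```

For example, with silence at 0 s and 15 s and loud audio at 5 s and 40 s, the span (0, 15) is judged to be a CM, so only the candidate at 40 s survives.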

Description

Digest generation device, digest generation method, recording medium storing a digest generation program, and integrated circuit used in a digest generation device
Technical Field
[0001] The present invention relates to the generation of digest scenes, and more particularly to digest scene generation in which video and audio feature quantities are calculated from television broadcasts and the like and used to determine specific important scenes.
Background Art
[0002] Conventionally, there are digest (summary) generation devices that calculate feature quantities of video and audio from television broadcasts and the like and use them to determine important scenes. In such devices, the following scheme is commonly used for digest generation: the video and audio feature quantities of one program are first calculated from an AV signal that has already been recorded on a recording medium, CM sections are detected based on those feature quantities, and time information such as a playlist for digest playback is calculated from the sections other than the CM sections.
[0003] The configuration of a conventional digest generation device that employs this scheme will be described with reference to FIG. 14, which shows an example configuration of a digest generation device that generates a digest excluding CM sections. In FIG. 14, a receiving unit 101 receives a broadcast radio wave and demodulates it into an audio/video signal (hereinafter, AV signal). A large-capacity storage medium 102, such as an HDD, records the received AV signal. A feature quantity extraction unit 103 calculates, from the AV signal stored in the large-capacity storage medium 102, the feature quantities required for digest generation (hereinafter, digest feature quantities) and the feature quantities required for CM detection (hereinafter, CM feature quantities). Here, possible digest feature quantities include scene change detection results based on motion vectors or luminance information, audio power, and text information attached to the program; possible CM feature quantities include scene change detection results based on luminance information and information on silent audio sections. A CM detection unit 104 detects CM sections (time information of their start and end) based on the calculated CM feature quantities and outputs them to a digest detection unit 105. As the CM section detection method, either a video scene change is detected from the luminance information of the video and a span is judged to be a CM section if the detected time interval is a fixed length (15 or 30 seconds), or silent audio parts are detected and their time intervals are examined in the same way. The digest detection unit 105 detects digest scenes outside the CM sections based on the digest feature quantities and the CM section information output from the CM detection unit 104, and outputs the detected digest scenes (time information of their start and end) to a playback control unit 106 as digest information. Known digest scene detection methods include, for sports broadcasts and the like, identifying slow-motion scenes (repeated slow-motion scenes) from the motion vectors of the video and detecting the few cuts immediately before them as exciting scenes (for example, Patent Document 1); detecting scenes where the audio power information takes a locally large value as exciting scenes (for example, Patent Document 2); and detecting important scenes by combining text information attached to the program with feature quantities of the video and audio signals (for example, Patent Document 3). The playback control unit 106 reads the AV signal from the large-capacity storage medium 102 and performs digest playback based on the digest information. With such a configuration, when a user views a recorded program, that is, when the AV signal stored in the large-capacity storage medium 102 is played back, digest scene information can be created from the program sections excluding the CM sections and digest playback can be performed.
There is also a scheme in which feature quantities are calculated in parallel with the recording of a program and stored on the recording medium. FIG. 19 shows an example configuration of a digest generation device that detects digest scene candidates in real time while calculating feature quantities in parallel with the recording process, stores them together with the CM feature quantities in a large-capacity storage means, and at playback time detects the CM sections and excludes from the digest scene candidates those contained in the CM sections, thereby generating correct digest information. In FIG. 19, the receiving unit 101 records the received AV signal on the large-capacity storage medium 102 and also outputs the AV signal to the feature quantity extraction unit 103. The feature quantity extraction unit 103 calculates the CM feature quantities and stores them in the large-capacity storage medium 102; at the same time, it outputs the digest feature quantities, such as the audio power level, to the digest detection unit 105. The digest detection unit 105 analyzes the digest feature quantities, detects, for example, scenes whose audio power level is equal to or higher than a predetermined threshold as digest scene candidates, and stores the detected scenes in the large-capacity storage medium 102 as digest candidate information. In other words, scenes that are digest candidates are detected in parallel with the program recording, and the digest candidate information (time information) and the CM feature quantities are recorded in the large-capacity storage medium 102. As for CM detection, since the start and end of a CM section cannot be identified in real time, only the CM feature quantities needed for the later detection process are recorded here. Then, when the recorded program is played back at the user's instruction, the CM detection unit 104 reads the CM feature quantities from the large-capacity storage medium 102, detects the CM sections, and outputs the detection results to a CM section removal unit 107 as CM section information. The CM section removal unit 107 deletes the portions corresponding to the CM sections from the digest candidate information read from the large-capacity storage medium 102 and creates the digest information. That is, at recording time, scenes whose audio power level is, for example, equal to or higher than a predetermined value are provisionally detected, CM sections included, and recorded as digest candidate information; after recording ends, for example when a playback start instruction is received, the feature quantities of the entire recorded program are analyzed to detect the CM sections, and the digest sections within the program sections are extracted by subtracting the CM sections from the digest candidates.
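The CM section removal performed by the CM section removal unit 107, subtracting the judged CM sections from the digest candidate sections recorded during recording, reduces to interval subtraction. A minimal sketch, with illustrative names:

```python
def remove_cm_sections(candidates, cm_sections):
    """Subtract CM sections from digest candidate sections.

    Both arguments are lists of (start, end) time pairs; the result
    keeps only the parts of each candidate lying outside every CM
    section. Names and representation are illustrative assumptions.
    """
    result = []
    for start, end in candidates:
        pieces = [(start, end)]
        for cs, ce in cm_sections:
            next_pieces = []
            for s, e in pieces:
                if ce <= s or e <= cs:      # no overlap: keep whole piece
                    next_pieces.append((s, e))
                else:                        # clip out the CM overlap
                    if s < cs:
                        next_pieces.append((s, cs))
                    if ce < e:
                        next_pieces.append((ce, e))
            pieces = next_pieces
        result.extend(pieces)
    return result
```

For example, a candidate section (10, 50) with a CM judged at (20, 30) is split into (10, 20) and (30, 50), while a candidate lying entirely inside a CM section is removed.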
Patent Document 1: JP 2004-128550 A
Patent Document 2: JP H10-039890 A
Patent Document 3: JP 2001-119649 A
Disclosure of the Invention
Problems to Be Solved by the Invention
However, the digest generation devices described above have the following problems. In the first scheme, the processes of feature quantity calculation, CM section detection, digest scene detection, and digest information creation are performed after recording ends, for example at the timing when the user instructs digest playback to start. Consequently, after the digest playback start instruction, a wait for the above processing occurs before playback actually starts. The second scheme calculates feature quantities and detects digest candidate scene information during recording, so it reduces the time spent on the feature quantity calculation that the first scheme performs at playback instruction time. However, since the start and end of a CM section cannot be determined in real time, CM section detection is still performed after recording ends (for example, when a playback start instruction is given). Therefore, even with this scheme, a processing wait for digest information creation occurs. In particular, consumer devices such as typical DVD recorders are generally equipped with CPUs having only about one-tenth the performance of a personal computer, so the processing wait becomes longer, and the discomfort caused by the wait gives the user a bad impression, such as poor usability.
[0006] Therefore, an object of the present invention is to provide a digest generation device that has no processing wait for generating the digest information of a program after the recording of the program ends.

Means for Solving the Problems
[0007] In order to achieve the above object, the present invention employs the following configurations.
[0008] A first aspect is a digest generation device that generates digest scene information about a broadcast program when the broadcast signal of the program is received and recorded on a recording medium, comprising a feature amount calculation unit, a specific section end detection unit, and a digest scene information creation unit. Each time a broadcast signal of a predetermined unit time is received, the feature amount calculation unit calculates, from the received unit time's worth of the broadcast signal, at least one type of feature amount indicating a feature of at least one of the video and the audio included in the broadcast signal. The specific section end detection unit detects a time point that becomes the start or end of a specific section by determining, each time a feature amount is calculated, whether or not a predetermined time point included in the signal portion of the received broadcast signal for which feature amounts have already been calculated is the start or end of a specific section. Each time a feature amount is calculated, the digest scene information creation unit determines, based on the feature amount, whether or not the broadcast signal in the sections of the entire program excluding the specific sections is a digest scene, and generates digest scene information.
[0009] In a second aspect based on the first aspect, the digest scene information creation unit includes a digest section detection unit that detects digest candidate sections in the received AV signal by determining, each time a feature amount is calculated for a unit time's worth of the AV signal, whether or not the content included in that unit time's worth of the AV signal is a digest scene, based on the feature amount. Further, each time a pair of the start and end of a specific section is detected by the specific section end detection unit, the digest scene information creation unit determines whether or not the specific section from that start to that end overlaps a digest candidate section, and generates, as the digest scene information, information indicating the digest candidate sections detected by the digest section detection unit excluding those that overlap the specific section.
[0010] In a third aspect based on the first aspect, the digest scene information creation unit includes a temporary storage unit that stores the calculated feature amounts for a predetermined time back from the latest calculation time. Each time a feature amount is calculated, the digest scene information creation unit determines whether or not the time point of the feature amount stored in the temporary storage unit falls between the start and the end of a specific section detected by the specific section end detection unit, and only when it does not, detects the content that is a digest scene among the content included in the unit time's worth of the broadcast signal and generates the digest scene information.
[0011] In a fourth aspect based on the second aspect, the feature amount calculation unit calculates first and second feature amounts, the specific section end detection unit determines the start or end of a specific section based on the first feature amount, and the digest section detection unit detects the digest candidate sections based on the second feature amount.
[0012] In a fifth aspect based on the first aspect, the specific section end detection unit includes a specific section candidate detection unit that, when a feature amount satisfies a predetermined condition, detects the section consisting only of feature amounts satisfying that condition as a specific section candidate, and a specific section determination unit that detects candidates for the start or end of a specific section based on the time differences between specific section candidates within the program.
[0013] In a sixth aspect based on the fifth aspect, each time a specific section candidate is detected, if the time point a predetermined time before the detected specific section candidate is included in an already-detected specific section candidate, the specific section determination unit detects that time point a predetermined time before as the start of a specific section and the newly detected specific section candidate as the end of the specific section.
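The sixth aspect's rule can be sketched as follows, assuming specific section candidates are detected as single integer time points (for example, silent points) and taking 15 seconds as the predetermined time; both assumptions are illustrative.

```python
PREDETERMINED = 15  # seconds; e.g. a standard CM length (assumption)

def on_candidate_detected(t, detected, sections):
    """When a specific section candidate is detected at time t, check
    the time point PREDETERMINED seconds earlier. If it was already
    detected as a candidate, record it as the start and t as the end
    of a specific (CM) section. Integer second times are assumed."""
    if t - PREDETERMINED in detected:
        sections.append((t - PREDETERMINED, t))
    detected.add(t)
```

With candidates arriving at 0 s, 15 s, and 40 s, only the pair (0, 15) matches the predetermined spacing and is recorded as a specific section.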
[0014] In a seventh aspect based on the fifth aspect, the specific section determination unit comprises: a determination unit that, each time a specific section candidate is detected, determines whether or not an already-detected specific section candidate exists at the time point a predetermined first time before, or a predetermined second time before, the most recently detected specific section candidate; an addition unit that, when the determination unit determines that such a specific section candidate exists, adds points to each of the specific section candidate determined to exist and the most recently detected specific section candidate; a start determination unit that, each time a predetermined third time elapses after a target candidate whose points are equal to or greater than a predetermined value is detected, determines whether or not a specific section candidate whose points are equal to or greater than that predetermined value exists at the time point the third time before the target candidate, and, if none exists, sets the target candidate as the start of a specific section; and an end determination unit that, each time the predetermined third time elapses after a target candidate whose points are equal to or greater than the predetermined value is detected, determines whether or not a specific section candidate whose points are equal to or greater than that predetermined value exists at the time point at which the third time has elapsed, and, if none exists, sets the target candidate as the end of the specific section.
[0015] In an eighth aspect based on the fifth aspect, the feature amount calculation unit calculates the audio power level of the audio signal as the feature amount, and the specific section candidate detection unit detects silent sections whose power level is equal to or below a predetermined value as the specific section candidates.
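Under the eighth aspect, candidate detection reduces to comparing each audio power sample against the predetermined value. A sketch, assuming one sample per second and an illustrative threshold:

```python
def detect_silent_candidates(power_levels, threshold=0.1):
    """Eighth aspect: times whose audio power level is at or below the
    predetermined value are specific section candidates (silence).
    The 0.1 threshold and per-second sampling are assumptions."""
    return [t for t, p in enumerate(power_levels) if p <= threshold]
```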
[0016] In a ninth aspect based on the fifth aspect, the feature amount calculation unit calculates luminance information based on the video signal as the feature amount, and the specific section candidate detection unit detects scene change points where the amount of change in the luminance information is equal to or greater than a predetermined value as the specific section candidates.
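Likewise, the ninth aspect's scene change test compares the luminance change between consecutive frames against the predetermined value. A sketch, where the per-frame average luminance values and the threshold are illustrative assumptions:

```python
def detect_scene_changes(luma, threshold=30.0):
    """Ninth aspect: frame i is a scene change point (specific section
    candidate) when the change in luminance information from the
    previous frame is at or above the predetermined value."""
    return [i for i in range(1, len(luma))
            if abs(luma[i] - luma[i - 1]) >= threshold]
```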
[0017] A tenth aspect is a digest generation method for generating digest scene information about a broadcast program when the broadcast signal of the program is received and recorded on a recording medium, comprising a feature amount calculation step, a specific section end detection step, and a digest scene information creation step. In the feature amount calculation step, each time a broadcast signal of a predetermined unit time is received, at least one type of feature amount indicating a feature of at least one of the video and the audio included in the broadcast signal is calculated from the received unit time's worth of the broadcast signal. In the specific section end detection step, a time point that becomes the start or end of a specific section is detected by determining, each time a feature amount is calculated, whether or not a predetermined time point included in the signal portion of the received broadcast signal for which feature amounts have already been calculated is the start or end of a specific section. In the digest scene information creation step, each time a feature amount is calculated, whether or not the broadcast signal in the sections of the entire program excluding the specific sections is a digest scene is determined based on the feature amount, and digest scene information is generated.
[0018] In an eleventh aspect based on the tenth aspect, the digest scene information creation step includes a digest section detection step of detecting digest candidate sections in the received broadcast signal by determining, each time a feature amount is calculated for a unit time's worth of the broadcast signal, whether or not the content included in that unit time's worth of the broadcast signal is a digest scene, based on the feature amount. Further, in the digest scene information creation step, each time a pair of the start and end of a specific section is detected by the specific section end detection step, whether or not the specific section from that start to that end overlaps a digest candidate section is determined, and information indicating the digest candidate sections detected by the digest section detection step excluding those that overlap the specific section is generated as the digest scene information.
[0019] In a twelfth aspect based on the tenth aspect, the digest scene information creation step includes a temporary storage step of storing the calculated feature amounts for a predetermined time back from the latest calculation time. Further, each time a feature amount is calculated, in the digest scene information creation step it is determined whether or not the time point of the feature amount stored by the temporary storage step falls between the start and the end of a specific section detected by the specific section end detection step, and only when it does not, the content that is a digest scene among the content included in the unit time's worth of the AV signal is detected and the digest scene information is generated.
[0020] A thirteenth aspect is a recording medium storing a digest generation program to be executed by the computer of a digest generation device that generates digest scene information about a broadcast program when the broadcast signal of the program is received and recorded on a recording medium, the program comprising a feature amount calculation step, a specific section end detection step, and a digest scene information creation step. The feature amount calculation step is a process of calculating, each time a broadcast signal of a predetermined unit time is received, at least one type of feature amount indicating a feature of at least one of the video and the audio included in the broadcast signal from the received unit time's worth of the broadcast signal. The specific section end detection step is a process of detecting a time point that becomes the start or end of a specific section by determining, each time a feature amount is calculated, whether or not a predetermined time point included in the signal portion of the received broadcast signal for which feature amounts have already been calculated is the start or end of a specific section. The digest scene information creation step is a process of determining, each time a feature amount is calculated and based on that feature amount, whether or not the broadcast signal in the sections of the entire program excluding the specific sections is a digest scene, and generating digest scene information.
[0021] In a fourteenth aspect based on the thirteenth aspect, the digest scene information creation step includes a digest section detection step of detecting digest candidate sections in the received broadcast signal by determining, each time a feature amount is calculated for a unit time's worth of the broadcast signal, whether or not the content included in that unit time's worth of the broadcast signal is a digest scene, based on the feature amount. Further, in the digest scene information creation step, each time a pair of the start and end of a specific section is detected by the specific section end detection step, whether or not the specific section from that start to that end overlaps a digest candidate section is determined, and information indicating the digest candidate sections detected by the digest section detection step excluding those that overlap the specific section is generated as the digest scene information.
[0022] In a fifteenth aspect according to the thirteenth aspect, the digest scene information creation step includes a temporary storage step of storing the calculated feature amounts going back a predetermined time from the most recent calculation time. Further, each time a feature amount is calculated, the digest scene information creation step determines whether the time point corresponding to a feature amount stored by the temporary storage step falls between the start and the end of a specific section detected by the specific section end detection step, and only when it does not, detects the content that is a digest scene among the content contained in the unit time of the AV signal and generates the digest scene information.
[0023] A sixteenth aspect is an integrated circuit used in a digest generation device that generates digest scene information about a broadcast program while the broadcast signal of that program is received and recorded on a recording medium, and comprises a feature amount calculation unit, a specific section end detection unit, and a digest scene information creation unit. Each time a broadcast signal of a predetermined unit time is received, the feature amount calculation unit calculates, from the received unit-time broadcast signal, at least one type of feature amount indicating a characteristic of at least one of the video and the audio contained in that signal. The specific section end detection unit detects a time point that is the start or the end of a specific section by determining, each time a feature amount is calculated, whether a predetermined time point included in the portion of the received broadcast signal for which feature amounts have already been calculated is the start or the end of a specific section. Each time a feature amount is calculated, the digest scene information creation unit determines, based on that feature amount, whether the broadcast signal in the sections of the program excluding the specific sections is a digest scene, and generates the digest scene information.
[0024] In a seventeenth aspect according to the sixteenth aspect, the digest scene information creation unit includes a digest section detection unit that detects digest candidate sections in the received broadcast signal by determining, each time a feature amount is calculated for a unit time of the broadcast signal, whether the content contained in that unit time of the broadcast signal is a digest scene based on that feature amount. Further, each time a pair consisting of the start and the end of a specific section is detected by the specific section end detection unit, the digest scene information creation unit determines whether the specific section from that start to that end overlaps any digest candidate section, and generates, as the digest scene information, information indicating the digest candidate sections detected by the digest section detection unit excluding those that overlap the specific section.
[0025] In an eighteenth aspect according to the sixteenth aspect, the digest scene information creation unit includes a temporary storage unit that stores the calculated feature amounts going back a predetermined time from the most recent calculation time. Further, each time a feature amount is calculated, the digest scene information creation unit determines whether the time point corresponding to a feature amount stored in the temporary storage unit falls between the start and the end of a specific section detected by the specific section end detection unit, and only when it does not, detects the content that is a digest scene among the content contained in the unit time of the AV signal and generates the digest scene information.
Effects of the Invention
[0026] According to the first invention described above, a specific section (for example, a CM section) can be detected while the program is being recorded, so that digest scene information excluding the specific sections can be generated in parallel with recording the program. This eliminates the processing wait for generating the digest scene information after recording ends, providing the user with a comfortable digest playback operation. Furthermore, even when chasing playback is performed while the program is still being recorded, digest playback can be carried out up to a point close to the current recording position, providing a still more convenient playback environment. [0027] According to the second and third inventions described above, the same effects as the first invention can be obtained.
[0028] According to the fourth invention described above, two types of feature amounts can be used. A feature amount suited to detecting the specific sections and a feature amount suited to detecting the digest sections can therefore each be employed, so that the specific sections and the digest sections can be detected more accurately.
[0029] According to the fifth and sixth inventions described above, the specific sections are determined based on the time intervals between specific section candidates, so that the specific sections can be determined more accurately.
[0030] According to the seventh invention described above, each specific section candidate is scored based on predetermined time intervals, so that its likelihood of being the start or the end of a specific section can be evaluated. Furthermore, since a specific section candidate with a high score is taken as the start or the end of a specific section, a specific section candidate that merely happens to occur within the program can be prevented from being erroneously determined to be the start or the end of a specific section. As a result, digest scene information that excludes the specific sections can be created more accurately.
[0031] According to the eighth invention described above, silent sections are taken as specific section candidates. This allows accurate detection of specific sections, such as CM sections, by exploiting the property that such sections begin and end with silent sections.
[0032] According to the ninth invention described above, scene change points at which the luminance information changes greatly are taken as specific section candidates. The transition from the program, where the luminance information changes greatly, to a specific section can therefore be treated as a specific section candidate, and as a result the specific sections can be determined more accurately.
[0033] According to the tenth to eighteenth inventions described above, the same effects as the first invention can be obtained.
Brief Description of the Drawings
[0034] [Fig. 1] Fig. 1 is a block diagram showing the configuration of a digest generation device 10 according to the first embodiment.
[Fig. 2] Fig. 2 is a diagram showing an example of data used in the present invention.
[Fig. 3] Fig. 3 is a flowchart showing the digest scene list generation process.
[Fig. 4] Fig. 4 is a flowchart showing details of the silent section detection process shown in step S4 of Fig. 3.
[Fig. 5] Fig. 5 is a flowchart showing details of the point evaluation process shown in step S16 of Fig. 4.
[Fig. 6] Fig. 6 is a flowchart showing details of the candidate section detection process shown in step S5 of Fig. 3.
[Fig. 7] Fig. 7 is a flowchart showing details of the CM section determination process shown in step S6 of Fig. 3.
[Fig. 8] Fig. 8 is a diagram showing an example of CM section determination in the CM section determination process.
[Fig. 9] Fig. 9 is a flowchart showing details of the digest scene list output process shown in step S7 of Fig. 3.
[Fig. 10] Fig. 10 is a block diagram showing the configuration of a digest generation device 10 according to the second embodiment.
[Fig. 11] Fig. 11 is a diagram showing an example of data used in the present invention.
[Fig. 12] Fig. 12 is a flowchart showing the digest scene list generation process according to the second embodiment.
[Fig. 13] Fig. 13 is a flowchart showing details of the silent section detection process shown in step S66 of Fig. 12.
[Fig. 14] Fig. 14 is a block diagram showing the configuration of a conventional recording/playback device.
[Fig. 15] Fig. 15 is a block diagram showing the configuration of a conventional recording/playback device.
Description of Reference Numerals
10, 30 Digest generation device
11 Receiving unit
12 Feature amount calculation unit
13 Silent section detection unit
14 Candidate section detection unit
15 CM section determination unit
16, 32 Digest list creation unit
17 Large-capacity recording medium
18 Playback control unit
21 Comparison feature amount information
22 Silence start information
23 Candidate start information
24 Silent section information
25 Candidate section information
26 Provisional CM start information
27 CM section information
28 Digest scene list
31 Temporary storage unit
36 Temporarily stored feature amounts
37 Immediately-before-digest information
38 Digest start information
BEST MODE FOR CARRYING OUT THE INVENTION
[0036] The present invention creates a digest scene list, indicating the positions of digest scenes, in parallel with recording a program. In the present embodiment described below, a scene in which the audio power level takes a locally large value, that is, an exciting scene, is adopted as a digest scene. Accordingly, scenes whose audio power level is at or above a predetermined value are extracted as digest candidate sections. Meanwhile, sections whose audio power level is at or below a predetermined value are extracted as silent sections, and sections in which such silent sections appear at predetermined intervals (for example, 15-second intervals) are extracted as CM sections. This is because a CM section begins and ends with a silent section and has a fixed length, so a portion in which silent sections appear at regular intervals can be considered to be a CM section. Then, each time one CM section is extracted, the information corresponding to that CM section is removed from the digest candidate section information, thereby creating a digest scene list indicating the digest scenes within the program sections. In the present embodiment, the length of one CM section is assumed to be at most 60 seconds.
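The final step described above, removing each newly detected CM section from the digest candidates, can be sketched as follows. This is an illustrative simplification, not code from the patent: the function name and the `(start, end)` tuple representation are assumptions, and, following the overlap rule used here, any candidate that overlaps the CM section is dropped entirely.

```python
def exclude_cm(candidates, cm):
    """Drop every digest candidate section that overlaps the given CM
    section. Sections are (start, end) pairs in seconds."""
    cm_start, cm_end = cm
    # Keep only the candidates that lie entirely outside the CM section.
    return [(s, e) for (s, e) in candidates
            if e <= cm_start or s >= cm_end]

# The first candidate overlaps the 120-135 s CM section and is dropped.
print(exclude_cm([(100.0, 130.0), (200.0, 210.0)], (120.0, 135.0)))
# -> [(200.0, 210.0)]
```

Running this each time a CM section is confirmed keeps the digest scene list up to date while recording continues.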
[0037] (First Embodiment)
Fig. 1 is a block diagram showing the configuration of the digest generation device according to the first embodiment of the present invention. In Fig. 1, the digest generation device 10 comprises a receiving unit 11, a feature amount calculation unit 12, a silent section detection unit 13, a candidate section detection unit 14, a CM section determination unit 15, a digest list creation unit 16, a large-capacity recording medium 17, and a playback control unit 18.
[0038] The receiving unit 11 receives broadcast radio waves and demodulates them into a video signal and an audio signal (hereinafter, AV signal). The receiving unit 11 outputs the demodulated AV signal to the feature amount calculation unit 12, the large-capacity recording medium 17, and the playback control unit 18.
[0039] The feature amount calculation unit 12 analyzes the AV signal to calculate feature amounts and outputs them to the silent section detection unit 13 and the candidate section detection unit 14. Here, a feature amount is a quantity used to identify CM sections and digest scenes within the program. Since CM sections are determined from the occurrence intervals of silent sections as described above, audio feature amounts such as the power level or power spectrum of the audio signal are suitable for determining CM sections. Feature amounts for determining digest scenes include, for example, video feature amounts such as luminance information and motion vectors of the video signal, and audio feature amounts such as the power level or power spectrum of the audio signal. In the present embodiment, the audio signal power level is used as the feature amount for determining both the CM sections and the digest scenes.
[0040] The silent section detection unit 13 detects silent sections in the program based on the feature amount and generates silent section information 24, which it outputs to the CM section determination unit 15.
[0041] The candidate section detection unit 14 detects sections that are candidates for digest scenes in the program (hereinafter, candidate sections) based on the feature amount and generates candidate section information 25, which it outputs to the digest list creation unit 16.
[0042] The CM section determination unit 15 determines CM sections by examining the time intervals between silent sections based on the silent section information 24, and outputs the determined CM sections to the digest list creation unit 16 as CM section information 27.
[0043] Based on the candidate section information 25 and the CM section information 27, the digest list creation unit 16 creates a digest scene list 28, which is information indicating the positions of the digest scenes, and outputs it to the large-capacity recording medium 17 and the playback control unit 18.
[0044] The large-capacity recording medium 17 is a medium for recording the AV signal and the digest scene list 28, and is realized by a DVD, an HDD, or the like.
[0045] The playback control unit 18 performs playback control, such as playing back the received AV signal or the AV signal recorded on the large-capacity recording medium 17 and outputting it to a monitor.
[0046] The feature amount calculation unit 12, the silent section detection unit 13, the candidate section detection unit 14, the CM section determination unit 15, and the digest list creation unit 16 shown in Fig. 1 are typically realized as an LSI, which is an integrated circuit. These units may each be put on a separate chip, or some or all of them may be integrated on a single chip. The method of circuit integration is not limited to LSI; they may also be realized by a dedicated circuit or a general-purpose processor.
[0047] Next, the various data used in the present embodiment will be described with reference to Fig. 2. The data described below are stored in a temporary storage unit (not shown) realized by, for example, a semiconductor memory. In Fig. 2, the comparison feature amount information 21 (Fig. 2(A)) is used to detect the silent sections and the like, and holds time information 211 for the immediately preceding frame and an immediately preceding feature amount 212 storing the audio power level value calculated by the feature amount calculation unit 12.
[0048] The silence start information 22 (Fig. 2(B)) holds a silence start time and is used to detect silent sections.
[0049] The candidate start information 23 (Fig. 2(C)) holds a candidate start time and is used to detect candidate sections.
[0050] The silent section information 24 (Fig. 2(D)) stores the silent section detection results produced by the silent section detection unit 13. The silent section information 24 consists of sets of a section number 241, a score 242, a start time 243, and an end time 244. The section number 241 identifies each silent section. The score 242 is a value evaluating how likely the silent section is to be an end of a CM section: the higher the score, the more likely the silent section is an end of a CM section; conversely, a low score means the silent section is likely one that merely happens to occur within the program (that is, not an end of a CM section). The start time 243 and the end time 244 are time information indicating the start and end times of the silent section.
[0051] The candidate section information 25 (Fig. 2(E)) stores the candidate section detection results produced by the candidate section detection unit 14. The candidate section information 25 consists of sets of a candidate number 251, a start time 252, and an end time 253. The candidate number 251 identifies each candidate section. The start time 252 and the end time 253 are time information indicating the start and end times of the candidate section.
[0052] The provisional CM start information 26 (Fig. 2(F)) holds a provisional CM start time used by the CM section determination unit 15 to detect CM sections; it stores the start time of a silent section that may be the start of a CM section.
[0053] The CM section information 27 (Fig. 2(G)) stores information on the CM sections detected by the CM section determination unit 15. The CM section information 27 consists of sets of a CM number 271, a CM start time 272, and a CM end time 273. The CM number 271 identifies each CM section. The CM start time 272 and the CM end time 273 are time information indicating the start and end times of the CM section.
[0054] The digest scene list 28 (Fig. 2(H)) is a file indicating the time information of the sections that are digest scenes within the program. It consists of sets of a digest number 281, a digest start time 282, and a digest end time 283. The digest number 281 identifies each digest section. The digest start time 282 and the digest end time 283 are time information indicating the start and end times of the digest section.
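The record layouts of Fig. 2(D) through 2(H) can be summarized as simple record types. The following sketch is purely illustrative: the class and field names are English renderings of the reference numerals, not identifiers defined by the patent.

```python
from dataclasses import dataclass

@dataclass
class SilentSection:      # silent section information 24 (Fig. 2(D))
    number: int           # section number 241
    score: int            # score 242: likelihood of being a CM-section end
    start: float          # start time 243 (seconds)
    end: float            # end time 244 (seconds)

@dataclass
class CandidateSection:   # candidate section information 25 (Fig. 2(E))
    number: int           # candidate number 251
    start: float          # start time 252
    end: float            # end time 253

@dataclass
class CmSection:          # CM section information 27 (Fig. 2(G))
    number: int           # CM number 271
    start: float          # CM start time 272
    end: float            # CM end time 273

@dataclass
class DigestEntry:        # digest scene list 28 (Fig. 2(H))
    number: int           # digest number 281
    start: float          # digest start time 282
    end: float            # digest end time 283
```

Each detection unit appends records of its own type; the digest list creation unit consumes `CandidateSection` and `CmSection` records and emits `DigestEntry` records.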
[0055] The detailed operation of the digest scene list creation process performed by the digest generation device 10 will now be described with reference to Figs. 3 to 9. Fig. 3 is a flowchart showing the detailed operation of the digest scene list creation process according to the first embodiment. The process shown in Fig. 3 is started by a recording instruction from the user, and its scan time is assumed to be one frame.
[0056] In Fig. 3, the digest generation device 10 first determines whether an instruction to end recording has been given (step S1). If it has (YES in step S1), the digest scene list creation process ends. Otherwise (NO in step S1), the feature amount calculation unit 12 acquires one frame's worth of signal from the receiving unit 11 (step S2). Next, the feature amount calculation unit 12 analyzes the acquired signal and calculates the audio power level (feature amount) (step S3).
[0057] Next, the silent section detection unit 13 performs the silent section detection process to detect silent sections (step S4). Fig. 4 is a flowchart showing the details of the silent section detection process shown in step S4. In Fig. 4, the silent section detection unit 13 first determines whether the audio signal power level calculated in step S3 is at or below a predetermined threshold (step S11). If it is (YES in step S11), the silent section detection unit 13 refers to the immediately preceding feature amount 212, which stores the feature amount of the previous frame, and determines whether that value is at or below the predetermined threshold (step S12); in other words, it examines the change in audio power level between the current frame and the previous frame. If the previous value is not at or below the threshold (NO in step S12), the silent section detection unit 13 stores the time information of the current frame in the silence start information 22 (step S13). Immediately after the process starts, nothing is yet stored in the immediately preceding feature amount 212; in that case, the process proceeds as if the value were not at or below the threshold. On the other hand, if the previous value is also at or below the threshold (YES in step S12), a silent section is continuing, so the silent section detection process simply ends.
[0058] On the other hand, if the result of step S11 is that the audio signal power level calculated in step S3 is not at or below the predetermined threshold (NO in step S11), the silent section detection unit 13 refers to the immediately preceding feature amount 212 and determines whether the power level stored there is at or below the predetermined threshold (step S14). If it is (YES in step S14), the silent section that had been continuing ended at the previous frame, so the silent section detection unit 13 outputs the section from the silence start time in the silence start information 22 to the time information 211 of the previous frame to the silent section information 24 as one silent section (step S15). Next, the silent section detection unit 13 performs the point evaluation process described below (step S16) on the silent section output in step S15. [0059] On the other hand, if the result of the determination in step S14 is that the power level of the immediately preceding feature amount 212 is not at or below the predetermined threshold (NO in step S14), a non-silent section is continuing, so the silent section detection unit 13 ends the process. Immediately after the process starts, nothing is yet stored in the immediately preceding feature amount 212; in that case as well, the process proceeds as if the value were not at or below the threshold. This completes the silent section detection process.
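The per-frame logic of steps S11 through S15 amounts to a two-state edge detector on the audio power level: a section opens on a loud-to-quiet edge and closes at the frame before a quiet-to-loud edge. A minimal sketch, assuming a hypothetical function name and an arbitrary threshold value:

```python
SILENCE_THRESHOLD = 0.05  # illustrative value; the patent leaves it unspecified

def detect_silent_sections(frames):
    """frames: iterable of (time, power) pairs, one per video frame.
    Returns the completed silent sections as (start, end) time pairs,
    mirroring steps S11-S15 of Fig. 4."""
    sections = []
    prev = None           # (time, power) of the previous frame (info 21)
    silence_start = None  # silence start information 22
    for time, power in frames:
        if power <= SILENCE_THRESHOLD:
            # S11 YES: record a start only on the loud->quiet edge (S12/S13).
            # At process start (prev is None) the edge test is treated as YES.
            if prev is None or prev[1] > SILENCE_THRESHOLD:
                silence_start = time
        else:
            # S11 NO: on the quiet->loud edge, the silent section ended
            # at the previous frame (S14/S15).
            if prev is not None and prev[1] <= SILENCE_THRESHOLD:
                sections.append((silence_start, prev[0]))
        prev = (time, power)
    return sections

frames = [(0.0, 0.9), (0.1, 0.01), (0.2, 0.02), (0.3, 0.8), (0.4, 0.7)]
print(detect_silent_sections(frames))  # -> [(0.1, 0.2)]
```

In the device itself this runs incrementally, one frame per scan, rather than over a complete list; the state held across calls corresponds to the comparison feature amount information 21 and the silence start information 22.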
[0060] Next, the details of the point evaluation process in step S16 will be described with reference to Fig. 5. This process determines whether the time points 15, 30, and 60 seconds before the most recently detected silent section fall within silent sections, and if so, adds one point to the information of each such silent section. In this way, silent sections that are likely to be the start or the end of some CM receive higher scores. That is, the process exploits the properties that both ends of a CM section are silent sections and that one CM section is 15, 30, or 60 seconds long, and evaluates, by scoring, how likely each silent section occurring in the program is to be an end of a CM section. As a result, it becomes possible to distinguish silent sections that merely happen to occur in the program from silent sections that mark CM boundaries.
[0061] In FIG. 5, the silent section detection unit 13 first acquires the start time 243 of the silent section most recently stored in the silent section information 24. The silent section detection unit 13 then searches the silent section information 24 to determine whether there is a silent section 15 seconds before that time (step S21). If such a silent section is found (YES in step S21), the silent section detection unit 13 adds 1 to the score 242 of both the most recently stored silent section and the silent section found in step S21 (step S22). If no silent section 15 seconds earlier can be found (NO in step S21), the silent section detection unit 13 skips step S22 and proceeds to step S23. Next, as in step S21, the silent section detection unit 13 determines whether there is a silent section 30 seconds earlier (step S23). If one is found (YES in step S23), the silent section detection unit 13 adds 1 to the score 242 of both the most recently stored silent section and the silent section found this time (step S24). If no silent section 30 seconds earlier can be found (NO in step S23), the silent section detection unit 13 skips step S24 and proceeds to step S25. In step S25, as in steps S21 and S23, the silent section detection unit 13 determines whether there is a silent section 60 seconds earlier, and if so, adds 1 to the scores 242 as in steps S22 and S24. This completes the point evaluation process of step S16. Although the above description searches the silent section information 24 using the start time 243 of each silent section as the reference, this is not a limitation; the end time 244 of the silent section, or any point within the silent section, may be used as the reference instead.
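As an illustration only, and not part of the disclosed embodiment, the point evaluation of [0060]–[0061] can be sketched in Python as follows. The data layout (silent sections as dictionaries with `start` and `score` fields) and the half-second matching tolerance are assumptions of this sketch.

```python
# Hypothetical sketch of the point evaluation of FIG. 5 (step S16).
# A silent section is modeled as {"start": seconds, "score": points};
# these names and the 0.5 s matching tolerance are illustrative only.

def evaluate_points(silent_sections, tolerance=0.5):
    """Add one point to the newest silent section and to any earlier one
    whose start lies 15, 30, or 60 seconds before it (steps S21-S25)."""
    if not silent_sections:
        return
    newest = silent_sections[-1]
    for offset in (15, 30, 60):
        target = newest["start"] - offset
        for section in silent_sections[:-1]:
            if abs(section["start"] - target) <= tolerance:
                section["score"] += 1   # earlier section: likely CM boundary
                newest["score"] += 1    # newest section: likely CM boundary
                break
```

A silent section that pairs with others at CM-typical spacings thus accumulates a high score, while an isolated silence within the program body stays at zero.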
[0062] Returning to FIG. 3, after step S4, the candidate section detection unit 14 performs candidate section detection processing (step S5). This processing detects sections in which the audio power level is at or above a predetermined threshold as digest scene candidate sections.
[0063] FIG. 6 is a flowchart showing the details of the candidate section detection processing of step S5. In FIG. 6, the candidate section detection unit 14 first determines whether the power level of the audio signal extracted in step S3 is at or above a predetermined threshold (step S31). If it is (YES in step S31), the candidate section detection unit 14 then determines whether the immediately preceding feature 212 is at or above the predetermined threshold (step S32). If it is not (NO in step S32), the candidate section detection unit 14 stores the time information of the frame acquired in step S2 (the frame currently being processed) into the candidate start information 23 (step S33). Immediately after processing begins, nothing is yet stored in the immediately preceding feature 212; in that case, processing proceeds as if the value were below the threshold. If the immediately preceding feature 212 is at or above the threshold (YES in step S32), a candidate section is already in progress, so the candidate section detection unit 14 proceeds to step S36.
[0064] If, on the other hand, the power level of the audio signal calculated in step S3 is below the predetermined threshold (NO in step S31), the candidate section detection unit 14 refers to the immediately preceding feature 212 and determines whether the power level stored there is at or above the predetermined threshold (step S34). If it is (YES in step S34), the candidate section that had been in progress ended at the previous frame, so the candidate section detection unit 14 outputs the section from the candidate start time stored in the candidate start information 23 up to the time information 211 of the previous frame to the candidate section information 25 as one candidate section (step S35).

[0065] If the value of the immediately preceding feature 212 is below the predetermined threshold (NO in step S34), a non-candidate section is continuing, so the candidate section detection unit 14 proceeds to step S36. Immediately after processing begins, nothing is yet stored in the immediately preceding feature 212, so processing proceeds as if the value were below the threshold. In step S36, the candidate section detection unit 14 stores the power level of the audio signal acquired in step S3 into the immediately preceding feature 212 (step S36). This completes the candidate section detection processing.
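A minimal sketch of the candidate-section state machine of FIG. 6 follows, assuming frames arrive as (time, power) pairs; the offline list interface is a simplification of the per-frame streaming described above, and the variable names are illustrative.

```python
# Hypothetical sketch of candidate section detection (FIG. 6, step S5).
# Frames are (time, power_level) pairs; threshold and interface are assumed.

def detect_candidates(frames, threshold):
    candidates = []
    start = None          # plays the role of candidate start information 23
    prev_time = None
    prev_power = None     # plays the role of the immediately preceding feature 212
    for time, power in frames:
        if power >= threshold:
            if prev_power is None or prev_power < threshold:
                start = time                           # S33: section begins here
        elif prev_power is not None and prev_power >= threshold:
            candidates.append((start, prev_time))      # S35: ended at previous frame
        prev_time, prev_power = time, power            # S36: remember this frame
    return candidates
```

As in the streaming version, a candidate section still open when input ends is not emitted; it would be closed by a later below-threshold frame.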
[0066] Returning to FIG. 3, after step S5, the CM section determination unit 15 performs CM section determination processing (step S6). FIG. 7 is a flowchart showing the details of the CM section determination processing of step S6. In FIG. 7, the CM section determination unit 15 first searches the silent section information 24 and determines whether a silent section whose score 242 is at or above a predetermined value (for example, 3 points) exists at the point in time 60 seconds before the current frame (step S41). In other words, it determines whether the point 60 seconds earlier fell within a silent section. The search point is set 60 seconds back because this embodiment assumes that a single CM section is at most 60 seconds long; if a single CM section is assumed to be at most 30 seconds long, the search point should be set 30 seconds back. If the point 60 seconds earlier is not a silent section (NO in step S41), the CM section determination unit 15 proceeds to step S46, described later.
[0067] If the result of step S41 is a silent section (YES in step S41), the CM section determination unit 15 determines whether data exists in the provisional CM start information 26 (step S42). If no data exists (NO in step S42), the CM section determination unit 15 outputs the time information of the found silent section to the provisional CM start information 26 (step S49). If data already exists (YES in step S42), the CM section determination unit 15 acquires the provisional start time from the provisional CM start information 26 and outputs it to the CM section information 27 as the CM start time 272, associated with the CM number 271. It also outputs the end time of the silent section found in step S41 (that is, the silent section 60 seconds earlier) to the CM section information 27 as the CM end time 273 (step S43).

[0068] Next, the CM section determination unit 15 sets the D-list creation flag, a flag that triggers creation of the digest scene list described later, to ON (step S44). The CM section determination unit 15 then outputs the end time of that silent section 60 seconds earlier as the start time of the provisional CM start information 26 (step S45).
[0069] Next, the CM section determination unit 15 determines whether 120 seconds or more have elapsed since the time in the provisional CM start information 26 (step S46). That is, if no silent section whose score 242 is at or above the predetermined value appears for 120 seconds after a silent section that might be a CM start is found, that silent section is judged not to be the start of a CM. The criterion is 120 seconds because this embodiment assumes that a single CM section is at most 60 seconds long: even if a start candidate for a CM section is found and a silent section is then found 60 seconds later, a further 60 seconds are needed to confirm whether that silent section is the end of the CM section.
[0070] If 120 seconds or more have elapsed (YES in step S46), the CM section determination unit 15 clears the provisional CM start information 26 (step S47) and then sets the D-list creation flag to ON (step S48). If 120 seconds have not yet elapsed (NO in step S46), the processing ends as it is. This completes the CM section determination processing.
[0071] Here, the CM section determination processing is explained further with reference to FIG. 8. In FIG. 8, points A to G are silent sections and are the boundaries of CM sections spaced 15 seconds apart. According to the processing described above, at point E (60 seconds) in FIG. 8, point A becomes the provisional CM start. Then, at point F (75 seconds), the section from point A to point B is confirmed as a CM section, and the time information of that section is output to the CM section information 27; at the same time, point B becomes the new provisional CM start. Then, at point G, the section from point B to point C is confirmed as a CM section and output to the CM section information, and point C in turn becomes the provisional CM start. Thus, according to the processing described above, accurate CM sections can be determined in parallel with program recording, albeit with some delay.
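The pairing behavior illustrated with points A to G can be approximated offline as follows. This sketch collapses the 60-second lookback and the 120-second timeout of FIG. 7 into a single spacing test between consecutive high-scoring boundaries, so it is a simplification, not the disclosed streaming algorithm.

```python
# Simplified offline approximation of CM section confirmation (FIG. 7 / FIG. 8).
# `boundaries` are times (seconds) of high-scoring silent sections; max_cm
# mirrors the assumed maximum CM length of 60 seconds.

def confirm_cm_sections(boundaries, max_cm=60):
    sections = []
    provisional = None                          # provisional CM start information 26
    for b in sorted(boundaries):
        if provisional is not None and b - provisional <= max_cm:
            sections.append((provisional, b))   # S43: section confirmed
        provisional = b                         # S45/S49: new provisional start
    return sections
```

For boundaries every 15 seconds (points A to G of FIG. 8) this yields the sections A–B, B–C, and so on, while an isolated boundary far from the others pairs with nothing, as in the 120-second timeout case.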
[0072] Returning to FIG. 3, after step S6, the digest list creation unit 16 performs digest scene list output processing (step S7). FIG. 9 is a flowchart showing the details of the digest scene list output processing of step S7. In FIG. 9, the digest list creation unit 16 first determines whether the D-list creation flag is ON (step S51). If it is not (NO in step S51), the digest list creation unit 16 ends the processing as it is. If it is ON (YES in step S51), the digest list creation unit 16 determines whether new candidate sections have been added to the candidate section information 25 since the digest scene list output processing was last performed (step S52). If no candidate section has been added (NO in step S52), the digest list creation unit 16 ends the digest scene list creation processing as it is. If candidate sections have been newly added since the last digest scene list output processing (YES in step S52), the digest list creation unit 16 acquires the information of one of the added candidate sections (step S53). Next, the digest list creation unit 16 refers to the CM section information 27 and determines whether that candidate section is contained in a CM section (step S54). If it is not within a CM section (NO in step S54), the digest list creation unit 16 outputs the information of that candidate section to the digest scene list 28 (step S55). If it is within a CM section (YES in step S54), the processing proceeds to step S56. In other words, the sections are sorted so that a candidate section that is also a CM section is not adopted as a digest scene.
[0073] Next, the digest list creation unit 16 determines whether the above sorting has been performed for all of the added candidate sections (step S56). If unprocessed added candidate sections remain (NO in step S56), the digest list creation unit 16 returns to step S53 and repeats the processing. When all of the added candidate sections have been processed, the digest list creation unit 16 sets the D-list creation flag to OFF (step S57) and ends the digest scene list output processing. This completes the digest scene list creation processing according to the first embodiment.
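The sorting of steps S53 to S55 amounts to filtering the candidate sections against the confirmed CM sections. A minimal sketch, representing both kinds of sections as assumed (start, end) tuples in seconds:

```python
# Hypothetical sketch of the digest scene list output filtering (FIG. 9).
# Candidates and CM sections are (start, end) tuples in seconds.

def build_digest_list(candidates, cm_sections):
    def in_cm(start, end):
        # S54: is the candidate contained in some confirmed CM section?
        return any(s <= start and end <= e for s, e in cm_sections)
    # S55: only candidates outside every CM section become digest scenes
    return [c for c in candidates if not in_cm(*c)]
```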
[0074] Thus, in the first embodiment, in parallel with recording a program, digest candidate sections whose audio power level is at or above a predetermined value are simply extracted, and those that fall within CM sections are subtracted from them. A digest scene list containing only the digest scenes within the program sections can thereby be created in parallel with recording. This eliminates the need for separate digest scene list generation processing after recording ends, and provides the user with a comfortable viewing environment free of the processing wait that such generation would otherwise require.
[0075] In the embodiment described above, the silent section detection processing is performed by the silent section detection unit 13, but this is not a limitation; the CM section determination unit 15 may instead detect silent sections prior to the CM section determination processing.
[0076] Digest scene detection is likewise not limited to the method using the audio power level described above. For example, limited to sports as a specific program genre, slow-motion scenes (repeated slow-motion replays) may be identified from the motion vectors of the video, and the few cuts immediately preceding them detected as exciting scenes; alternatively, text information attached to the program may be combined with feature values of the video and audio signals to detect important scenes. Of course, the method is not limited to these digest scene detection schemes, and any method that detects digest scenes may be used. Similarly, CM section detection is not limited to the method using the audio power level described above; for example, scene change points of the video may be detected from the luminance information of the video, and CM sections determined based on the intervals at which those points occur. In that case, the luminance information of the video may be used as the feature value.
[0077] The digest list described above may also be used to perform catch-up playback of a program while it is being recorded. In this case, the user instructs catch-up playback. Upon receiving this instruction, the playback control unit 18 determines whether two minutes or more have elapsed since recording started; if so, it plays back only the digest scenes, using the digest list generated by the processing described above. If two minutes have not yet elapsed, the playback control unit 18 performs quick playback (for example, playback at 1.5 times normal speed). If the quick playback then catches up with the live broadcast, the quick playback may be stopped and the output switched to the real-time broadcast. After playback of the digest scenes ends, subsequent playback may be left to the user's instructions; for example, normal playback may be performed, or playback may be thinned out. Suppose, for example, that 30 minutes into a 60-minute program the user instructs catch-up playback with a request to "play the digest scenes in 10 minutes." In this case, the playback control unit 18 plays back the digest scenes so as to finish in 10 minutes, based on the digest scene list created up to that point. Viewing after the digest scene playback ends is left to the user, and the playback control unit 18 waits for the user's instructions. That is, since 40 minutes of the program have elapsed by the time the digest scene playback ends, the 10 minutes of the program broadcast during the digest playback may, at the user's instruction, be played back thinned out or played back quickly. Of course, the user may simply watch the live broadcast without playing back that 10-minute portion; in that case, the playback control unit 18 ends the playback processing upon receiving the user's instruction. Thus, according to the present embodiment, since the digest scene list is generated in parallel with recording, digest playback can be performed at any point during recording.
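The catch-up behavior of [0077], quick playback until enough digest material exists and then digest playback trimmed to a requested duration, might be sketched as follows. The two-minute threshold is the one stated above, while the greedy scene selection is purely an assumption of this sketch; the embodiment does not specify how scenes are chosen to fit the requested time.

```python
# Hypothetical sketch of the catch-up playback decision ([0077]).

def choose_playback_mode(seconds_since_start, threshold=120):
    """Digest playback once recording has run long enough, else 1.5x quick playback."""
    return "digest" if seconds_since_start >= threshold else "quick_1.5x"

def fit_to_budget(digest_scenes, budget):
    """Greedily keep (start, end) digest scenes until the time budget (seconds)
    is spent -- one plausible way to honor a request such as
    'play the digest scenes in 10 minutes'."""
    chosen, used = [], 0.0
    for start, end in digest_scenes:
        length = end - start
        if used + length > budget:
            break
        chosen.append((start, end))
        used += length
    return chosen
```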
[0078] In the embodiment described above, the digest scene information is created by subtracting the CM sections from the digest candidate sections. However, the sections subtracted from the digest candidate sections are not limited to CM sections. For example, sections in which a still image is displayed may be detected and subtracted. When a program is rebroadcast, for instance, some scenes may not be broadcastable for licensing or portrait-rights reasons. In such cases, the program is edited before broadcast so that a still image (displaying, for example, "This scene cannot be shown") appears in place of each scene that cannot be broadcast, and is then aired. Accordingly, a feature value of the still image (for example, a video motion vector of 0) is detected, and still-image sections in which a still image is continuously displayed are identified. Digest scene information may then be created by subtracting those still-image sections (that is, broadcast-prohibited sections) from the digest candidate sections. By detecting sections having predetermined features, such as CM sections and still-image sections, as specific sections and subtracting those specific sections from the digest candidate sections, a digest list from which only the digest scenes are appropriately extracted can be generated.
[0079] (Second Embodiment)
Next, a second embodiment of the present invention will be described with reference to FIGS. 10 to 13. In the first embodiment described above, digest scene candidate sections are detected as they occur. In the second embodiment, by contrast, candidate sections are not detected; instead, the feature values needed for digest scene detection are accumulated for a predetermined time, and at predetermined timing, digest scenes are detected from the accumulated feature values outside the CM sections. FIG. 10 is a block diagram showing the configuration of a digest generation device 30 according to the second embodiment of the present invention. In FIG. 10, the feature calculation unit 12 associates the calculated feature values with time information and stores them in the temporary storage unit 31 as the temporarily accumulated features 36. The temporary storage unit 31 has sufficient capacity to hold the feature values and time information of a predetermined time's worth of frames; in this embodiment, it is assumed to hold two minutes' worth of frame information. The temporary storage unit 31 operates as a ring buffer, overwriting the oldest data in order. The digest list creation unit 32 detects digest scenes from sections other than CM sections, based on the CM section information 27 and the feature values stored in the temporary storage unit 31, and creates the digest scene list 28. Apart from these points, the digest generation device 30 according to this embodiment has basically the same configuration as the first embodiment described above; the same parts are therefore given the same reference numerals, and detailed description of them is omitted.
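The temporary storage unit 31 described above behaves like a fixed-capacity ring buffer. A minimal stand-in follows; the class name and interface are invented for illustration and are not part of the disclosure.

```python
from collections import deque

# Hypothetical stand-in for the temporary storage unit 31: holds at most
# `capacity` (time, feature) entries and silently overwrites the oldest.

class TemporaryStore:
    def __init__(self, capacity):
        self.buf = deque(maxlen=capacity)   # deque drops the oldest automatically

    def push(self, time, feature):
        self.buf.append((time, feature))

    def oldest(self):
        # Corresponds to fetching the oldest time information 361 / feature 362
        return self.buf[0] if self.buf else None
```

For two minutes of frames at, say, 30 frames per second, the capacity would be 2 × 60 × 30 entries.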
[0080] Next, the data used in the second embodiment will be described with reference to FIG. 11. In addition to the data used in the first embodiment, the second embodiment uses the temporarily accumulated features 36, the immediately-preceding digest information 37, and the digest start information 38. The temporarily accumulated features 36 are used for digest scene detection and comprise time information 361 and a feature value 362. The time information 361 stores the time information of a frame. The feature value 362 stores the feature value calculated by the feature calculation unit 12 for digest scene detection (in this embodiment, the audio power level). The immediately-preceding digest information 37 (FIG. 11(B)) is also used for digest scene detection and comprises immediately-preceding digest time information 371 and an immediately-preceding digest feature value 372. The immediately-preceding digest time information 371 stores the time information of the frame one before the frame currently being processed. The immediately-preceding digest feature value 372 stores the feature value of the frame one before the frame currently being processed. The digest start information 38 (FIG. 11(C)) holds a digest start time and is used to detect digest scenes.
[0081] The digest scene list creation processing according to the second embodiment of the present invention will now be described with reference to FIGS. 12 and 13. FIG. 12 is a flowchart showing the detailed operation of the digest scene list creation processing according to the second embodiment. In FIG. 12, the processing of steps S61 and S62 is the same as that of steps S1 and S2 described with reference to FIG. 3 in the first embodiment, so detailed description is omitted here. The feature calculation processing of step S63 is likewise the same as that of step S3 described with reference to FIG. 3 in the first embodiment, except that the calculated feature value is output to the temporary storage unit 31, so detailed description is omitted. The silent section detection processing of step S64 is also the same as that of step S4 described with reference to FIG. 4 in the first embodiment, except that at the end of the processing, the feature value calculated in step S63 (the power level of the audio signal) is stored in the immediately preceding feature 212, so detailed description is omitted.
[0082] After step S64, the CM section determination unit 15 performs CM section determination processing and creates the CM section information (step S65). This operation is also the same as the processing of step S6 described with reference to FIG. 7 in the first embodiment, so detailed description is omitted.
[0083] When the processing of step S65 is complete, the digest list creation unit 32 performs digest list output processing (step S66). FIG. 13 is a flowchart showing the details of the digest list output processing of step S66. In FIG. 13, the digest list creation unit 32 first determines whether 120 seconds' worth of frame feature values have accumulated in the temporarily accumulated features 36 (step S71). Since this embodiment assumes a maximum CM section length of 60 seconds, if, for example, a 60-second CM section occurs at the beginning of the program, up to 120 seconds are needed before that CM section can be confirmed; this check therefore prevents the processing from being performed for at least the first 120 seconds after the program starts. If 120 seconds' worth has not yet accumulated (NO in step S71), the digest list output processing ends. If it has (YES in step S71), the digest list creation unit 32 acquires the oldest time information 361 and feature value 362 from the temporarily accumulated features 36 (step S72).
[0084] Next, the digest list creation unit 32 refers to the CM section information and determines whether the time indicated by the time information 361 acquired in step S72 falls within a CM section (step S73). If it does (YES in step S73), the digest list creation unit 32 ends the digest list generation processing. If it does not (NO in step S73), the digest list creation unit 32 determines whether the value of the feature value 362 is at or above a predetermined value (step S74). If it is (YES in step S74), the digest list creation unit 32 determines whether the immediately-preceding digest feature value 372 is at or above the predetermined value (step S75). That is, it judges the change in audio power level between the frame acquired in step S72 and the frame immediately before it. If the immediately-preceding digest feature value 372 is below the predetermined value (NO in step S75), the time information of the frame is saved in the digest start information 38 (step S76). On the first pass, nothing is yet stored in the immediately-preceding digest feature value 372; in that case, processing proceeds as if the value were below the predetermined value. If the immediately-preceding digest feature value 372 is at or above the predetermined value (YES in step S75), the digest list creation unit 32 proceeds to step S77 without performing step S76.
[0085] If, on the other hand, the determination in step S74 shows that the feature value 362 is below the predetermined value (NO in step S74), the digest list creation unit 32 next determines whether the immediately preceding digest feature value 372 is equal to or greater than the predetermined value (step S78). If it is not (NO in step S78), the digest list creation unit 32 ends the digest list generation process. If it is (YES in step S78), the digest scene that had been continuing ended at the previous frame, so the section from the digest start time indicated by the digest start point information 38 to the immediately preceding digest time information 371 is output to the digest scene list 28 as one digest section (step S79).
[0086] Next, the digest list creation unit 32 saves the audio power level of the current frame as the immediately preceding digest feature value 372 (step S77). This completes the digest scene list creation process according to the second embodiment.
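Steps S71 through S79 above amount to a small per-frame state machine: outside CM sections, a rising edge of the audio feature value across a threshold opens a candidate digest section, and a falling edge closes it at the previous frame. The following Python sketch illustrates that logic. The function and variable names, the fixed threshold, and the simple list-based buffer are illustrative assumptions for exposition, not the patent's actual implementation.

```python
def process_oldest_frame(buffer, cm_sections, state, threshold=0.5,
                         min_buffered=120.0, digest_list=None):
    """One pass of the digest list output process (steps S71-S79).

    buffer       -- list of (time, feature) pairs, oldest first
                    (the temporarily stored feature values 36)
    cm_sections  -- list of (start, end) CM sections already confirmed
    state        -- dict holding the previous frame's level and time
                    (feature value 372) and the candidate start time
                    (start point information 38)
    """
    if digest_list is None:
        digest_list = []

    # S71: do nothing until 120 seconds of frames are buffered, since a
    # 60-second CM at the head of the program needs up to 120 seconds
    # before it can be confirmed as a CM section.
    if buffer[-1][0] - buffer[0][0] < min_buffered:
        return digest_list

    # S72: take the oldest buffered frame.
    t, feature = buffer.pop(0)

    # S73: frames inside a confirmed CM section are never digest scenes.
    if any(start <= t < end for start, end in cm_sections):
        return digest_list

    prev = state.get('prev_feature')  # None on the very first pass
    if feature >= threshold:
        # S74 YES / S75 NO: a rising edge (previous frame below the
        # threshold, or first pass) marks a candidate digest start (S76).
        if prev is None or prev < threshold:
            state['digest_start'] = t
    elif prev is not None and prev >= threshold:
        # S78 YES: a falling edge closes the digest section at the
        # previous frame (S79).
        digest_list.append((state['digest_start'], state['prev_time']))

    # S77: remember this frame's level and time for the next pass.
    # (The flowchart skips S77 when both frames are quiet; storing the
    # level anyway is equivalent for the threshold comparison.)
    state['prev_feature'] = feature
    state['prev_time'] = t
    return digest_list
```

Driven once per newly calculated frame, this closes each digest section at the previous frame's time as in step S79, while the buffer gate reproduces the 120-second delay of step S71.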
[0087] As described above, the second embodiment can detect CM sections in parallel with the recording of a program, and can detect digest scenes from the program sections other than the CM sections. This removes the need to run a separate digest scene list generation process after recording ends, so the user is provided with a comfortable viewing environment with no waiting time for that generation process.
[0088] Note that each of the embodiments described above may be provided in the form of a recording medium storing a program to be executed by a computer. In that case, the digest generation program stored on the recording medium is read, and the digest generation device (more precisely, a control unit not shown) executes the processes shown in FIGS. 3 and 12.
Industrial Applicability
[0089] The digest generation device, digest generation method, recording medium storing a digest generation program, and integrated circuit used in a digest generation device according to the present invention can generate digest scene information while recording a program, and are useful in applications such as HDD recorders and DVD recorders.

Claims

[1] A digest generation device for generating digest scene information concerning a broadcast program when the broadcast signal of the program is received and recorded on a recording medium, the device comprising:
a feature value calculation unit that, each time a predetermined unit time of the broadcast signal is received, calculates, from the received unit time of the broadcast signal, at least one type of feature value indicating a characteristic of at least one of video and audio contained in the broadcast signal;
a specific section end detection unit that detects time points that are the start or the end of a specific section by determining, each time a feature value is calculated, whether a predetermined time point contained in the portion of the received broadcast signal for which feature values have already been calculated is the start or the end of a specific section; and
a digest scene information creation unit that, each time a feature value is calculated, determines, based on the feature value, whether the broadcast signal in the sections of the program other than the specific section is a digest scene, and generates digest scene information.
[2] The digest generation device according to claim 1, wherein
the digest scene information creation unit includes a digest section detection unit that detects digest candidate sections in the received broadcast signal by determining, each time a feature value is calculated for a unit time of the broadcast signal, whether the content contained in that unit time of the broadcast signal is a digest scene based on the feature value, and
each time the specific section end detection unit detects a pair of the start and the end of a specific section, the digest scene information creation unit determines whether the specific section from that start to that end overlaps the digest candidate sections, and generates, as the digest scene information, information indicating the digest candidate sections detected by the digest section detection unit excluding any digest candidate section that overlaps the specific section.
[3] The digest generation device according to claim 1, wherein
the digest scene information creation unit includes a temporary storage unit that stores the calculated feature values from the most recent calculation time point back for a predetermined length of time, and
each time a feature value is calculated, the digest scene information creation unit determines whether the time point of the feature value stored in the temporary storage unit falls between the start and the end of a specific section detected by the specific section end detection unit, and, only when it does not, detects the content that is a digest scene among the content contained in the unit time of the broadcast signal and generates the digest scene information.
[4] The digest generation device according to claim 2, wherein
the feature value calculation unit calculates first and second feature values,
the specific section end detection unit determines the start or the end of a specific section based on the first feature value, and
the digest section detection unit detects the digest candidate sections based on the second feature value.
[5] The digest generation device according to claim 1, wherein the specific section end detection unit includes:
a specific section candidate detection unit that, when the feature value satisfies a predetermined condition, detects a section containing only feature values satisfying the condition as a specific section candidate; and
a specific section determination unit that detects candidates for the start or the end of a specific section based on the time differences between the specific section candidates within the program.
[6] The digest generation device according to claim 5, wherein, each time a specific section candidate is detected, if the time point a predetermined time before the detected specific section candidate is contained in an already detected specific section candidate, the specific section determination unit detects that time point the predetermined time before as the start of a specific section and the detected specific section candidate as the end of the specific section.
[7] The digest generation device according to claim 5, wherein the specific section determination unit includes:
a determination unit that, each time a specific section candidate is detected, determines whether an already detected specific section candidate exists at the time point a predetermined first time before the last detected specific section candidate, or at the time point a predetermined second time before the last detected specific section candidate;
an addition unit that, when the determination unit determines that such a specific section candidate exists, adds points to each of the specific section candidate determined to exist and the last detected specific section candidate;
a start determination unit that, each time a predetermined third time elapses after a target candidate whose points are equal to or greater than a predetermined value is detected, determines whether a specific section candidate whose points are equal to or greater than the predetermined value exists at the time point the third time before the target candidate and, if none exists, sets the target candidate as the start of a specific section; and
an end determination unit that, each time the predetermined third time elapses after a target candidate whose points are equal to or greater than the predetermined value is detected, determines whether a specific section candidate whose points are equal to or greater than the predetermined value exists at the time point at which the third time has elapsed and, if none exists, sets the target candidate as the end of the specific section.
[8] The digest generation device according to claim 5, wherein
the feature value calculation unit calculates the audio power level of the audio signal as the feature value, and
the specific section candidate detection unit detects, as the specific section candidates, silent sections in which the power level is equal to or less than a predetermined value.
[9] The digest generation device according to claim 5, wherein
the feature value calculation unit calculates luminance information based on the video signal as the feature value, and
the specific section candidate detection unit detects, as the specific section candidates, scene change points at which the amount of change in the luminance information is equal to or greater than a predetermined value.
[10] A digest generation method for generating digest scene information concerning a broadcast program when the broadcast signal of the program is received and recorded on a recording medium, the method comprising:
a feature value calculation step of, each time a predetermined unit time of the broadcast signal is received, calculating, from the received unit time of the broadcast signal, at least one type of feature value indicating a characteristic of at least one of video and audio contained in the broadcast signal;
a specific section end detection step of detecting time points that are the start or the end of a specific section by determining, each time a feature value is calculated, whether a predetermined time point contained in the portion of the received broadcast signal for which feature values have already been calculated is the start or the end of a specific section; and
a digest scene information creation step of, each time a feature value is calculated, determining, based on the feature value, whether the broadcast signal in the sections of the program other than the specific section is a digest scene, and generating digest scene information.
[11] The digest generation method according to claim 10, wherein
the digest scene information creation step includes a digest section detection step of detecting digest candidate sections in the received broadcast signal by determining, each time a feature value is calculated for a unit time of the broadcast signal, whether the content contained in that unit time of the broadcast signal is a digest scene based on the feature value, and
each time a pair of the start and the end of a specific section is detected in the specific section end detection step, the digest scene information creation step determines whether the specific section from that start to that end overlaps the digest candidate sections, and generates, as the digest scene information, information indicating the digest candidate sections detected in the digest section detection step excluding any digest candidate section that overlaps the specific section.
[12] The digest generation method according to claim 10, wherein
the digest scene information creation step includes a temporary storage step of storing the calculated feature values from the most recent calculation time point back for a predetermined length of time, and
each time a feature value is calculated, the digest scene information creation step determines whether the time point of the feature value stored in the temporary storage step falls between the start and the end of a specific section detected in the specific section end detection step, and, only when it does not, detects the content that is a digest scene among the content contained in the unit time of the AV signal and generates the digest scene information.
[13] A recording medium storing a digest generation program to be executed by a computer of a digest generation device that generates digest scene information concerning a broadcast program when the broadcast signal of the program is received and recorded on a recording medium, the program causing the computer to execute:
a feature value calculation step of, each time a predetermined unit time of the broadcast signal is received, calculating, from the received unit time of the broadcast signal, at least one type of feature value indicating a characteristic of at least one of video and audio contained in the broadcast signal;
a specific section end detection step of detecting time points that are the start or the end of a specific section by determining, each time a feature value is calculated, whether a predetermined time point contained in the portion of the received broadcast signal for which feature values have already been calculated is the start or the end of a specific section; and
a digest scene information creation step of, each time a feature value is calculated, determining, based on the feature value, whether the broadcast signal in the sections of the program other than the specific section is a digest scene, and generating digest scene information.
[14] The recording medium according to claim 13, wherein
the digest scene information creation step includes a digest section detection step of detecting digest candidate sections in the received broadcast signal by determining, each time a feature value is calculated for a unit time of the broadcast signal, whether the content contained in that unit time of the broadcast signal is a digest scene based on the feature value, and
each time a pair of the start and the end of a specific section is detected in the specific section end detection step, the digest scene information creation step determines whether the specific section from that start to that end overlaps the digest candidate sections, and generates, as the digest scene information, information indicating the digest candidate sections detected in the digest section detection step excluding any digest candidate section that overlaps the specific section.
[15] The recording medium according to claim 13, wherein
the digest scene information creation step includes a temporary storage step of storing the calculated feature values from the most recent calculation time point back for a predetermined length of time, and
each time a feature value is calculated, the digest scene information creation step determines whether the time point of the feature value stored in the temporary storage step falls between the start and the end of a specific section detected in the specific section end detection step, and, only when it does not, detects the content that is a digest scene among the content contained in the unit time of the AV signal and generates the digest scene information.
[16] An integrated circuit used in a digest generation device that generates digest scene information concerning a broadcast program when the broadcast signal of the program is received and recorded on a recording medium, the integrated circuit comprising:
a feature value calculation unit that, each time a predetermined unit time of the broadcast signal is received, calculates, from the received unit time of the broadcast signal, at least one type of feature value indicating a characteristic of at least one of video and audio contained in the broadcast signal;
a specific section end detection unit that detects time points that are the start or the end of a specific section by determining, each time a feature value is calculated, whether a predetermined time point contained in the portion of the received broadcast signal for which feature values have already been calculated is the start or the end of a specific section; and
a digest scene information creation unit that, each time a feature value is calculated, determines, based on the feature value, whether the broadcast signal in the sections of the program other than the specific section is a digest scene, and generates digest scene information.
[17] The integrated circuit according to claim 16, wherein
the digest scene information creation unit includes a digest section detection unit that detects digest candidate sections in the received broadcast signal by determining, each time a feature value is calculated for a unit time of the broadcast signal, whether the content contained in that unit time of the broadcast signal is a digest scene based on the feature value, and
each time the specific section end detection unit detects a pair of the start and the end of a specific section, the digest scene information creation unit determines whether the specific section from that start to that end overlaps the digest candidate sections, and generates, as the digest scene information, information indicating the digest candidate sections detected by the digest section detection unit excluding any digest candidate section that overlaps the specific section.
[18] The integrated circuit according to claim 16, wherein
the digest scene information creation unit includes a temporary storage unit that stores the calculated feature values from the most recent calculation time point back for a predetermined length of time, and
each time a feature value is calculated, the digest scene information creation unit determines whether the time point of the feature value stored in the temporary storage unit falls between the start and the end of a specific section detected by the specific section end detection unit, and, only when it does not, detects the content that is a digest scene among the content contained in the unit time of the AV signal and generates the digest scene information.
PCT/JP2006/314589 2005-07-27 2006-07-24 Digest generation device, digest generation method, recording medium containing a digest generation program, and integrated circuit used in digest generation device WO2007013407A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2007528453A JPWO2007013407A1 (en) 2005-07-27 2006-07-24 Digest generating apparatus, digest generating method, recording medium storing digest generating program, and integrated circuit used for digest generating apparatus
US11/994,827 US20090226144A1 (en) 2005-07-27 2006-07-24 Digest generation device, digest generation method, recording medium storing digest generation program thereon and integrated circuit used for digest generation device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2005217724 2005-07-27
JP2005-217724 2005-07-27

Publications (1)

Publication Number Publication Date
WO2007013407A1 true WO2007013407A1 (en) 2007-02-01

Family

ID=37683303

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2006/314589 WO2007013407A1 (en) 2005-07-27 2006-07-24 Digest generation device, digest generation method, recording medium containing a digest generation program, and integrated circuit used in digest generation device

Country Status (4)

Country Link
US (1) US20090226144A1 (en)
JP (1) JPWO2007013407A1 (en)
CN (1) CN101228786A (en)
WO (1) WO2007013407A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2157580A1 (en) * 2008-08-22 2010-02-24 Panasonic Corporation Video editing system
JP2016090774A (en) * 2014-11-04 2016-05-23 ソニー株式会社 Information processing device, information processing method and program
JP2019020743A (en) * 2018-10-04 2019-02-07 ソニー株式会社 Information processing device

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9832022B1 (en) * 2015-02-26 2017-11-28 Altera Corporation Systems and methods for performing reverse order cryptographic operations on data streams

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH1032776A (en) * 1996-07-18 1998-02-03 Matsushita Electric Ind Co Ltd Video display method and recording/reproducing device
JP2001177804A (en) * 1999-12-20 2001-06-29 Toshiba Corp Image recording and reproducing device
JP2005175710A (en) * 2003-12-09 2005-06-30 Sony Corp Digital recording and reproducing apparatus and digital recording and reproducing method

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09312827A (en) * 1996-05-22 1997-12-02 Sony Corp Recording and reproducing device
US6160950A (en) * 1996-07-18 2000-12-12 Matsushita Electric Industrial Co., Ltd. Method and apparatus for automatically generating a digest of a program
JPH10224722A (en) * 1997-02-07 1998-08-21 Sony Corp Commercial scene detector and its detection method
EP0977172A4 (en) * 1997-03-19 2000-12-27 Hitachi Ltd Method and device for detecting starting and ending points of sound section in video
JP4178629B2 (en) * 1998-11-30 2008-11-12 ソニー株式会社 Information processing apparatus and method, and recording medium
US7155735B1 (en) * 1999-10-08 2006-12-26 Vulcan Patents Llc System and method for the broadcast dissemination of time-ordered data
JP3632646B2 (en) * 2001-11-09 2005-03-23 日本電気株式会社 Communication system, communication terminal, server, and frame transmission control program
US7703044B2 (en) * 2001-11-19 2010-04-20 Ricoh Company, Ltd. Techniques for generating a static representation for time-based media information
US7206494B2 (en) * 2002-05-09 2007-04-17 Thomson Licensing Detection rules for a digital video recorder
US7260308B2 (en) * 2002-05-09 2007-08-21 Thomson Licensing Content identification in a digital video recorder
JP2004265477A (en) * 2003-02-28 2004-09-24 Canon Inc Regeneration apparatus
US20050001842A1 (en) * 2003-05-23 2005-01-06 Woojin Park Method, system and computer program product for predicting an output motion from a database of motion data
US7260035B2 (en) * 2003-06-20 2007-08-21 Matsushita Electric Industrial Co., Ltd. Recording/playback device
EP1708101B1 (en) * 2004-01-14 2014-06-25 Mitsubishi Denki Kabushiki Kaisha Summarizing reproduction device and summarizing reproduction method
JP2005229156A (en) * 2004-02-10 2005-08-25 Funai Electric Co Ltd Decoding and recording device
US20050226601A1 (en) * 2004-04-08 2005-10-13 Alon Cohen Device, system and method for synchronizing an effect to a media presentation
WO2005109905A2 (en) * 2004-04-30 2005-11-17 Vulcan Inc. Time-based graphical user interface for television program information
JP2006050531A (en) * 2004-06-30 2006-02-16 Matsushita Electric Ind Co Ltd Information recording apparatus
US20060059510A1 (en) * 2004-09-13 2006-03-16 Huang Jau H System and method for embedding scene change information in a video bitstream

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH1032776A (en) * 1996-07-18 1998-02-03 Matsushita Electric Ind Co Ltd Video display method and recording/reproducing device
JP2001177804A (en) * 1999-12-20 2001-06-29 Toshiba Corp Image recording and reproducing device
JP2005175710A (en) * 2003-12-09 2005-06-30 Sony Corp Digital recording and reproducing apparatus and digital recording and reproducing method

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2157580A1 (en) * 2008-08-22 2010-02-24 Panasonic Corporation Video editing system
JP2016090774A (en) * 2014-11-04 2016-05-23 ソニー株式会社 Information processing device, information processing method and program
JP2019020743A (en) * 2018-10-04 2019-02-07 ソニー株式会社 Information processing device

Also Published As

Publication number Publication date
CN101228786A (en) 2008-07-23
JPWO2007013407A1 (en) 2009-02-05
US20090226144A1 (en) 2009-09-10

Similar Documents

Publication Publication Date Title
JP4757876B2 (en) Digest creation device and program thereof
US7941031B2 (en) Video processing apparatus, IC circuit for video processing apparatus, video processing method, and video processing program
JP3891111B2 (en) Acoustic signal processing apparatus and method, signal recording apparatus and method, and program
JP3744464B2 (en) Signal recording / reproducing apparatus and method, signal reproducing apparatus and method, program, and recording medium
JP4387408B2 (en) AV content processing apparatus, AV content processing method, AV content processing program, and integrated circuit used for AV content processing apparatus
WO2007132566A1 (en) Video reproduction device, video reproduction method, and video reproduction program
US7149365B2 (en) Image information summary apparatus, image information summary method and image information summary processing program
JP2007266653A (en) Commercial detection apparatus and video playback apparatus
WO2007013407A1 (en) Digest generation device, digest generation method, recording medium containing a digest generation program, and integrated circuit used in digest generation device
JP3879122B2 (en) Disk device, disk recording method, disk reproducing method, recording medium, and program
US8234278B2 (en) Information processing device, information processing method, and program therefor
US20130101271A1 (en) Video processing apparatus and method
JP5002227B2 (en) Playback device
JP2006270233A (en) Method for processing signal, and device for recording/reproducing signal
WO2007145281A1 (en) Video reproducing device, video reproducing method, and video reproducing program
JP2007189448A (en) Video storing and reproducing device
JP2006115224A (en) Video recorder
JP2007082091A (en) Apparatus and method for setting delimiter information to video signal
JP5682167B2 (en) Video / audio recording / reproducing apparatus and video / audio recording / reproducing method
JP5560999B2 (en) Video / audio recording / reproducing apparatus and video / audio recording / reproducing method
JP6164445B2 (en) Chapter setting device
JP2002041095A (en) Compressed audio signal reproducing device
US20090136202A1 (en) Recording/playback device and method, program, and recording medium
JP2009194598A (en) Information processor and method, program, and recording medium
KR20040102962A (en) Apparatus for generating highlight stream in PVR and method for the same

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase
    Ref document number: 200680027069.7; Country of ref document: CN
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase
    Ref document number: 2007528453; Country of ref document: JP
WWE Wipo information: entry into national phase
    Ref document number: 11994827; Country of ref document: US
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: pct application non-entry in european phase
    Ref document number: 06781501; Country of ref document: EP; Kind code of ref document: A1