WO2014155526A1

WO2014155526A1 - Information processing device and information processing method

Info

Publication number: WO2014155526A1
Application number: PCT/JP2013/058791
Authority: WO
Inventors: 剛舘野
Original assignee: 株式会社東芝
Priority date: 2013-03-26
Filing date: 2013-03-26
Publication date: 2014-10-02

Abstract

In an embodiment, an information processing device is provided with a searching means (28c) and an analyzing means (28d). The searching means (28c) performs a song search on content to be analyzed at prescribed time intervals. The analyzing means (28d) analyzes a playback state for songs included in the content on the basis of the song search results obtained at the prescribed intervals by the search means (28c).

Description

Information processing apparatus and information processing method

Embodiments of the present invention relate to an information processing apparatus and an information processing method for analyzing a reproduction state of music information included in recorded content.

As is well known, in recent years, receiving devices that receive television broadcasts, radio broadcasts, and the like can not only receive and record broadcast content, but also by accessing a predetermined server via a network line, A function of acquiring attribute information related to various types of information included in the recorded content has been installed.

By using this function, when the receiving device generates a feature value from the music information included in the recorded content and sends it to the server, the server searches for the music piece based on the feature value. A so-called song search function in which attribute information corresponding to a song is returned to the receiving device has been put into practical use.

JP 2007-219178 A

By using the song search function, it is possible to quickly analyze the playback status of music information (songs) included in the recorded content, that is, the tunes and song sections, etc. As a result, the user can perform various kinds of songs included in the recorded content. It is an object of the present invention to provide an information processing apparatus and an information processing method that can recognize information in advance and make it easier to handle when viewing recorded content.

According to the embodiment, the information processing apparatus includes search means and analysis means. The search means searches for music at a predetermined time interval for the content to be analyzed. The analysis unit analyzes the reproduction state of the song included in the content based on the song search result obtained at predetermined time intervals by the search unit.

By using the song search function, it is possible to quickly analyze the playback status of music information (songs) included in the recorded content, that is, the tunes and song sections, etc. As a result, the user can perform various kinds of songs included in the recorded content. It is possible to provide an information processing apparatus and an information processing method that can be recognized in advance and can be handled easily when viewing recorded content.

It is a block block diagram shown in order to demonstrate roughly an example of the content delivery system as embodiment. It is a block block diagram shown in order to demonstrate roughly an example of the signal processing system of the receiving terminal which comprises the content delivery system in the embodiment. 3 is an external view for explaining an example of a remote controller that operates the receiving terminal in the embodiment. FIG. It is a block block diagram shown in order to demonstrate an example of the music search process which the music search process part of the receiving terminal in the embodiment performs. It is a flowchart shown in order to demonstrate an example of the music search process operation which the music search process part of the receiving terminal in the embodiment performs. It is a flowchart shown in order to demonstrate an example of the music analysis process operation which the music analysis process part of the receiving terminal in the embodiment performs. It is a figure shown in order to demonstrate the music search data length utilized for the music search process which the music search process part of the receiving terminal in the embodiment performs. It is a flowchart shown in order to demonstrate an example of the music section specific process operation | movement and analysis process operation which the music analysis process part of the receiving terminal in the embodiment performs. It is a figure shown in order to demonstrate an example of the analysis process of the end position and start position of the music which the music analysis process part of the receiving terminal in the embodiment performs. 6 is a flowchart shown for explaining an example of a music end position analysis processing operation performed by a music analysis processing unit of the receiving terminal in the embodiment. It is a flowchart shown in order to demonstrate an example of the analysis process operation | movement of the music start position which the music analysis process part of the receiving terminal in the embodiment performs. It is a flowchart shown in order to demonstrate an example of the specific process operation | movement and analysis process operation | movement of an unspecified music area which the music analysis process part of the receiving terminal in the embodiment performs. It is a flowchart shown in order to demonstrate an example of the specific process operation | movement and analysis process operation | movement of an unspecified music area which the music analysis process part of the receiving terminal in the embodiment performs. It is a flowchart shown in order to demonstrate an example of the specific process operation | movement and analysis process operation | movement of an unspecified music area which the music analysis process part of the receiving terminal in the embodiment performs. It is a flowchart shown in order to demonstrate an example of the specific process operation | movement and analysis process operation | movement of an unspecified music area which the music analysis process part of the receiving terminal in the embodiment performs. It is a figure shown in order to demonstrate an example of the result of having analyzed the reproduction | regeneration state of the music information contained in the recording content which the music analysis process part of the receiving terminal in the same embodiment performed. It is a figure shown in order to demonstrate the other example of the result of having analyzed the reproduction | regeneration state of the music information contained in the recording content which the music analysis process part of the receiving terminal in the same embodiment performed.

Hereinafter, embodiments will be described in detail with reference to the drawings. According to the embodiment, the information processing apparatus includes a search unit and an analysis unit. The search means searches for music at predetermined time intervals for the content to be analyzed. The analysis unit analyzes the reproduction state of the song included in the content based on the song search result obtained at predetermined time intervals by the search unit.

FIG. 1 schematically shows an example of a content distribution system 11 described in this embodiment. First, in the content distribution system 11, program content distributed from a broadcasting station 12 using a broadcast wave as a medium is received by a receiving terminal 13 and used for video display, audio reproduction, and the like. The receiving terminal 13 also has a function of recording and reproducing the received program content.

Furthermore, in the content distribution system 11, program content is supplied from the broadcasting station 12 to the server 14 by wire or wireless communication and stored. The receiving terminal 13 communicates with the server 14 via a LAN (local area network) router 15 capable of wired or wireless communication, for example, a network circuit network 16 such as a fixed IP (internet protocol) communication network, and a gateway 17. It is accessible.

As a result, the receiving terminal 13 obtains a program content distributed from the server 14 based on a preset program distribution schedule and performs video display, audio reproduction, or the like, or a request from the server 14. Therefore, it is possible to realize a so-called VOD (video on demand) function that performs video display, audio playback, and the like based on program content acquired from the server 14.

In the content distribution system 11, the broadcast station 12 supplies the server 14 with attribute information related to various program contents stored in the broadcast or the server 14 and stores the attribute information. For this reason, the receiving terminal 13 can acquire attribute information for a desired program content by accessing the server 14 and make it available for viewing by the user.

Further, by using this attribute information acquisition function, in the content distribution system 11, when the receiving terminal 13 generates a feature amount from the music information included in the program content recorded by itself and sends it to the server 14, A so-called music search function is also realized, in which a music piece 14 searches for a music piece based on the feature amount and returns attribute information corresponding to the searched music piece to the receiving terminal 13.

By using this song search function, the receiving terminal 13 can analyze the reproduction state of the music information included in the program content recorded therein, that is, the song and song section, etc. it can. Thereby, the user can recognize in advance the reproduction state of the song included in the program content recorded in the receiving terminal 13, and can make the handling when viewing the recorded program content convenient.

FIG. 2 schematically shows an example of the signal processing system of the receiving terminal 13. That is, the broadcast signal received by the antenna 18 is supplied to the tuner unit 20 via the input terminal 19, so that the broadcast signal of a desired channel is selected. The broadcast signal selected by the tuner unit 20 is supplied to the demodulation processing unit 21 to demodulate the TS (transport stream), and the TS demodulated by the demodulation processing unit 21 is supplied to the signal processing unit 22. .

The signal processing unit 22 separates the input TS into a video component and an audio component, performs a decoding process on each stream, restores the digital video signal and the audio signal, and then restores the restored video signal. And predetermined digital signal processing is applied to the audio signal. The signal processing unit 22 outputs the restored video signal to the synthesis processing unit 23, and outputs the restored audio signal to the audio processing unit 24.

Among these, the composition processing unit 23 superimposes and outputs an OSD (on-screen display) signal on the video signal supplied from the signal processing unit 22. The video signal output from the synthesis processing unit 23 is supplied to the video processing unit 25 and converted into a format that can be displayed on a flat-type video display unit 26 having a liquid crystal display panel or the like at a subsequent stage, The video is supplied to the video display unit 26 and used for video display.

The audio processing unit 24 converts the input audio signal into an audio signal in a format that can be reproduced by the speaker 27 at the subsequent stage. Then, the audio signal output from the audio processing unit 24 is supplied to the speaker 27 for audio reproduction. Note that the audio signal output from the audio processing unit 24 is not limited to the speaker 17 and can be supplied to, for example, headphones (not shown) or the like for audio reproduction.

Here, in the receiving terminal 13, various operations including the above-described various receiving operations are comprehensively controlled by the control unit 28. The control unit 28 includes a CPU (central processing unit) 28 a and is transmitted from the operation unit 29 provided in the main body of the receiving terminal 13 or transmitted from the remote controller 30 and received by the receiving unit 31. By receiving the operation information, each part is controlled so that the operation content is reflected.

In this case, the control unit 28 controls each unit by using the memory unit 28b. The memory unit 28b mainly includes a ROM (read only した memory) storing a control program executed by the CPU 28a, a RAM (random access memory) for providing a work area to the CPU 28a, various setting information, A device including a nonvolatile memory in which control information and the like are stored is assumed.

Further, an HDD (hard disk drive) 32 is connected to the control unit 28. The control unit 28 controls the video signal and the audio signal obtained from the signal processing unit 22 to be supplied to the HDD 32 and recorded on the hard disk 32a based on the operation of the operation unit 29 or the remote controller 30 by the user. be able to.

Further, the control unit 28 reads out the video signal and the audio signal from the hard disk 32a by the HDD 32 and supplies them to the signal processing unit 22 based on the operation of the operation unit 29 or the remote controller 30 by the user. It can be controlled to be used for the above-described video display and audio reproduction.

Furthermore, a network interface 33 is connected to the control unit 28. The network interface 33 is connected to the LAN router 15 so as to be able to transmit information. Therefore, the control unit 28 can access the server 14 based on the operation of the operation unit 29 and the remote controller 30 by the user, and can acquire the program content and attribute information provided there. .

Of course, the program content and attribute information acquired from the server 14 are also used for the above-described video display by the video display unit 26 and audio reproduction by the speaker 27. Furthermore, it goes without saying that program content and attribute information acquired from the server 14 are also used for recording and reproduction on the hard disk 32a by the HDD 32.

In addition, the control unit 28 is provided with a music search processing unit 28c. As will be described in detail later, the song search processing unit 28c cuts out an audio stream of a predetermined section from the audio stream constituting the program content recorded on the hard disk 32a at regular time intervals, and decodes the cut out audio stream A feature value is generated from the processed audio signal and sent to the server 14, and the server 14 obtains a result of searching for a piece of music based on the feature value.

Further, the control unit 28 is provided with a music analysis processing unit 28d. As will be described in detail later, the music analysis processing unit 28d uses the search result acquired by the music search processing unit 28c to include the audio signal stream constituting the program content recorded on the hard disk 32a. It functions to analyze the reproduction state of the music information to be played, that is, the music piece, the music section, and the like.

FIG. 3 shows the external appearance of the remote controller 30. The remote controller 30 mainly includes a power key 30a, a numeric key 30b, a channel up / down key 30c, a volume adjustment key 30d, a cursor up key 30e, a cursor down key 30f, a cursor left key 30g, and a cursor right key. 30h, an enter key 30i, a menu key 30j, a return key 30k, an end key 301, four color (blue, red, green, yellow) color keys 30m are provided.

Further, the remote controller 30 is provided with a reproduction stop key 30n, a reproduction / pause key 30o, a reverse skip key 30p, a forward skip key 30q, a fast reverse key 30r, a fast forward key 30s, and the like. That is, the HDD 32 can be played, stopped, and paused by operating the playback stop key 30n or the playback / pause key 30o of the remote controller 30.

Further, by operating the backward skip key 30p or the forward skip key 30q of the remote controller 30, data such as video and audio read from the hard disk 32a by the HDD 32 can be reversed or forward with respect to the reproduction direction. Thus, it is possible to perform a so-called reverse skip or forward skip, which causes a predetermined amount to be skipped.

Furthermore, by operating the fast reverse key 30r, fast forward key 30s, etc. of the remote controller 30, the data such as video and audio read from the hard disk 32a by the HDD 32 is continuously reversed or forward with respect to the reproduction direction. Thus, so-called fast reverse playback and fast forward playback can be performed.

FIG. 4 shows an example of a music search process performed by the music search processing unit 28c using functional blocks. That is, when a song search request is supplied to the song search instruction unit 34, the song search instruction unit 34 generates a cut-out request to the voice cut-out unit 35, and the voice cut-out unit 35 sends a request to the signal processing unit 22. And request acquisition of the audio signal.

Then, in the signal processing unit 22, the audio stream acquisition unit 36 acquires an audio stream of a predetermined section from the audio stream separated from the TS, and outputs it to the audio decoding unit 37. The audio decoding unit 37 performs a decoding process on the input audio stream, generates a digital audio signal converted into, for example, PCM (pulse code modulation), and outputs the digital audio signal to the audio storage unit 38.

Then, the voice accumulation unit 38 accumulates the input voice signal for a predetermined section and outputs it to the voice cutout unit 35, whereby the voice signal cut out for the predetermined section is supplied to the voice cutout unit 35. . Thereafter, the voice cutout unit 35 outputs the input voice signal to the feature value generation unit 39. The feature value generation unit 39 generates a feature value necessary for performing a song search from the input audio signal, and transmits the feature value to the server 14.

As a result, the server 14 searches for the music corresponding to the received feature value, and returns the search result to the receiving terminal 13. The search result returned from the server 14 is acquired by the search result acquisition unit 40 of the receiving terminal 13, and after it is determined whether or not the search result determination unit 41 is valid, the search result is stored in the search result storage unit 42. They are taken out as necessary and used for display or the like.

FIG. 5 shows a flowchart summarizing an example of the music search processing operation performed by the music search processing unit 28c. That is, when the process is started (step S5a), the song search processing unit 28c cuts out the audio signal for a predetermined section in step S5b, and performs a song search from the cut out audio signal in step S5c. Necessary feature amounts are generated, and the generated feature amounts are transmitted to the server 14 in step S5d.

Then, in step S5e, the server 14 searches for the music piece based on the received feature amount, and transmits the search result to the receiving terminal 13. Thereby, the music search processing unit 28c receives and determines the search result in step S5f, accumulates the search result in step S5g, and ends the process (step S5h).

In this embodiment, the song analysis processing unit 28d uses the search result of the song search processing by the song search processing unit 28c to determine the audio signal constituting the program content recorded in the hard disk 32a. The reproduction state of the music information included in the music, that is, the music piece, the music section, and the like are analyzed.

FIG. 6 shows an example of a processing operation in which the music analysis processing unit 28d uses the search result of the music search processing by the music search processing unit 28c to analyze the music information from the audio signal constituting the program content recorded on the hard disk 32a. The flowchart which summarized these is shown. That is, when the process is started (step S6a), the song analysis processing unit 28d, in step S6b, performs a certain time interval (Intvl) from the beginning to the end of the audio signal constituting the program content to be analyzed. [Sec]: For example, in 30 seconds to 1 minute), the music search processing unit 28c is caused to execute the music search processing.

In this case, the audio data length (music search data length) used for the music search process is AudLen [sec], and the minimum audio data length necessary for the music search processing unit 28c to perform the music search process is Lmin [sec. ]
AudLen ＝ Lmin × 2
Satisfy the relationship. The song search interval Intvl [sec] and song search data length AudiLen [sec]
Intvl ＝ AudLen
Satisfy the relationship.

Here, the song search data length AudLen [sec] is set to twice the minimum voice data length Lmin [sec] necessary for performing the song search processing, so that the song search is performed near the change of the song. In other words, as shown in FIGS. 7A, 7B, and 7C, the case where the music search process is performed on the audio signal in which the music B continues after the music A is taken into consideration.

That is, if AudLen [sec] is set to be twice as long as Lmin [sec], the song A and B at the change of the song search data length AudLen [sec] as shown in FIG. Lmin [sec] including only the audio signal of the song A, and Lmin [sec] including only the audio signal of the song A and Lmin including only the audio signal of the song B as shown in FIG. There are cases where [sec] can be secured and cases where Lmin [sec] including only the audio signal of the music piece B can be secured as shown in FIG.

If Lmin [sec] including only the audio signal of the music A can be secured in the music search data length AudLen [sec], the music search result is one candidate for the music A only. In addition, in the music search data length AudLen [sec], when Lmin [sec] including only the audio signal of the music A and Lmin [sec] including only the audio signal of the music B can be secured, The song search results are two candidates, song A and song B. Furthermore, when Lmin [sec] including only the audio signal of the song B can be secured in the song search data length AudLen [sec], the song search result is only one candidate for the song B.

Note that the song search data length AudLen [sec] is shorter than twice the minimum voice data length Lmin [sec] necessary for performing the song search process, that is,
AudLen <Lmin × 2
If set to, a situation occurs in which Lmin [sec] including only the audio signal of the music A or B cannot be secured in the music search data length AudLen [sec] at the turn of the music A and B. In this case, a reliable music search result cannot be obtained. In addition, the music search may fail even when the human voice is included with the music. As a countermeasure, it has been found that it is effective to lengthen the music search data length AudLen [sec]. Therefore, AudLen [sec] is set to be twice as long as Lmin [sec].

Next, in step S6c, the song analysis processing unit 28d refers to the song search results obtained by performing the song search processing at a constant time interval Intvl [sec] one by one, and determines whether or not the analysis is to be performed. A so-called song segment identification process is performed. In addition, a so-called song segment analysis process is performed for estimating a song start position and end position for a song segment determined to be analyzed. Details of the song segment identification processing and analysis processing will be described later.

Thereafter, in step S6d, the song analysis processing unit 28d executes the zone specifying process and the analysis process for the song section that has not been specified in the process of step S6c. Details of the specifying process and the analyzing process for the unspecified section will be described later.

Then, the song analysis processing unit 28d performs a filtering process on the character string that is the song search result in step S6e. That is, the song search result is returned as a character string from the server 14. In this case, even if the same song has a plurality of notations on the database, it may be returned as a different song search result. For example, there are cases where the alphabet is in uppercase and lowercase letters, or whether the font is full-width or half-width. In this embodiment, the existing filtering process is performed on the character string that is the music search result, thereby rounding off the difference in notation.

Thereafter, in step S6f, the music analysis processing unit 28d stores the analysis result after the filtering process on the character string in the memory unit 28b, the hard disk 32a, etc., and ends the processing (step S6g). The stored analysis results can be displayed as a list on the video processing unit 26 based on, for example, the operation of the operation unit 29 or the remote controller 30 by the user.

As a result, it is possible to quickly analyze the reproduction state of the music information included in the audio signal constituting the program content recorded on the hard disk 32a, that is, the music piece, the music section, and the like. Various kinds of music included in the recorded content can be recognized in advance, and handling when viewing the recorded content can be made convenient.

In the processing operation described with reference to FIG. 6, the music search processing unit 28c is caused to execute the music search processing at a constant time interval Intvl [sec]. It can be changed at a predetermined time interval for each scene, such as changing to the same program content.

FIG. 8 shows a flow chart summarizing an example of the song segment identification processing operation and analysis processing operation performed in step S6c. That is, when the process is started (step S8a), the song analysis processing unit 28d performs a song search process at a predetermined time interval Intvl [sec] in step S8b, among a plurality of song search results obtained. Then, one music search result is input based on a predetermined order (for example, oldest order in time).

Thereafter, in step S8c, the music analysis processing unit 28d determines whether or not the input music search result is a so-called no result state in which no specific music is shown. If it is determined that there is no result (YES), in step S8d, the song analysis processing unit 28d sends the song search result to the analysis processing in the unspecified section performed in step S6d. The process returns to S8b.

On the other hand, if it is determined in step S8c that the input song search result indicates a specific song, that is, it is determined that there is no result (NO), the song analysis processing unit 28d, in step S8e, The number of songs obtained as a song search result, that is, whether or not the number of candidate songs is two or more is determined. If it is determined that there are two or more songs (YES), the song search results are displayed in step S8d. The analysis process is performed in the unspecified section performed in step S6d, and the process returns to step S8b.

On the other hand, if it is determined in step S8e that the number of candidate songs is not two or more (NO), the song analysis processing unit 28d obtains the input song search result before or after that in step S8f. It is determined whether or not it is the same as the music search result, and if it is determined that it is the same (YES), in step S8g, the sections in which both music search results are obtained are combined as sections where the same music exists. Then, the process returns to step S8b.

If it is determined in step S8f that the input song search result is not the same as the song search result obtained before or after that time (NO), the song analysis processing unit 28d, in step S8h, A so-called song section analysis process (details will be described later) for estimating the start position or end position of the song indicated by the song search result is executed.

Thereafter, in step S8i, the song analysis processing unit 28d determines whether or not the song segment identification processing and analysis processing have been completed for all the song search results obtained by performing the song search processing. If it is determined that there is not (NO), the process returns to step S8b, and if it is determined that the process is completed (YES), the process ends (step S8j).

FIGS. 9A and 9B specifically show the song segment identification processing and analysis processing described in FIG. For example, as shown in FIG. 9A, among the song search results obtained at regular time intervals Intvl [sec], the song search result obtained at time T1 indicates the song TNs, and the next time T2 When the music search result obtained in step S8f also indicates the same music TNs (YES in step S8f), the music analysis processing unit 28d uses the section (Intvl [ sec]) and the section (Intvl [sec]) used to obtain the song search result (TNs) at time T2 is regarded as a section in which the same song TNs exists (step S8g). To identify.

If the song search result obtained at time T2 indicates the song TNs and the song search result obtained at the next time T3 indicates a song TN different from the song TNs (NO in step S8f), The analysis processing unit 28d determines that the song TNs ends between times T2 and T3, and estimates the end position of the song TNs between times T2 and T3 (step S8h). Analysis processing is performed.

On the other hand, as shown in FIG. 9B, among the music search results obtained at regular time intervals Intvl [sec], the music search result obtained at time T1 indicates the music TN, and the next time T2 If the song search result obtained in step S3 indicates a song TNe different from the song TN, and the song search result obtained at the next time T3 also indicates the same song TNe, the song analysis processing unit 28d The section (Intvl [sec]) used to obtain the song search result (TNe) at T2 and the section (Intvl [sec]) used to obtain the song search result (TNe) at time T3, The song section is identified as a section where the same song TNe exists, and the song TNe is determined to start between the times T1 and T2, and the start position of the song TNe is determined between the times T1 and T2. An estimation processing of the song section is performed.

FIG. 10 shows a flowchart summarizing the analysis processing operation for estimating the end position of the song TNs between times T2 and T3 in FIG. 9A. That is, when the process is started (step S10a), the song analysis processing unit 28d, in step S10b, the position Ns on the audio signal corresponding to the time T2 when the song TNs was obtained as the song search result, and the song search result. As a result, an intermediate position N_mid with respect to the position N on the audio signal corresponding to the time T3 when the song TN is obtained is calculated by the following equation.

N_mid = (Ns + N) / 2
Then, in step S10c, the music analysis processing unit 28d determines whether or not the time difference between the position Ns and the intermediate position N_mid is equal to or less than a preset threshold value (for example, 1 second). If it is determined that there is (YES), in step S10d, the position Ns or the intermediate position N_mid is determined as the end position of the music TNs, and the process is ended (step S10e).

On the other hand, if it is determined in step S10c that the time difference between the position Ns and the intermediate position N_mid is not less than 1 second (NO), the music analysis processing unit 28d performs the music search processing unit at the intermediate position N_mid in step S10f. The music search process is performed in 28c, and the music TN_mid is obtained as a music search result.

Thereafter, in step S10g, the song analysis processing unit 28d determines whether or not the song TN_mid indicated by the song search result at the intermediate position N_mid is the same as the song TNs indicated by the song search result at the position Ns. (YES), the position Ns is updated to the intermediate position N_mid in step S10h, and the process returns to step S10b.

When it is determined in step S10g that the song TN_mid indicated by the song search result at the intermediate position N_mid is not the same as the song TNs indicated by the song search result at the position Ns (NO), the song analysis processing unit 28d In step S10i, the position N is updated to the intermediate position N_mid, and the process returns to step S10b.

That is, when the end position of the music TNs exists between the position Ns and the position N, a music search is performed at an intermediate position N_mid between the position Ns and the position N, and as a result, the music TNs is obtained. It is determined whether the position Ns is updated to the intermediate position N_mid and the music search at the intermediate position with the position N is performed, or the position N is updated to the intermediate position N_mid and the music search at the intermediate position with the position Ns is performed. When the difference between the position Ns and the intermediate position N_mid is less than or equal to a predetermined threshold value, the position Ns or the intermediate position N_mid is determined as the end position of the song TNs.

FIG. 11 shows a flowchart summarizing the analysis processing operation for estimating the start position of the song TNe between times T1 and T2 in FIG. 9B. That is, when the process is started (step S11a), the song analysis processing unit 28d, in step S11b, the position N on the audio signal corresponding to the time T1 when the song TN was obtained as the song search result, and the song search result. As an intermediate position Ne_mid with respect to the position Ne on the audio signal corresponding to the time T2 at which the song TNe is obtained as follows.

Ne_mid = (Ne + N) / 2
Then, in step S11c, the music analysis processing unit 28d determines whether or not the time difference between the position Ne and the intermediate position Ne_mid is equal to or less than a preset threshold value (for example, 1 second). If it is determined that there is (YES), in step S11d, the position Ne or the intermediate position Ne_mid is determined as the start position of the music TNe, and the process ends (step S11e).

If it is determined in step S11c that the time difference between the position Ne and the intermediate position Ne_mid is not less than 1 second (NO), the music analysis processing unit 28d performs the music search processing unit at the intermediate position Ne_mid in step S11f. The music search process is performed in 28c, and the music TNE_mid is obtained as a music search result.

Thereafter, in step S11g, the music analysis processing unit 28d determines whether or not the music TNe_mid indicated by the music search result at the intermediate position Ne_mid is the same as the music T Ne indicated by the music search result at the position Ne. If it is determined (YES), the position Ne is updated to the intermediate position Ne_mid in step S11h, and the process returns to step S11b.

When it is determined in step S11g that the song TNe_mid indicated by the song search result at the intermediate position Ne_mid is not the same as the song TNe indicated by the song search result at the position Ne (NO), the song analysis processing unit 28d In step S11i, the position N is updated to the intermediate position Ne_mid, and the process returns to step S11b.

That is, when the start position of the music TNe exists between the position N and the position Ne, a music search is performed at an intermediate position Ne_mid between the position N and the position Ne, and as a result, whether or not the music TNe is obtained, It is determined whether the position Ne is updated to the intermediate position Ne_mid and the music search at the intermediate position with the position N is performed, or the position N is updated to the intermediate position Ne_mid and the music search at the intermediate position with the position Ne is performed. When the difference between the position Ne and the intermediate position Ne_mid is equal to or less than a predetermined threshold value, the position Ne or the intermediate position Ne_mid is determined as the start position of the song TNe.

FIGS. 12 to 15 are flowcharts summarizing an example of the specifying process operation and the analyzing process operation of the unspecified music section performed in step S6d. That is, when the process is started (step S12a), the song analysis processing unit 28d determines whether or not there is a song search result that was sent to the analysis process in the unspecified section in step S8d in step S12b. If it is determined that it does not exist (NO), the process ends (step S12c).

If it is determined in step S12b that there is a music search result sent to the analysis process in the unspecified section (YES), the music analysis processing unit 28d determines in step S12d that a predetermined order (eg, time In accordance with the oldest etc.), and in step S12e, it is determined whether or not the input song search result indicates that there is no specific song, so-called no result. Determine.

If it is determined that there is no result (YES), the song analysis processing unit 28d determines that the song search results obtained before and after the song search result are already in the song section or in step S13a. It is determined whether or not it has been analyzed as a non-music section. If it is determined that it has been analyzed (YES), the section used to obtain the song search result is determined as a non-music section in step S13b. Then, the process returns to step S12b.

If it is determined in step S13a that the song search result obtained before and after the input song search result has not been analyzed (NO), the song analysis processing unit 28d performs step S13c before and after. It is determined whether or not the obtained music search result is a so-called no result state in which no specific music is indicated.

If it is determined that there is no result (NO), the song analysis processing unit 28d estimates the end position of the non-music section using the analysis processing operation described above with reference to FIG. 10 in step S13d. The process returns to step S12b.

If it is determined in step S13c that the song search result obtained before and after the input song search result is in a state of no result (YES), the song analysis processing unit 28d determines in step S13e. The song search processing unit 28c performs song search processing at an intermediate position between the position on the audio signal from which the song search result is obtained and the position on the audio signal from which the previous or subsequent song search result is obtained. Make it.

Then, in step S13f, the song analysis processing unit 28d determines whether or not the song search result at the intermediate position is a so-called no-result state in which no particular song is indicated, and there is no result. If it is determined that it is not (NO), in step S13g, the section used to obtain the song search result is newly added as an unspecified section, and the process returns to step S12b.

If it is determined in step S13f that the result of the song search at the intermediate position is in a state of no result (YES), the song analysis processing unit 28d is used to obtain the input song search result in step S13h. The obtained sections and the sections used for obtaining the music search results before and after the combined sections are combined as non-music sections, and the process returns to step S12b.

Next, when it is determined in step S12e that the input song search result is not in a state of no result (NO), the song analysis processing unit 28d determines the number of songs obtained as a song search result in step S12f, That is, it is determined whether or not the number of candidate songs is one, and if it is determined that there is one song (YES), in step S12g, the analysis processing operation described above with reference to FIGS. 10 and 11 is used. Thus, the end position and start position of the candidate song are estimated, and the process returns to step S12b.

If it is determined in step S12f that the number of candidate songs is not one (NO), the song analysis processing unit 28d determines whether or not the number of candidate songs is two in step S12h. If it is determined that there are three or more songs (NO), in step S14a, there is the same song as the song obtained from the song search results before and after the entered song search result in step S14a. It is determined whether or not to do.

If it is determined that it exists (YES), the song analysis processing unit 28d uses the section used for obtaining the input song search result in step S14b to obtain the song search result before and after that. Are combined as the same song section as that existing in the section, and the process returns to the process of step S12b.

If it is determined in step S14a that the same song as the song obtained from the song search results before and after the input song search result does not exist in the candidate song (NO), the song analysis processing unit 28d In step S14c, it is determined whether or not the candidate song includes the same song as the song obtained from the song search result before or after the input song search result.

If it is determined that it exists (YES), the song analysis processing unit 28d obtains the song search result before or after the section used to obtain the entered song search result in step S14d. The music is combined as the same music section as the music existing in the used section, and the process returns to step S12b.

If it is determined in step S14c that the candidate song does not include the same song as the song obtained from the song search result before or after the input song search result (NO), the song analysis processing unit In step S14e, in order to obtain an input song search result for the first candidate song determined based on a preset priority in consideration of the case where there are three or more candidate songs in step S14e. It is estimated that the song exists in the used section, and the process returns to step S12b.

On the other hand, when it is determined in step S12h that the number of candidate songs is two (YES), the song analysis processing unit 28d does not select one of the two songs, but two songs are continuous. It operates to search for the boundary between two songs. That is, the song analysis processing unit 28d determines a song search start position on a section where two candidate songs are searched in step S15a, and resets the time counter. In this case, the song search start position is the head position of the section in which two candidate songs are searched.

Then, in step S15b, the song analysis processing unit 28d determines whether or not the newly stored song search result has changed compared to the previous song search result, and if it is determined that it has changed (YES) In step S15c, it is determined that the changed time is the boundary of the song, and the process returns to step S12b.

If it is determined in step S15b that the song search result has not changed (NO), the song analysis processing unit 28d sets the song search data length AudLen [sec] to Lmin [sec] at the head position in step S15d. The music search processing unit 28c performs music search processing, and the music search results are stored in step S15e.

Thereafter, when one second is measured by the time counter in step S15f, the music analysis processing unit 28d moves the music search position by a predetermined amount in step S15g and returns to the processing in step S15b. In this case, the music search position is moved by a certain amount toward the end position of the section in which two candidate songs are searched.

FIG. 16 shows an example of the result of analyzing the reproduction state of the music information using the song search results as described in FIGS. 6 to 15 for the program content recorded on the hard disk 32a by the HDD 32. Yes. The analysis result shown in this example is obtained by the user operating the menu key 30j of the remote controller 30 to divide into a plurality of menu screens having a hierarchical structure and requesting the display of the analysis result, thereby the video display unit 26 Can be listed.

According to this example, 10 minutes, 2 hours and 55 minutes, 3 hours and 8 minutes, 3 hours and 29 minutes, 4 hours and 27 minutes, and 4 hours and 44 minutes after the start of program content playback, You can see if the song with the track name starts. In addition, it can be seen that after the start of program content playback, non-music intervals are reached after 0 minutes, 1 hour 48 minutes, 3 hours 43 minutes, and 4 hours 58 minutes. Thereby, since the user can recognize in advance various kinds of music included in the recorded content, it is possible to make handling convenient when viewing the recorded content.

By the way, for the program content recorded on the hard disk 32a, when the music information playback state is analyzed using the song search result, there may be two candidate songs at the same time as described above. . For example, as shown in the analysis result shown in FIG. 17, two songs may be searched as candidates one hour and fifteen minutes after the start of program content playback. In such a case, the boundary between two songs is searched using the processing described in FIG.

In the above-described embodiment, when estimating the start position and end position of a searched song, only the start position is estimated for all songs, and then only the end position is estimated for all songs. In this way, when the end position is estimated, the estimation result of the start position can be used, so that the processing can be facilitated. Conversely, it is also conceivable that only the end position is estimated for all songs, and then only the start position is estimated for all songs.

Note that the present invention is not limited to the above-described embodiment as it is, and can be embodied by variously modifying the constituent elements without departing from the scope of the invention in the implementation stage. Various inventions can be formed by appropriately combining a plurality of constituent elements disclosed in the above-described embodiments. For example, some components may be deleted from all the components shown in the embodiment. Furthermore, constituent elements according to different embodiments may be appropriately combined.

Claims

Search means for searching for music at predetermined time intervals for the content to be analyzed;
An information processing apparatus comprising: an analysis unit that analyzes a reproduction state of a song included in the content based on a song search result obtained at a predetermined time interval by the search unit.
2. The information processing according to claim 1, wherein the search unit cuts out an audio signal having a predetermined length from the content, generates a feature amount from the cut-out audio signal, and acquires a result of music search based on the feature amount. apparatus.
The search means includes
Cut-out means for cutting out a predetermined-length audio signal from the content in response to a song search request;
Generating means for generating a feature quantity from the audio signal cut out by the cut-out means, and transmitting the generated feature quantity to a server via a network;
The information processing apparatus according to claim 1, further comprising: an acquisition unit configured to acquire a result of the music search performed by the server based on the feature amount generated by the generation unit via the network.
When the plurality of song search results that are temporally continuous among the song search results obtained at predetermined time intervals by the search means indicate the same song, the analysis means has obtained those song search results. The information processing apparatus according to claim 1, wherein each position on the audio signal is combined as a section where the same music exists.
When the two song search results obtained continuously in time indicate different songs among the song search results obtained by the search means at predetermined time intervals, the analysis means searches for the two songs. The information processing apparatus according to claim 1, wherein the end position of the music indicated by the previous music search result is estimated using the result.
The analysis means includes
When a song search is performed at an intermediate position between two positions where two time search results are obtained, and as a result, a song indicated by the previous song search result is obtained, the intermediate position and time The song search is performed at a position intermediate to the position where the later song search result is obtained,
When a song search is performed at an intermediate position between two positions where two temporal search results are obtained, and as a result, a song indicated by a later song search result is obtained, the intermediate position and time 6. The information processing apparatus according to claim 5, wherein the music search is performed at an intermediate position from a position where the previous music search result is obtained.
When the two song search results obtained continuously in time indicate different songs among the song search results obtained by the search means at predetermined time intervals, the analysis means searches for the two songs. The information processing apparatus according to claim 1, wherein the start position of a song indicated by a later song search result is estimated using the result.
The analysis means includes
When a song search is performed at an intermediate position between two positions where two temporal search results are obtained, and as a result, a song indicated by a later song search result is obtained, the intermediate position and time The song search is performed at an intermediate position from the position where the previous song search result was obtained,
When a song search is performed at an intermediate position between two positions where two time search results are obtained, and as a result, a song indicated by the previous song search result is obtained, the intermediate position and time The information processing apparatus according to claim 7, wherein the music search is performed at an intermediate position from a position where a subsequent music search result is obtained.
Using search means, perform a song search at a predetermined time interval for the content to be analyzed,
An information processing method for analyzing a reproduction state of a song included in the content based on a song search result obtained at a predetermined time interval by the search unit using an analysis unit.