WO2010058509A1 - Information processing device - Google Patents

Information processing device

Info

Publication number
WO2010058509A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
performer
section
character
display section
Prior art date
Application number
PCT/JP2009/004705
Other languages
French (fr)
Japanese (ja)
Inventor
大網亮磨
Original Assignee
NEC Corporation (日本電気株式会社)
Priority date
Filing date
Publication date
Application filed by NEC Corporation (日本電気株式会社)
Priority to JP2010539115A (patent JP5304795B2)
Publication of WO2010058509A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/435: Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H04N 21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23: Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/235: Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H04N 21/25: Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N 21/254: Management at additional data server, e.g. shopping server, rights management server
    • H04N 21/2541: Rights Management
    • H04N 21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/83: Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N 21/835: Generation of protective data, e.g. certificates
    • H04N 21/8355: Generation of protective data, e.g. certificates, involving usage data, e.g. number of copies or viewings allowed
    • H04N 21/83555: Generation of protective data involving usage data, using a structured language for describing usage rules of the content, e.g. REL

Definitions

  • The present invention relates to an information processing apparatus, and more particularly to an information processing apparatus that extracts specific character information from video information.
  • Patent Document 1 discloses a rights management system that manages rights, such as copyrights, attached to content such as moving images.
  • In this system, copyrights and other rights are centrally managed by a content management server, which cooperates with a contract management server, a billing server, an authentication server, and the like to realize automatic contracting and secure distribution of content.
  • Non-Patent Documents 1 and 2 disclose telop (caption) recognition technology that reads the character information in the credit information displaying the names of producers and performers shown in content such as broadcast programs. By using such technology, it is also possible to automatically extract information on rights existing in the video, such as copyright and related rights, from the video itself.
  • An object of the present invention is to provide an information processing apparatus capable of solving the above-mentioned problem, namely the increased cost of extracting the rights information included in content and the resulting restriction on secondary use of the content.
  • To achieve this object, an information processing apparatus which is one aspect of the present invention comprises: credit section extraction means for extracting, from input video information having a predetermined playback time, the time section in which character information is superimposed on the video as credit section information; character information extraction means for performing character recognition processing on the video information and extracting, as recognized character information, the character information contained in the video in association with reproduction time information indicating the time at which that character information is reproduced; performer information display section extraction means for extracting, based on the credit section information and the recognized character information, performer information display section information specifying the time section in which performer information representing the names of the performers appearing in the video is displayed within the character information; and performer information extraction means for extracting, as the performer information, the character information contained in the recognized character information within the time section specified by the performer information display section information.
  • A program which is another aspect of the present invention causes the information processing apparatus to realize: credit section extraction means for extracting, from input video information having a predetermined playback time, the time section in which character information is superimposed on the video as credit section information; character information extraction means for performing character recognition processing on the video information and extracting the character information contained in the video in association with reproduction time information indicating the time at which that character information is reproduced; performer information display section extraction means for extracting, based on the credit section information and the recognized character information, performer information display section information specifying the time section in which performer information representing the names of the performers appearing in the video is displayed within the character information; and performer information extraction means for extracting, as the performer information, the character information contained in the recognized character information within that time section.
  • An information processing method which is another aspect of the present invention includes: extracting, from input video information having a predetermined reproduction time, the time section in which character information is superimposed on the video as credit section information; before or after extracting the credit section information, performing character recognition processing on the video information and extracting the character information contained in the video in association with reproduction time information indicating the time at which that character information is reproduced; extracting, based on the credit section information and the recognized character information, performer information display section information specifying the time section in which performer information representing the names of the performers appearing in the video is displayed within the character information; and extracting, as the performer information, the character information contained in the recognized character information within the time section specified by the performer information display section information.
  • Since the present invention is configured as described above, the rights information included in video content can be extracted easily, with high accuracy, and at low cost.
  • FIG. 1 is a functional block diagram showing the configuration of the information processing apparatus of the present invention. FIGS. 2 and 3 are diagrams showing examples of video. FIG. 4 is a flowchart illustrating the operation of the information processing apparatus disclosed in FIG. 1. FIG. 5 is a functional block diagram showing the configuration of the performer information display section extraction means of the information processing apparatus in Embodiment 2, and FIG. 6 is a flowchart showing its operation.
  • FIG. 1 is a functional block diagram illustrating a configuration of the information processing apparatus.
  • FIGS. 2 and 3 are diagrams illustrating examples of video.
  • FIG. 4 is a flowchart showing the operation of the information processing apparatus.
  • This embodiment shows a specific example of the information processing apparatus disclosed in Embodiment 6 described later.
  • The information processing apparatus in the present embodiment is a general-purpose computer that includes an arithmetic device and a storage device. It receives video information, such as a movie or a television program, as input and extracts performer information from the character information superimposed on this video information.
  • As shown in FIG. 1, the information processing apparatus 1 includes credit section extraction means 2, character information extraction means 3, performer information display section extraction means 4, and performer information extraction means 5. Each of these means 2 to 5 is constructed in the information processing apparatus 1 by incorporating a performer information extraction program into the arithmetic device. The details are described below.
  • The above program is provided to the information processing apparatus 1 either stored in advance in a storage device provided in the information processing apparatus 1 or stored on a storage medium such as a CD-ROM.
  • Alternatively, the program may be stored in a storage device of another server computer on the network and provided to the information processing apparatus 1 from that server computer via the network.
  • The video information is moving-image data having a predetermined reproduction time, such as a movie or a television program.
  • Character information is superimposed on this video information.
  • The character information includes credit information (credit titles), which displays the names of the persons involved in the production of the video content, and telops, such as captions describing the video or the words spoken by the performers in the video.
  • In the present embodiment, performer information is further extracted from this credit information.
  • FIGS. 2 and 3 show examples of video information, that is, display examples when the video is shown on a display screen.
  • As illustrated, the video information includes credit information, such as the names of the performers (for example, “XXXXX”), at the beginning or end of the program.
  • As shown in FIG. 2(A), only the names of the performers may be displayed in one line.
  • As shown in FIG. 2(B), information indicating the roles of persons who do not appear in the video but are involved in its production, such as “original author”, “screenwriter”, or “director”, may also be displayed.
  • FIGS. 2(A) and 2(B) show cases where the credit information is displayed at the center of the video screen.
  • In other cases, as shown in FIG. 3, the credit information may be displayed only at the lower part of the video screen, for example. Although not shown, it may also be displayed together with the cast (role) names of the performers, and telops such as the words spoken by the performers may also be displayed.
  • The credit section extraction means 2 extracts the time section in which character information is superimposed on the video information as credit section information, and outputs the credit section information to the performer information display section extraction means 4. Specifically, the credit section extraction means 2 extracts the time section of the theme song from the program and outputs it as the credit section information. This is because, in a video such as a drama, the credit information is often superimposed while the theme song is playing. Accordingly, the credit section extraction means 2 has a function of detecting music played during video playback and uses the time section in which the music is played as the credit section information.
  • The credit section extraction means 2 detects that music is being reproduced by, for example, detecting continuous sound of a predetermined loudness, although any detection method may be used. In video information such as a variety program, credit information is often displayed as a roll telop at the end of the program. For this reason, the credit section extraction means 2 may instead detect a roll telop that scrolls in a predetermined direction, such as horizontally or vertically, at a constant speed at the end of the program, and output this time section as the credit section information.
  • Note that the method by which the credit section extraction means 2 extracts the credit section is not limited to the methods described above; a sketch of the music-based approach follows.
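  • As an illustration of the music-based detection described above, the following is a minimal sketch (not the patent's implementation) that treats the longest run of frames whose audio energy stays above a threshold as the credit section. The frame rate, the threshold, the minimum duration, and the `audio_energy_per_frame` input are assumptions made for this example.

```python
def extract_credit_section(audio_energy_per_frame, fps=30.0,
                           energy_threshold=0.2, min_duration_sec=20.0):
    """Return (start_sec, end_sec) of the longest run of frames whose audio
    energy exceeds the threshold, as a rough proxy for the theme-song interval
    during which credits are superimposed. Returns None if no run qualifies."""
    best = None          # (length_in_frames, start_idx, end_idx)
    run_start = None
    for i, energy in enumerate(list(audio_energy_per_frame) + [0.0]):  # sentinel closes a trailing run
        if energy >= energy_threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            length = i - run_start
            if best is None or length > best[0]:
                best = (length, run_start, i)
            run_start = None
    if best is None or best[0] / fps < min_duration_sec:
        return None
    return (best[1] / fps, best[2] / fps)
```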
  • The character information extraction means 3 receives the video information as input in the same manner as the credit section extraction means 2. It performs character recognition processing on the video information and extracts the recognized character strings as character information. At this time, the character information extraction means 3 acquires reproduction time information indicating the time at which each recognized character string is reproduced, associates the character information with the reproduction time information, and outputs the result as recognized character information to the performer information display section extraction means 4 and the performer information extraction means 5. Note that the character recognition processing can be realized using, for example, the techniques disclosed in Non-Patent Documents 1 and 2 described above.
  • The character information extraction means 3 may also extract the position information of each character string on the video screen (frame), associate the position information with the character string, and include it in the recognized character information.
  • For example, the position coordinates of every vertex of the circumscribed rectangle of a recognized character string, or one vertex of the circumscribed rectangle together with the width and height of the rectangle, may be extracted as the position information of the character string and included in the recognized character information.
  • Furthermore, the character information extraction means 3 may acquire the credit section information extracted by the credit section extraction means 2 described above and execute the character recognition processing described above only for the video in the time section specified by the credit section information, as in the sketch below.
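  • The following sketch shows the kind of record such recognized character information might contain and how recognition could be restricted to the credit section. The `ocr_frame` hook stands in for the telop recognition techniques of Non-Patent Documents 1 and 2 and, like the field names, is an assumption for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class RecognizedText:
    text: str                         # recognized character string
    time_sec: float                   # reproduction time at which it appears
    bbox: Tuple[int, int, int, int]   # circumscribed rectangle: x, y, width, height

def extract_recognized_characters(frames, fps, ocr_frame, credit_section=None):
    """Run OCR over the frames (optionally only inside the credit section) and
    return RecognizedText records pairing each string with its time and box.
    `ocr_frame(frame) -> [(text, (x, y, w, h)), ...]` is an assumed OCR hook."""
    results: List[RecognizedText] = []
    for idx, frame in enumerate(frames):
        t = idx / fps
        if credit_section is not None:
            start, end = credit_section
            if not (start <= t <= end):
                continue  # skip frames outside the credit section
        for text, bbox in ocr_frame(frame):
            results.append(RecognizedText(text=text, time_sec=t, bbox=bbox))
    return results
```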
  • The performer information display section extraction means 4 first receives the credit section information output from the credit section extraction means 2 and the recognized character information output from the character information extraction means 3, as described above. Based on the credit section information and the recognized character information, the performer information display section extraction means 4 then identifies the time section in which performer information representing the names of the performers appearing in the video is displayed within the character information, extracts performer information display section information specifying this time section, and outputs it to the performer information extraction means 5. Specifically, the performer information display section extraction means 4 identifies the time section in which the performer information is displayed using characteristics of the recognized character strings included in the recognized character information and information representing their temporal position within the credit section.
  • The configuration of this performer information display section extraction means 4, that is, the method of extracting the performer information display section, is described in detail in the other embodiments.
  • The performer information extraction means 5 accepts as input the performer information display section information output from the performer information display section extraction means 4 and the recognized character information output from the character information extraction means 3. It then extracts, as the performer information, the character information contained in the recognized character information within the time section specified by the performer information display section information.
  • For example, the performer information extraction means 5 determines the likelihood that each character string is a person's name based on predetermined criteria, such as the number of characters, the arrangement of hiragana and kanji, and the particular kanji used, and extracts the character strings that satisfy the criteria as the names of performers.
  • The performer information extraction means 5 may also select only the names of the performers after eliminating cast (role) names based on the arrangement of the recognized character strings on the video screen. For example, when names are arranged in two columns and the characters in one column are smaller, the smaller character strings may be excluded as cast names and the other character strings extracted as the names of the performers.
  • Furthermore, when the performer information extraction means 5 detects a character string representing a preset role of a person involved in the production of the video, it does not extract the name associated with that character string as a performer. For example, if a character string representing the role of a person who does not appear in the video, such as “original author”, “screenwriter”, or “director”, is detected, the names of persons in the same column as that character string are not extracted.
  • The performer information extraction means 5 outputs the performer information, that is, the names of the performers extracted as described above, to the display of the information processing apparatus 1 or to a predetermined file for storage.
  • Note that the method of extracting performer information used by the performer information extraction means 5 may also be used when the performer information display section extraction means 4 described above identifies and extracts the performer information display section. That is, the performer information display section extraction means 4 may extract, as the performer information display section, the time section judged to contain the names of performers based on the number of characters, the arrangement of hiragana and kanji, and the like, and output it to the performer information extraction means 5. A sketch of these name-filtering heuristics follows.
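  • The name-likeness and role-keyword heuristics above could look roughly like the following sketch. The keyword list, the simple length and digit checks, and the grouping of strings into columns by bounding-box x-coordinate are illustrative assumptions, not the patent's concrete criteria; the input records are the `RecognizedText` objects from the earlier sketch.

```python
ROLE_KEYWORDS = {"original author", "screenwriter", "director", "producer"}  # illustrative list

def looks_like_person_name(text):
    """Very rough name-likeness test: short string, no digits, not itself a
    role keyword. Stands in for the patent's criteria on character count and
    hiragana/kanji composition."""
    t = text.strip()
    return (0 < len(t) <= 20
            and not any(c.isdigit() for c in t)
            and t.lower() not in ROLE_KEYWORDS)

def extract_performer_names(recognized, column_tolerance=40):
    """Return likely performer names, skipping any column of text that also
    contains a role keyword (original author, director, ...)."""
    # Group records into columns by the x-coordinate of their bounding box.
    columns = {}
    for rec in recognized:
        key = rec.bbox[0] // column_tolerance
        columns.setdefault(key, []).append(rec)
    names = []
    for records in columns.values():
        texts = [r.text for r in records]
        if any(t.lower() in ROLE_KEYWORDS for t in texts):
            continue  # column lists non-appearing production staff
        names.extend(t for t in texts if looks_like_person_name(t))
    return names
```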
  • First, the information processing apparatus 1 accepts input of video information (step S1). The information processing apparatus 1 then extracts the time section in which credit information, which is character information, is superimposed on the video as credit section information (step S2, credit section extraction step). At this time, for example, a time section in which music such as a theme song is playing, or a time section in which a roll telop is scrolling, is extracted.
  • Before or after the credit section extraction process, the information processing apparatus 1 recognizes the character strings superimposed on the input video information, associates each with its appearance time, and extracts the result as recognized character information (step S3, character information extraction step). At this time, not only the time but also position information specifying the display position of each character string may be extracted and included in the recognized character information. Further, as described above, the information processing apparatus 1 may perform character recognition only for the time section extracted as the credit section.
  • Note that the process of step S2 by the credit section extraction means 2 and the process of step S3 by the character information extraction means 3 are not limited to being performed in the order described above; they may be performed in the reverse order or simultaneously.
  • Subsequently, the information processing apparatus 1 extracts the time section of the video in which performer information is included, based on the credit section information and the recognized character information (step S4, performer information display section extraction step). For example, a time section in which more character strings than a certain reference are displayed together, or a specific time section such as the beginning or ending portion of the video, is extracted as the time section containing performer information.
  • The information processing apparatus 1 then extracts performer information from the character strings within the time section identified as displaying performer information (step S5, performer information extraction step).
  • For example, the name-likeness of each character string is determined, and only personal names are extracted and used as performer information.
  • The cast (role) names in the video may also be identified from the arrangement of the character strings and excluded.
  • Further, character strings representing the roles of persons who do not appear in the video, such as “original author”, “screenwriter”, or “director”, are detected, and only the names of persons that are not in the same column as such a character string are extracted as performer names.
  • Finally, the performer information representing the names of the performers is output to a display or a file (step S6). The overall flow is illustrated in the sketch below.
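  • Putting the steps together, a hypothetical end-to-end pipeline might look like the sketch below. It simply chains the earlier sketches (`extract_credit_section`, `extract_recognized_characters`, `extract_performer_names`), and the naive use of the whole credit section as the display section is a placeholder for the section-selection methods described in the later embodiments.

```python
def performer_extraction_pipeline(frames, audio_energy_per_frame, fps, ocr_frame):
    """Steps S1 to S6 in rough form: credit section -> OCR -> display section -> names."""
    # S2: credit section from sustained music energy (see the earlier sketch).
    credit_section = extract_credit_section(audio_energy_per_frame, fps)
    if credit_section is None:
        return []
    # S3: recognize character strings, restricted to the credit section.
    recognized = extract_recognized_characters(frames, fps, ocr_frame,
                                               credit_section=credit_section)
    # S4: pick the sub-interval where performer names appear; here, naively,
    #     the whole credit section is used as a placeholder.
    display_section = credit_section
    in_section = [r for r in recognized
                  if display_section[0] <= r.time_sec <= display_section[1]]
    # S5: filter the strings in that interval down to performer names.
    names = extract_performer_names(in_section)
    # S6: return (or write out) the performer information.
    return names
```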
  • As described above, in the present embodiment, the time section in which the performer information is displayed is identified based on the time section in which the credit information is displayed and the textual content of the credits. Therefore, the performer information can be extracted from the video easily, with high accuracy, and at low cost, and the rights information included in the content can be identified.
  • FIG. 5 is a functional block diagram illustrating a configuration of the information processing apparatus.
  • FIG. 6 is a flowchart showing the operation of the information processing apparatus.
  • The information processing apparatus 1 in this embodiment has almost the same configuration as in the first embodiment described above, but the configuration of the performer information display section extraction means 4 differs. The configuration and operation of the performer information display section extraction means 4 are therefore mainly described below.
  • As shown in FIG. 5, the performer information display section extraction means 4 of the information processing apparatus 1 in this embodiment includes performer information display section candidate extraction means 41 and performer information display section determination means 42.
  • Each of the means 41 and 42 is constructed in the information processing apparatus 1 by incorporating a performer information extraction program into the arithmetic device. The details are described below.
  • The above program is provided to the information processing apparatus 1 either stored in advance in a storage device provided in the information processing apparatus 1 or stored on a storage medium such as a CD-ROM.
  • Alternatively, the program may be stored in a storage device of another server computer on the network and provided to the information processing apparatus 1 from that server computer via the network.
  • The performer information display section candidate extraction means 41 acquires the recognized character information, including the character information extracted from the video by the character information extraction means 3 disclosed in FIG. 1. The performer information display section candidate extraction means 41 then checks whether a predetermined “specific character string” is included in the character information contained in the recognized character information.
  • Here, a “specific character string” is a character string representing the role of a person who does not appear in the video, such as “original author”, “screenwriter”, “director”, or “producer”.
  • A specific character string may also be a character string representing a type of right relating to the video, such as copyright or related rights.
  • The performer information display section candidate extraction means 41 determines whether each character string corresponds to a specific character string and finds the continuous time sections from which only character strings corresponding to none of the specific character strings were extracted. Information specifying these time sections is taken as performer information display section candidate information and output to the performer information display section determination means 42 (step S11, performer information display section candidate extraction step). At this time, for example, the start and end times of each candidate time section are output for each candidate section. Alternatively, either the start or end time of the candidate section together with section length information indicating the length of the section may be output for each section.
  • The performer information display section determination means 42 acquires the performer information display section candidate information and the credit section information extracted from the video by the credit section extraction means 2 disclosed in FIG. 1. Using the credit section information, the performer information display section determination means 42 then calculates, for each candidate section included in the performer information display section candidate information, the time from the start of the character information display as well as the overall length of the candidate section. The performer information display section determination means 42 identifies the performer information display section information based on the lengths of the time sections, for example by selecting the longest time section represented by the performer information display section candidate information.
  • Alternatively, the performer information display section determination means 42 may extract the beginning portion of the candidate time sections as the performer information display section information. For example, since performers are often displayed in the first half of the credit display, the candidate section that lies in the first half of the credit section and has the longest time section length may be selected as the performer information display section (step S12, performer information display section determination step). Information specifying the selected time section is then output to the performer information extraction means 5 as the performer information display section information.
  • The performer information display section determination means 42 may also obtain the performer information display section information from the information included in the performer information display section candidate information alone and output it. In that case, when there is only one candidate section, it may be used as the performer information display section as it is; when there are multiple candidate sections, one may be selected on the basis of the time section length or the like and used as the performer information display section. A sketch of this candidate extraction and selection follows.
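  • As a toy illustration of steps S11 and S12, the following sketch forms candidate sections from recognized strings that contain none of the specific keywords and then picks the longest candidate starting in the first half of the credit section. The one-second gap tolerance and the keyword set (reused from the earlier sketch) are assumptions.

```python
SPECIFIC_STRINGS = {"original author", "screenwriter", "director", "producer"}  # illustrative

def candidate_sections(recognized, gap_tolerance=1.0):
    """Step S11 sketch: continuous time sections whose recognized strings
    contain no specific character string. Returns a list of (start_sec, end_sec)."""
    times = sorted(r.time_sec for r in recognized
                   if not any(k in r.text.lower() for k in SPECIFIC_STRINGS))
    sections = []
    for t in times:
        if sections and t - sections[-1][1] <= gap_tolerance:
            sections[-1][1] = t          # extend the current section
        else:
            sections.append([t, t])      # start a new section
    return [tuple(s) for s in sections]

def choose_display_section(sections, credit_section):
    """Step S12 sketch: among candidates starting in the first half of the
    credit section, pick the longest (falling back to the longest overall)."""
    start, end = credit_section
    mid = start + (end - start) / 2.0
    first_half = [s for s in sections if s[0] <= mid]
    pool = first_half or sections
    return max(pool, key=lambda s: s[1] - s[0]) if pool else None
```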
  • Thus, the performer information display section extraction means 4 in the present embodiment extracts, as the time section in which performer information is displayed, a time section in which the recognized character information does not contain the predetermined characters, which allows the time sections containing performer information to be narrowed down more efficiently. Further, the time section in which the performer information is displayed is selected from the candidate time sections based on the length of the sections that do not contain a specific character string, or on their relative time with respect to the entire video, so performer information can be extracted with higher accuracy.
  • FIG. 7 is a functional block diagram illustrating the configuration of the information processing apparatus.
  • FIG. 8 is a flowchart showing the operation of the information processing apparatus.
  • The information processing apparatus 1 in the present embodiment has almost the same configuration as in the second embodiment described above, but the configuration of the performer information display section extraction means 4 differs. The configuration and operation of the performer information display section extraction means 4 are therefore mainly described below.
  • As shown in FIG. 7, the performer information display section extraction means 4 of the information processing apparatus 1 in this embodiment includes performer information display section candidate extraction means 141, performer information display section determination means 142, and performer information display probability calculation means 143.
  • Each of the means 141 to 143 is constructed in the information processing apparatus 1 by incorporating a performer information extraction program into the arithmetic device. The details are described below.
  • The above program is provided to the information processing apparatus 1 either stored in advance in a storage device provided in the information processing apparatus 1 or stored on a storage medium such as a CD-ROM.
  • Alternatively, the program may be stored in a storage device of another server computer on the network and provided to the information processing apparatus 1 from that server computer via the network.
  • First, the performer information display probability calculation means 143 acquires the credit section information extracted from the video by the credit section extraction means 2 disclosed in FIG. 1.
  • The information processing apparatus 1 stores in advance “ease-of-display information” that represents the relationship between the reproduction time of the credit information in the video and how likely performer information is to be displayed at that time. Based on this ease-of-display information, the performer information display probability calculation means 143 calculates, as performer information display probability information, the probability that performer information is displayed at each time within the credit section information.
  • For example, the ease-of-display information is expressed as a function of the relative time from the beginning of the credit display, normalized by the length of the section in which the credits are displayed, and is data from which the probability for each time can be calculated. The ease-of-display information may, for example, hold as values the probability that performer information is displayed at each relative time from the beginning of the credit display, or it may be parameter information describing a model obtained by modeling the function of the relative time from the beginning of the credit display. The ease-of-display information may be calculated automatically, by a processing function incorporated in the information processing apparatus 1, from a number of previously processed pieces of credit information, or it may be adjusted and assigned manually.
  • The performer information display probability calculation means 143 then calculates, based on the credit section information and the ease-of-display information stored in advance as described above, the probability that performer information is displayed at each reproduction time of the video, and outputs it to the performer information display section determination means 142 (step S21, performer information display probability calculation step). The sketch below illustrates one way such a probability could be computed.
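  • For illustration, the following sketch (an assumption, not the patent's concrete model) represents the ease-of-display information as a step-wise lookup table over normalized relative time within the credit section and evaluates it at a given playback time.

```python
import bisect

# Illustrative ease-of-display table: probability that performer information is
# displayed, indexed by relative position (0.0-1.0) within the credit section.
EASE_OF_DISPLAY = [(0.0, 0.9), (0.25, 0.8), (0.5, 0.4), (0.75, 0.2), (1.0, 0.1)]

def display_probability(time_sec, credit_section):
    """Probability that performer information is displayed at time_sec, looked
    up by normalized relative time from the start of the credit section."""
    start, end = credit_section
    if end <= start or not (start <= time_sec <= end):
        return 0.0
    rel = (time_sec - start) / (end - start)
    # Step-wise lookup: use the entry whose breakpoint is at or before rel.
    idx = bisect.bisect_right([b for b, _ in EASE_OF_DISPLAY], rel) - 1
    return EASE_OF_DISPLAY[max(idx, 0)][1]
```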
  • The performer information display section candidate extraction means 141 is substantially the same as in the second embodiment described above. That is, it outputs performer information display section candidate information, representing the time sections that are candidates for containing performer information, to the performer information display section determination means 142 (step S22, performer information display section candidate extraction step).
  • Note that the process of step S21 by the performer information display probability calculation means 143 and the process of step S22 by the performer information display section candidate extraction means 141 are not limited to being performed in the order described above; they may be performed in the reverse order or simultaneously.
  • The performer information display section determination means 142 acquires the performer information display probability information output from the performer information display probability calculation means 143 and the performer information display section candidate information output from the performer information display section candidate extraction means 141.
  • The performer information display section determination means 142 then identifies the time section in which performer information is displayed based on the probability that performer information is contained in each candidate section (step S23, performer information display section determination step).
  • Specifically, the probability that performer information is displayed in each candidate section is calculated for every candidate section specified in the performer information display section candidate information.
  • For example, the average, maximum value, or minimum value of the performer information display probability over each candidate section is used as the probability of that section.
  • The candidate section with the maximum resulting probability is then taken as the time section in which the performer information is displayed, and performer information display section information specifying this section is output to the performer information extraction means 5.
  • The performer information display section determination means 142 may also identify the performer information display section as follows. For example, it may further hold a criterion concerning the length of the section in which performer information is displayed, verify the validity of each candidate's section length against this criterion, and determine the performer information display section using it together with the probability described above. Specifically, a minimum length that is plausible for a performer information display section may be defined as a reference value, and the candidate section that satisfies this reference value and has the maximum probability may be selected, as in the sketch below.
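  • A minimal sketch of step S23 under these assumptions: each candidate section is scored by the mean of the per-time display probabilities (using the `display_probability` lookup from the previous sketch), candidates shorter than an assumed minimum length are discarded, and the highest-scoring candidate is chosen.

```python
def choose_section_by_probability(candidates, credit_section,
                                  min_length_sec=5.0, step_sec=1.0):
    """Step S23 sketch: pick the candidate section whose average display
    probability is highest, subject to a minimum section length."""
    best_section, best_score = None, -1.0
    for start, end in candidates:
        if end - start < min_length_sec:
            continue  # too short to be a plausible performer list
        # Average the per-time display probability over the candidate section.
        t, probs = start, []
        while t <= end:
            probs.append(display_probability(t, credit_section))
            t += step_sec
        score = sum(probs) / len(probs)
        if score > best_score:
            best_section, best_score = (start, end), score
    return best_section
```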
  • Thus, in the present embodiment, the display probability of performer information for each reproduction time is calculated based on information, prepared in advance from statistics or the like, that represents how likely performer information is to be displayed at each reproduction time of the video. The time section in which performer information is displayed is then determined based on the calculated probability, for example as the time section in which the probability is maximal. Therefore, performer information can be extracted with higher accuracy.
  • FIG. 9 is a functional block diagram illustrating a configuration of the information processing apparatus.
  • FIG. 10 is a flowchart showing the operation of the information processing apparatus.
  • The information processing apparatus 1 in the present embodiment has almost the same configuration as in the third embodiment described above, but the configuration of the performer information display section extraction means 4 differs. The configuration and operation of the performer information display section extraction means 4 are therefore mainly described below.
  • As shown in FIG. 9, the performer information display section extraction means 4 of the information processing apparatus 1 in this embodiment includes performer information display section candidate extraction means 241, performer information display section determination means 242, and appearance pattern analysis means 244. These means 241, 242, and 244 are constructed in the information processing apparatus 1 by incorporating a performer information extraction program into the arithmetic device. The details are described below.
  • The above program is provided to the information processing apparatus 1 either stored in advance in a storage device provided in the information processing apparatus 1 or stored on a storage medium such as a CD-ROM.
  • Alternatively, the program may be stored in a storage device of another server computer on the network and provided to the information processing apparatus 1 from that server computer via the network.
  • The appearance pattern analysis means 244 acquires the recognized character information, including the character information extracted from the video by the character information extraction means 3 disclosed in FIG. 1. The appearance pattern analysis means 244 then analyzes an appearance pattern representing the temporal appearance of the character information contained in the recognized character information within the video, and an appearance pattern representing how that character information appears in the display layout of the video reproduction area.
  • For example, as the temporal appearance pattern, appearance frequency information of the character strings in the video is calculated for each time within the credit information.
  • In many programs, performer information is displayed in order from the leading roles to the supporting roles. When a leading performer is displayed, that person is often shown alone, so the appearance frequency of character strings per unit time is low; when supporting performers are displayed, several people are often shown together, so the appearance frequency of character strings per unit time increases. The appearance frequency information of the character strings over the course of the video reproduction time can therefore be used by the performer information display section determination means 242, described later, to identify the performer information display section.
  • Since the recognized character information extracted beforehand by the character information extraction means 3 includes layout information, such as the position and size of each character string within the display screen (frame), layout analysis can also be performed.
  • For example, it is determined from the layout information of the recognized character strings whether there are corresponding character strings displayed in a predetermined layout structure, such as two character strings displayed on one line consisting of a character string indicating a right type or role and a person's name (for example, the keyword “script” and the name of the writer). A performer may be displayed together with a cast (role) name, but in many cases there is no cast name and the performer's name is displayed alone.
  • Even when a cast name is displayed, the layout may differ from the display of information on other rights holders. Therefore, when corresponding character strings with a specific layout structure are detected, their spatial positional relationship may be analyzed, and a change in that correspondence relationship may be detected. For example, whether “rights holder information” and “person name” are displayed together, or “cast name” and “person name” are displayed together, can be determined from the character spacing between the corresponding character strings.
  • The appearance pattern analysis means 244 outputs the resulting appearance pattern analysis result information to the performer information display section determination means 242 (step S31, appearance pattern analysis step); a sketch of the frequency analysis follows.
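  • The temporal part of this analysis could be sketched as below; the one-second binning is an assumption used only to illustrate computing the appearance frequency of character strings per unit time within the credit section.

```python
from collections import Counter

def appearance_frequency(recognized, credit_section, bin_sec=1.0):
    """Step S31 (temporal part) sketch: number of recognized character strings
    appearing in each time bin of the given section."""
    start, end = credit_section
    n_bins = max(int((end - start) / bin_sec) + 1, 1)
    counts = Counter()
    for rec in recognized:
        if start <= rec.time_sec <= end:
            counts[int((rec.time_sec - start) / bin_sec)] += 1
    # Dense list: appearance frequency per unit time, indexed by bin.
    return [counts.get(i, 0) for i in range(n_bins)]
```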
  • The performer information display section candidate extraction means 241 is substantially the same as in the second embodiment described above. That is, it outputs performer information display section candidate information, representing the time sections that are candidates for containing performer information, to the performer information display section determination means 242 (step S32, performer information display section candidate extraction step).
  • Note that the process of step S31 by the appearance pattern analysis means 244 and the process of step S32 by the performer information display section candidate extraction means 241 are not limited to being performed in the order described above; they may be performed in the reverse order or simultaneously.
  • The performer information display section determination means 242 acquires the appearance pattern analysis result information and the performer information display section candidate information output from the performer information display section candidate extraction means 241 described above. The performer information display section determination means 242 also acquires the credit section information extracted from the video by the credit section extraction means 2 disclosed in FIG. 1. It then calculates and extracts the performer information display section from the credit section information, the appearance pattern (telop) analysis result information, and the performer information display section candidate information (step S33, performer information display section determination step).
  • Specifically, the performer information display section determination means 242 first calculates, using the credit section information, the relative time from the start of the credit display for each candidate section included in the performer information display section candidate information. Subsequently, the performer information display section determination means 242 calculates the probability that performer information is displayed in each candidate section using the appearance pattern analysis result information. For example, when the appearance pattern analysis result information includes temporal appearance frequency information of the character strings, the increase and decrease of the appearance frequency within each candidate section is analyzed. The degree to which this analysis result matches preset information representing the temporal appearance frequency characteristics specific to performer displays is then determined, and the probability that performers are displayed is calculated from that degree.
  • When the appearance pattern analysis result information includes the analysis result of the spatial layout information, the probability that performer information is displayed in each candidate section is likewise calculated using the appearance pattern analysis result information: the degree to which the analysis result matches preset information representing the spatial layout characteristics specific to performer displays is determined, and the probability that performers are displayed is calculated from that degree.
  • However, when the performer information is displayed together with cast names and the positional relationship between the names and the cast names does not differ from that of other rights holder information, it is difficult to determine from the spatial layout whether performer information is displayed, so in such cases the spatial layout information is not used.
  • When the appearance pattern analysis result information includes both the temporal appearance frequency information of the character strings and the analysis result of the spatial layout information, the degree of match with both the temporal appearance frequency characteristics and the spatial layout characteristics is determined, and the probability that performers are displayed is calculated from those degrees.
  • The performer information display section is then selected and output using the probability that performer information is displayed, calculated in this way, and the relative time information from the start of the credit display. For example, since performers are often displayed in the first half of the credits, the candidate section with the maximum calculated probability among the candidate sections included in the first half of the credits is selected as the performer information display section. At this time, a criterion concerning the length of the section in which performer information is displayed may also be held, the validity of the section length verified, and the performer information display section determined using it together with the probability described above.
  • Alternatively, the probability described above may simply be calculated and the candidate section with the maximum probability determined as the performer information display section and output, as in the sketch below.
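  • As an illustration of scoring candidates against an expected appearance-frequency characteristic (here, the hand-picked assumption that the per-bin frequency tends to rise from the leading roles toward the supporting roles), the following sketch scores each candidate's binned frequency profile by how consistently it rises and picks the best match.

```python
def frequency_match_score(freqs):
    """Score how well a per-bin frequency profile matches the expected
    performer-display pattern (few names at first, more names later):
    the fraction of adjacent bin pairs that are non-decreasing."""
    if len(freqs) < 2:
        return 0.0
    rises = sum(1 for a, b in zip(freqs, freqs[1:]) if b >= a)
    return rises / (len(freqs) - 1)

def choose_section_by_pattern(candidates, recognized, bin_sec=1.0):
    """Step S33 sketch (temporal pattern only): pick the candidate whose
    appearance-frequency profile best matches the expected characteristic."""
    best_section, best_score = None, -1.0
    for start, end in candidates:
        freqs = appearance_frequency(
            [r for r in recognized if start <= r.time_sec <= end],
            (start, end), bin_sec)
        score = frequency_match_score(freqs)
        if score > best_score:
            best_section, best_score = (start, end), score
    return best_section
```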
  • FIG. 11 is a functional block diagram illustrating a configuration of the information processing apparatus.
  • FIG. 12 is a flowchart illustrating the operation of the information processing apparatus.
  • The information processing apparatus 1 in the present embodiment has almost the same configuration as in the third and fourth embodiments described above, but the configuration of the performer information display section extraction means 4 differs. The configuration and operation of the performer information display section extraction means 4 are therefore mainly described below.
  • As shown in FIG. 11, the performer information display section extraction means 4 of the information processing apparatus 1 in the present embodiment includes performer information display section candidate extraction means 341, performer information display section determination means 342, performer information display probability calculation means 343, and appearance pattern analysis means 344.
  • Each of these means 341 to 344 is constructed in the information processing apparatus 1 by incorporating a performer information extraction program into the arithmetic device. The details are described below.
  • The above program is provided to the information processing apparatus 1 either stored in advance in a storage device provided in the information processing apparatus 1 or stored on a storage medium such as a CD-ROM.
  • Alternatively, the program may be stored in a storage device of another server computer on the network and provided to the information processing apparatus 1 from that server computer via the network.
  • The performer information display probability calculation means 343 is almost the same as in the third embodiment described above. That is, the performer information display probability calculation means 343 first acquires the credit section information extracted from the video by the credit section extraction means 2 disclosed in FIG. 1. Then, based on the “ease-of-display information” stored in advance in the information processing apparatus 1, which represents the relationship between the reproduction time of the credit information in the video and how likely performer information is to be displayed, it calculates, as performer information display probability information, the probability that performer information is displayed at each time within the credit section information. The performer information display probability calculation means 343 outputs the calculated probability that performer information is displayed at each reproduction time of the video to the performer information display section determination means 342 (step S41, performer information display probability calculation step).
  • The appearance pattern analysis means 344 is almost the same as in the fourth embodiment described above. That is, the appearance pattern analysis means 344 acquires the recognized character information, including the character information extracted from the video by the character information extraction means 3 disclosed in FIG. 1. It then analyzes an appearance pattern representing the temporal appearance of the character information contained in the recognized character information and/or an appearance pattern representing its layout within the video reproduction area, and outputs the appearance pattern analysis result information to the performer information display section determination means 342 (step S42, appearance pattern analysis step).
  • The performer information display section candidate extraction means 341 is substantially the same as in the second embodiment described above. That is, it outputs performer information display section candidate information, representing the time sections that are candidates for containing performer information, to the performer information display section determination means 342 (step S43, performer information display section candidate extraction step).
  • The performer information display section determination means 342 calculates the performer information display section from the performer information display probability information, the appearance pattern analysis result information, and the performer information display section candidate information (step S44, performer information display section determination step). Specifically, first, as in Embodiment 3 described above, the probability that performer information is displayed in each candidate section is calculated from the time information of each candidate section specified in the performer information display section candidate information. Next, as in Embodiment 4 described above, the probability that performer information is displayed is calculated for each candidate section from the appearance pattern analysis result information and multiplied by the probability obtained from the time information. The section with the maximum resulting probability is selected as the performer information display section; a sketch of this combination follows.
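  • Under the same assumptions as the earlier sketches, combining the two scores might look like the following: the time-based score (mean ease-of-display probability) and the appearance-pattern score (`frequency_match_score`) are multiplied for each candidate, and the candidate with the largest product is chosen.

```python
def choose_section_combined(candidates, recognized, credit_section,
                            step_sec=1.0, bin_sec=1.0):
    """Step S44 sketch: multiply the time-based display probability by the
    appearance-pattern score for each candidate and take the maximum."""
    best_section, best_score = None, -1.0
    for start, end in candidates:
        # Time-based score: mean ease-of-display probability over the section.
        t, probs = start, []
        while t <= end:
            probs.append(display_probability(t, credit_section))
            t += step_sec
        time_score = sum(probs) / len(probs) if probs else 0.0
        # Pattern-based score: how well the frequency profile matches.
        freqs = appearance_frequency(
            [r for r in recognized if start <= r.time_sec <= end],
            (start, end), bin_sec)
        pattern_score = frequency_match_score(freqs)
        combined = time_score * pattern_score
        if combined > best_score:
            best_section, best_score = (start, end), combined
    return best_section
```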
  • FIG. 1 is a functional block diagram illustrating a configuration of the information processing apparatus. In the present embodiment, an outline of the configuration of the information processing apparatus will be described.
  • The information processing apparatus 1 comprises: credit section extraction means 2 for extracting, as credit section information, the time section in which character information is superimposed on the video, based on input video information having a predetermined reproduction time; character information extraction means 3 for performing character recognition processing on the video information and extracting, as recognized character information, the character information contained in the video in association with reproduction time information indicating the time at which that character information is reproduced; performer information display section extraction means 4 for extracting, based on the credit section information and the recognized character information, performer information display section information specifying the time section in which performer information representing the names of the performers appearing in the video is displayed within the character information; and performer information extraction means 5 for extracting, as the performer information, the character information contained in the recognized character information within the time section specified by the performer information display section information.
  • With the above configuration, the information processing apparatus first extracts the time section in which the character information contained in the video is reproduced as credit section information.
  • The information processing apparatus also performs character recognition processing on the video and extracts the character information and its reproduction time information as recognized character information.
  • Subsequently, based on the credit section information and the recognized character information, the information processing apparatus extracts, as performer information display section information, the time section of the video in which performer information indicating the names of the performers appearing in the video is displayed. The information processing apparatus then extracts the character information displayed in the time section of the video specified by the extracted performer information display section information as the names of the performers.
  • As a result, the performer information can be extracted from the video easily, with high accuracy, and at low cost, and the rights information included in the content can be identified.
  • Further, the performer information display section extraction means adopts a configuration in which a time section in which the character information contained in the recognized character information does not include a preset specific character is extracted as the time section in which the performer information is displayed.
  • Further, the information processing apparatus adopts a configuration in which the specific character is a character that does not appear in the video represented by the video information and that represents the role of a person involved in the production of the video information.
  • That is, the information processing apparatus treats a time section that does not include characters identifying persons who are not performers, such as characters representing the roles of persons involved in the production of the video information, like a director or a producer, as the time section in which the performer information is displayed. Therefore, performer information can be extracted with higher accuracy.
  • Further, the performer information display section extraction means adopts a configuration comprising: performer information display section candidate extraction means for extracting, as performer information display section candidate information, time sections in which the character information contained in the recognized character information does not include a preset specific character, that is, time sections that are candidates for displaying the performer information; and performer information display section determination means for extracting, based on the credit section information and the performer information display section candidate information, the performer information display section information representing the time section in which the performer information is displayed within the character information.
  • Further, the performer information display section determination means adopts a configuration in which the performer information display section information is extracted based on the lengths of the time sections represented by the performer information display section candidate information.
  • Further, the performer information display section determination means adopts a configuration in which at least the beginning portion of the time section having the longest length among the time sections represented by the performer information display section candidate information is extracted as the performer information display section information, based on the credit section information.
  • Thus, a time section in which the specific characters are not displayed is extracted as a candidate for the time section in which performer information is displayed, and the time section in which the performer information is displayed is then selected from the candidate time sections based on, for example, the length of the time sections or their relative time with respect to the entire video. Therefore, performer information can be extracted with higher accuracy.
  • the performer information display section extraction means is based on the credit section information and display ease information representing a relationship between a preset reproduction time of the character information and ease of display of performer information.
  • a performer information display probability calculating means for calculating, as performer information display probability information, a probability that the performer information at each time in the credit section information can be displayed.
• the performer information display section determining means included in the performer information display section extracting means is configured to extract, based on the performer information display probability information and the performer information display section candidate information, the performer information display section information representing the time section in which the performer information is displayed in the character information.
• the performer information display section determination means is configured to set, as the time section in which the performer information is displayed, a time section that includes the time at which the probability in the performer information display probability information is maximized.
• in other words, the probability that performer information is displayed at each playback time is calculated based on information, prepared in advance from statistics, that indicates how easily performer information is displayed at each playback time of the video. The time section in which performer information is displayed is then determined from the calculated probabilities, for example as the time section in which the probability is maximum. Performer information can therefore be extracted with higher accuracy.
• the recognized character information includes appearance pattern information indicating how the character information appears in the video, and the performer information display section extracting means is configured to calculate the performer information display section based on the appearance pattern information included in the recognized character information.
• the performer information display section extracting means includes appearance pattern analysis means for extracting, from the recognized character information, appearance pattern information representing how the character information appears in the video, and the performer information display section determining means included in the performer information display section extracting means is configured to extract, based on the credit section information, the performer information display section candidate information, and the appearance pattern information, the performer information display section information representing the time section in which the performer information is displayed in the character information.
• alternatively, the performer information display section extracting means includes appearance pattern analysis means for extracting, from the recognized character information, appearance pattern information representing how the character information appears in the video, and the performer information display section determining means included in the performer information display section extracting means extracts the performer information display section information based on the performer information display probability information, the performer information display section candidate information, and the appearance pattern information.
• the appearance pattern analysis means is configured to extract, as the appearance pattern, the appearance frequency of the character information as the reproduction of the video progresses.
• the appearance pattern analysis means extracts, as the appearance pattern, the layout of the character information with respect to the video reproduction area, based on the recognized character information.
• in other words, the time section in which the performer information is displayed is extracted based on appearance patterns such as the appearance frequency of the character information in the video and its layout on the reproduction area. Performer information can therefore be extracted with higher accuracy.
• the credit section extracting means detects a roll telop in which the character information scrolls in a predetermined direction on the reproduced video, and uses the time section in which the roll telop is reproduced as the credit section information.
• the credit section extraction unit detects music played on the reproduced video and uses the time section in which the music is played as the credit section information.
• a program according to another embodiment of the present invention causes an information processing apparatus to realize: credit section extraction means for extracting, as credit section information, a time section in which character information is superimposed on input video information having a predetermined reproduction time; character information extraction means for performing character recognition processing on the video information and extracting, as recognized character information, the character information included in the video information in association with reproduction time information indicating the time at which the character information is reproduced; performer information display section extraction means for extracting, based on the credit section information and the recognized character information, performer information display section information representing the time section in which performer information representing the names of the performers who appeared in the video is displayed in the character information; and performer information extraction means for extracting, as the performer information, the character information included in the recognized character information within the time section specified by the performer information display section information.
• in the program, the performer information display section extraction means is configured to extract, as the time section in which the performer information is displayed, a time section in which the character information included in the recognized character information does not include a preset specific character.
• the program further realizes, as the performer information display section extraction means: performer information display section candidate extraction means for extracting, as performer information display section candidate information, time sections in which the character information included in the recognized character information does not include a preset specific character and which are candidates for the section in which the performer information is displayed; and performer information display section determination means for extracting, based on the credit section information and the performer information display section candidate information, the performer information display section information representing the time section in which the performer information is displayed in the character information.
• an information processing method executed by operating the information processing apparatus described above extracts, as credit section information, a time section in which character information is superimposed on input video information having a predetermined reproduction time; before or after extracting the credit section information, performs character recognition processing on the video information and extracts, as recognized character information, the character information included in the video information in association with reproduction time information indicating the time at which the character information is reproduced; extracts, based on the credit section information and the recognized character information, performer information display section information representing the time section in which performer information representing the names of the performers who appeared in the video is displayed in the character information; and extracts, as the performer information, the character information included in the recognized character information within the time section specified by the performer information display section information.
• the method extracts, as the time section in which the performer information is displayed, a time section in which the character information included in the recognized character information does not include a preset specific character.
• the method extracts, as performer information display section candidate information, time sections in which the character information included in the recognized character information does not include a preset specific character and which are candidates for the section in which the performer information is displayed, and extracts, based on the credit section information and the performer information display section candidate information, the performer information display section information representing the time section in which the performer information is displayed in the character information.
  • the present invention can be used when an operator who manages and uses video automatically extracts the rights of performers from video information, and has industrial applicability.

Abstract

An information processing device is provided with: a credit section extracting means which extracts, as credit section information, a time section in which character information is superimposed on input video information having a predetermined reproduction time; a character information extracting means which performs character recognition processing on the video information and extracts, as recognized character information, the character information contained in the video information in association with reproduction time information expressing the time at which the character information is reproduced; a performer information display section extracting means which extracts, on the basis of the credit section information and the recognized character information, performer information display section information relating to the time section in which performer information expressing the names of the performers who appeared in the video is displayed in the character information; and a performer information extracting means which extracts, as the performer information, the character information contained in the recognized character information in the time section specified by the performer information display section information.

Description

Information processing device
The present invention relates to an information processing apparatus, and more particularly to an information processing apparatus that extracts specific character information from video information.
In recent years, with the digitization of content such as moving images and music, many problems have arisen concerning rights in such content, for example copyrights and neighboring rights. How to manage unauthorized use of content, licensing of content, collection of usage fees, and so on has become an issue. Against this background, Patent Document 1 discloses a rights management system that manages rights such as copyrights attached to content such as moving images. In this rights management system, copyrights and other rights are centrally managed by a content management server, and, in cooperation with a contract management server, a billing server, an authentication server, and the like, automatic contracting in response to content users' requests and secure distribution of the content are realized.
However, the above system assumes that right information such as copyright is registered manually by an intermediary. Therefore, in order to handle content produced in the past with such a system, the right information must be extracted from the content by hand and then registered. In the case of past content, for example, the details of the contract information often no longer remain, so it is first necessary to clarify who holds the rights to the content. Registration is then carried out while checking manually, and an enormous number of man-hours must be spent on this process. This has also hindered the use of high-quality content, such as TV dramas, in the secondary distribution market.
On the other hand, Non-Patent Documents 1 and 2 disclose telop recognition techniques that read the character information in credit information, which displays the names of the producers and performers in content such as broadcast programs. By using such techniques, it is also possible to automatically extract from a video information on the copyrights and neighboring rights existing in that video.
The information included in the credit title that one wishes to extract comprises personal-name information related to copyright, such as the "original author", "screenwriter", and "director", and personal-name information related to neighboring rights, namely the "performers"; both are particularly important for secondary use. When a person's name is written together with a word from which the right-holder type can be determined, such as "original author", "screenwriter", or "director", the right-holder type of that person can be determined by associating the name with the word after telop recognition.
JP 2002-109254 A
However, in the case of the "performers" mentioned above, the credit title often does not show a word indicating the right-holder type; only the names are listed, or they are written together with role names that are not common nouns. In such cases, it cannot be determined automatically from the character information recognized in the credit information alone whether a given name is that of a performer, and this determination has to rely on human work after telop recognition. Consequently, as described above, the work cost of clarifying the rights included in the content increases, and the secondary use of the content is restricted.
Accordingly, an object of the present invention is to provide an information processing apparatus capable of solving the above-mentioned problems, namely the increase in the cost of extracting right information included in content and the restriction on the secondary use of content.
In order to achieve this object, an information processing apparatus according to one aspect of the present invention comprises:
credit section extraction means for extracting, as credit section information, a time section in which character information is superimposed on input video information having a predetermined reproduction time;
character information extraction means for performing character recognition processing on the video information and extracting, as recognized character information, the character information included in the video information in association with reproduction time information indicating the time at which the character information is reproduced;
performer information display section extraction means for extracting, based on the credit section information and the recognized character information, performer information display section information representing the time section in which performer information representing the names of the performers who appeared in the video is displayed in the character information; and
performer information extraction means for extracting, as the performer information, the character information included in the recognized character information within the time section specified by the performer information display section information.
A program according to another aspect of the present invention causes an information processing apparatus to realize:
credit section extraction means for extracting, as credit section information, a time section in which character information is superimposed on input video information having a predetermined reproduction time;
character information extraction means for performing character recognition processing on the video information and extracting, as recognized character information, the character information included in the video information in association with reproduction time information indicating the time at which the character information is reproduced;
performer information display section extraction means for extracting, based on the credit section information and the recognized character information, performer information display section information representing the time section in which performer information representing the names of the performers who appeared in the video is displayed in the character information; and
performer information extraction means for extracting, as the performer information, the character information included in the recognized character information within the time section specified by the performer information display section information.
An information processing method according to another aspect of the present invention:
extracts, as credit section information, a time section in which character information is superimposed on input video information having a predetermined reproduction time;
before or after extracting the credit section information, performs character recognition processing on the video information and extracts, as recognized character information, the character information included in the video information in association with reproduction time information indicating the time at which the character information is reproduced;
extracts, based on the credit section information and the recognized character information, performer information display section information representing the time section in which performer information representing the names of the performers who appeared in the video is displayed in the character information; and
extracts, as the performer information, the character information included in the recognized character information within the time section specified by the performer information display section information.
With the configuration described above, the present invention can extract the right information included in video content easily, with high accuracy, and at low cost.
Fig. 1 is a functional block diagram showing the configuration of the information processing apparatus of the present invention. Figs. 2 and 3 are diagrams showing examples of video. Fig. 4 is a flowchart showing the operation of the information processing apparatus disclosed in Fig. 1. Fig. 5 is a functional block diagram showing the configuration of the performer information display section extraction means of the information processing apparatus in Embodiment 2. Fig. 6 is a flowchart showing the operation of the performer information display section extraction means disclosed in Fig. 5. Fig. 7 is a functional block diagram showing the configuration of the performer information display section extraction means of the information processing apparatus in Embodiment 3. Fig. 8 is a flowchart showing the operation of the performer information display section extraction means disclosed in Fig. 7. Fig. 9 is a functional block diagram showing the configuration of the performer information display section extraction means of the information processing apparatus in Embodiment 4. Fig. 10 is a flowchart showing the operation of the performer information display section extraction means disclosed in Fig. 9. Fig. 11 is a functional block diagram showing the configuration of the performer information display section extraction means of the information processing apparatus in Embodiment 5. Fig. 12 is a flowchart showing the operation of the performer information display section extraction means disclosed in Fig. 11.
<Embodiment 1>
A first embodiment of the present invention will be described with reference to Figs. 1 to 4. Fig. 1 is a functional block diagram showing the configuration of the information processing apparatus. Figs. 2 and 3 are diagrams showing examples of video. Fig. 4 is a flowchart showing the operation of the information processing apparatus. This embodiment is a specific example of the information processing apparatus disclosed in Embodiment 6, which is described later.
[Configuration]
The information processing apparatus in this embodiment is a general computer comprising an arithmetic device and a storage device. It takes video information such as a movie or a television program as input and extracts performer information from the character information superimposed on that video information.
As shown in Fig. 1, the basic configuration of the information processing apparatus 1 in this embodiment comprises credit section extraction means 2, character information extraction means 3, performer information display section extraction means 4, and performer information extraction means 5. Each of these means 2 to 5 is constructed in the information processing apparatus 1 by incorporating a performer information extraction program into the arithmetic device. This is described in more detail below.
The program is provided to the information processing apparatus 1, for example, stored in advance in a storage device of the information processing apparatus 1 or stored on a storage medium such as a CD-ROM. Alternatively, the program may be stored in a storage device of another server computer on a network and provided from that server computer to the information processing apparatus 1 via the network.
First, the video information is described in detail. The video information is moving-image data having a predetermined reproduction time, such as a movie or a television program, and character information is superimposed on it. Examples of such character information are credit information (credit titles), which displays the names of the persons involved in the production of the video content, and telops such as explanations of the video or lines spoken by the performers. In the present invention, performer information is further extracted from the credit information.
Figs. 2 and 3 show examples of video information, that is, display examples when the video is shown on a display screen. As shown in Fig. 2, the video information may include credit information consisting of the names of performers and others (for example, "○○ ○○" or "△△ △△") at the beginning or end of the program. In this case, only the names of the performers may be displayed in a single line as shown in Fig. 2(A), while, as shown in Fig. 2(B), information representing the roles of persons who do not appear in the video but are involved in its production, such as "original author", "screenwriter", and "director", may also be displayed. Figs. 2(A) and 2(B) show cases where the credits are displayed in the center of the video screen, but they may also be displayed only at the bottom of the screen as in Fig. 3(A), or over half of the screen as in Fig. 3(B). Although not shown, the names may also be displayed together with the performers' cast names, and telops such as words spoken by the performers may be displayed as well.
Based on the input video information, the credit section extraction means 2 extracts, as credit section information, the time section in which character information is superimposed on the video information, and outputs this credit section information to the performer information display section extraction means 4. Specifically, the credit section extraction means 2 extracts the time section of the theme song from the program and outputs it as credit section information. This is because, in the case of a video such as a drama, the credit information is often superimposed on the theme song. The credit section extraction means 2 therefore has a function of detecting music played during reproduction of the video, and treats the time section in which that music is played as the credit section information. For example, the credit section extraction means 2 detects that music is being played by detecting continuous sound of a predetermined loudness, but any detection method may be used. In video information such as variety programs, the credit information is often displayed as a roll telop at the end of the program. For this reason, the credit section extraction means 2 may, for example, detect a roll telop that scrolls at a constant speed in a predetermined direction, such as horizontally or vertically, at the end of the program, and output this time section as the credit section information. However, the method by which the credit section extraction means 2 extracts the credit section is not limited to the methods described above.
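As a rough illustration of the music-based detection described above, the following Python sketch marks a credit section wherever the per-frame audio energy stays above a threshold for a sustained run. The energy values, threshold, and minimum duration are assumptions for illustration only; the patent does not prescribe a concrete detection algorithm.

```python
# Minimal sketch: mark a credit section where audio energy stays high
# for a sustained run of frames (a crude proxy for a theme song).
def extract_credit_sections(frame_energy, fps=30.0,
                            energy_threshold=0.3, min_duration_sec=20.0):
    """Return a list of (start_sec, end_sec) credit-section candidates."""
    sections = []
    run_start = None
    for i, e in enumerate(frame_energy):
        if e >= energy_threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None:
                if (i - run_start) / fps >= min_duration_sec:
                    sections.append((run_start / fps, i / fps))
                run_start = None
    if run_start is not None and (len(frame_energy) - run_start) / fps >= min_duration_sec:
        sections.append((run_start / fps, len(frame_energy) / fps))
    return sections

# Example: 90 seconds of quiet audio followed by 60 seconds of loud music.
energy = [0.05] * (30 * 90) + [0.8] * (30 * 60)
print(extract_credit_sections(energy))  # -> [(90.0, 150.0)]
```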
The character information extraction means 3 receives the video information as input, like the credit section extraction means 2. It performs character recognition processing on the video information and extracts character information, that is, the recognized character strings. At this time, the character information extraction means 3 obtains reproduction time information indicating the time at which each recognized character string is reproduced, associates the recognized character information with this reproduction time information, and outputs the result as recognized character information to the performer information display section extraction means 4 and the performer information extraction means 5. The character recognition processing can be realized, for example, using the techniques disclosed in Non-Patent Documents 1 and 2 mentioned above.
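A minimal sketch of what the recognized character information could look like as a data structure: each recognized string is stored together with the playback time (and optionally the on-screen position) at which it appears. The function recognize_text_in_frame is a hypothetical placeholder for an external telop-recognition routine such as those in the cited non-patent literature; it is not a real API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class RecognizedText:
    text: str                 # recognized character string
    time_sec: float           # playback time at which the string appears
    bbox: Optional[Tuple[int, int, int, int]] = None  # x, y, width, height (optional)

def recognize_text_in_frame(frame) -> List[Tuple[str, Tuple[int, int, int, int]]]:
    """Placeholder for an external telop-recognition routine; a real OCR
    engine would return (text, bounding_box) pairs here."""
    return []

def extract_recognized_character_info(frames, fps=30.0) -> List[RecognizedText]:
    """Associate each recognized string with its playback time."""
    results = []
    for idx, frame in enumerate(frames):
        for text, bbox in recognize_text_in_frame(frame):
            results.append(RecognizedText(text=text, time_sec=idx / fps, bbox=bbox))
    return results
```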
The character information extraction means 3 may also extract, together with each character string, position information of that character string on the video screen (frame), associate this position information with the character string, and include it in the recognized character information. For example, the position coordinates of the vertices of the circumscribed rectangle of the recognized character string, or one vertex of the circumscribed rectangle together with its width and height, may be extracted as the position information of the character string and included in the recognized character information.
Furthermore, the character information extraction means 3 may obtain the credit section information extracted by the credit section extraction means 2 described above and perform the character recognition processing only on the video within the time section specified by that credit section information.
The performer information display section extraction means 4 first receives the credit section information output from the credit section extraction means 2 and the recognized character information output from the character information extraction means 3 as described above. Based on the credit section information and the recognized character information, the performer information display section extraction means 4 identifies the time section in which performer information representing the names of the performers appearing in the video is displayed in the character information, extracts performer information display section information representing this time section, and outputs it to the performer information extraction means 5. Specifically, the performer information display section extraction means 4 identifies the time section in which performer information is displayed by using characteristics of the recognized character strings included in the recognized character information and information representing their temporal position within the credit section. For example, based on the rule that "the time sections in which performer information is displayed within the credit information occur together", the time section within the credit section in which the appearance of character information is dense is taken as the performer information display section. A more detailed configuration of the performer information display section extraction means 4, that is, the method of extracting the performer information display section, is described in the other embodiments.
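One possible reading of the rule that "performer names appear densely" is to slide a fixed window over the credit section and keep the window containing the most recognized strings, as in the sketch below. The window length and step are illustrative assumptions, not values from the patent.

```python
def densest_window(times, section_start, section_end,
                   window_sec=30.0, step_sec=1.0):
    """Return the (start, end) window inside the credit section that
    contains the largest number of recognized-text timestamps."""
    best = (section_start, min(section_start + window_sec, section_end))
    best_count = -1
    t = section_start
    while t + window_sec <= section_end:
        count = sum(1 for x in times if t <= x < t + window_sec)
        if count > best_count:
            best_count = count
            best = (t, t + window_sec)
        t += step_sec
    return best

# Example: timestamps cluster around 100-120 s inside a 90-150 s credit section.
stamps = [101, 103, 104, 107, 110, 112, 115, 118, 119, 140]
print(densest_window(stamps, 90.0, 150.0))  # -> (90.0, 120.0)
```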
The performer information extraction means 5 receives the performer information display section information output from the performer information display section extraction means 4 and the recognized character information output from the character information extraction means 3. It then extracts, as performer information representing the names of the performers, the character information included in the video within the time section specified by the performer information display section information. At this time, the performer information extraction means 5 judges how likely each character string is to be a person's name, based on judgment criterion information set in advance from viewpoints such as the number of characters, the arrangement of hiragana and kanji, and the kanji used, and extracts character strings that satisfy a predetermined criterion as performers' names. As another example, the performer information extraction means 5 may select only the performers' names, excluding cast names, based on, for example, the arrangement of the recognized character strings on the video screen. For example, when person names are arranged in two columns and the characters in one column are smaller, the smaller characters may be identified as cast names and excluded, and the character strings in the other column extracted as the performers' names.
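The person-name likelihood judgment mentioned above could be sketched as a simple rule over character count and script composition, as below. The thresholds and character ranges are illustrative assumptions, not values taken from the patent.

```python
import re

KANJI = r'\u4e00-\u9fff'
HIRAGANA = r'\u3041-\u3096'
KATAKANA = r'\u30a0-\u30ff'

def looks_like_person_name(text: str) -> bool:
    """Crude heuristic: one or two parts (surname / given name), each 1-4
    characters long and composed only of kanji or kana."""
    parts = text.replace('\u3000', ' ').split()
    if not 1 <= len(parts) <= 2:
        return False
    for part in parts:
        if not 1 <= len(part) <= 4:
            return False
        if not re.fullmatch(f'[{KANJI}{HIRAGANA}{KATAKANA}]+', part):
            return False
    return True

print(looks_like_person_name('山田 太郎'))        # True
print(looks_like_person_name('エンディングテーマ'))  # False (single part of 9 characters)
```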
Furthermore, when the performer information extraction means 5 detects a preset character string representing the role of a person involved in the production of the video, it does not extract the person name associated with that character string, treating that person as not being a performer. For example, when a character string representing the role of a person who does not appear in the video, such as "original author", "screenwriter", or "director", is detected, person names in the same line as that character string are not extracted.
The performer information extraction means 5 then outputs the performer information, that is, the extracted performers' names, to the display of the information processing apparatus 1 or to a predetermined file for storage.
The method of extracting performer information used by the performer information extraction means 5 may also be used when the performer information display section extraction means 4 identifies and extracts the performer information display section. That is, as described above, the performer information display section extraction means 4 may extract, as the performer information display section, a time section judged to contain performers' names based on the number of characters, the arrangement of hiragana and kanji, and so on, and output it to the performer information extraction means.
[Operation]
Next, the operation of the information processing apparatus 1 configured as described above is explained with reference to the flowchart of Fig. 4. First, the information processing apparatus 1 accepts input of video information (step S1). The information processing apparatus 1 then extracts, as credit section information, the time section in which credit information, that is, character information, is superimposed in the video (step S2, credit section extraction step). At this time, for example, a time section in which music such as a theme song is playing, or a time section in which a roll telop is running, is extracted.
Before or after the credit section information extraction process, the information processing apparatus 1 also recognizes the character strings superimposed in the input video information, associates them with their appearance times, and extracts them as recognized character information (step S3, character information extraction step). At this time, not only the time but also position information specifying the display position of each character string may be extracted and included in the recognized character information. As described above, the information processing apparatus 1 may also perform character recognition only in the time section extracted as the credit section.
Note that the process of step S2 by the credit section extraction means 2 and the process of step S3 by the character information extraction means 3 are not limited to being executed in the order described above; they may be executed in the reverse order or simultaneously.
Subsequently, based on the credit section information and the recognized character information, the information processing apparatus 1 extracts the time section in which performer information is included in the video (step S4, performer information display section extraction step). For example, a time section in which character strings are displayed together at or above a certain density, or a specific time section such as the opening or ending part of the video, is extracted as the time section containing performer information.
The information processing apparatus 1 then extracts performer information from the character strings within the time section identified as displaying performer information (step S5, performer information extraction step). At this time, for example, how likely each character string is to be a person's name is judged, and only person names are extracted as performer information. In some cases, cast names in the video may be identified and excluded according to the arrangement of the character strings. In addition, character strings representing the roles of persons who do not appear in the video, such as "original author", "screenwriter", and "director", are detected, and person names that are not in the same line as such a character string are extracted as performers' names. The performer information representing the performers' names is then output to a display or a file (step S6).
As described above, in this embodiment the time section in which performer information is displayed is first identified based on the time section in which the credit information is displayed and the textual content of the credits. Performer information can therefore be extracted from the video easily, with high accuracy, and at low cost, and the right information included in the content can be specified.
<Embodiment 2>
A second embodiment of the present invention will be described with reference to Figs. 5 and 6. Fig. 5 is a functional block diagram showing the configuration of the information processing apparatus. Fig. 6 is a flowchart showing the operation of the information processing apparatus.
The information processing apparatus 1 in this embodiment has substantially the same configuration as that of Embodiment 1 described above, but differs in the configuration of the performer information display section extraction means 4. The configuration and operation of the performer information display section extraction means 4 are therefore mainly described below.
As shown in Fig. 5, the performer information display section extraction means 4 of the information processing apparatus 1 in this embodiment comprises performer information display section candidate extraction means 41 and performer information display section determination means 42. Each of these means 41 and 42 is constructed in the information processing apparatus 1 by incorporating a performer information extraction program into the arithmetic device. This is described in more detail below.
The program is provided to the information processing apparatus 1, for example, stored in advance in a storage device of the information processing apparatus 1 or stored on a storage medium such as a CD-ROM. Alternatively, the program may be stored in a storage device of another server computer on a network and provided from that server computer to the information processing apparatus 1 via the network.
The performer information display section candidate extraction means 41 acquires the recognized character information, including the character information extracted from the video by the character information extraction means 3 disclosed in Fig. 1. The performer information display section candidate extraction means 41 then checks whether the character information included in the recognized character information contains predetermined "specific character strings".
Here, the "specific character strings" are character strings that represent the roles of persons who do not appear in the video but are involved in the production of the video information, such as "original work", "screenplay", "director", and "producer". In other words, a specific character string is a character string representing a right type, such as a copyright or a neighboring right, relating to the video.
The performer information display section candidate extraction means 41 then determines whether each character string corresponds to one of the specific character strings, and finds the continuous time sections in which only character strings corresponding to none of the specific character strings were extracted. Information specifying these time sections is output to the performer information display section determination means 42 as performer information display section candidate information (step S11, performer information display section candidate extraction step). At this time, for example, the start and end times of each candidate time section are output for each candidate section. Alternatively, either the start or the end time of each candidate section, together with section length information representing the length of that section, may be output for each section.
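A minimal sketch of this candidate extraction, under the assumption that consecutive recognized strings containing none of the preset role keywords are grouped into one candidate section; the keyword list and the gap tolerance are illustrative only.

```python
ROLE_KEYWORDS = ('原作', '脚本', '監督', 'プロデューサー', '演出', '制作')

def extract_candidate_sections(recognized, max_gap_sec=5.0):
    """recognized: list of (text, time_sec) sorted by time.
    Returns (start, end) sections whose strings contain no role keyword."""
    sections = []
    current = None
    for text, t in recognized:
        if any(k in text for k in ROLE_KEYWORDS):
            if current:
                sections.append(tuple(current))
                current = None
            continue
        if current and t - current[1] <= max_gap_sec:
            current[1] = t              # extend the current candidate section
        else:
            if current:
                sections.append(tuple(current))
            current = [t, t]            # start a new candidate section
    if current:
        sections.append(tuple(current))
    return sections

credits = [('〇〇 〇〇', 100.0), ('△△ △△', 102.0),
           ('脚本 □□ □□', 110.0), ('監督 ××', 112.0)]
print(extract_candidate_sections(credits))  # -> [(100.0, 102.0)]
```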
The performer information display section determination means 42 acquires the performer information display section candidate information and the credit section information extracted from the video by the credit section extraction means 2 disclosed in Fig. 1. Using the credit section information, the performer information display section determination means 42 calculates, for each candidate section included in the performer information display section candidate information, the time from the start of the character information display, that is, the temporal length of each candidate section. The performer information display section determination means 42 then identifies the performer information display section information based on the length of the time sections, for example by selecting the longest time section represented by the performer information display section candidate information.
The performer information display section determination means 42 may also extract the beginning portion of a candidate time section as the performer information display section information. For example, since performers are often displayed in the first half of the credits, the candidate section that is included in the first half of the credit information and has the longest section length is selected as the performer information display section (step S12, performer information display section determination step). Information specifying the selected time section is then output to the performer information extraction means 5 as performer information display section information.
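The selection rule just described, preferring a candidate that starts in the first half of the credit section and taking the longest such candidate, might look as follows (a sketch under those assumptions; the patent also allows choosing on length alone).

```python
def choose_performer_section(candidates, credit_start, credit_end):
    """candidates: list of (start, end). Prefer sections that begin in the
    first half of the credit section; among those, take the longest."""
    if not candidates:
        return None
    midpoint = credit_start + (credit_end - credit_start) / 2.0
    first_half = [c for c in candidates if c[0] <= midpoint]
    pool = first_half if first_half else candidates
    return max(pool, key=lambda c: c[1] - c[0])

print(choose_performer_section([(100.0, 102.0), (130.0, 145.0)], 90.0, 150.0))
# midpoint is 120.0, so only (100.0, 102.0) starts in the first half -> (100.0, 102.0)
```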
The performer information display section determination means 42 may also obtain and output the performer information display section information from the information included in the performer information display section candidate information alone. When there is only one candidate section, it may be used as the performer display section as it is; when there are several, one may be selected on a criterion such as the section length and used as the performer display section.
As described above, the performer information display section extraction means 4 in this embodiment extracts, as the time section in which performer information is displayed, a time section in which the recognized character information does not contain the preset specific characters, which makes it possible to narrow down the time sections containing performer information more efficiently. Furthermore, by extracting the time section in which performer information is displayed from the candidate time sections based on, for example, the length of the time sections that do not contain the specific characters or their relative position within the entire video, performer information can be extracted with higher accuracy.
<Embodiment 3>
A third embodiment of the present invention will be described with reference to Figs. 7 and 8. Fig. 7 is a functional block diagram showing the configuration of the information processing apparatus. Fig. 8 is a flowchart showing the operation of the information processing apparatus.
The information processing apparatus 1 in this embodiment has substantially the same configuration as that of Embodiment 2 described above, but differs in the configuration of the performer information display section extraction means 4. The configuration and operation of the performer information display section extraction means 4 are therefore mainly described below.
As shown in Fig. 7, the performer information display section extraction means 4 of the information processing apparatus 1 in this embodiment comprises performer information display section candidate extraction means 141, performer information display section determination means 142, and performer information display probability calculation means 143. Each of these means 141 to 143 is constructed in the information processing apparatus 1 by incorporating a performer information extraction program into the arithmetic device. This is described in more detail below.
The program is provided to the information processing apparatus 1, for example, stored in advance in a storage device of the information processing apparatus 1 or stored on a storage medium such as a CD-ROM. Alternatively, the program may be stored in a storage device of another server computer on a network and provided from that server computer to the information processing apparatus 1 via the network.
The performer information display probability calculation means 143 acquires the credit section information extracted from the video by the credit section extraction means 2 disclosed in Fig. 1. Here, the information processing apparatus 1 stores "display ease information" representing the relationship between the reproduction time of the credit information in the video and how easily performer information is displayed. Based on this display ease information, the performer information display probability calculation means 143 calculates, as performer information display probability information, the probability that performer information can be displayed at each time within the credit section.
The "display ease information" is, for example, expressed as a function of the relative time from the beginning of the credit display, normalized by the length of the section in which the credits are displayed, and is data from which the probability at each time can be calculated. This display ease information may, for example, directly hold as values the probability that performer information is displayed at each relative time from the beginning of the credit display. Alternatively, the display ease information may be parameter information describing a model obtained by modeling the function of the relative time from the beginning of the credit display. The display ease information may be calculated by automatic learning from a number of previous pieces of credit information, using a processing function incorporated in the information processing apparatus 1, or may be adjusted and provided manually.
The performer information display probability calculation means 143 then calculates the probability that performer information is displayed at each reproduction time of the video based on the credit section information and the display ease information stored in advance as described above, and outputs it to the performer information display section determination means 142 (step S21, performer information display probability calculation step).
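For illustration, the display ease information could be realized as a simple function of the relative position within the credit section, normalized by the section length. The triangular shape used below, peaking early in the credits, is purely an assumed example, not a model given in the patent.

```python
def display_probability(t_sec, credit_start, credit_end, peak_ratio=0.25):
    """Probability-like score that performer names are on screen at time t_sec,
    as a function of the relative position within the credit section."""
    if credit_end <= credit_start or not credit_start <= t_sec <= credit_end:
        return 0.0
    r = (t_sec - credit_start) / (credit_end - credit_start)  # relative position 0..1
    if r <= peak_ratio:
        return r / peak_ratio              # rises to 1.0 at the assumed peak
    return (1.0 - r) / (1.0 - peak_ratio)  # falls back to 0.0 at the end

print(display_probability(105.0, 90.0, 150.0))  # relative position 0.25 -> 1.0
```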
The performer information display section candidate extraction means 141 is substantially the same as that of Embodiment 2 described above. That is, it outputs performer information display section candidate information representing candidate time sections that may contain performer information to the performer information display section determination means 142 (step S22, performer information display section candidate extraction step).
Note that the process of step S21 by the performer information display probability calculation means 143 and the process of step S22 by the performer information display section candidate extraction means 141 are not limited to being executed in the order described above; they may be executed in the reverse order or simultaneously.
The performer information display section determination means 142 acquires the performer information display probability information output from the performer information display probability calculation means 143 and the performer information display section candidate information output from the performer information display section candidate extraction means 141. The performer information display section determination means 142 then identifies the time section in which performer information is displayed based on the probability that each candidate section contains performer information (step S23, performer information display section determination step). For example, for each candidate section specified in the performer information display section candidate information, the probability that performer information can be displayed in that candidate section is calculated, using, for example, the average, maximum, or minimum of the performer information display probabilities over the whole candidate section as the probability for that section. The candidate section with the highest resulting probability is then taken as the time section in which performer information is displayed, and performer information display section information specifying this section is output to the performer information extraction means 5.
 The performer information display section determination means 142 may also specify the performer information display section as follows. For example, it may further have a criterion concerning the length of the section in which performer information is displayed, verify the validity of each section length against this criterion, and determine the performer information display section by combining this with the probability described above. Specifically, a minimum length regarded as valid for a performer information display section may be defined as a reference value, and the section with the largest probability may be selected from among the candidate sections that satisfy this reference value.
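 The following sketch, under the same assumptions as above, shows one way step S23 could be realized with the per-section average and a minimum-length reference value; prob_at can be obtained by fixing the credit section in a function such as the one sketched after step S21.

```python
def choose_performer_section(candidates, prob_at, min_length=5.0, step=1.0):
    """Step S23: pick the candidate section whose average display probability is largest.

    candidates: list of (start, end) time sections from the candidate information.
    prob_at:    function mapping a playback time to the step-S21 display probability.
    min_length: reference value for the minimum valid section length (seconds, assumed).
    """
    best, best_score = None, -1.0
    for start, end in candidates:
        if end - start < min_length:          # validity check on the section length
            continue
        times = [start + i * step for i in range(int((end - start) / step) + 1)]
        score = sum(prob_at(t) for t in times) / len(times)   # mean over the section
        if score > best_score:
            best, best_score = (start, end), score
    return best   # performer information display section information (or None)
```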
 As described above, in the present embodiment, the display probability of performer information at each playback time is calculated based on information, prepared in advance by statistics or the like, that represents how likely performer information is to be displayed at each playback time of the video. The time section in which performer information is displayed is then determined based on the calculated probabilities, for example as the time section in which the probability is largest. Performer information can therefore be extracted with higher accuracy.
 <Embodiment 4>
 A fourth embodiment of the present invention will be described with reference to FIGS. 9 and 10. FIG. 9 is a functional block diagram showing the configuration of the information processing apparatus, and FIG. 10 is a flowchart showing the operation of the information processing apparatus.
 The information processing apparatus 1 in the present embodiment has substantially the same configuration as that of the third embodiment described above. In the present embodiment, however, the configuration of the performer information display section extraction means 4 is different. Accordingly, the configuration and operation of the performer information display section extraction means 4 will mainly be described below.
 As shown in FIG. 9, the performer information display section extraction means 4 of the information processing apparatus 1 in the present embodiment includes performer information display section candidate extraction means 241, performer information display section determination means 242, and appearance pattern analysis means 244. These means 241, 242, and 244 are constructed in the information processing apparatus 1 by incorporating a performer information extraction program into the arithmetic device. They will be described in more detail below.
 The above program is provided to the information processing apparatus 1, for example, stored in advance in a storage device with which the information processing apparatus 1 is equipped, or stored in a storage medium such as a CD-ROM. Alternatively, the program may be stored in a storage device of another server computer on a network and provided from that server computer to the information processing apparatus 1 via the network.
 The appearance pattern analysis means 244 acquires the recognized character information including the character information extracted from the video by the character information extraction means 3 disclosed in FIG. 1. The appearance pattern analysis means 244 then analyzes an appearance pattern representing the temporal appearance of the character information included in the recognized character information within the video, and an appearance pattern representing its appearance in the display layout of the character information with respect to the playback area of the video.
 Specifically, in the analysis of the temporal appearance pattern of the character information (the former case above), appearance frequency information of character strings in the video is calculated for each time within the credit information. Performer information is typically displayed in order from leading roles to supporting roles. When a leading performer is displayed, that person is often displayed alone, so the appearance frequency of character strings per unit time is low. In the case of supporting performers, on the other hand, several people are often displayed together, so the appearance frequency of character strings per unit time is high. The appearance frequency information of character strings over the playback time of the video can therefore be used by the performer information display section determination means 242, described later, to specify the performer information display section.
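 As an illustration of how such frequency information could be tallied, the following sketch bins the display times of recognized strings into unit-time slots; the tuple layout of the recognized character information is an assumption made for the example.

```python
from collections import Counter

def string_frequency_per_second(recognized, bin_size=1.0):
    """Number of credit strings visible in each time bin.

    recognized: list of (text, start_time, end_time) tuples taken from the
                recognized character information.
    Returns a Counter mapping bin index -> number of strings visible in that bin.
    """
    freq = Counter()
    for _text, start, end in recognized:
        first_bin, last_bin = int(start // bin_size), int(end // bin_size)
        for b in range(first_bin, last_bin + 1):
            freq[b] += 1
    return freq

# Low counts suggest a leading performer shown alone; high counts suggest several
# supporting performers (or staff lists) shown together.
```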
 The spatial layout of the character information (the latter case above), on the other hand, can be analyzed when the recognized character information extracted beforehand by the character information extraction means 3 includes layout information such as the position and size of each character string within the display screen (frame). In this case, it is determined from the layout information of the recognized character strings whether there is a character string displayed in correspondence with a preset layout structure, for example two character strings displayed on one line, such as a character string representing a type of right or role and a person's name (the keyword "screenplay" together with the screenwriter's name, and so on). A performer may be displayed together with a cast (role) name, but is also often displayed by name alone, without any cast name. In such cases, when character strings that have no corresponding character string under a specific layout structure such as the same line are displayed in succession, it is considered highly likely that performers are being displayed. Analyzing the layout structure of the character strings in this way can therefore be used to improve the accuracy with which the performer information display section determination means 242, described later, determines the performer information display section.
 Even if a cast name and a performer's name are displayed simultaneously in a specific layout, they may be displayed in a layout that differs from that used for displaying the information of other right holders. Therefore, when other corresponding character strings under a specific layout structure are detected, their spatial positional relationship may be analyzed, and any change in that correspondence may be detected. For example, depending on the character spacing between corresponding character strings, it is possible to analyze and distinguish whether "right holder information" and "person's name" are displayed simultaneously, or whether "cast name" and "person's name" are displayed simultaneously.
 It is also possible to calculate information corresponding to the appearance frequency information of character strings from the spatial layout information. For example, since the distance between character strings displayed on the same screen (frame) is inversely related to the appearance frequency described above, the appearance frequency information of character strings per unit time may be calculated from this distance information.
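 A simple layout check of the kind described above could look like the following sketch, which tests whether a recognized string shares a line with a role or right keyword; the keyword set, tuple layout, and tolerance are assumptions for the example only.

```python
ROLE_KEYWORDS = {"脚本", "監督", "プロデューサー", "音楽"}   # illustrative keyword set

def has_paired_role_keyword(text_box, boxes_on_frame, y_tolerance=8):
    """Return True if a recognized string shares a line with a role/right keyword.

    text_box:       (text, x, y, width, height) for one recognized string.
    boxes_on_frame: all recognized strings on the same frame, same tuple layout.
    Strings with no such partner, appearing in succession, are a cue that
    performer names are being displayed alone.
    """
    _text, _x, y, _w, _h = text_box
    for other_text, _ox, oy, _ow, _oh in boxes_on_frame:
        if other_text in ROLE_KEYWORDS and abs(oy - y) <= y_tolerance:
            return True      # same line as a role keyword -> likely staff credit
    return False
```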
 The temporal appearance pattern and/or the spatial layout analysis result obtained in this way are extracted as character string appearance pattern analysis information and output to the performer information display section determination means 242 (step S31, appearance pattern analysis step).
 The performer information display section candidate extraction means 241 is substantially the same as that of the second embodiment described above. That is, it outputs performer information display section candidate information, which represents candidate time sections that may include performer information, to the performer information display section determination means 242 (step S32, performer information display section candidate extraction step).
 Note that the processing in step S31 by the appearance pattern analysis means 244 and the processing in step S32 by the performer information display section candidate extraction means 241 are not limited to being executed in the order described above; they may be executed in the reverse order or in parallel.
 The performer information display section determination means 242 acquires the appearance pattern analysis information and the performer information display section candidate information output from the performer information display section candidate extraction means 241 disclosed in FIG. 5. The performer information display section determination means 242 also acquires the credit section information extracted from the video by the credit section extraction means 2 disclosed in FIG. 1. It then calculates and extracts the performer information display section from the credit time section information, the appearance pattern analysis result information, and the performer information display section candidate information (step S33, performer information display section determination step).
 Specifically, the performer information display section determination means 242 first uses the credit section information to calculate, for each candidate section included in the performer information display section candidate information, the relative time from the start of the credit display. Subsequently, the performer information display section determination means 242 uses the appearance pattern analysis result information to calculate, for each candidate section, the probability that performer information is displayed there over time. For example, when the appearance pattern analysis result information includes temporal appearance frequency information of character strings, the increase and decrease of the appearance frequency within each candidate section are analyzed. The degree to which this analysis result matches preset information representing the temporal character-string appearance frequency characteristics peculiar to performer display is then determined, and the probability that performers are displayed is calculated from that degree of matching.
 When the appearance pattern analysis result information includes the analysis result of the spatial layout information, that information is used to calculate the probability that performer information is displayed, in terms of layout, within each candidate section. The degree to which this analysis result matches preset information representing the spatial character-string layout characteristics peculiar to performer display is determined, and the probability that performers are displayed is calculated from that degree. However, when performer information is displayed together with cast names and the positional relationship between the cast name and the performer does not differ from that of other right holder information, it is difficult to determine from the spatial layout whether performer information is displayed, and in that case the spatial layout information is not used.
 When the appearance pattern analysis result information includes both the temporal appearance frequency information of character strings and the analysis result of the spatial layout information, both pieces of information are used to determine the degrees of matching with the temporal appearance frequency characteristics and the spatial layout characteristics of character strings, and the probability that performers are displayed is calculated from those degrees.
 The performer information display section is then selected and output using the probability that performer information is displayed, calculated in this way, together with the relative time information from the start of the credit display. For example, since performers are often displayed in the first half of the credits, the candidate section with the largest probability calculated as described above, among the candidate sections included in the first half of the credits, is selected as the performer information display section. At this time, a criterion concerning the length of the section in which performer information is displayed may further be provided, the validity of the section length may be verified, and the performer information display section may be determined by combining this with the probability described above.
 Alternatively, the sections that can be performer information display sections may be narrowed down in advance from the relative time of each candidate section from the beginning of the credits and the section length, after which the above probability is calculated and the candidate section with the largest probability is determined and output as the performer information display section.
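 One possible form of this determination is sketched below, restricting candidates to the first half of the credits and to a minimum length before ranking them by the pattern-derived probability; the pattern_probability function and the thresholds are assumptions standing in for the matching described above.

```python
def determine_section_embodiment4(candidates, credit_start, credit_end,
                                  pattern_probability, min_length=5.0):
    """Step S33: combine relative position in the credits with pattern analysis.

    candidates:          list of (start, end) candidate sections.
    pattern_probability: function (start, end) -> probability that performers are
                         displayed there, derived from the appearance pattern
                         analysis result information.
    """
    half = credit_start + (credit_end - credit_start) / 2.0
    best, best_prob = None, -1.0
    for start, end in candidates:
        if end - start < min_length:
            continue                       # section length regarded as invalid
        if start >= half:
            continue                       # performers are usually in the first half
        prob = pattern_probability(start, end)
        if prob > best_prob:
            best, best_prob = (start, end), prob
    return best
```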
 <Embodiment 5>
 A fifth embodiment of the present invention will be described with reference to FIGS. 11 and 12. FIG. 11 is a functional block diagram showing the configuration of the information processing apparatus, and FIG. 12 is a flowchart showing the operation of the information processing apparatus.
 The information processing apparatus 1 in the present embodiment has substantially the same configuration as those of the third and fourth embodiments described above. In the present embodiment, however, the configuration of the performer information display section extraction means 4 is different. Accordingly, the configuration and operation of the performer information display section extraction means 4 will mainly be described below.
 As shown in FIG. 11, the performer information display section extraction means 4 of the information processing apparatus 1 in the present embodiment includes performer information display section candidate extraction means 341, performer information display section determination means 342, performer information display probability calculation means 343, and appearance pattern analysis means 344. These means 341 to 344 are constructed in the information processing apparatus 1 by incorporating a performer information extraction program into the arithmetic device. They will be described in more detail below.
 The above program is provided to the information processing apparatus 1, for example, stored in advance in a storage device with which the information processing apparatus 1 is equipped, or stored in a storage medium such as a CD-ROM. Alternatively, the program may be stored in a storage device of another server computer on a network and provided from that server computer to the information processing apparatus 1 via the network.
 First, the performer information display probability calculation means 343 is substantially the same as that of the third embodiment described above. That is, the performer information display probability calculation means 343 first acquires the credit section information extracted from the video by the credit section extraction means 2 disclosed in FIG. 1. Then, based on the "ease-of-display information" stored in advance in the information processing apparatus 1 and representing the relationship between the playback time of the credit information in the video and how likely performer information is to be displayed, it calculates, as performer information display probability information, the probability that performer information can be displayed at each time within the credit section information. The performer information display probability calculation means 343 then outputs the calculated probability that performer information is displayed at each playback time of the video to the performer information display section determination means 342 (step S41, performer information display probability calculation step).
 The appearance pattern analysis means 344 is substantially the same as that of the fourth embodiment described above. That is, the appearance pattern analysis means 344 acquires the recognized character information including the character information extracted from the video by the character information extraction means 3 disclosed in FIG. 1. It then extracts, as appearance pattern analysis information, an appearance pattern representing the temporal appearance of the character information included in the recognized character information within the video and/or an appearance pattern representing its appearance in the layout with respect to the playback area of the video, and outputs this information to the performer information display section determination means 342 (step S42, appearance pattern analysis step).
 The performer information display section candidate extraction means 341 is substantially the same as that of the second embodiment described above. That is, it outputs performer information display section candidate information, which represents candidate time sections that may include performer information, to the performer information display section determination means 342 (step S43, performer information display section candidate extraction step).
 Note that the processing by the performer information display probability calculation means 343, the appearance pattern analysis means 344, and the performer information display section candidate extraction means 341 described above is not limited to being executed in the order shown in FIG. 12; it may be executed in any order, or in parallel.
 The performer information display section determination means 342 then calculates the performer information display section from the performer information display probability, the appearance pattern analysis result information, and the performer information display section candidate information (step S44, performer information display section determination step). Specifically, first, as in the third embodiment described above, for each candidate section specified in the performer information display section candidate information, the probability that performer information can be displayed in that candidate section is calculated from its time information. Next, as in the fourth embodiment described above, the probability that performer information is displayed is calculated for each candidate section from the appearance pattern analysis result information and multiplied by the probability obtained from the time information. The section with the largest resulting probability is then selected as the performer information display section. Alternatively, as in the third embodiment described above, the validity of the section length may be verified and the performer information display section determined by combining this with the above probability. Information describing the selected section is then output as the performer information display section information. Performer information can thereby be extracted with higher accuracy.
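 The combination in step S44 can be sketched as follows, assuming the two probabilities are supplied as per-section scoring functions (for example, built from the sketches given for the third and fourth embodiments); the function names are illustrative.

```python
def determine_section_embodiment5(candidates, time_probability, pattern_probability):
    """Step S44: multiply the time-based and pattern-based probabilities per candidate.

    time_probability:    (start, end) -> probability from the ease-of-display
                         information (as in the third embodiment).
    pattern_probability: (start, end) -> probability from the appearance pattern
                         analysis result information (as in the fourth embodiment).
    """
    scored = [(time_probability(s, e) * pattern_probability(s, e), (s, e))
              for s, e in candidates]
    if not scored:
        return None
    _best_score, best_section = max(scored)
    return best_section   # output as performer information display section information
```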
 <Embodiment 6>
 A sixth embodiment of the present invention will be described with reference to FIG. 1. FIG. 1 is a functional block diagram showing the configuration of the information processing apparatus. In the present embodiment, an outline of the configuration of the information processing apparatus is described.
 An information processing apparatus 1 according to one aspect of the present invention comprises:
 credit section extraction means 2 for extracting, based on input video information having a predetermined playback time, a time section in which character information is superimposed on the video information, as credit section information;
 character information extraction means 3 for performing character recognition processing on the video information and extracting, as recognized character information, the character information included in the video information in association with playback time information representing the time at which that character information is played back;
 performer information display section extraction means 4 for extracting, based on the credit section information and the recognized character information, performer information display section information, which is a time section in which performer information representing the names of the performers appearing in the video is displayed within the character information; and
 performer information extraction means 5 for extracting, as the performer information, the character information included in the recognized character information within the time section specified by the performer information display section information.
 According to the information processing apparatus configured as described above, the information processing apparatus first extracts, as credit section information, the time section in which the character information included in the video is played back. The information processing apparatus also performs character recognition processing on the video and extracts the character information and its playback time information as recognized character information. Further, based on the credit section information and the recognized character information, the information processing apparatus extracts, as performer information display section information, the time section in the video in which performer information representing the names of the performers appearing in the video is displayed. The information processing apparatus then extracts the character information displayed in the time section of the video specified by the extracted performer information display section information as the names of the performers.
 In this way, the time section in which performer information is displayed is specified based on the time section in which the credits are displayed and on the character content of the credits. Performer information can therefore be extracted from the video easily, with high accuracy, and at low cost, and the right information included in the content can be specified.
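 The overall flow of means 2 to 5 can be expressed as the following sketch, in which the three extraction steps are passed in as functions; the type names and signatures are assumptions used only to make the data flow explicit.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class RecognizedString:
    text: str
    start: float   # playback time at which the string appears (seconds)
    end: float     # playback time at which it disappears

Section = Tuple[float, float]

def extract_performers(video,
                       extract_credit_section: Callable[[object], Section],
                       extract_characters: Callable[[object], List[RecognizedString]],
                       extract_display_section: Callable[[Section, List[RecognizedString]], Section]
                       ) -> List[str]:
    """End-to-end flow of means 2 -> 3 -> 4 -> 5."""
    credit_section = extract_credit_section(video)                          # means 2
    recognized = extract_characters(video)                                  # means 3
    display_section = extract_display_section(credit_section, recognized)   # means 4
    start, end = display_section
    return [r.text for r in recognized if start <= r.start <= end]          # means 5
```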
 In the above information processing apparatus, the performer information display section extraction means extracts, as the time section in which performer information is displayed, a time section in which the character information included in the recognized character information does not include preset specific characters.
 In the above information processing apparatus, the specific characters are characters representing the role of a person who does not appear in the video represented by the video information but is involved in the production of the video information.
 As a result, the information processing apparatus treats, as the time section in which performer information is displayed, a time section that does not include characters identifying persons who are not performers, such as characters representing the roles of those involved in the production of the video information, for example the director or producer. Performer information can therefore be extracted with higher accuracy.
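 A minimal sketch of this filtering is given below, grouping the display times of strings that contain no production-role keyword into candidate sections; the keyword set, tuple layout, and gap value are illustrative assumptions.

```python
NON_PERFORMER_KEYWORDS = {"監督", "演出", "プロデューサー", "制作", "脚本", "撮影"}

def candidate_sections(recognized, gap=2.0):
    """Merge the display times of strings with no production-role keyword into
    candidate time sections (performer information display section candidates).

    recognized: list of (text, start, end) tuples from the recognized character info.
    gap:        two strings closer than this many seconds join the same section.
    """
    times = sorted((s, e) for t, s, e in recognized
                   if not any(k in t for k in NON_PERFORMER_KEYWORDS))
    sections = []
    for start, end in times:
        if sections and start - sections[-1][1] <= gap:
            sections[-1][1] = max(sections[-1][1], end)   # extend current section
        else:
            sections.append([start, end])                 # open a new section
    return [tuple(s) for s in sections]
```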
 In the above information processing apparatus, the performer information display section extraction means comprises:
 performer information display section candidate extraction means for extracting, as performer information display section candidate information representing candidate time sections in which the performer information is displayed, time sections in which the character information included in the recognized character information does not include preset specific characters; and
 performer information display section determination means for extracting, based on the credit section information and the performer information display section candidate information, the performer information display section information representing the time section in which the performer information is displayed within the character information.
 In the above information processing apparatus, the performer information display section determination means extracts the performer information display section information based on the length of the time sections represented by the performer information display section candidate information.
 In the above information processing apparatus, the performer information display section determination means extracts, as the performer information display section information, at least the beginning-side portion, determined based on the credit section information, of the longest of the time sections represented by the performer information display section candidate information.
 In this way, time sections in which the specific characters are not displayed are extracted as candidates for the time section in which performer information is displayed, and the time section in which performer information is displayed is then extracted from among those candidates based on, for example, the length of each time section or its relative time with respect to the whole video. Performer information can therefore be extracted with higher accuracy.
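 One way to select the longest candidate and keep its beginning side is sketched below; the fraction kept from the start (head_ratio) is an assumed value, not something specified by the embodiment.

```python
def longest_candidate_head(candidates, head_ratio=0.5):
    """Pick the longest candidate section and keep at least its beginning side.

    candidates: list of (start, end) sections with no non-performer keywords.
    head_ratio: fraction of the longest section kept from its start (assumed value).
    """
    if not candidates:
        return None
    start, end = max(candidates, key=lambda s: s[1] - s[0])   # longest section
    head_end = start + (end - start) * head_ratio
    return (start, head_end)
```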
 In the above information processing apparatus,
 the performer information display section extraction means comprises performer information display probability calculation means for calculating, as performer information display probability information, the probability that the performer information can be displayed at each time within the credit section information, based on the credit section information and on preset ease-of-display information representing the relationship between the playback time of the character information and how likely performer information is to be displayed, and
 the performer information display section determination means included in the performer information display section extraction means extracts, based on the performer information display probability information, the performer information display section information representing the time section in which the performer information is displayed within the character information, from among the performer information display section candidate information.
 In the above information processing apparatus, the performer information display section determination means sets, as the time section in which the performer information is displayed, a time section including the time at which the probability in the performer information display probability information is largest.
 As a result, the display probability of performer information at each playback time is calculated based on information, prepared in advance by statistics or the like, that represents how likely performer information is to be displayed at each playback time of the video, and the time section in which performer information is displayed is determined based on the calculated probabilities, for example as the time section in which the probability is largest. Performer information can therefore be extracted with higher accuracy.
 In the above information processing apparatus,
 the recognized character information includes appearance pattern information representing the appearance of the character information in the video, and
 the performer information display section extraction means calculates the performer information display section based on the appearance pattern information included in the recognized character information.
 In the above information processing apparatus,
 the performer information display section extraction means comprises appearance pattern analysis means for extracting, from the recognized character information, appearance pattern information representing the appearance of the character information in the video, and
 the performer information display section determination means included in the performer information display section extraction means extracts, based on the credit time section information, the performer information display section candidate information, and the appearance pattern information, the performer information display section information representing the time section in which the performer information is displayed within the character information.
 Further, in the above information processing apparatus,
 the performer information display section extraction means comprises appearance pattern analysis means for extracting, from the recognized character information, appearance pattern information representing the appearance of the character information in the video, and
 the performer information display section determination means included in the performer information display section extraction means extracts, based on the performer information display probability information, the performer information display section candidate information, and the appearance pattern information, the performer information display section information representing the time section in which the performer information is displayed within the character information.
 In the above information processing apparatus, the appearance pattern analysis means extracts, as the appearance pattern, the appearance frequency of the character information over the playback time of the video, based on the recognized character information.
 In the above information processing apparatus, the appearance pattern analysis means extracts, as the appearance pattern, the layout of the character information with respect to the playback area of the video, based on the recognized character information.
 As a result, the time section in which performer information is displayed is extracted based on appearance patterns such as the appearance frequency of the character information in the video and its layout with respect to the playback area. Performer information can therefore be extracted with higher accuracy.
 In the above information processing apparatus, the credit section extraction means detects a roll telop in which the character information scrolls in a predetermined direction on the played-back video, and sets the time section in which the roll telop is played back as the credit time section information.
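 One possible rough test for such a roll telop is sketched below, comparing consecutive grayscale frames against vertically shifted copies of one another; the shift range and similarity threshold are assumptions, and the actual detection method of the credit section extraction means is not limited to this.

```python
import numpy as np

def looks_like_roll_telop(prev_frame, frame, max_shift=12, min_shift=2, threshold=0.9):
    """Rough test for upward-scrolling text between two consecutive grayscale frames.

    prev_frame, frame: 2-D numpy arrays of equal shape (grayscale, float).
    A consistent small upward shift with high similarity suggests a roll telop.
    """
    best_shift, best_sim = 0, -1.0
    for shift in range(min_shift, max_shift + 1):
        shifted = np.roll(prev_frame, -shift, axis=0)            # scroll upward by `shift`
        a, b = shifted[:-shift].ravel(), frame[:-shift].ravel()  # ignore wrapped rows
        sim = np.corrcoef(a, b)[0, 1]                            # similarity of overlap
        if sim > best_sim:
            best_shift, best_sim = shift, sim
    return best_sim >= threshold and best_shift >= min_shift
```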
 In the above information processing apparatus, the credit section extraction means detects music played on the played-back video and sets the time section in which the music is played as the credit time section information.
 The information processing apparatus described above can be realized by incorporating a program into the information processing apparatus.
 Specifically, a program according to another aspect of the present invention causes an information processing apparatus to realize:
 credit section extraction means for extracting, based on input video information having a predetermined playback time, a time section in which character information is superimposed on the video information, as credit section information;
 character information extraction means for performing character recognition processing on the video information and extracting, as recognized character information, the character information included in the video information in association with playback time information representing the time at which that character information is played back;
 performer information display section extraction means for extracting, based on the credit section information and the recognized character information, performer information display section information, which is a time section in which performer information representing the names of the performers appearing in the video is displayed within the character information; and
 performer information extraction means for extracting, as the performer information, the character information included in the recognized character information within the time section specified by the performer information display section information.
 In the above program, the performer information display section extraction means extracts, as the time section in which performer information is displayed, a time section in which the character information included in the recognized character information does not include preset specific characters.
 The above program also causes the information processing apparatus to realize the performer information display section extraction means comprising:
 performer information display section candidate extraction means for extracting, as performer information display section candidate information representing candidate time sections in which the performer information is displayed, time sections in which the character information included in the recognized character information does not include preset specific characters; and
 performer information display section determination means for extracting, based on the credit section information and the performer information display section candidate information, the performer information display section information representing the time section in which the performer information is displayed within the character information.
 An information processing method executed by operation of the information processing apparatus described above:
 extracts, based on input video information having a predetermined playback time, a time section in which character information is superimposed on the video information, as credit section information;
 before or after the extraction of the credit section information, performs character recognition processing on the video information and extracts, as recognized character information, the character information included in the video information in association with playback time information representing the time at which that character information is played back;
 extracts, based on the credit section information and the recognized character information, performer information display section information, which is a time section in which performer information representing the names of the performers appearing in the video is displayed within the character information; and
 extracts, as the performer information, the character information included in the recognized character information within the time section specified by the performer information display section information.
 In the above information processing method, when the performer information display section information is extracted, a time section in which the character information included in the recognized character information does not include preset specific characters is extracted as the time section in which performer information is displayed.
 In the above information processing method, when the performer information display section information is extracted:
 time sections in which the character information included in the recognized character information does not include preset specific characters are extracted as performer information display section candidate information representing candidate time sections in which the performer information is displayed; and
 based on the credit section information and the performer information display section candidate information, the performer information display section information representing the time section in which the performer information is displayed within the character information is extracted.
 Even an invention of a program or of an information processing method having the configuration described above can achieve the above-described object of the present invention, because it operates in the same manner as the above information processing apparatus.
 Although the present invention has been described above with reference to the above embodiments, the present invention is not limited to the embodiments described above. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention.
 The present invention enjoys the benefit of the priority claim based on Japanese Patent Application No. 2008-297756, filed in Japan on November 21, 2008, the entire contents of which are incorporated herein.
 The present invention can be used by operators that manage or use video to automatically extract the rights of performers from video information, and thus has industrial applicability.
DESCRIPTION OF SYMBOLS
1 Information processing apparatus
2 Credit section extraction means
3 Character information extraction means
4 Performer information display section extraction means
5 Performer information extraction means
41, 141, 241, 341 Performer information display section candidate extraction means
42, 142, 242, 342 Performer information display section determination means
143, 343 Performer information display probability calculation means
244, 344 Appearance pattern analysis means

Claims (21)

  1. An information processing apparatus comprising:
     credit section extraction means for extracting, based on input video information having a predetermined playback time, a time section in which character information is superimposed on the video information, as credit section information;
     character information extraction means for performing character recognition processing on the video information and extracting, as recognized character information, the character information included in the video information in association with playback time information representing the time at which the character information is played back;
     performer information display section extraction means for extracting, based on the credit section information and the recognized character information, performer information display section information, which is a time section in which performer information representing the names of the performers appearing in the video is displayed within the character information; and
     performer information extraction means for extracting, as the performer information, the character information included in the recognized character information within the time section specified by the performer information display section information.
  2. The information processing apparatus according to claim 1, wherein
     the performer information display section extraction means extracts, as the time section in which performer information is displayed, a time section in which the character information included in the recognized character information does not include preset specific characters.
  3. The information processing apparatus according to claim 2, wherein
     the specific characters are characters representing the role of a person who does not appear in the video represented by the video information but is involved in the production of the video information.
  4. The information processing apparatus according to claim 1, wherein
     the performer information display section extraction means comprises:
     performer information display section candidate extraction means for extracting, as performer information display section candidate information representing candidate time sections in which the performer information is displayed, time sections in which the character information included in the recognized character information does not include preset specific characters; and
     performer information display section determination means for extracting, based on the credit section information and the performer information display section candidate information, the performer information display section information representing the time section in which the performer information is displayed within the character information.
  5. The information processing apparatus according to claim 4, wherein
     the performer information display section determination means extracts the performer information display section information based on the length of the time sections represented by the performer information display section candidate information.
  6. The information processing apparatus according to claim 5, wherein
     the performer information display section determination means extracts, as the performer information display section information, at least the beginning-side portion, determined based on the credit section information, of the longest of the time sections represented by the performer information display section candidate information.
  7. The information processing apparatus according to any one of claims 4 to 6, wherein
     the performer information display section extraction means comprises performer information display probability calculation means for calculating, as performer information display probability information, the probability that the performer information can be displayed at each time within the credit section information, based on the credit section information and on preset ease-of-display information representing the relationship between the playback time of the character information and how likely performer information is to be displayed, and
     the performer information display section determination means included in the performer information display section extraction means extracts, based on the performer information display probability information, the performer information display section information representing the time section in which the performer information is displayed within the character information, from among the performer information display section candidate information.
  8. The information processing apparatus according to claim 7, wherein
     the performer information display section determination means sets, as the time section in which the performer information is displayed, a time section including the time at which the probability in the performer information display probability information is largest.
  9. The information processing apparatus according to any one of claims 1 to 8, wherein
     the recognized character information includes appearance pattern information representing the appearance of the character information in the video, and
     the performer information display section extraction means calculates the performer information display section based on the appearance pattern information included in the recognized character information.
  10. The information processing apparatus according to any one of claims 4 to 6, wherein
     the performer information display section extraction means comprises appearance pattern analysis means for extracting, from the recognized character information, appearance pattern information representing the appearance of the character information in the video, and
     the performer information display section determination means included in the performer information display section extraction means extracts, based on the credit time section information, the performer information display section candidate information, and the appearance pattern information, the performer information display section information representing the time section in which the performer information is displayed within the character information.
  11.  The information processing apparatus according to claim 7 or 8, wherein
     the performer information display section extracting means includes appearance pattern analyzing means for extracting, from the recognized character information, appearance pattern information representing how the character information appears in the video, and
     the performer information display section determining means included in the performer information display section extracting means extracts the performer information display section information representing a time section in which the performer information is displayed in the character information, based on the performer information display probability information, the performer information display section candidate information, and the appearance pattern information.
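The determining means of claims 10 and 11 combines three signals: the candidate sections, the credit section information or the display probability, and the appearance pattern. The specification does not prescribe a particular fusion, so the sketch below is only one hypothetical scoring scheme (all names, weights, and data shapes are assumptions): each candidate is scored by the display probability at its midpoint and by how much recognized text falls inside it, and the best-scoring candidate is kept.

from typing import Callable, Dict, List, Optional, Tuple

def determine_display_section(
    candidates: List[Tuple[float, float]],
    display_probability: Callable[[float], float],   # e.g. the function sketched under claim 8
    appearance_frequency: Dict[int, int],             # recognized text items per time bin
    bin_seconds: float = 5.0,
    weight: float = 0.5,
) -> Optional[Tuple[float, float]]:
    """Return the candidate section with the highest combined score, or None."""
    best, best_score = None, float("-inf")
    for start, end in candidates:
        mid = (start + end) / 2.0
        bins = range(int(start // bin_seconds), int(end // bin_seconds) + 1)
        raw = sum(appearance_frequency.get(b, 0) for b in bins) / len(bins)
        freq = raw / (raw + 1.0)   # squash the mean text count into [0, 1)
        score = weight * display_probability(mid) + (1.0 - weight) * freq
        if score > best_score:
            best, best_score = (start, end), score
    return best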
  12.  The information processing apparatus according to claim 10 or 11, wherein
     the appearance pattern analyzing means extracts, as the appearance pattern, the appearance frequency of the character information over the reproduction time of the video, based on the recognized character information.
  13.  The information processing apparatus according to any one of claims 10 to 12, wherein
     the appearance pattern analyzing means extracts, as the appearance pattern, the layout of the character information with respect to the reproduction area of the video, based on the recognized character information.
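Claims 12 and 13 characterize the appearance pattern used by claims 9 to 11 in two ways: how often recognized text appears as playback proceeds, and how that text is laid out within the frame. A rough Python sketch under assumed data structures (each recognized item carries a time stamp and a bounding box; the RecognizedText type and all names are hypothetical) is:

from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class RecognizedText:
    text: str
    time: float                      # seconds from the start of the video
    box: Tuple[int, int, int, int]   # (x, y, width, height) in pixels

def appearance_frequency(items: List[RecognizedText], bin_seconds: float = 5.0) -> Counter:
    """Claim 12: number of recognized text items per time bin."""
    return Counter(int(item.time // bin_seconds) for item in items)

def layout_histogram(items: List[RecognizedText], frame_w: int, frame_h: int) -> Counter:
    """Claim 13: coarse layout of the text relative to the reproduction area,
    bucketed into a 3x3 grid of the frame."""
    hist = Counter()
    for item in items:
        x, y, w, h = item.box
        cx, cy = x + w / 2, y + h / 2
        col = min(int(3 * cx / frame_w), 2)
        row = min(int(3 * cy / frame_h), 2)
        hist[(row, col)] += 1
    return hist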
  14.  The information processing apparatus according to any one of claims 1 to 13, wherein
     the credit section extracting means detects a roll telop in which the character information scrolls in a predetermined direction on the reproduced video, and sets the time section in which the roll telop is reproduced as the credit section information.
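One simple way to approximate the roll telop detection of claim 14 (this is an illustrative approach, not necessarily the one in the specification) is to estimate the frame-to-frame vertical shift with phase correlation and flag runs of frames that keep scrolling. The sketch below assumes frames have already been decoded into grayscale numpy arrays and that OpenCV is available.

import numpy as np
import cv2

def vertical_shift(prev_gray: np.ndarray, cur_gray: np.ndarray) -> float:
    """Estimated vertical displacement, in pixels, between two grayscale frames."""
    (dx, dy), _ = cv2.phaseCorrelate(prev_gray.astype(np.float32),
                                     cur_gray.astype(np.float32))
    return dy

def roll_telop_sections(frames, fps: float, min_shift: float = 1.0,
                        min_duration: float = 10.0):
    """frames: list of grayscale frames in playback order.
    Returns (start_sec, end_sec) sections with a sustained vertical scroll."""
    sections, run_start, prev = [], None, None
    for i, frame in enumerate(frames):
        if prev is not None:
            scrolling = abs(vertical_shift(prev, frame)) >= min_shift
            if scrolling and run_start is None:
                run_start = i
            elif not scrolling and run_start is not None:
                if (i - run_start) / fps >= min_duration:
                    sections.append((run_start / fps, i / fps))
                run_start = None
        prev = frame
    if run_start is not None and (len(frames) - run_start) / fps >= min_duration:
        sections.append((run_start / fps, len(frames) / fps))
    return sections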
  15.  The information processing apparatus according to any one of claims 1 to 13, wherein
     the credit section extracting means detects music played in the reproduced video and sets the time section in which the music is played as the credit section information.
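For claim 15, the credit section is keyed off continuously playing music. Assuming an upstream audio classifier has already produced a per-second music likelihood score (how that score is obtained is outside this sketch, and every name here is an assumption), the longest sustained high-score run can serve as the credit section:

from typing import List, Optional, Tuple

def music_credit_section(scores: List[float], threshold: float = 0.7,
                         min_seconds: int = 30) -> Optional[Tuple[int, int]]:
    """scores[t] is a music likelihood in [0, 1] for second t.
    Returns the longest run of seconds whose score stays above the threshold."""
    best, run_start = None, None
    for t, s in enumerate(scores + [0.0]):          # sentinel closes a final run
        if s >= threshold and run_start is None:
            run_start = t
        elif s < threshold and run_start is not None:
            length = t - run_start
            if length >= min_seconds and (best is None or length > best[1] - best[0]):
                best = (run_start, t)
            run_start = None
    return best   # (start_second, end_second) or None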
  16.  A program for causing an information processing apparatus to implement:
     credit section extracting means for extracting, as credit section information, a time section in which character information is superimposed on input video information having a predetermined reproduction time, based on the video information;
     character information extracting means for performing character recognition processing on the video information and extracting, as recognized character information, the character information contained in the video information in association with reproduction time information representing the time at which the character information is reproduced;
     performer information display section extracting means for extracting, based on the credit section information and the recognized character information, performer information display section information that is a time section in which performer information representing the names of performers appearing in the video is displayed in the character information; and
     performer information extracting means for extracting, as the performer information, the character information included in the recognized character information within the time section specified by the performer information display section information.
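Claim 16 chains four means into a single program: credit section extraction, character recognition with reproduction times, performer display section extraction, and performer name extraction. The skeleton below renders that data flow in Python with every component stubbed out as a callable; none of these interfaces come from the specification, they are assumptions chosen to keep the example self-contained.

from typing import Callable, List, Tuple

TimeSection = Tuple[float, float]        # (start_sec, end_sec)
RecognizedChar = Tuple[float, str]       # (time_sec, recognized text)

def extract_performers(
    video_path: str,
    extract_credit_section: Callable[[str], TimeSection],
    recognize_characters: Callable[[str], List[RecognizedChar]],
    extract_display_section: Callable[[TimeSection, List[RecognizedChar]], TimeSection],
) -> List[str]:
    # (1) Credit section extracting means.
    credit_section = extract_credit_section(video_path)
    # (2) Character information extracting means (text plus reproduction time).
    recognized = recognize_characters(video_path)
    # (3) Performer information display section extracting means.
    start, end = extract_display_section(credit_section, recognized)
    # (4) Performer information extracting means: keep the text inside that section.
    return [text for t, text in recognized if start <= t <= end]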
  17.  The program according to claim 16, wherein
     the performer information display section extracting means extracts, as the time section in which the performer information is displayed, a time section in which the character information included in the recognized character information does not contain preset specific characters.
  18.  The program according to claim 16, causing the information processing apparatus to implement the performer information display section extracting means comprising:
     performer information display section candidate extracting means for extracting, as performer information display section candidate information representing candidate time sections in which the performer information is displayed, time sections in which the character information included in the recognized character information does not contain preset specific characters; and
     performer information display section determining means for extracting, based on the credit section information and the performer information display section candidate information, the performer information display section information representing a time section in which the performer information is displayed in the character information.
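Claims 17 and 18 narrow the display section by discarding any time at which the recognized text contains preset specific characters (role labels such as 監督 or スタッフ are given here purely as hypothetical examples) and by intersecting the surviving candidate sections with the credit section. A brief sketch, with all names and the keyword list assumed:

from typing import Iterable, List, Tuple

EXCLUDED = ("監督", "演出", "制作", "スタッフ")   # hypothetical preset specific characters

def candidate_sections(recognized: Iterable[Tuple[float, str]],
                       gap: float = 2.0) -> List[Tuple[float, float]]:
    """Group consecutive time stamps whose text avoids EXCLUDED into candidate sections."""
    times = sorted(t for t, text in recognized
                   if not any(word in text for word in EXCLUDED))
    sections: List[Tuple[float, float]] = []
    for t in times:
        if sections and t - sections[-1][1] <= gap:
            sections[-1] = (sections[-1][0], t)
        else:
            sections.append((t, t))
    return sections

def determine_display_sections(credit: Tuple[float, float],
                               candidates: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Claim 18's determining means: keep candidates that overlap the credit section."""
    cs, ce = credit
    return [(max(s, cs), min(e, ce)) for s, e in candidates if s <= ce and e >= cs]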
  19.  An information processing method comprising:
     extracting, as credit section information, a time section in which character information is superimposed on input video information having a predetermined reproduction time, based on the video information;
     before or after the extraction of the credit section information, performing character recognition processing on the video information and extracting, as recognized character information, the character information contained in the video information in association with reproduction time information representing the time at which the character information is reproduced;
     extracting, based on the credit section information and the recognized character information, performer information display section information that is a time section in which performer information representing the names of performers appearing in the video is displayed in the character information; and
     extracting, as the performer information, the character information included in the recognized character information within the time section specified by the performer information display section information.
  20.  The information processing method according to claim 19, wherein,
     when extracting the performer information display section information, a time section in which the character information included in the recognized character information does not contain preset specific characters is extracted as the time section in which the performer information is displayed.
  21.  The information processing method according to claim 20, wherein extracting the performer information display section information includes:
     extracting, as performer information display section candidate information representing candidate time sections in which the performer information is displayed, time sections in which the character information included in the recognized character information does not contain preset specific characters; and
     extracting, based on the credit section information and the performer information display section candidate information, the performer information display section information representing a time section in which the performer information is displayed in the character information.
PCT/JP2009/004705 2008-11-21 2009-09-18 Information processing device WO2010058509A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2010539115A JP5304795B2 (en) 2008-11-21 2009-09-18 Information processing device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2008-297756 2008-11-21
JP2008297756 2008-11-21

Publications (1)

Publication Number Publication Date
WO2010058509A1 (en) 2010-05-27

Family

ID=42197953

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2009/004705 WO2010058509A1 (en) 2008-11-21 2009-09-18 Information processing device

Country Status (2)

Country Link
JP (1) JP5304795B2 (en)
WO (1) WO2010058509A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11261909A (en) * 1998-03-12 1999-09-24 Toshiba Corp Multimedia data processor, its method and recording medium
JP2005063513A (en) * 2003-08-08 2005-03-10 Alpine Electronics Inc Method for controlling automatic disk change, and video reproducing device
WO2008050718A1 (en) * 2006-10-26 2008-05-02 Nec Corporation Right information extracting device, right information extracting method and program
JP2008283486A (en) * 2007-05-10 2008-11-20 Sony Corp Information processor, information processing method, and program

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019213067A (en) * 2018-06-05 2019-12-12 日本電信電話株式会社 Provided credit display detection apparatus, provided credit display detection method, and program
WO2019235407A1 (en) * 2018-06-05 2019-12-12 日本電信電話株式会社 Sponsorship credit display detection device, sponsorship credit display detection method, and program
WO2019235137A1 (en) * 2018-06-05 2019-12-12 日本電信電話株式会社 Sponsorship credit display detection device, sponsorship credit display detection method and program
JP2019213065A (en) * 2018-06-05 2019-12-12 日本電信電話株式会社 Provided credit display detection device, provided credit display detection method, and program
JP7011170B2 (en) 2018-06-05 2022-01-26 日本電信電話株式会社 Provided credit display detection device, provided credit display detection method and program
WO2020166382A1 (en) * 2019-02-13 2020-08-20 日本電信電話株式会社 Detection device, detection method, and program
JP2020135029A (en) * 2019-02-13 2020-08-31 日本電信電話株式会社 Detection device, detection method and program
JP7208499B2 (en) 2019-02-13 2023-01-19 日本電信電話株式会社 Detection device, detection method and program
US11728914B2 (en) 2019-02-13 2023-08-15 Nippon Telegraph And Telephone Corporation Detection device, detection method, and program

Also Published As

Publication number Publication date
JP5304795B2 (en) 2013-10-02
JPWO2010058509A1 (en) 2012-04-19

Similar Documents

Publication Publication Date Title
JP5522789B2 (en) Video playback device with link function and video playback program with link function
US8466961B2 (en) Apparatus and method for outputting video images, and purchasing system
US8812311B2 (en) Character-based automated shot summarization
US8392183B2 (en) Character-based automated media summarization
JP5022025B2 (en) A method and apparatus for synchronizing content data streams and metadata.
JP2021525031A (en) Video processing for embedded information card locating and content extraction
KR101540686B1 (en) An apparatus for providing comments and statistical information to video segments and the method thereof
JP5218766B2 (en) Rights information extraction device, rights information extraction method and program
US8965916B2 (en) Method and apparatus for providing media content
WO2005029353A1 (en) Remark management system, remark management method, document conversion server, document conversion program, electronic document addition program
US20100189408A1 (en) Video delivery device, video delivery method, video delivery program and recording medium
JP2002140712A (en) Av signal processor, av signal processing method, program and recording medium
US20070198508A1 (en) Information processing apparatus, method, and program product
CN106716466A (en) Conference information accumulation device, method, and program
JP2007267173A (en) Content reproducing apparatus and method
JP2009201041A (en) Content retrieval apparatus, and display method thereof
JP6031096B2 (en) Video navigation through object position
JP5304795B2 (en) Information processing device
JP2006279898A (en) Information processing apparatus and its method
JP4812733B2 (en) Information editing apparatus, information editing method, information editing program, and recording medium recording the program
JP2009272816A (en) Server, information processing system and information processing method
JP2007141092A (en) Device, method and program for presenting information and information recording medium
JP2000242661A (en) Relating information retrieval device and storage medium recording program for executing relating information retrieval processing
CN113170228B (en) Audio processing for extracting disjoint segments of variable length from audiovisual content
JP2010109852A (en) Video indexing method, video recording and playback device, and video playback device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09827291

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2010539115

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09827291

Country of ref document: EP

Kind code of ref document: A1