WO2013084422A1 - Information processing device, communication terminal, information search method, and non-temporary computer-readable medium - Google Patents

Information processing device, communication terminal, information search method, and non-temporary computer-readable medium Download PDF

Info

Publication number
WO2013084422A1
WO2013084422A1 PCT/JP2012/007342 JP2012007342W
Authority
WO
WIPO (PCT)
Prior art keywords
image
target image
information
information processing
display
Prior art date
Application number
PCT/JP2012/007342
Other languages
French (fr)
Japanese (ja)
Inventor
光洋 渡邊
Original Assignee
NEC Corporation (日本電気株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corporation (日本電気株式会社)
Publication of WO2013084422A1 publication Critical patent/WO2013084422A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F3/0488Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • G06F3/04883Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures for inputting data by handwriting, e.g. gesture or text
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04842Selection of displayed objects or displayed text elements

Definitions

  • the present invention relates to a technique for searching for related information of a subject displayed in a moving image.
  • Patent Document 1 discloses a system that specifies a constellation by performing a similar image search based on a night sky photographed image and provides information on the specified constellation.
  • Patent Document 2 discloses a system that searches for similar images based on diagnostic images in the medical field and provides case data associated with the searched similar images.
  • Patent Document 3 discloses a technique for retrieving attribute information (e.g., seller, price) of a subject (e.g., a costume worn by an actor) displayed in a video such as a television broadcast.
  • the search system disclosed in Patent Document 3 transmits, from the viewer-side terminal to a server, moving-image identification information (e.g., channel number and broadcast date and time) together with information indicating the position and range selected on the screen, in order to identify the subject selected by the viewer. The server then searches for the attribute information linked to the identified subject.
  • the search systems of Patent Documents 1 and 2 use an image matching technique. That is, the similarity is evaluated by comparing the feature amounts of two images.
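The feature-amount comparison mentioned above can be illustrated with a minimal sketch. The coarse intensity histogram and cosine similarity below are generic stand-ins chosen for illustration; the patent documents do not specify which feature amounts are actually used.

```python
import math

def histogram(pixels, bins=8):
    """Coarse intensity histogram as a simple feature vector (values 0-255)."""
    h = [0] * bins
    for p in pixels:
        h[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels) or 1
    return [c / total for c in h]

def similarity(feat_a, feat_b):
    """Cosine similarity between two feature vectors (1.0 = identical)."""
    dot = sum(a * b for a, b in zip(feat_a, feat_b))
    na = math.sqrt(sum(a * a for a in feat_a))
    nb = math.sqrt(sum(b * b for b in feat_b))
    return dot / (na * nb) if na and nb else 0.0

img1 = [10, 20, 30, 200, 210, 220]    # query image (flattened grayscale)
img2 = [12, 22, 28, 198, 205, 225]    # similar content
img3 = [120, 130, 125, 128, 122, 131] # different content
f1, f2, f3 = histogram(img1), histogram(img2), histogram(img3)
```

A real similar-image search would use far richer features (e.g., local descriptors), but the principle of comparing feature vectors is the same.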
  • the search system of Patent Document 3 performs a search using, as a key, the identification information of the moving image (e.g., channel number and broadcast date and time) and information indicating the selected position and range on the screen; it does not perform image matching.
  • the moving image browsed by the viewer is, for example, a television broadcast, a movie, or a video of a sports competition or concert.
  • the moving image may be an image obtained by displaying, substantially in real time on the display of a mobile communication terminal (e.g., smartphone, tablet computer, notebook PC (Personal Computer)), an image photographed by a camera mounted on the terminal.
  • the moving image may be a reproduced image of encoded moving image data (e.g., MPEG-2 data, MPEG-4 data) acquired from a recording medium.
  • the subject is, for example, a person, an animal, a plant, a product, an anime character, or the like.
  • the subject attribute information includes, for example, a person profile such as name, nationality, birthplace, or date of birth, as well as a character name, product name, manufacturer, release date, price, URL (Uniform Resource Locator), and the like.
  • Patent Documents 1 to 3 disclose a technique that contributes to the solution of this problem.
  • the present invention has been made on the basis of the above-mentioned knowledge and consideration by the present inventor. An object of the invention is to provide an information processing apparatus, a communication terminal, an information search method, and a program that can facilitate the operation of designating, from a moving image, a target image showing a subject when searching for attribute information of that subject.
  • the first aspect of the present invention includes an information processing apparatus.
  • the information processing apparatus includes a designation control unit and a search control unit.
  • the designation control unit accepts designation of a target image included in the moving image by an operation of an input device by a user in order to search for attribute information of a subject shown in the moving image displayed on the display.
  • the search control unit transmits the target image or a substitute image thereof to a search system, and receives from the search system attribute information searched for based on the target image or the substitute image. Further, in order to compensate for the delay time required for execution of the operation, the designation control unit determines, as the target image, an image that was displayed on the display before a reference time point preceding the completion time of the operation.
  • the second aspect of the present invention includes a communication terminal.
  • the communication terminal includes the information processing apparatus, the display, the input device, and the communication unit according to the first aspect of the present invention described above.
  • the communication unit is used for transmission of the target image and reception of the attribute information by the search control unit.
  • a third aspect of the present invention includes an information search method by an information processing device.
  • the information retrieval method includes the following steps (a) to (c).
  • the accepting in (a) includes determining, as the target image, an image displayed on the display before a reference time point preceding the completion time of the operation, in order to compensate for the delay time required for execution of the operation.
  • a fourth aspect of the present invention includes a program for causing a computer to perform the information search method according to the third aspect of the present invention described above.
  • according to the above aspects of the present invention, it is possible to provide an information processing apparatus, a communication terminal, an information search method, and a program capable of facilitating the operation of designating, from a moving image, a target image in which a subject is displayed when searching for attribute information of that subject.
  • FIG. 1 is a diagram showing a network configuration including communication terminals according to the present embodiment.
  • the communication terminal 1 has a wireless or wired transceiver and can communicate with the search system 9 via the network 8.
  • Specific examples of the communication terminal 1 include a mobile phone terminal, a smartphone, a tablet computer, a notebook PC, a desktop PC, and a television broadcast receiver having a communication function.
  • the network 8 is a data transfer network such as an IP (Internet Protocol) network.
  • the network 8 may be a wired network, a wireless network, or a combination thereof.
  • the network 8 includes, for example, a radio access network and packet core network of a communication carrier, an IP leased line, and the public Internet.
  • the communication terminal 1 transmits the target image selected from the video by the viewer to the search system 9 in order to search for the attribute information of the subject shown in the video displayed on the display.
  • the target image may be an image for one screen (one frame) or a partial image corresponding to a part of the screen. Details of the method of selecting a target image in the communication terminal 1 will be described later.
  • the search system 9 specifies a subject shown in the target image by performing a similar image search using the target image received from the communication terminal 1. Further, the search system 9 transmits attribute information related to the identified subject to the communication terminal 1.
  • the subject is, for example, a person, an animal, a plant, a product, an animation character, or the like.
  • the subject attribute information includes, for example, a person profile such as name, nationality, birthplace, or date of birth, as well as a character name, product name, manufacturer, release date, price, URL (Uniform Resource Locator), and the like.
  • the search system 9 may be a general-purpose image search server arranged on the Internet.
  • the search system 9 may be a search system specialized for a specific moving image.
  • the search system 9 may be a system specialized in searching for a person (performer) displayed on a television broadcast.
  • the search system 9 may determine an image to be preferentially collated with the target image by using information on a television program broadcast in the time zone in which the search is performed.
  • the search system 9 may preferentially collate the image of the performer of the television program broadcasted during the search time zone with the target image.
  • the communication terminal 1 sends the target image selected from the moving image by the viewer (i.e., the user of the terminal 1) to the search system 9. Therefore, the viewer needs to perform an operation on the communication terminal 1 to designate a target image in which the subject desired to be searched for is shown.
  • the display content of the video changes between the time the viewer decides to search for the subject and the time the viewer finishes this operation: the subject may move to a different position on the screen, or may no longer exist within the screen at all. That is, since the moving-image display changes every moment, there is a problem that the selection operation for the target image is difficult to perform.
  • a viewer operates a touch panel as the input device 13 at a timing when a desired subject is displayed on the display of the communication terminal.
  • the viewer may specify the subject by touching the touch panel so as to surround the range in which the desired subject is displayed on the display (e.g., by drawing a circle around it).
  • the display content of the moving image changes with time. For this reason, the display content changes between the time the viewer decides to search for the subject and the time the touch-panel operation is finished; the subject the viewer is trying to search for may move to a different position on the screen, or may no longer exist within the screen.
  • the communication terminal 1 has a function of supporting the operation for designating the target image. That is, in order to compensate for the delay time required for the viewer to perform the operation, the communication terminal 1 is configured to determine, as the target image, an image displayed on the display before a reference time point determined based on the viewer's operation.
  • the reference time point is before the time point when the viewer's operation is completed.
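One way to realize this compensation is to buffer recently displayed frames with their display timestamps and, when the operation finishes, retrieve the frame shown a shift time before the reference time point. A minimal sketch; the buffer length and 0.5-second default shift time are illustrative assumptions, not values from the embodiment.

```python
import bisect

class FrameBuffer:
    """Keeps recently displayed frames so a past frame can be recovered
    after the viewer's designation operation finishes late."""

    def __init__(self, max_frames=300):
        self.times = []   # display timestamps, ascending
        self.frames = []  # frame payloads (any object)
        self.max_frames = max_frames

    def push(self, t, frame):
        self.times.append(t)
        self.frames.append(frame)
        if len(self.times) > self.max_frames:  # drop the oldest frame
            self.times.pop(0)
            self.frames.pop(0)

    def frame_at(self, t):
        """Latest frame displayed at or before time t."""
        i = bisect.bisect_right(self.times, t) - 1
        return self.frames[max(i, 0)]

def target_image(buf, reference_time, shift_time=0.5):
    # Go back by the shift time to compensate for the operation delay.
    return buf.frame_at(reference_time - shift_time)
```

For example, if the operation completes at t = 2.0 s with a 0.5 s shift time, the frame displayed at t = 1.5 s is adopted as the target image.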
  • FIG. 2 is a block diagram illustrating a configuration example of the communication terminal 1.
  • the communication terminal 1 shown in FIG. 2 has a wireless communication function.
  • the processor 10 performs control, monitoring, and information processing of the terminal 1.
  • the processor 10 may be a combination of a plurality of computing devices (e.g., an MPU (Micro Processing Unit) and a microcontroller). More specifically, the processor 10 displays a moving image on the display 12, accepts an operation for selecting a target image, and sends data to and receives data from the search system 9 in order to search for attribute information of the subject displayed in the moving image.
  • the processor 10 includes a designation control unit 14 and a search control unit 15. Details of the designation control unit 14 and the search control unit 15 will be described later.
  • the wireless communication unit 11 connects to a wireless communication network via a base station (or a wireless access point).
  • the wireless communication unit 11 performs transmission path coding, interleaving, modulation (transmission symbol mapping), frequency up-conversion, signal amplification, and the like on transmission data to generate a transmission signal.
  • the wireless communication unit 11 generates reception data by performing each process such as signal amplification, frequency down-conversion, demodulation, error correction decoding, and the like on the reception signal from the antenna.
  • the wireless communication unit 11 may be a transceiver conforming to a known communication method such as UTRA (UMTS Terrestrial Radio Access), E-UTRA (Evolved UTRA), GSM (Global System for Mobile Communications) (registered trademark), wireless LAN (Local Area Network), or WiMAX (Worldwide Interoperability for Microwave Access).
  • the display 12 displays an image including a moving image so that a viewer (user of the terminal 1) can visually recognize the display.
  • Specific examples of the display 12 are a liquid crystal display (LCD), an EL (electroluminescence) display, and a CRT (Cathode Ray Tube) display.
  • the moving image displayed on the display 12 may be an image taken by a camera (not shown) mounted on the communication terminal 1.
  • the moving image may also be a reproduced image of encoded moving image data (e.g., MPEG-2 data or MPEG-4 data) acquired from a memory (e.g., optical disc, hard disk, flash memory) built into the communication terminal 1 or from an external device accessible via the wireless communication unit 11.
  • the input device 13 is a device that accepts user operations.
  • the input device 13 includes at least one of a pointing device operated by the hand of the viewer (user of the terminal 1), a microphone that collects the viewer's voice, and a pointing device operated by the viewer's line of sight. Specific examples of the pointing device operated by the viewer's hand include a touch panel, a touch pad, and a mouse.
  • the designation control unit 14 accepts designation of the target image included in the moving image by operating the input device 13 by the viewer in order to search for the attribute information of the subject shown in the moving image displayed on the display 12.
  • when the input device 13 is a touch panel, the viewer may specify the subject by touching the touch panel so as to surround the display range of the subject with a finger (e.g., drawing an ellipse) at the timing when the desired subject is displayed on the display 12.
  • the viewer may perform an operation of touching one point of the subject in order to specify the subject.
  • the designation control unit 14 may identify an image region including the subject by performing image recognition processing such as face recognition on the region including one point touched by the viewer.
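Mapping a single touched point to a subject region can be sketched as follows, assuming a face-recognition step has already produced bounding boxes. The `(x0, y0, x1, y1)` region format is a hypothetical convention, and the detector itself is outside this sketch.

```python
def region_containing_point(regions, x, y):
    """Return the detected region (x0, y0, x1, y1) that contains the
    touched point, or None if the touch falls outside every region.
    `regions` stands in for the output of a face detector."""
    for (x0, y0, x1, y1) in regions:
        if x0 <= x <= x1 and y0 <= y <= y1:
            return (x0, y0, x1, y1)
    return None
```

The selected region can then be cropped from the frame and used as the target image (or partial image).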
  • the viewer may operate the pointing device so as to enclose the display range of the subject with a pointer (e.g., draw an ellipse).
  • the viewer may perform an operation of designating one point of the subject with a pointer in order to designate the subject.
  • the target image may be specified based on the timing at which the viewer inputs a word or phrase, or by an operation of another input device (e.g., touch panel, mouse, or operation buttons) by the viewer.
  • the target image may be a screen image of the entire screen, not a partial image.
  • the search control unit 15 may transmit the target image as the screen image and the word or phrase input by the viewer to the search system 9.
  • the search system 9 may recognize a subject corresponding to the word or phrase input by the viewer from the target image, and perform a similar image search using the recognized subject image.
  • the designation control unit 14 determines, as the target image, an image displayed on the display 12 before the reference time point determined based on the operation by the viewer.
  • the reference time is before the time when the operation by the viewer is completed.
  • the designation control unit 14 determines an image displayed on the display 12 at least before the completion of the operation as a target image.
  • the reference time point may be, for example, the completion time of the target image designation operation, the start time of the designation operation, or the central time point of the period required for the designation operation.
  • how far before the reference time point of the viewer's operation the image adopted as the target image should be (a period hereinafter referred to as the shift time) may be determined statically, or may be changed according to the viewer or according to the moving image.
  • when the shift time is determined statically, it may be set in consideration of the average reaction speed of a person.
  • when the shift time is dynamically determined according to the viewer, calibration for measuring the viewer's reaction speed may be performed. Specifically, a test video is displayed; after a specific subject appears in it, the time required for the viewer to complete the operation of selecting the display range of that subject is measured, and the shift time is determined according to the measured time. Alternatively, the viewer may be allowed to freely adjust an initial shift time based on the average human reaction speed. For example, while operating the terminal 1, the viewer may correct a shift time that is too long (going back too far into the past) or too short (not going back far enough). The time required to complete the target-image designation operation is assumed to differ between viewers depending on, for example, the viewer's age. Therefore, the target image intended by the viewer can be identified more appropriately by changing the shift time according to the viewer.
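The calibration described above amounts to averaging the measured reaction delays and clamping the result to a sensible range. A minimal sketch; the default value and clamping bounds are illustrative assumptions, not values from the embodiment.

```python
def calibrate_shift_time(measured_delays, default=0.5, lo=0.1, hi=2.0):
    """Estimate the shift time (seconds) from delays measured between a
    test subject appearing and the viewer completing the selection.
    Falls back to a default when no measurements are available."""
    if not measured_delays:
        return default
    avg = sum(measured_delays) / len(measured_delays)
    # Clamp so an outlier measurement cannot push the shift time to an
    # unusable extreme.
    return min(max(avg, lo), hi)
```

The viewer-adjustment described in the text could then be implemented as a simple offset added to this calibrated value.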
  • when the shift time is dynamically determined according to the moving image, the shift time may be changed according to the speed of movement of the subject shown in the moving image.
  • a specific example of changing the shift time according to the moving image will be described in detail in another embodiment (Embodiment 3).
  • the search control unit 15 transmits the target image specified by the designation control unit 14 or a substitute image thereof to the search system 9 via the wireless communication unit 11. Then, the search control unit 15 receives the attribute information searched based on the target image or its substitute image from the search system 9 via the wireless communication unit 11.
  • the substitute image is an image that shows substantially the same subject as the target image specified by the designation control unit 14 but has higher image quality than the target image, making it better suited to similar-image search.
  • An example of using an alternative image will be described in detail in another embodiment (Embodiment 4).
  • FIG. 3 is a flowchart showing a specific example of the information search method by the communication terminal 1 according to the present embodiment.
  • in step S1, the communication terminal 1 displays a moving image on the display 12.
  • in step S2, the communication terminal 1 accepts an operation of the input device 13 by the viewer for designating a target image.
  • in step S3, the communication terminal 1 determines, as the target image, an image displayed before the reference time point of the viewer's operation, in other words, an image displayed on the display 12 at least before the time point when the viewer's operation was completed.
  • in step S4, the communication terminal 1 transmits the determined target image or a substitute image thereof to the search system 9.
  • in step S5, the communication terminal 1 receives the attribute information from the search system 9.
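The steps above can be sketched end to end as follows. Here `search_system` is a hypothetical callable standing in for the transmission and reception of steps S4 and S5, `display_frames` are (timestamp, image) pairs already shown in step S1, and the 0.5-second shift time is an illustrative value.

```python
def information_search(display_frames, operation, search_system, shift_time=0.5):
    """Sketch of steps S1-S5: pick the frame shown a shift time before
    the completion of the viewer's designation operation (S2/S3), then
    hand it to the search system (S4/S5)."""
    reference = operation["completed_at"]  # completion time as reference point
    cutoff = reference - shift_time
    # S3: latest frame displayed at or before the shifted reference point.
    target = None
    for t, image in display_frames:
        if t <= cutoff:
            target = image
    # S4/S5: send the target image, receive attribute information.
    return search_system(target)
```

For example, with frames shown at t = 0, 1, and 2 seconds and an operation completed at t = 2.2 s, the frame from t = 1 s is sent rather than the latest one.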
  • FIGS. 4A to 4D and FIG. 5 show, as an example, a case where the input device 13 is a touch panel.
  • the viewer designates a target image as a partial image by touching the touch panel with a finger so as to surround the display range of the subject with the finger.
  • FIGS. 4A to 4C each show an image 40 of one screen of the display 12 on which the subject 401 is projected.
  • the subject 401 is a person.
  • the display on the display 12 is assumed to change in the order of FIGS. 4A, 4B, and 4C over time. In other words, the subject 401 moves to the right of the screen as indicated by the white arrow in FIGS. 4A to 4C.
  • the viewer decides to search for the attribute information of the subject 401 at the timing of FIG. 4A, and starts an operation of surrounding the face portion of the subject 401 at the timing of FIG. 4B.
  • An operation locus 402 in FIG. 4B indicates a locus that the viewer touches with a finger.
  • an operation locus 403 in FIG. 4C indicates the locus at the time when the viewer completed the operation.
  • however, the display content of the moving image changes every moment. Therefore, at the time of FIG. 4C, the subject 401 has moved to the right on the screen, and the image of the desired subject 401 no longer exists within the range surrounded by the viewer's operation locus 403.
  • the communication terminal 1 therefore uses, for example, the completion time of the operation as the reference time point, and adopts as the target image the image displayed a shift time before the reference time point (the shift time being set in advance, or determined according to the viewer or the moving image).
  • by going back by the shift time to around the operation start time, the subject 401 can be correctly selected, as in the image 43 shown in FIG. 4D.
  • the compensation of the delay time from when the viewer starts the operation until the operation is completed has been described.
  • there is also a delay time from when the viewer decides to perform the search to when the operation for that purpose is started. For example, it is assumed that the display content of the moving image changes during the delay time from when the viewer decides to search for a subject in the moving image until the input device 13 is actually operated, so that the subject is no longer displayed.
  • the start time of the viewer's operation may be set as the reference time, and the display image that is the shift time before the start time of the viewer's operation may be selected as the target image. Thereby, a display image close to the time when the viewer decides to perform the search can be selected as the target image.
  • the target image designation procedure shown in FIGS. 4A to 4D is merely an example.
  • the communication terminal 1 (designation control unit 14) may automatically recognize subjects displayed in the moving image and accept, by the viewer's operation, designation of the subject to be selected from among the automatically recognized subjects.
  • for example, the communication terminal 1 performs face recognition processing on the image 40 of one screen of the display 12 on which the subject 401 is shown, and may display, superimposed on the image 40, a frame 404 indicating a region including the detected person's face.
  • the communication terminal 1 may then accept the viewer's operation of designating the frame 404 using the input device 13.
  • when the input device 13 is a touch panel, an operation in which the viewer touches the frame 404 or the area inside the frame 404 may be used as the target-image designation operation.
  • when a plurality of subject regions (e.g., face regions) are detected, the communication terminal 1 may display each of them with a frame 404 or the like. That is, by presenting target-image candidates using the automatic subject recognition function, the communication terminal 1 can simplify the operation of designating the target image (subject) and shorten the time the operation requires.
  • a display image that is a shift time before the reference time point (e.g. start time, central time point, or completion time) of the viewer's operation may be selected as the target image.
  • the shift time here may be determined in consideration of the delay time from when the viewer decides to perform the search to when the operation is performed. Thereby, a display image close to the time when the viewer decides to perform the search can be selected as the target image.
  • the communication terminal 1 according to the present embodiment is effective regardless of the type of moving image displayed on the display 12.
  • however, the above-described problem relating to the designation of a subject in a moving image is particularly pronounced when the viewer is watching a television broadcast. This is because, in general, a television broadcast cannot be freely paused or rewound at the viewer's own will. Therefore, the communication terminal 1 according to the present embodiment is particularly effective when the moving image displayed on the display 12 is a television broadcast.
  • the processing performed by the designation control unit 14 and the search control unit 15 described in the present embodiment may be realized using a semiconductor processing apparatus including an ASIC (Application Specific Integrated Circuit).
  • These processes may be realized by causing a computer such as a microprocessor or a DSP (Digital Signal Processor) to execute a program.
  • One or a plurality of programs including a group of instructions for causing the computer to execute the algorithm described with reference to FIGS. 1 to 4 may be created and the programs may be supplied to the computer.
  • non-transitory computer-readable media include various types of tangible storage media. Examples of non-transitory computer-readable media include magnetic recording media (e.g., flexible disks, magnetic tapes, hard disk drives), magneto-optical recording media (e.g., magneto-optical discs), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memory (e.g., mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, and RAM (Random Access Memory)).
  • the program may also be supplied to the computer by various types of transitory computer-readable media. Examples of transitory computer-readable media include electrical signals, optical signals, and electromagnetic waves.
  • a transitory computer-readable medium can supply the program to the computer via a wired communication path such as an electric wire or an optical fiber, or via a wireless communication path.
  • FIG. 6 shows a configuration example in which the communication terminal 1 is implemented using a computer system.
  • the LCD 121 is a specific example of the display 12.
  • the touch panel 131 and the microphone 132 are specific examples of the input device 13.
  • An operating system (OS) 103 and a search application program 104 stored in a nonvolatile storage unit 102 are loaded into a RAM (Random Access Memory) 101.
  • the MPU (Micro Processing Unit) 100 executes the OS 103 and the search application program 104 loaded in the RAM 101, thereby realizing the functions of the designation control unit 14 and the search control unit 15.
  • the communication terminal 1 determines the presence or absence of a scene change of the moving image within a first period before a reference time point (e.g., operation start time or operation completion time) related to the viewer's operation. Then, when there is a scene change, the terminal 1 determines an image from before the scene change as the target image. The presence or absence of a scene change may be determined by comparing an image related to the reference time point with an image before the reference time point.
  • for example, the magnitude of change in pixel value may be calculated for each pixel or for each pixel block including a plurality of pixels, and a scene change may be determined to exist when the amount of change exceeds a predetermined reference.
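This pixel-change criterion can be sketched with grayscale frames represented as flat lists. The threshold value is an illustrative assumption, and a real implementation would typically compare per pixel block rather than averaging over the whole frame.

```python
def is_scene_change(prev, curr, threshold=40):
    """Mean absolute difference of pixel values (0-255 grayscale);
    a change above the threshold is treated as a scene change."""
    diff = sum(abs(a - b) for a, b in zip(prev, curr))
    return diff / len(prev) > threshold

frame_a = [100] * 16  # one scene
frame_b = [102] * 16  # slight motion within the same scene
frame_c = [220] * 16  # abrupt cut to a much brighter scene
```

Small frame-to-frame motion stays below the threshold, while a discontinuous cut exceeds it.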
  • the viewer can specify the subject relatively easily.
  • scenes sometimes switch discontinuously. Therefore, if a scene change occurs between the time when the viewer decides to search for a subject and the time when the operation is started, the viewer may lose the opportunity to search for the subject.
  • the presence / absence of a scene change is determined, and when there is a scene change, an image before the scene change is determined as a target image. For this reason, the viewer can appropriately select a target image for which a search is desired.
  • FIG. 7 is a flowchart showing a specific example of the information search method by the communication terminal 1 according to the present embodiment.
  • the processes in steps S1, S2, S4, and S5 in FIG. 7 are the same as the steps with the same reference numerals shown in FIG. Therefore, the redundant description regarding these steps is omitted.
  • Steps S31 to S34 in FIG. 7 show a modification of step S3 in FIG.
  • In step S31, the communication terminal 1 determines whether there is a scene change within a predetermined period before the reference time point (e.g., the start time of the operation) of the viewer's operation.
  • The predetermined period used in step S31 may be determined statically, or may be determined dynamically according to the viewer or the moving image.
  • the communication terminal 1 determines an image before the scene change as a target image (step S33).
  • Otherwise, the communication terminal 1 (designation control unit 14) may select, as the target image, the image at the reference time point of the viewer's operation, or an image before that time point (step S34).
  • the communication terminal 1 changes the shift time according to the speed of movement of the subject included in the moving image.
  • the communication terminal 1 according to the present embodiment changes the shift time according to the magnitude of the motion vector between a plurality of images included in the moving image.
  • a motion vector between a plurality of images included in a moving image may be calculated to determine the magnitude of the motion vector.
  • The shift time may be made relatively longer as the movement of the subject shown in the moving image becomes faster, that is, as the motion vectors become larger. The assumption is that the faster the subject moves, the more likely the viewer's operation is to be disturbed, or the longer it takes to complete an operation that accurately selects the subject. Therefore, the target image intended by the viewer can be identified more appropriately by increasing the shift time as the subject's movement becomes faster, that is, as the motion vectors become larger.
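The idea of lengthening the shift time as motion vectors grow can be sketched as follows. The concrete mapping (a base shift plus a saturating motion-dependent term) and all constants are assumptions, since the text leaves the function unspecified.

```python
def shift_time(motion_vectors, base=0.5, extra=1.5, max_motion=16.0):
    """motion_vectors: list of (dx, dy) block motion vectors between two frames.
    Returns the shift time in seconds: larger average motion -> longer shift,
    saturating at max_motion pixels per frame."""
    if not motion_vectors:
        return base
    avg = sum((dx * dx + dy * dy) ** 0.5 for dx, dy in motion_vectors) / len(motion_vectors)
    return base + extra * min(avg / max_motion, 1.0)

print(shift_time([]))                  # still scene: base shift only -> 0.5
print(shift_time([(16, 0), (0, 16)]))  # fast motion saturates -> 2.0
```

A still scene gets only the base shift, while fast motion extends the shift up to the cap, matching the rationale above.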
  • FIG. 8 is a flowchart showing a specific example of the information search method by the communication terminal 1 according to the present embodiment.
  • the processes in steps S1, S2, S4, and S5 in FIG. 8 are the same as the steps with the same reference numerals shown in FIG. Therefore, the redundant description regarding these steps is omitted.
  • Steps S35 to S37 in FIG. 8 show a modification of step S3 in FIG.
  • In step S35, the communication terminal 1 (designation control unit 14) calculates motion vectors between a plurality of images in the moving image.
  • In step S36, the communication terminal 1 (designation control unit 14) determines the shift time according to the magnitude of the calculated motion vectors.
  • In step S37, the communication terminal 1 (designation control unit 14) determines, as the target image, the display image from a shift time before the reference time point of the viewer's operation.
  • FIG. 9 is a block diagram showing a configuration example of the communication terminal 4 according to the present embodiment.
  • the communication terminal 4 displays on the display 12 an image obtained by photographing a television broadcast screen displayed on an external television receiver or the like with the camera 161 mounted on the terminal 4.
  • The designation control unit 14 determines the target image from the moving image displayed on the display 12 (that is, the video of the television broadcast screen shot by the camera 161) according to any of the methods described in the first to third embodiments.
  • The search control unit 45 acquires a substitute image corresponding to the target image determined by the designation control unit 14. More specifically, using the TV tuner 162 mounted on the terminal 4, the search control unit 45 acquires, as the substitute image, a television broadcast image at substantially the same time point as the target image that was captured by the camera 161 and displayed on the display 12.
  • This embodiment assumes a case where the viewer searches for attribute information such as the name of the performer when the viewer is watching a television program using a general television broadcast receiver.
  • The viewer can also acquire desired attribute information by applying any of the information retrieval methods described in the first to third embodiments to the moving image captured by the camera 161 and displayed on the display 12.
  • the television broadcast screen shot by the camera 161 may be inferior in image quality to the video obtained by the television tuner 162 mounted on the terminal 4.
  • image quality deteriorates when the screen of a television broadcast receiver is taken from a distance or when camera shake occurs during shooting.
  • When the video obtained by the TV tuner 162 is used as the substitute image, there is the advantage that a substitute image of higher quality than the target image can be used for the similar image search.
  • The search control unit 45 may determine whether the television broadcast screen shot by the camera 161 matches the channel of the video from the TV tuner 162 by comparing the image from the camera 161 with the image from the TV tuner 162, by comparing the sound acquired by a microphone (not shown) with the sound from the TV tuner 162, or by a combination of these. Further, the search control unit 45 may detect channel-information characters included in the television broadcast screen shot by the camera 161. Alternatively, instead of these automatic channel identification methods, the viewer may designate the channel by operating the terminal 4.
  • FIG. 10 is a flowchart showing a specific example of the information search method by the communication terminal 4 according to the present embodiment.
  • the processes in steps S2, S3, and S5 in FIG. 10 are the same as the steps with the same reference numerals shown in FIG. Therefore, the redundant description regarding these steps is omitted.
  • In step S11, the communication terminal 4 displays on the display 12 a moving image obtained by photographing the television broadcast screen with the camera 161.
  • The communication terminal 4 acquires, from the video obtained by the television tuner 162, a substitute image related to the target image determined in step S3.
  • In step S42, the communication terminal 4 transmits the substitute image to the search system 9.
  • the communication terminal 4 may detect a television broadcast screen reflected in an image captured by the camera 161. A specific example of detecting a television broadcast screen will be described with reference to FIG.
  • FIG. 11 is a flowchart showing a specific example of step S41 in FIG.
  • In step S411, the search control unit 45 detects the television broadcast screen in the image captured by the camera 161. More specifically, the search control unit 45 may detect the rectangular frame of the television broadcast receiver in the image captured by the camera 161.
  • In step S412, the search control unit 45 identifies the partial image selected by the viewer (the user of the terminal 4) based on the position and size of the television broadcast screen within the image captured by the camera 161.
  • In step S413, the search control unit 45 determines, as the substitute image, the video from the TV tuner 162 corresponding to the partial image selected by the viewer.
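One way to realize the coordinate handling of step S412 can be sketched as follows, assuming the detected television screen is an axis-aligned rectangle. The function name and the rectangle representation are illustrative assumptions; a real implementation would also need to correct for perspective distortion of the photographed screen.

```python
def map_selection_to_tuner(sel, screen_rect, tuner_size):
    """sel: (x, y, w, h) of the viewer's selection in camera-image pixels.
    screen_rect: (x, y, w, h) of the detected TV screen in the camera image.
    tuner_size: (width, height) of the tuner video frame.
    Returns the selection rectangle in tuner-frame pixels, so the corresponding
    region of the higher-quality tuner video can be used as the substitute image."""
    sx, sy, sw, sh = screen_rect
    tw, th = tuner_size
    x, y, w, h = sel
    scale_x, scale_y = tw / sw, th / sh
    return (round((x - sx) * scale_x), round((y - sy) * scale_y),
            round(w * scale_x), round(h * scale_y))

# A 320x180 TV screen detected at (100, 50) in the camera image; the viewer
# selected a 32x18 region at (260, 140); map it onto a 1280x720 tuner frame.
print(map_selection_to_tuner((260, 140, 32, 18), (100, 50, 320, 180), (1280, 720)))
# -> (640, 360, 128, 72)
```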
  • FIG. 12 is a block diagram illustrating a configuration example of the communication terminal 5 according to the present embodiment.
  • the communication terminal 5 has a recording control unit 16.
  • the recording control unit 16 uses the television tuner 162 to automatically record the television broadcast program related to the attribute information acquired from the search system 9.
  • FIG. 13 is a flowchart showing a specific example of the information search method by the communication terminal 5 according to the present embodiment.
  • the processes in steps S1 to S5 in FIG. 13 are the same as the steps with the same reference numerals shown in FIG. Therefore, the redundant description regarding these steps is omitted.
  • the communication terminal 5 (recording control unit 16) automatically records a television broadcast program related to the attribute information acquired from the search system 9.
  • the recording control unit 16 may automatically record a television program in which a person corresponding to the subject name included in the attribute information appears.
  • The recording control unit 16 may identify television programs in which the person corresponding to the person name included in the attribute information appears by acquiring a television program guide, or by accessing a server (for example, a World Wide Web server) that holds television program performer information.
  • According to the present embodiment, when a viewer who is watching a television program designates an image of a performer of the program and instructs an image search, the communication terminal 5 can easily and automatically reserve recording of other television programs in which that performer will appear in the future. That is, even when the viewer does not know the performer's name, the viewer can make a recording reservation for another program in which the performer appears.
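The program-guide matching behind this automatic reservation might look as follows; the guide data structure and field names are hypothetical stand-ins for a real EPG or web-server query.

```python
def programs_to_reserve(attribute_info, program_guide):
    """attribute_info: dict holding the subject name returned by the search system.
    program_guide: list of dicts with 'title' and 'performers' entries.
    Returns the titles of guide entries in which the named person appears."""
    name = attribute_info.get("name")
    return [p["title"] for p in program_guide if name in p["performers"]]

guide = [
    {"title": "Evening Drama", "performers": ["A. Actor", "B. Star"]},
    {"title": "Quiz Hour", "performers": ["C. Host"]},
    {"title": "Movie Special", "performers": ["A. Actor"]},
]
print(programs_to_reserve({"name": "A. Actor"}, guide))
# -> ['Evening Drama', 'Movie Special']
```

Each returned title would then be passed to the TV tuner 162 to make the recording reservation.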
  • the example in which the display 12, the input device 13, the camera 161, and the television tuner 162 are mounted on the communication terminals 1, 4, and 5 together with the processor 10 has been described.
  • these devices only need to be combined with the processor 10 and need not be configured as an integrated communication terminal.
  • these devices and the processor 10 only need to be able to communicate using a wireless communication function such as a wireless LAN or Bluetooth (registered trademark), or a wired communication function.
  • 15 Search control unit
    16 Recording control unit
    40 to 43 Image
    45 Search control unit
    8 Network
    9 Search system
    100 MPU (Micro Processing Unit)
    101 RAM (Random Access Memory)
    102 Nonvolatile storage unit
    103 Operating system
    104 Search application program
    121 LCD (Liquid Crystal Display)
    131 Touch panel
    132 Microphone
    161 Camera
    162 Television tuner
    401 Subject
    402, 403 Operation trajectory
    404 Frame showing face area

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • User Interface Of Digital Computer (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

In an embodiment, an information processing device (1) includes a specification control part (14) and a search control part (15). In order to search for attribute information for an object imaged in a video being displayed on a display (12), the specification control part (14) receives a specification for the target image included in the relevant video in accordance with a user operating an input device (13). The search control part (15) sends the target image or a substitute image thereof to a search system (9), and receives the attribute information retrieved on the basis of the target image or the substitute image from the search system (9). In addition, to compensate for the delay time required to execute the user operation, the specification control part (14) decides that the image being displayed on the display (12) is the target image even before the relevant operation has been completed.

Description

Information processing apparatus, communication terminal, information search method, and non-transitory computer-readable medium
The present invention relates to a technique for searching for information related to a subject displayed in a moving image.
A technique is known in which similar images are retrieved based on a given image and attribute information associated with the retrieved similar images is acquired (see, for example, Patent Documents 1 and 2). Patent Document 1 discloses a system that identifies a constellation by performing a similar image search based on a photographed image of the night sky and provides information on the identified constellation. Patent Document 2 discloses a system that searches for similar images based on diagnostic images in the medical field and provides case data associated with the retrieved similar images.
Patent Document 3, on the other hand, discloses a technique for retrieving attribute information (e.g., seller, price) of a subject (e.g., a costume worn by an actor) displayed in a moving image such as a television broadcast. More specifically, in order to identify the subject selected by the viewer, the search system disclosed in Patent Document 3 transmits, from the viewer-side terminal to a server, identification information of the moving image (e.g., channel number and broadcast date and time) together with information indicating the selected position and range within the screen. The server then searches for attribute information associated with the identification information of the moving image and the information indicating the selected position and range, and transmits the obtained attribute information to the terminal.
The general similar image search disclosed in Patent Documents 1 and 2 uses image matching technology; that is, the degree of similarity is evaluated by comparing feature amounts of two images. In contrast, Patent Document 3 performs a search using, as keys, the identification information of the moving image (e.g., channel number and broadcast date and time) and the information indicating the selected position and range within the screen, and does not perform a so-called similar image search.
JP 2005-174240 A
JP 2004-005364 A
JP 2002-334092 A
When a viewer watching a moving image wishes to know the attribute information of a subject displayed in that moving image, there is a problem in searching for it immediately. Here, the moving image viewed by the viewer is, for example, a television broadcast, a movie, or a video recording of a sports competition or a concert. The moving image may also be an image captured by a camera mounted on a mobile communication terminal (e.g., a smartphone, a tablet computer, or a notebook PC (Personal Computer)) and displayed on the display of the terminal substantially in real time. The moving image may also be a reproduced image of encoded moving image data (e.g., MPEG-2 data or MPEG-4 data) acquired from a recording medium (e.g., an optical disc, hard disk, or flash memory) or from a communication medium. The subject is, for example, a person, an animal, a plant, a product, or an anime character. The attribute information of the subject is, for example, a person profile such as a name, nationality, birthplace, or date of birth, a character name, a product name, a manufacturer, a release date, a price, or a URL (Uniform Resource Locator).
More specifically, when searching for attribute information of a subject displayed in a moving image, the viewer needs to perform an operation of designating a target image in which the subject to be searched for appears. However, since the display screen of the moving image changes every moment, it is difficult to perform this operation of selecting the target image. None of Patent Documents 1 to 3 discloses a technique that contributes to solving this problem.
The present invention has been made based on the above findings and considerations by the present inventor, and aims to provide an information processing apparatus, a communication terminal, an information search method, and a program capable of facilitating the operation of designating, from within a moving image, a target image in which a subject appears when searching for attribute information of the subject displayed in the moving image.
A first aspect of the present invention includes an information processing apparatus. The information processing apparatus includes a designation control unit and a search control unit. The designation control unit accepts designation of a target image included in a moving image displayed on a display, through an operation of an input device by a user, in order to search for attribute information of a subject shown in the moving image. The search control unit transmits the target image or a substitute image thereof to a search system, and receives from the search system attribute information retrieved based on the target image or the substitute image. Further, in order to compensate for a delay time required to execute the operation, the designation control unit determines, as the target image, an image that was displayed on the display before a reference time point that is no later than the completion time of the operation.
A second aspect of the present invention includes a communication terminal. The communication terminal includes the information processing apparatus according to the first aspect described above, a display, an input device, and a communication unit. The communication unit is used for transmission of the target image and reception of the attribute information by the search control unit.
A third aspect of the present invention includes an information search method performed by an information processing apparatus. The information search method includes the following steps (a) to (c):
(a) accepting designation of a target image included in a moving image displayed on a display, through an operation of an input device by a user, in order to search for attribute information of a subject shown in the moving image;
(b) transmitting the target image or a substitute image thereof to a search system; and
(c) receiving from the search system attribute information retrieved based on the target image or the substitute image.
Further, the accepting in (a) includes determining, as the target image, an image that was displayed on the display before a reference time point that is no later than the completion time of the operation, in order to compensate for a delay time required to execute the operation.
A further aspect of the present invention includes a program for causing a computer to perform the information search method according to the third aspect of the present invention described above.
According to each of the above aspects of the present invention, it is possible to provide an information processing apparatus, a communication terminal, an information search method, and a program capable of facilitating the operation of designating, from within a moving image, a target image in which a subject appears when searching for attribute information of the subject displayed in the moving image.
A network configuration diagram including the communication terminal according to Embodiment 1 of the present invention.
A block diagram showing a configuration example of the communication terminal according to Embodiment 1 of the present invention.
A flowchart showing a specific example of the information search method by the communication terminal according to Embodiment 1 of the present invention.
A diagram for explaining the target image designation operation in the communication terminal according to Embodiment 1 of the present invention.
A diagram for explaining the target image designation operation in the communication terminal according to Embodiment 1 of the present invention.
A diagram for explaining the target image designation operation in the communication terminal according to Embodiment 1 of the present invention.
A diagram for explaining the target image designation operation in the communication terminal according to Embodiment 1 of the present invention.
A diagram for explaining the target image designation operation in the communication terminal according to Embodiment 1 of the present invention.
A block diagram showing another configuration example of the communication terminal according to Embodiment 1 of the present invention.
A flowchart showing a specific example of the information search method by the communication terminal according to Embodiment 2 of the present invention.
A flowchart showing a specific example of the information search method by the communication terminal according to Embodiment 3 of the present invention.
A block diagram showing a configuration example of the communication terminal according to Embodiment 4 of the present invention.
A flowchart showing a specific example of the information search method by the communication terminal according to Embodiment 4 of the present invention.
A flowchart showing a specific example of the substitute image designation method by the communication terminal according to Embodiment 4 of the present invention.
A block diagram showing a configuration example of the communication terminal according to Embodiment 5 of the present invention.
A flowchart showing a specific example of the information search method by the communication terminal according to Embodiment 5 of the present invention.
Hereinafter, specific embodiments to which the present invention is applied will be described in detail with reference to the drawings. In the drawings, the same or corresponding elements are denoted by the same reference numerals, and redundant description is omitted as necessary for clarity.
<Embodiment 1 of the Invention>
FIG. 1 is a diagram showing a network configuration including the communication terminal according to the present embodiment. The communication terminal 1 has a wireless or wired transceiver and can communicate with the search system 9 via the network 8. Specific examples of the communication terminal 1 include a mobile phone terminal, a smartphone, a tablet computer, a notebook PC, a desktop PC, and a television broadcast receiver having a communication function. The network 8 is a data transfer network such as an IP (Internet Protocol) network. The network 8 may be a wired network, a wireless network, or a combination thereof. The network 8 includes, for example, a carrier's radio access network and packet core network, an IP leased line, and the public Internet.
The communication terminal 1 transmits, to the search system 9, a target image selected from the moving image by the viewer in order to search for attribute information of a subject shown in the moving image displayed on the display. The target image may be an image of one full screen (one frame) or a partial image corresponding to a part of the screen. Details of the method of selecting the target image in the communication terminal 1 will be described later.
The search system 9 identifies the subject shown in the target image by performing a similar image search using the target image received from the communication terminal 1. Further, the search system 9 transmits attribute information related to the identified subject to the communication terminal 1. As described above, the subject is, for example, a person, an animal, a plant, a product, or an anime character. The attribute information of the subject is, for example, a person profile such as a name, nationality, birthplace, or date of birth, a character name, a product name, a manufacturer, a release date, a price, or a URL (Uniform Resource Locator).
The search system 9 may be a general-purpose image search server on the Internet, or it may be a search system specialized for particular moving images. For example, the search system 9 may be specialized in searching for persons (performers) shown in television broadcasts. In this case, the search system 9 may use information on the television programs being broadcast in the time zone in which the search is performed to determine which images should be preferentially matched against the target image. Specifically, the search system 9 may preferentially match images of performers of the television programs broadcast in that time zone against the target image.
As described above, the communication terminal 1 sends the target image selected from the moving image by the viewer (i.e., the user of the terminal 1) to the search system 9. To do so, the viewer needs to perform an operation on the communication terminal 1 to designate a target image in which the subject to be searched for appears. However, the display content of the moving image may change between the time the viewer decides to search for the subject and the time the viewer finishes this operation, so that the subject the viewer intended to search for may move to a different position on the screen or disappear from the screen entirely. In other words, since the display screen of the moving image changes every moment, it is difficult to perform the operation of selecting the target image.
For example, as one method of designating a subject in a moving image, the viewer may operate a touch panel serving as the input device 13 at the moment the desired subject is displayed on the display of the communication terminal. For example, the subject may be designated by the viewer touching the touch panel so as to enclose (e.g., by drawing a circle around) the area where the desired subject is displayed on the underlying display. However, the display content of the moving image changes with time. For this reason, the display content may change between the time the viewer decides to search for the subject and the time the viewer finishes operating the touch panel, so that the subject may move to a different position on the screen or disappear from the screen.
To address this problem, the communication terminal 1 has a function of supporting the operation of designating the target image. That is, in order to compensate for the delay time required for the viewer to perform the operation, the communication terminal 1 is configured to determine, as the target image, an image that was displayed on the display before a reference time point determined based on the viewer's operation. Here, the reference time point is no later than the completion time of the viewer's operation. A configuration example and the operation of the communication terminal 1 are described in detail below.
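A minimal sketch of this delay compensation, assuming the terminal buffers recently displayed frames with timestamps (here in milliseconds) and looks up the frame shown a shift time before the reference time point; the buffer length and the bisect-based lookup are implementation assumptions, not details from the present disclosure.

```python
from bisect import bisect_right
from collections import deque

class FrameHistory:
    def __init__(self, max_frames=300):
        self.frames = deque(maxlen=max_frames)  # (timestamp_ms, frame_id) pairs

    def on_display(self, timestamp_ms, frame_id):
        """Record each frame as it is shown on the display."""
        self.frames.append((timestamp_ms, frame_id))

    def target_image(self, reference_ms, shift=0):
        """Return the frame that was on screen at (reference_ms - shift)."""
        t = reference_ms - shift
        times = [ts for ts, _ in self.frames]
        i = bisect_right(times, t)
        if i == 0:
            return self.frames[0][1]  # history too short: fall back to oldest frame
        return self.frames[i - 1][1]

hist = FrameHistory()
for n in range(10):                      # ~30 fps: one frame every 33 ms
    hist.on_display(n * 33, f"frame{n}")
print(hist.target_image(297))            # frame at the reference time -> frame9
print(hist.target_image(297, shift=200)) # compensate a 200 ms delay -> frame2
```

With a shift of zero the most recent frame is chosen; a positive shift selects the earlier frame the viewer actually had in mind when starting the operation.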
FIG. 2 is a block diagram showing a configuration example of the communication terminal 1. The communication terminal 1 shown in FIG. 2 has a wireless communication function. The processor 10 performs control, monitoring, and information processing of the terminal 1. The processor 10 may be a combination of a plurality of computers (e.g., MPUs (Micro Processing Units) or microcontrollers). More specifically, in order to realize the search for attribute information of a subject displayed in a moving image, the processor 10 displays the moving image on the display 12, accepts the operation of selecting the target image, and transmits and receives data to and from the search system 9. The processor 10 includes a designation control unit 14 and a search control unit 15, which are described in detail later.
 The wireless communication unit 11 connects to a wireless communication network via a base station (or a wireless access point). The wireless communication unit 11 generates a transmission signal by applying channel coding, interleaving, modulation (transmission symbol mapping), frequency up-conversion, signal amplification, and other processing to transmission data. It also generates reception data by applying signal amplification, frequency down-conversion, demodulation, error-correction decoding, and other processing to a signal received by the antenna. The wireless communication unit 11 may be a transceiver conforming to a known cellular or wireless communication scheme, for example, UTRA (UMTS Terrestrial Radio Access), E-UTRA (Evolved UTRA), GSM (Global System for Mobile Communications) (registered trademark), wireless LAN (Local Area Network), or WiMAX (Worldwide Interoperability for Microwave Access).
 The display 12 displays images, including moving images, so that the viewer (the user of the terminal 1) can see them. Specific examples of the display 12 include a liquid crystal display (LCD: Liquid Crystal Display), an EL (electroluminescence) display, and a CRT (Cathode Ray Tube) display.
 The moving image displayed on the display 12 may be video captured by a camera (not shown) mounted on the communication terminal 1. Alternatively, the moving image may be a decoded image of encoded video data (e.g., MPEG-2 data or MPEG-4 data) obtained from a memory built into the communication terminal 1 (e.g., an optical disc, a hard disk, or flash memory) or from an external device accessible via the wireless communication unit 11.
 The input device 13 is a device that accepts operations by the user. The input device 13 includes at least one of a pointing device operated by the hand of the viewer (the user of the terminal 1), a microphone that picks up the viewer's voice, and a pointing device operated by the viewer's gaze. Specific examples of a pointing device operated by the viewer's hand include a touch panel, a touch pad, and a mouse.
 To enable the search for attribute information of a subject shown in the moving image displayed on the display 12, the designation control unit 14 accepts the designation of a target image contained in the moving image through the viewer's operation of the input device 13. For example, when the input device 13 is a touch panel, the viewer may designate the subject by touching the touch panel so as to enclose the display area of the subject with a finger (e.g., by drawing a circle) at the moment the desired subject is displayed on the display 12. Alternatively, the viewer may designate the subject by touching a single point on it. In this case, the designation control unit 14 may identify the image region containing the subject by performing image recognition processing, such as face recognition, on a region that includes the point touched by the viewer.
 When the input device 13 is another type of pointing device, the viewer may operate the pointing device so as to enclose the display area of the subject with a pointer (e.g., by drawing a circle). Alternatively, the viewer may designate the subject by pointing at a single point on it.
 When the input device 13 includes a microphone, the viewer may speak a word or phrase that identifies the subject (e.g., man, woman, dog, cat, flower, or car) into the microphone. In this case, the search control unit 15, described later, may identify the target image from the time at which the viewer spoke the word or phrase, or from the viewer's operation of another input device (e.g., a touch panel, a mouse, or operation buttons). Here, the target image may be a screen image of the entire screen rather than a partial image. The search control unit 15 may transmit the target image, as a screen image, together with the word or phrase entered by the viewer to the search system 9. The search system 9 may recognize, within the target image, a subject corresponding to the word or phrase and perform a similar-image search using the recognized subject image.
 Furthermore, to compensate for the delay involved in the viewer performing the operation of designating the target image, the designation control unit 14 determines, as the target image, an image that was displayed on the display 12 before a reference time point determined from the viewer's operation. Here, the reference time point is at or before the time at which the viewer's operation is completed. In other words, the designation control unit 14 determines, as the target image, an image displayed on the display 12 at least before the completion of the operation. The reference time point may be, for example, the time at which the designation operation was completed, the time at which it was started, or the midpoint of the period the operation took.
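The three candidate reference time points named above can be expressed directly from the start and end timestamps of the designation operation. The following sketch is illustrative only (the function name and mode strings are not taken from the embodiment):

```python
def reference_time(op_start, op_end, mode="completion"):
    """Derive the reference time point from the designation operation's
    start and end timestamps (seconds), per the three options above."""
    if mode == "completion":
        return op_end                      # completion of the operation
    if mode == "start":
        return op_start                    # start of the operation
    if mode == "midpoint":
        return (op_start + op_end) / 2.0   # midpoint of the operation period
    raise ValueError(f"unknown mode: {mode!r}")

# An operation traced from t = 10.0 s to t = 12.0 s:
print(reference_time(10.0, 12.0, "midpoint"))  # -> 11.0
```

The target image is then the frame displayed a shift time before this value, whichever mode is used.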
 How far before the reference time point of the viewer's operation (e.g., the completion of the operation) the target image should be taken from may be fixed statically, or it may be varied according to the viewer or according to the moving image. When fixed statically, the period by which to go back from the reference time point (hereinafter called the shift time) may be determined in consideration of the average human reaction time.
 On the other hand, when the shift time is determined dynamically according to the viewer, a calibration that measures the viewer's reaction time may be performed. Specifically, a test moving image is displayed, the time from when a specific subject appears in the test moving image until the viewer completes the operation of selecting that subject's display area is measured, and the shift time is determined according to the measured time. The viewer may also be allowed to freely adjust an initial shift-time value based on the average human reaction time. For example, by operating the terminal 1, the viewer may correct a shift time that is too long (going back too far into the past) or too short (not going back far enough). The time a viewer needs to complete the operation of designating a target image is expected to vary from person to person, depending on factors such as age. Changing the shift time according to the viewer therefore makes it possible to identify the target image the viewer intended more accurately.
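One possible realization of this calibration is sketched below. The use of the median, the clamping range, and the default value are choices of this example, not of the embodiment: the measured reaction delays from the test moving image are aggregated into a per-viewer shift time.

```python
from statistics import median

def calibrate_shift_time(reaction_delays, default=0.7, lo=0.1, hi=3.0):
    """Derive a per-viewer shift time (seconds) from calibration trials.

    reaction_delays: one value per trial, each the time from a test subject
    appearing on screen until the viewer finished selecting it. The result
    is clamped to a plausible range [lo, hi]; with no trials, a default
    based on an average human reaction time is returned.
    """
    if not reaction_delays:
        return default
    shift = median(reaction_delays)   # median resists one distracted trial
    return max(lo, min(hi, shift))

print(calibrate_shift_time([0.9, 1.1, 1.0]))  # -> 1.0
print(calibrate_shift_time([]))               # -> 0.7
```

The viewer-initiated correction described above would then simply add or subtract a small increment to the calibrated value.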
 When the shift time is determined dynamically according to the moving image, the shift time may be changed according to how fast the subject in the moving image is moving. A specific example of changing the shift time according to the moving image is described in detail in another embodiment (Embodiment 3).
 The search control unit 15 transmits the target image identified by the designation control unit 14, or a substitute image for it, to the search system 9 via the wireless communication unit 11. The search control unit 15 then receives, from the search system 9 via the wireless communication unit 11, the attribute information retrieved on the basis of the target image or its substitute image. Here, the substitute image is an image that contains substantially the same subject as the target image identified by the designation control unit 14 but is better suited to similar-image search, for example because its image quality is higher than that of the target image. An example of using a substitute image is described in detail in another embodiment (Embodiment 4).
 FIG. 3 is a flowchart showing a specific example of the information search method performed by the communication terminal 1 according to the present embodiment. In step S1, the communication terminal 1 displays a moving image on the display 12. In step S2, the communication terminal 1 accepts the viewer's operation of the input device 13 for designating a target image. In step S3, the communication terminal 1 determines, as the target image, an image displayed before the reference time point of the viewer's operation, in other words, an image that was displayed on the display 12 at least before the viewer's operation was completed. In step S4, the communication terminal 1 transmits the determined target image, or a substitute image for it, to the search system 9. Finally, in step S5, the communication terminal 1 receives the attribute information from the search system 9.
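The S1 to S5 flow can be sketched end to end as follows. This is an illustrative sketch only: the search-system round trip is stubbed out as a plain function, and all names, the frame-buffer representation, and the shift value are assumptions of the sketch, not of the embodiment.

```python
import time

def information_search(display_frames, shift_time, search_system):
    """End-to-end sketch of steps S1-S5. `display_frames` is the list of
    (timestamp, image) pairs displayed so far (S1); calling this function
    models the moment the viewer's designation operation completes (S2),
    which is used here as the reference time point."""
    ref = time.monotonic()
    target_t = ref - shift_time                       # S3: go back by the shift time
    shown = [img for t, img in display_frames if t <= target_t]
    target_image = shown[-1] if shown else display_frames[0][1]
    return search_system(target_image)                # S4: transmit; S5: receive attributes

# Usage with a stub search system standing in for the network round trip:
now = time.monotonic()
frames = [(now - 5.0 + i, f"frame{i}") for i in range(5)]   # last 5 s of video
stub = lambda img: {"target": img, "attributes": ["illustrative attribute"]}
print(information_search(frames, 2.0, stub)["target"])      # -> frame3
```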
 Next, the advantage of identifying the target image using the shift time is described with reference to FIGS. 4A to 4D and FIG. 5. FIGS. 4A to 4D and FIG. 5 illustrate, as an example, the case where the input device 13 is a touch panel. The viewer designates the target image, as a partial image, by touching the touch panel with a finger so as to enclose the display area of the subject.
 Each of FIGS. 4A to 4C shows an image 40 occupying one screen of the display 12 on which a subject 401 appears. Here, the subject 401 is a person. The content of the display 12 changes over time in the order of FIG. 4A, FIG. 4B, and FIG. 4C. That is, as indicated by the white arrows in FIGS. 4A to 4C, the subject 401 moves toward the right of the screen.
 The viewer decides to search for the attribute information of the subject 401 at the moment shown in FIG. 4A, and begins the operation of enclosing the face of the subject 401 at the moment shown in FIG. 4B. The operation trace 402 in FIG. 4B shows the path traced by the viewer's finger, and the operation trace 403 in FIG. 4C shows the trace at the moment the viewer completes the operation. The displayed content of the moving image, however, changes from moment to moment. By the time of FIG. 4C, the subject 401 has moved toward the right of the screen, and the image of the desired subject 401 no longer lies within the area enclosed by the viewer's operation trace 403.
 To compensate for the delay involved in performing such an operation, the communication terminal 1 according to the present embodiment takes, for example, the completion of the operation as the reference time point and adopts, as the target image, the image displayed a predetermined shift time (or a shift time determined according to the viewer or the moving image) before that reference time point. For example, by going back by the shift time to the start of the operation, the subject 401 can be selected correctly, as in the image 43 shown in FIG. 4D.
 The example of FIGS. 4A to 4D addressed the delay between the viewer starting the operation and completing it. However, there is also a delay between the viewer deciding to perform a search and starting the operation for it. For example, the displayed content of the moving image may change during the delay between the viewer deciding to search for a subject in the moving image and actually operating the input device 13, so that the subject is no longer displayed. To deal with this, for example, the start of the viewer's operation may be taken as the reference time point, and the image displayed a shift time before the start of the operation may be selected as the target image. A display image close to the moment the viewer decided to perform the search can thereby be selected as the target image.
 The target-image designation procedure shown in FIGS. 4A to 4D is merely an example. To make it easier to designate the target image (subject), or to shorten the delay involved in the designation operation, it is desirable to simplify the designation operation itself. For example, the communication terminal 1 (the designation control unit 14) may automatically recognize subjects displayed in the moving image and accept, through the viewer's operation, the designation of the subject to be selected from among the automatically recognized subjects. When identifying a person, as shown in FIG. 5, the communication terminal 1 may perform face recognition processing on the image 40 occupying one screen of the display 12 on which the subject 401 appears, and display a frame 404 indicating the face region of the detected person superimposed on the image 40. The communication terminal 1 may then accept, for example via the input device 13, the viewer's operation of designating the frame 404. For example, when the input device 13 is a touch panel, an operation in which the viewer taps the frame 404 or the area inside it may be used as the target-image designation operation. When a plurality of subjects (e.g., a plurality of persons) are displayed in the image 40, the communication terminal 1 may display the plurality of detected subject regions (e.g., face regions) with frames 404 or the like. In short, by presenting target-image candidates using the automatic subject recognition function, the communication terminal 1 simplifies the operation of designating the target image (subject) and thereby shortens the delay that operation requires.
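Once automatic recognition has produced candidate rectangles, the tap-on-a-frame designation described above reduces to a simple hit test. The sketch below is illustrative only (the names and coordinates are hypothetical, and the face detection itself, e.g. by a library detector, is assumed to have already run):

```python
def pick_subject_region(tap_point, regions):
    """Return the candidate rectangle (x, y, w, h) whose frame the viewer
    tapped, or None if the tap landed outside every proposed region."""
    tx, ty = tap_point
    for (x, y, w, h) in regions:
        if x <= tx <= x + w and y <= ty <= y + h:
            return (x, y, w, h)
    return None

regions = [(40, 30, 80, 80), (200, 30, 80, 80)]   # two detected face regions
print(pick_subject_region((60, 50), regions))     # -> (40, 30, 80, 80)
print(pick_subject_region((10, 10), regions))     # -> None
```

The returned rectangle would then be cropped from the selected frame and used as the partial target image.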
 However, even when the time required for the viewer's operation is shortened by using the automatic subject recognition function shown in FIG. 5, there is still at least a delay between the viewer deciding to perform a search and performing the operation for it. Therefore, the image displayed a shift time before the reference time point of the viewer's operation (e.g., its start, midpoint, or completion) may be selected as the target image. The shift time here may be determined in consideration of the delay between the viewer deciding to perform the search and performing the operation. A display image close to the moment the viewer decided to perform the search can thereby be selected as the target image.
 The problem that selecting a subject (target image) in a moving image is difficult because the display changes from moment to moment can, of course, arise with any moving image. The communication terminal 1 according to the present embodiment is therefore effective regardless of the type of moving image displayed on the display 12. The problem of designating a subject in a moving image is, however, particularly acute when the viewer is watching a television broadcast, because a television broadcast generally cannot be paused or rewound at the viewer's own will. The communication terminal 1 according to the present embodiment is therefore particularly effective when the moving image displayed on the display 12 is a television broadcast.
 The processing performed by the designation control unit 14 and the search control unit 15 described in the present embodiment may be realized using a semiconductor processing device including an ASIC (Application Specific Integrated Circuit). Alternatively, this processing may be realized by causing a computer, such as a microprocessor or a DSP (Digital Signal Processor), to execute a program. That is, one or more programs containing instructions for causing a computer to perform the algorithms described with reference to FIGS. 1 to 4 may be created and supplied to the computer.
 Such a program can be stored using any of various types of non-transitory computer readable media and supplied to a computer. Non-transitory computer readable media include various types of tangible storage media. Examples of non-transitory computer readable media include magnetic recording media (e.g., flexible disks, magnetic tapes, and hard disk drives), magneto-optical recording media (e.g., magneto-optical discs), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memories (e.g., mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, and RAM (Random Access Memory)). The program may also be supplied to a computer by any of various types of transitory computer readable media. Examples of transitory computer readable media include electric signals, optical signals, and electromagnetic waves. A transitory computer readable medium can supply the program to a computer via a wired communication path, such as an electric wire or an optical fiber, or via a wireless communication path.
 FIG. 6 shows a configuration example for the case where the communication terminal 1 is built using a computer system. The LCD 121 is a specific example of the display 12. The touch panel 131 and the microphone 132 are specific examples of the input device 13. An operating system (OS) 103 and a search application program 104 stored in a nonvolatile storage unit 102 (e.g., a flash memory or a hard disk drive) are loaded into a RAM (Random Access Memory) 101. The MPU (Micro Processing Unit) 100 executes the OS 103 and the search application program 104 loaded in the RAM 101, thereby realizing the functions of the designation control unit 14 and the search control unit 15.
<Embodiment 2 of the Invention>
 In the present embodiment, a first modification of the communication terminal 1 according to Embodiment 1 described above will be described. In the present embodiment, the communication terminal 1 determines whether a scene change has occurred in the moving image within a first period before the reference time point of the viewer's operation (e.g., the start or completion of the operation). When a scene change has occurred, the terminal 1 determines an image from before the scene change as the target image. The presence or absence of a scene change may be determined by comparing the image at the reference time point with an image from before the reference time point. Specifically, the magnitude of the change in pixel value between the two images may be calculated for each pixel, or for each pixel block containing a plurality of pixels, and a scene change may be judged to have occurred when the amount of change exceeds a predetermined criterion.
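The per-pixel variant of the comparison just described can be sketched as follows. This is a minimal illustration: the frame representation (grayscale values as nested lists) and the threshold value 30.0 are assumptions of the sketch, not values fixed by the embodiment.

```python
def scene_changed(prev_frame, curr_frame, threshold=30.0):
    """Report a scene change when the mean absolute per-pixel difference
    between two grayscale frames (lists of rows of 0-255 values) exceeds
    `threshold`. A per-block version would average over pixel blocks
    instead of single pixels."""
    h, w = len(prev_frame), len(prev_frame[0])
    total = sum(abs(prev_frame[y][x] - curr_frame[y][x])
                for y in range(h) for x in range(w))
    return total / (h * w) > threshold

dark = [[10] * 4 for _ in range(4)]
light = [[200] * 4 for _ in range(4)]
print(scene_changed(dark, light))   # -> True  (hard cut between scenes)
print(scene_changed(dark, dark))    # -> False (same scene)
```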
 When the movement of the subject within the moving image is small, the viewer can designate the subject relatively easily. In television broadcasts and movies, however, scenes sometimes switch discontinuously. If a scene change happens to occur between the viewer deciding to search for a subject and starting the operation for it, the viewer may therefore lose the opportunity to search for that subject. In the present embodiment, by contrast, the presence or absence of a scene change is determined and, when a scene change has occurred, an image from before the scene change is determined as the target image. The target image the viewer wants to search for can therefore be selected appropriately.
 FIG. 7 is a flowchart showing a specific example of the information search method performed by the communication terminal 1 according to the present embodiment. The processing in steps S1, S2, S4, and S5 in FIG. 7 is the same as that of the identically numbered steps shown in FIG. 3, so redundant description of these steps is omitted.
 Steps S31 to S34 in FIG. 7 are a modification of step S3 in FIG. 3. In step S31, the communication terminal 1 (the designation control unit 14) determines whether a scene change has occurred within a predetermined period before the reference time point of the viewer's operation (e.g., the start of the operation). As will be understood from the description of Embodiment 1, the predetermined period in step S31 may be determined statically or dynamically according to the viewer or the moving image. When a scene change has occurred (YES in step S32), the communication terminal 1 (the designation control unit 14) determines an image from before the scene change as the target image (step S33). When no scene change is detected (NO in step S32), the communication terminal 1 (the designation control unit 14) may select, as the target image, the image at the reference time point of the viewer's operation, or an earlier image (step S34).
<Embodiment 3 of the Invention>
 In the present embodiment, a second modification of the communication terminal 1 according to Embodiment 1 described above will be described. In the present embodiment, the communication terminal 1 changes the shift time according to how fast a subject in the moving image is moving. In other words, the communication terminal 1 according to the present embodiment changes the shift time according to the magnitude of the motion vectors between a plurality of images contained in the moving image.
 To judge how fast a subject is moving, motion vectors between a plurality of images contained in the moving image may be calculated and their magnitude obtained. For example, the faster the subject in the moving image moves, that is, the larger the motion vectors, the longer the shift time may be made. The faster the subject moves, the more likely the viewer is to hesitate during the operation, or to take longer to complete it while trying to select the subject precisely. Therefore, by increasing the shift time as the subject moves faster, that is, as the motion vectors grow larger, the target image the viewer intended can be identified more accurately.
 FIG. 8 is a flowchart showing a specific example of the information search method performed by the communication terminal 1 according to the present embodiment. The processing in steps S1, S2, S4, and S5 in FIG. 8 is the same as that of the identically numbered steps shown in FIG. 3, so redundant description of these steps is omitted.
 Steps S35 to S37 in FIG. 8 are a modification of step S3 in FIG. 3. In step S35, the communication terminal 1 (the designation control unit 14) calculates motion vectors between a plurality of images in the moving image. In step S36, the communication terminal 1 (the designation control unit 14) determines the shift time according to the magnitude of the calculated motion vectors. In step S37, the communication terminal 1 (the designation control unit 14) determines, as the target image, the display image shown the shift time before the reference time point of the viewer's operation.
<Embodiment 4 of the Invention>
 In the present embodiment, a third modification of the communication terminal 1 according to Embodiment 1 described above will be described. Specifically, the present embodiment shows an example in which a substitute image corresponding to the target image determined by the designation control unit 14 is transmitted to the search system 9.
 FIG. 9 is a block diagram showing a configuration example of the communication terminal 4 according to the present embodiment. The communication terminal 4 displays, on the display 12, video obtained by shooting, with a camera 161 mounted on the terminal 4, a television broadcast screen shown on an external television receiver or the like.
 The designation control unit 14 determines the target image from the moving image displayed on the display 12 (that is, the video including the television broadcast screen captured by the camera 161) according to any of the methods described in the first to third embodiments above.
 The search control unit 45 acquires a substitute image corresponding to the target image determined by the designation control unit 14. Specifically, the search control unit 45 uses the television tuner 162 mounted on the terminal 4 to acquire, as the substitute image, a television broadcast image at substantially the same time as the target image that was captured by the camera 161 and then displayed on the display 12.
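The timestamp matching performed by the search control unit 45 can be sketched as follows. The buffer layout, the `tolerance` parameter, and the function name are assumptions for illustration; the disclosure only requires that the tuner image be at substantially the same time as the target image.

```python
def substitute_image(target_timestamp, tuner_buffer, tolerance=0.1):
    # tuner_buffer: assumed rolling list of (timestamp, frame) pairs
    # recorded from the television tuner 162. Return the broadcast
    # frame whose timestamp is closest to that of the target image
    # shown on the display, or None if nothing falls within
    # `tolerance` seconds.
    ts, frame = min(tuner_buffer, key=lambda tf: abs(tf[0] - target_timestamp))
    if abs(ts - target_timestamp) <= tolerance:
        return frame
    return None
```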
 The present embodiment assumes a case in which a viewer watching a television program on an ordinary television broadcast receiver searches for attribute information such as the name of a performer. The viewer can, of course, also obtain the desired attribute information by applying any of the information search methods described in the first to third embodiments to the moving image captured by the camera 161 and displayed on the display 12. However, the television broadcast screen captured by the camera 161 may be inferior in image quality to the video obtained by the television tuner 162 mounted on the terminal 4, for example when the screen of the television broadcast receiver is shot from a distance or when camera shake occurs during shooting. Because the present embodiment uses the video obtained by the television tuner 162 as the substitute image, it has the advantage that a substitute image of higher quality than the target image can be used for the similar-image search.
 In order to determine the substitute image, the channel of the television broadcast screen captured by the camera 161 must be identified. For example, the search control unit 45 may determine whether the channel of the television broadcast screen captured by the camera 161 matches that of the video from the television tuner 162 by comparing the image from the camera 161 with the image from the television tuner 162, by comparing audio acquired through a microphone (not shown) with the audio from the television tuner 162, or by a combination of these. The search control unit 45 may also detect, by character recognition, channel information contained in the television broadcast screen captured by the camera 161. Furthermore, instead of such automatic channel identification, the viewer may designate the channel by operating the terminal 4.
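The image-comparison variant of channel identification can be sketched as follows. Normalized cross-correlation is a stand-in similarity measure chosen for illustration; the disclosure does not prescribe a particular metric, and the audio-comparison and character-recognition alternatives are not shown.

```python
import numpy as np

def identify_channel(camera_frame, tuner_frames):
    # tuner_frames: assumed mapping of channel id -> current frame
    # from the television tuner 162. Return the channel whose frame
    # most resembles the image captured by the camera 161.
    def similarity(a, b):
        a = a.astype(np.float32).ravel()
        b = b.astype(np.float32).ravel()
        a -= a.mean()
        b -= b.mean()
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return float(a @ b / denom) if denom else 0.0
    return max(tuner_frames, key=lambda ch: similarity(camera_frame, tuner_frames[ch]))
```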
 FIG. 10 is a flowchart showing a specific example of the information search method performed by the communication terminal 4 according to the present embodiment. The processes in steps S2, S3, and S5 in FIG. 10 are the same as the steps with the same reference numerals shown in FIG. 3, and redundant description of these steps is therefore omitted.
 In step S11, the communication terminal 4 displays on the display 12 a moving image obtained by capturing the television broadcast screen with the camera 161. In step S41, the communication terminal 4 acquires, from the video obtained by the television tuner 162, a substitute image corresponding to the target image determined in step S3. In step S42, the communication terminal 4 transmits the substitute image to the search system 9.
 When the viewer shoots the television broadcast screen with the camera 161, it would be ideal for the television broadcast screen to exactly fill the image frame of the camera 161. In practice, however, this is not easy, particularly when the viewer shoots the screen in a hurry in order to search for a subject. Considering the viewer's convenience, it is also desirable that the television broadcast screen can be captured more easily. The communication terminal 4 may therefore detect the television broadcast screen within the image captured by the camera 161. A specific example of detecting the television broadcast screen will be described with reference to FIG. 11.
 FIG. 11 is a flowchart showing a specific example of step S41 in FIG. 10. In step S411, the search control unit 45 detects the television broadcast screen within the image captured by the camera 161. Specifically, the search control unit 45 may detect the rectangular frame of the television broadcast receiver in the captured image.
 In step S412, the search control unit 45 identifies the partial image selected by the viewer (the user of the terminal 4) based on the position and size of the television broadcast screen within the image captured by the camera 161.
 In step S413, the search control unit 45 determines, as the substitute image, the video from the television tuner 162 that corresponds to the partial image selected by the viewer.
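The coordinate mapping implied by steps S412 and S413 can be sketched as follows. The rectangle detection of step S411 is assumed to have already produced an axis-aligned bounding box; that simplification, along with the function and parameter names, is an assumption for illustration.

```python
def map_selection_to_broadcast(selection_xy, screen_box, broadcast_size):
    # screen_box: (x, y, width, height) of the television broadcast
    # screen detected in the camera image in step S411.
    # selection_xy: the point the viewer selected in camera-image
    # coordinates. Returns the corresponding pixel position in the
    # tuner's broadcast frame (steps S412-S413).
    (sx, sy) = selection_xy
    (x0, y0, w, h) = screen_box
    (bw, bh) = broadcast_size
    u = (sx - x0) / w   # relative position inside the detected screen
    v = (sy - y0) / h
    return (round(u * bw), round(v * bh))
```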
<Embodiment 5 of the Invention>
 In the present embodiment, a fourth modification of the communication terminal 1 according to the first embodiment described above will be described. FIG. 12 is a block diagram showing a configuration example of the communication terminal 5 according to the present embodiment. The communication terminal 5 has a recording control unit 16. The recording control unit 16 uses the television tuner 162 to automatically record television broadcast programs related to the attribute information acquired from the search system 9.
 FIG. 13 is a flowchart showing a specific example of the information search method performed by the communication terminal 5 according to the present embodiment. The processes in steps S1 to S5 in FIG. 13 are the same as the steps with the same reference numerals shown in FIG. 3, and redundant description of these steps is therefore omitted.
 In step S6 of FIG. 13, the communication terminal 5 (recording control unit 16) automatically records a television broadcast program related to the attribute information acquired from the search system 9. For example, the recording control unit 16 may automatically record television programs in which the person corresponding to the subject's name included in the attribute information appears. The recording control unit 16 may determine such programs by acquiring a television program guide or by accessing a server that holds television program performer information (for example, a World Wide Web server).
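The program selection in step S6 can be sketched as follows. The guide format (a list of dicts with "title" and "cast" keys) and the function name are assumptions; a real recording control unit 16 would obtain this data from an EPG or from a web server holding performer information.

```python
def programs_to_record(attribute_info, program_guide):
    # Step S6 sketch: select upcoming programs whose cast includes
    # the person named in the attribute information returned from
    # the search system 9.
    name = attribute_info["name"]
    return [p["title"] for p in program_guide if name in p["cast"]]
```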
 According to the present embodiment, when a viewer watching a television program designates an image of a performer in that program and instructs an image search, the communication terminal 5 can easily and automatically reserve recordings of other television programs in which that performer will appear. That is, even a viewer who does not know the performer's name can reserve recordings of other programs featuring that performer.
<Other embodiments>
 In the first to fifth embodiments of the invention described above, the display 12, the input device 13, the camera 161, and the television tuner 162 are mounted, together with the processor 10, on the communication terminals 1, 4, and 5. However, these devices need only be coupled to the processor 10 and need not be configured as an integrated communication terminal. For example, these devices and the processor 10 may communicate using a wired communication function or a wireless communication function such as a wireless LAN or Bluetooth (registered trademark).
 Furthermore, the present invention is not limited to the embodiments described above, and various modifications can of course be made without departing from the gist of the present invention.
 This application claims priority based on Japanese Patent Application No. 2011-268994, filed on December 8, 2011, the entire disclosure of which is incorporated herein.
1, 4, 5  Communication terminal
10  Processor
11  Wireless communication unit
12  Display
13  Input device
14  Designation control unit
15  Search control unit
16  Recording control unit
40-43  Image
45  Search control unit
8  Network
9  Search system
100  MPU (Micro Processing Unit)
101  RAM (Random Access Memory)
102  Non-volatile storage unit
103  Operating system
104  Search application program
121  LCD (Liquid Crystal Display)
131  Touch panel
132  Microphone
161  Camera
162  Television tuner
401  Subject
402, 403  Operation trajectory
404  Frame indicating face area

Claims (38)

  1.  An information processing apparatus comprising:
     designation control means for accepting, through an operation of an input device by a user, designation of a target image included in a moving image displayed on a display, for searching for attribute information of a subject shown in the moving image; and
     search control means for transmitting the target image or a substitute image thereof to a search system and receiving, from the search system, attribute information retrieved based on the target image or the substitute image,
     wherein the designation control means determines, as the target image, an image that was displayed on the display before a reference time point of the operation, the reference time point being at or before the completion time point of the operation, in order to compensate for a delay time required for execution of the operation.
  2.  The information processing apparatus according to claim 1, wherein the designation control means determines whether a scene change occurred in the moving image within a first period before the reference time point and, when the scene change occurred, determines an image preceding the scene change as the target image.
  3.  The information processing apparatus according to claim 2, wherein the designation control means determines whether the scene change occurred by comparing an image at the reference time point with an image preceding the reference time point.
  4.  The information processing apparatus according to claim 1, wherein the designation control means determines, as the target image, an image that was displayed a first period before the reference time point.
  5.  The information processing apparatus according to any one of claims 2 to 4, wherein the designation control means is capable of changing the first period.
  6.  The information processing apparatus according to claim 5, wherein the designation control means performs calibration for measuring a reaction time of the user and determines the first period according to the reaction time.
  7.  The information processing apparatus according to claim 5, wherein the designation control means determines the first period according to a magnitude of a motion vector between a plurality of images included in the moving image.
  8.  The information processing apparatus according to claim 7, wherein the designation control means determines the first period to be relatively longer as the magnitude of the motion vector is larger.
  9.  The information processing apparatus according to any one of claims 1 to 8, wherein the reference time point is the completion time point of the operation, the start time point of the operation, or the midpoint of the period required for the operation.
  10.  The information processing apparatus according to any one of claims 1 to 9, wherein the moving image is a video captured by a camera coupled to the information processing apparatus.
  11.  The information processing apparatus according to claim 10, wherein the moving image is a video obtained by capturing a display screen of a television broadcast with the camera.
  12.  The information processing apparatus according to claim 11, wherein the search control means:
     acquires, as the substitute image, an image of the television broadcast at substantially the same time as the target image that was captured by the camera and then displayed on the display, by using a television tuner coupled to the information processing apparatus; and
     transmits the substitute image to the search system.
  13.  The information processing apparatus according to claim 12, wherein the search control means:
     detects the display screen of the television broadcast in an image captured by the camera; and
     acquires the substitute image from the video obtained by the television tuner based on the size and position of the display screen within the captured image.
  14.  The information processing apparatus according to any one of claims 1 to 13, wherein the target image is a partial image of a one-screen image included in the moving image.
  15.  The information processing apparatus according to claim 14, wherein the designation control means automatically recognizes a subject in an image included in the moving image and outputs a display indicating the recognized subject to the display, and
     the operation includes designating the display indicating the subject.
  16.  The information processing apparatus according to claim 14, wherein the input device includes a touch panel disposed on the front of the display, and
     the operation includes the user touching the touch panel with a finger so as to encircle the area in which a desired subject is displayed on the display.
  17.  The information processing apparatus according to claim 14, wherein the input device includes a microphone that collects the voice of the user, and
     the search control means transmits, to the search system, the one-screen image including the target image as the partial image and audio information for designating the subject input through the microphone.
  18.  The information processing apparatus according to any one of claims 1 to 17, further comprising recording control means for automatically recording, by using a television tuner coupled to the information processing apparatus, a television broadcast program related to the attribute information acquired from the search system.
  19.  A communication terminal comprising:
     the information processing apparatus according to any one of claims 1 to 17;
     the display;
     the input device; and
     communication means used for the transmission of the target image and the reception of the attribute information by the search control means.
  20.  The communication terminal according to claim 19, wherein the input device includes at least one of a pointing device operated by the user's hand, a microphone that collects the user's voice, and a pointing device operated by the user's line of sight.
  21.  An information search method performed by an information processing apparatus, the method comprising:
     accepting, through an operation of an input device by a user, designation of a target image included in a moving image displayed on a display, for searching for attribute information of a subject shown in the moving image;
     transmitting the target image or a substitute image thereof to a search system; and
     receiving, from the search system, attribute information retrieved based on the target image or the substitute image,
     wherein the accepting includes determining, as the target image, an image that was displayed on the display before a reference time point of the operation, the reference time point being at or before the completion time point of the operation, in order to compensate for a delay time required for execution of the operation.
  22.  The information search method according to claim 21, wherein the determining as the target image includes determining whether a scene change occurred in the moving image within a first period before the reference time point and, when the scene change occurred, determining an image preceding the scene change as the target image.
  23.  The information search method according to claim 22, wherein the determining as the target image includes determining whether the scene change occurred by comparing an image at the reference time point with an image preceding the reference time point.
  24.  The information search method according to claim 21, wherein the determining as the target image includes determining, as the target image, an image that was displayed a first period before the reference time point.
  25.  The information search method according to any one of claims 22 to 24, further comprising changing the first period.
  26.  The information search method according to claim 25, wherein the changing of the first period includes performing calibration for measuring a reaction time of the user and determining the first period according to the reaction time.
  27.  The information search method according to claim 25, wherein the changing of the first period includes determining the first period according to a magnitude of a motion vector between a plurality of images included in the moving image.
  28.  The information search method according to claim 27, wherein the first period is determined to be relatively longer as the magnitude of the motion vector is larger.
  29.  The information search method according to any one of claims 21 to 28, wherein the reference time point is the completion time point of the operation, the start time point of the operation, or the midpoint of the period required for the operation.
  30.  The information search method according to any one of claims 21 to 29, wherein the moving image is a video captured by a camera coupled to the information processing apparatus.
  31.  The information search method according to claim 30, wherein the moving image is a video obtained by capturing a display screen of a television broadcast with the camera.
  32.  The information search method according to claim 31, wherein the transmitting includes:
     acquiring, as the substitute image, an image of the television broadcast at substantially the same time as the target image that was captured by the camera and then displayed on the display, by using a television tuner coupled to the information processing apparatus; and
     transmitting the substitute image to the search system.
  33.  The information search method according to claim 32, wherein the acquiring includes:
     detecting the display screen of the television broadcast in an image captured by the camera; and
     acquiring the substitute image from the video obtained by the television tuner based on the size and position of the display screen within the captured image.
  34.  The information search method according to any one of claims 21 to 33, wherein the target image is a partial image of a one-screen image included in the moving image.
  35.  The information search method according to claim 34, further comprising automatically recognizing a subject in an image included in the moving image and outputting a display indicating the recognized subject to the display,
     wherein the operation includes designating the display indicating the subject.
  36.  The information search method according to claim 34, wherein the input device includes a touch panel disposed on the front of the display, and
     the operation includes the user touching the touch panel with a finger so as to encircle the area in which a desired subject is displayed on the display.
  37.  The information search method according to any one of claims 21 to 36, further comprising automatically recording, by using a television tuner coupled to the information processing apparatus, a television broadcast program related to the attribute information acquired from the search system.
  38.  A non-transitory computer-readable medium storing a program for causing a computer to perform an information search method, the method comprising:
     accepting, through an operation of an input device by a user, designation of a target image included in a moving image displayed on a display, for searching for attribute information of a subject shown in the moving image;
     transmitting the target image or a substitute image thereof to a search system; and
     receiving, from the search system, attribute information retrieved based on the target image or the substitute image,
     wherein the accepting includes determining, as the target image, an image that was displayed on the display before a reference time point of the operation, the reference time point being at or before the completion time point of the operation, in order to compensate for a delay time required for execution of the operation.
PCT/JP2012/007342 2011-12-08 2012-11-15 Information processing device, communication terminal, information search method, and non-temporary computer-readable medium WO2013084422A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2011-268994 2011-12-08
JP2011268994 2011-12-08

Publications (1)

Publication Number Publication Date
WO2013084422A1 true WO2013084422A1 (en) 2013-06-13

Family

ID=48573815

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2012/007342 WO2013084422A1 (en) 2011-12-08 2012-11-15 Information processing device, communication terminal, information search method, and non-temporary computer-readable medium

Country Status (2)

Country Link
JP (1) JPWO2013084422A1 (en)
WO (1) WO2013084422A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110944118A (en) * 2018-09-25 2020-03-31 富士施乐株式会社 Storage medium, image processing apparatus, and image processing method
CN110944118B (en) * 2018-09-25 2024-01-26 富士胶片商业创新有限公司 Computer readable storage medium, image processing apparatus, and image processing method

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6403368B2 (en) * 2013-09-13 2018-10-10 京セラ株式会社 Mobile terminal, image search program, and image search method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0855131A (en) * 1994-08-12 1996-02-27 Nippon Telegr & Teleph Corp <Ntt> Method and device for identifying object in dynamic image
JPH10187759A (en) * 1996-01-31 1998-07-21 Mitsubishi Electric Corp Moving picture anchor displaying, selecting, and stetting device, moving picture hypermedia device using moving picture anchor, and moving picture provision system
JP2000113208A (en) * 1998-09-30 2000-04-21 Toshiba Corp Information presenting method, information presenting device and recording medium
JP2000132563A (en) * 1998-10-26 2000-05-12 Omron Corp Image retrieval method, image processing method, information retrieval method, recording medium recording execution programs of the methods, image processor, and information retrieval system
JP2004054435A (en) * 2002-07-17 2004-02-19 Toshiba Corp Hypermedia information presentation method, hypermedia information presentation program and hypermedia information presentation device
JP2008276340A (en) * 2007-04-26 2008-11-13 Hitachi Ltd Retrieving device



Also Published As

Publication number Publication date
JPWO2013084422A1 (en) 2015-04-27

Similar Documents

Publication Publication Date Title
KR101680714B1 (en) Method for providing real-time video and device thereof as well as server, terminal device, program, and recording medium
KR102164481B1 (en) Apparatus, method, and system for tracking user viewing behavior using pattern matching and character recognition
JP2016538657A (en) Browse videos by searching for multiple user comments and overlaying content
CN106559712B (en) Video playing processing method and device and terminal equipment
KR101181588B1 (en) Image processing apparatus, image processing method, image processing system and recording medium
US20150341698A1 (en) Method and device for providing selection of video
TW202122989A (en) Information processing method and apparatus, electronic device and computer readable storage medium
US11545188B2 (en) Video processing method, video playing method, devices and storage medium
JP2016535351A (en) Video information sharing method, apparatus, program, and recording medium
JP6385429B2 (en) Method and apparatus for reproducing stream media data
US10728583B2 (en) Multimedia information playing method and system, standardized server and live broadcast terminal
US20070092220A1 (en) System for reproducing video
JP6999516B2 (en) Information processing equipment
US20140157294A1 (en) Content providing apparatus, content providing method, image displaying apparatus, and computer-readable recording medium
CN110719530A (en) Video playing method and device, electronic equipment and storage medium
CN113542610A (en) Shooting method, mobile terminal and storage medium
US20230401030A1 (en) Selecting options by uttered speech
JP2017501598A5 (en)
CN106254939B (en) Information prompting method and device
US20140036149A1 (en) Information processor and information processing method
KR20180043712A (en) Method for displaying an image and an electronic device thereof
WO2013084422A1 (en) Information processing device, communication terminal, information search method, and non-temporary computer-readable medium
US20140003656A1 (en) System of a data transmission and electrical apparatus
US20160360293A1 (en) Method and apparatus for playing 3d film sources in smart tv
CN108933881B (en) Video processing method and device

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 12855918; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2013548069; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 12855918; Country of ref document: EP; Kind code of ref document: A1)