WO2010073905A1 - Moving image viewing apparatus - Google Patents
Moving image viewing apparatus
- Publication number
- WO2010073905A1 (PCT/JP2009/070566)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- moving image
- unit
- still image
- information
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N9/00—Details of colour television systems
- H04N9/79—Processing of colour television signals in connection with recording
- H04N9/80—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
- H04N9/82—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback the individual colour picture signal components being recorded simultaneously only
- H04N9/8205—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback the individual colour picture signal components being recorded simultaneously only involving the multiplexing of an additional signal and the colour video signal
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7847—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N1/00—Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
- H04N1/00127—Connection or combination of a still picture apparatus with another apparatus, e.g. for storage, processing or transmission of still picture signals or of information associated with a still picture
- H04N1/00132—Connection or combination of a still picture apparatus with another apparatus, e.g. for storage, processing or transmission of still picture signals or of information associated with a still picture in a digital photofinishing system, i.e. a system where digital photographic images undergo typical photofinishing processing, e.g. printing ordering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N1/00—Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
- H04N1/00127—Connection or combination of a still picture apparatus with another apparatus, e.g. for storage, processing or transmission of still picture signals or of information associated with a still picture
- H04N1/00204—Connection or combination of a still picture apparatus with another apparatus, e.g. for storage, processing or transmission of still picture signals or of information associated with a still picture with a digital computer or a digital computer system, e.g. an internet server
- H04N1/00209—Transmitting or receiving image data, e.g. facsimile data, via a computer, e.g. using e-mail, a computer network, the internet, I-fax
- H04N1/00222—Transmitting or receiving image data, e.g. facsimile data, via a computer, e.g. using e-mail, a computer network, the internet, I-fax details of image data generation or reproduction, e.g. scan-to-email or network printing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/462—Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
- H04N21/4622—Retrieving content or additional data from different sources, e.g. from a broadcast channel and the Internet
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/482—End-user interface for program selection
- H04N21/4828—End-user interface for program selection for searching program descriptors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/8146—Monomedia components thereof involving graphical data, e.g. 3D object, 2D graphics
- H04N21/8153—Monomedia components thereof involving graphical data, e.g. 3D object, 2D graphics comprising still images, e.g. texture, background image
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/84—Generation or processing of descriptive data, e.g. content descriptors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/858—Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot
- H04N21/8586—Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot by using a URL
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
Definitions
- The present invention relates to a moving image viewing apparatus that searches for and displays moving images that are shot by a plurality of moving photographing devices and distributed in real time, using feature information calculated from still images.
- With the improvement of transmission speeds in mobile communications, their uses will extend to the transmission and reception of high-quality moving images, and everyone will be able to deliver video in real time at any time.
- A photographing device such as an in-vehicle camera or a mobile phone may shoot while moving from place to place.
- In that case the scene, and hence the content of the delivered video, changes from moment to moment.
- As the number of such terminals increases, viewers need to search for and select the videos they want to view efficiently.
- In Patent Literature 1, in order for a viewer to select a moving image distributed from a plurality of photographing devices, methods are proposed for presenting snapshot videos acquired from each photographing device for selection, or for displaying the positions of the photographing devices on a map and letting the viewer choose from there.
- Patent Document 2 makes it possible to perform a search using an attribute list composed of text information directly attached to the image to be searched, or using feature information extracted from the image.
- However, although the method disclosed in Patent Document 1 provides a means for intuitively selecting a scene to be viewed, it does not mention how to search for a target moving image when the number of photographing devices increases. The method disclosed in Patent Document 2 leaves open how to perform accurate attribute extraction and feature extraction for video that is updated in real time and whose content changes from moment to moment.
- The present invention is a moving image viewing apparatus that enables moving images captured by moving image photographing devices and distributed via a network to be searched for and viewed using still images.
- It comprises a communication unit that connects to a network for communication, an input interface unit that receives input from a user, a display unit that presents search results to the user, a moving image frame acquisition unit that acquires frame data of moving images distributed over the network via the communication unit, a feature extraction unit that extracts feature information indicating image characteristics from the still images used for search and from the frame data acquired by the moving image frame acquisition unit, a meta information storage unit that stores meta information of the still images and of the frame data including the feature information extracted by the feature extraction unit, and an image search control unit that controls moving image search.
- Based on a search instruction input from the input interface unit, the image search control unit searches the meta information of still images in the meta information storage unit; if meta information of a still image corresponding to the search instruction exists, it calculates the similarity of the feature information of the frame data to the feature information of that still image, selects frame data meta information in descending order of similarity, acquires the corresponding moving images via the communication unit, and displays them on the display unit.
- The image search control unit may select meta information of frame data whose feature information has a similarity equal to or greater than a certain value, or it may display the moving images corresponding to the selected frame data on the display unit and let the user select a target moving image through the input interface.
- The moving image viewing apparatus of the present invention further includes a still image acquisition unit that acquires still images from the network via the communication unit, and the image search control unit causes the still image acquisition unit to search for and acquire still images from the network based on the search instruction input from the input interface unit.
- The moving image viewing apparatus of the present invention further includes a still image storage unit that stores still images, and the image search control unit searches for and acquires still images from the still image storage unit based on the search instruction input from the input interface unit.
- The image search control unit searches for and acquires a still image when the meta information storage unit contains no more than a predetermined number of corresponding meta information entries.
- The meta information of the still image includes a keyword for performing a search, and the image search control unit performs the search based on the keyword input from the input interface unit.
- The meta information of the still image and of the frame data includes position information that uniquely indicates the location of the image, and the image search control unit acquires the image based on that position information.
- The moving image viewing apparatus of the present invention further includes a still image storage unit that stores still images, and the position information of a still image is information that identifies a still image stored in the still image storage unit.
- The moving image frame acquisition unit acquires an image at an arbitrarily specified time from the moving images distributed from the moving image photographing device; if the frame closest to the specified time is an intra frame, a single intra frame is acquired, and if the frame closest to the specified time is a predicted frame, the frames needed to decode it, including the intra frame and the predicted frames, are acquired.
- The moving image frame acquisition unit acquires images from a plurality of moving image photographing devices specified in advance, according to a specified time.
- The feature extraction unit extracts luminance information of each pixel from an image and outputs, as feature information, the numbers of pixels having the same luminance value arranged in ascending order, descending order, or a predetermined order.
- The feature extraction unit extracts color information of each pixel from an image and outputs, as feature information, the numbers of pixels having the same color arranged in ascending order, descending order, or a predetermined order.
- Because the present invention uses meta information of still images to search for moving images whose scenes and subjects change from moment to moment, a desired scene can be searched for efficiently and easily using a keyword or a key still image.
- FIG. 1 is a system diagram showing an embodiment of a moving image viewing system. FIG. 2 is a block diagram showing the functional configuration of the moving image viewing apparatus of the moving image viewing system. FIG. 3 is a flowchart showing the procedure of searching for still images on the network based on an input keyword, extracting the features of the acquired still images, and storing them in the still image meta information storage unit. FIG. 4 is a diagram showing an example of the still image meta information stored in the still image meta information storage unit. FIG. 5 is a flowchart showing the procedure of acquiring still images from the still image storage unit based on an input keyword, extracting their features, and storing them in the still image meta information storage unit.
- FIG. 1 is a system diagram showing an embodiment of a moving image viewing system.
- Reference numerals 11 to 14 denote moving image photographing devices, which are equipped with a communication device and capable of transmitting the images in real time while photographing.
- four devices are shown to indicate that a plurality of different devices are mixed, but the number of devices is not limited in actual use.
- Reference numerals 15 and 16 denote moving image viewing apparatuses, such as a mobile phone, a personal computer, and a television.
- Reference numeral 17 denotes a network in which the moving image capturing device and the moving image viewing device can communicate with each other, and includes, for example, communication using a circuit switching network, a mobile phone network, a wireless LAN, Bluetooth, or the like.
- Reference numerals 18 and 19 denote servers connected to the network 17, which can store and distribute still images, moving images taken by the moving image shooting device, and the like.
- the moving image capturing apparatus may provide a moving image to the moving image viewing apparatus via the network 17 without using the server.
- The moving image viewing apparatuses 15 and 16 acquire, from the moving image photographing devices 11 to 14 that distribute moving images in real time (or via the servers 18 and 19), a snapshot of the image being distributed, that is, frame data (a still image) of the moving image, either at regular intervals or when a search request is received from the viewer.
- Feature information is extracted from the still image of the acquired frame data and stored in the apparatus, together with the acquisition source information (URI), as meta information of the moving image frame.
- Here, the meta information includes image feature information, acquisition source information (URI), acquisition time, position information indicating the location of the image, search keywords, copyright information, and the like; it may be embedded in the still image data as in EXIF, managed as separate data such as text or XML, or managed as a database.
- the search keyword is a character string representing an object, a landscape, a person, a state, and the like.
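As an illustration only, the following is a minimal sketch of how one such meta information entry might be organized; the class and field names (`StillImageMeta`, `source_uri`, and so on) are assumptions of this sketch and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class StillImageMeta:
    """Hypothetical record for one entry of still image meta information."""
    feature: List[int]                        # feature information, e.g. a luminance histogram
    source_uri: str                           # acquisition source: a URI or a local file path
    acquired_at: Optional[str] = None         # acquisition time
    location: Optional[str] = None            # position information indicating where the image is
    keywords: List[str] = field(default_factory=list)  # search keywords (objects, scenery, people, states)
    copyright_notice: Optional[str] = None    # copyright information

# Example entry for a still image found with the keyword "tower"
meta = StillImageMeta(
    feature=[120, 340, 90],                   # toy histogram values
    source_uri="http://example.com/images/tower.jpg",
    keywords=["tower", "landmark"],
)
```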
- The moving image viewing apparatuses 15 and 16 also extract feature information from still images stored in the apparatus, or from still images that can be acquired from the communication network 17 (still images in the servers 18 and 19), and store it in the apparatus as still image meta information together with a search keyword and acquisition source information (information indicating the location of a file in the apparatus, or a URI).
- The search keyword may be a keyword included in metadata already attached to the still image, a keyword used as a search word when the still image was acquired from the communication network 17, or a keyword entered directly by the user from the terminal's input device.
- Furthermore, meta information of still images acquired by the same method is stored in the apparatus in advance.
- When the viewer inputs a keyword as a search key or designates a still image for the search, the moving image viewing apparatus extracts the feature information of the corresponding still images, calculates its similarity to the feature information in the moving image frame meta information, determines that frames with high similarity belong to the moving images the viewer wants, and presents the search results.
- One or more pieces of still image feature information may be extracted; when there are several, the similarity calculation against a single piece of moving image frame meta information yields several results, and the one with the highest similarity is adopted.
- FIG. 2 is a block diagram showing a functional configuration of the moving image viewing apparatus of the moving image viewing system.
- As shown here, the moving image viewing apparatus includes a communication unit 20, a moving image frame acquisition unit 30, a still image acquisition unit 40, an image decoding unit 50, a feature extraction unit 60, a meta information storage unit 70, a still image storage unit 80, an image search control unit 90, an input interface unit 100, and a display unit 110.
- the communication unit 20 exchanges data with the servers 18 and 19 and the moving image photographing apparatuses 11 to 14 via the communication network 17.
- The moving image frame acquisition unit 30 acquires, via the communication unit 20, a snapshot (moving image frame) of a moving image being distributed, that is, a still image, either at certain time intervals or when a search request is received from a viewer.
- the still image acquisition unit 40 acquires a still image distributed to the communication network 17 via the communication unit 20.
- The image decoding unit 50 has a function of decoding moving images and a function of decoding still images, which are called the moving image decoding unit 51 and the still image decoding unit 52, respectively.
- the feature extraction unit 60 extracts feature information (details will be described later) from the decoded image.
- The meta information storage unit 70 can separately store the feature information extracted from still images and the feature information extracted from moving images; the storage destinations are called the still image meta information storage unit 71 and the moving image frame meta information storage unit 72, respectively.
- the still image storage unit 80 stores still images.
- the image search control unit 90 controls each unit when moving image search processing is performed.
- The input interface unit 100 has a function for entering a keyword to perform a search and a function for selecting a moving image to view, which are referred to as the search key input unit 101 and the moving image designating unit 102, respectively.
- The input interface unit 100 consists of hardware such as a keyboard, a mouse, a touch panel, buttons, or another pointing device.
- Through the search key input unit 101, the viewer can specify a scene to see by directly entering a keyword or the like, by selecting from keywords or genres registered in advance, or by selecting an image close to the scene to be viewed.
- The display unit 110 presents to the viewer the search results obtained from the entered search key and displays the moving image the viewer selects; these functions are referred to as the search result display unit 111 and the moving image display unit 112, respectively.
- FIG. 3 is a flowchart showing the procedure in which the moving image viewing apparatus searches for still images on the network based on an input keyword, extracts the features of the acquired still images, and stores them in the still image meta information storage unit 71. This is an excerpt of the processing from search to viewing according to the present invention.
- In step S11, based on the keyword input from the search key input unit 101 of the input interface unit 100, the image search control unit 90 searches for still images on the communication network 17 via the communication unit 20. The search method is not particularly limited; for example, on the Internet there are services that, when keywords are sent to a Web server, return a list of still images corresponding to the keywords from the server's database, and such a service can be used here. To further improve the accuracy of the moving image search results, a plurality of images are acquired from the still images found.
- In step S12, the image search control unit 90 checks whether any of the found still images has already been acquired and had its feature information extracted, and excludes such images from subsequent processing, as in the sketch below. Whether feature information has already been extracted can be determined from the information indicating the still image acquisition source, that is, the URI, stored in the still image meta information storage unit 71. If feature information has already been extracted from all of the found still images, the processing shown in FIG. 3 ends here; if an unprocessed still image remains, the process proceeds to the next step.
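A minimal sketch of this duplicate check, assuming the still image meta information is held as records that expose the acquisition URI (the names below are illustrative, not from the patent):

```python
def filter_unprocessed(found_uris, stored_meta):
    """Return only the URIs whose feature information has not yet been extracted.

    found_uris  -- URIs of still images returned by the keyword search (step S11)
    stored_meta -- iterable of records, each exposing a .source_uri attribute
    """
    known = {m.source_uri for m in stored_meta}
    return [uri for uri in found_uris if uri not in known]

# An empty result means every found image was already processed, so the
# processing of FIG. 3 ends here; otherwise the remaining images go on to
# decoding (step S13) and feature extraction (step S14).
```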
- In step S13, the image search control unit 90 causes the still image acquisition unit 40 to acquire the still images whose feature information has not yet been extracted, and the still image decoding unit 52 decodes them into a format from which feature information can be extracted.
- In step S14, the feature extraction unit 60 extracts the feature information of the decoded still images.
- the feature information is information that can be used to search for an original image or an image similar to the original image using the information as a clue.
- The feature information extraction method is not limited in the present invention. Examples include recording the color distribution of the image, extracting feature points of the image, and extracting statistical information, that is, a histogram, of the texture, luminance information, or color information in the image.
- For a texture histogram, for example, each pixel in the image is compared with the brightness of its eight surrounding pixels to determine whether more than half of the surrounding pixels are brighter than the pixel of interest, and the number of pixels is totaled and arranged for each luminance.
- A histogram of color information is obtained, for example, by extracting the degree of red, green, and blue for each pixel, and counting and arranging the number of pixels for each color.
- The ordering may be ascending or descending in terms of the number of pixels, or ascending or descending in terms of luminance or color; alternatively, a predetermined arrangement may be used. A sketch of such histogram features is given below.
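As one possible reading of the luminance and color histograms described above, here is a small sketch using Pillow and NumPy; the library choice and the bin counts are assumptions of this sketch, not requirements of the patent.

```python
import numpy as np
from PIL import Image

def luminance_histogram(path, bins=256):
    """Count pixels per luminance value; the resulting vector is the feature information."""
    gray = np.asarray(Image.open(path).convert("L"))
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    return hist

def color_histogram(path, bins_per_channel=8):
    """Count pixels per quantized (R, G, B) color and flatten the counts into one vector."""
    rgb = np.asarray(Image.open(path).convert("RGB"))
    quantized = rgb // (256 // bins_per_channel)          # map 0-255 to 0-(bins-1)
    hist, _ = np.histogramdd(
        quantized.reshape(-1, 3),
        bins=(bins_per_channel,) * 3,
        range=((0, bins_per_channel),) * 3,
    )
    return hist.ravel()

# The counts may then be sorted in ascending or descending order, or left in a
# predetermined order, exactly as the text above describes.
```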
- In step S15, the feature extraction unit 60 stores the keyword input in step S11, the information (URI) of the still image acquisition source, and the feature information extracted in step S14 together in the still image meta information storage unit 71.
- Steps S12 to S15 may be repeated for each still image, or each of steps S12 to S15 may be performed in turn on all the images.
- FIG. 4 is a diagram illustrating an example of still image meta information stored in the still image meta information storage unit 71 as a result of the series of processes in FIG.
- Still image meta information entries 1 to 3 were extracted from keywords input so far, and entry 4 shows newly added still image meta information obtained from a newly input keyword. Accumulating still image meta information corresponding to keywords input in the past in this way increases the search speed when the same keyword is input again.
- Alternatively, the image search control unit 90 may skip the determination in step S12, perform the processing up to step S14 once, compare the feature information, and refrain from storing new still image meta information if it matches an existing entry completely. If the feature information matches but the keyword does not, only the keyword may be added to the keyword portion of the stored still image meta information; as a result, the same feature information can also be used with keys of different meanings.
- FIG. 5 is a flowchart showing the procedure of acquiring still images from the still image storage unit 80 based on an input keyword, extracting the features of the acquired still images, and storing them in the still image meta information storage unit 71. This is an excerpt of the processing from search to viewing according to the present invention.
- In step S21, the image search control unit 90 checks whether a still image that contains the keyword input from the search key input unit 101 as part of its file name, or that has the keyword as file metadata, exists in the still image storage unit 80.
- In step S22, the image search control unit 90 checks whether any of the still images stored in the still image storage unit 80 has already been acquired and had its feature information extracted, and excludes those already processed from subsequent processing. Whether feature information has already been extracted can be determined from the information indicating the still image acquisition source (see FIG. 4), that is, the file name, stored in the still image meta information storage unit 71. If feature information has already been extracted from all of the stored still images, the processing shown in FIG. 5 ends here; if an unprocessed still image remains, the process proceeds to the next step.
- In step S23, the image search control unit 90 has the still images in the still image storage unit 80 whose feature information has not yet been extracted decoded by the still image decoding unit 52 into a format from which feature information can be extracted.
- In step S24, the feature extraction unit 60 extracts the feature information of the decoded still images.
- In step S25, the feature extraction unit 60 stores the keyword input in step S21, the still image file path, and the feature information extracted in step S24 together in the still image meta information storage unit 71.
- The stored information takes the form of the example shown in FIG. 4.
- An apparatus for viewing a moving image needs to have information on a moving image acquisition source, that is, a moving image photographing apparatus, in advance.
- For example, several photographing devices may be registered in advance, information on each photographing device may be registered by the viewer, or the information may be acquired from a portal site that collects information on moving image photographing devices.
- The moving image frame acquisition unit 30 of the moving image viewing apparatus acquires snapshots (moving image frames) of the moving images being distributed, either sequentially at certain time intervals or at the time a search request is received from the viewer. For example, an image at an arbitrarily specified time is acquired from the moving images distributed from a moving image photographing device; if the frame closest to the specified time is an intra frame, a single intra frame is acquired, and if the frame closest to the specified time is a predicted frame, the frames needed to decode it, including the intra frame and the predicted frames, are acquired, as sketched below.
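The frame-selection rule can be sketched as follows; the frame representation (a list of dicts with a timestamp and an "I"/"P" type) and the function name are assumptions made only for illustration, and a real implementation would need a container and codec parser.

```python
def frames_needed(frames, target_time):
    """Return the frames that must be fetched to reconstruct the picture closest to target_time.

    frames      -- frame metadata sorted by timestamp, e.g. {"time": 12.0, "type": "I"}
    target_time -- the arbitrarily specified time, in the same units as "time"
    """
    idx = min(range(len(frames)), key=lambda i: abs(frames[i]["time"] - target_time))
    if frames[idx]["type"] == "I":
        return [frames[idx]]              # an intra frame can be decoded on its own
    # For a predicted frame, also fetch everything back to the preceding intra frame.
    start = idx
    while start > 0 and frames[start]["type"] != "I":
        start -= 1
    return frames[start:idx + 1]
```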
- the image acquired by the moving image frame acquisition unit 30 is decoded by the still image decoding unit 52 of the image decoding unit 50, and the feature extraction unit 60 extracts features by the same method as the feature extraction of the still image.
- the feature extraction unit 60 stores the extracted feature information in the moving image frame meta information storage unit 72 together with information (URI) indicating the acquisition source. At this time, if moving image frame meta information having the same information indicating the acquisition source already exists in the moving image frame meta information storage unit 72, the feature extraction unit 60 overwrites and stores the information.
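A sketch of this overwrite-by-URI behavior, using a plain dictionary keyed on the acquisition source as a stand-in for the moving image frame meta information storage unit 72 (an assumption made only for illustration):

```python
frame_meta_store = {}   # stand-in for the moving image frame meta information storage unit 72

def store_frame_meta(uri, feature, acquired_at=None):
    """Save frame meta information; an entry with the same acquisition URI is overwritten."""
    frame_meta_store[uri] = {
        "uri": uri,                # acquisition source information
        "feature": feature,        # extracted feature information
        "acquired_at": acquired_at,
    }
```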
- FIG. 6 is an example of moving picture frame meta information stored in the moving picture frame meta information storage unit.
- In step S31, the viewer is first presented with a screen on the display unit 110 for searching for a moving image, from which the moving image to be viewed can be searched for and selected.
- For example, one or more keywords are entered from the search key input unit 101 of the input interface unit 100, keywords are selected from a keyword list, or a list of images representing scenes to be viewed is displayed and a selection is made from it.
- In step S32, it is determined whether a still image to be used in the search should be newly acquired from the network.
- The condition for this determination is whether still image meta information corresponding to the keyword input in step S31 exists in the still image meta information storage unit 71. Normally, if no such meta information exists, a moving image cannot be searched for unless a new still image is acquired, whereas if it exists, a new acquisition is unnecessary. However, because search accuracy improves when meta information from a plurality of still images is used, a still image for the search is acquired from the network if, for example, only a certain number or fewer entries of still image meta information are found in the still image meta information storage unit 71. If it is expected in step S34 that still images in the apparatus can be used for the search, or if it is not necessary to find additional images, acquisition is unnecessary.
- In step S33, the series of processing shown in FIG. 3 is performed, and the meta information including the feature information of the acquired still images is stored in the still image meta information storage unit 71.
- In step S34, the image search control unit 90 determines whether to use still images in the apparatus as still images for the search.
- the determination condition is whether or not the still image meta information corresponding to the keyword input in step S31 exists in the still image meta information storage unit 71. Normally, if it exists, it is not necessary to use a still image in the apparatus as a search image, but it is desirable to use it in order to improve the accuracy of moving image search.
- In step S35, the series of processing shown in FIG. 5 is performed, and the feature information is stored in the still image meta information storage unit 71.
- the determination conditions in steps S32 and S34 may be settings unique to the viewing device, or may be set freely by the viewer.
- In step S36, the image search control unit 90 performs the moving image search process using the input keyword, the still image meta information stored in the still image meta information storage unit 71, and the moving image frame meta information stored in the moving image frame meta information storage unit 72.
- First, still image meta information whose keyword matches part or all of the input keyword is extracted.
- Next, the similarity between the feature information of the extracted still image meta information and the feature information of the moving image frame meta information stored in the moving image frame meta information storage unit 72 is calculated.
- The similarity calculation method differs depending on the format of the feature information, and a plurality of calculation methods can be considered even for the same format, so it is not particularly limited in the present invention. For example, when the luminance information histogram described above is used as the feature information, the distance between the values for each luminance level can be calculated, and the sum of these distances can be used as a measure of similarity; with this measure, a smaller value means more similar and a larger value means less similar.
- Because the similarity will differ if the resolutions differ, it is preferable either to resize the images to the same number of pixels before performing feature extraction, or to equalize the scales of the histograms before calculating the similarity, that is, to normalize them so that the sums of all the histogram values are the same. A sketch of this calculation follows.
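A minimal sketch of the histogram-distance similarity described here, including the normalization of histogram scales; using the sum of absolute per-bin differences is one choice among several, since the patent does not fix the metric.

```python
import numpy as np

def histogram_distance(hist_a, hist_b):
    """Sum of per-bin distances; a smaller value means more similar, a larger one less similar."""
    a = np.asarray(hist_a, dtype=float)
    b = np.asarray(hist_b, dtype=float)
    # Equalize the histogram scales so that differing image sizes do not bias the result.
    a /= a.sum()
    b /= b.sum()
    return float(np.abs(a - b).sum())

def rank_frames(still_feature, frame_meta):
    """Rank moving image frame meta information by similarity to one still image feature."""
    scored = [(histogram_distance(still_feature, m["feature"]), m["uri"]) for m in frame_meta]
    scored.sort()                              # ascending distance = descending similarity
    return scored
```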
- In step S37, the search results are displayed on the search result display unit 111.
- the search results may be displayed in order of URIs, or the corresponding snapshot images acquired earlier may be displayed instead of the URIs. If possible, a moving image may be received from each photographing apparatus and displayed.
- the viewer designates a moving image desired to be viewed from the search results displayed on the search result display unit 111 by the moving image designating unit 102.
- the image search control unit 90 issues a viewing start request to the URI specified via the communication unit 20.
- the moving image decoding unit 51 decodes the moving image and the moving image display unit 112 displays the moving image.
- In this way, the viewer can view moving images whose scenes and distributed content change from moment to moment simply by entering a search keyword and selecting the desired moving image from the search results.
- Although the present invention assumes a case where moving images are distributed via a communication network, it is also applicable to moving image distribution by broadcasting.
- In that case, the acquisition source information held as metadata is channel information or frequency information.
- A search keyword can also be used when the broadcast information includes character information such as program information.
- Although the present invention assumes moving images photographed by photographing devices, it can also be applied to animated moving images.
- 11-14 Moving image photographing devices; 15, 16 Moving image viewing apparatuses; 17 Network; 18, 19 Servers; 20 Communication unit; 30 Moving image frame acquisition unit; 40 Still image acquisition unit; 50 Image decoding unit; 51 Moving image decoding unit; 52 Still image decoding unit; 60 Feature extraction unit; 70 Meta information storage unit; 71 Still image meta information storage unit; 72 Moving image frame meta information storage unit; 80 Still image storage unit; 90 Image search control unit; 100 Input interface unit; 101 Search key input unit; 102 Moving image designation unit; 110 Display unit; 111 Search result display unit; 112 Moving image display unit
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Library & Information Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Computer Graphics (AREA)
- Human Computer Interaction (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Television Signal Processing For Recording (AREA)
Abstract
Provided is a moving image viewing apparatus capable of efficiently searching for and retrieving a desired scene from a plurality of moving images delivered from a plurality of image pickup apparatuses. If no still image meta information for use in the search exists in a still image meta information storing unit (71), still images are acquired from among those delivered over the network or from a still image storing unit (80) in the moving image viewing apparatus, and image characteristic information is extracted by a characteristic extracting unit (60) and stored in the still image meta information storing unit (71). Characteristic information is also extracted, by the characteristic extracting unit (60), from the frame data of moving images delivered from the image pickup apparatuses and stored in a moving image frame meta information storing unit (72). An image search control unit (90) executes a moving image searching process using the entered keywords, the still image meta information stored in the still image meta information storing unit (71), and the moving image frame meta information stored in the moving image frame meta information storing unit (72).
Description
The present invention relates to a moving image viewing apparatus that searches for and displays moving images that are shot by a plurality of moving photographing devices and distributed in real time, using feature information calculated from still images.
Due to the improvement of transmission speeds in mobile communications, their uses will extend to the transmission and reception of high-quality moving images, and everyone will be able to deliver video in real time at any time. When a photographing device such as an in-vehicle camera or a mobile phone shoots while moving from place to place and the images are delivered in real time, the scene, and hence the content of the delivered video, changes from moment to moment. As the number of terminals that distribute such video increases, viewers need to search for and select the videos they want to view efficiently.
In Patent Literature 1, in order for a viewer to select a moving image distributed from a plurality of photographing devices, methods are proposed for presenting snapshot videos acquired from each photographing device for selection, or for displaying the positions of the photographing devices on a map and letting the viewer choose from there.
In Patent Document 2, it is possible to perform a search using an attribute list composed of text information directly attached to the image to be searched, or using feature information extracted from the image.
However, although the method disclosed in Patent Document 1 provides a means for intuitively selecting a scene to be viewed, it does not mention how to search for a target moving image when the number of photographing devices increases. The method disclosed in Patent Document 2 leaves open how to perform accurate attribute extraction and feature extraction for video that is updated in real time and whose content changes from moment to moment.
It is an object of the present invention to provide a moving image viewing apparatus capable of efficiently searching for a desired scene from among a plurality of moving images distributed from a plurality of photographing devices.
The present invention is a moving image viewing apparatus that enables moving images captured by moving image photographing devices and distributed via a network to be searched for and viewed using still images. It comprises a communication unit that connects to a network for communication, an input interface unit that receives input from a user, a display unit that presents search results to the user, a moving image frame acquisition unit that acquires frame data of moving images distributed over the network via the communication unit, a feature extraction unit that extracts feature information indicating image characteristics from the still images used for search and from the frame data acquired by the moving image frame acquisition unit, a meta information storage unit that stores meta information of the still images and of the frame data including the feature information extracted by the feature extraction unit, and an image search control unit that controls moving image search.
Based on a search instruction input from the input interface unit, the image search control unit searches the meta information of still images in the meta information storage unit; if meta information of a still image corresponding to the search instruction exists, it calculates the similarity of the feature information of the frame data in the meta information storage unit to the feature information of that still image, selects frame data meta information in descending order of similarity, acquires the corresponding moving images via the communication unit based on the selected frame data meta information, and displays them on the display unit.
The image search control unit may select meta information of frame data whose feature information has a similarity equal to or greater than a certain value, or it may display the moving images corresponding to the selected frame data on the display unit and let the user select a target moving image through the input interface.
The moving image viewing apparatus of the present invention further includes a still image acquisition unit that acquires still images from the network via the communication unit, and the image search control unit causes the still image acquisition unit to search for and acquire still images from the network based on the search instruction input from the input interface unit.
The moving image viewing apparatus of the present invention further includes a still image storage unit that stores still images, and the image search control unit searches for and acquires still images from the still image storage unit based on the search instruction input from the input interface unit.
Here, the image search control unit searches for and acquires a still image when the meta information storage unit contains no more than a predetermined number of corresponding meta information entries.
The meta information of the still image includes a keyword for performing a search, and the image search control unit performs the search based on the keyword input from the input interface unit.
The meta information of the still image and of the frame data includes position information that uniquely indicates the location of the image, and the image search control unit acquires the image based on that position information.
The moving image viewing apparatus of the present invention further includes a still image storage unit that stores still images, and the position information of a still image is information that identifies a still image stored in the still image storage unit.
The moving image frame acquisition unit acquires an image at an arbitrarily specified time from the moving images distributed from the moving image photographing device; if the frame closest to the specified time is an intra frame, a single intra frame is acquired, and if the frame closest to the specified time is a predicted frame, the frames needed to decode it, including the intra frame and the predicted frames, are acquired.
The moving image frame acquisition unit acquires images from a plurality of moving image photographing devices specified in advance, according to a specified time.
The feature extraction unit extracts luminance information of each pixel from an image and outputs, as feature information, the numbers of pixels having the same luminance value arranged in ascending order, descending order, or a predetermined order.
The feature extraction unit extracts color information of each pixel from an image and outputs, as feature information, the numbers of pixels having the same color arranged in ascending order, descending order, or a predetermined order.
Because the present invention uses meta information of still images to search for moving images whose scenes and subjects change from moment to moment, a desired scene can be searched for efficiently and easily using a keyword or a key still image.
An example of an embodiment of the present invention will be described below with reference to FIGS. 1 to 7.
FIG. 1 is a system diagram showing an embodiment of a moving image viewing system. Reference numerals 11 to 14 denote moving image photographing devices equipped with a communication device and capable of transmitting their images in real time while photographing, for example a mobile phone or smartphone equipped with a camera, a digital camera or digital video camera having a communication function, a camera installed in a car, or a mobile personal computer equipped with a web camera. Four devices are shown here to indicate that a plurality of different devices may be mixed, but the number of devices is not limited in actual use.
Reference numerals 15 and 16 denote moving image viewing apparatuses, such as a mobile phone, a personal computer, or a television.
Reference numeral 17 denotes a network over which the moving image photographing devices and the moving image viewing apparatuses can communicate with each other; it may consist, for example, of a circuit-switched network, a mobile phone network, or communication using a wireless LAN, Bluetooth, or the like.
Reference numerals 18 and 19 denote servers connected to the network 17, which can store and distribute still images, moving images taken by the moving image photographing devices, and so on. Of course, a moving image photographing device may also provide moving images to a moving image viewing apparatus over the network 17 without going through a server.
動画像視聴装置15,16は、動画をリアルタイムに配信している動画像撮影装置11~14から(あるいはサーバ18,19を介して)、一定時間ごとに、または、視聴者からの検索要求があった時点で、配信している画像のスナップショット、すなわち動画像のフレームデータ(静止画像)を取得する。取得したフレームデータの静止画像は、その特徴情報を抽出し、取得元の情報(URI)とともに動画像フレームのメタ情報として装置内に格納する。取得元となる動画像撮影装置は、あらかじめ選別しておくか、そのような装置を取りまとめているポータルサイト(サーバ18,19)などから取得できるものとする。
ここで、メタ情報とは、画像の特徴情報、取得元の情報(URI)、取得時刻、画像の所在を示す位置情報、検索用キーワード、著作権情報、等を含むものであり、EXIFのように静止画像データに埋め込むもの、テキストやXMLなどで別データとして管理するもの、データベースとして管理するもののことを指す。検索用キーワードとは、物体、景色、人物、状態、等を表す文字列である。 The movingimage viewing apparatuses 15 and 16 receive a search request from the moving image photographing apparatuses 11 to 14 that distribute moving images in real time (or via the servers 18 and 19) at regular intervals or from the viewer. At a certain point, a snapshot of the image being distributed, that is, frame data (still image) of the moving image is acquired. For the still image of the acquired frame data, the feature information is extracted and stored in the apparatus as meta information of the moving image frame together with the acquisition source information (URI). It is assumed that the moving image capturing device as the acquisition source is selected in advance or can be acquired from a portal site (servers 18 and 19) that collects such devices.
Here, the meta information includes image feature information, acquisition source information (URI), acquisition time, position information indicating the location of the image, search keywords, copyright information, and the like. It is embedded in still image data, managed as separate data such as text or XML, and managed as a database. The search keyword is a character string representing an object, a landscape, a person, a state, and the like.
ここで、メタ情報とは、画像の特徴情報、取得元の情報(URI)、取得時刻、画像の所在を示す位置情報、検索用キーワード、著作権情報、等を含むものであり、EXIFのように静止画像データに埋め込むもの、テキストやXMLなどで別データとして管理するもの、データベースとして管理するもののことを指す。検索用キーワードとは、物体、景色、人物、状態、等を表す文字列である。 The moving
Here, the meta information includes image feature information, acquisition source information (URI), acquisition time, position information indicating the location of the image, search keywords, copyright information, and the like. It is embedded in still image data, managed as separate data such as text or XML, and managed as a database. The search keyword is a character string representing an object, a landscape, a person, a state, and the like.
The moving image viewing apparatuses 15 and 16 also extract feature information from still images stored in the apparatus or from still images obtainable over the communication network 17 (still images on the servers 18 and 19), and store it in the apparatus as still image meta information together with a search keyword and acquisition source information (information indicating the location of the file within the apparatus, or a URI). As the search keyword, a keyword contained in metadata already attached to the still image, a keyword used as a search word when the still image was acquired from the communication network 17, or a keyword entered directly by the user from the input device of the terminal is used. Furthermore, still image meta information acquired by the same method is stored in the apparatus in advance.
When the viewer enters a keyword as a search key or designates a still image for the search, the moving image viewing apparatus extracts the feature information of the corresponding still images, calculates the similarity between that feature information and the feature information in the moving image frame meta information, determines that frames with high similarity are the moving images the viewer wants, and presents the search results. The feature information extracted from the still images may be a single item or several; when there are several, the similarity calculation against a single piece of moving image frame meta information yields several results, and the highest similarity among them is adopted.
FIG. 2 is a block diagram showing the functional configuration of a moving image viewing apparatus of the moving image viewing system. As shown here, the moving image viewing apparatus includes a communication unit 20, a moving image frame acquisition unit 30, a still image acquisition unit 40, an image decoding unit 50, a feature extraction unit 60, a meta information storage unit 70, a still image storage unit 80, an image search control unit 90, an input interface unit 100, and a display unit 110.
The communication unit 20 exchanges data with the servers 18 and 19 and the moving image capturing devices 11 to 14 via the communication network 17.
The moving image frame acquisition unit 30 acquires, via the communication unit 20, a snapshot (moving image frame) of the moving image being distributed, that is, a still image, either at regular intervals or at the time a search request is received from the viewer.
The still image acquisition unit 40 acquires, via the communication unit 20, still images distributed over the communication network 17.
The image decoding unit 50 has a function of decoding moving images and a function of decoding still images, referred to as the moving image decoding unit 51 and the still image decoding unit 52, respectively.
The feature extraction unit 60 extracts feature information (described in detail later) from decoded images.
The meta information storage unit 70 has a function of separately storing feature information extracted from still images and feature information extracted from moving images; the respective storage destinations are referred to as the still image meta information storage unit 71 and the moving image frame meta information storage unit 72.
The still image storage unit 80 stores still images.
The image search control unit 90 controls each unit when moving image search processing is performed.
The input interface unit 100 has a function of entering keywords for performing a search and a function of selecting a moving image to view, referred to as the search key input unit 101 and the moving image designation unit 102, respectively. The input interface unit 100 is composed of hardware such as input devices and pointing devices, for example a keyboard, a mouse, a touch panel, and buttons. Through the search key input unit 101, the viewer can specify the scene to be viewed by entering a keyword directly, selecting from pre-registered keywords or genres, or selecting an image close to the desired scene.
The display unit 110 can present to the viewer the search results obtained by entering a search key and can display a moving image selected from them; these functions are referred to as the search result display unit 111 and the moving image display unit 112, respectively.
FIG. 3 is a flowchart showing the procedure by which the moving image viewing apparatus searches the network for still images based on an entered keyword, extracts the features of the acquired still images, and stores them in the still image meta information storage unit 71. This is an excerpt of part of the processing from search to viewing according to the present invention.
First, in step S11, the image search control unit 90 searches for still images on the communication network 17 via the communication unit 20, based on the keyword entered through the search key input unit 101 of the input interface unit 100. The search method is not particularly limited; for example, on the Internet there are a number of services that, when some keywords are entered and sent to a Web server, return a list of still images matching the keywords from the database in the Web server, and such a service may be used. A plurality of images are acquired from the retrieved still images in order to improve the accuracy of the later moving image search results.
Next, in step S12, the image search control unit 90 checks whether any of the retrieved still images have already been acquired and had their feature information extracted, and excludes those from the subsequent processing. Whether extraction has already been done can be determined from the information indicating the acquisition source of each still image stored in the still image meta information storage unit 71, that is, from the URI. If feature information has already been extracted from all of the retrieved still images, the processing shown in FIG. 3 ends here. If there are still images from which it has not yet been extracted, the procedure proceeds to the next step.
Next, in step S13, the image search control unit 90 causes the still image acquisition unit 40 to acquire the still images whose feature information has not yet been extracted, and the still image decoding unit 52 decodes each still image into a format from which feature information can be extracted.
Next, in step S14, the feature extraction unit 60 extracts feature information from each decoded still image. Feature information is information that can be used as a clue to find the original image or an image similar to it. The method of extracting feature information is not limited in the present invention; examples include recording the color distribution of the image, extracting feature points of the image, and extracting statistical information, that is, a histogram, of the texture, the luminance information, or the color information in the image.
The texture histogram is obtained, for example, by comparing every pixel in the image with the luminance of its eight neighboring pixels, judging whether more than half of the neighboring pixels are brighter than the pixel of interest, counting the number of such pixels for each luminance value, and arranging the counts by luminance.
The color information histogram is obtained, for example, by extracting the degree of red, green, and blue for each pixel and counting and arranging the number of pixels for each color.
The arrangement may be in ascending or descending order of the number of pixels, in ascending or descending order of luminance or color value, or in a predetermined order.
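As one concrete reading of the histogram features described above, the following sketch computes a luminance histogram, the texture-style histogram (counting pixels for which more than half of the eight neighbours are brighter), and a per-channel color histogram for 8-bit images; the NumPy-based helpers and the fixed by-value bin ordering (one of the permitted "predetermined" arrangements) are illustrative assumptions.

```python
import numpy as np

def luminance_histogram(gray: np.ndarray, bins: int = 256) -> np.ndarray:
    # Count how many pixels take each luminance value (0-255).
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    return hist

def texture_histogram(gray: np.ndarray, bins: int = 256) -> np.ndarray:
    # For every interior pixel, check whether more than half of its eight
    # neighbours are brighter than the pixel itself, then tally those pixels
    # by their own luminance value.
    h, w = gray.shape
    center = gray[1:h - 1, 1:w - 1].astype(np.int32)
    brighter = np.zeros_like(center)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx].astype(np.int32)
            brighter += (neighbour > center)
    mask = brighter > 4  # more than half of the eight neighbours
    hist, _ = np.histogram(center[mask], bins=bins, range=(0, 256))
    return hist

def color_histogram(rgb: np.ndarray, bins: int = 256) -> np.ndarray:
    # Per-channel (R, G, B) pixel counts, concatenated into one vector.
    return np.concatenate([
        np.histogram(rgb[..., c], bins=bins, range=(0, 256))[0]
        for c in range(3)
    ])
```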
Next, in step S15, the feature extraction unit 60 stores together, in the still image meta information storage unit 71, the keyword entered in step S11, the acquisition source information (URI) of the still image, and the feature information extracted in step S14.
When a plurality of still images are acquired in step S11, the processing of steps S12 to S15 may be repeated for each still image in turn, or each of steps S12 to S15 may be performed on all of the images before moving to the next step.
FIG. 4 is a diagram showing an example of the still image meta information stored in the still image meta information storage unit 71 as a result of the series of processes in FIG. 3. Still image meta information numbers 1 to 3 are still image meta information extracted from keywords entered so far, and still image meta information number 4 shows that still image meta information obtained from a newly entered keyword has been added. Accumulating still image meta information corresponding to previously entered keywords in this way speeds up the search when the same keyword is entered again.
The image search control unit 90 may also skip the determination in step S12, perform the processing up to step S14 once, compare the feature information, and refrain from storing new still image meta information when it matches completely. If the feature information matches but the keyword does not, only the keyword may be added to the keyword field of the stored still image meta information; as a result, the same feature information can also be used as a key with a different meaning.
FIG. 5 is a flowchart showing the procedure of acquiring still images from the still image storage unit 80 based on an entered keyword, extracting the features of the acquired still images, and storing them in the still image meta information storage unit 71. This is an excerpt of part of the processing from search to viewing according to the present invention.
First, in step S21, the image search control unit 90 checks whether the still image storage unit 80 contains still images that include the keyword entered through the search key input unit 101 as part of the file name or that hold it as file metadata, and acquires them if they exist.
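A minimal sketch of the step S21 check might look as follows, assuming the still image storage unit 80 corresponds to a local folder and approximating the file-metadata case by matching on the file name only; the folder path and helper name are hypothetical.

```python
from pathlib import Path

def find_local_stills(keyword: str, folder: str = "still_images"):
    # Collect image files whose file names contain the keyword.
    exts = {".jpg", ".jpeg", ".png"}
    return [p for p in Path(folder).iterdir()
            if p.suffix.lower() in exts and keyword.lower() in p.stem.lower()]
```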
Next, in step S22, the image search control unit 90 checks whether any of the still images stored in the still image storage unit 80 have already been acquired and had their feature information extracted, and excludes those from the subsequent processing. Whether extraction has already been done can be determined from the information indicating the acquisition source of each still image stored in the still image meta information storage unit 71 (see FIG. 4), that is, from the file name. If feature information has already been extracted from all of the stored still images, the processing shown in FIG. 5 ends here. If there are still images from which it has not yet been extracted, the procedure proceeds to the next step.
Next, in step S23, the image search control unit 90 has the still image decoding unit 52 decode the still images stored in the still image storage unit 80 whose feature information has not yet been extracted into a format from which feature information can be extracted.
Next, in step S24, the feature extraction unit 60 extracts feature information from each decoded still image.
In step S25, the feature extraction unit 60 stores together, in the still image meta information storage unit 71, the keyword entered in step S21, the file path of the still image, and the feature information extracted in step S24.
The stored information takes the form shown in the example of FIG. 4.
Next, the procedure for obtaining the information used for searching from the moving images to be searched will be described.
An apparatus for viewing moving images (a moving image viewing apparatus) must hold, in advance, information on the sources of the moving images to be searched, that is, on the moving image capturing devices. This can be achieved, for example, by registering several capturing devices in advance, by having the viewer register information on each capturing device individually, or by obtaining the information from a portal site that aggregates information on moving image capturing devices.
The moving image frame acquisition unit 30 of the moving image viewing apparatus acquires, from each of these moving image capturing devices in turn, a snapshot (moving image frame) of the moving image being distributed at that moment, either at regular intervals or at the time a search request is received from the viewer. For example, it acquires the image at an arbitrarily specified time from the moving image distributed by a capturing device; if the frame closest to the specified time is an intra frame, it acquires that single intra frame, and if the frame closest to the specified time is a predicted frame, it acquires the frames needed to decode it, namely the intra frame and the predicted frames.
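The frame-selection rule described above can be sketched as follows, assuming the received stream is available as a time-ordered list of (timestamp, frame type, payload) tuples; this representation and the backward scan to the nearest intra frame are illustrative simplifications, not the embodiment's actual decoder interface.

```python
def frames_needed(stream, target_time):
    # stream: time-ordered list of (timestamp, frame_type, payload) tuples,
    # where frame_type is "I" (intra) or "P" (predicted).
    idx = min(range(len(stream)), key=lambda i: abs(stream[i][0] - target_time))
    if stream[idx][1] == "I":
        return [stream[idx]]  # an intra frame decodes on its own
    # A predicted frame also needs the preceding intra frame and the
    # frames between that intra frame and the target.
    start = idx
    while start > 0 and stream[start][1] != "I":
        start -= 1
    return stream[start:idx + 1]
```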
The image acquired by the moving image frame acquisition unit 30 is decoded by the still image decoding unit 52 of the image decoding unit 50, and the feature extraction unit 60 then extracts features from it by the same method as for still images.
The feature extraction unit 60 stores the extracted feature information, together with the information indicating the acquisition source (URI), in the moving image frame meta information storage unit 72. At that time, if moving image frame meta information with the same acquisition source information already exists in the moving image frame meta information storage unit 72, the feature extraction unit 60 overwrites it. FIG. 6 shows an example of the moving image frame meta information stored in the moving image frame meta information storage unit.
Next, the flow from the viewer performing a search to viewing a moving image will be described with reference to FIG. 7.
In step S31, a screen for searching for moving images is first presented to the viewer on the display unit 110, and by entering some information here the viewer can search for and select the moving image to view. For example, the viewer may enter one or more keywords through the search key input unit 101 of the input interface unit 100, select from a keyword list, or select from a displayed list of images that concisely represent the scenes available for viewing.
Subsequently, in step S32, it is determined whether a still image to be used for the search should be newly acquired from the network. The criterion is whether still image meta information corresponding to the keyword entered in step S31 exists in the still image meta information storage unit 71. Normally, if none exists, a moving image search cannot be performed without acquiring a new still image, so one is acquired in that case. If such meta information does exist, new acquisition is not strictly necessary; however, because using the meta information of a plurality of still images improves the accuracy of the moving image search, a still image for the search is acquired from the network when, for example, the search finds only a certain number of still image meta information items or fewer in the still image meta information storage unit 71. If it can be expected in step S34 that still images within the apparatus can be used for the search, or if it is acceptable for the search to find nothing, acquisition is unnecessary.
If acquisition is to be performed, the procedure proceeds to step S33; otherwise, it proceeds to step S34. In step S33, the series of processes shown in FIG. 3 is performed, and meta information including the feature information of the acquired still images is stored in the still image meta information storage unit 71.
In step S34, the image search control unit 90 determines whether still images within the apparatus should be used as still images for the search. The criterion is whether still image meta information corresponding to the keyword entered in step S31 exists in the still image meta information storage unit 71. Normally, if such meta information exists, there is no need to use still images within the apparatus as search images, but using them is preferable for improving the accuracy of the moving image search.
If they are to be used, the procedure proceeds to step S35; otherwise, it proceeds to step S36. In step S35, the series of processes shown in FIG. 5 is performed, and the feature information is stored in the still image meta information storage unit.
The criteria used for the decisions in steps S32 and S34 may be settings specific to the viewing device or may be freely configurable by the viewer.
In step S36, the image search control unit 90 performs the moving image search processing using the entered keyword, the still image meta information stored in the still image meta information storage unit 71, and the moving image frame meta information stored in the moving image frame meta information storage unit 72.
First, from the still image meta information stored in the still image meta information storage unit 71, the entries whose keywords match part or all of the entered keywords are extracted. The degree of keyword matching does not matter here, but if the number of extracted still image meta information items is large, it is better to narrow them down further to an appropriate number based on, for example, the degree of keyword matching.
Next, the similarity between the feature information of the extracted still image meta information and the feature information of the moving image frame meta information stored in the moving image frame meta information storage unit 72 is calculated. The calculation method depends on the format of the feature information, and even for the same format several calculation methods are conceivable, so it is not particularly limited in the present invention. For example, when a histogram of the image luminance information is used as the feature information as described above, the distance between the values for each luminance in the two pieces of feature information can be calculated and the distances summed to give a measure of similarity. With this method, the more similar the images, the smaller the value; the less similar, the larger the value. Also, with this method, images of the same subject yield different similarities when their resolutions differ, so it is advisable either to resize the images to the same number of pixels before extracting features, or to bring the histograms to the same scale before calculating the similarity, that is, to normalize them so that the sum of all values in each histogram is the same.
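A minimal sketch of this luminance-histogram similarity, with the normalization step applied before summing the per-bin distances, might look as follows; the use of absolute differences is an illustrative choice among the several calculation methods the text allows.

```python
import numpy as np

def histogram_distance(hist_a: np.ndarray, hist_b: np.ndarray) -> float:
    # Bring both histograms to the same scale (their values sum to 1),
    # then sum the per-bin distances; smaller means more similar.
    a = hist_a / max(hist_a.sum(), 1)
    b = hist_b / max(hist_b.sum(), 1)
    return float(np.abs(a - b).sum())
```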
After calculating, by the same procedure, the similarity between every extracted still image meta information item and every moving image frame meta information item, the moving image frame meta information is sorted in order of decreasing similarity, and the URIs are then extracted from it to form the search results.
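Combining the previous sketch with the rule of keeping only the best match per frame when several query still images are used, the ranking step might be sketched as follows, assuming at least one still image meta information item was extracted and assuming dictionary records with "features" and "uri" keys purely for illustration.

```python
def rank_frames(still_metas, frame_metas):
    # still_metas / frame_metas: lists of dicts such as
    # {"features": <histogram>, "uri": <acquisition source>}.
    results = []
    for frame in frame_metas:
        # With several query still images, keep only the best (smallest)
        # distance for this frame, as described above.
        best = min(histogram_distance(s["features"], frame["features"])
                   for s in still_metas)
        results.append((best, frame["uri"]))
    # Smallest distance first, i.e. highest similarity first.
    results.sort(key=lambda r: r[0])
    return [uri for _, uri in results]
```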
The search results are displayed on the search result display unit 111 in step S37. The results may be displayed as a list of URIs, or the corresponding snapshot images acquired earlier may be displayed instead of the URIs. If possible, the moving images may also be received from the respective capturing devices and displayed.
Next, the viewer designates, through the moving image designation unit 102, the moving image to be viewed from among the search results displayed on the search result display unit 111. The image search control unit 90 issues a viewing start request, via the communication unit 20, to the designated URI. When the moving image is received from the moving image capturing device, it is decoded by the moving image decoding unit 51 and displayed on the moving image display unit 112.
In this way, simply by entering a search keyword and selecting the desired moving image from the search results, the viewer can easily and efficiently find the target moving image and view the scene of interest, even when the scene or the content of the distributed video changes from moment to moment.
Although the present invention assumes that moving images are distributed over a communication network, it is also applicable to moving image distribution by broadcasting. In that case, the acquisition source information held as metadata becomes channel information or frequency information. When the broadcast wave contains character information such as program information, that information can also be used as search keywords.
Furthermore, although the present invention assumes moving images captured by capturing devices, it is also applicable to animated moving images.
11 to 14 Moving image capturing device
15, 16 Moving image viewing apparatus
17 Network
18, 19 Server
20 Communication unit
30 Moving image frame acquisition unit
40 Still image acquisition unit
50 Image decoding unit
51 Moving image decoding unit
52 Still image decoding unit
60 Feature extraction unit
70 Meta information storage unit
71 Still image meta information storage unit
72 Moving image frame meta information storage unit
80 Still image storage unit
90 Image search control unit
100 Input interface unit
101 Search key input unit
102 Moving image designation unit
110 Display unit
111 Search result display unit
112 Moving image display unit
Claims (14)
- A moving image viewing apparatus that enables a moving image captured by a moving image capturing device and distributed via a network to be searched for and viewed using still images, comprising:
a communication unit that connects to the network and performs communication;
an input interface unit that receives input from a user;
a display unit that presents search results to the user;
a moving image frame acquisition unit that acquires frame data of a moving image distributed from the network via the communication unit;
a feature extraction unit that extracts feature information indicating image features from the still images used for searching and from the frame data acquired by the moving image frame acquisition unit;
a meta information storage unit that stores meta information of the still images and of the frame data, each including the feature information extracted by the feature extraction unit; and
an image search control unit that controls the moving image search,
wherein the image search control unit searches the still image meta information in the meta information storage unit based on a search instruction entered through the input interface unit,
when still image meta information corresponding to the search instruction exists, calculates the similarity of the feature information of the frame data in the meta information storage unit to the feature information of the still images and selects frame data meta information in descending order of similarity, and
acquires, via the communication unit, the moving image corresponding to the selected frame data meta information based on that meta information and displays it on the display unit.
- The moving image viewing apparatus according to claim 1, wherein the image search control unit selects meta information of frame data whose feature information has a similarity equal to or greater than a predetermined value.
- The moving image viewing apparatus according to claim 1 or 2, wherein the image search control unit displays on the display unit the moving images corresponding to the selected frame data and has the user select the desired moving image through the input interface.
- The moving image viewing apparatus according to any one of claims 1 to 3, further comprising a still image acquisition unit that acquires still images from the network via the communication unit, wherein the image search control unit causes the still image acquisition unit to search for and acquire still images from the network based on a search instruction entered through the input interface unit.
- The moving image viewing apparatus according to any one of claims 1 to 4, further comprising a still image storage unit that stores still images, wherein the image search control unit searches for and acquires still images from the still image storage unit based on a search instruction entered through the input interface unit.
- The moving image viewing apparatus according to claim 4 or 5, wherein the image search control unit searches for and acquires a still image when the meta information storage unit holds no more than a predetermined number of corresponding meta information items.
- The moving image viewing apparatus according to any one of claims 1 to 5, wherein the meta information of the still images includes keywords for performing searches, and the image search control unit performs the search based on a keyword entered through the input interface unit.
- The moving image viewing apparatus according to any one of claims 1 to 5, wherein the meta information of the still images and of the frame data includes position information that uniquely indicates the location of the image, and the image search control unit acquires the image based on the position information.
- The moving image viewing apparatus according to claim 8, wherein the position information is information indicating the position of an image on the network obtainable via the communication unit.
- The moving image viewing apparatus according to claim 8, further comprising a still image storage unit that stores still images, wherein the position information of a still image is information specifying a still image stored in the still image storage unit.
- The moving image viewing apparatus according to any one of claims 1 to 10, wherein the moving image frame acquisition unit acquires the image at an arbitrarily specified time from the moving image distributed by the moving image capturing device, acquires a single intra frame when the frame closest to the specified time is an intra frame, and, when the frame closest to the specified time is a predicted frame, acquires the frames including the intra frame and the predicted frames necessary for decoding that frame.
- The moving image viewing apparatus according to any one of claims 1 to 10, wherein the moving image frame acquisition unit acquires images, according to a specified time, from a plurality of moving image capturing devices specified in advance.
- The moving image viewing apparatus according to any one of claims 1 to 12, wherein the feature extraction unit extracts luminance information of each pixel from an image and outputs, as feature information, the numbers of pixels having the same luminance value arranged in ascending order, descending order, or a predetermined order.
- The moving image viewing apparatus according to any one of claims 1 to 12, wherein the feature extraction unit extracts color information of each pixel from an image and outputs, as feature information, the numbers of pixels having the same color arranged in ascending order, descending order, or a predetermined order.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2008329672 | 2008-12-25 | ||
JP2008-329672 | 2008-12-25 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2010073905A1 true WO2010073905A1 (en) | 2010-07-01 |
Family
ID=42287521
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2009/070566 WO2010073905A1 (en) | 2008-12-25 | 2009-12-08 | Moving image viewing apparatus |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2010073905A1 (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004072504A (en) * | 2002-08-07 | 2004-03-04 | Sony Corp | Device, method and system for displaying image, program and recording medium |
JP2006039753A (en) * | 2004-07-23 | 2006-02-09 | Canon Inc | Image processing apparatus and image processing method |
JP2006129519A (en) * | 2005-12-08 | 2006-05-18 | Hitachi Ltd | Image storing device, monitoring system and storage medium |
JP2007251646A (en) * | 2006-03-16 | 2007-09-27 | Mitsubishi Electric Corp | Monitoring system, data collecting device, and video storing and distributing device |
Non-Patent Citations (2)
Title |
---|
MASAAKI SATO ET AL.: "Fukuso・Kao o Mochiita Jinbutsu Kensaku System", MATSUSHITA TECHNICAL JOURNAL, vol. 52, no. 3, 18 June 2006 (2006-06-18), pages 67 - 71 *
TOSHIHIKO HATA ET AL.: "Eizo Chikuseki・Kensaku・Hyoji Gijutsu", MITSUBISHI DENKI GIHO, vol. 78, no. 8, 25 August 2004 (2004-08-25), pages 47 - 50 *
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013035670A1 (en) * | 2011-09-09 | 2013-03-14 | 株式会社日立製作所 | Object retrieval system and object retrieval method |
JPWO2013035670A1 (en) * | 2011-09-09 | 2015-03-23 | 株式会社日立製作所 | Object search system and object search method |
JP2015527564A (en) * | 2012-06-07 | 2015-09-17 | エフ・ホフマン−ラ・ロシュ・アクチェンゲゼルシャフト | Autoimmune antibody |
JP2016502194A (en) * | 2012-11-30 | 2016-01-21 | トムソン ライセンシングThomson Licensing | Video search method and apparatus |
JP2021100201A (en) * | 2019-12-23 | 2021-07-01 | 横河電機株式会社 | Device, system, method, and program |
JP7205457B2 (en) | 2019-12-23 | 2023-01-17 | 横河電機株式会社 | Apparatus, system, method and program |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8559516B2 (en) | Video sequence ID by decimated scene signature | |
CN104012106B (en) | It is directed at the video of expression different points of view | |
US20210343070A1 (en) | Method, apparatus and electronic device for processing image | |
KR100867005B1 (en) | Method for personal-ordered multimedia data retrieval service and apparatuses thereof | |
US20150227780A1 (en) | Method and apparatus for determining identity and programing based on image features | |
US20090213270A1 (en) | Video indexing and fingerprinting for video enhancement | |
KR102246305B1 (en) | Augmented media service providing method, apparatus thereof, and system thereof | |
JP4428424B2 (en) | Information processing apparatus, information processing method, program, and recording medium | |
US9538246B2 (en) | Map your movie | |
WO2016192501A1 (en) | Video search method and apparatus | |
US11190828B1 (en) | Systems and methods for versatile video recording | |
US20160277808A1 (en) | System and method for interactive second screen | |
JP2010086194A (en) | Share image browsing method and device | |
JP2016532386A (en) | Method for displaying video and apparatus for displaying video | |
JP2006270869A (en) | Associated information acquisition system and method, management apparatus, and associated information transmission program | |
WO2010073905A1 (en) | Moving image viewing apparatus | |
US20140010521A1 (en) | Video processing system, video processing method, video processing apparatus, control method of the apparatus, and storage medium storing control program of the apparatus | |
CN109495789B (en) | Media file playing method, equipment and communication system | |
KR101542416B1 (en) | Method and apparatus for providing multi angle video broadcasting service | |
KR101334127B1 (en) | System and method for providing content sharing service using client terminal | |
JP2006039753A (en) | Image processing apparatus and image processing method | |
CN110100445B (en) | Information processing system, information processing apparatus, and computer readable medium | |
JP4853564B2 (en) | Information processing apparatus, information processing method, program, and recording medium | |
CN107431831B (en) | Apparatus and method for identifying video sequence using video frame | |
JP2015198298A (en) | Video distribution system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 09834708 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 09834708 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: JP |