WO2011158406A1 - Video search device, video search method, recording medium, program, and integrated circuit - Google Patents
Video search device, video search method, recording medium, program, and integrated circuit
- Publication number
- WO2011158406A1 (PCT/JP2011/001596; priority application JP2011001596W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- objects
- weight value
- content
- video search
- unit
- Prior art date
Classifications
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/102—Programmed access in sequence to addressed parts of tracks of operating record carriers
- G11B27/105—Programmed access in sequence to addressed parts of tracks of operating record carriers of operating discs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7837—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/42—Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
- G06V10/421—Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation by analysing segments intersecting the pattern
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/454—Content or additional data filtering, e.g. blocking advertisements
- H04N21/4545—Input to filtering algorithms, e.g. filtering a region of the image
- H04N21/45455—Input to filtering algorithms, e.g. filtering a region of the image applied to a region of the image
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/482—End-user interface for program selection
- H04N21/4828—End-user interface for program selection for searching program descriptors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/84—Generation or processing of descriptive data, e.g. content descriptors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/84—Television signal recording using optical recording
- H04N5/85—Television signal recording using optical recording on discs or drums
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N9/00—Details of colour television systems
- H04N9/79—Processing of colour television signals in connection with recording
- H04N9/80—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
- H04N9/82—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback the individual colour picture signal components being recorded simultaneously only
- H04N9/8205—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback the individual colour picture signal components being recorded simultaneously only involving the multiplexing of an additional signal and the colour video signal
- H04N9/8227—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback the individual colour picture signal components being recorded simultaneously only involving the multiplexing of an additional signal and the colour video signal the additional signal being at least another television signal
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N9/00—Details of colour television systems
- H04N9/79—Processing of colour television signals in connection with recording
- H04N9/80—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
- H04N9/82—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback the individual colour picture signal components being recorded simultaneously only
- H04N9/8205—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback the individual colour picture signal components being recorded simultaneously only involving the multiplexing of an additional signal and the colour video signal
- H04N9/8233—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback the individual colour picture signal components being recorded simultaneously only involving the multiplexing of an additional signal and the colour video signal the additional signal being a character code signal
Definitions
- The present invention relates to technology for searching for related objects and videos based on objects appearing in videos.
- In recent years, storage devices that hold large volumes of video on network servers have become available, along with services for storing and viewing such video.
- Search devices that efficiently select a desired video from the large amount of video stored in such storage devices have also been put into practical use.
- In Patent Document 1, the designation of an object (a person) included in one frame of a video is received from the user, and a feature amount of the designated object is extracted. Other video scenes in which that object appears are then found and displayed by matching against the extracted feature amount.
- The present invention has been made against this background, and its object is to provide a video search apparatus that can contribute to improved search accuracy.
- The video search apparatus comprises: playback means for playing back content composed of a plurality of frames; reception means for receiving from the user, during playback of the content, a plurality of inputs designating objects included in the frames constituting the content;
- detection means for detecting an object in response to each reception by the reception means;
- assigning means for assigning, to each of the plurality of objects detected by the detection means, a weight value adjusted based on the time-series characteristics, on the content, of each frame including each object;
- and search means for performing a search based on the plurality of objects to which the weight values have been assigned.
- Since each object is assigned a weight value adjusted based on the time-series characteristics, on the content, of the frames including it, and the search is performed using the assigned weight values, this can contribute to improved search accuracy.
- Diagram showing the storage contents of the content management information storage unit 104
- Diagram showing the storage contents of the scene information storage unit 105
- Diagram showing the flow for identifying the scene to which an object belongs
- Diagram showing the storage contents of the object information storage unit 106
- Diagram showing an example of feature amount information in the object information storage unit 106
- Diagram showing the thumbnails corresponding to the object IDs in the object information storage unit 106
- Diagram showing how an area is specified
- Diagram showing how features are extracted from a specified area (object)
- Diagram showing an example of feature amount information
- Diagram showing the storage contents of the first buffer 110
- Diagram schematically showing the storage contents of the first buffer 110
- The video search apparatus 101 includes a communication unit 102, a content storage unit 103, a content management information storage unit 104, a scene information storage unit 105, an object information storage unit 106, a playback unit 107, a reception unit 108, an object detection unit 109, a first buffer 110, a weight value assignment unit 111, a second buffer 115, a search unit 116, a display control unit 117, and a display unit 118.
- the communication unit 102 has a function of performing various types of communication, and is composed of, for example, a NIC (Network Interface Card) and receives content via a network. Alternatively, it is composed of an antenna for receiving a broadcast wave, and receives content that arrives on the broadcast wave.
- the content in the present embodiment is video content having a certain length of playback time. Hereinafter, it is simply referred to as content.
- the content storage unit 103 stores a plurality of contents received by the communication unit 102 and contents input from an external medium (such as an optical disk).
- the content management information storage unit 104 stores management information about the content stored in the content storage unit 103.
- The management information includes the items “content ID” 104a for identifying the content, “title” 104b of the content, “genre” 104c of the content, and “content file path” 104d for specifying the location of the content.
- the scene information storage unit 105 stores, for each content stored in the content storage unit 103, a scene included in each content and a frame number range for each scene in association with each other.
- The scene information storage unit 105 includes the items “scene number” 105a, indicating a scene number, and “frame number range” 105b, indicating the frame range corresponding to that scene number.
- FIG. 3 shows only one content (content ID: AAA), but similar information is stored for the other contents (content IDs: ABC, BCD, ZZZ).
- The contents stored in the scene information storage unit 105 are used by the associating unit 113 to identify the scene corresponding to an object. This scene identification method will be described later with reference to FIG.
- The object information storage unit 106 stores information related to the objects included in (appearing in) the frames of the content stored in the content storage unit 103.
- The object information includes an “object ID” 106a for uniquely identifying an object, a “frame number” 106b indicating the number of the frame including the object, and an identifier of the content that includes the frame.
- The storage contents of the object information storage unit 106 are created by the object detection unit 109 extracting the feature amount of objects for each content in the content storage unit 103 and detecting the objects. Which objects in the content are targeted may be set automatically under preset conditions or manually (user-specified).
- An example of feature amount information is shown in FIG. 6.
- “i” and “j” are lattice coordinates
- R, G, and B indicate the ratios of red, green, and blue colors in 256 levels, respectively.
- The grid coordinates represent the position of each cell when the frame is divided into a grid.
- the object information storage unit 106 stores feature amount information as shown in FIG. 6 for each object.
- FIG. 7 is a diagram showing a thumbnail of each object in the object information storage unit 106.
- The objects with IDs “0001” and “0002” are beetles, the object with ID “0003” is a tank, and the object with ID “1000” is a cat.
- Each of the storage units 103 to 106 is implemented by, for example, an HDD (Hard Disk Drive).
- the playback unit 107 plays back the content stored in the content storage unit 103 and causes the display unit 118 to display the playback content.
- the accepting unit 108 accepts various instructions such as a content reproduction instruction and a designation of an area to be detected for the object being reproduced (designation of an object) from the user.
- The reception unit 108 is composed of, for example, a capacitive touch sensor; the position pointed to (touched) on the sensor surface is detected from changes in capacitance and accepted as input.
- other general input devices such as a remote controller may be used as the reception unit 108.
- the object detection unit 109 detects an object based on the area received by the reception unit 108, and extracts a feature amount for the object.
- the first buffer 110 stores the feature amount information of the object extracted by the object detection unit 109.
- The weight value assigning unit 111 assigns to each object stored in the first buffer 110 a weight value that affects the search score (secondary similarity), and includes an initial weight value assigning unit 112, an associating unit 113, and a weight value increasing unit 114.
- the initial weight value assigning unit 112 assigns an initial weight value to each object stored in the first buffer 110.
- A weight value of 0.5 is assigned to each of the three objects with IDs “011”, “012”, and “013”. An image of this initial weight value assignment is shown in FIG.
- the association unit 113 associates the objects stored in the first buffer 110 with each other.
- the associating unit 113 refers to the scene information storage unit 105 and performs associating on the condition that the scenes of the frames including the objects are the same.
- The object with ID “011” falls in scene number “2”. Since object ID “012” has frame number #2500, its scene number is “2”. Since object ID “013” has frame number #3500, its scene number is “3”.
- the associating unit 113 associates “012” with the object ID “011” that is common to the scene number “2”, and associates “011” with the object ID “012”.
- the weight value increasing unit 114 increases the weight value of the associated object.
- the weight value increasing unit 114 increases the weight value by “0.3” for the associated object IDs “011” and “012”, respectively.
- When the series of weight value assignments is finished, the weight value assigning unit 111 stores the processing result in the second buffer 115.
- An example of the contents stored in the second buffer 115 is shown in FIG.
- the second buffer 115 includes an “object ID” 115a, a “related object ID” 115b that identifies an object associated with the object indicated by the object ID, and a “weight value” 115c.
- the search unit 116 searches for similar objects based on the information stored in the first buffer 110 and the second buffer 115, with the contents stored in the object information storage unit 106 as a target.
- The playback unit 107, the reception unit 108, the object detection unit 109, and the weight value assignment unit 111 can be realized, for example, by storing a control program in ROM and having the CPU execute the program.
- the display control unit 117 controls display on the display unit 118.
- the display unit 118 includes a liquid crystal touch screen 801, for example.
- The display unit may be integrated with the video search apparatus or may be a separate unit.
- FIG. 4A shows the relationship between the frame number range and the scene number.
- FIG. 4B shows three frames (frame numbers “#1001”, “#2997”, “#3001”) and the objects (object IDs “001”, “002”, “003”) included in each frame.
- FIG. 4C shows the frame number and scene number corresponding to the object ID.
- the associating unit 113 specifies the frame number including the object (S3501).
- As the frame including the object, a representative frame (for example, the frame first pointed to by the user) is selected.
- the associating unit 113 refers to the contents stored in the scene information storage unit 105 (S3502) and determines a scene number corresponding to the identified frame number (S3503).
- the associating unit 113 specifies the frame number “# 1001” including the object ID “001”. Then, the associating unit 113 refers to the stored contents of the scene information storage unit 105 and determines that “# 1001” is the scene number “2”.
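The scene identification in steps S3501 to S3503 amounts to a range lookup over the frame-number ranges of the scene information storage unit 105. The sketch below assumes illustrative scene ranges (chosen to be consistent with the examples in the text: frame #1001 and #2500 fall in scene 2, #3500 in scene 3); the real table contents are whatever FIG. 3 holds:

```python
# Sketch of the associating unit's scene lookup (S3501-S3503).
# The frame-number ranges below are illustrative assumptions, not the
# actual contents of the scene information storage unit 105.
SCENE_TABLE = [
    (1, (1, 1000)),      # scene 1: frames #1    - #1000
    (2, (1001, 3000)),   # scene 2: frames #1001 - #3000
    (3, (3001, 4000)),   # scene 3: frames #3001 - #4000
]

def scene_of(frame_number):
    """Return the scene number whose frame range contains frame_number."""
    for scene_no, (first, last) in SCENE_TABLE:
        if first <= frame_number <= last:
            return scene_no
    return None  # frame outside every known scene

print(scene_of(1001))  # object ID "001" at frame #1001 -> scene 2
```

With this lookup, the associating unit only needs the representative frame number of each object to decide which objects share a scene.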
- The trajectory 804 is the trajectory of the points input by the user.
- the object detection unit 109 detects the region of the locus 804 as an object.
- the object detection unit 109 extracts a feature amount for the region of the locus 804 that is an object.
- The representative frame (frame number “#99” in the example of FIG. 9) is divided into a grid of w columns and h rows (16 columns and 9 rows in the example of FIG. 9).
- Each divided lattice region is denoted r(i, j), where 1 ≤ i ≤ w and 1 ≤ j ≤ h.
- the object detection unit 109 extracts a set R (O) of lattice areas included in the area O that is an area including the object.
- Whether region O includes the lattice region r(i, j) is determined as follows.
- A line segment PQ is defined connecting the centroid P(x, y) of the lattice region r(i, j) and a point Q very far from P, and the number of intersections between segment PQ and the boundary of region O is denoted N(PQ, O).
- If the intersection count N(PQ, O) is odd, the lattice region r(i, j) is determined to be inside region O; if it is even, it is determined to be outside. In this way, the set R(O) of lattice regions included in region O is obtained.
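The even-odd intersection test above can be sketched as standard ray casting: treat region O as a polygon and count how many of its edges a rightward ray from the cell centroid P crosses. The unit-square region and test points below are illustrative:

```python
# Sketch of the inclusion test: cast a ray from the centroid P of
# lattice cell r(i, j) toward a far point Q and count intersections
# N(PQ, O) with the boundary of region O (odd -> inside).
def point_in_region(p, polygon):
    """Even-odd rule: p is inside if a rightward ray from p crosses an
    odd number of polygon edges."""
    x, y = p
    crossings = 0
    n = len(polygon)
    for k in range(n):
        (x1, y1), (x2, y2) = polygon[k], polygon[(k + 1) % n]
        # Does this edge straddle the horizontal line through p ...
        if (y1 > y) != (y2 > y):
            # ... and is the crossing point to the right of p?
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                crossings += 1
    return crossings % 2 == 1  # odd intersection count -> inside

# Unit square as region O; one cell centroid inside, one outside.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(point_in_region((0.5, 0.5), square))  # True  (inside)
print(point_in_region((1.5, 0.5), square))  # False (outside)
```

Running this test for every cell centroid yields the set R(O) of lattice regions covered by the user's traced area.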
- The object detection unit 109 then obtains feature amount information c(i, j) for each lattice region r(i, j) ∈ R(O) included in region O.
- the feature amount information c (i, j) is a color with the highest frequency in the lattice region r (i, j).
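Building c(i, j) as the most frequent color per included cell can be sketched as below. The tiny pixel lists and the two-cell set R(O) are illustrative assumptions; the embodiment works on a 16×9 grid over the representative frame:

```python
from collections import Counter

# Sketch of feature extraction: for each lattice cell in R(O), take the
# most frequent pixel colour as c(i, j). Frame data and grid size are
# illustrative, not the actual representative-frame contents.
def dominant_color(pixels):
    """Most frequent (R, G, B) tuple among the cell's pixels."""
    return Counter(pixels).most_common(1)[0][0]

def extract_features(frame, cells):
    """frame: dict (i, j) -> list of (R, G, B) pixels; cells: the set R(O)."""
    return {(i, j): dominant_color(frame[(i, j)]) for (i, j) in cells}

frame = {
    (1, 1): [(255, 0, 0), (255, 0, 0), (0, 0, 255)],  # mostly red cell
    (2, 1): [(0, 255, 0), (0, 255, 0), (0, 255, 0)],  # all-green cell
}
features = extract_features(frame, {(1, 1), (2, 1)})
print(features[(1, 1)])  # (255, 0, 0)
```

The resulting mapping from lattice coordinates to dominant colors is what gets stored as the object's feature amount information (cf. FIG. 6).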
- the feature amount information detected by the object detection unit is managed in association with each other in a table format.
- Fig. 10 shows an example of feature information.
- the format of the feature amount information in FIG. 10 is the same as that shown in FIG. 6, and the object ID “xx”, the frame number “# 100”, and the content ID “ABC” are linked.
- the object detection unit 109 repeats the process of detecting an object from the region and extracting the feature amount of the object every time the reception unit 108 receives the specification of the region. Then, the extracted feature amount information and the like are stored in the first buffer 110.
- FIG. 11 is a diagram showing the contents stored in the first buffer 110.
- the first buffer 110 includes an “object ID” 110a for identifying an object, a “frame number” 110b including the object, and “feature information” 110c.
- FIG. 12 is a diagram schematically showing the storage contents of the first buffer 110 of FIG. IDs “011” and “012” are beetles, and ID “013” is a cat object.
- FIG. 12 depicts an image of each object for convenience of explanation, and the actual data format in the first buffer 110 is the feature amount information format as shown in FIG.
- the reception unit 108 receives selection of content to be reproduced (S1501).
- FIG. 16 shows a screen of the touch screen 801 corresponding to step S1501.
- the reception unit 108 waits for an object designation.
- the subsequent steps S1503 to S1505 are the processes described with reference to FIGS. 8 and 9, and the receiving unit 108 receives the designation of the area (S1503).
- the object detecting unit 109 detects the object for the received area (S1504).
- the feature amount is extracted (S1505).
- steps S1503 to S1505 are repeatedly performed until the content reproduction is completed (S1506: Yes).
- The object IDs “011”, “012”, and “013” in the first buffer 110 are stored as a result of the object detection unit 109 repeating the processing of steps S1504 to S1505 three times.
- The weight value assigning unit 111 obtains the frame number corresponding to each object ID from the first buffer 110 (S1701), and the initial weight value assigning unit 112 assigns an initial weight value of “0.5” to each object ID (S1702).
- The associating unit 113 refers to the information in the scene information storage unit 105, identifies the scene number corresponding to the frame number acquired in step S1701 (S1703), and thereby identifies the scene number of each object ID.
- Objects having the same identified scene number are then associated with each other (S1704).
- The weight value increasing unit 114 increases the weight value of each object associated in step S1704 by “0.3”, and the series of processing results is output to the second buffer 115 (S1705).
- each weight value is “0.8” obtained by adding “0.3” to the initial weight value “0.5”.
- The search includes a primary similarity calculation process (S1801), computed from the feature amount information of the objects, and a secondary similarity calculation process (S1802), computed from the calculated primary similarities and the object weight values.
- The search unit 116 sets, as a calculation target, one object O_h for which the primary similarity has not yet been calculated from among the objects stored in the first buffer 110 (S1901), and acquires the feature amount information of that object.
- An example of step S1901 will be described.
- Suppose the first buffer 110 stores three objects O_1 (ID “011”), O_2 (ID “012”), and O_3 (ID “013”).
- The search unit 116 sets object O_1 as the calculation target and acquires the feature amount information of object O_1.
- The search unit 116 sets, as a calculation target, one object P_i for which the primary similarity has not yet been calculated from among the objects stored in the object information storage unit 106 (S1902), and acquires the feature amount information of that object.
- step S1902 will be described.
- Suppose the object information storage unit 106 stores 1,000 objects: P_1 (ID “0001”), P_2 (ID “0002”), P_3 (ID “0003”), and so on.
- The search unit 116 sets object P_1 as the calculation target and acquires the feature amount information of the object set as the calculation target.
- The search unit 116 obtains the primary similarity R_h,i between the object O_h set in step S1901 and the object P_i set in step S1902 (S1903).
- This template matching process (in which a template is moved while being superimposed on an input image and the similarity is determined by examining the correlation of corresponding feature colors) can use an existing method.
- the technique described in Non-Patent Document 2 may be used.
- The primary similarity R_h,i obtained by the search unit 116 is normalized to a value between 0 and 1; the higher the value, the greater the similarity.
- If there is an object P_i for which the primary similarity R_h,i has not yet been calculated (S1904: Yes), the search unit 116 returns to step S1902.
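The primary similarity can be sketched as below. Treating R_h,i as the fraction of shared grid cells whose dominant colors match is a simplifying assumption for illustration; the embodiment proper uses template matching over the color features, but any method normalized to [0, 1] fits the description above:

```python
# Sketch of the primary similarity (S1901-S1904): compare the
# grid-colour feature maps of O_h and P_i. Here R is the fraction of
# shared cells whose dominant colours match, already in [0, 1].
# (Simplified stand-in for the template matching the patent cites.)
def primary_similarity(feat_o, feat_p):
    cells = set(feat_o) & set(feat_p)
    if not cells:
        return 0.0
    matches = sum(1 for c in cells if feat_o[c] == feat_p[c])
    return matches / len(cells)

o = {(1, 1): (255, 0, 0), (2, 1): (0, 255, 0)}   # query object O_h
p = {(1, 1): (255, 0, 0), (2, 1): (0, 0, 255)}   # stored object P_i
print(primary_similarity(o, p))  # 0.5 - one of two shared cells matches
```

This also illustrates the failure mode discussed next: an unrelated object whose colors happen to agree cell-for-cell would score just as high as a true match.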
- FIG. 20 shows an example of the primary similarities R_h,i.
- the similarity between the object IDs “0002” and “0001” of the same beetle is high.
- For object ID “011”, the similarity to the tank object ID “0003” is also high, ranking second.
- The tank object with ID “0003” merely happens to have a color combination similar to the beetle object with ID “011”; for a user who searched using ID “011” (wanting to find a beetle), this result is considered unintended.
- In the secondary similarity calculation, the search unit 116 sets, as a calculation target, one object O_h for which the secondary similarity has not yet been calculated from among the objects stored in the first buffer 110 (S2101).
- The related objects of the object set as the calculation target are acquired with reference to the second buffer 115 (S2102).
- The secondary similarity S_h,i is obtained by summing the weighted primary similarities of the target object and its related objects (S2104).
- Suppose object O_1 is set as the target in step S2101 and object P_1 is set as the calculation target in step S2103, as in FIG. 22.
- A specific example of step S2104 will be described with reference to FIG. 23.
- The primary similarities R_1,1 through R_1,1000 have been obtained, and object O_1 is associated with object O_2 (object O_1 has the related object O_2).
- The first term “R_1,1 × w1” is the primary similarity R_1,1 between object O_1 itself and the target object P_1, multiplied by its own weight value w1.
- The second term “R_2,1 × w2” is the primary similarity R_2,1 between O_1's related object O_2 and the target object P_1, multiplied by the weight value w2 of the related object.
- In general, the secondary similarity S is obtained by taking (A) the primary similarity between an object O_h detected by the object detection unit 109 and an object P_i stored in the object information storage unit 106, and (B) the primary similarity between each object O_h^(1) associated with O_h and the object P_i, multiplying each by the weight value of the respective object O_h or O_h^(1), and adding the results.
- FIG. 24 shows a generalized image of the secondary similarity calculation: the method of calculating the secondary similarity S_h,i between an object O_h having j related objects and an object P_i.
- By repeating this series of processes, the search unit 116 determines the secondary similarities (S_1,1, S_1,2, ..., S_1,1000, S_2,1, S_2,2, ..., S_3,1000) based on the primary similarities (R_1,1, R_1,2, ..., R_1,1000, R_2,1, R_2,2, ..., R_3,1000) (S2105, S2106).
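The secondary similarity calculation above can be sketched directly from the worked example S_1,1 = R_1,1 × w1 + R_2,1 × w2. The specific R values below are illustrative assumptions; the weights 0.8 follow the embodiment's same-scene example:

```python
# Sketch of secondary similarity (S2101-S2106):
#   S[h][i] = R[h][i] * w[h] + sum over related objects h' of R[h'][i] * w[h']
# mirroring the worked example S_1,1 = R_1,1*w1 + R_2,1*w2.
def secondary_similarity(h, i, R, weights, related):
    score = R[h][i] * weights[h]          # the object's own weighted term
    for h2 in related.get(h, []):         # plus each related object's term
        score += R[h2][i] * weights[h2]
    return score

# O_1 (weight 0.8) is related to O_2 (weight 0.8); target object P_1.
# The primary similarities 0.9 and 0.7 are illustrative values.
R = {"O1": {"P1": 0.9}, "O2": {"P1": 0.7}}
weights = {"O1": 0.8, "O2": 0.8}
related = {"O1": ["O2"]}
print(secondary_similarity("O1", "P1", R, weights, related))  # 0.9*0.8 + 0.7*0.8
```

Because the related object's term only boosts P_i objects that also resemble O_2, a coincidental color match like the tank object tends to fall in the ranking, which is the intended effect of the weighting.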
- FIG. 25 shows an example of secondary similarity.
- the search unit 116 displays the search result (S1803).
- FIG. 26 is a diagram showing an example of a search result.
- In the upper part, thumbnails of the three objects used for the search (IDs “011”, “012”, “013”) are displayed, and in the lower part, thumbnails 51 to 53 of the objects ranking highest in secondary similarity to object ID “011” are displayed.
- For example, when the thumbnail 51 is selected, the search unit 116 identifies from the object information storage unit 106 the frame number “#1234” and the content ID “ABC” containing the object ID “0002” (see FIG. 7) corresponding to thumbnail 51. The search unit 116 then causes the playback unit 107 to start playback from a frame number slightly before frame number “#1234” of the content ID “ABC”.
- search result in FIG. 24 is merely an example.
- for example, the average of the secondary similarities of the three objects used in the search may be computed and the top three results displayed. The number of results is not limited to three; any number may be used. Furthermore, not only the rank of the search results but also the secondary similarity value (search score) may be displayed.
- FIG. 27 illustrates the flow of operations described so far as seen from the user interface side.
- of the three objects (IDs "011", "012", "013") designated by the user by selecting areas, the objects with IDs "011" and "012" belong to the same scene "2".
- accordingly, the weight values of IDs "011" and "012" are increased by "0.3".
- the secondary similarity is obtained by applying these weight values to the primary similarity.
- in this example, only two objects share the same scene "2"; however, as the number of objects used for the search increases to 10 or 20, it becomes possible to reduce the likelihood that objects that merely happen to have similar color combinations, as described above, occupy the top of the search results.
- although the present embodiment has been described above, the present invention is not limited to the above contents and can also be implemented in various forms that achieve the object of the present invention or objects related or incidental thereto. For example, the following variations are possible.
- when the accepting unit 108 accepts designation of an area during content playback, frames elapse between the start and the end of the input of points for the area designation.
- for this reason, the frame at the time the point input started (the time of acceptance by the reception unit 108), for example frame number "#100", need not itself be the detection target.
- if the correction value corresponds to one frame, the frame "#99" immediately before frame number "#100" becomes the target frame.
- suppose the reception unit 108 receives a point A (x1, y1), a single point on the touch screen 801.
- the object detection unit 109 performs edge detection on the received frame and, among the objects detected by edge detection, detects the object that includes point A.
- for edge detection, a general method such as the Canny method can be used (see Non-Patent Document 1).
- the object may be detected based on the point designated by the user (point designation).
- the area designation or point designation may be selectively used based on user settings.
- alternatively, the object detection unit 109 may determine that the input is a point designation if the number of points input within a certain time t is at most c and the distance between the points is at most d, and determine that it is an area designation otherwise.
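The heuristic above can be sketched as follows; the concrete values for c and d are illustrative assumptions (the text only names the parameters t, c, and d).

```python
import math

def classify_designation(points, c=3, d=10.0):
    """Classify the points received within a fixed time window t as a
    'point' designation or an 'area' designation.

    Following the rule in the text: if the number of points is at most c
    and every pair of points is within distance d, treat the input as a
    point designation; otherwise treat it as an area designation.
    (The values of c and d here are illustrative assumptions.)
    """
    if len(points) <= c and all(
        math.dist(p, q) <= d for p in points for q in points
    ):
        return "point"
    return "area"

# A single tap counts as a point designation; a spread of taps as an area.
single_tap = classify_designation([(100, 100)])
spread = classify_designation([(0, 0), (200, 150), (50, 300), (10, 10)])
```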
- in the above, the associating unit 113 determines whether to associate objects based on the identity of the scene to which the frames including the objects belong, but the invention is not limited to this.
- the association may instead be performed on the condition that the chapters are the same.
- alternatively, the association may be performed on the condition that the playback times of the frames including the respective objects are within a certain interval (for example, within 3 minutes).
- the association may also be given a direction that takes the playback order of frames (the order in which objects appear) into consideration, for example associating from object O1 to object O2 but not from O2 back to O1.
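The association variants described above (same scene, playback-time window, direction by appearance order) could be sketched as follows; the function name and data layout are assumptions, not the patented implementation.

```python
def associate(objects, same_scene=True, window=None, directed=False):
    """Build an association map between detected objects.

    objects:    list of dicts like {"id": ..., "scene": ..., "time": ...}
    same_scene: associate only objects whose frames share a scene
    window:     if set, also require playback times within `window` seconds
    directed:   if True, associate only from earlier to later objects
    """
    relations = {o["id"]: [] for o in objects}
    for a in objects:
        for b in objects:
            if a["id"] == b["id"]:
                continue
            if same_scene and a["scene"] != b["scene"]:
                continue
            if window is not None and abs(a["time"] - b["time"]) > window:
                continue
            if directed and a["time"] > b["time"]:
                continue  # only earlier -> later
            relations[a["id"]].append(b["id"])
    return relations

# The example from the document: "011" and "012" share scene "2",
# while "013" belongs to scene "3" and is not associated.
objects = [
    {"id": "011", "scene": 2, "time": 100},
    {"id": "012", "scene": 2, "time": 125},
    {"id": "013", "scene": 3, "time": 175},
]
rel = associate(objects)
```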
- for example, when the number of intervening objects is small (when the recursion is shallow), the weight value increment Δw may be set relatively large, and when the number of intervening objects is large (when the recursion is deep), the increment Δw may be set relatively small.
- in the above, directionality is given to the association, but such recursive association can also be applied to associations having no directionality.
- in the above, the weight value increasing unit 114 uniformly adds an increment of "0.3" to the weight value of any object having a related object, but the invention is not limited to this.
- for example, the frequency of appearance of each object detected by the object detection unit 109 may be counted. Specifically, a "frequency" item is provided in the data sequence of the second buffer in FIG. 14; if the frequency is high, a value larger than "0.3" (for example, "0.5") is used as the increment, and if the frequency is low, a value smaller than "0.3" (for example, "0.1") may be used as the increment.
- similarly, the appearance time may be counted for each object detected by the object detection unit 109. Specifically, an "appearance time" item is provided in the data sequence of the second buffer in FIG. 14; if the appearance time is long, a value larger than "0.3" (for example, "0.5") may be used as the increment, and if it is short, a value smaller than "0.3" (for example, "0.1").
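A sketch of how the increment Δw could be chosen from the counted frequency or appearance time. The increments 0.3, 0.5, and 0.1 come from the text; the thresholds are illustrative assumptions.

```python
def weight_increment(frequency=None, appearance_time=None,
                     base=0.3, high=0.5, low=0.1,
                     freq_threshold=5, time_threshold=30.0):
    """Choose the weight-value increment for a related object.

    Returns `high` for objects that appear frequently or for a long time,
    `low` for objects that appear rarely or briefly, and `base` (the
    uniform 0.3 increment) when no count is available. The threshold
    values are assumptions; the document only fixes the increments.
    """
    if frequency is not None:
        return high if frequency >= freq_threshold else low
    if appearance_time is not None:
        return high if appearance_time >= time_threshold else low
    return base
```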
- as a history associated with frame numbers, a history indicating whether fast-forward or rewind was performed may be stored.
- for example, if the history indicates that the frame with frame number "#2000" was fast-forwarded, the weight value of the object ID "011" (see FIG. 11) included in that frame may be made smaller. This is because an object included in a fast-forwarded frame is considered unimportant to the user.
- conversely, the weight value of an object included in a rewound frame may be increased.
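A sketch of the history-based adjustment described above; the step size of 0.2 and the history encoding are assumptions, while the direction of the adjustment (lower for fast-forwarded frames, higher for rewound frames) follows the text.

```python
def adjust_for_history(weight, frame_no, history,
                       ff_decrement=0.2, rw_increment=0.2):
    """Adjust an object's weight value based on playback history.

    history maps frame numbers to 'ff' (fast-forwarded) or 'rw' (rewound).
    Objects in fast-forwarded frames are considered unimportant to the
    user, so their weight is reduced; objects in rewound frames get an
    increased weight. The 0.2 step sizes are illustrative assumptions.
    """
    mark = history.get(frame_no)
    if mark == "ff":
        return weight - ff_decrement
    if mark == "rw":
        return weight + rw_increment
    return weight
```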
- the search unit 116 may also perform the search taking the appearance order of objects into consideration.
- for example, information indicating the appearance order of objects may be stored in the object information storage unit 106, and the secondary similarity of stored objects whose order highly matches the order of the objects detected by the object detection unit 109 may be increased.
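One way the degree of order match could be measured is with a longest-common-subsequence (LCS) score. Both the measure and the bonus size are assumptions; the text only requires that similarity rise with the degree of match.

```python
def order_match_bonus(detected_order, stored_order, bonus=0.1):
    """Return a similarity bonus based on how well the appearance order of
    stored objects matches the order of the detected objects.

    Uses the length of the longest common subsequence of the two ID
    sequences, scaled by `bonus` per matching position. (LCS as the
    matching measure and the bonus size are illustrative assumptions.)
    """
    m, n = len(detected_order), len(stored_order)
    # dp[i][j] = LCS length of detected_order[:i] and stored_order[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if detected_order[i] == stored_order[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    return dp[m][n] * bonus
```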
- the objects detected by the object detection unit 109 may be accumulated as a database, and the associating unit 113 may then treat the accumulated objects as targets of association.
- for example, the associating unit 113 may associate objects having the same series name.
- the weight value increasing unit 114 may also increase the weight value as the size that the related object occupies in the frame increases.
- in the embodiment, the weight value assigning unit 111 adjusts weight values based on the associations made by the associating unit 113.
- however, the invention is not limited to this; it is also conceivable to adjust the weight values without performing association, for example by increasing the weight values of objects belonging to the same scene.
- each functional block in FIG. 1 and the like may be realized as an LSI, i.e., an integrated circuit. The blocks may be individually implemented as single chips, or a single chip may include some or all of them. Although the term LSI is used here, the circuit may be called an IC, system LSI, super LSI, or ultra LSI depending on the degree of integration. The method of circuit integration is not limited to LSI; implementation using dedicated circuitry or general-purpose processors is also possible. An FPGA (Field Programmable Gate Array) that can be programmed after LSI manufacture, or a reconfigurable processor in which the connections and settings of circuit cells inside the LSI can be reconfigured, may also be used. Furthermore, if circuit integration technology replacing LSI emerges through advances in semiconductor technology or derivative technologies, the functional blocks may naturally be integrated using that technology.
- a control program composed of program code for causing processors of various information processing apparatuses, and various circuits connected to such processors, to execute the operations described in the above embodiment can be recorded on recording media or distributed via various communication channels.
- such recording media include non-transitory recording media such as IC cards, hard disks, optical discs, flexible disks, and ROMs.
- the distributed control program is used by being stored in a memory or the like readable by a processor, and the various functions described in the embodiment are realized by the processor executing the control program.
- <Supplement 2> The present embodiment includes the following aspects.
- a video search apparatus according to the present embodiment comprises: playback means for playing back content composed of a plurality of frames; accepting means for accepting, a plurality of times during playback of the content, input from a user for designating an object included in a frame constituting the content; detecting means for detecting an object in response to acceptance by the accepting means; assigning means for assigning, to each of the plurality of objects detected by the detecting means, a weight value adjusted based on time-series characteristics of the frames including each object; and search means for performing a search based on the plurality of objects to which the weight values have been assigned.
- the assigning means may include: associating means for associating objects with each other, for each of the plurality of objects detected by the detecting means, based on the time-series characteristics of the frames including each object; and increasing means for relatively increasing the weight value of an associated object compared with the weight value of a non-associated object.
- the content may be divided into a plurality of scenes on its playback time axis, and the associating means may associate the objects with each other based on the identity of the scenes of the frames including the objects.
- an appropriate weight value can be assigned to each object by association based on scene identity.
- the content may be divided into a plurality of chapters on its playback time axis, and the associating means may associate the objects with each other based on the identity of the chapters of the frames including the objects.
- association means may relate objects that are indirectly related to each other through other objects.
- for objects indirectly associated via other objects, the increasing means may adjust the amount by which the weight value is increased according to the number of intervening objects.
- the associating means may perform association from an object whose frame has an earlier playback time to an object whose frame has a later playback time, and not perform association from an object whose frame has a later playback time to an object whose frame has an earlier playback time.
- an appropriate weight value can be given to each object by giving direction to the association.
- storage means for storing a plurality of objects and feature amount information of each object may be provided; the detecting means may extract feature amount information for each detected object, and the search means may search for objects similar to the objects detected by the detecting means by collating the extracted feature amount information with the feature amount information stored in the storage means.
- storage means for storing objects and feature amount information of each object may be provided; the detecting means may extract feature amount information for each detected object, and the assigning means may assign a weight value to each object. The search means may then calculate a primary similarity by collating the feature amount information of an object detected by the detecting means with the feature amount information of each object stored in the storage means, and calculate a secondary similarity by adding, to the value of the primary similarity, the values obtained by multiplying the primary similarities of the associated other objects by their weight values.
- frequency counting means for counting the frequency with which an associated object appears in the content may be provided, and the increasing means may relatively increase the weight value of the associated object, compared with the weight values of non-associated objects, as the frequency counted for that object increases.
- the apparatus may further include time counting means for counting, for an associated object, the length on the playback time axis for which it appears in the content, and the increasing means may relatively increase the weight value of the associated object, compared with the weight values of non-associated objects, as the counted length increases.
- the increasing means may relatively increase the weight value of an associated object, compared with the weight values of non-associated objects, as the size that the associated object occupies in the frame increases.
- history storage means for storing information identifying frames that have been fast-forwarded or rewound by the playback means may be provided; the increasing means may, by referring to the history storage means, reduce the amount of increase in the weight value of an associated object if the history indicates that the frame including it has been fast-forwarded, or enlarge the amount of increase if the history indicates that the frame including it has been rewound.
- storage means for storing a plurality of objects and the order in which each object appears on the playback time axis in the content may be provided; the detecting means may determine, for the detected plurality of objects, the order in which they appear on the playback time axis in the content, and the search means may search, among the plurality of objects stored in the storage means, for objects whose order highly matches the order of the plurality of objects detected by the detecting means.
- accumulating means for storing, in association with each other, the plurality of objects detected by the detecting means and the weight value of each object may be provided, and the associating means may treat the accumulated plurality of objects as targets of the association.
- the accumulating means may store series identification information for each of the plurality of objects to be accumulated; information indicating a series name may be associated with each of the plurality of objects detected by the detecting means; and the associating means may refer to the accumulated plurality of objects and associate, with each of the detected objects, the accumulated objects whose series names match.
- a video search method according to the present embodiment includes: a playback step of playing back content composed of a plurality of frames; an accepting step of accepting, a plurality of times during playback of the content, input from a user for designating an object included in a frame constituting the content; a detecting step of detecting an object in response to acceptance in the accepting step; an assigning step of assigning, to each of the plurality of objects detected in the detecting step, a weight value adjusted based on time-series characteristics of the frames including each object; and a search step of performing a search based on the plurality of objects to which the weight values have been assigned.
- a program according to the present embodiment is a program for causing a computer to execute a video search process, the video search process including: a playback step of playing back content composed of a plurality of frames; an accepting step of accepting, a plurality of times during playback of the content, input from a user for designating an object included in a frame constituting the content; a detecting step of detecting an object in response to the acceptance; an assigning step of assigning, to each of the plurality of objects detected in the detecting step, a weight value adjusted based on time-series characteristics of the frames including each object; and a search step of performing a search based on the plurality of objects to which the weight values have been assigned.
- an integrated circuit according to the present embodiment comprises: playback means for playing back content composed of a plurality of frames; accepting means for accepting, a plurality of times during playback of the content, input from a user for designating an object included in a frame constituting the content; detecting means for detecting an object in response to acceptance by the accepting means; assigning means for assigning, to each of the plurality of objects detected by the detecting means, a weight value adjusted based on time-series characteristics, on the content, of the frames including each object; and search means for performing a search based on the plurality of objects to which the weight values have been assigned.
- the video search apparatus according to the present invention is useful because it can contribute to improvement of search accuracy.
Description
(Embodiment 1)
<Configuration>
As shown in FIG. 1, the video search apparatus 101 comprises a communication unit 102, a content storage unit 103, a content management information storage unit 104, a scene information storage unit 105, an object information storage unit 106, a playback unit 107, a reception unit 108, an object detection unit 109, a first buffer 110, a weight value assigning unit 111, a second buffer 115, a search unit 116, a display control unit 117, and a display unit 118.
- object ID "011" has frame number #2000, so its scene number is "2"
- object ID "012" has frame number #2500, so its scene number is "2"
- object ID "013" has frame number #3500, so its scene number is "3"
Next, the operation of the video search apparatus 101 will be described.
The secondary similarity is obtained by the equation
S1,1 = R1,1 × w1 + R2,1 × w2 ... (Equation 1)
(A) the primary similarity between an object Oh detected by the object detection unit 109 and an object Pi stored in the object information storage unit 106
(B) the primary similarity between an object Oh(1) related to the object Oh and the same object Pi
The secondary similarity is the sum of these two similarities, each multiplied by the weight of the respective object Oh or Oh(1).
<Supplement 1>
Although the present embodiment has been described above, the present invention is not limited to the above contents and can also be implemented in various forms that achieve the object of the present invention or objects related or incidental thereto; for example, the following.
(A) A delay when designation is made using an input device remote from the video search apparatus (for example, designation with a mouse connected via Bluetooth (trademark))
(B) A delay required for processing and display on the touch screen 801
Since delays such as these (of, for example, several milliseconds) may occur, a correction value δ that takes (A) and (B) into account may be used.
<Supplement 2>
The present embodiment includes the following aspects.
The video search process includes: a playback step of playing back content composed of a plurality of frames; an accepting step of accepting, a plurality of times during playback of the content, input from a user for designating an object included in a frame constituting the content; a detecting step of detecting an object in response to acceptance in the accepting step; an assigning step of assigning, to each of the plurality of objects detected in the detecting step, a weight value adjusted based on time-series characteristics of the frames including each object; and a search step of performing a search based on the plurality of objects to which the weight values have been assigned.
102 Communication unit
103 Content storage unit
104 Content management information storage unit
105 Scene information storage unit
106 Object information storage unit
107 Playback unit
108 Reception unit
109 Object detection unit
110 First buffer
111 Weight value assigning unit
112 Initial weight value assigning unit
113 Associating unit
114 Weight value increasing unit
115 Second buffer
116 Search unit
117 Display control unit
118 Display unit
801 Touch screen
Claims (19)
- A video search apparatus comprising: playback means for playing back content composed of a plurality of frames; accepting means for accepting, a plurality of times during playback of the content, input from a user for designating an object included in a frame constituting the content; detecting means for detecting an object in response to acceptance by the accepting means; assigning means for assigning, to each of the plurality of objects detected by the detecting means, a weight value adjusted based on time-series characteristics of the frames including each object; and search means for performing a search based on the plurality of objects to which the weight values have been assigned.
- The video search apparatus according to claim 1, wherein the assigning means includes: associating means for associating objects with each other, for each of the plurality of objects detected by the detecting means, based on the time-series characteristics of the frames including each object; and increasing means for relatively increasing the weight values of the associated objects compared with the weight values of non-associated objects.
- The video search apparatus according to claim 2, wherein the content is divided into a plurality of scenes on its playback time axis, and the associating means associates the objects with each other based on the identity of the scenes of the frames including the objects.
- The video search apparatus according to claim 2, wherein the content is divided into a plurality of chapters on its playback time axis, and the associating means associates the objects with each other based on the identity of the chapters of the frames including the objects.
- The video search apparatus according to claim 2, wherein the associating means associates objects that are indirectly related to each other via other objects.
- The video search apparatus according to claim 5, wherein, for objects indirectly associated via other objects, the increasing means adjusts the amount by which the weight value is increased according to the number of intervening objects.
- The video search apparatus according to claim 2, wherein the associating means performs association from an object whose frame has an earlier playback time to an object whose frame has a later playback time, and does not perform association from an object whose frame has a later playback time to an object whose frame has an earlier playback time.
- The video search apparatus according to claim 2, further comprising storage means for storing a plurality of objects and feature amount information of each object, wherein the detecting means extracts feature amount information for each detected object, and the search means searches for objects similar to the objects detected by the detecting means by collating the feature amount information extracted by the detecting means with the feature amount information stored in the storage means.
- The video search apparatus according to claim 2, further comprising storage means for storing objects and feature amount information of each object, wherein the detecting means extracts feature amount information for each detected object, the assigning means assigns a weight value to each of the objects, and the search means calculates a primary similarity by collating the feature amount information of an object detected by the detecting means with the feature amount information of each object stored in the storage means, and calculates a secondary similarity by adding, to the value of the primary similarity, the value obtained by multiplying by the weight value of the relevant other object.
- The video search apparatus according to claim 2, further comprising frequency counting means for counting the frequency with which an associated object appears in the content, wherein the increasing means relatively increases the weight value of the associated object, compared with the weight values of non-associated objects, as the frequency counted for the associated object increases.
- The video search apparatus according to claim 2, further comprising time counting means for counting, for an associated object, the length on the playback time axis for which it appears in the content, wherein the increasing means relatively increases the weight value of the associated object, compared with the weight values of non-associated objects, as the length counted for the associated object increases.
- The video search apparatus according to claim 2, wherein the increasing means relatively increases the weight value of an associated object, compared with the weight values of non-associated objects, as the size that the associated object occupies in the frame increases.
- The video search apparatus according to claim 2, further comprising history storage means for storing information identifying frames that have been fast-forwarded or rewound by the playback means, wherein the increasing means, referring to the history storage means, reduces the amount of increase in the weight value of an associated object if the history indicates that the frame including the associated object has been fast-forwarded, or the increasing means, referring to the history storage means, enlarges the amount of increase in the weight value of an associated object if the history indicates that the frame including the associated object has been rewound.
- The video search apparatus according to claim 1, further comprising storage means for storing a plurality of objects and the order in which each object appears on the playback time axis in the content, wherein the detecting means determines, for the detected plurality of objects, the order in which they appear on the playback time axis in the content, and the search means searches, among the plurality of objects stored in the storage means, for objects whose order highly matches the order of the plurality of objects detected by the detecting means.
- The video search apparatus according to claim 2, further comprising accumulating means for accumulating, in association with each other, the plurality of objects detected by the detecting means and the weight value of each object, wherein the associating means treats the accumulated plurality of objects as targets of the association.
- The video search apparatus according to claim 15, wherein the accumulating means stores series identification information for each of the plurality of objects to be accumulated, information indicating a series name is associated with each of the plurality of objects detected by the detecting means, and the associating means refers to the accumulated plurality of objects and associates, with each of the plurality of objects detected by the detecting means, the objects whose series names match the series name of that object.
- A video search method comprising: a playback step of playing back content composed of a plurality of frames; an accepting step of accepting, a plurality of times during playback of the content, input from a user for designating an object included in a frame constituting the content; a detecting step of detecting an object in response to acceptance in the accepting step; an assigning step of assigning, to each of the plurality of objects detected in the detecting step, a weight value adjusted based on time-series characteristics of the frames including each object; and a search step of performing a search based on the plurality of objects to which the weight values have been assigned.
- A program for causing a computer to execute a video search process, the video search process comprising: a playback step of playing back content composed of a plurality of frames; an accepting step of accepting, a plurality of times during playback of the content, input from a user for designating an object included in a frame constituting the content; a detecting step of detecting an object in response to acceptance in the accepting step; an assigning step of assigning, to each of the plurality of objects detected in the detecting step, a weight value adjusted based on time-series characteristics of the frames including each object; and a search step of performing a search based on the plurality of objects to which the weight values have been assigned.
- An integrated circuit comprising: playback means for playing back content composed of a plurality of frames; accepting means for accepting, a plurality of times during playback of the content, input from a user for designating an object included in a frame constituting the content; detecting means for detecting an object in response to acceptance by the accepting means; assigning means for assigning, to each of the plurality of objects detected by the detecting means, a weight value adjusted based on time-series characteristics, on the content, of the frames including each object; and search means for performing a search based on the plurality of objects to which the weight values have been assigned.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201180003170.XA CN102474586B (zh) | 2010-06-16 | 2011-03-17 | 影像检索装置、影像检索方法、记录介质、程序、集成电路 |
US13/389,144 US8718444B2 (en) | 2010-06-16 | 2011-03-17 | Video search device, video search method, recording medium, program, and integrated circuit |
JP2012520244A JP5632472B2 (ja) | 2010-06-16 | 2011-03-17 | 映像検索装置、映像検索方法、記録媒体、プログラム、集積回路 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2010-137072 | 2010-06-16 | ||
JP2010137072 | 2010-06-16 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011158406A1 true WO2011158406A1 (ja) | 2011-12-22 |
Family
ID=45347824
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2011/001596 WO2011158406A1 (ja) | 2010-06-16 | 2011-03-17 | 映像検索装置、映像検索方法、記録媒体、プログラム、集積回路 |
Country Status (4)
Country | Link |
---|---|
US (1) | US8718444B2 (ja) |
JP (1) | JP5632472B2 (ja) |
CN (1) | CN102474586B (ja) |
WO (1) | WO2011158406A1 (ja) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130007807A1 (en) * | 2011-06-30 | 2013-01-03 | Delia Grenville | Blended search for next generation television |
EP2816564B1 (en) * | 2013-06-21 | 2020-07-22 | Nokia Technologies Oy | Method and apparatus for smart video rendering |
US9600723B1 (en) | 2014-07-03 | 2017-03-21 | Google Inc. | Systems and methods for attention localization using a first-person point-of-view device |
JP6704797B2 (ja) * | 2016-06-01 | 2020-06-03 | キヤノン株式会社 | 画像検索装置、その制御方法、およびプログラム |
CN110135483A (zh) * | 2019-04-30 | 2019-08-16 | 北京百度网讯科技有限公司 | 训练图像识别模型的方法、装置及相关设备 |
CN111970525B (zh) * | 2020-08-14 | 2022-06-03 | 北京达佳互联信息技术有限公司 | 直播间搜索方法、装置、服务器及存储介质 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002373177A (ja) * | 2001-06-15 | 2002-12-26 | Olympus Optical Co Ltd | 類似オブジェクト検索方法及び装置 |
JP2003141161A (ja) * | 2001-10-29 | 2003-05-16 | Olympus Optical Co Ltd | マルチメディアオブジェクト検索方法およびシステム |
JP2009232250A (ja) * | 2008-03-24 | 2009-10-08 | Panasonic Corp | 番組情報表示装置および番組情報表示方法 |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3220493B2 (ja) | 1991-12-20 | 2001-10-22 | 株式会社シーエスケイ | 動画編集処理の場面転換部検出方法 |
JP3711993B2 (ja) | 1993-10-25 | 2005-11-02 | 株式会社日立製作所 | 映像の連想検索装置 |
US6195497B1 (en) | 1993-10-25 | 2001-02-27 | Hitachi, Ltd. | Associated image retrieving apparatus and method |
CN1293793B (zh) * | 1999-01-29 | 2010-05-12 | Lg电子株式会社 | 多媒体数据的搜索或浏览方法 |
JP2005107767A (ja) | 2003-09-30 | 2005-04-21 | Nippon Telegr & Teleph Corp <Ntt> | 映像検索装置、映像検索方法および映像検索プログラム |
JP4009959B2 (ja) | 2004-01-07 | 2007-11-21 | 船井電機株式会社 | テレビ受信機 |
JP4367264B2 (ja) * | 2004-07-12 | 2009-11-18 | セイコーエプソン株式会社 | 画像処理装置、画像処理方法、および、画像処理プログラム |
JP3674633B2 (ja) | 2004-11-17 | 2005-07-20 | カシオ計算機株式会社 | 画像検索装置、電子スチルカメラ、および画像検索方法 |
JP5135733B2 (ja) * | 2006-08-10 | 2013-02-06 | ソニー株式会社 | 情報記録装置及び情報記録方法、並びにコンピュータ・プログラム |
US8196045B2 (en) * | 2006-10-05 | 2012-06-05 | Blinkx Uk Limited | Various methods and apparatus for moving thumbnails with metadata |
JP2009296346A (ja) | 2008-06-05 | 2009-12-17 | Sony Corp | 番組推薦装置、番組推薦方法及び番組推薦プログラム |
JP5335302B2 (ja) * | 2008-06-30 | 2013-11-06 | キヤノン株式会社 | 焦点検出装置及びその制御方法 |
JP4711152B2 (ja) * | 2008-12-26 | 2011-06-29 | ソニー株式会社 | コンテンツ表示制御装置および方法、プログラム、並びに記録媒体 |
- 2011-03-17: US application US 13/389,144 filed (US8718444B2, active)
- 2011-03-17: JP application 2012-520244 filed (JP5632472B2, active)
- 2011-03-17: PCT application PCT/JP2011/001596 filed (WO2011158406A1)
- 2011-03-17: CN application 201180003170.X filed (CN102474586B, active)
Also Published As
Publication number | Publication date |
---|---|
JP5632472B2 (ja) | 2014-11-26 |
US8718444B2 (en) | 2014-05-06 |
CN102474586B (zh) | 2015-10-21 |
JPWO2011158406A1 (ja) | 2013-08-19 |
CN102474586A (zh) | 2012-05-23 |
US20120134648A1 (en) | 2012-05-31 |
Legal Events
- WWE (WIPO information: entry into national phase): ref document number 201180003170.X, country CN
- WWE (WIPO information: entry into national phase): ref document number 2012520244, country JP
- WWE (WIPO information: entry into national phase): ref document number 13389144, country US
- 121 (EP: the EPO has been informed by WIPO that EP was designated in this application): ref document number 11795314, country EP, kind code A1
- NENP (non-entry into the national phase): ref country code DE
- 122 (EP: PCT application non-entry in European phase): ref document number 11795314, country EP, kind code A1