WO2013115202A1 - Information processing system, information processing method, information processing apparatus and control method and control program therefor, and communication terminal and control method and control program therefor - Google Patents
- Publication number
- WO2013115202A1 (PCT/JP2013/051953)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- feature
- local
- local feature
- search object
- search
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T7/74—Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7837—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7847—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
- G06F16/7854—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using shape
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/255—Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30108—Industrial image inspection
- G06T2207/30112—Baggage; Luggage; Suitcase
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30232—Surveillance
Definitions
- the present invention relates to a technique for finding a search object from a video imaged using local features.
- Patent Document 1 describes a technique for searching for a match between the image information of a search request and provided image information, and returning the result to the search requester.
- Japanese Patent Application Laid-Open No. 2004-228561 describes a technique that improves the recognition speed by clustering feature amounts when a query image is recognized using a model dictionary generated in advance from a model image.
- An object of the present invention is to provide a technique for solving the above-described problems.
- a system according to the present invention comprises: first local feature quantity storage means for storing a search object in association with m first local feature quantities, each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions containing each of m feature points of an image of the search object;
- second local feature quantity generating means for extracting n feature points from an image in a video captured by first imaging means and generating n second local feature quantities, each consisting of a feature vector of 1 to j dimensions, for each of n local regions containing the n feature points; and
- recognizing means for selecting the smaller of the dimension number i of the feature vectors of the first local feature quantities and the dimension number j of the feature vectors of the second local feature quantities, and recognizing that the search object exists in the image when a predetermined ratio or more of the m first local feature quantities, restricted to feature vectors up to the selected dimension number, correspond to the n second local feature quantities restricted likewise.
- the method according to the present invention is an information processing method for an information processing system including first local feature quantity storage means for storing a search object in association with m first local feature quantities, each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions containing each of m feature points of an image of the search object, the method comprising:
- a second local feature quantity generating step of extracting n feature points from an image in a video captured by first imaging means and generating n second local feature quantities, each consisting of a feature vector of 1 to j dimensions, for each of n local regions containing the n feature points; and
- a recognizing step of selecting the smaller of the dimension number i of the feature vectors of the first local feature quantities and the dimension number j of the feature vectors of the second local feature quantities, and recognizing that the search object exists in the image when a predetermined ratio or more of the first local feature quantities, restricted to feature vectors up to the selected dimension number, correspond to the second local feature quantities restricted likewise.
- an apparatus according to the present invention comprises: second local feature quantity generating means for extracting n feature points from an image in a captured video and generating n second local feature quantities, each consisting of a feature vector of 1 to j dimensions, for each of n local regions containing the n feature points; and
- first transmission means for transmitting the n second local feature quantities to an information processing apparatus that recognizes, based on collation of local feature quantities, a search object contained in the captured image.
- the method according to the present invention comprises: a second local feature quantity generating step of extracting n feature points from an image in a captured video and generating n second local feature quantities, each consisting of a feature vector of 1 to j dimensions, for each of n local regions containing the n feature points; and
- a first transmission step of transmitting the n second local feature quantities to an information processing apparatus that recognizes, based on collation of local feature quantities, a search object contained in the captured image.
- a program according to the present invention causes a computer to execute: a second local feature quantity generating step of extracting n feature points from an image in a captured video and generating n second local feature quantities, each consisting of a feature vector of 1 to j dimensions, for each of n local regions containing the n feature points; and
- a first transmission step of transmitting the n second local feature quantities to an information processing apparatus that recognizes, based on collation of local feature quantities, a search object contained in the captured image.
- an apparatus according to the present invention comprises: first local feature quantity generating means for extracting m feature points from a captured image of a search object and generating m first local feature quantities, each consisting of a feature vector of 1 to i dimensions, for each of m local regions containing the m feature points;
- second transmission means for transmitting the m first local feature quantities to an information processing apparatus that recognizes, based on collation of local feature quantities, whether the imaged search object is contained in another image; and
- first receiving means for receiving, from the information processing apparatus, information indicating the imaged search object contained in the other image.
- the method according to the present invention comprises: a first local feature quantity generating step of extracting m feature points from a captured image of a search object and generating m first local feature quantities, each consisting of a feature vector of 1 to i dimensions, for each of m local regions containing the m feature points; and
- a second transmission step of transmitting the m first local feature quantities to an information processing apparatus that recognizes, based on collation of local feature quantities, whether the imaged search object is contained in another image.
- a program according to the present invention causes a computer to execute: a first local feature quantity generating step of extracting m feature points from a captured image of a search object and generating m first local feature quantities, each consisting of a feature vector of 1 to i dimensions, for each of m local regions containing the m feature points; and
- a second transmission step of transmitting the m first local feature quantities to an information processing apparatus that recognizes, based on collation of local feature quantities, whether the imaged search object is contained in another image.
- an apparatus according to the present invention comprises: first local feature quantity storage means for storing a search object in association with m first local feature quantities, each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions containing each of m feature points of an image of the search object;
- second receiving means for receiving, from a first communication terminal searching for the search object, n second local feature quantities, each consisting of a feature vector of 1 to j dimensions, generated for each of n local regions containing each of n feature points extracted from an image in a video captured by the first communication terminal;
- recognizing means for selecting the smaller of the dimension number i of the feature vectors of the first local feature quantities and the dimension number j of the feature vectors of the second local feature quantities, and recognizing that the search object exists in the image in the video when a predetermined ratio or more of the first local feature quantities, restricted to feature vectors up to the selected dimension number, correspond to the second local feature quantities restricted likewise; and
- second transmission means for transmitting information indicating the recognized search object to a second communication terminal that requested the search for the search object.
- the method according to the present invention is a method for controlling an information processing apparatus including first local feature quantity storage means for storing a search object in association with m first local feature quantities, each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions containing each of m feature points of an image of the search object, the method comprising:
- a second receiving step of receiving, from a first communication terminal searching for the search object, n second local feature quantities, each consisting of a feature vector of 1 to j dimensions, generated for n local regions containing n feature points extracted from an image in a video captured by that terminal;
- a recognizing step of recognizing, by collation of the first and second local feature quantities, that the search object exists in the image in the video; and a second transmission step of transmitting information indicating the recognized search object to a second communication terminal that requested the search for the search object.
- a program according to the present invention is a control program for an information processing apparatus including first local feature quantity storage means for storing a search object in association with m first local feature quantities, each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions containing each of m feature points of an image of the search object;
- the program receives, from a first communication terminal searching for the search object, n second local feature quantities, each consisting of a feature vector of 1 to j dimensions, generated for n local regions containing each of n feature points extracted from an image in a video captured by that terminal.
- the search object in the image in the video can be recognized in real time.
- the information processing system 100 is a system for finding a search object from an image captured using local feature amounts.
- the information processing system 100 includes a first local feature quantity storage unit 110, a first imaging unit 120, a second local feature quantity generation unit 130, and a recognition unit 140.
- the first local feature quantity storage unit 110 stores the search object 111 in association with m first local feature quantities 112, each consisting of a feature vector generated for each of m local regions containing each of m feature points of an image of the search object 111.
- the second local feature quantity generation unit 130 extracts n feature points 131 from the image 101 in the video captured by the first imaging unit 120, and generates, for n local regions 132 containing each of the n feature points, n second local feature quantities 133, each consisting of a feature vector of 1 to j dimensions.
- the recognition unit 140 selects the smaller of the number of dimensions i of the feature vectors of the first local feature quantities 112 and the number of dimensions j of the feature vectors of the second local feature quantities 133. When the recognition unit 140 determines that a predetermined ratio or more of the m first local feature quantities 112, each restricted to feature vectors up to the selected number of dimensions, correspond to the n second local feature quantities 133 restricted likewise (141), it recognizes that the search object 111 exists in the image 101 in the video.
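The dimension selection and the predetermined-ratio test performed by the recognition unit 140 can be sketched as follows. This is a minimal illustration, not the patented implementation: nearest-neighbor Euclidean matching and the threshold values are assumptions, and all function names are hypothetical.

```python
import math

def truncate(vec, dims):
    # Keep only the first `dims` elements of a feature vector.
    return vec[:dims]

def recognize(first_feats, second_feats, ratio_threshold=0.5, dist_threshold=0.1):
    """Recognize the search object when a predetermined ratio or more of the
    m first local feature quantities correspond to the n second ones."""
    if not first_feats or not second_feats:
        return False
    i = len(first_feats[0])           # dimension number i
    j = len(second_feats[0])          # dimension number j
    dims = min(i, j)                  # select the smaller dimension number
    matched = 0
    for f in first_feats:
        f = truncate(f, dims)
        # Distance to the nearest second local feature quantity.
        best = min(math.dist(f, truncate(s, dims)) for s in second_feats)
        if best <= dist_threshold:
            matched += 1
    return matched / len(first_feats) >= ratio_threshold
```

Truncating both sides to `min(i, j)` is what lets features generated at different accuracies be compared at all, at the cost of some discrimination power.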
- the search object in the image in the video can be recognized in real time.
- a request for a search object, such as a stolen or lost article, is received together with the local feature quantity of the search object image. Then, based on collation with the local feature quantities received from various communication terminals, the search object is found and the searcher is alerted.
- a stolen object or a lost object will be described as an example of the search object.
- the search object may be a search person.
- the search object in the image in the video can be found in real time.
- FIG. 2 is a block diagram illustrating a configuration of the information processing system 200 according to the present embodiment.
- the information processing system 200 in FIG. 2 acquires local feature quantities from a photograph of a stolen or lost item, and from video of surveillance cameras in various places, cameras of personal portable terminals, broadcast video, and reproduced video, in order to find the stolen or lost item.
- the information processing system 200 includes search communication terminals 211 to 214, a search server 220, which is an information processing apparatus, and a search request communication terminal 230, which are respectively connected via a network 240.
- the search communication terminals 211 to 214 include a camera-equipped mobile terminal 211, surveillance cameras 212 and 213 installed at various locations, and a broadcast reproduction device 214.
- the surveillance camera 212 is a surveillance camera in the airport
- the surveillance camera 213 is a surveillance camera in the hotel.
- the search communication terminals 211 to 214 have local feature value generation units 211a to 214a, respectively; each generates local feature quantities from its captured video and transmits them to the search server 220. Note that the search communication terminals 211 to 214 need not be dedicated to search; they may transmit local feature quantities to the search server 220 concurrently with their normal operation, either continuously or for a period specified by the search server 220.
- the information processing system 200 also includes a search request communication terminal 230 having a camera.
- the search requesting communication terminal 230 includes a local feature value generation unit 230a, and generates a local feature value from the captured video.
- a photograph of a bag 231 of a stolen or lost item is taken with a camera, and a local feature amount of the bag 231 is generated by the local feature amount generation unit 230a.
- the information processing system 200 includes a search server 220 that collates the local feature quantity of the bag 231 requested by the search request communication terminal 230 with the local feature quantities received from the search communication terminals 211 to 214 and searches for the bag 231 in the videos.
- the search server 220 has a local feature DB 221 (see FIG. 8) that stores local feature quantities in association with search requests from the search request communication terminal 230, and a search object DB 222 (see FIG. 9) that stores information such as the owner of each search object.
- the search server 220 collates the local feature amount stored in the local feature amount DB 221 with the local feature amount received from the search communication terminals 211 to 214. When the search object is found, the result is notified to the search request communication terminal 230.
- FIG. 3 is a diagram for explaining the operation of the information processing system 200 according to the present embodiment.
- the bag 231 appears in the image 310, which is obtained by capturing a photograph of the stolen or lost bag 231 with the search request communication terminal 230.
- a local feature amount is generated from the image 310.
- local feature quantities are likewise generated from the videos 311 to 314 captured by the search communication terminals 211 to 214. These local feature quantities are collated by the local feature quantity collating unit 300; when the local feature quantity generated from the image 310 matches part of a local feature quantity generated from the videos 311 to 314, it is determined that the bag 231 appears in the video captured by the corresponding search communication terminal among 211 to 214.
- the search request communication terminal 230 is notified of an image 320 in which the search object candidate bag 322 in the video from the airport and the search result 321 are superimposed.
- the search result 321 conveys the comment “XX airport, similar product found at gate n”.
- FIG. 4 is a sequence diagram showing an operation procedure of the information processing system 200 according to the present embodiment.
- in step S400, an application and / or data is downloaded from the search server 220 to the search communication terminals 211 to 214 and the search request communication terminal 230.
- in step S401, the application is activated and initialized to perform the processing of this embodiment.
- the search request communication terminal 230 takes an image of the search object.
- the search object image preferably captures the real object, such as a photograph or picture. Alternatively, when few images of the search object are available, a similar product may be photographed.
- a local feature amount is generated from the search object image.
- the local feature quantity and the feature point coordinates are encoded, and in step S409 the information related to the search object and the local feature quantity are transmitted from the search request communication terminal 230 to the search server 220.
- the search server 220 receives the information about the search object and the local feature quantity, and stores the local feature quantity in the local feature DB 221 in association with the search object ID. At the same time, the information related to the search object is stored in the search object DB 222 in association with the search object ID. Further, if there is a local feature quantity accuracy appropriate for representing the features of the search object, the corresponding accuracy parameter is stored in the accuracy adjustment DB 410.
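The encoding of local feature quantities together with feature point coordinates before transmission might be laid out as in the following sketch. The actual wire format is not specified here; the length-prefixed little-endian layout and all names are assumptions for illustration.

```python
import struct

def encode_features(points, features):
    """Pack feature point coordinates and their feature vectors.
    points: list of (x, y) tuples; features: list of equal-length vectors."""
    assert len(points) == len(features)
    dims = len(features[0]) if features else 0
    buf = struct.pack("<HH", len(points), dims)   # count, dimension number
    for (x, y), vec in zip(points, features):
        buf += struct.pack("<ff", x, y)           # feature point coordinates
        buf += struct.pack(f"<{dims}f", *vec)     # feature vector elements
    return buf

def decode_features(buf):
    """Inverse of encode_features, as the server side would apply it."""
    count, dims = struct.unpack_from("<HH", buf, 0)
    offset = 4
    points, features = [], []
    for _ in range(count):
        x, y = struct.unpack_from("<ff", buf, offset)
        offset += 8
        vec = list(struct.unpack_from(f"<{dims}f", buf, offset))
        offset += 4 * dims
        points.append((x, y))
        features.append(vec)
    return points, features
```

Keeping the dimension number in the header is what allows the receiver to collate vectors of differing accuracy by truncating to the smaller dimension count.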
- in step S413, the search communication terminals capture their respective videos.
- in step S415, a local feature quantity is generated from each video.
- in step S417, each local feature quantity is encoded together with the feature point coordinates.
- in step S419, the encoded local feature quantities are transmitted from the search communication terminals to the search server 220.
- in step S421, the search server 220 refers to the local feature DB 221 to collate the local feature quantities and searches for the search object in each video. If the search object is not found, step S421 is repeated each time a local feature quantity is received. If it is determined that the search object exists, the process proceeds to step S425.
- in step S425, referring to the accuracy adjustment DB 410, the accuracy parameter appropriate for confirming the search object in detail and the region where the search object appears in the video are acquired.
- in step S427, the accuracy parameter and the region information are sent as an instruction to the transmission source that transmitted the local feature quantity containing the search object.
- in step S429, the search communication terminal of the transmission source generates high-precision local feature quantities of the selected region in the video using the instructed accuracy parameter.
- in step S431, the local feature quantities and the feature point coordinates are encoded.
- in step S433, the encoded local feature quantities are transmitted to the search server 220.
- in step S435, the search server 220 performs collation again between the high-precision local feature quantities and those of the search object. If they match, the search server 220 confirms the search object and acquires information such as the search source from the search object DB 222. In step S437, the search object confirmation information is notified to the search request communication terminal.
- in step S439, the search request communication terminal of the search source receives the search result and reports the discovery of the search object.
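A highly simplified sketch of the server side of this sequence follows, treating local feature quantities as exactly matchable tokens purely for illustration (the real collation in steps S421 and S435 operates on feature vectors, and the confirmation round trip is omitted); the class and method names are hypothetical.

```python
class SearchServer:
    """Minimal model of steps S411 and S421: register a search object's
    features, then collate features arriving from search terminals."""

    def __init__(self):
        self.local_feature_db = {}    # search object ID -> set of features

    def register(self, obj_id, feats):
        # Store features in association with the search object ID (S411).
        self.local_feature_db[obj_id] = set(feats)

    def collate(self, feats, ratio=0.5):
        """Return the ID of a search object whose stored features overlap the
        received ones by the predetermined ratio or more, else None (S421)."""
        feats = set(feats)
        for obj_id, stored in self.local_feature_db.items():
            if stored and len(stored & feats) / len(stored) >= ratio:
                return obj_id
        return None
```

On a hit, the real server would continue with steps S425 onward: look up accuracy parameters, instruct the transmission source to regenerate high-precision features for the candidate region, and collate again before notifying the requester.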
- FIG. 5 is a block diagram showing a functional configuration of the search request communication terminal 230 according to the present embodiment.
- the imaging unit 501 inputs an image of a search object.
- the local feature value generation unit 502 generates a local feature value from the video from the imaging unit 501.
- the generated local feature amount is encoded by the encoding unit 503a of the local feature amount transmitting unit 503 together with the feature point coordinates.
- the local feature amount transmission unit 503 transmits the search object information and the local feature amount to the search server via the communication control unit 504.
- the search result receiving unit 505 receives the search result via the communication control unit 504. Then, the search result notifying unit 506 notifies the user of the received search result.
- the search result notification unit 506 includes a display in which the video from the imaging unit 501 and the search result are superimposed (see FIG. 3).
- FIG. 6 is a block diagram showing a functional configuration of the search communication terminals 211 to 214 according to the present embodiment.
- the imaging unit 601 inputs a query image.
- the local feature value generation unit 602 generates a local feature value from the video from the imaging unit 601.
- the generated local feature amount is encoded by the encoding unit 603a of the local feature amount transmitting unit 603 together with the feature point coordinates.
- the local feature amount transmission unit 603 transmits the search object information and the local feature amount to the search server via the communication control unit 604.
- the accuracy adjustment / region selection receiving unit 605 receives the accuracy parameter for accuracy adjustment and the region information of the search object in the video via the communication control unit 604.
- the accuracy parameter for accuracy adjustment is held in the accuracy parameter 606a of the accuracy adjustment unit 606.
- the accuracy adjustment unit 606 adjusts the accuracy of the local feature amount of the local feature amount generation unit 602 according to the accuracy parameter 606a.
- the region information of the search object is sent to the region selection unit 607, and the region selection unit 607 controls the imaging unit 601 and / or the local feature quantity generation unit 602 to generate high-precision local feature quantities of the search object candidate region.
- the region selection unit 607's control of the imaging unit 601 may include zooming in on the search object candidate.
- when the search communication terminal is also used for video acquisition, the video captured by the imaging unit 601 is displayed on the imaging display unit 608 and transmitted by the video transmission unit 609.
- FIG. 7 is a block diagram illustrating a functional configuration of the search server 220 that is the information processing apparatus according to the present embodiment.
- the local feature receiving unit 702 decodes the local feature received from the communication terminal via the communication control unit 701 by the decoding unit 702a.
- the registration / search determination unit 703 determines whether the transmission source of the local feature received by the local feature reception unit 702 is a search request communication terminal or a search communication terminal. The determination by the registration / search determination unit 703 may be performed based on the communication terminal ID or address of the transmission source, or may be performed depending on whether the search object information is included in the received data.
- the local feature amount registration unit 704 registers the local feature amount in the local feature amount DB 221 in association with the search object ID.
- information on the search object is held in the search object DB 222, and accuracy parameters appropriate for the search object are stored in the accuracy adjustment DB 410.
- the search object recognition unit 705 collates the local feature value of the search object registered in the local feature value DB 221 with the local feature value of the video from the search communication terminal.
- the search object recognition unit 705 recognizes that there is a search object in the video if the local feature value of the search object is included in the local feature value of the video.
- the accuracy adjustment acquisition unit 706 refers to the accuracy adjustment DB 410 and acquires an accuracy parameter appropriate for the search object.
- the search object recognition unit 705 acquires the position of the search object in the video.
- the accuracy adjustment / region selection information transmission unit 707 transmits the accuracy parameter for accuracy adjustment and the region information for region selection, via the communication control unit 701, to the search communication terminal that was the transmission source.
- FIG. 8 is a diagram illustrating a configuration of the local feature DB 221 according to the present embodiment. Note that the present invention is not limited to such a configuration.
- the local feature DB 221 stores a first local feature 803, a second local feature 804, ..., an m-th local feature 805 in association with the search object ID 801 and the name / type 802.
- Each local feature is stored as a feature vector composed of elements from the 1st to the 150th dimension, hierarchized in units of 25 dimensions corresponding to the 5 × 5 sub-regions (see FIG. 11F).
- m is a positive integer and may be a different number corresponding to the search object ID.
- the feature point coordinates used for the matching process are stored together with the respective local feature amounts.
- FIG. 9 is a diagram illustrating a configuration of the search object DB 222 according to the present embodiment.
- the structure of the search object DB 222 is not limited to that shown in FIG. 9.
- the search object DB 222 stores a registration date 903, a searcher 904, a searcher's address 905, a searcher's contact information 906, and the like in association with the search object ID 901 and the name / type 902.
- the search server 220 may store only the notification destination corresponding to the search object, and the contents of the search object DB 222 may be held by the search request communication terminal.
- FIG. 10 is a diagram showing a configuration of the accuracy adjustment DB 410 according to the present embodiment.
- the configuration of the accuracy adjustment DB 410 is not limited to that shown in FIG. 10.
- the accuracy adjustment DB 410 stores a first adjustment value 1003, a second adjustment value 1004, and so on for generating the accuracy parameter 606a in FIG. 6, in association with the search object ID 1001 and the name / type 1002. Any adjustment value may be used depending on the type of parameter. Since these parameters are related to each other, it is desirable to select parameters appropriate for the search object to be recognized and confirmed. Accordingly, parameters can be generated and stored in advance according to the target search object, or can be learned and held.
- FIG. 11A is a block diagram illustrating a configuration of the local feature generation units 502 and 602 according to the present embodiment.
- the local feature generation units 502 and 602 each include a feature point detection unit 1111, a local region acquisition unit 1112, a sub-region division unit 1113, a sub-region feature vector generation unit 1114, and a dimension selection unit 1115.
- the feature point detection unit 1111 detects a large number of characteristic points (feature points) from the image data, and outputs the coordinate position, scale (size), and angle of each feature point.
- the local region acquisition unit 1112 acquires a local region where feature amount extraction is performed from the coordinate value, scale, and angle of each detected feature point.
- the sub area dividing unit 1113 divides the local area into sub areas.
- the sub-region dividing unit 1113 can divide the local region into, for example, 16 blocks (4 × 4 blocks) or 25 blocks (5 × 5 blocks).
- the number of divisions is not limited to these. In the present embodiment, the case where the local region is divided into 25 blocks (5 × 5 blocks) will be described below as a representative example.
- the sub-region feature vector generation unit 1114 generates a feature vector for each sub-region of the local region.
- a gradient direction histogram can be used as the feature vector of the sub-region.
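As an illustration of the gradient direction histogram described above, the following minimal Python sketch quantizes the gradient angles of the pixels in one sub-region into six 60-degree bins. The function name and the `(angle, magnitude)` input format are assumptions for illustration, not part of the embodiment.

```python
import math

def gradient_histogram(gradients, num_bins=6):
    """Quantize pixel gradient angles of one sub-region into num_bins
    directions (60-degree intervals for 6 bins) and count frequencies.
    `gradients` is a list of (angle_in_radians, magnitude) pairs."""
    hist = [0.0] * num_bins
    bin_width = 2 * math.pi / num_bins
    for angle, magnitude in gradients:
        b = int((angle % (2 * math.pi)) / bin_width) % num_bins
        hist[b] += 1  # simple frequency; `magnitude` could be added instead
    return hist

# Example: three pixels whose gradient angles fall in bins 0, 1, and 1
pixels = [(0.1, 1.0), (1.2, 0.5), (1.3, 0.8)]
print(gradient_histogram(pixels))  # -> [1.0, 2.0, 0.0, 0.0, 0.0, 0.0]
```

As noted below, summing gradient magnitudes instead of simple frequencies is a straightforward variant (replace `+= 1` with `+= magnitude`).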
- the dimension selection unit 1115 selects (for example, thins out) a dimension to be output as a local feature amount based on the positional relationship between the sub-regions so that the correlation between the feature vectors of adjacent sub-regions becomes low.
- the dimension selection unit 1115 can not only select a dimension but also determine a selection priority. That is, the dimension selection unit 1115 can select dimensions with priorities so that, for example, dimensions in the same gradient direction are not selected between adjacent sub-regions. Then, the dimension selection unit 1115 outputs a feature vector composed of the selected dimensions as a local feature amount.
- the dimension selection unit 1115 can also output the local feature with the dimensions rearranged based on the priority.
- FIG. 11B to FIG. 11F are diagrams illustrating the processing of the local feature value generation units 502 and 602 according to the present embodiment.
- FIG. 11B is a diagram showing a series of processing of feature point detection / local region acquisition / sub-region division / feature vector generation in the local feature quantity generation units 502 and 602.
- this series of processes is described in, for example, US Pat. No. 6,711,293 (David G. Lowe, "Distinctive image features from scale-invariant keypoints", International Journal of Computer Vision, 60(2), 2004, pp. 91-110).
- An image 1121 in FIG. 11B is a diagram illustrating a state in which feature points are detected from an image in the video in the feature point detection unit 1111 in FIG. 11A.
- the starting point of the arrow of the feature point data 1121a indicates the coordinate position of the feature point, the length of the arrow indicates the scale (size), and the direction of the arrow indicates the angle.
- for the scale (size) and direction, brightness, saturation, hue, and the like can be selected according to the target image.
- in the example of FIG. 11B, the case of six directions at 60-degree intervals will be described, but the present invention is not limited to this.
- the local region acquisition unit 1112 in FIG. 11A generates a Gaussian window 1122a around the starting point of the feature point data 1121a, and generates a local region 1122 that substantially includes the Gaussian window 1122a.
- the local region acquisition unit 1112 generates a square local region 1122, but the local region may be circular or have another shape. This local region is acquired for each feature point. If the local area is circular, there is an effect that the robustness is improved with respect to the imaging direction.
- FIG. 11B also shows the state in which the sub-region dividing unit 1113 divides the local region 1122 of the feature point data 1121a into sub-regions 1123 based on the scale and angle of each pixel included in the local region.
- the gradient direction is not limited to 6 directions, but may be quantized to an arbitrary quantization number such as 4 directions, 8 directions, and 10 directions.
- the sub-region feature vector generation unit 1114 may add up the magnitudes of the gradients instead of adding up the simple frequencies.
- when aggregating the gradient histogram, the sub-region feature vector generation unit 1114 may add weight values not only to the sub-region to which a pixel belongs but also to nearby sub-regions (such as adjacent blocks) according to the distance between the sub-regions. Further, the sub-region feature vector generation unit 1114 may add weight values to the gradient directions before and after the quantized gradient direction. Note that the feature vector of the sub-region is not limited to a gradient direction histogram, and may be anything having a plurality of dimensions (elements), such as color information. In the present embodiment, it is assumed that a gradient direction histogram is used as the feature vector of the sub-region.
- the dimension selection unit 1115 selects (decimates) a dimension (element) to be output as a local feature amount based on the positional relationship between the sub-regions so that the correlation between feature vectors of adjacent sub-regions becomes low. More specifically, the dimension selection unit 1115 selects dimensions such that at least one gradient direction differs between adjacent sub-regions, for example.
- the dimension selection unit 1115 mainly uses adjacent sub-regions as the nearby sub-regions; however, nearby sub-regions are not limited to adjacent ones, and a sub-region within a predetermined distance may also be treated as a nearby sub-region.
- FIG. 11C shows an example in which dimensions are selected from a feature vector 1131 of a 150-dimensional gradient histogram generated by dividing a local region into 5 × 5 block sub-regions and quantizing the gradient directions into six directions 1131a.
- FIG. 11C is a diagram illustrating a state of a feature vector dimension number selection process in the local feature value generation units 502 and 602.
- the dimension selection unit 1115 selects a feature vector 1132 of a half 75-dimensional gradient histogram from a feature vector 1131 of a 150-dimensional gradient histogram.
- dimensions can be selected so that dimensions in the same gradient direction are not selected in adjacent left and right and upper and lower sub-region blocks.
- the dimension selection unit 1115 selects the feature vector 1133 of the 50-dimensional gradient histogram from the feature vector 1132 of the 75-dimensional gradient histogram.
- the dimension can be selected so that only one direction is the same (the remaining one direction is different) between the sub-region blocks positioned at an angle of 45 degrees.
- when the dimension selection unit 1115 selects the feature vector 1134 of the 25-dimensional gradient histogram from the feature vector 1133 of the 50-dimensional gradient histogram, dimensions can be selected so that the selected gradient directions do not match between the sub-region blocks positioned at a 45-degree angle.
- in this example, the dimension selection unit 1115 selects one gradient direction from each sub-region for the 1st to 25th dimensions, two gradient directions for the 26th to 50th dimensions, and three gradient directions for the 51st to 75th dimensions.
- it is desirable that the gradient directions do not overlap between adjacent sub-region blocks and that all gradient directions are selected uniformly.
- it is also desirable that the dimensions be selected uniformly from the entire local region. Note that the dimension selection method illustrated in FIG. 11C is an example, and the selection method is not limited to this.
- FIG. 11D is a diagram illustrating an example of the selection order of feature vectors from sub-regions in the local feature value generation units 502 and 602.
- the dimension selection unit 1115 can not only select dimensions but also determine a selection priority, so that dimensions contributing more to the features of the feature point are selected first. That is, for example, the dimension selection unit 1115 can select dimensions with priorities so that dimensions in the same gradient direction are not selected between adjacent sub-region blocks. Then, the dimension selection unit 1115 outputs a feature vector composed of the selected dimensions as the local feature. In addition, the dimension selection unit 1115 can output the local feature with the dimensions rearranged based on the priority.
- the dimension selection unit 1115 may select dimensions in the order of the sub-region blocks shown in the matrix 1141 in FIG. 11D, for example, within each of the ranges of the 1st to 25th, 26th to 50th, and 51st to 75th dimensions.
- the dimension selection unit 1115 can select the gradient direction by increasing the priority order of the sub-region blocks close to the center.
- FIG. 11E is a diagram illustrating an example of element numbers of the 150-dimensional feature vector in accordance with the selection order of FIG. 11D.
- when the gradient direction q is selected in the p-th sub-region block, the element number of the feature vector is 6 × p + q.
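Interpreting p as the sub-region block index (0 to 24) and q as the quantized gradient direction (0 to 5) — an interpretation of the formula above, since the chunk does not define p and q explicitly — the element numbering can be written as:

```python
def element_number(p, q, num_dirs=6):
    """Element number of the 150-dim feature vector for sub-region
    block p (0..24) and quantized gradient direction q (0..5)."""
    return num_dirs * p + q

print(element_number(12, 3))  # -> 75: direction 3 of the center block
```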
- the matrix 1161 in FIG. 11F is a diagram showing that the 150-dimensional order according to the selection order in FIG. 11E is hierarchized in units of 25 dimensions.
- the matrix 1161 in FIG. 11F illustrates a configuration example of local features obtained by selecting the elements illustrated in FIG. 11E according to the priority order illustrated in the matrix 1141 in FIG. 11D.
- the dimension selection unit 1115 can output dimension elements in the order shown in FIG. 11F. Specifically, for example, when outputting a 150-dimensional local feature amount, the dimension selection unit 1115 can output all 150-dimensional elements in the order shown in FIG. 11F.
- when outputting, for example, a 25-dimensional local feature, the dimension selection unit 1115 can output the elements 1171 in the first row shown in FIG. 11F (76th, 45th, 83rd, ..., 120th) in the order shown in FIG. 11F (from left to right). When outputting a 50-dimensional local feature, the dimension selection unit 1115 can additionally output the elements 1172 in the second row shown in FIG. 11F, in the same order (from left to right).
- the dimensions of the local feature have a hierarchical arrangement structure. That is, for example, between the 25-dimensional local feature and the 150-dimensional local feature, the arrangement of the first 25 dimensions (the elements 1171) is the same.
- by selecting dimensions hierarchically (progressively), the dimension selection unit 1115 can extract and output local features of any number of dimensions according to the application, communication capacity, terminal specifications, and the like.
- since the dimension selection unit 1115 selects dimensions hierarchically and sorts them based on the priority order, images can be collated using local features of different numbers of dimensions. For example, when images are collated using a 75-dimensional local feature and a 50-dimensional local feature, the distance between the local features can be calculated using only the first 50 dimensions.
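The hierarchical comparison described above can be sketched as follows: because the leading dimensions are identical across feature lengths, two features of different dimensionality are compared on their shared prefix. The function name and the use of Euclidean distance are illustrative assumptions.

```python
def distance(f1, f2):
    """Compare two hierarchically arranged local features of possibly
    different lengths, using only the shared leading dimensions."""
    k = min(len(f1), len(f2))
    return sum((a - b) ** 2 for a, b in zip(f1[:k], f2[:k])) ** 0.5

feat75 = [float(i) for i in range(75)]  # 75-dim feature
feat50 = [float(i) for i in range(50)]  # 50-dim feature, same prefix
print(distance(feat75, feat50))  # -> 0.0: the first 50 dims are identical
```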
- the priorities shown in the matrix 1141 in FIG. 11D to FIG. 11F are merely examples, and the order of selecting dimensions is not limited to this.
- the order of blocks may be the order shown in the matrix 1142 in FIG. 11D or the matrix 1143 in FIG. 11D in addition to the example of the matrix 1141 in FIG. 11D.
- the priority order may be determined so that dimensions are selected from all the sub-regions.
- alternatively, regarding the vicinity of the center of the local region as important, the priority order may be determined so that sub-regions near the center are selected more frequently.
- the information indicating the dimension selection order may be defined in the program, for example, or may be stored in a table or the like (selection order storage unit) referred to when the program is executed.
- the dimension selection unit 1115 may select dimensions by skipping sub-region blocks; that is, six dimensions are selected in a certain sub-region and zero dimensions in other sub-regions close to it. Even in such a case, it can be said that the dimensions are selected for each sub-region so that the correlation between nearby sub-regions becomes low.
- the shape of the local region and sub-region is not limited to a square, and can be any shape.
- the local region acquisition unit 1112 may acquire a circular local region.
- the sub-region dividing unit 1113 can divide the circular local region into, for example, nine or seventeen concentric sub-regions.
- the dimension selection unit 1115 can select a dimension in each sub-region.
- as described above, the dimensions of the generated feature vector are selected hierarchically while the information amount of the local feature is maintained. This processing enables real-time search object recognition and recognition result display while maintaining recognition accuracy. Note that the configuration and processing of the local feature generation units 502 and 602 are not limited to this example; other processes that enable real-time search object recognition and recognition result display while maintaining recognition accuracy can naturally be applied.
- FIG. 11G is a block diagram showing the encoding units 503a and 603a according to this embodiment. Note that the encoding unit is not limited to this example, and other encoding processes can be applied.
- the encoding units 503a and 603a have a coordinate value scanning unit 1181 that inputs the coordinates of feature points from the feature point detection unit 1111 of the local feature quantity generation units 502 and 602 and scans the coordinate values.
- the coordinate value scanning unit 1181 scans the image according to a specific scanning method, and converts the two-dimensional coordinate values (X coordinate value and Y coordinate value) of the feature points into one-dimensional index values.
- This index value is a scanning distance from the origin according to scanning. There is no restriction on the scanning direction.
- the encoding units 503a and 603a also have a sorting unit 1182 that sorts the index values of the feature points and outputs permutation information after sorting.
- the sorting unit 1182 sorts, for example, in ascending order; it may instead sort in descending order.
- a difference calculation unit 1183 calculates the difference value between two adjacent index values in the sorted index values and outputs a series of difference values.
- a difference encoding unit 1184 encodes the series of difference values in series order.
- the series of difference values may be encoded with a fixed bit length, for example.
- the fixed bit length may be specified in advance, but in that case the number of bits necessary to express the largest possible difference value is required, so the encoding size does not become small. Therefore, when encoding with a fixed bit length, the difference encoding unit 1184 can determine the bit length based on the input series of difference values.
- for example, the difference encoding unit 1184 can obtain the maximum difference value from the input series of difference values, obtain the number of bits necessary to express that maximum value (the number of expression bits), and encode the series of difference values with the obtained number of expression bits.
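The scan / sort / difference-encode pipeline above can be sketched as follows. The raster scan (`y * width + x`), keeping the first sorted index as an absolute offset, and the function name are illustrative assumptions; the text only specifies a one-dimensional index along some scan, sorting, and difference values with a bit length derived from the maximum difference.

```python
def encode_coordinates(points, width):
    """Raster-scan feature point (x, y) coords to 1-D index values, sort
    them, difference-encode, and pick the fixed bit length from the
    largest value to be expressed."""
    indices = sorted(y * width + x for x, y in points)
    # first absolute index, then differences between adjacent sorted indices
    diffs = [indices[0]] + [b - a for a, b in zip(indices, indices[1:])]
    bits = max(diffs).bit_length() or 1  # expression bits for the maximum
    return diffs, bits

diffs, bits = encode_coordinates([(3, 0), (1, 2), (0, 1)], width=8)
print(diffs, bits)  # -> [3, 5, 9] 4
```

Since sorted indices are non-decreasing, each difference is small and non-negative, which is what makes a short fixed bit length effective.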
- a local feature encoding unit 1185 encodes the local features of the corresponding feature points in the same permutation as the sorted index values of the feature points.
- the local feature encoding unit 1185 can encode the dimension-selected local feature (selected from the 150-dimensional local feature) of one feature point with, for example, one byte per dimension, that is, with a number of bytes equal to the number of dimensions.
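One byte per dimension implies quantizing each element into the range 0-255. The following sketch assumes a simple scale-and-clamp quantization; the quantization step `scale` and the function name are assumptions, as the embodiment does not specify how values are mapped to bytes.

```python
def encode_feature(dims, scale=0.5):
    """Encode a dimension-selected local feature with one byte per
    dimension: each value is scaled then clamped into 0..255."""
    out = bytearray()
    for v in dims:
        q = int(round(v / scale))
        out.append(max(0, min(255, q)))  # clamp to one byte
    return bytes(out)

feat = [0.0, 1.0, 127.5, 200.0]
code = encode_feature(feat)
print(len(code), list(code))  # -> 4 [0, 2, 255, 255]
```

The encoded size is exactly `len(dims)` bytes, matching the "number of bytes equal to the number of dimensions" described above.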
- FIG. 11H is a diagram illustrating the processing of the search object recognition unit 705 in FIG. 7 according to the present embodiment.
- a local feature 1191, generated in advance by the search request communication terminal from an image of the search object according to the present embodiment as shown in FIG. 11H, is stored in the local feature DB 221.
- local features are generated from the video 311 captured by the search communication terminal 211 in the left diagram of FIG. 11H according to the present embodiment.
- local features are generated from the video 312 captured by the search communication terminal 212 and the video 313 captured by the search communication terminal 213 according to the present embodiment. Then, it is checked whether or not the local feature value 1191 stored in the local feature value DB 221 is included in the local feature values generated from the videos 311 to 313.
- the search object recognition unit 705 associates, as indicated by the thin lines, the feature points in the videos 311 to 313 whose local features match the local features stored in the local feature DB 221.
- the search object recognition unit 705 determines that the feature points match when a predetermined ratio or more of the local feature amounts match.
- the search object recognition unit 705 recognizes the object as the search object if the positional relationship between the associated sets of feature points is a linear relationship. With such recognition, the object can be recognized despite differences in size, orientation (viewpoint), or even inversion.
- since sufficient recognition accuracy is obtained when there are a predetermined number or more of associated feature points, the search object can be recognized even if part of it is hidden from view.
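The matching criterion described above — declaring the search object present when at least a predetermined ratio of its registered feature points find a close match — can be sketched as follows. The distance threshold, ratio threshold, and function names are illustrative assumptions, and the linear-relationship (geometric) verification is omitted for brevity.

```python
def euclid(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def contains_object(db_feats, video_feats, dist_th=0.5, match_ratio=0.5):
    """Recognize the search object in a video frame if at least
    `match_ratio` of its registered feature points have a close match."""
    matched = sum(
        1 for f in db_feats
        if any(euclid(f, g) <= dist_th for g in video_feats)
    )
    return matched / len(db_feats) >= match_ratio

bag = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]          # registered features
frame = [[0.05, 1.0], [1.0, 0.1], [9.0, 9.0]]       # features from video
print(contains_object(bag, frame))  # -> True: 2 of 3 points match
```

A production matcher would additionally check, as the text describes, that the matched point pairs are consistent with a single geometric (linear) transformation, which also tolerates occlusion of some points.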
- the local feature 1191 of the search object (bag) in the local feature DB 221 matches the local feature of the video 312 of the search communication terminal 212.
- the local feature amount of the bag region is generated with higher accuracy from the video 1192 zoomed in on the bag of the search object candidate.
- the collation with the local feature 1191 of the search object (bag) in the local feature DB 221 is then performed with higher accuracy, and this more accurate collation confirms whether or not the bag belongs to the searcher.
- FIG. 12A is a block diagram showing a first configuration 606-1 of the accuracy adjustment unit 606 according to the present embodiment.
- the number of dimensions is determined by the dimension number determination unit 1211.
- the dimension number determination unit 1211 determines the number of dimensions selected by the dimension selection unit 1115. For example, the dimension number determination unit 1211 can determine the number of dimensions by receiving information indicating the number of dimensions from the user. Note that the information indicating the number of dimensions need not indicate the number of dimensions itself; it may, for example, indicate collation accuracy or collation speed. Specifically, when receiving an input requesting higher local feature generation accuracy, communication accuracy, or collation accuracy, the dimension number determination unit 1211 determines the number of dimensions so that it increases; when receiving an input requesting higher local feature generation speed, communication speed, or collation speed, it determines the number of dimensions so that it decreases.
- the dimension number determination unit 1211 may determine the same number of dimensions for all feature points detected from an image, or a different number of dimensions for each feature point. For example, when the importance of feature points is given by external information, the dimension number determination unit 1211 may increase the number of dimensions for feature points with high importance and decrease it for feature points with low importance. In this way, the number of dimensions can be determined in consideration of collation accuracy, local feature generation speed, communication speed, and collation speed.
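A toy sketch of this accuracy/speed trade-off follows. The function name, the base of 75 dimensions, and the step of 25 are all assumptions chosen for illustration (using the 25-dimension hierarchy units from FIG. 11F); the embodiment does not prescribe these values.

```python
def decide_dimensions(request, base=75, step=25, max_dim=150, min_dim=25):
    """Map a user request to a dimension count: 'accuracy' raises the
    number of selected dimensions, 'speed' lowers it."""
    if request == "accuracy":
        return min(max_dim, base + step)
    if request == "speed":
        return max(min_dim, base - step)
    return base

print(decide_dimensions("accuracy"), decide_dimensions("speed"))  # -> 100 50
```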
- FIG. 12B is a block diagram showing a second configuration 606-2 of the accuracy adjustment unit 606 according to the present embodiment.
- the feature vector expansion unit 1212 can change the number of dimensions by collecting values of a plurality of dimensions.
- the feature vector extension unit 1212 can extend the feature vector by generating dimensions at a larger scale (extended divided regions) using the feature vectors output from the sub-region feature vector generation unit 1114. Note that the feature vector extension unit 1212 can extend the feature vector using only the feature vector information output from the sub-region feature vector generation unit 1114. Since there is therefore no need to return to the original image to extract features for the extension, the processing time for extending the feature vector is very small compared to the processing time for generating the feature vector from the original image.
- the feature vector extension unit 1212 may generate a new gradient direction histogram by combining gradient direction histograms of adjacent sub-regions.
- FIG. 12C is a diagram for explaining processing by the second configuration 606-2 of the accuracy adjustment unit 606 according to the present embodiment.
- the feature vector extension unit 1212 can, for example, generate a 4 × 4 × 6-dimensional (96-dimensional) gradient direction histogram 1241 by expanding the 5 × 5 × 6-dimensional (150-dimensional) gradient direction histogram 1231. That is, four blocks 1231a surrounded by a thick solid line are combined into one block 1241a, and four blocks 1231b indicated by thick broken lines are combined into one block 1241b.
- similarly, the feature vector extension unit 1212 can generate a 3 × 3 × 6-dimensional (54-dimensional) gradient direction histogram 1251 by taking sums of the gradient direction histograms of adjacent blocks. That is, four blocks 1241c indicated by a thick solid line are combined into one block 1251c, and four blocks 1241d indicated by thick broken lines are combined into one block 1251d.
- if the dimension selection unit 1115 then performs dimension selection, the 5 × 5 × 6-dimensional (150-dimensional) gradient direction histogram 1231 becomes a 5 × 5 × 3-dimensional (75-dimensional) gradient direction histogram 1232, the 4 × 4 × 6-dimensional (96-dimensional) gradient direction histogram 1241 becomes a 4 × 4 × 3-dimensional (48-dimensional) gradient direction histogram 1242, and the 3 × 3 × 6-dimensional (54-dimensional) gradient direction histogram 1251 becomes a 3 × 3 × 3-dimensional (27-dimensional) gradient direction histogram 1252.
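The extension by combining adjacent blocks can be sketched as follows, assuming each coarser block is the sum of a 2 × 2 neighborhood of finer blocks (consistent with "four blocks are combined into one block" above); the function name is hypothetical.

```python
def extend_histogram(grid):
    """Pool a g x g grid of gradient histograms into (g-1) x (g-1) by
    summing each 2 x 2 neighborhood of blocks, reusing the existing
    feature vectors instead of returning to the original image."""
    g = len(grid)
    bins = len(grid[0][0])
    return [[
        [grid[i][j][d] + grid[i][j+1][d] + grid[i+1][j][d] + grid[i+1][j+1][d]
         for d in range(bins)]
        for j in range(g - 1)]
        for i in range(g - 1)]

grid5 = [[[1.0] * 6 for _ in range(5)] for _ in range(5)]  # 5 x 5 x 6 (150 dims)
grid4 = extend_histogram(grid5)                            # 4 x 4 x 6 (96 dims)
grid3 = extend_histogram(grid4)                            # 3 x 3 x 6 (54 dims)
print(len(grid4), len(grid3), grid3[0][0][0])  # -> 4 3 16.0
```

Because only histogram additions are involved, this is far cheaper than recomputing gradients from the image, as the text notes.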
- FIG. 13 is a block diagram showing a third configuration 606-3 of the accuracy adjustment unit 606 according to the present embodiment.
- the feature point selection unit 1311 can change the data amount of the local feature while maintaining accuracy by changing the number of feature points through feature point selection.
- the feature point selection unit 1311 can hold in advance specified number information indicating the “specified number” of feature points to be selected, for example.
- the designated number information may be information indicating the designated number itself, or information indicating the total size (for example, the number of bytes) of the local feature amount in the image.
- in the latter case, the feature point selection unit 1311 can calculate the designated number by, for example, dividing the total size by the size of the local feature at one feature point. Importance can also be assigned to all feature points at random and feature points selected in descending order of importance; when the designated number of feature points has been selected, information on the selected feature points can be output as the selection result.
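The reduction to a designated number can be sketched as follows: derive the count from the permitted total size, then keep the highest-importance points. The function name and the `(importance, point)` representation are illustrative assumptions.

```python
def select_feature_points(points, total_bytes, bytes_per_feature):
    """Reduce feature points to the number that fits the designated total
    size, keeping the highest-importance points first. Each entry of
    `points` is an (importance, feature_point_data) pair."""
    count = total_bytes // bytes_per_feature      # designated number
    ranked = sorted(points, key=lambda p: p[0], reverse=True)
    return ranked[:count]

pts = [(0.9, "p1"), (0.2, "p2"), (0.7, "p3"), (0.5, "p4")]
chosen = select_feature_points(pts, total_bytes=150, bytes_per_feature=50)
print([p[1] for p in chosen])  # -> ['p1', 'p3', 'p4']
```

The same routine covers both forms of the designated-number information: pass the count directly via `total_bytes=count, bytes_per_feature=1`, or pass the permitted total size and per-point size.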
- only feature points included in a specific scale region can be selected from the scales of all feature points.
- the feature points can be reduced to the designated number based on the importance, and information on the selected feature points can be output as a selection result.
- FIG. 14 is a block diagram showing a fourth configuration 606-4 of the accuracy adjustment unit 606 according to the present embodiment.
- the dimension number determination unit 1211 and the feature point selection unit 1311 cooperate to change the data amount of the local feature amount while maintaining accuracy.
- the feature point selection unit 1311 can select feature points based on the number of feature points determined by the dimension number determination unit 1211. Further, the dimension number determination unit 1211 can determine the number of selected dimensions so that the feature size becomes the designated size, based on the designated feature size and the number of feature points determined by the feature point selection unit 1311.
- the feature point selection unit 1311 selects feature points based on the feature point information output from the feature point detection unit 1111, and outputs importance level information indicating the importance of each selected feature point to the dimension number determination unit 1211.
- the dimension number determination unit 1211 can then determine, for each feature point, the number of dimensions to be selected by the dimension selection unit 1115 based on the importance level information.
- FIG. 15A is a block diagram illustrating a hardware configuration of the search request communication terminal 230 according to the present embodiment.
- a CPU 1510 is a processor for arithmetic control, and implements each functional component of the search request communication terminal 230 by executing a program.
- the ROM 1520 stores fixed data and programs, such as initial data and initial programs.
- the communication control unit 504 communicates with the search server 220 via a network in the present embodiment. Note that the number of CPUs 1510 is not limited to one; there may be a plurality of CPUs, and a GPU (Graphics Processing Unit) for image processing may also be included.
- the RAM 1540 is a random access memory that the CPU 1510 uses as a work area for temporary storage.
- the RAM 1540 has an area for storing data necessary for realizing the present embodiment.
- An input video 1541 indicates an input video imaged and input by the imaging unit 501.
- the feature point data 1542 indicates feature point data including the feature point coordinates, scale, and angle detected from the input video 1541.
- the local feature value generation table 1543 is a local feature value generation table that holds data until a local feature value is generated (see FIG. 15B).
- the local feature quantity 1544 is generated using the local feature quantity generation table 1543 and indicates the local feature quantity of the search object to be sent to the search server 220 via the communication control unit 504.
- Search object registration data 1545 indicates data related to a search object to be sent to the search server 220 via the communication control unit 504.
- the search object discovery information 1546 indicates information notified from the search server 220 when an object that matches the search object requested in the search server 220 is found.
- the search object video / discovery information superimposition data 1547 is data in which the search object video and the discovery information are superimposed when the search object is found, and is displayed on the display unit 1561 (see FIG. 3).
- Input / output data 1548 is input / output data input / output via the input / output interface 1560.
- Transmission / reception data 1549 indicates transmission / reception data transmitted / received via the communication control unit 504.
- the storage 1550 stores a database, various parameters, or the following data or programs necessary for realizing the present embodiment.
- the discovery information display format 1551 is a format for displaying the search object image / discovery information superimposed data 1547 on the display unit 1561.
- the storage 1550 stores the following programs.
- the communication terminal control program 1552 indicates a communication terminal control program for controlling the entire search request communication terminal 230.
- the communication terminal control program 1552 includes the following modules.
- the local feature value generating module 1553 is a module that generates a local feature value from the input image of the search object according to FIGS. 11B to 11F in the communication terminal control program 1552.
- the encoding module 1554 is a module that encodes the local feature generated by the local feature generating module 1553 for transmission.
- the search object registration module 1555 is a module for registering the local feature amount of the search object and the search object related information in the search server 220.
- the search object discovery notification module 1556 is a module that notifies search object discovery by receiving search object discovery information from the search server 220.
- the input / output interface 1560 interfaces input / output data with input / output devices.
- the input / output interface 1560 is connected to a display unit 1561, a touch panel or keyboard as the operation unit 1562, a speaker 1563, a microphone 1564, and an imaging unit 501.
- the input / output device is not limited to the above example.
- a GPS (Global Positioning System) position generation unit 1565 is mounted to acquire a current position based on a signal from a GPS satellite.
- In FIG. 15A, only data and programs essential to the present embodiment are shown; data and programs unrelated to the present embodiment are not shown.
- FIG. 15B is a diagram showing a local feature value generation table 1543 in the search request communication terminal 230 according to the present embodiment.
- a plurality of detected feature points 1502, feature point coordinates 1503, and local region information 1504 corresponding to the feature points are stored in association with the input image ID 1501.
- a local feature 1509 is generated for each detected feature point 1502 from the above data.
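- The rows of the local feature generation table of FIG. 15B can be sketched as the following data structure. This is an illustrative Python sketch only, not part of the patent; the names `FeaturePointEntry` and `LocalFeatureTable` and all field names are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of one row of the local feature generation table
# (FIG. 15B): each detected feature point keeps its coordinates, scale and
# angle, the local region cut out around it, and the local feature
# eventually generated from it.
@dataclass
class FeaturePointEntry:
    coordinates: tuple            # feature point coordinates 1503 (x, y)
    scale: float                  # detected scale
    angle: float                  # detected angle
    local_region: object = None   # local region information 1504
    local_feature: list = field(default_factory=list)  # local feature 1509

@dataclass
class LocalFeatureTable:
    input_image_id: str           # input image ID 1501
    entries: list = field(default_factory=list)  # one entry per feature point

table = LocalFeatureTable(input_image_id="IMG-0001")
table.entries.append(FeaturePointEntry(coordinates=(120, 45), scale=2.0, angle=30.0))
print(len(table.entries))  # 1
```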
- FIG. 16 is a flowchart showing a processing procedure of the search request communication terminal 230 according to the present embodiment. This flowchart is executed by the CPU 1510 of FIG. 15A using the RAM 1540, and implements each functional component of FIG.
- In step S1611, it is determined whether registration with the search server 220 is to be performed to request a search object.
- In step S1621, it is determined whether search object discovery information has been received from the search server 220. If neither applies, other processing is performed in step S1631.
- If a search object is to be registered, the process proceeds to step S1613 to acquire the search object image, and local feature generation processing is executed in step S1615 (see FIG. 17A).
- In step S1617, the local feature amounts and feature point coordinates are encoded (see FIGS. 17B and 17C).
- In step S1619, the encoded data and the search object related information are transmitted to the search server 220.
- If search object discovery information has been received, the process advances to step S1623 to generate search object discovery display data.
- In step S1625, the video of the search object and the discovery display data are superimposed and displayed (see FIG. 3).
- In the case of voice notification, the voice is reproduced from the speaker 1563.
- FIG. 17A is a flowchart illustrating a processing procedure of local feature generation processing S1615 according to the present embodiment.
- In step S1711, the position coordinates, scales, and angles of the feature points are detected from the input video.
- In step S1713, a local region is acquired for one of the feature points detected in step S1711.
- In step S1715, the local region is divided into sub-regions.
- In step S1717, a feature vector is generated for each sub-region, thereby generating the feature vector of the local region.
- The processing from steps S1711 to S1717 is illustrated in FIG. 11B.
- In step S1719, dimension selection is performed on the feature vector of the local region generated in step S1717.
- The dimension selection is illustrated in FIGS. 11D to 11F.
- In step S1721, it is determined whether local feature generation and dimension selection have been completed for all feature points detected in step S1711. If not, the process returns to step S1713 and is repeated for the next feature point.
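- The loop of steps S1711 to S1721 can be sketched as follows. This is a hedged Python illustration only: the detector and the per-sub-region descriptor are placeholders, and taking a prefix of the vector stands in for the actual dimension selection order of FIGS. 11D to 11F.

```python
import random

def detect_feature_points(video_frame):
    # Placeholder detector: in the actual system this returns the position
    # coordinates, scale and angle of each feature point (step S1711).
    random.seed(0)
    return [{"xy": (random.randrange(64), random.randrange(64)),
             "scale": 2.0, "angle": 0.0} for _ in range(3)]

def feature_vector_for_local_region(frame, fp, subregions=16, directions=8):
    # Steps S1713-S1717: acquire the local region around the feature point,
    # divide it into sub-regions, and build one small direction histogram per
    # sub-region; their concatenation is the feature vector of the local region.
    return [0.0] * (subregions * directions)   # e.g. 16 x 8 = 128 dimensions

def select_dimensions(vector, n_dims):
    # Step S1719 (sketch): keep only n_dims dimensions. The actual selection
    # alternates sub-regions (FIGS. 11D-11F); a prefix stands in for it here.
    return vector[:n_dims]

def generate_local_features(frame, n_dims=64):
    # Steps S1711-S1721: loop over every detected feature point.
    features = []
    for fp in detect_feature_points(frame):
        vec = feature_vector_for_local_region(frame, fp)
        features.append((fp["xy"], select_dimensions(vec, n_dims)))
    return features

feats = generate_local_features(frame=None)
print(len(feats), len(feats[0][1]))  # 3 feature points, 64 dimensions each
```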
- FIG. 17B is a flowchart showing the processing procedure of the encoding processing S1617 according to the present embodiment.
- In step S1731, the coordinate values of the feature points are scanned in a desired order.
- In step S1733, the scanned coordinate values are sorted.
- In step S1735, coordinate difference values are calculated in the sorted order.
- In step S1737, the difference values are encoded (see FIG. 17C).
- In step S1739, the local feature amounts are encoded in the coordinate value sorting order. The difference value encoding and the local feature amount encoding may be performed in parallel.
- FIG. 17C is a flowchart showing a processing procedure of difference value encoding processing S1737 according to the present embodiment.
- In step S1741, it is determined whether the difference value is within the encodable range. If it is, the process proceeds to step S1747 to encode the difference value, and then to step S1749. If it is not (out of range), the process proceeds to step S1743 to encode an escape code.
- In step S1745, the difference value is encoded by an encoding method different from that of step S1747, and then control goes to step S1749.
- In step S1749, it is determined whether the processed difference value is the last element in the series of difference values. If it is the last, the process ends; if not, the process returns to step S1741 to process the next difference value in the series.
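- The difference value encoding of steps S1731 to S1749 can be sketched as follows. The concrete escape code value, the one-byte range, and the two-byte fallback encoding are illustrative assumptions, not values taken from the patent.

```python
ESCAPE = 0x80             # hypothetical escape code (step S1743)
RANGE = range(-127, 128)  # hypothetical range encodable in one byte

def encode_differences(sorted_coords):
    # Sorted coordinate values are encoded as differences from the previous
    # value (steps S1733-S1735). A small difference is encoded directly
    # (step S1747); a larger one is preceded by the escape code and encoded
    # by a different method, here a two-byte big-endian form (step S1745).
    out, prev = [], 0
    for c in sorted_coords:
        diff = c - prev
        if diff in RANGE:                          # step S1741 -> S1747
            out.append(diff & 0xFF)
        else:                                      # step S1741 -> S1743/S1745
            out.append(ESCAPE)
            out.extend([(diff >> 8) & 0xFF, diff & 0xFF])
        prev = c
    return out

coords = sorted([5, 7, 300, 8])
encoded = encode_differences(coords)
print(encoded)
```

Because the mapping of -127..127 never produces the byte 0x80, the escape code is unambiguous in this sketch.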
- FIG. 18A is a block diagram showing a hardware configuration of search communication terminals 211 to 214 according to the present embodiment.
- a CPU 1810 is a processor for calculation control, and implements each functional component of the search communication terminals 211 to 214 by executing a program.
- The ROM 1820 stores fixed data and programs such as initial data.
- The communication control unit 604 communicates with the search server 220 via a network in the present embodiment. Note that the number of CPUs 1810 is not limited to one; there may be a plurality of CPUs, or a GPU (Graphics Processing Unit) for image processing may be included.
- the RAM 1840 is a random access memory that the CPU 1810 uses as a work area for temporary storage.
- the RAM 1840 has an area for storing data necessary for realizing the present embodiment.
- An input video 1841 indicates an input video imaged and input by the imaging unit 601.
- the feature point data 1842 indicates feature point data including the feature point coordinates, scale, and angle detected from the input video 1841.
- the local feature value generation table 1843 is a local feature value generation table that holds data until a local feature value is generated (see FIG. 15B).
- the local feature value 1844 is generated using the local feature value generation table 1843 and indicates a local feature value to be sent to the search server 220 via the communication control unit 604.
- the accuracy parameter 606a is an accuracy parameter for adjusting the accuracy of the local feature amount instructed from the search server 220.
- Input / output data 1845 is input / output data input / output via the input / output interface 1860.
- Transmission / reception data 1846 indicates transmission / reception data transmitted / received via the communication control unit 604.
- The storage 1850 stores databases, various parameters, and the following data and programs necessary for realizing the present embodiment.
- the initial accuracy parameter 1851 indicates an accuracy parameter initially set by the search communication terminals 211 to 214.
- The storage 1850 stores the following programs.
- a communication terminal control program 1852 indicates a communication terminal control program for controlling the entire search communication terminals 211 to 214.
- the communication terminal control program 1852 includes the following modules.
- the local feature value generation module 1853 is a module that generates a local feature value from the input image of the search object according to FIGS. 11B to 11F in the communication terminal control program 1852.
- the encoding module 1854 is a module that encodes the local feature generated by the local feature generating module 1853 for transmission.
- the accuracy adjustment module 1855 is a module that adjusts the accuracy of the local feature amount in accordance with the accuracy parameter 606a.
- the input / output interface 1860 interfaces input / output data with input / output devices.
- The input / output devices connected to the input / output interface 1860 are the same as those connected to the input / output interface 1560 of the search request communication terminal 230, and thus description thereof is omitted.
- In FIG. 18A, only data and programs essential to the present embodiment are shown; data and programs unrelated to the present embodiment are not shown.
- FIG. 18B is a diagram showing a configuration of the accuracy parameter 606a according to the present embodiment.
- The accuracy parameter 606a stores, as the feature point parameter 1801, a feature point selection threshold for selecting feature points, the number of feature points, or the like. As the local region parameter 1802, an area (size) corresponding to a Gaussian window and a shape such as a rectangle or a circle are stored. As the sub-region parameter 1803, the number of divisions and the shape of the local region are stored. As the feature vector parameter 1804, the number of directions (for example, eight or six directions), the number of dimensions, the dimension selection method, and the like are stored.
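- The four parameter groups of the accuracy parameter 606a of FIG. 18B might be represented as follows. The structure mirrors the parameter groups named in the text; all concrete values are illustrative assumptions only.

```python
# Hypothetical sketch of the accuracy parameter 606a (FIG. 18B). Lowering
# the number of feature points or the number of dimensions lowers accuracy
# but also reduces the local feature data size to be transmitted.
accuracy_parameter = {
    "feature_point_parameter": {        # 1801
        "selection_threshold": 0.03,    # feature point selection threshold
        "max_feature_points": 200,      # number of feature points
    },
    "local_region_parameter": {         # 1802
        "area": 16 * 16,                # size corresponding to Gaussian window
        "shape": "square",              # rectangle, circle, ...
    },
    "sub_region_parameter": {           # 1803
        "divisions": (4, 4),            # number of divisions of local region
        "shape": "square",
    },
    "feature_vector_parameter": {       # 1804
        "directions": 8,                # e.g. 8 or 6 gradient directions
        "dimensions": 64,               # number of dimensions to keep
        "dimension_selection": "alternating sub-regions",
    },
}

# A server-instructed high-accuracy setting would raise such values:
accuracy_parameter["feature_vector_parameter"]["dimensions"] = 128
print(accuracy_parameter["feature_vector_parameter"]["dimensions"])  # 128
```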
- FIG. 19 is a flowchart showing a processing procedure of the search communication terminals 211 to 214 according to the present embodiment. This flowchart is executed by the CPU 1810 of FIG. 18A using the RAM 1840, and implements each functional component of FIG.
- In step S1911, it is determined whether there is a video input from the imaging unit 601.
- In step S1921, it is determined whether accuracy adjustment information has been received from the search server 220. If neither applies, other processing is performed in step S1931.
- If there is a video input, the process proceeds to step S1913, and local feature generation processing is executed on the input video (see FIG. 17A).
- In step S1915, the local feature quantities and feature point coordinates are encoded (see FIGS. 17B and 17C).
- In step S1917, the encoded data is transmitted to the search server 220.
- If accuracy adjustment information has been received, the accuracy parameter 606a is set in step S1923.
- In step S1925, the region is selected.
- The local feature generation processing in step S1913 and the encoding processing in step S1915 are the same as the processing of FIGS. 17A to 17C in the search request communication terminal 230, and thus illustration and description thereof are omitted.
- FIG. 20 is a block diagram illustrating a hardware configuration of the search server 220 that is the information processing apparatus according to the present embodiment.
- a CPU 2010 is a processor for arithmetic control, and realizes each functional component of the search server 220 by executing a program.
- The ROM 2020 stores fixed data and programs such as initial data.
- The communication control unit 701 communicates with the search request communication terminal and the search communication terminals via a network in the present embodiment. Note that the number of CPUs 2010 is not limited to one; a plurality of CPUs or a GPU for image processing may be included.
- the RAM 2040 is a random access memory that the CPU 2010 uses as a temporary storage work area.
- the RAM 2040 has an area for storing data necessary for realizing the present embodiment.
- the received registration local feature amount 2041 indicates a local feature amount including the feature point coordinates received from the search request communication terminal 230.
- the received search local feature amount 2042 indicates the local feature amount including the feature point coordinates received from the search communication terminals 211 to 214.
- The local feature amount 2043 read from the DB indicates a local feature amount, including the feature point coordinates, read from the local feature amount DB 221.
- The search object recognition result 2044 indicates the search object recognition result obtained by collating the local feature amounts received from the search communication terminals with the local feature amounts of the search object received from the search request communication terminal and stored in the local feature DB 221.
- the accuracy adjustment / region selection information 2045 indicates accuracy parameters for accuracy adjustment of the search communication terminals 211 to 214 and region information in the search object candidate video according to the accuracy adjustment DB 410.
- the search object confirmation flag 2046 indicates a flag indicating that the search object candidate is confirmed by high-precision collation.
- Transmission / reception data 2047 indicates transmission / reception data transmitted / received via the communication control unit 701.
- The storage 2050 stores databases, various parameters, and the following data and programs necessary for realizing the present embodiment.
- the local feature DB 221 is a local feature DB similar to that shown in FIG.
- the search object DB 222 indicates a search object DB similar to that shown in FIG.
- the accuracy adjustment DB 410 is the same accuracy adjustment DB as shown in FIG.
- a search server control program 2051 indicates a search server control program for controlling the entire search server.
- the local feature amount registration module 2052 is a module for registering the local feature amount of the search object received from the search request communication terminal 230 in the local feature amount DB 221 in the search server control program 2051.
- the search object recognition control module 2053 is a module for recognizing a search object by comparing the received local feature quantity with the local feature quantity stored in the local feature quantity DB 221 in the search server control program 2051.
- The search object confirmation control module 2054 is a module in the search server control program 2051 that transmits the accuracy adjustment / region selection information to the search communication terminal and confirms the search object by collating the local feature amount received from the search communication terminal with the local feature amount stored in the local feature DB 221.
- FIG. 20 shows only data and programs essential to the present embodiment, and data and programs not related to the present embodiment are not shown.
- FIG. 21 is a flowchart showing a processing procedure of the search server 220 according to the present embodiment.
- In step S2111, it is determined whether a registration local feature amount has been received from the search request communication terminal 230.
- In step S2121, it is determined whether a search local feature amount has been received from a search communication terminal. If neither applies, other processing is performed in step S2141.
- If a registration local feature amount has been received, the process proceeds to step S2113, and the local feature amount is registered in the local feature amount DB 221 in association with the search object. At the same time, information related to the search object is stored in the search object DB 222, and, if necessary, accuracy parameters appropriate for the search object are held in the accuracy adjustment DB 410.
- In step S2123, it is determined whether the search object is present in the search local feature amount. If it is determined that the search object is present, the process proceeds to step S2127, where the accuracy adjustment / region selection data is acquired and transmitted to the search communication terminal of the transmission source.
- In step S2129, the process waits to receive a higher-accuracy confirmation local feature amount for the search object candidate from the search communication terminal of the transmission source. When the confirmation local feature amount is received, the process proceeds to step S2131 to perform search object confirmation processing (see FIGS. 22C and 22B).
- In step S2133, it is determined whether it matches the search object. If the match is confirmed, the process proceeds to step S2135, and the search object discovery information is transmitted to the search request communication terminal of the search request source.
- FIG. 22A is a flowchart showing a processing procedure of search object recognition processing S2123 according to the present embodiment.
- In step S2211, the local feature amount of one search object is acquired from the local feature amount DB 221.
- In step S2213, the local feature amount of the search object is collated with the local feature amount received from the communication terminal (see FIG. 22B).
- In step S2215, it is determined whether they match. If they match, the process proceeds to step S2221, and the matching search object is stored as being present in the video.
- In step S2217, it is determined whether all search objects registered in the local feature DB 221 have been collated; if any remain, the process returns to step S2211 and the collation of the next search object is repeated.
- Note that the search objects and the search range may be limited in advance to increase the processing speed and reduce the load on the search server.
- FIG. 22B is a flowchart showing a processing procedure of collation processing S2213 according to the present embodiment.
- In step S2233, the smaller number of dimensions is selected between the dimension number i of the local feature amounts in the local feature amount DB 221 and the dimension number j of the received local feature amounts.
- In step S2235, the data of the selected number of dimensions of the p-th local feature amount of the search object stored in the local feature DB 221 is acquired. That is, the selected number of dimensions is acquired starting from the first dimension.
- In step S2237, the p-th local feature amount acquired in step S2235 is sequentially collated with the local feature amounts of all feature points generated from the input video to determine whether they are similar.
- In step S2239, it is determined from the result of the collation between the local feature amounts whether the similarity exceeds the threshold value α.
- In step S2241, if they are similar, the pair of matched feature points, together with the positional relationship between the feature point of the input video and the feature point of the search object, is stored. Then q, a counter of the number of matched feature points, is incremented by one.
- In step S2243, the feature point of the search object is advanced to the next feature point (p ← p + 1). If all feature points of the search object have not yet been collated (p ≤ m), the process returns to step S2235 and the local feature collation is repeated.
- Note that the threshold value α can be changed according to the recognition accuracy required for the search object. If the search object has a low correlation with other search objects, accurate recognition is possible even if the recognition accuracy is lowered.
- After step S2245, it is determined in steps S2247 to S2253 whether the search object exists in the input video.
- In step S2247, it is determined whether the ratio of the number q of feature points that match local feature amounts of feature points of the input video to the number p of feature points of the search object exceeds the threshold value β. If it does, the process proceeds to step S2249, and, as a further condition for a search object candidate, it is determined whether the positional relationship between the feature points of the input video and the feature points of the search object admits a linear transformation.
- In step S2249, it is determined whether the positional relationship between the feature points of the input video and the feature points of the search object stored in step S2241 is one that can be obtained by a transformation such as rotation, inversion, or a change of viewpoint position, or one that cannot. Since such determination methods are geometrically well known, detailed description thereof is omitted. If it is determined in step S2251 that linear transformation is possible, the process proceeds to step S2253, and it is determined that the collated search object exists in the input video. Note that the threshold value used here can also be changed according to the recognition accuracy required for the search object.
- If the search object has a low correlation with other search objects, or if its characteristics can be determined even from a part, accurate recognition is possible even with few matching feature points. That is, the search object can be recognized even if part of it is hidden or not visible, as long as a characteristic part is visible.
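- The collation of FIG. 22B can be sketched as follows. This is a hedged illustration only: cosine similarity stands in for whatever distance measure the actual system uses, `alpha` and `beta` name the similarity threshold and the matched-ratio threshold, and the geometric (linear transformation) check of steps S2249 to S2251 is omitted.

```python
import math

def similarity(v1, v2):
    # Cosine similarity over the shared dimensions; the smaller dimension
    # count of the two vectors is used (step S2233).
    k = min(len(v1), len(v2))
    a, b = v1[:k], v2[:k]
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def match_object(db_features, video_features, alpha=0.9, beta=0.5):
    # Steps S2235-S2247 (sketch): count the feature points of the search
    # object whose best similarity against the input video exceeds alpha,
    # store the matched pairs, then test whether the matched ratio q/p
    # exceeds beta. Steps S2249-S2251 (geometric check) are omitted.
    q, pairs = 0, []
    for db_xy, db_vec in db_features:
        best = max(((similarity(db_vec, v_vec), v_xy)
                    for v_xy, v_vec in video_features), default=(0.0, None))
        if best[0] > alpha:            # step S2239
            q += 1                     # step S2241
            pairs.append((db_xy, best[1]))
    return q / len(db_features) > beta, pairs   # step S2247

db = [((0, 0), [1.0, 0.0]), ((5, 5), [0.0, 1.0])]
video = [((10, 10), [1.0, 0.1]), ((20, 20), [0.2, 0.9])]
found, pairs = match_object(db, video)
print(found, len(pairs))  # True 2
```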
- the processing for storing all the search objects in the local feature DB 221 and collating all the search objects is very heavy. Therefore, for example, before the search object is recognized from the input video, it is conceivable that the user selects a search object range from the menu, searches the range from the local feature DB 221 and collates it. Also, the load can be reduced by storing only the local feature amount in the range used by the user in the local feature amount DB 221.
- FIG. 22C is a flowchart showing a processing procedure of search object confirmation processing S2131 according to the present embodiment.
- In step S2261, the local feature amount of the search object candidate is acquired from the local feature DB 221.
- In step S2263, the local feature amount of the search object is collated with the local feature amount received from the communication terminal (see FIG. 22B).
- In step S2265, it is determined whether they match. If they match, the process proceeds to step S2269 and it is stored that the search objects match. If they do not match, the process proceeds to step S2267 and it is stored that the search objects do not match.
- Step S2263 differs only in that the local feature amounts are collated for the search object candidate alone; since it is otherwise the same as step S2213 of FIG. 22A, illustration and description thereof are omitted.
- Note that the threshold values α and β of FIG. 22B may be set differently depending on the accuracy required in the search object recognition process and in the search object confirmation process.
- the information processing system according to the present embodiment differs from the second embodiment in that the search communication terminal and the search server share search object recognition processing. Since other configurations and operations are the same as those of the second embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.
- In the present embodiment, since the search communication terminal performs part of the search object recognition processing, the traffic between the search communication terminal and the search server and the load on the search server can be reduced.
- In the following description, the search object recognition process is performed by the search communication terminal, and the confirmation process is performed by the search server 220.
- However, different role assignments are possible depending on the communication traffic and the loads on the search communication terminal and the search server 220.
- FIG. 23 is a sequence diagram illustrating an operation procedure of the information processing system according to the present embodiment.
- In FIG. 23, the initial setting process, the search request process, and the search object confirmation process illustrated in FIG. 4 of the second embodiment are omitted or simplified to avoid complexity; these processes are similar to those in FIG. 4. Accordingly, the application download and activation processing are also omitted from FIG. 23.
- In FIG. 23, a search request is made from the search request communication terminal in step S2300.
- the local feature amount of the search object and the search object related information are transmitted to the search server 220 and registered and held in each DB of the search server 220.
- the local feature amount of the search object is downloaded from the search server 220 to the search communication terminal.
- The local feature amounts of the search objects to be downloaded may be all the local feature amounts registered in the local feature amount DB 221 at the time the search communication terminal is activated. When the total volume of the local feature amounts registered in the local feature amount DB 221 is large, their accuracy may be adjusted, for example by selecting the number of dimensions of the feature vectors. On the other hand, if the search communication terminal is already activated, only the local feature amount of a newly requested search object may be downloaded.
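- Because the dimension selection orders dimensions by priority, a registered local feature amount can be shortened before download to trade accuracy for capacity. The sketch below uses simple prefix truncation as a stand-in for that selection order; the function name and DB layout are hypothetical.

```python
def truncate_features(feature_db, n_dims):
    # Sketch: shorten every registered local feature vector to its first
    # n_dims dimensions before downloading it to a search communication
    # terminal. Prefix truncation stands in for the actual priority order
    # of the dimension selection (FIGS. 11D-11F).
    return {obj_id: [vec[:n_dims] for vec in vecs]
            for obj_id, vecs in feature_db.items()}

# Hypothetical registered DB: one search object with two 128-dim features.
db = {"object-1": [[0.1] * 128, [0.2] * 128]}
small = truncate_features(db, 32)
print(len(small["object-1"][0]))  # 32
```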
- In step S2303, the search communication terminal registers the received local feature amounts in the communication terminal local feature amount DB 2310 in association with the search objects (see FIG. 8 for the configuration).
- In step S2305, the search communication terminal acquires each video with its imaging unit.
- In step S2307, the initial accuracy of the local feature amount is set.
- In step S2309, local feature amounts are generated from each video.
- In step S2311, the search object in the video is recognized with reference to the communication terminal local feature DB 2310.
- In step S2313, the presence or absence of the search object is determined. If it is determined that there is no search object, the process returns to step S2305 to acquire the next video and repeat the recognition process.
- If the search object is recognized, the accuracy of the local feature amount is adjusted in step S2315.
- the accuracy parameter to be adjusted may be held as a DB by the search communication terminal, or may be downloaded together with the local feature amount of the search object.
- In step S2315, it is also desirable to select the region of the search object.
- In step S2317, local feature amounts are generated with the adjusted accuracy (higher than the initial setting).
- In step S2319, the high-accuracy local feature amounts are encoded together with the feature point coordinates.
- The encoded local feature amounts are transmitted from the search communication terminal to the search server 220 in step S2321.
- The search server 220 refers to the local feature DB 221 to collate the high-accuracy local feature amounts; if they match, it confirms the search object and acquires search object related information from the search object DB 222 (step not shown in FIG. 23).
- The search object confirmation information is then notified to the search request communication terminal of the search request source.
- The search request communication terminal of the search request source receives the search result and notifies the search object discovery in step S2327.
- In the above, the search communication terminal first generates local feature amounts with the initial accuracy to recognize a search object, then performs accuracy adjustment and region selection and generates local feature amounts again.
- Alternatively, high-accuracy local feature amounts may be generated from the beginning, the recognition process performed on them, and the region selection process performed by zooming in or the like. Either way, the communication traffic and the load on the search server can be reduced.
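- The two-phase flow above might be orchestrated as follows. This is an illustrative sketch only: feature generation and terminal-side recognition are placeholders, and the function names and dimension counts are assumptions.

```python
def generate_features(frame, dims):
    # Placeholder for local feature generation at a given accuracy
    # (number of dimensions); the real generation follows FIGS. 11B-11F.
    return [tuple(range(dims))]

def recognize(features, local_db):
    # Placeholder terminal-side recognition against the downloaded DB.
    return bool(features) and bool(local_db)

def two_phase_search(frame, local_db, send_to_server,
                     initial_dims=32, confirm_dims=128):
    # Steps S2305-S2321 (sketch): coarse recognition first; high-accuracy
    # regeneration and server transmission only when a candidate is found.
    # This keeps the per-frame cost and the uplink traffic low.
    coarse = generate_features(frame, initial_dims)
    if not recognize(coarse, local_db):             # steps S2311-S2313
        return False                                # go on to the next frame
    fine = generate_features(frame, confirm_dims)   # steps S2315-S2317
    send_to_server(fine)                            # steps S2319-S2321
    return True

sent = []
hit = two_phase_search(frame=None, local_db={"obj": 1}, send_to_server=sent.append)
print(hit, len(sent[0][0]))  # True 128
```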
- the information processing system according to the present embodiment is different from the second embodiment and the third embodiment in that the search server selects a search communication terminal used for search.
- Other configurations and operations are the same as those in the second embodiment and the third embodiment. Therefore, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.
- In the present embodiment, since the search range can be selected in advance, the search can be sped up and the load on the search server can be reduced.
- FIG. 24 is a sequence diagram illustrating an operation procedure of the information processing system according to the present embodiment.
- In FIG. 24, the initial setting process, the search request process, and the search object confirmation process illustrated in FIG. 4 of the second embodiment are omitted or simplified to avoid complexity.
- These processes are similar to those in FIG. 4.
- In the present embodiment, the application is downloaded only to the search communication terminals within the search range of the search object of the search request, so that the search is concentrated there.
- the search range may be divided into multi-stage levels, and local feature amounts with accuracy corresponding to the levels may be selected, or a recognition processing or confirmation processing method may be selected.
- In FIG. 24, a search request is made from the search request communication terminal in step S2400.
- The local feature amount of the search object and the search object related information are transmitted to the search server 220 and registered and held in each DB of the search server 220 (not shown).
- The search server 220 determines the search range from the search request and the related information.
- An accuracy parameter suitable for the search object, together with an application capable of the processing of the present embodiment, is downloaded to the search communication terminals within the search range.
- The download destination communication terminals may be communication terminals registered in advance, or may be all monitoring cameras or camera-equipped mobile terminals within the search range. Alternatively, communication terminals on which the application has been installed or downloaded in advance may simply be instructed to start searching.
- a communication terminal marked with a circle is selected for searching, and a communication terminal marked with a cross is not selected for searching.
- The selected search communication terminals first perform accuracy adjustment with the downloaded accuracy parameter in step S2405. Each search communication terminal then acquires video with its imaging unit, generates local feature amounts, and transmits them to the search server 220.
- In step S2415, the search server 220 refers to the local feature DB 221 to collate the local feature amounts, recognizes the search object from the video, confirms the search object, and acquires search object related information from the search object DB 222 (step not shown in FIG. 24).
- In step S2417, the search object confirmation information is notified to the search request communication terminal of the search request source.
- The search request communication terminal of the search request source receives the search result and notifies the search object discovery in step S2419.
- In FIG. 24, the search object is recognized and confirmed by the search server 220.
- However, the recognition process may be performed by the search communication terminals and the confirmation process by the search server. When the range of search communication terminals is selected as in the present embodiment, the communication traffic and the loads on the search communication terminals and the search server can be reduced.
- In the present embodiment, a search object can be searched for using a single communication terminal.
- This is useful for finding a search object in a limited area such as a room or a building.
- FIG. 25 is a block diagram illustrating a functional configuration of the communication terminal 2511 according to the present embodiment.
- the same reference numerals are assigned to the same functional components as those in FIG. 6 of the second embodiment, and the description thereof is omitted.
- Using the operation unit, it is selected whether the communication terminal performs search object registration or search object search. Alternatively, only search object registration may be selectable, with the other registered search objects being searched for.
- the registration / search determination unit 2501 determines whether the search object is registered or searched, and causes the local feature amount generated by the local feature amount generation unit 602 to execute different processing. If the search object is registered, the generated local feature value is a local feature value of the imaged search object. Therefore, the local feature amount registration unit 2502 registers the search feature in the local feature amount DB 2510 in association with the search object. At the same time, data corresponding to the accuracy adjustment DB 2520 and the search object DB 2530 are held.
- the search object recognition unit 2503 recognizes the search object by collating the generated local features against those registered in the local feature DB 2510 to determine whether the search object is included.
- the accuracy adjustment acquisition unit 2504 acquires an accuracy parameter suitable for confirmation of the search object from the accuracy adjustment DB 2520.
- the acquired accuracy parameter is held in the accuracy parameter 606a of the accuracy adjustment unit 606, and the accuracy of the local feature amount generated from the video is adjusted.
- area selection, including zooming in on the search object, is not shown, but performing area selection enables more accurate confirmation of the search object.
- the search object confirmation unit 2506 confirms the recognized search object by collating the accuracy-adjusted local features of the video with the local features in the local feature DB 2510. When the search object confirmation unit 2506 confirms the search object, the search object discovery information notification unit 2507 refers to the search object DB 2530 and reports the search object discovery information. If there is a separate search result confirmation communication terminal, the search object discovery information transmission unit 2508 transmits the search object discovery information to that terminal.
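The recognize-then-confirm flow above (coarse recognition by the search object recognition unit 2503, then confirmation with accuracy-adjusted local features) can be sketched as follows. This is an illustrative outline only; the matching rule, thresholds, and dimension counts are assumptions, not values from the specification.

```python
import math

def match_ratio(first_feats, second_feats, dims, tol=0.1):
    """Fraction of registered (first) local features that find a match among
    the features (second) extracted from a video frame, comparing only the
    leading `dims` dimensions of each feature vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a[:dims], b[:dims])))
    matched = sum(
        1 for f in first_feats
        if min(dist(f, s) for s in second_feats) < tol
    )
    return matched / len(first_feats)

def recognize_then_confirm(registered, frame_low, frame_high,
                           low_dims=50, high_dims=150, ratio=0.5):
    """Step 1: coarse recognition with low-dimensional features.
    Step 2: confirmation with accuracy-adjusted, higher-dimensional features."""
    if match_ratio(registered, frame_low, low_dims) < ratio:
        return False
    return match_ratio(registered, frame_high, high_dims) >= ratio
```

The two-step structure mirrors the text: the cheap low-dimensional pass filters most frames, and the expensive high-dimensional pass runs only on candidates.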
- the information processing system according to the present embodiment differs from the second to fifth embodiments in that the search object searched for based on local features is an illegal duplicate of a video. Since the other configurations and operations are the same as those of the second to fourth embodiments, the same configurations and operations are denoted by the same reference numerals and detailed description thereof is omitted.
- according to the present embodiment, an illegal duplicate can be searched for in real time in video that is being reproduced or broadcast.
- FIG. 26 is a block diagram illustrating a configuration of an information processing system 2600 according to the present embodiment.
- the information processing system 2600 includes a plurality of communication terminals 2601 to 2605, an original video registration server 2620 for registering original videos, and an unauthorized duplicate monitoring server 2610 (an information processing apparatus) that monitors the broadcasting and reproduction of unauthorized duplicates, connected via a network 2640.
- the plurality of communication terminals 2601 to 2605 include portable terminals 2601 and 2602, a monitor 2603, a portable PC 2604, a desktop PC 2605, and the like. These communication terminals 2601 to 2605 have local feature generation units 2601a to 2605a, respectively, generate local features from video being downloaded or uploaded, or video being broadcast or reproduced, and transmit them to the unauthorized duplicate monitoring server 2610.
- the original video registration server 2620 is operated by video content production companies, video content providers, and the like. The original video registration server 2620 includes a local feature generation unit 2620a. It registers, in the unauthorized duplicate monitoring server 2610, local features generated from characteristic frames (original images) of the original video, local features of characteristic articles, people, and background scenery elements appearing in the original video, their combination information, and the like.
- the unauthorized duplicate monitoring server 2610 registers the local features of the original video transmitted from the original video registration server 2620 in the local feature DB 2611. It then collates, in real time, the local features of video being downloaded or uploaded from the communication terminals 2601 to 2605, or of video being broadcast or reproduced, with the local features registered in the local feature DB 2611. If the local features match with a predetermined probability, the video is determined to be an illegal duplicate, a warning that it is an illegal duplicate is issued to the sender, and the other communication terminals and the related original video registration server 2620 are notified to that effect.
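The monitoring decision described above (collate incoming local features against the registered originals and flag a match at or above a predetermined probability) can be sketched as follows. The descriptor quantization step and the 0.8 threshold are illustrative assumptions, not values from the specification.

```python
def quantize(vec, step=0.25):
    """Coarsely quantize a descriptor so it can be used as a hash key."""
    return tuple(round(v / step) for v in vec)

def detect_illegal_duplicate(frame_descriptors, original_db, min_match=0.8):
    """Return the ID of the registered original whose local features the
    frame matches with at least `min_match` probability, else None.
    original_db maps video IDs to lists of registered descriptors."""
    frame_keys = {quantize(d) for d in frame_descriptors}
    for video_id, original_descriptors in original_db.items():
        keys = {quantize(d) for d in original_descriptors}
        if keys and len(keys & frame_keys) / len(keys) >= min_match:
            return video_id
    return None
```

A match returns the original video ID, which the server would use to warn the sender and notify the related original video registration server.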
- FIG. 27 is a sequence diagram showing an operation procedure of the information processing system 2600 according to this embodiment.
- the download and activation of the application illustrated in FIG. 4 of the second embodiment are omitted to avoid complexity. This processing conforms to the processing of FIG. 4.
- FIG. 27 illustrates the registration of the original video from the original video registration server 2620 in step S2701.
- the information of the original video and the corresponding local feature amount are transmitted to the unauthorized duplicate monitoring server 2610.
- the unauthorized duplicate monitoring server 2610 registers the received local feature quantity in the local feature quantity DB 2611 in association with the original video ID.
- step S2705 includes video download and upload.
- step S2707 a local feature is generated from each video.
- step S2709 the generated local feature amount is encoded together with the feature point coordinates.
- the encoded local feature quantity is transmitted from the communication terminals 2601 to 2605 to the unauthorized duplicate monitoring server 2610 in step S2711.
- step S2713 the unauthorized duplicate monitoring server 2610 refers to the local feature DB 2611, collates the local feature, and detects a match with the original image from the video.
- step S2715 the presence / absence of an illegal duplicate is determined. If there is no illegal duplicate, the next local feature is received and the detection of the illegal duplicate in step S2713 is repeated. If an illegal duplicate is detected, the process proceeds to step S2717 to generate illegal duplicate information.
- in step S2719, an illegal duplicate warning is transmitted to the communication terminal reproducing the illegal duplicate, and information relating to the illegal duplicate is transmitted to the original video registration server 2620 related to the illegal duplicate.
- step S2721 the communication terminal reproducing the unauthorized copy issues an unauthorized copy warning.
- in step S2733, the original video registration server 2620 related to the illegal duplicate is notified of information on the illegal duplicate.
- the present invention may be applied to a system composed of a plurality of devices, or to a single device. Furthermore, the present invention can also be applied to a case where a control program that realizes the functions of the embodiments is supplied to a system or apparatus directly or remotely. Therefore, a control program installed in a computer to realize the functions of the present invention, a medium storing the control program, and a WWW (World Wide Web) server from which the control program is downloaded are also included in the scope of the present invention.
- (Appendix 1) An information processing system characterized by comprising: first local feature storage means for storing, in association with a search object, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the search object; second local feature generation means for extracting n feature points from an image in a video captured by first imaging means and generating, for each of n local regions including each of the n feature points, n second local features each consisting of a feature vector of 1 to j dimensions; and recognition means for selecting the smaller of the dimension number i of the feature vectors of the first local features and the dimension number j of the feature vectors of the second local features, and recognizing that the search object exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features, each consisting of a feature vector up to the selected number of dimensions, corresponds to the n second local features, each consisting of a feature vector up to the selected number of dimensions.
- (Appendix 2) The information processing system according to appendix 1, further comprising first local feature generation means for extracting m feature points from the image of the search object and generating the m first local features each consisting of a feature vector of 1 to i dimensions.
- (Appendix 3) The information processing system according to appendix 2, further comprising second imaging means for capturing an image of the search object, wherein the first local feature generation means generates the m first local features based on the image of the search object captured by the second imaging means.
- (Appendix 4) The information processing system according to appendix 3, further comprising notification means for notifying a recognition result of the recognition means.
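Appendix 1's matching rule (truncate both sides to the smaller of the two dimension numbers i and j, then test whether a predetermined ratio of the first local features corresponds) can be sketched as follows; the Euclidean distance metric and the threshold values are illustrative assumptions.

```python
import math

def recognize(first_feats, second_feats, ratio=0.5, tol=0.2):
    """first_feats: m registered vectors of i dimensions; second_feats: n
    query vectors of j dimensions. Both sides are truncated to the smaller
    dimension number min(i, j) before matching, as in Appendix 1."""
    d = min(len(first_feats[0]), len(second_feats[0]))  # smaller of i and j

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a[:d], b[:d])))

    # a first feature "corresponds" if some second feature lies within tol
    matched = sum(
        1 for f in first_feats
        if any(dist(f, s) < tol for s in second_feats)
    )
    return matched / len(first_feats) >= ratio
```

Because both sides are cut to the common prefix of dimensions, features generated at different accuracies (different i and j) remain directly comparable.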
- (Appendix 5) The information processing system according to appendix 4, wherein the information processing system includes a first communication terminal for generating local features of the search object, a second communication terminal for searching for the search object based on local features, and an information processing apparatus that communicates with the first communication terminal and the second communication terminal; the first communication terminal includes the second imaging means, the first local feature generation means, and the notification means, and transmits the m first local features from the first communication terminal to the information processing apparatus; the second communication terminal includes the first imaging means and the second local feature generation means, and transmits the n second local features from the second communication terminal to the information processing apparatus; and the information processing apparatus includes the first local feature storage means and the recognition means, and transmits the recognition result of the recognition means from the information processing apparatus to the first communication terminal.
- (Appendix 6) The information processing system according to appendix 2, wherein the search object is a lost or stolen article; the first local feature storage means stores the first local features generated by the first local feature generation means from an image of the lost or stolen article to be searched for; and the recognition means recognizes that the lost or stolen article exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features corresponds to the n second local features.
- (Appendix 7) The information processing system according to appendix 2, wherein the search object is a person; the first local feature storage means stores the first local features generated by the first local feature generation means from an image of the person to be searched for; and the recognition means recognizes that the person exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features corresponds to the n second local features.
- (Appendix 8) The information processing system according to appendix 2, wherein the search object is a duplicate; the first local feature storage means stores the first local features generated by the first local feature generation means from an original image; and the recognition means recognizes that the original image exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features corresponds to the n second local features.
- (Appendix 9) The information processing system according to any one of appendixes 1 to 8, wherein the second local feature generation means has accuracy adjustment means for adjusting the accuracy of the second local features, and the recognition means confirms the search object based on second local features generated by the second local feature generation means with the accuracy adjusted higher.
- (Appendix 10) The information processing system according to any one of appendixes 1 to 9, wherein the first local features and the second local features are generated by dividing a local region including a feature point extracted from an image into a plurality of sub-regions and generating a feature vector of a plurality of dimensions consisting of histograms of gradient directions in the plurality of sub-regions.
- (Appendix 11) The information processing system according to appendix 10, wherein the first local features and the second local features are generated by deleting, from the generated feature vector of a plurality of dimensions, dimensions having a larger correlation between adjacent sub-regions.
- (Appendix 12) The information processing system according to appendix 10 or 11, wherein the plurality of dimensions of the feature vector are arranged so as to circle the local region once every predetermined number of dimensions, so that they can be selected in order from the dimension contributing to the features of the feature point, and from the first dimension onward in accordance with the improvement in accuracy required of the local features.
- (Appendix 13) The information processing system according to appendix 12, wherein the second local feature generation means generates, in accordance with the correlation of the search objects to be searched for, second local features with a smaller number of dimensions for a search object having a lower correlation with other search objects.
- (Appendix 14) The information processing system according to appendix 12 or 13, wherein the first local feature storage means stores, in accordance with the correlation of the search objects to be searched for, first local features with a smaller number of dimensions for a search object having a lower correlation with other search objects.
- (Appendix 15) An information processing method of an information processing system comprising first local feature storage means for storing, in association with a search object, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the search object, the method characterized by comprising: a second local feature generation step of extracting n feature points from an image in a video captured by first imaging means and generating, for each of n local regions including each of the n feature points, n second local features each consisting of a feature vector of 1 to j dimensions; and a recognition step of selecting the smaller of the dimension number i of the feature vectors of the first local features and the dimension number j of the feature vectors of the second local features, and recognizing that the search object exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features, each consisting of a feature vector up to the selected number of dimensions, corresponds to the n second local features, each consisting of a feature vector up to the selected number of dimensions.
- (Appendix 16) A communication terminal characterized by comprising: second local feature generation means for extracting n feature points from an image in a captured video and generating, for each of n local regions including each of the n feature points, n second local features each consisting of a feature vector of 1 to j dimensions; and first transmission means for transmitting the n second local features to an information processing apparatus that recognizes, based on collation of local features, a search object included in the captured image.
- (Appendix 17) A control method for a communication terminal, characterized by including: a second local feature generation step of extracting n feature points from an image in a captured video and generating, for each of n local regions including each of the n feature points, n second local features each consisting of a feature vector of 1 to j dimensions; and a first transmission step of transmitting the n second local features to an information processing apparatus that recognizes, based on collation of local features, a search object included in the captured image.
- (Appendix 18) A control program for a communication terminal, which causes a computer to execute: a second local feature generation step of extracting n feature points from an image in a captured video and generating, for each of n local regions including each of the n feature points, n second local features each consisting of a feature vector of 1 to j dimensions; and a first transmission step of transmitting the n second local features to an information processing apparatus that recognizes, based on collation of local features, a search object included in the captured image.
- (Appendix 19) A communication terminal characterized by comprising: first local feature generation means for extracting m feature points from a captured image of a search object and generating, for each of m local regions including each of the m feature points, m first local features each consisting of a feature vector of 1 to i dimensions; second transmission means for transmitting the m first local features to an information processing apparatus that recognizes, based on collation of local features, whether the imaged search object is included in another image; and first receiving means for receiving, from the information processing apparatus, information indicating the imaged search object included in the other image.
- (Appendix 20) A control method for a communication terminal, characterized by including: a first local feature generation step of extracting m feature points from a captured image of a search object and generating, for each of m local regions including each of the m feature points, m first local features each consisting of a feature vector of 1 to i dimensions; a second transmission step of transmitting the m first local features to an information processing apparatus that recognizes, based on collation of local features, whether the imaged search object is included in another image; and a first receiving step of receiving, from the information processing apparatus, information indicating the imaged search object included in the other image.
- (Appendix 21) A control program for a communication terminal, which causes a computer to execute: a first local feature generation step of extracting m feature points from a captured image of a search object and generating, for each of m local regions including each of the m feature points, m first local features each consisting of a feature vector of 1 to i dimensions; a second transmission step of transmitting the m first local features to an information processing apparatus that recognizes, based on collation of local features, whether the imaged search object is included in another image; and a first receiving step of receiving, from the information processing apparatus, information indicating the imaged search object included in the other image.
- (Appendix 22) An information processing apparatus characterized by comprising: first local feature storage means for storing, in association with a search object, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the search object; second receiving means for receiving, from a first communication terminal searching for the search object, n second local features each consisting of a feature vector of 1 to j dimensions, generated for each of n local regions including each of n feature points extracted from an image in a video captured by the first communication terminal; recognition means for selecting the smaller of the dimension number i of the feature vectors of the first local features and the dimension number j of the feature vectors of the second local features, and recognizing that the search object exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features, each consisting of a feature vector up to the selected number of dimensions, corresponds to the n second local features, each consisting of a feature vector up to the selected number of dimensions; and second transmission means for transmitting information indicating the recognized search object to a second communication terminal that requested the search for the search object.
- (Appendix 23) A control method for an information processing apparatus comprising first local feature storage means for storing, in association with a search object, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the search object, the method characterized by including: a second receiving step of receiving, from a first communication terminal searching for the search object, n second local features each consisting of a feature vector of 1 to j dimensions, generated for each of n local regions including each of n feature points extracted from an image in a video captured by the first communication terminal; a recognition step of selecting the smaller of the dimension numbers i and j and recognizing that the search object exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features corresponds to the n second local features, each consisting of a feature vector up to the selected number of dimensions; and a second transmission step of transmitting information indicating the recognized search object to a second communication terminal that requested the search for the search object.
- (Appendix 24) A control program for an information processing apparatus comprising first local feature storage means for storing, in association with a search object, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the search object, the program causing a computer to execute: a second receiving step of receiving, from a first communication terminal searching for the search object, n second local features each consisting of a feature vector of 1 to j dimensions, generated for each of n local regions including each of n feature points extracted from an image in a video captured by the first communication terminal; a recognition step of selecting the smaller of the dimension numbers i and j and recognizing that the search object exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features corresponds to the n second local features, each consisting of a feature vector up to the selected number of dimensions; and a second transmission step of transmitting information indicating the recognized search object to a second communication terminal that requested the search for the search object.
Abstract
Description
first local feature storage means for storing, in association with a search object, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the search object;
second local feature generation means for extracting n feature points from an image in a video captured by first imaging means and generating, for each of n local regions including each of the n feature points, n second local features each consisting of a feature vector of 1 to j dimensions; and
recognition means for selecting the smaller of the dimension number i of the feature vectors of the first local features and the dimension number j of the feature vectors of the second local features, and recognizing that the search object exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features, each consisting of a feature vector up to the selected number of dimensions, corresponds to the n second local features, each consisting of a feature vector up to the selected number of dimensions;
the system being characterized by comprising the above means.
An information processing method of an information processing system comprising first local feature storage means for storing, in association with a search object, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the search object, the method comprising:
a second local feature generation step of extracting n feature points from an image in a video captured by first imaging means and generating, for each of n local regions including each of the n feature points, n second local features each consisting of a feature vector of 1 to j dimensions; and
a recognition step of selecting the smaller of the dimension number i of the feature vectors of the first local features and the dimension number j of the feature vectors of the second local features, and recognizing that the search object exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features, each consisting of a feature vector up to the selected number of dimensions, corresponds to the n second local features, each consisting of a feature vector up to the selected number of dimensions;
the method being characterized by comprising the above steps.
second local feature generation means for extracting n feature points from an image in a captured video and generating, for each of n local regions including each of the n feature points, n second local features each consisting of a feature vector of 1 to j dimensions; and
first transmission means for transmitting the n second local features to an information processing apparatus that recognizes, based on collation of local features, a search object included in the captured image;
the communication terminal being characterized by comprising the above means.
A second local feature generation step of extracting n feature points from an image in a captured video and generating, for each of n local regions including each of the n feature points, n second local features each consisting of a feature vector of 1 to j dimensions; and
a first transmission step of transmitting the n second local features to an information processing apparatus that recognizes, based on collation of local features, a search object included in the captured image;
the control method being characterized by including the above steps.
A second local feature generation step of extracting n feature points from an image in a captured video and generating, for each of n local regions including each of the n feature points, n second local features each consisting of a feature vector of 1 to j dimensions; and
a first transmission step of transmitting the n second local features to an information processing apparatus that recognizes, based on collation of local features, a search object included in the captured image;
which the control program causes a computer to execute.
first local feature generation means for extracting m feature points from a captured image of a search object and generating, for each of m local regions including each of the m feature points, m first local features each consisting of a feature vector of 1 to i dimensions;
second transmission means for transmitting the m first local features to an information processing apparatus that recognizes, based on collation of local features, whether the imaged search object is included in another image; and
first receiving means for receiving, from the information processing apparatus, information indicating the imaged search object included in the other image;
the communication terminal being characterized by comprising the above means.
A first local feature generation step of extracting m feature points from a captured image of a search object and generating, for each of m local regions including each of the m feature points, m first local features each consisting of a feature vector of 1 to i dimensions;
a second transmission step of transmitting the m first local features to an information processing apparatus that recognizes, based on collation of local features, whether the imaged search object is included in another image; and
a first receiving step of receiving, from the information processing apparatus, information indicating the imaged search object included in the other image;
the control method for a communication terminal being characterized by including the above steps.
A first local feature generation step of extracting m feature points from a captured image of a search object and generating, for each of m local regions including each of the m feature points, m first local features each consisting of a feature vector of 1 to i dimensions;
a second transmission step of transmitting the m first local features to an information processing apparatus that recognizes, based on collation of local features, whether the imaged search object is included in another image; and
a first receiving step of receiving, from the information processing apparatus, information indicating the imaged search object included in the other image;
which the control program for a communication terminal causes a computer to execute.
first local feature storage means for storing, in association with a search object, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the search object;
second receiving means for receiving, from a first communication terminal searching for the search object, n second local features each consisting of a feature vector of 1 to j dimensions, generated for each of n local regions including each of n feature points extracted from an image in a video captured by the first communication terminal;
recognition means for selecting the smaller of the dimension number i of the feature vectors of the first local features and the dimension number j of the feature vectors of the second local features, and recognizing that the search object exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features, each consisting of a feature vector up to the selected number of dimensions, corresponds to the n second local features, each consisting of a feature vector up to the selected number of dimensions; and
second transmission means for transmitting information indicating the recognized search object to a second communication terminal that requested the search for the search object;
the information processing apparatus being characterized by comprising the above means.
A control method for an information processing apparatus comprising first local feature storage means for storing, in association with a search object, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the search object, the method including:
a second receiving step of receiving, from a first communication terminal searching for the search object, n second local features each consisting of a feature vector of 1 to j dimensions, generated for each of n local regions including each of n feature points extracted from an image in a video captured by the first communication terminal;
a recognition step of selecting the smaller of the dimension number i of the feature vectors of the first local features and the dimension number j of the feature vectors of the second local features, and recognizing that the search object exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features, each consisting of a feature vector up to the selected number of dimensions, corresponds to the n second local features, each consisting of a feature vector up to the selected number of dimensions; and
a second transmission step of transmitting information indicating the recognized search object to the second communication terminal that requested the search for the search object;
the control method being characterized by including the above steps.
A control program for an information processing apparatus comprising first local feature storage means for storing, in association with a search object, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the search object, the program causing a computer to execute:
a second receiving step of receiving, from a first communication terminal searching for the search object, n second local features each consisting of a feature vector of 1 to j dimensions, generated for each of n local regions including each of n feature points extracted from an image in a video captured by the first communication terminal;
a recognition step of selecting the smaller of the dimension number i of the feature vectors of the first local features and the dimension number j of the feature vectors of the second local features, and recognizing that the search object exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features, each consisting of a feature vector up to the selected number of dimensions, corresponds to the n second local features, each consisting of a feature vector up to the selected number of dimensions; and
a second transmission step of transmitting information indicating the recognized search object to the second communication terminal that requested the search for the search object.
An information processing system 100 according to a first embodiment of the present invention will be described with reference to FIG. 1. The information processing system 100 is a system for finding a search object in captured video by using local features.
Next, an information processing system according to a second embodiment of the present invention will be described. In this embodiment, a request to search for a search object, including stolen or lost articles, is received together with local features of an image of the search object. The search object is then found and reported to the searcher based on collation with local features received from various communication terminals. Although stolen and lost articles are described as examples of the search object in this embodiment, the search object may also be a person being searched for.
FIG. 2 is a block diagram showing the configuration of an information processing system 200 according to this embodiment.
FIG. 3 is a diagram for explaining the operation of the information processing system 200 according to this embodiment.
FIG. 4 is a sequence diagram showing an operation procedure of the information processing system 200 according to this embodiment.
FIG. 5 is a block diagram showing the functional configuration of the search request communication terminal 230 according to this embodiment.
FIG. 6 is a block diagram showing the functional configuration of the search communication terminals 211 to 214 according to this embodiment.
FIG. 7 is a block diagram showing the functional configuration of the search server 220, the information processing apparatus according to this embodiment.
FIG. 8 is a diagram showing the configuration of the local feature DB 221 according to this embodiment. The DB is not limited to this configuration.
FIG. 9 is a diagram showing the configuration of the search object DB 222 according to this embodiment. The configuration of the search object DB 222 is not limited to FIG. 9.
FIG. 10 is a diagram showing the configuration of the accuracy adjustment DB 410 according to this embodiment. The configuration of the accuracy adjustment DB 410 is not limited to FIG. 10.
FIG. 11A is a block diagram showing the configuration of the local feature generation unit 702 according to this embodiment.
FIGS. 11B to 11F are diagrams showing the processing of the local feature generation units 502 and 602 according to this embodiment.
An image 1121 in FIG. 11B shows a state in which feature points have been detected from an image in the video by the feature point detection unit 1111 in FIG. 11A. Generation of a local feature will be described below using one piece of feature point data 1121a as a representative. The origin of the arrow of the feature point data 1121a indicates the coordinate position of the feature point, the length of the arrow indicates the scale (magnitude), and the direction of the arrow indicates the angle. For the scale and direction, luminance, saturation, hue, and the like can be selected according to the target video. The example of FIG. 11B describes six directions at 60-degree intervals, but the directions are not limited to this.
The local region acquisition unit 1112 in FIG. 11A generates, for example, a Gaussian window 1122a centered on the origin of the feature point data 1121a, and generates a local region 1122 that substantially contains the Gaussian window 1122a. In the example of FIG. 11B the local region acquisition unit 1112 generates a square local region 1122, but the local region may be circular or have another shape. This local region is acquired for each feature point. A circular local region has the effect of improving robustness with respect to the imaging direction.
Next, the sub-region division unit 1113 divides the scale and angle of each pixel included in the local region 1122 of the feature point data 1121a into sub-regions 1123. FIG. 11B shows an example of division into 5 × 5 = 25 sub-regions of 4 × 4 = 16 pixels each. However, the sub-regions may be 4 × 4 = 16, or take other shapes and division numbers.
The sub-region feature vector generation unit 1114 quantizes the scale of each pixel in a sub-region by generating a histogram in units of the six direction angles, yielding the feature vector 1124 of the sub-region; the directions are normalized with respect to the angle output by the feature point detection unit 1111. The sub-region feature vector generation unit 1114 then totals the quantized frequencies of the six directions for each sub-region and generates a histogram. In this case, the sub-region feature vector generation unit 1114 outputs, for each feature point, a feature vector composed of a histogram of 25 sub-region blocks × 6 directions = 150 dimensions. The gradient directions need not be quantized into exactly six directions, and may be quantized into any number of directions, such as 4, 8, or 10. When the gradient direction is quantized into D directions, with the gradient direction before quantization denoted G (0 to 2π radians), the quantized gradient direction value Qq (q = 0, ..., D−1) can be obtained, for example, by equation (1) or equation (2), but is not limited to these.
Qq = floor(G × D / 2π) …(1)
Qq = round(G × D / 2π) mod D …(2)
Here, floor() is a function that truncates the fractional part, round() is a function that rounds to the nearest integer, and mod is an operation that obtains the remainder. When generating the gradient histogram, the sub-region feature vector generation unit 1114 may total the gradient magnitudes by addition instead of totaling simple frequencies. When totaling the gradient histogram, the unit may also add weight values not only to the sub-region to which a pixel belongs but also to nearby sub-regions (such as adjacent blocks) according to the distance between sub-regions. Weight values may also be added to the gradient directions before and after the quantized gradient direction. The feature vector of a sub-region is not limited to a gradient direction histogram, and may be anything having a plurality of dimensions (elements), such as color information. In this embodiment, a gradient direction histogram is used as the feature vector of a sub-region.
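Equations (1) and (2) above translate directly into code. The following is an illustrative sketch, assuming gradient directions G given in radians in [0, 2π) and the 25-sub-region × 6-direction descriptor described above; the helper names are not from the specification.

```python
import math

def quantize_floor(g, d):
    """Equation (1): Qq = floor(G * D / 2pi); assumes 0 <= g < 2*pi."""
    return math.floor(g * d / (2 * math.pi))

def quantize_round(g, d):
    """Equation (2): Qq = round(G * D / 2pi) mod D; the mod folds the
    rounded value D back into bin 0."""
    return round(g * d / (2 * math.pi)) % d

def descriptor_dimensions(subregions=25, directions=6):
    """Histogram dimensions per feature point: 25 sub-regions x 6 directions."""
    return subregions * directions
```

Note that Python's built-in round() rounds halves to the nearest even integer, whereas the specification's round() is conventional rounding; for exact half-way values the two can differ by one bin.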
Next, the processing of the dimension selection unit 1115 in the local feature generation units 502 and 602 will be described with reference to FIGS. 11C to 11F.
FIG. 11C is a diagram showing how the number of dimensions of a feature vector is selected in the local feature generation units 502 and 602.
FIG. 11D is a diagram showing an example of the order in which feature vector dimensions are selected from the sub-regions in the local feature generation units 502 and 602.
FIG. 11G is a block diagram showing the encoding units 503a and 603a according to this embodiment. The encoding unit is not limited to this example, and other encoding processes are also applicable.
FIG. 11H is a diagram showing the processing of the search object recognition unit 705 according to this embodiment.
Several example configurations of the accuracy adjustment unit 606 will be described below with reference to FIGS. 12A to 12C, 13, and 14.
FIG. 12A is a block diagram showing a first configuration 606-1 of the accuracy adjustment unit 606 according to this embodiment. In the first configuration 606-1, the dimension number determination unit 1211 can determine the number of dimensions.
FIG. 12B is a block diagram showing a second configuration 606-2 of the accuracy adjustment unit 606 according to this embodiment. In the second configuration 606-2, the feature vector extension unit 1212 can change the number of dimensions by combining the values of a plurality of dimensions.
FIG. 13 is a block diagram showing a third configuration 606-3 of the accuracy adjustment unit 606 according to this embodiment. In the third configuration 606-3, the feature point selection unit 1311 can change the data volume of the local features while maintaining accuracy by changing the number of feature points through feature point selection.
FIG. 14 is a block diagram showing a fourth configuration 606-4 of the accuracy adjustment unit 606 according to this embodiment. In the fourth configuration 606-4, the dimension number determination unit 1211 and the feature point selection unit 1311 cooperate to change the data volume of the local features while maintaining accuracy.
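The four configurations of the accuracy adjustment unit 606 boil down to two knobs: the number of dimensions per local feature (dimension number determination unit 1211) and the number of feature points (feature point selection unit 1311). A minimal sketch of how the two knobs combine, with illustrative function and parameter names:

```python
def adjust_accuracy(features, dims=None, max_points=None):
    """features: list of (priority, vector) pairs, highest priority first.
    Limiting `dims` corresponds to the dimension number determination unit
    1211 (first configuration); limiting `max_points` corresponds to the
    feature point selection unit 1311 (third configuration); using both
    corresponds to the cooperating fourth configuration 606-4."""
    selected = features if max_points is None else features[:max_points]
    if dims is None:
        return [vec for _, vec in selected]
    return [vec[:dims] for _, vec in selected]
```

Because dimensions are stored in priority order, truncating a vector to its leading dimensions reduces data volume while keeping the most discriminative components.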
FIG. 15A is a block diagram showing the hardware configuration of the search request communication terminal 230 according to this embodiment.
FIG. 15B is a diagram showing the local feature generation table 1543 in the search request communication terminal 230 according to this embodiment. The local feature generation table 1543 stores, in association with an input image ID 1501, a plurality of detected feature points 1502, feature point coordinates 1503, and local region information 1504 corresponding to the feature points. In association with each detected feature point 1502, feature point coordinate 1503, and piece of local region information 1504, a plurality of sub-region IDs 1505, sub-region information 1506, feature vectors 1507 corresponding to each sub-region, and selected dimensions 1508 including priorities are stored.
FIG. 16 is a flowchart showing the processing procedure of the search request communication terminal 230 according to this embodiment. This flowchart is executed by the CPU 1510 of FIG. 15A using the RAM 1540, and realizes the functional components of FIG. 5.
FIG. 17A is a flowchart showing the procedure of the local feature generation processing S1615 according to this embodiment.
FIG. 17B is a flowchart showing the procedure of the encoding processing S1617 according to this embodiment.
FIG. 17C is a flowchart showing the procedure of the difference value encoding processing S1737 according to this embodiment.
FIG. 18A is a block diagram showing the hardware configuration of the search communication terminals 211 to 214 according to this embodiment.
FIG. 18B is a diagram showing the configuration of the accuracy parameter 606a according to this embodiment.
FIG. 19 is a flowchart showing the processing procedure of the search communication terminals 211 to 214 according to this embodiment. This flowchart is executed by the CPU 1810 of FIG. 18A using the RAM 1840, and realizes the functional components of FIG. 6.
FIG. 20 is a block diagram showing the hardware configuration of the search server 220, the information processing apparatus according to this embodiment.
FIG. 21 is a flowchart showing the processing procedure of the search server 220 according to this embodiment.
FIG. 22A is a flowchart showing the procedure of the search object recognition processing S2123 according to this embodiment.
FIG. 22B is a flowchart showing the procedure of the collation processing S2213 according to this embodiment.
FIG. 22C is a flowchart showing the procedure of the search object confirmation processing S2131 according to this embodiment.
Next, an information processing system according to a third embodiment of the present invention will be described. Compared with the second embodiment, the information processing system according to this embodiment differs in that the search communication terminals and the search server share the search object recognition processing. Since the other configurations and operations are the same as those of the second embodiment, the same configurations and operations are denoted by the same reference numerals and their detailed description is omitted.
FIG. 23 is a sequence diagram showing an operation procedure of the information processing system according to this embodiment. In FIG. 23, the initial setting processing, search request processing, and search object confirmation processing shown in FIG. 4 of the second embodiment are omitted or simplified to avoid complexity. These processes conform to the processing of FIG. 4. Accordingly, application download and startup processing are omitted from FIG. 23.
Next, an information processing system according to a fourth embodiment of the present invention will be described. Compared with the second and third embodiments, the information processing system according to this embodiment differs in that the search server selects the search communication terminals to be used for the search. Since the other configurations and operations are the same as those of the second and third embodiments, the same configurations and operations are denoted by the same reference numerals and their detailed description is omitted.
FIG. 24 is a sequence diagram showing an operation procedure of the information processing system according to this embodiment. In FIG. 24, the initial setting processing, search request processing, and search object confirmation processing shown in FIG. 4 of the second embodiment are omitted or simplified to avoid complexity. These processes conform to the processing of FIG. 4. In FIG. 24, when a search request is made, the application is downloaded only to the search communication terminals within the search range of the requested search object, and an intensive search is performed. The search range may be divided into multiple levels, and local features of an accuracy corresponding to the level may be selected, or the methods of recognition processing and confirmation processing may be selected accordingly.
Next, an information processing system according to a fifth embodiment of the present invention will be described. Compared with the second to fourth embodiments, the information processing system according to this embodiment differs in that a single communication terminal performs all of the processing of the search request communication terminal, the search communication terminal, and the search server. Since the other configurations and operations are the same as those of the second to fourth embodiments, the same configurations and operations are denoted by the same reference numerals and their detailed description is omitted.
FIG. 25 is a block diagram showing the functional configuration of the communication terminal 2511 according to this embodiment. In FIG. 25, functional components similar to those in FIG. 6 of the second embodiment are denoted by the same reference numerals, and their description is omitted.
Next, an information processing system according to a sixth embodiment of the present invention will be described. Compared with the second to fifth embodiments, the information processing system according to this embodiment differs in that the search object searched for by local features is an illegal duplicate of a video. Since the other configurations and operations are the same as those of the second to fourth embodiments, the same configurations and operations are denoted by the same reference numerals and their detailed description is omitted.
FIG. 26 is a block diagram showing the configuration of an information processing system 2600 according to this embodiment.
FIG. 27 is a sequence diagram showing an operation procedure of the information processing system 2600 according to this embodiment. In FIG. 27, the application download and startup shown in FIG. 4 of the second embodiment are omitted to avoid complexity. This processing conforms to the processing of FIG. 4.
Although the present invention has been described above with reference to embodiments, the present invention is not limited to the above embodiments. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention. Systems or apparatuses combining in any way the separate features included in the respective embodiments are also included in the scope of the present invention.
An information processing system characterized by comprising:
first local feature storage means for storing, in association with a search object, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the search object;
second local feature generation means for extracting n feature points from an image in a video captured by first imaging means and generating, for each of n local regions including each of the n feature points, n second local features each consisting of a feature vector of 1 to j dimensions; and
recognition means for selecting the smaller of the dimension number i of the feature vectors of the first local features and the dimension number j of the feature vectors of the second local features, and recognizing that the search object exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features, each consisting of a feature vector up to the selected number of dimensions, corresponds to the n second local features, each consisting of a feature vector up to the selected number of dimensions.
(Appendix 2)
The information processing system according to appendix 1, further comprising first local feature generation means for extracting m feature points from the image of the search object and generating the m first local features each consisting of a feature vector of 1 to i dimensions.
(Appendix 3)
The information processing system according to appendix 2, further comprising second imaging means for capturing an image of the search object, wherein the first local feature generation means generates the m first local features based on the image of the search object captured by the second imaging means.
(Appendix 4)
The information processing system according to appendix 3, further comprising notification means for notifying a recognition result of the recognition means.
(Appendix 5)
The information processing system according to appendix 4, wherein the information processing system includes a first communication terminal for generating local features of the search object, a second communication terminal for searching for the search object based on local features, and an information processing apparatus that communicates with the first communication terminal and the second communication terminal;
the first communication terminal includes the second imaging means, the first local feature generation means, and the notification means, and transmits the m first local features from the first communication terminal to the information processing apparatus;
the second communication terminal includes the first imaging means and the second local feature generation means, and transmits the n second local features from the second communication terminal to the information processing apparatus; and
the information processing apparatus includes the first local feature storage means and the recognition means, and transmits the recognition result of the recognition means from the information processing apparatus to the first communication terminal.
(付記6)
前記捜索物は遺失物または盗難物であり、
前記第1局所特徴量記憶手段は、捜索する前記遺失物または盗難物の画像から前記第1局所特徴量生成手段が生成した第1局所特徴量を記憶し、
前記認識手段は、前記n個の第2局所特徴量に、前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記遺失物または盗難物が存在すると認識することを特徴とする付記2に記載の情報処理システム。
(付記7)
前記捜索物は人物であり、
前記第1局所特徴量記憶手段は、捜索する前記人物の画像から前記第1局所特徴量生成手段が生成した第1局所特徴量を記憶し、
前記認識手段は、前記n個の第2局所特徴量に、前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記人物が存在すると認識することを特徴とする付記2に記載の情報処理システム。
(付記8)
前記捜索物は複製物であり、
前記第1局所特徴量記憶手段は、元画像から前記第1局所特徴量生成手段が生成した第1局所特徴量を記憶し、
前記認識手段は、前記n個の第2局所特徴量に、前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記元画像が存在すると認識することを特徴とする付記2に記載の情報処理システム。
(付記9)
前記第2局所特徴量生成手段は、前記第2局所特徴量の精度を調整する精度調整手段を有し、
前記認識手段は、精度をより高く調整して前記第2局所特徴量生成手段が生成した第2局所特徴量に基づいて前記捜索物を確認することを特徴とする付記1乃至8のいずれか1つに記載の情報処理システム。
(付記10)
前記第1局所特徴量および前記第2局所特徴量は、画像から抽出した特徴点を含む局所領域を複数のサブ領域に分割し、前記複数のサブ領域内の勾配方向のヒストグラムからなる複数の次元の特徴ベクトルを生成することにより生成されることを特徴とする付記1乃至9のいずれか1つに記載の情報処理システム。
(付記11)
前記第1局所特徴量および前記第2局所特徴量は、前記生成した複数の次元の特徴ベクトルから、隣接するサブ領域間の相関がより大きな次元を削除することにより生成されることを特徴とする付記10に記載の情報処理システム。
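付記11の次元削除(相関がより大きな次元を削除して局所特徴量を圧縮する)の考え方は、特徴ベクトルの標本から次元間の相関を推定し、相関の大きい次元を落とす形で概略を書ける。ここでは「隣接するサブ領域間」という幾何的条件を単純化し、全次元間の相関で代用した説明用の仮定的な例である。

```python
import numpy as np

def select_dims(samples, keep):
    """特徴ベクトルの標本 samples (N×D) から、
    他の次元との相関が大きい次元を削除し、keep 次元を残す概略。"""
    corr = np.abs(np.corrcoef(samples, rowvar=False))  # 次元間相関 (D×D)
    np.fill_diagonal(corr, 0.0)
    score = corr.max(axis=0)   # 各次元が持つ最大の相関
    order = np.argsort(score)  # 相関の小さい(情報の重複が少ない)次元を優先
    return np.sort(order[:keep])
```

相関の大きい次元は隣の次元から値が推定できる冗長な成分であり、削除しても照合精度への影響が小さい、というのがこの圧縮の根拠である。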
(付記12)
前記特徴ベクトルの複数の次元は、前記特徴点の特徴に寄与する次元から順に、かつ、前記局所特徴量に対して求められる精度の向上に応じて第1次元から順に選択できるよう、所定の次元数ごとに前記局所領域をひと回りするよう配列することを特徴とする付記10または11に記載の情報処理システム。
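付記12の次元配列(所定の次元数ごとに局所領域をひと回りするよう特徴ベクトルの次元を並べ替え、先頭から順に切り出しても全サブ領域をほぼ均等にカバーできるようにする)は、次のように概略を書ける。grid×grid×bins という元の並びは、付記10の構成に合わせた説明用の仮定である。

```python
import numpy as np

def cyclic_order(vec, grid=4, bins=8):
    """grid*grid*bins 次元の特徴ベクトルを、連続する grid*grid 次元が
    全サブ領域を一巡する順序に並べ替える概略。先頭から k 次元を
    切り出すだけで、低次元でも局所領域全体を覆う特徴量が得られる。"""
    v = np.asarray(vec).reshape(grid * grid, bins)  # (サブ領域, ビン)
    # 転置してから平坦化すると、各ビンごとに全サブ領域を巡る並びになる
    return v.T.ravel()
```

この並びにより、求められる精度に応じて第1次元から途中まで使う・全次元使うという選択を、並べ替えなしの単純な切り詰めで実現できる。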
(付記13)
前記第2局所特徴量生成手段は、捜索する前記捜索物の相関に対応して、他の捜索物とより低い前記相関を有する捜索物については次元数のより少ない前記第2局所特徴量を生成することを特徴とする付記12に記載の情報処理システム。
(付記14)
前記第1局所特徴量記憶手段は、捜索する前記捜索物の相関に対応して、他の捜索物とより低い前記相関を有する捜索物については次元数のより少ない前記第1局所特徴量を記憶することを特徴とする付記12または13に記載の情報処理システム。
(付記15)
捜索物と、前記捜索物の画像のm個の特徴点のそれぞれを含むm個の局所領域のそれぞれについて生成された、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量とを、対応付けて記憶する第1局所特徴量記憶手段を備えた情報処理システムの情報処理方法であって、
第1撮像手段によって撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を生成する第2局所特徴量生成ステップと、
前記第1局所特徴量の特徴ベクトルの次元数iおよび前記第2局所特徴量の特徴ベクトルの次元数jのうち、より少ない次元数を選択し、選択された前記次元数までの特徴ベクトルからなる前記n個の第2局所特徴量に、選択された前記次元数までの特徴ベクトルからなる前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記捜索物が存在すると認識する認識ステップと、
を備えることを特徴とする情報処理方法。
(付記16)
撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を生成する第2局所特徴量生成手段と、
前記n個の第2局所特徴量を、局所特徴量の照合に基づいて撮像した前記画像に含まれる捜索物を認識する情報処理装置に送信する第1送信手段と、
を備えることを特徴とする通信端末。
(付記17)
撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を生成する第2局所特徴量生成ステップと、
前記n個の第2局所特徴量を、局所特徴量の照合に基づいて撮像した前記画像に含まれる捜索物を認識する情報処理装置に送信する第1送信ステップと、
を含むことを特徴とする通信端末の制御方法。
(付記18)
撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を生成する第2局所特徴量生成ステップと、
前記n個の第2局所特徴量を、局所特徴量の照合に基づいて撮像した前記画像に含まれる捜索物を認識する情報処理装置に送信する第1送信ステップと、
をコンピュータに実行させることを特徴とする制御プログラム。
(付記19)
撮像した捜索物の画像からm個の特徴点を抽出し、前記m個の特徴点のそれぞれを含むm個の局所領域について、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量を生成する第1局所特徴量生成手段と、
撮像した捜索物が他の画像に含まれるか否かを局所特徴量の照合に基づいて認識する情報処理装置に、前記m個の第1局所特徴量を送信する第2送信手段と、
前記情報処理装置から、前記他の画像に含まれる前記撮像した捜索物を示す情報を受信する第1受信手段と、
を備えたことを特徴とする通信端末。
(付記20)
撮像した捜索物の画像からm個の特徴点を抽出し、前記m個の特徴点のそれぞれを含むm個の局所領域について、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量を生成する第1局所特徴量生成ステップと、
撮像した捜索物が他の画像に含まれるか否かを局所特徴量の照合に基づいて認識する情報処理装置に、前記m個の第1局所特徴量を送信する第2送信ステップと、
前記情報処理装置から、前記他の画像に含まれる前記撮像した捜索物を示す情報を受信する第1受信ステップと、
を含むことを特徴とする通信端末の制御方法。
(付記21)
撮像した捜索物の画像からm個の特徴点を抽出し、前記m個の特徴点のそれぞれを含むm個の局所領域について、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量を生成する第1局所特徴量生成ステップと、
撮像した捜索物が、他の画像に含まれるか否かを局所特徴量の照合に基づいて認識する情報処理装置に、前記m個の第1局所特徴量を送信する第2送信ステップと、
前記情報処理装置から、前記他の画像に含まれる前記撮像した捜索物を示す情報を受信する第1受信ステップと、
をコンピュータに実行させることを特徴とする通信端末の制御プログラム。
(付記22)
捜索物と、前記捜索物の画像のm個の特徴点のそれぞれを含むm個の局所領域のそれぞれについて生成された、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量とを、対応付けて記憶する第1局所特徴量記憶手段と、
前記捜索物を捜索する第1通信端末が撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を、前記第1通信端末から受信する第2受信手段と、
前記第1局所特徴量の特徴ベクトルの次元数iおよび前記第2局所特徴量の特徴ベクトルの次元数jのうち、より少ない次元数を選択し、選択された前記次元数までの特徴ベクトルからなる前記n個の第2局所特徴量に、選択された前記次元数までの特徴ベクトルからなる前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記捜索物が存在すると認識する認識手段と、
認識した前記捜索物を示す情報を、前記捜索物の捜索を依頼した第2通信端末に送信する第2送信手段と、
を備えたことを特徴とする情報処理装置。
(付記23)
捜索物と、前記捜索物の画像のm個の特徴点のそれぞれを含むm個の局所領域のそれぞれについて生成された、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量とを、対応付けて記憶する第1局所特徴量記憶手段を備えた情報処理装置の制御方法であって、
前記捜索物を捜索する第1通信端末が撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を、前記第1通信端末から受信する第2受信ステップと、
前記第1局所特徴量の特徴ベクトルの次元数iおよび前記第2局所特徴量の特徴ベクトルの次元数jのうち、より少ない次元数を選択し、選択された前記次元数までの特徴ベクトルからなる前記n個の第2局所特徴量に、選択された前記次元数までの特徴ベクトルからなる前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記捜索物が存在すると認識する認識ステップと、
認識した前記捜索物を示す情報を、前記捜索物の捜索を依頼した第2通信端末に送信する第2送信ステップと、
を含むことを特徴とする情報処理装置の制御方法。
(付記24)
捜索物と、前記捜索物の画像のm個の特徴点のそれぞれを含むm個の局所領域のそれぞれについて生成された、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量とを、対応付けて記憶する第1局所特徴量記憶手段を備えた情報処理装置の制御プログラムであって、
前記捜索物を捜索する第1通信端末が撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を、前記第1通信端末から受信する第2受信ステップと、
前記第1局所特徴量の特徴ベクトルの次元数iおよび前記第2局所特徴量の特徴ベクトルの次元数jのうち、より少ない次元数を選択し、選択された前記次元数までの特徴ベクトルからなる前記n個の第2局所特徴量に、選択された前記次元数までの特徴ベクトルからなる前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記捜索物が存在すると認識する認識ステップと、
認識した前記捜索物を示す情報を、前記捜索物の捜索を依頼した第2通信端末に送信する第2送信ステップと、
をコンピュータに実行させることを特徴とする制御プログラム。
Claims (24)
- 捜索物と、前記捜索物の画像のm個の特徴点のそれぞれを含むm個の局所領域のそれぞれについて生成された、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量とを、対応付けて記憶する第1局所特徴量記憶手段と、
第1撮像手段が撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を生成する第2局所特徴量生成手段と、
前記第1局所特徴量の特徴ベクトルの次元数iおよび前記第2局所特徴量の特徴ベクトルの次元数jのうち、より少ない次元数を選択し、選択された前記次元数までの特徴ベクトルからなる前記n個の第2局所特徴量に、選択された前記次元数までの特徴ベクトルからなる前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記捜索物が存在すると認識する認識手段と、
を備えることを特徴とする情報処理システム。
- 前記捜索物の画像からm個の特徴点を抽出し、それぞれ1次元からi次元までの特徴ベクトルからなる前記m個の第1局所特徴量を生成する第1局所特徴量生成手段をさらに備えることを特徴とする請求項1に記載の情報処理システム。
- 前記捜索物の画像を撮像する第2撮像手段をさらに備え、
前記第1局所特徴量生成手段は、前記第2撮像手段が撮像した前記捜索物の画像に基づいて、前記m個の第1局所特徴量を生成することを特徴とする請求項2に記載の情報処理システム。
- 前記認識手段の認識結果を報知する報知手段をさらに備えることを特徴とする請求項3に記載の情報処理システム。
- 前記情報処理システムは、前記捜索物の局所特徴量を生成するための第1通信端末と、局所特徴量に基づいて前記捜索物を捜索するための第2通信端末と、前記第1通信端末および前記第2通信端末と通信する情報処理装置とを有し、
前記第1通信端末が、前記第2撮像手段と前記第1局所特徴量生成手段と前記報知手段とを含んで、前記m個の第1局所特徴量を前記第1通信端末から前記情報処理装置へ送信し、
前記第2通信端末が、前記第1撮像手段と前記第2局所特徴量生成手段とを含んで、前記n個の第2局所特徴量を前記第2通信端末から前記情報処理装置へ送信し、
前記情報処理装置が、前記第1局所特徴量記憶手段と前記認識手段とを含んで、前記認識手段の認識結果を前記情報処理装置から前記第1通信端末へ送信することを特徴とする請求項4に記載の情報処理システム。
- 前記捜索物は遺失物または盗難物であり、
前記第1局所特徴量記憶手段は、捜索する前記遺失物または盗難物の画像から前記第1局所特徴量生成手段が生成した第1局所特徴量を記憶し、
前記認識手段は、前記n個の第2局所特徴量に、前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記遺失物または盗難物が存在すると認識することを特徴とする請求項2に記載の情報処理システム。
- 前記捜索物は人物であり、
前記第1局所特徴量記憶手段は、捜索する前記人物の画像から前記第1局所特徴量生成手段が生成した第1局所特徴量を記憶し、
前記認識手段は、前記n個の第2局所特徴量に、前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記人物が存在すると認識することを特徴とする請求項2に記載の情報処理システム。
- 前記捜索物は複製物であり、
前記第1局所特徴量記憶手段は、元画像から前記第1局所特徴量生成手段が生成した第1局所特徴量を記憶し、
前記認識手段は、前記n個の第2局所特徴量に、前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記元画像が存在すると認識することを特徴とする請求項2に記載の情報処理システム。
- 前記第2局所特徴量生成手段は、前記第2局所特徴量の精度を調整する精度調整手段を有し、
前記認識手段は、精度をより高く調整して前記第2局所特徴量生成手段が生成した第2局所特徴量に基づいて前記捜索物を確認することを特徴とする請求項1乃至8のいずれか1項に記載の情報処理システム。
- 前記第1局所特徴量および前記第2局所特徴量は、画像から抽出した特徴点を含む局所領域を複数のサブ領域に分割し、前記複数のサブ領域内の勾配方向のヒストグラムからなる複数の次元の特徴ベクトルを生成することにより生成されることを特徴とする請求項1乃至9のいずれか1項に記載の情報処理システム。
- 前記第1局所特徴量および前記第2局所特徴量は、前記生成した複数の次元の特徴ベクトルから、隣接するサブ領域間の相関がより大きな次元を削除することにより生成されることを特徴とする請求項10に記載の情報処理システム。
- 前記特徴ベクトルの複数の次元は、前記特徴点の特徴に寄与する次元から順に、かつ、前記局所特徴量に対して求められる精度の向上に応じて第1次元から順に選択できるよう、所定の次元数ごとに前記局所領域をひと回りするよう配列することを特徴とする請求項10または11に記載の情報処理システム。
- 前記第2局所特徴量生成手段は、捜索する前記捜索物の相関に対応して、他の捜索物とより低い前記相関を有する捜索物については次元数のより少ない前記第2局所特徴量を生成することを特徴とする請求項12に記載の情報処理システム。
- 前記第1局所特徴量記憶手段は、捜索する前記捜索物の相関に対応して、他の捜索物とより低い前記相関を有する捜索物については次元数のより少ない前記第1局所特徴量を記憶することを特徴とする請求項12または13に記載の情報処理システム。
- 捜索物と、前記捜索物の画像のm個の特徴点のそれぞれを含むm個の局所領域のそれぞれについて生成された、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量とを、対応付けて記憶する第1局所特徴量記憶手段を備えた情報処理システムの情報処理方法であって、
第1撮像手段によって撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を生成する第2局所特徴量生成ステップと、
前記第1局所特徴量の特徴ベクトルの次元数iおよび前記第2局所特徴量の特徴ベクトルの次元数jのうち、より少ない次元数を選択し、選択された前記次元数までの特徴ベクトルからなる前記n個の第2局所特徴量に、選択された前記次元数までの特徴ベクトルからなる前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記捜索物が存在すると認識する認識ステップと、
を備えることを特徴とする情報処理方法。
- 撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を生成する第2局所特徴量生成手段と、
前記n個の第2局所特徴量を、局所特徴量の照合に基づいて撮像した前記画像に含まれる捜索物を認識する情報処理装置に送信する第1送信手段と、
を備えることを特徴とする通信端末。
- 撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を生成する第2局所特徴量生成ステップと、
前記n個の第2局所特徴量を、局所特徴量の照合に基づいて撮像した前記画像に含まれる捜索物を認識する情報処理装置に送信する第1送信ステップと、
を含むことを特徴とする通信端末の制御方法。
- 撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を生成する第2局所特徴量生成ステップと、
前記n個の第2局所特徴量を、局所特徴量の照合に基づいて撮像した前記画像に含まれる捜索物を認識する情報処理装置に送信する第1送信ステップと、
をコンピュータに実行させることを特徴とする制御プログラム。
- 撮像した捜索物の画像からm個の特徴点を抽出し、前記m個の特徴点のそれぞれを含むm個の局所領域について、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量を生成する第1局所特徴量生成手段と、
撮像した捜索物が他の画像に含まれるか否かを局所特徴量の照合に基づいて認識する情報処理装置に、前記m個の第1局所特徴量を送信する第2送信手段と、
前記情報処理装置から、前記他の画像に含まれる前記撮像した捜索物を示す情報を受信する第1受信手段と、
を備えたことを特徴とする通信端末。
- 撮像した捜索物の画像からm個の特徴点を抽出し、前記m個の特徴点のそれぞれを含むm個の局所領域について、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量を生成する第1局所特徴量生成ステップと、
撮像した捜索物が他の画像に含まれるか否かを局所特徴量の照合に基づいて認識する情報処理装置に、前記m個の第1局所特徴量を送信する第2送信ステップと、
前記情報処理装置から、前記他の画像に含まれる前記撮像した捜索物を示す情報を受信する第1受信ステップと、
を含むことを特徴とする通信端末の制御方法。
- 撮像した捜索物の画像からm個の特徴点を抽出し、前記m個の特徴点のそれぞれを含むm個の局所領域について、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量を生成する第1局所特徴量生成ステップと、
撮像した捜索物が、他の画像に含まれるか否かを局所特徴量の照合に基づいて認識する情報処理装置に、前記m個の第1局所特徴量を送信する第2送信ステップと、
前記情報処理装置から、前記他の画像に含まれる前記撮像した捜索物を示す情報を受信する第1受信ステップと、
をコンピュータに実行させることを特徴とする通信端末の制御プログラム。
- 捜索物と、前記捜索物の画像のm個の特徴点のそれぞれを含むm個の局所領域のそれぞれについて生成された、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量とを、対応付けて記憶する第1局所特徴量記憶手段と、
前記捜索物を捜索する第1通信端末が撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を、前記第1通信端末から受信する第2受信手段と、
前記第1局所特徴量の特徴ベクトルの次元数iおよび前記第2局所特徴量の特徴ベクトルの次元数jのうち、より少ない次元数を選択し、選択された前記次元数までの特徴ベクトルからなる前記n個の第2局所特徴量に、選択された前記次元数までの特徴ベクトルからなる前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記捜索物が存在すると認識する認識手段と、
認識した前記捜索物を示す情報を、前記捜索物の捜索を依頼した第2通信端末に送信する第2送信手段と、
を備えたことを特徴とする情報処理装置。
- 捜索物と、前記捜索物の画像のm個の特徴点のそれぞれを含むm個の局所領域のそれぞれについて生成された、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量とを、対応付けて記憶する第1局所特徴量記憶手段を備えた情報処理装置の制御方法であって、
前記捜索物を捜索する第1通信端末が撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を、前記第1通信端末から受信する第2受信ステップと、
前記第1局所特徴量の特徴ベクトルの次元数iおよび前記第2局所特徴量の特徴ベクトルの次元数jのうち、より少ない次元数を選択し、選択された前記次元数までの特徴ベクトルからなる前記n個の第2局所特徴量に、選択された前記次元数までの特徴ベクトルからなる前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記捜索物が存在すると認識する認識ステップと、
認識した前記捜索物を示す情報を、前記捜索物の捜索を依頼した第2通信端末に送信する第2送信ステップと、
を含むことを特徴とする情報処理装置の制御方法。
- 捜索物と、前記捜索物の画像のm個の特徴点のそれぞれを含むm個の局所領域のそれぞれについて生成された、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量とを、対応付けて記憶する第1局所特徴量記憶手段を備えた情報処理装置の制御プログラムであって、
前記捜索物を捜索する第1通信端末が撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を、前記第1通信端末から受信する第2受信ステップと、
前記第1局所特徴量の特徴ベクトルの次元数iおよび前記第2局所特徴量の特徴ベクトルの次元数jのうち、より少ない次元数を選択し、選択された前記次元数までの特徴ベクトルからなる前記n個の第2局所特徴量に、選択された前記次元数までの特徴ベクトルからなる前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記捜索物が存在すると認識する認識ステップと、
認識した前記捜索物を示す情報を、前記捜索物の捜索を依頼した第2通信端末に送信する第2送信ステップと、
をコンピュータに実行させることを特徴とする制御プログラム。
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2013556424A JP6168303B2 (ja) | 2012-01-30 | 2013-01-30 | 情報処理システム、情報処理方法、情報処理装置およびその制御方法と制御プログラム、通信端末およびその制御方法と制御プログラム |
US14/375,524 US9792528B2 (en) | 2012-01-30 | 2013-01-30 | Information processing system, information processing method, information processing apparatus and control method and control program thereof, and communication terminal and control method and control program thereof |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2012-017384 | 2012-01-30 | ||
JP2012017384 | 2012-01-30 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013115202A1 true WO2013115202A1 (ja) | 2013-08-08 |
Family
ID=48905236
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2013/051953 WO2013115202A1 (ja) | 2012-01-30 | 2013-01-30 | 情報処理システム、情報処理方法、情報処理装置およびその制御方法と制御プログラム、通信端末およびその制御方法と制御プログラム |
Country Status (3)
Country | Link |
---|---|
US (1) | US9792528B2 (ja) |
JP (1) | JP6168303B2 (ja) |
WO (1) | WO2013115202A1 (ja) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104036281A (zh) * | 2014-06-24 | 2014-09-10 | 北京奇虎科技有限公司 | 一种图片的匹配方法、搜索方法及其装置 |
CN104036009A (zh) * | 2014-06-24 | 2014-09-10 | 北京奇虎科技有限公司 | 一种搜索匹配图片的方法、图片搜索方法及装置 |
WO2015196964A1 (zh) * | 2014-06-24 | 2015-12-30 | 北京奇虎科技有限公司 | 搜索匹配图片的方法、图片搜索方法及装置 |
JP2016220166A (ja) * | 2015-05-26 | 2016-12-22 | 株式会社オプティム | 識別子記憶サーバ、識別子記憶方法及び識別子記憶サーバ用プログラム |
JP2018084443A (ja) * | 2016-11-21 | 2018-05-31 | 株式会社リコー | 画像処理装置、画像処理システム、画像処理方法、及び画像処理プログラム |
JP2020507173A (ja) * | 2017-01-11 | 2020-03-05 | アリババ・グループ・ホールディング・リミテッドAlibaba Group Holding Limited | 拡張現実に基づく画像認識方法および装置 |
JP2020102277A (ja) * | 2020-04-06 | 2020-07-02 | 日本電気株式会社 | 通信システム、検索装置および検索方法 |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2517944A (en) * | 2013-09-05 | 2015-03-11 | Ibm | Locating objects using images from portable devices |
CN108303435B (zh) * | 2017-01-12 | 2020-09-11 | 同方威视技术股份有限公司 | 检查设备和对集装箱进行检查的方法 |
JP7113217B2 (ja) * | 2017-11-17 | 2022-08-05 | パナソニックIpマネジメント株式会社 | 照合装置、照合方法、及びプログラム |
US11281926B2 (en) * | 2018-06-04 | 2022-03-22 | Denso Corporation | Feature extraction method and apparatus |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004151820A (ja) * | 2002-10-29 | 2004-05-27 | Hitachi Eng Co Ltd | 迷子検索・監視システム |
JP2005222250A (ja) * | 2004-02-04 | 2005-08-18 | Toshiba Corp | 捜索装置 |
JP2005347905A (ja) * | 2004-06-01 | 2005-12-15 | Oki Electric Ind Co Ltd | 防犯支援システム |
JP2008003753A (ja) * | 2006-06-21 | 2008-01-10 | Hitachi Kokusai Electric Inc | 情報収集システム |
JP2010277264A (ja) * | 2009-05-27 | 2010-12-09 | Takachiho Koeki Kk | 防犯装置、その制御方法、プログラム、及び防犯システム |
JP2011008507A (ja) * | 2009-06-25 | 2011-01-13 | Kddi Corp | 画像検索方法およびシステム |
JP2011198130A (ja) * | 2010-03-19 | 2011-10-06 | Fujitsu Ltd | 画像処理装置及び画像処理プログラム |
Family Cites Families (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6711293B1 (en) | 1999-03-08 | 2004-03-23 | The University Of British Columbia | Method and apparatus for identifying scale invariant features in an image and use of same for locating an object in an image |
JP2004213087A (ja) * | 2002-12-26 | 2004-07-29 | Toshiba Corp | 個人認証装置および個人認証方法 |
SE528068C2 (sv) * | 2004-08-19 | 2006-08-22 | Jan Erik Solem Med Jsolutions | Igenkänning av 3D föremål |
JP4988408B2 (ja) | 2007-04-09 | 2012-08-01 | 株式会社デンソー | 画像認識装置 |
US9313359B1 (en) * | 2011-04-26 | 2016-04-12 | Gracenote, Inc. | Media content identification on mobile devices |
US9041508B2 (en) * | 2008-08-08 | 2015-05-26 | Snap-On Incorporated | Image-based inventory control system and method |
JP4547639B2 (ja) * | 2008-08-26 | 2010-09-22 | ソニー株式会社 | 画像処理装置および方法、並びにプログラム |
US20120109901A1 (en) * | 2009-07-01 | 2012-05-03 | Nec Corporation | Content classification apparatus, content classification method, and content classification program |
JP5406705B2 (ja) * | 2009-12-28 | 2014-02-05 | キヤノン株式会社 | データ補正装置及び方法 |
JP2011221688A (ja) | 2010-04-07 | 2011-11-04 | Sony Corp | 認識装置、認識方法、およびプログラム |
JP5685390B2 (ja) * | 2010-05-14 | 2015-03-18 | 株式会社Nttドコモ | 物体認識装置、物体認識システムおよび物体認識方法 |
US20130208984A1 (en) * | 2010-10-25 | 2013-08-15 | Nec Corporation | Content scene determination device |
JP5854232B2 (ja) * | 2011-02-10 | 2016-02-09 | 日本電気株式会社 | 映像間対応関係表示システム及び映像間対応関係表示方法 |
JP5991488B2 (ja) * | 2011-02-10 | 2016-09-14 | 日本電気株式会社 | 相違領域検出システム及び相違領域検出方法 |
US9172936B2 (en) * | 2011-02-10 | 2015-10-27 | Nec Corporation | Inter-video corresponding relationship display system and inter-video corresponding relationship display method |
US8660368B2 (en) * | 2011-03-16 | 2014-02-25 | International Business Machines Corporation | Anomalous pattern discovery |
US8571306B2 (en) * | 2011-08-10 | 2013-10-29 | Qualcomm Incorporated | Coding of feature location information |
US9053371B2 (en) * | 2011-09-29 | 2015-06-09 | Texas Instruments Incorporated | Method, system and computer program product for identifying a location of an object within a video sequence |
US9530081B2 (en) * | 2011-10-03 | 2016-12-27 | Nec Corporation | Similarity detecting apparatus and directional nearest neighbor detecting method |
US20130113929A1 (en) * | 2011-11-08 | 2013-05-09 | Mary Maitland DeLAND | Systems and methods for surgical procedure safety |
CN103946891B (zh) * | 2011-11-18 | 2017-02-22 | 日本电气株式会社 | 局部特征量提取装置和局部特征量提取方法 |
US9239850B2 (en) * | 2011-11-18 | 2016-01-19 | Nec Corporation | Feature descriptor encoding apparatus, feature descriptor encoding method, and program |
JP6103243B2 (ja) * | 2011-11-18 | 2017-03-29 | 日本電気株式会社 | 局所特徴量抽出装置、局所特徴量抽出方法、及びプログラム |
CN104115189B (zh) * | 2011-11-18 | 2016-12-28 | 日本电气株式会社 | 局部特征量提取装置、用于提取局部特征量的方法 |
US9009149B2 (en) * | 2011-12-06 | 2015-04-14 | The Trustees Of Columbia University In The City Of New York | Systems and methods for mobile search using Bag of Hash Bits and boundary reranking |
WO2013088994A1 (ja) * | 2011-12-14 | 2013-06-20 | 日本電気株式会社 | 映像処理システム、映像処理方法、携帯端末用またはサーバ用の映像処理装置およびその制御方法と制御プログラム |
WO2013089146A1 (ja) * | 2011-12-16 | 2013-06-20 | 日本電気株式会社 | 情報処理システム、情報処理方法、通信端末およびその制御方法と制御プログラム |
US9986208B2 (en) * | 2012-01-27 | 2018-05-29 | Qualcomm Incorporated | System and method for determining location of a device using opposing cameras |
EP2811459B1 (en) * | 2012-01-30 | 2020-02-19 | NEC Corporation | Information processing system, information processing method, information processing device, and control method and control program therefor, and communication terminal, and control method and control program therefor |
EP2889835A4 (en) * | 2012-08-23 | 2016-06-08 | Nec Corp | DEVICE, METHOD AND PROGRAM FOR DIFFERENTIATING AN OBJECT |
JP6278276B2 (ja) * | 2012-08-23 | 2018-02-14 | 日本電気株式会社 | 物体識別装置、物体識別方法、及びプログラム |
CN103714077B (zh) * | 2012-09-29 | 2017-10-20 | 日电(中国)有限公司 | 物体检索的方法、检索校验的方法及装置 |
TW201437925A (zh) * | 2012-12-28 | 2014-10-01 | Nec Corp | 物體識別裝置、方法及電腦程式產品 |
US9025825B2 (en) * | 2013-05-10 | 2015-05-05 | Palo Alto Research Center Incorporated | System and method for visual motion based object segmentation and tracking |
- 2013-01-30 WO PCT/JP2013/051953 patent/WO2013115202A1/ja active Application Filing
- 2013-01-30 JP JP2013556424A patent/JP6168303B2/ja active Active
- 2013-01-30 US US14/375,524 patent/US9792528B2/en active Active
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004151820A (ja) * | 2002-10-29 | 2004-05-27 | Hitachi Eng Co Ltd | 迷子検索・監視システム |
JP2005222250A (ja) * | 2004-02-04 | 2005-08-18 | Toshiba Corp | 捜索装置 |
JP2005347905A (ja) * | 2004-06-01 | 2005-12-15 | Oki Electric Ind Co Ltd | 防犯支援システム |
JP2008003753A (ja) * | 2006-06-21 | 2008-01-10 | Hitachi Kokusai Electric Inc | 情報収集システム |
JP2010277264A (ja) * | 2009-05-27 | 2010-12-09 | Takachiho Koeki Kk | 防犯装置、その制御方法、プログラム、及び防犯システム |
JP2011008507A (ja) * | 2009-06-25 | 2011-01-13 | Kddi Corp | 画像検索方法およびシステム |
JP2011198130A (ja) * | 2010-03-19 | 2011-10-06 | Fujitsu Ltd | 画像処理装置及び画像処理プログラム |
Non-Patent Citations (1)
Title |
---|
HIRONOBU FUJIYOSHI: "Gradient-Based Feature Extraction : SIFT and HOG", IEICE TECHNICAL REPORT, vol. 107, no. 206, 27 August 2007 (2007-08-27), pages 211 - 224 * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104036281A (zh) * | 2014-06-24 | 2014-09-10 | 北京奇虎科技有限公司 | 一种图片的匹配方法、搜索方法及其装置 |
CN104036009A (zh) * | 2014-06-24 | 2014-09-10 | 北京奇虎科技有限公司 | 一种搜索匹配图片的方法、图片搜索方法及装置 |
WO2015196964A1 (zh) * | 2014-06-24 | 2015-12-30 | 北京奇虎科技有限公司 | 搜索匹配图片的方法、图片搜索方法及装置 |
CN104036281B (zh) * | 2014-06-24 | 2017-05-03 | 北京奇虎科技有限公司 | 一种图片的匹配方法、搜索方法及其装置 |
CN104036009B (zh) * | 2014-06-24 | 2017-08-08 | 北京奇虎科技有限公司 | 一种搜索匹配图片的方法、图片搜索方法及装置 |
JP2016220166A (ja) * | 2015-05-26 | 2016-12-22 | 株式会社オプティム | 識別子記憶サーバ、識別子記憶方法及び識別子記憶サーバ用プログラム |
JP2018084443A (ja) * | 2016-11-21 | 2018-05-31 | 株式会社リコー | 画像処理装置、画像処理システム、画像処理方法、及び画像処理プログラム |
JP2020507173A (ja) * | 2017-01-11 | 2020-03-05 | アリババ・グループ・ホールディング・リミテッドAlibaba Group Holding Limited | 拡張現実に基づく画像認識方法および装置 |
US10614341B2 (en) | 2017-01-11 | 2020-04-07 | Alibaba Group Holding Limited | Image recognition based on augmented reality |
US10762382B2 (en) | 2017-01-11 | 2020-09-01 | Alibaba Group Holding Limited | Image recognition based on augmented reality |
JP2020102277A (ja) * | 2020-04-06 | 2020-07-02 | 日本電気株式会社 | 通信システム、検索装置および検索方法 |
Also Published As
Publication number | Publication date |
---|---|
JPWO2013115202A1 (ja) | 2015-05-11 |
US9792528B2 (en) | 2017-10-17 |
US20150010237A1 (en) | 2015-01-08 |
JP6168303B2 (ja) | 2017-07-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6168303B2 (ja) | 情報処理システム、情報処理方法、情報処理装置およびその制御方法と制御プログラム、通信端末およびその制御方法と制御プログラム | |
US11886489B2 (en) | System and method of identifying visual objects | |
US20190251369A1 (en) | License plate detection and recognition system | |
US9697435B2 (en) | Local feature descriptor extracting apparatus, local feature descriptor extracting method, and program | |
JP2018160219A (ja) | 移動経路予測装置、及び移動経路予測方法 | |
US9418314B2 (en) | Information processing apparatus and control method and control program thereof, and communication terminal and control method and control program thereof | |
Duan et al. | Compact descriptors for mobile visual search and MPEG CDVS standardization | |
WO2013089146A1 (ja) | 情報処理システム、情報処理方法、通信端末およびその制御方法と制御プログラム | |
WO2013088994A1 (ja) | 映像処理システム、映像処理方法、携帯端末用またはサーバ用の映像処理装置およびその制御方法と制御プログラム | |
JP2005136665A (ja) | データ信号の送信方法と受信方法及びその装置、システム、プログラム並びに記録媒体 | |
WO2013115092A1 (ja) | 映像処理システム、映像処理方法、映像処理装置およびその制御方法と制御プログラム | |
CN103530377A (zh) | 一种基于二进制特征码的场景信息搜索方法 | |
WO2013089004A1 (ja) | 映像処理システム、映像処理方法、携帯端末用またはサーバ用の映像処理装置およびその制御方法と制御プログラム | |
Rivera-Rubio et al. | Small hand-held object recognition test (short) | |
Santos et al. | RECOGNIZING AND EXPLORING AZULEJOS ON HISTORIC BUILDINGS’FACADES BY COMBINING COMPUTER VISION AND GEOLOCATION IN MOBILE AUGMENTED REALITY APPLICATIONS | |
JP6131859B2 (ja) | 情報処理システム、情報処理方法、情報処理装置およびその制御方法と制御プログラム、通信端末およびその制御方法と制御プログラム | |
US11803972B2 (en) | Information processing apparatus, control method, and program for accurately linking fragmented trajectories of an object | |
Marez et al. | Bandwidth constrained cooperative object detection in images | |
Dai et al. | Real object registration algorithm based on SURF and RANSAC | |
Jiao et al. | An indoor positioning method based on wireless signal and image | |
WO2013089041A1 (ja) | 映像処理システム、映像処理方法、携帯端末用またはサーバ用の映像処理装置およびその制御方法と制御プログラム | |
He et al. | A method of face mosaic for the target person | |
Narayanan et al. | Smart Vision: Real Time Object Detection | |
JP2023176244A (ja) | 画像処理システム、装置、処理方法、およびプログラム | |
CN115359561A (zh) | 人体行为识别方法、装置、设备及计算机可读存储介质 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13743919 Country of ref document: EP Kind code of ref document: A1 |
ENP | Entry into the national phase |
Ref document number: 2013556424 Country of ref document: JP Kind code of ref document: A |
NENP | Non-entry into the national phase |
Ref country code: DE |
WWE | Wipo information: entry into national phase |
Ref document number: 14375524 Country of ref document: US |
122 | Ep: pct application non-entry in european phase |
Ref document number: 13743919 Country of ref document: EP Kind code of ref document: A1 |