WO2013115204A1 - Information processing system, information processing method, information processing device, and control method and control program therefor, and communication terminal, and control method and control program therefor - Google Patents

Information processing system, information processing method, information processing device, and control method and control program therefor, and communication terminal, and control method and control program therefor

Info

Publication number
WO2013115204A1
WO2013115204A1 (PCT/JP2013/051955, JP2013051955W)
Authority
WO
WIPO (PCT)
Prior art keywords
feature
local feature
landscape
local
landscape element
Prior art date
Application number
PCT/JP2013/051955
Other languages
French (fr)
Japanese (ja)
Inventor
Toshiyuki Nomura (野村 俊之)
Akio Yamada (山田 昭雄)
Kota Iwamoto (岩元 浩太)
Ryota Mase (亮太 間瀬)
Original Assignee
NEC Corporation (日本電気株式会社)
Priority date
Filing date
Publication date
Application filed by NEC Corporation (日本電気株式会社)
Priority to JP2013556426A priority Critical patent/JP6131859B2/en
Publication of WO2013115204A1 publication Critical patent/WO2013115204A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]

Definitions

  • The present invention relates to a technique for identifying landscape elements, including buildings, in an image.
  • Patent Document 1 describes a technique in which feature quantities extracted from a plurality of images of a building are compared with feature quantities in a database, the degree of match is comprehensively evaluated, and related information on the identified building is obtained.
  • Patent Document 2 describes a technique for improving the recognition speed by clustering feature amounts when a query image is recognized using a model dictionary generated in advance from a model image.
  • An object of the present invention is to provide a technique for solving the above-described problems.
  • In order to achieve the above object, a system according to the present invention comprises: first local feature storage means for storing, in association with each landscape element, m first local feature quantities, each consisting of a feature vector of 1 to i dimensions, generated for m local regions that include each of m feature points of a landscape element image; second local feature generation means for extracting n feature points from an image captured by imaging means and generating, for n local regions that include each of the n feature points, n second local feature quantities, each consisting of a feature vector of 1 to j dimensions; and landscape element recognition means for selecting the smaller of the dimension number i of the feature vectors of the first local feature quantities and the dimension number j of the feature vectors of the second local feature quantities, and recognizing that the landscape element exists in the image in the video when the m first local feature quantities, using feature vectors up to the selected number of dimensions, match the n second local feature quantities.
  • An information processing method according to the present invention is a method in an information processing system including first local feature storage means for storing, in association with each landscape element, m first local feature quantities, each consisting of a feature vector of 1 to i dimensions, generated for m local regions that include each of m feature points of a landscape element image. The method comprises: a second local feature generation step of extracting n feature points from an image captured by imaging means and generating, for n local regions that include each of the n feature points, n second local feature quantities, each consisting of a feature vector of 1 to j dimensions; and a recognition step of selecting the smaller of the dimension number i of the feature vectors of the first local feature quantities and the dimension number j of the feature vectors of the second local feature quantities, and collating the feature vectors up to the selected number of dimensions.
  • A communication terminal according to the present invention comprises: second local feature generation means for extracting n feature points from an image captured by imaging means and generating, for n local regions that include each of the n feature points, n second local feature quantities, each consisting of a feature vector of 1 to j dimensions; first transmission means for transmitting the n second local feature quantities to an information processing apparatus that recognizes, based on comparison of local feature quantities, a landscape element included in the captured image; and first receiving means for receiving, from the information processing apparatus, information indicating the landscape element included in the captured image.
  • A method for controlling the communication terminal according to the present invention comprises: a second local feature generation step of extracting n feature points from an image captured by imaging means and generating, for n local regions that include each of the n feature points, n second local feature quantities, each consisting of a feature vector of 1 to j dimensions; and a first transmission step of transmitting the n second local feature quantities to an information processing device that recognizes, based on comparison of local feature quantities, a landscape element included in the captured image.
  • A control program according to the present invention causes a computer to execute: a second local feature generation step of extracting n feature points from an image captured by imaging means and generating, for n local regions that include each of the n feature points, n second local feature quantities, each consisting of a feature vector of 1 to j dimensions; and a first transmission step of transmitting the n second local feature quantities to an information processing device that recognizes, based on comparison of local feature quantities, a landscape element included in the captured image.
  • An information processing apparatus according to the present invention comprises: first local feature storage means for storing, in association with each landscape element, m first local feature quantities, each consisting of a feature vector of 1 to i dimensions, generated for m local regions that include each of m feature points of a landscape element image; second receiving means for receiving, from a communication terminal, n second local feature quantities, each consisting of a feature vector of 1 to j dimensions, generated for n local regions that include each of n feature points extracted from an image in a video captured by the communication terminal; and landscape element recognition means for selecting the smaller of the dimension number i of the feature vectors of the first local feature quantities and the dimension number j of the feature vectors of the second local feature quantities, and collating the feature vectors up to the selected number of dimensions.
  • A method according to the present invention for controlling an information processing apparatus including first local feature storage means for storing, in association with each landscape element, m first local feature quantities, each consisting of a feature vector of 1 to i dimensions, generated for m local regions that include each of m feature points of a landscape element image, comprises: a second receiving step of receiving, from a communication terminal, n second local feature quantities, each consisting of a feature vector of 1 to j dimensions, generated for n local regions that include each of n feature points extracted from an image in a video captured by the communication terminal; and a recognition step of selecting the smaller of the dimension number i of the feature vectors of the first local feature quantities and the dimension number j of the feature vectors of the second local feature quantities, and collating the feature vectors up to the selected number of dimensions.
  • A control program according to the present invention for an information processing device including first local feature storage means for storing, in association with each landscape element, m first local feature quantities, each consisting of a feature vector of 1 to i dimensions, generated for m local regions that include each of m feature points of a landscape element image, causes a computer to execute: a second receiving step of receiving, from a communication terminal, n second local feature quantities, each consisting of a feature vector of 1 to j dimensions, generated for n local regions that include each of n feature points extracted from an image in a video captured by the communication terminal; and a recognition step of selecting the smaller of the dimension number i of the feature vectors of the first local feature quantities and the dimension number j of the feature vectors of the second local feature quantities, and collating the feature vectors up to the selected number of dimensions.
  • a landscape element including a building in an image in a video can be recognized in real time.
  • FIG. 1 is a diagram showing an example of a display screen of the communication terminal in the information processing system according to the fourth embodiment of the present invention.
  • The term "landscape element" as used in the present specification includes elements that form a natural landscape, such as mountains, as well as buildings and other structures that form an artificial landscape.
  • the information processing system 100 is a system that recognizes landscape elements in real time.
  • the information processing system 100 includes a first local feature quantity storage unit 110, a second local feature quantity generation unit 130, and a landscape element recognition unit 140.
  • The first local feature quantity storage unit 110 stores the m first local feature quantities 112, each made up of a feature vector generated in advance from the landscape element image, in association with the landscape element.
  • the second local feature quantity generation unit 130 extracts n feature points 131 from the image 101 in the video captured by the imaging unit 120.
  • For the n local regions 132 that include each of the n feature points, the second local feature value generation unit 130 generates n second local feature values 133, each consisting of a feature vector of 1 to j dimensions.
  • The landscape element recognition unit 140 selects the smaller number of dimensions from the dimension number i of the feature vectors of the first local features 112 and the dimension number j of the feature vectors of the second local features 133. The landscape element recognition unit 140 then collates the m first local feature quantities, using feature vectors up to the selected number of dimensions, against the n second local feature quantities 133, likewise truncated to the selected number of dimensions.
  • a landscape element including a building in an image in a video can be recognized in real time.
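The dimension-selection matching performed by the landscape element recognition unit 140 can be sketched as follows. This is an illustrative Python sketch, not the patent's implementation: the function name, the Euclidean nearest-neighbor test, and the `threshold` value are all assumptions.

```python
import numpy as np

def match_count(first_feats, second_feats, i_dims, j_dims, threshold=0.5):
    """Compare stored (first) and query (second) local feature quantities
    using only the smaller of the two dimension counts, as the landscape
    element recognition means does."""
    d = min(i_dims, j_dims)              # select the smaller dimension number
    a = np.asarray(first_feats)[:, :d]   # m stored descriptors, truncated
    b = np.asarray(second_feats)[:, :d]  # n query descriptors, truncated
    matches = 0
    for q in b:
        dists = np.linalg.norm(a - q, axis=1)
        if dists.min() < threshold:  # nearest stored descriptor is close enough
            matches += 1
    return matches
```

Because both sides are truncated to the same prefix of the feature vector, a terminal that generates short (low-dimensional) descriptors can still be matched against a database of long descriptors, which is what enables the real-time behavior described above.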
  • In the present embodiment, a landscape element in a video is recognized by collating a local feature quantity generated from a landscape image captured by the communication terminal with the local feature quantities stored in the local feature DB of the landscape element recognition server. The recognized landscape element is then announced together with its name, related information, and/or link information.
  • According to this embodiment, the name, related information, and/or link information can be presented in real time in association with a landscape element, including a building, in the video image.
  • FIG. 2 is a block diagram illustrating a configuration of the information processing system 200 according to the present embodiment.
  • The information processing system 200 in FIG. 2 includes a communication terminal 210 having an imaging function, a landscape element recognition server 220 that recognizes a landscape element from a landscape captured by the communication terminal 210, and a related information providing server 230 that provides related information to the communication terminal 210.
  • The communication terminal 210 displays the captured landscape on its display unit, together with the name of each landscape element recognized by the landscape element recognition server 220 on the basis of the local feature quantities generated from the captured landscape.
  • The landscape element recognition server 220 includes a local feature DB 221 that stores landscape elements and local feature quantities in association with each other, a related information DB 222 that stores related information in association with landscape elements, and a link information DB 223 that stores link information in association with landscape elements.
  • From the local feature quantities of the landscape received from the communication terminal 210, the landscape element recognition server 220 returns the name of the landscape element recognized on the basis of collation with the local feature quantities in the local feature DB 221.
  • Alternatively, related information, such as an introduction, corresponding to the recognized landscape element is retrieved from the related information DB 222 and returned to the communication terminal 210.
  • Alternatively, link information to the related information providing server 230 corresponding to the recognized landscape element is retrieved from the link information DB 223 and returned to the communication terminal 210.
  • the name of the landscape element, the related information corresponding to the landscape element, and the link information for the landscape element may be provided separately or may be provided simultaneously.
  • The related information providing server 230 has a related information DB 231 that stores related information corresponding to landscape elements. It is accessed on the basis of the link information provided for the landscape element recognized by the landscape element recognition server 220. The related information corresponding to the recognized landscape element is then retrieved from the related information DB 231 and returned to the communication terminal 210 that transmitted the local feature quantities of the landscape. Although only one related information providing server 230 is shown in FIG. 2, in practice as many related information providing servers 230 as there are link destinations may be connected. In that case, either the landscape element recognition server 220 selects an appropriate link destination, or a plurality of link destinations is displayed on the communication terminal 210 for the user to choose from.
  • FIG. 2 illustrated an example in which names are superimposed on the landscape elements in the captured landscape.
  • the display of the related information corresponding to the landscape element and the link information for the landscape element will be described with reference to FIG.
  • FIG. 3 is a diagram illustrating a display screen example of the communication terminal 210 in the information processing system 200 according to the present embodiment.
  • the upper part of FIG. 3 is an example of a display screen that displays related information corresponding to a landscape element.
  • the display screen 310 of FIG. 3 includes a captured landscape image 311 and operation buttons 312.
  • In the upper left figure, a landscape element is recognized by collating the local feature quantities generated from the video with the local feature DB 221 of the landscape element recognition server 220. As a result, on the display screen 320 in the upper right figure, a video 321 is displayed in which the landscape video, the landscape element name, and the related information are superimposed.
  • the related information may be output by voice through the speaker 322.
  • the lower part of FIG. 3 is an example of a display screen that displays link information corresponding to a landscape element.
  • a landscape element is recognized by collating the local feature amount generated from the video in the lower left figure with the local feature amount DB 221 of the landscape element recognition server 220.
  • an image 331 in which a landscape image, a landscape element name, and link information are superimposed is displayed.
  • When the link information is selected, the linked related information providing server 230 is accessed, and the related information retrieved from the related information DB 231 is displayed on the communication terminal 210 or output as audio.
  • The operation procedure of the information processing system 200 in the present embodiment will be described with reference to FIGS. 4 and 5.
  • A display example showing only the recognized landscape element name is omitted from FIGS. 4 and 5; in that case, the landscape element name alone may be transmitted to the communication terminal 210 after landscape element recognition.
  • The combined display of a landscape element name, related information, and link information can be realized by combining FIG. 4 and FIG. 5.
  • FIG. 4 is a sequence diagram showing an operation procedure of related information notification in the information processing system 200 according to the present embodiment.
  • In step S400, an application and/or data is downloaded from the landscape element recognition server 220 to the communication terminal 210.
  • In step S401, the application is activated and initialized to perform the processing of this embodiment.
  • In step S403, the communication terminal captures a landscape and acquires a video.
  • In step S405, a local feature quantity is generated from the landscape image.
  • In step S407, the local feature quantity is encoded together with the feature point coordinates.
  • the encoded local feature is transmitted from the communication terminal to the landscape element recognition server 220 in step S409.
  • the landscape element recognition server 220 refers to the local feature DB 221 generated and stored for the landscape element image in step S411 to recognize the landscape element in the landscape. Then, in step S413, the related information is acquired with reference to the related information DB 222 corresponding to the recognized landscape element. In step S415, the landscape element name and the related information are transmitted from the landscape element recognition server 220 to the communication terminal 210.
  • In step S417, the communication terminal 210 notifies the user of the received landscape element name and related information (see the upper part of FIG. 3).
  • the landscape element name is displayed, and the related information is displayed or output as audio.
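The S403 to S417 sequence above can be sketched as a single client-side flow. Every function parameter below is a hypothetical stand-in for a unit the patent describes; the real system exchanges these messages over a network between the terminal and the server.

```python
def related_information_flow(capture, generate_local_features, encode,
                             recognize, lookup_related_info, notify):
    """Illustrative walk-through of the FIG. 4 related-information sequence."""
    video_frame = capture()                       # S403: capture landscape
    feats = generate_local_features(video_frame)  # S405: generate local features
    payload = encode(feats)                       # S407: encode with coordinates
    # S409: payload is transmitted to the landscape element recognition server
    name = recognize(payload)                     # S411: collate with local feature DB 221
    related = lookup_related_info(name)           # S413: look up related info DB 222
    notify(name, related)                         # S415/S417: return and display
    return name, related
```

A usage sketch with stub callables shows the data handed from step to step; in the deployed system `recognize` and `lookup_related_info` run on the server side.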
  • FIG. 5 is a sequence diagram showing an operation procedure of link information notification in the information processing system 200 according to the present embodiment.
  • The same step numbers as in FIG. 4 are attached to the same steps, and duplicate description is omitted.
  • In steps S400 and S401, although the application and data may differ from those in FIG. 4, downloading, activation, and initialization are performed in the same way.
  • In step S411, the landscape element recognition server 220 recognizes the landscape element in the landscape from the local feature quantities of the video received from the communication terminal 210; in step S513, it acquires the link information corresponding to the recognized landscape element with reference to the link information DB 223.
  • The landscape element name and the link information are then transmitted from the landscape element recognition server 220 to the communication terminal 210.
  • In step S517, the communication terminal 210 displays the received landscape element name and link information superimposed on the landscape video (see the lower part of FIG. 3).
  • In step S519, a user instruction on the link information is awaited. If the user designates a link destination, in step S521 the related information providing server 230 at that link destination is accessed with a landscape element ID.
  • In step S523, the related information providing server 230 acquires related information (including document data and audio data) from the related information DB 231 using the received landscape element ID.
  • In step S525, the related information is returned to the communication terminal 210 that made the access.
  • In step S527, the communication terminal 210 that has received the related information displays it or outputs it as audio.
  • FIG. 6 is a block diagram illustrating a functional configuration of the communication terminal 210 according to the present embodiment.
  • the imaging unit 601 inputs a landscape video as a query image.
  • the local feature value generation unit 602 generates a local feature value from the landscape video from the imaging unit 601.
  • the local feature amount transmission unit 603 encodes the generated local feature amount together with the feature point coordinates by the encoding unit 603a and transmits the encoded local feature amount to the landscape element recognition server 220 via the communication control unit 604.
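As one illustration of what the encoding unit 603a might produce, the sketch below packs a feature point's coordinates, scale, and angle together with a byte-quantized descriptor. The wire format (little-endian floats plus one byte per dimension) is an assumption for illustration, not the patent's actual encoding.

```python
import struct

def encode_feature(x, y, scale, angle, descriptor):
    """Pack one feature point (coordinates, scale, angle) followed by its
    descriptor dimensions quantized to one byte each."""
    header = struct.pack("<ffff", x, y, scale, angle)
    body = bytes(min(255, max(0, int(v * 255))) for v in descriptor)
    return header + struct.pack("<H", len(body)) + body

def decode_feature(buf):
    """Inverse of encode_feature: recover coordinates and a float descriptor."""
    x, y, scale, angle = struct.unpack_from("<ffff", buf, 0)
    (n,) = struct.unpack_from("<H", buf, 16)
    desc = [b / 255 for b in buf[18:18 + n]]
    return (x, y, scale, angle), desc
```

Keeping the feature point coordinates in the payload matters because the server-side matching process uses them (see the local feature DB description below, where coordinates are stored alongside each local feature quantity).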
  • The landscape element recognition result receiving unit 605 receives a landscape element recognition result from the landscape element recognition server 220 via the communication control unit 604. The display screen generation unit 606 then generates a display screen from the received recognition result and notifies the user.
  • The related information receiving unit 607 receives related information via the communication control unit 604. The display screen generation unit 606 and the sound generation unit 608 then generate a display screen and audio data from the received related information and notify the user. Note that the related information received by the related information receiving unit 607 may come from the landscape element recognition server 220 or from the related information providing server 230.
  • the link information receiving unit 609 receives link information from the related information providing server 230 via the communication control unit 604. Then, the display screen generation unit 606 generates a display screen of the received link information and notifies the user.
  • the link destination access unit 610 accesses the link destination related information providing server 230 based on the click of link information by an operation unit (not shown).
  • The landscape element recognition result receiving unit 605, the related information receiving unit 607, and the link information receiving unit 609 may also be provided not as separate units but as a single information receiving unit that handles all information received via the communication control unit 604.
  • FIG. 7 is a block diagram illustrating a functional configuration of the landscape element recognition server 220 according to the present embodiment.
  • the local feature receiving unit 702 decodes the local feature received from the communication terminal 210 via the communication control unit 701 by the decoding unit 702a.
  • the landscape element recognition unit 703 recognizes the landscape element by collating the received local feature amount with the local feature amount of the local feature amount DB 221 storing the local feature amount corresponding to the landscape element.
  • the landscape element recognition result transmission unit 704 transmits a landscape element recognition result (landscape element name) to the communication terminal 210.
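The collation performed by the landscape element recognition unit 703 can be sketched as a vote count over the stored landscape elements: the element whose stored descriptors attract the most close matches from the query wins. The DB layout, the Euclidean distance metric, and the threshold below are illustrative assumptions, not the patent's exact procedure.

```python
import numpy as np

def recognize_landscape_element(query_feats, feature_db, threshold=0.4):
    """feature_db maps landscape element name -> array of stored descriptors.
    Returns the name with the most matched query descriptors, or None."""
    best_name, best_votes = None, 0
    q = np.asarray(query_feats)
    for name, stored in feature_db.items():
        s = np.asarray(stored)
        # count query descriptors whose nearest stored descriptor is close
        votes = sum(
            1 for f in q if np.linalg.norm(s - f, axis=1).min() < threshold
        )
        if votes > best_votes:
            best_name, best_votes = name, votes
    return best_name
```

Returning `None` when no element attracts any votes corresponds to the case where the captured landscape contains no known landscape element.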
  • the related information acquisition unit 705 refers to the related information DB 222 and acquires related information corresponding to the recognized landscape element.
  • the related information transmission unit 706 transmits the acquired related information to the communication terminal 210.
  • When the landscape element recognition server 220 transmits related information, it is desirable to transmit the landscape element recognition result and the related information as a single piece of transmission data, as illustrated in FIG. 4, because this reduces communication traffic.
  • the link information acquisition unit 707 refers to the link information DB 223 and acquires link information corresponding to the recognized landscape element.
  • the link information transmission unit 708 transmits the acquired link information to the communication terminal 210.
  • Likewise, when the landscape element recognition server 220 transmits the landscape element recognition result, the related information, and the link information, it is desirable to transmit them all as a single piece of transmission data in order to reduce communication traffic.
  • Since the related information providing server 230 may be any of various linkable providers, a description of its configuration is omitted.
  • FIG. 8 is a diagram illustrating a configuration of the local feature DB 221 according to the present embodiment. Note that the present invention is not limited to such a configuration.
  • the local feature DB 221 stores a first local feature 803, a second local feature 804, ..., an mth local feature 805 in association with the landscape element ID 801 and the name / direction 802.
  • Each local feature quantity stores a feature vector composed of 1-dimensional to 150-dimensional elements hierarchized by 25 dimensions corresponding to 5 ⁇ 5 subregions (see FIG. 11F).
  • The direction indicates the viewing direction from which the local feature quantities of each landscape element were generated.
  • It is desirable that local feature quantities from at least two directions, such as directions with few overlapping portions or characteristic directions, be stored for the same landscape element.
  • m is a positive integer and may be a different number corresponding to the landscape element ID.
  • the feature point coordinates used for the matching process are stored together with the respective local feature amounts.
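The FIG. 8 layout can be pictured as the following structure. The field names mirror the columns described above; the concrete values (the ID `"LE001"`, the name, the coordinates) are invented purely for illustration.

```python
# Sketch of the local feature DB 221 layout from FIG. 8 (illustrative values).
local_feature_db = {
    "LE001": {                        # landscape element ID 801
        "name": "Tokyo Tower",        # name part of name/direction 802
        "direction": "south",         # viewing direction of these features
        "local_features": [           # 1st .. m-th local feature 803-805
            {
                # feature point coordinates, stored for the matching process
                "coords": (120, 45),
                # 150-dim vector: 25 sub-region blocks (5x5) x 6 gradient
                # directions, hierarchized for dimension selection (FIG. 11F)
                "vector": [0.0] * 150,
            },
        ],
    },
}
```

Since m may differ per landscape element ID, the `local_features` list simply varies in length from entry to entry.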
  • FIG. 9 is a diagram showing a configuration of the related information DB 222 according to the present embodiment. Note that the present invention is not limited to such a configuration.
  • the related information DB 222 stores related display data 903 and related audio data 904 that are related information in association with the landscape element ID 901 and the landscape element name 902.
  • the related information DB 222 may be provided integrally with the local feature DB 221.
  • FIG. 10 is a diagram showing a configuration of the link information DB 223 according to the present embodiment. Note that the present invention is not limited to such a configuration.
  • The link information DB 223 stores link information, for example, a URL (Uniform Resource Locator) 1003 and display data 1004 for the display screen, in association with the landscape element ID 1001 and the landscape element name 1002.
  • the link information DB 223 may be provided integrally with the local feature amount DB 221 and the related information DB 222.
  • the related information DB 231 of the related information providing server 230 is the same as the related information DB 222 of the landscape element recognition server 220, and a description thereof is omitted to avoid duplication.
  • FIG. 11A is a block diagram illustrating a configuration of the local feature quantity generation unit 602 according to the present embodiment.
  • The local feature quantity generation unit 602 includes a feature point detection unit 1111, a local region acquisition unit 1112, a sub-region division unit 1113, a sub-region feature vector generation unit 1114, and a dimension selection unit 1115.
  • the feature point detection unit 1111 detects a large number of characteristic points (feature points) from the image data, and outputs the coordinate position, scale (size), and angle of each feature point.
  • the local region acquisition unit 1112 acquires a local region where feature amount extraction is performed from the coordinate value, scale, and angle of each detected feature point.
  • the sub area dividing unit 1113 divides the local area into sub areas.
  • the sub-region dividing unit 1113 can divide the local region into 16 blocks (4 ⁇ 4 blocks) or divide the local region into 25 blocks (5 ⁇ 5 blocks).
  • the number of divisions is not limited. In the present embodiment, the case where the local area is divided into 25 blocks (5 ⁇ 5 blocks) will be described below as a representative.
  • the sub-region feature vector generation unit 1114 generates a feature vector for each sub-region of the local region.
  • a gradient direction histogram can be used as the feature vector of the sub-region.
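A minimal sketch of such a gradient direction histogram for one sub-region follows, with gradients quantized to six directions at 60-degree intervals as in the present embodiment. It accumulates gradient magnitudes rather than simple frequencies, one of the options described below; the exact accumulation scheme in the patent may differ.

```python
import math

def gradient_histogram(gradients, bins=6):
    """gradients: list of (angle_radians, magnitude) pairs for the pixels of
    one sub-region. Returns a histogram accumulating magnitude per quantized
    gradient direction (6 bins = 60-degree intervals)."""
    hist = [0.0] * bins
    for angle, magnitude in gradients:
        # quantize the angle into one of `bins` equal sectors
        b = int((angle % (2 * math.pi)) / (2 * math.pi) * bins) % bins
        hist[b] += magnitude   # accumulate magnitude, not simple frequency
    return hist
```

With 25 sub-regions (5x5 blocks) each yielding a 6-bin histogram, concatenation gives the 150-dimensional feature vector that the dimension selection stage then thins out.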
  • Based on the positional relationship of the sub-regions, the dimension selection unit 1115 selects (for example, by thinning out) the dimensions to be output as the local feature quantity so that the correlation between the feature vectors of adjacent sub-regions is low.
  • the dimension selection unit 1115 can not only select a dimension but also determine a selection priority. That is, the dimension selection unit 1115 can select dimensions with priorities so that, for example, dimensions in the same gradient direction are not selected between adjacent sub-regions. Then, the dimension selection unit 1115 outputs a feature vector composed of the selected dimensions as a local feature amount.
  • Note that the dimension selection unit 1115 can output the local feature quantity with its dimensions rearranged according to the priority.
  • 11B to 11F are diagrams showing processing of the local feature quantity generation unit 602 according to the present embodiment.
  • FIG. 11B is a diagram showing a series of processing of feature point detection / local region acquisition / sub-region division / feature vector generation in the local feature quantity generation unit 602.
  • Such a series of processes is described in U.S. Patent No. 6,711,293 and in David G. Lowe, "Distinctive image features from scale-invariant keypoints", International Journal of Computer Vision, 60(2), 2004, pp. 91-110.
  • An image 1121 in FIG. 11B is a diagram illustrating a state in which feature points are detected from an image in the video in the feature point detection unit 1111 in FIG. 11A.
  • the starting point of the arrow of the feature point data 1121a indicates the coordinate position of the feature point
  • the length of the arrow indicates the scale (size)
  • the direction of the arrow indicates the angle.
  • For the scale (size) and direction, brightness, saturation, hue, and the like can be selected according to the target image.
  • FIG. 11B the case of six directions at intervals of 60 degrees will be described, but the present invention is not limited to this.
  • the local region acquisition unit 1112 in FIG. 11A generates a Gaussian window 1122a around the starting point of the feature point data 1121a, and generates a local region 1122 that substantially includes the Gaussian window 1122a.
  • In the figure, the local region acquisition unit 1112 generates a square local region 1122, but the local region may be circular or have another shape. A local region is acquired for each feature point. A circular local region has the effect of improving robustness with respect to the imaging direction.
  • the sub-region dividing unit 1113 shows a state in which the scale and angle of each pixel included in the local region 1122 of the feature point data 1121a are divided into sub-regions 1123.
  • the gradient direction is not limited to 6 directions, but may be quantized to an arbitrary quantization number such as 4 directions, 8 directions, and 10 directions.
  • the sub-region feature vector generation unit 1114 may add up the magnitudes of the gradients instead of adding up the simple frequencies.
  • When the sub-region feature vector generation unit 1114 aggregates the gradient histogram, it may add weight values not only to the sub-region to which the pixel belongs but also to nearby sub-regions (such as adjacent blocks), according to the distance between the sub-regions. Further, the sub-region feature vector generation unit 1114 may add weight values to the gradient directions before and after the quantized gradient direction. Note that the feature vector of a sub-region is not limited to the gradient direction histogram and may be anything having a plurality of dimensions (elements), such as color information. In the present embodiment, a gradient direction histogram is used as the feature vector of the sub-region.
  • the dimension selection unit 1115 selects (decimates) a dimension (element) to be output as a local feature amount based on the positional relationship between the sub-regions so that the correlation between feature vectors of adjacent sub-regions becomes low. More specifically, the dimension selection unit 1115 selects dimensions such that at least one gradient direction differs between adjacent sub-regions, for example.
  • the dimension selection unit 1115 mainly uses directly adjacent sub-regions as nearby sub-regions, but nearby sub-regions are not limited to these; for example, a sub-region within a predetermined distance may also be treated as a nearby sub-region.
  • FIG. 11C shows an example in which a dimension is selected from a feature vector 1131 of a 150-dimensional gradient histogram generated by dividing a local region into 5 ⁇ 5 block sub-regions and quantizing gradient directions into six directions 1131a.
  • FIG. 11C is a diagram showing a state of feature vector dimension number selection processing in the local feature value generation unit 602.
  • the dimension selection unit 1115 selects a feature vector 1132 of a half 75-dimensional gradient histogram from a feature vector 1131 of a 150-dimensional gradient histogram.
  • dimensions can be selected so that dimensions in the same gradient direction are not selected in adjacent left and right and upper and lower sub-region blocks.
  • the dimension selection unit 1115 selects the feature vector 1133 of the 50-dimensional gradient histogram from the feature vector 1132 of the 75-dimensional gradient histogram.
  • the dimension can be selected so that only one direction is the same (the remaining one direction is different) between the sub-region blocks positioned at an angle of 45 degrees.
  • when the dimension selection unit 1115 selects the feature vector 1134 of the 25-dimensional gradient histogram from the feature vector 1133 of the 50-dimensional gradient histogram, the dimensions can be selected so that the selected gradient directions do not match between sub-region blocks positioned at a 45-degree angle.
  • the dimension selection unit 1115 selects one gradient direction from each sub-region for the 1st to 25th dimensions, two gradient directions for the 26th to 50th dimensions, and three gradient directions for the 51st to 75th dimensions.
  • it is desirable that the gradient directions do not overlap between adjacent sub-region blocks and that all gradient directions are selected uniformly.
  • it is also desirable that the dimensions be selected uniformly from the entire local region. Note that the dimension selection method illustrated in FIG. 11C is an example, and the selection method is not limited to it.
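The rule that adjacent sub-region blocks share no selected gradient direction can be illustrated as follows. This is a sketch under the assumption of a 5 × 5 block grid and six quantized directions; the even/odd parity scheme is one possible selection pattern, not necessarily the patent's exact order:

```python
def select_dimensions(grid=5, n_dirs=6, keep=3):
    """For each sub-region block in a grid x grid layout, keep `keep` of the
    n_dirs quantized gradient directions so that horizontally or vertically
    adjacent blocks never share a selected direction.

    Even-parity blocks take even-numbered directions, odd-parity blocks the
    odd-numbered ones; with keep = 3 this halves 150 dimensions to 75.
    """
    selected = {}
    for y in range(grid):
        for x in range(grid):
            offset = (x + y) % 2
            selected[(y, x)] = [d for d in range(n_dirs) if d % 2 == offset][:keep]
    return selected
```

With `keep=3`, 75 of the 150 dimensions survive, matching the first halving described above, and any two left/right or up/down neighboring blocks select disjoint direction sets.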
  • FIG. 11D is a diagram illustrating an example of the selection order of feature vectors from sub-regions in the local feature value generation unit 602.
  • the dimension selection unit 1115 can determine the selection priority so that dimensions contributing more to the features of the feature point are selected first. For example, the dimension selection unit 1115 can select dimensions according to priorities so that dimensions in the same gradient direction are not selected between adjacent sub-region blocks. The dimension selection unit 1115 then outputs a feature vector composed of the selected dimensions as the local feature amount. In addition, the dimension selection unit 1115 can output the local feature amount with its dimensions rearranged based on the priority.
  • the dimension selection unit 1115 may select dimensions in the order of the sub-region blocks shown in the matrix 1141 in FIG. 11D, for example, within each of the ranges of the 1st to 25th, 26th to 50th, and 51st to 75th dimensions.
  • the dimension selection unit 1115 can select the gradient direction by increasing the priority order of the sub-region blocks close to the center.
  • FIG. 11E is a diagram illustrating an example of element numbers of a 150-dimensional feature vector in accordance with the selection order of FIG. 11D.
  • the element number of the feature vector is 6 ⁇ p + q.
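Assuming p indexes the sub-region (0 to 24 for a 5 × 5 grid) and q the quantized gradient direction (0 to 5), as the factor of six suggests, the element numbering can be sketched as:

```python
def element_number(p, q, n_dirs=6):
    """Element number of one feature-vector dimension: sub-region index p
    combined with quantized gradient direction q (6 * p + q)."""
    return n_dirs * p + q
```

Under this assumption the element numbers run from 0 (first sub-region, first direction) to 149 (last sub-region, last direction).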
  • the matrix 1161 in FIG. 11F shows the 150 dimensions, ordered according to the selection order of FIG. 11E, hierarchized in units of 25 dimensions.
  • that is, the matrix 1161 in FIG. 11F shows a configuration example of local feature amounts obtained by selecting the elements shown in FIG. 11E according to the priority order shown in the matrix 1141 in FIG. 11D.
  • the dimension selection unit 1115 can output dimension elements in the order shown in FIG. 11F. Specifically, for example, when outputting a 150-dimensional local feature amount, the dimension selection unit 1115 can output all 150-dimensional elements in the order shown in FIG. 11F.
  • when the dimension selection unit 1115 outputs, for example, a 25-dimensional local feature amount, it can output the elements 1171 in the first row shown in FIG. 11F (the 76th, 45th, 83rd, ..., 120th elements) in the order shown in FIG. 11F (from left to right). Likewise, when outputting a 50-dimensional local feature amount, the dimension selection unit 1115 can additionally output the elements 1172 in the second row shown in FIG. 11F in the same order (from left to right).
  • the local feature amount thus has a hierarchically structured arrangement: for example, the first 25 dimensions are arranged identically in the 25-dimensional local feature amount and in the 150-dimensional local feature amount (elements 1171 to 1176).
  • by selecting dimensions hierarchically (progressively), the dimension selection unit 1115 can extract and output local feature amounts of a dimensionality suited to the application, the communication capacity, the terminal specifications, and the like.
  • because the dimension selection unit 1115 selects dimensions hierarchically and outputs them sorted based on the priority order, images can be collated using local feature amounts of different dimensionalities. For example, when images are collated using a 75-dimensional local feature amount and a 50-dimensional local feature amount, the distance between the local feature amounts can be calculated using only the first 50 dimensions.
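The prefix-only comparison described above can be sketched as follows (squared Euclidean distance is an illustrative choice of measure):

```python
def prefix_distance(feat_a, feat_b):
    """Distance between two hierarchically ordered local features of
    possibly different dimensionality (e.g. 75-dim vs. 50-dim): because the
    leading dimensions are arranged identically at every level, only the
    shared prefix needs to be compared."""
    n = min(len(feat_a), len(feat_b))
    return sum((a - b) ** 2 for a, b in zip(feat_a[:n], feat_b[:n]))
```

This is what makes a 25-dimensional terminal-side feature directly comparable against a 150-dimensional DB-side feature.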
  • the priorities shown in the matrix 1141 in FIG. 11D to FIG. 11F are merely examples, and the order of selecting dimensions is not limited to this.
  • the order of blocks may be the order shown in the matrix 1142 in FIG. 11D or the matrix 1143 in FIG. 11D in addition to the example of the matrix 1141 in FIG. 11D.
  • the priority order may be determined so that dimensions are selected from all the sub-regions.
  • when the vicinity of the center of the local region is important, the priority order may be determined so that sub-regions near the center are selected more frequently.
  • the information indicating the dimension selection order may be defined in the program, for example, or may be stored in a table or the like (selection order storage unit) referred to when the program is executed.
  • the dimension selection unit 1115 may select a dimension by selecting one sub-region block. That is, 6 dimensions are selected in a certain sub-region, and 0 dimensions are selected in other sub-regions close to the sub-region. Even in such a case, it can be said that the dimension is selected for each sub-region so that the correlation between adjacent sub-regions becomes low.
  • the shape of the local region and sub-region is not limited to a square, and can be any shape.
  • the local region acquisition unit 1112 may acquire a circular local region.
  • the sub-region dividing unit 1113 can divide the circular local region into, for example, nine or seventeen sub-regions arranged as a plurality of concentric circles.
  • the dimension selection unit 1115 can select a dimension in each sub-region.
  • in this way, the dimensions of the generated feature vector are selected hierarchically while the information content of the local feature amount is maintained.
  • this process enables real-time landscape element recognition and recognition result display while maintaining recognition accuracy.
  • the configuration and processing of the local feature value generation unit 602 are not limited to this example. Naturally, other processes that enable real-time landscape element recognition and recognition result display while maintaining recognition accuracy can be applied.
  • FIG. 11G is a block diagram showing the encoding unit 603a according to the present embodiment. Note that the encoding unit is not limited to this example, and other encoding processes can be applied.
  • the encoding unit 603a has a coordinate value scanning unit 1181 that inputs the coordinates of feature points from the feature point detection unit 1111 of the local feature quantity generation unit 602 and scans the coordinate values.
  • the coordinate value scanning unit 1181 scans the image according to a specific scanning method, and converts the two-dimensional coordinate values (X coordinate value and Y coordinate value) of the feature points into one-dimensional index values.
  • this index value corresponds to the scanning distance from the origin along the scan. There is no restriction on the scanning direction.
  • the encoding unit 603a also has a sorting unit 1182 that sorts the index values of the feature points and outputs permutation information after sorting.
  • the sorting unit 1182 sorts, for example, in ascending order; it may also sort in descending order.
  • the encoding unit 603a is further provided with a difference calculation unit 1183 that calculates the difference between two adjacent index values among the sorted index values and outputs a series of difference values.
  • the encoding unit 603a also has a differential encoding unit 1184 that encodes the series of difference values in series order.
  • the sequence of the difference value may be encoded with a fixed bit length, for example.
  • the bit length may be specified in advance, but in that case the number of bits needed to express the largest possible difference value is always required, so the encoding size cannot be reduced. Therefore, when encoding with a fixed bit length, the differential encoding unit 1184 can determine the bit length based on the input series of difference values.
  • specifically, the differential encoding unit 1184 obtains the maximum difference value from the input series of difference values, obtains the number of bits (the number of expression bits) necessary to express that maximum value, and encodes the series of difference values with the obtained number of expression bits.
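The scan/sort/difference pipeline of units 1181 to 1184 can be sketched as below; raster scanning as the specific scan order, and keeping the first sorted index as the initial value of the series, are assumptions for illustration:

```python
def encode_coordinates(points, image_width):
    """Convert 2-D feature-point coordinates to 1-D index values by raster
    scan, sort them, take successive differences, and derive the fixed bit
    length from the maximum difference value, as the differential encoding
    unit 1184 does."""
    indices = sorted(x + y * image_width for (x, y) in points)
    # First sorted index kept as the initial value (an assumption here).
    diffs = [indices[0]] + [b - a for a, b in zip(indices, indices[1:])]
    bits = max(1, max(diffs).bit_length())  # number of expression bits
    return diffs, bits
```

Because sorted differences are usually much smaller than raw coordinates, the derived bit length is correspondingly short.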
  • the encoding unit 603a also has a local feature amount encoding unit 1185 that encodes the local feature amounts of the corresponding feature points in the same permutation as the sorted index values of the feature points.
  • the local feature amount encoding unit 1185 can encode the local feature amount dimension-selected from the 150-dimensional local feature amount of one feature point with, for example, one byte per dimension, that is, with as many bytes as the number of selected dimensions.
  • FIG. 11H is a diagram illustrating processing of the landscape element recognition unit 703 according to the present embodiment.
  • FIG. 11H shows a state in which the local feature amount generated from the landscape image 311 of the landscape display screen 310 captured by the communication terminal 210 in FIG. 3 is collated with the local feature amount stored in the local feature amount DB 221 in advance.
  • local feature amounts are generated in advance from each landscape element image according to the present embodiment and stored in the local feature amount DB 221. It is then checked whether the local feature amounts 1191 to 1194 stored in the local feature amount DB 221 for the respective landscape elements appear among the local feature amounts generated from the video 311.
  • as shown by the thin lines, the landscape element recognition unit 703 associates each feature point whose local feature amount matches a local feature amount stored in the local feature amount DB 221. The landscape element recognition unit 703 determines that feature points match when a predetermined proportion or more of their local feature amounts match. Then, if the positional relationship between the sets of associated feature points is a linear relationship, the landscape element recognition unit 703 recognizes the target landscape element. With such recognition, a landscape element can be recognized despite differences in size, orientation (viewpoint), or inversion. Moreover, since sufficient recognition accuracy is obtained when there are a predetermined number or more of associated feature points, a landscape element can be recognized even when part of it is hidden from view.
  • in FIG. 11H, four different landscape elements in the landscape that match the local feature amounts 1191 to 1194 of the four landscape elements in the local feature amount DB 221 are recognized with a precision corresponding to the accuracy of the local feature amounts.
  • FIG. 12A is a block diagram illustrating a hardware configuration of the communication terminal 210 according to the present embodiment.
  • a CPU 1210 is a processor for arithmetic control, and implements each functional component of the communication terminal 210 by executing a program.
  • the ROM 1220 stores fixed data and programs, such as initial data and initial programs.
  • the communication control unit 604 communicates, in this embodiment, with the landscape element recognition server 220 and the related information providing server 230 via the network.
  • the number of CPUs 1210 is not limited to one; there may be a plurality of CPUs, and a GPU (Graphics Processing Unit) for image processing may be included.
  • the RAM 1240 is a random access memory that the CPU 1210 uses as a work area for temporary storage.
  • the RAM 1240 has an area for storing data necessary for realizing the present embodiment.
  • An input video 1241 indicates an input video imaged and input by the imaging unit 601.
  • the feature point data 1242 indicates feature point data including the feature point coordinates, scale, and angle detected from the input video 1241.
  • the local feature value generation table 1243 indicates a local feature value generation table that holds data until a local feature value is generated (see FIG. 12B).
  • the local feature amount 1244 is generated using the local feature amount generation table 1243 and indicates a local feature amount to be sent to the landscape element recognition server 220 via the communication control unit 604.
  • a landscape element recognition result 1245 indicates a landscape element recognition result returned from the landscape element recognition server 220 via the communication control unit 604.
  • the related information / link information 1246 indicates related information and link information returned from the landscape element recognition server 220 or related information returned from the related information providing server 230.
  • the display screen data 1247 indicates display screen data for notifying the user of information including a landscape element recognition result 1245 and related information / link information 1246. In the case of outputting audio, audio data may be included.
  • Input / output data 1248 indicates input / output data input / output via the input / output interface 1260.
  • Transmission / reception data 1249 indicates transmission / reception data transmitted / received via the communication control unit 604.
  • the storage 1250 stores a database, various parameters, or the following data or programs necessary for realizing the present embodiment.
  • a display format 1251 indicates a display format for displaying information including a landscape element recognition result 1245 and related information / link information 1246.
  • the storage 1250 stores the following programs.
  • the communication terminal control program 1252 indicates a communication terminal control program that controls the entire communication terminal 210.
  • the communication terminal control program 1252 includes the following modules.
  • the local feature generating module 1253 generates a local feature from the input video according to FIGS. 11B to 11F in the communication terminal control program 1252.
  • the local feature quantity generation module 1253 is composed of the illustrated module group, but detailed description thereof is omitted here.
  • the encoding module 1254 encodes the local feature generated by the local feature generating module 1253 for transmission.
  • the information reception notification module 1255 is a module for receiving a landscape element recognition result 1245 and related information / link information 1246 and notifying the user by display or voice.
  • the link destination access module 1256 is a module for accessing a link destination based on a user instruction to link information received and notified.
  • the input / output interface 1260 interfaces input / output data with input / output devices.
  • the input / output interface 1260 is connected to a display unit 1261, a touch panel or keyboard as the operation unit 1262, a speaker 1263, a microphone 1264, and an imaging unit 601.
  • the input / output device is not limited to the above example.
  • the communication terminal 210 is also equipped with a GPS (Global Positioning System) position generation unit 1265, which acquires the current position based on signals from GPS satellites.
  • in FIG. 12A, only data and programs essential to the present embodiment are shown; data and programs not related to the present embodiment are not illustrated.
  • FIG. 12B is a diagram showing a local feature generation table 1243 in the communication terminal 210 according to the present embodiment.
  • a plurality of detected feature points 1202, feature point coordinates 1203, and local region information 1204 corresponding to the feature points are stored in association with the input image ID 1201.
  • a local feature quantity 1209 is generated for each detected feature point 1202 from the above data.
  • data obtained by combining these with the feature point coordinates constitutes the local feature amount 1244, generated from the captured landscape and transmitted to the landscape element recognition server 220.
  • FIG. 13 is a flowchart illustrating a processing procedure of the communication terminal 210 according to the present embodiment. This flowchart is executed by the CPU 1210 of FIG. 12A using the RAM 1240, and implements each functional component of FIG.
  • in step S1311, it is determined whether or not there is a video input for recognizing a landscape element.
  • in step S1321, it is determined whether data has been received.
  • in step S1331, it is determined whether the user has instructed access to a link destination; otherwise, other processing is performed in step S1341. Note that description of normal transmission processing is omitted.
  • if there is a video input, the process proceeds to step S1313, and local feature amount generation processing is executed based on the input video (see FIG. 14A).
  • in step S1315, the local feature amounts and feature point coordinates are encoded (see FIGS. 14B and 14C).
  • in step S1317, the encoded data is transmitted to the landscape element recognition server 220.
  • in step S1323, it is determined whether a landscape element recognition result or related information has been received from the landscape element recognition server 220, or related information has been received from the related information providing server 230. In the case of reception from the landscape element recognition server 220, the process proceeds to step S1325 to notify the user of the received result.
  • FIG. 14A is a flowchart illustrating a processing procedure of local feature generation processing S1313 according to the present embodiment.
  • in step S1411, the position coordinates, scale, and angle of the feature points are detected from the input video.
  • in step S1413, a local region is acquired for one of the feature points detected in step S1411.
  • in step S1415, the local region is divided into sub-regions.
  • in step S1417, a feature vector for each sub-region is generated to form the feature vector of the local region. The processing of steps S1411 to S1417 is illustrated in FIG. 11B.
  • in step S1419, dimension selection is performed on the feature vector of the local region generated in step S1417.
  • the dimension selection is illustrated in FIGS. 11D to 11F.
  • in step S1421, it is determined whether local feature amount generation and dimension selection have been completed for all feature points detected in step S1411. If not, the process returns to step S1413 and is repeated for the next feature point.
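The per-feature-point loop of steps S1411 to S1421 can be summarized in the following skeleton; the four callables are placeholders standing in for the stages described above:

```python
def generate_local_features(feature_points, acquire_local_region,
                            divide_into_subregions, subregion_feature_vector,
                            select_dimensions):
    """Skeleton of steps S1411 to S1421: for each detected feature point,
    acquire a local region, divide it into sub-regions, concatenate the
    per-sub-region feature vectors, and dimension-select the result."""
    local_features = []
    for point in feature_points:
        region = acquire_local_region(point)
        vector = []
        for sub in divide_into_subregions(region):
            vector.extend(subregion_feature_vector(sub))
        local_features.append(select_dimensions(vector))
    return local_features
```

Each stage can be swapped independently (e.g. circular rather than square local regions), which mirrors the variations the text allows.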
  • FIG. 14B is a flowchart illustrating a processing procedure of the encoding processing S1315 according to the present embodiment.
  • in step S1431, the coordinate values of the feature points are scanned in a desired order.
  • in step S1433, the scanned coordinate values are sorted.
  • in step S1435, difference values of the coordinate values are calculated in the sorted order.
  • in step S1437, the difference values are encoded (see FIG. 14C).
  • in step S1439, the local feature amounts are encoded in the coordinate value sorting order. The difference value encoding and the local feature amount encoding may be performed in parallel.
  • FIG. 14C is a flowchart illustrating a processing procedure of difference value encoding processing S1437 according to the present embodiment.
  • in step S1441, it is determined whether or not the difference value is within the encodable range. If it is within the encodable range, the process proceeds to step S1447 to encode the difference value, and then to step S1449. If it is not within the encodable range (out of range), the process proceeds to step S1443 to encode an escape code.
  • in step S1445, the difference value is encoded by an encoding method different from that of step S1447, and the process then proceeds to step S1449.
  • in step S1449, it is determined whether the processed difference value is the last element in the series of difference values. If it is the last, the process ends; otherwise, the process returns to step S1441 and is performed on the next difference value in the series.
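The escape-code branch of FIG. 14C can be sketched as follows; the escape byte value, the encodable range, and the two-byte fallback encoding are illustrative assumptions, not values from the patent:

```python
ESCAPE = 0xFF  # illustrative escape code

def encode_diff_values(diffs, max_encodable=254):
    """Encode a series of difference values one byte each; a value outside
    the encodable range is prefixed with the escape code and encoded by a
    different method (here, big-endian two bytes)."""
    out = []
    for d in diffs:
        if 0 <= d <= max_encodable:
            out.append(d)          # step S1447: in-range value
        else:
            out.append(ESCAPE)     # step S1443: escape code
            out.extend([(d >> 8) & 0xFF, d & 0xFF])  # step S1445: fallback
    return out
```

The escape code lets common small differences stay compact while still allowing occasional large gaps between sorted index values.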
  • FIG. 15 is a block diagram illustrating a hardware configuration of the landscape element recognition server 220 according to the present embodiment.
  • a CPU 1510 is a processor for arithmetic control, and implements each functional component of the landscape element recognition server 220 in FIG. 7 by executing a program.
  • the ROM 1520 stores fixed data and programs, such as initial data and initial programs.
  • the communication control unit 701 communicates, in this embodiment, with the communication terminal 210 and the related information providing server 230 via the network. Note that the number of CPUs 1510 is not limited to one; there may be a plurality of CPUs, and a GPU for image processing may be included.
  • the RAM 1540 is a random access memory that the CPU 1510 uses as a work area for temporary storage.
  • the RAM 1540 has an area for storing data necessary for realizing the present embodiment.
  • the received local feature value 1541 indicates a local feature value including the feature point coordinates received from the communication terminal 210.
  • the read local feature value 1542 indicates the local feature value when including the feature point coordinates read from the local feature value DB 221.
  • the landscape element recognition result 1543 indicates the landscape element recognition result recognized from the collation between the received local feature value and the local feature value stored in the local feature value DB 221.
  • the related information 1544 indicates the related information searched from the related information DB 222 corresponding to the landscape element of the landscape element recognition result 1543.
  • the link information 1545 indicates link information retrieved from the link information DB 223 corresponding to the landscape element of the landscape element recognition result 1543.
  • Transmission / reception data 1546 indicates transmission / reception data transmitted / received via the communication control unit 701.
  • the storage 1550 stores a database, various parameters, or the following data or programs necessary for realizing the present embodiment.
  • the local feature DB 221 is a local feature DB similar to that shown in FIG.
  • the related information DB 222 is a related information DB similar to that shown in FIG.
  • the link information DB 223 is a link information DB similar to that shown in FIG.
  • the storage 1550 stores the following programs.
  • the landscape element recognition server control program 1551 indicates a landscape element recognition server control program that controls the entire landscape element recognition server 220.
  • the local feature DB creation module 1552 generates a local feature from a landscape element image and stores it in the local feature DB 221.
  • the landscape element recognition module 1553 recognizes a landscape element by comparing the received local feature quantity with the local feature quantity stored in the local feature quantity DB 221.
  • the related information / link information acquisition module 1554 acquires related information and link information from the related information DB 222 and the link information DB 223 corresponding to the recognized landscape element.
  • FIG. 15 shows only data and programs essential to the present embodiment, and does not illustrate data and programs not related to the present embodiment.
  • FIG. 16 is a flowchart showing a processing procedure of the landscape element recognition server 220 according to the present embodiment. This flowchart is executed by the CPU 1510 of FIG. 15 using the RAM 1540, and implements each functional component of the landscape element recognition server 220 of FIG.
  • in step S1611, it is determined whether or not a local feature amount DB is to be generated.
  • in step S1621, it is determined whether a local feature amount has been received from the communication terminal. Otherwise, other processing is performed in step S1641.
  • if the local feature amount DB is to be generated, the process advances to step S1613 to execute local feature amount DB generation processing (see FIG. 17). If a local feature amount has been received, the process advances to step S1623 to perform landscape element recognition processing (see FIGS. 18A and 18B).
  • in step S1625, related information and link information corresponding to the recognized landscape element are acquired. Then, the recognized landscape element name, related information, and link information are transmitted to the communication terminal 210.
  • FIG. 17 is a flowchart showing a processing procedure of local feature DB generation processing S1613 according to the present embodiment.
  • in step S1701, an image of a landscape element is acquired.
  • in step S1703, the position coordinates, scale, and angle of the feature points are detected.
  • in step S1705, a local region is acquired for one of the feature points detected in step S1703.
  • in step S1707, the local region is divided into sub-regions.
  • in step S1709, a feature vector for each sub-region is generated to form the feature vector of the local region. The processing of steps S1705 to S1709 is illustrated in FIG. 11B.
  • in step S1711, dimension selection is performed on the feature vector of the local region generated in step S1709.
  • the dimension selection is illustrated in FIGS. 11D to 11F.
  • although hierarchization is performed in the dimension selection, it is desirable to store all generated feature vectors.
  • in step S1713, it is determined whether local feature amount generation and dimension selection have been completed for all feature points detected in step S1703. If not, the process returns to step S1705 and is repeated for the next feature point. When all feature points have been processed, the process proceeds to step S1715, and the local feature amounts and feature point coordinates are registered in the local feature amount DB 221 in association with the landscape element.
  • in step S1717, it is determined whether there is an image of another landscape element. If so, the process returns to step S1701 to acquire the image of the next landscape element and repeat the processing.
  • FIG. 18A is a flowchart showing a processing procedure of landscape element recognition processing S1623 according to the present embodiment.
  • in step S1811, the local feature amount of one landscape element is acquired from the local feature amount DB 221. Then, in step S1813, the local feature amount of the landscape element is collated with the local feature amounts received from the communication terminal 210 (see FIG. 18B).
  • in step S1815, it is determined whether or not they match. If they match, the process proceeds to step S1821, and the matched landscape element is stored.
  • in step S1817, it is determined whether all landscape elements registered in the local feature amount DB 221 have been collated. If any remain, the process returns to step S1811 to collate the next landscape element. In such collation, the search range may be limited in advance in order to reduce the load on the landscape element recognition server or to speed up processing for real-time operation.
  • FIG. 18B is a flowchart showing a processing procedure of collation processing S1813 according to the present embodiment.
  • in step S1833, the smaller number of dimensions is selected between the number of dimensions i of the local feature amount in the local feature amount DB 221 and the number of dimensions j of the received local feature amount.
  • in step S1835, data of the selected number of dimensions of the p-th local feature amount of the landscape element stored in the local feature amount DB 221 is acquired; that is, the selected number of dimensions is acquired starting from the first dimension.
  • in step S1837, the p-th local feature amount acquired in step S1835 is sequentially collated with the local feature amounts of all feature points generated from the input video to determine whether they are similar.
  • in step S1839, it is determined from the collation result whether the similarity between the local feature amounts exceeds the threshold value α.
  • if it does, in step S1841 the pair of matched feature points of the input video and the landscape element is stored together with their positional relationship, and q, a counter of the number of matched feature points, is incremented by one.
  • in step S1843, the feature point of the landscape element is advanced to the next feature point (p ← p + 1), and if collation of all the feature points of the landscape element has not finished (p ≦ m), the process returns to step S1835 to repeat the local feature amount collation.
  • the threshold value ⁇ can be changed according to the recognition accuracy required by the landscape element. Here, if a landscape element has a low correlation with other landscape elements, accurate recognition is possible even if the recognition accuracy is lowered.
  • step S1847 it is determined whether or not the ratio of the feature point number q that matches the local feature amount of the feature point of the input video among the feature point number p of the landscape element exceeds the threshold value ⁇ . If it exceeds, it will progress to step S1849 and will determine whether the positional relationship of the feature point of an input image
  • step S1841 the positional relationship between the feature point of the input video and the feature point of the landscape element stored as the local feature amount is matched in step S1841 is a positional relationship that is possible even by changes such as rotation, inversion, and change of the viewpoint position. Or whether the positional relationship cannot be changed. Since such a determination method is geometrically known, detailed description thereof is omitted. If it is determined in step S1851 that the linear conversion is possible, the process proceeds to step S953 to determine that the collated landscape element exists in the input video. Note that the threshold value ⁇ can be changed in accordance with the recognition accuracy required by the landscape element.
  • a landscape element that has a low correlation with other landscape elements, or whose features can be judged even from a part of it, can be recognized even if part of it is hidden or not visible, as long as a characteristic part is visible.
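The matched-ratio decision of steps S1835 to S1847 above can be sketched as follows. This is a minimal illustration only: the descriptor vectors, the distance threshold, and the ratio threshold are hypothetical stand-ins for the patent's local feature amounts and threshold value ⁇.

```python
import math

def match_landscape_element(element_feats, video_feats,
                            dist_thresh=0.3, ratio_thresh=0.5):
    """Count the landscape element's feature points whose local feature
    matches some feature point of the input video (cf. steps S1835-S1847).
    Descriptors are plain numeric tuples here; thresholds are illustrative."""
    q = 0            # number of matched feature points
    pairs = []       # matched (element point, video point) index pairs,
                     # kept for the later positional-relationship check
    for p, ef in enumerate(element_feats):
        for v, vf in enumerate(video_feats):
            # Euclidean distance between the two local feature descriptors
            if math.dist(ef, vf) <= dist_thresh:
                pairs.append((p, v))
                q += 1
                break  # advance to the next element feature point (p <- p+1)
    # the element is judged present when the matched ratio q/p exceeds the threshold
    return q / len(element_feats) > ratio_thresh, pairs
```

The surviving `pairs` would then be handed to the geometric (rotation/inversion/viewpoint) consistency check of steps S1849 to S1853.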
  • the process of storing all landscape elements in the local feature amount DB 221 and collating against all of them imposes a very large load. Therefore, for example, before a landscape element is recognized from an input video, the user may select a landscape element range from a menu, and only that range is searched in the local feature amount DB 221 and collated. The load can also be reduced by storing, in the local feature amount DB 221, only the local feature amounts of the range used by the user.
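The menu-based narrowing described above might look like the following sketch, where the server collates only the user-selected range. The dictionary layout and the `category` field are assumptions for illustration, not the patent's actual DB schema.

```python
def features_in_range(local_feature_db, selected_categories):
    """Return only the landscape elements whose category falls in the
    user-selected range, so that collation skips the rest of the DB."""
    return {element_id: entry["features"]
            for element_id, entry in local_feature_db.items()
            if entry["category"] in selected_categories}
```

Collating only this subset bounds the per-query matching cost by the size of the selected range instead of the whole DB.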
  • the information processing system according to the present embodiment is different from the second embodiment in that related information is automatically accessed from a link destination even if the user does not perform a link destination access operation. Since other configurations and operations are the same as those of the second embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.
  • FIG. 19 is a sequence diagram showing an operation procedure of the information processing system according to the present embodiment.
  • operations similar to those in FIG. 5 of the second embodiment are denoted by the same step numbers, and description thereof is omitted.
  • in steps S400 and S401, although the applications and data may differ, download, activation, and initialization are performed in the same manner as in the preceding figures.
  • the landscape element recognition server 220, which recognizes the landscape element in the landscape from the local feature amounts of the video received from the communication terminal 210 in step S411, refers to the link information DB 223 in step S513 and acquires the link information corresponding to the recognized landscape element.
  • a link destination is selected in step S1915.
  • the selection of the link destination may be performed based on, for example, an instruction of a user who uses the communication terminal 210 or user recognition by the landscape element recognition server 220, but detailed description thereof is omitted here.
  • in step S1917, the related information providing server 230 of the link destination is accessed, based on the link information, with the ID of the recognized landscape element.
  • in this link destination access, the ID of the communication terminal that transmitted the local feature amounts of the video is also transmitted.
  • the related information providing server 230 acquires landscape element related information (including document data and audio data) corresponding to the landscape element ID accompanying the access from the related information DB 231.
  • the related information is returned to the access-source communication terminal 210. For this reply, the transmitted communication terminal ID is used.
  • the communication terminal 210 that has received the reply of the related information displays or outputs the received related information in step S527.
  • the landscape element recognition server 220 may receive the reply from the link destination and relay it to the communication terminal 210.
  • the communication terminal 210 may be configured such that when link information is received, automatic access to the link destination is performed and a reply from the link destination is notified.
  • the information processing system according to the present embodiment differs from the second embodiment in that the current location and/or the moving direction and moving speed of the user who is capturing a landscape are calculated based on the landscape element recognition process. Since other configurations and operations are the same as those of the second embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.
  • the current location and / or moving direction / speed of the user can be calculated based on the landscape element in the image in the video in real time.
  • Example of communication terminal display screen: FIGS. 20A to 20C are diagrams illustrating display screen examples of the communication terminal 2010 in the information processing system according to the present embodiment.
  • FIG. 20A is a diagram illustrating an example of informing the user's current location.
  • the left figure of FIG. 20A shows the landscape display screen 2011 captured by the communication terminal 2010.
  • the central view of FIG. 20A shows a landscape display screen 2012 captured by the communication terminal 2010 by moving the imaging range clockwise from the left diagram.
  • the right diagram of FIG. 20A shows the current location (the user's current location) 2014 of the communication terminal 2010, determined by the processing of the present embodiment that combines the left and central diagrams on the basis of the angles at which the respective landscape elements were imaged, superimposed on the display screen 2013.
  • if the communication terminal 210 can measure the distances to a plurality of landscape elements, the current location 2014 can be determined from a single video.
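The determination of the current location from the imaging angles of two landscape elements can be illustrated as a plane triangulation. This is a sketch under simplifying assumptions (flat 2D coordinates, bearings known in a common frame, radians measured from the +x axis toward each element); the coordinates are hypothetical.

```python
import math

def locate_from_bearings(a, bearing_a, b, bearing_b):
    """Estimate the observer (imaging) position from two known landscape
    element positions a, b and the bearings at which they were imaged."""
    # direction unit vectors from the unknown position toward each element
    ua = (math.cos(bearing_a), math.sin(bearing_a))
    ub = (math.cos(bearing_b), math.sin(bearing_b))
    # solve P + s*ua = a and P + t*ub = b, i.e. s*ua - t*ub = a - b
    det = -ua[0] * ub[1] + ub[0] * ua[1]
    if abs(det) < 1e-9:
        raise ValueError("bearings are parallel; position is not determined")
    s = (-(a[0] - b[0]) * ub[1] + ub[0] * (a[1] - b[1])) / det
    # walk back from element a along its viewing direction to the observer
    return (a[0] - s * ua[0], a[1] - s * ua[1])
```

With three or more elements, the same idea extends to a least-squares intersection, which would give a more exact position.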
  • FIG. 20B is a diagram illustrating an example in which the moving direction and moving speed of the user on the ground are notified.
  • the left diagram of FIG. 20B shows the landscape display screen 2011 captured by the communication terminal 2010.
  • the central view of FIG. 20B shows a landscape display screen 2022 captured after the communication terminal 2010 has moved a certain distance in a certain direction.
  • a change in the angle of the building in the video can be seen.
  • the right diagram of FIG. 20B shows the moving direction and moving speed 2024 of the communication terminal 2010, determined by the processing of the present embodiment that combines the left and central diagrams on the basis of the change in the angles at which the respective landscape elements were imaged, superimposed on the display screen 2023.
  • FIG. 20C is a diagram illustrating an example of informing the moving direction and moving speed of the user in the air.
  • the left diagram of FIG. 20C shows the display screen 2031 of the landscape in which the communication terminal 2010 images the ground from the air.
  • the central view of FIG. 20C shows a landscape display screen 2032 in which the communication terminal 2010 images the ground from the air after moving a certain distance in a certain direction. In FIG. 20C, it can be seen that the user is moving upward.
  • the right diagram of FIG. 20C shows the moving direction and moving speed 2034 of the communication terminal 2010, determined by the processing of the present embodiment that combines the left and central diagrams on the basis of the changes of the landscape elements in the captured landscape, superimposed on the display screen 2033.
  • FIG. 21 is a sequence diagram illustrating an operation procedure for generating a local feature DB in the information processing system according to the present embodiment.
  • FIG. 21 is an example, and the present invention is not limited to this.
  • local feature DB generation as shown in FIG. 17 of the second embodiment may be used.
  • the operation procedure shown in FIG. 21 is performed because it is desirable that a landscape element can be recognized from any direction.
  • step S2101 a specific landscape element as a target is imaged by an imaging device including one or a plurality of communication terminals 2010.
  • step S2103 the plurality of video data is transmitted to the landscape element recognition server 2420 together with the landscape element information.
  • the landscape element recognition server 2420 generates local feature amounts from the received video data in step S2105.
  • the generated local feature amounts are compared with one another, and local feature amounts having a small mutual correlation are selected as the local feature amounts to be stored in the local feature amount DB 2221.
  • a local feature amount having a small correlation with the others is one that must be stored separately in order to recognize the same landscape element.
  • step S2109 the selected local feature quantity with a small correlation is transmitted to the local feature quantity DB 2221 together with the landscape element information and the accuracy of the local feature quantity.
  • in step S2111, the local feature amount DB 2221 stores the received local feature amounts in association with the landscape element.
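The selection of mutually low-correlation local feature amounts described above might look like the following sketch. The patent does not specify the correlation measure, so cosine similarity is used here as an assumed stand-in, and the feature vectors are illustrative.

```python
import itertools
import math

def cosine(u, v):
    """Cosine similarity, used here as a simple correlation measure."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def select_low_correlation(feature_sets, k=2):
    """From local features generated from multiple videos of the same
    landscape element, pick the k whose worst pairwise correlation is
    smallest, so they cover the element from dissimilar viewpoints."""
    best = min(itertools.combinations(range(len(feature_sets)), k),
               key=lambda idx: max(cosine(feature_sets[i], feature_sets[j])
                                   for i, j in itertools.combinations(idx, 2)))
    return [feature_sets[i] for i in best]
```

Storing only such a low-correlation subset keeps the DB small while still allowing the same landscape element to be matched from different directions.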
  • FIG. 22 is a sequence diagram showing an operation procedure for determining the current location and / or moving direction and moving speed in the information processing system according to the present embodiment.
  • steps similar to those described above are denoted by the same step numbers, and description thereof is omitted.
  • in steps S400 and S401, although the applications and data may differ, download, activation, and initialization are performed in the same manner as in the preceding figures.
  • the landscape element recognition server 2420 recognizes a landscape element in step S2211 by collating the local feature amounts of the video received from the communication terminal 2010 with the local feature amounts in the local feature amount DB 2221.
  • in step S2221, the communication terminal 2010 acquires a video in a direction different from that in step S403.
  • step S2223 a local feature amount of the video acquired in step S2221 is generated.
  • in step S2225, the generated local feature amounts are encoded together with the feature point coordinates, and the encoded local feature amounts are transmitted to the landscape element recognition server 2420.
  • the landscape element recognition server 2420 recognizes the landscape element in step S2229 by collating the received local feature amounts with those in the local feature amount DB 2221.
  • step S2231 it is determined whether or not there is an angle change between the landscape element recognized in step S2211 and the landscape element recognized in step S2229.
  • such an angle change can be measured from the difference in the geometric arrangement of the feature point coordinates during the process of recognizing the landscape element by collation of the local feature amounts (see FIGS. 11H, 27A, and 27B). If the angle change is equal to or greater than a predetermined threshold value, the process proceeds to step S2233, where the moving direction and moving speed of the communication terminal 2010 (user) are calculated with reference to the map DB 2222.
  • the moving direction and moving speed can be calculated as long as the elapsed time between the two video acquisitions (steps S403 and S2221), the angle change of at least one landscape element, and the distance to that landscape element can be measured. If a plurality of landscape elements are referred to, a more exact moving direction and moving speed can be calculated.
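Under the assumption of a flat ground plane, the calculation from one landscape element's angle change, distance, and elapsed time can be sketched as follows; the landmark coordinates, distances, and bearings are hypothetical (radians from the +x axis toward the element).

```python
import math

def motion_from_landmark(landmark, d1, bearing1, d2, bearing2, dt):
    """Estimate moving direction (radians) and speed from the change in
    the imaging angle of one landscape element, the distances to it, and
    the elapsed time dt between the two video acquisitions."""
    def observer(d, b):
        # the observer sits distance d behind the landmark along the bearing
        return (landmark[0] - d * math.cos(b), landmark[1] - d * math.sin(b))
    p1 = observer(d1, bearing1)
    p2 = observer(d2, bearing2)
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    speed = math.hypot(dx, dy) / dt
    direction = math.atan2(dy, dx)
    return direction, speed
```

Referring to several landscape elements would let the two observer positions be estimated more exactly before differencing, as the text above notes.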
  • the landscape element recognition server 2420 transmits the user's moving direction and moving speed to the communication terminal 2010 in step S2235.
  • the communication terminal 2010 then notifies the user of the moving direction and moving speed (see FIG. 20B).
  • in step S2239, it is determined whether the landscape elements in the landscape video have changed.
  • a change of landscape elements means a case where the number of landscape elements that have disappeared from the video and the number of landscape elements that have newly appeared in the video exceed a predetermined threshold value.
  • if the landscape elements have changed, the process proceeds to step S2241, where the current location of the communication terminal 2010 (user) is calculated with reference to the map DB 2222.
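The disappearance/appearance condition of step S2239 reduces to a set difference over the recognized landscape element IDs; the threshold value here is an arbitrary stand-in for the patent's predetermined threshold.

```python
def landscape_changed(prev_ids, curr_ids, threshold=2):
    """True when the number of landscape elements that disappeared from the
    video plus the number that newly appeared exceeds the threshold."""
    prev_set, curr_set = set(prev_ids), set(curr_ids)
    disappeared = len(prev_set - curr_set)
    appeared = len(curr_set - prev_set)
    return disappeared + appeared > threshold
```

When this returns true, the current location recalculation of step S2241 would be triggered.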
  • from the geometric arrangement of the feature point coordinates in the process of recognizing each landscape element by collation of the local feature amounts at each recognition (steps S2211 and S2229), the direction from which each recognized landscape element was imaged can be calculated. Therefore, the imaging position can be calculated in reverse from the position and angle calculation results of a plurality of landscape elements.
  • the landscape element recognition server 2420 transmits the user's current location to the communication terminal 2010 in step S2243.
  • the communication terminal 2010 notifies the user of the current location (see FIG. 20A).
  • FIG. 22 shows an example in which the user's current location is automatically calculated and notified from the change of the landscape elements, and the user's moving direction and moving speed are automatically calculated from the angles of the landscape elements.
  • that is, a user interface is devised in which, as in FIGS. 20A and 20B, the current location is calculated when the user consciously changes the imaging direction of the imaging unit 601, and the moving direction and moving speed are calculated when a predetermined time elapses while the imaging direction of the imaging unit 601 is maintained. However, it is also possible for the user to select the current location or the moving direction and moving speed from a menu of the communication terminal.
  • although FIG. 22 does not show the operation procedure for notifying the user's moving direction and moving speed corresponding to FIG. 20C, it is clear that the moving direction and moving speed can likewise be calculated from the movement of the landscape elements in the video.
  • FIG. 23 is a block diagram illustrating a functional configuration of the communication terminal according to the present embodiment.
  • the recognition result notification unit 2306 is a functional configuration unit including the display screen generation unit 606 in FIG.
  • the current location calculation result receiving unit 2307 receives, via the communication control unit 604, the current location information of the user calculated by the landscape element recognition server 2420.
  • then, the current location notification unit 2308 notifies the user of the current location.
  • the moving direction / speed calculation result receiving unit 2309 receives, via the communication control unit 604, the moving direction and moving speed information of the user calculated by the landscape element recognition server 2420. Then, the moving direction / speed notification unit 2310 notifies the user of them.
  • FIG. 24 is a block diagram illustrating a functional configuration of the landscape element recognition server according to the present embodiment.
  • the same reference numerals are assigned to the same functional components as those in FIG. 7 of the second embodiment, and the description thereof is omitted.
  • the landscape element storage unit 2405 stores the landscape element recognized by the landscape element recognition unit 703 in association with the imaging angle and the imaging time of the landscape element.
  • the landscape element comparison unit 2406 compares the imaging angles of landscape elements recognized as the same landscape element. It also detects the disappearance and appearance of landscape elements in the video by comparing the landscape elements.
  • the movement direction / speed calculation unit 2407 refers to the map DB 2222 and calculates the movement direction and movement speed. Then, the movement direction / speed transmission unit 2408 transmits the calculated movement direction and movement speed to the communication terminal 2010 via the communication control unit 701.
  • the current location calculation unit 2409 calculates the current location from a wide range of landscape elements and imaging angles. Then, the current location transmission unit 2410 transmits the calculated current location to the communication terminal 2010 via the communication control unit 701.
  • FIG. 25 is a diagram illustrating a configuration of the local feature DB 2221 according to the present embodiment.
  • FIG. 25 differs from FIG. 8 in that a plurality of local feature amounts, such as the first calculated local feature amount 2503 and the second calculated local feature amount, are stored in association with the same landscape element ID 2501 and name 2502. As the plurality of local feature amounts, those having a small mutual correlation are selected. Each local feature amount is composed of the first local feature amount to the m-th local feature amount 2505, as in FIG. 8.
  • FIG. 26 is a diagram showing the configuration of the map DB 2222 according to this embodiment.
  • the map DB 2222 includes a map data storage unit 2610 and a landscape element position storage unit 2620.
  • the map data storage unit 2610 stores map data 2612 in association with the map ID 2611.
  • in the landscape element position storage unit 2620, coordinates 2622 composed of the latitude and longitude of the landscape element, an address 2623, and a position 2626 on the map stored in the map data storage unit 2610 are stored in association with the landscape element ID 2621.
  • FIG. 27A and FIG. 27B are diagrams illustrating processing of the landscape element recognition unit according to the present embodiment.
  • although FIGS. 27A and 27B describe one landscape element, the same applies to the many landscape elements in a video.
  • FIG. 27A shows the processing of the landscape element recognition unit on the ground.
  • FIG. 27A is a diagram illustrating a state in which the local feature amounts generated based on landscape element images 2791 to 2793 captured from the ground by the communication terminal 2010 are collated with the local feature amounts stored in advance in the local feature amount DB 2221.
  • FIG. 27B shows the process of the landscape element recognition unit in the air.
  • FIG. 27B is a diagram illustrating a state in which the local feature amounts generated based on landscape element images 2794 to 2796 captured from the air by the communication terminal 2010 are collated with the local feature amounts stored in advance in the local feature amount DB 2221.
  • FIG. 28 is a block diagram illustrating a hardware configuration of the communication terminal 2010 according to the present embodiment.
  • the same reference numerals are attached to the same components as those described above, and description thereof is omitted.
  • the RAM 2840 is a random access memory used by the CPU 1210 as a work area for temporary storage. In the RAM 2840, an area for storing data necessary for realizing the present embodiment is secured.
  • the current location calculation result 2841 indicates the calculated current location of the user.
  • the movement direction / speed calculation result 2842 indicates the calculated movement direction and movement speed.
  • the display screen data 1247 indicates display screen data for notifying the user of information including the current location calculation result 2841 and the movement direction / speed calculation result 2842.
  • the storage 2850 stores a database, various parameters, or the following data or programs necessary for realizing the present embodiment.
  • the calculation result reception notification module 2851 is a module that receives the current location or moving direction and moving speed from the landscape element recognition server 2420 and notifies them.
  • FIG. 28 shows only data and programs essential to the present embodiment, and does not illustrate data and programs not related to the present embodiment.
  • FIG. 29 is a flowchart showing a processing procedure of the communication terminal 2010 according to the present embodiment.
  • the same step numbers are attached to steps similar to those described above, and description thereof is omitted.
  • in step S2923, it is determined whether a landscape element recognition result has been received. If so, the process advances to step S2925 to notify the user of the landscape element recognition result.
  • in step S2927, it is determined whether a current location calculation result has been received. If so, the process advances to step S2929 to notify the user of the current location.
  • in step S2931, it is determined whether a moving direction / speed calculation result has been received. If so, the process proceeds to step S2933 to notify the user of the moving direction and moving speed.
  • FIG. 30 is a block diagram illustrating a hardware configuration of the landscape element recognition server 2420 according to the present embodiment.
  • the same reference numerals are attached to the same components as those described above, and description thereof is omitted.
  • the RAM 3040 is a random access memory that the CPU 1510 uses as a temporary storage work area.
  • the RAM 3040 has an area for storing data necessary for realizing the present embodiment.
  • the current location calculation table 3041 is a table that stores parameters for calculating the current location (see FIG. 31A).
  • the movement direction / speed calculation table 3042 is a table for storing parameters for calculating the movement direction and the movement speed (see FIG. 31B).
  • the storage 3050 stores a database, various parameters, or the following data or programs necessary for realizing the present embodiment.
  • the local feature DB 2221 indicates a local feature DB similar to that shown in FIG.
  • the map DB 2222 shows the same map DB as shown in FIG.
  • the storage 3050 stores the following programs.
  • the current location calculation module 3051 is a module that calculates the current location of the user from the landscape elements and the imaging directions of the landscape elements.
  • the moving direction / speed calculating module 3052 is a module for calculating the moving direction and moving speed of the user from the landscape element and the change in the imaging direction of the landscape element.
  • the recognition result / calculation result transmission module 3053 is a module that transmits the recognition result of the landscape element in the video and the calculation result of the current location or moving direction and moving speed to the communication terminal 2010.
  • FIG. 30 shows only data and programs essential to the present embodiment, and data and programs not related to the present embodiment are not shown.
  • FIG. 31A is a diagram showing a configuration of a current location calculation table 3041 according to the present embodiment.
  • in the current location calculation table 3041, in association with the communication terminal ID 3111, the first landscape element ID of the first landscape element 3112, the distance to the first landscape element and its imaging direction, the second landscape element ID of the second landscape element 3113, and the distance to the second landscape element and its imaging direction are stored.
  • further, the current location calculation result 3114 calculated based on the distances to these landscape elements and the imaging directions is stored.
  • FIG. 31B is a diagram showing a configuration of the movement direction / speed calculation table 3042 according to the present embodiment.
  • in the movement direction / speed calculation table 3042, in association with the communication terminal ID 3121, the first landscape element ID of the first landscape element 3122, the distance to the first landscape element and its imaging direction in the previous video, and the distance to the first landscape element and its imaging direction in the current video are stored.
  • further, the moving direction / speed calculation result 3124 calculated based on these distances and imaging directions is stored.
  • FIG. 32 is a flowchart showing the processing procedure of the landscape element recognition server 2420 according to this embodiment. This flowchart is executed by the CPU 1510 of FIG. 30 using the RAM 3040, and implements each functional component of the landscape element recognition server 2420 of FIG. 24.
  • if video data of a landscape element is received, the process proceeds to step S3213 to execute the local feature amount DB generation processing of the present embodiment (see FIG. 33). If local feature amounts are received, the process advances to step S1623 to perform the landscape element recognition processing (see FIGS. 18A and 18B). In step S3225, the landscape element recognition result is transmitted to the communication terminal 2010.
  • step S3227 it is determined whether or not the present location calculation condition is satisfied.
  • the current location calculation condition is when the change (disappearance or appearance) of a landscape element exceeds a predetermined threshold. If the current location calculation condition is satisfied, the process advances to step S3229 to execute the current location calculation process (see FIG. 34A). In step S3231, the calculated current location information is transmitted to the communication terminal 2010.
  • step S3233 it is determined whether or not the conditions for calculating the moving direction / speed are satisfied.
  • the moving direction / speed calculation condition is that the change in the imaging angle of a landscape element exceeds a predetermined threshold value. If the moving direction / speed calculation condition is satisfied, the process proceeds to step S3235 to execute the moving direction / speed calculation process (see FIG. 34B). In step S3237, the calculated moving direction and moving speed information is transmitted to the communication terminal 2010.
  • FIG. 33 is a flowchart showing the processing procedure of the local feature DB generation processing S3213 according to the present embodiment.
  • the same step numbers are attached to steps similar to those described above, and description thereof is omitted.
  • in step S3301, the local feature amounts generated for a certain landscape element are stored. Next, it is determined whether there is another image obtained by imaging the same landscape element. If there is another image, the process returns to step S1701 and the generation of local feature amounts is repeated. If there is no other image, the process proceeds to step S3305, where a plurality of (at least two) local feature amounts having a small mutual correlation are selected from the local feature amounts generated from the same landscape element and stored in step S3301. Then, in step S1715, the plurality of local feature amounts having a small correlation are stored in the local feature amount DB 2221 in association with the landscape element.
  • FIG. 34A is a flowchart illustrating the processing procedure of the current location calculation processing S3229 according to the present embodiment.
  • in step S3411, the map DB 2222 is referred to in order to acquire the position of each landscape element recognized in the continuous images.
  • step S3413 the orientation (imaging angle) of each landscape element is calculated from the arrangement of the feature points in the matching with the local feature amount of the corresponding landscape element in the local feature amount DB 2221.
  • then, the imaging current location (the current location of the communication terminal and the user) is calculated from the position and orientation of each landscape element.
  • since the collation in step S3413 has already been performed in the landscape element recognition process, it is unnecessary if the orientation was calculated at that time.
  • FIG. 34B is a flowchart showing the processing procedure of the movement direction / speed calculation processing S3235 according to the present embodiment.
  • step S3421 the orientation (imaging angle) of at least two landscape elements in the first image is calculated with reference to the local feature DB 2221.
  • step S3423 the orientation (imaging angle) of the same landscape element in the second image is calculated with reference to the local feature DB 2221.
  • step S3425 the distance to the landscape element is calculated.
  • in step S3427, the moving direction and moving speed of the communication terminal (user) are calculated based on the change in the orientation (imaging angle) of the landscape element and the distance to the landscape element.
  • since the orientation calculations in steps S3421 and S3423 can be performed in the landscape element recognition process, they are unnecessary if the orientation was calculated at that time.
  • the information processing system according to the present embodiment differs from the fourth embodiment in that navigation is performed for the user based on the calculation results of the user's current location and/or moving direction and moving speed. Since other configurations and operations are the same as those of the second embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.
  • although this embodiment shows an example in which the landscape element recognition server performs the navigation processing, the role assignment can be changed, for example, to a configuration in which navigation is performed by a navigation process mounted on a vehicle-mounted navigation system or on a portable terminal.
  • according to the present embodiment, the user can be navigated in real time based on the landscape elements in the video.
  • FIG. 35 is a diagram showing a display screen example of the communication terminal 3510 in the information processing system according to the present embodiment.
  • the left diagram in FIG. 35 shows a landscape display screen 3511 captured by the communication terminal 3510.
  • the central view of FIG. 35 shows a landscape display screen 3512 imaged after the user carrying the communication terminal 3510 (or a vehicle in which the communication terminal 3510 is installed) has moved a certain distance along the road. In FIG. 35, it can be seen that the user is moving forward.
  • on the display screens of FIG. 35, the landscape elements in the video are recognized, and the landscape element names (XX building 3514 and XX park 3515) are superimposed on the video. Further, by combining the left and central diagrams, the current location, moving direction, and moving speed of the communication terminal 3510 are calculated based on the changes of the landscape elements in the captured landscape. On the display screen 3513 on the right side of FIG. 35, navigation information indicating the route to the user's destination ( ⁇ building), obtained by referring to the map DB 2222 based on the calculated current location, moving direction, and moving speed, and an instruction comment 3516 indicating the predicted time to the destination ( ⁇ building) are displayed in a superimposed manner.
  • FIG. 36 is a sequence diagram illustrating an operation procedure of the information processing system according to the present embodiment.
  • the same step numbers are attached to steps similar to those described above, and description thereof is omitted.
  • in steps S400 and S401, although the applications and data may differ, download, activation, and initialization are performed in the same manner as in the preceding figures.
  • step S3603 the destination is set in the communication terminal 3510 by a user input.
  • in steps S3605 and S3607 and steps S3609 and S3611, continuous video acquisition and local feature amount generation are performed. Note that the continuous video acquisition is performed at predetermined time intervals. The predetermined time is appropriately set or adjusted depending on whether the user is walking or riding in a vehicle, or according to the provisionally measured moving speed of the user.
  • step S3613 the local feature amount and feature point coordinates of the continuous video are encoded.
  • in step S3615, the local feature amounts of the continuous videos are transmitted, together with the destination, from the communication terminal 3510 to the landscape element recognition server. Note that the local feature amounts of at least two continuous videos are transmitted, and the local feature amounts of three or more continuous videos may be transmitted.
  • the landscape element recognition server refers to the local feature DB 2221, recognizes the landscape element in the video in step S3617, and calculates the angle at which the landscape element is imaged from the arrangement of the feature point coordinates in step S3619.
  • the landscape element recognition server refers to the map DB 2222, acquires the position of each recognized landscape element in step S3621, and calculates the current location of the user from the landscape element positions and their imaging angles (for details, see the fourth embodiment).
• The user's current location is calculated from the recognized landscape elements (XX building and XX park), their positions on the map, and the imaging angle of each landscape element.
• The moving direction and moving speed of the user are calculated from the change in the imaging angles of the landscape elements between the two images (refer to the fourth embodiment for details). With the user's current location, moving direction, and moving speed thus calculated, in step S3625 the route information to the destination and the estimated arrival time are calculated from this information by referring to the map DB 2222.
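The current-location and motion calculations described in steps S3621 to S3623 can be sketched as follows, assuming known map positions of two recognized landscape elements and the absolute bearings (imaging angles) at which they are seen; the function names and coordinate convention are illustrative assumptions:

```python
import math

def locate_from_bearings(p1, b1, p2, b2):
    """Estimate the observer position from two landmark positions p1, p2
    (map coordinates) and the absolute bearings b1, b2 (radians from the
    +x axis) at which each landmark is imaged.

    The observer lies at p - t * (cos b, sin b) for each landmark; the two
    back-rays are intersected by solving a 2x2 linear system.
    """
    d1 = (math.cos(b1), math.sin(b1))
    d2 = (math.cos(b2), math.sin(b2))
    # Solve t1*d1 - t2*d2 = p1 - p2 for t1 (Cramer's rule).
    rx, ry = p1[0] - p2[0], p1[1] - p2[1]
    det = -d1[0] * d2[1] + d2[0] * d1[1]
    t1 = (-rx * d2[1] + d2[0] * ry) / det
    return (p1[0] - t1 * d1[0], p1[1] - t1 * d1[1])

def motion(o_prev, o_now, dt):
    """Moving speed and direction from two successive observer positions."""
    dx, dy = o_now[0] - o_prev[0], o_now[1] - o_prev[1]
    return math.hypot(dx, dy) / dt, math.atan2(dy, dx)
```

Two successive calls to `locate_from_bearings` on consecutive frames give the positions that `motion` turns into moving speed and moving direction.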
  • the instruction comment is generated.
• In step S3627, the landscape element recognition server transmits, as navigation information, an instruction comment and the local feature amount of the target on which the instruction comment is to be displayed. The calculated current location, moving direction, moving speed, and estimated arrival time are also transmitted.
• In step S3629, the communication terminal 3510 collates the local feature amounts of the previously generated video with the local feature amount indicating the target, stored in the navigation local feature DB 3621, and displays the instruction comment on the target in the video. In the example of FIG. 35, the instruction comment “turn left, 6 minutes to ⁇ ⁇ building”, indicating the road between XX building and XX park, is displayed.
• In step S3631, it is determined whether or not the navigation is finished. If not, the process returns to step S3605 and the navigation is continued.
  • FIG. 37 is a diagram showing a configuration of the navigation local feature DB 3621 according to this embodiment.
• The navigation local feature DB 3621 stores the current location 3701 calculated by the landscape element recognition server, the destination ( ⁇ building) 3702 set by the communication terminal 3510, and the moving direction 3703 and moving speed 3704 calculated by the landscape element recognition server.
  • the instruction comment storage unit 3706 stores an instruction comment, a comment display condition, and a display position.
• The local feature amount of XX building and the local feature amount of XX park are downloaded from the landscape element recognition server to the communication terminal 3510.
• For the instruction comment “turn left, (to) ⁇ building”, the appearance of XX building and XX park in the video is stored as the comment display condition, and the space between XX building and XX park is stored as the position information of the display position.
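As a sketch, one record of the navigation local feature DB 3621 of FIG. 37 could be modeled as follows; the field names are illustrative assumptions, not the patent's reference numerals:

```python
from dataclasses import dataclass, field

@dataclass
class NavigationRecord:
    """One entry of the navigation local-feature DB: current location,
    destination, moving direction/speed, the instruction comment with its
    display condition and display position, and the downloaded local
    features of the landscape elements used as display targets."""
    current_location: tuple
    destination: str
    moving_direction_deg: float
    moving_speed: float
    comment: str = ""
    display_condition: tuple = ()  # landscape elements that must appear in the video
    display_position: str = ""     # e.g. "between the two landscape elements"
    local_features: dict = field(default_factory=dict)

record = NavigationRecord((35.68, 139.76), "destination building", 90.0, 1.2,
                          comment="turn left",
                          display_condition=("XX building", "XX park"),
                          display_position="between XX building and XX park")
record.local_features["XX building"] = [[0.1] * 24]  # downloaded feature vectors
```

The terminal would collate each incoming frame's features against `local_features` and overlay `comment` once `display_condition` is met.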
  • the information processing system according to the present embodiment is different from the fifth embodiment in that the apparatus is automatically guided and controlled toward the target while recognizing the target. Since other configurations and operations are the same as those of the second embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.
  • the apparatus can be automatically guided and controlled in real time based on the landscape elements in the images in the video.
  • FIG. 38 is a diagram showing a display screen example of the communication terminal 3810 in the information processing system according to the present embodiment.
• The display screen is shown only to explain the processing of the present embodiment; since the present embodiment performs automatic guidance control, the screen need not actually be displayed.
  • the left diagram in FIG. 38 shows a landscape display screen 3811 captured by the communication terminal 3810.
  • the display screen 3811 shows a runway 3811a of an airfield that is a guidance target (target).
• The central view of FIG. 38 shows a landscape display screen 3812 captured after the communication terminal 3810 has moved a certain distance through the air.
• From FIG. 38, it can be seen that the communication terminal 3810 is moving upward and is off course to the right with respect to the target. If there is a display unit, a warning is displayed as shown in the figure; under automatic guidance control, control is automatically performed to turn left and return to the course.
  • the present embodiment is characterized by real-time acquisition of the current location, moving direction, and moving speed for guidance control, and detailed description of guidance control is omitted.
  • the current position of the communication terminal 3810, the moving direction, and the moving speed are calculated based on the change of the landscape element in the captured landscape by combining the left diagram and the center diagram.
• A display screen 3813 on the right side of FIG. 38 shows the runway 3813a of the airfield after the terminal has returned to the course and approached further. If there is a display unit, a normal-return indication is displayed as shown in the figure.
  • FIG. 39 is a sequence diagram showing an operation procedure of the information processing system according to the sixth embodiment of the present invention.
• Steps similar to those described above are denoted by the same step numbers, and description thereof is omitted.
• In steps S400 and S401, downloading, activation, and initialization are performed as before, although the applications and data may differ.
• In step S3911, the guidance control computer generates the local feature amount of the target landscape element according to the target landscape element instruction that specifies it, and stores the feature in the local feature DB 2221. If the local feature amount of the target landscape element is already stored in the local feature DB 2221, it is only necessary to set the landscape element ID of the target landscape element.
• In step S3913, the guidance control computer recognizes the target landscape element by comparing the local feature amounts of the video transmitted from the communication terminal 3810 with the local feature amount of the target landscape element stored in the local feature DB 2221.
• In step S3915, it is determined whether or not the recognized target landscape element is at the desired position in the video; if it is not, guidance control is performed to correct the position.
• In step S3917, it is determined whether or not the guidance control is finished. If not, the process returns to step S3913 and automatic guidance control based on the local feature amounts of new video is continued.
  • FIG. 40 is a flowchart illustrating a processing procedure of the guidance control computer according to the present embodiment. This flowchart is executed using the RAM by the CPU of the guidance control computer. In the processing procedure of FIG. 40, steps similar to those in FIG. 16 of the second embodiment are denoted by the same step numbers and description thereof is omitted.
• In step S4011, it is determined whether or not an instruction for a target landscape element has been given.
• If not, it is determined in step S1621 whether a local feature amount has been received from the communication terminal; otherwise, other processing is performed in step S1631.
• If it is the setting of a target landscape element, the process proceeds to step S4013, where the local feature amount of the target landscape element is stored in the local feature DB 2221.
• If a local feature amount has been received, the process proceeds to step S1623 to perform target landscape element recognition processing.
• This processing is the same as the procedure of FIGS. 18A and 18B except that only the target landscape element is recognized, and details are omitted.
• In step S4025, it is determined whether or not the target landscape element has been recognized in the video. If it is not in the video, the process proceeds to step S4029, and position correction by guidance control is performed so that the target landscape element comes into the video. If the target landscape element is in the video, the process proceeds to step S4027 to determine whether its position in the video is the desired position; here, the desired position means the center of the video or an area including a predetermined position. If the target is at the desired position, the process ends without doing anything; if not, the process proceeds to step S4029 to perform position correction by guidance control so that the target landscape element comes to the desired position in the video.
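The decision of steps S4025 to S4029 can be sketched as follows; the frame size, tolerance, and the representation of the correction are illustrative assumptions:

```python
def guidance_correction(target_pos, frame_size, tolerance=0.1):
    """Return None when the target landscape element is not in the frame
    (a search maneuver would follow), (0.0, 0.0) when it is already at the
    desired central position, or a normalized (dx, dy) correction that
    steers the target back toward the frame center."""
    if target_pos is None:
        return None
    w, h = frame_size
    dx = (target_pos[0] - w / 2) / w
    dy = (target_pos[1] - h / 2) / h
    if abs(dx) <= tolerance and abs(dy) <= tolerance:
        return (0.0, 0.0)  # desired position reached: no correction
    return (dx, dy)        # position correction by guidance control
```

The "desired position" here is the frame center; an area around a predetermined position could be substituted by shifting the reference point.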
  • the information processing system according to the present embodiment is different from the sixth embodiment in that the apparatus is automatically guided and controlled toward the target while recognizing the landscape of the route. Since other configurations and operations are the same as those of the second embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.
  • the apparatus can be automatically guided and controlled in real time based on the landscape elements in the images in the video.
  • the automatic guidance control of the sixth embodiment and the automatic guidance control of the present embodiment can be switched and used in accordance with conditions such as the distance between the device and the target and the altitude of the device. Further, it can be combined with conventional automatic guidance control using radio waves or lasers. In general, the guidance by radio waves is suitable for positions far from the ground, the guidance of this embodiment is suitable for altitudes where the landscape elements on the ground can be recognized, and the guidance of the sixth embodiment is suitable for distances and altitudes where the target can be seen. Alternatively, it is possible to switch the guidance method according to conditions such as whether the visibility is good or bad.
  • FIG. 41 is a diagram showing a display screen example of the communication terminal in the information processing system according to the present embodiment.
• The display screen is shown only to explain the processing of the present embodiment; since the present embodiment performs automatic guidance control, the screen need not actually be displayed.
• The left diagram in FIG. 41 shows a landscape display screen 4111 in which the ground is imaged from the air by the communication terminal 4110.
  • the display screen 4111 shows a landscape element area 4111a to be recognized.
• The landscape element in the region 4111a is recognized by collation with the local feature amounts in the local feature DB 2221. It is then compared with the route screen DB 4210, which stores the route screens to be imaged along the course to the target landscape element.
• The central view of FIG. 41 shows a landscape display screen 4112 obtained by imaging the ground from the air after the communication terminal 4110 has moved a certain distance through the air.
  • the display screen 4112 shows a landscape element region 4112a to be recognized.
• The landscape element in the region 4112a is recognized by collation with the local feature amounts in the local feature DB 2221. It is then compared with the route screen DB 4210, which stores the route screens to be imaged along the course to the target landscape element.
  • the right diagram of FIG. 41 shows a landscape display screen 4113 in which the ground is imaged from the air after the communication terminal 4110 has further moved through the air by a certain distance.
  • the display screen 4113 shows a landscape element area 4113a to be recognized.
• The landscape element in the region 4113a is recognized by collation with the local feature amounts in the local feature DB 2221. It is then compared with the route screen DB 4210, which stores the route screens to be imaged along the course to the target landscape element.
  • FIG. 42 is a sequence diagram illustrating an operation procedure of the information processing system according to the present embodiment.
• Steps similar to those described above are denoted by the same step numbers, and description thereof is omitted.
• In steps S400 and S401, downloading, activation, and initialization are performed as before, although the applications and data may differ.
• In step S4211, according to the instruction of the target landscape element, the guidance control computer generates the local feature amounts of the target landscape element and, if necessary, of the landscape elements along the course toward it acquired from the map DB 2222, and stores them in the local feature DB 2221. If the local feature amounts of these landscape elements are already stored in the local feature DB 2221, it is only necessary to set the landscape element ID of each landscape element.
• In step S4213, the guidance control computer refers to the local feature DB 2221 and the map DB 2222, generates route images representing the videos at a plurality of locations on the course to the target landscape element, and retains them in the route screen DB 4210.
• In step S4215, the guidance control computer recognizes the landscape elements in the video by comparing the local feature amounts of the video transmitted from the communication terminal 4110 with the local feature amounts of the landscape elements stored in the local feature DB 2221.
• In step S4217, landscape recognition covering the plurality of recognized landscape elements is performed, and route correction is performed by guidance control so that the video matches the route images up to the target landscape element stored in the route screen DB 4210.
• In step S4219, it is determined whether or not the guidance control is finished. If not, the process returns to step S4215 and automatic guidance control based on the local feature amounts of new video is continued.
  • FIG. 43 is a diagram showing a configuration of the route screen DB 4210 according to the present embodiment.
• The route screen DB 4210 includes a landscape element group storage unit 4310 that sequentially stores the local feature amounts of the landscape element group to be imaged along the route to the target landscape element, and a video storage unit 4320 that stores the local feature amounts of the video to be imaged along that route.
• In the landscape element group storage unit 4310, in association with the route ID 4312 to the target landscape element 4311, the first landscape element ID 4313 and the second landscape element ID 4314 that should be in the video are each stored together with their local feature amounts and their relative positions in the video along the route.
• The video storage unit 4320 stores the local feature amount of the entire video to be imaged, in association with the route ID 4322 to the target landscape element 4321.
  • the automatic guidance control of this embodiment is performed using either or both of the landscape element group storage unit 4310 and the video storage unit 4320.
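The two storage units of FIG. 43 can be sketched as a simple in-memory structure; the names and the feature representation are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class RouteScreenDB:
    """Route screen DB 4210 sketch: per route ID, a landscape element
    group (each element ID with its local feature and expected relative
    position in the video) and the local feature of the whole expected
    video along the route."""
    element_groups: dict = field(default_factory=dict)
    whole_video: dict = field(default_factory=dict)

    def add_element(self, route_id, element_id, feature, rel_pos):
        # Elements are appended in the order they should appear on the route.
        self.element_groups.setdefault(route_id, []).append(
            (element_id, feature, rel_pos))

db = RouteScreenDB()
db.add_element("route-1", "first landscape element", [0.2, 0.8], (0.25, 0.5))
db.add_element("route-1", "second landscape element", [0.6, 0.1], (0.75, 0.5))
db.whole_video["route-1"] = [0.4, 0.45]  # feature of the entire expected video
```

Guidance control could consult either store, or both, as the description states.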
  • FIG. 44 is a flowchart showing a processing procedure of the guidance control computer according to the present embodiment. This flowchart is executed using the RAM by the CPU of the guidance control computer.
  • steps similar to those in FIG. 16 of the second embodiment or FIG. 40 of the sixth embodiment are denoted by the same step numbers, and description thereof is omitted.
• In step S4011, it is determined whether or not an instruction for a target landscape element has been given.
• If not, it is determined in step S1621 whether a local feature amount has been received from the communication terminal; otherwise, other processing is performed in step S1631.
• In step S4413, the route screens along the course to the target landscape element are stored in the route screen DB 4210.
• If a local feature amount has been received, the process proceeds to step S1623 to perform landscape element recognition processing.
• In step S4425, the arrangement of the recognized landscape elements in the video is analyzed with reference to the route screen DB 4210.
• In step S4437, the route screens of the route screen DB 4210 are compared with the screen in the video, and route correction processing is performed if there is a deviation.
  • the information processing system according to the present embodiment is different from the first to seventh embodiments in that the communication terminal performs all processes including landscape element recognition. Since other configurations and operations are the same as those of the second embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.
• According to the present embodiment, all processing can be performed on the communication terminal alone, based on the landscape elements in the images in the video.
  • FIG. 45 is a block diagram showing a functional configuration of a communication terminal 4510 according to this embodiment. 45, the same reference numerals are given to the same functional components as those in FIG. 6 of the second embodiment or FIG. 23 of the fourth embodiment, and description thereof will be omitted.
• The landscape element recognition unit 4501 recognizes landscape elements in the video by collating the local feature amounts of the image generated by the local feature amount generation unit 602 with the local feature amounts of the landscape elements stored in the local feature DB 4502.
  • the landscape element storage unit 4503 stores at least one previously recognized landscape element.
  • the landscape element comparison unit 4504 compares the landscape element stored in the landscape element storage unit 4503 with the landscape element currently recognized by the landscape element recognition unit 4501.
• The moving direction / speed calculating unit 4505 refers to the map DB 4506 and calculates the moving direction and moving speed of the user based on the change in the imaging angles of the landscape elements.
• The current location calculation unit 4507 refers to the map DB 4506 and calculates the current location of the user based on the imaging angles of a plurality of landscape elements.
• The navigation information generation unit 4508 refers to the map DB 4506 and generates navigation information toward the destination based on the calculated current location, moving direction, and moving speed of the user.
  • the navigation information notification unit 4509 notifies the user of the generated navigation information.
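The terminal-only flow of FIG. 45 can be sketched as one function wiring the units together; the recognition and location steps are injected as placeholders, since their internals are described elsewhere in the specification:

```python
import math

def terminal_pipeline(frame, recognize, locate, prev_location, dt):
    """Sketch of the communication-terminal-only processing: recognize
    landscape elements in the frame (units 4501/4502), calculate the
    current location (unit 4507), and derive moving speed and direction
    from the previously stored location (units 4503-4505)."""
    elements = recognize(frame)          # landscape element recognition
    location = locate(elements)          # current location calculation
    speed = heading = None
    if prev_location is not None:
        dx = location[0] - prev_location[0]
        dy = location[1] - prev_location[1]
        speed = math.hypot(dx, dy) / dt  # moving speed
        heading = math.atan2(dy, dx)     # moving direction
    return {"elements": elements, "location": location,
            "speed": speed, "heading": heading}
```

A caller would pass the terminal's own recognizer and locator; any stand-ins with the same shape work for illustration.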
• The present invention may be applied to a system composed of a plurality of devices, or to a single device. Furthermore, the present invention can also be applied to a case where a control program that realizes the functions of the embodiments is supplied directly or remotely to a system or apparatus. Therefore, in order to realize the functions of the present invention with a computer, a control program installed in the computer, a medium storing the control program, and a WWW (World Wide Web) server from which the control program is downloaded are also included in the scope of the present invention.
• (Appendix 1) An information processing system comprising: first local feature storage means for storing, in association with a landscape element, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the landscape element; second local feature generation means for extracting n feature points from an image captured by an imaging means and generating n second local features each consisting of a feature vector of 1 to j dimensions for n local regions including each of the n feature points; and landscape element recognition means for selecting the smaller number of dimensions from the dimension number i of the feature vectors of the first local features and the dimension number j of the feature vectors of the second local features, and recognizing that the landscape element exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features consisting of feature vectors up to the selected number of dimensions correspond to the n second local features consisting of feature vectors up to the selected number of dimensions.
• (Appendix 2) The information processing system according to Appendix 1, wherein the landscape elements include landscape elements constituting a natural landscape and buildings constituting an artificial landscape.
• (Appendix 3) The information processing system according to Appendix 1 or 2, further comprising notification means for notifying a recognition result of the landscape element recognition means.
• (Appendix 4) The information processing system according to Appendix 3, wherein the notification means further notifies information related to the recognition result.
• (Appendix 5) The information processing system according to Appendix 3, wherein the notification means includes related information acquisition means for acquiring information related to the recognition result according to link information, and notifies the related information acquired according to the link information.
• (Appendix 7) The information processing system according to any one of the preceding appendices, further comprising: position/angle calculation means for calculating the positions and imaging angles of a plurality of landscape elements; and current location calculation means for calculating the current location from the positions and imaging angles of the plurality of landscape elements.
• (Appendix 12) The information processing system according to any one of Appendices 1 to 8, further comprising: target landscape element indicating means for indicating a target landscape element; route image holding means for holding, in association with local features, the landscape elements in the images along the course to the target landscape element; and guidance control means for controlling the imaging position so that a desired landscape element exists at a predetermined position in the video captured by the imaging means.
• (Appendix 13) The information processing system according to any one of Appendices 1 to 12, wherein the first local features and the second local features are generated by dividing a local region including a feature point extracted from an image into a plurality of sub-regions and generating a feature vector of a plurality of dimensions consisting of histograms of gradient directions in the plurality of sub-regions.
• (Appendix 14) The information processing system according to Appendix 13, wherein the first local features and the second local features are generated by deleting, from the generated feature vectors of the plurality of dimensions, dimensions having a larger correlation between adjacent sub-regions.
• (Appendix 15) The information processing system according to Appendix 13 or 14, wherein the plurality of dimensions of the feature vector are arranged so as to go around the local region once for every predetermined number of dimensions, in order from the dimensions contributing most to the feature of the feature point, so that dimensions can be selected from the first dimension onward in accordance with the accuracy required of the local feature.
• (Appendix 16) The information processing system according to Appendix 15, wherein the second local feature generation means generates, in correspondence with the correlation between landscape elements, second local features with a smaller number of dimensions for a landscape element having lower correlation with other landscape elements.
• (Appendix 17) The information processing system according to Appendix 15 or 16, wherein the first local feature storage means stores, in correspondence with the correlation between landscape elements, first local features with a smaller number of dimensions for a landscape element having lower correlation with other landscape elements.
• (Appendix 18) An information processing method in an information processing system including first local feature storage means for storing, in association with a landscape element, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the landscape element, the method comprising: a second local feature generation step of extracting n feature points from an image captured by an imaging means and generating n second local features each consisting of a feature vector of 1 to j dimensions for n local regions including each of the n feature points; and a recognition step of selecting the smaller number of dimensions from the dimension number i of the feature vectors of the first local features and the dimension number j of the feature vectors of the second local features, and recognizing that the landscape element exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features consisting of feature vectors up to the selected number of dimensions correspond to the n second local features consisting of feature vectors up to the selected number of dimensions.
• (Appendix 19) A communication terminal comprising: second local feature generation means for extracting n feature points from an image captured by an imaging means and generating n second local features each consisting of a feature vector of 1 to j dimensions for n local regions including each of the n feature points; first transmission means for transmitting the n second local features to an information processing apparatus that recognizes a landscape element contained in the captured image based on collation of local features; and first receiving means for receiving, from the information processing apparatus, information indicating the landscape element contained in the captured image.
• (Appendix 20) A control method for a communication terminal, comprising: a second local feature generation step of extracting n feature points from an image captured by an imaging means and generating n second local features each consisting of a feature vector of 1 to j dimensions for n local regions including each of the n feature points; a first transmission step of transmitting the n second local features to an information processing apparatus that recognizes a landscape element contained in the captured image based on collation of local features; and a first reception step of receiving, from the information processing apparatus, information indicating the landscape element contained in the captured image.
• (Appendix 21) A control program for a communication terminal, which causes a computer to execute: a second local feature generation step of extracting n feature points from an image captured by an imaging means and generating n second local features each consisting of a feature vector of 1 to j dimensions for n local regions including each of the n feature points; a first transmission step of transmitting the n second local features to an information processing apparatus that recognizes a landscape element contained in the captured image based on collation of local features; and a first reception step of receiving, from the information processing apparatus, information indicating the landscape element contained in the captured image.
• (Appendix 22) An information processing apparatus comprising: first local feature storage means for storing, in association with a landscape element, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the landscape element; second receiving means for receiving from a communication terminal n second local features each consisting of a feature vector of 1 to j dimensions, generated for n local regions including each of n feature points extracted from an image in a video captured by the communication terminal; landscape element recognition means for selecting the smaller number of dimensions from the dimension number i of the feature vectors of the first local features and the dimension number j of the feature vectors of the second local features, and recognizing that the landscape element exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features consisting of feature vectors up to the selected number of dimensions correspond to the n second local features consisting of feature vectors up to the selected number of dimensions; and second transmission means for transmitting information indicating the recognized landscape element to the communication terminal.
• (Appendix 23) A control method for an information processing apparatus including first local feature storage means for storing, in association with a landscape element, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the landscape element, the method comprising: a second reception step of receiving from a communication terminal n second local features each consisting of a feature vector of 1 to j dimensions, generated for n local regions including each of n feature points extracted from an image in a video captured by the communication terminal; a recognition step of selecting the smaller number of dimensions from the dimension number i and the dimension number j, and recognizing that the landscape element exists in the image in the video when it is determined that a predetermined ratio or more of the m first local features consisting of feature vectors up to the selected number of dimensions correspond to the n second local features consisting of feature vectors up to the selected number of dimensions; and a second transmission step of transmitting information indicating the recognized landscape element to the communication terminal.
• (Appendix 24) A control program for an information processing apparatus including first local feature storage means for storing, in association with a landscape element, m first local features each consisting of a feature vector of 1 to i dimensions, generated for each of m local regions including each of m feature points of an image of the landscape element, the program causing a computer to execute: a second reception step of receiving from the communication terminal n second local features each consisting of a feature vector of 1 to j dimensions, generated for n local regions including each of n feature points extracted from an image in a video captured by the communication terminal;

Abstract

Provided is a technology for recognizing landscape elements such as buildings in an image in a video in real time. A landscape element, and m number of first local features, which comprise feature vectors having from 1 to i dimensions for each of m number of local regions including m number of feature points in an image of the landscape element, are associated and stored. Next, n number of feature points are extracted from an image in a captured video, and n number of second local features, which comprise feature vectors having from 1 to j dimensions for each of n number of local regions including the n number of feature points, are generated. The number of dimensions (i) of the feature vectors of the first local features or the number of dimensions (j) of the feature vectors of the second local features, whichever is the smaller number of dimensions, is selected. The landscape element is recognized to be present in the image from the video when a prescribed proportion or more of the m number of first local features up to the selected number of dimensions is determined to correspond to the n number of second local features up to the selected number of dimensions.
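The matching rule summarized in this abstract can be sketched as follows; the distance metric, its threshold, and the matching ratio are illustrative assumptions, while the dimension selection follows the text:

```python
def recognize_landscape_element(first_feats, i_dims, second_feats, j_dims,
                                ratio=0.5, dist_thresh=0.25):
    """Truncate both feature sets to the smaller dimension count, then
    declare the landscape element present when at least `ratio` of the m
    stored first local features have a close counterpart among the n
    second local features generated from the captured image."""
    d = min(i_dims, j_dims)                  # select the smaller number of dimensions
    firsts = [f[:d] for f in first_feats]    # m stored features, truncated
    seconds = [f[:d] for f in second_feats]  # n query features, truncated

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    matched = sum(1 for f in firsts
                  if any(dist(f, s) <= dist_thresh for s in seconds))
    return matched >= ratio * len(firsts)
```

Because the comparison uses only the leading dimensions, a terminal that transmits short descriptors can still be matched against a server database of longer ones.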

Description

Information processing system, information processing method, information processing apparatus and control method and control program thereof, communication terminal and control method and control program thereof
 The present invention relates to a technique for identifying landscape elements, including buildings, in a captured video by using local feature amounts.
 In the above technical field, Patent Document 1 describes a technique in which feature amounts extracted from a plurality of images of a building are collated with feature amounts in a database, the degree of match is comprehensively evaluated, and related information on the identified building is obtained. Patent Document 2 describes a technique for improving recognition speed by clustering feature amounts when a query image is recognized using a model dictionary generated in advance from model images.
JP 2010-045644 A; JP 2011-221688 A
 However, the techniques described in the above documents cannot recognize landscape elements, including buildings, in images in a video in real time.
 An object of the present invention is to provide a technique for solving the above-described problems.
In order to achieve the above object, a system according to the present invention provides:
M first local features each consisting of a 1-dimensional to i-dimensional feature vector generated for each of the landscape elements and m local regions including each of the m feature points of the landscape element image. First local feature quantity storage means for storing the quantity in association with each other;
N feature points are extracted from the image captured by the imaging means, and n local feature regions each including the n feature points are each composed of feature vectors of 1 to j dimensions. Second local feature quantity generating means for generating the second local feature quantity of
A smaller dimension number is selected from among the dimension number i of the feature vector of the first local feature quantity and the dimension number j of the feature vector of the second local feature quantity, and the feature vector includes up to the selected dimension number. When it is determined that the n second local feature amounts correspond to a predetermined ratio or more of the m first local feature amounts including feature vectors up to the selected number of dimensions, the image in the video A landscape element recognition means for recognizing that the landscape element exists in
It is characterized by providing.
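The dimension-selection matching described above (truncate both feature sets to the smaller of i and j, then test whether a predetermined proportion of the stored features find a correspondence) can be sketched briefly as follows. This is an illustrative sketch only, not the patented implementation: the function and variable names, the Euclidean distance metric, and the threshold values are all assumptions introduced here.

```python
import numpy as np

def match_local_features(first_feats, second_feats,
                         ratio_threshold=0.5, dist_threshold=0.5):
    """Decide whether a stored landscape element appears in a captured image.

    first_feats:  (m, i) array -- m first local features, up to i dimensions
    second_feats: (n, j) array -- n second local features, up to j dimensions
    Both are assumed ordered so that keeping only the leading dimensions of a
    feature vector still yields a meaningful (coarser) descriptor.
    """
    i, j = first_feats.shape[1], second_feats.shape[1]
    d = min(i, j)                   # select the smaller dimension count
    a = first_feats[:, :d]          # truncate both sides to d dimensions
    b = second_feats[:, :d]
    matched = 0
    for fv in a:                    # for each stored first local feature...
        dists = np.linalg.norm(b - fv, axis=1)
        if dists.min() <= dist_threshold:   # ...does some second feature correspond?
            matched += 1
    # Recognize the element if at least the predetermined proportion of the
    # m first local features found a corresponding second local feature.
    return matched / len(a) >= ratio_threshold
```

Because matching is performed only on the shared leading dimensions, a terminal that generated short (j-dimensional) descriptors can still be matched against longer stored descriptors, which is what enables the real-time operation the claims aim at.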
In order to achieve the above object, a method according to the present invention is an information processing method for an information processing system comprising first local feature storage means for storing, in association with a landscape element, m first local features generated for m local regions each containing one of m feature points in an image of the landscape element, each first local feature consisting of a feature vector of dimensions 1 through i, the method comprising:
a second local feature generation step of extracting n feature points from an image in a video captured by imaging means and generating, for n local regions each containing one of the n feature points, n second local features each consisting of a feature vector of dimensions 1 through j; and
a recognition step of selecting the smaller of the dimension count i of the feature vectors of the first local features and the dimension count j of the feature vectors of the second local features, and recognizing that the landscape element is present in the image in the video when it is determined that at least a predetermined proportion of the m first local features, each reduced to a feature vector of up to the selected dimension count, correspond to the n second local features, each reduced to a feature vector of up to the selected dimension count.
In order to achieve the above object, an apparatus according to the present invention comprises:
second local feature generation means for extracting n feature points from an image in a video captured by imaging means and generating, for n local regions each containing one of the n feature points, n second local features each consisting of a feature vector of dimensions 1 through j;
first transmission means for transmitting the n second local features to an information processing device that recognizes, by matching local features, a landscape element contained in the captured image; and
first reception means for receiving, from the information processing device, information indicating the landscape element contained in the captured image.
In order to achieve the above object, a method according to the present invention comprises:
a second local feature generation step of extracting n feature points from an image in a video captured by imaging means and generating, for n local regions each containing one of the n feature points, n second local features each consisting of a feature vector of dimensions 1 through j;
a first transmission step of transmitting the n second local features to an information processing device that recognizes, by matching local features, a landscape element contained in the captured image; and
a first reception step of receiving, from the information processing device, information indicating the landscape element contained in the captured image.
In order to achieve the above object, a program according to the present invention causes a computer to execute:
a second local feature generation step of extracting n feature points from an image in a video captured by imaging means and generating, for n local regions each containing one of the n feature points, n second local features each consisting of a feature vector of dimensions 1 through j;
a first transmission step of transmitting the n second local features to an information processing device that recognizes, by matching local features, a landscape element contained in the captured image; and
a first reception step of receiving, from the information processing device, information indicating the landscape element contained in the captured image.
In order to achieve the above object, an apparatus according to the present invention comprises:
first local feature storage means for storing, in association with a landscape element, m first local features generated for m local regions each containing one of m feature points in an image of the landscape element, each first local feature consisting of a feature vector of dimensions 1 through i;
second reception means for receiving, from a communication terminal, n second local features each consisting of a feature vector of dimensions 1 through j, generated for n local regions each containing one of n feature points extracted from an image in a video captured by the communication terminal;
recognition means for selecting the smaller of the dimension count i of the feature vectors of the first local features and the dimension count j of the feature vectors of the second local features, and recognizing that the landscape element is present in the image in the video when it determines that at least a predetermined proportion of the m first local features, each reduced to a feature vector of up to the selected dimension count, correspond to the n second local features, each reduced to a feature vector of up to the selected dimension count; and
second transmission means for transmitting information indicating the recognized landscape element to the communication terminal.
In order to achieve the above object, a method according to the present invention is a control method for an information processing device comprising first local feature storage means for storing, in association with a landscape element, m first local features generated for m local regions each containing one of m feature points in an image of the landscape element, each first local feature consisting of a feature vector of dimensions 1 through i, the method comprising:
a second reception step of receiving, from a communication terminal, n second local features each consisting of a feature vector of dimensions 1 through j, generated for n local regions each containing one of n feature points extracted from an image in a video captured by the communication terminal;
a recognition step of selecting the smaller of the dimension count i of the feature vectors of the first local features and the dimension count j of the feature vectors of the second local features, and recognizing that the landscape element is present in the image in the video when it is determined that at least a predetermined proportion of the m first local features, each reduced to a feature vector of up to the selected dimension count, correspond to the n second local features, each reduced to a feature vector of up to the selected dimension count; and
a second transmission step of transmitting information indicating the recognized landscape element to the communication terminal.
In order to achieve the above object, a program according to the present invention is a control program for an information processing device comprising first local feature storage means for storing, in association with a landscape element, m first local features generated for m local regions each containing one of m feature points in an image of the landscape element, each first local feature consisting of a feature vector of dimensions 1 through i, the program causing a computer to execute:
a second reception step of receiving, from a communication terminal, n second local features each consisting of a feature vector of dimensions 1 through j, generated for n local regions each containing one of n feature points extracted from an image in a video captured by the communication terminal;
a recognition step of selecting the smaller of the dimension count i of the feature vectors of the first local features and the dimension count j of the feature vectors of the second local features, and recognizing that the landscape element is present in the image in the video when it is determined that at least a predetermined proportion of the m first local features, each reduced to a feature vector of up to the selected dimension count, correspond to the n second local features, each reduced to a feature vector of up to the selected dimension count; and
a second transmission step of transmitting information indicating the recognized landscape element to the communication terminal.
According to the present invention, landscape elements, including buildings, in the images of a video can be recognized in real time.
A block diagram showing the configuration of the information processing system according to the first embodiment of the present invention.
A block diagram showing the configuration of the information processing system according to the second embodiment of the present invention.
A diagram showing an example display screen of the communication terminal in the information processing system according to the second embodiment.
A sequence diagram showing the operation procedure for reporting related information in the information processing system according to the second embodiment.
A sequence diagram showing the operation procedure for reporting link information in the information processing system according to the second embodiment.
A block diagram showing the functional configuration of the communication terminal according to the second embodiment.
A block diagram showing the functional configuration of the landscape element recognition server according to the second embodiment.
A diagram showing the configuration of the local feature DB according to the second embodiment.
A diagram showing the configuration of the related information DB according to the second embodiment.
A diagram showing the configuration of the link information DB according to the second embodiment.
A block diagram showing the functional configuration of the local feature generation unit according to the second embodiment.
A diagram illustrating the procedure of local feature generation according to the second embodiment.
A diagram illustrating the procedure of local feature generation according to the second embodiment.
A diagram showing the selection order of sub-regions in the local feature generation unit according to the second embodiment.
A diagram showing the selection order of feature vectors in the local feature generation unit according to the second embodiment.
A diagram showing the hierarchization of feature vectors in the local feature generation unit according to the second embodiment.
A diagram showing the configuration of the encoding unit according to the second embodiment.
A diagram showing the processing of the landscape element recognition unit according to the second embodiment.
A block diagram showing the hardware configuration of the communication terminal according to the second embodiment.
A diagram showing the local feature generation table in the communication terminal according to the second embodiment.
A flowchart showing the processing procedure of the communication terminal according to the second embodiment.
A flowchart showing the procedure of the local feature generation processing according to the second embodiment.
A flowchart showing the procedure of the encoding processing according to the second embodiment.
A flowchart showing the procedure of the difference value encoding processing according to the second embodiment.
A block diagram showing the hardware configuration of the landscape element recognition server according to the second embodiment.
A flowchart showing the processing procedure of the landscape element recognition server according to the second embodiment.
A flowchart showing the procedure of the local feature DB generation processing according to the second embodiment.
A flowchart showing the procedure of the landscape element recognition processing according to the second embodiment.
A flowchart showing the procedure of the matching processing according to the second embodiment.
A sequence diagram showing the operation procedure of the information processing system according to the third embodiment of the present invention.
A diagram showing an example display screen of the communication terminal in the information processing system according to the fourth embodiment of the present invention.
A diagram showing an example display screen of the communication terminal in the information processing system according to the fourth embodiment.
A diagram showing an example display screen of the communication terminal in the information processing system according to the fourth embodiment.
A sequence diagram showing the operation procedure of local feature DB generation in the information processing system according to the fourth embodiment.
A sequence diagram showing the operation procedure of current-location determination and/or movement direction/action determination in the information processing system according to the fourth embodiment.
A block diagram showing the functional configuration of the communication terminal according to the fourth embodiment.
A block diagram showing the functional configuration of the landscape element recognition server according to the fourth embodiment.
A diagram showing the configuration of the local feature DB according to the fourth embodiment.
A diagram showing the configuration of the map DB according to the fourth embodiment.
A diagram showing the processing of the landscape element recognition unit according to the fourth embodiment.
A diagram showing the processing of the landscape element recognition unit according to the fourth embodiment.
A block diagram showing the hardware configuration of the communication terminal according to the fourth embodiment.
A flowchart showing the processing procedure of the communication terminal according to the fourth embodiment.
A block diagram showing the hardware configuration of the landscape element recognition server according to the fourth embodiment.
A diagram showing the configuration of the current-location calculation table according to the fourth embodiment.
A diagram showing the configuration of the movement direction/speed calculation table according to the fourth embodiment.
A flowchart showing the processing procedure of the landscape element recognition server according to the fourth embodiment.
A flowchart showing the procedure of the local feature DB generation processing according to the fourth embodiment.
A flowchart showing the procedure of the current-location calculation processing according to the fourth embodiment.
A flowchart showing the procedure of the movement direction/speed calculation processing according to the fourth embodiment.
A diagram showing an example display screen of the communication terminal in the information processing system according to the fifth embodiment of the present invention.
A sequence diagram showing the operation procedure of the information processing system according to the fifth embodiment.
A diagram showing the configuration of the navigation local feature DB according to the fifth embodiment.
A diagram showing an example display screen of the communication terminal in the information processing system according to the sixth embodiment of the present invention.
A sequence diagram showing the operation procedure of the information processing system according to the sixth embodiment.
A flowchart showing the processing procedure of the guidance control computer according to the sixth embodiment.
A diagram showing an example display screen of the communication terminal in the information processing system according to the seventh embodiment of the present invention.
A sequence diagram showing the operation procedure of the information processing system according to the seventh embodiment.
A diagram showing the configuration of the route screen DB according to the seventh embodiment.
A flowchart showing the processing procedure of the guidance control computer according to the seventh embodiment.
A block diagram showing the functional configuration of the communication terminal according to the eighth embodiment of the present invention.
Exemplary embodiments of the present invention will now be described in detail with reference to the drawings. However, the constituent elements described in the following embodiments are merely examples, and the technical scope of the present invention is not limited to them. As used herein, the term "landscape element" encompasses elements that make up a natural landscape, such as mountains, as well as structures that make up an artificial landscape, such as buildings.
[First Embodiment]
An information processing system 100 as a first embodiment of the present invention will be described with reference to FIG. 1. The information processing system 100 recognizes landscape elements in real time.
As shown in FIG. 1, the information processing system 100 includes a first local feature storage unit 110, a second local feature generation unit 130, and a landscape element recognition unit 140. The first local feature storage unit 110 stores, in association with a landscape element 111, m first local features 112 generated for m local regions each containing one of m feature points in an image of the landscape element 111, each consisting of a feature vector of dimensions 1 through i. The second local feature generation unit 130 extracts n feature points 131 from an image 101 in a video captured by an imaging unit 120, and generates, for n local regions 132 each containing one of the n feature points, n second local features 133 each consisting of a feature vector of dimensions 1 through j. The landscape element recognition unit 140 selects the smaller of the dimension count i of the feature vectors of the first local features 112 and the dimension count j of the feature vectors of the second local features 133. When it determines that at least a predetermined proportion of the m first local features 112, each reduced to a feature vector of up to the selected dimension count, correspond to the n second local features 133, each reduced to a feature vector of up to the selected dimension count, it recognizes that the landscape element 111 is present in the image 101 in the video.
According to this embodiment, landscape elements, including buildings, in the images of a video can be recognized in real time.
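The recognition decision of the first embodiment (truncate the stored and captured feature sets to the smaller dimension count, then check whether a predetermined proportion of the stored features of each landscape element find a correspondence in the frame) can be sketched as follows. The database contents, thresholds, and all names are illustrative assumptions, not the patented implementation.

```python
import numpy as np

# First local feature storage: landscape element -> (m, i) array of
# i-dimensional descriptors (illustrative values, not real descriptor data).
local_feature_db = {
    "tower":  np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]),
    "bridge": np.array([[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]),
}

def recognize_landscape_elements(frame_feats, ratio=0.6, tol=0.2):
    """Return the stored landscape elements recognized in one video frame.

    frame_feats: (n, j) array of second local features from the frame.
    """
    present = []
    for name, stored in local_feature_db.items():
        d = min(stored.shape[1], frame_feats.shape[1])  # smaller dimension count
        s, f = stored[:, :d], frame_feats[:, :d]
        # Count stored features that have a close correspondence in the frame.
        hits = sum(np.linalg.norm(f - v, axis=1).min() <= tol for v in s)
        if hits / len(s) >= ratio:   # predetermined proportion reached
            present.append(name)
    return present
```

Because each element is tested independently, several landscape elements can be recognized in the same frame, matching the behavior shown on the display screens of the later embodiments.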
[Second Embodiment]
Next, an information processing system according to a second embodiment of the present invention will be described. In this embodiment, landscape elements in a video are recognized by matching local features generated from a landscape video captured by a communication terminal against the local features stored in the local feature DB of a landscape element recognition server. The recognized landscape elements are then annotated with their names, related information, and/or link information, which are reported to the user.
According to this embodiment, names, related information, and/or link information can be reported in real time in association with landscape elements, including buildings, in the images of a video.
<Configuration of the Information Processing System>
FIG. 2 is a block diagram showing the configuration of the information processing system 200 according to this embodiment.
The information processing system 200 in FIG. 2 comprises, connected via a network 240, a communication terminal 210 having an imaging function, a landscape element recognition server 220 that recognizes landscape elements in the landscape captured by the communication terminal 210, and a related information providing server 230 that provides related information to the communication terminal 210.
The communication terminal 210 displays the captured landscape on its display unit. As on the display screen 211 in FIG. 2, the names of the landscape elements recognized by the landscape element recognition server 220, based on the local features generated by the local feature generation unit for the captured landscape, are superimposed on the display. As illustrated, the communication terminal 210 is representative of a plurality of communication terminals, including mobile phones with imaging functions and other communication terminals.
The landscape element recognition server 220 has a local feature DB 221 that stores landscape elements in association with local features, a related information DB 222 that stores related information for each landscape element, and a link information DB 223 that stores link information for each landscape element. From the landscape local features received from the communication terminal 210, the landscape element recognition server 220 returns the names of the landscape elements recognized by matching against the local features in the local feature DB 221. It also retrieves related information, such as an introduction, corresponding to a recognized landscape element from the related information DB 222 and returns it to the communication terminal 210, and retrieves link information to the related information providing server 230 corresponding to a recognized landscape element from the link information DB 223 and returns it to the communication terminal 210. The name of a landscape element, its related information, and its link information may each be provided separately, or several may be provided at once.
The related information providing server 230 has a related information DB 231 that stores related information corresponding to landscape elements. It is accessed via the link information provided for a landscape element recognized by the landscape element recognition server 220. It then retrieves the related information corresponding to the recognized landscape element from the related information DB 231 and returns it to the communication terminal 210 that transmitted the local features of the landscape. Although FIG. 2 shows a single related information providing server 230, in practice a number of related information providing servers 230 are connected as link destinations. In that case, either the landscape element recognition server 220 selects an appropriate link destination, or a plurality of link destinations are displayed on the communication terminal 210 for the user to choose from.
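The server-side lookup described above, in which a recognized landscape element's name keys into the related information DB 222 and the link information DB 223 to assemble the reply sent to the communication terminal 210, might be sketched as follows. All table contents, names, and the URL are hypothetical placeholders, not data from the patent.

```python
# Minimal sketch of the server-side lookup; DB contents are assumptions.
related_info_db = {"Tokyo Tower": "333 m broadcasting tower completed in 1958"}
link_info_db = {"Tokyo Tower": "http://example.com/tokyo-tower"}

def build_response(element_name):
    """Assemble the reply returned to the communication terminal for one
    recognized landscape element: its name, related info, and link info."""
    return {
        "name": element_name,
        "related_info": related_info_db.get(element_name),  # None if absent
        "link_info": link_info_db.get(element_name),        # None if absent
    }
```

Returning the three fields together reflects the remark that the name, related information, and link information may be provided separately or at once; a terminal can display whichever fields are non-empty.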
 Note that FIG. 2 illustrates an example in which a name is superimposed on a landscape element in the captured landscape. The display of the related information corresponding to a landscape element and of the link information for a landscape element will be described with reference to FIG. 3.
 (Example of a display screen of the communication terminal)
 FIG. 3 is a diagram showing an example of a display screen of the communication terminal 210 in the information processing system 200 according to the present embodiment.
 The upper part of FIG. 3 shows an example of a display screen that displays related information corresponding to a landscape element. The display screen 310 of FIG. 3 includes a captured landscape video 311 and operation buttons 312. A landscape element is recognized by matching the local features generated from the video in the upper-left figure against the local feature DB 221 of the landscape element recognition server 220. As a result, the display screen 320 in the upper-right figure displays a video 321 in which the landscape element name and related information are superimposed on the landscape video. At the same time, the related information may be output as audio through the speaker 322.
 The lower part of FIG. 3 shows an example of a display screen that displays link information corresponding to a landscape element. A landscape element is recognized by matching the local features generated from the video in the lower-left figure against the local feature DB 221 of the landscape element recognition server 220. As a result, the display screen 330 in the lower-right figure displays a video 331 in which the landscape element name and link information are superimposed on the landscape video. Although not shown, clicking the displayed link information accesses the linked related information providing server 230, and the related information retrieved from the related information DB 231 is displayed on the communication terminal 210 or output from it as audio.
 《Operation Procedure of the Information Processing System》
 The operation procedure of the information processing system 200 in the present embodiment is described below with reference to FIGS. 4 and 5. Although FIGS. 4 and 5 do not show an example of displaying only the recognized landscape element name, the landscape element name may simply be transmitted to the communication terminal 210 after landscape element recognition. Displaying the landscape element name, the related information, and the link information together can be realized by combining FIG. 4 and FIG. 5.
 (Operation procedure of related information notification)
 FIG. 4 is a sequence diagram showing the operation procedure of related information notification in the information processing system 200 according to the present embodiment.
 First, if necessary, in step S400, an application and/or data is downloaded from the landscape element recognition server 220 to the communication terminal 210. Then, in step S401, the application is activated and initialized to perform the processing of the present embodiment.
 In step S403, the communication terminal captures a landscape and acquires a video. In step S405, local features are generated from the landscape video. Subsequently, in step S407, the local features are encoded together with the feature point coordinates. In step S409, the encoded local features are transmitted from the communication terminal to the landscape element recognition server 220.
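Steps S405 to S409 generate the local features, encode them together with the feature point coordinates, and transmit them. Below is a minimal sketch of such an encoder and its server-side counterpart; the wire format (zlib-compressed JSON) is an assumption for illustration only, not the codec of the patent.

```python
import json
import zlib

def encode_local_features(features, keypoints):
    """Pack local feature vectors with their feature point coordinates
    (step S407) and compress the payload for transmission (step S409)."""
    payload = {"keypoints": [[float(x), float(y)] for x, y in keypoints],
               "features": [[float(v) for v in vec] for vec in features]}
    return zlib.compress(json.dumps(payload).encode("utf-8"))

def decode_local_features(blob):
    """Server-side counterpart (cf. the decoding unit 702a)."""
    payload = json.loads(zlib.decompress(blob).decode("utf-8"))
    return payload["features"], payload["keypoints"]
```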
 In step S411, the landscape element recognition server 220 recognizes the landscape element in the landscape by referring to the local feature DB 221, which holds local features generated and stored for images of landscape elements. Then, in step S413, it acquires related information by referring to the related information DB 222 for the recognized landscape element. In step S415, the landscape element name and the related information are transmitted from the landscape element recognition server 220 to the communication terminal 210.
 In step S417, the communication terminal 210 notifies the user of the received landscape element name and related information (see the upper part of FIG. 3). The landscape element name is displayed, and the related information is either displayed or output as audio.
 (Operation procedure of link information notification)
 FIG. 5 is a sequence diagram showing the operation procedure of link information notification in the information processing system 200 according to the present embodiment. Steps identical to those in FIG. 4 are given the same step numbers, and their description is omitted.
 In steps S400 and S401, although the application and data may differ, downloading, activation, and initialization are performed as in FIG. 4.
 The landscape element recognition server 220, having recognized the landscape element in the landscape from the local features of the video received from the communication terminal 210 in step S411, refers in step S513 to the link information DB 223 and acquires the link information corresponding to the recognized landscape element. In step S515, the landscape element name and the link information are transmitted from the landscape element recognition server 220 to the communication terminal 210.
 In step S517, the communication terminal 210 displays the received landscape element name and link information superimposed on the landscape video (see the lower part of FIG. 3). Then, in step S519, it waits for the user to select the link information. If the user selects a link destination, in step S521 the communication terminal accesses the linked related information providing server 230 with the landscape element ID.
 In step S523, the related information providing server 230 uses the received landscape element ID to acquire related information (including document data and audio data) from the related information DB 231. Then, in step S525, it returns the related information to the communication terminal 210 that made the access.
 In step S527, the communication terminal 210, having received the reply, displays the related information or outputs it as audio.
 《Functional Configuration of the Communication Terminal》
 FIG. 6 is a block diagram showing the functional configuration of the communication terminal 210 according to the present embodiment.
 In FIG. 6, the imaging unit 601 inputs a landscape video as a query image. The local feature generation unit 602 generates local features from the landscape video supplied by the imaging unit 601. The local feature transmission unit 603 encodes the generated local features together with the feature point coordinates using the encoding unit 603a, and transmits them to the landscape element recognition server 220 via the communication control unit 604.
 The landscape element recognition result receiving unit 605 receives the landscape element recognition result from the landscape element recognition server 220 via the communication control unit 604. The display screen generation unit 606 then generates a display screen for the received landscape element recognition result and presents it to the user.
 The related information receiving unit 607 receives related information via the communication control unit 604. The display screen generation unit 606 and the audio generation unit 608 then generate a display screen and audio data for the received related information and present them to the user. The related information received by the related information receiving unit 607 includes related information from the landscape element recognition server 220 or from the related information providing server 230.
 The link information receiving unit 609 receives link information from the landscape element recognition server 220 via the communication control unit 604. The display screen generation unit 606 then generates a display screen for the received link information and presents it to the user. The link destination access unit 610 accesses the linked related information providing server 230 when the link information is clicked via an operation unit (not shown).
 Note that instead of providing the landscape element recognition result receiving unit 605, the related information receiving unit 607, and the link information receiving unit 609 separately, a single information receiving unit that receives information via the communication control unit 604 may be provided.
 《Functional Configuration of the Landscape Element Recognition Server》
 FIG. 7 is a block diagram showing the functional configuration of the landscape element recognition server 220 according to the present embodiment.
 In FIG. 7, the local feature receiving unit 702 decodes, in the decoding unit 702a, the local features received from the communication terminal 210 via the communication control unit 701. The landscape element recognition unit 703 recognizes the landscape element by matching the received local features against the local features in the local feature DB 221, which stores the local features corresponding to each landscape element. The landscape element recognition result transmission unit 704 transmits the recognition result (the landscape element name) to the communication terminal 210.
 The related information acquisition unit 705 refers to the related information DB 222 and acquires the related information corresponding to the recognized landscape element. The related information transmission unit 706 transmits the acquired related information to the communication terminal 210. When the landscape element recognition server 220 transmits related information, it is desirable to transmit the landscape element recognition result and the related information as a single piece of transmission data, as in FIG. 4, because this reduces communication traffic.
 The link information acquisition unit 707 refers to the link information DB 223 and acquires the link information corresponding to the recognized landscape element. The link information transmission unit 708 transmits the acquired link information to the communication terminal 210. When link information is transmitted, it is likewise desirable for the landscape element recognition server 220 to transmit the landscape element recognition result and the link information as a single piece of transmission data, as in FIG. 5, because this reduces communication traffic.
 Naturally, when the landscape element recognition server 220 transmits the landscape element recognition result, the related information, and the link information, it is desirable to acquire all the information first and then transmit it as a single piece of transmission data, because this reduces communication traffic.
 The configuration of the related information providing server 230, which includes various linkable providers, is not described here.
 (Local feature DB)
 FIG. 8 is a diagram showing the configuration of the local feature DB 221 according to the present embodiment. Note that the configuration is not limited to this.
 The local feature DB 221 stores a first local feature 803, a second local feature 804, ..., and an m-th local feature 805 in association with a landscape element ID 801 and a name/direction 802. Each local feature is stored as a feature vector of up to 150 dimensions, hierarchized in steps of 25 dimensions corresponding to the 5×5 sub-regions (see FIG. 11F). Here, the direction indicates from which direction the landscape element was viewed when the local feature was obtained. To prevent landscape element recognition errors, it is desirable that local features from at least two directions, such as directions with little overlap or characteristic directions, be stored for the same landscape element.
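The record layout of FIG. 8, a landscape element ID and name/direction paired with up to m local features and their feature point coordinates, might be represented as follows. All field names here are illustrative, not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class LocalFeatureRecord:
    """One row of the local feature DB 221 (illustrative layout only)."""
    element_id: str                      # landscape element ID 801
    name: str                            # name part of field 802
    direction: str                       # viewing-direction part of field 802
    # m feature vectors (each up to 150-D, in 25-D steps) together with
    # the feature point coordinates used in the matching process
    features: List[List[float]] = field(default_factory=list)
    keypoints: List[Tuple[float, float]] = field(default_factory=list)

# To reduce recognition errors, the same element is stored from at least
# two viewing directions, e.g. one record per direction:
records = [
    LocalFeatureRecord("E001", "Tokyo Tower", "north"),
    LocalFeatureRecord("E001", "Tokyo Tower", "southwest"),
]
```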
 Note that m is a positive integer and may differ for each landscape element ID. In the present embodiment, the feature point coordinates used in the matching process are stored together with each local feature.
 (Related information DB)
 FIG. 9 is a diagram showing the configuration of the related information DB 222 according to the present embodiment. Note that the configuration is not limited to this.
 The related information DB 222 stores related display data 903 and related audio data 904, which constitute the related information, in association with a landscape element ID 901 and a landscape element name 902. The related information DB 222 may be provided integrally with the local feature DB 221.
 (Link information DB)
 FIG. 10 is a diagram showing the configuration of the link information DB 223 according to the present embodiment. Note that the configuration is not limited to this.
 The link information DB 223 stores link information, for example a URL (Uniform Resource Locator) 1003 and display data 1004 for the display screen, in association with a landscape element ID 1001 and a landscape element name 1002. The link information DB 223 may be provided integrally with the local feature DB 221 and the related information DB 222.
 The related information DB 231 of the related information providing server 230 is similar to the related information DB 222 of the landscape element recognition server 220, and its description is omitted to avoid duplication.
 《Local Feature Generation Unit》
 FIG. 11A is a block diagram showing the configuration of the local feature generation unit 602 according to the present embodiment.
 The local feature generation unit 602 includes a feature point detection unit 1111, a local region acquisition unit 1112, a sub-region division unit 1113, a sub-region feature vector generation unit 1114, and a dimension selection unit 1115.
 The feature point detection unit 1111 detects a large number of characteristic points (feature points) from the image data, and outputs the coordinate position, scale (size), and angle of each feature point.
 The local region acquisition unit 1112 acquires, from the coordinate value, scale, and angle of each detected feature point, the local region from which the feature is to be extracted.
 The sub-region division unit 1113 divides the local region into sub-regions. For example, it can divide the local region into 16 blocks (4×4 blocks) or into 25 blocks (5×5 blocks). The number of divisions is not limited. In the present embodiment, the case where the local region is divided into 25 blocks (5×5 blocks) is described below as representative.
 The sub-region feature vector generation unit 1114 generates a feature vector for each sub-region of the local region. As the feature vector of a sub-region, for example, a gradient orientation histogram can be used.
 The dimension selection unit 1115 selects (for example, thins out) the dimensions to be output as the local feature, based on the positional relationship of the sub-regions, so that the correlation between the feature vectors of neighboring sub-regions becomes low. The dimension selection unit 1115 can not only select dimensions but also determine a selection priority. That is, it can, for example, assign priorities and select dimensions so that dimensions of the same gradient orientation are not selected in adjacent sub-regions. The dimension selection unit 1115 then outputs a feature vector composed of the selected dimensions as the local feature. It can output the local feature with its dimensions rearranged according to the priority.
 《Processing of the Local Feature Generation Unit》
 FIGS. 11B to 11F are diagrams showing the processing of the local feature generation unit 602 according to the present embodiment.
 First, FIG. 11B shows the series of processing steps of feature point detection, local region acquisition, sub-region division, and feature vector generation in the local feature generation unit 602. For this series of processes, see U.S. Patent No. 6,711,293 and David G. Lowe, "Distinctive image features from scale-invariant key points" (USA), International Journal of Computer Vision, 60(2), 2004, pp. 91-110.
 (Feature point detection unit)
 The image 1121 in FIG. 11B shows a state in which feature points have been detected from an image in the video by the feature point detection unit 1111 of FIG. 11A. The generation of a local feature is described below using one feature point datum 1121a as a representative. The starting point of the arrow of the feature point datum 1121a indicates the coordinate position of the feature point, the length of the arrow indicates the scale (size), and the direction of the arrow indicates the angle. Here, for the scale (size) and direction, brightness, saturation, hue, and the like can be selected according to the target video. Although the example of FIG. 11B uses six directions at 60-degree intervals, the invention is not limited to this.
 (Local region acquisition unit)
 The local region acquisition unit 1112 of FIG. 11A generates, for example, a Gaussian window 1122a centered on the starting point of the feature point datum 1121a, and generates a local region 1122 that substantially contains this Gaussian window. In the example of FIG. 11B, the local region acquisition unit 1112 generates a square local region 1122, but the local region may be circular or have another shape. This local region is acquired for each feature point. If the local region is circular, robustness with respect to the imaging direction is improved.
 (Sub-region division unit)
 Next, a state is shown in which the sub-region division unit 1113 has divided the scale and angle of each pixel included in the local region 1122 of the feature point datum 1121a into sub-regions 1123. FIG. 11B shows an example of division into 5×5 = 25 sub-regions of 4×4 = 16 pixels each. However, the sub-regions may also be 4×4 = 16 in number, or have other shapes or numbers of divisions.
 (Sub-region feature vector generation unit)
 The sub-region feature vector generation unit 1114 quantizes the scale of each pixel in a sub-region by generating a histogram over six directions of angle, yielding the sub-region feature vector 1124. The directions are normalized with respect to the angle output by the feature point detection unit 1111. The sub-region feature vector generation unit 1114 then totals the frequencies of the six quantized directions for each sub-region and generates a histogram. In this case, it outputs, for each feature point, a feature vector composed of a histogram of 25 sub-region blocks × 6 directions = 150 dimensions. The gradient orientation need not be quantized into six directions; it may be quantized into an arbitrary number of directions, such as 4, 8, or 10. When the gradient orientation is quantized into D directions, if the gradient orientation before quantization is G (0 to 2π radians), the quantized value Qq (q = 0, ..., D−1) of the gradient orientation can be obtained, for example, by equation (1) or (2), but is not limited to these.
 Qq = floor(G × D / 2π)    ... (1)
 Qq = round(G × D / 2π) mod D    ... (2)
 Here, floor() is a function that truncates the fractional part, round() is a function that rounds to the nearest integer, and mod is an operation that obtains the remainder. When generating a gradient histogram, the sub-region feature vector generation unit 1114 may also sum the gradient magnitudes rather than simple frequencies. When totaling the gradient histogram, it may add weight values not only to the sub-region to which a pixel belongs but also to neighboring sub-regions (such as adjacent blocks) according to the distance between sub-regions. It may further add weight values to the gradient orientations before and after the quantized gradient orientation. Note that the feature vector of a sub-region is not limited to a gradient orientation histogram; it may be anything that has a plurality of dimensions (elements), such as color information. In the present embodiment, a gradient orientation histogram is used as the feature vector of a sub-region.
 (Dimension selection unit)
 Next, the processing of the dimension selection unit 1115 in the local feature generation unit 602 is described with reference to FIGS. 11C to 11F.
 The dimension selection unit 1115 selects (thins out) the dimensions (elements) to be output as the local feature, based on the positional relationship of the sub-regions, so that the correlation between the feature vectors of neighboring sub-regions becomes low. More specifically, it selects dimensions so that, for example, at least one gradient orientation differs between adjacent sub-regions. In the present embodiment, the dimension selection unit 1115 mainly uses adjacent sub-regions as the neighboring sub-regions, but the neighboring sub-regions are not limited to adjacent ones; for example, sub-regions within a predetermined distance of the target sub-region may also be treated as neighboring sub-regions.
 FIG. 11C shows an example of selecting dimensions from a feature vector 1131 of a 150-dimensional gradient histogram generated by dividing the local region into 5×5 sub-region blocks and quantizing the gradient orientations into six directions 1131a. In the example of FIG. 11C, dimensions are selected from a 150-dimensional (5×5 = 25 sub-region blocks × 6 directions) feature vector.
 (Dimension selection for the local region)
 FIG. 11C is a diagram showing how the number of dimensions of the feature vector is selected in the local feature generation unit 602.
 As shown in FIG. 11C, the dimension selection unit 1115 selects, from the feature vector 1131 of the 150-dimensional gradient histogram, the feature vector 1132 of a 75-dimensional gradient histogram, i.e., half the dimensions. In this case, the dimensions can be selected so that dimensions of the same gradient orientation are not selected in horizontally or vertically adjacent sub-region blocks.
 In this example, when the quantized gradient orientation in the gradient orientation histogram is q (q = 0, 1, 2, 3, 4, 5), blocks in which the elements q = 0, 2, 4 are selected and sub-region blocks in which the elements q = 1, 3, 5 are selected alternate. In the example of FIG. 11C, the gradient orientations selected in adjacent sub-region blocks together cover all six directions.
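The alternating selection just described, q = 0, 2, 4 on one set of blocks and q = 1, 3, 5 on the other, can be sketched as follows, under the assumption that the 150 dimensions are laid out block by block with six orientations per block.

```python
def select_75_dims(feature_150, blocks=5, d=6):
    """Halve a 150-D vector (5x5 blocks x 6 orientations) by keeping
    q = 0, 2, 4 on 'even' blocks and q = 1, 3, 5 on 'odd' blocks in a
    checkerboard pattern, so horizontally or vertically adjacent blocks
    never keep the same gradient orientation (cf. FIG. 11C)."""
    out = []
    for r in range(blocks):
        for c in range(blocks):
            base = (r * blocks + c) * d  # first dimension of this block
            start = (r + c) % 2          # alternate the kept orientations
            out.extend(feature_150[base + q] for q in range(start, d, 2))
    return out
```

Adjacent blocks then still cover all six orientations between them, which preserves orientation information while halving the dimensionality.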
 The dimension selection unit 1115 also selects the feature vector 1133 of a 50-dimensional gradient histogram from the feature vector 1132 of the 75-dimensional gradient histogram. In this case, the dimensions can be selected so that only one orientation is the same (and the remaining orientation differs) between sub-region blocks positioned diagonally at 45 degrees from each other.
 When selecting the feature vector 1134 of a 25-dimensional gradient histogram from the feature vector 1133 of the 50-dimensional gradient histogram, the dimension selection unit 1115 can select the dimensions so that the selected gradient orientations do not coincide between sub-region blocks positioned diagonally at 45 degrees from each other. In the example shown in FIG. 11C, the dimension selection unit 1115 selects one gradient orientation from each sub-region for dimensions 1 to 25, two gradient orientations for dimensions 26 to 50, and three gradient orientations for dimensions 51 to 75.
 Thus, it is desirable that the gradient directions do not overlap between adjacent sub-region blocks and that all gradient directions are selected evenly. At the same time, as in the example shown in FIG. 11C, it is desirable that dimensions are selected evenly from the entire local region. Note that the dimension selection method shown in FIG. 11C is only an example, and the selection method is not limited to it.
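As a rough illustration of the alternating selection described above, the following is a minimal sketch; the 5 × 5 × 6 array layout, the checkerboard block pattern, and the function name are assumptions for illustration, not the embodiment's exact implementation:

```python
import numpy as np

def select_75_dims(hist):
    """Reduce a 5x5x6 gradient-direction histogram (150 dims) to 75 dims
    by keeping q = 0, 2, 4 in one checkerboard colour of sub-region blocks
    and q = 1, 3, 5 in the other, so adjacent blocks never share a selected
    gradient direction while all six directions survive overall."""
    selected = []
    for row in range(5):
        for col in range(5):
            qs = (0, 2, 4) if (row + col) % 2 == 0 else (1, 3, 5)
            selected.extend(hist[row, col, q] for q in qs)
    return np.array(selected)

hist = np.arange(150, dtype=float).reshape(5, 5, 6)
print(select_75_dims(hist).shape)  # (75,)
```

Any two edge-adjacent blocks fall on different checkerboard colours, so their selected q-sets are disjoint, matching the non-overlap property described in the text.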
 (Local region priority)
 FIG. 11D is a diagram showing an example of the selection order of the feature vectors from the sub-regions in the local feature generation unit 602.
 The dimension selection unit 1115 can not only select dimensions but also determine a selection priority, so that dimensions are selected in descending order of their contribution to the features of the feature point. That is, the dimension selection unit 1115 can, for example, select dimensions with priorities such that dimensions of the same gradient direction are not selected in adjacent sub-region blocks. The dimension selection unit 1115 then outputs a feature vector composed of the selected dimensions as the local feature. Note that the dimension selection unit 1115 can output the local feature with its dimensions rearranged according to the priority.
 That is, within dimensions 1 to 25, 26 to 50, and 51 to 75, the dimension selection unit 1115 may add dimensions in the order of sub-region blocks shown in, for example, the matrix 1141 of FIG. 11D. When the priority shown in the matrix 1141 of FIG. 11D is used, the dimension selection unit 1115 can select gradient directions while giving higher priority to sub-region blocks closer to the center.
 The matrix 1151 of FIG. 11E shows an example of the element numbers of the 150-dimensional feature vector according to the selection order of FIG. 11D. In this example, the 5 × 5 = 25 blocks are numbered p (p = 0, 1, ..., 24) in raster-scan order, and with the quantized gradient direction denoted q (q = 0, 1, 2, 3, 4, 5), the element number of the feature vector is 6 × p + q.
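The numbering rule above can be written out directly (a trivial illustration of the formula; the function name is ours):

```python
def element_number(p, q):
    """Element number in the 150-dim feature vector for raster-scan
    block p (0..24) and quantized gradient direction q (0..5)."""
    return 6 * p + q

print(element_number(0, 5), element_number(24, 5))  # 5 149
```

The last block and last direction map to element 149, which confirms that the numbering exactly covers the 150 dimensions (0 to 149).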
 The matrix 1161 of FIG. 11F shows that the 150-dimensional order according to the selection order of FIG. 11E is layered in units of 25 dimensions. In other words, the matrix 1161 of FIG. 11F shows a configuration example of the local feature obtained by selecting the elements shown in FIG. 11E according to the priority shown in the matrix 1141 of FIG. 11D. The dimension selection unit 1115 can output the dimension elements in the order shown in FIG. 11F. Specifically, when outputting a 150-dimensional local feature, for example, the dimension selection unit 1115 can output all 150 elements in the order shown in FIG. 11F. When outputting a 25-dimensional local feature, for example, it can output the elements 1171 of the first row of FIG. 11F (the 76th, 45th, 83rd, ..., 120th elements) in the order shown in FIG. 11F (from left to right). When outputting a 50-dimensional local feature, for example, it can output, in addition to the first row, the elements 1172 of the second row of FIG. 11F in the order shown (from left to right).
 In the example shown in FIG. 11F, the local feature thus has a hierarchical structure. That is, for example, in both the 25-dimensional local feature and the 150-dimensional local feature, the arrangement of the elements 1171 to 1176 in the leading 25 dimensions is identical. By selecting dimensions hierarchically (progressively) in this way, the dimension selection unit 1115 can extract and output a local feature of an arbitrary number of dimensions, that is, of an arbitrary size, according to the application, the communication capacity, the terminal specifications, and the like. Further, because the dimension selection unit 1115 selects dimensions hierarchically and outputs them rearranged according to the priority, images can be matched using local features with different numbers of dimensions. For example, when images are matched using a 75-dimensional local feature and a 50-dimensional local feature, the distance between the local features can be calculated using only the leading 50 dimensions.
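This prefix-compatible matching can be sketched as follows, assuming the descriptors are stored as priority-ordered vectors (the function name and the use of Euclidean distance are illustrative assumptions):

```python
import numpy as np

def prefix_distance(desc_a, desc_b):
    """Compare two priority-ordered local features of possibly different
    lengths by truncating both to the shorter common prefix. This is valid
    only because dimensions are emitted in a fixed priority order, so the
    first k elements always denote the same k dimensions."""
    k = min(len(desc_a), len(desc_b))
    a = np.asarray(desc_a[:k], dtype=float)
    b = np.asarray(desc_b[:k], dtype=float)
    return float(np.linalg.norm(a - b))

d75 = np.ones(75)  # e.g. a 75-dim feature from one device
d50 = np.ones(50)  # e.g. a 50-dim feature from a lighter terminal
print(prefix_distance(d75, d50))  # 0.0 — only the leading 50 dims are compared
```

This is the design benefit of the hierarchical layout: terminals with different capacities can exchange descriptors of different sizes and still compute a meaningful distance on the shared prefix.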
 Note that the priorities shown in the matrix 1141 of FIG. 11D through FIG. 11F are examples, and the order in which dimensions are selected is not limited to them. For example, regarding the order of the blocks, the order shown in the matrix 1142 or the matrix 1143 of FIG. 11D may be used instead of that of the matrix 1141. Also, for example, the priority may be set so that dimensions are selected evenly from all the sub-regions. Alternatively, on the assumption that the vicinity of the center of the local region is important, the priority may be set so that sub-regions near the center are selected more frequently. The information indicating the dimension selection order may, for example, be defined in a program, or may be stored in a table or the like (a selection-order storage unit) that is referred to when the program is executed.
 The dimension selection unit 1115 may also select dimensions from every other sub-region block. That is, six dimensions are selected in one sub-region and zero dimensions are selected in the sub-regions adjacent to it. Even in such a case, it can be said that dimensions are selected for each sub-region so that the correlation between neighboring sub-regions is low.
 The shapes of the local region and the sub-regions are not limited to squares and may be arbitrary. For example, the local region acquisition unit 1112 may acquire a circular local region. In that case, the sub-region division unit 1113 can divide the circular local region into, for example, 9 or 17 concentric sub-regions. Even in this case, the dimension selection unit 1115 can select dimensions in each sub-region.
 As described above with reference to FIGS. 11B to 11F, according to the local feature generation unit 602 of the present embodiment, the dimensions of the generated feature vector are selected hierarchically while the information content of the local feature is maintained. This processing makes real-time landscape element recognition and display of the recognition result possible while maintaining recognition accuracy. Note that the configuration and processing of the local feature generation unit 602 are not limited to this example; other processing that enables real-time landscape element recognition and display of the recognition result while maintaining recognition accuracy can of course be applied.
 (Encoding unit)
 FIG. 11G is a block diagram showing the encoding unit 603a according to the present embodiment. Note that the encoding unit is not limited to this example; other encoding processes are also applicable.
 The encoding unit 603a has a coordinate value scanning unit 1181 that receives the coordinates of the feature points from the feature point detection unit 1111 of the local feature generation unit 602 and scans the coordinate values. The coordinate value scanning unit 1181 scans the image according to a particular scanning method and converts the two-dimensional coordinate values (X and Y coordinate values) of the feature points into one-dimensional index values. Each index value is the scanning distance from the origin along the scan. There is no restriction on the scanning direction.
 The encoding unit 603a also has a sorting unit 1182 that sorts the index values of the feature points and outputs information on the permutation after sorting. The sorting unit 1182 sorts, for example, in ascending order; it may also sort in descending order.
 It further has a difference calculation unit 1183 that calculates the difference between each pair of adjacent index values in the sorted index values and outputs the series of difference values.
 It then has a differential encoding unit 1184 that encodes the series of difference values in series order. The series of difference values may be encoded, for example, with a fixed bit length. When encoding with a fixed bit length, the bit length may be specified in advance; however, that requires the number of bits needed to represent the largest conceivable difference value, so the encoded size does not become small. Therefore, when encoding with a fixed bit length, the differential encoding unit 1184 can determine the bit length based on the input series of difference values. Specifically, for example, the differential encoding unit 1184 can obtain the maximum difference value from the input series of difference values, obtain the number of bits needed to represent that maximum value (the representation bit count), and encode the series of difference values with the obtained representation bit count.
 On the other hand, it has a local feature encoding unit 1185 that encodes the local features of the corresponding feature points in the same permutation as the sorted feature point index values. Encoding in the same permutation as the sorted index values makes it possible to associate the coordinate values encoded by the differential encoding unit 1184 with the corresponding local features one to one. In the present embodiment, the local feature encoding unit 1185 can encode the local feature whose dimensions were selected from the 150-dimensional local feature of one feature point using, for example, one byte per dimension, that is, as many bytes as there are dimensions.
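The coordinate-encoding steps above (conversion to one-dimensional indices, sorting, differencing, and deriving the representation bit count from the maximum difference) can be sketched roughly as follows; the raster-scan choice, the handling of the first index, and the function name are illustrative assumptions:

```python
def encode_feature_coordinates(points, width):
    """Convert 2-D feature-point coordinates to sorted 1-D raster-scan
    indices, take differences between adjacent sorted indices, and pick
    the smallest fixed bit length that represents the largest difference."""
    # 2-D (x, y) -> 1-D index along a raster scan (the direction is free).
    indices = sorted(y * width + x for x, y in points)
    # First index kept as-is, then differences of adjacent sorted indices.
    diffs = [indices[0]] + [b - a for a, b in zip(indices, indices[1:])]
    # Representation bit count: just enough bits for the maximum value.
    bits = max(diffs).bit_length() or 1
    return diffs, bits

diffs, bits = encode_feature_coordinates([(3, 0), (0, 0), (1, 2)], width=4)
print(diffs, bits)  # [0, 3, 6] 3
```

Because the differences between sorted indices are much smaller than the raw indices, the derived bit count (3 here, versus 4 bits for the largest raw index 9) is what shrinks the encoded size.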
 (Landscape element recognition unit)
 FIG. 11 is a diagram illustrating the processing of the landscape element recognition unit 703 according to the present embodiment.
 FIG. 11H shows how the local features generated from the landscape video 311 on the landscape display screen 310, captured by the communication terminal 210 in FIG. 3, are collated with the local features stored in advance in the local feature DB 221.
 From the video 311 captured by the communication terminal 210, shown on the left of FIG. 11H, local features are generated according to the present embodiment. It is then checked whether the local features 1191 to 1194 stored in the local feature DB 221 for each landscape element are present among the local features generated from the video 311.
 As shown in FIG. 11H, the landscape element recognition unit 703 associates, as indicated by the thin lines, each pair of feature points whose local features match between the local feature DB 221 and the video. The landscape element recognition unit 703 regards feature points as matching when a predetermined proportion or more of their local features coincide. Then, if the positional relationship between the sets of associated feature points is a linear relationship, the landscape element recognition unit 703 recognizes the target landscape element. With this kind of recognition, recognition is possible despite differences in size and orientation (differences in viewpoint), or even inversion. Moreover, since sufficient recognition accuracy is obtained as long as there is at least a predetermined number of associated feature points, a landscape element can be recognized even when part of it is hidden from view.
 In FIG. 11H, four different landscape elements in the landscape that match the local features 1191 to 1194 of the four landscape elements in the local feature DB 221 are recognized with a precision corresponding to the accuracy of the local features.
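One possible reading of the "predetermined proportion or more coincide" rule is a per-dimension agreement test between two descriptors. The sketch below assumes that reading; the ratio threshold, the per-dimension tolerance, and the function name are all hypothetical, not values given in the embodiment:

```python
import numpy as np

def features_match(desc_a, desc_b, ratio=0.8, tol=8):
    """Declare two feature points a match when at least `ratio` of their
    descriptor dimensions agree within a per-dimension tolerance `tol`.
    Both thresholds are illustrative assumptions."""
    a = np.asarray(desc_a, dtype=float)
    b = np.asarray(desc_b, dtype=float)
    agree = np.abs(a - b) <= tol
    return bool(agree.mean() >= ratio)

a = np.zeros(25)
b = np.zeros(25)
b[:3] = 100  # 22 of 25 dims (88%) still agree -> counts as a match
print(features_match(a, b))  # True
```

A recogniser along these lines would then collect such point-level matches and check that their positions relate linearly (e.g. via a geometric fit) before declaring the landscape element recognised, as the text describes.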
 << Hardware configuration of the communication terminal >>
 FIG. 12A is a block diagram showing the hardware configuration of the communication terminal 210 according to the present embodiment.
 In FIG. 12A, the CPU 1210 is a processor for arithmetic control, and implements each functional component of the communication terminal 210 by executing programs. The ROM 1220 stores fixed data and programs such as initial data and programs. The communication control unit 604 communicates in the present embodiment with the landscape element recognition server 220 and the related information providing server 230 via the network. Note that the CPU 1210 is not limited to a single CPU; there may be a plurality of CPUs, and a GPU (Graphics Processing Unit) for image processing may be included.
 The RAM 1240 is a random access memory that the CPU 1210 uses as a work area for temporary storage. In the RAM 1240, an area for storing the data necessary for realizing the present embodiment is reserved. The input video 1241 is the input video captured and input by the imaging unit 601. The feature point data 1242 is the feature point data, including the feature point coordinates, scale, and angle, detected from the input video 1241. The local feature generation table 1243 is a table that holds the data used until the local features are generated (see FIG. 12B). The local features 1244 are the local features generated using the local feature generation table 1243 and sent to the landscape element recognition server 220 via the communication control unit 604. The landscape element recognition result 1245 is the landscape element recognition result returned from the landscape element recognition server 220 via the communication control unit 604. The related information / link information 1246 is the related information and link information returned from the landscape element recognition server 220, or the related information returned from the related information providing server 230. The display screen data 1247 is the display screen data for notifying the user of information including the landscape element recognition result 1245 and the related information / link information 1246; when audio is output, audio data may be included. The input/output data 1248 is the input/output data exchanged via the input/output interface 1260. The transmission/reception data 1249 is the data transmitted and received via the communication control unit 604.
 The storage 1250 stores databases, various parameters, and the following data and programs necessary for realizing the present embodiment. The display format 1251 is the display format for displaying information including the landscape element recognition result 1245 and the related information / link information 1246.
 The storage 1250 stores the following programs. The communication terminal control program 1252 is the program that controls the entire communication terminal 210 and includes the following modules. The local feature generation module 1253 generates, within the communication terminal control program 1252, local features from the input video in accordance with FIGS. 11B to 11F. The local feature generation module 1253 is composed of the illustrated group of modules, the details of which are omitted here. The encoding module 1254 encodes the local features generated by the local feature generation module 1253 for transmission. The information reception notification module 1255 receives the landscape element recognition result 1245 and the related information / link information 1246 and notifies the user by display or by voice. The link destination access module 1256 accesses a link destination based on a user instruction regarding the received and reported link information.
 The input/output interface 1260 interfaces input/output data with the input/output devices. Connected to the input/output interface 1260 are the display unit 1261, the operation unit 1262 such as a touch panel or keyboard, the speaker 1263, the microphone 1264, and the imaging unit 601. The input/output devices are not limited to these examples. In addition, a GPS (Global Positioning System) position generation unit 1265 is mounted and acquires the current position based on signals from GPS satellites.
 Note that FIG. 12A shows only the data and programs essential to the present embodiment; data and programs not related to the present embodiment are not shown.
 (Local feature generation table)
 FIG. 12B is a diagram showing the local feature generation table 1243 in the communication terminal 210 according to the present embodiment.
 In the local feature generation table 1243, a plurality of detected feature points 1202, feature point coordinates 1203, and local region information 1204 corresponding to each feature point are stored in association with the input image ID 1201. Then, in association with each detected feature point 1202, feature point coordinates 1203, and local region information 1204, a plurality of sub-region IDs 1205, sub-region information 1206, the feature vector 1207 corresponding to each sub-region, and the selected dimensions 1208 including the priority are stored.
 From the above data, a local feature 1209 is generated for each detected feature point 1202. The data obtained by collecting these together with the feature point coordinates constitute the local features 1244 generated from the captured landscape and transmitted to the landscape element recognition server 220.
 << Processing procedure of the communication terminal >>
 FIG. 13 is a flowchart showing the processing procedure of the communication terminal 210 according to the present embodiment. This flowchart is executed by the CPU 1210 of FIG. 12A using the RAM 1240, and implements each functional component of FIG. 6.
 First, in step S1311, it is determined whether there has been a video input for recognizing landscape elements. In step S1321, data reception is determined. In step S1331, it is determined whether the user has designated a link destination. If none of these applies, other processing is performed in step S1341. A description of normal transmission processing is omitted.
 If there is a video input, the process proceeds to step S1313, where local feature generation processing is executed on the input video (see FIG. 14A). Next, in step S1315, the local features and the feature point coordinates are encoded (see FIGS. 14B and 14C). In step S1317, the encoded data is transmitted to the landscape element recognition server 220.
 In the case of data reception, the process proceeds to step S1323, where it is determined whether the reception is of a landscape element recognition result and related information from the landscape element recognition server 220, or of related information from the related information providing server 230. If the reception is from the landscape element recognition server 220, the process proceeds to step S1325, where the received landscape element recognition result, related information, and link information are reported by display or by audio output. If the reception is from the related information providing server 230, the process proceeds to step S1327, where the received related information is reported by display or by audio output.
 (Local feature generation processing)
 FIG. 14A is a flowchart showing the processing procedure of the local feature generation processing S1313 according to the present embodiment.
 First, in step S1411, the position coordinates, scales, and angles of the feature points are detected from the input video. In step S1413, a local region is acquired for one of the feature points detected in step S1411. Next, in step S1415, the local region is divided into sub-regions. In step S1417, a feature vector is generated for each sub-region to generate the feature vector of the local region. The processing of steps S1411 to S1417 is illustrated in FIG. 11B.
 Next, in step S1419, dimension selection is performed on the feature vector of the local region generated in step S1417. Dimension selection is illustrated in FIGS. 11D to 11F.
 In step S1421, it is determined whether local feature generation and dimension selection have been completed for all the feature points detected in step S1411. If not, the process returns to step S1413 and is repeated for the next feature point.
 (Encoding processing)
 FIG. 14B is a flowchart showing the processing procedure of the encoding processing S1315 according to the present embodiment.
 First, in step S1431, the coordinate values of the feature points are scanned in the desired order. Next, in step S1433, the scanned coordinate values are sorted. In step S1435, the differences between the coordinate values are calculated in the sorted order. In step S1437, the difference values are encoded (see FIG. 14C). Then, in step S1439, the local features are encoded in the coordinate-value sort order. Note that the encoding of the difference values and the encoding of the local features may be performed in parallel.
 (Difference value encoding processing)
 FIG. 14C is a flowchart showing the processing procedure of the difference value encoding processing S1437 according to the present embodiment.
 First, in step S1441, it is determined whether the difference value is within the range that can be encoded. If it is within the encodable range, the process proceeds to step S1447, where the difference value is encoded, and then moves to step S1449. If it is not within the encodable range (out of range), the process proceeds to step S1443, where an escape code is encoded; then, in step S1445, the difference value is encoded by an encoding method different from that of step S1447, and the process moves to step S1449. In step S1449, it is determined whether the processed difference value is the last element in the series of difference values. If it is the last, the processing ends; if not, the process returns to step S1441 and is executed for the next difference value in the series.
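The escape-code branch of this flowchart could be sketched as follows. This is a hedged illustration: the 4-bit fixed width, the all-ones escape pattern, and the 16-bit fallback width are assumptions, not values specified in the embodiment:

```python
def encode_diffs_with_escape(diffs, bits=4, fallback_bits=16):
    """Encode each difference with a fixed `bits`-width code; values that
    do not fit are replaced by an all-ones escape code followed by the
    value in a wider `fallback_bits` field (cf. steps S1441 to S1449)."""
    escape = (1 << bits) - 1          # reserve the all-ones bit pattern
    out = []
    for d in diffs:
        if 0 <= d < escape:                               # S1441: in range?
            out.append(format(d, f"0{bits}b"))            # S1447: encode
        else:
            out.append(format(escape, f"0{bits}b"))       # S1443: escape code
            out.append(format(d, f"0{fallback_bits}b"))   # S1445: wider code
    return "".join(out)

print(encode_diffs_with_escape([3, 20, 1]))
# segments: '0011' + '1111' (escape) + '0000000000010100' (20) + '0001'
```

Reserving the all-ones pattern as an escape keeps the common small differences at the narrow width while still allowing occasional large differences, which matches the flowchart's two encoding paths.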
 << Hardware configuration of the landscape element recognition server >>
 FIG. 15 is a block diagram showing the hardware configuration of the landscape element recognition server 220 according to the present embodiment.
 In FIG. 15, the CPU 1510 is a processor for arithmetic control, and implements each functional component of the landscape element recognition server 220 of FIG. 7 by executing programs. The ROM 1520 stores fixed data and programs such as initial data and programs. The communication control unit 701 communicates in the present embodiment with the communication terminal 210 and the related information providing server 230 via the network. Note that the CPU 1510 is not limited to a single CPU; there may be a plurality of CPUs, and a GPU for image processing may be included.
 The RAM 1540 is a random access memory that the CPU 1510 uses as a work area for temporary storage. In the RAM 1540, an area for storing the data necessary for realizing the present embodiment is reserved. The received local features 1541 are the local features, including the feature point coordinates, received from the communication terminal 210. The read local features 1542 are the local features, including the feature point coordinates, read from the local feature DB 221. The landscape element recognition result 1543 is the landscape element recognition result obtained by collating the received local features with the local features stored in the local feature DB 221. The related information 1544 is the related information retrieved from the related information DB 222 corresponding to the landscape element of the landscape element recognition result 1543. The link information 1545 is the link information retrieved from the link information DB 223 corresponding to the landscape element of the landscape element recognition result 1543. The transmission/reception data 1546 is the data transmitted and received via the communication control unit 701.
 ストレージ1550には、データベースや各種のパラメータ、あるいは本実施形態の実現に必要な以下のデータまたはプログラムが記憶されている。局所特徴量DB221は、図8に示したと同様の局所特徴量DBを示す。関連情報DB222は、図9に示したと同様の関連情報DBを示す。リンク情報DB223は、図10に示したと同様のリンク情報DBを示す。 The storage 1550 stores a database, various parameters, and the following data and programs necessary for realizing the present embodiment. The local feature DB 221 is a local feature DB similar to that shown in FIG. 8. The related information DB 222 is a related information DB similar to that shown in FIG. 9. The link information DB 223 is a link information DB similar to that shown in FIG. 10.
 ストレージ1550には、以下のプログラムが格納される。景観要素認識サーバ制御プログラム1551は、本景観要素認識サーバ220の全体を制御する景観要素認識サーバ制御プログラムを示す。局所特徴量DB作成モジュール1552は、景観要素認識サーバ制御プログラム1551において、景観要素の画像から局所特徴量を生成して局所特徴量DB221に格納する。景観要素認識モジュール1553は、景観要素認識サーバ制御プログラム1551において、受信した局所特徴量と局所特徴量DB221に格納された局所特徴量とを照合して景観要素を認識する。関連情報/リンク情報取得モジュール1554は、認識した景観要素に対応して関連情報DB222やリンク情報DB223から関連情報やリンク情報を取得する。 The storage 1550 stores the following programs. The landscape element recognition server control program 1551 indicates a landscape element recognition server control program that controls the entire landscape element recognition server 220. In the landscape element recognition server control program 1551, the local feature DB creation module 1552 generates a local feature from a landscape element image and stores it in the local feature DB 221. In the landscape element recognition server control program 1551, the landscape element recognition module 1553 recognizes a landscape element by comparing the received local feature quantity with the local feature quantity stored in the local feature quantity DB 221. The related information / link information acquisition module 1554 acquires related information and link information from the related information DB 222 and the link information DB 223 corresponding to the recognized landscape element.
 なお、図15には、本実施形態に必須なデータやプログラムのみが示されており、本実施形態に関連しないデータやプログラムは図示されていない。 Note that FIG. 15 shows only data and programs essential to the present embodiment, and does not illustrate data and programs not related to the present embodiment.
 《景観要素認識サーバの処理手順》
 図16は、本実施形態に係る景観要素認識サーバ220の処理手順を示すフローチャートである。このフローチャートは、図15のCPU1510によりRAM1540を使用して実行され、図7の景観要素認識サーバ220の各機能構成部を実現する。
<< Processing procedure of landscape element recognition server >>
FIG. 16 is a flowchart showing a processing procedure of the landscape element recognition server 220 according to the present embodiment. This flowchart is executed by the CPU 1510 of FIG. 15 using the RAM 1540, and implements each functional component of the landscape element recognition server 220 of FIG.
 まず、ステップS1611において、局所特徴量DBの生成か否かを判定する。また、ステップS1621において、通信端末からの局所特徴量受信かを判定する。いずれでもなければ、ステップS1641において他の処理を行なう。 First, in step S1611, it is determined whether the request is for generation of the local feature DB. In step S1621, it is determined whether local features have been received from the communication terminal. If neither, other processing is performed in step S1641.
 局所特徴量DBの生成であればステップS1613に進んで、局所特徴量DB生成処理を実行する(図17参照)。また、局所特徴量の受信であればステップS1623に進んで、景観要素認識処理を行なう(図18Aおよび図18B参照)。 If the local feature DB is generated, the process advances to step S1613 to execute a local feature DB generation process (see FIG. 17). If a local feature is received, the process advances to step S1623 to perform landscape element recognition processing (see FIGS. 18A and 18B).
 次に、ステップS1625において、認識した景観要素に対応する関連情報やリンク情報を取得する。そして、認識した景観要素名、関連情報、リンク情報を通信端末210に送信する。 Next, in step S1625, related information and link information corresponding to the recognized landscape element are acquired. Then, the recognized landscape element name, related information, and link information are transmitted to the communication terminal 210.
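The dispatch of steps S1611 through S1625 can be sketched, purely for illustration, as the following Python fragment; the request format, the in-memory DB, and the toy set-intersection matching are hypothetical stand-ins, not part of the disclosed embodiment.

```python
# Hypothetical sketch of the branches of Fig. 16: the server either builds
# the local-feature DB (S1611 -> S1613) or matches received local features
# against it (S1621 -> S1623) and returns the recognition result (S1625).

def handle_request(request, db):
    """Dispatch one request against a mutable in-memory feature DB."""
    if request["type"] == "build_db":                       # S1611 -> S1613
        for element, features in request["images"].items():
            db[element] = features                          # register features
        return {"status": "db_updated"}
    if request["type"] == "local_features":                 # S1621 -> S1623
        recognized = [name for name, feats in db.items()
                      if feats & request["features"]]       # toy matching
        return {"recognized": recognized}                   # S1625: reply
    return {"status": "other"}                              # S1641
```

For example, after registering one landscape element, a request carrying overlapping features would return that element as recognized.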
 (局所特徴量DB生成処理)
 図17は、本実施形態に係る局所特徴量DB生成処理S1613の処理手順を示すフローチャートである。
(Local feature DB generation processing)
FIG. 17 is a flowchart showing a processing procedure of local feature DB generation processing S1613 according to the present embodiment.
 まず、ステップS1701において、景観要素の画像を取得する。ステップS1703においては、特徴点の位置座標、スケール、角度を検出する。ステップS1705において、ステップS1703で検出された特徴点の1つに対して局所領域を取得する。次に、ステップS1707において、局所領域をサブ領域に分割する。ステップS1709においては、各サブ領域の特徴ベクトルを生成して局所領域の特徴ベクトルを生成する。ステップS1705からS1709の処理は図11Bに図示されている。 First, in step S1701, an image of a landscape element is acquired. In step S1703, the position coordinates, scale, and angle of the feature points are detected. In step S1705, a local region is acquired for one of the feature points detected in step S1703. Next, in step S1707, the local area is divided into sub-areas. In step S1709, a feature vector for each sub-region is generated to generate a local region feature vector. The processing from step S1705 to S1709 is illustrated in FIG. 11B.
 次に、ステップS1711において、ステップS1709において生成された局所領域の特徴ベクトルに対して次元選定を実行する。次元選定については、図11D~図11Fに図示されている。しかしながら、局所特徴量DB221の生成においては、次元選定における階層化を実行するが、生成された全ての特徴ベクトルを格納するのが望ましい。 Next, in step S1711, dimension selection is performed on the feature vectors of the local regions generated in step S1709. The dimension selection is illustrated in FIGS. 11D to 11F. In generating the local feature DB 221, however, although hierarchization in the dimension selection is performed, it is desirable to store all the generated feature vectors.
 ステップS1713においては、ステップS1703で検出した全特徴点について局所特徴量の生成と次元選定とが終了したかを判定する。終了していない場合はステップS1705に戻って、次の1つの特徴点について処理を繰り返す。全特徴点について終了した場合はステップS1715に進んで、景観要素に対応付けて局所特徴量と特徴点座標とを局所特徴量DB221に登録する。 In step S1713, it is determined whether generation of local feature values and dimension selection have been completed for all feature points detected in step S1703. If not completed, the process returns to step S1705 to repeat the process for the next one feature point. When all the feature points are completed, the process proceeds to step S1715, and the local feature amount and the feature point coordinates are registered in the local feature amount DB 221 in association with the landscape element.
 ステップS1717においては、他の景観要素の画像があるか否かを判定する。他の景観要素の画像があればステップS1701に戻って、他の景観要素の画像を取得して処理を繰り返す。 In step S1717, it is determined whether there is an image of another landscape element. If there is an image of another landscape element, the process returns to step S1701 to acquire an image of another landscape element and repeat the process.
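The DB generation loop of FIG. 17 (steps S1701 to S1717) can be sketched as follows; the detector, descriptor, and dimension-selection callbacks are hypothetical stand-ins for the processing of FIGS. 11B and 11D to 11F.

```python
# Illustrative sketch of local-feature DB generation: for each landscape
# element image, detect feature points, describe each point's local region,
# select dimensions, and register coordinates with the selected feature.

def build_local_feature_db(images, detect_points, describe, select_dims):
    db = {}
    for element, image in images.items():       # S1701, repeated via S1717
        entries = []
        for point in detect_points(image):      # S1703: feature point detection
            vector = describe(image, point)     # S1705-S1709: local-region descriptor
            vector = select_dims(vector)        # S1711: dimension selection
            entries.append((point, vector))     # keep coordinates with the feature
        db[element] = entries                   # S1715: register per element
    return db
```

The callbacks make the sketch independent of any particular detector; a real system would plug in the feature point detection and descriptor generation described above.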
 (景観要素認識処理)
 図18Aは、本実施形態に係る景観要素認識処理S1623の処理手順を示すフローチャートである。
(Landscape element recognition processing)
FIG. 18A is a flowchart showing a processing procedure of landscape element recognition processing S1623 according to the present embodiment.
 まず、ステップS1811において、局所特徴量DB221から1つの景観要素の局所特徴量を取得する。そして、ステップS1813において、景観要素の局所特徴量と通信端末210から受信した局所特徴量との照合を行なう(図18B参照)。 First, in step S1811, the local features of one landscape element are acquired from the local feature DB 221. Then, in step S1813, the local features of the landscape element are collated with the local features received from the communication terminal 210 (see FIG. 18B).
 ステップS1815において、合致したか否かを判定する。合致していればステップS1821に進んで、合致した景観要素を、通信端末210が撮像した景観の映像中にあるとして記憶する。 In step S1815, it is determined whether they match. If they match, the process proceeds to step S1821, and the matched landscape element is stored as being present in the landscape video captured by the communication terminal 210.
 ステップS1817においては、局所特徴量DB221に登録されている全景観要素を照合したかを判定し、残りがあればステップS1811に戻って次の景観要素の照合を繰り返す。なお、かかる照合においては、処理速度の向上によるリアルタイム処理あるいは景観要素認識サーバの負荷低減のため、あらかじめ分野の限定を行なってもよい。 In step S1817, it is determined whether all the landscape elements registered in the local feature DB 221 have been collated; if any remain, the process returns to step S1811 to repeat the collation for the next landscape element. In this collation, the field may be limited in advance in order to achieve real-time processing through improved processing speed, or to reduce the load on the landscape element recognition server.
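The outer loop of FIG. 18A can be sketched as follows; the `matches` predicate is a hypothetical stand-in for the detailed collation of FIG. 18B, and all names are illustrative.

```python
# Illustrative sketch of Fig. 18A: each landscape element registered in the
# local-feature DB is collated in turn against the received features, and
# matching elements are recorded as present in the captured scene.

def recognize_elements(db, received, matches):
    present = []
    for element, stored in db.items():   # S1811 / S1817: loop over all elements
        if matches(stored, received):    # S1813: collation
            present.append(element)      # S1821: record matched element
    return present
```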
 (照合処理)
 図18Bは、本実施形態に係る照合処理S1813の処理手順を示すフローチャートである。
 まず、ステップS1831において、初期化として、パラメータp=1,q=0を設定する。次に、ステップS1833において、局所特徴量DB221の局所特徴量の次元数iと、受信した局所特徴量の次元数jとの、より少ない次元数を選択する。
(Verification process)
FIG. 18B is a flowchart showing a processing procedure of collation processing S1813 according to the present embodiment.
First, in step S1831, parameters p = 1 and q = 0 are set as initialization. Next, in step S1833, a smaller number of dimensions is selected between the dimension number i of the local feature quantity in the local feature quantity DB 221 and the dimension number j of the received local feature quantity.
 ステップS1835~S1845のループにおいて、p>m(m=景観要素の特徴点数)となるまで各局所特徴量の照合を繰り返す。まず、ステップS1835において、局所特徴量DB221に格納された景観要素の第p番局所特徴量の選択された次元数のデータを取得する。すなわち、最初の1次元から選択された次元数を取得する。次に、ステップS1837において、ステップS1835において取得した第p番局所特徴量と入力映像から生成した全特徴点の局所特徴量を順に照合して、類似か否かを判定する。ステップS1839においては、局所特徴量間の照合の結果から類似度が閾値αを超えるか否かを判断し、超える場合はステップS1841において、局所特徴量と、入力映像と景観要素とにおける合致した特徴点の位置関係との組みを記憶する。そして、合致した特徴点数のパラメータであるqを1つカウントアップする。ステップS1843においては、景観要素の特徴点を次の特徴点に進め(p←p+1)、景観要素の全特徴点の照合が終わってない場合には(p≦m)、ステップS1835に戻って合致する局所特徴量の照合を繰り返す。なお、閾値αは、景観要素によって求められる認識精度に対応して変更可能である。ここで、他の景観要素との相関が低い景観要素であれば認識精度を低くしても、正確な認識が可能である。 In the loop of steps S1835 to S1845, the collation of each local feature is repeated until p > m (m = the number of feature points of the landscape element). First, in step S1835, the data of the selected number of dimensions of the p-th local feature of the landscape element stored in the local feature DB 221 is acquired; that is, the first through the selected number of dimensions are acquired. Next, in step S1837, the p-th local feature acquired in step S1835 is collated in turn against the local features of all the feature points generated from the input video to determine whether they are similar. In step S1839, it is determined from the result of the collation between the local features whether the similarity exceeds the threshold value α. If it does, in step S1841, the local feature and the positional relationship of the matched feature points in the input video and the landscape element are stored as a pair, and q, the counter of matched feature points, is incremented by one. In step S1843, the feature point of the landscape element is advanced to the next feature point (p ← p + 1), and if the collation of all the feature points of the landscape element has not finished (p ≤ m), the process returns to step S1835 to repeat the collation for matching local features.
The threshold value α can be changed according to the recognition accuracy required by the landscape element. Here, if a landscape element has a low correlation with other landscape elements, accurate recognition is possible even if the recognition accuracy is lowered.
 景観要素の全特徴点との照合が終了すると、ステップS1845からS1847に進んで、ステップS1847~S1853において、景観要素が入力映像に存在するか否かが判定される。まず、ステップS1847において、景観要素の特徴点数pの内で入力映像の特徴点の局所特徴量と合致した特徴点数qの割合が、閾値βを超えたか否かを判定する。超えていればステップS1849に進んで、景観要素候補として、さらに、入力映像の特徴点と景観要素の特徴点との位置関係が、線形変換が可能な関係を有しているかを判定する。すなわち、ステップS1841において局所特徴量が合致したとして記憶した、入力映像の特徴点と景観要素の特徴点との位置関係が、回転や反転、視点の位置変更などの変化によっても可能な位置関係なのか、変更不可能な位置関係なのかを判定する。かかる判定方法は幾何学的に既知であるので、詳細な説明は省略する。ステップS1851において、線形変換可能か否かの判定結果により、線形変換可能であればステップS1853に進んで、照合した景観要素が入力映像に存在すると判定する。 When the collation with all the feature points of the landscape element has finished, the process proceeds from step S1845 to S1847, and in steps S1847 to S1853, it is determined whether the landscape element exists in the input video. First, in step S1847, it is determined whether the ratio of the number q of feature points that matched local features of feature points of the input video, to the number p of feature points of the landscape element, exceeds the threshold value β. If it does, the process proceeds to step S1849, where it is further determined, for the landscape element candidate, whether the positional relationship between the feature points of the input video and the feature points of the landscape element is one that admits a linear transformation. That is, it is determined whether the positional relationship between the feature points of the input video and the feature points of the landscape element, stored in step S1841 as having matching local features, is a positional relationship attainable through changes such as rotation, inversion, or a change of viewpoint position, or a positional relationship that cannot be so attained. Since such a determination method is geometrically known, a detailed description thereof is omitted.
If it is determined in step S1851 that the linear transformation is possible, the process proceeds to step S1853, where it is determined that the collated landscape element exists in the input video. Note that the threshold value β can be changed in accordance with the recognition accuracy required for the landscape element. Here, for a landscape element that has a low correlation with other landscape elements, or whose features can be determined even from a part, accurate recognition is possible even with few matched feature points. That is, a landscape element can be recognized even if part of it is hidden from view, as long as a characteristic part is visible.
 ステップS1855においては、局所特徴量DB221に未照合の景観要素が残っているか否かを判定する。まだ景観要素が残っていれば、ステップS1857において次の景観要素を設定して、パラメータp=1,q=0に初期化し、ステップS1835に戻って照合を繰り返す。 In step S1855, it is determined whether any uncollated landscape elements remain in the local feature DB 221. If landscape elements still remain, the next landscape element is set in step S1857, the parameters are initialized to p = 1 and q = 0, and the process returns to step S1835 to repeat the collation.
 なお、かかる照合処理の説明からも明らかなように、あらゆる景観要素を局所特徴量DB221に記憶して、全景観要素を照合する処理は、負荷が非常に大きくなる。したがって、例えば、入力映像から景観要素を認識する前にユーザが景観要素の範囲をメニューから選択して、その範囲を局所特徴量DB221から検索して照合することが考えられる。また、局所特徴量DB221にユーザが使用する範囲の局所特徴量のみを記憶することによっても、負荷を軽減できる。 As is clear from the description of the matching process, the process of storing all the landscape elements in the local feature DB 221 and collating all the landscape elements has a very large load. Therefore, for example, before recognizing a landscape element from an input video, it is conceivable that the user selects a landscape element range from a menu, searches the range from the local feature DB 221 and collates the range. Also, the load can be reduced by storing only the local feature amount in the range used by the user in the local feature amount DB 221.
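The collation of FIG. 18B can be sketched under simplifying assumptions: a cosine similarity stands in for the unspecified similarity measure, the geometric (linear transformation) check of steps S1849 to S1851 is omitted, and the parameter names α and β follow the description above. All identifiers are illustrative.

```python
# Illustrative sketch of the matching loop: count element feature points
# whose similarity to some input-video feature exceeds alpha (q), then
# decide presence by the matched ratio against beta (step S1847).

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def match_element(element_feats, video_feats, alpha=0.9, beta=0.5):
    # S1833: use the smaller number of dimensions on both sides
    dims = min(min(len(v) for _, v in element_feats),
               min(len(v) for _, v in video_feats))
    q = 0                                       # matched-point counter (S1841)
    for _, ev in element_feats:                 # S1835-S1845 loop over points
        if any(cosine(ev[:dims], vv[:dims]) > alpha for _, vv in video_feats):
            q += 1
    m = len(element_feats)                      # total feature points
    return q / m > beta                         # S1847: presence decision
```

Lowering `alpha` or `beta` corresponds to the relaxed accuracy discussed above for landscape elements with low correlation to others.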
 [第3実施形態]
 次に、本発明の第3実施形態に係る情報処理システムについて説明する。本実施形態に係る情報処理システムは、上記第2実施形態と比べると、ユーザがリンク先アクセス操作をしなくても、自動的にリンク先から関連情報をアクセスする点で異なる。その他の構成および動作は、第2実施形態と同様であるため、同じ構成および動作については同じ符号を付してその詳しい説明を省略する。
[Third Embodiment]
Next, an information processing system according to the third embodiment of the present invention will be described. The information processing system according to the present embodiment is different from the second embodiment in that related information is automatically accessed from a link destination even if the user does not perform a link destination access operation. Since other configurations and operations are the same as those of the second embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.
 本実施形態によれば、ユーザの操作なしに、リアルタイムに映像中の画像内の建築物を含む景観要素に対応付けて、リンク先の関連情報を報知できる。 According to this embodiment, it is possible to notify related information of a link destination in association with a landscape element including a building in an image in a video in real time without a user operation.
 《情報処理システムの動作手順》
 図19は、本実施形態に係る情報処理システムの動作手順を示すシーケンス図である。なお、図19において、第2実施形態の図5と同様の動作は同じステップ番号を付して、説明は省略する。
<< Operation procedure of information processing system >>
FIG. 19 is a sequence diagram showing an operation procedure of the information processing system according to the present embodiment. In FIG. 19, operations similar to those in FIG. 5 of the second embodiment are denoted by the same step numbers, and description thereof is omitted.
 ステップS400およびS401においては、アプリケーションやデータの相違の可能性はあるが、図4および図5と同様にダウンロードおよび起動と初期化が行なわれる。 In steps S400 and S401, although there is a possibility that there is a difference between applications and data, download, activation and initialization are performed in the same manner as in FIGS.
 ステップS411において通信端末210から受信した映像の局所特徴量から、景観中の景観要素を認識した景観要素認識サーバ220は、ステップS513において、リンク情報DB223を参照して、認識した景観要素に対応するリンク情報を取得する。 The landscape element recognition server 220, having recognized landscape elements in the scene from the local features of the video received from the communication terminal 210 in step S411, acquires, in step S513, the link information corresponding to the recognized landscape elements with reference to the link information DB 223.
 もし、取得したリンク情報が複数あれば、ステップS1915において、リンク先が選択される。リンク先の選択は、例えば、通信端末210を使用するユーザの指示や景観要素認識サーバ220によるユーザ認識に基づいて行なってよいが、ここでは詳細な説明は省略する。ステップS1917において、リンク情報に基づいてリンク先の関連情報提供サーバ230を認識した景観要素IDを持ってアクセスする。なお、図19の動作手順においては、リンク先アクセスで映像の局所特徴量を送信した通信端末IDも送信する。 If a plurality of pieces of link information are acquired, a link destination is selected in step S1915. The selection of the link destination may be made, for example, based on an instruction from the user of the communication terminal 210 or on user recognition by the landscape element recognition server 220; a detailed description is omitted here. In step S1917, the related information providing server 230 of the link destination is accessed, based on the link information, with the ID of the recognized landscape element. In the operation procedure of FIG. 19, the ID of the communication terminal that transmitted the local features of the video is also transmitted in the link destination access.
 関連情報提供サーバ230は、アクセスに付随する景観要素IDに対応する景観要素関連情報(文書データや音声データを含む)を関連情報DB231から取得する。そして、ステップS525において、アクセス元の通信端末210に関連情報を返信する。ここで、ステップS1917において、送信された通信端末IDが使用される。 The related information providing server 230 acquires, from the related information DB 231, the landscape element related information (including document data and audio data) corresponding to the landscape element ID accompanying the access. Then, in step S525, the related information is returned to the access source communication terminal 210. Here, the communication terminal ID transmitted in step S1917 is used.
 関連情報の返信を受けた通信端末210は、ステップS527において、受信した関連情報を表示あるいは音声出力する。 The communication terminal 210 that has received the reply of the related information displays or outputs the received related information in step S527.
 なお、図19においては、景観要素認識サーバ220からのリンク先アクセスへの応答が、通信端末210に対して行なわれる場合を説明した。しかしながら、景観要素認識サーバ220がリンク先からの返信を受けて、通信端末210に中継する構成であってもよい。あるいは、通信端末210において、リンク情報を受信するとリンク先への自動アクセスを行ない、リンク先からの返信を報知する構成であってもよい。 In addition, in FIG. 19, the case where the response to the link destination access from the landscape element recognition server 220 is performed to the communication terminal 210 has been described. However, the landscape element recognition server 220 may receive the reply from the link destination and relay it to the communication terminal 210. Alternatively, the communication terminal 210 may be configured such that when link information is received, automatic access to the link destination is performed and a reply from the link destination is notified.
 [第4実施形態]
 次に、本発明の第4実施形態に係る情報処理システムについて説明する。本実施形態に係る情報処理システムは、上記第2実施形態および第3実施形態と比べると、景観要素の認識処理に基づいて景観を撮像しているユーザの現在地および/または移動方向/速度を算出する点で異なる。その他の構成および動作は、第2実施形態と同様であるため、同じ構成および動作については同じ符号を付してその詳しい説明を省略する。
[Fourth Embodiment]
Next, an information processing system according to the fourth embodiment of the present invention will be described. Compared with the second and third embodiments, the information processing system according to the present embodiment differs in that it calculates the current location and/or the moving direction and moving speed of the user who is capturing the landscape, based on the landscape element recognition processing. Since the other configurations and operations are the same as those of the second embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.
 本実施形態によれば、リアルタイムに映像中の画像内の景観要素に基づいて、ユーザの現在地および/または移動方向/速度を算出できる。 According to the present embodiment, the current location and / or moving direction / speed of the user can be calculated based on the landscape element in the image in the video in real time.
 《通信端末の表示画面例》
 図20A乃至図20Cは、本実施形態に係る情報処理システムにおける通信端末2010の表示画面例を示す図である。
《Example of communication terminal display screen》
20A to 20C are diagrams illustrating display screen examples of the communication terminal 2010 in the information processing system according to the present embodiment.
 (現在地)
 まず、図20Aは、ユーザの現在地を報知する例を示した図である。
(Current location)
First, FIG. 20A is a diagram illustrating an example of informing the user's current location.
 図20Aの左図は、通信端末2010が撮像した景観の表示画面2011を示す。図20Aの中央図は、通信端末2010が、左図から時計回りに撮像範囲を移動して撮像した景観の表示画面2012を示す。 The left figure of FIG. 20A shows the landscape display screen 2011 captured by the communication terminal 2010. The central view of FIG. 20A shows a landscape display screen 2012 captured by the communication terminal 2010 by moving the imaging range clockwise from the left diagram.
 そして、図20Aの右図は、左図と中央図とを組み合わせて、本実施形態における処理により、各景観要素を撮像した角度に基づいて通信端末2010の現在地(ユーザの現在地)2014を決定して表示画面2013に重畳表示する。 In the right figure of FIG. 20A, the left and center figures are combined, and the processing of this embodiment determines the current location 2014 of the communication terminal 2010 (the user's current location) based on the angles at which the respective landscape elements were imaged, superimposing it on the display screen 2013.
 なお、通信端末2010が複数の景観要素までの距離を測定可能であれば、1つの映像によっても現在地2014が決定可能である。 If the communication terminal 2010 can measure the distances to a plurality of landscape elements, the current location 2014 can be determined even from a single video.
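The determination of the current location from the angles to two recognized landscape elements (FIG. 20A) can be sketched as a ray intersection on a flat two-dimensional map. The coordinate convention (angles measured from the x-axis, bearings pointing from each landmark toward the observer) and all names are illustrative assumptions, not the disclosed method.

```python
# Illustrative sketch: each recognized landscape element with known map
# coordinates, together with the direction from which it was imaged, defines
# a ray; the observer's current location is the intersection of two rays.
import math

def locate(p1, bearing1, p2, bearing2):
    """p1, p2: landmark map coordinates (x, y); bearing: direction from the
    landmark toward the observer, as an angle from the x-axis in radians."""
    d1 = (math.cos(bearing1), math.sin(bearing1))
    d2 = (math.cos(bearing2), math.sin(bearing2))
    # Intersect the two rays: solve p1 + t*d1 == p2 + s*d2 for t (Cramer's rule).
    det = d1[0] * (-d2[1]) - (-d2[0]) * d1[1]
    if abs(det) < 1e-12:
        raise ValueError("bearings are parallel; location is ambiguous")
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    t = (rx * (-d2[1]) - (-d2[0]) * ry) / det
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])
```

With distances to the landmarks also measured, a single video suffices, as noted above; the ray intersection is the two-view case.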
 (景観要素の角度変化による移動方向および移動速度)
 次に、図20Bは、地上におけるユーザの移動方向および移動速度を報知する例を示した図である。
(Moving direction and moving speed by changing the angle of landscape elements)
Next, FIG. 20B is a diagram illustrating an example in which the moving direction and moving speed of the user on the ground are notified.
 図20Bの左図は、通信端末2010が撮像した景観の表示画面2011を示す。図20Bの中央図は、通信端末2010が、ある方向にある距離だけ移動した後に撮像した景観の表示画面2022を示す。図20Bでは、映像内のビルの角度の変化が見られる。 The left figure of FIG. 20B shows the landscape display screen 2011 captured by the communication terminal 2010. The central view of FIG. 20B shows a landscape display screen 2022 captured after the communication terminal 2010 has moved a certain distance in a certain direction. In FIG. 20B, a change in the angle of the building in the video can be seen.
 そして、図20Bの右図は、左図と中央図とを組み合わせて、本実施形態における処理により、各景観要素を撮像した角度の変化に基づいて通信端末2010の移動方向および移動速度2024を決定して表示画面2023に重畳表示する。 In the right figure of FIG. 20B, the left and center figures are combined, and the processing of this embodiment determines the moving direction and moving speed 2024 of the communication terminal 2010 based on the change in the angles at which the respective landscape elements were imaged, superimposing them on the display screen 2023.
 (景観変化による移動方向および移動速度)
 また、図20Cは、空中におけるユーザの移動方向および移動速度を報知する例を示した図である。
(Moving direction and moving speed due to landscape changes)
FIG. 20C is a diagram illustrating an example of informing the moving direction and moving speed of the user in the air.
 図20Cの左図は、通信端末2010が空中から地上を撮像した景観の表示画面2031を示す。図20Cの中央図は、通信端末2010が、ある方向にある距離だけ移動した後に空中から地上を撮像した景観の表示画面2032を示す。図20Cでは、ユーザが上方に移動していることが見られる。 The left figure of FIG. 20C shows the display screen 2031 of the landscape in which the communication terminal 2010 images the ground from the air. The central view of FIG. 20C shows a landscape display screen 2032 in which the communication terminal 2010 images the ground from the air after moving a certain distance in a certain direction. In FIG. 20C, it can be seen that the user is moving upward.
 そして、図20Cの右図は、左図と中央図とを組み合わせて、本実施形態における処理により、撮像した景観中の景観要素の変化に基づいて通信端末2010の移動方向および移動速度2034を決定して表示画面2033に重畳表示する。 In the right figure of FIG. 20C, the left and center figures are combined, and the processing of this embodiment determines the moving direction and moving speed 2034 of the communication terminal 2010 based on the change of the landscape elements in the captured landscape, superimposing them on the display screen 2033.
 《情報処理システムの動作手順》
 以下、図21および図22を参照して、本実施形態の情報処理システムの操作手順を示す。
<< Operation procedure of information processing system >>
Hereinafter, an operation procedure of the information processing system according to the present embodiment will be described with reference to FIGS. 21 and 22.
 (局所特徴量DB生成)
 図21は、本実施形態に係る情報処理システムにおける局所特徴量DB生成の動作手順を示すシーケンス図である。なお、図21は一例であって、これに限定されない。例えば、第2実施形態の図17に示したような局所特徴量DB生成であってもよい。本実施形態において、図21のような動作手順は、景観要素をあらゆる方角から撮像しても認識できることが望ましいために行なわれる処理である。
(Local feature DB generation)
FIG. 21 is a sequence diagram illustrating an operation procedure for generating a local feature DB in the information processing system according to the present embodiment. FIG. 21 is an example, and the present invention is not limited to this. For example, local feature DB generation as shown in FIG. 17 of the second embodiment may be used. In the present embodiment, the operation procedure as shown in FIG. 21 is a process that is performed because it is desirable that the landscape element can be recognized from any direction.
 まず、ステップS2101においては、1つあるいは複数の通信端末2010を含む撮像装置によって、対象とする特定の景観要素を撮像する。ステップS2103においては、この複数の映像データを景観要素情報と共に景観要素認識サーバ2420にそれぞれ送信する。 First, in step S2101, a specific landscape element as a target is imaged by an imaging device including one or a plurality of communication terminals 2010. In step S2103, the plurality of video data is transmitted to the landscape element recognition server 2420 together with the landscape element information.
 景観要素認識サーバ2420においては、ステップS2105において、受信した映像データからそれぞれの局所特徴量を生成する。次に、ステップS2107において、生成した局所特徴量を比較し、相関の小さい局所特徴量を局所特徴量DB2221に格納する局所特徴量とする。相関の小さい局所特徴量は、それぞれが同じ景観要素を認識するために別個に記憶すべき局所特徴量となる。 The landscape element recognition server 2420 generates local feature amounts from the received video data in step S2105. Next, in step S2107, the generated local feature quantities are compared, and the local feature quantity having a small correlation is set as a local feature quantity stored in the local feature quantity DB 2221. The local feature amount having a small correlation is a local feature amount to be stored separately in order to recognize the same landscape element.
 ステップS2109において、選択した相関の小さい局所特徴量を、景観要素情報や局所特徴量の精度と共に、局所特徴量DB2221に送信する。局所特徴量DB2221は、ステップS2111において、受信した局所特徴量を景観要素に対応付けて格納する。 In step S2109, the selected local feature quantity with a small correlation is transmitted to the local feature quantity DB 2221 together with the landscape element information and the accuracy of the local feature quantity. In step S2111, the local feature DB 2221 stores the received local feature in association with the landscape element.
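Step S2107, which keeps only local features with small mutual correlation so that distinct views of the same landscape element are stored separately, can be sketched as follows; the normalized-correlation measure and the threshold value are illustrative assumptions.

```python
# Illustrative sketch: from feature vectors generated from multiple shots of
# the same landscape element, keep a vector only if its correlation with every
# already-kept vector is below the threshold (i.e., it adds a distinct view).

def correlation(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def select_distinct(vectors, threshold=0.8):
    kept = []
    for v in vectors:
        if all(correlation(v, k) < threshold for k in kept):
            kept.append(v)        # low correlation: store as a separate view
    return kept
```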
 (現在地決定および/または移動方向と移動速度決定)
 図22は、本実施形態に係る情報処理システムにおける現在地決定および/または移動方向と移動速度決定の動作手順を示すシーケンス図である。なお、第2実施形態の図4および図5の動作手順と同様の手順には、同じステップ番号を付して、説明は省略する。
(Determining the current location and / or moving direction and moving speed)
FIG. 22 is a sequence diagram showing an operation procedure for determining the current location and / or moving direction and moving speed in the information processing system according to the present embodiment. In addition, the same step number is attached | subjected to the procedure similar to the operation | movement procedure of FIG. 4 and FIG. 5 of 2nd Embodiment, and description is abbreviate | omitted.
 ステップS400およびS401においては、アプリケーションやデータの相違の可能性はあるが、図4および図5と同様にダウンロードおよび起動と初期化が行なわれる。 In steps S400 and S401, although there is a possibility that there is a difference between applications and data, download, activation and initialization are performed in the same manner as in FIGS.
 景観要素認識サーバ2420は、ステップS2211において、通信端末2010から受信した映像の局所特徴量から、局所特徴量DB2221の局所特徴量と照合して、景観要素を認識する。 In step S2211, the landscape element recognition server 2420 recognizes landscape elements by collating the local features of the video received from the communication terminal 2010 against the local features in the local feature DB 2221.
 通信端末2010は、ステップS2221において、ステップS403とは異なる方角の映像を取得する。ステップS2223において、ステップS2221において取得した映像の局所特徴量を生成する。続いて、ステップS2225において、生成した局所特徴量を特徴点座標と共に符号化する。そして、符号化した局所特徴量を景観要素認識サーバ2420に送信する。 In step S2221, the communication terminal 2010 acquires a video having a direction different from that in step S403. In step S2223, a local feature amount of the video acquired in step S2221 is generated. Subsequently, in step S2225, the generated local feature is encoded together with the feature point coordinates. And the encoded local feature-value is transmitted to the landscape element recognition server 2420.
 景観要素認識サーバ2420では、ステップS2229において、局所特徴量DB2221の局所特徴量と照合して、景観要素を認識する。 In step S2229, the landscape element recognition server 2420 recognizes landscape elements by collating the received local features against the local features in the local feature DB 2221.
 ステップS2231においては、ステップS2211において認識した景観要素と、ステップS2229において認識した景観要素との間に、角度変化があったか否かを判定する。かかる角度変化は、局所特徴量の照合から景観要素を認識する処理における、特徴点座標の幾何学的配置の相違から測定可能である(図11H,図27A、図27B参照)。角度変化が所定閾値以上であればステップS2233に進んで、地図DB2222も参照して、通信端末2010(ユーザ)の移動方向および移動速度を算出する。なお、移動方向および移動速度は、2回の映像取得(ステップS403およびS2221)間の経過時間と、少なくとも1つの景観要素の角度変化と景観要素までの距離が測定可能であれば、算出可能である。あるいは、複数の景観要素を参照すれば、より正確な移動方向および移動速度の算出が可能である。 In step S2231, it is determined whether there has been an angle change between the landscape element recognized in step S2211 and the landscape element recognized in step S2229. Such an angle change can be measured from the difference in the geometric arrangement of the feature point coordinates in the process of recognizing the landscape element from the collation of local features (see FIGS. 11H, 27A, and 27B). If the angle change is equal to or greater than a predetermined threshold, the process proceeds to step S2233, and the moving direction and moving speed of the communication terminal 2010 (user) are calculated with reference also to the map DB 2222. The moving direction and moving speed can be calculated as long as the elapsed time between the two video acquisitions (steps S403 and S2221), the angle change of at least one landscape element, and the distance to that landscape element can be measured. Alternatively, by referring to a plurality of landscape elements, the moving direction and moving speed can be calculated more accurately.
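The speed estimate of step S2233 can be sketched under a small-angle assumption: the lateral displacement is approximated by the distance to the landscape element multiplied by its angle change, and the speed follows from the elapsed time between the two video acquisitions. The direction labels and all names are illustrative assumptions, not the disclosed formula.

```python
# Illustrative sketch: one landscape element at a known distance, observed at
# two angles separated by elapsed_s seconds. Arc length = distance * delta
# approximates the lateral displacement for small angle changes.

def estimate_motion(distance_m, angle1_rad, angle2_rad, elapsed_s):
    delta = angle2_rad - angle1_rad
    displacement = distance_m * delta          # small-angle arc length (m)
    speed = abs(displacement) / elapsed_s      # m/s
    if delta > 0:
        direction = "right"
    elif delta < 0:
        direction = "left"
    else:
        direction = "none"
    return speed, direction
```

Referring to several landscape elements and averaging, as noted above, would give a more accurate estimate.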
 そして、景観要素認識サーバ2420は、ステップS2235において、ユーザの移動方向および移動速度を通信端末2010に送信する。通信端末2010は、ステップS2237において、ユーザの移動方向および移動速度を報知する(図20B参照)。 And the landscape element recognition server 2420 transmits a user's moving direction and moving speed to the communication terminal 2010 in step S2235. In step S2237, the communication terminal 2010 notifies the user's moving direction and moving speed (see FIG. 20B).
 景観要素の角度変化が所定閾値より小さければステップS2239に進んで、景観映像中の景観要素が変化したかを判定する。かかる景観要素の変化は、映像中から消滅する景観要素と映像中に出現する景観要素の数が所定閾値を超える場合とする。景観要素が変化した場合にはステップS2241に進んで、地図DB2222を参照して、通信端末2010(ユーザ)の現在地を算出する。ステップS2241における現在地算出は、各景観要素認識時(ステップS2211とS2229)において局所特徴量の照合から景観要素を認識する処理における、特徴点座標の幾何学的配置から、認識した各景観要素を撮像した方角算出が可能である。したがって、複数の景観要素の位置/角度算出結果にもとづき撮像した方角を逆に辿れば可能である。 If the angle change of the landscape elements is smaller than the predetermined threshold, the process proceeds to step S2239, where it is determined whether the landscape elements in the landscape video have changed. The landscape elements are considered to have changed when the number of landscape elements that disappear from the video and of landscape elements that newly appear in the video exceeds a predetermined threshold. If the landscape elements have changed, the process proceeds to step S2241, and the current location of the communication terminal 2010 (user) is calculated with reference to the map DB 2222. For the current location calculation in step S2241, the direction from which each recognized landscape element was imaged can be calculated from the geometric arrangement of the feature point coordinates used in recognizing the landscape elements from the collation of local features at each recognition (steps S2211 and S2229). The current location can therefore be obtained by tracing back the imaging directions based on the position and angle calculation results for a plurality of landscape elements.
 Then, in step S2243, the landscape element recognition server 2420 transmits the user's current location to the communication terminal 2010. In step S2237, the communication terminal 2010 notifies the user of the current location (see FIG. 20A).
 As described above, FIG. 22 shows an example in which the user's current location is automatically calculated and reported from the change of the landscape elements, and the user's moving direction and moving speed are automatically calculated and reported from the angle of the landscape elements. As shown in FIGS. 20A and 20B, the user interface is designed so that the current location is calculated when the user deliberately changes the imaging direction of the imaging unit 601, and the moving direction and moving speed are calculated when a predetermined time elapses with the imaging direction of the imaging unit 601 held steady. However, the user may also select current location calculation, or moving direction and moving speed calculation, from a menu of the communication terminal.
 Although FIG. 22 does not show the operation procedure for reporting the user's moving direction and moving speed corresponding to FIG. 20C, it is clear that the moving direction and moving speed are calculated from the movement of the landscape elements in the video.
 《Functional Configuration of the Communication Terminal》
 FIG. 23 is a block diagram showing the functional configuration of the communication terminal according to the present embodiment. In FIG. 23, functional components similar to those in FIG. 6 of the second embodiment are given the same reference numerals, and their description is omitted. The recognition result notification unit 2306 is a functional component that includes the display screen generation unit 606 of FIG. 6.
 The current location calculation result receiving unit 2307 receives, via the communication control unit 604, the user's current location information calculated by the landscape element recognition server 2420, and the current location notification unit 2308 notifies the user of it.
 The moving direction/speed calculation result receiving unit 2309 receives, via the communication control unit 604, the user's moving direction and moving speed information calculated by the landscape element recognition server 2420, and the moving direction/speed notification unit 2310 notifies the user of it.
 《Functional Configuration of the Landscape Element Recognition Server》
 FIG. 24 is a block diagram showing the functional configuration of the landscape element recognition server according to the present embodiment. In FIG. 24, functional components similar to those in FIG. 7 of the second embodiment are given the same reference numerals, and their description is omitted.
 The landscape element storage unit 2405 stores each landscape element recognized by the landscape element recognition unit 703 in association with its imaging angle and imaging time. Alternatively, it stores the distance to the landscape element. The landscape element comparison unit 2406 compares the imaging angles of landscape elements recognized as the same landscape element. It also detects, by comparing landscape elements, the disappearance and appearance of landscape elements in the video.
 If the change in imaging angle is equal to or greater than a predetermined threshold, the moving direction/speed calculation unit 2407 calculates the moving direction and moving speed with reference to the map DB 2222. The moving direction/speed transmission unit 2408 then transmits the calculated moving direction and moving speed to the communication terminal 2010 via the communication control unit 701.
 On the other hand, if the number of disappearing and appearing landscape elements exceeds a predetermined number, the current location calculation unit 2409 calculates the current location from a wide range of landscape elements and their imaging angles. The current location transmission unit 2410 then transmits the calculated current location to the communication terminal 2010 via the communication control unit 701.
 (Local Feature DB)
 FIG. 25 is a diagram showing the configuration of the local feature DB 2221 according to the present embodiment.
 The difference between FIG. 25 and FIG. 8 is that a plurality of local features, such as the first calculated local feature 2503 and the second calculated local feature, are stored in association with the same landscape element ID 2501 and name 2502. These local features are selected so that the correlation between them is small. As in FIG. 8, each local feature consists of the first through m-th local features 2505.
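The arrangement of FIG. 25 can be pictured with a small in-memory analogue. The element ID, name, and descriptor values below are invented for illustration only, and the similarity function is left to the caller; storing several low-correlation variants lets matching succeed under differing capture conditions:

```python
# Hypothetical stand-in for the local feature DB of FIG. 25: each landscape
# element keeps several descriptor variants chosen for low mutual correlation.
LOCAL_FEATURE_DB = {
    "E001": {
        "name": "Sample Tower",
        "features": [[0.1, 0.9, 0.3], [0.8, 0.2, 0.5]],  # variant descriptors
    },
}

def best_match_score(query, element_id, similarity):
    """Match a query descriptor against every stored variant, keep the best."""
    variants = LOCAL_FEATURE_DB[element_id]["features"]
    return max(similarity(query, f) for f in variants)
```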
 (Map DB)
 FIG. 26 is a diagram showing the configuration of the map DB 2222 according to the present embodiment.
 The map DB 2222 includes a map data storage unit 2610 and a landscape element position storage unit 2620. The map data storage unit 2610 stores map data 2612 in association with a map ID 2611. The landscape element position storage unit 2620 stores, in association with a landscape element ID 2621, coordinates 2622 consisting of the latitude and longitude of the landscape element, an address 2623, and a position 2626 on the map stored in the map data storage unit 2610.
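A minimal in-memory analogue of this two-part map DB might look as follows; the IDs, coordinate values, and field names are assumptions for illustration only:

```python
# Map data keyed by map ID (2611 -> 2612).
MAP_DATA = {"M001": b"...map tile bytes..."}

# Landscape element positions keyed by landscape element ID (2621), holding
# latitude/longitude coordinates (2622), an address (2623), and the position
# on the stored map (2626).
LANDSCAPE_POSITIONS = {
    "E001": {"lat": 35.6586, "lon": 139.7454,
             "address": "1-2-3 Example-cho",
             "map_id": "M001", "map_xy": (120, 340)},
}

def element_latlon(element_id):
    """Return the latitude/longitude recorded for a landscape element."""
    rec = LANDSCAPE_POSITIONS[element_id]
    return rec["lat"], rec["lon"]
```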
 《Processing of the Landscape Element Recognition Unit》
 FIGS. 27A and 27B are diagrams showing the processing of the landscape element recognition unit according to the present embodiment. To simplify the explanation, FIGS. 27A and 27B describe a single landscape element, but the same applies to the many landscape elements in a video.
 FIG. 27A shows the processing of the landscape element recognition unit on the ground. It illustrates how local features generated from the landscape element images 2791 to 2793 captured from the ground by the communication terminal 2010 are matched against local features stored in advance in the local feature DB 2221.
 From the images 2791 to 2793 captured by the communication terminal 2010 in the left part of FIG. 27A, local features are generated according to the present embodiment. Then, by matching them against the local features of the landscape elements stored in the local feature DB 2221, the imaging directions of the landscape elements in the images 2791 to 2793 are calculated.
 FIG. 27B shows the processing of the landscape element recognition unit in the air. It illustrates how local features generated from the landscape element images 2794 to 2796 captured from the air by the communication terminal 2010 are matched against local features stored in advance in the local feature DB 2221.
 From the images 2794 to 2796 captured by the communication terminal 2010 in the left part of FIG. 27B, local features are generated according to the present embodiment. Then, by matching them against the local features of the landscape elements stored in the local feature DB 2221, the imaging directions of the landscape elements in the images 2794 to 2796 are calculated.
 《Hardware Configuration of the Communication Terminal》
 FIG. 28 is a block diagram showing the hardware configuration of the communication terminal 2010 according to the present embodiment. Elements similar to those in FIG. 12 of the second embodiment are given the same reference numerals, and their description is omitted.
 The RAM 2840 is a random access memory that the CPU 1210 uses as a work area for temporary storage. An area for storing the data necessary to realize the present embodiment is secured in the RAM 2840. The current location calculation result 2841 indicates the calculated current location of the user. The moving direction/speed calculation result 2842 indicates the calculated moving direction and moving speed. The display screen data 1247 indicates display screen data for reporting to the user information including the current location calculation result 2841 and the moving direction/speed calculation result 2842.
 The storage 2850 stores databases, various parameters, and the following data and programs necessary to realize the present embodiment. The calculation result reception notification module 2851 is a module that receives the current location, or the moving direction and moving speed, from the landscape element recognition server 2420 and reports them to the user.
 Note that FIG. 28 shows only the data and programs essential to the present embodiment; data and programs not related to the present embodiment are not shown.
 《Processing Procedure of the Communication Terminal》
 FIG. 29 is a flowchart showing the processing procedure of the communication terminal 2010 according to the present embodiment. Steps similar to those in FIG. 13 of the second embodiment are given the same step numbers, and their description is omitted.
 If it is determined in step S1321 that data has been received, it is first determined in step S2923 whether a landscape element recognition result has been received. If so, the process advances to step S2925, where the landscape element recognition result is reported. Next, in step S2927, it is determined whether a current location calculation result has been received. If so, the process advances to step S2929, where the current location is reported. Next, in step S2931, it is determined whether a moving direction/speed calculation result has been received. If so, the process advances to step S2933, where the moving direction and moving speed are reported.
 《Hardware Configuration of the Landscape Element Recognition Server》
 FIG. 30 is a block diagram showing the hardware configuration of the landscape element recognition server 2420 according to the present embodiment. Elements similar to those in FIG. 15 of the second embodiment are given the same reference numerals, and their description is omitted.
 The RAM 3040 is a random access memory that the CPU 1510 uses as a work area for temporary storage. An area for storing the data necessary to realize the present embodiment is secured in the RAM 3040. The current location calculation table 3041 is a table that stores parameters for calculating the current location (see FIG. 31A). The moving direction/speed calculation table 3042 is a table that stores parameters for calculating the moving direction and moving speed (see FIG. 31B).
 The storage 3050 stores databases, various parameters, and the following data and programs necessary to realize the present embodiment. The local feature DB 2221 is the local feature DB shown in FIG. 25. The map DB 2222 is the map DB shown in FIG. 26.
 The storage 3050 also stores the following programs. The current location calculation module 3051 is a module that calculates the user's current location from landscape elements and their imaging directions. The moving direction/speed calculation module 3052 is a module that calculates the user's moving direction and moving speed from landscape elements and changes in their imaging directions. The recognition result/calculation result transmission module 3053 is a module that transmits the recognition results of the landscape elements in the video and the calculation results of the current location, or the moving direction and moving speed, to the communication terminal 2010.
 Note that FIG. 30 shows only the data and programs essential to the present embodiment; data and programs not related to the present embodiment are not shown.
 (Current Location Calculation Table)
 FIG. 31A is a diagram showing the configuration of the current location calculation table 3041 according to the present embodiment.
 The current location calculation table 3041 stores, in association with a communication terminal ID 3111, the first landscape element ID of the first landscape element 3112, the distance to the first landscape element, and its imaging direction, as well as the second landscape element ID of the second landscape element 3113, the distance to the second landscape element, and its imaging direction. It then stores the current location calculation result 3114 calculated based on the distances to these landscape elements and their imaging directions.
 (Moving Direction/Speed Calculation Table)
 FIG. 31B is a diagram showing the configuration of the moving direction/speed calculation table 3042 according to the present embodiment.
 The moving direction/speed calculation table 3042 stores, in association with a communication terminal ID 3121, the first landscape element ID of the first landscape element 3122, the distance to the first landscape element and its imaging direction in the previous video, and the distance to the first landscape element and its imaging direction in the current video. It likewise stores the second landscape element ID of the second landscape element 3123, the distance to the second landscape element and its imaging direction in the previous video, and the distance to the second landscape element and its imaging direction in the current video. It then stores the moving direction/speed calculation result 3124 calculated based on the distances to these landscape elements and their imaging directions.
 《Processing Procedure of the Landscape Element Recognition Server》
 FIG. 32 is a flowchart showing the processing procedure of the landscape element recognition server 2420 according to the present embodiment. This flowchart is executed by the CPU 1510 of FIG. 30 using the RAM 1540, and implements the functional components of the landscape element recognition server 2420 of FIG. 24.
 If the local feature DB is to be generated, the process advances to step S3213, where the local feature DB generation processing of the present embodiment is executed (see FIG. 33). If local features have been received, the process advances to step S1623, where landscape element recognition processing is performed (see FIGS. 18A and 18B). Then, in step S3225, the landscape element recognition result is transmitted to the communication terminal 2010.
 In step S3227, it is determined whether the condition for current location calculation is satisfied. As described above, the condition for current location calculation is that the change of landscape elements (disappearance or appearance) exceeds a predetermined threshold. If the condition is satisfied, the process advances to step S3229, where the current location calculation processing is executed (see FIG. 34A). Then, in step S3231, the calculated current location information is transmitted to the communication terminal 2010.
 Next, in step S3233, it is determined whether the condition for moving direction/speed calculation is satisfied. As described above, the condition for moving direction/speed calculation is that the change in the imaging angle of a landscape element exceeds a predetermined threshold. If the condition is satisfied, the process advances to step S3235, where the moving direction/speed calculation processing is executed (see FIG. 34B). Then, in step S3237, the calculated moving direction and moving speed are transmitted to the communication terminal 2010.
 (Local Feature DB Generation Processing)
 FIG. 33 is a flowchart showing the procedure of the local feature DB generation processing S3213 according to the present embodiment. Steps similar to those in FIG. 17 of the second embodiment are given the same step numbers, and their description is omitted.
 In step S3301, the local features generated for a given landscape element are stored. Next, it is determined whether there is another image capturing the same landscape element. If there is another image, the process returns to step S1701 and local feature generation is repeated. If there is no other image, the process advances to step S3305, where a plurality of (at least two) local features with small mutual correlation are selected from among the local features generated from the same landscape element and stored in step S3301. Accordingly, in step S1715, a plurality of local features with small mutual correlation are stored in the local feature DB 2221 in association with the landscape element.
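The selection of mutually low-correlation features in step S3305 can be sketched as an exhaustive search over combinations. The Pearson-correlation criterion and the function names are assumptions for illustration, not the claimed selection rule:

```python
import itertools
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den

def select_low_correlation(feature_sets, k=2):
    """Pick the k descriptor sets (one per source image of the same landscape
    element) whose largest pairwise |correlation| is smallest."""
    best, best_score = None, float("inf")
    for combo in itertools.combinations(range(len(feature_sets)), k):
        score = max(abs(pearson(feature_sets[i], feature_sets[j]))
                    for i, j in itertools.combinations(combo, 2))
        if score < best_score:
            best, best_score = combo, score
    return list(best)
```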
 (Current Location Calculation Processing)
 FIG. 34A is a flowchart showing the procedure of the current location calculation processing S3229 according to the present embodiment.
 First, in step S3411, the position of each landscape element recognized in the successive images is acquired with reference to the map DB 2222. Next, in step S3413, the orientation (imaging angle) of each landscape element is calculated from the arrangement of the feature points in the matching against the local features of the corresponding landscape element in the local feature DB 2221. The imaging location (the current location of the communication terminal and the user) is then calculated from the position and orientation of each landscape element. Note that since the matching in step S3413 has already been performed in the landscape element recognition processing, it is unnecessary here if the orientation is calculated at that time.
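With each landscape element's map position (step S3411) and imaging direction (step S3413) in hand, the imaging location is the intersection of the back-bearing lines from the landmarks. A minimal two-landmark sketch under simplifying assumptions (planar coordinates, absolute bearings in radians, assumed function name):

```python
import math

def locate_by_resection(p1, theta1, p2, theta2):
    """Recover the camera position from two landmarks with known map
    coordinates p1, p2 and measured absolute bearings theta1, theta2
    (camera -> landmark). Intersects the two bearing lines; assumes
    the bearings are not parallel."""
    u1 = (math.cos(theta1), math.sin(theta1))
    u2 = (math.cos(theta2), math.sin(theta2))
    # camera + t1*u1 = p1 and camera + t2*u2 = p2
    # => t1*u1 - t2*u2 = p1 - p2 : solve the 2x2 system by Cramer's rule
    bx, by = p1[0] - p2[0], p1[1] - p2[1]
    det = u1[0] * (-u2[1]) - (-u2[0]) * u1[1]
    if abs(det) < 1e-12:
        raise ValueError("bearings are parallel; position is ambiguous")
    t1 = (bx * (-u2[1]) - (-u2[0]) * by) / det
    return (p1[0] - t1 * u1[0], p1[1] - t1 * u1[1])
```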
 (Moving Direction/Speed Calculation Processing)
 FIG. 34B is a flowchart showing the procedure of the moving direction/speed calculation processing S3235 according to the present embodiment.
 First, in step S3421, the orientations (imaging angles) of at least two landscape elements in the first image are calculated with reference to the local feature DB 2221. Subsequently, in step S3423, the orientations (imaging angles) of the same landscape elements in the second image are calculated with reference to the local feature DB 2221. In step S3425, the distances to those landscape elements are calculated. The distance to a landscape element may instead be obtained from a measurement by the communication terminal 2010. Then, in step S3427, the moving direction and moving speed are calculated based on the changes in the orientations (imaging angles) of the landscape elements and the distances to them. Note that since the orientations in steps S3421 and S3423 can be calculated in the landscape element recognition processing, these steps are unnecessary if the orientations are calculated at that time.
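The computation in steps S3421 to S3427 can be sketched for a single fixed landmark observed in two frames; the function name and the planar-geometry simplification are assumptions for illustration:

```python
import math

def motion_from_landmark(theta1, d1, theta2, d2, dt):
    """Estimate camera motion from one fixed landmark observed twice.

    theta1, theta2: absolute bearings (radians) from camera to landmark
    d1, d2:         distances to the landmark at each observation
    dt:             elapsed time between the two frames
    Returns (heading_radians, speed) of the camera."""
    # Put the landmark at the origin; the camera then sits at -d * (cos t, sin t).
    x1, y1 = -d1 * math.cos(theta1), -d1 * math.sin(theta1)
    x2, y2 = -d2 * math.cos(theta2), -d2 * math.sin(theta2)
    dx, dy = x2 - x1, y2 - y1
    speed = math.hypot(dx, dy) / dt
    heading = math.atan2(dy, dx)
    return heading, speed
```

Averaging the result over several landmarks, as the text notes, would make the estimate more accurate.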
 [Fifth Embodiment]
 Next, an information processing system according to the fifth embodiment of the present invention will be described. The information processing system according to the present embodiment differs from the fourth embodiment in that the user is navigated based on the calculation results of the user's current location and/or moving direction and speed. The other configurations and operations are similar to those of the second embodiment, so the same configurations and operations are given the same reference numerals and their detailed description is omitted. In the present embodiment, an example in which the landscape element recognition server performs the navigation is shown, but the division of roles can be changed, as in a configuration in which navigation is performed by a vehicle-mounted navigation system or by navigation processing installed in a portable terminal.
 According to the present embodiment, the user can be navigated in real time based on the landscape elements in the images of a video.
 《Display Screen Examples of the Communication Terminal》
 FIG. 35 is a diagram showing display screen examples of the communication terminal 3510 in the information processing system according to the present embodiment.
 The left part of FIG. 35 shows a display screen 3511 of a landscape captured by the communication terminal 3510. The center part of FIG. 35 shows a display screen 3512 of the landscape captured after the user carrying the communication terminal 3510 (or the vehicle in which the communication terminal 3510 is installed) has moved a certain distance along the road. FIG. 35 shows that the user is moving toward the top of the view.
 Through the processing of the present embodiment, the landscape elements in the video are first recognized, and the landscape element names (○○ Building 3514 and ×× Park 3515) are superimposed on the video. Further, by combining the left and center views, the current location, moving direction, and moving speed of the communication terminal 3510 are calculated based on the changes of the landscape elements in the captured landscape. On the display screen 3513 in the right part of FIG. 35, based on the calculated current location, moving direction, and moving speed, and with reference to the map DB 2222, navigation information indicating the route to the user's destination (△△ Building) and an instruction comment 3516 indicating the predicted time to the destination (△△ Building) are superimposed.
 《Operation Procedure of the Information Processing System》
 FIG. 36 is a sequence diagram showing the operation procedure of the information processing system according to the present embodiment. Operation steps similar to those in FIG. 4 are given the same step numbers, and their description is omitted.
 In steps S400 and S401, although the applications and data may differ, downloading, activation, and initialization are performed as in FIG. 4.
 In step S3603, a destination is set on the communication terminal 3510 by user input. In steps S3605 and S3607, and in steps S3609 and S3611, successive videos are acquired and local features are generated. The successive videos are acquired at predetermined time intervals; the interval is appropriately set or adjusted depending on whether the user is walking or riding in a vehicle, or according to a provisionally measured moving speed of the user. In step S3613, the local features and feature point coordinates of the successive videos are encoded. Then, in step S3615, the destination and the local features of the successive images are transmitted from the communication terminal 3510 to the landscape element recognition server. Local features of at least two successive images are transmitted; local features of three or more successive images may also be transmitted.
 The landscape element recognition server refers to the local feature DB 2221, recognizes the landscape elements in the video in step S3617, and, in step S3619, calculates the angle at which each landscape element is imaged from the arrangement of the feature point coordinates.
 Next, with reference to the map DB 2222, the landscape element recognition server acquires the positions of the recognized landscape elements in step S3621, and calculates the user's current location from the landscape element positions and their imaging angles (see the fourth embodiment for details). In the example of FIG. 35, the user's current location is calculated from the recognized landscape elements ○○ Building and ×× Park, their positions on the map, and the imaging angle of each landscape element. In step S3623, the user's moving direction and moving speed are calculated from the change in the imaging angles of the landscape elements between the two videos (see the fourth embodiment for details). Since the user's current location, moving direction, and moving speed have now been calculated, in step S3625, based on that information and with reference to the map DB 2222, the route information to the destination and the predicted arrival time are calculated, and an instruction comment is generated.
 景観要素認識サーバは、ステップS3627において、ナビゲーション情報として、指示コメントと、指示コメントを表示する目標物の局所特徴量を送信する。また、算出された現在地、移動方向および移動速度、到着予測時刻を送信する。通信端末3510は、ステップS3629において、先に生成された映像の局所特徴量とナビゲーション用局所特徴量DB3621に格納された目標物を示す局所特徴量とを照合する。そして、映像上の目標物に指示コメントを表示する。図35の例では、○○ビルおよび××公園の間の道を指し示す指示コメント“左折れ、△△ビルまで6分”を表示する。 In step S3627, the landscape element recognition server transmits, as navigation information, the instruction comment and the local features of the target on which the comment is to be displayed. It also transmits the calculated current location, moving direction, moving speed, and estimated arrival time. In step S3629, the communication terminal 3510 collates the local features of the previously generated video with the local features indicating the target stored in the navigation local feature DB 3621. The instruction comment is then displayed on the target in the video. In the example of FIG. 35, the instruction comment "Turn left; 6 minutes to the △△ building", pointing to the road between the ○○ building and the ×× park, is displayed.
 ステップS3631においては、ナビゲーションの終了か否かを判定し、継続であればステップS3605に進んで、ナビゲーションを継続する。 In step S3631, it is determined whether or not the navigation is finished. If the navigation is continued, the process proceeds to step S3605 and the navigation is continued.
 (ナビゲーション用局所特徴量DB)
 図37は、本実施形態に係るナビゲーション用局所特徴量DB3621の構成を示す図である。
(Local feature DB for navigation)
FIG. 37 is a diagram showing a configuration of the navigation local feature DB 3621 according to this embodiment.
 ナビゲーション用局所特徴量DB3621には、景観要素認識サーバで算出された現在地3701と、通信端末3510で設定された目的地(△△ビル)3702と、景観要素認識サーバで算出された移動方向3703および移動速度3704が記憶される。 The navigation local feature DB 3621 stores the current location 3701 calculated by the landscape element recognition server, the destination (the △△ building) 3702 set on the communication terminal 3510, and the moving direction 3703 and moving speed 3704 calculated by the landscape element recognition server.
 また、局所特徴量記憶部3705には、経路上の景観要素に対応して局所特徴量が記憶される。また、指示コメント記憶部3706には、指示コメントと、コメントの表示条件と表示位置とが記憶される。 In the local feature amount storage unit 3705, local feature amounts are stored corresponding to the landscape elements on the route. The instruction comment storage unit 3706 stores an instruction comment, a comment display condition, and a display position.
 図35の例では、○○ビルの局所特徴量と、××公園の局所特徴量が景観要素認識サーバから通信端末3510にダウンロードされる。また、指示コメント“左折れ、(  )ビルまで( )分”と、コメント表示条件である映像中の○○ビルおよび××公園の出現、そして、表示位置の位置情報として○○ビルおよび××公園の間が記憶される。 In the example of FIG. 35, the local features of the ○○ building and the local features of the ×× park are downloaded from the landscape element recognition server to the communication terminal 3510. In addition, the instruction comment "Turn left; ( ) minutes to the ( ) building", the comment display condition (the appearance of the ○○ building and the ×× park in the video), and the display position (between the ○○ building and the ×× park) are stored.
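The display condition stored with the instruction comment can be evaluated as a simple subset test once the landscape elements in the current frame have been recognized. A sketch with hypothetical identifiers (the names are illustrations, not from the specification):

```python
def should_display(recognized_ids, condition_ids):
    """Show the instruction comment only when every landscape element in
    its display condition (e.g. the ○○ building and the ×× park) has been
    recognized in the current frame."""
    return set(condition_ids) <= set(recognized_ids)
```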
 [第6実施形態]
 次に、本発明の第6実施形態に係る情報処理システムについて説明する。本実施形態に係る情報処理システムは、上記第5実施形態と比べると、目標物を認識しながら目標物に向かって装置を自動誘導制御する点で異なる。その他の構成および動作は、第2実施形態と同様であるため、同じ構成および動作については同じ符号を付してその詳しい説明を省略する。
[Sixth Embodiment]
Next, an information processing system according to the sixth embodiment of the present invention will be described. The information processing system according to the present embodiment is different from the fifth embodiment in that the apparatus is automatically guided and controlled toward the target while recognizing the target. Since other configurations and operations are the same as those of the second embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.
 本実施形態によれば、映像中の画像内の景観要素に基づいて、リアルタイムに装置を自動誘導制御できる。 According to the present embodiment, the apparatus can be automatically guided and controlled in real time based on the landscape elements in the images in the video.
 《通信端末の表示画面例》
 図38は、本実施形態に係る情報処理システムにおける通信端末3810の表示画面例を示す図である。なお、表示画面は本実施形態の処理を説明するためのもので、本実施形態は自動誘導制御であるので、表示をしなくてもよい。
《Example of communication terminal display screen》
FIG. 38 is a diagram showing a display screen example of the communication terminal 3810 in the information processing system according to the present embodiment. The display screen is for explaining the processing of the present embodiment. Since the present embodiment is automatic guidance control, the display screen need not be displayed.
 図38の左図は、通信端末3810が撮像した景観の表示画面3811を示す。表示画面3811には、誘導の目標物(ターゲット)である飛行場の滑走路3811aが示されている。図38の中央図は、通信端末3810が空中をある距離だけ移動した後に撮像した景観の表示画面3812を示す。図38では、通信端末3810が上方に進んでいることが見られる。そして、目標物に対してコースから右に外れている。表示部があれば図のように警告が表示されるが、自動誘導制御であれば自動的に左に旋回してコース内に戻る制御が行なわれる。なお、本実施形態は誘導制御のための現在地、移動方向および移動速度のリアルタイム取得が特徴であり、誘導制御については詳細な説明は省略する。 The left diagram of FIG. 38 shows a landscape display screen 3811 captured by the communication terminal 3810. The display screen 3811 shows a runway 3811a of an airfield, which is the guidance target. The central diagram of FIG. 38 shows a landscape display screen 3812 captured after the communication terminal 3810 has moved a certain distance through the air. In FIG. 38, the communication terminal 3810 can be seen to be advancing upward, and it has strayed to the right of the course toward the target. If a display unit is present, a warning is displayed as shown in the figure; under automatic guidance control, the terminal is automatically steered to turn left and return onto the course. This embodiment is characterized by real-time acquisition of the current location, moving direction, and moving speed for guidance control; a detailed description of the guidance control itself is omitted.
 本実施形態における処理によれば、左図と中央図とを組み合わせて、撮像した景観中の景観要素の変化に基づいて通信端末3810の現在地と、移動方向および移動速度とを算出する。図38の右図の表示画面3813には、コース内に復帰してさらに接近した飛行場の滑走路3813aが示されている。表示部があれば図のように正常復帰が表示される。 According to the processing of this embodiment, the left and central diagrams are combined, and the current location, moving direction, and moving speed of the communication terminal 3810 are calculated based on the change of the landscape elements in the captured landscape. The display screen 3813 in the right diagram of FIG. 38 shows the runway 3813a of the airfield after the terminal has returned onto the course and approached further. If a display unit is present, normal return is displayed as shown in the figure.
 《情報処理システムの動作手順》
 図39は、本発明の第6実施形態に係る情報処理システムの動作手順を示すシーケンス図である。なお、第2実施形態の図4および図5と同様の動作手順には同じステップ番号を付して、説明を省略する。
<< Operation procedure of information processing system >>
FIG. 39 is a sequence diagram showing an operation procedure of the information processing system according to the sixth embodiment of the present invention. In addition, the same step number is attached | subjected to the operation | movement procedure similar to FIG. 4 and FIG. 5 of 2nd Embodiment, and description is abbreviate | omitted.
 ステップS400およびS401においては、アプリケーションやデータの相違の可能性はあるが、図4と同様にダウンロードおよび起動と初期化が行なわれる。 In steps S400 and S401, although there is a possibility of a difference between applications and data, downloading, activation and initialization are performed as in FIG.
 誘導制御コンピュータは、ステップS3911において、目標とする景観要素を指示する目標景観要素指示に従って、必要であれば目標景観要素の局所特徴量を生成して、局所特徴量DB2221に格納する。既に目標景観要素の局所特徴量が局所特徴量DB2221に格納されていれば、目標とする景観要素の景観要素IDを設定するのみでよい。 In step S3911, in accordance with the target landscape element instruction that specifies the target landscape element, the guidance control computer generates, if necessary, the local features of the target landscape element and stores them in the local feature DB 2221. If the local features of the target landscape element are already stored in the local feature DB 2221, only the landscape element ID of the target landscape element needs to be set.
 誘導制御コンピュータは、ステップS3913においては、通信端末3810から送信された映像の局所特徴量から、局所特徴量DB2221に格納された目標景観要素の局所特徴量と照合して目標景観要素を認識する。ステップS3915においては、認識した目標景観要素が映像中の所望位置にあるか否かを判定して、所望位置でなければ所望位置になるように位置補正する誘導制御を行なう。ステップS3917においては、誘導制御の終了か否かを判定し、終了でなければステップS3913に戻って、新たな映像の局所特徴量に基づいた自動誘導制御を継続する。 In step S3913, the guidance control computer recognizes the target landscape element by collating the local features of the video transmitted from the communication terminal 3810 with the local features of the target landscape element stored in the local feature DB 2221. In step S3915, it determines whether the recognized target landscape element is at the desired position in the video; if not, guidance control corrects the position so that it comes to the desired position. In step S3917, it determines whether the guidance control has ended; if not, the process returns to step S3913 and the automatic guidance control continues based on the local features of a new video frame.
 《誘導制御コンピュータの処理手順》
 図40は、本実施形態に係る誘導制御コンピュータの処理手順を示すフローチャートである。このフローチャートは、誘導制御コンピュータのCPUによってRAMを使用しながら実行される。なお、図40の処理手順において第2実施形態の図16と同様のステップには同じステップ番号を付して、説明は省略する。
<< Processing procedure of guidance control computer >>
FIG. 40 is a flowchart illustrating a processing procedure of the guidance control computer according to the present embodiment. This flowchart is executed using the RAM by the CPU of the guidance control computer. In the processing procedure of FIG. 40, steps similar to those in FIG. 16 of the second embodiment are denoted by the same step numbers and description thereof is omitted.
 まず、ステップS4011において、目標とする景観要素の指示か否かを判定する。また、ステップS1621において、通信端末からの局所特徴量受信かを判定する。いずれでもなければ、ステップS1631において他の処理を行なう。 First, in step S4011, it is determined whether or not it is an instruction for a target landscape element. In step S1621, it is determined whether a local feature amount is received from the communication terminal. Otherwise, other processing is performed in step S1631.
 目標景観要素の設定であればステップS4013に進んで、目標景観要素の局所特徴量を局所特徴量DB2221に記憶する。 If it is the setting of the target landscape element, the process proceeds to step S4013, and the local feature amount of the target landscape element is stored in the local feature amount DB 2221.
 また、局所特徴量の受信であればステップS1623に進んで、目標の景観要素認識処理を行なう。なお、本実施形態においては、目標景観要素のみを認識することを除いて、図18Aおよび図18Bの処理手順と同様であり、詳細は省略する。 If local features have been received, the process proceeds to step S1623, and the target landscape element recognition process is performed. This embodiment follows the processing procedure of FIGS. 18A and 18B except that only the target landscape element is recognized; details are omitted.
 次に、ステップS4025において、目標景観要素が映像中にあることが認識されたか否かを判定する。映像中に無ければステップS4029に進んで、映像中に目標景観要素があるように誘導制御による位置補正を行なう。一方、目標景観要素が映像中にあればステップS4027に進んで、映像中の目標景観要素の位置が所望位置かを判定する。なお、所望位置とは、映像中央や所定位置を含む領域内にある場合を示す。所望位置にあれば何もせずに終了する。しかし、所望位置になければステップS4029に進んで、映像中の所望位置に目標景観要素があるように誘導制御による位置補正を行なう。 Next, in step S4025, it is determined whether the target landscape element has been recognized in the video. If it is not in the video, the process proceeds to step S4029, and the position is corrected by guidance control so that the target landscape element comes into the video. If the target landscape element is in the video, the process proceeds to step S4027, where it is determined whether the target landscape element is at the desired position in the video. Here, the desired position means a region including the image center or another predetermined position. If the target is at the desired position, the process ends without further action; otherwise the process proceeds to step S4029, and the position is corrected by guidance control so that the target landscape element comes to the desired position in the video.
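The desired-position check and the position correction of steps S4027 and S4029 can be sketched as proportional control on the target's offset in the frame. The tolerance and gain values are illustrative assumptions:

```python
def correction_command(target_xy, frame_size, tolerance=0.1, gain=0.5):
    """Return a (dx, dy) steering correction that moves the recognized
    target toward the frame center, or None if the target is already
    within the tolerance band around the desired position."""
    cx, cy = frame_size[0] / 2, frame_size[1] / 2
    ex, ey = target_xy[0] - cx, target_xy[1] - cy  # offset from center
    if abs(ex) <= tolerance * frame_size[0] and abs(ey) <= tolerance * frame_size[1]:
        return None  # target already at the desired position
    # steer opposite to the offset (proportional control)
    return (-gain * ex, -gain * ey)
```

An actual autopilot would map this image-space correction onto the vehicle's control axes; only the decision logic is shown here.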
 [第7実施形態]
 次に、本発明の第7実施形態に係る情報処理システムについて説明する。本実施形態に係る情報処理システムは、上記第6実施形態と比べると、経路の景観を認識しながら目標物に向かって装置を自動誘導制御する点で異なる。その他の構成および動作は、第2実施形態と同様であるため、同じ構成および動作については同じ符号を付してその詳しい説明を省略する。
[Seventh Embodiment]
Next, an information processing system according to a seventh embodiment of the present invention will be described. The information processing system according to the present embodiment is different from the sixth embodiment in that the apparatus is automatically guided and controlled toward the target while recognizing the landscape of the route. Since other configurations and operations are the same as those of the second embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.
 本実施形態によれば、映像中の画像内の景観要素に基づいて、リアルタイムに装置を自動誘導制御できる。 According to the present embodiment, the apparatus can be automatically guided and controlled in real time based on the landscape elements in the images in the video.
 なお、第6実施形態の自動誘導制御と本実施形態の自動誘導制御は、装置と目標物との距離や装置の高度などの条件に合わせて切り替えて使用することが可能である。また、従来の電波やレーザなどを使用した自動誘導制御とも組み合わせることができる。一般に、地上から遠く離れた位置では電波による誘導、地上の景観要素が認識可能な高度では本実施形態の誘導、目標物が見える距離や高度では第6実施形態の誘導が、好適である。あるいは、視界がよいか悪いかなどの条件により誘導方法を切り替える事も可能である。 Note that the automatic guidance control of the sixth embodiment and the automatic guidance control of the present embodiment can be switched and used in accordance with conditions such as the distance between the device and the target and the altitude of the device. Further, it can be combined with conventional automatic guidance control using radio waves or lasers. In general, the guidance by radio waves is suitable for positions far from the ground, the guidance of this embodiment is suitable for altitudes where the landscape elements on the ground can be recognized, and the guidance of the sixth embodiment is suitable for distances and altitudes where the target can be seen. Alternatively, it is possible to switch the guidance method according to conditions such as whether the visibility is good or bad.
 《通信端末の表示画面例》
 図41は、本実施形態に係る情報処理システムにおける通信端末の表示画面例を示す図である。なお、表示画面は本実施形態の処理を説明するためのもので、本実施形態は自動誘導制御であるので、表示をしなくてもよい。
《Example of communication terminal display screen》
FIG. 41 is a diagram showing a display screen example of the communication terminal in the information processing system according to the present embodiment. The display screen is for explaining the processing of the present embodiment. Since the present embodiment is automatic guidance control, the display screen need not be displayed.
 図41の左図は、通信端末4110が撮像した空中から地上を撮像した景観の表示画面4111を示す。表示画面4111には、認識する景観要素の領域4111aが示されている。この領域4111a内の景観要素を、局所特徴量DB2221の局所特徴量との照合により認識する。そして、目標とする景観要素に向かうコース中のあるべき景観を記憶する経路画面DB4210と照合して、コース上を移動しているか否かを判定する。図41の左図においては、コース上を移動していると判定され、正常に○○町上空Amを飛行中であることを示している。 The left diagram of FIG. 41 shows a landscape display screen 4111 in which the communication terminal 4110 images the ground from the air. The display screen 4111 shows a region 4111a of landscape elements to be recognized. The landscape elements in the region 4111a are recognized by collation with the local features in the local feature DB 2221. They are then checked against the route screen DB 4210, which stores the landscape that should be seen on the course toward the target landscape element, to determine whether the terminal is moving on the course. In the left diagram of FIG. 41, it is determined that the terminal is moving on the course and is flying normally at altitude A m over ○○ town.
 図41の中央図は、通信端末4110が空中をある距離だけ移動した後に空中から地上を撮像した景観の表示画面4112を示す。図41では、通信端末4110が上方に進んでいることが見られる。表示画面4112には、認識する景観要素の領域4112aが示されている。この領域4112a内の景観要素を、局所特徴量DB2221の局所特徴量との照合により認識する。そして、目標とする景観要素に向かうコース中のあるべき景観を記憶する経路画面DB4210と照合して、コース上を移動しているか否かを判定する。図41の中央図においては、コース上を移動していると判定され、正常に△△町上空Bm(<Am)を飛行中であることを示している。 The central diagram of FIG. 41 shows a landscape display screen 4112 in which the ground is imaged from the air after the communication terminal 4110 has moved a certain distance. In FIG. 41, the communication terminal 4110 can be seen to be advancing upward. The display screen 4112 shows a region 4112a of landscape elements to be recognized. The landscape elements in the region 4112a are recognized by collation with the local features in the local feature DB 2221. They are then checked against the route screen DB 4210, which stores the landscape that should be seen on the course toward the target landscape element, to determine whether the terminal is moving on the course. In the central diagram of FIG. 41, it is determined that the terminal is moving on the course and is flying normally at altitude B m (< A m) over △△ town.
 図41の右図は、通信端末4110が空中をさらにある距離だけ移動した後に空中から地上を撮像した景観の表示画面4113を示す。表示画面4113には、認識する景観要素の領域4113aが示されている。この領域4113a内の景観要素を、局所特徴量DB2221の局所特徴量との照合により認識する。そして、目標とする景観要素に向かうコース中のあるべき景観を記憶する経路画面DB4210と照合して、コース上を移動しているか否かを判定する。図41の右図においては、コース上を移動していると判定され、正常に××町上空Cm(<Bm<Am)を飛行中であることを示している。 The right diagram of FIG. 41 shows a landscape display screen 4113 in which the ground is imaged from the air after the communication terminal 4110 has moved a further distance. The display screen 4113 shows a region 4113a of landscape elements to be recognized. The landscape elements in the region 4113a are recognized by collation with the local features in the local feature DB 2221. They are then checked against the route screen DB 4210, which stores the landscape that should be seen on the course toward the target landscape element, to determine whether the terminal is moving on the course. In the right diagram of FIG. 41, it is determined that the terminal is moving on the course and is flying normally at altitude C m (< B m < A m) over ×× town.
 《情報処理システムの動作手順》
 図42は、本実施形態に係る情報処理システムの動作手順を示すシーケンス図である。なお、第2実施形態の図4および図5と同様の動作手順には同じステップ番号を付して、説明を省略する。
<< Operation procedure of information processing system >>
FIG. 42 is a sequence diagram illustrating an operation procedure of the information processing system according to the present embodiment. In addition, the same step number is attached | subjected to the operation | movement procedure similar to FIG. 4 and FIG. 5 of 2nd Embodiment, and description is abbreviate | omitted.
 ステップS400およびS401においては、アプリケーションやデータの相違の可能性はあるが、図4と同様にダウンロードおよび起動と初期化が行なわれる。 In steps S400 and S401, although there is a possibility of a difference between applications and data, downloading, activation and initialization are performed as in FIG.
 誘導制御コンピュータは、ステップS4211において、目標とする景観要素の指示に従って、必要であれば目標景観要素および地図DB2222から取得した目標景観要素に向かうコースにある景観要素の局所特徴量を生成して、局所特徴量DB2221に格納する。既に景観要素の局所特徴量が局所特徴量DB2221に格納されていれば、各景観要素の景観要素IDを設定するのみでよい。次に、誘導制御コンピュータは、ステップS4213において、局所特徴量DB2221および地図DB2222を参照して、目標景観要素までのコース途中の複数箇所における映像を表わす経路画面を生成して、経路画面DB4210に経路画像保持を行なう。 In step S4211, in accordance with the instruction of the target landscape element, the guidance control computer generates, if necessary, the local features of the target landscape element and of the landscape elements on the course toward it acquired from the map DB 2222, and stores them in the local feature DB 2221. If the local features of a landscape element are already stored in the local feature DB 2221, only the landscape element ID of each element needs to be set. Next, in step S4213, the guidance control computer refers to the local feature DB 2221 and the map DB 2222, generates route screens representing the video to be seen at a plurality of points along the course to the target landscape element, and retains them as route images in the route screen DB 4210.
 誘導制御コンピュータは、ステップS4215においては、通信端末4110から送信された映像の局所特徴量から、局所特徴量DB2221に格納された景観要素の局所特徴量と照合して映像中の景観要素を認識する。ステップS4217においては、認識した複数の景観要素からなる景観認識を行ない、経路画面DB4210に格納された目標景観要素までのコースからの映像と一致するように、誘導制御により経路補正を行なう。ステップS4219においては、誘導制御の終了か否かを判定し、終了でなければステップS4215に戻って、新たな映像の局所特徴量に基づいた自動誘導制御を継続する。 In step S4215, the guidance control computer recognizes the landscape element in the video by comparing the local feature quantity of the video transmitted from the communication terminal 4110 with the local feature quantity of the landscape element stored in the local feature DB 2221. . In step S4217, landscape recognition including a plurality of recognized landscape elements is performed, and route correction is performed by guidance control so as to match the image from the course up to the target landscape element stored in the route screen DB 4210. In step S4219, it is determined whether or not the guidance control is finished. If not finished, the process returns to step S4215 and the automatic guidance control based on the local feature amount of the new video is continued.
 (経路画面DB)
 図43は、本実施形態に係る経路画面DB4210の構成を示す図である。
(Route screen DB)
FIG. 43 is a diagram showing a configuration of the route screen DB 4210 according to the present embodiment.
 経路画面DB4210は、目標景観要素までの経路に従って、撮像されるべき景観要素群の局所特徴量を順に記憶する景観要素群記憶部4310と、目標景観要素までの経路に従って、撮像されるべき映像の局所特徴量を記憶する映像記憶部4320とを有してよい。 The route screen DB 4210 may include a landscape element group storage unit 4310 that sequentially stores the local features of the groups of landscape elements to be imaged along the route to the target landscape element, and a video storage unit 4320 that stores the local features of the video to be imaged along that route.
 景観要素群記憶部4310は、目標景観要素4311までの経路ID4312に対応付けて、映像中にあるべき第1景観要素ID4313、第2景観要素ID4314のそれぞれについて、局所特徴量と各経路での映像中の相対位置とを記憶する。 The landscape element group storage unit 4310 stores, in association with the route ID 4312 to the target landscape element 4311, the local features and the relative position in the video on each route for each of the first landscape element ID 4313 and the second landscape element ID 4314 that should appear in the video.
 一方、映像記憶部4320は、目標景観要素4321までの経路ID4322に対応付けて、あるべき映像全体の局所特徴量を記憶する。 The video storage unit 4320, on the other hand, stores, in association with the route ID 4322 to the target landscape element 4321, the local features of the entire video that should be captured.
 本実施形態の自動誘導制御は、上記景観要素群記憶部4310と映像記憶部4320とのいずれか、あるいは両方を使用して行なわれる。 The automatic guidance control of this embodiment is performed using either or both of the landscape element group storage unit 4310 and the video storage unit 4320.
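Matching the landscape elements recognized in the current frame against a route-screen record could be sketched as below. The record layout and field names are assumptions for illustration; positions are expressed as fractions of the frame:

```python
def route_deviation(recognized, screen):
    """Mean positional offset between the landscape elements recognized
    in the current frame and the positions the route-screen record says
    they should occupy at this point of the route.

    `recognized` maps element IDs to (x, y) positions in the frame;
    `screen` is a record like
        {"route_id": "R1", "elements": [("E1", (0.3, 0.5)), ...]}.
    Returns None when an expected element is missing (likely off course).
    """
    offsets = []
    for eid, (ex, ey) in screen["elements"]:
        if eid not in recognized:
            return None          # expected element not in view
        rx, ry = recognized[eid]
        offsets.append((rx - ex, ry - ey))
    n = len(offsets)
    return (sum(o[0] for o in offsets) / n, sum(o[1] for o in offsets) / n)
```

The returned offset (or the missing-element case) would then drive the route correction of the guidance control.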
 《誘導制御コンピュータの処理手順》
 図44は、本実施形態に係る誘導制御コンピュータの処理手順を示すフローチャートである。このフローチャートは、誘導制御コンピュータのCPUによってRAMを使用しながら実行される。なお、図44の処理手順において第2実施形態の図16または第6実施形態の図40と同様のステップには同じステップ番号を付して、説明は省略する。
<< Processing procedure of guidance control computer >>
FIG. 44 is a flowchart showing a processing procedure of the guidance control computer according to the present embodiment. This flowchart is executed using the RAM by the CPU of the guidance control computer. In the processing procedure of FIG. 44, steps similar to those in FIG. 16 of the second embodiment or FIG. 40 of the sixth embodiment are denoted by the same step numbers, and description thereof is omitted.
 まず、ステップS4011において、目標とする景観要素の指示か否かを判定する。また、ステップS1621において、通信端末からの局所特徴量受信かを判定する。いずれでもなければ、ステップS1631において他の処理を行なう。 First, in step S4011, it is determined whether or not it is an instruction for a target landscape element. In step S1621, it is determined whether a local feature amount is received from the communication terminal. Otherwise, other processing is performed in step S1631.
 目標景観要素の設定であればステップS4413に進んで、目標景観要素までのコース途中における経路画面を経路画面DB4210に記憶する。 If the target landscape element is set, the process proceeds to step S4413, and the route screen in the course of the course to the target landscape element is stored in the route screen DB 4210.
 また、局所特徴量の受信であればステップS1623に進んで、景観要素認識処理を行なう。次に、ステップS4425において、経路画面DB4210を参照して、映像中の認識した景観要素の配置を分析する。ステップS4437においては、経路画面DB4210の経路画面と映像中の画面との照合によって、ズレがあれば経路補正処理を行なう。 If local features have been received, the process proceeds to step S1623, and the landscape element recognition process is performed. Next, in step S4425, the arrangement of the recognized landscape elements in the video is analyzed with reference to the route screen DB 4210. In step S4437, the route screen in the route screen DB 4210 is compared with the screen in the video, and route correction processing is performed if there is a deviation.
 [第8実施形態]
 次に、本発明の第8実施形態に係る情報処理システムについて説明する。本実施形態に係る情報処理システムは、上記第1実施形態乃至第7実施形態と比べると、通信端末が景観要素認識を含む全ての処理を行なう点で異なる。その他の構成および動作は、第2実施形態と同様であるため、同じ構成および動作については同じ符号を付してその詳しい説明を省略する。
[Eighth Embodiment]
Next, an information processing system according to an eighth embodiment of the present invention will be described. The information processing system according to the present embodiment is different from the first to seventh embodiments in that the communication terminal performs all processes including landscape element recognition. Since other configurations and operations are the same as those of the second embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.
 本実施形態によれば、映像中の画像内の景観要素に基づいて、通信端末のみで全ての処理を行なうことができる。 According to this embodiment, the communication terminal alone can perform all the processing, based on the landscape elements in the images of the video.
 《通信端末の機能構成》
 図45は、本実施形態に係る通信端末4510の機能構成を示すブロック図である。なお、図45において第2実施形態の図6あるいは第4実施形態の図23と同様の機能構成部には同じ参照番号を付して、説明を省略する。
<Functional configuration of communication terminal>
FIG. 45 is a block diagram showing a functional configuration of a communication terminal 4510 according to this embodiment. 45, the same reference numerals are given to the same functional components as those in FIG. 6 of the second embodiment or FIG. 23 of the fourth embodiment, and description thereof will be omitted.
 景観要素認識部4501は、局所特徴量生成部602が生成した映像の局所特徴量と、局所特徴量DB4502に格納された景観要素の局所特徴量を照合して、映像中の景観要素を認識する。景観要素記憶部4503は、少なくとも1つの以前に認識した景観要素を記憶する。景観要素比較部4504は、景観要素記憶部4503に記憶された景観要素と、現在、景観要素認識部4501において認識された景観要素とを比較する。 The landscape element recognition unit 4501 recognizes the landscape element in the video by collating the local feature amount of the image generated by the local feature amount generation unit 602 with the local feature amount of the landscape element stored in the local feature amount DB 4502. . The landscape element storage unit 4503 stores at least one previously recognized landscape element. The landscape element comparison unit 4504 compares the landscape element stored in the landscape element storage unit 4503 with the landscape element currently recognized by the landscape element recognition unit 4501.
 そして、移動方向/速度算出部4505は、地図DB4506を参照して、景観要素の撮像角度の変化に基づいて、ユーザの移動方向および移動速度を算出する。また、現在地算出部4507は、地図DB4506を参照して、複数の景観要素の撮像角度に基づいて、ユーザの現在地を算出する。 Then, the moving direction/speed calculation unit 4505 refers to the map DB 4506 and calculates the user's moving direction and moving speed based on the change in the imaging angles of the landscape elements. The current location calculation unit 4507 refers to the map DB 4506 and calculates the user's current location based on the imaging angles of a plurality of landscape elements.
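Once two position fixes have been derived from the landscape-element angles, the moving direction and moving speed follow directly. A sketch assuming planar coordinates in meters and a compass-bearing convention (degrees, clockwise from north):

```python
import math

def motion_from_fixes(p1, p2, dt):
    """Moving direction (compass bearing in degrees) and speed (m/s)
    from two position fixes p1, p2 taken dt seconds apart."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    speed = math.hypot(dx, dy) / dt
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0  # clockwise from north
    return bearing, speed
```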
 ナビゲーション情報生成部4508は、地図DB4506を参照して、算出されたユーザの現在地と、算出されたユーザの移動方向および移動速度とに基づいて、目的地に向かうナビゲーション情報を生成する。ナビゲーション情報報知部4509は、生成されたナビゲーション情報をユーザに報知する。 The navigation information generation unit 4508 refers to the map DB 4506 and generates navigation information toward the destination based on the calculated current location and the calculated moving direction and moving speed of the user. The navigation information notification unit 4509 notifies the user of the generated navigation information.
 [他の実施形態]
 以上、実施形態を参照して本発明を説明したが、本発明は上記実施形態に限定されものではない。本発明の構成や詳細には、本発明のスコープ内で当業者が理解し得る様々な変更をすることができる。また、それぞれの実施形態に含まれる別々の特徴を如何様に組み合わせたシステムまたは装置も、本発明の範疇に含まれる。
[Other Embodiments]
Although the present invention has been described with reference to the embodiments, the present invention is not limited to the above embodiments. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention. In addition, a system or an apparatus in which different features included in each embodiment are combined in any way is also included in the scope of the present invention.
 また、本発明は、複数の機器から構成されるシステムに適用されてもよいし、単体の装置に適用されてもよい。さらに、本発明は、実施形態の機能を実現する制御プログラムが、システムあるいは装置に直接あるいは遠隔から供給される場合にも適用可能である。したがって、本発明の機能をコンピュータで実現するために、コンピュータにインストールされる制御プログラム、あるいはその制御プログラムを格納した媒体、その制御プログラムをダウンロードさせるWWW(World Wide Web)サーバも、本発明の範疇に含まれる。 The present invention may be applied to a system composed of a plurality of devices or to a single apparatus. Furthermore, the present invention is also applicable when a control program that realizes the functions of the embodiments is supplied to a system or apparatus directly or remotely. Therefore, in order to realize the functions of the present invention on a computer, a control program installed in the computer, a medium storing that control program, and a WWW (World Wide Web) server from which the control program is downloaded are also included in the scope of the present invention.
 この出願は、2012年1月30日に出願された日本出願特願2012-017386を基礎とする優先権を主張し、その開示の全てをここに取り込む。 This application claims priority based on Japanese Patent Application No. 2012-017386 filed on January 30, 2012, the entire disclosure of which is incorporated herein.
 本実施形態の一部又は全部は、以下の付記のようにも記載されうるが、以下には限られない。 Some or all of this embodiment can be described as in the following supplementary notes, but is not limited to the following.
 (付記1) (Supplementary note 1)
 景観要素と、前記景観要素の画像のm個の特徴点のそれぞれを含むm個の局所領域のそれぞれについて生成された、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量とを、対応付けて記憶する第1局所特徴量記憶手段と、 First local feature storage means for storing, in association with a landscape element, m first local features generated for each of m local regions containing the respective m feature points of an image of the landscape element, each first local feature consisting of a feature vector of dimensions 1 to i;
 撮像手段が撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を生成する第2局所特徴量生成手段と、 Second local feature generation means for extracting n feature points from an image in a video captured by imaging means and generating, for n local regions each containing one of the n feature points, n second local features each consisting of a feature vector of dimensions 1 to j; and
 前記第1局所特徴量の特徴ベクトルの次元数iおよび前記第2局所特徴量の特徴ベクトルの次元数jのうち、より少ない次元数を選択し、選択された前記次元数までの特徴ベクトルからなる前記n個の第2局所特徴量に、選択された前記次元数までの特徴ベクトルからなる前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記景観要素が存在すると認識する景観要素認識手段と、 Landscape element recognition means for selecting the smaller of the dimension count i of the feature vectors of the first local features and the dimension count j of the feature vectors of the second local features, and recognizing that the landscape element is present in the image in the video when it determines that a predetermined proportion or more of the m first local features, reduced to feature vectors of the selected dimension count, correspond to the n second local features, reduced to feature vectors of the selected dimension count,
 を備えることを特徴とする情報処理システム。 An information processing system characterized by comprising the above.
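The dimension-selecting matching described in supplementary note 1 can be sketched as follows: both feature sets are truncated to the smaller of the two dimensionalities before comparison. Euclidean distance and the `ratio` and `dist_th` thresholds are illustrative assumptions, not values from the claim.

```python
import numpy as np

def recognize(first_feats, i_dims, second_feats, j_dims,
              ratio=0.5, dist_th=0.3):
    """Report the landscape element as present when at least `ratio` of
    the m stored first local features have a close counterpart
    (Euclidean distance <= dist_th) among the n query second local
    features, after both sets are truncated to min(i_dims, j_dims)."""
    d = min(i_dims, j_dims)                      # select the smaller dimension count
    a = np.asarray(first_feats, float)[:, :d]    # m stored features, truncated
    b = np.asarray(second_feats, float)[:, :d]   # n query features, truncated
    matched = sum(np.linalg.norm(b - f, axis=1).min() <= dist_th for f in a)
    return matched >= ratio * len(a)
```

Truncation works here because, per supplementary note 15, the dimensions are ordered so that the leading ones contribute most to the feature.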
 (付記2)
 前記景観要素は、自然景観を構成する景観要素と、人工景観を構成する建造物とを含むことを特徴とする付記1に記載の情報処理システム。
 (付記3)
 前記景観要素認識手段の認識結果を報知する報知手段をさらに備えることを特徴とする付記1または2に記載の情報処理システム。
 (付記4)
 前記報知手段は、さらに、前記認識結果に関連する情報を報知することを特徴とする付記3に記載の情報処理システム。
 (付記5)
 前記報知手段は、さらに、前記認識結果に関連する情報を取得するためのリンク情報を報知することを特徴とする付記3または4に記載の情報処理システム。
 (付記6)
 前記報知手段は、前記認識結果に関連する情報をリンク情報に従って取得する関連情報取得手段を有し、
 リンク情報に従って取得した前記関連情報を報知することを特徴とする付記3に記載の情報処理システム。
 (付記7)
 複数の景観要素の位置および撮像角度を算出する位置/角度算出手段と、
 前記複数の景観要素の位置および撮像角度から現在地を算出する現在地算出手段と、
 をさらに備えることを特徴とする付記1乃至6のいずれか1つに記載の情報処理システム。
 (付記8)
 連続画像内の景観要素の撮像角度を算出する角度算出手段と、
 前記連続画像内の景観要素の撮像角度から撮像位置の移動方向および移動速度を算出する移動方向/速度算出手段と、
 をさらに備えることを特徴とする付記1乃至7のいずれか1つに記載の情報処理システム。
 (付記9)
 目的地を設定する目的地設定手段と、
 複数の景観要素の位置および撮像角度を算出する位置/角度算出手段と、
 前記複数の景観要素の位置および撮像角度から現在地を算出する現在地算出手段と、
 前記目的地と前記現在地とに基づいて、前記目的地に向かう指示コメントを、前記現在地から前記目的地への経路上に存在する景観要素の局所特徴量と対応付けて生成する指示コメント生成手段と、
 前記撮像手段が撮像した映像中の景観要素に対応付けて、前記指示コメントを表示する表示手段と、
 をさらに備えることを特徴とする付記1乃至6のいずれか1つに記載の情報処理システム。
 (付記10)
 目標とする景観要素を指示する目標景観要素指示手段と、
 前記目標とする景観要素が、前記撮像手段が撮像した映像中の所望位置になるように撮像位置を制御する誘導制御手段と、
 をさらに備えることを特徴とする付記1乃至8のいずれか1つに記載の情報処理システム。
 (付記11)
 前記第1局所特徴量記憶手段は、複数の景観要素にそれぞれ対応付けて各景観要素の画像から生成した前記m個の第1局所特徴量を記憶し、
 前記景観要素認識手段が認識した、前記撮像手段が撮像した前記画像に含まれる複数の景観要素の配置に基づいて景観を認識する景観認識手段を備えることを特徴とする付記1乃至8のいずれか1つに記載の情報処理システム。
 (付記12)
 目標とする景観要素を指示する目標景観要素指示手段と、
 前記目標とする景観要素までのコース途中の画像内の景観要素を、局所特徴量に対応付けて保持する経路画像保持手段と、
 前記撮像手段の撮像した映像中の所定位置に所望の景観要素が存在するように撮像位置を制御する誘導制御手段と、
 をさらに備えることを特徴とする付記1乃至8のいずれか1つに記載の情報処理システム。
 (付記13)
 前記第1局所特徴量および前記第2局所特徴量は、画像から抽出した特徴点を含む局所領域を複数のサブ領域に分割し、前記複数のサブ領域内の勾配方向のヒストグラムからなる複数の次元の特徴ベクトルを生成することにより生成されることを特徴とする付記1乃至12のいずれか1つに記載の情報処理システム。
 (付記14)
 前記第1局所特徴量および前記第2局所特徴量は、前記生成した複数の次元の特徴ベクトルから、隣接するサブ領域間の相関がより大きな次元を削除することにより生成されることを特徴とする付記13に記載の情報処理システム。
 (付記15)
 前記特徴ベクトルの複数の次元は、前記特徴点の特徴に寄与する次元から順に、かつ、前記局所特徴量に対して求められる精度の向上に応じて第1次元から順に選択できるよう、所定の次元数ごとに前記局所領域をひと回りするよう配列することを特徴とする付記13または14に記載の情報処理システム。
 (付記16)
 前記第2局所特徴量生成手段は、前記景観要素の相関に対応して、他の景観要素とより低い前記相関を有する景観要素については次元数のより少ない前記第2局所特徴量を生成することを特徴とする付記15に記載の情報処理システム。
 (付記17)
 前記第1局所特徴量記憶手段は、前記景観要素の相関に対応して、他の景観要素とより低い前記相関を有する景観要素については次元数のより少ない前記第1局所特徴量を記憶することを特徴とする付記15または16に記載の情報処理システム。
 (付記18)
 景観要素と、前記景観要素の画像のm個の特徴点のそれぞれを含むm個の局所領域のそれぞれについて生成された、それぞれ1次元からi次元までの特徴ベクトルからなるm個の第1局所特徴量とを、対応付けて記憶する第1局所特徴量記憶手段を備えた情報処理システムを用いた情報処理方法であって、
 撮像手段が撮像した映像中の画像からn個の特徴点を抽出し、前記n個の特徴点のそれぞれを含むn個の局所領域について、それぞれ1次元からj次元までの特徴ベクトルからなるn個の第2局所特徴量を生成する第2局所特徴量生成ステップと、
 前記第1局所特徴量の特徴ベクトルの次元数iおよび前記第2局所特徴量の特徴ベクトルの次元数jのうち、より少ない次元数を選択し、選択された前記次元数までの特徴ベクトルからなる前記n個の第2局所特徴量に、選択された前記次元数までの特徴ベクトルからなる前記m個の第1局所特徴量の所定割合以上が対応すると判定した場合に、前記映像中の前記画像に前記景観要素が存在すると認識する認識ステップと、
 を含むことを特徴とする情報処理方法。
 (Appendix 19)
 A communication terminal comprising:
 second local feature generating means for extracting n feature points from an image in a video captured by imaging means and generating n second local features, each consisting of a feature vector of 1 to j dimensions, for n local regions each containing one of the n feature points;
 first transmitting means for transmitting the n second local features to an information processing device that recognizes, based on matching of local features, the landscape elements contained in the captured image; and
 first receiving means for receiving, from the information processing device, information indicating the landscape elements contained in the captured image.
 (Appendix 20)
 A method of controlling a communication terminal, the method comprising:
 a second local feature generation step of extracting n feature points from an image in a video captured by imaging means and generating n second local features, each consisting of a feature vector of 1 to j dimensions, for n local regions each containing one of the n feature points;
 a first transmission step of transmitting the n second local features to an information processing device that recognizes, based on matching of local features, the landscape elements contained in the captured image; and
 a first reception step of receiving, from the information processing device, information indicating the landscape elements contained in the captured image.
 (Appendix 21)
 A control program for a communication terminal, the program causing a computer to execute:
 a second local feature generation step of extracting n feature points from an image in a video captured by imaging means and generating n second local features, each consisting of a feature vector of 1 to j dimensions, for n local regions each containing one of the n feature points;
 a first transmission step of transmitting the n second local features to an information processing device that recognizes, based on matching of local features, the landscape elements contained in the captured image; and
 a first reception step of receiving, from the information processing device, information indicating the landscape elements contained in the captured image.
 (Appendix 22)
 An information processing device comprising:
 first local feature storage means for storing a landscape element in association with m first local features, each consisting of a feature vector of 1 to i dimensions, generated for m local regions each containing one of m feature points of an image of the landscape element;
 second receiving means for receiving, from a communication terminal, n second local features, each consisting of a feature vector of 1 to j dimensions, generated for n local regions each containing one of n feature points extracted from an image in a video captured by the communication terminal;
 recognition means for selecting the smaller of the number of dimensions i of the feature vectors of the first local features and the number of dimensions j of the feature vectors of the second local features, and recognizing that the landscape element is present in the image in the video when it is determined that at least a predetermined proportion of the m first local features, each taken as a feature vector up to the selected number of dimensions, correspond to the n second local features, each likewise taken as a feature vector up to the selected number of dimensions; and
 second transmitting means for transmitting information indicating the recognized landscape element to the communication terminal.
 (Appendix 23)
 A method of controlling an information processing device comprising first local feature storage means for storing a landscape element in association with m first local features, each consisting of a feature vector of 1 to i dimensions, generated for m local regions each containing one of m feature points of an image of the landscape element, the method comprising:
 a second reception step of receiving, from a communication terminal, n second local features, each consisting of a feature vector of 1 to j dimensions, generated for n local regions each containing one of n feature points extracted from an image in a video captured by the communication terminal;
 a recognition step of selecting the smaller of the number of dimensions i of the feature vectors of the first local features and the number of dimensions j of the feature vectors of the second local features, and recognizing that the landscape element is present in the image in the video when it is determined that at least a predetermined proportion of the m first local features, each taken as a feature vector up to the selected number of dimensions, correspond to the n second local features, each likewise taken as a feature vector up to the selected number of dimensions; and
 a second transmission step of transmitting information indicating the recognized landscape element to the communication terminal.
 (Appendix 24)
 A control program for an information processing device comprising first local feature storage means for storing a landscape element in association with m first local features, each consisting of a feature vector of 1 to i dimensions, generated for m local regions each containing one of m feature points of an image of the landscape element, the program causing a computer to execute:
 a second reception step of receiving, from a communication terminal, n second local features, each consisting of a feature vector of 1 to j dimensions, generated for n local regions each containing one of n feature points extracted from an image in a video captured by the communication terminal;
 a recognition step of selecting the smaller of the number of dimensions i of the feature vectors of the first local features and the number of dimensions j of the feature vectors of the second local features, and recognizing that the landscape element is present in the image in the video when it is determined that at least a predetermined proportion of the m first local features, each taken as a feature vector up to the selected number of dimensions, correspond to the n second local features, each likewise taken as a feature vector up to the selected number of dimensions; and
 a second transmission step of transmitting information indicating the recognized landscape element to the communication terminal.
(Appendix 1)
An information processing system comprising:
first local feature storage means for storing a landscape element in association with m first local features, each consisting of a feature vector of 1 to i dimensions, generated for m local regions each containing one of m feature points of an image of the landscape element;
second local feature generating means for extracting n feature points from an image in a video captured by imaging means and generating n second local features, each consisting of a feature vector of 1 to j dimensions, for n local regions each containing one of the n feature points; and
landscape element recognition means for selecting the smaller of the number of dimensions i of the feature vectors of the first local features and the number of dimensions j of the feature vectors of the second local features, and recognizing that the landscape element is present in the image in the video when it is determined that at least a predetermined proportion of the m first local features, each taken as a feature vector up to the selected number of dimensions, correspond to the n second local features, each likewise taken as a feature vector up to the selected number of dimensions.
(Appendix 2)
The information processing system according to appendix 1, wherein the landscape elements include landscape elements constituting natural scenery and buildings constituting artificial scenery.
(Appendix 3)
The information processing system according to appendix 1 or 2, further comprising notification means for reporting the recognition result of the landscape element recognition means.
(Appendix 4)
The information processing system according to appendix 3, wherein the notification means further reports information related to the recognition result.
(Appendix 5)
The information processing system according to appendix 3 or 4, wherein the notification means further reports link information for acquiring information related to the recognition result.
(Appendix 6)
The information processing system according to appendix 3, wherein the notification means includes related information acquiring means for acquiring information related to the recognition result according to link information, and reports the related information acquired according to the link information.
(Appendix 7)
The information processing system according to any one of appendices 1 to 6, further comprising:
position/angle calculating means for calculating positions and imaging angles of a plurality of landscape elements; and
current location calculating means for calculating a current location from the positions and imaging angles of the plurality of landscape elements.

Claims (24)

  1.  An information processing system comprising:
     first local feature storage means for storing a landscape element in association with m first local features, each consisting of a feature vector of 1 to i dimensions, generated for m local regions each containing one of m feature points of an image of the landscape element;
     second local feature generating means for extracting n feature points from an image in a video captured by imaging means and generating n second local features, each consisting of a feature vector of 1 to j dimensions, for n local regions each containing one of the n feature points; and
     landscape element recognition means for selecting the smaller of the number of dimensions i of the feature vectors of the first local features and the number of dimensions j of the feature vectors of the second local features, and recognizing that the landscape element is present in the image in the video when it is determined that at least a predetermined proportion of the m first local features, each taken as a feature vector up to the selected number of dimensions, correspond to the n second local features, each likewise taken as a feature vector up to the selected number of dimensions.
  2.  The information processing system according to claim 1, wherein the landscape elements include landscape elements constituting natural scenery and buildings constituting artificial scenery.
  3.  The information processing system according to claim 1 or 2, further comprising notification means for reporting the recognition result of the landscape element recognition means.
  4.  The information processing system according to claim 3, wherein the notification means further reports information related to the recognition result.
  5.  The information processing system according to claim 3 or 4, wherein the notification means further reports link information for acquiring information related to the recognition result.
  6.  The information processing system according to claim 3, wherein the notification means includes related information acquiring means for acquiring information related to the recognition result according to link information, and reports the related information acquired according to the link information.
  7.  The information processing system according to any one of claims 1 to 6, further comprising:
     position/angle calculating means for calculating positions and imaging angles of a plurality of landscape elements; and
     current location calculating means for calculating a current location from the positions and imaging angles of the plurality of landscape elements.
  8.  The information processing system according to any one of claims 1 to 7, further comprising:
     angle calculating means for calculating imaging angles of a landscape element in successive images; and
     moving direction/speed calculating means for calculating a moving direction and a moving speed of the imaging position from the imaging angles of the landscape element in the successive images.
  9.  The information processing system according to any one of claims 1 to 6, further comprising:
     destination setting means for setting a destination;
     position/angle calculating means for calculating positions and imaging angles of a plurality of landscape elements;
     current location calculating means for calculating a current location from the positions and imaging angles of the plurality of landscape elements;
     instruction comment generating means for generating, based on the destination and the current location, instruction comments directing toward the destination, in association with the local features of landscape elements present on the route from the current location to the destination; and
     display means for displaying the instruction comments in association with the landscape elements in the video captured by the imaging means.
  10.  The information processing system according to any one of claims 1 to 8, further comprising:
     target landscape element designating means for designating a target landscape element; and
     guidance control means for controlling the imaging position so that the target landscape element appears at a desired position in the video captured by the imaging means.
  11.  The information processing system according to any one of claims 1 to 8, wherein the first local feature storage means stores the m first local features generated from an image of each of a plurality of landscape elements in association with that landscape element, and
     the system further comprises landscape recognition means for recognizing a landscape based on the arrangement, recognized by the landscape element recognition means, of the plurality of landscape elements contained in the image captured by the imaging means.
  12.  The information processing system according to any one of claims 1 to 8, further comprising:
     target landscape element designating means for designating a target landscape element;
     route image holding means for holding, in association with their local features, the landscape elements in images along the course to the target landscape element; and
     guidance control means for controlling the imaging position so that a desired landscape element appears at a predetermined position in the video captured by the imaging means.
  13.  The information processing system according to any one of claims 1 to 12, wherein the first local features and the second local features are generated by dividing a local region containing a feature point extracted from an image into a plurality of sub-regions and generating a feature vector of a plurality of dimensions consisting of histograms of gradient directions within the plurality of sub-regions.
  14.  The information processing system according to claim 13, wherein the first local features and the second local features are generated by deleting, from the generated feature vector, dimensions whose correlation between adjacent sub-regions is larger.
  15.  The information processing system according to claim 13 or 14, wherein the dimensions of the feature vector are arranged so as to cycle around the local region every predetermined number of dimensions, so that they can be selected in order of their contribution to the features of the feature point and, as the accuracy required of the local feature increases, in order from the first dimension.
  16.  The information processing system according to claim 15, wherein, according to the correlation between landscape elements, the second local feature generating means generates second local features with fewer dimensions for a landscape element having lower correlation with other landscape elements.
  17.  The information processing system according to claim 15 or 16, wherein, according to the correlation between landscape elements, the first local feature storage means stores first local features with fewer dimensions for a landscape element having lower correlation with other landscape elements.
  18.  An information processing method using an information processing system comprising first local feature storage means for storing a landscape element in association with m first local features, each consisting of a feature vector of 1 to i dimensions, generated for m local regions each containing one of m feature points of an image of the landscape element, the method comprising:
     a second local feature generation step of extracting n feature points from an image in a video captured by imaging means and generating n second local features, each consisting of a feature vector of 1 to j dimensions, for n local regions each containing one of the n feature points; and
     a recognition step of selecting the smaller of the number of dimensions i of the feature vectors of the first local features and the number of dimensions j of the feature vectors of the second local features, and recognizing that the landscape element is present in the image in the video when it is determined that at least a predetermined proportion of the m first local features, each taken as a feature vector up to the selected number of dimensions, correspond to the n second local features, each likewise taken as a feature vector up to the selected number of dimensions.
  19.  A communication terminal comprising:
     second local feature generating means for extracting n feature points from an image in a video captured by imaging means and generating n second local features, each consisting of a feature vector of 1 to j dimensions, for n local regions each containing one of the n feature points;
     first transmitting means for transmitting the n second local features to an information processing device that recognizes, based on matching of local features, the landscape elements contained in the captured image; and
     first receiving means for receiving, from the information processing device, information indicating the landscape elements contained in the captured image.
  20.  A method of controlling a communication terminal, the method comprising:
     a second local feature generation step of extracting n feature points from an image in a video captured by imaging means and generating n second local features, each consisting of a feature vector of 1 to j dimensions, for n local regions each containing one of the n feature points;
     a first transmission step of transmitting the n second local features to an information processing device that recognizes, based on matching of local features, the landscape elements contained in the captured image; and
     a first reception step of receiving, from the information processing device, information indicating the landscape elements contained in the captured image.
  21.  A control program for a communication terminal, the program causing a computer to execute:
     a second local feature generation step of extracting n feature points from an image in a video captured by an imaging means, and generating, for n local regions each containing one of the n feature points, n second local features each consisting of a feature vector of 1 to j dimensions;
     a first transmission step of transmitting the n second local features to an information processing apparatus that recognizes, based on matching of local features, a landscape element included in the captured image; and
     a first reception step of receiving, from the information processing apparatus, information indicating a landscape element included in the captured image.
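Claims 20 and 21 describe the terminal-side processing: pick n feature points from a captured frame and build, for the local region around each point, a feature vector of up to j dimensions. The sketch below is illustrative only and not the claimed implementation: it uses plain NumPy, takes gradient-magnitude maxima as feature points, and substitutes a normalized pixel patch for a real gradient-histogram descriptor such as SIFT; the function name and parameters (`extract_local_features`, `patch`, `j`) are hypothetical.

```python
import numpy as np

def extract_local_features(image, n=10, patch=4, j=16):
    """Sketch of the claimed terminal-side step: n feature points,
    each with a j-dimensional second local feature vector."""
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    # Zero out a border so every local region fits inside the image.
    mag[:patch, :] = 0
    mag[-patch:, :] = 0
    mag[:, :patch] = 0
    mag[:, -patch:] = 0
    # Keep the n strongest gradient responses as feature points.
    flat = np.argsort(mag, axis=None)[-n:]
    ys, xs = np.unravel_index(flat, mag.shape)
    descriptors = []
    for y, x in zip(ys, xs):
        region = image[y - patch:y + patch, x - patch:x + patch].astype(float)
        v = region.flatten()[:j]              # truncate to j dimensions
        if v.size < j:
            v = np.pad(v, (0, j - v.size))
        norm = np.linalg.norm(v)
        descriptors.append(v / norm if norm > 0 else v)
    return np.column_stack([ys, xs]), np.array(descriptors)
```

Truncating each descriptor to its first j values mirrors the variable-dimension feature vectors of the claims: a server holding i-dimensional first local features can later compare against only the first min(i, j) dimensions.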
  22.  An information processing apparatus comprising:
     first local feature storage means for storing a landscape element in association with m first local features, each consisting of a feature vector of 1 to i dimensions, generated for m local regions each containing one of m feature points in an image of the landscape element;
     second receiving means for receiving, from a communication terminal, n second local features, each consisting of a feature vector of 1 to j dimensions, generated for n local regions each containing one of n feature points extracted from an image in a video captured by the communication terminal;
     recognition means for selecting the smaller of the dimension count i of the feature vectors of the first local features and the dimension count j of the feature vectors of the second local features, and recognizing that the landscape element is present in the image in the video when it is determined that at least a predetermined proportion of the m first local features, each truncated to a feature vector of the selected dimension count, correspond to the n second local features likewise truncated; and
     second transmitting means for transmitting information indicating the recognized landscape element to the communication terminal.
  23.  A control method for an information processing apparatus comprising first local feature storage means for storing a landscape element in association with m first local features, each consisting of a feature vector of 1 to i dimensions, generated for m local regions each containing one of m feature points in an image of the landscape element, the method comprising:
     a second reception step of receiving, from a communication terminal, n second local features, each consisting of a feature vector of 1 to j dimensions, generated for n local regions each containing one of n feature points extracted from an image in a video captured by the communication terminal;
     a recognition step of selecting the smaller of the dimension count i of the feature vectors of the first local features and the dimension count j of the feature vectors of the second local features, and recognizing that the landscape element is present in the image in the video when it is determined that at least a predetermined proportion of the m first local features, each truncated to a feature vector of the selected dimension count, correspond to the n second local features likewise truncated; and
     a second transmission step of transmitting information indicating the recognized landscape element to the communication terminal.
  24.  A control program for an information processing apparatus comprising first local feature storage means for storing a landscape element in association with m first local features, each consisting of a feature vector of 1 to i dimensions, generated for m local regions each containing one of m feature points in an image of the landscape element, the program causing a computer to execute:
     a second reception step of receiving, from a communication terminal, n second local features, each consisting of a feature vector of 1 to j dimensions, generated for n local regions each containing one of n feature points extracted from an image in a video captured by the communication terminal;
     a recognition step of selecting the smaller of the dimension count i of the feature vectors of the first local features and the dimension count j of the feature vectors of the second local features, and recognizing that the landscape element is present in the image in the video when it is determined that at least a predetermined proportion of the m first local features, each truncated to a feature vector of the selected dimension count, correspond to the n second local features likewise truncated; and
     a second transmission step of transmitting information indicating the recognized landscape element to the communication terminal.
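Claims 22 to 24 turn on a single matching rule: truncate both feature sets to the smaller of the two dimension counts i and j, then declare the landscape element present when at least a predetermined proportion of the m stored first local features find a close counterpart among the n received second local features. A minimal sketch of that rule, assuming Euclidean nearest-neighbour matching and hypothetical threshold values (`ratio_threshold`, `dist_threshold` are not from the claims):

```python
import numpy as np

def recognize_landscape_element(first_feats, second_feats,
                                ratio_threshold=0.5, dist_threshold=0.4):
    """Return True when the landscape element is judged present.

    first_feats  : (m, i) array of stored first local features
    second_feats : (n, j) array of second local features from the terminal
    """
    # Select the smaller of the two dimension counts i and j.
    d = min(first_feats.shape[1], second_feats.shape[1])
    a = first_feats[:, :d]   # first local features truncated to d dimensions
    b = second_feats[:, :d]  # second local features truncated to d dimensions
    # Count stored features whose nearest received feature is close enough.
    matched = sum(np.linalg.norm(b - fv, axis=1).min() <= dist_threshold
                  for fv in a)
    # Recognize the element when at least the predetermined proportion matches.
    return matched / len(a) >= ratio_threshold
```

Because only the first min(i, j) dimensions are compared, a terminal may send short (low-j) feature vectors to save bandwidth and the server can still match them against its longer stored vectors, trading some discrimination accuracy for transmission size.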
PCT/JP2013/051955 2012-01-30 2013-01-30 Information processing system, information processing method, information processing device, and control method and control program therefor, and communication terminal, and control method and control program therefor WO2013115204A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2013556426A JP6131859B2 (en) 2012-01-30 2013-01-30 Information processing system, information processing method, information processing apparatus and control method and control program thereof, communication terminal and control method and control program thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2012-017386 2012-01-30
JP2012017386 2012-01-30

Publications (1)

Publication Number Publication Date
WO2013115204A1 true WO2013115204A1 (en) 2013-08-08

Family

Family ID: 48905238

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2013/051955 WO2013115204A1 (en) 2012-01-30 2013-01-30 Information processing system, information processing method, information processing device, and control method and control program therefor, and communication terminal, and control method and control program therefor

Country Status (2)

Country Link
JP (1) JP6131859B2 (en)
WO (1) WO2013115204A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017091297A (en) * 2015-11-12 2017-05-25 KDDI Corporation Screen transition specification device, screen transition specification system, and screen transition specification method
JP2022531812A (en) * 2018-11-07 2022-07-12 Meta Platforms, Inc. Augmented reality target

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007108043A (en) * 2005-10-14 2007-04-26 Xanavi Informatics Corp Location positioning device, location positioning method
JP2011008507A (en) * 2009-06-25 2011-01-13 Kddi Corp Image retrieval method and system
WO2011025236A2 (en) * 2009-08-24 2011-03-03 Samsung Electronics Co., Ltd. Mobile device and server exchanging information with mobile apparatus
WO2011043275A1 (en) * 2009-10-06 2011-04-14 Topcon Corporation Three-dimensional data creating method and three-dimensional data creating device
JP2011094992A (en) * 2009-10-27 2011-05-12 Jvc Kenwood Holdings Inc Navigation device, navigation method and navigation program
JP2011099854A (en) * 2009-11-03 2011-05-19 Samsung Electronics Co Ltd User terminal device, position providing method therefor, and route guiding method therefor
JP2011198130A (en) * 2010-03-19 2011-10-06 Fujitsu Ltd Image processing apparatus, and image processing program

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006105640A (en) * 2004-10-01 2006-04-20 Hitachi Ltd Navigation system
JP4988408B2 (en) * 2007-04-09 2012-08-01 株式会社デンソー Image recognition device
JP4896115B2 (en) * 2008-11-21 2012-03-14 三菱電機株式会社 Automatic tracking imaging device from a moving body in the air
JP5685390B2 (en) * 2010-05-14 2015-03-18 株式会社Nttドコモ Object recognition device, object recognition system, and object recognition method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HIRONOBU FUJIYOSHI: "Gradient-Based Feature Extraction : SIFT and HOG", IEICE TECHNICAL REPORT, vol. 107, no. 206, 27 August 2007 (2007-08-27), pages 211 - 224 *


Also Published As

Publication number Publication date
JPWO2013115204A1 (en) 2015-05-11
JP6131859B2 (en) 2017-05-24

Similar Documents

Publication Publication Date Title
US11580753B2 (en) License plate detection and recognition system
KR101293776B1 (en) Apparatus and Method for providing augmented reality using object list
CN104936283A (en) Indoor positioning method, server and system
JP6168303B2 (en) Information processing system, information processing method, information processing apparatus and control method and control program thereof, communication terminal and control method and control program thereof
KR20140130499A (en) Visual ocr for positioning
JP6168355B2 (en) Information processing system, information processing method, communication terminal, control method thereof, and control program
CA2783014A1 (en) Matching an approximately located query image against a reference image set
CN111323024B (en) Positioning method and device, equipment and storage medium
US11915478B2 (en) Bayesian methodology for geospatial object/characteristic detection
JP6153086B2 (en) Video processing system, video processing method, video processing apparatus for portable terminal or server, and control method and control program therefor
WO2013115092A1 (en) Video processing system, video processing method, video processing device, and control method and control program therefor
Yu et al. Pairwise three-dimensional shape context for partial object matching and retrieval on mobile laser scanning data
Domozi et al. Real time object detection for aerial search and rescue missions for missing persons
WO2013115203A1 (en) Information processing system, information processing method, information processing device, and control method and control program therefor, and communication terminal, and control method and control program therefor
JP6131859B2 (en) Information processing system, information processing method, information processing apparatus and control method and control program thereof, communication terminal and control method and control program thereof
CN115790610B (en) Unmanned aerial vehicle accurate positioning system and method
WO2013089004A1 (en) Video processing system, video processing method, video processing device for portable terminal or for server and method for controlling and program for controlling same
CN116192822A (en) Screen display communication control method and device, 5G firefighting intercom mobile phone and medium
JP2013037606A (en) Information terminal, information providing system, and information providing method
CN116817892B (en) Cloud integrated unmanned aerial vehicle route positioning method and system based on visual semantic map
US20190073791A1 (en) Image display system, terminal, method, and program
WO2021075072A1 (en) Object detection device, flight vehicle, object detection method, computer program, and method for generating learning model
WO2013089041A1 (en) Video processing system, video processing method, video processing device for portable terminal or for server, and method for controlling and program for controlling same
JP2010271845A (en) Image reading support device
CN114140725A (en) Aerial image analysis system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13743530

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2013556426

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13743530

Country of ref document: EP

Kind code of ref document: A1