JP7227799B2

JP7227799B2 - Image retrieval device, image retrieval method and computer program

Info

Publication number: JP7227799B2
Application number: JP2019046127A
Authority: JP
Inventors: 雅人田村; 敦廣池; 俊明垂井; 智明吉永
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2019-03-13
Filing date: 2019-03-13
Publication date: 2023-02-22
Anticipated expiration: 2039-03-13
Also published as: JP2020149337A

Description

The present invention relates to an image search device, an image search method, and a computer program.

Conventionally, an image search device that uses a computer to find the images a user wants searches on the basis of an input keyword. The device compares the keywords associated with each of the images stored on a server against the keyword entered by the user, and presents the search results to the user.

In the technique of Patent Document 1, a plurality of thumbnail images related to the set keywords are displayed. The server computer reflects a detection rate for each keyword in the number of thumbnail images displayed for that keyword. The detection rate for each keyword is entered by the user when searching for images. Patent Document 1 thus discloses a server computer that can reflect, in the search, the importance the user assigns to each keyword.

Patent Document 1: JP 2013-003727 A

In Patent Document 1, images are searched on the basis of the text on a web page and the input keyword. However, because the text on a web page is set according to a person's subjective judgment, the technique of Patent Document 1 cannot search on the basis of information that people have not recognized.

Furthermore, because the relationships between the objects indicated by the keywords are not set for each image, Patent Document 1 cannot perform a search that takes into account the relationships between the objects shown in an image.

The present invention has been made to solve the above problems, and its object is to provide an image search device, an image search method, and a computer program capable of improving the accuracy of image search.

The image search device comprises: a feature extraction unit that calculates search data including object name information for a plurality of objects included in an image and relationship name information indicating a predetermined relationship between the plurality of objects included in the image; a storage unit that stores the search data in association with the image; an input unit that accepts, as a search query, at least one of object name information and relationship name information; a search unit that searches the storage unit on the basis of the search query and extracts a predetermined image corresponding to the search query; and an output unit that outputs the search result.

According to the present invention, the accuracy of image search can be improved.

Schematic diagram of an image search device according to the first embodiment.
Hardware configuration diagram of the image search device.
Explanatory diagram of the search query input unit.
Explanatory diagram of the object name information.
Explanatory diagram of the relationship name information.
Explanatory diagram of the search data.
Explanatory diagram of the search result output unit.
Flowchart of the processing of the search unit.
Flowchart of the processing of the feature extraction unit.
Enlarged view of region A.
Explanatory diagram of the search result output unit highlighting objects.
Schematic diagram of an image search device according to the second embodiment.
Flowchart of the learning processing.
Schematic diagram of an image search device according to the third embodiment.
Explanatory diagram of the search query input unit.
Flowchart of the image search processing.
Schematic diagram of an image search device according to the fourth embodiment.
Explanatory diagram of the search query input unit.
Flowchart of the image search processing.
Explanatory diagram of the object name information according to the fifth embodiment.
Schematic diagram of a station according to the sixth embodiment.
Explanatory diagram of the object name information.
Hardware configuration diagram of an image search device according to the seventh embodiment.
Hardware configuration diagram of an image search system according to the eighth embodiment.

The present embodiment is described below with reference to the accompanying drawings, but is not limited to the configurations shown in those drawings. The present embodiment relates to an image search device that searches for images. The image search device 1 of the present embodiment can be used, for example, in a monitoring system that monitors visitors at facilities such as airports, stations, ports, department stores, hotels, and event venues. Unlike an ordinary image search engine on the web, the image search device 1 automatically calculates and stores the plurality of objects included in a single image and the relationships (physical relationships) between those objects. As a result, images relevant to the search purpose can be extracted accurately from the images accumulated in the storage unit, and usability is improved.

A monitoring system to which the image search device 1 according to the present embodiment is applied does not usually photograph a specific subject intentionally according to some theme or motif; it simply keeps recording a specific monitored area. The images obtained by monitoring therefore contain a miscellaneous collection of objects, with no distinction between main and secondary subjects. The image search device 1 according to the present embodiment determines the physical relationships between the objects on the basis of their positions in the image, and stores those relationships in association with the image.

The same applies beyond monitoring systems: for any still image or moving image showing a plurality of objects, the physical relationships between the objects can be extracted automatically and stored in association with the image.

Image search services offered on websites and the like merely associate an image with a description that follows the photographer's intent (for example, "entrance ceremony" or "wedding"), or with a description that follows an analyst's image analysis (for example, "students on their way to school" or "a beach crowded on the opening day of the swimming season"). In other words, image search engines offered on the web do not consider the physical relationships between objects, and the words or descriptions available for searching their images are few. As will be clear from the following description, the image search device 1 according to the present embodiment differs entirely from the prior art, including image search engines on the web.

FIG. 1 is a schematic diagram of the image search device 1. The image search device 1 of this embodiment searches the plurality of images stored in the image data storage unit 123 for the image the user is looking for (hereinafter sometimes referred to as the predetermined image).

The image search device 1 has a search query input unit 120 as an example of an "input unit", a search unit 121, a feature extraction unit 122, an image data storage unit 123, a feature data storage unit 124, a search result output unit 125 as an example of an "output unit", and an image data acquisition unit 126.

The search query input unit 120 is a function that accepts a search query from the user. The search query includes at least one of object name information 1242 (see FIG. 4) and relationship name information 1244 (see FIG. 5). Note that an "object" may be labeled "物体" in the drawings. The object name information 1242 indicates the plurality of objects included in an image. The relationship name information 1244 indicates the relationships between the plurality of objects.

The search query input unit 120 is, for example, a UI (User Interface) displayed on the output device 11 (hereinafter sometimes referred to as the monitor 11). The search query input unit 120 is connected to the search unit 121 so that it can communicate in one direction. The search query input unit 120 is described later with reference to FIG. 3.

The search unit 121 is a function that searches the storage unit 12 (described later with reference to FIG. 2) on the basis of the search query and extracts the predetermined image corresponding to the search query. The search unit 121 is connected to the image data storage unit 123, the feature data storage unit 124, and the search result output unit 125 so that it can communicate in one direction. The search unit 121 may be connected to the feature extraction unit 122 so that the two can communicate bidirectionally. The search unit 121 is described later with reference to FIG. 8.

The feature extraction unit 122 is a function that calculates search data (see FIGS. 4 to 6) from an image. The search data includes the object name information 1242 and the relationship name information 1244. The feature extraction unit 122 is connected to the image data storage unit 123 so that it can communicate in one direction, and to the feature data storage unit 124 so that the two can communicate bidirectionally. The feature extraction unit 122 is described later with reference to FIG. 9.

The image data storage unit 123 is a database that stores a plurality of images. The image data storage unit 123 is connected to the image data acquisition unit 126 so that it can communicate in one direction. The feature data storage unit 124 is a database that stores the search data.

The search result output unit 125 is a function that outputs search results. The search result output unit 125 displays the search results on the monitor 11, for example. It is not limited to displaying the results on the monitor 11, and may output them to an external terminal via the communication interface (communication I/F in the drawing) 16 (see FIG. 2). The search result output unit 125 is described later with reference to FIG. 7.

The image data acquisition unit 126 is a function that saves a plurality of image data in the image data storage unit 123. For example, it acquires video data captured by a surveillance camera or the like and saves the image of each frame of the video in the image data storage unit 123. The image data acquisition unit 126 is not limited to video data captured by a surveillance camera, and may acquire a plurality of image data (still images or moving images) from the Internet or elsewhere. The image data acquisition unit 126 acquires image 2, for example, and saves it in the image data storage unit 123.
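The frame-by-frame storage described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `ImageDataStore` class, its `save` method, the `acquire_frames` function, and the sequential ID scheme are hypothetical names and choices introduced here, and frames are represented as plain byte strings rather than decoded images.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class ImageDataStore:
    """Hypothetical stand-in for the image data storage unit 123."""
    images: Dict[int, bytes] = field(default_factory=dict)

    def save(self, image: bytes) -> int:
        # Assumption: image IDs are assigned sequentially, starting at 0.
        image_id = len(self.images)
        self.images[image_id] = image
        return image_id


def acquire_frames(frames: Iterable[bytes], store: ImageDataStore) -> List[int]:
    """Save each frame of a video as its own image and return the assigned IDs."""
    return [store.save(frame) for frame in frames]


store = ImageDataStore()
ids = acquire_frames([b"frame-a", b"frame-b", b"frame-c"], store)
```

Treating every frame as an independent image keeps the later search logic unaware of whether an image came from a video or from a still-image source.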

Image 2 is, for example, one frame of the video data captured by a surveillance camera. Image 2 shows, for example, the premises of station 5, including an entrance 51 and people 21, 25, 29, 33, 37, and 40.

Person 21 is a child who wears clothes 22 and shoes 23 and carries a bag 24. Person 25 is an adult woman who wears a suit 26 and shoes 27 and carries a bag 28. Person 29 is an adult man who wears a suit 30 and shoes 31 and holds a drink 32. Person 33 is an adult man who wears casual clothes 34 and shoes 35 and carries a bag 36. Person 37 is a child who wears casual clothes 38 and shoes 39. Person 40 is an adult woman who wears casual clothes 41 and shoes 42. Persons 33 and 37 are in contact, holding hands, and persons 40 and 37 are likewise in contact, holding hands.

FIG. 2 is a hardware configuration diagram of the image search device 1. The image search device 1 has a monitor 11 (output unit 11 in the drawing), an input device 13, a storage unit 12, a CPU 14, a memory 15, a communication interface 16, and a data transmission path 17 that connects components 11 to 16 so that they can communicate bidirectionally.

The monitor 11 is, for example, the display of a personal computer, or the display of a portable information terminal, a mobile phone (a so-called smartphone), or a wearable terminal. The input device 13 is, for example, a keyboard or a mouse. The input device 13 may also be a microphone, in which case the search query input unit 120 may have a function that converts the spoken input into text data representing the search query. Furthermore, the input device 13 and the output device 11 may be integrated, as in a tablet or an AR (Augmented Reality) display.

The storage unit 12 is, for example, a non-volatile storage device such as a hard disk or an SSD (Solid State Drive); the type of storage medium does not matter. The storage unit 12 stores the computer programs (hereinafter, programs) of the search query input unit 120, the search unit 121, the feature extraction unit 122, the search result output unit 125, and the image data acquisition unit 126. The storage unit 12 also stores the databases, namely the image data storage unit 123 and the feature data storage unit 124.

The CPU 14 reads each program from the storage unit 12 via the memory 15 and executes it. The memory 15 is, for example, a volatile storage device such as RAM (Random Access Memory).

The communication interface 16 is a device that communicates with external devices via a communication network such as a LAN (Local Area Network), the Internet, or a SAN (Storage Area Network).

FIG. 3 is an explanatory diagram of the search query input unit 120. The search query input unit 120 displays, on the monitor 11, a plurality of object name input fields 1201(1) and 1201(2), a relationship name input field 1202, and a search button 1203. When they need not be distinguished, the object name input fields 1201(1) and 1201(2) may be referred to simply as the object name input field 1201.

The object name input field 1201 is a function that accepts object name information 1242. For example, "person" is entered in the object name input field 1201(1) and "bag" in the object name input field 1201(2).

The object name input field 1201 is not limited to accepting object name information 1242; it may also accept an object ID (identification) 1241 (see FIG. 4). The search query input unit 120 is not limited to displaying the two object name input fields 1201(1) and 1201(2), and may display three or more object name input fields 1201 on the monitor 11.

The relationship name input field 1202 is a function that accepts relationship name information 1244. For example, "holding" is entered in the relationship name input field 1202.

The relationship name input field 1202 is not limited to accepting relationship name information 1244; it may also accept a relationship ID 1243 (see FIG. 5). The search query input unit 120 is not limited to displaying one relationship name input field 1202, and may display two or more items of relationship name information 1244.

In other words, the user searches for images in which a "person" is "holding" a "bag". The search button 1203 is a button that starts the image search.

The search query input unit 120 displays the plurality of object name input fields 1201 and at least one relationship name input field 1202 arranged along the scroll direction of the monitor 11. The scroll direction of the monitor 11 is, for example, its vertical direction (direction S in the drawing).

That is, the search query input unit 120 displays, for example, the object name input field 1201(1) at the top of the monitor 11, the object name input field 1201(2) below the object name input field 1201(1), and the relationship name input field 1202 below the object name input field 1201(2).

As a result, even on a vertically long monitor 11 such as that of a mobile phone, the user can bring the plurality of object name input fields 1201 and the at least one relationship name input field 1202 into view by scrolling down. This improves the operability of the search query input unit 120.

The information accepted by the object name input field 1201 and the relationship name input field 1202 may be selected from pull-down menus. The pull-down menus may list the information stored in the feature data storage unit 124.

FIG. 4 is an explanatory diagram of the object name information 1242. The feature data storage unit 124 stores an object ID 1241 and the object name information 1242.

The object ID 1241 stores information that identifies the object name information 1242. The object name information 1242 stores the names of the objects included in images, for example "person", "car", "train", "bag", or "sticker".

FIG. 5 is an explanatory diagram of the relationship name information 1244. The feature data storage unit 124 stores a relationship ID 1243 and the relationship name information 1244.

The relationship ID 1243 stores information that identifies the relationship name information 1244. The relationship name information 1244 stores information indicating the relationships between objects. These relationships are physical relationships. The relationship name information 1244 stores, for example, "holding", "riding", "attached to", "touching", or "hanging from".
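The two ID-to-name tables of FIGS. 4 and 5 can be pictured as simple lookup tables. The sketch below is illustrative only. The description of the search processing states that object ID 0 is "person", object ID 3 is "bag", and relationship ID 0 is "holding"; every other ID here is an assumption made by numbering the example names in the order they are listed. The names `OBJECT_NAMES`, `RELATIONSHIP_NAMES`, and `find_id` are hypothetical.

```python
from typing import Dict, Optional

# Object ID 1241 -> object name information 1242.
# Only 0 -> "person" and 3 -> "bag" are stated in the text; the rest are assumed.
OBJECT_NAMES: Dict[int, str] = {
    0: "person", 1: "car", 2: "train", 3: "bag", 4: "sticker",
}

# Relationship ID 1243 -> relationship name information 1244.
# Only 0 -> "holding" is stated in the text; the rest are assumed.
RELATIONSHIP_NAMES: Dict[int, str] = {
    0: "holding", 1: "riding", 2: "attached to", 3: "touching", 4: "hanging from",
}


def find_id(table: Dict[int, str], name: str) -> Optional[int]:
    """Return the ID registered for a name, or None if the name is unknown."""
    for key, value in table.items():
        if value == name:
            return key
    return None


person_id = find_id(OBJECT_NAMES, "person")
holding_id = find_id(RELATIONSHIP_NAMES, "holding")
```

A reverse lookup of this kind is what lets an input field accept either a name or an ID for the same entry.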

FIG. 6 is an explanatory diagram of the search data. The feature data storage unit 124 stores an image ID 1245, a combination ID 1246, the identification information 1241(1) and 1241(2) of each object, the position information 1247(1) and 1247(2) of each object, the size information 1248(1) and 1248(2) of each object, and the relationship ID 1243. The drawing shows the case where the feature data storage unit 124 stores search data containing two objects per image, but search data containing three or more objects per image may be stored. In the drawing, the reference numerals for the relationship ID 1243, the image ID 1245, and the combination ID 1246 are written directly in the cells, because the text might not otherwise fit within the frame.

The image ID 1245 stores information that identifies an image. The image whose image ID is "0" is, for example, image 2. The combination ID 1246 stores information that identifies a combination of the object 1 ID 1241(1), the object 2 ID 1241(2), and the relationship ID 1243.

The object 1 ID 1241(1) and the object 2 ID 1241(2) store information that identifies the objects. Both correspond to the object ID 1241 (see FIG. 4).

The object position information 1247(1) and 1247(2) stores the position of each object. The position may be calculated, for example, by counting pixels from the edge of the image. It may be expressed as two-dimensional x and y coordinates on the image, or as three-dimensional coordinates.

The object size information 1248(1) and 1248(2) stores the size of each object. The size may be calculated, for example, as a number of pixels on the image, or from the distance between each object's centroid and its edge. It may also be expressed as the width w and height h of the object on the image.
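One row of the search data in FIG. 6 can be modeled as a small record type. This is a sketch under stated assumptions: the class and field names (`ObjectEntry`, `SearchRecord`, and so on) are hypothetical, positions and sizes are taken to be pixel values in two dimensions, and the coordinate values in the sample row are made up for illustration. The IDs in the sample row follow the combination described in the text (image ID 0, combination ID 1, object IDs 0 and 3, relationship ID 0).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ObjectEntry:
    object_id: int              # object ID 1241
    position: Tuple[int, int]   # position information 1247 as (x, y), in pixels
    size: Tuple[int, int]       # size information 1248 as (w, h), in pixels


@dataclass(frozen=True)
class SearchRecord:
    image_id: int          # image ID 1245
    combination_id: int    # combination ID 1246
    object1: ObjectEntry
    object2: ObjectEntry
    relationship_id: int   # relationship ID 1243


record = SearchRecord(
    image_id=0,
    combination_id=1,
    object1=ObjectEntry(object_id=0, position=(120, 80), size=(40, 110)),   # made-up values
    object2=ObjectEntry(object_id=3, position=(150, 140), size=(20, 25)),   # made-up values
    relationship_id=0,
)
```

Search data covering three or more objects per image, which the text also allows, would need a list of entries instead of the fixed `object1` and `object2` fields.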

FIG. 7 is an explanatory diagram of the search result output unit 125. The search result output unit 125 displays, on the monitor 11, the predetermined image found by the search unit 121. For example, it displays image 2 as the predetermined image.

FIG. 8 is a flowchart of the processing of the search unit 121. After the image search device 1 starts, the search unit 121 acquires the plurality of image data from the image data storage unit 123 and the search data from the feature data storage unit 124 (S11). When the search button 1203 of the search query input unit 120 is pressed (S12: Yes), the search unit 121 executes the image search processing (S13 to S15).

The processing of the search unit 121 is explained using, as an example, the case where "person", "bag", and "holding" shown in FIG. 3 are entered in the search query input unit 120.

The search unit 121 acquires, from the search query input unit 120, the object name information 1242 indicating "person" and "bag" and the relationship name information 1244 indicating "holding" (S13). The search unit 121 then searches the plurality of images for the predetermined image (S14). The predetermined image is, for example, one whose search data includes the object name information 1242 "person" and "bag" and the relationship name information 1244 "holding".

As shown in the search data of FIG. 6, when the combination ID 1246 is "1", the object 1 ID 1241(1) is "0", the object 2 ID 1241(2) is "3", and the relationship ID 1243 is "0". An object ID 1241 of "0" indicates "person" (see FIG. 4), and an object ID 1241 of "3" indicates "bag". A relationship ID 1243 of "0" indicates "holding".

The search unit 121 determines that the image whose image ID 1245 is "0" includes the object name information 1242 "person" and "bag" and the relationship name information 1244 "holding". The search unit 121 sets image 2 as the predetermined image.

The search unit 121 transmits image 2 to the search result output unit 125 (S15), together with information on the objects included in image 2 and information on the relationships between them. To continue searching (S16: No), processing returns to S12. To end searching (S16: Yes), the processing of the search unit 121 ends.
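The matching performed in S14 can be sketched as a filter over the stored records. This is a hedged illustration rather than the method claimed in the patent: the function name `search_images`, the dictionary layout of `RECORDS`, the second sample record, and the name assigned to relationship ID 3 are all assumptions. The sketch also ignores which object is the subject of the relationship, a point the text does not settle.

```python
from typing import Dict, List, Optional, Sequence

OBJECT_NAMES: Dict[int, str] = {0: "person", 3: "bag"}            # IDs stated in the text
RELATIONSHIP_NAMES: Dict[int, str] = {0: "holding", 3: "touching"}  # ID 3 is assumed

RECORDS = [
    # Combination described in the text: a person holding a bag in image 0.
    {"image_id": 0, "combination_id": 1, "object_ids": (0, 3), "relationship_id": 0},
    # Made-up record: two people touching in a hypothetical image 1.
    {"image_id": 1, "combination_id": 2, "object_ids": (0, 0), "relationship_id": 3},
]


def search_images(records: Sequence[dict],
                  query_objects: Sequence[str] = (),
                  query_relationship: Optional[str] = None) -> List[int]:
    """Return the IDs of images having a record that satisfies the query."""
    if not query_objects and query_relationship is None:
        raise ValueError("query needs at least one object name or a relationship name")
    hits: List[int] = []
    for rec in records:
        names = [OBJECT_NAMES.get(i) for i in rec["object_ids"]]
        relation = RELATIONSHIP_NAMES.get(rec["relationship_id"])
        objects_ok = all(name in names for name in query_objects)
        relation_ok = query_relationship is None or relation == query_relationship
        if objects_ok and relation_ok and rec["image_id"] not in hits:
            hits.append(rec["image_id"])
    return hits


result = search_images(RECORDS, ("person", "bag"), "holding")
```

With the sample data, the query for a person holding a bag selects image 0 only, while a query for "person" alone selects both images, reflecting that the query may contain object names, a relationship name, or both.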

FIG. 9 is a flowchart of the processing of the feature extraction unit 122. The feature extraction unit 122 is executed, for example, when a new image is added to the image data storage unit 123. It may also be run by the user, or at a predetermined interval set in a scheduler or the like.

The feature extraction unit 122 acquires at least one item of image data from the image data storage unit 123 (S21). For example, the feature extraction unit 122 may acquire the image data newly added to the image data storage unit 123, or it may acquire all the image data stored in the image data storage unit 123.

特徴抽出部１２２は、オブジェクト名情報１２４２（図４参照）を算出する（Ｓ２２）。特徴抽出部１２２は、処理（Ｓ２２）にて算出したオブジェクト名情報１２４２を用いて、関係名称情報１２４４（図５参照）を算出する（Ｓ２３）。 The feature extraction unit 122 calculates object name information 1242 (see FIG. 4) (S22). The feature extraction unit 122 calculates the relationship name information 1244 (see FIG. 5) using the object name information 1242 calculated in the process (S22) (S23).

Note that the processing (S22, S23) of the feature extraction unit 122 calculates the object name information 1242 and the relationship name information 1244 by using, for example, a CNN (Convolutional Neural Network). The feature extraction unit 122 may also calculate the position and the size of each object. The formula by which the feature extraction unit 122 calculates the search data is shown in Equation (1) below.

(o1, x1, y1, w1, h1, c1, o2, x2, y2, w2, h2, c2, r, cr) = φ(I; θ) ... Equation (1)

On the right side of Equation (1), "φ" denotes the function computed by the CNN, "I" denotes the input image data, and "θ" denotes the parameters of the CNN. The parameters "θ" are set in the feature extraction unit 122 so that it can calculate the data of the objects shown in an image and the data of the relationships between those objects. The feature extraction unit 122 calculates the search data by applying "φ" to "I" and "θ".

The left side of Equation (1) is the search data. "o1" and "o2" are the object names 1242. "x1", "x2", "y1" and "y2" are the object positions 1247. "w1", "w2", "h1" and "h2" are the object sizes 1248. "c1" and "c2" are the reliability of the estimate for each object. "r" is the relationship name information 1244, and "cr" is the reliability of its estimate. Where the objects need not be distinguished, the search data may be abbreviated, for example by writing "o1" and "o2" simply as "o".
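One way to picture the fourteen values on the left side of Equation (1) is as a small record type. The sketch below is illustrative only: the class names `DetectedObject` and `SearchData`, the field names, and the sample values are assumptions, and the description does not prescribe any particular in-memory layout.

```python
from dataclasses import dataclass


@dataclass
class DetectedObject:
    """One object term of Equation (1): (o, x, y, w, h, c)."""
    name: str           # o: object name
    x: float            # x: horizontal position
    y: float            # y: vertical position
    w: float            # w: width
    h: float            # h: height
    reliability: float  # c: reliability of the object estimate


@dataclass
class SearchData:
    """The full left side of Equation (1) for one pair of objects."""
    object1: DetectedObject
    object2: DetectedObject
    relationship: str                # r: relationship name
    relationship_reliability: float  # cr: reliability of the relationship

    def as_tuple(self):
        """Flatten to the order (o1, x1, y1, w1, h1, c1, o2, ..., r, cr)."""
        a, b = self.object1, self.object2
        return (a.name, a.x, a.y, a.w, a.h, a.reliability,
                b.name, b.x, b.y, b.w, b.h, b.reliability,
                self.relationship, self.relationship_reliability)


# Sample values are invented for illustration.
sample = SearchData(
    object1=DetectedObject("person", 40.0, 60.0, 30.0, 90.0, 0.97),
    object2=DetectedObject("bag", 55.0, 95.0, 20.0, 25.0, 0.88),
    relationship="has",
    relationship_reliability=0.81,
)
print(sample.as_tuple())
```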

なお、特徴抽出部１２２は、各オブジェクト間の関係名称情報１２４４を算出することに限らず、各オブジェクトの状態を示す状態名称情報を算出してもよい。すなわち、特徴抽出部１２２は、例えば、人の表情等を読み取ることによって、「笑っている」等の状態名称情報を算出する。検索部は、状態名称情報に基づいて、画像検索をしてもよい。 Note that the feature extraction unit 122 may calculate state name information indicating the state of each object without being limited to calculating the relationship name information 1244 between the objects. That is, the feature extraction unit 122 calculates state name information such as "smiling" by reading a person's facial expression or the like. The search unit may search for images based on the state name information.

The feature extraction unit 122 calculates the relationship name information 1244 based on the distance between objects. FIG. 10 is an enlarged view of area A. The feature extraction unit 122 calculates, for example, the position of the center of gravity of each object as its position information (shown as diamond-shaped marks in the figure). The position information is expressed, for example, as two-dimensional x and y coordinates in the image.

The feature extraction unit 122 calculates the position information (x11, y11) of the person 21, the position information (x12, y12) of the bag 23, the position information (x21, y21) of the person 25, and the position information (x22, y22) of the bag 26.

The feature extraction unit 122 calculates the distance between the pieces of position information and determines that objects within a predetermined distance of each other have a relationship. For example, because the distance between the position information (x11, y11) and the position information (x12, y12) is within the predetermined distance, the feature extraction unit 122 calculates the relationship name information 1244 between the person 25 and the bag 26. Because the distance between the position information (x11, y11) and the position information (x22, y22) is longer than the predetermined distance, the feature extraction unit 122 does not calculate the relationship name information 1244 between the person 25 and the bag 23.
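The distance test described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `related_pairs`, the labels, the coordinates, and the threshold are all invented for the example, and the distance is the plain Euclidean distance between centers of gravity.

```python
import math
from itertools import combinations


def related_pairs(centroids, max_distance):
    """Return the label pairs whose centroids lie within max_distance.

    centroids maps a label to an (x, y) centre-of-gravity position.
    Only these pairs would go on to have a relationship name calculated.
    """
    pairs = []
    for (label_a, pos_a), (label_b, pos_b) in combinations(centroids.items(), 2):
        if math.dist(pos_a, pos_b) <= max_distance:
            pairs.append((label_a, label_b))
    return pairs


# Two people, each standing next to a bag, far apart from each other.
centroids = {
    "person_a": (10.0, 10.0),
    "bag_a": (12.0, 11.0),
    "person_b": (50.0, 10.0),
    "bag_b": (52.0, 12.0),
}
print(related_pairs(centroids, max_distance=5.0))
```

With these sample values only the two nearby person and bag pairs pass the threshold, so no relationship would be calculated across the two groups.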

なお、特徴抽出部１２２は、オブジェクトの重心の位置情報に基づいて各オブジェクトの距離を算出することに限らず、各オブジェクトの外形の最短距離によって各オブジェクトの距離を算出してもよい。各オブジェクトの位置情報は、二次元空間に限らず、三次元空間で位置情報を算出してもよい。 Note that the feature extraction unit 122 may calculate the distance of each object based on the shortest distance of the outer shape of each object instead of calculating the distance of each object based on the positional information of the center of gravity of the object. Position information of each object is not limited to two-dimensional space, and position information may be calculated in three-dimensional space.

図９に戻り、特徴抽出部１２２は、算出した検索用データを特徴データ蓄積部１２４に送信する（Ｓ２４）。特徴抽出部１２２は、処理（Ｓ２４）の後に終了する。 Returning to FIG. 9, the feature extraction unit 122 transmits the calculated search data to the feature data accumulation unit 124 (S24). The feature extraction unit 122 ends after processing (S24).

検索部１２１は、検索ボタン１２０３が押される前に、画像データ蓄積部１２３および特徴データ蓄積部１２４から画像データおよび検索用データを取得することができる。これにより、検索部１２１は、画像検索（Ｓ１３～Ｓ１５）の際に、画像データ蓄積部１２３と、特徴データ蓄積部１２４と、の通信頻度を抑制することができる。 The search unit 121 can acquire image data and search data from the image data storage unit 123 and the feature data storage unit 124 before the search button 1203 is pressed. As a result, the search unit 121 can reduce the frequency of communication between the image data storage unit 123 and the feature data storage unit 124 during image search (S13 to S15).

なお、検索部１２１は、画像データおよび検索用データ取得処理（Ｓ１１）を画像検索開始処理（Ｓ１２：Ｙｅｓ）の後に実行してもよい。これにより、待機状態の場合において、検索部１２１は、メモリ１５の使用量を抑制することができる。 Note that the search unit 121 may execute the image data and search data acquisition process (S11) after the image search start process (S12: Yes). This allows the search unit 121 to reduce the usage of the memory 15 in the standby state.

特徴抽出部１２２は、画像検索開始処理（Ｓ１２：Ｙｅｓ）の後に、画像データ蓄積部１２３の複数の画像から検索用データを算出してもよい。この場合において、検索部１２１は、検索用データを特徴抽出部１２２から取得してもよい。これにより、特徴データ蓄積部１２４に保存されるデータ容量を削減することができる。 The feature extraction unit 122 may calculate search data from a plurality of images in the image data storage unit 123 after the image search start process (S12: Yes). In this case, the search unit 121 may acquire search data from the feature extraction unit 122 . As a result, the amount of data stored in the feature data storage unit 124 can be reduced.

ユーザは、一つのオブジェクトの情報と、一つの関連性の情報と、を検索クエリ入力部１２０に入力してもよい。この場合には、検索部１２１は、任意のオブジェクトのデータを特徴データ蓄積部１２４の中から選択する。検索部１２１は、検索クエリのオブジェクトと、選択した任意のオブジェクトと、の関連性のデータを取得する。検索部１２１は、取得した関連性のデータと、検索クエリの関連性のデータと、を比較することによって画像検索してもよい。 A user may input information on one object and information on one relationship into the search query input unit 120 . In this case, the search unit 121 selects data of any object from the feature data storage unit 124 . The search unit 121 acquires relevance data between the object of the search query and any selected object. The search unit 121 may perform an image search by comparing the obtained relevance data and the relevance data of the search query.

検索結果出力部１２５には、所定の画像に含まれる複数のオブジェクトが強調表示されてもよい。図１１はオブジェクトを強調表示する検索結果出力部１２５の説明図である。 A plurality of objects included in a predetermined image may be highlighted in the search result output unit 125 . FIG. 11 is an explanatory diagram of the search result output unit 125 that highlights objects.

The search result output unit 125 displays the predetermined objects included in the search query and the predetermined relationship name information included in the search query. For example, the search result output unit 125 displays the person 21, the bag 22, and "has" 1244(1) on the monitor 11.

The search result output unit 125 displays the predetermined objects apart from one another, and displays the predetermined relationship name information between them. For example, the search result output unit 125 displays the person 21 and the bag 22 apart from each other, with "has" 1244(1) between them to indicate their relationship.

The search result output unit 125 also displays other relationship name information that is not included in the search query and that indicates another relationship between a predetermined object and another object. For example, the search result output unit 125 displays the sticker 43 and "attached to" 1244(2).

The search result output unit 125 displays the other object apart from the displayed predetermined object, and displays the other relationship name information between the two. For example, the search result output unit 125 displays the bag 22 and the sticker 43 apart from each other, with "attached to" 1244(2) between them to indicate their relationship.

The search result output unit 125 arranges the predetermined objects, the predetermined relationship name information, the other objects, and the other relationship name information along the scroll direction of the screen. For example, in the scroll direction (S direction) of the monitor 11, the search result output unit 125 displays the person 25 below the person 21, the bag 28 below the bag 22, and "has" 1244(3) below "has" 1244(1).

検索結果出力部１２５は、検索クエリを表示させてもよい。検索クエリを表示することによって、ユーザが入力した情報を確認することができる。 The search result output unit 125 may display the search query. By displaying the search query, the information entered by the user can be verified.

検索結果出力部１２５は、図７，１１で示すレイアウトに限らず、視認性を向上させるために他の表示方法を採用してもよい。検索結果出力部１２５は、オブジェクトの推定の信頼度「ｃ」（数式１参照）に応じて、表示するオブジェクトの画像の大きさを設定してもよい。これにより、画像検索装置１は、検索結果の視認性を向上させることができる。 The search result output unit 125 is not limited to the layouts shown in FIGS. 7 and 11, and may employ other display methods to improve visibility. The search result output unit 125 may set the size of the object image to be displayed according to the object estimation reliability “c” (see Equation 1). As a result, the image search device 1 can improve the visibility of search results.

Because the image search device 1 of this embodiment includes the input unit 120, the search unit 121, the feature extraction unit 122, the output unit 125, and the storage unit 12, it can search for images using the relationship name information 1244. As a result, the image search device 1 can improve the accuracy of image search.

関係名称情報１２４４は、各オブジェクト間の物理的関係性を示すため、特徴抽出部１２２が各オブジェクト間の位置情報に基づいて関係名称情報１２４４を算出することができる。 Since the relationship name information 1244 indicates the physical relationship between each object, the feature extraction unit 122 can calculate the relationship name information 1244 based on the position information between each object.

The feature extraction unit 122 can calculate the object name information 1242 of the objects included in an image. The feature extraction unit 122 can therefore also store, in the feature data storage unit 124, object name information 1242 that a person would overlook. As a result, the user no longer needs to enter object name information manually, which improves usability.

Because the feature extraction unit 122 calculates the relationship name information 1244 between physically related objects based on the position information of each object, it refrains from calculating the relationship name information 1244 between objects that are a predetermined distance or more apart. The feature extraction unit 122 can thereby avoid calculating superfluous relationship name information 1244.

検索クエリ入力部１２０は、検索クエリが文字で入力される機能を有するため、画像検索装置１の使い勝手が向上する。 Since the search query input unit 120 has a function of inputting a search query in characters, the usability of the image search device 1 is improved.

Because the search query input unit 120 displays the object name input field 1201 and the relationship name input field 1202 side by side in the scroll direction of the monitor 11, it can display fields that do not fit on the monitor 11 at once. This improves the usability of the image search device 1.

検索結果出力部１２５は、複数の所定のオブジェクトを他のオブジェクトよりも強調して表示させる機能を有する。これにより、ユーザは、画像内に複数のオブジェクトが表示される場合であっても、所定のオブジェクトを容易に発見することができる。 The search result output unit 125 has a function of displaying a plurality of predetermined objects with more emphasis than other objects. This allows the user to easily find the desired object even when multiple objects are displayed in the image.

The search result output unit 125 has a function of displaying the predetermined relationship name information. This allows the user to easily see which objects have the predetermined relationship with each other.

The search result output unit 125 has a function of displaying the other objects and the other relationship name information. This allows the user to see which other objects are present and how they are related.

The search result output unit 125 has a function of displaying the predetermined objects apart from one another; a function of displaying the predetermined relationship name information between the displayed predetermined objects; a function of displaying another object apart from the displayed predetermined object; a function of displaying the other relationship name information between the displayed predetermined object and the displayed other object; and a function of arranging the predetermined objects, the predetermined relationship name information, the other objects, and the other relationship name information along the scroll direction of the screen. The search result output unit 125 can thereby improve the visibility of the search results.

This embodiment is a modification of the first embodiment, so the description focuses on the differences from the first embodiment. FIG. 12 is a schematic diagram of the image search device 1a. In the image search device 1a of this embodiment, the learning unit 127 learns the parameters set in the feature extraction unit 122a.

画像検索装置１ａは、検索クエリ入力部１２０ａと、検索部１２１と、特徴抽出部１２２ａと、画像データ蓄積部１２３ａと、特徴データ蓄積部１２４ａと、検索結果出力部１２５ａと、画像データ取得部１２６と、学習部１２７と、を有する。検索クエリ入力部１２０ａは、ユーザから検索クエリを受け付ける機能である。検索クエリ入力部１２０ａは、例えば、モニタ１１に表示されたＵＩである。検索クエリ入力部１２０ａは、検索部１２１および学習部１２７と単方向に通信可能に接続される。 The image search device 1a includes a search query input unit 120a, a search unit 121, a feature extraction unit 122a, an image data storage unit 123a, a feature data storage unit 124a, a search result output unit 125a, and an image data acquisition unit 126. , and a learning unit 127 . The search query input unit 120a is a function that receives search queries from users. The search query input unit 120a is a UI displayed on the monitor 11, for example. The search query input unit 120a is connected to the search unit 121 and the learning unit 127 so as to be unidirectionally communicable.

特徴抽出部１２２ａは、画像から検索用データを算出する機能である。特徴抽出部１２２ａは、画像データ蓄積部１２３ａと単方向に通信可能に接続される。特徴抽出部１２２ａは、特徴データ蓄積部１２４ａおよび学習部１２７と双方向に通信可能に接続される。特徴抽出部１２２ａは、検索部１２１と双方向に通信可能に接続されてもよい。 The feature extraction unit 122a has a function of calculating search data from an image. The feature extraction unit 122a is connected to the image data storage unit 123a so as to be unidirectionally communicable. The feature extraction unit 122a is connected to the feature data storage unit 124a and the learning unit 127 so as to be able to communicate bidirectionally. The feature extraction unit 122a may be connected to the search unit 121 so as to be able to communicate bidirectionally.

画像データ蓄積部１２３ａは、複数の画像を保存するデータベースである。画像データ蓄積部１２３ａには、学習部１２７で使用される学習用画像が保存される。特徴データ蓄積部１２４ａは、画像ごとの検索用データを保存するデータベースである。特徴データ蓄積部１２４ａには、学習用画像に含まれる検索用データを示す教師データが保存される。 The image data storage unit 123a is a database that stores a plurality of images. Learning images used by the learning unit 127 are stored in the image data storage unit 123a. The feature data storage unit 124a is a database that stores search data for each image. The feature data accumulation unit 124a stores teacher data indicating search data included in the learning image.

The learning unit 127 is a function that learns the parameters "θ". The learning unit 127 is connected to the image data storage unit 123, the feature data storage unit 124a, and the search result output unit 125a so that it can communicate with them unidirectionally. The learning process is described later with reference to FIG. 13.

検索結果出力部１２５ａは、検索部１２１の検索結果をユーザへ出力する機能である。検索結果出力部１２５ａは、例えば、モニタ１１に表示されるＵＩである。検索結果出力部１２５ａは、学習部１２７の学習結果を出力してもよい。 The search result output unit 125a is a function of outputting the search result of the search unit 121 to the user. The search result output unit 125a is a UI displayed on the monitor 11, for example. The search result output unit 125 a may output the learning result of the learning unit 127 .

図１３は、学習処理の流れ図である。学習処理は、学習部１２７の処理（Ｓ３０，Ｓ３１，Ｓ３６，Ｓ３７）と、特徴抽出部１２２ａの処理（Ｓ３２～Ｓ３５，Ｓ３８）と、にて実行される。学習処理は、ユーザからの操作に限らず、所定周期で実行されてもよい。 FIG. 13 is a flow chart of the learning process. The learning process is performed by the processes of the learning unit 127 (S30, S31, S36, S37) and the processes of the feature extraction unit 122a (S32 to S35, S38). The learning process is not limited to the user's operation, and may be executed at predetermined intervals.

学習部１２７は、学習用画像のデータを画像データ蓄積部１２３から複数取得する（Ｓ３０）。学習部１２７は、複数の学習用画像それぞれに対応する教師データを特徴データ蓄積部１２４ａから取得する（Ｓ３１）。 The learning unit 127 acquires a plurality of data of images for learning from the image data storage unit 123 (S30). The learning unit 127 acquires teacher data corresponding to each of the plurality of learning images from the feature data accumulation unit 124a (S31).

特徴抽出部１２２ａは、パラメータ「θ」を初期化する（Ｓ３２）。特徴抽出部１２２ａがＣＮＮで構成されている場合には、特徴抽出部１２２ａは、例えば、ガウス分布や一様分布からランダムに値を抽出することよって、パラメータ「θ」を初期化する。学習部１２７は、複数の学習用画像のデータを特徴抽出部１２２ａに送信する（Ｓ３３）。 The feature extraction unit 122a initializes the parameter "θ" (S32). When the feature extraction unit 122a is composed of a CNN, the feature extraction unit 122a initializes the parameter "θ" by, for example, randomly extracting values from a Gaussian distribution or a uniform distribution. The learning unit 127 transmits data of a plurality of learning images to the feature extraction unit 122a (S33).

The feature extraction unit 122a calculates search data from the learning images (S34). That is, the learning unit 127 causes the feature extraction unit 122 to calculate, as pre-learning data, the data of the objects shown in each learning image and the data of the relationships between those objects. The feature extraction unit 122a transmits the search data calculated for each learning image to the learning unit 127 (S35).

学習部１２７は、教師データと、特徴抽出部１２２ａが算出した学習前の検索用データと、に基づいてパラメータの更新値を算出する（Ｓ３６）。学習部１２７は、例えば、特徴抽出部１２２ａから受信した複数の検索用データと、特徴データ蓄積部１２４から受信した複数の教師データと、を学習用画像ごとに対応させる。学習部１２７は、検索用データと、教師データと、の誤差を算出する。学習部１２７は、算出した誤差に基づいて、複数のパラメータの更新値を計算する。 The learning unit 127 calculates updated values of the parameters based on the teacher data and the pre-learning search data calculated by the feature extraction unit 122a (S36). For example, the learning unit 127 associates a plurality of search data received from the feature extraction unit 122a with a plurality of teacher data received from the feature data accumulation unit 124 for each learning image. The learning unit 127 calculates the error between the search data and the teacher data. The learning unit 127 calculates update values for a plurality of parameters based on the calculated error.

学習部１２７は、例えば、二乗誤差の計算方法を用いることによって、各オブジェクトの位置「ｘ」，「ｙ」および各オブジェクトの大きさ「ｗ」，「ｈ」（数式１参照）の誤差を算出する。学習部１２７は、例えば、「Ｓｏｆｔｍａｘｃｒｏｓｓｅｎｔｒｏｐｙ」を用いることによって、推定の信頼度「ｃ」，「ｃｒ」の誤差を算出する。学習部１２７は、例えば、誤差逆伝播法を用いることによって、算出した各誤差の値からパラメータの更新値を算出する。 The learning unit 127 calculates the errors of the positions “x” and “y” of each object and the sizes “w” and “h” of each object (see Equation 1), for example, by using a squared error calculation method. do. The learning unit 127 calculates the errors of the estimation reliability levels "c" and "cr" by using, for example, "Softmax cross entropy". The learning unit 127 calculates the update value of the parameter from each calculated error value by using, for example, the error backpropagation method.

学習部１２７は、パラメータの更新値を特徴抽出部１２２ａに送信する（Ｓ３７）。特徴抽出部１２２ａは、パラメータの値を更新する（Ｓ３８）。特徴抽出部１２２ａは、例えば、確率的勾配降下法を用いることによって、パラメータを更新する。 The learning unit 127 transmits the updated parameter values to the feature extraction unit 122a (S37). The feature extraction unit 122a updates the parameter values (S38). The feature extractor 122a updates the parameters by using stochastic gradient descent, for example.
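The error terms and the update step named above can be sketched with plain Python. This is a toy illustration, not the training procedure of the embodiment: the function names are hypothetical, and the class scores (logits) stand in directly for the parameters, so the gradient used is the well-known closed form for softmax cross entropy with respect to the logits rather than a full backpropagation through a CNN.

```python
import math


def squared_error(predicted, target):
    """Squared error, as used for positions (x, y) and sizes (w, h)."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target))


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(logits, label):
    """Cross entropy between softmax(logits) and a one-hot label index."""
    return -math.log(softmax(logits)[label])


def sgd_step(params, grads, learning_rate):
    """One gradient descent update: theta <- theta - lr * grad."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


# Toy setting: the logits themselves play the role of the parameters.
logits = [0.0, 0.0, 0.0]
label = 1
loss_before = softmax_cross_entropy(logits, label)

# Gradient of softmax cross entropy w.r.t. the logits: softmax - one_hot.
probs = softmax(logits)
grads = [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]

logits = sgd_step(logits, grads, learning_rate=0.5)
loss_after = softmax_cross_entropy(logits, label)
print(loss_before, loss_after)
```

In a real CNN the gradient would be propagated back through every layer, typically by an automatic differentiation framework, and the update would be applied per mini-batch of learning images.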

学習部１２７は、学習を継続するか終了するかを決定する（Ｓ３９）。学習を続行する場合（Ｓ３９：Ｎｏ）には、学習部１２７の処理は、処理（Ｓ３３）に移動する。学習を終了する場合（Ｓ３９：Ｙｅｓ）には、学習部１２７の処理は、終了する。 The learning unit 127 determines whether to continue or end the learning (S39). When learning is to be continued (S39: No), the processing of the learning unit 127 moves to processing (S33). If learning is to end (S39: Yes), the processing of the learning unit 127 ends.

なお、学習部１２７の終了処理は、ユーザによって操作されてもよい。学習部１２７は、更新後のパラメータにて算出された検索用データおよび教師データの誤差と、更新前のパラメータにて算出された検索用データおよび教師データの誤差と、の差分を監視することによって、学習を続行させるかどうかを判断してもよい。 Note that the end processing of the learning unit 127 may be operated by the user. The learning unit 127 monitors the difference between the error in the search data and teacher data calculated with the updated parameters and the error in the search data and teacher data calculated with the pre-update parameters. , may decide whether to continue learning.
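The stopping rule based on the change in error between updates can be sketched as a small loop. The function name `train_until_converged`, the `tolerance` and `max_iterations` parameters, and the sample error sequence are assumptions made for illustration; the description does not specify a threshold or an iteration limit.

```python
def train_until_converged(step, initial_error, tolerance=1e-4, max_iterations=1000):
    """Repeat step() until the error changes by less than tolerance.

    step is a callable that performs one round of S33 to S38 and returns
    the error measured with the updated parameters.
    Returns (iterations_run, final_error).
    """
    previous = initial_error
    for iteration in range(1, max_iterations + 1):
        current = step()
        if abs(previous - current) < tolerance:
            return iteration, current
        previous = current
    return max_iterations, previous


# Stand-in for real training: replay a fixed sequence of errors.
errors = iter([0.5, 0.25, 0.2, 0.19999])
iterations, final_error = train_until_converged(lambda: next(errors), initial_error=1.0)
print(iterations, final_error)
```

Here the loop stops on the fourth round, when the error changes by only about 0.00001 between updates.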

なお、検索開始処理（Ｓ１２：Ｙｅｓ）（図８参照）が実行された場合には、特徴抽出部１２２ａは、画像データ蓄積部１２３から複数の画像を取得し、検索用データを算出してもよい。これにより、検索部１２１は、パラメータが更新された特徴抽出部１２２ａによって算出された検索用データに基づいて画像検索することができる。 Note that when the search start process (S12: Yes) (see FIG. 8) is executed, the feature extraction unit 122a acquires a plurality of images from the image data storage unit 123, and calculates search data. good. Thereby, the search unit 121 can perform an image search based on the search data calculated by the feature extraction unit 122a with updated parameters.

このように構成される本実施例では、第１の実施例と同様の作用効果を奏する。さらに、本実施例によれば、画像検索装置１ａは、学習部１２７を備える為、特徴抽出部１２２ａのパラメータを更新することができる。これにより、特徴抽出部１２２ａは、検索用データを算出する精度を向上させることができる。その結果、画像検索装置１ａは、画像検索の精度を向上させることができる。 The present embodiment configured in this manner has the same effects as those of the first embodiment. Furthermore, according to the present embodiment, the image search device 1a includes the learning unit 127, so it is possible to update the parameters of the feature extraction unit 122a. Thereby, the feature extraction unit 122a can improve the accuracy of calculating the search data. As a result, the image retrieval device 1a can improve the accuracy of image retrieval.

本実施例は、第１実施例および第２実施例の変形例に相当するため、第１実施例および第２実施例との相違を中心に説明する。図１４は、画像検索装置１ｂの概略図である。本実施例における画像検索装置１ｂは、入力画像に基づいて、所定の画像を検索する。 Since the present embodiment corresponds to a modified example of the first and second embodiments, the differences from the first and second embodiments will be mainly described. FIG. 14 is a schematic diagram of the image search device 1b. The image search device 1b in this embodiment searches for a predetermined image based on an input image.

画像検索装置１ｂは、検索クエリ入力部１２０ｂと、検索部１２１ｂと、特徴抽出部１２２ｂと、画像データ蓄積部１２３ａと、特徴データ蓄積部１２４ａと、検索結果出力部１２５ａと、画像データ取得部１２６と、学習部１２７と、を有する。 The image search device 1b includes a search query input unit 120b, a search unit 121b, a feature extraction unit 122b, an image data storage unit 123a, a feature data storage unit 124a, a search result output unit 125a, and an image data acquisition unit 126. , and a learning unit 127 .

The search query input unit 120b is a function that receives an input image from the user. The input image shows objects that are the same as or similar to the objects included in the predetermined image. The search query input unit 120b is, for example, a UI displayed on the monitor 11. The search query input unit 120b is connected to the search unit 121b and the learning unit 127 so that it can communicate with them unidirectionally. The search query input unit 120b is described later with reference to FIG. 15.

特徴抽出部１２２ｂは、画像から検索用データを算出する機能である。特徴抽出部１２２ｂは、画像データ蓄積部１２３ａと単方向に通信可能に接続される。特徴抽出部１２２ｂは、検索部１２１ｂ、特徴データ蓄積部１２４ｂおよび学習部１２７と双方向に通信可能に接続される。 The feature extraction unit 122b has a function of calculating search data from an image. The feature extraction unit 122b is connected to the image data storage unit 123a so as to be unidirectionally communicable. The feature extraction unit 122b is connected to the search unit 121b, the feature data storage unit 124b, and the learning unit 127 so as to be able to communicate bidirectionally.

The feature extraction unit 122b calculates a search query from the input image data. The search unit 121b is a function that searches the storage unit 12 based on the search query calculated by the feature extraction unit 122 and extracts the predetermined image corresponding to the search query. The search unit 121b is connected to the image data storage unit 123a, the feature data storage unit 124a, and the search result output unit 125a so that it can communicate with them unidirectionally. The processing of the search unit 121b and the feature extraction unit 122b is described later with reference to FIG. 16.

FIG. 15 is an explanatory diagram of the search query input unit 120b. The search query input unit 120b displays an image input field 1204 and a search button 1203. The image input field 1204 is the area into which an input image is entered. The user may enter an image of an object included in the predetermined image, an image of an object similar to an object included in the predetermined image, or an image similar to the predetermined image.

画像入力欄１２０４は、例えば、モニタ１１の中央に表示される。ユーザは、１枚の画像を入力することに限らず、複数の画像を入力してもよい。この場合には、画像入力欄１２０４は、例えば、モニタ１１のスクロール方向に並べて複数表示されてもよい。 The image input field 1204 is displayed in the center of the monitor 11, for example. The user is not limited to inputting one image, and may input a plurality of images. In this case, a plurality of image input fields 1204 may be displayed side by side in the scroll direction of the monitor 11, for example.

図１６は、画像検索処理の流れ図である。画像検索処理は、検索ボタン１２０３をユーザが押すことによって実行されてもよい。検索部１２１ｂは、検索クエリ入力部１２０ｂから入力画像のデータを取得（Ｓ４１）する。検索部１２１ｂは、特徴抽出部１２２ｂへ入力画像のデータを送信する（Ｓ４２）。 FIG. 16 is a flowchart of image search processing. Image search processing may be executed by the user pressing a search button 1203 . The search unit 121b acquires input image data from the search query input unit 120b (S41). The search unit 121b transmits data of the input image to the feature extraction unit 122b (S42).

特徴抽出部１２２ｂは、入力画像から検索クエリを算出する（Ｓ４３）。特徴抽出部１２２ｂは、検索部１２１ｂに検索クエリを送信する（Ｓ４４）。 The feature extraction unit 122b calculates a search query from the input image (S43). The feature extraction unit 122b transmits the search query to the search unit 121b (S44).

検索部１２１ｂは、特徴抽出部１２２ｂから取得した入力画像の検索クエリと、特徴データ蓄積部１２４に保存される複数の画像の検索用データとを比較することによって、複数の画像の中から所定の画像を検索する（Ｓ４５）。検索部１２１ｂは、所定の画像を検索結果出力部１２５ａへ送信する（Ｓ４６）。 The search unit 121b compares the search query of the input image acquired from the feature extraction unit 122b with the search data of the plurality of images stored in the feature data storage unit 124, thereby obtaining a predetermined image from among the plurality of images. An image is searched (S45). The search unit 121b transmits a predetermined image to the search result output unit 125a (S46).

なお、画像検索装置１ｂは、入力画像と複数の画像との類似度に基づいて所定の画像を検索してもよい。この場合において、特徴抽出部１２２ｂは、例えば、ＣＮＮを用いることによって、画像から特徴ベクトルを算出する。 Note that the image search device 1b may search for a predetermined image based on the degree of similarity between the input image and a plurality of images. In this case, the feature extraction unit 122b calculates a feature vector from the image by using CNN, for example.

特徴ベクトルは、例えば、画像に示される特徴を示すｍ次元（ｍは所定の定数）のデータ群である。特徴ベクトルには、画像に含まれる複数のオブジェクトの特徴と、前記画像に含まれる複数のオブジェクト間の関係性の特徴と、が含まれる。特徴抽出部１２２ｂは、オブジェクトの色の特徴およびオブジェクトの模様の特徴等を特徴ベクトルとして算出してもよい。 A feature vector is, for example, an m-dimensional (m is a predetermined constant) data group representing features shown in an image. A feature vector includes features of a plurality of objects included in an image and features of relationships between the plurality of objects included in the image. The feature extraction unit 122b may calculate the feature of the color of the object, the feature of the pattern of the object, etc. as a feature vector.

Based on the feature vector calculated from the input image and the feature vectors calculated from the images stored in the storage unit 12, the feature extraction unit 122b calculates a degree of similarity to the input image for each image stored in the storage unit 12, yielding a plurality of degrees of similarity. The search unit 121b searches the storage unit 12 based on the plurality of degrees of similarity and extracts the predetermined image.

Note that the feature extraction unit 122b may use a CNN "activation map" to calculate information on the region of an image in which an object appears. By calculating a feature vector for that region, the feature extraction unit 122b can improve the degree of similarity for the same object shown in the input image and in the plurality of images.

The feature extraction unit 122b may calculate the degree of similarity between the feature vector of the input image and the feature vectors of the plurality of images using, for example, the Euclidean distance. That is, the feature extraction unit 122b calculates the degree of similarity by measuring the distance between the feature vector of the input image and the feature vector of each of the plurality of images in an m-dimensional space whose axes are the m types of features.
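
As a minimal sketch of the Euclidean-distance comparison described above, the following Python fragment measures the distance between m-dimensional feature vectors. The function names, the sample vectors, and the mapping from distance to similarity (1 / (1 + d)) are assumptions made for illustration; the description only states that the distance between the vectors is measured.

```python
import math


def euclidean_distance(a, b):
    """Distance between two m-dimensional feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension m")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(a, b):
    """Map a distance to a similarity in (0, 1].

    The 1 / (1 + d) mapping is an assumption; any monotonically
    decreasing function of the distance would serve the same purpose.
    """
    return 1.0 / (1.0 + euclidean_distance(a, b))


# Hypothetical feature vectors (m = 3) for an input image and two stored images.
query_vector = [0.0, 0.0, 0.0]
stored_vectors = {"img_a": [3.0, 4.0, 0.0], "img_b": [0.0, 0.0, 0.0]}

# One degree of similarity per stored image.
similarities = {
    name: similarity(query_vector, vector)
    for name, vector in stored_vectors.items()
}
```

A smaller distance gives a larger similarity, so an image identical to the input image in feature space receives the maximum value of 1.0.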

The learning unit 127 may set, in the feature extraction unit 122b, the parameters used to calculate the feature vector of an image. The learning unit 127 may train the feature extraction unit 122b to calculate feature vectors from color-corrected versions of the plurality of images. This allows the search unit 121b to calculate a higher degree of similarity than would be calculated without color correction.

The search result output unit 125b may output the predetermined images based on the degree of similarity. For example, the search result output unit 125b may display a plurality of predetermined images on the monitor 11 in descending order of similarity. This allows the image search device 1b to improve visibility.
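
The descending-order display can be sketched as a simple sort. The function name, the `top_k` parameter, and the sample scores below are hypothetical and are not taken from the specification.

```python
def rank_by_similarity(similarities, top_k=None):
    """Return (image_id, similarity) pairs, most similar first.

    similarities: dict mapping an image ID to its degree of similarity.
    top_k: optional limit on how many results to display (an assumption).
    """
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Hypothetical similarity scores for three stored images.
scores = {"img_a": 0.2, "img_b": 0.9, "img_c": 0.5}
display_order = rank_by_similarity(scores)
```

Limiting the list with `top_k` is one possible way to show only the best matches; the specification itself only says that images are arranged from the highest similarity downward.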

The present embodiment configured in this manner provides the same effects as the first and second embodiments. Furthermore, because the image search device 1b includes the search query input unit 120b and the feature extraction unit 122b that calculates a search query from an input image, it can search for images similar to the input image. The user can therefore perform an image search by supplying an input image.

Furthermore, the feature extraction unit 122b has a function of calculating the feature vector of an image and a function of calculating a degree of similarity between the input image and each image stored in the image data storage unit 123a. The search unit has a function of searching the storage unit based on the plurality of degrees of similarity and extracting the predetermined image. The image search device 1b can thus perform an image search based on the degree of similarity between the input image and the images stored in the image data storage unit 123a.

Since this embodiment is a modification of the first to third embodiments, the description focuses on the differences from them. FIG. 17 is a schematic diagram of the image search device 1c. The image search device 1c of this embodiment performs an image search using an input search query and then reorders the resulting plurality of predetermined images for display based on an input image.

The image search device 1c includes a search query input unit 120c, a search unit 121c, a feature extraction unit 122c, an image data storage unit 123a, a feature data storage unit 124a, a search result output unit 125a, an image data acquisition unit 126, and a learning unit 127.

The search query input unit 120c is a function that receives a search query and an input image from the user. The search query input unit 120c accepts the search query as text. The search query input unit 120c is, for example, a UI displayed on the monitor 11. The search query input unit 120c is connected to the search unit 121c and the learning unit 127 so as to allow one-way communication. The search query input unit 120c is described later with reference to FIG. 18.

The feature extraction unit 122c has a function of calculating search data from an image and a function of calculating a feature vector from image data. The feature extraction unit 122c is connected to the image data storage unit 123c so as to allow one-way communication. The feature extraction unit 122c is connected to the search unit 121c, the feature data storage unit 124a, and the learning unit 127 so as to allow two-way communication.

The search unit 121c is a function that searches the storage unit 12 based on the search query and extracts a predetermined image corresponding to the search query. The search unit 121c is connected to the image data storage unit 123a, the feature data storage unit 124a, and the search result output unit 125a so as to allow one-way communication. The processing of the search unit 121c and the feature extraction unit 122c is described later with reference to FIG. 19.

FIG. 18 is an explanatory diagram of the search query input unit 120c. The search query input unit 120c causes the monitor 11 to display an object name input field 1201, a relationship name input field 1202, a search button 1203, and an image input field 1204. For example, object name information 1242 indicating "person" and "bag" is entered in the object name input field 1201, and relationship name information 1244d indicating "has" is entered in the relationship name input field 1202.

FIG. 19 is a flowchart of the image search processing. The search unit 121c acquires an input image and a search query from the search query input unit 120c (S51). The search unit 121c searches the plurality of images for at least one predetermined image (S52). For example, the search unit 121c searches for a plurality of predetermined images from among the plurality of images in the same manner as the search processing (S12 to S15) of the first embodiment (see FIG. 8).

The feature extraction unit 122c acquires the data of the input image and the data of the plurality of predetermined images from the search unit 121c. The feature extraction unit 122c calculates the feature vector of the input image and the feature vector of each predetermined image. Based on the calculated feature vectors, the feature extraction unit 122c calculates the degree of similarity to the input image for each predetermined image (S53).

The feature extraction unit 122c transmits the plurality of similarity data to the search unit 121c (S54). The search unit 121c transmits the data of the plurality of predetermined images and the similarity data set for each predetermined image to the search result output unit 125c. The search result output unit 125c displays the plurality of predetermined images on the monitor 11 based on the degree of similarity (S55). For example, the search result output unit 125c displays the images with higher similarity among the plurality of predetermined images in preference to the other images.
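
Steps S51 to S55 can be sketched as a two-stage pipeline: a text query first narrows the candidates, and the input image then determines their display order. Everything named below (`matches_query`, `search_and_rerank`, the record layout with `id`, `search_data`, and `vector`, and the 1 / (1 + d) similarity) is a hypothetical illustration, not the implementation described in the specification.

```python
import math


def matches_query(search_data, query):
    """True if every term of the text query appears in the image's search data."""
    return set(query) <= set(search_data)


def search_and_rerank(images, query, input_vector):
    """Filter stored images by a text query, then order them by similarity.

    images: list of dicts with hypothetical keys "id", "search_data", "vector".
    query: iterable of object names and relationship names.
    input_vector: feature vector calculated from the input image.
    """
    # S52: keep only the images whose search data matches the text query.
    hits = [img for img in images if matches_query(img["search_data"], query)]
    # S53: degree of similarity of each hit to the input image (assumed mapping).
    scored = [
        (img["id"], 1.0 / (1.0 + math.dist(img["vector"], input_vector)))
        for img in hits
    ]
    # S55: display the most similar images first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


stored_images = [
    {"id": 1, "search_data": {"person", "bag", "has"}, "vector": [1.0, 0.0]},
    {"id": 2, "search_data": {"person", "dog"}, "vector": [0.0, 0.0]},
    {"id": 3, "search_data": {"person", "bag", "has"}, "vector": [0.0, 0.1]},
]
results = search_and_rerank(stored_images, ["person", "bag", "has"], [0.0, 0.0])
```

In this sample, image 2 is closest to the input image in feature space but is excluded because it does not match the text query, which reflects the order of operations in FIG. 19.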

The present embodiment configured in this manner provides the same effects as the first to third embodiments. Furthermore, because the image search device 1c has the search query input unit 120c and the search unit 121c, it can reorder and display the images retrieved by the search query. The image search device 1c can display the images retrieved by the search unit 121c based on their degree of similarity to the input image. The image search device can thereby improve the visibility of the search results.

Since this embodiment is a modification of the first to fourth embodiments, the description focuses on the differences from them. FIG. 20 is an explanatory diagram of object information. The image search device of this embodiment calculates lower keywords 1259 from an input upper keyword 1258 and performs an image search based on the lower keywords 1259. The search unit 121d performs processing that calculates a plurality of lower keywords 1259 from the upper keyword 1258.

The upper keyword 1258 is a keyword indicating a feature of the predetermined image. The upper keyword 1258 is entered by the user into the search query input unit. A lower keyword 1259 is the name of an object that makes up the upper keyword 1258. A lower keyword 1259 may also indicate a property of an object included in the predetermined image, or a property of the upper keyword 1258.

For example, the user enters "female employee" into the search query input unit. The search unit 121d acquires data indicating "female employee" from the input unit.

The search unit 121d calculates a plurality of lower keywords 1259 corresponding to "female employee". For example, the search unit 121d calculates data indicating "woman", "suit", "pumps", "shoes", or the like. Note that the search unit 121d may estimate the lower keywords 1259 from the upper keyword 1258 based on the image search history.

The search unit 121d searches for at least one predetermined image by comparing the search data stored in the feature data storage unit with the plurality of lower keywords 1259 corresponding to "female employee". The search unit transmits the data of the predetermined image to the search result output unit. The output unit displays the predetermined image.

The image search device of this embodiment can perform an image search using a plurality of lower keywords 1259 even when the user enters only a single upper keyword 1258. This improves the accuracy of the image search.
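
The expansion of an upper keyword into lower keywords can be sketched with a lookup table. The table contents follow the "female employee" example above, but the table itself, the function names, and the rule that an image must match at least `min_matches` lower keywords are assumptions for illustration; the specification does not fix how the lower keywords are combined.

```python
# Hypothetical mapping from an upper keyword to its lower keywords.
LOWER_KEYWORDS = {
    "female employee": ["woman", "suit", "pumps", "shoes"],
}


def expand_keyword(upper_keyword):
    """Return the lower keywords for an upper keyword (itself if unknown)."""
    return LOWER_KEYWORDS.get(upper_keyword, [upper_keyword])


def search_by_upper_keyword(index, upper_keyword, min_matches=2):
    """Find images whose search data shares enough lower keywords.

    index: dict mapping an image ID to the labels stored as its search data.
    min_matches: assumed threshold on the number of matching lower keywords.
    """
    lower = set(expand_keyword(upper_keyword))
    results = []
    for image_id, labels in index.items():
        matched = lower & set(labels)
        if len(matched) >= min_matches:
            results.append((image_id, len(matched)))
    # Images matching more lower keywords come first.
    return sorted(results, key=lambda pair: pair[1], reverse=True)


index = {
    "img1": ["woman", "suit", "pumps"],
    "img2": ["man", "suit"],
    "img3": ["woman", "shoes"],
}
hits = search_by_upper_keyword(index, "female employee")
```

A table learned from the image search history, as the description suggests, could replace the hard-coded dictionary without changing the search function.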

Since this embodiment is a modification of the first to fourth embodiments, the description focuses on the differences from them. The image search device of this embodiment performs an image search based on changes over time in a plurality of objects appearing in a moving image. The image search device of this embodiment is described below using persons 25 and 29 moving through a station 5 as an example.

FIG. 21 is a schematic diagram of the station 5. The station 5 is provided with, for example, an entrance 51, a ticket vending machine 52, and a ticket gate 53. The persons 25 and 29 moving within the station 5 are captured by a device that records moving images, such as a surveillance camera.

For example, the person 25 goes from the entrance 51 to the ticket vending machine 52, purchases a ticket at the ticket vending machine 52, and heads for the ticket gate 53. The person 29 exits through the ticket gate 53 and then heads for the entrance 51.

FIG. 22(1) is an explanatory diagram of information on the persons 25 and 29 moving within the station 5. The feature data storage unit 124e has a "feature ID" 1261, feature information 1262, an earlier-time feature 1263, and a later-time feature 1264. The "feature ID" 1261 stores information identifying the feature information 1262 of the persons 25 and 29.

The feature information 1262 stores data indicating features of the persons 25 and 29, for example "person getting on a train" or "person who got off a train". The earlier-time feature 1263 and the later-time feature 1264 indicate how the state of the persons 25 and 29 changes over time. That is, when the persons 25 and 29 change from the state "has a wallet" to the state "heading from the ticket vending machine to the ticket gate", the persons 25 and 29 have the feature "person getting on a train".

FIG. 22(2) shows the details of the earlier-time feature 1263. The earlier-time feature 1263 has an "earlier feature ID" 12631, a first object 12632, a second object 12633, and a relationship 12634. The "earlier feature ID" 12631 stores information for identifying the earlier-time feature 1263.

The first object 12632 and the second object 12633 store object information. The first object 12632 stores, for example, "person". The second object 12633 stores, for example, "wallet" or "ticket gate".

The relationship 12634 stores information on the relationship between the first object 12632 and the second object 12633, for example "has" or "passes through".

FIG. 22(3) shows the details of the later-time feature 1264. The later-time feature 1264 has a "later feature ID" 12641, a first object 12642, a second object 12643, and a relationship 12644. The "later feature ID" 12641 stores information for identifying the later-time feature 1264.

The first object 12642 and the second object 12643 store object information. The first object 12642 stores, for example, "person". The second object 12643 stores, for example, "ticket" or "entrance". The relationship 12644 stores the relationship between the first object 12642 and the second object 12643, for example "has" or "passes through".

For example, the user enters "person getting on a train" into the input unit. The search unit acquires data indicating "person getting on a train" from the input unit. The search unit searches the moving image of the station 5 for persons 25 and 29 who change from the state "has a wallet" to the state "heading from the ticket vending machine to the ticket gate".

At position information 291, the search unit recognizes that the person 25 is in the state of "having" a "wallet". At position information 292, the search unit recognizes that the person 25 is in the state of "having" a "ticket". Because the person 25 moves from position information 291 to position information 292, the search unit determines that the person 25 is a "person getting on a train".
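
The earlier-time and later-time features can be sketched as a pair of (first object, second object, relationship) records, with a search that looks for the earlier state followed by the later state. The class names, field names, and the `find_transition` helper are hypothetical; the sample rule follows the wallet-then-ticket example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One observed state: first object, second object, and their relationship."""
    first_object: str
    second_object: str
    relationship: str


@dataclass(frozen=True)
class TemporalFeature:
    """Feature information with its earlier-time and later-time features."""
    feature_info: str
    earlier: Relation
    later: Relation


# Hypothetical rule taken from the example: wallet first, ticket afterwards.
BOARDING = TemporalFeature(
    feature_info="person getting on a train",
    earlier=Relation("person", "wallet", "has"),
    later=Relation("person", "ticket", "has"),
)


def find_transition(observations, feature):
    """Return (t_earlier, t_later) for the first matching transition, else None.

    observations: chronologically ordered (time, Relation) pairs for one
    tracked person; how the person is tracked is outside this sketch.
    """
    t_earlier = None
    for t, relation in observations:
        if t_earlier is None and relation == feature.earlier:
            t_earlier = t
        elif t_earlier is not None and relation == feature.later:
            return (t_earlier, t)
    return None


person_25 = [
    (0, Relation("person", "wallet", "has")),
    (5, Relation("person", "ticket vending machine", "uses")),
    (9, Relation("person", "ticket", "has")),
]
span = find_transition(person_25, BOARDING)
```

The returned time span could then be used to cut the matching frames out of the moving image data; the order check matters, since the same two states in reverse order would not indicate a person getting on a train.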

The search unit extracts an image showing the person 25 from the moving image data stored in the image data storage unit as the predetermined image. The search unit transmits the predetermined image to the output unit. The search result output unit displays the predetermined image on the monitor.

The image search device of this embodiment can search for a predetermined image based on objects, or relationships between objects, that change over time.

Since this embodiment is a modification of the first to fourth embodiments, the description focuses on the differences from them. The image search device 1f of this embodiment is used for image searches by a plurality of users and has image data storage units 123 isolated for each user. FIG. 23 is a hardware configuration diagram of the image search device 1f. A plurality of terminals 6(1) to 6(n) (n is an arbitrary integer) are connected to the image search device 1f via a network 7. The terminals 6(1) to 6(n) may be referred to as terminals 6 when they need not be distinguished.

The image search device 1f has a storage unit 12f, a CPU 14, a memory 15, a communication interface 16, and a data transmission path 17. The storage unit 12f has a search unit 121, a feature extraction unit 122, an image data acquisition unit 126, and terminal databases 128(1) to 128(n). The terminal databases 128(1) to 128(n) may be referred to as terminal databases 128 when they need not be distinguished. The image search device 1f can be provided on one or more computers on a network, as in what is commonly known as the cloud.

The terminal database 128 is a database that stores image data and search data for each terminal 6. Because the data stored in each terminal database 128 is isolated, access from any terminal 6 other than the corresponding terminal 6 is restricted. For example, the terminal database 128(1) is accessible from the terminal 6(1), while access from the terminal 6(2) is restricted. The terminal databases 128 have image data storage units 123(1) to 123(n) and feature data storage units 124(1) to 124(n).

Each terminal 6 is a computer having one of the communication interfaces 61(1) to 61(n) (shown as "communication I/F" in the figure), a search query input unit 120, and a search result output unit 125. The communication interfaces 61(1) to 61(n) may be referred to as communication interfaces 61 when they need not be distinguished.

The communication interface 61 is, for example, a LAN connection terminal, a SAN connection terminal, or a wireless communication device. The search query input unit 120 and the search result output unit 125 are stored in the storage unit of each terminal 6.

For example, the user uses the terminal 6(1) among the plurality of terminals 6. The user enters a search query into the search query input unit 120. The terminal 6(1) transmits the search query and information identifying the terminal 6(1) to the image search device 1f via the communication interface 61(1).

The image search device 1f performs an image search with the search unit 121 based on the acquired search query. The search unit 121 selects the terminal database 128(1) based on the information identifying the terminal 6(1). Based on the search data stored in the feature data storage unit 124(1) and the search query acquired from the terminal 6(1), the search unit 121 searches for a predetermined image from among the plurality of images stored in the image data storage unit 123(1).

The image search device 1f transmits the predetermined image to the terminal 6(1) via the communication interface 16. The search result output unit 125 of the terminal 6(1) outputs the acquired predetermined image.

In this embodiment, a single image search device 1f can be shared by a plurality of terminals 6. This allows a plurality of users to use the image search device 1f.

Because the image data and search data stored in the storage unit 12 are isolated for each terminal database 128, the image search device 1f can hide each user's data from other users even when it is used by a plurality of users with different attributes.
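
The per-terminal isolation can be sketched as one database object per terminal ID, with every search scoped to the requesting terminal's own database. The class and method names are hypothetical, and the sketch assumes the terminal ID is trustworthy; a real deployment would also need to authenticate the terminal, which the description does not detail.

```python
class TerminalDatabase:
    """Image data and search data belonging to a single terminal."""

    def __init__(self):
        self.images = {}       # image ID -> image data (image data storage unit 123)
        self.search_data = {}  # image ID -> set of labels (feature data storage unit 124)


class ImageSearchService:
    """Hypothetical stand-in for the image search device 1f."""

    def __init__(self, terminal_ids):
        # One isolated database per terminal, selected by terminal ID.
        self._databases = {tid: TerminalDatabase() for tid in terminal_ids}

    def register(self, terminal_id, image_id, image, labels):
        db = self._databases[terminal_id]
        db.images[image_id] = image
        db.search_data[image_id] = set(labels)

    def search(self, terminal_id, query):
        # Only the database of the requesting terminal is ever consulted.
        db = self._databases[terminal_id]
        wanted = set(query)
        return [
            db.images[image_id]
            for image_id, labels in db.search_data.items()
            if wanted <= labels
        ]


service = ImageSearchService(["terminal-1", "terminal-2"])
service.register("terminal-1", "img1", "<image bytes>", ["person", "bag", "has"])
own_results = service.search("terminal-1", ["person", "bag"])
other_results = service.search("terminal-2", ["person", "bag"])
```

Because the lookup never crosses database boundaries, the second terminal receives an empty result for a query that succeeds on the first terminal.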

Since this embodiment is a modification of the first to fourth embodiments, the description focuses on the differences from them. The image search system of this embodiment has a function that allows a user to search for images in the moving image data of a plurality of surveillance cameras 9. FIG. 23 is an explanatory diagram of the image search system 8. The image search system 8 has image search devices 81 and 82(1) to 82(p), where "p" is an arbitrary constant. The image search devices 82(1) to 82(p) may be referred to as image search devices 82 when they need not be distinguished. Each image search device 82 is generated by having the image search device 81 additionally learn data for each user.

Each image search device 82 is connected via the network 7 to the plurality of terminals 6 and to surveillance cameras 9(1) to 9(q) so as to allow two-way communication, where "q" is an arbitrary constant. The surveillance cameras 9(1) to 9(q) may be referred to as surveillance cameras 9 when they need not be distinguished. The surveillance camera 9 may be any other device that records moving images.

By operating a terminal 6, the user searches the captured data of the surveillance cameras 9 for a predetermined image. For example, the user selects the terminal 6(1) and operates it to select one of the surveillance cameras 9, such as the surveillance camera 9(1).

The user uses an image search device 82 to search the captured data of the surveillance camera 9(1) for a predetermined image. In this embodiment, the image search device 82 corresponding to the user is, for example, the image search device 82(1). The image search device 82(1) performs the image search by acquiring the captured data from the surveillance camera 9(1). The image search device 82(1) transmits the resulting image to the terminal 6(1), and the terminal 6(1) outputs it.

Because the image search system 8 of this embodiment is connected to each terminal 6 and each surveillance camera 9 via the network 7, the user can perform image searches using the captured data of each surveillance camera.

Note that the present invention is not limited to the embodiments described above and includes various modifications. The embodiments have been described in detail to make the present invention easy to understand, and the invention is not necessarily limited to configurations having all of the described elements. Part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment. For part of the configuration of each embodiment, other configurations can be added, deleted, or substituted.

Some or all of the configurations, functions, processing units, processing means, and the like described above may be realized in hardware, for example by designing them as integrated circuits. They may also be realized in software, with a processor interpreting and executing programs that implement the respective functions. Information such as the programs, tables, and files that implement each function can be stored in a recording device such as a memory, a hard disk, or an SSD (Solid State Drive), or on a recording medium such as an IC card, an SD card, or a DVD.

The technical features included in the embodiments described above are not limited to the combinations specified in the claims and can be combined as appropriate.

REFERENCE SIGNS LIST: 1: image search device, 2: image, 120: search query input unit, 121: search unit, 122: feature extraction unit, 123: image data storage unit, 124: feature data storage unit, 125: search result output unit, 126: image data acquisition unit

Claims

An image retrieval device,
a feature extraction unit that calculates search data including object name information of a plurality of objects included in an image and relationship name information indicating a predetermined relationship between the plurality of objects included in the image;
a storage unit that stores the search data in association with the image;
an input unit that receives input of at least one of the object name information and the relationship name information as a search query;
a search unit that searches the storage unit based on the search query and extracts a predetermined image corresponding to the search query;
and an output unit that outputs search results,
wherein the input unit has:
a function of accepting the search query as text; and
a function of accepting an input image containing a plurality of objects,
the search unit has a function of extracting, from the storage unit, a plurality of predetermined images corresponding to the search query accepted as text,
the feature extraction unit has:
a function of calculating a feature vector indicating features of a plurality of objects included in an image and features of relationships between the plurality of objects included in the image; and
a function of calculating a plurality of degrees of similarity to the input image for each predetermined image based on the feature vector calculated from the predetermined image and the feature vector calculated from the input image, and
the output unit has a function of outputting the predetermined images based on the degrees of similarity.
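
The flow recited in this claim (text query, extraction of matching images, ranking by similarity to an input image) can be illustrated with a minimal Python sketch. It is not the implementation of the embodiments. The record type `StoredImage`, the representation of search data as (object name, relationship name, object name) triples, the wildcard convention, the use of cosine similarity, and the use of a single similarity score per image are all assumptions made for illustration; the claim does not fix a similarity measure.

```python
import math
from dataclasses import dataclass


@dataclass
class StoredImage:
    """Hypothetical storage record: search data and feature vector per image."""
    image_id: str
    triples: list   # (object name, relationship name, object name) tuples
    feature: list   # feature vector of objects and their relationships


def matches(triples, query):
    """True if any stored triple agrees with the query; None acts as a wildcard."""
    return any(
        all(q is None or q == t for q, t in zip(query, triple))
        for triple in triples
    )


def cosine(a, b):
    """Cosine similarity between two vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def search(storage, text_query, input_feature):
    """Extract images matching the text query, most similar to the input image first."""
    hits = [img for img in storage if matches(img.triples, text_query)]
    return sorted(hits, key=lambda img: cosine(img.feature, input_feature), reverse=True)
```

A call such as `search(storage, ("person", "holds", None), input_feature)` would return every stored image in which a person holds some object, ordered by similarity to the input image.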

An image retrieval device comprising:
a feature extraction unit that calculates search data including object name information of a plurality of objects included in an image and relationship name information indicating a predetermined relationship between the plurality of objects included in the image;
a storage unit that stores the search data in association with the image;
an input unit that receives input of at least one of the object name information and the relationship name information as a search query;
a search unit that searches the storage unit based on the search query and extracts a predetermined image corresponding to the search query;
an output unit that outputs search results; and
a learning unit that learns parameters for calculating the search data from the image,
wherein the feature extraction unit has the parameters,
the storage unit stores:
a learning image used when the learning unit learns; and
teacher data indicating search data included in the learning image, and
the learning unit has:
a function of calculating, as pre-learning data, the search data included in the learning image using the feature extraction unit before the parameters are updated;
a function of learning updated values of the parameters based on the teacher data and the pre-learning data; and
a function of updating the parameters of the feature extraction unit based on the updated values of the parameters.
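
The three functions of the learning unit (calculating pre-learning data with the current parameters, learning updated values from the teacher data and the pre-learning data, and updating the parameters) can be sketched as follows. This is only an illustration under stated assumptions: the feature extraction unit is reduced to a linear softmax classifier over a fixed input vector, the update is a single cross-entropy gradient step, and the names `predict`, `learn_updated_values`, `train`, and the learning rate `lr` are hypothetical. The claim does not specify the model or the update rule.

```python
import math


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def predict(params, x):
    """Pre-learning data: class probabilities computed with the current parameters."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in params])


def learn_updated_values(params, x, teacher, lr=0.5):
    """Derive updated parameter values from teacher data and pre-learning data."""
    pre = predict(params, x)
    # For softmax with cross-entropy, the gradient w.r.t. each logit is (pre - teacher).
    return [
        [w - lr * (p - t) * xi for w, xi in zip(row, x)]
        for row, p, t in zip(params, pre, teacher)
    ]


def train(params, samples, epochs=50):
    """Repeatedly replace the parameters with their updated values."""
    for _ in range(epochs):
        for x, teacher in samples:
            params = learn_updated_values(params, x, teacher)
    return params
```

Here each sample pairs an input vector with a one-hot teacher vector, standing in for a learning image and its teacher data.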

An image retrieval device comprising:
a feature extraction unit that calculates search data including object name information of a plurality of objects included in an image and relationship name information indicating a predetermined relationship between the plurality of objects included in the image;
a storage unit that stores the search data in association with the image;
an input unit that receives input of at least one of the object name information and the relationship name information as a search query;
a search unit that searches the storage unit based on the search query and extracts a predetermined image corresponding to the search query;
and an output unit that outputs search results,
wherein the object name information included in the search query indicates a plurality of predetermined objects,
the output unit has a function of displaying the predetermined objects included in the predetermined image with more emphasis than other objects,
the relationship name information included in the search query indicates predetermined relationship name information,
other relationship name information indicates a relationship between a predetermined object and another object, and
the output unit further has:
a function of displaying the other object and the other relationship name information;
a function of displaying the plurality of predetermined objects spaced apart from each other;
a function of displaying the predetermined relationship name information between the displayed predetermined objects;
a function of displaying the other object spaced apart from the displayed predetermined object;
a function of displaying the other relationship name information between the displayed predetermined object and the displayed other object; and
a function of displaying the predetermined objects, the predetermined relationship name information, the other object, and the other relationship name information side by side in the scroll direction of the screen.

The image retrieval device according to claim 1, wherein the predetermined relationship indicates a physical relationship between the objects.

The image retrieval device according to claim 1, wherein the feature extraction unit has:
a function of calculating position information of each of the plurality of objects included in the image; and
a function of calculating the relationship name information based on the position information.
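
As a rough illustration of deriving relationship name information from position information, the sketch below compares the centers of two bounding boxes. Everything in it is an assumption: the box format `(x_min, y_min, x_max, y_max)` with the y axis pointing down, the rule-based comparison, and the labels "left of", "right of", "above", "below", and "overlaps". The claim only requires that the relationship name be calculated based on position information and does not prescribe how.

```python
def center(box):
    """Center point of a box given as (x_min, y_min, x_max, y_max)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def relationship_name(subject_box, object_box):
    """Name the position of the subject relative to the object (image y axis points down)."""
    sx, sy = center(subject_box)
    ox, oy = center(object_box)
    dx, dy = sx - ox, sy - oy
    if dx == 0 and dy == 0:
        return "overlaps"          # identical centers: no direction can be named
    if abs(dx) >= abs(dy):         # horizontal offset dominates
        return "left of" if dx < 0 else "right of"
    return "above" if dy < 0 else "below"
```

For example, `relationship_name(cup_box, table_box)` yields the relationship name stored alongside the object names "cup" and "table", where both boxes are hypothetical detector outputs.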

The image retrieval device according to claim 2, wherein the input unit has a function of accepting the search query as text.

The image retrieval device according to claim 1, wherein the input unit has:
a function of displaying a plurality of object name input fields for receiving the object name information and at least one relationship name input field for receiving the relationship name information; and
a function of displaying the plurality of object name input fields and the at least one relationship name input field side by side in the scroll direction of the screen.

The image retrieval device according to claim 1, wherein the input unit has a function of accepting an input image containing a plurality of objects,
the feature extraction unit has a function of calculating a search query from the input image, and
the search unit has a function of extracting, from the storage unit, a predetermined image corresponding to the search query calculated from the input image.

The image retrieval device according to claim 8, wherein the feature extraction unit further has:
a function of calculating a feature vector indicating features of a plurality of objects included in an image and features of relationships between the plurality of objects included in the image; and
a function of calculating a plurality of degrees of similarity to the input image for each image stored in the storage unit based on the feature vector calculated from the input image and the feature vector calculated from the image stored in the storage unit, and
the search unit has a function of searching the storage unit based on the plurality of degrees of similarity and extracting a predetermined image.

An image retrieval method comprising:
a step of calculating search data including object name information of a plurality of objects included in an image and relationship name information indicating a predetermined relationship between the plurality of objects included in the image;
a step of storing the search data in association with the image;
a step of receiving, as a search query, at least one of the object name information and the relationship name information as text, and receiving an input image including a plurality of objects;
a step of searching a storage unit based on the search query;
a step of extracting, from the storage unit, a plurality of predetermined images corresponding to the search query accepted as text; and
a step of outputting search results,
wherein the step of calculating the search data includes:
a step of calculating a feature vector indicating features of a plurality of objects included in an image and features of relationships between the plurality of objects included in the image; and
a step of calculating a plurality of degrees of similarity to the input image based on the feature vector calculated from the predetermined image and the feature vector calculated from the input image, and
the step of outputting the search results includes outputting the predetermined images based on the degrees of similarity.

An image retrieval method comprising:
a step of calculating search data including object name information of a plurality of objects included in an image and relationship name information indicating a predetermined relationship between the plurality of objects included in the image;
a step of storing the search data in association with the image;
a step of receiving at least one of the object name information and the relationship name information as a search query;
a step of searching a storage unit based on the search query;
a step of extracting a predetermined image corresponding to the search query;
a step of outputting search results; and
a step of learning parameters for calculating the search data from the image,
wherein the storage unit stores:
a learning image used when a learning unit that learns the parameters learns; and
teacher data indicating search data included in the learning image, and
the step of learning the parameters includes:
a step of calculating, as pre-learning data, the search data included in the learning image before the parameters are updated;
a step of learning updated values of the parameters based on the teacher data and the pre-learning data; and
a step of updating the parameters based on the updated values of the parameters.

An image retrieval method comprising:
a step of calculating search data including object name information of a plurality of objects included in an image and relationship name information indicating a predetermined relationship between the plurality of objects included in the image;
a step of storing the search data in association with the image;
a step of receiving at least one of the object name information and the relationship name information as a search query;
a step of searching a storage unit based on the search query;
a step of extracting a predetermined image corresponding to the search query; and
a step of outputting search results,
wherein the object name information included in the search query indicates a plurality of predetermined objects,
the relationship name information included in the search query indicates predetermined relationship name information,
other relationship name information indicates a relationship between a predetermined object and another object, and
the step of outputting the search results includes:
a step of displaying the predetermined objects included in the predetermined image with more emphasis than other objects;
a step of displaying the other object and the other relationship name information;
a step of displaying the plurality of predetermined objects spaced apart from each other;
a step of displaying the predetermined relationship name information between the displayed predetermined objects;
a step of displaying the other object spaced apart from the displayed predetermined object;
a step of displaying the other relationship name information between the displayed predetermined object and the displayed other object; and
a step of displaying the predetermined objects, the predetermined relationship name information, the other object, and the other relationship name information side by side in the scroll direction of the screen.

A computer program for causing a computer to function as an image retrieval device,
the computer program realizing, on the computer:
a feature extraction unit that calculates search data including object name information of a plurality of objects included in an image and relationship name information indicating a predetermined relationship between the plurality of objects included in the image;
a storage unit that stores the search data in association with the image;
an input unit that receives, as a search query, at least one of the object name information and the relationship name information as text, and receives an input image including a plurality of objects;
a search unit that searches the storage unit based on the search query, extracts a predetermined image corresponding to the search query, and extracts, from the storage unit, a plurality of predetermined images corresponding to the search query accepted as text; and
an output unit that outputs search results,
causing the input unit to execute:
a function of accepting the search query as text; and
a function of accepting an input image containing a plurality of objects,
causing the search unit to execute a function of extracting, from the storage unit, a plurality of predetermined images corresponding to the search query accepted as text,
causing the feature extraction unit to execute:
a function of calculating a feature vector indicating features of a plurality of objects included in an image and features of relationships between the plurality of objects included in the image; and
a function of calculating a plurality of degrees of similarity to the input image for each predetermined image based on the feature vector calculated from the predetermined image and the feature vector calculated from the input image, and
causing the output unit to execute a function of outputting the predetermined images based on the degrees of similarity.

A computer program for causing a computer to function as an image retrieval device,
the computer program realizing, on the computer:
a feature extraction unit having parameters for calculating search data including object name information of a plurality of objects included in an image and relationship name information indicating a predetermined relationship between the plurality of objects included in the image;
a storage unit that stores the search data in association with the image;
an input unit that receives input of at least one of the object name information and the relationship name information as a search query;
a search unit that searches the storage unit based on the search query and extracts a predetermined image corresponding to the search query;
an output unit that outputs search results; and
a learning unit that learns the parameters,
causing the storage unit to store:
a learning image used when the learning unit learns; and
teacher data indicating search data included in the learning image, and
causing the learning unit to execute:
a function of calculating, as pre-learning data, the search data included in the learning image using the feature extraction unit before the parameters are updated;
a function of learning updated values of the parameters based on the teacher data and the pre-learning data; and
a function of updating the parameters of the feature extraction unit based on the updated values of the parameters.

A computer program for causing a computer to function as an image retrieval device,
the computer program realizing, on the computer:
a feature extraction unit that calculates search data including object name information of a plurality of objects included in an image and relationship name information indicating a predetermined relationship between the plurality of objects included in the image;
a storage unit that stores the search data in association with the image;
an input unit that receives input of at least one of the object name information and the relationship name information as a search query;
a search unit that searches the storage unit based on the search query and extracts a predetermined image corresponding to the search query; and
an output unit that outputs search results,
wherein the object name information included in the search query indicates a plurality of predetermined objects,
the relationship name information included in the search query indicates predetermined relationship name information, and
other relationship name information indicates a relationship between a predetermined object and another object,
the computer program causing the output unit to execute:
a function of displaying the predetermined objects included in the predetermined image with more emphasis than other objects;
a function of displaying the other object and the other relationship name information;
a function of displaying the plurality of predetermined objects spaced apart from each other;
a function of displaying the predetermined relationship name information between the displayed predetermined objects;
a function of displaying the other object spaced apart from the displayed predetermined object;
a function of displaying the other relationship name information between the displayed predetermined object and the displayed other object; and
a function of displaying the predetermined objects, the predetermined relationship name information, the other object, and the other relationship name information side by side in the scroll direction of the screen.