WO2023281903A1 - Image matching device, image matching method, and program - Google Patents

Image matching device, image matching method, and program

Info

Publication number
WO2023281903A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
similarity
attribute
candidate images
group
Application number
PCT/JP2022/018752
Other languages
French (fr)
Japanese (ja)
Inventor
俊介 安木
拓実 小島
祐介 加藤
Original Assignee
Panasonic Intellectual Property Management Co., Ltd. (パナソニックIPマネジメント株式会社)
Application filed by Panasonic Intellectual Property Management Co., Ltd.
Publication of WO2023281903A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50: Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis

Description

  • The present disclosure relates to an image matching device, an image matching method, and a program.
  • Non-Patent Document 1 discloses a matching technique for matching a person image with a query and displaying matching results in a ranking format.
  • An object of the present disclosure is to provide an image matching device, an image matching method, and a program that make it easier to grasp information indicating matching results compared to conventional techniques.
  • One aspect of the present disclosure provides an image matching device that matches each of a plurality of candidate images with a query image.
  • The image matching device includes: an attribute determination unit that determines an object attribute of the matching object included in each candidate image; a similarity determination unit that determines, using a predetermined similarity calculation algorithm, a similarity indicating the degree to which each candidate image is similar to the query image; a classification unit that classifies the plurality of candidate images into a plurality of groups, one for each object attribute determined by the attribute determination unit; a ranking unit that assigns ranks to the one or more candidate images classified into each group, in descending order of similarity within each group; and an output unit that, under the control of a control unit, outputs information about the one or more candidate images classified into each group, for two or more of the plurality of groups.
  • Another aspect of the present disclosure provides an image matching method for matching each of a plurality of candidate images with a query image. The image matching method includes: an attribute determination step of determining an object attribute of the matching object included in each candidate image; a step of determining, using a predetermined similarity calculation algorithm, a similarity indicating the degree to which each candidate image is similar to the query image; a step of classifying the plurality of candidate images into a plurality of groups for each object attribute determined in the attribute determination step; a step of assigning ranks to the one or more candidate images classified into each group, in descending order of similarity within each group; and a step of outputting information about the one or more candidate images classified into each group, for two or more of the plurality of groups.
  • Yet another aspect of the present disclosure provides a program for causing a control unit to execute the above image matching method.
  • According to the image matching device, the image matching method, and the program of the present disclosure, information indicating a matching result can be grasped more easily than with conventional techniques.
  • Brief description of the drawings: FIG. 1 is a schematic diagram showing an overview of an image matching device according to an embodiment of the present disclosure. FIG. 2 is a block diagram showing a configuration example of the image matching device of FIG. 1. FIG. 3 is a flowchart illustrating the procedure of processing executed by the control unit of the image matching device of FIG. 2. FIG. 4 is a schematic diagram illustrating the distance between the feature vector of a candidate image and the feature vector of the query image. FIG. 5 is a schematic diagram for explaining an example of step S5 of FIG. 3, in which the degree of similarity is determined. FIG. 6 is a schematic diagram illustrating a conventional technique for displaying matching results in a similarity ranking format.
  • A matching technique is known for finding a person to be searched for from among a plurality of captured images generated by a plurality of surveillance cameras installed in towns, on premises, and the like.
  • An example of such a matching technique is a technique of detecting images of people from the plurality of captured images, treating them as candidate images, and calculating a similarity indicating the degree to which each candidate image is similar to a query image.
  • As conventional techniques, a technique of determining whether or not the similarity is equal to or greater than a predetermined threshold, a technique of arranging and displaying the candidate images in ranking format in descending order of similarity, and the like are known.
  • The query image to be matched against is selected by the user from among the plurality of captured images, or is selected in advance from existing images. Alternatively, the query image may be selected automatically by a program from an externally input image, the plurality of captured images, or the like.
  • As a means of displaying matching results, Non-Patent Document 1 discloses a technique of arranging and displaying candidate images in ranking format in descending order of similarity.
  • FIG. 6 is a schematic diagram illustrating such a conventional technique of displaying matching results in a similarity ranking format. By displaying matching results in the ranking format of FIG. 6 on a display device or the like, the user can grasp the matching results at a glance.
  • However, the similarity of a candidate image may be calculated to be high simply because the orientation of the person in the candidate image matches the orientation of the person in the query image Q. For example, candidate images showing candidates facing the same direction as the person in the query image Q (T1 to T3 in FIG. 6) receive higher similarities than candidate images showing candidates facing a different direction (T4 to T6 in FIG. 6). In particular, a candidate image showing a candidate who faces the same direction as the person in the query image Q (T1 to T3 in FIG. 6) may receive a higher similarity than a candidate image showing the same person as the query image Q facing a different direction (T4 in FIG. 6), even when that candidate is a different person. Consequently, when the candidate images are arranged in ranking format, the candidate images showing candidates facing the same direction as the person in the query image Q (T1 to T3 in FIG. 6) occupy the top of the ranking, the candidate image showing the same person facing a different direction (T4 in FIG. 6) appears lower down, and the information indicating the matching result needed to find the same person is buried in other information.
  • The inventors conducted research to solve this problem and developed an image matching device, an image matching method, and a program that make information indicating the matching result easier to grasp than with conventional techniques.
  • In the following description of the embodiments, among the various objects that can appear in an image, a person is used as the example of the object to be matched. A person is one example of the "matching object" of the present disclosure; the matching object is not limited to a person and may be an object other than a person. Likewise, "orientation of a person" is used as the example of an attribute of the matching object that can be recognized from the image itself by image recognition technology or the like, that is, an "object attribute"; the "object attribute" of the present disclosure is also not limited to the orientation of a person. Other examples of matching objects and object attributes are described after the description of the embodiments.
  • FIG. 1 is a schematic diagram showing an outline of an image matching device 100 according to an embodiment of the present disclosure.
  • The image matching device 100 detects images of people from a plurality of image data 50 generated by a plurality of cameras, treats them as candidate images, and ranks the candidate images in order of a similarity indicating the degree to which each candidate image is similar to a query image.
  • In doing so, the image matching device 100 classifies the candidate images into a plurality of groups according to the orientation of the person in each candidate image. For example, the image matching device 100 calculates an attribute ranking of the person orientations according to how similar each orientation is to the orientation of the person in the query image. In the example of FIG. 1, the person in the query image faces forward, so the attribute ranking of the groups is: the forward-facing group (first orientation), the backward-facing group (second orientation), and the right-facing group (third orientation).
  • The image matching device 100 then classifies the plurality of candidate images into the respective groups, as shown in FIG. 1.
  • The image matching device 100 assigns ranks within each classified group in descending order of similarity, and displays the per-group rankings for two or more of the plurality of groups on a display device or the like.
  • FIG. 2 is a block diagram showing a configuration example of the image matching device 100 of FIG. 1.
  • The image matching device 100 includes a control unit 1, a storage device 2, an image acquisition unit 3, an input interface (I/F) 5, and an output interface (I/F) 4.
  • The control unit 1 implements the functions of the image matching device 100 by executing information processing. Such information processing is realized, for example, by the control unit 1 executing a program stored in the storage device 2.
  • The control unit 1 includes a person detection unit 11, a query determination unit 12, an orientation detection unit 13, a similarity determination unit 14, a classification unit 15, and a ranking unit 16.
  • The control unit 1 is composed of circuits such as a CPU, an MPU, or an FPGA.
  • The person detection unit 11 detects people in the image data 50 and treats the image of each detected person as a candidate image.
  • The query determination unit 12 determines the query image against which the candidate images are matched.
  • The orientation detection unit 13 detects the orientation of the face and/or body of the person (hereinafter, the "orientation of the person") included in the query image determined by the query determination unit 12 and in each candidate image detected by the person detection unit 11.
  • The similarity determination unit 14 determines, using a predetermined similarity calculation algorithm, a similarity indicating the degree to which each candidate image is similar to the query image.
  • The classification unit 15 classifies the plurality of candidate images into a plurality of groups for each orientation of the person determined by the orientation detection unit 13.
  • The ranking unit 16 assigns ranks to the one or more candidate images classified into each group, in descending order of similarity within each group. The details of each of these functions are described further below in connection with the operation of the image matching device 100.
  • The storage device 2 is a recording medium that records various information, such as data and programs, including the predetermined similarity calculation algorithm, for causing the control unit 1 to execute the image matching method of the image matching device 100.
  • For example, the storage device 2 stores the feature extraction model 21, a trained model described later, and an image list 22.
  • The storage device 2 is realized by, for example, a semiconductor storage device such as a flash memory or solid-state drive (SSD), a magnetic storage device such as a hard disk drive (HDD), or another recording medium, alone or in combination.
  • The storage device 2 may include volatile memory such as SRAM or DRAM.
  • The image acquisition unit 3 is an interface circuit that connects the image matching device 100 to external equipment in order to input information such as the image data 50 into the image matching device 100.
  • Such external equipment is, for example, another information processing terminal (not shown) or a device such as a camera that acquires the image data 50.
  • The image acquisition unit 3 may be a communication circuit that performs data communication according to an existing wired or wireless communication standard.
  • The input interface 5 is an interface circuit that connects the image matching device 100 to an input device 80, such as a keyboard or mouse, in order to accept user input.
  • The input interface 5 may be a communication circuit that performs data communication according to an existing wired or wireless communication standard.
  • The output interface 4 is an interface circuit that connects the image matching device 100 to an external output device in order to output information from the image matching device 100.
  • Such an output device is, for example, the display device 70.
  • The output interface 4 may be a communication circuit that is connected to the network 60 and performs data communication according to an existing wired or wireless communication standard.
  • The image acquisition unit 3, the input interface 5, and the output interface 4 may be realized by separate hardware or by common hardware.
  • FIG. 3 is a flowchart illustrating the procedure of processing executed by the control unit 1 of the image matching device 100 of FIG. 2.
  • In FIG. 3, the control unit 1 first acquires image data 50 via the image acquisition unit 3 (S1).
  • The image data 50 is, for example, a plurality of image data captured by a plurality of cameras installed on the premises.
  • The person detection unit 11 detects people in the image data 50 acquired in step S1 and treats the image of each detected person as a candidate image (S2). When the person detection unit 11 detects a plurality of people in the image data 50, it generates a candidate image for each of the detected people.
  • Here, detecting a person in the image data 50 includes detecting a region of the image data 50 in which a person is present.
  • The query determination unit 12 determines the query image against which the candidate images are matched (S3). For example, the query determination unit 12 uses as the query image a candidate image that the user selects, using the input device 80 such as a keyboard or mouse, from among the plurality of candidate images obtained in step S2.
  • The query determination unit 12 may instead use as the query image an image of a person stored in advance in the storage device 2, an image of a person input via the image acquisition unit 3, or the like.
  • The query determination unit 12 may also, operating according to instructions from a program, automatically select the query image from among the plurality of candidate images obtained in step S2, images of people stored in advance in the storage device 2, images of people input via the image acquisition unit 3, and the like.
  • The orientation detection unit 13 detects the orientation of the person included in the query image determined in step S3 and in each candidate image detected in step S2 (S4). Specifically, the orientation detection unit 13 first determines to which of a plurality of predetermined, mutually different orientations of the face and/or body the orientation of the person included in the query image belongs. It then determines to which of those predetermined orientations the orientation of the person included in each candidate image belongs.
  • Here, the "orientation of the person included in each candidate image" is an example of the "object attribute of the matching object included in each candidate image" of the present disclosure, and the orientation detection unit 13 is an example of the "attribute determination unit" of the present disclosure.
  • Likewise, the "orientation of the person included in the query image" is an example of the "query attribute indicating the object attribute of the subject included in the query image" of the present disclosure.
  • For example, the orientation detection unit 13 detects the orientation of the person in each candidate image by comparing the feature vector of the candidate image output by the feature extraction model 21 with each of the feature vectors of person images in a plurality of predetermined orientations. The orientation of the person is the orientation of the face and/or body of the person in the candidate image, for example the orientation of the face, the orientation of the upper body, the orientation of the lower body, or an orientation determined by combining such information.
  • Alternatively, the orientation detection unit 13 may use an orientation detection model that takes a person image as input and outputs the orientation of the person in that image.
  • Such an orientation detection model is a trained model constructed by having the model learn the relationship between training images and ground-truth information.
  • A known skeleton detector, posture detector, or face orientation detector may also be applied as the orientation detection unit 13.
  • The orientation of the person detected in this way can be classified, for example, into eight directions as seen from the person in the candidate image: forward, diagonally forward right, right, diagonally backward right, backward, diagonally backward left, left, and diagonally forward left.
  • The orientation of the person is not limited to these eight directions and may be classified into fewer than eight or into nine or more directions, as illustrated in the sketch below.
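The patent does not give an algorithm for this binning; the following minimal sketch, with a hypothetical yaw-angle convention (0 degrees means facing the camera, angles increasing clockwise as seen from above), shows one way eight 45-degree bins could be assigned.

```python
# Hypothetical sketch: bin a person's yaw angle into the eight orientation
# labels named in the text. The angle convention is an assumption.
ORIENTATIONS = [
    "forward", "diagonally forward right", "right", "diagonally backward right",
    "backward", "diagonally backward left", "left", "diagonally forward left",
]

def orientation_label(yaw_deg: float) -> str:
    """Return one of 8 orientation bins, each 45 degrees wide."""
    idx = int(((yaw_deg + 22.5) % 360) // 45)  # center each bin on a multiple of 45
    return ORIENTATIONS[idx]

print(orientation_label(10.0))   # -> "forward"
print(orientation_label(170.0))  # -> "backward"
```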
  • The similarity determination unit 14 determines, using the predetermined similarity calculation algorithm, a similarity indicating the degree to which each candidate image is similar to the query image (S5). For example, the similarity determination unit 14 calculates the similarity based on a comparison between the feature vector of each candidate image and the feature vector of the query image. For example, the predetermined similarity calculation algorithm calculates the similarity from the distance, such as the Euclidean distance or the Mahalanobis distance, or from the inner product between the feature vector of each candidate image and the feature vector of the query image, so that, for example, a smaller distance yields a larger similarity. The predetermined similarity calculation algorithm may also be an algorithm that applies a model constructed by metric learning to calculate the distance between feature vectors.
  • The similarity means, for example, that the larger its value, the higher the degree of matching between the candidate image and the query image.
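As an illustration of such an algorithm, the sketch below converts the Euclidean distance between two feature vectors into a similarity in (0, 1] via the mapping 1/(1 + d). The mapping is an assumption; the text only requires that a smaller distance yield a larger similarity.

```python
import numpy as np

def similarity(candidate_vec: np.ndarray, query_vec: np.ndarray) -> float:
    """Smaller Euclidean distance -> larger similarity, in (0, 1]."""
    dist = float(np.linalg.norm(candidate_vec - query_vec))
    return 1.0 / (1.0 + dist)

query = np.array([0.2, 0.8, 0.1])
x = np.array([0.25, 0.75, 0.1])  # close to the query -> high similarity
y = np.array([0.9, 0.1, 0.6])    # far from the query -> low similarity
print(similarity(x, query) > similarity(y, query))  # True
```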
  • FIG. 4 is a schematic diagram illustrating the distance between the feature vector of a candidate image and the feature vector of the query image.
  • FIG. 4 shows an n-dimensional (n: an integer of 1 or more) feature vector space.
  • In the example of FIG. 4, the feature vector of candidate image X is closer to the feature vector of the query image than that of candidate image Y, so the similarity determination unit 14 determines, by the predetermined similarity calculation algorithm, the similarities so that the similarity of candidate image X is higher than the similarity of candidate image Y.
  • FIG. 5 is a schematic diagram for explaining an example of step S5 of FIG. 3, in which the degree of similarity is determined.
  • For example, the similarity determination unit 14 may use the feature extraction model 21, which outputs the feature vector of an image that is input to it.
  • In this case, the similarity determination unit 14 calculates the similarity based on a comparison between the feature vector of each candidate image output by the feature extraction model 21 and the feature vector of the query image.
  • Such a feature extraction model 21 is a trained model constructed by having the model learn the relationship between training images and ground-truth information.
  • The feature extraction model 21 may be a model having the structure of a neural network, for example a convolutional neural network (CNN).
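The patent does not specify the network; as a stand-in, the sketch below truncates a standard torchvision ResNet-18 (an assumption; in practice a model trained for person matching would be used) so that it outputs a pooled feature vector rather than class scores.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

backbone = models.resnet18(weights=None)  # stand-in for the feature extraction model 21
backbone.fc = torch.nn.Identity()         # drop the classifier; keep the 512-d features
backbone.eval()

preprocess = T.Compose([T.Resize((256, 128)), T.ToTensor()])  # person-crop shape

def extract_feature(image: Image.Image) -> torch.Tensor:
    """Return a 512-dimensional feature vector for one person image."""
    with torch.no_grad():
        return backbone(preprocess(image).unsqueeze(0)).squeeze(0)
```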
  • The classification unit 15 classifies the plurality of candidate images into a plurality of groups for each orientation of the person determined in step S4 (S6). For example, if it was determined in step S4 that the orientation of the person in a candidate image belongs to "forward", the classification unit 15 classifies that candidate image into a first group corresponding to "forward" in step S6 (see FIG. 1). Alternatively, for example, the classification unit 15 may classify the orientation of the person in each candidate image detected in step S4 based on its orientation relative to the orientation of the person in the query image; for example, it may classify the plurality of candidate images into a group whose orientation is the same as that of the person in the query image and groups whose orientations differ from it.
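A minimal sketch of step S6, assuming each candidate carries the orientation label from step S4 and the similarity from step S5 (the field names and values are illustrative, not from the patent):

```python
from collections import defaultdict

candidates = [
    {"id": "T1", "orientation": "forward",  "similarity": 0.91},
    {"id": "T2", "orientation": "backward", "similarity": 0.64},
    {"id": "T3", "orientation": "right",    "similarity": 0.58},
    {"id": "T4", "orientation": "forward",  "similarity": 0.88},
]

groups = defaultdict(list)  # step S6: one group per object attribute
for c in candidates:
    groups[c["orientation"]].append(c)
```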
  • Next, the ranking unit 16 assigns ranks to the one or more candidate images classified into each group in step S6, in descending order of the similarity determined in step S5 (S7). For example, the ranking unit 16 also calculates an attribute ranking of the person orientations according to how similar each orientation is to the orientation of the person in the query image. In the example of FIG. 1, the orientation of the person in the query image is forward, so the attribute ranking of the orientations is: forward (first orientation), backward (second orientation), and right-facing (third orientation).
  • Then, as shown in FIG. 1, the ranking unit 16 ranks the candidate images classified into the forward-facing group starting from first place, and likewise ranks the candidate images classified into the backward-facing group and those classified into the right-facing group, each starting from first place.
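Continuing the grouping sketch above, step S7 can be expressed as a per-group sort in descending order of similarity, with the groups themselves ordered by the attribute ranking from the FIG. 1 example:

```python
# Step S7: rank candidates within each group in descending order of similarity.
ranked = {
    orientation: sorted(members, key=lambda c: c["similarity"], reverse=True)
    for orientation, members in groups.items()
}

# Attribute ranking from the FIG. 1 example: the query person faces forward.
attribute_order = ["forward", "backward", "right"]
for orientation in attribute_order:
    for rank, c in enumerate(ranked.get(orientation, []), start=1):
        print(f"{orientation}: rank {rank} -> {c['id']} ({c['similarity']})")
```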
  • The attribute ranking of person orientations described above is determined, for example, by the degree to which the orientation of the person in the candidate images classified into each group (for example, forward, backward, or right-facing) is similar to the orientation of the person in the query image (hereinafter, "orientation similarity").
  • Orientation similarity is an example of the "attribute similarity" of the present disclosure.
  • The orientation similarity is calculated by the similarity determination unit 14, for example. For example, the similarity determination unit 14 calculates the orientation similarity based on a comparison between the outline of the person in the candidate images classified into each group and the outline of the person in the query image.
  • Alternatively, the orientation similarity may be predetermined according to the orientation of the person in the candidate image relative to the orientation of the person in the query image. For example, when the person in the query image faces forward, the orientation similarity may be set to progressively smaller values for the forward-facing group, the backward-facing group, and the right-facing group, in that order.
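A predetermined orientation-similarity table of this kind could look like the following sketch; the numeric values are assumptions, chosen only so that the forward-facing group scores highest when the query person faces forward.

```python
# Hypothetical predetermined orientation similarities for a forward-facing query.
ORIENTATION_SIMILARITY = {"forward": 1.0, "backward": 0.6, "right": 0.3}

# The attribute ranking of the groups follows descending orientation similarity.
attribute_order = sorted(ORIENTATION_SIMILARITY,
                         key=ORIENTATION_SIMILARITY.get, reverse=True)
print(attribute_order)  # ['forward', 'backward', 'right']
```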
  • In step S8, the control unit 1 outputs, for two or more of the plurality of groups, information in which the one or more candidate images classified into each group in step S6 are associated with their ranks within the group, to the display device 70. The candidate images are displayed on the display device 70 in rank order within each group (S8).
  • Specifically, the control unit 1 outputs, via the output interface 4, the plurality of candidate images, the group to which each candidate image was assigned in step S6, and information indicating the rank given within each group in step S7, to the display device 70. For example, as shown in FIG. 1, the control unit 1 causes the display device 70 to display two or more of the plurality of groups arranged in the vertical direction (first direction) in order of attribute ranking, with the one or more candidate images classified into each group arranged in the horizontal direction (second direction) in order of their rank within the group.
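As a rough illustration of this layout, the sketch below (reusing `ranked` and `attribute_order` from the sketches above) prints the groups as rows in attribute-rank order, with each row's candidates left to right in similarity-rank order:

```python
def render(ranked: dict, attribute_order: list) -> str:
    """One text row per group (vertical = attribute rank),
    candidates left to right (horizontal = similarity rank)."""
    rows = []
    for orientation in attribute_order:
        ids = [c["id"] for c in ranked.get(orientation, [])]
        rows.append(f"{orientation:>8}: " + "  ".join(ids))
    return "\n".join(rows)

print(render(ranked, attribute_order))
```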
  • Alternatively, the similarity ranking for the object attribute with the first attribute rank may be displayed first, and the similarity rankings for the object attributes with the second and subsequent attribute ranks may be displayed by switching screens. The object attributes may also be displayed in a predetermined order, regardless of attribute rank.
  • In step S8, the control unit 1 may output to the display device 70 the candidate images from first place down to a predetermined rank in each group, in association with their ranks. For example, the control unit 1 selects the first through tenth candidate images in each group and causes the display device 70 to display them together with information on the group to which each candidate image belongs and its rank.
  • Alternatively, in step S8, the control unit 1 may display on the display device 70, for two or more of the plurality of groups, only those candidate images whose similarity is equal to or higher than a predetermined threshold among the one or more candidate images classified into each group.
  • For example, the control unit 1 selects the candidate images with a similarity of 0.7 or higher in each group and causes the display device 70 to display them together with information on the group to which each candidate image belongs and its rank.
  • This reduces the amount of data output to the display device 70 and the associated processing load, and also reduces the amount of information processing in the display device 70.
  • A separate threshold may be set for each group. For example, if the person in the query image faces forward, the control unit 1 selects candidate images with a similarity of 0.8 or higher from the forward-facing group and candidate images with a similarity of 0.5 or higher from the backward-facing group, and displays them on the display device 70.
  • Setting a higher threshold for the group into which candidate images with the same orientation as the person in the query image are classified has the following advantage. A candidate image showing a different person facing the same direction as the person in the query image tends to receive a higher calculated similarity than a candidate image showing the same person facing a different direction. Therefore, even when a candidate image belonging to the same-orientation group has a high similarity, its probability of actually matching the query image may be lower than that of a candidate image belonging to another group with a similarly high similarity. By setting a higher threshold for the same-orientation group than for the other groups, the control unit 1 reflects this substantial degree of matching in the thresholds, so that only candidate images whose substantial degree of matching exceeds a predetermined reference value are selected and displayed on the display device 70. A sketch of such per-group thresholds follows.
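The thresholds 0.8 and 0.5 come from the numeric example above; wiring them up per group could look like this sketch (applying 0.5 to every non-matching group, rather than only the backward-facing one, is an assumption):

```python
def select_for_display(ranked: dict, query_orientation: str) -> dict:
    """Keep, per group, only candidates at or above that group's threshold:
    0.8 for the group matching the query person's orientation, 0.5 otherwise."""
    selected = {}
    for orientation, members in ranked.items():
        threshold = 0.8 if orientation == query_orientation else 0.5
        selected[orientation] = [c for c in members if c["similarity"] >= threshold]
    return selected
```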
  • As described above, the image matching device 100, which matches each of a plurality of candidate images with a query image, includes the orientation detection unit 13 (an example of the attribute determination unit), the similarity determination unit 14, the classification unit 15, the ranking unit 16, and the output interface 4.
  • The orientation detection unit 13 determines the orientation of the person included in each candidate image.
  • The similarity determination unit 14 determines, using a predetermined similarity calculation algorithm, a similarity indicating the degree to which each candidate image is similar to the query image.
  • The classification unit 15 classifies the plurality of candidate images into a plurality of groups for each orientation determined by the orientation detection unit 13.
  • The ranking unit 16 assigns ranks to the one or more candidate images classified into each group, in descending order of similarity within each group.
  • The output interface 4 outputs information about the one or more candidate images classified into each group, for two or more of the plurality of groups, under the control of the control unit 1.
  • With this configuration, the information processing apparatus at the output destination, or the user who views the displayed output information, can grasp the information indicating the matching result more easily than with conventional techniques, and the image matching device 100 can solve the problem of the information indicating the matching result being buried in other information.
  • The output interface 4 may output the candidate images from first place down to a predetermined rank in each group, in association with their ranks.
  • The output interface 4 may also output, for two or more of the plurality of groups, only those candidate images whose similarity is equal to or higher than a predetermined threshold among the one or more candidate images classified into each group.
  • The output interface 4 may output information to the display device 70 under the control of the control unit 1, and may cause the display device 70 to display the one or more candidate images classified into each group in rank order, for each of two or more of the plurality of groups.
  • The orientation detection unit 13 may further determine the orientation of the subject included in the query image.
  • In that case, the similarity determination unit 14 further determines an orientation similarity indicating the degree to which the person orientation corresponding to each of the plurality of groups produced by the classification unit 15 is similar to the orientation of the subject included in the query image. Orientation similarity is an example of the "attribute similarity" of the present disclosure.
  • The ranking unit 16 further assigns ranks (orientation ranks) to the plurality of groups in descending order of orientation similarity.
  • The output interface 4 causes the display device 70 to display the plurality of groups arranged in the vertical direction (first direction) in order of orientation rank, with the one or more candidate images classified into each group arranged in the horizontal direction (second direction) in order of similarity rank within the group.
  • Because the groups are arranged in order of orientation rank and the candidate images belonging to each group are arranged horizontally in order of similarity rank, the information indicating the matching result can be grasped still more easily.
  • The object to be matched is not limited to a person and may be an object other than a person, such as a vehicle, a building, or a robot.
  • The object attribute may accordingly be an attribute of an object other than a person, such as the color, material, or shape of the object.
  • Even for a person, the object attribute is not limited to the orientation of the person and may be an attribute such as the person's height, body type, age, age group, or gender.
  • A person's height and body type can be readily estimated using image recognition technology.
  • Age, age group, and gender can also be estimated using image recognition technology. For example, gender can be estimated from the type of clothing, hairstyle, body shape, and so on of a person in an image, and age can be estimated from the degree of facial wrinkles and hair color.
  • The object attribute may be an attribute representing whether or not a person is wearing clothing of a specific shape, such as a suit.
  • The object attribute may also be an attribute indicating whether a person is holding a bag, carrying a backpack, pulling a suitcase, or making a phone call.
  • The presence or absence of a belonging such as a bag, used as an object attribute, includes information indicating not only whether the person actually possesses the belonging but also whether the belonging is visible in the image.
  • A belonging that is visible in an image captured at one time may not appear in images captured at other times, for example because it is hidden by the owner's body or because the owner has left it somewhere.
  • Even in such cases, by grouping the candidate images according to this attribute, the information processing apparatus at the output destination or the user viewing the displayed output information can grasp the information indicating the matching result more easily than with conventional techniques, and the image matching device 100 can solve the problem of the information indicating the matching result being buried in other information.
  • The object attribute may also be an attribute indicating whether a person is riding a vehicle such as a bicycle or motorcycle, walking, running, standing still, standing, or sitting.
  • To determine such attributes, the attribute determination unit may detect the posture of the person in each candidate image and estimate the attribute based on the detected posture, as in the sketch below.
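The patent does not give a posture rule; the following sketch assumes keypoints from a skeleton detector (in image coordinates, with y increasing downward) and uses a simple, hypothetical hip-to-knee heuristic to choose between "standing" and "sitting".

```python
def standing_or_sitting(hip_y: float, knee_y: float, torso_len: float) -> str:
    """Hypothetical heuristic: when sitting, the vertical hip-to-knee
    distance shrinks relative to the torso length."""
    return "sitting" if (knee_y - hip_y) < 0.5 * torso_len else "standing"

print(standing_or_sitting(hip_y=200.0, knee_y=260.0, torso_len=150.0))  # sitting
```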
  • In the above embodiment, the display device 70 was used as an example of the output destination of information from the image matching device 100. However, the output destination is not limited to this; the image matching device 100 may output the information, via the network 60, to an information processing terminal such as a smartphone, a tablet, or a notebook computer, for example.
  • The image matching device 100 may execute a coarse matching process in steps S1 to S7 of FIG. 3 and then execute a second, finer matching process, for example steps S4 to S8, on the result. Such a scheme excludes candidate images whose similarity in the coarse matching process is below a predetermined threshold from the second, fine matching process, which reduces the processing load of the fine matching process and improves the processing speed.
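A schematic of this coarse-to-fine flow, with hypothetical `coarse_similarity` and `fine_similarity` callables standing in for the two passes (the 0.4 default threshold is illustrative):

```python
def two_stage_match(candidates, query, coarse_similarity, fine_similarity,
                    coarse_threshold=0.4):
    """Prune with a cheap coarse pass, then run the finer matching
    (steps S4 to S8) only on the surviving candidates."""
    survivors = [c for c in candidates
                 if coarse_similarity(c, query) >= coarse_threshold]
    return sorted(survivors, key=lambda c: fine_similarity(c, query), reverse=True)
```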
  • The present disclosure is applicable to image search technology and image matching technology.
  • Reference signs: 1 control unit, 2 storage device, 3 image acquisition unit, 4 output interface, 5 input interface, 11 person detection unit, 12 query determination unit, 13 orientation detection unit, 14 similarity determination unit, 15 classification unit, 16 ranking unit, 21 feature extraction model, 22 image list, 50 image data, 60 network, 70 display device, 80 input device, 100 image matching device

Abstract

This image matching device comprises: an attribute determination unit that determines an object attribute of a matching object included in each candidate image; a similarity determination unit that determines a similarity indicating the degree of similarity of each candidate image to a query image by using a predetermined similarity calculation algorithm; a classification unit that classifies a plurality of the candidate images into a plurality of groups for each object attribute determined by the attribute determination unit; a ranking unit that ranks one or more candidate images classified into each group in descending order of similarity for each group; and an output unit that outputs information about the one or more candidate images classified into each group under the control of a control unit.

Description

画像照合装置、画像照合方法、及びプログラムImage matching device, image matching method, and program
 本開示は、画像照合装置、画像照合方法、及びプログラムに関する。 The present disclosure relates to an image matching device, an image matching method, and a program.
 非特許文献1は、人物画像をクエリと照合する照合技術について、照合結果をランキング形式で並べて表示する技術を開示している。 Non-Patent Document 1 discloses a matching technique for matching a person image with a query and displaying matching results in a ranking format.
 本開示は、照合結果を示す情報を従来技術に比べて把握しやすくすることができる画像照合装置、画像照合方法、及びプログラムを提供することを目的とする。 An object of the present disclosure is to provide an image matching device, an image matching method, and a program that make it easier to grasp information indicating matching results compared to conventional techniques.
 本開示の一態様は、複数の候補画像のそれぞれをクエリ画像と照合する画像照合装置を提供する。画像照合装置は、各候補画像に含まれる照合対象物の物体属性を決定する属性決定部と、所定の類似度算出アルゴリズムを用いて各候補画像がクエリ画像に類似する程度を示す類似度を決定する類似度決定部と、複数の候補画像を属性決定部によって決定された物体属性毎に複数の群に分類する分類部と、各群に分類された1以上の候補画像に、群毎に、類似度が高い順に順位を付与する順位付与部と、制御部による制御に従い、各群に分類された1以上の候補画像に関する情報を複数の群の中の2以上の群について出力する出力部と、を備える。 One aspect of the present disclosure provides an image matching device that matches each of a plurality of candidate images with a query image. The image matching device uses an attribute determination unit that determines an object attribute of an object to be matched included in each candidate image, and determines a similarity that indicates the degree of similarity between each candidate image and a query image using a predetermined similarity calculation algorithm. a similarity determination unit for classifying a plurality of candidate images into a plurality of groups for each object attribute determined by the attribute determination unit; one or more candidate images classified into each group; a rank assigning unit that assigns ranks in descending order of similarity; and an output unit that outputs information about one or more candidate images classified into each group for two or more groups among the plurality of groups under the control of the control unit. , provided.
 本開示の他の態様は、複数の候補画像のそれぞれをクエリ画像と照合する画像照合方法を提供する。画像照合方法は、各候補画像に含まれる照合対象物の物体属性を決定する属性決定ステップと、所定の類似度算出アルゴリズムを用いて各候補画像がクエリ画像に類似する程度を示す類似度を決定するステップと、複数の候補画像を属性決定ステップにおいて決定された物体属性毎に複数の群に分類するステップと、各群に分類された1以上の候補画像に、群毎に、類似度が高い順に順位を付与するステップと、各群に分類された1以上の候補画像に関する情報を複数の群の中の2以上の群について出力するステップと、を含む。 Another aspect of the present disclosure provides an image matching method for matching each of a plurality of candidate images with a query image. The image matching method includes an attribute determining step of determining an object attribute of an object to be matched included in each candidate image, and determining a similarity indicating the degree of similarity of each candidate image to the query image using a predetermined similarity calculation algorithm. a step of classifying a plurality of candidate images into a plurality of groups for each object attribute determined in the attribute determination step; and one or more candidate images classified into each group having high similarity for each group and outputting information about one or more candidate images classified into each group for two or more groups among the plurality of groups.
 本開示の更に他の態様は、上記の画像照合方法を制御部に実行させるためのプログラムを提供する。 Yet another aspect of the present disclosure provides a program for causing a control unit to execute the above image matching method.
 本開示に係る画像照合装置、画像照合方法、及びプログラムによれば、照合結果を示す情報を従来技術に比べて把握しやすくすることができる。 According to the image matching device, the image matching method, and the program according to the present disclosure, it is possible to grasp the information indicating the matching result more easily than in the conventional technology.
本開示の実施形態に係る画像照合装置の概要を示す模式図Schematic diagram showing an overview of an image matching device according to an embodiment of the present disclosure 図1の画像照合装置の構成例を示すブロック図Block diagram showing a configuration example of the image matching device in FIG. 図2の画像照合装置の制御部によって実行される処理の手順を例示するフローチャート3 is a flow chart illustrating the procedure of processing executed by the control unit of the image collating apparatus of FIG. 2; 候補画像の特徴量ベクトルとクエリ画像の特徴量ベクトルとの距離を例示する模式図Schematic diagram illustrating the distance between the feature amount vector of the candidate image and the feature amount vector of the query image 図3の類似度を決定するステップS5の一例を説明するための模式図Schematic diagram for explaining an example of step S5 for determining the degree of similarity in FIG. 照合結果を類似度ランキング形式で並べて表示する従来技術を例示する模式図Schematic diagram illustrating conventional technology for displaying matching results in a similarity ranking format
(本開示に至った経緯)
 街中、構内等に設置された複数の監視カメラによって生成された複数の撮像画像の中から、検索対象人物を探索する照合技術が知られている。このような照合技術の一例は、複数の撮像画像から人の画像を検出してこれを候補画像とし、候補画像がクエリ画像に類似する程度を示す類似度を算出する技術である。従来技術としては、類似度が予め定められた閾値以上であるか否かを判断する技術、類似度が高いものから順に、候補画像をランキング形式で並べて表示する技術等が知られている。照合対象であるクエリ画像は、上記の複数の撮像画像の中からユーザによって選択され、又は、既存の画像の中から予め選択される。あるいは、クエリ画像は、外部から入力された画像、上記の複数の撮像画像等の中から、プログラムによって自動的に選択されてもよい。
(Circumstances leading to this disclosure)
2. Description of the Related Art A matching technique is known for searching for a person to be searched from among a plurality of captured images generated by a plurality of surveillance cameras installed in towns, premises, and the like. An example of such a matching technique is a technique of detecting an image of a person from a plurality of captured images, using this as a candidate image, and calculating a similarity indicating the degree of similarity between the candidate image and the query image. As conventional techniques, a technique of determining whether or not the degree of similarity is equal to or greater than a predetermined threshold, a technique of arranging and displaying candidate images in ranking order in descending order of similarity, and the like are known. A query image to be matched is selected by the user from among the plurality of captured images, or is selected in advance from existing images. Alternatively, the query image may be automatically selected by a program from an externally input image, the plurality of captured images, or the like.
 照合結果を表示する手段として、非特許文献1は、類似度が高いものから順に、候補画像をランキング形式で並べて表示する技術を開示している。図6は、照合結果を類似度ランキング形式で並べて表示する従来技術を例示する模式図である。図6のようなランキング形式の照合結果を表示装置等に表示することにより、ユーザは、照合結果を一目で把握することができる。 As a means of displaying matching results, Non-Patent Document 1 discloses a technique for arranging and displaying candidate images in a ranking format in descending order of similarity. FIG. 6 is a schematic diagram illustrating a conventional technique for arranging and displaying matching results in a similarity ranking format. By displaying the matching result in the ranking format as shown in FIG. 6 on a display device or the like, the user can grasp the matching result at a glance.
 しかしながら、候補画像の類似度は、候補画像における人の向きが、クエリ画像Qにおける人の向きと一致している場合に高く算出される場合がある。例えば、クエリ画像Qにおける人の向きと同じ向きの候補者が映る候補画像(図6のT1~T3)は、クエリ画像Qにおける人の向きと異なる向きの候補者が映る候補画像(図6のT4~T6)より、候補画像の類似度が高く算出される。特に、クエリ画像Qにおける人の向きと同じ向きの候補者が映る候補画像(図6のT1~T3)は、当該候補者がクエリ画像Qと異なる人である場合であっても、クエリ画像Qと同一人物が異なる方向を向いた候補画像(図6のT4)に比べて類似度が高く算出される問題がある。したがって、候補画像をランキング形式で並べると、クエリ画像Qにおける人の向きと同じ向きの候補者が映る候補画像(図6のT1~T3)がランキングの上位に並び、クエリ画像Qと同一人物が異なる方向を向いた候補画像(図6のT4)が下位に出現してしまい、同一人物を見つけるための照合結果を示す情報が他の情報に埋もれてしまう問題がある。 However, the similarity of the candidate image may be calculated to be high when the orientation of the person in the candidate image matches the orientation of the person in the query image Q. For example, the candidate images (T1 to T3 in FIG. 6) showing the candidates facing the same direction as the person in the query image Q are the candidate images (T1 to T3 in FIG. 6) showing candidates facing in a direction different from the direction of the person in the query image Q ( From T4 to T6), the similarity of the candidate image is calculated to be high. In particular, the candidate images (T1 to T3 in FIG. 6) showing the candidate facing the same direction as the person in the query image Q are displayed in the query image Q even if the candidate is a different person from the query image Q. There is a problem that the similarity is calculated to be higher than the candidate image (T4 in FIG. 6) in which the same person is facing a different direction. Therefore, when the candidate images are arranged in a ranking format, the candidate images (T1 to T3 in FIG. 6) showing the candidates facing the same direction as the person in the query image Q are arranged at the top of the ranking. A candidate image facing a different direction (T4 in FIG. 6) appears at a lower level, and information indicating the matching result for finding the same person is buried in other information.
 本発明者らは、上記課題を解決するために研究を行い、照合結果を示す情報を従来技術に比べて把握しやすくする画像照合装置、画像照合方法、及びプログラムを開発するに至った。 The inventors have conducted research to solve the above problems, and have developed an image matching device, an image matching method, and a program that make it easier to grasp the information indicating the matching result compared to the conventional technology.
 以下、適宜図面を参照しながら、実施の形態を詳細に説明する。但し、必要以上に詳細な説明は省略する場合がある。例えば、既によく知られた事項の詳細説明や実質的に同一の構成に対する重複説明を省略する場合がある。これは、以下の説明が不必要に冗長になるのを避け、当業者の理解を容易にするためである。 Hereinafter, embodiments will be described in detail with reference to the drawings as appropriate. However, more detailed description than necessary may be omitted. For example, detailed descriptions of well-known matters and redundant descriptions of substantially the same configurations may be omitted. This is to avoid unnecessary verbosity in the following description and to facilitate understanding by those skilled in the art.
 なお、出願人は、当業者が本開示を十分に理解するために添付図面および以下の説明を提供するのであって、これらによって特許請求の範囲に記載の主題を限定することを意図するものではない。例えば、以下の実施形態の説明では、画像に含まれ得る種々の物体のうち、人物(以下「人」と略記することがある。)を対象として照合を行う例を挙げる。人は「照合対象物」の一例であるが、本開示の「照合対象物」は人には限定されず、人以外の物体であってもよい。また、画像認識技術等により画像自体から認識可能な照合対象物の持つ属性、すなわち「物体属性」として「人の向き」を例示して説明するが、本開示の「物体属性」も人の向きに限定されない。照合対象物及び物体属性の他の例については実施形態の説明の後に説明する。 It is noted that Applicants provide the accompanying drawings and the following description for a full understanding of the present disclosure by those skilled in the art and are not intended to limit the claimed subject matter thereby. Absent. For example, in the following description of the embodiments, an example of matching a person (hereinafter sometimes abbreviated as “person”) among various objects that may be included in an image will be given. A person is an example of a "matching object", but the "matching object" of the present disclosure is not limited to a person, and may be an object other than a person. In addition, the attribute of the object to be matched that can be recognized from the image itself by image recognition technology or the like, that is, the “object attribute” will be described by exemplifying “human orientation”, but the “object attribute” of the present disclosure also is not limited to Other examples of matching objects and object attributes will be described after the description of the embodiments.
1.構成
 図1は、本開示の実施形態に係る画像照合装置100の概要を示す模式図である。画像照合装置100は、複数のカメラによって生成された複数の画像データ50から人の画像を検出してこれを候補画像とし、候補画像がクエリ画像に類似する程度を示す類似度の順に、候補画像に順位を付する。この際、画像照合装置100は、各候補画像における人の向き毎に複数の群に分類する。例えば、画像照合装置100は、複数の候補画像における人の向きがクエリ画像の人の向きに類似する、人の向きに関する属性順位を算出する。画像照合装置100は、複数の候補画像を、クエリ画像と同じ前向き(第1の向き)の群、後ろ向き(第2の向き)、右向き(第3の向き)の順番で群の属性順位を算出する。そして、画像照合装置100は、図1に示すように、複数の候補画像を各々の群に分類する。画像照合装置100は、分類された群毎に、類似度が高い順に順位を付与し、群毎の順位を複数の群の中の2以上の群について表示装置等に表示する。
1. Configuration FIG. 1 is a schematic diagram showing an outline of an image matching device 100 according to an embodiment of the present disclosure. The image matching device 100 detects human images from a plurality of image data 50 generated by a plurality of cameras and uses them as candidate images. to rank. At this time, the image matching apparatus 100 classifies each candidate image into a plurality of groups according to the orientation of the person. For example, the image matching apparatus 100 calculates an attribute rank regarding the orientation of a person in a plurality of candidate images similar to the orientation of the person in the query image. The image matching apparatus 100 calculates the attribute ranking of a plurality of candidate images in the same order as the query image, the group facing forward (first orientation), backward facing (second orientation), and facing right (third orientation). do. Then, the image matching device 100 classifies the plurality of candidate images into respective groups as shown in FIG. The image matching apparatus 100 ranks each classified group in descending order of similarity, and displays the rank of each group for two or more of the plurality of groups on a display device or the like.
 図2は、図1の画像照合装置100の構成例を示すブロック図である。画像照合装置100は、制御部1と、記憶装置2と、画像取得部3と、入力インタフェース(I/F)5と、出力インタフェース(I/F)4とを備える。 FIG. 2 is a block diagram showing a configuration example of the image matching device 100 of FIG. The image matching device 100 includes a control unit 1, a storage device 2, an image acquisition unit 3, an input interface (I/F) 5, and an output interface (I/F) 4.
 制御部1は、情報処理を実行して画像照合装置100の機能を実現する。このような情報処理は、例えば、制御部1が記憶装置2に格納されたプログラムを実行することにより実現される。制御部1は、人検出部11と、クエリ決定部12と、向き検出部13と、類似度決定部14と、分類部15と、順位付与部16とを含む。制御部1は、CPU、MPU、FPGA等の回路で構成される。 The control unit 1 implements the functions of the image matching device 100 by executing information processing. Such information processing is realized by executing a program stored in the storage device 2 by the control unit 1, for example. The control unit 1 includes a person detection unit 11 , a query determination unit 12 , an orientation detection unit 13 , a similarity determination unit 14 , a classification unit 15 and a ranking unit 16 . The control unit 1 is composed of circuits such as a CPU, MPU, and FPGA.
 以下、制御部1の各構成要素の機能の一例について説明する。人検出部11は、画像データ50内で人を検出し、検出された人の画像を候補画像とする。クエリ決定部12は、候補画像の照合対象であるクエリ画像を決定する。向き検出部13は、クエリ決定部12によって決定されたクエリ画像、及び、人検出部11によって検出された各候補画像に含まれる人の顔及び/又は体の向き(以下、「人の向き」という。)を検出する。類似度決定部14は、所定の類似度算出アルゴリズムを用いて各候補画像がクエリ画像に類似する程度を示す類似度を決定する。分類部15は、複数の候補画像を、類似度決定部14によって決定された人の向き毎に複数の群に分類する。順位付与部16は、各群に分類された1以上の候補画像に、群毎に、類似度が高い順に順位を付与する。上述の各機能の詳細は、後述の画像照合装置100の動作に関連してさらに説明する。 An example of the function of each component of the control unit 1 will be described below. The person detection unit 11 detects a person in the image data 50 and uses the image of the detected person as a candidate image. The query determination unit 12 determines a query image to be matched with candidate images. The orientation detection unit 13 detects the orientation of a person's face and/or body (hereinafter referred to as "person's orientation") included in the query image determined by the query determination unit 12 and each candidate image detected by the person detection unit 11. ) is detected. The similarity determination unit 14 determines a similarity indicating the degree of similarity of each candidate image to the query image using a predetermined similarity calculation algorithm. The classification unit 15 classifies the plurality of candidate images into a plurality of groups for each orientation of the person determined by the similarity determination unit 14 . The ranking unit 16 ranks one or more candidate images classified into each group in descending order of similarity for each group. Details of each of the functions described above will be further described in relation to the operation of the image collating apparatus 100, which will be described later.
 記憶装置2は、画像照合装置100による画像照合方法を制御部1に実行させるための所定の類似度算出アルゴリズムを含むプログラム、データ等の種々の情報を記録する記録媒体である。例えば、記憶装置2は、学習済みモデルである後述の特徴抽出モデル21と、画像リスト22とを格納する。記憶装置2は、例えば、フラッシュメモリ、ソリッド・ステート・ドライブ(SSD)等の半導体記憶装置、ハードディスクドライブ(HDD)等の磁気記憶装置、その他の記録媒体単独で又はそれらを組み合わせて実現される。記憶装置2は、SRAM、DRAM等の揮発性メモリを含んでもよい。 The storage device 2 is a recording medium for recording various information such as data and programs including a predetermined similarity calculation algorithm for causing the control unit 1 to execute the image matching method by the image matching device 100 . For example, the storage device 2 stores a later-described feature extraction model 21 that is a trained model and an image list 22 . The storage device 2 is realized by, for example, a semiconductor storage device such as a flash memory, a solid state drive (SSD), a magnetic storage device such as a hard disk drive (HDD), or other recording media alone or in combination. The storage device 2 may include volatile memory such as SRAM and DRAM.
 画像取得部3は、画像データ50等の情報を画像照合装置100に入力するために、画像照合装置100と外部機器とを接続するインタフェース回路である。このような外部機器は、例えば、図示しない他の情報処理端末、画像データ50を取得するカメラ等の装置である。画像取得部3は、既存の有線通信規格又は無線通信規格に従ってデータ通信を行う通信回路であってもよい。 The image acquisition unit 3 is an interface circuit that connects the image matching device 100 and an external device in order to input information such as the image data 50 to the image matching device 100 . Such an external device is, for example, another information processing terminal (not shown) or a device such as a camera that acquires the image data 50 . The image acquisition unit 3 may be a communication circuit that performs data communication according to existing wired communication standards or wireless communication standards.
 入力インタフェース5は、ユーザ入力を受け付けるために、画像照合装置100とキーボード、マウス等の入力装置80とを接続するインタフェース回路である。入力インタフェース5は、既存の有線通信規格又は無線通信規格に従ってデータ通信を行う通信回路であってもよい。 The input interface 5 is an interface circuit that connects the image collating device 100 and an input device 80 such as a keyboard and a mouse in order to accept user input. The input interface 5 may be a communication circuit that performs data communication according to existing wired communication standards or wireless communication standards.
 出力インタフェース4は、画像照合装置100から情報を出力するために、画像照合装置100と外部の出力装置とを接続するインタフェース回路である。このような出力装置は、例えば表示装置70である。出力インタフェース4は、既存の有線通信規格又は無線通信規格に従ってネットワーク60に接続されてデータ通信を行う通信回路であってもよい。画像取得部3、入力インタフェース5及び出力インタフェース4は、別個の、又は共通のハードウェアにより実現されてもよい。 The output interface 4 is an interface circuit that connects the image matching device 100 and an external output device in order to output information from the image matching device 100 . Such an output device is, for example, the display device 70 . The output interface 4 may be a communication circuit that is connected to the network 60 and performs data communication according to existing wired communication standards or wireless communication standards. The image acquisition unit 3, input interface 5 and output interface 4 may be realized by separate or common hardware.
2.動作
 図3は、図2の画像照合装置100の制御部1によって実行される処理の手順を例示するフローチャートである。
2. Operation FIG. 3 is a flow chart illustrating a procedure of processing executed by the control section 1 of the image matching apparatus 100 of FIG.
 図3において、制御部1は、画像取得部3を介して、画像データ50を取得する(S1)。画像データ50は、例えば、構内に設置された複数のカメラによって撮像された複数の画像データである。 In FIG. 3, the control unit 1 acquires image data 50 via the image acquisition unit 3 (S1). The image data 50 is, for example, a plurality of image data captured by a plurality of cameras installed in the premises.
 人検出部11は、ステップS1で取得された画像データ50内で人を検出し、検出された人の画像を候補画像とする(S2)。人検出部11は、画像データ50内で複数の人を検出した場合は、検出した複数の人のそれぞれにつき、候補画像を生成する。ここで、画像データ50内で人を検出するとは、画像データ50内で人が存在する領域を検出することを含む。 The person detection unit 11 detects a person in the image data 50 acquired in step S1, and uses the image of the detected person as a candidate image (S2). When detecting a plurality of persons in the image data 50, the person detection unit 11 generates a candidate image for each of the detected persons. Here, detecting a person in the image data 50 includes detecting an area in which a person exists in the image data 50 .
 クエリ決定部12は、候補画像の照合対象であるクエリ画像を決定する(S3)。例えば、クエリ決定部12は、ステップS2で得られた複数の候補画像の中から、ユーザがキーボード、マウス等の入力装置80を用いて選択した候補画像を、クエリ画像とする。クエリ決定部12は、記憶装置2に予め格納された人の画像、画像取得部3を介して入力された人の画像等をクエリ画像としてもよい。クエリ決定部12は、プログラムの指令に従って動作することにより、ステップS2で得られた複数の撮像画像、記憶装置2に予め格納された人の画像、画像取得部3を介して入力された人の画像等の中からクエリ画像を自動的に選択してもよい。 The query determination unit 12 determines a query image to be matched with candidate images (S3). For example, the query determination unit 12 uses, as a query image, a candidate image selected by the user using the input device 80 such as a keyboard or mouse from among the plurality of candidate images obtained in step S2. The query determination unit 12 may use an image of a person stored in advance in the storage device 2, an image of a person input via the image acquisition unit 3, or the like as a query image. The query determination unit 12 operates in accordance with instructions from the program to obtain the plurality of captured images obtained in step S2, the person's image stored in advance in the storage device 2, and the person's image input via the image acquisition unit 3. A query image may be automatically selected from images and the like.
 向き検出部13は、ステップS3で決定されたクエリ画像と、ステップS2で検出した各候補画像に含まれる人の向きを検出する(S4)。具体的には、まず、向き検出部13は、クエリ画像に含まれる人の向きが、予め定められた顔及び/又は体に関する互いに異なる複数の向きのいずれに属するかを決定する。そして、向き検出部13は、各候補画像に含まれる人の向きが、予め定められた顔及び/又は体に関する互いに異なる複数の向きのいずれに属するかを決定する。 The orientation detection unit 13 detects the orientation of the person included in the query image determined in step S3 and each candidate image detected in step S2 (S4). Specifically, first, the orientation detection unit 13 determines to which of a plurality of predetermined face and/or body different orientations the orientation of the person included in the query image belongs. Then, the orientation detection unit 13 determines to which of a plurality of predetermined face and/or body different orientations the orientation of the person included in each candidate image belongs.
 Here, the "orientation of the person included in each candidate image" is an example of the "object attribute of the matching object included in each candidate image" of the present disclosure, and the orientation detection unit 13 is an example of the "attribute determination unit" of the present disclosure. Likewise, the "orientation of the person included in the query image" is an example of the "query attribute indicating the object attribute of the subject included in the query image" of the present disclosure.
 For example, the orientation detection unit 13 detects the orientation of the person in each candidate image by comparing the feature vector of that candidate image output by the feature extraction model 21 with each of the feature vectors of person images in a plurality of predetermined orientations. The orientation of a person is the orientation of the face and/or body of the person shown in the candidate image, for example, the orientation of the face, the orientation of the upper body, the orientation of the lower body, or an orientation determined by combining these pieces of information.
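 One plausible reading of this comparison is sketched below, under the assumption that each predetermined orientation is represented by a single reference feature vector and that Euclidean distance is the comparison measure; neither assumption is mandated by the description.

```python
import numpy as np

# Hypothetical sketch: `reference_vectors` maps each predetermined orientation
# label (e.g. "front", "right-front", ...) to a representative feature vector.
def detect_orientation(candidate_vector, reference_vectors):
    best_label, best_dist = None, float("inf")
    for label, ref in reference_vectors.items():
        dist = np.linalg.norm(candidate_vector - ref)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label  # the orientation whose reference vector is closest
```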
 The orientation detection unit 13 may use an orientation detection model that, given a person image as input, outputs the orientation of the person in that image. Such an orientation detection model is a trained model constructed by having a model learn the relationship between training images and ground-truth information. A known skeleton detector, posture detector, or face orientation detector may also be applied to the orientation detection unit 13.
 The orientation of a person detected in this way can be classified, for example, into eight directions as seen from the person in the candidate image: forward, diagonally forward right, right, diagonally backward right, backward, diagonally backward left, left, and diagonally forward left. The orientation of a person is not limited to these eight directions and may instead be classified into fewer than eight or into nine or more directions.
 The similarity determination unit 14 determines, using a predetermined similarity calculation algorithm, a similarity indicating the degree to which each candidate image is similar to the query image (S5). For example, the similarity determination unit 14 calculates the similarity based on a comparison between the feature vector of each candidate image and the feature vector of the query image. For example, the predetermined similarity calculation algorithm calculates the similarity such that the similarity becomes larger as the distance, such as the Euclidean distance or the Mahalanobis distance, or the inner product between the feature vector of each candidate image and the feature vector of the query image becomes smaller. The predetermined similarity calculation algorithm may also be an algorithm that calculates the distance between feature vectors by applying a model constructed by metric learning. A larger similarity value means, for example, a higher degree of matching between the candidate image and the query image.
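 As a minimal sketch of such an algorithm, assuming the Euclidean-distance variant and an illustrative (not disclosed) mapping from distance to similarity:

```python
import numpy as np

# Sketch of step S5: a smaller distance between feature vectors yields a
# larger similarity. The 1 / (1 + d) mapping is an assumption chosen only to
# make the similarity decrease monotonically with the distance d.
def similarity(candidate_vector, query_vector):
    d = np.linalg.norm(candidate_vector - query_vector)
    return 1.0 / (1.0 + d)
```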
 FIG. 4 is a schematic diagram illustrating the distance between the feature vector of a candidate image and the feature vector of the query image. FIG. 4 shows an n-dimensional feature vector space (n: an integer of 1 or more). In the example shown in FIG. 4, the distance d(q, x) between the feature vector x = (x1, x2, …, xn) of a candidate image X and the feature vector q = (q1, q2, …, qn) of the query image is smaller than the distance d(q, y) between the feature vector y = (y1, y2, …, yn) of a candidate image Y and the feature vector q of the query image. In this case, the similarity determination unit 14 determines the similarity of each candidate image by the predetermined similarity calculation algorithm such that the similarity of the candidate image X is larger than the similarity of the candidate image Y.
 FIG. 5 is a schematic diagram for explaining an example of step S5 of FIG. 3, in which the similarity is determined. In step S5, the similarity determination unit 14 may use a feature extraction model 21 that, given an image as input, outputs the feature vector of that image. The similarity determination unit 14 calculates the similarity based on a comparison between the feature vector of each candidate image output by the feature extraction model 21 and the feature vector of the query image. Such a feature extraction model 21 is a trained model constructed by having a model learn the relationship between training images and ground-truth information. The feature extraction model 21 may be a model having a neural network structure, for example, a convolutional neural network (CNN). When the feature extraction model 21 is constructed by training a model such as a CNN, the fully connected layer at the final stage of the model may be removed so that the output of a convolutional layer or a pooling layer is used as the feature.
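 A minimal sketch of such a feature extractor follows, assuming a pretrained torchvision ResNet-50 as a stand-in for the feature extraction model 21; the description does not name a particular network, and the library, weights, and input size here are assumptions.

```python
import torch
import torchvision.models as models

# Illustrative assumption: a ResNet-50 backbone whose final fully connected
# layer is removed, so the pooled convolutional output serves as the feature
# vector (requires a recent torchvision providing the Weights API).
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-1])
feature_extractor.eval()

with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)           # a preprocessed input image
    vector = feature_extractor(image).flatten(1)  # feature vector, shape (1, 2048)
```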
 The classification unit 15 classifies the plurality of candidate images into a plurality of groups according to the orientation of the person determined in step S4 (S6). For example, when it is determined in step S4 that the orientation of the person in a candidate image belongs to "forward," the classification unit 15 classifies that candidate image into a first group corresponding to "forward" in step S6 (see FIG. 1). Alternatively, the classification unit 15 may, for example, classify the orientation of the person in each candidate image detected in step S4 based on its orientation relative to the orientation of the person in the query image. For example, the classification unit 15 classifies the plurality of candidate images into a group whose orientation is the same as that of the person in the query image and a group whose orientation is different.
 The ranking unit 16 assigns ranks, for each group, to the one or more candidate images classified into that group in step S6, in descending order of the similarity determined in step S5 (S7). For example, the ranking unit 16 calculates an attribute ranking regarding the orientation of the person, indicating how similar the orientation of the person in the candidate images is to the orientation of the person in the query image. For example, when the person in the query image faces forward (a first orientation) as shown in FIG. 1, images of a person facing forward (the first orientation) and facing backward (a second orientation) are systematically similar. Accordingly, the attribute ranking regarding the orientation of the person is set in the order of forward (the first orientation), backward (the second orientation), and rightward (a third orientation). Then, when the plurality of candidate images are classified into a forward-facing (first orientation) group, a backward-facing (second orientation) group, and a right-facing (third orientation) group as shown in FIG. 1, the ranking unit 16 assigns ranks starting from first place to the candidate images classified into the forward-facing group, and likewise assigns ranks starting from first place to the candidate images classified into the backward-facing group and to those classified into the right-facing group.
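 Steps S6 and S7 together can be sketched as follows; the tuple layout of the candidate records is an assumption for illustration:

```python
from collections import defaultdict

# Sketch of steps S6-S7: classify candidates into groups by detected
# orientation, then rank within each group in descending order of similarity.
def group_and_rank(candidates):
    # candidates: iterable of (image, orientation_label, similarity) tuples
    groups = defaultdict(list)
    for image, orientation, sim in candidates:
        groups[orientation].append((image, sim))
    ranked = {}
    for orientation, members in groups.items():
        members.sort(key=lambda m: m[1], reverse=True)
        # rank 1 is the most similar candidate within the group
        ranked[orientation] = [(rank + 1, image, sim)
                               for rank, (image, sim) in enumerate(members)]
    return ranked
```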
 The above attribute ranking regarding the orientation of the person is calculated, for example, based on the degree of similarity of the orientation of the persons included in the candidate images classified into each group (for example, forward, backward, rightward, and so on) to the orientation of the person included in the query image (hereinafter referred to as the "orientation similarity"). The orientation similarity is an example of the "attribute similarity" of the present disclosure. The orientation similarity is calculated, for example, by the similarity determination unit 14. For example, the similarity determination unit 14 calculates the orientation similarity based on a comparison between the outline of the person included in the candidate images classified into each group and the outline of the person included in the query image. The orientation similarity may instead be predetermined according to the orientation of the person in the candidate image relative to the orientation of the person in the query image. For example, when the person included in the query image faces forward, the orientation similarity may be set so that it takes its largest value for the forward-facing group, followed by the backward-facing group and then the right-facing group.
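 Where the orientation similarity is predetermined by relative orientation, the attribute ranking could be derived from a lookup table such as the following; the labels and values are illustrative assumptions matching the forward-facing example above:

```python
# Assumed orientation-similarity table for a forward-facing query: the
# forward group is most similar, then the backward group, then the right
# group, consistent with the example in the description.
ORIENTATION_SIMILARITY = {
    ("front", "front"): 1.0,
    ("front", "back"): 0.6,
    ("front", "right"): 0.3,
}

def attribute_ranking(query_orientation, group_labels):
    # Order the groups by descending orientation similarity to the query.
    return sorted(group_labels,
                  key=lambda g: ORIENTATION_SIMILARITY.get(
                      (query_orientation, g), 0.0),
                  reverse=True)
```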
 The control unit 1 may output to the display device 70, for two or more groups among the plurality of groups, information in which the one or more candidate images classified into each group in step S6 are associated with their ranks in that group. The candidate images are then displayed on the display device 70 in the order of their ranks in each group (S8). For example, the control unit 1 outputs, via the output interface 4 to the display device 70, the plurality of candidate images, the group to which each candidate image determined in step S6 belongs, and information indicating the rank assigned in step S7 within each group. For example, as shown in FIG. 1, the control unit 1 causes the display device 70 to display two or more groups among the plurality of groups arranged in a vertical direction (a first direction) in the order of the attribute ranking, and to display the one or more candidate images classified into each group for each object attribute arranged in a horizontal direction (a second direction) in the order of their ranks in that group. Note that the similarity rankings for a plurality of object attributes need not be displayed simultaneously. For example, the similarity ranking for the object attribute ranked first in the attribute ranking may be displayed first, and the similarity rankings for the object attributes ranked second and lower may be displayed by switching screens. The object attributes may also be displayed in a predetermined order regardless of the attribute ranking.
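 The two-axis arrangement of step S8 might be prepared as follows, under the assumed data structures of the earlier sketches (one row per group in attribute-rank order, candidates within a row in similarity-rank order):

```python
# Sketch of the display layout: rows follow the attribute ranking (first
# direction); within a row, candidates follow their in-group rank (second
# direction). `ranked_groups` is the output of group_and_rank above.
def layout_rows(ranked_groups, ordered_labels):
    rows = []
    for label in ordered_labels:          # e.g. ["front", "back", "right"]
        row = [image for _rank, image, _sim in ranked_groups.get(label, [])]
        rows.append((label, row))
    return rows  # rows are handed to the display device in this order
```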
 In step S8, the control unit 1 may output to the display device 70 the candidate images from first place down to a predetermined rank in each group, each associated with its rank. For example, the control unit 1 selects the first- through tenth-ranked candidate images in each group and causes the display device 70 to display them together with information on the group to which each candidate image belongs and its rank.
 Alternatively, in step S8, the control unit 1 may output to the display device 70, from among the one or more candidate images classified into each group, those candidate images whose similarity is equal to or greater than a predetermined threshold, for two or more groups among the plurality of groups. For example, the control unit 1 selects the candidate images whose similarity is 0.7 or more in each group and causes the display device 70 to display them together with information on the group to which each candidate image belongs and its rank. This reduces the amount of data output to the display device 70 and thus reduces the processing load. It also reduces the amount of information processing in the display device 70.
 The predetermined threshold may be set for each group. For example, when the person in the query image faces forward, the control unit 1 selects candidate images whose similarity is 0.8 or more from the forward-facing group and candidate images whose similarity is 0.5 or more from the backward-facing group, and causes the display device 70 to display them.
 Setting a higher threshold for the group into which candidate images with the same orientation as the person in the query image are classified, compared with the other groups, has the following advantage. A candidate image showing a candidate facing the same direction as the person in the query image tends to be assigned a higher similarity than a candidate image showing the same person as in the query image but facing a different direction, even when that candidate is a different person from the one in the query image. Therefore, a candidate image belonging to the group of candidate images with the same orientation as the person in the query image may, even with a high similarity, actually be less likely to be an image matching the query image than a candidate image belonging to another group with a comparably high similarity. The control unit 1 therefore sets a higher threshold for the group of candidate images with the same orientation as the person in the query image than for the other groups, thereby reflecting the substantial degree of matching with the query image, and can select and display on the display device 70 only the candidate images whose substantial degree of matching exceeds a predetermined reference value.
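 A sketch of this per-group thresholding, with illustrative threshold values taken from the numerical example above:

```python
# Sketch: a stricter threshold for the group sharing the query's orientation
# (0.8 in the example above) and a more permissive one elsewhere (0.5),
# since same-orientation candidates tend to score high even for different
# people. `ranked_groups` follows the layout of the earlier sketches.
def filter_by_threshold(ranked_groups, query_orientation,
                        same_orientation_threshold=0.8, default_threshold=0.5):
    filtered = {}
    for label, members in ranked_groups.items():
        t = (same_orientation_threshold if label == query_orientation
             else default_threshold)
        filtered[label] = [(r, img, s) for (r, img, s) in members if s >= t]
    return filtered
```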
3. Effects, etc.
 As described above, the image matching device 100, which matches each of a plurality of candidate images with a query image, includes the orientation detection unit 13, which is an example of the attribute determination unit, the similarity determination unit 14, the classification unit 15, the ranking unit 16, and the output interface 4. The orientation detection unit 13 determines the orientation of the person included in each candidate image. The similarity determination unit 14 determines, using a predetermined similarity calculation algorithm, a similarity indicating the degree to which each candidate image is similar to the query image. The classification unit 15 classifies the plurality of candidate images into a plurality of groups according to the orientation determined by the orientation detection unit 13. The ranking unit 16 assigns ranks, for each group, to the one or more candidate images classified into that group, in descending order of similarity. The output interface 4 outputs, under the control of the control unit 1, information about the one or more candidate images classified into each group, for two or more groups among the plurality of groups.
 With this configuration, the information processing device at the output destination, or a user viewing the displayed output information, can grasp the information indicating the matching results more easily than with the conventional technique, and the image matching device 100 can solve the problem of the information indicating the matching results being buried in other information.
 Under the control of the control unit 1, the output interface 4 may output the candidate images from first place down to a predetermined rank in each group, each associated with its rank.
 With this configuration, the amount of data output by the image matching device 100 can be reduced, which reduces the processing load of the image matching device 100, and the amount of information processing in the device at the output destination, such as an information processing terminal or a display device, can also be reduced. Furthermore, when a user views the displayed output information, limiting the output information to a limited number of candidates, from first place down to a predetermined rank in each group, improves, for example, the at-a-glance readability of the output information, making it easier for the user to grasp the displayed output information.
 Under the control of the control unit 1, the output interface 4 may output, from among the one or more candidate images classified into each group, those candidate images whose similarity is equal to or greater than a predetermined threshold, for two or more groups among the plurality of groups.
 This configuration likewise reduces the amount of data output by the image matching device 100, thereby reducing its processing load, and reduces the amount of information processing in the device at the output destination, such as an information processing terminal or a display device. Furthermore, when a user views the displayed output information, limiting the output information to candidates whose similarity is equal to or greater than the predetermined threshold improves, for example, the at-a-glance readability of the output information, making it easier for the user to grasp the displayed output information.
 Under the control of the control unit 1, the output interface 4 may output the information to the display device 70 and cause the display device 70 to display the one or more candidate images classified into each group, for each of two or more groups among the plurality of groups, in the order of their ranks in that group.
 With this configuration, by looking at the display device 70, the user can grasp the information indicating the matching results more easily than with the conventional technique.
 The orientation detection unit 13 may further determine the orientation of the subject included in the query image. In this case, the similarity determination unit 14 further determines an orientation similarity indicating the degree to which the orientation of the person corresponding to each of the plurality of groups classified by the classification unit 15 is similar to the orientation of the subject included in the query image. The orientation similarity is an example of the "attribute similarity" of the present disclosure. The ranking unit 16 further assigns ranks (orientation ranks) to the plurality of groups in descending order of orientation similarity. The output interface 4 causes the display device 70 to display the plurality of groups arranged in the vertical direction (the first direction) in the order of the orientation ranks, and to display the one or more candidate images classified into each group arranged in the horizontal direction (the second direction) in the order of their similarity ranks in that group.
 With this configuration, on the display device 70, the plurality of groups are arranged in the order of their orientation ranks, and within each group the candidate images belonging to that group are arranged horizontally in the order of their similarity ranks, so the user can grasp the information indicating the matching results even more easily.
(Other Embodiments)
 As described above, an embodiment has been described as an example of the technique disclosed in the present application. However, the technique of the present disclosure is not limited to this embodiment and is also applicable to embodiments in which modifications, replacements, additions, omissions, and the like are made as appropriate. Other embodiments are exemplified below.
 In the embodiment described above, a person was described as an example of the "matching object" of the present disclosure, the "orientation of a person," which is an attribute of a person, was described as an example of the "object attribute," and the orientation detection unit 13 was described as an example of the "attribute determination unit." However, the present disclosure is not limited to these. For example, the matching object is not limited to a person and may be an object other than a person, such as a vehicle, a building, or a robot. The object attribute may likewise be an attribute of an object other than a person, for example, the color, material, or shape of the object.
 When the object attribute is an attribute of a person, it is not limited to the orientation of the person and may be, for example, an attribute such as the person's height, body shape, age, age group, or gender. A person's height and body shape can easily be estimated using image recognition technology. Age, age group, and gender can also be estimated using image recognition technology. For example, gender can be estimated from the type of clothing, the person's hairstyle, body shape, and the like in an image, and age or age group can be estimated from the degree of facial wrinkles and the color of the hair. The object attribute may also be an attribute indicating whether the person is wearing clothing of a specific type, such as a suit.
 The object attribute may be an attribute indicating whether the person is holding a bag, carrying a backpack, pulling a suitcase, talking on a phone, and so on. The presence or absence of a belonging such as a bag as an object attribute includes information indicating not only whether the person actually has the belonging but also whether the belonging is visible in the image. Such a belonging may be visible in an image captured at one time but not in an image captured at another time, for example, because it is hidden by the owner's body or because the owner has left it somewhere. Since the presence or absence of belongings across a plurality of images can also affect the similarity, having the object attribute include the presence or absence of belongings makes it easier for the information processing device at the output destination, or for a user viewing the displayed output information, to grasp the information indicating the matching results than with the conventional technique, and the image matching device 100 can solve the problem of the information indicating the matching results being buried in other information.
 Furthermore, the object attribute may be an attribute indicating whether the person is riding a vehicle such as a bicycle or a motorcycle, whether the person is walking, running, or standing still, whether the person is standing or sitting, and so on. The attribute determination unit may detect the posture of the person in each candidate image and estimate attributes such as the above based on the detected posture.
 In the embodiment described above, the display device 70 was exemplified as the output destination of information from the image matching device 100. However, the output destination of information is not limited to this, and the image matching device 100 may output information to an information processing terminal such as a smartphone, a tablet, or a notebook computer via, for example, the network 60. Alternatively, the image matching device 100 may execute a coarse matching process in a first pass of steps S1 to S7 of FIG. 3, and then, using the results of the coarse matching process, execute steps S4 to S8 as a second-pass matching process that is more precise than the first pass. Such a means reduces the processing load of the second, precise matching pass and improves the processing speed by excluding from that pass the candidate images whose similarity, as a result of the coarse matching process, is lower than a predetermined threshold.
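 The coarse-to-fine variant could be sketched as follows; the scoring functions and the coarse threshold are assumptions standing in for the first-pass and second-pass matching processes:

```python
# Sketch of the two-pass variant: candidates that fall below the coarse
# threshold in the first pass are excluded from the more expensive second
# pass, reducing its processing load.
def two_pass_matching(candidates, query, coarse_score, fine_score,
                      coarse_threshold=0.3):
    survivors = [c for c in candidates
                 if coarse_score(c, query) >= coarse_threshold]
    return sorted(survivors,
                  key=lambda c: fine_score(c, query), reverse=True)
```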
 As described above, an embodiment has been described as an example of the technique of the present disclosure. The accompanying drawings and the detailed description have been provided for that purpose.
 Therefore, the components described in the accompanying drawings and the detailed description may include not only components essential for solving the problem but also components that are not essential for solving the problem, in order to illustrate the above technique. For that reason, it should not be immediately concluded that those non-essential components are essential merely because they are described in the accompanying drawings and the detailed description.
 Moreover, since the embodiment described above is intended to illustrate the technique of the present disclosure, various modifications, replacements, additions, omissions, and the like can be made within the scope of the claims or their equivalents.
 The present disclosure is applicable to image search technology and image matching technology.
 1 control unit
 2 storage device
 3 image acquisition unit
 4 output interface
 5 input interface
 11 person detection unit
 12 query determination unit
 13 orientation detection unit
 14 similarity determination unit
 15 classification unit
 16 ranking unit
 21 feature extraction model
 22 image list
 50 image data
 60 network
 70 display device
 80 input device
 100 image matching device

Claims (9)

  1.  An image matching device that matches each of a plurality of candidate images with a query image, the image matching device comprising:
     an attribute determination unit that determines an object attribute of a matching object included in each candidate image;
     a similarity determination unit that determines, using a predetermined similarity calculation algorithm, a similarity indicating a degree to which each candidate image is similar to the query image;
     a classification unit that classifies the plurality of candidate images into a plurality of groups according to the object attribute determined by the attribute determination unit;
     a ranking unit that assigns ranks, for each group, to one or more candidate images classified into the group, in descending order of the similarity; and
     an output unit that outputs, under control of a control unit, information about the one or more candidate images classified into each group, for two or more groups among the plurality of groups.
  2.  The image matching device according to claim 1, wherein the control unit causes the output unit to output the candidate images from first place down to a predetermined rank in each group, each associated with its rank.
  3.  The image matching device according to claim 1 or 2, wherein the control unit causes the output unit to output, from among the one or more candidate images classified into each group, candidate images whose similarity is equal to or greater than a predetermined threshold.
  4.  The image matching device according to any one of claims 1 to 3, wherein the control unit causes the output unit to output the information to a display device and causes the display device to display the one or more candidate images classified into each group in the order of their ranks in the group.
  5.  The image matching device according to claim 4, wherein
     the attribute determination unit further determines a query attribute indicating an object attribute of a subject included in the query image,
     the similarity determination unit further determines an attribute similarity indicating a degree to which the object attribute corresponding to each of the plurality of groups classified by the classification unit is similar to the query attribute,
     the ranking unit further assigns attribute ranks to the plurality of groups in descending order of the attribute similarity, and
     the control unit causes the display device to display the plurality of groups arranged in a first direction in the order of the attribute ranks, and to display the one or more candidate images classified into each group arranged, in the order of their ranks in the group, in a second direction different from the first direction.
  6.  The image matching device according to any one of claims 1 to 5, wherein
     each of the plurality of candidate images is an image showing one person, and
     the attribute determination unit determines, as the object attribute, an orientation of the face and/or body of the person shown in each candidate image.
  7.  The image matching device according to any one of claims 1 to 6, wherein the similarity calculation algorithm is an algorithm that calculates the similarity based on a comparison between a feature vector of each candidate image and a feature vector of the query image.
  8.  An image matching method for matching each of a plurality of candidate images with a query image, the image matching method comprising:
     an attribute determination step of determining an object attribute of a matching object included in each candidate image;
     a step of determining, using a predetermined similarity calculation algorithm, a similarity indicating a degree to which each candidate image is similar to the query image;
     a step of classifying the plurality of candidate images into a plurality of groups according to the object attribute determined in the attribute determination step;
     a step of assigning ranks, for each group, to one or more candidate images classified into the group, in descending order of the similarity; and
     a step of outputting information about the one or more candidate images classified into each group, for two or more groups among the plurality of groups.
  9.  A program for causing a control unit to execute the image matching method according to claim 8.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021114129 2021-07-09
JP2021-114129 2021-07-09

Publications (1)

Publication Number Publication Date
WO2023281903A1 true WO2023281903A1 (en) 2023-01-12

Family

ID=84801510

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/018752 WO2023281903A1 (en) 2021-07-09 2022-04-25 Image matching device, image matching method, and program

Country Status (1)

Country Link
WO (1) WO2023281903A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11316846A (en) * 1998-04-22 1999-11-16 Nec Corp Image inquring method utilizing area information and edge information of image, and image inquiring device
JP2014182480A (en) * 2013-03-18 2014-09-29 Toshiba Corp Person recognition device and method
WO2015001791A1 (en) * 2013-07-03 2015-01-08 パナソニックIpマネジメント株式会社 Object recognition device objection recognition method
JP2016103084A (en) * 2014-11-27 2016-06-02 株式会社 日立産業制御ソリューションズ Image search apparatus and image search system
JP2016157165A (en) * 2015-02-23 2016-09-01 三菱電機マイコン機器ソフトウエア株式会社 Person identification system
JP2017054493A (en) * 2015-09-11 2017-03-16 キヤノン株式会社 Information processor and control method and program thereof
JP2018054472A (en) * 2016-09-29 2018-04-05 ケーディーアイコンズ株式会社 Information processing device and program
WO2019103912A2 (en) * 2017-11-22 2019-05-31 Arterys Inc. Content based image retrieval for lesion analysis

Legal Events

Code — Title/Description
121 — Ep: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22837324; Country of ref document: EP; Kind code of ref document: A1)
NENP — Non-entry into the national phase (Ref country code: DE)