WO2023281903A1 - Image matching device, image matching method, and program - Google Patents

Image matching device, image matching method, and program

Info

Publication number
WO2023281903A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
similarity
attribute
candidate images
group
Application number
PCT/JP2022/018752
Other languages
French (fr)
Japanese (ja)
Inventor
俊介 安木
拓実 小島
祐介 加藤
Original Assignee
Panasonic Intellectual Property Management Co., Ltd. (パナソニックIPマネジメント株式会社)
Application filed by Panasonic Intellectual Property Management Co., Ltd.
Publication of WO2023281903A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50: Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis

Description

  • The present disclosure relates to an image matching device, an image matching method, and a program.
  • Non-Patent Document 1 discloses a matching technique for matching a person image with a query and displaying matching results in a ranking format.
  • An object of the present disclosure is to provide an image matching device, an image matching method, and a program that make it easier to grasp information indicating matching results compared to conventional techniques.
  • One aspect of the present disclosure provides an image matching device that matches each of a plurality of candidate images with a query image.
  • The image matching device includes: an attribute determination unit that determines an object attribute of the matching object included in each candidate image; a similarity determination unit that determines, using a predetermined similarity calculation algorithm, a similarity indicating the degree to which each candidate image is similar to the query image; a classification unit that classifies the plurality of candidate images into a plurality of groups, one for each object attribute determined by the attribute determination unit; a ranking unit that assigns ranks to the one or more candidate images classified into each group, in descending order of similarity within each group; and an output unit that, under the control of a control unit, outputs information about the one or more candidate images classified into each group, for two or more of the plurality of groups.
  • Another aspect of the present disclosure provides an image matching method for matching each of a plurality of candidate images with a query image. The image matching method includes: an attribute determination step of determining an object attribute of the matching object included in each candidate image; a step of determining, using a predetermined similarity calculation algorithm, a similarity indicating the degree to which each candidate image is similar to the query image; a step of classifying the plurality of candidate images into a plurality of groups for each object attribute determined in the attribute determination step; a step of assigning ranks to the one or more candidate images classified into each group, in descending order of similarity within each group; and a step of outputting information about the one or more candidate images classified into each group, for two or more of the plurality of groups.
  • Yet another aspect of the present disclosure provides a program for causing a control unit to execute the above image matching method.
  • According to the image matching device, the image matching method, and the program of the present disclosure, information indicating a matching result can be grasped more easily than with conventional techniques.
  • Brief description of the drawings: FIG. 1 is a schematic diagram showing an overview of an image matching device according to an embodiment of the present disclosure. FIG. 2 is a block diagram showing a configuration example of the image matching device of FIG. 1. FIG. 3 is a flowchart illustrating the procedure of processing executed by the control unit of the image matching device of FIG. 2. FIG. 4 is a schematic diagram illustrating the distance between the feature vector of a candidate image and the feature vector of the query image. FIG. 5 is a schematic diagram for explaining an example of step S5 of FIG. 3, in which the degree of similarity is determined. FIG. 6 is a schematic diagram illustrating a conventional technique for displaying matching results in a similarity ranking format.
  • A matching technique is known for finding a person to be searched for from among a plurality of captured images generated by a plurality of surveillance cameras installed in towns, on premises, and the like.
  • An example of such a matching technique is a technique of detecting images of people from the plurality of captured images, treating them as candidate images, and calculating a similarity indicating the degree to which each candidate image is similar to a query image.
  • As conventional techniques, a technique of determining whether or not the similarity is equal to or greater than a predetermined threshold, a technique of arranging and displaying the candidate images in ranking format in descending order of similarity, and the like are known.
  • The query image to be matched against is selected by the user from among the plurality of captured images, or is selected in advance from existing images. Alternatively, the query image may be selected automatically by a program from an externally input image, the plurality of captured images, or the like.
  • As a means of displaying matching results, Non-Patent Document 1 discloses a technique of arranging and displaying candidate images in ranking format in descending order of similarity.
  • FIG. 6 is a schematic diagram illustrating such a conventional technique of displaying matching results in a similarity ranking format. By displaying matching results in the ranking format of FIG. 6 on a display device or the like, the user can grasp the matching results at a glance.
  • However, the similarity of a candidate image may be calculated to be high simply because the orientation of the person in the candidate image matches the orientation of the person in the query image Q. For example, candidate images showing candidates facing the same direction as the person in the query image Q (T1 to T3 in FIG. 6) receive higher similarities than candidate images showing candidates facing a different direction (T4 to T6 in FIG. 6). In particular, a candidate image showing a candidate who faces the same direction as the person in the query image Q (T1 to T3 in FIG. 6) may receive a higher similarity than a candidate image showing the same person as the query image Q facing a different direction (T4 in FIG. 6), even when that candidate is a different person. Consequently, when the candidate images are arranged in ranking format, the candidate images showing candidates facing the same direction as the person in the query image Q (T1 to T3 in FIG. 6) occupy the top of the ranking, the candidate image showing the same person facing a different direction (T4 in FIG. 6) appears lower down, and the information indicating the matching result needed to find the same person is buried in other information.
  • The inventors conducted research to solve this problem and developed an image matching device, an image matching method, and a program that make information indicating the matching result easier to grasp than with conventional techniques.
  • In the following description of the embodiments, among the various objects that can appear in an image, a person is used as the example of the object to be matched. A person is one example of the "matching object" of the present disclosure; the matching object is not limited to a person and may be an object other than a person. Likewise, "orientation of a person" is used as the example of an attribute of the matching object that can be recognized from the image itself by image recognition technology or the like, that is, an "object attribute"; the "object attribute" of the present disclosure is also not limited to the orientation of a person. Other examples of matching objects and object attributes are described after the description of the embodiments.
  • FIG. 1 is a schematic diagram showing an outline of an image matching device 100 according to an embodiment of the present disclosure.
  • The image matching device 100 detects images of people from a plurality of image data 50 generated by a plurality of cameras, treats them as candidate images, and ranks the candidate images in order of a similarity indicating the degree to which each candidate image is similar to a query image.
  • In doing so, the image matching device 100 classifies the candidate images into a plurality of groups according to the orientation of the person in each candidate image. For example, the image matching device 100 calculates an attribute ranking of the person orientations according to how similar each orientation is to the orientation of the person in the query image. In the example of FIG. 1, the person in the query image faces forward, so the attribute ranking of the groups is: the forward-facing group (first orientation), the backward-facing group (second orientation), and the right-facing group (third orientation).
  • The image matching device 100 then classifies the plurality of candidate images into the respective groups, as shown in FIG. 1.
  • The image matching device 100 assigns ranks within each classified group in descending order of similarity, and displays the per-group rankings for two or more of the plurality of groups on a display device or the like.
  • FIG. 2 is a block diagram showing a configuration example of the image matching device 100 of FIG. 1.
  • The image matching device 100 includes a control unit 1, a storage device 2, an image acquisition unit 3, an input interface (I/F) 5, and an output interface (I/F) 4.
  • The control unit 1 implements the functions of the image matching device 100 by executing information processing. Such information processing is realized, for example, by the control unit 1 executing a program stored in the storage device 2.
  • The control unit 1 includes a person detection unit 11, a query determination unit 12, an orientation detection unit 13, a similarity determination unit 14, a classification unit 15, and a ranking unit 16.
  • The control unit 1 is composed of circuits such as a CPU, an MPU, or an FPGA.
  • The person detection unit 11 detects people in the image data 50 and treats the image of each detected person as a candidate image.
  • The query determination unit 12 determines the query image against which the candidate images are matched.
  • The orientation detection unit 13 detects the orientation of the face and/or body of the person (hereinafter, the "orientation of the person") included in the query image determined by the query determination unit 12 and in each candidate image detected by the person detection unit 11.
  • The similarity determination unit 14 determines, using a predetermined similarity calculation algorithm, a similarity indicating the degree to which each candidate image is similar to the query image.
  • The classification unit 15 classifies the plurality of candidate images into a plurality of groups for each orientation of the person determined by the orientation detection unit 13.
  • The ranking unit 16 assigns ranks to the one or more candidate images classified into each group, in descending order of similarity within each group. The details of each of these functions are described further below in connection with the operation of the image matching device 100.
  • The storage device 2 is a recording medium that records various information, such as data and programs, including the predetermined similarity calculation algorithm, for causing the control unit 1 to execute the image matching method of the image matching device 100.
  • For example, the storage device 2 stores the feature extraction model 21, a trained model described later, and an image list 22.
  • The storage device 2 is realized by, for example, a semiconductor storage device such as a flash memory or solid-state drive (SSD), a magnetic storage device such as a hard disk drive (HDD), or another recording medium, alone or in combination.
  • The storage device 2 may include volatile memory such as SRAM or DRAM.
  • The image acquisition unit 3 is an interface circuit that connects the image matching device 100 to external equipment in order to input information such as the image data 50 into the image matching device 100.
  • Such external equipment is, for example, another information processing terminal (not shown) or a device such as a camera that acquires the image data 50.
  • The image acquisition unit 3 may be a communication circuit that performs data communication according to an existing wired or wireless communication standard.
  • The input interface 5 is an interface circuit that connects the image matching device 100 to an input device 80, such as a keyboard or mouse, in order to accept user input.
  • The input interface 5 may be a communication circuit that performs data communication according to an existing wired or wireless communication standard.
  • The output interface 4 is an interface circuit that connects the image matching device 100 to an external output device in order to output information from the image matching device 100.
  • Such an output device is, for example, the display device 70.
  • The output interface 4 may be a communication circuit that is connected to the network 60 and performs data communication according to an existing wired or wireless communication standard.
  • The image acquisition unit 3, the input interface 5, and the output interface 4 may be realized by separate hardware or by common hardware.
  • FIG. 3 is a flowchart illustrating the procedure of processing executed by the control unit 1 of the image matching device 100 of FIG. 2.
  • In FIG. 3, the control unit 1 first acquires image data 50 via the image acquisition unit 3 (S1).
  • The image data 50 is, for example, a plurality of image data captured by a plurality of cameras installed on the premises.
  • The person detection unit 11 detects people in the image data 50 acquired in step S1 and treats the image of each detected person as a candidate image (S2). When the person detection unit 11 detects a plurality of people in the image data 50, it generates a candidate image for each of the detected people.
  • Here, detecting a person in the image data 50 includes detecting a region of the image data 50 in which a person is present.
  • The query determination unit 12 determines the query image against which the candidate images are matched (S3). For example, the query determination unit 12 uses as the query image a candidate image that the user selects, using the input device 80 such as a keyboard or mouse, from among the plurality of candidate images obtained in step S2.
  • The query determination unit 12 may instead use as the query image an image of a person stored in advance in the storage device 2, an image of a person input via the image acquisition unit 3, or the like.
  • The query determination unit 12 may also, operating according to instructions from a program, automatically select the query image from among the plurality of candidate images obtained in step S2, images of people stored in advance in the storage device 2, images of people input via the image acquisition unit 3, and the like.
  • The orientation detection unit 13 detects the orientation of the person included in the query image determined in step S3 and in each candidate image detected in step S2 (S4). Specifically, the orientation detection unit 13 first determines to which of a plurality of predetermined, mutually different orientations of the face and/or body the orientation of the person included in the query image belongs. It then determines to which of those predetermined orientations the orientation of the person included in each candidate image belongs.
  • Here, the "orientation of the person included in each candidate image" is an example of the "object attribute of the matching object included in each candidate image" of the present disclosure, and the orientation detection unit 13 is an example of the "attribute determination unit" of the present disclosure.
  • Likewise, the "orientation of the person included in the query image" is an example of the "query attribute indicating the object attribute of the subject included in the query image" of the present disclosure.
  • For example, the orientation detection unit 13 detects the orientation of the person in each candidate image by comparing the feature vector of the candidate image output by the feature extraction model 21 with each of the feature vectors of person images in a plurality of predetermined orientations. The orientation of the person is the orientation of the face and/or body of the person in the candidate image, for example the orientation of the face, the orientation of the upper body, the orientation of the lower body, or an orientation determined by combining such information.
  • Alternatively, the orientation detection unit 13 may use an orientation detection model that takes a person image as input and outputs the orientation of the person in that image.
  • Such an orientation detection model is a trained model constructed by having the model learn the relationship between training images and ground-truth information.
  • A known skeleton detector, posture detector, or face orientation detector may also be applied as the orientation detection unit 13.
  • The orientation of the person detected in this way can be classified, for example, into eight directions as seen from the person in the candidate image: forward, diagonally forward right, right, diagonally backward right, backward, diagonally backward left, left, and diagonally forward left.
  • The orientation of the person is not limited to these eight directions and may be classified into fewer than eight or into nine or more directions, as illustrated in the sketch below.
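The patent does not give an algorithm for this binning; the following minimal sketch, with a hypothetical yaw-angle convention (0 degrees means facing the camera, angles increasing clockwise as seen from above), shows one way eight 45-degree bins could be assigned.

```python
# Hypothetical sketch: bin a person's yaw angle into the eight orientation
# labels named in the text. The angle convention is an assumption.
ORIENTATIONS = [
    "forward", "diagonally forward right", "right", "diagonally backward right",
    "backward", "diagonally backward left", "left", "diagonally forward left",
]

def orientation_label(yaw_deg: float) -> str:
    """Return one of 8 orientation bins, each 45 degrees wide."""
    idx = int(((yaw_deg + 22.5) % 360) // 45)  # center each bin on a multiple of 45
    return ORIENTATIONS[idx]

print(orientation_label(10.0))   # -> "forward"
print(orientation_label(170.0))  # -> "backward"
```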
  • The similarity determination unit 14 determines, using the predetermined similarity calculation algorithm, a similarity indicating the degree to which each candidate image is similar to the query image (S5). For example, the similarity determination unit 14 calculates the similarity based on a comparison between the feature vector of each candidate image and the feature vector of the query image. For example, the predetermined similarity calculation algorithm calculates the similarity from the distance, such as the Euclidean distance or the Mahalanobis distance, or from the inner product between the feature vector of each candidate image and the feature vector of the query image, so that, for example, a smaller distance yields a larger similarity. The predetermined similarity calculation algorithm may also be an algorithm that applies a model constructed by metric learning to calculate the distance between feature vectors.
  • The similarity means, for example, that the larger its value, the higher the degree of matching between the candidate image and the query image.
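As an illustration of such an algorithm, the sketch below converts the Euclidean distance between two feature vectors into a similarity in (0, 1] via the mapping 1/(1 + d). The mapping is an assumption; the text only requires that a smaller distance yield a larger similarity.

```python
import numpy as np

def similarity(candidate_vec: np.ndarray, query_vec: np.ndarray) -> float:
    """Smaller Euclidean distance -> larger similarity, in (0, 1]."""
    dist = float(np.linalg.norm(candidate_vec - query_vec))
    return 1.0 / (1.0 + dist)

query = np.array([0.2, 0.8, 0.1])
x = np.array([0.25, 0.75, 0.1])  # close to the query -> high similarity
y = np.array([0.9, 0.1, 0.6])    # far from the query -> low similarity
print(similarity(x, query) > similarity(y, query))  # True
```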
  • FIG. 4 is a schematic diagram illustrating the distance between the feature vector of a candidate image and the feature vector of the query image.
  • FIG. 4 shows an n-dimensional (n: an integer of 1 or more) feature vector space.
  • In the example of FIG. 4, the feature vector of candidate image X is closer to the feature vector of the query image than that of candidate image Y, so the similarity determination unit 14 determines, by the predetermined similarity calculation algorithm, the similarities so that the similarity of candidate image X is higher than the similarity of candidate image Y.
  • FIG. 5 is a schematic diagram for explaining an example of step S5 of FIG. 3, in which the degree of similarity is determined.
  • For example, the similarity determination unit 14 may use the feature extraction model 21, which outputs the feature vector of an image that is input to it.
  • In this case, the similarity determination unit 14 calculates the similarity based on a comparison between the feature vector of each candidate image output by the feature extraction model 21 and the feature vector of the query image.
  • Such a feature extraction model 21 is a trained model constructed by having the model learn the relationship between training images and ground-truth information.
  • The feature extraction model 21 may be a model having the structure of a neural network, for example a convolutional neural network (CNN).
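The patent does not specify the network; as a stand-in, the sketch below truncates a standard torchvision ResNet-18 (an assumption; in practice a model trained for person matching would be used) so that it outputs a pooled feature vector rather than class scores.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

backbone = models.resnet18(weights=None)  # stand-in for the feature extraction model 21
backbone.fc = torch.nn.Identity()         # drop the classifier; keep the 512-d features
backbone.eval()

preprocess = T.Compose([T.Resize((256, 128)), T.ToTensor()])  # person-crop shape

def extract_feature(image: Image.Image) -> torch.Tensor:
    """Return a 512-dimensional feature vector for one person image."""
    with torch.no_grad():
        return backbone(preprocess(image).unsqueeze(0)).squeeze(0)
```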
  • The classification unit 15 classifies the plurality of candidate images into a plurality of groups for each orientation of the person determined in step S4 (S6). For example, if it was determined in step S4 that the orientation of the person in a candidate image belongs to "forward", the classification unit 15 classifies that candidate image into a first group corresponding to "forward" in step S6 (see FIG. 1). Alternatively, for example, the classification unit 15 may classify the orientation of the person in each candidate image detected in step S4 based on its orientation relative to the orientation of the person in the query image; for example, it may classify the plurality of candidate images into a group whose orientation is the same as that of the person in the query image and groups whose orientations differ from it.
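A minimal sketch of step S6, assuming each candidate carries the orientation label from step S4 and the similarity from step S5 (the field names and values are illustrative, not from the patent):

```python
from collections import defaultdict

candidates = [
    {"id": "T1", "orientation": "forward",  "similarity": 0.91},
    {"id": "T2", "orientation": "backward", "similarity": 0.64},
    {"id": "T3", "orientation": "right",    "similarity": 0.58},
    {"id": "T4", "orientation": "forward",  "similarity": 0.88},
]

groups = defaultdict(list)  # step S6: one group per object attribute
for c in candidates:
    groups[c["orientation"]].append(c)
```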
  • Next, the ranking unit 16 assigns ranks to the one or more candidate images classified into each group in step S6, in descending order of the similarity determined in step S5 (S7). For example, the ranking unit 16 also calculates an attribute ranking of the person orientations according to how similar each orientation is to the orientation of the person in the query image. In the example of FIG. 1, the orientation of the person in the query image is forward, so the attribute ranking of the orientations is: forward (first orientation), backward (second orientation), and right-facing (third orientation).
  • Then, as shown in FIG. 1, the ranking unit 16 ranks the candidate images classified into the forward-facing group starting from first place, and likewise ranks the candidate images classified into the backward-facing group and those classified into the right-facing group, each starting from first place.
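Continuing the grouping sketch above, step S7 can be expressed as a per-group sort in descending order of similarity, with the groups themselves ordered by the attribute ranking from the FIG. 1 example:

```python
# Step S7: rank candidates within each group in descending order of similarity.
ranked = {
    orientation: sorted(members, key=lambda c: c["similarity"], reverse=True)
    for orientation, members in groups.items()
}

# Attribute ranking from the FIG. 1 example: the query person faces forward.
attribute_order = ["forward", "backward", "right"]
for orientation in attribute_order:
    for rank, c in enumerate(ranked.get(orientation, []), start=1):
        print(f"{orientation}: rank {rank} -> {c['id']} ({c['similarity']})")
```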
  • The attribute ranking of person orientations described above is determined, for example, by the degree to which the orientation of the person in the candidate images classified into each group (for example, forward, backward, or right-facing) is similar to the orientation of the person in the query image (hereinafter, "orientation similarity").
  • Orientation similarity is an example of the "attribute similarity" of the present disclosure.
  • The orientation similarity is calculated by the similarity determination unit 14, for example. For example, the similarity determination unit 14 calculates the orientation similarity based on a comparison between the outline of the person in the candidate images classified into each group and the outline of the person in the query image.
  • Alternatively, the orientation similarity may be predetermined according to the orientation of the person in the candidate image relative to the orientation of the person in the query image. For example, when the person in the query image faces forward, the orientation similarity may be set to progressively smaller values for the forward-facing group, the backward-facing group, and the right-facing group, in that order.
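A predetermined orientation-similarity table of this kind could look like the following sketch; the numeric values are assumptions, chosen only so that the forward-facing group scores highest when the query person faces forward.

```python
# Hypothetical predetermined orientation similarities for a forward-facing query.
ORIENTATION_SIMILARITY = {"forward": 1.0, "backward": 0.6, "right": 0.3}

# The attribute ranking of the groups follows descending orientation similarity.
attribute_order = sorted(ORIENTATION_SIMILARITY,
                         key=ORIENTATION_SIMILARITY.get, reverse=True)
print(attribute_order)  # ['forward', 'backward', 'right']
```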
  • In step S8, the control unit 1 outputs, for two or more of the plurality of groups, information in which the one or more candidate images classified into each group in step S6 are associated with their ranks within the group, to the display device 70. The candidate images are displayed on the display device 70 in rank order within each group (S8).
  • Specifically, the control unit 1 outputs, via the output interface 4, the plurality of candidate images, the group to which each candidate image was assigned in step S6, and information indicating the rank given within each group in step S7, to the display device 70. For example, as shown in FIG. 1, the control unit 1 causes the display device 70 to display two or more of the plurality of groups arranged in the vertical direction (first direction) in order of attribute ranking, with the one or more candidate images classified into each group arranged in the horizontal direction (second direction) in order of their rank within the group.
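As a rough illustration of this layout, the sketch below (reusing `ranked` and `attribute_order` from the sketches above) prints the groups as rows in attribute-rank order, with each row's candidates left to right in similarity-rank order:

```python
def render(ranked: dict, attribute_order: list) -> str:
    """One text row per group (vertical = attribute rank),
    candidates left to right (horizontal = similarity rank)."""
    rows = []
    for orientation in attribute_order:
        ids = [c["id"] for c in ranked.get(orientation, [])]
        rows.append(f"{orientation:>8}: " + "  ".join(ids))
    return "\n".join(rows)

print(render(ranked, attribute_order))
```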
  • Alternatively, the similarity ranking for the object attribute with the first attribute rank may be displayed first, and the similarity rankings for the object attributes with the second and subsequent attribute ranks may be displayed by switching screens. The object attributes may also be displayed in a predetermined order, regardless of attribute rank.
  • In step S8, the control unit 1 may output to the display device 70 the candidate images from first place down to a predetermined rank in each group, in association with their ranks. For example, the control unit 1 selects the first through tenth candidate images in each group and causes the display device 70 to display them together with information on the group to which each candidate image belongs and its rank.
  • Alternatively, in step S8, the control unit 1 may display on the display device 70, for two or more of the plurality of groups, only those candidate images whose similarity is equal to or higher than a predetermined threshold among the one or more candidate images classified into each group.
  • For example, the control unit 1 selects the candidate images with a similarity of 0.7 or higher in each group and causes the display device 70 to display them together with information on the group to which each candidate image belongs and its rank.
  • This reduces the amount of data output to the display device 70 and the associated processing load, and also reduces the amount of information processing in the display device 70.
  • A separate threshold may be set for each group. For example, if the person in the query image faces forward, the control unit 1 selects candidate images with a similarity of 0.8 or higher from the forward-facing group and candidate images with a similarity of 0.5 or higher from the backward-facing group, and displays them on the display device 70.
  • Setting a higher threshold for the group into which candidate images with the same orientation as the person in the query image are classified has the following advantage. A candidate image showing a different person facing the same direction as the person in the query image tends to receive a higher calculated similarity than a candidate image showing the same person facing a different direction. Therefore, even when a candidate image belonging to the same-orientation group has a high similarity, its probability of actually matching the query image may be lower than that of a candidate image belonging to another group with a similarly high similarity. By setting a higher threshold for the same-orientation group than for the other groups, the control unit 1 reflects this substantial degree of matching in the thresholds, so that only candidate images whose substantial degree of matching exceeds a predetermined reference value are selected and displayed on the display device 70. A sketch of such per-group thresholds follows.
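The thresholds 0.8 and 0.5 come from the numeric example above; wiring them up per group could look like this sketch (applying 0.5 to every non-matching group, rather than only the backward-facing one, is an assumption):

```python
def select_for_display(ranked: dict, query_orientation: str) -> dict:
    """Keep, per group, only candidates at or above that group's threshold:
    0.8 for the group matching the query person's orientation, 0.5 otherwise."""
    selected = {}
    for orientation, members in ranked.items():
        threshold = 0.8 if orientation == query_orientation else 0.5
        selected[orientation] = [c for c in members if c["similarity"] >= threshold]
    return selected
```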
  • As described above, the image matching device 100, which matches each of a plurality of candidate images with a query image, includes the orientation detection unit 13 (an example of the attribute determination unit), the similarity determination unit 14, the classification unit 15, the ranking unit 16, and the output interface 4.
  • The orientation detection unit 13 determines the orientation of the person included in each candidate image.
  • The similarity determination unit 14 determines, using a predetermined similarity calculation algorithm, a similarity indicating the degree to which each candidate image is similar to the query image.
  • The classification unit 15 classifies the plurality of candidate images into a plurality of groups for each orientation determined by the orientation detection unit 13.
  • The ranking unit 16 assigns ranks to the one or more candidate images classified into each group, in descending order of similarity within each group.
  • The output interface 4 outputs information about the one or more candidate images classified into each group, for two or more of the plurality of groups, under the control of the control unit 1.
  • With this configuration, the information processing apparatus at the output destination, or the user who views the displayed output information, can grasp the information indicating the matching result more easily than with conventional techniques, and the image matching device 100 can solve the problem of the information indicating the matching result being buried in other information.
  • The output interface 4 may output the candidate images from first place down to a predetermined rank in each group, in association with their ranks.
  • The output interface 4 may also output, for two or more of the plurality of groups, only those candidate images whose similarity is equal to or higher than a predetermined threshold among the one or more candidate images classified into each group.
  • The output interface 4 may output information to the display device 70 under the control of the control unit 1, and may cause the display device 70 to display the one or more candidate images classified into each group in rank order, for each of two or more of the plurality of groups.
  • The orientation detection unit 13 may further determine the orientation of the subject included in the query image.
  • In that case, the similarity determination unit 14 further determines an orientation similarity indicating the degree to which the person orientation corresponding to each of the plurality of groups produced by the classification unit 15 is similar to the orientation of the subject included in the query image. Orientation similarity is an example of the "attribute similarity" of the present disclosure.
  • The ranking unit 16 further assigns ranks (orientation ranks) to the plurality of groups in descending order of orientation similarity.
  • The output interface 4 causes the display device 70 to display the plurality of groups arranged in the vertical direction (first direction) in order of orientation rank, with the one or more candidate images classified into each group arranged in the horizontal direction (second direction) in order of similarity rank within the group.
  • Because the groups are arranged in order of orientation rank and the candidate images belonging to each group are arranged horizontally in order of similarity rank, the information indicating the matching result can be grasped still more easily.
  • The object to be matched is not limited to a person and may be an object other than a person, such as a vehicle, a building, or a robot.
  • The object attribute may accordingly be an attribute of an object other than a person, such as the color, material, or shape of the object.
  • Even for a person, the object attribute is not limited to the orientation of the person and may be an attribute such as the person's height, body type, age, age group, or gender.
  • A person's height and body type can be readily estimated using image recognition technology.
  • Age, age group, and gender can also be estimated using image recognition technology. For example, gender can be estimated from the type of clothing, hairstyle, body shape, and so on of a person in an image, and age can be estimated from the degree of facial wrinkles and hair color.
  • The object attribute may be an attribute representing whether or not a person is wearing clothing of a specific shape, such as a suit.
  • The object attribute may also be an attribute indicating whether a person is holding a bag, carrying a backpack, pulling a suitcase, or making a phone call.
  • The presence or absence of a belonging such as a bag, used as an object attribute, includes information indicating not only whether the person actually possesses the belonging but also whether the belonging is visible in the image.
  • A belonging that is visible in an image captured at one time may not appear in images captured at other times, for example because it is hidden by the owner's body or because the owner has left it somewhere.
  • Even in such cases, by grouping the candidate images according to this attribute, the information processing apparatus at the output destination or the user viewing the displayed output information can grasp the information indicating the matching result more easily than with conventional techniques, and the image matching device 100 can solve the problem of the information indicating the matching result being buried in other information.
  • The object attribute may also be an attribute indicating whether a person is riding a vehicle such as a bicycle or motorcycle, walking, running, standing still, standing, or sitting.
  • To determine such attributes, the attribute determination unit may detect the posture of the person in each candidate image and estimate the attribute based on the detected posture, as in the sketch below.
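The patent does not give a posture rule; the following sketch assumes keypoints from a skeleton detector (in image coordinates, with y increasing downward) and uses a simple, hypothetical hip-to-knee heuristic to choose between "standing" and "sitting".

```python
def standing_or_sitting(hip_y: float, knee_y: float, torso_len: float) -> str:
    """Hypothetical heuristic: when sitting, the vertical hip-to-knee
    distance shrinks relative to the torso length."""
    return "sitting" if (knee_y - hip_y) < 0.5 * torso_len else "standing"

print(standing_or_sitting(hip_y=200.0, knee_y=260.0, torso_len=150.0))  # sitting
```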
  • In the above embodiment, the display device 70 was used as an example of the output destination of information from the image matching device 100. However, the output destination is not limited to this; the image matching device 100 may output the information, via the network 60, to an information processing terminal such as a smartphone, a tablet, or a notebook computer, for example.
  • The image matching device 100 may execute a coarse matching process in steps S1 to S7 of FIG. 3 and then execute a second, finer matching process, for example steps S4 to S8, on the result. Such a scheme excludes candidate images whose similarity in the coarse matching process is below a predetermined threshold from the second, fine matching process, which reduces the processing load of the fine matching process and improves the processing speed.
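A schematic of this coarse-to-fine flow, with hypothetical `coarse_similarity` and `fine_similarity` callables standing in for the two passes (the 0.4 default threshold is illustrative):

```python
def two_stage_match(candidates, query, coarse_similarity, fine_similarity,
                    coarse_threshold=0.4):
    """Prune with a cheap coarse pass, then run the finer matching
    (steps S4 to S8) only on the surviving candidates."""
    survivors = [c for c in candidates
                 if coarse_similarity(c, query) >= coarse_threshold]
    return sorted(survivors, key=lambda c: fine_similarity(c, query), reverse=True)
```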
  • The present disclosure is applicable to image search technology and image matching technology.
  • Reference signs: 1 control unit, 2 storage device, 3 image acquisition unit, 4 output interface, 5 input interface, 11 person detection unit, 12 query determination unit, 13 orientation detection unit, 14 similarity determination unit, 15 classification unit, 16 ranking unit, 21 feature extraction model, 22 image list, 50 image data, 60 network, 70 display device, 80 input device, 100 image matching device

Abstract

This image matching device comprises: an attribute determination unit that determines an object attribute of a matching object included in each candidate image; a similarity determination unit that determines a similarity indicating the degree of similarity of each candidate image to a query image by using a predetermined similarity calculation algorithm; a classification unit that classifies a plurality of the candidate images into a plurality of groups for each object attribute determined by the attribute determination unit; a ranking unit that ranks one or more candidate images classified into each group in descending order of similarity for each group; and an output unit that outputs information about the one or more candidate images classified into each group under the control of a control unit.

Description

画像照合装置、画像照合方法、及びプログラムImage matching device, image matching method, and program
 本開示は、画像照合装置、画像照合方法、及びプログラムに関する。 The present disclosure relates to an image matching device, an image matching method, and a program.
 非特許文献1は、人物画像をクエリと照合する照合技術について、照合結果をランキング形式で並べて表示する技術を開示している。 Non-Patent Document 1 discloses a matching technique for matching a person image with a query and displaying matching results in a ranking format.
 本開示は、照合結果を示す情報を従来技術に比べて把握しやすくすることができる画像照合装置、画像照合方法、及びプログラムを提供することを目的とする。 An object of the present disclosure is to provide an image matching device, an image matching method, and a program that make it easier to grasp information indicating matching results compared to conventional techniques.
 本開示の一態様は、複数の候補画像のそれぞれをクエリ画像と照合する画像照合装置を提供する。画像照合装置は、各候補画像に含まれる照合対象物の物体属性を決定する属性決定部と、所定の類似度算出アルゴリズムを用いて各候補画像がクエリ画像に類似する程度を示す類似度を決定する類似度決定部と、複数の候補画像を属性決定部によって決定された物体属性毎に複数の群に分類する分類部と、各群に分類された1以上の候補画像に、群毎に、類似度が高い順に順位を付与する順位付与部と、制御部による制御に従い、各群に分類された1以上の候補画像に関する情報を複数の群の中の2以上の群について出力する出力部と、を備える。 One aspect of the present disclosure provides an image matching device that matches each of a plurality of candidate images with a query image. The image matching device uses an attribute determination unit that determines an object attribute of an object to be matched included in each candidate image, and determines a similarity that indicates the degree of similarity between each candidate image and a query image using a predetermined similarity calculation algorithm. a similarity determination unit for classifying a plurality of candidate images into a plurality of groups for each object attribute determined by the attribute determination unit; one or more candidate images classified into each group; a rank assigning unit that assigns ranks in descending order of similarity; and an output unit that outputs information about one or more candidate images classified into each group for two or more groups among the plurality of groups under the control of the control unit. , provided.
 本開示の他の態様は、複数の候補画像のそれぞれをクエリ画像と照合する画像照合方法を提供する。画像照合方法は、各候補画像に含まれる照合対象物の物体属性を決定する属性決定ステップと、所定の類似度算出アルゴリズムを用いて各候補画像がクエリ画像に類似する程度を示す類似度を決定するステップと、複数の候補画像を属性決定ステップにおいて決定された物体属性毎に複数の群に分類するステップと、各群に分類された1以上の候補画像に、群毎に、類似度が高い順に順位を付与するステップと、各群に分類された1以上の候補画像に関する情報を複数の群の中の2以上の群について出力するステップと、を含む。 Another aspect of the present disclosure provides an image matching method for matching each of a plurality of candidate images with a query image. The image matching method includes an attribute determining step of determining an object attribute of an object to be matched included in each candidate image, and determining a similarity indicating the degree of similarity of each candidate image to the query image using a predetermined similarity calculation algorithm. a step of classifying a plurality of candidate images into a plurality of groups for each object attribute determined in the attribute determination step; and one or more candidate images classified into each group having high similarity for each group and outputting information about one or more candidate images classified into each group for two or more groups among the plurality of groups.
 本開示の更に他の態様は、上記の画像照合方法を制御部に実行させるためのプログラムを提供する。 Yet another aspect of the present disclosure provides a program for causing a control unit to execute the above image matching method.
 本開示に係る画像照合装置、画像照合方法、及びプログラムによれば、照合結果を示す情報を従来技術に比べて把握しやすくすることができる。 According to the image matching device, the image matching method, and the program according to the present disclosure, it is possible to grasp the information indicating the matching result more easily than in the conventional technology.
本開示の実施形態に係る画像照合装置の概要を示す模式図Schematic diagram showing an overview of an image matching device according to an embodiment of the present disclosure 図1の画像照合装置の構成例を示すブロック図Block diagram showing a configuration example of the image matching device in FIG. 図2の画像照合装置の制御部によって実行される処理の手順を例示するフローチャート3 is a flow chart illustrating the procedure of processing executed by the control unit of the image collating apparatus of FIG. 2; 候補画像の特徴量ベクトルとクエリ画像の特徴量ベクトルとの距離を例示する模式図Schematic diagram illustrating the distance between the feature amount vector of the candidate image and the feature amount vector of the query image 図3の類似度を決定するステップS5の一例を説明するための模式図Schematic diagram for explaining an example of step S5 for determining the degree of similarity in FIG. 照合結果を類似度ランキング形式で並べて表示する従来技術を例示する模式図Schematic diagram illustrating conventional technology for displaying matching results in a similarity ranking format
(本開示に至った経緯)
 街中、構内等に設置された複数の監視カメラによって生成された複数の撮像画像の中から、検索対象人物を探索する照合技術が知られている。このような照合技術の一例は、複数の撮像画像から人の画像を検出してこれを候補画像とし、候補画像がクエリ画像に類似する程度を示す類似度を算出する技術である。従来技術としては、類似度が予め定められた閾値以上であるか否かを判断する技術、類似度が高いものから順に、候補画像をランキング形式で並べて表示する技術等が知られている。照合対象であるクエリ画像は、上記の複数の撮像画像の中からユーザによって選択され、又は、既存の画像の中から予め選択される。あるいは、クエリ画像は、外部から入力された画像、上記の複数の撮像画像等の中から、プログラムによって自動的に選択されてもよい。
(Circumstances leading to this disclosure)
2. Description of the Related Art A matching technique is known for searching for a person to be searched from among a plurality of captured images generated by a plurality of surveillance cameras installed in towns, premises, and the like. An example of such a matching technique is a technique of detecting an image of a person from a plurality of captured images, using this as a candidate image, and calculating a similarity indicating the degree of similarity between the candidate image and the query image. As conventional techniques, a technique of determining whether or not the degree of similarity is equal to or greater than a predetermined threshold, a technique of arranging and displaying candidate images in ranking order in descending order of similarity, and the like are known. A query image to be matched is selected by the user from among the plurality of captured images, or is selected in advance from existing images. Alternatively, the query image may be automatically selected by a program from an externally input image, the plurality of captured images, or the like.
 照合結果を表示する手段として、非特許文献1は、類似度が高いものから順に、候補画像をランキング形式で並べて表示する技術を開示している。図6は、照合結果を類似度ランキング形式で並べて表示する従来技術を例示する模式図である。図6のようなランキング形式の照合結果を表示装置等に表示することにより、ユーザは、照合結果を一目で把握することができる。 As a means of displaying matching results, Non-Patent Document 1 discloses a technique for arranging and displaying candidate images in a ranking format in descending order of similarity. FIG. 6 is a schematic diagram illustrating a conventional technique for arranging and displaying matching results in a similarity ranking format. By displaying the matching result in the ranking format as shown in FIG. 6 on a display device or the like, the user can grasp the matching result at a glance.
 しかしながら、候補画像の類似度は、候補画像における人の向きが、クエリ画像Qにおける人の向きと一致している場合に高く算出される場合がある。例えば、クエリ画像Qにおける人の向きと同じ向きの候補者が映る候補画像(図6のT1~T3)は、クエリ画像Qにおける人の向きと異なる向きの候補者が映る候補画像(図6のT4~T6)より、候補画像の類似度が高く算出される。特に、クエリ画像Qにおける人の向きと同じ向きの候補者が映る候補画像(図6のT1~T3)は、当該候補者がクエリ画像Qと異なる人である場合であっても、クエリ画像Qと同一人物が異なる方向を向いた候補画像(図6のT4)に比べて類似度が高く算出される問題がある。したがって、候補画像をランキング形式で並べると、クエリ画像Qにおける人の向きと同じ向きの候補者が映る候補画像(図6のT1~T3)がランキングの上位に並び、クエリ画像Qと同一人物が異なる方向を向いた候補画像(図6のT4)が下位に出現してしまい、同一人物を見つけるための照合結果を示す情報が他の情報に埋もれてしまう問題がある。 However, the similarity of the candidate image may be calculated to be high when the orientation of the person in the candidate image matches the orientation of the person in the query image Q. For example, the candidate images (T1 to T3 in FIG. 6) showing the candidates facing the same direction as the person in the query image Q are the candidate images (T1 to T3 in FIG. 6) showing candidates facing in a direction different from the direction of the person in the query image Q ( From T4 to T6), the similarity of the candidate image is calculated to be high. In particular, the candidate images (T1 to T3 in FIG. 6) showing the candidate facing the same direction as the person in the query image Q are displayed in the query image Q even if the candidate is a different person from the query image Q. There is a problem that the similarity is calculated to be higher than the candidate image (T4 in FIG. 6) in which the same person is facing a different direction. Therefore, when the candidate images are arranged in a ranking format, the candidate images (T1 to T3 in FIG. 6) showing the candidates facing the same direction as the person in the query image Q are arranged at the top of the ranking. A candidate image facing a different direction (T4 in FIG. 6) appears at a lower level, and information indicating the matching result for finding the same person is buried in other information.
 本発明者らは、上記課題を解決するために研究を行い、照合結果を示す情報を従来技術に比べて把握しやすくする画像照合装置、画像照合方法、及びプログラムを開発するに至った。 The inventors have conducted research to solve the above problems, and have developed an image matching device, an image matching method, and a program that make it easier to grasp the information indicating the matching result compared to the conventional technology.
 以下、適宜図面を参照しながら、実施の形態を詳細に説明する。但し、必要以上に詳細な説明は省略する場合がある。例えば、既によく知られた事項の詳細説明や実質的に同一の構成に対する重複説明を省略する場合がある。これは、以下の説明が不必要に冗長になるのを避け、当業者の理解を容易にするためである。 Hereinafter, embodiments will be described in detail with reference to the drawings as appropriate. However, more detailed description than necessary may be omitted. For example, detailed descriptions of well-known matters and redundant descriptions of substantially the same configurations may be omitted. This is to avoid unnecessary verbosity in the following description and to facilitate understanding by those skilled in the art.
 なお、出願人は、当業者が本開示を十分に理解するために添付図面および以下の説明を提供するのであって、これらによって特許請求の範囲に記載の主題を限定することを意図するものではない。例えば、以下の実施形態の説明では、画像に含まれ得る種々の物体のうち、人物(以下「人」と略記することがある。)を対象として照合を行う例を挙げる。人は「照合対象物」の一例であるが、本開示の「照合対象物」は人には限定されず、人以外の物体であってもよい。また、画像認識技術等により画像自体から認識可能な照合対象物の持つ属性、すなわち「物体属性」として「人の向き」を例示して説明するが、本開示の「物体属性」も人の向きに限定されない。照合対象物及び物体属性の他の例については実施形態の説明の後に説明する。 It is noted that Applicants provide the accompanying drawings and the following description for a full understanding of the present disclosure by those skilled in the art and are not intended to limit the claimed subject matter thereby. Absent. For example, in the following description of the embodiments, an example of matching a person (hereinafter sometimes abbreviated as “person”) among various objects that may be included in an image will be given. A person is an example of a "matching object", but the "matching object" of the present disclosure is not limited to a person, and may be an object other than a person. In addition, the attribute of the object to be matched that can be recognized from the image itself by image recognition technology or the like, that is, the “object attribute” will be described by exemplifying “human orientation”, but the “object attribute” of the present disclosure also is not limited to Other examples of matching objects and object attributes will be described after the description of the embodiments.
1.構成
 図1は、本開示の実施形態に係る画像照合装置100の概要を示す模式図である。画像照合装置100は、複数のカメラによって生成された複数の画像データ50から人の画像を検出してこれを候補画像とし、候補画像がクエリ画像に類似する程度を示す類似度の順に、候補画像に順位を付する。この際、画像照合装置100は、各候補画像における人の向き毎に複数の群に分類する。例えば、画像照合装置100は、複数の候補画像における人の向きがクエリ画像の人の向きに類似する、人の向きに関する属性順位を算出する。画像照合装置100は、複数の候補画像を、クエリ画像と同じ前向き(第1の向き)の群、後ろ向き(第2の向き)、右向き(第3の向き)の順番で群の属性順位を算出する。そして、画像照合装置100は、図1に示すように、複数の候補画像を各々の群に分類する。画像照合装置100は、分類された群毎に、類似度が高い順に順位を付与し、群毎の順位を複数の群の中の2以上の群について表示装置等に表示する。
1. Configuration FIG. 1 is a schematic diagram showing an outline of an image matching device 100 according to an embodiment of the present disclosure. The image matching device 100 detects human images from a plurality of image data 50 generated by a plurality of cameras and uses them as candidate images. to rank. At this time, the image matching apparatus 100 classifies each candidate image into a plurality of groups according to the orientation of the person. For example, the image matching apparatus 100 calculates an attribute rank regarding the orientation of a person in a plurality of candidate images similar to the orientation of the person in the query image. The image matching apparatus 100 calculates the attribute ranking of a plurality of candidate images in the same order as the query image, the group facing forward (first orientation), backward facing (second orientation), and facing right (third orientation). do. Then, the image matching device 100 classifies the plurality of candidate images into respective groups as shown in FIG. The image matching apparatus 100 ranks each classified group in descending order of similarity, and displays the rank of each group for two or more of the plurality of groups on a display device or the like.
 図2は、図1の画像照合装置100の構成例を示すブロック図である。画像照合装置100は、制御部1と、記憶装置2と、画像取得部3と、入力インタフェース(I/F)5と、出力インタフェース(I/F)4とを備える。 FIG. 2 is a block diagram showing a configuration example of the image matching device 100 of FIG. The image matching device 100 includes a control unit 1, a storage device 2, an image acquisition unit 3, an input interface (I/F) 5, and an output interface (I/F) 4.
 制御部1は、情報処理を実行して画像照合装置100の機能を実現する。このような情報処理は、例えば、制御部1が記憶装置2に格納されたプログラムを実行することにより実現される。制御部1は、人検出部11と、クエリ決定部12と、向き検出部13と、類似度決定部14と、分類部15と、順位付与部16とを含む。制御部1は、CPU、MPU、FPGA等の回路で構成される。 The control unit 1 implements the functions of the image matching device 100 by executing information processing. Such information processing is realized by executing a program stored in the storage device 2 by the control unit 1, for example. The control unit 1 includes a person detection unit 11 , a query determination unit 12 , an orientation detection unit 13 , a similarity determination unit 14 , a classification unit 15 and a ranking unit 16 . The control unit 1 is composed of circuits such as a CPU, MPU, and FPGA.
 以下、制御部1の各構成要素の機能の一例について説明する。人検出部11は、画像データ50内で人を検出し、検出された人の画像を候補画像とする。クエリ決定部12は、候補画像の照合対象であるクエリ画像を決定する。向き検出部13は、クエリ決定部12によって決定されたクエリ画像、及び、人検出部11によって検出された各候補画像に含まれる人の顔及び/又は体の向き(以下、「人の向き」という。)を検出する。類似度決定部14は、所定の類似度算出アルゴリズムを用いて各候補画像がクエリ画像に類似する程度を示す類似度を決定する。分類部15は、複数の候補画像を、類似度決定部14によって決定された人の向き毎に複数の群に分類する。順位付与部16は、各群に分類された1以上の候補画像に、群毎に、類似度が高い順に順位を付与する。上述の各機能の詳細は、後述の画像照合装置100の動作に関連してさらに説明する。 An example of the function of each component of the control unit 1 will be described below. The person detection unit 11 detects a person in the image data 50 and uses the image of the detected person as a candidate image. The query determination unit 12 determines a query image to be matched with candidate images. The orientation detection unit 13 detects the orientation of a person's face and/or body (hereinafter referred to as "person's orientation") included in the query image determined by the query determination unit 12 and each candidate image detected by the person detection unit 11. ) is detected. The similarity determination unit 14 determines a similarity indicating the degree of similarity of each candidate image to the query image using a predetermined similarity calculation algorithm. The classification unit 15 classifies the plurality of candidate images into a plurality of groups for each orientation of the person determined by the similarity determination unit 14 . The ranking unit 16 ranks one or more candidate images classified into each group in descending order of similarity for each group. Details of each of the functions described above will be further described in relation to the operation of the image collating apparatus 100, which will be described later.
 記憶装置2は、画像照合装置100による画像照合方法を制御部1に実行させるための所定の類似度算出アルゴリズムを含むプログラム、データ等の種々の情報を記録する記録媒体である。例えば、記憶装置2は、学習済みモデルである後述の特徴抽出モデル21と、画像リスト22とを格納する。記憶装置2は、例えば、フラッシュメモリ、ソリッド・ステート・ドライブ(SSD)等の半導体記憶装置、ハードディスクドライブ(HDD)等の磁気記憶装置、その他の記録媒体単独で又はそれらを組み合わせて実現される。記憶装置2は、SRAM、DRAM等の揮発性メモリを含んでもよい。 The storage device 2 is a recording medium for recording various information such as data and programs including a predetermined similarity calculation algorithm for causing the control unit 1 to execute the image matching method by the image matching device 100 . For example, the storage device 2 stores a later-described feature extraction model 21 that is a trained model and an image list 22 . The storage device 2 is realized by, for example, a semiconductor storage device such as a flash memory, a solid state drive (SSD), a magnetic storage device such as a hard disk drive (HDD), or other recording media alone or in combination. The storage device 2 may include volatile memory such as SRAM and DRAM.
 画像取得部3は、画像データ50等の情報を画像照合装置100に入力するために、画像照合装置100と外部機器とを接続するインタフェース回路である。このような外部機器は、例えば、図示しない他の情報処理端末、画像データ50を取得するカメラ等の装置である。画像取得部3は、既存の有線通信規格又は無線通信規格に従ってデータ通信を行う通信回路であってもよい。 The image acquisition unit 3 is an interface circuit that connects the image matching device 100 and an external device in order to input information such as the image data 50 to the image matching device 100 . Such an external device is, for example, another information processing terminal (not shown) or a device such as a camera that acquires the image data 50 . The image acquisition unit 3 may be a communication circuit that performs data communication according to existing wired communication standards or wireless communication standards.
 入力インタフェース5は、ユーザ入力を受け付けるために、画像照合装置100とキーボード、マウス等の入力装置80とを接続するインタフェース回路である。入力インタフェース5は、既存の有線通信規格又は無線通信規格に従ってデータ通信を行う通信回路であってもよい。 The input interface 5 is an interface circuit that connects the image collating device 100 and an input device 80 such as a keyboard and a mouse in order to accept user input. The input interface 5 may be a communication circuit that performs data communication according to existing wired communication standards or wireless communication standards.
 出力インタフェース4は、画像照合装置100から情報を出力するために、画像照合装置100と外部の出力装置とを接続するインタフェース回路である。このような出力装置は、例えば表示装置70である。出力インタフェース4は、既存の有線通信規格又は無線通信規格に従ってネットワーク60に接続されてデータ通信を行う通信回路であってもよい。画像取得部3、入力インタフェース5及び出力インタフェース4は、別個の、又は共通のハードウェアにより実現されてもよい。 The output interface 4 is an interface circuit that connects the image matching device 100 and an external output device in order to output information from the image matching device 100 . Such an output device is, for example, the display device 70 . The output interface 4 may be a communication circuit that is connected to the network 60 and performs data communication according to existing wired communication standards or wireless communication standards. The image acquisition unit 3, input interface 5 and output interface 4 may be realized by separate or common hardware.
2.動作
 図3は、図2の画像照合装置100の制御部1によって実行される処理の手順を例示するフローチャートである。
2. Operation FIG. 3 is a flow chart illustrating a procedure of processing executed by the control section 1 of the image matching apparatus 100 of FIG.
 図3において、制御部1は、画像取得部3を介して、画像データ50を取得する(S1)。画像データ50は、例えば、構内に設置された複数のカメラによって撮像された複数の画像データである。 In FIG. 3, the control unit 1 acquires image data 50 via the image acquisition unit 3 (S1). The image data 50 is, for example, a plurality of image data captured by a plurality of cameras installed in the premises.
 人検出部11は、ステップS1で取得された画像データ50内で人を検出し、検出された人の画像を候補画像とする(S2)。人検出部11は、画像データ50内で複数の人を検出した場合は、検出した複数の人のそれぞれにつき、候補画像を生成する。ここで、画像データ50内で人を検出するとは、画像データ50内で人が存在する領域を検出することを含む。 The person detection unit 11 detects a person in the image data 50 acquired in step S1, and uses the image of the detected person as a candidate image (S2). When detecting a plurality of persons in the image data 50, the person detection unit 11 generates a candidate image for each of the detected persons. Here, detecting a person in the image data 50 includes detecting an area in which a person exists in the image data 50 .
 クエリ決定部12は、候補画像の照合対象であるクエリ画像を決定する(S3)。例えば、クエリ決定部12は、ステップS2で得られた複数の候補画像の中から、ユーザがキーボード、マウス等の入力装置80を用いて選択した候補画像を、クエリ画像とする。クエリ決定部12は、記憶装置2に予め格納された人の画像、画像取得部3を介して入力された人の画像等をクエリ画像としてもよい。クエリ決定部12は、プログラムの指令に従って動作することにより、ステップS2で得られた複数の撮像画像、記憶装置2に予め格納された人の画像、画像取得部3を介して入力された人の画像等の中からクエリ画像を自動的に選択してもよい。 The query determination unit 12 determines a query image to be matched with candidate images (S3). For example, the query determination unit 12 uses, as a query image, a candidate image selected by the user using the input device 80 such as a keyboard or mouse from among the plurality of candidate images obtained in step S2. The query determination unit 12 may use an image of a person stored in advance in the storage device 2, an image of a person input via the image acquisition unit 3, or the like as a query image. The query determination unit 12 operates in accordance with instructions from the program to obtain the plurality of captured images obtained in step S2, the person's image stored in advance in the storage device 2, and the person's image input via the image acquisition unit 3. A query image may be automatically selected from images and the like.
 向き検出部13は、ステップS3で決定されたクエリ画像と、ステップS2で検出した各候補画像に含まれる人の向きを検出する(S4)。具体的には、まず、向き検出部13は、クエリ画像に含まれる人の向きが、予め定められた顔及び/又は体に関する互いに異なる複数の向きのいずれに属するかを決定する。そして、向き検出部13は、各候補画像に含まれる人の向きが、予め定められた顔及び/又は体に関する互いに異なる複数の向きのいずれに属するかを決定する。 The orientation detection unit 13 detects the orientation of the person included in the query image determined in step S3 and each candidate image detected in step S2 (S4). Specifically, first, the orientation detection unit 13 determines to which of a plurality of predetermined face and/or body different orientations the orientation of the person included in the query image belongs. Then, the orientation detection unit 13 determines to which of a plurality of predetermined face and/or body different orientations the orientation of the person included in each candidate image belongs.
 Here, the "orientation of the person included in each candidate image" is an example of the "object attribute of the matching object included in each candidate image" of the present disclosure, and the orientation detection unit 13 is an example of the "attribute determination unit" of the present disclosure. Likewise, the "orientation of the person included in the query image" is an example of the "query attribute indicating the object attribute of the subject included in the query image" of the present disclosure.
 For example, the orientation detection unit 13 detects the orientation of the person in each candidate image by comparing the feature vector of that candidate image output by the feature extraction model 21 with each of the feature vectors of person images in a plurality of predetermined orientations. The orientation of a person is the orientation of the face and/or body of the person shown in the candidate image, for example, the orientation of the face, the orientation of the upper body, the orientation of the lower body, or an orientation determined by combining these pieces of information.
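 One plausible reading of this comparison is sketched below, under the assumption that each predetermined orientation is represented by a single reference feature vector and that Euclidean distance is the comparison measure; neither assumption is mandated by the description.

```python
import numpy as np

# Hypothetical sketch: `reference_vectors` maps each predetermined orientation
# label (e.g. "front", "right-front", ...) to a representative feature vector.
def detect_orientation(candidate_vector, reference_vectors):
    best_label, best_dist = None, float("inf")
    for label, ref in reference_vectors.items():
        dist = np.linalg.norm(candidate_vector - ref)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label  # the orientation whose reference vector is closest
```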
 The orientation detection unit 13 may use an orientation detection model that, given a person image as input, outputs the orientation of the person in that image. Such an orientation detection model is a trained model constructed by having a model learn the relationship between training images and ground-truth information. A known skeleton detector, posture detector, or face orientation detector may also be applied to the orientation detection unit 13.
 The orientation of a person detected in this way can be classified, for example, into eight directions as seen from the person in the candidate image: forward, diagonally forward right, right, diagonally backward right, backward, diagonally backward left, left, and diagonally forward left. The orientation of a person is not limited to these eight directions and may instead be classified into fewer than eight or into nine or more directions.
 The similarity determination unit 14 determines, using a predetermined similarity calculation algorithm, a similarity indicating the degree to which each candidate image is similar to the query image (S5). For example, the similarity determination unit 14 calculates the similarity based on a comparison between the feature vector of each candidate image and the feature vector of the query image. For example, the predetermined similarity calculation algorithm calculates the similarity such that the similarity becomes larger as the distance, such as the Euclidean distance or the Mahalanobis distance, or the inner product between the feature vector of each candidate image and the feature vector of the query image becomes smaller. The predetermined similarity calculation algorithm may also be an algorithm that calculates the distance between feature vectors by applying a model constructed by metric learning. A larger similarity value means, for example, a higher degree of matching between the candidate image and the query image.
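 As a minimal sketch of such an algorithm, assuming the Euclidean-distance variant and an illustrative (not disclosed) mapping from distance to similarity:

```python
import numpy as np

# Sketch of step S5: a smaller distance between feature vectors yields a
# larger similarity. The 1 / (1 + d) mapping is an assumption chosen only to
# make the similarity decrease monotonically with the distance d.
def similarity(candidate_vector, query_vector):
    d = np.linalg.norm(candidate_vector - query_vector)
    return 1.0 / (1.0 + d)
```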
 FIG. 4 is a schematic diagram illustrating the distance between the feature vector of a candidate image and the feature vector of the query image. FIG. 4 shows an n-dimensional feature vector space (n: an integer of 1 or more). In the example shown in FIG. 4, the distance d(q, x) between the feature vector x = (x1, x2, …, xn) of a candidate image X and the feature vector q = (q1, q2, …, qn) of the query image is smaller than the distance d(q, y) between the feature vector y = (y1, y2, …, yn) of a candidate image Y and the feature vector q of the query image. In this case, the similarity determination unit 14 determines the similarity of each candidate image by the predetermined similarity calculation algorithm such that the similarity of the candidate image X is larger than the similarity of the candidate image Y.
 FIG. 5 is a schematic diagram for explaining an example of step S5 of FIG. 3, in which the similarity is determined. In step S5, the similarity determination unit 14 may use a feature extraction model 21 that, given an image as input, outputs the feature vector of that image. The similarity determination unit 14 calculates the similarity based on a comparison between the feature vector of each candidate image output by the feature extraction model 21 and the feature vector of the query image. Such a feature extraction model 21 is a trained model constructed by having a model learn the relationship between training images and ground-truth information. The feature extraction model 21 may be a model having a neural network structure, for example, a convolutional neural network (CNN). When the feature extraction model 21 is constructed by training a model such as a CNN, the fully connected layer at the final stage of the model may be removed so that the output of a convolutional layer or a pooling layer is used as the feature.
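 A minimal sketch of such a feature extractor follows, assuming a pretrained torchvision ResNet-50 as a stand-in for the feature extraction model 21; the description does not name a particular network, and the library, weights, and input size here are assumptions.

```python
import torch
import torchvision.models as models

# Illustrative assumption: a ResNet-50 backbone whose final fully connected
# layer is removed, so the pooled convolutional output serves as the feature
# vector (requires a recent torchvision providing the Weights API).
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-1])
feature_extractor.eval()

with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)           # a preprocessed input image
    vector = feature_extractor(image).flatten(1)  # feature vector, shape (1, 2048)
```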
 The classification unit 15 classifies the plurality of candidate images into a plurality of groups according to the orientation of the person determined in step S4 (S6). For example, when it is determined in step S4 that the orientation of the person in a candidate image belongs to "forward," the classification unit 15 classifies that candidate image into a first group corresponding to "forward" in step S6 (see FIG. 1). Alternatively, the classification unit 15 may, for example, classify the orientation of the person in each candidate image detected in step S4 based on its orientation relative to the orientation of the person in the query image. For example, the classification unit 15 classifies the plurality of candidate images into a group whose orientation is the same as that of the person in the query image and a group whose orientation is different.
 The ranking unit 16 assigns ranks, for each group, to the one or more candidate images classified into that group in step S6, in descending order of the similarity determined in step S5 (S7). For example, the ranking unit 16 calculates an attribute ranking regarding the orientation of the person, indicating how similar the orientation of the person in the candidate images is to the orientation of the person in the query image. For example, when the person in the query image faces forward (a first orientation) as shown in FIG. 1, images of a person facing forward (the first orientation) and facing backward (a second orientation) are systematically similar. Accordingly, the attribute ranking regarding the orientation of the person is set in the order of forward (the first orientation), backward (the second orientation), and rightward (a third orientation). Then, when the plurality of candidate images are classified into a forward-facing (first orientation) group, a backward-facing (second orientation) group, and a right-facing (third orientation) group as shown in FIG. 1, the ranking unit 16 assigns ranks starting from first place to the candidate images classified into the forward-facing group, and likewise assigns ranks starting from first place to the candidate images classified into the backward-facing group and to those classified into the right-facing group.
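 Steps S6 and S7 together can be sketched as follows; the tuple layout of the candidate records is an assumption for illustration:

```python
from collections import defaultdict

# Sketch of steps S6-S7: classify candidates into groups by detected
# orientation, then rank within each group in descending order of similarity.
def group_and_rank(candidates):
    # candidates: iterable of (image, orientation_label, similarity) tuples
    groups = defaultdict(list)
    for image, orientation, sim in candidates:
        groups[orientation].append((image, sim))
    ranked = {}
    for orientation, members in groups.items():
        members.sort(key=lambda m: m[1], reverse=True)
        # rank 1 is the most similar candidate within the group
        ranked[orientation] = [(rank + 1, image, sim)
                               for rank, (image, sim) in enumerate(members)]
    return ranked
```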
 The above attribute ranking regarding the orientation of the person is calculated, for example, based on the degree of similarity of the orientation of the persons included in the candidate images classified into each group (for example, forward, backward, rightward, and so on) to the orientation of the person included in the query image (hereinafter referred to as the "orientation similarity"). The orientation similarity is an example of the "attribute similarity" of the present disclosure. The orientation similarity is calculated, for example, by the similarity determination unit 14. For example, the similarity determination unit 14 calculates the orientation similarity based on a comparison between the outline of the person included in the candidate images classified into each group and the outline of the person included in the query image. The orientation similarity may instead be predetermined according to the orientation of the person in the candidate image relative to the orientation of the person in the query image. For example, when the person included in the query image faces forward, the orientation similarity may be set so that it takes its largest value for the forward-facing group, followed by the backward-facing group and then the right-facing group.
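 Where the orientation similarity is predetermined by relative orientation, the attribute ranking could be derived from a lookup table such as the following; the labels and values are illustrative assumptions matching the forward-facing example above:

```python
# Assumed orientation-similarity table for a forward-facing query: the
# forward group is most similar, then the backward group, then the right
# group, consistent with the example in the description.
ORIENTATION_SIMILARITY = {
    ("front", "front"): 1.0,
    ("front", "back"): 0.6,
    ("front", "right"): 0.3,
}

def attribute_ranking(query_orientation, group_labels):
    # Order the groups by descending orientation similarity to the query.
    return sorted(group_labels,
                  key=lambda g: ORIENTATION_SIMILARITY.get(
                      (query_orientation, g), 0.0),
                  reverse=True)
```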
 The control unit 1 may output to the display device 70, for two or more groups among the plurality of groups, information in which the one or more candidate images classified into each group in step S6 are associated with their ranks in that group. The candidate images are then displayed on the display device 70 in the order of their ranks in each group (S8). For example, the control unit 1 outputs, via the output interface 4 to the display device 70, the plurality of candidate images, the group to which each candidate image determined in step S6 belongs, and information indicating the rank assigned in step S7 within each group. For example, as shown in FIG. 1, the control unit 1 causes the display device 70 to display two or more groups among the plurality of groups arranged in a vertical direction (a first direction) in the order of the attribute ranking, and to display the one or more candidate images classified into each group for each object attribute arranged in a horizontal direction (a second direction) in the order of their ranks in that group. Note that the similarity rankings for a plurality of object attributes need not be displayed simultaneously. For example, the similarity ranking for the object attribute ranked first in the attribute ranking may be displayed first, and the similarity rankings for the object attributes ranked second and lower may be displayed by switching screens. The object attributes may also be displayed in a predetermined order regardless of the attribute ranking.
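 The two-axis arrangement of step S8 might be prepared as follows, under the assumed data structures of the earlier sketches (one row per group in attribute-rank order, candidates within a row in similarity-rank order):

```python
# Sketch of the display layout: rows follow the attribute ranking (first
# direction); within a row, candidates follow their in-group rank (second
# direction). `ranked_groups` is the output of group_and_rank above.
def layout_rows(ranked_groups, ordered_labels):
    rows = []
    for label in ordered_labels:          # e.g. ["front", "back", "right"]
        row = [image for _rank, image, _sim in ranked_groups.get(label, [])]
        rows.append((label, row))
    return rows  # rows are handed to the display device in this order
```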
 In step S8, the control unit 1 may output to the display device 70 the candidate images from first place down to a predetermined rank in each group, each associated with its rank. For example, the control unit 1 selects the first- through tenth-ranked candidate images in each group and causes the display device 70 to display them together with information on the group to which each candidate image belongs and its rank.
 Alternatively, in step S8, the control unit 1 may output to the display device 70, from among the one or more candidate images classified into each group, those candidate images whose similarity is equal to or greater than a predetermined threshold, for two or more groups among the plurality of groups. For example, the control unit 1 selects the candidate images whose similarity is 0.7 or more in each group and causes the display device 70 to display them together with information on the group to which each candidate image belongs and its rank. This reduces the amount of data output to the display device 70 and thus reduces the processing load. It also reduces the amount of information processing in the display device 70.
 The predetermined threshold may be set for each group. For example, when the person in the query image faces forward, the control unit 1 selects candidate images whose similarity is 0.8 or more from the forward-facing group and candidate images whose similarity is 0.5 or more from the backward-facing group, and causes the display device 70 to display them.
 Setting a higher threshold for the group into which candidate images with the same orientation as the person in the query image are classified, compared with the other groups, has the following advantage. A candidate image showing a candidate facing the same direction as the person in the query image tends to be assigned a higher similarity than a candidate image showing the same person as in the query image but facing a different direction, even when that candidate is a different person from the one in the query image. Therefore, a candidate image belonging to the group of candidate images with the same orientation as the person in the query image may, even with a high similarity, actually be less likely to be an image matching the query image than a candidate image belonging to another group with a comparably high similarity. The control unit 1 therefore sets a higher threshold for the group of candidate images with the same orientation as the person in the query image than for the other groups, thereby reflecting the substantial degree of matching with the query image, and can select and display on the display device 70 only the candidate images whose substantial degree of matching exceeds a predetermined reference value.
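 A sketch of this per-group thresholding, with illustrative threshold values taken from the numerical example above:

```python
# Sketch: a stricter threshold for the group sharing the query's orientation
# (0.8 in the example above) and a more permissive one elsewhere (0.5),
# since same-orientation candidates tend to score high even for different
# people. `ranked_groups` follows the layout of the earlier sketches.
def filter_by_threshold(ranked_groups, query_orientation,
                        same_orientation_threshold=0.8, default_threshold=0.5):
    filtered = {}
    for label, members in ranked_groups.items():
        t = (same_orientation_threshold if label == query_orientation
             else default_threshold)
        filtered[label] = [(r, img, s) for (r, img, s) in members if s >= t]
    return filtered
```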
3. Effects, etc.
 As described above, the image matching device 100, which matches each of a plurality of candidate images with a query image, includes the orientation detection unit 13, which is an example of the attribute determination unit, the similarity determination unit 14, the classification unit 15, the ranking unit 16, and the output interface 4. The orientation detection unit 13 determines the orientation of the person included in each candidate image. The similarity determination unit 14 determines, using a predetermined similarity calculation algorithm, a similarity indicating the degree to which each candidate image is similar to the query image. The classification unit 15 classifies the plurality of candidate images into a plurality of groups according to the orientation determined by the orientation detection unit 13. The ranking unit 16 assigns ranks, for each group, to the one or more candidate images classified into that group, in descending order of similarity. The output interface 4 outputs, under the control of the control unit 1, information about the one or more candidate images classified into each group, for two or more groups among the plurality of groups.
 With this configuration, the information processing device at the output destination, or a user viewing the displayed output information, can grasp the information indicating the matching results more easily than with the conventional technique, and the image matching device 100 can solve the problem of the information indicating the matching results being buried in other information.
 Under the control of the control unit 1, the output interface 4 may output the candidate images from first place down to a predetermined rank in each group, each associated with its rank.
 With this configuration, the amount of data output by the image matching device 100 can be reduced, which reduces the processing load of the image matching device 100, and the amount of information processing in the device at the output destination, such as an information processing terminal or a display device, can also be reduced. Furthermore, when a user views the displayed output information, limiting the output information to a limited number of candidates, from first place down to a predetermined rank in each group, improves, for example, the at-a-glance readability of the output information, making it easier for the user to grasp the displayed output information.
 Under the control of the control unit 1, the output interface 4 may output, from among the one or more candidate images classified into each group, those candidate images whose similarity is equal to or greater than a predetermined threshold, for two or more groups among the plurality of groups.
 This configuration likewise reduces the amount of data output by the image matching device 100, thereby reducing its processing load, and reduces the amount of information processing in the device at the output destination, such as an information processing terminal or a display device. Furthermore, when a user views the displayed output information, limiting the output information to candidates whose similarity is equal to or greater than the predetermined threshold improves, for example, the at-a-glance readability of the output information, making it easier for the user to grasp the displayed output information.
 Under the control of the control unit 1, the output interface 4 may output the information to the display device 70 and cause the display device 70 to display the one or more candidate images classified into each group, for each of two or more groups among the plurality of groups, in the order of their ranks in that group.
 With this configuration, by looking at the display device 70, the user can grasp the information indicating the matching results more easily than with the conventional technique.
 The orientation detection unit 13 may further determine the orientation of the subject included in the query image. In this case, the similarity determination unit 14 further determines an orientation similarity indicating the degree to which the orientation of the person corresponding to each of the plurality of groups classified by the classification unit 15 is similar to the orientation of the subject included in the query image. The orientation similarity is an example of the "attribute similarity" of the present disclosure. The ranking unit 16 further assigns ranks (orientation ranks) to the plurality of groups in descending order of orientation similarity. The output interface 4 causes the display device 70 to display the plurality of groups arranged in the vertical direction (the first direction) in the order of the orientation ranks, and to display the one or more candidate images classified into each group arranged in the horizontal direction (the second direction) in the order of their similarity ranks in that group.
 With this configuration, on the display device 70, the plurality of groups are arranged in the order of their orientation ranks, and within each group the candidate images belonging to that group are arranged horizontally in the order of their similarity ranks, so the user can grasp the information indicating the matching results even more easily.
(Other Embodiments)
 As described above, an embodiment has been described as an example of the technique disclosed in the present application. However, the technique of the present disclosure is not limited to this embodiment and is also applicable to embodiments in which modifications, replacements, additions, omissions, and the like are made as appropriate. Other embodiments are exemplified below.
 In the embodiment described above, a person was described as an example of the "matching object" of the present disclosure, the "orientation of a person," which is an attribute of a person, was described as an example of the "object attribute," and the orientation detection unit 13 was described as an example of the "attribute determination unit." However, the present disclosure is not limited to these. For example, the matching object is not limited to a person and may be an object other than a person, such as a vehicle, a building, or a robot. The object attribute may likewise be an attribute of an object other than a person, for example, the color, material, or shape of the object.
 When the object attribute is an attribute of a person, it is not limited to the orientation of the person and may be, for example, an attribute such as the person's height, body shape, age, age group, or gender. A person's height and body shape can easily be estimated using image recognition technology. Age, age group, and gender can also be estimated using image recognition technology. For example, gender can be estimated from the type of clothing, the person's hairstyle, body shape, and the like in an image, and age or age group can be estimated from the degree of facial wrinkles and the color of the hair. The object attribute may also be an attribute indicating whether the person is wearing clothing of a specific type, such as a suit.
 The object attribute may be an attribute indicating whether the person is holding a bag, carrying a backpack, pulling a suitcase, talking on a phone, and so on. The presence or absence of a belonging such as a bag as an object attribute includes information indicating not only whether the person actually has the belonging but also whether the belonging is visible in the image. Such a belonging may be visible in an image captured at one time but not in an image captured at another time, for example, because it is hidden by the owner's body or because the owner has left it somewhere. Since the presence or absence of belongings across a plurality of images can also affect the similarity, having the object attribute include the presence or absence of belongings makes it easier for the information processing device at the output destination, or for a user viewing the displayed output information, to grasp the information indicating the matching results than with the conventional technique, and the image matching device 100 can solve the problem of the information indicating the matching results being buried in other information.
 Furthermore, the object attribute may be an attribute indicating whether the person is riding a vehicle such as a bicycle or a motorcycle, whether the person is walking, running, or standing still, whether the person is standing or sitting, and so on. The attribute determination unit may detect the posture of the person in each candidate image and estimate attributes such as the above based on the detected posture.
 In the embodiment described above, the display device 70 was exemplified as the output destination of information from the image matching device 100. However, the output destination of information is not limited to this, and the image matching device 100 may output information to an information processing terminal such as a smartphone, a tablet, or a notebook computer via, for example, the network 60. Alternatively, the image matching device 100 may execute a coarse matching process in a first pass of steps S1 to S7 of FIG. 3, and then, using the results of the coarse matching process, execute steps S4 to S8 as a second-pass matching process that is more precise than the first pass. Such a means reduces the processing load of the second, precise matching pass and improves the processing speed by excluding from that pass the candidate images whose similarity, as a result of the coarse matching process, is lower than a predetermined threshold.
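 The coarse-to-fine variant could be sketched as follows; the scoring functions and the coarse threshold are assumptions standing in for the first-pass and second-pass matching processes:

```python
# Sketch of the two-pass variant: candidates that fall below the coarse
# threshold in the first pass are excluded from the more expensive second
# pass, reducing its processing load.
def two_pass_matching(candidates, query, coarse_score, fine_score,
                      coarse_threshold=0.3):
    survivors = [c for c in candidates
                 if coarse_score(c, query) >= coarse_threshold]
    return sorted(survivors,
                  key=lambda c: fine_score(c, query), reverse=True)
```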
 As described above, an embodiment has been described as an example of the technique of the present disclosure. The accompanying drawings and the detailed description have been provided for that purpose.
 Therefore, the components described in the accompanying drawings and the detailed description may include not only components essential for solving the problem but also components that are not essential for solving the problem, in order to illustrate the above technique. For that reason, it should not be immediately concluded that those non-essential components are essential merely because they are described in the accompanying drawings and the detailed description.
 Moreover, since the embodiment described above is intended to illustrate the technique of the present disclosure, various modifications, replacements, additions, omissions, and the like can be made within the scope of the claims or their equivalents.
 The present disclosure is applicable to image search technology and image matching technology.
 1 control unit
 2 storage device
 3 image acquisition unit
 4 output interface
 5 input interface
 11 person detection unit
 12 query determination unit
 13 orientation detection unit
 14 similarity determination unit
 15 classification unit
 16 ranking unit
 21 feature extraction model
 22 image list
 50 image data
 60 network
 70 display device
 80 input device
 100 image matching device

Claims (9)

  1.  An image matching device that matches each of a plurality of candidate images with a query image, the image matching device comprising:
     an attribute determination unit that determines an object attribute of a matching object included in each candidate image;
     a similarity determination unit that determines, using a predetermined similarity calculation algorithm, a similarity indicating a degree to which each candidate image is similar to the query image;
     a classification unit that classifies the plurality of candidate images into a plurality of groups according to the object attribute determined by the attribute determination unit;
     a ranking unit that assigns ranks, for each group, to one or more candidate images classified into the group, in descending order of the similarity; and
     an output unit that outputs, under control of a control unit, information about the one or more candidate images classified into each group, for two or more groups among the plurality of groups.
  2.  The image matching device according to claim 1, wherein the control unit causes the output unit to output the candidate images from first place down to a predetermined rank in each group, each associated with its rank.
  3.  The image matching device according to claim 1 or 2, wherein the control unit causes the output unit to output, from among the one or more candidate images classified into each group, candidate images whose similarity is equal to or greater than a predetermined threshold.
  4.  The image matching device according to any one of claims 1 to 3, wherein the control unit causes the output unit to output the information to a display device and causes the display device to display the one or more candidate images classified into each group in the order of their ranks in the group.
  5.  The image matching device according to claim 4, wherein
     the attribute determination unit further determines a query attribute indicating an object attribute of a subject included in the query image,
     the similarity determination unit further determines an attribute similarity indicating a degree to which the object attribute corresponding to each of the plurality of groups classified by the classification unit is similar to the query attribute,
     the ranking unit further assigns attribute ranks to the plurality of groups in descending order of the attribute similarity, and
     the control unit causes the display device to display the plurality of groups arranged in a first direction in the order of the attribute ranks, and to display the one or more candidate images classified into each group arranged, in the order of their ranks in the group, in a second direction different from the first direction.
  6.  The image matching device according to any one of claims 1 to 5, wherein
     each of the plurality of candidate images is an image showing one person, and
     the attribute determination unit determines, as the object attribute, an orientation of the face and/or body of the person shown in each candidate image.
  7.  The image matching device according to any one of claims 1 to 6, wherein the similarity calculation algorithm is an algorithm that calculates the similarity based on a comparison between a feature vector of each candidate image and a feature vector of the query image.
  8.  An image matching method for matching each of a plurality of candidate images with a query image, the image matching method comprising:
     an attribute determination step of determining an object attribute of a matching object included in each candidate image;
     a step of determining, using a predetermined similarity calculation algorithm, a similarity indicating a degree to which each candidate image is similar to the query image;
     a step of classifying the plurality of candidate images into a plurality of groups according to the object attribute determined in the attribute determination step;
     a step of assigning ranks, for each group, to one or more candidate images classified into the group, in descending order of the similarity; and
     a step of outputting information about the one or more candidate images classified into each group, for two or more groups among the plurality of groups.
  9.  A program for causing a control unit to execute the image matching method according to claim 8.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021114129 2021-07-09
JP2021-114129 2021-07-09

Publications (1)

Publication Number Publication Date
WO2023281903A1 true WO2023281903A1 (en) 2023-01-12

Family

ID=84801510

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/018752 WO2023281903A1 (en) 2021-07-09 2022-04-25 Image matching device, image matching method, and program

Country Status (1)

Country Link
WO (1) WO2023281903A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11316846A (en) * 1998-04-22 1999-11-16 Nec Corp Image inquring method utilizing area information and edge information of image, and image inquiring device
JP2014182480A (en) * 2013-03-18 2014-09-29 Toshiba Corp Person recognition device and method
WO2015001791A1 (en) * 2013-07-03 2015-01-08 パナソニックIpマネジメント株式会社 Object recognition device objection recognition method
JP2016103084A (en) * 2014-11-27 2016-06-02 株式会社 日立産業制御ソリューションズ Image search apparatus and image search system
JP2016157165A (en) * 2015-02-23 2016-09-01 三菱電機マイコン機器ソフトウエア株式会社 Person identification system
JP2017054493A (en) * 2015-09-11 2017-03-16 キヤノン株式会社 Information processor and control method and program thereof
JP2018054472A (en) * 2016-09-29 2018-04-05 ケーディーアイコンズ株式会社 Information processing device and program
WO2019103912A2 (en) * 2017-11-22 2019-05-31 Arterys Inc. Content based image retrieval for lesion analysis

Legal Events

Code — Title/Description
121 — Ep: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22837324; Country of ref document: EP; Kind code of ref document: A1)
NENP — Non-entry into the national phase (Ref country code: DE)