WO2023042299A1

WO2023042299A1 - Control method, control program, and information processing device

Info

Publication number: WO2023042299A1
Application number: PCT/JP2021/033928
Authority: WO
Inventors: 秀継内田
Original assignee: 富士通株式会社
Priority date: 2021-09-15
Filing date: 2021-09-15
Publication date: 2023-03-23

Abstract

With this control method, a computer executes a process of: extracting a feature quantity from a photographed face image captured by a camera, using an extraction model for extracting a feature quantity from an image; generating an estimated face image from the extracted feature quantity, using a generation model for generating an image from a feature quantity; storing difference information between the photographed face image captured by the camera and the generated estimated face image in a storage unit; when the extraction model is changed to a new extraction model for extracting a feature quantity from an image, newly generating an estimated face image from the extracted feature quantity using the generation model, or newly extracting a feature quantity from another photographed face image captured by the camera using the extraction model and newly generating an estimated face image from the newly extracted feature quantity using the generation model; and generating, on the basis of the newly generated estimated face image and the difference information stored in the storage unit, a feature quantity that is to be verified, during authentication, against a feature quantity extracted from a photographed face image captured by the camera.

Description

Control method, control program and information processing device

The present invention relates to a control method, a control program, and an information processing device.

In face authentication, authentication is performed by matching the feature amount extracted from the user's face image with a pre-registered template using a model for extracting the feature amount (hereinafter referred to as "extraction model").

JP 2014-164697 A; JP 2020-126336 A

However, when the above extraction model is changed, even if the same face image is input, the output feature values are different between the old and new extraction models. Therefore, the face image of the user is required again, but from the viewpoint of privacy, it is difficult to store the face image taken at the time of registration. Therefore, it is difficult to update the template unless the user's facial image is retaken.

An object of one aspect is to provide a control method, a control program, and an information processing apparatus that can eliminate the need to re-capture a face image when updating a template.

In a control method according to one aspect, a computer executes a process of: extracting a feature amount from a photographed face image captured by a camera, using an extraction model for extracting a feature amount from an image; generating an estimated face image from the extracted feature amount, using a generation model for generating an image from a feature amount; storing difference information between the photographed face image captured by the camera and the generated estimated face image in a storage unit; when the extraction model is changed to a new extraction model for extracting a feature amount from an image, newly generating an estimated face image from the extracted feature amount using the generation model, or newly extracting a feature amount from another photographed face image captured by the camera using the extraction model and newly generating an estimated face image from the newly extracted feature amount using the generation model; and generating, based on the newly generated estimated face image and the difference information stored in the storage unit, a feature amount to be matched against a feature amount extracted from a photographed face image captured by the camera at the time of authentication.

According to one embodiment, it is possible to eliminate the need to recapture a face image when updating a template.

FIG. 1 is a block diagram showing a functional configuration example of a face authentication system. FIG. 2 is a schematic diagram showing an example of calculation of difference information. FIG. 3 is a schematic diagram showing an example of template search. FIG. 4 is a flowchart showing the procedure of difference information calculation processing. FIG. 5 is a flowchart showing the procedure of template search processing. FIG. 6 is a diagram illustrating a hardware configuration example.

Hereinafter, embodiments of the control method, control program, and information processing apparatus according to the present application will be described with reference to the accompanying drawings. Each embodiment merely shows one example or one aspect, and such examples do not limit the numerical values, the range of functions, the usage scene, and the like. Further, each embodiment can be appropriately combined within a range that does not contradict the processing contents.

<System configuration>
FIG. 1 is a block diagram showing a functional configuration example of a face authentication system. A face authentication system 1 shown in FIG. 1 provides a face authentication service in which a server device 10 performs face authentication based on face images captured by client terminals 30A to 30N.

The following are examples of usage scenarios for such face authentication services. For example, face authentication services can be used for personal authentication when paying for products purchased at stores without cash registers, unmanned checkouts, and self-checkouts. In addition, the face authentication service can be used for login authentication to any computer, network, or service, or for personal authentication at doors and gates in an entrance/exit management system.

As shown in FIG. 1, the face authentication system 1 may include a server device 10 and client terminals 30A to 30N. Hereinafter, the client terminals 30A to 30N may be referred to as "client terminals 30" when individual client terminals 30A to 30N need not be distinguished.

The server device 10 and the client terminal 30 can be communicably connected via the network NW. For example, the network NW may be any type of communication network such as the Internet or a LAN (Local Area Network), regardless of whether it is wired or wireless.

The server device 10 corresponds to an example of a computer that provides the client terminal 30 with the face authentication service described above. For example, the server device 10 may correspond to an example of an information processing device. As one embodiment, the server device 10 can provide the face authentication service by causing an arbitrary computer to execute software for realizing the face authentication service, such as a face authentication program. As an example, the server device 10 can be implemented as a server that provides the face authentication service described above on-premises. As another example, the server device 10 can also provide the above face authentication service as a cloud service by implementing it as a SaaS (Software as a Service) type application.

The client terminal 30 corresponds to an example of a computer functioning as a client that receives the face authentication service described above. As one aspect, the client terminal 30 can be used by a customer subscribing to the face authentication service, or by an end user who uses the equipment, software, or information system controlled by the face authentication service. The label "client terminal" is only a classification from one functional aspect. The type of device and its hardware configuration are not limited to a specific one and may be whatever suits the usage scene, such as self-checkout, login authentication, or room access management. For example, the client terminal 30 may be realized in a form in which the functions of the client are incorporated into the equipment or information system itself controlled by the face authentication service, or may be realized as a computer independent of these.

<Distribution of facial recognition service functions>
The above face authentication service can include an extraction function that extracts a feature amount from the user's face image using the above extraction model, and a matching function that matches the feature amount extracted from the user's face image against pre-registered templates.

When such a face authentication service is applied to the face authentication system 1 shown in FIG. 1, its functions can be distributed between the server device 10 and the client terminal 30. By way of example only, the above extraction function may be installed in the client terminal 30, while the above matching function may be installed in the server device 10.

By distributing the functions included in the above-described face authentication service in this way, the feature quantity is transmitted between the server device 10 and the client terminal 30 instead of the face image, thereby realizing privacy protection.

<One aspect of the challenge>
In the face authentication service described above, authentication is performed by matching the feature amount extracted from the user's face image using the extraction model described above with a pre-registered template.

For example, the above extraction model may be changed to improve the accuracy of the face recognition service or increase the number of templates that the face recognition service can support. When the extraction model is changed in this way, even if the same face image is input, the output feature amounts are different between the old and new extraction models.

For this reason, when the extraction model is changed, the face of each template registrant needs to be re-captured, and the feature amount that the new extraction model generates from the re-captured face image needs to be re-registered as a new template.

However, when the extraction model is replaced, it is difficult to retake the face image of every user of the face authentication service. Also, from the viewpoint of privacy, it is difficult to store the face images taken at the time of registration.

<One aspect of the problem-solving approach>
Therefore, in the present embodiment, a face image similar to the face image at the time of registration is reproduced from data that is smaller in size than a face image and from which the individual is difficult to identify. An update function is then implemented that generates, from the reproduced face image, a feature amount corresponding to the changed extraction model and updates the template.

Such an update function may be realized by being packaged as one function of the above-mentioned face authentication service or, by way of example only, as a library or framework called by the above-mentioned face authentication service.

For example, when a template is registered, a generation model is created that performs the inverse of the transformation performed by the above extraction model, that is, a model that generates an estimated face image from a feature amount. In addition, difference information is stored between the input face image input at the time of matching and the estimated input face image generated from the feature amount of that input face image by inverse transformation using the generation model. This difference information corresponds to what remains after the information represented by the feature amount is removed from the input face image; in other words, it corresponds to the information that is lost through dimensional compression when the feature amount is extracted. Because most of the individuality has been removed from it, the difference information is privacy-protected. When the extraction model is changed, a reproduced registered face image is formed by combining this difference information with the estimated registered face image generated from the old template using the generation model corresponding to the extraction model before the change. A new template is then determined by searching, with the generation model corresponding to the extraction model after the change, for a feature amount from which the reproduced registered face image can be generated. Because the search is performed in the template space, in which the information about the face is limited, the influence of errors contained in the difference information and the template can be reduced, and a template with little distortion can be obtained. As a result, a new template can be generated by using information obtained at the time of matching before the extraction model was changed.

Therefore, according to the update function according to the present embodiment, it is possible to eliminate the need to re-capture the face image when updating the template.

<Configuration of Client Terminal 30>
In FIG. 1, blocks corresponding to the functions included in the face authentication service in which the template update function is packaged are schematically shown for each of the server device 10 and the client terminal 30. Although FIG. 1 shows all the functions included in the above-described face authentication service, the server device 10 and the client terminal 30 may be provided with only the functional units corresponding to the above-described template update function.

As shown in FIG. 1, the client terminal 30 has a face image acquisition unit 31, a feature amount extraction unit 33, an authentication result reception unit 35, a difference information calculation unit 37, and a difference information storage unit 39.

Functional units such as the face image acquisition unit 31, the feature amount extraction unit 33, the authentication result reception unit 35, and the difference information calculation unit 37 shown in FIG. 1 are virtually implemented by a hardware processor. For example, hardware processors include CPU (Central Processing Unit), MPU (Micro Processing Unit), GPU (Graphics Processing Unit), and GPGPU (General-Purpose computing on GPU). The processor reads an OS (Operating System) from a storage device (not shown), such as a HDD (Hard Disk Drive), an optical disk, or an SSD (Solid State Drive), as well as programs such as a face authentication program for clients. Then, the processor develops processes corresponding to the above functional units on a memory such as RAM (Random Access Memory) by executing the above face authentication program. In this way, as a result of executing the face authentication program, the functional units are virtually implemented as processes. Here, a CPU and an MPU are illustrated as examples of processors, but the above functional units may be implemented by any processor, regardless of whether it is a general-purpose type or a specialized type. In addition, the above functional units or part of the functional units may be realized by hardwired logic such as ASIC (Application Specific Integrated Circuit) or FPGA (Field Programmable Gate Array).

A storage unit such as the difference information storage unit 39 can be realized as an auxiliary storage device such as an HDD, an optical disk, or an SSD, or by allocating a part of the storage area of the auxiliary storage device.

The face image acquisition unit 31 is a functional unit that acquires face images. As one embodiment, the face image acquisition unit 31 can be realized by an imaging device that captures a face image, a so-called camera. As one aspect, when a template is registered, the face image acquisition unit 31 can acquire a face image in which the face of a registrant, who is registered as an authorized user of the face authentication service, is photographed. As another aspect, at the time of matching, the face image acquisition unit 31 can acquire a face image in which the face of the user of the client terminal 30 is photographed. To distinguish between these two, the face image of the registrant may be referred to as a "registered face image", and the face image input at the time of matching may be referred to as an "input face image". Although an example in which the face image acquisition unit 31 acquires both the registered face image and the input face image is given here, the registered face image or the input face image may instead be acquired by another face image acquisition unit having the same function as the face image acquisition unit 31.

The feature amount extraction unit 33 is a functional unit that extracts feature amounts from face images using an extraction model. For the above extraction model, a model whose embedding space has been learned by deep learning or the like, for example, a CNN (Convolutional Neural Network), can be used. As one aspect, when a template is registered, the feature amount extraction unit 33 inputs the registered face image acquired by the face image acquisition unit 31 to the CNN whose embedding space has been learned, and can thereby extract the embedding vector output by the CNN as a registered feature amount, that is, as a template. The registered feature amount extracted in this way is encrypted according to an arbitrary encryption method, for example, an algorithm such as public key cryptography, and then transmitted to the server device 10. As another aspect, at the time of matching, the feature amount extraction unit 33 inputs the input face image acquired by the face image acquisition unit 31 to the CNN whose embedding space has been learned, and can thereby extract the embedding vector output by the CNN as an input feature amount. Then, the feature amount extraction unit 33 transmits a request for matching the input feature amount against the template to the server device 10. By way of example only, the input feature amount and the registered feature amount may each be a bit string in which binary values of 0 or 1 are arranged, that is, so-called binary data. Note that the embedding vector described above is merely an example of the feature amount, and other types of feature amount such as SIFT (Scale-Invariant Feature Transform) may be extracted.
Although an example in which the feature amount extraction unit 33 extracts both the registered feature amount and the input feature amount is given here, the registered feature amount or the input feature amount may instead be extracted by another feature amount extraction unit having the same function as the feature amount extraction unit 33.

The authentication result reception unit 35 is a functional unit that receives the authentication result from the server device 10. As one aspect, the authentication result reception unit 35 receives an authentication result such as authentication success (authentication OK) or authentication failure (authentication NG) as a response to a matching request at the time of matching. Such an authentication result is output to the equipment, software, or information system controlled by the face authentication service, and is also used for calculating the difference information, as described later.

The difference information calculation unit 37 is a functional unit that calculates difference information. As one embodiment, when the authentication is successful, the difference information calculation unit 37 generates an estimated input face image from the input feature amount using a generation model that performs the inverse of the transformation performed by the extraction model used at the time of template registration, and calculates the difference information between the estimated input face image and the original input face image. The difference information calculated in this way is stored in the difference information storage unit 39 in association with the identification information of the successfully authenticated registrant.

Here, by way of example only, the CNN described in Non-Patent Document 1 can be used as the above generative model. For example, the generative model can be created at the time of template registration. As an example, the generative model is created by training that inputs the registered feature amount extracted from the original registered face image into the generative model and minimizes the difference between the original registered face image and the estimated registered face image output by the generative model.

For example, when the operation of extracting a feature amount F from a face image I using an extraction model E is expressed as F=E(I), the template before the change, that is, the registered feature amount Fa, can be expressed as Fa=Ea(Ie). Here, "Ea" refers to the extraction model before the change, and "Ie" refers to the registered face image. The generative model Da corresponding to this extraction model Ea can be created by minimizing |Ie−Da(Fa)|. Using this generative model Da, the difference information calculation unit 37 calculates the difference information Rv.
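As an illustration only, the creation of a generative model by minimizing |Ie−Da(Fa)| can be sketched as follows. The sketch does not use the CNN of Non-Patent Document 1. It assumes a toy linear decoder trained by stochastic gradient descent on a squared-error loss, and the names decode, train_decoder, and reconstruction_error, the learning rate, and the sample data are all hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical toy decoder: (weights, bias), one weight row per pixel.
Decoder = Tuple[List[List[float]], List[float]]


def decode(decoder: Decoder, feature: Sequence[int]) -> List[float]:
    """Toy stand-in for Da: estimated image = bias + weights * feature."""
    weights, bias = decoder
    return [b + sum(w * x for w, x in zip(row, feature))
            for row, b in zip(weights, bias)]


def train_decoder(features: Sequence[Sequence[int]],
                  images: Sequence[Sequence[float]],
                  epochs: int = 500, lr: float = 0.1) -> Decoder:
    """Fit the toy decoder so that decode(Fa) approaches Ie for each pair."""
    n_pixels, n_bits = len(images[0]), len(features[0])
    weights = [[0.0] * n_bits for _ in range(n_pixels)]
    bias = [0.0] * n_pixels
    for _ in range(epochs):
        for feature, image in zip(features, images):
            estimate = decode((weights, bias), feature)
            for p in range(n_pixels):
                err = estimate[p] - image[p]  # gradient of 0.5 * err**2
                bias[p] -= lr * err
                for j in range(n_bits):
                    weights[p][j] -= lr * err * feature[j]
    return weights, bias


def reconstruction_error(decoder: Decoder,
                         features: Sequence[Sequence[int]],
                         images: Sequence[Sequence[float]]) -> float:
    """Sum of |Ie - Da(Fa)| over all pixels and all samples."""
    return sum(abs(p - q)
               for feature, image in zip(features, images)
               for p, q in zip(image, decode(decoder, feature)))


# Assumed sample data: 2-bit registered features and 2-pixel registered images.
registered_features = [[1, 0], [0, 1], [1, 1], [0, 0]]
registered_images = [[0.75, 0.25], [0.25, 0.75], [0.75, 0.75], [0.25, 0.25]]

untrained: Decoder = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
error_before = reconstruction_error(untrained, registered_features, registered_images)
trained = train_decoder(registered_features, registered_images)
error_after = reconstruction_error(trained, registered_features, registered_images)
```

In this toy setting the reconstruction error falls as training proceeds; an actual generative model would be a far larger network trained on real face images.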

FIG. 2 is a schematic diagram showing an example of calculation of difference information. As shown in FIG. 2, when the face image acquisition unit 31 acquires the input face image Iv, the feature quantity extraction unit 33 inputs the input face image Iv to the extraction model Ea before change. Then, the extraction model Ea before change to which the input face image Iv is input outputs the input feature amount Fav=Ea(Iv). A matching request including the input feature value Fav is transmitted from the client terminal 30 to the server device 10 in order to perform matching between the input feature value Fav thus obtained and the template Fa before change.

When the authentication result reception unit 35 receives an authentication result of authentication OK as a response to this matching request, the difference information calculation unit 37 inputs the input feature amount Fav, for which the authentication result of authentication OK was obtained, to the generation model Da before the change. As a result, an estimated input face image Da(Fav) is obtained. Thereafter, the difference information calculation unit 37 calculates the difference information Rv by subtracting the estimated input face image Da(Fav) from the original input face image Iv, that is, Rv=Iv−Da(Fav). The difference information Rv thus obtained is stored in the difference information storage unit 39 in association with the identification information, such as an ID, of the registrant for whom the authentication result of authentication OK was obtained.
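The calculation Rv=Iv−Da(Fav) described above can be sketched as follows. This is a minimal sketch under stated assumptions: images are flattened lists of pixel values, and toy_extract_a and toy_generate_a are hypothetical stand-ins for the extraction model Ea and the generation model Da, not the CNNs of the embodiment.

```python
from typing import Callable, List, Sequence

Image = List[float]   # assumed representation: flattened grayscale pixels
Feature = List[int]   # assumed representation: binary feature vector


def toy_extract_a(image: Sequence[float]) -> Feature:
    """Hypothetical Ea: one bit per pair of pixels, set when the pair is bright."""
    return [1 if (image[i] + image[i + 1]) / 2 >= 0.5 else 0
            for i in range(0, len(image) - 1, 2)]


def toy_generate_a(feature: Sequence[int]) -> Image:
    """Hypothetical Da: each bit expands back into two identical pixels."""
    image: Image = []
    for bit in feature:
        value = 0.75 if bit else 0.25
        image.extend([value, value])
    return image


def difference_information(input_image: Sequence[float],
                           extract: Callable[[Sequence[float]], Feature],
                           generate: Callable[[Sequence[int]], Image]) -> Image:
    """Compute Rv = Iv - Da(Ea(Iv)) pixel by pixel."""
    input_feature = extract(input_image)        # Fav = Ea(Iv)
    estimated_image = generate(input_feature)   # Da(Fav)
    return [p - q for p, q in zip(input_image, estimated_image)]


iv = [0.875, 0.625, 0.125, 0.375]               # assumed input face image Iv
rv = difference_information(iv, toy_extract_a, toy_generate_a)

# Adding Rv back to the estimated image recovers the input image.
restored = [p + r for p, r in zip(toy_generate_a(toy_extract_a(iv)), rv)]
```

Only rv, together with the registrant's identification information, would be stored; the input image iv itself is discarded.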

Such difference information is the residual error that was removed when the extraction model Ea before the change extracted the feature amount Fav, so it is difficult to identify the person, and the data size is smaller than that of the face image.

The input face image Iv is used to calculate the difference information so that the difference information reflects changes over time since the template was registered. Note that the input face image Iv need not necessarily be used for calculating the difference information. For example, the difference information can be calculated using the registered face image Ie when the template is registered. Thereafter, once the time elapsed since the template registration is equal to or greater than a threshold, difference information can be calculated using the input face image Iv, and the difference information calculated using the registered face image Ie can be updated to the difference information calculated using the input face image Iv.

Also, although an example of using the estimated input face image Da(Fav) to calculate the difference information has been given here, the estimated input face image Da(Fav) need not necessarily be used. For example, the estimated registered face image Da(Fa) can also be used to calculate the difference information, and the image from which the estimated registered face image Da(Fa) is subtracted may be either the input face image Iv or the registered face image Ie.

<Configuration of server device 10>
Next, a functional configuration example of the server device 10 according to this embodiment will be described. As shown in FIG. 1, the server device 10 has a matching unit 11, a template storage unit 13, a change receiving unit 15, and a template searching unit 17.

Functional units such as the matching unit 11, the change receiving unit 15, and the template searching unit 17 shown in FIG. 1 are virtually implemented by a hardware processor. For example, hardware processors include CPUs, MPUs, GPUs, and GPGPUs. The processor reads the OS and programs such as a face authentication program for the server from a storage (not shown). Then, the processor develops processes corresponding to the above functional units on a memory such as a RAM by executing the above face authentication program. In this way, as a result of executing the face authentication program, the functional units are virtually implemented as processes. Here, a CPU and an MPU are illustrated as examples of processors, but the above functional units may be implemented by any processor, regardless of whether it is a general-purpose type or a specialized type. In addition, the above functional unit or part of the functional unit may be realized by hardwired logic.

A storage unit such as the template storage unit 13 can be implemented as an auxiliary storage device such as an HDD, an optical disk, or an SSD, or by allocating a part of the storage area of the auxiliary storage device.

The matching unit 11 is a functional unit that matches the input feature amount against the templates stored in the template storage unit 13. As one aspect, when receiving a matching request, the matching unit 11 calculates the degree of similarity between each template stored in the template storage unit 13, that is, each registered feature amount, and the input feature amount specified in the matching request. Then, if there is a template with a degree of similarity equal to or greater than a threshold, the matching unit 11 determines that the authentication is successful (authentication OK). In this case, the registrant can be identified by the identification information associated with the template whose degree of similarity is equal to or greater than the threshold. On the other hand, if there is no template with a degree of similarity equal to or greater than the threshold, the matching unit 11 determines authentication failure (authentication NG). After that, the matching unit 11 returns the authentication result of authentication success or authentication failure to the client terminal 30. When the authentication is successful, the identification information of the registrant can also be returned. Note that although an example in which the authentication method is one-to-N authentication is given here, the authentication method may be one-to-one authentication. In that case, matching can be executed by narrowing down to only the templates associated with the identification information transmitted from the client terminal 30 together with the input feature amount.
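One-to-N matching against a similarity threshold can be sketched as follows. The embodiment does not specify the similarity measure, so this sketch assumes binary feature amounts compared by the fraction of agreeing bits, and it returns the best-scoring registrant when several templates pass the threshold. The names similarity and match_one_to_n and the sample identifiers are hypothetical.

```python
from typing import Dict, Optional, Sequence


def similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Assumed measure: fraction of bit positions at which a and b agree."""
    if len(a) != len(b):
        raise ValueError("feature amounts must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def match_one_to_n(input_feature: Sequence[int],
                   templates: Dict[str, Sequence[int]],
                   threshold: float) -> Optional[str]:
    """Return the ID of the best template at or above the threshold, or None.

    A returned ID corresponds to authentication OK; None to authentication NG.
    """
    best_id: Optional[str] = None
    best_score = threshold
    for registrant_id, template in templates.items():
        score = similarity(input_feature, template)
        if score >= best_score:
            best_id, best_score = registrant_id, score
    return best_id


# Assumed sample templates keyed by registrant identification information.
templates = {
    "user-a": [1, 0, 1, 1, 0, 0, 1, 0],
    "user-b": [0, 1, 0, 0, 1, 1, 0, 1],
}
accepted = match_one_to_n([1, 0, 1, 1, 0, 0, 1, 1], templates, threshold=0.8)
rejected = match_one_to_n([1, 1, 0, 0, 0, 0, 1, 1], templates, threshold=0.8)
```

For one-to-one authentication, the same function could be called with a dictionary narrowed down to the single template associated with the transmitted identification information.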

The change receiving unit 15 is a functional unit that receives a change request for the extraction model. By way of example only, the change receiving unit 15 receives a change request together with the designation of the changed extraction model and the changed generation model. Such a change request can be received via any terminal used by a customer of the face authentication service, such as the client terminal 30 or another computer.

The template searching unit 17 is a functional unit that searches for a template corresponding to the changed extraction model. In the following, to distinguish the template corresponding to the extraction model after the change from the template corresponding to the extraction model before the change, the former may be referred to as the "new template" and the latter as the "old template".

As one embodiment, when a request for changing the extraction model is accepted, the template searching unit 17 executes the following processing for each piece of identification information associated with a template stored in the template storage unit 13, that is, for each registrant of the face authentication service. For example, the template searching unit 17 selects one piece of identification information from among the identification information of M (≦N) registrants. Next, the template searching unit 17 acquires K pieces of difference information from the K (≦M) client terminals 30, among the client terminals 30A to 30N, in whose difference information storage unit 39 difference information corresponding to the selected identification information is registered. Then, the template searching unit 17 calculates a representative value of the K pieces of difference information, such as an average value, a median value, or a mode. Calculating such a representative value smooths out variations in the photographing conditions, such as face orientation and illumination, of the input face images used to calculate the K pieces of difference information. Then, using the changed generation model, the template searching unit 17 searches for a feature amount from which a reproduced registered face image can be generated, where the reproduced registered face image is obtained by synthesizing the representative value of the difference information with the estimated registered face image generated from the old template using the generation model before the change.
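The representative value of the K pieces of difference information can be sketched as follows, assuming that each piece of difference information is a flattened list of per-pixel values of equal length and that the representative value is taken pixel by pixel. The function name representative_difference and the sample values are hypothetical.

```python
from statistics import mean, median
from typing import Callable, List, Sequence


def representative_difference(
        differences: Sequence[Sequence[float]],
        reducer: Callable[[Sequence[float]], float] = mean) -> List[float]:
    """Reduce K pieces of difference information to one, pixel by pixel."""
    if not differences:
        raise ValueError("at least one piece of difference information is required")
    if len({len(d) for d in differences}) != 1:
        raise ValueError("all pieces of difference information must have the same size")
    # zip(*differences) groups the K values observed at each pixel position.
    return [reducer(list(pixel_values)) for pixel_values in zip(*differences)]


# Assumed sample: K = 3 pieces of difference information for one registrant.
collected = [
    [0.0, -0.25],
    [0.5, 0.0],
    [0.25, 1.0],
]
by_mean = representative_difference(collected)            # average value
by_median = representative_difference(collected, median)  # median value
```

The median is less affected than the mean by a single outlying capture, such as the large value in the second pixel of the third sample.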

FIG. 3 is a schematic diagram showing an example of template search. As shown in FIG. 3, the template searching unit 17 synthesizes the difference information Rv=(Iv−Da(Fav)) obtained from the client terminal 30 with the estimated registered face image Da(Fa) obtained by inputting the old template, that is, the old registered feature amount Fa, into the generation model Da before the change. This synthesis yields a reproduced registered face image Da(Fa)+Rv that simulates the original registered face image Ie. Then, the template searching unit 17 inputs the reproduced registered face image Da(Fa)+Rv to the changed extraction model Eb whose designation was accepted by the change receiving unit 15. The changed extraction model Eb, given the reproduced registered face image Da(Fa)+Rv, outputs an initial value Fb0=Eb(Da(Fa)+Rv) for the search for the new template Fb corresponding to the changed extraction model Eb.
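The synthesis of the reproduced registered face image Da(Fa)+Rv and the calculation of the initial value Fb0=Eb(Da(Fa)+Rv) can be sketched as follows. The functions toy_generate_a and toy_extract_b are hypothetical stand-ins for the generation model Da and the changed extraction model Eb, and images are assumed to be flattened lists of pixel values.

```python
from typing import List, Sequence


def toy_generate_a(feature: Sequence[int]) -> List[float]:
    """Hypothetical Da: each bit of the old template expands into two pixels."""
    image: List[float] = []
    for bit in feature:
        value = 0.75 if bit else 0.25
        image.extend([value, value])
    return image


def toy_extract_b(image: Sequence[float]) -> List[int]:
    """Hypothetical Eb: same pair-wise bits as a toy Ea, but in reversed order,
    so that old and new feature amounts differ for the same image."""
    bits = [1 if (image[i] + image[i + 1]) / 2 >= 0.5 else 0
            for i in range(0, len(image) - 1, 2)]
    return bits[::-1]


def reproduce_registered_image(old_template: Sequence[int],
                               difference: Sequence[float]) -> List[float]:
    """Synthesize Da(Fa) + Rv, simulating the original registered face image."""
    estimated = toy_generate_a(old_template)  # Da(Fa)
    return [p + r for p, r in zip(estimated, difference)]


fa = [1, 0]                               # assumed old template Fa
rv = [0.125, -0.125, -0.125, 0.125]       # assumed stored difference information Rv
reproduced = reproduce_registered_image(fa, rv)
fb0 = toy_extract_b(reproduced)           # initial value Fb0 = Eb(Da(Fa) + Rv)
```

Here fb0 serves only as the starting point of the search; it is not yet the new template.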

After the initial value Fb0 of the new template is calculated in this way, the template searching unit 17 inputs the initial value Fb0 of the new template to the changed generation model Db. As a result, the changed generation model Db outputs an estimated registered face image Db(Fb0). Then, to evaluate the loss of the initial value Fb0 of the new template, that is, the cost C0, the template searching unit 17 calculates the error |Da(Fa)+Rv−Db(Fb0)| between the reproduced registered face image and the estimated registered face image.

In parallel with this, the template searching unit 17 calculates a neighborhood value Fb' from the initial value Fb0 of the new template by inverting the leading bit among the N bits included in the initial value Fb0. Then, the template searching unit 17 inputs the neighborhood value Fb' of the new template to the changed generation model Db. As a result, the changed generation model Db outputs an estimated registered face image Db(Fb'). Then, to evaluate the loss of the neighborhood value Fb' of the new template, that is, the cost C, the template searching unit 17 calculates the error |Da(Fa)+Rv−Db(Fb')|.

By comparing the magnitudes of these costs C0 and C, the template searching unit 17 evaluates which of the initial value Fb0 and the neighborhood value Fb' is more appropriate as the new template. For example, the template searching unit 17 determines whether or not the cost C is smaller than the cost C0, that is, whether or not C<C0.

Here, when C<C0, the error of the neighborhood value Fb' is smaller than the error of the initial value Fb0, so the neighborhood value Fb' can be evaluated as more suitable as the new template than the initial value Fb0. In this case, the template searching unit 17 updates the new template Fb from the initial value Fb0 to the neighborhood value Fb'.

On the other hand, if C<C0 does not hold, that is, if C0≦C, the error of the initial value Fb0 is not larger than the error of the neighborhood value Fb', so the initial value Fb0 can be evaluated as more appropriate as the new template than the neighborhood value Fb'. In this case, the template searching unit 17 re-inverts the first bit of the N bits included in the neighborhood value Fb' of the new template. As a result, the new template Fb is returned from the neighborhood value Fb' to the initial value Fb0, and the new template Fb is not updated.

After that, for each bit from the second bit to the last bit of the N bits included in the new template Fb, the template searching unit 17 repeats the search in which a neighborhood value is calculated by bit inversion and the new template Fb is updated according to the cost comparison result.
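One pass over all bits, as described in the preceding paragraphs, might look like the following sketch. It assumes an L1 error as the cost and a bit-list template; `sweep_once`, `l1_cost`, and `toy_decode_new` (a stand-in for the modified generation model Db) are hypothetical names introduced only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[float]
Bits = List[int]


def l1_cost(target: Sequence[float], candidate: Sequence[float]) -> float:
    """Assumed cost: sum of absolute pixel differences."""
    return sum(abs(t - c) for t, c in zip(target, candidate))


def sweep_once(
    target: Image, template: Bits, decode_new: Callable[[Bits], Image]
) -> Tuple[Bits, float]:
    """Invert each bit once, keeping an inversion only if it lowers the cost."""
    fb = list(template)                       # work on a copy of Fb
    best = l1_cost(target, decode_new(fb))    # cost C0 of the current template
    for i in range(len(fb)):
        fb[i] ^= 1                            # neighborhood value Fb'
        cost = l1_cost(target, decode_new(fb))
        if cost < best:                       # C < C0: adopt Fb'
            best = cost
        else:                                 # C0 <= C: re-invert, keep Fb
            fb[i] ^= 1
    return fb, best


# Hypothetical toy generation model, used only to make the sketch runnable.
def toy_decode_new(bits: Bits) -> Image:
    return [float(b) for b in bits]


new_template, cost = sweep_once([1.0, 0.0, 1.0], [0, 1, 1], toy_decode_new)
```

Because an inversion is kept only when it strictly lowers the cost, the cost of the template never increases during a pass.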

Here, the above search can be repeated until either of the following convergence conditions is met. The first condition is that the cost C is less than the cost determination threshold CTh. Under this condition, the search is repeated until the cost C becomes sufficiently small. The second condition, where one epoch of the new template is counted each time the bits included in the new template Fb have been inverted in turn from the first bit to the last bit, is that the number of epochs E exceeds the epoch determination threshold ETh. Under this condition, the search is repeated until the generation of the new template Fb has been sufficiently updated. If either of these conditions is satisfied, it can be considered that a new template Fb capable of generating the reproduced registered face image Da(Fa)+Rv using the modified generation model Db has been found. In addition to the above two conditions, other conditions, such as a condition that the amount of update of the new template Fb or the amount of change thereof is less than a threshold, may be used as the convergence condition of the new template Fb.
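The two convergence conditions can be expressed as a single predicate, sketched below. The function and parameter names are hypothetical; only the comparisons (C less than CTh, E greater than ETh) are taken from the text.

```python
def search_converged(
    cost: float, epochs: int, cost_threshold: float, epoch_threshold: int
) -> bool:
    """Return True when either convergence condition holds.

    cost < cost_threshold    : first condition (C < CTh)
    epochs > epoch_threshold : second condition (E > ETh)
    """
    return cost < cost_threshold or epochs > epoch_threshold
```

The epoch condition bounds the running time even when no template reaches the cost threshold.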

After such a search is completed, the template searching unit 17 updates, among the templates stored in the template storage unit 13, the template of the registrant corresponding to the identification information being selected from the old template Fa to the new template Fb.

<Process flow>
Next, the processing flow of the face authentication system according to this embodiment will be described. Here, (1) difference information calculation processing executed by the client terminal 30 will be described, and then (2) template search processing executed by the server device 10 will be described.

(1) Difference Information Calculation Processing
FIG. 4 is a flowchart showing the procedure of difference information calculation processing. As an example only, this processing can be started when an input face image is acquired by the face image acquisition unit 31. As shown in FIG. 4, when an input face image is acquired by the face image acquisition unit 31 (step S101), the feature amount extraction unit 33 extracts an input feature amount from the input face image acquired in step S101 using the extraction model (step S102).

Subsequently, the feature amount extraction unit 33 transmits, to the server device 10, a request for matching the input feature amount extracted in step S102 against the template (step S103). After that, the authentication result receiving unit 35 waits for an authentication result such as authentication success (authentication OK) or authentication failure (authentication NG) as a response to the matching request (step S104).

Then, if the authentication result obtained as a response from the server device 10 is authentication OK (step S105 Yes), the difference information calculation unit 37 executes the following processing. That is, the difference information calculation unit 37 generates an estimated input face image by inputting the input feature amount extracted in step S102 to the generation model (step S106).

After that, the difference information calculation unit 37 calculates difference information by subtracting the estimated input face image generated in step S106 from the original input face image obtained in step S101 (step S107).

Finally, the difference information calculation unit 37 stores the difference information calculated in step S107 in the difference information storage unit 39 in association with the identification information, such as an ID, of the registrant for whom the authentication result of authentication OK is obtained (step S108), and the process ends.
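Steps S101 to S108 can be sketched on the client side as follows. This is an illustrative sketch under several assumptions: images are flat lists of floats, the `authenticate` callable stands in for the matching request and returns the registrant ID on authentication OK or `None` on authentication NG, and the dictionary `store` stands in for the difference information storage unit 39. All function names are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Image = List[float]
Bits = List[int]


def process_input_image(
    store: Dict[str, Image],
    input_image: Image,                               # S101: acquired face image
    extract: Callable[[Image], Bits],
    generate: Callable[[Bits], Image],
    authenticate: Callable[[Bits], Optional[str]],
) -> bool:
    """Store difference information only when authentication succeeds."""
    feature = extract(input_image)                    # S102
    registrant_id = authenticate(feature)             # S103-S104
    if registrant_id is None:                         # S105 No: authentication NG
        return False
    estimated = generate(feature)                     # S106
    difference = [p - q for p, q in zip(input_image, estimated)]  # S107
    store[registrant_id] = difference                 # S108
    return True


# Hypothetical toy models and matcher, used only to make the sketch runnable.
def toy_extract(image: Image) -> Bits:
    return [1 if p >= 0.5 else 0 for p in image]


def toy_generate(bits: Bits) -> Image:
    return [float(b) for b in bits]


def toy_authenticate(feature: Bits) -> Optional[str]:
    return "user-1" if feature == [1, 0] else None


store: Dict[str, Image] = {}
process_input_image(store, [0.75, 0.25], toy_extract, toy_generate, toy_authenticate)
```

Only the difference between the captured image and its reconstruction is kept; the captured image itself is not stored.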

(2) Template Search Processing
FIG. 5 is a flowchart showing the procedure of template search processing. As an example only, when a change request for the extraction model is received by the change reception unit 15, this processing can be executed for each piece of identification information associated with a template stored in the template storage unit 13, that is, for each registrant of the face authentication service.

As shown in FIG. 5, the template searching unit 17 generates a reproduced registered face image Da(Fa)+Rv by synthesizing the difference information acquired from the client terminal 30 and the estimated registered face image obtained by inputting the old template into the generation model before the change (step S301).

Then, the template searching unit 17 inputs the reproduced registered face image generated in step S301 to the modified extraction model specified by the change request, thereby calculating the initial value Fb0 of the new template corresponding to the modified extraction model (step S302).

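As the cost C of the initial value Fb0 of the new template, the template searching unit 17 calculates the error |Da(Fa)+Rv−Db(Fb0)| between the reproduced registered face image and the estimated registered face image Db(Fb0) obtained by inputting the initial value Fb0 to the modified generation model Db (step S303).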

After that, the template searching unit 17 repeats the processing from step S305 to step S313 below until the cost C becomes less than the cost determination threshold value CTh (No at step S304).

First, the template searching unit 17 updates the value of the cost C0 held in a register (not shown) or the like to the cost C calculated in step S303 or the cost C calculated in step S312 (described later) (step S305). Subsequently, the template searching unit 17 increments by one the bit number index N that identifies a position in the bit string of the new template Fb (step S306).

At this time, if the bit number index N is greater than the bit number L of the new template (Yes at step S307), the template searching unit 17 initializes the bit number index N of the new template to the initial value "1" (step S308). Further, the template searching unit 17 increments by one the index E of the epoch number for identifying the generation of the new template (step S309). Note that the initial value of the index E of the number of epochs of the new template may be "0" as an example.

Then, when the index E of the number of epochs of the new template does not exceed the epoch determination threshold ETh (No in step S310), the template searching unit 17 executes the following processing. That is, the template searching unit 17 calculates the neighborhood value Fb' of the new template by inverting the bit corresponding to the index N in the bit string of the new template (step S311).

After that, the template searching unit 17 inputs the neighborhood value Fb' of the new template to the modified generation model Db. As a result, the modified generation model Db outputs an estimated registered face image Db(Fb'). Then, the template searching unit 17 calculates the error |Da(Fa)+Rv−Db(Fb')| as the cost C of the neighborhood value Fb' (step S312).

At this time, if the cost C of the neighborhood value Fb' is not less than the cost C0 of the new template Fb (No in step S313), the error of the new template Fb is not larger than that of the neighborhood value Fb', so the new template Fb can be evaluated as more appropriate than the neighborhood value Fb'. In this case, the template searching unit 17 re-inverts the bit corresponding to the index N in the bit string of the neighborhood value Fb' of the new template (step S314), and proceeds to step S306.

On the other hand, if the cost C of the neighborhood value Fb' is less than the cost C0 of the new template Fb (Yes in step S313), the error of the neighborhood value Fb' is smaller than the error of the new template Fb, so the neighborhood value Fb' can be evaluated as more suitable as the new template than the new template Fb. In this case, the template searching unit 17 updates the value of the new template Fb to the neighborhood value Fb' calculated by bit inversion in step S311 (step S315), and proceeds to step S304.
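The flow of steps S303 to S315 can be sketched as a single loop, as below. This is a reading of the flowchart description rather than the implementation of the embodiment: the L1 error is an assumed cost, the template is assumed to be a bit list, and `search_template`, `l1_cost`, and `toy_decode_new` (a stand-in for the modified generation model Db) are hypothetical names.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[float]
Bits = List[int]


def l1_cost(target: Sequence[float], candidate: Sequence[float]) -> float:
    """Assumed cost: sum of absolute pixel differences."""
    return sum(abs(t - c) for t, c in zip(target, candidate))


def search_template(
    target: Image,                       # reproduced image Da(Fa) + Rv (S301)
    fb0: Bits,                           # initial value Fb0 (S302)
    decode_new: Callable[[Bits], Image],
    cost_threshold: float,               # CTh
    epoch_threshold: int,                # ETh
) -> Tuple[Bits, float]:
    if not fb0:
        raise ValueError("template must contain at least one bit")
    fb = list(fb0)
    num_bits = len(fb)
    cost = l1_cost(target, decode_new(fb))                 # S303
    n, epochs = 0, 0
    while cost >= cost_threshold:                          # S304
        best = cost                                        # S305: C0 <- C
        while True:
            n += 1                                         # S306
            if n > num_bits:                               # S307
                n = 1                                      # S308
                epochs += 1                                # S309
            if epochs > epoch_threshold:                   # S310
                return fb, best
            fb[n - 1] ^= 1                                 # S311: Fb'
            cost = l1_cost(target, decode_new(fb))         # S312
            if cost < best:                                # S313 Yes -> S315
                break
            fb[n - 1] ^= 1                                 # S314: re-invert
    return fb, cost


# Hypothetical toy generation model, used only to make the sketch runnable.
def toy_decode_new(bits: Bits) -> Image:
    return [float(b) for b in bits]


template, final_cost = search_template(
    [1.0, 0.0, 1.0, 1.0], [0, 0, 0, 0], toy_decode_new, 0.5, 3
)
```

In this sketch the bit index continues from where the last accepted inversion occurred, and the epoch counter guarantees termination when the cost threshold cannot be reached.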

<One aspect of the effect>
As described above, the server device 10 according to the present embodiment reproduces the registered face image using the difference information between the original face image and the face image obtained by inverse transformation from the feature amount, and updates the template by searching for a feature amount from which the reproduced registered face image can be generated by the inverse transformation of the changed feature amount extraction model. Therefore, the template can be generated by utilizing difference information that is calculated before the template is changed, that makes it difficult to identify the registrant of the template, and that is smaller in size than the face image. Consequently, according to the server device 10 of the present embodiment, it is possible to omit the re-capture of the face image when updating the template.

Although the embodiments relating to the disclosed apparatus have been described so far, the present invention may be implemented in various different forms other than the embodiments described above. Therefore, other embodiments included in the present invention will be described below.

<Decentralization and Integration>
Also, each component of each illustrated device does not necessarily have to be physically configured as illustrated. In other words, the specific form of distribution and integration of each device is not limited to the one shown in the figure, and all or part of the devices can be functionally or physically distributed or integrated in arbitrary units according to various loads and usage conditions. For example, in the first embodiment described above, the functions included in the above face authentication service are implemented as a client server system including the server device 10 and the client terminal 30, but the functions corresponding to the above face authentication service may be implemented standalone.

<Hardware configuration example>
Moreover, various processes described in the above embodiments can be realized by executing a prepared program on a computer such as a personal computer or a work station. Therefore, an example of a computer that executes a control program having functions similar to those of the first and second embodiments will be described below with reference to FIG.

FIG. 6 is a diagram showing a hardware configuration example. As shown in FIG. 6, the computer 100 has an operation section 110a, a speaker 110b, a camera 110c, a display 120, and a communication section . Furthermore, this computer 100 has a CPU 150 , a ROM 160 , an HDD 170 and a RAM 180 . Each part of these 110 to 180 is connected via a bus 140 .

As shown in FIG. 6, the HDD 170 stores a control program 170a that exhibits the same functions as the matching section 11, the change accepting section 15, and the template searching section 17 shown in the first embodiment. This control program 170a may be integrated or separated like the components of the matching unit 11, the change receiving unit 15, and the template searching unit 17 shown in FIG. That is, the HDD 170 does not necessarily store all the data shown in the first embodiment, and the HDD 170 only needs to store data used for processing.

Under such an environment, the CPU 150 reads out the control program 170a from the HDD 170 and expands it to the RAM 180. As a result, the control program 170a functions as a control process 180a, as shown in FIG. The control process 180a deploys various data read from the HDD 170 in an area assigned to the control process 180a among storage areas of the RAM 180, and executes various processes using the deployed various data. For example, an example of the processing executed by the control process 180a includes the processing shown in FIG. Note that the CPU 150 does not necessarily have to operate all the processing units described in the first embodiment, as long as the processing units corresponding to the processes to be executed are virtually realized.

It should be noted that the above control program 170a does not necessarily have to be stored in the HDD 170 or ROM 160 from the beginning. For example, each program may be stored in a "portable physical medium" inserted into the computer 100, such as a flexible disk (so-called FD), CD-ROM, DVD disk, magneto-optical disk, or IC card. Then, the computer 100 may acquire and execute each program from these portable physical media. Alternatively, each program may be stored in another computer or server device connected to the computer 100 via a public line, the Internet, a LAN, a WAN, or the like, and the computer 100 may obtain and execute each program from these.

1 face authentication system
10 server device
11 matching unit
13 template storage unit
15 change reception unit
17 template search unit
30 client terminal
31 face image acquisition unit
33 feature amount extraction unit
35 authentication result reception unit
37 difference information calculation unit
39 difference information storage unit

Claims

1. A control method characterized in that a computer executes processing comprising:
extracting a feature amount from a photographed face image captured by a camera, using an extraction model for extracting feature amounts from images;
generating an estimated face image from the extracted feature amount, using a generation model for generating an image from a feature amount;
storing, in a storage unit, difference information between the photographed face image captured by the camera and the generated estimated face image;
when the extraction model is changed to a new extraction model for extracting feature amounts from images, newly generating an estimated face image from the extracted feature amount using the generation model, or newly extracting a feature amount from another photographed face image captured by the camera using the extraction model and newly generating an estimated face image from the newly extracted feature amount using the generation model; and
generating, based on the newly generated estimated face image and the difference information stored in the storage unit, a feature amount to be matched, at the time of authentication, against a feature amount extracted from a photographed face image captured by the camera.
2. The control method according to claim 1, characterized in that the processing of generating the feature amount includes generating the feature amount to be matched by searching for a feature amount from which an estimated face image having the minimum difference from a synthesized face image, obtained by synthesizing the newly generated estimated face image and the difference information stored in the storage unit, is generated using a new generation model corresponding to the new extraction model.
3. The control method according to claim 2, characterized in that the processing of generating the feature amount includes performing the search using, as an initial value, a feature amount extracted from the synthesized face image using the new extraction model.
4. The control method according to claim 3, characterized in that the processing of generating the feature amount includes searching for a bit string from which an estimated face image that minimizes the difference from the synthesized face image is generated, while inverting some bits in the bit string corresponding to the feature amount of the initial value.
5. A control program that causes a computer to execute processing comprising:
extracting a feature amount from a photographed face image captured by a camera, using an extraction model for extracting feature amounts from images;
generating an estimated face image from the extracted feature amount, using a generation model for generating an image from a feature amount;
storing, in a storage unit, difference information between the photographed face image captured by the camera and the generated estimated face image;
when the extraction model is changed to a new extraction model for extracting feature amounts from images, newly generating an estimated face image from the extracted feature amount using the generation model, or newly extracting a feature amount from another photographed face image captured by the camera using the extraction model and newly generating an estimated face image from the newly extracted feature amount using the generation model; and
generating, based on the newly generated estimated face image and the difference information stored in the storage unit, a feature amount to be matched, at the time of authentication, against a feature amount extracted from a photographed face image captured by the camera.
6. The control program according to claim 5, characterized in that the processing of generating the feature amount includes generating the feature amount to be matched by searching for a feature amount from which an estimated face image having the minimum difference from a synthesized face image, obtained by synthesizing the newly generated estimated face image and the difference information stored in the storage unit, is generated using a new generation model corresponding to the new extraction model.
7. The control program according to claim 6, characterized in that the processing of generating the feature amount includes performing the search using, as an initial value, a feature amount extracted from the synthesized face image using the new extraction model.
8. The control program according to claim 7, characterized in that the processing of generating the feature amount includes searching for a bit string from which an estimated face image that minimizes the difference from the synthesized face image is generated, while inverting some bits in the bit string corresponding to the feature amount of the initial value.
9. An information processing apparatus comprising a control unit that executes processing comprising:
extracting a feature amount from a photographed face image captured by a camera, using an extraction model for extracting feature amounts from images;
generating an estimated face image from the extracted feature amount, using a generation model for generating an image from a feature amount;
storing, in a storage unit, difference information between the photographed face image captured by the camera and the generated estimated face image;
when the extraction model is changed to a new extraction model for extracting feature amounts from images, newly generating an estimated face image from the extracted feature amount using the generation model, or newly extracting a feature amount from another photographed face image captured by the camera using the extraction model and newly generating an estimated face image from the newly extracted feature amount using the generation model; and
generating, based on the newly generated estimated face image and the difference information stored in the storage unit, a feature amount to be matched, at the time of authentication, against a feature amount extracted from a photographed face image captured by the camera.
10. The information processing apparatus according to claim 9, characterized in that the processing of generating the feature amount includes generating the feature amount to be matched by searching for a feature amount from which an estimated face image having the minimum difference from a synthesized face image, obtained by synthesizing the newly generated estimated face image and the difference information stored in the storage unit, is generated using a new generation model corresponding to the new extraction model.
11. The information processing apparatus according to claim 10, characterized in that the processing of generating the feature amount includes performing the search using, as an initial value, a feature amount extracted from the synthesized face image using the new extraction model.
12. The information processing apparatus according to claim 11, characterized in that the processing of generating the feature amount includes searching for a bit string from which an estimated face image that minimizes the difference from the synthesized face image is generated, while inverting some bits in the bit string corresponding to the feature amount of the initial value.