WO2023128342A1

WO2023128342A1 - Method and system for identifying individual using homomorphically encrypted voice

Info

Publication number: WO2023128342A1
Application number: PCT/KR2022/019467
Authority: WO
Inventors: 안용대; 박준홍
Original assignee: 주식회사 디사일로
Priority date: 2021-12-30
Filing date: 2022-12-02
Publication date: 2023-07-06
Also published as: KR102403471B1

Abstract

The present invention relates to a method for identifying an individual using a homomorphically encrypted voice, the method comprising the steps of: acquiring first voice data of a user; homomorphically encrypting the first voice data; transmitting the homomorphically encrypted first voice data to a voice operation server; receiving, from the voice operation server, a homomorphically encrypted identification result calculated on the basis of the homomorphically encrypted first voice data and previously stored second voice data of other users; and decrypting the homomorphically encrypted identification result.

Description

Personal identification method and system using homomorphic encrypted voice

The present invention relates to a method and system for personal identification using homomorphic encrypted voice.

Three types of personal identification methods are typically used to prove a user's identity when using computers, laptops, kiosk terminals, access control facilities, bank terminals (ATMs), websites, and Internet banking.

First, a user enters a password made up of letters and numbers, or a personal identification number consisting only of digits, and the system checks whether the user is legitimate. Second, the system recognizes the user's unique biometric information, such as a fingerprint, iris, face, or voice, to check whether the user is legitimate. Third, the user carries a separate device used only for authentication, such as a one-time password (OTP) generator for Internet banking or an employee ID card, and presents it when authentication is requested.

The first method is the most widely used, but because users must set a different password for each system and have difficulty remembering them, many choose short, common passwords for convenience, which weakens security.

In addition, the third method has the disadvantage that the user must always carry the authentication device, and reissuing the device is cumbersome if it is lost.

The background description of the invention has been prepared to facilitate understanding of the present invention. It should not be construed as an admission that the matters described in this background section exist as prior art.

Accordingly, the second identification method, which uses the user's unique biometric information that cannot be lost and does not change, is a safe method. However, it has the problem that building the data consumes considerable time and money.

Accordingly, a new method capable of quickly and accurately identifying a user using the user's voice in a public space is required.

As a result, the inventors of the present invention sought to develop a method, and a system for performing it, that can easily and quickly identify a user using only a device capable of acquiring the user's voice.

In particular, the inventors of the present invention configured the method so that the user's unique biometric information is not exposed: the voice data obtained from the user is homomorphically encrypted, and the user identification result is then obtained as a homomorphically encrypted operation result.

The problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the following description.

In order to solve the above problems, a personal identification method using homomorphically encrypted voice according to an embodiment of the present invention is provided. The method includes obtaining first voice data of a user; homomorphically encrypting the first voice data; transmitting the homomorphically encrypted first voice data to a voice operation server; receiving, from the voice operation server, a homomorphically encrypted identification result calculated on the basis of the homomorphically encrypted first voice data and pre-stored second voice data of another user; and decrypting the homomorphically encrypted identification result.

According to a feature of the present invention, the transmitting of the homomorphically encrypted first voice data may further include transmitting, to the voice operation server, parameters for the homomorphic encryption operation used to homomorphically encrypt the first voice data.

According to another feature of the present invention, the homomorphically encrypted identification result may be an identification result calculated based on the homomorphically encrypted first voice data and on second voice data that has been homomorphically encrypted based on the parameters.

According to another feature of the present invention, the second voice data is voice data of a plurality of other users pre-stored in the voice operation server, and the decrypting step may further include obtaining an identification result for the other user, among the plurality of other users, who matches the user.

According to another feature of the present invention, the first voice data and the second voice data may include a voice frequency obtained in response to a user identification question provided to the user, a feature region extracted from a waveform of the voice frequency, or text determined based on the voice frequency.

According to another feature of the present invention, the homomorphic encryption step may be a step of performing homomorphic encryption using any one of partially homomorphic encryption, somewhat homomorphic encryption, and fully homomorphic encryption.

In order to solve the above problems, a personal identification method using homomorphically encrypted voice according to another embodiment of the present invention is provided. The method may include receiving, from an identifier device, an operation request including homomorphically encrypted first voice data of a user; obtaining pre-stored second voice data of another user according to the operation request; calculating a homomorphically encrypted identification result based on the homomorphically encrypted first voice data and the second voice data; and transmitting the homomorphically encrypted identification result to the identifier device.

According to a feature of the present invention, receiving the operation request may further include receiving parameters for a homomorphic encryption operation, used to homomorphically encrypt the first voice data, from the identifier device.

According to another feature of the present invention, the acquiring may further include performing homomorphic encryption of the second voice data based on the parameter.

According to another feature of the present invention, the calculating of the homomorphically encrypted identification result may further include determining a first location corresponding to the homomorphically encrypted first voice data and a second location corresponding to the second voice data, and calculating a distance value between the first location and the second location, the distance value corresponding to the identification result.

According to another feature of the present invention, the calculating of the homomorphically encrypted identification result may be a step of calculating the homomorphically encrypted identification result based on the second voice data of the plurality of other users and the homomorphically encrypted first voice data, according to the type of the received operation request.

Other embodiment specifics are included in the detailed description and drawings.

According to the present invention, a user can be identified in a public space without sharing the user's unique biometric information (voice) with an external server. In particular, the present invention can identify a user or determine whether a user is the same as another user.

In addition, in the present invention, the user's voice data is operated on in a homomorphically encrypted state, and only the operation result is decrypted and verified in the device that acquired the user's voice, so the voice used to prove the user's identity can be safely protected.

In addition, the present invention does not require the user to carry a separate device or memorize a unique identification number for user identification and user authentication, so user convenience can be improved.

Effects according to the present invention are not limited by the contents exemplified above, and more various effects are included in the present invention.

FIG. 1 is a schematic diagram of a personal identification system according to an embodiment of the present invention.

FIG. 2 is a block diagram showing the configuration of an identifier device according to an embodiment of the present invention.

FIG. 3 is a flowchart of a personal identification method of an identifier device according to an embodiment of the present invention.

FIGS. 4 and 5 are schematic diagrams for explaining a personal identification interface screen output to an identifier device according to an embodiment of the present invention.

FIG. 6 is a block diagram showing the configuration of a voice calculation server that performs homomorphic encryption calculation according to an embodiment of the present invention.

FIG. 7 is a flowchart of a personal identification method of a voice calculation server according to an embodiment of the present invention.

FIGS. 8 and 9 are schematic flowcharts of a data identification method according to an embodiment of the present invention.

Advantages and features of the present invention, and methods of achieving them, will become clear with reference to the embodiments described in detail below in conjunction with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only to make the disclosure of the present invention complete and to fully inform a person having ordinary knowledge in the art to which the present invention pertains of the scope of the invention, and the present invention is defined only by the scope of the claims. In connection with the description of the drawings, like reference numerals may be used for like elements.

In this document, expressions such as "has," "may have," "includes," or "may include" indicate the existence of a corresponding feature (e.g., a numerical value, function, operation, or component such as a part) and do not preclude the existence of additional features.

In this document, expressions such as “A or B,” “at least one of A and/or B,” or “one or more of A and/or B” may include all possible combinations of the items listed together. For example, “A or B,” “at least one of A and B,” or “at least one of A or B” may refer to all of the following cases: (1) including at least one A, (2) including at least one B, or (3) including both at least one A and at least one B.

Expressions such as “first” or “second” used in this document may modify various elements regardless of order and/or importance; they are used only to distinguish one element from another and do not limit the corresponding elements. For example, a first user device and a second user device may represent different user devices regardless of order or importance. For example, without departing from the scope of rights described in this document, a first element may be named a second element, and similarly, the second element may be renamed the first element.

When a component (e.g., a first component) is referred to as being "(operatively or communicatively) coupled with/to" or "connected to" another component (e.g., a second component), it should be understood that the component may be directly connected to the other component or connected through yet another component (e.g., a third component). On the other hand, when a component (e.g., a first component) is referred to as being “directly coupled” or “directly connected” to another component (e.g., a second component), it may be understood that no other component (e.g., a third component) exists between them.

As used in this document, the expression "configured to" may be used interchangeably with, for example, "suitable for," "having the capacity to," "designed to," "adapted to," "made to," or "capable of," depending on the circumstances. The term "configured (or set) to" does not necessarily mean only "specifically designed to" in hardware. Instead, in some contexts, the phrase "device configured to" may mean that the device is "capable of" operating in conjunction with other devices or components. For example, the phrase "a processor configured (or set) to perform A, B, and C" may mean a dedicated processor (e.g., an embedded processor) for performing those operations, or a general-purpose processor (e.g., a CPU or application processor) capable of performing those operations by executing one or more software programs stored in a memory device.

Terms used in this document are only used to describe a specific embodiment, and may not be intended to limit the scope of other embodiments. Singular expressions may include plural expressions unless the context clearly dictates otherwise. Terms used herein, including technical or scientific terms, may have the same meaning as commonly understood by a person of ordinary skill in the art described in this document. Among the terms used in this document, terms defined in a general dictionary may be interpreted as having the same or similar meaning as their meaning in the context of the related art and, unless explicitly defined in this document, are not to be interpreted as having an ideal or excessively formal meaning. In some cases, even terms defined in this document cannot be interpreted to exclude the embodiments of this document.

Each feature of the various embodiments of the present invention can be partially or entirely coupled or combined with each other and, as those skilled in the art can fully understand, various technical interlocking and driving operations are possible; the embodiments may be implemented independently of each other or together in an associated relationship.

For clarity of interpretation of this specification, terms used in this specification will be defined below.

Referring to FIG. 1 , the personal identification system 1000 may include an identifier device 100 displaying a user's personal identification result and a voice operation server 200 calculating the user's personal identification result.

The personal identification system 1000 may be a system capable of identifying a user using the user's voice. In the present invention, identifying a user can be understood as recognizing who a user is, or determining whether a user is the same as another user, by comparing unique biometric information (the user's voice) between two users.

In the present invention, user identification may be performed between a user and a plurality of user groups. That is, other users who are comparison targets for user identification may be users of groups (user groups 1 and 2) to which the user belongs. For example, users registered in a DB server (not shown) (or the voice operation server 200) of a company, school, or region to which the user belongs, or users registered for a conference in which the user participates, can be the other users to be compared. The identifier device 100 may designate in advance the DB server to be used for the personal identification service in order to increase the accuracy of personal identification results and improve identification speed.

In the personal identification system 1000, the identifier device 100 and the voice operation server 200 can send and receive all data in an encrypted state, and the data can be encrypted through a homomorphic encryption technique so that the identifier device 100 and the voice operation server 200 can operate on the data while it remains encrypted.

That is, the data exchanged between the identifier device 100 and the voice operation server 200 in the personal identification system 1000 is homomorphically encrypted data rather than the original data, and the original data can be kept in each device.

The identifier device 100 and the voice operation server 200 may homomorphically encrypt data through a web page or an application/program capable of processing homomorphically encrypted data, and may perform operations between homomorphically encrypted data. The identifier device 100 and the voice operation server 200 may perform operations between homomorphic ciphertexts or between a homomorphic ciphertext and plaintext, and may homomorphically encrypt voice data using various homomorphic encryption algorithms. For example, the identifier device 100 and the voice operation server 200 may encrypt voice data using any one of partially homomorphic encryption, somewhat homomorphic encryption, and fully homomorphic encryption.

The identifier device 100 is a device capable of obtaining a user's voice and outputting a voice identification result, and may be implemented as a PC, tablet PC, smart phone, wearable device, or the like. Here, the biometric information unique to the user may mean the user's voice.

The identifier device 100 may transmit parameters for the homomorphic encryption operation to the voice operation server 200 so as to obtain an identification result for the corresponding user based on the homomorphically encrypted first voice data (the user's voice). Specifically, the parameters may include the polynomial degree of a function used for the homomorphic encryption operation, the scale bits and coefficients specified for the homomorphic encryption operation, and attribute information of the voice data (file format, time, sampling rate, Mel-Frequency Cepstral Coefficients (MFCC)).

The identifier device 100 may receive an identification result produced by a homomorphic encryption operation using the parameters, and may obtain the identification result for the user by decrypting it. For example, when the identifier device 100 is installed in a specific space, it may acquire the user's voice, homomorphically encrypt it, and transmit it to the voice operation server 200. It may then decrypt the homomorphically encrypted operation result received from the voice operation server 200 and, according to the user identification result, output to each user whether the user is registered as an accessible user, that is, whether the user is allowed to enter.
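How a decrypted result becomes an allow-or-deny decision is not spelled out in this description. As a minimal sketch, assume the decrypted result is a mapping from registered user IDs to distance values, and that a smaller distance means a closer match. The function name `identify`, the user IDs, and the threshold are all hypothetical.

```python
from typing import Dict, Optional


def identify(decrypted_distances: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the closest registered user if within the threshold, else None.

    Assumption: the values are distances already decrypted on the identifier
    device; the server never sees them in the clear.
    """
    if not decrypted_distances:
        return None
    # Pick the registered user whose stored voice data is closest.
    best_user = min(decrypted_distances, key=decrypted_distances.get)
    if decrypted_distances[best_user] <= threshold:
        return best_user
    return None


if __name__ == "__main__":
    result = identify({"user_a": 4, "user_b": 94}, threshold=10)
    print("entry allowed for" if result else "entry denied", result)
```

Returning `None` for "no registered user is close enough" keeps the access decision on the device, in line with the idea that only the device sees the decrypted result.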

In various embodiments, the identifier device 100 may perform the operation directly, without receiving the homomorphically encrypted operation result from the voice operation server 200. In this case, the identifier device 100 may receive homomorphically encrypted second voice data of a plurality of other users from the voice operation server 200 and perform the operation between the homomorphically encrypted voice data of the user and of the other users a plurality of times. The operation method between homomorphically encrypted data will be described later.

In various embodiments, the identifier device 100 may homomorphically encrypt an arbitrary utterance of the user as is, but, according to the administrator's setting, may instead provide the user with a preset user identification question, obtain the corresponding answer, and homomorphically encrypt it. In this case, the identifier device 100 may use, as voice data, the voice frequency obtained in response to the user identification question, a feature region extracted from the voice frequency waveform, or text determined based on the voice frequency. Here, the feature region extracted from the voice frequency waveform means a region in which different feature points are detected according to gender or age, and may be replaced with a feature vector.

The voice operation server 200 is a server capable of performing operations between homomorphically encrypted data using pre-stored voice data according to the operation request of the identifier device 100, and may be implemented as a PC, tablet PC, smart phone, general-purpose computer, laptop, or cloud server.

The voice operation server 200 may store a plurality of second voice data (voices of other users) and, according to the type of operation request, may perform a homomorphic encryption operation with the voice data of one user or perform a plurality of homomorphic encryption operations using the voice data of a plurality of users.

In various embodiments, the voice operation server 200 may store voice data of users for preset identification questions (voice data for a first text and a second text), and may additionally store, as voice data, the feature region and text converted from the voice frequency waveform of each voice data, so that they can be used in homomorphic encryption operations.

The voice operation server 200 may calculate a homomorphically encrypted operation result, and decryption of the operation result may be performed by the identifier device 100. That is, because the voice operation server 200 receives the homomorphically encrypted first voice data from the identifier device 100, performs the operation, and transmits the result without decrypting it, the voice operation server 200 cannot see the identification result, namely whether the user matches a previously stored user A or whether the user is one of a plurality of users.

In various embodiments, the voice operation server 200 may perform the operation based on the homomorphically encrypted first voice data and plaintext second voice data, or may perform the operation based on second voice data that has been homomorphically encrypted using an encryption key reflecting the parameters received from the identifier device 100.
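The first option above (encrypted first voice data combined with plaintext second voice data) can be pictured with the following minimal sketch. It makes several assumptions that go beyond this description: voice data has already been reduced to small integer vectors, the identification result is a squared Euclidean distance, and a toy Paillier scheme (an additive, partially homomorphic scheme) stands in for whatever scheme an implementation would choose. It relies on the identity sum((x_i - y_i)^2) = sum(x_i^2) - 2*sum(x_i*y_i) + sum(y_i^2), which needs only ciphertext addition and multiplication of a ciphertext by a plaintext value. All function names are hypothetical, and the key size is far too small for real use.

```python
import math
import secrets

# Toy Paillier key built from two Mersenne primes (illustration only).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                      # public modulus
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # secret: lcm(P-1, Q-1)
MU = pow(LAM, -1, N)           # secret: inverse of LAM mod N (generator N+1)


def encrypt(m: int) -> int:
    """Public-key encryption; usable by both the device and the server."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Secret-key decryption; performed only on the identifier device."""
    m = (pow(c, LAM, N2) - 1) // N * MU % N
    return m - N if m > N // 2 else m   # map back to signed integers


def he_add(c1: int, c2: int) -> int:
    """Ciphertext + ciphertext."""
    return c1 * c2 % N2


def he_mul_plain(c: int, k: int) -> int:
    """Ciphertext * plaintext scalar (negative scalars reduced mod N)."""
    return pow(c, k % N, N2)


def encrypted_sq_distance(enc_x, enc_x_sq_sum, y):
    """Server side: Enc(sum((x_i - y_i)^2)) from encrypted x and plaintext y."""
    acc = he_add(enc_x_sq_sum, encrypt(sum(v * v for v in y)))
    for cx, yv in zip(enc_x, y):
        acc = he_add(acc, he_mul_plain(cx, -2 * yv))
    return acc


if __name__ == "__main__":
    x = [3, 7, 2]                                   # device: first voice data
    enc_x = [encrypt(v) for v in x]
    enc_sq = encrypt(sum(v * v for v in x))
    stored = {"user_a": [3, 7, 2], "user_b": [10, 1, 5]}  # server: plaintext
    enc_results = {u: encrypted_sq_distance(enc_x, enc_sq, y)
                   for u, y in stored.items()}
    print({u: decrypt(c) for u, c in enc_results.items()})
```

In this sketch the server only ever handles `enc_x`, `enc_sq`, and the encrypted results, so it learns neither the user's vector nor which stored user is closest. The device sends the encrypted sum of squares alongside the encrypted vector because an additive scheme cannot square a ciphertext on its own.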

In various embodiments, the voice operation server 200 may provide the identifier device 100 with a web page or application for homomorphic encryption of data and decryption of the identification result.

In various embodiments, the identifier device 100 and the voice operation server 200 may pre-process voice data stored in their respective devices to reduce the burden of homomorphic encryption calculation before performing homomorphic encryption. For example, the identifier device 100 and the voice operation server 200 may convert voice data into locations in order to calculate the similarity between voice data. That is, each voice data can be converted to a designated location in a discretized grid system.

So far, the personal identification system 1000 according to an embodiment of the present invention has been described. According to the present invention, all data transmitted and received between the identifier device 100 and the voice operation server 200 is in a homomorphically encrypted state, so the user's voice can be safely protected while the personal identification service is provided.
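The conversion function itself is not given in this description. One plausible reading, shown here as a hedged sketch, is uniform quantization: each real-valued feature is clipped to a fixed range and mapped to an integer cell index, so that later encrypted arithmetic works on small integers. The function name `to_grid_location`, the value range, and the number of cells are assumptions made for illustration.

```python
from typing import List, Sequence


def to_grid_location(features: Sequence[float], lo: float = -1.0,
                     hi: float = 1.0, cells: int = 256) -> List[int]:
    """Map each feature to an integer cell index in [0, cells - 1]."""
    if hi <= lo or cells < 1:
        raise ValueError("need hi > lo and cells >= 1")
    step = (hi - lo) / cells
    location = []
    for value in features:
        clipped = min(max(value, lo), hi)        # keep the value inside the grid
        index = int((clipped - lo) / step)       # cell that contains the value
        location.append(min(index, cells - 1))   # hi itself falls in the last cell
    return location


if __name__ == "__main__":
    print(to_grid_location([-1.0, 0.0, 1.0, 2.5], cells=4))
```

A coarser grid means smaller integers and cheaper encrypted operations, at the cost of less precise similarity values.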

Hereinafter, the identifier device 100 receiving the personal identification service will be described.

Referring to FIG. 2 , the identifier device 100 may include a memory interface 110 , one or more processors 120 and a peripheral interface 130 . The various components within identifier device 100 may be connected by one or more communication buses or signal lines.

The memory interface 110 may be connected to the memory 150 and transfer various data to the processor 120. Here, the memory 150 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card type memory (e.g., SD or XD memory), RAM, SRAM, ROM, EEPROM, PROM, network storage, cloud storage, and a blockchain database.

In various embodiments, the memory 150 may store a plurality of user identification questions to be provided to a user, data constituting a personal identification interface screen for acquiring a user's voice and outputting a personal identification result, preprocessed first voice data, a function for converting the first voice data into a form capable of homomorphic encryption, an algorithm for homomorphic encryption, homomorphically encrypted first voice data, and parameters for the homomorphic encryption operation.

In various embodiments, the memory 150 may store at least one of an operating system 151, a communication module 152, a graphical user interface (GUI) module 153, a sensor processing module 154, a telephony module 155, and an application module 156. Specifically, the operating system 151 may include instructions for processing basic system services and instructions for performing hardware tasks. The communication module 152 may communicate with at least one of one or more other devices, computers, and servers. The graphical user interface (GUI) module 153 may process a graphical user interface. The sensor processing module 154 may process sensor-related functions (e.g., processing voice input received through one or more microphones 192). The telephony module 155 may process phone-related functions. The application module 156 may perform various functions of a user application, such as electronic messaging, web browsing, media processing, navigation, imaging, and other processing functions. In addition, the identifier device 100 may store, in the memory 150, one or more software applications 156-1 and 156-2 (e.g., a personal identification service application) associated with any one type of service.

In various embodiments, the memory 150 may store a digital assistant client module 157 (hereinafter, the DA client module), and may thereby store instructions for performing client-side functions of the digital assistant and various user data 158 (e.g., user-customized vocabulary data, preference data, and other data such as the user's electronic address book).

Meanwhile, the DA client module 157 may obtain a user's voice input, text input, touch input, and/or gesture input through various user interfaces (e.g., the I/O subsystem 140) provided in the identifier device 100.

In addition, the DA client module 157 may output audio-visual and tactile data. For example, the DA client module 157 may output data consisting of a combination of at least two of voice, sound, notification, text message, menu, graphic, video, animation, and vibration. In addition, the DA client module 157 may communicate with a digital assistant server (not shown) using the communication subsystem 180 .

In various embodiments, the DA client module 157 may collect additional information about the surrounding environment of the identifier device 100 from various sensors, subsystems, and peripheral devices to construct a context associated with user input. For example, the DA client module 157 may provide context information together with the user's input to the digital assistant server so that the user's intention can be inferred. Here, the context information that may accompany the user input may include sensor information, e.g., lighting, ambient noise, ambient temperature, and images or video of the surrounding environment. As another example, the context information may include the physical state of the identifier device 100 (e.g., device orientation, device location, device temperature, power level, speed, acceleration, motion pattern, cellular signal strength, etc.). As another example, the context information may include information related to the software state of the identifier device 100 (e.g., processes running on the identifier device 100, installed programs, past and present network activity, background services, error logs, resource usage, etc.).

In various embodiments, the memory 150 may include added or deleted commands, and the identifier device 100 may also include additional components other than those shown in FIG. 2 or may exclude some components.

The processor 120 may control the overall operation of the identifier device 100, and may execute various commands for implementing an interface for personal identification service by driving an application or program stored in the memory 150.

The processor 120 may correspond to an arithmetic device such as a central processing unit (CPU) or an application processor (AP). In addition, the processor 120 may be implemented in the form of an integrated chip (IC) such as a System on Chip (SoC) in which various computing devices performing machine learning, such as a Neural Processing Unit (NPU), are integrated.

In various embodiments, the processor 120 may homomorphically encrypt the voice and obtain a homomorphically encrypted identification result for the user based on it, which will be described below with reference to FIGS. 3 to 5.

FIG. 3 is a flowchart of a personal identification method of an identifier device according to an embodiment of the present invention, and FIGS. 4 and 5 are schematic diagrams for explaining a personal identification interface screen output to an identifier device according to an embodiment of the present invention.

Referring to FIG. 3, the processor 120 may acquire first voice data of the user (S110). In detail, the processor 120 may provide the user with a specific command through the touch screen 153 and obtain the user's voice captured through the microphone 192.

In this regard, referring to FIG. 4, the processor 120 of the identifier device 100 may provide an interface screen for acquiring the user's first voice data as shown in (a). Specifically, the user guide phrase 11 for acquiring the user's first voice data may be included in the interface screen along with the arbitrary sentence 13 provided by the processor 120. In addition, while the user's first voice data is being acquired, an image indicating that the voice is being obtained may be displayed on the interface screen, together with information on the location where the identifier device 100 is placed and the time at which the user's personal identification is performed.

Also, the processor 120 may provide an interface screen for acquiring the user's first voice data corresponding to the question stored in the memory 150, as shown in (b). Specifically, the user guidance phrase 12 for acquiring the user's first voice data may be included in the interface screen together with the user identification questions 14 stored in advance.

In various embodiments, the processor 120 may homomorphically encrypt the entire voice, but, according to the administrator's setting, may instead homomorphically encrypt the voice frequency obtained in response to the user identification question provided to the user, the feature region extracted from the waveform of the voice frequency, or the text determined based on the voice frequency. To this end, the processor 120 may use a text extraction model. Here, the text extraction model may be a model trained to take voice frequencies as input and output text.

That is, the processor 120 can use, as the first voice data of the user requiring personal identification, any one of the voice frequency obtained through the microphone 192, the feature region extracted from the waveform of the voice frequency, and the text determined based on the voice frequency.

In various embodiments, the processor 120 may pre-process the user's first voice data in order to reduce the burden of homomorphic encryption calculation. For example, the processor 120 may use a pre-stored function to convert the voice data into a location, namely a designated location in a discretized grid system, in order to calculate the similarity between the first voice data and the second voice data to be compared.

Referring back to FIG. 3 , after step S110, the processor 120 may homomorphically encrypt the first voice data (S120). In detail, the processor 120 may homomorphically encrypt the first voice data using an encryption key in which a parameter for a homomorphic encryption operation is reflected.

In various embodiments, the processor 120 may homomorphically encrypt the first voice data using any one of partially homomorphic encryption, somewhat homomorphic encryption, and fully homomorphic encryption.
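To make the idea of partially homomorphic encryption concrete, the sketch below implements a toy version of the Paillier cryptosystem, which supports addition of encrypted values. Paillier is chosen here only as a well-known example; the source does not name a specific scheme. The key is built from two small, publicly known Mersenne primes, so it offers no security and serves only to show the homomorphic property.

```python
import math
import secrets

# Toy key material (assumption for illustration): two known Mersenne primes.
P = 2**31 - 1
Q = 2**61 - 1


def keygen(p: int = P, q: int = Q):
    """Return (public key n, private key (phi, mu))."""
    n = p * q
    phi = (p - 1) * (q - 1)
    mu = pow(phi, -1, n)  # modular inverse; requires Python 3.8+
    return n, (phi, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt integer m under public key n, using generator g = n + 1."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return ((1 + (m % n) * n) * pow(r, n, n2)) % n2


def decrypt(n: int, priv, c: int) -> int:
    """Recover the plaintext (mod n) from ciphertext c."""
    phi, mu = priv
    u = pow(c, phi, n * n)
    return ((u - 1) // n * mu) % n


def add(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)
```

For example, after `n, priv = keygen()`, decrypting `add(n, encrypt(n, 20), encrypt(n, 22))` yields 42 even though the party that performed the addition never saw 20 or 22.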

After step S120, the processor 120 may transmit the homomorphically encrypted first voice data to the voice operation server 200 through the communication module 152 (S130). The processor 120 may transmit to the voice operation server 200 an operation request including the homomorphically encrypted first voice data and the parameters used in the process of homomorphically encrypting the first voice data. For example, the parameters may include the polynomial degree of a function used for the homomorphic encryption operation, scale bits and coefficients specified for the homomorphic encryption operation, and attribute information (file format, time, sampling rate, Mel-Frequency Cepstral Coefficient (MFCC)).
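One possible shape for such an operation request is sketched below. The class names, field names, request-type labels, and the JSON encoding are all assumptions made for illustration; only the kinds of parameters (polynomial degree, scale bits, coefficients, attribute information) come from the description above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class HEParams:
    """Parameters the server needs in order to encrypt compatibly."""
    poly_degree: int
    scale_bits: int
    coefficients: List[int]
    attributes: Dict[str, object] = field(default_factory=dict)


@dataclass
class OperationRequest:
    request_type: str    # hypothetical labels: "one_to_one" or "one_to_many"
    ciphertext_hex: str  # serialized encrypted first voice data
    params: HEParams

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


def parse_request(raw: str) -> OperationRequest:
    """Rebuild an OperationRequest from its JSON form."""
    data = json.loads(raw)
    data["params"] = HEParams(**data["params"])
    return OperationRequest(**data)
```

Sending the parameters alongside the ciphertext lets the server prepare the second voice data in a compatible form without ever holding the decryption key.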

After step S130, the processor 120 may receive, from the voice operation server 200, a homomorphically encrypted identification result calculated based on the homomorphically encrypted first voice data and pre-stored second voice data of another user (S140). Here, the homomorphically encrypted identification result may be an identification result calculated based on the homomorphically encrypted first voice data and the second voice data homomorphically encrypted based on the parameters provided in step S130.

That is, the first and second voice data can be operated on in a homomorphically encrypted state using encryption keys in which the same parameters are reflected, and accordingly the identification result calculated by the voice calculation server 200 can be correctly decrypted.

Meanwhile, according to the operation request, the processor 120 may receive an identification result against the pre-stored voice data of one other user, or an identification result against the voice data of a plurality of other users.

After step S140, the processor 120 may decrypt the homomorphically encrypted identification result (S150). Specifically, the processor 120 may output decoded results of different types to the touch screen 143 according to an operation request. For example, the processor 120 may check an identification result of whether the user matches another designated user or an identification result of whether the user is one of a plurality of users.

In this regard, referring to FIG. 5, the processor 120 of the identifier device 100 may provide a notification 15 indicating whether the user is allowed access, depending on whether or not the user matches any one of a plurality of users permitted access, as shown in (a).

In addition, the processor 120 may provide a notification 16 indicating a result of recognizing the type of user as shown in (b).

Referring back to FIG. 2, the peripheral interface 130 may be connected to various sensors, subsystems, and peripheral devices to provide data so that the identifier device 100 can perform various functions. Here, a function performed by the identifier device 100 may be understood as being performed by the processor 120.

The peripheral interface 130 may receive data from the motion sensor 160, the light sensor 161, and the proximity sensor 162, through which the identifier device 100 may perform orientation, light, and proximity sensing functions. As another example, the peripheral interface 130 may receive data from other sensors 163 (positioning system (GPS) receiver, temperature sensor, biometric sensor), through which the identifier device 100 may perform functions related to the other sensors 163.

In various embodiments, the identifier device 100 may include a camera subsystem 170 coupled to the peripheral interface 130 and an optical sensor 171 coupled thereto, through which the identifier device 100 may perform various shooting functions such as taking pictures and recording video clips.

In various embodiments, identifier device 100 may include a communication subsystem 180 coupled with peripheral interface 130 . The communication subsystem 180 is composed of one or more wired/wireless networks, and may include various communication ports, radio frequency transceivers, and optical transceivers.

In various embodiments, the identifier device 100 may include an audio subsystem 190 coupled to the peripheral interface 130. Because the audio subsystem 190 includes one or more speakers 191 and one or more microphones 192, the identifier device 100 can perform voice-activated functions, such as voice recognition, voice replication, digital recording, and telephony functions.

In various embodiments, the identifier device 100 may include an I/O subsystem 140 coupled with the peripheral interface 130. For example, the I/O subsystem 140 may control the touch screen 143 included in the identifier device 100 through the touch screen controller 141. For example, the touch screen controller 141 may use any one of a plurality of touch sensing technologies, such as capacitive, resistive, infrared, and surface acoustic wave technologies or a proximity sensor array, to detect a user's touch and movement, or the cessation of touch and movement. As another example, the I/O subsystem 140 may control other input/control devices 144 included in the identifier device 100 via other input controller(s) 142. As an example, the other input controller(s) 142 may control one or more buttons, rocker switches, thumb-wheels, infrared ports, USB ports, and pointer devices such as styluses.

So far, the identifier device 100 according to an embodiment of the present invention has been described. According to the present invention, the identifier device 100 may request the voice calculation server 200 to perform an operation using homomorphically encrypted voice data in order to compare the user's voice data with voice data of another user, and accordingly the identity of the user can be quickly confirmed while protecting the user's personal information.

Hereinafter, the voice calculation server 200 providing a personal identification service will be described.

Referring to FIG. 6, the voice operation server 200 may include a communication interface 210, a memory 220, an I/O interface 230, and a processor 240, each of which may communicate with the others through one or more communication buses or signal lines.

The communication interface 210 may be connected to a plurality of identifier devices 100 through a wired/wireless communication network to exchange data. For example, the communication interface 210 may receive, from the identifier device 100, an operation request including homomorphically encrypted first voice data and parameters for a homomorphic encryption operation, and may transmit the homomorphically encrypted identification result to the identifier device 100.

Meanwhile, the communication interface 210 that enables the transmission and reception of such data includes a wired communication port 211 and a wireless circuit 212, where the wired communication port 211 may include one or more wired interfaces, for example, Ethernet, Universal Serial Bus (USB), FireWire, and the like. Also, the wireless circuit 212 may transmit and receive data to and from an external device through an RF signal or an optical signal. In addition, wireless communication may use at least one of a plurality of communication standards, protocols, and technologies, such as GSM, EDGE, CDMA, TDMA, Bluetooth, Wi-Fi, VoIP, Wi-MAX, or any other suitable communication protocol.

The memory 220 may store various data used in the voice calculation server 200. For example, the memory 220 may store second voice data (voice frequencies of a plurality of users for first and second texts (a plurality of questions), and feature regions extracted from the voice frequency waveforms), functions for converting the second voice data into a form that can be homomorphically encrypted, algorithms for homomorphic encryption, and the like.

In various embodiments, the memory 220 may include volatile or non-volatile recording media capable of storing various data, commands, and information. For example, the memory 220 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card type memory (e.g., SD or XD memory), RAM, SRAM, ROM, EEPROM, PROM, network storage, cloud, and a blockchain database.

In various embodiments, the memory 220 may store configurations of at least one of the operating system 221 , the communication module 222 , the user interface module 223 , and one or more applications 224 .

The operating system 221 (e.g., LINUX, UNIX, MAC OS, WINDOWS, or an embedded operating system such as VxWorks) may include various software components and drivers for controlling and managing general system tasks (e.g., memory management, storage device control, power management, etc.), and may support communication between various hardware, firmware, and software components.

The communication module 222 may support communication with other devices through the communication interface 210. The communication module 222 may include various software components for processing data received by the wired communication port 211 or the wireless circuit 212 of the communication interface 210.

The user interface module 223 may receive a user's request or input from a keyboard, touch screen, microphone, etc. through the I/O interface 230 and provide a user interface on a display.

Applications 224 may include programs or modules configured to be executed by one or more processors 240. Here, an application for computing voice data may be implemented on a server farm.

The I/O interface 230 may connect at least one of an input/output device (not shown) of the voice operation server 200, for example, a display, a keyboard, a touch screen, and a microphone, to the user interface module 223. The I/O interface 230 may receive user input (eg, voice input, keyboard input, touch input, etc.) together with the user interface module 223 and process a command according to the received input.

The processor 240 is connected to the communication interface 210, the memory 220, and the I/O interface 230 to control the overall operation of the voice operation server 200, and may execute various commands for processing homomorphically encrypted data through an application or program stored in the memory 220.

The processor 240 may correspond to an arithmetic device such as a central processing unit (CPU) or an application processor (AP). In addition, the processor 240 may be implemented in the form of an integrated chip (IC) such as a System on Chip (SoC) in which various computing devices are integrated. Alternatively, the processor 240 may include a module for calculating an artificial neural network model, such as a Neural Processing Unit (NPU).

In various embodiments, the processor 240 may provide a service for identifying a user in a state in which the user's personal information is not exposed, which will be described below with reference to FIG. 7 .

Referring to FIG. 7 , the processor 240 may receive an operation request including homomorphically encrypted first voice data of the user from the identifier device 100 through the communication interface 210 (S210). The operation request may include a parameter for the homomorphic encryption operation used to homomorphically encrypt the first voice data and feature data extracted from the user's voice.

After step S210, the processor 240 may obtain pre-stored second voice data of another user according to the operation request (S220). The processor 240 may determine whether the operation request concerns the second voice data of one other user or the second voice data of a plurality of other users, and may load from the memory 220 the second voice data of the group of other users or of the one other user accordingly.

In addition, the processor 240 may determine whether the first voice data is a voice frequency obtained by the identifier device 100 together with a feature region extracted from the waveform of the voice frequency, or text determined based on the voice frequency, and may acquire second voice data of the corresponding type.

The processor 240 may homomorphically encrypt another user's second voice data stored in the memory 220 using an encryption key in which the same parameters as those of the homomorphically encrypted first voice data are reflected. In addition, according to the operation request, the processor 240 may homomorphically encrypt the second voice data of one other user stored in the memory 220, or homomorphically encrypt the second voice data of a plurality of other users stored in the memory 220.

In various embodiments, the processor 240 may perform homomorphic encryption using any one of partially homomorphic encryption, somewhat homomorphic encryption, and fully homomorphic encryption.

Meanwhile, the processor 240 may acquire pre-stored second voice data of another user and may not perform homomorphic encryption.

After step S220, the processor 240 may calculate a homomorphically encrypted identification result based on the homomorphically encrypted first and second voice data (S230). Specifically, the processor 240 may calculate the similarity between the two voice data (the voice identification result) using a distance or similarity measure (e.g., Euclidean distance, Minkowski distance, cosine similarity, mean squared difference similarity, or Pearson similarity).
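For reference, the measures named above can be written out on plaintext vectors as follows. This is only a sketch of the formulas themselves; evaluating them on encrypted data depends on which operations the chosen homomorphic scheme supports, and the function names are illustrative. The mean squared difference is returned as a raw value rather than being converted into a similarity score.

```python
import math
from typing import Sequence


def _check(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")


def euclidean(a, b) -> float:
    _check(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minkowski(a, b, p: float = 3.0) -> float:
    _check(a, b)
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def cosine_similarity(a, b) -> float:
    _check(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms


def mean_squared_difference(a, b) -> float:
    _check(a, b)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pearson(a, b) -> float:
    _check(a, b)
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)
```

Measures built from additions and multiplications, such as the squared Euclidean distance or an inner product, are generally the easiest to carry over to encrypted arithmetic, while square roots and divisions usually require approximation or post-processing after decryption.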

To this end, the processor 240 may determine a first position corresponding to the homomorphically encrypted first voice data and a second position corresponding to the second voice data. For example, the processor 240 may use a Hexagonal Hierarchical Spatial Index (H3) system to determine a location corresponding to each piece of voice data as a real number value or a location vector.

The processor 240 may obtain a value corresponding to the identification result by calculating a distance value between the first and second positions corresponding to the first and second voice data using the aforementioned distance similarity calculation method. For example, when the calculated distance value falls within a predetermined distance range, the processor 240 may calculate a homomorphically encrypted identification result indicating that the two voice data are similar, and when the calculated distance value does not fall within the predetermined distance range, the processor 240 may calculate a homomorphically encrypted identification result indicating that the two voice data are not similar.

In various embodiments, the processor 240 may calculate an encrypted identification result based on the homomorphically encrypted first voice data and the second voice data of one or more other users, according to the type of the operation request.
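The decision logic described above, a distance compared against a predetermined range for either one stored user or many, can be sketched on plaintext data as follows. The function names, the use of Euclidean distance, and the `max_distance` threshold are assumptions for illustration; in the described system the corresponding calculation is carried out on encrypted data and the outcome becomes visible only after decryption on the identifier device.

```python
import math
from typing import Dict, Optional, Sequence


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify(query: Sequence[float], enrolled: Sequence[float],
           max_distance: float) -> bool:
    """One-to-one request: does the query match one designated user?"""
    return distance(query, enrolled) <= max_distance


def identify(query: Sequence[float],
             gallery: Dict[str, Sequence[float]],
             max_distance: float) -> Optional[str]:
    """One-to-many request: return the closest user within range, if any."""
    best_id, best_d = None, math.inf
    for user_id, vec in gallery.items():
        d = distance(query, vec)
        if d < best_d:
            best_id, best_d = user_id, d
    return best_id if best_d <= max_distance else None
```

With a gallery of `{"alice": [0, 0], "bob": [10, 10]}` and a threshold of 2.0, a query of `[1, 1]` identifies `"alice"`, while `[5, 5]` matches nobody and returns `None`.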

After step S230, the processor 240 may transmit the homomorphically encrypted identification result to the identifier device 100 (S240). The homomorphically encrypted identification result may be decrypted by the identifier device 100, not by the voice calculation server 200. Accordingly, the processor 240 may calculate the result of comparison and identification against the voices of a plurality of other users and provide it to the identifier device 100.

So far, the voice calculation server 200 according to an embodiment of the present invention has been described. According to the present invention, the user's unique voice is homomorphically encrypted and decrypted while stored on each secure device, and the voice calculation server 200 delivers only the homomorphically encrypted calculation result, thereby minimizing the risk of processing sensitive information.

Hereinafter, a personal identification method between the identifier device 100 and the voice operation server 200 will be briefly described.

Referring to FIG. 8, the identifier device 100 may obtain a user's voice (S10), optionally extract feature data from the voice (S11), and homomorphically encrypt the voice or feature data (first voice data) (S12).

The identifier device 100 may transmit parameters for homomorphic encryption calculation to the voice calculation server 200 together with the homomorphic encrypted first voice data. Here, the parameter for the homomorphic encryption operation may be a parameter applied to an encryption key of the homomorphically encrypted first voice data.

The voice operation server 200 may homomorphically encrypt the previously stored second voice data using the parameters (S14), and calculate a homomorphically encrypted identification result based on the homomorphically encrypted first and second voice data (S15) (that is, it can operate on homomorphically encrypted data). Specifically, the voice calculation server 200 may determine positions corresponding to the homomorphically encrypted voice data, calculate a distance between the positions by performing operations between the homomorphically encrypted data, and thereby obtain a value corresponding to the identification result.

Meanwhile, the voice calculation server 200 may perform a comparison operation between the plaintext second voice data and the homomorphically encrypted first voice data without encrypting the previously stored second voice data.
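The following sketch illustrates how a comparison between encrypted first voice data and plaintext second voice data could work with an additively homomorphic scheme. It uses a toy Paillier construction with small, publicly known primes, which is an assumption chosen for illustration and not the scheme named by the source. The device sends the encrypted components of an integer feature vector plus the encryption of its squared norm; the server combines them with its plaintext vector to obtain an encrypted squared Euclidean distance without learning the query.

```python
import math
import secrets
from typing import List, Sequence, Tuple

# Toy Paillier key from two known Mersenne primes (illustration only).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                  # public key, shared with the server
N2 = N * N
PHI = (P - 1) * (Q - 1)    # private key, kept on the device
MU = pow(PHI, -1, N)       # modular inverse; requires Python 3.8+


def encrypt(m: int) -> int:
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return ((1 + (m % N) * N) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    return ((pow(c, PHI, N2) - 1) // N * MU) % N


def device_encrypt(vec: Sequence[int]) -> Tuple[List[int], int]:
    """Device side: encrypt each component and the squared norm."""
    return [encrypt(x) for x in vec], encrypt(sum(x * x for x in vec))


def server_squared_distance(enc_vec: Sequence[int], enc_sq_norm: int,
                            stored: Sequence[int]) -> int:
    """Server side: Enc(sum a^2 - 2*sum a*b + sum b^2), using only N."""
    acc = enc_sq_norm
    for c, b in zip(enc_vec, stored):
        # Raising Enc(a) to the power k yields Enc(a * k).
        acc = (acc * pow(c, (-2 * b) % N, N2)) % N2
    const = sum(b * b for b in stored)
    # Multiplying by (1 + k*N) adds the plaintext constant k.
    return (acc * (1 + (const % N) * N)) % N2
```

For a query of `[3, -1, 4]` and a stored vector of `[1, 1, 1]`, the device decrypts the server's reply to 17, which is the squared distance 4 + 4 + 9. The threshold comparison can then be made on the device after decryption.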

The voice calculation server 200 may transmit the encrypted calculation result to the identifier device 100 (S16), and the identifier device 100 may decrypt the calculation result (S17) and output the decryption result on the display screen (S18).

Meanwhile, calculation between homomorphically encrypted voice data may be performed in the identifier device 100 in the same manner as in the voice calculation server 200 .

In this regard, referring to FIG. 9 , steps S20 to S21 are the same as before, but the identifier device 100 may selectively perform homomorphic encryption on voice or feature data (S22).

Thereafter, the identifier device 100 may transmit a data identification request including parameters for homomorphic encryption operation to the voice operation server 200 (S23).

The voice operation server 200 may homomorphically encrypt a plurality of pre-stored second voice data using the same parameters as previously received parameters and transmit the same to the identifier device 100 according to the data identification request.

That is, since the voice operation server 200 provides the homomorphically encrypted second voice data, the identifier device 100 may calculate a homomorphically encrypted identification result based on the homomorphically encrypted first and second voice data (S25) (that is, it can operate on homomorphically encrypted data).

The identifier device 100 may transmit the encrypted calculation result to the voice calculation server 200 (S26), and the voice calculation server 200 may decrypt and transmit the calculation result again (S27).

Finally, the identifier device 100 may output a decoded result (S28), and the result may be, for example, whether the voice of the user and another user matches or not, and the user's identification information.

Although embodiments of the present invention have been described in more detail with reference to the accompanying drawings, the present invention is not necessarily limited to these embodiments, and may be variously modified and implemented without departing from the technical spirit of the present invention. Therefore, the embodiments disclosed in the present invention are intended to explain, not to limit, the technical spirit of the present invention, and the scope of the technical spirit of the present invention is not limited by these embodiments. Therefore, the embodiments described above should be understood as illustrative in all respects and not limiting. The protection scope of the present invention should be construed according to the following claims, and all technical ideas within the equivalent range should be construed as being included in the scope of the present invention.

Claims

A personal identification method using homomorphically encrypted voice, comprising:

acquiring first voice data of a user;

homomorphically encrypting the first voice data;

transmitting the homomorphically encrypted first voice data to a voice operation server;

receiving, from the voice operation server, a homomorphically encrypted identification result calculated based on the homomorphically encrypted first voice data and pre-stored second voice data of another user; and

decrypting the homomorphically encrypted identification result.
The personal identification method of claim 1,

wherein the transmitting of the homomorphically encrypted first voice data further comprises

transmitting, to the voice operation server, a parameter for a homomorphic encryption operation used to homomorphically encrypt the first voice data.
The personal identification method of claim 2,

wherein the homomorphically encrypted identification result is

an identification result calculated based on the homomorphically encrypted first voice data and second voice data homomorphically encrypted based on the parameter.
The personal identification method of claim 1,

wherein the second voice data is

voice data of a plurality of other users pre-stored in the voice calculation server, and

the decrypting comprises

acquiring an identification result for another user, among the plurality of other users, who matches the user.
The personal identification method of claim 1,

wherein the first voice data and the second voice data each comprise

a voice frequency obtained in response to a user identification question provided to a user together with a feature region extracted from a waveform of the voice frequency, or text determined based on the voice frequency.
The personal identification method of claim 1,

wherein the homomorphically encrypting comprises

homomorphically encrypting using any one of partially homomorphic encryption, somewhat homomorphic encryption, and fully homomorphic encryption.
A personal identification method using homomorphically encrypted voice, comprising:

receiving, from an identifier device, an operation request including homomorphically encrypted first voice data of a user;

obtaining pre-stored second voice data of another user according to the operation request;

calculating a homomorphically encrypted identification result based on the homomorphically encrypted first voice data and the second voice data; and

transmitting the homomorphically encrypted identification result to the identifier device.
The personal identification method of claim 7,

wherein the receiving of the operation request comprises

receiving, from the identifier device, parameters for a homomorphic encryption operation used to homomorphically encrypt the first voice data, and

the obtaining further comprises

homomorphically encrypting the second voice data based on the parameters.
The personal identification method of claim 7,

wherein the calculating of the homomorphically encrypted identification result comprises:

determining a first position corresponding to the homomorphically encrypted first voice data and a second position corresponding to the second voice data; and

calculating a distance value between the first position and the second position, the distance value corresponding to the identification result.
The personal identification method of claim 7,

wherein the second voice data is

pre-stored second voice data of a plurality of other users, and

the calculating of the homomorphically encrypted identification result comprises

calculating an encrypted identification result based on the second voice data of the plurality of other users and the homomorphically encrypted first voice data, according to the type of the received operation request.
The personal identification method of claim 7,

wherein the first voice data and the second voice data each comprise

a voice frequency obtained in response to a user identification question provided to a user together with a feature region extracted from a waveform of the voice frequency, or text determined based on the voice frequency.
The personal identification method of claim 8,

wherein the homomorphically encrypting comprises

homomorphically encrypting using any one of partially homomorphic encryption, somewhat homomorphic encryption, and fully homomorphic encryption.